Are there good reasons why it's a better practice to have only one return statement in a function?
Or is it okay to return from a function as soon as it is logically correct to do so, meaning there may be many return statements in the function?
I often have several statements at the start of a method to return for "easy" situations. For example, this:
public void DoStuff(Foo foo)
{
if (foo != null)
{
...
}
}
... can be made more readable (IMHO) like this:
public void DoStuff(Foo foo)
{
if (foo == null) return;
...
}
So yes, I think it's fine to have multiple "exit points" from a function/method.
null
while you expect some value. Having an argument that can take on null
can be useful, but it should be used sparingly IMHO. An exception of course is as much as an exit point as a return statement, and most IDE's nowadays highlight all exit points when you e.g. point to the method's return type. - Maarten Bodewes
DoStuff() { DoStuffInner(); IncreaseStuffCallCounter(); }
- Jim Balter
Nobody has mentioned or quoted Code Complete [1] so I'll do it.
Minimize the number of returns in each routine. It's harder to understand a routine if, reading it at the bottom, you're unaware of the possibility that it returned somewhere above.
Use a return when it enhances readability. In certain routines, once you know the answer, you want to return it to the calling routine immediately. If the routine is defined in such a way that it doesn't require any cleanup, not returning immediately means that you have to write more code.
[1] http://www.cc2e.com/I would say it would be incredibly unwise to decide arbitrarily against multiple exit points as I have found the technique to be useful in practice over and over again, in fact I have often refactored existing code to multiple exit points for clarity. We can compare the two approaches thus:-
string fooBar(string s, int? i) {
string ret = "";
if(!string.IsNullOrEmpty(s) && i != null) {
var res = someFunction(s, i);
bool passed = true;
foreach(var r in res) {
if(!r.Passed) {
passed = false;
break;
}
}
if(passed) {
// Rest of code...
}
}
return ret;
}
Compare this to the code where multiple exit points are permitted:-
string fooBar(string s, int? i) {
var ret = "";
if(string.IsNullOrEmpty(s) || i == null) return null;
var res = someFunction(s, i);
foreach(var r in res) {
if(!r.Passed) return null;
}
// Rest of code...
return ret;
}
I think the latter is considerably clearer. As far as I can tell the criticism of multiple exit points is a rather archaic point of view these days.
string ret = ""
should be string ret = null
to both functions be equivalent. - Andre Figueiredo
I currently am working on a codebase where two of the people working on it blindly subscribe to the "single point of exit" theory and I can tell you that from experience, it's a horrible horrible practice. It makes code extremely difficult to maintain and I'll show you why.
With the "single point of exit" theory, you inevitably wind up with code that looks like this:
function()
{
HRESULT error = S_OK;
if(SUCCEEDED(Operation1()))
{
if(SUCCEEDED(Operation2()))
{
if(SUCCEEDED(Operation3()))
{
if(SUCCEEDED(Operation4()))
{
}
else
{
error = OPERATION4FAILED;
}
}
else
{
error = OPERATION3FAILED;
}
}
else
{
error = OPERATION2FAILED;
}
}
else
{
error = OPERATION1FAILED;
}
return error;
}
Not only does this make the code very hard to follow, but now say later on you need to go back and add an operation in between 1 and 2. You have to indent just about the entire freaking function, and good luck making sure all of your if/else conditions and braces are matched up properly.
This method makes code maintenance extremely difficult and error prone.
if
statement is useful because it can be inverted (! SUCCEEDED
). But most importantly, this function is doing multiple things, and those results can be broken apart: function doStep1() { int r = ERROR1; if (SUCCESS(step1())) r = doStep2(); return r; }
where inevitably the last step finally returns OK
upon success. In object oriented design, this could even be made into objects... - pickypg
Structured programming [1] says you should only ever have one return statement per function. This is to limit the complexity. Many people such as Martin Fowler argue that it is simpler to write functions with multiple return statements. He presents this argument in the classic refactoring [2] book he wrote. This works well if you follow his other advice and write small functions. I agree with this point of view and only strict structured programming purists adhere to single return statements per function.
[1] http://en.wikipedia.org/wiki/Structured_programmingGOTO
to move control flow even when function existed. It never says "never use GOTO
". - Esteban Küber
goto
, break
, return
and continue
(did I miss any?) are all about ignoring the 'structure' of the code. Some programs may be too hard to structure well, and these tricks can be useful, but it makes sense to say they should be avoided (in structured programming). - Thomas Ahle
break
and continue
can make the structure of a loop less complicated, and thus easier to read and maintain. Used properly they're not about ignoring structure. (Obviously you don't just throw them in willy-nilly, but that's true of all programming constructs.) - nnnnnn
As Kent Beck notes when discussing guard clauses in Implementation Patterns [1] making a routine have a single entry and exit point ...
"was to prevent the confusion possible when jumping into and out of many locations in the same routine. It made good sense when applied to FORTRAN or assembly language programs written with lots of global data where even understanding which statements were executed was hard work ... with small methods and mostly local data, it is needlessly conservative."
I find a function written with guard clauses much easier to follow than one long nested bunch of if then else
statements.
In a function that has no side-effects, there's no good reason to have more than a single return and you should write them in a functional style. In a method with side-effects, things are more sequential (time-indexed), so you write in an imperative style, using the return statement as a command to stop executing.
In other words, when possible, favor this style
return a > 0 ?
positively(a):
negatively(a);
over this
if (a > 0)
return positively(a);
else
return negatively(a);
If you find yourself writing several layers of nested conditions, there's probably a way you can refactor that, using predicate list for example. If you find that your ifs and elses are far apart syntactically, you might want to break that down into smaller functions. A conditional block that spans more than a screenful of text is hard to read.
There's no hard and fast rule that applies to every language. Something like having a single return statement won't make your code good. But good code will tend to allow you to write your functions that way.
I've seen it in coding standards for C++ that were a hang-over from C, as if you don't have RAII or other automatic memory management then you have to clean up for each return, which either means cut-and-paste of the clean-up or a goto (logically the same as 'finally' in managed languages), both of which are considered bad form. If your practices are to use smart pointers and collections in C++ or another automatic memory system, then there isn't a strong reason for it, and it become all about readability, and more of a judgement call.
auto_ptr
, you can use plain pointers in parallel. Though it would be odd to be writing 'optimised' code with a non-optimising compiler in the first place. - Pete Kirkham
try
... finally
in Java) and you need to do resource maintenance you could do with a single return at the end of a method. Before you do this, you should seriously consider refactoring the code to get rid of the situation. - Maarten Bodewes
I lean to the idea that return statements in the middle of the function are bad. You can use returns to build a few guard clauses at the top of the function, and of course tell the compiler what to return at the end of the function without issue, but returns in the middle of the function can be easy to miss and can make the function harder to interpret.
Are there good reasons why it's a better practice to have only one return statement in a function?
Yes, there are:
The question is often posed as a false dichotomy between multiple returns or deeply nested if statements. There's almost always a third solution which is very linear (no deep nesting) with only a single exit point.
Update: Apparently MISRA guidelines promote single exit [1], too.
To be clear, I'm not saying it's always wrong to have multiple returns. But given otherwise equivalent solutions, there are lots of good reasons to prefer the one with a single return.
[1] https://stackoverflow.com/questions/17177276/c-c-conditional-return-statements/17177315#17177315Contract.Ensures
with multiple return points. - julealgon
goto
solves the problem. Using goto
here increases code readability. - q126y
goto
to get to common cleanup code, then you've probably simplified the function so that there's a single return
at the end of the clean-up code. So you could say you've solved the problem with goto
, but I'd say you solved it by simplifying to a single return
. - Adrian McCarthy
Having a single exit point does provide an advantage in debugging, because it allows you to set a single breakpoint at the end of a function to see what value is actually going to be returned.
In general I try to have only a single exit point from a function. There are times, however, that doing so actually ends up creating a more complex function body than is necessary, in which case it's better to have multiple exit points. It really has to be a "judgement call" based on the resulting complexity, but the goal should be as few exit points as possible without sacrificing complexity and understandability.
No, because we don't live in the 1970s any more [1]. If your function is long enough that multiple returns are a problem, it's too long.
(Quite apart from the fact that any multi-line function in a language with exceptions will have multiple exit points anyway.)
[1] http://en.wikipedia.org/wiki/Structured_programmingMy preference would be for single exit unless it really complicates things. I have found that in some cases, multiple exist points can mask other more significant design problems:
public void DoStuff(Foo foo)
{
if (foo == null) return;
}
On seeing this code, I would immediately ask:
Depending on the answers to these questions it might be that
In both of the above cases the code can probably be reworked with an assertion to ensure that 'foo' is never null and the relevant callers changed.
There are two other reasons (specific I think to C++ code) where multiple exists can actually have a negative affect. They are code size, and compiler optimizations.
A non-POD C++ object in scope at the exit of a function will have its destructor called. Where there are several return statements, it may be the case that there are different objects in scope and so the list of destructors to call will be different. The compiler therefore needs to generate code for each return statement:
void foo (int i, int j) {
A a;
if (i > 0) {
B b;
return ; // Call dtor for 'b' followed by 'a'
}
if (i == j) {
C c;
B b;
return ; // Call dtor for 'b', 'c' and then 'a'
}
return 'a' // Call dtor for 'a'
}
If code size is an issue - then this may be something worth avoiding.
The other issue relates to "Named Return Value OptimiZation" (aka Copy Elision, ISO C++ '03 12.8/15). C++ allows an implementation to skip calling the copy constructor if it can:
A foo () {
A a1;
// do something
return a1;
}
void bar () {
A a2 ( foo() );
}
Just taking the code as is, the object 'a1' is constructed in 'foo' and then its copy construct will be called to construct 'a2'. However, copy elision allows the compiler to construct 'a1' in the same place on the stack as 'a2'. There is therefore no need to "copy" the object when the function returns.
Multiple exit points complicates the work of the compiler in trying to detect this, and at least for a relatively recent version of VC++ the optimization did not take place where the function body had multiple returns. See Named Return Value Optimization in Visual C++ 2005 [1] for more details.
[1] http://blogs.msdn.com/aymans/archive/2005/10/13/480699.aspxthrow new ArgumentNullException()
in C# in this case), I really liked your other considerations, they are all valid to me, and could be critical in some niche contexts. - julealgon
foo
is being tested has nothing to do with the subject, which is whether to do if (foo == NULL) return; dowork;
or if (foo != NULL) { dowork; }
- Jim Balter
Having a single exit point reduces Cyclomatic Complexity [1] and therefore, in theory, reduces the probability that you will introduce bugs into your code when you change it. Practice however, tends to suggest that a more pragmatic approach is needed. I therefore tend to aim to have a single exit point, but allow my code to have several if that is more readable.
[1] http://en.wikipedia.org/wiki/Cyclomatic_complexityI force myself to use only one return
statement, as it will in a sense generate code smell. Let me explain:
function isCorrect($param1, $param2, $param3) {
$toret = false;
if ($param1 != $param2) {
if ($param1 == ($param3 * 2)) {
if ($param2 == ($param3 / 3)) {
$toret = true;
} else {
$error = 'Error 3';
}
} else {
$error = 'Error 2';
}
} else {
$error = 'Error 1';
}
return $toret;
}
(The conditions are arbritary...)
The more conditions, the larger the function gets, the more difficult it is to read. So if you're attuned to the code smell, you'll realise it, and want to refactor the code. Two possible solutions are:
Multiple Returns
function isCorrect($param1, $param2, $param3) {
if ($param1 == $param2) { $error = 'Error 1'; return false; }
if ($param1 != ($param3 * 2)) { $error = 'Error 2'; return false; }
if ($param2 != ($param3 / 3)) { $error = 'Error 3'; return false; }
return true;
}
Separate Functions
function isEqual($param1, $param2) {
return $param1 == $param2;
}
function isDouble($param1, $param2) {
return $param1 == ($param2 * 2);
}
function isThird($param1, $param2) {
return $param1 == ($param2 / 3);
}
function isCorrect($param1, $param2, $param3) {
return !isEqual($param1, $param2)
&& isDouble($param1, $param3)
&& isThird($param2, $param3);
}
Granted, it is longer and a bit messy, but in the process of refactoring the function this way, we've
I would say you should have as many as required, or any that make the code cleaner (such as guard clauses [1]).
I have personally never heard/seen any "best practices" say that you should have only one return statement.
For the most part, I tend to exit a function as soon as possible based on a logic path (guard clauses are an excellent example of this).
[1] http://www.refactoring.com/catalog/replaceNestedConditionalWithGuardClauses.htmlThere are good things to say about having a single exit-point, just as there are bad things to say about the inevitable "arrow" [1] programming that results.
If using multiple exit points during input validation or resource allocation, I try to put all the 'error-exits' very visibly at the top of the function.
Both the Spartan Programming [2] article of the "SSDSLPedia" and the single function exit point [3] article of the "Portland Pattern Repository's Wiki" have some insightful arguments around this. Also, of course, there is this post to consider.
If you really want a single exit-point (in any non-exception-enabled language) for example in order to release resources in one single place, I find the careful application of goto to be good; see for example this rather contrived example (compressed to save screen real-estate):
int f(int y) {
int value = -1;
void *data = NULL;
if (y < 0)
goto clean;
if ((data = malloc(123)) == NULL)
goto clean;
/* More code */
value = 1;
clean:
free(data);
return value;
}
Personally I, in general, dislike arrow programming more than I dislike multiple exit-points, although both are useful when applied correctly. The best, of course, is to structure your program to require neither. Breaking down your function into multiple chunks usually help :)
Although when doing so, I find I end up with multiple exit points anyway as in this example, where some larger function has been broken down into several smaller functions:
int g(int y) {
value = 0;
if ((value = g0(y, value)) == -1)
return -1;
if ((value = g1(y, value)) == -1)
return -1;
return g2(y, value);
}
Depending on the project or coding guidelines, most of the boiler-plate code could be replaced by macros. As a side note, breaking it down this way makes the functions g0, g1 ,g2 very easy to test individually.
Obviously, in an OO and exception-enabled language, I wouldn't use if-statements like that (or at all, if I could get away with it with little enough effort), and the code would be much more plain. And non-arrowy. And most of the non-final returns would probably be exceptions.
In short;
I believe that multiple returns are usually good (in the code that I write in C#). The single-return style is a holdover from C. But you probably aren't coding in C.
There is no law requiring only one exit point for a method in all programming languages. Some people insist on the superiority of this style, and sometimes they elevate it to a "rule" or "law" but this belief is not backed up by any evidence or research.
More than one return style may be a bad habit in C code, where resources have to be explicitly de-allocated, but languages such as Java, C#, Python or JavaScript that have constructs such as automatic garbage collection and try..finally
blocks (and using
blocks in C#), and this argument does not apply - in these languages, it is very uncommon to need centralised manual resource deallocation.
There are cases where a single return is more readable, and cases where it isn't. See if it reduces the number of lines of code, makes the logic clearer or reduces the number of braces and indents or temporary variables.
Therefore, use as many returns as suits your artistic sensibilities, because it is a layout and readability issue, not a technical one.
I have talked about this at greater length on my blog [1].
[1] http://www.anthonysteele.co.uk/the-single-exit-point-lawYou know the adage - beauty is in the eyes of the beholder.
Some people swear by NetBeans [1] and some by IntelliJ IDEA [2], some by Python [3] and some by PHP [4].
In some shops you could lose your job if you insist on doing this:
public void hello()
{
if (....)
{
....
}
}
The question is all about visibility and maintainability.
I am addicted to using boolean algebra to reduce and simplify logic and use of state machines. However, there were past colleagues who believed my employ of "mathematical techniques" in coding is unsuitable, because it would not be visible and maintainable. And that would be a bad practice. Sorry people, the techniques I employ is very visible and maintainable to me - because when I return to the code six months later, I would understand the code clearly rather seeing a mess of proverbial spaghetti.
Hey buddy (like a former client used to say) do what you want as long as you know how to fix it when I need you to fix it.
I remember 20 years ago, a colleague of mine was fired for employing what today would be called agile development [5] strategy. He had a meticulous incremental plan. But his manager was yelling at him "You can't incrementally release features to users! You must stick with the waterfall [6]." His response to the manager was that incremental development would be more precise to customer's needs. He believed in developing for the customers needs, but the manager believed in coding to "customer's requirement".
We are frequently guilty for breaking data normalization, MVP [7] and MVC [8] boundaries. We inline instead of constructing a function. We take shortcuts.
Personally, I believe that PHP is bad practice, but what do I know. All the theoretical arguments boils down to trying fulfill one set of rules
quality = precision, maintainability and profitability.
All other rules fade into the background. And of course this rule never fades:
[1] http://en.wikipedia.org/wiki/NetBeansLaziness is the virtue of a good programmer.
I lean towards using guard clauses to return early and otherwise exit at the end of a method. The single entry and exit rule has historical significance and was particularly helpful when dealing with legacy code that ran to 10 A4 pages for a single C++ method with multiple returns (and many defects). More recently, accepted good practice is to keep methods small which makes multiple exits less of an impedance to understanding. In the following Kronoz example copied from above, the question is what occurs in //Rest of code...?:
void string fooBar(string s, int? i) {
if(string.IsNullOrEmpty(s) || i == null) return null;
var res = someFunction(s, i);
foreach(var r in res) {
if(!r.Passed) return null;
}
// Rest of code...
return ret;
}
I realise the example is somewhat contrived but I would be tempted to refactor the foreach loop into a LINQ statement that could then be considered a guard clause. Again, in a contrived example the intent of the code isn't apparent and someFunction() may have some other side effect or the result may be used in the // Rest of code....
if (string.IsNullOrEmpty(s) || i == null) return null;
if (someFunction(s, i).Any(r => !r.Passed)) return null;
Giving the following refactored function:
void string fooBar(string s, int? i) {
if (string.IsNullOrEmpty(s) || i == null) return null;
if (someFunction(s, i).Any(r => !r.Passed)) return null;
// Rest of code...
return ret;
}
null
instead of throwing an exception indicating that the argument is not accepted? - Maarten Bodewes
One good reason I can think of is for code maintenance: you have a single point of exit. If you want to change the format of the result,..., it's just much simpler to implement. Also, for debugging, you can just stick a breakpoint there :)
Having said that, I once had to work in a library where the coding standards imposed 'one return statement per function', and I found it pretty tough. I write lots of numerical computations code, and there often are 'special cases', so the code ended up being quite hard to follow...
Multiple exit points are fine for small enough functions -- that is, a function that can be viewed on one screen length on its entirety. If a lengthy function likewise includes multiple exit points, it's a sign that the function can be chopped up further.
That said I avoid multiple-exit functions unless absolutely necessary. I have felt pain of bugs that are due to some stray return in some obscure line in more complex functions.
I've worked with terrible coding standards that forced a single exit path on you and the result is nearly always unstructured spaghetti if the function is anything but trivial -- you end up with lots of breaks and continues that just get in the way.
if
statement in front of each method call that returns success or not :( - Maarten Bodewes
Single exit point - all other things equal - makes code significantly more readable. But there's a catch: popular construction
resulttype res;
if if if...
return res;
is a fake, "res=" is not much better than "return". It has single return statement, but multiple points where function actually ends.
If you have function with multiple returns (or "res="s), it's often a good idea to break it into several smaller functions with single exit point.
My usual policy is to have only one return statement at the end of a function unless the complexity of the code is greatly reduced by adding more. In fact, I'm rather a fan of Eiffel, which enforces the only one return rule by having no return statement (there's just a auto-created 'result' variable to put your result in).
There certainly are cases where code can be made clearer with multiple returns than the obvious version without them would be. One could argue that more rework is needed if you have a function that is too complex to be understandable without multiple return statements, but sometimes it's good to be pragmatic about such things.
If you end up with more than a few returns there may be something wrong with your code. Otherwise I would agree that sometimes it is nice to be able to return from multiple places in a subroutine, especially when it make the code cleaner.
sub Int_to_String( Int i ){
given( i ){
when 0 { return "zero" }
when 1 { return "one" }
when 2 { return "two" }
when 3 { return "three" }
when 4 { return "four" }
...
default { return undef }
}
}
would be better written like this
@Int_to_String = qw{
zero
one
two
three
four
...
}
sub Int_to_String( Int i ){
return undef if i < 0;
return undef unless i < @Int_to_String.length;
return @Int_to_String[i]
}
Note this is was just a quick example
I vote for Single return at the end as a guideline. This helps a common code clean-up handling ... For example, take a look at the following code ...
void ProcessMyFile (char *szFileName)
{
FILE *fp = NULL;
char *pbyBuffer = NULL:
do {
fp = fopen (szFileName, "r");
if (NULL == fp) {
break;
}
pbyBuffer = malloc (__SOME__SIZE___);
if (NULL == pbyBuffer) {
break;
}
/*** Do some processing with file ***/
} while (0);
if (pbyBuffer) {
free (pbyBuffer);
}
if (fp) {
fclose (fp);
}
}
The more return statements you have in a function, the higher complexity in that one method. If you find yourself wondering if you have too many return statements, you might want to ask yourself if you have too many lines of code in that function.
But, not, there is nothing wrong with one/many return statements. In some languages, it is a better practice (C++) than in others (C).
This is probably an unusual perspective, but I think that anyone who believes that multiple return statements are to be favoured has never had to use a debugger on a microprocessor that supports only 4 hardware breakpoints. ;-)
While the issues of "arrow code" are completely correct, one issue that seems to go away when using multiple return statements is in the situation where you are using a debugger. You have no convenient catch-all position to put a breakpoint to guarantee that you're going to see the exit and hence the return condition.
If it's okay to write down just an opinion, that's mine:
I totally and absolutely disagree with the `Single return statement theory' and find it mostly speculative and even destructive regarding the code readability, logic and descriptive aspects.
That habit of having one-single-return is even poor for bare procedural programming not to mention more high-level abstractions (functional, combinatory etc.). And furthermore, I wish all the code written in that style to go through some special rewriting parser to make it have multiple return statements!
A function (if it's really a function/query according to `Query-Command separation' note - see Eiffel programming lang. for example) just MUST define as many return points as the control flow scenarios it has. It is much more clear and mathematically consistent; and it is the way to write functions (i.e. Queries)
But I would not be so militant for the mutation messages that your agent does receive - the procedure calls.
It doesn't make sense to always require a single return type. I think it is more of a flag that something may need to be simplified. Sometimes it's necessary to have multiple returns, but often you can keep things simpler by at least trying to have a single exit point.
The only important question is "How is the code simpler, better readable, easier to understand?" If it is simpler with multiple returns, then use them.
Having multiple exit points is essentially the same thing as using a GOTO
. Whether or not that's a bad thing depends on how you feel about raptors.
GOTO
? BAD programmer. - ErikE
GOSUB
which is a structured programming concept. GOTO
is not. - Adrian McCarthy
GOTO
. And by the same token, neither is return
the same as GOTO! Return
means "abandon current block of code (for reasons obvious--there is no more work to do) and resume execution after the calling statement". It is not arbitrary or absolute. GOTO
on the other hand forces execution to jump around to an arbitrary, unknown point for purposes unknown. Ultimately, it is all about human comprehension of code. - ErikE
return
in the middle of a function is jarring, and confusing. So change! Those humans should change their minds. In the final analysis what I think we're really arguing about is convention, not sense. - ErikE
return
usage. But we can agree to disagree. I'm sorry if I was too harsh in my comments. - ErikE
You already implicitly have multiple implicit return statements, caused by error handling, so deal with it.
As is typical with programming, though, there are examples both for and against the multiple return practice. If it makes the code clearer, do it one way or the other. Use of many control structures can help (the case statement, for example).
Well, maybe I'm one of the few people here old enough to remember one of the big reasons why "only one return statement" was pushed so hard. It's so the compiler can emit more efficient code. For each function call, the compiler typically pushes some registers on the stack to preserve their values. This way, the function can use those registers for temporary storage. When the function returns, those saved registers have to be popped off the stack and back into the registers. That's one POP (or MOV -(SP),Rn) instruction per register. If you have a bunch of return statements, then either each one has to pop all the registers (which makes the compiled code bigger) or the compiler has to keep track of which registers might have been modified and only pop those (decreasing code size, but increasing compilation time).
One reason why it still makes sense today to try to stick with one return statement is ease of automated refactoring. If your IDE supports method-extraction refactoring (selecting a range of lines and turning them into a method), it's very difficult to do this if the lines you want to extract have a return statement in them, especially if you're returning a value.
I use multiple exit points for having error-case + handling + return value as close in proximity as possible.
So having to test for conditions a, b, c that have to be true and you need to handle each of them differently:
if (a is false) {
handle this situation (eg. report, log, message, etc.)
return some-err-code
}
if (b is false) {
handle this situation
return other-err-code
}
if (c is false) {
handle this situation
return yet-another-err-code
}
perform any action assured that a, b and c are ok.
The a, b and c might be different things, like a is input parameter check, b is pointer check to newly allocated memory and c is check for a value in 'a' parameter.
In the interests of good standards and industry best practises, we must establish the correct number of return statements to appear in all functions. Obviously there is consensus against having one return statement. So I propose we set it at two.
I would appreciate it if everyone would look through their code right now, locate any functions with only one exit point, and add another one. It doesn't matter where.
The result of this change will undoubtedly be fewer bugs, greater readability and unimaginable wealth falling from the sky onto our heads.
I prefer a single return statement. One reason which has not yet been pointed out is that some refactoring tools work better for single points of exit, e.g. Eclipse JDT extract/inline method.
There are times when it is necessary for performance reasons (I don't want to fetch a different cache line kind of the same need as a continue; sometimes).
If you allocate resources (memory, file descriptors, locks, etc.) without using RAII then muliple returns can be error prone and are certainly duplicative as the releases need to be done manually multiple times and you must keep careful track.
In the example:
function()
{
HRESULT error = S_OK;
if(SUCCEEDED(Operation1()))
{
if(SUCCEEDED(Operation2()))
{
if(SUCCEEDED(Operation3()))
{
if(SUCCEEDED(Operation4()))
{
}
else
{
error = OPERATION4FAILED;
}
}
else
{
error = OPERATION3FAILED;
}
}
else
{
error = OPERATION2FAILED;
}
}
else
{
error = OPERATION1FAILED;
}
return error;
}
I would have written it as:
function() {
HRESULT error = OPERATION1FAILED;//assume failure
if(SUCCEEDED(Operation1())) {
error = OPERATION2FAILED;//assume failure
if(SUCCEEDED(Operation3())) {
error = OPERATION3FAILED;//assume failure
if(SUCCEEDED(Operation3())) {
error = OPERATION4FAILED; //assume failure
if(SUCCEEDED(Operation4())) {
error = S_OK;
}
}
}
}
return error;
}
Which certainly seems better.
This tends to be especially helpful in the manual resource release case as where and which releases are necessary is pretty straightforward. As in the following example:
function() {
HRESULT error = OPERATION1FAILED;//assume failure
if(SUCCEEDED(Operation1())) {
//allocate resource for op2;
char* const p2 = new char[1024];
error = OPERATION2FAILED;//assume failure
if(SUCCEEDED(Operation2(p2))) {
//allocate resource for op3;
char* const p3 = new char[1024];
error = OPERATION3FAILED;//assume failure
if(SUCCEEDED(Operation3(p3))) {
error = OPERATION4FAILED; //assume failure
if(SUCCEEDED(Operation4(p2,p3))) {
error = S_OK;
}
}
//free resource for op3;
delete [] p3;
}
//free resource for op2;
delete [] p2;
}
return error;
}
If you write this code without RAII (forgetting the issue of exceptions!) with multiple exits then the deletes have to be written multiple times. If you write it with }else{
then
it gets a little ugly.
But RAII makes the multiple exit resource issue moot.
I always avoid multiple return statements. Even in small functions. Small functions can become larger, and tracking the multiple return paths makes it harder (to my small mind) to keep track of what is going on. A single return also makes debugging easier. I've seen people post that the only alternative to multiple return statements is a messy arrow of nested IF statements 10 levels deep. While I certain agree that such coding does occur, it isn't the only option. I wouldn't make the choice between a multiple return statements and a nest of IFs, I'd refactor it so you'd eliminate both. And that is how I code. The following code eliminates both issues and, in my mind, is very easy to read:
public string GetResult()
{
string rv = null;
bool okay = false;
okay = PerformTest(1);
if (okay)
{
okay = PerformTest(2);
}
if (okay)
{
okay = PerformTest(3);
}
if (okay)
{
okay = PerformTest(4);
};
if (okay)
{
okay = PerformTest(5);
}
if (okay)
{
rv = "All Tests Passed";
}
return rv;
}
As an alternative to the nested IFs, there's a way to use do
/while(false)
to break out anywhere:
function()
{
HRESULT error = S_OK;
do
{
if(!SUCCEEDED(Operation1()))
{
error = OPERATION1FAILED;
break;
}
if(!SUCCEEDED(Operation2()))
{
error = OPERATION2FAILED;
break;
}
if(!SUCCEEDED(Operation3()))
{
error = OPERATION3FAILED;
break;
}
if(!SUCCEEDED(Operation4()))
{
error = OPERATION4FAILED;
break;
}
} while (false);
return error;
}
That gets you one exit point, lets you have other nesting of operations, but still not a real deep structure. If you don't like the !SUCCEEDED you could always do FAILED whatever. This kind of thing also lets you add other code between any two other checks without having to re-indent anything.
If you were really crazy, that whole if
block could be macroized too. :D
#define BREAKIFFAILED(x,y) if (!SUCCEEDED((x))) { error = (Y); break; }
do
{
BREAKIFFAILED(Operation1(), OPERATION1FAILED)
BREAKIFFAILED(Operation2(), OPERATION2FAILED)
BREAKIFFAILED(Operation3(), OPERATION3FAILED)
BREAKIFFAILED(Operation4(), OPERATION4FAILED)
} while (false);
I think in different situations different method is better. For example, if you should process the return value before return, you should have one point of exit. But in other situations, it is more comfortable to use several returns.
One note. If you should process the return value before return in several situations, but not in all, the best solutions (IMHO) to define a method like ProcessVal and call it before return:
var retVal = new RetVal();
if(!someCondition)
return ProcessVal(retVal);
if(!anotherCondition)
return retVal;
I'm probably going to be hated for this, but ideally there should be no return statement at all I think, a function should just return its last expression, and should in the completely ideal case contain only one.
So not
function name(arg) {
if (arg.failure?)
return;
//code for non failure
}
But rather
function name(arg) {
if (arg.failure?)
voidConstant
else {
//code for non failure
}
If-statements that aren't expressions and return statements are a very dubious practise to me.
One might argue... if you have multiple conditions that must be satisfied before the tasks of the function are to be performed, then don't invoke the function until those conditions are met:
Instead of:
function doStuff(foo) {
if (foo != null) return;
}
Or
function doStuff(foo) {
if (foo !== null) {
...
}
}
Don't invoke doStuff
until foo != null
if(foo != null) doStuff(foo);
Which, requires that every call site ensures that the conditions for the invocation are satisfied before the call. If there are multiple call sites, this logic is perhaps best placed in a separate function, in a method of the to-be-invoked function (assuming they are first-class citizens), or in a proxy.
On the topic of whether or not the function is mathematically provable, consider the logic over the syntax. If a function has multiple return points, this doesn't mean (by default) that it is not mathematically provable.
This is mainly a hang over from Fortran, where it was possible to pass multiple statement labels to a function so it could return to any one of them.
So this sort of code was perfectly valid
CALL SOMESUB(ARG1, 101, 102, 103)
C Some code
101 CONTINUE
C Some more code
102 CONTINUE
C Yet more code
103 CONTINUE
C You get the general idea
But the function being called made the decision as to where your code path went. Efficient? Probably. Maintainable? No.
That is where that rule comes from (incidentally along with no multiple entry points to a function, which is possible in fortran and assembler, but not in C).
However, the wording of that looks like it can be applied to other languages (the one about multiple entry points can't be applied to other languages, so it's not really a program). So the rule got carried over, even though it refers to a completely different problem, and isn't applicable.
For more structured languages, that rule needs to be dropped or at least thought through more. Certainly a function spattered with returns is difficult to understand, but returning at the beginning isn't an issue. And in some C++ compilers a single return point may generate better code if you're returning a value from only one place.
But the original rule is misunderstood, misapplied. and no longer relevant.
You can do this for achieving only one return statement - declare it at start and output it at the end - problem solved:
$content = "";
$return = false;
if($content != "")
{
$return = true;
}
else
{
$return = false;
}
return $return;
I'm usually in favor of multiple return statements. They are easiest to read.
There are situations where it isn't good. Sometimes returning from a function can be very complicated. I recall one case where all functions had to link into multiple different libraries. One library expected return values to be error/status codes and others didn't. Having a single return statement can save time there.
I'm surprised that no one mentioned goto. Goto is not the bane of programming that everyone would have you believe. If you must have just a single return in each function, put it at the end and use gotos to jump to that return statement as needed. Definitely avoid flags and arrow programming which are both ugly and run slowly.
Multiple exit is good if you manage it well
The first step is to specify the reasons of exit. Mine is usually something like this:
1. No need to execute the function
2. Error is found
3. Early completion
4. Normal completion
I suppose you can group "1. No need to execute the function" into "3. Early completion" (a very early completion if you will).
The second step is to let the world outside the function know the reason of exit. The pseudo-code looks something like this:
function foo (input, output, exit_status)
exit_status == UNDEFINED
if (check_the_need_to_execute == false) then
exit_status = NO_NEED_TO_EXECUTE // reason #1
exit
useful_work
if (error_is_found == true) then
exit_status = ERROR // reason #2
exit
if (need_to_go_further == false) then
exit_status = EARLY_COMPLETION // reason #3
exit
more_work
if (error_is_found == true) then
exit_status = ERROR
else
exit_status = NORMAL_COMPLETION // reason #4
end function
Obviously, if it's beneficial to move a lump of work in the illustration above into a separate function, you should do so.
If you want to, you can be more specific with the exit status, say, with several error codes and early completion codes to pinpoint the reason (or even the location) of exit.
Even if you force this function into one that has only a single exit, I think you still need to specify exit status anyway. The caller needs to know whether it's OK to use the output, and it helps maintenance.
I know I will be jumped on for this, but I am serious.
Return statements are basically a hangover from the procedural programming days. They are a form of goto, along with break, continue, if, switch/case, while, for, yield and some other statements and the equivalents in most modern programming languages.
Return statements effectively 'GOTO' the point where the function was called, assigning a variable in that scope.
Return statements are what I call a 'Convenient Nightmare'. They seem to get things done quickly, but cause massive maintenance headaches down the line.
This is the most important and fundamental concept of object oriented programming. It is the raison d'etre of OOP.
Whenever you return anything from a method, you are basically 'leaking' state information from the object. It doesn't matter if your state has changed or not, nor whether this information comes from other objects - it makes no difference to the caller. What this does is allow an object's behaviour to be outside of the object - breaking encapsulation. It allows the caller to start manipulating the object in ways that lead to fragile designs.
I recommend any developer to read about the Law of Demeter [1] (LoD) on c2.com or Wikipedia. LoD is a design philosophy that has been used at places that have real 'mission-critical' software constraints in the literal sense, like the JPL. It has been shown to reduce the amount of bugs in code and improve flexibility.
There has an excellent analogy based on walking a dog. When you walk a dog, you do not physically grab hold of its legs and move them such that the dog walks. You command the dog to walk and it takes care of it's own legs. A return statement in this analogy is equivalent to the dog letting you grab hold of its legs.
You will notice that none of these require a return statement. You might think the constructor is a return, and you are on to something. Actually the return is from the memory allocator. The constructor just sets what is in the memory. This is OK so long as the encapsulation of that new object is OK, because, as you made it, you have full control over it - no-one else can break it.
Accessing attributes of other objects is right out. Getters are out (but you knew they were bad already, right?). Setters are OK, but it is better to use constructors. Inheritance is bad - when you inherit from another class, any changes in that class can and probably will break you. Type sniffing is bad (Yes - LoD implies that Java/C++ style type based dispatch is incorrect - asking about type, even implicitly, is breaking encapsulation. Type is an implicit attribute of an object. Interfaces are The Right Thing).
So why is this all a problem? Well, unless your universe is very different from mine, you spend a lot of time debugging code. You aren't writing code that you plan never to reuse. Your software requirements are changing, and that causes internal API/interface changes. Every time you have used a return statement you have introduced a very tricky dependency - methods returning anything are required to know about how whatever they return is going to be used - that is each and every case! As soon as the interface changes, on one end or the other, everything can break, and you are faced with a lengthy and tedious bug hunt.
They really are an malignant cancer in your code, because once you start using them, they promote further use elsewhere (which is why you can often find returning method-chains amongst object systems).
So what is the alternative?
With OOP - the goal is to tell other objects what to do, and let them take care of it. So you have to forget the procedural ways of doing things. It's easy really - just never write return statements. There are much better ways of doing the same things:
If you really need an answer back - use a call back. Pass in a data structure to be filled in, even. That way you keep the interfaces clean and open to change, and your whole system is less fragile and more adaptable. It does not slow your system down, in fact it can speed it up, in the same way as tail call optimisation does - except in this case, there is no tail call so you don't even have to waste time manipulating the stack with return values.
If you follow these arguments, you will find there really is never a need for a return statement.
If you follow these practices, I guarantee that pretty soon you will find that you are spending a lot less time hunting bugs, are adapting to requirement changes much more quickly, and having less problems understanding your own code.
[1] http://en.wikipedia.org/wiki/Law_of_DemeterMath.Max(int, int)
?? - Stefan Steinegger