I remember when I first saw the spec for C#. It was around the time of the beta release. C# was a fresh, clean language. The first thing that I noticed, though, was that it had more stuff. I used to joke with friends that it was as if the language designers had taken the list of bugs for Java and decided to do something about each of them. To be fair, they did a great job, but C#, even in its early days, was not a minimal language. If a feature had a legitimate use and Java decided not to include it, chances are C# did – more power to the programmer.
There were two features that puzzled me though: ‘ref’ and ‘out’. The idea for these had been around for ages. Ada had them, and they’d cropped up in a number of Algol-derived languages. You could even argue that C had the rudiments of the semantics – you could always pass pointers into your functions and modify what they refer to, but to me, that was really more of a convenience. When you don’t have exceptions, error codes are often munged into the return value. Passing values back via parameters can be cleaner at times in C, but if you are working in a more modern language there really are better alternatives.
The thing I didn’t get at the time was why ‘ref’ and ‘out’ were there. I could understand ‘ref’ being used for interop, but ‘out’ seemed archaic unless it was there for some sort of value type optimization. I didn’t think about it much beyond that. It wasn’t until a while ago that I started to think about it again, after I’d seen a team that overused ‘ref’ and ‘out’ severely.
Whenever you are tempted to use ‘ref’ or ‘out’ in C# consider whether you can do the work by designing functions with single return values.
This may seem extreme, but it isn’t, really. Our thoughts become clearer when we force ourselves to reconsider our designs under that constraint. Sometimes we find that we can decompose the problem into several functions rather than one. At other times we find some relationship between outputs that leads us to create a new class or struct that we can use as a single return type. Really, a function is supposed to do some thing for you, not some things. If a function is giving you two things, it's beyond its quota.
There really are only two cases: either the things that you want the caller to know about are related (and you can show that relationship in a data structure) or they aren’t and you should consider letting the caller ask for them independently. In the first case, you are raising the level of abstraction of the system by introducing a new type. In the second, you are increasing the orthogonality of the system by making sure that functions do one useful thing for the caller. Either improves the design.
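As a rough sketch of those two cases (every name here, such as `Bounds`, `Stats.FindBounds` and `Inventory`, is hypothetical and made up purely for illustration), the 'out' version of a method and its single-return alternatives might look something like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The tempting signature, reporting two things through 'out' parameters:
//   void FindBounds(IEnumerable<int> values, out int min, out int max)

// Case 1: the outputs are related, so give the relationship a name.
// 'Bounds' is a hypothetical type introduced only for this sketch.
public struct Bounds
{
    public Bounds(int min, int max) { Min = min; Max = max; }
    public int Min { get; }
    public int Max { get; }
    public int Width => Max - Min;   // behavior now has a natural home
}

public static class Stats
{
    // One return value. Assumes 'values' is non-empty; Min()/Max() throw otherwise.
    public static Bounds FindBounds(IEnumerable<int> values)
    {
        var list = values.ToList();
        return new Bounds(list.Min(), list.Max());
    }
}

// Case 2: the outputs are unrelated, so let the caller ask for each one
// independently instead of receiving both from a single call.
public class Inventory
{
    private readonly List<string> items = new List<string>();

    public DateTime LastUpdated { get; private set; } = DateTime.MinValue;

    public void Add(string item)
    {
        items.Add(item);
        LastUpdated = DateTime.UtcNow;
    }

    public int Count() => items.Count;
}
```

With this shape, `Stats.FindBounds(new[] { 3, 9, 4 })` hands back a single value whose `Min` is 3, `Max` is 9 and `Width` is 6, and the new type gives later behavior somewhere to live.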
If you need more convincing, attempt to extract random pieces of code into methods with a refactoring tool in C#. Nearly every case where the tool chooses to use 'ref' or 'out' is a malformed abstraction. In languages without 'ref' and 'out' the tool would just say "no can do."
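To make that concrete, here is a hedged sketch of the kind of signature an extract-method refactoring can end up with when the selected code both updates an existing local and introduces a new one. It is not the output of any particular tool, and `Report`, `Accumulate`, `Summarize` and `Summary` are hypothetical names:

```csharp
using System;

public static class Report
{
    // The sort of method a blind extraction might yield: 'total' already
    // existed in the caller and is modified (ref), while 'count' is first
    // assigned inside the extracted code (out).
    public static void Accumulate(int[] values, ref int total, out int count)
    {
        count = 0;
        foreach (var v in values)
        {
            total += v;
            count++;
        }
    }

    // The same work reshaped around a single return value.
    public static Summary Summarize(int[] values)
    {
        int total = 0;
        foreach (var v in values) total += v;
        return new Summary(total, values.Length);
    }
}

// Hypothetical type that names the relationship between the two results.
public struct Summary
{
    public Summary(int total, int count) { Total = total; Count = count; }
    public int Total { get; }
    public int Count { get; }
}
```

The first method only makes sense alongside the caller it was cut from; the second can be read, named and tested on its own.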
End note: The trick of bundling a function’s outgoing values into a class isn’t always called for, but it is a great tool to have in your arsenal. I can’t tell you the number of times I’ve seen classes that came about that way move toward centrality in a design.
I think both are/were required for interop. COM had [in,out] (ref) and [out] only (out) -- the difference was significant due to COM's transparent cross-process or cross-machine marshalling. You didn't want to marshal a value both ways if one way was enough.
For what it's worth.
Posted by: Kim Gräsman | November 29, 2010 at 01:21 PM
Yes, that makes sense. I forgot about COM's [out].
Posted by: Michael Feathers | November 29, 2010 at 01:25 PM
I'm writing Standard ML right now, and I have been comparing C#'s "bool foo.TryParse( string s, out foo result)" idea to ML's "foo.fromString s : option foo". I think, based on this article, you would suggest ML's approach over the C# approach? I have found both approaches awkward. I've been considering a continuation passing style approach: "void withFoo( string s, Action success, Action fail )", but I haven't decided if I like it better.
Posted by: Ben | November 29, 2010 at 01:27 PM
I'm curious to hear what you think is a good alternative to the TryParse pattern (http://msdn.microsoft.com/en-us/library/ms229009.aspx) used by some .NET Framework methods such as Int32.TryParse and Dictionary.TryGetValue. I think these are the main situations where I end up using something that uses out, but I'm not sure what the better alternative is.
Posted by: Amy Thorne | November 29, 2010 at 01:46 PM
How do you feel about using a Tuple when you want to return two values from a function?
A little bit awkward from C#, but I think easier to understand than ref/out. It works better when you have language support, like in F# or Python.
Posted by: Dave | November 29, 2010 at 02:21 PM
Several people have commented on combining success/fail with a return result. One frequently used pattern, especially for remote communication calls, is an OpResult.
public class OpResult<T> {
    public T Result { get; private set; }
    public bool WasSuccessful { get; private set; }
    // ...constructor, etc...
}
WasSuccessful can be replaced with an IEnumerable of error strings if needed - whatever the situation might call for.
I presume what Michael is warning against (please correct me if I'm wrong of course) is a method that modifies multiple separate data structures, not a method that returns an operation success indicator and result data at the same time.
Posted by: Nolan Egly | November 29, 2010 at 02:50 PM
I think 'out' is just misleading in your argument. It's just the fall guy because it was the abused feature.
You could have made a mess with any number of language features if you don't gel related things together into data structures.
Not only do you have the returning problem, you also have the problem of passing around a bunch of related values. It all leads to nasty code.
It's also fun when the related parameters / results are needed by different functions and are passed in different orders. Especially if they are all a common basic type (like an int) and then values get transposed and all kinds of fun occurs.
So "return back, not out" is not a good rule. It's more like one of those things you tell people who have abused a particular language feature to make them stop and think, but it's a transient piece of advice that shouldn't become a "rule" or rule of thumb.
Posted by: Keith | November 30, 2010 at 08:14 PM
The one situation where out is the better - i.e. more elegant - solution IMO is when dealing with the immutable (aka persistent) version of some useful data abstractions. Consider the pop operation of an immutable stack; since the popped element and the new stack instance don't have any relation (except their common past), returning the new stack object in an out parameter seems more appropriate to me than creating a new abstraction just for this purpose. YMMV.
Posted by: Johannes Link | December 06, 2010 at 09:27 AM
Can you please explain in a bit more detail why we shouldn't be using the TryParse pattern?
Posted by: Erhan Hosca | December 30, 2010 at 12:36 PM