« The Single Responsibility and Open/Closed Principle are the Same | Main

November 05, 2013



Does this have performance implications? Did you benchmark it at all? How often is this operation being done?


It seems that you recommend using language features, or different languages, to encode control structures in a different way. Logically, the control structures don't seem to have gone. But your way of presenting the logic of program execution is easier to understand and extend. Is that what you're saying?


I like the idea. I would say that switch statements aren't worse than polymorphism though - they're just different. Switch statements are perhaps better when the number of cases isn't likely to change but the number of functions (which switch on the same type) is likely to change. Polymorphism is better in the opposite situation - the number of cases (or subtypes) is likely to change, but the number of occurrences of the same switch (or number of member functions) is likely to remain constant. Perhaps the same could be said for a simple if statement.

Franklin Chen

Unfortunately, in this example, the code is quite inefficient. Each call to pad makes a copy of the original array. Even in the case where n is ary.length.

I'll write up more about what one could do instead with a language with lazy data structures and lazy evaluation, and whether that is even a good solution anyway.

Nick Dandoulakis

I like it. The code is expressing the intent in a clear manner. Deriving a proper abstraction, like the `pad` function, requires more thought and time.

In my opinion, the if-statement could be replaced with
pad(ary, n).take(n), and we're done.

We can even rename `pad` to `zeropad`, or something similar.


I'll throw my golf swing into the mix for fun....

def padded_take ary, n
ary[0..n] + [0] * [n - ary.length, 0].max

..but that shouldn't take away from your point. Any time you can remove conditions from a function it is a good thing. A mentor once told me that a method should know what it is supposed to do, and should not have to guess at the output based on the inputs. I think this is along the same vein.


RE: "I'm sure there's some way that we can see this as a form of single responsibility violation."

Agreed. More precisely, there are two violations, one of the responsibility rule, one of the open/closed principle. (As you suggest in a recent article, these violations tend to be coupled.)

First, the traditional if/then/else control structure tightly (via syntax) couples the responsibilities of "observing a condition" with "action based on that condition".

Second, we're forced to squeeze all the conditional actions into one block, i.e. the if/then/else actions are open to modification but closed to extension. It isn't a surprise that we end up invasively modifying code to add more actions, rather than extending a condition with more actions.

RE: "What can we do?"

We solve those problems! It isn't difficulty, and answers already exist. My own favorite is based on ArrowChoice in Haskell: a condition is expressed as a sum-type value (a + b), and conditional action is expressed by `left :: (a ~~> a') -> ((a + b) ~~> (a' + b))`.

I favor a more monadic variation that also separates the conditional action from the point of application: `leftM :: ((a ~~> a') * (a + b)) ~~> (a' + b)`.

The technique you're describing is closer to using numbers as conditionals based on `f^4 = f . f . f . f` vs. `f^0 = id`. This isn't open to extension, though, since the number is 'consumed' by this expression form. So I wouldn't recommend it as a general technique.

Josh Cheek

I think in some cases this is true, but I'm not convinced this is one of them. I think this is what I'd like best:

def padded_take ary, n
  ary.take n if n <= ary.length
  ary + [0] * (n-ary.length)

The example in the article would be better if it hadn't pulled the padding out into its own method, because it hides all the relevant details. With `pad(ary, n).take(n)`, we can't see what it means to pad the array. So we must go down into the next method anyway to answer questions like "does it get padded with zeros or nil?" and "does it pad on the left side or the right side?" and "what if n is negative?"

I think that distracts from the actual point of the article, which is about conditionals, not extracting methods, so the final version should probably be:

def padded_take ary, n
  pad_length = [0, n - ary.length].max
  (ary + [0] * pad_length).take(n)

And for me, at least, it's easier to understand the guard clause than try to think through the normalization that allows the second one to work in both cases.


Consider that in a language with more lazy value support (such as Haskell or Clojure), the solution becomes even more elegant, the equivalent of:

def padded_take ary, n

where Repeat is an Enumerable which lazily produces an infinite list of some value (in this case, 0), and where #then returns not a lazy Enumerable which will first enumerate the receiver and then, if it runs out of values, enumerate the argument.

You can implement this in Ruby without too much trouble, but in another language these would be part of the standard library, and this idiom would be simple and recognizable enough that defining #padded_take would be silly.

Bernard Gallet

You have not removed conditions. You just pushed them on the side. How do you think max is implemented?

Michael Feathers

Bernard: I don't think we can write any software that does not depend upon conditionals at some layer of the stack. But not having them part of the code we directly maintain can give us some advantages.

Johannes Brodwall

I like it.

To those saying that calling pad every time is inefficient because it creates a copy when padding is empty: Doesn't this optimization belong in the + operator?


Try the APL pogramming language, rarely use loops. Very optimized primitives for processing without loops.

Michael Feathers

Jim: Oddly enough, I'm currently working on a J (APL derivative) interpreter that can be embedded in Ruby: https://github.com/michaelfeathers/jop

Marcel Popescu

"Some of the worst code I've ever seen is a cavernous nightmare of nested conditions with odd bits of work interspersed within them."

Ayende's story is a must here, IMO - http://ayende.com/blog/4612/it-really-happened-legacy-programmers-tales

On-topic, I agree with the enumerable idea; in C#, that would be

ary.Concat(Enumerable.Repeat(0, n)).Take(n).ToArray();

A bit more verbose than Peeja's example but still in the same ballpark, and definitely superior to the if as it expresses the "what" instead of the "how" (take the array, concatenate it with a sequence of zeroes, only keep the first n elements).



Great point. Meanwhile, though, I was wondering about "ary". I gotta ask, is there a common standard that includes eliminating any repeated characters?

Or, otherwise, what up with that name?


Michael Feathers

Mike: The word 'array' has always had too many letters. I've tried to alert the OED to this, but they haven't responded so I've decided to take matters into my own hands.


Being a mathematics aficionado, I also like these "elegant", "case-less" solutions.

HOWEVER, I have found out that when thinking about such algorithms, almost always we (all stakeholders in the code: analysts, testers, and also these lowly programmers) argue by cases: "Let's assume the list is longer, then ...; but if it is shorter, then ...".

And all the (unit and other) tests done will also do this.

So my experience is that you have gained nothing - nothing at all; and the risk is HIGHER that maintenance introduces errors because each change "changes all cases together".

Yes, if someone writes code where ONLY the cases that behave VERY SIMILARLY are grouped together with such ideas, THEN the code really becomes much better to maintain (I hesitate to write "easier to maintain" ...).

But if the grouping becomes like tricky like in the original example (where the max is just another "case-distinguishing operator", i.e., something, which exactly the same amount of "think time" as an if), then the maintainers will curse you more ....


A special case: In PASCAL (my first language) and all later languages, you can write

boolVar := a > b;

For a long time I and many other "advanced" guys laughed at the people who wrote

IF a > b
THEN boolVar := TRUE
ELSE boolVar := FALSE;

However, after having worked (and instructed) larger teams, I found that the latter allows you do add comments "at the right place":

IF a > b

(* This is the usual case when the user does this and that ... *)
THEN boolVar := TRUE

(* Only when that funny things happens, we arrive at this place; this has a few consequences downstream, namely ... *)
ELSE boolVar := FALSE;

This important "real-world" structure is completely lost in the "elegant", IF-less formulation ...


Nevertheless, refactorings like "replace cases with polymorphism" and also "replace cases with encompassing algorithm" (as in the article) are valuable tools of the trade ...



(Ahem ... sorry for the 3...5 grammar and spelling errors in my posting).


... after some reflection, I suspect I know why the idea of "unconditional programming" seems so attractive for Michael's padded_take function: In that case, the array in question was used (always or at some places) as a SUBSTITUTE for an infinite list. Thus, the actual job was:

Get me the first n objects of an infinite list defined like that...

There is no case distinction in this specification, and therefore Michael (and we all) expect that none should be in the code. The problem is therefore not with the algorithm, but with the wrong data structure.
Now, the selection of a finite array to represent an infinite list may well have good reasons (among them performance-related onces). And in that case, the "re-introduction" of the case-less abstraction is a good thing.

However, it might also be worthwhile to change the used collection implementation so that one would not get need any case distinction in the first place.

And more important, as I said in my previous comment, there are cases where "the world" has some distinction, and hence that distinction should remain in the code (although much could be said about that "the world" ...).

But Michael was careful enough not to claim that this abstraction is "obviously better" or something like that.

And on the whole, I agree with the statement at the beginning: "Over and over again, I find that better code has fewer if-statements ....".

Gregory Shilin

Actually one can write much simpler version of the function:
def padded_take ary, n
Array.[](*ary, 0 * n).take(n)

Gregory Shilin

Sorry, that's the correct one:
def padded_take ary, n
Array.[](ary, [0] * n).flatten.take(n)


Nice example.

It occurs to me that every time we do this (remove conditionals) is an act of generalisation of special cases: null object, strategy/command, using a map instead of a switch etc.

It gives me the idea to ask myself the explicit question of "how can I change the code so that one case is just a degenerate case of a more general one?"

Reminds me of Kevin Rutherfords ramblings : http://silkandspinach.net/2004/07/16/if/ http://silkandspinach.net/2004/07/16/hexagonal-soup/

The comments to this entry are closed.