When I was a teen, I read a lot of science fiction. Robert Heinlein was one of my favorite authors. He wrote intelligent adventures which were quite good on their own terms, but I always had the sense that they were window-dressing for what he really wanted to do: social commentary through the veil of his own brand of paleolibertarianism. He’d set homesteaders on a planet and engineer situations which called on them to display frontier justice and deep individualism. The characters he poked fun at were the ones who didn’t get it; who didn’t subscribe to his view of what makes the human animal tick in society.
One day, when I was reading one of his books, I came across a word I hadn’t seen before. It was the word canalize. One of the characters in Heinlein’s book spoke about a person being canalized in his thinking. The notion was that as kids we are very malleable in our thoughts and actions, but through interaction with the society we grow up in, we eventually assume a course (like a canal) which is confluent with our environment - so much so that we are unaware of it. Once we are on that course, it’s very hard to change. The thoughts and habits of a lifetime are hard to adjust radically.
I think that design can be canalized as well. What do I mean? Well, let’s imagine an example. Let’s suppose that we have a long method which consists of a sequence of validations. I’m sure most people reading this can imagine the sort of code I’m talking about. It’s just a series of blocks, about five or ten lines apiece, in which something (parameters, input data, etc.) is being validated bit by bit. This sort of code arises in most systems. People do the first piece: a single validation. Then time goes on, they discover a need for a second validation, and they add it. For many people, it just never seems worthwhile to extract the validations to separate methods. The code is rather clear. There is no deep nesting; it’s just a long sequence of similar blocks.
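Here is a minimal sketch of the shape of code I mean. The names and the particular checks are hypothetical, not from any real system; what matters is the structure: one long method, block after block.

```java
import java.util.List;

public class OrderProcessor {
    public void process(String customerId, List<String> items, String shipToAddress) {
        // Block 1: validate the customer id.
        if (customerId == null || customerId.trim().isEmpty()) {
            throw new IllegalArgumentException("missing customer id");
        }

        // Block 2: added later, when someone needed to validate the items.
        if (items == null || items.isEmpty()) {
            throw new IllegalArgumentException("order has no items");
        }

        // Block 3: added later still, for the shipping address.
        if (shipToAddress == null || shipToAddress.trim().isEmpty()) {
            throw new IllegalArgumentException("missing shipping address");
        }

        // ...and the real work follows, with nothing separating it from
        // the validations except its position in the sequence.
        System.out.println("processing order for " + customerId);
    }
}
```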
Now, think about a canal.
The water is flowing and carving its course. What is the course in the code I just described? Well, it seems that it doesn’t have one, really. People can introduce just about any logic that they want to in that sequence. They could start to do completely different work just because it is a convenient place to do it. But, what if we started to extract each validation into a separate method? People could still throw in random bits of logic, another responsibility, but they’d have to make a choice. They could throw it into one of the extracted validations or throw it into the original method. Either way, the code is a bit better off. If the logic is thrown into one validation method, the other ones are fine. If it is thrown into the top-level method, the one we extracted from, we at least have our course in place: the code shows a direction and people coming along later will at least be able to see that the application has a pattern in place for validation.
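One possible shape of that extraction, continuing the hypothetical example from above: each block becomes a small named method, and the top-level method becomes a readable sequence of calls.

```java
import java.util.List;

public class OrderProcessor {
    public void process(String customerId, List<String> items, String shipToAddress) {
        // The course is now visible: validation happens here, in named steps.
        validateCustomerId(customerId);
        validateItems(items);
        validateShippingAddress(shipToAddress);

        System.out.println("processing order for " + customerId);
    }

    private void validateCustomerId(String customerId) {
        if (customerId == null || customerId.trim().isEmpty()) {
            throw new IllegalArgumentException("missing customer id");
        }
    }

    private void validateItems(List<String> items) {
        if (items == null || items.isEmpty()) {
            throw new IllegalArgumentException("order has no items");
        }
    }

    private void validateShippingAddress(String shipToAddress) {
        if (shipToAddress == null || shipToAddress.trim().isEmpty()) {
            throw new IllegalArgumentException("missing shipping address");
        }
    }
}
```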
If you’re thinking that this is related to the “no broken windows” theory, you’re right. But, I think there is a little bit more. When we factor our code finely, we make some changes much harder than they would be otherwise. Large methods are like stagnant pools. All sorts of computation can be bound together in a big tangle. However, once you factor a bit, and set a course, chances are no one will go through the process of inlining those methods. It’s just not a natural thing to do. We’ve given the code a bit more structure, and moved toward a design where validation is seen as a separate thing - people kind of know where it goes now.
Now, let’s try something else. We can make the observation that our validations can be performed in any order. The sad thing about most languages is that we have to put things in a particular order, even if order doesn’t really matter. Text, in a file, has a natural order: top-down. Could we do something a bit different? One thing that we could do is turn our little validation methods into objects and put them in a set. Then we could write code to iterate the set and fire each of them off. Notice that I said “set.” We could use a list, but lists have the same issue as a sequence of method calls in the text of a larger method: there is an ordering, but it doesn’t matter. On the one hand, people might look at the ordering and think that it is significant. On the other hand, people might start to take advantage of this accidental ordering and couple things together in a way in which it is hard to change. In either case, seeing that the code uses a set rather than a list at least betrays something about the intent of the designer.
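A sketch of that idea, again with hypothetical names: each validation becomes an object behind a small interface, and the objects live in a set, whose lack of ordering says what the list couldn’t.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class OrderValidation {
    // A minimal order type, just for the sketch.
    public static class Order {
        String customerId;
        List<String> items;
    }

    // Each validation is an object behind this interface.
    interface Validator {
        void validate(Order order);
    }

    static class CustomerIdValidator implements Validator {
        public void validate(Order order) {
            if (order.customerId == null || order.customerId.trim().isEmpty()) {
                throw new IllegalArgumentException("missing customer id");
            }
        }
    }

    static class ItemsValidator implements Validator {
        public void validate(Order order) {
            if (order.items == null || order.items.isEmpty()) {
                throw new IllegalArgumentException("order has no items");
            }
        }
    }

    // A set, not a list: the container itself says that order does not matter.
    private final Set<Validator> validators = new HashSet<>();

    public OrderValidation() {
        validators.add(new CustomerIdValidator());
        validators.add(new ItemsValidator());
    }

    public void validate(Order order) {
        for (Validator validator : validators) {
            validator.validate(order);
        }
    }
}
```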
Donald Norman uses the term affordance to describe qualities of things which tend to encourage or discourage particular uses. In software, Kevlin Henney has written about the affordances of interfaces: how they can communicate proper use to users. I think that we can use the concept in a slightly different area. There are ways of writing code which encourage or discourage particular future changes in a code base. It’s never perfect, but it helps. However, you can have too much of a good thing. If you canalize too deeply, it can be harder to refactor your code toward a vastly different structure. The paths you lay down are structure, and while structure is great for supporting our work, it also canalizes and reinforces itself over time.
In the end, software development is a natural process. We shouldn't be surprised to see the same sort of inertial change patterns arise that we see in nature.
I agree with you, but I'd like to add something about setting structures. The structures you set by factoring out code in a certain way reflect your current knowledge about the problem and solution domain as well as your engineering skills.
So if you set this in "stone," the question is how valid your mental model is at a later time, and how hard it is for someone else to follow your thoughts and your flow through the channel and to understand WHY you structured it that way.
Enabling change here (even being able to create a totally different flow) is challenging, to say the least. Many people really refrain from inlining code in order to extract it again differently (even though you can inline classes with a single refactoring).
So by factoring out you kind of take the stance of knowing better or best how to structure the solution. But that may not be the case.
So make it easy for someone else to follow your thoughts, try to include the whys in the code (naming, comments), and don't discourage them from undoing your canalization.
How this relates to the OCP is also an interesting discussion.
Posted by: Michael Hunger | July 01, 2009 at 03:39 PM
Hey, in your book "Working Effectively with Legacy Code" (sweet book btw), the table at the bottom of page 6 is reprinted at the top of page 7, with no differences between them. Seems like a typo.
Peace
Posted by: Mark Rogers | July 22, 2009 at 03:04 PM