Today I got caught up reading a back-and-forth on Twitter about a proposed assertion syntax for Ruby testing frameworks. It was interesting, but yet again it was about how to make tests 'read well.' It's hard to disagree with that, right? Well, I don't in principle. I just think about the amount of time we spend trying to warp programming syntax into English and I wonder whether it is really worth it. If it is, maybe we need more malleable languages... more malleable than Ruby. Or maybe we don't. Maybe it's just too much fun... maybe there's too much 'game' in molding an existing language into what we want it to be. But I'm getting off topic.
The thing I wanted to blog about is the clash between this natural language style of programming and the other sorts of programming we do. In the agile community, seeded by Kent Beck and Ward Cunningham, there is this meme that the code should tell a story. Luckily, our languages are nimble enough to help us with this. We can use evocative names and string together code lines so that they read like a book. Yes, it's work, but it is very doable. OO was made for that sort of thing.
Let's look at an alternative (avert your eyes if you must):
`git log #{@filename}| grep ^commit`.split(/\n/).map(&:split).map {|fields| fields[SHA1_COLUMN] }
This is a line of code that I wrote to pull the sha1s out of the output of a git log command.
Ugly? Yes, but I can make it better. Let's pull out the bash by saving the shell call's result into a variable called 'text':
text.split(/\n/).map(&:split).map {|fields| fields[SHA1_COLUMN] }
Better? No, I can still hear you saying "yuck!" I'm not going to argue with you, but I am going to say that I think this code is an example of an equally valid mode of expression. It's just a visual mode rather than a narrative one.
Let's bring it home. Try to explain the code out loud: "take the text and split it into lines, then split those lines into fields and map each line to the field containing the sha1."
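To make that spoken walk-through concrete, here is a minimal sketch that runs the same pipeline one step at a time. The sample text and the value of SHA1_COLUMN are assumptions for illustration: it presumes each line looks like "commit <sha1>", and the sha1 values are made up.

```ruby
# Hypothetical sample standing in for the output of
# `git log <file> | grep ^commit`; the sha1 values are made up.
text = "commit 1111111111111111111111111111111111111111\n" \
       "commit 2222222222222222222222222222222222222222\n"

# Assumption: each line reads "commit <sha1>", so the sha1 is the second field.
SHA1_COLUMN = 1

lines  = text.split(/\n/)                   # "take the text and split it into lines"
fields = lines.map(&:split)                 # "then split those lines into fields"
sha1s  = fields.map { |f| f[SHA1_COLUMN] }  # "map each line to the field containing the sha1"

puts sha1s
```

The intermediate names exist only so each clause of the sentence has something to point at; the one-line version does the same work.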
Maybe some people do subvocalize that sort of thing to themselves when they see code like that; I just know that I don't. When I see code like that, I construct a picture of what it is doing inside my head. Words don't play into it much.
I don't think I'm alone in this. We have different ways of making code understandable and making it 'read like English' isn't the only way.
I remember years ago reading about something called Neuro-Linguistic Programming, and although I felt that much of it was popularized off the deep end, the idea that they had that we have different cognitive modalities is something I'll always remember. According to their theory, some people are far more visual. Others prefer to get information in an auditory way, and often they subvocalize when they are thinking.
No doubt this is a ruthless simplification, but what I notice about programming is that there aren't really any fixed boundaries. We go back and forth between more structural (a.k.a. functional, visual) ways of thinking about code and more narrative (auditory) ways.
When I look at really good functional code (and believe me the piece I have up there isn't a good example) there is a balance between good naming and structure that reveals computation.
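As one possible illustration of that balance (a sketch, not a claim about what good functional code must look like), the same computation can be split into small named pieces while keeping the pipeline shape visible. The method names, the sample log, and SHA1_COLUMN = 1 are all assumptions introduced for this example.

```ruby
# Assumption: a commit line reads "commit <sha1>", so the sha1 is field 1.
SHA1_COLUMN = 1

# Keep only the lines of the log text that start with "commit".
def commit_lines(log_text)
  log_text.lines.map(&:chomp).grep(/^commit/)
end

# Pull the sha1 field out of a single commit line.
def sha1_of(commit_line)
  commit_line.split[SHA1_COLUMN]
end

# The structure is still a pipeline; the names say what each stage is for.
def sha1s(log_text)
  commit_lines(log_text).map { |line| sha1_of(line) }
end

# Made-up sample log text for illustration.
sample = "commit aaaa\nAuthor: Someone\n\ncommit bbbb\nAuthor: Someone\n"
p sha1s(sample)
```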
It just isn't narrative.
It's a different mode and I think we have to recognize it.
I tend to like my programming language to be quite readable by us humans - but I agree we shouldn't spend lots and lots of time forcing the Programming Language into English.
One place where I feel that a bit of that work IS worth it is in tests. To me the tests need to communicate with humans first and the computer second.
In non-test code the program needs to communicate with both the computer and humans at a more equal level, so compromise on language is fine by me.
Posted by: Verdammelt | March 30, 2011 at 07:11 AM
@Verdammelt I'm inclined to agree... except, I think tables work well for tests too. It seems situational. I also wonder whether with tests we're seeing the remnant of the "tests should be readable by the end-user" mode of thought also. To me, it's "sometimes yes, sometimes no."
Posted by: Michael Feathers | March 30, 2011 at 07:39 AM
But I think tables are very readable by humans as well!
Posted by: Verdammelt | March 30, 2011 at 07:40 AM
Okay, that's fair. :) I thought you were pushing the natural language thing.
Posted by: Michael Feathers | March 30, 2011 at 07:46 AM
Yeah, my point was that tests need to be much more human readable, so I think working harder to make them readable is ok. For non-test code it is important to be readable, but more compromise on human vs. computer readability is fine there.
Posted by: Verdammelt | March 30, 2011 at 08:26 AM
Readability can be associated with natural language, but I (personally) think that's a bad association. I've always wondered what Chinese code looks like. How do they name variables and methods? If they do that in their own language, I prefer readability to be far from natural language. We both speak the programming language, so it's better to use that. I've read programs that look like Chinese, and I think that's the point of readability: the program should be legible.
On the other hand, people who don't know the programming language shouldn't judge readability; maybe they cannot read the program because they have limited knowledge of the language in which it was expressed. This occurred to me when someone told me not to use the null coalescing operator (C#) because the resulting code was not legible.
I have also seen some code written with a focus on natural language, where knowing the programming language doesn't help much in understanding what's going on behind those methods. What is the result of the method when()?
This is just to say that readability is not the direct result of using natural language and (we all know this) is not achieved using just the programming language. Writing readable code is a skill, and as such, practice is required, a lot.
Erlis
Posted by: Erlis | March 30, 2011 at 08:28 AM
I think that line of code actually could be read as a narrative, if only you didn't have to write regular expressions.
If someone wrote an "object model" for git, wouldn't the code look a lot less intimidating?
`git log #{@filename}| grep ^commit`.split(/\n/).map(&:split).map {|fields| fields[SHA1_COLUMN] }
In Clojure, a syntax I'm a little more familiar with, I could envision something like this:
(map :sha1-column (:commits (. git (log filename))))
With pipeline syntax, we can arrange the words in another order that's a little different, but in general, it reads out:
"For each of the commits for a particular file, select its sha1-column"
I was a Smalltalk programmer before too, and I'm pretty sure the Smalltalk code can be made to look very close to the English I have above ;-)
Posted by: Duncanmak | March 30, 2011 at 09:59 AM
I can appreciate where you are going with this, but the visual form made me want to see if there wasn't a more expressive form (less piped, with fewer temporary elements), and I came up with this:
`git log #{@filename}`.scan(/commit (.*)/).flatten
Except for the superfluous Array#flatten which is an artifact of how captures are treated within String#scan, I think this meets both structural and narrative criteria.
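A small sketch of why the flatten is needed: when the pattern contains a capture group, String#scan returns one array per match, so the result is nested. The log text below is a made-up stand-in for real `git log` output.

```ruby
# Made-up stand-in for the output of `git log <file>`.
log = <<~LOG
  commit 1111111111111111111111111111111111111111
  Author: Example Author <author@example.com>

      First message

  commit 2222222222222222222222222222222222222222
  Author: Example Author <author@example.com>

      Second message
LOG

# With a capture group, scan yields an array of captures per match.
nested = log.scan(/commit (.*)/)

# flatten collapses the one-element arrays into a flat list of sha1s.
flat = nested.flatten

p nested
p flat
```

Note that the unanchored pattern would also match the word "commit" followed by a space inside a commit message, so a real use might want to anchor it with `^`.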
That is not to say that there isn't a place for both forms of expression, but I think that it's possible to find a good expression that transcends both forms.
Posted by: Bheeshmar | March 30, 2011 at 07:48 PM
The narrative aspect should not obscure the "structure that reveals computation," but I find value in using all aspects of the [computer] language to communicate to my future self and other programmers.
In chapter 23 ("See No Evil") of SQL Antipatterns, Bill Karwin talks about a related concept: writing readable code at the cost of omitting or obscuring important logic.
The concept applies to all languages -- not just SQL. A quote from the chapter: "Some computer scientists have estimated that up to 50 percent of the lines of code in a robust application are devoted to handling error cases."
Error logic is often ugly.
I agree that the communication with programmers should not supersede the communication with computers.
Posted by: Matthew Rodatus | April 01, 2011 at 06:15 AM
Not that it's the point of this article, but getting a list of just the SHA doesn't require any ruby at all.
git log --format="%h" --no-abbrev
If you don't need/want the whole sha, you can omit the --no-abbrev flag.
Posted by: Mike Busch | April 07, 2011 at 08:04 PM
I usually feel TDD (test driven development) is more of the 'visual' cognitive modality that you talk about related to neuro-linguistic programming. The visual stimulus for developing code is the stimulus of seeing tests perform.
Posted by: Mohan Arun | July 01, 2011 at 01:05 AM
Apparently experienced chess players "see" the board differently from beginners. I suspect the same is true of code, and that there are non-narrative ways of "seeing" that develop with experience.
There's an excellent recent book from Daniel Kahneman which goes into a lot of detail on this stuff. I posted a quick (and _very_ incomplete) overview here: http://www.agilekiwi.com/peopleskills/understanding-ourselves/becoming-an-expert/
Posted by: John Rusk | March 06, 2012 at 11:10 PM
I got the feeling that some people want to build programs like in Star Trek. Talk to the computer in human English.
Some people mention making it more readable. Readable to whom? To programmers or to non-programmers? Unless you are trying to achieve Star Trek technology, why would you want the code to be more readable to "end users"? They don't want to read this stuff.
More readable to folks that have limited knowledge about the language? Why make it so easy for them? There is a saying in Colombia, "darles las cosas masticadas," which means to give somebody the food already chewed. When you say this about somebody it means more like "you have to feed that person like a baby and hold their hand".
I believe in more readable to programmers (and technologists) with the goal of making the program more efficient. Not necessarily through words. Maybe the future is visual. We develop code through visual shapes where a circle may be a class, a square a function and lines to map between different objects. Maybe the future is auditory. Program through the use of sounds. The synthesizer is the sound of the future, according to Giorgio.
Posted by: Tom Ordonez | May 16, 2013 at 08:39 AM
The easier you make it to understand code, the easier it is to produce code. The easier it is to produce code, the harder the problems you can tackle.
Code that matches how people think is easier to understand without a steep learning curve.
Yes, experts see code differently from non-experts, just as when in mathematician mode I see math differently from most people. But matching language structure to how people think helps you go further faster. Generally elegance and brevity go together. If either is missing, it is time to step back and think.
Posted by: alexK | May 17, 2013 at 12:04 PM