I've spent the past 8 years or so looking at ugly code. This isn't uncommon in software development, but in my case, I've been looking at different ugly code developed by different teams every couple of weeks. One question that people often have is whether to refactor or rewrite. It's never a simple call. Usually, people want to rewrite code because they don't understand it. Yet rewriting code often requires us to understand it well enough to proceed with the rewrite, especially if there are existing customers who depend on all of the nuances of behavior that the system has consistently exhibited.
When you see enough of this, it's easy to take a bleak view. There are many places in the industry where existing mountains of code are a drag on progress. The worst thing is that quite often organizations are so close to the problem that they cannot see its full extent. When business sponsors get the cold shoulder enough times when they ask for particular features, they stop asking. They have learned that "changes to X are expensive" and they gradually shape their business in different ways. The effects ripple out from there. Younger organizations without as much software infrastructure often have a competitive advantage, provided they can ramp up to a base feature set quickly and provide value that more encumbered software-based companies can't. It's a scenario that plays out over and over again, but people don't really talk about it. When Agile came around, there was a lot of talk about flattening the cost of change in code. With good practice, you can do quite a bit of that. However, entropy happens. Older codebases are typically harder to change.
A few years ago, I was telling some friends about an experiment I would love to run. I'd like to have a code base where every line of code written disappears exactly three months after it is written. If you were able to get past people gaming the system by copying code someplace else and then copying it back in when it was deleted, you'd have a very interesting set of constraints. Developers would be rewriting code constantly and they'd develop insights into ways to rewrite the code more compactly and (hopefully) more understandably. Perhaps more importantly, the business would have to make very serious tradeoffs about the features. If you are limited to a smaller number of features (because code keeps disappearing), you have to make sure that the ones you keep are really the ones which are making you money. I have the suspicion that a company could actually do better over the long term doing that, and the reason is that the costs of carrying code are real, but no one accounts for them.
Recently, there's been a strong move toward Lean in the industry. The roots of Lean are in manufacturing: the Toyota Production System, etc. I like some of the concepts, but I remember being shocked early on by what the Lean community considers to be "inventory." In manufacturing, inventory is a very clear concept. It is the extra stuff hanging around, the things that are partially done and queued for the next bit of work. In Lean Software Development, requirements are often seen as inventory. If you spend a lot of time elaborating requirements for features you are not going to work on for a while, your process isn't streamlined enough. That's fair, but I think that the brutal reality is that we have something much more tangible that we can see as inventory: our code.
Let's go back to manufacturing. If you are making cars or widgets, you make them one by one. They proceed through the manufacturing process and you can gain very real efficiencies by paying attention to how the pieces go through the process. Lean Software Development has chosen to see tasks as pieces. We carry them through a process and end up with completed products on the other side.
It's a nice view of the world, but it is a bit of a lie. In software development, we are essentially working on the same car or widget continuously, often for years. We are in the same soup, the same codebase. We can't expect a model based on independence of pieces in manufacturing to be accurate when we are working continuously on a single thing (a codebase) that shows wear over time and needs constant attention.
No, to me, code is inventory. It is stuff lying around and it has substantial cost of ownership. It might do us good to consider what we can do to minimize it.
I think that the future belongs to organizations that learn how to strategically delete code. Many companies are getting better at cutting unprofitable features in their products, but the next step is to pull those features out by the root: the code. Carrying costs are larger than we think. There's competitive advantage for companies that recognize this.
When writing code, I always try to ask myself "will this code pay for itself?" (an idea borrowed from Josh Bloch, I think). And it's easy to spot a piece of code that has stopped paying for itself, but that's usually long after the fact.
The three-month experiment aside, how do you decide this? Is code like any other asset that depreciates by a fixed amount over time, until you eventually write it off?
There are some artificial milestones:
- when someone leaves/is leaving, someone on the team rewrites parts of the code the departing developer was responsible for.
- an update to a new release of a core technology (e.g. the appserver, or the language or whatever) should trigger a cull to make optimal use of that release. Instead what happens is you end up with a mixture of code that uses the newer features, and code that does things the old way.
- use source control to guide changes. e.g. the top 10 most modified classes in the last six months that are twelve months or more old need a rewrite
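The last milestone can be sketched in a few lines. This is only an illustration under stated assumptions: the change records and first-commit dates are taken to be already extracted from your version control history, and `rewrite_candidates` and its parameters are hypothetical names, not part of any tool.

```python
from collections import Counter
from datetime import date, timedelta


def rewrite_candidates(changes, created, today, top_n=10,
                       window_days=182, min_age_days=365):
    """Rank classes by how often they changed recently.

    changes: iterable of (class_name, change_date) pairs (assumed input).
    created: dict mapping class_name to the date it was first committed.
    Only classes at least `min_age_days` old are considered, and only
    changes within the last `window_days` are counted.
    """
    window_start = today - timedelta(days=window_days)
    newest_allowed = today - timedelta(days=min_age_days)
    counts = Counter(
        name
        for name, changed_on in changes
        if changed_on >= window_start
        and name in created
        and created[name] <= newest_allowed
    )
    # most_common returns (class_name, change_count), busiest first
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Made-up history, purely for demonstration
    created = {"Billing": date(2009, 1, 10), "Cart": date(2011, 2, 1)}
    changes = [("Billing", date(2011, 4, 1)), ("Cart", date(2011, 4, 2))]
    print(rewrite_candidates(changes, created, today=date(2011, 5, 17)))
```

The output is a list of suspects, not a verdict; someone still has to decide whether a heavily modified old class is rotten or just central.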
Of course, the potential for abuse is enormous, and convincing management types that it's worth putting effort into rewriting something that's already working will be a very hard sell.
But a long-lived code base that doesn't get refreshed regularly turns into ever-growing islands of "don't touch it, it just works". Which is fine, but eventually, you DO have to touch it.
Posted by: Iequalszero | May 17, 2011 at 03:57 PM
I forget the source, but I've seen it stated something like this. In accounting terms:
Features are an asset; code is a liability.
It was further suggested that lines of code should be included in technical debt metrics. EVERY line of code is technical debt, by definition.
This makes a lot of sense.
[Anyone remember the source? It was a great article.]
Posted by: MetaThis | May 17, 2011 at 07:39 PM
How about using monitoring/probing of production code to find out which features are used most and which are seldom or never used? Having those kinds of metrics would help drive, say, a story every release to rip out an unused feature to reduce code bloat and complexity.
Of course if we continue with your car analogy, we wouldn't want to remove the emergency brake because it is rarely used!
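A rough sketch of what such a metric could look like, assuming each feature entry point can call a counter. The `FeatureUsage` class and the feature names are hypothetical; the `protected` set is there for the emergency brakes.

```python
from collections import Counter


class FeatureUsage:
    """In-memory usage counter (a real system would persist the counts)."""

    def __init__(self, features, protected=()):
        self.features = set(features)
        # Rarely used but essential features that must never be proposed
        self.protected = set(protected)
        self.hits = Counter()

    def record(self, feature):
        """Call this from the feature's entry point in production."""
        if feature not in self.features:
            raise KeyError(f"unknown feature: {feature}")
        self.hits[feature] += 1

    def removal_candidates(self, max_hits=0):
        """Unprotected features used at most `max_hits` times, sorted by name."""
        return sorted(
            f for f in self.features - self.protected
            if self.hits[f] <= max_hits
        )


if __name__ == "__main__":
    usage = FeatureUsage(
        ["search", "export_csv", "fax_invoice", "emergency_brake"],
        protected=["emergency_brake"],
    )
    usage.record("search")
    usage.record("export_csv")
    print(usage.removal_candidates())
```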
Posted by: Gregory | May 17, 2011 at 08:30 PM
Of course your own code isn't ugly, only other people's code. This sounds like an elitist at work.
Posted by: Grechen Wilson | May 17, 2011 at 11:19 PM
@Grechen: He didn't write anything close to that. And so what if you can't see the ugliness in your own code? Do you expect to be able to accurately evaluate the merits of anything that's your own?
Posted by: Anders Eurenius | May 18, 2011 at 03:31 AM
A corollary is to use languages that are as high-level as possible; you need fewer lines of code per given feature.
(So why do my bosses insist on moving to Java in 2011!?)
Posted by: Bernt B | May 18, 2011 at 03:37 AM
Instead of having a few modules and a master file that calls them, we should break everything into separate modules according to business need, each stating its use and the feature it belongs to. We could also have a small reporting system that measures how often a particular module is used by counting how many times users call it to reach the feature. That would make it sensible to cut, freeze, or take other critical decisions on seldom-used code (features).
Posted by: Logic | May 18, 2011 at 04:08 AM
@MetaThis: See Ted Dziuba's post "Taco Bell Programming," at http://teddziuba.com/2010/10/taco-bell-programming.html. (No attribution is given in the post.)
Posted by: Benjamin Klein | May 18, 2011 at 04:40 AM
Michael,
Your concept about deleting code and rewriting it is interesting, but going so far as to eliminate features while you are doing it would cause some anxiety, even with the focus limited to features that aren't used or are rarely used. People hate taking things away! One way to aid the business call is to add profiling to the app so that you could prove which features weren't actually being used and then calculate the carrying costs of those never or rarely used features. And if some features haven't needed to be modified as the result of adding other capability, wouldn't you be unnecessarily adding to the carrying costs by rewriting the code?
I blogged about source code recently myself, but my take was that it was actually an undervalued asset.
http://www.softwareresults.us/2011/02/source-code-important-yet-undervalued.html
Posted by: Dave Moran | May 18, 2011 at 05:44 AM
Code as inventory is so much more intuitive than requirements as inventory. I also see unused features as inventory, and therefore a good way to clean up code is to remove those features that no one has used or is likely to use.
In my opinion, code is an asset, not a liability. Some code qualifies as a non-performing asset and needs to be either cleaned up or deleted. In a balance sheet, assets are also listed in order of how liquid they are: fixed assets at the bottom and highly liquid assets at the top. Extending the metaphor, maintainability of code can be substituted for liquidity.
Posted by: Prashant Gandhi | May 18, 2011 at 06:30 AM
Every project I work on these days gets a "graveyard" folder to put dead code in:
http://workstuff.tumblr.com/post/5606561851/every-project-needs-a-graveyard
Posted by: Fields | May 18, 2011 at 07:06 AM
If we could find a way to automatically remove unused features, that would be great.
In a hosted web app, that should be easy. If a feature has not been used in 3 months, it goes offline.
I guess most (all) managers would agree to that; they would think, yes, but every feature will be used at least once by "A user".
Hey, even I think that.
Yet I know that I (and these managers with me) will be very surprised by the results.
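A minimal sketch of the three-month rule, assuming the hosted app checks a flag before serving each feature. `ExpiringFeatures` and its method names are made up for illustration, and 90 days stands in for "3 months".

```python
from datetime import datetime, timedelta


class ExpiringFeatures:
    """Features go offline once they have been idle longer than max_idle."""

    def __init__(self, names, now, max_idle=timedelta(days=90)):
        self.max_idle = max_idle
        # Every feature starts its idle clock at launch time
        self.last_used = {name: now for name in names}

    def is_online(self, name, now):
        return now - self.last_used[name] <= self.max_idle

    def use(self, name, now):
        """Serve the feature and reset its idle clock, unless it has expired."""
        if not self.is_online(name, now):
            raise RuntimeError(f"{name} has gone offline")
        self.last_used[name] = now

    def offline(self, now):
        """Names of all expired features, sorted alphabetically."""
        return sorted(n for n in self.last_used if not self.is_online(n, now))


if __name__ == "__main__":
    start = datetime(2011, 1, 1)
    flags = ExpiringFeatures(["search", "fax_invoice"], now=start)
    flags.use("search", start + timedelta(days=60))
    print(flags.offline(start + timedelta(days=100)))
```

Passing `now` explicitly keeps the sketch testable; a real deployment would read the clock and store the timestamps somewhere durable.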
Y
Posted by: YvesHanoulle | May 18, 2011 at 07:49 AM
I like the analogy of code as inventory; it makes me think that by far the most costly inventory is code that hasn't been shipped. Unshipped features are by definition unused, and the process of fine-tuning new code, both operationally and for user experience, is expensive. If you have a lot of code that hasn't gone through this process, you are hurting yourself the same way a manufacturer would with a big pile of unsold inventory.
Posted by: Lars | May 18, 2011 at 08:03 AM
@MetaThis: maybe "Code is Liability not an asset" - http://dev.af83.com/code-liability-not-asset-part-1-3/2010/02/24
Posted by: Bruno Michel | May 18, 2011 at 08:15 AM
I think you strategically delete code like you might prune a bonsai tree. When the bonsai tree pisses you off or looks ugly, you delete it and start again.
But code is not inventory. It's scaffolding, which may or may not be well erected, that the business uses to do its jobs. The scaffolding can always be taken down, replaced, etc. The code is not an essential part of the business; it's accidental, like inventory, but unlike inventory it does not become part of a product, only part of an enabler.
Posted by: Financialagile | May 18, 2011 at 08:25 AM
Alex Stepanov gave a talk at Adobe: "Companies think software is an asset. It's not. It's a liability. The asset is your accumulated knowledge and experience." He then demonstrated, on real Adobe code, how a long function could be reduced to a few lines.
Posted by: Jon Reid | May 18, 2011 at 08:36 AM
I like your analogy of seeing code as inventory, but I have a question about it.
What competitive advantages can companies gain by removing unnecessary features or deleting code?
-> Is it the maintenance cost of non-essential features? Or
-> Maintenance of code (repository maintenance etc.)?
What are the benefits from a developer's point of view?
Posted by: Account Deleted | May 18, 2011 at 08:49 AM
Instead of talking of 'carrying' code -- Dijkstra frowned on that kind of metaphorical thinking (sensibly so) -- it might be clearer to rephrase and summarise this in more straightforward terms.
The problem here is that larger code is more complex, and more complex is harder to work on. The suggestion is that we impose limits on the amount of code. That seems somewhat prudent, but it does not really attack the problem: we want to make code easier to change *despite* it being more complex.
Just limiting code size is not really a solution. Very broadly, functionality is proportional to code size (in general, and in particular where all else stays constant -- is that not reasonable?). So limiting code size is limiting functionality -- which is setting limits on what we do. We do not want that, we want to make things easier.
Also, this limiting really amounts to re-arranging the costs. It is transferring the costs of future change into costs of current activity. But instead of paying those moved costs by doing *more work* now, they are paid by having *less functionality* now. Is that not contradictory to one of the principles of agile (or at least XP)? You do not try to predict the future, you just do only what you need now.
Posted by: Harrison Ainsworth | May 18, 2011 at 10:41 AM
The problem with the "code is inventory" idea is the same problem as the "waterfall" project process, and also the Lean way of thinking -- these are all manufacturing-world models being applied to a process (software development) that has very little to do with manufacturing.
Models based on manufacturing are always going to be flawed. Software development is a design process, not a manufacturing one, and so the question of "inventory" is a meaningless one.
Posted by: Corey Reid | May 18, 2011 at 11:14 AM
It's an old article that I was just recently made aware of, but Alistair Cockburn has spent some time with this metaphor: http://alistair.cockburn.us/What+engineering+has+in+common+with+manufacturing+and+why+it+matters
I really like his concept of "unvalidated decisions" as the items in inventory.
Posted by: Greg Vaughn | May 18, 2011 at 11:28 AM
Bas Vodde and I spent a month working with different teams in the same company last year. When we met, we'd brag to each other about how much code we deleted. It's an obvious sign of improvement.
Posted by: J. B. Rainsberger | May 18, 2011 at 11:33 AM
Yet another lesson one can learn from evolution. Features that are not important anymore are removed fairly quickly. On several occasions breeding has resulted in the loss of some features because the process was focused on other things. This can have a negative impact: understanding which features you want to keep and which have to die is pretty vital.
As to the deleted code: There is always the Source Control (DNA)
Why comparisons to manufacturing should fail would have to be explained to me. Software making isn't just a pure thought process. There is a result, compiled software that does things, can be distributed and shows resistance to change. There are constraints, not only in requirements, but in hardware and infrastructure. It is not a pile of drawings, diagrams and documents or recordings of narrations describing patterns that would solve a problem.
Posted by: Fquednau | May 18, 2011 at 11:55 AM
Brilliant! Many great comments, although some still don't realise that code (in the strict sense) is actually a liability.
I think the confusion stems from the fact that you typically write code for some benefits. The features and new options you have gained by writing the code are the real assets.
Code was required to obtain them, but the code itself is only a burden, and maintaining it costs. It is pure liability. If I could get these benefits without it, I wouldn't think twice. Here we get to the core of the problem: many useful things just cannot be achieved without writing some code. Then I can only try hard to write it in a maintainable fashion and constantly keep it in good shape. That will at least assure me that I'm not paying too much for the features I gained. But again, factoring code in an optimal way is itself a cost. And even then the code costs something, at least the time required to read it. So when its relative value is not that high, I prefer to delete it.
One thing that particularly costs in code maintenance is that you have to update it whenever one of its dependencies changes. The fact that code has dependencies additionally limits the overall flexibility of the codebase. Those dependencies cannot be freely modified without breaking their clients.
To sum up, 'if we wish to count lines of code, we should not regard them as "lines produced" but as "lines spent"' (Dijkstra)
Posted by: Przemekpokrywka | May 18, 2011 at 02:33 PM
It would seem that perfection is attained not when there is nothing left to
add, but when there is nothing left to take away.
— Antoine de Saint-Exupéry
Posted by: Roger Pate | May 19, 2011 at 04:44 PM
I like a lot of my old code. Maybe it's not perfect but it works even if I can't remember how. If someone came along and deleted it and told me to rewrite it I would basically respond that I don't know how. I once upon a time learned how the thing worked and that knowledge has since become irrelevant. If I ever needed to understand it again I had the code to jog my memory. Now that it's gone I'm starting from scratch with no guarantee that the next version will be any better.
Posted by: Alex Skorulis | May 19, 2011 at 08:21 PM