It's often hard to understand large classes. Sometimes we can't see the forest for the trees.
One tool that I often use is something I call a class feature diagram. These diagrams are easy enough to create: draw a box for every field and a bubble for every method in your class, then draw an arrow from each method to the fields and methods it uses.
Here's a class feature diagram for one class:
And, here's one for another one:
You can learn a lot from these diagrams. You can see internal coupling and internal cohesion. Often they help you understand how to move forward in the face of a large refactoring task.
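The construction rule (a box per field, a bubble per method, an arrow per use) is simple enough to sketch in code. The snippet below is only an illustration: `FeatureDiagram` and `toDot` are hypothetical names, and the "uses" information is entered by hand rather than extracted from class files. It emits GraphViz dot text that could then be rendered.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not the tool used to draw the diagrams in this post.
final class FeatureDiagram {
    // fields: the field names of the class.
    // uses:   for each method name, the fields and methods it touches.
    static String toDot(Set<String> fields, Map<String, List<String>> uses) {
        StringBuilder dot = new StringBuilder("digraph features {\n");
        // A box for every field.
        for (String field : fields) {
            dot.append("  \"").append(field).append("\" [shape=box];\n");
        }
        // A bubble for every method.
        for (String method : uses.keySet()) {
            dot.append("  \"").append(method).append("\" [shape=ellipse];\n");
        }
        // An arrow from each method to everything it uses.
        for (Map.Entry<String, List<String>> entry : uses.entrySet()) {
            for (String target : entry.getValue()) {
                dot.append("  \"").append(entry.getKey())
                   .append("\" -> \"").append(target).append("\";\n");
            }
        }
        return dot.append("}\n").toString();
    }
}
```

Passing the fields `a` and `b` together with a map saying that `foo` uses `a` and `bar`, and `bar` uses `b`, yields a graph with two boxes, two bubbles, and three arrows.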
For me, though, the really fun bit about feature diagrams is the way that they change under refactoring. For example, the feature diagram for this class:
class A {
    private int a, b;
    public void foo(int value) {
        a++;
        b += value * value + value;
    }
}
looks like this:
If we extract a method named bar from foo, we can end up with code that looks like this:
class A {
    private int a, b;
    public void foo(int value) {
        a++;
        b += bar(value);
    }
    private int bar(int value) {
        return value * value + value;
    }
}
However, there's another way of extracting a method that will give us this:
Here it is:
class A {
    private int a, b;
    public void foo(int value) {
        a++;
        bar(value);
    }
    private void bar(int value) {
        b += value * value + value;
    }
}
So, what's the difference? Is one any better than the other? On the one hand, the first extract method gives us a pure function, a function without side-effects. In general, pure functions are great. They are easier to reason about, but there are some benefits to the second approach. Let's take a look at that diagram again:
What we've done in this refactoring is introduce a node between two other nodes. By doing so, we've made it possible to split this class into two classes, one which contains foo and a, and another which contains bar and b. The first of those classes can use the other.
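As a rough sketch of where that split could land, here is one possible result. The class names `Counter` and `Accumulator` and the two read-only accessors are my own assumptions, added only so the behavior can be observed from outside. The bodies of foo and bar are unchanged.

```java
// Hypothetical class holding bar and b.
final class Accumulator {
    private int b;

    void bar(int value) {
        b += value * value + value;
    }

    // Accessor added for illustration only.
    int total() {
        return b;
    }
}

// Hypothetical class holding foo and a; it uses the other class.
final class Counter {
    private int a;
    private final Accumulator accumulator = new Accumulator();

    public void foo(int value) {
        a++;
        accumulator.bar(value);
    }

    // Accessors added for illustration only.
    int calls() {
        return a;
    }

    int total() {
        return accumulator.total();
    }
}
```

Note that foo no longer knows b exists. It only knows that there is something it can tell to bar.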
This strategy is an application of the Dependency Inversion Principle inside classes. We had a method foo which depended on two concrete things and we ended up making it depend on something abstract (bar) and one less abstract thing. In old-school terms, we've encapsulated b within bar.
Now, you might look at this and say "Well, this is just a toy example." Yes, it is, but it points toward a very useful strategy with extract method: you can gain advantage when you extract command methods; that is, methods which return void and mutate some fields. You get the advantages in cases where you are able to start encapsulating more - when you start to be able to hide M things behind N methods where M < N. At that point, you are in a great position to do an extract class refactoring.
Sidenote: This definitely isn't the only class splitting strategy you can use with extract method. Extracting pure functions can be very useful but it also moves us away from OO and toward a more functional style of programming. The proper way to mix OO and FP seems to still be an open problem in the industry today.
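For comparison, here is a sketch of how the pure-function extraction might lead to a split. Because a pure bar touches no fields, it can move to another class as a static function. `Formulas` is a hypothetical name, and the accessor on `A` is added only for illustration. This is one possible reading of that strategy, not the only one.

```java
// Hypothetical home for the pure function; it touches no fields.
final class Formulas {
    private Formulas() {}

    static int bar(int value) {
        return value * value + value;
    }
}

class A {
    private int a, b;

    public void foo(int value) {
        a++;
        // foo still mutates b itself; only the computation has moved.
        b += Formulas.bar(value);
    }

    // Accessor added for illustration only.
    int b() {
        return b;
    }
}
```

Here both fields stay with foo, so the class sheds behavior but no state.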
This is pure and simple, which leads to brilliant, thanks for sharing.
Do you use a tool to automatically build the graphs?
I wonder whether there is a graph based IDE - the notion at least seems to me to be a good idea.
Posted by: Serverdude | May 24, 2010 at 11:54 AM
I always wondered whether a visual (graph) layout would enable different or quicker reasoning about software (compared to textual representations), or whether this was just a matter of personal preference.
"Code Bubbles" is a great step in this direction: http://www.youtube.com/watch?v=PsPX0nElJ0k&feature=player_embedded
Posted by: Martin Aatmaa | May 24, 2010 at 01:58 PM
ServerDude: Yeah, it's a handwritten tool which uses BCEL to pull info from class files, produce dot files and then render them with GraphViz's neato. I'm going to put it up on github soon.
Posted by: Michael Feathers | May 24, 2010 at 06:19 PM
Sounds even better :)
Posted by: Serverdude | May 24, 2010 at 11:51 PM
Hi Michael,
Very nice with tools that help you visualize the problem and the design behind the problem. Could you recommend a similar tool to someone working with C++?
Cheers
Posted by: Magnus Skog | May 25, 2010 at 07:54 AM
Great post.
Posted by: Adam | May 27, 2010 at 06:51 AM
Mike, have you looked at the DGML graph generation stuff in Visual Studio 2010 Ultimate? It will make the graphs you are talking about.
Posted by: Peter Provost | June 17, 2010 at 04:24 PM
Structure101 (the most useful code analysis tool you've never heard of) does something like this too. If you drill down to an individual class and look at what they call partitions. I'm convinced this is an automatable refactoring but haven't fully worked out the rules yet though.
Posted by: Matt Read | June 26, 2010 at 02:55 PM
I loved Michael's book and a few years ago I was inspired by the book to write a tool (classgraph on sf.net ) for graphing class structure.
It worked with Eclipse 2 and it still works despite not being maintained anymore for some years.
http://sourceforge.net/projects/classgraph/
Posted by: Account Deleted | July 08, 2010 at 06:39 AM
Interesting approach with those 'class feature diagrams'. This should be made available to all IDE's IMO.
It makes the coupling and cohesion instantly clear, which eases the improvement of the object model.
For instance, take the second example which clearly shows the class consists of 2 parts which can be separated (one with 2 functions and 1 field and one with 4 functions and 2 fields).
Posted by: Whermeling | July 13, 2010 at 12:30 AM
What approach do you advocate to handle a command failure (since it is void and can not return a failure indicator)?
Posted by: Billy Gitel | July 09, 2011 at 03:58 PM