Join any discussion about unit testing and there's one question that will invariably come up: how do you test private methods?
There are a number of answers. Some people advocate increased visibility. They say that if you feel that you must test a private method, it's really no great sin to make it protected, public or package private in order to make it testable.
Other people say that you can use special tools to access private methods. In Java, at least, you can quite easily rig something up with reflection to give you any kind of access you'd like.
These are "okay" answers, but really they are missing something. When I write tests and I have the urge to test a private method, I take it as a hint. The hint tells me that my class is encapsulating so much that it has ceased to be "understandable" by tests through its public interface. I listen to the hint, and factor my design differently. Usually, I end up moving the private method (and possibly some methods around it) to a new class where it can be non-private and accessible to tests.
Aiming for testability actually changes your design.
But, is that good?
The answer for me, so far, has been yes. There appears to be a synergy between testability (at the unit level) and good design. If you aim for testability and make some good choices, design gets better. If you aim for good design, your design becomes more testable.
In the end, it all comes down to cohesion and coupling. If classes are deeply coupled with their neighbors, it is hard to control them in a test or observe them independently. If a class isn't cohesive, it may have some logic which is not easily exercisable through its public interface.
It seems that reverse is true also. Classes which are hard to instantiate and use in a test harness are more coupled than they could be, and classes with private methods that you feel the urge to test, invariably have some sort of cohesion problem: they have more than one responsibility.
I suspect that some people reading this might think it is a load of wishful thinking, or that this synergy between testability and good design is just a hypothesis without much to back it. All I can say is that in the community of people doing test-driven development there are a number of people who have found that this question of testing private methods doesn't come up much in their practice. They target both testability and good design and find that both goals nurture each other.
Is the synergy real? I think it is. When a bit of functionality is not easily understood by a test at the public interface, the chances of it being easily understood by a human are ridiculously low. And, if there is any universal baseline for good design it has to be understandability.
So, having said all of this, I end this blog with a challenge: Can anyone come up with an example in which there is a private method that they feel that they have to test.. something that they don't feel comfortable testing through the public interface of the class, but which can't be made testable with a better design?
Hi Michael,
Suppose I present you a piece of untestable code and you come up with another design that makes it testable (I suppose that almost any piece of code can become testable with a design change). How can we agree that your design is better than the previous one (except for making the code testable)?
Posted by: Luiz Esmiralha | September 20, 2007 at 07:23 AM
Luiz,
That is tough because people can differ on good design, but on the other hand we've been attempting to define it for over 50 years and I think we've made some progress. There are some design principles we can apply.
In the end, it's all about tradeoffs, and I find that in my own work, the designs that I end up with when I use TDD tend to adhere to good design criteria that tend to be hard to find in less testable code.
Michael
Posted by: Michael Feathers | September 20, 2007 at 07:45 AM
In general, I agree, and live by the mantra "if it's hard to test, it'll be hard to use."
One place where this broke down for me was in implementing a skip-list, and trying to do it in a fully-TDD manner. The public API is pretty straightforward: you add items to the list, and they should come out sorted.
The implementation used a set of linked lists, and one of the key functionality tests was that the size of those lists double at each level (ie, at the lowest level, the linked list has N entries, at the level above, N/2 entries, and so on). This is pretty key to a skip-list implementation, as the API could be satisfied by a simple insertion sort, but the performance would be very different.
I couldn't come up with a clean way to extract that inner data structure into a testable form. Sure, it could be made into its own class, but it's really the core of the skip-list implementation; the public API is just window dressing.
I ended up creating a protected method getStatistics(), which returned an array of list sizes.
So, how would you approachthis?
Posted by: Keith D Gregory | September 20, 2007 at 08:54 AM
Keith - It sounds like the behaviour you want is not only "inserts data and outputs it in sorted order", but also "inserts data fast" or "uses little memory". To test the additional performance assertions, why not create a specific performance test that inserts lots of data and checks speed or memory use? That way, if someone invents a spiffy new algorithm that is more efficient, you can swap it in without breaking anything.
Posted by: Douglas Squirrel | September 20, 2007 at 10:37 AM
Unfortunately, in most environments I have not found it very feasible to unit-test performance. Running the tests on different machines with different loads will make it difficult to define pass or fail for CPU tests. And many languages do not expose enough details about memory usage to be able to gauge that.
But even if you could do that, you'd have to really specify the behavior you want more carefully. You might say you want O(log n) under most circumstances, but a skip list is a probabilistic data structure, which will behave differently under different loads. (Best and worst case for skip lists are pretty far apart.) That makes it really hard to get your tests right. In addition, I think there are some random elements to most skip lists, which don't fit very well with unit tests.
If you do unit test performance, you'll probably end up making tests run longer than you'd want them to, so you can factor out variations in CPU performance. And it's unlikely you'd just be able to swap in a different algorithm and expect your unit tests to pass: each algorithm has somewhat different performance characteristics, so you'll have to change your tests.
Really, I don't think performance is a good thing for unit tests to attempt to check. If you're implementing an algorithm that's been proven to be correct, by testing private methods you can use knowledge of the invariants of various steps to make sure that your implementation matches the algorithm. Why should you throw all that away and rely solely on black box interface testing? You're just putting a screen in front of your tests so that you can't see all the detail that's there.
Posted by: Alan Shutko | September 20, 2007 at 12:08 PM
Alan - I'd suggest unit-testing simply that the skip-list performs better than a more naive implementation.
I used this technique once to test-drive my code into executing several operations simultaneously so the overall run time would be faster than if I had serialized their execution; I fed it two stub operations that simply slept for 100ms, and asserted that the overall time for executing both operations was less than 200ms. This drove my code into using threading to execute both operations simultaneously and pass the unit test (which was fast and repeatable).
p.s. As far as design is concerned - you can see that this unit test also drove my design to be more cohesive and decoupled in such a way that the operations themselves were abstracted (hence mockable) away from the unit that executed them.
Posted by: Avi Naparstek | September 20, 2007 at 01:20 PM
Avi, in Mr. Feathers' awesome book "Working Effectively with Legacy Code", he makes a case to the effect that a 10ms test is a slow test. It sounds like it was OK for you, but a 200ms test usually isn't considered fast. If you only have one or two tests that take that long it's probably OK, especially if performance of the component under test is mission-critical.
Posted by: Jeremy Weiskotten | September 20, 2007 at 04:20 PM
An example of where you turned private functionality into another class would help a lot.
Is it /just/ as simple as it sounds? What if nothing else uses the class? Also, doesn't that make you type a lot of class-defining boilerplate for that little bit of transparency? Good to think about even then though, I suppose.
Posted by: Skeptic | September 20, 2007 at 04:50 PM
There are times when a private method in a class cannot really be moved to another class and exposed via a public interface. A recent example is when I had a private method that created a list of exceptions that I needed to process in a certain manner, different from my normal processing flow. In this case, I created a private method to read up a config file that held the exceptions and massage them to an acceptable form for use by the main processing flow. This creation/massaging was not general enough to push into another class, but I wanted to unit test it anyway.
But in general, I agree with what you say about the synergistic relationship between unit testability and good design.
Posted by: aditya | September 21, 2007 at 12:47 AM
Jeremy - Yes, I agree that I wouldn't want all my unit-tests to tak 200ms, but like you pointed out - we can get away with having a handful of these without killing the suite.
Either way - I think my point was that performance can be unit-tested (and even test-driven) by decoupling the performing-unit as much as possible and then comparing it's performance with another implementation that we know is not good enough.
This also switches our notion of performance from "running fast" to "running fater than XXX", which I think is a good thing as it adds some measure to "how fast do you want it to be" instead of just "have it be as fast as possible".
Posted by: Avi Naparstek | September 21, 2007 at 02:59 AM
Unfortunately, "running faster than something" isn't very useful.
If I have something I really want to run in O(log n) rather than O(n), it's because I know that in the production system, I'm dealing with large amounts of data. You won't necessarily see any improvements in smaller tests. (In fact, on small datasets, more naive implementations may be faster than more sophisticated implementations.)
If you aren't doing anything complicated, you could unit test your performance and be fine. But I can't imagine a good design that would abstract out the inside of a skip list so that you could mock it out. I suppose most people don't implement core algorithms like this, they just use the ones in the language, in which case you just don't care about that level.
Question for thought: What do Sun's unit tests for the various Collections implementations look like?
Posted by: Alan Shutko | September 21, 2007 at 07:29 AM
I don't know. Black-box testing (with no access to private methods or internal state) is certainly necessary - you need to make sure you're satisfying the interface contract - but as a developer, I don't think it's sufficient. You really want to know that the module is working correctly inside, both because that internal correctness may be important to performance, robustness, etc., and because some kinds of internal failures may lead to external problems over very long periods of time.
If, for instance, you have a fragmentation problem in a memory allocator inside a class that stores data, external-interface failures may only appear after a very long time or after testing with particular sequences of operations. Tests written to verify the internal operation of the class may be able to detect the pathology immediately.
Posted by: scott preece | September 21, 2007 at 07:38 AM
Do we really need to directly test every method or is it sufficient to test the public API of the class in a way that exercises the appropriate corner cases even for the private methods?
Consider these alternatives:
1) I assume that if the method was originally private, then it was definitely used by other methods in the class. If you split it out to another class, you would certainly increase the coupling and might even violate information hiding. From an information hiding perspective, this is no better than making it public. The only reason to do this is if the method is useful in other situations in which case, it was probably not tagged as private to begin with so your "design for testability" flag would never be triggered. Clearly you need some other means to evaluate your design for cohesion/coupling issues.
2) Making it public or protected and writing tests directly to it., would violate information hiding. This still may be the best choice in some cases.
3) Only testing it indirectly through the class' public API, might be less robust testing but might be OK if it's really only in a separate method for readability or DRY reasons. In these cases, you should still be able to get 100% statement coverage and cover the potentially troublesome paths.
4) Inlining, might violate DRY and make the code less readable.
4 is an obvious bad choice in anything but the most trivial cases. I can't see 1 happening... as someone asked above, can you give an example? Trade offs must be considered when deciding between the other two.
Posted by: Larry Maccherone | September 21, 2007 at 08:36 AM
I think the question is not "Why do you need to test private functions?" as someone suggested.The discussion can take a philosophical turn if we think more in this line.
We need to find a way to test a private method gracefully. The best way I know, and I would do in my C++ code is make my Tester class a friend of the class I am trying to test. So effectively you have full control of everything from your Tester class. And use forward declaration so that the Real class is not dependent of the Tester class.
Posted by: Shuva | September 22, 2007 at 01:03 AM
Definitions:
http://en.wikipedia.org/wiki/Skip_list
http://www.csee.umbc.edu/courses/undergraduate/341/fall01/Lectures/SkipLists/skip_lists/skip_lists.html
Animated Visualizations (demos):
http://www.geocities.com/SiliconValley/Network/1854/skiplist.html
http://iamwww.unibe.ch/~wenger/DA/SkipList/
[Aggressive but safe Java Applet popup windows on the following site.]
http://www.cs.mu.oz.au/aia/SkipList.html
But the animated algorithm/code debugger is most interesting.
Posted by: Jeff Grigg | September 22, 2007 at 06:05 AM
With regard to how to test "random" behavior and performance characteristics of list/search/sort algorithms, I have tested random behavior by mocking out the random number generator, and would probably try testing list/search/sort by providing a test Comparator implementation (assuming Java-like interfaces). I have found such tests to be implementation specific, in that I haven't found a good way to generalize the tests to cover other reasonable implementations and refactorings without changes to the tests. But it worked out reasonably well, so I'd probably do it like this again, when the need arises.
In 2002 I posted an example of Test Driven Development of "random" behavior in
http://tech.groups.yahoo.com/group/extremeprogramming/files/
as "JeffGrigg-Magic8Ball.zip".
(To run under JUnit 4 comment out the 'junit.swingui.TestRunner' lines, and run the tests through your IDE.)
In the TestOracleEngine class, the checkAnswer method tells the mock object what method and parameter value to expect next ('nextInt(int highBound)'), and what value to return ('selectValue'), then it calls the method under test ('oracleEngine.getAnswer()'), and validates the String response message.
Posted by: Jeff Grigg | September 22, 2007 at 06:55 AM
Keith,
In general, I wouldn't use to TDD to drive me to a specific algorithm, but I would write tests to support my work if I had a particular algorithm in mind. I don't see any problem with that, really. The "emergent design" aspect of TDD is great when you need/want it. At other times, when you have a good idea of the algorithm you want to target, it's great to write tests that drive you at a higher level: they allow you to break up your algorithm implementation work into little chunks.
Re the skiplist, I wonder whether it is a good example.. is there a private method in the implementation you want to test? Sounds like you want data from the skiplist. Without having tried it, it seems that one possibility is to make a class for the lists.. maybe call it "Level." Each Level can delegate search to its next lower level. The Skiplist could hold the top Level.
Michael
Posted by: Michael Feathers | September 23, 2007 at 03:07 AM
aditya:
Why doesn't this thing that reads a config file and massages exceptions seem like a class to you? I really think it is important to reflect on that. Classes don't have to be key abstractions in a domain, they can just be useful, easily understood things that do some work for you.
I think that often our threshold for class-hood is too high. I have a public example that demonstrates this. If you go to the Object Mentor website and download Zess from the resources section, you'll see a little mail server I wrote. The whole thing could've been done in one class, but because of testing, I found it convenient to have MailReceiver, MessageProcessor, and MailSender classes. There's also a Roster class whose entire job is to load the list of email addresses from a file. Yes, it could've been one class, but it would've been an awful class. Clearer (and more testable) this way, in my opinion.
Michael
Posted by: Michael Feathers | September 23, 2007 at 03:16 AM
I agree whole heartedly, don't test private methods, if you must perhaps your class is too big.
There is one out on this that I use a lot.
A lot of "I want to place a test here" translates into... "The precondition/postcondition or invariant is this XXX, how do I assert that in the unit test?"
Answer: Don't. Put the precondition/postcondition/invariant check in the code where it belongs. Leave it there (unless it becomes a performance hotspot) in the production code.
Posted by: John Carter | September 23, 2007 at 06:33 PM
Hi Michael,
I totally agree that there is a virtuous cycle between testability and good design. I also agree that private methods that feel like they need to be tested, usually belong in another class.
On the other side of this argument, are there cases you've encountered where the proliferation of little classes has been too distracting for other people on a project?
The problem I am thinking of is that cohesion is a bit subjective, and different people have different "cohesion tolerances." The extreme example is people who are disgruntled about object-oriented languages who say they can't figure out where anything actually gets done.
Posted by: Bob Evans | September 24, 2007 at 11:36 AM
http://aaronfeng.com/articles/2007/10/09/private-method-creep
Posted by: Aaron Feng | October 09, 2007 at 02:57 AM
Consider this scenario: I have a "green" test suite and start refactoring. While at it, I
start to be a bit "creative". I guess "extract
method" is considered to be a refactoring, but
at some point I have created a lot of methods
that are covered by a test, nonetheless if
something breaks I have to dig into a lot of
methods to find the problem - or start extracting logic into separate classes, driven by tests. These tests are then driving the refactoring, but I have to do a lot of refactoring before I can work on features again - if working "by the book" I usually have to refactor when I encounter duplication, but it seems to me that I often have to refactor to reduce complexity.
Does anyone else experience the same problem?
Posted by: Robert Schmitt | October 10, 2007 at 05:16 AM
Even if you did extract the private method into its own class I'm not sure that you are necessarily going to want to expose the new class to clients. If not I'm not sure that I'd actually want to write tests against it, so I'd probably still just test against the public interface.
Posted by: Colin Jack | December 02, 2007 at 11:07 AM
I feel the "visibility modifiers" of languages such as C# sometimes get in the way of my design technique.
For example, today I was TDDing out a design for the following problem:
1. I have a list of lists of named point data in a legacy INI-like text file:
[Standard]
numpts=1
pts=1.2 3.4
[Standard2]
numpts=2
pts=2.3 4.5 6.7 8.9
2. I am only interested in the actual point lists, which represent nodes in certain splines.
At the highest level of abstraction, this translates to a function with this prototype:
ListOfSpline ParseSplines(string repr)
This is really the level I want my API to be, nothing else. That is, the user of my library should only think of this method, nothing else.
But using TDD, it is much simpler if this high-level method used some lower-level objects to parse the ini file, retrieve the coordinate lists, and build the point lists, since among other problems the Spline object does not easily compare to other spline objects (no clear Equals()).
Here is where I find private/public a burden: I feel strongly that library APIs (as this example is) loose a lot of value if I have to make underlying (and somewhat complex) classes/methods public.
My current solution is to put all "underlying objects" into a separate namespace called "Internal":
namespace MyAPI { SplineParser }
namespace MyAPI.Internal { IniParser, SplineList etc. }
(everything is public sadly, even though it is a general library, but at least they are testable)
I get the feeling public/private is in conflict with general library development.
So anyone had this feeling too..?
Posted by: Olof | January 21, 2008 at 07:46 AM
Olof,
It seems you are just re-stating Michael's original questions: 1) do I need to test private methods, 2) if so, how do I do it?
In your case, some might say that SplineParser is doing too much (parse INI, find spline list, parse spline list). I'd want to test all of it and if I found that member visibility was getting in the way, I'd probably break it out into testable chunks (reinforcing Michael's conjecture that if you feel the need to test private methods, it is an indicator that design should be changed instead of 'hacking' visibility).
Cheers.
Posted by: Russ Sherk | March 14, 2008 at 06:51 AM