(From the opening of every episode of The Greatest TV Series Ever Made)
Just about everyone in the Perl community who does testing knows that I'm a huge testing fan. In fact, to steal a turn of phrase from Adrian Howard, you might even call me a "testing bigot". I wrote the new Test::Harness that ships with Perl's core. I've written Test::Most and Test::Aggregate. I maintain or have co-maintainership on a number of modules in the Test:: namespace. I was invited to Google's first Automated Testing Conference in London and gave a lightning talk (my talk on TAP is about 42 minutes into that). I was also at last year's Perl-QA Hackathon in Oslo, Norway and I'll be at this year's Perl-QA Hackathon in Birmingham, UK. I was also one of the reviewers on Perl Testing: A Developer's Notebook.
In short, I'm steeped in software testing. I've been doing this for years. When I interview for a job, I always ask them how they test their software, and I've turned down one job, in part, because they didn't. If I'm just hacking around and playing with an idea, I don't mind buying some technical debt and skimping a bit on testing while I'm exploring a new idea. I've even posted some code to the CPAN which is a bit short on testing. That being said, I wouldn't dream of writing major software without testing, and I don't want to release any CPAN code as version 1.00 without comprehensive test coverage to guarantee a minimum baseline of functionality.
A number of years ago at OSCON I was attending a talk on testing when Mark Jason Dominus started asking some questions about testing. He was arguing that he couldn't have written Higher Order Perl (a magnificent book which I highly recommend you buy) with test-driven development (TDD). If I recall his comments correctly (and my apologies if I misrepresent him), he was exploring many new ideas, playing around with interfaces, and basically didn't know what the final code was going to look like until it was the final code. In short, he felt TDD would have been an obstacle to his exploratory programming as he would have had to continually rewrite the tests to address his evolving code. I tried to explain alternate strategies to deal with this, including deleting the code and adding tests back in and, in fact, when I created CPAN distributions of three of the HOP modules, I did find a couple of bugs in my testing. Regardless, I was hard-pressed to rebut his arguments.
Now that's not really a terribly heretical idea. Many people realize that exploratory programming and TDD don't always play well together. James Shore has a great blog post about Voluntary Technical Debt and how this helped them launch CardMeeting. It's the same story: tight deadline, new idea, playing with concepts.
But that's not quite what I have in mind when I say "I don't do TDD". In reality, sometimes I do TDD, but not very often. I've tried TDD. I've written a test, added a stub method, written another test, returned a dummy object, written another test …
Two words: tee dious (sic).
At work, we get a fairly substantial set of requirements for each task. We're a core application which many other BBC projects rely on for their programme metadata. When we get requirements, they're generally fleshed out enough that when we think through the problem, we have a decent handle on what needs to be done, even if the exact implementation isn't nailed down perfectly.
When I get a new task, I usually start by reading the tests for the code I'm working on, and I often write quite a few tests first and then write the code for them. This isn't "pure" TDD in the minds of many people, as I'm not writing a single test, then code, then another test, and then code, ad nauseam. However, it's close enough to TDD in my book.
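To make the "block of tests first" style concrete, here's a minimal sketch using Test::More. The Episode package and its interface are invented for illustration; in practice the tests would be written against the real module before its methods are filled in.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the real module under test.  In the
# tests-first-in-a-block style, the assertions below would be written
# before this package body exists, then the code filled in to pass.
package Episode {
    sub new {
        my ( $class, %args ) = @_;
        return bless { title => $args{title} }, $class;
    }
    sub title { $_[0]{title} }
}

# A block of tests, written up front rather than one at a time.
my $episode = Episode->new( title => 'Doctor Who' );
isa_ok $episode, 'Episode';
is $episode->title, 'Doctor Who', 'title is stored';

done_testing;
```

Test::More ships with core Perl, so this runs as an ordinary `.t` file under `prove`.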
That's not the only way I write code, though. Sometimes we have a complex case where I want to see what's going on in our interface and I'll load some fixture data, fire up a browser and start exploring our REST interface. Then I'll write some code and verify that our title=Doctor Who query parameters are returning correct results. Then I'll write the tests.
To some, this would be heresy. You must always write the tests first, right? My reply: can you provide me with data to back that up?
I recently wrote some code for Class::Sniff which would detect "long methods" and report them as a code smell. I even wrote a blog post about how I did this (quelle surprise, eh?). That's when Ben Tilly asked an embarrassingly obvious question: how do I know that long methods are a code smell?
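As a rough illustration of what "detecting long methods" can look like (this is a sketch, not how Class::Sniff actually does it), one crude approach is to deparse a subroutine and count the lines of its body:

```perl
use strict;
use warnings;
use B::Deparse;

# Crude "long method" check: deparse a coderef back to Perl source
# and count its lines.  The approach and any threshold you pick are
# illustrative only.
sub sub_length {
    my ($code) = @_;
    my $deparse = B::Deparse->new;
    my @lines   = split /\n/, $deparse->coderef2text($code);
    return scalar @lines;
}

sub example {
    my $x = shift;
    return $x * 2;
}

printf "example() is %d deparsed lines\n", sub_length( \&example );
```

A sniffer could then flag any sub whose deparsed length exceeds a configured threshold, which is exactly the kind of threshold the studies discussed below call into question.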
I threw out the usual justifications, but he wouldn't let up. He wanted information and he cited the excellent book Code Complete as a counter-argument. I got down my copy of this book and started reading "How Long Should A Routine Be" (page 175, second edition). The author, Steve McConnell, argues that routines can be up to 200 lines long. Holy crud! That's waaaaaay too long. If a routine is longer than about 20 or 30 lines, I reckon it's time to break it up.
Regrettably, McConnell has the cheek to cite six separate studies, all of which found that longer routines were not only not correlated with a greater defect rate, but were also often cheaper to develop and easier to comprehend. As a result, the latest version of Class::Sniff on github now documents that longer routines may not be a code smell after all. Ben was right. I was wrong.
But what does this have to do with TDD? One problem I have with the testing world is that many "best practices" are backed up with anecdotes ("when I write my tests first …").
I also don't write many unit tests unless I'm trying to isolate a particular bug or I have code paths which are difficult to demonstrate with integration tests. I prefer integration tests as they demonstrate that various bits of my code play well together. I've long found, anecdotally, that integration tests make it easier to stumble across bugs -- though I admit that they're then harder to track down. Again, eschewing unit tests is heresy to many test advocates, but in my experience, it works fairly well.
I've also long suspected that TDD can prove to be a stumbling block, but I've advocated it as a way of better understanding what your interface should look like. I've really not spoken much about my objections to it because, quite frankly, the opinion of the testing world seems to be dead set against me, and who am I to argue with the collective wisdom of so many? (I usually get bitten pretty hard when I do so. This is why I want a blog where I can delete trollish or rude comments while keeping reasonable comments from those who disagree with me.)
The problem with all of these opinions is that I rarely see hard information about them. I don't see hard numbers. I don't see graphs and statistics and circles and arrows and a paragraph on the back of each one (with a tip 'o the keyboard to Arlo Guthrie). So you can imagine my delight when I read a blog post about research supporting the effectiveness of TDD. This is the meat I want and here's a snippet from the actual research paper:
We found that test-first students on average wrote more tests and, in turn, students who wrote more tests tended to be more productive. We also observed that the minimum quality increased linearly with the number of programmer tests, independent of the development strategy employed.
More productive? Minimum quality increased? That sounds great and now we have at least one study to back this up. Of course, more studies should be done, but this is a great start.
Um, or maybe it's not a great start. Jacob Proffitt has a great blog post where he analyzes the results of the study. He agrees with some of the conclusions of the study, namely …
But in digging further down, he found that …
In short, "test last" programmers had higher quality code. However, you should also read the comments to that blog post, including an interesting reply by one of the authors of the cited study.
So does this prove that you should be "testing last" instead of "testing first"? Of course not. This was only one study. Correlation is not causation. These were undergrads, not professional programmers. There should have been a "no testing" control group (I should note that I've become so dependent on tests that I actually write worse code when I'm trying to modify something without tests).
So for the time being, I'm quite comfortable with my testing approach. I've given up on "pure" TDD of writing one test at a time. I'll happily write a block of tests and then the code. I'm also happy to write a block of code and then the tests. I can now even cite a study to show that I'm not a complete moron, even if one study is little more than anecdotal information.
Frankly, I've been secretly embarrassed about the fact that I'm not really a TDD zealot. Also, I've found with my testing strategy that I'm not writing as many tests as others. This has also been a bit of an embarrassment for me, but I've not brought it up before. My code works well and I'm (usually) comfortable with the quality after a few iterations, but since I've joined the testing cult, my apostasy is not something I've been terribly keen to bring up.
Now if you'll excuse me, I have some more episodes of "The Prisoner" to watch.