When developers first discover the wonders of test-driven development, it’s like gaining entrance to a new and better world with less stress and insecurity. It truly is a wonderful experience well worth celebrating. But internalizing the benefits of testing is only the first step to enlightenment. Knowing what not to test is the harder part of the lesson.
While as a beginner you shouldn’t worry much about what not to test on day one, you better start picking it up by day two. Humans are creatures of habit, and if you start forming bad habits of over-testing early on, they will be hard to shake later. And shake them you must.
“But what’s the harm in over-testing, Phil, don’t you want your code to be safe? If we catch just one bug from entering production, isn’t it worth it?”. Fuck no it ain’t, and don’t call me Phil. This line of argument is how we got the TSA, and how they squandered billions fondling balls and confiscating nail clippers.
Tests aren’t free (they cost a buck o’five)
Every line of code you write has a cost. It takes time to write it, it takes time to update it, and it takes time to read and understand it. Thus it follows that the benefit derived must be greater than the cost to make it. In the case of over-testing, that’s by definition not the case.
Think of it like this: What’s the cost to prevent a bug? If it takes you 1,000 lines of validation testing to catch the one time Bob accidentally removed the validates_presence_of :name declaration, was it worth it? Of course not (yes, yes, if you were working on an airport control system for launching rockets to Mars and the rockets would hit the White House if they weren’t scheduled with a name, you can test it—but you aren’t, so forget it).
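To make that concrete, the kind of test in question looks something like this sketch (the Person model is hypothetical; any Rails test framework would look similar):

    # This test merely restates a one-line declaration in the model.
    # If Bob deletes "validates_presence_of :name", whoever notices
    # will almost certainly delete this test in the same commit.
    class PersonTest < ActiveSupport::TestCase
      test "name is required" do
        person = Person.new(name: nil)
        assert !person.valid?
        assert person.errors[:name].present?
      end
    end

Multiply that by every attribute of every model and you have your 1,000 lines.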
The problem with calling out over-testing is that it’s hard to boil down to a catchy phrase. There’s nothing succinct like test-first, red-green, or other sexy terms that helped propel test-driven development to its rightful place on the center stage. Testing just what’s useful takes nuance, experience, and dozens of fine-grained heuristics.
Seven don’ts of testing
But while all that nuance might have a place in a two-hour dinner conversation with enlightened participants, not so much in a blog post. So let me firebomb the debate with the following list of nuance-less opinions about testing your typical Rails application:
- Don’t aim for 100% coverage.
- A code-to-test ratio above 1:2 is a smell; above 1:3 is a stink.
- You’re probably doing it wrong if testing is taking more than 1/3 of your time. You’re definitely doing it wrong if it’s taking up more than half.
- Don’t test standard Active Record associations, validations, or scopes.
- Reserve integration testing for issues arising from the integration of separate elements (aka don’t integration test things that can be unit tested instead).
- Don’t use Cucumber unless you live in the magic kingdom of non-programmers-writing-tests (and send me a bottle of fairy dust if you’re there!)
- Don’t force yourself to test-first every controller, model, and view (my ratio is typically 20% test-first, 80% test-after).
Given all the hundreds of books we’ve seen on how to get started on test-driven development, I wish there’d be just one or two that’d focus on how to tame the beast. There’s a lot of subtlety in figuring out what’s worth testing that’s lost when everyone is focusing on the same bowling or bacon examples of how to test.
But first things first. We must collectively decide that the TSA-style of testing, the coverage theater of quality, is discredited before we can move forward. Very few applications operate at a level of criticality that warrant testing everything.
In the wise words of Kent Beck, the man who deserves the most credit for popularizing test-driven development:
I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence (I suspect this level of confidence is high compared to industry standards, but that could just be hubris). If I don’t typically make a kind of mistake (like setting the wrong variables in a constructor), I don’t test for it.
Hal Helms
on 11 Apr 12
Thank heavens someone finally said it! We desperately need some sanity on this subject. Instead of testing inspiring confidence, the radical TDDers (not all!) are instilling fear and doubt.
Aaron H.
on 11 Apr 12
Brilliant analogy with the TSA.
Andrew Carter
on 11 Apr 12
Love the TSA analogy. And could not agree more on Cucumber. To me, efficient testing comes with understanding of both your framework and your application.
Markus
on 11 Apr 12
Well done, David. You highlighted exactly that dark spot of over-construction in test-oriented software development. I hope there will be a lively discussion around how to better judge any investment in testing.
—from Switzerland. MK
Garry
on 11 Apr 12
Maybe ‘Agile Web Development with Rails Edition 5’ can live up to the Seven don’ts. Because I learnt my first lessons of TDD from this series and it does smell.
Justin Reese
on 11 Apr 12
Pedantic nit: My brain stumbled on “less stress and insecurity”, not immediately assigning “less” to both “stress” and “insecurity”, I think because both “less” and “in-” being negations. You might consider repeating “less” (“less stress and less insecurity”) or flipping the approach (“less stress and more security”).
– Your friendly neighborhood drive-by unsolicited self-appointed copy editor
gernb
on 11 Apr 12
Wow, that’s a pretty loaded statement
“When developers first discover the wonders of test-driven development, it’s like gaining entrance to a new and better world with less stress and insecurity.”
No, actually, I shipped 20+ highly successful commercial products without TDD. Now that I’m on a TDD project I get far more stress because my output has gone down > 50% since more time is spent writing tests than writing code.
Justin Reese
on 11 Apr 12
And then I go and say:
Justin Reese
on 11 Apr 12
DAMMIT
Sean Iams
on 11 Apr 12
I like your point about 20% test-first and 80% test-after. I always felt dirty/wrong/stupid/naive when arguing for this type of approach. Sometimes I just need to explore a few approaches before I settle on one, and if I’m writing proper tests before each exploration, then I lose my creative momentum very quickly.
Steve
on 11 Apr 12
I find that a few high level integration tests can go a long way!
Bertrand Chardon
on 11 Apr 12
Regarding point 7, I’ve found that forcing myself to write the tests for the controllers first allowed me to get some distance/perspective on the nuts and bolts and think more in terms of “what is my app expected to do”, and made the design flaws, additional requirements and control flow of that specific part of the app even more apparent.
I understand (and moreover agree with) your more general view that testing for testing’s sake is not a panacea, but there is something to thinking in terms of behaviour rather than in terms of if/then/else and classes that helps less seasoned developers produce a saner code base.
Melvin R.
on 11 Apr 12
This was very helpful. How much to test and what not to test has been one of the most challenging things to decide on. This helps.
DHH
on 11 Apr 12
Bertrand, if you’re not figuring out what your app is supposed to do before you’re writing tests, it’s too late. Figuring out what it should do is why you have designers work with you.
Casey Duncan
on 11 Apr 12
Context is very important here. I agree that 100% test coverage, and edge-case parameter testing is not very important for in-house apis, and in particular one-user code. But if you are providing a library/framework for others to embed in their application as a black-box that’s a whole other story.
Also the nature of testing changes considerably in the case of server-side and client-side code IMO. The latter can often be extremely costly to test, and add considerable inertia to future changes. Couple that with the fact that these changes can be frequent, and great test coverage becomes a losing proposition.
DHH
on 11 Apr 12
Casey, I observe all these heuristics with all the code that I write. Whether it’s for a framework used by thousands, like Rails, or an app, like Basecamp, used by millions.
Completely agree that the burden of automated testing on the client-side is currently much higher (and indeed much TOO high). Hence, we don’t test nearly as much on the client-side as we do on the server-side (much greater cost, much lower criticality = much less testing).
Justin Ko
on 11 Apr 12
In my opinion, the introduction of integration/acceptance/browser driven tests has been the greatest detriment. DHH is right, if you have tests that run over the same code, it is a waste.
With that said, if you think testing your code via the console/browser is faster than TDD (in the long run), you might want to think twice on that.
I think the best takeaway from this post is that super high test coverage is not needed for 99% of the apps that are built. It is okay if Basecamp has bugs.
When I first started using Rails (2007 or so), my output (pushing implementation code) was much higher than it has been the past couple months. I put the blame on “high level tests” – Unit tests are plenty enough.
Max
on 11 Apr 12
So, how do you test things like PayPal integration without Cucumber?
Justin Ko
on 11 Apr 12
Oh, and I forgot to add that if I’m prototyping/spiking, I will only write Capybara tests, and no unit tests. Much easier to change and still add regression protection.
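For illustration, a minimal Capybara spec of the kind Justin describes might look like this (assuming rspec-rails and Capybara of that era; the paths, labels and messages are hypothetical):

    # spec/requests/signup_spec.rb — drives only the UI, so it survives
    # internal refactoring during a spike.
    require 'spec_helper'

    describe "Signup" do
      it "welcomes a new user" do
        visit "/signup"
        fill_in "Email", with: "bob@example.com"
        fill_in "Password", with: "secret123"
        click_button "Sign up"
        page.should have_content("Welcome")
      end
    end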
Andrea
on 11 Apr 12
I normally code in TDD. Working this way, it’s normal to have a high coverage ratio. Do tests cost money??? Yes they do … and they’re worth every penny IMHO. I do continuous deployment and release features every 2 weeks to my client with confidence ….. without tests this would be a dream. Talking about very big apps here.
gigi
on 11 Apr 12
Web apps are even harder to test because they have so many layers: SQL, JavaScript, server-side language, HTML and all that stuff. But if you don’t do anything critical like financial transactions, then you can have your users test all the edge cases.
With good logging systems and users’ good will, you can catch all those small bugs without spending too much time on extensive testing.
Of course obvious bugs will get caught in internal beta testing, but that is done anyway; the small edge-case bugs are harder to catch.
DHH
on 11 Apr 12
Andrea, false dilemma. The argument isn’t about test vs no test. That was settled long ago. It’s time to move beyond that into what’s worth testing and what’s not, which is of course the whole point of this entry to the debate.
Max, you don’t need cucumber to do integration tests.
Sidu Ponnappa
on 11 Apr 12
+1 to Justin.
And I think (2) and (7) just indicate that DHH doesn’t know how to do TDD. But then, who am I to judge the mighty, eh? Especially when they give everyone out on the fringes mumbling “Yeah, I do that testing shit, yeah” an opportunity to quote scripture to avoid doing TDD.
I can just hear it: “DHH does 80% of his testing after the fact, so I’m not going to write any tests at all.”
Agreed on the rest, though, especially (6).
Eric
on 11 Apr 12
AWESOME post, especially #6, always wondered who was getting non-programmers to write tests.
Barry
on 11 Apr 12
Context is important here. How about medical applications? How much testing do you think they should have? I wouldn’t think having the users (patients) test the edge cases would fly very well in an application used in say a Children’s hospital. You can lose more than good will.
Andrew
on 11 Apr 12
@Sidu,
If people only did what DHH does, then nobody would use RSpec. In reality, I’ve never met a Rails person who doesn’t use RSpec.
I think you might be underestimating the free will of the average Rails developer.
Karl Freeman
on 11 Apr 12
Erik
on 11 Apr 12
I use Cucumber quite a bit, even on projects where I’m the only developer. I think it’s actually really hard to use well, so “don’t use it” is good one-size-fits-all advice. It’s easy to end up with “step spaghetti” and overly-specific scenarios that might as well be a method full of Capybara calls.
But I really like the way it forces me to describe my acceptance criteria in clear, human terms. Although maybe maintaining Steak scenarios with a concise, descriptive comment at the top gives me the same benefit. In some sense, Cucumber, used right, is just forced documentation on top of normal, bread-and-butter acceptance tests.
Sidu Ponnappa
on 11 Apr 12
@Andrew – fair enough :)
Just to clarify my (knee jerk) comment – all my work has been done test first for the last seven years across different languages. I don’t care about coverage or test-code ratios or anything else, I simply test drive my code. It’s how I work in an OO language.
This typically results in:
- greater than 80% coverage on Java where you have C1 coverage, and greater than 95% on Ruby with C0
- 1:1 or 1:2 test-to-code ratios
- about 95% of my code written test-first (I’m not counting build scripts etc.)
- any tests I write after writing code being UI-driven selenium/sahi/watir tests that I try to keep to a minimum because they’re brittle and hard to maintain
- no unit tests written after the fact. At all.
So I find (2) misleading when taken in conjunction with (7). If you’re seeing those kinds of ratios, you’re doing it wrong but it’s probably because you’re not doing TDD or you’re writing implementation bound tests, the kind that test the implementation not the interface (like testing if a field is set in a constructor/AR associations).
And you know you’re doing this because you can’t do red -> green -> refactor – when you’re refactoring, you’ll have to change your test when you change the implementation.
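A concrete contrast of the two kinds of test Sidu describes (the TodoList class and its methods are hypothetical):

    class TodoListTest < ActiveSupport::TestCase
      # Implementation-bound: pries into internals, so it breaks under
      # refactoring even when behavior is preserved.
      test "uses a Hash for its cache" do
        list = TodoList.new
        assert_kind_of Hash, list.instance_variable_get(:@cache)
      end

      # Interface-bound: states observable behavior, so it survives
      # any refactoring that keeps the behavior intact.
      test "remembers added items" do
        list = TodoList.new
        list.add("buy milk")
        assert list.items.include?("buy milk")
      end
    end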
Jeff
on 11 Apr 12
I’ve definitely found that trying to reach 100% code coverage causes you to test things that definitely do not need to be tested.
However, structuring my code to be able to reach 100% code coverage has always helped me create systems that are de-coupled, DRY, and properly interfaced.
Steve
on 11 Apr 12
I think what David is talking about here is spot on.
1. Write tests
2. Test the critical parts of your application
3. Realize that you will never catch every single bug that could ever possibly happen (so don’t overtest).
4. It is okay to have a small bug here and there.
Another thing to consider is how much it is costing you to test. 37signals doesn’t “shard” their database because it’s cheaper just to buy a nice server. It is far more expensive to hire new programmers than to just buy extra hardware.
Couldn’t you take the same logic and apply it to tests? What if you just wrote up a document entitled “Tests that should be performed before launching” and wrote down in plain English what a “human” should test before each deploy? The person running these tests doesn’t need to be overly technical, and the tests shouldn’t take more than a few hours on each deploy.
We have found that having a DIFFERENT person test the application (rather than the person who wrote it) has helped us find many new bugs. At the end of the document you also tell them, “try to break the website”. This helps us find bugs we would never have thought of before.
More than likely you will not even have to hire someone. Most companies have people that have extra time throughout their day. Receptionists, customer service people, etc.. If you are a small shop, you could have your business partner or spouse do it.
This is not to say you shouldn’t have written tests (see point #1).
Here is the approach we have taken at TrackMyDrive:
1. Write unit tests/functional tests that cover the critical parts of your application. In our example, making sure the tax deduction amount is correct (we really don’t want to get that wrong).
2. Write various integration tests (A user signs up, adds a trip and sees it displayed).
3. Before any big rollout, we have a script that a human goes through to verify important things (perhaps bugs we have found in the past).
4. Have good monitoring in place for when errors do pop up.
It has worked well for us and has not overburdened us with TSA style tests.
Andreas Krey
on 11 Apr 12
Andrea, 15 years ago we made releases without any automated regression testing and still were confident in what we deployed. Test coverage and confidence are only correlated, although I tend to be more cautious these days.
Paul D
on 11 Apr 12
More than slavish adherence to some testing dogma, what people need is to write better-architected code.
If you are working with a Big Ball of Mud, testing is not going to somehow make everything OK.
I thought the original point Kent Beck put forth is that TDD made your Code Better Architected, not 100% Free Of Errors, which seems impossible with ruby… the side effect of TDD was that it forced you away from the real evil; Crappy Big Ball of Mud Architecture. Seems like people are forgetting that. It’s “red, green… refactor, Refactor, REFACTOR!!!”, not “Well, rcov is 100%, so we’re good.”
DHH can probably get away with less testing, because his code isn’t a giant pile of shit. The point doesn’t seem to be mindless adherence to some repetitive dogma… it’s writing code so clean the first time that testing is superfluous.
Gavin Laking
on 11 Apr 12
Completely disagree with point 6, on the basis that using Cucumber is great for assuring multiple parts of a system are working together. In legacy apps, which is pretty much any app built in a non-TDD/BDD way, Cucumber would be my first choice for providing much-needed confidence in the app and, more importantly, my changes. I’m not really sold on the whole “In order to…”/“As a…” stuff though, to be fair; clients don’t read these files, do they? Do certain shops print these out for clients as documentation? I doubt it.
Let’s encourage developers to test. Let’s not promote the idea that it is sometimes okay not to bother as we hurt those with less experience and deny them the opportunity to improve their overall programming abilities.
AstonJ
on 11 Apr 12
Great points DHH. I like how frank your posts are, and how they often go against the grain (or rather you not being afraid to voice them when they do).
I haven’t begun TDD myself yet (only just started my first Rails app – to practise on) but I think the philosophy I am going to go with for my next app is, if it ain’t important – don’t bother testing for it. But we’ll see… knowing me I will end up with a 1:5 ratio LOL.
Chad Burt
on 11 Apr 12
It’s true that client-side integration tests are much more costly to produce, but I’m not sure the solution is to give up on them. It also depends on context. My current project mostly runs on the client (a backbone.js single-page app). There is no way to test the behavior of the system other than using something like Selenium.
We need better testing tools for client-side code that allows us to spend 20% of our time on them and still get decent (but not 100%) coverage.
Joost Baaij
on 11 Apr 12
Interesting to see a blanket “don’t use cucumber”. As general advice, it’s probably sound. Cucumber is surprisingly hard to do well, and often ends up being a burden. I have made a click with it just three months ago, and I have been using rails since the beginning. But when it clicked, it really did. Customers writing tests? No, customers don’t write tests. Cucumber helps ME and it is more valuable than integration tests in Ruby, which is what I have been doing for years. Writing the tests in English does let me step outside my coding comfort zone and I have come to great solutions earlier than I would have otherwise. But it takes practice. I have nuked so many features-folders over the years it is not funny. But now I find I can’t code without it.
Kyle Lahnakoski
on 11 Apr 12
I am surprised by #2 (code:test not to exceed 1:3). 1:3 has been my general observation for minimum reliability! Most of my testing code is spent setting up an environment to perform a test: For example, in finance, the accounts must go through requisite transactions before a bug reveals itself. Or in networking, the client/server pair must be in just the right state before a test can be run. A good portion of my testing code is not in SVN either: they are deprecated tests, due to changes in code, and they do not include all that intermediate verification code I write while building a module.
At the same time, I avoid testing as much as possible: I add assertions to code if I think they will reveal a problem. I do not write tests for code that will be used extensively by other modules (because higher level tests should catch those problems). Sometimes I capture the complex tests by converting my logs into tests. Many times I hope my production code will fail-and-report sufficiently for me to find the next bug (try-catch dominates the simplest of algorithms).
Because testing code is so much larger than the code it tests, it seems to have O(n^2) the number of bugs. Testing code is much harder to get right than the algorithm itself.
But in the end I guess it is worth it: Only a working application gets through.
michal
on 11 Apr 12
Couldn’t agree more. Tests are a means to an end, not an end in themselves. Too bad so many people aim for the number of tests or test coverage rather than for happy users!
アレクセイ
on 11 Apr 12
Would somebody be so kind as to tell me what TSA means? Can’t google any suitable meaning :(
The greatest issue with tests for me is that the whole suite runs very slow when there are hundreds of them. I’ve tried several approaches to speed it up but haven’t tried to test less yet :)
Richard N
on 11 Apr 12
@アレクセイ: http://www.tsa.gov/approach/tech/ait/index.shtm
Jonathan Allen
on 11 Apr 12
The flip side of that is not bending over backwards to unit test an integration component. If your code is literally this:

    public View Index()
    {
        var data = m_Repository.GetIndexData();
        return new View(data);
    }

then you don’t need a unit test and all that entails.
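For Rails readers, the analogous do-nothing action (the model name is hypothetical) would be:

    # Nothing but framework glue — no branching, no logic of our own.
    # A unit test here would only restate the code.
    class ArticlesController < ApplicationController
      def index
        @articles = Article.all
      end
    end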
Annonymous
on 11 Apr 12
@アレクセイ http://www.tsa.gov/
It’s a U.S. government agency. Lots of privacy-invasion issues going on with how they screen people for flights.
Colin Nederkoorn
on 11 Apr 12
We’ve been using Code Climate to deal with the “smell” issues. It plugs right into Rails projects and rates your code. Pretty slick, and it lets us know early if we’re heading in the wrong direction.
Aninda
on 11 Apr 12
I’m a TDDer. I test because I’m not the only person working on my codebase. Non-trivial codebases require testing.
Would you allow a commit to Rails without a test? Think about why.
I’m all for writing better tests, avoiding duplication of tests, and running tests faster. But testing is important.
Justin Ko
on 11 Apr 12
Essentially, 37s delegates its integration tests to its users, and that’s okay :)
AdamA
on 11 Apr 12
Oh great. I just spent a couple of days setting up testing for my rails apps that didn’t have any. I guess I’ll delete them now. (Yes, I know that’s not the point.)
John Kamenik
on 11 Apr 12
I love this argument. I think it is spot on, but it is missing one key thing: “Pro tip” in the title.
Part of being a professional is knowing when to test (both for yourself and others) and when you shouldn’t test. But the problem in my experience is that people for which this article will resonate are the ones that we should encourage to test more (i.e., the no tests camp).
I have repeatedly seen and dealt with bugs that were not caught because of number 4 (don’t test Active Record), but were caused by using the “if” and “unless” constructs of validations. It is often hard to explain to a person not already apt to test that there are reasons not to test, but this isn’t one of those times.
Granty
on 11 Apr 12
How about sticking to the airport theme and calling this JET – Just Enough Testing
Alex Neth
on 11 Apr 12
I generally agree. However the phrase “Don’t test standard Active Record association” jumped out at me, since I’ve discovered a major bug in the most basic function of Active Record associations in 3.0.x.
https://github.com/rails/rails/issues/5744
GeeIWonder
on 11 Apr 12
Just like Ruby on Rails, TDD is the ONLY way to go. In as much as you would be nuts to use some other stack than Ruby on Rails (Java, PHP, .NET, etc.), so it is with TDD. You are an idiot if you don’t employ TDD. Why, you ask… uh… because DHH says so. duh!
Iain Dooley
on 11 Apr 12
I have never used Cucumber but I have used Lettuce which I believe is roughly the same thing but for Django.
Prior to that I had been through the nightmarish experience of unit testing a large database driven application using PHPUnit.
The most torturous aspect of PHPUnit in my experience was producing database fixtures, and I found the fixture format/system in Lettuce novel and refreshing – so much so that when I wrote my own little experiment in a kind of looser integration testing I adopted the same approach: https://github.com/iaindooley/Murphy
Needless to say I found the whole notion that non technical people would ever read or write a spec completely farcical and the whole BDD thing is lost on me but that was one good thing I got out of it.
How does everyone else deal with db fixtures in their tests?
Jonathan Allen
on 11 Apr 12
I think near-100% code coverage is essential for library and framework code. That is to say, code that you are going to be reusing all over the place and that forms the basis of one or more applications.
It is when you reach the upper levels of the stack that automated testing becomes more difficult and less useful. For example, you don’t need an automated test to see if the Open button actually displays the open file dialog and then passes the filename to your data library, but that data library damn well better be tested to the point you are sure it can parse or reject any file format given to it.
Brian Dear
on 11 Apr 12
@Jonathan Allen, I totally agree. Library and framework coverage is vital because it helps you spot when things break as frameworks and libraries evolve.
Avdi Grimm
on 11 Apr 12
You reference TDD in the first paragraph, but as far as I can tell you’re not talking about TDD at all. You’re talking about testing for the purpose of bug prevention, whereas TDD is a design discipline. To quote Kent Beck: “correctness is a side effect”.
Avdi Grimm
on 11 Apr 12
I should say, I don’t disagree with anything you’ve said here about testing for the purpose of correctness and catching regressions. There’s a lot I don’t test; in particular, anything that smacks of testing the framework. And I can be pretty cavalier about throwing tests away, too.
But it’s time and past time that we stopped conflating TDD with preventing bugs; it is and always has been a design discipline first and foremost, with automated regression tests as a nice byproduct. Asking “will this prevent bugs?” or “will this prevent enough bugs to be worth the time?” is putting up a strawman. The answer to that question will always be “no”; because that was never the point. The question to ask is “will this help me arrive at the simplest, DRYest design that could possibly work?”.
Jamie Hill
on 11 Apr 12
Love the analogy and wish someone had pointed this out to me around 5 years ago when I was fanatically unit testing every fork a piece of code could possibly take.
Leif
on 11 Apr 12
As an experiment, I once did get to 100% coverage for a library I wrote nearly a decade ago. You can read about my findings on the PerlMonks site.
YMMV, but if you’re careful to make sure you just test the stuff you actually wrote, as opposed to everything you end up using, high levels of coverage are not that hard to achieve.
My gut feel is that non-trivial software is noticeably better (correct, stable) when coverage gets over 40%, with diminishing returns after about 85% – I have exactly zero statistics to back that up other than 8 years of actually doing TDD and Agile in various forms.
Gregory Brown
on 11 Apr 12@Avdi,
The “design discipline” argument is definitely what people flock to TDD for. Ironically though, the little well-established empirical evidence that has been collected about the actual impact of TDD that we do have only shows a positive impact on reducing defects… the outcomes regarding the impact on design have either been shown to be inconclusive or contradictory between studies.
You may believe (and to a large extent I do as well) that this is an indication of a lack of high quality studies, or even something that is intangible and not easily measured, but if you believe the question is “will this help me arrive at the simplest, DRYest design that could possibly work?”, perhaps we can think of a way to measure that rather than just using our guts and anecdotal evidence?
Mark Burns
on 11 Apr 12
Have to back up what Avdi says here: TDD is not just bug prevention. TDD is design plus repeatable sanity checks and regression prevention. It’s just a faster way to check that you haven’t fucked things up than the same kind of clicking around or IRB checking or whatever you’d do as you ‘normally develop’ an app without tests, or with an arbitrary level of test coverage.
Also would like to say that there’s some sense in the idea of balance in terms of tests, which may be what David is getting at.
It’s important not to over-test, but I think that depends on the domain. If I was doing something literally life-critical, then my attention to testing and insistence on 100% TDD would be much higher, versus being tempted to sketch out the code and then write some tests.
I’ve never worked in any environment where even ‘100% TDDers’ don’t occasionally pre-empt the tests and just write code, but then I’ve never worked in absolutely mission critical situations where a bug in the app would be the collapse of a bridge.
I think we all, at various levels, decide to make judgement calls, and I’d love to hear whether or not developers like Avdi have occasional guilty, test-after moments.
Yan Pritzker
on 12 Apr 12
Having been non-TDD most of my life and recently started using more TDD approaches, I have a few observations, and I want to clarify them by not using the word “TDD” as it means many things to many people.
1. Test-first (that is, writing the test before writing any code) can slow you down if: (a) you’re an experienced engineer and see good design in your head; (b) you’re working on something new where the design is nebulous – then spiking and trying different things is a better way than writing tests. If you write tests too early you may find it hard to refactor your design (introducing collaborator classes, etc.) at the time that it needs refactoring the most.
2. Complex tests indicate complex code. I like this rule and I use it. Tests can help you spot code smells like code that is overly coupled with collaborators.
3. Test-first when dealing with a bug is almost a given. You write a test to capture the bug, then squash it.
4. Tests in general are a really great way to prevent regression and ensure your code is maintainable. According to research, they are not, however, the best way to spot bugs (code review is one method that’s more effective, for example).
5. Untested code is legacy code. It’s impossible to work with because it’s too brittle and leads to people being afraid of refactoring, which leads to more legacy code, which leads to a codebase rewrite. That said, testing that your method that is hardcoded to return a string returns a String is useless. I’ve literally seen these sorts of tests. Knowing that 100% code coverage is almost impossible, you have to focus on testing what matters: critical business logic, etc.
So the TLDR is: I wish people would stop being religious zealots about TDD. As you build experience as a software developer, you know when to reach for the right tool at the right time. Sometimes it’s test first, sometimes it’s test last. Sometimes you don’t test because it’s a trivial method with no real logic inside, and you save your time to deliver value.
People who think they can prescribe a one size fits all approach are very difficult to work with.
Yan Pritzker
on 12 Apr 12
Formatting fail, and linking fail :)
Eric Holmes
on 12 Apr 12
Unit tests for models and request specs with Capybara. It’s a beautiful thing.
Avdi Grimm
on 12 Apr 12
@Greg:
I’m not convinced those are two different things.
“Design Goodness” isn’t really something that can be measured objectively in a vacuum. It shows itself in how long new features take (and whether that number stays constant over time); how many LOC the changes touch; how many bugs are introduced by the change; and how many developers quit rather than make one more change to the system. If TDD results in fewer bugs introduced by new features, that would be indicative to me of its positive influence on design.
Bruno Pedroso
on 12 Apr 12
Totally agree with @Avdi. DHH, you missed the point. You are writing tests for the wrong reason. Maybe you can pick it up on day three ;-)
BTW, it’s ok to make some noise by creating controversy. I’m just a little sorry about the negative influence it’ll have on some people. Now they have one more excuse for not reaching day two or three =(
Robert Sullivan
on 12 Apr 12
Good to see some ‘take a step back and think’, Tim Ferriss-style analytical approach to testing. Often it seems we are swept up in the avalanche of new ideas such that we never get a chance to vet them thoroughly and tune them to our own processes and requirements.
SD
on 12 Apr 12
I could not agree more with you… point 7 defines me too… great post…
Mike
on 12 Apr 12
I’ve read several programming books recently that spend about 70% of the book on teaching testing… made me very angry.
Craig
on 12 Apr 12
@Andrea
“Talking about very big apps here.”
That’s the standard excuse for brain-damaged behaviour. Are you too special (“big”/enterprise/”mission-critical”) for common sense?
Avi
on 12 Apr 12
Great points. What not to test is just as important as what to test. If you’re interested in learning how to start TDD, Typemock is hosting a free webinar next week introducing TDD: http://j.mp/IVmQNi
Philip Schwarz
on 12 Apr 12
Some more useful TDD posts:
James Shore: A Hardheaded View of TDD http://jamesshore.com/Blog/A-Hardheaded-View-of-TDD.html
Jeff Langr: Test-Driven Development – A Guide for Non-Programmers http://pragprog.com/magazines/2011-11/testdriven-development
TDD as if You Meant It http://cumulative-hypotheses.org/2011/08/30/tdd-as-if-you-meant-it/
Jason Gorman: The Test-driven Development Maturity Model http://codemanship.co.uk/parlezuml/blog/?postid=1066
Uncle Bob: Double Entry Bookkeeping Dilemma. Should I Invest or Not? http://blog.8thlight.com/uncle-bob/2011/11/06/Double-Entry-Bookkeeping-Dilemma-Should-I-Invest-or-Not.html
Jason Gorman: TDD Is Neither Necessary Nor Sufficient For Good Design http://codemanship.co.uk/parlezuml/blog/?postid=1079
Michael Feathers: Making Too Much of TDD http://michaelfeathers.typepad.com/michael_feathers_blog/2010/12/making-too-much-of-tdd.html
Philip Schwarz
on 12 Apr 12
Oh, and this: Uncle Bob: Flipping the Bit http://blog.8thlight.com/uncle-bob/2012/01/11/Flipping-the-Bit.html
GeorgeM
on 12 Apr 12
Definitely agree that 100% coverage is a smell. There are two ways to get a higher coverage figure; push it up, or pull it up. People aiming for 100% tend to pull it up. How? Simplest way is to write “tests” that don’t assert anything! Seen it done.
Not so sure on the integration tests aspect. With frameworks being ever-larger parts of apps these days, I find unit tests force me to spend a considerable amount of time mocking out parts of the framework. When I start doing that, I just move the tests further out.
My approach these days tends toward starting off with a functional test for the happy path, and drilling down from there.
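The assertion-free “test” GeorgeM describes can be as small as this sketch (the Article model and its publish method are hypothetical):

    class ArticleTest < ActiveSupport::TestCase
      # Every line this executes counts as "covered", yet the test can
      # only fail if the code raises — it asserts nothing at all.
      test "publish" do
        Article.new(title: "Hello").publish
      end
    end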
P8
on 12 Apr 12
Regarding point 2, I’ve noticed that on projects using RSpec the code/test ratio is typically a factor of 1.7 higher with the same amount of tests/specs. My Test::Unit projects are around 1/1.3; my RSpec projects are around 1/2.3. With RSpec you get more lines of code but better-documented tests/specs.
Thomas Eyde
on 12 Apr 12
Living in a .NET world, the technology doesn’t apply, but the principles do.
There are tools similar to Cucumber in .NET, but I have rejected them, because I have yet to see a non-developer write a test. The same goes for the various BDD-inspired testing frameworks: if I need a When-Then test, I can use class and method names to achieve that.
However, those who experience a slow-down or refactoring difficulties caused by tests are doing it wrong.
Tests are harder to write last, so write them first.
Testing implementation is brittle, test expectations.
Low-level tests inhibit refactoring; postpone them until you really need them.
Tests should be designed so you can cover your needs with 7 + 5 tests, not 7 * 5 – test each of seven cases on one axis and each of five on the other independently, instead of all thirty-five combinations. See the difference?
Aslak Hellesøy
on 12 Apr 12
David – when you say “Don’t use Cucumber unless…” I assume you mean “Don’t use Cucumber to test your app unless…”.
I would agree with that. In fact, I would never use Cucumber to “test” anything.
I only use Cucumber as an aid to discover what code to write (and unit testing to discover how to write it). Keep them few. Your Cucumber scenarios (including step definitions) shouldn’t be more than roughly 10-20% of your app code.
I realise this is not how most people use Cucumber, which I think explains why some people don’t like it.
Rob G
on 12 Apr 12
I’d add the “fragile test” problem, where the tests are too specific.
I remember one employer whose code base had plenty of tests, but any changes to the code base inevitably broke dozens, if not hundreds, of tests. For example there were SQL generation tests that failed if there was different whitespace, even if the remainder of the SQL was identical.
Consequently, I spent over half my time fixing broken tests, and my productivity suffered massively.
Andrea
on 12 Apr 12
I’ve been saying this for years. Every time, I’ve been told to shut up and go back to my corner, because I didn’t know better.
Now I can point them to your post and tell them to shut up.
Sad world we live in. But yeah, thanks I guess, this will probably make my life easier in the future.
Relistan
on 12 Apr 12
You know you’ve written a good article when you spark so much debate, with strong voices on both sides. Well done.
David Chelimsky
on 12 Apr 12
You say “Don’t test standard Active Record associations, validations, or scopes.”, but not all associations, validations, and scopes are equal.
Scopes that support authorization are likely to have a higher cost of failure than scopes that constrain “recent articles.” Same goes for validations, some of which have a low cost of failure, but a list of events is going to be pretty useless if their links don’t show up because something as simple as “validates_presence_of :title” got lost when you were DRYing up your event and series models. In some cases that won’t matter, but if you’re in the business of selling tickets then you’re losing money if that makes it to production and stays there for even a few minutes.
So rather than “test x but don’t test y”, how about “test things that will cost you $ if they fail in production.”
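A sketch of the kind of test Chelimsky’s heuristic does call for – a scope whose failure leaks data or loses money (the model, scope name, and attributes are all hypothetical):

    class EventTest < ActiveSupport::TestCase
      # If this scope silently breaks, users see events that aren't
      # theirs — a high cost of failure, so it earns a test.
      test "visible_to returns only the account's own events" do
        mine   = Event.create!(title: "Mine",   account_id: 1)
        theirs = Event.create!(title: "Theirs", account_id: 2)
        assert_equal [mine], Event.visible_to(1).to_a
      end
    end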
P8
on 12 Apr 12
“… my philosophy is to test as little as possible to reach a given level of confidence (I suspect this level of confidence is high compared to industry standards, but that could just be hubris). If I don’t typically make a kind of mistake (like setting the wrong variables in a constructor), I don’t test for it.”
If the majority of developers lived up to Kent Beck’s words they’d probably write more tests not less.
Also, as the creator of Rails, David knows better than most how stuff like scopes work. So he’s confident he won’t make a typical mistake there. Someone new to Rails will probably be less confident and write more tests.
Drew Paremo
on 12 Apr 12
Some here are only full of words. I think it is good if you have tested enough that you can really weigh the argument. What do you think?
Rebecca
on 12 Apr 12
I’m quite happy to keep testing everything that my app will not work as intended without – and seeing as I don’t make a habit of writing unnecessary code, I’ll keep on maintaining my 100% test coverage.
For simple things like validations and scopes, it’s not testing that Rails works as it should, it’s testing that I’m using Rails as it should be used. My tests have caught a lot of mistakes I’ve made, mistakes I may not have noticed – so the time I spend on testing has been well spent.
Cássio Marques
on 12 Apr 12
If you’re writing your tests first only 20% of the time, then you are only doing TDD 20% of the time. The other 80% the CODE is driving your tests (which are going to be tendentious) and not the other way around.
That said, I agree that stuff like associations and validations MAY NOT be tested, but that does not hold true all the time.
DHH
on 12 Apr 12
Hellesøy, I’ve heard this justification before, but that must be something that mainly helps teams without designers. I have a hard time imagining “designing” my app by writing an integration test—whether through Cucumber or not.
It seems like neither an efficient nor a likely way to discover a good user experience. A given feature can be implemented in all sorts of different ways depending on how the design is constructed—tying that down upfront with an a priori integration test doesn’t seem helpful.
But of course, if your project for whatever reason does not use designers to design the experience or how something works, then maybe there’s value there. I tend to find that line of argument goes along with ideas like ‘designers are just supposed to create the css’, which is a terrible way to think of design.
Chelimsky, agree that you should consider the criticality, but it’s not the only factor. If you don’t also consider the likelihood of a mistake, you’re definitely TSA testing.
James Shore
on 12 Apr 12
I’m also happy to see the recommendation against Cucumber. I was heavily involved with Fit for several years (a precursor to Cucumber). I found that the “customer-friendly specifications” approach created more problems than it solved. See The Problems with Acceptance Testing for details.
I was surprised to see the comment about 1:2 or 1:3 code-to-test ratio. That seems high to me—I’m used to a ratio that’s closer to 1:1, unless you’re testing code that’s particularly test-unfriendly, like GUI code.
P8
on 12 Apr 12
I wonder if GitHub considered the likelihood of mass-assignment injection. Is that something you do recommend testing?
Robert Sullivan
on 12 Apr 12
@P8: One of my first jobs was as a tester for a company writing tax software. This was written in some strange variety of Basic. We had hand-crafted, automated unit testing and integration testing tools long before JUnit, Cucumber, etc. Crude, but effective. For all our work, of course we would miss certain scenarios due to the unique input our customers dreamed up. I once went to one of the lead developers about a bug we had missed. He sagely noted that, as far as testing, the customers were “the real testers”, and I think it is only now with facebook, google, that we see this idea implemented in the mainstream. Again, YMMV, we were not building the mars lander here. With the mass-assignment issue, the important thing is to FIX it ASAP. The person who discovered it kindly alerted GitHub but they sat on it – stupid user. And that is where GitHub failed miserably, IMO.
Paul D. Waite
on 12 Apr 12
I’ve got no experience of highly critical applications (medical, spaceflight, etc), but I bet there are several processes that are used in addition to code tests that are indispensable for preventing bugs, and much better than creating another thousand code tests — e.g. code review, manual testing, sharing information and expertise, etc.
Aslak Hellesøy
on 12 Apr 12
DHH,
I didn’t think I’d have to emphasise that by design I meant architectural design – not graphic or interaction design.
You know – classes, functions, methods and so on.
P8
on 12 Apr 12
@Robert Sullivan: Yes, you’ll always miss corner cases that only arise in production, but you want those to be non-critical.
I bet there are tons of Rails applications where important objects aren’t scoped to the user in the controllers. DHH saying people shouldn’t test scopes doesn’t help.
Mike
on 12 Apr 12
I always felt it was management’s responsibility to determine the type and amount of testing which should be done. They’re the ones who create the budget and they’re the ones that have to listen to irate clients.
Ross
on 12 Apr 12
Can’t agree more with this article.
Though I fear the backlash this could potentially create. Every anti-test rails developer out there finally has a “See! I told you so!” article they can point to. Hopefully they don’t miss the point.
As for a pithy catch phrase to prevent over-testing, the one I learned at Pivotal and the one I pass on to every new developer is “Only test logic”.
Anything declarative, like associations, validations, constructors and the like doesn’t warrant testing.
Anything with complex logic, mostly objects with methods that iterate, branch or do complex queries benefit greatly from testing. Especially testing first to nail down the object’s interface.
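In code, the “only test logic” heuristic might split a model like this (a hypothetical example; total_cents is assumed to be an integer column):

    class Invoice < ActiveRecord::Base
      # Declarative — skip the tests: there is no logic of ours to get wrong.
      belongs_to :customer
      validates_presence_of :number

      # Logic — test it: it branches and calculates, so it can be wrong.
      def total_with_late_fee_cents
        overdue? ? total_cents + total_cents / 20 : total_cents  # 5% fee
      end
    end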
DHH
on 12 Apr 12
Hellesøy, how does integration testing help you design the internals of a system? Isn’t the whole point to treat the system like a black box?
Even if you’re talking about unit testing, the only way I’ve found tests to help design is by ensuring that I don’t regress when refactoring. That’s a very valuable attribute, probably the key attribute of testing for me, but I file that under “don’t create bugs”, not under “helped me design”.
Ruslan
on 12 Apr 12
I totally disagree!!! What difference does it make whether you are working on an airport system or an ordinary internet shop??? Since when do we measure bug cost by the domain area of the system? How is it related at all? Why are we comparing the cost of an airport-system bug with an internet-shop bug? Obviously the first one is much more expensive.
But try to compare the cost of fixing a bug in production with the cost of finding it in development. In that comparison it doesn’t matter what system you are working on: (cost in production)/(cost in development) will go to infinity. And this equation works for both systems. So statements like “Don’t aim for 100% coverage” are bullshit; we should, we have to, we must have 100% coverage.
From my experience, when people say “we don’t need 100% coverage” it means they don’t know how to reach it, or they know but don’t want to do it.
Mostly, to reach 100% you have to refactor the original code; that’s expensive and hard, and there is no ROI at that point. It’s hard to admit that code that is hard to test is equally hard to use or extend, so it’s easier to say NO to 100% coverage.
Yan Pritzker
on 12 Apr 12
@DHH I think people are saying ‘help you design’ in the sense that a complex test helps you spot complexity in your code. I think really good engineers see complexity before they need a test to show that to them, but it helps nonetheless.
Omar
on 12 Apr 12
When people say coverage testing they mean statement coverage or path coverage. I believe you can get 100% on the first one, but never on the second if you have a non-trivial system.
<script>alert('lol')</script>
on 12 Apr 12
Yan Pritzker
on 12 Apr 12
Cássio, code-driven code (or traditionally designed code, as we were writing for years before TDD was even a term) is only going to be tedious to test if you’re not good at clean design. If you have enough experience with good OO design, you will not require a test to show you that your design sucks; you will know it intuitively from having applied experience, design patterns, and best practices. Sure a test can definitely expose bad design, and I’ve used tests for this, but I would say that 90% of the time I’m writing the right design from the start based on experience, and writing a test first is just slowing me down by forcing me to reverse-engineer the model that’s in my head already.
Brian Balke
on 12 Apr 12
Obviously a lot of perspectives here. What comes to mind are the disciplines engendered by thorough immersion in Watts Humphrey’s Personal Software Process – or any other CMM level III style practices.
Every developer should have an organized practice for assessing how they spend their time. They should have a framework for analyzing the nature of the artifacts they are generating (API, GUI, demo app, etc.) and a sense of how the application of certain practices impacts outcomes.
The most important study I did revealed that the use of a rigorous design practice shifted the amount of time I spent in test from 50% to 10% of the development cycle (total time spent roughly equal).
What that revealed to me is that it is the thought processes that produce quality, and that any practice metric regarding test coverage or code to commentary ratios or module size is meaningless unless it couples to thought processes. The right question for us to be asking as managers (of ourselves or others) is “Why do I think this test/comment/module is necessary?”, not “Do I have enough?” of aforesaid artifacts.
That said, I have found that a rigorous test suite is an invaluable aid to refactoring – although obviously that implies integration level tests, rather than unit-level tests (refactoring obviously invalidating unit-level tests).
And perhaps the most scandalized I have been was hearing a developer who had been a walk-on-water consultant all his career describe his test practices in the context of developing a piece of foundational infrastructure. The practices certainly reflected the focus of someone expecting to be on to the next job shortly…
Drew Paremo
on 12 Apr 12
Yan, you have a very clear point coming from common sense. Of course you wouldn’t do test-first for implementations that come from repeated experience. That is not the point of test-driven development.
Anonymous Coward
on 12 Apr 12
In addition to Ruslan, it seems that others cannot handle 100% or even 80% test coverage when introducing change. I hope I am wrong.
Smahlatz
on 13 Apr 12
I think the article shows a misunderstanding of some of the fundamentals of TDD. TDD is not a testing discipline, it is a design approach which promotes loosely coupled, self-documenting, maintainable code. The resultant tests you get for free. TDD melds test and development into a single activity.
“Every line of code you write has a cost”. Yes – but in practice, TDD uses tests and your IDE to generate code. Under these practices, you get code and tests with the same effort that previously gave you just code.
pelumini
on 13 Apr 12
People don’t think anymore; they follow the loudest voice(s). The noise everywhere is test test test; even if it’s stupid and meaningless – just test it! Thank God there’s some sanity in sight!
Ciprian Radu
on 13 Apr 12
Very nice set of don’ts, but why formulate things in a negative way?! Why not come out and state the dos of testing, like:
1. Be happy with your percent of code coverage as long as your application is production ready.
2. Reuse, automate – make sure you minimize both your time to code and time to test. Be lazy about writing code or tests and be smart about being productive! The goal is to get the project done, not to write code/tests ‘till the end of time on the same project. And another thing: keep in mind that the more code you write, the more work you need to do when changes arise.
3. Always mind your project specifics – they can influence your decisions. It is one thing to do a project for sending rockets to Mars and another thing to build a CMS. Just as it is one thing to build a one-time project and another to have a product that constantly changes, or to build a reusable component/engine. Also project size, team size and team members’ seniority might require different approaches.
4. You must unit test your complex business logic. Apply point 2 as needed, meaning: improve your tests on an as-needed basis, or correct the errors and move on.
I like the positive way of seeing things better than the negative one – even though we’re discussing TDD, which by its nature can be seen as negative. I also reduced the list, not because the things that were there were not meaningful but because they are debatable. Of course you can use metrics like “code to test ratio” or “time spent coding vs. time spent testing” – but they are related to the specifics of the project (see point 3).
I left to the end one special topic: Cucumber. In my opinion there are projects for which this is the holy grail. Having test scenarios that are meaningful for the client can be a very powerful asset when discussing the requirements with the client and especially during acceptance phase. And the fact that you can use the same scenarios to run automated tests is awesome. And I would say that this is not a matter of personal preference but is definitely the way to go if you are building products and you plan to extend or change them in the future (as you see, again, related to project specifics).
It would probably help if you could share with us what a typical Rails application looks like and how you measure that, so that we can say whether those don’ts are applicable in that case or not.
Ciprian Radu
on 13 Apr 12
Rephrasing my last remark: in order to see how the “7 don’ts” relate to the “typical Rails app”, it would be very useful if you could provide some metrics that give us an idea of what a typical Rails app is. Having that would be beneficial for us and would also add real value, because we could compare our experience with yours and see where the differences are and what’s applicable or not. I’m sure you get my point.
Rodrigo Rosenfeld Rosas
on 13 Apr 12
Hi David, I share your point of view, especially on the test-after approach, which I also practice most of the time; I don’t think there is anything wrong with it either. But the real issue with testing for me has always been the same: speed!
I’m getting back to Rails development now and I’ve just created a new Rails application to work around a Grails bug I reported some weeks ago that hasn’t been fixed yet. But testing a single action (a complicated one, though) takes me about 1s per test, as reported by RSpec.
I’m using FactoryGirl, but I don’t know how to write tests for this action that read as well as they are implemented but run faster, without actually hitting the database.
I’m using several features of ActiveRecord including changes in the model table names and foreign keys and calculations like “maximum” in before/after callbacks. All of those have to work for the action to complete. Also I couldn’t find any references explaining how to use a fake user object to authenticate with Devise so that I don’t need to create a user just for passing the authentication filters.
In Grails they’re able to allow a lot of features with their mocked model implementation and that makes unit tests much faster to run. I couldn’t find any similar mocked class of ActiveRecord that would accept about the same DSL including dealing with legacy databases and calculations.
It would be fantastic to have more articles on how to speed up tests in Rails and avoid touching the database without going crazy like in Avdi’s Objects on Rails approach.
What would you recommend on this line?
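One era-appropriate answer to the Devise question – not from the post, and assuming rspec-rails (which provides stub_model) and Devise’s controller test helpers – is to sign in a stubbed user so no row is ever created; the controller and spec names here are hypothetical, and details may vary by setup:

    # spec/controllers/reports_controller_spec.rb
    # stub_model builds a User that behaves as persisted (it has an id)
    # without touching the database; Devise's sign_in accepts it for the
    # authentication filters in most configurations.
    require 'spec_helper'

    describe ReportsController do
      include Devise::TestHelpers

      it "renders the index for a signed-in user" do
        sign_in stub_model(User)
        get :index
        response.should be_success
      end
    end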
James OKelly
on 13 Apr 12
If you aren’t testing mass-assignment protection you will get into trouble. If you are testing Rails internals, you are wasting time. If you aren’t getting to 100% on your unit tests, you are going to regret it.
Functional and integration testing have their place, but that is where you should skimp if anywhere, and if there is a piece that routinely breaks, you had better write a test for it to catch it before it goes to production.
All in all, I don’t think DHH is wrong, here, however, as a master of PR, he likes to stir shit up to get more coverage for himself, and for Rails in general.
That being said: RSpec sucks, Test::Unit sucks (sorry DHH, it’s true), Minitest IS THE BOMB!
Oh, and if your tests take more than a couple of minutes to run, you are testing in the wrong framework, OR you are putting TOO MUCH business logic into your Rails app.
Instead create a library and test it separately. You will be surprised at how fast your tests run once you de-couple it from Rails.
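Returning to his first point, a minimal sketch of a mass-assignment check, assuming a Rails 3-era model protected with attr_accessible and the default (non-strict) sanitizer, which silently drops non-whitelisted attributes (the User model and admin flag are hypothetical):

    class UserTest < ActiveSupport::TestCase
      # Guards the whitelist itself: if someone adds :admin to
      # attr_accessible (or removes the whitelist), this test fails.
      test "admin cannot be set through mass assignment" do
        user = User.new(name: "Mallory", admin: true)
        assert !user.admin?
      end
    end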
Raghav
on 13 Apr 12
Well said, David.
Deryl R.Doucette
on 14 Apr 12
I definitely find I do NOT agree with the sentiments stated in the post. I truly believe that by writing your tests first you drive out far better code than you would without them. I also truly believe that tests-after results in most people altering their tests to make them pass against their code rather than checking their code to see why it’s failing, and then modifying their code to pass the tests as they should. This results in a no-care attitude about the reasoning for tests in the first place, in my mind. Sorry, I just do not believe that tests-after-code is at all the way to go, and not writing your tests in the first place smacks more of “I’m too good for tests” and a lack of understanding of both the types of testing that should be done, and the need for tests at all.
Eugene Garcia
on 14 Apr 12
TDD sounds great at first, but I quickly realized that when I’m in the “zone”, TDD can quickly destroy that highly productive state of mind. Therefore, when I’m in the “zone”, I give myself permission to not test anything until I’m out of the “zone”. I’m only in the “zone” about 10% of the time, so there is plenty of time in that other 90% to go back and add tests, and to use TDD the proper way.
llewellyn falco
on 14 Apr 12
The phrase you are looking for is “Test till bored”
and there is a very good way to achieve high coverage with a much better code/test ratio (mine is around 60/1 & I usually have > 95% coverage)
The ruby version is here: https://github.com/kytrinyx/approvals
There are videos about it here, I’d start with this one: http://www.youtube.com/watch?v=vKLUycNLhgc (it’s in .Net, but the code is virtually identical for ruby)
Si
on 16 Apr 12
The examples given by people of GitHub and other potential security failures don’t really go against DHH’s point. His point seems to be to critically evaluate the value of the test code you are writing. And by that criterion, tests which would have caught such failures are valuable tests – so they are worth writing.
Likewise, the “TDD is purely about design” approach probably wouldn’t have actually resulted in these tests being written, since the focus would have been on class, object and data relationships. I agree there is value in using tests for this purpose, but it doesn’t in itself naturally catch security errors.
So I’m generally quite convinced by his point. I think a seven-point checklist goes against his preceding sentence regarding “nuance, experience, and dozens of fine-grained heuristics”, but its use is understandable as I suspect n-point checklists are the only language some programmers understand. Having said that, many of them are basically “ease up on yourself” and don’t mistake a particular metric for a mark of quality, which seems spot on. I personally would take a slight exception to the scope issue on point 4 – I have found automated tests particularly invaluable for testing complex scopes (for relations which the client insists on, rather than ones I have chosen!). But where the complexity is not there and the criticality is low, I’m quite happy to spend the time focusing on writing tests to avoid mass-assignment injection instead ;-)
This discussion is closed.