The elephant in the room here is the bugs that crop up in the interfaces between units. Depending on your problem domain, it may be easier or harder to create interfaces between all these TDD'd, isolated modules. There's certainly value in a spec as isolated component documentation, but I'd argue there's more value in specs for regressions, particularly when refactoring. This is especially the case in dynamic languages like Ruby which require tests to avoid the most basic runtime errors.
The answer to this conundrum is often: that's the job of the integration tests. Okay, but integration tests are slow, so you'll never test all the permutations of the interactions of the components.
DHH said during the keynote, "it's easy to make your tests fast when they don't test anything". Of course that's hyperbole, but there's a kernel of truth to be investigated there. When working with something as complex as an ORM like ActiveRecord, isolating the business logic and using the ORM strictly for persistence may allow for fast tests, but you still run the risk of bugs creeping in on the interface because of some assumption about or change in the way ActiveRecord works between versions or whatever.
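To make that interface risk concrete, here's a rough, invented sketch (hypothetical PriceReport class, plain Minitest): the isolated test is fast and green, but it quietly bakes in an assumption about what the real query returns.

    # Hypothetical example: PriceReport is pure logic, isolated from ActiveRecord.
    require "minitest/autorun"

    class PriceReport
      def initialize(products)   # expects something enumerable of objects with #price
        @products = products
      end

      def total
        @products.sum(&:price)
      end
    end

    class PriceReportTest < Minitest::Test
      Product = Struct.new(:price)

      def test_total
        # The stand-in "query result" is a plain Array. This test passes in
        # milliseconds, but it encodes an assumption: that whatever the real
        # query returns (an ActiveRecord::Relation, say) behaves like this
        # Array. If that assumption breaks between versions, only a slower
        # integration test will notice.
        report = PriceReport.new([Product.new(5), Product.new(7)])
        assert_equal 12, report.total
      end
    end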
That's why, as ugly and slow as Rails unit tests are, they are a simplifying compromise that strikes a balance between the theoretical ideal of a unit test and an integration test. ActiveRecord itself is this way too, in that oftentimes the business logic just isn't complex enough to warrant the complexity of separating persistence from domain logic. As much as DHH may be talking out his ass without really having ever grokked TDD, I don't think his complaints are completely without merit.
I have to agree, and add an observation that every project I've seen which emphasised highly isolated unit tests spent the majority of its test development effort on local behaviour that was easy to test, and very little effort on testing complex interactions and emergent behaviour.
Furthermore, when a complex interaction with changing behaviour in third party code caused a bug, I have always observed the resident advocates of highly isolated unit tests throw up their hands and say "oh well that's not our problem".
I'll offer up my own rule that I've worked out over the years: the first test you write, from day one, should be the end-to-end performance stress test that fully loads a production deployment of the project, making it do everything all at once and sending random junk input until it falls over. Even when your project doesn't really have any functionality yet, run that one continually in the background. This one test will find so many classes of bugs that you weren't expecting - it optimises your tests for learning surprising things as soon as possible. You then start fleshing out your test suite to cover the things you learn. With some smart design, there will be a lot of shared code between your ever-growing stress test and the suite of more targeted tests.
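To sketch what I mean (everything here is a placeholder: the target URL, the paths, the payloads), the first version can be as dumb as a loop that throws junk at a deployed instance and logs anything surprising:

    # Rough sketch of a background "junk cannon" aimed at a deployed instance.
    require "net/http"
    require "securerandom"
    require "uri"

    BASE_URL = ENV.fetch("STRESS_TARGET", "http://localhost:3000")
    PATHS = ["/", "/search", "/api/items"]   # placeholder paths

    loop do
      uri = URI.join(BASE_URL, PATHS.sample)
      uri.query = "q=#{SecureRandom.hex(rand(1..64))}" if rand < 0.5

      begin
        response = Net::HTTP.post(
          uri,
          SecureRandom.random_bytes(rand(0..4096)),
          "Content-Type" => ["application/json", "text/plain", "junk/junk"].sample
        )
        # Any 5xx is a surprise worth logging and turning into a targeted test.
        warn "#{uri} -> #{response.code}" if response.code.start_with?("5")
      rescue StandardError => e
        warn "#{uri} -> #{e.class}: #{e.message}"
      end
    end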
You might find that you get down to detailed unit tests of every part of the code... but you'll probably find that you only need to bother with unit testing obscure conditions for the complicated parts.
The core philosophical difference here is that I regard tests as a tool for learning and making the code better, not as a way to make myself feel happier. The biggest performance barrier I experience as a developer is not the seconds it takes to cycle through the tests, it's the weeks it takes to learn what I'm trying to build. I aim my tests at that target.
DHH's complaints certainly aren't without merit, but they are naively formed.
I did a conference talk called "Boundaries" on this topic, and how a particular type of disciplined functional style can mitigate it to a large extent (https://www.destroyallsoftware.com/talks/boundaries). I think that we can probably have (most of) our cake and eat it too, just as we have so many times before: microprocessors reliably perform computations; TCP reliably delivers over unreliable networks; etc. But we're not going to get there by throwing up our hands.
When you say naively formed, what does that mean exactly? A lot of your post was dedicated to correcting DHH's TDD history. I appreciate the hard work and attention to detail but I don't actually think when the idea of ultra-fast tests originated has much to do with DHH's complaints.
You also spend time in your post talking about how much you value the tight feedback loop between your tests and your code and how your tests being ultra fast helps that. That's great! That isn't really a response to DHH though, that's explaining why your method works for you. Which is a valuable contribution to the discussion, but not really a critique. You also bring up DHH's lack of theoretical computer science knowledge. I don't have the slightest idea what that has to do with the value of TDD, can you explain that?
I guess what I'm trying to get at here is that you didn't really propose a counter argument to DHH here, you said his historical knowledge is wrong, he doesn't know computer science and you get a lot of value from fast tests. The core of DHH's argument is that the metrics that are being touted to measure test suites are not metrics that actually help us write good code or code that is reliably well tested. It would be great to hear your direct responses to those arguments.
The feedback loop is my response. He positioned isolated unit testing as having drawbacks, but he never mentions (and doesn't seem to have experienced) the value of it.
Others have written responses to his claims about design. I think that he's less off-base with those; the design benefits of TDD are oversold to some extent, although they certainly do exist in my experience. It's difficult to oversell the speed benefits of isolated unit testing, though. You really have to experience it. Here's Noel Rappin commenting on my post saying exactly that, in fact: https://twitter.com/noelrap/status/461633622185746432
> The answer to this conundrum is often: that's the job of the integration tests. Okay, but integration tests are slow, so you'll never test all the permutations of the interactions of the components.
Integration tests aren't necessarily slow in any dramatic way (though they'll naturally be slower than unit tests), but in any nontrivial system you still won't test all the permutations of interactions between components because the number of such permutations will be prohibitively large even if integration tests were as fast as unit tests.
My preference is to aim for path complete integration tests, and thorough unit tests.
> When working with something as complex as an ORM like ActiveRecord, isolating the business logic and using the ORM strictly for persistence may allow for fast tests, but you still run the risk of bugs creeping in on the interface because of some assumption about or change in the way ActiveRecord works between versions or whatever.
If you isolate the domain and persistence layers rather than combining them the way Rails seems oriented toward, the persistence layer still ought to be a testable unit -- and as its job is persistence, you wouldn't isolate the database from it to test it. OTOH, it's a unit that should be more stable than the model layer in most cases, and you won't pay the higher costs for running its unit tests when you are making changes to the model layer.
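Roughly the split I have in mind, with invented class names: a thin persistence unit whose tests talk to the database, and a domain unit whose tests don't.

    require "minitest/autorun"

    # Persistence layer: its job is the database, so its tests use the database.
    # They're comparatively slow, but only need to run when this code changes.
    class PersonRepository
      def initialize(record_class)   # e.g. an ActiveRecord model, injected
        @record_class = record_class
      end

      def names_in(city)
        @record_class.where(city: city).pluck(:name)
      end
    end

    # Domain layer: pure logic, no database, millisecond tests.
    class NameFormatter
      def self.greeting(names)
        "Hello, #{names.sort.join(", ")}!"
      end
    end

    class NameFormatterTest < Minitest::Test
      def test_sorts_names_into_a_greeting
        assert_equal "Hello, Ann, Bob!", NameFormatter.greeting(["Bob", "Ann"])
      end
    end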
> That's why, as ugly and slow as Rails unit tests are, they are a simplifying compromise that strikes a balance between the theoretical ideal of a unit test and an integration test.
Why compromise between unit tests and integration tests when the two aren't exclusive and serve different purposes? Why not actually have good unit tests and good integration tests, instead of beasts that are neither fish nor fowl and are less than optimal as either?
It seems that the argument here is based on the false dichotomy that suggests if I do unit testing, I can't have integration tests, so I either need to just do integration tests or, for some reason, trade both for some sort-of-unit-ish tests that test the persistence and domain layers as if they were the same unit.
> Why compromise between unit tests and integration tests when the two aren't exclusive and serve different purposes? Why not actually have good unit tests and good integration tests, instead of beasts that are neither fish nor fowl and are less than optimal as either?
I wonder this myself. Unit tests serve a completely different purpose and should not get in the way of integration testing. They may/should help with structuring the code so that it is easier to integration test. They definitely shouldn't hinder. Then there is also the system level test, and then manual testing as well. All of these serve different purposes and can live happily together.
You've talked around my core point a lot, but you haven't addressed it directly at all. What do you do about bugs on the interfaces and interaction of the heavily modularized and unit-tested components?
> You've talked around my core point a lot, but you haven't addressed it directly at all.
I thought I addressed it quite directly.
> What do you do about bugs on the interfaces and interaction of the heavily modularized and unit-tested components?
Ideally, catch them with the unit tests -- whose entire purpose is testing interfaces -- but, where that fails, with traditional integration tests (which, when they find bugs missed by unit tests, prompt additional unit tests to isolate the offending unit and assure that they are properly squashed). That beats abandoning unit and integration testing for something that isn't quite either, which lacks the specificity and test-on-each-save utility of unit tests while also giving up the full end-to-end cycle testing of integration tests.
Your unit test guarantees the intended interface of the unit, but it doesn't guarantee the usage of said interface. Typically other unit tests will stub out this dependency. But what is your guarantee that the stub agrees with the spec'ed behavior?
Now this may range from a non-issue if the interface is obvious and straightforward and leaves little room for error, to extremely error-prone in a duck-typed language. Of course there are ways to mitigate this with the test and stubbing infrastructure, but it can be brittle as well.
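One such mitigation, sketched here with RSpec shared examples and invented class names, is a contract that both the real implementation and its fake have to pass, so a stand-in can't silently disagree with the spec'ed behavior:

    # A shared "contract": the same examples run against the real adapter and
    # the test fake, keeping the fake honest.
    RSpec.shared_examples "a mail gateway" do
      it "returns a message id string from deliver" do
        expect(gateway.deliver(to: "x@example.com", body: "hi")).to be_a(String)
      end
    end

    # Minimal working fake used by the fast unit tests elsewhere.
    class FakeMailGateway
      def deliver(to:, body:)
        "fake-#{rand(1000)}"
      end
    end

    RSpec.describe FakeMailGateway do
      let(:gateway) { FakeMailGateway.new }
      it_behaves_like "a mail gateway"   # fast, runs constantly
    end

    # The real adapter's spec includes the same contract but talks to a sandbox
    # service, so it runs less often (e.g. only in CI):
    #
    #   RSpec.describe SmtpMailGateway do
    #     let(:gateway) { SmtpMailGateway.new(sandbox_credentials) }
    #     it_behaves_like "a mail gateway"
    #   end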
This problem is why I think there is some sense of coarser-grained units with fewer stubs and some qualities of integration without going all the way up to the full-path acceptance level.
Depending on your problem domain, it may be easier or harder to create interfaces between all these TDD'd, isolated modules.
Aren't you assuming that TDD requires pervasive unit isolation?
I practice TDD often and prefer not to isolate units at least at the level of filesystem artifact.
I believe that one of the reasons test-first became TDD and gave up on the distinction between unit tests and acceptance tests is because the Smalltalk SUnit style of testing units as units didn't work so well in languages which aren't Smalltalk.
I've been hoping that you would write an intelligent response to DHH's latest articles and talk ever since I saw them, thank you for doing it.
I suspect that his claims of TDD isolation damaging the integrity of designs are mostly red-herrings too. I know they are for me personally -- in fact, the opposite is true. I'd love to see you address this issue as well. In particular, your thoughts as they relate to the technique of moving logic completely out of controllers (as you do in raptor). This technique seems to infuriate DHH in the context of rails, while to me it seems a far superior design.
I suspect that his claims of TDD isolation damaging the integrity of designs are mostly red-herrings too. I know they are for me personally -- in fact, the opposite is true.
I'd be interested to see more examples of that. In my experience, this sort of warning genuinely isn't a red-herring - I've worked on several projects that have been seriously creaking under the weight of test suites, and in which the desire to isolate units in particular has led to both poor code architecture, and ironically poor tests.
In particular, your thoughts as they relate to the technique of moving logic completely out of controllers (as you do in raptor). This technique seems to infuriate DHH in the context of rails, while to me it seems a far superior design.
Raptor's design is interesting, but I dispute that the entire "C" in "MVC" is worthless. I agree that controllers are often misused, but something like Raptor essentially pushes the logic that it's OK to have in a Rails controller into the router, and I'm not sure that would actually be scalable in practice.
In fairness, I believe there exist code bases with a fragmented, poor design and lots of unit tests. I would place the blame with the programmer's design skills, though, and not with scrupulously testable, loosely coupled modules.
I do not believe that TDD is a replacement for good design skills at all, and it's easy to TDD yourself into a poor design. Even strong TDD proponents like Uncle Bob admit as much [1]. However, I also believe that most good designs do happen to have the property of being easily unit testable, and if your code does not have that property, you should take pause.
I've experienced (and created) exactly those weighty, oppressive test suites. I think that they're probably more a symptom of us collectively learning to test than anything else. Even today, there are very few people who are experts at test isolation; five years ago there were almost none.
It's tempting to give up in situations like that, but I like to think back to "GOTO Considered Harmful". It was contentious at the time! There were people who literally thought that you couldn't build complex software systems without GOTO. It's a reminder that we always underestimate how good we can get at something, and how much freedom we'll have after adopting a constraint.
All of these arguments seem like they are overly prescriptive. Individuals should use what works best for their needs. TDD may work wonders for me today on one project, and then it could be horrible tomorrow on the next project. My co-worker might have the inverse.
Stop arguing over how other people create software; Ship Code instead.
I believe that it's a useful conversation to have -- if we are all just trying to ship code, when are we going to have the discussion of what makes code readable? What makes code reliable? What makes code quicker to deliver?
Further, as an automated test advocate (whether or not it's TDD or unit or mocks or stubs or whatever flavor you like) I want to walk into projects where I don't have to deal with changing code without having a good and fast test base to minimize the potential defects. If TDD gets us to that point, I'm all for it.
I don't care how you create software unless I have to use it or inherit it.
> Stop arguing over how other people create software; Ship Code instead.
Electrical engineering, mechanical engineering, architectural design and the medical profession, to name a few, have bodies of knowledge they are required to use.
Is it really a good idea for software developers to say "stop arguing over how other people create software; Ship Code instead" considering we don't have any industry standard bodies of knowledge?
Considering we don't have any industry standard bodies of knowledge, yes it probably is a good idea, because we don't have anyone who can lay out a definitive answer on many of our preference based arguments.
i.e. these arguments will never end if there is no definitive answer, or body to pick one.
Industry standard bodies of knowledge arise from people doing, then talking about what they did, then doing some more, then talking some more. At no point did a God of Electrical Engineering hand down tablets.
I think SWEBOK is interesting both as a demonstration of how far we have to go before we get anywhere near the standards of established engineering fields, and as a demonstration of how much we have already collectively figured out, even if most of us aren't familiar with more than a small fraction of what is out there.
> Individuals should use what works best for their needs
Maybe rather: teams should use what works best for their needs ?
EDIT: whee, downvotes. Apparently HN'ers work on teams that ship software where one developer uses kanban, another TDD, another waterfall, another scrum, and another only commits code when the 8 planets are aligned.
I'm going to take a stab at the down votes, as I could easily understand why someone might. First, consider your original comment was pedantic and offered little in the way of original insight. It would be like me replying to you and saying:
"Maybe rather: teams should use hwat works best for their needs per project?"
What does that really add?
Basically, it added little to the conversation. Heck, your edit complaining about the down votes is longer than your comment.
You said it better than me. No reason to get religious over it. When I'm the architect on a project, I do it in some cases, and other times I don't. I'm not over concerned what everyone else is doing in this regard.
> That file is going to be a plain old Ruby class: not a Rails model, controller, etc. Isolating those would lead to pain, as anyone who's tried to do it knows. I keep models very simple and allow them to integrate with the database; I keep controllers very simple and generally don't unit test them at all, although I do integration test them.
This is what I was thinking of when I read DHH's post. Mock objects and indirection and all that certainly add complexity, but if your "business logic" is all in pure functions, or at least self-contained source files that don't do IO (and don't use "frameworks"), you don't need mocks to test it.
Better to keep your glue code simple and do integration tests, while unit testing your actual logic, than to shoehorn extra indirection for mocking into complicated glue code.
I gather that DHH complained because he had been doing the latter, and found that it sucked.
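Concretely, the former approach looks something like this invented sketch: the logic is a plain object you can unit test with no doubles at all, and the glue stays thin enough to leave to integration tests.

    require "minitest/autorun"

    # Pure "business logic": no IO, no framework, no mocks needed to test it.
    class BulkDiscount
      RATE = 0.1

      def self.apply(subtotal, item_count)
        item_count >= 10 ? subtotal * (1 - RATE) : subtotal
      end
    end

    # The glue (e.g. a controller action) stays dumb and is covered by an
    # integration test instead:
    #
    #   total = BulkDiscount.apply(cart.subtotal, cart.item_count)

    class BulkDiscountTest < Minitest::Test
      def test_discount_kicks_in_at_ten_items
        assert_in_delta 90.0, BulkDiscount.apply(100.0, 10)
        assert_in_delta 100.0, BulkDiscount.apply(100.0, 9)
      end
    end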
Is it just me, or are there other people who get a stomach ache reading this sort of stuff? Why not just learn algorithms and other computer science, the "direct" approach of analysing the problem and solving it in the most efficient way? 100 lines of tests for 50 lines of code for a _catalog_? Really? I'm sorry, just reading this brings bad feelings, like those which traders describe when they feel something's wrong with the market...
Why not just learn algorithms and other computer science, the "direct" approach of analysing the problem and solving it in the most efficient way?
This doesn't really say anything; your statement amounts to "Why not do it properly, instead of using tests?" - it should be obvious why that's invalid.
In particular, unit testing protects against "minor changes with unexpected consequences." This happens all of the time. "analysing problem and solving it" will not make a bit of difference. Do you expect that people who use unit tests are instead "ignoring the problem and not solving it?"
A test ratio of 2:1 isn't a big deal. It serves as a record: "Here's a specification of what my code does. You can automatically verify that it does what it says." Think of it as part spec, part test.
> Why not just learn algorithms and other computer science, the "direct" approach of analysing the problem and solving it in the most efficient way?
---
The problem often isn't clearly defined. To a certain extent programming is an exploration of the problem until you're in a position to go 'And so...' Add to that that your knowledge of the context, and of the lower-level abstraction layer your algorithm corresponds to, is imperfect. (Imagine doing maths where a malicious demon sometimes changes the contents of variables on you according to a set of rules that you don't know.)
As such, any algorithm that someone wrote while trying to express the problem and provide a solution may or may not do what they expect it to when given in any particular language on any particular machine.
There are a couple of ways to try to address some of the difficulties with that. One is to write code at such a low level that you can be sure of the blocks you're using. Another is to try to climb the abstraction layers using a language with certain safeties built in to try to limit the context you have to be aware of.
But there are drawbacks with both of those methods, and testing is a way to deal with the inevitable oversights involved. (It's also good in getting people to more tightly define the problem in the first place for you. Acceptance tests are a wonderful thing sometimes.)
> "Second, that sentence is false. Isolation from the database, or anything else, is generally done with mocks, but mocks didn't even exist when TDD was rediscovered by Kent Beck in 1994-1995."
See, that's not how I remember it. Stubs implementing the Interfaces you wrote (because Composition over Inheritance) was the common solution. AR's idea of a Unit Test has always been wrong to my mind. They aren't Unit tests. They're Integration tests.
This only really became an issue with the rise of Rails and the fact that you couldn't really write Unit tests for an ActiveRecord.
Considering Gary provides citations for his timeline, I'm inclined to believe his account over your memory.
That said, I think you're confusing _unit_ tests and _TDD_. Unit tests imply that you test a single unit in isolation; TDD implies that you write your tests before you write your implementation. You can TDD unit tests, but you don't have to, and you can do all-acceptance TDD if you want.
Exactly. I think most people probably come to TDD through tutorials which test something like the addition of a+b and demonstrate "ah, there's a logical error! how incredible that TDD helps us identify errors early on!". I do some TDD from time to time, but outside of textbook algorithms, writing pure unit tests that involve mocks is really tough and time-consuming (unless someone has built a test driver for you to use throughout the whole development).
In any case, if you ever do curl http://localhost:port/mynewapp/page1, that's already testing, and functional testing at that. If you write that down before you start writing a piece of code, you are doing TDD, testing whether your app will return a 200 or not.
There's a big swath of time between the rise of the xUnits and the modern Mock fancy he doesn't cover (or provide citations for).
Considering the early TDD mantra was red-green-refactor, saying you could do it with all Acceptance tests is a bit of a stretch. I mean tomato, tomahto, I guess, but that's certainly not a definition I've ever come across (and it would certainly run afoul of Uncle Bob's 3 rules of TDD).
Try looking for how many times "design" is mentioned on the page, and in what context. That's how TDD was sold to me. Is your component difficult to test in isolation? Then the design is flawed. One of the big advantages of TDD early on (and it seems to hold true today) is that it helps you write testable code (in Units).
BTW, I didn't suggest XP and TDD were the same. XP2000 was a conference. You can believe me or not I guess, but back in the day all you had was your xUnit. Selenium was a twinkle in someone's eye and local databases were not the norm. It was definitely all about the Units (and I think if you read the material of the time the emphasis will be pretty obvious).
TDD and XP are not the same thing, either. I don't have my copy of "Test-Driven Development by Example" handy, so I'll just have to point you to http://en.wikipedia.org/wiki/Test-driven_development , which doesn't use unit tests in its definition.
Now, it's not that there's no connection at all: unit tests are _incredibly_ helpful when doing TDD, because you want your tests to run quickly. But it's "Test-first," not "Unit-test first." When you're doing TDD, you don't write acceptance tests second, you write both your acceptance and unit tests first.
I think there is a good point here, in that a very tight TDD loop requires fast tests. Nothing's untrue about that, and if you've worked with a set of speedy tests I'm sure you know it can be quite efficient, and indeed it can be quite a different way of working.
Ultimately I don't really think that this is particularly in conflict with what DHH was talking about. A tight TDD loop might work for some developers, but if you get to the stage where you are making substantial architectural changes to enable this, then it looks like a problem. And I've seen a lot of this first hand.
FWIW, I take your point that things like Spring aren't ideal solutions. But with a bit of care they can help offer the best of both worlds; I've got Guard and Spring working together on the Rails project I currently have open. Looking at an equivalent model (100 LOC, 200 test LOC), it runs without database isolation in less than 400ms, which is less time than it takes me to switch to my terminal. It's really great, and I don't have to worry about isolation.
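For the curious, the wiring is roughly this (assuming the guard-rspec and spring-commands-rspec gems, with a Spring-backed bin/rspec binstub):

    # Guardfile (sketch): rerun the matching spec through Spring's preloaded app
    # whenever a model or spec file changes.
    guard :rspec, cmd: "bin/rspec" do
      watch(%r{^app/models/(.+)\.rb$}) { |m| "spec/models/#{m[1]}_spec.rb" }
      watch(%r{^spec/.+_spec\.rb$})
    end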
TLDR; there are compromise solutions possible. Maybe not suited for everyone, but in some cases better than going too far with "design for testability."
The critical flaw in this post is that Gary is not refuting what DHH said.
Gary makes this claim:
> You finally get to see what's really going on. David's tests run in a few minutes, and he's fine with that.
> I'm not fine with that. A lot of other people are not fine with that.
But what DHH actually said is this:
> You might think, well, that's pretty fast for a whole suite, but I still wouldn't want to wait 80 seconds every time I make a single change to my model, and want to test that. Of course not! Why on earth would you run your entire test harness for every single line change in a particular model? If you have so little confidence in the locality of your changes, the tests are indeed telling you that the system has overly high coupling.
and this:
> These days I can run the entire test suite for our Person model — 52 cases, 111 assertions — in just under 4 seconds from start to finish. Plenty fast enough for a great feedback cycle!
Using a workflow like Gary's, there's an argument to be made that 4 seconds is not acceptable, and this is why we want single files that can run in a few milliseconds.
However, that's not the only possible way of running tests, and the difference between 4 seconds and 300ms for the feedback you're actually interested in is massively different from the difference between 300ms and "a few minutes".
For a post that calls DHH out on a strawman, this is in itself a great example of one.
Yes, I focus on my per-file runtime in the post, and I mention David's suite runtime in one sentence at the beginning. They are not meant to be compared. David's file runtime is four seconds. This is unacceptable to me. This is unacceptable to other people who replied to your tweets. This would double the length of my high-speed TDD loop, which would make those portions of my TDD process take twice as long.
Yes, it would've been clearer for me to specifically address both suite runtimes and both unit runtimes. You know what else would've been clearer? All of the 2,000 words or so that I deleted from that post while I was editing it down into its final form. This is just how writing works. I don't think that it's misleading as written.
Of course, I've already told you, on Twitter, exactly my reasons for rejecting both four-minute suites and four-second test files. They're not in the post, but you know the reasons. You know that I wasn't selectively attacking a subset of his argument, because you know that I do have an answer for test file runtime. And yet, for some reason, here we are!
(For anyone reading this later, the tweets in question are gone. Lately I've been deleting all replies, as well as trivial non-replies, for Reasons.)
On the one hand this article does at least provide citations for some of the timelines involved. On the other hand it makes me throw my hands up when I see some of the examples provided. 100 lines of code to test 50 lines of code which implement a catalog is, it seems, pretty par for the course in TDD advocacy. This could honestly be me venting my current frustrations, but I get tired of this type of advocacy, also often seen in the functional world, of proving how great something is by using the most trivial possible example, in a domain squarely in your technique's wheelhouse. 100 lines of code runs in 0.24s? What is this, I don't even. Your functional algorithm is incredibly elegant at implementing the fibonacci sequence? Awesome.
But are these things honestly the bulk of what people do? Is testing that you can put objects in a catalog, access them, and remove them most of what people do and need assurances on? Am I the only person who has spent most of their career working on software with codebases measured in the millions of lines, with significant user interfaces as well as significant technical domain knowledge embedded in them? I know that TDD says if our objects have many collaborators we are "doing it wrong" but honestly how far do you break down something without it blowing up into a million classes and a ton of code? How do you get around the fact that a button press can set off a numerical simulation (actually these are super testable and I support that), several trips to the database, multiple changes to the user interface, all in the face of dozens of constraints at all levels of the program to ultimately end up with the graphical representation of chemical pressure that the user wants? Especially when every moving piece in that calculation is supposed to be tested apparently in independent units?
I would estimate that the bulk of code that I have experienced in my career relied on multiple other objects to be effective. Do I just mock these out and pay the price of keeping those mocks in sync with the classes they are imitating? Do I then have my test be tightly coupled to the internals of the function being tested, making sure this mock is called in this way, this many times? Do I rewrite things such that every function takes its collaborators as arguments and returns the results I am looking for? What happens when each of those collaborators is itself a giant series of other stateful objects or at least provides access to a highly complex piece of state which is a pain in the ass to setup?
I ask these questions seriously, because on the one hand TDD advocates keep telling me I am not professional if I don't follow their practices, but on the other the details of how to do so in the face of real-world constraints (not 50 line data-containers) are always somehow left out of the discussion. I want to believe, I do, it just gets hard to sometimes.
I didn't say "I have a 50 line example, therefore it works in all cases". Destroy All Software is 2,208 lines of production Ruby code, and most of it is tested in exactly that way. I could've showed you charge_purchase_spec, download_policy_spec, etc. The point is that most of the application is decomposed into these little 50 line pieces that can be thought of and tested all by themselves. You look at this and see a trivial example, but the whole idea is that large applications can be built out of lots of trivial little examples and it really, actually works.
You're also misinterpreting what the catalog is. You see one word, "catalog", and decide that all it does is "put objects in a catalog, access them, and remove them". No.
The DAS Catalog class enforces a few integrity guarantees, like "there will be no gaps in serial numbers" and "titles won't be duplicated". Then it provides several querying mechanisms, like finding seasons by slug, finding episodes by slug, or finding the season for a given episode. It also has some aggregation behavior like "give me a list of all screencast titles".
None of those Catalog behaviors is very complex, and all of the tests are simple. That's the point! Most software is made up of fairly simple things like this: aggregate some stuff; find some stuff; check a compound property of something; make a few decisions. These things are easily tested in isolation. For the rest, there's integration testing, or even plain old exploratory testing. But that rest is not nearly as large as it seems to be naively.
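To give the flavour, here's a heavily trimmed sketch (not the actual DAS source; names and details simplified), one guarantee and one query:

    require "minitest/autorun"

    class Catalog
      DuplicateTitle = Class.new(StandardError)

      def initialize(screencasts = [])
        titles = screencasts.map(&:title)
        raise DuplicateTitle if titles.uniq.length != titles.length
        @screencasts = screencasts
      end

      def titles
        @screencasts.map(&:title)
      end

      def find_by_slug(slug)
        @screencasts.find { |screencast| screencast.slug == slug }
      end
    end

    class CatalogTest < Minitest::Test
      Screencast = Struct.new(:title, :slug)

      def test_rejects_duplicate_titles
        assert_raises(Catalog::DuplicateTitle) do
          Catalog.new([Screencast.new("Boundaries", "a"), Screencast.new("Boundaries", "b")])
        end
      end

      def test_finds_by_slug
        screencast = Screencast.new("Boundaries", "boundaries")
        assert_equal screencast, Catalog.new([screencast]).find_by_slug("boundaries")
      end
    end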
The last three paragraphs of your comment misunderstand what TDD is and what it provides you. I don't think that you're "not a professional" if you don't do TDD. I find value in it for a lot of code; so do many other people. Like the article says (you read it, right?), I only do TDD about 75% of the time in web apps, and more like 50% outside of web apps. As is also mentioned in the article (you did REALLY read it, right?), the best way to learn TDD's limitations is to do it 100% of the time, which has the unfortunate property of creating some zealots who haven't yet found a balance.
Hi Gary, thank you for the reply. Let me first assure you that I did read your article. I know it's just the word of some guy on the internet but a while back I got so fed up with dumb comments that I promised myself I would never comment on something without having first read it in an honest fashion.
Next, I would say that your comment is still somewhat frustrating to me. My tone may have been overly combative, but I was legitimately asking questions in the hope of getting an answer or at least pointed to one. So I hope you can understand my frustration when you dismiss three paragraphs of my writing as misunderstanding what TDD provides. That may be true, but if you're going to spend time typing those words I feel like you could at least type a few more either taking a stand on what you believe TDD to be or pointing out some resources that I could use to educate myself. As it is you've told me I'm wrong but not how or why.
As for most software being composed of stuff that is simple in the aggregate, I again disagree but this may be my ignorance. Again, most of my experience in software has come on large systems where each object has several collaborators, each of which may also have several collaborators, recursively continuing out. The two options I see for TDD are either to spend considerable effort setting up these collaborators, or to mock them out. If I mock them out, I am coupling myself to the internals of how the method is implemented, because I have to give explicit implementations of each method called in the course of the method doing its job. If the method changes, the mock has to change, and the test breaks, even if the surface behavior of the object may not change.
I may be missing something here, but this is the reality of software as I experience it. Answers that say any of the following do not help me, and honestly I consider any of these evidence that TDD is not the cure-all it is made out to be:
1) You're doing it wrong, where wrong is any case TDD has failed, essentially making it unfalsifiable
2) Your system is poorly written and doesn't work for TDD, essentially condemning the vast pool of legacy that makes the world spin to be without tests, or refactored/rewritten without adding user value
3) Your understanding is incomplete, without ever specifying what it would take to have a complete understanding or how to go about gaining this, with a similar symptom as case 1.
I appreciate that you take a more balanced approach to TDD advocacy. I should do a better job of separating you from people like "Uncle Bob" from whom I took the direct quote about not being a professional. I have also dealt with several TDD advocates throughout my career whose responses to questioning TDD have fallen into the previous 3 categories. Because of this I sometimes find it hard to separate more reasoned TDD advocacy from the former, and so my apologies if I misinterpreted the strength of your message.
The first paragraph of the Wikipedia article on TDD is a correct definition. Write a minimal test; see it fail; make it pass with a minimal code change; refactor to improve the design, keeping the tests passing. There are things that people do alongside TDD; there are common ways to perform TDD. But the red-green-refactor cycle is what "TDD" means.
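In code, one turn of the loop can be as small as this trivial, invented example:

    require "minitest/autorun"

    # RED: write a minimal failing test first.
    class SlugTest < Minitest::Test
      def test_downcases_and_dasherizes
        assert_equal "hello-world", Slug.for("Hello World")
      end
    end

    # GREEN: the smallest change that makes it pass.
    class Slug
      def self.for(title)
        title.downcase.gsub(" ", "-")
      end
    end

    # REFACTOR: with the test green, rename/extract/clean up freely,
    # rerunning the test after each small step.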
I don't know how many you mean by "several collaborators" in your composition paragraph. I'll assume 8, because that's enough collaborators that TDD will really start to hurt.
All software (not even most, but all) is composed of aggregations of trivially simple components. At the machine level, everything is built from instructions; in a fully OO language, everything is built from method calls; in some functional languages, everything is built from unary functions.
Systems are built by aggregating the trivially simple primitives provided by the environment. We have full control of that aggregation process; it has no mind of its own. We can choose to aggregate four things at a time or eight things at a time. That's not a big difference. It's the difference between 8 and 4 + 4. Same result, different decomposition.
The "4 + 4" analogy is not a straw man: splitting an eight-object interaction into two four-object interactions aggregated together is a well-defined operation on the syntax tree. IDEs even automate it, and have been for a decade or more; this is what an automated "extract method" refactoring is. Doing it well requires years of practice, but all software is made of aggregations of trivially simple components, and we have full control of the aggregation.
Mocking is not required for TDD. It's often used to do isolated unit testing, which may be done within a TDD loop. But even isolated testing can be done without mocks. I did a talk called "Boundaries" about that topic. https://www.destroyallsoftware.com/talks/boundaries
So yes, you are missing something here: first, most people doing TDD aren't mocking, or are mocking very rarely. Second, mocking is not required for isolation; you can also isolate by structuring your software in certain ways, which is what Boundaries is about.
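The shape of it, very roughly (an invented example, not the code from the talk): the decision is a pure function over plain values, the shell does the IO, and the decision's tests need no mocks.

    require "minitest/autorun"

    # Functional core: takes values, returns a value describing what to do.
    module DunningPolicy
      Reminder = Struct.new(:account_id, :severity)

      def self.call(account_id, days_overdue)
        return nil if days_overdue <= 0
        Reminder.new(account_id, days_overdue > 30 ? :final : :gentle)
      end
    end

    # Imperative shell: the only place that touches mail/DB; covered by a few
    # integration tests rather than mocks.
    #
    #   reminder = DunningPolicy.call(account.id, account.days_overdue)
    #   Mailer.send_reminder(reminder) if reminder

    class DunningPolicyTest < Minitest::Test
      def test_escalates_after_thirty_days
        assert_equal :final, DunningPolicy.call(1, 31).severity
      end

      def test_no_reminder_when_current
        assert_nil DunningPolicy.call(1, 0)
      end
    end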
Addressing each of the three responses you anticipate:
1) "Falsifiability": Nothing in software practices is falsifiable in practice. I've never seen a single piece of experimental literature that I considered sound. There's not even experimental evidence saying that "structured programming is better than willy-nilly GOTOs". And I've never even heard of a meta-analysis or an experiment being reproduced several times independently.
Sometimes you really are doing it wrong. If you try Haskell, can't figure out how to write to a file, and throw up your hands, that doesn't mean that Haskell "has failed" or "is unfalsifiable"; it means you don't know how to use it. Haskell and TDD are both particularly difficult to get your head around at first. Maybe you don't want to spend the effort. That's totally fine. This is why I don't actually know Haskell well.
2) "Your system is poorly written." This is a hugely subjective claim and you have to treat it as such. You have to silently append "by my standards of design" to the end of it. You also have to realize that everyone's standards of design are informed by the practices that they've used while doing the design.
If you primarily work on systems composed of functions with, say, ten collaborators (meaning a total of ten arguments and referenced globals/functions/etc.), then yes, I will say "your system is poorly designed". It doesn't mean that I think that you're a bad person; it does mean that I won't work with you to continue building your system in that way. Doing isolated unit testing on that system will be very difficult. Doing integrated unit testing, with or without TDD, will be less difficult. If we crank the collaborators up to 20 or 30, all programming tasks will be difficult, testing or not.
In the second part of your issue (2), you conflate TDD with testing, which is not correct. TDD is a loop of actions that produces tests. There are many other ways to produce tests.
3) "Your understanding is incomplete." If want to understand these ideas, read "TDD By Example" by Beck to learn about the TDD process, then "Growing Object-Oriented Software Guided by Tests" by Freeman and Price to learn about TDD design feedback and the careful use of mocks. Yes, it'll take time. (But less time than learning Haskell.)
If you want to understand what I mean about isolation without mocks, watch my "Boundaries" talk (I'd recommend doing that after reading the two books above). If you want to see live examples of TDD, and the trade-offs inherent in TDD being made and discussed, watch my Destroy All Software screencasts.
However, if you don't want to do the work to tease these ideas apart, then I think that you should acknowledge that you're not willing to put the effort in. This is a different path than saying "people haven't said the right things to me for me to believe it works", which is the vibe I get right now. I learned TDD by doing it, incorrectly, and painfully, over and over again. You have the advantage of being able to read a couple of books to jump past my first year of learning. That's a huge efficiency gain, but the process can't be compacted into a comment on Hacker News that transmits the better part of a decade of experience.
(Finally, somewhat tangentially, I recommend that all programmers disabuse themselves of any belief that we have experimental evidence about programming practices by reading "The Leprechauns of Software Engineering".)
You bring up some very valid concerns. I'm an advocate of TDD, but I'd love to hear some responses from TDD advocates on the disconnect you see between the TDD ideal and the reality you experience, preferably in a less defensive and accusatory tone than Gary's reply to you.
As a start, I recommend reading "Growing Object-Oriented Software, Guided by Tests". I am still not a 100% convert but I did get many of the same questions answered.
> TDD advocates keep telling me I am not professional
Well it's nice that they hold the keys to the kingdom ;) ! In your darkest hours, remember this powerful mantra: "There is no silver bullet", and proceed to systematically take down glib generalizations.
I have great respect for Gary Bernhardt. I think he is missing DHH's point here, which, as I understand it, is that:
tests can become an end in themselves
I think this is a reasonable argument on DHH's part: that we want to spend more time writing actual code, instead of going through the motions of TDD. That doesn't mean TDD is not valuable.
> Classical TDD does not involve mocking or other forms of synthetic isolation by definition. We even use the term "classical TDD" to mean "TDD without isolation".
Well, like I say in the post, mocks didn't exist back then, so they couldn't have been mocking in the sense that we are now. I wasn't there, but I believe it's true that in some cases they were doing what we'd now consider "fakes", which are simplified replacements for production dependencies that have minimal working implementations. (An in-memory database is an example.)
Fakes get you around the question of database integration, but not the question of integration in general. You'd have to create a fake of every class in the system for that, in which case you probably just re-invented mocks and/or stubs.
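For concreteness, a fake in that sense is just something like this (names invented): a minimal working implementation with real behavior, rather than a per-test script of expectations.

    # An in-memory fake: simplified, but it actually behaves like a store.
    class InMemoryUserStore
      def initialize
        @rows = {}
        @next_id = 0
      end

      def save(attrs)
        id = (@next_id += 1)
        @rows[id] = attrs.merge(id: id)
      end

      def find(id)
        @rows.fetch(id)
      end

      def find_by_email(email)
        @rows.values.find { |row| row[:email] == email }
      end
    end

    # Tests exercise code against InMemoryUserStore instead of the SQL-backed
    # store; a small contract suite run against both keeps them in agreement.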
I probably could've more explicitly called out the fact that DHH is obsessing over database integration when it's just a special case of what isolated testers are actually worried about, which is integration in general. It was already a long post, though.
Thanks for the reply and for the article. I took your quote out of context. I'm not very interested in DHH's mis/understanding of TDD. I'm more interested in my understanding of TDD. I'm trying to figure out if I'm doing TDD correctly or incorrectly when I isolate the database. Whether it's done through a mock or a fake or whatever is not that important to me.
Sorry for hijacking the topic. I've been trying TDD for years, and I never know if I'm doing it correctly. I saw this line and it made me question it yet again.
My main response is that there's no "right" way to do TDD. There is a core definition, which is the red/green/refactor loop. New tests must fail; all tests must be green to refactor. Almost everything else is someone's interpretation. And, honestly, after enough time doing it you'll start taking careful shortcuts through the loop. (Am I allowed to say that in public? ;)
Pairing in person with someone who's done it for a long time can help a lot, but even then you're getting someone's interpretation. If you worked with the DHH of three years ago, you'd get a very different interpretation (slower cycles, integration tests, no isolation) than if you worked with me (faster cycles, integration/unit mix, biased toward isolated unit tests). We were both doing TDD, though! (Given that he's said that he used to do TDD, I assume that he was following the core red/green/refactor loop.)
Of course, creating mocks or circumventing the database may create more bugs (or hide more) than just waiting the couple of minutes for the real thing.
Adding another failure point is exactly what the name says: another possibility of error or bug.
Sure, we can talk about the "one true way" of doing tests, how to make hundreds of tests run in a short time, etc. (the fact that TDD insists on creating one test for each tiny thing goes against it, btw), but yeah, I prefer spending more time solving the problem, and not testing around it.
TDD is a way of writing tests, not a prescription about when to write them. I explicitly say in the post that I only do TDD 75% of the time for web apps and more like 50% for other code.
This comports quite well with my personal experience. In late 1999 I happened to pick up XP Explained and set about to try this testing thing. At the time, there wasn't any "Unit Tests == Fast Tests" thing. What we did on our project was have two different types of tests: "Fast" tests and "SlowAndExpensive" tests, which we ran separately. But there wasn't any fundamental distinction between them. That came a lot later.
Fast tests are awesome, but hard to achieve - at least for me. TDD advocates: prove DHH wrong with easy to adopt frameworks and working software, not blog posts or books.
TDD advocates: prove DHH wrong with easy to adopt frameworks and working software, not blog posts or books.
Gary Bernhardt's about as credible on this as one can be, as he's literally recorded hours of himself building working software with TDD.
From the article:
These tests are fast enough that I can hit enter (my test-running keystroke) and have a response before I have time to think. It means that the flow of my thoughts never breaks. If you've watched Destroy All Software screencasts, you know that I'll sometimes run tests ten times per minute. All of my screencasts are recorded live after doing many takes to smooth the presentation out, so you're seeing my actual speed, not the result of editing.
I understand disagreeing, but responding to this post with "show, not tell" just makes it look like you haven't read the article.
Humorously, I already did create a software tool that does isolation automatically. It's called Dingus and it's five years old: https://github.com/garybernhardt/dingus.
It was a terrible idea. It only worked well if you applied the same discipline that you would've had to apply when doing it manually, but it lured you into a false sense of security when using it sloppily.
(Dingus is fine when used as a standard mock/stub/spy library, but the automatic isolation features are dangerous and are documented as such in the README.)
RSpec has been one of the easiest tools I've ever learned to use. Their documentation is beautiful (and their demo code is actually written as tests), and it's pretty easy to dive right in to writing tests.
Isn't this just a reflexive and ignorant response? Have you actually looked at any of the code he's shipped? I mean certainly people should be allowed to write about their experiences without having to face accusations that they've never actually done anything but write.
I'd believe the same endorsements from someone who makes their living primarily off working code, rather than selling edu. material promoting test-heavy coding.
That said, rapid "all's still well" feedback is awesome when you can get it.
> "I want my feedback to be so fast that I can't think before it shows up. If I can think, then I'll sometimes lose attention, and I don't want to lose attention."
Don't you want to think while programming? I feel like that's practically all I do -- I spend most of my time thinking about a problem and very little actually writing code.
Ironic that your reply to a piece about strawmen is nothing but a strawman. Of course he doesn't mean what you're saying; he's talking about context switching. If I have to wait 15 seconds for my tests to run, I start doing something else and my thought process around the problem is gone.
If you move on to the next problem, and then your test finishes and forces you to move back to the original problem, you've just had to do two context switches. If you never had to wait for the test result, you could have eliminated both.
No, I really don't want to think while coding 90% of the code I write. I listen to audiobooks and podcasts while coding to keep my mind occupied while I code. I want instant feedback, else I'm going to quickly check reddit or HN, and then I've lost 15 minutes to a 45 second test.
If you'd ever seen Gary code, you'd know that he is a very quick thinker (almost nonhuman-like) who requires a very fast feedback loop to keep up the pace.
That's stupid. It's an interesting discussion about software best-practices. A bunch of well-respected people are writing about what they do, what they see as the upsides and downsides of both approaches, and arguing for why their approach is best.
This is content which is appropriate to the audience, reasonably interesting, and technically useful.
Yes, this discussion of industry approaches to our craft is taking up room for far more important things, like the discussion of real estate prices in San Francisco.
While it may look to be that simple, the greater benefit of having these discussions out in the open amongst the various developer communities should not be overlooked. There are a lot of developers out there that will take what DHH says for "the golden rule" without critically thinking about the content of his assertions. (Of course this is true for a lot of vocal developers, including Gary). When we have folks like Uncle Bob and Gary Bernhardt challenging those assertions, all of us can benefit. And for those people who are not used to, or don't have the experience in the community of speaking up and challenging ideas, they can learn that it's not OK just to take someone's opinion on software development as the truth. If you don't like the conversations, ignore them and read something else.
No, you can't. If you flag articles just because you are tired of seeing some topic on the front page, your flagging abilities will be removed (it happened to me).
It actually feels like a bunch of experts in our field having an intellectual debate about an important aspect of the profession in public. If that's not relevant, I don't know what is.