
I like to think I'm a pretty good architect - my team respects me, I solve a lot of problems that they feel like they can't, I read lists like this and if I don't follow all the advice, at least most of the advice isn't surprising.

But there's one thing that at this point makes me feel like I'm taking crazy pills because it seems like I disagree with just about everyone. I think unit tests should be first priority, ahead of integration tests and selenium-ish tests (whether you call them frontend tests, functional tests, whatever).

I have experience with all three approaches, but slowly moving from backend to full-stack and frontend, I'm struck by how many people - it almost seems a consensus at this point - argue that integration tests should be your first line of defense. I've long since given up the battle on this; I'm not directly on the frontend teams, so I can't seagull my way in there and contradict the lead.

But I think the point that gets lost on people is that the value of unit tests isn't chiefly the output of running the test suite. It's that the process of writing good unit tests forces you to write well-structured code. Code that is well-structured enough that it actually minimizes your need for integration and selenium-ish tests. But I feel like in the industry (not just my employer), I'm surrounded by people who are comfortable with writing integration tests and calling it good, even with spaghetti-code implementations underneath.

I've always suspected that the ideal testing setup would be mostly unit tests, integration tests only to fill those gaps, and selenium-ish tests (hopefully rare!) only to fill in the final remaining gaps. But I think the system dynamics are set up such that unit tests are rare (due to wanting to finish the Jira story when the feature starts to work), integration tests are frequent since you can happy-path across a large surface area, and large expensive suites of selenium-ish tests are written by a completely different department since you can write them without having to understand the codebase. It just seems like a recipe for poor overall quality and a lot of wasted work/time/money.

But I'm clearly in the minority. Maybe I'm missing something basic that I've somehow never learned in my long career.



What you're missing is that most people in web dev use a full-featured framework like Django or Rails and don't need unit tests nearly as much as integration tests, because very, very, very often most of the architectural decisions have been made for them, and the framework linking it all together is where mismatches between expectations seep in.

"Wups, I pluralized this thing that should have been singular in the routes / urls file"

Also, when developing APIs for the front end it's pretty unlikely that I need to test Rabbit.permanent_url on its own and much more likely I need to test things like listing all the rabbits for a given rabbit farm that are candidates for sale in the local meat market.

Where exactly should this test go? The framework handles all the magic SQL generation and the frontend folks really only care about input -> output.

If you're building everything from the ground up, then of course there will be way more unit tests, but with an established framework you don't have to test everything. You trust the framework to get some things right for you.
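
For concreteness, the kind of test I mean looks roughly like this (a sketch only: the endpoint, query params and response shape are made up, and it's written as TypeScript/vitest against a running dev server since that's the stack elsewhere in this thread; a Django/Rails test client gives you the same input -> output check in-process):

    // Hypothetical endpoint-level test: assumes a dev server on localhost:3000
    // exposing GET /farms/:farmId/rabbits?for_sale=true&market=local-meat.
    import { describe, it, expect } from "vitest";

    describe("GET /farms/:farmId/rabbits", () => {
      it("lists only rabbits that are candidates for sale at the local meat market", async () => {
        const res = await fetch(
          "http://localhost:3000/farms/42/rabbits?for_sale=true&market=local-meat"
        );
        expect(res.status).toBe(200);

        const rabbits: Array<{ id: number; forSale: boolean; market: string }> =
          await res.json();

        // Input -> output is all the frontend cares about; the SQL is the framework's job.
        expect(rabbits.length).toBeGreaterThan(0);
        for (const rabbit of rabbits) {
          expect(rabbit.forSale).toBe(true);
          expect(rabbit.market).toBe("local-meat");
        }
      });
    });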


There are a couple of reasons I personally don't find front-end unit tests valuable:

1. (In my experience) client code is mostly integration: it's integrating local user interactions and remote data APIs into a stable experience. It's rare that bugs come from an idempotent function with a clear I/O that can be unit tested — it's much more likely that bugs come from something like an unexpected API response or a complex combination of user states.

2. TypeScript. Static typing obviates a good chunk of the low-hanging unit tests. And it addresses your point here:

> But I think the point that gets lost on people is that the value of unit tests isn't chiefly the output of running the test suite. It's that the process of writing good unit tests forces you to write well-structured code.

Strict TypeScript (+ ESLint) also does wonders to encourage well-structured code, such as making it hard to have a mystery object passed around your app, collecting new properties as it goes. That mystery object would need a type definition, and would be easier to deal with as a series of discrete states instead of an amalgamation of mutations. Types encourage clear structures and interfaces for your code.
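
A minimal sketch of what I mean (the names are made up): model the thing as a discriminated union instead of a bag of optional fields, and the compiler keeps each state honest:

    // Instead of one "mystery object" collecting optional properties as it
    // flows through the app, each state is spelled out explicitly.
    type FarmQuery =
      | { status: "idle" }
      | { status: "loading"; farmId: number }
      | { status: "loaded"; farmId: number; rabbits: string[] }
      | { status: "error"; farmId: number; message: string };

    function render(query: FarmQuery): string {
      switch (query.status) {
        case "idle":
          return "Pick a farm";
        case "loading":
          return `Loading farm ${query.farmId}...`;
        case "loaded":
          return query.rabbits.join(", ");
        case "error":
          // The compiler only lets you touch `message` in this branch.
          return `Failed to load farm ${query.farmId}: ${query.message}`;
      }
    }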

With all that, I'd rather focus my time on type safety + integration tests.


I'm generally a proponent of "why not both?" when it comes to types and unit tests. At least with our codebase (nextjs, typescript strict mode, eslint), there is still a ton of room for improvement.


This is such an important line of thinking. If it has a positive ROI, then do them both. (The greater the availability of capital, the more this holds true.)


I agree, but I'd disagree with your reasoning on why unit tests deliver better bang for the time spent. In my experience, integration tests are very fragile, and something like selenium is like testing egg shells by dancing on them. Sure, like all tests they improve reliability, but the amount of effort required to maintain them is enormous compared to unit tests. Given that unit tests generally are at least the same amount of code again, that's one hell of a whack.

Re "It's that the process of writing good unit tests forces you to write well-structured code": it doesn't ring true to me. I've see a lot of beautifully structured code that doesn't have a lot of tests. Much of the Linux kernel is like that. But what is true is it you are forced to write tests, you forced to write code that's testable. As anyone who's tried to write unit tests after the fact will testify, the difference between code that's designed to be testable and code that wasn't written with that in mind is so dramatic, it's almost never worth the effort to retro fit unit tests unless you are doing a major refactor. That's because you have to refactor it to get decent coverage.

Which brings us to the 100% code coverage thing mentioned in the article. The benefit of insisting on 100% code coverage isn't that 100% of the code is tested. It's that 100% of the code can be tested. What less than 100% means is that at some point the programmer gave up making his code testable.

But maybe I'm wrong about the benefits of 100% code testing for reliability. SQLite's report on the difference achieving DO-178B made to bug report levels was an eye-opener. Still, they say that to achieve DO-178B, the size of the unit test code went from 1-2 times the original code base to 7 times. Again, that's a _lot_ of overhead. But maybe that's what we actually need to be at.


I remember pushing to get a project to 100% code coverage a few years ago - getting that last 1% was tough but it revealed a bug in a previously uncovered catch block - it was doing something that would have thrown an exception without logging the cause of the original failure.
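
For anyone who hasn't hit this, it was the kind of thing you can sketch like this (a reconstruction with made-up names, not the actual code):

    // Stand-in for whatever the real code was calling.
    async function writeToStore(data: unknown): Promise<void> {
      if (data == null) throw new Error("store rejected null payload");
    }

    // The uncovered catch block: it rethrows but drops the original error,
    // so the logs never say what actually failed.
    async function saveReportBuggy(data: unknown): Promise<void> {
      try {
        await writeToStore(data);
      } catch {
        throw new Error("could not save report"); // cause is lost here
      }
    }

    // Forcing a test through the branch makes the fix obvious:
    async function saveReport(data: unknown): Promise<void> {
      try {
        await writeToStore(data);
      } catch (err) {
        console.error("could not save report", err);
        throw new Error("could not save report", { cause: err });
      }
    }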


in my experience the hard parts in software are remote interactions, i.e. behavior that localized unit testing has a hard time capturing, and concurrency.

so the valuable tests, those which actually find issues, are the tests in near production environments under near production load.

now I'm not sure if you're describing just superficial testing at all levels, including the "integration" and "selenium-ish" levels?

did you ever measure code coverage across all tests?

my thinking is rather that you only need to consider unit tests for those paths that are not touched by your integration and system tests.

my aha moment was when SQLite, famous for their efforts in full MC/DC coverage, found uncomfortably many bugs through fuzzing. the SQLite team realized they hadn't added corner-case checks where it was hard to create a test case "because nobody provides such-and-such input anyhow".
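
the cheap version of that lesson in a TS codebase is property/fuzz testing, e.g. with fast-check (a sketch; parseQuantity is a made-up function under test):

    // Generated inputs reach the corner cases nobody hand-writes
    // "because nobody provides such-and-such input anyhow".
    import fc from "fast-check";

    function parseQuantity(input: string): number {
      const n = Number.parseInt(input, 10);
      if (Number.isNaN(n) || n < 0) throw new Error(`invalid quantity: ${input}`);
      return n;
    }

    fc.assert(
      fc.property(fc.string(), (input) => {
        // property: either a non-negative integer comes back, or it throws;
        // it never silently returns NaN or a negative number.
        try {
          const result = parseQuantity(input);
          return Number.isInteger(result) && result >= 0;
        } catch {
          return true;
        }
      })
    );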

so my take would be to measure the coverage in the testing of your organization, to look at what you find and to then decide as a team where you have business risk level "undercoverage".


Your ideal is what is commonly described as the "testing pyramid".

It's very well suited to "single artifact" code (a lib, an app with an easy way to drive interaction through a gui, etc...)

It tends to turn into a "testing sandglass" with time (because most of the value of integration tests is also derived from gui/e2e tests.)

Depending on the app, it might make sense.

Kent C Dodds (author of the poorly named "testing library") is embracing this sandglass shape by calling it the "testing trophy".

Honestly, to me the hardest deterrents to testing are:

* If it's not done from day 1, you end up with that one hard to test piece of code that makes every other piece of code hard to test

* Few people enjoy writing tests (I do, but I reckon I'm part of a minority.)

Do what helps you the most!!


I strongly agree. Most of this felt fine, except about testing.

I'm frequently tasked with writing reliable services, and I always 100% of the time start with tests. Are they perfect? Absolutely not, but I am able to write, test, maintain and iterate on at-scale critical services with great confidence; and 90% of that confidence can come from well structured tests. My code is generally also easy to refactor, understand, port to other languages, etc. Testing is such a critical part of that. I only test things "in staging" as a last check; and very rarely am I disappointed with the behavior (sometimes the speed or scale, but not behavior)


I just struggle to get value out of frontend unit tests when using TypeScript and React. Your main vulnerability is code receiving data in a shape/type it doesn't expect, and it's going to be very difficult for that to happen with React and TS. I still write tests for specific functions and complex regexes, but very rarely for a React component.
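
The tests I do still write look something like this (hypothetical helper and pattern, vitest):

    import { describe, it, expect } from "vitest";

    // Made-up helper: pull a SKU like "AB-1234" out of free-form user input.
    const SKU_PATTERN = /\b([A-Z]{2}-\d{4})\b/;

    function extractSku(text: string): string | null {
      const match = SKU_PATTERN.exec(text);
      return match ? match[1] : null;
    }

    describe("extractSku", () => {
      it("finds a SKU embedded in a sentence", () => {
        expect(extractSku("please restock AB-1234 soon")).toBe("AB-1234");
      });

      it("rejects near-misses the type system can't catch", () => {
        expect(extractSku("ab-1234")).toBeNull();   // lowercase
        expect(extractSku("ABC-1234")).toBeNull();  // too many letters
        expect(extractSku("AB-12345")).toBeNull();  // too many digits
      });
    });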


> I've always suspected that the ideal testing setup would be mostly unit tests, integration tests only to fill those gaps, and selenium-ish tests (hopefully rare!) only to fill in the final remaining gaps.

I would have thought this is pretty standard. You've just described the testing pyramid.


Unit tests are great for getting something done correctly the first time, but they aren't as helpful for avoiding regressions over time. New functionality gets lots of manual testing, so you can get away with not having automated tests that test the assembled unit (ahem) that you are responsible for delivering. In the short term, this continues to be true, as the service is maintained by developers who have deep, recent knowledge of the system and are adept at manual testing.

However, once you have a large amount of established, stable functionality, and developers have moved on to other projects, you want to be able to make small marginal changes (extensions and bug fixes) with small marginal effort. Spending hours running manual tests isn't reasonable like it was when the system was getting its first big release. But at the same time you don't want to break all that stuff you aren't testing, so developers are careful to make sure that their changes only affect the functionality they are willing (and able) to manually test. If they find the bug in code that affects the whole system, they often won't fix it there. Large refactorings are completely ruled out. Over time, the consequences of making purely local changes and avoiding refactoring put you on a slide to crufty, non-cohesive, special-case-ridden code.

If you have automated top-level tests that test the complete unit that you are responsible for delivering, you can make whatever changes seem appropriate, even if they have global consequences, and feel confident shipping your code.

Reading guides to unit testing, it's funny that they almost all frame their examples as, you are responsible for delivering a certain class, so let's write unit tests for it. But how often are you responsible for shipping a single class? How do you extend that advice to shipping an entire service? Do you test the service as a running whole, or do you test all the classes in it? For me, tests of the entire service are what gives future developers the confidence to make changes and deliver them without worrying about the global effects of the change they made, so I think that's the most important level of testing in the long term.
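
Concretely, the kind of top-level test I mean is something like this (a sketch, with Express + supertest + vitest as assumptions; a real service would obviously be bigger): it exercises the assembled service in-process, so the internals can be refactored freely as long as this contract still holds.

    import express from "express";
    import request from "supertest";
    import { describe, it, expect } from "vitest";

    // Tiny stand-in for the whole service; a real one wires routers,
    // middleware, storage, etc. behind the same `app`.
    const app = express();
    app.get("/health", (_req, res) => {
      res.json({ ok: true });
    });

    describe("the service as a running whole", () => {
      it("answers the health check", async () => {
        const res = await request(app).get("/health");
        expect(res.status).toBe(200);
        expect(res.body).toEqual({ ok: true });
      });
    });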

Edit/PS: Unit tests of units (classes/functions) that are lower down from the top-level functionality also only help as long as the functionality they test is stable. Higher-level, more public functionality changes more slowly, which means tests at that level require less maintenance over time than lower-level tests that might be invalidated by internal changes due to refactoring.


100% in agreement. The largest projects that I've successfully completed have had extensive unit test and test suites in general.


> I think unit tests should be first priority,

15 YoE here and I fully agree that unittests can easily be the best ROI on quality investments. However, coverage and UTs are easy to game. So if you have a toxic culture then folks will simply abuse the goals/metrics, just like any other.

> showing it does what you wanted and doesn’t break everything.

This is an extremely broken premise. Why? Well, how do you show it doesn't break *everything*? Sure, you can easily click through a single happy path of your 2^10 branches in the new feature, but that doesn't convince a rational person that 1) the feature works as designed in all scenarios, nor that 2) you didn't break tons of other things.

I've seen this addressed in a couple of ways. 1) Being like "no customers complained", to which I'd share that in my experience they rarely do. Most customers shrug, try again, and if they can't get what they want they move to another task, or if it's really critical path they simply churn out of your product into a competitor.

or 2) Using "stats" like datadog dashboards. Unfortunately those most often simply mean you hit a line of code, maybe with some volume of data (eg if you count the length of an array). A datadog dashboard isnt going to tell you that you're pumping invalid JSON into that VARCHAR field, or that oops your code is actually returning a 200 OK to the customer when a downstream service fails to accept the data you've accepted as safe in your db...

Unittests can also be creatively used to isolate the easy-to-test portions and leave the harder-to-simulate things to other layers of testing. E.g.: extract a portion of the code out to a function and UT the function, and leave the network details to integration tests instead of implementing a full-fledged Mock (which are usually actually stubs, btw [1]).
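
A sketch of that move (hypothetical names): the pure part gets real unittests, and the thin fetch wrapper is left to integration tests rather than a hand-rolled "mock" of the network:

    interface Order {
      id: string;
      status: "pending" | "shipped" | "cancelled";
      totalCents: number;
    }

    // Pure and easy to unit test: no network, no clock, no globals.
    export function summarizeRefundable(orders: Order[]): { count: number; totalCents: number } {
      const refundable = orders.filter((o) => o.status === "cancelled");
      return {
        count: refundable.length,
        totalCents: refundable.reduce((sum, o) => sum + o.totalCents, 0),
      };
    }

    // Thin I/O shell, left to integration tests against a real or staged API.
    export async function fetchRefundSummary(customerId: string) {
      const res = await fetch(`/api/customers/${customerId}/orders`);
      if (!res.ok) throw new Error(`orders request failed: ${res.status}`);
      const orders: Order[] = await res.json();
      return summarizeRefundable(orders);
    }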

I've come to the conclusion that a couple of factors are in play -- nihilism and ignorance. Sadly, so many engineers and their managers (and sadly some product folks too) have started to behave as though quality doesn't matter. And sure, they're right: no one died when you took down prod or broke that feature. But your $1B ARR company can lose approximately an engineer-year every few hours if you screw things up. And as for ignorance, so much of tech nowadays is just the blind leading the blind -- engineers are promoted for coin tosses gone well rather than smart decisions made [2]; they take absurd risks and lose little (personally) when it fails, but keep the full reward when it works.

[1]: https://jesusvalerareales.medium.com/testing-with-test-doubl... and the sinonjs docs https://sinonjs.org/releases/v14/ are both super good resources to help one think about what those "mocks" do for you and how much assurance each provides that what you wanted to happen actually happened.

[2]: if you don't understand the difference, consider whether an employee should be fired for taking company funds to Vegas and betting it all. Even if they win, that's not the kind of employee you want around.



