[HN Gopher] The big TDD misunderstanding (2022)
       ___________________________________________________________________
        
       The big TDD misunderstanding (2022)
        
       Author : WolfOliver
       Score  : 107 points
       Date   : 2023-11-19 09:48 UTC (13 hours ago)
        
 (HTM) web link (linkedrecords.com)
 (TXT) w3m dump (linkedrecords.com)
        
       | voiceofunreason wrote:
       | << Originally, the term "unit" in "unit test" referred not to the
       | system under test but to the test itself. >>
       | 
       | Retroactive continuity - how does it work?
       | 
       | For today's lucky 10,000: "unit test", as a label, was in wide
       | use prior to the arrival of the Extreme Programming movement in
       | the late 1990s. And the "unit" in question was the test subject.
       | 
       | But, as far as I can tell, _Smalltalk_ lacked a testing culture
       | (Beck, 1994), so perhaps the testing community's definitions
       | weren't well represented in Smalltalk spaces.
       | 
       | "The past was alterable. The past never had been altered."
       | 
       | (Not particularly fair to single out this one author - this
       | origin myth has been common during the last 10 years or so.)
        
         | jdougan wrote:
         | At my job we wrote tests in Smalltalk and that was before
         | Kent's work. It wasn't the later tests-first discipline. If I
          | recall correctly, writing tests later was fairly common then in ST
         | shops. SUnit also had the virtue of standardizing the
         | terminology and testing frameworks.
        
       | emadb wrote:
       | The big TDD misunderstanding is that most people consider TDD a
       | testing practice. The article doesn't talk about TDD, it gives
       | the reader some tips on how to write tests. That's not TDD.
        
         | shimst3r wrote:
         | Instead of Test-Driven Design, it should've been called Design-
         | By-Testing.
        
           | paulluuk wrote:
           | Did you mean Test Driven Development, or is Test-Driven
           | Design a whole other thing?
        
         | skrebbel wrote:
         | Yeah it's kind of unfortunate because they make a very good
         | argument about defining a thing better, and in the title use a
         | wrong definition of an adjacent term.
        
         | WolfOliver wrote:
          | Maybe the term TDD in the title could be replaced with "unit
          | testing". But unit testing is a major part of TDD.
        
         | MoreQARespect wrote:
         | I'm fully aware of the idea that TDD is a "design practice" but
         | I find it to be completely wrongheaded.
         | 
         | The principle that tests that couple to low level code give you
         | feedback about tightly coupled code is true but it does that
         | because low level/unit tests couple too tightly to your code -
         | I.e. because they too are bad code!
         | 
         | Have you ever refactored working code into working code and had
          | a slew of tests fail _anyway_? That's the child of test driven
         | design.
         | 
          | High level/integration TDD doesn't give "feedback" on your
          | design; it just tells you whether your code matches the spec.
          | This is actually more useful. It then lets you refactor bad code
          | with a safety harness and gives failures that _actually_ mean
          | failure and not "changed code".
         | 
         | I keep wishing for the idea of test driven design to die.
          | Writing tests which break on working code is an inordinately
          | uneconomical way to detect design issues compared to developing
          | an eye for it and fixing it under a test harness with no
          | opinion on your design.
         | 
          | So, yes, this - high level test driven development - _is_ TDD,
          | and moreover it's got a _better_ cost/benefit trade-off than
          | test driven design.
        
           | surgical_fire wrote:
           | I don't even like TDD much, but I think that this missed the
           | point:
           | 
           | > Have you ever refactored working code into working code and
           | had a slew of tests fail anyway?
           | 
           | Yes - and that is intended. The "refactor of working code
           | into working code" often changes some assumptions that were
           | made during implementation.
           | 
           | Those tests are not there to give "feedback on your design",
            | they are there to ensure that the implementation does what
           | you thought it should do when you wrote your code. Yes, that
           | means that when you refactor your code, quite a few tests
           | will have to be changed to match the new code.
           | 
            | But the number of times this happened and highlighted
            | issues with the refactor is definitely not negligible. The cost
           | of not having these tests (which would translate into bugs)
           | would certainly have surpassed the costs of keeping those
           | tests around.
        
             | thom wrote:
             | If we're talking "what you thought it should do" and not
             | "how you thought it should do it" this is all fine. If
             | requirements change tests should change. I think the
             | objection is more to changing implementation details and
             | having to rewrite twice as much code, when your functional
             | tests (which test things that actually make you money)
             | never changed.
        
               | chris_wot wrote:
               | If your functional tests fail because you made an
               | "unrelated" code change, then you've done something
               | wrong.
        
               | naasking wrote:
               | Maybe, but I think the point is that it's probably very
               | easy to get into this situation, and not many people talk
               | about it or point out how to avoid it.
        
               | chris_wot wrote:
               | I'm still not following what the issue is. If you
               | refactor some code and change the behaviour of the code,
                | and you have tests for the expected behaviour, then you
                | have one of two problems:
               | 
               | 1. You had a bug you didn't know about and your test was
                | invalid (in which case the test is useless! Fix the
                | issue, then fix the test...)
               | 
               | or
               | 
               | 2. You had no bug and you just introduced a new one, in
               | which case the test has done its job and alerted you to
               | the problem so you can fix your mistake.
               | 
               | What is the exact problem?
               | 
               | Now if this is an issue with changing the behaviour of
               | the system, that's not a refactor. In that case, your
               | tests are testing old behaviour, and yes, they are going
               | to have to be changed.
        
               | naasking wrote:
               | The point is that you're not changing the interface to
               | the system, but you're changing implementation details
               | that don't affect the interface semantics. TDD does lead
               | you to a sort of coupling to implementation details,
               | which results in breaking a lot of unit tests if you
               | change those implementation details. What this yields is
                | hesitancy to undertake positive refactorings, because you
                | have to either update all of those tests or just delete
                | them altogether - and then, were those tests really
                | useful to begin with? The point is that it's apparently
               | wasted work and possibly an active impediment to positive
               | change, and I haven't seen much discussion around
               | avoiding this outcome, or what to do about it.
        
               | thom wrote:
                | There was discussion about this more than a decade ago
                | by people like Dan North and Liz Keogh. I think it's
               | widely accepted that strict TDD can reduce agility when
               | projects face a lot of uncertainty and flux (both at the
               | requirements and implementation levels). I will maintain
               | that functional and integration tests are more effective
               | than low-level unit tests in most cases, because they're
               | more likely to test things customers care about directly,
               | and are less volatile than implementation-level
               | specifics. But there's no free lunch, all we're ever
               | trying to do is get value for our investment of time and
               | reduce what risks we can. Sometimes you'll work on
               | projects where you build low level capabilities that are
               | very valuable, and the actual requirements vary wildly as
               | stakeholders navigate uncertainty. In those cases you're
               | glad to have solid foundations even if everything above
               | is quite wobbly. Time, change and uncertainty are part of
               | your domain and you have to reason about them the same as
               | everything else.
        
           | thom wrote:
           | I think many people realise this, thus the spike and
           | stabilise pattern. But yes, integration and functional tests
           | are both higher value in and of themselves, and lower risk in
           | terms of rework, so ought to be a priority. For pieces of
           | logic with many edge cases and iterations, mix in some
           | targeted property-based testing and you're usually in a good
           | place.
        
           | lukeramsden wrote:
           | Part of test-driven design is using the tests to drive out a
           | sensible and easy to use interface for the system under test,
           | and to make it testable from the get-go (not too much non-
           | determinism, threading issues, whatever it is). It's well
           | known that you should likely _delete these tests_ once you've
           | written higher level ones that are more testing behaviour
           | than implementation! But the best and quickest way to get to
           | having high quality _behaviour_ tests is to start by using
           | "implementation tests" to make sure you have an easily
           | testable system, and then go from there.
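            | 
            | In sketch form (Python/pytest, with a hypothetical
            | PriceCalculator and HTTP `client` fixture), the progression
            | might be something like:
            | 
            |     # Throwaway "implementation test": its main job is to force
            |     # a small, injectable, side-effect-free interface into
            |     # existence.
            |     def test_discount_is_applied_to_subtotal():
            |         calc = PriceCalculator(discount_rate=0.1)  # no globals, no I/O
            |         assert calc.apply_discount(100) == 90
            | 
            |     # Behaviour-level test written later; once it exists, the
            |     # test above can usually be deleted.
            |     def test_checkout_total_includes_discount(client):
            |         resp = client.post("/checkout",
            |                            json={"items": [{"price": 100}],
            |                                  "discount": "TEN_PERCENT"})
            |         assert resp.json()["total_amount"] == 90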
        
             | Dylan16807 wrote:
             | > It's well known that you should likely _delete these
             | tests_ once you've written higher level ones that are more
             | testing behaviour than implementation!
             | 
             | Is it? I don't think I've ever seen that mentioned.
        
             | MoreQARespect wrote:
             | >It's well known that you should likely _delete these
             | tests_ once you've written higher level ones that are more
             | testing behaviour than implementation!
             | 
             | Building tests only to throw them away is the design
             | equivalent of burning stacks of $10 notes to stay warm.
             | 
              | As a process it _works_. It's just 2x easier to write
             | behavioral tests first and thrash out a good design later
             | under its harness.
             | 
              | It mystifies me that doubling the SLOC of your code by
              | adding low level tests only to trash them later came to be
              | seen as a best practice. It's so incredibly wasteful.
        
               | spinningslate wrote:
                | Don't agree, though I think it's more subtle than "throw
               | away the tests" - more "evolve them to a larger scope".
               | 
                | I find this particularly with web services, especially
                | when the services are some form of stateless
               | calculators. I'll usually start with tests that focus on
               | the function at the native programming language level.
               | Those help me get the function(s) working correctly. The
               | code and tests co-evolve.
               | 
               | Once I get the logic working, I'll add on the HTTP
               | handling. There's no _domain_ logic in there, but there
               | is still logic (e.g. mapping from json to native types,
               | authentication, ...). Things can go wrong there too. At
                | this point I'll migrate the original tests to use the
               | web service. Doing so means I get more reassurance for
               | each test run: not only that the domain logic works, but
               | that the translation in & out works correctly too.
               | 
               | At that point there's no point leaving the original tests
               | in place. They're just covering a subset of the E2E tests
               | so provide no extra assurance.
               | 
               | I'm therefore with TFA in leaning towards E2E testing
               | because I get more bang for the buck. There are still
               | places where I'll keep native language tests, for example
               | if there's particularly gnarly logic that I want extra
               | reassurance on, or E2E testing is too slow. But they tend
               | to be the exception, not the rule.
        
               | munch117 wrote:
               | > At that point there's no point leaving the original
               | tests in place. They're just covering a subset of the E2E
               | tests so provide no extra assurance.
               | 
               | They give you feedback when something fails, by better
               | localising where it failed. I agree that E2E tests
               | provide better assurance, but tests are not only there to
               | provide assurance, they are also there to assist you in
               | development.
        
               | MoreQARespect wrote:
               | Starting low level and evolving to a larger scope is
               | still unnecessary work.
               | 
               | It's still cheaper starting off building a
               | playwright/calls-a-rest-api test against your web app
               | than building a low level unit test and "evolving" it
               | into a playwright test.
               | 
                | I agree that low level unit tests are faster and more
                | appropriate if you are surrounding complex logic with
                | a simple and stable api (e.g. testing a parser), but it's
                | better to work your way _down_ to that level when it
                | makes sense, not to start there and work your way up.
        
               | spinningslate wrote:
               | That's not my experience. In the early stages, it's often
               | not clear what the interface or logic should be - even at
               | the external behaviour level. Hence the reason tests and
               | code evolve together. Doing that at native code level
               | means I can focus on one thing: the domain logic. I use
               | FastAPI plus pytest for most of these projects. The net
               | cost of migrating a domain-only test to use the web API
               | is small. Doing that once the underlying api has
               | stabilised is less effort than starting with a web test.
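                | 
                | To illustrate, the migration is often little more than
                | swapping a direct call for a TestClient call (simplified
                | sketch; the endpoint and module names are made up):
                | 
                |     from fastapi.testclient import TestClient
                |     from myapp.main import app            # hypothetical app module
                |     from myapp.domain import quote_premium
                | 
                |     # Early, domain-only test: co-evolves with the logic.
                |     def test_quote_adds_risk_loading():
                |         assert quote_premium(base=100, risk_factor=1.2) == 120
                | 
                |     # Later: the same expectation exercised through the HTTP
                |     # layer, which also checks the json mapping and routing.
                |     def test_quote_endpoint_adds_risk_loading():
                |         client = TestClient(app)
                |         resp = client.post("/quotes",
                |                            json={"base": 100, "risk_factor": 1.2})
                |         assert resp.status_code == 200
                |         assert resp.json()["premium"] == 120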
        
               | MoreQARespect wrote:
                | I don't think I've ever worked on any project where they
                | hadn't yet decided whether they wanted a command line app
                | or a website or an android app before I started. That
                | part is usually set in stone.
               | 
               | Sometimes lower level requirements are decided before
               | higher level requirements.
               | 
               | I find that this often causes pretty bad requirements
               | churn - when you actually get the customer to think about
               | the UI or get them to look at one then inevitably the
               | domain model gets adjusted in response. This is the
               | essence of why BDD/example driven specification works.
        
               | jt2190 wrote:
               | > As a process it works. It's just 2x easier to write
               | behavioral tests first and thrash out a good design later
               | under its harness.
               | 
               | I think this "2x easier" only applies to developers who
               | deeply understand how to design software. A very poorly
               | designed implementation can still pass the high level
                | tests, while also being hard to reason about (typically
                | poor data structures) and debug, having excessive
                | requirements for test setup and teardown due to lots of
                | assumed state, being hard to change, and possibly having
                | no modularity at all, meaning that the tests cover tens
                | of thousands of lines (but only the happy path, really).
               | 
               | Code like this can still be valuable of course, since it
               | satisfies the requirements and produces business value,
               | however I'd say that it runs a high risk of being marked
               | for a complete rewrite, likely by someone who also
               | doesn't really know how to design software.
               | (Organizations that don't know what well designed
               | software looks like tend not to hire people who are good
               | at it.)
        
               | MoreQARespect wrote:
               | "Test driven design" in the wrong hands will also lead to
               | a poorly designed non modular implementation in less
               | skilled hands.
               | 
               | I've seen _plenty_ of horrible unit test driven developed
               | code with a _mess_ of unnecessary mocks.
               | 
                | So no, this isn't about skill.
               | 
               | "Test driven design" doesnt provide effective safety
               | rails to prevent bad design from happening. It just
               | causes more pain to those who use it as such.
               | _Experience_ is what is supposed to tell you how to react
               | to that pain.
               | 
               | In the hands of junior developers test driven design is
               | more like test driven self flagellation in that respect:
               | an exercise in unnecessary shame and humiliation.
               | 
               | Moreover since it _prevents_ those tests with a
               | clusterfuck of mocks from operating as a reliable safety
               | harness (because they fail when implementation code
                | _changes_, not in the presence of bugs), it actively
               | inhibits iterative exploration towards good design.
               | 
               | These tests have the effect of _locking in_ bad design
               | because keeping tightly coupled low level tests green and
               | refactoring is twice as much work as just refactoring
               | without this type of test.
        
               | User23 wrote:
               | > I've seen plenty of horrible unit test driven developed
               | code with a mess of unnecessary mocks.
               | 
               | Mocks are an anti-pattern. They are a tool that either by
               | design or unfortunate happenstance allows and encourages
               | poor separation of concerns, thereby eliminating the
               | single largest benefit of TDD: clean designs.
        
               | jt2190 wrote:
               | You asserted:
               | 
               | > ... TDD is a "design practice" but I find it to be
               | completely wrongheaded.
               | 
               | > The principle that tests that couple to low level code
               | give you feedback about tightly coupled code is true but
               | it does that because low level/unit tests couple too
               | tightly to your code - I.e. because they too are bad
               | code!
               | 
               | But now you're asserting:
               | 
               | > "Test driven design" in the wrong hands will also lead
               | to a poorly designed non modular implementation in less
               | skilled hands.
               | 
               | Which feels like it contradicts your earlier assertion
               | that TDD produces low-level unit tests. In other words,
               | for there to be a "unit test" there must be a boundary
               | around the "unit", and if the code created by following
               | TDD doesn't even have module-sized units, then is that
               | really TDD anymore?
               | 
               | Edit: Or are you asserting that TDD doesn't provide any
               | direction at all about what kind of testing to do? If so,
               | then what does it direct us to do?
        
               | MoreQARespect wrote:
               | >"Test driven design" in the wrong hands will also lead
               | to a poorly designed non modular implementation in less
               | skilled hands.
               | 
               | >Which feels like it contradicts your earlier assertion
               | that TDD produces low-level unit tests.
               | 
                | No, it doesn't contradict that at all. Test driven design,
               | whether done optimally or suboptimally, produces low
               | level unit tests.
               | 
               | Whether the "feedback" from those tests is taken into
               | account determines whether you get bad design or not.
               | 
               | Either way I do not consider it a good practice. The
               | person I was replying to was suggesting that it was a
                | practice that was more suited to people with a lack of
                | experience. I don't think that is true.
               | 
               | >Or are you asserting that TDD doesn't provide any
               | direction at all about what kind of testing to do?
               | 
               | I'm saying that test driven design provides weak
               | direction about design and it is not uncommon for test
               | driven design to still produce bad designs because that
               | weak direction is not followed by people with less
               | experience.
               | 
                | Thus I don't think it's a practice whose effectiveness is
               | moderated by experience level. It's just a bad idea
               | either way.
        
               | jt2190 wrote:
               | Thanks for clarifying.
               | 
               | I think this nails it:
               | 
               | > Whether the "feedback" from those tests is taken into
               | account determines whether you get bad design or not.
               | 
               | Which to me was kind of the whole point of TDD in the
               | first place; to let the ease and/or difficulty of testing
               | become feedback that informs the design overall, leading
               | to code that requires less set up to test, fewer
               | dependencies to mock, etc.
               | 
               | I also agree that a lot of devs ignore that feedback, and
               | that just telling someone to "do TDD" without first
                | making sure that they know they need to strive for
                | little to no test setup and few or no mocks, etc., is
                | pointless advice.
               | 
               | Overall I get the sense that a sizable number of
               | programmers accept a mentality of "I'm told programming
               | is hard, this feels hard so I must be doing it right".
               | It's a mentality of helplessness, of lack of agency, as
               | if there is nothing more they can do to make things
               | easier. Thus they churn out overly complex, difficult
               | code.
        
               | MoreQARespect wrote:
               | >Which to me was kind of the whole point of TDD in the
               | first place; to let the ease and/or difficulty of testing
               | become feedback that informs the design overall
               | 
               | Yes and that is precisely what I was arguing against
               | throughout this thread.
               | 
                | For me, (integration) test driven development is about
                | creating:
               | 
               | * A signal to let me know if my feature is working and
               | easy access to debugging information if it is not.
               | 
               | * A body of high quality tests.
               | 
               | It is 0% about design, except insofar as the tests give
               | me a safety harness for refactoring or experimenting with
               | design changes.
        
               | lmm wrote:
               | What exactly is it wasting? Is your screen going to run
                | out of ink? Even in the physical construction world,
                | people often build as much scaffolding as the thing
                | they're actually building, or more, and that takes time
                | and effort to put up and take down, but it's worthwhile.
               | 
               | Sure, maybe you can do everything you would do via TDD in
               | your head instead. But it's likely to be slower and more
               | error-prone. You've got a computer there, you might as
               | well use it; "thinking aloud" by writing out your
               | possible API designs and playing with them in code tends
               | to be quicker and more effective.
        
             | User23 wrote:
             | Put simply, doing TDD properly leads to sensible separation
             | of concerns.
        
           | aljarry wrote:
           | > Have you ever refactored working code into working code and
           | had a slew of tests fail anyway? That's the child of test
           | driven design.
           | 
           | I had this problem, when either testing too much
           | implementation, or relying too much on implementation to
           | write tests. If, on the other hand, I test only the required
           | assumptions, I'd get lower line/branch coverage, but my tests
           | wouldn't break while changing implementation.
           | 
           | My take on this - TDD works well when you fully control the
           | model, and when you don't test for implementation, but the
           | minimal required assumptions.
        
           | chris_wot wrote:
           | If you've refactored code and a bunch of tests fail, then
           | you've likely introduced a bug.
        
             | chris_wot wrote:
             | Not sure why I'm getting downvoted so badly, because by its
              | very nature refactoring shouldn't change the functionality
             | of the system. If you have functional unit tests that are
             | failing, then something has changed and your refactor has
             | changed the behaviour of the system!
        
               | tsimionescu wrote:
               | It is very common for unit tests to be white-box testing,
               | and thus to depend significantly on internal details of a
               | class.
               | 
               | Say, when unit testing a list class, a test might call
               | the add function and then assert that the length field
               | has changed appropriately.
               | 
               | Then, if you change the list to calculate length on
               | demand instead of keeping a length field, your test will
                | now fail even though the behavior has not actually
               | changed.
               | 
               | This is a somewhat silly example, but it is very common
               | for unit tests to depend on implementation details. And
               | note that this is not about private VS public
               | methods/fields. The line between implementation details
               | and public API is fuzzy and depends on the larger use of
               | the unit within the system.
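                | 
                | In code, the silly example is roughly (Python sketch):
                | 
                |     class MyList:
                |         def __init__(self):
                |             self._items = []
                |             self.length = 0          # cached field
                | 
                |         def add(self, item):
                |             self._items.append(item)
                |             self.length += 1
                | 
                |     def test_add_updates_length():
                |         lst = MyList()
                |         lst.add("x")
                |         # Coupled to the cached field: refactoring `length`
                |         # into a method computed from len(self._items) breaks
                |         # this test, even though observable list behaviour is
                |         # unchanged.
                |         assert lst.length == 1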
        
               | zmgsabst wrote:
               | The behavior has changed:
               | 
               | Checking length is now a function call and not a cached
               | variable -- a change in call signature and runtime
               | performance.
               | 
               | Consumers of your list class are going to have to update
               | their code (eg, that checks the list length) and your
               | test successfully notified you of that breaking API
               | change.
        
               | tsimionescu wrote:
               | Then any code change is a breaking API change and the
               | term API is meaningless. If the compiler replaces a
               | conditional jump + a move with a conditional move, it has
               | now changed the total length of my code and affected its
               | performance, and now users will have to adjust their code
               | accordingly.
               | 
               | The API of a piece of code is a convention, sometimes
               | compiler enforced, typically not entirely. If that
               | convention is broken, it's good that tests fail. If
               | changes outside that convention break tests, then it's
               | pure overhead to repair those tests.
               | 
               | As a side note, the length check is not necessarily no
               | longer cached just because the variable is no longer
               | visible to that test. Perhaps the custom list
               | implementation was replaced with a wrapper around
               | java.ArrayList, so the length field is no longer
               | accessible.
        
           | OJFord wrote:
           | I don't think that's TDD's fault, that's writing a crappy
           | test's fault.
           | 
           | If you keep it small and focussed, don't include setup that
           | isn't necessary and relevant, only exercise the thing which
           | is actually under test, only make an assertion about the
           | thing you actually care about (e.g. there is the key
           | 'total_amount' with the value '123' in the response, not that
           | the entire response body is x); that's much less likely to
           | happen.
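            | 
            | For example (pytest-style sketch; `client` stands in for
            | whatever HTTP test fixture you already use):
            | 
            |     # Brittle: pins the whole body, so any unrelated field change
            |     # (new field, reordered data, different currency) breaks it.
            |     def test_order_total_brittle(client):
            |         assert client.get("/orders/42").json() == {
            |             "id": 42,
            |             "currency": "EUR",
            |             "total_amount": 123,
            |         }
            | 
            |     # Focused: asserts only the thing actually under test.
            |     def test_order_total_focused(client):
            |         assert client.get("/orders/42").json()["total_amount"] == 123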
        
           | osigurdson wrote:
           | I think there can be some value to using TDD in some
           | situations but as soon as people get dogmatic about it, the
           | value is lost.
           | 
           | The economic arguments are hard to make. Sure, writing the
           | code initially might cost $X and writing tests might cost
            | $1.5X, but how can we conclude that the net present value
            | (NPV) of writing the tests is necessarily negative? This
            | plainly depends on the context.
        
         | danmaz74 wrote:
         | As I remember the discourse about TDD, originally it _was_
         | described as a testing practice, and later people started
         | proposing to change the last D from  "development" to "design".
        
         | deneas wrote:
         | I mean I think it's fair to assume that TEST-Driven-Development
         | has something to do with testing. That being said, Kent Beck
         | recently (https://tidyfirst.substack.com/p/tdd-outcomes) raised
         | a point saying TDD doesn't have to be just an X technique,
         | which I wholeheartedly agree with.
        
         | marcosdumay wrote:
          | Well, it's exactly as much about testing as it focuses on
          | writing and running tests.
          | 
          | Which means it's absolutely, entirely about them.
         | 
         | People can claim it's about requirements all they want. The
          | entire thing revolves around the tests, and there's absolutely
          | no consideration of the requirements except for the part where
          | you map them into tests. If you try to create a requirements
          | framework, you'll notice that there is much more to them than
          | testing whether they are met.
        
       | michalc wrote:
       | > Now, you change a little thing in your code base, and the only
       | thing the testing suite tells you is that you will be busy the
       | rest of the day rewriting false positive test cases.
       | 
       | If there is anything that makes me cry, it's hearing "it's done,
       | now I need to fix the tests"
        
         | WolfOliver wrote:
         | Agree, this is usually a sign the team writes tests for the
         | sake of writing tests.
        
         | scaramanga wrote:
         | If changing the implementation but not the behaviour breaks a
         | test, I just delete the test.
        
         | tetha wrote:
          | It's something we changed when we switched our configuration
          | management. The old config management had very, very meticulous
          | tests of everything. This resulted in great "code" coverage,
          | but whenever you changed a default value, at least 6 tests
          | would fail now. Now, we'd much rather go ahead and test much
          | more coarsely. If the config management can take 3 VMs and set
          | up a RabbitMQ cluster that clusters and accepts messages, how
          | wrong can it be?
         | 
         | And this has also bled into my development and strengthened my
         | support of bug-driven testing. For a lot of pretty simple
         | business logic, do a few high level e2e tests for the important
         | behaviors. And then when it breaks, add more tests for those
         | parts.
         | 
         | But note, this may be different for very fiddly parts of the
         | code base - complex algorithms, math-heavy and such. But that's
         | when you'd rather start table based testing and such. At a past
         | gamedev job, we had several issues with some complex cost
          | balancing math, so I eventually set up a test that allows the
         | game balancing team to supply CSV files with expected results.
         | That cleared up these issues within 2 days or so.
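          | 
          | Roughly, that kind of table-driven test can look like this
          | (Python/pytest sketch, names made up):
          | 
          |     import csv
          |     import pytest
          |     from balancing import unit_cost   # hypothetical function under test
          | 
          |     def load_cases(path="balancing_cases.csv"):
          |         # CSV maintained by the balancing team:
          |         # unit_type,level,expected_cost
          |         with open(path, newline="") as f:
          |             return [(r["unit_type"], int(r["level"]),
          |                      float(r["expected_cost"]))
          |                     for r in csv.DictReader(f)]
          | 
          |     @pytest.mark.parametrize("unit_type,level,expected", load_cases())
          |     def test_cost_matches_balancing_sheet(unit_type, level, expected):
          |         assert unit_cost(unit_type, level) == pytest.approx(expected)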
        
           | tremon wrote:
           | _whenever you changed a default value, at least 6 tests would
           | fail now_
           | 
           | Testing default values makes a lot of sense. Both non-set
           | configuration values and non-supplied function parameters
           | become part of your API. Your consumers _will_ rely on those
           | default values, and if you alter them, your consumers _will_
           | see different behaviour.
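            | 
            | Pinning such a default is a one-liner (Python sketch,
            | hypothetical function):
            | 
            |     import inspect
            |     from httpclient import send_request   # hypothetical
            | 
            |     def test_default_timeout_is_part_of_the_api():
            |         # Callers that omit the argument rely on this value;
            |         # changing it should be a conscious, test-breaking decision.
            |         sig = inspect.signature(send_request)
            |         assert sig.parameters["timeout"].default == 30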
        
           | preommr wrote:
           | > how wrong can it be?
           | 
           | Me, right before some really annoying bug starts to show up
           | and the surface area is basically half the codebase, across
           | multiple levels of abstraction, in various combinations.
        
         | int0x80 wrote:
         | Sometimes, you have to make a complex feature or fix. You can
         | first make a prototype of the code or proof of concept that
         | barely works. Then you can see the gap that remains to make the
         | change production ready and the implications of your change.
         | That involves fixing regressions in the test suite caused by
         | your changes.
        
       | gardenhedge wrote:
       | > Never change your code without having a red test
       | 
       | I'll never understand why people insist on this. If you want to
        | write your tests first, that is fine. No one is going to stop you.
       | But why must you insist everyone does it this way?
        
         | shimst3r wrote:
         | If you do TDD, you write your tests first. If you don't follow
         | TDD, you don't have to. The author insists on this because they
         | assume a TDD approach.
        
           | gardenhedge wrote:
           | I suppose that is a fair reading. If TDD is the goal then the
           | "never" instruction makes sense.
        
         | stepbeek wrote:
         | In the context of the article, I read this as "modify the test
         | first for an existing unit test and a bit of production code
         | under test". I don't think this is equivalent to "always write
         | the test first".
        
         | tarcio wrote:
         | It helps you think about the problem better.
         | 
         | A particular test has assertions that must be implemented
         | correctly in order for them to pass.
         | 
          | By starting in the red you gradually make your way towards
          | green, usually paving the way for your thinking and
          | potentially unlocking other tests.
         | 
         | I don't believe you must do this every time though.
        
           | gardenhedge wrote:
           | > I don't believe you must do this every time though.
           | 
           | So I think we agree. If you find it beneficial, please carry
           | on doing it. For anyone who doesn't find it beneficial, they
           | can write tests in whatever way they want.
        
         | resonious wrote:
         | The charitable interpretation is that this is just the
         | definition of TDD, and the author isn't actually trying to push
         | it on anyone. I completely agree with you though - pushing TDD
         | as "the best" is just obnoxious until someone comes in with
         | some more solid evidence to back it up.
        
         | Manfred wrote:
         | The delivery of a text depends on the literary goal of the
         | author.
         | 
          | When you have created or found a new way to approach
          | development and you want to spread that idea, you want to
          | _persuade_.
         | 
          | Alternatively you may want to _inform_, for example when you
         | teach or journal. In which case you write down the facts
         | without embellishing them.
         | 
          | Finally you may want to do a critical _review_, then you
         | explain the method in relation to other methods and existing
         | practices. Pros and cons, etc.
         | 
          | The hope is that the reader will understand these goals and
          | adopt the idea into their own beliefs as they see fit, in
          | relation to how it was presented.
        
         | ozim wrote:
          | For me it is LARPing, people pretending that the code they
          | write is so complicated and so important that they need 100%
          | test coverage and a nuclear reactor will melt down somewhere if
          | they don't do their job to the highest level.
         | 
         | I write testable code without TDD and if something breaks we
         | write test to cover that case so it doesn't happen again.
        
           | minimeme wrote:
            | How do you refactor code if you have poor test coverage?
            | Also, for me the most important benefit is the instant
            | feedback I get when I write the unit tests before the
            | implementation. I can move faster with more confidence.
        
             | usrusr wrote:
             | Probably by working in an environment where that confidence
             | is conveniently provided by static type analysis. If the
             | parts are sufficiently reshaped to fit together again you
             | just know that it will work. And chances are that feedback
              | is an order of magnitude or two more instantaneous.
        
         | imiric wrote:
         | TDD helps you think about the API as a user, which naturally
         | leads to a user-friendly design, and in turn also makes the
         | implementation testable from the start. Often when testing is
         | done after the implementation, the code is tightly coupled,
         | relies on external state, and requires refactoring just to make
         | it testable.
         | 
         | That said, this workflow is tedious during the exploratory
         | phase of the project. If you're unsure yet about which entities
         | should exist, and how they will be used and interact with each
         | other, TDD can be a hindrance. Once the design is mostly
         | settled, this workflow can be helpful, but insisting that it
         | must be used always is also annoying. Everything in software
         | development is a trade-off, and dogmatism should be avoided.
        
         | WolfOliver wrote:
         | It is the safest way to make sure your test will actually fail
         | in case your code does not work. I often had the situation
          | where I wrote a test which is always green, even if the code is
         | broken.
         | 
          | Often I do not have the clarity to write the test first, so I
          | just write the code and the test later. But then I comment out
          | the code or introduce a bug on purpose to make sure that when I
          | run the test it actually works and detects the bug.
        
         | rightbyte wrote:
         | I think you are hitting the key point to make. It boils down
         | too "dogmas are bad".
         | 
          | All these programming methodologies seem to be at risk of
          | developing a cargo cult following, and since programmers are
          | way more dogmatic than most people it also gets way worse than
          | in other fields.
         | 
          | I don't like TDD and I think it is silly. But fine, whatever
          | floats your boat. The problem begins when someone tries to
          | push the single right way onto others, and it seems to mainly
          | be a problem in corporate settings where most people would
          | rather get money than do what they think is the right thing --
          | and some people push agile, TDD or whatever.
        
       | resonious wrote:
       | Huh, I found this more interesting than I thought I would. I
       | hadn't heard before that the "unit" in "unit test" just meant
       | "can run independently". I once failed an interview partly
       | because of only writing "feature tests" and not "unit tests" in
       | the project I showed. But actually those tests didn't depend on
       | each other, so... looks like they really were unit tests!
       | 
       | Anyway, I'm still not totally sure about TDD itself - the "don't
       | write any code without a red test" part. I get the idea, but it
       | doesn't feel very productive when I try it. Of course maybe I'm
       | just bad at it, but I also haven't seen any compelling arguments
       | for it other than it makes the tests stronger (against what?
       | someone undoing my commits?). I think even Uncle Bob's underlying
       | backing argument was that TDD is more "professional", leading me
       | to believe it's just a song-and-dance secret handshake that helps
       | you get into a certain kind of company. OR, it's a technique to
       | try and combat against lazy devs, to try and make it impossible
       | for them to write bad tests. And maybe it is actually good but
       | only for some kinds of projects... I wish we had a way to
       | actually research this stuff rather than endlessly share opinions
       | and anecdotes.
        
         | viraptor wrote:
         | > arguments for it other than it makes the tests stronger
         | 
         | It's supposed to lead to a better design. It's easy to write
         | some code that maybe works and you can't actually test (lots of
         | interdependencies, weird state, etc.), or you only think you're
          | testing it correctly. But making the test first forces you to write
         | something that 1. You can test (by definition) 2. Is decoupled
         | to the level where you can check mainly for the behaviour
         | you're interested in. 3. You won't bypass accidentally. It's
         | not even someone undoing your commits, but some value in the
         | call chain changing in a way that accidentally makes the
         | feature not run at all.
         | 
         | I've seen it many times in practice and will bet that any large
         | project where the tests were written after the code, has some
         | tests that don't actually do anything. They were already
         | passing before the thing they're supposedly testing was
         | implemented.
        
         | MoreQARespect wrote:
         | If I have the tooling all set up (e.g. playwright, database
         | fixtures, mitmproxy) and the integration test closely resembles
         | the requirement then I'm about as productive doing TDD as not
         | doing TDD except I get tests as a side effect.
         | 
         | If I do snapshot test driven development (e.g. actual rest API
         | responses are written into the "expected" portion of the test
         | by the test) then I'm sometimes a little bit more productive.
         | 
         | There's a definite benefit to fixing the requirement rather
         | than letting it evaporate into the ether.
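          | 
          | A hand-rolled sketch of that snapshot flow (Python; real
          | snapshot tooling handles the update step more cleanly, and
          | `client` is whatever HTTP fixture you use):
          | 
          |     import json
          |     from pathlib import Path
          | 
          |     SNAPSHOT = Path(__file__).parent / "snapshots" / "get_order_42.json"
          | 
          |     def test_get_order_matches_snapshot(client):
          |         actual = client.get("/orders/42").json()
          |         if not SNAPSHOT.exists():          # first run writes "expected"
          |             SNAPSHOT.parent.mkdir(parents=True, exist_ok=True)
          |             SNAPSHOT.write_text(json.dumps(actual, indent=2,
          |                                            sort_keys=True))
          |         assert actual == json.loads(SNAPSHOT.read_text())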
         | 
          | Uncle Bob style unit test driven development, on the other
         | hand, is something more akin to a ritual from a cult. Unit test
         | driven development on integration code (e.g. code that handles
         | APIs, databases, UIs) is singularly useless. It only really
         | works well on algorithmic or logical code - parsers, pricing
         | engines, etc. where the requirement can be well represented.
        
         | ChrisMarshallNY wrote:
         | BitD (Back in the Day), "unit tests" were independent tests
         | that we wrote, that tested the system. It applied to pretty
         | much any tests, including what we now call "test harnesses."
         | There weren't really any "rules," defining what a "unit test"
         | was.
         | 
         | The introduction of TDD (before it, actually, as testing
         | frameworks probably had a lot of influence), formalized what a
         | "unit test" is.
         | 
         | In general, I prefer using test harnesses, over suites of unit
         | tests[0], but I still use both.
         | 
         | [0] https://littlegreenviper.com/miscellany/testing-harness-
         | vs-u...
        
           | WolfOliver wrote:
            | That's a new term for me, thanks for pointing it out.
        
         | Matumio wrote:
         | > not totally sure about TDD itself - the "don't write any code
         | without a red test" part
         | 
         | I'm not into TDD, but I'm absolutely into "never skip the red
         | phase".
         | 
         | After fixing a bug or writing a test, I always revert and
         | repeat the test. Same when testing manually (except for the
         | absolutely trivial). You wouldn't believe how often the test
         | then just passes. It's the hygienic thing to do. It's so easy
         | to fool yourself.
         | 
         | About half of the time I realize my test (or my dev setup) was
         | wrong. The other times I learn something important, either that
         | I didn't fully understand the original problem, or my bugfix.
        
           | okl wrote:
           | "Never trust a test you didn't see fail"
        
         | okl wrote:
         | > OR, it's a technique to try and combat against lazy devs,
         | 
         | I think that many of these practices are a result of
         | programmers starting to code before having understood the
         | problem in detail and before thinking through what they want to
         | accomplish. Many times programmers feel an itch in their
         | fingers to just start coding. TDD is an improvement for some
         | because it forces them to think about edge cases and how to
         | test their work results before starting to code their
         | implementation. And bonus: they can do so while coding tests.
        
       | philippta wrote:
       | In my experience a lot of engineers are stuck thinking in MVC
        | terms and fail to write modular code. As a result most business
       | logic is part of a request / response flow. This makes it
       | infeasible to even attempt to write tests first, thus leaving
       | integration or e2e tests as the only remaining options.
        
         | trwrto wrote:
         | Where should the business logic rather be? My tests are
         | typically calling APIs to test the business logic. Trying to
         | improve myself here.
        
           | philippta wrote:
           | There aren't any hard rules here, but you can try to build
            | your business logic as if it were a library, where your HTTP
            | API is merely an interface to it.
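            | 
            | Something like this, roughly (FastAPI-flavoured Python
            | sketch, names made up):
            | 
            |     # Business logic as a plain library: no HTTP, trivially
            |     # testable on its own.
            |     def price_order(items: list[dict], discount: float = 0.0) -> float:
            |         subtotal = sum(i["price"] * i["qty"] for i in items)
            |         return round(subtotal * (1 - discount), 2)
            | 
            |     # The HTTP API is just a thin adapter over that library.
            |     from fastapi import FastAPI
            | 
            |     app = FastAPI()
            | 
            |     @app.post("/orders/price")
            |     def price_order_endpoint(payload: dict) -> dict:
            |         return {"total": price_order(payload["items"],
            |                                      payload.get("discount", 0.0))}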
        
         | laurencerowe wrote:
          | I'm not a TDD purist, but I've found that so long as the
          | request / response flow is a JSON api or similar (as opposed to
          | old style forms and html rendering), writing integration tests
          | first is quite easy, provided you make sure your test fixtures
          | are fairly fast.
        
           | _heimdall wrote:
           | With this approach do you stop at testing the JSON API or do
           | you still make it to testing the rendered HTML actually shown
           | to the user?
           | 
            | I've always actually liked the simplicity of testing HTML
           | APIs in a frontend project. For me, tests get a lot more
           | simple when I can verify the final HTML directly from the
           | response and don't need to parse the JSON and run it through
           | client-side rendering logic first.
        
       | almostnormal wrote:
       | Part of the problem is caused by all sides using the same terms
       | but with a different meaning.
       | 
       | > You just don't know if your system works as a whole, even
       | though each line is tested.
       | 
       | ... even though each line has been executed.
       | 
       | One test per line is strongly supported by tools calculating
       | coverage and calling that "tested".
       | 
       | A test for one specific line is rarely possible. It may be
        | missing some required behavior that hasn't been challenged by any
       | test, or it may be inconsistent with other parts of the code.
       | 
       | A good start would be to stop calling something just executed
       | "tested".
        
         | usrusr wrote:
         | I like the term "exercised" for coverage of questionable
          | assertion value. It's rather pointless in environments with a
         | strict compiler and even some linters get there, but there are
         | still others where you barely know more than that the brackets
         | are balanced before trying to execute. That form of coverage is
         | depressingly valuable there. Makes me wonder if there is a
         | school of testing that deliberately tries to restrict
         | meaningful assertions to higher level tests, so that their
         | "exercise" part is cheaper to maintain?
         | 
         | About the terms thing:
         | 
         | Semantics drifting away over time from whatever a sequence of
         | letters were originally used for isn't an exception, it's
         | standard practice in human communication. The winning strategy
         | is adjusting expectations while receiving and being aware of
         | possible ambiguities while sending, not sweeping together a
         | proud little hill of pedantry to die on.
         | 
          | The good news is that neither the article nor your little jab
          | at the term "tested" really does that; the pedantry is really
          | just a front to make the text a more interesting read. But it
          | also invites the kind of shallow attack that is made very
          | visible by discussions on the internet, but will also play out
          | in the heads of readers elsewhere.
        
           | almostnormal wrote:
           | > Makes me wonder if there is a school of testing that
           | deliberately tries to restrict meaningful assertions to
           | higher level tests, so that their "exercise" part is cheaper
           | to maintain?
           | 
            | That's a nice summary of the effect I observe in the bubble I
            | work in, but I'm sure it is not deliberate; Hanlon's razor
            | applies.
           | 
           | With sufficient freedom for interpretation it is what the
           | maturity model of choice requires.
        
       | thom wrote:
       | I think this article misunderstands the motivation for unit tests
       | being isolated and ends up muddying the definition unnecessarily.
       | The unit of unit testing means we want to assign blame to a
       | single thing when a single test fails. That's why order
       | dependencies have to be eliminated, but that's an implementation
       | detail.
        
       | imiric wrote:
       | I also wasn't aware that "unit" referred to an isolated test, not
       | to the SUT. I usually distinguish tests by their relative level,
       | since "unit" can be arbitrary and bring up endless discussions
       | about what it actually means. So low-level tests are those that
       | test a single method or class, and integration and E2E tests
       | confirm the functionality at a higher level.
       | 
       | I disagree with the premise that "unit", or low-level tests, are
       | not useful because they test the implementation. These are the
       | tests that check every single branch in the code, every possible
       | happy and sad path, use invalid inputs, etc. The reason they're
       | so useful is because they should a) run very quickly, and b) not
       | require any external state or setup, i.e. the traditional "unit".
       | This does lead to a lot of work maintaining them whenever the
       | implementation changes, but this is a necessary chore because of
       | the value they provide. If I'm only relying on high-level
       | integration and E2E tests, because there's much fewer of them and
       | they are slower and more expensive to run, I might miss a low-
       | level bug that is only manifested under very specific conditions.
       | 
       | This is why I still think that the traditional test pyramid is
       | the best model to follow. Every new school of thought since then
       | is a reaction towards the chore of maintaining "unit" tests. Yet
       | I think we can all agree that projects like SQLite are much
       | better for having very high testing standards[1]. I'm not saying
       | that every project needs to do the same, but we can certainly
       | follow their lead and aspire to that goal.
       | 
       | [1]: https://www.sqlite.org/testing.html
        
         | WolfOliver wrote:
          | It makes sense to write a test for a class when the class/method
          | does complex calculations. Today this is less the case than it
          | was when the test pyramid was introduced.
        
         | melvinroest wrote:
          | Having high testing standards practically means to me (having
          | worked for a few SaaS companies): change code somewhere, and
          | see where it fails elsewhere. Though I see failing tests as
          | guidelines, as nothing is 100% tested. If you don't see them as
          | guidelines but as absolute, then you'll get those back as bugs
          | via Zendesk.
        
         | vmenge wrote:
         | I've never had issues with integration tests running with real
         | databases -- they never felt slow or incurred any significant
         | amount of time for me.
         | 
         | I also don't think unit tests bring as much value as
         | integration tests. In fact, a lot of times unit tests are IMO
          | useless or just make your code harder to change. The closer
          | tests get to testing the implementation, the worse it gets
          | IMO, unless I really, really care that something is done in a
          | very particular way, which is not very often.
         | 
         | My opinion will be of course biased by my past experiences, but
         | this has worked well for me so far with both monoliths and
         | microservices, from e-shops and real estate marketplaces to
         | IoTs.
        
           | icedchai wrote:
           | I once worked at a place that demanded we write unit tests
           | for every new method. Something that was simply a getter or
           | setter? New unit test. I'd argue that the code was covered by
           | tests on other code, where it was actually used. This would
           | result in more and more useless arguments. Eventually, I just
           | moved on. The company is no longer in business anyway.
        
           | imiric wrote:
           | > I've never had issues with integration tests running with
           | real databases -- they never felt slow or incurred any
           | significant amount of time for me.
           | 
           | They might not be slow individually, but if you have
           | thousands of them, even a runtime of a couple of seconds adds
           | up considerably. Especially if they're not parallelized, or
           | parallelizable. Also, since they depend on an external
           | service, they're tedious to execute, so now Docker becomes a
           | requirement for every environment they run in, including slow
           | CI machines. Then there is external state you need to think
           | about, to ensure tests are isolated and don't clobber each
           | other, expensive setup/teardown, ensuring that they cleanup
           | after they're done, etc. It's all complexity that you don't
           | have, or shouldn't have, with low-level tests.
           | 
           | That's not to say that such tests shouldn't exist, of course,
           | but that they shouldn't be the primary test type a project
           | relies on.
           | 
           | > I also don't think unit tests bring as much value as
           | integration tests. In fact, a lot of times unit tests are IMO
           | useless or just make your code harder to change.
           | 
           | You're repeating the same argument as TFA, which is what I
            | disagree with. IME I _much_ preferred working on codebases
            | with high coverage from low-level tests to those that mostly
            | rely on higher-level ones. This is because with more
           | lower level tests there is a higher degree of confidence that
           | a change won't inadvertently break something a higher level
           | test is not accounting for. Yes, this means larger
            | refactorings also mean having to update your tests, but this
           | is a trade-off worth making. Besides, nowadays it's becoming
           | easier to just have an AI maintain tests for you, so this
           | argument is quickly losing ground.
           | 
           | > My opinion will be of course biased by my past experiences
           | 
           | Sure, as all our opinions are, and this is fine. There is no
           | golden rule that should be strictly followed about this,
           | regardless of what some authority claims. I've also worked on
           | codebases that use your approach, and it has worked "well" to
           | some extent, but with more code coverage I always had more
            | confidence in my work, and quicker, more convenient test
            | suites ensured that I would rely on this safety net more
            | often.
        
           | layer8 wrote:
           | Do you set up a new database schema for each unit test? If
           | yes, that tends to be slow if you have many tests (hundreds
           | or thousands), and if no, then you risk getting stateful
           | dependencies between tests and they aren't really unit tests
           | anymore.
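           | 
           | A minimal sketch of the schema-per-test approach (assuming
           | pytest and an in-memory SQLite database; a real server such
           | as Postgres would make the same fixture noticeably slower):
           | 
           |   import sqlite3
           |   import pytest
           | 
           |   SCHEMA = "CREATE TABLE users (name TEXT)"
           | 
           |   @pytest.fixture
           |   def db():
           |       # Fresh schema for every test: good isolation,
           |       # but the setup cost is paid on every single test.
           |       conn = sqlite3.connect(":memory:")
           |       conn.execute(SCHEMA)
           |       yield conn
           |       conn.close()
           | 
           |   def test_insert_user(db):
           |       db.execute("INSERT INTO users (name) VALUES ('alice')")
           |       count = db.execute("SELECT COUNT(*) FROM users")
           |       assert count.fetchone()[0] == 1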
        
         | emmelaich wrote:
          | Wow, that's interesting, because I never even considered a
          | unit test to be anything other than a test of a small unit.
         | 
         | Is it not right there in the name?
        
           | cassianoleal wrote:
           | I guess what the OP is arguing is that "unit test" doesn't
           | mean you "test a unit" but rather that "each test is a unit"
           | - i.e. each test executes independently from all other tests.
           | 
           | I just find that good testing practice tbh but it's true that
           | there are loads of test suites out there that require tests
           | to be run in a particular sequence. I haven't seen one of
           | those in a while but they used to be quite common.
        
           | BurningFrog wrote:
           | Yes. That is definitely the original intention of the term!
           | 
           | Of course, language can and does drift etc, but I haven't
           | seen the other use anywhere else.
        
         | magicalhippo wrote:
         | I think it depends on what exactly the code does.
         | 
         | We have some custom rounding routines (to ensure consistent
         | results). That's the kind of stuff you want to have lots and
         | lots of unit tests for, testing all the paths, edge cases and
         | so on.
         | 
         | We also have a complex price calculation module, which depends
         | on lots of tables stored in the DB as well as some fixed logic
         | to do its job. Sure we could test all the individual pieces of
         | code, but like Lego pieces it's how you put them together that
         | matters so IMO integration testing is more useful.
         | 
         | So we do a mix. We have low-level unit testing for low-level
         | library style code, and focus more on integration testing for
         | higher-level modules and business logic.
        
           | sfn42 wrote:
           | I take a similar approach in .NET. I try to build these Lego
           | pieces as traditional classes - no dependencies (except maybe
           | a logger), just inputs and outputs. And then I have a few key
           | "services" which tie everything together. So the service will
            | pull some data from an API and maybe some from a database,
            | then pass it to these pure classes for processing.
           | 
            | I don't unit test the service; I integration test the API
            | itself, which indirectly tests the service. Mock the third
           | party API, spin up a real db with testcontainers.
           | 
           | And then I unit test the pure classes. This makes it much
           | easier to test the logic itself, and the service doesn't
           | really have any logic - it just calls a method to get some
           | data, then another method to process it and then returns it.
           | I _could_ quite easily use mocks to test that it calls the
           | right methods with the right parameters etc, but the
           | integration tests test that stuff implicitly, without being a
           | hindrance to refactoring and similar work.
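           | 
           | A rough Python sketch of the same split (all names here are
           | hypothetical, standing in for the .NET setup described above):
           | 
           |   # Pure "Lego piece": no dependencies, easy to unit test.
           |   def apply_discount(total: float, tier: str) -> float:
           |       rates = {"gold": 0.10, "silver": 0.05}
           |       return round(total * (1 - rates.get(tier, 0.0)), 2)
           | 
           |   class QuoteService:
           |       """Thin orchestration, covered by integration tests."""
           |       def __init__(self, crm_client, order_repo):
           |           self.crm_client = crm_client  # third-party API, mocked
           |           self.order_repo = order_repo  # real DB in the tests
           | 
           |       def quote(self, customer_id: str) -> float:
           |           tier = self.crm_client.get_tier(customer_id)
           |           total = self.order_repo.cart_total(customer_id)
           |           return apply_discount(total, tier)
           | 
           |   # The unit test targets only the pure logic:
           |   def test_gold_discount():
           |       assert apply_discount(100.0, "gold") == 90.0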
        
           | Scubabear68 wrote:
           | This is of course the correct answer - it depends on the
           | context of your code. A single dogmatic approach to testing
           | will not work equally well across all problem domains.
           | 
           | Simple stateless components, hitting a defined wire protocol
            | or file format, utilizing certain APIs, testing numerical
            | stuff: these all imply that unit testing will go far.
           | 
           | Stateful components, complex multi-class flows, and heavily
           | data driven domains will often benefit from higher level
           | integration/end to end tests.
        
           | j1elo wrote:
           | I just wrote a sibling comment and then realized you just
            | stated exactly the same thing I wanted to say, but with more
           | concrete examples :)
           | 
           | That's exactly the sweet spot: complex, self-contained logic
           | units _might_ benefit from low-level unit testing, but for
            | the most part, what you're interested in knowing is whether
            | the whole thing works or not. Just IMHO, of course...
        
         | vidarh wrote:
          | To me, unit tests' primary value is in libraries or components
         | where you want confidence before you build on top of them.
         | 
         | You can sidestep them in favour of higher level tests when the
         | only place they're being used is in one single component you
         | control.
         | 
         | But once you start wanting to reuse a piece of code with
         | confidence across components, unit tests become more and more
         | important. Same as more people are involved.
         | 
         | Often the natural time to fill in lacking unit tests is as an
         | alternative to ad hoc debugging.
        
         | DavidWoof wrote:
         | > I also wasn't aware that "unit" referred to an isolated test
         | 
         | It never did. "Unit test" in programming has always had the
         | meaning it does now: it's a test of a unit of code.
         | 
         | But "unit test" was originally used in electronics, and the
         | meaning in electronics was a bit closer to what the author
         | suggests. The author is being a bit fanciful (aka lying) by
         | excluding this context and pretending that we all don't really
          | understand what Kent Beck et al. were talking about.
        
           | voiceofunreason wrote:
           | Yes.
           | 
           | << I call them "unit tests" but they don't match the accepted
           | definition of unit tests very well. >>
           | 
           | I'm not entirely certain it's fair to accuse the author of
           | lying; ignorance derived from limited exposure to materials
           | outside the bubble (rather than deceit) is the more likely
           | culprit here.
           | 
           | (Not helped at all by the fact that much of the TDD/XP origin
           | story is pre-Google, and requires a different set of research
           | patterns to track down.)
        
             | kragen wrote:
             | this kind of reckless disregard for whether what you are
             | saying is true or not is a kind of lie that is, if
             | anything, even more corrosive than lies by people who know
             | the truth; at least they have a plan in mind to achieve a
             | goal of some benefit to someone
        
           | troupo wrote:
           | > pretending that we all don't really understand what Kent
           | Beck et. al. were talking about.
           | 
           | Here's what Kent Beck has to say about testing:
           | https://stackoverflow.com/a/153565
           | 
           | --- start quote ---
           | 
           | I get paid for code that works, not for tests, so my
           | philosophy is to test as little as possible to reach a given
           | level of confidence
           | 
           | --- end quote ---
        
             | kragen wrote:
             | good thinking but irrelevant to the question at hand
        
           | layer8 wrote:
           | He links to where he got the notion from. I don't think it's
           | that clear-cut.
        
         | drewcoo wrote:
         | > I also wasn't aware that "unit" referred to an isolated test,
         | not to the SUT.
         | 
         | I'm with you. That claim is unsubstantiated. It seems to trace
          | to the belief that the first unit tests were the xUnit family,
          | starting with SUnit for Smalltalk. But Kent Beck made it
          | pretty clear that SUnit "units" were classes.
         | 
         | https://web.archive.org/web/20150315073817/http://www.xprogr...
         | 
         | There were unit tests before that. SUnit took its name from
         | common parlance, not vice versa. It was a strange naming
         | convention, given that the unit testing framework could be used
         | to test anything and not just units. Much like the slightly
         | older Test Anything Protocol (TAP) could.
         | 
         | > [on unit tests] This does lead to a lot of work maintaining
         | them whenever the implementation changes, but this is a
         | necessary chore because of the value they provide.
         | 
         | I disagree. Unit tests can still be behavioral. Then they
         | change whenever the behavior changes. They should still work
         | with a mere implementation change.
         | 
         | > This is why I still think that the traditional test pyramid
         | is the best model to follow.
         | 
         | I'll disagree a little with that, too. I think a newer test
         | pyramid that uses contract testing to verify integrations is
         | better. The notion of contract tests is much newer than the
         | pyramids and, properly applied, can speed up feedback by orders
         | of magnitude while also cutting debugging time and maintenance
         | by orders of magnitude.
         | 
         | On that front, I love what Pact is doing and would like to see
         | more competition in the area. Hottest thing in testing since
         | Cypress/Playwright . . .
         | 
         | https://pact.io
        
         | j1elo wrote:
          | I believe the recent-ish reactions against the chore of
          | maintaining the lowest level of unit tests exist because, with
          | years and experience, we might be going through an industry
          | tendency where we collectively learn that those chores are not
          | worth it.
         | 
         | 100% code coverage is a red herring.
         | 
         | If you're in essence testing things that are part of the
         | private implementation, only through indirect second effects of
         | the public surface... then I'd say you went too far.
         | 
         | What you want to do is to know that the system functions as it
         | should. "I might miss a low-level bug that is only manifested
          | under very specific conditions." means to me that there's a
          | whole-system condition that can occur and thus should be
          | added to the higher level tests.
         | 
         | Not that lower level unit tests are not useful, but I'd say
         | only for intricate and isolated pieces of code that are
         | difficult to verify. Otherwise, _most_ software is a changing
         | entity because we tend to not know what we actually want out of
         | it, thus its lower level details tend to evolve a lot over
          | time, and we shouldn't have _two_ implementations of it (the
          | first one the code itself, the second one a myriad of tiny
          | tests tightly coupled to the former).
        
           | andrewprock wrote:
           | You should be very skeptical of anyone that claims they have
           | 100% test coverage.
           | 
            | Only under very rare circumstances is 100% test coverage
            | even possible, let alone done. Typically when people say
            | coverage they mean "code line coverage", as opposed to the
            | more useful "code path coverage". Since it's combinatorially
            | expensive to enumerate all possible code paths, you rarely
            | see 100% code path coverage in a production system. You might
            | see it when testing very narrow ADTs, for example booleans or
            | floats. But you'll almost never see it for black boxes that
            | take more than one simply defined input, even when the work
            | they do is cheap.
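            | 
            | A tiny illustration of the difference (hypothetical code):
            | two tests can reach 100% line coverage while exercising only
            | two of the four possible paths.
            | 
            |   def classify(a: bool, b: bool) -> str:
            |       parts = []
            |       if a:               # branch 1
            |           parts.append("a")
            |       if b:               # branch 2
            |           parts.append("b")
            |       return "+".join(parts) or "neither"
            | 
            |   # Every line executes, so line coverage reports 100%...
            |   def test_both():
            |       assert classify(True, True) == "a+b"
            | 
            |   def test_neither():
            |       assert classify(False, False) == "neither"
            | 
            |   # ...but the (True, False) and (False, True) paths
            |   # are never run.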
        
       | JonChesterfield wrote:
       | This article is internally inconsistent. It leads with
       | considering "unit" to be "the whole system" being bad, and then
       | tip #1 is to test from the outside in, at whole system
       | granularity. On the other hand, it does point out that "design
       | for test" is a nonsense, so that meets my priors.
       | 
       | By far the worst part of TDD was the proposed resolution to the
       | tension with encapsulation. The parts one wants to unit test are
       | the small, isolated parts, aka "the implementation", which are
       | also the parts one generally wants an abstraction boundary over.
       | Two schools of thought on that:
       | 
       | - one is to test through the API, which means a lot of tests
       | trying to thread the needle to hit parts of the implementation.
       | The tests will be robust to changes in the implementation, but
       | the grey box coverage approach won't be, and you'll have a lot of
       | tests
       | 
       | - two is to change the API to expose the internals, market that
       | as "good testable design" and then test the new API, much of
       | which is only used from test code in the immediate future. Talk
       | about how one doesn't test the implementation and don't mention
       | the moving of goal posts
       | 
       | Related to that is enthusiasm for putting test code somewhere
       | separate to production code so it gets hit by the usual language
       | isolation constraints that come from cross-module boundaries.
       | 
       | Both of those are insane nonsense. Don't mess up your API to make
       | testing it easier, the API was literally the point of what you're
       | building. Write the tests in the same module as the
       | implementation and most of the API challenge evaporates. E.g. in
       | C++, write the tests in an anonymous namespace in the source
       | file. Have more tests that go through the interface from outside
       | if you like, but don't only have those, as you need way more to
       | establish whether the implementation is still working. Much like
       | having end to end tests helps but having only end to end tests is
       | not helpful.
       | 
       | I like test driven development. It's pretty hard to persuade
       | colleagues to do it so multi-developer stuff is all end to end
       | tested. Everything I write for myself has unit tests that look a
       | lot like the cases I checked in the repl while thinking about the
       | problem. It's an automated recheck-prior-reasoning system,
       | wouldn't want to be without that.
        
         | WolfOliver wrote:
         | > It leads with considering "unit" to be "the whole system"
         | being bad
         | 
         | I do not understand this statement. Could you point out which
          | part of the article you mean?
        
           | JonChesterfield wrote:
           | > when people started considering the system under test as
           | the "unit", it significantly affected the quality of test
           | suites (in a bad way)
        
             | WolfOliver wrote:
             | It should be "when people started considering parts of the
             | system under test as the "units""
        
               | JonChesterfield wrote:
               | Units are parts of the system. Perhaps you want to say
               | something about granularity?
        
               | tsimionescu wrote:
               | The claim in that phrase of the article is that a "unit
               | test" was originally supposed to mean "a test that runs
               | unitarily, by itself". Conversely, the more common
               | interpretation of the term is "a test for a single unit
               | of code".
               | 
               | So, according to that phrase, units are not part of the
               | system. The units are supposed to be the tests.
               | 
               | Note that I don't agree with this semantics game. But
               | that is the intended meaning of that phrase.
        
       | nu11ptr wrote:
       | I am not going to say that some of these testing religions don't
       | have a place, but mostly they miss the point. By focusing on TDD
       | or "code coverage" the essential questions are missed. Instead of
        | focusing on methodology, I recommend asking yourself
       | simple questions starting with:
       | 
       | 1. How do I meet my quality goals with this project (or module of
       | a project)?
       | 
       | This is the root question and it will lead to other questions:
       | 
       | 2. What design and testing practice is most likely to lead to
       | this outcome?
       | 
        | 3. What is the payoff value for this module for a given type of
       | testing?
       | 
       | 4. How can I be confident this project/module will continue to
       | work after refactoring?
       | 
       | etc. etc.
       | 
       | I have used TDD style unit testing for certain types of modules
       | that were very algorithmic centric. I have also used integration
       | testing only for other modules that were I/O centric without much
       | logic. I personally think choosing a testing strategy as the "one
       | right way" and then trying to come up with different rules to
        | justify it is exactly the inverse of how one should be thinking
        | about it (top down vs bottom up design of sorts).
        
       | danielovichdk wrote:
        | Read https://www.manning.com/books/unit-testing; it's the best
        | book on the subject and presents the matter with good evidence.
       | 
       | "Tip #4: TDD says the process of writing tests first will/should
       | drive the design of your software. "
       | 
        | Yes, and if that does not happen during TDD, I would argue you
        | are not doing TDD. Sure, you always have some sort of boundaries,
        | but design up front is a poor choice when you try to iterate
        | towards the best possible solution.
        
       | projektfu wrote:
       | I think it's both attention-getting and distracting to start with
       | a definition of unit testing that hardly anybody uses. Now I'm
       | not interested in the article because I have to see what your
       | sources are and whether you're gaslighting me.
       | 
       | The reason people use the term unit test to mean the size of the
        | system under test is because that's what it has generally meant.
       | Before OO, it would mean module. Now it means class. The original
       | approach would be to have smaller, testable functions that made
       | up the functionality of the module and test them individually.
       | Decoupling was done so that you didn't need to mock the database
       | or the filesystem, just the logic that you're writing.
       | 
       | Some people disagree with unit testing and focus on functional
       | testing. For example, the programming style developed by Harlan
       | Mills at IBM was to specify the units very carefully using formal
       | methods and write to the specification. Then, black-box testing
       | was used to gain confidence in the system as a whole.
       | 
       | I feel that a refactor shouldn't break unit tests, at least not
       | if the tools are smart enough. If you rename a method or class,
       | its uses should have been renamed in the unit tests. If you push
       | a method down or up in a hierarchy, a failing test tells you that
       | the test is assuming the wrong class. But most cases of failing
       | tests should be places where you made a mistake.
       | 
       | However, I agree that functional tests are the hurdle you should
       | have crossed before shipping code. Use unit testing to get 100ms
       | results as you work, functional tests to verify that everything
       | is working correctly. Write them so that you could confidently
       | push to production whenever they're green.
        
       | gombosg wrote:
       | I think that unit tests are super valuable because when used
       | properly, they serve as micro-specifications for each component
       | involved.
       | 
       | These would be super hard to backfill later, because usually only
       | the developer who implements them knows everything about the
       | units (services, methods, classes etc.) in question.
       | 
       | With a strongly typed language, a suite of fast unit tests can
       | already be in feature parity with a much slower integration test,
       | because even if mocked out, they essentially test the whole call
       | chain.
       | 
       | They can offer even more, because unit tests are supposed to test
       | edge cases, all error cases, wrong/malformed/null inputs etc. By
       | using integration tests only, as the call chain increases on the
       | inside, it would take an exponentially higher amount of
       | integration tests to cover all cases. (E.g. if a call chain
       | contains 3 services, with 3 outcomes each, theoretically it could
       | take up to 27 integration test cases to cover them all.)
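       | 
       | The arithmetic, as a rough sketch (the outcome names are made up):
       | 
       |   from itertools import product
       | 
       |   outcomes = ["ok", "invalid_input", "upstream_error"]
       | 
       |   # Covering the 3-service chain end to end means covering
       |   # every combination of outcomes:
       |   e2e_cases = list(product(outcomes, repeat=3))
       |   assert len(e2e_cases) == 27   # 3^3 integration scenarios
       | 
       |   # Covering each service in isolation needs only 3 + 3 + 3
       |   # focused unit tests.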
       | 
       | Also, ballooning unit test sizes or resorting to unit testing
        | private methods gives the developer feedback that the service is
       | probably not "single responsibility" enough, providing incentive
       | to split and refactor it. This leads to a more maintainable
       | service architecture, that integration tests don't help with.
       | 
       | (Of course, let's not forget that this kind of unit testing is
       | probably only reasonable on the backend. On the frontend,
       | component tests from a functional/user perspective probably bring
       | better results - hence the popularity of frameworks like
       | Storybook and Testing Library. I consider these as integration
       | rather than unit tests.)
        
       | skohan wrote:
       | I think it also heavily depends on the language you are working
       | with. For instance, unit tests are much more important in a duck-
        | typed language than in a statically typed language, since the
        | compiler is less capable of catching a number of issues.
        
       | seanwilson wrote:
       | Which companies or large projects use TDD at the moment? There's
       | always such intense discussion about what it is and its benefits,
       | yet I don't see anyone actually doing TDD.
        
         | JackMorgan wrote:
         | I've been in several multi-million line codebases that all were
         | built with TDD. It's possible.
         | 
         | The default way of organizing code with DI makes unit tests
         | extremely expensive to write and maintain. Mocks should be
         | banned if you want to add unit tests or practice TDD. Instead
         | the tested code should be pure. Pure code is easy to test, even
         | if it's calling a dozen helper functions.
        
         | hmeh wrote:
         | Dozen-ish person team, 3 year project so far, billion dollar
         | revenue company (not a software company), >500k LOC, TDD since
         | the beginning. Have been doing TDD for 18 years or so. Still
         | getting better at it.
        
       | andsmedeiros wrote:
       | TDD can be valuable but sometimes hindering. I find myself often
       | with an incomplete idea of what I want and, thus, no clear API to
       | start testing. Writing a quick prototype -- sometimes on godbolt
       | or replit -- and then writing tests and production code will
        | actually yield better productivity for me.
       | 
       | I usually test all of the public API of something and only it.
       | Exported functions, classes, constants and whatever should be
       | tested and properly documented. If writing tests for the public
       | surface is not enough, most likely the underlying code is poorly
        | written, probably lacking proper abstractions to expose the
        | state associated with a given behaviour (e.g. a class that does
        | too much).
        
         | tsimionescu wrote:
          | I think this is only true up to a point. Ultimately, the API
          | of a unit of code is not fully defined by the public vs.
          | private language features; it is defined by the conventions for its
         | use. If a field/method is public but it is documented to not be
         | used by anything except, say, some tests, then it shouldn't be
         | considered part of the actual API.
         | 
         | Even in languages which have a very powerful type system, there
         | are assumptions that have to be left to documentation (e.g.
         | that the Monad laws are respected by types which match the
         | Monad typeclasses in Haskell). Testing parts which are
         | documented to not be relevant is often actively harmful, since
         | it causes problems with later changes.
        
       | dmos62 wrote:
       | This resonates. I learned the hard way that you want your main
       | tests to integrate all layers of your system: if the system is an
       | HTTP API, the principal tests should be about using that API. All
       | other tests are secondary and optional: can be used if they seem
       | useful during implementation or maintenance, but should never be
       | relied upon to test correctness. Sometimes you have to
       | compromise, because testing the full stack is too expensive, but
       | that's the only reason to compromise.
       | 
       | This is largely because if you try to test parts of your system
       | separately, you have to perfectly simulate how they integrate
        | with other parts; otherwise you'll get the worst case scenario:
       | false test passes. That's too hard to do in practice.
       | 
       | I suspect that heavy formalization of the parts' interfaces would
       | go a long way here, but I've not yet seen that done.
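       | 
       | A minimal sketch of such an outside-in test, assuming a Flask app
       | with a hypothetical endpoint (any HTTP framework with a test
       | client works the same way):
       | 
       |   from flask import Flask, jsonify
       | 
       |   app = Flask(__name__)
       | 
       |   @app.route("/health")        # hypothetical endpoint
       |   def health():
       |       return jsonify(status="ok")
       | 
       |   # The principal test drives the system through its public
       |   # HTTP API, integrating every layer underneath it.
       |   def test_health_endpoint():
       |       client = app.test_client()
       |       resp = client.get("/health")
       |       assert resp.status_code == 200
       |       assert resp.get_json() == {"status": "ok"}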
        
         | troupo wrote:
         | > if the system is an HTTP API, the principal tests should be
         | about using that API
         | 
         | So many times yes!
         | 
          | Funnily enough, it's also the quickest way to get high code
          | coverage numbers, which are still used as a metric everywhere.
        
           | shoo wrote:
           | > the quickest way to get high code coverage numbers
           | 
            | apocryphal tale of test coverage + Goodhart's law:
           | 
           | Team is responsible for a software component that's two-
           | thirds gnarly driver code that's impossible to unit test, and
           | one third trivial stuff that's easy to unit test. Team has
           | 33% unit test coverage.
           | 
           | Test coverage becomes an organisational KPI. Teams must hit
           | at least 80% statement coverage. Oh no!
           | 
           | Team re-architects their component & adds several abstraction
           | layers that wrap the gnarly driver code, that don't serve any
           | functional purpose. Abstraction layers involve many lines of
           | code, but are elegantly designed so they are easy to test.
           | Now the codebase is 20% gnarly driver code that's impossible
            | to unit test, 10% trivial stuff that's easy to unit test,
           | and 70% layers of unnecessary nonsense that's easy to unit
           | test. 80% statement coverage target achieved! Raises and
           | promos for everyone!
        
       | sebtron wrote:
        | When I first learnt about unit tests / TDD, I was confused because
       | everyone assumes you are doing OOP. What am I supposed to do with
       | my C code? I can just test a function, right? Or do I have to
        | forcefully turn my program into some OO-style architecture?
       | 
        | But then I realized it does not matter; there is only one
        | important thing about unit tests: that they exist. All the rest
        | is implementation detail.
       | 
       | Mocking or not, isolated "unit" or full workflow, it does not
       | matter. All I care about is that I can press a button (or type
       | "make test" or whatever) and my tests run and I know if I broke
       | something.
       | 
       | Sure, your tests need to be maintainable, you should not need to
       | rewrite them when you make internal changes, and so on. You'll
       | learn as you go. Just write them and make them easy to run.
        
         | okl wrote:
         | For C code you can use link-time substitution and a mock
         | generator like CMock (http://www.throwtheswitch.org/cmock).
         | 
         | Link-time substitution means that you swap out certain objects
         | with others when you build your test binaries.
         | 
         | For example, let's say your production software binary consists
         | of a main function and objects A, B and C. For a unit test you
         | could use a different main (the test), object B and a mock for
         | object C - leaving out A.
        
       | tbrownaw wrote:
       | > _Tip #1: Write the tests from outside in._
       | 
       | > _Tip #2: Do not isolate code when you test it._
       | 
       | > _Tip #3: Never change your code without having a red test._
       | 
       | > _Tip #4: TDD says the process of writing tests first will
       | /should drive the design of your software. I never understood
       | this. Maybe this works for other people but it does not work for
       | me. It is software architecture 101 -- Non-functional
       | requirements (NFR) define your architecture. NFR usually do not
       | play a role when writing unit tests._
       | 
       | The one time I ever did "proper" red/green cycle TDD, it worked
       | because I was writing a client library for an existing wire
        | protocol, and knew in advance _exactly_ what it needed to do and
       | how it needed to do it.
       | 
       | Item 2 is right, but this also means that #1 is wrong. And
       | knowing what order #2 requires, means knowing how the code is
       | designed (#4).
        
         | sheepshear wrote:
         | The tips are not contradictory if you follow the advice to
         | start at a higher level.
         | 
         | Let's say you had to invent that wire protocol. You would write
         | a test for a client that doesn't care which wire protocol is
         | used.
        
           | HumblyTossed wrote:
           | TDD works great for this. Usually before I am sent a new
           | piece of equipment (has to go through the approval/purchase
           | process) I'm given the docs. I'll write unit tests using the
           | examples in the docs (or made up examples based on the docs).
           | I'll write my software controller against that. By the time I
           | get the actual device I'm just confirming my code works.
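           | 
           | As a sketch of what that can look like (the frame format and
           | names below are invented, standing in for whatever the
           | vendor's docs specify):
           | 
           |   # Docs say a status frame looks like "$TEMP,23.5*".
           |   # The test is written from that example long before the
           |   # real device shows up.
           |   def parse_status_frame(frame: bytes) -> float:
           |       body = frame.decode("ascii").strip("$*")
           |       label, value = body.split(",")
           |       if label != "TEMP":
           |           raise ValueError("unexpected frame type")
           |       return float(value)
           | 
           |   def test_documented_example_frame():
           |       assert parse_status_frame(b"$TEMP,23.5*") == 23.5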
        
         | randomdata wrote:
         | TDD was later given the name Behavior Driven Development
          | (before being usurped by the likes of Cucumber and Gherkin) in an
         | attempt to avoid this confusion. TDD advocates that you test
         | that the client library does what its public interface claims
         | it does - its behavior, not how it is implemented under the
         | hood. The wire protocol is almost irrelevant. The tests should
         | hold true even when the wire protocol is replaced with another
         | protocol.
        
           | WolfOliver wrote:
           | Do you have any references that BDD was used as a term before
           | Cucumber?
        
             | voiceofunreason wrote:
             | Search for the writings of Dan North, and don't forget that
             | in the English alphabet, "Behaviour" is spelled with a "U".
             | 
             | Very roughly, Cucumber framework appears in late 2008?
             | Whereas BDD first appears in the TDD community no later
             | than December 2005.
             | 
             | Ex: here Dave Astels references a BDD talk he presented in
             | 2005-09: https://web.archive.org/web/20051212160347/http://
             | blog.davea...
        
           | rileymat2 wrote:
           | I suspect it goes deeper than that, which is some of the
           | confusion.
           | 
            | If you have multiple layers/parts, some will treat each part
            | as an independent library to be used; implementation details
            | of one level depend on the public interfaces of the next
            | level.
        
           | voiceofunreason wrote:
           | That's not quite right, historically.
           | 
           | Behavior Driven Development began as a re-languaging of TDD:
           | "The developers were much more receptive to TDD when I
           | stopped talking about testing." -- Dan North.
           | 
           | BDD diverged from TDD fairly early, after some insights by
           | Chris Matts.
           | 
           | As for TDD advocating tests of the public interface... that
           | seems to me to have been more aspirational than factual. The
           | tests in TDD are written by developers for developers, and as
           | such tend to be a bit more white/clear box than pure
           | interface testing would suggest.
           | 
           | In the edge cases where everything you need for testing is
           | exposed via the "public" interface, these are equivalent, of
           | course, but there are tradeoffs to be considered when the
           | information you want when running isolated experiments on an
           | implementation isn't part of the contract that you want to be
           | supporting indefinitely.
        
       | osigurdson wrote:
       | My view on unit testing is if there are no dependencies, there is
       | no real reason not to write tests for all behaviours. While you
       | may have a wonderful integration testing suite, it is still great
       | to know that building blocks work as intended.
       | 
       | The problems arise with dependencies as now you need to decide to
       | mock them or use concrete implementations. The concrete
        | implementation might be hard to set up, slow to run in a test -
       | or both. Using a mock, on the other hand, is essentially an
       | alternate implementation. So now your code has the real
       | implementation + one implementation per test (in the limit) which
       | is plainly absurd.
       | 
       | My current thinking (after writing a lot of mocks) is to try to
       | shape code so that more of it can be tested without hard to setup
       | dependencies. When this can't be done, think hard about the right
       | approach. Try to put yourself in the shoes of a future
       | maintainer. For example, instead of creating a bespoke mock for
       | just your particular test, consider creating a common test
       | utility that mocks a commonly used dependency in accordance with
       | common testing patterns. This is just one example. Annoyingly, a
       | lot of creativity is required once dependencies of this nature
       | are involved which is why it is great to shape code to avoid it
       | where possible.
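       | 
       | For instance, a shared in-memory fake in a common test utility
       | module (hypothetical names), rather than a bespoke mock with
       | per-test expectations:
       | 
       |   # tests/fakes.py -- reused by every test that needs the store.
       |   class InMemoryKeyValueStore:
       |       def __init__(self):
       |           self._data = {}
       | 
       |       def put(self, key, value):
       |           self._data[key] = value
       | 
       |       def get(self, key, default=None):
       |           return self._data.get(key, default)
       | 
       |   # Tests exercise the dependency the same way everywhere,
       |   # so a future maintainer has only one fake to understand.
       |   def test_put_then_get():
       |       store = InMemoryKeyValueStore()
       |       store.put("greeting", "hello")
       |       assert store.get("greeting") == "hello"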
        
         | troupo wrote:
         | > it is still great to know that building blocks work as
         | intended.
         | 
         | The only way to know that the building blocks work as intended
         | is through integration testing. The multiple "all unit tests
         | passed, no integration tests" memes show that really well.
        
           | osigurdson wrote:
           | I suspect you are not finding many bugs in common built-in
            | libraries via integration tests, however. The same concept can
           | extend to code that our own teams write.
        
             | tomnipotent wrote:
             | > finding many bugs in common built-in libraries via
             | integration tests
             | 
             | I'm not convinced they're finding them with unit tests,
             | either. No shortage of public repos with great test
             | coverage that still have pages of Pull Requests and Issues
             | dealing with bugs. In many of those instances, you could
             | argue that it's the act of integration testing (the team
             | trying to use the project) that ends up catching the
             | issues.
        
               | osigurdson wrote:
               | I think for low level functionality, unit tests and
               | integration tests are the same thing. As an absurd
                | example, consider a function that adds two numbers
               | together producing a result.
        
         | michaelteter wrote:
         | > try to shape code so that more of it can be tested without
         | hard to setup dependencies
         | 
         | Yes!
         | 
         | In general, functional core + imperative shell goes a long way
         | toward this goal.
         | 
          | With that approach there should also be minimal coupling, with
          | complex structured types kept outside and simple standard
          | types passed to the core functions.
         | 
         | These things make unit testing so much easier and faster (dev
         | time and test run time).
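         | 
         | A minimal sketch of that shape (hypothetical names):
         | 
         |   # Functional core: pure and trivially unit-testable.
         |   def summarize(amounts: list[float]) -> dict:
         |       return {"count": len(amounts),
         |               "total": round(sum(amounts), 2)}
         | 
         |   # Imperative shell: does the I/O, stays thin, and is
         |   # covered by higher-level tests instead.
         |   def report(path: str) -> None:
         |       with open(path) as f:
         |           amounts = [float(line) for line in f if line.strip()]
         |       print(summarize(amounts))
         | 
         |   def test_summarize():
         |       result = summarize([1.0, 2.5])
         |       assert result == {"count": 2, "total": 3.5}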
        
         | patrick451 wrote:
          | The tests in the codebase I currently work in are a mocking
         | nightmare. It feels like somebody learned about c++ interface
         | classes and gmock for the first time when the codebase was
         | first being put together and went completely bananas. There are
          | almost no classes which _don't_ inherit from a pure interface.
         | 
          | The main drawbacks of this are:
         | 
         | - Classes which have only a single implementation inherit from
         | an interface just so they can be mocked. We often only need
         | polymorphism for testing, but not at runtime. This not only
         | makes the code slower (minor concern, often) but more
         | importantly much more difficult to follow.
         | 
         | - The tests rely heavily on implementation details. The
          | typical assertion is NOT on the input/output behavior of some
         | method. Rather, it's on asserting that various mocked
         | dependencies got called at certain times within the method
         | under test. This heavily couples the tests to the
         | implementation and makes refactoring a pain.
         | 
          | - We have no tests which test multiple classes together that
          | aren't at the scale of system-wide, end-to-end tests. So
         | when we DI class Bar into class Foo, we use a mock and don't
         | actually test that Bar and Foo work well together.
         | 
         | Personally, I think the code base would be in much better shape
         | if we completely banned gmock.
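         | 
         | For contrast, a small sketch (hypothetical names) of an
         | interaction-style assertion versus an input/output one:
         | 
         |   from unittest.mock import Mock
         | 
         |   def total_price(items, tax_service):
         |       subtotal = sum(items)
         |       return subtotal + tax_service.tax_for(subtotal)
         | 
         |   # Interaction style: couples the test to how the method
         |   # does its job, so refactoring tends to break it.
         |   def test_calls_tax_service():
         |       tax = Mock()
         |       tax.tax_for.return_value = 1.0
         |       total_price([10.0], tax)
         |       tax.tax_for.assert_called_once_with(10.0)
         | 
         |   # Input/output style: survives refactoring as long as the
         |   # observable result stays the same.
         |   def test_total_includes_tax():
         |       tax = Mock()
         |       tax.tax_for.return_value = 1.0
         |       assert total_price([10.0], tax) == 11.0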
        
       | mannykannot wrote:
       | The article highlights this claim:
       | 
       |  _" Now, you change a little thing in your code base, and the
       | only thing the testing suite tells you is that you will be busy
       | the rest of the day rewriting false positive test cases."_
       | 
       | Whenever this is the case, it would seem at least one of the
       | following is true:
       | 
       | 1) There are many ways the 'little change' could break the
       | system.
       | 
       | 2) Many of the existing tests are testing for accidental
       | properties which are not relevant to the correct functioning of
       | the system.
       | 
       | If only the second proposition describes the situation, then, in
       | my experience, it is usually a consequence of tests written to
       | help get the implementation correct being retained in the test
       | suite. That is not necessarily a bad thing: with slight
       | modification, they _might_ save time in writing tests that are
       | useful in getting the new implementation correct.
       | 
       | I should make it clear that I don't think this observation
       | invalidates any of the points the author is making; in fact, I
       | think it supports them.
        
       | acidburnNSA wrote:
       | Was 'unit' originally intended to be a test you could run in
       | isolation? I don't think so. I'm not an expert in testing
       | history, but this Dec 2000 Software Quality Assurance guide from
       | the Nuclear Regulatory Commission defines Unit Testing as:
       | 
       | > Unit Testing - It is defined as testing of a unit of software
       | such as a subroutine that can be compiled or assembled. The unit
       | is relatively small; e.g., on the order of 100 lines. A separate
       | driver is designed and implemented in order to test the unit in
       | the range of its applicability.
       | 
       | NUREG-1737 https://www.nrc.gov/docs/ML0101/ML010170081.pdf
       | 
        | Going back, this 1993 nuclear guidance has similar language:
       | 
       | > A unit of software is an element of the software design that
       | can be compiled or assembled and is relatively small (e.g., 100
       | lines of high-order language code). Require that each software
       | unit be separately tested.
       | 
       | NUREG/BR-0167 https://www.nrc.gov/docs/ML0127/ML012750471.pdf
        
         | voiceofunreason wrote:
         | See also: ANSI/IEEE 1008-1987
        
       | euos wrote:
       | I took a break from a big tech and joined a startup. It was
       | infuriating how people were actively opposing my TDD approach. It
       | was redeeming when I was shipping one of my projects (service for
       | integration with 3rd party) - and product managers and others
        | were expecting we would need weeks to test and fix the bugs - but
       | instead the other party just said "it's perfect, no comments".
       | 
       | All because I was "wasting my time" on writing "useless tests"
       | that helped me identify and discuss edge cases early in the
       | process. Also, I could continue working on parts even while
       | waiting for a response from product managers or while DevOps were
       | still struggling to provision the resources.
        
       | dave333 wrote:
        | For some classic wisdom about writing tests, see "The Art of
        | Software Testing" by Glenford Myers. It's $149+++ on Amazon,
       | but only $5 on ebay:
       | 
       | https://www.ebay.com/sch/i.html?_from=R40&_trksid=p3671980.m...
       | 
       | This was originally published before TDD was a thing, but is
       | highly applicable.
        
       | jgalt212 wrote:
       | > Write the tests from outside in. With this, I mean you should
       | write your tests from a realistic user perspective. To have the
       | best quality assurance and refactor resistance, you would write
       | e2e or integration tests.
       | 
        | Yah, yah. But good luck trying to figure out what went wrong when
        | the only failing test you have is an e2e or integration test.
        
       ___________________________________________________________________
       (page generated 2023-11-19 23:01 UTC)