[HN Gopher] Waiting on Tests
___________________________________________________________________
Waiting on Tests
Author : et1337
Score : 28 points
Date : 2024-01-03 14:22 UTC (8 hours ago)
(HTM) web link (etodd.io)
(TXT) w3m dump (etodd.io)
| Jenk wrote:
| From zero tests in 2019 to some 59k+ tests in just 4-5 years, on
| an established, significantly large product no less. Yeah, I have
| zero faith in the quality and efficacy of those tests.
| MetaWhirledPeas wrote:
| Yeah that's quite a few tests. I just assumed the author was
| summing individual assertions, or unit tests, or something.
| hipadev23 wrote:
| But hey, they have 100% coverage and lots of green text in their
| CLI. Surely they're shipping!
| evilduck wrote:
| In my own experience, post-facto automated testing just locks
| in the codebase to behave exactly as it did before the tests.
| The benefit is limited to preventing the introduction of new
| regressions, unless the team also performed significant
| refactors along the way. And 59k tests works out to an average
| of 32 new tests added every single day of the week for 5 years
| straight. While it's easy to hit a couple dozen a day as an IC
| in the first month of an effort like this, the long tail is the
| hard part. Adding 32 new and valuable tests in a single day in
| year three of this effort sounds hard.
|
| I'd be curious to hear about the manpower and logistical
| requirements behind this post. How many LoC was the original
| codebase? What's the current coverage at 59k? How many people
| were involved? How did they convince upper management to
| suddenly allocate this much opex spend for 5 years running, or
| what were they doing before this increase in testing
| expenditure?
| The_Colonel wrote:
| > In my own experience, post-facto automated testing just
| locks in the codebase to behave exactly as it did before the
| tests
|
| That's a great starting point for refactoring or just normal
| feature development. You need the confidence that your
| changes won't break the system in unforeseen ways.
| hinkley wrote:
| > post-facto automated testing just locks in the codebase
|
| Some of us call those pinning tests. It's pretty on the nose
| for your complaints, since it encodes a sense of FOMO right
| into the name.
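|
| A minimal sketch of what a pinning (a.k.a. characterization)
| test looks like in practice -- `legacy_price` here is a
| hypothetical stand-in for untested legacy code, and the expected
| values are copied from a run of the current implementation, not
| from a spec:
|
|     # pinning_test.py
|     def legacy_price(qty, unit):
|         # Pretend this is the legacy code nobody documented.
|         return round(qty * unit * 1.075, 2)
|
|     def test_pin_legacy_price():
|         # Outputs recorded verbatim from today's behavior; the
|         # test "pins" that behavior and fails on any change,
|         # intended or not.
|         assert legacy_price(1, 10.00) == 10.75
|         assert legacy_price(3, 9.99) == 32.22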
| wavemode wrote:
| > post-facto automated testing just locks in the codebase to
| behave exactly as it did before the tests
|
| What, in your opinion, is the purpose of automated tests?
| MetaWhirledPeas wrote:
| This is fun. Seems like you did some good analysis.
|
| I couldn't help but notice you run a "most common" test suite,
| implying that you have more tests that don't get run for whatever
| reason (the change doesn't affect that code, it's too slow,
| whatever). We end up having to do something similar to a small
| degree, and it bugs me.
|
| What I would really like to see (and this might become a personal
| project at some point) is a way to optimize the tests at the step
| level. I believe this would accomplish two goals:
|
| - It would reduce redundancy, leading to faster execution time
| and easier maintenance
|
| - It would make the test scenarios easier to see and evaluate
| from a high level
|
| Inevitably when you write tests you end up covering scenario X,
| then months or years later when working to cover scenario Y you
| accidentally create redundant coverage for scenario X. This waste
| continues to build over time. I think this could be improved if I
| could break the tests up into chunks (steps) that could be
| composed to cover multiple scenarios. Then a tool could analyze
| those chunks to remove redundancy or even make suggestions for
| optimization. And it could outline and communicate coverage more
| clearly. If formatted properly it might serve not only as a
| regression test suite definition, but also as documentation of
| current behavior. (Think Cucumber, but in reverse.)
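|
| As a rough sketch of the idea (all step and scenario names here
| are hypothetical), scenarios could be declared as sequences of
| named steps, which makes redundancy mechanically detectable --
| e.g. a scenario that is a strict prefix of another adds no
| coverage:
|
|     STEPS = {
|         "create_user":  lambda ctx: ctx.update(user="alice"),
|         "add_to_cart":  lambda ctx: ctx.update(cart=["widget"]),
|         "apply_coupon": lambda ctx: ctx.update(discount=0.1),
|         "checkout":     lambda ctx: ctx.update(order="placed"),
|     }
|
|     SCENARIOS = {
|         "cart_only":       ("create_user", "add_to_cart"),
|         "plain_checkout":  ("create_user", "add_to_cart",
|                             "checkout"),
|         "coupon_checkout": ("create_user", "add_to_cart",
|                             "apply_coupon", "checkout"),
|     }
|
|     def run(name):
|         ctx = {}
|         for step in SCENARIOS[name]:
|             STEPS[step](ctx)
|         return ctx
|
|     def redundant():
|         # Flag scenarios whose steps are a strict prefix of
|         # another scenario's; their coverage is implied by the
|         # longer one.
|         return [(a, b) for a in SCENARIOS for b in SCENARIOS
|                 if a != b and
|                 SCENARIOS[b][:len(SCENARIOS[a])] == SCENARIOS[a]]
|
| Here redundant() would flag "cart_only" as already covered by
| both checkout scenarios. A smarter tool could also diff
| assertions, not just steps, and emit the scenario table as
| documentation.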
| ozim wrote:
| Well, I can kind of imagine having 60% coverage where 10% of it
| is duplicated because no one checked.
|
| And there's really no tool that looks for that.
| hinkley wrote:
| > Passed in 9m50s
|
| The vast majority of build failures will not happen on the last
| test, and they also probably won't happen in the slowest set.
|
| Running all of your tests in parallel to try to honor the
| responsiveness requirements of CI is better than doing nothing.
| But consider that a red build carries much more information
| than a green build. A build that goes red in 90 seconds is
| better than one that goes red in 10 minutes. And red in 90 is
| enormously more informative than green in 10.
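|
| Test order matters a lot here. pytest, for instance, already
| ships `--ff` (run previously-failed tests first) and `-x` (stop
| on the first failure); a conftest.py hook can go further and
| sort by recorded duration so a broken build tends to go red in
| seconds. A sketch -- the .test_timings.json file is a
| hypothetical artifact written by a previous run:
|
|     # conftest.py
|     import json
|     import pathlib
|
|     TIMINGS = pathlib.Path(".test_timings.json")
|
|     def pytest_collection_modifyitems(config, items):
|         # Fastest-known tests first; tests with no recorded
|         # timing default to 0.0 so brand-new tests also run
|         # early.
|         if TIMINGS.exists():
|             timings = json.loads(TIMINGS.read_text())
|             items.sort(key=lambda i: timings.get(i.nodeid, 0.0))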
___________________________________________________________________
(page generated 2024-01-03 23:00 UTC)