[HN Gopher] Show HN: FlakyBot - identify and suppress flaky tests
___________________________________________________________________
Show HN: FlakyBot - identify and suppress flaky tests
Author : ankitdce
Score : 18 points
Date : 2021-10-28 17:32 UTC (5 hours ago)
(HTM) web link (www.flakybot.com)
(TXT) w3m dump (www.flakybot.com)
| tossaway9000 wrote:
| How about you fix the flaky tests? Am I insane for thinking that?
| The whole concept of "just reboot it" or "rerun it again" as a
| way of "fixing" the problem is at least one reason the modern
| world sits on a mountain of complete garbage software.
| ankitdce wrote:
| Haha, great point. What we have learned from our users is that
| "fixing" tests typically ends up as "delete most of them".
| Fixing tests can be a time-consuming effort.
|
| Another way to think about it: are flaky tests worth keeping at
| all? If a test fails often, does it really add value? We think
| it does - if you can distinguish flakiness from real failures
| and reduce the noise, those tests can still catch real
| failures.
| rio517 wrote:
| Wow. That sounds like really poor technical leadership. Fixing
| flaky tests (as opposed to deleting them) is indeed time-
| consuming, but it is a far cheaper choice than getting to the
| point where your test suite is untrustworthy.
|
| There may be a point where the cost of ownership for a
| specific test exceeds its utility, but the way to resolve
| that is usually to reevaluate your code and supporting tests.
| Suppressing flaky tests seems a very unwise choice.
|
| Perhaps under extreme circumstances and with unhealthy code
| bases there may be a case for this, but I struggle to imagine
| it.
| ankitdce wrote:
| That is a fair argument. Not all organizations have the
| bandwidth to measure and manage the stability of builds. Some
| companies build internal tooling / dev-productivity teams for
| this purpose. There are always good intentions behind
| commenting out a flaky test with the mindset of coming back to
| it, but in most cases it is also a very low-priority item when
| you have to ship new features.
|
| Fixing flaky tests can very commonly take longer than
| writing new tests.
| manacit wrote:
| This is how we think about testing for the most part - if a
| test is 'flaky', it gets looked at very quickly, and if it's
| not urgent (e.g. the behavior is fine and it's actually a
| flake), it's skipped in code.
|
| Once the test is skipped, a domain expert can come back and
| take a look and figure out why it was flaky, and fix it.
|
| If it's urgently broken (e.g. there is real impact), we treat
| it like an incident and gather people with the right context to
| fix it quickly.
|
| As long as everyone agrees to these norms, it's not a huge
| burden to keep this up with thousands of tests. People
| generally write their tests to be more resilient when they know
| they're on the hook for them not being flaky, and nobody stays
| blocked for long when they are permitted to skip a flaky test.
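|
| For illustration, with pytest (assuming that's the runner; the
| test name and ticket reference here are made up), "skipped in
| code" might look like this:
|
|     import pytest
|
|     # Skipped in code with a breadcrumb, so a domain expert can
|     # come back later, diagnose the flakiness, and re-enable it.
|     @pytest.mark.skip(reason="flaky: ordering assumption, TICKET-123")
|     def test_search_result_ranking():
|         ...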
| ankitdce wrote:
| Curious, how often do you see a flaky test in your system? In
| my past experience at a mid-size startup, we used to get a new
| flaky test almost weekly in a monorepo. We started flagging
| them as ignored (we created a separate tag for flaky tests),
| but later realized that the backlog of flaky tests to fix never
| came down.
|
| In another case we observed, devs just got used to rerunning
| the entire suite (the flakiness there was about 10-20%).
| ankitdce wrote:
| Hi HN, we are Spriha and Ankit, and we are building FlakyBot, a
| tool to automatically identify and suppress test flakiness so
| that developers can better trust their test results.
|
| Most CI systems leave it up to teams to manually identify and
| debug test flakiness. Since most CI systems today don't handle
| test reruns, teams just end up manually rerunning tests that
| are flaky. Ultimately, tribal knowledge builds up over time
| about which tests are known to be flaky, but the flakiness
| itself never gets addressed. Our solution, FlakyBot, removes
| one of the hardest parts of the problem: identifying flaky
| tests in the first place.
|
| We ingest test artifacts from CI systems and note when builds
| are healthy (so that we can mark them as "known-good builds" to
| use while testing for flakiness). This lets us automatically
| identify flakiness and proactively offer mitigation strategies,
| both short-term and long-term. You can read more about it here:
| https://ritzy-angelfish-3de.notion.site/FlakyBot-How-it-work...
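|
| To make the core idea concrete, here is a minimal sketch (not
| our actual implementation) of the detection step: a test is
| flagged as flaky if it both passes and fails across repeated
| runs of the same known-good commit. It assumes JUnit-style XML
| reports, and the artifact path is made up:
|
|     import xml.etree.ElementTree as ET
|     from collections import defaultdict
|     from pathlib import Path
|
|     def outcomes_by_test(report_paths):
|         """Collect pass/fail outcomes per test from JUnit XML files."""
|         outcomes = defaultdict(set)
|         for path in report_paths:
|             for case in ET.parse(path).getroot().iter("testcase"):
|                 name = f"{case.get('classname')}::{case.get('name')}"
|                 failed = (case.find("failure") is not None
|                           or case.find("error") is not None)
|                 outcomes[name].add("fail" if failed else "pass")
|         return outcomes
|
|     def flaky_tests(report_paths):
|         """A test with both outcomes on one commit is flaky."""
|         return sorted(name for name, seen in
|                       outcomes_by_test(report_paths).items()
|                       if seen == {"pass", "fail"})
|
|     # e.g. all reports produced by rerunning one healthy commit
|     print(flaky_tests(Path("artifacts/known-good").glob("**/*.xml")))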
|
| We're in the early stages of development and are opening up
| FlakyBot as a private beta for companies that have serious
| test-flakiness issues. The CI systems we currently support are
| Jenkins, CircleCI, and Buildkite, but if your team uses a
| different CI and has very serious test-flakiness problems, sign
| up anyway and we'll reach out. During the private beta, we'll
| work closely with our users to ensure their test-flakiness
| issues are resolved before we open it up more broadly.
| ncmncm wrote:
| What I want is a tool to make flaky tests fail reliably.
|
| They won't be fixed until they start actually preventing commits.
| If somebody deletes a test, that is on that person. I don't want
| a tool automatically suppressing testing.
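|
| One way to approximate this today, assuming pytest plus the
| pytest-repeat plugin (the test itself is made up): run a
| suspect test many times, so a flaky one fails the build with
| high probability.
|
|     import pytest
|
|     # Requires pytest-repeat (pip install pytest-repeat). Each
|     # of the 50 iterations is reported as its own test case, so
|     # a single bad iteration is enough to fail the build.
|     @pytest.mark.repeat(50)
|     def test_checkout_total():
|         ...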
| ankitdce wrote:
| That's very interesting feedback. We certainly don't have a way
| to forcibly simulate failures today.
|
| A related capability we are working on is to rerun the
| identified flaky tests up to X times until they pass. This
| depends on the capabilities of the test runner, so it will work
| with specific ones first (Cypress, pytest, etc.). That way
| flaky tests still have to pass instead of being suppressed.
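|
| For pytest specifically, the existing pytest-rerunfailures
| plugin already exposes this rerun behavior; a small
| illustration (the test name and delay are our own):
|
|     import pytest
|
|     # Requires pytest-rerunfailures (pip install
|     # pytest-rerunfailures). Retries up to 3 times, waiting 1s
|     # between attempts, before reporting a real failure.
|     @pytest.mark.flaky(reruns=3, reruns_delay=1)
|     def test_sometimes_times_out():
|         ...
|
|     # Or rerun every failing test in the suite: pytest --reruns 3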
| parthi wrote:
| We've been relying on manual testing so far. We're just starting
| to think about unit tests and integration tests. We don't know
| where to start. Would be cool if you could provide guidance on
| setting up good testing practices in the first place so that we
| avoid flaky tests altogether.
| ankitdce wrote:
| Yeah, flaky tests generally creep into a big service due to
| several issues. There are some best practices for avoiding
| them, though they require some discipline and good oversight!
| We wrote about it here:
| https://www.flakybot.com/blog/five-causes-for-flaky-tests
|
| That is by no means an exhaustive list, but our goal with
| FlakyBot is to get better at identifying root causes as we
| identify flakiness across systems.
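|
| One concrete illustration of how flakiness creeps in (our own
| example, not necessarily one from the post): asserting on an
| ordering the code never guaranteed. With Python's hash
| randomization, set iteration order for strings changes between
| interpreter runs, so the first test below passes on some CI
| runs and fails on others:
|
|     def get_tags():
|         # Stand-in for any unordered source (set, dict, DB query)
|         return {"alpha", "beta", "gamma"}
|
|     def test_tags_flaky():
|         # Iteration order varies between runs -> flaky
|         assert list(get_tags()) == ["alpha", "beta", "gamma"]
|
|     def test_tags_stable():
|         # Sorting makes the assertion deterministic
|         assert sorted(get_tags()) == ["alpha", "beta", "gamma"]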
___________________________________________________________________
(page generated 2021-10-28 23:02 UTC)