hngopher.com

       [HN Gopher] Is something bugging you?
       ___________________________________________________________________
        
       Is something bugging you?
        
       Author : wwilson
       Score  : 824 points
       Date   : 2024-02-13 12:13 UTC (10 hours ago)
        
 (HTM) web link (antithesis.com)
 (TXT) w3m dump (antithesis.com)
        
       | loadzero wrote:
       | Sounds a bit like jockey applied to qemu. Very neat indeed.
       | 
       | https://www.cs.purdue.edu/homes/xyzhang/spring07/Papers/HPL-...
        
         | voidmain wrote:
         | There's indeed a connection between record/replay and
         | deterministic execution, but there's a difference worth
         | mentioning, too. Both can tell you about the past, but only
         | deterministic execution can tell you about alternate histories.
         | And that's very valuable both for bug search (fuzzing works
         | better) and for debugging (see for example the graphs where we
         | show when a bug became likely to occur, seconds before it
         | actually occurred).
         | 
         | (Also, you won't be able to usefully record a hypervisor with
         | jockey or rr, because those operate in userspace and the actual
         | execution of guest code does not. You could probably record
         | software cpu execution with qemu, but it would be slow)
         | 
         | I'm a co-founder of Antithesis.
        
           | pfdietz wrote:
           | I assume deterministic execution also lets you do failing
           | test case reduction.
           | 
           | I've found this sort of high volume random testing w. test
           | case reduction is just a game changer for compiler testing,
           | where there's much the same effect at quickly flushing out
           | newly introduced bugs.
           | 
           | I like the subtle dig at type systems. :)
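
A minimal sketch of the failing test case reduction mentioned above, assuming a deterministic `fails` predicate. The names `reduce_failing_input` and `hypothetical_bug` are illustrative only; this is a simplified greedy relative of delta debugging, not the algorithm used by any tool named in this thread.

```python
from typing import Callable, List, Sequence


def reduce_failing_input(
    items: Sequence[int],
    fails: Callable[[List[int]], bool],
) -> List[int]:
    """Greedily drop chunks of the input while the failure still reproduces.

    Relies on `fails` being deterministic: the same input must always give
    the same verdict, otherwise the reduction chases noise.
    """
    current = list(items)
    assert fails(current), "the starting input must fail"
    chunk = max(len(current) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if fails(candidate):
                current = candidate  # keep the smaller failing input
            else:
                i += chunk           # this chunk is needed; move past it
        chunk //= 2                  # retry with finer-grained chunks
    return current


def hypothetical_bug(xs: List[int]) -> bool:
    """Stand-in for a real test: 'fails' when both 3 and 7 are present."""
    return 3 in xs and 7 in xs


reduced = reduce_failing_input(list(range(20)), hypothetical_bug)
print(reduced)  # prints [3, 7]
```

If the predicate were flaky, a chunk could be dropped on a lucky run and the reduced input would no longer fail reliably, which is why determinism matters here.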
        
           | loadzero wrote:
           | I have been down this road a little bit, applying the ideas
           | from jockey to write and ship a deterministic HFT system, so
           | I have some understanding of the difficulties here.
           | 
           | We needed that for fault tolerance, so we could have a hot
           | synced standby. We did have to record all inputs (and outputs
           | for sanity checking) though.
           | 
           | We did also get a good taste of the debugging superpowers you
           | mention in your blog article. We could pull down a trace from
           | a day's trading and replay on our own machines, and skip back
           | and forth in time and find the root cause of anything.
           | 
           | It sounds like what you have done is something similar, but
           | with your own (AMD64) virtual machine implementation, making
           | it fully deterministic and replayable, and providing useful
           | and custom hardware impls (networking, clock, etc).
           | 
           | That sounds like a lot of hard but also fun work.
           | 
           | I am missing something though, in that you are not using it
           | just for lockstep sync or deterministic replays, but you are
           | using it for fuzzing. That is, you are altering the replay
           | somehow to find crashes or assertion failures.
           | 
           | Ah, I think perhaps you are running a large number of sims
           | with a different seed (for injecting faults or whatnot) for
           | your VM, and then just recording that seed when something
           | fails.
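
The guess above (many runs, each with its own seed, keeping only the seed of a failing run) can be sketched as follows. This is a toy model under stated assumptions: `run_simulation`, the event names, and the "at most 5 messages in flight" invariant are all hypothetical, and nothing here reflects how Antithesis is actually implemented.

```python
import random
from typing import List, Optional


def run_simulation(seed: int, steps: int = 50) -> List[str]:
    """Run one simulated history; every random choice derives from `seed`."""
    rng = random.Random(seed)  # private RNG: no global or wall-clock state
    log: List[str] = []
    in_flight = 0
    for step in range(steps):
        event = rng.choice(["send", "deliver", "drop", "crash"])
        if event == "send":
            in_flight += 1
        elif event in ("deliver", "drop") and in_flight > 0:
            in_flight -= 1
        elif event == "crash":
            in_flight = 0  # a crash loses everything in flight
        log.append(f"{step}:{event}:{in_flight}")
        # Hypothetical invariant: never more than 5 messages in flight.
        if in_flight > 5:
            log.append("INVARIANT VIOLATED")
            break
    return log


def find_failing_seed(max_seeds: int) -> Optional[int]:
    """Return the first seed whose history violates the invariant, if any."""
    for seed in range(max_seeds):
        if run_simulation(seed)[-1] == "INVARIANT VIOLATED":
            return seed
    return None


seed = find_failing_seed(1000)
if seed is not None:
    # Only the seed needs to be stored; replaying it rebuilds the history.
    assert run_simulation(seed) == run_simulation(seed)
```

Storing a seed instead of a full input trace only works if the simulated world has no other source of randomness, which is the hard part the comment alludes to.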
        
       | Rygian wrote:
       | The writing is really enjoyable.
       | 
       | > Programming in this state is like living life surrounded by a
       | force field that protects you from all harm. [...] We deleted all
       | of our dependencies (including Zookeeper) because they had bugs,
       | and wrote our own Paxos implementation in very little time and it
       | _had no bugs_.
       | 
       | Being able to make that statement and back it by evidence must be
       | indeed a cool thing.
        
         | llm_trw wrote:
         | I have proved my code has no bugs according to the spec.
         | 
         | I do not make the claim my spec has no bugs.
        
           | yasuocidal wrote:
           | "It's not a bug, it's a feature"
        
           | coldtea wrote:
           | With formal proof systems, you can also claim that for your
           | spec.
        
             | svieira wrote:
             | A formal proof is only as good as what-you-are-proving maps
             | to what-you-intended-to-prove.
        
             | AlotOfReading wrote:
             | I've written formal proofs with bugs more than once.
             | Reality is much messier than you can encode into any proof
             | and there will ultimately be a boundary where the _real_
             | systems you're trying to build can still have bugs.
             | 
             | Formal verification is incredibly, amazingly good if you
             | achieve it, but it's not the same as "perfect".
        
             | llm_trw wrote:
             | No you can't.
             | 
             | You can claim that your spec doesn't violate some
             | invariants in a finite number of steps, you can't claim
             | that the spec contains all the invariants the real system
             | must have and that it doesn't violate them in that number
             | of steps + 1.
        
         | btrettel wrote:
         | The earliest that I've seen the attitude that one should
         | eliminate dependencies because they have more bugs than
         | internally written code was this book from 1995:
         | https://store.doverpublications.com/products/9780486152936
         | 
         | pp. 65-66:
         | 
         | > The longer I have computed, the less I seem to use Numerical
         | Software Packages. In an ideal world this would be crazy; maybe
         | it is even a little bit crazy today. But I've been bitten too
         | often by bugs in those Packages. For me, it is simply too
         | frustrating to be sidetracked while solving my own problem by
         | the need to debug somebody else's software. So, except for
         | linear algebra packages, I usually roll my own. It's
         | inefficient, I suppose, but my nerves are calmer.
         | 
         | > The most troubling aspect of using Numerical Software
         | Packages, however, is not their occasional goofs, but rather
         | the way the packages inevitably hide deficiencies in a
         | problem's formulation. We can dump a set of equations into a
         | solver and it will usually give back a solution without
         | complaint - even if the equations are quite poorly conditioned
         | or have an unsuspected singularity that is distorting the
         | answers from physical reality. Or it may give us an alternative
         | solution that we failed to anticipate. The package helps us
         | ignore these possibilities - or even to detect their occurrence
         | if the execution is buried inside a larger program. Given our
         | capacity for error-blindness, software that actually hides our
         | errors from us is a questionable form of progress.
         | 
         | > And if we do detect suspicious behavior, we really can't dig
         | into the package to find our troubles. We will simply have to
         | reprogram the problem ourselves. We would have been better off
         | doing so from the beginning - with a good chance that the
         | immersion into the problem's reality would have dispelled the
         | logical confusions before ever getting to the machine.
         | 
         | I suppose whether to do this depends on how rigorous one is,
         | how rigorous certain dependencies are, and how much time one
         | has. I'm not going to be writing my own database (too
         | complicated, multiple well-tested options available) but if I
         | only use a subset of the functionality of a smaller package
         | that isn't tested well, rolling my own could make sense.
        
           | voidmain wrote:
           | In the specific case in question, the biggest problem was
           | that dependencies like Zookeeper weren't compatible with our
           | testing approach, so we couldn't do true end to end tests
           | unless we replaced them. One of the nice things about
           | Antithesis is that because our approach to deterministic
           | simulation is at the whole system level, we can do it against
           | real dependencies if you can install them.
           | 
           | I was a co-founder of both FoundationDB and Antithesis.
        
           | spinningD20 wrote:
           | That tracks well (both the quotes and your thoughts).
           | 
           | One example that comes to mind where I want to roll my own
           | thing (and am in the process of doing so) is replacing our
           | ci/cd usage of jenkins that is solely for running qa
           | automation tests against PR's on github. Jenkins does way way
           | more than we need. We just need github PR
           | interaction/webhook, secure credentials management, and
           | spawning ecs tasks on aws...
           | 
           | Every time I force myself to update our jenkins instance, I
           | buckle up because there is probably some random plugin, or
           | jenkins agent thing, or ... SOMETHING that will break and
           | require me to spend time tracking down what broke and why.
           | 100% surface area for issues, whilst we use <5% of what
           | Jenkins actually provides.
        
       | flgstnd wrote:
       | the palantir testimonial on the landing page is funny
        
         | CiPHPerCoder wrote:
         | Even funnier if you manage to click "Declassify" :)
        
           | flgstnd wrote:
           | your IP address is probably in the palantir databases
           | anyway :o
        
         | zellyn wrote:
         | And if you highlight the redactions, it reads:
         | 
         | REDACTED REDACTED REDACTED REDACTED REDACTED REDACTED and
         | REDACTED REDACTED? REDACTED REDACTED Antithesis REDACTED
         | REDACTED REDACTED REDACTED, REDACTED REDACTED REDACTED
         | REDACTED. REDACTED REDACTED Palantir REDACTED REDACTED REDACTED
         | REDACTED REDACTED REDACTED REDACTED.
         | 
         | :-)
        
         | couchand wrote:
         | This sort of awkward joke made to cover for capitalist illogic
         | makes us all dumber.
        
       | larsiusprime wrote:
       | Was an eaaaaaaaarly tester for this. Pretty neat stuff.
        
       | indiv0 wrote:
       | I've been super interested in this field since finding out about
       | it from the `sled` simulation guide [0] (which outlines how
       | FoundationDB does what they do).
       | 
       | Currently bringing a similar kind of testing in to our workplace
       | by writing our services to run on top of `madsim` [1]. This lets
       | us continue writing async/await-style services in tokio but then
       | (in tests) replace them with a deterministic executor that
       | patches all sources of non-determinism (including dependencies
       | that call out to the OS). It's pretty seamless.
       | 
       | The author of this article isn't joking when they say that the
       | startup cost of this effort is monumental. Dealing with every
       | possible source of non-determinism, re-writing services to be
       | testable/sans-IO [2], etc. takes a lot of engineering effort.
       | 
       | Once the system is in place though, it's hard to describe just
       | how confident you feel in your code. Combined with tools like
       | quickcheck [3], you can test hundreds of thousands of subtle
       | failure cases in I/O, event ordering, timeouts, dropped packets,
       | filesystem failures, etc.
       | 
       | This kind of testing is an incredibly powerful tool to have in
       | your toolbelt, if you have the patience and fortitude to invest
       | in it.
       | 
       | As for Antithesis itself, it looks very very cool. Bringing the
       | deterministic testing down the stack to below the OS is awesome.
       | Should make it possible to test entire systems without wiring up
       | a harness manually every time. Can't wait to try it out!
       | 
       | [0]: https://sled.rs/simulation.html
       | 
       | [1]: https://github.com/madsim-rs/madsim?tab=readme-ov-
       | file#madsi...
       | 
       | [2]: https://sans-io.readthedocs.io/
       | 
       | [3]: https://github.com/BurntSushi/quickcheck?tab=readme-ov-
       | file#...
        
         | michael_j_ward wrote:
         | > Dealing with every possible source of non-determinism, re-
         | writing services to be testable/sans-IO [2], etc. takes a lot
         | of engineering effort.
         | 
         | Are there public examples of what such a re-write looks like?
         | 
         | Also, are you working at a rust shop that's developing this
         | way?
         | 
         | Final Note, TigerBeetle is another product that was written
         | this way.
        
           | wwilson wrote:
           | TigerBeetle is actually another customer of ours. You might
           | ask why, given that they have their own, very sophisticated
           | simulation testing. The answer is that they're so fanatical
           | about correctness, they wanted a "red team" for their own
           | fault simulator, in case a bug in their tests might hide a
           | bug in their database!
           | 
           | I gotta say, that is some next-level commitment to writing a
           | good database.
           | 
           | Disclosure: Antithesis co-founder here.
        
           | indiv0 wrote:
           | Sure! I mentioned a few orthogonal concepts that go well
           | together, and each of the following examples has a different
           | combination that they employ:
           | 
           | - the company that developed Madsim (RisingWave) [0] [1]
           | tries hardest to eliminate non-determinism with the broadest
           | scope (stubbing out syscalls, etc.)
           | 
           | - sled [3] itself has an interesting combo of deterministic
           | tests combined with quickcheck+failpoints test case auto-
           | discovery
           | 
           | - Dropbox [2] uses a similar approach but they talk about it
           | a bit more abstractly.
           | 
           | Sans-IO is more documented in Python [4], but str0m [5] and
           | quinn-proto [6] are the best examples in Rust I'm aware of.
           | Note that sans-IO is orthogonal to deterministic test
           | frameworks, but it composes well with them.
           | 
           | With the disclaimer that anything I comment on this site is
           | my opinion alone, and does not reflect the company I work at,
           | I do work at a Rust shop that has utilized these
           | techniques on some projects.
           | 
           | TigerBeetle is an amazing example and I've looked at it
           | before! They are really the best example of this approach
           | outside of FoundationDB I think.
           | 
           | [0]: https://risingwave.com/blog/deterministic-simulation-a-
           | new-e...
           | 
           | [1]: https://risingwave.com/blog/applying-deterministic-
           | simulatio...
           | 
           | [2]: https://dropbox.tech/infrastructure/-testing-our-new-
           | sync-en...
           | 
           | [3]: https://github.com/spacejam/sled
           | 
           | [4]: https://fractalideas.com/blog/sans-io-when-rubber-meets-
           | road...
           | 
           | [5]: https://github.com/algesten/str0m
           | 
           | [6]: https://docs.rs/quinn-
           | proto/0.10.6/quinn_proto/struct.Connec...
        
         | Voultapher wrote:
         | > you can test hundreds of thousands of subtle failure cases in
         | I/O, event ordering, timeouts, dropped packets, filesystem
         | failures, etc.
         | 
         | As cool as all this is, I can't help but wonder how often the
         | culture of micro-services and distributed computing is ill
         | advised. So much complexity I've seen in such systems boils
         | down to this: calling a "function" is async, depends on the OS, is
         | executed at some point or never, always returns a bunch of
         | strings that need to be parsed to re-enter the static type
         | system, which comes with its own set of failure modes. This
         | makes the seemingly simple task of abstracting logic into a
         | named component, aka a function, extremely complex. You don't
         | need to test for any of the subtle failures you mentioned if
         | you leave the logic inside the same process and just call a
         | function. I know monoliths aren't always a good idea or fit, at
         | the same time I'm highly skeptical whether the current
         | prevalence of service based software architectures is justified
         | and pays off.
        
       | pfdietz wrote:
       | This sounds quite cool. Although it doesn't say so, I imagine the
       | name is a riff on Hypothesis, the testing tool that performs
       | automatic test case simplification in a general way.
        
         | intuitionist wrote:
         | (I'm an early employee of what was, on my start date, just
         | called "Void Star.")
         | 
         | As I recall it, the name meant two things:
         | 
         | 1. Our "autonomous testing" approach is the opposite, or the
         | antithesis, of flaky and unreliable testing methodologies.
         | 
         | 2. You can think of our product as standing in dialectical
         | opposition to buggy customer software, pointing out its
         | internal contradictions (bugs) and together synthesizing a new,
         | bug-free software product. (N.b.: I've never actually read
         | Hegel.)
         | 
         | We did note the resonance with Hypothesis (a library I like a
         | lot!) at the time, but it was just an added bonus :).
        
       | vinnymac wrote:
       | I wonder if they are working on a time travel debugger. If it is
       | truly deterministic presumably you could visit any point in time
       | after a record is made and replay it.
        
         | wwilson wrote:
         | No comment. :-)
         | 
         | Disclosure: I am a co-founder of Antithesis.
        
           | bloopernova wrote:
           | It looks amazing, nice work!
           | 
           | Do you have any plans to let small open source teams use the
           | project for free? Obviously you have bills to pay and your
           | customers are happy to do that, but I was wondering if you'd
           | allow open source projects access to your service once a week
           | or something.
           | 
           | Partly because I want to play with this and I can't see my
           | employer or client paying for it! But also it fits neatly
           | into "DX", the Developer Experience, i.e. making the
           | development cycle as friction free for devs as possible. I'm
           | DevOps with a lifelong interest in UX, so DX is something I'm
           | excited about.
        
             | wwilson wrote:
             | Pricing suitable for small teams, and perhaps even a free
             | tier, is absolutely on the roadmap. We decided to build the
             | "hard", security-obsessed version of the infrastructure
             | first -- single-tenant, with dedicated and physically
             | isolated hardware and networking for every customer. That
             | means there's a bit of per-customer overhead that we have
             | to recoup.
             | 
             | In the future, we will probably have a multi-tenant
             | offering that's easier for open source projects to adopt.
             | In the meantime, if your project is cool and would benefit
             | from our testing, you can try to get our research team
             | interested in using it as part of the curriculum that makes
             | our platform smarter.
             | 
             | Disclosure: I'm an Antithesis co-founder.
        
             | nlavezzo wrote:
             | We've actually done quite a bit of testing on open source
             | projects as we've built this, and have discussed doing an
             | on-going program of testing open source projects that have
             | interested contributors. We'd probably find some
             | interesting things and could do some write-ups. Reach out
             | to us via our contact page or contact@antithesis.com and
             | let's chat.
        
         | _dain_ wrote:
         | [I work at Antithesis]
         | 
         | The system can certainly revisit a previous simulated moment
         | and replay it. And we have some pretty cool things using that
         | capability as a primitive. Check out the probability chart in
         | the bug report linked from the demo page:
         | https://antithesis.com/product/demo
        
           | xbar wrote:
           | Now I want a simulation-run replay scrubbing slider MIDI-
           | connected to my Pioneer DJ rig to scratch through our
           | troublesome tests as my homies push patched containers.
           | 
           | Seriously: impressive product revelation.
        
             | wwilson wrote:
             | Let's do it.
        
         | ismailmaj wrote:
         | That's exactly what Tomorrow Corporation uses for their hand
         | written game engine and compiler:
         | https://www.youtube.com/watch?v=72y2EC5fkcE
        
         | rdtsc wrote:
         | That's what rr-project does essentially?
        
         | acemarke wrote:
         | Exactly - that's what we've already built for web development
         | at https://replay.io :)
         | 
         | I did a "Learn with Jason" show discussion that covered the
         | concepts of Replay, how to use it, and how it works:
         | 
         | - https://www.learnwithjason.dev/travel-through-time-to-
         | debug-...
         | 
         | Not only is the debugger itself time-traveling, but those time-
         | travel capabilities are exposed by our backend API:
         | 
         | - https://static.replay.io/protocol/
         | 
         | Our entire debugging frontend is built on that API. We've also
         | started to build new advanced features that leverage that API
         | in unique ways, like our React and Redux DevTools integration
         | and "Jump to Code" feature:
         | 
         | - https://blog.replay.io/how-we-rebuilt-react-devtools-with-
         | re...
         | 
         | - https://blog.isquaredsoftware.com/2023/10/presentations-
         | reac...
         | 
         | - https://github.com/Replayio/Protocol-Examples
        
       | chrispy513 wrote:
       | This looks to be an incredible tool that was years in the making.
       | Excited to see where it goes from here!
        
       | User23 wrote:
       | Reminds me of the clever hack of playing back TCP dump logs from
       | prod on a test network, but dialed up. Neat.
       | 
       | Naturally I'd prefer professional programmers learn the cognitive
       | tools for manageably reasoning about nondeterminism, but they've
       | been around over half a century and it hasn't happened yet.
       | 
       | What's really interesting to me is that the simulation adequately
       | replicates the real network. One of the more popular criticisms
       | of analytical approaches is some variant of: yeah, but the real
       | network isn't going to behave like your model. Which by the way
       | is an entirely plausible concern for anyone who has messed with
       | that layer.
        
         | Rygian wrote:
         | What is interesting here is that the solution could fuzz-test
         | anything, including the network model, leading to failures even
         | more implausible than reality.
        
         | zamfi wrote:
         | > Naturally I'd prefer professional programmers learn the
         | cognitive tools for manageably reasoning about nondeterminism
         | 
         | It's not an either-or here, though. Part of the challenge is
         | you're not always thinking about all the non-determinisms in
         | your code, and the interconnections between your code and other
         | code (whose behavior you can sometimes only _assume_) can make
         | that close to impossible. Part of that is the "your model of
         | the network" critique, but also part of that is "your model of
         | how people will use your software" isn't necessarily correct
         | either.
        
       | kretaceous wrote:
       | This might be the best introduction post I've read.
       | 
       | Lays the foundation (get it?) for who the people are and what
       | they've built.
       | 
       | Then explains how the current thing they are building is a result
       | of the previous thing. It feels that they actually want this
       | problem solved for everyone because they have experienced how
       | good the solution feels.
       | 
       | Then tells us about the teams (pretty big names with complex
       | systems) that have already used it.
       | 
       | All of these wrapped in good writing that appeals to
       | developers/founders. Landing page is great too!
        
         | getoffmycase wrote:
         | The entire testing system they describe feels like something I
         | can strive towards too. They make you want their solution
         | because it offers a way of life and thinking and doing like
         | you've never experienced before
        
         | foobarqux wrote:
         | Except it doesn't actually explain what it does: Is it
         | fuzzing? Do you supply your own test cases? Is it testing
         | hardware non-determinism?
        
           | Aeolun wrote:
           | Yeah. I could figure out the global idea, but then the
           | mechanics of how it would actually work were very sparse.
        
           | wwilson wrote:
           | Post author here. Sorry it was vague, but there's only so
           | much detail you can go into in a blog post aimed at general
           | audiences. Our documentation (https://antithesis.com/docs/)
           | has a lot more info.
           | 
           | Here's my attempt at a more complete answer: think of the
           | story of the blind men and the elephant. There's a thing,
           | called fuzzing, invented by security researchers. There's a
           | thing, called property-based testing, invented by functional
           | programmers. There's a thing, called network simulation,
           | invented by distributed systems people. There's a thing,
           | called rare-event simulation, invented by physicists (!). But
           | if you squint, all of these things are really the same kind
           | of thing, which we call "autonomous testing". It's where you
           | express high-level properties of your system, and have the
           | computer do the grunt work to see if they're true. Antithesis
           | is our attempt to take the best ideas from each of these
           | fields, and turn them into something really usable for the
           | vast majority of software.
           | 
           | We believe the two fundamental problems preventing widespread
           | adoption of autonomous testing are: (1) most software is non-
           | deterministic, but non-determinism breaks the core feedback
           | loop that guides things like coverage-guided fuzzing. (2) the
           | state space you're searching is inconceivably vast, and the
           | search problem in full generality is insolubly hard.
           | Antithesis tries to address both of these problems.
           | 
           | So... is it fuzzing? Sort of, except you can apply it to
           | whole interacting networked systems, not just standalone
           | parsers and libraries. Is it property-based testing? Sort of,
           | except you can express properties that require a "global"
           | view of the entire state space traversed by the system, which
           | could never be locally asserted in code. Is it fault
           | injection or chaos testing? Sort of, except that it can use
           | the techniques of coverage guided fuzzing to get deep into
           | the nooks and crannies of your software, and determinism to
           | ensure that every bug is replayable, no matter how weird it
           | is.
           | 
           | It's hard to explain, because it's hard to wrap your arms
           | around the whole thing. But our other big goal is to make all
           | of this easy to understand and easy to use. In some ways,
           | that's proved to be even harder than the very hard
           | technological problems we've faced. But we're excited and up
           | for it, and we think the payoff could be big for our whole
           | industry.
           | 
           | Your feedback about what's explained well and what's
           | explained poorly is an important signal for us in this third
           | very hard task. Please keep giving it to us!
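
The "core feedback loop" of coverage-guided fuzzing referred to above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `target` is a made-up system under test, coverage is recorded by hand as branch labels, and the mutation strategy is deliberately naive. It is not a description of how Antithesis or any real fuzzer works internally.

```python
import random
from typing import List, Set, Tuple


def target(data: bytes, coverage: Set[str]) -> None:
    """Hypothetical system under test; records which branches it reached."""
    coverage.add("entry")
    if len(data) > 0 and data[0] == ord("b"):
        coverage.add("b")
        if len(data) > 1 and data[1] == ord("u"):
            coverage.add("bu")
            if len(data) > 2 and data[2] == ord("g"):
                coverage.add("bug")
                raise RuntimeError("deep bug reached")


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one byte or append one byte, chosen by the seeded RNG."""
    buf = bytearray(data)
    if buf and rng.random() < 0.5:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)


def fuzz(seed: int, iterations: int) -> Tuple[List[bytes], Set[str], bool]:
    """Keep inputs that reach new branches and mutate from those."""
    rng = random.Random(seed)
    corpus: List[bytes] = [b""]
    seen: Set[str] = set()
    crashed = False
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        reached: Set[str] = set()
        try:
            target(candidate, reached)
        except RuntimeError:
            crashed = True
        if not reached <= seen:  # feedback step: new coverage was observed
            seen |= reached
            corpus.append(candidate)
        if crashed:
            break
    return corpus, seen, crashed


corpus, seen, crashed = fuzz(seed=1, iterations=20000)
print(sorted(seen), crashed)
```

The feedback step only makes sense if running the same input twice reaches the same branches. If coverage varied from run to run, the corpus would fill up with inputs that looked interesting by accident, which is the problem with non-deterministic software that the comment describes.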
        
             | jldugger wrote:
             | I remember watching the Strange Loop video on your testing
             | strategy, and now I need to go back and relearn how it
             | differed from model checking (ie Promela or TLA+). Model
             | checking is probably the big QA story that tech companies
             | ignore because it requires dramatically more education,
             | especially from QA departments typically seen as "inferior"
             | to SWE.
        
               | rhodin wrote:
               | Video [0] of the Strange Loop talk [1].
               | 
               | [0] https://www.youtube.com/watch?v=4fFDFbi3toc
               | 
               | [1] https://thestrangeloop.com/2014/testing-distributed-
               | systems-...
        
             | randomdata wrote:
             | _> most software is non-deterministic_
             | 
             | Doesn't Antithesis rely on the fact that software is always
             | deterministic? Reproducibility appears to be its top
             | selling feature - something that wouldn't be possible if
             | software were non-deterministic.
        
               | wwilson wrote:
               | We can force any* software to be deterministic.
               | 
               | * Offer only good for x86-64 software that runs on Linux
               | whose dependencies you can install locally or mock. The
               | first two restrictions we will probably relax someday.
        
               | randomdata wrote:
               | Aren't you just 'forcing' determinism in the inputs,
               | relying on the software to be always deterministic for
               | the same inputs?
        
               | wwilson wrote:
               | Nope. We're emulating a deterministic computer, so your
               | software can't act nondeterministically if it tries.
        
               | randomdata wrote:
               | Right, by emulating a deterministic computer you can
               | ensure that the inputs to the software are always
               | deterministic - something traditional computing
               | environments are unable to offer for various reasons.
               | 
               | However, if we pretend that software was somehow able to
               | be non-deterministic, it would be able to evade your
               | deterministic computer. But since software is always
               | deterministic, you just have to guarantee determinism in
               | the inputs.
        
               | _dain_ wrote:
               | [I work at Antithesis]
               | 
               |  _> But since software is always deterministic, you just
               | have to guarantee determinism in the inputs._
               | 
               | This is technically correct, but that's a very load-
               | bearing "just". A _lot_ of things would have to count as
               | inputs. Think about execution time, for example. CPUs
               | don't execute at the same speed all the time because of
               | automatic throttling. Network packets have different
               | flight times. Threads and processes get scheduled a
               | little differently. In distributed/concurrent systems,
               | all this matters. If you run the same workload twice,
               | observable events will happen at different times and in
               | different orders because of tiny deviations in initial
               | conditions.
               | 
               | So yes, if you consider the time it takes to run every
               | single machine instruction as an "input", then software
               | is deterministic given the same inputs. But in the real
               | world that's not actionable. Even if you had all those
               | inputs, how are you going to pass them in? For all
               | intents and purposes most software execution is non-
               | deterministic.
               | 
               | The Antithesis simulation _is_ deterministic in this way
               | though. It is in charge of how long _everything_ takes in
               | "simulated time", right down to the running times of
               | individual CPU instructions. Everything observable from
               | within the simulation happens the exact same way, every
               | time. You can compare a memory dump at the same
               | (simulated) instant across two different runs and they
               | will be bit-for-bit identical.
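
A minimal sketch of the idea described above, assuming a toy discrete-event
simulator rather than anything Antithesis actually ships. Every delay comes
from one seeded PRNG and the clock is simulated, so the same seed always
yields the same event order. The names `run_simulation`, `n_nodes` and
`n_messages` are hypothetical.

```python
import heapq
import random


def run_simulation(seed, n_nodes=3, n_messages=5):
    """Toy simulation in which all timing is drawn from one seeded PRNG."""
    rng = random.Random(seed)  # the only source of variation between runs
    clock = 0                  # simulated time, never the wall clock
    queue = []                 # heap of (delivery_time, seq, sender, receiver)
    seq = 0                    # tie-breaker so equal times pop in a fixed order
    for sender in range(n_nodes):
        for _ in range(n_messages):
            receiver = rng.randrange(n_nodes)
            latency = rng.randint(1, 100)  # simulated network delay
            heapq.heappush(queue, (clock + latency, seq, sender, receiver))
            seq += 1

    trace = []
    while queue:
        # Advance simulated time to the next delivery and record it.
        clock, _, sender, receiver = heapq.heappop(queue)
        trace.append((clock, sender, receiver))
    return trace


first = run_simulation(seed=42)
second = run_simulation(seed=42)
```

Here `first` and `second` are identical traces. In this sketch, a different
seed plays the role of exploring an alternate history.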
        
               | randomdata wrote:
               | _> Think about execution time, for example._
               | 
               | Sure. A good example. Execution time - more accurately,
               | execution speed - isn't a property of software. For
               | example, as you point out yourself, you can alter the
               | execution speed without altering the software. It is,
               | indeed, an input.
               | 
               |  _> Even if you had all those inputs, how are you going
               | to pass them in?_
               | 
               | Well, we know how to pass them in non-deterministically.
               | That's how software is able to do anything.
               | 
               | Perhaps one could create a simulated environment that is
               | able to control all the inputs? In fact, I'm told there
               | is a company known as Antithesis working on exactly that.
        
               | mlhpdx wrote:
               | Oh, that sounds like a challenge...
               | 
               | Is the challenge here the same as with digital
               | simulations of electronic circuits? That is, at the end
               | of the day analog physics becomes confounding? Or are you
               | doing deterministic simulation of random RF noise as
               | well?
        
               | pokler wrote:
               | That point about dependencies -- how well does this play
               | with, or how easy is it to integrate with, a build system
               | like Bazel or Buck?
        
             | crdrost wrote:
             | This vaguely reminds me of Jefferson's "Virtual Time" paper
             | from 1985[1]. The underlying idea at the time didn't really
             | take off because it required, like Zookeeper, a greenfield
             | project: except that it kinda doesn't and today you could
             | imagine instrumenting an entire Linux syscall table and
             | letting any Linux container become a virtual time system --
             | but Linux didn't exist in 1985 and wouldn't be standard
             | until much later.
             | 
             | So Jefferson just says, let's take your I/O-ful process,
             | split it into a message-passing actor model, and monitor all the
             | messages going in and coming out. The messages coming out,
             | they won't necessarily _do what they're supposed to do_
             | yet, they'll just be recorded with a plus sign and a
             | virtual timestamp, and by assumption eventually you'll
             | block on some response. So we have a bunch of recorded
             | message timestamps coming in, we have your recorded
             | messages going out.
             | 
             | Well, there's a problem here, which is that if we have
             | multiple actors we may discover that their timestamps have
             | traveled out-of-order. You sent some message at t=532 but
             | someone actually sent you a message at t=231 that you might
             | have selected instead of whatever you actually selected to
             | send the t=532 message. (For instance in the OS case, they
             | might have literally sent a SIGKILL to your process and you
             | might not have sent anything after that.) That's what the
             | plus sign is for, indirectly: we can restart your process
             | from either a known synchronization state or else from the
             | very beginning, we know all of its inputs during its first
             | run so we have "determinized" it up past t=231 to see what
             | it does now. Now, it sends a new message at say t=373. So
             | we use the opposite of +, the minus sign, to send to all
             | the other processes the "undo" message for their t=532
             | message, this removes it from their message buffer: that
             | will never be sent to them. And if they haven't hit that
             | timestamp in their personal processing yet, no further
             | action is needed, otherwise we need to roll them back too.
             | Doing so you determinize the whole networked cluster.
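
The rollback step described above can be sketched roughly as follows. This
is a simplified illustration, not Jefferson's full algorithm: there is a
single process, its state is just a running sum, every handled message emits
one output stamped one tick later, and a rollback restarts from the very
beginning instead of from a checkpoint. The `Process` class and its methods
are hypothetical names.

```python
class Process:
    """Toy virtual-time process whose state is a running sum of payloads."""

    def __init__(self):
        self.log = []    # (timestamp, payload, output) for each handled message
        self.state = 0

    def _handle(self, ts, payload):
        self.state += payload
        output = (ts + 1, self.state)  # a "+" message sent to other processes
        self.log.append((ts, payload, output))
        return output

    def receive(self, ts, payload):
        """Handle one message; return (positive_messages, anti_messages)."""
        if not self.log or ts > self.log[-1][0]:
            return [self._handle(ts, payload)], []

        # Straggler: it should have been handled before work we already did.
        kept = [entry for entry in self.log if entry[0] < ts]
        undone = [entry for entry in self.log if entry[0] >= ts]
        anti = [entry[2] for entry in undone]  # "-" messages cancelling outputs

        # Restart from the beginning and replay the inputs that still stand.
        self.log, self.state = [], 0
        for t, p, _ in kept:
            self._handle(t, p)

        # Re-run the straggler and the undone inputs in timestamp order.
        redo = sorted([(ts, payload)] + [(t, p) for t, p, _ in undone])
        positive = [self._handle(t, p) for t, p in redo]
        return positive, anti


proc = Process()
proc.receive(100, 1)
proc.receive(532, 10)
positive, anti = proc.receive(231, 5)  # arrives late, forces a rollback
```

After the late message, `anti` holds the output that was sent at the wrong
point in virtual time, and `positive` holds the corrected outputs. A receiver
that has already consumed a cancelled message would have to roll back in the
same way.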
             | 
             | The only other really modern implementation of these older
             | ideas that I remember seeing was Haxl[2], a Haskell library
             | which does something similar but rather than using a
             | virtual time coordinate, it just uses a process-local
             | cache: when you request any I/O, it first fetches from the
             | cache if possible and then if that's not possible it goes
             | out, fetches the data, and then caches it. As a result you
             | can just offer someone a pre-populated cache which, with
             | these recorded inputs, will regenerate the offending stack
             | trace deterministically.
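
A rough sketch of the cache-replay idea in Python. This is not Haxl's API:
`RecordingFetcher`, `handler`, `live_fetch` and `no_network` are hypothetical
names, and the "network" is a dictionary lookup. The point is only that a
run driven by a pre-populated cache needs no real I/O and repeats the
original run exactly.

```python
class RecordingFetcher:
    """Serve I/O requests from a cache, falling back to a real fetch."""

    def __init__(self, fetch, cache=None):
        self._fetch = fetch
        self.cache = dict(cache or {})

    def get(self, key):
        if key not in self.cache:
            # Cache miss: do the real I/O and remember the answer.
            self.cache[key] = self._fetch(key)
        return self.cache[key]


def handler(io):
    """Application logic whose only inputs arrive through io.get()."""
    user = io.get("user:1")
    balance = io.get("balance:" + user)
    return f"hello {user}, balance {balance}"


def live_fetch(key):
    # Stands in for a real network or database call.
    return {"user:1": "alice", "balance:alice": 42}[key]


def no_network(key):
    raise RuntimeError(f"unexpected I/O for {key!r} during replay")


live = RecordingFetcher(live_fetch)
first = handler(live)

# Replay: hand over the recorded cache and forbid any real I/O.
replay = RecordingFetcher(no_network, cache=live.cache)
second = handler(replay)
```

If `handler` had raised during the live run, replaying it against the same
cache would raise at the same place.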
             | 
             | 1: https://dl.acm.org/doi/10.1145/3916.3988
             | 
             | 2: https://github.com/facebook/Haxl
        
             | kodablah wrote:
             | Has any thought been given to repurposing this
             | deterministic computer for more than just autonomous
             | testing/fuzzing? For example, given an ability to
             | record/snapshot the state, resumable software (i.e. durable
             | execution)?
        
               | wwilson wrote:
               | Somebody once suggested to me that this could be very
               | handy for the reproducible builds folks. I'm sure that now
               | that we're out in the open, lots of people will suggest
               | great applications for it.
               | 
               | Disclosure: Antithesis co-founder.
        
               | cperciva wrote:
               | My favourite application for "deterministic computer" is
               | creating a cluster in order to have a virtual machine
               | which is resilient to hardware failure. Potentially even
               | "this VM will keep running even if an entire AWS region
               | goes down" (although that would add significant latency).
        
             | ajb wrote:
             | This is interesting - it is kind of picking a fight with
             | SaaS/cloud providers though, as that is the one kind of
             | software you won't be able to import into your environment:
             | not because it can't do the job, but because you don't have
             | the code. So this would create an incentive to go back to
             | PaaS.
             | 
             | It's definitely true though that a big problem with backend
             | is that you can't easily treat it as a whole system for
             | test purposes.
        
               | pkghost wrote:
               | > it is kind of picking a fight with SaaS/cloud providers
               | 
               | or starting a bidding war
        
               | ajb wrote:
               | how so?
        
             | EasyMark wrote:
             | thanks, I'll dig in. I'm a very visual person and
             | charts/diagrams/flows always help my grasp of something
             | more than a wall of text. Maybe include some of those in
             | there when you get the time?
        
             | criddell wrote:
             | > turn them into something really usable for the vast
             | majority of software
             | 
             | Would it work for debugging, say, Notepad on Windows?
        
             | amw-zero wrote:
             | Is there more info on how Antithesis solves problem number
             | 2 (large state spaces)? I understand the fuzzing / workload
             | generation part well, but there's so many different state
             | space reduction techniques that I don't know what
             | Antithesis is doing under the hood to combat that.
        
             | gitgud wrote:
             | _> Your feedback about what's explained well and what's
             | explained poorly is an important signal for us in this
             | third very hard task. Please keep giving it to us!_
             | 
             | It's hard to understand these complex concepts via language
             | alone.
             | 
             | Diagrams would be a huge help to understand how this system
             | of testing works compared to existing testing concepts
        
           | kretaceous wrote:
           | Sure, it doesn't go into details. And that is exactly why I
           | termed it an excellent _introduction_ and a sales pitch.
           | 
           | I haven't heard of deterministic testing before. Nor have I
           | heard of FoundationDB or the related things. And I went from
           | knowing zero things about them to getting impressed and
           | interested. This led me to go into their docs, blogs, landing
           | page, etc. to know more.
        
         | k__ wrote:
         | Did you read a different article than me?
         | 
         | The linked article is 3/4 about some history and rationale
         | before it actually tells you what they build.
         | 
         | It's like those pesky recipe blogs that tell you about the
         | author's childhood, when you just want to make vegan pancakes.
        
         | chinchilla2020 wrote:
         | It seems like marketing copy. Not a technical blog post.
         | 
         | It would be nice to see some actual use cases and examples.
         | 
         | Instead, the writer just name-dropped a few big companies and
         | claimed to have a revolutionary product that works magically.
         | Then included the typical buzzwords like '10x programmer' and
         | 'stealth mode'. The latter doesn't make sense because they also
         | name-drop clients.
        
       | sneak wrote:
       | Imagine being proud of working for Palantir.
        
         | mgfist wrote:
         | Your life depends on lots of unsavory tasks.
        
           | sneak wrote:
           | Yes, like sewage pipe maintenance. Not data mining to figure
           | out who to assassinate without trial.
           | 
           | Using the "unsavory" euphemism for unethical and illegal
           | violence is somewhat of a deception, is it not?
        
       | jitl wrote:
       | To me this is very reminiscent of time travel debugging tools
       | like the one used for Firefox's C++ code, rr / Pernosco:
       | https://pernos.co/
        
         | rvnx wrote:
         | Seems more like a fuzzer for Docker images.
         | 
         | Like this:
         | https://docs.gitlab.com/ee/user/application_security/coverag...
         | 
         | It won't tell you whether the software works correctly, it will
         | just tell you if it raises an exception or crashes.
         | 
         | Put a fuzzer on Chrome for example, you won't catch most of the
         | issues it has, though Chrome actually has tons of bugs and
         | issues, but you _may_ find security issues if you devote a big
         | enough budget to run your fuzzer long enough to cover all
         | the branches.
         | 
         | So it's good in the case where you use "exceptions as tests",
         | where any minor out-of-scope behavior raises an exception and
         | all the cases are pre-planned (a bit like you baked-in runtime
         | checks, and the fuzzer explores them)
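
The "exceptions as tests" point can be illustrated with a small sketch. It
assumes a hypothetical `withdraw` function with a baked-in runtime check and
a naive random fuzzer named `fuzz`; real fuzzers are coverage-guided and far
more sophisticated.

```python
import random


def withdraw(balance, amount):
    """Buggy on purpose: nothing prevents an overdraft."""
    new_balance = balance - amount
    # Baked-in runtime check: the only thing the fuzzer can observe failing.
    assert new_balance >= 0, "invariant violated: negative balance"
    return new_balance


def fuzz(target, trials=1000, seed=0):
    """Call target with random arguments; return the first failure found."""
    rng = random.Random(seed)
    for _ in range(trials):
        args = (rng.randint(0, 100), rng.randint(0, 100))
        try:
            target(*args)
        except Exception as exc:
            return args, exc
    return None


failure = fuzz(withdraw)
```

The fuzzer only reports inputs that trip the assertion. If `withdraw`
silently returned a wrong but non-negative balance, nothing would be flagged,
which is the limitation the comment describes.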
        
           | jitl wrote:
           | The similarity is about obtaining determinism through
           | something like a hypervisor. The way rr works is it basically
           | writes down the result of all the system calls, etc,
           | basically everything that ended up on the Turing machine's
           | tape, so you can rewind and replay.
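
As a loose illustration of record/replay (not how rr is implemented, since
rr works at the level of system calls and hardware counters), the sketch
below writes the result of every nondeterministic call to a tape and reads
the results back on replay. `Tape` and `program` are hypothetical names.

```python
import random
import time


class Tape:
    """Record results of nondeterministic calls, or replay a recording."""

    def __init__(self, recorded=None):
        self.replaying = recorded is not None
        self.entries = list(recorded) if self.replaying else []
        self._pos = 0

    def call(self, fn, *args):
        if self.replaying:
            # Replay mode: never run fn, just return what was recorded.
            value = self.entries[self._pos]
            self._pos += 1
            return value
        value = fn(*args)
        self.entries.append(value)
        return value


def program(tape):
    """Everything nondeterministic goes through the tape."""
    started = tape.call(time.time)
    roll = tape.call(random.randint, 1, 6)
    return f"started at {started}, rolled {roll}"


recording = Tape()
original = program(recording)
replayed = program(Tape(recorded=recording.entries))
```

`replayed` matches `original` exactly, but only the recorded history can be
revisited. Changing an input means making a new recording.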
        
       | intrasight wrote:
       | >It's pretty weird for a startup to remain in stealth for over
       | five years.
       | 
       | Not really. I have friends who work for a startup that's been in
       | "stealth" for 20 years. Stealth is a business model not a phase.
        
       | jimbokun wrote:
       | > The biggest effect was that it gave our tiny engineering team
       | the productivity of a team 50x its size.
       | 
       | I feel like the idea of the legendary "10x" developer has been
       | bastardized to just mean workers who work 15 hours a day 6.5 days
       | a week to get something out the door until they burn out.
       | 
       | But here's your real 10x (or 50x) productivity. People who
       | implement something very few people even considered or understood
       | to be possible, which then gives amazing leverage to deliver
       | working software in a fraction of the time.
        
         | FirmwareBurner wrote:
         | Your definition is also vague. Someone still needs to do the
         | legwork. One man armies who can do everything themselves don't
         | really fit in standardized teams where everything is
         | compartmentalized and work divided and spread out.
         | 
         | They work best on their own projects with nobody else in their
         | way, no colleagues, no managers, but that's not most jobs. Once
         | you're part of a team, you can't do too much work yourself no
         | matter how good you are, as inevitably the other slower/weaker
         | team members will slow you down as you'll fight dealing with
         | the issues they introduce into the project or the issues from
         | management, so every team moves at the speed of the lowest
         | common denominator no matter their rockstars.
        
           | jollyllama wrote:
           | That rings true and is probably why the 10x engineers I have
           | seen usually work on devops or modify the framework the other
           | devs are using in some way. For example, an engineer who
           | speeds up a build or test suite by an order of magnitude is
           | easily a 10x engineer in most organizations, in terms of man
           | hours saved.
        
             | FirmwareBurner wrote:
             | _> For example, an engineer who speeds up a build or test
             | suite by an order of magnitude is easily a 10x engineer in
             | most organizations, in terms of man hours saved._
             | 
             | Yeah but this isn't something scalable that can happen
             | regularly as part of your job description. Like most
             | jobs/companies don't have so many low hanging fruits to
             | pick that someone can speed up a build by orders of magnitude
             | on a weekly basis. It's usually a one time thing. And one
             | time things don't usually make you a 10x dev. Maybe you
             | just got lucky once to see something others missed.
             | 
             | And often times at big places most people know where the
             | low hanging fruits are and can fix them, but management,
             | release schedules and tech debt are perpetually in the way.
             | 
             | IMHO what makes you a 10x dev is you always know how to
             | unblock people no matter the issue so that the project is
             | constantly smooth sailing, not chasing orders of magnitude
             | improvements unicorns.
        
               | tranceylc wrote:
               | Does anyone else feel like people follow these sort of
               | industry pop-culture terms a bit too intensely? What I
               | mean is that the existence of the term tends to bring out
               | people trying to figure who that might be, as if it has
               | to be 100% true.
               | 
               | I personally think that some people can provide "10x"
               | (arbitrary) the value on occasion, like the low hanging
               | fruit you said. I also believe some people are slightly
               | more skilled than others, and get more results out of
               | their work. That said, there are so many ways for
               | somebody to have an impact that doesn't have to be
               | immediate, that I find the term itself too prevalent.
        
               | lukan wrote:
               | "Does anyone else feel like people follow these sort of
               | industry pop-culture terms a bit too intensely? "
               | 
               | Agreed, there is too much effort going into the
               | "superstars" theme, but there are definitely people who
               | get 10x done in the same time as others.
        
               | t-3 wrote:
               | Yep. No matter what you're doing, some people are more
               | productive than others. Often it's a matter of experience
               | and practice, sometimes ability to focus, sometimes
               | motivation, rarely it's a lack or surplus of inherent
               | ability. Using people effectively in the context of a
               | team all depends on the skill of the manager though.
        
               | jollyllama wrote:
               | It really does depend on where you work. The order of
               | magnitude improvements I'm describing involved
               | interdisciplinary expertise involving both bespoke
               | distributed build systems and assembly language. They're
               | not unicorns, they do exist, but they are very rare and
               | most engineers just aren't going to be able to find them,
               | even with infinite time. Hence why a 10x engineer is so
               | valuable and not everyone can be one. I myself am
               | certainly not one, in most contexts.
        
               | vdqtp3 wrote:
               | > Like most jobs/companies don't have so many low hanging
               | fruits to pick that someone can speed up a build by orders
               | of magnitude on a weekly basis
               | 
               | You and I have worked at very different organizations.
               | Everywhere I've been has had insane levels of
               | inefficiency in literally every process.
        
               | ejb999 wrote:
               | same here - it is especially bad in huge companies, the
               | inefficiencies and waste are legendary.
        
               | FirmwareBurner wrote:
               | _> insane levels of inefficiency in literally every
               | process._
               | 
               | In processes yes, not in code, and solo 10x devs alone
               | can't fix broken processes as those are the effect of
               | broken management and engineering culture.
               | 
               | People know where the inefficiencies are, but management
               | doesn't care.
        
           | theamk wrote:
           | Nothing wrong with "one man armies" in the team context.
           | There is a long list of tasks that needs to be done. Over the
           | same time period, one person will do 5 complex tasks (with
           | tests and documentation), while the other will do just 1
           | task, and then spend even more time redoing it properly.
           | 
           | Over time this produces funny effects, like super-big 20
           | point task done in a few days because the wrong person started
           | working on it.
        
         | giantg2 wrote:
         | I'm tired of hearing about 10x engineers. I just want to be a
         | good 1x engineer. Or good at anything in life really.
        
           | datameta wrote:
           | The truest 10x engineer I ever encountered was a memory
           | firmware guy with ASIC experience who absolutely made sure to
           | log off at 5 every day after really putting in the work.
           | Go-to guy for all parts of the codebase, even that which he
           | didn't expressly touch.
        
             | harryvederci wrote:
             | > I'm tired of hearing about 10x engineers.
             | 
             | "The truest 10x engineer I ever encountered was..."
        
           | JimDabell wrote:
           | The "10x engineer" comes from the observation that there is a
           | 10x difference in productivity between the best and the worst
           | engineers. By saying that you want to be a 1x engineer,
           | you're saying you want to be the least productive engineer
           | possible. 1x is not the average, 1x is the worst.
        
             | mathgradthrow wrote:
             | the worst engineer certainly has negative productivity, so
             | I'm not sure that your explanation can possibly be the
             | correct one.
        
               | JimDabell wrote:
               | I'm explaining what the terms "10x" and "1x" mean, not
               | asserting that the original observation is correct under
               | all circumstances.
        
               | mathgradthrow wrote:
               | I believe the original was for an entire "organization's"
               | performance, and was also done in 1977. Since they are
               | averages, it makes "sense" to conclude that the best of a
               | good team is 10x better than the average of the worst
               | team. Not really what the experiment concludes but what
               | can you do.
        
               | JimDabell wrote:
               | The first was 1968, but there have been more studies
               | since.
               | 
               | https://www.construx.com/blog/the-origins-of-10x-how-
               | valid-i...
        
               | randomdata wrote:
               | Except you haven't explained it at all. Sackman,
               | Erickson, and Grant found that some developers were able
               | to complete what was effectively a programming contest in
               | a 10th of the time of the slowest participants. This is
               | the origin of the 10x developer idea.
               | 
               | You, on the other hand, are claiming that 10x engineers
               | are 10 times more productive than the worst engineers.
               | Completing a programming challenge in a 10th of the time
               | is not the same as being 10 times more productive, and
               | obviously your usage can't be an explanation, even as one
               | you made up on the spot, as the math doesn't add up.
        
               | JimDabell wrote:
               | That was designed as a repeatable experiment, which seems
               | entirely reasonable when you want to conduct a study. Why
               | are you characterising that as "a programming contest"?
               | That seems like an uncharitably distorted way of
               | describing a study.
               | 
               | That study also does not exist in isolation:
               | 
               | https://www.construx.com/blog/the-origins-of-10x-how-
               | valid-i...
        
               | randomdata wrote:
               | _> Why are you characterising that as "a programming
               | contest"?_
               | 
               | Because it was? Do you have a better way to repeatedly
               | test _performance_? And yes, the study's intent was to
               | look at _performance_, not productivity. It's even right
               | in the title. Not sure where you dreamed up the latter.
        
             | randomdata wrote:
             | I'm not sure your math works.
             | 
             | What we do know is that the worst engineers provide
             | negative productivity. If 1x is the worst engineer, then
             | let's for the sake of discussion denote x as -1 in order
             | for the product to be negative. Except that means the 10x
             | engineer provides -10 productivity, actually making them
             | the worst engineer. Therein lies a conflict.
             | 
             | What we also know is that the best engineer has positive
             | productivity, so that means the multiplicand must always be
             | positive. Which means that it is the multiplier that must
             | go negative, meaning that a -1x and maybe even a -10x
             | engineer exists.
        
               | JimDabell wrote:
               | You are arguing against the idea that there is a factor
               | of ten difference in productivity between the best and
               | the worst engineers. That's fine if you want to do that,
               | but that's explicitly where the term "10x engineer" comes
               | from and what defines its meaning. So if you disagree
               | with the underlying concept, there is no way for you to
               | use terms like "[n]x engineer" coherently since you
               | disagree with its most fundamental premise. You certainly
               | shouldn't reinvent different meanings for these terms.
        
               | moritzwarhier wrote:
               | Thank you. This sounds so trivial at first, but your
               | reductio ad absurdum at the beginning of your comment
               | really nails it.
               | 
               | Throw into the mix the fact that productivity is hard to
               | measure as soon as more than one person works on
               | something and that doesn't even begin to consider the
               | economical aspects of software.
               | 
               | And even when ignoring this point, there's that pesky
               | short-term vs long-term thing.
               | 
               | Also, how do you define the term "productivity"? I was
               | assuming that you mean something along the lines of
               | (indirect, if employed) monetary output.
        
               | margalabargala wrote:
               | You're not _wrong_, but I think you may be treating
               | something as literal math, when it is in fact idiomatic
               | labels used to express trends.
        
               | randomdata wrote:
               | The problem here is the introduction of productivity.
               | 
               | The 10x developer originated from a study that measured
               | _performance_. The 10x developer being able to do a task
               | in a 10th of the time is quite conceivable and reflects
               | what the study found. I'm sure we've all seen a
               | developer take 10 hours to do a job that would take
               | another developer just 1 hour. Nobody is doing it in
               | negative hours, so the math works.
               | 
               | But performance is not the same as productivity.
        
             | hattmall wrote:
             | Hmm, I never thought of it that way. I just heard 10x
             | employees and fit it to what I knew. Which is that 90% of
             | the work is accomplished by about 10% of workers. The other
             | 90% really only get 10% done. So most developers are
             | somewhere on a scale of 0.1 - 1. With 1 being a totally
             | competent and good developer. The 10x people are just
             | different though, it's like a pro-athlete to a regular
             | player. It's not unique to software development, though it
             | may stand out and be sought after more. I've noticed it in
             | pretty much every industry. Some people are just able to
             | achieve flow state in their work and be vastly more
             | productive than others, be it writing code or laying sod. I
             | don't find that there's a lot of in between 1 and 10
             | though.
        
             | SkyBelow wrote:
             | Even if this was the origin of the term, it still doesn't
             | make sense because the best engineers can solve problems
             | the worst would never be able to solve. The difference
             | between the best and worst is much more than 10x the worst.
             | Maybe the worst who meets certain minimums at a company,
             | but then the best would also be limited by those willing to
             | work for what the company pays, and I hypothesize that the
             | minimums of the lower bound and the maximums of the upper
             | bound are correlated.
        
               | JimDabell wrote:
               | It sounds like you disagree with the concept of a 10x
               | engineer then. In which case you should avoid using the
               | term, rather than making up a new definition.
        
               | robocat wrote:
               | Concepts and words change meaning and sometimes we all
               | need to accept that the popular meaning is not the
               | definition we use.
               | 
               | This is especially common when dealing with historical or
               | academic definitions versus common modern usage.
               | "Evolution" particularly annoys me.
               | 
               | You should avoid using the term, rather than using a
               | definition at odds with common usage. Your usage is
               | confusing - and that is why you are getting push-back.
               | 
               | The definition you have given is nonsensical - it can't
               | be consistent over time or between companies because it
               | depends on finding a minimum in a group. And a value that
               | is strongly dependent on the worst developer is useless
               | because it mostly measures how bad the worst developer is
               | - it doesn't say anything about how good the best
               | developer is.
        
           | tnel77 wrote:
           | It depends on the day if I feel like a 2x or a 0.1x engineer.
           | Keep at it. You are not alone!
        
           | loeg wrote:
           | Spend less time on HN and you might get more done.
        
             | tomsthumb wrote:
             | Do you want to read hacker news or be hacker news?
        
           | Xeyz0r wrote:
           | You took the words right out of my mouth
        
           | AlienRobot wrote:
           | Do 10x engineers get 10x the wages? Somehow I feel being
           | exceptionally better than other engineers is just unfair to
           | both of you and the ones worse than you. I wouldn't want to
           | be a 10x either, I'd rather just be a normal engineer.
        
             | tantaman wrote:
             | Meta compensates 10x types very well. 3x bonus multipliers,
             | additional equity that can range from 100k-1m+, and level
             | increases are a huge bump to comp (https://www.levels.fyi/)
        
               | chinchilla2020 wrote:
               | I have many Meta colleagues I've worked with in the past.
               | All of them are well compensated but none of them were
               | outstanding, or 10x.
        
           | ponector wrote:
           | Once you have a few years of experience, you don't need to be
           | 10x to have success. You can be a reliable 1.3x, a little bit
           | better than your teammates.
           | 
           | In the end it doesn't matter, whole team could be laid off at
           | once.
        
           | hyperthesis wrote:
           | I think getting something worthwhile done is a better focus
           | (actually quite hard!), and naturally increases your
           | productivity as a side-effect.
           | 
           | Productivity has no inherent value - like efficiency and
           | perfection, it is necessarily of something else. Its value is
           | entirely derived.
        
         | didgetmaster wrote:
         | It seems like the industry would get a lot more 10x behavior if
         | it was recognized and rewarded more often than it currently
         | is. Too often, management will focus more on the guy who
         | works 12 hour days to accomplish 8 hours of real work than the
         | guy who gets the same thing accomplished in an 8 hour day.
         | Also, deviations from 'normal' are frowned upon. Taking time to
         | improve the process isn't built into the schedule; so taking
         | time to build a wheelbarrow is discouraged when they think you
         | could be hauling buckets faster instead.
        
           | Terretta wrote:
           | It's almost impossible to get executives to think in return
           | on equity ("RoE") for the future instead of "costs" measured
           | in dollars and cents last quarter.
           | 
           | Which is weird, since so many executives are working in a VC-
           | funded environment, and internal work should be "venture
           | funded" as well.
        
           | happytiger wrote:
           | That's because most executives can't understand technology
           | deeply enough to know the difference.
        
             | didgetmaster wrote:
             | Even when they are smart enough to know, they seem to have
             | very short memories. While I don't consider myself to be a
             | 10x engineer; I have certainly done a number of 10x things
             | over my career.
             | 
             | I worked for a company where I almost single handedly built
             | a product that resulted in tens of millions of dollars in
             | sales. I got a nice 'atta boy' for it, but my future ideas
             | were often overridden by someone in management who 'knew
             | better'. After the management changed, I found myself in a
             | downsizing event once I started criticizing them for a lack
             | of innovation.
        
               | KuriousCat wrote:
               | This is the sad part of it, many people without core
               | competence end up in "leadership" positions and remove
               | any "perceived" threats to their authority. I believe
               | part of it is due to the absence of leadership training
               | in the engineering curriculum. Colleges should encourage
               | engineers to take up a few leadership courses and get them
               | trained on things like Influence and Power.
        
           | sangnoir wrote:
           | >It seems like the industry would get a lot more 10x behavior
           | if it was recognized and rewarded more often than it
           | currently is
           | 
           | I'd be happier if industry cared more for team productivity -
           | I have witnessed how rewarding "10x" individuals may lead to
           | perverse results on a wider scale, a la Cobra Effect. In one
           | insidious case, our management-enabled, long-tenured "10x"
           | rockstar fixed all the big customer-facing bugs quickly, but
           | would create multiple smaller bugs and regressions for the 1x
           | developers to fix while he moved to the next big problem
           | worthy of his attention. Everyone else ended up being 0.7x -
           | which made the curse of an engineer look even more productive
           | comparatively!
           | 
           | Because he was allowed to break the rules, there was a
           | growing portion of the codebase that only he could work on -
           | while it wasn't Rust, imagine an org has a "No Unsafe Rust"
           | rule that is optional to 1 guy. Organizations ought to be
           | _very_ careful how they measure productivity, and should
           | certainly look beyond first-order metrics.
        
             | lifeisstillgood wrote:
             | I try to look at these things through the lens of "software
             | literacy" - software is a form of literacy and this story
             | might be better viewed as "a bunch of illiterate managers
             | are impressed with one good writer at the encyclopedia
             | publishers, now it turns out this guy makes mistakes, but
             | hey, what do you expect when the management cannot read or
             | write!"
        
           | SomeCallMeTim wrote:
           | This reminds me of the "Parable of the Two Programmers." [1]
           | A story about what happens to a brilliant developer given an
           | identical task to a mediocre developer.
           | 
           | [1] I preserved a copy of it on my (no-advertising or
           | monetization) blog here:
           | https://realmensch.org/2017/08/25/the-parable-of-the-two-pro...
        
             | mjevans wrote:
             | I can't seem to find it in a google search, maybe I'm just
             | recalling entirely the wrong terms.
             | 
             | In the early computing era there was a competition.
             | Something like take some input and produce an output. One
             | programmer made a large program in (IIRC) Fortran with
             | complex specifications documentation etc. The other used
             | shell pipes, sort, and a small handful or two of other
             | programs in a pipeline to accomplish the same task in like
             | 10 developer min.
        
               | ianmcgowan wrote:
               | Sounds like "Knuth vs McIlroy", which has been discussed
               | on hn and elsewhere before, and the general take is that
               | it was somewhat unfair to Knuth.
               | 
               | [1] https://homepages.cwi.nl/~storm/teaching/reader/BentleyEtAl8...
               | 
               | [2] https://www.google.com/search?q=knuth+vs+mcilroy
        
               | ramses0 wrote:
               | The Knuth link in the sibling comment is an original, but
               | you're probably thinking of "The Tao of Programming"
               | 
               | http://catb.org/~esr/writings/unix-koans/ten-thousand.html
               | 
               | """"And who better understands the Unix-nature?" Master
               | Foo asked. "Is it he who writes the ten thousand lines,
               | or he who, perceiving the emptiness of the task, gains
               | merit by not coding?""""
        
               | SomeCallMeTim wrote:
               | I was both of those developers at different times, at
               | least metaphorically.
               | 
               | I drank from the OO koolaid at one point. I was really
               | into building things up using OOD and creating
               | extensible, flexible code to accomplish everything.
               | 
               | And when I showed some code I'd written to my brother, he
               | (rightly) scoffed and said that should have been 2-3
               | lines of shell script.
               | 
               | And I was enlightened. ;)
               | 
               | Like, I seriously rebuilt my programming philosophy
               | practically from the ground up after that one comment.
               | It's cool having a really smart brother, even if he's
               | younger than me. :)
        
             | _a_a_a_ wrote:
             | Without more backup I can only describe that as being
             | fiction. Righteous fiction, where the good guy gets
             | downtrodden and the bad guy wins to fuel the reader's
             | resentment.
        
               | 6510 wrote:
               | To me it is a story about managers clueless about the
               | work. You can make all the effort in the world to imagine
               | doing something but the taste of the soup is in the
               | eating. I do very simple physical grunt work for a
               | living, there it is much more obvious that it is
               | impossible. It's truly hilarious.
               | 
               | They probably deserve more praise when they do guess
               | correctly but would anyone really know when it happens?
        
               | SomeCallMeTim wrote:
               | It's practically my life experience.
               | 
               | Sometimes I'm appreciated, and managers actually realize
               | what they have when I create something for them.
               | Frequently I accomplish borderline miracles and a manager
               | will look at me and say, "OK, what about this other
               | thing?"
               | 
               | My first job out of college, I was working for a company
               | run by a guy who said to me, "Programmers are a dime a
               | dozen."
               | 
               | He also said to me, after I quit, after his client
               | refused to give him any more work unless he guaranteed
               | that I was the lead developer on it, "I can't believe you
               | quit." I simply shrugged and thought, "Maybe you
               | shouldn't have treated me like crap, including not even
               | matching the other offer I got."
               | 
               | I've also made quite a lot of money "Rescuing Small
               | Companies From Code Disasters. (TM)" ;) Yes, that's my
               | catch phrase. So I've seen the messes that teams often
               | create.
               | 
               | The "incompetent" team code description in the story is
               | practically prescient. I've seen the results of exactly
               | that kind of management and team a dozen times. Things
               | that, given the same project description, I could have
               | created in 1/100 the code and with much more overall
               | flexibility. I've literally thrown out entire projects
               | like that and replaced them with the much smaller,
               | tighter, and faster code that does more than the original
               | project.
               | 
               | So all I can say is: Find better teams to work with if
               | you think this is fiction. This resonates with me because
               | it contains industry Truth.
        
             | 6510 wrote:
             | I had an idea once but when I tried to explain it people
             | didn't understand.
             | 
             | I revisited an earlier thought: communication is a 2 man job,
             | one is to not make an effort to understand while the other
             | explains things poorly. It always manages to never work
             | out.
             | 
             | Periodically I thought about the puzzle and was eventually
             | able to explain it such that people thought it was
             | brilliant ~ tho much too complex to execute.
             | 
             | I thought about it some more, years went by and I
             | eventually managed to make it easy to understand. The
             | response: "If it was that simple someone else would have
             | thought of it." I still find it hilarious decades later.
             | 
             | It pops to mind often when I rewrite some code and it goes
             | from almost unreadable to something simple and elegant. Ah,
             | this must be how someone else would have done it!
        
               | drekipus wrote:
               | > Ah, this must be how someone else would have done it!
               | 
               | This is a good exclamation :D
               | 
               | And it's a poignant story. Thanks for sharing.
        
               | lifeisstillgood wrote:
               | That's pretty good. It needs an Athena poster :-)
        
             | HenryBemis wrote:
             | "Give me six hours to chop down a tree and I will spend the
             | first four sharpening the axe."
             | 
             | -- Abraham Lincoln
             | 
             | I have started to follow this 'lately' (for a decade) and
             | it has worked miracles. As for the anxious
             | managers/clients, I keep them updated of the
             | design/documentation/thought process, mentioning the risks
             | of the path-not-taken, and that maintains their peace of
             | mind. But this depends heavily on the client and the
             | managers.
        
           | ransom1538 wrote:
           | Honestly? You work at a place a manager hasn't heard "impact"
           | yet? I thought managers at this point just walk around the
           | office saying "impact".
        
           | PH95VuimJjqBqy wrote:
           | > It seems like the industry would get a lot more 10x
           | behavior if it was recognized and rewarded more often than it
           | currently is.
           | 
           | I don't agree with that, there are a _lot_ of completely crap
           | developers and they get put into positions where even the
           | ones capable of doing so aren't allowed to because it's not
           | on a ticket.
           | 
           | I've seen some things.
        
         | throwitaway222 wrote:
         | No one reading this during the hours of 9-5 is a 10x.
        
           | randomdata wrote:
           | Or is. If a 1x puts in an 8 hour day, a 10x only has to put
           | in a 48 minute day. That leaves plenty of time to read this.
        
             | simmerup wrote:
             | That's a bad take because you're assuming that developer is
             | capable of replicating that * 10
        
               | adra wrote:
               | That's entirely the fundamental flaw of the Nx developer
               | ethos to a tee. No individual will benchmark reliably
               | against any other person of their same trade/craft
               | perfectly over time. The mythical BS times developer is
               | so oversimplified as to be a meaningless concept. Hire
               | "unicorn" and get amazing results just isn't a guarantee.
               | They just probably have a better chance than average to
               | make a higher impact, which is good enough for companies
               | that are willing to pay Nx times average salaries to
               | acquire them.
        
             | sebastianz wrote:
             | His point is that smart and productive people are generally
             | hard working, focused and diligent, which is how they get
             | to be so experienced and productive.
             | 
             | Hence not wasting time on social networks.
             | 
             | > a 10x only has to put in a 48 minute day
             | 
             | Nobody would call this person "10x".
        
               | randomdata wrote:
               | _> His point is that smart and productive people are
               | generally hard working, focused and diligent_
               | 
               | I don't think that tracks. Smart, productive, hard
               | working people don't work 9-5. They work every hour they
               | can, breaking only when they have pushed themselves to
               | the limit. The limit can be hit at any hour. There is no
               | magical property of the universe that gives people
               | unlimited stamina during the hours of 9-5.
               | 
               | _> Nobody would call this person "10x"._
               | 
               | I'm not sure they would call anyone that, to be fair. A
               | "10x developer" who also puts in 8 hours alongside the 1x
               | developers isn't a 10x developer, he would be called a
               | _sucker_.
        
               | mewpmewp2 wrote:
               | Hackernews is hardly a waste of time though. A 10x is
               | probably curious about topics mentioned on Hackernews.
        
           | adra wrote:
           | I know it's meant to be funny, but the tech people who spend
           | zero time learning about "what's out there" are usually not
           | the most effective developers. You won't find better
           | solutions to existing or even new problems without an
           | interest in industry. Maybe this particular article isn't
           | "industry valuable", fair enough, but having zero interest in
           | refining and enhancing your craft beyond the work in front of
           | you is almost guaranteed to end with worse outcomes.
        
             | eszed wrote:
             | Hard agree.
             | 
             | Another flaw in his thinking: brain cycles and sub-
             | conscious processing.
             | 
             | I'm in the middle of a hard problem right now. I ran out of
             | ideas, and opened HN about half an hour ago. In that time,
             | without "trying", I've had two new ideas - one sent me back
             | to my notes, which revealed that my original thinking was
             | flawed; the second sent me to documentation, which
             | suggested a new route to pursue. I'm digesting the
             | implications of that while I write this.
             | 
             | Beating my head against the problem directly for thirty
             | minutes would have been less productive. (Though if I
             | wasn't WFH I would have, and also been miserable, and
             | learned less about the industry than I have from this
             | thread. So there's that.)
             | 
             | I'm far from a 10x _anything_, but I don't have the only
             | brain which works this way.
        
         | andrei_says_ wrote:
         | On my team, one of the main multipliers is understanding the
         | need behind the requested implementation, and proposing
         | alternative solutions - minimizing or avoiding code changes
         | altogether. It helps that we work on internal tooling and are
         | very close to the process and stakeholders.
         | 
         | "Hmmm, there's another way to accomplish this" being the 10x.
         | Doing things faster is not it.
        
           | switch007 wrote:
           | Exactly this. It's why it's so frustrating when product
           | managers who think they're above giving background run the
           | show (the ones who think they're your manager and are
           | therefore too important to share that with you)
        
         | mettamage wrote:
         | When I was in college, I met a few people that coded _a lot_
         | faster than me. Typically, they had started at 12 instead of
         | 21 (like me). That's how 10x engineers exist: by the time
         | they are 30, they have roughly 20 years of programming
         | experience under their belt instead of 10.
         | 
         | Also, their professional experience is much greater. Sure,
         | their initial jobs at 15 are the occasional weird gig for the
         | uncle/aunt or cousin/nephew but they get picked up by
         | professional firms at 18 and do a job next to their CS studies.
         | 
         | At least, that's how it used to be. Not sure if this is still
         | happening due to the new job environment, but this was the
         | reality from around 2004 to 2018.
         | 
         | For 10x engineers to exist, all it takes is a few examples. To
         | me, everyone is in agreement that they seem to be rare. I point
         | to a public 10x engineer. He'd never say it himself, but my
         | guess is that this person is a 10x engineer [1].
         | 
         | If you disagree, I'm curious how you'd disagree. I'm just a
         | blind man touching a part of the elephant [2]. I do not claim
         | to see the whole picture.
         | 
         | [1] https://bellard.org/ (the person who created JSLinux)
         | 
         | [2] https://en.wikipedia.org/wiki/Blind_men_and_an_elephant -
         | if you don't know the parable, it's a fun one!
        
           | QuercusMax wrote:
           | Yup, that's been my experience as someone who asked for a C++
           | compiler for my 12th birthday, worked on a bunch of random
           | websites and webapps for friends of the family, and spent
           | some time at age 16-17 running a Beowulf cluster and
           | attempting to help postdocs port their code to run on MPI
           | (with mixed success). All thru my CS education I was writing
           | tons of toy programs, contributing (as much as I could)
           | toward OSS, reading lots of stuff on best practices, and
           | leaning on my much older (12 years) brother who was working
           | in the industry. He pointed me to Java and IntelliJ, told me
           | to read Design Patterns (Gang of Four) and Refactoring
           | (Fowler). I read Joel on Software religiously, even though he
           | was a Microsoft guy and I was a hardcore Linux-head.
           | 
           | By the time I joined my first real company at age 21, I was
           | ready to start putting a lot of this stuff into place. I
           | joined a small med device software company which had a great
           | product but really no strong software engineering culture:
           | zero unit tests, using CVS with no branches, release builds
           | were done manually on the COO's workstation, etc.
           | 
           | As literally the most junior person in the company I worked
           | through all these things and convinced my much more senior
           | colleagues that we should start using release branches
           | instead of "hey everybody, please don't check in any new code
           | until we get this release out the door". I wrote automated
           | build scripts mostly for my own benefit, until the COO
           | realized that he didn't have to worry about keeping a dev
           | environment on his machine, now that he didn't code any more.
           | I wrote a junit-inspired unit testing framework for the
           | language we were using
           | (https://en.wikipedia.org/wiki/IDL_(programming_language) -
           | like Matlab but weirder).
           | 
           | Without my work as a "10x junior engineer", the company would
           | have been unable to scale to more than 3 or 4 developers. I
           | got involved in hiring and made sure we were hiring people
           | who were on board with writing tests. We finally turned into
           | a "real" software company 2 or 3 years after I joined.
        
             | mettamage wrote:
             | This sounds similar to the best programmer I personally
             | know and he was an intern working at LLVM at the time. It's
             | funny how companies treat that part of his life as "no
             | experience". Then suddenly he goes into the HFT space and
             | within a couple of years he has a rank similar to that of
             | people twice his age.
             | 
             | 10x engineers exist. To be fair, it does depend which
             | software engineer you see as "the standard software
             | engineer", but if I take myself as a standard (as an
             | employed software engineer with 5 years of experience),
             | then 10x software engineers exist.
        
           | nlavezzo wrote:
           | Nick with Antithesis here with a funny story on this.
           | 
           | I became friends with Dave our CTO when I was 5 or 6, we were
           | neighbors. He'd already started coding little games in Basic
           | (this was 1985). Later in our friendship, like when I was
           | maybe 10, I asked him if he could help me learn to code,
           | which he did. After a week or two I had made some progress
           | but compared what I could do to what he was doing and figured
           | "I guess I just started too late, what's the point?".
           | 
           | I found out later that most people didn't start coding till
           | late HS or college! It worked out though - I'm programmer
           | adjacent and have taken care of the business side of our
           | projects through the years :)
        
           | theamk wrote:
           | Last year, we had 2 new hires: one is fresh out of college
           | (and not one of the top ones), the other has 15 years of
           | experience on their resume in our industry.
           | 
           | I am not sure there is 10x difference, but there is at least
           | 5x difference in performance, in favor of fresh college grad,
           | and they are now working on the more complex tasks too.
           | 
           | The sad part is our hiring is still heavily in "senior
           | engineer with lots of experience" phase, and the internship
           | program has been canceled.
        
           | jjjjj55555 wrote:
           | Some people organize their time and focus their efforts more
           | efficiently than others. They also use tools that others
           | might not even know or care about.
           | 
           | You probably surf the internet 10x faster than your parents.
           | Yes you've probably had more exposure than them, but you
           | could probably teach them how to do it just as fast. But
           | would they want to learn and would they actually adopt what
           | you taught them?
        
             | joantune wrote:
             | With motivation, repetition, and those depend on how
             | plastic your brain is, thus the age, yes!
        
           | SomeCallMeTim wrote:
           | Yes: Programmers who start at twelve are often the 10x
           | programmers who can really program faster than the average
           | developer by a lot.
           | 
           | No: It's not because they have 10 more years of experience.
           | Read "The Mythical Man-Month." That's the book that
           | popularized the concept that some developers were 5-25x
           | faster than others. One of the takeaways was that the speed
           | of a developer was not correlated with experience. At all.
           | 
           | That said, the kind of person who can learn programming at 12
           | might just be the kind of person who is really good at
           | programming.
           | 
           | I started learning programming concepts at 11-12. I'm not the
           | best programmer I know, but when I started out in the
           | industry at 22 I was working with developers with 10+ years
           | of (real) experience on me...and I was able to come in and
           | improve on their code to an extreme degree. I was completing
           | my projects faster than other senior developers. With less
           | than two years of experience in the industry I was promoted
           | to "senior" developer and put on a project as lead (and sole)
           | developer and my project was the only one to be completed on
           | time, and with no defects. (This is video game industry, so
           | it wasn't exactly a super-simple project; at the time this
           | meant games written 100% in assembly language with all kinds
           | of memory and performance constraints, and a single bug meant
           | Nintendo would reject the image and make you fix the problem.
           | We got our cartridge approved the first time through.)
           | 
           | Some programmers are just faster and more intuitive with
           | programming than others. This shouldn't be a surprise. Some
           | writers are better and faster than others. Some artists are
           | better and faster than others. Some architects are better and
           | faster than others. Some product designers are better and
           | faster than others. It's not _all_ about the number of hours
           | of practice in any of these cases; yes, the best in a field
           | often practices an insane amount. But the very top in each
           | field, despite having similar numbers of hours of practice
           | and experience, can vary in skill by an insane amount. Even
           | some of the best in each field are vastly different in speed:
           | You can have an artist who takes years to paint a single
           | painting, and another who does several per week, but of
           | similar ultimate quality. Humans have different aptitudes.
           | This shouldn't even be controversial.
           | 
           | I do wonder if the "learned programming at 12" has anything
           | to do with it: Most people will only ever be able to speak a
           | language as fluently as a native speaker if they learn it
           | before they're about 13-14 years old. After that the brain
           | (again, for most people; this isn't universal) apparently
           | becomes less flexible. In MRI studies they can actually
           | detect differences between the parts of the brain used to
           | learn a foreign language as an adult vs. as a tween or early
           | teen. So there's a chance that early exposure to the right
           | concepts actually reshapes the brain. But that's just
           | conjecture mixed with my intuition of the situation: When I
           | observe "normal" developers program, it really feels like I'm
           | a native speaker and they're trying to convert between an
           | alien way of thinking about a problem into a foreign language
           | they're not that familiar with.
           | 
           | AND...there may not be a need to explicitly PROGRAM before
           | you're 15 to be good at it as an adult. There are video games
           | that exercise similar brain regions that could substitute for
           | actual programming experience. AND I may be 100% wrong. Would
           | be good for someone to fund some studies.
        
           | eschneider wrote:
           | I'm not even sure that coding _much_ faster than necessary is
           | even required to give a 3-5x multiple on "average", let alone
           | "worst case" developers. Some of the biggest productivity
           | wins can be had by being able to look at requirements,
           | knowing what's right or wrong about them, and getting
           | everyone on the same page so the thing only needs to be made
           | once. Being good at test and debug so problems are identified
           | and fixed _early_ is also a big win. Lots of that is just
           | having the experience to recognize what sort of problem
           | you're dealing with very quickly.
           | 
           | Being a programming prodigy is nice, but I don't think you
           | even really need that.
        
             | joantune wrote:
             | Underrated comment
        
           | confidantlake wrote:
           | I am not convinced that just starting early is all there is
           | to it. I started Math, Sports, and Piano at like 6 years old
           | but there are still plenty of "10x <insert activity here>"
           | people that figuratively and literally run circles around me.
           | Talent is a real thing.
        
             | joantune wrote:
             | The intensity with which you did it matters, though. You
             | probably didn't spend that many years on a specific sport,
             | for instance.
             | 
             | And when we're talking about sports, genetics matter as
             | well (depending on each one)
             | 
             | When we're talking brains, while genetics also matter,
             | assuming a normal (whatever that is) brain, plasticity
             | changes a lot about how it operates.
             | 
             | So, the 10 years thing is definitely a big if not the
             | biggest part. In my opinion. Would love to see studies if
             | any exist out there on this
        
         | VoodooJuJu wrote:
         | 10x developer is just a buzzword people throw around when
         | they're trying to sell you something.
        
         | xnx wrote:
         | Anyone can be a 10x engineer when they write something
         | similar/identical to what they've written before. Other jobs
         | are not like this. A plumber may only be 20% faster on the best
         | days of their career.
        
         | ManuelKiessling wrote:
         | What makes a car go fast? The brakes:
         | 
         | https://manuel.kiessling.net/2011/04/07/why-developing-witho...
        
         | BobbyTables2 wrote:
         | > People who implement something very few people even
         | considered or understood to be possible, which then gives
         | amazing leverage to deliver working software in a fraction of
         | the time.
         | 
         | I agree with the first part of your statement, but what really
         | happens to such people?
         | 
         | In my experience (sample size greater than one), they receive
         | some kudos, but remain underpaid, never promoted, and are given
         | more work under tight deadlines. At least until some of them
         | are laid off along with lower performers.
         | 
         | But for those who say that hard things are impossible, they
         | seem to get along just fine. They merely declare such things as
         | out-of-scope or lie about their roadmap.
        
           | bedobi wrote:
           | > In my experience (sample size greater than one), they
           | receive some kudos, but remain underpaid, never promoted, and
           | are given more work under tight deadlines. At least until
           | some of them are laid off along with lower performers.
           | 
           | 100% agree, I've seen plenty of the best of the best get
           | treated like trash and laid off at first sight of trouble on
           | the horizon
        
         | hyperthesis wrote:
         | I've always thought a x10 is one who sits back and sees a
         | simpler way - like some math problems have an easy solution, if
         | you can see it. Also: change the question; change the context
         | (Alan Kay)
         | 
         | (And absolutely not brute-force grinding themselves away)
        
         | strangattractor wrote:
         | 6.5 X 15 is only 97 hours per week, not even close to the 400
         | hrs (10X40) per week of programming a 10X Rust programmer can
         | provide. I jest but all this 10X stuff is getting ridiculous.
         | They stayed in "Stealth" mode because they didn't have anything
         | worth showing for 5 years. Doesn't sound all that productive to
         | me. More likely what they are trying to do was hard and
         | complicated and took a while to figure out.
        
           | joantune wrote:
           | They're not boasting about their current productivity,
           | they're boasting about the one they achieved at FoundationDB
           | when they implemented the testing, which gave them the idea
           | to build antithesis
        
         | devjab wrote:
         | In my experience it often comes down to business processes. We
         | have a guy in my extended team who knows everything about his
         | side of the company. When I work with him I accomplish business
         | altering deliveries in a very short amount of time, which after
         | a week or two rarely needs to be touched again unless something
         | in the business changes. He's not a PO and we don't do anything
         | too formally because it's just him, me and another developer +
         | whatever business manager will benefit from the development
         | (and a few testers from their team). In many ways the way we
         | work these projects is very akin to Team Topologies.
         | 
         | At other times I'll be assigned projects with regular POs,
         | Architects and business employees who barely know what it is
         | they are doing themselves, with poorly defined tasks and all
         | sorts of bureaucratic nonsense "agile" process methods and we'll
         | spend forever delivering nothing.
         | 
         | So sometimes I'm a 50x developer delivering business altering
         | changes. At other times I'm a useless cog in a sea of pseudo
         | workers. I don't particularly care, I get paid, but if
         | management actually knew what was going on, and how to change
         | it... well...
        
       | agentultra wrote:
       | I got similar productivity boosts after learning TLA+ and Alloy.
       | 
       | Simulation is an interesting approach, but I am curious: if they
       | ever implemented the simulation wrong, would it report errors that
       | don't happen on the target platform, or fail to find errors that
       | the target platform reports? How wide the gap is will matter...
       | and how many possible platforms and configurations will the
       | hypervisor cover?
        
       | ComputerGuru wrote:
       | I was mentally hijacked into clicking the jobs link (despite
       | recently deciding I wasn't going to go down that rabbit hole
       | again!) but fortunately/unfortunately it is in-person and daily,
       | so flying out from Chicago a week out of the month won't work
       | and I don't even have to ask!
       | 
       | More to the point of the story (though I do think the actual
       | point was indeed a hiring or contracting pitch), this reminds me
       | a lot of the internal tests the SQLite team has. I would love to
       | hear from someone with access to those if they feel the same way.
        
         | laiysb wrote:
         | > I was mentally hijacked into clicking the jobs link (despite
         | recently deciding I wasn't going to go down that rabbit hole
         | again!) but fortunately/unfortunately it is in-person and daily,
         | so flying out from Chicago a week out of the month won't
         | work and I don't even have to ask!
         | 
         | given their PLTR connection, probably not
        
           | ComputerGuru wrote:
           | Oh, suddenly I'm not interested, either! Thanks!
        
       | timwis wrote:
       | Gosh, I know it's a bit late, but I wish they'd called the
       | product _The Prime Radiant_
       | 
       | Fans of Asimov's _Foundation_ series will appreciate the analogue
       | to how this system aims to predict every eventuality based on
       | every possible combination of events, a la psychohistory.
       | 
       | P.S. amazing intro post. Can't wait to try the product.
        
         | rvnx wrote:
         | It would be the opposite of the product:
         | 
         | For software not interacting with the real world, there is
         | only one possibility for frame N+1 if you know the state of the
         | system.
         | 
         | https://en.wikipedia.org/wiki/Determinism
         | 
         | PRNG are illusions, just misunderstood by humans.
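         | 
         | A minimal sketch of the point above, assuming Python's standard
         | `random` module as the PRNG: two runs seeded with the same value
         | produce the same history, so frame N+1 is fully determined by
         | the state at frame N. The `simulate` function is a hypothetical
         | toy, not anything from Antithesis.

```python
import random


def simulate(seed: int, frames: int) -> list[int]:
    """Run a toy simulation whose only source of randomness is a seeded PRNG."""
    rng = random.Random(seed)  # all "randomness" flows from this one seed
    state = 0
    history = []
    for _ in range(frames):
        # The next frame depends only on the current state and the PRNG state.
        state = (state + rng.randint(1, 6)) % 100
        history.append(state)
    return history


# Same seed, same history: the run can be replayed exactly.
assert simulate(seed=42, frames=10) == simulate(seed=42, frames=10)
```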
        
           | samatman wrote:
           | Did you intend to link to
           | https://en.wikipedia.org/wiki/Deterministic_algorithm?
        
           | timwis wrote:
           | Feels like I may have brought a spoon to a gun fight, but I
           | would have considered psychohistory to be the ultimate
           | extrapolation of determinism, and the fact that the prime
           | radiant is able to predict _which_ version of events will
           | happen is because it (somehow) knows the state of the system.
           | 
           | Of course, to argue against myself, it would surely be based
           | on layers of probabilities, and they say several times in the
           | series that it can't predict low-level specific things, just
           | high-level things. And perhaps the whole underlying question
           | posed by the series is whether the universe really is
           | deterministic. But anyway I don't think it's all off-base.
        
       | amw-zero wrote:
       | I'm trying to avoid diving into the hype cycle about this
       | immediately - but this sounds like the holy grail right? Use your
       | existing application as-is (assuming it's containerized), and
       | simply check properties on it?
       | 
       | The blocker in doing that has always been the foundations of our
       | machines: non-deterministic CPUs and operating systems. Re-
       | building an entire vertical computing stack is practically
       | impossible, so they just _avoid_ it by building a high-fidelity
       | deterministic simulator.
       | 
       | I do wonder how they are checking for equivalence between the
       | simulator and existing OS's, as that sounds like a non-trivial
       | task. But, even still, I'm really bought in to this idea.
        
         | wilkystyle wrote:
         | Does it even _need_ to be containerized? According to the post,
         | it sounds like Antithesis is a solution at the hypervisor
         | layer.
        
           | amw-zero wrote:
           | Yes, it looks like containerization is required:
           | https://antithesis.com/docs/getting_started/setup.html#conta...
        
             | voidmain wrote:
             | Containers are doing two jobs for us: they give our
             | customers a convenient way to send us software to run, and
             | they give us a convenient place to simulate the network
             | boundary between different machines in a distributed
             | system. The whole guest operating system running the
             | containers is also running inside the deterministic
             | hypervisor and under test (and it's mostly just NixOS
             | Linux, not something weird that we wrote).
             | 
             | I'm a co-founder of Antithesis.
        
               | tikhonj wrote:
               | Oh, cool to hear you're using NixOS. The Nix philosophy
               | totally gels with the philosophy described in the post.
               | 
               | But it's also probably fair to describe NixOS as
               | something weird that somebody else wrote :)
        
       | xyzelement wrote:
       | I appreciated this post. Separately from what they are talking
       | about, I found this bit insightful:
       | 
       | // This limits the value of testing, because if you had the
       | foresight to write a test for a particular case, then you
       | probably had the foresight to make the code handle that case too.
       | 
       | I often felt this way when I saw developers feel a sense of doing
       | good work and creating safe software because they wrote unit
       | tests like expect add(2,2) = 4. There is basically a 1-1
       | correlation between the cases you thought to test and the cases
       | you coded for, which leaves you no better off in terms of
       | unexplored scenarios.
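       | 
       | To make that concrete, here is a small sketch (the `dedupe`
       | function and its bug are invented for illustration): the
       | hand-picked example passes, while a randomized check of a general
       | property finds an input the author never thought to test.

```python
import random


def dedupe(items: list[int]) -> list[int]:
    """Buggy on purpose: only drops duplicates that are adjacent."""
    out: list[int] = []
    for x in items:
        if not out or out[-1] != x:
            out.append(x)
    return out


def find_counterexample(trials: int = 1000, seed: int = 0):
    """Search random inputs for one whose output still contains duplicates."""
    rng = random.Random(seed)
    for _ in range(trials):
        items = [rng.randint(0, 3) for _ in range(rng.randint(0, 6))]
        result = dedupe(items)
        if len(result) != len(set(result)):  # property: no repeated values
            return items
    return None


# The example the author thought of passes...
assert dedupe([1, 1, 2]) == [1, 2]
# ...but random inputs expose the case nobody wrote a test for.
assert find_counterexample() is not None
```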
       | 
       | I get that this has some incremental value in catching blatant
       | miscoding and regressions down the road so it's helpful, it's
       | just not getting at the _main_ thing that will kill you.
       | 
       | I felt similarly about human QA back in my finance days that
       | asked developers for a test plan. If the dev writes a test plan,
       | it also only covers what the dev already thought about. So I
       | asked my team to write the vaguest/highest level test plan
       | possible (eg, "it should now be possible to trade a Singaporean
       | bond" rather than "type the Singaporean bond ticker into the
       | field, type the amount, type the yield, click buy or sell") - the
       | vagueness made more room for the QA person to do something
       | different (even things like tabbing vs clicking, or filling the
       | fields out of sequence, or misreading the labels) than how the
       | dev saw it, which is the whole point.
        
       | jwr wrote:
       | FoundationDB is an impressive achievement, quite possibly the
       | only distributed database out there that lives up to its strict
       | serializability claims (see
       | https://jepsen.io/consistency/models/strict-serializable for a
       | good definition). The way they wrote it is indeed very
       | interesting and a tool that does this for other systems is
       | immediately worth looking at.
        
         | candiddevmike wrote:
         | > quite possibly the only distributed database out there that
         | lives up to its strict serializability claims
         | 
         | Jepsen has never tested FoundationDB, not sure why you claim
         | this and link to Jepsen's site.
        
           | nlavezzo wrote:
           | FDB co-founder here.
           | 
           | Aphyr / Jepsen never tested FDB because, as he tweeted "their
           | testing appears to be waaaay more rigorous than mine." We
           | actually put a screen cap of that tweet in the blog post
           | linked here.
        
           | krisoft wrote:
           | > not sure why you claim this and link to Jepsen's site.
           | 
           | They link to the website for a definition of the term they
           | are using.
        
         | mcmoor wrote:
         | Is it that good? I've been tasked with deploying it for some time
         | and it has always bitten me in the ass for one reason or another.
         | And I'm not the one who uses it, so I don't know if it's actually
         | good.
         | For now I much prefer redis.
        
           | foobiekr wrote:
           | It's great, but operationally there are lots of gotchas and
           | little guidance.
           | 
           | We got bitten _hard_ in production when we accidentally
           | allowed some of the nodes to get above 90% of the storage
           | used. The whole database collapsed into a state where it
           | could only do a few transactions a second. Then the ops team,
           | thinking they were clever, doubled the size of the cluster in
           | order to give it the resources it needed to get the average
           | utilization down to 45%; this was an unforced error as that
           | pushed the size of the cluster outside the fdb comfort zone
           | (120 nodes) which is itself a problem. The deed was done
           | though and pulling nodes was not possible in this state, so
           | slowly, slooooowly... things got fixed.
           | 
           | We ended up spending an entire weekend slowly, slowly getting
           | things back into a good place. We did not lose data, but
           | basically prod was down for the duration, and we found it
           | necessary to _manually_ evict the full nodes one at a time
           | over the period.
           | 
           | Now, this was a few years ago, and fdb has performed wickedly
           | fast, with utter, total reliability before that and since,
           | and to this day the ops team is butthurt about fdb.
           | 
           | From an engineering perspective, if you aren't using java fdb
           | is pretty not great, since the very limited number of
           | abstraction layers that exist are all java-centric. There are
           | many, many issues with the maximum transaction time thing,
           | the maximum key size and value size and total transaction
           | size issue, the lack of pushdown predicates (e.g., filtered
           | scans can't be done in-place which means that in AWS, they
           | cost a lot in inter-az network charge terms and also are
           | gated by the network performance of your instances), and so
           | on.
           | 
           | What ALL of these issues have in common is that they bite you
           | late in the game. The storage issue bites you when you're
           | hitting the DB hard in production and have a big data set. The
           | lack of abstractions means that even something as simple as
           | finding leaked junked keys turns out to be impossible unless
           | you were diligent about manually framing all your values so you
           | could identify things as more than just bytes. The transaction
           | time thing is very weird to deal with, as you tend to have
           | creeping crud aspects, and the lack of libraries that
           | instrument the transactions to give you early warning is an
           | issue. Likewise, for certain kinds of key-value pairs there's
           | a creeping size problem - hey, this value is an index of other
           | values; if you're not very careful up front, you _will_
           | eventually hit either the txn size limit or the key limit. The
           | usual workaround for those is to do separate transactions - a
           | staging transaction, then essentially a swap operation and
           | then a garbage collection transaction - but that has lots of
           | issues over time when coupled with application failure.
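           | 
           | One hedged sketch of the "early warning" idea mentioned above:
           | keep a running estimate of the bytes a transaction has written
           | and warn before a limit is reached. The limits below are
           | assumptions recalled from the FoundationDB "Known Limitations"
           | documentation and should be checked against the current docs.
           | `TxnBudget` is a hypothetical helper, not part of any
           | FoundationDB binding, and its estimate is simplified: the real
           | accounting may count more than key and value bytes.

```python
from dataclasses import dataclass

# Assumed limits; verify against the FoundationDB documentation before relying on them.
MAX_KEY_BYTES = 10_000
MAX_VALUE_BYTES = 100_000
MAX_TXN_BYTES = 10_000_000


@dataclass
class TxnBudget:
    """Hypothetical helper that estimates how much of a transaction's size limit is used."""

    warn_fraction: float = 0.8
    used_bytes: int = 0

    def record_write(self, key: bytes, value: bytes) -> str:
        """Account for one write; return "ok" or "warn", or raise if a limit is exceeded."""
        if len(key) > MAX_KEY_BYTES:
            raise ValueError(f"key is {len(key)} bytes, limit is {MAX_KEY_BYTES}")
        if len(value) > MAX_VALUE_BYTES:
            raise ValueError(f"value is {len(value)} bytes, limit is {MAX_VALUE_BYTES}")
        self.used_bytes += len(key) + len(value)
        if self.used_bytes > MAX_TXN_BYTES:
            raise ValueError("estimated transaction size exceeds the limit")
        if self.used_bytes > self.warn_fraction * MAX_TXN_BYTES:
            return "warn"  # a signal to split the work into several transactions
        return "ok"


budget = TxnBudget()
status = budget.record_write(b"index/users", b"x" * 50_000)
```

           | Calling `record_write` next to every real write would surface
           | the creeping-size problem in testing instead of in production.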
           | 
           | There are answers to ALL of these, manual ones. For the
           | popular languages other than java - Go, python, maybe Ruby -
           | there _should_ be answers for them, but there aren't. These
           | are very sharp edges. Those java layers are _also_ _not_
           | _bug_ _free_. So yeah, one has a reliable storage
           | layer (a topic that has come up over and over again in the
           | last few years) but it's the layer on top of that where all
           | the bugs are, but now with constraints and factors that are
           | harder to reason about than the usual storage layer.
           | 
           | One might say, hey, SQL has all of these problems too, except
           | no. You can bump into transaction limits, but the limits are
           | vastly higher than fdb and the transaction time sluggishness
           | will identify it long before you run into the "your
           | transaction is rejected, spin retrying something that will
           | _never_ recover" sort of issue that your average developer
           | will eventually encounter in fdb.
           | 
           | That said, I love fdb as a software achievement. I just wish
           | they had finished it. For my current project, I have designed
           | it out. I might be able to avoid all of the sharp edges above
           | at this point, but since we are not a java shop, I also can't
           | rely on all the engineers to even know they exist.
        
           | jwr wrote:
           | It depends how you define "good". I care mostly about my
           | distributed database being correct, living up to its
           | consistency claims, and providing strict serializability.
           | 
           | (see also https://aphyr.com/posts/283-jepsen-redis)
           | 
           | I care much less about how easy it is to use or deploy, but
           | "good" is a subjective term, so other people might see things
           | differently.
        
       | aduffy wrote:
       | Looks like this coincides with seed funding[1], congrats folks!
       | Did you guys just bootstrap through the last 5 years of
       | development?
       | 
       | [1] https://www.saltwire.com/cape-breton/business/code-testing-s...
        
       | samsquire wrote:
       | This is really exciting.
       | 
       | I am an absolute beginner at TLA+ but I really like this possible
       | design space.
       | 
       | I have an idea for a package manager that combines type system
       | with this style of deterministic testing and state space
       | exploration.
       | 
       | Imagine knowing that your invocation of
       | 
       |     package-manager install <tool name>
       | 
       | Will always work because file system and OS state are part of the
       | deterministic model.
       | 
       | or a next-gen Helm with a type system and state space exploration
       | is tested:
       | 
       |     kubectl apply <yaml>
       | 
       | will always work when it comes up because all configuration state
       | space exploration has been tested thanks to types.
        
         | __MatrixMan__ wrote:
         | Coincidence, I'm reading this and thinking about test harnesses
         | for my package manager idea, which is really just a thin
         | wrapper around nix, designed under the assumption that the
         | network might partition at any moment: keep the data nearest
         | where it's needed, refer by hash not by name, gossip metadata
         | necessary to find the hash for you, no single points of
         | failure.
         | 
         | Tell me more about yours?
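         | 
         | A rough sketch of "refer by hash not by name", assuming SHA-256
         | as the content hash and an in-memory dict standing in for
         | whatever storage or gossip layer would really be used
         | (`ContentStore` is a hypothetical name, not part of nix):

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is the SHA-256 of the bytes."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data  # storing the same bytes twice is a no-op
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        # Any peer could serve the blob; the reader verifies it against the hash.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("blob does not match its hash")
        return data


store = ContentStore()
ref = store.put(b"package contents")
assert store.get(ref) == b"package contents"
```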
        
           | samsquire wrote:
           | I am thinking about state machine progressions and TLA+ style
           | specifications which are invariants over a progression of
           | variables.
           | 
           | Your package manager knows your operating system's current
           | state and the state space that the control flow graph of the
           | program and configuration together can reach, so it can verify
           | that everything lines up and there will be no error when
           | executed, a bit like a compiler but without running into the
           | Halting problem.
           | 
           | In TLA+ you can dump a state graph as a dot file, which I
           | turn into an SVG and run with TLA+ graph visualiser.
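           | 
           | As a hedged illustration of checking an invariant over every
           | reachable state (written in Python rather than TLA+, with a
           | made-up two-step "install" model that is not taken from any
           | real package manager), a breadth-first search over a small
           | state machine might look like this. It also prints the graph
           | as DOT edges, loosely mirroring the state graph dump.

```python
from collections import deque

# Hypothetical model: a state is (downloaded, installed).
State = tuple[bool, bool]


def next_states(state: State) -> list[State]:
    """All states reachable in one step from `state`."""
    downloaded, installed = state
    steps: list[State] = []
    if not downloaded:
        steps.append((True, installed))   # download the package
    if downloaded and not installed:
        steps.append((downloaded, True))  # install what was downloaded
    return steps


def invariant(state: State) -> bool:
    """Nothing is ever installed without having been downloaded."""
    downloaded, installed = state
    return downloaded or not installed


def explore(initial: State):
    """Breadth-first search; returns (visited states, edges, first violation or None)."""
    seen = {initial}
    edges = []
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return seen, edges, state
        for nxt in next_states(state):
            edges.append((state, nxt))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen, edges, None


seen, edges, violation = explore((False, False))
assert violation is None

# Emit the explored graph in DOT format.
print("digraph G {")
for src, dst in edges:
    print(f'  "{src}" -> "{dst}";')
print("}")
```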
           | 
           | Types verify possible control flow is valid at every point.
           | We just need to add types to the operating system and file
           | system and represent state space for deterministic
           | verification.
           | 
           | You could hide packages that won't work.
           | 
           | The package manager would have to look up precached state
           | spaces or download them as part of the verification process.
        
       | traspler wrote:
       | Checking their bug report which should contain "detailed
       | information about a particular bug" I am not sure I can fully
       | understand those claims:
       | https://public.antithesis.com/report/ZsfkRkU58VYYW1yRVF8zsvU...
       | 
       | To my untrained eye I get: Logs, a graph of when in time the bug
       | happened over multiple runs and a statistical analysis of which
       | part of the application code could be involved. The statistical
       | analysis is nice but it is completely flat, without any
       | hierarchical relationships making it quite hard to parse
       | mentally.
       | 
       | I kind of expected more context to be provided about inputs,
       | steps and systems that lead to the bug. Is it expected to then
       | start adding all the logging/debugging that might be missing from
       | the logs and re-run it to track it down? I hoped that given the
       | deterministic systems and inputs there could be more initial
       | hints provided.
        
       | Invictus0 wrote:
       | Talk about bad writing. If I don't know what the hell your thing
       | is in the first paragraph, I'm not going to read your whole blog
       | post to find out. Homepage is just as bad.
        
         | tranceylc wrote:
         | The article is more of a history lesson and context than it is
         | an ad. I see what you mean, but clicking "product -> What Is
         | Antithesis?" shows a clear description of what it does. Perhaps
         | that could also either be added to the article or the home
         | page?
        
       | agumonkey wrote:
       | interesting, this kind of responsive environment is dear but rare
       | 
       | i can't recall the last time i went to a place and people even
       | considered investing in such setups
       | 
       | i assume that except for hard problems and teams seeking
       | challenges, most people will revert to the mean and refuse any
       | kind of infrastructure work because it's mentally more
       | comfortable piling features and fixing bugs later
       | 
       | ps: i wish there was a meetup of teams like this, or even job
       | boards :)
        
         | nlavezzo wrote:
         | We'll be starting some meetups, attending conferences, etc.
         | this year. Also hop into our Discord if you want to chat, lots
         | of us are in there regularly. discord.gg/antithesis
        
           | agumonkey wrote:
           | oh, that's cool, thanks
        
       | shermantanktop wrote:
       | I kept cringing when I read the words "no bugs."
       | 
       | This is hubris in the classic style - it's asking for a literal
       | thunderbolt from the heavens.
       | 
       | It may be true, but...come on.
       | 
       | Everyone who has ever written a program has thought they were
       | done only to find one more bug. It's the fundamental experience
       | of programming to asymptotically approach zero bugs but never
       | actually get there.
       | 
       | Again, perhaps the claim is true but it goes against my instincts
       | to entertain the possibility.
        
         | rkangel wrote:
         | I think there is something interesting about the fact that
         | someone writing "no bugs" makes us all uncomfortable.
         | 
         | If they really did have a complex product, running in
         | production from a sizeable userbase and had 2 bug reports ever,
         | then I think it's a reasonable thing to say.
         | 
         | The fact that it _isn't_ a reasonable thing to say for most
         | other software is a little sad.
        
           | shermantanktop wrote:
           | Right, the claim may be true, but I have a visceral reaction
           | to it. And tbh I'd be hesitant to work with someone who made
           | a zero-bugs claim about their own work.
        
         | sfink wrote:
         | Yeah, same. It suggests that you must be employing one of the
         | time-honored approaches to getting zero bugs:
         | 
         | * Redefine all bugs as features
         | 
         | * Redefine "bug" to conveniently only apply to the things your
         | system prevents
         | 
         | * Don't write software
         | 
         | This reminds me of bugzilla's "Zarro Boogs" phrase that
         | pointedly avoids saying "Zero Bugs" because it's such a
         | deceptive term, see https://en.wikipedia.org/wiki/Bugzilla
         | 
         | Being able to say "no bugs" with justifiable confidence, even
         | when restricting it to some class of bugs, is truly a great and
         | significant thing. cf Rust. But claiming to have no bugs is
         | cringeworthy.
        
         | chinchilla2020 wrote:
         | It's a marketing blog post, not a technical post. Something
         | about the whole thing feels icky.
        
       | sackfield wrote:
       | "At FoundationDB, once we hit the point of having ~zero bugs and
       | confidence that any new ones would be found immediately, we
       | entered into this blessed condition and we flew. Programming in
       | this state is like living life surrounded by a force field that
       | protects you from all harm. Suddenly, you feel like you can take
       | risks"
       | 
       | When this state hits it really is a thing to behold. It's very
       | empowering to trust your system to this extent, and to know if
       | you introduce a bug a test will save you.
        
       | dkyc wrote:
       | On mobile, the "Let's talk" button in the top right corner is cut
       | off by the carousel menu overlay. Seems like CSS is still out of
       | scope of the bug fixing magic for now.
       | 
       | On a more serious note, it's an interesting blog post, but it
       | comes off as veeery confident about what is clearly an incredibly
       | broad and complex topic. Curious to see how it will work in
       | production.
        
         | wruza wrote:
         | Yeah, if only there was some scientific way to ensure that
         | elements don't overlap, let's call it "constraints" maybe, so
         | one could test layouts by simply solving, idk... something like
         | a set of linear equations? Hope some day CSS will stop being
         | "aweso"me and become nothing in favor of a useful layout
         | system.
        
         | wwilson wrote:
         | Aww... crap, you're right. I knew we should have finished the
         | UI testing product and run it on ourselves before launching.
         | 
         | Disclosure: Antithesis co-founder.
        
         | terpimost wrote:
         | Designer here, sorry, it is intentional. I thought a horizontally
         | scrollable menu was more straightforward than a full-screen
         | expander.
        
       | thomastraum wrote:
       | https://antithesis.com/images/people/will.jpg the look of the CEO
       | is selling the software to me automatically. reliable and nice
        
       | islandert wrote:
       | There's a straightforward way to reach this testing state for
       | optimization problems. Write 2 implementations of the code, one
       | that is simple/slow and one that is optimized. Generate random
       | inputs and assert outputs match correctly.
       | 
       | I've used this for leetcode-style problems and have never failed
       | on correctness.
       | 
       | It is liberating to code in systems that test like this for the
       | exact reasons mentioned in the article.
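
A minimal sketch of the two-implementation approach described above, in Python. The problem (maximum subarray sum) and every function name here are illustrative assumptions, not taken from the comment. The slow version is the easy-to-trust reference, and the fast version is checked against it on seeded random inputs.

```python
import random

def max_subarray_slow(xs):
    """Reference: try every non-empty contiguous slice, O(n^2)."""
    best = xs[0]
    for i in range(len(xs)):
        total = 0
        for j in range(i, len(xs)):
            total += xs[j]
            best = max(best, total)
    return best

def max_subarray_fast(xs):
    """Optimized: Kadane's algorithm, O(n)."""
    best = current = xs[0]
    for x in xs[1:]:
        current = max(x, current + x)
        best = max(best, current)
    return best

def check_agreement(trials=1000, seed=0):
    """Compare both implementations on random non-empty inputs."""
    rng = random.Random(seed)  # seeded so any failure can be replayed
    for _ in range(trials):
        xs = [rng.randint(-10, 10) for _ in range(rng.randint(1, 12))]
        # On a mismatch, the failing input is attached to the assertion
        assert max_subarray_fast(xs) == max_subarray_slow(xs), xs
    return trials
```

Small value ranges and short lists are a deliberate choice: they make collisions, ties, and all-negative inputs likely, which is where optimized code tends to diverge from the reference.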
        
         | mrkeen wrote:
         | Non-overlapping problem spaces.
         | 
         | Leet-code ends in unit-testing land, this product begins in
         | system-testing land.
        
       | Qwuke wrote:
       | I met Antithesis at Strangeloop this year and got to talk to
       | employees about the state of the art of automated fault injection
       | that I was following when I worked at Amazon, and I cannot
       | overstate how their product is a huge leap forward compared to
       | many of the formal verification systems being used today.
       | 
       | I actually got to follow their bug tracking process on an issue
       | they identified in Apache Spark streaming - going off of the
       | docs, they managed to identify a subtle and insidious correctness
       | error in a common operation that would've caused headaches in
       | low-visibility edge cases for years at that point. In the end the
       | docs were incorrect, but after that showing I cannot imagine how
       | critical tools like Antithesis will be inside companies building
       | distributed systems.
       | 
       | I hope we get some blog posts that dig into the technical weeds
       | soon, I'd love to hear what brought them to their current
       | approach.
        
       | BoppreH wrote:
       | Three thoughts:
       | 
       | 1. It's a brilliant idea that came at the right time. It feels
       | like people are finally losing patience with flaky software, see
       | developer sentiment on: fuzzers, static typing, memory safety,
       | standardized protocols, containers, etc.
       | 
       | 2. It's meant to be niche. $2 per hour per CPU (or $7000 per year
       | per CPU if reserved), no free tier for hobby or FOSS, and the
       | only way to try/buy is to contact them. Ouch. It's a valid
       | business model, I'm just sad it's not going for maximum positive
       | impact.
       | 
       | 3. Kudos for the high quality writing and documentation, and I
       | absolutely love that the docs include things like (emphasis in
       | original):
       | 
       | > If a bug is found in production, or by your customers, _you
       | should demand an explanation from us_.
       | 
       | That's exactly how you buy developer goodwill. Reminds me of
       | Mullvad, who I still recommend to people even after they dropped
       | the ball on me.
        
         | wwilson wrote:
         | Thanks for your kind words! As I mention in this comment
         | (https://news.ycombinator.com/item?id=39358526) we are planning
         | to have pricing suitable for small teams, and perhaps even a
         | free tier for FOSS, in the future.
         | 
         | Disclosure: Antithesis co-founder.
        
           | eatonphil wrote:
           | There are a few FOSS projects I'd love to set this up for if
           | you ever get to the free tier. :)
        
         | jerf wrote:
         | "It's meant to be niche. $2 per hour per CPU (or $7000 per year
         | per CPU if reserved), no free tier for hobby or FOSS, and the
         | only way to try/buy is to contact them. Ouch. It's a valid
         | business model, I'm just sad it's not going for maximum
         | positive impact."
         | 
         | This is the sort of thing that, if it takes off, will start
         | affecting the entire software world. Hardware will start adding
         | features to support it. In 30 years this may simply be how
         | computing works. But the pioneers need to recover the costs of
         | the arrows they got stuck with before it can really spread out.
         | Don't look at this as an event, but as the beginning of a process.
        
         | whatshisface wrote:
         | $2 per hour per CPU could be expensive or inexpensive,
         | depending on how long it takes to fuzz your program. I wonder
         | how that multiplies out in real use cases?
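
A rough back-of-the-envelope calculation for the question above, using only the two prices quoted in the thread ($2 per CPU-hour on demand, $7000 per CPU-year reserved). The 16-CPU, 8-hour nightly run is a made-up workload for illustration, not a real use case.

```python
ON_DEMAND_PER_CPU_HOUR = 2.0      # dollars, as quoted in the thread
RESERVED_PER_CPU_YEAR = 7000.0    # dollars, as quoted in the thread
HOURS_PER_YEAR = 24 * 365         # 8760, ignoring leap years

def on_demand_cost(cpus, hours):
    """Total on-demand cost in dollars for a given number of CPUs and hours."""
    return cpus * hours * ON_DEMAND_PER_CPU_HOUR

# Hours per year at which on-demand spend equals the reserved price
break_even_hours = RESERVED_PER_CPU_YEAR / ON_DEMAND_PER_CPU_HOUR

# One CPU kept busy all year at the on-demand rate
full_year_on_demand = on_demand_cost(1, HOURS_PER_YEAR)

# Hypothetical workload: 16 CPUs for 8 hours
nightly_run = on_demand_cost(16, 8)
```

Under these numbers the break-even is 3500 CPU-hours, roughly 40% of a year, so reserving only pays off for a CPU that is busy more than about two days in five.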
        
       | benrutter wrote:
       | This is a great pitch, and I don't want to come across as
       | negative, but I feel like a statement like "we found all bugs"
       | can only be true with a very narrow definition of bug.
       | 
       | The most pernicious, hard-to-find bugs that I've come across have
       | all been around the business logic of an application, rather than
       | it hitting into an error state. I'm thinking of the category
       | where you have something like "a database is currently reporting
       | a completed transaction against a customer, but no completed
       | purchase item, how should it be displayed on the customer _recent
       | transactions_ page?". Implementing something where "a thing will
       | appear and not crash" in those cases is one thing, but making
       | sure that it actually makes sense as a choice given all the
       | context of everyone else's choices everywhere else in the stack is
       | a lot harder.
       | 
       | Or to take a database, something along the lines of "our query
       | planner produces a really suboptimal plan in this edge-case".
       | 
       | Neither of those types of problems could ever be automatically
       | detected, because they aren't issues of the program reaching
       | an error state- the issue is figuring out in the first place what
       | "correct" actually is for your application.
       | 
       | Maybe I'm setting the bar too high for what a "bug" is, but I
       | guess my point is, it's one thing to fantasize about having zero
       | bugs, it's another to build software in the real world. I'd probably
       | still settle for 0 run time errors though to be fair. . .
        
         | moritonal wrote:
         | Good summary of the hard part of being a software developer
         | that deals with clients.
        
           | Aachen wrote:
           | What software developer does not deal with clients (and makes
           | a living)?
        
             | ejb999 wrote:
             | lots of software developers never deal with clients
             | (clients as in the people who will actually use the
             | software) - most of them in fact, in any of the big
             | companies I have worked for anyway...and that is probably
             | not a good thing.
             | 
             | I myself, prefer to work with the people who will actually
             | use what I build - get a better product that way.
        
         | adamauckland wrote:
         | I consider a "bug" to be "it was supposed to do something and
         | failed".
         | 
         | Issues around business logic are not failures of the system,
         | the system worked to spec, the spec was not comprehensive
         | enough and now we iterate.
        
           | repelsteeltje wrote:
           | ...And now we could probably start debating your narrow
           | definition of "system". ;-)
        
             | pipo234 wrote:
             | Most of the software I've built doesn't have "a spec.", but
             | let me zoom in on specs. around streaming media. MPEG DASH,
             | CMAF or even the base media file format (ISO/IEC 14496-12)
             | at times can be pretty vague. In practice, this frequently
             | turns up in actual interoperability issues where it's
             | pretty difficult to point out which of two products is
             | according to spec and which one has a bug.
             | 
             | So yes, I totally agree with GP and would actually go
             | further: a phrase like "we found all the bugs in the
             | database" is nonsense and makes the article less credible.
        
           | Aachen wrote:
           | What do you call it when the spec is wrong? Like clearly
           | actually wrong, such as when someone copied a paragraph from
           | one CRUD-describing page to the next and forgot to change the
           | word "thing1" to "thing2" in the delete description.
           | 
           | Because I'd call that a bug. A spec bug, but a bug. It's no
           | feature request to make the code based on the newer page
           | delete thing2 rather than thing1, it's fixing a defect
        
             | pinkmuffinere wrote:
             | Ya, I would like a word for this as well. I naturally refer
             | to this category of error as bug, but this occasionally
             | leads to significant conflict with others at work. I now
             | default to calling _almost everything_ a feature request,
             | which is obviously dumb but less likely to get me into
             | trouble. If there is a better word for "it does exactly
             | what we planned, but what we planned was wrong" I would
             | love to adopt it.
        
               | Aachen wrote:
               | I reported such a bug to some software my company uses
               | (Tempo). Vendor proceeds to call it a feature request
               | because the software _successfully fails_ to show public
               | information (visible in the UI, but HTTP 403 in the API
               | unless you're an admin).
               | 
               | Instead of changing one word in the code that defines the
               | access level required for this GET call, it gets triaged
               | as not being a bug, put on a backlog, and we never heard
               | from it again obviously
               | 
               | We pay for this shit
        
             | SilasX wrote:
             | There's the distinction between correctness and fitness for
             | purpose which I think is helpful for clarifying the issues
             | here.
             | 
             | Correctness bug: it didn't do what the spec says it should
             | do.
             | 
             | Fitness for purpose bug: it does what the spec says to do,
             | but, with better knowledge, the spec isn't what you
             | actually want.
             | 
             | Edit: looks like this maps, respectively, to failing
             | verification and failing validation.
             | https://news.ycombinator.com/item?id=39359673
             | 
             | Edit2: My earlier comment on the different things that get
             | called "bugs", before I was aware of this terminology:
             | https://news.ycombinator.com/item?id=22259973
        
           | rkangel wrote:
           | Systems Engineering has terminology for this distinction.
           | 
           | Verification is "does this thing do what I asked it to do".
           | 
           | Validation is "did I ask it to do the right thing".
        
             | crashabr wrote:
             | Tangentially related, but I've recently started
             | distinguishing verification and validation in my data
             | cleaning work:
             | 
             | verification refers to "is this dataset clean?" or the more
             | precise "does this dataset confirm my assumptions about
             | what a correct dataset should be given its focus"
             | 
             | validation refers to "can it answer my questions?" or the
             | more rigorous "can I test my hypotheses against this
             | dataset?"
             | 
             | So I find this interesting (but in hindsight unsurprising)
             | that similar definitions are used in other fields. Would
             | you have a source for your definitions?
        
               | rkangel wrote:
               | They're fairly standard terms from "old style" project
               | management - they show up in the usual V Model of
               | Waterfall vein.
               | 
               | E.g. see Wikipedia:
               | https://en.m.wikipedia.org/wiki/Verification_and_validation
        
           | zestyping wrote:
           | A spec bug is just as bad as a code bug! Declaring a system
           | free of defects because it matches the spec is sneaky
           | sleight-of-hand that ignores the costs of having a spec.
           | 
           | The actual testing value is the difference between the cost
           | of writing and maintaining the code, and the cost of writing
           | and maintaining the spec.
           | 
           | If the spec is similar in complexity to the code itself, then
           | bugs in the spec are just as likely as bugs in the code, thus
           | verification to spec has gained you nothing (and probably
           | cost you a lot).
        
         | amw-zero wrote:
         | I do think that it was a mistake to use the word "all" and
         | imply that there are absolutely no bugs in FoundationDB.
         | However, FoundationDB is truly known as having advanced the
         | state of the art for testing practices:
         | https://apple.github.io/foundationdb/testing.html.
         | 
         | So in normal cases this would reek of someone being arrogant /
         | overconfident, but here they really have gotten very close to
         | zero bugs.
        
           | spinningD20 wrote:
           | The other issue I would point out is that building a
           | database, while impressive with their quality, is still
           | fundamentally different than an application or set of
           | applications like a larger SaaS offering would involve (api,
           | web, mobile, etc). Like the difference between API and UI
           | test strategies, where API has much more clearly defined and
           | standardized inputs and outputs.
           | 
           | To be clear, I am not saying that you can't define all inputs
           | and outputs of a "complete SaaS product offering stack",
           | because you likely could, though if it's already been built
           | by someone that doesn't have these things in mind, then it's
           | a different problem space to find bugs.
           | 
           | As someone who has spent the last 15 years championing
           | quality strategy for companies and training folks of varying
           | roles on how to properly assess risk, it does indeed feel
           | like this has a more narrow scope of "bug" as a definition,
           | in the sort of way that a developer could try to claim that
           | robust unit tests would catch "any" bugs, or even most of
           | them. The types of risk to a software's quality have larger
           | surface areas than at that level.
        
             | amw-zero wrote:
             | There's a lot of assertions that I throw into business
             | applications that would be very useful to test in this way.
             | So I don't think this only applies to testing databases.
             | 
             | Also, when properties are difficult to think of, that often
             | means that a model of the behavior might be more
             | appropriate to test against, e.g.
             | https://concerningquality.com/model-based-testing/. It
             | would take a bit of design work to get this to play nicely
             | with the Antithesis approach, but it's definitely doable.
        
               | spinningD20 wrote:
               | Just to clarify, I am definitely not saying this is only
               | useful or only applies to databases.
               | 
               | The point was more that, I don't see how this testing
               | approach (at the level that it functions) would catch all
               | of the bugs that I have seen in my career, and so to say
               | "all of the bugs" or even "most of the bugs" is
               | definitely a stretch.
               | 
               | This is certainly useful, just like unit tests,
               | assertions, etc are all very useful. It's just not the
               | whole picture of "bugs".
        
               | amw-zero wrote:
               | Yes, there are plenty of non-functional logic bugs, e.g.
               | performance issues. I think this starts to drastically
               | hone in on the set of "all" bugs though, especially by
               | doing things like network fault injection by default.
               | This will trigger complex interactions between
               | dependencies that are likely almost never tested.
               | 
               | They should clarify that this is focused on functional
               | logic bugs though, I agree with that.
        
         | nlavezzo wrote:
         | I think the reference to "all the bugs" here is basically that
         | our insanely brutal deterministic testing system was not
         | finding any more bugs after 100's of thousands of runs. Can't
         | prove a negative obviously, but the fact that we'd gotten to
         | that "all green" status gave us a ton of confidence to push
         | forward in feature development, believing we were building on
         | something solid - which, time has shown we were.
        
           | dap wrote:
           | Thanks -- that's very clarifying! But isn't this circular?
           | The lack of bugs is used as evidence of the effectiveness of
           | the testing approach, but the testing approach is validated
           | by...not finding any more bugs in the software?
        
             | FridgeSeal wrote:
             | Yeah but if your software is running in an environment that
             | controls for a lot of non-determinism _and_ can simulate
             | various kinds of failures and degradations at varying
             | rates, and do it all in accelerated time and your software
             | is still working correctly; I think it'd be somewhat
             | reasonable to assert that maybe the testing setup has done
             | a pretty good job.
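
To make "controls for non-determinism", "simulate failures at varying rates", and "accelerated time" concrete, here is a toy sketch in Python. It is not how Antithesis or FoundationDB works internally. The lossy link, the 30-second retry timeout, and all names are assumptions chosen for illustration. The two ideas it shows are a virtual clock (no real sleeping) and a seeded random source, so the same seed replays the same history.

```python
import heapq
import random

def simulate(seed, drop_rate=0.3, messages=5):
    """Deliver messages over a lossy link with retries, on a virtual clock."""
    rng = random.Random(seed)      # the only source of randomness
    now = 0.0                      # simulated seconds, never a real sleep
    events = []                    # min-heap of (time, sequence, message id)
    seq = 0                        # tie-breaker that keeps ordering stable
    for msg in range(messages):
        heapq.heappush(events, (float(msg), seq, msg))
        seq += 1
    log = []
    while events:
        now, _, msg = heapq.heappop(events)
        if rng.random() < drop_rate:
            # Injected fault: message dropped, retry after a simulated timeout
            log.append((now, msg, "dropped"))
            heapq.heappush(events, (now + 30.0, seq, msg))
            seq += 1
        else:
            log.append((now, msg, "delivered"))
    return log
```

Calling `simulate(42)` twice returns identical logs, which is what makes a failing run debuggable: the seed is the whole bug report. Minutes of simulated retries cost microseconds of wall-clock time because the clock just jumps to the next event.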
        
               | dap wrote:
               | Agreed, the approach sounds very interesting and I can
               | see how it could be very effective! I'd love to try it on
               | my own stuff. That's why it's so surprising (to me) to
               | claim that the approach found nearly every bug in
               | something as complicated as a production distributed
               | database. My career experience tells me (quite strongly)
               | that can't possibly be true.
        
         | dap wrote:
         | The best definition I've heard for "bug" is "software not
         | working as documented". Of course, a lot of software is lacking
         | documentation -- and those are doc bugs. But I like this
         | definition because even when the docs are incomplete, the
         | definition guides you to ask: would I really document that the
         | software behaves like this or would I change the behavior [and
         | document that]? It's harder (at least for me) to sweep goofy
         | behavior under the rug.
        
         | oconnor663 wrote:
         | To be fair, the line right after that is "I know, I know,
         | that's an insane thing to say."
        
         | pshc wrote:
         | I feel like business logic bugs live on a separate layer, the
         | application layer, and it's not fair to count those against the
         | database itself.
         | 
         | I agree that suboptimal query planning would be a database-
         | layer bug, a defect which could easily be missed by the bug-
         | testing framework.
        
       | aftbit wrote:
       | This "no bugs" maximalism is counterproductive. There are many
       | classes of bugs that this cannot hope to handle. For example,
       | let's say I have a transaction processing application that speaks
       | to Stripe to handle the credit card flow. What happens if Stripe
       | begins sending a webhook showing that it rejected my transactions
       | but reports them as completed successfully when I poll them? The
       | need to "delete all of our dependencies" (I presume they wrote
       | their own OS kernel too?) in FoundationDB shows that upstream
       | bugs will always sneak through this tooling.
        
       | mempko wrote:
       | In my career I learned two powerful tools to get bug free code.
       | Design by Contract and Randomized testing.
       | 
       | I had to roll this by myself for each project I did. Antithesis
       | seems to have systematized it and created great tooling around it.
       | That's Great!!!
       | 
       | However, looking at their docs they rely on assertion failures to
       | find bugs. I believe Antithesis has a missed opportunity here by
       | not properly pushing for Design by Contract instead of generic
       | use of assertions. They don't even mention Design by Contract. I
       | suspect the vast majority of people here on HN have never heard
       | of it.
       | 
       | They should create a Design by Contract SDK for languages that
       | don't have one (think most languages) that interacts nicely with
       | tooling and only fallback to generic assertions when their SDK is
       | not available. A Design by Contract SDK would provide better
       | error messages over generic assertions, further helping users
       | solve bugs. In fact, their testing framework is useless without
       | contracts being diligently used. It requires a different training
       | and mindset from engineers. Teaching them Design by Contract puts
       | them in that frame of mind.
       | 
       | They have an opportunity to teach Design by Contract to a new
       | generation of engineers. I'm surprised they don't even mention
       | it.
        
         | mrkeen wrote:
         | I've never gotten anything more out of DbC than it being
         | assertions and if-statements, but described using fancy
         | English. I even worked with the creator of C4J a few years ago.
        
           | mempko wrote:
           | The primary benefit imo is
           | 
           | * Way of thinking and discipline. Instead of ad hoc
           | assertions, you deliberately state in code "These are the
           | preconditions, invariants, and postconditions" of this
           | function/module
           | 
           | * Better error messages.
           | 
           | * Better documentation (can automate extracting the contracts
           | as documentation).
           | 
           | * Better tooling. Can automate creating tests from
           | preconditions. You can sample the function's input space and
           | make sure invariants and postconditions hold.
           | 
           | It's like, do you name all your functions 'func a1, func a2,
           | func a3' or do you provide better names?
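
A small sketch of the points in the list above, in Python. The `contract` decorator and `ContractError` are hypothetical helpers written for this example, not part of any existing Design by Contract library or of the Antithesis SDK. It shows explicitly stated pre- and postconditions, more specific error messages than a bare `assert`, and sampling the input space so the postcondition acts as the test oracle.

```python
import functools
import random

class ContractError(AssertionError):
    """Raised when a precondition or postcondition is violated."""

def contract(pre=None, post=None):
    """Hypothetical decorator: check pre(*args) before and post(result, *args) after."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            if pre is not None and not pre(*args):
                raise ContractError(
                    f"precondition of {fn.__name__} failed for {args!r}")
            result = fn(*args)
            if post is not None and not post(result, *args):
                raise ContractError(
                    f"postcondition of {fn.__name__} failed: {args!r} -> {result!r}")
            return result
        return wrapper
    return decorate

@contract(
    pre=lambda n: isinstance(n, int) and n >= 0,
    post=lambda r, n: r * r <= n < (r + 1) * (r + 1),
)
def isqrt(n):
    """Integer square root by binary search."""
    lo, hi = 0, n + 1              # loop invariant: lo*lo <= n < hi*hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * mid <= n:
            lo = mid
        else:
            hi = mid
    return lo

def sample_inputs(trials=500, seed=0):
    """Draw inputs that satisfy the precondition; the postcondition does the checking."""
    rng = random.Random(seed)
    for _ in range(trials):
        isqrt(rng.randint(0, 10**6))
    return trials
```

Calling `isqrt(-1)` raises a `ContractError` that names the function and the offending argument, which is the "better error messages" point. `sample_inputs()` is the "automate creating tests from preconditions" point in its simplest form.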
        
       | 0xbadcafebee wrote:
       | > The biggest effect was that it gave our tiny engineering team
       | the productivity of a team 50x its size.
       | 
       | 49 years ago, a man named Fred Brooks published a book, wherein
       | he postulated that adding people to a late software project makes
       | it later. It's staggering that 49 years later, people are still
       | discovering that having a larger engineering team does not make
       | your work more productive (or better). So what does make work
       | more productive?
       | 
       | Productivity requires efficiency. Efficiency is expensive,
       | complicated, nuanced, curt. You can't just start out from day 1
       | with an efficient team or company. It has to be grown,
       | intentionally, continuously, like a garden of fragile flowers in
       | a harsh environment.
       | 
       | Is the soil's pH right? Good. Is it getting enough sun? Good.
       | Wait, is that leaf a little yellow? Might need to shade it. Hmm,
       | are we watering it too much? Let's change some things and see.
       | Ok, doing better now. Ah, it's growing fast now. Let's trim some
       | of those lower leaves. Hmm, it's looking a little tall, is it
       | growing too fast? Maybe it does need more sun after all.
       | 
       | If you really pay attention, and continue to make changes towards
       | the goal of efficiency, you'll get there. No need for a 10x
       | developer or 3 billion dollars. You just have to listen, look,
       | change, measure, repeat. Eventually you'll feel the magic of
       | zooming along productively. But you have to keep your eye on it
       | until it blooms. And then keep it blooming...
        
       | Communitivity wrote:
       | There are situations where no bugs is an important requirement,
       | if it means no bugs that cause a noticeable failure. Things such
       | as planes, submarines, nuclear reactors. For those there is
       | provably correct code. That takes a long time to write, and I
       | mean a really long time. Applying that to all software doesn't
       | make sense from a commercial perspective. There are areas where
       | improvements can have a big impact though, such as language
       | safety improvements (Rust) and cybersecurity requirements
       | regarding private data protection. I see those as being the
       | biggest win.
       | 
       | I don't see no bugs in a distributed database as important enough
       | to delay shipping for 5 years, but (a) it's not my baby; (b) I
       | don't know what industries/use cases they are targeting. For me
       | it's much more important to ship something with no critical bugs
       | early, get user feedback, iterate, then rinse and repeat
       | continually.
        
         | amw-zero wrote:
         | This is a false dichotomy though. The proposed approach here
         | has a (theoretically) great cost to value ratio. Spending time
         | on a workload generation process, and adding some asserts to
         | your code is much lower cost than hand-writing tens of
         | thousands of test cases.
         | 
         | So it's not that this approach is only useful for critical
         | applications, it's that it's low-cost enough to potentially
         | speed up "regular" business application testing.
        
         | 0xbadcafebee wrote:
         | A lot of people underestimate the power of QA. Yeah, it would
         | be great if we could just perfectly engineer something out of
         | the gate. But you can also just take several months to stare at
         | something, poke at it, jiggle it, and fix every conceivable
         | problem, before shipping it. Heresy in the software world, but
         | in every other part of the world it's called quality.
        
         | mrkeen wrote:
         | > I don't see no bugs in a distributed database as important
         | enough to delay shipping for 5 years
         | 
         | The marketplace has enough distributed databases with bugs.
         | There's a nice catalogue of them at jepsen.io.
         | 
         | > For me it's much more important to ship something with no
         | critical bugs early, get user feedback, iterate, then rinse and
         | repeat continually.
         | 
         | * You can't really choose which bugs are critical if you're
         | selling a database. A lost write is as critical as the customer
         | deems it is.
         | 
         | * You're not limited to your own users' feedback. There's
         | plenty of users out there who disapprove of a buggy database,
         | so you can probably take their views onboard before release.
        
       | fleaflicker wrote:
       | Business value is a good way to think about it:
       | 
       | > As a software developer, fixing bugs is a good thing. Right?
       | Isn't it always a good thing?
       | 
       | > No!
       | 
       | > Fixing bugs is only important when the value of having the bug
       | fixed exceeds the cost of the fixing it.
       | 
       | https://www.joelonsoftware.com/2001/07/31/hard-assed-bug-fix...
        
       | coolThingsFirst wrote:
       | Why are all the cool people working on DBs and talking about
       | Paxos?
        
       | A-Dmonkey wrote:
       | one of the best applications yet of AI in cyber
        
       | mprime1 wrote:
       | Great read. Great product. I've been an early user of Antithesis.
       | My background is dependability and formal distributed systems.
       | 
       | This thing is magic (or rather, it's indistinguishable from magic
       | ;-)).
       | 
       | If they told me I could test any distributed system without a
       | single line of code change, do things like step-by-step
       | debugging, even rollback time at will, I would not believe it.
       | But Antithesis works as advertised.
       | 
       | It's a game-changer for distributed systems that truly care about
       | dependability.
        
       | chrsw wrote:
       | Could this work for embedded C projects? Bare metal or RTOS?
        
       | karatekidd32v wrote:
       | Not directly related to this post, but clicking around the
       | webpage I chuckled seeing Palantir's case study/testimonial:
       | 
       | https://antithesis.com/solutions/who_we_help
        
       | norir wrote:
       | I think there is a lot of opportunity for integrating simulation
       | into software development. I'm surprised it isn't more common
       | though I suppose the upfront investment would scare many away.
        
       | kendallgclark wrote:
       | Happy customer here -- maybe the first or second? Distributed
       | systems are hard; #iykyk.
       | 
       | Antithesis makes them less hard (not in like an NP-hard sense but
       | still!).
        
       | binarymax wrote:
       | I got really excited about this, and I spent a little time
       | looking through the documentation, but I can't figure out how
       | this is different than randomizing unit tests? It seems if I have
       | a unit test suite already, then that's 99% of the work? Am I
       | misunderstanding? I am drawing my conclusions from reading the
       | Getting Started series of the docs, especially the Workloads
       | section:
       | https://antithesis.com/docs/getting_started/workload.html
        
         | jakewins wrote:
         | This is that, and the exact same vibe, except: it promises to
         | keep being that simple even after you add threads, and locks,
         | and network calls, and disk accesses and..
         | 
         | With this, if you write a test for a function that makes a
         | network call and writes the result to disk, your test will fail
         | if your code does not handle the network call failing or
         | stalling indefinitely, or the disk running out of space, or the
         | power going out just before you close the file, or..
         | 
         | So it's; yes, but it expands the space where testing is as easy
         | as unit testing to cover much more interesting levels of
         | complexity
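
A hand-rolled sketch of the kind of test described above, in Python, for readers who have not seen fault injection before. Antithesis injects faults underneath unmodified software. This example instead fakes the network and disk in-process, and `FlakyNetwork`, `TinyDisk`, and `fetch_and_store` are all invented names. Only two of the listed faults are modelled (a failing network call and a full disk). Stalls and power loss are left out.

```python
import random

class FlakyNetwork:
    """Hypothetical stand-in for a network client; fails at a seeded random rate."""
    def __init__(self, rng, failure_rate):
        self.rng = rng
        self.failure_rate = failure_rate

    def fetch(self, key):
        if self.rng.random() < self.failure_rate:
            raise ConnectionError(f"injected failure fetching {key!r}")
        return f"value-of-{key}"

class TinyDisk:
    """Hypothetical in-memory disk with a capacity limit in characters."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.files = {}

    def write(self, name, data):
        used = sum(len(v) for k, v in self.files.items() if k != name)
        if used + len(data) > self.capacity:
            raise OSError("injected failure: disk full")
        self.files[name] = data

def fetch_and_store(net, disk, key, retries=3):
    """Code under test: returns True only if the value was stored."""
    for _ in range(retries):
        try:
            data = net.fetch(key)
        except ConnectionError:
            continue               # retry the network call
        try:
            disk.write(key, data)
        except OSError:
            return False           # a full disk is reported, not hidden
        return True
    return False

def run_trial(seed):
    """One randomized scenario; the seed picks the fault rates and capacity."""
    rng = random.Random(seed)
    net = FlakyNetwork(rng, failure_rate=rng.choice([0.0, 0.5, 0.9]))
    disk = TinyDisk(capacity=rng.choice([0, 20, 1000]))
    results = {k: fetch_and_store(net, disk, k) for k in ("a", "b", "c")}
    for key, ok in results.items():
        # Property: success is reported if and only if the data is on disk
        assert ok == (disk.files.get(key) == f"value-of-{key}"), (seed, key)
    return results
```

Running `run_trial(seed)` over a range of seeds exercises combinations such as "network mostly down and disk nearly full" that a hand-written unit test would rarely cover, and any failing seed reproduces exactly.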
        
         | nlavezzo wrote:
         | Antithesis here - curious what part of the Getting Started doc
         | gave you that impression? If you take a look at our How
         | Antithesis Works page, it might help answer your question as to
         | how Antithesis is different from just bundling your unit tests.
         | 
         | https://antithesis.com/docs/introduction/how_antithesis_work...
         | 
         | In short though, unit tests can help to inform a workload, but
         | we don't require them. We autonomously explore software system
         | execution paths by introducing different inputs, faults, etc.,
         | which discovers behaviors that may have been unforeseen by
         | anyone writing unit tests.
        
           | binarymax wrote:
           | Thanks for the response. The linked introduction does help.
           | The workload page does give me that impression (and based on
           | upvotes of my post it does to others as well)...so perhaps
           | disambiguating that the void test*() examples on the
           | workloads page are not unit tests might help!
           | 
           | Congrats on the launch and I'll consider using it for some of
           | my projects.
        
       | iamnotsure wrote:
       | "I love me a powerful type system, but it's not the same as
       | actually running your software in thousands and thousands of
       | crazy situations you'd never dreamed of."
       | 
       | Would not trust. Formal software verification is badly needed.
       | Running thousands of tests means almost nothing in the software
       | world. Don't fool beginners with your test hero stories.
        
         | mrkeen wrote:
         | Great, but formal software verification is not yet broadly
         | applicable to most day-to-day app development.
         | 
         | Good type systems (a pretty decent chunk of formal software
         | dev) are absolutely necessary and available.
         | 
         | But things get tricky moving past that.
         | 
         | I've tried out TLA+/PlusCal, and one or more things usually
         | happen:
         | 
         | 1) The state space blows up and there's simply too much to
         | simulate, so you can't run your proof.
         | 
         | 2) With regard to race-detection, you yourself have to choose
         | which sections of code are atomic, and which can be
         | interleaved. Huge effort, source of errors, and fills the TLA
         | file with noise.
         | 
         | 3) Anything you want to run/simulate needs an implementation in
         | TLA+. By necessity it's a cut-down version, or 'model'. But
         | even when I'm happy to pretend all-of-Kafka is just a single
         | linkedlist, there's still _so much (bug-inviting) coding_ to
         | model your critical logic in terms of your linked list.
         | 
         | Ironically, TLA+ is not itself typed (_deliberately_!). In a
         | toy traffic light example, I once proved that cars and
         | pedestrians wouldn't be given "greenLight" at the same time.
         | Instead, the cars had "greenLight" and the pedestrians had
         | "green"!
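         | 
         | The same failure mode can be sketched outside TLA+. The Python
         | below is only an illustration with hypothetical names, not the
         | original spec: an invariant over bare strings holds vacuously
         | when one side is spelled differently, while an enum forces a
         | single spelling.

```python
from enum import Enum


def never_both_green_untyped(cars, pedestrians):
    # Invariant as written: both sides are compared against "greenLight".
    return not (cars == "greenLight" and pedestrians == "greenLight")


class Light(Enum):
    RED = "red"
    GREEN = "green"


def never_both_green_typed(cars: Light, pedestrians: Light) -> bool:
    # With an enum there is only one way to say "green"; a misspelled member
    # such as Light.GREN raises AttributeError instead of silently passing.
    return not (cars is Light.GREEN and pedestrians is Light.GREEN)


# The model encodes the pedestrian signal as "green", so the untyped
# invariant still holds in the state where both parties are allowed to go.
untyped_ok = never_both_green_untyped("greenLight", "green")

# The typed version reports the violation for the same state.
typed_ok = never_both_green_typed(Light.GREEN, Light.GREEN)
```

         | Here untyped_ok is True and typed_ok is False for the same
         | unsafe state.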
        
         | sfink wrote:
         | That'll work great for your Distributed QSort Incorporated
         | startup, where the only product is a sorting algorithm.
         | 
         | Formal software verification is very useful. But what can be
         | usefully formalized is rather limited, and what can be
         | formalized correctly in practice is even more limited. That
         | means you need to restrict your scope to something sane and
         | useful. As a result, in the real world running thousands of
         | tests is practically useful. (Well, it depends on what those
         | tests are; it's easy to write 1000s of tests that either test
         | the same thing, or only test the things that will pass and not
         | the things that would fail.) They are _especially_ useful if
         | running in a mode where the unexpected happens often, as it
         | sounds like this system can do. (It's reminiscent of rr's
         | chaos mode -- https://rr-project.org/ linking to
         | https://robert.ocallahan.org/2016/02/introducing-rr-chaos-mo...
         | )
        
         | pfdietz wrote:
         | Formal verification requires a formal statement of what the
         | software is supposed to do.
         | 
         | But if you have that, you have a recipe for doing property
         | based testing: generate inputs that satisfy the conditions
         | specified in this formal description, then verify that the
         | behavior satisfies the specification also.
         | 
         | And then run for millions and millions of inputs.
         | 
         | Is it _really_ going to be worth proving the program correct,
         | when you could just run an endless series of tests? Especially
         | if the verifier takes forever solving NP-hard theorem-proving
         | problems at every check in. Use that compute time to just run
         | the tests.
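         | 
         | A minimal sketch of that recipe, using only the standard
         | library. The function under test, the specification, and the
         | input ranges are all hypothetical stand-ins; a real project
         | would more likely use a property-based testing library.

```python
import random


def dedupe_sorted(xs):
    """Hypothetical function under test: sorted list of distinct values."""
    return sorted(set(xs))


def buggy_dedupe_sorted(xs):
    """Deliberately wrong variant: forgets to remove duplicates."""
    return sorted(xs)


def satisfies_spec(inp, out):
    """Executable form of the specification: the output is strictly
    increasing and contains exactly the elements of the input."""
    strictly_increasing = all(a < b for a, b in zip(out, out[1:]))
    same_elements = set(out) == set(inp)
    return strictly_increasing and same_elements


def run_property_test(fn, trials=10_000, seed=0):
    """Generate random inputs and return the first counterexample found,
    or None if every trial satisfied the specification."""
    rng = random.Random(seed)
    for _ in range(trials):
        inp = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        out = fn(list(inp))
        if not satisfies_spec(inp, out):
            return inp
    return None
```

         | run_property_test(dedupe_sorted) finds no counterexample,
         | while run_property_test(buggy_dedupe_sorted) returns an input
         | containing a repeated value.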
        
       | mamidon wrote:
       | I can see how determinism can be achieved (not easy, but
       | possible), and I can see how describing a few important system
       | invariants can match 100s or 1000s of hand-rolled tests, but
       | I'm having a hard time understanding how it's possible to
       | intelligently explore the inputs to generate.
       | 
       | e.g. if I wrote a compiler, how would Antithesis generate mostly
       | valid source code for it? Simply fuzzing utf8 inputs wouldn't get
       | very far.
        
         | pfdietz wrote:
         | I don't know how they'd do compiler testing, but I know how I
         | do it (testing Common Lisp), and can talk about that if you're
         | interested.
         | 
         | But it would be cool to hear how they'd do it.
        
         | chinchilla2020 wrote:
         | The blog post has some impressive copy but is lacking details
         | on how you implement their product.
         | 
         | I am highly skeptical of any claims that something 'magically
         | just works' without much configuration or setup.
        
           | intuitionist wrote:
           | (Disclosure: I'm an Antithesis employee.)
           | 
           | The blog post is meant as a high-level introduction for a
           | general audience. The documentation
           | (https://antithesis.com/docs/) goes into considerably more
           | detail about what kind of configuration and setup you need to
           | start testing with Antithesis.
        
       | Gehinnn wrote:
       | Reading this article, I want the same now for js code that
       | involves web-workers...
       | 
       | How can I write code that involves a webworker in a way that I
       | can simulate every possible CPU scheduling between the main
       | thread and the webworker (given they communicate via postMessage
       | and no SharedArrayBuffer)? Is it possible to write such a
       | brute-force test in pure JS, without having to simulate the
       | entire computer?
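       | 
       | One way to picture the brute-force version, as a toy model only:
       | if the sole nondeterminism is the order in which messages and
       | user events reach the main thread's handler, every schedule is a
       | merge of the two ordered streams. The sketch is in Python, not
       | JS, and the event names and handlers are hypothetical.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def naive_handler(state, event):
    """Shows the worker's result even if the user already cancelled."""
    if event == "user:cancel":
        state["cancelled"] = True
    elif event == "worker:done":
        state["shown"] = True


def careful_handler(state, event):
    """Drops the worker's result once the job has been cancelled."""
    if event == "user:cancel":
        state["cancelled"] = True
    elif event == "worker:done" and not state["cancelled"]:
        state["shown"] = True


def failing_schedules(handler, worker_msgs, user_events):
    """Replay every interleaving and collect those that break the invariant:
    a result must never be shown for a job that was cancelled first."""
    failures = []
    for schedule in interleavings(worker_msgs, user_events):
        state = {"cancelled": False, "shown": False}
        for event in schedule:
            was_cancelled = state["cancelled"]
            handler(state, event)
            if event == "worker:done" and was_cancelled and state["shown"]:
                failures.append(schedule)
                break
    return failures


WORKER_MSGS = ["worker:progress", "worker:done"]
USER_EVENTS = ["user:cancel"]
```

       | With these two streams there are three schedules. The naive
       | handler fails on the two where the cancel arrives before the
       | result; the careful handler passes all three. The number of
       | schedules grows combinatorially with the number of events, so
       | this only scales to small models.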
        
         | mrkeen wrote:
         | Use TLA+/PlusCal for this. It's what it's there for.
        
       | gadders wrote:
       | This sounds amazing, but I wonder how long it would take to set
       | up for any reasonably complex system.
        
       | zoogeny wrote:
       | What is described in this post is the gold standard of software
       | reliability testing. A world where all critical and foundational
       | systems are tested to this level would be a massive step forward
       | for technology in general.
       | 
       | I'm skeptical of their claims but inspired by the vision. Even
       | taking into account my skepticism, I would prefer to deploy
       | systems tested to this standard over alternatives.
        
       | mlsu wrote:
       | Pricing doesn't make sense.
       | 
       | What does a CPU hour mean for this framework? How many do I need?
        
       | jewel wrote:
       | This reminds me of Java Pathfinder, but for distributed systems.
        
       | ijustlovemath wrote:
       | We've done something similar for our medical device; totally
       | deterministic simulations that cover all sorts of real world
       | scenarios and help us improve our product. When you have
       | determinism, you can make changes and just rerun the whole thing
       | to make sure you actually addressed the problems you found.
       | 
       | Another nice side effect is that if you hang on to the
       | specification for the simulation, you only have to hang on to
       | core metrics from the simulation, since the entire program state
       | can be reproduced in a debugger by just using the same
       | specification on the same code version.
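       | 
       | A minimal sketch of that side effect, assuming the
       | "specification" reduces to a seed and a step count and the
       | simulation is a hypothetical random walk (not the medical device
       | simulation described above): only the seed and summary metrics
       | are stored, and the full trace is regenerated on demand.

```python
import random


def simulate(seed, steps=1000):
    """Run a toy deterministic simulation. All randomness comes from the
    seeded generator, so the same (seed, steps) reproduces the same trace."""
    rng = random.Random(seed)
    level = 100.0
    trace = []
    for _ in range(steps):
        level += rng.uniform(-5.0, 5.0)
        trace.append(level)
    metrics = {"min": min(trace), "max": max(trace)}
    return metrics, trace


def archive_run(seed, steps=1000):
    """Keep only what is needed to reproduce the run plus its core metrics."""
    metrics, _trace = simulate(seed, steps)
    return {"seed": seed, "steps": steps, "metrics": metrics}


def replay(record):
    """Rebuild the full trace from an archived record for debugging."""
    _metrics, trace = simulate(record["seed"], record["steps"])
    return trace
```

       | replay(archive_run(42)) returns the full trace of the archived
       | run, provided the code version is unchanged.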
        
       | schaum wrote:
       | that sounds like "automated advanced chaos monkey" to me
       | https://en.wikipedia.org/wiki/Chaos_engineering#Chaos_Monkey
        
         | nlavezzo wrote:
         | Depends on what you mean by "advanced" here. We
         | specifically cover the differences between Antithesis and Chaos
         | Engineering in our "How It's Different" page:
         | 
         | https://antithesis.com/product/how_is_antithesis_different/
         | 
         | Here's the relevant text though:
         | 
         | Antithesis testing resembles chaos testing, in that it injects
         | faults to trigger and identify problems. But Antithesis runs
         | these tests in a fully deterministic simulated environment,
         | rather than in production. This means Antithesis testing never
         | risks real-world downtime. This in turn allows for much more
         | aggressive fault injection, which finds more bugs, and finds
         | them faster. Antithesis can also test new builds before they
         | roll out to production, meaning you find the bugs before your
         | customer does.
         | 
         | Finally, Antithesis can perfectly reproduce any problem it
         | finds, enabling quick debugging. While chaos testing can
         | discover problems in production, it is then unable to replicate
         | them, because the real world is not deterministic.
        
       | bell-cot wrote:
       | First reaction: "Yes, your site's weird font is bugging me!"
        
         | intuitionist wrote:
         | (Antithesis employee here.)
         | 
         | We're using Inter, which our designer assures me is pretty
         | popular. But this isn't the first time we've heard this
         | complaint. Also, we had an issue in testing on a different part
         | of our site where the font was getting computed as something
         | weird and ugly on our development NixOS machines. Would you
         | mind replying with what the browser console says your font is
         | getting computed as? Thanks!
         | 
         | Edit: The other person who mentioned this seems to think that
         | it's caused by their JavaScript blocker--we're trying to figure
         | out why, but in the meantime, enabling JS might help if you
         | haven't.
        
           | bell-cot wrote:
           | (It's JS blocking - I told NoScript to allow antithesis.com,
           | and that completely changed the font in Firefox.)
        
       | zubairq wrote:
       | I need to follow this example to build software faster
        
       | ajb wrote:
       | I'm sure I've heard of something similar being built, but
       | specific to the JVM (ie, a specialised JVM that tests your code
       | by choosing the most hostile thread switching points).
       | Unfortunately that was mentioned to me at least 10 years ago, and
       | I can't find it.
        
       | lifeinthevoid wrote:
       | I don't want to sound silly, but there are 24 open and 37 closed
       | bugs on the FoundationDB GitHub page. Could it perhaps be that
       | bug-free is somewhat exaggerated?
       | 
       | Antithesis looks very promising by the way :-)
       | 
       | Edit: perhaps Apple didn't continue the rigorous testing while
       | evolving the FoundationDB codebase.
        
       | jsdwarf wrote:
       | > and found all of the bugs in the database
       | 
       | This is when I stopped reading
        
       | hintymad wrote:
       | I really like Antithesis' approach: it's non-intrusive as all the
       | changes are on a VM so one can run deterministic simulation
       | without changing their code. It's also technically challenging,
       | as making a VM suitable for deterministic simulation is not an
       | easy feat.
       | 
       | On a side note, I was wondering how this approach compares to
       | Meta's Hermit (https://github.com/facebookexperimental/hermit),
       | which is a deterministic Linux instead of a VM.
        
       | stavros wrote:
       | This is really impressive, but still, if you're working on a
       | piece of software where this can work, count yourself lucky. Most
       | software I've worked on (boring line-of-business stuff) would
       | need as many lines of code to test a behavior as to implement the
       | behavior.
       | 
       | It's not very frequently that you have a behavior that's very
       | hard to make correct, but very easy to check for correctness.
        
       | shuntress wrote:
       | > a platform that takes your software and hunts for bugs in it
       | 
       | Ok but, what actually IS it?
       | 
       | It seems like it is a cloud service that will run integration
       | tests. I have to figure out how to deploy to this special
       | environment and I still have to write those integration tests
       | using special libraries.
       | 
       | But even after all that integration refactoring, how is this
       | supposed to help me find actual bugs that I wouldn't already have
       | found in my own environment with my own integration tests?
        
         | mrinterweb wrote:
         | I came away with the same questions.
        
       | mcapodici wrote:
       | So this is a valgrind for containers? "If" it works well, and
       | doesn't flag false things, this is pretty useful.
       | 
       | You might want to sell this as a bug finder. But it also could be
       | sold as a security hardening tool.
        
       | georgelyon wrote:
       | Congratulations to the Antithesis team!
       | 
       | I actually interviewed with them when they were just starting,
       | and outside of being very technically proficient, they are also a
       | great group of folks. They flew my wife and me out to DC on what
       | happened to be the coldest day of the year that year (we are from
       | California) so we didn't end up following through but I'd like to
       | think there is an alternative me out there in the multiverse
       | hacking away on this stuff.
       | 
       | I highly recommend Will's talks (which I believe he links in the
       | blog post):
       | 
       | https://m.youtube.com/watch?v=4fFDFbi3toc
       | 
       | https://m.youtube.com/watch?v=fFSPwJFXVlw
        
       | JonChesterfield wrote:
       | > We thought about this and decided to just go all out and write
       | a hypervisor which emulates a deterministic computer.
       | 
       | Huh. Yes, that would work. It's in the category of obvious in
       | hindsight. That is a very convincing sales pitch.
        
       ___________________________________________________________________
       (page generated 2024-02-13 23:00 UTC)