[HN Gopher] Automatically Make Unit Tests
       ___________________________________________________________________
        
       Automatically Make Unit Tests
        
       Author : todsacerdoti
       Score  : 39 points
       Date   : 2021-05-19 09:57 UTC (13 hours ago)
        
 (HTM) web link (wiki.call-cc.org)
 (TXT) w3m dump (wiki.call-cc.org)
        
       | ehutch79 wrote:
       | How does it know what to test? If I have a function named
       | calc_potential_margin, how does it know what the output should
       | be?
        
         | darthrupert wrote:
         | By evaluating it.
        
           | ehutch79 wrote:
           | Awesome, is there a python version of this, never touching
           | test code and having a full test suite with corner cases and
           | whatnot would be awesome
        
       | darthrupert wrote:
       | Pretty good. Essentially just a way to save function executions
       | and replay them later against previous evaluation results.
       | 
       | I would guess that Jupyter notebooks could easily be made into
       | such a thing as well. Also doctests with editor integration might
       | work like this already?
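        | 
        | A rough Python sketch of that idea (my own example; the
        | function name is made up, echoing the one mentioned above): a
        | REPL transcript kept as a doctest is replayed on every run and
        | compared against the saved output.
        | 
        |     import doctest
        | 
        |     def potential_margin(revenue, cost):
        |         """Recorded REPL session, replayed as a test.
        | 
        |         >>> potential_margin(150, 100)
        |         50
        |         >>> potential_margin(100, 100)
        |         0
        |         """
        |         return revenue - cost
        | 
        |     if __name__ == "__main__":
        |         doctest.testmod(verbose=True)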
        
       | teddyh wrote:
       | This turns REPL expressions into tests, which is useful if you do
       | REPL-driven development, which admittedly is common in the
       | Lisp/Scheme world.
        
       | peterlk wrote:
       | This brings to mind a conversation that I've had several times
       | over the years. It goes something like this:
       | 
       | > Wouldn't it be cool if we could have code that would test
       | itself?
       | 
       | > Yeah, if only there were some kind of automated way to
       | encapsulate the requirements of classes and methods that could
       | ensure that they behaved within some known boundaries...?
       | 
       | We've had automated unit tests for quite some time, but they are
       | more often called compiler checks. Want your "automated unit
       | tests" to be more comprehensive? Improve your type system and
       | your compiler!
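        | 
        | A toy illustration (my own example, assuming Python plus a
        | static checker such as mypy, neither of which the article
        | uses): the annotation encodes a requirement that no hand-
        | written test has to cover, because the checker rejects the
        | violating call before anything runs.
        | 
        |     def describe(score: int) -> str:
        |         return "non-negative" if score >= 0 else "negative"
        | 
        |     # a checker like mypy rejects the call below before the
        |     # program ever runs: the argument is a str, not an int
        |     describe("42")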
       | 
        | With all that said, I do not want to be dismissive of people's
        | projects. This is fun and neat.
        
         | sidmitra wrote:
          | Type checking and compile-time errors won't tell you whether
          | the program you wrote is correct, only that it's internally
          | consistent. That is an entirely different thing from checking
          | for correctness. You still need a layer of tests that actually
          | try to show that, given some input, your program does what
          | it's supposed to do. That is hard to automate.
         | 
          | Joe Armstrong (of Erlang fame) talks about this a bit here:
          | https://youtu.be/TTM_b7EJg5E?t=778
         | 
         | I've pointed to a specific timestamp, but there might be more
         | details somewhere else in that talk.
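          | 
          | A minimal sketch of the distinction (my own example): the
          | function below type-checks cleanly yet computes the wrong
          | thing; only a test that pins down the intended behaviour
          | catches it.
          | 
          |     def mean(xs: list[float]) -> float:
          |         return sum(xs) / (len(xs) + 1)  # well-typed, but wrong
          | 
          |     def test_mean():
          |         assert mean([2.0, 4.0]) == 3.0  # fails: returns 2.0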
        
           | bollu wrote:
            | That's untrue. There are type systems and type checkers which
            | can check for _correctness_, as specified by a mathematical
            | formula. This is how proof assistants such as Coq [1] work.
            | For example, take a look at fiat-crypto [2], a project that
            | generates code that is _correct by mathematical proof_ and is
            | now deployed in Chrome. Another famous example is the
            | CompCert [3] compiler, a C compiler shipped with a proof that
            | the assembly it generates correctly simulates the C
            | semantics.
           | 
            | These are based on powerful types known as "dependent types",
            | which can encode mathematics and programs in the same
            | programming language and allow the mathematics to "talk
            | about" the program. The mathematical underpinning is the
            | Curry-Howard-Lambek correspondence.
           | 
            | [1] https://coq.inria.fr/
            | [2] https://github.com/mit-plv/fiat-crypto
            | [3] https://compcert.org/
            | [4] https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspon...
        
         | carlmr wrote:
         | The next thing is semi-automatic tests, which you can do with
         | property based testing, fuzz testing, and sanitizers.
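          | 
          | For example, a property-based test with Hypothesis (a sketch
          | using Python's built-in sorted as the code under test): the
          | library generates the inputs, and only the property has to be
          | stated.
          | 
          |     from hypothesis import given, strategies as st
          | 
          |     @given(st.lists(st.integers()))
          |     def test_sorting_is_idempotent(xs):
          |         once = sorted(xs)
          |         assert sorted(once) == once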
        
       | lgleason wrote:
        | I have yet to see one of these "wonder" tools actually work
        | effectively in practice. They do, however, make great snake oil
        | for testing-automation teams that often end up doing manual
        | testing because the promises of "easy" automation were never
        | fulfilled.
        | 
        | Too many people are looking for silver bullets.
        
       | danpalmer wrote:
        | This is neat. It reminds me of Hypothesis, which does randomised
        | property-based testing and, if it finds a failing edge case,
        | saves it to a database so that it's always replayed in future
        | test runs.
        | 
        | That said, I suspect I'd run into the same limitation with this
        | as with Hypothesis. Most of the tests I write aren't for pure
        | functions, or they require a fair amount of setup or complex
        | assertions. It's possible to write test helpers that do all of
        | this, and I do, but too much of that means tests complex enough
        | to need their own tests, so it's important to strike a balance.
       | 
        | This is probably specific to my normal work (Python, web
        | services), but I suspect it applies to a lot of testing done
        | elsewhere.
        
         | CloselyChunky wrote:
         | > Most of the tests I write aren't for pure functions
         | 
         | In response to this, I recommend the "Functional Core,
         | Imperative Shell"[0] talk/pattern. The idea is to extract your
         | business logic as pure functions independent of the data access
         | layer. This pattern allowed me to test large portions of a code
         | base using property tests. This works really well in most cases
         | and gives me much more confidence in the product that will be
         | deployed.
         | 
         | [0]:
         | https://www.destroyallsoftware.com/screencasts/catalog/funct...
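          | 
          | A small sketch of the pattern in Python (the names and the db
          | object are made up): the pure core computes the decision, the
          | shell does the I/O and stays thin, and only the core needs
          | property tests.
          | 
          |     # functional core: pure, easy to property-test
          |     def items_to_reorder(stock, threshold):
          |         return sorted(name for name, qty in stock.items()
          |                       if qty < threshold)
          | 
          |     # imperative shell: db is a hypothetical data-access
          |     # object; all side effects live here, kept thin
          |     def reorder_low_stock(db, threshold=10):
          |         stock = db.load_stock_levels()
          |         for name in items_to_reorder(stock, threshold):
          |             db.place_order(name)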
        
       | xutopia wrote:
       | This feels like it defeats the purpose of writing tests though.
        | For me, writing tests very much has to do with validating my
        | assumptions... it helps me gain confidence in my code. Now I'd
       | have to trust these automated tests? My confidence just wouldn't
       | be there.
        
         | mumblemumble wrote:
         | To me, testing in the REPL is also about validating my
         | assumptions.
         | 
         | REPL-driven development is basically test-driven development
          | with shorter feedback cycles during active development, but the
          | downside is that you need to make an extra effort to go back and
         | preserve the important tests for posterity. This might well
         | reduce some of that friction.
        
         | omginternets wrote:
         | Oftentimes, finagling with the REPL is merely about getting the
         | syntax right ("did I use the right number of parens?") rather
         | than formulating assumptions. In such cases, using this tool
         | doesn't invalidate the TDD workflow, since your assumptions
         | don't change.
         | 
         | The essential complexity of writing unit tests is in
         | formulating your assumptions ahead of time. As you correctly
         | point out, no tool can ever solve this. However, an
         | _accidental_ complexity is correctly _expressing_ your
         | assumptions in code. Tools such as a REPL and this library can
         | _definitely_ help with that.
        
         | twobitshifter wrote:
          | This is more about "did I accidentally break something?" The
          | correctness of the tests is not known, but you will know which
          | tests have changed. Then you can go and review that code and
          | see why it changed and whether the new behavior is correct.
          | This gives you confidence that a change didn't break anything,
          | but not that things weren't broken from the start.
        
         | cratermoon wrote:
         | Ian Cooper agrees, in spades. "TDD, Where Did It All Go Wrong"
         | https://www.youtube.com/watch?v=EZ05e7EMOLM
        
         | slver wrote:
         | It's for regression testing.
        
           | crznp wrote:
            | The title says unit tests. But to GP's point, it isn't really
            | automatic; the developer still decides what to test. It is
            | transforming REPL logs, not generating tests directly from
            | the code (which isn't useful in my experience).
           | 
           | To be honest, my development REPL logs have a lot of junk, so
           | it would be extra work to write a clean one, and then make
           | the right assertion (maybe I was looking for "even number"
           | not "exactly 132"). Lisp-y REPLs are friendlier than some,
           | but it would be cool to associate it with something like a
           | notebook where it is easier to update imports/mocks/etc out
           | of order. Cool idea though.
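            | 
            | To make the "even number" vs. "exactly 132" point concrete
            | (a made-up Python sketch with a stand-in function): a
            | literal transcript replay pins the exact value from the
            | log, whereas the assertion I actually meant is the weaker
            | property.
            | 
            |     # hypothetical stand-in for the function in the log
            |     def calc_potential_margin(revenue, cost):
            |         return revenue - cost
            | 
            |     result = calc_potential_margin(200, 68)
            | 
            |     assert result == 132    # what a replayed log would check
            |     assert result % 2 == 0  # what I meant: "an even number"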
        
             | slver wrote:
             | Unit test and regression test are not in conflict.
        
               | ehutch79 wrote:
                | They are when you refer to regression tests as unit
                | tests.
        
               | slver wrote:
               | Unit testing is "what" you test, it's your scope. And
               | regression testing is "why" you're testing, you running
               | tests to detect regressions.
        
               | ehutch79 wrote:
               | Right, but if you're testing the 'why' and calling it a
               | unit test... confusion ensues...
        
               | slver wrote:
               | Is it that confusing that units of code are subject to
               | change, hence subject to regressions, hence their tests
               | are used to prevent regression?
               | 
               | 1. Unit testing is a kind of functional testing. On a
               | unit.
               | 
               | 2. And regression testing is rerunning your existing
               | functional tests after, say, refactoring, to see if they
               | pass.
               | 
               | Complex? Confusing? Which part of this precisely is
               | confusing?
        
               | ehutch79 wrote:
               | You said a unit test is 'what' you test?
               | 
               | So if I say 'I wrote a unit test' what kind of test do
               | you expect?
        
               | ehutch79 wrote:
                | Realistically, the problem here is:
                | 
                | The headline says they're generating unit tests
                | automatically.
                | 
                | Which means I expect to run a command and have it plop
                | out unit test code. Automatically. Without my
                | interaction.
                | 
                | The article is about recording REPL interactions. That's
                | not automatic.
                | 
                | And really, the recorded expressions look like
                | implementation details, so bad examples? I don't know
                | Scheme, so _shrug_.
        
           | rbanffy wrote:
           | Makes sense, but it's always good to capture the intent of
           | the test. If it is to prevent a regression, having data about
           | the issue that was fixed is vital.
           | 
            | In general, I prefer to start by writing tests, just enough
            | that they fail but exercise the functionality (this makes API
            | design flaws obvious), then actually build the functionality
            | and, finally, make the test conditions reflect what we need.
        
       ___________________________________________________________________
       (page generated 2021-05-19 23:02 UTC)