[HN Gopher] Fuzz me wrong - How QuickCheck destroyed my favourit...
       ___________________________________________________________________
        
       Fuzz me wrong - How QuickCheck destroyed my favourite theory
        
       Author : lrngjcb
       Score  : 117 points
       Date   : 2021-02-12 11:16 UTC (11 hours ago)
        
 (HTM) web link (thma.github.io)
 (TXT) w3m dump (thma.github.io)
        
       | Aissen wrote:
       | There was a great introduction to QuickCheck and many other
       | interesting concepts at 36c3:
       | https://www.youtube.com/watch?v=_2tK9G7rKQQ
       | 
       | I'd recommend it if you don't understand much of the article.
        
       | tonetheman wrote:
       | Another article that I am too stupid to understand... how do you
       | get anything done in Haskell...
        
         | rowanG077 wrote:
         | This comment makes me really sad. It's not that you are stupid.
          | It's just that this article is littered with terminology
          | unfamiliar to you (I assume).
        
         | rkangel wrote:
         | While I do agree with you about Haskell's close association
         | with maths (and an assumption about the users), in this case
         | the author is pursuing a mathematical proof (or rather, a
         | counterexample to a mathematical hypothesis). The use of
         | Haskell here is just as a tool to that end. It is therefore
         | reasonable to expect a greater than normal amount of
         | mathematical notation.
        
         | the_af wrote:
         | How do you get anything done in Haskell, you ask? Here are some
         | steps:
         | 
         | 1. Read a Haskell book and/or tutorials online, some of which
         | are free.
         | 
         | 2. Try writing simple software and reading other people's code.
         | Ask for help when you get stuck. The Haskell community tends to
         | be friendly and helpful.
         | 
         | 3. Write your own non-toy code, hit real world problems and
         | solve them.
         | 
         | The path is very similar to other languages, except that
          | because Haskell doesn't share the same lineage as
         | C/C++/Java/Javascript, you won't be able to "wing it" by
         | skipping the steps above. That's likely the source of your
         | confusion: you can't as easily draw analogies to the languages
         | you're already familiar with.
         | 
         | Note that learning Haskell from exploratory articles which are
         | meant for an audience already familiar with the language is not
         | the best way to go about it. This is like trying to learn Java
         | from an article describing the subtleties of the garbage
         | collector in some interesting corner cases: it'll just confuse
         | you and it won't have any practical lessons for the novice.
         | 
         | People can and do write actual production Haskell code, so it
         | can't be that hard.
        
           | ljm wrote:
           | Worth noting that some articles may also feel inaccessible
           | because so many examples are written in terms of formulae.
           | 
            | There's plenty of Haskell out there that is easier to pick
            | up without looking at what seems to be a jumble of letters
            | that imparts deep meaning.
            | 
            |   frobnicateM :: M a b => b -> a -> f (b a) a
           | 
           | You might get to that stage at some point but books like
           | writing a scheme in 48 hours will show you a much more
           | accessible and elegant side.
        
             | chowells wrote:
             | That jumble of letters _does_ impart deep meaning. In this
             | case it tells me very precisely that you banged on your
              | keyboard at random, because that's got an infinite kind
             | error in it.
             | 
             | Documentation is powerful stuff, especially when it's
             | machine-checked the way Haskell type signatures are.
        
         | Hitton wrote:
          | Nah, that's just the author overcomplicating a simple thing
          | (quite common among Haskell programmers). If he'd thought
          | for a moment before starting on the Haskell, he'd have
          | found that associativity is enough for a map-reduce
          | operation.
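          | 
          | A rough sketch of that claim (my illustration, not the
          | author's code): any Monoid is associative by law, so a
          | chunked reduce that keeps the chunks in their original
          | order agrees with the plain sequential reduce, with no
          | commutativity needed.
          | 
          |   chunkedReduce :: Monoid m => Int -> [m] -> m
          |   chunkedReduce n = mconcat . map mconcat . chunk
          |     where chunk [] = []
          |           chunk xs = take n xs : chunk (drop n xs)
          | 
          |   -- For n >= 1, associativity alone guarantees:
          |   --   chunkedReduce n xs == mconcat xs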
        
           | smlckz wrote:
           | Different people learn differently. Something that is obvious
           | to you might not be obvious to someone else. The author
           | learnt in that way and shared that, which is a good thing.
           | 
           | Sometimes, Haskell seems to be more mathematics than
           | programming.
        
           | mopierotti wrote:
            | However, some map-reduce implementations implicitly expect
           | reducers to be commutative, which the author became
           | appropriately suspicious of because of the "typical examples"
           | of map-reduce.
           | 
           | I found this interesting paper on the subject of non-
           | commutative reducers: "Nondeterminism in MapReduce Considered
           | Harmful?" https://www.microsoft.com/en-us/research/wp-
           | content/uploads/...
        
         | andrepd wrote:
         | I don't get these kinds of comments. Have you spent any time
         | learning any of this? If not, then why do you say that "this is
         | very complicated and I'm too stupid"?
         | 
         | If you find a text in German and you don't understand a word,
         | because you have never studied German, do you think you are too
         | stupid to speak German?
        
           | jcelerier wrote:
           | > Have you spent any time learning any of this? If not, then
           | why do you say that "this is very complicated and I'm too
           | stupid"?
           | 
           | not all things require the same amount of effort to be
           | understood at an adequate level.
           | 
           | > If you find a text in German and you don't understand a
           | word, because you have never studied German, do you think you
           | are too stupid to speak German?
           | 
           | there are definitely people that are able to look at a german
           | text without having ever studied german before and make sense
           | of it in a rough way
        
             | nemetroid wrote:
             | > there are definitely people that are able to look at a
             | german text without having ever studied german before and
             | make sense of it in a rough way
             | 
              | Sure, since I know both English and Swedish, German has
              | sufficiently little "uniqueness" left to it that I can
              | often get a rough idea of a German text despite never
              | having studied it. But how is this different from
              | reading Haskell when you've previously worked in Scala
              | and Rust?
        
           | the_af wrote:
           | I think the mistake is the assumption that there must be a
           | quick and easy analogy to a language with C-style syntax (I'm
           | simplifying a bit, but you get the idea).
           | 
           | This is sort of like PG's Blub paradox: the reader opens an
           | article, sees a bunch of seemingly bizarre syntax and novel
           | terminology, can't easily map it to something he/she knows
           | from C/Java/Python/Javascript, and panics.
           | 
           | For some reason people don't assume the same about natural
           | languages. No Western reader will take a peek at a webpage
           | written in Japanese and cry "I can't understand how people
           | speak this language". (Not in this day and age, at least).
        
             | Pet_Ant wrote:
             | > No Western reader will take a peek at a webpage written
             | in Japanese and cry "I can't understand how people speak
             | this language".
             | 
             | Well not speak it, but reading it... yes. Japanese writing
             | is like a prank that I would not believe is real... and I
             | still doubt it.
             | 
             | This is a really smart guy with a gift for languages
             | struggling with it:
             | https://www.youtube.com/watch?v=bcdYKxHT8kY (NativLang
             | 7m10s)
             | 
             | ...and yet children can do it. Use that as a lesson.
        
               | quelltext wrote:
               | > ...and yet children can do it. Use that as a lesson.
               | 
                | Children? No, it takes them until middle school to
                | get somewhat proficient with the Kanji needed for
                | daily life, and they still can't read _a lot_ of
                | Kanji at that point.
                | 
                | Trust me, you could learn this too in the same time
                | span with semi-full immersion if you started now.
        
               | [deleted]
        
             | dreamcompiler wrote:
             | And Haskell syntax is _much_ easier to learn than Japanese
             | syntax. (And Japanese syntax is itself easier than you
             | might think.)
        
             | viraptor wrote:
              | While true, there's a certain level of "we must use
              | math concepts to describe things" that sometimes just
              | seems unnecessarily silly. "The Monoid 'Natural Numbers
              | under Addition' is associative" => "putting brackets in
              | any place when adding >2 numbers doesn't change the
              | result".
             | 
              | It adds to the perception of "I need to understand
              | master's-level maths to use Haskell".
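              | 
              | For what it's worth, the bracketing statement written
              | as a QuickCheck property (my own sketch, not from the
              | article):
              | 
              |   import Test.QuickCheck
              | 
              |   prop_assoc :: NonNegative Int -> NonNegative Int
              |              -> NonNegative Int -> Bool
              |   prop_assoc (NonNegative x) (NonNegative y)
              |              (NonNegative z) =
              |     (x + y) + z == x + (y + z)
              | 
              |   -- ghci> quickCheck prop_assoc
              |   -- +++ OK, passed 100 tests.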
        
             | mumblemumble wrote:
             | I'm guessing it's because most people spend very nearly
             | their entire careers nestled comfortably within a single
             | programming language family. So one maybe gets used to the
             | idea that one should be able to decipher an unfamiliar
             | programming language just by reading it carefully, without
             | needing to do any background study first.
             | 
              | It's most pronounced with Lisp and ML-style languages, but
             | I've also seen Java lifers bounce off of things as
             | innocuous as Python's list comprehensions.
        
               | FartyMcFarter wrote:
               | Prolog can also be pretty mind-bending at first, for
               | people used to imperative programming.
        
               | mumblemumble wrote:
               | But it's such an enjoyable mind-bend.
               | 
               | I get a little bit wistful when I see people bouncing off
               | of exotic languages. Where's the delight in a chance to
               | discover something new, and the excitement of a new thing
               | to wrap one's mind around? It makes me wonder if some
               | magic has been lost, or isn't being passed along like it
               | should be.
        
             | jcranmer wrote:
             | > This is sort of like PG's Blub paradox: the reader opens
             | an article, sees a bunch of seemingly bizarre syntax and
             | novel terminology, can't easily map it to something he/she
             | knows from C/Java/Python/Javascript, and panics.
             | 
             | I analogize Blub more to Legal Latin: you're using an
             | uncommon reference point to explain the material that could
             | be explained perfectly well without it. The example I'd
             | give is call/cc; I'm sure I could give a thorough
             | description of how call/cc works, and leave a lot of
             | programmers in the dust by doing so. Or I could explain how
              | the yield operator works, and most of those same
              | programmers would suddenly understand what I'm talking
             | about. But they're basically the same thing (the only real
             | difference being that yield continuation magic can only
             | operate within its generator function, whereas call/cc
             | works across function boundaries).
        
               | the_af wrote:
                | > _you're using an uncommon reference point to explain
               | the material that could be explained perfectly well
               | without it_
               | 
               | Should Haskell practitioners cater to C-like developers?
               | If so, why bother with Haskell at all? This is one step
               | removed from saying "just write an improved C and forget
               | about Haskell", and in fact there have been attempts at
               | it! It's just that they aren't Haskell.
               | 
               | Is it too much to ask of people trying to understand
               | Haskell code to learn about it first?
        
         | an_ko wrote:
         | Don't be too hard on yourself. It's not stupidity but
         | unfamiliarity. Haskell's roots are pretty distinct from most
         | other popular programming languages, so it's normal if your
         | existing notions from unrelated programming languages don't
         | translate. Like if the article were written in Finnish.
        
           | k__ wrote:
           | Why is Haskell like this?
           | 
           | I don't have that feeling when using other FP languages like
           | Lisp or OCaml.
        
             | freeone3000 wrote:
             | Read $ as <| and . as an empty space and it parses fine as
             | F#. This article relies more on the underlying algebraic
             | set theory than anything in particular about Haskell (sans
             | the conclusion, which is an implementation detail - stick
              | to your commutative monoids in F#'s PSeq.reduce, please!).
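              | 
              | For readers following along, a small illustration of
              | those two operators on the Haskell side (my sketch):
              | 
              |   -- All three definitions are equivalent:
              |   ex1, ex2, ex3 :: Int
              |   ex1 = sum (map (^ 2) [1 .. 10])
              |   ex2 = sum $ map (^ 2) [1 .. 10]    -- $ applies
              |   ex3 = (sum . map (^ 2)) [1 .. 10]  -- . composes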
        
             | gimboland wrote:
             | Haskell is what happens when you take the idea of pure
             | functions (i.e. no side effects) _really_ seriously, and
             | try to create a programming language that works in that
             | world (modulo mechanisms for doing the side effects you
             | _need_ such as I/O).
             | 
             | It turns out that when you do that, all sorts of stuff that
             | simply doesn't fly in other languages "just works",
             | particularly around representing things using structures
             | otherwise rarely seen outside of mathematical logic and
             | category theory. This is all very powerful and elegant, but
             | it means the landscape is unfamiliar, confusing, and
             | intimidating to programmers who don't have that background
             | or the inclination to learn about it.
             | 
             | Hence all the monad tutorials -- not because monads are
             | particularly special, but because beginners encounter them
             | quickly due to IO, so they need some sort of explanation
             | (it's like how if you teach a beginning programmer Java,
             | there's loads of ceremony to explain around classes and
             | public and so on just to do "hello world")... whereas you
              | can get away with using Haskell for longer without learning
             | about, say, functors and monoids -- though it somewhat
             | misses the point of the language if you do so.
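              | 
              | Concretely, the monad a beginner meets first (a minimal
              | sketch, nothing article-specific):
              | 
              |   main :: IO ()                -- IO is a monad, hence
              |   main = do                    -- the do-notation here
              |     name <- getLine
              |     putStrLn ("hello " ++ name)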
        
             | josephg wrote:
             | Haskell programmers have delved the deepest and furthest in
             | the mines of a certain kind of Truth. Decades under the
             | earth gives them strange language and strange ideas - the
             | monoid, applicators, all sorts of things.
             | 
             | You can follow them to their deep dark places - but they
             | have travelled far, and finding them in their tunnels is a
              | journey of many moons. To the Haskell programmer, we
              | surface dwellers are all confused - missing the essential
             | True Names of programming. Don't despair - you can travel
             | where you will; but know that discovering the hidden truths
             | will humble and age the best of us. Why? I know not.
             | Perhaps realising we're small and stupid is the cost of
             | staring into the abyss of Math.
        
               | MaxBarraclough wrote:
               | The acolytes of laziness dug too eagerly and too deep.
        
             | the_af wrote:
             | A lot of people complain about all the parens in Lisp. I
             | don't agree with that complaint either.
             | 
             | Different people find different things confusing.
        
             | dreamcompiler wrote:
             | It's because Haskell is more functional than Lisp. (Can't
             | speak to OCaml). Lisp doesn't do automatic currying;
             | Haskell does. When you finally grok the implications of
             | _partial evaluation and laziness everywhere_ then you begin
             | to understand that descriptions of the inputs and outputs
             | of functions, while remaining ordered in time, become fully
              | _associative_: You can group them however you like.
             | 
             | In Prolog there is another kind of revelation about inputs
             | and outputs: There's no distinction between inputs and
             | outputs and they aren't even necessarily ordered in time
             | any more.
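              | 
              | What that grouping looks like in practice (my sketch):
              | the function arrow associates to the right, so any
              | prefix of the arguments can be supplied on its own.
              | 
              |   add3 :: Int -> Int -> Int -> Int
              |   -- reads as: Int -> (Int -> (Int -> Int))
              |   add3 x y z = x + y + z
              | 
              |   addTen :: Int -> Int -> Int
              |   addTen = add3 10    -- first argument fixed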
        
               | kazinator wrote:
                | Firstly, I suspect you might mean partial application;
                | many functional languages in that family support implicit
                | partial application, so that if _f_ is a three argument
                | function, _f a b c_ is regarded as the partial
                | application _(((f a) b) c)_.
               | 
               | Currying is a mechanism of representing all functions
               | with two or more arguments using functions of only one
               | argument; with currying we can "bootstrap" functions of
               | higher arities in a substrate like a lambda calculus that
                | has only one-argument functions.
               | 
                | This arrangement of implicit partial application is
                | syntactic sugar.
               | 
               | Syntactic sugar for partial application is not partial
               | application itself. Manual partial application is just as
               | "functional". Functions are being applied, and passed
               | around as values.
               | 
               | Implicit partial application can be obtained with Lisp
               | macros, but that can never be fully general for at least
               | two reasons.
               | 
               | One is that Lisps support something very useful, namely
               | variadic functions, and functions with optional
               | arguments. Variadic functions mean that you don't need
                | ugly shims like "zipWith3" in a Lisp.
               | 
                | The other is that implicit partial application requires
                | static typing.
               | argument function being called with two arguments, and
               | therefore reduced to a partially applied one-argument
               | function, our syntactic sugar processor needs complete
               | arity information about f.
               | 
               | In a Lisp, f can be redefined at run-time from having
               | three arguments to having four.
               | 
               | In principle, in spite of that, all function calls could
               | be treated as partial applications in a very generic way,
               | but it would be horribly, horribly inefficient --- and
               | the diagnostics when things go wrong would be pretty
               | terrible.
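                | 
                | The distinction in Haskell terms (my sketch): curry
                | converts a function on pairs into its one-argument-
                | at-a-time form; partial application is then just
                | supplying a prefix of the arguments.
                | 
                |   plusPair :: (Int, Int) -> Int
                |   plusPair (x, y) = x + y
                | 
                |   plusCurried :: Int -> Int -> Int
                |   plusCurried = curry plusPair  -- currying proper
                | 
                |   plusFive :: Int -> Int
                |   plusFive = plusCurried 5      -- partial application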
        
               | dreamcompiler wrote:
               | Agree with all your points, and I did mean partial
               | application, not partial evaluation.
               | 
               | And yes currying is not the same thing but it's common in
               | informal settings to say "Haskell has automatic currying"
               | and everybody knows you're talking about automatic
               | partial application.
               | 
               | Lisp can of course do partial application and currying
               | but not automatically; as you say you'd have to write
               | special macros and sacrifice variadic functions to make
               | it work (and I have done so). And it's really not that
               | necessary in Lisp because we have parentheses; Haskell is
               | the only example I've yet found of a Lisp-like language
               | that has successfully removed most of the parentheses,
               | and it succeeded _because_ of its implicit partial
               | application. But it had to sacrifice variadic functions.
               | 
               | When I first started using Haskell I lamented not having
               | macros, and then I realized that many of my use cases for
               | macros in Common Lisp weren't necessary in Haskell
               | because of its laziness.
               | 
               | CL is still my daily-use language. But Haskell expanded
               | how I think about programming almost as much as Lisp did
               | all those years ago; that happened because Haskell took
               | the functional metaphor to an extreme. I now write better
               | Lisp programs because of my exposure to Haskell.
        
         | mjaniczek wrote:
         | You don't need to use the advanced features of Haskell to be
         | productive with Haskell.
         | 
         | On my latest project I'm writing Haskell in a largely Elm
         | style, and it's glorious.
        
         | SloopJon wrote:
         | The takeaway for me, which agrees with my experience, is that
         | QuickCheck is a great way to start testing. It's pretty amazing
         | what you can find just by checking a few properties or
         | invariants with a hundred random inputs each. It's kind of the
         | seven-minute-a-day workout of testing.
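          | 
          | A minimal starter in that spirit (my sketch, not from the
          | article):
          | 
          |   import Test.QuickCheck
          | 
          |   prop_reverseTwice :: [Int] -> Bool
          |   prop_reverseTwice xs = reverse (reverse xs) == xs
          | 
          |   main :: IO ()
          |   main = quickCheck prop_reverseTwice
          |   -- +++ OK, passed 100 tests.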
        
       | dreamcompiler wrote:
       | To me this says the parallel library the author described isn't
       | as efficient as it could be because it's enforcing an unexpected
       | ordering constraint.
        
         | jerf wrote:
         | Unfortunately, you're in bad shape as soon as you have your
         | data in a Haskell list, since it's a linked list. It's suitable
         | for small things, but if you want to go larger than that,
         | you're going to need to use arrays or something. Haskell lists
         | aren't just ordered in principle, they're ordered structurally
         | as well because you can't randomly access them at all.
         | 
         | That said, for anything that fits in memory, it's so easy for a
         | parallel map to preserve order that it probably should anyhow,
         | and if it doesn't fit in memory it's still not _that_ hard and
          | probably a better default. (If you've got some sort of map
          | with wildly variable compute times per element, then you
          | could starve the thing that's finally emitting them in
          | order if you hit a particularly expensive element, and you
          | could potentially
         | have gotten better throughput if the rest of the system could
         | just keep working around it and ignoring order. But I'd still
         | expect order-preservation to be the default such that I'd have
         | to tell my parallel map that it's OK to ignore order.)
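          | 
          | For what it's worth, parMap from the parallel package
          | already behaves that way: elements are evaluated in
          | parallel but the result list keeps the input order (a
          | sketch, assuming the parallel and deepseq packages):
          | 
          |   import Control.Parallel.Strategies (parMap, rdeepseq)
          | 
          |   squares :: [Int] -> [Int]
          |   squares = parMap rdeepseq (^ 2)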
        
       | leblancfg wrote:
        | That's a great article! I have no background in either Haskell
        | or CS but was captivated through to the end. Great job!
       | 
       | Nit: I think this would read better if the author replaced
       | "personal theory" with "hypothesis".
        
         | PartiallyTyped wrote:
         | 'Conjecture' could also work.
        
       | dan-robertson wrote:
       | There seems to be a lot of work to get not very much. Maybe the
        | point of this article is more about faffing around with
        | QuickCheck, but the conclusion should basically be:
       | 
       | 1. Parallel [efficient] MapReduce requires a commutative monoid
       | for deterministic results
       | 
        | 2. A naive implementation of parallel MapReduce in Haskell
        | using par doesn't, because it will only do reduce operations
        | in a certain order. Imagine if the map for every second
        | element took 0s and the others took 10s. In normal mapreduce,
        | you would reduce together the fast half of the results as
        | they come in, but in this Haskell implementation you would
        | need to sit on them and wait until the other adjacent
        | elements come in. In the article's very naive implementation,
        | you can't even reduce the first two elements together until
        | you've reduced everything after them (modulo some weirdness
        | around lazy evaluation).
       | 
       | I think if one thinks of mapreduce as an operation on sets not
       | lists, it should be obvious that the monoid must be commutative.
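        | 
        | That distinction is easy to probe with the article's own tool
        | (a sketch of mine, not from the article): list concatenation
        | is associative but not commutative, so QuickCheck falsifies
        | the commutativity property with a small counterexample.
        | 
        |   import Test.QuickCheck
        | 
        |   -- Expected to fail: (++) forms a non-commutative monoid.
        |   prop_concatCommutes :: String -> String -> Bool
        |   prop_concatCommutes a b = a ++ b == b ++ a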
        
         | lupire wrote:
          | 3. A day of implementation can save 20 minutes of design.
        
         | jgilias wrote:
         | But that's basically what the conclusion is, isn't it? The
         | author set out with 'obviously the monoid must be commutative',
         | fuzzed it and came to the second point you mention. This all
          | may be very obvious to some, but I found it an interesting
          | read and a reminder that one should always question one's
          | assumptions and understanding.
        
           | dan-robertson wrote:
           | I think that confusion stems from thinking about an operation
           | that ought to be defined on sets, mapreduce, as one defined
           | on lists. Your monoid obviously needs to be commutative for
           | the former and not the latter. And the implementations are
           | pretty similar in serial but quite different in performance
           | in parallel.
        
         | maweki wrote:
         | I think the efficiency then hinges on the distribution of your
         | slowness. Is there a slow mapping step for some elements? Then
         | it should be fine having them all near each other. All fast-
         | mapped values before and after can be reduced in the time the
          | slow data is mapped. Only if somehow every second value is
          | slowly mapped do you have to wait for the reduce to even
          | start.
         | 
          | If your reduce is slow for some inputs, it's better to have
         | those inputs near each other, as all consecutive values before
         | and after can be reduced during that time.
         | 
          | And if every step is somewhat constant time, I don't think
          | reordering the input (that's basically the same as
          | combining random elements instead of consecutive ones) does
          | anything.
        
           | dan-robertson wrote:
            | There are two parts to the efficiency. One is memory: if you
           | have a bunch of partial results hanging around for adjacent
           | elements to reduce with, they will use memory. The other is
           | the time and yes, if you have a map reduce on lists
           | implementation that (unlike the author's) reduces any two
           | adjacent mapped elements then you'll only be harmed by more
           | pathological cases like the one I described, but I don't
           | think it's that obvious that that case is unlikely (eg let's
           | say you're processing some logs. You get one file per day
           | from a system that runs from 9am to 5pm and one from a run
           | from 5pm to 9am. It seems likely that one file would
            | typically take longer than the other to process.) And
           | reordering your elements to optimise around this means that
           | you'll get different results, unless you're using a
           | commutative monoid! If your monoid isn't commutative then
           | reordering likely isn't an option.
        
         | yakubin wrote:
         | _> I think if one thinks of mapreduce as an operation on sets
         | not lists, it should be obvious that the monoid must be
         | commutative._
         | 
         | If anything, you could think of it as an operation on
         | multisets, but not sets. Sets don't have repeated elements.
        
         | gampleman wrote:
         | I think the main point of the article is that you can use
         | property based testing not as a software quality tool, but as a
         | very quick way to verify/disprove your assumptions. The
         | MapReduce stuff is just a simple enough example to illustrate
         | the point.
         | 
         | I've used property tests the exact same way many times. In
         | complex algorithms assumptions can be hard to think through,
         | but a property testing tool can find counterexamples shockingly
         | quickly.
        
       | [deleted]
        
       | millstone wrote:
        | It seems like the monoid-thinking just obscures things. Positive
       | integers under addition are not a monoid (no identity) but you
       | can still map-reduce their sum of squares. All that you really
       | need is associativity, and associative operations are
       | associative.
        
         | massysett wrote:
         | That's a semigroup.
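          | 
          | In Haskell terms (a sketch, not from the article):
          | Semigroup is exactly associativity without an identity,
          | and sconcat reduces any non-empty collection of them.
          | 
          |   import Data.List.NonEmpty (NonEmpty)
          |   import Data.Semigroup (sconcat)
          | 
          |   newtype Pos = Pos Int deriving Show
          | 
          |   instance Semigroup Pos where
          |     Pos a <> Pos b = Pos (a + b)
          | 
          |   sumOfSquares :: NonEmpty Pos -> Pos
          |   sumOfSquares = sconcat . fmap (\(Pos n) -> Pos (n * n))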
        
       ___________________________________________________________________
       (page generated 2021-02-12 23:01 UTC)