[HN Gopher] Why GADTs matter for performance (2015)
       ___________________________________________________________________
        
       Why GADTs matter for performance (2015)
        
       Author : hyperbrainer
       Score  : 48 points
       Date   : 2025-05-10 13:55 UTC (2 days ago)
        
 (HTM) web link (blog.janestreet.com)
 (TXT) w3m dump (blog.janestreet.com)
        
       | rbjorklin wrote:
       | Does anyone have some hard numbers on the expected performance
       | uplift when using GADTs? Couldn't see any mentioned in the
       | article.
        
         | ackfoobar wrote:
         | The example here is basically an 8-fold memory saving going
         | from `long[]` from `byte[]` - while still retaining
         | polymorphism (whereas in Java the two are unrelated types).
         | 
         | Hard to say exactly how much performance one would get, as that
         | depends on access patterns.
        
           | misja111 wrote:
           | The reason that a byte array is in reality layed out as a
           | (mostly empty) long array in Java, is actually for
           | performance. Computers tend to have their memory aligned at 8
           | byte intervals and accessing such an address is faster than
           | accessing an address that's at an offset of an 8 byte
           | interval.
           | 
           | Of course it depends on your use case, in some cases a
           | compact byte array performs better anyway, for instance
           | because now you're able to fit it in your CPU cache.
        
             | john-h-k wrote:
             | But you can load any byte by loading 8 bytes and shift (v
             | cheap)
        
             | ackfoobar wrote:
             | > a byte array is in reality layed out as a (mostly empty)
             | long array in Java
             | 
             | Are you saying each byte takes up a word? That is the case
             | in the `char array` in OCaml, but not Java's `byte[]`.
             | AFAIK The size of a byte array is rounded up to words. Byte
             | arrays of length 1-8 all have the same size in a 64-bit
             | machine, then length 7-16 take up one more word.
             | 
             | https://shipilev.net/jvm/objects-inside-out/
        
       | cosmic_quanta wrote:
       | Interesting, thanks for posting.
       | 
       | I share the author's frustration with the lack of non-compiler-
       | related examples of GADT uses. It seems like such a powerful
       | idea, but I haven't been able to get a feel for when to reach for
       | GADTs in Haskell
        
         | wyager wrote:
         | I often find them handy for locking down admissible states at
         | compile time. Maybe ~10 years ago in a processor design class,
         | I wrote some CPUs in Haskell/Clash for FPGA usage. A nice thing
         | I could do was write a single top-level instruction set, but
         | then lock down the instructions based on what stages of the
         | processor they could exist at.
         | 
         | For example, something like (not an actual example from my
         | code, just conceptually - may be misremembering details):
         | data Instruction stages where        MovLit :: Word64 ->
         | Register -> Instruction '[Fetch, Decode, Execute, Writeback]
         | -- MovReg instruction gets rewritten to MovLit in Execute stage
         | MovReg :: Register -> Register -> Instruction '[Fetch, Decode,
         | Execute]        ...
         | 
         | And then my CPU's writeback handler block could be something
         | like:                 writeback :: (Writeback `member` stages)
         | => Instruction stages -> WritebackState -> WritebackState
         | writeback (MovLit v reg) = ...       -- Compiler knows (MovReg
         | _ _) is not required here
         | 
         | So you can use the type parameters to impose constraints on the
         | allowed values, and the compiler is smart enough to use this
         | data during exhaustiveness checks (cf "GADTs Meet Their Match")
        
           | anyfoo wrote:
           | Wow, someone else who (used to be) using Clash. I still use
           | it for everything I can in my (hobby) FPGA projects. I'm not
           | sure I've used GADTs, but I've certainly made use of other
           | more "advanced" parts of the type system, like type families.
           | 
           | What you're doing here is pretty cool, I think I will start
           | doing so, too. I have a number of places where I use
           | "undefined" instead. (The "undefined" from the Clash Prelude,
           | which translates into a "don't care" signal state.)
        
       | hyperbrainer wrote:
       | Related:
       | https://github.com/ocaml/RFCs/blob/881b220adc1f358ab15f7743d...
        
       | goldchainposse wrote:
       | I know Jane Street love OCaml, but you have to wonder how much
       | it's cost them in velocity and maintenance. This is a quant firm
       | blogging about a programming language they're the most famous
       | user of.
        
         | pjmlp wrote:
         | It is thanks to the companies like Jane Street that believe
         | there is something else beyond C, that we can have nice toys.
         | 
         | Remember if OCaml wasn't a mature programming language, maybe
         | Rust would not have happened in first place.
        
         | kryptiskt wrote:
         | Why do you assume it's a drag for them and not a competitive
         | advantage? I don't know if it's such a terrible thing to use a
         | slightly out of mainstream language, when the standard in the
         | business is to accumulate tens of millions of lines of C++.
        
           | ackfoobar wrote:
           | Agreed, indeed I believe they have mentioned that OCaml gets
           | them to ship quicker because they are more confident with the
           | correctness of changes.
           | 
           | But being outside of the mainstream may mean you need to
           | occasionally debug more esoteric stuff:
           | https://gallium.inria.fr/blog/intel-skylake-bug/ I'm sure
           | Jane Street can afford doing that, but I'm not so sure if a
           | small team can.
        
             | gjadi wrote:
             | That was an interesting read, thanks. However I fail to see
             | how it's an issue specific to ocaml. It was a bug in the
             | Skylake processor triggered by a special pattern of
             | instructions produced by gcc. Ocaml built with clang was ok
             | because it doesn't used the same pattern. Did I miss
             | something?
        
               | ackfoobar wrote:
               | If the JVM encountered the same bug other people would
               | have discovered it before me. Most probably I won't even
               | know the bug exists.
        
           | goldchainposse wrote:
           | > Why do you assume it's a drag for them and not a
           | competitive advantage?
           | 
           | Because despite them being very open about it, no one else
           | does it, and every distinguished engineer who pushes a weird
           | tech choice will justify and defend it.
        
         | keybored wrote:
         | Concretely how do you think it's holding them back? Just by
         | being niche?
        
         | anyfoo wrote:
         | There are many things to say about this, but one of those
         | things is that I think you are making the assumption that an
         | (e.g.) C programmer who does not want (or even cannot) get into
         | OCaml would somehow be better for this highly specialized,
         | high-performance, and high-correctness-affine use case, than
         | someone who does. And I'd question that assumption.
        
       ___________________________________________________________________
       (page generated 2025-05-12 23:01 UTC)