[HN Gopher] Algebraic Data Types in Haskell
       ___________________________________________________________________
        
       Algebraic Data Types in Haskell
        
       Author : ingve
       Score  : 94 points
       Date   : 2022-03-26 10:54 UTC (2 days ago)
        
 (HTM) web link (serokell.io)
 (TXT) w3m dump (serokell.io)
        
       | Koshkin wrote:
       | 'Maybe' is both a monad and a sum. A monadic and an algebraic
       | type at the same time. I wonder if there is an interesting
       | interaction between those.
        
         | Twisol wrote:
         | Maybe is a sum because it was _defined_ using sums. It 's a
         | monad because, from that definition, you can implement
         | functions meeting the monad interface. "Monad" is more like an
         | adjective than a noun.
         | 
         | > I wonder if there is an interesting interaction between
         | those.
         | 
         | There's a rather rich theory of polynomial endofunctors (i.e.
         | generic types with one type parameter built from products,
         | sums, and non-generic types), and I think monads figure
         | prominently (they're somehow related to the initial algebras
         | for such functors). So there's definitely a relationship there.
        
         | black_knight wrote:
         | In general an algebraic (polynomial) functor which is also a
         | monad is called an operad (non-sigma operad for the algebraic
         | topologists). They are plenty and have a lot of interesting
         | structure.
        
       | prionassembly wrote:
       | Haskell is very powerful, but I don't think this article presents
       | anything that's widely available in popular languages right now.
       | I'd be very skeptical that TypeScript isn't expressive enough to
       | do Maybe and Either; I wouldn't be surprised in PHP is there
       | already.
        
         | jerf wrote:
         | Your post as I read it is very hard to parse, but I don't think
         | the article claimed that Typescript couldn't do it. Many modern
         | languages can do Either or Maybe types precisely because they
         | copied it from Haskell in the first place. (Which is good.
         | Copying is good.)
         | 
         | The dynamic scripting languages can do a "maybe" type because
         | they can do just about anything when it comes to types; what
         | they can't do is _enforce_ correct usage by restrictions, so
         | you write a Maybe type but you can 't ever quite be _sure_ it
         | 's doing what you think it's doing. This is just something you
         | end up living with in dynamically typed langauges.
        
           | prionassembly wrote:
           | Expressivity and automatic enforcement are completely
           | orthogonal. Some pseudocode in academic papers is sublimely
           | expressive, but can't be typed directly on a computer.
        
           | fooyc wrote:
           | Typescript is a static type system though
        
             | jerf wrote:
             | Yes, the paragraph break was there for a reason.
        
         | dllthomas wrote:
         | Typescript can do Maybe and Either, but it's a little verbose
         | because you have to manage the tags yourself.
         | 
         | For instance you can certainly do:                   type
         | Maybe<X> = { tag: 'just', value: X } | { tag: 'nothing'};
         | type Either<X, Y> = { tag: 'left', value: X } | { tag: 'right',
         | value: Y };              function either<X, Y, Z>(fromLeft: (x:
         | X) => Z, fromRight: (y: Y) => Z, xy: Either<X, Y>): Z {
         | switch (xy.tag) {                 case 'left': return
         | fromLeft(xy.value);                 case 'right': return
         | fromRight(xy.value);             }         }
         | 
         | Note that this is meaningfully different from the (often more
         | idiomatic and certainly less verbose) untagged use of union
         | types because you can distinguish left from right even when the
         | contained types agree and you can distinguish different
         | nestings, for better and for worse.
        
       | cosmic_quanta wrote:
       | Algebraic data types (ADTs) help a lot when expressing business
       | logic. It's hard to go back once you've tried it in Haskell (or
       | presumably in other languages).
       | 
       | I often try to replicate ADTs in Python at $DAYJOB, because
       | they're so damn convenient. I end up with dataclasses which have
       | the same interface, and I disambiguate between them using
       | `isinstance`. It's not perfect but it's useable.
        
         | prionassembly wrote:
         | Tuples (or namedtuples/dataclasses if you must have named
         | fields, which in Haskell too is additional syntax) are product
         | types, enums are sum types.
         | 
         | You should look into functools.singledispatch if you're doing
         | business logic with isinstance. But you should also learn about
         | object-oriented programming, inheritance vs composition,
         | mixins, etc. People get sour on OOP because they should have
         | learned FP first and then imitate/translate FP idioms --
         | instead of doing something like modeling domain ontologies in
         | object graphs. Likewise, people get sour on FP yadda yadda
         | instead of trying to literally write their programs as proofs
         | to theorems.
        
           | couchand wrote:
           | OOP is not an alternative to ADT, and a language lacking an
           | expressive system of value types will stunt your development
           | regardless of what flavor the rest of the language follows.
        
         | xwowsersx wrote:
         | Can you share some examples, please?
        
           | BoiledCabbage wrote:
           | Funny enough someone actually wrote a book on it.
           | 
           |  _Data Modeling Made Functional_
           | 
           | https://www.amazon.com/Domain-Modeling-Made-Functional-
           | Domai...
           | 
           | It's one of the better and more useful software engineering
           | books I've read. Even if you don't use a functional
           | programming langauge. It's about using Algebraic Data Types
           | do model common problems in the day-to-day business domain
           | (not typical academic problems).
           | 
           | It's a really simple and awesome presentation, and by the end
           | you're dying for the ability to use this more so in the day
           | to day job. Honestly after reading through it, trying to
           | model problems in OOP just seems so unnecessarily obtuse.
           | 
           | The Scott Wlaschin also runs
           | https://fsharpforfunandprofit.com
           | 
           | https://fsharpforfunandprofit.com/ddd/ - a link to a talk on
           | it which is decent, but the book is much better.
        
             | xwowsersx wrote:
             | Thanks. I was asking specifically about accomplishing this
             | in Python.
        
         | klysm wrote:
         | My brain is 100% polluted by the power of ADTs. I can't think
         | about modeling problems without them anymore and it makes
         | development in languages where it's not ergonomic difficult.
        
         | pid-1 wrote:
         | Python does have Union types and pattern matching tho.
        
           | garethrowlands wrote:
           | Python has types?
        
             | matyasrichter wrote:
             | Uh, yes?
        
         | Gadiguibou wrote:
         | I have had the exact same problem with Go recently. I have just
         | started using it and have been disappointed to see that, for a
         | language that claims to be modern, it doesn't have proper sum
         | types/enums OR convenient product types/tuples.
         | 
         | Tuples can be emulated using structs but it can generate a lot
         | of boilerplate for a single function. The only alternative to
         | express iterating over "zipped" lists is to have the two lists
         | side by side and iterate using an integer.
         | 
         | However, sum types are just plain missing. I guess interfaces
         | help, but they're really limiting in what they allow and even
         | regular C-style enums are painful and can't be checked for
         | exhaustivity at compile time.
         | 
         | Does anyone have tips on what the "idiomatic" solutions are for
         | these problems?
        
           | jerf wrote:
           | There are times when sum types are the best solution, but in
           | many cases, interfaces are the solution. But, interfaces as
           | they are _designed_ to be used, not just interfaces used to
           | label a set of types that you then constantly type switch
           | between. See: http://www.jerf.org/iri/post/2960
           | 
           | This is true of OO languages in general, not just Go.
        
             | bobbylarrybobby wrote:
             | In practice this doesn't seem to work. In theory, Go should
             | have a Result interface and two types Data and Error that
             | implement it, and then functions that could error would
             | return a Data or an Error and then... you'd have to call a
             | function like `my_result.Do(callback)`? The fact that
             | idiomatic Go code actually returns (data, err) and checks
             | if err == nil -- and that it's not an error to forget this
             | check -- suggests that the interface paradigm just doesn't
             | work that well.
        
           | couchand wrote:
           | It's been a while since I've written Go but back then the
           | idiomatic solution was a C-style enum and a fair amount of
           | hope.
        
         | rfw300 wrote:
         | Have you seen Python 3.10's pattern matching?
         | https://peps.python.org/pep-0636/
         | 
         | While it doesn't bring you Haskell ADTs, it sounds like it
         | could make your `isinstance`-style code a lot cleaner.
        
           | cosmic_quanta wrote:
           | That's great! Unfortunately, at the pace our stack is moving,
           | I don't think I'll be using Python 3.10 for some time
        
         | anoctopus wrote:
         | Completely agree, although I'm often put off by the
         | comparatively worse ergonomics for the python emulation, to the
         | point that I sometimes miss obvious-in-retrospect places I
         | should be modeling the domain with more variants, even in code
         | I'm already using unions of dataclasses. The ergonomics and
         | clarity of `data Foo = Foo Int | Bar` are incredible.
        
         | LeanderK wrote:
         | > I often try to replicate ADTs in Python at $DAYJOB, because
         | they're so damn convenient. I end up with dataclasses which
         | have the same interface, and I disambiguate between them using
         | `isinstance`. It's not perfect but it's useable.
         | 
         | I also do this! It is so painful not to have real ADTs.
        
       | titzer wrote:
       | When designing Virgil, I left out ADTs for the longest time. I
       | had programmed a bit in ML, and knew I would eventually want
       | them. I finally added them to Virgil circa 2016. It's really
       | impossible to emulate them with objects, because objects
       | typically have identity, whereas ADTs, IMHO, shouldn't. The
       | pattern matching and exhaustiveness checking that ADTs give you
       | allows you to write very concise programs that you can be sure
       | cover all the cases.
       | 
       | Extensibility in OO languages can be great for many things, but
       | ADTs excel at expressing a fixed set of data shapes, forcing
       | programmers to think carefully and break things down into a fixed
       | set of cases. When you've handled all the cases, you're done!
        
       | garethrowlands wrote:
       | Most language type systems can say AND but relatively few can say
       | OR. For example, Java can say a `Point` has an `x` AND a `y`:
       | class Point { public int x; public int y; }
       | 
       | ...but it can't say a `Point` has _either_ `x` and `y` OR `r` and
       | `theta`. You can simulate it with two classes and a base
       | class/interface but that implies there might be _other_ ways of
       | defining a point... which there really aren't. The Haskell point
       | would be:                   data Point = Cartesian { x :: Double,
       | y :: Double } |                      Polar { r :: Double, theta
       | :: Double }
        
         | azth wrote:
         | You can do the same in Java now with sealed types:
         | sealed interface Point {}         record Cartesian(double x,
         | double y) implements Point {}         record Polar(double r,
         | double theta) implements Point {}
        
           | gavinray wrote:
           | With JDK 18 you even get binding patterns in matches, so you
           | can extract + assign "Point.x"/"Point.y" to variables when
           | matching on them!
        
             | Twisol wrote:
             | Type patterns [1] are still unfortunately in preview (and
             | have been since JDK 17), and record (destructuring)
             | patterns [2] haven't even landed as a preview yet. I hope
             | the latter get at least a preview in JDK 19 [3], but it
             | looks like they've barely even begun the process of
             | targeting JEPs to 19 yet.
             | 
             | [1]: https://openjdk.java.net/jeps/420
             | 
             | [2]: https://openjdk.java.net/jeps/405
             | 
             | [3]: https://openjdk.java.net/projects/jdk/19/
        
           | garethrowlands wrote:
           | Good point, that's an improvement.
        
       | gunshowmo wrote:
       | Recent updates in TypeScript have also made ADTs much better to
       | use and make expressing business logic in a statically safe way.
       | It's incredible how much mental overhead that good typing systems
       | (and good utilization of them) can save you when navigating
       | complex code bases.
        
         | dllthomas wrote:
         | You still need to tag your own unions though, right?
        
           | gunshowmo wrote:
           | Yes, and since typing is only at compile time, it's also
           | missing some of the extremely convenient pattern matching
           | syntax that I love from Haskell, Rust, Scala, etc....
           | 
           | Despite some downside, I'm a huge fan overall though. I
           | haven't tried ReasonML or ReScript, but compared to bare JS,
           | TypeScript makes frontend programming a lot more enjoyable to
           | me.
        
             | cercatrova wrote:
             | I consider ReasonML and ReScript to be effectively dead
             | since they split the community a few times now. I don't
             | know anyone who's actually using either in production,
             | except for their respective sponsors Meta and Bloomberg.
        
         | domlebo70 wrote:
         | What are the recent changes?
        
       | earleybird wrote:
       | In these discussions about ADTs, often I see Maybe and Either
       | identified as an ADT rather than an ADT constructor. Am I just
       | missing something other folks 'just know'? If so, what's the
       | intuition I should be looking for?
        
         | pwm wrote:
         | It's just semantics.                   data Bool = False | True
         | data Maybe a = Nothing | Just a
         | 
         | Bool is a nullary type constructor or simply type. False and
         | True are nullary data constructors or simply constants. Maybe
         | is an unary type constructor taking one parameter to construct
         | a fully saturated type (eg. Maybe Int), but calling it a "Maybe
         | type" is ok, no crazy ambiguity. Nothing is a constant and Just
         | is an unary data constructor taking one parameter to construct
         | fully saturated data (eg. Just 1).
        
           | dllthomas wrote:
           | And in the context of Haskell, non-nullary type constructors
           | are still types. They don't happen to be the types of any
           | values, but they can still parameterize types, be constrained
           | by type equality (eg. a ~ Maybe), etc.
        
         | Twisol wrote:
         | For any type `a`, `Maybe a` is a type. We say "Maybe" in the
         | same sense we say "List" -- it's obviously a list of some type
         | `a`, but if what we're saying doesn't depend on any particular
         | choice of `a`, then by abuse of language, we don't bother
         | saying it.
         | 
         | On the other hand, `Just` and `Nothing` are _constructors_ , or
         | _variants_ , for values of type `Maybe a`. You can think of
         | them as almost like static factories (as in `Optional.empty()`
         | and `Optional.of(val)` in Java), except that algebraic
         | constructors are invertible: you can match on a value to
         | determine whether it was constructed with `Just` or `Nothing`,
         | and what the arguments to the constructor were.
        
       | fooyc wrote:
       | The article is well written and Haskell appears to have what's
       | necessary to work efficiently with algebraic data types (matching
       | and overloading).
       | 
       | But in the end, it's just fancy words for union and intersection
       | types, or am I missing something ?
        
         | curtisf wrote:
         | "Algebraic data types", as they appear in Haskell, are
         | recursive sums of product types.
         | 
         | (The "sum" and "product" are why they're called "algebraic".
         | You can also analyze them as power series, such as in computing
         | derivatives for zippers)
         | 
         | Intersection types are something else (that Haskell does not
         | have built in to my knowledge).
         | 
         | "union" is slightly imprecise, because sum types are tagged
         | (i.e., a non-discriminated union of Int and Int is just Int,
         | but Int+Int is actually the same as (Bool, Int). True non-
         | discriminated union types are relatively uncommon in static
         | type systems, but TypeScript for example does have them)
        
         | [deleted]
        
       | bmc7505 wrote:
       | ADTs are indeed powerful, but they are one solution to the
       | Expression Problem [1] among many. Just as OOP has limitations,
       | so do ADTs. Until you spend time studying each one and absorb the
       | lessons they have to offer, only then will you realize that all
       | solutions have tradeoffs, only then will you finally achieve
       | enlightenment.
       | 
       | [1]: https://en.wikipedia.org/wiki/Expression_problem
        
       | dllthomas wrote:
       | I was just trying to explain these to my mother, in the context
       | of rust.
        
       | [deleted]
        
       | pwm wrote:
       | Sum types are one of those things missing from most mainstream
       | languages that people don't even realise that they are deprived
       | of something fundamental. They naturally complement product types
       | that are ubiquitous in most/all languages.
        
         | dllthomas wrote:
         | Specifically, closed sum types. You can usually get open sum
         | types by way of inheritance.
        
           | globuous wrote:
           | True that! It was eye opening once I understood that! Here's
           | a basic type system I made in Python for a small smart
           | contract language compiler I'm working on. It works with mypy
           | pretty flawlessly! [1] I still haven't figured out how to
           | prevent subclassing with methods that don't comply with a
           | specific type signature, but that's a story for another
           | day...
           | 
           | [1] https://yourlabs.io/pyratzlabs/pymich/-/blob/master/pymic
           | h/m...
        
         | zozbot234 wrote:
         | FreePascal supports them, via variant records. A nice gain in
         | developer-friendliness over C.
        
         | [deleted]
        
         | pklausler wrote:
         | C++17 has std::variant<> and it's ubiquitous in my code.
         | Catching your type errors at compilation time is something you
         | don't have to give up any more after doing your prototyping in
         | Haskell.
        
           | b0sk wrote:
           | it is very clunky though. there is no pattern matching
           | support -- you need to write non-trivial visitors to do any
           | meaningful work and it takes the fun out of it. imo, Rust
           | gets it right and borrows a lot of Haskell ergonomics
        
       | dboreham wrote:
       | Tagged unions were (are?) frequently used in C, albeit without
       | safety. This seems to have been mostly forgotten now when sum
       | types are discussed.
        
         | edflsafoiewq wrote:
         | Tagged unions were common in procedural languages. They
         | disappeared when OO became popular.
        
       | skybrian wrote:
       | Sum types are great in functional languages, but people are
       | talking about using sum types in languages that don't natively
       | support them. This blog post warns about misuse:
       | 
       | Abuse Of Some Sum Types In OO Languages
       | http://www.jerf.org/iri/post/2960
        
         | tialaramex wrote:
         | Hmm, but that Go post argues that Sum types are only the right
         | choice in a _minority_ of situations and if you looked at, say,
         | Rust you 'd see APIs return sum types all over the place.
         | 
         | If we were to believe that "build yet another interface every
         | time" was an equally good solution which is just a different
         | flavour from sum types we'd expect to see equivalent APIs in Go
         | with such interfaces and we do not. It's not a different style,
         | it's just worse.
         | 
         | Example, suppose I got a string from a user, u, and I am now
         | trying to see where that string is in another string I have
         | from somewhere, x
         | 
         | In Rust x.find(u) returns Option<usize>, ie either None or
         | Some(number)
         | 
         | But in Go strings.Index(x, u) returns int, and I have to know
         | that the integer -1 is used as a sentinel value meaning "not
         | found" rather than "found at position -1".
         | 
         | OK, so strings don't have great ergonomics, how about file
         | opening? Let's open a file named george.txt which alas might
         | not exist.
         | 
         | In Rust File::open("george.txt") returns Result<File>, ie
         | either an Error or Ok(File)
         | 
         | But in Go os.Open("george.txt") return a tuple with both an
         | error _and_ my file and then I need to check whether the error
         | was actually nil (no error).
        
           | skybrian wrote:
           | I do think sum types would be a nice addition to Go, but this
           | article is about the workarounds that people use since Go
           | doesn't have them.
           | 
           | For example, if you're designing an API in Go, you should
           | follow the established convention for error handling, rather
           | than come up with your own weird thing, unless there's
           | something special going on. Go's error handling is verbose
           | but it's well understood and it's not broken.
        
       | adultSwim wrote:
       | ML languages (like OCaml) hit such a sweat spot between
       | complexity and productivity. I wish I could use one on the daily.
        
       ___________________________________________________________________
       (page generated 2022-03-28 23:01 UTC)