[HN Gopher] Algebraic Data Types in Haskell
___________________________________________________________________
Algebraic Data Types in Haskell
Author : ingve
Score : 94 points
Date : 2022-03-26 10:54 UTC (2 days ago)
(HTM) web link (serokell.io)
(TXT) w3m dump (serokell.io)
| Koshkin wrote:
| 'Maybe' is both a monad and a sum. A monadic and an algebraic
| type at the same time. I wonder if there is an interesting
| interaction between those.
| Twisol wrote:
| Maybe is a sum because it was _defined_ using sums. It 's a
| monad because, from that definition, you can implement
| functions meeting the monad interface. "Monad" is more like an
| adjective than a noun.
|
| > I wonder if there is an interesting interaction between
| those.
|
| There's a rather rich theory of polynomial endofunctors (i.e.
| generic types with one type parameter built from products,
| sums, and non-generic types), and I think monads figure
| prominently (they're somehow related to the initial algebras
| for such functors). So there's definitely a relationship there.
| black_knight wrote:
| In general an algebraic (polynomial) functor which is also a
| monad is called an operad (non-sigma operad for the algebraic
| topologists). They are plenty and have a lot of interesting
| structure.
| prionassembly wrote:
| Haskell is very powerful, but I don't think this article presents
| anything that's widely available in popular languages right now.
| I'd be very skeptical that TypeScript isn't expressive enough to
| do Maybe and Either; I wouldn't be surprised in PHP is there
| already.
| jerf wrote:
| Your post as I read it is very hard to parse, but I don't think
| the article claimed that Typescript couldn't do it. Many modern
| languages can do Either or Maybe types precisely because they
| copied it from Haskell in the first place. (Which is good.
| Copying is good.)
|
| The dynamic scripting languages can do a "maybe" type because
| they can do just about anything when it comes to types; what
| they can't do is _enforce_ correct usage by restrictions, so
| you write a Maybe type but you can 't ever quite be _sure_ it
| 's doing what you think it's doing. This is just something you
| end up living with in dynamically typed langauges.
| prionassembly wrote:
| Expressivity and automatic enforcement are completely
| orthogonal. Some pseudocode in academic papers is sublimely
| expressive, but can't be typed directly on a computer.
| fooyc wrote:
| Typescript is a static type system though
| jerf wrote:
| Yes, the paragraph break was there for a reason.
| dllthomas wrote:
| Typescript can do Maybe and Either, but it's a little verbose
| because you have to manage the tags yourself.
|
| For instance you can certainly do: type
| Maybe<X> = { tag: 'just', value: X } | { tag: 'nothing'};
| type Either<X, Y> = { tag: 'left', value: X } | { tag: 'right',
| value: Y }; function either<X, Y, Z>(fromLeft: (x:
| X) => Z, fromRight: (y: Y) => Z, xy: Either<X, Y>): Z {
| switch (xy.tag) { case 'left': return
| fromLeft(xy.value); case 'right': return
| fromRight(xy.value); } }
|
| Note that this is meaningfully different from the (often more
| idiomatic and certainly less verbose) untagged use of union
| types because you can distinguish left from right even when the
| contained types agree and you can distinguish different
| nestings, for better and for worse.
| cosmic_quanta wrote:
| Algebraic data types (ADTs) help a lot when expressing business
| logic. It's hard to go back once you've tried it in Haskell (or
| presumably in other languages).
|
| I often try to replicate ADTs in Python at $DAYJOB, because
| they're so damn convenient. I end up with dataclasses which have
| the same interface, and I disambiguate between them using
| `isinstance`. It's not perfect but it's useable.
| prionassembly wrote:
| Tuples (or namedtuples/dataclasses if you must have named
| fields, which in Haskell too is additional syntax) are product
| types, enums are sum types.
|
| You should look into functools.singledispatch if you're doing
| business logic with isinstance. But you should also learn about
| object-oriented programming, inheritance vs composition,
| mixins, etc. People get sour on OOP because they should have
| learned FP first and then imitate/translate FP idioms --
| instead of doing something like modeling domain ontologies in
| object graphs. Likewise, people get sour on FP yadda yadda
| instead of trying to literally write their programs as proofs
| to theorems.
| couchand wrote:
| OOP is not an alternative to ADT, and a language lacking an
| expressive system of value types will stunt your development
| regardless of what flavor the rest of the language follows.
| xwowsersx wrote:
| Can you share some examples, please?
| BoiledCabbage wrote:
| Funny enough someone actually wrote a book on it.
|
| _Data Modeling Made Functional_
|
| https://www.amazon.com/Domain-Modeling-Made-Functional-
| Domai...
|
| It's one of the better and more useful software engineering
| books I've read. Even if you don't use a functional
| programming langauge. It's about using Algebraic Data Types
| do model common problems in the day-to-day business domain
| (not typical academic problems).
|
| It's a really simple and awesome presentation, and by the end
| you're dying for the ability to use this more so in the day
| to day job. Honestly after reading through it, trying to
| model problems in OOP just seems so unnecessarily obtuse.
|
| The Scott Wlaschin also runs
| https://fsharpforfunandprofit.com
|
| https://fsharpforfunandprofit.com/ddd/ - a link to a talk on
| it which is decent, but the book is much better.
| xwowsersx wrote:
| Thanks. I was asking specifically about accomplishing this
| in Python.
| klysm wrote:
| My brain is 100% polluted by the power of ADTs. I can't think
| about modeling problems without them anymore and it makes
| development in languages where it's not ergonomic difficult.
| pid-1 wrote:
| Python does have Union types and pattern matching tho.
| garethrowlands wrote:
| Python has types?
| matyasrichter wrote:
| Uh, yes?
| Gadiguibou wrote:
| I have had the exact same problem with Go recently. I have just
| started using it and have been disappointed to see that, for a
| language that claims to be modern, it doesn't have proper sum
| types/enums OR convenient product types/tuples.
|
| Tuples can be emulated using structs but it can generate a lot
| of boilerplate for a single function. The only alternative to
| express iterating over "zipped" lists is to have the two lists
| side by side and iterate using an integer.
|
| However, sum types are just plain missing. I guess interfaces
| help, but they're really limiting in what they allow and even
| regular C-style enums are painful and can't be checked for
| exhaustivity at compile time.
|
| Does anyone have tips on what the "idiomatic" solutions are for
| these problems?
| jerf wrote:
| There are times when sum types are the best solution, but in
| many cases, interfaces are the solution. But, interfaces as
| they are _designed_ to be used, not just interfaces used to
| label a set of types that you then constantly type switch
| between. See: http://www.jerf.org/iri/post/2960
|
| This is true of OO languages in general, not just Go.
| bobbylarrybobby wrote:
| In practice this doesn't seem to work. In theory, Go should
| have a Result interface and two types Data and Error that
| implement it, and then functions that could error would
| return a Data or an Error and then... you'd have to call a
| function like `my_result.Do(callback)`? The fact that
| idiomatic Go code actually returns (data, err) and checks
| if err == nil -- and that it's not an error to forget this
| check -- suggests that the interface paradigm just doesn't
| work that well.
| couchand wrote:
| It's been a while since I've written Go but back then the
| idiomatic solution was a C-style enum and a fair amount of
| hope.
| rfw300 wrote:
| Have you seen Python 3.10's pattern matching?
| https://peps.python.org/pep-0636/
|
| While it doesn't bring you Haskell ADTs, it sounds like it
| could make your `isinstance`-style code a lot cleaner.
| cosmic_quanta wrote:
| That's great! Unfortunately, at the pace our stack is moving,
| I don't think I'll be using Python 3.10 for some time
| anoctopus wrote:
| Completely agree, although I'm often put off by the
| comparatively worse ergonomics for the python emulation, to the
| point that I sometimes miss obvious-in-retrospect places I
| should be modeling the domain with more variants, even in code
| I'm already using unions of dataclasses. The ergonomics and
| clarity of `data Foo = Foo Int | Bar` are incredible.
| LeanderK wrote:
| > I often try to replicate ADTs in Python at $DAYJOB, because
| they're so damn convenient. I end up with dataclasses which
| have the same interface, and I disambiguate between them using
| `isinstance`. It's not perfect but it's useable.
|
| I also do this! It is so painful not to have real ADTs.
| titzer wrote:
| When designing Virgil, I left out ADTs for the longest time. I
| had programmed a bit in ML, and knew I would eventually want
| them. I finally added them to Virgil circa 2016. It's really
| impossible to emulate them with objects, because objects
| typically have identity, whereas ADTs, IMHO, shouldn't. The
| pattern matching and exhaustiveness checking that ADTs give you
| allows you to write very concise programs that you can be sure
| cover all the cases.
|
| Extensibility in OO languages can be great for many things, but
| ADTs excel at expressing a fixed set of data shapes, forcing
| programmers to think carefully and break things down into a fixed
| set of cases. When you've handled all the cases, you're done!
| garethrowlands wrote:
| Most language type systems can say AND but relatively few can say
| OR. For example, Java can say a `Point` has an `x` AND a `y`:
| class Point { public int x; public int y; }
|
| ...but it can't say a `Point` has _either_ `x` and `y` OR `r` and
| `theta`. You can simulate it with two classes and a base
| class/interface but that implies there might be _other_ ways of
| defining a point... which there really aren't. The Haskell point
| would be: data Point = Cartesian { x :: Double,
| y :: Double } | Polar { r :: Double, theta
| :: Double }
| azth wrote:
| You can do the same in Java now with sealed types:
| sealed interface Point {} record Cartesian(double x,
| double y) implements Point {} record Polar(double r,
| double theta) implements Point {}
| gavinray wrote:
| With JDK 18 you even get binding patterns in matches, so you
| can extract + assign "Point.x"/"Point.y" to variables when
| matching on them!
| Twisol wrote:
| Type patterns [1] are still unfortunately in preview (and
| have been since JDK 17), and record (destructuring)
| patterns [2] haven't even landed as a preview yet. I hope
| the latter get at least a preview in JDK 19 [3], but it
| looks like they've barely even begun the process of
| targeting JEPs to 19 yet.
|
| [1]: https://openjdk.java.net/jeps/420
|
| [2]: https://openjdk.java.net/jeps/405
|
| [3]: https://openjdk.java.net/projects/jdk/19/
| garethrowlands wrote:
| Good point, that's an improvement.
| gunshowmo wrote:
| Recent updates in TypeScript have also made ADTs much better to
| use and make expressing business logic in a statically safe way.
| It's incredible how much mental overhead that good typing systems
| (and good utilization of them) can save you when navigating
| complex code bases.
| dllthomas wrote:
| You still need to tag your own unions though, right?
| gunshowmo wrote:
| Yes, and since typing is only at compile time, it's also
| missing some of the extremely convenient pattern matching
| syntax that I love from Haskell, Rust, Scala, etc....
|
| Despite some downside, I'm a huge fan overall though. I
| haven't tried ReasonML or ReScript, but compared to bare JS,
| TypeScript makes frontend programming a lot more enjoyable to
| me.
| cercatrova wrote:
| I consider ReasonML and ReScript to be effectively dead
| since they split the community a few times now. I don't
| know anyone who's actually using either in production,
| except for their respective sponsors Meta and Bloomberg.
| domlebo70 wrote:
| What are the recent changes?
| earleybird wrote:
| In these discussions about ADTs, often I see Maybe and Either
| identified as an ADT rather than an ADT constructor. Am I just
| missing something other folks 'just know'? If so, what's the
| intuition I should be looking for?
| pwm wrote:
| It's just semantics. data Bool = False | True
| data Maybe a = Nothing | Just a
|
| Bool is a nullary type constructor or simply type. False and
| True are nullary data constructors or simply constants. Maybe
| is an unary type constructor taking one parameter to construct
| a fully saturated type (eg. Maybe Int), but calling it a "Maybe
| type" is ok, no crazy ambiguity. Nothing is a constant and Just
| is an unary data constructor taking one parameter to construct
| fully saturated data (eg. Just 1).
| dllthomas wrote:
| And in the context of Haskell, non-nullary type constructors
| are still types. They don't happen to be the types of any
| values, but they can still parameterize types, be constrained
| by type equality (eg. a ~ Maybe), etc.
| Twisol wrote:
| For any type `a`, `Maybe a` is a type. We say "Maybe" in the
| same sense we say "List" -- it's obviously a list of some type
| `a`, but if what we're saying doesn't depend on any particular
| choice of `a`, then by abuse of language, we don't bother
| saying it.
|
| On the other hand, `Just` and `Nothing` are _constructors_ , or
| _variants_ , for values of type `Maybe a`. You can think of
| them as almost like static factories (as in `Optional.empty()`
| and `Optional.of(val)` in Java), except that algebraic
| constructors are invertible: you can match on a value to
| determine whether it was constructed with `Just` or `Nothing`,
| and what the arguments to the constructor were.
| fooyc wrote:
| The article is well written and Haskell appears to have what's
| necessary to work efficiently with algebraic data types (matching
| and overloading).
|
| But in the end, it's just fancy words for union and intersection
| types, or am I missing something ?
| curtisf wrote:
| "Algebraic data types", as they appear in Haskell, are
| recursive sums of product types.
|
| (The "sum" and "product" are why they're called "algebraic".
| You can also analyze them as power series, such as in computing
| derivatives for zippers)
|
| Intersection types are something else (that Haskell does not
| have built in to my knowledge).
|
| "union" is slightly imprecise, because sum types are tagged
| (i.e., a non-discriminated union of Int and Int is just Int,
| but Int+Int is actually the same as (Bool, Int). True non-
| discriminated union types are relatively uncommon in static
| type systems, but TypeScript for example does have them)
| [deleted]
| bmc7505 wrote:
| ADTs are indeed powerful, but they are one solution to the
| Expression Problem [1] among many. Just as OOP has limitations,
| so do ADTs. Until you spend time studying each one and absorb the
| lessons they have to offer, only then will you realize that all
| solutions have tradeoffs, only then will you finally achieve
| enlightenment.
|
| [1]: https://en.wikipedia.org/wiki/Expression_problem
| dllthomas wrote:
| I was just trying to explain these to my mother, in the context
| of rust.
| [deleted]
| pwm wrote:
| Sum types are one of those things missing from most mainstream
| languages that people don't even realise that they are deprived
| of something fundamental. They naturally complement product types
| that are ubiquitous in most/all languages.
| dllthomas wrote:
| Specifically, closed sum types. You can usually get open sum
| types by way of inheritance.
| globuous wrote:
| True that! It was eye opening once I understood that! Here's
| a basic type system I made in Python for a small smart
| contract language compiler I'm working on. It works with mypy
| pretty flawlessly! [1] I still haven't figured out how to
| prevent subclassing with methods that don't comply with a
| specific type signature, but that's a story for another
| day...
|
| [1] https://yourlabs.io/pyratzlabs/pymich/-/blob/master/pymic
| h/m...
| zozbot234 wrote:
| FreePascal supports them, via variant records. A nice gain in
| developer-friendliness over C.
| [deleted]
| pklausler wrote:
| C++17 has std::variant<> and it's ubiquitous in my code.
| Catching your type errors at compilation time is something you
| don't have to give up any more after doing your prototyping in
| Haskell.
| b0sk wrote:
| it is very clunky though. there is no pattern matching
| support -- you need to write non-trivial visitors to do any
| meaningful work and it takes the fun out of it. imo, Rust
| gets it right and borrows a lot of Haskell ergonomics
| dboreham wrote:
| Tagged unions were (are?) frequently used in C, albeit without
| safety. This seems to have been mostly forgotten now when sum
| types are discussed.
| edflsafoiewq wrote:
| Tagged unions were common in procedural languages. They
| disappeared when OO became popular.
| skybrian wrote:
| Sum types are great in functional languages, but people are
| talking about using sum types in languages that don't natively
| support them. This blog post warns about misuse:
|
| Abuse Of Some Sum Types In OO Languages
| http://www.jerf.org/iri/post/2960
| tialaramex wrote:
| Hmm, but that Go post argues that Sum types are only the right
| choice in a _minority_ of situations and if you looked at, say,
| Rust you 'd see APIs return sum types all over the place.
|
| If we were to believe that "build yet another interface every
| time" was an equally good solution which is just a different
| flavour from sum types we'd expect to see equivalent APIs in Go
| with such interfaces and we do not. It's not a different style,
| it's just worse.
|
| Example, suppose I got a string from a user, u, and I am now
| trying to see where that string is in another string I have
| from somewhere, x
|
| In Rust x.find(u) returns Option<usize>, ie either None or
| Some(number)
|
| But in Go strings.Index(x, u) returns int, and I have to know
| that the integer -1 is used as a sentinel value meaning "not
| found" rather than "found at position -1".
|
| OK, so strings don't have great ergonomics, how about file
| opening? Let's open a file named george.txt which alas might
| not exist.
|
| In Rust File::open("george.txt") returns Result<File>, ie
| either an Error or Ok(File)
|
| But in Go os.Open("george.txt") return a tuple with both an
| error _and_ my file and then I need to check whether the error
| was actually nil (no error).
| skybrian wrote:
| I do think sum types would be a nice addition to Go, but this
| article is about the workarounds that people use since Go
| doesn't have them.
|
| For example, if you're designing an API in Go, you should
| follow the established convention for error handling, rather
| than come up with your own weird thing, unless there's
| something special going on. Go's error handling is verbose
| but it's well understood and it's not broken.
| adultSwim wrote:
| ML languages (like OCaml) hit such a sweat spot between
| complexity and productivity. I wish I could use one on the daily.
___________________________________________________________________
(page generated 2022-03-28 23:01 UTC)