[HN Gopher] A reckless introduction to Hindley-Milner type infer...
___________________________________________________________________
A reckless introduction to Hindley-Milner type inference (2019)
Author : nequo
Score : 79 points
Date : 2024-06-24 05:24 UTC (1 days ago)
(HTM) web link (reasonableapproximation.net)
(TXT) w3m dump (reasonableapproximation.net)
| cies wrote:
| I encountered HM a lot in my working life as a programmer. I did
| not expect that when I first encountered the names.
|
| From the article:
|
| > HM type systems can't implement heterogenous lists [...] >
| [well it does when wrapping the elements in a sum type, but then]
| > it can only accept types you know about in advance. > [which
| makes it] a massive pain to work with.
|
| Sorry, I totally disagree... Heterogeneous lists without the sum
| types are a massive pain! Maybe it is easier for the initial
| development effort, but if you need to maintain code with
| heterogeneous lists (or dynamically typed code in general) the
| pain will come at some point and likely when already in
| production.
|
| I much rather fix compile errors than runtime errors. And a
| massive thanks to Hindley and Milner for advancing type systems!
|
| Found their photos in gratitude just now:
|
| https://www.researchgate.net/profile/J-Hindley
|
| https://en.wikipedia.org/wiki/Robin_Milner
| fluoridation wrote:
| They're only a pain if you need to know the run-time types of
| the elements. If you're just operating on them through a
| narrowly defined interface, that's just dynamic polymorphism,
| which is indeed more flexible with regards to future
| expandability, as long as you define the interface
| appropriately.
| skybrian wrote:
| I think this article would be better if it ignored the issue of
| proving termination.
|
| Proving that a program terminates _eventually_ is a very weak
| guarantee, since it could be a program that terminates after a
| million years. This matters mostly for proof languages, where
| it's useful to write proofs about functions that you _never run._
| If the function provably terminates, you can assume that whatever
| value it returns exists, without calculating it.
|
| For a possibly long-running job like a SQL query, a static
| guarantee that it terminates eventually is almost certainly not
| what you want. You're going to need to test its performance, and
| ideally have a timeout and a way of handling the timeout. If you
| want to get fancy, a progress bar and cancel button might be
| nice.
|
| Edit:
|
| Compilers need to run fast for the programs we actually write. A
| type system that possibly doesn't terminate under unusual
| conditions isn't that big a deal, as long as people don't write
| code like that. Hopefully you get a decent error message.
|
| Compare with memory use. We don't want compilers to use excessive
| amounts of memory, but don't need static guarantees about memory
| use.
| dmkolobov wrote:
| The problem with "not writing code like that" is that
| eventually someone does.
|
| At work we use a home-grown Haskell-like language with "row
| types" which can be split into disjoint sets of fields in order
| to represent joins and the like.
|
| It works well in most cases, but it's very easy to forget a
| type annotation and trigger unmanageable build-times. Often
| times, it is not obvious where something needs to be annotated.
| Sure, we can act as human time-outs, but the productivity loss
| can be substantial, especially for a newcomer to the language.
| gpm wrote:
| The solution to "eventually someone does" is "add a timeout,
| and when you hit that timeout give them an error so they can
| fix their code".
|
| There's no practical difference between exponential and
| infinite time, you need to do this for bad exponential time
| cases anyways.
|
| Many langauges type systems (Java, Typescript, C++, Rust,
| ...) really _aren 't_ guaranteed to terminate in the absence
| of a timeout either. They're (the type systems) turing
| complete, you can write infinite loops that the compiler
| can't know are infinite loops. This makes no practical
| difference relative to languages which "merely" have
| exponential time type systems, because with an exponential
| time type system you're alread accepting the risk that
| typechecking might not terminate before the heat death of the
| universe.
| AnimalMuppet wrote:
| A timeout (or recursion/expansion depth limit) could be
| useful to give the human some specifics about what went
| wrong. But humans generally kill compilations that take
| much longer than usual. If it took 20 seconds before, and
| now it's run for 10 minutes and hasn't finished, I'm
| probably going to control-C it (or whatever), and then diff
| against the previous code to try to figure out what went
| wrong.
| skybrian wrote:
| Yes, making it possible to reason locally about compiler
| performance can be pretty helpful. It's one reason I like to
| have type declarations for public API's. Also, hitting an
| implementation-defined limit (that you can raise) for
| unreasonable constructs might result in better errors than
| just continuing to run, but slowly.
|
| A termination guarantee for type checking doesn't really
| help.
| samatman wrote:
| > _Proving that a program terminates eventually is a very weak
| guarantee, since it could be a program that terminates after a
| million years. This matters mostly for proof languages, where
| it's useful to write proofs about functions that you never run.
| If the function provably terminates, you can assume that
| whatever value it returns exists, without calculating it._
|
| In practical algorithms of significance, we can often do much
| better than this, proving both termination, and the time
| complexity of that termination. For instance, a DFA of a
| regular language will terminate in O(n) of the length of the
| haystack, and we also have a bound for the time it takes to
| construct that DFA.
|
| Sometimes, as you're alluding to, we know something terminates,
| but the time complexity is impractical: exhaustive naive search
| for Traveling Salesman will, in principle, halt. This is,
| itself, useful information.
|
| The article claims (I believe correctly) not just that H-M
| inference halts, but that it does so in "near-linear" time.
| That's a much tighter bound than merely proving termination.
| skybrian wrote:
| Yes, agreed.
| nextaccountic wrote:
| Complexity sometimes is still too little, because the
| constants can overwhelm your O(n) for all n that you care
| about in practice. Or rather, complexity can't help you to
| select between two competing O(n) algorithms
|
| However, without a precise cost model of your operations
| (say, you just know each operation has a bounded time and
| such they are each O(1 ) but nothing beyond that), that's
| often the best you can do
|
| There's something called resource aware computation that can
| do more finer grained bounds on the runtime and memory usage
| of functions (and other resource usage like stack space so
| you can prove you always use less than 8MB or whatever
| number) not only complexity but also the constants. With the
| right cost model you _could_ model CPU and memory latencies
| and other things like CPU port usage, cache usage etc, but
| there 's an interplay with the optimizer here (this could
| even be used to spot performance regressions of compiler
| codegen even without running any benchmark)
| gpm wrote:
| If "near-linear" time was a bound it would be, but it's not.
| HM type inference is exptime-complete. Near-linear here is
| just "most programs people actually write type check in
| something close to linear time".
| bmacho wrote:
| I second this, for programs to terminate _eventually_ , is not
| a good goal to pursue. In practice if you need termination, you
| will need bounded termination, which solves your eventual
| termination problem completely.
|
| I don't understand why people talk so much about eventual
| termination.
| mrkeen wrote:
| Think of it not like proofs but like tests, i.e. it catches
| bad stuff.
|
| Having static types is a low cost solution that catches a lot
| of possible bugs. Purity catches another bunch of bugs. No-
| nulls catches even more. At this point you can kinda start
| trusting function names and type signatures to do what they
| say on the box (often said about Haskell). Likewise, if it
| compiles, it probably does the right thing.
|
| So where's the next gap in Haskell that lets bugs through?
| I'd argue that (eventual) termination would be that next
| step. If you say a function has type a->a (for all a), it
| only admits one possible implementation... except for bad
| ones, like 'undefined' or an infinite recursion - both of
| which termination would catch for you.
| chowells wrote:
| > If you say a function has type a->a (for all a), it only
| admits one possible implementation... except for bad ones,
| like 'undefined' or an infinite recursion - both of which
| termination would catch for you.
|
| To be super-pedantic, it only admits one possible
| denotation, assuming termination. There are an infinite
| number of implementations that waste finite time on
| something useless before returning the argument. We usually
| don't talk about those, because... Why would you write code
| that does that? But it's worth keeping in mind as an edge
| case in this sort of discussion.
| shiandow wrote:
| It's very easy to write programs solving some unsolvable
| problem that don't terminate.
| marssaxman wrote:
| > I think this article would be better if it ignored the issue
| of proving termination.
|
| I agree; that part went on so long that I got bored and stopped
| reading before the article actually got around to saying
| anything about type inference.
| abeppu wrote:
| Can someone familiar with the territory recommend a better or
| complementary resource of similar level of approachability?
|
| The intro here flags that the author i) is not happy with the
| post in its current form, ii) was resynthesized from wikipedia
| "without particularly putting in the effort to check my own
| understanding", which doesn't inspire confidence.
|
| And then when you do get into the content, stuff _looks_
| incorrect, but a non-expert reader, given the caveats in the
| intro, should be on the lookout for errors but isn 't equipped to
| identify them with confidence. E.g. the very first inference rule
| included is described as:
|
| > if you have a variable x declared to have some polytype p, and
| if the monotype m is a specialisation of p, then x can be judged
| to have type m
|
| ... but I _think_ this should be "p is a specialization of m".
|
| Basically, I think this is the kind of material where detail
| matters, where errors seriously impede understanding, and a
| resource where the reader/learner has to be on guard second-
| guessing the author is a bad intro resource. But surely there are
| better ones out there.
| nathanrf wrote:
| The quoted paragraph is correct as written. E.g. if
|
| f: [?]A, B. A -> B -> A
|
| Then we have poly type p = "[?]A, B. A -> B -> A".
|
| The monotype m = Str -> Int -> Str is a specialization of p, so
| we are also permitted to judge that f has type m.
|
| The type specialization operator "looks backwards", where we
| write sorta "p <= m" because there's a kind of "subtyping"
| relationship (though most HM type systems don't actually
| contain subtyping) - anywhere that a term of type m is used, a
| term of type p can be used instead because it can be
| specialized to m.
|
| I think this is a good complementary article:
| https://boxbase.org/entries/2018/mar/5/hindley-milner/ (it also
| links to three more academic sources for more context/detail)
| shpongled wrote:
| "Types and Programming Languages" by Benjamin Pierce is an
| excellent textbook (and approachable, even to people without
| formal CS education) that covers type inference and introduces
| a lot of the nomenclature/formalism gradually.
| mrkeen wrote:
| This is nice and genuinely entertaining
| https://youtu.be/IOiZatlZtGU (Propositions as types by phil
| wadler). Its true focus is the 1-1 mapping between type systems
| and logic systems, but it's well worth the watch.
|
| This is good reading - it demystifies the funny typing rules in
| their mathematical form. https://legacy-
| blog.akgupta.ca/blog/2013/05/14/so-you-still-...
|
| (Neither of these tell you how to implement the checker
| yourself though)
| efnx wrote:
| It will never be finished but Stephen Diehl's "Write you a
| Haskell" is great.
|
| http://dev.stephendiehl.com/fun/
| ashton314 wrote:
| Here are my notes on writing a type checker/type inferrer in case
| anyone is interested:
| https://lambdaland.org/posts/2022-07-27_how_to_write_a_type_...
___________________________________________________________________
(page generated 2024-06-25 23:00 UTC)