[HN Gopher] Data evolution with set-theoretic types
       ___________________________________________________________________
        
       Data evolution with set-theoretic types
        
       Author : josevalim
       Score  : 67 points
       Date   : 2025-01-14 09:08 UTC (13 hours ago)
        
 (HTM) web link (dashbit.co)
 (TXT) w3m dump (dashbit.co)
        
       | josevalim wrote:
       | Author here. This is probably the article that took me the
       | longest to write, roughly 15 months, and I may still not have
       | explained all concepts with the clarity they deserve and I
       | intended to. If there is any feedback or questions, I'd be glad
       | to answer them!
        
         | practal wrote:
         | Stuff like this is why I don't like type systems. What you want
         | to do is easy, but it becomes difficult to explain in a sane
         | way (15 months difficult), because you need to work around the
         | limitations of type systems. When you say "set-theoretic
         | types", I hear, "get rid of types, just give me logic".
        
           | ryanschaefer wrote:
           | > what you want to do is easy Easy to implement, hard to get
           | correct. It inverts where you do work in a system. It can be
           | hard to implement robust types but once that's done it's easy
           | to know what you are writing is correct.
        
           | josevalim wrote:
           | The work to develop the base theory, which this article
           | presents, takes 15 months, but it doesn't take 15 months to
           | read it (and hopefully it won't take as long to use it
           | either). Whenever you use a programming language, you may
           | work with data structures that took months to formalize and
           | several more years to optimize, yet no one is saying "throw
           | our data structures away". Even things like pretty printing
           | and formatting a float have collectively several years of
           | research behind them, yet the API is often a single function
           | call.
           | 
           | Of course, you can still not like types, and making it harder
           | to evolve libraries over time is a good reason. But using the
           | time it takes to formalize its underlying concepts is not a
           | strong argument against them. The goal is that someone will
           | spend this time, precisely so you don't have to. :)
        
             | Conscat wrote:
             | I have seen multiple users of one certain popular
             | programming language claim that data structures besides a
             | dynamic length array and a hash table have no useful
             | application.
        
               | caspper69 wrote:
               | They wouldn't also happen to throw out memory safety
               | guarantees one level up by using a ton of array indexing,
               | would they?
        
               | tadfisher wrote:
               | Clojure? Sounds like something Hickey would say.
        
             | practal wrote:
             | Oh, I like formalising things, don't get me wrong, and I
             | don't mind spending time on it at all. I just don't like
             | doing it via types, and looking at how much time you spent
             | on what, I rest my case.
        
       | beders wrote:
       | These are great examples of difficulties people will encounter in
       | all popular statically typed languages sooner or later.
       | 
       | I find the solution presented interesting but it is limited to
       | the 3 operations mentioned.
       | 
       | The alternative to this runtime schema checks and treating your
       | data as data - the data you are working with is coming from an
       | external system and needs to be runtime-validated anyways.
       | 
       | Throw in some nil-punting and you can have a very flexible system
       | that is guarded by runtime checks.
        
       | Groxx wrote:
       | Yea, this is an issue rather near and dear to my heart (due to
       | pain). I very much appreciate strong and safe types, but it tends
       | to mean enormous pain when making small obvious fixes to past
       | mistakes, and you _really_ don 't want to block those. It just
       | makes everything harder in the long term.
       | 
       | As an up-front caveat for below: I don't know Elixir in detail,
       | so I'm approaching this as a general type/lang-design issue. And
       | this is a bit stream-of-consciousness-y and I'm not really
       | seeking any goal beyond maybe discussion.
       | 
       | ---
       | 
       | Structural sub-typing with inference (or similar, e.g. perhaps
       | something fancier with dependent types) seems like kinda the only
       | real option, as you _need_ to be able to adapt to whatever bowl
       | of Hyrum Slaw[1] has been created when you weren 't looking.
       | Allowing code that is _still provably correct_ to continue to
       | work without modification seems like a natural fit, and for
       | changes I 've made it would fairly often mean >90% of users would
       | do absolutely nothing and simply get better safety and better
       | behavior for free. It might even be an _ideal_ end state.
       | 
       | I kinda like the _ergonomics_ of `revision 2` here, it 's clear
       | what it's doing and can provide tooling hints in a rather
       | important and complicated situation... but tbh I'm just not sure
       | how much this offers vs _actual structural typing_ , e.g. just
       | having an implicit revision per field. With explicit revisions
       | you can bundle interrelated changes (which is quite good, and
       | doing this with types alone ~always requires some annoying
       | ceremony), but it seems like you'll also be forcing code to
       | accept all of v2..N-1 to get the change in vN because they're
       | _not_ independent.
       | 
       | The "you must accept all intermediate changes" part is in some
       | ways natural, but you'll also be (potentially) forcing it on your
       | users, and/or writing a lot of transitioning code to avoid
       | constraining them.
       | 
       | I'm guessing this is mostly due to Elixir's type system, and
       | explicit versions are a pragmatic tradeoff? A linear rather than
       | combinatoric growth of generated types?
       | 
       | > _It is unlikely - or shall we say, not advisable to - for a
       | given application to depend on several revisions over a long
       | period of time. They are meant to be transitory._
       | 
       | An _application_ , yes - applications should migrate when they
       | are able, and because they are usage-leaf-nodes they can do that
       | without backwards compatibility concerns. But any library that
       | uses other libraries generally benefits from supporting as many
       | versions as possible, to constrain the parent-library's users as
       | little as possible.
       | 
       | It's exactly the same situation as you see in normal SAT-like
       | dependency management: applications should pin versions for
       | stability, libraries should try to allow as broad of a range as
       | possible to avoid conflicts.
       | 
       | >* Would downcasting actually be useful in practice? That is yet
       | to be seen.*
       | 
       | I would pretty-strongly assume both "yes" and "it's complicated".
       | For end-users directly touching those fields: yes absolutely,
       | pick your level of risk and live with it! This kind of thing is
       | great for isolated workarounds and "trust me, it's fine"
       | scenarios, code has varying risk/goals and that's good. Those
       | happen all the time, even if nobody really likes them afterward.
       | 
       | But if those choices apply to _all_ libraries using it in a
       | project... well then it gets complicated. Unless you know how all
       | of them use it, _and they all agree_ , you can't safely make that
       | decision. Ruby has refinements which can at least somewhat deal
       | with this, by restricting _when_ those decisions apply, and Lisps
       | with continuations have another kind of tool, but most popular
       | languages do not... and I have no idea how possible either would
       | be in Elixir.
       | 
       | ---
       | 
       | All that probably summarizes as: if we could boil the ocean,
       | would this be meaningfully different than structural typing with
       | type inference, and no versioning? It sounds like this might be a
       | reasonable middle-ground for Elixir, but what about in general,
       | when trying to apply this strategy to other languages? And viewed
       | through that lens, are there other structural typing tools worth
       | looking at?
       | 
       | [1] https://en.wiktionary.org/wiki/Hyrum%27s_law
        
       ___________________________________________________________________
       (page generated 2025-01-14 23:00 UTC)