[HN Gopher] Data evolution with set-theoretic types
___________________________________________________________________
Data evolution with set-theoretic types
Author : josevalim
Score : 67 points
Date : 2025-01-14 09:08 UTC (13 hours ago)
(HTM) web link (dashbit.co)
(TXT) w3m dump (dashbit.co)
| josevalim wrote:
| Author here. This is probably the article that took me the
| longest to write, roughly 15 months, and I may still not have
| explained all concepts with the clarity they deserve and I
| intended to. If there is any feedback or questions, I'd be glad
| to answer them!
| practal wrote:
| Stuff like this is why I don't like type systems. What you want
| to do is easy, but it becomes difficult to explain in a sane
| way (15 months difficult), because you need to work around the
| limitations of type systems. When you say "set-theoretic
| types", I hear, "get rid of types, just give me logic".
| ryanschaefer wrote:
| > what you want to do is easy Easy to implement, hard to get
| correct. It inverts where you do work in a system. It can be
| hard to implement robust types but once that's done it's easy
| to know what you are writing is correct.
| josevalim wrote:
| The work to develop the base theory, which this article
| presents, takes 15 months, but it doesn't take 15 months to
| read it (and hopefully it won't take as long to use it
| either). Whenever you use a programming language, you may
| work with data structures that took months to formalize and
| several more years to optimize, yet no one is saying "throw
| our data structures away". Even things like pretty printing
| and formatting a float have collectively several years of
| research behind them, yet the API is often a single function
| call.
|
| Of course, you can still not like types, and making it harder
| to evolve libraries over time is a good reason. But using the
| time it takes to formalize its underlying concepts is not a
| strong argument against them. The goal is that someone will
| spend this time, precisely so you don't have to. :)
| Conscat wrote:
| I have seen multiple users of one certain popular
| programming language claim that data structures besides a
| dynamic length array and a hash table have no useful
| application.
| caspper69 wrote:
| They wouldn't also happen to throw out memory safety
| guarantees one level up by using a ton of array indexing,
| would they?
| tadfisher wrote:
| Clojure? Sounds like something Hickey would say.
| practal wrote:
| Oh, I like formalising things, don't get me wrong, and I
| don't mind spending time on it at all. I just don't like
| doing it via types, and looking at how much time you spent
| on what, I rest my case.
| beders wrote:
| These are great examples of difficulties people will encounter in
| all popular statically typed languages sooner or later.
|
| I find the solution presented interesting but it is limited to
| the 3 operations mentioned.
|
| The alternative to this runtime schema checks and treating your
| data as data - the data you are working with is coming from an
| external system and needs to be runtime-validated anyways.
|
| Throw in some nil-punting and you can have a very flexible system
| that is guarded by runtime checks.
| Groxx wrote:
| Yea, this is an issue rather near and dear to my heart (due to
| pain). I very much appreciate strong and safe types, but it tends
| to mean enormous pain when making small obvious fixes to past
| mistakes, and you _really_ don 't want to block those. It just
| makes everything harder in the long term.
|
| As an up-front caveat for below: I don't know Elixir in detail,
| so I'm approaching this as a general type/lang-design issue. And
| this is a bit stream-of-consciousness-y and I'm not really
| seeking any goal beyond maybe discussion.
|
| ---
|
| Structural sub-typing with inference (or similar, e.g. perhaps
| something fancier with dependent types) seems like kinda the only
| real option, as you _need_ to be able to adapt to whatever bowl
| of Hyrum Slaw[1] has been created when you weren 't looking.
| Allowing code that is _still provably correct_ to continue to
| work without modification seems like a natural fit, and for
| changes I 've made it would fairly often mean >90% of users would
| do absolutely nothing and simply get better safety and better
| behavior for free. It might even be an _ideal_ end state.
|
| I kinda like the _ergonomics_ of `revision 2` here, it 's clear
| what it's doing and can provide tooling hints in a rather
| important and complicated situation... but tbh I'm just not sure
| how much this offers vs _actual structural typing_ , e.g. just
| having an implicit revision per field. With explicit revisions
| you can bundle interrelated changes (which is quite good, and
| doing this with types alone ~always requires some annoying
| ceremony), but it seems like you'll also be forcing code to
| accept all of v2..N-1 to get the change in vN because they're
| _not_ independent.
|
| The "you must accept all intermediate changes" part is in some
| ways natural, but you'll also be (potentially) forcing it on your
| users, and/or writing a lot of transitioning code to avoid
| constraining them.
|
| I'm guessing this is mostly due to Elixir's type system, and
| explicit versions are a pragmatic tradeoff? A linear rather than
| combinatoric growth of generated types?
|
| > _It is unlikely - or shall we say, not advisable to - for a
| given application to depend on several revisions over a long
| period of time. They are meant to be transitory._
|
| An _application_ , yes - applications should migrate when they
| are able, and because they are usage-leaf-nodes they can do that
| without backwards compatibility concerns. But any library that
| uses other libraries generally benefits from supporting as many
| versions as possible, to constrain the parent-library's users as
| little as possible.
|
| It's exactly the same situation as you see in normal SAT-like
| dependency management: applications should pin versions for
| stability, libraries should try to allow as broad of a range as
| possible to avoid conflicts.
|
| >* Would downcasting actually be useful in practice? That is yet
| to be seen.*
|
| I would pretty-strongly assume both "yes" and "it's complicated".
| For end-users directly touching those fields: yes absolutely,
| pick your level of risk and live with it! This kind of thing is
| great for isolated workarounds and "trust me, it's fine"
| scenarios, code has varying risk/goals and that's good. Those
| happen all the time, even if nobody really likes them afterward.
|
| But if those choices apply to _all_ libraries using it in a
| project... well then it gets complicated. Unless you know how all
| of them use it, _and they all agree_ , you can't safely make that
| decision. Ruby has refinements which can at least somewhat deal
| with this, by restricting _when_ those decisions apply, and Lisps
| with continuations have another kind of tool, but most popular
| languages do not... and I have no idea how possible either would
| be in Elixir.
|
| ---
|
| All that probably summarizes as: if we could boil the ocean,
| would this be meaningfully different than structural typing with
| type inference, and no versioning? It sounds like this might be a
| reasonable middle-ground for Elixir, but what about in general,
| when trying to apply this strategy to other languages? And viewed
| through that lens, are there other structural typing tools worth
| looking at?
|
| [1] https://en.wiktionary.org/wiki/Hyrum%27s_law
___________________________________________________________________
(page generated 2025-01-14 23:00 UTC)