[HN Gopher] Nominal for Storing, Structural for Manipulating
___________________________________________________________________
Nominal for Storing, Structural for Manipulating
Author : todsacerdoti
Score : 40 points
Date : 2024-11-20 02:50 UTC (20 hours ago)
(HTM) web link (welltypedwitch.bearblog.dev)
(TXT) w3m dump (welltypedwitch.bearblog.dev)
| PaulHoule wrote:
| I think it is forgotten here that one of the benefits of nominal
| typing is that the compiler can know what the data layout will
| be at run time, so there are performance benefits.
|
| There has been so much ink spilled on the question of what kind
| of type systems help programmers be productive, but there is no
| such controversy on the performance side.
| agentultra wrote:
| Do you mean at compile time?
|
| I'm mostly familiar with Haskell, which does type erasure. I
| think that means there is no type checking at run time.
|
| I think dependently typed languages would have the benefit of
| knowing structure at compile time, enough to detect things like
| dead branches and impossible cases, which lets the compiler
| optimize by removing code. I'm not sure that all types are
| erased in such languages though.
| taeric wrote:
| I assume they mean the simple fact that nominal types can be
| used in optimizations. That is why float versus double or int
| can be a big difference.
|
| For user types specifically, it is less of a benefit, but it
| can still show up.
| PaulHoule wrote:
| In the case of simple types, say a 64-bit int, if the
| compiler knows what the types are ahead of time it can
| store the ints in registers and add them with one assembly
| language instruction. The same goes for floating point.
|
| If you want to add two things in Python they could be
| totally different types so at runtime the program is likely
| to have pointers in those registers and it is going to have
| to look at the objects and figure out what it has to do to
| add them, possibly calling the __add__ method on the
| object. In the obvious implementation you are having to
| fetch the int out of memory when you could have just got it
| out of the register.
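|
| A toy sketch of that runtime dispatch (hypothetical Meters
| class, illustrative only, not from any real runtime):

```python
# Hedged sketch: Python's "+" resolves to a method call at run time.
class Meters:
    def __init__(self, n):
        self.n = n

    def __add__(self, other):
        # The interpreter chases pointers and calls this method;
        # a compiler that knew both operands were machine ints
        # would emit a single ADD instruction instead.
        return Meters(self.n + other.n)

total = Meters(2) + Meters(3)
print(total.n)  # 5
```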
|
| Now, many languages play tricks with pointers: we are not
| going to fill a 64-bit address space any time soon, so we
| can make 'pointers' whose bit patterns host integers and
| other numbers inside. Still, it is a lot harder to add two
| of those than it is to just add two values we know are
| integers.
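|
| A hedged sketch of that tagging trick (toy encoding, not any
| real runtime's): even the "fast" path pays for a tag check and
| an untag/retag dance around the real addition.

```python
def tag(n):
    return (n << 1) | 1      # integer payload with the tag bit set

def untag(v):
    return v >> 1

def tagged_add(a, b):
    if (a & 1) and (b & 1):  # both tagged ints: fast path
        return tag(untag(a) + untag(b))
    raise TypeError("slow path: dispatch on heap objects")

print(untag(tagged_add(tag(2), tag(3))))  # 5
```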
|
| With user types it is pretty much the same, particularly in
| single inheritance languages like Java where it is dead
| obvious that you can access field A of type X by adding a
| fixed offset to the object pointer. In a language like
| Python or Javascript you are looking everything up in a
| hashtable, even if it is a fast hashtable. (You don't
| consistently win using slots.)
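|
| You can see the two storage strategies from Python itself: a
| plain class hangs attributes off a per-instance __dict__
| hashtable, while __slots__ reserves fixed offsets instead.

```python
class Dynamic:
    pass                    # attributes live in a per-instance dict

class Slotted:
    __slots__ = ("a",)      # attribute "a" gets a fixed slot

d = Dynamic()
d.a = 1
s = Slotted()
s.a = 1

print(hasattr(d, "__dict__"), hasattr(s, "__dict__"))  # True False
```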
|
| A really advanced compiler/runtime could undo some of this
| stuff. For instance, it might be clear that in a particular
| case you are adding x+y and those are both going to be
| 64-bit ints, so you can specialize it. You could do this at
| build time, or even at runtime: see that a function has been
| called in a particular context 10,000 times, inline that
| function into the function that calls it, inline that one in
| turn, and recompile with the fastest code. But it is tricky
| to get right: if an x that isn't an int finds its way in
| there, you need to despecialize the function without
| breaking anything.
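|
| A toy monomorphic inline cache makes the specialize/despecialize
| cycle concrete (hedged sketch; the threshold of 3 stands in for
| "10,000 calls"):

```python
def make_add():
    state = {"seen_int": 0, "fast": False}

    def add(x, y):
        if state["fast"]:
            if type(x) is int and type(y) is int:
                return x + y           # specialized fast path
            state["fast"] = False      # deopt without breaking anything
        result = x + y                 # generic path (__add__ dispatch)
        if type(x) is int and type(y) is int:
            state["seen_int"] += 1
            if state["seen_int"] >= 3:
                state["fast"] = True   # specialize after enough evidence
        return result

    return add

add = make_add()
for _ in range(5):
    add(1, 2)
print(add(1, 2), add("a", "b"))  # 3 ab
```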
|
| PyPy shows that Python could be greatly improved over what
| it is, I think CPython is going to approach it in speed
| gradually.
| lmm wrote:
| Premature optimisation is the root of all evil. If you get the
| semantics right then good performance will generally follow, if
| you make the wrong thing then it doesn't matter how fast it is.
| taeric wrote:
| Performant code in the wild largely disagrees? This is the
| difference between a numpy array and a regular Python
| list. It is stark.
|
| You are correct that you can often engineer the performance
| later. But layout concerns and nominal types for performance
| are fairly non-controversial.
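|
| A stdlib-only stand-in for the numpy comparison (hedged; numpy
| itself is left out to keep it self-contained): a list holds
| pointers to separately allocated int objects, while array('d')
| stores raw doubles contiguously, numpy-style.

```python
import sys
from array import array

xs = list(range(1000))        # 1000 pointers to boxed int objects
ys = array("d", range(1000))  # 1000 contiguous C doubles

list_bytes = sys.getsizeof(xs) + sum(sys.getsizeof(x) for x in xs)
array_bytes = sys.getsizeof(ys)
print(array_bytes < list_bytes)  # True
```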
| thesz wrote:
| Python does not do optimizations at all.
|
| In contrast, you can find Fortran compilers that can
| substitute your inefficient implementation of some
| numerical algorithm with a different, much more performant
| one.
| PaulHoule wrote:
| Python has a peephole optimizer:
| https://akaptur.com/blog/2014/08/02/the-cpython-peephole-opt...
| maccard wrote:
| I think what OP meant was "Python's optimisations are not
| very fast", which (IMO) is a fair statement. Python is
| many, many things, but its best-case scenarios (without
| rewriting in C, at which point you're writing C, not
| Python) are slower than the naive implementations in many
| other languages.
| biorach wrote:
| > Premature optimisation is the root of all evil
|
| If you're going to quote Knuth you better be damn sure you've
| fully understood him.
|
| The quote in question is about micro-optimisations which were
| pointless on account of design issues. The situation you are
| commenting on is kind of the opposite.
| PaulHoule wrote:
| I would say this debate
|
| https://en.wikipedia.org/wiki/AoS_and_SoA
|
| (which I think is weakened in the Wikipedia version) goes
| to the heart of where profiling can hide huge opportunities
| for optimization.
|
| It is a good prop bet that rewriting something (that
| doesn't allocate lots of dynamic memory) that is AoS style
| in SoA style will speed it up significantly. The victim is
| emotionally attached to their code, and the profiler would
| never show you the many small costs, often distributed
| through the memory system, that add up. In a case like that
| they might accuse you of cheating by applying a bunch of
| small optimizations which are often easier to express in
| SoA, if not downright obvious. When they apply the same
| optimizations to their code they will speed it up, but not
| as much.
|
| Oddly, it's a discussion that's been going on ever since
| people were writing games in assembly language in the 1980s,
| and probably before that; it would be interesting to see
| more tools which are SoA on the inside but look AoS on the
| outside.
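|
| A toy Python picture of the two layouts (hedged, illustrative
| only; in a real runtime the SoA column sweep touches one
| contiguous buffer, while the AoS sweep hops between separately
| allocated records):

```python
# AoS: one record per particle; SoA: one column per field.
aos = [{"x": float(i), "y": 0.0} for i in range(4)]        # array of structs
soa = {"x": [float(i) for i in range(4)], "y": [0.0] * 4}  # struct of arrays

aos_sum = sum(p["x"] for p in aos)  # hops object to object
soa_sum = sum(soa["x"])             # walks one contiguous column
print(aos_sum, soa_sum)  # 6.0 6.0
```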
| fanf2 wrote:
| Good data layout is the root of all optimization.
| thesz wrote:
| Worst-case optimal join algorithm [1] might have a
| different opinion.
|
| [1] https://en.wikipedia.org/wiki/Worst-
| case_optimal_join_algori...
| jerf wrote:
| This is basically a form of the "Sufficiently Smart Compiler"
| claim. It is generally false, barring a definition of
| "semantics right" that simply includes a presumption that the
| "semantics" will be chosen for performance in which case the
| claim becomes circular.
|
| At this point in the collective journey we are all on
| understanding programming languages and what they can do, the
| evidence is _overwhelming_ that there are in fact plenty of
| useful semantics that are intrinsically slower than other
| useful semantics, relative to any particular chunk of
| hardware executing them. That is, what is slow on CPU or GPU
| may differ, but there is certainly some sort of difference
| that will exist, and there is no amount of (feasible)
| compiling around that problem.
|
| Indeed, that's _why_ we have CPUs and GPUs and soon NPUs and
| in the future perhaps other types of dedicated processors...
| precisely _because_ not all semantics can be implemented
| equally no matter how smart you are.
| lmm wrote:
| From where I stand the sufficiently smart compiler paradigm
| is already dominant, and getting more so every day.
|
| - The overwhelming majority of new code is written in high-
| level languages
|
| - High-level languages have continued to close what small
| performance gaps remain
|
| - There have been no serious efforts to implement a true
| low-level language for post-Pentium (superscalar) CPUs, let
| alone the CPUs of today
|
| - Even GPUs and NPUs are largely programmed by using
| languages that express largely the same semantics as
| languages for CPUs, and relying heavily on compiler
| optimisation
| jerf wrote:
| That's not what the "sufficiently smart compiler" is. The
| "sufficiently smart compiler" is one that allows you to
| write any amazingly capable language you want with
| whatever amazing semantics you want, and it compiles it
| into code that is competitive with C written for speed.
|
| You can easily observe in any cross-language shootout in
| 2024 that optimized code bases in the various languages
| still have gradients and we do not live in a world where
| you can just start slamming out Python code and expect no
| performance delta against C.
|
| https://prog21.dadgum.com/40.html
|
| Merely _smart_ compilers are amazing; one of the defining
| characteristics of the software world is that you can be
| handed these things _for free_. The "sufficiently smart
| compiler", however, does not exist, and while there is no
| mathematical proof that I'm aware of that they are
| impossible, after at least two decades of people trying
| to produce them and totally failing, the rational
| expectation at this point must be that they do not exist.
| readthenotes1 wrote:
| It's odd to see Java thrown in there without methods...
| molikto wrote:
| convertTree doesn't work because Tree uses the Tree type recursively.
| jerf wrote:
| Since the author is writing their own language, their language
| can fix that; it's not hard to figure out how to implement
| that.
|
| What I find horrifying is that types are most useful when they
| are more than just a shape. Just because two things have a type
| X[A] {X[A], A, X[A]} doesn't mean they're the same binary tree. A
| trivial example is that one tree could be sorting by the "id"
| field and another could be sorting on a "username" field, and
| simply casting one to the other means you've now got a garbage
| data structure and none of the functions/methods operating on
| it are going to return good data now. It's a bit of a culture
| shock to see such Haskell-esque type notation combined with
| excitement over blowing away everything about a type other than
| its brute shape.
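|
| A hedged sketch of that failure mode (ById and ByUsername are
| hypothetical names): both wrap the exact same structural type,
| but carry incompatible invariants.

```python
from typing import NewType

ById = NewType("ById", list)              # invariant: sorted by id
ByUsername = NewType("ByUsername", list)  # invariant: sorted by username

users = [(2, "alice"), (1, "bob")]
by_id = ById(sorted(users, key=lambda u: u[0]))
by_name = ByUsername(sorted(users, key=lambda u: u[1]))

# Cast one to the other and any binary search written against the
# "sorted by id" invariant silently returns garbage on the other.
print(by_id == by_name)  # False
```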
| vrotaru wrote:
| I guess Modula-3 was doing it as well.
|
| Records were structurally typed. But you can "brand" a record
| and that will make it a nominal type.
| cryptonector wrote:
| ASN.1 is structurally typed too.
| deredede wrote:
| > What this means is that all nominal variants and records are
| really just nominal wrappers over structural variants or records.
|
| If you have both nominal types and structural types in your
| language, you can already do this, while keeping the ability to
| be nominal only or structural only when you want.
|
| In the following OCaml code, variant inference in pattern
| matching is not automatic (although the compiler will warn you
| about the missing cases, which helps you figure out what to
| write), but the types do get refined in the corresponding branch.
|     type 'a tree =
|       Tree of
|         [ `Empty
|         | `Branch of ('a tree * 'a * 'a tree)
|           (* Unfortunately, inline records are not supported here,
|              so you have to use tuples, objects, or a mutually
|              recursive record type. *) ]
|     [@@unboxed]
|
|     (* You can give aliases to subsets of constructors *)
|     type 'a branch = [ `Branch of ('a tree * 'a * 'a tree) ]
|
|     let f (x : 'a branch) = ()
|
|     let f x =
|       match x with
|       | Tree `Empty -> ()
|       (* You need to be explicit about the cases you want to
|          handle. This pattern could also be written
|          `Tree (#branch as rest)`. *)
|       | Tree (`Branch _ as rest) -> f rest
|
| The one feature I'd really like in this space would be the
| ability to refer to a subset of the constructors of a nominal
| type as if it were a (restricted) polymorphic variant that could
| only be recombined with another subset of constructors of the
| same nominal type. It would allow some of the power of
| polymorphic variants without losing the efficient representation
| that nominal types get from knowing the possible variants ahead
| of time.
| js8 wrote:
| The nominal vs structural distinction (in both programming and
| math) goes beyond types. Should names in programs matter?
|
| Consider two functions, say copySurname and copyStreetName. They
| do the exact same thing, but in different contexts. No sane
| person would call copySurname for the street, even though the
| function is the same.
|
| So there is this tension: should the name of the function (or of
| the type) only reflect its internal structure (structuralism), or
| should it also reflect the intended domain application
| (nominalism)?
|
| There is a formalism school of mathematics that is pretty much
| the hardcore structuralist, i.e. names of the objects don't
| matter, only their relationships. But most mathematicians (and
| all programmers) reject that view.
| bheadmaster wrote:
| In real-world software writing, there's a huge pull towards
| nominalism (i.e. naming code parts by their domain purpose),
| but in my experience, the most useful way is to have structural
| names for internal mechanisms, then use their interfaces in a
| nominal way by describing their role in the business logic.
|
| That way, all nominalism is simply a descriptive wrapper over
| structuralism.
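|
| A small sketch of what I mean (all names hypothetical): the
| mechanism gets a structural name, and the business logic gives
| it a nominal role.

```python
# Structural mechanism: named for what it is, not what it's for.
class BoundedMap:
    def __init__(self, cap):
        self.cap, self.data = cap, {}

    def put(self, k, v):
        if len(self.data) >= self.cap:
            self.data.pop(next(iter(self.data)))  # evict oldest entry
        self.data[k] = v

    def get(self, k):
        return self.data.get(k)

# Nominal wrapper: the same mechanism, named for its domain role.
class SessionStore:
    def __init__(self):
        self._map = BoundedMap(cap=1000)

    def remember(self, session_id, user):
        self._map.put(session_id, user)

    def lookup(self, session_id):
        return self._map.get(session_id)

store = SessionStore()
store.remember("s1", "alice")
print(store.lookup("s1"))  # alice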
| js8 wrote:
| That's very similar to what the article is advocating.
| sevensor wrote:
| Although I have my frustrations with Python's bolted-on type
| checking, particularly how clunky the type annotations can be
| at runtime (this type is just the string "Foo | Bar | int",
| good luck!), I think the pragmatic approach is working out
| pretty well on the nominal versus structural front. I.e. I can
| make it an error to assign a number in radians to a variable of
| type meters (NewType), but I can also construct a dictionary of
| the right shape and have it satisfy a TypedDict annotation. Or
| I can write a Protocol, which is the biggest hammer in the type
| system.
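|
| The three tools side by side (hedged sketch; Meters, Point, and
| Sized are made-up names):

```python
from typing import NewType, Protocol, TypedDict

Meters = NewType("Meters", float)    # nominal: Radians wouldn't type-check
Radians = NewType("Radians", float)

class Point(TypedDict):              # structural: any dict of this shape fits
    x: int
    y: int

class Sized(Protocol):               # structural: anything with __len__ fits
    def __len__(self) -> int: ...

def describe(s: Sized) -> int:
    return len(s)

p: Point = {"x": 1, "y": 2}          # a plain dict of the right shape
print(describe([1, 2, 3]), Meters(5.0))  # 3 5.0
```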
| ojkelly wrote:
| The name should represent what the function does, it should
| indicate its purpose.
|
| The distinction is useful even when it's structurally identical
| to another function.
|
| Two identical functions in different contexts or domains often
| diverge. When they're not arbitrarily bound by the fact they
| have the same contents it's easy to extend one and not the
| other.
|
| During compilation, they could both end up using the same
| actual implementation (though I'm not sure if any compiler does
| this).
___________________________________________________________________
(page generated 2024-11-20 23:02 UTC)