[HN Gopher] Types versus sets in math and programming languages
       ___________________________________________________________________
        
       Types versus sets in math and programming languages
        
       Author : classified
       Score  : 94 points
       Date   : 2021-07-25 07:42 UTC (15 hours ago)
        
 (HTM) web link (byorgey.wordpress.com)
 (TXT) w3m dump (byorgey.wordpress.com)
        
       | tromp wrote:
       | I think calling the type of natural numbers N instead of Natural
       | is causing the confusion. Students would be less tempted to write
       | 
       | {1,2} [?] Natural
       | 
       | as the right side looks like a type name rather than a value .
       | Haskell helps to maintain the distinction by using capitals for
       | type names and lower case for variable names.
       | 
       | I would seem natural to write:                   naturals :: Set
       | Natural         naturals = {0,1..}         {1,2} [?] naturals
        
         | mjburgess wrote:
         | You're downvoted, but what you've said is basically the point-
         | of and conclusion-to the article.
         | 
         | `N` denotes a set and `Natural` denotes a type, and one cannot
         | be _eliminated_ to the other in all contexts of use.
        
       | amw-zero wrote:
       | It might help to consider the history of types and sets.
       | 
       | Sets (and consequently modern math and logic) have not been
       | around for as long as we might think. Maybe we had them in
       | different forms, but sets as we know them today were born in 1874
       | in this paper:
       | https://srjcstaff.santarosa.edu/~jomartin/IrratFiles/Cantors....
       | 
       | Around 1901, Bertrand Russell discovered a paradox in this
       | formulation of sets, so-called Russel's Paradox:
       | https://en.wikipedia.org/wiki/Russell's_paradox. There are then
       | many ways of dealing with this paradox. The most widely accepted
       | way is to just reformulate the axioms, which was done with
       | Zermelo-Frankel set theory:
       | https://en.wikipedia.org/wiki/Zermelo%E2%80%93Fraenkel_set_t....
       | When people use the word "mathematics" today, they generally
       | implicitly mean that ZFC is being used. It's the most widely
       | accepted mathematical foundation we have.
       | 
       | Another path was proposed by Russell himself, and that is the
       | notion of "type." By restricting the way in which certain
       | operations and their arguments can be considered "valid," his own
       | paradox can be avoided.
       | 
       | So from the beginning, sets are about _collecting_, and types are
       | about _restricting_.
        
         | ProfHewitt wrote:
         | Types are about rigorously constructing mathematical
         | abstractions
         | 
         | from primitive types defined up to a unique isomorphism.
         | 
         | In fact, types have become the primary foundation of Computer
         | Science.
         | 
         | Russell introduced the crucial concept of the order of a
         | 
         | proposition to block his famous paradox as follows:
         | 
         | Without orders on propositions, the paradoxical predicate
         | I'mNotSelfApplicable (such that                     I'mNotSelfA
         | pplicable[I'mNotSelfApplicable]=!I'mNotSelfApplicable[I'mNotSel
         | fApplicable])
         | 
         | could be constructed using the following recursive definition:
         | I'mNotSelfApplicable:Predicate<i>[?]!I'mNotSelfApplicable[I'mNo
         | tSelfApplicable]
         | 
         | Since I'mNotSelfApplicable is a predicate variable in the
         | definition,
         | !I'mNotSelfApplicable[I'mNotSelfApplicable]:Predicate<i+1>.
         | 
         | Consequently,                      I'mNotSelfApplicable:Predica
         | te<i>=I'mNotSelfApplicable:Predicate<i+1>
         | 
         | which is a contradiction meaning that I'mNotSelfApplicable does
         | not exist.
         | 
         | See video: https://www.youtube.com/watch?v=AJP1VL7shiI
        
         | ProfHewitt wrote:
         | Unfortunately, 1st-order ZFC has very severe limitations
         | 
         | because it does not precisely define sets.
         | 
         | However, ordinals that provide the backbone of sets have been
         | 
         | precisely defined up to a unique isomorphism using types.
         | 
         | See the following article:
         | 
         | "Theory Ordinals can Replace ZFC in Computer Science"
         | 
         | https://papers.ssrn.com/abstract=3457802
        
       | k__ wrote:
       | Half-OT: What's the reason rationals aren't used often in
       | programming?
        
         | dkarl wrote:
         | Few applications need them for correctness according to domain
         | requirements. Compare with fixed-precision decimal encodings,
         | which are mathematically less fundamental, but are common in
         | programming because many applications require monetary amounts
         | to be calculated using fixed-precision arithmetic.
         | 
         | It's true that floating point calculations are much faster, but
         | we do a lot of expensive things in code when we need to, and
         | people do use rational types when they need exact rational
         | arithmetic. It just doesn't happen very often.
        
         | mjburgess wrote:
         | You mean full-precision integer representations?
         | 
         | A percentage is effectively this -- and floats are often
         | multipliers on integers which are "holding the precision".
         | 
         | However it isn't clear what standards sorts of arithmetical use
         | cases storing two integers helps with (over, say, integer &
         | float, etc.).
        
           | k__ wrote:
           | I'm not sure either, that's why I asked. :D
        
             | mjburgess wrote:
             | I'm not an expert but, certainly, (1) limited memory on
             | historical machines necessitated a small-width
             | representation of a real number; (2) limited cpus meant
             | simple operations needed to be applied to this small-width
             | representation in one go.
             | 
             | Ie., your CPU is just a system for taking bit patterns and
             | producing new bit patterns: (0000 0000) -> (1010 0101). So
             | you want each individual number to be of a size the
             | processor can operate on in one pass.
             | 
             | By using two integers you gain some capacity (ie., you can
             | still overflow), but lose memory and compute performance.
        
         | marcosdumay wrote:
         | You obviously can't have the entire rationals, because your
         | computer's memory is finite. And if you are approximating
         | anyway, adding space to the mantissa of both the numerator and
         | denominator will approximate values about as well as adding
         | space only into one of them (the actual gain grows with
         | O(log(n)) of those sizes).
         | 
         | Thus, if you won't work with exact numbers, the best choice to
         | represent rationals is to place all bits you can at the
         | mantissa of a single side, and just store the order of
         | magnitude of the other side. And, for simplicity you want the
         | mantissa on the numerator side. What is basically a floating
         | point number.
         | 
         | Things are very different if you will work with exact numbers.
        
         | lioeters wrote:
         | I'm guessing the reason it's not often used in programming is
         | due to slower performance and larger memory requirements. I
         | typically see rationals (numerator/denominator pair) used in
         | arbitrary-precision libraries.
         | 
         | > ..Arithmetic with rational numbers can become unwieldy very
         | quickly: 1/99 - 1/100 = 1/9900, and if 1/101 is then added, the
         | result is 10001/999900.
         | 
         | https://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic
        
         | ekimekim wrote:
         | If you mean "representing values as a pair of integers P, Q,
         | such that the value = P/Q", the answer is that they're very
         | expensive to work with. You generally need to be using
         | arbitrary-length integers for P and Q, because the values get
         | very large very quickly. And as the values get large,
         | operations like finding the lowest common multiple (required
         | for almost any operation between two rationals) get expensive.
         | 
         | As an example of the kinds of runaway increase in P and Q I'm
         | talking about, consider calcuating the sum of x/(x+1) for
         | values of x from 1 to 999, using python's fractions module:
         | >>> sum(Fraction(x)/(x+1) for x in range(1000))         Fractio
         | n(7075502361382798268120793110090229940258706389605508780354900
         | 641905867818691614397718938054011886551220308728288024788414601
         | 821895503124242001568659978565050184167864774563757893457200995
         | 977901194872761682228965032590045181707202517783731540396927542
         | 180527680245723964874429994374505238893797078720320991788231820
         | 649513569449454767125309918934132672543174716802507966087781714
         | 014833406311821646176291230132112749836042537756095388477483, 7
         | 128865274665093053166384155714272920668358861885893040452001991
         | 154324087581111499476444151913871586911717817019575256512980264
         | 067621009251465871004305131072686268143200196609974862745937188
         | 343705015434452523739745298963145674982128236956232823794011068
         | 809262317708861979540791247754558049326475737829923352751796735
         | 248042463638051137034331214781746850878453485678021888075373249
         | 921995672056932029099390891687487672697950931603520000)
         | 
         | That's a 436-digit P and a 433-digit Q.
         | 
         | Rational types like this can definitely be a good choice in a
         | use-case where accuracy is important, and operations do not
         | compound. But they're a poor default, which is why we generally
         | see float types as the most common non-integer number
         | representation.
        
           | k__ wrote:
           | Makes sense.
           | 
           | Thanks!
        
         | hansvm wrote:
         | Rationals are typically slower (gcd operations) and prone to
         | memory explosion if the set of possible denominators isn't
         | extremely nice.
         | 
         | If approximate answers suffice then a scheme like floats or
         | doubles is usually good enough.
         | 
         | If approximate answers suffice and floats or doubles have
         | undesirable properties (weird density distributions, ranges
         | larger than what your application needs, ...) then it's easy to
         | hack something together with a few _bounded_ integer types
         | which has better properties for your specific needs.
         | 
         | In most other practical applications some kind of a fixed-
         | precision decimal suffices. It's a bit slower than floats and
         | might have undesirable memory characteristics, but broad
         | classes of real-world numbers which need to be represented
         | exactly (like financial data) are nicely represented.
         | 
         | Compared with those (and other) schemes, what advantages do
         | rationals have? Chiefly, they admit an exact representation of
         | any rational number. If you can mitigate the downsides (e.g.,
         | by not having many possible denominator prime factors) and need
         | those benefits then rationals might be a good fit.
         | 
         | One practical situation where I occasionally like to use
         | rationals is in account data -- credits, currencies, .... It
         | lends itself to forward compatibility when new features are
         | desired (most often that the smallest atomic unit in a currency
         | changes or that some niche system wants to offer transactions
         | for 1/3 of a unit). Such changes usually have small, nice,
         | compatible denominators, so you don't have unbounded memory
         | issues. More importantly, all such operations are exact by
         | definition without any extra work on my part (the biggest
         | challenges are in display and formatting if your business
         | admits those sorts of oddities).
        
       | klodolph wrote:
       | Coming from the mathematics side of things,
       | 
       | > In a math class, we typically tell students that N is a _set._
       | 
       | I think the approach is a bit glib. There are some caveats--a
       | student with more mathematical experience would think of N as a
       | set, or a type, or as objects in various categories.
       | 
       | It's also worth asking a student what 1[?]2 is. Most people will
       | look at that and just tilt their head, even though common
       | foundations for mathematics treat 1 and 2 as sets. It turns out
       | most people don't want to work in foundations, and want to think
       | of 1 and 2 as non-sets. The same applies in the other direction,
       | upwards, when you think of N as a type (even though it is also a
       | set).
        
         | vkazanov wrote:
         | My math is shallow at best but... 1[?]2 would become just 2
         | with integers represented as sets in certain foundational
         | theories, right?
        
           | Spivak wrote:
           | Yep! But notation depends a lot on the author to some extent.
           | You will find people who treat 1U2 as {1}U{2} because it's
           | less annoying to write.
           | 
           | It's the same reason languages have an .add method on sets so
           | you don't have to write.                   s =
           | s.union(set(s))
        
             | alisonkisk wrote:
             | who writes 1U2 instead of {1,2} ?
             | 
             | When would {1}U{2} ever appear in ant context except for
             | the first day of Intro to Set Theory?
        
             | [deleted]
        
           | hansvm wrote:
           | Yes. One common definition is equivalent to each natural to
           | being the set containing all preceding naturals. With that
           | view 2={0,1} and 1={0}, so their union is just 2.
        
         | lapinot wrote:
         | > It turns out most people don't want to work in foundations,
         | and want to think of 1 and 2 as non-sets
         | 
         | Which is to say most mathematicians prefer type theory to set
         | theory :). Perhaps we should stop saying "maths is usually done
         | in ZFC", or that "ZFC is mainstream math". For one math is
         | almost never formalized and second informal math is quite
         | strongly typed (the basic object usually not a set but a
         | structured set) where set theory is very untyped. At some point
         | people (erm mathematicians) have to acknowledge that we
         | computer sciency-math have brought far better foundations
         | (being intuitionistic type theory) insofar that it has set
         | theory as a subpart, but is much finer in what it can
         | distinguish. It's newtonian mechanics vs general relativity.
         | What set-foundationalists say is fine, but the frontier is
         | elsewhere.
        
           | Tainnor wrote:
           | > For one math is almost never formalized and second informal
           | math is quite strongly typed [...]
           | 
           | I think you're correct about the first part, but not about
           | the second.
           | 
           | It is incredibly common in mathematics to silently "convert"
           | objects between different types or to ignore certain things
           | that will trip you up once you try to encode things super
           | formally. For example, "the function x^2" (instead of
           | x->x^2), basically all the common notational shorthands in
           | differential equations, where functions are treated as
           | variables, silently identifying isomorphic objects and so on.
           | Plus, very few working mathematicians actually use
           | constructivism and are quite happy with the existence of, for
           | example, the whole set of real numbers.
        
           | creata wrote:
           | How is intuitionistic type theory a "better" foundation? As a
           | simple example, where can I find a statement of Brouwer's
           | fixpoint theorem in an intuitionistic type theory, and how is
           | it better stated than its classical counterpart?
        
       | specialist wrote:
       | > _Sets are characterized by the \in relation: we can ask which
       | items are elements of a set and which are not._
       | 
       | > _Types, on the other hand, are characterized by how elements of
       | the type are built: we can construct elements of a type (and
       | deconstruct them) in certain ways specific to the type._
       | 
       | Huh. So sets vs types are like folksonomies vs taxonomies?
       | 
       | Neat.
       | 
       | --
       | 
       | Tangent:
       | 
       | > _Disco > {1,2} [?] {1,2,3}_
       | 
       | Dumb question: How are you entering mathematically symbols?
       | 
       | My hobby language recognizes both "!=" and "[?]" as "not equal",
       | and I use a font with ligatures to render them the same.
        
       | Ericson2314 wrote:
       | I feel like the author should just write call Set FiniteSet, and
       | then a lot of confusion would go away!
       | 
       | https://golem.ph.utexas.edu/category/2021/06/large_sets_1.ht...
       | also, I highly recommend this series for anyone interested in
       | sets outside of set theory.
        
       | nolta wrote:
       | Here's a nice video with a little more detail on the difference
       | between intensional/extensional:
       | https://www.youtube.com/watch?v=bNG53SA4n48
        
       | raphlinus wrote:
       | From the title, I thought it was going to be an extension of this
       | discussion: https://mathoverflow.net/questions/376839/what-makes-
       | depende...
       | 
       | It's not, exactly, but some of the concepts do overlap. In
       | particular, the idea of being able to write things in a "natural"
       | mathematical style, then have them translate to an obsessively
       | precise internal representation, is essentially the problem of
       | "elaboration" as described in the top answer.
        
       | piinbinary wrote:
       | > So types are prior to sets: types provide a universe of values,
       | constructed in orderly ways, that we can work with; only then can
       | we start picking out certain values to place them in a set.
       | 
       | What about uninhabited types?
       | 
       | Or types where the constructor is not enough to determine the
       | entire type?
       | 
       | e.g.                   data MyType a b = MyConstructor { myField
       | :: a }
        
         | tel wrote:
         | What's the question?
         | 
         | For uninhabited types there's a universe (empty), ways to
         | construct (no ways), ways to work with (any way), and you can
         | (tautologically) pick out values and put them in sets.
         | 
         | Constructors aren't first order meaningful (except in higher-
         | kinded calculi) but for any concrete types `A` and `B`, `MyType
         | A B` is a type with a universe of values (isomorphic to `A`'s
         | universe), ways to construct and use (again, isomorphic to
         | `A`s), and we can place them in sets.
        
           | piinbinary wrote:
           | I think I see now. So the author wasn't saying that
           | construction of values is what defines the type. Rather it is
           | a necessary step before you can start talking about sets of
           | values. (And you can talk about a type without knowing how to
           | construct it, so that makes types different from sets?)
        
       | mjburgess wrote:
       | I think a clearer way of putting is this: types are not a
       | mathematical notion.
       | 
       | So "Bool" is not the same as "{On, Off}" nor "{0, 1}" nor "{Up,
       | Down}" etc.
       | 
       | We can _reduce_ a type to a set of all constructible values;
       | likewise we can reduce english to the  "big infinity" of all
       | possible sentences expressed in all possible universes. But
       | english _isnt_ this set.
       | 
       | If one had this set, E, then you could _reduce_ the question of
       | whether a conversation (C) is valid english to set-membership,
       | "is C in E?" but this misses whether the conversation is actually
       | _about_ anything.
       | 
       | Likewise "me" can occur in an infinite number of sets (Person,
       | RandomObjectsOnEarth, ...), but _that_ "me in Person" is the
       | "valid question" has nothing to do with "me in Person" -- and
       | everything to do with the semantics of me/Person. "Person" is
       | _about_ "me".
       | 
       | It is for this reason that values have "principal types" in
       | programming languages; ie., the "set" which happens to be _about_
       | the value.
       | 
       | Insofar as mathematics is employed to model reality we have to
       | introduce non-mathematical devices to constrain its application,
       | types may be understood as the formal mechanism to do this.
        
         | Ericson2314 wrote:
         | Types absolutely are a mathematical notion. Formal systems like
         | the calculus of construction and Martin Lof Type Theory are
         | well studdied.
        
           | mjburgess wrote:
           | That doesn't have much to do with my point.
           | 
           | There is no mathematical encoding of natural world semantics:
           | the sense in which "Person" and "MostIntelligentSpecies" are
           | _different_ cannot be expressed mathematically.
           | 
           | (p in Person) obtains as does (p in MostIntelligentSpecies)
           | 
           | but _Person_ cannot be eliminated for
           | _MostIntelligentSpecies_
           | 
           | The article itself makes this point, but then tries to
           | compromise with the mathematical reduction "to sets". It
           | would be clearer just to state that this is a reduction
           | rather than an elimination.
        
       | didibus wrote:
       | Isn't the confusion just that the Disco language uses the same
       | symbol for its type that math uses for a very commonly known set?
       | 
       | The symbol N commonly refers to a set, but on Disco it suddenly
       | refers to a type, it's not surprising that the confusion arises,
       | especially since Disco also includes the concept of sets.
       | 
       | > The set of natural numbers, denoted N
       | 
       | This is mostly how math refers to it, you can see both set and
       | type here, the set of type natural numbers is denoted N, thus
       | "natural number" is the type and "N" is the set.
       | 
       | The type describes the rules for what is included in the set (in
       | this case: all natural numbers), and the set is the list of all
       | things that are included (each of the natural numbers
       | themselves).                   The set of <type> is denoted
       | <notation>
        
         | alisonkisk wrote:
         | It gets even weirder when Brent (an expert in Haskell and math,
         | both of which feature infinite sets!) expresses concern that
         | he'll have to add infinite sets to his language, as though that
         | wasn't totally obvious from the start.
        
       | triska wrote:
       | This strikes me as a quite language-specific treatise, and may
       | indicate a shortcoming or particularity of the syntax or
       | semantics of the described programming language (i.e., Disco)
       | rather than an essential conceptual difference.
       | 
       | For comparison, in Prolog, a _type_ is indeed a _set_ of terms!
       | For example, the type  "integer" truly denotes the set of
       | integers, and if an argument has the type "integer", then we know
       | that the predicate can only hold if that argument is an element
       | of this type.
       | 
       | Quoting from the Prolog ISO standard:                   7.1 Types
       | The type of any term is determined by its abstract syntax
       | (6.1.2).              Every term has one of the following
       | mutually-exclusive         types: V (variables), I (integers), F
       | (floating point values),         A (atoms), CT (compound terms).
       | A term with type I, F, or A is an atomic term.
       | 
       | and further:                   7.1.2 Integer              An
       | integer is a member of a set I (see 6.1.2 c) where I         is a
       | subset of Z characterized by one or three parameters.
       | 
       | Prolog also has _type errors_ : _" There shall be a Type Error
       | when the type of an argument or one of its components is
       | incorrect, but not a variable."_ For example, in Scryer Prolog,
       | we get:                  ?- length(Ls, a).        caught:
       | error(type_error(integer,a),length/2)
       | 
       | This is because the second argument of length/2 must be an
       | element of the set _I_ of _integers_. Since the atom  'a' is not
       | an integer, the relation does not hold in this case.
       | Declaratively, a _type error_ is equivalent to _silent failure_ :
       | It means that the relation cannot hold for this type of argument,
       | and so every specialization of the query will also not hold,
       | whereas a generalization or an instantiation with a term of a
       | different type may succeed.
       | 
       | This view readily carries over to logic variables. For example,
       | using library(clpz), we get:                   ?- X #> 3.
       | clpz:(X in 4..sup).
       | 
       | And we get a _type error_ if we try to instantiate the thus
       | constrained variable X to a term that is not an element of Z:
       | ?- X #> 3, X = a.         caught:
       | error(type_error(integer,a),unknown(a)-1)
       | 
       | So, indeed, _types_ denote _sets_ of terms that are acceptable as
       | arguments.
       | 
       | The example in the article, i.e., N, could be regarded as an
       | element of the set of _atoms_ in Prolog, and therefore have the
       | _type_ atom. In indeed, for example Trealla Prolog answers:
       | ?- atom(N).         true.
        
         | tel wrote:
         | This is an old yarn distinguishing sets and types, extrinsic
         | and intrinsic, "run time" and "compile time".
         | 
         | The idea of a runtime procedure `?- atom(_)` which determines
         | `true.` or `false.` is fundamentally an extrinsic property,
         | thus "set like".
         | 
         | Intrinsically defined objects are defined through (possibly
         | inductive) construction or (possibly coinductive) destruction
         | rules.
         | 
         | Typically a notion of set arises from having some larger
         | universe of things and restricting them via predicates. The
         | notion of types arises from having a system of building
         | (co)inductive definitions and then utilizing that system. Sets
         | are thus, in a sense, fundamentally based on a closed world
         | (said universe) and types an open world (each new utilization
         | of the rules system introduces a novel type).
        
         | kmill wrote:
         | The word "type" is very overloaded. In the language of type
         | theory, Prolog (and many other languages, usually dynamic)
         | consider a universal type of all terms, and then a "type" is a
         | set on the universal type. This is a fine way to go (though you
         | can get contradictions in the form of infinite loops in
         | typechecking if "types" are terms themselves).
         | 
         | In type theory, at least for the ones many theoreticians seem
         | to be interested in, the has_type/2 predicate (using Prolog
         | terminology) is a function, which is to say has_type(x, y) has
         | a single value for y for any given term x. Things are set up so
         | that the way in which x is represented (i.e., constructed)
         | determines y. This is useful computationally because if you
         | know something's type, you then know how to deconstruct it into
         | components. For example, types drive the expected cases of a
         | match expression.
         | 
         | My main point with this comment is to bring up the fact that
         | the article is talking about type theories as academics who
         | study functional programming languages see them, so in context
         | it's not specific to Disco.
         | 
         | (I should mention that has_type is only a function given a
         | symbolic representation of a construction for x. Theoretically,
         | things of different types can have the same internal
         | representation, like with Java generics or C structs, but I
         | think you can always arrange things so that has_type is a
         | function by tagging values with their type. Even in more
         | delicate dependent type systems.)
         | 
         | (I guess I should mention that my understanding of types after
         | some years of confusion is that in the end they essentially
         | provide an API for values. The program code sort of constitutes
         | a proof that a value has a specific type, and then you can
         | interact with it as such. It's explicitly not part of the API
         | for a type to be able to tell whether something has that type
         | -- in many type theories, values can have multiple types! Sets
         | on the other hand are a type where part of their API is that
         | values can determine whether or not a value is an element. It's
         | certainly possible to fuse the concepts so that types are
         | required to be sets, but I think this is avoided because
         | there's not always an algorithm for the set interface.)
        
         | elliekelly wrote:
         | Only tangentially related, every once in a while I look for a
         | book, video, or even course that concisely explains the
         | "language" of mathematical notation. As someone who is largely
         | self-taught when it comes to the more advanced math it's
         | frustrating that so many resources assume a basic level of
         | notational knowledge/understanding so they just skip right over
         | it and you're left piecing it together using context clues and
         | LaTex crib sheets.
        
       | kuprel wrote:
       | I don't think you need the distinction for the natural numbers.
       | Int32 is a type. The natural numbers are the set of positive
       | integers that don't overflow the int type used to express them.
        
         | Ericson2314 wrote:
         | This language is surely going to use bignums to avoid all that
         | nonsense.
        
       ___________________________________________________________________
       (page generated 2021-07-25 23:02 UTC)