[HN Gopher] Types versus sets in math and programming languages
___________________________________________________________________
Types versus sets in math and programming languages
Author : classified
Score : 94 points
Date : 2021-07-25 07:42 UTC (15 hours ago)
(HTM) web link (byorgey.wordpress.com)
(TXT) w3m dump (byorgey.wordpress.com)
| tromp wrote:
| I think calling the type of natural numbers N instead of Natural
| is causing the confusion. Students would be less tempted to write
|
| {1,2} [?] Natural
|
| as the right side looks like a type name rather than a value .
| Haskell helps to maintain the distinction by using capitals for
| type names and lower case for variable names.
|
| I would seem natural to write: naturals :: Set
| Natural naturals = {0,1..} {1,2} [?] naturals
| mjburgess wrote:
| You're downvoted, but what you've said is basically the point-
| of and conclusion-to the article.
|
| `N` denotes a set and `Natural` denotes a type, and one cannot
| be _eliminated_ to the other in all contexts of use.
| amw-zero wrote:
| It might help to consider the history of types and sets.
|
| Sets (and consequently modern math and logic) have not been
| around for as long as we might think. Maybe we had them in
| different forms, but sets as we know them today were born in 1874
| in this paper:
| https://srjcstaff.santarosa.edu/~jomartin/IrratFiles/Cantors....
|
| Around 1901, Bertrand Russell discovered a paradox in this
| formulation of sets, so-called Russel's Paradox:
| https://en.wikipedia.org/wiki/Russell's_paradox. There are then
| many ways of dealing with this paradox. The most widely accepted
| way is to just reformulate the axioms, which was done with
| Zermelo-Frankel set theory:
| https://en.wikipedia.org/wiki/Zermelo%E2%80%93Fraenkel_set_t....
| When people use the word "mathematics" today, they generally
| implicitly mean that ZFC is being used. It's the most widely
| accepted mathematical foundation we have.
|
| Another path was proposed by Russell himself, and that is the
| notion of "type." By restricting the way in which certain
| operations and their arguments can be considered "valid," his own
| paradox can be avoided.
|
| So from the beginning, sets are about _collecting_, and types are
| about _restricting_.
| ProfHewitt wrote:
| Types are about rigorously constructing mathematical
| abstractions
|
| from primitive types defined up to a unique isomorphism.
|
| In fact, types have become the primary foundation of Computer
| Science.
|
| Russell introduced the crucial concept of the order of a
|
| proposition to block his famous paradox as follows:
|
| Without orders on propositions, the paradoxical predicate
| I'mNotSelfApplicable (such that I'mNotSelfA
| pplicable[I'mNotSelfApplicable]=!I'mNotSelfApplicable[I'mNotSel
| fApplicable])
|
| could be constructed using the following recursive definition:
| I'mNotSelfApplicable:Predicate<i>[?]!I'mNotSelfApplicable[I'mNo
| tSelfApplicable]
|
| Since I'mNotSelfApplicable is a predicate variable in the
| definition,
| !I'mNotSelfApplicable[I'mNotSelfApplicable]:Predicate<i+1>.
|
| Consequently, I'mNotSelfApplicable:Predica
| te<i>=I'mNotSelfApplicable:Predicate<i+1>
|
| which is a contradiction meaning that I'mNotSelfApplicable does
| not exist.
|
| See video: https://www.youtube.com/watch?v=AJP1VL7shiI
| ProfHewitt wrote:
| Unfortunately, 1st-order ZFC has very severe limitations
|
| because it does not precisely define sets.
|
| However, ordinals that provide the backbone of sets have been
|
| precisely defined up to a unique isomorphism using types.
|
| See the following article:
|
| "Theory Ordinals can Replace ZFC in Computer Science"
|
| https://papers.ssrn.com/abstract=3457802
| k__ wrote:
| Half-OT: What's the reason rationals aren't used often in
| programming?
| dkarl wrote:
| Few applications need them for correctness according to domain
| requirements. Compare with fixed-precision decimal encodings,
| which are mathematically less fundamental, but are common in
| programming because many applications require monetary amounts
| to be calculated using fixed-precision arithmetic.
|
| It's true that floating point calculations are much faster, but
| we do a lot of expensive things in code when we need to, and
| people do use rational types when they need exact rational
| arithmetic. It just doesn't happen very often.
| mjburgess wrote:
| You mean full-precision integer representations?
|
| A percentage is effectively this -- and floats are often
| multipliers on integers which are "holding the precision".
|
| However it isn't clear what standards sorts of arithmetical use
| cases storing two integers helps with (over, say, integer &
| float, etc.).
| k__ wrote:
| I'm not sure either, that's why I asked. :D
| mjburgess wrote:
| I'm not an expert but, certainly, (1) limited memory on
| historical machines necessitated a small-width
| representation of a real number; (2) limited cpus meant
| simple operations needed to be applied to this small-width
| representation in one go.
|
| Ie., your CPU is just a system for taking bit patterns and
| producing new bit patterns: (0000 0000) -> (1010 0101). So
| you want each individual number to be of a size the
| processor can operate on in one pass.
|
| By using two integers you gain some capacity (ie., you can
| still overflow), but lose memory and compute performance.
| marcosdumay wrote:
| You obviously can't have the entire rationals, because your
| computer's memory is finite. And if you are approximating
| anyway, adding space to the mantissa of both the numerator and
| denominator will approximate values about as well as adding
| space only into one of them (the actual gain grows with
| O(log(n)) of those sizes).
|
| Thus, if you won't work with exact numbers, the best choice to
| represent rationals is to place all bits you can at the
| mantissa of a single side, and just store the order of
| magnitude of the other side. And, for simplicity you want the
| mantissa on the numerator side. What is basically a floating
| point number.
|
| Things are very different if you will work with exact numbers.
| lioeters wrote:
| I'm guessing the reason it's not often used in programming is
| due to slower performance and larger memory requirements. I
| typically see rationals (numerator/denominator pair) used in
| arbitrary-precision libraries.
|
| > ..Arithmetic with rational numbers can become unwieldy very
| quickly: 1/99 - 1/100 = 1/9900, and if 1/101 is then added, the
| result is 10001/999900.
|
| https://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic
| ekimekim wrote:
| If you mean "representing values as a pair of integers P, Q,
| such that the value = P/Q", the answer is that they're very
| expensive to work with. You generally need to be using
| arbitrary-length integers for P and Q, because the values get
| very large very quickly. And as the values get large,
| operations like finding the lowest common multiple (required
| for almost any operation between two rationals) get expensive.
|
| As an example of the kinds of runaway increase in P and Q I'm
| talking about, consider calcuating the sum of x/(x+1) for
| values of x from 1 to 999, using python's fractions module:
| >>> sum(Fraction(x)/(x+1) for x in range(1000)) Fractio
| n(7075502361382798268120793110090229940258706389605508780354900
| 641905867818691614397718938054011886551220308728288024788414601
| 821895503124242001568659978565050184167864774563757893457200995
| 977901194872761682228965032590045181707202517783731540396927542
| 180527680245723964874429994374505238893797078720320991788231820
| 649513569449454767125309918934132672543174716802507966087781714
| 014833406311821646176291230132112749836042537756095388477483, 7
| 128865274665093053166384155714272920668358861885893040452001991
| 154324087581111499476444151913871586911717817019575256512980264
| 067621009251465871004305131072686268143200196609974862745937188
| 343705015434452523739745298963145674982128236956232823794011068
| 809262317708861979540791247754558049326475737829923352751796735
| 248042463638051137034331214781746850878453485678021888075373249
| 921995672056932029099390891687487672697950931603520000)
|
| That's a 436-digit P and a 433-digit Q.
|
| Rational types like this can definitely be a good choice in a
| use-case where accuracy is important, and operations do not
| compound. But they're a poor default, which is why we generally
| see float types as the most common non-integer number
| representation.
| k__ wrote:
| Makes sense.
|
| Thanks!
| hansvm wrote:
| Rationals are typically slower (gcd operations) and prone to
| memory explosion if the set of possible denominators isn't
| extremely nice.
|
| If approximate answers suffice then a scheme like floats or
| doubles is usually good enough.
|
| If approximate answers suffice and floats or doubles have
| undesirable properties (weird density distributions, ranges
| larger than what your application needs, ...) then it's easy to
| hack something together with a few _bounded_ integer types
| which has better properties for your specific needs.
|
| In most other practical applications some kind of a fixed-
| precision decimal suffices. It's a bit slower than floats and
| might have undesirable memory characteristics, but broad
| classes of real-world numbers which need to be represented
| exactly (like financial data) are nicely represented.
|
| Compared with those (and other) schemes, what advantages do
| rationals have? Chiefly, they admit an exact representation of
| any rational number. If you can mitigate the downsides (e.g.,
| by not having many possible denominator prime factors) and need
| those benefits then rationals might be a good fit.
|
| One practical situation where I occasionally like to use
| rationals is in account data -- credits, currencies, .... It
| lends itself to forward compatibility when new features are
| desired (most often that the smallest atomic unit in a currency
| changes or that some niche system wants to offer transactions
| for 1/3 of a unit). Such changes usually have small, nice,
| compatible denominators, so you don't have unbounded memory
| issues. More importantly, all such operations are exact by
| definition without any extra work on my part (the biggest
| challenges are in display and formatting if your business
| admits those sorts of oddities).
| klodolph wrote:
| Coming from the mathematics side of things,
|
| > In a math class, we typically tell students that N is a _set._
|
| I think the approach is a bit glib. There are some caveats--a
| student with more mathematical experience would think of N as a
| set, or a type, or as objects in various categories.
|
| It's also worth asking a student what 1[?]2 is. Most people will
| look at that and just tilt their head, even though common
| foundations for mathematics treat 1 and 2 as sets. It turns out
| most people don't want to work in foundations, and want to think
| of 1 and 2 as non-sets. The same applies in the other direction,
| upwards, when you think of N as a type (even though it is also a
| set).
| vkazanov wrote:
| My math is shallow at best but... 1[?]2 would become just 2
| with integers represented as sets in certain foundational
| theories, right?
| Spivak wrote:
| Yep! But notation depends a lot on the author to some extent.
| You will find people who treat 1U2 as {1}U{2} because it's
| less annoying to write.
|
| It's the same reason languages have an .add method on sets so
| you don't have to write. s =
| s.union(set(s))
| alisonkisk wrote:
| who writes 1U2 instead of {1,2} ?
|
| When would {1}U{2} ever appear in ant context except for
| the first day of Intro to Set Theory?
| [deleted]
| hansvm wrote:
| Yes. One common definition is equivalent to each natural to
| being the set containing all preceding naturals. With that
| view 2={0,1} and 1={0}, so their union is just 2.
| lapinot wrote:
| > It turns out most people don't want to work in foundations,
| and want to think of 1 and 2 as non-sets
|
| Which is to say most mathematicians prefer type theory to set
| theory :). Perhaps we should stop saying "maths is usually done
| in ZFC", or that "ZFC is mainstream math". For one math is
| almost never formalized and second informal math is quite
| strongly typed (the basic object usually not a set but a
| structured set) where set theory is very untyped. At some point
| people (erm mathematicians) have to acknowledge that we
| computer sciency-math have brought far better foundations
| (being intuitionistic type theory) insofar that it has set
| theory as a subpart, but is much finer in what it can
| distinguish. It's newtonian mechanics vs general relativity.
| What set-foundationalists say is fine, but the frontier is
| elsewhere.
| Tainnor wrote:
| > For one math is almost never formalized and second informal
| math is quite strongly typed [...]
|
| I think you're correct about the first part, but not about
| the second.
|
| It is incredibly common in mathematics to silently "convert"
| objects between different types or to ignore certain things
| that will trip you up once you try to encode things super
| formally. For example, "the function x^2" (instead of
| x->x^2), basically all the common notational shorthands in
| differential equations, where functions are treated as
| variables, silently identifying isomorphic objects and so on.
| Plus, very few working mathematicians actually use
| constructivism and are quite happy with the existence of, for
| example, the whole set of real numbers.
| creata wrote:
| How is intuitionistic type theory a "better" foundation? As a
| simple example, where can I find a statement of Brouwer's
| fixpoint theorem in an intuitionistic type theory, and how is
| it better stated than its classical counterpart?
| specialist wrote:
| > _Sets are characterized by the \in relation: we can ask which
| items are elements of a set and which are not._
|
| > _Types, on the other hand, are characterized by how elements of
| the type are built: we can construct elements of a type (and
| deconstruct them) in certain ways specific to the type._
|
| Huh. So sets vs types are like folksonomies vs taxonomies?
|
| Neat.
|
| --
|
| Tangent:
|
| > _Disco > {1,2} [?] {1,2,3}_
|
| Dumb question: How are you entering mathematically symbols?
|
| My hobby language recognizes both "!=" and "[?]" as "not equal",
| and I use a font with ligatures to render them the same.
| Ericson2314 wrote:
| I feel like the author should just write call Set FiniteSet, and
| then a lot of confusion would go away!
|
| https://golem.ph.utexas.edu/category/2021/06/large_sets_1.ht...
| also, I highly recommend this series for anyone interested in
| sets outside of set theory.
| nolta wrote:
| Here's a nice video with a little more detail on the difference
| between intensional/extensional:
| https://www.youtube.com/watch?v=bNG53SA4n48
| raphlinus wrote:
| From the title, I thought it was going to be an extension of this
| discussion: https://mathoverflow.net/questions/376839/what-makes-
| depende...
|
| It's not, exactly, but some of the concepts do overlap. In
| particular, the idea of being able to write things in a "natural"
| mathematical style, then have them translate to an obsessively
| precise internal representation, is essentially the problem of
| "elaboration" as described in the top answer.
| piinbinary wrote:
| > So types are prior to sets: types provide a universe of values,
| constructed in orderly ways, that we can work with; only then can
| we start picking out certain values to place them in a set.
|
| What about uninhabited types?
|
| Or types where the constructor is not enough to determine the
| entire type?
|
| e.g. data MyType a b = MyConstructor { myField
| :: a }
| tel wrote:
| What's the question?
|
| For uninhabited types there's a universe (empty), ways to
| construct (no ways), ways to work with (any way), and you can
| (tautologically) pick out values and put them in sets.
|
| Constructors aren't first order meaningful (except in higher-
| kinded calculi) but for any concrete types `A` and `B`, `MyType
| A B` is a type with a universe of values (isomorphic to `A`'s
| universe), ways to construct and use (again, isomorphic to
| `A`s), and we can place them in sets.
| piinbinary wrote:
| I think I see now. So the author wasn't saying that
| construction of values is what defines the type. Rather it is
| a necessary step before you can start talking about sets of
| values. (And you can talk about a type without knowing how to
| construct it, so that makes types different from sets?)
| mjburgess wrote:
| I think a clearer way of putting is this: types are not a
| mathematical notion.
|
| So "Bool" is not the same as "{On, Off}" nor "{0, 1}" nor "{Up,
| Down}" etc.
|
| We can _reduce_ a type to a set of all constructible values;
| likewise we can reduce english to the "big infinity" of all
| possible sentences expressed in all possible universes. But
| english _isnt_ this set.
|
| If one had this set, E, then you could _reduce_ the question of
| whether a conversation (C) is valid english to set-membership,
| "is C in E?" but this misses whether the conversation is actually
| _about_ anything.
|
| Likewise "me" can occur in an infinite number of sets (Person,
| RandomObjectsOnEarth, ...), but _that_ "me in Person" is the
| "valid question" has nothing to do with "me in Person" -- and
| everything to do with the semantics of me/Person. "Person" is
| _about_ "me".
|
| It is for this reason that values have "principal types" in
| programming languages; ie., the "set" which happens to be _about_
| the value.
|
| Insofar as mathematics is employed to model reality we have to
| introduce non-mathematical devices to constrain its application,
| types may be understood as the formal mechanism to do this.
| Ericson2314 wrote:
| Types absolutely are a mathematical notion. Formal systems like
| the calculus of construction and Martin Lof Type Theory are
| well studdied.
| mjburgess wrote:
| That doesn't have much to do with my point.
|
| There is no mathematical encoding of natural world semantics:
| the sense in which "Person" and "MostIntelligentSpecies" are
| _different_ cannot be expressed mathematically.
|
| (p in Person) obtains as does (p in MostIntelligentSpecies)
|
| but _Person_ cannot be eliminated for
| _MostIntelligentSpecies_
|
| The article itself makes this point, but then tries to
| compromise with the mathematical reduction "to sets". It
| would be clearer just to state that this is a reduction
| rather than an elimination.
| didibus wrote:
| Isn't the confusion just that the Disco language uses the same
| symbol for its type that math uses for a very commonly known set?
|
| The symbol N commonly refers to a set, but on Disco it suddenly
| refers to a type, it's not surprising that the confusion arises,
| especially since Disco also includes the concept of sets.
|
| > The set of natural numbers, denoted N
|
| This is mostly how math refers to it, you can see both set and
| type here, the set of type natural numbers is denoted N, thus
| "natural number" is the type and "N" is the set.
|
| The type describes the rules for what is included in the set (in
| this case: all natural numbers), and the set is the list of all
| things that are included (each of the natural numbers
| themselves). The set of <type> is denoted
| <notation>
| alisonkisk wrote:
| It gets even weirder when Brent (an expert in Haskell and math,
| both of which feature infinite sets!) expresses concern that
| he'll have to add infinite sets to his language, as though that
| wasn't totally obvious from the start.
| triska wrote:
| This strikes me as a quite language-specific treatise, and may
| indicate a shortcoming or particularity of the syntax or
| semantics of the described programming language (i.e., Disco)
| rather than an essential conceptual difference.
|
| For comparison, in Prolog, a _type_ is indeed a _set_ of terms!
| For example, the type "integer" truly denotes the set of
| integers, and if an argument has the type "integer", then we know
| that the predicate can only hold if that argument is an element
| of this type.
|
| Quoting from the Prolog ISO standard: 7.1 Types
| The type of any term is determined by its abstract syntax
| (6.1.2). Every term has one of the following
| mutually-exclusive types: V (variables), I (integers), F
| (floating point values), A (atoms), CT (compound terms).
| A term with type I, F, or A is an atomic term.
|
| and further: 7.1.2 Integer An
| integer is a member of a set I (see 6.1.2 c) where I is a
| subset of Z characterized by one or three parameters.
|
| Prolog also has _type errors_ : _" There shall be a Type Error
| when the type of an argument or one of its components is
| incorrect, but not a variable."_ For example, in Scryer Prolog,
| we get: ?- length(Ls, a). caught:
| error(type_error(integer,a),length/2)
|
| This is because the second argument of length/2 must be an
| element of the set _I_ of _integers_. Since the atom 'a' is not
| an integer, the relation does not hold in this case.
| Declaratively, a _type error_ is equivalent to _silent failure_ :
| It means that the relation cannot hold for this type of argument,
| and so every specialization of the query will also not hold,
| whereas a generalization or an instantiation with a term of a
| different type may succeed.
|
| This view readily carries over to logic variables. For example,
| using library(clpz), we get: ?- X #> 3.
| clpz:(X in 4..sup).
|
| And we get a _type error_ if we try to instantiate the thus
| constrained variable X to a term that is not an element of Z:
| ?- X #> 3, X = a. caught:
| error(type_error(integer,a),unknown(a)-1)
|
| So, indeed, _types_ denote _sets_ of terms that are acceptable as
| arguments.
|
| The example in the article, i.e., N, could be regarded as an
| element of the set of _atoms_ in Prolog, and therefore have the
| _type_ atom. In indeed, for example Trealla Prolog answers:
| ?- atom(N). true.
| tel wrote:
| This is an old yarn distinguishing sets and types, extrinsic
| and intrinsic, "run time" and "compile time".
|
| The idea of a runtime procedure `?- atom(_)` which determines
| `true.` or `false.` is fundamentally an extrinsic property,
| thus "set like".
|
| Intrinsically defined objects are defined through (possibly
| inductive) construction or (possibly coinductive) destruction
| rules.
|
| Typically a notion of set arises from having some larger
| universe of things and restricting them via predicates. The
| notion of types arises from having a system of building
| (co)inductive definitions and then utilizing that system. Sets
| are thus, in a sense, fundamentally based on a closed world
| (said universe) and types an open world (each new utilization
| of the rules system introduces a novel type).
| kmill wrote:
| The word "type" is very overloaded. In the language of type
| theory, Prolog (and many other languages, usually dynamic)
| consider a universal type of all terms, and then a "type" is a
| set on the universal type. This is a fine way to go (though you
| can get contradictions in the form of infinite loops in
| typechecking if "types" are terms themselves).
|
| In type theory, at least for the ones many theoreticians seem
| to be interested in, the has_type/2 predicate (using Prolog
| terminology) is a function, which is to say has_type(x, y) has
| a single value for y for any given term x. Things are set up so
| that the way in which x is represented (i.e., constructed)
| determines y. This is useful computationally because if you
| know something's type, you then know how to deconstruct it into
| components. For example, types drive the expected cases of a
| match expression.
|
| My main point with this comment is to bring up the fact that
| the article is talking about type theories as academics who
| study functional programming languages see them, so in context
| it's not specific to Disco.
|
| (I should mention that has_type is only a function given a
| symbolic representation of a construction for x. Theoretically,
| things of different types can have the same internal
| representation, like with Java generics or C structs, but I
| think you can always arrange things so that has_type is a
| function by tagging values with their type. Even in more
| delicate dependent type systems.)
|
| (I guess I should mention that my understanding of types after
| some years of confusion is that in the end they essentially
| provide an API for values. The program code sort of constitutes
| a proof that a value has a specific type, and then you can
| interact with it as such. It's explicitly not part of the API
| for a type to be able to tell whether something has that type
| -- in many type theories, values can have multiple types! Sets
| on the other hand are a type where part of their API is that
| values can determine whether or not a value is an element. It's
| certainly possible to fuse the concepts so that types are
| required to be sets, but I think this is avoided because
| there's not always an algorithm for the set interface.)
| elliekelly wrote:
| Only tangentially related, every once in a while I look for a
| book, video, or even course that concisely explains the
| "language" of mathematical notation. As someone who is largely
| self-taught when it comes to the more advanced math it's
| frustrating that so many resources assume a basic level of
| notational knowledge/understanding so they just skip right over
| it and you're left piecing it together using context clues and
| LaTex crib sheets.
| kuprel wrote:
| I don't think you need the distinction for the natural numbers.
| Int32 is a type. The natural numbers are the set of positive
| integers that don't overflow the int type used to express them.
| Ericson2314 wrote:
| This language is surely going to use bignums to avoid all that
| nonsense.
___________________________________________________________________
(page generated 2021-07-25 23:02 UTC)