[HN Gopher] Parse, Don't Validate (2019)
___________________________________________________________________
Parse, Don't Validate (2019)
Author : melse
Score : 345 points
Date : 2021-06-26 07:23 UTC (15 hours ago)
(HTM) web link (lexi-lambda.github.io)
(TXT) w3m dump (lexi-lambda.github.io)
| TheAceOfHearts wrote:
| Related discussion from last month which links to this article in
| the repo description:
| https://news.ycombinator.com/item?id=27166162
| Vosporos wrote:
| Wonderful piece, it has really opened my eyes to foreign data
| ingestion.
| flqn wrote:
| What would you rather have: an object of value potentially
| outside your domain and an expensive boolean function saying if
| it's ok that you need to apply everywhere just to be sure, or a
| method of producing values that you know are always within your
| domain which you have to apply just once and no expensive boolean
| function?
| wodenokoto wrote:
| When I think of validation I think of receiving a data file and
| checking that all rows and columns are correct and generating a
| report about _all_ the problems.
|
| Does my thing have a different name? Where can I read up on how
| to do that best?
| geofft wrote:
| I think that is in fact validating in the sense that the
| article means it.
|
| Here's validating a CSV in Python (which I'm using because it's
| a language that's, well, less excited about types than the
| author's choice of Haskell, to show that the principle still
| applies):
|
|     def validate_data(filename):
|         errors = []
|         reader = csv.DictReader(open(filename))
|         for row in reader:
|             try:
|                 date = datetime.datetime.fromisoformat(row["date"])
|             except ValueError:
|                 errors.append(("ERROR: Invalid date", row))
|                 continue
|             if date < datetime.datetime(2021, 1, 1):
|                 errors.append(("ERROR: Last year's data", row))
|             # etc.
|         return errors
|
|     def actually_work_with_data(filename):
|         reader = csv.DictReader(open(filename))
|         for row in reader:
|             try:
|                 date = datetime.datetime.fromisoformat(row["date"])
|             except ValueError:
|                 raise Exception("Wait, didn't you validate this already???")
|             # etc.
|
| Yes, it's a kind of silly example, but - the validation routine
| is already doing the work of getting the data into the form you
| want, and now you have some DRY problems. What happens if you
| start accepting additional time formats in validate_data but
| you forget to teach actually_work_with_data to do the same
| thing?
|
| The insight is that the work of reporting errors in the data is
| exactly the same as the work of getting non-erroneous data into
| a usable form. If a row of data doesn't have an error, that
| means it's usable; if you can't turn it into a directly usable
| format, that necessarily means it has some sort of error.
|
| So what you want is a function that takes the data and does
| both of these at the same time, because it's actually just a
| single task.
|
| In a language like Haskell or Rust, there's a built-in type for
| "either a result or an error", and the convention is to pass
| errors back as data. In a language like Python, there isn't a
| similar concept and the convention is to pass errors as
| exceptions. Since you want to accumulate all the errors, I'd
| probably just put them into a separate list:
|     @attr.s  # or @dataclasses.dataclass, whichever
|     class Order:
|         name: str
|         date: datetime.datetime
|         ...
|
|     def parse(filename):
|         data = []
|         errors = []
|         reader = csv.DictReader(open(filename))
|         for row in reader:
|             try:
|                 date = datetime.datetime.fromisoformat(row["date"])
|             except ValueError:
|                 errors.append(("Invalid date", row))
|                 continue
|             if date < datetime.datetime(2021, 1, 1):
|                 errors.append(("Last year's data", row))
|                 continue
|             # etc.
|             data.append(Order(name=row["name"], date=date, ...))
|         return data, errors
|
| And then all the logic of working with the data, whether to
| actually use it or to report errors, is in one place. Both your
| report of bad data and your actually_work_with_data function
| call the same routine. Your actual code doesn't have to parse
| fields in the CSV itself; that's already been done by what used
| to be the validation code. It gets a list of Order objects, and
| unlike a dictionary from DictReader, you know that an Order
| object is usable without further checks. (The author talks
| about "Use a data structure that makes illegal states
| unrepresentable" - this isn't quite doable in Python where you
| can generally put whatever you want in an object, but if you
| follow the discipline that only the parse() function generates
| new Order objects, then it's effectively true in practice.)
|
| And if your file format changes, you make the change in one
| spot; you've kept the code DRY.
| rmnclmnt wrote:
| You can use the now widely adopted Great Expectations[0] library,
| which fits exactly this use-case for data validation!
|
| [0] https://greatexpectations.io
| wodenokoto wrote:
| Thanks for the link. It looks really nice.
|
| I see they've raised a lot of money. Does anyone know what
| their revenue model is?
| quickthrower2 wrote:
| I thought of input validation for web forms. Similar thing I
| guess. In Haskell you can create a type that you know is a
| validated email address but you still need a validation
| function from String -> Maybe Email to actually validate it at
| runtime
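That `String -> Maybe Email` shape translates directly to Python; a minimal sketch (the `Email` wrapper, `parse_email`, and the regex are illustrative, not from the article):

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Email:
    # By convention, only parse_email() constructs this type, so
    # holding an Email means the runtime check already ran.
    value: str


# Deliberately rough pattern; real-world email validation is messier.
_EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def parse_email(raw: str) -> Optional[Email]:
    # The Python analogue of String -> Maybe Email.
    return Email(raw) if _EMAIL_RE.fullmatch(raw) else None
```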
| jacoblambda wrote:
| That's just a parser though. As described in the post,
| parsers sometimes can fail but importantly they always pass
| along the result if they succeed. Validation functions on the
| other hand only validate that said data is valid.
|
| The argument is that if you need to interact with or operate
| on some data you shouldn't be designing functions to validate
| the data but rather to render it into a useful output with
| well defined behaviour.
| WJW wrote:
| I think for the usecase GP gives it'd be even better to have
| a function `String -> Either (LineNumber,String,[Problem])
| Email`, so that you can report back which of the lines had
| problems and what kind of problems. For web form validation
| you can skip the line number but it'd still be useful to keep
| the list of problems, so that you can report back to the user
| what about their input did not conform to expectations.
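In Python, that `Either` shape might look like returning the parsed values alongside a list of `(line_number, raw, problems)` tuples; a hypothetical sketch (`parse_emails` and its two checks are illustrative stand-ins for real problems):

```python
from typing import List, Tuple


def parse_emails(lines: List[str]) -> Tuple[List[str], List[Tuple[int, str, List[str]]]]:
    # Accumulate every problem instead of stopping at the first one,
    # so the caller can report them all back to the user.
    emails: List[str] = []
    failures: List[Tuple[int, str, List[str]]] = []
    for lineno, raw in enumerate(lines, start=1):
        problems = []
        if "@" not in raw:
            problems.append("missing '@'")
        if " " in raw:
            problems.append("contains whitespace")
        if problems:
            failures.append((lineno, raw, problems))
        else:
            emails.append(raw)
    return emails, failures
```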
| foota wrote:
| Data validation?
| gbrown_ wrote:
| Previous discussion https://news.ycombinator.com/item?id=21476261
| pansa2 wrote:
| "Parse, don't [just] validate".
|
| Say I have a string that's supposed to represent an integer. To
| me, "Validate" means using a regex to ensure it contains only
| digits (raising an error if it doesn't) but then continuing to
| work with it as a string. "Parse" means using "atoi" to obtain an
| integer value (but what if the string's malformed?) and then
| working with that.
|
| I first thought this article was recommending doing the latter
| instead of the former, but the actual recommendation (and I
| believe best practice) is to do both.
| b3morales wrote:
| You seem to suggest that it's possible to parse _without_
| validating, which I'm not sure I follow. Surely validation is
| just one of the phases or steps of parsing?
| pansa2 wrote:
| Functions like `atoi` parse strings into integers, but will
| happily accept " 10blah" and return 10. In my experience it's
| best to validate that the string is well-formed (e.g.
| contains only digits) before passing it to one of those
| functions.
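Python's built-in `int()` already behaves like a parser that validates as it converts: it raises on malformed input rather than silently truncating the way C's `atoi` does. A small illustration (`parse_int` is just a wrapper for the example):

```python
def parse_int(s: str) -> int:
    # int() both validates and converts in one step: no separate
    # regex pass, and no atoi-style silent truncation.
    return int(s)


assert parse_int("10") == 10
try:
    parse_int(" 10blah")   # atoi would happily return 10 here
except ValueError:
    pass                   # malformed input is rejected outright
```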
| nsajko wrote:
| The point is that validation is (or should/can be) a byproduct
| of parsing. I.e., you shouldn't "do both", rather the
| validation should be encompassed by the parsing, as much as it
| makes sense.
| Attummm wrote:
| That's how the validation tool for Python Maat works. By creating
| a completely new dictionary. https://github.com/Attumm/Maat
| errnesto wrote:
| For js / typescript I like: https://github.com/paperhive/fefe
|
| basically it's just functions that take a value of one type
| and return another one
| vorticalbox wrote:
| There is also joi, zod, myzod just to name a few.
|
| I personally use myzod as it's fast at parsing, has zero
| dependencies, and you can infer types from your schemas.
| catlifeonmars wrote:
| Don't forget https://github.com/gcanti/io-ts
| mulmboy wrote:
| How does Maat compare with pydantic?
| https://github.com/samuelcolvin/pydantic
| rmnclmnt wrote:
| If only for conciseness, readability and speed, I'd take
| Pydantic over it any day. Being able to express 80% of type
| checking using Python native type hints + dataclasses is just
| so intuitive!
|
| And it's getting some wide adoption, for instance FastAPI
| which uses it for request validations.
| Attummm wrote:
| Engineering is about tradeoffs, even though both projects
| do validation.
|
| The points you made are all very valid points.
|
| At my employer we use both projects. If the data is very
| nested, or really large Maat is used.
| rmnclmnt wrote:
| Mmmh interesting requirement! Indeed, defining very
| nested structure with Pydantic is one of its weaknesses.
|
| And of course I agree 100% about tradeoffs in engineering.
| However I usually advise against using 2 dependencies
| doing mainly the same thing if possible within the same
| project.
|
| Anyway, good catch, thanks for enlightening me!
| Attummm wrote:
| Pydantic is using classes and typehinting. The new
| dataclasses style. Currently Maat doesn't have a parser for
| dataclasses; it could come in the future. Pydantic works
| great with typehinting.
|
| Maat was created before dataclasses existed. For validation
| Maat offers the same. But it also allows for some really neat
| features such as validation on encrypted data.
| https://github.com/Attumm/Maat/blob/main/tests/test_validati...
|
| Since validations are written as dictionaries, it's possible to
| store them in a caching DB such as Redis.
|
| And since it's simple, it's easy to extend for anyone's use
| case. And there are no other dependencies.
|
| Benchmarks of pydantic has Maat around twice as Pydantic.
| Attummm wrote:
| Unable to change my comment.
|
| Benchmarks of pydantic has Maat around twice the speed of
| Pydantic
| AlexSW wrote:
| I couldn't agree with this post more.
|
| I found myself replacing the configuration parsing code in a C++
| project that was littered with exactly the validation issues
| described, and converted it to that which the author advocates.
| The result was a vastly more readable and maintainable codebase,
| and it was faster and less buggy to boot.
|
| Another nice advantage is that the types are providing
| free/self-documentation, too.
| iamwil wrote:
| A related mantra is to "Make impossible states impossible"
| https://www.youtube.com/watch?v=IcgmSRJHu_8
| nathcd wrote:
| I think the original formulation is "make illegal states
| unrepresentable" from Yaron Minsky:
| https://blog.janestreet.com/effective-ml/ and
| https://blog.janestreet.com/effective-ml-revisited/ (or maybe
| there are older sources than 2010?)
| bruce343434 wrote:
| This still sounds like validation but with extra steps. (or
| less?)
| kortex wrote:
| Validation is checking if something _looks_ like a data
| structure. Parsing is smashing data into a data structure, and
| failing out if you can't do it.
|
| At the end of parsing, you have a structure with a _type_.
| After validation, you may or may not have a structure with that
| type, depending on how you chose to validate.
|
| But I think the big win is, parsers are usually much easier to
| compose (since they themselves are structured functions) and so
| if you start with the type first, you often get the
| "validation" behavior aspect of parsing for "free" (usually
| from a library). Maybe title should have been "Parse, don't
| _manually_ validate. "
|
| But if your type doesn't catch all your invariants, yeah it
| does feel kinda just like validation.
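The composability point can be sketched in Python (`compose` and `parse_date` are illustrative names, not from the post): each parse-style function returns a usable value, so they chain without re-checking.

```python
from datetime import datetime
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def compose(f: Callable[[A], B], g: Callable[[B], C]) -> Callable[[A], C]:
    # Chain two parse-style functions; a failure in either simply
    # propagates as an exception, so a returned result is always usable.
    return lambda x: g(f(x))


# str.strip "parses" away surrounding whitespace; fromisoformat
# then parses the cleaned string into a datetime.
parse_date = compose(str.strip, datetime.fromisoformat)
```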
| plesiv wrote:
| The post is saying:
|
| - don't drop the info gathered from checks while validating,
| but keep track of it
|
| - if you do this, you'll effectively be parsing
|
| - parsing is more powerful than validating
|
| "Extra steps" would be keeping track of info gathered from
| checks.
| bruce343434 wrote:
| Right. My takeaway was "verify and validate once, then put it
| in a specially marked datastructure, or if your language
| allows it, make the typesystem guarantee some conditions of the
| data, then work with that from there". Where does parsing
| come into the picture?
| vvillena wrote:
| Well, ain't that it? If you validate a string so that it
| contains some angle bracket tags at the beginning and the
| end, ensure that both the tag values contain the same
| string (except for one extra marker in the end tag), and
| store the tag name and the string within the tags in a
| purpose-made data structure, you can call it whatever you
| want. Some will call that a rudimentary XML parser.
| kortex wrote:
| > verify and validate once, then put it in a specially
| marked datastructure
|
| Not to steal vvillena's thunder, but that's pretty much the
| dictionary definition of "parsing"
|
| > analyze (a string or text) into logical syntactic
| components, typically in order to test conformability to a
| logical grammar.
|
| Parsing is taking some collection of symbols, and emitting
| some other structure that obeys certain rules. Those
| symbols need not be text, they can be any abstract "thing".
| A symbol could be a full-blown data structure - you can
| parse a List into a NotEmptyList, where there's some
| associated grammar with the NEL that's a stricter version
| of the List grammar.
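That List -> NonEmptyList step can be written out in Python (`NonEmptyList` and `parse_nonempty` are hypothetical, just for illustration):

```python
from dataclasses import dataclass
from typing import Generic, List, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class NonEmptyList(Generic[T]):
    # The type itself carries the invariant: there is always a head.
    head: T
    tail: List[T]


def parse_nonempty(xs: List[T]) -> "NonEmptyList[T]":
    # "Parsing" a List into a NonEmptyList: establish the invariant
    # once, or fail loudly; downstream code never re-checks.
    if not xs:
        raise ValueError("expected a non-empty list")
    return NonEmptyList(head=xs[0], tail=xs[1:])
```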
| justinpombrio wrote:
| Haha, "Those symbols need not be text", you say, right
| after quoting a definition that says they need to be "a
| string or text"!
|
| There's a field of study called "parsing", which studies
| "parsers". Hundreds of papers. Very well defined problem:
| turning a _list_ of symbols into a _tree_ shaped parse
| tree (or data structure). The defining aspect of parsing,
| that makes it difficult and an interesting thing to
| study, is that you're starting with a list and ending
| with a tree. If you're converting tree to tree (that is,
| a typical data structure to a typical data structure),
| all the problems vanish (or change drastically) and all
| the parsing techniques are inapplicable.
|
| I'm kind of annoyed that people are starting to use the
| word "parse" metaphorically. Bit by bit, precise words
| turn fuzzy. Alas, it will be a lost battle.
| samatman wrote:
| Sure, but the list of symbols can be an arbitrary
| collection where the symbols are 0 and 1.
|
| Voila, now your 'string' is 'binary data' not 'text'.
|
| Parsing binary data is my bread and butter, so I might be
| biased but: it works fine.
|
| Anything which comes over the wire is a string, anything
| which comes out of store is a string. If you're using
| something like protobufs, that's great, because having to
| marshal/serialize/parse along every process boundary is
| expensive and probably unnecessary.
|
| But at some point, and anywhere on the 'surface' of the
| system, data has to be un-flattened into a shape. That's
| parsing.
| kortex wrote:
| > Haha, "Those symbols need not be text", you say, right
| after quoting a definition that says they need to be "a
| string or text"!
|
| Ha yeah nice catch, that's why I added that in there. In
| this case the dictionary is slightly wrong.
|
| > The defining aspect of parsing, that makes it difficult
| and an interesting thing to study, is that you're
| starting with a list and ending with a tree.
|
| Ah, I didn't know that! Great bit to learn.
|
| In that case, I will say that the "increase a data
| structure's rules" is a bit ambiguous.
|
| I think my statement is still correct in that "a symbol
| could be a data structure," right? Like you could take a
| list of dicts and emit a tree of dicts.
|
| But wait, a list _is_ a kind of tree, or rather, there is
| a parse tree of recursive head/tail branches. So I think
| you could still argue List->NotEmptyList is a Parse
| because NEL requires a nonzero "head" and zero or one NEL
| as "tail."
| justinpombrio wrote:
| > I think my statement is still correct in that "a symbol
| could be a data structure," right? Like you could take a
| list of dicts and emit a tree of dicts.
|
| Yeah I guess. Parser combinator libraries like Haskell's
| Parsec and Rust's Nom are typically parametric over the
| type of "characters". Realistically I don't think I've
| ever seen anyone use one of those libraries for an input
| that wasn't text-like, though; do you have a use case in
| mind?
|
| > But wait, a list is a kind of tree, or rather, there is
| a parse tree of recursive head/tail branches.
|
| Yes, so you can run into parsing problems when working
| with trees, if you work _really hard_ at it. But if you
| do, the correct action is "reconsider your life choices"
| and not "use parsing theory".
| justinpombrio wrote:
| You understand it. It's using the word "parse"
| metaphorically, to mean "validate, then put it in a
| specially marked data structure". For example, `parse_int
| :: string -> maybe int` is a parsing function, and it
| "validates that the string is an integer, then puts it in a
| specially marked data structure called int". However, the
| post uses the word "parse" not only for true parsing
| functions (that convert text into a data structure), but
| also for conversions from data structure to data structure.
|
| I also find this a confusing use of the word "parse", and
| it's not explained in the post, and I think "parse, don't
| validate" is a poor slogan as a result. The traditional
| slogan is "make illegal states unrepresentable", though
| that's a bit narrower of a concept.
| b3morales wrote:
| I don't think this is the same as "make illegal states
| unrepresentable"; it's a corollary (or the converse
| maybe):
|
| "Make assertions of legal states representable"
| joshuamorton wrote:
| They're sort of the same. If you have str and safestr,
| and safestr is known to conform to some invariant, the
| illegal state of, say, a str where validate hasn't been
| called, isn't representable.
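A Python rendering of that str/safestr idea (`SafeStr` and its invariant are hypothetical): the only way to obtain the type is through the check, so "a safestr that was never validated" isn't a state you can be in.

```python
class SafeStr:
    # Direct construction is blocked; holding a SafeStr proves
    # that parse() ran its check.
    def __init__(self, value: str):
        raise TypeError("use SafeStr.parse()")

    @classmethod
    def parse(cls, raw: str) -> "SafeStr":
        if "<" in raw or ">" in raw:  # illustrative invariant
            raise ValueError("unsafe characters")
        # Bypass __init__ deliberately: this is the one blessed path.
        obj = object.__new__(cls)
        obj.value = raw
        return obj
```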
| dgb23 wrote:
| It's more like putting the "extra steps" at the right place in
| the code.
| seanwilson wrote:
| From the Twitter link:
|
| > IME, people in dynamic languages almost never program this way,
| though--they prefer to use validation and some form of shotgun
| parsing. My guess as to why? Writing that kind of code in
| dynamically-typed languages is often a lot more boilerplate than
| it is in statically-typed ones!
|
| I feel that once you've got experience working in (usually
| functional) programming languages with strong static type
| checking, flakey dynamic code that relies on runtime checks and
| just being careful to avoid runtime errors makes your skin crawl,
| and you'll intuitively gravitate towards designs that takes
| advantage of strong static type checks.
|
| When all you know is dynamic languages, the design guidance you
| get from strong static type checking is lost so there's more bad
| design paths you can go down. Patching up flakey code with ad-hoc
| runtime checks and debugging runtime errors becomes the norm
| because you just don't know any better and the type system isn't
| going to teach you.
|
| More general advice would be "prefer strong static type checking
| over runtime checks" as it makes a lot of design and robustness
| problems go away.
|
| Even if you can't use e.g. Haskell or OCaml in your daily work, a
| few weeks or just a few days of trying to learn them will open
| your eyes and make you a better coder elsewhere.
| Map/filter/reduce, immutable data structures, non-nullable types
| etc. have been in other languages for over 30 years before these
| ideas became more mainstream best practices for example (I'm
| still waiting for pattern matching + algebraic data types).
|
| It's weird how long it's taking for people to rediscover why
| strong static types were a good idea.
| jhgb wrote:
| > > IME, people in dynamic languages almost never program this
| way, though--they prefer to use validation
|
| I wonder how many people the author met.
| ukj wrote:
| Every programming paradigm is a good idea if the respective
| trade-offs are acceptable to you.
|
| For example, one good reason why strong static types are a bad
| idea... they prevent you from implementing dynamic dispatch.
|
| Routers. You can't have routers.
| justinpombrio wrote:
| Are you sure you know what dynamic dispatch is? Java has
| dynamic dispatch, and it is a statically typed language. In
| Java, it's often called "runtime polymorphism".
|
| https://www.geeksforgeeks.org/dynamic-method-dispatch-runtim...
|
| And using it doesn't give up any of Java's type safety
| guarantees. The arguments and return type of the method you
| call (which will be invoked with dynamic dispatch) are type
| checked.
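The coexistence can be seen even in type-annotated Python (the classes here are illustrative): the call site is checked against the static type `Animal`, while the method body that actually runs is chosen at runtime from the object's concrete class.

```python
class Animal:
    def speak(self) -> str:
        raise NotImplementedError


class Dog(Animal):
    def speak(self) -> str:
        return "woof"


class Cat(Animal):
    def speak(self) -> str:
        return "meow"


def greet(a: Animal) -> str:
    # Statically, `a` is an Animal; which speak() runs is
    # dispatched dynamically on the actual object.
    return a.speak()
```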
| ukj wrote:
| Are you sure you know when type-checking and when dynamic
| dispatching happens?
|
| Compile time is not runtime.
|
| The Java compiler is not the JVM.
|
| The compiler does type checking. The JVM does the dynamic
| dispatching.
|
| Neither does both.
| justinpombrio wrote:
| All those statements are correct. The people downvoting
| you know that too. I don't think anyone has figured out
| what point you're trying to make, though. Could you spell
| it out in more detail?
|
| Consider addition. The compiler does type checking, and
| the JVM actually adds the numbers. Nonetheless, the
| addition is type checked, and does not represent a
| weakness of static type checking. How is dynamic dispatch
| different than this?
| ukj wrote:
| The point is trivial.
|
| You can't have both static type safety AND dynamic
| dispatch at the same time and in the same context about
| the same data.
|
| Choose one. Give up the other. Make a conscious trade
| off.
|
| The language that you are using is making such trade offs
| for you - they are implicit in the language design. Best
| you know what they are because they are meaningful in
| principle and in practice.
| justinpombrio wrote:
| Java has both static type safety AND dynamic dispatch at
| the same time and in the same context about the same
| data.
| ukj wrote:
| No, it doesn't.
|
| The input-data to the compiler can't be handled by the
| JVM and vice versa.
|
| The JVM handles byte code as input. The compiler handles
| source code as input.
|
| That is two different functions with two different data
| domains.
|
| They literally have different types!
|
| Which one of the two functions is the thing you call
| "Java"?
| justinpombrio wrote:
|     A type system is sound when:
|         for all expressions e:
|             if e type checks with type t, then one of the following holds:
|             - e evaluates to a value v of type t; or
|             - e does not halt; or
|             - e hits an "unavoidable error" like division by 0 or null
|               deref (what counts as "unavoidable" varies from lang to lang)
|
| Notice anything interesting about this definition? It
| uses the word "evaluate"! Type soundness is not just a
| statement about the type checker. It relates type
| checking to run time (and thus to compilation too). That
| is, if you muck with Java's runtime or compiler, you can
| break type soundness, even if you don't change its type
| system in the slightest.
| ukj wrote:
| Yes this is precisely what I am talking about.
|
| Different implementations of eval() a.k.a different
| programming languages have different semantics.
|
| Which implicit eval() implementation does the above hold?
|
| What type system do you have in mind such that the above
| holds for ALL expressions. Type-checking itself is not
| always decidable.
| BoiledCabbage wrote:
| It's pretty clear at this point you don't understand what
| dynamic dispatch means.
|
| I don't think it's worthwhile for anyone else to argue
| with you further. C++ and Java both support dynamic
| dispatch although you deny it.
|
| You've taken up almost a full page of HN arguing with
| everyone trying to explain it to you. People have pointed
| you to wikipedia showing you that you're wrong. [1]
|
| ISOCPP of which Bjarne is a director [2] says that C++
| supports dynamic dispatch. [3]
|
| And you continue to attempt to argue that everyone on HN,
| Wikipedia and the creator of the C++ language are all
| wrong and don't know what dynamic dispatch is.
|
| Your continued insistence is both _wrong_ and a negative
| impact at this point on hn. Please stop arguing something
| that numerous people have taken lots of patience and
| charity in trying every way possible to explain to you
| and what is clearly factually wrong.
|
| If you're going to reply, please explain why an
| organization that Bjarne Stroustrup is a director of
| believes that C++ supports dynamic dispatch.
|
| 1. https://en.wikipedia.org/wiki/Dynamic_dispatch#C++_implement...
|
| 2 https://isocpp.org/about
|
| 3. https://isocpp.org/wiki/faq/big-picture#why-use-oo
| ukj wrote:
| >If you're going to reply, please explain why an
| organization that Bjarne Stroustrup is a director of
| believes that C++ supports dynamic dispatch.
|
| Because the meaning of "dynamic" is ambiguous!
|
| Since you are pointing me to wikipedia I'll point you
| right back...
|
| https://en.wikipedia.org/wiki/Virtual_method_table#Compariso...
|
| "Virtual method tables also only work if dispatching is
| constrained to a known set of methods, so they can be
| placed in a simple array built at compile time."
|
| If the vtable is generated at compile time and is constrained
| to a _known set of methods_, then that array is immutable!
| Calling that "dynamic" is an obvious misnomer!
|
| You are neither charitable nor patient. You are
| committing the bandwagon fallacy as we speak. You and the
| other 120 (and counting) angry downvoters ;)
|
| I am using the word "dynamic" to actually mean dynamic! I
| am not going to define it. Use your judgment. Dynamic is
| NOT static. I am not asking you to "educate me", or to
| tell me I am right; or wrong. I am asking you to
| understand the sort of programming language design I have
| in mind!
|
| Either you understand; or you don't.
| ukj wrote:
| It is pretty clear at this point that you don't know what
| "meaning" is when it relates to the semantics of formal
| languages.
|
| For starters you don't even understand Rice's theorem.
|
| Sadly this isn't on Wikipedia. You actually have to dive
| deep on the Topological, Categorical and Logical view of
| computation.
| LAC-Tech wrote:
| > It's pretty clear at this point you don't understand
| what dynamic dispatch means.
|
| Terms in computing are so overloaded that these days I
| try[0] and never correct anyone on how they use a term.
| Instead I ask them to define it, and debate off of that
| definition.
|
| So instead of downvoting this guy for using different
| terminology - we can ask him what he means and just have
| a discussion.
|
| [0] alright I don't always succeed but it's an ideal to
| strive for
| Jtsummers wrote:
| Dynamic dispatch is not terribly overloaded. It's
| dispatching based on run-time information instead of just
| compile-time information.
|
| The problem in this discussion is that ukj has come to
| the belief (but communicated it poorly) that dynamic
| dispatch is somehow incompatible with static typing. And
| for some reason this also matters.
|
| Static typing does _not_ preclude dynamic dispatch, and
| despite being pointed to several mainstream languages
| that have both features, ukj decided to ignore reality or
| the common understanding of the phrase "dynamic
| dispatch" and produced this grotesque example of trying
| to communicate with an individual who is, apparently,
| just a troll. Feeding the troll, ukj, is probably the
| dumbest thing I did today, but I'll blame that on the
| insomnia reducing my ability to detect trolls.
| ukj wrote:
| >Dynamic dispatch is not terribly overloaded. It's
| dispatching based on run-time information instead of just
| compile-time information.
|
| According to the above definition C++ _does not_ have
| dynamic dispatch!
|
| The vtable in C++ is generated at _compile time_ and used
| at _runtime_. It's immutable at runtime. That means it
| is NOT dynamic!
|
| https://en.wikipedia.org/wiki/Virtual_method_table#Compariso...
|
| "Virtual method tables also only work if dispatching is
| constrained to a known set of methods, so they can be
| placed in a simple array built at compile time."
|
| Like, I don't care how you use words, but you (and
| everyone) are using "dynamic" to speak about a
| statically-behaving system!
| ukj wrote:
| This is a commendable approach.
|
| Computation is a general, abstract and incredibly useful
| idea disconnected from any particular model of
| computation (programming language).
|
| Different languages are just different models of
| computation and have different (desirable, or
| undesirable) semantic properties independent from their
| (trivial) syntactic properties.
|
| It's this sort of angry dogmatism which prevents people
| from talking about programming language design.
|
| Not for a second do they pause to think their own
| understanding may be limited.
| Jtsummers wrote:
| > It's this sort of angry dogmatism which prevents people
| from talking about programming language design.
|
| The dogmatism demonstrated today was in your comments,
| ukj. Your inability to understand that your non-standard
| use of terms makes it impossible for others to
| communicate with you in any effective way made this a
| remarkable farce of a conversation or debate.
| ukj wrote:
| There is no such thing as a "standard" model of
| computation, and therefore no such thing as "standard
| definition" of computation.
|
| There is only the model _you_ (and your tribe) believe is
| "standard" implicitly. Do you actually understand that?
|
| I kinda thought navigating the inherent ambiguity of all
| language is a fundamental skill for a software engineer.
|
| Communication is indeed impossible when you think you
| possess the "right" meaning of words.
| kortex wrote:
| Sure you can. You just need the right amount of indirection
| and abstraction. I think almost every language has some
| escape hatch which lets you implement dynamic dispatch.
| ukj wrote:
| This is a trivial and obvious implication of Turing
| completeness. Why do you even bother making the point?
|
| With the right amount of indirection/abstraction you can
| implement everything in Assembly.
|
| But you don't. Because you like all the heavy lifting the
| language does for you.
|
| First Class citizens is what we are actually interested in
| when we talk about programming language paradigm-choices.
|
| https://en.wikipedia.org/wiki/First-class_citizen
| kortex wrote:
| I mean, GP said "you can't have routers" and maybe I'm
| being dense by interpreting that as "never or almost
| never," but even with a generous "too hard to be
| practical," I still don't think it's correct.
|
| And I explicitly said "escape hatch" meaning language
| feature. You don't need _that_ much indirection to get
| routers in Haskell, Rust, Go, C, C++... like I fail to
| see how implementing routers is a barrier in strict type
| system languages.
|
| Is it _easier_ in python or js? Sure. _Can't_? Hardly.
|
| E: here's some vtable dispatch (unless that doesn't count
| as "dynamic dispatch") in Rust. Looks _really_
| straightforward.
|
| https://doc.rust-lang.org/1.8.0/book/trait-objects.html
| dkersten wrote:
| > Why do you even bother making the point?
|
| Maybe because you said:
|
| > they prevent you from implementing dynamic dispatch.
|
| and
|
| > Routers. You can't have routers.
|
| Which just isn't true. You _can_ implement dynamic
| dispatch and you _can_ have routers, but they come at a
| cost (either of complex code or of giving up compile-time
| type safety, but in a dynamic language you don't have
| the latter anyway, so with a static language you can at
| least choose when you're willing to pay the price).
|
| > First Class citizens is what we are actually interested
| in when we talk about programming language paradigm-
| choices.
|
          | But that's not what you said in your other comment. You
          | just said _you can't have these things_, not _they're
          | not first class citizens_. Besides, some static languages
| do have first class support for more dynamic features.
| C++ has tools like std::variant and std::any in its
| standard library for times you want some more dynamism
| and are willing to pay the tradeoffs. In Java you have
| Object. In other static languages, you have other built-
| in tools.
| ukj wrote:
| Everything comes at a cost of something in computation!
|
| That is what "trade offs" means.
|
| You can have any feature in any language once you
| undermine the default constraints of your language. You
| can implement Scala in Brainfuck. Turing completeness
| guarantees it!
|
| But this is not the sort of discourse we care about in
| practice.
|
| https://en.wikipedia.org/wiki/Brainfuck
| dkersten wrote:
| Yes, and? How is that relevant here?
|
          | You said _"you can't"_, kortex said _"you can"_ and
          | then you moved the goal posts to _"you can because of
          | Turing completeness, but it's bad, why do you even bother
          | making the point?"_ to which I replied _"because it's a
          | valid response to your 'you can't'"_, and now you've
          | moved them again to _"everything comes at a cost"_
          | (which... I also said?).
|
| Of course everything comes at a cost and yes, that's what
| "trade off" means. Dynamic languages come at a cost too
| (type checking is deferred to run time). So, this time,
| let me ask you: _Why do you even bother making the
| point?_
| ukj wrote:
| Tractability vs possibility.
|
| You don't grok the difference.
|
| You can implement EVERYTHING in Brainfuck. Tractability
| is the reason you don't.
|
| The goalposts are exactly where I set them. With my first
| comment.
|
| "Every programming paradigm is a good idea if the
| respective trade-offs are acceptable to you."
| bidirectional wrote:
| You can't actually implement everything in Brainfuck. You
| can implement something which is performing an equivalent
| computation in an abstract, mathematical sense. But
| there's no way to write Firefox or Windows or Fortnite in
| Brainfuck. Turing completeness means you can evaluate any
| computable function of type N -> N (and the many things
| isomorphic to that), it doesn't give you anything else.
| ukj wrote:
| I am interested in computation. Period. Not any
| particular model of computation (programming language);
| and not merely computation with functions from N->N.
|
| Quoting from
          | "http://math.andrej.com/2006/03/27/sometimes-all-functions-ar..."
|
| "The lesson is for those "experts" who "know" that all
| reasonable models of computation are equivalent to Turing
| machines. This is true if one looks just at functions
| from N to N. However, at higher types, questions of
| representation become important, and it does matter which
| model of computation is used."
| dkersten wrote:
          | Just as a Turing machine requires an infinitely sized
          | tape to compute any computable function, so too would
          | Brainfuck require an infinitely sized tape (or whatever
          | it's called in BF) to compute any computable function.
          | Since memory is finite, neither of these properties is
          | actually available on real hardware.
| dkersten wrote:
          | It's perfectly tractable though. Just because you don't
          | understand it or don't think it is, doesn't make it so.
|
| > "Every programming paradigm is a good idea if the
| respective trade-offs are acceptable to you."
|
| That's not what we are responding to. Nobody here is
          | arguing over this statement. We are responding to your
          | assertion that statically typed, compile-time-checked
| languages _prevent_ you from having dynamic dispatch and
| that you _can't have_ routers because of that. Neither of
| which are true.
|
| Dynamic languages prevent you from having compile time
| checks. Does that make them bad? Static languages give
| you compile time safety, but if you're willing to forego
| that [1], then you can get _the EXACT SAME behavior as
| dynamic languages_ give you.
|
          | You literally said:
          |
          |         For example, one good reason why strong static
          |         types are a bad idea... they prevent you from
          |         implementing dynamic dispatch. Routers. You
          |         can't have routers.
|
| Nowhere did you say anything about trying to implement it
| at compile time. Also, if strong static types are a bad
| idea because you can't maintain them all the time, then
| dynamic typed languages are a bad idea because you don't
| get static types ever, its always at runtime.
|
| Just because a hammer can't screw in screws doesn't mean
          | it's a bad idea, it just means that you can't use it for
| all use cases. This is the same. You can use static types
| and for the few cases where you need runtime dynamism,
| then you use that. That doesn't make the static types in
| the rest of your code bad. It just gives you additional
| tools that dynamic types alone don't have.
|
          | [1] to various degrees, it's not all or nothing like you
| seem to be implying, there are levels of middle ground,
| like std::variant which maintains safety but you need to
| enumerate all possible types, or std::any which is fully
| dynamic but you give up compile time checks
| [deleted]
| ukj wrote:
| Nothing PREVENTS you from doing anything with a computer
| except your own, mutually incompatible design goals!
|
| I am pointing out (and you seem to be agreeing) that you
| can't have static type safety AND dynamic dispatch
| _______AT THE SAME TIME__________
|
| Which is why you get static type safety ____AT COMPILE
| TIME____. And dynamic dispatch ____AT RUNTIME___.
|
| You were the one moving the goal posts all along by flip-
| flopping between time-contexts.
| dkersten wrote:
| Ok, I think I understand what you are trying to say.
| Instead of telling us how we don't understand various
| things, how about next time you define your terms, choose
| your words more carefully and be clearer with your
| explanations, so we can actually understand what you're
| trying to say.
|
| So let me restate what I think you meant. Maybe I'm
| wrong.
|
| > a limitation of compile time static types is that they
          | cannot do dynamic dispatch, which relies on
| information that isn't known at compile time.
|
| I think we can all agree on this?
|
| Note that I said it's a limitation, not that it's bad.
| Limitations don't make something bad in themselves, they
| only constrain when it is an appropriate tool. You
          | wouldn't say that a submarine is bad because it can't go
          | on land, would you? It's a limitation but that doesn't
          | make it
| _bad_.
|
| Then you said that because of this limitation, "you can't
| have routers".
|
| But you can have routers in static languages like C++,
| Rust or even Haskell.
|
| You have a choice (trade off) to make:
|
| 1. either you give up on some dynamism to keep compile
| time checks by enumerating the types you can dispatch on
| (as std::variant does)
|
          | 2. or you choose to give up static checking (for this
          | part of your code only!) and move it to runtime for
          | fully dynamic logic (as std::any does). When giving up
          | compile time safety for one part of your code, you do not
          | give it up for all of your code, because you can runtime
          | validate them on the boundaries going back into
          | statically checked code paths, so the compiler can assume
          | that they are valid if they get back in (as the runtime
          | checks would reject bad types)
|
| Neither case is "you can't have routers", but sure you
| can't have fully dynamic routers purely at compile time.
|
| Also, both options are perfectly tractable and both cases
| are typically first class language features (at least in
          | the statically typed languages I've used). In no case are
          | the options "bad", despite each option having different
| limitations.
|
| In a dynamic-types-only language, you don't get to choose
| the trade offs at all, you only get "fully dynamic no
| compile time checking, runtime checks only".
|
| Note also that in real life, few things are truly fully
| dynamic. You are always constrained by the operations
| that are expected to be carried out on the dynamic data
| and while you might not know at runtime what days you
| could get, you DO know what operations you expect to run
| on the data. And you can perform compile time checks
| around that.
|
          | So for a router, do you really need it to be fully
| dynamic? Or can you live with it being _generic_ (ie it's
| a library that does routing and it can support any type,
| but you constrain it to known types when you actually use
| it). If so, you have options that maintain type safety:
| you can use OOP inheritance, you can use enumerated types
          | like std::variant, you can use generics/templates. The
| library works for any types but the user chooses what
| types to implement based on the operations that will be
| performed on the data. Even dynamic types do this, they
| just defer it to runtime.
|
| Or you can have the router operate on purely dynamic
| types but the handlers that are routed to are statically
| typed (eg if in C++ the router uses std::any and the
| handler registers what types it can accept and the router
| checks validity before handing the data off).
| ukj wrote:
| I chose my words as well as I thought necessary.
|
          | Had I known I was going to be lynched for my choices, I
| would've chosen them even better. Or maybe I would've
| just censored myself.
|
| How about you practice the principle of charity next
| time?
| dkersten wrote:
| I don't think any of the replies you got for your first
| comments were uncharitable at all, they responded to what
| you wrote.
|
| At that point, you could have corrected us and revised
| your wording for clarity, but you did not, you dug your
| heels in, you moved the goal posts, you claimed we don't
| understand various things that weren't even really
| related to the comment we were responding to and you
| brought in irrelevant points like Turing completeness.
| You didn't "get lynched" right away, you could have
| reworded or clarified or asked what we didn't understand
| about your statement.
|
| Also, YOU didn't practice the principle of charity!
|
| When people responded to what you wrote, you dug in and
| claimed we didn't understand compilers or tractability vs
| possibility and various other things, rather than
          | thinking _"maybe they didn't understand my point, I
          | should clarify"_. So it's on you, not us.
|
| I'm still not sure if I understand what you were trying
| to say, I am assuming that you meant what I wrote when I
| restated your comment, by piecing all of your different
| comments together. I'm still not sure if you actually
| meant static types are bad period (vs being bad at
| certain things and having certain limitations). And I
| still don't agree with "you can't have routers".
|
| Anyway, I'm done, have a nice day.
|
          | EDIT: I know I said I'm done, but as for your reply
          | _"parse better"_: I gave you an out and still you blame everyone
| else and don't accept that you might have made a mistake.
| You're so sure that you are right and everyone else is
| wrong that you don't even entertain the possibility that
| you might have made a mistake (either in your reasoning
| or your explanation thereof). You appear to have an ego
| problem. You should take some time out and reflect on
| your life a bit.
| ukj wrote:
| You responded to your misunderstanding of what I wrote.
|
| Parse better.
| [deleted]
| [deleted]
| mixedCase wrote:
| Did you mean something other than "dynamic dispatch", or
| what do you mean by "first class support"?
|
          | No offense, but your claim sounds to me like you're
          | confused; then again, maybe I am the one confused. AFAIK
          | I do dynamic
| dispatch all the time in strongly typed languages. Can
| you show an example that a strongly typed language can't
| accomplish?
| ukj wrote:
| >I do dynamic dispatch all the time in strongly typed
| languages.
|
| I believe semantics is getting in our way of
| communicating.
|
| You don't do dynamic dispatch ___ALL___ the time. You
| only do it at runtime. And you only do static type safety
| at compile time. Those are different times.
|
| You can't have both of those features at the __SAME
          | TIME__, therefore you can't have both features ALL the
| time. They are mutually exclusive.
| mixedCase wrote:
| "All the time" is an english expression, please, look it
| up before going all-caps Python name mangle convention on
| me.
|
          | Yes, of course dynamic dispatch is a runtime phenomenon,
| that's the dynamic part of it. But there's nothing
| stopping the code that performs dynamic dispatch from
          | being strongly typed. Strong types are instructions used
          | to prove that the code holds certain properties; they are
          | a separate program from the final binary the compiler
| gives you. Do you also think that code that gets unit
| tested can't perform dynamic dispatch?
|
| If your point is that "types don't exist at runtime
| anyway" (reflection aside) then you don't understand what
| the purpose of a type system is, nor what strongly typed
| code means.
| ukj wrote:
| There is something stopping you from running the proof
| against your implementation!
|
| Proof-checking happens only at compile time.
|
| The implementation that you want to prove things about is
| only available at runtime!
|
| Non-availability (incompleteness) of information is what
| is preventing you...
| Jtsummers wrote:
| Dynamic dispatch is literally a runtime feature of a
| language. That's why it's called "dynamic". GP can
| absolutely use dynamic dispatch "all the time" in the
| sense that they use it regularly in their program,
| perhaps in every or nearly every program they write.
|
| Your statement is verging on the nonsensical, like saying
| to someone "You don't use integers all the time,
| sometimes you use strings, they're different things."
| Well, duh?
|
          | EDIT: Also, it's discouraged on this site in general to
          | use all caps for emphasis. Surrounding a *phrase* with
          | asterisks produces an italicized/emphasized form instead.
| ukj wrote:
          | I am using a perfectly sensible notion of "dynamic" (e.g.
| NOT static) when I am talking about "dynamic dispatch".
|
          | Registering pointers to new implementations at runtime
          | (adding entries to the dispatch table). Unregistering
          | pointers to old implementations at runtime (removing
          | entries from the dispatch table).
|
| If your dispatch table is immutable (e.g static!),
| there's nothing dynamic about your dispatch!
| ImprobableTruth wrote:
| Huh? Could you give a specific example? Because e.g. C++ and
| Rust definitely have dynamic dispatch through their vtable
| mechanisms.
| ukj wrote:
| Do you understand the difference between compile time and
| runtime?
|
| Neither C++ nor Rust give you static type safety AND
| dynamic dispatch because all of the safety checks for C++
| and Rust happen at compile time. Not runtime.
| detaro wrote:
| > _Neither C++ nor Rust have dynamic dispatch_
|
| You appear to be using some other definition of dynamic
| dispatch than the rest of the software industry...
| ukj wrote:
| You appear to be conflating compilers with runtimes.
|
| Dynamic dispatch happens at runtime.
|
| C++ and Rust are compile-time tools, not runtimes.
| BoiledCabbage wrote:
| You are wrong. C++ _supports_ dynamic dispatch.
|
| Please read about it on Wikipedia
|
| https://en.m.wikipedia.org/wiki/Dynamic_dispatch
|
          | And in the future, to avoid littering HN with comments
          | like these: next time 10 different people in a thread are
          | all explaining to you why you're mistaken, take a moment
          | to try to listen and think through what they're saying
          | instead of just digging deeper.
|
| Having an open mind to learning something new, not just
| arguing a point is a great approach to life.
| ukj wrote:
| You are wrong about me being wrong.
|
| C++ is just a compiler. It outputs a binary for a target
| platform. The compiler does nothing for you at runtime -
| certainly not "type safety".
|
| Sometimes the 10 people on HN are wrong. This is one of
| those times.
| Jtsummers wrote:
| C++ is not a compiler. C++ is a language with a
| specification from which people derive compilers and
| standard libraries and runtimes.
|
| C++ the language very much does tell you what to expect
| at runtime, though perhaps not everything you could ever
| want. I mean, it's not Haskell or Idris with their much
| richer type systems.
| ukj wrote:
| Perfect!
|
| Please produce a piece of code (in a language such as Coq
| or Agda) which proves whether any given piece of random
| data has the type "C++ compiler" or "C++ program".
|
| That is the epitome of static type-checking, right?
| detaro wrote:
| And the compiler generates the code necessary for dynamic
| dispatch to happen at runtime.
| ukj wrote:
| But it doesn't static-type-check that particular code-
| path.
|
| Because it can't.
| dkersten wrote:
          | Dynamic languages do it at runtime too, JUST LIKE Rust
| and C++ do. What's the difference?
|
| C++ and Rust let you have compile-time safety, _until you
| choose_ to give it up and have runtime checks instead.
| Dynamic languages _only_ allow the latter. Static
| languages let you choose, dynamic languages chose the
| latter for you in all cases. Both can have dynamic
| dispatch.
|
| Besides, static languages can have compile-time type safe
| dynamic dispatch, if you constrain the dispatch to
| compile-time-known types (eg std::variant). You only lose
| that if you want fully unconstrained dynamism, in which
| case you defer type checking to runtime. Which is what
| dynamic languages always have.
|
| So both C++ and Rust _DO_ have dynamic dispatch and the
          | programmer gets to choose what level of the
          | dynamism/safety trade-off they want. And yes, these
          | features ARE
| first class features of the languages.
| ukj wrote:
| >until you choose to give it up
|
| PRECISELY
|
| You have to give up the safety to get the feature.
|
| So you "want type-safety". Until you don't.
|
| >static languages can have compile-time type safe dynamic
| dispatch
|
| "Compile-time dynamic dispatch" is an oxymoron. Dynamic
| dispatch happens at runtime.
| [deleted]
| jsnell wrote:
| I think you might need to define what you mean by dynamic
| dispatch, because it is very clearly something totally
| different than how the term is commonly understood.
| ukj wrote:
| Deciding which implementation of a function handles any
| given piece of data at runtime.
|
| Trivially, because you don't have this knowledge (and
| therefore you can't encode it into your type system) at
| compile time.
| justinpombrio wrote:
| Aha! I think I have debugged your thinking. Wow you made
| that hard by arguing so much.
|
| Apparently you do know what dynamic dispatch is, you're
| just wrong that it can't be type checked.
|
| In Java, say you have an interface called `Foo` with a
| method `String foo()`, and two classes A and B that
| implement that method. Then you can write this code
| (apologies if the syntax isn't quite right, it's been a
          | while since I've written Java):
          |
          |         Foo foo = null;
          |         if (random_boolean()) {
          |             foo = new A();
          |         } else {
          |             foo = new B();
          |         }
          |         // This uses dynamic dispatch
          |         System.out.println(foo.foo());
|
| This uses dynamic dispatch, but it is statically type
| checked. If you change A's `foo()` method to return an
| integer instead of a String, while still declaring that A
| implements the Foo interface, you will get a type error,
| _at compile time_.
| ukj wrote:
| So, there is nothing dynamic about that dispatch.
|
| Because the implementation details of Foo are actually
          | known at compile time. Which is why you are able to type-
| check it.
|
| You have literally declared all allowed (but not all
| possible) implementations of Foo.
|
| What happens when Foo() is a remote/network call?
| detaro wrote:
| so you _are_ using a different definition of dynamic
| dispatch than the rest of the software industry.
| ukj wrote:
| I am using a conception (NOT a definition) that is
| actually dynamic.
|
| If you can type-check your dispatcher at compile time
| then there is nothing dynamic about it.
|
| Decidable (ahead of time) means your function is fully
| determined. Something that is fully determined is not
| dynamic.
|
| It is the conception computer scientists use.
| justinpombrio wrote:
| _That is not what dynamic dispatch means!_ It is an
| extremely well established term, with a very clear
| meaning, and that is not what it means.
|
| I thought you were just mistaken about something, but no,
| instead you've redefined a well understood term without
| telling anyone, then aggressively refused to clarify what
| you meant by it and argued for hours with people, while
| saying they were all wrong when they used the well
| established term to mean its well established meaning.
|
| The thing you're talking about _is_ an interesting
          | concept, but it's not called dynamic dispatch, and you
| will confuse everyone you talk to if you call it that. I
| don't know if there's a term for it.
| ukj wrote:
| "Well established" doesn't mean anything.
|
| According to who?
|
| Computer scientists talk about "well formed" not "well
| established".
|
| Those are categorical definitions.
| justinpombrio wrote:
| > According to who?
|
| Wikipedia, every textbook you can find, the top dozen
| search results for "dynamic dispatch", me who has a PhD
| in computer science plus all the other CS PhD people I
| know, everyone in my office who knows the term (who are
| industry people, not academia people), every blog post I
| have ever read that uses the term, and all the other HN
| commenters except you. I'm really not exaggerating; a lot
| of CS terms have disputed meanings but not this one.
|
| EDIT: Sorry all for engaging the troll. I thought there
| might have been some legitimate confusion. My bad.
| ukj wrote:
| So which textbook contains the meaning of "meaning"?
|
| Oh, that's recursive! Which is Computer Science's domain
| of expertise, not the public domain.
|
| We are talking about formal semantics here. What do
| programs (and computer languages are themselves programs)
| mean?
|
| Point 0 of Wadler's law.
|
          | https://en.wikipedia.org/wiki/Semantics_(computer_science)
|
| If you can type-check it at compile time then it is NOT
| dynamic dispatch. It's a contextual confusion.
| lexi-lambda wrote:
| This is sort of a perplexing perspective to me. It seems
| tantamount to saying "you can't predict whether a value
| will be a string or a number AND have static type safety
| because the value only exists at runtime, and static type
| safety only happens at compile-time." Yes, obviously
| static typechecking happens at compile-time, but type
| systems are carefully designed so that the compile-time
| reasoning says something useful about what actually
| occurs at runtime--that is, after all, the whole point!
|
| Focusing exclusively on what happens at compile-time is
| to miss the whole reason static type systems are useful
| in the first place: they allow compile-time reasoning
| about runtime behavior. Just as we can use a static type
| system to make predictions about programs that pass
| around first-class functions via static dispatch, we can
| also use them to make predictions about programs that use
| vtables or other constructions to perform dynamic
| dispatch. (Note that the difference between those two
| things isn't even particularly well-defined; a call to a
| first-class function passed as an argument is a form of
| unknown call, and it is arguably a form of dynamic
| dispatch.)
|
| Lots of statically typed languages provide dynamic
| dispatch. In fact, essentially _all_ mainstream ones do:
| C++, Java, C#, Rust, TypeScript, even modern Fortran.
| None of these implementations require sacrificing static
| type safety in any way; rather, type systems are designed
| to ensure such dispatch sites are well-formed in other
| ways, without restricting their dynamic nature. And this
| is entirely in line with the OP, as there is no tension
| whatsoever between the techniques it describes and
| dynamic dispatch.
| ukj wrote:
| You must be strawmanning my position to make this
| comment.
|
| Obviously static type systems are useful. I don't even
| think my point is contrary to anything you are saying.
          | This is not being said as a way of undermining any
| particular paradigm because computation is universal -
| the models of computation on the other hand (programming
| languages) are not all "the same". There are qualitative
| differences.
|
| Every single programming paradigm is a self-imposed
| restriction of some sort. It is precisely this
| restriction that we deem useful because they prevent us
| from shooting off our feet with shotguns. And we also
| prevent ourselves from being able to express certain
| patterns (of course we can deliberately/explicitly turn
| off the self-imposed restriction! ).
|
          | Like the restriction you are posing on yourself is
| explicit in "type systems are carefully designed so that
| the compile-time reasoning says something useful about
| what actually occurs at runtime"
|
| If you could completely determine everything that happens
| at runtime you wouldn't need exception/error handling!
|
| All software would be 100% deterministic.
|
| And it isn't.
|
| I can say nothing of the structure of random bitstreams
| from unknown sources. I only know what I EXPECT them to
| be. Not what they actually are.
|
| In this context parsing untrusted data IS runtime type-
| checking.
| benrbray wrote:
| Yeah, I remember I used to get frustrated when I had to read
| code that used map() or even .forEach() extensively, thinking a
| simple, imperative for loop would suffice. I slowly came to
| realize that a for loop gives you too much power. It's a
| hammer. It holds the place of a bug you just haven't written
| yet. Now I'm the one writing JavaScript like it's Haskell.
| Although Haskell could learn a thing or two from TypeScript
| about row polymorphism.
| touisteur wrote:
| On the other end I'm endlessly tired of 'too simple'
| foreach/map iterators. They're OK until you want to do
| something like different execution on first and/or last
| element... Give me a way to implement a 'join' pattern over
| the foreach iterators, or less terse iterators (with 'some'
| positional information). I think I'm just ranting about the
| for-of iterator in Ada...
| Jtsummers wrote:
| Ada provides a kind of iterator called a _Cursor_ which
| could be used to build up a package of functions similar to
| the various C++ standard library algorithms. I believe this
          | has actually already been done. _Cursors_ can also be
          | converted back to positional information if it makes sense
          | (like with a _Vector_).
| DougBTX wrote:
| I quite like the "enumerate" pattern. When indexes matter,
| instead of `for x in v` you would write, `for (i, x) in
| enumerate(v)`, then the language only needs one type of for
| loop as both cases use the same enumerator interface.
| k__ wrote:
          | Is using "row polymorphism" the same as using a "structural
| type system"?
|
| I never heard about the former.
| inbx0 wrote:
| Not really, and correct me if I'm wrong but afaik TS
| doesn't actually do row polymorphism so much as structural
| subtyping - although the difference between the two is
| pretty small and you can get pretty close to row
| polymorphism with structural subtyping + generics.
|
| But even if these were the same thing and we want to be a
          | bit pedantic since this is HN after all, structural type
          | systems often support some kind of subtyping or row
          | polymorphism, but it's not a strict requirement for a type
          | system to be "structural". You could have a structural type
          | system that doesn't allow
          |
          |         { a: int, b: int }
          |
          | to be used where
          |
          |         { a: int }
          |
| is expected. How practical such a type system would be... I
| don't know. Flow type checker for JavaScript makes a
| distinction between "exact" types, i.e. object must have
| exactly the properties listed and no more, and "inexact"
| types where such subtyping is allowed.
| mixedCase wrote:
| FWIW, you can achieve row polymorphism in TypeScript,
          | although it's not super intuitive.
          |
          |         function rowPolymorphic<R extends { a: number }>(
          |             record: R
          |         ): R & { a: string } {
          |             return {
          |                 ...record,
          |                 a: record.a.toString(),
          |             }
          |         }
          |
          |         const rec = rowPolymorphic({
          |             a: 123,
          |             b: "string",
          |         })
          |         console.log(rec.b)
| seanwilson wrote:
| > How practical such a type system would be... I don't
| know. Flow type checker for JavaScript makes a
| distinction between "exact" types, i.e. object must have
| exactly the properties listed and no more, and "inexact"
| types where such subtyping is allowed.
|
| TypeScript doesn't have this check
| (https://github.com/Microsoft/TypeScript/issues/12936)
| and I've found it can be really error prone when you're
| wanting to return a copy of an object with some fields
          | updated. Spot the bug in this example:
          | https://www.typescriptlang.org/play?#code/C4TwDgpgBAKlC8UDeU...
| RHSeeger wrote:
| Plenty of people know multiple statically typed and dynamic
| languages, and multiple functional, imperative, and other
| languages; and use dynamic languages for some things but not
| other things. The set of people using dynamic languages isn't
| just "those that haven't had their eyes opened yet to what
| static languages can do". Different languages and paradigms
| make different things easier.
|
| I do believe that, for long lasting, larger projects, static
| typing tends to make the code easier to maintain as time goes
| on. But not every project is like that. In fact, not every
| project uses a single language. Some use statically typed
| languages for some parts, and dynamically typed for others
| (this is common in web dev).
| wpietri wrote:
| For sure. I do most of my work in situations of high
| volatility of domain and requirements and relatively high
| risk. (E.g., startups, projects in new areas.)
|
| Static typing really appeals to me on a personal level. I
| enjoy the process of analysis it requires. I love the notion
| of eliminating whole classes of bugs. It feels way more tidy.
| I took Odersky's Scala class for fun and loved it.
|
| But in practice, they're just a bad match for projects where
| the defining characteristic is unstable ground. They force
| artificial clarity when the reality is murky. And they impose
| costs that pay off in the long run, which only matters if the
| project has a long run. If I'm building something where we
| don't know where we're going, I'll reach for something like
| Python or Ruby to start.
|
| This has been brought home to me by doing interviews
| recently. I have a sample problem that we pair on for an hour
| or so; there are 4 user stories. It involves a back end and a
| web front end. People can use any tools they want. My goal
| isn't to get particular things done; it's to see them at
| their best.
|
| After doing a couple dozen, I'm seeing a pattern: developers
| using static tooling (e.g., Java, TypeScript) get circa half
| as much done as people using dynamic tooling (Python, plain
| JS). In the time when people in static contexts are still
| defining interfaces and types, people using dynamic tools are
| putting useful things on the page. Making a change in the
| static code often requires multiple tweaks in situations
| where it's one change in the dynamic code. It makes the extra
| costs of static tooling really obvious.
|
| That doesn't harm the static-language interviewees, I should
| underline. The goal is to see how they work. But it was
| interesting to see that it wasn't just me feeling the extra
| costs. And those costs are only worth paying when they create
| payoffs down the road.
| mixmastamyk wrote:
| This is the best comment on the subject and should be at
          | the top rather than the current dogmatic ones.
| giantrobot wrote:
| I have had the exact same experience. There's lots of
| utility in statically typed languages. They're great if
| your problem space is well defined. With respect to type
| checking, it's like a jig in wood or metal working. You
| trade flexibility for correctness.
|
| When the problem space is less well defined the type-
| related boilerplate adds a lot of friction. It's not
| impossible to overcome that friction but it slows down
| progress. When you're under a tight deadline development
| velocity is often more valuable than absolute correctness
| or even overall runtime efficiency.
|
| A delivered product that works is usually more valuable
| than an undelivered product that's more "correct" or
| efficient. A development project is just a cost (for
| various values of cost) until it ships.
| wpietri wrote:
| Definitely. And for me the early stages of a product are
| often about buying information. "Users say they want X,
| so let's ship X and see." Key to exploring a product
| space is tight feedback loops between having an idea and
| seeing what people really do. It's only once I have
| enough active users (especially active _paying_ users) to
| justify the project that I have some confidence about
| what "long term" really means for the code base.
| garethrowlands wrote:
| "type-related boiler plate"
|
| That phrase makes me sad. Mainstream languages have a lot
| of scope for improvement in their type systems.
| j1elo wrote:
| Great comment. There are no silver bullets. I am _Team
| static typing_ , but I recognize how heavy a burden it
| would be to start a purely exploratory development in Rust
| or Java. It just "cuts your wings" in the name of
| correctness... well, sometimes it is _useful_ to have the
| ability to start with a technically incorrect
| implementation that only fails in a corner case
| that is not your main point of research.
|
| On the other hand, as the initial code grows and grows, the
| cost of moving it all to a saner language grows too...
| discouraging a rewrite. So we end up with very complex
| production software that started as dynamic and is still
| dynamic.
| [deleted]
| b3morales wrote:
| Perhaps the Goldilocks mixture will be languages that
| allow type annotations but don't require them (e.g.
| Typescript, Elixir, Racket, and I think this is how
| Python's works).
| wpietri wrote:
| Yeah, I've been using Python's gradual typing for a
| while. It's not perfect, but I'm excited for the
| possibilities. But the real test is to see what it's like
| on a large, long-lived project, so I'm keeping an open
| mind. I figure if that doesn't work fully, it'll still be
| a nice step toward things that can be pulled out as
| isolated services.
| rowanG077 wrote:
| I'm one of those people. But when I use dynamic
| languages, it's in spite of them being dynamic. I choose Python
| because it has insane library support, I don't like Python as
| a language though. I would instantly choose Haskell if it had
| even half the available libraries. But I can't justify having
| to write everything myself and take 10x as long.
| seanwilson wrote:
| > The set of people using dynamic languages isn't just "those
| that haven't had their eyes opened yet to what static
| languages can do".
|
| This was more aimed at people who are new to the idea of
| parsing over validating. In a strong statically typed
| language, the type system would naturally guide you to use
| this approach so if this isn't natural to you then time in
| other languages would probably be worthwhile.
| RHSeeger wrote:
| My misunderstanding then. Your comment came across (to me)
| as saying that people using dynamic language just don't
| know any better; that, if they would just learn a static
| language, they would suddenly understand the error of their
| ways. That's the thought that I was responding to.
| steveklabnik wrote:
| > It's weird how long it's taking for people to rediscover why
| strong static types were a good idea.
|
| I don't think it's weird. Most of those languages were not
| popular in industry for various reasons, and the ones that
| _were_ (especially in say, the 90s) did not have particularly
| capable static type systems. The boilerplate/benefit ratio was
| all off.
|
| The way I describe this dichotomy personally is, I would rather
| use Ruby than Java 1.5. I would rather use Rust than Ruby.
| (Java 1.5 is the last version of Java I have significant
| development experience in, and they've made the type system
| much more capable since those days.)
| archsurface wrote:
| "It's weird how long it's taking for people to rediscover why
| strong static types were a good idea." It sounds like you're
| projecting your limited experience, e.g. you've dismissed all MS
| dev. Many people have been using strong static typing for decades.
| benefits have never been out of sight. Many use dynamic, but
| many have always used strong static.
| lukashrb wrote:
| For what it's worth: people don't use dynamic languages because
| they don't know better or have never used a static language. To
| better understand what dynamic languages bring to the table,
| here are some disadvantages of static types to consider:
|
| Static types are awesome for local reasoning, but they are not
| that helpful in the context of the larger system (this already
| starts at the database, see idempotency mismatch).
|
| Code with static types is sometimes larger and more complex
| than the problem it's trying to solve.
|
| They tightly couple data to a type system, which (can)
| introduce incidental complexity.
|
| > (I'm still waiting for pattern matching + algebraic data
| types)
|
| This is a good example: if you pattern match on a specific
| structure (e.g. the position of fields in your algebraic data
| type), you tightly couple your program to that particular
| structure. If the structure changes, you may have to change
| all the code which pattern matches on it.
| lolinder wrote:
| This argument is common, but I've never understood how a
| dynamically typed language is supposed to avoid coupling
| algorithms to data structures.
|
| When using a data structure, I know what set of fields I
| expect it to have. In TypeScript, I can ask the compiler to
| check that my function's callers always provide data that
| meets my expectations. In JavaScript, I can check for these
| expectations at runtime or just let my function have
| undefined behavior.
|
| Either way, if my function's assumptions about the data's
| shape don't turn out to be correct, it _will_ break, whether
| or not I use a dynamic language.
|
| It seems that most of the people who make this argument
| against static typing are actually arguing against violations
| of the Robustness Principle[0]: "be conservative in what you
| send, be liberal in what you accept".
|
| A statically typed function that is as generous as possible
| should be no more brittle against outside change than an
| equally-generous dynamically typed function. The main
| difference is that the statically typed function is explicit
| about what inputs it has well-defined behavior for.
|
| [0] https://en.wikipedia.org/wiki/Robustness_principle
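|
| A quick Python sketch of that contrast (hypothetical names; the
| TypedDict version lets a static checker verify callers up front,
| while the dynamic version defers the same shape check to runtime):

```python
from typing import TypedDict

class Point(TypedDict):
    x: float
    y: float

def norm(p: Point) -> float:
    # a static checker (e.g. mypy) verifies callers pass the right shape
    return (p["x"] ** 2 + p["y"] ** 2) ** 0.5

def norm_dynamic(p) -> float:
    # dynamic style: the same expectation, checked at runtime instead
    if not isinstance(p, dict) or "x" not in p or "y" not in p:
        raise TypeError("expected a mapping with 'x' and 'y'")
    return (p["x"] ** 2 + p["y"] ** 2) ** 0.5
```

| Either way the function breaks on the wrong shape; the only
| question is when you find out.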
| tikhonj wrote:
| The example with pattern matching doesn't have anything to do
| with static types. You'll have exactly the same problem if
| you pattern match against positional arguments in Python:
|             match event.get():
|                 case Click((x, y)):
|                     handle_click_at(x, y)
|
| (Example from PEP 636[1].)
|
| In both Python and statically typed languages you can avoid
| this by matching against field names rather than positions,
| or using some other interface to access data. This is an
| important design aspect to consider when writing code, but
| does not have anything to do with dynamic programming. The
| only difference static typing makes is that when you _do_
| change the type in a way that breaks existing patterns, you
| can know statically rather than needing failing tests or
| runtime errors.
|
| The same is true for the rest of the things you've mentioned:
| none are specific to static typing! My experience with a lot
| of Haskell, Python, JavaScript and other languages is that
| Haskell code for the same task tends to be _shorter_ and
| _simpler_ , albeit by relying on a set of higher-level
| abstractions you have to learn. I don't think much of that
| would change for a hypothetical dynamically typed variant of
| Haskell either!
|
| [1]: https://www.python.org/dev/peps/pep-0636/#matching-
| sequences
| lukashrb wrote:
| You're absolutely right. I guess I mentioned pattern
| matching in particular because of the cited sentence from
| OP "I'm still waiting for pattern matching + algebraic data
| types".
|
| > The same is true for the rest of the things you've
| mentioned: none are specific to static typing!
|
| Sure, I could be wrong here. I frequently am. But could you
| point out why you think that?
| giovannibonetti wrote:
| When you said "idempotency mismatch" you were meaning
| impedance mismatch, right?
| lukashrb wrote:
| You are right! Thank you for correcting me.
| tome wrote:
| Strange if so because it's the "Object-relational impedance
| mismatch" not the "Static type-relational impedance
| mismatch".
|
| https://en.wikipedia.org/wiki/Object%E2%80%93relational_imp
| e...
| yashap wrote:
| This matches my personal experience, for sure. I started out
| writing Python (pre-type-annotations) and JS, with lots of
| "raw" dicts/objects. For the past ~7 years though, I've written
| mostly Scala, but also a decent amount of TypeScript, Go and
| Java, and it's completely transformed how I code, dramatically
| for the better.
|
| Now, even in the rare case where I write some Python, JS or
| PHP, I write it in a very statically typed style, immediately
| parsing input into well-thought-out domain classes. And for
| backend services, I almost always go with 3 layers of models:
|
| 1) Data Transfer Objects. Map directly to the wire format, e.g.
| JSON or Protobuf. Generally auto-generated from API specs, e.g.
| using Open API Generator or protoc. A good API spec + code gen
| handles most input validation well
|
| 2) Domain Objects. Hand written, purely internal to the backend
| service, faithfully represent the domain. The domain layer of
| my code works exclusively with these. Sometimes there's a
| little more validation when transforming a DTO into a domain
| model
|
| 3) Data Access Objects. Basically a representation of DB
| tables. Generally auto-generated from DB schemas, e.g. using
| libs like Prisma for TS or SQLBoiler for Go
|
| Can't imagine going back to the "everything is a dictionary"
| style for any decent sized project, it becomes such a mess so
| quickly. This style is a little more work up front, when you
| first write the code, but WAYYYYYY easier to maintain over
| time, fewer bugs and easier to modify quickly and confidently,
| with no nasty coupling of your domain models to either DB or
| wire format concerns. And code gen for the DTO and DAO layers
| makes it barely more up-front work.
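|
| A minimal Python sketch of the DTO -> domain-object step described
| above (UserDto, User and parse_user are made-up names for
| illustration):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UserDto:                 # 1) wire format: stringly typed, anything optional
    email: Optional[str]
    age: Optional[str]

@dataclass(frozen=True)
class User:                    # 2) domain object: invariants hold by construction
    email: str
    age: int

def parse_user(dto: UserDto) -> User:
    # the only place the DTO's looseness is dealt with
    if not dto.email or "@" not in dto.email:
        raise ValueError("invalid email")
    if dto.age is None or not dto.age.isdigit():
        raise ValueError("invalid age")
    return User(email=dto.email, age=int(dto.age))
```

| The domain layer then only ever sees a User, never a raw DTO.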
| chriswarbo wrote:
| I have the same feeling after spending a few years with
| Haskell, StandardML, Agda, Idris, Coq, etc.
|
| One trick I've found very useful is to realise that Maybe (AKA
| Option) can be thought of as "a list with at most one element".
| Dynamic languages usually have some notion of list/array, which
| we can use as if it were a Maybe/Option type; e.g. we can
| follow a 'parse, don't validate' approach by wrapping a
| "parsed" result in a list, and returning an empty list
| otherwise. This allows us to use their existing 'map',
| 'filter', etc. too ;)
|
| (This is explored in more detail, including links to logic
| programming, in
| https://link.springer.com/chapter/10.1007%2F3-540-15975-4_33 )
|
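| In Python, that trick might look like this (parse_port and
| describe are hypothetical names):

```python
# "Maybe as a list with at most one element": a parser returns [value]
# on success and [] on failure, so callers can reuse ordinary list
# operations (map, filter, comprehensions) instead of None checks.

def parse_port(s):
    return [int(s)] if s.isdigit() and 0 < int(s) < 65536 else []

def describe(candidates):
    # flatMap over the 0-or-1 results; failures just disappear
    return [f"port {p}" for c in candidates for p in parse_port(c)]
```

| e.g. describe(["80", "oops", "443"]) keeps only the parses that
| succeeded.
|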
| If we want to keep track of useful error messages, I've found
| Scala's "Try" type to be useful ('Try[T]' is isomorphic to
| 'Either Throwable T'). Annoyingly, most dynamic languages have
| no built-in sum type; the closest thing is usually a tagged pair
| like '[true, myFoo]'/'[false, myException]', which is pretty naff.
| np_tedious wrote:
| > a list with at most one element
|
| I've found scala or even LINQ to really hammer down this
| point, even to those who aren't into FP very much. Doing that
| map/flatmap makes it click for just about anyone
| dgb23 wrote:
| This principle can be applied to dynamic languages as well if you
| have some mechanism such as type hinting, pre-conditions etc.
| that is checked by a linter during development; even if it isn't
| checked statically, you can still enforce it at runtime with
| sufficient error handling.
|
| The essential point of this blog post is to avoid "shotgun
| parsing", where parsing/validating is done from a purely
| procedural standpoint, so that it matters _when exactly_ it
| happens. In the paper "Out of the Tar Pit" it is asserted that
| this leads to "accidental complexity" (AKA "pain and anxiety"),
| which is something every programmer has experienced before,
| possibly many times.
|
| I've become a fan of declarative schemas (json-schema/OpenAPI,
| clojure spec etc.) to express this kind of thing. Usually this is
| used at the boundaries of an application (configuration, web
| requests etc.), but there are many more applications for this
| within the flow of data transformations. If you apply the "parse
| don't validate" principle you turn schema-validated (sic!) data
| into a new thing. Whether that is a "ValidatedData" type or meta
| data, a pre-condition or runtime check says more about the
| environment you program in than about the principle under
| discussion. The benefit however is clear: your code asserts that
| it requires parsed/validated data _where it is needed_ , instead
| of _when it should happen_.
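|
| As a small Python sketch of that idea (all names hypothetical):
| the boundary check produces a new, trusted value, and downstream
| code requires the type rather than re-running the checks:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ValidatedConfig:
    host: str
    port: int

def parse_config(raw: dict) -> ValidatedConfig:
    # validate once at the boundary, producing a new typed value
    host = raw.get("host")
    port = raw.get("port")
    if not isinstance(host, str) or not host:
        raise ValueError("host must be a non-empty string")
    if not isinstance(port, int) or not (0 < port < 65536):
        raise ValueError("port must be an int in 1..65535")
    return ValidatedConfig(host=host, port=port)

def connect(cfg: ValidatedConfig) -> str:
    # requires parsed data *where it is needed*: no re-validation here
    return f"{cfg.host}:{cfg.port}"
```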
| frogulis wrote:
| I think the article goes a little further than what you
| describe -- it would have you use a strong type that _cannot_
| represent illegal values.
|
| There's a follow-up article by the same author (that I
| unfortunately can't find), in which she explains this point.
|
| As an example, returning a NonZero newtype over Int is not as
| type safe as using an ADT that lacks a zero value altogether.
| Using a NonEmpty newtype over List is not as type safe as using
| the NonEmpty ADT that has an element as part of its structure.
|
| Basically newtype still has use, but it is not as airtight as a
| well-designed ADT.
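|
| Roughly, in Python (hypothetical types): a NonEmpty whose
| structure carries its first element makes emptiness
| unrepresentable, whereas a newtype-style wrapper merely promises
| it:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NonEmpty:
    head: int          # the guaranteed first element, part of the structure
    tail: tuple = ()   # possibly empty rest

    def to_list(self):
        return [self.head, *self.tail]

def first(xs: NonEmpty) -> int:
    # no emptiness check needed: the type cannot represent an empty list
    return xs.head
```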
| sgift wrote:
| > This principle can be applied to dynamic languages as well if
| you have some mechanism such as type hinting, pre-conditions
| etc. that are checked by a linter during development, even if
| it isn't, you can still use it at runtime with sufficient error
| handling.
|
| Every time I read something like this my mind translates it to
| "after building an ad-hoc compiler you can do all the things a
| compiler can do. Just not as well, but you can do it." -- Same
| with "I don't need a compiler, my tests stop all this kind of
| bugs"
| dgb23 wrote:
| I know of the advantages of static typing and very much
| appreciate them. My point was more about how the concept in
| the article may be translated to other types of tooling.
| mirekrusin wrote:
| In typescript, parsing/asserting types with combinators works very
| well at merging the runtime with the static type system [0]. It
| has to be used at the I/O boundary; from there the data enters the
| static type system's guarantees, no further assertions are
| necessary, and it makes for a very nice codebase.
|
| [0] https://github.com/appliedblockchain/assert-combinators
| lloydatkinson wrote:
| I wish it had actual proper examples. I've no idea how to use
| that.
| mirekrusin wrote:
| Thank you, you are right. I'll add examples and ping here.
| hermanradtke wrote:
| Check out https://github.com/gcanti/io-
| ts/blob/master/index.md instead. I find it more composable
| and you can define a codec and get a native type from it so
| you are only defining things once.
| mirekrusin wrote:
| Assert combinators are composable, light, and terse (very
| little verbosity); types can be defined in a single place, and
| instead of a type/interface definition you can use the return
| type of the assert function.
|
| They don't go into deep category theory; you won't find
| monads and friends. They are first-order and straightforward:
| any typescript developer can pick them up in minutes - this is
| by design. It stops at combinators in typescript to solve a
| very specific problem and nothing more. Haskell in ts is
| not the goal of this package.
| uryga wrote:
| from a look at the readme, you combine those `$.TYPE` things
| to build a validation function that checks if its argument
| matches some pattern (and throws an exception if it doesn't).
|         import * as $ from '@appliedblockchain/assert-combinators'
|
|         const validateFooBar = (
|           $.object({
|             foo: $.string,
|             bar: $.boolean
|           })
|         )
|
|         // probably roughly equivalent to
|         /*
|         const validateFooBar = (x) => {
|           console.assert(
|             typeof x === 'object'
|             && typeof x.foo === 'string'
|             && typeof x.bar === 'boolean'
|           )
|           return x
|         }
|         */
|
|         const test1 = { foo: "abc", bar: false }
|         const test2 = { foo: 0, quux: true }
|
|         const { foo, bar } = validateFooBar(test1) // ok
|         const oops = validateFooBar(test2)         // throws error
|
| the source is pretty readable too if you want to get an idea
| how it works.
|
| https://github.com/appliedblockchain/assert-
| combinators/blob...
|
| https://github.com/appliedblockchain/assert-
| combinators/blob...
| ukj wrote:
| Software Engineers: Parse, don't validate.
|
| Mathematicians: Parsing is validation
|
| https://gallais.github.io/pdf/draft_sigbovik21.pdf
| pwdisswordfish8 wrote:
| The point being, the converse of 'parsing is validation' is not
| true.
| ukj wrote:
| Then you have some formally inexpressible/impredicative
| notion of "validation" in mind. For posterity (lifting from
| the depths of the threads):
|
| General case: Validating random data as input into some
| program.
|
| Particular case: Validating random source code (data) as
| input into some compiler (program).
|
| Do compilers parse or validate?
|
| > "the converse of 'parsing is validation' is not true."
|
| If that were the case then you should be able to give an
| example of a compiler validating random source code (data)
| but not parsing it.
|
| What determines the validity of random input is precisely a
| compiler's ability to parse it.
| nsajko wrote:
| I think that you actually agree with the comment you
| responded to, it's just that you misinterpreted what it was
| trying to say.
| ukj wrote:
| I certainly don't disagree (that doesn't mean I agree).
|
| The purpose of the conversation is to arrive at mutually
| acceptable interpretation.
| ukj wrote:
| The word "is" implies an isomorphism.
|
| If you see it differently you are implicitly assuming a non-
| formalist perspective on what "validation" means. Tell us
| about it.
| codetrotter wrote:
| The word "is" is also often used informally to mean "is a
| kind of".
| ukj wrote:
| "A kind of" is precisely its formal use from the PoV of a
| type theorist.
|
| Two things are the same type of thing if they share all
| of their extensional properties.
|
| That is what it means for two things to be
| identical/equal.
| codetrotter wrote:
| But what I am saying is that parsing is a kind of
| validation, while not all validation is parsing.
|
| For example let's say that I have written an HTTP API
| that accepts application/x-www-form-urlencoded data to
| one of its endpoints. Let's say `POST /users`, and this
| is where the client-side application posts data to.
|
| Now I can implement this in many ways. I can for example
| define
|
|     pub struct Person {
|         name: String,
|         phone_number: String,
|     }
|
| But how I populate this struct can determine whether I am
| actually parsing or not, even if most of the code aside
| from that is the same.
|
| And of course I could go further and define types for the
| name and the phone number but in this case lets say that
| I have decided that strings are the proper representation
| in this case.
|
| If the fields of my struct were directly accessible
|
|     pub struct Person {
|         pub name: String,
|         pub phone_number: String,
|     }
|
| And in my HTTP API endpoint for `POST /users` I do the
| following:
|
|     // ...
|     let name = post_data.name;
|     let phone_number = post_data.phone_number;
|     let norwegian_phone_number_format =
|         Regex::new(r"^(\+47|0047)?\d{8}$").unwrap();
|     // ...
|
| And I didn't bother to write out the rest of the code
| here for this example but you get the gist.
|
| The point is that here I am doing some rudimentary
| validation on the phone number, requiring it to be in
| Norwegian format. But I am enforcing this in the
| implementation of the handler for the HTTP API endpoint,
| rather than in the data type itself.
|
| Whereas if I was instead doing
|
|     pub struct Person {
|         name: String,
|         phone_number: String,
|     }
|
|     impl Person {
|         pub fn try_new(name: String, phone_number: String)
|             -> std::result::Result<Self, PersonDataError>
|         {
|             // ...
|             let norwegian_phone_number_format =
|                 Regex::new(r"^(\+47|0047)?\d{8}$").unwrap();
|             // ...
|         }
|     }
|
| Now I've moved the validation into an associated function
| of the type itself, and I've made the fields of the
| struct inaccessible from the outside.
|
| And in this manner, even though my validation is still
| rudimentary, and a type purist might find the type
| insufficiently constrained, I have indeed in my own book
| gone from just validation to actual parsing. Because I
| have made it so that the construction of the type
| enforces the constraints on the data.
| ukj wrote:
| You are over-complicating this into obscurity.
|
| General case: Validating random data as input into some
| program.
|
| Particular case: Validating random source code (data) as
| input into some compiler (program).
|
| Do compilers parse or validate?
|
| "parsing is validation, but validation is not parsing" if
| that were true then you should be able to give an example
| of a compiler doing some sort of validation on the random
| source code (data) that is not parsing.
|
| The very thing which determines the validity of random
| source code is the compiler's ability to parse it.
| codetrotter wrote:
| But I'm not talking about compilers here
| ukj wrote:
| Why not?
|
| Compilers are computable functions.
|
| If "Parsing is validation, but validation is not parsing"
| is true then it is also true about compilers.
| [deleted]
| thereare5lights wrote:
| > The word "is" implies an isomorphism.
|
| Are you talking about a bijective mapping or are you saying
| it's a synonym for identical?
|
| Because the former doesn't make any sense here and the
| latter is not true.
|
| Red is a color does not imply that all colors are red.
| ukj wrote:
| I am talking about the polymorphic use of the verb "is"
| during the process of formalization.
|
| "Red is a color" can be formalized as "Red is a type of
| color" or "Red is member of set Colors".
|
| You can't formalize "Color is red" because it doesn't
| mean anything.
|
| When I say "Parsing is validation" I am using the verb
| "is" to mean an isomorphism.
| [deleted]
| pwdisswordfish8 wrote:
| 'A square is a rectangle' means squares are isomorphic to
| rectangles?
| ukj wrote:
| You are tripping up over polymorphism. "Is" means many
| things - which meaning you infer is precisely parsing!
|
| "A square is a rectangle" means "A square is a TYPE of
| rectangle" (at least, that is what I am parsing it as).
|
| "Parsing is Validation" means Parsing is isomorphic to
| Validation.
|
| How do I know? Because that is how I want you to parse
| it.
| jhgb wrote:
| > "A square is a rectangle" means "A square is a TYPE of
| rectangle" (at least, that is what I am parsing it as).
|
| In that case your former statement that 'The word "is"
| implies an isomorphism' seems to be wrong.
| ukj wrote:
| It may be wrong in your model/interpretation of my words,
| but it's not wrong in my interpretation of my words.
| pwdisswordfish8 wrote:
| > 'When I use a word,' Humpty Dumpty said in rather a
| scornful tone, 'it means just what I choose it to mean --
| neither more nor less.'
| ukj wrote:
| parse verb. resolve (a sentence) into its component parts
| and describe their syntactic roles.
|
| In computer science what we do is precisely syntax
| analysis. Determining the meaning of operators.
|
| Mathematicians have the exact same problem with respect
| to the equality operator.
|
| https://ncatlab.org/nlab/show/equality#DifferentKinds
| cjfd wrote:
| Software engineers like efficiently running software.
| Mathematicians like beautiful definitions. Scientists like non-
| trivial discoveries.
|
| This paper.... uh.... what exactly is it good for?
|
| I suppose it could be kind of nice as some kind of
| undergraduate paper writing project kind of thing but it looks
| too professional for that.... I am kind of at a loss why this
| was written. Maybe it is some strange kind of satire....
| ukj wrote:
| This paper is good for parsing/validating your source code
| (from the view-point of your compiler/interpreter).
|
| Code is data after all.
| squiddev wrote:
| It's written for sigbovik 2021 [1][2], which is very much a
| joke conference. Other papers this year were "Lowestcase and
| Uppestcase letters: Advances in Derp Learning" and "On the
| dire importance of mru caches for human survival (against
| Skynet)".
|
| [1]: http://www.sigbovik.org/ [2]:
| http://www.sigbovik.org/2021/proceedings.pdf
| m3koval wrote:
| SIGBOVIK is a parody of computer science conferences. It's a
| running joke hosted on April Fools Day every year at CMU -
| and apparently a quite convincing one. ;-)
|
| Source: I attended SIGBOVIK a few times in grad school.
| Drup wrote:
| To everyone in this subthread: sigbovik is a conference
| published every 1st of April.
|
| This paper is an April's fool joke. I didn't think people could
| take that one seriously. I guess it's a good April's fool then.
| :)
| ukj wrote:
| The conference is indeed a spoof, but in so far as what
| Mathematicians call a "proof" - the paper contains one. Agda
| is a proof assistant in the spirit of the Calculus of
| Constructions (
| https://en.wikipedia.org/wiki/Calculus_of_constructions ).
|
| So is the joke on Computer Scientists or Mathematicians? You
| decide ;)
|
| Beware of bugs in the above code; I have only proved it
| correct, not tried it --Donald Knuth
| Drup wrote:
| Sigbovik's jokes are of the kind where the premise is
| completely bonkers. The rest of the development is done
| with the utmost rigor to highlight said bonkersitude:
| reductio ad absurdum.
| ukj wrote:
| Yeah, but that is precisely how inductive types work.
|
| "Bonkers" premises. Iterate, iterate, iterate. "Bonkers"
| conclusions. GIGO.
|
| And yet the result is reified, exists and speaks for
| itself. So what is so "absurd" and "bonkers" about a
| result that is right before your eyes?
|
| https://en.wikipedia.org/wiki/Reification_(computer_scien
| ce)
| rualca wrote:
| The wide adoption of Flask as Python's backend development
| framework of choice makes it quite clear that software
| developers have a hard time picking up on April fools' jokes.
| 411111111111111 wrote:
| Or how great April fool's ideas can actually be if they
| turn out to be real
| kortex wrote:
| This principle is how pydantic[0] utterly revolutionized my
| python development experience. I went from constantly having to
| test functions in repls, writing tons of validation boilerplate,
| and still getting TypeErrors and NoneTypeErrors and
| AttributeErrors left and right to like...just writing code. And
| it _working_! Like one time I wrote a few hundred lines of python
| over the course of a day and then just ran it... and it _worked_.
| I just sat there shocked, waiting for the inevitable crash and
| traceback to dive in and fix something, but it never came. _In
| Python!_ Incredible.
|
| [0] https://pydantic-docs.helpmanual.io/
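|
| For readers unfamiliar with the style, here's a stdlib sketch of
| the pattern pydantic automates (made-up Order/parse_order names;
| pydantic derives the coercion and validation for you from the
| type annotations):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Order:
    item: str
    quantity: int

def parse_order(raw: dict) -> Order:
    # validate and coerce once at construction; afterwards every Order
    # in the program is known-good, so no defensive checks downstream
    item = raw.get("item")
    if not isinstance(item, str) or not item:
        raise ValueError("item must be a non-empty string")
    quantity = int(raw["quantity"])  # coerces "3" -> 3, raises on junk
    if quantity < 1:
        raise ValueError("quantity must be positive")
    return Order(item=item, quantity=quantity)
```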
| jimmaswell wrote:
| I've found this to be simply a matter of experience, not
| tooling. As the years go by I find the majority of my code just
| working right - never touched anything like pydantic or
| validation boilerplate for my own code, besides having to write
| unit tests as an afterthought at work to keep the coverage
| metric up.
| vikingcaffiene wrote:
| Man, for a dev with as much experience as you're claiming to
| have, this comment ain't a great look.
|
| I'd argue that the more experience you get the more you write
| code for other people which involves adding lots of tooling,
| tests, etc. Even if the code works the first time, a more
| senior dev will make sure others have a "pit of success" they
| can fall into. This involves a lot more than just some "unit
| tests as an afterthought to keep the coverage up."
| JPKab wrote:
| It's an immediate tell when someone makes statements like
| the one you're replying to.
|
| It immediately tells me that they've never worked on large
| software projects, and if they have they haven't worked on
| ones that lasted more than a few months.
|
| I apologize to folks reading this for my rather aggressive
| tone but I've been writing software for a long time in
| numerous languages, and people with the unit tests as an
| afterthought attitude are typically rather arrogant and
| foolhardy.
|
| The most recent incarnation I've encountered is the hotshot
| data scientist who did okay in a few Kaggle competitions
| using Jupyter notebooks, and thinks they can just write
| software the way they did for the competitions with no test
| of any kind.
|
| I had one of these on my team recently and naturally I had
| to do 95% of the work to turn anything he produced into a
| remotely decent product. I couldn't even get the guy to use
| nbdev, which would have allowed him to use Jupyter to write
| tested, documented, maintainable code.
| jimmaswell wrote:
| I've worked on large scale projects for a long time. A
| large portion of the kind of code I've written is
| impractical or impossible to actually "unit test" e.g.
| Unity3D components or frontend JS that interacts with a
| million things. When something weird is going on I'll
| have to dig in with console logs and breakpoints.
|
| On certain backend code where I am able to do unit tests,
| they do catch the occasional edge case logic error but
| not at a rate that makes me concerned about only checking
| them in some time after the original code, which I'll
| have already tested myself in real use as I went along.
| mixmastamyk wrote:
| You got paid to do the work presumably. You might also be
| able to push back on it. Coding standards should be a
| thing just about anywhere competent.
|
| In short, there are choices besides, "I alone have to do
| all the hard work."
| JPKab wrote:
| I quit the company and the team as a result of the bosses
| refusing to make their pet data scientists write remotely
| professional code.
|
| I was more experienced with predictive algorithms and
| deep learning than any of the data scientists at the
| company but because they were brought in from an
| acquisition of a company that had an undeserved
| reputation due to a loose affiliation with MIT, they were
| treated like magicians while the rest of us were treated
| like blacksmiths.
|
| I had the choice and I made the choice to leave. And of
| course I raised hell with the bosses about them not
| writing remotely production quality code that required
| extensive refactoring.
|
| And yes I was paid to do the work but the work occupied
| time that I could have spent working on the other
| projects I had that were more commercially successful but
| less sexy to Silicon Valley VCs who look at valuations
| based on other companies' newest hottest product.
| [deleted]
| mixmastamyk wrote:
| Adding lots, no. I agree with the grandparent.
|
| Keeping the code simple, finding the right abstractions, and
| untangling coupling get the most bang for the buck. See
| the "beyond pep8" talk for an enlightened perspective.
|
| That said, lightweight testing and tools like pyflakes to
| prevent egregious errors helps an experienced dev write
| very productively. Typing helps the most with large,
| venerable projects with numerous devs of differing
| experience levels.
| kortex wrote:
| > Typing helps the most with large, venerable projects
|
| I disagree. I've started using types from the ground up
| and it helps almost equally at every stage of the game.
| Also I aggressively rely on autocomplete for methods.
| It's faster this way than usual "dynamic" or "pythonic"
| python.
|
| Part of it might be exactly because writing my datatypes
| first helps me think about the right abstractions.
|
| The big win with python is maybe 2-10% of functions, I
| just want to punt and use a dict. But I have shifted >80%
| of what used to be dicts to Models/dataclasses and it's
| so much faster to write and easier to debug.
| mixmastamyk wrote:
| I don't need to aggressively rely on tools; they are
| merely in the background. Perhaps that's what the earlier
| post about experience was getting at.
|
| Also, what makes you think I'm not aware of datatypes?
| Currently working eight hours a day on Django models.
| jolux wrote:
| Typing is just another guardrail, it's not a substitute
| for finding the right abstractions and keeping things
| simple.
| hardwaregeek wrote:
| I agree but guardrails are pretty awesome. And if people
| were saying "don't use guardrails, just drive properly",
| I'd ask why they think guardrails and driving properly
| are mutually exclusive.
| jolux wrote:
| Exactly. To be clear I'm very pro-type systems.
| hardwaregeek wrote:
| Agreed. It's like saying "oh well I just fly the airplane
| really carefully". A lot of codebases eclipse the point
| where one person can understand the whole system. Testing,
| static analysis and tooling are what allows us to keep the
| plane flying.
| mixmastamyk wrote:
| Agreed with the end of your post. However, the top post
| approaches religious dogma. I argue against that even if
| one has some good points.
| kortex wrote:
| No this was like over a week, and 100% due to the tooling.
| Pydantic, pycharm, black, mypy, and flake8. Pretty much went
| from "type hints here and there" to "what happens if I try
| writing python as if it were (95%) statically typed." I'd
| been testing well before this point but it's not the same as
| writing tests.
|
| The _development_ process is totally different when you write
| structured types _first_ and then write your logic. 10/10
| would recommend.
|
| Usual caveat: this is what makes sense to me and my brain.
| Your experience may be different based on neurotype.
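|
| A minimal stdlib-only sketch of the "types first, then
| logic" workflow (the names here are illustrative, not from
| any particular project):

```python
from dataclasses import dataclass

# Hypothetical domain type, written before any logic that uses it.
@dataclass(frozen=True)
class User:
    name: str
    age: int

    @classmethod
    def parse(cls, raw: dict) -> "User":
        # Parse once at the boundary; downstream code never re-checks.
        name = raw["name"]
        age = int(raw["age"])
        if not name:
            raise ValueError("name must be non-empty")
        if age < 0:
            raise ValueError("age must be non-negative")
        return cls(name=name, age=age)

def greeting(user: User) -> str:
    # Accepts the structured type, not a dict, so no defensive checks.
    return f"Hello, {user.name}!"
```

| Downstream functions take a User rather than a dict, so the
| checks live in exactly one place.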
| Scarbutt wrote:
| _The development process is totally different when you
| write structured types first and then write your logic.
| 10/10 would recommend._
|
| Unless you were writing very small throwaway scripts, in
| what world were you writing your logic first and thinking
| about your data structures later?
| globular-toast wrote:
| I agree. I'm often baffled by some developers who seem to
| think dynamic typing is a minefield that inevitably goes
| wrong all the time. I note these are almost always Javascript
| programmers, though. In practice, experienced developers in
| dynamic languages like Python, Lisp, etc. rarely make such
| errors. The number of bugs we deal with that would have been
| caught early by static typing is vanishingly small.
|
| The best argument I've heard for doing type annotation is for
| documentation purposes to help future devs. But I don't
| completely buy this either. I touch new codebases all the
| time and I rarely spend much time thinking about what types
| will be passed. I can only assume it comes with experience.
|
| Type annotation actually ends up taking a hell of a long time
| to do and is of questionable benefit if some of the codebase
| is not annotated. People sometimes spend hours just trying to
| get the type checker to say OK for code that actually works
| just fine!
| exdsq wrote:
| It's okay if you're working on a blog site, less so if
| you're working on an airplane's autopilot.
| globular-toast wrote:
| JPL sent Lisp to space https://flownet.com/gat/jpl-
| lisp.html
| exdsq wrote:
| Sure, and I know people who write Python that goes into
| space too, but that doesn't mean it's the norm or even a
| good idea.
| JPKab wrote:
| I've worked with plenty of coders who talk about how awesome
| their code is even though they just write unit tests as an
| afterthought. They also talk about how they don't need
| validation and everything is just awesome.
|
| I hated working with those coders because they weren't really
| very good and their code was always the worst to maintain.
| They are the equivalent of a carpenter who brags about how
| quickly they can bang nails but can't build a stable
| structure to save their life.
| cortesoft wrote:
| Tests aren't to make sure your code works when you write it,
| they're to make sure it doesn't break when you make changes
| down the line.
| exdsq wrote:
| How do you know your code works when you write it if you
| don't test it?
| globular-toast wrote:
| You don't. But you only need to test it once (manually),
| then commit it.
|
| You write _automated_ tests so that you can keep running
| the tests later such that the behaviour is maintained
| through refactor and non-breaking changes.
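|
| For instance, a tiny behaviour-pinning test (hypothetical
| function, plain asserts) that stays cheap to re-run through
| later refactors:

```python
# A hypothetical function whose observed behaviour we want to pin down.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())

# Checked once by hand when written, then kept to guard refactors.
def test_slugify():
    assert slugify("Parse This") == "parse-this"
    assert slugify("  extra   spaces ") == "extra-spaces"

test_slugify()
```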
| syngrog66 wrote:
| you run it. look at the results or output. like the
| stdout, or a file it changed, or in a REPL or debugger.
| depends on situation
| exdsq wrote:
| Sounds laborious to manually check edge cases each time
| you change that code or its dependencies. I'd rather just
| write a test.
| chrisandchris wrote:
| Yeah I'm not sure that's how software engineering should
| work.
|
| Tests should prove a desired behaviour. Sometimes it's
| not possible to fully run code until late in some
| staging environment, just because there are a lot of
| dependencies and complexity. That's what tests are for (on
| various levels of abstraction).
| cortesoft wrote:
| I think it depends on the task. Some code we write is so
| simple and only used a few times that you don't need
| tests.
| cortesoft wrote:
| Sorry, should have said "aren't JUST to make sure your
| code works when you write it"
|
| I was specifically responding to the commenter I replied
| to, who said they didn't need tests because their code
| just worked the first time after he wrote it.
| omegalulw wrote:
| Can't you do this already with Python type annotations? I am a
| fan of typing in general (not just for data model as this seems
| to be) and using types everywhere saves a lot of debugging
| hassle and even allows for catching _some_ bugs with static
| analysis.
| JPKab wrote:
| Curious, but how does pydantic compare to marshmallow?
|
| I'm currently using marshmallow in a project, specifically
| using the functionality that builds parsers from dataclasses.
|
| I was curious what the differences were.
| goodoldneon wrote:
| My company soured on Marshmallow a while back due to
| performance. Maybe it's gotten a lot better, but it has a bad
| reputation here. Most people seem really happy after we
| started using Pydantic. Take all that with a grain of salt
| since I'm just parroting hearsay :)
| exdsq wrote:
| Just started working on a new SaaS startup and using FastAPI &
| Pydantic. The development experience has been great.
| mixmastamyk wrote:
| Sounded familiar. Worked correctly the first time it was run,
| sans types.
|
| https://www.python.org/about/success/esr/
| ElevenPhonons wrote:
| I've also found Pydantic to be a valuable library to use.
|
| However, it does have a strongly opinionated approach to
| casting that can sometimes yield non-obvious results. This
| behavior is documented, and I would suggest that new adopters
| of the library explore this casting/coercion feature in the
| context of their product/app requirements.
|
| For the most part, it's not a huge issue, but I've run into a
| few surprising cases. For example, sys.maxint, 0, '-7', 'inf',
| float('-inf') are all valid datetime formats.
|
| - https://pydantic-docs.helpmanual.io/usage/models/#data-
| conve... -
| https://gist.github.com/mpkocher/30569c53dc3552bc5ad73e09b48...
| dnadler wrote:
| We've had a similar experience using pydantic. We integrated it
| quite tightly with a django project and it's been awesome.
| theptip wrote:
| Where did you find it valuable to wire in to Django?
| k__ wrote:
| Could someone rewrite the examples in TypeScript?
|
| Some points really elude me because Haskell uses many symbols and
| is very dense.
| pengwing wrote:
| I'd call it: pattern match, don't validate.
|
| Gotta go and program more Elixir...
| spinningslate wrote:
| This is a great post. I come back to it frequently.
|
| There's beautiful clarity in the articulation, and the essence is
| easy to grasp yet powerful. It reminds me a bit of Scott
| Wlaschin's Railway Oriented Programming (ROP) [0]. As a
| technique, ROP nicely complements "parse don't validate". As an
| explanation, it's similarly simple yet wonderfully effective.
|
| I've a real admiration for people who can explain and present
| things so clearly. With ROP, for example, the reader learns the
| basics of monads without even realising it.
|
| [0]: https://fsharpforfunandprofit.com/rop/
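|
| The core of ROP can be sketched in a few lines of Python
| with a hand-rolled Result type (all names here are
| illustrative):

```python
from dataclasses import dataclass
from typing import Callable, Union

# Two-track result: success rides one track, failure the other.
@dataclass
class Ok:
    value: object

@dataclass
class Err:
    error: str

Result = Union[Ok, Err]

def bind(f: Callable[[object], Result], r: Result) -> Result:
    # Apply f only on the success track; failures pass through untouched.
    return f(r.value) if isinstance(r, Ok) else r

def parse_int(s: str) -> Result:
    try:
        return Ok(int(s))
    except ValueError:
        return Err(f"not an integer: {s!r}")

def check_positive(n) -> Result:
    return Ok(n) if n > 0 else Err(f"not positive: {n}")
```

| Chaining steps with bind means the first failure
| short-circuits past the rest, which is the "railway" switch
| in miniature.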
| ChrisMarshallNY wrote:
| I agree. It's a very well-written post. I am not a Haskell
| person, but it was quite clear to me.
|
| I feel that we don't put enough value, these days, on the
| ability to write clear, articulate exposition. Also, I believe
| that many people are not willing to read articles, books, or
| papers, of any meaningful length.
|
| Everything needs to be boiled down to <10 min. read time, or
| <18 min. TED talks.
| themulticaster wrote:
| The way you somewhat randomly mention the value of clear and
| articulate exposition makes me assume you just had to wade
| through a 300-page specification document for a government
| contract regarding pencil sharpeners or something similar. If
| that's the case, you have my sympathy.
|
| Anyways, I definitely agree.
| ChrisMarshallNY wrote:
| Not recently (thank the Gods), but I used to work for a
| defense contractor, and I have dealt with many
| specification documents (like the Bluetooth spec).
| wrycoder wrote:
| Those are the extras. This is the post:
|
| https://fsharpforfunandprofit.com/posts/recipe-part2/
| adamlett wrote:
| I think making this about the type checker is a bit of a red
| herring. There is nothing in this - otherwise excellent - advice
| that can't be applied to a dynamically typed language like Ruby.
| It's the same insight that leads OOP folks to warn against the
| Primitive Obsession code smell
| (http://wiki.c2.com/?PrimitiveObsession). It's also the insight
| that leads to the Hexagonal Architecture (
| https://en.wikipedia.org/wiki/Hexagonal_architecture_(softwa...
| ).
| Zababa wrote:
| The advantage of the type checker is that it automatically
| checks types for you. If you have only one function that
| produces a parsedArray and multiple functions that accept a
| parsedArray, you can be sure where they come from.
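|
| A sketch of that single-producer pattern in Python using
| NewType (all names here are hypothetical):

```python
from typing import NewType

# A distinct type for an already-parsed value, created in one place only.
ParsedEmail = NewType("ParsedEmail", str)

def parse_email(raw: str) -> ParsedEmail:
    # The single constructor; everything downstream trusts its output.
    if "@" not in raw:
        raise ValueError(f"invalid email: {raw!r}")
    return ParsedEmail(raw.strip().lower())

def send_welcome(address: ParsedEmail) -> str:
    # A type checker flags callers passing a bare str; no re-validation.
    return f"welcome mail queued for {address}"
```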
| adamlett wrote:
| Yes, that's literally the advantage of a static type checker.
| I don't dispute that. I'm just saying that the advice in the
| article is just as applicable in a dynamic language and
| confers the same benefits. True, you're not protected against
| accidentally using one type where another was expected, but
| that's not really the point as I see it. The point is to use
| _better_ types. Types you can lean on instead of nervously
| tiptoeing around.
| imoverclocked wrote:
| I agree but I also think it's on the right path. This seems
| partially like a "why Haskell" in disguise to me.
|
| I've run across DSLs that have three or more layers of parsing
| and validation. Embedding different languages within each other
| (eg: JSON snippets within your own DSL) definitely leads to the
| issues the article talks about.
|
| Also, growing your own parser without understanding standard
| lexer/parser basics seems far more common than it ought to be.
| I'm not talking brilliant design, rather the extremely naive
| one-character-at-a-time-in-a-really-complex-loop variety of
| design.
|
| The better level of bad is, "I know what lexers/parsers are,
| now I'll write something to basically implement a type-checking
| parser with the lexed+parsed tree as input."
|
| This article is basically stating, "Why not just get your
| parser to do it all for you in one swell foop?" When I have
| refactored code to follow this kind of design, I have never
| regretted the outcome.
| garethrowlands wrote:
| I don't think types are a red herring here. Because if you
| follow this advice, then just using logic and your source code,
| you can prove what data is valid. And, since types and
| (constructive) logic are so strongly related, then the types
| are, in some sense "there" even if you don't see them. To put
| it another way, it's nice if your computer can make the proofs
| but if it can't, does that make the theorems any less true?
___________________________________________________________________
(page generated 2021-06-26 23:01 UTC)