hngopher.com

       [HN Gopher] An INI Critique of TOML (2021)
       ___________________________________________________________________
        
       An INI Critique of TOML (2021)
        
       Author : belter
       Score  : 130 points
       Date   : 2023-09-21 10:41 UTC (12 hours ago)
        
 (HTM) web link (github.com)
        
       | kazinator wrote:
       | I almost stopped reading at the difference between "89" and 89
       | being something bad that risks making your program crash.
       | 
       | What a moronic diatribe.
       | 
       | TOML being typed makes it excellent compared to INI.
       | 
       | Nobody with anything resembling a CS degree on their wall should
       | be defending nonsense like "castable strings", and the
       | proliferation of string conversions into the application layer.
       | Let alone in C or C++.
       | 
       | Postel's law is only half right.
       | 
       | You should be conservative in what you generate (don't probe
       | every obscure corner spec of a representation, if you can avoid
       | it), and reject all inputs that do not conform to the
       | specification.
       | 
       | Programs shouldn't react in unexpected ways to bad inputs, like
       | crash, or allow an attacker to take control. But they shouldn't
       | try to reinterpret bad inputs as good, either. That's folly.
       | 
       | The only reason to follow Postel's law is economic gain at the
       | expense of the technological ecosystem.
       | 
       | If your web browser accepts broken HTML and renders it, whereas
       | the competing browser rejects it, that competing browser is
       | better for the web, but looks buggy to the naive user base, which
       | will prefer your web browser.
       | 
       | Postel's law was used as one of the weapons in the browser wars,
       | whose legacy negatively affects the web even today.
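
A minimal sketch of the "89" versus 89 point above, assuming Python's standard `configparser` module. The section and key names are hypothetical. In INI every value arrives as a string, so the conversion (and the decision to reject bad input rather than reinterpret it) falls to the application:

```python
import configparser

# Hypothetical INI content: every value is read as a string.
INI_TEXT = """
[server]
port = 89
retries = eighty-nine
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# The raw value is the string "89", not the integer 89.
raw_port = parser.get("server", "port")

# The application has to ask for the conversion explicitly.
port = parser.getint("server", "port")


def read_retries(cfg):
    """Return retries as an int, or a message if the value does not conform."""
    try:
        return cfg.getint("server", "retries")
    except ValueError as exc:
        # Reject the input instead of guessing what the author meant.
        return f"rejected: {exc}"


result = read_retries(parser)
print(repr(raw_port), port, result)
```

In a typed format the same mismatch would surface when the file is parsed, before application code sees the value.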
        
         | masklinn wrote:
         | > I almost stopped reading at the difference between "89" and
         | 89 being something bad that risks making your program crash.
         | 
         | I can only commend you for not taking a few minutes to consider
         | whether it was worth continuing when the essay starts with
         | praising postel's law, possibly the worst idea in the field
         | since "let me just run that code I received over a socket".
        
           | uxp8u61q wrote:
           | > let me just run that code I received over a socket
           | 
           | ...javascript?
        
       | umanwizard wrote:
       | Mostly minor issues that are all dwarfed by the gigantic
       | advantage TOML has over INI: the fact that it's actually
       | specified.
        
       | insanitybit wrote:
       | It's so incredible how two engineers can look at something like
       | this and draw literally the opposite conclusion. Taste is
       | certainly subjective...
       | 
       | To me, this is largely an argument _for_ TOML. I mean,
       | 
       | > wishes = "I am fine"
       | 
       | This is an array with one element????
       | 
       | > Even TOML's featured example proposes "the INI way"
       | 
       | Obviously it is the TOML way too? Hence it being the featured
       | example. That another way to express it doesn't change that...
       | 
       | It's almost comical how much their arguments are convincing me
       | that INI is a disaster. And yet they are, seemingly in all
       | seriousness, truly trying to convince me that _this is the way_.
       | 
       | What an odd field to be in.
        
         | [deleted]
        
         | Joker_vD wrote:
         | > It's so incredible how two engineers can look at something
         | like this and draw literally the opposite conclusion. Taste is
         | certainly subjective...
         | 
         | Well, in such cases it's likely because two engineers are
         | actually expressing their tastes/beliefs, merely using the
         | thing they look at as a running example. That also means that
         | the running example, ironically, is actually mostly irrelevant
         | to the topic: otherwise it wouldn't have been able to support
         | two opposite conclusions.
         | 
         | And in the field of philosophy this phenomenon is even more
         | egregious: you can find e.g. one philosopher arguing that life
         | is meaningless because it's finite, and another philosopher
         | arguing that life has meaning because it's finite.
        
         | emodendroket wrote:
         | Just yesterday a broadside against YAML was on the front page.
         | I'm sure one could find them against JSON, XML, or any other
         | possible choice.
        
         | nayuki wrote:
         | > It's so incredible how two engineers can look at something
         | like this and draw literally the opposite conclusion. [...]
         | It's almost comical how much their arguments are convincing me
         | that INI is a disaster. And yet they are, seemingly in all
         | seriousness, truly trying to convince me that this is the way.
         | 
         | That pretty much sums up how I view arguments about metric vs.
         | imperial measurement systems. Every feature of imperial that
         | someone points out as an advantage is something I see as a
         | disadvantage.
        
           | HideousKojima wrote:
           | >That pretty much sums up how I view arguments about metric
           | vs. imperial measurement systems.
           | 
           | One argument I will make in favor of imperial: feet (when
           | divided into inches) are easy to divide into thirds because
           | it's base 12 instead of base 10. This makes a lot of things
           | (especially in construction, woodworking, and others) a lot
           | easier. Of course it has a whole boatload of downsides to go
           | along with it, but that's one real and tangible benefit to
           | imperial over metric that I've experienced.
        
             | ziml77 wrote:
             | Metric should have been base 12. You get more typical
             | factors than base 10, and it's not an absurd deviation from
             | base 10 (base 60 would allow for the same factors as both
             | 12 and 10, but I doubt anyone would be happy to adopt that)
        
             | nayuki wrote:
             | That's exactly the kind of thinking that I'm alluding to.
             | Okay, if base 12 is so great, why are gallons not divisible
             | by 12? Why is our currency not divisible by 12? Why is a
             | mile 1760 yards - what if you wanted to parcel a square
             | mile into thirds on each side? Why is an inch partitioned
             | into binary fractions? Why are pounds not divisible by 12?
             | The imperial customs (it's hard to even call it a system)
             | have no internal consistency.
             | 
             | To add to the observation about lack of internal
             | consistency: Metalworkers use decimal inches. Woodworkers
             | use feet, inches, and binary fractions. Surveyors use
             | decimal feet. Some people use decimal miles. You could
             | argue that each of the aforementioned systems make sense on
             | its own, but none of them interoperate. Seriously - you'll
             | be baffled if you pick up a decimal foot measuring tape in
             | real life (they exist). Metric doesn't have this problem
             | because if someone is doing detailed work in millimetres,
             | someone is planning a house's rooms in metres, and someone
             | is organizing a town's land in kilometres, they can all
             | work with each other by simply moving the decimal place and
             | changing the prefix.
             | 
             | Another problem is that you're presupposing that the
             | products you interact with are designed in a whole number
             | of feet, and then you subdivide from there. I don't see
             | this as true at all; things come in all sizes like 2'5",
             | how are you going to divide that into thirds?
        
               | kemotep wrote:
               | Unlike metric, all the different imperial measurements
               | are just combinations of different kinds of initially
               | unrelated measurement systems. So a gallon has absolutely
               | nothing to do with foot, miles are a completely different
               | measurement system than feet, pounds and so on.
               | 
               | Needing to know the weight of a cubic hectare of water
               | and then how many gallons that is, is not a common issue.
               | Metric does give you a system that can quickly calculate
               | these things comparatively.
        
               | philwelch wrote:
               | > The imperial customs (it's hard to even call it a
               | system) have no internal consistency.
               | 
               | "Internal consistency" doesn't matter. Virtually nobody
               | ever has to convert miles to yards; the use cases for
               | measuring a distance in miles and the use cases for
               | measuring a distance in yards almost never overlap. For
               | the rare cases where they do, the units do have an
               | integer conversion.
               | 
               | For the use cases that do commonly overlap, e.g. inches
               | and feet, you get a useful-for-that-specific-domain
               | conversion factor of 12. Some domains use decimal miles
               | or decimal inches and that's totally fine; sounds to me
               | like they didn't need the metric system after all.
               | 
               | But if you're going to be so insistent on "internal
               | consistency", riddle me this: why do we measure time in
               | hours, days, weeks, months, and years rather than in
               | decaseconds, hectoseconds, kiloseconds, megaseconds,
               | etc.?
               | 
               | > if someone is doing detailed work in millimetres,
               | someone is planning a house's rooms in metres, and
               | someone is organizing a town's land in kilometres, they
               | can all work with each other
               | 
               | These people never work with each other. This is a
               | fantasy.
        
               | nayuki wrote:
               | > "Internal consistency" doesn't matter.
               | 
               | The lack of an opinionated stance on internal consistency
               | is exactly how we arrive at traditional measures and also
               | the US Customary set of units. It's far easier
               | politically to be accommodating and allow more units than
               | to put down your foot and say no, this is redundant, this
               | cannot be used.
               | 
               | > Virtually nobody ever has to convert miles to yards
               | 
               | That's mostly true because it seems yards are only used
               | to measure football fields and fabric. Everything else is
               | measured in feet, from personal heights to furniture to
               | rooms to house yards to building structures (the Empire
               | State Building is 1454 feet tall).
               | 
               | But my point still stands. You think you don't have to
               | convert between miles and feet? Okay:
               | https://www.researchgate.net/figure/a-Typical-multi-lane-
               | hig... . There's a highway exit coming up in 800 feet and
               | another in 1/2 mile. How many times longer does it take
               | to reach the second exit compared to the first exit? You
               | have no clue. In metric it's 250 m and 800 m, and it
               | obviously takes about 3 times longer to reach the second
               | exit.
               | 
               | > These people never work with each other. This is a
               | fantasy.
               | 
               | Tell me you haven't worked in engineering without telling
               | me you haven't worked in engineering. If you eyeball
               | everything and use intuition, I can see why you don't
               | care about units, conversions, and calculations. If you
               | actually need to plan and analyze things carefully before
               | you order materials and cut things, you'll quickly see
               | that having a plethora of units adds complexity and
               | chances for error without adding any functionality that a
               | pure system has (whether you're using millimetres or only
               | decimal inches).
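
The highway-exit arithmetic above can be worked through as a rough sketch. The only inputs are the distances quoted in the comment (800 feet and 1/2 mile); the variable names are illustrative:

```python
FEET_PER_MILE = 5280
METRES_PER_FOOT = 0.3048  # exact by definition

first_exit_ft = 800
second_exit_ft = 0.5 * FEET_PER_MILE  # 1/2 mile expressed in feet

# Mixed units need a conversion step before the ratio is visible.
ratio = second_exit_ft / first_exit_ft

# The same two distances in metres share one unit from the start.
first_exit_m = first_exit_ft * METRES_PER_FOOT
second_exit_m = second_exit_ft * METRES_PER_FOOT

print(f"{second_exit_ft:.0f} ft / {first_exit_ft} ft = {ratio:.1f}")
print(f"{second_exit_m:.0f} m / {first_exit_m:.0f} m = {second_exit_m / first_exit_m:.1f}")
```

Rounded to the 250 m and 800 m figures used in the comment, the ratio is 3.2, in line with the "about 3 times" estimate.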
        
               | philwelch wrote:
               | > That's mostly true because it seems yards are only used
               | to measure football fields and fabric.
               | 
               | Also shooting ranges, but yes.
               | 
               | > There's a highway exit coming up in 800 feet and
               | another in 1/2 mile. How many times longer does it take
               | to reach the second exit compared to the first exit?
               | 
               | Prior to GPS navigation nobody ever said "there's a
               | highway exit coming up in 800 feet"; road signs in the US
               | consistently use fractions of a mile. When my GPS app
               | switches units from fractions of a mile to feet, that
               | means I can see where I need to exit/turn. Calculating a
               | precise ratio doesn't matter. It's between 6 and 7 since
               | there's 5280 feet in a mile and 8*6 is 48 but 8*7 is
               | 56, but who cares?
               | 
               | Also, on a highway, I'm usually traveling at least 60
               | mph, and since there's 60 minutes in an hour, that comes
               | out to one mile per minute. Try that with your fancy
               | metric system!
               | 
               | > Tell me you haven't worked in engineering without
               | telling me you haven't worked in engineering. If you
               | eyeball everything and use intuition, I can see why you
               | don't care about units, conversions, and calculations.
               | 
               | You're imagining a scenario where a guy who's worrying
               | about fractions of an inch building a cabinet has to talk
               | to the city planner who's worried about miles and they
               | have to convert units to do that. That doesn't happen,
               | and yet you want to optimize the entire unit of
               | measurement for that specific use case at the expense of
               | more common use cases like "dividing by three".
               | 
               | When it comes to domains where consistency does matter,
               | we just don't do unit conversions. For instance, flight
               | altitude is measured in feet, even when it's thousands of
               | feet, instead of miles. If you're flying an airplane at
               | 30,000 feet, who cares how many miles that is? That's not
               | what miles are for. Likewise with domains that use feet
               | per second rather than miles per hour. Again, even
               | "metric" countries don't commit to the bit here; name me
               | a country where the highway speed limit is in meters per
               | second.
        
               | nayuki wrote:
               | > When it comes to domains where consistency does matter,
               | we just don't do unit conversions. For instance, flight
               | altitude is measured in feet, even when it's thousands of
               | feet, instead of miles. If you're flying an airplane at
               | 30,000 feet, who cares how many miles that is?
               | 
               | Wrong.
               | https://en.wikipedia.org/wiki/Gliding_flight#Glide_ratio
               | 
               | Your plane is up at 30000 feet and your engines are out.
               | The nearest airport is 47 nautical miles out. The plane
               | has a glide ratio of 15. Will you make it?
               | 
               | It's easy in metric: 9.1 km altitude, 87 km distance.
               | 
               | > Again, even "metric" countries don't commit to the bit
               | here; name me a country where the highway speed limit is
               | in meters per second.
               | 
               | I actually much prefer metres per second. It makes things
               | like kinetic energy calculations easier. If I wanted to
               | know the KE of a 1500 kg, 100 km/h car, I would first
               | need to convert to m/s. Ditto the kilowatt-hour; it needs
               | to die in favor of megajoules.
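
Both calculations in the comment above can be checked with a short sketch. It uses only the numbers given there and makes two simplifying assumptions that are not in the comment: still air, and a glide ratio that stays constant at 15:

```python
METRES_PER_FOOT = 0.3048         # exact by definition
METRES_PER_NAUTICAL_MILE = 1852  # exact by definition

# Glide problem: convert both lengths to kilometres first.
altitude_km = 30000 * METRES_PER_FOOT / 1000        # about 9.1 km
distance_km = 47 * METRES_PER_NAUTICAL_MILE / 1000  # about 87 km
glide_ratio = 15

# Horizontal distance covered per unit of altitude lost.
reach_km = altitude_km * glide_ratio
makes_it = reach_km >= distance_km

# Kinetic energy: km/h must become m/s before applying E = m * v^2 / 2.
mass_kg = 1500
speed_m_s = 100 * 1000 / 3600
kinetic_energy_j = 0.5 * mass_kg * speed_m_s ** 2

print(f"reach {reach_km:.0f} km vs {distance_km:.0f} km needed: {makes_it}")
print(f"kinetic energy: {kinetic_energy_j / 1000:.0f} kJ")
```

Under those assumptions the plane can reach roughly 137 km, so the airport at 87 km is within range, and the car carries about 579 kJ.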
        
               | philwelch wrote:
               | > Your plane is up at 30000 feet and your engines are
               | out. The nearest airport is 47 nautical miles out. The
               | plane has a glide ratio of 15. Will you make it?
               | 
               | Nautical miles and statute miles are different units
               | anyway, so my initial assertion--that you would never
               | convert flight altitude to statute miles--is still
               | correct.
               | 
               | Glide ratio varies according to speed; the plane might
               | have a glide ratio of 15 at one speed but a different
               | glide ratio at a different speed. So in practice, the
               | real world version of this word problem is more complex
               | than you make it out to be, and pilots' handbooks will
               | commonly have tables of glide distances to consult for
               | this reason.
               | 
               | > If I wanted to know the KE of a 1500 kg, 100 km/h car,
               | I would first need to convert to m/s.
               | 
               | You've adequately demonstrated that metric is more
               | convenient for arbitrary word problems that you've
               | provided. Real world applications are what I'm less
               | convinced about.
        
               | emodendroket wrote:
               | > But my point still stands. You think you don't have to
               | convert between miles and feet? Okay:
               | https://www.researchgate.net/figure/a-Typical-multi-lane-
               | hig... . There's a highway exit coming up in 800 feet and
               | another in 1/2 mile. How many times longer does it take
               | to reach the second exit compared to the first exit? You
               | have no clue. In metric it's 250 m and 800 m, and it
               | obviously takes about 3 times longer to reach the second
               | exit.
               | 
               | I'm having a hard time seeing this as much of a problem
               | in daily life. That sounds more like a word problem in
               | math class than something someone would want to typically
               | calculate on the fly. And my sense of how far something
               | at a given distance is in the car is informed more by
               | experience and intuitive sense than measurement.
        
               | gfv wrote:
               | >why do we measure time in hours, days, weeks, months,
               | and years rather than in decaseconds, hectoseconds,
               | kiloseconds, megaseconds, etc.?
               | 
               | We don't use them generally because at the time of French
               | Revolution, due to taxes, standardizing the state units
               | of measurement of physical goods was a much more pressing
               | concern than time. (If I were to guess, there were no
               | work hours limits, and thus it just hadn't crossed the
               | popular mind). Weeks were changed to have ten days
               | though.
               | 
               | That's not to say that they hadn't tried: decimal time
               | was mandatory for a few years before they realized that
               | there were too many clocks around, and pushing decimal
               | time can actually turn people hostile against the then-
               | new metric system.
               | 
               | We still kinda sorta use them in form of fractions of
               | Julian days.
        
               | thfuran wrote:
               | >why do we measure time in hours, days, weeks, months,
               | and years rather than in decaseconds, hectoseconds,
               | kiloseconds, megaseconds, etc.?
               | 
               | Because it would be astronomically expensive to slow
               | earth's rotation enough for a day to be 100 ksec.
        
               | nayuki wrote:
               | There's no theoretical problem if we define 86400 seconds
               | = 1000000 newseconds.
               | 
               | One big practical problem is that because SI is a
               | coherent system, the second is embedded in many units.
               | For example, 1 N = 1 kg m/s^2. 1 J = 1 N m. 1 V = 1 J/C.
               | 1 Pa = 1 N/m^2. And so on and so forth. So all of those
               | units will have to be replaced with units derived from
               | the newsecond. This is kind of like how SI has the unit
               | tesla but the CGS electromagnetic unit is gauss.
               | 
               | A long-term problem is that no matter what, the length of
               | the day on Earth will drift from scientifically accurate
               | atomic time. Sometime in the near future, a day will be
               | 86401 seconds. Then 86402, and so on. So the old second
               | or the new second will not solve this problem.
        
               | philwelch wrote:
               | > A long-term problem is that no matter what, the length
               | of the day on Earth will drift from scientifically
               | accurate atomic time. Sometime in the near future, a day
               | will be 86401 seconds. Then 86402, and so on. So the old
               | second or the new second will not solve this problem.
               | 
               | How long term is this problem? For instance, we
               | habitually insert "leap seconds" in order to keep things
               | lined up, but if we didn't insert leap seconds, we might
               | accumulate an error of maybe half an hour between the
               | solar meridian and "noon" in the next 500 years, which is
               | less error than we introduce by putting Madrid and
               | Belgrade in the same time zone. And in 500 years, most of
               | humanity will not even be living on the Earth anyway.
        
           | thfuran wrote:
           | It has supposed advantages?
        
             | nayuki wrote:
             | See https://www.google.com/search?q=why+imperial+is+better+
             | than+... , https://www.youtube.com/watch?v=odVPwNgD8w4 (the
             | guy argues FOR fractions!),
             | https://youtu.be/JoDQOqB5cZg?t=221 ("based on the size of
             | everyday objects"),
             | https://www.youtube.com/watch?v=Anmm7phETE4 ,
             | https://www.google.com/search?client=firefox-
             | b-d&q=why+imper... , et cetera.
        
               | thfuran wrote:
               | What a bunch of ridiculous nonsense. "I'm 6'1", which is
               | easy to visualize because it's literally about the length
               | of six adult feet end to end". I don't know how he spends
               | his time, but I doubt I've ever seen six adult feet end
               | to end, let alone with such frequency that it's a
               | familiar reference for measurement. (Not to mention that
               | adult feet are very rarely as much as one foot in
               | length).
               | 
               | Edit: Oh no, I watched another one. "It's more
               | naturalistic. Think about it: there are a dozen inches in
               | a foot just like there are a dozen eggs in a carton." I'm
               | not even sure where to start on that one.
        
         | jmholla wrote:
         | > Obviously it is the TOML way too? Hence it being the featured
         | example. That another way to express it doesn't change that...
         | 
         | It's not even another way to express it. The example is a
         | completely different data structure. The initial proposed "TOML
         | way" is an array while the "INI way" is a table. The then
         | corrected "TOML way" IS the equivalent table in TOML.
         | 
         | It's a ridiculous strawman along with several other oddities
         | where the author loves to give INI the benefit of its weirdness
         | but then rails against TOML's. The author loves INI and refuses
         | to understand why people wanted and needed TOML to fix its
         | deficiencies.
         | 
         | I'd love to tear this thing apart point-by-point, but I think
         | it'd be more cathartic for me than useful to anyone else.
        
       | booleandilemma wrote:
       | _TOML forces arrays to be encapsulated within square brackets
       | (exactly like section paths do), although humans do not need
       | square brackets for recognizing when something is a list._
       | 
       | Isn't TOML meant to be read by programs?
       | 
       | Don't square brackets make things less ambiguous for both
       | machines and humans?
       | 
       |  _There is no way to convince a human that something composed of
       | only one member is a list (if you think differently, chances are
       | that you are partly non-human)._
       | 
       | The author should speak for himself.
       | 
       |  _TOML forces arrays to be always comma-separated, although a
       | human can recognize a list even when the separator is a
       | mushroom._
       | 
       | I stopped reading after this.
        
       | p2detar wrote:
       | We use INI files to configure our on-premises product. Four INI
       | files to be precise. Database connection url, server hostnames,
       | ports, various Boolean and debug flags, and even CSV string
       | arrays.
       | 
       | To this day not a single customer complaint or even support
       | question with regard to the INI files. Most of our customers
       | are on Windows but we also support Linux with the same config
       | scheme.
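
A sketch of how "CSV string arrays" in an INI file are typically read, assuming Python's standard `configparser` module. The helper `get_list`, the section name, and the host names are hypothetical; the `wishes` key is borrowed from the example quoted earlier in the thread:

```python
import configparser

# Hypothetical INI content with comma-separated values.
INI_TEXT = """
[app]
hosts = alpha.example, beta.example, gamma.example
wishes = I am fine
"""


def get_list(cfg, section, key):
    """Split a comma-separated INI value into a list of trimmed strings."""
    raw = cfg.get(section, key)
    return [item.strip() for item in raw.split(",") if item.strip()]


parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

hosts = get_list(parser, "app", "hosts")
wishes = get_list(parser, "app", "wishes")

print(hosts)
print(wishes)
```

Nothing in the file says whether `wishes` is a plain string or a one-element list. That is decided by which accessor the application calls, which is the ambiguity the bracket syntax in TOML removes.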
        
       | pornel wrote:
       | tomlc99 parser took 13 minutes to parse a 50MB file!? There must
       | be something horribly broken with it. Rust's TOML parser is
       | literally 1000 times faster than that.
        
         | epage wrote:
         | Don't think I've benchmarked that case since that's not
         | generally what you'll run into with a human data format. Would
         | love to see numbers if you have them since I have no idea what
         | they'd be.
         | 
         | Personally performance isn't my biggest concern and I only
         | focus on it for the cargo case but I want to switch cargo to
         | storing packaged `.crate` files to a format meant for machines.
         | 
         | (maintainer of `toml` and `toml_edit` packages for Rust)
        
           | arp242 wrote:
           | I assume they used the "prepare.sh" script for that 50M file:
           | https://github.com/madmurphy/libconfini/tree/master/dev/test.
           | .. (you need to clone the repo to run it, because it depends
           | on some other files in there).
           | 
           | The file it generates is not valid TOML though; even if you
           | fix the obvious syntax issues you run in to issues like:
           |     % tomlv ./big_file.ini
           |     Error in './big_file.ini': toml: line 35: Key 'section' has already been defined.
           | 
           | Perhaps tomlc99 doesn't check for this; I didn't try it.
           | 
           | Maybe I should add something to toml-test for this; while I
           | agree that 50M files are somewhat rare[1] and that
           | performance isn't a _huge_ consideration, it's not entirely
           | unimportant either, and this can give people a bit of a "is
           | the performance of my parser horribly atrocious or roughly
           | fine?" kind of baseline.
           | 
           | [1]: One use-case I have is that I use TOML for translation
           | files (like gettext .po, except in, well, TOML) and in a
           | large application with a whole bunch of translations I can
           | see that adding up to 50M.
        
       | b450 wrote:
       | In replying to something this deranged, one runs the risk of
       | responding with sincerity to a troll, and thereby being Owned.
       | Nevertheless, my ethics require me to implore anyone who
       | sympathizes with the section on Square Brackets to seek help.
        
       | IggleSniggle wrote:
       | CUE, anyone?
       | 
       | https://cuelang.org/
        
         | tln wrote:
         | Is CUE only implemented in Go?
        
       | SrslyJosh wrote:
       | Reminder that TOML was initially conceived as a joke:
       | https://news.ycombinator.com/item?id=17523304
        
       | rednafi wrote:
       | Postel's Law is bad. It directly contradicts the Unix
       | philosophy of building small and composable interfaces. Look how
       | small and generic the Reader interface in Go is. It has one
       | method with a simple signature and can be used widely.
       | 
       | Otoh, when your software is not opinionated about what it
       | accepts, that's when it tends to produce surprising behaviors.
       | This whole argument made me lean on TOML even more. It's not
       | perfect but certainly better than ini in many ways.
        
       | zxcvbn56nbvcxz wrote:
       | I don't like OP's variant of INI but hell TOML is the ugliest
       | format ever invented. The first reason why I wouldn't touch Rust
       | with a 1 meter stick.
        
       | gavinhoward wrote:
       | I've made my own config format precisely because TOML was not
       | good enough. But INI would have been worse for my case.
       | 
       | Strict typing is _good_. If you expect a number, and it doesn't
       | parse, you want to know that.
       | 
       | Strict typing communicates _intent_. And that is worth gobs of
       | effort.
       | 
       | (I do admit that the author has a point about times when apps
       | just want a string value, so I think I'll add a function to get
       | the string value of a move no matter what type it is.)
       | 
       | Also, as much as the author slams TOML's avoidance of ambiguity,
       | the industry has learned that Postel's Law is bad. We need to be
       | conservative in what we emit and accept, for several reasons, the
       | most important of which is that parsers are some of the most
       | dangerous code; any possible ambiguity could be a security bug or
       | many waiting to be discovered. My parser is opinionated because
       | that prevents bad things from happening.
       | 
       | The author is also wrong about C not handling mixed arrays well;
       | my config format is JSON with a few niceties, and I easily
       | implemented it in C, even though it can have mixed arrays.
       | 
       | And braces/brackets in JSON are not human-editable? Come on...
       | 
       | That said, some changes to JSON _were_ necessary to make it
       | easier to edit; besides comments, I also added keys that don't
       | need quotes, as long as they are one word.
        
         | mjevans wrote:
         | The author's premise is that the strict typing and validation
         | of a configuration file belong with the application, not some
         | library doing initial validation of a document which may or may
         | not conform to the application's logic.
        
           | gavinhoward wrote:
           | But even then, the logic depends on the contents of the
           | string. If you need a country (as a string) and you get a
           | number, you know something is wrong.
           | 
           | If you need a country code and you get a Boolean, chances are
           | you got Norway, but by having strict typing, you don't run
           | into that ambiguity.
        
       | eviks wrote:
       | > All INI dialects however are well-defined (every INI file is
       | parsed by some application, and by studying a parser's source
       | code it is possible to deduce its rules)
       | 
       | That's the definition of a lack of definition. Well-defined means
       | you don't need to study the parser's source code and deduce the
       | rules, you get the rules!
       | 
       | > , and, if one looks closely, the number of INI dialects
       | actually used in the wild is not infinite.
       | 
       | and that doesn't help you since you'll have to study the parser's
       | source code to deduce the rules.
        
         | laszlokorte wrote:
         | There is no undefined behavior in C. You just have to read your
         | compiler's source code or the generated assembly or simply run
         | the program to learn the exact definition.
        
       | joshxyz wrote:
       | it's like comparing javascript and typescript. there are trade-
       | offs in the presence (and lack) of verbosity.
        
       | noirscape wrote:
       | I kinda get where the author is coming from. Everyone is rushing
       | over themselves to try and "fix" configuration file formats by
       | adding more and more complexity so we humans can express more
       | complex data types into configuration files.
       | 
       | Nowadays TOML is the new hotness, 6 years ago it was YML, 9 years
       | ago it was JSON and before that XML was our alleged savior.
       | 
       | Ultimately there's no one perfect solution, but INI does have one
       | thing going for it: it is _dead simple_. It has no type system
       | and it leaves it up to the application to convert a set of
       | key/value string pairs (grouped by a header) into something
       | meaningful rather than attempting to guess and spitting out
       | unexpected surprises.
       | 
       | Contrast to XML which has 50 ways to say the same thing depending
       | on what tickles your fancy, YML which is probably the
       | configuration language with the fastest footguns in the West or
       | TOML which attempts to express fairly complicated data types into
       | a configuration file without questioning if that's even useful
       | for 99% of the configuration files.
       | 
       | The only one that is close to equal in simplicity is JSON, which
       | only allows booleans, numbers, strings and arrays/dictionaries.
       | Really I think the only reason why JSON failed the "human-written
       | configuration file" contest is because spec-compliant
       | implementations forbid comments. This is mostly because the
       | author of the spec wanted to keep JSON (de)serializers from
       | putting custom logic in them.
       | 
       | (Although I'll note that JSON specifically tends to have the most
       | non-spec-compliant libraries that have a looser stance on it and
       | allow in-line comments using JavaScript's rules, at least in my
       | experience and even has a bunch of dedicated offshoot specs that
       | are basically "JSON but with comments".)
        
         | AndyKluger wrote:
         | > The only one that is close to equal in simplicity is JSON,
         | which only allows booleans, numbers, strings and
         | arrays/dictionaries.
         | 
         | NestedText only allows strings, arrays, and dictionaries, has a
         | proper specification, and you never need to escape anything.
        
         | marcosdumay wrote:
         | > Nowadays TOML is the new hotness, 6 years ago it was YML, 9
         | years ago it was JSON and before that XML was our alleged
         | savior.
         | 
         | Hum... Didn't you notice that complexity is not steadily
         | increasing on that progression you pointed out?
         | 
         | Anyway, INI is so simple that it doesn't even have a
         | specification. So, I'd recommend that people avoid it.
        
       | raincole wrote:
       | A bit hard to tell which parts are satire and which are
       | serious...
        
         | zaphar wrote:
         | If we assume this is serious then it falls into a division I
         | see in a lot of areas. Catch errors sooner _or_ Catch errors
         | later. The Catch errors later group tends to take the position
         | that Because I have to catch some errors later anyway I might
         | as well catch all the errors later. The catch errors earlier
         | group tends to take the position that the earlier I can catch
         | the error the easier it is to handle and the safer my code will
         | be.
         | 
         | Neither seems to be able to see valid points in the other's
         | position so it ends up being polarizing.
        
           | kybernetikos wrote:
           | I don't think this is to do with catching errors. You can
           | throw errors at load time with both mechanisms if that's what
           | you want. Instead, the question is, should the file format
           | include a bunch of notation so that the computer can
           | deserialize the value to some type even without knowing what
           | it'll be used for?
           | 
           | Since ini is a format designed for humans to read and write,
           | the argument is that no, the reading code should decide how
           | to interpret the value, and this is fine because it knows
           | whether it wants this particular value to be a boolean or a
           | string or a number or a continent.
           | 
           | The ini file reader has an implicit schema in its reader
           | code. The TOML file makes (some of) the types explicit in the
           | file itself at the expense of making it less convenient for a
           | human to read and write.
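A rough sketch of what "an implicit schema in its reader code" can look like, using Python's standard `configparser`. The section name, keys, defaults and the `ServerConfig` class are hypothetical. The file holds only strings and the reader decides each type:

```python
import configparser
from dataclasses import dataclass


@dataclass
class ServerConfig:
    host: str
    port: int
    debug: bool


def load_server_config(text: str) -> ServerConfig:
    """Parse INI text; the types live here, in the reader, not in the file."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["server"]
    return ServerConfig(
        host=section.get("host", "localhost"),
        # getint raises ValueError if the text is not an integer.
        port=section.getint("port", fallback=8080),
        # getboolean accepts yes/no, on/off, true/false and 1/0.
        debug=section.getboolean("debug", fallback=False),
    )


EXAMPLE = """
[server]
host = example.org
port = 8443
debug = yes
"""

config = load_server_config(EXAMPLE)
```

Errors can still be raised at load time here (for instance `port = eighty` fails inside `getint`), which is the point made above: the difference is where the type information lives, not when the error appears.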
        
             | naasking wrote:
             | Following this argument to its logical conclusion, why
             | bother having any kind of standardized format at all, ini
             | or otherwise? The program's config reader knows what it
             | wants to read, why bother standardizing names, notation,
             | section delimiters, or anything else?
        
               | kybernetikos wrote:
               | I presume the argument the author of this article would
               | make is because it helps the human writer of the file.
        
               | naasking wrote:
               | And a standardized format takes that even further, since
               | now text editors can be aware of more of the file
               | structure and assist with highlighting, completion and
               | more.
        
               | kybernetikos wrote:
               | NestedText was mentioned earlier in this thread, and it
                | takes the philosophy to its logical conclusion. That
               | conclusion includes schemas
               | https://nestedtext.org/en/latest/schemas.html. Text
               | editors could absolutely be written to understand schemas
               | and provide the help you suggest.
        
               | marcosdumay wrote:
               | > why bother having any kind of standardized format at
               | all, ini or otherwise?
               | 
               | So you can reuse the parsing code and the file-editing
               | instructions.
               | 
               | Standardization is not about catching errors.
        
           | naasking wrote:
           | > The Catch errors later group tends to take the position
           | that Because I have to catch some errors later anyway I might
           | as well catch all the errors later.
           | 
           | I'm not sure why that follows. It doesn't seem to apply to
           | any other domain, eg: since I have to deal with diseases
           | associated with aging later anyway, I might as well not take
           | care of myself now and just go wild and do anything I feel
           | like.
        
       | robertlagrant wrote:
       | This seems a bit odd:
       | 
       | > TOML forces arrays to be encapsulated within square brackets
       | (exactly like section paths do), although humans do not need
       | square brackets for recognizing when something is a list.
       | 
       | > # not an array in TOML
       | 
       | > wishes = apples, cars, elephants, chairs
       | 
       | But earlier he didn't want strings to be quoted. If these two
       | criticisms were both applied, how do you distinguish the string
       | "apples, cars, elephants, chairs" and the list ["apples", "cars",
       | "elephants", "chairs"]?
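To make the ambiguity concrete, here is a small sketch with Python's standard `configparser`; the section name `gift` is invented for the example. The parser hands back one string, and nothing in the file says whether it was meant as a sentence or as a list, so each application has to pick a reading:

```python
import configparser

INI_TEXT = """
[gift]
wishes = apples, cars, elephants, chairs
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
raw = parser["gift"]["wishes"]

# Reading 1: the application treats the value as one string.
as_string = raw

# Reading 2: the application treats the same value as a comma-separated list.
as_list = [item.strip() for item in raw.split(",")]
```

Both readings are valid for the same bytes, so a string that legitimately contains commas cannot be told apart from a list without some extra convention.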
        
         | scrollaway wrote:
         | You don't! You rely on the application parsing logic because
         | apparently that is the best idea ever!
         | 
         | What a waste of time this article is, honestly.
        
       | gweinberg wrote:
       | It's surprising how many commenters point out that the post reads
       | like a joke, but none of them seem to consider the possibility
       | that it is, in fact, a joke. I'm pretty confident that it is.
       | Consider this quote: "if one looks closely, the number of INI
       | dialects actually used in the wild is not infinite". That isn't
       | damning with faint praise, it's just damning.
        
       | indymike wrote:
       | The advantage of TOML over INI is that you can use a generic TOML
       | linter or TOML validator to check files. This is very helpful
       | when dealing with deployment and CI pipelines where it would be
       | nice to just fail if the config file(s) are not valid. This can
       | save a lot of time, and eliminate the whole "spin up a container
       | and load the application with broken config file" step...
        
       | uneekname wrote:
       | There are some pretty bad takes here. An integer and a version
       | string are obviously different.
       | 
       | > no apparent motivation behind this rule, except that of
       | conforming TOML to JSON...TOML's reason remains somewhat
       | mysterious.
       | 
       | No, it's not mysterious, you figured it out! TOML is designed in
       | part to work fairly well with other common formats, like JSON.
       | 
       | > ...except that a value can also be a date. There is something
       | intriguing in all this. Even forgetting that an application might
       | not need dates at all, why constraining something so particular
       | and that can be formatted in so many different ways into a rigid
       | primitive?
       | 
       | Once again, you answered your own question. Dates can be
       | formatted in many different ways, so TOML offers a standard date
       | type. It's really helpful!
       | 
       | > And even if you do survive the process of writing a parser that
       | is fully compliant with TOML (some people don't), you still have
       | done only half of the job, that of writing a parser, without
       | really thinking of any real case usage.
       | 
       | In my experience TOML parsers are more consistent than INI
       | parsers in terms of behavior. They have been immediately helpful
       | to me, as they support a configuration format I vastly prefer.
       | 
       | What a funny write-up. TOML isn't perfect, but I like it much
       | more than INI.
        
       | atoav wrote:
       | I used TOML for complex configuration and I am quite happy with
       | it.
       | 
       | However I often saw people who complained about it, and most of
       | the complaints afaik are of the nature: "This extremely complex
       | use case, that I could as well just write in an interpreted
       | language isn't supported by TOML".
       | 
       | If your configuration is so crazy it needs more than any
       | configuration language offers, just use a scripting language
       | instead. I have seen lua or python files act as configuration and
       | there is nothing wrong with that.
        
       | zajio1am wrote:
       | There is definitely an argument for interpreting elementary data
       | types at the application level instead of in the config format,
       | so that we do not create an arbitrary distinction between types
       | at the config format level and at the application level, and do
       | not force IP addresses (and other domain-specific types) to be
       | encoded as strings.
       | 
       | But I do not think this is a good argument for structural data
       | types like lists, sets and so on, because M applications would
       | use N different ways to encode them. While a human can
       | recognize them, it is hard for a human to remember that this
       | specific application uses that specific encoding with its own
       | quirks.
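Both halves of this can be sketched with Python's standard `configparser` and `ipaddress` modules. The section and key names are hypothetical, and the comma-separated list is just one convention an application might choose:

```python
import configparser
import ipaddress

parser = configparser.ConfigParser()
parser.read_string(
    "[listen]\n"
    "address = 192.0.2.10\n"
    "allowed = 192.0.2.0/24, 2001:db8::/32\n"
)
section = parser["listen"]

# Domain-specific scalar: only the application knows this must be an IP address.
address = ipaddress.ip_address(section["address"])

# Structural type: this application happens to pick "comma-separated".
# Another might pick spaces or repeated keys, which is the objection above.
allowed = [
    ipaddress.ip_network(part.strip())
    for part in section["allowed"].split(",")
]
```

The scalar conversion is unambiguous once the application knows what it wants; the list encoding is the part a user has to remember per application.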
        
       | throw0101c wrote:
       | Personally I leaned towards ISC-style format for, e.g., BIND9:
       | 
       | * https://bind9.readthedocs.io/en/latest/reference.html
       | 
       |     additional-from-auth (yes | no) ;
       |     allow-query-on { address_match_list };
       |     also-notify { ip_addr [port ip_port] ; ... ] };
       |     category category_name { channel_name; ... };
       |     channel channel_name { channel_spec };
       | 
       | * https://www.zytrax.net/books/dns/ch7/statements.html
       | 
       | Kind of BNF-y.
        
       | chrismorgan wrote:
       | This is an extremely terrible defence/criticism, for many of the
       | reasons pointed out by others already, but I'll add some more:
       | INI came from Windows, and if you're going to call it INI I think
       | it's reasonable to expect it to work with Windows' INI functions
       | --or else you can call it conf, embracing the still-very-ad-hoc
       | format used across Linux and such. (And yeah, I recognise that
       | the library name invokes both labels, but beyond that they're
       | focusing on INI.)
       | 
       | But _not one of the INI examples shown will actually work as
       | you'd expect using Windows' GetPrivateProfileString function_.
       | Windows' INI-reading functions are _extremely_ simplistic; about
       | the most magic thing is case-insensitivity of keys. You can't put
       | a space around the equals sign: that gives you a key name that
       | ends with a space and a value that starts with a space. There are
       | no line continuations (section 5). You can't do what they call a
       | composite configuration file (section 8). Empty keys (section 10)
       | are fine. Implicit keys (section 12) don't work.
        
       | tux3 wrote:
       | This critique made me learn some new things about both formats,
       | but I actually come out more in support of TOML because of it.
       | 
       | Each argument the author gives reflects a strong preference for
       | everything being implicit, ambiguous, free of heavy syntax like
       | quotes for strings and brackets for arrays.
       | 
       | That would be a reasonable subjective choice, but when I see the
       | INI examples used to illustrate, or the slightly out there
       | assertion that a 1 element list is a completely meaningless
       | inhuman concept, I'm not really swayed. Sometimes it almost seems
       | like irony. The examples are really hard to take at face value.
       | 
       | If anything, this pushes me further away from that. I've done
       | YAML. I've done languages where everything is helpfully
       | implicitly cast, and all the "WTF talks" that result from the
       | weird rules and edge cases.
       | 
       | We know what it's like when DE is a string but NO is a boolean,
       | and only some version numbers are strings while some are floats.
       | 
       | Quoting strings is really not what costs me time and effort. It's
       | weird edge cases and surprises that come back to bite you.
        
         | kybernetikos wrote:
         | I understand your points as precisely in the opposite direction
         | to your conclusion.
         | 
         | According to this post, the INI approach says that the code
         | that reads the value determines what type it should be. That
         | approach means that you never get a problem where NO is a
         | boolean when it should have been an enum, or a version number
         | is a float when it should have been a string. You only get
         | those problems when the type of a value is determined from the
         | file without reference to the expected type, like in TOML.
        
           | PH95VuimJjqBqy wrote:
           | and I agree with that approach personally.
           | 
           | I don't consider YAML to be an acceptable configuration
           | format and it kills me that the standard moved to it away
           | from XML.
        
           | coldtea wrote:
           | > _According to this post, the INI approach says that the
           | code that reads the value determines what type it should be.
           | That approach means that you never get a problem where NO is
           | a boolean when it should have been an enum, or a version
           | number is a float when it should have been a string_
           | 
           | That's a simplistic view.
           | 
            | In practice, there soon won't be just _one_ piece of code
            | reading your files, or it will be shared elsewhere, and it
            | will all depend on implicit semantics and documentation of
            | the assumptions (if you're lucky to have it). Hilarity,
            | chaos, and head scratching ensue.
           | 
           | Whereas with a format that enforces the types, every consumer
           | with a parser for the format gets the same values (to the
            | extent that it matters: a list doesn't suddenly become a
           | string, but in some language it might be a vector and in
           | another an array).
        
             | marcosdumay wrote:
             | Configuration files usually are only read by one piece of
             | code. Besides, the article is correct in that the type
             | system from TOML is completely inadequate for fully parsing
             | the files anyway, so it will necessarily depend on implicit
             | agreement on the semantics from every reader.
             | 
             | IMO, configuration file formats should only ever have text
             | as primitive type. Anything else should be defined in
             | another layer. I completely agree with that part of the
             | argument from the article.
             | 
             | Then the article goes to argue that the quotes are
             | harmful... And no, if you have a whitespace sensitive
             | language, you need a damn good representation for strings
             | that won't allow for ambiguity to creep in. And INI is just
             | horrible on this.
        
               | coldtea wrote:
               | > _Configuration files usually are only read by one piece
               | of code._
               | 
               | Having a single source of truth and multiple services and
               | scripts needing the same info means the same
               | configuration file will get to be read by many pieces of
               | code, even from different languages.
               | 
               | And that's without considering piecemeal migration of the
               | same "one piece of code" running on different services to
               | another language or a version two design, still needing
               | to read the same file.
               | 
               | > _IMO, configuration file formats should only ever have
               | text as primitive type. Anything else should be defined
               | in another layer. I completely agree with that part of
               | the argument from the article._
               | 
               | I mean, that's not even wrong.
               | 
               | Except if you mean "they should not be binary". Then,
               | sure.
        
               | AndyKluger wrote:
               | > IMO, configuration file formats should only ever have
               | text as primitive type. Anything else should be defined
               | in another layer.
               | 
               | I very much agree. If you haven't checked it out,
               | NestedText is an excellent format that takes this
               | sentiment to heart.
        
         | MrBuddyCasino wrote:
         | > _Sometimes it almost seems like irony. The examples are
         | really hard to take at face value._
         | 
         | Agree, with a few exceptions (eg empty key names), the TOML
         | design choice just seems less ambiguous, simpler and thus...
         | better?
        
         | dale_glass wrote:
         | I mean, it refers to Postel's law on the top. Most of it seems
         | to follow from that.
         | 
         | IMO that's long been proven to have been a very bad idea in
         | retrospect. So good riddance.
        
           | coldtea wrote:
            | > _I mean, it refers to Postel's law on the top_
           | 
           | One of the crappiest ideas in CS.
        
             | kemotep wrote:
             | I can imagine several security issues with accepting any
             | kind of input in your program.
        
               | masklinn wrote:
               | Literally every language held as a shining example of
                | Postel's law is full to the brim with security issues.
               | 
               | The literal interpretation of Postel's Law has been
               | considered highly detrimental for 20 years:
               | https://datatracker.ietf.org/doc/html/rfc3117#section-4.5
        
             | sham1 wrote:
             | Well it seems to work for TCP at least, which is where it
             | comes from. Of course it's not the correct approach for
             | everything, but calling it "one of the crappiest ideas in
             | CS" might be a tad harsh.
             | 
             | EDIT: Of course there are better ways to be robust than to
             | try to just accept whatever garbage is thrown your way
             | because "be liberal in what you accept." So for example
             | since this is about config files, you could easily just
             | tell the user that their stuff is wrong _and_ tell them how
             | to fix it.
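One way to read "tell the user that their stuff is wrong and tell them how to fix it" is sketched below with Python's standard `configparser`. The `[server]` section, the `port` key and the message wording are all hypothetical:

```python
import configparser


def read_port(text: str) -> int:
    """Return the configured port or raise ValueError with a fix-it hint."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser.get("server", "port", fallback=None)
    if raw is None:
        raise ValueError(
            "missing 'port' in [server]; add a line such as: port = 8080"
        )
    try:
        port = int(raw)
    except ValueError:
        raise ValueError(
            f"port = {raw!r} is not a whole number; "
            "write it as digits, e.g. port = 8080"
        ) from None
    if not 1 <= port <= 65535:
        raise ValueError(
            f"port = {port} is out of range; choose a value from 1 to 65535"
        )
    return port
```

The program stays strict about what it accepts, but the rejection names the offending value and shows a corrected line instead of silently guessing.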
        
           | bunderbunder wrote:
           | I can accept Postel's Law as being a great idea for fairly
           | low-impact things like markup languages. XHTML is a good
           | example here: it turns out it wasn't an awesome idea, because
           | if the author of an HTML file forgets to close a tag, I'd
           | rather the browser make a best effort at displaying a
           | document that might be a little janky, than show me nothing
           | at all.
           | 
           | But if we're talking configuration files for applications?
           | No. Absolutely not. If I get anything even slightly off, do
           | not under any circumstances respond by launching the
           | application into an unpredictable state. Fail immediately and
           | tell me why. Same principle applies for RPC messages.
           | 
           | The reductio ad absurdum here is weak typing. If Postel's Law
           | were actually a generally applicable law, then PHP4 would be
           | widely considered to be the pinnacle of language design. I
           | think most people would agree that it's closer to the nadir.
           | 
           | But still... context matters, XHTML was a mistake. Which
           | implies that Postel's Law is true in at least some contexts.
        
             | capitainenemo wrote:
             | There's still a few nice things about XHTML that I miss.
             | It's really helpful in debugging and catching mistakes.
             | I'll actually force it on for dev and test systems just to
             | quickly identify errors. I've caught hundreds of issues
             | with templates that way. Sure there are markup validators,
             | but the always on strict was nice. And still usable with
             | XHTML5... So long as the web page is being generated by
             | your code, the strictness is a win I think. And you can
             | turn off strict in browsers by serving XHTML5 as HTML5 with
             | text/html content type.
             | 
             | The responseXML in XHR for XHTML is really nice, and still
             | available though mostly useless. I wish when XHTML was
             | abandoned that a responseParsedDOM was offered to avoid
             | some of the exploitable hacks people came up with instead.
             | 
             | XML transforms using XSL could do some pretty nifty tricks
             | with static docs and no other processors but your browser.
             | 
              | So, yeah, I don't feel it was wholly a mistake. Sure, for
              | random or user-generated content it isn't a good idea, and
              | it'd be nice if there were clean ways to handle that (not
             | iframes), but saying that your app shouldn't have a strict
             | rendering is like saying JSON should be forgiving of
             | misplaced braces... If you're feeding bad JSON to your
             | modern JS driven app, well, that's your fault and there
             | should be errors and it should be fixed. Similar for XHTML
             | for your server side app IMO.
        
               | nayuki wrote:
               | Good news: XHTML was never abandoned. It still exists
               | today as an optional serialization format for HTML5. I am
               | using it in practice on my website and described it in
                | great detail:
                | https://www.nayuki.io/page/practical-guide-to-xhtml
        
               | capitainenemo wrote:
                | Yep! Nice guide. And there's
                | https://www.w3.org/TR/html-polyglot/ for polyglots.
               | 
               | The main thing you lose (no idea why XHTML5 doesn't add
               | support for this) is <noscript> is ignored. Obviously if
               | you did any other form of JS detection in a session, you
               | can just use that to offer alternate content.
        
             | Nullabillity wrote:
             | I'd question the common XHTML talking point too, why is it
             | the browser's job to render content that you clearly
             | haven't even bothered to proofread?
        
               | [deleted]
        
               | bunderbunder wrote:
               | Because it's a _user agent_ and as the _user_ I want it
                | to degrade as gracefully as possible. It doesn't serve
               | my interests to refuse to render anything just because
               | the author of the website forgot a </b> tag somewhere.
               | I'd rather read the text just with formatting other than
               | what the author intended, than not read the text at all.
               | Don't punish me for someone else's typo.
        
               | Nullabillity wrote:
               | I'm just baffled about how this hypothetical scenario
               | would even happen.
               | 
               | Did the author of the website never try rendering the
               | page themself before pushing it to live?
               | 
               | If user-generated content is able to trigger this then
                | you have an XSS vulnerability on your hands, strict
               | validation or not.
        
               | layer8 wrote:
               | By that logic, broken SVGs and the like should also be
               | rendered leniently. That doesn't make any sense.
               | 
               | If HTML had been strictly schema-validated from the
               | start, nobody would be arguing for this.
               | 
               | It's certainly true that HTML being parsed leniently
               | helped in it being picked up by amateur website authors
               | in the early days, because they weren't confronted with
               | error messages (though they were confronted with "why
               | doesn't this render as I expect" instead). But that has
               | little to do with user expectations by browser users.
        
             | aidenn0 wrote:
             | Postel's law is a way to aid in adoption, not a way to
             | increase correctness.
             | 
             | If Product X accepts malformed input I, but product Y does
             | not, then product X appears to "work better" than product Y
             | and people will adopt X more. (The other half of the law
             | also helps in adoption; if you emit very conservative
             | output, then your output works with everybody else as well,
             | also making your product look better).
             | 
             | If authors of webpages only had access to browsers that
             | implemented strict XHTML, then there would be a lot fewer
             | missing close-tags out there. Things have largely been
             | sorted out now, but for a while it was a case of "I have to
             | be just as good at rendering complete garbage as IE is, or
             | nobody will use my browser" which I hesitate to label as
             | "positive" in any meaningful sense.
        
       | pwdisswordfishc wrote:
       | > Postel's law is a good indicator of how robust a language is:
       | the more a language is able to make sense of different types of
       | input, the more robust the language.
       | 
       | Into the trash it goes.
       | 
       | Seriously, the avoid-crashing-at-all-costs anti-pattern is what
       | made HTML, JavaScript and PHP the messes that they still are,
       | from which the latter one is only now recovering at a glacial
       | pace. For once, we could learn the lesson.
        
         | afavour wrote:
         | > what made HTML, JavaScript and PHP the messes that they still
         | are
         | 
         | Also arguably the three most popular ways anyone got into
         | programming in the last couple of decades. Something worth
         | pondering on.
        
           | Diggsey wrote:
           | Because for a long time if you wanted to build a website with
           | no money and no experience those were your only options!
           | 
           | JavaScript and HTML for obvious reasons. PHP because if you
           | didn't have your own server, you couldn't run anything else
           | (while there were free hosting providers that would host your
           | PHP scripts).
           | 
           | Nowadays there are tons more options.
        
             | toxik wrote:
             | Lisp was a thing before PHP, and software engineers did use
             | it but it never reached the popularity of PHP. It is in the
             | end a question of being pragmatic, which PHP and Perl are.
        
         | djha-skin wrote:
         | > Postel's Law about being "conservative in what you emit and
         | liberal in what you accept" is quite frankly not a good
         | engineering principle.
         | 
         | -- Joel Spolsky[1]
         | 
         | 1: https://www.joelonsoftware.com/2003/10/08/the-absolute-
         | minim...
        
           | mongol wrote:
           | Has Postel himself backtracked though? Because while Joel is
           | an authority of sorts, Postel was too...
        
             | coldtea wrote:
             | That would be relevant if the criterion was "what some
             | authority believes".
             | 
             | Joel merely states what reality showed about Postel's law
             | over decades.
             | 
             | It's a thing developers know from experience; if Joe Random
             | had said the above quote, it would still be true.
        
               | mongol wrote:
               | The success of the internet's core protocols tells a
               | different story
        
               | coldtea wrote:
               | Everything can be a success if it's the only game in town
               | or a free option, even Javascript.
               | 
               | And we shouldn't conflate adoption success with design
               | quality either.
               | 
               | That said, it's not like TCP is a great example of
               | Postel's law, and surely not in the crude way it's
               | understood and practiced by its advocates. The RFC says:
               | 
               | "As a simple example, consider a protocol specification
               | that contains an enumeration of values for a particular
               | header field -- e.g., a type field, a port number, or an
               | error code; this enumeration must be assumed to be
               | incomplete. Thus, if a protocol specification defines
               | four possible error codes, the software must not break
               | when a fifth code shows up. An undefined code might be
               | logged (see below), but it must not cause a failure."
               | 
               | Which is hardly the "anything goes" ticket people imagine
               | it to be. E.g. TCP would still break on a badly formed
               | header and consider it an error.
               | 
               | Besides, the advice is good for a transmission control
               | protocol, especially one involving in-between nodes that
               | will not care for the enumeration values like ports and
               | such like the start/end nodes do, and just need to pass
               | them through.
               | 
               | It's horrible for other types of software. Language
               | parsing would be a great example where it should not be
               | followed. And of course the most famous related shit show
               | is HTML handling in browsers. HTML might be successful,
               | but it's hardly because of following the Postel
               | principle.
        
               | mongol wrote:
               | > HTML might be successful, but it's hardly because of
               | following the Postel principle.
               | 
               | I think if web browsers were not lenient with HTML parsing
               | the web would have been adopted much slower in the
               | initial years. Also, HTML5 can be seen as a confirmation
               | that XHTML2 with a strict parsing would not be
               | successful. I followed the WHATWG mailing lists quite
               | closely when that effort began. This is evidence for the
               | applicability of Postel's law at least in the case of
               | HTML.
        
               | zaphar wrote:
               | Nearly all of the internet's core protocols reject
               | bad/malformed packets. They don't do a best effort to
               | figure out what the sender "intended"; they just reject
               | the packet.
               | 
               | Some of the most popular instances of doing a best effort
               | to figure out the intent of the sender are also poster
               | children for protocol level security flaws. If we've
               | learned anything from deploying, managing, and developing
               | on top of the core internet protocols it's this:
               | 
               | 1. Be very conservative in what you send.
               | 
               | 2. Reject anything that isn't what you expected to get.
               | 
               | If you squint you can sort of see that being a valid
               | interpretation of Postel's law but it's not the standard
               | interpretation in practice.
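
A minimal Python sketch of the two rules above, assuming a made-up 4-byte
header (1-byte version, 1-byte type, 2-byte payload length). The layout and
the names parse_header, KNOWN_TYPES and MalformedPacket are hypothetical and
not taken from any real protocol. It also illustrates the narrow leniency in
the RFC passage quoted earlier in the thread: an undefined but well-formed
type code is tolerated, while a structurally broken packet is rejected.

```python
import struct

HEADER = struct.Struct("!BBH")  # hypothetical layout: version, type, payload length
KNOWN_TYPES = {1: "data", 2: "ack"}  # hypothetical enumeration of type codes


class MalformedPacket(ValueError):
    """Raised when a packet does not match the expected layout."""


def parse_header(packet: bytes) -> dict:
    # Reject anything structurally wrong instead of guessing the intent.
    if len(packet) < HEADER.size:
        raise MalformedPacket("packet shorter than header")
    version, type_code, length = HEADER.unpack_from(packet)
    if version != 1:
        raise MalformedPacket(f"unsupported version {version}")
    if length != len(packet) - HEADER.size:
        raise MalformedPacket("length field does not match payload size")
    # A well-formed but undefined type code is reported as "unknown"
    # rather than treated as a failure.
    kind = KNOWN_TYPES.get(type_code, "unknown")
    return {"type": kind, "payload": packet[HEADER.size:]}
```

Calling parse_header on a truncated packet, or on one whose length field
disagrees with the payload, raises MalformedPacket; there is no attempt to
repair the input.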
        
         | nayuki wrote:
         | The leniency of HTML burned me a lot in my beginner days. At
         | some point, I decided to switch to XHTML (serving as the media
         | type "application/xhtml+xml") and never looked back. Ditto
         | JavaScript, the laxness harmed a lot, and only like 15 years
         | later did I prepend "use strict" to every script.
        
         | HideousKojima wrote:
         | Every time I hear Postel's Law mentioned all I can think is
         | "God forbid we actually expect people to correctly implement a
         | spec." I mean I could kind of get it if the specification is
         | poorly written/ambiguous, but that's a problem with the spec
         | itself in that case. Otherwise it's just adding unneeded
         | complexity (that can majorly harm performance) for no real
         | gain, except that you accommodate the people too incompetent to
         | correctly implement a spec.
        
         | indymike wrote:
         | > Seriously, the avoid-crashing-at-all-costs anti-pattern is
         | what made HTML, JavaScript and PHP the messes that they still
         | are
         | 
         | Forgiving syntax made HTML and PHP easy, and because it was
         | easy people were quick to learn and use them. Everything is
         | trade-offs.
        
           | HideousKojima wrote:
           | Clear and sane error messages (i.e. something with the level
           | of quality of Rust's compiler) could have accomplished the
           | same thing without creating the insanity we have now.
        
           | Aaargh20318 wrote:
           | > and because it was easy people were quick to learn and use
           | them
           | 
           | and then went on to write a shitload of insecure code.
           | 
           | Just because it's simple to use, doesn't mean that just
           | anyone should be using it. The problem with PHP is that it
           | can be used by someone with far below average programming
           | skills to make something functional. But the flip side of how
           | forgiving it is, is that it takes someone of above average
           | skill to use it to make something safe and performant.
           | 
           | Making it easy to use wrongly also makes it harder to use it
           | right. Simply because you lack any feedback when you do
           | something dumb.
        
       | astrobe_ wrote:
       | > value in EUR = 345 # valid with libconfini but invalid in TOML
       | 
       | This is degenerate.
       | 
       | You want a key-value format easy to pick up, and easy to edit by
       | hand - otherwise you'd rather just use something like SQlite - in
       | particular if you "need" the other insanity that is sections in
       | INI files.
       | 
       | Why the heck allow spaces and UTF8 in key names? To satisfy
       | someone's libido?
        
         | gloria_mundi wrote:
         | Unicode is useful for languages other than English, and has
         | nothing to do with anyone's libido.
        
       | jonhohle wrote:
       | The author does not seem to appreciate separating tokenization
       | from interpretation. Can you write a (choose your data format of
       | choice) document that is valid in (your data format of choice)
       | and not valid config for your application? Absolutely! Can you
       | write an arbitrarily specced INI file that is both invalid
       | structurally and not valid? Even more so!
       | 
       | Choosing a standardized data format gives you a constellation of
       | tools for managing, generating, querying, and validating data
       | that you don't need to write in an application.
       | 
       | I'm not sure if TOML libraries support it, but the Ion libraries,
       | for example, allow you to see the next data type in code and
       | adjust accordingly. If you want to accept a symbol, string, or
       | array of values, the application can choose to do that. Oh, and
       | you can write a formal schema with things like enumeration values
       | that users and code can use to validate a document. So if you're
       | in custom INI land, that's yet another tool you need to write on
       | top of everything else.
       | 
       | If I didn't want to pull in a dependency, I might write a quick
       | and dirty INI parser, knowing there are likely bugs, corner
       | cases, and all kinds of potential future issues that the extra
       | code entails. If I'm taking on a dependency, I'd probably choose
       | a well defined, human and machine readable/writable format that
       | has schema support. Then I'd write a schema and point people
       | (operators and programmers) to it for reference, not the code
       | implementing the parser.
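
A rough Python sketch of the schema idea, assuming the config has already
been parsed into a dict by some typed format. The SCHEMA table, its keys and
the validate function are hypothetical; real schema languages for Ion, JSON
or TOML are far richer than this.

```python
SCHEMA = {
    # key: (expected type, allowed values or None); all entries hypothetical
    "port": (int, None),
    "log_level": (str, {"debug", "info", "error"}),
}


def validate(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for key, (expected_type, allowed) in SCHEMA.items():
        if key not in config:
            problems.append(f"missing key: {key}")
            continue
        value = config[key]
        # bool is a subclass of int in Python, so reject it explicitly;
        # this schema only expects int and str values.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            problems.append(f"{key}: expected {expected_type.__name__}")
        elif allowed is not None and value not in allowed:
            problems.append(f"{key}: must be one of {sorted(allowed)}")
    # Keys the schema does not know about are reported, not ignored.
    for key in sorted(config.keys() - SCHEMA.keys()):
        problems.append(f"unknown key: {key}")
    return problems
```

The same table doubles as reference material for operators: it lists every
key, its type and its permitted values in one place, separate from the
parsing code.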
        
       | mongol wrote:
       | I agree with the typing arguments. A config parser can only parse
       | the values to a degree, to primitive types basically. The
       | application can parse them fully. So the question is to what
       | extent the parser helps the application vs to what extent it
       | introduces subtle problems. As a config file user I would prefer
       | not to have to care so much, but also not to have to work around
       | weird corner cases. YAML's Norway problem (an unquoted NO parsed
       | as a boolean) is a perfect example. This would not exist if the
       | parser did not try to be helpful.
       | 
       | I am in the author's camp.
        
         | AndyKluger wrote:
         | I know I'm being repetitive in all these config file threads,
         | but you might like NestedText, which only provides strings,
         | lists, and maps.
        
       | donatj wrote:
       | I'm not a big fan of TOML, but I find the typing criticism here
       | weak. I would far rather my configs have a strict interpretation
       | of 89 vs "89" vs 89.0
       | 
       | None of the INI parsers I have ever used have just returned
       | everything as a raw castable string of exactly what the user
       | entered. There's always a horrible layer of interpretation. Many,
       | including the one built into PHP have confusing rules around
       | bools, for instance
       | 
       | > String values "true", "on" and "yes" are converted to true.
       | "false", "off", "no" and "none" are considered false. "null" is
       | converted to null
       | 
       | Literally the quotation marks don't help.
       | 
       |     foo = false
       |     foo = "no"
       | 
       | Either way you are getting a bool false.
        
       | djha-skin wrote:
       | We have showcased here the classic impedance mismatch between
       | serialization in a typed language and serialization in a dynamic
       | language. The author clearly is in the typed camp, speaking of
       | making enumeration labels and a dated type for the names of
       | continents.
       | 
       | The typed language camp loves the idea of structured untyped
       | strings in a serialization format, such as NestedText or INI.
       | However, this is uncomfortable in dynamic language territory, for
       | many use cases. Because of this dichotomy, any universal
       | serialization format is going to feel like a compromise. JSON is
       | the poster child of this.
       | 
       | In the YAML spec, the authors explicitly stated a goal of
       | supporting native data types of dynamic languages. This design
       | decision seems to be a good compromise between the typed and
       | dynamic camps, but the old saying applies: a good compromise is
       | where everyone goes home angry.
        
       | [deleted]
        
       | PaulHoule wrote:
       | Sometimes I dream of a pseudo natural language configuration
       | language based on the ideas of Inform7
       | 
       | https://ganelson.github.io/inform-website/book/WI_6_1.html
       | 
       |     The blog database is a postgres database named "foo" at host
       |     "db.example.com" with username "harry" and password "bar"
       | 
       | The thing is that probably needs tool assistance as much or more
       | than any other configuration format would.
        
       | xedrac wrote:
       | For nearly every argument made in the article, I found myself
       | siding with the TOML approach. The "no non-ascii keys except
       | wrapped in quotes" thing is a bit wonky I admit.
        
         | epage wrote:
         | toml 1.1 will allow non-ascii in keys (and multi-line inline
         | tables)
         | 
         | See https://github.com/toml-lang/toml/blob/main/CHANGELOG.md
        
       | nmilo wrote:
       | The fact that YAML is so bad is why being "understandable by a
       | human" shouldn't be the ultimate goal of any configuration
       | language. I would gladly make concessions like quoted strings or
       | bracketed lists to avoid the hell of trying to figure out if my
       | string/number/list actually parsed as a string/number/list.
        
         | vkazanov wrote:
         | YAML is the Perl of config formats. I just don't understand how
         | it ended up being so popular!
         | 
         | I remember my oss editor at some point had a little problem
         | highlighting a yaml configuration. No problem, - I thought. -
         | how hard can that be?
         | 
         | Well, turns out it's almost cpp level hard to properly parse
         | the language.
        
         | Pxtl wrote:
         | Yaml convinced me that dynamically typed config & interchange
         | formats are a bad idea.
         | 
         | Statically typed schema eliminate the ambiguity and so you get
         | to have your legibility cake and eat your consistency cake too.
        
           | hitchstory wrote:
           | that's what strictyaml does
        
           | kibwen wrote:
           | I used to be in favor of schemas, but my problem with them
           | these days is that they just can't encode all the validation
           | necessary to ensure that the config is correct. At the end of
           | the day, the only way to check if the config is actually
           | valid is to parse it, so I'm sympathetic to the "string them
           | all and let the application sort them out" approach.
        
             | IggleSniggle wrote:
             | I'm sympathetic to that, but not to ini! INI is _just
             | standardized enough_ that your ini parser may or may not
             | give you "just a string." As another poster mentioned, it
             | may well interpret the string "NO", quotes included, as the
             | boolean false, before passing along to the rest of the
             | application. It's this ambiguity of type that makes INI
             | problematic. If it simply handed along strings, without
             | fail, and left the application to parse whether "NO" should
             | be a country or boolean or string, that wouldn't be a
             | problem.
             | 
             | But inevitably, in order to DRY, somebody will make a
             | consistent parser that is used in your application, whether
             | that's in-house or a dependency. And at that point, it is
             | very tempting to run everything through the parser, and the
             | parser is going to make some unexpected decisions.
             | 
             | So, sure, use INI. But don't really. Use an env file that is
             | parseable as INI or as shell environment variables. As soon
             | as you start needing anything more complex, use something
             | where you have at least a few basic guarantees that what
             | you're getting is at least in the general vicinity of what
             | you want.
        
               | kibwen wrote:
               | Right, I'm not trying to say that INI is the solution,
               | only that 1) TOML's anemic selection of types is probably
               | pointless, and 2) any attempt to provide a useful
               | selection of types would require being a Turing-complete
               | language, which is not what I want in my config files, so
               | you might as well just give me strings and let me parse
               | it in the typed, Turing-complete language that I'm
               | already using for my application logic.
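
A short Python sketch of the "give me strings" approach, using the
standard-library configparser module. The section name, the keys and the
Country enum are hypothetical; the point is only that the parser returns
plain strings and the application assigns the types.

```python
import configparser
from enum import Enum


class Country(Enum):  # hypothetical application-level type
    NO = "Norway"
    SE = "Sweden"


parser = configparser.ConfigParser()
parser.read_string("[shipping]\ncountry = NO\nretries = 89\n")
section = parser["shipping"]

# Plain lookups return strings; nothing has been interpreted yet.
raw_country = section["country"]
raw_retries = section["retries"]

# The application decides what each string means.
country = Country[raw_country]
retries = int(raw_retries)
```

With this parser, NO only becomes a boolean if the caller explicitly asks
for one, for example through section.getboolean("country"), which returns
False for the same input.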
        
             | AndyKluger wrote:
             | Absolutely agree!
             | 
             | If you haven't checked it out, NestedText takes this
             | approach.
        
               | Pxtl wrote:
               | Yes. I like the NestedText approach, but I do feel like
               | it needs an official optional "blessed" schema
               | description language for type validation, instead of
               | "here's a dozen ways to do NestedText validation in
               | python".
        
       | rdtsc wrote:
       | Having used INIs already, it seemed TOML didn't really bring
       | enough to the table to be worth switching to it. But I imagine
       | switching from YAML to TOML might be an improvement.
       | 
       | Date support, and general type-awareness and nested sections also
       | seemed like anti-features to me.
        
       | palmfacehn wrote:
       | Comments prescribing example configurations resolve most of these
       | issues. Even if the ideal config format is found and Internet
       | debates settle the issue decisively, README.txt will still be a
       | good practice.
       | 
       | There's very little which isn't permissible if we presuppose
       | users will read the source to understand a config file or
       | software generally.
        
       ___________________________________________________________________
       (page generated 2023-09-21 23:01 UTC)