[HN Gopher] An INI Critique of TOML (2021)
___________________________________________________________________
An INI Critique of TOML (2021)
Author : belter
Score : 130 points
Date : 2023-09-21 10:41 UTC (12 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| kazinator wrote:
| I almost stopped reading at the difference between "89" and 89
| being something bad that risks making your program crash.
|
| What a moronic diatribe.
|
| TOML being typed makes it excellent compared to INI.
|
| Nobody with anything resembling a CS degree on their wall should
| be defending nonsense like "castable strings", and the
| proliferation of string conversions into the application layer.
| Let alone in C or C++.
|
| Postel's law is only half right.
|
| You should be conservative in what you generate (don't probe
| every obscure corner spec of a representation, if you can avoid
| it), and reject all inputs that do not conform to the
| specification.
|
| Programs shouldn't react in unexpected ways to bad inputs, like
| crash, or allow an attacker to take control. But they shouldn't
| try to reinterpret bad inputs as good, either. That's folly.
|
| The only reason to follow Postel's law is economic gain at the
| expense of the technological ecosystem.
|
| If your web browser accepts broken HTML and renders it, whereas
| the competing browser rejects it, that competing browser is
| better for the web, but looks buggy to the naive user base, which
| will prefer your web browser.
|
| Postel's law was used as one of the weapons in the browser wars,
| whose legacy negatively affects the web even today.
| masklinn wrote:
| > I almost stopped reading at the difference between "89" and
| 89 being something bad that risks making your program crash.
|
| I can only commend you for not taking a few minutes to consider
| whether it was worth continuing when the essay starts with
| praising Postel's law, possibly the worst idea in the field
| since "let me just run that code I received over a socket".
| uxp8u61q wrote:
| > let me just run that code I received over a socket
|
| ...javascript?
| umanwizard wrote:
| Mostly minor issues that are all dwarfed by the gigantic
| advantage TOML has over INI: the fact that it's actually
| specified.
| insanitybit wrote:
| It's so incredible how two engineers can look at something like
| this and draw literally the opposite conclusion. Taste is
| certainly subjective...
|
| To me, this is largely an argument _for_ TOML. I mean,
|
| > wishes = "I am fine"
|
| This is an array with one element????
|
| > Even TOML's featured example proposes "the INI way"
|
| Obviously it is the TOML way too? Hence it being the featured
| example. That another way to express it doesn't change that...
|
| It's almost comical how much their arguments are convincing me
| that INI is a disaster. And yet they are, seemingly in all
| seriousness, truly trying to convince me that _this is the way_.
|
| What an odd field to be in.
| [deleted]
| Joker_vD wrote:
| > It's so incredible how two engineers can look at something
| like this and draw literally the opposite conclusion. Taste is
| certainly subjective...
|
| Well, in such cases it's likely because two engineers are
| actually expressing their tastes/beliefs, merely using the
| thing they look at as a running example. That also means that
| the running example, ironically, is actually mostly irrelevant
| to the topic: otherwise it wouldn't have been able to support
| two opposite conclusions.
|
| And in the field of philosophy this phenomenon is even more
| egregious: you can find e.g. one philosopher arguing that life
| is meaningless because it's finite, and another philosopher
| arguing that life has meaning because it's finite.
| emodendroket wrote:
| Just yesterday a broadside against YAML was on the front page.
| I'm sure one could find them against JSON, XML, or any other
| possible choice.
| nayuki wrote:
| > It's so incredible how two engineers can look at something
| like this and draw literally the opposite conclusion. [...]
| It's almost comical how much their arguments are convincing me
| that INI is a disaster. And yet they are, seemingly in all
| seriousness, truly trying to convince me that this is the way.
|
| That pretty much sums up how I view arguments about metric vs.
| imperial measurement systems. Every feature of imperial that
| someone points out as an advantage is something I see as a
| disadvantage.
| HideousKojima wrote:
| >That pretty much sums up how I view arguments about metric
| vs. imperial measurement systems.
|
| One argument I will make in favor of imperial: feet (when
| divided into inches) are easy to divide into thirds because
| it's base 12 instead of base 10. This makes a lot of things
| (especially in construction, woodworking, and others) a lot
| easier. Of course it has a whole boatload of downsides to go
| along with it, but that's one real and tangible benefit to
| imperial over metric that I've experienced.
| ziml77 wrote:
| Metric should have been base 12. You get more typical
| factors than base 10, and it's not an absurd deviation from
| base 10 (base 60 would allow for the same factors as both
| 12 and 10, but I doubt anyone would be happy to adopt that)
| nayuki wrote:
| That's exactly the kind of thinking that I'm alluding to.
| Okay, if base 12 is so great, why are gallons not divisible
| by 12? Why is our currency not divisible by 12? Why is a
| mile 1760 yards - what if you wanted to parcel a square
| mile into thirds on each side? Why is an inch partitioned
| into binary fractions? Why are pounds not divisible by 12?
| The imperial customs (it's hard to even call it a system)
| have no internal consistency.
|
| To add to the observation about lack of internal
| consistency: Metalworkers use decimal inches. Woodworkers
| use feet, inches, and binary fractions. Surveyors use
| decimal feet. Some people use decimal miles. You could
| argue that each of the aforementioned systems makes sense on
| its own, but none of them interoperate. Seriously - you'll
| be baffled if you pick up a decimal foot measuring tape in
| real life (they exist). Metric doesn't have this problem
| because if someone is doing detailed work in millimetres,
| someone is planning a house's rooms in metres, and someone
| is organizing a town's land in kilometres, they can all
| work with each other by simply moving the decimal place and
| changing the prefix.
|
| Another problem is that you're presupposing that the
| products you interact with are designed in a whole number
| of feet, and then you subdivide from there. I don't see
| this as true at all; things come in all sizes like 2'5",
| how are you going to divide that into thirds?
| kemotep wrote:
| Unlike metric, all the different imperial measurements
| are just combinations of different kinds of initially
| unrelated measurement systems. So a gallon has absolutely
| nothing to do with foot, miles are a completely different
| measurement system than feet, pounds and so on.
|
| Needing to know the weight of a cubic hectare of water
| and then how many gallons that is, is not a common issue.
| Metric does give you a system that can quickly calculate
| these things comparatively.
| philwelch wrote:
| > The imperial customs (it's hard to even call it a
| system) have no internal consistency.
|
| "Internal consistency" doesn't matter. Virtually nobody
| ever has to convert miles to yards; the use cases for
| measuring a distance in miles and the use cases for
| measuring a distance in yards almost never overlap. For
| the rare cases where they do, the units do have an
| integer conversion.
|
| For the use cases that do commonly overlap, e.g. inches
| and feet, you get a useful-for-that-specific-domain
| conversion factor of 12. Some domains use decimal miles
| or decimal inches and that's totally fine; sounds to me
| like they didn't need the metric system after all.
|
| But if you're going to be so insistent on "internal
| consistency", riddle me this: why do we measure time in
| hours, days, weeks, months, and years rather than in
| decaseconds, hectoseconds, kiloseconds, megaseconds,
| etc.?
|
| > if someone is doing detailed work in millimetres,
| someone is planning a house's rooms in metres, and
| someone is organizing a town's land in kilometres, they
| can all work with each other
|
| These people never work with each other. This is a
| fantasy.
| nayuki wrote:
| > "Internal consistency" doesn't matter.
|
| The lack of an opinionated stance on internal consistency
| is exactly how we arrive at traditional measures and also
| the US Customary set of units. It's far easier
| politically to be accommodating and allow more units than
| to put down your foot and say no, this is redundant, this
| cannot be used.
|
| > Virtually nobody ever has to convert miles to yards
|
| That's mostly true because it seems yards are only used
| to measure football fields and fabric. Everything else is
| measured in feet, from personal heights to furniture to
| rooms to house yards to building structures (the Empire
| State Building is 1454 feet tall).
|
| But my point still stands. You think you don't have to
| convert between miles and feet? Okay:
| https://www.researchgate.net/figure/a-Typical-multi-lane-
| hig... . There's a highway exit coming up in 800 feet and
| another in 1/2 mile. How many times longer does it take
| to reach the second exit compared to the first exit? You
| have no clue. In metric it's 250 m and 800 m, and it
| obviously takes about 3 times longer to reach the second
| exit.
|
| > These people never work with each other. This is a
| fantasy.
|
| Tell me you haven't worked in engineering without telling
| me you haven't worked in engineering. If you eyeball
| everything and use intuition, I can see why you don't
| care about units, conversions, and calculations. If you
| actually need to plan and analyze things carefully before
| you order materials and cut things, you'll quickly see
| that having a plethora of units adds complexity and
| chances for error without adding any functionality that a
| pure system has (whether you're using millimetres or only
| decimal inches).
| philwelch wrote:
| > That's mostly true because it seems yards are only used
| to measure football fields and fabric.
|
| Also shooting ranges, but yes.
|
| > There's a highway exit coming up in 800 feet and
| another in 1/2 mile. How many times longer does it take
| to reach the second exit compared to the first exit?
|
| Prior to GPS navigation nobody ever said "there's a
| highway exit coming up in 800 feet"; road signs in the US
| consistently use fractions of a mile. When my GPS app
| switches units from fractions of a mile to feet, that
| means I can see where I need to exit/turn. Calculating a
| precise ratio doesn't matter. It's between 6 and 7 since
| there's 5280 feet in a mile and 8*6 is 48 but 8*7 is
| 56, but who cares?
|
| Also, on a highway, I'm usually traveling at least 60
| mph, and since there's 60 minutes in an hour, that comes
| out to one mile per minute. Try that with your fancy
| metric system!
|
| > Tell me you haven't worked in engineering without
| telling me you haven't worked in engineering. If you
| eyeball everything and use intuition, I can see why you
| don't care about units, conversions, and calculations.
|
| You're imagining a scenario where a guy who's worrying
| about fractions of an inch building a cabinet has to talk
| to the city planner who's worried about miles and they
| have to convert units to do that. That doesn't happen,
| and yet you want to optimize the entire unit of
| measurement for that specific use case at the expense of
| more common use cases like "dividing by three".
|
| When it comes to domains where consistency does matter,
| we just don't do unit conversions. For instance, flight
| altitude is measured in feet, even when it's thousands of
| feet, instead of miles. If you're flying an airplane at
| 30,000 feet, who cares how many miles that is? That's not
| what miles are for. Likewise with domains that use feet
| per second rather than miles per hour. Again, even
| "metric" countries don't commit to the bit here; name me
| a country where the highway speed limit is in meters per
| second.
| nayuki wrote:
| > When it comes to domains where consistency does matter,
| we just don't do unit conversions. For instance, flight
| altitude is measured in feet, even when it's thousands of
| feet, instead of miles. If you're flying an airplane at
| 30,000 feet, who cares how many miles that is?
|
| Wrong.
| https://en.wikipedia.org/wiki/Gliding_flight#Glide_ratio
|
| Your plane is up at 30000 feet and your engines are out.
| The nearest airport is 47 nautical miles out. The plane
| has a glide ratio of 15. Will you make it?
|
| It's easy in metric: 9.1 km altitude, 87 km distance.
|
| > Again, even "metric" countries don't commit to the bit
| here; name me a country where the highway speed limit is
| in meters per second.
|
| I actually much prefer metres per second. It makes things
| like kinetic energy calculations easier. If I wanted to
| know the KE of a 1500 kg, 100 km/h car, I would first
| need to convert to m/s. Ditto the kilowatt-hour; it needs
| to die in favor of megajoules.
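
The two calculations in the comment above can be worked through as a short sketch. It assumes still air and a constant glide ratio of 15 (a simplification), and uses the exact definitions 1 ft = 0.3048 m and 1 nautical mile = 1852 m. The variable names are illustrative only.

```python
FT_TO_M = 0.3048    # metres per foot, exact by definition
NMI_TO_M = 1852.0   # metres per nautical mile, exact by definition

# Glide problem: 30000 ft altitude, airport 47 nautical miles away.
altitude_km = 30000 * FT_TO_M / 1000    # about 9.1 km
distance_km = 47 * NMI_TO_M / 1000      # about 87 km
glide_ratio = 15                        # assumed constant, still air
glide_range_km = altitude_km * glide_ratio
can_reach_airport = glide_range_km >= distance_km

# Kinetic energy of a 1500 kg car at 100 km/h: convert to m/s first.
speed_m_per_s = 100 * 1000 / 3600       # about 27.8 m/s
kinetic_energy_j = 0.5 * 1500 * speed_m_per_s ** 2

print(f"glide range: {glide_range_km:.1f} km, needed: {distance_km:.1f} km")
print(f"reachable: {can_reach_airport}")
print(f"kinetic energy: {kinetic_energy_j / 1e6:.2f} MJ")
```

Under these assumptions the glide range is roughly 137 km against about 87 km needed, and the car carries roughly 0.58 MJ of kinetic energy.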
| philwelch wrote:
| > Your plane is up at 30000 feet and your engines are
| out. The nearest airport is 47 nautical miles out. The
| plane has a glide ratio of 15. Will you make it?
|
| Nautical miles and statute miles are different units
| anyway, so my initial assertion--that you would never
| convert flight altitude to statute miles--is still
| correct.
|
| Glide ratio varies according to speed; the plane might
| have a glide ratio of 15 at one speed but a different
| glide ratio at a different speed. So in practice, the
| real world version of this word problem is more complex
| than you make it out to be, and pilots' handbooks will
| commonly have tables of glide distances to consult for
| this reason.
|
| > If I wanted to know the KE of a 1500 kg, 100 km/h car,
| I would first need to convert to m/s.
|
| You've adequately demonstrated that metric is more
| convenient for arbitrary word problems that you've
| provided. Real world applications are what I'm less
| convinced about.
| emodendroket wrote:
| > But my point still stands. You think you don't have to
| convert between miles and feet? Okay:
| https://www.researchgate.net/figure/a-Typical-multi-lane-
| hig... . There's a highway exit coming up in 800 feet and
| another in 1/2 mile. How many times longer does it take
| to reach the second exit compared to the first exit? You
| have no clue. In metric it's 250 m and 800 m, and it
| obviously takes about 3 times longer to reach the second
| exit.
|
| I'm having a hard time seeing this as much of a problem
| in daily life. That sounds more like a word problem in
| math class than something someone would want to typically
| calculate on the fly. And my sense of how far something
| at a given distance is in the car is informed more by
| experience and intuitive sense than measurement.
| gfv wrote:
| >why do we measure time in hours, days, weeks, months,
| and years rather than in decaseconds, hectoseconds,
| kiloseconds, megaseconds, etc.?
|
| We don't use them generally because at the time of French
| Revolution, due to taxes, standardizing the state units
| of measurement of physical goods was a much more pressing
| concern than time. (If I were to guess, there were no
| work hours limits, and thus it just hadn't crossed the
| popular mind). Weeks were changed to have ten days
| though.
|
| That's not to say that they hadn't tried: decimal time
| was mandatory for a few years before they realized that
| there were too many clocks around, and pushing decimal
| time can actually turn people hostile against the then-
| new metric system.
|
| We still kinda sorta use them in form of fractions of
| Julian days.
| thfuran wrote:
| >why do we measure time in hours, days, weeks, months,
| and years rather than in decaseconds, hectoseconds,
| kiloseconds, megaseconds, etc.?
|
| Because it would be astronomically expensive to slow
| earth's rotation enough for a day to be 100 ksec.
| nayuki wrote:
| There's no theoretical problem if we define 86400 seconds
| = 1000000 newseconds.
|
| One big practical problem is that because SI is a
| coherent system, the second is embedded in many units.
| For example, 1 N = 1 kg m/s^2. 1 J = 1 N m. 1 V = 1 J/C.
| 1 Pa = 1 N/m^2. And so on and so forth. So all of those
| units will have to be replaced with units derived from
| the newsecond. This is kind of like how SI has the unit
| tesla but the CGS electromagnetic unit is gauss.
|
| A long-term problem is that no matter what, the length of
| the day on Earth will drift from scientifically accurate
| atomic time. Sometime in the near future, a day will be
| 86401 seconds. Then 86402, and so on. So the old second
| or the new second will not solve this problem.
| philwelch wrote:
| > A long-term problem is that no matter what, the length
| of the day on Earth will drift from scientifically
| accurate atomic time. Sometime in the near future, a day
| will be 86401 seconds. Then 86402, and so on. So the old
| second or the new second will not solve this problem.
|
| How long term is this problem? For instance, we
| habitually insert "leap seconds" in order to keep things
| lined up, but if we didn't insert leap seconds, we might
| accumulate an error of maybe half an hour between the
| solar meridian and "noon" in the next 500 years, which is
| less error than we introduce by putting Madrid and
| Belgrade in the same time zone. And in 500 years, most of
| humanity will not even be living on the Earth anyway.
| thfuran wrote:
| It has supposed advantages?
| nayuki wrote:
| See https://www.google.com/search?q=why+imperial+is+better+
| than+... , https://www.youtube.com/watch?v=odVPwNgD8w4 (the
| guy argues FOR fractions!),
| https://youtu.be/JoDQOqB5cZg?t=221 ("based on the size of
| everyday objects"),
| https://www.youtube.com/watch?v=Anmm7phETE4 ,
| https://www.google.com/search?client=firefox-
| b-d&q=why+imper... , et cetera.
| thfuran wrote:
| What a bunch of ridiculous nonsense. "I'm 6'1", which is
| easy to visualize because it's literally about the length
| of six adult feet end to end". I don't know how he spends
| his time, but I doubt I've ever seen six adult feet end
| to end, let alone with such frequency that it's a
| familiar reference for measurement. (Not to mention that
| adult feet are very rarely as much as one foot in
| length).
|
| Edit: Oh no, I watched another one. "It's more
| naturalistic. Think about it: there are a dozen inches in
| a foot just like there are a dozen eggs in a carton." I'm
| not even sure where to start on that one.
| jmholla wrote:
| > Obviously it is the TOML way too? Hence it being the featured
| example. That another way to express it doesn't change that...
|
| It's not even another way to express it. The example is a
| completely different data structure. The initial proposed "TOML
| way" is an array while the "INI way" is a table. The then
| corrected "TOML way" IS the equivalent table in TOML.
|
| It's a ridiculous strawman along with several other oddities
| where the author loves to give INI the benefit of its weirdness
| but then rails against TOML's. The author loves INI and refuses
| to understand why people wanted and needed TOML to fix its
| deficiencies.
|
| I'd love to tear this thing apart point-by-point, but I think
| it'd be more cathartic for me than useful to anyone else.
| booleandilemma wrote:
| _TOML forces arrays to be encapsulated within square brackets
| (exactly like section paths do), although humans do not need
| square brackets for recognizing when something is a list._
|
| Isn't TOML meant to be read by programs?
|
| Don't square brackets make things less ambiguous for both
| machines and humans?
|
| _There is no way to convince a human that something composed of
| only one member is a list (if you think differently, chances are
| that you are partly non-human)._
|
| The author should speak for himself.
|
| _TOML forces arrays to be always comma-separated, although a
| human can recognize a list even when the separator is a
| mushroom._
|
| I stopped reading after this.
| p2detar wrote:
| We use INI files to configure our on-premises product. Four INI
| files to be precise. Database connection url, server hostnames,
| ports, various Boolean and debug flags, and even CSV string
| arrays.
|
| To this day not a single customer complaint or even support
| questions with regards to the INI files. Most of our customers
| are on Windows but we also support Linux with the same config
| scheme.
| pornel wrote:
| tomlc99 parser took 13 minutes to parse a 50MB file!? There must
| be something horribly broken with it. Rust's TOML parser is
| literally 1000 times faster than that.
| epage wrote:
| Don't think I've benchmarked that case since that's not
| generally what you'll run into with a human data format. Would
| love to see numbers if you have them since I have no idea what
| they'd be.
|
| Personally performance isn't my biggest concern and I only
| focus on it for the cargo case but I want to switch cargo to
| storing packaged `.crate` files to a format meant for machines.
|
| (maintainer of `toml` and `toml_edit` packages for Rust)
| arp242 wrote:
| I assume they used the "prepare.sh" script for that 50M file:
| https://github.com/madmurphy/libconfini/tree/master/dev/test.
| .. (you need to clone the repo to run it, because it depends
| on some other files in there).
|
| The file it generates is not valid TOML though; even if you
| fix the obvious syntax issues you run in to issues like:
|     % tomlv ./big_file.ini
|     Error in './big_file.ini': toml: line 35: Key 'section' has already been defined.
|
| Perhaps tomlc99 doesn't check for this; I didn't try it.
|
| Maybe I should add something to toml-test for this; while I
| agree that 50M files are somewhat rare[1] and that
| performance isn't a _huge_ consideration, it's not entirely
| unimportant either, and this can give people a bit of a "is
| the performance of my parser horribly atrocious or roughly
| fine?" kind of baseline.
|
| [1]: One use-case I have is that I use TOML for translation
| files (like gettext .po, except in, well, TOML) and in a
| large application with a whole bunch of translations I can
| see that adding up to 50M.
| b450 wrote:
| In replying to something this deranged, one runs the risk of
| responding with sincerity to a troll, and thereby being Owned.
| Nevertheless, my ethics require me to implore anyone who
| sympathizes with the section on Square Brackets to seek help.
| IggleSniggle wrote:
| CUE, anyone?
|
| https://cuelang.org/
| tln wrote:
| Is CUE only implemented in Go?
| SrslyJosh wrote:
| Reminder that TOML was initially conceived as a joke:
| https://news.ycombinator.com/item?id=17523304
| rednafi wrote:
| Postel's Law is bad. It directly contradicts the Unix
| philosophy of building small and composable interfaces. Look how
| small and generic the interface of Reader in Go is. It has one
| method with a simple signature and can be used widely.
|
| Otoh, when your software is not opinionated about what it
| accepts, that's when it tends to produce surprising behaviors.
| This whole argument made me lean on TOML even more. It's not
| perfect but certainly better than ini in many ways.
| zxcvbn56nbvcxz wrote:
| I don't like OP's variant of INI but hell TOML is the ugliest
| format ever invented. The first reason why I wouldn't touch Rust
| with a 1 meter stick.
| gavinhoward wrote:
| I've made my own config format precisely because TOML was not
| good enough. But INI would have been worse for my case.
|
| Strict typing is _good_. If you expect a number, and it doesn't
| parse, you want to know that.
|
| Strict typing communicates _intent_. And that is worth gobs of
| effort.
|
| (I do admit that the author has a point about times when apps
| just want a string value, so I think I'll add a function to get
| the string value of a move no matter what type it is.)
|
| Also, as much as the author slams TOML's avoidance of ambiguity,
| the industry has learned that Postel's Law is bad. We need to be
| conservative in what we emit and accept, for several reasons, the
| most important of which is that parsers are some of the most
| dangerous code; any possible ambiguity could be a security bug or
| many waiting to be discovered. My parser is opinionated because
| that prevents bad things from happening.
|
| The author is also wrong about C not handling mixed arrays well;
| my config format is JSON with a few niceties, and I easily
| implemented it in C, even though it can have mixed arrays.
|
| And braces/ brackets in JSON are not human-editable? Come on...
|
| That said, some changes to JSON _were_ necessary to make it
| easier to edit; besides comments, I also added keys that don't
| need quotes, as long as they are one word.
| mjevans wrote:
| The author's premise is that the strict typing and validation
| of a configuration file belong with the application, not some
| library doing initial validation of a document which may or may
| not conform to the application's logic.
| gavinhoward wrote:
| But even then, the logic depends on the contents of the
| string. If you need a country (as a string) and you get a
| number, you know something is wrong.
|
| If you need a country code and you get a Boolean, chances are
| you got Norway, but by having strict typing, you don't run
| into that ambiguity.
| eviks wrote:
| > All INI dialects however are well-defined (every INI file is
| parsed by some application, and by studying a parser's source
| code it is possible to deduce its rules)
|
| That's the definition of a lack of definition. Well-defined means
| you don't need to study a parser's source code and deduce the
| rules, you get the rules!
|
| > , and, if one looks closely, the number of INI dialects
| actually used in the wild is not infinite.
|
| and that doesn't help you since you'll have to study parser's
| source code to deduce
| laszlokorte wrote:
| There is no undefined behavior in C. You just have to read your
| compiler's source code or the generated assembly or simply run
| the program to learn the exact definition.
| joshxyz wrote:
| it's like comparing javascript and typescript. there are trade-
| offs in the presence (and lack) of verbosity.
| noirscape wrote:
| I kinda get where the author is coming from. Everyone is rushing
| over themselves to try and "fix" configuration file formats by
| adding more and more complexity so we humans can express more
| complex data types into configuration files.
|
| Nowadays TOML is the new hotness, 6 years ago it was YML, 9 years
| ago it was JSON and before that XML was our alleged savior.
|
| Ultimately there's no one perfect solution, but INI does have one
| thing going for it: it is _dead simple_. It has no type system
| and it leaves it up to the application to convert a set of
| key/value string pairs (grouped by a header) into something
| meaningful rather than attempting to guess and spitting out
| unexpected surprises.
|
| Contrast to XML which has 50 ways to say the same thing depending
| on what tickles your fancy, YML which is probably the
| configuration language with the fastest footguns in the West or
| TOML which attempts to express fairly complicated data types into
| a configuration file without questioning if that's even useful
| for 99% of the configuration files.
|
| The only one that is close to equal in simplicity is JSON, which
| only allows booleans, numbers, strings and arrays/dictionaries.
| Really I think the only reason why JSON failed the "human-written
| configuration file" contest is because spec-compliant
| implementations forbid comments. This being mostly because the
| author of the spec wanted to avoid JSON (de)serializers from
| putting custom logic in them.
|
| (Although I'll note that JSON specifically tends to have the most
| non-spec-compliant libraries that have a looser stance on it and
| allow in-line comments using JavaScript's rules, at least in my
| experience and even has a bunch of dedicated offshoot specs that
| are basically "JSON but with comments".)
| AndyKluger wrote:
| > The only one that is close to equal in simplicity is JSON,
| which only allows booleans, numbers, strings and
| arrays/dictionaries.
|
| NestedText only allows strings, arrays, and dictionaries, has a
| proper specification, and you never need to escape anything.
| marcosdumay wrote:
| > Nowadays TOML is the new hotness, 6 years ago it was YML, 9
| years ago it was JSON and before that XML was our alleged
| savior.
|
| Hum... Didn't you notice that complexity is not steadily
| increasing on that progression you pointed out?
|
| Anyway, INI is so simple that it doesn't even have a
| specification. So, I'd recommend people to avoid it.
| raincole wrote:
| A bit hard to tell which parts are satire and which are
| serious...
| zaphar wrote:
| If we assume this is serious then it falls into a division I
| see in a lot of areas. Catch errors sooner _or_ Catch errors
| later. The Catch errors later group tends to take the position
| that Because I have to catch some errors later anyway I might
| as well catch all the errors later. The catch errors earlier
| group tends to take the position that the earlier I can catch
| the error the easier it is to handle and the safer my code will
| be.
|
| Neither seems to be able to see valid points in the other's
| position so it ends up being polarizing.
| kybernetikos wrote:
| I don't think this is to do with catching errors. You can
| throw errors at load time with both mechanisms if that's what
| you want. Instead, the question is, should the file format
| include a bunch of notation so that the computer can
| deserialize the value to some type even without knowing what
| it'll be used for?
|
| Since ini is a format designed for humans to read and write,
| the argument is that no, the reading code should decide how
| to interpret the value, and this is fine because it knows
| whether it wants this particular value to be a boolean or a
| string or a number or a continent.
|
| The ini file reader has an implicit schema in its reader
| code. The TOML file makes (some of) the types explicit in the
| file itself at the expense of making it less convenient for a
| human to read and write.
| naasking wrote:
| Following this argument to its logical conclusion, why
| bother having any kind of standardized format at all, ini
| or otherwise? The program's config reader knows what it
| wants to read, why bother standardizing names, notation,
| section delimiters, or anything else?
| kybernetikos wrote:
| I presume the argument the author of this article would
| make is because it helps the human writer of the file.
| naasking wrote:
| And a standardized format takes that even further, since
| now text editors can be aware of more of the file
| structure and assist with highlighting, completion and
| more.
| kybernetikos wrote:
| NestedText was mentioned earlier in this thread, and it
| takes the philosophy to its logical conclusion. That
| conclusion includes schemas
| https://nestedtext.org/en/latest/schemas.html. Text
| editors could absolutely be written to understand schemas
| and provide the help you suggest.
| marcosdumay wrote:
| > why bother having any kind of standardized format at
| all, ini or otherwise?
|
| So you can reuse the parsing code and the file-editing
| instructions.
|
| Standardization is not about catching errors.
| naasking wrote:
| > The Catch errors later group tends to take the position
| that Because I have to catch some errors later anyway I might
| as well catch all the errors later.
|
| I'm not sure why that follows. It doesn't seem to apply to
| any other domain, eg: since I have to deal with diseases
| associated with aging later anyway, I might as well not take
| care of myself now and just go wild and do anything I feel
| like.
| robertlagrant wrote:
| This seems a bit odd:
|
| > TOML forces arrays to be encapsulated within square brackets
| (exactly like section paths do), although humans do not need
| square brackets for recognizing when something is a list.
|
| > # not an array in TOML
|
| > wishes = apples, cars, elephants, chairs
|
| But earlier he didn't want strings to be quoted. If these two
| criticisms were both applied, how do you distinguish the string
| "apples, cars, elephants, chairs" and the list ["apples", "cars",
| "elephants", "chairs"]?
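
The ambiguity can be sketched with Python's standard `configparser`, used here as a stand-in for any INI reader that returns raw strings. The section name `gift` is hypothetical. The file alone cannot say which reading is intended, so the application has to choose.

```python
import configparser

# Hypothetical INI text using the unquoted, unbracketed style from the article.
INI_TEXT = """
[gift]
wishes = apples, cars, elephants, chairs
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
raw = parser["gift"]["wishes"]

# Reading 1: the application treats the value as one string.
as_string = raw

# Reading 2: the application treats commas as list separators.
as_list = [item.strip() for item in raw.split(",")]

print(repr(as_string))
print(as_list)
```

Both readings are valid for the same line, which is the distinction that quotes and brackets make explicit in TOML.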
| scrollaway wrote:
| You don't! You rely on the application parsing logic because
| apparently that is the best idea ever!
|
| What a waste of time this article is, honestly.
| gweinberg wrote:
| It's surprising how many commenters point out that the post reads
| like a joke, but none of them seem to consider the possibility
| that it is, in fact, a joke. I'm pretty confident that it is.
| Consider this quote: "if one looks closely, the number of INI
| dialects actually used in the wild is not infinite". That isn't
| damning with faint praise, it's just damning.
| indymike wrote:
| The advantage of TOML over INI is that you can use a generic TOML
| linter or TOML validator to check files. This is very helpful
| when dealing with deployment and CI pipelines where it would be
| nice to just fail if the config file(s) are not valid. This can
| save a lot of time, and eliminate the whole "spin up a container
| and load the application with broken config file" step...
| uneekname wrote:
| There are some pretty bad takes here. An integer and a version
| string are obviously different.
|
| > no apparent motivation behind this rule, except that of
| conforming TOML to JSON...TOML's reason remains somewhat
| mysterious.
|
| No, it's not mysterious, you figured it out! TOML is designed in
| part to work fairly well with other common formats, like JSON.
|
| > ...except that a value can also be a date. There is something
| intriguing in all this. Even forgetting that an application might
| not need dates at all, why constraining something so particular
| and that can be formatted in so many different ways into a rigid
| primitive?
|
| Once again, you answered your own question. Dates can be
| formatted in many different ways, so TOML offers a standard date
| type. It's really helpful!
|
| > And even if you do survive the process of writing a parser that
| is fully compliant with TOML (some people don't), you still have
| done only half of the job, that of writing a parser, without
| really thinking of any real case usage.
|
| In my experience TOML parsers are more consistent than INI
| parsers in terms of behavior. They have been immediately helpful
| to me, as they support a configuration format I vastly prefer.
|
| What a funny write-up. TOML isn't perfect, but I like it much
| more than INI.
| atoav wrote:
| I used TOML for complex configuration and I am quite happy with
| it.
|
| However I often saw people who complained about it, and most of
| the complaints afaik are of the nature: "This extremely complex
| use case, that I could as well just write in an interpreted
| language isn't supported by TOML".
|
| If your configuration is so crazy it needs more than any
| configuration language offers, just use a scripting language
| instead. I have seen lua or python files act as configuration and
| there is nothing wrong with that.
| zajio1am wrote:
| There is definitely an argument for interpreting elementary data
| types later, at the application level instead of at the config
| format level, so that we do not create an arbitrary distinction
| between types at the config format level and at the application
| level, forcing IP addresses (and other domain-specific types) to
| be encoded as strings.
|
| But I do not think this is a good argument for structural data
| types like lists, sets and so on, because M applications would
| use N different ways to encode them. While a human can
| recognize them, it is hard for a human to remember that this
| specific application uses that specific encoding with its own
| quirks.
| throw0101c wrote:
| Personally I leaned towards ISC-style format for, e.g., BIND9:
|
| * https://bind9.readthedocs.io/en/latest/reference.html
|
|     additional-from-auth (yes | no) ;
|     allow-query-on { address_match_list };
|     also-notify { ip_addr [port ip_port] ; ... ] };
|     category category_name { channel_name; ... };
|     channel channel_name { channel_spec };
|
| * https://www.zytrax.net/books/dns/ch7/statements.html
|
| Kind of BNF-y.
| chrismorgan wrote:
| This is an extremely terrible defence/criticism, for many of the
| reasons pointed out by others already, but I'll add some more:
| INI came from Windows, and if you're going to call it INI I think
| it's reasonable to expect it to work with Windows' INI functions
| --or else you can call it conf, embracing the still-very-ad-hoc
| format used across Linux and such. (And yeah, I recognise that
| the library name invokes both labels, but beyond that they're
| focusing on INI.)
|
| But _not one of the INI examples shown will actually work as
| you'd expect using Windows' GetPrivateProfileString function_.
| Windows' INI-reading functions are _extremely_ simplistic; about
| the most magic thing is case-insensitivity of keys. You can't put
| a space around the equals sign: that gives you a key name that
| ends with a space and a value that starts with a space. There are
| no line continuations (section 5). You can't do what they call a
| composite configuration file (section 8). Empty keys (section 10)
| are fine. Implicit keys (section 12) don't work.
| tux3 wrote:
| This critique made me learn some new things about both formats,
| but I actually come out more in support of TOML because of it
|
| Each argument the author gives is a strong preference for
| everything being implicit, ambiguous, free of heavy syntax like
| quotes for strings and brackets for arrays
|
| That would be a reasonable subjective choice, but when I see the
| INI examples used to illustrate, or the slightly out there
| assertion that a 1 element list is a completely meaningless
| inhuman concept, I'm not really swayed. Sometimes it almost seems
| like irony. The examples are really hard to take at face value.
|
| If anything, this pushes me further away from that. I've done
| YAML. I've done languages where everything is helpfully
| implicitly cast, and all the "WTF talks" that result from the
| weird rules and edge cases.
|
| We know what it's like when DE is a string but NO is a boolean,
| and only some version numbers are strings while some are floats.
|
| Quoting strings is really not what costs me time and effort. It's
| weird edge cases and surprises that come back to bite you.
| kybernetikos wrote:
| I understand your points as precisely in the opposite direction
| to your conclusion.
|
| According to this post, the INI approach says that the code
| that reads the value determines what type it should be. That
| approach means that you never get a problem where NO is a
| boolean when it should have been an enum, or a version number
| is a float when it should have been a string. You only get
| those problems when the type of a value is determined from the
| file without reference to the expected type, like in TOML.
| PH95VuimJjqBqy wrote:
| and I agree with that approach personally.
|
| I don't consider YAML to be an acceptable configuration
| format and it kills me that the standard moved to it away
| from XML.
| coldtea wrote:
| > _According to this post, the INI approach says that the
| code that reads the value determines what type it should be.
| That approach means that you never get a problem where NO is
| a boolean when it should have been an enum, or a version
| number is a float when it should have been a string_
|
| That's a simplistic view.
|
| In practice, there soon won't be just _one_ piece of code
| reading your files, or it will be shared elsewhere, and it
| will all depend on implicit semantics and documentation of
| the assumptions (if you're lucky to have it). Hilarity,
| chaos, and head scratching ensues.
|
| Whereas with a format that enforces the types, every consumer
| with a parser for the format gets the same values (to the
| extent that it matters: a list doesn't suddenly become a
| string, but in some language it might be a vector and in
| another an array).
| marcosdumay wrote:
| Configuration files usually are only read by one piece of
| code. Besides, the article is correct in that the type
| system from TOML is completely inadequate for fully parsing
| the files anyway, so it will necessarily depend on implicit
| agreement on the semantics from every reader.
|
| IMO, configuration file formats should only ever have text
| as primitive type. Anything else should be defined in
| another layer. I completely agree with that part of the
| argument from the article.
|
| Then the article goes to argue that the quotes are
| harmful... And no, if you have a whitespace sensitive
| language, you need a damn good representation for strings
| that won't allow for ambiguity to creep in. And INI is just
| horrible on this.
| coldtea wrote:
| > _Configuration files usually are only read by one piece
| of code._
|
| Having a single source of truth and multiple services and
| scripts needing the same info means the same
| configuration file will get to be read by many pieces of
| code, even from different languages.
|
| And that's without considering piecemeal migration of the
| same "one piece of code" running on different services to
| another language or a version two design, still needing
| to read the same file.
|
| > _IMO, configuration file formats should only ever have
| text as primitive type. Anything else should be defined
| in another layer. I completely agree with that part of
| the argument from the article._
|
| I mean, that's not even wrong.
|
| Except if you mean "they should not be binary". Then,
| sure.
| AndyKluger wrote:
| > IMO, configuration file formats should only ever have
| text as primitive type. Anything else should be defined
| in another layer.
|
| I very much agree. If you haven't checked it out,
| NestedText is an excellent format that takes this
| sentiment to heart.
| MrBuddyCasino wrote:
| > _Sometimes it almost seems like irony. The examples are
| really hard to take at face value._
|
| Agree, with a few exceptions (eg empty key names), the TOML
| design choice just seems less ambiguous, simpler and thus...
| better?
| dale_glass wrote:
| I mean, it refers to Postel's law on the top. Most of it seems
| to follow from that.
|
| IMO that's long been proven to have been a very bad idea in
| retrospect. So good riddance.
| coldtea wrote:
| > _I mean, it refers to Postel's law on the top_
|
| One of the crappiest ideas in CS.
| kemotep wrote:
| I can imagine several security issues with accepting any
| kind of input in your program.
| masklinn wrote:
| Literally every language held as a shining example of
| postel's law is full to the brim with security issues.
|
| The literal interpretation of Postel's Law has been
| considered highly detrimental for 20 years:
| https://datatracker.ietf.org/doc/html/rfc3117#section-4.5
| sham1 wrote:
| Well it seems to work for TCP at least, which is where it
| comes from. Of course it's not the correct approach for
| everything, but calling it "one of the crappiest ideas in
| CS" might be a tad harsh.
|
| EDIT: Of course there are better ways to be robust than to
| try to just accept whatever garbage is thrown your way
| because "be liberal in what you accept." So for example
| since this is about config files, you could easily just
| tell the user that their stuff is wrong _and_ tell them how
| to fix it.
| bunderbunder wrote:
| I can accept Postel's Law as being a great idea for fairly
| low-impact things like markup languages. XHTML is a good
| example here: it turns out it wasn't an awesome idea, because
| if the author of an HTML file forgets to close a tag, I'd
| rather the browser make a best effort at displaying a
| document that might be a little janky, than show me nothing
| at all.
|
| But if we're talking configuration files for applications?
| No. Absolutely not. If I get anything even slightly off, do
| not under any circumstances respond by launching the
| application into an unpredictable state. Fail immediately and
| tell me why. Same principle applies for RPC messages.
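
"Fail immediately and tell me why" can be sketched as a strict loader. This is one possible shape, assuming Python's standard `configparser`; the `[server]` section, its keys, and the `ConfigError` class are hypothetical names chosen for the example.

```python
import configparser

# Hypothetical schema: expected keys in a [server] section and their converters.
EXPECTED = {"host": str, "port": int}


class ConfigError(ValueError):
    """Raised with a message that names the offending key."""


def load_server_config(text: str) -> dict:
    parser = configparser.ConfigParser()
    parser.read_string(text)  # malformed INI raises configparser.Error here
    if not parser.has_section("server"):
        raise ConfigError("missing [server] section")
    section = parser["server"]

    # Reject keys the application does not know instead of ignoring them.
    unknown = set(section) - set(EXPECTED)
    if unknown:
        raise ConfigError(f"unknown key(s) in [server]: {sorted(unknown)}")

    result = {}
    for key, convert in EXPECTED.items():
        if key not in section:
            raise ConfigError(f"missing key in [server]: {key}")
        try:
            result[key] = convert(section[key])
        except ValueError:
            raise ConfigError(
                f"[server] {key} = {section[key]!r} is not a valid "
                f"{convert.__name__}"
            ) from None
    return result


print(load_server_config("[server]\nhost = localhost\nport = 8080\n"))
```

Nothing is guessed or defaulted: a typo in a key name or a non-numeric port stops startup with a message that points at the exact line to fix.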
|
| The reductio ad absurdum here is weak typing. If Postel's Law
| were actually a generally applicable law, then PHP4 would be
| widely considered to be the pinnacle of language design. I
| think most people would agree that it's closer to the nadir.
|
| But still... context matters, XHTML was a mistake. Which
| implies that Postel's Law is true in at least some contexts.
| capitainenemo wrote:
| There's still a few nice things about XHTML that I miss.
| It's really helpful in debugging and catching mistakes.
| I'll actually force it on for dev and test systems just to
| quickly identify errors. I've caught hundreds of issues
| with templates that way. Sure there are markup validators,
| but the always on strict was nice. And still usable with
| XHTML5... So long as the web page is being generated by
| your code, the strictness is a win I think. And you can
| turn off strict in browsers by serving XHTML5 as HTML5 with
| text/html content type.
|
| The responseXML in XHR for XHTML is really nice, and still
| available though mostly useless. I wish when XHTML was
| abandoned that a responseParsedDOM was offered to avoid
| some of the exploitable hacks people came up with instead.
|
| XML transforms using XSL could do some pretty nifty tricks
| with static docs and no other processors but your browser.
|
| So, yeah, I don't feel it was wholly a mistake. Sure, for
| random or user-generated content it isn't a good idea, and
| it'd be nice if there were clean ways to handle that (not
| iframes), but saying that your app shouldn't have a strict
| rendering is like saying JSON should be forgiving of
| misplaced braces... If you're feeding bad JSON to your
| modern JS driven app, well, that's your fault and there
| should be errors and it should be fixed. Similar for XHTML
| for your server side app IMO.
| nayuki wrote:
| Good news: XHTML was never abandoned. It still exists
| today as an optional serialization format for HTML5. I am
| using it in practice on my website and described it in
| great detail:
| https://www.nayuki.io/page/practical-guide-to-xhtml
| capitainenemo wrote:
| Yep! Nice guide. And there's
| https://www.w3.org/TR/html-polyglot/ for polyglots.
|
| The main thing you lose (no idea why XHTML5 doesn't add
| support for this) is <noscript> is ignored. Obviously if
| you did any other form of JS detection in a session, you
| can just use that to offer alternate content.
| Nullabillity wrote:
| I'd question the common XHTML talking point too, why is it
| the browser's job to render content that you clearly
| haven't even bothered to proofread?
| [deleted]
| bunderbunder wrote:
| Because it's a _user agent_ and as the _user_ I want it
| to degrade as gracefully as possible. It doesn't serve
| my interests to refuse to render anything just because
| the author of the website forgot a </b> tag somewhere.
| I'd rather read the text just with formatting other than
| what the author intended, than not read the text at all.
| Don't punish me for someone else's typo.
| Nullabillity wrote:
| I'm just baffled about how this hypothetical scenario
| would even happen.
|
| Did the author of the website never try rendering the
| page themself before pushing it to live?
|
| If user-generated content is able to trigger this then
| you have an XSS vulnerability on your hands, strict
| validation or not.
| layer8 wrote:
| By that logic, broken SVGs and the like should also be
| rendered leniently. That doesn't make any sense.
|
| If HTML had been strictly schema-validated from the
| start, nobody would be arguing for this.
|
| It's certainly true that HTML being parsed leniently
| helped in it being picked up by amateur website authors
| in the early days, because they weren't confronted with
| error messages (though they were confronted with "why
| doesn't this render as I expect" instead). But that has
| little to do with user expectations by browser users.
| aidenn0 wrote:
| Postel's law is a way to aid in adoption, not a way to
| increase correctness.
|
| If Product X accepts malformed input I, but product Y does
| not, then product X appears to "work better" than product Y
| and people will adopt X more. (The other half of the law
| also helps in adoption; if you emit very conservative
| output, then your output works with everybody else as well,
| also making your product look better).
|
| If authors of webpages only had access to browsers that
| implemented strict XHTML, then there would be a lot fewer
| missing close-tags out there. Things have largely been
| sorted out now, but for a while it was a case of "I have to
| be just as good at rendering complete garbage as IE is, or
| nobody will use my browser" which I hesitate to label as
| "positive" in any meaningful sense.
| pwdisswordfishc wrote:
| > Postel's law is a good indicator of how robust a language is:
| the more a language is able to make sense of different types of
| input, the more robust the language.
|
| Into the trash it goes.
|
| Seriously, the avoid-crashing-at-all-costs anti-pattern is what
| made HTML, JavaScript and PHP the messes that they still are,
| from which the latter one is only now recovering at a glacial
| pace. For once, we could learn the lesson.
| afavour wrote:
| > what made HTML, JavaScript and PHP the messes that they still
| are
|
| Also arguably the three most popular ways anyone got into
| programming in the last couple of decades. Something worth
| pondering on.
| Diggsey wrote:
| Because for a long time if you wanted to build a website with
| no money and no experience those were your only options!
|
| JavaScript and HTML for obvious reasons. PHP because if you
| didn't have your own server, you couldn't run anything else
| (while there were free hosting providers that would host your
| PHP scripts).
|
| Nowadays there are tons more options.
| toxik wrote:
| Lisp was a thing before PHP, and software engineers did use
| it but it never reached the popularity of PHP. It is in the
| end a question of being pragmatic, which PHP and Perl are.
| djha-skin wrote:
| > Postel's Law about being "conservative in what you emit and
| liberal in what you accept" is quite frankly not a good
| engineering principle.
|
| -- Joel Spolsky[1]
|
| 1: https://www.joelonsoftware.com/2003/10/08/the-absolute-
| minim...
| mongol wrote:
| Has Postel himself backtracked though? Because while Joel is
| an authority of sorts, Postel was too...
| coldtea wrote:
| That would be relevant if the criterion was "what some
| authority believes".
|
| Joel merely states what reality showed about Postel's law
| over decades.
|
| It's a thing developers know from experience, if Joe Random
| had said the above quote, it would still be true.
| mongol wrote:
| The success of internet's core protocols tells a
| different story
| coldtea wrote:
| Everything can be a success if it's the only game in town
| or a free option, even Javascript.
|
| And we shouldn't conflate adoption success with design
| quality either.
|
| That said, it's not like TCP is a great example of
| Postel's law, and surely not in the crude way it's
| understood and practiced by its advocates. The RFC says:
|
| "As a simple example, consider a protocol specification
| that contains an enumeration of values for a particular
| header field -- e.g., a type field, a port number, or an
| error code; this enumeration must be assumed to be
| incomplete. Thus, if a protocol specification defines
| four possible error codes, the software must not break
| when a fifth code shows up. An undefined code might be
| logged (see below), but it must not cause a failure."
|
| Which is hardly the "anything goes" ticket people imagine
| it to be. E.g. TCP would still break on a badly formed
| header and consider it an error.
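
The distinction drawn here, tolerating an unknown enumeration value while still rejecting a malformed header, can be sketched as follows. The three-byte header layout and the error codes are invented for the example and are not TCP's.

```python
import struct
from enum import IntEnum


class ErrorCode(IntEnum):
    # Hypothetical codes defined by an imaginary protocol spec.
    OK = 0
    TIMEOUT = 1
    REFUSED = 2
    RESET = 3


# Toy header: 1-byte error code, 2-byte payload length, network byte order.
HEADER = struct.Struct("!BH")


def parse_header(data: bytes):
    """Return (code, length); code is an ErrorCode, or a raw int if unknown."""
    if len(data) != HEADER.size:
        # Structural problem: reject rather than guess what was meant.
        raise ValueError(f"header must be {HEADER.size} bytes, got {len(data)}")
    raw_code, length = HEADER.unpack(data)
    try:
        code = ErrorCode(raw_code)
    except ValueError:
        # Unknown enumeration value: pass it through so the caller can log it.
        code = raw_code
    return code, length


print(parse_header(bytes([1, 0, 5])))
print(parse_header(bytes([4, 0, 0])))
```

A fifth, undefined code flows through as a plain integer, while a header of the wrong length is still an error.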
|
| Besides, the advice is good for a transmission control
| protocol, especially one involving in-between nodes that
| will not care for the enumeration values like ports and
| such like the start/end nodes do, and just need to pass
| them through.
|
| It's horrible for other types of software. Language
| parsing would be a great example where it should not be
| followed. And of course the most famous related shit show
| is HTML handling in browsers. HTML might be successful,
| but it's hardly because of following the Postel
| principle.
| mongol wrote:
| > HTML might be successful, but it's hardly because of
| following the Postel principle.
|
| I think if web browsers were not lenient with HTML parsing
| the web would have been adopted much slower in the
| initial years. Also, HTML5 can be seen as a confirmation
| that XHTML2 with a strict parsing would not be
| successful. I followed the WHATWG mailing lists quite
| closely when that effort began. This is evidence for the
| applicability of Postel's law at least in the case of
| HTML.
| zaphar wrote:
| Nearly all of the internet's core protocols reject
| bad/malformed packets. They don't do a best effort to
| figure out what the sender "intended" they just reject
| the packet.
|
| Some of the most popular instances of doing a best effort
| to figure out the intent of the sender are also poster
| children for protocol level security flaws. If we've
| learned anything from deploying, managing, and developing
| on top of the core internet protocols it's this:
|
| 1. Be very conservative in what you send.
|
| 2. Reject anything that isn't what you expected to get.
|
| If you squint you can sort of see that being a valid
| interpretation of Postel's law but it's not the standard
| interpretation in practice.
| nayuki wrote:
| The leniency of HTML burned me a lot in my beginner days. At
| some point, I decided to switch to XHTML (serving as the media
| type "application/xhtml+xml") and never looked back. Ditto
| JavaScript, the laxness harmed a lot, and only like 15 years
| later did I prepend "use strict" to every script.
| HideousKojima wrote:
| Every time I hear Postel's Law mentioned all I can think is
| "God forbid we actually expect people to correctly implement a
| spec." I mean I could kind of get it if the specification is
| poorly written/ambiguous, but that's a problem with the spec
| itself in that case. Otherwise it's just adding unneeded
| complexity (that can majorly harm performance) for no real
| gain, except that you accommodate the people too incompetent to
| correctly implement a spec.
| indymike wrote:
| > Seriously, the avoid-crashing-at-all-costs anti-pattern is
| what made HTML, JavaScript and PHP the messes that they still
| are
|
| Forgiving syntax made HTML and PHP easy, and because it was
| easy people were quick to learn and use them. Everything is
| trade-offs.
| HideousKojima wrote:
| Clear and sane error messages (i.e. something with the level
| of quality of Rust's compiler) could have accomplished the
| same thing without creating the insanity we have now.
| Aaargh20318 wrote:
| > and because it was easy people were quick to learn and use
| them
|
| and then went on to write a shitload of insecure code.
|
| Just because it's simple to use, doesn't mean that just
| anyone should be using it. The problem with PHP is that it
| can be used by someone with far below average programming
| skills to make something functional. But the flip side of how
| forgiving it is, is that it takes someone of above average
| skill to use it to make something safe and performant.
|
| Making it easy to use wrongly also makes it harder to use it
| right. Simply because you lack any feedback when you do
| something dumb.
| astrobe_ wrote:
| > value in EUR = 345 # valid with libconfini but invalid in TOML
|
| This is degenerate.
|
| You want a key-value format easy to pick up, and easy to edit by
| hand - otherwise you'd rather just use something like SQlite - in
| particular if you "need" the other insanity that is sections in
| INI files.
|
| Why the heck allow spaces and UTF8 in key names? To satisfy
| someone's libido?
| gloria_mundi wrote:
| Unicode is useful for languages other than English, and has
| nothing to do with anyone's libido.
| jonhohle wrote:
| The author does not seem to appreciate separating tokenization
| from interpretation. Can you write a (choose your data format of
| choice) document that is valid in (your data format of choice)
| and not valid config for your application? Absolutely! Can you
| write an arbitrarily specced INI file that is both structurally
| invalid and not valid config? Even more so!
|
| Choosing a standardized data format gives you a constellation of
| tools for managing, generating, querying, and validating data
| that you don't need to write in an application.
|
| I'm not sure if TOML libraries support it, but the Ion libraries,
| for example, allow you to see the next data type in code and
| adjust accordingly. If you want to accept a symbol, string, or
| array of values, the application can choose to do that. Oh, and
| you can write a formal schema with things like enumeration values
| that users and code can use to validate a document. So if you're
| in custom INI land, that's yet another tool you need to write on
| top of everything else.
|
| If I didn't want to pull in a dependency, I might write a quick
| and dirty INI parser, knowing there are likely bugs, corner
| cases, and all kinds of potential future issues that the extra
| code entails. If I'm taking on a dependency, I'd probably choose
| a well defined, human and machine readable/writable format that
| has schema support. Then I'd write a schema and point people
| (operators and programmers) to it for reference, not the code
| implementing the parser.
| mongol wrote:
| I agree with the typing arguments. A config parser can only parse
| the values to a degree, to primitive types basically. The
| application can parse them fully. So the question is to what
| extent the parser helps the application vs to what extent it
| introduces subtle problems. As a config file user I would prefer
| not to have to care so much, but also not to have to work around
| weird corner cases. YAML's no-problem is a perfect example. This
| would not exist if the parser did not try to be helpful.
|
| I am in the author's camp.
| AndyKluger wrote:
| I know I'm being repetitive in all these config file threads,
| but you might like NestedText, which only provides strings,
| lists, and maps.
| donatj wrote:
| I'm not a big fan of TOML, but I find the typing criticism here
| weak. I would far rather my configs have a strict interpretation
| of 89 vs "89" vs 89.0
|
| None of the INI parsers I have ever used have just returned
| everything as a raw castable string of exactly what the user
| entered. There's always a horrible layer of interpretation. Many,
| including the one built into PHP have confusing rules around
| bools, for instance
|
| > String values "true", "on" and "yes" are converted to true.
| "false", "off", "no" and "none" are considered false. "null" is
| converted to null
|
| Literally the quotation marks don't help.
|
|     foo = false
|     foo = "no"
|
| Either way you are getting a bool false.
| djha-skin wrote:
| We have showcased here the classic impedance mismatch between
| serialization in a typed language and serialization in a dynamic
| language. The author clearly is in the typed camp, speaking of
| making enumeration labels and a dated type for the names of
| continents.
|
| The typed language camp loves the idea of structured untyped
| strings in a serialization format, such as NestedText or INI.
| However, this is uncomfortable in dynamic language territory, for
| many use cases. Because of this dichotomy, any universal
| serialization format is going to feel like a compromise. JSON is
| the poster child of this.
|
| In the YAML spec, the authors explicitly stated a goal of
| supporting native data types of dynamic languages. This design
| decision seems to be a good compromise between the typed and
| dynamic camps, but the old saying applies: a good compromise is
| where everyone goes home angry.
| [deleted]
| PaulHoule wrote:
| Sometimes I dream of a pseudo natural language configuration
| language based on the ideas of Inform7
|
| https://ganelson.github.io/inform-website/book/WI_6_1.html
|
|     The blog database is a postgres database named "foo" at host
|     "db.example.com" with username "harry" and password "bar"
|
| The thing is that probably needs tool assistance as much or more
| than any other configuration format would.
| xedrac wrote:
| For nearly every argument made in the article, I found myself
| siding with the TOML approach. The "no non-ascii keys except
| wrapped in quotes" thing is a bit wonky I admit.
| epage wrote:
| toml 1.1 will allow non-ascii in keys (and multi-line inline
| tables)
|
| See https://github.com/toml-lang/toml/blob/main/CHANGELOG.md
| nmilo wrote:
| The fact that YAML is so bad is why being "understandable by a
| human" shouldn't be the ultimate goal of any configuration
| language. I would gladly make concessions like quoted strings or
| bracketed lists to avoid the hell of trying to figure out if my
| string/number/list actually parsed as a string/number/list.
| vkazanov wrote:
| YAML is the Perl of config formats. I just don't understand how
| it ended up being so popular!
|
| I remember my oss editor at some point had a little problem
| highlighting a yaml configuration. No problem, - I thought. -
| how hard can that be?
|
| Well, turns out it's almost cpp level hard to properly parse
| the language.
| Pxtl wrote:
| Yaml convinced me that dynamically typed config & interchange
| formats are a bad idea.
|
| Statically typed schema eliminate the ambiguity and so you get
| to have your legibility cake and eat your consistency cake too.
| hitchstory wrote:
| that's what strictyaml does
| kibwen wrote:
| I used to be in favor of schemas, but my problem with them
| these days is that they just can't encode all the validation
| necessary to ensure that the config is correct. At the end of
| the day, the only way to check if the config is actually
| valid is to parse it, so I'm sympathetic to the "string them
| all and let the application sort them out" approach.
| IggleSniggle wrote:
| I'm sympathetic to that, but not to ini! INI is _just
| standardized enough_ that your ini parser may or may not
| give you "just a string." As another poster mentioned, it
| may well interpret the string "NO", quotes included, as the
| boolean false, before passing along to the rest of the
| application. It's this ambiguity of type that makes INI
| problematic. If it simply handed along strings, without
| fail, and left the application to parse whether "NO" should
| be a country or boolean or string, that wouldn't be a
| problem.
|
| But inevitably, in order to DRY, somebody will make a
| consistent parser that is used in your application, whether
| that's in-house or a dependency. And at that point, it is
| very tempting to run everything through the parser, and the
| parser is going to make some unexpected decisions.
|
| So, sure, use INI. But don't really. Use a env file that is
| parseable as INI or as shell environment variables. As soon
| as you start needing anything more complex, use something
| where you have at least a few basic guarantees that what
| you're getting is at least in the general vicinity of what
| you want.
| kibwen wrote:
| Right, I'm not trying to say that INI is the solution,
| only that 1) TOML's anemic selection of types is probably
| pointless, and 2) any attempt to provide a useful
| selection of types would require being a Turing-complete
| language, which is not what I want in my config files, so
| you might as well just give me strings and let me parse
| it in the typed, Turing-complete language that I'm
| already using for my application logic.
| AndyKluger wrote:
| Absolutely agree!
|
| If you haven't checked it out, NestedText takes this
| approach.
| Pxtl wrote:
| Yes. I like the NestedText approach, but I do feel like
| it needs an official optional "blessed" schema
| description language for type validation, instead of
| "here's a dozen ways to do NestedText validation in
| python".
| rdtsc wrote:
| Having used INIs already, it seemed TOML didn't really bring
| enough to the table to be worth switching to it. But I imagine
| switching from YAML to TOML might be an improvement.
|
| Date support, and general type-awareness and nested sections also
| seemed like anti-features to me.
| palmfacehn wrote:
| Comments prescribing example configurations resolve most of these
| issues. Even if the ideal config format is found and Internet
| debates settle the issue decisively, README.txt will still be a
| good practice.
|
| There's very little which isn't permissible if we presuppose
| users will read the source to understand a config file or
| software generally.
___________________________________________________________________
(page generated 2023-09-21 23:01 UTC)