[HN Gopher] YAML: The Norway Problem (2022)
       ___________________________________________________________________
        
       YAML: The Norway Problem (2022)
        
       Author : carlos-menezes
       Score  : 205 points
       Date   : 2025-04-12 22:10 UTC (1 days ago)
        
 (HTM) web link (www.bram.us)
 (TXT) w3m dump (www.bram.us)
        
       | firesteelrain wrote:
       | This problem occurs because pyyaml load() uses the full YAML 1.1
       | schema. There is another loader, BaseLoader, that interprets
       | every scalar as a string, which is the workaround the article
       | suggests. Just another way to achieve it.
       | 
       | It's a bit of a sore spot in the YAML community as to why PyYAML
       | can't / won't support YAML 1.2. It was in maintenance mode for a
       | while. YAML 1.2 also introduced breaking changes.
       | 
       | From a SO comment: "As long as you're okay with the YAML 1.1
       | standard, PyYAML is still perfectly fine, secure, etc. If you
       | want to support the YAML 1.2 spec (released in 2009), you can use
       | ruamel.yaml, which started out as a fork of PyYAML. - CrazyChucky
       | Commented Mar 26, 2023 at 20:51"
       | 
       | - https://stackoverflow.com/q/75850232
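       | 
       | A rough stdlib-only sketch of why this bites. The two patterns
       | below mirror, as far as I know, PyYAML's implicit bool resolver
       | (YAML 1.1) and the YAML 1.2 core schema. resolve_plain_scalar is
       | a made-up helper for illustration, not a PyYAML function.

```python
import re

# Unquoted scalars that a YAML 1.1 resolver (PyYAML style) treats as booleans.
YAML11_BOOL = re.compile(
    r"^(?:yes|Yes|YES|no|No|NO|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
# Unquoted scalars the YAML 1.2 core schema treats as booleans.
YAML12_BOOL = re.compile(r"^(?:true|True|TRUE|false|False|FALSE)$")

TRUE_WORDS = {"yes", "true", "on"}


def resolve_plain_scalar(text, bool_pattern):
    """Hypothetical helper: return a bool if text matches bool_pattern, else the string."""
    if bool_pattern.match(text):
        return text.lower() in TRUE_WORDS
    return text


# Country codes as they might appear unquoted in a YAML list.
for code in ["SE", "DK", "NO"]:
    print(code, resolve_plain_scalar(code, YAML11_BOOL),
          resolve_plain_scalar(code, YAML12_BOOL))
```

       | Under the 1.1 pattern only NO turns into a boolean. Quoting the
       | value skips this resolution step entirely.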
        
         | rat87 wrote:
         | Yeah, it's a problem. I had to put up a PR on a tool I was
         | using because I ran into the Norway problem on YAML I was
         | getting from another team. I did ask them to add quotes just in
         | case.
        
           | firesteelrain wrote:
           | A supplier we contracted with and we gave requirements to
           | asked me what format do we want the export/import of the data
           | to be in and I said JSON. It's simple, easy and can be
           | converted into anything else very easily
        
         | gschizas wrote:
         | I wish that ruamel.yaml had better documentation. I've had to
         | dive into the code so many times to find out how to do
         | something.
        
       | gnabgib wrote:
       | Related
       | 
       |  _The YAML document from hell_ (566 points, 2023, 353 comments)
       | https://news.ycombinator.com/item?id=34351503
       | 
       |  _That's a Lot of YAML_ (429 points, 2023, 478 comments)
       | https://news.ycombinator.com/item?id=37687060
       | 
       |  _No YAML_ (Same as above) (152 points, 2021, 149 comments)
       | https://news.ycombinator.com/item?id=29019361
        
         | mdaniel wrote:
         | And some light commentary a few days ago:
         | https://news.ycombinator.com/item?id=43648263 - Apr 2025 (51
         | comments)
        
       | dissent wrote:
       | I reckon if this is really a big concern for anybody, then they
       | are probably writing way too much YAML to begin with. If you're
       | being caught out by things like this and need to debug it, then
       | it maps very cleanly to types in most high level languages and
       | you can generate your YAML from that instead.
        
         | makeitdouble wrote:
         | Sadly you usually realize you've been writing too much YAML way
         | past the turning point, and it will be a pain to move a single
         | file to JSON for instance when you have a whole process and
         | system that otherwise ingest YAML, including keeping track of
         | why this specific part of JSON and not YAML.
         | 
         | So people work around the little paper cuts, while still
         | hitting the traps from time to time as they forget them.
         | 
         | > generate YAML
         | 
         | I've a hard time finding a situation where I'd want to do that.
         | Usually YAML is chosen for human readability, but here we're
         | already in a higher level language first. JSON sounds a more
         | appropriate target most of the time ?
        
           | dissent wrote:
           | There are probably two use cases.
           | 
           | Configuration files for programs. These tend to be short.
           | 
           | DSLs which are large manifests for things like cloud
           | infrastructure. These tend to be long, they grow over time.
           | 
           | My pet hypothesis is these DSLs exist mostly for neutrality -
           | the vendor can't assume you have Python or something present.
           | But as a user, you can assume just that and gain a lot by
           | authoring in a proper language and generating YAML.
           | 
           | See https://github.com/cloudtools/troposphere for a great
           | example for AWS CloudFormation.
        
             | bigstrat2003 wrote:
             | > Configuration files for programs. These tend to be short.
             | 
             | This is where I use YAML and it shines there. IMO easier to
             | read and write by hand than JSON, and short sweet config
             | files don't have the various problems people run into with
             | YAML. It's great.
        
             | makeitdouble wrote:
             | I can't run the examples right now, but looking at the last
             | "print(template.to_json())" line, looks like the main use
             | case is JSON ?
             | 
             | On cloud infra, yes, having one or two layers of languages
             | is a natural situation. GCP and AWS both accepting
             | (encouraging?) JSON as a subset of YAML makes it a simpler
             | choice when choosing an auto generating target.
             | 
             | You mention people wanting to author the generated files, I
             | think in other situations tweaking the auto-generated files
             | will be seen as riskier with potential overwriting issues,
             | so lower readability will be seen as a positive.
        
               | dissent wrote:
               | That's the point really, you can generate JSON or YAML
               | and it doesn't really matter. If you want to include 100
               | similar objects in that output, you can use a for loop.
               | You can't do that in plain JSON/YAML.
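               | 
               | A minimal sketch of the for-loop idea using only the
               | standard library. The manifest shape and the field names
               | (workers, name, region, enabled) are invented for the
               | example and do not come from any real tool.

```python
import json

# Hypothetical manifest: 100 similar worker definitions built with a loop
# instead of being written out by hand.
workers = [
    {"name": f"worker-{i:03d}", "region": "eu-north-1", "enabled": True}
    for i in range(100)
]
manifest = {"workers": workers}

# Serialise once at the end; the output can be fed to whatever consumes it.
document = json.dumps(manifest, indent=2)
print(document[:120])
```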
        
           | charrondev wrote:
           | Isn't yaml a strict superset of JSON? Any compliant YAML
           | parser should be able to ingest a JSON document.
        
             | mannykannot wrote:
             | Are there no cases where well-formed JSON could be subject
             | to the problems covered in the article, when parsed by a
             | compliant YAML parser? I'm asking because I know nothing
             | about YAML and not much more about JSON.
        
               | charrondev wrote:
               | Not that I know. JSON requires strings to be quoted which
               | is basically the problem here. Of course it's not a great
               | human writable configuration format (no comments being a
               | huge problem).
               | 
               | I'm just pointing out that it should be very simple to
               | swap a YAML file for a JSON file in any system that
               | accepts YAML
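               | 
               | A small sketch of the quoting point, using Python's json
               | module as a stand-in for any strict JSON implementation:

```python
import json

# Serialising always quotes strings, so the country code stays a string.
encoded = json.dumps({"country": "NO", "enabled": False})
print(encoded)

# An unquoted NO is not valid JSON; the parser rejects it instead of guessing.
try:
    json.loads('{"country": NO}')
    rejected = False
except json.JSONDecodeError:
    rejected = True
print("bare NO rejected:", rejected)
```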
        
               | makeitdouble wrote:
               | JSON is stricter than YAML so that class of issues is
               | avoided.
        
             | makeitdouble wrote:
             | Yes. Rewriting a YAML file into strict JSON won't have any
             | impact on the ingestion or the processing of it.
        
             | throwawaymaths wrote:
             | https://metacpan.org/pod/JSON::XS#JSON-and-YAML
        
               | charrondev wrote:
               | > I have been pressured multiple times by Brian Ingerson
               | (one of the authors of the YAML specification) to remove
               | this paragraph, despite him acknowledging that the actual
               | incompatibilities exist. As I was personally bitten by
               | this "JSON is YAML" lie, I refused and said I will
               | continue to educate people about these issues, so others
               | do not run into the same problem again and again. After
               | this, Brian called me a (quote)complete and worthless
               | idiot(unquote).
               | 
               | > In my opinion, instead of pressuring and insulting
               | people who actually clarify issues with YAML and the
               | wrong statements of some of its proponents, I would
               | kindly suggest reading the JSON spec (which is not that
               | difficult or long) and finally make YAML compatible to
               | it, and educating users about the changes, instead of
               | spreading lies about the real compatibility for many
               | years and trying to silence people who point out that it
               | isn't true.
               | 
               | > Addendum/2009: the YAML 1.2 spec is still incompatible
               | with JSON, even though the incompatibilities have been
               | documented (and are known to Brian) for many years and
               | the spec makes explicit claims that YAML is a superset of
               | JSON. It would be so easy to fix, but apparently,
               | bullying people and corrupting userdata is so much
               | easier.
               | 
               | Well that's disappointing.
        
               | alabastervlog wrote:
               | This explains some things on, like, a _mythic_ level,
               | that I've felt about yaml practically since the first
               | time I saw it.
               | 
               | I guess software are human texts after all.
        
         | dev_l1x_be wrote:
         | True. YAML is an intermediate representation between my
         | intention expressed in Dhall and what runs in production.
         | 
         | https://github.com/dhall-lang/dhall-kubernetes
        
       | ashishb wrote:
       | How often do people even encounter this issue? I have been using
       | YAML for 5+ years and have never had it before. Further, I use
       | `yamllint` which points this out as a lint issue "truthy value
       | should be one of [false, true]".
        
         | rat87 wrote:
         | I have when getting an openapi yaml file from someone else.
        
         | mongol wrote:
         | Has been encountered where I work. A global website with lots
         | of country-specific config.
        
         | hinkley wrote:
         | Fractions are discriminatory when they happen to one individual
         | or group every time or even just the first time.
         | 
         | See also p95 but the same couple of users always see the p99
         | time, due to some bug.
        
           | ashishb wrote:
           | Indeed, based on the comments, it is a scissor-bug. Most
           | people never encountered it while some encountered it a lot.
        
         | Y-bar wrote:
         | Never experienced it for the past 10+ years since the bug was
         | fixed in the spec.
        
         | speedgoose wrote:
         | I have encountered it once, though I live in Norway and worked
         | in IT there for a decade.
        
         | tetha wrote:
         | I don't recall encountering the norway problem in the wild.
         | 
         | Ansible has a pretty common issue with file permissions,
         | because pretty much every numeric representation of a file mode
         | is a valid number in YAML - and most of them are not what you
         | want.
         | 
         | Sure, we can open up a whole 'nother can of worms if we should
         | be programming infrastructure provisioning in YAML, but it's
         | what we have. Chef with Ruby had much more severe issues once
         | people started to abuse it.
         | 
         | Plus, ansible-lint flags that reliably.
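         | 
         | A sketch of the file mode trap in plain Python, assuming the
         | YAML 1.1 integer rules where an unquoted leading zero means
         | octal. The variable names are just for illustration.

```python
import stat

# An unquoted 0644 is read as octal under the YAML 1.1 integer rules.
mode_from_leading_zero = int("0644", 8)  # 420 in decimal
# An unquoted 644 is a plain decimal integer, which is a different bit pattern.
mode_from_plain_644 = 644

# Render both as ls-style permission strings for a regular file.
print(stat.filemode(stat.S_IFREG | mode_from_leading_zero))
print(stat.filemode(stat.S_IFREG | mode_from_plain_644))
print(oct(mode_from_plain_644))
```

         | Quoting the mode as a string ("0644") avoids depending on how
         | the parser reads the number.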
        
         | peanut-walrus wrote:
         | I for one did encounter exactly this problem when configuring a
         | list of countries via ansible for geoip whitelisting.
        
         | jeltz wrote:
         | I have seen it twice but I work in Sweden where we often do
         | things also for the Norwegian market.
        
         | Y_Y wrote:
         | I don't think false is truthy.
        
       | thyrsus wrote:
       | I do a lot of ansible which needs to run on multiple versions,
       | and their yaml typing is not consistent - whenever I have a
       | variable in a logic statement, I nearly always need to apply the
       | "| bool" filter.
        
         | polski-g wrote:
         | Yep. I just want strict yaml:
         | 
         | anything encased in quotes is a string, anything not is not a
         | string (bool, int or float)
        
           | awestroke wrote:
           | Including keys?
        
         | mdaniel wrote:
         | This is likely hair splitting, but you are far more likely
         | getting bitten by the monster amount of variance in jinja2
         | versions/behaviors than by anything "yaml-y"
         | 
         | For example, yaml does not care about this whatsoever
         |     - name: skip on Tuesdays
         |       when: ansible_date_time.weekday != "Tuesday"
         | 
         | but different ansible versions are pretty yolo about whether
         | one needs to additionally wrap those fields in jinja2 mustaches
         |     - name: skip on Tuesdays
         |       when: '{{ ansible_date_time.weekday != "Tuesday" }}'
         | 
         | and another common bug is when the user tries to pass in a
         | boolean via "-e" because those are coerced into _string_
         | key-value pairs as in
         |     $ ansible -e not_today=true -m debug -a var=not_today all
         |     localhost | SUCCESS => {
         |         "not_today": "true"
         |     }
         | 
         | but if one uses the jinja/python compatible flavor, it does the
         | thing
         |     $ ansible -e not_today=True -m debug -a var=not_today all
         |     localhost | SUCCESS => {
         |         "not_today": true
         |     }
         | 
         | It may be more work than you care for, since sprinkling rampant
         | |bool likely doesn't actively hurt anything, but the
         | |type_debug filter[1] can help if it's behaving mysteriously
         | 
         | 1:
         | https://docs.ansible.com/ansible/11/collections/ansible/buil...
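         | 
         | A sketch of why a string that merely looks like a boolean is
         | risky in a condition. to_bool is a hypothetical stand-in for
         | an explicit "| bool" style conversion; it is not Ansible's
         | implementation and its accepted words are an assumption.

```python
def to_bool(value):
    """Hypothetical explicit conversion; not Ansible's actual filter code."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in {"true", "yes", "on", "1"}


# A value passed as key=value text arrives as a string,
# and any non-empty string is truthy in Python.
not_today = "false"
print(bool(not_today))     # plain truthiness of the string
print(to_bool(not_today))  # explicit conversion
```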
        
       | quechimba wrote:
       | We had this issue many years ago when people from Norway couldn't
       | sign up. Took us a while to figure out
        
         | duxup wrote:
         | Or were they from Noway ...
        
         | magicalhippo wrote:
         | As a Norwegian I'm very curious, where in the pipeline were you
         | using YAML? And why?
         | 
         | I've only seen it used for configuration.
        
           | tough wrote:
           | usually locale's paths gone wrong
        
           | StableAlkyne wrote:
           | I've seen teams use it as a replacement for JSON because it
           | has the perception of being more "modern"
        
             | bornfreddy wrote:
             | While JSON is annoying because it lacks some pretty basic
             | features (comments, trailing comma), at least its spec is
             | short. YAML is huuuge - there are way too many ways to do
             | the same thing.
        
           | quechimba wrote:
           | We were using a fork of
           | https://github.com/carmen-ruby/carmen/tree/master/iso_data/b...
           | with our own translations. We used the data in the signup form.
        
         | TZubiri wrote:
         | I usually think of yaml for internal config files, would never
         | think of yaml for user data.
         | 
         | Don't ask me why though, might have something to do with how
         | it's written like a python file, no user would want to write
         | their data in yaml format.
        
           | nurgasemetey wrote:
           | Probably, OP didn't keep user data in YAML, but I think there
           | was config that kept allowed countries to sign up.
        
         | dmckeon wrote:
         | Narrow escape for people from Yemen (YE).
        
       | singpolyma3 wrote:
       | Quote your strings
        
         | pavel_lishin wrote:
         | Empty the footgun before firing.
        
       | umanwizard wrote:
       | "Be liberal in what you accept" rears its ugly head once more.
        
         | eyelidlessness wrote:
         | Being liberal in what you accept, also known as the "robustness
         | principle", doesn't mean being _ambiguous or surprising_ about
         | how you accept it. If anything, robustness requires a great
         | deal more precision and clarity (at least with your own
         | reasoning, then with how you communicate what to expect from
         | it).
        
           | kazinator wrote:
           | Postel's Law does not deserve to be nicknamed the Robustness
           | Principle.
           | 
           | Robustness has a meaning and it refers to handling bad inputs
           | gracefully. An example of a lack of robustness is allowing a
           | malicious actor to execute arbitrary code by supplying a
           | datum larger than some buffer limit.
           | 
           | Trying to make sense of invalid inputs and do something with
           | them isn't robustness. It's just an example of making an
           | extension to a spec. The extension could be robust or not.
           | 
           | Postel's Law amounts to "have extensions and hacks to handle
           | incorrectly formatted data, rather than rejecting them". So,
           | OK, yes, that entails being robust to certain bad inputs that
           | are outside of the spec, but which land on one of the
           | extensions. It doesn't entail being robust to inputs that
           | fall outside of the core spec and all hacks/extensions.
           | 
           | Cherry picking certain bad inputs and giving them a meaning
           | isn't, by itself, _bona fide_ robustness; robustness means
           | handling all bad inputs without crashing or allowing security
           | to be compromised.
        
             | hinkley wrote:
             | Pandering to customers will make you a lot of money today
             | but very narrow margins tomorrow. If you're in startup
             | mentality your bosses may be 100% fine with that. But you
             | will likely be stuck supporting that crap because you
             | didn't become wealthy in the IPO/merger.
        
             | munch117 wrote:
             | Postel's law isn't about accepting arbitrary invalid
             | inputs. It's about inputs that are technically invalid but
             | the intent is obvious from looking at it, and handling
             | those according to intent.
             | 
             | In a distributed non-adversarial setting, this is exactly
             | what you want for robustness.
             | 
             | The problem, as we've come to realise in the time since
             | Postel's law was formulated, is that there is no such thing
             | as a distributed non-adversarial setting. So I get what
             | you're saying.
             | 
             | But your definition of robustness is too narrow as well.
             | There's more to robustness than security. When Outlook
             | strips out a certificate from an email for alleged security
             | reasons, then that's not robustness, that's the opposite,
             | brokenness: You had one job, to deliver an attachment from
             | A to B, and you failed.
             | 
             | Robustness and security can be at odds. It's quite OK to
             | say, "on so and so occasion I choose to make the system
             | _not_ robust, because the robust solution would not be
             | sufficiently secure".
        
               | inopinatus wrote:
               | You are spot on in regards to consideration of
               | adversarial context; however, it is instructive to review
               | the nuanced difference between the RFC761 (1980)
               | statement, viz. "be conservative in what you do, be
               | liberal in what you accept from others" and the
               | substitution of "send" for "do" in RFC1122 (1989). The
               | latter is, with hindsight, an error, since it refocused
               | the attention of some rigid thinkers entirely onto
               | protocol mechanics and away from implementation
               | behaviour, despite the commentary beneath that admonishes
               | such a mindset and concurs wholly with your point.
               | 
               | Or to put it otherwise, Postel was right to begin with,
               | albeit perhaps just a little too cryptic, and has been
               | frequently misquoted and misinterpreted ever since.
        
               | kazinator wrote:
               | The "liberal in what you accept" part is what is flat out
               | wrong; is _that_ being misquoted?
        
               | inopinatus wrote:
               | It's not wrong, because it doesn't stand alone, and yes,
               | if you're cherry picking half a statement, then you're
               | like the Australian politicians calling it a "lucky
               | country", perpetuating a gross misrepresentation of both
               | the letter and the sentiment of the original statement.
               | 
               | This is especially ironic given that the constructive
               | argument against Postel's law is generally based on the
               | value of a strict interpretation of specification. If
               | you're intentionally omitting half of the law, then you
               | have an implementation problem.
               | 
               | Furthermore, none of this has much to do with YAML being
               | a shitty design.
        
               | kazinator wrote:
               | > intent is obvious from looking at it
               | 
               | Ouch, no. Dragons be there. Famous last words.
               | 
               | The only area in which it is acceptable to reason this
               | way is graphical user interfaces. (And only if you've
               | provided an API already for reliable automation, so that
               | nobody has to automate the application through its GUI.)
               | I say graphical, because, no, not in command interfaces.
               | 
               | Even in the area of GUIs, new heuristics about intent
               | cause annoyances to the users. But only annoyances and
               | nothing more.
               | 
               | Like for example when you update your operating system,
               | and now the window manager thinks that whenever you move
               | a window so that its title bar happens to touch the top
               | of the screen, you must be indicating the intent to
               | maximize it.
               | 
               | I suppose the ship has sailed now that people are
               | deploying LLMs in this way and that and those things
               | intuit intent. They are like Postel's Law on
               | amphetamines. There is a big cost to it, like warming the
               | planet, and the systems become fragile for their lack of
               | specification.
               | 
               | > When Outlook strips out a certificate from an email for
               | alleged security reasons
               | 
               | I would say it's being liberal in what it accepts, if
               | it's an alternative to rejecting the e-mail for security
               | reasons.
               | 
               | It has taken a datum with a security problem and "fixed"
               | it, so that it now looks like a datum without that
               | security problem.
               | 
               | (I can't find references online to this exact issue that
               | you're referring to, so I don't have the facts. Are you
               | talking about incoming or outgoing? Is it a situation
               | like an expired or otherwise invalid certificate not
               | being used when sending an outgoing mail? That would be
               | "conservative in what you send/do".)
        
               | umanwizard wrote:
               | > It's about inputs that are technically invalid but the
               | intent is obvious from looking at it, and handling those
               | according to intent.
               | 
               | Surely someone at some point thought it was obvious that
               | "No" should mean "false", and that's why we're now in
               | this mess.
        
         | inopinatus wrote:
         | This has little to do with the robustness principle, however
         | mis-stated. It's just shitty design. But if someone was still
         | hell-bent on invoking it, then if anything, it's a straight-up
         | violation of the adjacent words "be conservative in what you
         | do"[1], and further disregards the commentary in RFC1122[2]:
         |     ... assume that the network is filled with malevolent
         |     entities that will send in packets designed to have the
         |     worst possible effect ...
         | 
         | [1] https://datatracker.ietf.org/doc/html/rfc761#section-2.10
         | 
         | [2] https://datatracker.ietf.org/doc/html/rfc1122#page-12
        
         | senderista wrote:
         | That works only when everyone is trying in good faith to follow
         | the standard, i.e. basically never. My version of Postel's Law:
         | 
         |  _If you accept crap, then eventually you will receive only
         | crap._
        
       | kazinator wrote:
       | In Lisp, if you want to read text into symbols (e.g. file of
       | words), you just switch to a dedicated package in which those
       | symbols are interned. Then if NIL happens to come up, it will be
       | a symbol named "NIL" in that package, unrelated to the special
       | object.
        
       | TZubiri wrote:
       | That edge case sounds like a reasonable tradeoff you would make
       | for such a simple and readable generic data format.
       | 
       | Escaped json probably hits that sweetspot by being a bit uglier
       | than yaml, but 100 times simpler than xml, though.
        
         | tetha wrote:
         | Mh, since I just commented about ansible, you just made XML-
         | based ansible flash in front of my eyes. I think I'm in a bit
         | of pain now.
         |     <tasks>
         |         <ansible.builtin.copy notify="restart minio">
         |             <src> files/minio.service </src>
         |             <dest> /etc/systemd/system/minio.service </dest>
         |             <owner> root </owner>
         |             <group> root </group>
         |             <mode> 0x644 </mode>
         |         </ansible.builtin.copy>
         |     </tasks>
         | 
         | But you could use XSLT to generate documentation in XHTML from
         | your playbooks about what files are deployed, what services are
         | managed...
        
           | mdaniel wrote:
           | I will die on the hill that writing out ansible.builtin. are
           | characters of my life I'll never get back, and refuse to. If
           | it's _built in_ why do I have to qualify it?!
           | 
           | Also, watch out: 0x644 != 0644 which is the mode you meant
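           | 
           | For the record, the two literals from the XML example
           | above, checked in Python (which spells octal as 0o644):

```python
hex_mode = 0x644    # the literal written in the <mode> element
octal_mode = 0o644  # the chmod-style mode that was presumably intended

print(hex_mode, octal_mode)
print(oct(hex_mode))  # the hex literal expressed as an octal mode
```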
        
       | thund wrote:
       | I like using tags and avoid any doubt
       | 
       | !!boolean
       | 
       | https://dev.to/kalkwst/a-gentle-introduction-to-the-yaml-for...
        
       | nnurmanov wrote:
       | Another solution is to change the country name:)
        
         | gunalx wrote:
         | No way.
        
           | rwoerz wrote:
           | Neitherway
        
           | hinkley wrote:
           | Or in New Zealand: Nor way.
        
         | gunalx wrote:
         | Though we have renamed amino acids, I think it was, because
         | Microsoft Excel switched the original names to months.
        
           | madcaptenor wrote:
           | Genes, not amino acids.
           | 
           | https://www.theverge.com/2020/8/6/21355674/human-genes-renam...
        
       | riffraff wrote:
       | Usual reminder that this is not a problem in YAML 1.2 released 15
       | years ago.
       | 
       | Sadly many libraries still don't support it.
        
         | lifthrasiir wrote:
         | This effectively means that a new version of specification
         | didn't solve the problem at all.
        
       | normie3000 wrote:
       | Google App Engine used to do this to environment variables
       | defined in YAML. IIRC it would convert the string "true" to
       | "Yes", which was a fun surprise when deploying Java And NodeJS
       | apps.
        
       | raffraffraff wrote:
       | Why not just use quotes all the time for strings?
        
         | mystifyingpoi wrote:
         | I like that in concept, but 1) literally no one does that
         | (prime example - Kubernetes docs) and 2) it looks much more
         | messy with quotes, when you know that they are unnecessary in
         | 95% of cases.
        
           | zelphirkalt wrote:
           | Oh, I did that in Ansible stuff. Using quotes for all
           | strings. Exactly because I know what a mess YAML is.
        
         | kinow wrote:
         | I guess sometimes it is out of your control. I work on a
         | workflow manager where users specify their workflows with YAML.
         | So there's little we can do to prevent them from writing things
         | like no, n, t in a place it could cause some issue like in the
         | article.
        
           | zelphirkalt wrote:
           | Ah, the many places that choose to use YAML for no good
           | reason...
        
             | kinow wrote:
             | Yeah, can't say much about it as I joined after they had
             | already decided on using YAML.
        
         | kergonath wrote:
         | Because that's annoying. YAML is often written and read by
         | humans. If you want a verbose and more regular way to do it,
         | there is always JSON. But JSON is really annoying to deal with
         | for humans, although it is much better than YAML for several
         | applications.
        
           | hnlmorg wrote:
           | You don't actually need quotes to define a string in YAML. Eg
           | the following syntax
           |     User:
           |       Name: >-
           |         Bob
           |       Phone: >-
           |         01234 56789
           |       Description:>-
           |         This is a
           |         multi line
           |         description
           | 
           | That's both readable and parses your records as strings.
           | 
           | Edit: This stack overflow link provides more details
           | https://stackoverflow.com/questions/3790454/how-do-i-break-a...
        
             | mdaniel wrote:
             | I can't tell if it's irony or not given the sentiment in
             | this thread, but that is not a declaration of a multiline
             | Description field, that's a field of User named
             | "Description:>-" that happens to be missing its trailing
             | ":"
             | 
             | Seeing that used systemically, versus just for "risky"
             | fields makes me want to draw attention to the fantastic
             | remarshal tool[1], which offers a "--yaml-style >" (and "|"
             | and the rest) which will render yaml fields quoted as one
             | wishes
             | 
             | 1: https://github.com/remarshal-project/remarshal#readme
             | and/or $(brew install remarshal)
        
               | hnlmorg wrote:
               | > I can't tell if it's irony or not given the sentiment
               | in this thread, but that is not a declaration of a
               | multiline Description field, that's a field of User named
               | "Description:>-" that happens to be missing its trailing
               | ":"
               | 
               | The trailing ':' was there right after the 'n'.
               | 
               | Examples of this syntax:
               | 
               | https://github.com/lmorg/murex/blob/master/builtins/core/
               | arr...
               | 
               | I do agree it's a bit of a kludge. But if you want data
               | types and unquoted strings then anything you do to the
               | syntax to denote strings over other data types then
               | becomes a bit of a kludge.
               | 
               | The one good thing about this kludge is it allows for
               | string literals (ie no complicated escaping rules).
               | 
               | > Seeing that used systemically, versus just for "risky"
               | fields makes me want to draw attention to the fantastic
               | remarshal tool[1], which offers a "--yaml-style >" (and
               | "|" and the rest) which will render yaml fields quoted as
               | one wishes
               | 
               | I don't really understand what you're alluding to here.
        
               | mdaniel wrote:
               | Tell us that you didn't try to use that example without
               | telling us you just eyeballed the post
               | 
               |     $ /usr/local/opt/ansible/libexec/bin/python3 -c \
               |         'import sys, yaml; print(yaml.safe_load(sys.stdin.read()))' <<YML
               |     User:
               |       Name: >-
               |         Bob
               |       Phone: >-
               |         01234 56789
               |       Description:>-
               |         This is a
               |         multi line
               |         description
               |     YML
               |     yaml.scanner.ScannerError: while scanning a simple key
               |       in "<unicode string>", line 6, column 6:
               |         Description:>-
               | 
               |     $ gojq --yaml-input . <<YML
               |     User:
               |       Name: >-
               |         Bob
               |       Phone: >-
               |         01234 56789
               |       Description:>-
               |         This is a
               |         multi line
               |         description
               |     YML
               |     gojq: invalid yaml: <stdin>:6
               |         6 |   Description:>-
               |               ^  could not find expected ':'
               | 
               | That's because, for better or worse, yaml considers that
               | a legitimate key name, just missing its delimiter
               | 
               |     $ gojq --yaml-input . <<YML
               |     User:
               |       Name: >-
               |         Bob
               |       Phone: >-
               |         01234 56789
               |       Description:>-:
               |         This is a
               |         multi line
               |         description
               |     YML
               |     {
               |       "User": {
               |         "Description:>-": "This is a multi line description",
               |         "Name": "Bob",
               |         "Phone": "01234 56789"
               |       }
               |     }
               | 
               | This exchange in a thread complaining about the
               | whitespace sensitivity doesn't escape me
               | 
               | As for remarshal, it was the _systemic_ application of
               | that quoting style that made me think of it, since
               | writing { Name:  >- Bob} is the worst of both worlds: not
               | as legible as the plain unquoted version, not suitable
               | for grep, _and_ indentation sensitive
        
               | hnlmorg wrote:
               | The issue is the lack of white space between the : and
               | the >, not a missing : at the end. I'm typing this on my
               | phone so the odd syntax error might creep in but the key
               | pointer is the examples I linked to and the block token
               | I've described.
               | 
               | Further to that point, none of the example links I've
               | shared have the : at the end and I have production code
               | that works using the formatting I've described. So you're
               | flat out wrong there with your assumption that block keys
               | always terminate with :
               | 
               | > As for remarshal, it was the systemic application of
               | that quoting style that made me think of it, since
               | writing { Name: >- Bob} is the worst of both worlds: not
               | as legible as the plain unquoted version, not suitable
               | for grep, and indentation sensitive
               | 
               | You wouldn't write code like that because >- denotes a
               | block and you're now inlining a string.
               | 
               | I mean I've shared links explaining how this works and
               | you're clearly not reading them.
               | 
               | At the end of the day, I'm not going to argue that >-
               | (and its ilk) solves everything. It clearly doesn't. If
               | you want to write "minimized" YAML using JSON syntax then
               | you're far far better off quoting the string.
               | 
               | But if you are writing a string in YAML and either don't
               | want to deal with quotation marks, or need that string to
               | be a string literal (ie not having to escape things like
               | quotation marks) then my suggestion is an option.
               | 
               | It's not there as a silver bullet but it is a lesser
               | known feature of YAML. Hence me sharing.
               | 
               | Now go read the links and understand it better. You might
               | genuinely find it useful under some scenarios ;)
        
               | mdaniel wrote:
               | I enjoy that you're scolding me about 'not reading' after
               | doubling down on the accuracy of your initial post, which,
               | yes, I can easily imagine you did from your phone
               | 
               | And yet I brought receipts for my claims, and you just
               | bring "reed the manul, n00b"
        
               | hnlmorg wrote:
               | Firstly, I didn't say "read the manual", I said "read the
               | links I shared". And that's a pretty reasonable comment
               | to make given I took the time to find examples knowing
               | that I couldn't easily type them out on my phone. And if
               | you bothered to open the links you'd realize they were
               | brief and to the point. I was actually trying to be
               | helpful.
               | 
               | Secondly, your "receipts" were incorrect. Neither of your
               | examples follows the examples I cited, and your second
               | example creates a key named "Description:>-", which is
               | clearly wrong. Hence why ">-" needs to be _after_ the
               | colon.
               | 
               | Here are more examples and evidence of how to use >- and
               | why your "receipts" were also incorrect:
               | 
               | https://go.dev/play/p/1B4ba-dUARq
               | 
               | Here you can clearly see my example:
               | 
               |     Foo: >-
               |       hello
               |       world
               | 
               | produces:
               | 
               |     { "Foo": "hello world" }
               | 
               | which is correct.
               | 
               | Whereas your example:
               | 
               |     Bar:>-:
               |       hello
               |       world
               | 
               | produces
               | 
               |     { "Bar:\u003e-": "hello world" }
               | 
               | which is incorrect.
               | 
               | ----
               | 
               | One final point: I don't understand why you're being so
               | argumentative here. I posted a lesser-known YAML feature
               | in case it helps some people and you've turned it into
               | some kind of pissing match based on bad-faith
               | interpretations of my comments. There was no need for you
               | to do that.
        
       | stephenr wrote:
       | It's not a coincidence that YAML is a perfect acronym for "yet
       | another migraine looming".
       | 
       | I mean ok it is technically a coincidence but it definitely feels
       | like the direct result of the "what could possibly go wrong"
       | approach the spec writers apparently took
        
       | praptak wrote:
       | Related: the YAML exponent problem[0]
       | 
       | TLDR: unquoted hex hash in YAML is fine until it happens to match
       | \d+E\d+ when it gets interpreted as a float in scientific
       | notation.
       | 
       | [0]https://www.brautaset.org/posts/yaml-exponent-problem.html
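       | 
       | A minimal sketch of the collision, assuming the float pattern of
       | the YAML 1.2 core schema as I read it. Parsers differ here
       | (PyYAML's YAML 1.1 resolver, as far as I know, wants a decimal
       | point), so treat the regexes as an assumption. The helper name
       | resolves_as_float is made up for illustration:

```python
import re

# Float and int patterns as given (to my reading) in the YAML 1.2 core schema.
CORE_FLOAT = re.compile(r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?")
CORE_INT = re.compile(r"[-+]?[0-9]+")


def resolves_as_float(scalar: str) -> bool:
    """True if an unquoted scalar would be read as a float, not a string."""
    if CORE_INT.fullmatch(scalar):
        return False  # plain decimal digits resolve as an int instead
    return CORE_FLOAT.fullmatch(scalar) is not None


# Only the all-decimal hash with a single 'e' in the middle is at risk.
for short_hash in ["12a4e56", "1234e56", "9f8e7d6"]:
    print(short_hash, resolves_as_float(short_hash))
```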
        
       | ajuc wrote:
       | YAML is just doing too much and trying to be too clever.
        
       | xelxebar wrote:
       | This has been fixed since 2009 with YAML 1.2. The problem is that
       | everyone uses libyaml (_e.g._ PyYAML _etc._) which is stuck on
       | 1.1 for reasons.
       | 
       | The 1.2 spec just treats all scalar types as opaque strings,
       | along with a configurable mechanism[0] for auto-converting non-
       | quoted scalars if you so please.
       | 
       | As such, I really don't quite grok why upstream libraries haven't
       | moved to YAML 1.2. Would love to hear details from anyone with
       | more info.
       | 
       | [0]:https://yaml.org/spec/1.2.2/#chapter-10-recommended-schemas
        
         | xigoi wrote:
         | I'm sad that the "fix" was to disallow "no" as a more readable
         | alternative to "false", rather than to disallow unquoted
         | strings.
        
           | xelxebar wrote:
           | The fix is to make conversion user-controllable. If you want
           | to disallow bare scalars except for booleans and numbers or
           | whatever, it's just a little bit of configuration away.
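           | 
           | Roughly what "user-controllable" means, as a sketch in plain
           | Python rather than any real YAML library's API: every bare
           | scalar starts as a string and only the rules you pass in
           | convert it. The names resolve, BOOLS and INTS are
           | hypothetical:

```python
import re
from typing import Callable, List, Tuple

# A rule pairs a pattern with the conversion to apply when it matches.
Rule = Tuple[re.Pattern, Callable[[str], object]]


def resolve(scalar: str, rules: List[Rule]) -> object:
    """Apply the first matching rule; with no match, keep the string."""
    for pattern, convert in rules:
        if pattern.fullmatch(scalar):
            return convert(scalar)
    return scalar


# Opt-in rules: only true/false are booleans, only decimal digits are ints.
BOOLS: Rule = (re.compile(r"true|false"), lambda s: s == "true")
INTS: Rule = (re.compile(r"[-+]?[0-9]+"), int)

print(repr(resolve("no", [])))              # no rules: stays a string
print(repr(resolve("no", [BOOLS, INTS])))   # "no" is not in the bool rule
print(repr(resolve("true", [BOOLS, INTS])))
print(repr(resolve("42", [BOOLS, INTS])))
```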
        
           | mckn1ght wrote:
           | It's silly to have so many keyword synonyms as specified in
           | that earlier regex. I'm also glad we can't specify numeric
           | literals as roman numerals. KISS
        
             | xigoi wrote:
             | Honestly I'd prefer if "yes" and "no" were the only ways to
             | spell the boolean values. They make sense in pretty much
             | all contexts where booleans are used, whereas "true" and
             | "false" rarely make sense.
        
               | tacker2000 wrote:
               | In boolean logic true/false is ubiquitous and well
               | known. As you can see, if one tries to be cute with it,
               | one will get all sorts of issues. So at this point it
               | doesn't make sense to use anything else.
        
               | xigoi wrote:
               | The true/false terminology makes sense in boolean logic
               | because you're dealing with the truth of propositions.
               | However, it does not make sense in the context of a
               | configuration language, where there are no propositions
               | that could be true or false.
        
               | stevage wrote:
               | Huh, I never considered this. We take true and false for
               | granted everywhere but they aren't the most meaningful.
        
               | umanwizard wrote:
               | It makes sense in the context of a configuration language
               | because virtually 100% of programmers and other technical
               | computer users understand "true" and "false" as the
               | canonical Boolean values, and as far as I know that has
               | always been the case. It never would have made sense to
               | invent different unfamiliar terms like "yes" and "no"
               | because of some niche philosophical distinction between
               | "Boolean logic" and "configuration" that almost nobody in
               | the real world cares about.
        
               | xigoi wrote:
               | "yes" and "no" are "unfamiliar terms"? What the fuck?
               | Everyone who knows even the basics of English knows what
               | these words mean.
        
               | umanwizard wrote:
               | They are familiar as English words, yes, but unfamiliar
               | as terms of art for Boolean values in computing. It'd be
               | like replacing "if" statements with "whenever"
               | statements.
        
               | mckn1ght wrote:
               | Don't give them any ideas! They already tried to make
               | inroads with ruby's "unless".
        
               | dtech wrote:
               | Boolean algebra with true and false was well established
               | decades before computers were invented
        
               | xigoi wrote:
               | Boolean algebra deals with logical propositions, not with
               | configuration. The true/false terminology makes sense
               | there.
        
           | pydry wrote:
           | Yeah, you still get the same issue that 3 is an integer, 3.3
           | is a float and 3.3.3 is a string.
        
           | heavenlyblue wrote:
           | Why do you need an alternative spelling of false?
        
             | xigoi wrote:
             | `logging: no` clearly says "I do not want logging".
             | `logging: false` is less explicit - what exactly is false?
        
               | jeltz wrote:
               | Then it should be on/off, not yes/no.
        
               | xigoi wrote:
               | on/off also doesn't make sense in many contexts, for
               | example `isRegistered: on`.
        
               | qw wrote:
               | I often prefer enums over booleans for this. It seems
               | more readable for most cases, and can be extended with
               | new values.
               | 
               | This:
               | 
               |     isRegistered: true
               | 
               | could be replaced with
               | 
               |     accountStatus: "UNREGISTRED"
        
               | qznc wrote:
               | Logging: ignore/print/file
               | 
               | Don't use bool at all.
        
               | xigoi wrote:
               | This, along with number formats, could be a good argument
               | for strings being the only primitive type in config
               | languages.
        
               | qznc wrote:
               | I recently learned about NestedText:
               | https://nestedtext.org/
               | 
               | While it has the YAML-like significant whitespace, it
               | looks nice because it doesn't try to be clever.
        
               | alkonaut wrote:
               | Logging: no could also be log in Norwegian. Or log only
               | for the Norwegian region. That's the thing with too many
               | keywords and optional quoting, you can't know.
               | 
               | And for this reason, "logging: false" would be clearer
               | than "logging: no" to represent "I do not want logging".
        
               | xigoi wrote:
               | `false` could be a code for something else just as well
               | as `no`. For example, it could mean that I only want to
               | see logs of false information appearing in the system.
               | The only proper solution is to require quotes around
               | strings.
        
               | paulddraper wrote:
               | Options are either
               | 
               | 1. Specify in the key:
               | 
               |     loggingEnabled: false
               | 
               | 2. Specify in the value:
               | 
               |     logging: disabled
        
         | maxloh wrote:
         | Why wasn't that a major version bump, like YAML 2.0?
         | 
         | That sounds like a breaking change that causes old YAML
         | documents to be parsed differently.
        
         | transfire wrote:
         | Absolutely correct! Please correct me if I am wrong, but as far
         | as I know, no one has implemented YAML completely according to
         | spec.
         | 
         | The tag schema used is supposed to be modifiable folks!
         | 
         | And why anyone would still be using 1.1 at this point is just
         | forehead palming foolishness.
        
       | anvandare wrote:
       | "The limits of my keyboard mean the limits of my programming
       | language."
       | 
       | If only they had had ⊤ and ⊥ somewhere on their keys to work
       | with Booleans directly while designing the languages. In another
       | branch of history, perchance.[1]
       | 
       | [1]
       | https://en.wikipedia.org/wiki/APL_(programming_language)#/me...
        
         | tossandthrow wrote:
         | ⊥ and ⊤ is not entirely congruent to false and true.
         | 
         | Boolean and propositional logic is not the same.
        
           | Q6T46nT668w6i3m wrote:
           | For _ordinary_ two-valued classical propositional logic,
           | e.g., YAML, they are congruent.
        
         | rusk wrote:
         | I have an emacs macro for this
        
       | hgomersall wrote:
       | IMO the proposed solution of StrictYAML + schema is the right one
       | here and what we use extensively for human readable configs.
       | StrictYAML (linked to in the post) is essentially a string-type-
       | only restriction of YAML, so you impose your type coercion on the
       | parsed data structure.
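       | 
       | A sketch of that second step in plain Python. This is not
       | StrictYAML's actual API; the config dict, SCHEMA, parse_bool and
       | coerce are all hypothetical names. The point is only that the
       | parser hands back strings and the schema decides the types:

```python
from typing import Callable, Dict

# What a string-only parser would hand back (hypothetical config).
parsed: Dict[str, str] = {"country": "NO", "port": "8080", "logging": "no"}


def parse_bool(text: str) -> bool:
    """Accept exactly yes/no; anything else is a schema error."""
    if text not in ("yes", "no"):
        raise ValueError(f"expected yes/no, got {text!r}")
    return text == "yes"


# The schema, not the parser, says which fields are not strings.
SCHEMA: Dict[str, Callable[[str], object]] = {
    "country": str,
    "port": int,
    "logging": parse_bool,
}


def coerce(raw: Dict[str, str], schema: Dict[str, Callable[[str], object]]) -> dict:
    """Convert each value with its schema entry, rejecting unknown keys."""
    unknown = set(raw) - set(schema)
    if unknown:
        raise KeyError(f"unexpected keys: {sorted(unknown)}")
    return {key: schema[key](value) for key, value in raw.items()}


config = coerce(parsed, SCHEMA)
print(config)  # {'country': 'NO', 'port': 8080, 'logging': False}
```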
        
         | vander_elst wrote:
         | If you have a schema, why not directly use something like
         | protobufs?
        
           | LelouBil wrote:
           | The comment you replied to talks about human readable configs
        
             | Mesopropithecus wrote:
             | Like this?
             | https://protobuf.dev/reference/protobuf/textformat-spec/
        
               | mdaniel wrote:
               | https://protobuf.dev/reference/protobuf/textformat-
               | spec/#:~:...
               | 
               | and, setting that aside, the very next paragraph says
               | that this is a legit representation of -2.0 which means
               | something has gone gravely wrong
               | 
               |     value: -
               |       # change this to 3.14 one day
               |       2.0
        
       | weinzierl wrote:
       | Perl has a _Poland Problem_. The customary file extension for
       | Perl files is *.pl. This worked well until Apache introduced
       | content negotiation and the convention to add a language code as
       | file extension. It had index.html.en, index.html.de, for example.
       | 
       | index.html.pl is where the problem started and the reason why the
       | officially recommended file extension for Perl files used to be
       | (still is?) *.plx.
       | 
       | I don't have the Camel book at hand, but Randal Schwartz's
       | _Learning Perl_ 5th edition says:
       | 
       | _"Perl doesn't require any special kind of filename or
       | extension, and it's better not to use an extension at all. But
       | some systems may require an extension like plx (meaning PerL
       | eXecutable); see your system's release notes for more
       | information."_
        
         | dtech wrote:
         | That sounds more like an Apache problem than a Perl problem.
         | It's their mistake and it's not even relevant outside Apache
         | context
        
           | weinzierl wrote:
           | It should have been an Apache problem, yes. Not only did it
           | turn out that at least the language negotiation part of
           | content negotiation wasn't the best idea but the way Apache
           | handled it was problematic apart from the pl problem. In the
           | end the Perl community took the issue upon themselves, so
           | historically I'd say it was a Perl problem (of choice).
        
           | maxloh wrote:
           | That should be marked as a breaking change on the Apache side
           | IMO. It would be a security nightmare if server code were
           | leaked to the public.
        
         | ginko wrote:
         | Also, Prolog has the Perl problem. :)
        
       | whacko_quacko wrote:
       | Pandas has a Nigeria problem, where NA -> NaN.
       | 
       | It's not that bad, because you can explicitly turn that behavior
       | off, but ask me how I know =(
        
         | trueismywork wrote:
         | How?
        
         | orangewindies wrote:
         | That's a Namibia problem, Nigeria is NG.
        
       | pkkm wrote:
       | Programming with string templates, in a highly complex and
       | footgun-rich markup language, is one of the things I find most
       | offputting about the DevOps ecosystem.
        
         | sph wrote:
         | I believe Satan itself decided to mix YAML, Jinja and Turing-
         | complete logic when it created Ansible. It truly is the
         | sendmail of the modern era.
        
           | senderista wrote:
           | Several years ago when I was writing a deployment system for
           | a cloud distributed database, I tried to automate everything
           | with Ansible playbooks and the Ansible "API" (LOL). I pretty
           | quickly gave up on implementing anything but the most trivial
           | logic in templated YAML and switched to Python (wrapping
           | maximally-dumb Ansible playbooks) for everything nontrivial.
        
           | Fizzadar wrote:
           | You might like pyinfra.
        
             | mdaniel wrote:
             | Just about every time someone complains about ansible,
             | there's a comment to plug this project but pyinfra seems to
             | opt-out of the _cloud_ provisioning part, instead
             | delegating to its terraform connector, which drags in all
             | the nonsense that entails. That makes it not only less
             | useful but (IMHO) a _horrible_ name for a project that only
             | does "remote execution" and not _infrastructure_. The fact
             | that it's even missing @aws @azure @gcp connectors further
             | solidifies "who is the audience for this thing?"
        
         | nicktelford wrote:
         | This is why I generally use Terraform for Kubernetes. It's not
         | perfect, but it's miles better than the various different YAML-
         | templating solutions (Kustomize, Helm) popular in the
         | Kubernetes ecosystem.
        
           | mdaniel wrote:
           | Two different _stateful_ recordkeeping control planes with
           | disparate opinions about the current state of the world. What
           | can go wrong.
        
       | azernik wrote:
       | Even worse is the all-decimal MAC problem.
       | 
       | Some genius decided that, to make time input convenient, YAML
       | would parse HH:MM:SS as SS + 60xMM + 60x60xHH. So you could enter
       | 1:23:45 and it would give you the correct number of seconds in 1
       | hour, 23 minutes, and 45 seconds.
       | 
       | They neglected to put a maximum on the number of such sexagesimal
       | places, so if you put, say, six numbers separated by colons like
       | this, it would be parsed as a very large integer.
       | 
       | Imagine my surprise when, while working at a networking company,
       | we had some devices which failed to configure their MAC addresses
       | in YAML! After this YAML config file had been working for literal
       | years! (I believe this was via netplan? It's been like a decade,
       | I don't remember.)
       | 
       | Turns out, if an unquoted MAC address had even a single non-
       | decimal hex digit, it would do what we expected (parse as a
       | string). This is not only by FAR the more common case, but also
       | we had an A in our vendor prefix, so we never ran into this
       | "feature" during initial development.
       | 
       | Then one day we ran out of MAC addresses and got a new vendor
       | prefix. This time it didn't have any letters in it. Hilarity
       | ensued.
       | 
       | (This behavior has thankfully been removed in more recent YAML
       | standards.)
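       | 
       | For anyone who wants to see the arithmetic, a small sketch of
       | the base-60 rule, assuming the integer pattern of the YAML 1.1
       | int type as I remember it (first group starts with 1-9, every
       | later group is below 60). The function name is made up and the
       | MAC addresses are invented examples:

```python
import re

# Base-60 integer pattern as I recall it from the YAML 1.1 int type.
SEXAGESIMAL = re.compile(r"[-+]?[1-9][0-9_]*(:[0-5]?[0-9])+")


def resolve_yaml11(scalar: str):
    """Return an int for base-60 scalars, else the unchanged string."""
    if not SEXAGESIMAL.fullmatch(scalar):
        return scalar
    sign = -1 if scalar.startswith("-") else 1
    digits = scalar.lstrip("+-").replace("_", "")
    total = 0
    # Each colon-separated group is one more base-60 place, with no upper limit.
    for group in digits.split(":"):
        total = total * 60 + int(group)
    return sign * total


print(resolve_yaml11("1:23:45"))            # 5025 seconds
print(resolve_yaml11("12:34:56:1a:20:30"))  # hex digit present: stays a string
print(resolve_yaml11("12:34:56:10:20:30"))  # all decimal: becomes a large int
```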
        
       | maelito wrote:
       | We should use very basic YAML parsers without these kinds of
       | features.
        
       | ghuntley wrote:
       | See also https://noyaml.com (feel free to send in PRs with your
       | gripes/gotchas re: YAML)
        
       | alkonaut wrote:
       | Always quote all yaml strings. If you have a yaml file that has
       | something that isn't a simple value (number, boolean) such as for
       | example a date, time, ip-address, mac address, country code,
       | phone number, server name, configuration name, etc. etc. then you
       | are asking for trouble. Just DON'T DO THAT. It's pretty simple.
       | 
       | "Yeah but it's so convenient"
       | 
       | "Yeah but the benefit of yaml is that you don't need quotes
       | everywhere so that it's more human readable"
       | 
       | DON'T
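       | 
       | If you can't get everyone to quote, a rough lint helps catch
       | the values that slipped through. A sketch, assuming the boolean
       | spellings listed in the YAML 1.1 bool type (real parsers may
       | honor only a subset); needs_quotes is a hypothetical helper:

```python
# Boolean spellings listed in the YAML 1.1 bool type (parsers may honor a subset).
YAML11_BOOL_WORDS = {
    "y", "Y", "yes", "Yes", "YES", "n", "N", "no", "No", "NO",
    "true", "True", "TRUE", "false", "False", "FALSE",
    "on", "On", "ON", "off", "Off", "OFF",
}


def needs_quotes(value: str) -> bool:
    """True if the bare value could be read as a boolean under YAML 1.1."""
    return value in YAML11_BOOL_WORDS


country_codes = ["DK", "FI", "NO", "SE"]
print([code for code in country_codes if needs_quotes(code)])  # ['NO']
```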
        
         | ohgr wrote:
         | Yeah that.
         | 
         | 00,01,02,03,04,05,06,07,OH SHIT
        
       | cirwin wrote:
       | I've been working on https://conl.dev, which fixes/removes YAML's
       | problematic features.
       | 
       | Trying to find a tag-line for it I like, maybe "markdown for
       | config"?
        
       | endofreach wrote:
       | The article mentioned people with the last name "null". I never
       | thought about that. It sounds like real fun to have that last
       | name in the modern day.
        
         | mdaniel wrote:
         | There have been several write-ups about it:
         | https://news.ycombinator.com/item?id=43113997
         | https://news.ycombinator.com/item?id=15046223
        
       ___________________________________________________________________
       (page generated 2025-04-13 23:01 UTC)