What is wrong with TOML?

    # This is a TOML document.

    title = "TOML Example"

    [owner]
    name = "Tom Preston-Werner"
    dob = 1979-05-27T07:32:00-08:00 # First class dates

TOML is a configuration format designed as a sort of "improved" INI file. It's analogous to this project, StrictYAML, a similar attempt to fix YAML's flaws:

    # All about the character
    name: Ford Prefect
    age: 42
    possessions:
    - Towel

I'm not going to argue here that TOML is the worst file format out there - if you use it infrequently on small and simple files it does its job fine. This is a warning, though: as you scale up its usage, many warts start to appear.

Martin Vejnar, the author of PyTOML, argued exactly this. He initially built a TOML parser out of enthusiasm for this new format but later abandoned it. When asked if he would like to see his library used as a dependency for pip as part of PEP-518, he said no - and explained why he abandoned the project:

"TOML is a bad file format. It looks good at first glance, and for really really trivial things it is probably good. But once I started using it and the configuration schema became more complex, I found the syntax ugly and hard to read."

Despite this, PyPA still went ahead and used TOML for PEP-518. Fortunately, pyproject.toml is fairly trivial and appears just once per project, so the problems he alludes to aren't that pronounced.

StrictYAML, by contrast, was designed to be a language to write readable 'story' tests where there will be many files per project with more complex hierarchies, a use case where TOML starts to really suck.

So what specifically is wrong with TOML when you scale it up?

1. It's very verbose. It's not DRY. It's syntactically noisy.

In this example of a StrictYAML story and its equivalent serialized TOML, the latter ends up spending 50% more characters to represent the exact same data.

This is largely due to the design decision to associate the full name of every key with every value, which is not DRY.
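To make the repetition concrete, here is a minimal Python sketch (not taken from the linked story example) that lists the table headers TOML would need for a small nested mapping. The `config` data and the `toml_table_headers` helper are hypothetical, and the helper only handles nested tables with scalar leaves.

```python
def toml_table_headers(mapping, prefix=()):
    """Return the dotted TOML table headers needed for a nested dict.

    Hypothetical helper for illustration only: it handles nested dicts
    and ignores arrays of tables and keys that would need quoting.
    """
    headers = []
    for key, value in mapping.items():
        if isinstance(value, dict):
            path = prefix + (key,)
            # Emit a header only for tables that directly hold scalar values.
            if any(not isinstance(v, dict) for v in value.values()):
                headers.append("[" + ".".join(path) + "]")
            headers.extend(toml_table_headers(value, path))
    return headers


# Hypothetical configuration, three levels deep.
config = {
    "servers": {
        "alpha": {"network": {"ip": "10.0.0.1", "port": "8080"}},
        "beta": {"network": {"ip": "10.0.0.2", "port": "8080"}},
    }
}

headers = toml_table_headers(config)
for header in headers:
    print(header)

# How many times the top-level key has to be written out in the headers.
servers_mentions = sum(header.count("servers") for header in headers)
print(servers_mentions)
```

In an indented format the `servers` key would be written once at the top of the hierarchy, whereas here it appears once per table header, and that count grows with every additional leaf table.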
It is also partly due to the large amount of syntactic cruft - quotation marks and square brackets dominate TOML documents whereas in the StrictYAML example they are absent.

Shortening program lengths (and DRYing code), all other things being equal, reduces the number of bugs significantly because maintenance becomes easier and deriving intent from the code becomes clearer. What goes for Turing-complete code also applies to configuration code.

2. TOML's hierarchies are difficult to infer from syntax alone

Mapping hierarchy in TOML is determined by dots. This is simple enough for parsers to read and understand, but this alone makes it difficult for humans to perceive the relationships between data.

This has been recognized by many TOML writers who have adopted a method that will be quite familiar to a lot of programmers - indentation that the parser ignores:

[Figure: Non-meaningful indentation]

This parallels the way indentation is added in lots of programming languages that have syntactic markers like brackets - e.g. JSON, Javascript or Java are all commonly rendered with non-parsed indentation to make it easier for humans to understand them.

But not Python.

Python has long been a stand-out exception in how it was designed - syntactic markers are not necessary to infer program structure because indentation is the marker that determines program structure.

The argument over the merits of meaningful indentation in Python has been going on for decades, and not everybody agrees with it, but it's generally considered a good idea - usually for the reasons argued in this stack exchange question:

  1. Python inherited the significant indentation from the (now obsolete) predecessor language ABC. ABC is one of the very few programming languages which have used usability testing to direct the design. So while discussions about syntax usually comes down to subjective opinions and personal preferences, the choice of significant indentation actually has a sounder foundation.
  2. Guido van Rossum came across subtle bugs where the indentation disagreed with the syntactic grouping. Meaningful indentation fixed this class of bug. Since there are no begin/end brackets there cannot be a disagreement between grouping perceived by the parser and the human reader.
  3. Having symbols delimiting blocks and indentation violates the DRY principle.
  4. It does away with the typical religious C debate of "where to put the curly braces" (although TOML is not yet popular enough to inspire such religious wars over indentation... yet).

3. Overcomplication: Like YAML, TOML has too many features

Somewhat ironically, TOML's creator quite rightly criticizes YAML for not aiming for simplicity, and then TOML falls into the same trap itself - albeit not quite as deeply.

One way it does this is by trying to include date and time parsing, which imports all of the inherent complications associated with dates and times. Dates and times, as many more experienced programmers are probably aware, are an unexpectedly deep rabbit hole of complications and quirky, unexpected, headache- and bug-inducing edge cases. TOML experiences many of these edge cases because of this.

The best way to deal with essential complexity like this is to decouple, isolate the complexity and delegate it to a specialist tool that is good at handling that specific problem, which you can swap out later if required.

This is the approach that JSON took (arguably a good decision) and it's the approach that StrictYAML takes too.

StrictYAML the library (as opposed to the format) has a validator that uses Python's most popular date/time parsing library, although developers are not obliged or even necessarily encouraged to use this. StrictYAML parses everything as a string by default and whatever validation occurs later is considered to be outside of its purview.

4. Syntax typing

Like most other markup languages, TOML has syntax typing - the writer of the markup decides if, for example, something should be parsed as a number or a string:

    flt2 = 3.1415
    string = "hello"

Programmers will feel at home maintaining this, but non-programmers tend to find the difference between "1.5" and 1.5 needlessly confusing.

StrictYAML does not require quotes around any value to infer a data type because the schema is assumed to be the single source of truth for type information:

    flt2: 3.1415
    string: hello

In the above example it just removes two characters, but in larger documents with more complex data, pushing type parsing decisions to the schema (or assuming strings) removes an enormous amount of syntactic noise.

The lack of syntax typing combined with the use of indentation instead of square brackets to denote hierarchies makes equivalent StrictYAML documents 10-20% shorter, cleaner and ultimately more readable.

Advantages TOML still has over StrictYAML

There are currently still a few:

* StrictYAML does not currently have an "official spec". The spec is currently just "YAML 1.2 with features removed". This has some advantages (e.g. YAML syntax highlighting in editors works just fine) but also some disadvantages (some documents will render differently).
* StrictYAML does not yet have parsers in languages other than Python. If you'd like to write one for your language (if you don't also do validation it actually wouldn't be very complicated), contact me, I'd love to help you in any way I can - including doing a test suite and documentation.
* Popularity.

Copyright (c) 2018 - 2023 Colm O'Connor