[HN Gopher] Typed Config Languages
___________________________________________________________________
Typed Config Languages
Author : nalgeon
Score : 52 points
Date : 2022-01-20 06:36 UTC (1 days ago)
(HTM) web link (kevincox.ca)
(TXT) w3m dump (kevincox.ca)
| milliams wrote:
| Can anyone else read that white text on a light grey background?
| kevincox wrote:
| Author: Oops, I fixed the inline code snippets but accidentally
| broke the blocks. I'll push out a fix shortly.
|
| Edit: Should be fixed (might need a hard reload). It turns out
| inverting something twice doesn't really do much.
| Zababa wrote:
| The white and orange/brown are hard to read for me. Both have
| less than 1.5 contrast in the chrome dev tools. To continue on
| the nitpicks, I think there's a typo, the procedural macros
| have "drive" instead of "derive". And the boxes of code taking
| all the page compared to the "terminal" text make it a bit hard
| to "get" the flow of the page.
|
| Other than that it was a nice read, and the grey background is
| nice and easy on the eyes. I really like the breadcrumbs too,
| they act as a minimalist menu and easily clickable URL.
| kevincox wrote:
| Thanks for the feedback. I think now that the bug with the
| syntax highlighting is fixed the accessibility should be ok.
| I need to find a better way to test though because I am using
| the "invert" hack for the light theme and the built-in
| accessibility checkers for Chromium and Firefox don't appear
| to take this into consideration. I'll need to find a basic
| calculator and do the math manually when I have time.
|
| Typo fixed.
|
| Thanks for the flow feedback, I thought the full-width code
| was cool and it can help for wide code without making the
| page text too wide (it allows the code to grow to the right)
| but maybe it is more confusing than it is worth. And thanks
| for the feedback on the background and breadcrumbs, it is
| good to hear both positive and negative thoughts.
| Zababa wrote:
| Thanks, it's way easier to read now. The flow feedback is
| very personal, feel free to ignore it. After rereading the
| article, I like how unique it is.
| Cyberdog wrote:
| Looks like nobody has mentioned TOML yet, so I will. It's
| basically INI syntax with stricter rules and can be used as such
| if you wish, but also supports some extra features like arrays,
| dictionaries/tables, timestamp values, nested sections, and so
| on. It's a joy to use in the rare times that I'm able to use it.
|
| https://toml.io/en/
| kstrauser wrote:
| TOML's great, although I'm not in love with the syntax for
| nested sections.
| jer0me wrote:
| Tom's Obvious, Minimal Language, as in Tom Preston-Werner of
| GitHub
| AaronFriel wrote:
| I've increasingly found myself drawn to writing declarations as
| programs, not as config in any plain textual format.
|
| Whether it's going all the way in the direction of supporting
| general purpose languages like Pulumi does, or with a niche but
| still Turing-complete language like Nix's expression language, or
| even Dhall, which other commentators have mentioned. That isn't
| to say there is no place for simple, human readable schema. I
| think these tools need a fallback to something simpler.
|
| Nothing these tools do couldn't be emulated by manual processes
| of hand-writing complex makefiles or YAML or whatever, but what
| is striking to me is the use of general purpose languages,
| usually with tooling to do type-checking and IDE assistance to
| writing these things, which lowers the barrier to entry and
| empowers someone who "just knows (Python|JavaScript|Go|...)" to
| contribute in a familiar environment.
|
| A couple more examples:
|
| Envoy: I have had the displeasure, recently, of writing by hand a
| configuration for the Envoy load balancer. Envoy uses a "typed
| config language", which is often represented as YAML or JSON, but
| it's very painful to write by hand. On the other hand, if you
| have the protocol buffers/gRPC schema available in code, it's
| vastly less painful to use any programming language to build the
| typed objects and then export to plain text. The xDS protocol is
| designed for being interacted with via programs, not plaintext.
|
| GitLab CI: GitLab supports writing a program, which runs as part
| of the CI/CD job, to generate the configuration for a subsequent
| pipeline. This makes writing complex jobs, or repetitive monorepo
| tasks much simpler. A 10 line Python program that effectively
| does "for each folder in `python`, emit this block of YAML" is
| incredibly powerful.
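|
| A minimal sketch of that "for each folder" generator, assuming a
| monorepo with one package per subfolder of `python/` and one
| hypothetical pytest job per package. The job names, the image and
| the script lines are illustrative and not taken from a real
| pipeline.

```python
from pathlib import Path


def render_child_pipeline(folders):
    """Return YAML text with one test job per folder name.

    Folder names are assumed to be plain identifiers that need no YAML quoting.
    """
    blocks = []
    for name in sorted(folders):
        blocks.append(
            f"test-{name}:\n"
            f"  image: python:3.10\n"
            f"  script:\n"
            f"    - cd python/{name}\n"
            f"    - pytest\n"
        )
    # A blank line between jobs keeps the generated file readable.
    return "\n".join(blocks)


def discover_packages(root="python"):
    """List the immediate subdirectories of `root` (empty if it is missing)."""
    base = Path(root)
    if not base.is_dir():
        return []
    return [p.name for p in base.iterdir() if p.is_dir()]


if __name__ == "__main__":
    print(render_child_pipeline(discover_packages()))
```

| In GitLab the printed text would typically be saved as a job
| artifact and pulled in by a trigger job (`trigger: include:
| artifact`); check the child pipeline docs for the exact keys.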
|
| That last example is salient to me: markup languages are really
| easy to parse, but challenging to read when they're dynamic.
| Wouldn't it be nice to be able to mix and match? YAML/JSON where
| it makes sense and the intent and meaning is self-evident, and to
| write code where you need dynamism?
|
| Full disclosure: I work for Pulumi, opinions are my own, etc.
| jandrese wrote:
| In the past it was fairly common to write configuration files
| in Tcl and just run the script to parse them.
|
| This was generally considered to be a mistake. It opens up a
| huge threat surface for your program and few people ended up
| using the more advanced capabilities it created. If your config
| format supports simple globbing, that covers the 95% use case
| for otherwise needing a Turing-complete language for your
| configuration files.
| blacksqr wrote:
| > This was generally considered to be a mistake.
|
| Pity. Tcl allows you to create safe interpreters within which
| you can disable any commands you want in order to have a
| trustworthy environment for running configuration scripts.
|
| Tcl itself uses them to build its internal list of available
| package modules.
| AaronFriel wrote:
| I think that speaks to having that flexibility when needed.
| The other 5% of use cases are significant too.
| jandrese wrote:
| The other 5% is generally solved by people who write
| scripts to generate the config files. It's not like they're
| up a creek.
| kaba0 wrote:
| That's why I think Dhall is really interesting, it being
| purposefully not Turing-complete, yet still very flexible.
| tromp wrote:
| Dhall [1] [2] looks promising to me.
|
| [1] http://dhall-lang.org/
|
| [2] https://github.com/dhall-lang/dhall-lang
| corysama wrote:
| Does anyone know if using Dhall to generate JSON that is then
| validated using Cue is a sensible idea? I don't know enough about
| them to be sure.
| kevinmgranger wrote:
| Dhall serves as the validation layer itself.
|
| Unless you're already consuming some other API that publishes
| Cue validation, in which case Dhall is just the templating
| language.
| vzaliva wrote:
| I am using Cerberus schema to validate my YAML configs:
|
| https://docs.python-cerberus.org/en/stable/schemas.html
| msoad wrote:
| Like it or not, YAML configuration files are everywhere. I've had
| a lot of luck using JSON Schema with YAML config files. Luckily
| VS Code and possibly many other editors can be configured to
| provide type hints for completion. Using any available JSON Schema
| checker you can validate your config files in CI and elsewhere.
|
| Most of the time, what's missing is an accurate JSON Schema for a
| configuration. I usually encourage owners of those configurations
| to sit down and write it for everyone to benefit from.
| pietroppeter wrote:
| I like the approach of strictyaml: a parser that concentrates on
| a restricted subset of YAML and lets you use a schema to get a
| type-safe validator.
|
| https://github.com/crdoconnor/strictyaml
| nicoburns wrote:
| If we're on the topic of config languages, I'd like to plug Gura
| (https://github.com/gura-conf/gura). It's not too well-known, but
| it probably has the best design I've seen, and seems to have a
| good coverage of languages with an available library.
| milliams wrote:
| YAML's parsing of `no` as `False` has not been part of the spec
| for 13 years now. It was changed in YAML 1.2 in 2009 to only be
| `true` and `false` (with variations in case allowed I think).
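|
| For reference, a small sketch of the two boolean rules for plain
| (unquoted) scalars: the YAML 1.1 bool type and the YAML 1.2 core
| schema. This is not a YAML parser, and `resolve_plain_scalar` is a
| made-up name for illustration. Some widely used parsers (PyYAML,
| for example) still follow the 1.1 rules, which is why the "no"
| case keeps coming up.

```python
# Plain scalars that resolve to booleans under each spec version.
YAML_1_1_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML_1_1_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}
YAML_1_2_TRUE = {"true", "True", "TRUE"}
YAML_1_2_FALSE = {"false", "False", "FALSE"}


def resolve_plain_scalar(text, version="1.2"):
    """Resolve an unquoted scalar to a bool, or leave it as a string.

    Only boolean resolution is modelled; numbers, null and so on are ignored.
    """
    if version == "1.1":
        true_set, false_set = YAML_1_1_TRUE, YAML_1_1_FALSE
    else:
        true_set, false_set = YAML_1_2_TRUE, YAML_1_2_FALSE
    if text in true_set:
        return True
    if text in false_set:
        return False
    return text


# Compare how each rule set treats a list of country codes.
for country in ["ca", "no", "us"]:
    print(country, resolve_plain_scalar(country, "1.1"), resolve_plain_scalar(country, "1.2"))
```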
| kbd wrote:
| As has come up in this thread already, any discussion of typed
| config languages nowadays that doesn't mention Cue
| (https://cuelang.org/) seems incomplete. They really seem to be
| tackling the problem in a thorough way. I hope it catches on.
|
| For anyone who knows more about Cue: right now you can go from
| Cue<->yaml (in fact, their docs on yaml also use the "no" case as
| an example: https://cuelang.org/docs/integrations/yaml/) to
| integrate with existing systems, but I suppose eventually the
| goal would be to have direct support in libraries like Serde?
| kevincox wrote:
| (disclaimer: author)
|
| Cue is a very cool language, but it is quite different from the
| "typed config language" that I have described here. Maybe I
| picked a poor title but in the post I am talking about using
| the type information to "improve" parsing. IIUC Cue does not
| do this, it parses in a "dynamically typed" manner, then uses
| the type system to evaluate the Turing-complete (or close to
| it) expression language.
| kbd wrote:
| Yeah that's why I was asking about Cue's eventual goals with
| libraries like Serde. I assume eventually they'd like to be
| able to auto-generate type definitions for a target language,
| but I don't know.
|
| > Cue does not do this, it parses in a "dynamically typed"
| manner, then uses the type system to evaluate the
| Turing-complete (or close to it) expression language.
|
| As I understand it Cue would help in two ways currently. 1.
| It would be able to type-check existing YAML files to catch
| things like the "no" case. 2. If you write your config in
| Cue, it would output properly-typed YAML to avoid things like
| "no".
| kevincox wrote:
| Yes. I agree. It would "prevent" the "no" case by returning
| an error on parse/evaluation. However the solution
| described here can do better. It can correctly parse the "no"
| case. Basically, by knowing it is parsing a string the
| grammar can be simpler, it doesn't have to decide if it is
| an int/bool/string anymore.
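|
| A rough sketch of that idea, assuming the reader already knows the
| expected type of each field. The `parse_scalar` and `parse_list`
| helpers and the accepted boolean spellings are hypothetical and
| are not the article's implementation.

```python
def parse_scalar(text, expected):
    """Parse raw scalar text according to the type the schema expects."""
    if expected is str:
        # No guessing: whatever was written is the string.
        return text
    if expected is bool:
        if text in ("true", "yes"):
            return True
        if text in ("false", "no"):
            return False
        raise ValueError(f"expected a bool, got {text!r}")
    if expected is int:
        return int(text)  # raises ValueError on non-numeric text
    raise TypeError(f"unsupported type: {expected!r}")


def parse_list(items, element_type):
    """Parse every raw item of a list with the same expected type."""
    return [parse_scalar(item, element_type) for item in items]


# The same text "no" is a country code in one field and a flag in another.
print(parse_list(["ca", "no", "us"], str))
print(parse_scalar("no", bool))
```

| The point of the sketch is that the type decides how the text is
| read, so the writer never has to quote "no" to keep it a string.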
| yegle wrote:
| Re the first note in the post: a good serialization format is
| both easy for machines to read and easy for humans to read and
| write. I think the text protobuf format is one such example. A
| (human read/write-able) config language needs to be consumed by
| a program anyway, so in a sense a config language is a
| human-to-computer serialization format.
| [deleted]
| usrbinbash wrote:
| > Statically typed programming languages are catching on so why
| don't we extend this typing to our config files?
|
| Because I simply don't want to expend the same amount of
| cognitive load to read config files as I do for code.
|
| Yes, YAML has some minor ambiguities. These are easily solved. To
| use the example from the article:
|
|     countries:
|       - ca
|       - "no"
|       - us
|
| There, done. The problem was solved with 2 extra characters and
| remembering the fact that `no` is special in YAML. Comparing that
| to the amount of typing I have to do to define a schema, the
| syntax of which I have to learn, which I also have to read or
| remember and keep in mind every time I read the config, I'll take
| the 2 extra double quotes.
|
| And, speaking of statically typed languages: This problem would
| be caught immediately anyway if the config is read into static
| types.
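|
| A small sketch of that check, assuming a YAML 1.1 style loader has
| already turned the unquoted `no` into `False`. The `Config` class
| and the `load_config` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Config:
    countries: List[str]


def load_config(raw):
    """Build a Config from already-parsed data, rejecting wrong types."""
    countries = raw.get("countries")
    if not isinstance(countries, list):
        raise TypeError("countries: expected a list")
    for index, value in enumerate(countries):
        if not isinstance(value, str):
            raise TypeError(
                f"countries[{index}]: expected str, got {type(value).__name__} ({value!r})"
            )
    return Config(countries=countries)


# What a YAML 1.1 loader would hand over for the unquoted list.
parsed = {"countries": ["ca", False, "us"]}
try:
    load_config(parsed)
except TypeError as error:
    print(error)
```

| Note that the error can name the list index but not the source
| line, since the position in the file is gone once parsing is done.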
| andrewzah wrote:
| "There, done."
|
| Except, we're not done.
|
| YAML has multiple footguns like this, which I have to remember,
| forever. And anyone who works with YAML. It's unintuitive and
| confusing, and costs space in my brain that I really should be
| using for more important things.
|
| Not to mention that if you -don't- know about these ahead of
| time, debugging them can be confusing.
|
| A type system is marginally more work for decreased cognitive
| load and eliminating stupid, idiotic bugs that nobody should
| have to waste their time tracking down.
|
| With IDE integration, the cost is pretty much negligible other
| than learning the syntax, which, c'mon, is not difficult and
| we're being paid to do it.
|
| There are even tools like Dhall [0] that auto-generate yaml for
| us.
|
| [0]: https://github.com/dhall-lang/dhall-lang
| TrainedMonkey wrote:
| Would most of the footguns be solved by quoting all of the
| strings? e.g.:
|
|     "countries":
|       - "ca"
|       - "no"
|       - "us"
| kevincox wrote:
| Yes, but now you are losing a lot of the clean syntax that
| causes most people to use YAML in the first place. There is
| a reason that most people don't write YAML like JSON with
| trailing commas and comments, it is nice to cut most of
| this noise.
| meowface wrote:
| You can also use a stricter subset of YAML that removes
| things like the "no" footgun. Plenty of such strict
| parsers exist across languages. Maybe it's no longer
| technically YAML at that point, but you get all the nice
| parts of YAML without having to revamp everything with
| static typing.
| Spivak wrote:
| So yes it's a footgun but it makes some sense. Most people
| wouldn't really complain about true not being equivalent to
| "true" or 100 not being kept as "100". People just aren't used
| to yes/no being reserved words. Ruby's klass is a funny
| workaround to this.
|
|     enable_feature: yes
|
| Is totally natural. Nobody reads the spec though. If you're
| outputting YAML documents with string builders you're headed
| for ruin no matter what. You don't need Dhall, you need
| yaml.dump which handles the types too.
| CBLT wrote:
| I agree that the problem is one of quicker feedback - the dev
| cycle should involve a program checking the yaml correctness
| (using types or otherwise) straight away and giving a useful
| error message. Too often I've seen incorrect yaml checked into
| git that fails with a cryptic error when deploying the
| application.
|
| The strength of types, in my opinion, is composability. Most
| config files I've seen have ultimately pulled in input from
| another source and used that to create their output. Types
| would allow the configuration to be checked for correctness
| even in the face of unknowns.
| kevincox wrote:
| > Comparing that to the amount of typing I have to do to define
| a schema
|
| In my use of config files, the code that reads them knows the
| types anyway. So for a setup like Rust+serde there is no
| overhead to set this up.
|
| > This problem would be caught immediately anyway if the config
| is read into static types.
|
| That is true, but it still breaks you out of your flow. You get
| a confusing error, it probably doesn't tell you the exact line
| number and you need to look over your changes. If you changed a
| lot of places in the file it may be easy to miss that adding
| `no` to a list was the mistake. Because problems like that are
| easy to understand in retrospect, but if you keep reading "no"
| as "Norway" it is easy to look straight at this mistake and
| think it is fine before hunting elsewhere in the file.
|
| I think you are right. It is still unclear if the cognitive
| overhead when writing the file is worth it, but from my point
| of view the upsides are much more valuable than you make them
| appear to be.
| simplify wrote:
| Funny enough, I've implemented a config language that fits
| exactly this bill https://github.com/gilbert/zaml
|
| An example (also see it in the online editor[0]):
|
|     users {
|       andy
|       beth {
|         admin true
|       }
|       carl
|     }
|
| The author is right that you gain syntax benefits when you define
| a schema. For those who say this adds cognitive overhead, it
| actually doesn't; the schema and compiler are able to _reduce_
| that overhead, because if you make a mistake, you get a nice,
| accurate error message.
|
| [0]
| https://gilbert.github.io/zaml/editor.html#s=N4IgzgxgFgpgtgQ...
| dlrush wrote:
| Just use ruby for advanced config file capabilities:
|
| https://darrenrush.medium.com/ruby-is-the-ultimate-config-fi...
| unwind wrote:
| Pretty cool, I'm thinking about config file formats at the moment
| so this was timely.
|
| A minor note since the author seems to be around (and nobody has
| mentioned it that I could see). There's a typo in the example:
|
|     allowed-countires
|
| should of course spell "countries" more like I just did. The same
| error occurs twice.
| alaties wrote:
| I'm kind of shocked no one brought up protobufs yet. protobuf
| libraries are available in pretty much all mainstream languages
| and the textproto format is pretty mature.
|
| Admittedly it's clunkier and less freeform than YAML. And if you
| only ever plan on using Rust, the proposed solution here is
| probably cleaner.
|
| Having portability over multiple languages maintained by large
| organizations can be useful in some cases though.
| dqpb wrote:
| > Statically typed programming languages are catching on so why
| don't we extend this typing to our config files?
|
| What you really need is Cuelang. Cuelang does graph unification
| over a type-value lattice. This allows the user to do progressive
| type -> value refinement (e.g. type->range->value).
|
| For configuration, this is both better than regular type systems,
| and better than inheritance.
| verdverm wrote:
| Have you seen CUE? Just so happens v0.4.1 was released today
| vmchale wrote:
| Is the author aware of Dhall? He might be interested.
|
| I think it's better in general but in any case it gets away from
| the "everything has to be a keyed map" silliness and you get
| sums/products.
| brundolf wrote:
| This is why, in the statically-typed programming language I'm
| working on, the project manifest is just a file written in the
| language itself which can export a special (typed) const to
| configure things like the linter. It gets to piggyback off of all
| the existing tooling for the language, particularly type checks,
| and can even be constructed using functions, etc if desired.
___________________________________________________________________
(page generated 2022-01-21 23:00 UTC)