[HN Gopher] A reasonable configuration language
___________________________________________________________________
A reasonable configuration language
Author : todsacerdoti
Score : 99 points
Date : 2024-02-04 14:12 UTC (8 hours ago)
(HTM) web link (ruudvanasseldonk.com)
(TXT) w3m dump (ruudvanasseldonk.com)
| AtlasBarfed wrote:
| The configuration rabbit hole, for practically any system that
| becomes widely applied or through large amounts of feature
| iterations. With provisioning systems you rapidly walk the ladder
| because they are basically starting about five steps down on this
| already:
|
| You start with an INI/yaml/json
|
| Then you might have several overlaid INIs
|
| Eventually: https://docs.spring.io/spring-
| boot/docs/2.1.13.RELEASE/refer...
|
| Go ahead and make a java crack or something similar. Every line
| of that hierarchy is bathed in the blood of IT workers in the
| real world.
|
| Wait! Not even close to being done.
|
| Templating, expansions, "classes", embedded expressions: now we
| have computation folks!
|
| I don't actually know the computational power of HCL, but
| eventually you end up with a Turing Complete configuration input
| that the poster wants.
|
| And I'm actually leaving a lot of things out.
|
| So many systems go through this, picking their own path. Kind of
| like workflow engines, almost all major systems will end up with
| a workflow engine in it (and the builds are workflow anyway). But
| there aren't well-conserved implementations of these overarching
| meta-patterns, so everyone bespokes it or picks from a mind
| numbingly vast array of pseudo-matching options.
|
| Good luck!
| dsheets wrote:
| The article did not advocate for a TC config language. I think
| Ruud would probably have been happy with a total primitive
| recursive language.
| ruuda wrote:
| To me the distinction is not important. It is possible to
| write programs that take too long to execute in non-Turing
| complete languages, and when a program hangs, you just
| terminate it. I think what people really want is for their
| configuration to remain simple, but not being Turing complete
| does not guarantee that. For RCL I added a limit on the
| number of evaluation steps, because without it some programs
| trap the fuzzer.
| dsheets wrote:
| We agree. You don't want universal computation in your
| configuration. Your "gas" approach is the right one and
| very weak computationally. I further posit that you don't
| even want to offer gas fuelled TC and everyone would be
| happy with a provably total primitive recursive language
| fuelled by gas. Why? Accidents happen. I'd rather have my
| configuration tool tell me statically that some construct
| isn't provably terminating than wait to find out later when
| someone imports my config snippet library and uses it in an
| unexpected way.
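A minimal sketch of the step-limit ("gas") idea discussed above, assuming a toy expression language invented here for illustration (it is not RCL's actual evaluator). The language has no unbounded loops, so it is not Turing complete, yet a nested `repeat` can still run for an impractically long time; the budget turns that into a prompt error.

```python
from dataclasses import dataclass


class StepLimitExceeded(Exception):
    """Raised when evaluation uses more steps than the budget allows."""


@dataclass
class Budget:
    remaining: int

    def spend(self) -> None:
        # Every evaluated node costs one step, so total work is bounded by
        # the initial budget regardless of what the expression does.
        if self.remaining <= 0:
            raise StepLimitExceeded("evaluation step limit reached")
        self.remaining -= 1


def evaluate(expr, budget: Budget):
    """Evaluate a toy expression: an int, ("add", a, b) or ("repeat", n, e)."""
    budget.spend()
    if isinstance(expr, int):
        return expr
    op = expr[0]
    if op == "add":
        return evaluate(expr[1], budget) + evaluate(expr[2], budget)
    if op == "repeat":
        # Sum the body evaluated `count` times. The loop is always finite,
        # but nested repeats multiply the amount of work.
        count = evaluate(expr[1], budget)
        return sum(evaluate(expr[2], budget) for _ in range(count))
    raise ValueError(f"unknown operation: {op!r}")


small = ("add", 1, ("repeat", 3, 2))
print(evaluate(small, Budget(remaining=100)))  # 7

# Terminates in principle (10**12 iterations), but not in useful time.
huge = ("repeat", 10**6, ("repeat", 10**6, 1))
try:
    evaluate(huge, Budget(remaining=10_000))
except StepLimitExceeded as err:
    print("aborted:", err)
```

A static termination check, as dsheets suggests, would reject problematic constructs before running them; the budget only catches them at evaluation time.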
| stevekemp wrote:
| It's scary to think how many times I've had to evolve
| configuration options as systems developed and the user-base
| changed.
|
| As you say, I'd start out with INI files, then move on to JSON,
| after that would come a hack to allow the JSON to be
| generated by executing a shell/ruby/perl script (i.e.
| "--config=!xx" would execute XX and parse the output, instead of
| reading a file).
|
| Later still we started embedding lua, or similar, to allow
| things to be templated and "dynamic" on a per-host basis.
|
| I was always fond of the Apache-style configuration system, and
| I guess HCL is close to that, but there aren't any great
| universal solutions unless you go all-in with scripting, and
| then you end up with emacs!
| qznc wrote:
| I defined 5 levels for myself:
|
| Level 1 is just values in a file. The Linux kernel uses that.
|
| Level 2 is a list of values, e.g. ini files.
|
| Level 3 allows nesting. JSON, XML, and YAML are here.
|
| Level 4 allows computation but limited. Dhall and Starlark are
| here.
|
| Level 5 is a Turing-complete language. Python, Javascript, etc.
|
| RCL seems to be level 5, so I'm not sure if there is really an
| advantage compared to Python.
| mst wrote:
| I think the goal is to have a language that feels like it was
| designed for level 3/4 but allows you to break out to level 5
| as an escape hatch.
|
| Given how many configuration languages eventually end up with
| a half-assed level 5 (Greenspun comes to mind) it strikes me
| as, at least, an experiment worth performing.
| gary_0 wrote:
| I wonder if anyone would take me seriously if I suggested WASM
| as a configuration language. When run, the WASM would be given
| imports that provide a small API for manipulating the config
| DOM, and importing and executing further WASM files. Then you
| can use whatever high-level config language you want (or, say,
| a subset of Rust) and compile it to WASM.
|
| Existing JSON/YAML/etc could be compiled to WASM that simply
| builds the corresponding DOM.
|
| Funny joke?
| coderedart wrote:
| using wasm goes into crazy territory imo. It would _probably_
| be bigger, and opaque, with no proper errors (unless you
| compile debug info which would make it much bigger).
|
| I think we already have an almost perfect language, i.e. Lua.
| Everyone knows it (or can learn it easily). Tiny runtime to
| embed. Sandboxed by default. Garbage collected. The only feature
| missing is static typing.
| marcosdumay wrote:
| Hum, most systems do not go down that hole at all. Most
| software stops, at most, at hierarchical parameters.
|
| For the ones that do go down the hole, you are almost always
| better off separating them into components, to minimize the ones
| that need complex configuration, and accelerating the journey
| of those small pieces. (But yeah, it would be nice to settle on
| a good workflow description language that isn't a makefile.)
|
| Now, it seems that everybody who is pushing those high-
| complexity languages wants an infrastructure description
| language, and just keeps calling it a "configuration
| language". Since those two problems are completely different,
| insisting on using the wrong name is quite harmful to the goal.
| macintux wrote:
| A reasonable configuration language blog post. Explains the
| motivation, provides some meaningful examples and a happy
| accident, and concludes with an overview of prominent
| alternatives.
|
| Worth a read, and worth trying out sometime, if nothing else
| because I too don't use jq often enough to remember more complex
| usages.
| raphinou wrote:
| jsonnet[1] and kapitan[2] are the tools I currently use. Their
| learning curve is not optimal (and I tried to contribute to
| smoothing it with a jsonnet course[3] and a 'get started with
| kapitan' blog post[4]), but once used to it it's hard to do
| without, and their combination makes them even more useful (esp.
| if you deploy K8s).
|
| In Ruud's case, Jsonnet might have been worth looking at as
| Hashicorp tools can be configured with json in addition to HCL.
| But that would have been less fun I guess ;-)
|
| I hope for Ruud it finds its niche, there's quite some
| competition in this field!
|
| 1: https://jsonnet.org/
|
| 2: https://kapitan.dev/
|
| 3: referral link: https://www.udemy.com/course/jsonnet-from-
| scratch/?referralC...
|
| 4: https://www.yvesdennels.com/posts/starting-with-kapitan/
| js2 wrote:
| The author mentions Jsonnet in the appendix[^1]:
|
| > I never properly evaluated Jsonnet, but probably I should.
| Superficially it looks like one of the more mature formats, and
| in many ways it looks similar to RCL. It has a page comparing
| itself against other configuration languages.
|
| [^1]: https://ruudvanasseldonk.com/2024/a-reasonable-
| configuration...
| raphinou wrote:
| I should have made it clearer, but I wrote my comment in
| reaction to his mention that he "never properly evaluated
| Jsonnet".
| nynx wrote:
| This looks very restrained. Well done!
| lolinder wrote:
| Reading through this I'm reminded of another article from about a
| month ago, _An app can be a home-cooked meal_ [0]. Whenever
| someone starts on a new language or framework, people are quick
| to ask "why" and bemoan the proliferation of languages and
| frameworks. I think this article is a good illustration of the
| "why".
|
| What the author understands that so many don't is that a
| _language_ can be a home-cooked meal. A programming language is
| nothing more or less than a tool for use by a programmer.
| Compilers have a lot of mystique about them because of all the
| crazy optimizations that production-grade compilers put in, but
| fundamentally a compiler is just a pipeline that transforms a
| data structure from a format that the human would prefer to
| interact with into a format that a specific machine can work
| with.
|
| rcl is a home-cooked meal kind of configuration language. It was
| never intended to serve a wide audience, it was intended to solve
| a specific human's pain points and help that specific human with
| their task. Its value lies in that it doesn't need to try to be
| anything else.
|
| [0] _An app can be a home-cooked meal_ (1051 points, 288
| comments) https://news.ycombinator.com/item?id=38877423
| mst wrote:
| I'm working on something that may end up being rcl-like.
|
| My intention is to make it something that can update existing
| configuration files just as well as -be- a configuration file,
| such that if somebody finds it useful, they can use it without
| their colleagues needing to be aware and with the same commit
| diffs as if they'd done the same work by hand.
|
| How well that will work in practice is still an open question,
| though a bunch of my experiments in that direction seem to have
| worked out nicely so far.
|
| But so long as it helps -me- make changes to a repository that
| are good, so long as I can ensure nobody else working on that
| repository needs to care that I used it to help, I figure it's
| worth the attempt.
|
| (also, hey, I'm having fun :)
| civilized wrote:
| I have seen many of these posts now and I have a simple question.
| Why do we need special configuration languages? Why not just use
| existing languages?
|
| I understand why data formats like JSON and YAML are valuable,
| and I understand why it would be valuable to use a programming
| language to automate the generation of such formatted data. But
| the niche of the configuration language remains mysterious to me.
| raphinou wrote:
| I'm guessing here, but maybe you see these configuration
| languages as templates? However a configuration language is
| much more than that. For example, jsonnet lets you combine
| objects and override deeply nested fields [1]. I'm not sure how
| practical it would be to achieve the same result with a general
| purpose language.
|
| 1: https://jsonnet.org/learning/tutorial.html#oo
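As a point of comparison for the comment above, here is a minimal sketch of combining objects and overriding a deeply nested field in a general-purpose language. The `deep_merge` helper and the example config are hypothetical, and this only approximates the behaviour described; it does not reproduce Jsonnet's semantics.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict with `override` merged into `base` recursively."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are objects: merge them field by field.
            result[key] = deep_merge(result[key], value)
        else:
            # Scalars, lists, and new keys replace whatever was there.
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical base config and a per-environment override.
base = {"server": {"port": 80, "tls": {"enabled": False, "min_version": "1.2"}}}
prod = {"server": {"tls": {"enabled": True}}}

print(deep_merge(base, prod))
# {'server': {'port': 80, 'tls': {'enabled': True, 'min_version': '1.2'}}}
```

The helper is short, but every project ends up writing its own variant with slightly different rules for lists and missing keys, which is part of the case for having the operation built into the language.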
| aleksiy123 wrote:
| I think because there are some desirable traits for config
| languages that don't exist in general purpose languages.
|
| Not exhaustive list but generally:
|
| Usually constrained to reduce complexity and try to eliminate
| the need for testing config.
|
| Interoperable between many different programming languages.
|
| Readable by programmers working in different languages.
|
| I think this usually makes config languages favour declarative
| over imperative which usually eliminates most general purpose
| languages.
|
| Another topic is why do we use configuration at all and what is
| the difference between code and config.
| Someone wrote:
| It also is desirable that you can programmatically process
| configuration files (example: go through all your docker
| files to determine which ones use vulnerable images)
|
| That is at odds with using a generic Turing complete
| language, as figuring out what the configuration is requires
| running code that may do all kinds of weird things (download
| data, delete files, etc) (that is similar to the
| postscript/PDF case. To figure out how many pages a
| postscript file has, you have to run code)
|
| Now, some will say they trust those writing config files to
| not do stupid things and only use that power when it is
| absolutely required, but that's not an opinion shared by all.
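A minimal sketch of the "process configuration as data" point, assuming configs stored as plain JSON with a hypothetical `image` field and a hypothetical denylist. Because the files are only parsed, never executed, the scan cannot trigger downloads, file deletion, or other side effects.

```python
import json

# Hypothetical deployment configs, stored as plain JSON documents.
CONFIG_FILES = {
    "web.json": '{"service": "web", "image": "nginx:1.18"}',
    "db.json": '{"service": "db", "image": "postgres:16"}',
    "cache.json": '{"service": "cache", "image": "redis:5.0"}',
}

# Hypothetical denylist of image tags considered vulnerable.
VULNERABLE_IMAGES = {"nginx:1.18", "redis:5.0"}


def find_vulnerable(files: dict[str, str], denylist: set[str]) -> list[str]:
    """Return the names of config files that reference a denylisted image."""
    hits = []
    for name, text in files.items():
        config = json.loads(text)  # parsing only; no user code is executed
        if config.get("image") in denylist:
            hits.append(name)
    return sorted(hits)


print(find_vulnerable(CONFIG_FILES, VULNERABLE_IMAGES))
# ['cache.json', 'web.json']
```

If the configs were scripts in a general-purpose language, the same audit would have to run each one to learn which image it ends up selecting.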
| notfed wrote:
| I think NixOS/Nix is a good example of where we need one. Nix
| is used for a massive amount of configuration which needs to be
| functional and turing-complete. Nix is pretty nice and compact
| as a configuration language, but the "programming language"
| aspect of Nix, IMO is ugly and unreadable, and could have been
| designed better.
| DrBazza wrote:
| Don't repeat yourself / single-point-of-truth. Config files can
| be full of repetition, from values, to sections/objects.
| Defining a value or object once, dramatically reduces
| search+replace errors for example.
| rco8786 wrote:
| Is that mutually exclusive with using your normal programming
| language as a config language?
| Jtsummers wrote:
| What do you do when your normal programming language stops
| being your normal programming language (you add a language,
| switch languages), but you want to preserve details of the
| configuration? Translate it all into your new language(s)?
| How do you keep these things synced up? You extract the
| details into a configuration language/serialization format
| and then deserialize in every language you work with.
|
| Embedding your configuration in your application language
| only works well as long as you have one application
| language. (Or perhaps you use Lua or TCL which are both
| almost trivially embedded into every other language.)
| danenania wrote:
| You can compile your general purpose language DSL to json
| and then I think this isn't too much of a problem.
|
| A bigger issue imo is packaging. General purpose
| languages aren't typically well-designed for producing
| single, understandable standalone files like config
| languages are. Like I think a simple config DSL in
| TypeScript could potentially be a perfect way to solve
| this problem, except that no one wants to lug around a
| package.json, package-lock.json, and node_modules
| directory just to write a bit of config. Yet the ability
| to bring in npm modules--especially those containing
| relevant types--is where a lot of the attraction of using
| TypeScript comes from.
| Jtsummers wrote:
| So turn the general purpose language into a non-general
| purpose language by creating a compiler for a subset of
| it. That might be what we'd call a configuration language
| then since it's no longer actually the original language
| (in full). Like JSON and others.
| ncallaway wrote:
| > Why do we need special configuration languages? Why not just
| use existing languages?
|
| Sure, or you could turn around and ask:
|
| Why do we need languages at all? Surely we can express
| everything with assembly?
|
| Or, shouldn't one language work for all cases? Why not just use
| C everywhere?
|
| And the answer becomes obvious: some languages are better at
| certain tasks than others! Then, a "configuration language" is
| just going to be a programming language that is best suited to
| generating configuration.
|
| The same way that C or Rust have different use cases than C#
| and Java.
| civilized wrote:
| Okay, then in your terms my question would be "why is the
| best language for configuration not an existing language?"
| lamontcg wrote:
| People keep hacking up YAML with templating to produce a
| configuration language which is a bad solution (see for example
| https://leebriggs.co.uk/blog/2019/02/07/why-are-we-
| templatin...). The problem is that the people who are using
| templated YAML don't really want to learn a full-blown language
| since the vast majority of them never, ever want to write a
| compiler or a highly concurrent HTTP server. They definitely
| don't care about functional programming or want to know what a
| monad is or want to have to learn about first class functions
| and closures or async/await. They just want a pretty trivial
| language that you can learn in a weekend or two. The kind of
| necessary complexity that you might 'add' to such a language
| before anything else is making it side-effect free so that you
| can't alter the running state of the system in the language and
| the purpose is to render a YAML/JSON/TOML config to feed as
| declarative config into tools like terraform. That focus is
| much different from a programming language which gives you the
| ability to execve() arbitrary commands and comes with
| concurrency primitives that templated-YAML users just don't
| care about at all.
|
| When you look at it from the perspective of someone who already
| knows a computer language well it doesn't seem useful, but the
| point is to address the users who don't know any computer
| languages particularly well.
|
| And 24 years ago I thought we needed to just use general
| purpose computer languages and turn system administrators into
| programmers (or fire the ones that wouldn't learn) but that has
| clearly never happened, and we're retaining the less-technical-
| than-SWE roles who manage infrastructure, although it's now
| DevOps people managing k8s with YAML. If just using a general
| purpose language met the social needs that we have then it
| would have happened already. The existence of templated YAML
| and its overwhelming success proves that there's a need that
| has to be met. I'm still skeptical that any of these
| configuration management languages is thinking clearly about
| what need is driving templated YAML though, but it's nice that
| there's an explosion of them so that hopefully sooner or later
| one of them will really stick and become popular.
| catlifeonmars wrote:
| There's no smooth gradient between data format and Turing
| complete programming language. Let's say I want environment
| conditional logic (such as looking up a map of AMIs to AWS
| regions), but don't want to allow arbitrary code execution, or
| I want to embed the parser inside the service that accepts the
| config. Those are the niches that HCLs fill.
|
| A programming language that allows you to enable language
| constructs individually might be interesting (imagine being
| able to easily turn on list comprehensions but not allow
| imports or to disable mutation).
| patrickmay wrote:
| > There's no smooth gradient between data format and Turing
| complete programming language.
|
| S-expressions beg to differ.
| hnlmorg wrote:
| > It's 2024, so RCL has some features that you might expect from
| a "modern" language: trailing commas
|
| I have the opposite opinion here. Commas should be like semi-
| colons: only required if you want multiple statements on a line.
|
| Eg
|
|     fruit = [
|         "apples"
|         "oranges"
|         "bananas"
|     ]
|
| No commas yet still extremely explicit
| mcphage wrote:
| That's not really the opposite opinion... the opposite is not
| allowing trailing commas (like JSON) so if you remove the last
| item from a list, you need to adjust the line above as well.
| And when you add something to a list, you need to know if
| something will come after it, even though you don't necessarily
| know.
|
| Commas as separator is pretty much orthogonal, and... well,
| better overall.
| hnlmorg wrote:
| I got what they were saying. My point is that if you have a
| line feed then a comma is an entirely redundant piece of
| syntax. Commas should only be needed if you are placing
| multiple statements on a single line. Just like how several
| programming languages treat semi-colons (eg Go, bash,
| JavaScript, etc). So mandating them is the opposite of what
| modern languages should be insisting upon.
|
| In Murex (my own language) both commas and semi-colons can be
| dropped if the statement or expression is terminated by a
| line feed.
|
|     list = %[
|         chair
|         table
|         "bed-side table"
|     ]
|
| YAML, for all of its warts, behaves similarly.
|
| It's a much better way to handle lists and objects because
| the comma only exists for the same reason the semi-colon does
| in C-like syntax: as a parser hint. But if you've got a new
| line then that hint is completely redundant.
| OJFord wrote:
| It's only redundant in languages that make it redundant,
| like you mentioned C at the end - it's the new line that's
| redundant there (or only for readability), it's the
| semicolon with the 'end' semantics.
| unilynx wrote:
| In this specific case it's not redundant in C - without
| the commas, the strings would be merged together.
| OJFord wrote:
| Exactly, that's what I said? 'In this specific case'
| though? It would be news to me (not that I've used it
| since university!) that they're _ever_ non-semantic?
| teo_zero wrote:
| Would your language also accept this?
|
|     list = %[ chair, table ]
|
| And this?
|
|     list = %[ chair, table, ]
| zer00eyz wrote:
| The problem isn't "configuration"
|
| Yaml, JSON, xml, text files... all work great for configuration.
|
| But the assumption is that you're configuring a piece of software
| on an already existing system.
|
| A config file as a means of setting up a system, installing
| software, and establishing how it's going to run is exceedingly
| stupid.
|
| Write your app so it can run on bare metal, install with apt, yum
| or your tool(s) of choice for your org. Build it so it scales
| diagonally, and works against spot instances where you can
| (because sometimes more or fewer cores are cheaper). Don't even get
| me started on the nonsense of everyone and their ideas for
| "secrets management", it makes me miss LDAP.
| Kinrany wrote:
| Are you suggesting that all configuration should be dynamic and
| multi-tenant (for the lack of a better word), not happen at
| startup?
| zer00eyz wrote:
| Scripting a config file. Templates for config. We did this
| already, xml, XSD, XSLT, DOM...
|
| It sucked.
|
| Config should be easy and flat and simple. If it isn't, then
| source it from code, or from a service... Most of the
| nonsense of config madness is a byproduct of, or hidden in,
| containers. If we were writing installable software much of
| that would go away...
| MrDarcy wrote:
| What's this magical config service do? Take data in from a
| request and return transformed data back in response to
| configure the caller? What is the essential difference
| between this config service and existing config languages?
| Jtsummers wrote:
| > Write your app so it can run on bare metal, install with apt,
| yum or your tool(s) of choice for your org.
|
| If it's running on bare metal, why are you using apt or yum?
| Those are parts of an OS you wouldn't have if you're running
| your program without an OS.
| habitue wrote:
| I know the standard response is "why another one?" but honestly,
| his diagnosis of the existing options is pretty good. He even
| uses the best examples of thoughtfully designed config languages:
| cue, dhall and nix, and ... yeah I pretty much agree with his
| qualms about those languages.
|
| What is a config language really? It's a language that doesn't
| allow side effects and evaluates to json.
| paulddraper wrote:
| > I was struggling with that day was to define six cloud storage
| buckets in Terraform...The kind of thing you'd do with a two-line
| nested loop in any general-purpose language
|
| I understand this is just an example, but FYI the modern solution
| is to use CDKTF rather than HCL for Terraform.
|
| That allows you to choose your favorite general purpose lang:
| Python, TypeScript, Go, Java, C#.
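The "two-line nested loop" from the quoted passage can be sketched as follows. The database names, period names, and `location` field are hypothetical placeholders, and the output is generic JSON rather than an exact Terraform or CDKTF schema.

```python
import json

databases = ["alpha", "bravo"]            # hypothetical names
periods = ["hourly", "daily", "monthly"]  # hypothetical names

# Two databases times three periods gives the six bucket definitions.
buckets = [
    {"name": f"{db}-{period}", "location": "example-region"}
    for db in databases
    for period in periods
]

print(json.dumps(buckets, indent=2))
```

The generated JSON can then be checked in or fed to a tool that accepts JSON input, which keeps the loop out of the tool's own configuration language.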
| OJFord wrote:
| I use Terraform _because_ of HCL, the absolute best thing about
| it is the declarative config; if someone insists on using their
| favourite general purpose language: 1) they're wrong; 2)
| there's plenty of other options for that and I'm not interested
| in fighting in Terraform's corner knowing they won't use it for
| the most fundamental reason it's good.
| paulddraper wrote:
| So, your reply to Ruud is "you're holding it wrong" or "it's
| actually fine, you just think it isn't"?
| aliasxneo wrote:
| > but FYI the modern solution is to use CDKTF rather than HCL
| for Terraform.
|
| That's an odd take. Are you saying that because it's newer? I
| would push something like Crossplane as more "modern" in that
| it solves the critical issue of Terraform not having any sort
| of reconciliation loop.
| deathanatos wrote:
| I quite like what the author has come up with here. In particular, I
| understand exactly how HCL would drive someone down this path, as
| it is infuriating to try to get that language to compute what you
| want computed.
|
| I think at the end of the day, k8s YAML is focused on the
| datatype, i.e., what would be the result of your `rcl evaluate`.
| I do think this would be a much saner path than what Helm
| provides, though, and Helm's _textual_ templating, as opposed to
| understanding _values_ and building the actual datastructures,
| and then converting that to a format like JSON or YAML ... is the
| wrong path. (I.e., I can see RCL being a possible replacement for
| Helm.)
|
| > _The language is a superset of json._
|
| I'm going to introduce what I think is pretty much a universal
| law: languages claiming to be supersets of other languages are
| not supersets.
|
| In the case of JSON, there's basically one counter-example that
| breaks all alleged supersets: "\ud83d\udca9"
|
| And if we try it:
|
|     >> printf '"\\ud83d\\udca9"' | cargo r -- evaluate
|         Finished dev [unoptimized + debuginfo] target(s) in 0.01s
|          Running `target/debug/rcl evaluate`
|     stdin:1:2
|       |
|     1 | "\ud83d\udca9"
|       |  ^~~~~~
|     Error: Invalid escape sequence: not a Unicode scalar value.
|     Help: For code points beyond U+FFFF, use '\u{...}' instead of a
|     surrogate pair.
|
| Now! This to me is _not a bug in RCL:_ this particular facet of
| JSON is utter _crazy town_, and I would strongly encourage you
| to _not_ adopt it. (Since down this road lies madness, like
| unpaired surrogates. JSON's grammar & standard is sloppy here,
| and JSON/JS's syntax of using the _UTF-16 encoding_, and not
| just the scalar value ... it's the part of JavaScript that is
| just what JS is, but is the part that we shouldn't be copying.)
|
| It is much saner to just have a flag/way to indicate "my input is
| JSON" and then to just parse it via an actual JSON parser. Then
| let RCL evolve on its own merits. If there's a lot of happy
| overlap, and most JSON documents are blissfully polyglots with
| RCL, that's fine too / a happy little accident. (But the option
| is important if you want to apply it somewhere programmatically,
| where the inputs are JSON.)
|
| YAML breaks the same way, too: it too is a "superset" of JSON
| that isn't.
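To make the surrogate-pair point concrete, here is a small sketch using Python's standard `json` module as the "actual JSON parser". It shows the pair from the comment decoding to a single code point, and the unpaired-surrogate case being accepted by the parser while producing a string that cannot be encoded as UTF-8.

```python
import json

# A JSON string containing a UTF-16 surrogate pair, as in the comment above.
text = json.loads('"\\ud83d\\udca9"')
print(len(text), hex(ord(text)))  # 1 0x1f4a9

# An unpaired surrogate is also accepted by this parser, but the result is
# not a valid sequence of Unicode scalar values and cannot be UTF-8 encoded.
lone = json.loads('"\\ud83d"')
try:
    lone.encode("utf-8")
except UnicodeEncodeError:
    print("unpaired surrogate cannot be encoded as UTF-8")
```

A language that only allows Unicode scalar values in strings, as the RCL error message indicates, has to reject the second input, which is why routing JSON input through a dedicated JSON parser is the simpler contract.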
| lolinder wrote:
| Have you ever run into a case where this distinction matters in
| practice? If it's a superset of all JSON that actually exists
| in the wild, that feels functionally the same as being a
| superset of JSON, even if the spec technically allows edge
| cases it can't handle.
| eclectic29 wrote:
| Wow! Another configuration language post. 2nd day in a row. And
| that too on the front page. It's 2024 and the software industry
| has not solved the problem of configuration yet. Umpteen
| solutions every year - languages, formats, etc - all for just
| doing configuration. Just wow! It amazes me. It would be so much
| better if all the great minds focused on real issues than keep
| spinning on the hamster wheel of config file formats and
| languages.
| jahewson wrote:
| I spent my PhD working on one of the most pressing problems
| facing big cloud companies at the time: downtime. The big
| problem being faced was... configuration.
| vaylian wrote:
| What in your humble opinion is a real issue that should be
| addressed instead?
| mst wrote:
| For templating/generating YAML it's worth a look at
| https://yglu.io/ and ingy's latest piece of madness,
| https://yamlscript.org/
|
| I make no claim that anybody who does look will like either of
| them, but I do claim they're worth a look even if it turns out
| that you don't.
| heads wrote:
| Configuration languages are an interface to the API presented by
| the software you are using. There can be a hard boundary between
| the two -- a C codebase for a web server and the XML-like file
| that configured it -- or there can be an imperceptible transition
| between the two where the configuration is just another module in
| the code base.
|
| As one of the authors working in a 2000+ module Python codebase,
| nothing gives me greater pleasure than to drive one part of the
| codebase by creating another module. Nothing gives me greater
| sadness than to be forced to interact with something through a
| YAML config file, command line flags, or launching GitLab
| pipelines. All three of those boundaries interrupt the powerful
| electromagnetic force fields that pervade the system: type
| checking, linting, and symbol finding (IDE integration). In time,
| I vow to destroy every last one of these non-code boundaries.
| Another unwanted-boundary demon on the exorcism list is
| polyrepos.
|
| There once was a time when the world worked like this but instead
| of Python it was C. It wasn't as rich as the dynamic world we
| have today -- I'm certain I don't want to go back to configuring
| my software by recompiling it -- but it did have a lot of the
| advantages of working in one language environment across all the
| things.
| somewhereoutth wrote:
| This, but with Javascript. Perhaps the 'electromagnetic force
| fields' are less powerful, but the array and object literals
| are perfect for configuration.
___________________________________________________________________
(page generated 2024-02-04 23:00 UTC)