[HN Gopher] CCL: Categorical Configuration Language
___________________________________________________________________
CCL: Categorical Configuration Language
Author : SchwKatze
Score : 64 points
Date : 2025-01-06 20:04 UTC (5 days ago)
(HTM) web link (chshersh.com)
(TXT) w3m dump (chshersh.com)
| mightyham wrote:
| I really like the conciseness of this syntax. The language seems
| very well thought through.
|
| That being said, I've been working with NixOS recently and it's
| made me reconsider what is useful for a configuration language.
| In many reasonably large software projects, where configs become
| very complex, config reuse (in other words templating or meta-
| configuration) becomes an increasingly helpful feature. Nix
| configs are great because it's not just a config, but a full
| blown purely functional language for manipulating the config.
| It's intuitive and powerful once you get the hang of it, and I
| sometimes find myself wishing I could use it when I have to work
| with yaml, json, etc.
| one-punch wrote:
| You might be interested in nickel (https://nickel-lang.org/),
| which is a modern take on configuration management based on the
| experience of Nix/NixOS configurations: purely functional
| configuration, built-in validation (types & contracts),
| reusable (functions, modules, defaults), and in addition
| exports to Yaml, Json, etc.
|
| To integrate nickel with nix, see how organist
| (https://github.com/nickel-lang/organist) does DevShell
| management.
| worthless-trash wrote:
| Every configuration language that is not code ends up being wrong
| in multiple ways. We will never learn.
| andrewflnr wrote:
| Out of curiosity, do you like Dhall?
| theamk wrote:
| That's a lot of words to describe a very simple syntax:
| name=value pairs, with line continuation using whitespace.
|
| Basically RFC822 email headers, or Debian Control File Format [0]
| but with "=" instead of ":", and without dedicated comment
| character.
|
| The biggest problem with this format is that a lot of things are
| left for the app, so each app will have its own way to implement
| lists, bools, line wrap support.. Even something like "value
| override" is left to program implementation. Don't expect
| YAML/JSON/XML-style automated validators/linters, each program
| will need its own bespoke parser/generator.
|
| [0] https://www.debian.org/doc/debian-policy/ch-
| controlfields.ht...
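theamk's summary (name=value pairs, with indented lines continuing the previous value) can be sketched in a few lines of Python. This is only an illustration of that summary, not CCL's actual OCaml parser: the function name `parse_pairs` and the sample keys are made up for the example, and nested values, comments and error handling are left out.

```python
def parse_pairs(text):
    """Parse 'name = value' lines; indented lines continue the previous value."""
    pairs = []  # list of (key, value) tuples, so repeated keys are preserved
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t" and pairs:
            # Continuation line: append it to the previous value.
            key, value = pairs[-1]
            pairs[-1] = (key, value + "\n" + line.strip())
        else:
            # Split on the first '=' only; a line without '=' gets an empty value.
            key, _, value = line.partition("=")
            pairs.append((key.strip(), value.strip()))
    return pairs


sample = """name = demo
description = first line
  second line
port = 8080
"""
result = parse_pairs(sample)
```

Everything comes back as strings, which is the point theamk raises: lists, booleans and overrides are left for each application to interpret.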
| atoav wrote:
| Not to discredit the author, there are some smart thoughts in
| there... but I can't help but feel like: yeah of course this is
| very elegant -- but the complexity is not gone, it is
| elsewhere. And they are not showing that elsewhere.
|
| Namely the parsing code.
| sn9 wrote:
| You can literally just look at it:
| https://github.com/chshersh/ccl/tree/main/lib
| atoav wrote:
| Sure, but it is a _configuration_ format. If it is intended
| to be used in all kinds of languages, you have to show how to
| deal with it in all kinds of languages.
|
| Checks that might be trivial in OCaml might be utter
| footguns in say C.
|
| Don't get me wrong here, I get that this is someone's spare
| time project that they might use for themselves only. I am
| fine with that. But I am unconvinced if the (I admit: well
| demonstrated) simplicity of the format translates to
| simplicity of use in the scope it aimed for (replacement
| for other wide spread configuration formats).
|
| I don't say it is impossible (or even unlikely) that this
| is an improvement, I just caution against seeing an elegant
| minimalist approach and automatically assuming it makes
| things simpler -- remember, computers with their binary
| file systems _already_ have the most simple format that
| could exist: zero or one. Yet somehow people shot
| themselves in the foot parsing those for decades. Much of
| the complexity of computer systems stems from managing the
| simplicity of the underlying components (if we ignore the
| thick layer of historically grown cruft).
|
| What would it take to change my mind? Elegant examples of how
| to parse all the examples in that post using major
| programming languages (C, C++, Javascript, Java, Python,
| Go, Rust, ...). That is ofc a lot of work, but if the
| format should be adopted that work would be needed anyways.
| sn9 wrote:
| Part of the power of the idea is how amenable this is to
| property-based testing.
|
| And it really only takes one solid implementation that
| can be wrapped/called from other languages to do well.
| (Or perhaps one per ecosystem, like JVM, .NET, WASM,
| anything that can be relatively easily called from C,
| anywhere Python is used for scripting, etc.)
|
| And because of the formalisms involved, you could have a
| pretty precise and complete spec defined.
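The property-based testing that sn9 mentions could look roughly like the following. It is a sketch under loud assumptions: configs are modelled as a flat mapping from key to a list of values (the real CCL structure is nested), `merge` and `random_config` are hypothetical helpers, and the standard `random` module stands in for a real property-testing library.

```python
import random


def merge(a, b):
    """Combine two configs (dict: key -> list of values), keeping every value in order."""
    out = {k: list(v) for k, v in a.items()}
    for k, v in b.items():
        out.setdefault(k, []).extend(v)
    return out


def random_config(rng):
    """Build a small random config from a fixed pool of key names."""
    keys = ["debug", "port", "name"]
    chosen = rng.sample(keys, rng.randint(0, len(keys)))
    return {k: [str(rng.randint(0, 9)) for _ in range(rng.randint(1, 2))] for k in chosen}


def check_monoid_laws(trials=200, seed=0):
    """Check associativity and identity of merge on randomly generated configs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (random_config(rng) for _ in range(3))
        assert merge(merge(a, b), c) == merge(a, merge(b, c))  # associativity
        assert merge(a, {}) == a and merge({}, a) == a  # empty config is the identity
    return True
```

The same two laws could be run against any implementation in any language, which is what makes one shared spec plausible.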
| diggan wrote:
| > The biggest problem with this format is that a lot of things
| are left for the app, so each app will have its own way to
| implement lists, bools, line wrap support
|
| That seems to be one of the explicit goals:
|
| > Configuration is specific to a particular application. What
| you want is to follow the rule of the least surprise and
| utility functions to parse strings.
|
| Since configuration is specific to a particular program, so
| should the interpretation of its values be, seems to be what the
| author is getting at.
|
| Personally, what puts me off this particular configuration
| language is this part, hidden behind collapsed text:
|
| > In fact, CCL is indentation-sensitive.
|
| Programming/configuring stuff with invisible characters isn't
| my idea of fun, and it sounds especially cumbersome if everyone
| is using it differently, since the configuration language
| leaves a lot up to the users of the configuration.
| cies wrote:
| I think indentation sensitivity is very well suited for
| configs: you want little line noise and the complexity is
| low. I do understand the trade-off TOML made in this case.
|
| Some languages prohibit the TAB character, and only allow
| spaces at the start of the line in groups of 2 or 4: so it is
| always clear how indentation is to be understood.
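The rule cies describes (no TAB characters, leading spaces only in fixed groups) is easy to check mechanically. A minimal sketch, assuming a group width of 2; `check_indentation` is a made-up helper and not part of any CCL tooling.

```python
def check_indentation(text, width=2):
    """Return a list of (line_number, problem) for lines that break the rule."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # The indent is whatever leading run of spaces and tabs the line has.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            problems.append((number, "tab in indentation"))
        elif len(indent) % width != 0:
            problems.append((number, "indent is not a multiple of %d" % width))
    return problems


report = check_indentation("a =\n  b = 1\n   c = 2\n\td = 3\n")
```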
| nickm12 wrote:
| Everyone is entitled to scratch their own itch, but this seems
| like the most useless configuration language I've ever seen.
|
| Take the "fixed point" example, where you have a boolean setting
| which one file says should be "yes" and the other says it
| should be "no" and the language semantics composes that into a
| list with both values. For what boolean setting does this make
| sense?
|
| The article says "Overrides are not a problem because you keep
| both values. And you can decide what to do with them: keep only
| the first, keep only the last or use some smart logic to combine
| both of them. You're the boss."
|
| If you need custom logic in your application to determine the
| setting to use, how is this language helping you?
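To make the objection concrete, here is roughly what the application-side logic would look like when every value is kept. This is a sketch, not CCL's API: `collect`, `last_wins` and `parse_bool` are hypothetical helpers, and the key names are invented.

```python
def collect(*sources):
    """Gather every value for each key, in the order the sources are given."""
    merged = {}
    for pairs in sources:
        for key, value in pairs:
            merged.setdefault(key, []).append(value)
    return merged


def last_wins(values):
    """One possible policy: the most recently read value overrides the rest."""
    return values[-1]


def parse_bool(text):
    """Interpret a string as a boolean; raises KeyError on anything unexpected."""
    table = {"yes": True, "true": True, "no": False, "false": False}
    return table[text.strip().lower()]


global_cfg = [("colour", "yes"), ("jobs", "4")]
local_cfg = [("colour", "no")]

merged = collect(global_cfg, local_cfg)
colour = parse_bool(last_wins(merged["colour"]))
```

The format supplies `merged`; choosing `last_wins` (or any other policy) and writing `parse_bool` remains the application's job, which is the cost nickm12 is pointing at.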
| evujumenuk wrote:
| I think this is probably the best place within these comments
| to note that one thing some people expect of a configuration
| format is to be able to hide information from the consuming
| piece of software.
|
| Normally, it is often useful for a program to receive all the
| configuration from all sources. ("This flag is normally set to
| TRUE, has been set to FALSE on this system, has been set to
| TRUE by the user, and now there's an environment variable that
| says one thing and a command line flag that says something
| else.") Sometimes, integrating several incoherent settings into
| one is dependent on its consumer, or even the setting itself.
| Sometimes, you would like to be able to debug how different
| settings interact with one another. Sometimes, different
| settings can be merged without issue.
|
| CCL exposes everything to the program receiving the config,
| which is something (some) people seem to abhor. I can see how
| wanting to hide information can be both useful and detrimental,
| so I'm wondering if this issue is actually orthogonal to
| configuration languages, meaning CCL, and others, shouldn't
| even concern themselves with it.
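The "receive everything, then decide" model evujumenuk describes can be sketched as a resolver that returns both the effective value and the trail of sources that set it. The layer names and the `resolve` helper are assumptions made for illustration.

```python
def resolve(key, layers):
    """Return (effective_value, history); later layers take precedence."""
    history = [(name, settings[key]) for name, settings in layers if key in settings]
    if not history:
        raise KeyError(key)
    return history[-1][1], history


layers = [
    ("default", {"verbose": "true"}),
    ("system", {"verbose": "false"}),
    ("user", {"verbose": "true"}),
    ("env", {}),
    ("cli", {"verbose": "false"}),
]

value, history = resolve("verbose", layers)
```

A program that only wants the answer uses `value`; one that wants to debug how settings interact can print `history`.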
| cies wrote:
| Reading this I think of all the programming languages that
| have comments with whole languages inside of them. That is beyond
| the complex documentation I found.
| hoseja wrote:
| You'd think people would be more disinclined to xkcd://927 but
| for some reason this keeps happening.
| trelliscoded wrote:
| The equal sign is a required character for anything base64
| encoded, which includes some things you'd expect to be in a
| config file, like ssh public keys and x509 certs.
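For a flat key=value line, trailing base64 padding survives if the parser splits on the first "=" only, as the sketch below shows. Whether this carries over to CCL is an open question in this thread: a format that re-parses values as nested key-value pairs could still trip over the extra "=". `split_pair` is a hypothetical helper.

```python
import base64


def split_pair(line):
    """Split on the first '=' only, so later '=' characters stay in the value."""
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError("no '=' in line: %r" % line)
    return key.strip(), value.strip()


encoded = base64.b64encode(b"hello").decode("ascii")  # ends with '=' padding
key, value = split_pair("token = " + encoded)
decoded = base64.b64decode(value)
```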
| herrington_d wrote:
| Calling other languages like "none of the tooling" in the "why"
| section sounds like a huge self-roasting since CCL does not have,
| say, highlighting/LSP/FFI for adoption.
| junon wrote:
| What a delightfully arrogant article, to the point I believe it
| to be satire (stopped reading at section headers, perhaps I
| missed the punchline).
|
| TOML is by far the most stripped down and easy to understand
| configuration format I've ever used, allowing just enough
| syntactic sugar to be useful without changing semantics. The fact
| it's made by Tom is meaningless, so its flagrant dismissal is
| silly to me.
|
| Meanwhile, the proposed configuration format sounds like a
| nightmare to read. It still has clashing syntax, offloads all
| of the parsing work to the software (which means now you have the
| same config format with multiple different ways of interpreting
| values), restricts usage of certain characters with no way of
| escaping them (someone else mentioned base64), and otherwise
| requires that you recursively parse it for nested KVs rather than
| constructing the final in memory structures in a linear pass,
| adding a layer of indirection prior to parsing.
|
| Not to mention, I really get turned off by this sort of pious
| writing style.
|
| No thanks. Lots of reasons to dislike config formats we've seen
| before but this doesn't solve anything in my eyes.
| Bost wrote:
| Have a look at "Code is Data / Data is Code"
| https://en.m.wikipedia.org/wiki/Homoiconicity And then see how
| it's done in real life: https://guix.gnu.org/
| bobnamob wrote:
| I'm still sad there's such an aversion to parens
|
| Edn is a lovely config language that checks most of the author's
| boxes, while still being "composable"
| jjulius wrote:
| With the handful of threads about electronic music in the past
| couple of weeks, for a brief moment I thought this was going to
| be about the inimitable CCL[1].
|
| [1] https://on.soundcloud.com/CGSmV6qHWNXKhLcp8
| jcarrano wrote:
| Essentially a stripped-down ini where you have to code any
| additional functionality and you will never have any tooling
| because basic things (like comments) are not standardized.
|
| For the use cases mentioned in the article, I have used JSON with
| schemas and JSON-merge and JSON-patch quite effectively. VScode
| supports schemas so it will help you edit the file without making
| mistakes. You can use JSON merge to combine global and local
| configs and you could even use custom fields in the schema to
| indicate metadata, for example, that a key represents an upper
| limit so the lowest value should be considered when merging.
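The schema-driven merge jcarrano outlines might be sketched as follows. Assumptions: the custom schema field is called `x-merge` here (a name invented for this example, not a JSON Schema keyword), and the merge is a plain recursive dictionary merge rather than a full JSON Merge Patch implementation (null-means-delete is omitted).

```python
def merge_configs(base, override, schema):
    """Recursively merge two dicts; keys flagged in the schema take the lower value."""
    result = dict(base)
    for key, value in override.items():
        rule = schema.get("properties", {}).get(key, {})
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Descend into nested objects with the matching sub-schema.
            result[key] = merge_configs(result[key], value, rule)
        elif rule.get("x-merge") == "min" and key in result:
            # Upper limits: the stricter (lower) value wins.
            result[key] = min(result[key], value)
        else:
            result[key] = value
    return result


schema = {
    "properties": {
        "limits": {
            "properties": {"max_jobs": {"type": "integer", "x-merge": "min"}}
        }
    }
}
global_cfg = {"limits": {"max_jobs": 8}, "name": "global"}
local_cfg = {"limits": {"max_jobs": 16}, "name": "local"}

merged = merge_configs(global_cfg, local_cfg, schema)
```

Here the local file can rename the config but cannot raise `max_jobs` above the global limit.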
| cies wrote:
| i miss comments so badly in json, especially when used in
| config files
| jcarrano wrote:
| Just add a "$comment" key. The only problem will be if the
| parser is too strict and rejects the unrecognized key.
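One way around a strict consumer is to strip the comment keys after parsing and before validation. A small sketch using only the standard `json` module; `strip_comments` and the sample document are made up for the example.

```python
import json


def strip_comments(node):
    """Recursively drop every "$comment" key from parsed JSON data."""
    if isinstance(node, dict):
        return {k: strip_comments(v) for k, v in node.items() if k != "$comment"}
    if isinstance(node, list):
        return [strip_comments(v) for v in node]
    return node


raw = '{"$comment": "limits for CI", "jobs": 4, "cache": {"$comment": "MiB", "size": 512}}'
config = strip_comments(json.loads(raw))
```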
| baq wrote:
| Or the linter requires sorted keys.
|
| JSON is just about the worst config format imaginable. I'd
| rather write my configs in xml and I really don't like xml.
| cies wrote:
| Nice! I really like a fresh take on anything.
|
| It's been said to be like RFC822 or Debian Control File Format in
| the comments here, I'd like to add like x-www-form-urlencoded. At
| work I use this a lot as it is what browsers submit. It's
| List<String, List<String>>, so keys may occur more than once. We
| standardized on a little language for the keys that allows us to
| submit structured forms. (Many libraries prescribe a language for
| this, Rails does too; our keys look like
| ".location.space[2].name" for
| "{location:{space:[null,null,{name:VALUE_AS_STRING}]}}" in json).
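A key language like the one cies describes can be expanded into nested structures with a small tokenizer and a builder. This is a guess at the general shape based only on the single example key in the comment; `parse_path` and `assign` are hypothetical names, and type conflicts (a key used as both list and object) are not handled.

```python
import re

# A step is either ".name" or "[index]".
TOKEN = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")


def parse_path(path):
    """Turn '.a.b[2].c' into ['a', 'b', 2, 'c']."""
    steps, pos = [], 0
    while pos < len(path):
        match = TOKEN.match(path, pos)
        if not match:
            raise ValueError("bad key at offset %d: %r" % (pos, path))
        name, index = match.groups()
        steps.append(name if name is not None else int(index))
        pos = match.end()
    return steps


def assign(root, steps, value):
    """Store value under the given steps, creating dicts and padded lists as needed."""
    node = root
    for step, nxt in zip(steps, steps[1:] + [None]):
        if nxt is None:
            child = value  # last step: store the value itself
        elif isinstance(nxt, int):
            child = []  # next step is an index, so this slot holds a list
        else:
            child = {}  # next step is a name, so this slot holds a dict
        if isinstance(step, int):
            while len(node) <= step:
                node.append(None)  # pad skipped positions with None
            if nxt is None or node[step] is None:
                node[step] = child
        elif nxt is None or step not in node:
            node[step] = child
        node = node[step]
    return root


form = {}
assign(form, parse_path(".location.space[2].name"), "VALUE_AS_STRING")
```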
|
| Some years ago I wrote a TOML parser in Haskell, because parsers
| are fun to write in Haskell, and I needed one.
|
| Since we deploy with AWS/Fargate (Docker) the config is passed as
| JSON k-v pairs that are then set as ENV VARs in the container
| (following one of those 12factor principles). So it seems I
| cannot dictate the config file format.
| 4ad wrote:
| Compositionality is paramount and category theory guarantees
| compositionality, but the author's criteria for what entails a
| good configuration language are woefully naive.
|
| Configuration is not about describing data, it's about control.
| Control over a system made of impure, effectful parts.
|
| Configuration is a matter of _programming_ a mutable computer,
| i.e. a way to specify the composition of effects that you want.
|
| The configuration language is agnostic over the systems it
| controls, therefore it must provide semantics that preserve
| morphism in any of its interpretations. The language must be rich
| enough to accommodate this. It is not enough to have one
| semantics.
|
| Moreover, it must be rich enough to describe its own models. Yes,
| the interpretation of it by arbitrary systems must be expressible
| in the language itself in order to be meaningful and to preserve
| consistency with regard to its interpretations. In practice, this
| is done through types.
|
| Additionally, configuration is a _global_ activity, it's applied
| to the whole system, with many people changing conflicting
| aspects of it. Just like with any large evolving program,
| abstraction and typing are required for software engineering
| reasons alone.
|
| Coincidentally, CUE is also a monoid, but it is more than that,
| it is a complete Heyting algebra (or a complete Boolean algebra
| in the case of closed world assumption), these objects also form
| very rich categories.
|
| Another way to look at CUE is to view it as a semantic domain for
| the denotation of arbitrary types of arbitrary languages. It's
| suitable for this because it's a coherence space (Girard). All
| CUE operations are closed, preserving the structure of the space.
|
| One interesting aspect of the author's effort is that even if he was
| so naive, category theory led him to a path that is correct. What
| he did is _incomplete_, a monoid does not suffice for a
| configuration language, but a monoid is _required_. This is
| saying something.
| jteppinette wrote:
| Is there a /s missing somewhere?
| azeirah wrote:
| If you want a category theoretically-informed configuration
| language that has real-world use, then use nix.
|
| It's precisely this, and there's a reason it has the largest
| package repo on the planet
| binary132 wrote:
| This is silly. I like it. I'm pretty sure you just tricked me
| into reading a "for dummies" primer on category theory.
| Congratulations!
| kukimik wrote:
| This, for a number of reasons, reminds me of the Tree format, see
| https://hackernoon.com/tree-ast-which-crushes-json-xml-yaml-...
| and https://github.com/nin-jin/tree.d.
| jalk wrote:
| > You want to introduce data validation and type-checking in your
| config? Fine, you can just ask users to provide type annotations
| in the format you want...
|
| No - the users cannot choose type - they cannot suddenly decide
| that they want to provide a date where your parser expects a
| URL, or you are suddenly just making users repeat the schema
|
| > Every software MUST WORK WITHOUT A CONFIG!!
|
| > So, empty config or no config file at all must be a valid
| configuration
|
| Loads of scenarios where I want fail-fast over a running but
| broken system: "WARN AlertSystem URL not configured: alerting
| disabled" "WARN No credentials store configured, adding
| admin/111111" ....
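The fail-fast behaviour jalk prefers can be sketched as a validation step that refuses to start instead of logging a warning and carrying on. The setting names in `REQUIRED` and the `ConfigError` type are invented for this example.

```python
class ConfigError(Exception):
    """Raised when the configuration is not safe to run with."""


REQUIRED = ("alert_url", "credentials_store")  # hypothetical setting names


def validate(config):
    """Return the config unchanged, or raise if a required setting is missing or empty."""
    missing = [key for key in REQUIRED if not config.get(key)]
    if missing:
        raise ConfigError("missing required settings: " + ", ".join(missing))
    return config


try:
    validate({"alert_url": "https://alerts.example.invalid"})
    outcome = "started"
except ConfigError as error:
    outcome = str(error)
```

With this policy an empty config is rejected at startup, the opposite of the "must work without a config" rule quoted above.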
___________________________________________________________________
(page generated 2025-01-11 23:01 UTC)