[HN Gopher] CCL: Categorical Configuration Language
       ___________________________________________________________________
        
       CCL: Categorical Configuration Language
        
       Author : SchwKatze
       Score  : 64 points
       Date   : 2025-01-06 20:04 UTC (5 days ago)
        
 (HTM) web link (chshersh.com)
 (TXT) w3m dump (chshersh.com)
        
       | mightyham wrote:
       | I really like the conciseness of this syntax. The language seems
       | very well thought through.
       | 
       | That being said, I've been working with NixOS recently and it's
       | made me reconsider what is useful for a configuration language.
       | In many reasonably large software projects, where configs become
       | very complex, config reuse (in other words templating or meta-
       | configuration) becomes an increasingly helpful feature. Nix
       | configs are great because it's not just a config, but a full
       | blown purely functional language for manipulating the config.
       | It's intuitive and powerful once you get the hang of it, and I
       | sometimes find myself wishing I could use it when I have to work
       | with yaml, json, etc.
        
         | one-punch wrote:
         | You might be interested in nickel (https://nickel-lang.org/),
         | which is a modern take on configuration management based on the
         | experience of Nix/NixOS configurations: purely functional
         | configuration, built-in validation (types & contracts),
         | reusable (functions, modules, defaults), and in addition
         | exports to Yaml, Json, etc.
         | 
         | To integrate nickel with nix, see how organist
         | (https://github.com/nickel-lang/organist) does DevShell
         | management.
        
       | worthless-trash wrote:
       | Every configuration language that is not code ends up being wrong
       | in multiple ways. We will never learn.
        
         | andrewflnr wrote:
         | Out of curiosity, do you like Dhall?
        
       | theamk wrote:
       | That's a lot of words to describe a very simple syntax:
       | name=value pairs, with line continuation using whitespace.
       | 
       | Basically RFC822 email headers, or Debian Control File Format [0]
       | but with "=" instead of ":", and without dedicated comment
       | character.
       | 
       | The biggest problem with this format is that a lot of things are
       | left for the app, so each app will have its own way to implement
       | lists, bools, line wrap support.. Even something like "value
       | override" is left to program implementation. Don't expect
       | YAML/JSON/XML-style automated validators/linters, each program
       | will need its own bespoke parser/generator.
       | 
       | [0] https://www.debian.org/doc/debian-policy/ch-
       | controlfields.ht...
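
The "name=value pairs, with line continuation using whitespace" description above can be sketched in a few lines of Python. This is only an illustration of that description, not CCL's actual grammar: the function name `parse_pairs`, the rule for joining continuation lines with a newline, and the sample keys are all assumptions.

```python
def parse_pairs(text):
    """Parse 'name = value' lines; indented lines continue the previous value.

    Assumption: continuation lines are stripped and joined with a newline.
    Duplicate keys are kept, in order, as separate pairs.
    """
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line[0] in " \t" and pairs:
            # Leading whitespace: this line belongs to the previous value.
            key, value = pairs[-1]
            extra = line.strip()
            pairs[-1] = (key, value + "\n" + extra if value else extra)
        else:
            # Split on the first '=' only; the rest is the value.
            key, _, value = line.partition("=")
            pairs.append((key.strip(), value.strip()))
    return pairs


sample = """\
name = demo
description = first line
  second line
ports =
  8080
  8081
"""

for key, value in parse_pairs(sample):
    print(repr(key), "->", repr(value))
```

Everything comes back as strings, which is the point made above: deciding that `ports` is a list of integers is left to each application.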
        
         | atoav wrote:
         | Not to discredit the author, there are some smart thoughts in
         | there... but I can't help but feel like: yeah of course this is
         | very elegant -- but the complexity is not gone, it is
         | elsewhere. And they are not showing that elsewhere.
         | 
         | Namely the parsing code.
        
           | sn9 wrote:
           | You can literally just look at it:
           | https://github.com/chshersh/ccl/tree/main/lib
        
             | atoav wrote:
             | Sure, but it is a _configuration_ format. If it is intended
             | to be used in all kinds of languages, you have to show how
             | to deal with it in all kinds of languages.
             | 
             | Checks that might be trivial in OCaml might be utter
             | footguns in say C.
             | 
             | Don't get me wrong here, I get that this is someone's spare
             | time project that they might use for themselves only. I am
             | fine with that. But I am unconvinced if the (I admit: well
             | demonstrated) simplicity of the format translates to
             | simplicity of use in the scope it aimed for (replacement
             | for other wide spread configuration formats).
             | 
             | I don't say it is impossible (or even unlikely) that this
             | is an improvement, I just caution against seeing an elegant
             | minimalist approach and automatically assuming it makes
             | things simpler -- remember, computers with their binary
             | file systems _already_ have the most simple format that
             | could exist: zero or one. Yet somehow people shot
             | themselves in the foot parsing those for decades. Much of
             | the complexity of computer systems stems from managing the
             | simplicity of the underlying components (if we ignore the
             | thick layer of historically grown cruft).
             | 
             | What would it take to change my mind? Elegant examples how
             | to parse all the examples in that post using major
             | programming languages (C, C++, Javascript, Java, Python,
             | Go, Rust, ...). That is ofc a lot of work, but if the
             | format should be adopted that work would be needed anyways.
        
               | sn9 wrote:
               | Part of the power of the idea is how amenable this is to
               | property-based testing.
               | 
               | And it really only takes one solid implementation that
               | can be wrapped/called from other languages to do well.
               | (Or perhaps one per an ecosystem like JVM, .NET, WASM,
               | anything that can be relatively easily called from C,
               | anywhere Python is used for scripting, etc.)
               | 
               | And because of the formalisms involved, you could have a
               | pretty precise and complete spec defined.
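
As a rough illustration of the property-based testing idea, the sketch below checks monoid laws (associativity and an identity element) for a stand-in merge operation using only the standard library's `random` module. The `merge` function here is an assumption made for the example (configs as dicts mapping keys to lists of values, combined by concatenation); it is not taken from the CCL implementation.

```python
import random


def merge(a, b):
    """Combine two configs (dict: key -> list of values) by concatenating lists."""
    out = {key: list(values) for key, values in a.items()}
    for key, values in b.items():
        out.setdefault(key, []).extend(values)
    return out


def random_config(rng):
    """Generate a small random config for testing."""
    keys = rng.sample(["a", "b", "c", "d"], rng.randint(0, 4))
    return {
        key: [str(rng.randint(0, 9)) for _ in range(rng.randint(1, 3))]
        for key in keys
    }


def check_monoid_laws(trials=500, seed=0):
    """Check associativity and identity on randomly generated configs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = (random_config(rng) for _ in range(3))
        assert merge(merge(x, y), z) == merge(x, merge(y, z))  # associativity
        assert merge({}, x) == x and merge(x, {}) == x  # empty config is identity
    return True


print(check_monoid_laws())
```

A dedicated property-testing library would add shrinking and better generators; the random loop is just the smallest self-contained version of the idea.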
        
         | diggan wrote:
         | > The biggest problem with this format is that a lot of things
         | are left for the app, so each app will have its own way to
         | implement lists, bools, line wrap support
         | 
         | That seems to be one of the explicit goals:
         | 
         | > Configuration is specific to a particular application. What
         | you want is to follow the rule of the least surprise and
         | utility functions to parse strings.
         | 
         | Since configuration is specific to a particular program, so
         | should the interpretation of its values be; that seems to be
         | what the author is getting at.
         | 
         | Personally, what puts me off this particular configuration
         | language is this part, hidden behind collapsed text:
         | 
         | > In fact, CCL is indentation-sensitive.
         | 
         | Programming/configuring stuff with invisible characters isn't
         | my idea of fun, and it sounds especially cumbersome if everyone
         | is using it differently, since the configuration language
         | leaves a lot up to the users of the configuration.
        
           | cies wrote:
           | I think indentation sensitivity is very well suited for
           | configs: you want little line noise and the complexity is
           | low. I do understand the trade-off TOML made in this case.
           | 
           | Some languages prohibit the TAB character, and only allow
           | spaces at the start of the line in groups of 2 or 4: so it is
           | always clear how indentation is to be understood.
        
       | nickm12 wrote:
       | Everyone is entitled to scratch their own itch, but this seems
       | like the most useless configuration language I've ever seen.
       | 
       | Take the "fixed point" example, where you have a boolean setting
       | which one file says should be "yes" and the other says it
       | should be "no" and the language semantics composes that into a
       | list with both values. For what boolean setting does this make
       | sense?
       | 
       | The article says "Overrides are not a problem because you keep
       | both values. And you can decide what to do with them: keep only
       | the first, keep only the last or use some smart logic to combine
       | both of them. You're the boss."
       | 
       | If you need custom logic in your application to determine the
       | setting to use, how is this language helping you?
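
To make the "keep both values, then decide" behaviour quoted above concrete, here is a minimal sketch. The names `merge_all` and `resolve`, the policy strings, and the `color` key are hypothetical; the sketch only shows what application-side resolution could look like, not how CCL does it.

```python
from collections import defaultdict


def merge_all(*configs):
    """Combine configs (lists of (key, value) pairs), keeping every value per key."""
    merged = defaultdict(list)
    for config in configs:
        for key, value in config:
            merged[key].append(value)
    return dict(merged)


def resolve(values, policy="last"):
    """Pick one value from the collected list according to an application policy."""
    if policy == "first":
        return values[0]
    if policy == "last":
        return values[-1]
    raise ValueError("unknown policy: " + policy)


system_config = [("color", "yes")]
user_config = [("color", "no")]

merged = merge_all(system_config, user_config)
print(merged)
print(resolve(merged["color"], policy="last"))
```

The extra `resolve` step is exactly the custom logic the comment is asking about: the format hands over both values and the application still has to pick.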
        
         | evujumenuk wrote:
         | I think this is probably the best place within these comments
         | to note that one thing some people expect of a configuration
         | format is to be able to hide information from the consuming
         | piece of software.
         | 
         | Normally, it is often useful for a program to receive all the
         | configuration from all sources. ("This flag is normally set to
         | TRUE, has been set to FALSE on this system, has been set to
         | TRUE by the user, and now there's an environment variable that
         | says one thing and a command line flag that says something
         | else.") Sometimes, integrating several incoherent settings into
         | one is dependent on its consumer, or even the setting itself.
         | Sometimes, you would like to be able to debug how different
         | settings interact with one another. Sometimes, different
         | settings can be merged without issue.
         | 
         | CCL exposes everything to the program receiving the config,
         | which is something (some) people seem to abhor. I can see how
         | wanting to hide information can be both useful and detrimental,
         | so I'm wondering if this issue is actually orthogonal to
         | configuration languages, meaning CCL, and others, shouldn't
         | even concern themselves with it.
        
           | cies wrote:
           | Reading this I think of all the programming languages that
           | have comments with whole languages inside of them. That is beyond
           | the complex documentation I found.
        
       | hoseja wrote:
       | You'd think people would be more disinclined to xkcd://927 but
       | for some reason this keeps happening.
        
       | trelliscoded wrote:
       | The equal sign is a required character for anything base64
       | encoded, which includes some things you'd expect to be in a
       | config file, like ssh public keys and x509 certs.
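
One common way for a key=value parser to cope with `=` inside values is to split on the first `=` only. The sketch below shows that approach with a base64 value that ends in padding; `split_pair` and the `token` key are made up for the example, and nothing here claims to describe what CCL's own parser does.

```python
import base64


def split_pair(line):
    """Split a 'key = value' line on the first '=' only."""
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


encoded = base64.b64encode(b"hello").decode("ascii")  # ends with '=' padding
key, value = split_pair("token = " + encoded)

print(key, value)
print(base64.b64decode(value))
```

This keeps padding in values intact, but it does not help if a key itself needs to contain `=`.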
        
       | herrington_d wrote:
       | Calling other languages like "none of the tooling" in the "why"
       | section sounds like a huge self-roasting since CCL does not have,
       | say, highlighting/LSP/FFI for adoption.
        
       | junon wrote:
       | What a delightfully arrogant article, to the point I believe it
       | to be satire (stopped reading at section headers, perhaps I
       | missed the punchline).
       | 
       | TOML is by far the most stripped down and easy to understand
       | configuration format I've ever used, allowing just enough
       | syntactic sugar to be useful without changing semantics. The fact
       | it's made by Tom is meaningless, so its flagrant dismissal is
       | silly to me.
       | 
       | Meanwhile, the proposed configuration format sounds like a
       | nightmare to read. There is still clashing syntax; it offloads all
       | of the parsing work to the software (which means now you have the
       | same config format with multiple different ways of interpreting
       | values), restricts usage of certain characters with no way of
       | escaping them (someone else mentioned base64), and otherwise
       | requires that you recursively parse it for nested KVs rather than
       | constructing the final in memory structures in a linear pass,
       | adding a layer of indirection prior to parsing.
       | 
       | Not to mention, I really get turned off by this sort of pious
       | writing style.
       | 
       | No thanks. Lots of reasons to dislike config formats we've seen
       | before but this doesn't solve anything in my eyes.
        
       | Bost wrote:
       | Have a look at "Code is Data / Data is Code"
       | https://en.m.wikipedia.org/wiki/Homoiconicity And then see how
       | it's done in real life: https://guix.gnu.org/
        
       | bobnamob wrote:
       | I'm still sad there's such an aversion to parens
       | 
       | Edn is a lovely config language that checks most of the author's
       | boxes, while still being "composable"
        
       | jjulius wrote:
       | With the handful of threads about electronic music in the past
       | couple of weeks, for a brief moment I thought this was going to
       | be about the inimitable CCL[1].
       | 
       | [1] https://on.soundcloud.com/CGSmV6qHWNXKhLcp8
        
       | jcarrano wrote:
       | Essentially a stripped-down ini where you have to code any
       | additional functionality and you will never have any tooling
       | because basic things (like comments) are not standardized.
       | 
       | For the use cases mentioned in the article, I have used JSON with
       | schemas and JSON-merge and JSON-patch quite effectively. VScode
       | supports schemas so it will help you edit the file without making
       | mistakes. You can use JSON merge to combine global and local
       | configs and you could even use custom fields in the schema to
       | indicate metadata, for example, that a key represents an upper
       | limit so the lowest value should be considered when merging.
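
For reference, combining a global and a local config with JSON Merge Patch (RFC 7396) can be sketched in Python as follows. The function follows the algorithm given in the RFC; the config keys (`log`, `max_conns`) are invented for the example, and the schema-driven "lowest value wins" idea mentioned above is not covered.

```python
import json


def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) and return the merged result."""
    if not isinstance(patch, dict):
        return patch  # non-object patches replace the target wholesale
    if not isinstance(target, dict):
        target = {}
    result = dict(target)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null in the patch deletes the key
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


global_config = json.loads(
    '{"log": {"level": "info", "file": "/var/log/app.log"}, "max_conns": 100}'
)
local_config = json.loads('{"log": {"level": "debug", "file": null}}')

print(json.dumps(merge_patch(global_config, local_config), sort_keys=True))
```

Note that arrays are replaced rather than merged, and a key cannot be set to null through a patch, since null means "delete".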
        
         | cies wrote:
         | i miss comments so badly in json, especially when used in
         | config files
        
           | jcarrano wrote:
           | Just add a "$comment" key. The only problem will be if the
           | parser is too strict and rejects the unrecognized key.
        
             | baq wrote:
             | Or the linter requires sorted keys.
             | 
             | JSON is just about the worst config format imaginable. I'd
             | rather write my configs in xml and I really don't like xml.
        
       | cies wrote:
       | Nice! I really like a fresh take on anything.
       | 
       | It's been said to be like RFC822 or Debian Control File Format in
       | the comments here, I'd like to add like x-www-form-urlencoded. At
       | work I use this a lot as it is what browsers submit. It's
       | List<String, List<String>>, so keys may occur more than once. We
       | standardized on a little language for the keys that allows us to
       | submit structured forms. (Many libraries prescribe a language for
       | this, Rails does too; our keys look like
       | ".location.space[2].name" for
       | "{location:{space:[null,null,{name:VALUE_AS_STRING}]}}" in json).
       | 
       | Some years ago I wrote a TOML parser in Haskell. Because parsers
       | are fun to write in Haskell, and I needed one.
       | 
       | Since we deploy with AWS/Fargate (Docker) the config is passed as
       | JSON k-v pairs that are then set as ENV VARs in the container
       | (following one of those 12factor principles). So it seems I
       | cannot dictate the config file format.
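
A key language like the ".location.space[2].name" example above can be expanded into nested structures with a small helper. This is a guess at the general shape of such a scheme, not the commenter's actual implementation: `set_path`, the token grammar, and the assumption that every path starts with a `.name` segment are all made up for illustration.

```python
import json
import re

# A path is a sequence of '.name' and '[index]' segments.
TOKEN = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")
PATH = re.compile(r"(?:\.[A-Za-z_]\w*|\[\d+\])+")


def set_path(root, path, value):
    """Set value inside nested dicts/lists following a path like '.a.b[2].c'."""
    if not PATH.fullmatch(path) or not path.startswith("."):
        raise ValueError("bad path: " + path)
    tokens = [name if name else int(index) for name, index in TOKEN.findall(path)]
    node = root
    for token, nxt in zip(tokens, tokens[1:] + [None]):
        if isinstance(token, int):
            while len(node) <= token:
                node.append(None)  # pad skipped list positions with None
        if nxt is None:
            node[token] = value  # last segment: store the value
            break
        child = node[token] if isinstance(token, int) else node.get(token)
        if child is None:
            # Create the container the next segment needs.
            child = [] if isinstance(nxt, int) else {}
            node[token] = child
        node = child
    return root


form = {}
set_path(form, ".location.space[2].name", "VALUE_AS_STRING")
print(json.dumps(form))
```

Repeated calls on the same `form` fill in further fields, so a list of submitted key/value pairs can be folded into one structure.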
        
       | 4ad wrote:
       | Compositionality is paramount and category theory guarantees
       | compositionality, but the author's criteria for what entails a
       | good configuration language are woefully naive.
       | 
       | Configuration is not about describing data, it's about control.
       | Control over a system made of impure, effectful parts.
       | 
       | Configuration is a matter of _programming_ a mutable computer,
       | i.e. a way to specify the composition of effects that you want.
       | 
       | The configuration language is agnostic over the systems it
       | controls, therefore it must provide semantics that preserve
       | morphism in any of its interpretations. The language must be rich
       | enough to accommodate this. It is not enough to have one
       | semantics.
       | 
       | Moreover, it must be rich enough to describe its own models. Yes,
       | the interpretation of it by arbitrary systems must be expressible
       | in the language itself in order to be meaningful and to preserve
       | consistency with regard of its interpretations. In practice, this
       | is done through types.
       | 
       | Additionally, configuration is a _global_ activity, it's applied
       | to the whole system, with many people changing conflicting
       | aspects of it. Just like with any large evolving program,
       | abstraction and typing are required for software engineering
       | reasons alone.
       | 
       | Coincidentally, CUE is also a monoid, but it is more than that,
       | it is a complete Heyting algebra (or a complete Boolean algebra
       | in the case of closed world assumption), these objects also form
       | very rich categories.
       | 
       | Another way to look at CUE is to view it as a semantic domain for
       | the denotation of arbitrary types of arbitrary languages. It's
       | suitable for this because it's a coherence space (Girard). All
       | CUE operations are closed, preserving the structure of the space.
       | 
       | One interesting aspect of the author's effort is that even if he
       | was so naive, category theory led him to a path that is correct.
       | What he did is _incomplete_, a monoid does not suffice for a
       | configuration language, but a monoid is _required_. This is
       | saying something.
        
       | jteppinette wrote:
       | Is there a /s missing somewhere?
        
       | azeirah wrote:
       | If you want a category theoretically-informed configuration
       | language that has real-world use, then use nix.
       | 
       | It's precisely this, and there's a reason it has the largest
       | package repo on the planet
        
       | binary132 wrote:
       | This is silly. I like it. I'm pretty sure you just tricked me
       | into reading a "for dummies" primer on category theory.
       | Congratulations!
        
       | kukimik wrote:
       | This, for a number of reasons, reminds me of the Tree format, see
       | https://hackernoon.com/tree-ast-which-crushes-json-xml-yaml-...
       | and https://github.com/nin-jin/tree.d.
        
       | jalk wrote:
       | > You want to introduce data validation and type-checking in your
       | config? Fine, you can just ask users to provide type annotations
       | in the format you want...
       | 
       | No - the users cannot choose type - they cannot suddenly decide
       | that they want to provide a date where your parser expects a
       | URL, or you are suddenly just making users repeat the schema
       | 
       | > Every software MUST WORK WITHOUT A CONFIG!!
       | 
       | > So, empty config or no config file at all must be a valid
       | configuration
       | 
       | Loads of scenarios where I want fail-fast over a running but
       | broken system: "WARN AlertSystem URL not configured: alerting
       | disabled" "WARN No credentials store configured, adding
       | admin/111111" ....
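
The fail-fast alternative argued for above might look like the sketch below: refuse to start when a required setting is absent instead of logging a warning and carrying on. The setting names (`alert_url`, `credentials_store`), `ConfigError`, and `load_config` are hypothetical.

```python
class ConfigError(Exception):
    """Raised when the configuration is missing required settings."""


# Hypothetical settings this application cannot safely run without.
REQUIRED = ("alert_url", "credentials_store")


def load_config(raw):
    """Return a validated copy of raw, or raise ConfigError listing what is missing."""
    missing = [key for key in REQUIRED if not raw.get(key)]
    if missing:
        raise ConfigError("missing required settings: " + ", ".join(missing))
    return dict(raw)


try:
    load_config({"alert_url": "https://alerts.example.invalid"})
except ConfigError as err:
    print("refusing to start:", err)
```

Whether an empty config should be valid then becomes a per-application decision: a tool with sensible defaults can leave `REQUIRED` empty, while a service with security-relevant settings can refuse to run.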
        
       ___________________________________________________________________
       (page generated 2025-01-11 23:01 UTC)