[HN Gopher] Show HN: Jb / json.bash - Command-line tool (and bas...
       ___________________________________________________________________
        
       Show HN: Jb / json.bash - Command-line tool (and bash library) that
       creates JSON
        
       jb is a UNIX tool that creates JSON, for shell scripts or
       interactive use. Its "one thing" is to get shell-native data
       (environment variables, files, program output) to somewhere else,
       using JSON to encapsulate it robustly.

       I wrote this because I wanted a robust and ergonomic way to create
       ad-hoc JSON data from the command line and scripts. I wanted errors
       not to pass silently, data types not to be coerced, and secrets
       kept out of argv. I wanted to leverage shell features/patterns like
       process substitution, environment variables, reading/streaming from
       files and null-terminated data.

       If you know of the jo program, jb is similar, but type-safe by
       default and more flexible. jo coerces types, using flags like -n to
       coerce to a specific type (number for -n), without failing if the
       input is invalid. jb encodes values as strings by default,
       requiring type annotations to parse & encode values as a specific
       type (failing if the value is invalid).

       If you know jq, jb is complementary in that jq is great at
       transforming data already in JSON format, but it's fiddly to get
       non-JSON data into jq. In contrast, jb is good at getting
       unstructured data from arguments, environment variables and files
       into JSON (so that jq could use it), but jb cannot do any
       transformation of data, only parsing & encoding into JSON types.

       I feel rather guilty about having written this in bash. It's
       something of a boiled frog story. I started out just wanting to
       encode JSON strings from a shell script, without dependencies, with
       the intention of piping them into jq. After a few trials I was able
       to encode JSON strings in bash with surprising performance, using
       array operations to encode multiple strings at once. It grew from
       there into a complete tool. I'd certainly not choose bash if I was
       starting from scratch now...
        
       Author : h4l
       Score  : 119 points
       Date   : 2024-07-03 10:18 UTC (12 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | h4l wrote:
       | As well as anyone's general thoughts/experiences, I'd appreciate
       | opinions on the error handling mechanism jb uses to detect errors
       | in upstream jb processes that jb is reading from.
       | 
       | Normally, detecting errors on the other end of a pipe requires
       | care in a shell environment (e.g. retrospectively checking
       | PIPESTATUS). I used an approach I've called Stream Poisoning. It
       | takes advantage of the fact that control characters are never
       | present in valid JSON. When jb fails to encode JSON, it emits a
       | Cancel control character[1] on stdout. When jb encounters such a
       | character in an input, it can tell the input it's reading from is
       | truncated/erroneous. This avoids the typical problem of a pipe
       | silently being read as an empty file.
       | 
       | I've got a page explaining this with some examples here:
       | https://github.com/h4l/json.bash/blob/main/docs/stream-poiso... I
       | can imagine using control characters in a text stream being
       | rather controversial, but I feel it works quite well in practice.
       | 
       | [1]: https://en.wikipedia.org/wiki/Cancel_character
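
The stream-poisoning idea described above can be sketched in a few lines of bash. This is only an illustration under stated assumptions: emit_integer and read_checked are hypothetical names, the integer check is deliberately simplistic, and none of this is jb's actual implementation. It shows a producer that writes a Cancel character when it fails, and a consumer that refuses input containing one.

```shell
#!/usr/bin/env bash
# Sketch of the stream-poisoning idea. The function names are hypothetical
# and this is not jb's actual implementation.

CAN=$(printf '\030')  # ASCII Cancel (0x18), never valid inside JSON text

# Print VALUE if it is a non-negative integer without leading zeros;
# otherwise poison stdout with a Cancel character and fail.
emit_integer() {
  case $1 in
    '' | *[!0-9]* | 0?*)
      printf '%s\n' "$CAN"
      echo "emit_integer: not an integer: '$1'" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$1"
}

# Read all of stdin; refuse to pass it on if it is poisoned or empty.
read_checked() {
  local data
  data=$(cat)
  case $data in
    '' | *"$CAN"*)
      echo "read_checked: upstream failed or produced nothing" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$data"
}

emit_integer 42 | read_checked
emit_integer oops | read_checked || echo "pipeline rejected"
```

The second pipeline is rejected even without pipefail, because the reader detects the failure from the data itself rather than from the producer's exit status.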
        
         | throwawaynorway wrote:
         | What happens if the next program in the pipe is not jb? Does jb
         | also exit with a code?
         | 
         | For example `jb | jq`, where jq or a similar program discards
         | the cancel character.
         | 
         | (Away from pc, unable to check right now.)
        
           | h4l wrote:
           | Good question! Yep, jb exits with non-zero:
           | 
           |     $ jb size:number=oops; echo $?
           |     json.encode_number(): not all inputs are numbers: 'oops'
           |     json(): Could not encode the value of argument 'size:number=oops' as a 'number' value. Read from inline value.
           |     1
           | 
           | If you pipe the jb error into jq, jq fails to parse the JSON
           | (because of the Cancel ctrl char) and also errors:
           | 
           |     $ jb size:number=oops | jq
           |     json.encode_number(): not all inputs are numbers: 'oops'
           |     json(): Could not encode the value of argument 'size:number=oops' as a 'number' value. Read from inline value.
           |     parse error: Invalid numeric literal at line 2, column 0
           | 
           |     $ declare -p PIPESTATUS
           |     declare -a PIPESTATUS=([0]="1" [1]="4")
           | 
           | So jq exits with status 4 here.
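
For readers less familiar with PIPESTATUS, here is a small bash sketch of the general behaviour the transcript above relies on. It uses a stand-in function, broken_producer (a hypothetical name), instead of jb, so it only demonstrates standard bash pipeline semantics and says nothing about jb or jq themselves.

```shell
#!/usr/bin/env bash
# Generic bash behaviour, shown with a stand-in producer rather than jb.

# Hypothetical producer that writes partial output and then fails.
broken_producer() {
  printf '{"size":'
  return 3
}

# Run the pipeline and copy PIPESTATUS straight away, because the next
# command that runs overwrites it.
run_pipeline() {
  broken_producer | cat >/dev/null
  statuses=("${PIPESTATUS[@]}")
}

run_pipeline
echo "per-command statuses: ${statuses[*]}"

# With pipefail (enabled here only inside a subshell), the pipeline's own
# exit status reflects the upstream failure.
( set -o pipefail; broken_producer | cat >/dev/null ) ||
  echo "pipeline failed with status $?"
```

Without pipefail the pipeline as a whole reports success here, which is the silent-failure case the stream-poisoning approach is meant to catch.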
        
       | abdellah123 wrote:
       | Amazing tool and syntax. Hats off!
        
         | h4l wrote:
         | Thanks!
        
       | pmarreck wrote:
       | BATS was a little heavy for me as a testing dependency for my own
       | use (I ended up writing what I intended to be "the most
       | minimalist shell testing library possible", see below, I think it
       | still needs work though!), but I at least want to commend you for
       | having what looks like a great test suite to begin with!
       | 
       | https://github.com/pmarreck/tinytestlib
        
         | h4l wrote:
         | It's nice to have compact single-file dependencies like this! I
         | like the look of your assertions (checking out, err & status).
         | I definitely found myself writing my own assertions to get
         | understandable errors.
        
       | mg wrote:
       | I like the syntax to send typed values from the terminal:
       | 
       |     jb id=42 size:number=42 surname=null data:null
       |     => {"id":"42","size":42,"surname":"null","data":null}
       | 
       | I never had the need to use typed arguments in bash, but if I
       | ever have it, this might be the syntax I'd use.
       | 
       | In fact, I was thinking about such a syntax recently. I am
       | writing a tool which lets you call functions in Python modules
       | from the command line. At first, I thought I needed to define the
       | argument types on the command line. But then I decided it is more
       | convenient to use inspection and auto-convert the values to the
       | needed types.
        
         | enriquto wrote:
         | > I like the syntax to send typed values from the terminal:
         | 
         | Incidentally, this syntax shows a notation that is clearly
         | superior to json (at least for non-nested stuff). If all you
         | need is this, you'd be better off avoiding json altogether.
         | 
         | [Rant: if json is so unergonomic that people keep inventing
         | alternatives like this syntax and stuff like "gron" to de-
         | jsonise their lives, maybe using json was always a bad idea,
         | after all... I guess in a decade everybody will look at json
         | with the same disdain as we do XML today.]
        
           | sureIy wrote:
           | > shows a notation that is clearly superior to json
           | 
           | I don't see that at all? Why is `n:number=1` superior to
           | `{n:1}`? If anything, CLI commands are awful for anything
           | other than strings.
        
             | enriquto wrote:
             | But strings are often the most common case (or even, the
             | only case that is needed). And they need much less
             | punctuation. Compare:
             | 
             |     a=1 b=2 c=3
             | 
             | with
             | 
             |     {"a":"1", "b":"2", "c":"3"}
             | 
             | the json version needs 19 punctuation characters just to
             | define three variables, against the bash version that only
             | has 3. Which one would you prefer to type with your
             | keyboard?
        
             | hnlmorg wrote:
             | Depends on the shell. The following parses as a number in
             | Murex:
             | 
             |     %{n:1}
             | 
             | https://murex.rocks/parser/create-object.html
             | 
             | I'm sure you can do similar things in other modern shells
             | too. So the real problem is that people are stuck on the
             | constraints of 1970s command lines.
        
         | h4l wrote:
         | Glad to hear, this was something I wanted to make reliable,
         | ergonomic and intuitive. I figured a lot of languages use `:
         | type` to declare types.
         | 
         | The same using jo would be like this, which I find harder to
         | type and remember:
         | 
         |     jo -- -s id=42 -n size=42 -s surname=null data=null
         |     {"id":"42","size":42,"surname":"","data":null}
         | 
         | Notice that surname comes out as the empty string though, I
         | think this must be a bug in jo!
        
       | altruios wrote:
       | Windows: what if everything was an (command) object?
       | 
       | Linux: what if everything was a file?
       | 
       | Soon we might have...
       | 
       | Mong/Os: what if everything was JSON?
       | 
       | YiAM/OS: YiAM/OS is ANOTHER MARKUP OPERATING SYSTEM... would come
       | out shortly thereafter...
       | 
       | I like JSON, and getting it into the terminal is a challenge -
       | GOOD JOB!
        
       | boomskats wrote:
       | I find writing bash really gratifying - almost relaxing - but
       | you're right, it also makes me feel kind of 'guilty', especially
       | when I start getting carried away and reading tput docs.
       | 
       | However, I think in your case the rationale in the performance
       | section of your Readme totally makes sense, and every single use
       | case I can think of for this would prioritise minimal latency
       | over increased throughput. I've seen init containers that would
       | execute probably 100x faster with this for the exact reasons you
       | point out. I'm quite curious as to what you would choose
       | instead of bash if you were starting from scratch now?
       | 
       | FYI Shellcheck has a couple of superficial nits that you might
       | wanna address (happy to send a PR). And your Readme is great.
        
         | h4l wrote:
         | I did find it quite satisfying to coerce bash into doing this
         | while maintaining decent performance. I definitely came to
         | appreciate some aspects of bash more from this, but it's so
         | easy to shoot yourself in the foot!
         | 
         | If I started from scratch now I'd use a compiled language that
         | could produce a single static binary and start with really low
         | latency. I'm pretty sure jo must not be tuned for startup time;
         | if they optimised that, they must be able to get it way faster
         | than bash can start and parse json.bash. I was pretty surprised
         | that bash can start up faster!
         | 
         | The codebase is basically at the limit of what I'd want to do
         | with bash, but there are features I could add if it was in a
         | proper programming language. e.g. validating :int number types,
         | pretty-printing output, not needing the :raw type to stream
         | JSON input.
         | 
         | Thanks for the heads up on Shellcheck, I'd be happy to take a
         | PR if you'd like to.
        
       | 2f0ja wrote:
       | Similar to jo, which is written in C [1]
       | 
       | [1] https://github.com/jpmens/jo
        
         | IshKebab wrote:
         | This is mentioned.
        
       | simonw wrote:
       | I found the JSON array syntax a little unintuitive:
       | 
       |     $ jb dependencies:[,]=Bash,Grep
       |     {"dependencies":["Bash","Grep"]}
       | 
       | One possible alternative would be to accept JSON literal
       | snippets, like this:
       | 
       |     $ jb dependencies='["Bash", "Grep"]'
       | 
       | This should support all forms of nested JSON objects. You could
       | have a rule that if an argument does NOT parse as a valid JSON
       | value it is treated as a raw string, so this would work:
       | 
       |     $ jb foo=bar bar='"this is a well formed string"'
       |     {"foo": "bar", "bar": "this is a well formed string"}
       | 
       | You could even then nest jb calls like this:
       | 
       |     $ jb foo=$(jb bar=baz)
       |     {"foo": {"bar": "baz"}}
        
         | h4l wrote:
         | Thanks for giving it a try and your feedback. I agree, the
         | array splitting is a bit fiddly. It is actually possible to
         | pass JSON directly, you use the :json type on the argument:
         | 
         |     $ jb dependencies:json='["Bash","Grep"]'
         |     {"dependencies":["Bash","Grep"]}
         | 
         |     $ jb foo=bar bar:json='"this is a well formed string"'
         |     {"foo":"bar","bar":"this is a well formed string"}
         | 
         | And then you can indeed use command substitution to nest calls:
         | 
         |     $ jb foo:json=$(jb bar=baz)
         |     {"foo":{"bar":"baz"}}
         | 
         | It works even better to use process substitution, this way the
         | shell gives jb a file path to a file to read, and so you don't
         | need to quote the $() to avoid whitespace breaking things:
         | 
         |     $ jb foo:json@<(jb msg=$'no need\nto quote this!')
         |     {"foo":{"msg":"no need\nto quote this!"}}
         | 
         | Another option is to use jb-array to generate arrays. (jb-array
         | is best for tuple-like arrays with varying types):
         | 
         |     $ jb dependencies:json@<(jb-array Bash Grep)
         |     {"dependencies":["Bash","Grep"]}
         | 
         | And if you use it from bash as a function, you can put values
         | into a bash array and reference it:
         | 
         |     $ source json.bash
         |     $ dependencies=(Bash Grep)
         |     $ json @dependencies:[]
         |     {"dependencies":["Bash","Grep"]}
        
       | vips7L wrote:
       | Built into Powershell:
       | 
       |     > @{ hello = 'world' } | ConvertTo-Json
       |     > { "hello": "world" }
        
         | h4l wrote:
         | Powershell has the upper hand here!
         | 
         | Still, bash can try to keep up using json.bash. :)
         | 
         |     $ source json.bash
         |     $ declare -A greeting=([Hello]=World)
         |     $ json ...@greeting:{}
         |     {"Hello":"World"}
         | 
         | ... is splatting the greeting associative array entries into
         | the object created by the json call.
         | 
         | Without the ... the greeting would be a nested object. Probably
         | more clear with multiple entries:
         | 
         |     $ declare -A greeting=([Hello]=World [How]="are you?")
         |     $ json @greeting:{}
         |     {"greeting":{"Hello":"World","How":"are you?"}}
         | 
         | Vs:
         | 
         |     $ json ...@greeting:{}
         |     {"Hello":"World","How":"are you?"}
        
           | majkinetor wrote:
           |     $h=@{x=1; y=2}; $h + @{z=3} | ConvertTo-Json
           |     {
           |       "y": 2,
           |       "z": 3,
           |       "x": 1
           |     }
           | 
           | You can even use [ordered]$h to stop the keys from ending up
           | in random order.
        
         | majkinetor wrote:
         | Not only is it built in, but the syntax is on another level,
         | i.e. you don't need to learn special syntax if you know
         | PowerShell. This thing alone makes pwsh worth it instead of
         | using a number of other tools.
         | 
         |     @{ Hello = 'world'; array = 1..10; object = @{ date = Get-Date } } | ConvertTo-Json
         |     {
         |       "array": [
         |         1,
         |         2,
         |         3,
         |         4,
         |         5,
         |         6,
         |         7,
         |         8,
         |         9,
         |         10
         |       ],
         |       "object": {
         |         "date": "2024-07-03T21:07:21.6562053+02:00"
         |       },
         |       "Hello": "world"
         |     }
        
           | h4l wrote:
           | That is pretty cool, and I wish such features were common in
           | regular UNIX shells.
           | 
           | For good measure, this is how you might do the same with jb:
           | 
           |     $ jb Hello=world array:number[]@<(seq 10) object:json@<(date=$(date -Iseconds) jb @date)
           |     {"Hello":"world","array":[1,2,3,4,5,6,7,8,9,10],"object":{"date":"2024-07-03T19:26:36+00:00"}}
           | 
           | Alternatively, using the :{} object entry syntax:
           | 
           |     jb Hello=world array:number[]@<(seq 10) object:{}=date=$(date -Iseconds)
           |     {"Hello":"world","array":[1,2,3,4,5,6,7,8,9,10],"object":{"date":"2024-07-03T19:30:26+00:00"}}
        
       | fieu wrote:
       | I wonder if I could use this on my project which uses multiple
       | glue functions to piece together JSON strings.
       | https://github.com/fieu/discord.sh
        
         | h4l wrote:
         | If it helps, there's a little example of using the bash API
         | with bash variables/arrays, which should give you an idea of
         | what it could be like to use:
         | https://github.com/h4l/json.bash/blob/main/examples/notify.s...
         | 
         | This example uses the pattern of setting an out=varname when
         | calling a json function, so the encoded JSON goes into the
         | $varname variable. This pattern avoids the overhead of forking
         | processes (e.g. subshells) when generating JSON.
         | 
         | Otherwise you can use the more normal approach of jb writing to
         | stdout, and capturing the output stream.
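
The out=varname convention mentioned above can be illustrated with a minimal bash sketch. The helper json_pair is hypothetical and is not part of json.bash; it performs no escaping, so it assumes keys and values that are already safe to embed in JSON. It only shows how a function can hand back a result through a named variable (via printf -v) instead of through stdout.

```shell
#!/usr/bin/env bash
# Sketch of an "out variable" calling convention. json_pair is a
# hypothetical helper, not part of json.bash, and it does no escaping.

# Usage: [out=VARNAME] json_pair KEY VALUE
# Builds {"KEY":"VALUE"}; stores it in VARNAME if out is set, else prints it.
json_pair() {
  local _result
  printf -v _result '{"%s":"%s"}' "$1" "$2"
  if [[ -n ${out:-} ]]; then
    printf -v "$out" '%s' "$_result"   # assign by name, no subshell
  else
    printf '%s\n' "$_result"
  fi
}

# Conventional style: command substitution forks a subshell.
captured=$(json_pair msg hello)
echo "$captured"

# Out-variable style: the result lands in $doc without forking.
out=doc json_pair msg hello
echo "$doc"
```

Both calls produce the same text; the difference is only in how the result travels back to the caller.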
        
       | westurner wrote:
       | jshn.sh: https://openwrt.org/docs/guide-developer/jshn src:
       | https://git.openwrt.org/?p=project/libubox.git;a=blob;f=sh/j... :
       | 
       | > _jshn (JSON SHell Notation), a small utility and shell library
       | for parsing and generating JSON data_
        
       | jpgvm wrote:
       | {"password":"hunter2"}
       | 
       | A man of culture I see.
       | 
       | This looks really useful where you don't want to introduce
       | another scripting VM just to spit out some JSON, e.g. I have used
       | Ruby a lot for this in the past.
       | 
       | I can see myself using this in container init scripts and other
       | very low dep environments to format config files from env vars
       | etc.
        
         | h4l wrote:
         | How did you guess my password?!?!
         | 
         | This is just the kind of use case I had in mind. Something I've
         | considered is publishing a mini version with only the
         | json.encode_string function, as that's enough to create an
         | array of JSON-encoded strings and use a hard-coded template
         | with printf to insert the JSON string values.
         | 
         | That would be a fraction of the overall json.bash file size.
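
To make the "string encoder plus printf template" idea above concrete, here is a rough bash sketch. The names json_string and render_login are hypothetical, and this is not the json.encode_string function from json.bash; it is a simplified encoder that escapes backslashes, double quotes and ASCII control characters, and passes everything else through unchanged.

```shell
#!/usr/bin/env bash
# Sketch only: json_string and render_login are hypothetical names, and
# this is not the json.encode_string function from json.bash.

# Store the JSON string encoding of $2 in the variable named by $1.
json_string() {
  local _s=$2 _i _hex _ch _esc
  _s=${_s//\\/\\\\}   # escape backslashes first
  _s=${_s//\"/\\\"}   # then double quotes
  for ((_i = 1; _i < 32; _i++)); do
    printf -v _hex '%02x' "$_i"
    printf -v _ch "\\x$_hex"           # the raw control character
    printf -v _esc '\\u00%s' "$_hex"   # its \u00XX escape sequence
    _s=${_s//"$_ch"/"$_esc"}
  done
  printf -v "$1" '"%s"' "$_s"
}

# Fill a hard-coded template with pre-encoded string values.
render_login() {
  local user pass
  json_string user "$1"
  json_string pass "$2"
  printf '{"user":%s,"password":%s}\n' "$user" "$pass"
}

render_login 'Al "the boss"' $'hunter2\ttab'
```

Because the template only ever receives already-encoded strings, the structure of the output cannot be altered by the values, which is the property that makes the hard-coded template safe.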
        
         | ukuina wrote:
         | RIP bash.org!
        
       | IshKebab wrote:
       | Yeah if you need this it's definitely a sign you shouldn't be
       | using Bash.
       | 
       | Can you give a concrete example of when this is the sanest
       | option?
        
         | h4l wrote:
         | Two main situations I think. The first is just interactive use
         | in any shell to encode ad-hoc JSON. If you have a next-gen
         | shell which can handle structured data directly, then you
         | probably don't need it.
         | 
         | Second is situations where you'd rather not add an additional
         | dependency, but bash is pretty much a given. For example, CI
         | environments, scripts in dev environments, container
         | entrypoints. Or things that are already written in bash.
         | 
         | I don't advocate writing massive programs in bash, for sure
         | it's better to turn to a proper language before things get
         | hairy. But bash is just really ubiquitous, and most people who
         | do any UNIX work will be able to deal with a bit of shell
         | script.
        
           | IshKebab wrote:
           | > Second is situations where you'd rather not add an
           | additional dependency, but bash is pretty much a given. For
           | example, CI environments, scripts in dev environments,
           | container entrypoints. Or things that are already written in
           | bash.
           | 
           | Is this tool not an additional dependency?
           | 
           | > But bash is just really ubiquitous
           | 
           | Biggest crime of the Unix world probably.
        
           | wavemode wrote:
           | I agree with the interactive use case.
           | 
           | But for when you don't want an extra dependency, awk and perl
           | are better than bash and just about as ubiquitous. (I might
           | dare to say more ubiquitous, since MacOS in particular ships
           | with an ancient version of bash that can't even use this jb
           | tool. But the versions of awk and perl it comes with are
           | fine.)
        
       | metadat wrote:
       | This is incredibly high-quality BASH programming. As a fellow
       | bash freak I am studying this code, and even I am learning some
       | new techniques.
       | 
       | https://github.com/h4l/json.bash/blob/main/json.bash
       | 
       | You've boiled it down to a set of very elegant constructs.
       | Respect. Thank you @h4l, this is badass.
       | 
       | I hope you follow up with a golang or rust implementation, that
       | would really be something else.
       | 
       | p.s. I noticed the following odd behaviors with escaping
       | delimiters (e.g. "="), is there a way to get an un-escaped equal
       | sign as the trailing part of a key or leading part of a value?
       | 
       |     $ docker container run --rm ghcr.io/h4l/json.bash/jb msg=Hi
       |     {"msg":"Hi"}
       |     $ docker container run --rm ghcr.io/h4l/json.bash/jb msg=\=Hi
       |     {"msg=Hi":"msg=Hi"}
       |     $ docker container run --rm ghcr.io/h4l/json.bash/jb "msg=\=Hi"
       |     {"msg":"\\=Hi"}
       |     $ docker container run --rm ghcr.io/h4l/json.bash/jb "msg\==\=Hi"
       |     {"msg\\=\\":"Hi"}
       |     $ docker container run --rm ghcr.io/h4l/json.bash/jb "msg\\==\=Hi"
       |     {"msg\\=\\":"Hi"}
       |     $ docker container run --rm ghcr.io/h4l/json.bash/jb "msg\\===Hi"
       |     {"msg\\=":"Hi"}
        
         | h4l wrote:
         | Thank you, that's high praise! I learnt a lot about bash
         | writing this, but I've also not looked at the code in a few
         | months, and it's already starting to look quite intimidating!
         | 
         | I definitely like the idea of a golang/rust implementation,
         | there are certainly things I could improve.
         | 
         | So the argument syntax escapes by repeating a character rather
         | than backslash. I chose this because with backslash escapes
         | it would be unclear whether a backslash was in the shell syntax
         | or the jb syntax, and users may end up needing to double escape
         | backslashes, which is no fun! Whereas a shell will always
         | ignore two copies of a character like =:@.
         | 
         | The downside of double-escaping is that the syntax can be
         | ambiguous, so sometimes you need to include the middle type
         | marker to disambiguate the key from the value. But the type can
         | be empty, so just : works:
         | 
         |     $ jb ===msg==:==hi=
         |     {"=msg=":"=hi="}
         | 
         | In the key part, the first = begins the key, the == following
         | are an escaped =. The first = following the : marks the value,
         | and everything after is not parsed, so =hi= is literal.
         | 
         | When you have reserved characters in keys/values (especially if
         | they're dynamic), it's easiest to store the values in variables
         | and reference them with @var syntax:
         | 
         |     $ k='=msg=' v='=hi=' jb @k@v
         |     {"=msg=":"=hi="}
        
           | zikohh wrote:
           | How is this different to this
           | https://github.com/kellyjonbrazil/jc
        
       | Aerbil313 wrote:
       | I'll wait right here using Nushell while you guys spend the next
       | 10 years re-inventing it.
        
       | gkfasdfasdf wrote:
       | Is there a minimum bash version required? I.e. will it work with
       | bash 3 or whatever ships with macos by default?
        
         | h4l wrote:
         | There is, the earliest version I've tested with is 4.4.19, but
         | ideally a 5.x version. 3 certainly won't work I'm afraid. If
         | you use homebrew on Mac it's a good way to get the latest bash.
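
A script that depends on a tool like this could guard against an old interpreter up front. The following is a small hypothetical sketch using bash's built-in BASH_VERSINFO array; the function name bash_at_least is made up for illustration, and the 4.4 threshold is simply the earliest version mentioned in the reply above, not a requirement verified independently.

```shell
#!/usr/bin/env bash
# Hypothetical guard; the 4.4 threshold is taken from the reply above.

# Succeed if the running bash is at least version MAJOR.MINOR.
bash_at_least() {
  local major=$1 minor=$2
  (( BASH_VERSINFO[0] > major )) && return 0
  (( BASH_VERSINFO[0] == major && BASH_VERSINFO[1] >= minor ))
}

if bash_at_least 4 4; then
  echo "bash ${BASH_VERSION} should be new enough"
else
  echo "bash ${BASH_VERSION} is older than 4.4" >&2
fi
```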
        
       | lttlrck wrote:
       | This is great! no doubt I'll be reaching for it very soon.
        
       | lsferreira42 wrote:
       | Amazing bash programming skills, this is so cool that I want to
       | find a problem to solve using it right now!!
        
       ___________________________________________________________________
       (page generated 2024-07-03 23:00 UTC)