[HN Gopher] Show HN: Sol - A de-minifier for shell programs
       ___________________________________________________________________
        
       Show HN: Sol - A de-minifier for shell programs
        
       I've built a tool called sol (like "soul") that helps you inspect
       and format complex shell one-liners. Features:
        
       - Choose which transformations you want (break on pipe, args,
         redirect, whatever)
       - "Peeks" into stringified commands (think xargs, parallel) and
         formats those, too
       - Auto-breaks at a given width (e.g., 80 characters)
       - Shows you non-standard aliases, functions, files, etc. that you
         might not have in your shell environment
       - Breaks up long jq lines with jqfmt because--let's be honest--
         they're getting out of hand
        
       As a security researcher and tool developer, I often encounter (or
       create) long pipelined Bash commands. While quick and powerful,
       they can be a nightmare to read or debug. I created sol to make it
       easier to understand and share these commands with others.
        
       Author : noperator
       Score  : 120 points
       Date   : 2024-09-16 13:53 UTC (2 days ago)
        
       | kmarc wrote:
       | I usually do this by hand. Good to see a tool for it :-)
       | 
       | Feature request, which I would love to have in all my automation
       | scripts:
       | 
       | Replace short flags with the long switches. Short flags are great
       | when typing in a terminal but I don't want to figure out 2 years
       | from now what the
       | 
       |     obscurecommand -v -f -n
       | 
       | does, and I have to assume that it's NOT --version --file
       | --dry-run, but --verbose, --force, and
       | --dont-ask-before-deleting-everything
       | 
       | I try to use long options in my script, therefore (especially in
       | a team, where not everyone is familiar with every single command)
        
         | notpushkin wrote:
         | It would be a great rule for shellcheck, by the way.
         | 
         |     Line 6:
         |     curl -fsSL "${url}"
         |          ^-- SC8537 (warning): use long options instead
         |          (`--fail`, `--silent`, `--show-error`, `--location`).
        
           | yjftsjthsd-h wrote:
           | I would want it opt-in, because I use shellcheck on scripts
           | that will be run on busybox or *BSD where there _aren't_
           | long options
        
             | notpushkin wrote:
             | Of course.
        
               | yjftsjthsd-h wrote:
               | Oh, I didn't realize shellcheck already had optional
               | checks (see `shellcheck --list-optional` for a list), so
               | that was not obvious to me initially. Then yes, that'd be
               | a good thing to have available.
        
         | jakub_g wrote:
         | When I saw "deminifier for shell commands" in title I had
         | exactly the same in mind.
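
To make the short-versus-long flag point concrete, here is a minimal sketch using grep rather than the hypothetical `obscurecommand`. It assumes a grep that accepts long options (GNU and BSD grep do; as noted in the thread, busybox builds may not), and the sample text is made up for illustration.

```shell
#!/usr/bin/env bash
# Made-up sample input for illustration.
sample=$'Error: disk full\nok\nerror: retrying\nERROR: gave up'

# Terse form: quick to type, hard to decode two years later.
short_count=$(printf '%s\n' "$sample" | grep -c -i -F 'error')

# Same call with long options. Assumes a grep that supports them
# (GNU or BSD grep); this is not portable to every system.
long_count=$(printf '%s\n' "$sample" \
  | grep --count --ignore-case --fixed-strings 'error')

echo "short=$short_count long=$long_count"
```

Both invocations should report the same count; only the second one documents itself.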
        
       | snatchpiesinger wrote:
       | Cool! My personal preference is Knuth-style line-breaks on binary
       | operators and pipes, which means breaking before the
       | operator/pipe symbol.
       | 
       |     foo -a -b \
       |     | bar -c -d -e \
       |     | baz -e -f
       | 
       | instead of
       | 
       |     foo -a -b | \
       |     bar -c -d -e | \
       |     baz -e -f
       | 
       | This doesn't seem to be an option, but could be easy to
       | implement.
        
         | basemi wrote:
         | `shfmt` formats multi-line pipes into:
         | 
         |     foo -a -b |
         |         bar -c -d -e |
         |         baz -e -f
         | 
         | which is not that bad
        
         | ramses0 wrote:
         | Next level:
         | 
         |     foo -a -b \
         |     | bar -c -d -e \
         |     | baz -e -f \
         |     && echo "DONE."   # && /bin/true
         | 
         | ...means you can safely (arbitrarily) add more intermediate
         | commands to the pipeline w/o having to worry about modifying
         | the trailing slash (eg: `yyp` won't cause a syntax error).
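
A runnable sketch of the operator-first layout, with `foo`, `bar`, and `baz` swapped for standard tools and made-up input. Every continuation line starts with its operator and ends with a backslash, and a final `&& :` absorbs the last backslash, so any middle line can be duplicated, moved, or deleted without touching its neighbors.

```shell
#!/usr/bin/env bash
# Find the most frequent word in a made-up list.
# The trailing "&& :" is a no-op terminator: it lets the last real
# stage keep its backslash like every other line.
top=$(
  printf '%s\n' cherry apple banana apple \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -n 1 \
  | awk '{ print $2 }' \
  && :
)

echo "most frequent: $top"
```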
        
           | pxc wrote:
           | I do it this way but I indent the rest of the pipeline (like
           | one typically would in a functional language with a pipeline
           | operator or thread macro, or in an OOP language with method
           | chaining via `.`):
           | 
           |     foo --long-something \
           |       | bar --long-other
           | 
           | and if the lines in subsequent commands in the pipeline start
           | to get long, I also indent their arguments:
           | 
           |     foo --long-flag \
           |       | bar \
           |           --long-other-flag \
           |           --long-option a \
           |           --long-option b \
           |       | baz \
           |           --long-another-flag \
           |           --long-flag-again \
           |       > /path/to/some/file
           | 
           | I really like to use && on its own line like that. One of my
           | favorite things about Fish is how it turns && and || from
           | special syntax into _commands_ , which it calls combiners, so
           | you could write:
           | 
           |     foo \
           |       | bar \
           |       | baz
           |     and echo done
           | 
           | I use this often for conditions whose bodies would only be
           | one line, avoiding the nesting and indentation:
           | 
           |     test -n "$SOMETHING"
           |     or set -x SOMETHING some-default
           |     command $SOMETHING
           | 
           | In Bash, I usually use parameter substitution for this, but
           | in other situations (other than setting default values for
           | vars) I throw a backslash at the end of a line, indent and
           | use && or ||, imitating the Fish style.
           | 
           | One of my favorite patterns for readability is to use
           | indented, long-form pipelines like this to set variables.
           | They work fine inside subshells, but for uniformity and
           | clarity I prefer to do
           | 
           |     shopt -s lastpipe
           |     foo \
           |       | bar \
           |       | baz \
           |       | read SOMEVAR
           | 
           | I really enjoy 'maximizing' pipelines like this because it
           | makes it possible to use long pipelines everywhere without
           | making your program terse and mysterious, or creating
           | unwieldy long lines.
           | 
           | If you do this, you end up with a script that is mostly 'flat'
           | (having very little nested control flow, largely avoiding
           | loops), has very few variable assignments, and predictably
           | locates the variable assignments it does have at the ends of
           | pipelines. Each pipeline is a singular train of thought
           | requiring you to consider context and state only at the very
           | beginning and very end, and you can typically likewise think
           | of all the intermediate steps/commands in functional terms.
           | 
           | I tend to write all of my shell scripts this way, including
           | the ones I write interactively at my prompt. One really cool
           | thing about shell languages is that unlike in 'real'
           | programming languages, loops are actually composable! So you
           | can freely mix ad-hoc/imperative and pipeline-centric styles
           | like this (example is Fish):
           | 
           |     find -name whatever -exec basename '{}' \;
           |         | while read -l data
           |             set -l temp (some-series $data)
           |             set -l another (some-command $temp)
           |             blahdiblah --something $temp --other $another
           |         end \
           |         | bar \
           |         | baz \
           |         > some-list.txt
           | 
           | (I typically use 2 spaces to indent when writing Bash
           | scripts, but Fish defaults to 4 in the prompt, which it also
           | automatically indents for you. I'm not sure if that's
           | configurable but I haven't attempted to change it.)
           | 
           | I tend to follow my guideline suggested earlier and do this
           | only close to the very beginning or very end of a pipeline if
           | that loop actually modifies the filesystem or non-local
           | variables, but it's really nice to have that flexibility imo.
           | (It's especially handy if you want to embed some testing or
           | conditional logic into the pipeline to filter results in a
           | complex way.)
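
A small sketch of the pipe-to-`read` pattern described above, with the placeholder commands replaced by standard tools and made-up input. It assumes Bash 4.2 or newer, where `shopt -s lastpipe` exists, and a non-interactive script, since lastpipe only takes effect while job control is off.

```shell
#!/usr/bin/env bash
# Run the last stage of each pipeline in the current shell, so that
# `read` assigns to a variable that survives the pipeline.
# Assumes Bash 4.2+ and job control off (the default in scripts).
shopt -s lastpipe

printf '%s\n' delta alpha charlie \
  | sort \
  | head -n 1 \
  | read -r SOMEVAR

echo "SOMEVAR=$SOMEVAR"
```

Without lastpipe, `read` would run in a subshell and `SOMEVAR` would be empty after the pipeline finishes.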
        
             | stouset wrote:
             | Shell script authors like yourself make me very happy. The
             | pipe-to-read is a fun idea, I'll use it.
             | 
             | One stanza I have at the beginning of every script:
             | [[ -n "${TRACE:-}" ]] && set -o xtrace
             | 
             | This lets you trace any script just by setting the
             | environment variable. And it's nounset-safe.
             | 
             | This was typed from memory on mobile so if the above is
             | bugged, my bad :)
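
A quick way to see the `TRACE` stanza in action, sketched with a throwaway script body held in a variable (the `demo` variable and its contents are made up for this illustration). The same body runs twice: once with `TRACE` empty and once with it set.

```shell
#!/usr/bin/env bash
# Hypothetical script body, kept in a variable so it can be run twice.
# "${TRACE:-}" expands to an empty string when TRACE is unset, so the
# test does not trip nounset; a false [[ ]] on the left of && does not
# trip errexit either.
demo='
set -o errexit -o nounset
[[ -n "${TRACE:-}" ]] && set -o xtrace
echo "working"
'

# TRACE empty: only the normal output appears.
plain=$(TRACE= bash -c "$demo" 2>&1)

# TRACE set: bash also echoes each command to stderr before running it.
traced=$(TRACE=1 bash -c "$demo" 2>&1)

printf '%s\n' '--- plain ---' "$plain" '--- traced ---' "$traced"
```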
        
               | ramses0 wrote:
               | Same w/ the "pipe to read" (although it doesn't seem to
               | work right with OSX's bash-3.2). I found this gizmo
               | somewhere and it's worth sharing as well...
               | 
               |     # MAGIC DEBUGGING LINE
               |     #trap '(read -p "[$BASH_SOURCE:$LINENO] $BASH_COMMAND? ")' DEBUG
               | 
               | ...basically interactive prompting while running.
               | 
               | I need to write up some thoughts on bash being
               | effectively a lisp if you stare at it the right way.
        
               | pxc wrote:
               | > Same w/ the "pipe to read" (although it doesn't seem to
               | work right with OSX's bash-3.2). I found this gizmo
               | somewhere and it's worth sharing as well...
               | 
               | Yes, you need a recent Bash for this (and lots of other
               | nice things like 'associative arrays' (maps)). To free
               | myself up to write more concise and legible scripts, I
               | lean into bashisms and make use of external programs
               | without restriction, regardless of what's shipped by
               | default on any system. To recover portability where I
               | need it, I use Nix to fix my Bash interpreter versions as
               | well as the versions of external tools.
               | 
               | When my scripts are 'packaged' by Nix, their bodies are
               | written as multiline strings in Nix. In that case, Nix
               | takes care of setting the shebang to a fixed version of
               | Bash, and I interpolate external commands in like this:
               | "${pkgs.coreutils}/bin/realpath" .      # these do
               | "${pkgs.coreutils}/bin/pwd" --physical  # the same thing
               | 
               | and in that way my scripts use only their fixed
               | dependencies at fixed versions without modifying the
               | environment at all. This also works nicely across
               | platforms, so when these scripts run on macOS they get
               | the same versions of the GNU coreutils as they do when
               | they run on Linux, just compiled for the appropriate
               | platform and architecture. Same goes for different
               | architectures. So this way your script runs 'natively'
               | (no virtualization) but you still pin all its
               | dependencies.
               | 
               | In other contexts (e.g., cloud-init), I use bash a bit
               | more restrictively depending on what versions I'm
               | targeting. But I still use Nix to provide dependencies so
               | that my scripts use the same versions of external tools
               | regardless of distro or platform:
               | 
               |     nix shell nixpkgs#toybox --command which which # these do
               |     nix run nixpkgs#which -- which                 # the same thing
               | 
               | `nix run` and `nix shell` both behave as desired in
               | pipelines and subshells and all that. (To get the same
               | level of determinism as with the method of use outlined
               | earlier, you'd want to either pin the `nixpkgs` ref in
               | your local flake registry or replace it with a reference
               | that pinned the ref down to a commit hash.)
               | 
               | There is a really cool tool[1] by the Nixer abathur
               | for automagically converting naively-written scripts to
               | ones that pin their deps via Nix, as in the first
               | example. I'm not using it yet but I likely will if the
               | scripts I use for our development environments at work
               | get much bigger-- that way I can store them as normal
               | scripts without any special escaping/templating and
               | linters will know how to read them and all that.
               | 
               | Anyhow, it's totally safe to install a newer Bash on
               | macOS, and I recommend doing it for personal interactive
               | use and private scripts. Pkgsrc and MacPorts both have
               | Bash 5.x, if you don't have or want Nix.
               | 
               | > I need to write up some thoughts on bash being
               | effectively a lisp if you stare at it the right way.
               | 
               | You should! It's totally true, since external commands
               | and shell builtins are in a prefix syntax just like Lisp
               | function calls. Some shells accentuate this a bit, like
               | Fish, where subshells are just parens with no dollar
               | signs so nested subcommands look very Lisp-y, and Elvish,
               | whose author (xiaq) has tried to lean into that syntactic
               | convergence designing a new shell (drawing inspiration
               | from Scheme in various places).
               | 
               | > I found this gizmo somewhere and it's worth sharing as
               | well...
               | 
               |     # MAGIC DEBUGGING LINE
               |     #trap '(read -p "[$BASH_SOURCE:$LINENO] $BASH_COMMAND? ")' DEBUG
               | 
               | Okay _that_ looks really nifty. I will definitely find a
               | use for that literally tomorrow, if not today.
               | 
               | --
               | 
               | 1: https://github.com/abathur/resholve
        
               | pxc wrote:
               | > Shell script authors like yourself make me very happy.
               | 
               | Shell script is really good for some things, so we're
               | gonna end up writing it. And if we write it, it might as
               | well be legible, right? Shell scripts deserve the same
               | care that other programs do.
               | 
               | > This lets you trace any script just by setting the
               | environment variable. And it's nounset-safe.
               | 
               | Nice! I think I'll start adding the same.
               | 
               | > This was typed from memory on mobile so if the above is
               | bugged, my bad :)
               | 
               | I think it's fine aside from the smart quote characters
               | your phone inserted instead of plain vertical double
               | quotes around the parameter expansion you use to make the
               | line nounset-friendly!
        
           | yjftsjthsd-h wrote:
           | A pattern I typically do
           | 
           |     foo && \
           |     bar && \
           |     baz && \
           |     :
           | 
           | or so, which is less verbose but short and sweet. Obviously
           | slightly different, but : (no-op) seems applicable to your
           | situation.
        
             | ramses0 wrote:
             | Clever! ...almost TOO clever... ;-)
             | 
             | That's a great technique, but the `:` as no-op is tough to
             | expect bash-normies to understand (and a tough "operator"
             | to search for). Thanks for sharing, it'll definitely stay
             | in my back pocket!
        
             | js2 wrote:
             | You don't need the backslashes in that case. As with lines
             | ending in pipes and a few other places, the line
             | continuation is implicit after the &&:
             | 
             | https://unix.stackexchange.com/questions/253518/where-are-
             | ba...
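
A runnable sketch of the implicit continuation js2 describes, assuming Bash. The `step` helper and the sample input are made up for illustration. A line that ends in `&&` or `|` is already incomplete, so the shell keeps reading without a backslash, and the `:` no-op from the parent comment still works as a terminator.

```shell
#!/usr/bin/env bash
log=""
# Hypothetical helper that records which steps ran.
step() { log+="$1 "; }

# No backslashes: a trailing && continues the command on the next line.
step one &&
  step two &&
  step three &&
  :

# The same holds for a trailing pipe.
count=$(printf 'b\na\nb\n' |
  sort |
  uniq |
  wc -l)

echo "steps: $log"
echo "unique lines: $count"
```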
        
               | yjftsjthsd-h wrote:
               | Huh, neat. So I picked that habit up from writing
               | Dockerfiles, which do let you do
               | 
               |     RUN foo && \
               |         bar && \
               |         :
               | 
               | but _not_
               | 
               |     RUN foo &&
               |         bar &&
               |         :
               | 
               | (I just tested it), but more recently you _can_ just
               | write
               | 
               |     RUN <<EOF
               |     foo
               |     bar
               |     EOF
               | 
               | so with the caveat of needing to `set -e` the whole thing
               | might be a moot point now:)
        
         | corytheboyd wrote:
         | I love that it forms a literal pipe line
        
         | js2 wrote:
         | If you end with a pipe, you don't need the backslash before the
         | newline. It's implicit.
         | 
         | https://unix.stackexchange.com/questions/253518/where-are-ba...
         | 
         | When writing bash, if I have a command with many switches, I
         | use an array to avoid backslashes and make the intent clearer:
         | 
         |     curl_cmd=(
         |       curl
         |       --silent
         |       --fail
         |       --output "$output_file"
         |       "$url"
         |     )
         |     "${curl_cmd[@]}"
         | 
         | I also insist on long-form options in shell scripts since they
         | are much more self-documenting as compared to single-letter
         | options.
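
The array technique can be tried without a network call. The sketch below swaps curl for sort and uses made-up input; it assumes Bash (for arrays) and a sort that accepts these long options, as GNU sort does. One side benefit is that each element can carry its own comment, which backslash continuations do not allow.

```shell
#!/usr/bin/env bash
# Made-up input for illustration.
input=$'pear\napple\npear'

# One argument per line, no backslashes, and room for comments.
sort_cmd=(
  sort
  --unique     # drop duplicate lines
  --reverse    # descending order
)

# Quoting "${sort_cmd[@]}" expands each element as a separate word.
result=$(printf '%s\n' "$input" | "${sort_cmd[@]}")
printf '%s\n' "$result"
```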
        
         | TristanBall wrote:
         | \ linebreaks are not something I love, and a while ago I started
         | using chained blocks..
         | 
         | These are usually a step between "overly complicated one
         | liner" and structured script, and often get refactored to
         | functions etc if the script evolves that far. But lots don't,
         | and if I just want something readable, that also lends itself
         | to comments etc, this works for me.
         | 
         | { foo -a -b }|{ bar -c -d -e }|{ baz -e -f }
         | 
         | But I suspect it's not everyone's cup of tea!
        
           | TristanBall wrote:
           | Ha.. my linebreaks got removed!
        
       | ComputerGuru wrote:
       | This is really cool; for a second I thought I could use it to
       | stop manually maintaining both the minified and full-text
       | versions of my "shell prefix" that makes it possible to run rust
       | source code directly as if it were a shell script [0] where I've
       | accidentally diverged between the two in the past, but then I
       | revisited it and saw that the real value was in the comments and
       | explanations more than just placing output in variables and
       | breaking up command pipelines across shell lines.
       | 
       | But the opposite might work: does anyone have a good _minifier_
       | they could recommend (preferably one that does more than just
       | whitespace mangling, eg also renames variables, chains
       | executions, etc) that doesn't introduce bash-isms into the
       | resulting script?
       | 
       | [0]: https://neosmart.net/blog/self-compiling-rust-code/
        
       | pxc wrote:
       | This looks really handy! I should add this to the environment for
       | some of my shell-centric projects at work.
        
       ___________________________________________________________________
       (page generated 2024-09-18 23:01 UTC)