[HN Gopher] "Exit Traps" Can Make Your Bash Scripts Way More Rob...
___________________________________________________________________
"Exit Traps" Can Make Your Bash Scripts Way More Robust and
Reliable
Author : ekiauhce
Score : 192 points
Date : 2023-06-20 06:34 UTC (16 hours ago)
(HTM) web link (redsymbol.net)
(TXT) w3m dump (redsymbol.net)
| franknord23 wrote:
| I wish Bash had 'defer' like Go.
| c7DJTLrn wrote:
| https://cedwards.xyz/defer-for-shell/
|
| Enjoy. (blog post is mine)
| chasil wrote:
| I used an exit trap to kill an SSH agent that I am running, and I
| noticed that dash did not kill if the script was interrupted, but
| only if it ran to successful completion.
|
| I asked on the mailing list if this was expected behavior, and it
| turns out that POSIX only requires EXIT to run on a clean
| shutdown; to catch interruptions, add more signals.
| trap 'eval $(ssh-agent -k)' EXIT INT ABRT KILL TERM
| lgsymons wrote:
| The signals EXIT HUP INT TERM cover everything I've run into
| (I'm actually using EXIT SIGHUP SIGINT SIGTERM but presumably
| it's equivalent).
|
| In basic terms for my purposes these respectively account for a
| clean exit, the terminal emulator being closed, ctrl-c, the
| kill command.
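One way to sketch the signal list discussed above, assuming a POSIX-like shell: keep the cleanup in the EXIT trap, and turn HUP, INT and TERM into ordinary exits so that the EXIT trap also fires in shells such as dash. The function name `cleanup` and its body are placeholders, not anything from the thread.

```shell
#!/bin/sh
# cleanup is a hypothetical placeholder for the real teardown work.
cleanup() {
  echo "cleaning up"
}

# Normal termination and explicit calls to exit.
trap cleanup EXIT

# Turn each signal into an ordinary exit so the EXIT trap above runs too.
# The exit codes follow the usual 128 + signal number convention.
trap 'exit 129' HUP    # terminal closed
trap 'exit 130' INT    # ctrl-c
trap 'exit 143' TERM   # default signal sent by kill
```

KILL is left out on purpose, since it cannot be caught.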
| diarrhea wrote:
| Harsh lesson, those five signal names are identical if one
| squints real good. Would have never known.
| hoherd wrote:
| `man 7 signal` on linux or just `man signal` on macOS will
| give you more information about the different signals, and
| shows what the different meanings of those are.
| chasil wrote:
| "kill -l" gives you a terse (but complete) list.
| simcop2387 wrote:
| You can't trap KILL, can you? Doesn't that just go kill your
| process without any possibility of intervention or other
| actions? Also you probably want to handle HUP there too I would
| think (depending on what the script does)
| chasil wrote:
| That's what they put on the ticket, so that's what I'm using,
| but you're probably right.
| mike_hock wrote:
| SIGABRT is also not a normal termination signal. Seems out of
| place here.
| paulddraper wrote:
| SIGABRT would have to come from the process itself; IDK when
| if ever the shell would do that.
|
| And SIGKILL can't be handled, so that is indeed pointless.
| arp242 wrote:
| POSIX says that "setting a trap for SIGKILL or SIGSTOP
| produces undefined results", but for signals it describes
| SIGKILL as "Kill (cannot be caught or ignored)".
|
| I'm guessing this is some relic from 80s Unix systems where
| SIGKILL behaved different, or perhaps just an
| inconsistency/oversight.
| nerdponx wrote:
| I read that as undefined in terms of how the shell itself
| handles it, because the OS doesn't care.
| [deleted]
| js2 wrote:
| I think you want:
|
|     trap 'ssh-agent -k' EXIT INT TERM
|
| I don't see any reason for the eval as "ssh-agent -k" doesn't
| return anything useful you want the shell to evaluate.
| chasil wrote:
| That's not what the eval is for.
|
| The "ssh-agent -k" command will emit shell commands that the
| shell must then execute which will kill the agent daemon and
| unset the socket environment variable.
| leodag wrote:
| > The "ssh-agent -k" command will emit shell commands
|
| Does it really? I've executed it here and it just runs
| kill, doesn't emit any bash. Running just ssh-agent
| (without any args) does that though, which is what's
| probably causing the confusion.
| chasil wrote:
| I am on OpenBSD 7.2, and I see:
|
|     $ eval $(ssh-agent)
|     Agent pid 56785
|     $ ssh-agent -k
|     unset SSH_AUTH_SOCK;
|     unset SSH_AGENT_PID;
|     echo Agent pid 56785 killed;
|
| The correct processing of that output requires an eval.
|
| Did you have any other questions?
| js2 wrote:
| If all you care about is killing it, you don't need to eval
| the output. The output just unsets two environment
| variables which only matters in the current shell context.
|
|     $ ssh-agent
|     SSH_AUTH_SOCK=/var/folders/8p/_pwq997168s7vdwwdg_qr1j40000gn/T//ssh-DE0IoJfU5rrM/agent.15015; export SSH_AUTH_SOCK;
|     SSH_AGENT_PID=15016; export SSH_AGENT_PID;
|     echo Agent pid 15016;
|     $ SSH_AGENT_PID=15016; export SSH_AGENT_PID;
|     $ ssh-agent -k
|     unset SSH_AUTH_SOCK;
|     unset SSH_AGENT_PID;
|     echo Agent pid 15016 killed;
|
| That said, it doesn't hurt to eval it, so I overstated my
| case in my original comment.
| snapcaster wrote:
| Very cool! didn't know about these
| gscho wrote:
| Off topic but I really enjoy the lofi website design!
| arp242 wrote:
| An annoying thing about bash is that EXIT will _also_ run on
| SIGINT (^C), which most other shells won't (in my reading it's
| also not POSIX compliant, although the document is a bit vague).
| Some might argue this is a feature, but IMHO it's a bug -
| sometimes you really _don't_ want cleanup to happen so people
| can inspect the contents of temporary files for debugging.
| Because trap doesn't pass the signal information to the handler
| it's hard to not do cleanup on SIGINT, so it's certainly less
| flexible, and it's an annoying incompatibility between bash and
| any other shell.
|
| Also, zsh has a much nicer mechanism for the common case:
|
|     {
|       echo lol
|     } always {
|       # Ensure *all* temporary files are cleaned up.
|       nohup rm -rf / &
|     }
| jcotton42 wrote:
| > Because trap doesn't pass the signal information to the
| handler it's not hard not to do cleanup on SIGINT
|
| Did you mean "it's hard" instead of "it's not hard"?
| arp242 wrote:
| Oops, yes, thanks; seems a "not" got duplicated in editing -
| still within edit window.
| ggm wrote:
| Some temporary file remover. Lol indeed
| wkat4242 wrote:
| // Thinks about that time I typed rm -rf /<space>something by
| mistake.
|
| It took a few seconds before I thought... "Why does it take
| that long for only a handful of files?"
|
| I never did that again.
|
| Had my DOS filesystem mounted under Linux too (yes that long
| ago), and I spent a few days guessing the first letter of
| each deleted file with norton disk doctor or undeleter or
| something. That was fun (FAT16 filesystems overwrote the
| first letter of each filename to delete it)
|
| At least it wasn't a mistake I made at work on some
| production thing. Though there is a reason I make all the
| desktops on windows production servers bright red. One time I
| was tired and shut down "my laptop" forgetting I was still
| logged into a remote server 200km away..... :/ Of course the
| iLO wasn't hooked up but I was extremely happy to find that
| HP servers listen to wake on LAN even when they're off.
| Another one for the never again books :P
| arp242 wrote:
| "Keep non-temporary files intact" was not part of the design
| document.
| c5c3c9 wrote:
| [flagged]
| js2 wrote:
| > Because trap doesn't pass the signal information to the
| handler
|
| You can examine $? on entry to the trap function. On signals,
| it will be 128 + signal. i.e. on TERM (15) it will be 143. On
| INT (2) it will be 130.
|
|     #!/bin/bash
|     skip_exit=
|     on_exit() {
|       code=$?
|       if test $code == 130; then
|         skip_exit=1
|       fi
|       if test -n "$skip_exit"; then
|         return
|       fi
|       echo "Exiting with: $code"
|       return $code
|     }
|     trap on_exit INT EXIT
|     sleep 2
|     false
|
| With ctrl-c:
|
|     $ ./foo.sh
|     ^C
|
| After 2 seconds:
|
|     $ ./foo.sh
|     Exiting with: 1
|
| You can also set up separate handlers for each signal and use a
| sentinel:
|
|     $ cat foo.sh
|     #!/bin/bash
|     skip_exit=
|     on_int() {
|       echo int
|       skip_exit=1
|     }
|     on_exit() {
|       test -n "$skip_exit" && return
|       echo exit
|     }
|     trap on_int INT
|     trap on_exit EXIT
|     sleep 2
|
| With ctrl-c:
|
|     $ ./foo.sh
|     ^Cint
|
| After 2 seconds:
|
|     $ ./foo.sh
|     exit
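The 128 + signal arithmetic above can also be reversed inside a handler. A minimal sketch, relying on the `kill -l exit_status` form that POSIX specifies and bash implements; `describe_status` is a hypothetical helper name, not something from the thread.

```shell
#!/bin/bash
# describe_status is a hypothetical helper, not a shell builtin.
describe_status() {
  # $1 is an exit status, typically $? captured on entry to a trap.
  if [ "$1" -gt 128 ] && sig_name=$(kill -l "$1" 2>/dev/null); then
    # kill -l maps a status above 128 back to the signal name (no SIG prefix).
    echo "terminated by SIG$sig_name"
  else
    echo "exited with status $1"
  fi
}

# Possible use inside a script:
#   trap 'describe_status "$?" >&2' EXIT
```

A status above 128 does not prove a signal was involved, because a script can call `exit 130` itself, so the result is best treated as a hint.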
| telotortium wrote:
| @redsymbol your site has a TLS certificate error. On Chrome I get
| NET::ERR_CERT_COMMON_NAME_INVALID because your certificate is
| from mobilewebup.com
|
| Otherwise a good article. I use the following code to enable
| passing the signal name to the trap handler, so that I can kill
| the Bash process with the correct signal name, which is best
| practice for Unix signal handling (EXIT would have to be handled
| specially in `sig_rekill`):
|
|     # Set trap for several signals and pass signal name to trap function.
|     # https://stackoverflow.com/a/2183063/207384
|     trap_with_arg() {
|       func="$1" ; shift
|       for sig ; do
|         trap "$func $sig" "$sig"
|       done
|     }
|     sig_rekill() {
|       # Kill whole process group.
|       trap "$1"; kill -"$1" -$$
|     }
|     # Catch signal and kill whole process group.
|     trap_with_arg sig_rekill HUP INT QUIT PIPE TERM
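A simplified variation on the idea above, as a sketch only: the handler restores the default disposition and then sends the same signal to the script itself, so the caller sees a death by signal rather than a plain exit code. Unlike the comment's version, it signals only the shell's own PID and not the whole process group. `rekill_self` is a hypothetical name.

```shell
#!/bin/bash
# rekill_self is a hypothetical name; it signals only this shell,
# not the process group.
rekill_self() {
  # Any cleanup work would go here, before the signal is re-sent.
  trap - "$1"          # restore the default action for this signal
  kill -s "$1" "$$"    # die from the same signal the script received
}

for sig in HUP INT QUIT PIPE TERM; do
  # Double quotes on purpose: the signal name is baked into the handler.
  trap "rekill_self $sig" "$sig"
done
```

A parent shell that runs such a script and sends it TERM should then observe an exit status of 143.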
| [deleted]
| smcleod wrote:
| Yep, I use these all the time, they're very useful indeed.
| jeron wrote:
| I thought exit traps were just SPACs
| filereaper wrote:
| >The secret sauce is a pseudo-signal provided by bash, called
| EXIT, that you can trap; commands or functions trapped on it will
| execute when the script exits for any reason.
|
| "Secret Sauce", why is this secret at all?
|
| Nothing against the author who's helping the ecosystem here, but
| is there an authoritative guide on Bash that anyone can
| recommend?
|
| Hopefully something that's portable between Mac & Linux.
|
| The web is full of contradictory guides and shellcheck seems to
| be the last line of defense.
|
| - https://github.com/koalaman/shellcheck
| sigg3 wrote:
| Yes. I use them for cleanup in every non-trivial script I write.
| waselighis wrote:
| I wish there was a nicer shell scripting language that simply
| transpiled to Bash and would generate all this boilerplate code
| for me. There is https://batsh.org/ which has a nice syntax but
| it doesn't even support pipes or redirection, making it pretty
| worthless for shell scripting. I haven't found any other such
| scripting languages.
| paulddraper wrote:
| What's the difference between that and Go?
| burnished wrote:
| Go doesn't seem related at all?
| cvalka wrote:
| [flagged]
| usr1106 wrote:
| bash scripts have their use cases, many things are shorter and
| simpler than in Python. But coders should bother to learn how
| bash works and use shellcheck. Just guessing from how things
| work in another language typically leads to buggy code. Keeping
| a daemon always running is not a task for bash. systemd is
| typically much better at that (although something like
| exponential backoff in case of failure seems to be tricky)
| anaganisk wrote:
| A good read before dismissing http://n-gate.com/software/2017/
| arsome wrote:
| What's the gripe with Let's Encrypt? Certificate
| transparency?
| mttjj wrote:
| Can you expand on your first sentence with some reasons or
| justifications for stating this?
| ipnon wrote:
| I just learned about these through "pair" programming with
| ChatGPT. It is the quintessential ML-enhanced programming trick:
| Using some old, robust language feature I'm skilled enough to
| grok but never had the time to learn about through endless
| documentation spelunking.
|
| My opinion is that LLM pair programming is most or maybe only
| beneficial to already skilled programmers. ChatGPT can open the
| door for you, but it can't show you where the door is. I needed
| the experience to ask it for a Bash script that handles exit
| codes gracefully, which is not a question all junior programmers
| would be able to ask.
| abathur wrote:
| I like combining this with a bash implementation of an event API
| (https://github.com/bashup/events). This makes it easy/idiomatic,
| for example, to conditionally add cleanup as you go.
|
| Glossing over some complexity, but roughly:
|
|     add_cleanup(){
|       event on cleanup "$@"
|     }
|     trap "event emit 'cleanup'" HUP EXIT
|     start_postgres(){
|       add_cleanup stop_postgres
|       # actually start pg
|     }
|     start_apache(){
|       add_cleanup stop_apache
|       # actually start apache
|     }
|
| I wrote a little about some other places where I've used it in
| https://www.t-ravis.com/post/shell/neighborly_shell_with_bas...
| and https://t-ravis.com/post/nix/avoid_trap_clobbering_in_nix-
| sh... (though I make the best use of it in my private bootstrap
| and backup scripts...)
| e12e wrote:
| Thank you for sharing - if I understand the code, the queue is
| serialized into bash variable(s) (arrays)?
|
| I must admit I find the code somewhat painfully terse and hard
| to read.
|
| Still, interesting idea. I wonder if using a temporary
| SQLite/Berkeley DB/etc for queue might generalize the idea to a
| "Unix" event system - allowing other programs and scripts to
| use it for coordinating? (Like logger(1) does for logging)?
| phh wrote:
| This 100%.
|
| I'll complete with patterns I'm using for exit traps:
|
| - for temporary files I have a global array that lists files to
| remove (and for my use case umount them beforehand)
|
| - in the EC2 example, I add a line with just "bash", so I have an
| env with the container still running to debug what happened and I
| just need to close that shell to clear the allocated resources
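The first pattern above (a global array of temporary files) might look roughly like this in bash. This is a sketch under assumptions: `TMP_PATHS`, `make_tmp` and `remove_tmp_paths` are hypothetical names, and the unmount step mentioned in the comment is not shown.

```shell
#!/bin/bash
# TMP_PATHS, make_tmp and remove_tmp_paths are hypothetical names.
TMP_PATHS=()

make_tmp() {
  # Create a temp file, record it for removal, and store its path in the
  # variable named by $1. Returning via a variable avoids $(...), whose
  # subshell would lose the update to TMP_PATHS.
  local _tmp_path
  _tmp_path=$(mktemp) || return
  TMP_PATHS+=("$_tmp_path")
  printf -v "$1" '%s' "$_tmp_path"
}

remove_tmp_paths() {
  local _tmp_path
  for _tmp_path in "${TMP_PATHS[@]}"; do
    rm -rf -- "$_tmp_path"
  done
  TMP_PATHS=()
}

trap remove_tmp_paths EXIT
```

Usage would be along the lines of `make_tmp logfile; echo hello > "$logfile"`, with everything in the array removed when the script exits.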
| tommica wrote:
| This is very useful to know - thanks for sharing!
| rgrau wrote:
| I couldn't find a way to have more than one callback per signal,
| and created a system to have an array of callbacks:
|
| https://github.com/kidd/scripting-field-guide/blob/master/bo...
|
| A nice bonus is that it also keeps the return value of the last
| non-callback function, so your script behaves better when called
| from other scripts.
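For readers who do not want to follow the link, here is a generic sketch of the idea (not the linked implementation): one EXIT trap walks an array of callbacks in reverse order, similar to Go's defer mentioned earlier in the thread, and hands back the status the script was exiting with. `EXIT_CALLBACKS`, `defer` and `run_exit_callbacks` are hypothetical names.

```shell
#!/bin/bash
# EXIT_CALLBACKS, defer and run_exit_callbacks are hypothetical names.
EXIT_CALLBACKS=()

defer() {
  # Queue a command string to be evaluated when the script exits.
  EXIT_CALLBACKS+=("$*")
}

run_exit_callbacks() {
  local status=$?   # the status the script was exiting with
  local i
  # Last registered, first run.
  for (( i = ${#EXIT_CALLBACKS[@]} - 1; i >= 0; i-- )); do
    eval "${EXIT_CALLBACKS[i]}" || true   # one failure does not stop the rest
  done
  EXIT_CALLBACKS=()
  return "$status"
}

trap run_exit_callbacks EXIT
```

Because the handler never calls `exit`, bash keeps the script's original exit status, so callers still see the real result.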
| ch33zer wrote:
| Should go without saying, but don't rely on this for anything
| critical. It's not guaranteed this will run, even on successful
| completion of the script. Simple example: power is cut between
| the last line of the script and before the trap runs. Just a
| heads up
| JohnMakin wrote:
| I like to use these in combination with set -e and report the
| error that happened to whatever is capturing stdout for logging.
|
| You can report the error code with $? at the start of your trap,
| IIRC.
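A minimal sketch of that combination, assuming bash; `report_exit` is a hypothetical name. The important detail is that `$?` is read on the very first line of the handler, before any other command overwrites it.

```shell
#!/bin/bash
# report_exit is a hypothetical name.
report_exit() {
  local status=$?   # read first, before any other command resets $?
  if [ "$status" -ne 0 ]; then
    echo "script failed with exit status $status" >&2
  fi
}

# In a real script these two lines would sit near the top:
#   set -e
#   trap report_exit EXIT
```

With `set -e` active, the first failing command ends the script, and the handler then writes the status to stderr (or to wherever the script's output is being captured).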
___________________________________________________________________
(page generated 2023-06-20 23:00 UTC)