[HN Gopher] Hidden gems of moreutils
___________________________________________________________________
Hidden gems of moreutils
Author : jiripospisil
Score : 153 points
Date : 2023-12-30 12:00 UTC (10 hours ago)
(HTM) web link (jpospisil.com)
(TXT) w3m dump (jpospisil.com)
| throw0101b wrote:
| For _execsnoop_ , people running systems with DTrace can find the
| same:
|
| * https://github.com/jorgev/dtrace-scripts/blob/master/execsno...
|
| On macOS Monterey+ you'll probably have to install the Kernel
| Debug Kit per:
|
| * https://developer.apple.com/forums/thread/692444
|
| The Linux variant was written by Brendan Gregg (who previously did a
| lot of work on Solaris, where DTrace was created):
|
| * https://github.com/brendangregg/perf-tools/blob/master/execs...
|
| * https://github.com/iovisor/bcc/blob/master/tools/execsnoop.p...
| bloopernova wrote:
| In case anyone was wondering, the moreutils tools:
|
|   chronic: runs a command quietly unless it fails
|   combine: combine the lines in two files using boolean operations
|   errno: look up errno names and descriptions
|   ifdata: get network interface info without parsing ifconfig output
|   ifne: run a program if the standard input is not empty
|   isutf8: check if a file or standard input is utf-8
|   lckdo: execute a program with a lock held
|   mispipe: pipe two commands, returning the exit status of the first
|   parallel: run multiple jobs at once
|   pee: tee standard input to pipes
|   sponge: soak up standard input and write to a file
|   ts: timestamp standard input
|   vidir: edit a directory in your text editor
|   vipe: insert a text editor into a pipe
|   zrun: automatically uncompress arguments to command
|
| from
| https://rentes.github.io/unix/utilities/2015/07/27/moreutils...
|
| Similarly, there's some lesser-known useful stuff in GNU
| Coreutils:
|
| https://en.wikipedia.org/wiki/List_of_GNU_Core_Utilities_com...
|
|   paste: Merges lines of files
|   expand: Converts tabs to spaces
|   seq: prints a sequence of numbers
|   shuf: shuffles its input
| koolba wrote:
| seq is "less known"? I'd assume anyone familiar with shell
| scripting would know about it.
|
| It's a great starting point for entertaining children on a
| terminal:
|
|   seq 99 -1 0 | xargs printf '%s bottles of beer on the wall...\n'
| foobarian wrote:
| Bash kinda broke seq for me since you can just write {99..0}
| o11c wrote:
| Or if you need dynamic arguments, you probably want:
| N=9; for ((i=0; i<=N; ++i)); do echo "$i"; done
| andrewshadura wrote:
| Why use this form of for when you can just use seq and it
| works in any shell including fish?
| o11c wrote:
| Because capturing the output of `seq` requires spawning a
| whole separate process (significant for small sequences)
| and shoving all the data into a single buffer
| (significant for large sequences) rather than working
| incrementally.
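For scripts that cannot assume bash, the same in-process counting can be sketched with POSIX arithmetic expansion. The function name `count_up` is made up for illustration; it prints 0 through N without spawning `seq`, which is the cost described above.

```shell
# count_up N: print the integers 0..N, one per line, using only the
# shell's own arithmetic (no external seq process, no brace expansion).
count_up() {
    i=0
    while [ "$i" -le "$1" ]; do
        printf '%s\n' "$i"
        i=$((i + 1))
    done
}

count_up 3
```

Because the loop produces one value at a time, it also avoids collecting the whole sequence in a single buffer.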
| bloopernova wrote:
| Less known maybe for most people who read HN, but you're
| right that a lot of shell scripting folks would know about
| it.
| agumonkey wrote:
| thanks for the tl;dr
|
| moreutils and similar pops up every year but it's still easy to
| forget.. they should be part of core distributions nowadays..
| jeffbee wrote:
| Confusingly, moreutils parallel is not "GNU parallel".
| Moreutils parallel is very simple, while the other parallel is
| very featureful. Linux distributions can deal with the
| conflict, but bad package managers like homebrew cannot.
| mlk wrote:
| "rename" shares the same fate, there are several
| implementations, completely different from each other
| atomicstack wrote:
| I use `ts` quite often in adhoc logging/monitoring contexts.
| Because it uses strftime() under the hood, it has automatic
| support for '%F' and '%T', which are much easier to type than
| '%Y-%m-%d' and '%H:%M:%S'. Plus, it also has support for high-
| resolution times via '%.T', '%.s', and '%.S':
|
|   echo 'hello world' | ts '[%F %.T]'
|   [2023-12-30 16:25:40.463640] hello world
| c0l0 wrote:
| Assuming semi-recent _bash(1)_, you can also get away with
| something like
|
|   while read -r line; do printf '%(%F %T %s)T %s\n' "-1" "${line}"; done
|
| as the right-hand side/reader of a shell pipe for most of what
| _ts(1)_ offers. ("-1" makes the embedded _strftime(3)_ format
| string assume the current time as its input).
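Where bash's `%(...)T` format is unavailable (it needs bash 4.2 or later), a slower fallback can call `date` once per line. This is only a sketch: `stamp` is a hypothetical name, and it assumes a `date` that understands `%F` and `%T`, as the GNU and BSD versions do.

```shell
# stamp: prefix every line read from standard input with the current
# date and time. Spawns one date(1) process per line, so it is much
# slower than ts or the bash built-in format.
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%F %T')" "$line"
    done
}

echo 'hello world' | stamp
```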
| smalu wrote:
| I recommend zmwangx/ets package, it is the modern version of
| ts. I'm using it in CI/CD pipeline in gitlab for debugging
| performance.
| loeg wrote:
| The 'logger' command can also be useful.
| Karellen wrote:
| I use `sponge` and `ts` (mentioned in the article) pretty
| regularly, and am really happy for them.
|
| I have used `isutf8` a fair amount in the past, but I find it
| mostly redundant these days (thankfully!)
|
| The other one that I don't use very often, but is absolutely
| invaluable when I do need it, is `ifne` - "Run command if the
| standard input is not empty". It's like `-r` from GNU `xargs`,
| but for any command in a pipeline.
| 1f60c wrote:
| One typo that's easy to make is:
|
|   sort file.txt | sponge > file.txt
|
| (i.e., using redirection rather than passing the path as an
| argument to sponge)
|
| This is wrong and will not work! I've been bitten by it before.
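The redirection fails because the shell itself opens and truncates `file.txt` for `> file.txt` while setting up the pipeline, typically before `sort` has read the file. Passing the path as an argument (`sort file.txt | sponge file.txt`) lets sponge soak up all input before it opens the output. A rough stand-in for machines without moreutils, under the hypothetical name `soak`, might look like this:

```shell
# soak FILE: buffer all of standard input in a temporary file, then
# write it to FILE only after the input has ended. A simplified
# stand-in for `sponge FILE`, not a reimplementation of it.
soak() {
    soak_tmp=$(mktemp) || return 1
    if cat > "$soak_tmp"; then
        cat "$soak_tmp" > "$1"
        soak_status=$?
    else
        soak_status=1
    fi
    rm -f "$soak_tmp"
    return "$soak_status"
}

# Usage: sort a file in place.
demo=$(mktemp)
printf 'b\na\nc\n' > "$demo"
sort "$demo" | soak "$demo"
cat "$demo"
rm -f "$demo"
```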
| 22c wrote:
| moreutils parallel can also come in handy for quick command
| parallelization (not to be confused with GNU parallel which
| serves a similar purpose but can be more complicated)
| croemer wrote:
| And GNU parallel is very aggressive about citations which I get
| but it's also too much
| ostensible wrote:
| I have switched to using xargs to parallelize things: it has a
| benefit of being part of posix, and is not annoying about
| citations like parallel.
| twic wrote:
| The parallelism isn't part of POSIX though (AFAIK), that's an
| extension by whoever wrote your xargs.
|
| If what you really mean is that it's already installed on
| every machine you use, fair enough. But it's not strictly
| portable in some standards-based sense.
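As a sketch of the xargs approach: the wrapper below (`par_each` is an invented name) relies on the `-P` option, which GNU and BSD xargs both provide but which, as noted above, is an extension rather than something to count on everywhere. Output order is not guaranteed when jobs run concurrently.

```shell
# par_each JOBS CMD [ARGS...]: run CMD once for each whitespace-separated
# item on standard input, with at most JOBS processes at a time.
# -n 1 passes one item per invocation; -P is a non-POSIX extension.
par_each() {
    par_jobs=$1
    shift
    xargs -n 1 -P "$par_jobs" "$@"
}

printf '%s\n' one two three | par_each 3 echo processed
```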
| pie_flavor wrote:
| That they occupy the same namespace is always very annoying.
| Instead of just `brew upgrade` I must unlink and later link
| --overwrite parallel.
| croemer wrote:
| Not a moreutil, but I recently discovered `pv`, the pipe viewer
| and it's so useful. Like tqdm (Python progressbar library) but as
| a Unix utility. Just put it between two pipes and it'll display
| rate of bytes/lines
|
| Apparently it's neither a coreutil nor a moreutil.
|
| Here's an HN discussion from 2022:
| https://news.ycombinator.com/item?id=33244768
| chlorion wrote:
| I have also discovered that certain implementations of dd have
| a progress printing functionality that can be used for similar
| purposes. You can put a "dd status=progress" in a pipeline and
| it will print the amount and rate of data being piped!
|
| This dd option is not as nice as pipe viewer but it's handy for
| when pv isn't around for some reason.
| derefr wrote:
| Even if you don't pass this argument, you can poke most
| implementations of dd(1) with a certain POSIX signal, and
| they'll respond by printing a progress line.
|
| On Linux, this is SIGUSR1, and you have to use kill(1) to
| send it.
|
| On BSDs (incl. macOS), though, the signal dd listens for is
| instead called SIGINFO (which probably makes this make a lot
| more sense for why a process would have this response to it.)
| Shells/terminal emulators on these platforms emit SIGINFO
| when you type Ctrl+T into them!
|
| (For a lot more useful info about this behavior:
| https://stuff-things.net/2016/04/06/that-one-stupid-dd-trick...)
|
| Bonus fact not mentioned in the above article: dd used in the
| middle of a pipeline will still "hear" Ctrl+T and print
| progress, since signals generated by a shell (think: SIGINT
| from Ctrl+C) are propagated to _all_ processes in the process
| group started by the command. Test it yourself:
| cat /dev/zero | dd count=10000000 bs=1024 | cat > /dev/null
| BlackLotus89 wrote:
| Yeah, that bit me in the butt once: on macOS, USR1 killed dd. When
| I want the progress of dd and didn't define status, I mostly use
| progress now. It also works with many other utils like cp, xz
| and all the usual suspects.
| cycomanic wrote:
| When dd is mentioned, one should always point to
| https://www.vidarholen.net/contents/blog/?p=479
|
| For like >99% of cases where people used dd they would have
| been better off using a different tool.
| derefr wrote:
| You can also use pv as you would use cat, e.g.
| pv file.tar.gz.part1 file.tar.gz.part2 | tar -x -z
|
| Just like cat, pv used this way will stream out the
| concatenation of the passed-in file paths; but it will _also_
| add up the sizes of these files and use them to calculate the
| total stream size, i.e. the divisor for its displayed progress
| percentage.
| matrss wrote:
| > Like tqdm (Python progressbar library) but as a Unix utility.
|
| FYI: tqdm can be used in a shell pipeline as well. It's
| documented (at least) in their readme:
| https://github.com/tqdm/tqdm#module
| genman wrote:
| It is a really incredible anxiety reducing small tool when you
| have to transfer large files.
| LeoPanthera wrote:
| If you have a long running copy process running but forget to
| enable progress, you can use the "progress" utility to show the
| progress of something that is already running.
|
| It supports: cp mv dd tar bsdtar cat rsync scp grep fgrep egrep
| cut sort cksum md5sum sha1sum sha224sum sha256sum sha384sum
| sha512sum adb gzip gunzip bzip2 bunzip2 xz unxz lzma unlzma 7z
| 7za zip unzip zcat bzcat lzcat coreutils split gpg gcp gmv
| kyrofa wrote:
| Yeah I use chronic all the time for my cron jobs so they only
| email me if they fail and I can still print helpful output from
| them. Love moreutils.
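The behavior described here can be approximated in plain shell. The sketch below uses a made-up function name, `quiet_unless_fail`, and merges stdout and stderr into one captured stream, which is a simplification and not how chronic itself is implemented.

```shell
# quiet_unless_fail CMD [ARGS...]: run CMD with its output captured.
# On success print nothing; on failure print the captured output and
# return CMD's exit status.
quiet_unless_fail() {
    if quf_out=$("$@" 2>&1); then
        return 0
    else
        quf_status=$?
        printf '%s\n' "$quf_out"
        return "$quf_status"
    fi
}

quiet_unless_fail echo 'all good'    # succeeds, so nothing is printed
quiet_unless_fail sh -c 'echo boom; exit 3' || echo "exit status: $?"
```

In a crontab this would mean mail is only generated for failing runs, since cron typically mails whatever a job prints.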
| dig1 wrote:
| I just learned about vidir [1]. Emacs Dired [2] can rename &
| delete files by editing the buffer directly, and let's say I was
| thrilled when I saw someone replicated that behavior as a general
| Unix tool.
|
| [1] https://github.com/trapd00r/vidir
|
| [2] https://www.gnu.org/software/emacs/manual/html_node/emacs/Wd...
| gpvos wrote:
| I'd never heard of the :cq command in vim before. Seems useful,
| but in practice it's so unknown that things like editing the git
| commit message cannot rely on it and instead check whether the
| file has been changed. Also, reading its documentation, it
| probably would be better named :cqall .
| cassepipe wrote:
| I was wondering that too although I don't have access to vim
| right now. What's the punch line ? EDIT : The difference with
| :q! is the exit code !
|
| (Yes, and :wall is actually the :update command on all your
| buffers, that is, unlike :w, buffers are written only if there
| have been changes. Bad naming is the mother of all pedagogical
| pain)
| andrewshadura wrote:
| As far as I remember, it works with git commit just fine. It's
| also far from being unknown.
| gpvos wrote:
| Yeah, I should've written "cannot rely on that alone and
| _also_ check". I've worked with vi, later vim, for 34 years
| and read about it here first; ddg'ing for it doesn't give
| many hits.
| manx wrote:
| I use that often for aborting the current commit or the current
| git interactive rebase
| kiprasmel wrote:
| it's v useful if you want to abort, e.g. when editing an
| interactive rebase & decide to not go thru w/ it.
| cbarrick wrote:
| > `pee` [...] It runs the commands using popen, so it's actually
| passing them to /bin/sh -c (which on most systems is a symlink to
| Bash).
|
| Do not assume /bin/sh is Bash!!
|
| On Debian-based systems, including Ubuntu, Dash is the default
| non-interactive shell. Specifically, Dash does not support common
| Bashisms.
|
| (Not to mention Alpine, OpenWRT, FreeBSD, ...)
|
| This is a bit of a pet-peeve of mine. If you're dropping a
| reference to `/bin/sh -c` like the reader knows what that means,
| then you don't need to tell them that "it's a symlink to Bash."
| They know their own system better than you.
| 8organicbits wrote:
| Huh. I didn't realize that
|
| https://wiki.debian.org/Shell
| throwaway892238 wrote:
| Rule of thumb about shells: You'll never know for sure what
| shell is the default, so write your scripts to a minimum shell
| family, and encode the name of that shell family in the
| shebang.
|
| The default shell, for either the entire system or an
| individual user, could be:
|
|   - A Bourne shell
|   - A C shell
|   - A POSIX shell
|   - A "modern" shell, like Bash, Zsh, Fish, Osh
|
| Use the following shebangs to call the class of shell you
| expect. Start with _/usr/bin/env_ to allow the system to
| locate the executable based on the current _PATH_ environment
| variable rather than a fixed filesystem path.
|
|   #!/usr/bin/env sh
|   # ^ should result in a Bourne-like shell, on modern systems, but
|   #   could also be a POSIX shell (like Ash), which is not really
|   #   backwards compatible
|
|   #!/usr/bin/env csh
|   # ^ should result in a C shell
|
|   #!/usr/bin/env bash
|   # ^ should result in a Bash shell, though the only version you
|   #   should expect is version 3.2.57, the last GPLv2 version
|
|   #!/usr/bin/env dash
|   # ^ you expect the system to have a specific shell. if your code
|   #   depends on Dash features, do this. otherwise write your code
|   #   using an earlier version of the original shell family, such
|   #   as Bourne or POSIX.
|
| If you need to pass a command-line argument to the shell
| command before interpreting your script, use the fixed
| filesystem path for the shell. Due to bugs in the way kernels
| execute scripts, all arguments after the initial shebang path
| may be sent as a single argument, which is probably not what
| you (or the shell command) expect. (e.g. use _#! /bin/bash
| --posix_ instead of _#! /usr/bin/env bash --posix_ as the
| latter may fail)
|
| Different shells have different implementation details. For
| example, Bash tends to skew toward POSIX conformance by
| default, even if it conflicts with traditional Bourne shell
| behavior. Dash may try to implement POSIX semantics, but also
| has its own specific behavior incompatible with POSIX.
|
| In order to get specific behavior (like POSIX emulation), you
| may need to add a flag like _--posix_ , or set an environment
| variable like _POSIXLY_CORRECT_. For shells that support it,
| you can also detect the version of the shell running, and
| bail if the version isn't compatible with your script.
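A minimal sketch of such a version check, written in POSIX syntax so it can run before any Bash-only code is reached. `require_bash` is a hypothetical helper; it relies on `BASH_VERSION`, which Bash sets and other shells normally leave unset.

```shell
# require_bash MAJOR: succeed only when the running shell is Bash with a
# major version of at least MAJOR.
require_bash() {
    [ -n "${BASH_VERSION:-}" ] || return 1
    # BASH_VERSION looks like "5.2.15(1)-release"; keep the part before
    # the first dot.
    [ "${BASH_VERSION%%.*}" -ge "$1" ]
}

if require_bash 4; then
    echo "Bash 4 or newer: features such as associative arrays are available"
else
    echo "not Bash 4+: stick to POSIX features or bail out"
fi
```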
|
| Here are some references of the differences between shells:
|
|   - Comparison of command shells[1]
|   - Major differences between Bash and Bourne shell[2]
|   - Practical differences between Bash and Zsh[3]
|   - Fundamental differences between mainstream *NIX shells[4]
|
| [1] https://en.wikipedia.org/wiki/Comparison_of_command_shells
|
| [2] https://www.gnu.org/software/bash/manual/html_node/Major-Dif...
|
| [3] https://apple.stackexchange.com/questions/361870/what-are-th...
|
| [4] https://unix.stackexchange.com/questions/3320/what-are-the-f...
| throw0101b wrote:
| > _Specifically, Dash does not support common Bashisms._
|
| More importantly, Bash should not support Bashisms when called
| as /bin/sh (either).
|
| If you want to use Bashisms just invoke Bash.
| cassepipe wrote:
| > In Vim / Helix you do that with :cq
|
| Never heard of that before. I generally use :q! or ZQ
|
| Is there a difference ?
| mbwgh wrote:
| Yes, the exit code. See e.g. `:help cq` in vim. :q! and ZQ will
| yield exit code 0, which sometimes is not what you want if you
| want to ensure some task is properly aborted.
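To illustrate why the exit code matters, here is a sketch of a wrapper that only continues when the editor exits with status 0, which is roughly the kind of check a calling tool can make. The name `edit_then` is hypothetical, and `true` and `false` stand in for an editor left normally and one left with `:cq`.

```shell
# edit_then EDITOR FILE CMD [ARGS...]: open FILE in EDITOR, then run CMD
# only if the editor exited with status 0. A non-zero exit (as from
# Vim's :cq) is treated as "abort".
edit_then() {
    et_editor=$1
    et_file=$2
    shift 2
    if "$et_editor" "$et_file"; then
        "$@"
    else
        echo 'editor reported failure, aborting' >&2
        return 1
    fi
}

edit_then true  /dev/null echo 'continuing'
edit_then false /dev/null echo 'continuing' || echo 'aborted'
```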
| cassepipe wrote:
| Thanks !
| katzgrau wrote:
| `pee` - no doubt the dev was delighted and amused
| opan wrote:
| vidir within ranger is really nice. vipe is also pretty cool.
| Mostly I use vipe for editing my clipboard contents and then
| sending the modified version back to the clipboard, or
| occasionally editing some text stream before sending it to my
| clipboard, such as some grep output I only want some of.
___________________________________________________________________
(page generated 2023-12-30 23:00 UTC)