[HN Gopher] An Opinionated Guide to Xargs
___________________________________________________________________
An Opinionated Guide to Xargs
Author : todsacerdoti
Score : 239 points
Date : 2021-08-21 16:21 UTC (6 hours ago)
(HTM) web link (www.oilshell.org)
(TXT) w3m dump (www.oilshell.org)
| masklinn wrote:
| > Shell functions and $1, instead of xargs -I {}
|
| > -n instead of -L (to avoid an ad hoc data language)
|
| Apparently GNU xargs is missing it, but BSD xargs has -J, which
| is a `-I` which works with `-n`: with `-I` each replstr gets
| replaced by one of the inputs, with `-J` the replstr gets
| replaced by the entire batch (as determined by `-n`).
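|
| The per-item versus per-batch behaviour can be seen with a small
| sketch that sticks to the POSIX `-I` and `-n` flags. The BSD-only
| `-J` is deliberately not used, since it would not run on GNU
| xargs, and the inputs a, b, c are made up for illustration:

```shell
# -I {}: the command runs once per input line, with {} replaced by that line.
per_item=$(printf '%s\n' a b c | xargs -I {} echo "item: {}")

# -n 2: inputs are appended to the command in batches of at most two.
per_batch=$(printf '%s\n' a b c | xargs -n 2 echo "batch:")

printf '%s\n' "$per_item" "$per_batch"
```

| With `-I` the command runs three times; with `-n 2` it runs twice
| and the batch always lands at the end of the command line.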
| [deleted]
| MichaelGroves wrote:
| > _A lobste.rs user asked why you would use find | xargs rather
| than find -exec. The answer is that it can be much faster. If
| you're trying to rm 10,000 files, you can start one process
| instead of 10,000 processes!_
|
| Fair enough, but I still favor _find -exec_. I find it generally
| less error prone, and it's never been so slow that I wished I
| had instead used xargs.
|
| Also, if you're specifically using _-exec rm_ with find, you
| could instead use find with _-delete_.
| bobbylarrybobby wrote:
| You can also use `find -exec` with `'+'` instead of `';'` as
| the terminator. This will call `rm` on all of the found files
| in one call.
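|
| A minimal sketch of the difference between the two terminators,
| assuming a scratch directory with three made-up files. The inner
| `sh -c` only reports how many file arguments each invocation
| received:

```shell
# Scratch directory with three files (names are made up for the demo).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.txt"

# ';' terminator: one invocation per file, each seeing a single argument.
with_semicolon=$(find "$demo_dir" -type f -exec sh -c 'echo "$#"' sh {} \;)

# '+' terminator: files are batched into as few invocations as possible.
with_plus=$(find "$demo_dir" -type f -exec sh -c 'echo "$#"' sh {} +)

printf 'with ;\n%s\nwith +\n%s\n' "$with_semicolon" "$with_plus"
rm -rf "$demo_dir"
```

| For a list this small the `+` form makes a single call; for very
| long lists find may still split the work across several calls.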
| masklinn wrote:
| I tend to prefer xargs because it works in more contexts e.g.
| I've got a tool which automatically generates databases but
| sometimes the cleanup doesn't work. `find -exec` does
| nothing, but `xargs -n1 dropdb` (following an intermediate
| grep) does the job. From there, it makes sense to... just use
| xargs everywhere.
|
| And I always fail to remember that the -exec terminator must
| be escaped in zsh, so using -exec always takes me multiple
| tries. So I only use -exec when I must (for `find`
| predicates).
| shoo wrote:
| i agree. `find somewhere -exec some_command {} +` can be
| dramatically faster. but it does not guarantee a single
| invocation of `some_command`, it may make multiple
| invocations if you pass very large numbers of matching files
|
| after spending a bit of time reading the man page for find, i
| rarely use xargs any more. find is pretty good.
|
| tangent:
|
| another instance i've seen where spawning many processes can
| lead to bad performance is in bash scripts for git pre-receive
| hooks, to scan and validate the commit message of a
| range of commits before accepting them. it is pretty easy to
| cobble together some loop in a bash script that executes
| multiple processes _per commit_. that's fine for typical
| small pushes of 1-20 commits -- but if someone needs to do
| serious graph surgery and push a branch of 1000 - 10,000
| commits that can cause very long running times -- and
| more seriously, timeouts, where the entire push gets rejected
| as the pre-receive script takes too long. a small program
| using the libgit2 API can do the same work at the cost of a
| single process, although then you have the fun of figuring
| out how to build, install and maintain binary git pre-receive
| hooks.
| chubot wrote:
| A benefit I didn't mention in the post (but probably should) is
| that the pipe lets you interpose other tools.
|
| That is, find -exec is sort of "hard-coded", while find | xargs
| allows obvious extensions like:
|
|     find | grep | xargs  # filter tasks
|     find | head | xargs  # I use this all the time for faster testing
|     find | shuf | xargs
|
| Believe it or not I actually use find | shuf | xargs mplayer to
| randomize music and videos :)
|
| So shell is basically a more compositional language than find
| (which is its own language, as I explain here:
| http://www.oilshell.org/blog/2021/04/find-test.html )
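|
| As a rough illustration of interposing tools between find and
| xargs, here is a sketch with made-up file names. The `sort` stage
| is only there to make the result repeatable, and the names are
| assumed to contain no whitespace:

```shell
# Scratch directory with a few made-up files.
demo_dir=$(mktemp -d)
touch "$demo_dir/a_test.py" "$demo_dir/b_test.py" "$demo_dir/c_test.py" "$demo_dir/notes.txt"

# grep filters the task list, head limits it for a quick trial run,
# and xargs finally hands each surviving path to a command.
selected=$(find "$demo_dir" -type f |
  grep '_test\.py$' |
  sort |
  head -n 2 |
  xargs -n 1 basename)

printf '%s\n' "$selected"
rm -rf "$demo_dir"
```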
| reilly3000 wrote:
| I'm unconvinced by the post OP was responding to. It's a utility,
| it provides some means to get things done. *nix provides many
| means of parsing text and running commands, each have their
| idioms based on their own axioms. It seems as if a composer is
| lambasting the clarinet because they don't care for its
| fingerings. I've only used xargs sparingly, can somebody
| enlighten me as to why it's bad, aside from the fact that there
| are other ways to do some things it does?
| westurner wrote:
| Wanting verbose logging from xargs, years ago I wrote a script
| called `el` (edit lines) that basically does `xargs -0` with
| logging.
| https://github.com/westurner/dotfiles/blob/develop/scripts/e...
|
| It turns out that e.g. -print0 and -0 are the only safe way: line
| endings aren't escaped:
|
|     find . -type f -print0 | el -0 --each -x echo
|
| GNU Parallel is a much better tool:
| https://en.wikipedia.org/wiki/GNU_parallel
| chubot wrote:
| (author here) Hm I don't see either of these points because:
|
| GNU xargs has --verbose which logs every command. Does that not
| do what you want? (Maybe I should mention its existence in the
| post)
|
| xargs -P can do everything GNU parallel does, which I mention in
| the post. Any counterexamples? GNU parallel is a very ugly DSL
| IMO, and I don't see what it adds.
|
| --
|
| edit: Logging can also be done by recursively invoking
| shell functions that log with the $0 Dispatch Pattern,
| explained in the post. I don't see a need for another tool;
| this is the Unix philosophy and compositionality of shell at
| work :)
| jeffbee wrote:
| Parallel's killer feature is how it spools subprocess output,
| ensuring that it doesn't get jumbled together. xargs can't do
| that. I use parallel for things like shelling out to 10000
| hosts and getting some statistics. If I use xargs the output
| stomps all over itself.
| chubot wrote:
| Ah OK thanks, I responded to this here:
| https://news.ycombinator.com/item?id=28259473
| Godel_unicode wrote:
| As far as I'm aware, xargs still has the problem of multiple
| jobs being able to write to stdout at the same time,
| potentially causing their output streams to be intermingled.
| Compare this with parallel's --group.
|
| Also parallel can run some of those threads on remote
| machines. I don't believe xargs has an equivalent job
| management function.
| [deleted]
| LeoPanthera wrote:
| Yeah but xargs doesn't refuse to run until I have agreed to a
| EULA stating I will cite it in my next academic paper.
| jeffbee wrote:
| parallel doesn't either, it just nags. I agree about how
| silly and annoying it is. Imagine if every time the parallel
| author opened Firefox he got a message reminding him to
| personally thank me if he uses his web browser for research,
| or if every time his research program calls malloc he has to
| acknowledge and cite Ulrich Drepper. Very very silly.
|
| Parallel is the better tool but the nagware impairs its
| reputation.
| blibble wrote:
| or every time a process called fork() you had to read some
| stupid message
| fiddlerwoaroof wrote:
| I frequently find myself reaching for this pattern instead of
| xargs:
|
|     do_something | (
|       while read -r v; do
|         . . .
|       done
|     )
|
| I've found that it has fewer edge cases (except it creates a
| subshell, which can be avoided in some shells by using braces
| instead of parens)
| aaaaaaaaaaab wrote:
| Also for the `while` enthusiasts, here's how you zip the output
| of two processes in bash:
|
|     paste -d \\n <(do_something1) <(do_something2) |
|       while read -r var1 && read -r var2; do
|         ...  # var1 comes from do_something1, var2 comes from do_something2
|       done
| aaaaaaaaaaab wrote:
| Some additional tips:
|
| 1. You don't need the parentheses.
|
| 2. If you use process substitution [1] instead of a pipe, you
| will stay in the same process and can modify variables of the
| enclosing scope:
|
|     i=0
|     while read -r v; do
|       ...
|       i=$(( i + 1))
|     done < <(do_something)
|
| The drawback is that this way `do_something` has to come after
| `done`, but that's bash for you ¯\_(ツ)_/¯
|
| [1]
| https://www.gnu.org/software/bash/manual/html_node/Process-S...
| chriswarbo wrote:
| I use this exact pattern a lot. One thing to consider is that
| in the process substitution version, do_something can't
| modify the enclosing variables. The vast majority of the time
| I want to modify variables in the loop body and not the
| generating process, but it's worth keeping in mind.
|
| One common pattern I use this for is running a bunch of
| checks/tests, e.g.
|
|     EXIT_CODE=0
|     while read -r F
|     do
|       do_check "$F" || EXIT_CODE=1
|     done < <(find ./tests -type f)
|     exit "$EXIT_CODE"
|
| This is a more complicated alternative to the following:
|
|     find ./tests -type f | while read -r F
|     do
|       do_check "$F" || exit 1
|     done
|
| The simpler version will abort on the first error, whilst the
| first version will always run all of the checks (exiting with
| an error afterwards, if any of them failed)
| fiddlerwoaroof wrote:
| I usually write zsh scripts and I think there's a shell
| option in zsh that allows the loop at the end of the pipe
| to modify variables in the enclosing body: I remember at
| least one occasion where I was surprised about this
| discrepancy between shells.
| aaaaaaaaaaab wrote:
| Interesting! Indeed, Greg's BashFAQ notes it too:
| https://mywiki.wooledge.org/BashFAQ/024
|
| >Different shells exhibit different behaviors in this
| situation:
|
| >- BourneShell creates a subshell when the input or
| output of anything (loops, case etc..) but a simple
| command is redirected, either by using a pipeline or by a
| redirection operator ('<', '>').
|
| >- BASH, Yash and PDKsh-derived shells create a new
| process only if the loop is part of a pipeline.
|
| >- KornShell and Zsh creates it only if the loop is part
| of a pipeline, but not if the loop is the last part of
| it. The read example above actually works in ksh88,
| ksh93, zsh! (but not MKsh or other PDKsh-derived shells)
|
| >- POSIX specifies the bash behaviour, but as an
| extension allows any or all of the parts of the pipeline
| to run without a subshell (thus permitting the KornShell
| behaviour, as well).
| fiddlerwoaroof wrote:
| Yeah, although I use the parentheses mostly because I like
| how it reads. And that process substitution trick is
| important too.
|
| I think the redirection can come first, though (not at a
| computer to test):
|
|     < <( do_something ) while read . . .
| [deleted]
| lottin wrote:
| This is not POSIX compliant though.
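|
| For anyone who needs to stay within POSIX sh, one possible
| workaround (a sketch, not something proposed in this thread) is
| to feed the loop from a here-document wrapping a command
| substitution. The loop then runs in the current shell, at the
| cost of buffering the producer's whole output first. The input
| words are made up:

```shell
# Pipeline version: in bash and dash the loop runs in a subshell,
# so the increment is lost once the pipeline finishes.
pipe_count=0
printf '%s\n' one two three | while IFS= read -r line; do
  pipe_count=$((pipe_count + 1))
done

# Here-document version: the loop runs in the current shell,
# so the variable is still set afterwards.
heredoc_count=0
while IFS= read -r line; do
  heredoc_count=$((heredoc_count + 1))
done <<EOF
$(printf '%s\n' one two three)
EOF

echo "pipe: $pipe_count, here-document: $heredoc_count"
```

| The pipe count depends on the shell (ksh and zsh keep the
| variable); the here-document count should be 3 everywhere.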
| fiddlerwoaroof wrote:
| These days bash and/or zsh are available nearly every
| place I care about, so I find POSIX compliance to be much
| less relevant.
| pgtan wrote:
| No, process substitution must be provided by the
| kernel/syslibs, it is not a feature of bash. For example
| there is bash on AIX, but process substitution is not
| possible because the OS does not support it.
| aaaaaaaaaaab wrote:
| Yeah, for _commands_, the input/output redirections can
| precede them, but for some reason it doesn't work for
| builtin constructs like `while`:
|
|     $ < <( echo foo ) while read -r f; do echo "$f"; done
|     -bash: syntax error near unexpected token `do'
|     $ < <( echo foo ) xargs echo
|     foo
|     $ bash --version
|     GNU bash, version 5.1.4(1)-release (x86_64-apple-darwin20.2.0)
| fiddlerwoaroof wrote:
| Maybe wrap the loop either with parentheses or braces?
| aaaaaaaaaaab wrote:
| Tried that, but nope :D I'll let you figure this one out
| once you get near a computer!
| entire-name wrote:
| Redirection like this doesn't seem to work if it comes
| first on GNU bash 5.0.17(1)-release.
|
| For documentation purposes, this is the exact thing I tried
| to run:
|
|     $ < <(echo hi) while read a; do echo "got $a"; done
|     -bash: syntax error near unexpected token `do'
|     $ while read a; do echo "got $a"; done < <(echo hi)
|     got hi
|
| Maybe there is another way...
| JNRowe wrote:
| One way which isn't great, but an option nonetheless...
| The zsh parser is happy with that form:
|
|     $ zsh -c '< <(echo hi) while read a; do echo "got $a"; done'
|     got hi
|
| My position isn't that it is a good reason to switch
| shells, but if you're using it anyway then it is an
| option.
| fiddlerwoaroof wrote:
| I've always preferred zsh and, as I've slowly adopted
| nix, I've slowly stopped writing bash in favor of zsh
| tomcam wrote:
| Thank you. Your comment coalesced a number of things in my mind
| that I hadn't grasped properly as a UNIX midwit, especially the
| braces thing.
| ptspts wrote:
| For thousands of arguments this solution is much slower (high
| CPU usage) than xargs, because either it implements the logic
| as a shell script (slow) or it runs an external program for
| each argument (slow).
| fiddlerwoaroof wrote:
| Sure, if performance matters use xargs. I find this is easier
| to read and think about.
| pgtan wrote:
| FWIW AIX also has an apply command
|
| https://www.ibm.com/docs/en/aix/7.2?topic=apply-command
| 2OEH8eoCRo0 wrote:
| I spent a year using AIX at my previous job and never heard of
| this or saw anybody use it. Is it new in 7.2? We were far
| behind on AIX 6.
| pgtan wrote:
| No idea how old this command is. Most of the AIX/Linux admins
| I knew were very bad shell programmers, skills end with
| awful for-loops, useless use of cat, and awk '{print $3}'.
| agumonkey wrote:
| I used to have bash fun like `curry { xargs -I {} $1 }` or
| something like that. Pretty useful to simplify one liners.
| rcpt wrote:
| awk '{ print your_command }' | bash
|
| Never can remember all the -I stuff around xargs
| chubot wrote:
| This is like the sed|bash anti-pattern mentioned in the
| original post, and quoted in the appendix on shell injection.
|
| I wouldn't say "never use it", but I would hesitate to ever put
| it in a script, vs. doing a one-off at the command line.
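|
| To make the injection concern concrete, here is a harmless
| sketch. The "untrusted" line is made up, and the embedded command
| is just an echo:

```shell
# A made-up "untrusted" input line containing shell syntax.
untrusted='x; echo INJECTED'

# Generating command text and piping it to a shell: the ';' is parsed
# as a command separator, so the embedded command runs as well.
via_shell=$(printf '%s\n' "$untrusted" | awk '{ print "echo " $0 }' | sh)

# Passing the line as an argument: echo receives it as plain data.
via_xargs=$(printf '%s\n' "$untrusted" | xargs -I {} echo {})

printf 'awk | sh:\n%s\n\nxargs:\n%s\n' "$via_shell" "$via_xargs"
```

| The first form prints two lines because the input was executed;
| the second prints the input back unchanged.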
| legobmw99 wrote:
| This is only tangentially related, but after all the posts here
| the last few days about thought terminating cliches, I can't help
| but reflect on the "X considered harmful" title cliche
| MichaelGroves wrote:
| Would you say the title terminated your consideration of the
| article?
| legobmw99 wrote:
| No I think if anything seeing it was a response to "xargs
| considered harmful" made me take the authors side quicker
| Zababa wrote:
| I've been thinking about titles, and it's hard to make a good
| one that doesn't look like a total cliche. "X considered
| harmful", "an opinionated guide to X", some kind of joke or
| reference, what could be a collection of tags (X, Y and Z),
| "things I have learned doing X", etc.
| zeroimpl wrote:
| I specifically clicked on this topic because of the word
| "opinionated". As I already know how to use xargs, I was
| curious what kind of non-conventional or controversial
| opinion the author might have.
| Zababa wrote:
| As I've said to a sibling comment, I don't think it's a bad
| title, and "an opinionated guide to X" is one of the better
| cliche for titles that I see (the worst being the
| journalist that feels like they have to make a joke).
| ineedasername wrote:
| In this case a less cliche/click-baity title could simply be:
|
| "A Response to Xargs Criticism"
| Zababa wrote:
| I think this title is fine, it's mostly that after spending
| some time on Hacker News all the titles start to look the
| same.
| phone8675309 wrote:
| What every X should know about Y, an opinionated take on Z
| considered harmful
| MonkeyClub wrote:
| ...with an example Lisp implementation written in APL
| translating into 6502 assembly :)
| abetusk wrote:
| Yes, I absolutely hate them. I was thinking of creating a
| "considered harmful" considered harmful rant but it already
| exists [0].
|
| [0] https://meyerweb.com/eric/comment/chech.html
| JadeNB wrote:
| Is it thought _terminating_ , though? "X considered harmful"
| seems more intended to spark discussion in an intentionally
| inflammatory way than to stifle it.
|
| (In any case, this surely _is_ tangential, since the title is
| not "X considered harmful" for any value of X--at best it
| _comments_ on a post by that title, as, indeed, you are doing.)
| yudlejoza wrote:
| Of xargs, for, and while, I have limited myself to while. It's
| more typing every time but saves me from having to remember so
| many quirks of each command.
|
|     cat input.file | ... | while read -r unit; do <cmd> ${unit}; done | ...
|
| between 'while read -r unit' and 'while IFS= read -r unit' I can
| probably handle 90% of the cases. (maybe I should always use IFS
| since I tend to forget the proper way to use it).
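|
| A small sketch of what the two `read` forms do with surrounding
| whitespace, and of why quoting the variable matters afterwards.
| The sample line and the `count_args` helper are made up for
| illustration:

```shell
line='  foo bar  '

# Default IFS: read trims leading and trailing whitespace.
trimmed=$(printf '%s\n' "$line" | { read -r unit; printf '[%s]' "$unit"; })

# Empty IFS: the line is kept exactly as it was written.
exact=$(printf '%s\n' "$line" | { IFS= read -r unit; printf '[%s]' "$unit"; })

# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

# Unquoted, the value is split on whitespace; quoted, it stays one argument.
unit='foo bar'
unquoted=$(count_args $unit)
quoted=$(count_args "$unit")

echo "read -r: $trimmed, IFS= read -r: $exact"
echo "unquoted: $unquoted arg(s), quoted: $quoted arg(s)"
```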
| patrickdavey wrote:
| Would you mind expanding with a couple of examples? (E.g. using
| "foo bar" as a single line or split by whitespace).
|
| I suspect I'll really like your way of doing things, but an
| example would be very handy.
| andy81 wrote:
| Today I appreciated Powershell
| jmholla wrote:
| Can you expand on that? I've never had trouble leveraging xargs
| and find it aligns well with shell piping.
| bialpio wrote:
| Not OP but to me the best thing about PowerShell is that it
| recognizes that text is not always the best way to output
| results from commands if you care about creating pipelines.
| In short, it passes objects around so there's no need for
| parsing text.
| bialpio wrote:
| Two examples from the article translated into PS (sorry,
| I'm a bit rusty so the second one may not be the shortest
| possible):
|
|     PS> "alice", "bob" | echo
|     PS> Get-ChildItem . -Include "*test.cpp","*test.py" -Recurse | foreach { Remove-Item $_.Name }
|
| No text parsing in sight, and the object attributes can be
| tab-completed from the shell (e.g. I tab-completed the
| `$_.Name`).
| andy81 wrote:
| Thanks, we were thinking of the same thing.
| HMH wrote:
| I always wonder why something like xargs is not a shell built-in.
| It's such a common pattern, but I dread formulating the correct
| incantation every time.
|
| I was happy to read that the author comes to the same conclusion
| and proposes an `each` builtin (albeit only for the Oil shell)!
| Like that there is no need to learn another mini language as
| pointed out.
| JNRowe wrote:
| If you're a zsh user it offers a version of something like
| xargs in zargs[1]. As the documentation shows it can be really
| quite powerful in part because of zsh's excellent globbing
| facilities, and I think without that support it wouldn't be all
| that useful as a built-in.
|
| I'd also perhaps argue that the reason we don't want xargs to
| be a built-in is precisely because of zargs and the point in
| your second paragraph. If it was built-in it would no doubt be
| obscenely different in each shell, and five decades later a
| standard that no one follows would eventually specify its
| behaviour ;)
|
| [1] https://zsh.sourceforge.io/Doc/Release/User-Contributions.ht...
| - search for "zargs", it has no anchor. Sorry.
| l0b0 wrote:
| I scanned until I saw `ls | egrep '.*_test\.(py|cc)' | xargs -d
| $'\n' -- rm`, and then stopped. This is a terrible idea[1][2].
|
| [1] https://mywiki.wooledge.org/ParsingLs
|
| [2] https://unix.stackexchange.com/q/128985/3645
| tyingq wrote:
| I'm surprised the links don't mention find. The -print0 flag
| makes it safe for crazy filenames, which pairs with the xargs
| -0 flag, or the perl -0 flag, etc. And you have -maxdepth if
| you don't want it to trawl.
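|
| A sketch of that pairing, using a scratch directory with
| deliberately awkward, made-up file names (one contains a space,
| one an embedded newline). `-print0`, `-maxdepth` and `xargs -0`
| are assumed to be available, as they are in the GNU and BSD tools:

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/plain.txt" "$demo_dir/has space.txt" "$demo_dir/sub/deep.txt"
# This file name contains a literal newline.
touch "$demo_dir/new
line.txt"

# Newline-delimited output: the file with the embedded newline
# looks like two separate entries.
line_count=$(find "$demo_dir" -maxdepth 1 -type f | wc -l | tr -d ' ')

# NUL-delimited output: each file arrives as exactly one argument.
# -maxdepth 1 keeps find out of the sub directory.
arg_count=$(find "$demo_dir" -maxdepth 1 -type f -print0 |
  xargs -0 sh -c 'echo "$#"' sh)

echo "lines: $line_count, arguments: $arg_count"
rm -rf "$demo_dir"
```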
| WhatIsDukkha wrote:
| I tend to reach for gnu parallel instead of xargs -
|
| https://www.gnu.org/software/parallel/parallel_alternatives....
|
| parallel is probably on the complex side but it's also been
| actively developed, bugfixed and had a lot of road miles from
| large computing users.
| orhmeh09 wrote:
| The nagware prompts of parallel are so objectionable that I
| will do a lot of things to avoid using it at all. So
| pretentious!
| queuebert wrote:
| It's also written in Perl!
| orhmeh09 wrote:
| Veering off course here, after experiencing how incredibly
| long it took to install Sqitch, I will go out of my way to
| avoid anything that is more than a single script, certainly
| anything requiring CPAN too. I don't think there's anything
| technically wrong with these programs or with Perl, they're
| just presented in ways that are unique hassles in this day
| and age.
| grawlinson wrote:
| Seems like some distributions patch out the nagware. I know
| Arch Linux does[0].
|
| [0]: https://github.com/archlinux/svntogit-community/tree/package...
| chubot wrote:
| I mention it here:
| https://www.oilshell.org/blog/2021/08/xargs.html#xargs-p-aut...
|
| What does it do that xargs and shell can't? (honest question)
| lacksconfidence wrote:
| i don't know if xargs can't, but i use gnu parallel to split
| an input pipe into N parallel pipes processing slices of the
| input stream.
|
| Edit: To clarify, xargs usually wants to spin up a process
| per task. I have parallel spin up N processes and then
| continuously feed them.
| bloopernova wrote:
| GNU Parallel can be sourced into a bash session from a plain
| text file and used as a function. I've used it to get around
| overly-restrictive build environments. (overly restrictive
| because the team that manages the build image wasn't open to
| modifying their image for my use case)
| xmcqdpt2 wrote:
| Restart capability and remote executions make gnu parallel
| the tool of choice for HPC. For example, you might very well
| use gnu parallel to run 1000s of cpu-hours of numerical
| simulation using patterns such as these ones,
|
| https://docs.computecanada.ca/mediawiki/index.php?title=GNU_...
|
| Using xargs for this kind of work is euhm... not a good idea.
| orf wrote:
| Resumption, error reporting and much better progress
| monitoring.
| vhold wrote:
| Oh I didn't know about resumption.. parallel has so many
| features packed into its CLI it's kind of ridiculous.
|
| For others that didn't know about it, see the examples
| here:
| https://www.gnu.org/software/parallel/parallel_tutorial.html...
|
| Here's another surprising feature:
| https://www.gnu.org/software/parallel/parallel_tutorial.html...
| bsmithers wrote:
| Not to be pedantic, but that's a bit of a non-argument. _Of
| course_ you can do it with xargs and shell, but imho parallel
| is generally more convenient, especially for remote
| execution. It provides a higher level of abstraction for such
| tasks.
| orhmeh09 wrote:
| Issue complaint prompts to promote the author, for one.
| leephillips wrote:
| Remote execution.
| chubot wrote:
| I'd like to see a demo of it! I will try rewriting it with
| the $0 Dispatch Pattern and ssh :)
| xmcqdpt2 wrote:
| Good luck balancing node usage!
|
| Here is an example of how it works,
|
| https://docs.computecanada.ca/mediawiki/index.php?title=GNU_...
|
| This + restart capabilities make gnu parallel very well
| suited to running 1000s of compute-heavy jobs on HPC
| clusters.
| figomore wrote:
| I used Parallel to distribute the rendering of a little
| Blender animation. It worked very well.
|
| https://github.com/tfmoraes/blender_gnu_parallel_render/blob...
| comex wrote:
| One thing parallel can do better than xargs is collect
| output.
|
| If you use `xargs -P`, all processes share the same stdout
| and output may be mixed arbitrarily between them. (If the
| program being executed uses line buffering, lines _usually_
| won't be mixed together from multiple invocations, but they
| can be if they're long enough).
|
| In contrast, `parallel` by default doesn't mix together
| output from different commands at all, instead buffering the
| entire output until the command exits and then printing it.
|
| With `--line-buffer` the unit of atomicity can be weakened
| from an entire command output to individual lines of output,
| reducing latency.
|
| Alternately, with `--keep-order`, `parallel` can ensure the
| outputs are printed in the same order as the corresponding
| inputs, which makes the output deterministic if the program
| is deterministic. Without that you'll get results in an
| arbitrary order.
|
| These aren't technically things that xargs and shell can't
| do; you could reimplement the same behavior by hand with the
| shell. But by the same token, there isn't anything xargs can
| do that the shell can't do alone; you could always use the
| shell to manually split up the input and invoke subprocesses.
| It's just a question of how much you want to reimplement by
| hand.
| chubot wrote:
| OK thanks, looks like there are several features of GNU
| parallel that users like.
|
| For the output interleaving issue, what I do is use the $0
| Dispatch Pattern and write a shell function that redirects
| to a file:
|
|     do_one() {
|       task_with_stdout > $dir/$task_id.txt
|     }
|
| So if there are 10,000 tasks then I get 10,000 files, and I
| can check the progress with "ls", and I can also see what
| tasks failed and possibly restart them.
|
| You even have some notion of progress by checking the file
| size with ls -l.
|
| I tend to use a pattern where each task also outputs a
| metadata file: the exit status, along with the data from
| "time" (rusage, etc.)
|
| But I admit that this is annoying to rewrite in every
| script that uses xargs! It does make sense to have this
| functionality in a tool.
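|
| A rough sketch of that hand-rolled approach, assuming an inline
| `sh -c` body instead of the $0 Dispatch Pattern so the example
| stays self-contained. The four tasks, the deliberate failure of
| task 3, and the `.txt`/`.status` file layout are all made up:

```shell
out_dir=$(mktemp -d)

# Run four made-up tasks in parallel. Each one writes stdout to its own
# file and records its exit status beside it, so outputs never interleave.
printf '%s\n' 1 2 3 4 |
  xargs -n 1 -P 4 sh -c '
    out_dir=$1
    task_id=$2
    # Stand-in for real work: print one line; task 3 fails on purpose.
    { echo "result of task $task_id"; [ "$task_id" != 3 ]; } > "$out_dir/$task_id.txt"
    echo "$?" > "$out_dir/$task_id.status"
  ' sh "$out_dir"

# Afterwards, collect the IDs of tasks whose recorded status is non-zero.
failed=""
for status_file in "$out_dir"/*.status; do
  if [ "$(cat "$status_file")" != 0 ]; then
    failed="$failed $(basename "$status_file" .status)"
  fi
done
task2_output=$(cat "$out_dir/2.txt")

echo "failed tasks:$failed"
rm -rf "$out_dir"
```

| The status files give a cheap way to find and re-run only the
| failed tasks, which is the restart behaviour discussed above.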
|
| But I think that tool should be a LANGUAGE like Oil, not a
| weirdo interface like GNU parallel :)
|
| But thanks for the explanation (and thanks to everyone in
| this subthread) -- I learned a bunch and this is why I
| write blog posts :)
| Godel_unicode wrote:
| Thank you for writing this, it really crystalized for me
| why I feel the way I do about oil. I hate it. When I want
| a language, I want a real language like python not a
| weirdo jumped up shell (see what I did there?). What I
| want in a shell is a super small, fast, universally
| understood thing for basic tasks and easy expandability
| through tools like parallel and python.
|
| For what it's worth, I consider oil to be closer to a
| unixy PowerShell rather than a more powerful bash. Note
| that this is not a slight, PowerShell is sweet for what
| it is. It (oil) really takes a hard left from the POSIX
| philosophy of focusing on one thing and doing it well.
| I'm also bitter that, if it's going to veer so far away
| from POSIX, that it didn't go the whole hundred and
| become a functional language with comprehensions and such.
|
| For what it's worth, everything you mentioned above about
| your approach can be done with parallel.
| senkora wrote:
| I always think of xargs as the inverse of echo. echo converts
| arguments to text streams, and xargs converts text streams to
| arguments.
| kazinator wrote:
| In 2002, I implemented xargs in Lisp, in the Meta-CVS project.
|
| It is quite necessary, because you cannot pass an arbitrarily
| large command line or environment in exec system calls.
|
| Of course, this doesn't have the problem requiring -0 because
| we're not reading textual lines from standard input, but working
| with lists of strings.
|
|     ;;; This source file is part of the Meta-CVS program,
|     ;;; which is distributed under the GNU license.
|     ;;; Copyright 2002 Kaz Kylheku
|
|     (in-package :meta-cvs)
|
|     (defconstant *argument-limit* (* 64 1024))
|
|     (defun execute-program-xargs (fixed-args &optional extra-args fixed-trail-args)
|       (let* ((fixed-size (reduce #'(lambda (x y)
|                                      (+ x (length y) 1))
|                                  (append fixed-args fixed-trail-args)
|                                  :initial-value 0))
|              (size fixed-size))
|         (if extra-args
|           (let ((chopped-arg ())
|                 (combined-status t))
|             (dolist (arg extra-args)
|               (push arg chopped-arg)
|               (when (> (incf size (1+ (length arg))) *argument-limit*)
|                 (setf combined-status
|                       (and combined-status
|                            (execute-program (append fixed-args
|                                                     (nreverse chopped-arg)
|                                                     fixed-trail-args))))
|                 (setf chopped-arg nil)
|                 (setf size fixed-size)))
|             (when chopped-arg
|               (execute-program (append fixed-args (nreverse chopped-arg)
|                                        fixed-trail-args)))
|             combined-status)
|           (execute-program (append fixed-args fixed-trail-args)))))
| jordemort wrote:
| I appreciate this. If I wrote my own opinionated guide to xargs,
| it would be a single profane sentence.
| lisper wrote:
| Note that the suggested:
|
| rm $(ls | grep foo)
|
| will not work if you have file names that contain spaces.
|
| Shell programming is planted thick with landmines like this.
| ViViDboarder wrote:
| The linked article doesn't suggest this. They explicitly
| suggest against it.
|
| > Besides the extra ls, the suggestion is bad because it relies
| on shell's word splitting. This is due to the unquoted $().
| It's better to rely on the splitting algorithms in xargs,
| because they're simpler and more powerful.
| pwg wrote:
| Since the blog author is commenting here, you have this statement
| part way down your blog:
|
| > That is, grep doesn't support an analogous -0 flag.
|
| However, the GNU grep variant does have an analogous flag:
|
| -z, --null-data
|
| Treat the input as a set of lines, each terminated by a zero byte
| (the ASCII NUL character) instead of a newline. Like the -Z or
| --null option, this option can be used with commands like sort -z
| to process arbitrary file names.
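|
| A sketch of how `-z` slots into a NUL-delimited pipeline,
| assuming GNU grep and made-up file names that contain spaces:

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/unit test.py" "$demo_dir/unit test.cc" "$demo_dir/read me.txt"

# -print0 emits NUL-terminated names, grep -z filters NUL-terminated
# records and writes them back NUL-terminated, and xargs -0 consumes them.
matched=$(find "$demo_dir" -type f -print0 |
  grep -z -E 'test\.(py|cc)$' |
  xargs -0 -n 1 basename |
  sort)

printf '%s\n' "$matched"
rm -rf "$demo_dir"
```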
| chubot wrote:
| Ah cool, I didn't know that! I'll update the blog post. (What a
| cacophony of flags)
|
| Edit: It seems that grep -0 isn't taken for something else and
| they should have used it for consistency? The man page says
| it's meant to be used with find -print0, xargs -0, perl -0, and
| sort -z (another inconsistency)
| tyingq wrote:
| I think that's because they needed to support both input and
| output. So there's both -Z and -z. No such thing as an
| uppercase 0 :)
| kragen wrote:
| It _is_ taken in grep, just poorly documented; grep -5 means
| grep -C 5, and grep -0 means grep -C 0. It's not taken in
| sort, though, so I don't know why they didn't use -0 for
| sort.
| l0b0 wrote:
| It's best to give up on any kind of consistency between
| command options. Any project is free to do anything it wants,
| and they all do. Someone is eventually going to come up with
| standard N+1[1] which does things consistently, but they are
| going to have to either recreate a bazillion tools or create
| some sort of huge translation framework configuration on top
| of existing tools to get there. And even then it'll take
| literally decades before people migrate away from the current
| tools. Basically, the sad truth is this isn't going to
| happen.
|
| [1] https://xkcd.com/927/
| [deleted]
| aaaaaaaaaaab wrote:
| I would recommend using -0 instead of -d, as the latter is not
| supported on BSD (and macOS) xargs:
| do_something | tr \\n \\0 | xargs -0 ...
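|
| A quick sketch of what that changes, with two made-up input lines
| (the first contains a space). The inner `sh -c` only reports how
| many arguments it was handed:

```shell
# Default xargs splits on blanks as well as newlines.
default_split=$(printf '%s\n' 'foo bar' 'baz' | xargs sh -c 'echo "$#"' sh)

# Converting newlines to NULs and using -0 splits on lines only.
line_split=$(printf '%s\n' 'foo bar' 'baz' | tr '\n' '\0' | xargs -0 sh -c 'echo "$#"' sh)

echo "default: $default_split arguments, tr | xargs -0: $line_split arguments"
```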
| derriz wrote:
| I wish this was the default behavior of xargs (the 'tr \\n
| \\0 | xargs -0' bit). I don't know why xargs splits on spaces
| and tabs as well as newlines by default and doesn't even have a
| flag to just split on lines.
|
| Ok filenames can theoretically have newlines in them but I'd be
| happy to deal with that weird case. I can't recall ever having
| encountered it in years of using bash on various systems.
|
| Shell pipes would then orthogonally provide the stuff like
| substitution that xargs does in its own unique way (that I
| just can't be bothered learning) - instead you'd just pipe the
| find output through sed or 'grep -v' or whatever you wanted
| before piping into xargs.
|
| I guess that's what aliases are for, but I'm too lazy anymore to bother
| with configuring often short-lived systems all the time.
| fl0wenol wrote:
| xargs defaults to all whitespace because it was designed to
| get around the problem of short argv lengths (like, I'm
| talking 4k or less on older Unix-y systems, sometimes as low
| as 255 bytes).
|
| So the defaults went with principle of least surprise,
| pretending it's like a very long args list that you could
| theoretically enter at the shell, including quotes.
|
| You could, for example, edit the args list in vi and line
| split / indent as you please but not impact the end result.
| ahawkins wrote:
| Xargs ftw!
___________________________________________________________________
(page generated 2021-08-21 23:00 UTC)