# xargs: an example for parallel batch jobs

Last modification on 2023-12-17

This describes a simple shell script programming pattern to process a list of
jobs in parallel. The example script is contained in a single file.


# Simple but less optimal example

    #!/bin/sh
    maxjobs=4

    # fake program for example purposes.
    someprogram() {
        echo "Yep yep, I'm totally a real program!"
        sleep "$1"
    }

    # run(arg1, arg2)
    run() {
        echo "[$1] $2 started" >&2
        someprogram "$1" >/dev/null
        status="$?"
        echo "[$1] $2 done" >&2
        return "$status"
    }

    # process the jobs.
    j=1
    for f in 1 2 3 4 5 6 7 8 9 10; do
        run "$f" "something" &

        jm=$((j % maxjobs)) # shell arithmetic: modulo
        test "$jm" = "0" && wait
        j=$((j+1))
    done
    wait


# Why is this less optimal

This is less optimal because it waits until all jobs in the same batch are
finished (each batch contains $maxjobs items) before it starts the next batch.

For example, with 2 items per batch and 4 total jobs it could be:

* Job 1 is started.
* Job 2 is started.
* Job 2 is done.
* Job 1 is done.
* Wait: wait on process status of all background processes.
* Job 3 in new batch is started.


This could be optimized to:

* Job 1 is started.
* Job 2 is started.
* Job 2 is done.
* Job 3 in new batch is started (immediately).
* Job 1 is done.
* ...


It also does not handle signals such as SIGINT (^C). However, the xargs example
below does:


# Example

    #!/bin/sh
    maxjobs=4

    # fake program for example purposes.
    someprogram() {
        echo "Yep yep, I'm totally a real program!"
        sleep "$1"
    }

    # run(arg1, arg2)
    run() {
        echo "[$1] $2 started" >&2
        someprogram "$1" >/dev/null
        status="$?"
        echo "[$1] $2 done" >&2
        return "$status"
    }

    # child process job.
    if test "$CHILD_MODE" = "1"; then
        run "$1" "$2"
        exit "$?"
    fi

    # generate a list of jobs for processing.
    list() {
        for f in 1 2 3 4 5 6 7 8 9 10; do
            printf '%s\0%s\0' "$f" "something"
        done
    }

    # process jobs in parallel.
    list | CHILD_MODE="1" xargs -r -0 -P "${maxjobs}" -L 2 "$(readlink -f "$0")"


# Run and timings

Although the above example is contrived, it already shows that queueing the
jobs is more efficient.

Script 1:

    time ./script1.sh
    [...snip snip...]
    real    0m22.095s

Script 2:

    time ./script2.sh
    [...snip snip...]
    real    0m18.120s


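The difference between the two timings can be checked with arithmetic alone.
The sketch below simulates both schedules for the ten jobs of the examples,
which sleep 1 to 10 seconds, using 4 slots. It runs no jobs and ignores process
startup time, so it is an idealized model. The names batch_total and
queue_total are made up for this illustration.

```shell
#!/bin/sh
maxjobs=4
durations="1 2 3 4 5 6 7 8 9 10"

# batched: every batch takes as long as its slowest job.
batch_total() {
    total=0
    max=0
    j=0
    for d in $durations; do
        test "$d" -gt "$max" && max="$d"
        j=$((j + 1))
        if test "$((j % maxjobs))" = "0"; then
            total=$((total + max))
            max=0
        fi
    done
    # add the last, possibly partial, batch.
    echo "$((total + max))"
}

# queued: a job starts as soon as any slot is free.
queue_total() {
    # the positional parameters hold the time at which each slot is free.
    set --
    i=0
    while test "$i" -lt "$maxjobs"; do
        set -- "$@" 0
        i=$((i + 1))
    done
    for d in $durations; do
        # find the slot that is free first.
        min="$1"
        for t in "$@"; do
            test "$t" -lt "$min" && min="$t"
        done
        # rebuild the slot list, giving the job to the first such slot.
        placed=0
        for t in "$@"; do
            if test "$placed" = "0" && test "$t" = "$min"; then
                t=$((t + d))
                placed=1
            fi
            set -- "$@" "$t"
            shift
        done
    done
    # the total time is the moment the last slot becomes free.
    max=0
    for t in "$@"; do
        test "$t" -gt "$max" && max="$t"
    done
    echo "$max"
}

echo "batched: $(batch_total)s"
echo "queued: $(queue_total)s"
```

It prints 22 seconds for the batched loop and 18 seconds for the queue, which
agrees with the measured timings once a little startup overhead is added.
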
# How it works

The parent process:

* The parent, using xargs, handles the queue of jobs and schedules each job to
  execute as a child process.
* The list function writes the parameters to stdout. These parameters are
  separated by the NUL byte. The NUL byte is used as the separator because it
  cannot be used in filenames (which can contain spaces or even newlines) and
  cannot be used in text (the NUL byte terminates the buffer for a string).
* The -L option must match the number of arguments that are specified for the
  job. It splits the specified parameters per job.
* The expression "$(readlink -f "$0")" gets the absolute path to the
  shell script itself. This is passed to xargs as the executable to run.
* xargs calls the script itself with the parameters it is being fed.
  The environment variable $CHILD_MODE is set to indicate to the script itself
  that it is run as a child process of the script.


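The NUL-separated parameter passing described above can be tried in isolation.
This is a minimal sketch with a made-up job list and a made-up function name,
show_jobs. It uses -n 2 rather than -L 2, which is a substitution for the sake
of the sketch, and an inline sh -c command instead of the script calling itself.

```shell
#!/bin/sh
# hypothetical job list: two NUL-terminated fields per job. The second job
# has a space and a newline in its parameters.
list() {
    printf '%s\0%s\0' "job 1" "first" "job 2" "second
line"
}

# -n 2 groups the fields in pairs: each invocation gets two arguments.
show_jobs() {
    list | xargs -0 -n 2 sh -c 'printf "<%s> <%s>\n" "$1" "$2"' sh
}

show_jobs
```

The space and the newline arrive in the child intact, because only the NUL byte
ends a field.
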
The child process:

* The command-line arguments are passed by the parent using xargs.

* The environment variable $CHILD_MODE is set to indicate to the script itself
  that it is run as a child process of the script.

* The script itself (run in child mode) only executes the task and signals its
  status back to xargs and the parent.

* The exit status of the child program is signaled to xargs. This could be
  handled, for example to stop on the first failure (in this example it is not).
  For example, if the program is killed, stopped or the exit status is 255 then
  xargs stops running as well.


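The exit status 255 behaviour can be seen with a small sketch. The job below is
hypothetical: it prints its argument and fails with status 255 on the second
item. The function name stop_on_255 is made up, and the diagnostic message of
xargs is discarded because its wording differs per implementation.

```shell
#!/bin/sh
# run three jobs one at a time; the second one exits with status 255.
stop_on_255() {
    printf '%s\n' 1 2 3 | xargs -n 1 sh -c '
        echo "job $1"
        test "$1" = "2" && exit 255
        exit 0
    ' sh 2>/dev/null
}

stop_on_255
echo "xargs exit status: $?"
```

Job 3 is never started, and xargs itself exits with a non-zero status. The
exact value depends on the implementation.
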
# Description of used xargs options

From the OpenBSD man page: https://man.openbsd.org/xargs

    xargs - construct argument list(s) and execute utility

Options explained:

* -r: Do not run the command if there are no arguments. Normally the command
  is executed at least once even if there are no arguments.
* -0: Change xargs to expect NUL ('\0') characters as separators, instead of
  spaces and newlines.
* -P maxprocs: Parallel mode: run at most maxprocs invocations of utility
  at once.
* -L number: Call utility for every number of non-empty lines read. A line
  ending in unescaped white space and the next non-empty line are considered
  to form one single line. If EOF is reached and fewer than number lines have
  been read then utility will be called with the available lines.


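Two of these options are easy to try on their own. The following sketch is an
illustration only, with made-up function names and input. It assumes an xargs
that accepts the non-POSIX -r and -0 options.

```shell
#!/bin/sh
# -r: with empty input the utility is not run at all, so nothing is printed.
empty_input() {
    printf '' | xargs -r echo "this is not printed"
}

# -0: fields end at a NUL byte, so spaces and newlines stay inside a field.
nul_fields() {
    printf 'a b\0c\nd\0' | xargs -0 -n 1 printf '[%s]\n'
}

empty_input
nul_fields
```

The second function prints two fields: one containing a space and one
containing a newline.
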
# xargs options -0 and -P, portability and historic context

Some of the options, like -P, are non-POSIX as of writing (2023):
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/xargs.html
However, many systems have supported this useful extension for many years.

The specification even mentions implementations which support parallel
operations:

"The version of xargs required by this volume of POSIX.1-2017 is required to
wait for the completion of the invoked command before invoking another command.
This was done because historical scripts using xargs assumed sequential
execution. Implementations wanting to provide parallel operation of the invoked
utilities are encouraged to add an option enabling parallel invocation, but
should still wait for termination of all of the children before xargs
terminates normally."


Some historic context:

The xargs -0 option was added on 1996-06-11 by Theo de Raadt, about a year
after the NetBSD import (over 27 years ago at the time of writing):

CVS log: http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/xargs/xargs.c?rev=1.2&content-type=text/x-cvsweb-markup

On OpenBSD the xargs -P option was added on 2003-12-06 by syncing the FreeBSD
code:

CVS log: http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/xargs/xargs.c?rev=1.14&content-type=text/x-cvsweb-markup


Looking at the imported git history log of GNU findutils (which has xargs), the
very first commit already had the -0 and -P options:

git log: https://savannah.gnu.org/git/?group=findutils

    commit c030b5ee33bbec3c93cddc3ca9ebec14c24dbe07
    Author: Kevin Dalley <kevin@seti.org>
    Date:   Sun Feb 4 20:35:16 1996 +0000

        Initial revision


# xargs: some incompatibilities found

* With the -0 option, empty fields are handled differently in different
  implementations.
* The -n and -L options don't work correctly in many of the BSD implementations.
  Some count empty fields, some don't. Early implementations in FreeBSD and
  OpenBSD only processed the first line. In OpenBSD this has been improved
  around 2017.

Depending on what you want to do, a workaround could be to use the -0 option
with a single field per job and use the -n flag. Then, in each child program
invocation, split the field by a separator.


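A minimal sketch of this workaround, under the assumption that the parameters
never contain a tab, so a tab can serve as the separator inside the single
field. The job list and the function names are made up for illustration.

```shell
#!/bin/sh
# hypothetical job list: ONE NUL-terminated field per job, with the two
# parameters joined by a tab.
list() {
    printf '%s\t%s\0' 1 "something" 2 "other thing"
}

# -n 1 passes one field per invocation; the child splits it on the tab.
split_jobs() {
    list | xargs -0 -n 1 sh -c '
        tab=$(printf "\t")
        arg1=${1%%"$tab"*} # text before the first tab
        arg2=${1#*"$tab"}  # text after the first tab
        printf "arg1=%s arg2=%s\n" "$arg1" "$arg2"
    ' sh
}

split_jobs
```

Because every job is exactly one field, this does not rely on how an
implementation groups or counts multiple fields.
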
# References

* xargs: https://man.openbsd.org/xargs
* printf: https://man.openbsd.org/printf
* ksh, wait: https://man.openbsd.org/ksh#wait
* wait(2): https://man.openbsd.org/wait