# xargs: an example for parallel batch jobs

Last modification on 2023-12-17

This describes a simple shell script programming pattern to process a list of
jobs in parallel. The example script is contained in one file.


# Simple but less optimal example

        #!/bin/sh
        maxjobs=4

        # fake program for example purposes.
        someprogram() {
                echo "Yep yep, I'm totally a real program!"
                sleep "$1"
        }

        # run(arg1, arg2)
        run() {
                echo "[$1] $2 started" >&2
                someprogram "$1" >/dev/null
                status="$?"
                echo "[$1] $2 done" >&2
                return "$status"
        }

        # process the jobs.
        j=1
        for f in 1 2 3 4 5 6 7 8 9 10; do
                run "$f" "something" &

                jm=$((j % maxjobs)) # shell arithmetic: modulo
                test "$jm" = "0" && wait
                j=$((j+1))
        done
        wait


# Why is this less optimal

This is less optimal because it waits until all jobs in the same batch are
finished (each batch contains $maxjobs items).

For example, with 2 items per batch and 4 total jobs it could be:

* Job 1 is started.
* Job 2 is started.
* Job 2 is done.
* Job 1 is done.
* Wait: wait on process status of all background processes.
* Job 3 in new batch is started.


This could be optimized to:

* Job 1 is started.
* Job 2 is started.
* Job 2 is done.
* Job 3 in new batch is started (immediately).
* Job 1 is done.
* ...


It also does not handle signals such as SIGINT (^C). However, the xargs example
below does:


# Example

        #!/bin/sh
        maxjobs=4

        # fake program for example purposes.
        someprogram() {
                echo "Yep yep, I'm totally a real program!"
                sleep "$1"
        }

        # run(arg1, arg2)
        run() {
                echo "[$1] $2 started" >&2
                someprogram "$1" >/dev/null
                status="$?"
                echo "[$1] $2 done" >&2
                return "$status"
        }

        # child process job.
        if test "$CHILD_MODE" = "1"; then
                run "$1" "$2"
                exit "$?"
        fi

        # generate a list of jobs for processing.
        list() {
                for f in 1 2 3 4 5 6 7 8 9 10; do
                        printf '%s\0%s\0' "$f" "something"
                done
        }

        # process jobs in parallel.
        list | CHILD_MODE="1" xargs -r -0 -P "${maxjobs}" -L 2 "$(readlink -f "$0")"


# Run and timings

Although the above example is kind of stupid, it already shows that queueing
the jobs is more efficient.

Script 1:

        time ./script1.sh
        [...snip snip...]
        real    0m22.095s

Script 2:

        time ./script2.sh
        [...snip snip...]
        real    0m18.120s

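The two timings can be reproduced on paper. The following is a minimal sketch,
not part of the original scripts, that models both strategies for the ten jobs
above (job N sleeps N seconds, at most 4 jobs at once). The function names
batch_total and queue_total are made up for this illustration, and the model
ignores process startup overhead.

```shell
#!/bin/sh
# Durations of the ten example jobs: job N sleeps N seconds.
jobs_list="1 2 3 4 5 6 7 8 9 10"
maxjobs=4

# Model of script 1: start $maxjobs jobs, then wait for the whole batch.
# Each batch costs as much as its slowest job.
batch_total() {
	total=0; max=0; n=0
	for d in $jobs_list; do
		test "$d" -gt "$max" && max=$d
		n=$((n + 1))
		if test $((n % maxjobs)) -eq 0; then
			total=$((total + max)); max=0
		fi
	done
	echo $((total + max)) # add the last, possibly partial batch
}

# Model of script 2: a queue where each new job takes the first free slot.
queue_total() {
	# one "free at" timestamp per slot, all free at time 0
	slots=""; i=0
	while test "$i" -lt "$maxjobs"; do slots="$slots 0"; i=$((i + 1)); done
	for d in $jobs_list; do
		# find the slot that becomes free first
		min=""
		for s in $slots; do
			if test -z "$min" || test "$s" -lt "$min"; then min=$s; fi
		done
		# occupy exactly one such slot until min + d
		new=""; replaced=0
		for s in $slots; do
			if test "$replaced" -eq 0 && test "$s" -eq "$min"; then
				s=$((min + d)); replaced=1
			fi
			new="$new $s"
		done
		slots=$new
	done
	# the run ends when the last slot becomes free
	end=0
	for s in $slots; do test "$s" -gt "$end" && end=$s; done
	echo "$end"
}

echo "batches: $(batch_total)s, queue: $(queue_total)s"
```

Under these assumptions the model prints "batches: 22s, queue: 18s", which
matches the measured 22.095s and 18.120s apart from the overhead.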

# How it works

The parent process:

* The parent, using xargs, handles the queue of jobs and schedules each job to
  execute as a child process.
* The list function writes the parameters to stdout. These parameters are
  separated by the NUL byte. The NUL byte is used as the separator because
  this character cannot occur in filenames (which can contain spaces or even
  newlines) and cannot occur in text (the NUL byte terminates the buffer for
  a string).
* The -L option must match the number of arguments that are specified for the
  job. It splits the specified parameters per job.
* The expression "$(readlink -f "$0")" gets the absolute path to the
  shell script itself. This is passed to xargs as the executable to run.
* xargs calls the script itself with the parameters it is being fed.
  The environment variable $CHILD_MODE is set to indicate to the script
  that it is run as a child process of the script.

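To make the grouping of NUL-separated parameters visible, here is a small
sketch. It is an illustration only: the field values and the show function are
made up, and it uses -n 2 (two arguments per invocation) instead of the -L 2
from the script above, so the result does not depend on how an implementation
counts lines in -0 mode.

```shell
#!/bin/sh
# Emit two jobs with two fields each. The fields deliberately contain
# a space and a newline, which would break whitespace-separated input.
list() {
	printf '%s\0%s\0' "file one" "first job"
	printf '%s\0%s\0' "file
two" "second job"
}

# Print each received argument in brackets so the grouping is visible.
show() {
	list | xargs -0 -n 2 sh -c 'printf "[%s] [%s]\n" "$1" "$2"' sh
}

show
```

Each invocation receives exactly two arguments, and the space and the newline
inside the fields arrive intact.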

The child process:

* The command-line arguments are passed by the parent using xargs.

* The environment variable $CHILD_MODE is set to indicate to the script
  that it is run as a child process of the script.

* The script itself (run in child mode) only executes the task and
  signals its status back to xargs and the parent.

* The exit status of the child program is signaled to xargs. This could be
  handled, for example to stop on the first failure (in this example it is not).
  For example, if the program is killed, stopped or exits with status 255, then
  xargs stops running as well.

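The stop-on-255 behaviour can be tried in isolation. This is a hedged sketch,
not taken from the scripts above: the function name stop_on_255 and the three
fake jobs are made up, and the jobs run one at a time (no -P) so the order is
predictable. The exact non-zero exit status of xargs differs between
implementations, so the sketch only prints it.

```shell
#!/bin/sh
# Run three jobs sequentially; job 2 exits with status 255.
# The diagnostic xargs prints when it aborts is discarded.
stop_on_255() {
	printf '%s\n' 1 2 3 |
		xargs -n 1 sh -c 'echo "job $1"; test "$1" = 2 && exit 255; exit 0' sh 2>/dev/null
}

stop_on_255
echo "xargs exit status: $?"
```

Job 3 is never started, because xargs stops reading input after the
invocation that exited with 255.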

# Description of used xargs options

From the OpenBSD man page: https://man.openbsd.org/xargs

        xargs - construct argument list(s) and execute utility

Options explained:

* -r: Do not run the command if there are no arguments. Normally the command
  is executed at least once even if there are no arguments.
* -0: Change xargs to expect NUL ('\0') characters as separators, instead of
  spaces and newlines.
* -P maxprocs: Parallel mode: run at most maxprocs invocations of utility
  at once.
* -L number: Call utility for every number of non-empty lines read. A line
  ending in unescaped white space and the next non-empty line are considered
  to form one single line. If EOF is reached and fewer than number lines have
  been read then utility will be called with the available lines.

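A short sketch of the -r, -0 and -P options in use. The function names
empty_input and parallel are made up for this illustration, and -n 1 is used
to pass one argument per invocation. The output of the parallel run is sorted
because the order in which parallel jobs finish is not defined.

```shell
#!/bin/sh
# -r: with empty input the utility is not run at all.
empty_input() {
	printf '' | xargs -r echo "utility was run"
}

# -0 and -P: read NUL-separated arguments, run up to 3 invocations at once.
parallel() {
	printf '%s\0' a b c |
		xargs -0 -n 1 -P 3 sh -c 'echo "got $1"' sh | sort
}

empty_input
parallel
```

The first function prints nothing, the second prints one "got ..." line per
argument.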

# xargs options -0 and -P, portability and historic context

Some of the options, like -P, are non-POSIX as of writing (2023):
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/xargs.html
However, many systems have supported this useful extension for many years now.

The specification even mentions implementations which support parallel
operations:

"The version of xargs required by this volume of POSIX.1-2017 is required to
wait for the completion of the invoked command before invoking another command.
This was done because historical scripts using xargs assumed sequential
execution. Implementations wanting to provide parallel operation of the invoked
utilities are encouraged to add an option enabling parallel invocation, but
should still wait for termination of all of the children before xargs
terminates normally."


Some historic context:

The xargs -0 option was added on 1996-06-11 by Theo de Raadt, about a year
after the NetBSD import (over 27 years ago at the time of writing):

CVS log: http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/xargs/xargs.c?rev=1.2&content-type=text/x-cvsweb-markup

On OpenBSD the xargs -P option was added on 2003-12-06 by syncing the FreeBSD
code:

CVS log: http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/xargs/xargs.c?rev=1.14&content-type=text/x-cvsweb-markup


Looking at the imported git history log of GNU findutils (which has xargs), the
very first commit already had the -0 and -P options:

git log: https://savannah.gnu.org/git/?group=findutils

        commit c030b5ee33bbec3c93cddc3ca9ebec14c24dbe07
        Author: Kevin Dalley <kevin@seti.org>
        Date:   Sun Feb 4 20:35:16 1996 +0000

            Initial revision


# xargs: some incompatibilities found

* Using the -0 option, empty fields are handled differently in different
  implementations.
* The -n and -L options don't work correctly in many of the BSD implementations.
  Some count empty fields, some don't.  In early implementations in FreeBSD and
  OpenBSD only the first line was processed.  In OpenBSD this was improved
  around 2017.

Depending on what you want to do, a workaround could be to use the -0 option
with a single field and use the -n flag.  Then in each child program invocation
split the field by a separator.

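The workaround could look roughly like the sketch below. It is an assumption
about one way to do it, not the author's code: the separator '|', the variable
child and the function run_all are made up, and the separator is assumed to
never occur inside the first field. The jobs run one at a time here to keep
the output order fixed; adding -P would parallelize them as before.

```shell
#!/bin/sh
# Hypothetical separator, assumed to never occur inside the first field.
sep='|'

# One NUL-terminated field per job: both parameters joined by $sep.
list() {
	for f in 1 2 3; do
		printf '%s%s%s\0' "$f" "$sep" "something"
	done
}

# Child side: $1 is the separator, $2 is the field appended by xargs.
# Split the field at the first separator using parameter expansion.
child='
	sep=$1
	field=$2
	first=${field%%"$sep"*}
	rest=${field#*"$sep"}
	echo "[$first] $rest"
'

# -n 1 passes exactly one field per invocation.
run_all() {
	list | xargs -0 -n 1 sh -c "$child" sh "$sep"
}

run_all
```

Because every job is a single field, the result no longer depends on how an
implementation groups multiple fields with -n or -L.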

# References

* xargs: https://man.openbsd.org/xargs
* printf: https://man.openbsd.org/printf
* ksh, wait: https://man.openbsd.org/ksh#wait
* wait(2): https://man.openbsd.org/wait