tsignals.rst - pism - [fork] customized build of PISM, the parallel ice sheet model (tillflux branch)
.. include:: ../../global.txt

.. |pid| replace:: *PID*\s

.. _sec-signal:

Signals, to control a running PISM model
----------------------------------------

Ice sheet model runs can take a long time, so it is often useful to check the state of a
run while it is in progress. Sometimes a run must be stopped, but in a way that allows it
to be restarted. PISM implements these behaviors using "signals" from the POSIX standard,
which are available on Linux and most flavors of Unix. Signals are sent with the ``kill``
command. :numref:`tab-signals` summarizes how PISM responds to signals.
A convenient alternative to ``kill``, for Linux users, is ``pkill``, which finds processes
by executable name. Thus "``pkill -USR1 pismr``" might be used to send all PISM processes
the same signal, avoiding an explicit list of |pid|.

.. list-table:: Signalling running PISM processes.  "|pid|" stands for the list of process identifiers of all the PISM processes.
   :name: tab-signals
   :header-rows: 1
   :widths: 1,1,2

   * - Command
     - Signal
     - PISM behavior
   * - ``kill -KILL`` |pid|
     - ``SIGKILL``
     - Terminate with extreme prejudice. PISM cannot catch it and no state is saved.
   * - ``kill -TERM`` |pid|
     - ``SIGTERM``
     - End process(es), but save the last model state in the output file, using the ``-o``
       name or the default name as normal. Note that the ``history`` string in the output
       file will contain an "``EARLY EXIT caused by signal SIGTERM``" indication.
   * - ``kill -USR1`` |pid|
     - ``SIGUSR1``
     - Process(es) will continue after saving the model state at the end of the current
       time step, using a file name including the current model year. Time-stepping is not
       altered. Also flushes output buffers of scalar time-series.
   * - ``kill -USR2`` |pid|
     - ``SIGUSR2``
     - Just flush time-series output buffers.

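
The difference between a catchable signal (``SIGTERM``) and an uncatchable one
(``SIGKILL``) can be tried out without PISM. The following is only a minimal sketch: the
"worker" is a hypothetical stand-in shell loop, not PISM, and the names ``start_worker``,
``state_file`` and ``ready_file`` are invented for this illustration. The worker installs
a ``SIGTERM`` handler that writes a small state file before exiting; no handler can be
installed for ``SIGKILL``.

```shell
#!/bin/sh
# Stand-in for a long model run (hypothetical, NOT PISM): it loops until it is
# signalled. On SIGTERM it writes a state file and exits; SIGKILL cannot be trapped.
workdir=$(mktemp -d)

start_worker() {
    state_file=$1   # file the worker writes when it catches SIGTERM
    ready_file=$2   # file the worker creates once its handler is installed
    (
        trap 'echo "saved on SIGTERM" > "$state_file"; exit 0' TERM
        : > "$ready_file"
        while :; do sleep 1; done
    ) &
    worker_pid=$!
    # Do not send anything before the handler is in place.
    while [ ! -e "$ready_file" ]; do sleep 1; done
}

# Case 1: SIGTERM is caught, so the worker saves its state and exits cleanly.
start_worker "$workdir/term_state.txt" "$workdir/ready1"
kill -TERM "$worker_pid"
term_status=0
wait "$worker_pid" || term_status=$?

# Case 2: SIGKILL ends the worker immediately; the handler never runs.
start_worker "$workdir/kill_state.txt" "$workdir/ready2"
kill -KILL "$worker_pid"
kill_status=0
wait "$worker_pid" || kill_status=$?

[ -e "$workdir/term_state.txt" ] && echo "SIGTERM: state saved"
[ -e "$workdir/kill_state.txt" ] || echo "SIGKILL: no state saved"
```

The shell runs the handler only after the current ``sleep`` returns, which loosely mirrors
a program that acts on a signal at the end of its current unit of work.
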
Here is an example. Suppose we start a long verification run in the background, with
standard output redirected into a file:

.. code-block:: none

   pismv -test G -Mz 101 -y 1e6 -o testGmillion.nc >> log.txt &

This run gets a Unix process id, which we assume is "8920". (Get it using ``ps`` or
``pgrep``.) If we want to observe the run without stopping it, we send the ``USR1`` signal:

.. code-block:: none

   kill -USR1 8920

(With ``pkill`` one can usually type "``pkill -USR1 pismv``".) Suppose it happens that we
caught the run at year 31871.5. Then, for example, a NetCDF file ``pismv-31871.495.nc`` is
produced. Note also that in the standard output log file ``log.txt`` the line

.. code-block:: none

   caught signal SIGUSR1:  Writing intermediate file ... and flushing time series.

appears around that time step. Suppose, on the other hand, that the run needs to be
stopped. Then a graceful way is

.. code-block:: none

   kill -TERM 8920

because the model state is saved and can be inspected.
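
The sequence in this example (start in the background, note the process id, request a
snapshot with ``USR1``, then stop with ``TERM``) can be rehearsed with a stand-in process.
This is a sketch under stated assumptions: the background loop below is hypothetical and
only imitates the behavior described above, and the file names ``snapshot-N.txt`` and
``final.txt`` as well as the log messages are invented for the illustration. It uses
``$!`` to obtain the process id instead of ``ps`` or ``pgrep``.

```shell
#!/bin/sh
# Rehearsal of the workflow with a hypothetical stand-in, NOT PISM.
workdir=$(mktemp -d)
log="$workdir/log.txt"

(
    step=0
    # USR1: write a snapshot named after the current step, then keep running.
    trap 'echo "caught signal SIGUSR1: writing snapshot at step $step"; : > "$workdir/snapshot-$step.txt"' USR1
    # TERM: write a final state file, then exit.
    trap 'echo "caught signal SIGTERM: saving state and exiting"; : > "$workdir/final.txt"; exit 0' TERM
    : > "$workdir/ready"
    while :; do step=$((step + 1)); sleep 1; done
) >> "$log" &
run_pid=$!   # process id of the background run

# Wait until the handlers are installed.
while [ ! -e "$workdir/ready" ]; do sleep 1; done

# Ask for a snapshot without stopping the run, then wait for it to appear.
kill -USR1 "$run_pid"
until ls "$workdir"/snapshot-*.txt >/dev/null 2>&1; do sleep 1; done

# Signal 0 sends nothing; it only checks that the process still exists.
alive_after_usr1=no
kill -0 "$run_pid" 2>/dev/null && alive_after_usr1=yes

# Stop gracefully and wait for the run to finish.
kill -TERM "$run_pid"
wait "$run_pid" || true

cat "$log"
```

After the script finishes, the log contains one line per signal, and both the snapshot file
and the final state file exist in the temporary directory.
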