states.man - enscript - GNU Enscript
            1 .\"
            2 .\" States manual page.
            3 .\" Copyright (c) 1997-1998 Markku Rossi.
            4 .\" Author: Markku Rossi <mtr@iki.fi>
            5 .\"
            6 .\" This file is part of GNU Enscript.
            7 .\"
            8 .\" Enscript is free software: you can redistribute it and/or modify
            9 .\" it under the terms of the GNU General Public License as published by
           10 .\" the Free Software Foundation, either version 3 of the License, or
           11 .\" (at your option) any later version.
           12 .\"
           13 .\" Enscript is distributed in the hope that it will be useful,
           14 .\" but WITHOUT ANY WARRANTY; without even the implied warranty of
           15 .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
           16 .\" GNU General Public License for more details.
           17 .\"
           18 .\" You should have received a copy of the GNU General Public License
           19 .\" along with Enscript.  If not, see <http://www.gnu.org/licenses/>.
           20 .\"
           21 .TH STATES 1 "Oct 23, 1998" "STATES" "STATES"
           22 
           23 .SH NAME
           24 states \- awk-alike text processing tool
           25 
           26 .SH SYNOPSIS
           27 .B states
           28 [\f3\-hvV\f1]
           29 [\f3\-D \f2var\f3=\f2val\f1]
           30 [\f3\-f \f2file\f1]
           31 [\f3\-o \f2outputfile\f1]
           32 [\f3\-p \f2path\f1]
           33 [\f3\-s \f2startstate\f1]
           34 [\f3\-W \f2level\f1]
           35 [\f2filename\f1 ...]
           36 
           37 .SH DESCRIPTION
           38 
           39 \f3States\f1 is an awk-alike text processing tool with some state
           40 machine extensions.  It is designed for program source code
           41 highlighting and for similar tasks where state information helps input
           42 processing.
           43 
           44 At any point in time, \f3states\f1 is in exactly one state.  Each state
           45 is quite similar to awk's work environment: it has regular expressions
           46 that are matched against the input and actions that are executed when a
           47 match is found.  From the action blocks, \f3states\f1 can perform state
           48 transitions; it can move to another state from which the processing is
           49 continued.  State transitions are recorded so \f3states\f1 can return
           50 to the calling state once the current state has finished.
           51 
           52 The biggest difference between \f3states\f1 and awk, besides state
           53 machine extensions, is that \f3states\f1 is not line-oriented.  It
           54 matches regular expression tokens from the input and once a match is
           55 processed, it continues processing from the current position, not from
           56 the beginning of the next input line.
           57 
           58 .SH OPTIONS
           59 .TP 8
           60 .B \-D \f2var\f3=\f2val\f3, \-\-define=\f2var\f3=\f2val\f3
           61 Define variable \f2var\f1 to have string value \f2val\f1.  Command
           62 line definitions override variable definitions found in the config
           63 file.
           64 .TP 8
           65 .B \-f \f2file\f3, \-\-file=\f2file\f3
           66 Read state definitions from file \f2file\f1.  By default,
           67 \f3states\f1 tries to read state definitions from file \f3states.st\f1
           68 in the current working directory.
           69 .TP 8
           70 .B \-h, \-\-help
           71 Print short help message and exit.
           72 .TP 8
           73 .B \-o \f2file\f3, \-\-output=\f2file\f3
           74 Save output to file \f2file\f1 instead of printing it to
           75 \f3stdout\f1.
           76 .TP 8
           77 .B \-p \f2path\f3, \-\-path=\f2path\f3
           78 Set the load path to \f2path\f1.  The load path defaults to the
           79 directory from which the state definitions file is loaded.
           80 .TP 8
           81 .B \-s \f2state\f3, \-\-state=\f2state\f3
           82 Start execution from state \f2state\f1.  This definition overrides
           83 the start state resolved from the \f3start\f1 block.
           84 .TP 8
           85 .B \-v, \-\-verbose
           86 Increase the program verbosity.
           87 .TP 8
           88 .B \-V, \-\-version
           89 Print \f3states\f1 version and exit.
           90 .TP 8
           91 .B \-W \f2level\f3, \-\-warning=\f2level\f3
           92 Set the warning level to \f2level\f1.  Possible values for \f2level\f1
           93 are:
           94 .RS 8
           95 .TP 8
           96 .B light
           97 light warnings (default)
           98 .TP 8
           99 .B all
          100 all warnings
          101 .RE
          102 
          103 .SH STATES PROGRAM FILES
          104 
          105 \f3States\f1 program files can contain one \f2start\f1 block,
          106 \f2startrules\f1 and \f2namerules\f1 blocks to specify the initial
          107 state, \f2state\f1 definitions and \f2expressions\f1.
          108 
          109 The \f2start\f1 block is the main() of the \f3states\f1 program; it is
          110 executed on script startup for each input file and it can perform any
          111 initialization the script needs.  It normally also calls the
          112 \f3check_startrules()\f1 and \f3check_namerules()\f1 primitives which
          113 resolve the initial state from the input file name or the data found
          114 from the beginning of the input file.  Here is a sample start block
          115 which initializes two variables and does the standard start state
          116 resolving:
          117 .PP
          118 .RS
          119 .nf
          120 start
          121 {
          122   a = 1;
          123   msg = "Hello, world!";
          124   check_startrules ();
          125   check_namerules ();
          126 }
          127 .fi
          128 .RE
          129 .PP
          130 Once the start block is processed, the input processing is continued
          131 from the initial state.
          132 
          133 The initial state is resolved from the information found in the
          134 \f2startrules\f1 and \f2namerules\f1 blocks.  Both blocks contain
          135 regular expression - symbol pairs.  When the regular expression
          136 matches the name or the beginning of the input file, the
          137 initial state is named by the corresponding symbol.  For example, the
          138 following start and name rules can distinguish C and Fortran files:
          139 .PP
          140 .RS
          141 .nf
          142 namerules
          143 {
          144   /\\.(c|h)$/    c;
          145   /\\.[fF]$/     fortran;
          146 }
          147 
          148 startrules
          149 {
          150   /-\\*- [cC] -\\*-/      c;
          151   /-\\*- fortran -\\*-/   fortran;
          152 }
          153 .fi
          154 .RE
          155 .PP
          156 If these rules are used with the previously shown start block,
          157 \f3states\f1 first checks the beginning of the input file.  If it has
          158 string \f3-*- c -*-\f1, the file is assumed to contain C code and the
          159 processing is started from the state called \f3c\f1.  If the beginning of
          160 the input file has string \f3-*- fortran -*-\f1, the initial state is
          161 \f3fortran\f1.  If none of the start rules match, the name of the
          162 input file is matched against the namerules.  If the name ends with suffix
          163 \f3.c\f1 or \f3.h\f1, we go to state \f3c\f1.  If the suffix is
          164 \f3.f\f1 or \f3.F\f1, the initial state is \f3fortran\f1.
          165 
          166 If both start and name rules fail to resolve the start state,
          167 \f3states\f1 just copies its input to output unmodified.
          168 
          169 The start state can also be specified from the command line with
          170 option \f3\-s\f1, \f3\-\-state\f1.
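              .PP
              For example, assuming a state definitions file \f3c.st\f1 that defines
              a state \f3c\f1 (the file names here are illustrative only), the
              following command would process \f3hello.c\f1 starting from state
              \f3c\f1 and save the result to \f3out.txt\f1:
              .PP
              .RS
              .nf
              states \-f c.st \-s c \-o out.txt hello.c
              .fi
              .RE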
          171 
          172 State definitions have the following syntax:
          173 
          174 .B state \f2name\f3 { \f2expr\f3 { \f2statements\f3 } ... }
          175 
          176 where \f2name\f1 is the name of the state, \f2expr\f1 is a regular
          177 expression, a special expression or a symbol, and \f2statements\f1 is a
          178 list of statements.  When the expression \f2expr\f1 is matched from the
          179 input, the statement block is executed.  The statement block can call
          180 \f3states\f1' primitives, user-defined subroutines, other states, etc.
          181 Once the block is executed, the input processing is continued from the
          182 current input position (which might have been changed if the statement
          183 block called other states).
          184 
          185 Special expressions \f3BEGIN\f1 and \f3END\f1 can be used in the place
          186 of \f2expr\f1.  Expression \f3BEGIN\f1 matches the beginning of the
          187 state; its block is called when the state is entered.  Expression
          188 \f3END\f1 matches the end of the state, its block is executed when
          189 \f3states\f1 leaves the state.
          190 
          191 If \f2expr\f1 is a symbol, its value is looked up from the global
          192 environment and if it is a regular expression, it is matched to the
          193 input, otherwise that rule is ignored.
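              .PP
              As a minimal sketch of the syntax described above, the following two
              states wrap C comments in \f3<i>\f1 tags.  The state names and the tags
              are illustrative only, and the sketch assumes that input which matches
              no rule passes through to the output unchanged:
              .PP
              .RS
              .nf
              state c_comment
              {
                /* Executed when the state is entered. */
                BEGIN {
                  print ("<i>/*");
                }
                /* End of comment: print it and return to the calling state. */
                /\\*\\// {
                  print ("*/");
                  return;
                }
                /* Executed when the state is left. */
                END {
                  print ("</i>");
                }
              }

              state c
              {
                /* Start of comment: continue processing in c_comment. */
                /\\/\\*/ {
                  call (c_comment);
                }
              }
              .fi
              .RE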
          194 
          195 The \f3states\f1 program file can also have top-level expressions;
          196 they are evaluated after the program file is parsed but before any
          197 input files are processed or the \f2start\f1 block is evaluated.
          198 
          199 .SH PRIMITIVE FUNCTIONS
          200 
          201 .TP 8
          202 .B call (\f2symbol\f3)
          203 Move to state \f2symbol\f1 and continue input file processing from
          204 that state.  Function returns whatever the \f2symbol\f1 state's
          205 terminating \f3return\f1 statement returned.
          206 .TP 8
          207 .B calln (\f2name\f3)
          208 Like \f3call\f1 but the argument \f2name\f1 is evaluated and its value
          209 must be a string.  For example, this function can be used to call a
          210 state whose name is stored in a variable.
          211 .TP 8
          212 .B check_namerules ()
          213 Try to resolve start state from \f3namerules\f1 rules.  Function
          214 returns \f31\f1 if start state was resolved or \f30\f1 otherwise.
          215 .TP 8
          216 .B check_startrules ()
          217 Try to resolve start state from \f3startrules\f1 rules.  Function
          218 returns \f31\f1 if start state was resolved or \f30\f1 otherwise.
          219 .TP 8
          220 .B concat (\f2str\f3, ...)
          221 Concatenate argument strings and return the result as a new string.
          222 .TP 8
          223 .B float (\f2any\f3)
          224 Convert argument to a floating point number.
          225 .TP 8
          226 .B getenv (\f2str\f3)
          227 Get value of environment variable \f2str\f1.  Returns an empty string
          228 if variable \f2str\f1 is undefined.
          229 .TP 8
          230 .B int (\f2any\f3)
          231 Convert argument to an integer number.
          232 .TP 8
          233 .B length (\f2item\f3, ...)
          234 Count the length of argument strings or lists.
          235 .TP 8
          236 .B list (\f2any\f3, ...)
          237 Create a new list which contains items \f2any\f1, ...
          238 .TP 8
          239 .B panic (\f2any\f3, ...)
          240 Report a non-recoverable error and exit with status \f31\f1.  Function
          241 never returns.
          242 .TP 8
          243 .B print (\f2any\f3, ...)
          244 Convert arguments to strings and print them to the output.
          245 .TP 8
          246 .B range (\f2source\f3, \f2start\f3, \f2end\f3)
          247 Return a sub\-range of \f2source\f1 starting from position \f2start\f1
          248 (inclusively) to \f2end\f1 (exclusively).  Argument \f2source\f1 can
          249 be string or list.
          250 .TP 8
          251 .B regexp (\f2string\f3)
          252 Convert string \f2string\f1 to a new regular expression.
          253 .TP 8
          254 .B regexp_syntax (\f2char\f3, \f2syntax\f3)
          255 Modify regular expression character syntaxes by assigning new
          256 syntax \f2syntax\f1 for character \f2char\f1.  Possible values for
          257 \f2syntax\f1 are:
          258 .RS 8
          259 .TP 8
          260 .B 'w'
          261 character is a word constituent
          262 .TP 8
          263 .B ' '
          264 character isn't a word constituent
          265 .RE
          266 .TP 8
          267 .B regmatch (\f2string\f3, \f2regexp\f3)
          268 Check if string \f2string\f1 matches regular expression \f2regexp\f1.
          269 The function returns a boolean success status and sets sub-expression
          270 registers \f3$\f2n\f1.
          271 .TP 8
          272 .B regsub (\f2string\f3, \f2regexp\f3, \f2subst\f3)
          273 Search regular expression \f2regexp\f1 from string \f2string\f1 and
          274 replace the matching substring with string \f2subst\f1.  Returns the
          275 resulting string.  The substitution string \f2subst\f1 can contain
          276 \f3$\f2n\f1 references to the \f2n\f1:th parenthesized
          277 sub-expression.
          278 .TP 8
          279 .B regsuball (\f2string\f3, \f2regexp\f3, \f2subst\f3)
          280 Like \f3regsub\f1 but replace all matches of regular expression
          281 \f2regexp\f1 from string \f2string\f1 with string \f2subst\f1.
          282 .TP 8
          283 .B require_state (\f2symbol\f3)
          284 Check that the state \f2symbol\f1 is defined.  If the required state
          285 is undefined, the function tries to autoload it.  If the loading
          286 fails, the program will terminate with an error message.
          287 .TP 8
          288 .B split (\f2regexp\f3, \f2string\f3)
          289 Split string \f2string\f1 into a list, treating matches of regular
          290 expression \f2regexp\f1 as item separators.
          291 .TP 8
          292 .B sprintf (\f2fmt\f3, ...)
          293 Format arguments according to \f2fmt\f1 and return result as a
          294 string.
          295 .TP 8
          296 .B strcmp (\f2str1\f3, \f2str2\f3)
          297 Perform a case\-sensitive comparison for strings \f2str1\f1 and
          298 \f2str2\f1.  Function returns a value that is:
          299 .RS 8
          300 .TP 8
          301 .B -1
          302 string \f2str1\f1 is less than \f2str2\f1
          303 .TP 8
          304 .B 0
          305 strings are equal
          306 .TP 8
          307 .B 1
          308 string \f2str1\f1 is greater than \f2str2\f1
          309 .RE
          310 .TP 8
          311 .B string (\f2any\f3)
          312 Convert argument to string.
          313 .TP 8
          314 .B strncmp (\f2str1\f3, \f2str2\f3, \f2num\f3)
          315 Perform a case\-sensitive comparison for strings \f2str1\f1 and
          316 \f2str2\f1 comparing at most \f2num\f1 characters.
          317 .TP 8
          318 .B substring (\f2str\f3, \f2start\f3, \f2end\f3)
          319 Return a substring of string \f2str\f1 starting from position
          320 \f2start\f1 (inclusively) to \f2end\f1 (exclusively).
          321 .\"
          322 
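              .PP
              The regular expression primitives can be combined as in the following
              sketch.  The variable \f3name\f1 and its value are illustrative only,
              and the \f3if\f1 statement, which this page does not describe, is
              assumed to behave as in C:
              .PP
              .RS
              .nf
              start
              {
                name = "report.txt";

                /* regmatch sets $1 to the first parenthesized sub-expression. */
                if (regmatch (name, /^(.*)\\.txt$/))
                  print ("base name: ", $1, "\\n");

                /* regsub returns a new string with the first match replaced. */
                print (regsub (name, /\\.txt$/, ".ps"), "\\n");
              }
              .fi
              .RE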
          323 .SH BUILTIN VARIABLES
          324 .TP 8
          325 .B $.
          326 current input line number
          327 .TP 8
          328 .B $\f2n\f3
          329 the \f2n\f1:th parenthesized regular expression sub-expression from the
          330 latest state regular expression or from the \f3regmatch\f1 primitive
          331 .TP 8
          332 .B $`
          333 everything before the matched regular expression.  This is usable
          334 when used with the \f3regmatch\f1 primitive; the contents of this
          335 variable are undefined when used in action blocks to refer to the data
          336 before the block's regular expression.
          337 .TP 8
          338 .B $B
          339 an alias for \f3$`\f1
          340 .TP 8
          341 .B argv
          342 list of input file names
          343 .TP 8
          344 .B filename
          345 name of the current input file
          346 .TP 8
          347 .B program
          348 name of the program (usually \f3states\f1)
          349 .TP 8
          350 .B version
          351 program version string
          352 .\"
          353 
          354 .SH FILES
          355 .nf
          356 .ta 4i
          357 @DATADIR@/enscript/hl/*.st        enscript's states definitions
          358 .fi
          359 
          360 .SH SEE ALSO
          361 awk(1), enscript(1)
          362 
          363 .SH AUTHOR
          364 Markku Rossi <mtr@iki.fi> <http://www.iki.fi/~mtr/>
          365 
          366 GNU Enscript WWW home page: <http://www.iki.fi/~mtr/genscript/>