.TH CHAINS "PTU"
.SH NAME
chains \- tracks chains of words in texts
.SH SYNOPSIS
.B chains [options] text [weightfile]


.SH DESCRIPTION
Accepts a text file, optionally together with a weights file, as input
and prints, for every sentence in the text file, the number of active
word chains. If weights files are used, they can have one of these formats:

           filenumber   word     weight
           filenumber   weight   word

for document-word weights, such as the tf.idf, or

           weight word
           word weight

for plain word weights, such as the discrimination value.
   A list with synonyms may also be used.
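.PP
The counting described above can be sketched in Python. This is only an
illustration under stated assumptions, not the program's actual code: the
function name count_active_chains and the parameter max_gap are hypothetical,
a chain is assumed to link two occurrences of a word that are at most
max_gap sentences apart (compare the \-c option), and tokenizing is reduced
to splitting on whitespace.
.nf
```python
def count_active_chains(sentences, max_gap=6):
    """Return, for each sentence, the number of word chains active in it.

    Assumption: a chain links two consecutive occurrences of a word that
    are at most max_gap sentences apart, and it is active in every
    sentence from the first occurrence up to and including the second.
    """
    covered = [set() for _ in sentences]  # words with an active chain
    last_seen = {}  # word -> index of the sentence where it last occurred
    for i, sentence in enumerate(sentences):
        for word in set(sentence.lower().split()):
            j = last_seen.get(word)
            if j is not None and i - j <= max_gap:
                # The chain is unbroken: mark sentences j..i for this word.
                for k in range(j, i + 1):
                    covered[k].add(word)
            last_seen[word] = i
    return [len(words) for words in covered]


if __name__ == "__main__":
    text = ["the cat sat", "a dog ran", "the cat slept", "birds sing"]
    for number, count in enumerate(count_active_chains(text), start=1):
        print(number, count)
```
.fi
.PP
Whether the real program counts the gap inclusively, as this sketch does,
is not stated here; treat the boundary as a guess.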

.SH OPTIONS

.TP
.I "\-A" 
Write the complete command line as a comment to stdout.
   
.TP
.I "\-a" 
Restrict words to the letters a-z (default is alphanumeric, [a-z0-9]).
   
.TP
.I "\-c<n>" 
Maximum chain length (default 6). If a token does not recur before this number of sentences has passed, the chain is considered broken.
      
.TP
.I "\-d" 
Print debug information. Not very exciting.

.TP
.I "\-h" 
Help (this message).

.TP
.I "\-i" 
add sentence-codes to output.

.TP
.I "\-n" 
The text has no codes in front of the sentences (such codes end with ']').

.TP
.I "\-N<n>"
Break the text into a Markov chain of character n-grams of length <n>.

.TP
.I "\-l<n>" 
Number of sentences to combine into one artificial sentence (default=1).

.TP
.I "\-L<n>" 
Minimum length of words to be considered (default=2).

.TP
.I "\-o<name>" 
discard sentences with strings from file 'name'

.TP
.I "\-O<name>" 
discard sentences without strings from 'name'
.TP
.I "\-q<name>" 
use list of stop words
.TP
.I "\-Q<name>" 
ignore all but list of obligate words
.TP
.I "\-r<recsep>" 
recognize recsep as record-separator.
.TP
.I "\-R<name>" 
make file with record- and line-numbers.
.TP
.I "\-s<n>" 
Use artificial lines of <n> tokens instead of lines
(default=20 tokens).
.TP
.I "\-S<name>" 
Print this artificial file to the file 'name'.
.TP
.I "\-t<float>" 
Threshold for the word weights in the weights file. Words with a weight equal to or below that threshold are ignored.

.TP
.I "\-T<name>" 
Use a file with synonyms of the format:
   
   word synonym
   
So that every 'word' in the text will be read as 'synonym'.
.TP
.I "\-v" 
verbose
.TP
.I "\-w0" 
compute all (only with weights-file)
.TP
.I "\-w1" 
compute only nouns
.TP
.I "\-w2" 
compute only verbs
.TP
.I "\-w3" 
compute only adjectives
.TP
.I "\-w13" 
compute only nouns and adjectives

Note that \-w1 through \-w13 only make sense with tagged files.
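.PP
How the word-level options might combine can be sketched in Python. This is
a hedged illustration, not the program's code: the function filter_tokens
and its parameters are hypothetical, the order of the steps is a guess, and
treating a word that is missing from the weights file as having weight 0.0
is an assumption.
.nf
```python
def filter_tokens(tokens, weights=None, threshold=None, min_length=2,
                  synonyms=None, stop_words=None):
    """Return the tokens that survive the word-level options.

    All names here are hypothetical; the order of the steps is assumed.
    """
    synonyms = synonyms or {}
    stop_words = stop_words or set()
    kept = []
    for token in tokens:
        # -T: every word in the text is read as its synonym.
        token = synonyms.get(token, token)
        # -L: words shorter than the minimum length are not considered.
        if len(token) < min_length:
            continue
        # -q: words on the stop list are skipped.
        if token in stop_words:
            continue
        # -t: words with a weight equal to or below the threshold are
        # ignored. Assumption: unknown words count as weight 0.0.
        if weights is not None and threshold is not None:
            if weights.get(token, 0.0) <= threshold:
                continue
        kept.append(token)
    return kept


if __name__ == "__main__":
    words = ["a", "car", "the", "auto"]
    print(filter_tokens(words, synonyms={"auto": "car"}, stop_words={"the"}))
```
.fi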


.SH WEIGHTS FILE

If a weights file is given, the program decides from the number of
fields in its first line whether it is a word-weight table or a
document-word-weight table. In the latter case the
\-r option should be used to identify the records in the file.
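.PP
The field-count test can be sketched in Python. This is an illustration
only: the names is_number and detect_weight_format are hypothetical, and
finding the weight column by looking for the single numeric field is an
assumption about how the two column orders listed under DESCRIPTION might be
told apart.
.nf
```python
def is_number(field):
    """Return True if the field parses as a floating-point number."""
    try:
        float(field)
        return True
    except ValueError:
        return False


def detect_weight_format(first_line):
    """Return (kind, weight_column) for the first line of a weights file.

    Two fields mean a word-weight table, three fields mean a
    document-word-weight table whose first field is the file number.
    """
    fields = first_line.split()
    if len(fields) == 2:
        kind, offset, candidates = "word-weight", 0, fields
    elif len(fields) == 3:
        kind, offset, candidates = "document-word-weight", 1, fields[1:]
    else:
        raise ValueError("expected 2 or 3 fields, got %d" % len(fields))
    # Assumption: exactly one of the candidate fields is numeric.
    numeric = [i for i, field in enumerate(candidates) if is_number(field)]
    if len(numeric) != 1:
        raise ValueError("cannot tell which field is the weight")
    return kind, offset + numeric[0]


if __name__ == "__main__":
    print(detect_weight_format("12 apple 0.42"))
    print(detect_weight_format("0.42 apple"))
```
.fi
.PP
A word that itself looks like a number makes the line ambiguous, so the
sketch raises an error in that case instead of guessing.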
  
.SH TEXT FILE

If a line in the text file starts with a number, that number is taken to
be its line number. If not, the line number counter is increased
automatically. This will fail if the text of a line itself happens to
start with an integer (which is very rare, I hope). Also, codes may be
prefixed to the sentences, as long as they end with a ']'.
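.PP
The line handling described above can be sketched in Python. This is an
illustration under assumptions, not the program's parser: the function
parse_line and the parameter has_codes are hypothetical, the line number is
assumed to come before the code, and the first ']' on the line is assumed
to end the code.
.nf
```python
def parse_line(line, counter, has_codes=True):
    """Split a text line into (line_number, code, sentence).

    counter is the line number of the previous line. Pass
    has_codes=False for text without sentence codes (compare -n).
    """
    rest = line.strip()
    parts = rest.split(None, 1)
    if parts and parts[0].isascii() and parts[0].isdigit():
        # A leading integer is taken to be the line number.
        number = int(parts[0])
        rest = parts[1] if len(parts) > 1 else ""
    else:
        # No leading number: increase the counter automatically.
        number = counter + 1
    code = None
    if has_codes:
        head, bracket, tail = rest.partition("]")
        if bracket:
            # Assumption: everything up to the first ']' is the code.
            code = head + bracket
            rest = tail
    return number, code, rest.strip()


if __name__ == "__main__":
    print(parse_line("12 A3] The cat sat.", 0))
    print(parse_line("The cat sat.", 12))
```
.fi
.PP
The sketch also shows the failure mode mentioned above: a sentence whose
own text starts with an integer is read as carrying that line number.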

.SH See also
listwords,  matrix,  word_sel,  sent_wgt,  sent_til, discrim, bigrams, intro

.SH Copyright
Copyright Hans Paijmans 1995
.SH PTU
Paai's Text Utilities






