
Paai's Text Utilities.



This is a dynamic (growing and changing) collection of programs and
Unix scripts that I use for research into the properties of
text files. The collection is not complete, by no means bug-free, and
perhaps not even very original.

Most programs are centered around weighted indexing, the computation of
lexical cohesion, and that kind of thing.

Hopefully other people who are engaged in Information Retrieval and
Corpus Linguistics will find something here that they can use. I know that
I wished for a collection of similar programs when I started. All
programs compile and run under Linux. As they are not very
complicated, they should run on most Unix systems.

Hans Paijmans, jan. 1997

paai@kub.nl

More information at 

http://purl.oclc.org/NET/PAAI/Publiek

or

http://pi0959.kub.nl:2080/Paai/Publiek


=============================

This directory contains the sources and the makefile.

The directory Test_data holds all kinds of data files for day-to-day
testing.

testchains

This shell script first writes some test data to the current directory
and then does some runs on them with 'chains'. If everything is
all right, no differences should be reported.

testdiscrim

This shell script does the same for 'discrim'. The output of
Dubin's program serves as the reference for comparison.
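
Both test scripts follow the same pattern: run a program on known
input and compare the result with stored reference output. The sketch
below illustrates that pattern only; it is not taken from testchains
or testdiscrim. The function name run_and_compare is hypothetical, and
it assumes that the program under test reads standard input and
writes standard output, which may not hold for 'chains' or 'discrim'.

```shell
#!/bin/sh
# Sketch of a "run and compare" check (hypothetical helper, not part
# of this collection).
#
# Usage: run_and_compare EXPECTED_FILE COMMAND [ARGS...] < INPUT_FILE
#
# Runs COMMAND on the caller's standard input, captures its standard
# output, and compares it with EXPECTED_FILE using diff.
# Returns 0 if the two are identical, 1 if they differ (diff prints
# the differences), and 2 if no temporary file could be created.
run_and_compare() {
    expected=$1
    shift

    # mktemp is assumed to be available (it is on Linux and the BSDs).
    actual=$(mktemp) || return 2

    # Run the command under test and capture what it prints.
    "$@" > "$actual"

    # diff is silent when the files match and lists the differences
    # otherwise, so a clean run reports nothing.
    if diff "$expected" "$actual"; then
        status=0
    else
        status=1
    fi

    rm -f "$actual"
    return $status
}
```

As a stand-in for a real program, a call such as
run_and_compare expected.txt tr a-z A-Z < input.txt
prints nothing and returns 0 when expected.txt holds the upper-case
version of input.txt.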

