.\"	This file is part of the software similarity tester SIM.
.\"	Written by Dick Grune, Vrije Universiteit, Amsterdam.
.\"
.TH SIM I
.SH NAME
sim \- find similarities in C-files
.SH SYNOPSIS
.B sim
[
.B \-[fns]
.BI \-r N
]
file ...
.SH DESCRIPTION
.I Sim
reads the C-files
.I file ...
and looks for pieces of text that are similar; two pieces of C-text
are similar if they differ only in layout, comments, identifiers and
the contents of numbers, strings and characters.  If any runs
of sufficient length
are found, they are reported on standard output; the default
minimum length is 24, but it can be changed with the
.B \-r
option.
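.PP
As a rough illustration of this notion of similarity, and not the code used by
.IR sim ,
the following C sketch reduces program text to one character per token.
The function name
.B normalize
and the token letters are assumptions made for this example; keywords
are not distinguished from identifiers, and literals are assumed to
contain no escaped quotes.
.nf
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from sim: reduce C text to one
   character per token, so that layout, comments, identifiers and the
   contents of numbers, strings and characters no longer matter.
   Assumes out has room for strlen(src) + 1 bytes. */
static void normalize(const char *src, char *out)
{
    const char squote = "'"[0];     /* the single quote character */
    size_t i = 0, o = 0;

    assert(src != NULL && out != NULL);
    while (src[i] != 0) {
        unsigned char c = (unsigned char)src[i];

        if (isspace(c)) {                             /* layout */
            i++;
        } else if (c == '/' && src[i + 1] == '*') {   /* comment */
            i += 2;
            while (src[i] != 0 && !(src[i] == '*' && src[i + 1] == '/'))
                i++;
            if (src[i] != 0)
                i += 2;
        } else if (isalpha(c) || c == '_') {          /* identifier */
            while (isalnum((unsigned char)src[i]) || src[i] == '_')
                i++;
            out[o++] = 'I';
        } else if (isdigit(c)) {                      /* number */
            while (isalnum((unsigned char)src[i]) || src[i] == '.')
                i++;
            out[o++] = 'N';
        } else if (c == '"' || c == (unsigned char)squote) {
            char q = src[i++];                        /* string or character */
            while (src[i] != 0 && src[i] != q)
                i++;
            if (src[i] != 0)
                i++;
            out[o++] = (q == '"') ? 'S' : 'C';
        } else {                                      /* anything else */
            out[o++] = src[i++];
        }
    }
    out[o] = 0;
}
```
.fi
.PP
Under these assumptions the texts "x = y + 1; /* sum */" and
"total=count+42;" both reduce to "I=I+N;" and would therefore count
as similar.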
.PP
The program can be used for finding copied pieces of code in
purportedly unrelated programs (with the
.BR \-s -flag),
or for finding accidentally duplicated code in larger projects
(without the
.BR \-s -flag
but with the
.BR \-f -flag).
.PP
Since it reads the files several times, it cannot read from standard input.
.PP
The following options are available:
.TP
.B \-f
Runs are restricted to pieces with balancing parentheses, to isolate
potential functions.
.TP
.B \-n
Similarities found are only summarized, not displayed.
.TP
.BI \-r N
The minimum length of a run is set to
.I N
(default 24).
.TP
.B \-s
A file is not compared to itself (\-s = not self).
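.PP
The restriction imposed by the
.B \-f
option can be pictured with the following C sketch.
It is an assumption about how such a test might look, not the code used by
.IR sim ;
the name
.B is_balanced
is hypothetical, and only round parentheses are counted.
.nf
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from sim: return 1 if the n characters
   starting at run have balancing parentheses, 0 otherwise.  A closing
   parenthesis that comes before its opening one counts as unbalanced. */
static int is_balanced(const char *run, size_t n)
{
    long depth = 0;
    size_t i;

    assert(run != NULL);
    for (i = 0; i < n; i++) {
        if (run[i] == '(') {
            depth++;
        } else if (run[i] == ')') {
            if (depth == 0)
                return 0;       /* closes something outside the run */
            depth--;
        }
    }
    return depth == 0;          /* nothing may be left open */
}
```
.fi
.PP
A candidate run would be kept only if such a test succeeds; braces
could be tracked in the same way.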
.PP
The matching process uses a hash table so that tens of thousands of
lines are processed in a few minutes; if, however, there is not
enough memory for the table, the matching process uses sequential
search, which can take hours.
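.PP
The difference between the two strategies can be sketched in C as follows.
This is a simplified illustration, not the algorithm used by
.IR sim :
the names
.BR hash_window ,
.B find_repeat
and
.BR NO_POS ,
the hash function and the table size are all assumptions, and the
sketch only finds the first window of
.I k
tokens that repeats an earlier one.
.nf
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NO_POS ((size_t)-1)

/* Hash of the k tokens starting at p (hypothetical choice of function). */
static size_t hash_window(const char *p, size_t k, size_t table_size)
{
    size_t h = 0, i;

    for (i = 0; i < k; i++)
        h = h * 31 + (unsigned char)p[i];
    return h % table_size;
}

/* Find the first position j whose k-token window also occurs at an
   earlier position; store the nearest such earlier position in *pi and
   j in *pj and return 1, or return 0 if no window repeats.  With a hash
   table a window is compared only with earlier windows in its bucket;
   without memory for the table every earlier window is tried. */
static int find_repeat(const char *t, size_t n, size_t k,
                       size_t *pi, size_t *pj)
{
    size_t windows, i, j;
    size_t *head, *next;
    int found = 0;

    assert(t != NULL && pi != NULL && pj != NULL);
    if (k == 0 || n < k + 1)
        return 0;                       /* fewer than two windows */
    windows = n - k + 1;
    head = malloc(windows * sizeof *head);
    next = malloc(windows * sizeof *next);
    if (head == NULL || next == NULL) { /* sequential search */
        free(head);
        free(next);
        for (j = 1; j < windows; j++)
            for (i = j; i-- > 0; )
                if (memcmp(t + i, t + j, k) == 0) {
                    *pi = i;
                    *pj = j;
                    return 1;
                }
        return 0;
    }
    for (i = 0; i < windows; i++)
        head[i] = NO_POS;
    for (j = 0; j < windows && !found; j++) {
        size_t h = hash_window(t + j, k, windows);

        for (i = head[h]; i != NO_POS; i = next[i])
            if (memcmp(t + i, t + j, k) == 0) {
                *pi = i;
                *pj = j;
                found = 1;
                break;
            }
        next[j] = head[h];              /* newest window goes first */
        head[h] = j;
    }
    free(head);
    free(next);
    return found;
}
```
.fi
.PP
The sequential branch compares every window with all earlier ones,
which is why the cost grows so sharply when the table does not fit.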
.SH AUTHOR
Dick Grune, Vrije Universiteit, Amsterdam.
.SH BUGS
Strong periodicity in the input text (like a table of
.I N
almost identical lines) causes problems.
.I Sim
tries to cope with this but cannot avoid giving approximately
.I log N
messages about it.  The best advice is still to take the offending
files out of the game.
