	rfiledist (Remote File Distribution System)

(If anyone else can think of a better name for this program, please do!)

Intro:

	rfiledist is designed to synchronize the contents of files and
filesystems across a network.  It's easier to give examples of what it
can do than to attempt to describe it.  But first some definitions:

pre-script : a script which is downloaded by the client from the
server and executed by the client, in order to prepare the client to
receive a file.  Might do things like stop a certain service (to
unlock lock files and such) or mount filesystems in order to make it
possible to put a new file into place.  For example, if you were
updating the /etc/syslog.conf file, you might want to stop syslogd
with the pre-script, download the new syslog.conf, and then use a
post-script to restart it.

post-script : same as a pre-script, but used to clean up after the
pre-script and the download.  For example, if you have
downloaded a new copy of /etc/inetd.conf, and wish it to become
active, you would have the post-script send a "HUP" signal to inetd so
that it re-reads the new inetd.conf.

package : a group of files (or a single file) which is served to the
client upon request, provided the client is allowed access.


	Example 1.
	Let's say you have a network with various machines on it, and
you want to have a way of making those machines look alike, or at
least have certain sets of files look alike across certain machines.
First, you designate one particular machine (generally a "trusted"
machine, where you know people won't be fiddling around with it too
much and you can retain tight control over it) to act as the
server.  Files and/or configurations are served from it to the
clients.
	On the server, you need to make a few files and directories.
The first should be /packages, which holds the package files along
with their pre-scripts and post-scripts.  For this example, let's
assume you want to transfer only 1 file from the server to the client,
and to retain all the same permissions, owner, group, date & time,
etc.  The file is to be owned by uid=0, gid=2 (root and bin on Solaris
and root and daemon on Slackware Linux, a typical example of why you
need to use the same uids and gids across your network), and have
permissions of 775.  The file name is /usr/local/bin/foo_bar.  For
this example it is a regular file, not a link of any kind, and not a
directory or a FIFO.
	(Note: if a client requests a pre- or post-script that the
server can't find, the server ignores the script request and notifies
the client that there is no script.  The same goes for the actual
package file.  If a client requests a package that does not exist at
all, the server sends "Ignore this" commands down to the client.)
	For this particular file, we don't need any pre- or
post-scripting, so we don't create any scripts in /packages.  All we
need to do is make a single file whose name is the name we wish to
give the package.  Since the file is version 1.1 of the program
/usr/local/bin/foo_bar, we might choose a name like
"foo_bar.v1.1.package", to be descriptive.  Any name is ok, however.
	The contents of the package file are important and contain
some of the typical stat(2) information.  The format of the file is
very important, also.  I'll describe it right after the example.  Note
also that /usr/local/bin/foo_bar is present on the server, and that
the client desires it in the SAME location.  For this situation, as a
regular file, we want the contents of /packages/foo_bar.v1.1.package
to be:
/usr/local/bin/foo_bar
R
/usr/local/bin/foo_bar
9039
775
0
2
123456789
987654321
finish
	Comments in this file are not possible.  I should have made
that possible from the beginning, but I got caught up in other more
important details of writing this.  No blank lines, no comments.  All
the data in this file is useful.  Quick description:
	The first line gives the location that the file should take on
the client.  The second gives its kind (R=regular, S=symlink,
H=hardlink, F=fifo, D=directory).  The third line gives the location
that the file currently has on the server.  This is so that the server
can hold a repository of LOTS of potentially-conflicting packages for
a variety of architectures.  A server might contain binaries for SPARC
Solaris, Solaris x86, Linux, SGI, AIX, etc, and might have them each
located in different directories, like /repository/Linux,
/repository/SPARCSolaris, and so on.  Thus the server can service any
client that has permission to connect and requests a package for its
platform.
	Moving along, the next line, "9039", is the file size in
bytes.  The line with "775" is the mode.  The next two lines are the
uid and gid of the file, respectively.  The two 9-digit numbers after
that are the last-access time and the modification time of the file.
	The last line, "finish", tells the system that this is the
last file in the series, and no more are left.  If this were a
multi-file package, the "finish" line could be replaced by ANY
single-line separator string.  I have been using "---" to denote the
separator in my tests, but that is arbitrary.  Note there are no blank
lines and no extra stuff; everything here is necessary to transfer the
file correctly.
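	To make the layout concrete, here is a small sketch of a reader
for this format.  It is written in Python purely for illustration and
is not taken from server.c.  The names (FileRecord, parse_package) are
made up, and it assumes three things the text above does not spell
out: that every kind of file uses the same nine lines, that the mode
is written in octal, and that in a multi-file package each record but
the last is followed by a separator line while the last is followed by
"finish".

```python
from dataclasses import dataclass

FIELDS_PER_RECORD = 9          # data lines in front of each separator line
KINDS = {"R", "S", "H", "F", "D"}


@dataclass
class FileRecord:
    client_path: str   # where the file should end up on the client
    kind: str          # R, S, H, F or D
    server_path: str   # where the file currently lives on the server
    size: int          # file size in bytes
    mode: int          # permission bits, read as octal ("775" -> 0o775)
    uid: int
    gid: int
    atime: int         # last-access time
    mtime: int         # modification time


def parse_package(text):
    """Turn the text of a package file into a list of FileRecord objects."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()                      # artifact of the final newline
    records = []
    pos = 0
    while True:
        chunk = lines[pos:pos + FIELDS_PER_RECORD + 1]
        if len(chunk) < FIELDS_PER_RECORD + 1:
            raise ValueError("truncated record or missing 'finish' line")
        cpath, kind, spath, size, mode, uid, gid, atime, mtime, end = chunk
        if kind not in KINDS:
            raise ValueError("unknown file kind: %r" % kind)
        records.append(FileRecord(cpath, kind, spath, int(size),
                                  int(mode, 8), int(uid), int(gid),
                                  int(atime), int(mtime)))
        pos += FIELDS_PER_RECORD + 1
        if end == "finish":              # end-of-package marker
            break
        # any other single line (e.g. "---") just separates two records
    if pos != len(lines):
        raise ValueError("unexpected data after 'finish'")
    return records
```

	Feeding it the ten lines of foo_bar.v1.1.package shown above
should give a single record for /usr/local/bin/foo_bar.  Because there
are no comments or blank lines to skip, a reader like this can simply
count lines, which is also why one stray blank line throws everything
off.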
	Now, we have /packages/foo_bar.v1.1.package created.  We need
to let the server know that a client is allowed to connect.  The list
of clients which can connect is kept in a file, access.list; the paths
where the server looks for it are given in the char * list in server.c
named "goodmachines[]".  The file can be placed at any of 6 different
locations, but only the first one the server finds will be used.  I
would recommend placing it in ./access.list (in the directory the
server is started from).  The format of access.list is simple.  Just
list the clients by either IP address or hostname, one per line.  A
comment line must start with a "#" in this file.  The server resolves
each IP address or hostname and adds the actual IP address to a linked
list of machines that are allowed to use this service.  Should a
machine connect which is not listed in access.list, the server will
simply close the attempted connection, and the client will just die of
loneliness.  If we had 2 machines we wish to have connect to the
server, "remington" and "colt", we might have an access.list file that
looks like this:

# This is the first line.
# This is my access.list file.  The following machines 
# are allowed to connect.
remington 
colt
# end of access.list

	The standard gethostbyname(3) function is used, so how names
are resolved depends somewhat on your OS.  Some systems may need extra
help with name resolution, but this appears ok on Linux and Solaris,
as far as I can see.  This should also work with multi-homed systems.
Let's say "colt" is a SPARC with an extra HME ethernet card in it;
then, whether colt connects through the HME or its internal ethernet
adapter, the server should still allow it, as long as DNS knows about
the multiple IP addresses.
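
	The access check described above can be sketched roughly as
follows.  This is an illustration in Python, not the code in server.c:
the function names are invented, a set stands in for the linked list,
and skipping blank lines is my own leniency (the text only promises
that "#" lines are comments).  Python's socket.gethostbyname_ex plays
the role of gethostbyname(3) and, like it, can return several
addresses for a multi-homed host.

```python
import socket


def read_access_entries(text):
    """Return the host names / IP addresses listed in access.list text."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                     # comment (or blank, by assumption)
        entries.append(line)
    return entries


def resolve_allowed(entries, resolver=socket.gethostbyname_ex):
    """Resolve every entry and collect all of its IP addresses."""
    allowed = set()
    for name in entries:
        try:
            _hostname, _aliases, addresses = resolver(name)
        except OSError:
            # Name did not resolve; the real server logs a
            # "gethostbyname returned NULL" message in this situation.
            continue
        allowed.update(addresses)        # multi-homed hosts add several
    return allowed


def is_allowed(peer_ip, allowed):
    """True if a connecting peer's address is in the allowed set."""
    return peer_ip in allowed
```

	The resolver is a parameter only so that the logic can be
tried out without a working DNS; in normal use the default does the
lookup.  A peer whose address is not in the set would simply have its
connection closed.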

	I almost forgot: you'll need to "touch
/etc/refdis-server.conf" on the server.  I put the functionality for
an additional configuration file into server.c, but I didn't clean it
up enough for the server to tolerate the file being absent.  It will
die if it doesn't see at least one of the refdis-server.conf locations
listed in the char * variable "configpath[]".  I should either make
that file useful or remove it entirely.  For now, it's a minor
annoyance.  Just touch it and proceed.
	You'll need to make a directory, if you haven't already, named
/var/log.  That's where logged stuff will go.
	If you haven't already, "make" it.  Compiles ok under
Slackware 3.2, RedHat 5.0, and Solaris 2.5.1 (and hopefully 2.6), all
using the gnu compiler and utilities.  gcc 2.7.2.1 thru 2.7.2.3 were
used, but others should be ok.  Maybe even Sun's "cc", but no
guarantees there.
	The server must be run as root.  If its uid is not 0 and it
can't simply change its own uid to 0, then it will die.  Use it with
the "-d" parameter to keep it in debug (foreground) mode.  With no
parameters, it will fork(), exec(), and background itself.
	If you take a look at /var/log/refdis.serv, you will find that
it should have started off with something like this:

Fri Feb 20 23:41:42 1998: SRV[25969]: main: Starting...
Fri Feb 20 23:41:42 1998: SRV[25969]: main: Got root as my name, and
root as the root name

	Depending on where you put refdis-server.conf and access.list,
you'll see a list of "main: Failed to stat ..." messages.  This is ok,
as long as one line says "main: Opening ..." for each of
refdis-server.conf and access.list.  Also, if you see "add_address:
gethostbyname returned NULL for ...", then you have a problem with
your name resolver.  If you see a "main: Denied an attempt by 0.0.0.0
to connect.", that's ok.  I'm using a static area for storing the
data, so that's sort of expected.  I should change that to dynamic.
This happens when you hit ctrl-c while the server is running in the
foreground, or when you use "kill" to kill the server while it is
running in the background.
	Another thing which is ok is if you get the message: "main: No
security db.  All packages free & unsecured".  This is because the
server also looks for another file, which it consults to find secured
packages.  Some sites may have a licensed product which they want to
package for distribution & maintenance, but which they only want to be
available to a certain subset of their machines, not all.  Using a bit
of sleight-of-hand with a linked-list of linked-lists, the server
builds a table of "secure" packages and the clients that can download
them.  Any client that is not in the list is denied.  I'll cover that
later; it's not important to this example.
	Now you should be ready to put something on the client side.
You will need only the client binary and a configuration file.  The
configuration file is named refdis.conf (not to be confused with the
server config file, refdis-server.conf).  You should place that in the
current directory, at least just for testing.  Its format is
*slightly* more involved, but still rather simple.  It will allow
comments starting a line with "#".  There are two key words which may
be used in it, NO_PRE and NO_POST.  The format is such:

# This is the first line of the configuration file, refdis.conf
# The following line says that I want all that a package named 
# "foo_bar.v1.1.package" has to offer: pre-script, package contents,
# and post-script
foo_bar.v1.1.package
# The following line says to ignore the pre-script for a package named
# "improved_bash.v2.3.package" when I ask for the package.
NO_PRE improved_bash.v2.3.package
# I also want to specify that I don't want any post-script associated
# with that package
NO_POST improved_bash.v2.3.package
# Declaring that I do not want the pre- or post-scripts does NOT
# implicitly mean that I DO want the package itself.  I must
# explicitly name the package in order to get it.
improved_bash.v2.3.package
# Last line of the refdis.conf file

	Thus, you can see the formats for most of the config files are
pretty simple.  The order in which the NO_POST and NO_PRE key words
are used is unimportant; however, the order in which the lines with
just the package names appear determines the order in which the
packages are obtained.  I think one should keep the lines for a single
package together, though, so that the file is easier to understand.
	For this example, your client should have only 1 line in its
refdis.conf: a line containing "foo_bar.v1.1.package", ending with a
newline character.  My routines for reading config files aren't
wonderfully smart, so you need to hit <return> at the end of the file
to make sure the last line is recognized.  In other words, the last
character must be a newline, and the next-to-last must be the last
character of the last string you put there.  Empty lines may cause
problems!
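	As a rough picture of how refdis.conf is interpreted, here is
a sketch in Python (an illustration only, not the client's own parsing
code; the function names are invented).  It keeps the package lines in
file order and collects the NO_PRE and NO_POST names separately, so
the order of the key word lines does not matter.  Unlike the real
client, it quietly skips empty lines.

```python
def parse_client_config(text):
    """Return (packages in request order, NO_PRE names, NO_POST names)."""
    packages = []
    no_pre = set()
    no_post = set()
    for line in text.splitlines():
        if line.startswith("#"):
            continue                     # comment line
        words = line.split()
        if not words:
            continue                     # tolerated here; may upset the real client
        if len(words) == 2 and words[0] == "NO_PRE":
            no_pre.add(words[1])
        elif len(words) == 2 and words[0] == "NO_POST":
            no_post.add(words[1])
        elif len(words) == 1:
            packages.append(words[0])    # a plain line requests the package
        else:
            raise ValueError("cannot parse line: %r" % line)
    return packages, no_pre, no_post


def build_plan(text):
    """List (package, want_pre_script, want_post_script) in request order."""
    packages, no_pre, no_post = parse_client_config(text)
    return [(p, p not in no_pre, p not in no_post) for p in packages]
```

	With the sample refdis.conf above, build_plan would list
foo_bar.v1.1.package first with both scripts wanted, then
improved_bash.v2.3.package with neither.  A package that appears only
on a NO_PRE or NO_POST line never makes it into the plan, which
matches the rule that the package must be named explicitly.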
	Now you should run the client as root.  It should create a
logfile named /var/log/refdis.cli.  If the server is running and all
goes well, the process is this:

	1.) The client connects to the server.  The server checks the
list of allowed-clients.  If this IP address is allowed to connect, we
proceed to the next step.  If not, the server closes the connection
and lets the client die.
	2.) Since there is no "NO_PRE" keyword for the package
"foo_bar.v1.1.package", the client issues a "get-pre-script" request
and names "foo_bar.v1.1.package" as the package name.  If a NO_PRE
keyword had been present, it would not ask for a pre-script.
	3.) The server checks /packages/ for a file named
"foo_bar.v1.1.package.PRE" or "foo_bar.v1.1.package.PRE.NOWAIT".  If
the "...NOWAIT" file exists, the server will indicate to the client
that the client shouldn't wait for the termination of the script
before proceeding to ask for the package.  If just the "...PRE" file
exists, then the server will send it down to the client and the client
will execute it and wait for the script to terminate before
proceeding.  If nothing by either of those names exists, the server
will simply tell the client that no pre-script is available, and to
continue on.
	4.) Now that the pre-script has been handled, the client will
ask for the actual package itself.  The client will send a
"get-package" command to the server, and name the package as
"foo_bar.v1.1.package".  The server will check the /packages/
directory for a file named by that name.  If one exists, it will read
it in.  It will pass elements of the data found in
/packages/foo_bar.v1.1.package down to the client so that the client
can decide whether it wants that file or not.  If the file exists on
the client side, and all the passed-down data is identical, the client
will leave the file alone.  If things like permissions and uid/gid are
different, but all else is the same, the client will simply change
modes and move along.  If the file size is different or the file isn't
present at all, it will always ask for a download.  In that case, only
when the actual transfer of data (the contents of
/usr/local/bin/foo_bar) is finished will the client close the
dedicated socket and set the modes, permissions, times, etc. that the
server previously specified.
	5.) Had there been other files listed in
"foo_bar.v1.1.package" on the server, the server would have iterated
the above step until it reached the "finish" end-of-package marker.
Since there is no other file listed in that package file, the server
notifies the client that the end of the file set has been reached.
When the client finds out that the file set is done, it will want to
ask for the post-script.  It will check its linked-list of "NO_POST"
packages and see if the package name is present there.  If it were
listed in the "NO_POST" list, it would not ask, but since there is no
occurrence of it in the NO_POST list, it will ask the server for the
post-script.
	6.) The client sends up a request for the post-script, and the
server looks for the file /packages/foo_bar.v1.1.package.POST or
/packages/foo_bar.v1.1.package.POST.NOWAIT.  This works similarly to
the pre-script requests.  If the "...NOWAIT" file is present, then the
client will not wait for the script to finish; it will background the
process and keep going with requesting packages, if any more are
left.  Otherwise, it will wait for the script to end before
proceeding.
	7.) The client will look in its list of packages to do, and
see if any are left.  If there are more, it will start over and do
steps 1 through 6 again for each package.  If there aren't any left to
do, it will close the sockets and exit.
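
	The decision the client makes in step 4 can be sketched like
this.  It is a Python illustration, not the client's actual code, and
the names (Meta, local_meta, decide_action) are invented.  Two points
are my own assumptions, since the steps above do not settle them: a
differing modification time is treated as a reason to download, and
the last-access time is ignored because merely reading a file changes
it.

```python
import os
import stat
from collections import namedtuple

# The attributes compared between the server's description and the local file.
Meta = namedtuple("Meta", "size mode uid gid mtime")


def local_meta(path):
    """Describe the file at path, or return None if it does not exist."""
    try:
        st = os.lstat(path)
    except FileNotFoundError:
        return None
    return Meta(st.st_size, stat.S_IMODE(st.st_mode),
                st.st_uid, st.st_gid, int(st.st_mtime))


def decide_action(wanted, local):
    """Return "download", "fix-attributes" or "skip" for one regular file."""
    if local is None or local.size != wanted.size:
        return "download"            # missing or different size: always fetch
    if local.mtime != wanted.mtime:
        return "download"            # assumption: a changed mtime means new content
    if (local.mode, local.uid, local.gid) != (wanted.mode, wanted.uid, wanted.gid):
        return "fix-attributes"      # same data, only mode/owner/group to correct
    return "skip"                    # everything matches: leave the file alone
```

	For the example package, wanted would be Meta(9039, 0o775, 0,
2, 987654321), built from the lines the server passes down, and local
would come from local_meta("/usr/local/bin/foo_bar").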


TO BUILD:

	Provided you are on a Sun or Linux machine, you should just
need to run "./configure" first, and then "make".  That should build
3 executables: a server, a client, and "genstat", which generates
stat(2) info about the files you specify, in a form suitable for use
with this program.
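
	genstat's exact output is not shown in this file, so the
following is only a guess at the kind of work it does: a Python sketch
(not the genstat source; make_record and kind_letter are invented
names) that builds one package-file record from lstat() data, assuming
the ten-line layout from Example 1 with the mode written in octal.
Hard links are not detected, since stat data alone does not say which
name is the "link"; such entries would need to be marked H by hand.

```python
import os
import stat


def kind_letter(st_mode):
    """Map a st_mode value to the one-letter file kind used in package files."""
    if stat.S_ISREG(st_mode):
        return "R"
    if stat.S_ISLNK(st_mode):
        return "S"
    if stat.S_ISFIFO(st_mode):
        return "F"
    if stat.S_ISDIR(st_mode):
        return "D"
    raise ValueError("unsupported file type")


def make_record(server_path, client_path=None, terminator="finish"):
    """Build the text of one package-file record for server_path."""
    st = os.lstat(server_path)           # lstat: describe symlinks themselves
    lines = [
        client_path or server_path,      # location on the client
        kind_letter(st.st_mode),
        server_path,                     # location on the server
        str(st.st_size),
        format(stat.S_IMODE(st.st_mode), "o"),   # e.g. "775"
        str(st.st_uid),
        str(st.st_gid),
        str(int(st.st_atime)),
        str(int(st.st_mtime)),
        terminator,                      # "finish", or a separator such as "---"
    ]
    return "\n".join(lines) + "\n"
```

	Calling make_record("/repository/Linux/foo_bar",
"/usr/local/bin/foo_bar") would describe a file stored under a
per-platform repository directory but destined for
/usr/local/bin/foo_bar on the client; for a multi-file package, every
record but the last would be generated with a separator such as "---"
as the terminator.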

CONCLUSION:

	I'm looking for people to contribute to this to make it
worthwhile.  Yes, the coding isn't great and the design needs some
help.  But let's change it for the better and make it workable.  I
think that for anyone interested in maintaining uniform configurations
across their network, this is a place to start.
	When you generate patches, fixes, etc, please submit them to me
at tkunz@fast.net.  I will incorporate them into the main body of work
as quickly as possible, usually within a day or so, depending on how
many submissions I get.

Tom Kunz
tkunz@fast.net
