          WWWOFFLE - World Wide Web Offline Explorer - Version 1.0
          ========================================================


The WWWOFFLE programs simplify World Wide Web browsing from computers that use
intermittent (dial-up) connections to the internet.

      - Caching of pages that are viewed while connected for review later.
      - Browsing of cached pages while not connected, with the ability to follow
        links and mark other pages for download.
      - Downloading of specified pages non-interactively.
      - Command line interface to select pages for downloading.
      - Index of pages stored in cache for easy selection.


Description
-----------

When a web browser is configured to use wwwoffle as the HTTP proxy, there are
two modes of operation.

 - When connected to the internet
        The web pages requested by the browser are fetched and passed to the
        browser, and a copy is stored in the cache if the page is not already
        in the cache.
 - When not connected to the internet
        The URL requested by the browser is examined.  If the page is in the
        cache then it is displayed in the browser.  If it is not in the cache
        then the request is stored for download later.

The requests that were made for pages not in the cache can then be fetched by
the program the next time that it is connected.
This means that it is possible to browse web pages and read them without having
to remain connected.
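
The two modes above can be sketched in a few lines of Python.  This is only an
illustration of the behaviour described here, not the program's actual code:
the names handle_request and fetch_outgoing, the dictionary used as the cache
and the list used for the stored requests are all assumptions made for the
example.

```python
def handle_request(url, online, cache, outgoing, fetch):
    """Return the page for url, or None if it has to be fetched later.

    cache is a dict of url -> page, outgoing is a list of stored requests and
    fetch is any callable that gets a page from the network (all hypothetical).
    """
    if online:
        page = fetch(url)
        # Keep a copy only if the page is not already in the cache.
        cache.setdefault(url, page)
        return page
    if url in cache:
        # Offline and cached: show the stored copy.
        return cache[url]
    # Offline and not cached: remember the request for the next connection.
    outgoing.append(url)
    return None


def fetch_outgoing(cache, outgoing, fetch):
    """Fetch every stored request once the system is online again."""
    while outgoing:
        url = outgoing.pop(0)
        cache[url] = fetch(url)
```

A request made while offline returns nothing at first, but after
fetch_outgoing has run the same request is answered from the cache.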


Index Of Cached Files
---------------------

To get the index of files in the cache, use the URL 'http://localhost:8080/'.
The index accepts an optional argument, 'alpha', 'mtime', 'ctime' or 'atime',
to sort it alphabetically or by the modification time, creation time or access
time of the files.  This main index has links to an index for each host, which
also accept the optional argument.  In the index of pages, each page has a link
and a refresh button.  The link gets the cached version; the refresh button
requests a new copy if offline, or gets it if online.
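
The four sort orders can be illustrated with a short Python sketch that sorts
the files of a directory by name or by one of the times reported by os.stat.
The function name sorted_index and the ascending order are assumptions made
for this example; the text above does not say in which direction the index is
sorted.

```python
import os

# Sort keys named after the optional index arguments described above.
# Note: on Unix st_ctime is the time of the last status change.
SORT_KEYS = {
    "alpha": lambda path, st: os.path.basename(path),
    "mtime": lambda path, st: st.st_mtime,
    "ctime": lambda path, st: st.st_ctime,
    "atime": lambda path, st: st.st_atime,
}


def sorted_index(directory, order="alpha"):
    """Return the names of the files in directory, sorted by the given order."""
    key = SORT_KEYS[order]
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((key(path, os.stat(path)), name))
    # Ascending order is an assumption; ties are broken by file name.
    return [name for _, name in sorted(entries)]
```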


Configuring A Web Browser
-------------------------

Using the WWWOFFLE programs requires that your web browser is set up to use
wwwoffled as its HTTP proxy.  The proxy hostname is 'localhost', and the port
number is the one that is used by wwwoffled (default 8080).
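
Browsers differ in how the proxy is set, so no browser-specific steps are
given here.  As an illustration of the same two settings, the following sketch
points Python's standard urllib at the proxy; the function name make_opener is
an assumption made for this example, and the host and port are the defaults
quoted above.

```python
import urllib.request

PROXY_HOST = "localhost"
PROXY_PORT = 8080  # default wwwoffled proxy port given above


def make_opener(host=PROXY_HOST, port=PROXY_PORT):
    """Build a urllib opener that sends HTTP requests through the proxy."""
    proxy_url = "http://%s:%d" % (host, port)
    proxy = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy)
```

Calling make_opener().open(url) only succeeds while wwwoffled is running and
listening on that port.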


The Programs
------------

There are actually three programs that make up this utility.

wwwoffled - A demon process that acts as an HTTP proxy.
wwwoffle  - A program to interact with and control the HTTP proxy demon.
wwwoffles - A server that actually does the fetching of the web pages.


WWWOFFLE - User control program
-------------------------------

The control program (wwwoffle) is used to control the action of the demon
program (wwwoffled).

The demon program needs to know if the system is online or offline, when to
fetch the pages that have been previously requested and when to purge the cache
of pages.

All of this is controlled through the wwwoffle program.

wwwoffle -online        Indicates to the demon that the system is online.

wwwoffle -offline       Indicates to the demon that the system is offline.

wwwoffle -fetch         Commands the demon to fetch the pages that were
                        requested by browsers while the system was offline.
                        wwwoffle exits when the fetching is complete.
                        (This requires the demon to be told it is online).

wwwoffle -purge         Commands the demon to purge from the cache the pages
                        that have not been accessed within the specified number
                        of days.

wwwoffle <URL>          Specifies to the demon a URL that must be fetched.
                        If online, the page is fetched immediately; otherwise
                        the request is stored for a later fetch.

wwwoffle -port <port>   Can be used with any of the above options to specify the
                        port number that the demon program listens to.  (For the
                        -online, -offline, -fetch and -purge options this is the
                        wwwoffle control port, for URLs it is the proxy port.)

wwwoffle -h             Gives help about the command line options.
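
A typical dial-up session uses several of these options in turn.  The sketch
below only builds the command lines, using the options listed above; the
function name session_commands and the idea of running them from a dial-up
script are assumptions made for this example.

```python
def session_commands(urls=()):
    """Return the wwwoffle command lines for one dial-up session, in order."""
    commands = [["wwwoffle", "-online"]]             # the link has come up
    commands += [["wwwoffle", url] for url in urls]  # extra pages to get now
    commands.append(["wwwoffle", "-fetch"])          # get the stored requests
    commands.append(["wwwoffle", "-offline"])        # the link is going down
    return commands
```

Each list could be passed to subprocess.run in sequence.  Because
'wwwoffle -fetch' exits only when the fetching is complete, the -offline
command is not reached until the stored requests have been handled.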


WWWOFFLED - Demon program
-------------------------

The demon program (wwwoffled) runs as an HTTP proxy and also accepts connections
from the control program (wwwoffle).

The demon program needs to maintain the current state of the system (online or
offline) as well as some parameters: the cache directory and the real proxy to
use when online.

As HTTP proxy requests come in, the program forks copies of the server program
(wwwoffles) to handle the requests.  The server program can also be forked in
response to the wwwoffle program requesting pages to be fetched.


wwwoffled -proxy <host[:port]>  Specifies the hostname and port number of a real
                                proxy to use when online.

wwwoffled -ports <port1> <port2> Specify the port numbers to use for the HTTP
                                proxy and the wwwoffle control port on the local
                                host.

wwwoffled -spool <spool>        Specifies the directory to use as the cache.

wwwoffled -h                    Gives help about the command line options.


There are a number of error and informational messages that are printed to
standard error as the program runs.  The wwwoffles program also writes to the
same standard error since it is forked from wwwoffled.

The browser that you use must be set up to use the HTTP proxy on localhost,
using the port specified by the first of the two -ports numbers.

The wwwoffle control program must use -port with the second of the two -ports
numbers when controlling the demon, and with the first (the proxy port) when
requesting URLs.
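
The choice of port can be sketched as follows.  The function name
wwwoffle_argv and the position of -port before the action are assumptions
made for this example; the port numbers are whatever was given to
'wwwoffled -ports'.

```python
# The options that are sent to the wwwoffle control port.
CONTROL_OPTIONS = {"-online", "-offline", "-fetch", "-purge"}


def wwwoffle_argv(action, proxy_port, control_port):
    """Build a wwwoffle command line with the matching -port value.

    action is one of the control options above or a URL to be fetched.
    """
    port = control_port if action in CONTROL_OPTIONS else proxy_port
    # Assumption: -port and its value may come before the action.
    return ["wwwoffle", "-port", str(port), action]
```

For a demon started with two hypothetical port numbers, 8080 and 8081, the
-fetch option would be sent to 8081 and a URL to 8080.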


WWWOFFLES - Server program
--------------------------

The server (wwwoffles) can be started by the demon (wwwoffled) in one of three
different modes.

Real  - When the system is online and acting as a proxy for a browser.
        All requests for web pages are handled by forking a new server which
        will connect to the remote host and fetch the page.  This page is then
        stored in the cache as well as being returned to the browser.

Fetch - When the system is online and fetching pages that have been requested.
        The web page requests in the outgoing directory are fetched by the
        server connecting to the remote host to get each page.  Each page is
        then stored in the cache; no browser is involved.

Spool - When the system is offline and acting as a proxy for a browser.
        All requests for web pages are handled by forking a server that will
        either return a cached page or store the request.  If the page is
        cached, it is returned to the browser, else a dummy page is returned
        (and stored in the cache), and the outgoing request is stored.
        If the cached page refers to a page that was marked for fetch, but
        failed, then it will be deleted from the cache.

Depending on the existence of files in the spool and other conditions, the mode
can be changed to one of several other modes.

RealNoCache - For requests for pages on the machine 'localhost'.

RealRefresh - Used by the refresh button on the index and the wwwoffle program
        to refetch a page while the system is online.

SpoolGet - Used when the page does not exist in the cache so a request needs to
        be stored for it in the outgoing directory.

SpoolRefresh - Used when the refresh button on the index or the wwwoffle program
        is used.  The existing spooled page (if there is one) is not
        overwritten, but a request is stored.

Index - When the server is started as real or spool, but the URL is local.
        This generates an index page that is returned to the browser.  It
        contains a list of all of the files that are cached.  This allows the
        user to easily retrieve the pages after fetching them.

The server program should never be started from the command line.
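
The relationship between most of the modes in this section can be summarised
in a short Python sketch.  It is a simplification based only on the
descriptions above: the function name select_mode and its four flags are
assumptions, and the Index and RealNoCache modes are left out because the text
does not spell out how the two local cases are told apart.

```python
def select_mode(online, fetching, cached, refresh):
    """Pick a server mode name from the conditions described in the text.

    online   - the demon has been told that the system is online
    fetching - the server was started to work through the outgoing requests
    cached   - the requested page already exists in the cache
    refresh  - the request came from a refresh button or the wwwoffle program
    """
    if online:
        if fetching:
            return "Fetch"
        return "RealRefresh" if refresh else "Real"
    if refresh:
        # The existing page is kept, but a request is stored.
        return "SpoolRefresh"
    return "Spool" if cached else "SpoolGet"
```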


Technical Description Of Program Inter-operation
------------------------------------------------

The way that the three programs communicate between themselves and with browsers
and servers is shown in the following figures.

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
                                    +-------+         +---------+
Figure 1                            |BROWSER|-->-<-+--|WWWOFFLED|
                                    +-------+      |  +---------+
System online                                      v       :
Browser connection to wwwoffled                    |       :
wwwoffled forks a wwwoffles                        ^       v
wwwoffles in real mode                             |  +---------+       +------+
Connect to server                                  +--|WWWOFFLES|-->-<--|SERVER|
Read from browser, write to server                    +---------+       +------+
Read from server, write to cache and browser                  |
                                                              v
                                                              |
                                                             CACHE
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
                                   +--------+         +---------+
Figure 2                           |WWWOFFLE|-->-<-+--|WWWOFFLED|
                                   +--------+      |  +---------+
System online                                      |       :
wwwoffle connection to wwwoffled                   |       :
wwwoffled forks a wwwoffles                        ^       v
wwwoffles in fetch mode                            |  +---------+       +------+
Connect to server                                  +--|WWWOFFLES|-->-<--|SERVER|
Read from outgoing, write to server                   +---------+       +------+
Read from server, write to cache                        |     |
Repeat fork until done                                  ^     v
                                                        |     |
                                                  OUTGOING   CACHE
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
                                    +-------+         +---------+
Figure 3a                           |BROWSER|-->-<-+--|WWWOFFLED|
                                    +-------+      |  +---------+
System offline                                     v       :
Browser connection to wwwoffled                    |       :
wwwoffled forks a wwwoffles                        ^       v
wwwoffles in spool mode                            |  +---------+
Read from cache, write to browser                  +--|WWWOFFLES|
                                                      +---------+
                                                              |
                                                              ^
                                                              |
                                                             CACHE
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
                                    +-------+         +---------+
Figure 3b                           |BROWSER|-->-<-+--|WWWOFFLED|
                                    +-------+      |  +---------+
System offline                                     v       :
Browser connection to wwwoffled                    |       :
wwwoffled forks a wwwoffles                        ^       v
wwwoffles in spool mode                            |  +---------+
Read from browser, write to outgoing               +--|WWWOFFLES|
Write to cache and browser                            +---------+
                                                        |     |
                                                        v     v
                                                        |     |
                                                  OUTGOING   CACHE
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
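
The "read from server, write to cache and browser" step of Figure 1 amounts to
copying one input stream to two outputs.  A minimal Python sketch of that step
is given below; the function name relay and the chunk size are assumptions
made for this example, and any objects with read and write methods will do.

```python
def relay(server, cache, browser, chunk_size=4096):
    """Copy everything readable from server to both cache and browser.

    Returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = server.read(chunk_size)
        if not chunk:
            break  # the server has no more data to send
        cache.write(chunk)    # keep a copy for later offline viewing
        browser.write(chunk)  # pass the same data on to the browser
        total += len(chunk)
    return total
```

In fetch mode (Figure 2) the same loop would apply with only the cache as an
output, since no browser is active.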

Author and Copyright
--------------------

The three programs wwwoffle, wwwoffled and wwwoffles:

Were written by Andrew M. Bishop in 1996,97.

Are copyright Andrew M. Bishop 1996,97.

Can be freely distributed according to the terms of the GNU General Public
License (see the file `COPYING').

If you wish to submit bug reports or other comments about the programs then
email the author at amb@gedanken.demon.co.uk and put wwwoffle in the subject
line.
