#
# Copyright 1995 Carlos Maltzahn
# 
# Permission to use, copy, modify, distribute, and sell this software
# and its documentation for any purpose is hereby granted without fee,
# provided that the above copyright notice appear in all copies and that
# both that copyright notice and this permission notice appear in
# supporting documentation, and that the name of Carlos Maltzahn or 
# the University of Colorado not be used in advertising or publicity 
# pertaining to distribution of the software without specific, written 
# prior permission.  Carlos Maltzahn makes no representations about the 
# suitability of this software for any purpose.  It is provided "as is" 
# without express or implied warranty.
# 
# CARLOS MALTZAHN AND THE UNIVERSITY OF COLORADO DISCLAIMS ALL WARRANTIES 
# WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF 
# MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE UNIVERSITY OF COLORADO
# BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY 
# DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER 
# IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING 
# OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
# 
# Author:
# 	Carlos Maltzahn
# 	Dept. of Computer Science
# 	Campus Box 430
# 	Univ. of Colorado, Boulder
# 	Boulder, CO 80309
# 
# 	carlosm@cs.colorado.edu
#


                               Paos 0.2
                               ========



DISTRIBUTION
------------
Paos (Python active object server) is an active multi-user object server with a 
simple query language. All software is written in Python. The distribution 
consists of the following files:

  Store.py      - implements storing and locking of objects, the query language 
                  and registration of notifications.

  Server.py     - implements the network interface of Store.py. Server.py
                  imports Store.py and is started by "python Server.py <port>"

  Client.py     - implements the network interface of a client. It is used
                  by importing it into a Python program.

  Schema.py     - defines the class DBobject. All objects that are to be
                  stored in the object server need to be of this class or 
                  a class that inherits this class directly or indirectly.

  Utilities.py  - contains a number of functions that are used in the
                  modules above.

  example/
  --------
  Producer.py   - implements a producer that accepts input lines and stores them
                  to the object server. Started by 
                  "python Producer.py <host> <port>"

  Consumer.py   - implements a consumer that prints out lines produced by
                  a producer and is started by 
                  "python Consumer.py <host> <port>"

  Talk.py       - implements two-way communication (accepts input lines and
                  prints out lines received from the server as notifications).
                  Uses the select call and the new pipe feature.


  ExSchema.py   - contains the schema necessary for Talk.py, Producer.py and
                  Consumer.py


INSTALLATION
------------
Look at http://www.python.org/ for information on how to get and 
install Python. 

During installation make sure that you include at least one of the database
modules dbhash, gdbm, dbm, or macdb. I recommend dbhash or dbm with the ndbm
library because these do not limit the length of records (which gdbm and the
default library of dbm do; I don't know anything about macdb).

Second, you need to add the home directory of Paos and all your application
directories to the environment variable PYTHONPATH. For the bundled examples
the application directory is <Paos home>/example. Sometimes this environment
variable is not accessible to the Python application (e.g. in CGI programs
for a WWW server). In that case your application programs need to import the
module "sys" and set the variable "sys.path" appropriately.

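A minimal sketch of the sys.path approach, for programs that cannot rely on
PYTHONPATH. The directory /usr/local/paos is a hypothetical install location
chosen for illustration; it is not prescribed by the distribution.

```python
import os
import sys

# Hypothetical install location; replace with your actual <Paos home>.
PAOS_HOME = "/usr/local/paos"

# Make the Paos modules and the example schema importable.
for directory in (PAOS_HOME, os.path.join(PAOS_HOME, "example")):
    if directory not in sys.path:
        sys.path.insert(0, directory)
```

After this, "import Client" and "import ExSchema" should resolve, provided
the files really live in those directories.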

STARTING THE SERVER
-------------------
You start the server with "python Server.py <port number> [<database file name>]".
The database file name is optional; the default is "database". The server then
looks for a file <database file name>.db. If it does not exist, the server
creates a new file of this name.

CONNECTING TO THE SERVER
------------------------
The client can be either a standalone or an embedded Python program. It
needs to import Client.py. This module defines a class called "Connection"
which is instantiated as follows:

import Client

conn = Client.Connection(<host name>,
                         <port number>,
                         <client name> [,
                         <callback function>])

If host and port are correctly specified this creates a TCP connection to the
server. 

<client name> can be an arbitrary string which is only useful 
for debugging purposes and possible future extensions. 

<callback function> is optional. If specified, this function is called 
if the client receives a notification from the object server (see below on 
how to register notification requests). 

NEW in v0.2: Instead of a callback function you can now pass a pipe
(a tuple of a read and a write file descriptor, as returned by os.pipe()).
You can use select.select(...) on the read descriptor of the pipe. Use
Utilities.READ(...) and pickle.loads(...) to receive the notification (see
below for the format of a notification). You also need to apply
conn.register_objs(...) to the notification's object list. See the example
application.
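The pipe mechanics can be sketched with the standard library alone. This
sketch plays the writing side itself and reads with os.read; the real client
would use Utilities.READ(...), whose framing is not described here, so the
read step is an assumption. The notification values are made up, and
select on a pipe assumes a Unix-like system.

```python
import os
import pickle
import select

# A pipe as returned by os.pipe(): (read descriptor, write descriptor).
read_fd, write_fd = os.pipe()

# Stand-in for Paos delivering a notification; the tuple layout follows
# (<request_id>, <obj_list>, <committing client>) with made-up values.
os.write(write_fd, pickle.dumps((1, ["hello"], "producer")))

# Wait up to one second until the read end becomes readable.
readable, _, _ = select.select([read_fd], [], [], 1.0)

notification = None
if read_fd in readable:
    # Assumption: a single os.read suffices for this tiny payload.
    # Real code would call Utilities.READ(...) instead.
    notification = pickle.loads(os.read(read_fd, 4096))

os.close(read_fd)
os.close(write_fd)
print(notification)
```

In a real client the read descriptor would sit in the same select call as
the other inputs (for example standard input), which is what makes the pipe
variant convenient for two-way programs.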

All interactions with the server are defined as methods of the Connection
instance. Note also that you can have multiple connections to the same or
different servers. However, currently each object server has a separate object
ID name space. Also, each client registers with a client-specific name, not
a connection-specific name. Therefore, the client programmer has to take
care of possible name collisions. A future version will introduce client
naming that is unique over all connections and object ID naming that is
unique over all Paos object servers.

Use

conn.close() 

to close the connection. 


QUERYING THE OBJECT SERVER
--------------------------
In order to query the object server you use 

answer = conn.get(<access mode>, <scope>, <property list>)

answer is a list of objects.

<access mode> can be either 'r' for read-only access or 'rw' for write-locking
  all objects contained in the answer. If some of the objects contained in
  answer are already write-locked by another client then the answer is None.
  Note the difference to an empty list that merely indicates that there
  is no object in the object server that matches the query. 
  Note that each failure to acquire write-locks results in the loss of
  all write-locks acquired so far!

<scope> can be either a list of persistent object references or a class name. 
  A persistent object reference is a tuple as follows: 
  ('__db', <db_id>). 

<db_id> is an integer issued to each object that is stored in the object server.

<property list> is a list of properties. A property is a tuple as follows:
  (<attribute name>, <relation>, <value>). 

<attribute name> is a string specifying the name of an attribute of objects
  specified by <scope>.

<relation> is one of '==', '!=', 'in', 'not in', 'has', 'has not',
  'all in', 'not all in', 'some in', 'none in'.

  The meaning of '==', ..., 'not in' is the same as in Python. 

  A list 'has' element iff element 'in' a list.

  A list 'has not' element iff not list 'has' element.

  List A 'all in' list B iff the elements of A are a subset of elements of B.

  List A 'not all in' list B iff not list A 'all in' list B

  List A 'some in' list B iff there exists a non-empty subset C of elements of A
    which is also a subset of elements of B.

  List A 'none in' list B iff not list A 'some in' list B

  Note that 'some in' is not the same as 'not all in'. In the first case 
  the subset C has to be non-empty; in the second case C can be empty. 
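The relation definitions above can be restated as a small function. This is
a sketch of the semantics only, not the code in Store.py; the name "holds"
and the convention that the attribute value is the left operand are
assumptions made for illustration.

```python
def holds(left, relation, right):
    """Evaluate `left <relation> right` following the definitions above."""
    if relation == '==':
        return left == right
    if relation == '!=':
        return left != right
    if relation == 'in':
        return left in right
    if relation == 'not in':
        return left not in right
    if relation == 'has':
        return right in left
    if relation == 'has not':
        return right not in left
    if relation == 'all in':
        return all(x in right for x in left)
    if relation == 'not all in':
        return not all(x in right for x in left)
    if relation == 'some in':
        return any(x in right for x in left)
    if relation == 'none in':
        return not any(x in right for x in left)
    raise ValueError("unknown relation: %r" % (relation,))


# [1, 2] and [3, 4] share no element: 'not all in' holds, 'some in' does not.
print(holds([1, 2], 'not all in', [3, 4]), holds([1, 2], 'some in', [3, 4]))
```

The last line shows the difference pointed out in the note: with an empty
common subset, 'not all in' is true while 'some in' is false.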


CREATING NEW OBJECTS
--------------------
Each new object that is created in a client and that is eventually written
to the object server needs to be registered with the server PRIOR TO COMMIT
TIME. Objects that are not registered at commit time can cause bad
inconsistencies! In general new objects should be registered before your
first access to one of their attributes with references to other persistent 
objects. Each registered object receives a unique persistent object ID 
under the attribute name "db_id". Use

db_id_list = conn.register_objs(<obj_list>)

db_id_list is a list of db_id integers in the order corresponding to <obj_list>.

<obj_list> is a list of objects. It can contain registered and unregistered
  objects. Registering already registered objects is useful in connection with
  notifications (see below). All unregistered objects in <obj_list> 
  acquire write-locks.


STORING OBJECTS
---------------
Objects are stored by using

ret = conn.commit(<obj_list>)

ret is 'ok' on success, or None if an error occurred at the server
  (the diagnostics printed out by the server will give more information
  about the error - I'm aware that this is not a good solution; future
  versions will hopefully offer better error handling).

<obj_list> is a list of objects. <obj_list> contains all the objects
  that are supposed to be written to the database. However, only objects
  that were previously locked will be written to the object server; read-only
  objects are simply ignored.
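The register-then-commit order can be wrapped in a small helper. This is a
sketch: "store_new" is a hypothetical name, and FakeConnection is only a
stand-in so the example runs without a server (it hands out made-up IDs and
accepts plain strings instead of DBobject instances).

```python
def store_new(conn, objs):
    """Register objs, then commit them; return their db_ids or None."""
    # Registration must precede commit; it also write-locks new objects.
    db_ids = conn.register_objs(objs)
    # commit returns 'ok' on success and None on a server-side error.
    if conn.commit(objs) != 'ok':
        return None
    return db_ids


class FakeConnection:
    """Stand-in for Client.Connection so the sketch runs without a server."""

    def __init__(self):
        self.next_id = 1
        self.committed = []

    def register_objs(self, objs):
        ids = list(range(self.next_id, self.next_id + len(objs)))
        self.next_id += len(objs)
        return ids

    def commit(self, objs):
        self.committed.extend(objs)
        return 'ok'


print(store_new(FakeConnection(), ['a', 'b']))
```

With a real connection the same helper would take the object returned by
Client.Connection(...) as its first argument.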


LOCKING OBJECTS
---------------
It is possible to write-lock objects once they are loaded. Use

answer = conn.lock(<obj_list>)

answer is a list of objects locked. The order of the list corresponds to 
  <obj_list>. However, answer contains the versions of objects as they were
  found in the object server at locking time. If the lock failed answer is 
  None and all previously acquired locks are released. 

<obj_list> is a list of persistent objects to be locked. Objects that are not
  explicitly mentioned in the list (i.e., are only directly or indirectly
  referenced by objects explicitly mentioned in the list) are ignored.

Note: 'lock' is faster than 'get' in the case of failed locking: 'get'
retrieves objects before checking their locks while 'lock' checks locks first.

Note also that there are three occasions where all previously acquired
locks are lost: (1) calling "commit", (2) a call to "lock" that fails, and
(3) closing the connection or terminating the client.
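The all-or-nothing behaviour described above can be modelled with a toy lock
table. This is an illustration of the rules as stated, not the Paos
implementation; the class and method names are hypothetical.

```python
class LockTable:
    """Toy model of all-or-nothing write-locking (not the Paos code)."""

    def __init__(self):
        self.owner = {}  # db_id -> name of the client holding the lock

    def lock(self, client, db_ids):
        # Check every lock first; a single conflict fails the request.
        if any(self.owner.get(i, client) != client for i in db_ids):
            self.release_all(client)  # a failed lock drops earlier locks
            return False
        for i in db_ids:
            self.owner[i] = client
        return True

    def release_all(self, client):
        # The same release applies on commit, close, or client termination.
        for i in [i for i, c in self.owner.items() if c == client]:
            del self.owner[i]


table = LockTable()
table.lock('alice', [1, 2])
table.lock('bob', [3])
# bob asks for 2 (held by alice) and 4: this fails, and bob loses lock 3 too.
ok = table.lock('bob', [2, 4])
print(ok, sorted(table.owner.items()))
```

A client that needs several objects is therefore better off requesting them
in one call than accumulating locks over several calls.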


ATTRIBUTE ACCESS
----------------
Assume you load objects a and b, and a.attr = b, i.e. a.attr contains a
pointer to b. Now you issue a query that loads b and c. Without further
measures, a.attr and b would now refer to different objects because a.attr
points to an older version of b. With many objects referring to each other it
can become quite difficult to keep track of all the different versions of
objects.

In Paos each connection instance maintains an object cache that is updated by
all connection methods except get_raw_notification() (see below). 
Attribute access of registered objects always accesses objects in the cache.
Thus, in the above example a.attr always refers to the newest version of b.
If a user wants to keep the older version of b she needs to assign it to 
a variable v before the next query. However, b's references to other persistent
objects always point to the newest versions. 

Another advantage of this policy of attribute access is that the client
will load objects from the object server as needed. For example, if you load
object a and you assign v = a.attr then the client will automatically
load b unless it is already in the cache.

This convenience comes with a price: When you define persistent object
classes you need to enumerate those attribute names that can have attribute
values which contain references to other persistent objects. This information
is kept in a special attribute called '__refs'. For example: 

import Schema

class A(Schema.DBobject):
  def __init__(self):
    Schema.DBobject.__init__(self)
    self.__refs = ['attr']

This assumes that instances of class A have an attribute called 'attr' that 
can refer to other persistent objects.
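Why the list of reference attributes matters can be seen in a toy model of
cache-based resolution. Everything here is an assumption made for
illustration (the Record class, the "refs" attribute, the "resolve"
function and the plain dictionary used as cache); the actual mechanism in
Client.py may differ.

```python
class Record:
    """Toy persistent object; refs lists attributes holding references."""

    def __init__(self, db_id, refs=()):
        self.db_id = db_id
        self.refs = list(refs)


def is_reference(value):
    # A persistent object reference has the form ('__db', <db_id>).
    return isinstance(value, tuple) and len(value) == 2 and value[0] == '__db'


def resolve(cache, obj, name):
    """Return obj.<name>, following a persistent reference via the cache."""
    value = getattr(obj, name)
    if name in obj.refs and is_reference(value):
        return cache[value[1]]  # whatever version the cache holds now
    return value


a = Record(1, refs=['attr'])
a.attr = ('__db', 2)
cache = {1: a, 2: Record(2)}
first = resolve(cache, a, 'attr')
cache[2] = Record(2)  # a later query brings in a newer version of b
second = resolve(cache, a, 'attr')
print(first is second)
```

Because a stores only the reference, the lookup always yields the version
currently in the cache, while a variable that captured the earlier result
keeps the older version.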


NOTIFICATIONS
-------------
With

request_id = conn.register(<scope>, <property list>)

you can register a notification request. <scope> and <property list> have
the same meaning as in "get". A notification request is a query that is
stored at the object server and evaluated in each subsequent "commit" 
against the set of objects that is written to the object server. If the result 
of such a query is not empty the client which registered the notification 
request is notified. The format of the notification is

(<request_id>, <obj_list>, <committing client>)

<request_id> corresponds with the returned value of the corresponding
  "register" call, i.e. identifies the corresponding query.

<obj_list> is the list of objects that matches the query.

<committing client> identifies the client that triggered the notification.

Note that no client can register a notification request for another
client; each notification request corresponds to exactly one client.
Also note that notification requests do not survive a client's lifetime:
if a client terminates (or crashes), all notification requests owned by that
client are deleted.

There are multiple ways for a client to process notifications. If the 
connection to the server was created with a pointer to a callback 
function in the fourth argument then the client is interrupted at
each notification (with the signal SIGUSR1) and the callback function is
called. Otherwise the client needs to poll for notifications.
In both cases notifications are retrieved by

notification = conn.get_notification()

Note that a notification is generated for each registered notification
request. For example, if a client registered two requests and a subsequent
commit contains objects matching both requests then the object server
sends two notifications to the client.
Also note that multiple notifications triggered by one commit are sent in the
order they were registered.
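The rules above (one notification per matching request, in registration
order, evaluated only against the committed objects) can be modelled in a
few lines. This is a toy model, not the code in Store.py; the class is
hypothetical and a plain Python predicate stands in for the <scope> and
<property list> of a real request.

```python
class NotificationServer:
    """Toy model of notification requests (not the Paos implementation)."""

    def __init__(self):
        self.requests = []  # (request_id, client, predicate), in order
        self.next_id = 1

    def register(self, client, predicate):
        request_id = self.next_id
        self.next_id += 1
        self.requests.append((request_id, client, predicate))
        return request_id

    def commit(self, committing_client, objs):
        # Evaluate every stored request against the committed objects only.
        notifications = []
        for request_id, client, predicate in self.requests:
            matched = [o for o in objs if predicate(o)]
            if matched:  # empty results produce no notification
                notifications.append(
                    (client, (request_id, matched, committing_client)))
        return notifications


server = NotificationServer()
server.register('consumer', lambda o: o > 10)
server.register('consumer', lambda o: o % 2 == 0)
sent = server.commit('producer', [4, 12])
print(sent)
```

The commit of 4 and 12 matches both requests, so the consumer receives two
notifications, the first for request 1 and the second for request 2.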

Each "get_notification" updates the object cache (see paragraph about
attribute access). One can avoid this by using 

notification = conn.get_raw_notification()

Note, however, that attribute access in objects within the notification
is not resolved correctly since these objects are disconnected from the
attribute resolution mechanism discussed above. To connect these objects 
to the resolution mechanism use "register_objs" (this updates the object cache).

If there are no notifications "get_notification" returns None.
With

conn.unregister(<request_id>)

you can retract a notification request.
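Since "get_notification" returns None when nothing is pending, a polling
client or a callback can collect everything that has arrived with a simple
loop. The helper name "drain_notifications" is hypothetical, and
FakeConnection is only a stand-in with two made-up notifications so the
sketch runs without a server.

```python
def drain_notifications(conn):
    """Collect pending notifications until get_notification returns None."""
    pending = []
    while True:
        notification = conn.get_notification()
        if notification is None:
            break
        pending.append(notification)
    return pending


class FakeConnection:
    """Stand-in for Client.Connection with two queued notifications."""

    def __init__(self):
        self.queue = [(1, ['x'], 'producer'), (2, ['x'], 'producer')]

    def get_notification(self):
        return self.queue.pop(0) if self.queue else None


print(drain_notifications(FakeConnection()))
```

Draining in a loop matters because one commit can trigger several
notifications for the same client.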


