Checksum: 47346
Path: utzoo!utgpu!craig
From: craig@gpu.utcs.utoronto.ca (Craig Hubley)
Date: Sun, 3-Sep-89 18:23:14 EDT
Message-ID: <1989Sep3.182314.9660@gpu.utcs.utoronto.ca>
Organization: University of Toronto Computing Services
Newsgroups: comp.databases
Subject: Re: OO DBMSs (was Re: Extended RDB vs OODB)
References: <3560052@wdl1.UUCP> <411@odi.ODI.COM> <19@dgis.daitc.mil> <1989Aug14.140128.15094@odi.com> <27@dgis.daitc.mil> <1989Aug21.132525.3179@odi.com> <459@cimshop.UUCP> <1989Aug28.213213.1239@odi.com>
Reply-To: craig@gpu.utcs.UUCP (Craig Hubley)

(John Orenstein requested analogies for 'adding a column' in O-O terms,
 sorry I haven't the article on hand)

I think this issue is important because it may help to define relational
operations as a proper subset of object-oriented operations, and let
OODBMS developers provide all standard relational capabilities in their
systems, which I think is a worthwhile goal.  I have also included some
discussion of Linda, which is a DBMS in Krueger's sense, and well worth
investigating as a unifying access metaphor.  

DATA-BASED ANALOGY

If you accept the analogy of C++ classes to relational table definitions,
which seems relatively sound to me, then you might also accept the analogy
of 'adding a column' to mean adding a data item to each instance of the class.
Of course, to be truly object-oriented, you are actually adding a set of
*legal operations* on each object of that class, including the ability to
alter and retrieve that data item. There may be other, more involved meanings,
but that is a sort of default.  Alternatively, you could think of inserting
these 'more involved meanings', which is addressed below.

Clearly, you may also be adding a set of more involved or advanced operations
or virtual capabilities, but if the data item or 'column' didn't exist before
then you aren't relying on it anywhere.  Presumably the implementation of 
other operations (methods) on the class might change due to the presence
(or absence) of this new item, but that doesn't change the availability of
these methods from outside, which in terms of the access algebra is the only
issue.

So if I add 'Country' to a relational table containing company addresses,
then by default I would add methods like 'SetCountry()' and 'GetCountry()',
but these are optional.  If the definition of 'SendMail()' changes because
of the new 'Country' field, that is not an issue at the interface.
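To make the 'Country' example concrete, here is a minimal sketch in Python.
The Company class and its method names are hypothetical, chosen only to
mirror SetCountry(), GetCountry() and SendMail() above; they come from no
real system.  The point is that send_mail() changes internally while its
interface stays the same.

```python
class Company:
    """Hypothetical class standing in for a 'company addresses' table."""

    def __init__(self, name, street, city, country=None):
        self.name = name
        self.street = street
        self.city = city
        # The new 'column': one more data item in every instance.
        self._country = country

    # Default operations that arrive with the new column (optional).
    def set_country(self, country):
        self._country = country

    def get_country(self):
        return self._country

    # The implementation now consults the new field, but callers
    # still invoke send_mail() exactly as they did before.
    def send_mail(self):
        lines = [self.name, self.street, self.city]
        if self._country is not None:
            lines.append(self._country)
        return "\n".join(lines)


acme = Company("Acme", "1 Main St", "Toronto")
before = acme.send_mail()      # address without a country line
acme.set_country("Canada")
after = acme.send_mail()       # same call, now includes the country
```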

METHOD-BASED ANALOGY

Another interpretation of 'adding a column' is 'adding a method',
and this is probably a more supportable analogy.  Adding a new method
may or may not involve adding new data to all instances of the class.
You could think of 'inserting' this method in terms of defining a new
function in an incremental compiler for an object-oriented language.
In fact, it could be implemented this way.
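As a rough illustration of 'inserting' a method, the sketch below uses
Python, where a function can be attached to a class after instances already
exist.  This only stands in for the incremental-compiler idea; the Company
class and mailing_label() are hypothetical names.

```python
class Company:
    """Hypothetical class; instances play the role of rows."""

    def __init__(self, name, city):
        self.name = name
        self.city = city


# An instance created before the new method exists.
existing = Company("Acme", "Toronto")


def mailing_label(self):
    # A derived 'column': computed from existing data, so no new
    # storage is added to any instance.
    return f"{self.name}, {self.city}"


# 'Insert' the method into the class definition at run time.
Company.mailing_label = mailing_label

# Instances created earlier see the new method immediately.
label = existing.mailing_label()
```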

LINDA

Finally, if one accepts the definition of a 'database' as 'shared access to
persistent objects', then everyone should have a look at Linda, the Yale
language extensions developed for parallel processing.  Linda defines a
'tuple space', like a simple table of variable-length records, that provides
exactly this 'shared access to persistent objects'.  It is assumed in Linda
that the persistence is of short duration, but this assumption is in no way
built into the interface.  It would be relatively straightforward to extend
Linda to full relational DBMS capabilities, and ultimately to OODBMS status,
in the bargain gaining Linda's unification of all IPC and RPC schemes!

Linda's interface is quite simple, basically:
	in(tuple pattern)	removes a matching tuple from the tuple space
	out(tuple)		places a tuple in the tuple space
	rd(tuple pattern)	reads a matching tuple without removing it
	eval(active tuple)	places an active tuple, in which one or more
				fields are function calls; when they finish,
				the tuple becomes passive and remains in the
				tuple space

ref: "Linda in Context", April 1989 Communications of the ACM.
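
For readers who want to experiment, the four primitives can be mimicked in
a few lines of Python.  This is only a toy sketch under several assumptions
of mine: a single process, threads standing in for Linda processes, and an
ANY wildcard in place of Linda's typed formal parameters.  TupleSpace, ANY
and in_ (since 'in' is a Python keyword) are my own names, not those of any
real Linda implementation.

```python
import threading

ANY = object()  # wildcard for a pattern field (assumption of this sketch)


class TupleSpace:
    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    @staticmethod
    def _matches(pattern, tup):
        # Same length, and every non-wildcard field must be equal.
        return len(pattern) == len(tup) and all(
            p is ANY or p == t for p, t in zip(pattern, tup))

    def _find(self, pattern):
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def out(self, tup):
        """Place a passive tuple in the space and wake any waiters."""
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    def rd(self, pattern):
        """Block until a tuple matches; return it without removing it."""
        with self._cond:
            while (tup := self._find(pattern)) is None:
                self._cond.wait()
            return tup

    def in_(self, pattern):
        """Block until a tuple matches; remove and return it."""
        with self._cond:
            while (tup := self._find(pattern)) is None:
                self._cond.wait()
            self._tuples.remove(tup)
            return tup

    def eval(self, *fields):
        """Active tuple: callable fields are computed in another
        thread, then the finished tuple is placed in the space."""
        def run():
            self.out(tuple(f() if callable(f) else f for f in fields))
        worker = threading.Thread(target=run)
        worker.start()
        return worker


space = TupleSpace()
space.out(("company", "Acme", "Canada"))
space.eval("count", lambda: 1 + 1)           # becomes ("count", 2)
seen = space.rd(("company", ANY, "Canada"))  # tuple stays in the space
taken = space.in_(("count", ANY))            # waits for eval, then removes
```

Nothing in this interface says how long a tuple lives, which is the point
made above: the backing list could as easily be a disk-resident store.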

Linda's present conception of 'tuple pattern' is limited, and it inherits its
datatypes from the language these primitives are added to (the one-language
unification of DBMS and other programming constructs).  However, the idea of
tuples is very general, and were the system extended to work set-at-a-time,
accepting tuple pattern definitions in relational algebra, and active
tuple definitions as programs in relational algebra, Linda could just as
easily extend SQL!

For an object-oriented system, each field in the tuple can be thought of
as an object, and the operations as retrieving or emitting a set of objects,
or a container object holding the set, which would probably be better.
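
A small sketch of what set-at-a-time access through a container object
might look like, again in Python.  Everything here is hypothetical
(TupleSet, Country, and the select/project names borrowed from relational
algebra); no existing Linda offers this, it only illustrates the extension
suggested above.

```python
class Country:
    """A tuple field that is an object rather than a plain value."""

    def __init__(self, code):
        self.code = code

    def __eq__(self, other):
        return isinstance(other, Country) and self.code == other.code

    def __hash__(self):
        return hash(self.code)


class TupleSet:
    """Hypothetical container object holding a set of tuples."""

    def __init__(self, tuples):
        self._tuples = list(tuples)

    def __iter__(self):
        return iter(self._tuples)

    def __len__(self):
        return len(self._tuples)

    def select(self, predicate):
        # Relational selection: keep the tuples satisfying the predicate.
        return TupleSet(t for t in self._tuples if predicate(t))

    def project(self, *indexes):
        # Relational projection: keep only the listed field positions.
        return TupleSet(tuple(t[i] for i in indexes) for t in self._tuples)


space = TupleSet([
    ("Acme", Country("CA")),
    ("Globex", Country("US")),
    ("Initech", Country("CA")),
])

# One operation returns a container of all matches, not a single tuple.
canadian = space.select(lambda t: t[1] == Country("CA")).project(0)
names = list(canadian)
```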

Linda is traditionally a preprocessor and implements its capabilities on
top of existing shared-memory, semaphore, or message port schemes, but
there are other approaches, including building 'tuple space' as a real DBMS. 
Contrary to its abstract appearance, Linda has been added efficiently to
C, C++, PostScript, and other languages.  One scalable supercomputer, the
Cogent XTM, even uses Linda to do *all* its low-level IPC and RPC, down to
the level of mouse moves.  It is based on transputers and exploits their
1 microsecond context switches.  Their version, Kernel Linda, defines a
set of language-independent data types, though their primary interface is C++.


CONCLUSION

So it seems not only possible to define an access algebra that is very
useful in traditional computing approaches, but also possible to extend it
to include database needs as well (that is, if we accept 'shared access to
persistent objects' as a good working definition).  Linda tuples
can be effectively made very shared and very persistent, as evidenced by the
Cogent implementation.  Although 'tuples' seem like a relational concept,
they are in fact only lists of fields, and these fields could contain object
tags as easily as anything else.  In fact, since the Linda algebra is
presently defined to be one-at-a-time, not set-at-a-time, it would be
easier to have the primitives work with container objects than with sets.
It thus seems more suited to object-oriented approaches, the more so since
developers would have two selling points:  the DBMS, and a unified and
explicit metaphor for dealing with transient objects too!  As the Cogent
system shows, a system optimized for this single shared-access metaphor can
be made very efficient and totally scalable.  This is the logical extension
of the 'one-language' advantage that the Object Design people claim.

    Craig Hubley			-------------------------------------
    Craig Hubley & Associates		"Lead, follow, or get out of the way"
    craig@gpu.utcs.utoronto.ca		-------------------------------------
    craig@gpu.utcs.toronto.edu    mnetor!utgpu!craig@uunet.UU.NET
    {allegra,bnr-vpa,decvax,mnetor!utcsri}!utgpu!craig    craig@utorgpu.bitnet
