Subj : Re: Change Patterns (was: Polymorphism sucks)
To   : comp.programming,comp.object
From : rem642b
Date : Mon Aug 15 2005 04:29 pm

> From: "Ed Prochak" <ed.proc...@magicinterface.com>
> Your understanding of RELATIONAL Databases is really off track.

Or perhaps your perception of my understanding is incomplete or wrong.

> One big feature of an RDBMS is to handle the data storage for you.

Yes, I know that already. Tell me something I don't already know.

> You don't know or care whether the fields are fixed lenght or not.

Actually you do, because variable-length fields with unlimited length
are not portable across all relational database systems. In particular
CloudScape/Derby requires explicit maximum-length declarations on all
varchar fields.

> But the key feature of a RDBMS is treating data as sets.

"A" key feature, one of several.

> Until you understand that fundamental concept of Relational
> databases, you will not use them properly.

I already understand that, with my correction from your incorrect English.
Tell me something I don't already know.

> I haven't used or read about CloudScape/Derby.

"Google is your friend." as they always say.

> how do you handle data integrity? e.g., how would you guarantee that a
> stock number exists before allowing it to be used on a shipping order.

I make sure the file or table containing the stock number is safely
written to disk before I start writing any data that mentions it. When
writing flat-file databases, I write to a new temporary file first,
then after I'm sure all temporary files are safely written I perform a
rename-roll: backup <- main <- newtemp for each file involved in the
transaction. If the program or system crashes in the middle of writing
temporary files, I just discard them all and re-do the transaction if
anyone still is around to want it. If the program or system crashes
during the rename-roll, I manually finish it before restarting the
program. In a production system I could have a COMMIT file which is
written after all the temporaries but before the rename-roll occurs. If
that file exists after a crash, do/finish rename-roll, else delete all
temporaries. The user could be told COMMITTING...DONE. If that full
thing appears, the transaction is definitely committed. If none of that
appears before a crash, the transaction is definitely not committed. If
only the first part prints out, a very rare condition where the system
or the InterNet connection crashed during the millisecond it took to
write the COMMIT file, then the user has to call customer service to
learn whether the transaction was committed or discarded.

> How to you deal with multiple users accessing the same rocords at the
> same time? how to you guarantee one user doesn't interfere with
> another?

If they're just reading the same record, no problem.
If one is modifying the record while the rest are just reading, the
reads that started first get the old version whereas the reads that
started last get the updated version, and presumably the system is
documented to the effect that this particular kind of record is updated
in-place rather than amended as a separate update record.
Interlocking of that record prevents simultaneous update. The
application must be coded in such a way that read-modify-write is all
enclosed in a lock on that record, so that the following is possible:
  User A:  read- ...modify...   -write
  User b:         waitForLock.........GotLock read-modify-write
but the following is not possible:
  User A:  read- m o d i f y v e r y s l o w l y -write
  User B:         read-modify-write
and all other possible interleaved combos are likewise impossible.

> A decent DBMS does a lot of this (data integrity and tranaction
> integrity) for you.

Yes, I already know that. For example:
  http://groups.google.com/group/comp.programming/msg/666abab8e2cc9c6d?dmode=source&hl=en
  Message-ID: <REM-2005aug09-013@Yahoo.Com>
  ... I wanted to make sure two different programs
  didn't try to increment the master sequence number at the same time, so
  I set up a "Lock" database whereby any process can ask the RDBMS to add
  a Lock record with a certain value as the unique key, and if successful
  then the process knows that no other process can have that same key at
  the same time, hence that one process has unique access to whatever
  activity is associated with that unique key, 
Tell me something I don't already know.
(Note I didn't want to lock the whole database through an entire
transaction, I wanted merely to lock a particular table during a
critical section, while anything not needing to perform the same exact
critical section, namely assigning the next consecutive unused
unique-primary-key could use the database, even the same table, during
that time. I took advantage of the feature that adding a record to a
table is an atomic step in a RDBS and is protected from two different
processes trying to add records with the same primary key. Note: Two
primary keys involved here, the primary key of the lock record, which
is a global constant, so that two processes trying to get the same lock
will have the second blocked by the first, and the primary key of the
data table, which is generated dynamically as highest existing key plus
one.)

> What stops an application program not written by you from adding that
> record to the order file?

On Unix/Linux there's a feature called directory permission.
Nobody except me or my software is allowed to write onto my directory.

> the ACCESS internal model is Network, not Relational. ORACLE is
> Relational (with some OO and other features).

What's the difference between the two models that most distinguishes them?

> You might be surprised by the things you could do with a real RDBMS.

Yeah, sometimes I wish I had a useful version to play with. CloudScape
in my 39MB Linux laptop is just barely usable, enough to do class
assignments, but not enough to really get going with.

> Personally I started on a network model DBMS called IMAGE 2000 on a
> 16bit minicomputer.

Lucky you. I never had any such thing available to me.

> A locking model is used in some RDBMS's you are basically rewriting
> stuff that ORACLE Sequel Server, DB2, UNIFY, Informix, and others
> include.

Probably so, but I didn't have any of those systems available. I needed
to develop on my laptop (CloudScape/Derby) then port to the campus
system to demonstrate with MS-ACCESS. So I had to implement one
advanced feature myself. It wasn't hard at all, so I didn't mind.
It wasn't like the three weeks of pain it took me to figure out a way
to run HttpServlets without J2EE, because J2EE slowed my whole system
to a crawl making it take hours for a single edit-debug cycle, so I
needed to develop with only J2SE running, and then after it was all
debugged spend two hours running a full J2EE test just to make sure it
really worked before spending several more hours uploading to Yahoo!
Mail then taking bus to campus to download to Windows and see if it
still worked there on barebones TomCat without any J2EE.

> Also, controlling access by locks is inherently unscalable. ONLY ONE
> process has access, making all others wait.

No, you're mistaken. It's easily scalable. For example, consider my
example of locking the master ID table whenever somebody needs to
create a new ID one distinct from any previous ID. Locking the whole
table via the built-in transaction mechanism is indeed unscalable.
But having a lock record in a lock table is easily scalable.
First off all, you can break up ID space into modular classes, for example
module ten. So then there are ten separate locks for the ten separate
modular classes of IDs, and up to ten IDs can be generated simultaneously.
If the data to/from the table-server is overloading the network, you
can physically break the over-used tables into separate tables located
in remote locations. But if that's happening, you probably already
needed to figure out a distributed system for your databases and the
load on any single such table of IDs is back to not-too-huge.

Regarding what I did in 1970-71 to implement a "programmed text" layout
for organizing online information in a hierarchial menu format,
analagous to today's trees of Web pages:
> I'm impressed.

Impressed that I saw the need and decided to hack it, and thereby was
one of the first to see such a need and act on it, or that I was able
to write the code for it at a time when I had only a couple years of
part-time academic software-development experience, or what?

Note that I abandoned the strictly tree-structured information system
way back in the early 1970's because I realized that it's difficult to
navigate via menus from the global topic of all topics down to the
special topic of interest, when really you'd rather just type in a
keyword and thereby jump immediately to some frame close to where you
want to browse. I got busy with other projects so couldn't develop that
idea, but Yahoo came up with the same idea about twenty years later and
implemented it on the Web, namely a tree structure of information with
a way to keyword your way to any starting point but then browse from
there along the tree links. I tried that just now but Yahoo seems to
have dismembered their entire hierarchial topics, so now a Yahoo search
for "gopher protocol" gets a flat list of matches but no list of
entries into the tree they used to have. Yahoo! Groups still has that
kind of classification system. For example I searched for "gopher
protocol" in there, and got just one match, but going into that group I
found the category to be in the hierarchy:
   Top > Computers & Internet > Internet
and looking out the branches from there:
Browse for more specialized Groups
     * 7 Auctions (875)
     * 7 Chatting (2450)
     * 7 General (2353)
     * 7 Individual Web Sites (861)
     * 7 Instant Messaging (568)
     * 7 Internet Appliances (208)
     * 7 Internet Business (3910)
     * 7 Mobile Internet (225)
     * 7 Searching the Net (232)
     * 7 Servers (228)
     * 7 Spam (233)
     * 7 Web Browsers (144)
     * 7 Web Design (1679)
     * 7 Web Site Promotion (616)
     * 7 Webmasters (743)
     * 7 Webrings (112)
     * 7 Internet Addiction@ (209)
     * 7 Web Cams@ (590)

> Maybe I'm speaking to a master who has moved beyond the need for a
> DBMS. But for us mortals, it is a very useful tool.

Oh no, if I had regular access to a RDBMS I'd probably find a use for it.
Unfortunately I don't know of any RDBMS available on my Unix shell account.
(If anyone knows of any such available in the standard distribution of
FreeBSD Unix, even if PHP is the only way to get access to it, please
let me know.)

.