Subj : A TCP/IP variation of -N
To   : David Noon
From : Mike Luther
Date : Sun Aug 26 2001 02:09 am

Yes, Dave ..

 DN> As I wrote in my follow-up to your identical request on Usenet, the
 DN> way to go about this is to perform the raw extraction on the server.
 DN> If you view this request as a database table and you wish to extract
 DN> selected row(s) from said table, the way to go about that is to have
 DN> the server run some code that extracts the row(s) you want and
 DN> download them.

 DN> This means that ftp itself does not need any real modification --
 DN> just smarter servers. I offered the U.K. mirror of Hobbes as an
 DN> example, but it uses Web pages to offer the zip file's contents for
 DN> selection. While http is somewhat slower than ftp, the difference
 DN> over a cable modem is sod all.

You are totally correct, Dave. And the reason I didn't go forward with
what was offered there is that it seems more civilized to kick things
around here in Fido, so I'll try to do that. The problem, as I see it,
is in that last sentence you offered:

 DN> While http is somewhat slower than ftp, the difference over a cable
 DN> modem is sod all.

Under the conventional way of handling HTTP for information and data
recovery, we can recover a PAGE or PAGES of data. When we see it
displayed on the download end, what is in the PAGE is what we see.
Since the content in the page is accurate, the underpinning file behind
it is transparent to the user, but it is *NOT* the same: it does not
have the same date and time stamp as the source. HTTP has no underlying
method, that I know about, to match the file date and time stamp
between the systems; it only matches the contents of the file.

If you think about the question I posed, you'll realize that what I was
shooting at was replication of FILES, based on and including their
exact time and date stamp. The core of the question spoke toward
knowing whether or not to get any particular file, or record, as a
result of its file stamp time. We absolutely have to maintain file
stamp date and time, and, yes, in some cases, even the EAs, just to
precisely match the information package between the boxes!

If you look at a couple of threadlettes on this that have been in the
newsgroups too, you'll see that there simply isn't any provision that
has ever been set forth for HTTP that will provide this. If you use
HTTP to create a maintenance database of files, names, dates and times,
for example, to chart the ongoing fix-everything for any OS/2 box or
other complicated system, you'll fail. That's because there is no way
to display the entire collection so that you are *SURE* that the file
you have is the right one. A prime example of that is what has happened
to the various files that make or break the Adaptec SCSI driver
operation for OS/2.

There are super-critical applications now coming into focus which have
legal requirements concerning this, in medicine for example. In short,
the FDA here in the USA is very specific. What was stored, in total,
has to be what is transmitted, stored, and recovered, in exactly the
same fashion as it was archived. 100% lossless is the rule. That's
everything, including file dates and times, just so that there is
absolutely no question that what was archived is exactly the precise
same thing that the next person gets, sees, and uses in making their
decision on what to do in any given case.
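Just to pin down what "maintain the file stamp" means in practice, here
is roughly the replication step I'm after, sketched in Python. The host
and file names are invented, and it leans on the FTP MDTM extension,
which not every server speaks, so take it as an illustration of the
idea rather than a finished tool:

# Sketch only: fetch a file by FTP, ask the server for its modification
# time with MDTM, and give the local copy the SAME date and time stamp.
# ftp.example.org and the file names are hypothetical.
import calendar
import os
import time
from ftplib import FTP

HOST = "ftp.example.org"
REMOTE = "fixes/criticalfix.zip"
LOCAL = "criticalfix.zip"

ftp = FTP(HOST)
ftp.login()                      # anonymous login

# Pull the file contents down, byte for byte.
with open(LOCAL, "wb") as f:
    ftp.retrbinary("RETR " + REMOTE, f.write)

# MDTM answers "213 YYYYMMDDHHMMSS" (UTC) for the remote file.
reply = ftp.sendcmd("MDTM " + REMOTE)
stamp = calendar.timegm(time.strptime(reply.split()[1], "%Y%m%d%H%M%S"))

# Without this, the local stamp is just "whenever the download finished",
# and comparing stamps between the two boxes means nothing.
os.utime(LOCAL, (stamp, stamp))
ftp.quit()

Everything turns on that os.utime() call at the end, and that round
trip is exactly what I can't find any provision for in HTTP.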
Yes, the other side of this is there too. Once you get it, and you, as
a qualified party, decide to do something with the data at the
requesting site, it may change form, get bound up in some other file or
data form. That's just fine. The issue is that under no circumstances
can the communications forum, in a quaint sort of way, be allowed to
alter the data in any way between two different institutional parties.

The confusion, I think, as to the applicability of all this arises from
not realizing what the rules really are for the transmission between
parties! I'll attempt to explain. A given institution, internal to
itself, may do what it wants to in respect to how it faces eventual
liability for errors. But different institutions will have, in every
respect, complete information identity, 100% lossless, between them.
It's been so ordained; case closed.

Say I request a specific couple of chunks of data from a different
institution that is using even my code, yet operated by different
folks. What is requested is going to be just one file of, say, hundreds
of thousands that are part of an 'Empire Central' archive. It's the
file stamp which is the key toward applicability! It may, indeed, be
shown as necessary in that it has a later date than the one locally.
That's how we know it is needed. And that extends down even into the
code locally everywhere too! But that's my choice as the design feller.
Not everyone thinks like that at all!

Forget, for the moment, check-sum authenticity and all that. At the
moment, we can't even handle any of that, in practice, with HTTP, as I
see it. And, as others have illustrated in answer to my question, with
.ZIPs the index of all the freight cars in the train is in the
Brakeman's hip pocket in the crummy! (There's a little sketch of that
just below.) Which is why the railroads all got rid of the cabooses and
park the Brakeman in the second diesel, where the ride is mighty
lonely, mighty lonely. If you know a bit about whatever happened to
Jimmie Rodgers and so on, dey's a whole lot more lonely brakemen now
riding the rails than there ever used to be!
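Since I leaned on both the file-stamp test and the Brakeman's hip
pocket there, here is a second little Python sketch tying them
together: walk a .ZIP's central directory, the index riding at the tail
end of the archive, and flag every entry whose stamp is newer than the
copy already on this box. The archive name is invented, and the
stale-or-missing test is only my illustration of the decision rule I
described:

# Sketch only: read the central directory (the Brakeman's list) out of
# a hypothetical local snapshot of the big archive, and compare each
# entry's date/time stamp against whatever this box already holds.
import os
import time
import zipfile

ARCHIVE = "empire-central.zip"   # hypothetical archive snapshot

with zipfile.ZipFile(ARCHIVE) as zf:
    for info in zf.infolist():
        # info.date_time is (year, month, day, hour, minute, second)
        remote = time.mktime(info.date_time + (0, 0, -1))
        try:
            local = os.path.getmtime(info.filename)
        except OSError:
            local = 0.0          # no local copy of this one yet
        if remote > local:
            print("stale or missing:", info.filename)

Of course, the sketch reads the index from a snapshot that is already
local, and that's the rub: since the index rides at the end of the
train, neither plain FTP nor plain HTTP gives you a civilized way to
thumb through it without hauling the whole consist home first.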
You know, Dave, if only a few requesting sites were involved, and the
value of the information could really command a price for archival and
retrieval, that would be one thing. However, what I contemplate
requires that hundreds of thousands of boxes be able to make trivial
requests, not often, but when needed, for tiny snippets. In this
information railroad, I think we're all going to be more like that than
many suspect in the near future, but then, that's just my opinion.

In reality, for mission-critical applications, traditional
client-server, as I see it, is really a bad setup. As I posted
elsewhere, each box of hundreds of thousands of boxes is an embedded
system as 'tis, even without a keyboard, incidentally! It has to stand
on its own. Even if it loses contact with Empire Central, it has to go
right on working at the level of intelligence it has in it, based on
its last connectivity. Slowly, over time, it picks up the patterns and
nuances of what it needs to do to fight the daily battle from the tiny
common snippet files that it needs for update. But if someone walks in
and empties a .45 ACP into whatever server, and it can't reach the
commander, it goes right on working until it discovers, "Hey, I can
talk to them again!" Status reports are exchanged, new orders are cut.
We are all happy, unless something went terminal during that comm loss
that there was no way to have prevented. That's what lawyers are for ..
chuckle. All this, of course, is what "on demand" really means, isn't
it?

And yes, Virginia, for most cases of Santa Claus, maybe a minute max is
good enough, in many cases, for yanking whatever little snip you need
out of a 100 terabyte tape on an IBM RS/6000 AIX box or bigger.
Grotesque? Not hardly. One IBM storage guy here, looking at all this,
noted, "Well, they have more than 1700 Linux users on a single IBM
mainframe now, Mike, and it isn't even IBM! But we are a monopoly in
the mainframe world, you know." I said, quietly, "Yes, I know." He
said, "I suppose you do want On Demand?" I said, quietly, "Of course!"

Further affiant sayeth not. From you and others far better informed
than I will ever be, I am trying to learn, trying to learn. Thanks for
whatever time you've spent and can spend on things like this!

Mike @ 1:117/3001

--- Maximus/2 3.01
 * Origin: Ziplog Public Port (1:117/3001)