Newsgroups: news.software.b
Path: utzoo!henry
From: henry@zoo.toronto.edu (Henry Spencer)
Subject: Re: Ideas for Message-ID's
Message-ID: <1991Apr17.212354.12236@zoo.toronto.edu>
Date: Wed, 17 Apr 1991 21:23:54 GMT
References: <3427@litchi.bbn.com> <3434@litchi.bbn.com> <1991Apr16.174706.4963@nic.funet.fi>
Organization: U of Toronto Zoology

In article <1991Apr16.174706.4963@nic.funet.fi> hks@funet.fi writes:
>Would it be possible to have after the time a checksum calculated over
>most of the message? The checksum calculation could include at
>least the newsgroup name and subject if not everything.  It's unlikely
>that even an automatic program sends within one second two messages
>with same subject to same newsgroup...

I'm not sure what your objective is here.  What this is essentially
doing is adding a random number to the message ID.  Using the process ID
accomplishes the same thing, with numbers that are *guaranteed unique*
among the processes running on a system at any one time, making
collisions within the same second essentially impossible.
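
A minimal sketch of the time-plus-process-ID scheme, assuming the layout
suggested by this article's own Message-ID (date, time, process ID, host
name).  The function name and its parameters are hypothetical, and this
is an illustration rather than the code any particular news system uses.

```python
import os
import socket
import time

# Spelled out by hand so the result does not depend on the locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def make_message_id(host=None, pid=None, when=None):
    """Build an ID shaped like <1991Apr17.212354.12236@zoo.toronto.edu>.

    host, pid and when default to this machine, this process and the
    current GMT time; they are parameters only to make testing easy.
    """
    host = socket.getfqdn() if host is None else host
    pid = os.getpid() if pid is None else pid
    t = time.gmtime() if when is None else when
    stamp = "%04d%s%02d.%02d%02d%02d" % (
        t.tm_year, MONTHS[t.tm_mon - 1], t.tm_mday,
        t.tm_hour, t.tm_min, t.tm_sec)
    # The process ID separates two IDs generated within the same second.
    return "<%s.%d@%s>" % (stamp, pid, host)
```

Two posting processes alive in the same second necessarily have
different process IDs, so their IDs differ even when the timestamps
match.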

>Including the newsgroup name would make it possible to munge the
>message-ids to become consistently different when gatewayed to two
>different newsgroups from mail. In theory you shouldn't tamper with
>message-ids if they are already present but in practice you might have
>to or the message might get lost.

Can you explain this in more detail?  I don't see why you ever have to
tamper with a legal message ID, and you most certainly should never have
to assign more than one to the same message.  Gatewaying to multiple
newsgroups should be done with a cross-posting, not by posting the same
article to each newsgroup in turn!
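
As an illustration of the cross-posting approach, here is a sketch of
the headers a gateway might emit: one article, one Message-ID, and every
destination group listed in a single comma-separated Newsgroups line.
The function name and the group names in the usage note are
hypothetical.

```python
def gateway_headers(message_id, newsgroups, subject):
    """Headers for ONE article cross-posted to every group listed.

    The incoming Message-ID is passed through unchanged; nothing is
    generated per group.
    """
    return [
        "Newsgroups: " + ",".join(newsgroups),
        "Subject: " + subject,
        "Message-ID: " + message_id,
    ]
```

Calling gateway_headers("<123@host.example>", ["local.one", "local.two"],
"test") yields a single "Newsgroups: local.one,local.two" line, so the
message keeps the one ID it arrived with.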

>Another advantage: this style of message id (marked with some special
>delimiter?) could be used to detect problems in message transport...

Geoff and I thought about this long and hard during C News development.
Some early versions generated a Checksum header for this purpose.  We
eventually deleted it.  The problem is that articles which go via broken
networks like Bitnet are often changed slightly in harmless ways, like
having tabs expanded to spaces or empty lines changed to contain a single
space.  So you get a lot of spurious checksum mismatches.  Given this,
we couldn't see a use for the checksums.  You can't just discard articles
with bad checksums.  Messages complaining about it will be frequent
enough that people will ignore them.  The software problems that cause
them are mostly already known, so alerting people won't do any good.
Checking the checksum on every article is costly, especially if the
algorithm is trying to be clever and ignore harmless kinds of damage.
There are perhaps rare circumstances where it would be useful to know
whether an article was damaged or not, but they didn't seem common enough
to justify hauling the checksum along in every message.
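
To show what "trying to be clever" costs, here is a rough sketch of a
checksum that ignores the two kinds of harmless damage mentioned above.
CRC-32 and 8-column tab stops are assumptions made for the example, and
the function names are hypothetical; nothing here reflects the algorithm
behind the old Checksum header.

```python
import zlib


def normalize(body):
    """Undo the harmless changes before checksumming.

    Tabs are expanded (assuming 8-column tab stops) and trailing spaces
    are dropped, so a line holding a single space counts as empty.
    """
    lines = []
    for line in body.split("\n"):
        lines.append(line.expandtabs(8).rstrip(" "))
    return "\n".join(lines)


def tolerant_checksum(body):
    """CRC-32 of the normalized body, as an unsigned 32-bit value."""
    return zlib.crc32(normalize(body).encode("utf-8")) & 0xFFFFFFFF
```

Every article pays for the extra normalization pass on top of the
checksum itself, and the normalization also hides genuine changes that
happen to touch only trailing spaces.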
-- 
And the bean-counter replied,           | Henry Spencer @ U of Toronto Zoology
"beans are more important".             |  henry@zoo.toronto.edu  utzoo!henry
