Subj : talking to myself
To   : Maurice Kinal
From : mark lewis
Date : Sun Feb 13 2005 09:10 pm

 ml>> they aren't supposed to be meaningful... they're just
 ml>> a serial number assigned to a message, for deity's sake...

 MK> Well then they can safely be ignored and stripped. Right?
 MK> Why waste the bytes and processing time for absolutely no good
 MK> reason to man or machine?

no, they are to identify that specific message within a 3 year time
period... the address portion of the MSGID is also significant ;)

 ml>> you're still looking too deeply... why go thru all that
 ml>> when you can seed a counter at 0 (zero) and increment
 ml>> until it hits 2147483647 and then roll it and start over
 ml>> at 0 (zero) again...

 MK> Yeah. I thought of that but then reconsidered. A complete
 MK> waste of time seeing it is totally meaningless anyhow,
 MK> networkingly-speaking.

nah, not really... i know of at least one individual who designed and
wrote a MSGID "server" for his stuff... granted, only his homebrew
software uses it... his server is a simple daemon that sits waiting for
a request for a new serial number... it spits one out, increments it
and waits for another request... at some point, it stores the current
number in a small datafile on the drive...

 ml>> based on 365 days in a year, three years is 1095 days...
 ml>> dividing, that gives us 1961172 messages per day...

 MK> Right.

 ml>> surely that's enough and the method easy enough???

 MK> Where is the joy in that? ;-)

hehe, ya know? i think part of the problem with the MSGID spec is that
the author put in the notation about "leaving it to the implementor to
figure out how to generate the serial number" ;) many folk took that as
a challenge and tried all kinds of ways of generating serial numbers...
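
fwiw, a rough python sketch of the roll-over counter and the "small
datafile" idea described above... it is only an illustration, not
anybody's actual code: the names SerialCounter and format_msgid, the
datafile name and the block size of 100 are all made up for the
example, and printing the serial as 8 hex digits after the address is
an assumption based on the usual MSGID layout... rather than writing
the datafile on every request, it reserves a block of numbers and
writes the end of that block first, so a crash skips some numbers
instead of handing any out twice...

```python
import os
import tempfile

MAX_SERIAL = 2147483647           # roll-over point from the message (2**31 - 1)
DAYS_IN_WINDOW = 3 * 365          # the 3 year window, 1095 days
PER_DAY = MAX_SERIAL // DAYS_IN_WINDOW  # serials available per day


class SerialCounter:
    """Hypothetical MSGID serial source: count up from 0, roll at MAX_SERIAL."""

    def __init__(self, path, block=100):
        self.path = path          # small datafile holding the reserved ceiling
        self.block = block        # assumed block size, not from any spec
        self.left = 0             # serials still usable without a disk write
        self.value = self._load() # next serial to hand out

    def _load(self):
        # Resume at the stored ceiling; start at 0 if there is no usable file.
        try:
            with open(self.path) as f:
                return int(f.read().strip()) % (MAX_SERIAL + 1)
        except (FileNotFoundError, ValueError):
            return 0

    def _reserve(self):
        # Record where the next block ends before using any of it.
        ceiling = (self.value + self.block) % (MAX_SERIAL + 1)
        with open(self.path, "w") as f:
            f.write(str(ceiling))
        self.left = self.block

    def next(self):
        if self.left == 0:
            self._reserve()
        serial = self.value
        # Roll back to 0 after handing out MAX_SERIAL.
        self.value = 0 if self.value == MAX_SERIAL else self.value + 1
        self.left -= 1
        return serial


def format_msgid(address, serial):
    """Address followed by the serial as 8 lowercase hex digits (assumed layout)."""
    return f"{address} {serial:08x}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        counter = SerialCounter(os.path.join(d, "msgid.dat"))
        print(PER_DAY)
        print(format_msgid("1:3634/12", counter.next()))
```

the trade-off is that a restart throws away whatever was left of the
reserved block... with about 1.9 million serials available per day
that is cheap compared to a disk write for every message...
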
some even went down the wrong trail and used CRC32s of something
without even thinking that there's a limited number of CRC32s /and/
that there is a very real possibility of creating a duplicate from two
very different sources ;)

 ml>> the "problem" is storing the serial counter...

 MK> Right. Again we're creating extra variables to the equation.
 MK> Personally I'd like to keep everything restricted to what
 MK> absolutely HAS to happen and extract any additional
 MK> information from variables one MUST have.

nothing has been said about when to store the memory contents of the
serial counter to hard media... that doesn't have to be done every
time, TTBOMK... sendmail and others do this very thing... have you
thought to look in their code and see what they are doing? O:)

 ml>> another "problem" is that even with non-duped MSGID serial
 ml>> numbers, it is possible for some software to see a
 ml>> false-dupe...

 MK> Right again. IDs are no assurance of catching true dupes,
 MK> unique or otherwise.

that's solely because the spec wasn't made mandatory as well as the
"rubbish" about "leaving it to the implementor" and such... if the spec
had been made mandatory and the method of generating the number had
been hammered down as well as what, exactly, is meant to be the
"address of the originating machine", then we'd not be having this
particular problem (or discussion, for that matter) O:)

 ml>> because they aren't always looking at the MSGID but only at the
 ml>> header of the message... some of them will go a tad further by
 ml>> looking at the header plus some (maybe 30 or 40) additional bytes to
 ml>> try to see if the message body is different...

 MK> How about quoting? Excessive quoting could potentially cause
 MK> a false dupe especially when quotes precede the reply.

possible... let's also not forget that some dupe detection routines
are a CRC32 on the header and possibly some of the first XX bytes of
the message body...
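
a rough python sketch of that kind of dupe check, a CRC32 over the
header plus the first XX bytes of the body... the function names, the
40 byte cutoff (taken from the "30 or 40" quoted above), the header
layout and the sample messages are all assumptions for illustration,
not any real tosser's routine... it shows two things: two replies that
open with the same quote get the same key (a false dupe), and the
usual birthday estimate, which says a 50% chance of two random CRC32s
matching arrives at roughly 77 thousand messages, nowhere near the
4 billion possible values...

```python
import math
import zlib

BODY_BYTES = 40  # assumed cutoff, from the "30 or 40" bytes mentioned above


def dupe_key(header: bytes, body: bytes, body_bytes: int = BODY_BYTES) -> int:
    """CRC32 over the header plus the first few bytes of the body."""
    return zlib.crc32(header + body[:body_bytes])


def is_dupe(seen: set, header: bytes, body: bytes) -> bool:
    """Return True if this key was seen before; remember it otherwise."""
    key = dupe_key(header, body)
    if key in seen:
        return True
    seen.add(key)
    return False


def birthday_50_percent(bits: int = 32) -> float:
    """Approximate count of random values giving a 50% chance of a collision."""
    return math.sqrt(2 * math.log(2) * 2 ** bits)


if __name__ == "__main__":
    # Hypothetical header layout: from, to, subject and time joined together.
    header = b"mark lewis|Maurice Kinal|talking to myself|13 Feb 05 21:10"
    # Leading quote longer than BODY_BYTES, so the reply text is never hashed.
    quote = b" MK> Where is the joy in that? ;-)\r MK> Right.\r"
    seen = set()
    print(is_dupe(seen, header, quote + b"first reply"))
    print(is_dupe(seen, header, quote + b"second reply"))
    print(round(birthday_50_percent()))
```

a longer cutoff, or hashing the whole body, would get rid of the
quoting case... it would not change the birthday math, since that is
set by the 32 bit size of the checksum...
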
we already know, and i mentioned it above, that CRC32s are limited and
are able to be duplicated with very different input ;)

 ml>> for some reason, i'm also thinking that the message headers that
 ml>> contain seconds are stuck with a 2 second granularity in the same
 ml>> vein that billy's file systems have been from the beginning...

 MK> Could be. Also what is the "correct" time at any given
 MK> moment, especially on a fast machine?

hehe, i guess that would depend on the definition of "it", ROTFL!!

 ml>> however, i've not the time nor inclination to go rooting thru the
 ml>> archives to confirm this "memory"...

 MK> Heh, heh. I don't blame you one bit for that.

oh, i could probably go right to it... it's probably right in the
project directory with the code for my FTN message tool that takes raw
ASCII text files and generates messages from them... one of the beta
cycles actually took 3 years... one reason was impending burnout...
another was a very strong desire to see realworld usage of the
implemented MSGID routines that i posted to you the other day... that
code hasn't been touched in many years and is still in use every day,
on my system... granted, though, that tool only posts up to 50 messages
a day... in testing, though, it has posted a 2Meg message (JAM format,
largest fidonet nodelist) as well as posting many smaller messages per
second... IIRC, on one test machine, with "empty" message bodies, it
approached some 200 messages a second... on another, much faster
machine, the speed for the same test clocked upwards of 500 or so
messages per second... granted, they were all empty bodies but the
header stuff and such all took time to create and stuff into the
message base format...

i guess i should also mention that originally, the tool was a fire once
affair where you had to do a
"for %f in (*.txt) do postit some.parameters %f" to post batches of
messages...
a quick addition was to implement a batch affair where you'd list the
parameters in a @bulk file and specify that on the command line
(ie: postit @elist1) and it'd draw all the parameters from there... it
still has to load and process the message body text file from the disk,
though...

should i mention that the above tool is written in pascal? i have no
idea what it'd take to "port" it to perl but i am confident that it'd
be quite a bit slower on the same boxes ;)

)\/(ark

 * Origin: (1:3634/12)