Subj : ET phone home
To : Nick Andre
From : mark lewis
Date : Sun Jan 08 2017 11:44 am

On 2017 Jan 08 10:30:12, you wrote to me:

NA> On 08 Jan 17 09:16:24, Mark Lewis said the following to Nick Andre:

ML>> exactly sure on the details but i know that they greatly influenced me
ML>> to have my software place the MSGID as close to the beginning of the
ML>> control lines as possible so that db would not detect messages posted
ML>> within one second as dupes... this was especially important when
ML>> testing at 100+ posts per second... are you willing to share
ML>> information about how db does its dupe detection so others can
ML>> understand more? please?

NA> It's not that hard to understand. A CRC is computed from the header and date
NA> of the tossed message. I would have to dig into the code and I'm not sure
NA> how many bytes are being included from the start of the message.

yeah, these little details are what is/was being sought... if the header
and X bytes are being read into a buffer and then the CRC calculated on
that entire raw buffer or if each field is read individually and then fed
to the CRC calculator... knowing this may help others who are trying to
work out how to do dupe checking that doesn't rely on MSGID alone... that's
because messages without MSGID can't be checked that way so an alternative
or three is desired/needed...

NA> Each Echomail area has a cache database file. In the case of *.MSG,
NA> this is called DBRIDGE.DUP and resides in each area and for
NA> Hudson/QBBS there is one database segmented slightly different. The
NA> CRC's are kept in there. I believe the code sets the cache database
NA> size at 1,024 entries.

i remember the different dup cache files... i didn't know they were
limited to a paltry 1024 entries, though... i never dug that deep ;)

NA> Interestingly it appears that there is a "reputation" method for the
NA> cache database.
NA> It appears as it is loaded into RAM during a toss, any
NA> time a CRC match is encountered, that CRC is pushed up the cache
NA> table, while CRC's of legitimate messages end up being pushed down.
NA> The CRC table is saved into that cache file every time the Echomail
NA> area changes in the toss cycle; or there are no more packets to toss.

that's pretty interesting... i guess that's so that messages with more
dupes can be detected faster with their CRCs at the top of the queue...
interesting idea and i'm sure one that was important back in the day of
slower machines :)

)\/(ark

Always Mount a Scratch Monkey
Do you manage your own servers? If you are not running an IDS/IPS yer doin' it wrong...
.... Actually, if they leak, you've pumped them too many times.
---
 * Origin: (1:3634/12.73)
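
For anyone trying to picture the scheme described in this message, here is a rough Python sketch of a CRC-based dupe table with a fixed size and promotion on match. It is not the D'Bridge code. The field choice, the field order, the use of CRC-32 via zlib, the move-to-front promotion, and the names message_crc and DupeCache are all assumptions for illustration; the thread only says that a CRC is computed from the header and date, that the table holds about 1,024 entries, and that matched CRCs are pushed up.

```python
import zlib

CACHE_SIZE = 1024  # entry count Nick recalls from memory; not verified


def message_crc(from_name, to_name, subject, date, body=b"", body_bytes=0):
    """CRC-32 over header fields, the date, and optionally leading body bytes.

    Hypothetical: the real field list, order, and byte count are unknown.
    """
    crc = 0
    for field in (from_name, to_name, subject, date):
        # Feed each field separately, with a NUL separator between fields.
        crc = zlib.crc32(field.encode("utf-8") + b"\0", crc)
    # body_bytes=0 means the body is ignored entirely.
    return zlib.crc32(body[:body_bytes], crc)


class DupeCache:
    """Fixed-size CRC table; index 0 is the top of the table."""

    def __init__(self, size=CACHE_SIZE):
        self.size = size
        self.crcs = []

    def seen(self, crc):
        """Return True if crc is already in the table, updating it either way."""
        if crc in self.crcs:
            # Dupe: promote the matching CRC to the top (assumed move-to-front).
            self.crcs.remove(crc)
            self.crcs.insert(0, crc)
            return True
        # New message: add at the top; older entries drift down and the
        # bottom one falls off once the table is full.
        self.crcs.insert(0, crc)
        del self.crcs[self.size:]
        return False
```

With body_bytes left at 0, two messages with identical header fields and the same one-second timestamp produce the same CRC, which is the false-dupe case mark describes. Hashing some leading bytes that include the MSGID would separate them, assuming the MSGID falls inside the bytes read.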