[HN Gopher] The weirdest bug I've ever encountered
       ___________________________________________________________________
        
       The weirdest bug I've ever encountered
        
       Author : tjalfi
       Score  : 47 points
       Date   : 2021-11-13 03:53 UTC (19 hours ago)
        
 (HTM) web link (mental-reverb.com)
 (TXT) w3m dump (mental-reverb.com)
        
       | dmingod666 wrote:
       | Weirdest bug I saw was OTRS getting slow at the end of the month.
       | It was weird, no pressure on the DB or the server, it would get
       | slow with barely any users and then suddenly it would work really
       | fast with no change.
       | 
       | After a wild goose chase I noticed that one of the servers took
       | longer to authenticate over SSH than the other when the problem
       | happened. It turned out the only difference between the two was
       | that one of them had a reference to a domain and the other did
       | not.
       | 
       | Following this clue, I found an internal AD server that was
       | being called by its domain name on every OTRS page load. When
       | the DNS server got slow, every page got slow; switching to a
       | direct IP fixed it. DNS getting slow on an internal system is
       | way down on most people's lists, if it is on them at all.
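       | 
       | A minimal sketch of how one could measure that per-page lookup
       | cost, assuming Python. The helper names, the host name and the
       | 50 ms delay are hypothetical and not taken from OTRS; the slow
       | resolver only stands in for a sluggish DNS server.

```python
import ipaddress
import socket
import time


def is_ip_literal(host):
    """True if host is already an IPv4/IPv6 address and needs no DNS lookup."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def timed_lookup(host, resolver=socket.gethostbyname):
    """Resolve host once and return (address, elapsed_seconds)."""
    start = time.perf_counter()
    address = resolver(host)
    return address, time.perf_counter() - start


def slow_resolver(host):
    """Stand-in for a sluggish DNS server (assumed 50 ms delay, fixed answer)."""
    time.sleep(0.05)
    return "10.0.0.5"


if __name__ == "__main__":
    # Hypothetical host name; a real check would use the default resolver.
    address, elapsed = timed_lookup("ad.example.internal", resolver=slow_resolver)
    print(f"{address} resolved in {elapsed * 1000:.0f} ms")
```

       | If every page load pays that delay, the slowdown shows up with
       | no load on the DB or the web server, which matches the symptom.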
        
       | blakesterz wrote:
       | Every time I read a story like this I'm reminded of "The case of
       | the 500-mile email"
       | 
       | https://www.ibiblio.org/harris/500milemail.html
       | 
       | "...But then I tried to send an email to Memphis (600 miles). It
       | failed. Boston, failed. Detroit, failed. I got out my address
       | book and started trying to narrow this down. New York (420 miles)
       | worked, but Providence (580 miles) failed...."
        
       | btbuilder wrote:
       | This reminded me of a story about the launch of WebTV by
       | Microsoft back in the 90s.
       | 
       | My memory is blurry but essentially the bug was in the signup or
       | login code. When displaying an error to the user, the memory
       | address to read the message from was pointing to an array of
       | banned words. This led to the user seeing no details of the
       | error but instead just the word "f*k".
       | 
       | Unfortunately the only reference I can find to it online is
       | someone also trying to find it and failing:
       | https://dotclue.org/archive/2006/12/002664/
        
         | lanewinfield wrote:
         | Well, there looks to be a comment on that very link you sent
         | with the story:
         | 
         | https://fadden.com/tech/webtv-anecdotes.html
         | 
         | (and the dialog box:
         | https://fadden.com/tech/images/fkdialog.jpg)
         | 
         | I really laughed at it. Thanks for sharing.
        
       | fargle wrote:
       | Way before QNX went shared/open source, they were as they are
       | now. But they had just released Neutrino, a major update which
       | basically took their QNX4 microkernel and mated it with a
       | POSIX/GNU/UNIX-like userland. And it was, for a while, free as
       | in beer for R&D-type use.
       | 
       | So that all sounded great and we tried it; this was around 20
       | years ago. It was a disaster.
       | 
       | The same kind of shoddy work, with race-condition-type bugs all
       | over the place.
       | 
       | - malloc broke under heavy multi-threaded load. Replaced with
       | dlmalloc.
       | 
       | - serial driver (16550) broke under heavy load. Wrote our own
       | replacement.
       | 
       | - Ethernet driver broke under heavy load. Had to replace Intel
       | cards with 3Com (or vice versa) to use a different driver.
       | 
       | - had to write/rewrite several other drivers. Interestingly, the
       | only thing I liked about QNX is that user-mode drivers are very
       | nice and easy to write.
       | 
       | - compiler, QNX bespoke port of GCC, crashed and broke on large
       | codebases (could be because of malloc). Had to re-port GCC (which
       | was easy) and build our own toolchains.
       | 
       | When we got done trying it, our adamant report was GARBAGE NEVER
       | USE. We got overridden by management because we had succeeded in
       | making it work. Sigh. We started using it for products, so we
       | bought a commercial, non-free license AND support. We reported
       | all of the above bugs and 20 more, with reproducible test cases.
       | QNX "promised to fix them in the next release". This was 20
       | years ago. Finally, after many years, we completed excising QNX
       | from our products.
       | 
       | We never had a problem with ps, but it was probably broken. This
       | company was producing garbage in the late nineties and didn't
       | care. They produce garbage now. They didn't have enough
       | developers, or experienced enough ones, to be trying to pull
       | together a whole OS, so it was about what you'd expect from a
       | university or hobby project.
       | 
       | The only place this author got it wrong is that on QNX, if you
       | assume the OS or compiler is at fault, you are probably _right_.
        
       ___________________________________________________________________
       (page generated 2021-11-13 23:01 UTC)