[HN Gopher] The weirdest bug I've ever encountered
___________________________________________________________________
The weirdest bug I've ever encountered
Author : tjalfi
Score : 47 points
Date : 2021-11-13 03:53 UTC (19 hours ago)
(HTM) web link (mental-reverb.com)
(TXT) w3m dump (mental-reverb.com)
| dmingod666 wrote:
| The weirdest bug I saw was OTRS getting slow at the end of the
| month. There was no pressure on the DB or the server; it would get
| slow with barely any users, and then suddenly it would work really
| fast again with no change.
|
| After a wild goose chase I saw that one of the servers took longer
| to authenticate over SSH than the other when the problem happened.
| It turns out the only difference between the two was that one of
| them had a reference to a domain and the other did not.
|
| Following this clue, I found an internal AD server that was being
| called by its domain name on every OTRS page load. When the DNS
| server got slow, each page got slow; switching to a direct IP
| fixed it. DNS getting slow on an internal system is way down on
| most people's lists, if it's on them at all.
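|
| A minimal sketch of how one might check for this kind of slowdown,
| assuming Python is available on the affected host: time a lookup
| by name against the same lookup by IP literal, which skips DNS.
| The function names, the port, and the attempt count are
| illustrative assumptions, not part of the original setup.

```python
import socket
import time


def time_lookup(host, port=389, attempts=3):
    """Return the slowest observed getaddrinfo() time, in seconds.

    The port (389, LDAP) and attempt count are illustrative defaults.
    """
    worst = 0.0
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
        except socket.gaierror:
            # A failed lookup still costs wall-clock time, so count it.
            pass
        worst = max(worst, time.perf_counter() - start)
    return worst


def compare(name, ip):
    """Time resolution by name against the same lookup by IP literal."""
    return {"by_name": time_lookup(name), "by_ip": time_lookup(ip)}
```

| Calling compare() with the server's hostname and its IP address
| (both placeholders to fill in) while the pages are slow should
| show a large "by_name" figure and a near-zero "by_ip" one if DNS
| is the culprit.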
| blakesterz wrote:
| Every time I read a story like this I'm reminded of "The case of
| the 500-mile email"
|
| https://www.ibiblio.org/harris/500milemail.html
|
| "...But then I tried to send an email to Memphis (600 miles). It
| failed. Boston, failed. Detroit, failed. I got out my address
| book and started trying to narrow this down. New York (420 miles)
| worked, but Providence (580 miles) failed...."
| btbuilder wrote:
| This reminded me of a story about the launch of WebTV by
| Microsoft back in the 90s.
|
| My memory is blurry but essentially the bug was in the signup or
| login code. When displaying an error to the user, the memory
| address to read the message from was pointing to an array of
| banned words. This led to the user seeing no details of the
| error but instead just the word "f*k".
|
| Unfortunately the only reference I can find to it online is
| someone also trying to find it and failing:
| https://dotclue.org/archive/2006/12/002664/
| [deleted]
| lanewinfield wrote:
| Well, there looks to be a comment on that very link you sent
| with the story:
|
| https://fadden.com/tech/webtv-anecdotes.html
|
| (and the dialog box:
| https://fadden.com/tech/images/fkdialog.jpg)
|
| I really laughed at it. Thanks for sharing.
| fargle wrote:
| Way before QNX went shared/open source, they were as they are
| now. But they had just released Neutrino, a major update which
| basically took their QNX4 microkernel and mated it with a
| POSIX/GNU/UNIX-like userland. And for a while it was free as in
| beer for R&D-type use.
|
| So that all sounded great and we tried it; this was around 20
| years ago. It was a disaster.
|
| The same kind of shoddy work: race-condition-type bugs all over
| the place.
|
| - malloc broken under heavy multi-threaded load. Replaced with
| dl-malloc.
|
| - serial driver (16550) broke under heavy load. Wrote our own
| replacement.
|
| - ethernet driver broke under heavy load. Had to replace Intel
| cards with 3Com (or vice versa) to use a different driver.
|
| - had to write/rewrite several other drivers. Interestingly the
| only thing I liked about QNX is that user mode drivers are very
| nice and easy to write.
|
| - compiler, QNX bespoke port of GCC, crashed and broke on large
| codebases (could be because of malloc). Had to re-port GCC (which
| was easy) and build our own toolchains.
|
| When we got done trying it our adamant report was GARBAGE NEVER
| USE. Got overridden by management, because we succeeded in making
| it work. Sigh. Started using it for products, so we bought a
| commercial, non-free license AND support. Reported all of the
| above bugs and 20 more, with reproducible test cases. QNX
| "promised to fix them in the next release". This was 20 years
| ago. Finally after many years we completed excising QNX from our
| products.
|
| We never had a problem with ps, but it was probably broken. This
| company was producing garbage in the late nineties and didn't
| care. They produce garbage now. They didn't have enough
| developers, or experienced enough ones, to pull together a whole
| OS, so it was about what you'd expect from a university or hobby
| project.
|
| The only place this author got it wrong is that on QNX, if you
| assume the OS or compiler is at fault, you are probably _right_.
___________________________________________________________________
(page generated 2021-11-13 23:01 UTC)