* * * * *

“Every time the shipper takes away a pallet from the shipping room, the server times out within two seconds.”

> Last year at my job we had a pretty severe problem just as unexplainable.
>
> The day after an unscheduled closing (hurricane), I started getting calls
> from users complaining about database connection timeouts. Since I had a
> very simple network with less than 32 nodes and barely any bandwidth in
> use, it was quite scary that I could ping to the database server for 15-20
> minutes and then get "request timed out" for about 2 minutes. I had
> performance monitors etc. running on the server and was pinging the server
> from multiple sources. Pretty much every machine except the server was able
> to talk to the others constantly. I tried to isolate a faulty switch or a
> bad connection but there was no way to explain the random yet periodic
> failures.
>
> I asked my coworker to observe the lights on a switch in the warehouse
> while I ran trace routes and unplugged different devices. After 45-50
> minutes on the walkie-talkie with him saying "ya it's down, ok it's back
> up," I asked if he noticed any patterns. He said, "Yeah… I did. But you're
> going to think I'm nuts. Every time the shipper takes away a pallet from
> the shipping room, the server times out within 2 seconds." I said "WHAT???"
> He said "Yeah. And the server comes back up once he starts processing the
> next order."
>

Via Hacker News [1], “chime comments on The case of the 500-mile email [2]”

This is every bit as amusing as the 500-mile email (The story about a server that refused to send email more than 500 miles away) [3] and shows that bugs can be very hard to debug, especially when they aren't caused by bug-ridden code. I'm fortunate in that I've never had to debug such issues.

[1] https://news.ycombinator.com/item?id=13347058
[2] https://www.reddit.com/r/reddit.com/comments/vunp/the_case_of_the_500mi
[3] http://www.ibiblio.org/harris/500milemail.html?

Email Sean Conner at sean@conman.org.