* * * * *

A few notes about yesterday's crashes

I wrote crashreport() in an attempt to find out why glibc was reporting a double free (or memory corruption) [1], so imagine my surprise when I found other crashes happening [2]. I did find the root causes of yesterday's crashes, but I have yet to figure out why the memory corruption happened.

First off, no points to Apache [3] for failing to report the unexpected termination of a child process. I can certainly understand that the Apache developers don't expect anyone to use CGI (Common Gateway Interface) anymore, and that those who do will use a CGI program written in a scripting language that probably won't core dump. But they still make the CGI module [4], and the program that module executes can be written in anything, so hiding the fact that a program crashed due to SIGSEGV or SIGABRT is, to me, inexcusable.

Had Apache logged the crash, I probably would have found the error a few years ago (seriously). The actual crash only happened after the output was generated and sent to the browser, so I never saw anything unusual. And because Apache never said anything about a crash, well … everything is okay, right?

Second, the crash was in a seldom used code path: specifically, the one taken when the addentry.html page was requested. I normally use email to create entries, not the web interface. It's not that I never use the web interface, but I can safely count on two hands the number of times I've used it over the past thirteen years. So to say it doesn't get a lot of use is an understatement.

Now, are there features I don't use? Yes. And such code is currently commented out. That code was written at a time when I expected other people might use the codebase, but alas, only one other person ever used mod_blog (only to stop blogging due to personal reasons) and now, as far as I know, I'm the only one who uses this codebase.
That doesn't bother me, but it does indicate that I should probably remove the code that I don't use. But the web interface? I use it just enough to justify its existence in the codebase.

Third, the addition of the command line and environment variables to the output of crashreport() (and I solved the global variable issues I had) certainly helped with the diagnosis. It revealed a request that would reliably crash the program (the aforementioned addentry.html page), and with a reliable way to crash the program, it's easy (if a bit tedious) to isolate the buggy code.

And to tell the truth, the bug has existed since May 26th, 2009 [5], when I made the following commit:

> Basically, I rewrote the core blogging engine over the past twelve hours. I
> still have yet to support adding new entries via the engine, but until I
> get that fixed, I can add them manually.

only I didn't quite update all the code properly. And since the code path in question isn't executed except when called as a CGI program (I should note that mod_blog can be run from the command line as well), and Apache never logs CGI programs that crash, it's no wonder I never saw this bug.

[1] gopher://gopher.conman.org/0Phlog:2013/01/11.1
[2] gopher://gopher.conman.org/0Phlog:2013/01/11.2
[3] http://httpd.apache.org/
[4] http://httpd.apache.org/docs/2.4/mod/mod_cgi.html
[5] gopher://gopher.conman.org/1Phlog:2009/05/26

Email Sean Conner at sean@conman.org.