* * * * *

I'm terribly upset that GCC didn't start NetHack

Ah yes, undefined behavior of C [1]. It's easy to see in retrospect, but it's still a bit surprising. The code:

> #include <limits.h>
>
> int foo(int a,int b)
> {
>   return a / b;
> }
>
> int main(void)
> {
>   return foo(INT_MIN,-1);
> }
>

And when you compile with the right options …

> [spc]lucy:/tmp>gcc -std=c99 -O0 crash.c
> [spc]lucy:/tmp>./a.out
> Floating point exception (core dumped)
> [spc]lucy:/tmp>
>

What's actually going on here? Well, I compiled the code with crashreport() [2], so I could capture the crash:

> CRASH(9372/000): pid=9372 signal='Floating point exception'
> CRASH(9372/001): reason='Integer divide-by-zero'
> CRASH(9372/002): pc=0x804883d
> CRASH(9372/003): CS=0073 DS=007B ES=007B FS=0000 GS=0033
> CRASH(9372/004): EIP=0804883D EFL=00010296 ESP=BFFC370C EBP=BFFC3710 ESI=BFFC37C4 EDI=BFFC3750
> CRASH(9372/005): EAX=80000000 EBX=00CBBFF4 ECX=BFFC371C EDX=FFFFFFFF
> CRASH(9372/006): UESP=BFFC370C TRAPNO=00000000 ERR=00000000
> CRASH(9372/007): STACK DUMP
> CRASH(9372/008): BFFC370C: 1C 37 FC BF
> CRASH(9372/009): BFFC3710: 38 37 FC BF 7C 88 04 08 00 00 00 80 FF FF FF FF
> CRASH(9372/010): BFFC3720: F4 BF CB 00 F4 BF CB 00 F8 A9 04 08 F4 BF CB 00
> CRASH(9372/011): BFFC3730: 00 00 00 00 A0 CC B8 00 98 37 FC BF 93 4E BA 00
> CRASH(9372/012): BFFC3740: 01 00 00 00 C4 37 FC BF CC 37 FC BF 26 22 B8 00
> CRASH(9372/013): BFFC3750: F4 BF CB 00 00 00 00 00 50 37 FC BF 98 37 FC BF
> CRASH(9372/014): BFFC3760: 40 37 FC BF 55 4E BA 00 00 00 00 00 00 00 00 00
> CRASH(9372/015): BFFC3770: 00 00 00 00 D4 CF B8 00 01 00 00 00 80 87 04 08
> CRASH(9372/016): BFFC3780: 00 00 00 00 60 21 B8 00 B0 2C B8 00 D4 CF B8 00
> CRASH(9372/017): BFFC3790: 01 00 00 00 80 87 04 08 00 00 00 00 A1 87 04 08
> CRASH(9372/018): BFFC37A0: 47 88 04 08 01 00 00 00 C4 37 FC BF CC 92 04 08
> CRASH(9372/019): BFFC37B0: 20 93 04 08 B0 2C B8 00 BC 37 FC BF 92 9A B8 00
> CRASH(9372/020): BFFC37C0: 01 00 00 00 BB A9 FF BF 00 00 00 00 C3 A9 FF BF
> CRASH(9372/021): BFFC37D0: E4 A9 FF BF F4 A9 FF BF FF A9 FF BF 0D AA FF BF
> CRASH(9372/022): BFFC37E0: 34 AA FF BF 51 AA FF BF 79 AA FF BF 95 AA FF BF
> CRASH(9372/023): BFFC37F0: A7 AA FF BF B8 AA FF BF CE AA FF BF EC AA FF BF
> CRASH(9372/024): BFFC3800: F5 AA FF BF 04 AB FF BF C7 AC FF BF
> CRASH(9372/025): STACK TRACE
> CRASH(9372/026): ./a.out[0x804889c]
> CRASH(9372/027): ./a.out[0x8049078]
> CRASH(9372/028): /lib/tls/libc.so.6[0xbb79b0]
> CRASH(9372/029): ./a.out[0x804887c]
> CRASH(9372/030): /lib/tls/libc.so.6(__libc_start_main+0xd3)[0xba4e93]
> CRASH(9372/031): ./a.out[0x80487a1]
> CRASH(9372/032): DONE
>

And from there, we can load up the program and do some disassembly:

> [spc]lucy:/tmp>gdb a.out
> GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-redhat-linux-gnu"…Using host libthread_db
> library "/lib/tls/libthread_db.so.1".
>
> (gdb) disassemble 0x804883d
> Dump of assembler code for function foo:
> 0x08048828 <foo+0>:  push %ebp
> 0x08048829 <foo+1>:  mov %esp,%ebp
> 0x0804882b <foo+3>:  sub $0x4,%esp
> 0x0804882e <foo+6>:  mov 0x8(%ebp),%edx
> 0x08048831 <foo+9>:  lea 0xc(%ebp),%eax
> 0x08048834 <foo+12>: mov %eax,0xfffffffc(%ebp)
> 0x08048837 <foo+15>: mov %edx,%eax
> 0x08048839 <foo+17>: mov 0xfffffffc(%ebp),%ecx
> 0x0804883c <foo+20>: cltd
> 0x0804883d <foo+21>: idivl (%ecx)
> 0x0804883f <foo+23>: mov %eax,0xfffffffc(%ebp)
> 0x08048842 <foo+26>: mov 0xfffffffc(%ebp),%eax
> 0x08048845 <foo+29>: leave
> 0x08048846 <foo+30>: ret
> End of assembler dump.
> (gdb)
>

It faulted on the IDIV instruction, but it wasn't technically an “integer division-by-zero.” The Intel 80386 (and the Pentium™ in my computer is little more than a glorified Intel 80386) book I have describes IDIV as:

> An 80386 interrupt zero (0) [which is reported as an “Integer
> division-by-zero”] is taken if a zero divisor or a quotient too large for
> the destination register is generated. [emphasis added]
>

Now, EAX is -2,147,483,648 (80000000 in hexadecimal notation [3]), which can be represented in 32 bits (we're running 32-bit code here; the issue still happens with 64-bit integers, but the value will be vastly larger). But -2,147,483,648 divided by -1 should be 2,147,483,648, and 2,147,483,648 cannot be represented in 32 bits [Technically, the value can be represented in 32 bits, but the instruction in question, IDIV, is a signed instruction, and because of the way Intel does signed integer math (two's complement), 2,147,483,648 cannot be represented as a signed 32-bit quantity. -- Editor]. Because the quotient is then considered “too large,” we get the fault, which ends the program.

This is fine as far as C goes, because C says such behavior is “undefined” and thus, anything goes.

Simple once you know what's going on.

(And for the 99% of my readership who don't get the NetHack reference [4] in the title … )

[1] http://blog.regehr.org/archives/887
[2] gopher://gopher.conman.org/0Phlog:2013/01/11.1
[3] http://en.wikipedia.org/wiki/Hexadecimal
[4] http://feross.org/gcc-ownage/

Email Sean Conner at sean@conman.org.