[HN Gopher] Workaround Clang v15 AArch64 miscompile that affects...
       ___________________________________________________________________
        
       Workaround Clang v15 AArch64 miscompile that affects parallel
       collection
        
       Author : gus_massa
       Score  : 46 points
       Date   : 2024-10-17 13:39 UTC (4 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | gus_massa wrote:
       | It's a very weird bug and a very weird debugging story. The
       | description of the change is short, but don't miss the first
       | comment that explains all the details.
        
       | kenada wrote:
       | The bug probably wasn't reproducing on Linux because the clang
       | shipped with Xcode is from Apple's fork. The version may say
       | "15", but it was closer to upstream LLVM's clang 16. There are
       | also changes that haven't been upstreamed yet and probably some
       | that may not ever be (if my experience updating libtapi in
       | nixpkgs was anything to go by).
        
       | rurban wrote:
       | I don't get how the C macro with the endless C loop can be called
       | from within Scheme just as is, without any registration. Is this
       | a Chez specialty, or am I missing something?
        
         | nneonneo wrote:
         | If I had to guess, they're probably just expanding the Scheme
         | code into C code during compilation. Also: it isn't an endless
         | C loop; do{/*statements*/}while(0) is a common macro idiom
         | for a block of code _that requires a trailing semicolon_ , i.e.
         | one that behaves syntactically like a normal (void) function
         | call. It only runs once because the condition is false.
        
       | benmmurphy wrote:
       | i tried reproducing the zipped example on macOS 14.4.1 (23E224)
       | with Apple clang version 15.0.0 (clang-1500.3.9.4) and it looks
       | like the issue is not present on that version.
       | 
       | here are the instructions marked with '<<<<' from `disassem.txt`
       | and my version (first line of each pair is from `disassem.txt`,
       | second line is mine):
       | 
       |   0x1000026a8 <+3300>: adrp   x14, 6152a   ; <<<<<< address used for `in_parallel_sweepers`?
       |   1000026a8: 9000c04e     adrp    x14, 0x10180a000 <_tgc+0x6c8>
       | 
       |   0x100002bb8 <+4596>: ldr    w22, [x14]   ; <<<<<< load for `in_par` value, offset missing?
       |   100002bb8: b94961d6     ldr     w22, [x14, #2400]
       | 
       |   0x100002c60 <+4764>: cbnz   w22, 0x100002bec   ; <+4648> at gc.c:264:34  <<<<<<< branch on `in_par` value?
       |   100002c60: 35fffc76     cbnz    w22, 0x100002bec <_sweep_thread+0x1228>
       | 
       |   0x100002c64 <+4768>: ldr    w11, [x14, #0x960]   ; <<<<<<< addition to `in_parallel_sweepers`, has offset
       |   100002c64: b94961cb     ldr     w11, [x14, #2400]
       | 
       | on my version the first load also uses the offset (#2400 =
       | #0x960), so i guess it must have been fixed somewhere between
       | `clang-1500.0.40.1` and `clang-1500.3.9.4`. but that is an insane
       | bug. also, `sweep_thread` seems to have exactly the same number
       | of instructions under both versions of clang, and i'm guessing
       | the only difference is this offset, which is kind of wild.
        
       ___________________________________________________________________
       (page generated 2024-10-21 23:00 UTC)