[HN Gopher] Workaround Clang v15 AArch64 miscompile that affects...
___________________________________________________________________
Workaround Clang v15 AArch64 miscompile that affects parallel
collection
Author : gus_massa
Score : 46 points
Date : 2024-10-17 13:39 UTC (4 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| gus_massa wrote:
| It's a very weird bug and a very weird debugging story. The
| description of the change is short, but don't miss the first
| comment that explains all the details.
| kenada wrote:
| The bug probably wasn't reproducing on Linux because the clang
| shipped with Xcode is from Apple's fork. The version may say
| "15", but it was closer to upstream LLVM's clang 16. There are
| also changes that haven't been upstreamed yet and probably some
| that may not ever be (if my experience updating libtapi in
| nixpkgs was anything to go by).
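A minimal sketch of how C code can tell the two toolchains apart at compile time: Apple's clang predefines `__apple_build_version__`, which upstream clang does not. The function name `compiler_flavor` is made up for illustration, and nothing here is taken from the Chez Scheme change.

```c
/* Hypothetical helper: reports which compiler family built this file. */
const char *compiler_flavor(void)
{
#if defined(__apple_build_version__)
    /* Apple's fork: __clang_major__ follows Apple's numbering, not upstream LLVM's. */
    return "Apple clang";
#elif defined(__clang__)
    return "upstream clang";
#else
    return "not clang";
#endif
}
```

A workaround keyed only on `__clang_major__ == 15` would therefore cover two different code bases, which fits the observation that the bug did not reproduce on Linux.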
| rurban wrote:
| I don't get how the C macro with the endless C loop can be called
| from within Scheme, just as is, without any registration. Is this
| a Chez speciality, or am I missing something?
| nneonneo wrote:
| If I had to guess, they're probably just expanding the Scheme
| code into C code during compilation. Also: it isn't an endless
| C loop; do{/* statements */}while(0) is a common macro idiom
| for a block of code _that requires a trailing semicolon_, i.e.
| one that behaves syntactically like a normal (void) function
| call. It only runs once because the condition is false.
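The idiom described above can be sketched as follows. `SWAP_INT` and `order_pair` are hypothetical names chosen for illustration, not code from Chez Scheme.

```c
#include <assert.h>

/* Hypothetical macro: the do/while(0) wrapper turns the block into a
   single statement that needs a trailing semicolon and runs exactly once. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Puts two ints in ascending order; the macro is used in an unbraced if/else. */
void order_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);   /* the semicolon ends the do/while statement */
    else
        assert(*lo <= *hi);   /* already ordered, nothing to do */
}
```

Had `SWAP_INT` been defined as a bare `{ ... }` block, the semicolon after the call would end the `if` statement and the `else` would be a syntax error.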
| benmmurphy wrote:
| i tried reproducing the zipped example on macOS 14.4.1 (23E224)
| and Apple clang version 15.0.0 (clang-1500.3.9.4) and it looks
| like the issue is not present on that version.
|
| here are the instructions marked with '<<<<' from `disassem.txt`
| and my version:
|
|     0x1000026a8 <+3300>: adrp x14, 6152a ; <<<<<< address used for `in_parallel_sweepers`?
|     1000026a8: 9000c04e adrp x14, 0x10180a000 <_tgc+0x6c8>
|
|     0x100002bb8 <+4596>: ldr w22, [x14] ; <<<<<< load for `in_par` value, offset missing?
|     100002bb8: b94961d6 ldr w22, [x14, #2400]
|
|     0x100002c60 <+4764>: cbnz w22, 0x100002bec ; <+4648> at gc.c:264:34 <<<<<<< branch on `in_par` value?
|     100002c60: 35fffc76 cbnz w22, 0x100002bec <_sweep_thread+0x1228>
|
|     0x100002c64 <+4768>: ldr w11, [x14, #0x960] ; <<<<<<< addition ot `in_parallel_sweepers`, has offset
|     100002c64: b94961cb ldr w11, [x14, #2400]
|
| on my version the first load also uses the offset (#2400 =
| #0x960), so i guess it must have been fixed somewhere between
| `clang-1500.0.40.1` and `clang-1500.3.9.4`. but that is an insane
| bug. also, `sweep_thread` seems to have the exact same number of
| instructions under both versions of clang, and i'm guessing the
| only difference is this offset, which is kind of wild.
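The offset can be checked directly from the instruction words quoted in the listing. This is a rough sketch that assumes the A64 "LDR (immediate, unsigned offset)" layout: access size in bits 31:30, a scaled 12-bit immediate in bits 21:10, and the base register in bits 9:5. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset encoded in an integer load of the
   unsigned-offset form (LDRB/LDRH/LDR). */
uint32_t ldr_uimm_byte_offset(uint32_t insn)
{
    /* Bits 29:22 must be 11100101 for this instruction form. */
    assert((insn & 0x3fc00000u) == 0x39400000u);
    uint32_t size  = insn >> 30;             /* log2 of access size; 2 for a W register */
    uint32_t imm12 = (insn >> 10) & 0xfffu;  /* immediate, in units of the access size */
    return imm12 << size;
}

/* Hypothetical helper: number of the base register (Rn field). */
uint32_t ldr_base_reg(uint32_t insn)
{
    return (insn >> 5) & 0x1fu;
}
```

Both words from the clang-1500.3.9.4 listing, `b94961d6` and `b94961cb`, decode to a byte offset of 2400 (0x960) from x14. A load without an offset, like the `ldr w22, [x14]` in `disassem.txt`, would carry zero in the imm12 field and so read from the start of the page that `adrp` computed.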
___________________________________________________________________
(page generated 2024-10-21 23:00 UTC)