Post ATmt9XNL32Tm3XF0YC by stefanha@fosstodon.org
 (DIR) Post #ATmoA0j5QHv3JaotZQ by penguin42@mastodon.org.uk
       2023-03-19T21:41:51Z
       
       0 likes, 0 repeats
       
       I really hate big complex functions that return -EINVAL or similar; wtf knows which of the zillion things it doesn't like. (Looking at you, clEnqueueReadBuffer.)
       
 (DIR) Post #ATmt9XNL32Tm3XF0YC by stefanha@fosstodon.org
       2023-03-19T22:37:47Z
       
       0 likes, 0 repeats
       
       @penguin42 The kernel function graph tracer is good for those. In userspace, rr or perf's Intel Processor Trace support can probably be used to figure out where things went wrong.
       
 (DIR) Post #ATmvSLsKEI9pnbpm5o by penguin42@mastodon.org.uk
       2023-03-19T23:03:35Z
       
       0 likes, 0 repeats
       
       @stefanha Hmm, I'm on AMD - I can see it has 'Smart trace buffer' which sounds similar, but I don't see any perf hook for it.  I think my current plan is to try bpf probes.
       
 (DIR) Post #ATn3zvlAdDp4Hb0gRU by penguin42@mastodon.org.uk
       2023-03-20T00:39:18Z
       
       0 likes, 0 repeats
       
       @stefanha OK, good old gdb did the trick: single-stepping through the OpenCL implementation until it was obvious that one 'size' was twice the value it was expecting ... and then I twigged that one array was of numpy ints (64-bit) and the other of OpenCL floats (32-bit).
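
The factor of two in that last post is just the ratio of element sizes: the same number of elements takes twice as many bytes as 64-bit ints as it does as 32-bit floats. A minimal sketch of the arithmetic, assuming the standard-library array module as a stand-in for the numpy array and the OpenCL buffer (the names host_ints, device_floats, byte_size and describe_mismatch are hypothetical and not from the thread):

```python
from array import array

# Assumption: 'q' (64-bit signed int) stands in for a numpy int64 array,
# and 'f' (32-bit float) stands in for a buffer of OpenCL floats.
host_ints = array("q", [1, 2, 3, 4])
device_floats = array("f", [1.0, 2.0, 3.0, 4.0])


def byte_size(buf):
    """Total size of the buffer in bytes: element size times element count."""
    return buf.itemsize * len(buf)


def describe_mismatch(host, device):
    """Return None if the byte sizes agree, else a message naming both sizes."""
    h, d = byte_size(host), byte_size(device)
    if h == d:
        return None
    return (
        f"host buffer is {h} bytes ({host.itemsize} per element), "
        f"device buffer is {d} bytes ({device.itemsize} per element)"
    )


if __name__ == "__main__":
    print(describe_mismatch(host_ints, device_floats))
```

A check along these lines before the read call would probably have named the offending argument directly, instead of leaving it to a generic invalid-value error.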