[HN Gopher] Comparing Julia to performance portable parallel pro...
___________________________________________________________________
Comparing Julia to performance portable parallel programming models
for HPC [pdf]
Author : leephillips
Score : 26 points
Date : 2021-11-17 19:23 UTC (3 hours ago)
(HTM) web link (conferences.computer.org)
(TXT) w3m dump (conferences.computer.org)
| adgjlsfhk1 wrote:
| I hope LLVM gets better at using AVX-512 instructions
| efficiently. That looks like the main pain point found here. It's
| exciting to see that for the most part Julia is roughly matching
| the performance of the mature HPC solutions.
| snicker7 wrote:
| I think the issue with code vectorization is that the compiler
| must know that the iterations of a given loop can be executed in
| any order. I don't know if that is something that LLVM can do
| reliably.
| adgjlsfhk1 wrote:
| In cases where Julia is already emitting 256-bit
| instructions, LLVM has already figured out that the loop
| isn't order dependent.
|
| The main issue here is that LLVM thinks AVX-512 is slow (which
| it sometimes is).
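
As an illustration of the ordering point in the exchange above (not taken from the thread or the linked paper): floating-point addition is not associative, so a compiler cannot split a sum across SIMD lanes unless it is allowed to reorder the additions. The sketch below, in plain Python with hypothetical helper names `sum_in_order` and `sum_in_lanes`, mimics a scalar loop and a two-lane reduction over the same data and shows that the two orderings can disagree.

```python
def sum_in_order(values):
    """Add the values strictly left to right, as a scalar loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_in_lanes(values, lanes=2):
    """Mimic a SIMD reduction: keep `lanes` partial sums, then combine them.

    This is only a model of the reordering a vectorizer performs; it is not
    how any particular compiler implements it.
    """
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v  # element i goes to lane i mod lanes
    return sum_in_order(partial)


# Values chosen so that rounding depends on the order of the additions.
values = [1e16, 1.0, -1e16, 1.0]

scalar = sum_in_order(values)   # ((1e16 + 1.0) - 1e16) + 1.0
vector = sum_in_lanes(values)   # (1e16 - 1e16) + (1.0 + 1.0)

print(scalar, vector, scalar == vector)
```

Because the two results differ, a compiler has to keep the scalar order unless the programmer says reordering is acceptable. In Julia, the `@simd` macro is one way to give that permission for a loop.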
| throwawaybutwhy wrote:
| The link exposes a login and a password. Dunno if this is
| intended.
___________________________________________________________________
(page generated 2021-11-17 23:01 UTC)