[HN Gopher] Comparing Julia to performance portable parallel pro...
       ___________________________________________________________________
        
       Comparing Julia to performance portable parallel programming models
       for HPC [pdf]
        
       Author : leephillips
       Score  : 26 points
       Date   : 2021-11-17 19:23 UTC (3 hours ago)
        
 (HTM) web link (conferences.computer.org)
        
       | adgjlsfhk1 wrote:
       | I hope LLVM gets better at using AVX-512 instructions
       | efficiently. That looks like the main pain point found here. It's
       | exciting to see that, for the most part, Julia roughly matches
       | the performance of the mature HPC solutions.
        
         | snicker7 wrote:
         | I think the issue with code vectorization is that the compiler
         | must know that the loop's iterations can safely run out of
         | order. I don't know if that is something that LLVM can
         | determine reliably.
        
           | adgjlsfhk1 wrote:
           | In cases where Julia is already emitting 256-bit
           | instructions, LLVM has already figured out that the loop
           | isn't order dependent.
           | 
           | The main issue here is that LLVM thinks AVX-512 is slow (which
           | it sometimes is).
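
The reordering concern in this sub-thread can be illustrated with a small sketch. It is written in Python purely for illustration and is not taken from the paper or from Julia/LLVM. It shows why a compiler cannot vectorize a floating-point reduction without permission: floating-point addition is not associative, so summing in SIMD-style lanes can give a different result than summing strictly in order. The function names and the `lanes` parameter are hypothetical. (In Julia, annotations such as `@simd` are one way to grant that permission for a loop.)

```python
# Illustrative only: mimics how a vectorized reduction regroups additions.
# The names sum_in_order, sum_in_lanes, and lanes are hypothetical.

values = [1e16, 1.0, -1e16, 1.0]


def sum_in_order(xs):
    """Strict left-to-right sum, as a scalar loop would compute it."""
    total = 0.0
    for x in xs:
        total += x
    return total


def sum_in_lanes(xs, lanes=2):
    """Sum with one partial accumulator per lane, then combine the lanes.

    This is the regrouping a SIMD reduction performs: element i goes to
    accumulator i % lanes, and the partial sums are added at the end.
    """
    partial = [0.0] * lanes
    for i, x in enumerate(xs):
        partial[i % lanes] += x
    total = 0.0
    for p in partial:
        total += p
    return total


print(sum_in_order(values))  # 1.0: the first 1.0 is lost when added to 1e16
print(sum_in_lanes(values))  # 2.0: 1e16 and -1e16 cancel within one lane
```

Because the two orders disagree, a compiler that must preserve exact results has to keep the scalar order unless it is told, or can otherwise establish, that reordering is acceptable.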
        
       | throwawaybutwhy wrote:
       | The link exposes a login and a password. Dunno if this is
       | intended.
        
       ___________________________________________________________________
       (page generated 2021-11-17 23:01 UTC)