[HN Gopher] Little OpenMP* Runtime
       ___________________________________________________________________
        
       Little OpenMP* Runtime
        
       Author : ingve
       Score  : 42 points
       Date   : 2021-02-08 14:58 UTC (8 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | phkahler wrote:
       | Not clear what OSes this works with. In particular, can we use it
       | on Windows?
       | 
        | I recently introduced a lot of OMP usage in Solvespace and it
        | was really easy, and it's transparent to compilers that don't
        | support it. However, Windows doesn't ship with the required
        | DLL by default. Since we are a single executable file with no
        | installer on Windows, we ship separate binaries with and
        | without OMP.
       | 
       | If we could statically link this on Windows that would be great.
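        | 
        | For reference, a minimal sketch of the kind of pragma usage
        | described above (the function and names are made up, not taken
        | from Solvespace). Built with -fopenmp the loop is split across
        | threads; a compiler without OpenMP support silently ignores the
        | pragma and the same source builds serially:
        | 
        |     void scale(double *v, int n, double k)
        |     {
        |         /* divide the iterations among the OpenMP threads;
        |            without -fopenmp this runs on a single thread */
        |         #pragma omp parallel for
        |         for (int i = 0; i < n; i++)
        |             v[i] *= k;
        |     }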
        
         | kolbusa wrote:
          | It seems to claim to implement the interface of the Clang
          | OpenMP library, so you compile with clang and use LOMP
          | instead of libomp.so. The Clang OpenMP runtime originated at
          | Intel, and Intel's OpenMP library is also compatible with
          | GNU's libgomp. I'm not sure if that's true for libomp,
          | though.
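          | 
          | A hypothetical smoke test of that substitution (this assumes
          | LOMP builds a drop-in libomp.so; the paths below are only
          | placeholders):
          | 
          |     /* hello_omp.c -- build and run with something like:
          |      *   clang -fopenmp hello_omp.c -o hello_omp
          |      *   LD_LIBRARY_PATH=/path/to/lomp/build ./hello_omp
          |      */
          |     #include <stdio.h>
          |     #include <omp.h>
          | 
          |     int main(void)
          |     {
          |         #pragma omp parallel
          |         printf("hello from thread %d of %d\n",
          |                omp_get_thread_num(),
          |                omp_get_num_threads());
          |         return 0;
          |     }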
        
       | bullen wrote:
       | Why do we need this?
       | 
       | The following code "works": http://move.rupy.se/file/atomic.txt
        
         | scott_s wrote:
         | First sentence:
         | 
         | > LOMP, short for Little OpenMP (runtime), is a small OpenMP
         | runtime implementation that can be used for educational or
         | prototyping purposes.
        
         | jcranmer wrote:
         | That code actually doesn't work, but it's written in such a way
         | that it's impossible to tell when it doesn't work.
         | 
          | Data races in C/C++ code are undefined behavior, which means
          | you get zero theoretical guarantees as to what will actually
          | happen. But I'm guessing you care less about theory than
          | about practice.
         | 
          | In actual practice, compilers can (and do!) move loads and
          | stores around--these are expensive, and eliminating them
          | where possible is almost always insanely profitable. In this
          | specific example, there isn't really any optimization that
          | is likely to screw you up. However, the hardware can screw
          | you up here: there's no guarantee that the stores arrive in
          | any particular order, and no guarantee that temporally later
          | reads will actually see the results of temporally earlier
          | writes.
         | 
          | You need various kinds of synchronization (and atomic read-
          | modify-write operations) to develop multithreaded code
          | properly. Your example doesn't have any, and since it's
          | literally just reading and writing random values, you can't
          | tell whether the semantics you think it has are the
          | semantics it actually has just by running it: running it
          | won't surface the failure modes where the hardware turns out
          | to violate sequential consistency.
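          | 
          | A minimal sketch of the kind of synchronization meant here,
          | using C11 atomics for the classic "write the data, then set
          | a flag" pattern (illustrative only, not a rewrite of the
          | linked code; compile with something like cc -pthread):
          | 
          |     #include <pthread.h>
          |     #include <stdatomic.h>
          |     #include <stdio.h>
          | 
          |     static int data;          /* ordinary payload      */
          |     static atomic_int ready;  /* synchronization flag  */
          | 
          |     static void *producer(void *arg)
          |     {
          |         (void)arg;
          |         data = 42;
          |         /* the (seq_cst) atomic store also publishes
          |            the earlier plain write to data */
          |         atomic_store(&ready, 1);
          |         return NULL;
          |     }
          | 
          |     static void *consumer(void *arg)
          |     {
          |         (void)arg;
          |         /* the atomic load can't be hoisted out of the
          |            loop; once ready reads 1, data must read 42 */
          |         while (!atomic_load(&ready))
          |             ;
          |         printf("data = %d\n", data);
          |         return NULL;
          |     }
          | 
          |     int main(void)
          |     {
          |         pthread_t p, c;
          |         pthread_create(&c, NULL, consumer, NULL);
          |         pthread_create(&p, NULL, producer, NULL);
          |         pthread_join(p, NULL);
          |         pthread_join(c, NULL);
          |         return 0;
          |     }
          | 
          | With plain ints the same pattern is a data race; with the
          | atomic flag the consumer is guaranteed to print 42. An
          | atomic read-modify-write such as atomic_fetch_add plays the
          | same role when several threads update a shared counter.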
        
       ___________________________________________________________________
       (page generated 2021-02-08 23:01 UTC)