[HN Gopher] Little OpenMP* Runtime
___________________________________________________________________
Little OpenMP* Runtime
Author : ingve
Score : 42 points
Date : 2021-02-08 14:58 UTC (8 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| phkahler wrote:
| Not clear what OSes this works with. In particular, can we use it
| on Windows?
|
| I recently introduced a lot of OMP usage in Solvespace and it
| was really easy, and it's transparent to compilers that don't
| support it. However, Windows doesn't ship with the required DLL by
| default. Since we are a single executable file with no installer
| on Windows, we ship separate binaries with and without OMP.
|
| If we could statically link this on Windows that would be great.
| kolbusa wrote:
| It seems to claim to implement the interface of the Clang OpenMP
| library. So you compile with clang and use LOMP instead of
| libomp.so. The Clang OpenMP runtime originated at Intel, and
| Intel's OpenMP library is also compatible with GNU's libgomp.
| I'm not sure if that's true for libomp, though.
| bullen wrote:
| Why do we need this?
|
| The following code "works": http://move.rupy.se/file/atomic.txt
| scott_s wrote:
| First sentence:
|
| > LOMP, short for Little OpenMP (runtime), is a small OpenMP
| runtime implementation that can be used for educational or
| prototyping purposes.
| jcranmer wrote:
| That code actually doesn't work, but it's written in such a way
| that it's impossible to tell when it doesn't work.
|
| Data races in C/C++ code are undefined behavior, which means
| you get 0 theoretical guarantees as to what will actually
| happen. But I'm guessing you don't care about theory, but
| practice.
|
| In actual practice, compilers can (and do!) move around loads
| and stores--these are expensive, and eliminating them where
| possible is almost always insanely profitable. In this specific
| example, there isn't really any optimization that is likely to
| screw you up. However, the hardware can screw you up here.
| There's no guarantee that the stores arrive in any particular
| order, and no guarantee that temporally later reads will
| actually see the results of temporally earlier writes.
|
| You need various kinds of synchronizations (and atomic read-
| modify-write operations) to actually develop multithreaded code
| properly. But your example doesn't have any, and since it's
| literally just reading and writing random values, there's no
| way of telling, just by running it, whether the semantics you
| think it has are the semantics it actually has: you can't
| observe any failure mode where the hardware turns out to
| violate sequential consistency.
___________________________________________________________________
(page generated 2021-02-08 23:01 UTC)