[HN Gopher] Expressive Vector Engine - SIMD in C++
___________________________________________________________________
Expressive Vector Engine - SIMD in C++
Author : klaussilveira
Score : 61 points
Date : 2025-01-05 17:11 UTC (3 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| vblanco wrote:
| Interesting library, but I see it falls into the same trap as
| almost all SIMD libraries: they hardcode the vector target
| completely, so you can't mix and match feature levels within a
| build. The documentation recommends compiling your kernels into
| DLLs and loading them dynamically, which is a huge mess:
| https://jfalcou.github.io/eve/multiarch.html
|
| Meanwhile xsimd (https://github.com/xtensor-stack/xsimd) has the
| feature level as a template parameter on its vector objects,
| which lets you branch at runtime between SIMD levels as you wish.
| I find it's a far better way of doing things if you actually want
| to ship the SIMD code to users.
| spacechild1 wrote:
| Thanks, that's an important caveat!
|
| > Meanwhile xsimd (https://github.com/xtensor-stack/xsimd) has
| the feature level as a template parameter on its vector objects
|
| That's pretty cool because you can write function templates and
| instantiate different versions that you can select at runtime.
| vblanco wrote:
| Yeah, that's the fun of it: you write your kernel/function so
| that the SIMD level is a template parameter, and then you can
| use simple branching like:
|
|     if (supports<avx512>) { myAlgo<avx512>(); }
|     else                  { myAlgo<avx>(); }
|
| I've also used it for benchmarking, to see whether my code scales
| well across different SIMD widths, and it's a huge help.
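To make the pattern above concrete, here is a minimal, self-contained sketch. It does not use xsimd: the tag types (`sse2_tag`, `avx_tag`, `avx512_tag`), the `cpu_features` struct, and the functions `sum_kernel` and `sum_dispatch` are all hypothetical names invented for illustration, and the "kernel" is a plain scalar loop that only mimics per-lane accumulation. A real library would supply its own architecture tags, vector types, and CPU detection.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical architecture tags; only the lane count differs here.
struct sse2_tag   { static constexpr std::size_t lanes = 4;  };
struct avx_tag    { static constexpr std::size_t lanes = 8;  };
struct avx512_tag { static constexpr std::size_t lanes = 16; };

// Hypothetical result of a runtime CPU feature query.
struct cpu_features { bool avx; bool avx512; };

// Kernel templated on the architecture. The inner loop is a scalar
// stand-in for one vector add of Arch::lanes floats.
template <class Arch>
float sum_kernel(const float* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    std::array<float, Arch::lanes> acc{};  // one partial sum per lane
    std::size_t i = 0;
    for (; i + Arch::lanes <= n; i += Arch::lanes)
        for (std::size_t l = 0; l < Arch::lanes; ++l) acc[l] += data[i + l];
    float total = 0.0f;
    for (float a : acc) total += a;        // horizontal reduction
    for (; i < n; ++i) total += data[i];   // scalar tail
    return total;
}

// Every instantiation lives in the same binary; pick one at runtime.
float sum_dispatch(const cpu_features& cpu, const float* data, std::size_t n) {
    if (cpu.avx512) return sum_kernel<avx512_tag>(data, n);
    if (cpu.avx)    return sum_kernel<avx_tag>(data, n);
    return sum_kernel<sse2_tag>(data, n);
}
```

Because the architecture is part of the function's type, each instantiation is a distinct symbol, which is what allows several feature levels to coexist in one build. In real code each instantiation would also have to be compiled with the matching target flags or attributes.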
| dyaroshev wrote:
| FYI: You don't want to do this as written. `supports<avx512>`
| is an expensive check, so you really want to evaluate it once
| and cache the result in a static.
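One way to read the advice above is to run the feature check once and keep the selected kernel in a function-local static. The sketch below assumes exactly that; `probe_wide_simd`, `probe_calls`, `double_narrow`, `double_wide`, and `double_it` are hypothetical names, and the probe is a stub that always reports "not supported" so that the example stays portable.

```cpp
#include <cassert>

// Counts how often the (hypothetical) expensive probe runs.
inline int& probe_calls() { static int n = 0; return n; }

// Stand-in for an expensive runtime feature query (for example CPUID based).
inline bool probe_wide_simd() { ++probe_calls(); return false; }

using kernel_fn = int (*)(int);

// Two interchangeable implementations of the same operation.
inline int double_narrow(int x) { return x + x; }
inline int double_wide(int x)   { return x * 2; }

inline kernel_fn pick_kernel() {
    return probe_wide_simd() ? double_wide : double_narrow;
}

inline int double_it(int x) {
    // Initialized on the first call only; function-local static
    // initialization is thread-safe since C++11.
    static const kernel_fn fn = pick_kernel();
    assert(fn != nullptr);
    return fn(x);
}
```

After the first call, every use of `double_it` costs one indirect call and no further feature queries.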
| kookamamie wrote:
| 100% agreed. This is the main reason ISPC is my go-to tool for
| explicit vectorization.
| janwas wrote:
| +1, dynamic dispatch is important. Our Highway library has
| extensive support for this.
|
| Detailed intro by kfjahnke here:
| https://github.com/kfjahnke/zimt/blob/multi_isa/examples/mul...
| vlovich123 wrote:
| Since you seem knowledgeable about this, what does this do
| differently from other SIMD libraries like xsimd / highway? Is
| it the addition of algorithms, similar to those in the standard
| library, that are explicitly SIMD optimized?
| dyaroshev wrote:
| Our answer to this is dynamic dispatch: if you want multiple
| versions of the same kernel compiled, compile multiple DLLs.
|
| The big problem here is: ODR violations. We really didn't want
| to do the xsimd thing of forcing the user to pass an arch
| everywhere.
|
| Also that kinda defeats the purpose of "simd portability" - any
| code with avx2 can't work for an arm platform.
|
| eve just works everywhere.
|
| Example: https://godbolt.org/z/bEGd7Tnb3
| janwas wrote:
| It is possible to avoid ODR violations :) We put the per-
| target code into unique namespaces, and export a function
| pointer to them.
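The idea of per-target namespaces plus an exported function pointer can be sketched in a single file as follows. This is not Highway's actual mechanism or macro set. `DEFINE_TARGET`, `target_sse2`, `target_avx2`, `padded_size`, `get_padded_size`, and `call_padded` are hypothetical names, and a real setup would compile the same source once per target with different compiler flags rather than stamp it out with a lane-count argument.

```cpp
#include <cassert>
#include <cstddef>

// Stamps the same function into a target-specific namespace. Each copy has
// a distinct qualified name, so two different bodies never share one symbol
// and there is no ODR violation.
#define DEFINE_TARGET(NS, LANES)                              \
    namespace NS {                                            \
        inline std::size_t padded_size(std::size_t n) {       \
            return (n + (LANES) - 1) / (LANES) * (LANES);     \
        }                                                     \
    }

DEFINE_TARGET(target_sse2, 4)  // rounds up to a multiple of 4 lanes
DEFINE_TARGET(target_avx2, 8)  // rounds up to a multiple of 8 lanes

using padded_size_fn = std::size_t (*)(std::size_t);

enum class target { sse2, avx2 };

// The only thing exported to callers is a plain function pointer.
inline padded_size_fn get_padded_size(target t) {
    return t == target::avx2 ? &target_avx2::padded_size
                             : &target_sse2::padded_size;
}

inline std::size_t call_padded(target t, std::size_t n) {
    padded_size_fn fn = get_padded_size(t);
    assert(fn != nullptr);
    return fn(n);
}
```

Callers never name a target namespace directly, so the rest of the program does not need to pass an architecture parameter around.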
| dyaroshev wrote:
| You can do many things with macros and inline namespaces, but
| I believe they run into problems when modules come into
| play. With modules, can you compile the same code twice with
| different flags?
| nickpsecurity wrote:
| I also found this looking for portable SIMD:
|
| https://github.com/google/highway
| shadowpho wrote:
| Wait, what about AMD? They only claim support for Intel and ARM.
| Sadiinso wrote:
| << AMD >> is x86
| dyaroshev wrote:
| AMD we support pretty well. I tested on Zen 1 and a bit on Zen 4.
| Conscat wrote:
| EVE is personally my favorite SIMD library in any programming
| language. It's the only one I've tried that provides masked lane
| operations in a declarative style, aside from SPMD languages like
| CUDA or OpenMP. The [] syntax for that is admittedly pretty
| exotic C++, but I think the usefulness of the feature is worth
| it. I wish the documentation was better, though. When I first
| started, I struggled to figure out how to simply make a 4-lane
| float vector that I can pass into shaders, because almost all of
| the examples are written for the "wide" native-SIMD size.
| dyaroshev wrote:
| Hi!
|
| Thanks for your interest in the library.
|
| Here is a godbolt example: https://godbolt.org/z/bEGd7Tnb3
|
| Here are a bunch of simple examples:
| https://github.com/jfalcou/eve/blob/fb093a0553d25bb8114f1396...
|
| I personally think we have the following strengths:
|
| * Algorithms. Writing SIMD loops is very hard. We give you a lot
|   of ready-to-go loops (find, search, remove, set_intersection, to
|   name a few).
| * zip and SOA support out of the box.
| * High-quality codegen. I haven't seen other libraries care about
|   unrolling or aligning data accesses, yet these give you
|   substantial improvements.
| * Support for more than transform/reduce. For example, we have a
|   really decent compress implemented for sse/avx/neon.
|
| The following weaknesses:
|
| * We don't support runtime-sized sve/rvv (only fixed size). We
| tried really hard, but unfortunately the C++ language just
| refuses to play ball there. Here is a discussion about that:
| https://stackoverflow.com/questions/73210512/arm-sve-wrappin...
|
| If this is something you need, we recommend compiling a few
| dynamic libraries with the correct fixed lengths. Google Highway
| manages to pull it off, but the trade-off is a variadics interface
| that I personally find very difficult.
|
| * Runtime dispatch based on arch.
|
| We again recommend DLLs for this. The problem here is ODR. I
| believe there is a solution based on the preprocessor and
| namespaces that I could use, but it breaks as soon as modules
| become a thing. So in the module world we don't have an option.
| I'm happy to hear suggestions.
|
| * No MSVC support
|
| C++20 support in MSVC is still not solid enough, and each new
| version breaks something that was already working. Sad times.
|
| * Just tricky to get started.
|
| I don't know what to do about that. I'm happy to just write
| examples for people. If you want to try the library, please
| create an issue/discussion or something similar. I'm happy to
| take some time and try to solve your case.
|
| We talked about the library at CppCon:
| https://youtu.be/WZGNCPBMInI?si=buFteQB1e1vXRT5M
|
| If you want to learn how SIMD algorithms work, here are a couple
| of talks I gave:
| https://youtu.be/PHZRTv3erlA?si=b87DBYMDskvzYcq1
| https://youtu.be/vGcH40rkLdA?si=WL2e5gYQ7pSie9bd
|
| Feel free to ask any questions.
___________________________________________________________________
(page generated 2025-01-08 23:01 UTC)