[HN Gopher] Fast RISC-V-based scripting back end for game engines
       ___________________________________________________________________
        
       Fast RISC-V-based scripting back end for game engines
        
       Author : fwsgonzo
       Score  : 72 points
       Date   : 2024-01-14 19:24 UTC (3 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | hammyhavoc wrote:
       | Genius.
        
         | thesnide wrote:
         | Yep, I also thought about it to replace wasm to enable secure
         | server side code execution.
         | 
         | https://blog.pwkf.org/2023/07/16/lambda-mcu.html
        
           | hammyhavoc wrote:
           | This is extremely cool!
        
       | charcircuit wrote:
       | When writing a web assembly interpreter I was surprised at how it
       | didn't map cleanly to the underlying hardware. The interpreter
       | had to keep track of the types of everything. I can definitely
       | see how a RISC-V based solution would be faster.
        
         | klodolph wrote:
         | I'm curious what you're talking about--WASM has different
         | instructions for each of its different types. There are
         | separate instructions for i32, i64, f32, and f64 in the base
         | spec.
         | 
         | This seems like a pretty clean map to me--most of the
         | architectures these days have native support for 32-bit and
         | 64-bit types. You only get 16-bit and 8-bit support
         | with SIMD or load/store, much like how WASM does it.
        
           | fwsgonzo wrote:
           | He might be talking about having to implement a register
           | allocator/file in order to interpret stack machines faster.
           | Something you get from the compilers on register
           | architectures like RISC-V, ARM and x86.
           | 
           | wasmtime uses cranelift, which has a complex register
           | allocator. wasm3 uses a simple register file.
           | 
           | > In M3/Wasm, the stack machine model is translated into a
           | more direct and efficient "register file" approach.
           | 
           | https://github.com/wasm3/wasm3/blob/main/docs/Interpreter.md
        
             | miohtama wrote:
             | Because registers on x86 and RISC-V are different, you
             | still need to have a register file to map those, no?
        
               | fwsgonzo wrote:
               | Only for binary translation or just-in-time compilation
               | (basically producing native code). If you're interpreting
               | RISC-V you can just pretend the registers are an array of
               | register-sized integers. Which is what they are.
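A minimal sketch of the "registers are just an array" point above, assuming RV32 (32-bit registers) and handling only the R-type ADD instruction. The function name `execute_add` is hypothetical and is not taken from the linked project.

```python
MASK32 = 0xFFFFFFFF  # RV32: every register is 32 bits wide


def execute_add(regs, instr):
    """Execute one RV32I R-type ADD given its raw 32-bit encoding."""
    rd = (instr >> 7) & 0x1F    # destination register index
    rs1 = (instr >> 15) & 0x1F  # first source register index
    rs2 = (instr >> 20) & 0x1F  # second source register index
    if rd != 0:                 # x0 is hard-wired to zero, so writes are dropped
        regs[rd] = (regs[rs1] + regs[rs2]) & MASK32


# The whole register file is a plain list of 32 integers.
regs = [0] * 32
regs[11] = 40                  # a1
regs[12] = 2                   # a2
execute_add(regs, 0x00C58533)  # add a0, a1, a2
```

After the call, `regs[10]` (a0) holds the sum. No mapping onto host registers is needed when interpreting; that only becomes a concern once native code is emitted.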
        
               | charcircuit wrote:
               | >registers are an array of register-sized integers
               | 
               | A "register file" is the name of such an array.
        
           | dkjaudyeqooe wrote:
           | I'm not familiar with the details, and don't have the
           | references to hand, but WASM has been designed in a way that
           | reflects the design choices of Chrome's Javascript engine
           | rather than optimizing standalone compilation.
        
             | klodolph wrote:
             | Part of optimizing for the design choices of browser
             | engines is to make it easy to emit machine code. The
             | browsers have multiple compilers in them, and one of the
             | compilers has the task of emitting simple machine code
             | quickly, so the code can run immediately (reducing start-up
             | time). In V8, the baseline compiler for WebAssembly is called
             | _Liftoff_.
        
           | Findecanor wrote:
           | WASM is a stack machine, so the input code needs to be
           | validated to make sure that the same types get popped as were
           | previously pushed. But that can be done ahead of
           | execution.
           | 
           | You'd need to validate function types at runtime though,
           | which is something that a CPU emulator does not need to do.
        
             | klodolph wrote:
             | WASM is kind of a stack machine, yes, but there are a lot
             | of constraints on the stack machine that are designed to
             | make it easy to translate to registers.
             | 
             | If you squint and look sideways, you can think of it as a
             | flattened tree, rather than a stack machine.
        
           | charcircuit wrote:
           | Take i32.add, for example. You have to:
           | 
           | 1. Check the type at the top of the stack to make sure it's
           | i32 and trap otherwise
           | 
           | 2. Pop the top of the stack
           | 
           | 3. Check the top of the stack and make sure it's i32 and trap
           | otherwise
           | 
           | 4. Pop the top of the stack.
           | 
           | 5. Add the two values together
           | 
           | 6. Push the result to the top of the stack along with its
           | type.
           | 
           | Compare this to ADD from RISC-V where you can just add the
           | contents of 2 source registers and store it in a destination
           | register. There is no type bookkeeping or alignment that
           | you have to worry about.
        
             | klodolph wrote:
             | Most of this can be done statically. You do the bookkeeping
             | once. There are various approaches--you can try to allocate
             | registers, or you can emit code that loads/stores on the
             | stack, or you can just run an interpreter and remove the
             | safety checks from it.
             | 
             | The stack isn't dynamically typed, so it doesn't make sense
             | to trap.
        
               | charcircuit wrote:
               | Looking into it further, the type checking is supposed to
               | be done in a dedicated validation step, and apart from
               | validation, using instructions with the wrong type of data
               | seems to work.
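A toy sketch of the "do the bookkeeping once" idea discussed in this subthread. It is a simplification for illustration only, not the validation algorithm from the WebAssembly spec: it handles straight-line code with three opcodes, and the tuple-based instruction format and the names `validate` and `run` are assumptions.

```python
def validate(code):
    """Type-check the code once, ahead of execution.

    Only types are tracked, on an abstract stack. Raises TypeError
    on a mismatch and returns the types left on the stack.
    """
    types = []
    for op, *args in code:
        if op == "i32.const":
            types.append("i32")
        elif op == "f32.const":
            types.append("f32")
        elif op == "i32.add":
            if types[-2:] != ["i32", "i32"]:
                raise TypeError("i32.add needs two i32 operands")
            types.pop()  # two i32 operands in, one i32 result out
        else:
            raise ValueError(f"unknown opcode {op}")
    return types


def run(code):
    """Execute already-validated code: no type tags, no runtime checks."""
    stack = []
    for op, *args in code:
        if op == "i32.add":
            b = stack.pop()
            a = stack.pop()
            stack.append((a + b) & 0xFFFFFFFF)  # i32 addition wraps
        else:  # i32.const / f32.const push their immediate
            stack.append(args[0])
    return stack


good = [("i32.const", 40), ("i32.const", 2), ("i32.add",)]
bad = [("i32.const", 1), ("f32.const", 2.0), ("i32.add",)]
```

Calling `validate(good)` succeeds, after which `run(good)` can skip every check; `validate(bad)` raises before anything executes, so the interpreter loop never has to trap on a type.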
        
       | SotCodeLaureate wrote:
       | Recently I wrote a simplistic RISC-V interpreter for educational
       | purposes mostly, and a friend was benchmarking it against several
       | scripting systems in the context of game-character control
       | routines. Well, it's quite fast for what it is, though Lua is
       | still somewhat faster, heh.
       | 
       | Some benchmarking code is here:
       | https://github.com/glebnovodran/roam_bench
        
         | fwsgonzo wrote:
         | Very cool! One tip: Take the execute segment and produce fast
         | bytecodes instead of interpreting each instruction as a bit
         | pattern. Producing faster bytecodes is something that even WASM
         | emulators do, despite it being a bytecode format.
        
           | FPGAhacker wrote:
           | What's the difference between a bit pattern and a bytecode?
        
             | fwsgonzo wrote:
             | The fast bytecode is reduced to a simple operation that
             | excludes certain knowns. For example if you have an
             | instruction that stores A0 = A1 + 0, then knowing the
             | immediate is zero, this can be reduced from reading a
             | complex bit pattern to a bytecode that moves from one
             | register to another, basically MV dst=A0, src=A1.
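A rough sketch of the translation step described above, assuming RV32I and covering only ADDI. The bytecode names (`NOP`, `MV`, `ADDI`) and the tuple representation are made up for illustration and are not the bytecodes the linked project uses.

```python
OP_IMM = 0x13  # RV32I major opcode for register-immediate ALU instructions


def translate(instr):
    """Decode one 32-bit instruction once, into a specialised bytecode tuple."""
    opcode = instr & 0x7F
    rd = (instr >> 7) & 0x1F
    funct3 = (instr >> 12) & 0x7
    rs1 = (instr >> 15) & 0x1F
    imm = instr >> 20
    if imm & 0x800:
        imm -= 0x1000  # sign-extend the 12-bit immediate
    if opcode == OP_IMM and funct3 == 0:  # ADDI
        if rd == 0:
            return ("NOP",)               # writes to x0 have no effect
        if imm == 0:
            return ("MV", rd, rs1)        # rd = rs1 + 0 is a plain move
        return ("ADDI", rd, rs1, imm)
    raise NotImplementedError("only ADDI is handled in this sketch")


def run_bytecode(bytecodes, regs):
    """Execute pre-translated bytecodes; no bit-field decoding happens here."""
    for bc in bytecodes:
        if bc[0] == "MV":
            regs[bc[1]] = regs[bc[2]]
        elif bc[0] == "ADDI":
            regs[bc[1]] = (regs[bc[2]] + bc[3]) & 0xFFFFFFFF


# addi a0, a1, 0 followed by addi a0, a0, 5
program = [translate(0x00058513), translate(0x00550513)]
```

The decode cost is paid once per instruction at load time, and the hot loop dispatches only on the already-simplified operation.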
        
           | SotCodeLaureate wrote:
           | Thanks! Yeah, this one is supposed to be a very simple
           | implementation for a kind of "how to write a machine code
           | interpreter" tutorial, but I was looking to experiment with
           | some optimizations if time permits. Patching with pre-baked
           | alt-bytecode was one of the ideas indeed.
        
       | dataangel wrote:
       | This was posted before and I still have no idea what the
       | rationale is. No desktop PC or game console is RISC-V, so if I'm
       | going to all the trouble to use a scripting solution that
       | requires me to compile my scripts to machine language, why would
       | I target RISC-V? Why wouldn't I just compile to x86-64 directly?
       | 
       | Like, what is the vision here, a LuaJIT that targets RISC-V,
       | running inside my x86-64 game, where the emulator translates the
       | RISC-V back into x86-64... for reasons? Just isolation?
        
         | nagisa wrote:
         | The answer lies in the reason why you would use an embedded
         | scripting language/environment at all, over just loading native
         | code plugins - sandboxing, fault isolation and such.
         | 
         | RISC-V seems to be used here more as a bytecode format of
         | sorts, and at least compared to x86_64 it should be much easier
         | to implement (first of all because all of the opcodes have the
         | same size.)
        
         | fwsgonzo wrote:
         | The sandbox is interpreting RISC-V, just very quickly. It's
         | platform independent. I am actually making a game with a
         | derivative of this repo, and in the server I am building the
         | programs at the same time as the server. Once the server
         | starts, it loads all the programs, and then sends compressed
         | programs to each client, so that everyone who connects has the
         | same scripts. Easy and convenient, but most likely just that
         | because I did it from the start.
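A small sketch of the distribution step described above. The comment does not say which compression or integrity scheme is used, so zlib and a SHA-256 digest are assumptions, and `pack_program` / `unpack_program` are hypothetical names.

```python
import hashlib
import zlib


def pack_program(elf_bytes):
    """Server side: prefix the compressed script binary with its SHA-256 digest."""
    return hashlib.sha256(elf_bytes).digest() + zlib.compress(elf_bytes)


def unpack_program(packet):
    """Client side: decompress, then check the digest before loading."""
    digest, payload = packet[:32], packet[32:]
    elf_bytes = zlib.decompress(payload)
    if hashlib.sha256(elf_bytes).digest() != digest:
        raise ValueError("script checksum mismatch")
    return elf_bytes


# Stand-in for a compiled RISC-V program: ELF magic plus padding.
program = b"\x7fELF" + bytes(1000)
packet = pack_program(program)
```

Because the same bytes are interpreted on every platform, the client can load the result of `unpack_program(packet)` without a per-architecture build.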
        
           | speps wrote:
           | This is an interesting concept, getting the gameplay logic
           | sent to the client this way. I guess it only works for
           | simpler games so far, do you have any code examples?
        
             | fwsgonzo wrote:
             | The game is quite complex actually, but the script is not
             | doing overmuch right now. The script is doing things that
             | make sense for a growing modding API, while the engine
             | still does the brunt of the work. That said, there are
             | functions that end up being called billions of times simply
             | because you want that flexibility, and that's where the low
             | latency script pays off.
        
         | saagarjha wrote:
         | Yeah, it seems like WASM would be a better choice?
        
           | MaxBarraclough wrote:
           | Or transpiling to C/C++, like Nim.
        
         | doctorpangloss wrote:
         | > ...for reasons?
         | 
         | Yes. When something's intellectually stimulating you work on it
         | more, that's it.
         | 
         | Most game development is a grind. These huge distractions, like
         | a scripting backend, well if you spend 100h working on the
         | scripting engine only to spend 1h authoring actual scripts, you
         | still spent 1h authoring scripts those 2 weeks instead of 0h.
         | The game gets delivered sooner even if you spend 10x as long
         | working on it.
         | 
         | It's a quintessential misunderstanding about indie game
         | development. HN readers think Jonathan Blow is wasting his time
         | writing a whole new programming language and engine, and that's
         | why his games take 6 years to make. No: his games would take 20
         | years to make if they weren't intellectually engaging to make.
         | They wouldn't be made at all!
         | 
         | It's the same energy as rewriting everything in another
         | programming language.
         | 
         | I think this happens at giant companies too, all the time. The
         | so-called Not Invented Here syndrome: it's as much about
         | laundering open source code as it is about keeping things
         | interesting enough to make the extreme boredom and grind worth
         | it for otherwise smart and healthy people.
        
       | logicprog wrote:
       | As someone who's working on a game engine that's designed to be
       | highly data-driven, with only the core game engine itself in a
       | systems programming language and a _lot_ of calls out to scripts,
       | this is very intriguing! C++20 doesn't seem like a good
       | candidate for an actual scripting language, so I'd have to find a
       | suitable scripting language that can compile to RISC-V, though,
       | and of course requiring precompilation is an issue.
        
         | fwsgonzo wrote:
         | Yes, precompilation is not optional. I've used many languages
         | in sandboxes over the years developing both this and other
         | emulators.
         | 
         | I really like Nim the most. It's Python spiritually, but with
         | types. Types are just really, really necessary, and it does have FFI
         | support so you can forward arguments either from a wrapper or
         | use {.cdecl} directly. It's still not a barneskirenn (as we say
         | in Norway), but definitely the most fun I've had using another
         | language in my various sandboxes.
         | 
         | Nelua and Zig are close seconds. Both are just so easy to work
         | with C-based FFI. Nelua is probably not safe to use, as it's
         | still a work in progress. At least last I checked. Zig is very
         | much ready to use. Maybe not your cup of tea, though.
         | 
         | Golang has a complex run-time and I don't recommend using it.
         | It's definitely possible though as my sandbox does have a full
         | MMU. Just expect integration to be long and arduous. And the ABI
         | changes potentially every version.
         | 
         | Rust is one of the easier ones. I didn't find it fun to work
         | with though. It has by far the best inline assembly, but too
         | much fighting with the compiler. I know that people love Rust,
         | and it _is_ fully supported in my emulator.
         | 
         | C/C++ has the benefit of having the ability to have their
         | underlying functions overridden by native helper system calls.
         | Eg. replacing memcpy() with a system call that has native
         | performance. It's too long a topic to talk about here, but
         | Nim and Nelua also fall under this umbrella along with other
         | languages that can compile to C/C++.
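A sketch of what a native helper system call for memcpy() could look like on the host side, under several assumptions: guest memory is a flat `bytearray`, the syscall number 500 is made up, and arguments arrive in a0-a2 with the syscall number in a7, following the usual RISC-V convention. None of this is taken from the linked project's API.

```python
SYSCALL_MEMCPY = 500  # hypothetical syscall number reserved by the host

A0, A1, A2, A7 = 10, 11, 12, 17  # RISC-V argument / syscall-number registers


def handle_ecall(regs, memory):
    """Host handler for ECALL: performs memcpy at native speed for the guest."""
    if regs[A7] != SYSCALL_MEMCPY:
        raise NotImplementedError(f"unhandled syscall {regs[A7]}")
    dst, src, n = regs[A0], regs[A1], regs[A2]
    # Bounds-check against guest memory so the sandbox cannot be escaped.
    if dst + n > len(memory) or src + n > len(memory):
        raise ValueError("guest memcpy out of bounds")
    memory[dst:dst + n] = memory[src:src + n]  # one host-side bulk copy


memory = bytearray(b"hello world" + bytes(16))
regs = [0] * 32
regs[A7] = SYSCALL_MEMCPY
regs[A0], regs[A1], regs[A2] = 16, 0, 5  # copy 5 bytes from offset 0 to 16
handle_ecall(regs, memory)
```

The guest-side memcpy() then shrinks to a stub that loads the registers and issues a single ECALL, instead of an interpreted byte-copy loop.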
         | 
         | Kotlin is definitely possible to use. A bit hard to understand
         | how the native stuff actually works, but I did manage to run a
         | hello world program in several sandboxes with some effort. I'm
         | not 100% sure but I think I managed to convert a C API header
         | directly to something kotlin understands using a one-liner in
         | the terminal. I would say it scores high just on that. Again,
         | just a bit hard to understand how to talk to Kotlin from an
         | external FFI looking in.
         | 
         | JavaScript is of course possible with both jitless v8 and
         | QuickJS. v8 requires writing some C++ scaffolding + host API
         | functions, and QuickJS requires the same in C. Either works.
         | 
         | And I think that's all the languages I have tried. If you think
         | there's a missing language here, then I would love to try it!
        
           | logicprog wrote:
           | This is a really interesting list, thank you for replying!
           | 
           | (the following are entirely undirected musings on using your
           | technology for my game engine, mostly in case anyone finds
           | them interesting or has something to suggest that I haven't
           | thought of since I'm new at this)
           | 
           | At least for my use case C, Zig, Rust, and even probably Nim
           | are out of the question because I'm not just using scripting
           | for sandboxing capabilities and such, I'm also using it
           | because I want people to be able to program games and write
           | mods in a high-level language, so using a systems programming
           | language as my scripting language kind of feels like it
           | defeats part of the purpose and I might as well just use
           | dynamic linking with a C ABI or something crazy like that
           | (I'm new to this so excuse me if that's nonsense lol).
           | JavaScript and Kotlin are intriguing though, because they
           | aren't systems programming languages, so I'll have to think
           | long and hard about those.
           | 
           | I've been considering C# for scripting lately (because I like
           | it well enough, it's widespread, and known in the gaming
           | world) and I wonder, if you can get natively AOT-compiled
           | Kotlin to work, whether you could get natively AOT-compiled C#
           | to work too... there would probably be similar complications due
           | to the large and complex runtime, but I know you can strip it
           | down so I wonder how that might play in. I also wonder what
           | the performance would be like compared to just hosting the
           | dotnet runtime, which is what I was intending to do before I
           | saw your post. Maybe at some point I should set up a
           | benchmark to compare! Although I have to say the pre-
           | compilation thing is a bit of a deal-breaker for me, since I
           | really want people to be able to put plain text things
           | directly in the game folder and see those changes in the
           | engine without having to do any kind of build process, which
           | is something that is achievable with the regular dotnet
           | runtime using a precompiled assembly that bootstraps the rest of
           | the scripts using dynamic assembly compilation and loading to
           | collect everything else in the script directory.
        
             | fwsgonzo wrote:
             | Hm, if you don't actually need a sandbox then I think just
             | using the C# run-time makes a lot of sense. I also like C#,
             | but I've never been in a situation to try the fairly new
             | AOT support. Sounds like a good idea, though. C# is a very
             | good language.
        
       | FpUser wrote:
       | A long time ago I used the paxCompiler scripting engine. It was
       | made for Delphi / FreePascal and supported Delphi as its input
       | language. Rather than being a bytecode interpreter / JIT, it
       | actually generated native code in RAM, hence the overhead of
       | calling it was the same as calling a regular function. It had lots
       | of other super nice features but that is all in the past anyway.
       | 
       | I am curious why such an approach is not used for other scripting
       | projects.
        
       | sylware wrote:
       | Funny, I just started (yesterday!!) something similar: a kind of
       | x86_64 (which I call x64) virtual machine for rv64.
       | 
       | I am writing this "virtual machine" in x86_64 assembly though
       | (linux ABI). I don't plan to have native x86_64 compilation but
       | only an interpreter.
       | 
       | The idea is to code rv64 executables (Linux) which I could
       | "run-ish" on x86_64 (Linux).
       | 
       | There is also the real machine emulator from M.Bellard :
       | https://bellard.org/tinyemu
       | 
       | Well, RISC-V is gaining momentum, and the apparatus for
       | transitioning from x86_64 is taking shape thanks to many
       | individuals wishing RISC-V to be a success.
       | 
       | BTW, if anybody knows about Milk-V Duo (the one with the SoC free
       | from Arm cores) resellers in Europe that I can pay without a credit
       | card and contact with a self-hosted email... thank you.
        
       ___________________________________________________________________
       (page generated 2024-01-14 23:00 UTC)