[HN Gopher] Why LuaJIT's interpreter is written in assembly
___________________________________________________________________
Why LuaJIT's interpreter is written in assembly
Author : obl
Score : 43 points
Date : 2021-02-11 16:18 UTC (6 hours ago)
(HTM) web link (lua-users.org)
(TXT) w3m dump (lua-users.org)
| guenthert wrote:
| That needs a 2011 tag.
| newobj wrote:
| Is LuaJIT still under active development? I thought the developer
| had walked away. With Torch also looking dead, that use case is
| gone, too. Roblox has their own Lua VM now.
|
| I love Lua and code in it almost every day for fun, but yeah I'm
| pretty sure LuaJIT is just "done" now?
| BugsJustFindMe wrote:
| The git repo weirdly continues to get periodic updates. I think
| it just doesn't get features anymore?
| cardanome wrote:
| LuaJIT is simply feature complete. There is nothing to add
| to it.
|
| The newer versions of Lua basically implemented features that
| LuaJIT already had or that would mean a performance trade-off.
| I am still using Lua 5.1 (first released in 2006; its last
| update, 5.1.5, is from 2012) because there is no reason to
| upgrade.
|
| If you design something well, you don't need to push new
| features every year. Stability is underrated.
| pansa2 wrote:
| > _There is nothing to add to it._
|
| I'm not sure that's true. Maybe LuaJIT was never going to add
| the features it's missing from Lua 5.2, 5.3 and 5.4. However,
| when Mike Pall stepped back in 2015 [0], he had still been
| planning to further improve the implementation - for example
| with a new garbage collector [1] and "hyperblock scheduling"
| [2] (which remain unimplemented), plus 64-bit pointer support
| (which was eventually completed by other people).
|
| [0] https://www.freelists.org/post/luajit/Looking-for-new-
| LuaJIT...
|
| [1] http://wiki.luajit.org/New-Garbage-Collector
|
| [2] https://github.com/LuaJIT/LuaJIT/issues/37
| moonchild wrote:
| It gets maintained, but that's pretty much it. Last commit was
| a small fix ~2 months ago.
|
| I had high hopes for moonjit, but development on it has ceased.
| There are other forks--openresty and raptorjit come to mind--
| but they don't have feature parity.
| tobylane wrote:
| https://github.com/openresty/luajit2
|
| It has a few extras, but they agree with the original LuaJIT
| author's opinion that not every 5.2 feature can be implemented
| in the JIT.
| pansa2 wrote:
| Relevant:
|
| > > _Threaded code should have better branch prediction behavior
| than a jump table with a single dispatch point_
|
| > _This is not the case anymore, at least for modern Intel
| processors. Starting with the Haswell micro-architecture, the
| indirect branch predictor got much better and a plain switch
| statement is just as fast as the "computed goto" equivalent. Be
| wary of any references about this that are from before 2013._
|
| https://news.ycombinator.com/item?id=15396761
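|
| For readers unfamiliar with the term, here is a minimal sketch of
| the "plain switch" dispatch the quote refers to. The toy stack
| bytecode (OP_PUSH, OP_ADD, OP_MUL, OP_HALT) and the function name
| run_switch are made up for illustration and have nothing to do
| with LuaJIT's real instruction set. The point is only that every
| handler returns to the same switch, so all opcodes share one
| indirect branch.
|

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode, not LuaJIT's instruction set. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Runs a small stack program. Every opcode comes back to the single
   switch below, i.e. one shared dispatch point. Assumes a well-formed
   program that never overflows or underflows the 64-entry stack. */
int64_t run_switch(const int64_t *code)
{
    int64_t stack[64];
    size_t sp = 0; /* number of values currently on the stack */
    size_t pc = 0; /* index of the next word in the program */

    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:            /* operand: the value to push */
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:             /* pop two values, push their sum */
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:             /* pop two values, push their product */
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:            /* result is the top of the stack */
            return stack[sp - 1];
        default:
            return -1;           /* unknown opcode */
        }
    }
}
```

|
| Calling run_switch on the program "push 2, push 3, add, push 4,
| mul, halt" evaluates (2 + 3) * 4. Whether the compiler emits a
| jump table for the switch depends on the compiler and flags.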
| gopalv wrote:
| > Be wary of any references about this that are from before
| 2013.
|
| And now that it is 2021, ignore the pre-2018 numbers.
|
| Somewhere in that old thread, I referenced the indirect jump
| performance hit from the whole Spectre/Meltdown mitigations for
| any Xeons you might have bought before mid-2020.
|
| There's a nice paper from VMware on "JumpSwitches" from USENIX
| '19 that is worth reading in this context.
|
| That suggests that of the five different types of indirect
| jumps, some are back to being fast again - the one we are
| dealing with is the search jumpswitch.
|
| I would say direct threaded execution (computed goto) is still
| worth it over single-dispatch jumps, particularly if you can JIT
| basic blocks: replace something like an "add, mul, add, store"
| sequence with a single basic block that keeps its values in
| registers for the whole operation, jump into it directly with
| CGOTO as you would with a compiled chunk of code, and build your
| micro-JIT one opcode at a time.
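|
| A rough sketch of the idea above, assuming GCC or Clang (the
| "&&label" labels-as-values syntax is a compiler extension, not
| standard C). The accumulator machine, the opcode names and
| OP_FUSED_AMAS are hypothetical. Strictly speaking this is token
| threading (the opcode indexes a label table) rather than direct
| threading, and the fused handler is written by hand here, not
| generated by a JIT. It only illustrates that each handler ends in
| its own indirect jump and that a fused "add, mul, add, store"
| handler runs without dispatching between the four steps.
|

```c
#include <stdint.h>

/* Hypothetical accumulator-machine opcodes, for illustration only. */
enum { OP_ADDI, OP_MULI, OP_STORE, OP_FUSED_AMAS, OP_HALT };

/* Assumes a well-formed program: opcodes are not range-checked. */
int64_t run_threaded(const int64_t *code, int64_t *mem)
{
    /* One label address per opcode, in enum order (GCC/Clang only). */
    static void *const dispatch[] = {
        &&do_addi, &&do_muli, &&do_store, &&do_fused_amas, &&do_halt
    };
    const int64_t *pc = code;
    int64_t acc = 0;

/* Each handler ends with its own indirect jump to the next handler. */
#define NEXT() goto *dispatch[*pc++]

    NEXT();

do_addi:                 /* operand: immediate to add */
    acc += *pc++;
    NEXT();
do_muli:                 /* operand: immediate to multiply by */
    acc *= *pc++;
    NEXT();
do_store:                /* operand: memory slot index */
    mem[*pc++] = acc;
    NEXT();
do_fused_amas:           /* operands: addend, factor, addend, slot */
    acc = (acc + pc[0]) * pc[1] + pc[2];
    mem[pc[3]] = acc;    /* no dispatch happened between the steps */
    pc += 4;
    NEXT();
do_halt:
    return acc;
#undef NEXT
}
```

|
| The programs "addi 2, muli 3, addi 4, store 0, halt" and
| "fused 2 3 4 1, halt" compute the same value, but the second
| takes two dispatches instead of five.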
| [deleted]
| remexre wrote:
| I wonder if Clang can do better these days; I noticed that it
| seems to merge computed gotos into normal-looking control flow (I
| suppose this is necessary for alias analysis).
| nn3 wrote:
| Also, modern compilers should be able to use frequency
| information from profile feedback to inform the register
| allocator. So it's unclear whether the static register allocation
| scheme is really that much better.
___________________________________________________________________
(page generated 2021-02-11 23:01 UTC)