[HN Gopher] Nova: A JavaScript and WebAssembly engine written in...
___________________________________________________________________
Nova: A JavaScript and WebAssembly engine written in Rust
Author : AbuAssar
Score : 128 points
Date : 2025-05-29 14:05 UTC (8 hours ago)
(HTM) web link (trynova.dev)
| nine_k wrote:
| Uses "data-oriented design", so it's likely striving to be faster
| than other non-JIT runtimes by being more cache-friendly.
|
| Still at early stages, quite incomplete, not nearly ready for
| real use, AFAICT.
| aapoalas wrote:
| Hi, Nova dev here.
|
| Yes, basically. And removing structural inheritance.
| throwaway894345 wrote:
| Can you elaborate on "And removing structural inheritance"?
| Does that mean Nova doesn't use traits, and if so, why would
| that matter?
| aapoalas wrote:
| Traits are a type of interface inheritance; base classes
| and inherited classes a la C++ is structural inheritance.
|
| So basically it just means that I have to write more
| interfaces and implementations for them, because I don't
| have base classes to fall onto. Instead, in derived
| type/class instances I have an optional (maybe null)
| "pointer" to a base type/class instance. If the derived
| instance never uses its base class features, then the
| pointer stays null and no base instance is created.
|
| Often derived objects in JS are only used for those derived
| features, so I save live memory. But: the derived object
| type needs its own version of at least some of the base
| class methods, so I pay more in instruction memory
| (executable size).
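A minimal Rust sketch of the lazily created base instance described above. Every name here (`Heap`, `ArrayData`, `ObjectData`, `set_named_property`) is a hypothetical illustration and not taken from Nova's source; properties are plain `(String, f64)` pairs purely for brevity.

```rust
// Hypothetical sketch; none of these names are taken from Nova's source.

/// Index of an ordinary object in the heap's object vector.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjectIndex(u32);

/// "Base class" data: the named properties of an ordinary object.
#[derive(Debug, Default)]
pub struct ObjectData {
    pub properties: Vec<(String, f64)>,
}

/// "Derived" data: an Array owns its elements and holds only an optional
/// reference to base-object data.
#[derive(Debug, Default)]
pub struct ArrayData {
    pub elements: Vec<f64>,
    pub backing_object: Option<ObjectIndex>,
}

#[derive(Debug, Default)]
pub struct Heap {
    pub objects: Vec<ObjectData>,
    pub arrays: Vec<ArrayData>,
}

impl Heap {
    /// Allocates an Array without creating any base-object data.
    pub fn create_array(&mut self, elements: Vec<f64>) -> usize {
        self.arrays.push(ArrayData { elements, backing_object: None });
        self.arrays.len() - 1
    }

    /// Models `array.foo = value`: the base object is created on first use.
    /// Keys are not de-duplicated, to keep the sketch short.
    pub fn set_named_property(&mut self, array: usize, key: &str, value: f64) {
        let index = match self.arrays[array].backing_object {
            Some(index) => index,
            None => {
                self.objects.push(ObjectData::default());
                let index = ObjectIndex((self.objects.len() - 1) as u32);
                self.arrays[array].backing_object = Some(index);
                index
            }
        };
        self.objects[index.0 as usize]
            .properties
            .push((key.to_string(), value));
    }
}
```

An Array that is only ever indexed never causes an `ObjectData` to be allocated; the cost is that Array-specific code paths must handle the `None` case themselves.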
| Permik wrote:
| Essentially implementing JavaScript on top of the ECS
| architecture :D
| aapoalas wrote:
| Yup! My whole inspiration for this came from a friend
| explaining ECS to me and me thinking "wouldn't that work for
| a JS engine?"
| SkiFire13 wrote:
| I've seen this brought up a couple times now, but I never
| get it. Why would ECS fit a JS engine? The ECS pattern
| optimizes for iterating over tons of data, but a JS engine
| does the opposite of that, it needs to interpret instruction
| by instruction which could access random data.
| aapoalas wrote:
| Indeed, there's no guarantee that it will fit: I think it
| will but I don't know and want to find out.
|
| There are strong (IMO) reasons to think it will fit,
| though. User code can indeed do whatever but it rarely
| does. Programs written in JS are no less structured and
| predictable than ones written in C++ or Rust or any other
| language: they mostly operate on groups of data running
| iterations, loops, and algorithms over and over again. So
| the instructions being interpreted are likely to form
| roughly ECS System-like access patterns.
|
| Furthermore, it is more likely that data that came into
| the engine at one time (eg. one JSON.parse call or fetch
| result) will be iterated through at the same time. Thus,
| if the engine can ensure that data is and stays
| temporally colocated, then it is statistically likely
| that the interpreter's memory access patterns will not
| only come from System-like algorithms, they will access
| Component-array like memory.
|
| So: JS objects (and other heap allocated data) are
| Entities, their data is laid out in arrays of Components
| (TODO laying out object properties in Component arrays,
| at least in some cases), and the program forms the
| Systems. ECS :)
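The Entity/Component/System mapping above can be sketched in Rust as follows. This is an illustration under assumptions, not Nova's actual layout: `ArrayId`, `ArrayColumns`, and `total_length` are hypothetical names.

```rust
// Hypothetical sketch of a Struct-of-Arrays heap; not Nova's actual layout.

/// An "Entity": just an index into the component columns.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ArrayId(pub usize);

/// "Components": one densely packed column per attribute.
#[derive(Debug, Default)]
pub struct ArrayColumns {
    pub lengths: Vec<u32>,
    pub element_starts: Vec<u32>,
}

impl ArrayColumns {
    /// Appends one entry to every column and returns the new entity's id.
    pub fn push(&mut self, length: u32, element_start: u32) -> ArrayId {
        self.lengths.push(length);
        self.element_starts.push(element_start);
        ArrayId(self.lengths.len() - 1)
    }
}

/// A "System": reads only the `lengths` column, so the loop walks
/// sequential memory and never loads `element_starts`.
pub fn total_length(columns: &ArrayColumns) -> u64 {
    columns.lengths.iter().map(|&len| u64::from(len)).sum()
}
```

In a JS engine the "System" is whatever loop the user program happens to run; the bet described above is that such loops usually touch a few attributes of many objects created together.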
| eyelidlessness wrote:
| Disclaimer: I'm way out of my depth on the theoretical
| front, despite similarly taking interest in ECS in
| unconventional places. I'm responding from the
| perspective of most of my career being in JS/TS.
|
| I think your instincts about program structure are mostly
| right, but the outliers are pretty far out there.
|
| I'm much less optimistic about how you're framing
| arbitrary data access. In my experience, it's very common
| for JS code (marginally less common when authored as TS)
| to treat JSON (or other I/O bound data) as a perpetual
| blob of uncertainty. Data gets partially resolved into
| program interfaces haphazardly, at seemingly random
| points downstream, often with tons of redundancy and
| internal contradictions.
|
| I'm not sure how much that matters for your goals! But if
| I were taking on a project like this I'd be looking at
| that subset of non-ideal patterns frequently to reassess
| my assumptions.
| aapoalas wrote:
| Hey, thank you for the viewpoint. I'm myself a career
| JS/TS programmer as well, and I do appreciate that the
| lived reality is quite varied.
|
| The partial resolving and haphazardness of JSON data
| usage shouldn't matter too much. I don't mean for
| JSON-parsed objects to be some special class, per se, or
| for the memory layout to depend on access patterns on
| said data. Only, I force data that was created together
| to be close together in memory (this is what real
| production engines already do, but only if possible) and
| for that data to stay together (again, production engines
| do this but only as is reasonably possible; I force the
| issue). So I explicitly choose temporal coherence. Beyond
| that, I use interface inheritance / removal of structural
| inheritance to reduce memory usage. eg. Plain Arrays
| (used in the common way) I can push to 9 bytes or even 8
| bytes if I accept that Arrays with a length larger than
| 2^24 are always pessimised. ECS / Struct-of-Arrays data
| storage then further allows me to choose to move some
| data onto separate cache lines.
|
| But; it's definitely true that some programs will just
| ruin all reasonable access patterns and do everything
| willy-nilly and mixed up. I expect Nova to perform worse
| on those kinds of cases: as I am adding indirection to
| uncommon cases and splitting up data onto multiple cache
| lines to improve common access patterns, I do pessimise
| the uncommon cases further and further down the drain. I
| guess I just want to see what happens if I kick those
| uncommon cases to the curb and say "you want to be slow?
| feel free." :) I expect I will pay for that arrogance,
| and I look forward to that day <3
| nine_k wrote:
| Hmm, a compacting garbage collector that would try to put
| live data together, according to its access patterns,
| might be fun to consider. Along these lines, it could
| even split objects' attributes along ECS-friendly lines,
| working in concert with a profiler.
| aapoalas wrote:
| Nova's GC doesn't use access patterns for this, but this
| is basically what we do, or in some cases aim to do.
|
| Arrays, Objects, ArrayBuffers, Numbers, Strings, BigInts,
| ... all have their data allocated onto different heap
| vectors. These heap vectors will eventually be SoA
| vectors to split objects' attributes along ECS-friendly
| lines; eg. Array length might be split from the elements
| storage pointer, Object shape pointer split from the
| Object property storage pointer etc. Importantly, what we
| already do is that an Array does not hold all of an Object's
| attributes but instead holds an optional pointer to a
| "backing Object". If an Array is used like an Object (eg.
| `array.foo = "something"`) then a backing object is
| created and the Array's backing Object pointer is
| initialised to point to that data. Because we use a SoA
| structure, that backing Object pointer can be stored in a
| sparse column, meaning that Arrays that don't have a
| backing Object initialised also do not initialise the
| memory to hold the pointer.
|
| I'm also interested in maybe splitting Object properties
| so that they're stored in ECS-friendly lines (at least if
| eg. they're Objects parsed from an Array in JSON.parse).
|
| Our GC is then a compacting GC on these heap vectors
| where it simply "drops" data from the vector and moves
| items down to perform compaction. This also means it gets
| to perform the compaction in a trivially parallel manner
| <3
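The sparse backing-Object column described above might look roughly like this in Rust. It is a sketch under assumptions: `ArrayHeap` and its methods are hypothetical, and a `HashMap` stands in for whatever sparse structure a real engine would choose.

```rust
use std::collections::HashMap;

// Hypothetical sketch; a real engine would likely use a more compact
// sparse structure than `HashMap`.

#[derive(Debug, Default)]
pub struct ArrayHeap {
    /// Dense columns: one entry per Array.
    pub lengths: Vec<u32>,
    pub element_starts: Vec<u32>,
    /// Sparse column: entries exist only for Arrays that were used like
    /// Objects. The value is an index into a separate object vector.
    pub backing_objects: HashMap<u32, u32>,
}

impl ArrayHeap {
    /// Allocates an Array; nothing is stored in the sparse column.
    pub fn create(&mut self, length: u32, element_start: u32) -> u32 {
        self.lengths.push(length);
        self.element_starts.push(element_start);
        (self.lengths.len() - 1) as u32
    }

    /// Records that `array` now has the backing Object `object`.
    pub fn set_backing_object(&mut self, array: u32, object: u32) {
        self.backing_objects.insert(array, object);
    }

    /// Returns the backing Object index, if one was ever created.
    pub fn backing_object(&self, array: u32) -> Option<u32> {
        self.backing_objects.get(&array).copied()
    }
}
```

Arrays that are never used like Objects pay nothing for the column, at the price of a lookup when the backing Object is actually needed.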
| k__ wrote:
| I had the impression that ECS would boost performance mainly
| by allowing the systems to run in parallel on the
| entities. Isn't this kinda moot in a single threaded
| runtime?
| aapoalas wrote:
| This is definitely the more meaningful/influential
| performance benefit of ECS in game development, I
| believe. JavaScript will not allow for that as you point
| out. Perhaps a sufficiently crazy JIT might claw some of
| those benefits back, though? Not sure.
|
| But: the lesser but still impactful performance benefit
| of ECS is the usage of Struct-of-Array vectors for data
| storage. JavaScript can still ruin that benefit by always
| accessing all parts and features of an Object every time
| it touches one, but it is a less likely thing to happen.
| So, there is a benefit that JavaScript code itself can
| enjoy.
|
| Finally, there is one single "true System" in a
| JavaScript engine's ECS: the garbage collector. The GC
| will run through a good part of the engine heap, and you
| can fairly easily write it to be a batched operation
| where eg. "all newly found ordinary Objects" are iterated
| through in memory access order, have their mark checked,
| and then gather up their referents if they were unmarked.
| Rinse and repeat to find all live/reachable objects by
| constantly iterating mostly sequential memory in batches.
| This can also be parallelised, though then the batch
| queue needs to become shareable across threads.
|
| The sweep of the heap after this is then a True-True
| System where all items are iterated in order, unmarked
| ones are ignored, marked ones are copied to their post-
| compaction location, and any references they hold are
| shifted down to account for the locations of items
| changing post-compaction.
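The batched mark phase and the compacting sweep described above can be sketched over a single heap vector. This is a simplified illustration, not Nova's GC: `Object`, `mark`, and `sweep` are hypothetical names, there is only one object type, and roots are not remapped after compaction.

```rust
// Hypothetical sketch over a single heap vector; not Nova's actual GC.

#[derive(Debug, Clone, PartialEq)]
pub struct Object {
    /// Indices of other objects in the same heap vector.
    pub references: Vec<usize>,
}

/// Mark phase: processes the queue in batches, visiting each batch in
/// ascending index order so that memory is touched mostly sequentially.
pub fn mark(heap: &[Object], roots: &[usize]) -> Vec<bool> {
    let mut marks = vec![false; heap.len()];
    let mut batch: Vec<usize> = roots.to_vec();
    while !batch.is_empty() {
        batch.sort_unstable();
        let mut next = Vec::new();
        for &index in &batch {
            // Only newly found objects contribute their referents.
            if !marks[index] {
                marks[index] = true;
                next.extend_from_slice(&heap[index].references);
            }
        }
        batch = next;
    }
    marks
}

/// Sweep phase: drops unmarked items, moves marked ones down, and shifts
/// every surviving reference by the number of dropped items before it.
pub fn sweep(heap: &mut Vec<Object>, marks: &[bool]) {
    // shifts[i] = number of unmarked items at indices below i.
    let mut shifts: Vec<usize> = Vec::with_capacity(marks.len());
    let mut dropped: usize = 0;
    for &marked in marks {
        shifts.push(dropped);
        if !marked {
            dropped += 1;
        }
    }
    // `Vec::retain` visits elements in order, so `index` tracks position.
    let mut index = 0;
    heap.retain(|_| {
        let keep = marks[index];
        index += 1;
        keep
    });
    // Survivors only reference marked objects, so every shift is valid.
    for object in heap.iter_mut() {
        for reference in object.references.iter_mut() {
            *reference -= shifts[*reference];
        }
    }
}
```

Because each surviving item's new position depends only on the precomputed shift table, the copy and the reference fix-up could in principle be split across threads, which is the parallelism hinted at earlier in the thread.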
| chris37879 wrote:
| I'll be checking this project out! I'm a big fan of ECS and
| have lofty goals to use it for a data processing project
| I've been thinking about for a long time that has a lot in
| common with a programming language, enough that I've
| basically been considering it as one this whole time. So
| it's always cool to see ECS turn up somewhere I wouldn't
| otherwise expect it.
| aapoalas wrote:
| Hi, main developer of Nova here if you want to ask any questions!
| I'm at a choir event for the rest of the week though, so my
| answers may tarry a bit.
| glutamate wrote:
| This may be too early to ask, but are you targeting a near-v8
| level of performance? Or more like quickjs or duktape?
| aapoalas wrote:
| Of course, and thank you for taking the time to ask!
|
| For the foreseeable future the aim will be rather on the
| QuickJS/Duktape level than beating V8. But! That is only
| because they need to be beaten before V8 can be beaten :)
|
| I'm not rushing to build a JIT, and I don't even have exact
| plans for one right now but I'm not ruling it out either.
|
| If my life, capabilities, and external support enable it then
| I do want Nova to either supplant existing mainstream
| engines, or inspire them to rethink at least some of their
| heap data structures. But of course it is fairly unlikely I
| will get there; I will simply try.
| Permik wrote:
| Gz on your grant! I must have missed the announcement, but
| working on OSS for a living (even for just a bit) would be
| _super_ awesome.
| aapoalas wrote:
| Thank you! I've been a bit bad with announcing it; blog post
| was a month late and all that. But indeed, it's really cool
| to be able to do this for half a year!
| eviks wrote:
| Given the fact that you were so precise in your time estimate
| on interleaved garbage collection, how long do you think it
| would take to get to 99% of the tests?
| aapoalas wrote:
| Haha, I think that was a one time fluke! :D
|
| I'm aiming for something like 75-85% this year; basically get
| iterators properly done (they're in the engine but not very
| complete yet), implement ECMAScript modules, and then mostly
| focus on correctness, builtins, and performance improvements
| after that. 99% would perhaps be possible by the end of next
| year, barring unforeseeable surprises.
| kumavis wrote:
| have you considered using js polyfills to help you get
| closer to 100% coverage and then replacing with native
| implementations prioritized by performance impact?
| aapoalas wrote:
| Not really, no. It's an interesting proposition, but for
| the most part I believe I'll be sticking it out the "hard
| way". The ECMAScript spec is fairly easy to read as well,
| after all. (Never mind that I spent the single free hour I
| had today cursing at my incapability of understanding
| what is going wrong with my iterator code and what it
| even should do vis-a-vis the spec :D )
| afavour wrote:
| FYI I'm getting an SSL certificate error trying to load the
| site.
| eliassjogreen wrote:
| It's hosted by GitHub pages with Cloudflare DNS so any issues
| are probably related to that.
| pvg wrote:
| Show HN thread a few months ago
| https://news.ycombinator.com/item?id=42168166
| Ericson2314 wrote:
| More ways for Servo to be all-Rust, OK!
| aapoalas wrote:
| That is one explicit goal, maybe next year realistically: Servo
| has asked for help making their JS engine bindings layer
| modular, and I have a self-serving interest in helping achieve
| that :)
| Ericson2314 wrote:
| Nice to hear!
| ComputerGuru wrote:
| OP, since you're here in the comments can you talk about the
| binary and memory size and sandboxing support? Ability to import
| and export functions/variables across runtime boundaries? Is this
| a feasible replacement for Lua scripting in a Rust application?
| aapoalas wrote:
| Hmm, sorry, I'm not sure what you mean.
|
| The engine is written with a fair bit of feature flags to
| disable more complicated or annoying JS features if the
| embedder so wants: it is my aim that this would go quite deep
| and enable building a very slim, simple, and easily self-
| optimising JS engine through this.
|
| That could then perhaps truly serve as an easy and fast
| scripting engine for embedding use cases.
| ComputerGuru wrote:
| That answers half my question (eg disable networking), thank
| you. The other part was about the overhead of adding this to
| an app (startup memory usage and increase in binary size) and
| how much work has been done on interop so that you can
| execute a static Rust function Foo() passing in a Rust
| singleton Bar, or accessing properties or methods on a Rust
| singleton Baz, i.e. calling whitelisted Rust code from within
| the JS env (vice-versa is important but that's possible by
| default simply by hard-coding a JS snippet to execute, though
| marshaling the return value of a JS function without
| (manually) using JSON at the boundary is also a nice QOL
| uplift).
| progval wrote:
| > written with a fair bit of feature flags
|
| I see you use Cargo features for this. One thing to be aware
| of is Cargo's feature unification (https://doc.rust-
| lang.org/cargo/reference/features.html#feat...), ie. if an
| application embeds crate A that depends on nova_vm with all
| features and crate B that depends on nova_vm without any
| security-sensitive features like shared-array-buffer (eg.
| because it runs highly untrusted JavaScript), then
| interpreters spawned by crate B will still have all features
| enabled.
|
| Is there another way crate B can tell the interpreter not to
| enable these features for the interpreters it spawns itself?
| ComputerGuru wrote:
| Nice catch, thanks for pointing that out! This also might
| be less than ideal if it's the only option (rather than in
| addition to a runtime startup flag or a per-
| entrypoint/execution flag) because one could feasibly want
| to bundle the engine with the app with features x, y, and z
| enabled but only allow some scripts to execute with a
| subset thereof while running different scripts with a
| different subset.
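One way to picture the runtime flag suggested above is a per-interpreter options value that can only narrow what was compiled in. This is a sketch of the idea, not nova_vm's API: `EngineFeatures`, its field names, and `restricted_to` are all hypothetical. In a real crate the "compiled" value would presumably be built from `cfg!(feature = "...")` checks.

```rust
// Hypothetical sketch; `EngineFeatures` and `restricted_to` are
// illustrative names and not part of nova_vm's API.

/// A set of optional engine features. The field names are placeholders.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct EngineFeatures {
    pub shared_array_buffer: bool,
    pub weak_refs: bool,
}

impl EngineFeatures {
    /// Treats `self` as the compiled-in set (which, after Cargo's feature
    /// unification, is the union of what every dependent crate asked for)
    /// and returns the subset that `requested` also allows. A feature is
    /// usable only if it was compiled in AND requested at runtime.
    pub fn restricted_to(self, requested: EngineFeatures) -> EngineFeatures {
        EngineFeatures {
            shared_array_buffer: self.shared_array_buffer
                && requested.shared_array_buffer,
            weak_refs: self.weak_refs && requested.weak_refs,
        }
    }
}
```

With such a value passed at interpreter creation, crate B could request a restricted set for untrusted scripts even when crate A caused every feature to be compiled in.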
___________________________________________________________________
(page generated 2025-05-29 23:00 UTC)