[HN Gopher] I have written a JVM in Rust
       ___________________________________________________________________
        
       I have written a JVM in Rust
        
       Author : lukastyrychtr
       Score  : 604 points
       Date   : 2023-07-21 08:48 UTC (14 hours ago)
        
 (HTM) web link (andreabergia.com)
 (TXT) w3m dump (andreabergia.com)
        
       | celeritascelery wrote:
       | I have a few questions about the garbage collection. One of the
       | hard parts of implementing a garbage collector is making sure
       | everything is properly rooted (especially with a moving
       | collector). You have the `do_garbage_collection` method marked
       | unsafe[1], but don't explain what the calling code needs to do to
       | ensure it is safe to call. How do you ensure all references to
       | the heap are rooted? This is not a trivial problem[2][3][4].
       | 
       | Also note that I cloned the repo and tried to run `cargo test`;
       | every test fails with 'should be able to add entries to the
       | classpath: InvalidEntry(".../vm/rt.jar")'
       | vm/tests/integration/real_code_tests.rs:15:10
       | 
       | [1]
       | https://github.com/andreabergia/rjvm/blob/be9c54066c64a82879...
       | 
       | [2] https://manishearth.github.io/blog/2021/04/05/a-tour-of-
       | safe...
       | 
       | [3] https://without.boats/blog/shifgrethor-iii/
       | 
       | [4] https://coredumped.dev/2022/04/11/implementing-a-safe-
       | garbag...
        
         | munificent wrote:
         | It's pretty straightforward. Their VM maintains its own notion
         | of a callstack instead of using the native callstack. That lets
         | them iterate over it and find all of the parameters and locals
         | on the VM's callstack and use them as roots.
         | 
         | There is a performance cost for a VM having its own virtual
         | callstacks like this, but it makes GC tracing much simpler. (It
         | also makes implementing interesting concurrency and control
         | flow primitives like coroutines or continuations much easier
         | too.)
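A minimal sketch of the idea described above, assuming a toy VM whose frames are ordinary heap data. All type and method names here (`HeapObject`, `Frame`, `ToyVm`, `markReachable`) are hypothetical and are not taken from rjvm. Because the VM owns its call stack, the mark phase can walk every frame and treat each local as a root:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical toy types for illustration only.
final class HeapObject {
    final String label;
    final List<HeapObject> fields = new ArrayList<>();
    HeapObject(String label) { this.label = label; }
}

final class Frame {
    // Locals live in VM-owned memory, so the collector can enumerate them.
    final List<HeapObject> locals = new ArrayList<>();
}

final class ToyVm {
    final Deque<Frame> callStack = new ArrayDeque<>();

    // Mark phase: every local of every VM frame is a root.
    Set<HeapObject> markReachable() {
        Set<HeapObject> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<HeapObject> pending = new ArrayDeque<>();
        for (Frame frame : callStack) {
            for (HeapObject local : frame.locals) {
                if (local != null) pending.push(local);
            }
        }
        // Trace outgoing references from the roots.
        while (!pending.isEmpty()) {
            HeapObject obj = pending.pop();
            if (marked.add(obj)) {
                for (HeapObject field : obj.fields) {
                    if (field != null) pending.push(field);
                }
            }
        }
        return marked;
    }
}

final class RootScanDemo {
    public static void main(String[] args) {
        ToyVm vm = new ToyVm();
        Frame frame = new Frame();
        HeapObject a = new HeapObject("a");
        HeapObject b = new HeapObject("b");
        HeapObject orphan = new HeapObject("orphan"); // referenced by no frame
        a.fields.add(b);       // b is reachable only through a
        frame.locals.add(a);   // a is a local, hence a root
        vm.callStack.push(frame);
        Set<HeapObject> live = vm.markReachable();
        System.out.println(live.size()); // a and b are marked; orphan is not
    }
}
```

This only covers references held in VM frames; references held by host-language code on the native stack are invisible to such a scan.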
        
           | celeritascelery wrote:
           | Seems like that would take care of roots for the bytecodes
           | themselves, but not for "native" functions[1]. Allocating a
           | new object could call gc[2], and native functions are using
           | the native callstack. It seems like it would be easy to
           | allocate in a native function and any unrooted references
           | would be invalidated. In fact I see a case like that here[3].
           | That method creates a reference with
           | `expect_concrete_object_at` and then calls gc with
           | `new_java_lang_class_object`. It avoids UB by not using `arg`
           | after the call that gc's, but there is nothing stopping you
           | from using `arg` again (and having an invalid reference).
           | 
           | [1] https://github.com/andreabergia/rjvm/blob/main/vm/src/nat
           | ive...
           | 
           | [2] https://github.com/andreabergia/rjvm/blob/be9c54066c64a82
           | 879...
           | 
           | [3] https://github.com/andreabergia/rjvm/blob/be9c54066c64a82
           | 879...
        
             | andreabergia wrote:
             | Indeed you are right, this is definitely a bug and could
             | cause errors.
             | 
             | I guess the solution would be to add an explicit API to
             | create a GC root, invoked by native methods (which is a bit
             | complicated by the fact that I use a moving collector).
             | 
             | Many years ago I was using SpiderMonkey in a c++ project
             | and I seem to remember there were some APIs for native
             | callbacks to invoke that rooted values. Same problem and
             | similar solution. :-)
        
               | munificent wrote:
               | _> I guess the solution would be to add an explicit API
               | to create a GC root, invoked by native methods (which is
               | a bit complicated by the fact that I use a moving
               | collector)._
               | 
               | This is what I do in the Wren VM. Any time a native C
               | function has the only reference to a GC-managed object
               | and it's possible for a collection to occur, it calls a
               | function to temporarily add the object to a list of known
               | roots.
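The temporary-root approach can be sketched roughly as follows. This is an illustration under stated assumptions, not Wren's or rjvm's actual API: `TempRoots`, `pin`, `snapshot`, and `NativeCallDemo` are hypothetical names. Native code pins a value before doing anything that might trigger a collection, and the pin is released when the scope ends:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative only.
final class TempRoots {
    private final List<Object> roots = new ArrayList<>();

    // Pins a value until the returned scope is closed.
    Scope pin(Object value) {
        roots.add(value);
        return new Scope(roots.size() - 1);
    }

    // A collector would treat every entry in this list as a root.
    List<Object> snapshot() {
        return new ArrayList<>(roots);
    }

    final class Scope implements AutoCloseable {
        private final int index;
        private Scope(int index) { this.index = index; }

        @Override
        public void close() {
            // Scopes close in LIFO order; unwind everything pinned at or after this one.
            while (roots.size() > index) {
                roots.remove(roots.size() - 1);
            }
        }
    }
}

final class NativeCallDemo {
    // Returns how many temporary roots are visible while the "native" code runs.
    static int rootsDuringAllocation(TempRoots temp) {
        Object arg = new Object(); // stands in for a GC-managed reference held only by native code
        try (TempRoots.Scope scope = temp.pin(arg)) {
            // An allocation here could trigger a collection; arg is visible as a root.
            return temp.snapshot().size();
        }
    }
}
```

With a moving collector, pinning alone is not enough: the native code would also need to re-read the reference through the root list (or a handle) after any call that may collect, since the object may have been relocated.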
        
         | amelius wrote:
         | GCs are pretty easy, and just a matter of good accounting. That
         | is, until you start doing concurrent GC; then it becomes
         | hellishly difficult.
        
       | encody wrote:
       | Missed opportunity to call it Just
        
         | [deleted]
        
         | orlp wrote:
         | Just is already 'taken' in the Rust community as it is a
         | command runner utility like make: https://github.com/casey/just
        
           | 1letterunixname wrote:
           | Haha. Shout out to homeboy Xoogler Rodarmor.
        
           | entropicdrifter wrote:
           | Ooh, I like that. `just build` feels like a good plea to make
           | to the Rust compiler lol
        
         | brunoborges wrote:
         | JustVM
        
       | Agingcoder wrote:
       | I'm doing a (free) operating system (just a hobby, won't be big
       | and professional like gnu) for 386 (486) AT clones.
       | 
       | :-)
        
         | 1letterunixname wrote:
         | There's a no-std tutorial on how to write a demo kernel in
         | Rust. https://os.phil-opp.com
         | 
         | osdev.org, sandpile.org, RBIL, and freevga. The biggest PITA is
         | hardware support. There are many good vintage hardcopy books
         | with recipes for things like reliable port IO and undocumented
         | hardware tricks.
         | 
         | - Intel(r) 64 and IA-32 Architectures Software Developer's
         | Manual Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and
         | 4
         | 
         | - Microsoft MS-DOS Programmer's Reference (also includes real-
         | mode BIOS calls)
         | 
         | - PC Interrupts
         | 
         | - Undocumented PC
         | 
         | - PC Intern
         | 
         | - Programmer's Guide To The EGA, VGA, And Super VGA Cards
         | 
         | - Graphics Programming Black Book Special Edition
         | 
         | Also, it's worth toying with advances in OS dev past the era of
         | monolithic, microkernel, and hybrid.
         | 
         | 1. Capability-based like seL4. It has a number of inherent
         | performance and security advantages including capabilities and
         | excellent IPC.
         | 
         | 2. POSIX compatibility layer. Even embedded OSes without the
         | concept of threads or processes can implement POSIX.
         | 
         | 3. Hypervisor. They're much easier to add with Intel's VT-[xd].
         | Failing that, fall back to emulation. Translational emulation
         | is very performant.
         | 
         | 4. Get good at generalizing interrupt handlers, making them
         | fast, avoiding race conditions, and using lock-free patterns.
         | 
         | Also:
         | 
         | 5. Rewriting or trapping unsupported instructions including x87
         | and MMX.
         | 
         | 6. The failure of pure microkernel was the added complexity and
         | management of sequencing multiple resources in a transactional
         | manner. There are great theoretical security and operational
         | advantages in microkernel architectures but they never caught
         | on widely in a pure form.
        
       | cmrdporcupine wrote:
       | Great learning project, I'm glad the author is having fun.
       | Implementing a VM from scratch is a blast, and I have learned so
       | much in the past doing that kind of thing.
       | 
       | If they're interested in bolting on a GC, it couldn't hurt to
       | look at MMtk. (https://www.mmtk.io/) Some high quality collection
       | algorithms, written to be pluggable to various VMs, and written
       | in Rust.
        
         | sbt567 wrote:
         | Uh, first time hearing mmtk. Thanks for the link!
        
           | cmrdporcupine wrote:
           | I only became aware of it because a former employer
           | (RelationalAI) was heavily interested in replacing Julia's GC
           | with it (for some workloads):
           | https://pretalx.com/juliacon2023/talk/BMBEGY/
        
         | celeritascelery wrote:
         | Note that MMTK is x86 only. I was going to use it for a toy
         | project but I have a Mac.
        
           | cmrdporcupine wrote:
           | That's a bummer -- I guess I never noticed that. When I
           | played with it before, it was on an M1 Mac, but compiled into
           | an x86 Julia executable & running through Rosetta (which
           | surprisingly did not suck).
           | 
           | Haven't read, but I bet it's likely related to expectations
           | around the x86_64 memory model & atomics. In the long run I
           | see no reason why it couldn't be made portable, but I imagine
           | the authors' efforts are elsewhere for now.
        
       | nine_k wrote:
       | This project is like the ground floor of the JVM, not the whole
       | tower. I like though how the project's page is direct and clear
       | about that.
       | 
       | A lot of the foundation and ground-floor mechanics are pretty
       | interesting though.
        
       | ChuckMcM wrote:
       | That is pretty awesome! When I joined the Java effort in '92
       | (called Oak at the time) the group I was with was looking at
       | writing a full OS in Java. The idea being that you could get to
       | just the minimal set of things needed as "machine code" (aka
       | native methods) you could reduce the attack surface of an
       | embedded OS. (originally Java was targeted to run in things like
       | TV's and other appliances). We were, of course, working in C
       | rather than Rust for the native methods. The JVM in Rust though
       | adds a solid level of memory safety to the entire process.
        
         | mshockwave wrote:
         | > writing a full OS in Java
         | 
         | IMAO, Android kind of achieves that...kind of. They write lots
         | of OS logic in Java (or Kotlin) but mix in lots of system
         | services written in native code at the same time,
         | interconnected by the famous (or infamous?) Binder IPC.
        
           | ActorNightly wrote:
           | Android is mostly things that run java, not java itself. You
           | can look at the source code, there is a relatively small
           | amount of java in there.
        
           | soperj wrote:
           | It's a modified version of the linux kernel no? That'd be
           | mostly C.
        
             | kaptainscarlet wrote:
             | He might be referring to system services that manage
             | hardware devices, etc. (e.g. BatteryService, RTC).
        
           | grishka wrote:
           | Android isn't _conventional_ Java. For starters, its runtime
           | uses its own bytecode (dex) that's based on registers
           | instead of a stack. But then, also, many things that aren't
           | related to GUI are C++ with a thin Java wrapper on top.
           | 
           | When I think about a "Java OS", I imagine a JVM running in
           | kernel mode, providing minimal OS functionality (scheduler,
           | access to hardware I/O ports) and there not being any kind of
           | userspace.
        
           | tgtweak wrote:
           | Embedded JVM is actually huge/pervasive and runs things as
           | benign as the chip on your credit card.
        
             | ActorNightly wrote:
             | By the virtue of being the most convenient alternative, not
             | because it's actually good.
        
               | lazide wrote:
               | How is that not actually good?
               | 
               | Engineering is all about tradeoffs, and 'works' is pretty
               | high praise frankly.
        
               | palata wrote:
               | Still better than ElectronJS, I would say
        
             | lldb wrote:
             | Every blu ray player runs java for bonus features on the
             | disk- they can even connect to the internet!
        
               | pseudosavant wrote:
               | The very first feature I disable on every Blu-ray player
               | I've ever used.
        
             | locusofself wrote:
             | After reading your comment I was surprised to find out that
             | credit card chips have any processing capabilities
             | whatsoever, which they apparently do, though at least
             | according to gpt4, they are far too basic to run java/jvm.
             | ?
        
               | cayley_graph wrote:
               | Google appears to be significantly more useful than GPT-4
               | here. [1] is the third result for me for the query
               | "credit card jvm". [2] is the second result and gives a
               | direct (and more importantly, actually correct) answer.
               | That post links to the Oracle documentation for Java
               | Cards [3] which is the fourth result.
               | 
               | [1] https://en.m.wikipedia.org/wiki/Java_Card
               | 
               | [2] https://superuser.com/questions/362567/are-there-any-
               | credit-...
               | 
               | [3] https://www.oracle.com/java/java-card/
               | 
               | All of this is just as easy as, if not easier than, using
               | ChatGPT. It's unclear that such a tool even serves this
               | purpose (retrieval of basic facts) adequately, so it
               | should probably be avoided in the future.
        
               | locusofself wrote:
               | Fair enough, there is a "Java Card". I'm not convinced
               | that Java is running on any of my or your credit cards in
               | your wallet today, though I'm not willing to bet on it.
        
               | arllk wrote:
               | It's running on many e-Passports and e-ID cards. I can't
               | find the documentation from my e-ID card which runs on
               | Java, but the chips are quite common like in
               | 
               | https://www.cardlogix.com/product/cardlogix-credentsys-
               | lite-...
               | 
               | And on another source:
               | 
               | Visa became the first large payment company to license
               | JavaCard. Visa mandated JavaCard for all of Visa's
               | smartcard payment cards. Later, MasterCard acquired
               | Mondex, and Peter Hill joined as their CTO, licensed
               | JavaCard, and ported the Mondex payment platform to
               | JavaCard.
               | 
               | Source: https://javacardforum.com/2022/07/28/the-birth-
               | of-javacard/
        
               | cayley_graph wrote:
               | This lists a few things using Java Cards (likely stuff
               | that's in your wallet, surprisingly):
               | https://stackoverflow.com/questions/47731005/practical-
               | use-o...
               | 
               | (fwiw I have a bit of prior experience here)
        
               | locusofself wrote:
               | That's pretty neat. It sounds like Java Card (or at least
               | some other cards) actually "boot up" by way of inductive
               | coupling, ie, via the "contactless" card readers where
               | you just hold your card in proximity to the reader thing.
               | Did not know that, I assumed it was just reading a key
               | via NFC or something.
        
               | pests wrote:
               | This is a classic defcon talk about it. They developed
               | their own cell network with their own sim cards for an
               | event and even built custom JavaCard applications their
               | users could use. They have released all their information
               | and tools used to build and compile Java Card software.
               | 
               | https://www.youtube.com/watch?v=_-nxemBCcmU
        
               | Sharlin wrote:
               | The whole reason there's a thing called a chip in the
               | card is that it actually does computing (indeed they're
               | called smart cards) and that it does the sort of
               | computing (cryptographic challenge-response) that makes
               | these cards much more secure than oldschool magnetic
               | stripe cards.
               | 
               | Even fully passive NFC tags contain logic that needs
               | power to talk NFC back to the reader, there's no such
               | thing as just reading data via NFC.
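To illustrate what "cryptographic challenge-response" means in general, here is a small sketch using an HMAC over a random challenge. This is an assumption chosen for brevity: real payment cards follow their own protocols and key schemes, and the class and method names below are hypothetical. The point is that the card proves it holds a secret without ever transmitting it, which a magnetic stripe cannot do:

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of a generic challenge-response exchange.
final class ChallengeResponseDemo {
    // Card side: answer the challenge using the secret key stored on the chip.
    static byte[] respond(byte[] cardKey, byte[] challenge) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(cardKey, "HmacSHA256"));
            return mac.doFinal(challenge);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Verifier side: recompute the expected answer and compare without early exit.
    static boolean verify(byte[] issuerKey, byte[] challenge, byte[] response) {
        return MessageDigest.isEqual(respond(issuerKey, challenge), response);
    }

    public static void main(String[] args) {
        byte[] key = "demo-key-not-a-real-secret".getBytes(StandardCharsets.UTF_8);
        byte[] challenge = new byte[16];
        new SecureRandom().nextBytes(challenge); // fresh per transaction, so old responses cannot be replayed
        byte[] response = respond(key, challenge);
        System.out.println(verify(key, challenge, response));
    }
}
```

A copied response is useless for the next transaction because the next challenge will differ.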
        
               | xorcist wrote:
               | Just wait until you find out about the SIM card in your
               | phone!
               | 
               | It's almost bizarre what that thing does.
        
               | unintendedcons wrote:
               | In what world did you convince yourself asking a chatbot
               | was a source of real knowledge?
               | 
               | respect yourself enough to look at primary sources
        
               | locusofself wrote:
               | How about respect other people instead of rushing to
               | condescending judgement? the stakes are incredibly low
               | here, I asked ChatGPT for fun, like tens of millions of
               | others are doing every day.
        
               | refulgentis wrote:
               | For some reason you announced that you were subjecting us
               | to a low quality information retrieval method, and after
               | 6 months of this, people are irritable. The social norm
               | is to do that sort of thing in private; doing it in
               | public and seemingly proudly came across as coarse and
               | impolite.
               | 
               | It didn't help any that it was clear from the initial
               | post you were questioning someone with domain knowledge,
               | which was later gently indicated to you
        
               | tjlingham wrote:
               | It feels like that should be true, I get it. However Java
               | Card is very real.
               | 
               | https://en.m.wikipedia.org/wiki/Java_Card
        
             | Isolus wrote:
             | Also many SIM cards (UICCs) / embedded SIM modules as well
             | as e.g. the Secure Element that Samsung uses for Knox run
             | with Java Card.
        
         | pjmlp wrote:
         | Besides the sibling comments,
         | 
         | - SavageJE
         | 
         | - microEJ
         | 
         | - PTC and Aonix bare metal Java runtimes
         | 
         | - SunSPOT with SquawkVM
        
         | spullara wrote:
         | There was one for a while though wasn't really targeted at
         | users:
         | 
         | https://en.wikipedia.org/wiki/JavaOS
        
       | techn00 wrote:
       | See also https://jacobin.org/ for JVM 17 written in Go.
        
         | xmcqdpt2 wrote:
         | Also https://github.com/lihaoyi/Metascala for a JVM implemented
         | in Scala running on the JVM.
        
           | dimgl wrote:
           | Seems... redundant, no?
        
             | mike_hearn wrote:
             | Nope. For a more realistic example of such a JVM, look at
             | SubstrateVM (written in "SystemJava" and compiled to native
             | code ahead of time along with the app it runs), and "Java
             | on Truffle" (a.k.a. Espresso), which is a JVM written in
             | Java designed to be compiled to run on top of SubstrateVM.
             | Both projects are a part of Graal.
             | 
             | The reason to do this, beyond the inherently neat Inception
             | factor, is that JVMs are a PITA to work on because they're
             | normally written in languages like C++ or Rust which
             | optimize for performance and manual control over
             | productivity. That makes it hard to experiment with new JVM
             | features or changed semantics. If you could write a JVM in
             | a high level very productive language like Java (or Kotlin
             | or Scala) then the productivity of people writing and
             | experimenting with JVMs would go up. It would also make it
             | feasible for "ordinary" Java devs to actually fork the JVM
             | and modify it to better suit their app, at least in some
             | cases.
             | 
             | There's also something conceptually cleaner about having a
             | language and its runtime implemented purely in itself. As
             | long as you don't mind the circularity, that is.
             | 
             | Espresso for example has hot-swap features HotSpot doesn't
             | have, so you can modify your program as it's running in
             | more flexible ways than what regular Java allows.
        
               | bfrog wrote:
               | I find that Rust is like maybe 1.5-2x more productive to
               | code in than say C or C++. Part of that is the tooling
               | has so much less arcane baggage, part of that is that I
               | need to reach less for external tools for
               | metaprogramming, part of that is fewer crazy
               | macro/template compiler errors, and part of that is less
               | time spent debugging.
               | 
               | It all adds up.
        
               | mike_hearn wrote:
               | I've heard very inconsistent things about this, which is
               | interesting. Often people say Rust is _less_ productive,
               | as there's often "makework" involved with satisfying the
               | borrow checker. I suspect a lot of it revolves around how
               | you perceive that sort of thing: it can be cast as both
               | productivity (satisfying it can potentially rule out
               | bugs) or a loss of productivity (you were already
               | satisfied the code was correct).
               | 
               | But I don't have enough experience with Rust to really
               | have formed an opinion on that yet.
        
               | pjmlp wrote:
               | And Jikes RVM as well.
        
               | nerpderp82 wrote:
               | Jikes is extremely popular in academia for being the
               | basis of VM research because it is so easy to modify.
               | 
               | https://github.com/JikesRVM/JikesRVM
        
             | dgb23 wrote:
             | apply eval
        
             | cmrdporcupine wrote:
             | _" The goal of Metascala is to create a platform to
             | experiment with the JVM: a 3000 line JVM written in Scala
             | is probably much more approachable than the 1,000,000 lines
             | of C/C++ "_
             | 
             | Seems like a reasonable goal.
        
               | RcouF1uZ4gsC wrote:
               | I seriously doubt the ratio of Scala vs C++ for
               | implementing the JVM is 1:300.
        
               | patrec wrote:
               | He's not saying that. What he is saying is that a
               | simple, non-production quality implementation in Scala is
               | much more amenable to experimentation than a
               | sophisticated, production-quality implementation in C++
               | that weighs in at 300x the LOC.
        
               | RcouF1uZ4gsC wrote:
               | But a simple non-production quality implementation in C++
               | would also be amenable to experimentation and not have
               | the bootstrapping issues as well as provide an easier
               | starting point to incorporate more of the existing
               | optimizations as desired.
        
               | valenterry wrote:
               | Probably not, because JVM users are much more likely to
               | be more proficient in Scala/Java than in C++.
        
               | eindiran wrote:
               | That is true, but the author of Metascala wanted to write
               | it in Scala. Other people are free to write a simple C++
               | implementation of the JVM themselves.
        
               | cempaka wrote:
               | I'm sure it's not complete, but I also wouldn't be
               | surprised if 99% of what's in HotSpot is optimization
               | tweaks and performance boosts which aren't essential to
               | the JLS.
        
         | leshow wrote:
         | That is a very interesting name for a programming project lol.
         | The Jacobins were a revolutionary political club during the
         | French Revolution in the 1790's. It's also the name of a
         | magazine at https://jacobin.com
        
           | snordgren wrote:
           | It starts with the letters "ja", that's all that matters
           | for a Java-related project.
        
       | haspok wrote:
       | Nice project, congrats!
       | 
       | One thing struck me as a bit odd:
       | 
       | > In particular, it does not support: generics
       | 
       | What kind of support is there for generics in the JVM? Maybe I'm
       | too naive to assume that due to type erasure on bytecode level
       | everything is just an Object, ie. a reference type? Or do you
       | mean the class definition parser - but then, you don't really
       | have any checks in place to see if the class file is valid (other
       | than the basic syntax)?
        
         | newmana wrote:
         | They might be talking about the checkcast operation:
         | https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-6.ht...
         | 
         | This is generated when you do something like: final Main value
         | = list.get(0);
         | 
         | http://henrikeichenhardt.blogspot.com/2013/05/how-are-java-g...
        
           | xxs wrote:
           | The cast is added by javac, so it just needs to verify the
           | object on the stack to be compatible w/ the provided class.
           | That part is very simple.
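A short illustration of the point about erasure and `checkcast`, assuming nothing beyond the standard library (the class name `ErasureDemo` is made up for this sketch). The list itself accepts any object at runtime; the type check only happens at the use site, where javac inserted a cast:

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean castFailsAtUseSite() {
        List<String> strings = new ArrayList<>();
        List raw = strings;            // same object; the element type is erased at runtime
        raw.add(Integer.valueOf(42));  // succeeds: the bytecode only deals with Object
        try {
            String s = strings.get(0); // javac emits a checkcast to String here
            return false;
        } catch (ClassCastException e) {
            return true;               // the VM rejected the Integer at the cast, not at add()
        }
    }

    public static void main(String[] args) {
        System.out.println(castFailsAtUseSite()); // prints "true"
    }
}
```

So a VM that implements `checkcast` correctly needs no further generics machinery to execute code like this.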
        
         | andreabergia wrote:
         | Thanks!
         | 
         | About the generics - some people have pointed out the same on
         | reddit, and yeah, you are correct. The only thing that should
         | be done is to read the Signature attribute that encodes the
         | generic information about classes, methods, and fields
         | (https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.ht...)
         | 
         | As a matter of fact, I just did a test and the following code
         | works! :-)
         | 
         |     public class Generic {
         |         public static void main(String[] args) {
         |             List<String> strings = new ArrayList<String>(10);
         |             strings.add("hey");
         |             strings.add("hackernews");
         |             for (String s : strings) {
         |                 tempPrint(s);
         |             }
         |         }
         | 
         |         private static native void tempPrint(String value);
         |     }
        
         | xxs wrote:
         | pretty much this - generics have (rare) implications to the
         | reflection (but it's unsupported as well) but overall they are
         | replaced with the nearest class/interface when compiled.
         | 
         | OTOH lack of string interning is super strange [it's trivial to
         | implement], and w/o it JVM is not a thing. String being equal
         | by reference is important, and part of JLS.
         | 
         | Lack of threads makes the entire endeavor a toy project.
        
           | 3cats-in-a-coat wrote:
           | Java strings are compared by reference, if they do not match,
           | they're compared by value. There's no guarantee every single
           | string has a single instance. That would hurt performance.
        
             | maverwa wrote:
             | I think op meant "String literals". For those the spec
             | seems to require interning:
             | 
             | > Moreover, a string literal always refers to the same
             | instance of class String. This is because string literals -
             | or, more generally, strings that are the values of constant
             | expressions (SS15.28) - are "interned" so as to share
             | unique instances, using the method String.intern.
             | 
             | And later:
             | 
             | > Literal strings within different classes in different
             | packages likewise represent references to the same String
             | object.
             | 
             | Source: https://docs.oracle.com/javase/specs/jls/se8/html/j
             | ls-3.html...
             | 
             | But that does - as far as I can see - say nothing for non-
             | literal strings.
             | 
             | [edit]: formatting
        
               | 3cats-in-a-coat wrote:
               | Thanks, makes sense yes. Still if the JVM lookup in all
               | cases defers to value after ref mismatch, it should work
               | identically, no? Even if interning is mandatory as per
               | spec, I'm not sure how it'd change the outcome of
               | evaluation.
        
               | maverwa wrote:
               | yeah, I'd assume as much. If it indeed falls back to a
               | by-value comparison it would be slower, but should work.
        
               | xxs wrote:
               | nope - it'd be plain wrong. Literals must be equal by
               | reference, comparing them by value would just break JLS,
               | as they would be equal to any other composed string by
               | reference as well.
        
               | robertlagrant wrote:
               | Yes - that is a performance optimisation. I don't think
               | comparing everything by value makes or breaks the
               | implementation.
        
               | xxs wrote:
               | >I don't think comparing everything by value makes or
               | breaks the implementation.
               | 
               | Nothing much to think -- distinct objects must have
               | distinct references [e.g. new String("a")!=new
               | String("a")], literals must have the same references for
               | the same values [e.g. "a"=="a"].
        
               | thfuran wrote:
               | It affects the result of ==, which is only a reference
               | comparison.
        
               | simiones wrote:
               | The problem is that, per the spec, the following _must_
               | hold:
               | 
               |     if("abc" == "abc") { System.out.println("correct"); }
               |     if(new String("abc") != new String("abc")) { System.out.println("correct"); }
               | 
               | So, not having proper string interning support means that
               | you mis-execute certain programs.
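One way a VM could satisfy both conditions is an intern table consulted when loading string constants, and bypassed for explicit allocation. The sketch below is a guess at the general shape, not rjvm's design; `VmString`, `InternPool`, `literal`, and `allocate` are hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a VM's heap-allocated java.lang.String instance.
final class VmString {
    final String chars;
    VmString(String chars) { this.chars = chars; }
}

final class InternPool {
    private final Map<String, VmString> pool = new HashMap<>();

    // Loading a string constant: one shared instance per distinct value.
    VmString literal(String chars) {
        return pool.computeIfAbsent(chars, VmString::new);
    }

    // Executing `new String(...)`: always a fresh, distinct object.
    VmString allocate(String chars) {
        return new VmString(chars);
    }
}

final class InternPoolDemo {
    public static void main(String[] args) {
        InternPool pool = new InternPool();
        // Mirrors "abc" == "abc": both loads resolve to the same reference.
        System.out.println(pool.literal("abc") == pool.literal("abc"));
        // Mirrors new String("abc") != new String("abc"): two separate objects.
        System.out.println(pool.allocate("abc") != pool.allocate("abc"));
    }
}
```

In a VM with a garbage collector, the pool's entries would also have to be treated as roots (or held weakly), and with a moving collector the table would need updating when objects relocate.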
        
               | kelnos wrote:
               | Sure, but also consider that this JVM (intentionally)
               | lacks support for other things that all but the most
               | trivial programs would use. I don't think it's expected
               | by the author that you can throw any random program at
               | it. It's really there just to run your own programs that
               | you've written specifically for it in order to play
               | around with things. And since you know you're writing for
               | this particular JVM, you should know not to do anything
               | that depends on string interning, among other things.
        
               | xxs wrote:
               | pretty much indeed.
               | 
               | > say nothing for non-literal strings
               | 
               | yes, of course.
        
           | senorrib wrote:
           | "I want to stress that this is a toy JVM, built for learning
           | purposes and not a serious implementation."
        
             | xxs wrote:
             | The rest of the stuff, incl. I/O is actually on the trivial
             | side - threads do require planning. This is what I meant by
             | being a 'toy' project, threads (and JMM) would be
             | impossible to bolt in later on.
        
               | ncallaway wrote:
               | The reason you're being downvoted is you keep dismissing
               | this as a "toy" project, and pointing out that it would
               | be hard to make a real project.
               | 
               | But, as the previous commenter attempted to point out to
               | you, this project is a *self-described* toy project.
               | 
               | On the very page that is linked, the author of the JVM
               | specifically says:
               | 
               | "I want to stress that this is a toy JVM, built for
               | learning purposes and not a serious implementation."
               | 
               | Thus, absolutely _no one_ disagrees that it's a toy JVM.
               | They just want you to stop being dismissive of someone's
               | toy project by repeatedly pointing out it's a toy project
               | and "not a thing".
        
               | __jem wrote:
               | Right, just because something is a "toy" doesn't mean
               | it's not still impressive. If someone implemented a "toy"
               | database that could parse and execute SQL queries,
               | distribute data across nodes, etc., you would probably
               | not want to use that in production, but it's still a very
               | impressive project for a single person to pull off, even
               | if it's riddled with bugs. Getting a very complex system
               | to "just barely functional" is still a huge achievement
               | and very cool!
        
               | xxs wrote:
               | Being impressive or otherwise is very subjective.
               | 
               | SQL (ACID) over multiple non-cache-coherent nodes is
               | extremely difficult to pull off with regards to
               | consistency, though.
        
               | __jem wrote:
               | > SQL (ACID) over multiple non-cache-coherent nodes is
               | extremely difficult to pull off with regards to
               | consistency, though.
               | 
               | That's... why it's a toy! I'm really not sure what you're
               | missing here.
        
               | gazarullz wrote:
               | you must be fun to work with
        
               | xxs wrote:
               | I've pointed out that the only part that makes it a toy
               | project is the lack of threading support; the rest is not
               | hard to add. So the items in the list of missing things
               | after the 'toy' remark should have totally different
               | weights (with threads being added last).
        
               | lolinder wrote:
               | You're still missing the point--it was always intended to
               | be a toy project, and the author has explicitly declared
               | that they are completely done with it and won't be doing
               | any more work. What does it matter how they sort the list
               | of missing items? It's not a todo list in need of
               | prioritization, it's just an "FYI, these are some of the
               | things I never got to".
        
           | whizzter wrote:
           | Not entirely correct, last I checked string interning was
           | ONLY guaranteed for those strings defined in source and read
           | in during class loading. Strings created via the String
           | constructor (f.ex. via StringBuilder) CAN duplicate those
           | strings that you hardcoded in your sources; to get the
           | "canonical" string in those cases you have to invoke
           | String.intern() if memory serves me correctly.
           | 
           | https://docs.oracle.com/en/java/javase/11/docs/api/java.base.
           | ..()
           | 
           | Also interning strings to optimize equality checks to be able
           | to use pointer comparison is dangerous for external inputs
           | since iirc at some point interned strings could permanently
           | be stored (unless implemented by a WeakSet) and attackers
           | could fill up your heap (or cause other GC issues since the
           | entire interning functionality is a cache) by filling up your
           | interning lists with crap.
        
             | xxs wrote:
             | >String being equal by reference is important, and part of
             | JLS.
             | 
             | I never said String must be equal by reference when their
             | content is. However string literals must be equal by
             | reference. I thought mentioning the JLS would make it
             | obvious, esp. having 'intern' in the context.
        
             | ywei3410 wrote:
             | There's also the new JVM option which eludes me at the
             | moment which sweeps the strings which are promoted to the
             | older generation and interns them.
             | 
             | Not certain about whether `String.intern` is permanently
             | stored; I rather suspect that it sweeps the existing
             | strings since iirc the java string has a hash associated
             | with it anyway.
        
           | znpy wrote:
           | > Lack of thread makes the entire endeavor a toy project.
           | 
           | yeah, as stated by the author in the line that says "I want
           | to stress that this is a toy JVM, built for learning purposes
           | and not a serious implementation."
        
             | xxs wrote:
             | Yeah, that's the only part that makes it a toy project -
             | the rest can be added w/o too much of an effort. This is
             | pretty much what makes it a toy project.
        
       | nunobrito wrote:
       | Good work!
        
       | hajmo97 wrote:
       | How often will this post be reposted on HN?
       | 
       | Recent posts:
       | 
       | https://news.ycombinator.com/item?id=36735344 - 6 days ago
       | 
       | https://news.ycombinator.com/item?id=36717967 - 7 days ago
       | 
       | https://news.ycombinator.com/item?id=36710803 - 7 days ago (OP)
       | 
       | Btw. nice project!
        
         | freedomben wrote:
         | Getting the attention needed for the front page is largely luck.
         | I've seen great stuff get posted 5 or more times before it
         | makes it out of obscurity. HN is highly non-deterministic in
         | these things.
        
         | capableweb wrote:
         | Doesn't really count as a repost if none of the previous
         | submissions got any traction.
        
           | dgb23 wrote:
           | I agree, plenty of technically interesting projects don't get
           | discussion on hn. There's always a bit of luck and context
           | involved.
        
       | aardvark179 wrote:
       | Very well done. Building VMs is always fun, and I'm sure it was
       | an interesting learning experience when combined with Rust's type
       | system.
       | 
       | If you're looking for a job then ping me on Twitter, Mastodon or
       | my work email, I'm sure you can figure them out from my user id
       | here.
        
       | tenaf0 wrote:
       | Shameless plug of my similar project:
       | https://github.com/tenaf0/rust-jvm3
        
         | sproketboy wrote:
         | [dead]
        
       | bingemaker wrote:
       | When I see such cool projects, I feel very overwhelmed. How do
       | you get started with Rust and master basics to even attempt doing
       | such a thing? Can OP explain?
        
         | andreabergia wrote:
         | Well, _I_ feel impostor syndrome half the time I open HN,
         | honestly!
         | 
         | I did have a bit of experience with VMs before, I wrote many
         | years ago a short series of posts about it on my blog, and at
         | my previous job I dabbled a bit in JVM byte code to solve one
         | very unusual problem we had for a customer. I also read the
         | _amazing_ https://craftinginterpreters.com/ years ago and that
         | gave me some ideas.
         | 
         | But this project was definitely big and complex. It took me a
         | lot of time, and it got abandoned a couple of times, like many
         | of my side projects. But I'm happy I finished it. :-)
        
         | nop_slide wrote:
         | Likewise. Not to go off on too much of a tangent, but on a
         | more personal note I've been generally struggling with this
         | feeling a lot lately.
         | 
         | I've been a professional software developer for almost 10
         | years, and I _know_ I'm competent (and not an impostor) as
         | demonstrated by my current position and ability to ship things.
         | 
         | However, lately after viewing developer blogs I become
         | overwhelmed that I actually don't know enough and am not a
         | "real" developer. I seem to have formed a notion of an ideal
         | developer in my head and I compare myself against this imagined
         | construct which leads to these feelings. I admire how these
         | people have so much deep knowledge and can express themselves
         | so clearly and concisely, then wonder why I am not like that.
         | 
         | I barely have the energy after work after taking care of my
         | family to do anything further, and I know programming isn't
         | everything but I do have a desire to learn more and improve
         | myself.
         | 
         | I recognize this isn't healthy nor is it rational, but it's
         | just a feeling I can't shake lately.
        
           | theLiminator wrote:
           | Well, you're probably comparing yourself against the top 1%
           | of developers. It's okay to not be the very best, being in
           | the top 30% of this field already is very rewarding.
        
           | dist1ll wrote:
           | What you're describing is very common amongst developers. So
           | common in fact, that I've written a post about this
           | https://alic.dev/blog/comparisons
           | 
           | In short: recognizing your insecurities is the first step.
           | The next step is figuring out what's important to you,
           | shedding impossible to achieve and irrational ambitions,
           | prioritizing your goals in life, and articulating concrete
           | steps to further them.
        
             | nop_slide wrote:
             | Thanks, appreciate the link and going to keep this in mind.
             | Cheers.
        
         | naltun wrote:
         | Not OP nor am I a Rust expert. I can speak regarding another
         | technology: sockets.
         | 
         | I've been deep-diving into sockets recently. 2 weeks ago I had
         | only a high-level understanding of sockets (learned from
         | casually reading manpages, docs, blog posts, etc.). I decided
         | to read as much as possible because I wanted to understand
         | networking fundamentals, and after a week I learned enough to
         | write some sockets code in Python and C. I know Python quite
         | well, so reviewing the ``socket'' library made more sense
         | after my deep dive.
         | 
         | If you want to get better at technology A using language X, I
         | suggest reading/watching as much as you can about tech A and
         | building stuff with it in language Y. Then you can circle
         | back to learning language X, having already mastered many of
         | the concepts around technology A.
         | 
         | e: spelling
        
         | aardvark179 wrote:
         | Break things down. A simple language VM is going to have a way
         | to represent objects in memory, a byte code interpreter, a
         | simple garbage collector, and a way to load things.
         | 
         | A byte code interpreter is a stack, some way to represent
         | functions on that stack, and then a loop to interpret each
         | byte code and move the program counter.
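         | 
         | The loop described above can be sketched in Rust roughly as
         | follows. This is a sketch under stated assumptions: the
         | instruction set is made up and far smaller than real JVM
         | byte code, and call frames are left out so only the operand
         | stack and program counter are shown.

```rust
/// A made-up instruction set, far smaller than real JVM byte code.
#[derive(Debug, Clone, Copy)]
enum Instruction {
    Push(i64),
    Add,
    Mul,
    /// Jump to the given index if the popped value is zero.
    JumpIfZero(usize),
    /// Unconditional jump to the given index.
    Jump(usize),
    /// Pop the top of the stack and return it.
    Return,
}

#[derive(Debug, PartialEq)]
enum VmError {
    StackUnderflow,
    MissingReturn,
}

fn pop(stack: &mut Vec<i64>) -> Result<i64, VmError> {
    stack.pop().ok_or(VmError::StackUnderflow)
}

/// Runs one function body: fetch the instruction at the program counter,
/// execute it against the operand stack, then fall through or jump.
fn run(code: &[Instruction]) -> Result<i64, VmError> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0;
    while let Some(&instruction) = code.get(pc) {
        pc += 1; // default: fall through to the next instruction
        match instruction {
            Instruction::Push(value) => stack.push(value),
            Instruction::Add => {
                let right = pop(&mut stack)?;
                let left = pop(&mut stack)?;
                stack.push(left.wrapping_add(right));
            }
            Instruction::Mul => {
                let right = pop(&mut stack)?;
                let left = pop(&mut stack)?;
                stack.push(left.wrapping_mul(right));
            }
            Instruction::JumpIfZero(target) => {
                if pop(&mut stack)? == 0 {
                    pc = target;
                }
            }
            Instruction::Jump(target) => pc = target,
            Instruction::Return => return pop(&mut stack),
        }
    }
    // Ran off the end of the code without hitting `Return`.
    Err(VmError::MissingReturn)
}
```

         | For example, running `Push(2), Push(3), Add, Push(4), Mul,
         | Return` through `run` computes (2 + 3) * 4. A fuller VM
         | would push a frame holding locals and a saved program
         | counter for each method call.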
        
         | sn9 wrote:
         | How much do you code in your free time? Like average hours per
         | week?
         | 
         | If it's zero (and no judgement from me if it is; plenty of
         | other things to focus on), then it shouldn't be surprising that
         | someone for whom that number is (speculatively) 10-20 hours per
         | week on average for years has impressive side projects.
        
       | squirtlebonflow wrote:
       | [dead]
        
       | FrustratedMonky wrote:
       | Just for kicks, has anybody tried to Rube-Goldberg it, and see
       | how many VMs can stack on top of each other? Like have a Java
       | app running on a JVM written in Rust, running on WASM, running
       | on a JVM, etc.
        
         | Zambyte wrote:
         | This talk isn't about VMs, but it is about an infinite tower of
         | interpreters. You may find it interesting:
         | https://youtu.be/SrKj4hYic5A
        
         | _joel wrote:
         | It's turtles, all the way down..
        
         | post-it wrote:
         | See also: The Birth & Death of JavaScript, A talk by Gary
         | Bernhardt from PyCon 2014
         | (https://www.destroyallsoftware.com/talks/the-birth-and-
         | death...)
        
       | maverwa wrote:
       | Great project, congrats! Mad respect.
       | 
       | Started working on something very similar a few years back and
       | gave up pretty soon for some stupid reason. Maybe I should try
       | again, am getting better at getting stuff done.
        
         | zote wrote:
         | Please do
        
         | cmrdporcupine wrote:
         | You should. I've been doing this kind of stuff on the side for
         | 25+ years, and nothing ever gets "done." But the educational
         | value is much higher than you think. And the skills learned can
         | eventually propagate out into interesting paid work.
         | 
         | I think of them as retirement projects before I retire. When I
         | actually retire I'll maybe finish them.
         | 
         | (That said, I have in the past tried taking jobs that were
         | adjacent to my "research" interests, and found the joy of
         | building these things from scratch is much better than fiddling
         | with the levers on the side of someone else's thing they built
         | from scratch years ago. I like working on and improving
         | production systems, but if they intersect too closely to my
         | personal interests, it can be demoralizing.)
        
         | andreabergia wrote:
         | Thanks!
         | 
         | I know the feeling - this project, like most of my other side
         | projects, got abandoned a couple of times. But I was really
         | curious about implementing a GC and, for once, I managed to
         | finish something. I'm glad I did! :-)
        
       | celeritascelery wrote:
       | I am curious if you ran into limitations due to the lifetimes on
       | this signature:
       | 
       |     fn execute_instruction(
       |         &mut self,
       |         vm: &mut Vm<'a>,
       |         call_stack: &mut CallStack<'a>,
       |         instruction: Instruction,
       |     ) -> Result<InstructionCompleted<'a>, MethodCallFailed<'a>>
       | 
       | When I try to add a lifetime to the `Err` variant of a `Result`
       | and that lifetime is invariant (which it is due to `vm` and
       | `call_stack`) it usually means that I can't use the question mark
       | operator or have early returns in the code[1]. This makes error
       | handling more verbose and less readable. Is that your experience
       | as well?
       | 
       | [1] https://users.rust-lang.org/t/nll-and-early-return-not-
       | allow...
        
         | celeritascelery wrote:
         | EDIT: Looks like this is not an issue because the invariant
         | lifetime 'a is not used for the mutable reference of vm or
         | call_stack. So it's not the invariance that is the problem, but
         | rather how Rust reasons about the lifetime of mutable
         | references, which this avoids.
         | 
         | In that case I don't understand what the point of 'a is on VM
         | and CallStack. You can create[1][2] those with any unbounded
         | lifetime (including 'static[3]), which means it is not
         | constraining anything. What is the lifetime 'a doing here? Why
         | not remove it?
         | 
         | [1]
         | https://github.com/andreabergia/rjvm/blob/be9c54066c64a82879...
         | 
         | [2]
         | https://github.com/andreabergia/rjvm/blob/be9c54066c64a82879...
         | 
         | [3]
         | https://github.com/andreabergia/rjvm/blob/be9c54066c64a82879...
        
           | andreabergia wrote:
           | I wanted to express the fact that everything that gets
           | allocated (call stack, frames, classes, and objects) is alive
           | and valid until the "root" VM is, thus I used 'a more or less
           | everywhere.
           | 
           | I also struggled with and got a ton of errors from the borrow
           | checker initially, and I fixed many of those with a lot of
           | explicit lifetimes, but it's not impossible that in some
           | places they are unnecessary.
        
             | celeritascelery wrote:
             | > I wanted to express the fact that everything that gets
             | allocated (call stack, frames, classes, and objects) is
             | alive and valid until the "root" VM is, thus I used 'a more
             | or less everywhere.
             | 
             | That's not being expressed in the type system. The lifetime
             | 'a is unbounded (meaning you can make it anything you want,
             | including 'static) so anything that shares 'a can outlive
             | the vm without Rust complaining. It would be no different
             | than if you removed 'a completely. If you wanted to ensure
             | anything couldn't outlive the vm you could tie the lifetime
             | to a _reference_ to the vm, but then the vm can't hold
             | those values (it would be a self-referential lifetime).
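             | 
             | A small sketch of the unbounded-lifetime point, using
             | hypothetical stand-ins (`Vm`, `ObjectRef`, `allocate`)
             | rather than the project's real types. The handle carries
             | 'a but is not linked to any borrow of the `Vm`, so the
             | compiler accepts a handle that outlives it:

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a `Vm<'a>` whose lifetime parameter is not
/// tied to any borrow.
struct Vm<'a> {
    heap: Vec<Box<u32>>,
    _marker: PhantomData<&'a ()>,
}

/// Hypothetical object handle that shares the same `'a`.
struct ObjectRef<'a> {
    ptr: *const u32,
    _marker: PhantomData<&'a ()>,
}

impl<'a> ObjectRef<'a> {
    /// Exposes the raw address only; it is never dereferenced here.
    fn address(&self) -> *const u32 {
        self.ptr
    }
}

impl<'a> Vm<'a> {
    fn new() -> Self {
        Vm { heap: Vec::new(), _marker: PhantomData }
    }

    /// Returns a handle carrying `'a`, which is unrelated to the
    /// `&mut self` borrow, so the borrow checker does not link the
    /// handle to this particular `Vm` value.
    fn allocate(&mut self, value: u32) -> ObjectRef<'a> {
        let boxed = Box::new(value);
        let ptr: *const u32 = &*boxed;
        self.heap.push(boxed);
        ObjectRef { ptr, _marker: PhantomData }
    }
}

/// Compiles without complaint: nothing stops the caller from picking
/// `'a = 'static`, so the handle escapes while the `Vm` (and the heap it
/// points into) is dropped at the end of this function.
fn handle_outlives_vm() -> ObjectRef<'static> {
    let mut vm: Vm<'static> = Vm::new();
    let handle = vm.allocate(42);
    // `vm` is dropped on return; the pointer inside `handle` now dangles
    // and must never be dereferenced.
    handle
}
```

             | If `allocate` instead took `&'a self` and returned
             | `ObjectRef<'a>`, the handle would be tied to a borrow of
             | the vm and `handle_outlives_vm` would be rejected.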
        
               | skitter wrote:
               | Funnily enough I did the same in an early wip version of
               | my toy JVM. Ended up using unsafe to use 'static
               | references internally but only hand out wrappers that
               | include a reference to the JVM. This also ensures that
               | objects/classes/... from one JVM can't be used in another
               | one.
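               | 
               | One way to picture the wrapper idea, as a sketch only:
               | this version stores plain indices instead of unsafe
               | 'static references, and the names `Jvm`, `ObjectHandle`
               | and `same_vm` are hypothetical. The handle borrows the
               | JVM, so it cannot outlive it, and it can check at runtime
               | which JVM it came from.

```rust
/// Hypothetical toy VM that owns its objects in a vector.
struct Jvm {
    objects: Vec<String>,
}

/// Handle handed out to users: it borrows the `Jvm`, so the compiler
/// rejects any use of it after the `Jvm` is gone.
#[derive(Clone, Copy)]
struct ObjectHandle<'vm> {
    vm: &'vm Jvm,
    index: usize,
}

impl Jvm {
    fn new() -> Self {
        Jvm { objects: Vec::new() }
    }

    /// Allocation needs `&mut self`, so it returns a plain index.
    fn allocate(&mut self, value: &str) -> usize {
        self.objects.push(value.to_owned());
        self.objects.len() - 1
    }

    /// Wraps a valid index in a handle whose lifetime is tied to `&self`.
    fn handle(&self, index: usize) -> Option<ObjectHandle<'_>> {
        if index < self.objects.len() {
            Some(ObjectHandle { vm: self, index })
        } else {
            None
        }
    }
}

impl<'vm> ObjectHandle<'vm> {
    /// Reads the object through the borrowed `Jvm`.
    fn value(&self) -> &'vm str {
        let vm: &'vm Jvm = self.vm;
        vm.objects[self.index].as_str()
    }

    /// True only if both handles come from the same `Jvm` instance.
    fn same_vm(&self, other: &ObjectHandle<'_>) -> bool {
        std::ptr::eq(self.vm, other.vm)
    }
}
```

               | The trade-off in this variant is that no handle can be
               | alive while `allocate` runs, since that needs `&mut Jvm`;
               | keeping references internally avoids that at the cost of
               | unsafe code.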
        
       | sylware wrote:
       | Can I compile it with the rust-written rust compiler?
        
       | bkkz5046 wrote:
       | but more importantly does edbrowse work here?
        
       | babuloseo wrote:
       | Oracle oracle oracle.
        
       ___________________________________________________________________
       (page generated 2023-07-21 23:00 UTC)