[HN Gopher] I have written a JVM in Rust
___________________________________________________________________
I have written a JVM in Rust
Author : lukastyrychtr
Score : 604 points
Date : 2023-07-21 08:48 UTC (14 hours ago)
(HTM) web link (andreabergia.com)
(TXT) w3m dump (andreabergia.com)
| celeritascelery wrote:
| I have a few questions about the garbage collection. One of the
| hard parts of implementing a garbage collector is making sure
| everything is properly rooted (especially with a moving
| collector). You have the `do_garbage_collection` method marked
| unsafe[1], but don't explain what the calling code needs to do to
| ensure it is safe to call. How do you ensure all references to
| the heap are rooted? This is not a trivial problem[2][3][4].
|
| Also note that I cloned the repo and tried to run `cargo test`;
| every test fails with 'should be able to add entries to the
| classpath: InvalidEntry(".../vm/rt.jar")'
| vm/tests/integration/real_code_tests.rs:15:10
|
| [1]
| https://github.com/andreabergia/rjvm/blob/be9c54066c64a82879...
|
| [2] https://manishearth.github.io/blog/2021/04/05/a-tour-of-
| safe...
|
| [3] https://without.boats/blog/shifgrethor-iii/
|
| [4] https://coredumped.dev/2022/04/11/implementing-a-safe-
| garbag...
| munificent wrote:
| It's pretty straightforward. Their VM maintains its own notion
| of a callstack instead of using the native callstack. That lets
| them iterate over it and find all of the parameters and locals
| on the VM's callstack and use them as roots.
|
| There is a performance cost for a VM having its own virtual
| callstacks like this, but it makes GC tracing much simpler. (It
| also makes implementing interesting concurrency and control
| flow primitives like coroutines or continuations much easier
| too.)
| celeritascelery wrote:
| Seems like that would take care of roots for the bytecode functions
| themselves, but not for "native" functions[1]. Allocating a
| new object could call gc[2], and native functions are using
| the native callstack. It seems like it would be easy to
| allocate in a native function and any unrooted references
| would be invalidated. In fact I see a case like that here[3].
| That method creates a reference with
| `expect_concrete_object_at` and then calls gc with
| `new_java_lang_class_object`. It avoids UB by not using `arg`
| after the call that gc's, but there is nothing stopping you
| from using `arg` again (and having an invalid reference).
|
| [1] https://github.com/andreabergia/rjvm/blob/main/vm/src/nat
| ive...
|
| [2] https://github.com/andreabergia/rjvm/blob/be9c54066c64a82
| 879...
|
| [3] https://github.com/andreabergia/rjvm/blob/be9c54066c64a82
| 879...
| andreabergia wrote:
| Indeed you are right, this is definitely a bug and could
| cause errors.
|
| I guess the solution would be to add an explicit API to
| create a GC root, invoked by native methods (which is a bit
| complicated by the fact that I use a moving collector).
|
| Many years ago I was using SpiderMonkey in a c++ project
| and I seem to remember there were some APIs for native
| callbacks to invoke that rooted values. Same problem and
| similar solution. :-)
| munificent wrote:
| _> I guess the solution would be to add an explicit API
| to create a GC root, invoked by native methods (which is
| a bit complicated by the fact that I use a moving
| collector)._
|
| This is what I do in the Wren VM. Any time a native C
| function has the only reference to a GC-managed object
| and it's possible for a collection to occur, it calls a
| function to temporarily add the object to a list of known
| roots.
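|
| A minimal Java sketch of the two root sources discussed above: locals
| held on the VM's own call stack, plus a list of temporary roots that
| native code registers by hand. `Obj`, `Frame`, `ToyVm`, `pushTempRoot`
| and `popTempRoot` are hypothetical names for illustration; this is not
| code from rjvm or Wren, and it only models the mark phase.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical toy heap object: a name plus outgoing references.
final class Obj {
    final String name;
    final List<Obj> fields = new ArrayList<>();
    Obj(String name) { this.name = name; }
}

// Hypothetical VM frame: locals live here, not on the native stack.
final class Frame {
    final List<Obj> locals = new ArrayList<>();
}

final class ToyVm {
    final Deque<Frame> callStack = new ArrayDeque<>();
    // Objects that native code holds across a possible collection.
    final List<Obj> tempRoots = new ArrayList<>();

    void pushTempRoot(Obj o) { tempRoots.add(o); }
    void popTempRoot() { tempRoots.remove(tempRoots.size() - 1); }

    // Mark phase only: returns every object reachable from the roots.
    Set<Obj> markReachable() {
        Set<Obj> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Obj> worklist = new ArrayDeque<>();
        for (Frame f : callStack) {
            worklist.addAll(f.locals);   // roots from the VM call stack
        }
        worklist.addAll(tempRoots);      // roots registered by native code
        while (!worklist.isEmpty()) {
            Obj o = worklist.pop();
            if (marked.add(o)) {
                worklist.addAll(o.fields);
            }
        }
        return marked;
    }
}
```

| An object known only to native code is found by `markReachable` only
| between `pushTempRoot` and `popTempRoot`. With a moving collector the
| temporary root would also have to be updated in place after a
| collection, which this sketch does not model.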
| amelius wrote:
| GCs are pretty easy, and just a matter of good accounting. That
| is, until you start doing concurrent GC; then it becomes
| hellishly difficult.
| encody wrote:
| Missed opportunity to call it Just
| [deleted]
| orlp wrote:
| Just is already 'taken' in the Rust community as it is a
| command runner utility like make: https://github.com/casey/just
| 1letterunixname wrote:
| Haha. Shout out to homeboy Xoogler Rodarmor.
| entropicdrifter wrote:
| Ooh, I like that. `just build` feels like a good plea to make
| to the Rust compiler lol
| brunoborges wrote:
| JustVM
| Agingcoder wrote:
| I'm doing a (free) operating system (just a hobby, won't be big
| and professional like gnu) for 386 (486) AT clones.
|
| :-)
| 1letterunixname wrote:
| There's a no-std tutorial on how to write a demo kernel in
| Rust. https://os.phil-opp.com
|
| osdev.org, sandpile.org, RBIL, and freevga. The biggest PITA is
| hardware support. There are many good vintage hardcopy books
| with recipes for things like reliable port IO and undocumented
| hardware tricks.
|
| - Intel(r) 64 and IA-32 Architectures Software Developer's
| Manual Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and
| 4
|
| - Microsoft MS-DOS Programmer's Reference (also includes real-
| mode BIOS calls)
|
| - PC Interrupts
|
| - Undocumented PC
|
| - PC Intern
|
| - Programmer's Guide To The EGA, VGA, And Super VGA Cards
|
| - Graphics Programming Black Book Special Edition
|
| Also, it's worth toying with advances in OS dev past the era of
| monolithic, microkernel, and hybrid.
|
| 1. Capability-based like seL4. It has a number of inherent
| performance and security advantages including capabilities and
| excellent IPC.
|
| 2. POSIX compatibility layer. Even embedded OSes without the
| concept of threads or processes can implement POSIX.
|
| 3. Hypervisor. They're much easier to add with Intel's VT-[xd].
| Failing that, fall back to emulation. Translational emulation
| is very performant.
|
| 4. Get good at generalizing interrupt handlers, making them
| fast, avoiding race conditions, and using lock-free patterns.
|
| Also:
|
| 5. Rewriting or trapping unsupported instructions including x87
| and MMX.
|
| 6. The failure of pure microkernel was the added complexity and
| management of sequencing multiple resources in a transactional
| manner. There are great theoretical security and operational
| advantages in microkernel architectures but they never caught
| on widely in a pure form.
| cmrdporcupine wrote:
| Great learning project, I'm glad the author is having fun.
| Implementing a VM from scratch is a blast, and I have learned so
| much in the past doing that kind of thing.
|
| If they're interested in bolting on a GC, it couldn't hurt to
| look at MMtk. (https://www.mmtk.io/) Some high quality collection
| algorithms, written to be pluggable to various VMs, and written
| in Rust.
| sbt567 wrote:
| Uh, first time hearing mmtk. Thanks for the link!
| cmrdporcupine wrote:
| I only became aware of it because a former employer
| (RelationalAI) was heavily interested in replacing Julia's GC
| with it (for some workloads):
| https://pretalx.com/juliacon2023/talk/BMBEGY/
| celeritascelery wrote:
| Note that MMTK is x86 only. I was going to use it for a toy
| project but I have a Mac.
| cmrdporcupine wrote:
| That's a bummer -- I guess I never noticed that. When I
| played with it before, it was on an M1 Mac, but compiled into
| an x86 Julia executable & running through Rosetta (which
| surprisingly did not suck).
|
| Haven't read, but I bet it's likely related to expectations
| around the x86_64 memory model & atomics. In the long run I
| see no reason why it couldn't be made portable, but I imagine
| the authors' efforts are elsewhere for now.
| nine_k wrote:
| This project is like the ground floor of the JVM, not the whole
| tower. I like though how the project's page is direct and clear
| about that.
|
| A lot of the foundation and ground-floor mechanics are pretty
| interesting though.
| ChuckMcM wrote:
| That is pretty awesome! When I joined the Java effort in '92
| (called Oak at the time) the group I was with was looking at
| writing a full OS in Java. The idea being that if you could get to
| just the minimal set of things needed as "machine code" (aka
| native methods), you could reduce the attack surface of an
| embedded OS. (Originally Java was targeted to run in things like
| TVs and other appliances.) We were, of course, working in C
| rather than Rust for the native methods. The JVM in Rust though
| adds a solid level of memory safety to the entire process.
| mshockwave wrote:
| > writing a full OS in Java
|
| IMAO, Android kind of achieves that... kind of. They write lots
| of OS logic in Java (or Kotlin) but mix in lots of system
| services written in native code at the same time,
| interconnected by the famous (or infamous?) Binder IPC.
| ActorNightly wrote:
| Android is mostly things that run java, not java itself. You
| can look at the source code, there is a relatively small
| amount of java in there.
| soperj wrote:
| It's a modified version of the linux kernel no? That'd be
| mostly C.
| kaptainscarlet wrote:
| He might be referring to system services that manage
| hardware devices, e.g. BatteryService, etc.
| grishka wrote:
| Android isn't _conventional_ Java. For starters, its runtime
| uses its own bytecode (dex) that's based on registers
| instead of a stack. But then, also, many things that aren't
| related to GUI are C++ with a thin Java wrapper on top.
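|
| To illustrate the stack-versus-register distinction, here is a
| hedged sketch that adds two ints in both styles. `StackVsRegister`
| and its methods are hypothetical; the comments name real JVM and dex
| instructions only as rough analogies, not as an emulation of either.

```java
final class StackVsRegister {
    // Stack style (JVM-like): operands are pushed, add pops two, pushes one.
    static int addOnStackMachine(int a, int b) {
        int[] stack = new int[2];
        int sp = 0;
        stack[sp++] = a;              // like iload
        stack[sp++] = b;              // like iload
        int rhs = stack[--sp];
        int lhs = stack[--sp];
        stack[sp++] = lhs + rhs;      // like iadd
        return stack[--sp];           // like ireturn
    }

    // Register style (dex-like): one instruction names sources and target.
    static int addOnRegisterMachine(int a, int b) {
        int[] regs = new int[3];
        regs[1] = a;
        regs[2] = b;
        regs[0] = regs[1] + regs[2];  // like add-int v0, v1, v2
        return regs[0];               // like return v0
    }

    public static void main(String[] args) {
        System.out.println(addOnStackMachine(2, 3));
        System.out.println(addOnRegisterMachine(2, 3));
    }
}
```

| Both methods compute the same sum; the register form needs fewer
| instructions, each of which carries more operands.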
|
| When I think about a "Java OS", I imagine a JVM running in
| kernel mode, providing minimal OS functionality (scheduler,
| access to hardware I/O ports) and there not being any kind of
| userspace.
| tgtweak wrote:
| Embedded JVM is actually huge/pervasive and runs things as
| benign as the chip on your credit card.
| ActorNightly wrote:
| By the virtue of being the most convenient alternative, not
| because its actually good.
| lazide wrote:
| How is that not actually good?
|
| Engineering is all about tradeoffs, and 'works' is pretty
| high praise frankly.
| palata wrote:
| Still better than ElectronJS, I would say
| lldb wrote:
| Every blu ray player runs java for bonus features on the
| disk - they can even connect to the internet!
| pseudosavant wrote:
| The very first feature I disable on every Blu-ray player
| I've ever used.
| locusofself wrote:
| After reading your comment I was surprised to find out that
| credit card chips have any processing capabilities
| whatsoever, which they apparently do, though at least
| according to GPT-4, they are far too basic to run Java/JVM.
| cayley_graph wrote:
| Google appears to be significantly more useful than GPT-4
| here. [1] is the third result for me for the query
| "credit card jvm". [2] is the second result and gives a
| direct (and more importantly, actually correct) answer.
| That post links to the Oracle documentation for Java
| Cards [3] which is the fourth result.
|
| [1] https://en.m.wikipedia.org/wiki/Java_Card
|
| [2] https://superuser.com/questions/362567/are-there-any-
| credit-...
|
| [3] https://www.oracle.com/java/java-card/
|
| All of this is just as easy as, if not easier than, using
| ChatGPT. It's unclear that such a tool even serves this
| purpose (retrieval of basic facts) adequately, so it
| should probably be avoided in the future.
| locusofself wrote:
| Fair enough, there is a "Java Card". I'm not convinced
| that Java is running on any of the credit cards in my or
| your wallet today, though I'm not willing to bet on it.
| arllk wrote:
| It's running on many e-Passports and e-ID cards. I can't
| find the documentation for my e-ID card, which runs on
| Java, but the chips are quite common, like in
|
| https://www.cardlogix.com/product/cardlogix-credentsys-
| lite-...
|
| And on another source:
|
| Visa became the first large payment company to license
| JavaCard. Visa mandated JavaCard for all of Visa's
| smartcard payment cards. Later, MasterCard acquired
| Mondex, and Peter Hill joined as their CTO, licensed
| JavaCard, and ported the Mondex payment platform to
| JavaCard.
|
| Source: https://javacardforum.com/2022/07/28/the-birth-
| of-javacard/
| cayley_graph wrote:
| This lists a few things using Java Cards (likely stuff
| that's in your wallet, surprisingly):
| https://stackoverflow.com/questions/47731005/practical-
| use-o...
|
| (fwiw I have a bit of prior experience here)
| locusofself wrote:
| That's pretty neat. It sounds like Java Card (or at least
| some other cards) actually "boot up" by way of inductive
| coupling, ie, via the "contactless" card readers where
| you just hold your card in proximity to the reader thing.
| Did not know that, I assumed it was just reading a key
| via NFC or something.
| pests wrote:
| This is a classic defcon talk about it. They developed
| their own cell network with their own sim cards for an
| event and even built custom JavaCard applications their
| users could use. They have released all their information
| and tools used to build and compile Java Card software.
|
| https://www.youtube.com/watch?v=_-nxemBCcmU
| Sharlin wrote:
| The whole reason there's a thing called a chip in the
| card is that it actually does computing (indeed they're
| called smart cards) and that it does the sort of
| computing (cryptographic challenge-response) that makes
| these cards much more secure than oldschool magnetic
| stripe cards.
|
| Even fully passive NFC tags contain logic that needs
| power to talk NFC back to the reader, there's no such
| thing as just reading data via NFC.
| xorcist wrote:
| Just wait until you find out about the SIM card in your
| phone!
|
| It's almost bizarre what that thing does.
| unintendedcons wrote:
| In what world did you convince yourself asking a chatbot
| was a source of real knowledge?
|
| Respect yourself enough to look at primary sources.
| locusofself wrote:
| How about respect other people instead of rushing to
| condescending judgement? The stakes are incredibly low
| here, I asked ChatGPT for fun, like tens of millions of
| others are doing every day.
| refulgentis wrote:
| For some reason you announced that you were subjecting us
| to a low quality information retrieval method, and after
| 6 months of this, people are irritable. The social norm
| is to do that sort of thing in private; doing it in
| public and seemingly proudly came across as coarse and
| impolite.
|
| It didn't help any that it was clear from the initial
| post you were questioning someone with domain knowledge,
| which was later gently indicated to you.
| tjlingham wrote:
| It feels like that should be true, I get it. However Java
| Card is very real.
|
| https://en.m.wikipedia.org/wiki/Java_Card
| Isolus wrote:
| Also many SIM cards (UICCs) / embedded SIM modules as well
| as e.g. the Secure Element that Samsung uses for Knox run
| with Java Card.
| pjmlp wrote:
| Besides the sibling comments,
|
| - SavageJE
|
| - microEJ
|
| - PTC and Aonix bare metal Java runtimes
|
| - SunSPOT with SquawkVM
| spullara wrote:
| There was one for a while though wasn't really targeted at
| users:
|
| https://en.wikipedia.org/wiki/JavaOS
| techn00 wrote:
| See also https://jacobin.org/ for JVM 17 written in Go.
| xmcqdpt2 wrote:
| Also https://github.com/lihaoyi/Metascala for a JVM implemented
| in Scala running on the JVM.
| dimgl wrote:
| Seems... redundant, no?
| mike_hearn wrote:
| Nope. For a more realistic example of such a JVM, look at
| SubstrateVM (written in "SystemJava" and compiled to native
| code ahead of time along with the app it runs), and "Java
| on Truffle" (a.k.a. Espresso), which is a JVM written in
| Java designed to be compiled to run on top of SubstrateVM.
| Both projects are a part of Graal.
|
| The reason to do this, beyond the inherently neat Inception
| factor, is that JVMs are a PITA to work on because they're
| normally written in languages like C++ or Rust which
| optimize for performance and manual control over
| productivity. That makes it hard to experiment with new JVM
| features or changed semantics. If you could write a JVM in
| a high level very productive language like Java (or Kotlin
| or Scala) then the productivity of people writing and
| experimenting with JVMs would go up. It would also make it
| feasible for "ordinary" Java devs to actually fork the JVM
| and modify it to better suit their app, at least in some
| cases.
|
| There's also something conceptually cleaner about having a
| language and its runtime implemented purely in itself. As
| long as you don't mind the circularity, that is.
|
| Espresso for example has hot-swap features HotSpot doesn't
| have, so you can modify your program as it's running in
| more flexible ways than what regular Java allows.
| bfrog wrote:
| I find that Rust is like maybe 1.5-2x more productive to
| code in than say C or C++. Part of that is the tooling
| has so much less arcane baggage, part of that is that I
| need to reach less for external tools for
| metaprogramming, part of that is fewer crazy
| macro/template compiler errors, and part of that is less
| time spent debugging.
|
| It all adds up.
| mike_hearn wrote:
| I've heard very inconsistent things about this, which is
| interesting. Often people say Rust is _less_ productive,
| as there's often "makework" involved with satisfying the
| borrow checker. I suspect a lot of it revolves around how
| you perceive that sort of thing: it can be cast as either
| productivity (satisfying it can potentially rule out
| bugs) or a loss of productivity (you were already
| satisfied the code was correct).
|
| But I don't have enough experience with Rust to really
| have formed an opinion on that yet.
| pjmlp wrote:
| And Jikes RVM as well.
| nerpderp82 wrote:
| Jikes is extremely popular in academia for being the
| basis of VM research because it is so easy to modify.
|
| https://github.com/JikesRVM/JikesRVM
| dgb23 wrote:
| apply eval
| cmrdporcupine wrote:
| _" The goal of Metascala is to create a platform to
| experiment with the JVM: a 3000 line JVM written in Scala
| is probably much more approachable than the 1,000,000 lines
| of C/C++ "_
|
| Seems like a reasonable goal.
| RcouF1uZ4gsC wrote:
| I seriously doubt the ratio of Scala vs C++ for
| implementing the JVM is 1:300.
| patrec wrote:
| He's not saying that. What he is saying is that a
| simple, non-production quality implementation in Scala is
| much more amenable to experimentation than a
| sophisticated, production-quality implementation in C++
| that weighs in at 300x the LOC.
| RcouF1uZ4gsC wrote:
| But a simple non-production quality implementation in C++
| would also be amenable to experimentation and not have
| the bootstrapping issues as well as provide an easier
| starting point to incorporate more of the existing
| optimizations as desired.
| valenterry wrote:
| Probably not, because JVM users are much more likely to
| be more proficient in Scala/Java than in C++.
| eindiran wrote:
| That is true, but the author of Metascala wanted to write
| it in Scala. Other people are free to write a simple C++
| implementation of the JVM themselves.
| cempaka wrote:
| I'm sure it's not complete, but I also wouldn't be
| surprised if 99% of what's in HotSpot is optimization
| tweaks and performance boosts which aren't essential to
| the JLS.
| leshow wrote:
| That is a very interesting name for a programming project lol.
| The Jacobins were a revolutionary political club during the
| French Revolution in the 1790s. It's also the name of a
| magazine at https://jacobin.com
| snordgren wrote:
| It starts with the letters "ja", that's all that matters
| for a Java-related project.
| haspok wrote:
| Nice project, congrats!
|
| One thing struck me as a bit odd:
|
| > In particular, it does not support: generics
|
| What kind of support is there for generics in the JVM? Maybe I'm
| too naive to assume that due to type erasure on bytecode level
| everything is just an Object, ie. a reference type? Or do you
| mean the class definition parser - but then, you don't really
| have any checks in place to see if the class file is valid (other
| than the basic syntax)?
| newmana wrote:
| They might be talking about the checkcast operation:
| https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-6.ht...
|
| This is generated when you do something like: final Main value
| = list.get(0);
|
| http://henrikeichenhardt.blogspot.com/2013/05/how-are-java-g...
| xxs wrote:
| The cast is added by javac, so it just needs to verify the
| object on the stack to be compatible w/ the provided class.
| That part is very simple.
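|
| A small sketch of why the compiler-inserted cast matters, assuming
| a standard javac: after erasure `get` returns `Object`, so a wrongly
| typed element is only caught where the result is cast back.
| `ErasureDemo` and `castFailsAtRuntime` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // After erasure, List<String>.get(0) returns Object; javac inserts a
    // checkcast to String where the result is assigned.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean castFailsAtRuntime() {
        List<String> strings = new ArrayList<>();
        List raw = strings;            // same object, generic type forgotten
        raw.add(Integer.valueOf(42));  // accepted: the bytecode only sees Object
        try {
            String s = strings.get(0); // the inserted cast fails here
            return s == null;          // not reached
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(castFailsAtRuntime());
    }
}
```

| A VM that implements `checkcast` correctly should therefore need no
| further generics-specific logic to reject this program at that line.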
| andreabergia wrote:
| Thanks!
|
| About the generics - some people have pointed out the same on
| reddit, and yeah, you are correct. The only thing that should
| be done is to read the Signature attribute that encodes the
| generic information about classes, methods, and fields
| (https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.ht...)
|
| As a matter of fact, I just did a test and the following code
| works! :-)
|
|     import java.util.ArrayList;
|     import java.util.List;
|
|     public class Generic {
|         public static void main(String[] args) {
|             List<String> strings = new ArrayList<String>(10);
|             strings.add("hey");
|             strings.add("hackernews");
|             for (String s : strings) {
|                 tempPrint(s);
|             }
|         }
|
|         // Declared native, so the body is supplied outside this class.
|         private static native void tempPrint(String value);
|     }
| xxs wrote:
| pretty much this - generics have (rare) implications to the
| reflection (but it's unsupported as well) but overall they are
| replaced with the nearest class/interface when compiled.
|
| OTOH lack of string interning is super strange [it's trivial to
| implement], and w/o it JVM is not a thing. String being equal
| by reference is important, and part of JLS.
|
| Lack of threads makes the entire endeavor a toy project.
| 3cats-in-a-coat wrote:
| Java strings are compared by reference, if they do not match,
| they're compared by value. There's no guarantee every single
| string has a single instance. That would hurt performance.
| maverwa wrote:
| I think op meant "String literals". For those the spec
| seems to require interning:
|
| > Moreover, a string literal always refers to the same
| instance of class String. This is because string literals -
| or, more generally, strings that are the values of constant
| expressions (§15.28) - are "interned" so as to share
| unique instances, using the method String.intern.
|
| And later:
|
| > Literal strings within different classes in different
| packages likewise represent references to the same String
| object.
|
| Source:
| https://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html...
|
| But that does - as far as I can see - say nothing for non-
| literal strings.
|
| [edit]: formatting
| 3cats-in-a-coat wrote:
| Thanks, makes sense yes. Still, if the JVM lookup in all
| cases defers to value after ref mismatch, it should work
| identically, no? Even if interning is mandatory as per
| spec, I'm not sure how it'd change the outcome of
| evaluation.
| maverwa wrote:
| yeah, I'd assume as much. If it indeed falls back to a
| by-value comparison it would be slower, but should work.
| xxs wrote:
| nope - it'd be plain wrong. Literals must be equal by
| reference, comparing them by value would just break JLS,
| as they would be equal to any other composed string by
| reference as well.
| robertlagrant wrote:
| Yes - that is a performance optimisation. I don't think
| comparing everything by value makes or breaks the
| implementation.
| xxs wrote:
| >I don't think comparing everything by value makes or
| breaks the implementation.
|
| Nothing much to think -- distinct objects must have
| distinct references [e.g. new String("a")!=new
| String("a")], literals must have the same references for
| the same values [e.g. "a"=="a"].
| thfuran wrote:
| It affects the result of ==, which is only a reference
| comparison.
| simiones wrote:
| The problem is that, per the spec, the following _must_
| hold:
|
|     if ("abc" == "abc") {
|         System.out.println("correct");
|     }
|     if (new String("abc") != new String("abc")) {
|         System.out.println("correct");
|     }
|
| So, not having proper string interning support means that
| you mis-execute certain programs.
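|
| A hedged sketch of how a toy VM might meet both requirements with a
| plain map. `VmString`, `LiteralPool`, `loadLiteral`, `allocate` and
| `intern` are hypothetical names, not rjvm's API; in a real VM the
| pool entries would also have to be treated as GC roots or held
| weakly.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical VM-side string object; identity is what `==` compares.
final class VmString {
    final String value;
    VmString(String value) { this.value = value; }
}

final class LiteralPool {
    private final Map<String, VmString> interned = new HashMap<>();

    // What loading a string constant could do: one object per value.
    VmString loadLiteral(String constant) {
        return interned.computeIfAbsent(constant, VmString::new);
    }

    // What `new String(...)` must do: always a fresh object.
    VmString allocate(String value) {
        return new VmString(value);
    }

    // String.intern(): return the pooled object with the same contents,
    // registering the argument itself if no such object exists yet.
    VmString intern(VmString s) {
        return interned.computeIfAbsent(s.value, v -> s);
    }
}
```

| With this pool, two loads of the same literal compare equal by
| reference, while two `allocate` calls never do.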
| kelnos wrote:
| Sure, but also consider that this JVM (intentionally)
| lacks support for other things that all but the most
| trivial programs would use. I don't think it's expected
| by the author that you can throw any random program at
| it. It's really there just to run your own programs that
| you've written specifically for it in order to play
| around with things. And since you know you're writing for
| this particular JVM, you should know not to do anything
| that depends on string interning, among other things.
| xxs wrote:
| pretty much indeed.
|
| > say nothing for non-literal strings
|
| yes, of course.
| senorrib wrote:
| "I want to stress that this is a toy JVM, built for learning
| purposes and not a serious implementation."
| xxs wrote:
| The rest of the stuff, incl. I/O is actually on the trivial
| side - threads do require planning. This is what I meant by
| being a 'toy' project, threads (and JMM) would be
| impossible to bolt in later on.
| ncallaway wrote:
| The reason you're being downvoted is you keep dismissing
| this as a "toy" project, and pointing out that it would
| be hard to make a real project.
|
| But, as the previous commenter attempted to point out to
| you this project is a *self-described* toy project.
|
| On the very page that is linked, the author of the JVM
| specifically says:
|
| "I want to stress that this is a toy JVM, built for
| learning purposes and not a serious implementation."
|
| Thus, absolutely _no one_ disagrees that it's a toy JVM.
| They just want you to stop being dismissive of someone's
| toy project by repeatedly pointing out it's a toy project
| and "not a thing".
| __jem wrote:
| Right, just because something is a "toy" doesn't mean
| it's not still impressive. If someone implemented a "toy"
| database that could parse and execute SQL queries,
| distribute data across nodes, etc., you would probably
| not want to use that in production, but it's still a very
| impressive project for a single person to pull off, even
| if it's riddled with bugs. Getting a very complex system
| to "just barely functional" is still a huge achievement
| and very cool!
| xxs wrote:
| Being impressive or otherwise is very subjective.
|
| SQL (ACID) over multiple non-cache-coherent nodes is
| extremely difficult to pull off with regards to consistency,
| though.
| __jem wrote:
| > SQL (ACID) over multiple non-cache-coherent nodes is
| extremely difficult to pull off with regards to consistency,
| though.
|
| That's... why it's a toy! I'm really not sure what you're
| missing here.
| gazarullz wrote:
| you must be fun to work with
| xxs wrote:
| I've pointed out that the only part that makes it a toy project is
| the lack of threading support; the rest is not hard to
| add. So the items in the list of things missing after 'toy'
| should have totally different weights (with Threads
| being added last).
| lolinder wrote:
| You're still missing the point--it was always intended to
| be a toy project, and the author has explicitly declared
| that they are completely done with it and won't be doing
| any more work. What does it matter how they sort the list
| of missing items? It's not a todo list in need of
| prioritization, it's just an "FYI, these are some of the
| things I never got to".
| whizzter wrote:
| Not entirely correct, last I checked string interning was
| ONLY guaranteed for those strings defined in source and read
| in during class loading, strings created via the String
| constructor (f.ex. via StringBuilder) CAN duplicate those
| strings that you hardcoded in your sources, to get the
| "canonical" string in those cases you have to invoke
| String.intern() if memory serves me correctly.
|
| https://docs.oracle.com/en/java/javase/11/docs/api/java.base.
| ..()
|
| Also interning strings to optimize equality checks to be able
| to use pointer comparison is dangerous for external inputs
| since iirc at some point interned strings could permanently
| be stored (unless implemented by a WeakSet) and attackers
| could fill up your heap (or cause other GC issues since the
| entire interning functionality is a cache) by filling up your
| interning lists with crap.
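|
| One hedged way to address the heap-filling concern above is a pool
| that holds its entries weakly, so unused canonical strings can be
| collected. `WeakInterner` is a hypothetical user-level helper built
| on `java.util.WeakHashMap`; it is not how `String.intern()` itself
| is implemented.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

final class WeakInterner {
    // Keys are held weakly by WeakHashMap; the value wraps the same
    // string in a WeakReference so the entry never keeps its key alive.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    synchronized String intern(String s) {
        WeakReference<String> ref = pool.get(s);   // lookup uses equals()
        String canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            canonical = s;                         // first seen: s is canonical
            pool.put(s, new WeakReference<>(s));
        }
        return canonical;
    }

    public static void main(String[] args) {
        WeakInterner interner = new WeakInterner();
        String a = new String("abc");
        String b = new String("abc");
        System.out.println(a == b);
        System.out.println(interner.intern(a) == interner.intern(b));
    }
}
```

| Strings built at runtime stay distinct until they pass through
| `intern`, after which equal contents map to one instance for as long
| as that instance is still referenced elsewhere.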
| xxs wrote:
| >String being equal by reference is important, and part of
| JLS.
|
| I never said String must be equal by reference when their
| content is. However string literals must be equal by
| reference. I thought mentioning the JLS would make it
| obvious, esp. having 'intern' in the context.
| ywei3410 wrote:
| There's also the new JVM option which eludes me at the
| moment which sweeps the strings which are promoted to the
| older generation and interns them.
|
| Not certain about whether `String.intern` is permanently
| stored; I rather suspect that it sweeps the existing
| strings since iirc the java string has a hash associated
| with it anyway.
| znpy wrote:
| > Lack of thread makes the entire endeavor a toy project.
|
| yeah, as stated by the author in the line that says "I want
| to stress that this is a toy JVM, built for learning purposes
| and not a serious implementation."
| xxs wrote:
| Yeah, that's the only part that makes it a toy project -
| the rest can be added w/o too much of an effort. This is
| pretty much what makes it a toy project.
| nunobrito wrote:
| Good work!
| hajmo97 wrote:
| How often will this post be reposted on HN ?
|
| Recent posts:
|
| https://news.ycombinator.com/item?id=36735344 - 6 days ago
|
| https://news.ycombinator.com/item?id=36717967 - 7 days ago
|
| https://news.ycombinator.com/item?id=36710803 - 7 days ago (OP)
|
| Btw. nice project!
| freedomben wrote:
| Getting the attention needed for front page is largely chance.
| I've seen great stuff get posted 5 or more times before it
| makes it out of obscurity. HN is highly non-deterministic in
| these things.
| capableweb wrote:
| Doesn't really count as a repost if none of the previous
| submissions got any traction.
| dgb23 wrote:
| I agree, plenty of technically interesting projects don't get
| discussion on hn. There's always a bit of luck and context
| involved.
| aardvark179 wrote:
| Very well done. Building VMs is always fun, and I'm sure it was
| an interesting learning experience when combined with Rust's type
| system.
|
| If you're looking for a job then ping me on Twitter, Mastodon or
| my work email, I'm sure you can figure them out from my user id
| here.
| tenaf0 wrote:
| Shameless plug of my similar project:
| https://github.com/tenaf0/rust-jvm3
| sproketboy wrote:
| [dead]
| bingemaker wrote:
| When I see such cool projects, I feel very overwhelmed. How do
| you get started with Rust and master basics to even attempt doing
| such a thing? Can OP explain?
| andreabergia wrote:
| Well, _I_ feel impostor syndrome half the time I open HN
| honestly!
|
| I did have a bit of experience with VMs before, I wrote many
| years ago a short series of posts about it on my blog, and at
| my previous job I dabbled a bit in JVM byte code to solve one
| very unusual problem we had for a customer. I also read the
| _amazing_ https://craftinginterpreters.com/ years ago and that
| gave me some ideas.
|
| But this project was definitely big and complex. It took me a
| lot of time, and it got abandoned a couple of times, like many
| of my side projects. But I'm happy I finished it. :-)
| nop_slide wrote:
| Likewise. Not to go onto too much of a tangent, but on a more
| personal note I've been generally struggling with this feeling
| a lot lately.
|
| I've been a professional software developer for almost 10
| years, and I _know_ I'm competent (and not an impostor) as
| demonstrated by my current position and ability to ship things.
|
| However, lately after viewing developer blogs I become
| overwhelmed that I actually don't know enough and am not a
| "real" developer. I seem to have formed a notion of an ideal
| developer in my head and I compare myself against this imagined
| construct which leads to these feelings. I admire how these
| people have so much deep knowledge and can express themselves
| so clearly and concisely, then wonder why I am not like that.
|
| I barely have the energy after work after taking care of my
| family to do anything further, and I know programming isn't
| everything but I do have a desire to learn more and improve
| myself.
|
| I recognize this isn't healthy nor is it rational, but it's
| just a feeling I can't shake lately.
| theLiminator wrote:
| Well, you're probably comparing yourself against the top 1%
| of developers. It's okay not to be the very best; being in
| the top 30% of this field is already very rewarding.
| dist1ll wrote:
| What you're describing is very common amongst developers. So
| common, in fact, that I've written a post about it:
| https://alic.dev/blog/comparisons
|
| In short: recognizing your insecurities is the first step.
| The next step is figuring out what's important to you,
| shedding impossible to achieve and irrational ambitions,
| prioritizing your goals in life, and articulating concrete
| steps to further them.
| nop_slide wrote:
| Thanks, appreciate the link and going to keep this in mind.
| Cheers.
| naltun wrote:
| Not OP nor am I a Rust expert. I can speak regarding another
| technology: sockets.
|
| I've been deep-diving into sockets recently. 2 weeks ago I had
| only a high-level understanding of sockets (learned from
| casually reading manpages, docs, blog posts, etc.). I decided
| to read as much as possible because I wanted to understand
| networking fundamentals, and after a week I learned enough to
| write some sockets code in Python and C. I know Python quite
| well, so reviewing the `socket` library made more sense
| after my deep dive.
|
| If you want to get better at technology A using language X, I
| suggest reading/watching as much as you can about tech A and
| building stuff with it in a language Y you already know. Then
| you can circle back to learning language X, having already
| mastered many of the concepts around technology A.
|
| e: spelling
| aardvark179 wrote:
| Break things down. A simple language VM is going to have a way
| to represent objects in memory, a byte code interpreter, a
| simple garbage collector, and a way to load things.
|
| A byte code interpreter is a stack, some way to represent
| functions on that stack, and then a loop to interpret each
| byte code and move the program counter.
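|
| As a rough illustration of that last paragraph, here is a minimal
| sketch in Rust. The `Op` instruction set, `VmError` and `run` are
| all made up for this example (this is not JVM bytecode and not
| code from rjvm), and function frames are left out to keep it
| short: it only shows the operand stack, the program counter and
| the dispatch loop.

```rust
/// A tiny, hypothetical instruction set (not real JVM bytecode).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Push(i64),
    Add,
    Mul,
    /// Pop a value; jump to the given instruction index if it is zero.
    JumpIfZero(usize),
    /// Unconditional jump to the given instruction index.
    Jump(usize),
    Halt,
}

#[derive(Debug, PartialEq)]
enum VmError {
    StackUnderflow,
    BadJump(usize),
}

/// Pop one operand, turning an empty stack into an error.
fn pop(stack: &mut Vec<i64>) -> Result<i64, VmError> {
    stack.pop().ok_or(VmError::StackUnderflow)
}

/// Reject jump targets past the end of the program.
/// Jumping exactly to `len` is allowed and simply ends execution.
fn check_target(target: usize, len: usize) -> Result<usize, VmError> {
    if target <= len {
        Ok(target)
    } else {
        Err(VmError::BadJump(target))
    }
}

/// Run `code` until `Halt` or the end of the program.
/// Returns the value on top of the operand stack, if any.
fn run(code: &[Op]) -> Result<Option<i64>, VmError> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0usize; // program counter: index of the next instruction

    while pc < code.len() {
        let op = code[pc];
        pc += 1; // advance first; jumps overwrite this below
        match op {
            Op::Push(v) => stack.push(v),
            Op::Add => {
                let b = pop(&mut stack)?;
                let a = pop(&mut stack)?;
                stack.push(a.wrapping_add(b));
            }
            Op::Mul => {
                let b = pop(&mut stack)?;
                let a = pop(&mut stack)?;
                stack.push(a.wrapping_mul(b));
            }
            Op::JumpIfZero(target) => {
                if pop(&mut stack)? == 0 {
                    pc = check_target(target, code.len())?;
                }
            }
            Op::Jump(target) => pc = check_target(target, code.len())?,
            Op::Halt => break,
        }
    }
    Ok(stack.last().copied())
}
```

| Calling `run` on `[Push(2), Push(3), Add, Push(4), Mul, Halt]`
| evaluates (2 + 3) * 4. A real VM would add call frames with
| locals on top of this, plus object representation, a GC and
| class loading, as listed above.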
| sn9 wrote:
| How much do you code in your free time? Like average hours per
| week?
|
| If it's zero (and no judgement from me if it is; plenty of
| other things to focus on), then it shouldn't be surprising that
| someone for whom that number is (speculatively) 10-20 hours per
| week on average for years has impressive side projects.
| FrustratedMonky wrote:
| Just for kicks, has anybody tried to Rube-Goldberg it and see
| how many VMs can be stacked on top of each other? Like having a
| Java app running on a JVM written in Rust, running on WASM,
| running on a JVM, etc.
| Zambyte wrote:
| This talk isn't about VMs, but it is about an infinite tower of
| interpreters. You may find it interesting:
| https://youtu.be/SrKj4hYic5A
| _joel wrote:
| It's turtles, all the way down..
| post-it wrote:
| See also: The Birth & Death of JavaScript, A talk by Gary
| Bernhardt from PyCon 2014
| (https://www.destroyallsoftware.com/talks/the-birth-and-
| death...)
| maverwa wrote:
| Great project, congrats! Mad respect.
|
| Started working on something very similar a few years back and
| gave up pretty soon for some stupid reason. Maybe I should try
| again, am getting better at getting stuff done.
| zote wrote:
| Please do
| cmrdporcupine wrote:
| You should. I've been doing this kind of stuff on the side for
| 25+ years, and nothing ever gets "done." But the educational
| value is much higher than you think. And the skills learned can
| eventually propagate out into interesting paid work.
|
| I think of them as retirement projects before I retire. When I
| actually retire I'll maybe finish them.
|
| (That said, I have in the past tried taking jobs that were
| adjacent to my "research" interests, and found the joy of
| building these things from scratch is much better than fiddling
| with the levers on the side of someone else's thing they built
| from scratch years ago. I like working on and improving
| production systems, but if they intersect too closely to my
| personal interests, it can be demoralizing.)
| andreabergia wrote:
| Thanks!
|
| I know the feeling - this project, like most of my other side
| projects, got abandoned a couple of times. But I was really
| curious about implementing a GC and, for once, I managed to
| finish something. I'm glad I did! :-)
| celeritascelery wrote:
| I am curious if you ran into limitations due to the lifetimes on
| this signature:
|
| fn execute_instruction(
|     &mut self,
|     vm: &mut Vm<'a>,
|     call_stack: &mut CallStack<'a>,
|     instruction: Instruction,
| ) -> Result<InstructionCompleted<'a>, MethodCallFailed<'a>>
|
| When I try to add a lifetime to the `Err` variant of a `Result`
| and that lifetime is invariant (which it is due to `vm` and
| `call_stack`) it usually means that I can't use the question mark
| operator or have early returns in the code[1]. This makes error
| handling more verbose and less readable. Is that your experience
| as well?
|
| [1] https://users.rust-lang.org/t/nll-and-early-return-not-
| allow...
| celeritascelery wrote:
| EDIT: Looks like this is not an issue because the invariant
| lifetime 'a is not used for the mutable reference of vm or
| call_stack. So it's not the invariance that is the problem, but
| rather how Rust reasons about the lifetime of mutable
| references, which this avoids.
|
| In that case I don't understand what the point of 'a is on VM
| and CallStack. You can create[1][2] those with any unbounded
| lifetime (including 'static[3]), which means it is not
| constraining anything. What is the lifetime 'a doing here? Why
| not remove it?
|
| [1]
| https://github.com/andreabergia/rjvm/blob/be9c54066c64a82879...
|
| [2]
| https://github.com/andreabergia/rjvm/blob/be9c54066c64a82879...
|
| [3]
| https://github.com/andreabergia/rjvm/blob/be9c54066c64a82879...
| andreabergia wrote:
| I wanted to express the fact that everything that gets
| allocated (call stack, frames, classes, and objects) is alive
| and valid until the "root" VM is, thus I used 'a more or less
| everywhere.
|
| I also struggled with, and got a ton of errors from, the borrow
| checker initially, and I fixed many of those with a lot of
| explicit lifetimes, but it's not impossible that in some
| places they are unnecessary.
| celeritascelery wrote:
| > I wanted to express the fact that everything that gets
| allocated (call stack, frames, classes, and objects) is
| alive and valid until the "root" VM is, thus I used 'a more
| or less everywhere.
|
| That's not being expressed in the type system. The lifetime
| 'a is unbounded (meaning you can make it anything you want,
| including 'static), so anything that shares 'a can outlive
| the vm without Rust complaining. It would be no different
| than if you removed 'a completely. If you wanted to ensure
| nothing could outlive the vm, you could tie the lifetime
| to a _reference_ to the vm, but then the vm can't hold
| those values (it would be a self-referential lifetime).
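|
| A minimal sketch of the difference, using made-up stand-in types
| (`Vm`, `Value`, `Borrowed` and their methods are hypothetical and
| much simpler than the rjvm ones; `Value` just copies an integer
| instead of pointing into a heap):

```rust
use std::marker::PhantomData;

/// Stand-in for a VM whose lifetime parameter is not tied to any borrow.
struct Vm<'a> {
    heap: Vec<i32>,
    _marker: PhantomData<&'a ()>,
}

/// A value "tagged" with the same 'a, but not borrowing from the VM.
struct Value<'a> {
    n: i32,
    _marker: PhantomData<&'a ()>,
}

/// A handle that really borrows from the VM.
struct Borrowed<'vm> {
    n: &'vm i32,
}

impl<'a> Vm<'a> {
    /// No argument mentions 'a, so the caller may pick any lifetime,
    /// including 'static.
    fn new() -> Self {
        Vm { heap: vec![1, 2, 3], _marker: PhantomData }
    }

    /// The result carries 'a, which is unrelated to the `&self` borrow.
    fn first(&self) -> Value<'a> {
        Value { n: self.heap[0], _marker: PhantomData }
    }

    /// The result is tied to the `&self` borrow, so it cannot outlive the VM.
    fn first_borrowed(&self) -> Borrowed<'_> {
        Borrowed { n: &self.heap[0] }
    }
}

/// This compiles: with 'a = 'static the value is returned after the VM
/// has been dropped, so 'a does not express "valid while the VM is alive".
fn value_outlives_vm() -> Value<'static> {
    let vm: Vm<'static> = Vm::new();
    let v = vm.first();
    drop(vm);
    v
}
```

| Writing the analogous function with `first_borrowed` (returning
| the `Borrowed` after dropping `vm`) is rejected by the borrow
| checker, which is the guarantee the shared 'a does not give.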
| skitter wrote:
| Funnily enough I did the same in an early wip version of
| my toy JVM. Ended up using unsafe to use 'static
| references internally but only hand out wrappers that
| include a reference to the JVM. This also ensures that
| objects/classes/... from one JVM can't be used in another
| one.
| sylware wrote:
| Can I compile it with the rust-written rust compiler?
| bkkz5046 wrote:
| but more importantly does edbrowse work here?
| babuloseo wrote:
| Oracle oracle oracle.
___________________________________________________________________
(page generated 2023-07-21 23:00 UTC)