[HN Gopher] Jazelle DBX: Allow ARM processors to execute Java by...
___________________________________________________________________
Jazelle DBX: Allow ARM processors to execute Java bytecode in
hardware
Author : vincent_s
Score : 96 points
Date : 2024-01-22 13:27 UTC (9 hours ago)
(HTM) web link (en.wikipedia.org)
(TXT) w3m dump (en.wikipedia.org)
| nanolith wrote:
| Jazelle and its replacement, ThumbEE, have been deprecated and
| removed in later architectures.
|
| On modern Cortex-A systems, there are enough resources to make
| JIT feasible. On smaller systems, AOT is a reasonable
| alternative.
| fch42 wrote:
| My brainfog claims some blurry memories of this ... for one,
| documentation was so lacking that an open-source JVM using
| Jazelle never happened; if you wanted to develop a JVM on top
| of it, you'd pay ARM for docs, professional services, and unit
| licenses. And second, once things got to the ARM11 series
| cores, software JITs beat the cr* out of Jazelle. I don't
| remember any early Android device ever using it.
|
| ARM is quite capable of vapourware generation. 64bit ARM was
| press-released (https://www.zdnet.com/article/arm-to-
| unleash-64-bit-jaguar-f...) a decade before ARMv8 / aarch64
| became a thing.
|
| (I'd love to learn more)
| RicoElectrico wrote:
| > I don't remember any early Android device ever used it.
|
| It couldn't have, as Dalvik VM is distinct from JVM.
| fch42 wrote:
| It executes Java Bytecode. Whether Dalvik VM was/is a "Java"
| VM is hardly relevant there (not least because "Java" is so
| much more than Java Bytecode, and Jazelle does nothing to
| help with anything on top of the latter).
| RicoElectrico wrote:
| Apparently it's not even Java bytecode. Would make sense;
| after all, Dalvik is register-based.
| https://stackoverflow.com/a/36335740
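|
| Roughly, for "int c = a + b;" (a hand-written illustration,
| not actual tool output; register numbers arbitrary):
|
|     JVM (stack):        Dalvik (registers):
|       iload_1             add-int v3, v1, v2
|       iload_2
|       iadd
|       istore_3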
| pjmlp wrote:
| It certainly doesn't.
|
| https://source.android.com/docs/core/runtime/dalvik-bytecode
| bitwize wrote:
| Java bytecode is transpiled to Dalvik's own bytecode as a
| build step. Dalvik itself doesn't run Java bytecode. This
| is one of the reasons why Oracle sued Google: clearly
| Google was trying to appropriate Java with some clever IP
| law dodges with this Dalvik business.
| Vogtinator wrote:
| There is a bit of info including example code on
| https://hackspire.org/index.php/Jazelle
| jaywee wrote:
| Sun wanted to do the same thing in the late '90s: picoJava
| (embedded), microJava, and UltraJava (VLIW workstations).
|
| Relegated to the dustbin of history.
| miki123211 wrote:
| Java Card still survives, though.
|
| I find Java Card pretty puzzling. You go from high-level
| interpreted languages on powerful servers, to Java and C++ on
| less powerful devices (like old phones for example), to
| almost exclusively C on microcontrollers, and then back to
| Java again on cards. If it makes sense to write Java code
| for a device small enough to draw power from radio waves, why
| aren't we doing that on microcontrollers?
| cpgxiii wrote:
| The Java Card environment is quite limited, though, due to
| resource limitations.
|
| There have been several more-or-less successful attempts at
| running higher-level languages on microcontrollers, e.g.
| .Net Micro Framework and CircuitPython. In all of these
| cases, though, you tend to struggle because all the native
| device behavior is described/intended by the vendor for use
| with C or C++, and the BSP for the higher-level environment
| is an afterthought.
| sillywalk wrote:
| FYI UltraJava was renamed to MAJC [0], which IIRC was only
| used in Sun's XVR graphics cards.
|
| More from Ars (1999)
| https://archive.arstechnica.com/cpu/4q99/majc/majc-1.html
|
| [0] https://en.wikipedia.org/wiki/MAJC
| fidotron wrote:
| The performance of Dalvik was far below J2ME on Nokia and Sony
| Ericsson feature phones for a very long time, and Android
| relied on pushing a lot to C libraries to compensate.
| pjmlp wrote:
| As a Nokia alumnus, it was incredible how much of the Google
| fanbase believed in Dalvik's performance fairy tale.
|
| ART is another matter, though.
| toast0 wrote:
| Sure, but J2ME can't seek backwards in open files. (That was
| added in Java 1.4, and J2ME is 1.3)
| zozbot234 wrote:
| There was a successor ThumbEE ("Execution Environment") that
| _was_ comprehensively documented. But it didn't get much
| attention either and later chips removed it.
| happosai wrote:
| IIRC Jazelle left it to the implementers which bytecodes to
| handle in HW and which to trap to SW. Since SW JITs beat the
| Jazelle implementations by the ARM11 times, the CPU
| implementers would just leave everything to the SW traps... So
| while the original Raspberry Pi was an ARM1176J, and the J
| meant Jazelle support, it was all already hollowed out.
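|
| The software side of that split looks something like this (a
| toy model in C; the real handler-table interface is what the
| https://hackspire.org/index.php/Jazelle page linked above
| documents, and the names here are invented):
|
|     struct vm;                        /* interpreter state */
|     typedef void (*bc_handler)(struct vm *);
|     extern bc_handler handlers[256];  /* one per JVM opcode */
|
|     /* The core jumps here for any bytecode it chose not to
|      * implement; a fully hollowed-out core would land here
|      * for every single opcode. */
|     void trap_to_software(struct vm *vm, unsigned char op) {
|         handlers[op](vm);
|     }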
| michaelt wrote:
| I remember reading about Jazelle many years ago - before the
| release of the iPhone and suchlike. This was the age when people
| were coming up with things like 'Java Card' - smartcards
| programmed directly in Java.
|
| I never heard of anyone actually using Jazelle, though - I assume
| JIT ended up working better.
| fch42 wrote:
| I'm a little in the realm of speculation here. Part of the
| issue with Java for embedded devices was "a bad fit". What made
| Java thrive in the server or even applet spaces wasn't the
| instruction set but the rich ecosystem around Java. Yet,
| threading - as "inherent" to Java as it is - is provided by the OS
| and "only" used/wrapped by the JVM. All the libraries ...
| megabytes of (useful) software, yet not implemented (nor even
| helped) by hardware acceleration. The "equivalent" of static
| linking to minimize the footprint never quite existed.
|
| So on a smartcard ... write software in an uncommon low-level
| instruction set (especially when compared with ARM, which is a
| very "rich" assembly language), and pay both Sun and ARM top $
| for the privilege - never mind the likely "runtime" footprint
| far exceeding the 256kB RAM you planned for that $5 card -
| why? Writing small single-threaded software in anything
| that compiles down to a static ARM binary has been easy and
| quick enough that going off the ARM instruction set looked
| pointless for most. And learning which parts of "Java" actually
| worked in such an environment was hard, even (or especially?)
| for developers who knew (the strengths of) Java well. Because
| developers and specifiers expected "rich Java", and couldn't
| care less about the Bytecode. JITs later only hoovered up the
| ashes.
| pjmlp wrote:
| Java is doing just fine in embedded devices.
|
| https://www.ptc.com/en/products/developer-tools/perc
|
| https://www.aicas.com/wp/
|
| https://www.microej.com/
|
| https://en.wikipedia.org/wiki/BD-J
|
| https://www.thalesgroup.com/en/markets/digital-identity-
| and-...
| bombcar wrote:
| IIRC people didn't "really believe" that Java could _actually
| be performant_ because they assumed that since it has a JIT
| layer, it would _never even get close_ to native code.
|
| But the reality was that JIT allows code to get _faster_ over
| time, as the JIT improves.
|
| Things like Jazelle let chip manufacturers paper over what was
| largely a paper objection.
| PaulHoule wrote:
| Specialized hardware has been losing out to general-purpose
| hardware for years.
|
| There were those "LISP machines" in the early 1980s but when
| Common Lisp was designed they made sure it could be
| implemented efficiently on emerging 32-bit machines.
| bombcar wrote:
| Part of the reason is that any time specialized hardware is
| found that _works_, the general-purpose hardware steals the
| feature that makes it faster - basically the story of all
| the extensions to x86 like SSE, etc.
| o11c wrote:
| > But the reality was that JIT allows code to get faster over
| time, as the JIT improves.
|
| Ehh .. PGO is only somewhat better for JIT than AOT. More
| often for purely-numerical code the win is because the AOT
| doesn't do per-machine `-march=native`. It's the memory model
| that kills JVM performance for any nontrivial app though.
| tenaf0 wrote:
| Well, code size is another interesting aspect here. A JIT
| compiler can effectively create any number of versions for
| a hot method, based on even very aggressive assumptions (an
| easy one would be that a given object is non-null, or that
| an interface only has a single implementation loaded). The
| checks for these are cheap (e.g. a null check can be encoded
| as a trap on an invalid page address), and invalidation's
| cost is amortized.
|
| Contrast this with the problem of specialization in AOT
| languages, which can easily result in bloated binaries (PGO
| does help here quite a lot, that much is true). For
| example, generics might emit a completely new function
| for every type they get instantiated with - if the function
| is not that hot, it actually makes sense to try to
| handle more cases with the same code.
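|
| In C, the emitted guard is morally something like this (a
| sketch; all names are invented, and a real JIT would patch
| this code in and out rather than write it by hand):
|
|     struct object;
|     struct vtable { int (*next)(struct object *); };
|     struct object { const struct vtable *vt; };
|
|     extern const struct vtable arraylist_iter_vtable;
|     int arraylist_iter_next_inlined(struct object *it);
|     int deoptimize_and_retry(struct object *it);
|
|     int next_speculative(struct object *it) {
|         if (it->vt == &arraylist_iter_vtable)       /* cheap guard */
|             return arraylist_iter_next_inlined(it); /* fast path  */
|         return deoptimize_and_retry(it);            /* rare path  */
|     }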
| BenoitP wrote:
| The gains seem not to have been high enough to sustain that
| project. Nowadays CPUs plan, fuse, and reorder so many micro-
| ops that lower-level languages can sort of be considered virtual
| as well.
|
| But Java and similar languages transfer more freedom of
| operation from the programmer to the runtime: no memory address
| shenanigans, richer types, and to some extent immutability and
| sealed chunks of code. All of these could be picked up and
| turned into more performance by the hardware, with some help
| from the compiler. Sort of like SQL being a 4th-gen language,
| letting the runtime collect statistics and choose the best
| course of execution (if you squint at it in the dark with
| colored glasses).
|
| More recent work about this is to be found on the RISC-V J
| extension [1], still to be formalized and picked up by the
| industry. Three features could help dynamic languages:
|
| * Pointer masking: you can fit a lot in the unused higher bits of
| an address. Some GCs use them to annotate memory (referred-
| to/visited/unvisited/etc.), but you have to mask them. A
| hardware-assisted mask could help a lot (see the sketch after
| this list).
|
| * Memory tagging: Helps with security, helps with bounds-checking
|
| * More control over instruction caches
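|
| The masking in question, done in software (a sketch; the bit
| positions and tag names are made up):
|
|     #include <stdint.h>
|
|     #define TAG_SHIFT   56
|     #define TAG_MASK    ((uintptr_t)0xff << TAG_SHIFT)
|     #define TAG_VISITED ((uintptr_t)1 << TAG_SHIFT)
|
|     /* GC marks an object by setting a high bit in its pointer */
|     static inline void *mark_visited(void *p) {
|         return (void *)((uintptr_t)p | TAG_VISITED);
|     }
|
|     /* ...and every dereference pays for the strip; hardware
|      * pointer masking would make this step free */
|     static inline void *strip_tags(void *p) {
|         return (void *)((uintptr_t)p & ~TAG_MASK);
|     }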
|
| It is sort of stale at the moment, and if you track down the
| people working on it they've been reassigned to the AI-
| accelerator craze. But it's going to come back, as Moore's law
| continues to end and Java's TCO will again be at the top of the
| bean-counter's stack.
|
| [1] https://github.com/riscv/riscv-j-extension
| lionkor wrote:
| > as Moore's law continues to end
|
| more like Wirth's law still proving itself
| pjmlp wrote:
| As free beer AOT compilers for Java are commonly available, and
| as shown on Android since version 5, I doubt special opcodes
| will matter again.
|
| Ironically, when one dives into computer archeology, old
| Assembly languages are occasionally referred to as bytecodes,
| the reason being that in CISC designs with microcoded CPUs they
| were already seen that way by hardware teams.
| BenoitP wrote:
| I'm still not decided on AOT vs JIT being the endgame.
|
| In theory JIT should be higher performance, because it
| benefits from statistics taken at actual runtime. Given a
| smart enough compiler. But as a piece of code matures and
| gets more stable, the envelope of executions is better known
| and programmers can encode that at compile-time. That's the
| tradeoff taken by Rust: ask for more proofs from the
| programmers, and Rust is continuing to pick up speed.
|
| That's also what the Leyden project / condensers [1] is
| about, if I understand correctly. Pick up proofs and
| guarantees as early as possible and transform the program.
| For example by constant-propagating a configuration file
| taken up during build-time.
|
| Something I've pondered over the years: a programmer's job is
| not to produce code. It is to produce proofs and guarantees
| (yet another digression/rant: generating code was never a
| problem. Before LLMs we could copy-paste code from
| StackOverflow just fine)
|
| In the end it's only about marginal improvements though.
| These could be superseded by changes of paradigm like RAM
| getting some compute capabilities; or programs being split
| into a myriad of specialized instructions. For example
| filters, rules and parsing going inside the network card; SQL
| projections and filters going into the SSD controller; or
| matrix-multiplication going into integrated GPU/TPU/etc just
| like now.
|
| [1] https://openjdk.org/projects/leyden/notes/03-toward-
| condense...
| pjmlp wrote:
| The best solution isn't AOT vs JIT, but rather JIT and AOT,
| having both available as a standard part of the tooling.
|
| Android has learnt to have both, and thanks to PGO being
| shared across devices via Play Store, the AOT/JIT outcome
| reaches the ideal optimum for a specific application.
|
| Azul and IBM have similar approaches on their JVMs with a
| cluster based JIT, and JIT caches as AOT alternative.
|
| Also stuff like GPGPU is a mix of AOT and JIT, and is doing
| quite alright.
|
| I am not so confident with LLMs; when they get good enough,
| programmers will be left out of the loop, and will have to
| content themselves with roles similar to doing no-code SaaS
| configs, or some form of architect.
|
| A few programmers will remain as the LLMs' high priests.
| BenoitP wrote:
| > A few programmers will remain as the LLMs' high priests.
|
| That's interesting.
|
| It's controversial to say this in 2024, but not all
| opinions have the same value. Some are great, but some
| are plain dumb. The currently approved corporate opinion
| is to praise LLMs as the end-all be-all. I've been asked
| to advise a private banking family office wanting to get
| into LLMs. For advising their clients' financial
| decisions. I politely declined. Can there be a worse use
| case? LLMs are parrots with the brain size of the
| internet. With thoughts of random origin mixed together
| randomly. They produce wonderful form, but abysmal
| analysis.
|
| IMHO as LLMs begin to be indistinguishable from real
| users (and internet dogs), there's going to be a
| resurgent need to trace origin to a human, and maybe to
| rank their opinions as well. My money is on some form of
| distributed social proof designating the high priests.
| pjmlp wrote:
| I see the current state of LLMs as similar to when we read
| about Assembly programmers being skeptical that something
| like high-level languages would ever take off.
|
| When we read about the history of Fortran, there are several
| remarks on the amount of work put into place to win over
| those developers, as otherwise Fortran would have been
| yet another failed attempt.
|
| LLMs seem to be at a similar stage; maybe their Fortran
| moment isn't here yet - parrots, as you say - but it will
| come.
| tenaf0 wrote:
| I do think that, in the general case, a JIT compiler is
| required: you can't make _every_ program fast without
| the ability to synthesize new code based on information
| available only at runtime. There are many programs where AOT
| is more than enough, but not all are such. Note, this doesn't
| preclude AOT/hybrid models, as pjmlp correctly says.
|
| One stereotypical (but not the best) example would be
| regexes: you basically want to compile some AST into a
| mini-program. This can also be done with a tiny interpreter
| without a JIT, which will be quite competitive in speed (I
| believe that's what Rust has, and it's indeed one of the
| fastest - the advantage of the problem/domain here is that
| you really can have tiny interpreters that use the caches
| efficiently, with very little overhead on today's CPUs),
| but I am quite sure that a "JITted Rust" with all the other
| optimizations/memory layouts _could_ potentially fare
| better - but of course it's not a trivial additional
| complexity.
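|
| For flavor, the tiny-interpreter end of that tradeoff is
| famously small - this is Kernighan and Pike's minimal matcher
| (literals, '.', '*', '^', '$' only), which just walks the
| pattern with no compilation step at all:
|
|     static int matchhere(const char *re, const char *text);
|
|     /* c* : match zero or more instances of c, then the rest */
|     static int matchstar(int c, const char *re, const char *text) {
|         do {
|             if (matchhere(re, text))
|                 return 1;
|         } while (*text != '\0' && (*text++ == c || c == '.'));
|         return 0;
|     }
|
|     /* match re at the start of text */
|     static int matchhere(const char *re, const char *text) {
|         if (re[0] == '\0')
|             return 1;
|         if (re[1] == '*')
|             return matchstar(re[0], re + 2, text);
|         if (re[0] == '$' && re[1] == '\0')
|             return *text == '\0';
|         if (*text != '\0' && (re[0] == '.' || re[0] == *text))
|             return matchhere(re + 1, text + 1);
|         return 0;
|     }
|
|     /* match re anywhere in text */
|     int match(const char *re, const char *text) {
|         if (re[0] == '^')
|             return matchhere(re + 1, text);
|         do {
|             if (matchhere(re, text))
|                 return 1;
|         } while (*text++ != '\0');
|         return 0;
|     }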
| vextea wrote:
| Remember when, for a while, Azul tried to sell custom CPUs to
| support features in their JVM (e.g. some garbage collector
| features that required hardware interrupts and some other extra
| instructions)? They dropped it pretty quickly in favor of just
| working on software.
|
| https://www.cpushack.com/2016/05/21/azul-systems-vega-3-54-c...
| sillywalk wrote:
| IBM's z14 (and later, I assume) supported the Guarded Storage
| Facility for "pauseless Java garbage collection."
| toast0 wrote:
| > Pointer masking: you can fit a lot in the unused higher bits
| of an address. Some GCs use them to annotate memory (refered-
| to/visited/unvisited/etc.), but you have to mask them. A
| hardware assisted mask could help a lot.
|
| If you're building hardware masking, it should be viable for
| low bits too. If you define all your objects to be n-byte
| aligned, that frees up low bits for things as well, and might
| not be an imposition - things like to be aligned anyway.
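|
| A rough sketch of that trick (assuming 8-byte alignment; the
| tag meanings are invented):
|
|     #include <stdint.h>
|
|     #define LOW_TAG_MASK ((uintptr_t)0x7)  /* low 3 bits free */
|
|     static inline void *with_tag(void *p, unsigned tag) {
|         /* p must be 8-byte aligned, tag must fit in 3 bits */
|         return (void *)((uintptr_t)p | (tag & LOW_TAG_MASK));
|     }
|
|     static inline unsigned tag_of(void *p) {
|         return (unsigned)((uintptr_t)p & LOW_TAG_MASK);
|     }
|
|     static inline void *without_tag(void *p) {
|         return (void *)((uintptr_t)p & ~LOW_TAG_MASK);
|     }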
| ithkuil wrote:
| The SPARC ISA had tagged arithmetic instructions so that you
| could tag integers using the LSBs and ignore them
| pjc50 wrote:
| One of the few elements left like this is the ARM Javascript
| instruction: https://news.ycombinator.com/item?id=24808207
| funcDropShadow wrote:
| The Java ecosystem initially started with optimizing Java
| compilers. That setup could benefit from direct hardware
| support for Java bytecode. Later, it was discovered that it is
| more beneficial to remove the optimizations from javac in
| order to provide more context to the JIT compiler, which
| enables better optimizations from JIT compilers. By directly
| running Java bytecode, you would lose so many optimizations
| done by HotSpot that it is hard to get on par just by
| interpreting bytecode in hardware. The story may be different
| for restricted JVMs that don't have a sophisticated JIT.
| miohtama wrote:
| The current (largest) end-user Java ecosystem is in practice
| Android and its ahead-of-time compiling ART.
|
| Java itself got very good. Though Oracle was blocked from
| leeching money, or from having a return on their investment,
| depending on the viewpoint.
| kaba0 wrote:
| I don't really get your last point - java's improvements
| are _due_ to Oracle, not despite it. They have a terrible
| name, but they have been excellent stewards of the
| platform.
| miohtama wrote:
| Android's ART would be unlikely to exist if Oracle had
| been enforcing licensing requirements as they wished.
|
| ART runs on devices for 1B+ users and is more relevant
| to the world's population than Oracle. Although we can
| speculate that Android would likely have switched to
| something else if Oracle had won in court.
| bobsmith432 wrote:
| Fun fact: both the Wii's secondary ARM chip used for security
| tasks and the iPhone 2G's processor had Jazelle but never used
| it.
| khangaroo wrote:
| The 3DS has it too: https://github.com/SonoSooS/libjz
| repiret wrote:
| It was on every ARM926, ARM1136, and ARM1176. Lots of devices
| of a certain era had it but didn't use it.
| willvarfar wrote:
| In a similar spirit, Apple seems to have made sure some critical
| OSX idioms were fast on the M1, perhaps even influencing its
| instruction set.
|
| Retaining and releasing an NSObject took ~6.5 nanoseconds on the
| M1 when it came out, compared with ~30 nanoseconds on the
| equivalent-generation Intel.
|
| In fact, the M1 _emulated_ an Intel retaining and releasing an
| NSObject faster than an Intel could!
|
| One source: https://daringfireball.net/2020/11/the_m1_macs
| ithkuil wrote:
| The M1 emulation with Rosetta is actually dynamic recompilation,
| so if you're measuring only that specific small section it's
| not surprising that Rosetta could have emitted optimal code for
| that instruction sequence.
| nadavwr wrote:
| Running bytecode instructions in hardware essentially means a
| hardware-based interpreter. It likely would have been the best
| performing interpreter for the hardware, but JIT-compilation to
| native code still would run circles around it.
|
| During the years when this instruction set was relevant (though
| apparently unutilized), Oracle still had very limited ARM support
| for Java SE, so having a fast interpreter could have been
| desirable -- but it makes no sense on beefier ARM systems that
| can run the decent JIT or AOT options available nowadays.
| nickpsecurity wrote:
| If you want a Java processor, there's a cheap one for FPGAs
| here:
|
| https://www.jopdesign.com/
|
| It's more for embedded use. Might give someone ideas, though.
| depr wrote:
| The trend of posting a page here based on some detail from a
| comment posted in another thread ("What's that touchscreen in
| my room?" in this case) a few days ago has become quite
| frequent and a bit annoying.
|
| To everyone who wants to write "but I didn't read that thread
| and I find this quite interesting": you are free to find it
| interesting, but I did read about it 2 days ago and to me it
| looks like karma farming.
| donio wrote:
| Regarding the quality of posts in general, the problem is not
| what gets posted - there is a ton of junk in the "new" queue and
| most of it never makes it to the front page. It's what gets
| upvoted.
| bitwize wrote:
| Why you gotta yuck other people's yum, man?
|
| To me it seems like Hackernews, as a whole, goes off on the
| same kinds of thought-tangents as I do, and that makes the site
| more interesting. And I _was_ one of the commenters about
| Jazelle on the thread you mentioned.
| depr wrote:
| Because this is a community and I care about what goes on in
| it. I think this is precisely not a thought-tangent, it is
| taking the tangent that occurred elsewhere and posting an
| article about it to score internet points. If we keep doing
| that, this becomes even more of an echo chamber and a very
| boring place.
| czscout wrote:
| Not everyone spends so much time on this site that they can
| easily spot a post as an extension of a related discussion
| elsewhere on the site. Someone posted this page, it got upvoted
| by others who found it interesting, and now it's on the front
| page. What's wrong with that?
| depr wrote:
| Popular content is not necessarily good content (very boring
| to say this, but just look at reddit). And posting articles
| to get upvotes, which I'm not saying this post is necessarily
| doing but at least _some_ are doing, leads to lower quality.
| HN barely has any methods for maintaining overall quality of
| the website and it will automatically degrade as it gets
| larger.
|
| To simply allow these posts and have them hit the front
| page when they get upvotes is a valid position. But I think
| it contributes to a website that is less interesting.
|
| I don't think these posts should be removed, but they should
| at least be frowned upon, and/or linked to the original
| comment thread.
| icegreentea2 wrote:
| I think attribution would be nice - both from an honesty
| standpoint and because it's generally useful.
| LeFantome wrote:
| I was going to say that history repeats itself. This is
| incorrect. This is actually just history.
|
| This is so old that its replacement, ThumbEE, had already been
| deprecated as well.
| Twirrim wrote:
| A little related, back in the day Sun Microsystems came up with
| picoJava, https://en.wikipedia.org/wiki/PicoJava, a full
| microprocessor specification dedicated to native execution of
| Java bytecode. It never really went anywhere, other than a few
| engineering experiments, as far as I remember.
|
| For a while Linus Torvalds, of Linux kernel fame, worked for
| a company called Transmeta,
| https://en.wikipedia.org/wiki/Transmeta, who were doing some
| really interesting things. They were aiming to make a highly
| efficient processor that could handle x86 through a special
| software translation layer. One of the languages they could
| support was picoJava. IIRC, the processor was never designed to
| run operating systems etc. natively. The intent was always to
| have it work through the translation layer, something that could
| easily be patched and updated to add support for any x86
| extensions that Intel or AMD might introduce.
| specialist wrote:
| I was never clear on how Java bytecodes would be implemented on
| a register-based CPU. Efficiently.
|
| The JVM is stack-based, right? So it'd be an interpreter (in
| microcode)? Unless there's some kind of "virtual" stack, as
| spec'd for picoJava.
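|
| (My rough understanding of the usual software answer: the
| operand stack is just an array, and the interpreter - or
| Jazelle, reportedly - keeps the top few slots in machine
| registers. A toy version in C, with invented opcodes:)
|
|     #include <stdint.h>
|
|     enum { OP_PUSH, OP_IADD, OP_RET };
|
|     int32_t run(const uint8_t *code, const int32_t *consts) {
|         int32_t stack[64];      /* the "JVM" operand stack */
|         int sp = 0;             /* next free slot */
|         for (int pc = 0;; pc++) {
|             switch (code[pc]) {
|             case OP_PUSH:       /* push the next constant */
|                 stack[sp++] = *consts++;
|                 break;
|             case OP_IADD: {     /* pop two, push the sum */
|                 int32_t b = stack[--sp];
|                 int32_t a = stack[--sp];
|                 stack[sp++] = a + b;
|                 break;
|             }
|             case OP_RET:
|                 return stack[--sp];
|             }
|         }
|     }
|
| Running {OP_PUSH, OP_PUSH, OP_IADD, OP_RET} over constants
| {2, 3} returns 5; a good compiler keeps sp and the top of
| stack in registers, which is most of the "virtual stack" idea.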
|
| I'm less clear on how Jazelle would implement only a subset of
| the bytecodes.
|
| Am noob. And a quick scholar search says the relevant papers are
| paywalled. Oh well; now it's just a curiosity.
|
| Stack-based CPUs are cool, right? For embedded. Super efficient
| and cheap, enough power for IoT or secure enclaves or whatever.
|
| But it seems that window of opportunity closed. Indeed, if it was
| ever open.
| penguin_booze wrote:
| I hope people get it: an arm, and then thumb.
___________________________________________________________________
(page generated 2024-01-22 23:01 UTC)