[HN Gopher] JEP 450: Compact Object Headers
___________________________________________________________________
JEP 450: Compact Object Headers
Author : mfiguiere
Score : 165 points
Date : 2023-05-04 15:25 UTC (7 hours ago)
(HTM) web link (openjdk.org)
(TXT) w3m dump (openjdk.org)
| davnicwil wrote:
| As I (think I) understand it, the goal of this is to reduce the
| memory footprint of the HotSpot JVM, with the tradeoff of
| performance degradation capped at 5%, and only in the rarest,
| worst cases.
| aaa_aaa wrote:
| Just guessing that the cache-throughput increase from the memory
| reduction may offset most performance issues.
| w10-1 wrote:
| I'm always impressed with the clarity of Java's
| design/implementation discussions, which I think is due to Mark
| Reinhold's leadership since 1997 (a breathtaking tenure in its
| own right).
|
| But this needs to replace stack locking with an alternate
| lightweight locking scheme, to avoid races. Unfortunately, that
| is opaque:
| https://bugs.openjdk.org/browse/JDK-8291555
|
| Does anyone have other pointers for the design or viability of
| the required alternative?
| zorgmonkey wrote:
| There is a detailed description of the new stack-locking scheme
| in the PR here https://github.com/openjdk/jdk/pull/10907
| akokanka wrote:
| 5% latency overhead is huuuge. Throw more memory at it and focus
| on throughput. Not sure why memory is a concern here.
| kasperni wrote:
| "in infrequent cases" not in general.
| jesboat wrote:
| Different applications can have very different requirements.
| I've worked on systems which would kill for 5% lower latency and
| on systems that would gladly pay 5% for better memory usage.
| exabrial wrote:
| Is this part of the JVM Specification, or is this an
| implementation detail of OpenJDK itself? If the former, that's
| sort of surprising. Seems like an implementation detail that'd
| be best left up to the platform: how to manage objects in
| memory.
| mike_hearn wrote:
| No spec changes needed for this one.
| papercrane wrote:
| This is an implementation detail of the HotSpot JVM in OpenJDK.
| Other implementations are free to lay out their objects as they
| choose.
| Kwpolska wrote:
| This document is a proposal for the HotSpot JVM, which is the
| default JVM implementation (the one that ships with OpenJDK),
| but not the only JVM implementation out there (see
| https://en.wikipedia.org/wiki/List_of_Java_virtual_machines for
| a full list).
| pdpi wrote:
| The JEP identifies the scope as "implementation" and the
| component as "hotspot/runtime", so I do assume it's an
| implementation detail, yes.
| marginalia_nu wrote:
| Java made a bunch of optimistic assumptions about the future of
| hardware when it was designed. UTF-16 strings, 64 bit pointers,
| enormous object headers.
|
| While great for future-proofing, it's been hard to deny it's had
| overhead. Glad to see it's slowly getting undone: Compressed
| Oops, Compact Strings, now this.
| komadori wrote:
| 64-bit longs and doubles take up two slots in the JVM's local
| variable table, whereas Object references only take one. So, if
| anything, the design of Java bytecode assumed 32-bit pointers.
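|
| A small illustration of that slot layout (class and method names
| are hypothetical):
|
|     class SlotDemo {
|         static long f(int a, long b, Object c) {
|             // local variable table: a -> slot 0,
|             // b -> slots 1 and 2 (longs/doubles occupy two),
|             // c -> slot 3 (an Object reference occupies one)
|             return a + b;
|         }
|     }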
| kaba0 wrote:
| They can just use 64-bit slots for everything (it does leave an
| unused 32 bits for each 32-bit local variable, but there aren't
| many of those; you get the benefit of the same logic no matter
| what you store there, and it performs the same on 64-bit
| systems).
|
| Edit: mind explaining the downvote ?
| ternaryoperator wrote:
| Not sure why you're being downvoted.
|
| The Jacobin JVM [0] does exactly what you suggest: 64-bit
| operand stack and local variables. Longs and doubles still
| occupy two slots on the operand stack, so as to avoid
| having to recompile Java classes that assume the two-slot
| allocation, but the design avoids having to smash together
| two 32-bit values every time a long or double is operated
| on.
|
| [0] http://jacobin.org/
| skitter wrote:
| Relatedly, JVM specification section 4.4.5 says (as the same
| holds for the constant pool and the stack): "In retrospect,
| making 8-byte constants take two constant pool entries was a
| poor choice."
| masklinn wrote:
| > 64 bit pointers
|
| 64-bit pointers were not an "optimistic assumption" about
| anything; they were just 64-bit pointers on 64-bit systems, like
| most everyone else. And compressed oops were added more than a
| decade ago
| (https://wiki.openjdk.org/display/HotSpot/CompressedOops).
|
| For reference, that's about when x32 was added to the Linux
| kernel, and unlike x32, compressed oops have _not_ been on the
| chopping block for 5 years.
|
| > enormous object headers.
|
| Hardly? They're two words: a class pointer, and a "mark word"
| for GC integration.
| marginalia_nu wrote:
| > And compressed oops were added more than a decade ago
|
| That is still recent on a Java timescale.
|
| > Hardly? They're two words: a class pointer, and a "mark
| word" for GC integration.
|
| Well, compare with, for example, C++, which can fly commando
| with no header, just (optional) alignment padding.
|
| In many Java objects, the header is more than half the size of
| the object. That's just not good data locality. The speed-up
| from switching from an array of objects that has n fields to an
| object which has n arrays of fields can be very significant.
| kaba0 wrote:
| Your array of objects is just tightly packed pointers, so the
| data locality (or its lack) doesn't come from that (especially
| since the order of objects in memory may differ from the order
| in the array).
| masklinn wrote:
| I'm going out on a limb as they're going in every direction and
| trebuchet-ing goalposts, but I'd assume they're talking about
| using an SoA structure, so instead of having a bunch of Foo
| objects each with a header and, say, an `int` field (so two
| words of header for half a word of payload) you have a single
| header at the top of the array, then an int value per record in
| the packed array.
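|
| A rough sketch of the two layouts being compared (names are
| hypothetical; the exact header cost is an implementation
| detail):
|
|     class AosVsSoaSketch {
|         static class Foo { int v; }      // header + one int each
|         Foo[] aos = new Foo[1_000_000];  // array of references,
|                                          // one header per element
|         int[] soa = new int[1_000_000];  // one header for the
|                                          // whole packed array
|     }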
| iainmerrick wrote:
| 64-bit pointers isn't really an optimistic assumption, it's
| just the normal pointer size on 64-bit systems.
| Bjartr wrote:
| It was optimistic because Java came out long before 64 bit
| systems were common.
| zactato wrote:
| The DEC Alpha was a 64-bit CPU architecture that came out in
| 1992. Java supported it from Java 1.0 to ~Java 3.
|
| Interestingly, I can't find any information on Google about
| this, but ChatGPT does support this. Unfortunately all the
| primary sources it gave me are for docs that no longer
| exist on the web. The Wayback machine was no help. The old
| web is dead :(
| fweimer wrote:
| Apparently, there was a Java port for Windows NT running
| on the DEC Alpha:
| https://archive.org/details/jdk-v1-alpha_nt
|
| It's unclear whether it's actually a 64-bit build. Did
| Windows even have a 64-bit userspace on Alpha, or was it
| all ILP32?
|
| Anyway, it's not really possible to implement the Java
| memory model on Alpha: https://www.cs.umd.edu/~pugh/java/
| memoryModel/AlphaReorderin... So it's not really a
| natural target for Java code.
| twoodfin wrote:
| Wikipedia matches my recollection: Windows NT was
| actually ported to 64-bit using Alpha workstations, so
| there were probably some pre-release versions floating
| around. But by the time 64-bit Windows was ready, Alpha
| was a dying platform and the original release was only
| for the 64-bit platform with a real future, Itanium.
| cesarb wrote:
| > Interestingly, I can't find any information on Google
| about this, but ChatGPT does support this. Unfortunately
| all the primary sources it gave me are for docs that no
| longer exist on the web.
|
| Or perhaps that never existed. It's well known that
| ChatGPT often hallucinates nonexistent references (see for
| instance the discussion at
| https://news.ycombinator.com/item?id=33841672).
| formerly_proven wrote:
| Common in desktops, yes, but Sun, being a UNIX workstation
| vendor, switched in the '90s.
| oefrha wrote:
| Well, the 3 billion devices (r) Java ran on certainly
| weren't Sun workstations.
| prpl wrote:
| UltraSPARC says hello
| iainmerrick wrote:
| Optimistic to represent object references with native
| pointers?
|
| C and C++ usually do that too.
| marginalia_nu wrote:
| Right, but in the mid '90s when Java was designed, 64 bit
| machines were basically an exotic architecture and their
| practicalities weren't well understood, which is quite clear
| given that CompressedOops is basically the sane default.
|
| The weird inconsistency is apparent in how you fairly frequently
| stub your toe on the 2-billion size limit of Java arrays, which
| is an area where 64 bits would have made much more sense than in
| object references.
| znpy wrote:
| > Right, but in the mid '90s when Java was designed, 64 bit
| machines were basically an exotic architecture
|
| No they weren't, not in the enterprise market. At home?
| sure.
|
| Sun Microsystems released the SPARC V9 architecture in 1993.
| Sun also made the Java language and the JVM.
| krzyk wrote:
| Wasn't Java targeted at embedded? AFAIR at some point it was
| marketed as something that would be in every washing machine and
| fridge.
|
| There was a HotJava browser; weren't servers just a small part
| of its target?
| jbverschoor wrote:
| "Write Once, Run Everywhere" was their slogan.
|
| There's Java Card for smartcards yes. There was also Java
| Micro Edition (j2me) for phones.
| fweimer wrote:
| Most binaries were still 32-bit for performance reasons, and I
| don't think that Java 1.2 had a 64-bit port yet, not even on
| Solaris/SPARC.
|
| The situation that an x86-64 build is faster than an i386 build
| of most applications (except extremely pointer-heavy ones) is a
| bit of an exception, because x86-64 added additional registers
| and uses a register-based calling convention everywhere. That
| happens to counteract the overhead of 64-bit pointers in most
| cases. Other 64-bit architectures with 32-bit userspace
| compatibility kept using 32-bit userspace for quite some time.
| titzer wrote:
| Identity hashcodes and monitors are perennially tricky to
| implement with low space overhead and I think in hindsight that
| putting them at the base of the object model (i.e. every object
| has them) was a mistake.
|
| WebAssembly GC objects do not have identity hash codes or
| monitors, in order to strive for the lowest space overhead in
| the base object model.
| mastax wrote:
| I wondered why .NET let you lock any object, it seems like a
| strange feature. I figured they just had a spare bit in a
| bitset somewhere. Should've known it was copied from Java.
| kevingadd wrote:
| In practice (IIRC, I'd have to check again to be 100% sure)
| there's a little 'extra garbage' (my name, the real name is
| something else) pointer inside .NET object headers, and if
| your object has an identity hashcode or has been used with
| locking, the extra garbage pointer points to a separate heap
| allocation containing the hashcode and the locking info. That
| way you're not paying for it for every object. I think in
| some cases there are optimizations to be able to store the
| hashcode inline (tagging, etc). So it ends up being similar
| to these 'compact object headers' described here in terms of
| size but is done by making those optional features more
| expensive instead of by bitpacking stuff into the object
| header.
|
| Caveat: There are multiple .NET implementations so what I've
| said may not apply to all of them
| mike_hearn wrote:
| As the JEP explains, the JVM has done both those things for
| a long time. In particular the object doesn't actually pay
| the price of a lock unless it's actually locked at some
| point. It used to be the case that the JVM was even more
| extreme and you wouldn't pay the price of a lock unless the
| object was actually _contended_; this was called biased
| locking, but the code complexity to implement it was eventually
| determined to be no longer worth it on modern hardware.
| hinkley wrote:
| Even in the JDK 1.3 era I recognized that anyone could grab a
| lock on any object, not just the object itself, so that was a
| bit of a problem. I started experimenting with member variables
| called "lock" that were new Object(). It seemed pretty dumb at
| first, but then I discovered lock splitting and it was off to
| the races. There's always going to be some object with multiple
| concerns, especially if one or two functions involve both but
| others involve only one (e.g., add/delete versus generating data
| from the current state).
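|
| A minimal sketch of that lock-splitting pattern (class and field
| names are hypothetical):
|
|     import java.util.ArrayList;
|     import java.util.List;
|
|     class EventLog {
|         // one lock per concern, instead of locking 'this' for both
|         private final Object entriesLock = new Object();
|         private final Object statsLock = new Object();
|
|         private final List<String> entries = new ArrayList<>();
|         private long generated;
|
|         void add(String e) {
|             synchronized (entriesLock) { entries.add(e); }
|         }
|
|         void recordGenerated() {
|             // independent of add(), so the two never contend
|             synchronized (statsLock) { generated++; }
|         }
|     }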
|
| Am I misremembering that there was a time when the JRE lazily
| added locks to objects? I thought it was part of their long
| road to lock elision.
| mike_hearn wrote:
| It does. It's called lock inflation. The lock data structure
| is more than just a handful of bits, so the object header can
| point to the real lock once allocated. You don't pay a full
| lock for every object, just the memory needed to determine if
| it's in use or not.
| Someone wrote:
| > Identity hashcodes and monitors are perennially tricky to
| implement with low space overhead and I think in hindsight that
| putting them at the base of the object model (i.e. every object
| has them) was a mistake.
|
| I can understand the hashcode choice, but I never understood
| why they chose to add the ability to lock on any object. IMO it
| still is a bad choice now on server machines, and it certainly
| was in a language designed for embedded devices in the early
| 1990s.
|
| Does anybody know what they were thinking? Were they afraid of
| having to support parallel class hierarchies with a
| _LockableFoo_ alongside _Foo_ for every class? If so, why? Or
| did they think most programs would have very few objects, and
| mostly use reference types?
| mike_hearn wrote:
| At the time, there was a widespread assumption that massively
| parallel computing was the future and thus that any serious
| language had to try and integrate concurrency as a core
| feature.
|
| It wasn't just Java. For many years THE argument for
| functional languages was you'd better learn them because
| these languages will soon be automatically parallelized, and
| that's the only way you'll be able to master machines with
| thousands of cores.
|
| With hindsight we know it didn't work out like that. We got
| multi-core CPUs but not _that_ many cores. Most code is still
| single threaded. We got SIMD but very little code uses it. We
| got GPUs with many cores, but they are programmed with
| imperative, fairly boring C-like languages that don't use
| locking or message passing or anything else; they're data-
| parallel pure functions.
|
| But at the time, people didn't know that. You can see how, in
| that environment of uncertainty, "everything will be massively
| multi-threaded so let's give everything a lock" might have made
| sense.
| paulddraper wrote:
| > We got SIMD but very little code uses it.
|
| Because programming for SIMD is wildly different from MIMD.
| (And MIMD is what `synchronized` is for.)
|
| The theory was that there would be more machines like AMD
| Threadripper.
| paulddraper wrote:
| > Does anybody know what they were thinking?
|
| Ergonomics. (Plus an inherent assumption of mutable data.)
|
| You can write `synchronized (this)` or a `synchronized` method,
| etc., and it just works.
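|
| A tiny example of those two forms; both lock the receiver's
| monitor (class name hypothetical):
|
|     class Counter {
|         private int n;
|
|         synchronized void inc() {      // locks 'this' for the body
|             n++;
|         }
|
|         void incExplicit() {
|             synchronized (this) {      // equivalent: same monitor
|                 n++;
|             }
|         }
|     }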
| Someone wrote:
| Yes, but the number of things you want to synchronize on in
| real-life code is limited, and, certainly at the time, memory
| was scarce. I think only allowing 'synchronized' on containers
| and letting programmers opt in to them would have been the
| better choice.
|
| > Plus an inherent assumption of mutable data.
|
| Yet, Strings and the boxed value types (Integer, Long,
| etc.) are immutable, and you can synchronize on them (or
| does this matter less for those because of the granularity
| of the memory allocator?)
| paulddraper wrote:
| > Strings and the boxed value types (Integer, Long, etc.)
| are immutable, and you can synchronize on them
|
| Side note: That is a very bad idea, because their object
| identity is iffy.
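|
| A sketch of why that identity is iffy: small boxed values are
| cached, so unrelated code can end up holding the very same
| object (class name and values chosen for illustration):
|
|     class BoxedIdentity {
|         public static void main(String[] args) {
|             Integer a = 127, b = 127;    // cached, same object
|             System.out.println(a == b);  // true
|
|             Integer c = 1280, d = 1280;  // outside the cache
|             System.out.println(c == d);  // typically false
|
|             // synchronized (a) would therefore share a monitor
|             // with any other code that locks the boxed value 127
|         }
|     }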
| derefr wrote:
| If I were the Java1.0 authors, I probably would have:
|
| - made a "monitor slot" its own primitive type;
|
| - made two overloads of `synchronized` --
| synchronized(MonitorSlot), and synchronized(Object); where
| for synchronized(Object), the compiler expects to find a
| MonitorSlot-typed field with a special system name (e.g.
| "__jvmMonitorSlot") on the passed-in Object
|
| - taken the presence of `synchronized (this)` in body code
| of a class to implicitly define such a field on the class,
| meaning you _can_ "just write" `synchronized (this)`, since
| it becomes `synchronized (this.__jvmMonitorSlot)` and also
| triggers the creation of the __jvmMonitorSlot field
|
| - explicitly define the __jvmMonitorSlot field on Class and
| other low-object-count system types
|
| The only change from today would be that you can't point at
| some arbitrary Object you didn't define the class for yourself,
| and say "synchronize on _that_." Which... why do you want to be
| doing that again?
| ShroudedNight wrote:
| At that point, why not just make types you want lock
| functionality for be declared as 'synchronized' in their class
| definition:
|
|     synchronized class Foo { // ...
|     }
| derefr wrote:
| Because, for the things that do care about
| synchronization, they might want _multiple_ explicit
| MonitorSlot members. It makes more sense to just be able
| to synchronize on MonitorSlots directly, and then decide
| where they go.
|
| The only reason I added the `synchronized (this)`
| allowance was because the parent said that they think
| that that's "good ergonomics" -- and presumably, the
| Java1.0 authors also thought that -- and I was trying to
| suggest an alternative that would preserve what they
| consider "good ergonomics."
|
| But personally, if I were the _sole dictator_ of Java1.0
| language design with nobody else to please, I would just have
| `synchronize(MonitorSlot)` + explicitly-defined MonitorSlot
| members (that, if you declare one, must always be declared
| final, and cannot be assigned to in the constructor), and that's
| it. Just refer to them by name when you need one.
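|
| The closest thing in today's Java is probably an explicit
| java.util.concurrent lock held in a final field, which is
| roughly the "named monitor slot" idea (class and field names
| hypothetical):
|
|     import java.util.concurrent.locks.ReentrantLock;
|
|     class Account {
|         private final ReentrantLock balanceLock = new ReentrantLock();
|         private long balance;
|
|         void deposit(long amount) {
|             balanceLock.lock();          // explicit, named lock
|             try {
|                 balance += amount;
|             } finally {
|                 balanceLock.unlock();
|             }
|         }
|     }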
| paulddraper wrote:
| Yes, with sufficient complexity, it's possible to achieve the
| same result transparently.
| titzer wrote:
| For me, the main question is where this complexity lives.
| To minimize the places where sharp knives need to be
| used, I would prefer to move this up into a language
| runtime and have the VM/engine underneath focus on
| implementing a simpler object model that has the
| mechanisms to pull off these tricks without resorting to
| unsafe tricks.
| pron wrote:
| Neither will Java assuming the next stage of this project goes
| as planned. The overhead for them will only be allocated on
| demand for those objects that need them.
| titzer wrote:
| I get that monitors are inflated dynamically, but from my
| reading of the linked JEP, Compact Object Headers have something
| like 24 bits allocated for the hashcode in the 64-bit word?
| nicktelford wrote:
| Towards the bottom of the JEP, they mention that the
| ultimate goal is 32 bit object headers, which would
| necessitate object monitors be tracked on-demand in a side
| table. That's what the parent was getting at.
| pron wrote:
| Right, but that's the first phase. In the next, I believe
| the goal is to attempt to reduce the header to 32 bits, and
| then the header overhead (for all objects) for both
| monitors and hashcodes will be ~3 bits, basically just to
| indicate whether or not they're used. For objects where
| they're used, they will be stored outside the header.
| titzer wrote:
| That's pretty cool. These things are super tricky and are
| hard to get into a high-performance production system, so
| I respect the journey :-)
|
| For Wasm GC, I think we need programmable metaobjects to
| be able to combine language-level metadata with the
| engine-level metadata. I have only a prototype in my head
| for Virgil and plan to explore this in Wizard soon.
| derefr wrote:
| Given https://shipilev.net/jvm/anatomy-quarks/26-identity-hash-
| cod...:
|
| > For identity hash code, there is no guarantee there are
| fields to compute the hash code from, and even if we have some,
| then it is unknown how stable those fields actually are.
| Consider java.lang.Object that does not have fields: what's its
| hash code? Two allocated Object-s are pretty much the mirrors
| of each other: they have the same metadata, they have the same
| (that is, empty) contents. The only distinct thing about them
| is their allocated address, but even then there are two
| troubles. First, addresses have very low entropy, especially
| coming from a bump-ptr allocator like most Java GCs employ, so
| it is not well distributed. Second, GC moves the objects, so
| address is not idempotent. Returning a constant value is a no-
| go from performance standpoint.
|
| How else _could_ identity hashcodes be implemented? Is it just
| impossible to put a WebAssembly GC base-object as a key into a
| map?
| titzer wrote:
| The idea is that if your language has an identity hash code
| for every object, the language runtime can just add a field
| in the Wasm struct for it. There's nothing special about that
| field; it could be an 8-bit, 16-bit, 32-bit field, etc. For
| the lock inflation logic and so on, you can make a "tagged
| pointer" in Wasm GC by using the i31ref type, so you could do
| something like have only the identity hash code by default,
| but "inflate" to a (boxed) indirection with an additional
| monitor dynamically. But the Wasm engine just treats it all
| as regular fields. The overall idea is the Wasm GC gives you
| the mechanisms by which you can implement your specific
| language's needs, hopefully as efficiently as a native VM
| could.
| moonchild wrote:
| One proposed strategy
| (https://wiki.openjdk.org/display/lilliput#Main-Hashcode)
| for dealing with hashes is to compute them lazily, taking
| advantage of the fact that most objects are never hashed:
| initially use the object's address, but also set a flag on
| the object when it is first hashed; when the gc next moves
| the object, it adds an extra field in which the original
| address is cached.
|
| Another strategy, used by sbcl, is to rehash identity hash
| tables after each gc cycle.
|
| How could I implement either of these with wasm?
| derefr wrote:
| > when the gc next moves the object, it adds an extra
| field in which the original address is cached.
|
| Under this scheme, if I allocate object A at address 0,
| then GC-move it to address 100 (such that it caches that
| it was "originally at" address 0); and then, with object
| A still alive, I allocate object B at address 0... then
| don't objects A and B both now have 0 as their identity
| hashcode?
|
| (I'm guessing the answer here is "this only works with
| generational GC, and the GC generation sequence-number is
| an implicit part of the stateless-address hashcodes and
| an explicit part of the cached-in-member hashcodes")
| PaulHoule wrote:
| Java is also considering making value objects that don't have
| that overhead:
|
| https://openjdk.org/jeps/8277163
|
| .NET has something like that already, but it would make a huge
| difference for things like Optional, or if you want to make a
| type for, say, complex numbers where there are just 64 bits of
| data (two FP32 values) and even a 64-bit object header would be
| 100% overhead. Value objects could possibly be allocated on the
| stack and completely bypass the garbage collector in some
| cases.
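|
| For illustration, today you can already express the shape of
| such a type as a record, though every instance still carries a
| full object header; the value-objects work is about letting a
| type like this be flattened (record name hypothetical):
|
|     record Complex(float re, float im) {   // 2 x 32-bit payload
|         Complex plus(Complex other) {
|             return new Complex(re + other.re, im + other.im);
|         }
|     }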
|
| Note that with that overhead, Java is an environment in which
| you can write multithreaded applications with good reliability
| and scaling, something that WebAssembly definitely isn't.
| bcrosby95 wrote:
| I've been lightly following some of their work here. From the
| outside, it seems like a lot of this has been born from
| optimizing the JVM over the past few decades, running into
| fundamental problems with how dynamic the JVM can be, then
| targeting changes that make that dynamism opt-in, either through
| JVM flags or new features.
|
| I do think some of it is kinda interesting: some people might
| say "if you never do X, we can guarantee Y". But it's nice,
| as a programmer, to be told "if you use feature A, you will
| never do X, which means we can guarantee Y". It's a lot more
| comforting that I can't slip up and accidentally do X.
| titzer wrote:
| > It's a lot more comforting that I can't slip up and
| accidentally do X.
|
| Indeed, this is why value semantics (i.e. structural
| equality) for language constructs like ADTs is so
| wonderful. Because a program can never observe "identity",
| which is an implementation detail, the implementation
| doesn't have to use objects at all underneath. That opens a
| whole host of value representation options that aren't
| otherwise available.
| kccqzy wrote:
| Monitors seemed especially strange when I learned Java. I mean,
| why would you put a synchronization feature into every object?
| It's not necessary the vast majority of the time.
|
| Maybe the programming style at the time had something to do
| with it. Maybe they thought every class needed to be thread-safe
| and mutable.
| layer8 wrote:
| This was over 30 years ago when there was still a lack of
| awareness that locks don't compose [0]. Look at the early JRE
| classes like Vector, Hashtable, and StringBuffer, whose
| operations are synchronized. The idea was that since you'd
| potentially want to synchronize access to any mutable object
| (because you'd want to be able to share any object between
| threads), and most objects are mutable, the most convenient
| solution would be for every object to have that functionality
| built in.
|
| [0] https://en.wikipedia.org/wiki/Lock_(computer_science)#Lac
| k_o...
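|
| The classic illustration: each Vector operation is synchronized
| on its own, but the compound action is not atomic, so the intent
| still races (sketch; class and method names hypothetical):
|
|     import java.util.Vector;
|
|     class DedupAdd {
|         static void addIfAbsent(Vector<String> v, String s) {
|             if (!v.contains(s)) {   // synchronized check...
|                 v.add(s);           // ...synchronized add, but another
|             }                       // thread can interleave between them
|         }
|     }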
| vbezhenar wrote:
| Yeah, I asked why there are no Hashable & Equatable interfaces
| instead of putting these things on Object, which would make more
| sense to me. People responded that's just how things should be.
| Apparently not. IMO object identity is almost always not needed
| and a sign of bad design. You either should write proper hash
| code, or you should not use data structures which use hash
| code.
| lazulicurio wrote:
| Yeah, having equals and hashCode on the root Object class is
| Java's biggest mistake, IMO. Although for a slightly
| different reason: equality is usually context-dependent, but
| having equals as an instance method ties you to one
| implementation.
| devman0 wrote:
| That is somewhat fixed by Comparator, but the fact that HashMap
| doesn't have a pluggable hash override always bothered me.
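|
| A common workaround is to wrap keys so the hash/equality policy
| is chosen per map rather than baked into the key type; a minimal
| sketch (class name hypothetical):
|
|     final class CaseInsensitiveKey {
|         final String value;
|         CaseInsensitiveKey(String value) { this.value = value; }
|
|         @Override public int hashCode() {
|             return value.toLowerCase().hashCode();
|         }
|
|         @Override public boolean equals(Object o) {
|             return o instanceof CaseInsensitiveKey k
|                 && value.toLowerCase().equals(k.value.toLowerCase());
|         }
|     }
|
|     // a HashMap<CaseInsensitiveKey, Integer> is then keyed
|     // case-insensitively without touching String itself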
| cesarb wrote:
| > You either should write proper hash code, or you should not
| use data structures which use hash code.
|
| There's also the question of _which hash code_. Do you use a
| fast but low quality hash function, or a slower but higher
| quality one? Does your hash function need to be secure
| against hash collision attacks? Does it have to be
| deterministic? The correct choice of hash function can depend
| on the data structure, and the same object might have to be
| hashed with different hash functions (or different hash
| function seeds) in the same program.
| josefx wrote:
| > IMO object identity is almost always not needed and a sign
| of bad design.
|
| Sometimes you have no choice. There were multiple times where I
| needed to store additional data for instances of class X but had
| no control over it, so I had to store it in a separate structure
| and keep track of things by object identity.
|
| > You either should write proper hash code
|
| And object identity fulfills all the requirements of a
| "proper" hash code.
| mike_hearn wrote:
| The reason is that it'd make HashMap and HashSet a lot less
| useful. If every type had to opt in, you'd be unable to use
| many types as keys or set entries for no better reason than
| the authors didn't bother to implement hash codes or
| equality. By providing reasonable identity-based versions of
| these, it increases the utility of the language at a tradeoff
| in memory usage.
| hesk wrote:
| You could implement the hashing code in a helper class and
| construct the HashMap with it.
| mike_hearn wrote:
| Without a way to give objects identity you'd get stuck
| pretty fast as it's not guaranteed you have access to any
| data to hash. You'd have to break encapsulation. That
| would then hit problems with evolution, where a new
| version of a class changes its fields and then everything
| that uses it as a hashmap key breaks.
|
| My experience with platform design has consistently been
| that handling version evolution in the presence of
| distant teams increases complexity by 10x, and it's not
| just about some mechanical notion of backwards
| compatibility. It's a particular constraint for Java
| because it supports separate compilation. This enables
| extremely fast edit/run cycles because you only have to
| recompile a minimal set of files, and means that
| downloading+installing a new library into your project
| can be done in a few seconds, but means you have to
| handle the case of a program in which different files
| were compiled at different times against different
| versions of each other.
| titzer wrote:
| JavaScript handles the "no identity hash" problem with WeakMap
| and WeakSet, which are language built-ins. For Virgil, I chose
| to leave out identity hashes and don't really regret it. It
| keeps the language simple and the separation clear. HashMap
| (entirely library code, not a language wormhole) takes the hash
| function and equality function as arguments to the constructor.
|
| [1] https://github.com/titzer/virgil/blob/master/lib/util/Map.v3
|
| This is partly my style too; I try to avoid using maps
| for things unless they are really far flung, and the
| things that end up serving as keys in one place usually
| end up serving as keys in lots of other places too.
| josefx wrote:
| Does that assume that you have only one key type and not
| an infinite sized hierarchy of child classes to deal
| with? If you had a map that took a Number as key, how
| many child classes do you think your helper class would
| cover and what extension framework would it use to be
| compatible with user defined classes?
| jzoch wrote:
| Yeah, this is where traits instead of hierarchies become
| useful - I should be able to implement the Hash interface for
| an object I do not own and then use that object + trait going
| forward for HashMap and HashSet.
|
| Java doesn't make this very composable.
| nusaru wrote:
| Should be noted that Rust (one of the most prominent
| languages with traits) doesn't allow you to implement a
| trait for an object you do not own. A common workaround
| is to wrap that object in your own tuple struct and then
| implement the trait for that struct.
| Sharlin wrote:
| (If you don't own the trait either, that is. Your own
| traits can be implemented for foreign types.)
|
| Rust's approach to the Hash and Eq problem is to make
| them opt-in but provide a derive attribute that
| autoimplements them with minimal boilerplate for most
| types.
|
| Also, Rust's Hash::hash implementations don't actually hash
| anything _themselves_, they just pass the relevant parts of the
| object to a Hasher passed as a parameter.
| This way types aren't stuck with just a single hash
| implementation, and normal programmers don't need to
| worry about primes and modular arithmetic.
| Phelinofist wrote:
| That sounds like a use case for the decorator pattern?
| maherbeg wrote:
| Java has just been on a roll with such substantial changes to
| fundamental pieces of the runtime! Great work to the teams that
| keep making improvements in a backwards-compatible manner.
| capableweb wrote:
| This specific change seems to impact the JVM (the runtime), not
| Java (the language). Which is great for people like me, who
| heavily rely on the JVM but couldn't care less about Java.
| pron wrote:
| When some people (like me) say "Java", they mean the Java
| Platform, and they call what you call Java "the Java language".
| When you say JVM, I assume you mean the Java Platform minus the
| language (the language is ~3-5% of the JDK; the JVM is about
| 25%). People who use "the JVM" but not "Java" use ~97% of Java,
| i.e. the Java platform.
| pulse7 wrote:
| People may not care about your "carelessness about Java"...
| capableweb wrote:
| I'm fairly sure that's fine, people get to care about
| whatever they want :)
| Scarbutt wrote:
| Funny, those languages rely heavily on the Java ecosystem and
| libraries, and they'd better hope the Java ecosystem keeps
| thriving, improving current libraries and producing new ones,
| or else they become impractical. Show me your pure Clojure
| production-grade HTTP servers or database drivers ;)
| kernal wrote:
| Summary
|
| Reduce the size of object headers in the HotSpot JVM from between
| 96 and 128 bits down to 64 bits on 64-bit architectures. This
| will reduce heap size, improve deployment density, and increase
| data locality.
___________________________________________________________________
(page generated 2023-05-04 23:00 UTC)