[HN Gopher] JOPA: Java compiler in C++, Jikes modernized to Java...
       ___________________________________________________________________
        
       JOPA: Java compiler in C++, Jikes modernized to Java 6 with Claude
        
       Author : pshirshov
       Score  : 49 points
       Date   : 2025-11-23 17:17 UTC (3 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | pshirshov wrote:
       | Essentially, I tried to throw a task at Claude that I thought it
       | wouldn't handle. It did, with minimal supervision. Some things
       | had to be done in "adversarial" mode where Claude coded and Codex
       | criticized/reviewed, but it is what it is. An LLM was able to
       | implement generics and many other language features with very
       | little supervision in less than a day o_O.
       | 
       | I've been thrilled to see it using GDB with inhuman speed and
       | efficiency.
        
         | yosefk wrote:
         | I am very impressed with the kind of things people pull out of
         | Claude's zhopa but can't see such opportunities in my own work.
         | Is success mostly the result of it being able to test its
         | output reliably, and of how easy it is to set up the
         | environment for this testing?
        
           | pshirshov wrote:
           | > Is success mostly the result of it being able to test its
           | output reliably, and of how easy it is to set up the
           | environment for this testing?
           | 
           | I wouldn't say so. From my experience the key to success is
           | the ability to split big tasks into smaller ones and help the
           | model with solutions when it's stuck.
           | 
           | Reproducible environments (Nix) help a lot, yes, same for
           | sound testing strategies. But the ability to plan is the key.
        
             | orbifold wrote:
             | One other thing I've observed is that Claude fares much
             | better in a well-engineered pre-existing codebase. It
             | adapts to most of the style and has plenty of "positive"
             | examples to follow. It also benefits from the existing test
             | infrastructure. It will still tend to go in infinite loops
             | or introduce bugs and then oscillate between them, but I've
             | found it to be scarily efficient at implementing
             | medium-sized features in complicated codebases.
        
               | pshirshov wrote:
               | Yes, that too, but this particular project was an ancient
               | C++ codebase with extremely tight coupling, manual memory
               | management and very little abstraction.
        
           | UncleEntity wrote:
           | Claude will also tend to go for the "test-passing"
           | development style where it gets super fixated on making the
           | tests pass with no regard for how the features will work with
           | whatever is intended to be built later.
           | 
           | I had to throw away a couple of days' worth of work because
           | the code it built to pass the tests wasn't able to do the
           | actual thing it was designed for and the only workaround was
           | to go back and build it correctly while, ironically, still
           | keeping the same tests.
           | 
           | You kind of have to keep it on a short leash but it'll get
           | there in the end... hopefully.
        
           | tekacs wrote:
           | zhopa -> jopa (zhopa) for those who don't spot the joke
        
         | 1024bees wrote:
         | How did you get gdb working with Claude? There are a few MCP
         | servers that look fine; curious what you used.
        
           | pshirshov wrote:
           | Well, just told it to use gdb when necessary, MCP wasn't
           | required at all! Also it helps to tell it to integrate
           | cpptrace and always look at the stacks.
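A minimal sketch of the kind of non-interactive gdb call an agent can run from a plain shell, with no MCP server involved. It only uses the standard gdb options `-batch`, `-ex` and `--args`; the binary name `./jopa` and the input file are placeholders, and nothing here is taken from the author's actual setup.

```python
import shlex
import subprocess
from typing import List, Sequence


def gdb_backtrace_cmd(program: str, args: Sequence[str] = ()) -> List[str]:
    """Build a gdb command line that runs `program` and dumps all stacks."""
    return [
        "gdb",
        "-batch",                      # run the commands below, then exit
        "-ex", "run",                  # start the program under gdb
        "-ex", "thread apply all bt",  # print a backtrace for every thread
        "--args", program, *args,      # the rest is the program's own argv
    ]


def run_gdb(program: str, args: Sequence[str] = (), timeout: float = 120) -> str:
    """Run the program under gdb and return everything gdb printed."""
    proc = subprocess.run(
        gdb_backtrace_cmd(program, args),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.stdout + proc.stderr


if __name__ == "__main__":
    # "./jopa" and "Test.java" are hypothetical names used for illustration.
    print(shlex.join(gdb_backtrace_cmd("./jopa", ["Test.java"])))
```

The backtrace is only interesting when the program stops on a crash or signal; for a clean exit gdb just reports that there is no stack.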
        
           | formerly_proven wrote:
           | MCP is more or less obsolete for code generation since agents
           | can just run CLI tools directly.
        
         | UncleOxidant wrote:
         | > Some things had to be done in "adversarial" mode where Claude
         | coded and Codex criticized/reviewed
         | 
         | How does one set up this kind of adversarial mode? What tools
         | would you need to use? I generally use Cline or KiloCode - is
         | this possible with those?
        
           | pshirshov wrote:
           | My own (very dirty) tool, there are some public ones,
           | probably I'll try to migrate to one of the more mature tools
           | later. Example: https://github.com/ruvnet/claude-flow
           | 
           | > is this possible with those?
           | 
           | You can always write to stdin/read from stdout even if there
           | is no SDK available I guess. Or create your own agent on top
           | of an LLM provider.
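A rough sketch of the stdin/stdout approach described above, not the author's tool. It assumes each agent is some command that reads a prompt on stdin and prints a reply on stdout; the `APPROVED` keyword, the prompt wording and the function names are all invented for illustration.

```python
import subprocess
from typing import Callable, List

Agent = Callable[[str], str]  # prompt in, reply out


def cli_agent(argv: List[str], timeout: float = 600) -> Agent:
    """Wrap any command that reads a prompt on stdin and answers on stdout."""
    def ask(prompt: str) -> str:
        proc = subprocess.run(
            argv, input=prompt, capture_output=True,
            text=True, timeout=timeout, check=True,
        )
        return proc.stdout.strip()
    return ask


def adversarial_loop(task: str, coder: Agent, reviewer: Agent,
                     max_rounds: int = 3) -> str:
    """Coder proposes, reviewer criticizes, coder revises.

    Stops when the reviewer's reply starts with APPROVED (an assumed
    convention) or when max_rounds reviews have been spent.
    """
    draft = coder(task)
    for _ in range(max_rounds):
        review = reviewer(
            f"Task:\n{task}\n\nProposed change:\n{draft}\n\n"
            "Reply APPROVED or list the problems."
        )
        if review.strip().upper().startswith("APPROVED"):
            break
        draft = coder(
            f"Task:\n{task}\n\nPrevious attempt:\n{draft}\n\n"
            f"Reviewer feedback:\n{review}\n\nRevise the attempt."
        )
    return draft
```

Because both roles are plain callables, the coder and the reviewer can be backed by different CLIs or by direct API calls without changing the loop.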
        
       | proxysna wrote:
       | Jopa means ass in Russian, this reminded me of Pidora.
        
         | koakuma-chan wrote:
         | There's JEPA too
        
         | dimaaan wrote:
         | Don't forget NPM packages Mocha and Chai (Pee and Tea)
        
         | photios wrote:
         | I came here for this comment! TIL about Pidora :D
        
         | mike386 wrote:
         | There is also "mudyla" repo in the org, so
        
           | pshirshov wrote:
           | That's Multimodal Dynamic Launcher. A very nice thing
           | actually, a scripting orchestrator.
        
       | pshirshov wrote:
       | Btw, working on Java 7 support. At this moment I sorta have a
       | working Java 7 compiler targeting Java 6 bytecode (Java 7
       | bytecode requires StackMapTable, which is sort of annoying).
       | 
       | Also, I've tried to replace the parser with a modern one. Claude
       | succeeds in generating Java 8 parsers with various parser
       | generators/parser combinators but fails to resolve extremely
       | tight coupling.
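For context on the "Java 6 bytecode" target: a class file starts with the magic number 0xCAFEBABE followed by the minor and major version, and Java 6 and Java 7 correspond to major versions 50 and 51. The sketch below reads that header. The helper names are made up, and the StackMapTable rule is summarized from memory of the JVM specification: version 50 may fall back to the older inference verifier, while 51 and later may not.

```python
import struct
from typing import Tuple

# Class-file major versions for the two targets discussed here.
JAVA_6_MAJOR = 50
JAVA_7_MAJOR = 51


def classfile_version(data: bytes) -> Tuple[int, int]:
    """Return (major, minor) read from the first 8 bytes of a .class file."""
    if len(data) < 8:
        raise ValueError("too short to be a class file")
    # Header layout: u4 magic, u2 minor_version, u2 major_version (big-endian).
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("bad magic number, not a class file")
    return major, minor


def stack_maps_mandatory(major: int) -> bool:
    """True when the verifier cannot fall back to type inference."""
    return major >= JAVA_7_MAJOR
```

Usage: `classfile_version(open("Hello.class", "rb").read())` on a file emitted for a Java 6 target would be expected to report major version 50.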
        
         | algo_trader wrote:
         | what is the feasibility/craziness level of "llm porting" the
         | javac source code to c++ ?
         | 
         | setting copyright issues aside, javac is a pretty clean
         | textual-input-output program, and it can probably be reduced to
         | a single thread variant
        
           | pshirshov wrote:
           | Claude won't handle a project of that scale. Even with Java 7
           | modernization project, which is much simpler than full javac
           | translation, I constantly hit context limits and Claude
           | throws things like "API Error: 400 {"type":"error","error":{"
           | type":"invalid_request_error","message":"messages.3.content.7
           | 6: `thinking` or `redacted_thinking` blocks in the latest
           | assistant message cannot be modified. These blocks must
           | remain as they were in the original
           | response."},"request_id":"req_011CVWwBJpf3ZrmYGkYZLQVf"}" at
           | me.
        
         | Snuggly73 wrote:
         | looking at the "att" branches (excuse my unhealthy curiosity) I
         | can only say - "jesus fucking christ".
         | 
         | from the old parser ast -> to json -> to new ast representation
         | (that is basically again copy of the old one) -> to some new
         | incomplete bytecode generation
         | 
         | im sure there is some good explanation, but....why?! :)
        
           | pshirshov wrote:
           | I've been looking for a way to decouple legacy parser from
           | the rest of the compiler, plus create a way to dump parser
           | output in a readable form. Unfortunately, the coupling is too
           | tight there. In my own compilers all the outputs of all the
           | phases are serializable.
           | 
           | In the end I've just reanimated the original parser generator
           | and progressed to full Java 7 syntactically (-att5 branch),
           | but there are some major obstacles with bytecode.
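To illustrate the idea of a phase whose output is serializable and can be dumped in readable form, here is a toy AST with a JSON round trip. The `Node` shape and field names are invented for the example and have nothing to do with JOPA's or Jikes' real AST classes.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Node:
    """A deliberately tiny AST node: a kind, optional text, and children."""
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def dump(node: Node) -> str:
    """Serialize a tree to stable, human-readable JSON."""
    return json.dumps(asdict(node), indent=2, sort_keys=True)


def load(payload: str) -> Node:
    """Rebuild the tree from the JSON produced by dump()."""
    def build(d: dict) -> Node:
        return Node(
            kind=d["kind"],
            text=d.get("text", ""),
            children=[build(c) for c in d.get("children", [])],
        )
    return build(json.loads(payload))


if __name__ == "__main__":
    tree = Node("ClassDecl", "A", [Node("MethodDecl", "main", [Node("Block")])])
    print(dump(tree))
```

With a boundary like this, a later phase only needs the dumped form, which is what makes it possible to swap the parser without touching the rest of the compiler.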
        
             | Snuggly73 wrote:
             | i thought it might be something like this (still a weird
             | overkill), but if you are effectively replacing the parser
             | with new peg and replacing the backend with something new -
             | then there is nothing left - just start from scratch :)
        
       | exabrial wrote:
       | Tangential: isn't there a Java compiler written in Java from
       | the same time period?
        
         | pshirshov wrote:
         | It's much older, but even now this is THE ONLY viable pathway
         | to bootstrap a modern JDK from scratch. I'm trying to modernize
         | it so the bootstrap path might be shortened.
         | 
         | See https://bootstrappable.org/projects/java.html
        
         | cyberax wrote:
         | Java-to-bytecode compiler (javac) has always been written in
         | Java. There was a JVM written in Java: Jikes RVM.
        
           | anthk wrote:
           | Jikes didn't.
        
       | pshirshov wrote:
       | Ah, by the way. I've tried to do the same with Codex
       | (gpt-5.1-codex-max) and Gemini (2.5 pro), both failed
       | spectacularly. This job was done mostly by Sonnet 4.5. Java 6 did
       | not require intensive supervision. Java 7 parts are done with
       | Opus 4.5 and it constantly hits its limits, I have to regularly
       | intervene.
        
       | goranmoomin wrote:
       | I'm genuinely curious about how well this is working. Is there an
       | independent Java test suite that covers major Java 5/6 features
       | that can verify that the JOPA compiler works per the spec? I.e. I
       | see that Claude has written a few tests in its commits, but it
       | would be wonderful if there's a non-Clauded independent test
       | suite (probably from other Java implementations?) that tracks
       | progress.
       | 
       | I do feel that that is pretty much needed to claim that Claude is
       | adding features to match the Java spec.
        
         | pshirshov wrote:
         | Well, it's complicated. The original jdk compliance tests are
         | notoriously hard to deal with. Currently I parse nearly 100% of
         | positive testcases from JDK 7 test suite (in one of Java 7
         | branches) but I only have several dozen true end to end
         | tests (build .java with jopa, validate classfile with javap,
         | run classfile with java).
         | 
         | So, I can't tell how good it actually is but it definitely
         | handles reasonably complex source files with generics
         | (something the original compiler was unable to do).
         | 
         | The actual goal of the project is to be able to build at least
         | ANT to simplify clean bootstrap of OpenJDK.
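The end-to-end check described above (compile, inspect with javap, run on a real JVM) could be scripted roughly as follows. This is a sketch under assumptions: the compiler is taken to be a `jopa` executable on PATH accepting a javac-style `-d` flag, which the thread does not confirm. The `javap -v -cp` and `java -cp` invocations are standard JDK usage.

```python
import subprocess
from pathlib import Path
from typing import List


def e2e_commands(source: Path, main_class: str, out_dir: Path,
                 compiler: str = "jopa") -> List[List[str]]:
    """Return the three commands of one end-to-end test, in order."""
    return [
        # 1. Compile. The "-d" flag is assumed to behave like javac's.
        [compiler, "-d", str(out_dir), str(source)],
        # 2. Let javap parse and print the emitted class file.
        ["javap", "-v", "-cp", str(out_dir), main_class],
        # 3. Run the class on a real JVM.
        ["java", "-cp", str(out_dir), main_class],
    ]


def run_e2e(source: Path, main_class: str, out_dir: Path,
            compiler: str = "jopa") -> str:
    """Run all three steps; raise on the first failure, return java's stdout."""
    out_dir.mkdir(parents=True, exist_ok=True)
    stdout = ""
    for cmd in e2e_commands(source, main_class, out_dir, compiler):
        proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
        stdout = proc.stdout  # after the loop this holds the last step's output
    return stdout
```

A test case then reduces to comparing `run_e2e(...)` against the expected program output, for example the output of the same source compiled with a reference javac.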
        
           | AtlasBarfed wrote:
           | That is perilously close to the usual:
           | 
           | "AI DID EVERYTHING IN A DAY"
           | 
           | "How do you know it works?"
           | 
           | "... it just looks like it does"
           | 
           | Like when I ask AIs to port sed to java, and it writes test
           | cases ... running sed on a CLI and doesn't implement the full
           | lang spec no matter how much prompting I give it.
        
             | pshirshov wrote:
             | Well, at least the emitted bytecode validates with javap
             | and a lot of stuff definitely runs on real jvm.
        
               | th0ma5 wrote:
               | I think the criticisms are too often dismissed as moving
               | the goalposts or ignorant of potential, but short of
               | recreating the active open bugs in Java, you've created a
               | different thing whose differences have to be managed and
               | it is unclear how helpful that may be despite the working
               | implementations of subsets.
        
               | pshirshov wrote:
               | If I (or someone else) can use it as a starting point in
               | the bootstrap process - that's fine with me. This is not
               | supposed to be a top-tier compiler. Essentially, it needs
               | to be able to build ANT.
        
       | sgammon wrote:
       | [j|y]ikes
        
       | atgreen wrote:
       | Related: I recently got javac working with OpenLDK, my JVM
       | bytecode to Common Lisp transpiler. The `javacl` binary is a
       | dumped SBCL image that behaves just like the OpenJDK javac
       | program, but with CL under the hood (e.g. Java objects/methods
       | are all CLOS).
        
         | pshirshov wrote:
         | Please post the link here. If it's more than just a demo, it
         | might be a valuable tool.
        
           | atgreen wrote:
           | https://github.com/atgreen/openldk
        
             | pshirshov wrote:
             | I think it's an extremely valuable tool. If it can compile
             | from sources - it would be priceless.
        
               | atgreen wrote:
               | openldk itself builds from source. It reads jar/class
               | files and JIT-transpiles them to common lisp code, which
               | is in turn compiled to native instructions. It does not
               | read java source code at all. But you can run OpenJDK's
               | javac with OpenLDK. You can write "native" methods in
               | Common Lisp, extend Java classes with CLOS classes, use
               | conditions/restarts, :before/:after/:around methods, dump
               | images, etc. There's some ways to go still, but -- like I
               | said -- javac just started working as a native lisp image
               | executable, which was an important milestone.
        
               | pshirshov wrote:
               | I cannot bootstrap openjdk from ground zero, from pure
               | source files without binaries. From what I know there is
               | just one pathway to that, the Guix one, which starts from
               | Jikes.
        
       | shawn_w wrote:
       | I remember discovering and using jikes in the 90's. It was /so/
       | much faster than javac back then.
       | 
       | "Modernizing" to Java 6 is amusing.
        
         | pshirshov wrote:
         | Almost got to Java 7. And there is a huge gap between Jikes'
         | original Java 4 and Java 6.
         | 
         | Even Java 6 support should make ground zero bootstrap of modern
         | JDKs much easier.
        
       ___________________________________________________________________
       (page generated 2025-11-26 23:00 UTC)