[HN Gopher] GraalPy - A high-performance embeddable Python 3 run...
       ___________________________________________________________________
        
       GraalPy - A high-performance embeddable Python 3 runtime for Java
        
       Author : fniephaus
       Score  : 153 points
       Date   : 2024-09-17 18:04 UTC (4 hours ago)
        
 (HTM) web link (www.graalvm.org)
 (TXT) w3m dump (www.graalvm.org)
        
       | sevensor wrote:
       | Took a little digging to find that it targets 3.11. Didn't see
       | anything about a GIL. If you're a Python person, don't click the
       | quick start link unless you want to look at some xml.
        
         | pjmlp wrote:
         | Python implementations on the JVM or CLR naturally don't have
         | a GIL; there is no such thing on those platforms.
         | 
         | YAML and JSON have both tried to replicate the XML tooling
         | experience, only worse.
         | 
         | Schemas, comments, parsing and schema conversion tools.
        
           | lopuhin wrote:
           | I think GraalPython does have a GIL, see
           | https://github.com/oracle/graalpython/blob/master/docs/contr...
           | - and if by "there is no such thing on those platforms" you
           | mean JVM/CLR not having a GIL, C also does not have a GIL but
           | CPython does.
        
             | westurner wrote:
             | "PEP 703 - Making the Global Interpreter Lock Optional in
             | CPython" (2023) https://peps.python.org/pep-0703/
             | 
             | CPython built with --disable-gil does not have a GIL (as
             | long as PYTHONGIL=0 and all loaded C extensions are built
             | for --disable-gil mode)
             | https://peps.python.org/pep-0703/#py-mod-gil-slot
             | 
             | "Intent to approve PEP 703: making the GIL optional" (2023)
             | https://news.ycombinator.com/item?id=36913328#36917709
             | https://news.ycombinator.com/item?id=36913328#36921625
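
A minimal sketch of how one might check this from inside a running interpreter. It assumes `sys._is_gil_enabled()` (added in CPython 3.13) and the `Py_GIL_DISABLED` build variable that came with PEP 703. Interpreters that lack the probe, such as older CPython and possibly GraalPy, are reported as unknown (`None`) rather than guessed at. `gil_status` is a hypothetical helper name, not part of any library.

```python
import sys
import sysconfig


def gil_status():
    """Report what this interpreter says about its GIL (hypothetical helper)."""
    # Py_GIL_DISABLED is 1 on a free-threaded (--disable-gil) CPython build;
    # it is 0 or None elsewhere.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() exists on CPython 3.13+. Other interpreters may
    # not provide it, so report None (unknown) instead of assuming.
    probe = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = probe() if callable(probe) else None

    return {
        "implementation": sys.implementation.name,
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    print(gil_status())
```

A free-threaded build can still report `gil_enabled` as true, for example when a loaded C extension was not built for `--disable-gil` mode, which is why the build flag and the runtime state are reported separately.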
        
               | kaashif wrote:
               | This is pretty beside the point. The point is that X not
               | having a GIL doesn't inherently mean Python on X also
               | doesn't have a GIL.
        
             | pjmlp wrote:
             | My mistake, as I assumed they made the same decision as
             | Jython and IronPython.
             | 
             | https://jython.readthedocs.io/en/latest/Concurrency/#no-glob...
             | 
             | https://wiki.python.org/moin/IronPython
             | 
             | The difference between JVM, CLR and C with regard to
             | parallel and concurrent code is that they are built for
             | those kinds of workloads, and have a proper memory model,
             | hence not needing a GIL.
        
               | commodoreboxer wrote:
               | I think they would have to here, to support native
               | modules. Jython (and I believe IronPython, but don't
               | quote me) does not support native CPython modules.
               | CPython modules explicitly control the GIL, so if they
               | are supported (as they are here), you can't really leave
               | the GIL out without exposing potential thread safety
               | issues.
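
A pure-Python analogue of the thread-safety concern, as a sketch only. It is not C extension code and says nothing about how GraalPy handles native modules. It shows shared state protected by an explicit lock, so the result does not depend on whether the interpreter has a GIL. The function name and parameters are hypothetical.

```python
import threading


def count_with_lock(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # The read-modify-write below is not atomic on its own;
            # the lock makes it safe with or without a GIL.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(count_with_lock())
```

Code that skips the lock and leans on the GIL for safety is the kind that can break once the GIL is removed.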
        
         | foobazgt wrote:
         | I mean, if you're trying to embed one language in another,
         | please don't be surprised when the quickstart guide has a
         | couple of examples containing a few lines of code written for
         | the embedding language and its package manager(s).
        
         | jitl wrote:
         | Happily, you can ignore the Maven XML and use Gradle instead,
         | it's the next codeblock on the page, after "or":
         | implementation("org.graalvm.polyglot:polyglot:24.1.0")
         | implementation("org.graalvm.polyglot:python:24.1.0")
        
       | wenc wrote:
       | DuckDB is not currently a supported package, but Pandas and
       | matplotlib are, which is good. If DuckDB and Polars were supported
       | and if they ran well, I suspect many data jobs could benefit.
        
       | tannhaeuser wrote:
       | I guess what makes Python interesting right now is the
       | integration with ML toolchains, CUDA, Metal/MLX, pytorch,
       | tensorflow, LLM encoders/decoders, etc. more than Python the
       | language. But can GraalVM run those codes meaningfully when
       | Python is merely used for glue code with the important bits
       | implemented in native code?
        
         | waldrews wrote:
         | Looks like all of that would run in a native sandbox
         | environment which in turn is called from the Python running on
         | the JVM. So, maybe it simplifies interop, but whether it's
         | straightforward to get full performance from the native layer
         | (especially GPU/multicore) is an open question.
        
         | tln wrote:
         | Yes, apparently it can
         | 
         | https://www.graalvm.org/dev/reference-manual/python/Native-E...
         | 
         | > CPython provides a native extensions API for writing Python
         | extensions in C/C++. GraalPy provides experimental support for
         | this API, which allows many packages like NumPy and PyTorch to
         | work well for many use cases. The support extends only to the
         | API, not the binary interface (ABI), so extensions built for
         | CPython are not binary compatible with GraalPy. Packages that
         | use the native API must be built and installed with GraalPy,
         | and the prebuilt wheels for CPython from pypi.org cannot be
         | used. For best results, it is crucial that you only use the pip
         | command that comes preinstalled in GraalPy virtualenvs to
         | install packages. The version of pip shipped with GraalPy
         | applies additional patches to packages upon installation to fix
         | known compatibility issues and it is preconfigured to use an
         | additional repository from graalvm.org where we publish a
         | selection of prebuilt wheels for GraalPy. Please do not update
         | pip or use alternative tools such as uv.
        
           | theLiminator wrote:
           | I wonder if hpy will solve the extension problem.
        
         | yosefk wrote:
         | The reasons for all this stuff having been developed in Python
         | also make Python interesting right now, all by themselves. It
         | did not happen by accident; this stuff was developed fairly
         | recently and there was no shortage of mature languages to
         | choose from.
        
           | pjmlp wrote:
           | Using Python as a C and C++ REPL of sorts has been common in
           | academia since it took the scripting crown away from Perl and
           | Tcl, which were used during the late '90s.
           | 
           | For example, see the bioinformatics papers from that period,
           | and the Perl tooling used alongside the research.
           | 
           | Already in 2003 CERN was using Python on some of their build
           | infrastructure (see CMT), Grid Computing scripting efforts,
           | and we had Python trainings available to us.
           | 
           | Now there is a difference between a REPL of sorts, scripting
           | OS tasks, and building full-blown applications with a pure
           | interpreter.
        
           | BiteCode_dev wrote:
           | The people disliking the language are very vocal about it,
           | but there is a huge number of silent people who love it and
           | an even bigger number who just like it as much as the
           | alternatives. It's mainstream now, not trending like 10 years
           | ago, so there is no hype about it anymore. We just use it to
           | do stuff.
           | 
           | Add to that the existing excellent ecosystem, the strong
           | culture of scientific stacks and a very good story for
           | providing C extensions (actually the best one among all
           | scripting languages because of things like cibuildwheel).
           | 
           | It's only in small tech bubbles like HN that devs find it
           | surprising.
        
           | Eridrus wrote:
           | It didn't happen by total accident, but it didn't happen by
           | design for where we are today either. The original choice to
           | start building data science tooling in Python happened
           | intentionally, but since then path dependence has been a huge
           | thing.
        
           | wenc wrote:
           | As a former Perl hacker who started using Python in 2005, I
           | saw Python ride several waves. (Numerical computation, data
           | science, deep learning)
           | 
           | Perl was the leading tool for scripting and text parsing.
           | Python didn't really supplant it for a long time -- until
           | people started writing more complicated scripts that had to
           | be maintained. Perl reads like line noise after 6 months
           | whereas I can look at Python code from 20 years ago, prettify
           | it with black, and understand it.
           | 
           | Python got picked up by the scientific computing community,
           | which gave it some of its earliest libraries like numpy, f2py,
           | scipy. Some of us who were on MATLAB moved over.
           | 
           | Then data science happened. Pandas built off the scientific
           | computation foundations and eventually libraries like scikit
           | and matplotlib (mimicking MATLAB's plotting) came along.
           | 
           | Then TensorFlow came along and built on the foundation of
           | numerical libraries. PyTorch followed.
           | 
           | Other systems like Django came and made Python popular for
           | building database-backed websites.
           | 
           | Suddenly there was momentum and today almost all numerical
           | software has a Python API -- this includes proprietary stuff
           | like CPLEX and what have you.
           | 
           | Python was the glue language that had the lowest barrier of
           | entry. For instance, Spark was written in Scala and has a
           | performant Scala API but everyone uses PySpark because it's
           | much more accessible, despite the interop cost.
           | 
           | The counterfactual to all this was Ruby. It had much nicer
           | syntax than Python but when I tried to use it in grad school
           | I was quickly stymied by the lack of numerical libraries.
           | Ruby never found a niche outside of Rails and config
           | management.
           | 
           | Essentially Python -- like Nvidia today -- bet on linear
           | algebra (and more broadly on data processing) and won.
           | 
           | I get why there's hate for Python -- it's not a perfect
           | language. Yet those of us pragmatists who use it understand
           | the trade-offs. You trade off on-the-metal performance for
           | programmer performance. You trade off packaging difficulties
           | for something that works. You trade off an imperfect syntax
           | for getting things done.
           | 
           | I could have used Ruby -- a much more beautiful language --
           | in grad school and worked around its gaps, but I would not
           | have graduated on time. Python was the pragmatic choice for me
           | and continues to be one for me today (outside of situations
           | requiring raw performance).
        
             | o11c wrote:
             | There was also the major anti-wave of Python 3. But it has
             | managed to pull through despite ending up with broken
             | strings (RIP all old code that needs to deal with legacy-
             | encoded data), probably because there was no viable
             | replacement.
        
             | commodoreboxer wrote:
             | I agree with you, and I'll put it slightly stronger. Ruby
             | is a better language than Python in every way except the
             | very most important two:
             | 
             | - Imports in Ruby seriously suck compared to Python.
             | Everything requires into a global scope and an ecosystem
             | like bundler which encourages centralizing all imports for
             | your entire codebase into one file.
             | 
             | - Python has docstrings encouraging in-code documentation.
             | 
             | Add common ecosystem things like the Ruby community
             | encouraging generated methods, magical "do what I mean"
             | parameters, and REPL poke-driven development, and this
             | leads to the effect that Python codebases are almost always
             | well documented and easy to understand. You can tell where
             | every symbol comes from, and you can usually find a
             | documentation entry for every single method. It's not
             | uncommon for a Ruby library, even a popular one, to be
             | documented solely through a scattering of sparsely-
             | explained examples with literally no real API
             | documentation. Inheriting a long-lived Ruby project can be
             | a serious ordeal just to discover where all the code that's
             | running is running, why it's running, where things are
             | preloaded into a builtin class, and with Rails and
             | Railties, a Gem can auto insert behavior and Middleware
             | just by existing, without ever being explicitly mentioned
             | in any code or configs other than the Gemfile. It's an
             | absolute headache.
             | 
             | My dream language would be Ruby with Python-style imports
             | and docstrings.
        
         | pjmlp wrote:
         | I am willing to live with Python as the Lisp we deserve to
         | have, on this AI wave, when it finally gets a proper JIT story
         | we can rely on, regardless of the workload.
         | 
         | Currently it is a mix and match of a herculean engineering
         | effort mostly ignored by the community (PyPy), DSLs for GPGPUs,
         | a bunch of C and C++ libraries that people keep referring to as
         | "Python" when any language can have similar bindings, Jython,
         | IronPython, GraalPy,...
         | 
         | So it isn't for lack of trying, at least we finally have
         | CPython folks more welcoming to performance improvements, and
         | JITs.
        
       | theanonymousone wrote:
       | Does it have to be run in a GraalVM, or is any JVM implementation
       | fine?
        
         | jryan49 wrote:
         | Graal lets you compile native binaries
        
           | ackfoobar wrote:
           | Graal is many things (a marketing nightmare). The guest
           | language part is orthogonal to the native packager AFAIK.
        
             | w10-1 wrote:
             | Yes, but I was under the impression that graal-level
             | inter-op was limited to packages the graal toolchain could
             | compile.
             | 
             | Thus, while swift and graal both depend on llvm, they use
             | different variants and there's no real way to make inter-op
             | between swift and graal (even using the llvm IR which graal
             | is said to be able to consume).
             | 
             | e.g., I believe this announcement represents the work to
             | compile a python (3.11) and some proof-of-concept python
             | packages using graal toolchain, to spur other packages to
             | support the same.
             | 
             | So I'd really love to be wrong, but I believe building
             | under the graal llvm is the common factor.
        
               | kaba0 wrote:
               | I don't really see how swift comes into the picture,
               | besides SuLong being a thing (running LLVM bitcode).
               | Native binary was meant as a compile _target_ in the
               | previous comment, I believe, not as an _input_. Graal can
               | do both, but as a target it has no dependency on LLVM.
               | 
               | So yeah, graalvm should be able to produce a native
               | binary for python code (though depending on the specifics
               | it might actually be more like a native binary
               | interpreter running python scripts, it can't optimize in
               | every circumstance but I'm hazy on the details).
        
         | Okx wrote:
         | > You can use GraalPy with GraalVM JDK, Oracle JDK, or OpenJDK
         | 
         | https://www.graalvm.org/latest/reference-manual/python/
        
       | mkoubaa wrote:
       | HPy can eventually be used to support CPython extension modules
       | in GraalPy
        
         | ajdhGfa wrote:
         | And they will run how much slower or have strange bugs?
        
       | 2OEH8eoCRo0 wrote:
       | In what world is anything written in Python "high performance?"
        
         | pjmlp wrote:
         | In places where it actually has a JIT.
        
         | mkoubaa wrote:
         | The one we want to live in while you fling poop
        
       | nkzd wrote:
       | What is the use-case for GraalPy? To be honest I don't understand
       | why anyone would want to use it.
        
         | xyproto wrote:
         | Data scientists trapped in bureaucracy?
        
         | abirch wrote:
         | Minecraft mods can only be written in Java and I want my kid to
         | learn Python.
         | 
         | Jython is still 2.x and it'd be nice to let my kid write a
         | Minecraft mod in Python. Not a business use case but a use
         | case.
        
           | smj-edison wrote:
           | When I was learning programming, my coding class used a
           | Bukkit plugin that connected to Python. I can't remember what
           | it was called, but that was for Minecraft 1.7.10.
           | 
           | Not sure if you were wanting Python specifically, but KubeJS
           | lets you use JavaScript for mods. I think there's also a
           | Clojure integration.
        
             | abirch wrote:
             | Thank you. My 3rd grader knows basic python so I'd prefer
             | to stick with that or Scratch
        
         | the_arun wrote:
         | I am assuming that, with this, JVM applications needing
         | integration with LLMs can embed LLMs in the JVM instead of
         | making outbound API calls. If my assumption is right, wouldn't
         | this improve performance of consumer applications?
        
           | pjmlp wrote:
           | Thankfully some LLMs also have Java bindings to the same
           | native libraries used by Python.
        
         | chc4 wrote:
         | Ghidra embeds Python scripting via Jython, which is stuck on
         | Python 2. Switching to GraalPy would allow Python 3 scripting.
         | 
         | Any other Java programs that want a scripting engine could use
         | it as well.
        
         | andreldm wrote:
         | I worked at a company where data scientists wrote Python code
         | using pandas and we had to port it to Java and a library called
         | keanu that was very useful but soon became unmaintained.
         | 
         | Of course this was very time consuming and unrewarding, all
         | because only Java applications could be deployed to production
         | due to a stupid top-down decision.
         | 
         | This GraalPy sounds like something I wish existed back then.
        
           | pvorb wrote:
           | Did you look into Jython back then?
        
             | toyg wrote:
             | Jython has historically lagged _hard_ , often falling
             | behind for very extended periods. For a time their releases
             | basically just stopped, which led to them missing support
             | for pretty much anything between 2.7 and 3.6 (iirc). I know
             | the project basically rebooted at some point, but I've
             | since lost interest.
        
             | andreldm wrote:
             | Not me, someone else in the company did, I don't remember
             | why it was dismissed.
        
             | jsight wrote:
             | Jython was dead for a long time. It might be back a little
             | now, but there is still no Python 3 support.
             | 
             | GraalPy is much more active and more compatible.
        
         | _joel wrote:
         | https://github.com/oracle/graalpython?tab=readme-ov-file#why...
        
         | pvorb wrote:
         | Maybe this would be an interesting alternative runtime
         | environment for PySpark? I think currently PySpark runs in
         | Python and somehow interacts with a JVM and relies on copying
         | data from one to the other.
        
         | theflyinghorse wrote:
         | Picture working for a big, non-tech corporation. Your BU only
         | does Java because it has always been thus and Jeff the SVP is a
         | law grad and doesn't want anything to change because of
         | perceived risk. GraalVM allows smart people who have to work
         | within such limitations to still write (mostly) the software
         | they want while still vaguely relating it to Java for decision
         | makers.
        
           | actionfromafar wrote:
           | Not so vaguely, either. The dev story is not Java but the
           | deploy story is.
        
           | nunobrito wrote:
           | Those "smart people" write blackboxes in esoteric languages
           | that only the same person maintains.
           | 
           | Everyone else has to write wrappers to interact with that
           | blackbox. God forbid someone daring to even change the code,
           | because it basically doesn't even need/use junit tests.
           | Eventually the smart person gets bored and moves to something
           | else, that tool then gets rewritten to Java in two days by
           | someone else.
           | 
           | End of story.
        
         | kaba0 wrote:
         | Besides all the nice answers given by others, a big one was not
         | mentioned: performance!
         | 
         | Graal can do pretty advanced JIT-compilation for any Graal
         | language, plus you can mix-and-match languages (with a big
         | chunk of their ecosystems) and it will actually compile
         | _across_ language boundaries. And we haven't even mentioned
         | Java's state of the art GCs that can run circles around any
         | tracing GC, let alone the very low throughput reference
         | counting.
        
           | ackfoobar wrote:
           | I guess for pure python applications, they'd rather throw
           | more hardware at the problem than messing with the JVM.
        
             | kaba0 wrote:
             | For serial workloads it's very very hard to scale by
             | hardware, though. CPUs aren't getting 2x faster as they
             | used to.
             | 
             | Also, what is "messing with the JVM"? That's like one of
             | the most battle tested technologies out there, right next
             | to the Linux kernel.
        
               | ackfoobar wrote:
               | Don't get me wrong, I love the JVM.
               | 
               | The unfortunately common irrational aversion to JVM
               | aside, there's also the fear of "using it wrong".
        
       | ajdhGfa wrote:
       | I'm very skeptical about production use, but the thought of
       | Oracle taking over Python is amusing, since the Python community
       | is already run like Oracle in a top-down, military manner. It can
       | only get better!
        
       | iLemming wrote:
       | What does that mean for Clojure?
        
         | masklinn wrote:
         | Why would it mean anything for clojure?
        
         | positr0n wrote:
         | Same thing as Java. You could use this to run python in your
         | clojure JVM process.
        
       | calrizien wrote:
       | Is there a way to embed Python 3 into Swift like this?
        
         | w10-1 wrote:
         | I haven't seen embedding using graal/vm, or inter-op using the
         | native JVM FFI.
         | 
         | There is (active, 2K stars)
         | https://github.com/pvieito/PythonKit and I've heard of people
         | being able to deploy apps with python on the app store. YMMV.
        
       | froh wrote:
       | what's the advantage of this over JPype?
        
         | mdaniel wrote:
         | That it goes in the opposite direction of your cited project
         | (run modern-ish python from within the JVM), and almost
         | certainly has a much, much better JIT story than yours
        
       | Rochus wrote:
       | In case someone is interested, here are some benchmark results
       | comparing GraalPy and others with JDK8 using the Are-we-fast-yet
       | benchmark suite:
       | https://stefan-marr.de/downloads/tmp/awfy-bun.html
       | 
       | And here is a table representation of all benchmarks and the
       | geomean and median overall results:
       | http://software.rochus-keller.ch/awfy-bun-summary.ods
       | 
       | The implementation of the same benchmark suite runs around a
       | factor of 2.4 (geomean) faster on JDK8 than on GraalPython EE 22.3
       | Hotspot, or 41 times faster than CPython 3.11. The Graal
       | Enterprise Edition (EE) seems to be a factor of 1.31 faster than
       | the Community Edition (CE).
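
A back-of-the-envelope sketch of what these factors imply when combined. It assumes all three ratios are geometric means over the same set of benchmarks, so they can be divided by one another, and that the 1.31 EE/CE factor applies to GraalPy. Neither assumption is stated outright in the comment, and the variable names are illustrative.

```python
# Speedup factors quoted in the comment (higher means faster).
java_vs_graalpy_ee = 2.4   # JDK8 vs GraalPython EE 22.3 (geomean)
java_vs_cpython = 41.0     # JDK8 vs CPython 3.11
ee_vs_ce = 1.31            # Graal EE vs CE (assumed to apply to GraalPy)

# If Java is 41x faster than CPython and 2.4x faster than GraalPy EE,
# then GraalPy EE is 41 / 2.4 times faster than CPython.
graalpy_ee_vs_cpython = java_vs_cpython / java_vs_graalpy_ee

# Dividing by the EE/CE factor gives a rough estimate for GraalPy CE.
graalpy_ce_vs_cpython = graalpy_ee_vs_cpython / ee_vs_ce

print(f"GraalPy EE vs CPython: ~{graalpy_ee_vs_cpython:.1f}x")
print(f"GraalPy CE vs CPython: ~{graalpy_ce_vs_cpython:.1f}x")
```

Under those assumptions, GraalPy EE comes out roughly 17 times faster than CPython 3.11 on this suite, and CE roughly 13 times.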
        
       | upghost wrote:
       | FWIW we've had full Java/Python integration in Clojure for a while
       | now, courtesy of Chris Nuernberger and libpython-clj:
       | https://github.com/clj-python/libpython-clj
       | 
       | If you're into that sort of thing.
       | 
       | Self-interest disclosure: I'm a major contributor and heavy user.
        
         | waldrews wrote:
         | What's the GIL/threading story there?
        
       ___________________________________________________________________
       (page generated 2024-09-17 23:00 UTC)