[HN Gopher] Calling Rust from Python using PyO3
       ___________________________________________________________________
        
       Calling Rust from Python using PyO3
        
       Author : loige
       Score  : 136 points
       Date   : 2021-11-28 12:50 UTC (10 hours ago)
        
 (HTM) web link (saidvandeklundert.net)
 (TXT) w3m dump (saidvandeklundert.net)
        
       | stabbles wrote:
       | PyO3 is a clever name
        
         | fezzez wrote:
         | I don't get it. What's the pun?
        
           | gopiandcode wrote:
           | Fe2O3 (iron(III) oxide) is commonly known as rust, and
           | presumably the name PyO3 makes reference to that.
        
             | dijit wrote:
             | It's a double pun because -O3 is the highest optimisation
             | level for many compilers.
        
               | cosmic_quanta wrote:
               | This explains why they didn't go with PyO2!
        
           | barefeg wrote:
           | Python Oxide
        
           | drexlspivey wrote:
           | It's a chemistry pun about oxidation: iron(III) oxide (rust)
           | has the chemical formula Fe2O3, so swapping the oxidized
           | element for Python gives Py Oxide (PyO3).
        
       | marvinvz wrote:
       | I've been playing around with Rust, PyO3 and Blender the last few
       | days and it has been really nice to use.
        
       | Grollicus wrote:
       | I recently wanted to use some rust crates from Python and decided
       | to give PyO3 a go and I must say it works really well.
       | 
       | Took about one evening to plug it all together and get it
       | running. The guide at pyo3.rs was really helpful and maturin to
       | build binary packages just works. You pass in python types and
       | they arrive as rust-native types. Makes writing code feel very
       | native.
       | 
       | The result: https://github.com/Grollicus/pyttfwrap takes a string
       | and splits it as it would wrap when rendered with a given TTF
       | font. Most of the code is about keeping a reference to the loaded
       | font as that's an expensive operation.
       | 
       | Was some great fun and I'm sure I'll use it some more.
        
       | rdedev wrote:
       | Check out Apache Arrow in Rust for transferring large amounts of
       | data between Rust and Python. With it you can get zero-copy
       | transfers between the languages. AFAIK there is some performance
       | penalty involved when converting Python types to native Rust
       | types.
       | 
       | It used to be hard to do such transfers between Rust and Python
       | using Arrow, but from v3 or v4 the Rust implementation of Arrow
       | started supporting the C Data Interface. Creating the array in
       | Rust and sending it to Python might involve leaking a boxed
       | object, though.
       | 
       | Check this out if you want to see how
       | https://github.com/jhoekx/python-rust-arrow-interop-example/...
        
         | wrcwill wrote:
         | I was using rust-numpy for this. I wonder how much faster this
         | is, and whether it supports passing arrays of strings.
        
           | rdedev wrote:
           | Yup. It supports arrays of strings.
           | 
           | https://docs.rs/arrow/6.2.0/arrow/array/type.LargeStringArra...
           | 
           | As for performance, YMMV, but I think it could be faster,
           | since storing a list of strings as a NumPy array involves
           | padding. Arrow arrays can get away without padding.
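
The padding point above can be illustrated with a rough, std-only sketch. It assumes a simplified version of a variable-length string layout: one contiguous byte buffer plus an offsets array. The names `StringColumn` and `padded_bytes` are hypothetical and this is not the arrow crate's API; real Arrow arrays use fixed-size integer offsets and a validity bitmap, and NumPy's fixed-width string dtypes have their own per-character sizes.

```rust
/// Simplified stand-in for a variable-length string column:
/// all values packed back to back, with offsets marking the boundaries.
struct StringColumn {
    offsets: Vec<usize>, // value i occupies data[offsets[i]..offsets[i + 1]]
    data: Vec<u8>,
}

impl StringColumn {
    fn from_strs(values: &[&str]) -> Self {
        let mut offsets = vec![0];
        let mut data = Vec::new();
        for v in values {
            data.extend_from_slice(v.as_bytes());
            offsets.push(data.len());
        }
        StringColumn { offsets, data }
    }

    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    fn get(&self, i: usize) -> &str {
        let bytes = &self.data[self.offsets[i]..self.offsets[i + 1]];
        std::str::from_utf8(bytes).expect("buffer only ever holds valid UTF-8")
    }
}

/// Bytes a fixed-width layout would need if every slot were as wide
/// as the longest value (the padding the comment refers to).
fn padded_bytes(values: &[&str]) -> usize {
    let widest = values.iter().map(|v| v.len()).max().unwrap_or(0);
    widest * values.len()
}

fn main() {
    let values = ["a", "bb", "a much longer string"];
    let col = StringColumn::from_strs(&values);
    println!(
        "{} values: {} packed bytes vs {} padded bytes",
        col.len(),
        col.data.len(),
        padded_bytes(&values)
    );
    println!("value 2 = {:?}", col.get(2));
}
```

With one long value among short ones, the packed buffer stays close to the total text size, while the padded layout grows with the widest entry.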
        
       | gravypod wrote:
       | I hope someone creates a rule set for Bazel to automate this
       | stuff. This seems like a really nice way to start rewriting
       | performance critical code in a safe language!
        
         | bytearray64 wrote:
         | I've been using this macro in my personal projects repo:
         | https://pastebin.com/raw/atNtPgwv
         | 
         | The three big gotchas with this are:
         | 
         | - You need to name the rule the same as the module you've
         | exported in Rust
         | 
         | - It creates a copy to get the naming right. You could write an
         | actual rule instead of a macro that just makes a symlink with
         | the right name for Python to pick it up.
         | 
         | - Since the macro just exposes a genrule and isn't a PyInfo
         | provider, you need to add it in the data of the
         | py_library/binary you intend to use it in. Again, could be
         | fixed by an actual rule impl.
        
         | dijit wrote:
         | I've been having difficulty recently trying to get bazel and
         | rust to work together nicely.
         | 
         | It seems like cargo does a lot of heavy lifting w.r.t.
         | dependencies, which Bazel does not like. Do you have to vendor
         | your dependencies, and your dependencies' dependencies, ad
         | infinitum?
         | 
         | There is cargo-raze which helps, but only if you're making a
         | rust library: not if you're making a binary.
         | 
         | So maybe it works for this case.
        
           | gravypod wrote:
           | > It seems like cargo does a lot of heavy lifting w.r.t.
           | dependencies, which Bazel does not like. Do you have to vendor
           | your dependencies, and your dependencies' dependencies, ad
           | infinitum?
           | 
           | I'm really hoping bzlmod helps in this case:
           | https://www.youtube.com/watch?v=TxOCKtU39Fs
           | 
           | The story for external deps is indeed extremely painful right
           | now.
        
           | bluejekyll wrote:
           | Have you tried cargo-vendor for managing all the
           | dependencies?
        
           | bytearray64 wrote:
           | Not disagreeing that cargo-raze in bazel is awkward, but just
           | wanted to add:
           | 
           | Cargo raze should let you consume libraries you've imported
           | in a rust_binary target, if that's what you're talking about.
           | It's also possible, but slightly more annoying, to vendor the
           | dependency tree if you need to rely on pulling from a local
           | mirror.
           | 
           | I've also used it to import cargo binaries (e.g.
           | wasm-bindgen-cli) into a Bazel workspace -- it just makes the
           | resulting target something like @raze_some_bin//:bin
        
           | HelloNurse wrote:
           | Isn't it possible to make a little script to run _bazel
           | clean_ or the equivalent whenever you run Cargo to actually
           | add, remove or update Rust libraries, and ignore them as
           | dependencies because they are immutable apart from such user
           | interventions? Are Bazel and/or Cargo too smart?
        
       | crm416 wrote:
       | We've been using PyO3 and Maturin at Spring for a while now, and
       | happily. The smooth Python interop means we can call out to Rust
       | without much pain for performance-critical codepaths. But the
       | other side-effect is that we can use Rust across the org more
       | broadly, even when Python interop isn't a consideration -- e.g.,
       | for isolated services, or applications that need to compile to
       | Windows, or whatever else -- since we're building up the cultural
       | knowledge and shared libraries to do so.
        
       | JPKab wrote:
       | I've been waiting a while for a good article on integrating Rust
       | into Python. I was inspired by the amazing performance of the
       | orjson Python lib.
        
       | wheelerof4te wrote:
       | This is smart thinking. Python is the world's best glue language
       | and Rust is on track to become the world's best compiled, safe
       | language.
       | 
       | It makes sense to combine the two and replace C in this.
        
         | matheusmoreira wrote:
         | C has not been fully replaced. These Python modules are not
         | Rust libraries, they're C libraries that just happen to be
         | written in Rust internally.
        
         | linkdd wrote:
         | While Python's bindings in other languages are great, and Rust
         | has amazing features regarding memory safety, I think saying
         | they are the best in their area is a bit too much.
         | 
         | For compiled languages: Rust is the best when you care about
         | memory safety. C is the best when you care about
         | performance/simplicity/portability. C++ is the best when you
         | care about modularity, Assembly is the best when you care about
         | low-level stuff, ...
         | 
         | For interpreted languages: Python is the best for data science,
         | Java is the best for enterprise, JavaScript is the best for
         | speed, ...
         | 
         | Each one has a use case where it is the best. And those use
         | cases I listed above are entirely subjective and will never
         | be the same from person to person.
         | 
         | If there was an objectively best language period, why would
         | other languages exist?
        
           | wheelerof4te wrote:
           | I said that Rust "is on track" to become the best, not that
           | it is best. It is still a young language, but Rust shows
           | great promise to one day be _that_ one language you need for
           | just about anything, sans scripting. The best part is that
           | Rust is approaching C levels of speed, sometimes even
           | surpassing it.
           | 
           | Python is currently the easiest language to learn, has a
           | plethora of modules in its standard library alone, plus a
           | vast ocean of 3rd party libraries. From displaying cute cats
           | inside your terminal, to large mathematical monoliths like
           | numpy and scipy. Python has almost everything you need to
           | build programs, minus the speed and easy distribution of
           | executables.
           | 
           | Java is semi-compiled and too verbose to be a glue language.
           | 
           | JavaScript is faster than Python, but is much more difficult
           | to use efficiently. Its standard library is such a joke that
           | I had to install a 3rd-party library just to use an input()
           | equivalent.
        
             | Jansen312 wrote:
             | Compiled with memory safety? Which language do you have in
             | mind that is better than Rust at the moment? Just curious.
        
               | wheelerof4te wrote:
               | Honestly, none that I can think of.
        
           | tialaramex wrote:
           | Other people have pushed back about safety, but I want to
           | push back on _performance_.
           | 
           | Here's the thing, presumably you agree that "It's fast except
           | that it never works" isn't actually fast. So in practice your
           | high performance C ends up compromised by the reality that it
           | must work, at least often, which constrains what you are
           | confident to actually write because you already know you
           | can't write over-complicated code correctly.
           | 
           | Languages like Rust (and C++) enable you to write _much_ more
           | complicated software that you still understand well enough to
           | debug it. An efficient filter that might be a single line of
           | Rust or C++ may take dozens of lines of C, or else several
           | macro invocations that add invisible overhead of lines that
           | are constructed by the pre-processor unseen by the
           | programmer, yet still taint the shared namespace and
           | semantics. As a result, while the Rust or C++ programmer
           | feels free to chain, say, six filters, the equivalent C looks
           | monstrous and you recoil from it, even though that's the
           | efficient way to solve the problem you had.
           | 
           | The intuition that if it looks simpler it's faster is often
           | wrong. The reason Godbolt (Compiler Explorer) exists is that
           | Matt Godbolt was concerned about whether the C++ for-each
           | loop results in the same fast _machine code_ as a manual
           | loop, or whether you might pay a price for the nicer syntax.
           | You don't, and Matt's continued exploration of this sort of
           | issue, plus his generosity in sharing the result with the
           | world, is why the site is there now.
           | 
           | But of course the nicer iterator loop (and there are a _lot_
           | of examples like this) encourages you to choose more
           | complicated solutions which are faster, because total
           | cognitive load isn't as great as it would be in a less
           | expressive language. As a result, even though you _could_ in
           | principle write an equally high performance C program, actual
           | C programmers would not do that.
           | 
           | Now, machines don't fear complexity, so one of the
           | interesting results of WUFFS is that they produce C code
           | which no human would ever write, but which has all the
           | properties they guaranteed (e.g. memory safety but lots more)
           | in their small interest domain, by "simply" using tremendous
           | amounts of complexity. The unexpected effect of this is that
           | WUFFS-the-library is very fast: Since the only possible
           | mistakes are programs that either don't do what is required
           | or don't compile, WUFFS programmers are freed to focus on
           | very fast tight code for the library. Again, you could _in
           | theory_ have written that code in C, but you wouldn't
           | because you're human and you can't handle that.
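
The "chain several filters" argument in this comment can be made concrete with a small sketch. The function names and the particular filters are made up for illustration; the claim that both versions end up as comparable machine code is the commenter's and is best checked per case in a tool such as Compiler Explorer rather than assumed.

```rust
/// Iterator version: each filter is one line and the chain reads top to bottom.
fn filtered_sum(values: &[i64], cutoff: i64) -> i64 {
    values
        .iter()
        .copied()
        .filter(|&v| v > 0) // keep positive values
        .filter(|&v| v % 2 == 0) // keep even values
        .filter(|&v| v < cutoff) // keep values below the cutoff
        .map(|v| v * v) // square what is left
        .sum()
}

/// Hand-written loop with the same behaviour, for comparison.
fn filtered_sum_loop(values: &[i64], cutoff: i64) -> i64 {
    let mut total = 0;
    for &v in values {
        if v > 0 && v % 2 == 0 && v < cutoff {
            total += v * v;
        }
    }
    total
}

fn main() {
    let values = [-4, 1, 2, 3, 4, 10, 12];
    println!("iterator chain: {}", filtered_sum(&values, 11));
    println!("manual loop:    {}", filtered_sum_loop(&values, 11));
}
```

Adding a seventh filter to the chain is one more line, while the loop's condition keeps growing, which is the cognitive-load difference the comment describes.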
        
             | adsharma wrote:
             | Agreed on the need to write simpler programs with less
             | cognitive load.
             | 
             | The thing about wuffs is that you can't single step through
             | it and there is no ecosystem of libraries a wuffs
             | programmer could use.
             | 
             | This is why a subset of python that shares design
             | principles with Julia and Nim, but transpiles to Rust, Go
             | or C++ is interesting. It's not hugely popular with those
             | language communities (prefer coding natively in Julia or
             | Nim), but the pytorch thread
             | (https://news.ycombinator.com/item?id=29354474#29371641)
             | explains why the ecosystem is important (harder to build).
             | 
             | Having said that the basic hello world wuffs example in
             | py2many isn't working as well as I'd like it to, but it's
             | close.
        
             | staticassertion wrote:
             | There's a _lot_ of examples of this sort of thing.
             | 
             | 1. C++'s std::shared_ptr is the equivalent of an Arc in Rust,
             | because C++ can't tell if that reference will be shared
             | across threads. In Rust you can use an Rc and, if you ever
             | need an Arc, the compiler will tell you.
             | 
             | 2. Rust's `&str[..]` is safe, C++'s string_view causes tons
             | of UAFs, so string_view is used much less frequently
             | whereas &str is ubiquitous in rust code. In general you can
             | share stack space in Rust easily, even across threads,
             | which is incredibly powerful.
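
Both points in the list above can be sketched with the standard library alone. The function names are hypothetical, and `std::thread::scope` assumes Rust 1.63 or newer. The commented-out line shows the kind of code the compiler rejects because `Rc` is not `Send`.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Single-threaded sharing: Rc uses a cheap, non-atomic reference count.
fn shared_len_single_thread() -> usize {
    let shared = Rc::new(String::from("hello"));
    let other = Rc::clone(&shared);
    // thread::spawn(move || other.len()); // rejected at compile time: Rc<String> is not Send
    shared.len() + other.len()
}

/// Cross-thread sharing: switching to Arc is what the compiler error pushes you towards.
fn shared_len_across_threads() -> usize {
    let shared = Arc::new(String::from("hello"));
    let other = Arc::clone(&shared);
    let handle = thread::spawn(move || other.len());
    shared.len() + handle.join().unwrap()
}

/// Borrowed &str slices handed to scoped threads: no copies, no reference counts.
/// The borrow checker ensures the text outlives every thread in the scope.
fn count_words(lines: &[&str]) -> usize {
    thread::scope(|s| {
        let handles: Vec<_> = lines
            .iter()
            .map(|line| s.spawn(move || line.split_whitespace().count()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum::<usize>()
    })
}

fn main() {
    println!("Rc:  {}", shared_len_single_thread());
    println!("Arc: {}", shared_len_across_threads());

    // `text` is owned by this stack frame; the threads only see slices of it.
    let text = String::from("one two\nthree four five");
    let lines: Vec<&str> = text.lines().collect();
    println!("words: {}", count_words(&lines));
}
```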
        
           | post-it wrote:
           | C, C++, and Assembly aren't safe, and Java and JS aren't good
           | glue languages.
        
             | linkdd wrote:
             | So?
             | 
             | My point was that if you don't care about safety, C/C++/ASM
             | have good aspects too.
             | 
             | I can write a program without a single pointer in C, and it
             | will be fast, small, and portable.
             | 
             | Java is used a lot in data science alongside Python. And
             | JavaScript also has bindings for TensorFlow and other
             | libs. They are scripting languages able to call C code, by
             | definition they are glue languages.
        
               | treesprite82 wrote:
               | > So?
               | 
               | wheelerof4te said Rust is becoming the "world's best
               | *compiled, safe* language". You objected to this, but
               | then only listed alternative languages which aren't in
               | that group. You responded as if wheelerof4te had claimed
               | there was "an objectively best language period".
               | 
               | > They are scripting languages able to call C code, by
               | definition they are glue languages.
               | 
               | If I say something like "world's best chef" then I mean
               | the person who is best _at being a chef_. Not the world's
               | best (most moral?) person who happens to also fulfill
               | the definition of being a chef but may be pretty bad at
               | cooking. I think the same was intended here for "best
               | glue language" - best at being a glue language.
        
             | smitty1e wrote:
             | Implementing Rust's borrow checker as a C++ compile-time
             | template metaprogramming exercise seems only a guru and a
             | cloud-based compiler cluster out of reach.
        
       | oconnor663 wrote:
       | https://pypi.org/project/blake3 has been based on PyO3 for about
       | a year and a half now, and it's been smooth sailing. One of the
       | reasons we reach for compiled code from Python is to get
       | multithreading working for CPU-bound tasks, and the combination
       | of PyO3 plus Rayon on the Rust side is especially nice.
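
As a rough illustration of the CPU-bound multithreading mentioned above, here is a std-only sketch that splits work across scoped threads. It is a stand-in for the idea, not the blake3 package's code and not Rayon's API; the function names, the toy workload, and the chunking strategy are all assumptions made for the example. It needs Rust 1.63 or newer for `std::thread::scope`.

```rust
use std::thread;

/// Toy CPU-bound workload; a real extension would hash or transform data here.
fn sum_of_squares(values: &[u64]) -> u64 {
    values.iter().map(|v| v * v).sum()
}

/// Split the input into roughly equal chunks and process each on its own thread.
fn parallel_sum_of_squares(values: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Round up so that at most `workers` chunks are produced; never zero.
    let chunk_len = ((values.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = values
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || sum_of_squares(chunk)))
            .collect();
        // Combine the partial results once every worker has finished.
        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    })
}

fn main() {
    let values: Vec<u64> = (1..=100).collect();
    println!("sequential: {}", sum_of_squares(&values));
    println!("parallel:   {}", parallel_sum_of_squares(&values, 4));
}
```

A work-stealing library such as Rayon balances load more carefully than fixed chunks do, which is part of why the commenter reaches for it.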
        
       | staticassertion wrote:
       | I'm curious about any rough edges.
       | 
       | 1. What happens if the Rust code is misbehaving, panicking, etc?
       | 
       | 2. What's build support like for this?
       | 
       | I'd love to gut what little Python code I have left and use Rust.
        
         | jamesmishra wrote:
         | 1. The Python code calling the Rust code raises a runtime-
         | related exception, but it doesn't crash the process.
         | 
         | 2. Building Python extensions out of PyO3 code is easy with
         | maturin: https://github.com/PyO3/maturin
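
The first answer above (a panic becomes an exception instead of taking down the process) can be sketched with the standard library's `std::panic::catch_unwind`. This shows the general mechanism only and is not PyO3's implementation; `risky_parse` and `call_guarded` are hypothetical names, and the approach assumes the default `panic = "unwind"` setting.

```rust
use std::panic;

/// A function that panics on bad input, standing in for misbehaving Rust code.
fn risky_parse(input: &str) -> u32 {
    input
        .trim()
        .parse()
        .expect("input must be an unsigned integer")
}

/// Run `risky_parse` behind a guard: a panic is caught at the boundary and
/// turned into an `Err` carrying the panic message, so the caller keeps running.
fn call_guarded(input: &str) -> Result<u32, String> {
    panic::catch_unwind(|| risky_parse(input)).map_err(|payload| {
        // Panic payloads are usually a &str or a String.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic".to_string()
        }
    })
}

fn main() {
    println!("{:?}", call_guarded("42"));
    // The default panic hook still prints a message to stderr here,
    // but the process continues and the error is returned as a value.
    println!("{:?}", call_guarded("not a number"));
}
```

A binding layer can then map that `Err` to an exception in the host language, which matches the behaviour described in the answer.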
        
       | vlmutolo wrote:
       | We used PyO3 at $work to expose a Rust implementation of a
       | compute-intensive algorithm to an existing Python codebase.
       | 
       | The teammate who did it had been using Rust only for a couple
       | months and none of us had ever used PyO3. He got it done in just
       | a couple days. I consider that an endorsement of the API they've
       | built.
       | 
       | It's heavily macro-based, which does cause some confusion. But if
       | you spend some time with their examples, finding the fast path
       | isn't too tricky.
        
         | staticassertion wrote:
         | Did you run into any issues? I've been nervous about taking
         | this approach in our codebase because it doesn't feel totally
         | well worn yet, but anecdotes are exactly the sort of thing that
         | change my mind about that.
        
           | vlmutolo wrote:
           | No real issues other than our own lack of experience with
           | PyO3. I'd say the biggest gamble is spending the time to get
           | it set up. Once we got it running, it was smooth sailing.
           | 
           | We're calling Rust from Python. Haven't tried it the other
           | way around. Our use case was helped by the fact that we are
           | just passing a String into Rust and letting Rust do all the
           | heavy lifting. There's minimal back-and-forth.
           | 
           | I liked how PyO3 managed panics (they're just normal
           | exceptions that can be caught on the Python side). I wasn't
           | the one dealing with Maturin, but it seemed reasonable to get
           | started with. I never enjoy introducing more tools into a
           | build system, but this was relatively painless.
           | 
           | If I recall correctly, the bindings themselves are only
           | like... thirty lines of code.
        
       ___________________________________________________________________
       (page generated 2021-11-28 23:01 UTC)