[HN Gopher] Calling Rust from Python using PyO3
___________________________________________________________________
Calling Rust from Python using PyO3
Author : loige
Score : 136 points
Date : 2021-11-28 12:50 UTC (10 hours ago)
(HTM) web link (saidvandeklundert.net)
(TXT) w3m dump (saidvandeklundert.net)
| stabbles wrote:
| PyO3 is a clever name
| fezzez wrote:
| I don't get it. What's the pun?
| gopiandcode wrote:
| Fe2O3 (iron(III) oxide) is commonly known as rust, and presumably
| the name PyO3 makes reference to that.
| dijit wrote:
| It's a double pun because -O3 is the highest optimisation
| level for many compilers.
| cosmic_quanta wrote:
| This explains why they didn't go with PyO2!
| barefeg wrote:
| Python Oxide
| drexlspivey wrote:
| It's a chemistry pun about the process of oxidation, oxides
| (such as rust) have a chemical formula of XO3 where X is the
| oxidized element. The library is called Py Oxide (PyO3) where
| the oxidized element is Python
| marvinvz wrote:
| I've been playing around with Rust, PyO3 and Blender the last few
| days and it has been really nice to use.
| Grollicus wrote:
| I recently wanted to use some rust crates from Python and decided
| to give PyO3 a go and I must say it works really well.
|
| Took about one evening to plug it all together and get it
| running. The guide at pyo3.rs was really helpful and maturin to
| build binary packages just works. You pass in python types and
| they arrive as rust-native types. Makes writing code feel very
| native.
|
| The result: https://github.com/Grollicus/pyttfwrap takes a string
| and splits it as it would wrap when rendered with a given TTF
| font. Most of the code is about keeping a reference to the loaded
| font as that's an expensive operation.
|
| Was some great fun and I'm sure I'll use it some more.
| rdedev wrote:
| Check out apache arrow in rust for transferring large amounts
| of data between rust and python. With it you can get zero copy
| transfers between the languages. Afaik there is some performance
| penalty involved when converting python types to native rust
| types.
|
| It used to be hard to do such transfers between rust and python
| using arrow, but since v3 or v4 the arrow implementation in rust
| has supported the C data interface. Creating the array in
| rust and sending it to python might involve leaking a boxed
| object, though.
|
| Check this out if you want to see how
| https://github.com/jhoekx/python-rust-arrow-interop-example/...
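The "leaking a boxed object" remark can be illustrated with the standard library alone. This is a minimal sketch of the ownership hand-off, not Arrow's C data interface itself: `Buffer`, `buffer_new`, `buffer_sum` and `buffer_free` are hypothetical names, and at a real boundary these functions would be declared `extern "C"`.

```rust
/// Hypothetical payload standing in for an array that crosses the boundary.
pub struct Buffer {
    pub values: Vec<i64>,
}

/// Allocate on the Rust heap and give the raw pointer to the caller.
/// `Box::into_raw` deliberately "leaks" the box: Rust will no longer
/// free it automatically, so the other side now owns it.
pub fn buffer_new(len: usize) -> *mut Buffer {
    let buffer = Box::new(Buffer {
        values: (0..len as i64).collect(),
    });
    Box::into_raw(buffer)
}

/// Read through the pointer without taking ownership back.
///
/// # Safety
/// `ptr` must be null or come from `buffer_new` and not be freed yet.
pub unsafe fn buffer_sum(ptr: *const Buffer) -> i64 {
    match unsafe { ptr.as_ref() } {
        Some(buffer) => buffer.values.iter().sum(),
        None => 0,
    }
}

/// Rebuild the box from the raw pointer so that dropping it frees the memory.
///
/// # Safety
/// `ptr` must be null or come from `buffer_new`, and must be freed only once.
pub unsafe fn buffer_free(ptr: *mut Buffer) {
    if !ptr.is_null() {
        drop(unsafe { Box::from_raw(ptr) });
    }
}
```

The consumer is expected to call the matching free function exactly once; forgetting to do so is a leak, and calling it twice is undefined behaviour.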
| wrcwill wrote:
| i was using rust numpy for this, i wonder how much faster this
| is and if it supports passing arrays of strings
| rdedev wrote:
| Yup. It supports arrays of strings.
|
| https://docs.rs/arrow/6.2.0/arrow/array/type.LargeStringArra...
|
| As for performance, YMMV but I think it could be faster since
| storing a list of strings as a numpy array would involve
| padding. Arrow arrays can get away without padding
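The padding argument can be made concrete with a small sketch. It assumes a simplified offsets-plus-bytes layout in the spirit of Arrow string arrays (the real format uses fixed-width integer offsets and a validity bitmap) and compares it with a fixed-width layout where every slot is as wide as the longest string. `PackedStrings` and `padded_byte_len` are hypothetical names.

```rust
/// Strings stored back to back in one byte buffer, with offsets
/// marking where each one starts and ends.
pub struct PackedStrings {
    offsets: Vec<usize>,
    bytes: Vec<u8>,
}

impl PackedStrings {
    pub fn from_strs(items: &[&str]) -> Self {
        let mut offsets = Vec::with_capacity(items.len() + 1);
        let mut bytes = Vec::new();
        offsets.push(0);
        for item in items {
            bytes.extend_from_slice(item.as_bytes());
            // The end of this string is the start of the next one.
            offsets.push(bytes.len());
        }
        PackedStrings { offsets, bytes }
    }

    /// Borrow the string at `index` without copying it.
    pub fn get(&self, index: usize) -> Option<&str> {
        let start = *self.offsets.get(index)?;
        let end = *self.offsets.get(index + 1)?;
        std::str::from_utf8(&self.bytes[start..end]).ok()
    }

    /// Bytes used for string data (offsets not included).
    pub fn byte_len(&self) -> usize {
        self.bytes.len()
    }
}

/// Bytes needed if every slot were padded to the widest string.
pub fn padded_byte_len(items: &[&str]) -> usize {
    let widest = items.iter().map(|s| s.len()).max().unwrap_or(0);
    widest * items.len()
}
```

For `["a", "rust", "python"]` the packed data takes 11 bytes while the padded layout takes 18; the gap grows when one long string sits among many short ones.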
| gravypod wrote:
| I hope someone creates a rule set for Bazel to automate this
| stuff. This seems like a really nice way to start rewriting
| performance critical code in a safe language!
| bytearray64 wrote:
| I've been using this macro in my personal projects repo:
| https://pastebin.com/raw/atNtPgwv
|
| The three big gotchas with this are:
|
| - You need to name the rule the same as the module you've
| exported in Rust
|
| - It creates a copy to get the naming right. You could write an
| actual rule instead of a macro that just makes a symlink with
| the right name for Python to pick it up.
|
| - Since the macro just exposes a genrule and isn't a PyInfo
| provider, you need to add it in the data of the
| py_library/binary you intend to use it in. Again, could be
| fixed by an actual rule impl.
| dijit wrote:
| I've been having difficulty recently trying to get bazel and
| rust to work together nicely.
|
| It seems like cargo does a lot of heavy lifting w.r.t.
| dependencies which Bazel does not like. Do you have to vendor
| your dependencies, and your dependencies' dependencies, ad
| infinitum?
|
| There is cargo-raze which helps, but only if you're making a
| rust library: not if you're making a binary.
|
| So maybe it works for this case.
| gravypod wrote:
| > It seems like cargo does a lot of heavy lifting w.r.t.
| dependencies which Bazel does not like. Do you have to vendor
| your dependencies, and your dependencies' dependencies, ad
| infinitum?
|
| I'm really hoping bzlmod helps in this case:
| https://www.youtube.com/watch?v=TxOCKtU39Fs
|
| The story for external deps is indeed extremely painful right
| now.
| bluejekyll wrote:
| Have you tried cargo-vendor for managing all the
| dependencies?
| bytearray64 wrote:
| Not disagreeing that cargo-raze in bazel is awkward, but just
| wanted to add:
|
| Cargo raze should let you consume libraries you've imported
| in a rust_binary target, if that's what you're talking about.
| It's also possible, but slightly more annoying, to vendor the
| dependency tree if you need to rely on pulling from a local
| mirror.
|
| I've also used it to import cargo binaries (i.e wasm-bindgen-
| cli) into a Bazel workspace -- it just makes the resulting
| target something like @raze_some_bin//:bin
| HelloNurse wrote:
| Isn't it possible to make a little script to run _bazel
| clean_ or the equivalent whenever you run Cargo to actually
| add, remove or update Rust libraries, and ignore them as
| dependencies because they are immutable apart from such user
| interventions? Are Bazel and/or Cargo too smart?
| crm416 wrote:
| We've been using PyO3 and Maturin at Spring for a while now, and
| happily. The smooth Python interop means we can call out to Rust
| without much pain for performance-critical codepaths. But the
| other side-effect is that we can use Rust across the org more
| broadly, even when Python interop isn't a consideration -- e.g.,
| for isolated services, or applications that need to compile to
| Windows, or whatever else -- since we're building up the cultural
| knowledge and shared libraries to do so.
| JPKab wrote:
| I've been waiting a while for a good article on integrating rust
| into python. I was inspired by the amazing performance of the
| orjson python lib.
| wheelerof4te wrote:
| This is smart thinking. Python is the world's best glue language
| and Rust is on track to become the world's best compiled, safe
| language.
|
| It makes sense to combine the two and replace C in this.
| matheusmoreira wrote:
| C has not been fully replaced. These Python modules are not
| Rust libraries, they're C libraries that just happen to be
| written in Rust internally.
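The point that these modules still present a C interface can be sketched with plain Rust. CPython loads extension modules as shared libraries through its C API, so what the Rust side exports are functions with the C calling convention and C-compatible data layouts. The names below (`Point`, `point_dot`, `saturating_add_u32`) are hypothetical, and a real shared library would also mark the exports with the `no_mangle` attribute so the symbol names stay stable; that is left out here to keep the sketch short.

```rust
/// `#[repr(C)]` gives the struct the field layout a C compiler would use.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// `extern "C"` selects the platform's C calling convention, so a C
/// caller (or anything that speaks the C ABI) can invoke this function.
pub extern "C" fn point_dot(a: Point, b: Point) -> f64 {
    a.x * b.x + a.y * b.y
}

/// The body is ordinary safe Rust; only the signature is C-shaped.
pub extern "C" fn saturating_add_u32(a: u32, b: u32) -> u32 {
    a.saturating_add(b)
}
```

From the caller's side nothing reveals that the implementation is Rust, which is the sense in which C is still the interface even when it is no longer the implementation language.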
| linkdd wrote:
| While Python's bindings in other languages are great, and Rust
| has amazing features regarding memory safety, I think saying
| they are the best in their area is a bit too much.
|
| For compiled languages: Rust is the best when you care about
| memory safety. C is the best when you care about
| performance/simplicity/portability. C++ is the best when you
| care about modularity. Assembly is the best when you care about
| low-level stuff, ...
|
| For interpreted languages: Python is the best for data science,
| Java is the best for enterprise, Javascript is the best for
| speed, ...
|
| Each one has a use case where it is the best. And those use
| cases I listed above are entirely subjective and will never
| be the same from person to person.
|
| If there was an objectively best language period, why would
| other languages exist?
| wheelerof4te wrote:
| I said that Rust "is on track" to become the best, not that
| it is best. It is still a young language, but Rust shows
| great promise to one day be _that_ one language you need for
| just about anything, sans scripting. The best part is that
| Rust is approaching C levels of speed, sometimes even
| surpassing it.
|
| Python is currently the easiest language to learn, has a
| plethora of modules in its standard library alone, plus a
| vast ocean of 3rd party libraries. From displaying cute cats
| inside your terminal, to large mathematical monoliths like
| numpy and scipy. Python has almost everything you need to
| build programs, minus the speed and easy distribution of
| executables.
|
| Java is semi-compiled and too verbose to be a glue language.
|
| Javascript is faster than Python, but is much more difficult
| to use efficiently. Its standard library is such a joke that
| I had to install a 3rd-party library just to use an input()
| equivalent.
| Jansen312 wrote:
| Compiled with memory safety? Which language do you have in
| mind that is better than Rust at this moment? Just curious.
| wheelerof4te wrote:
| Honestly, none that I can think of.
| tialaramex wrote:
| Other people have pushed back about safety, but I want to
| push back on _performance_.
|
| Here's the thing, presumably you agree that "It's fast except
| that it never works" isn't actually fast. So in practice your
| high performance C ends up compromised by the reality that it
| must work, at least often, which constrains what you are
| confident to actually write because you already know you
| can't write over-complicated code correctly.
|
| Languages like Rust (and C++) enable you to write _much_ more
| complicated software that you still understand well enough to
| debug it. An efficient filter that might be a single line of
| Rust or C++ may take dozens of lines of C, or else several
| macro invocations that add invisible overhead of lines that
| are constructed by the pre-processor unseen by the
| programmer, yet still taint the shared namespace and
| semantics. As a result, while the Rust or C++ programmer
| feels free to chain, say, six filters, the equivalent C looks
| monstrous and you recoil from it, even though that's the
| efficient way to solve the problem you had.
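A small sketch of the kind of chaining described here, next to the hand-written loop it replaces. The task and the function names are made up for illustration; both versions compute the same result, and whether they compile to the same machine code is something to check in Compiler Explorer rather than assume.

```rust
/// Sum the squares of the positive, even values whose square is below `limit`,
/// written as a chain of small adapters.
pub fn sum_even_squares_below(values: &[i64], limit: i64) -> i64 {
    values
        .iter()
        .copied()
        .filter(|v| *v > 0)       // keep positive values
        .filter(|v| v % 2 == 0)   // keep even values
        .map(|v| v * v)           // square them
        .filter(|sq| *sq < limit) // drop squares at or above the limit
        .sum()
}

/// The same computation as an explicit loop with nested conditions.
pub fn sum_even_squares_below_loop(values: &[i64], limit: i64) -> i64 {
    let mut total = 0;
    for &v in values {
        if v > 0 && v % 2 == 0 {
            let sq = v * v;
            if sq < limit {
                total += sq;
            }
        }
    }
    total
}
```

Adding a seventh condition to the chained version is one more line; in the loop it means another level of nesting or a longer condition, which is the cognitive-load difference the comment is pointing at.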
|
| The intuition that if it looks simpler it's faster is often
| wrong, the reason Godbolt (Compiler Explorer) exists is that
| Matt Godbolt was concerned about whether the C++ for-each
| loop results in the same fast _machine code_ as a manual
| loop, or whether you might pay a price for the nicer syntax.
| You don't, and Matt's continued exploration of this sort of
| issue, plus his generosity in sharing the result with the
| world is why the site is there now.
|
| But of course the nicer iterator loop (and there are a _lot_
| of examples like this) encourages you to choose more
| complicated solutions which are faster, because total
| cognitive load isn't so great as it would be in a less
| expressive language. As a result even though you _could_ in
| principle write an equally high performance C program, actual
| C programmers would not do that.
|
| Now, machines don't fear complexity, so one of the
| interesting results of WUFFS is that they produce C code
| which no human would ever write, but which has all the
| properties they guaranteed (e.g. memory safety but lots more)
| in their small interest domain, by "simply" using tremendous
| amounts of complexity. The unexpected effect of this is that
| WUFFS-the-library is very fast: Since the only possible
| mistakes are programs that either don't do what is required
| or don't compile, WUFFS programmers are freed to focus on
| very fast tight code for the library. Again, you could _in
| theory_ have written that code in C, but you wouldn't
| because you're human and you can't handle that.
| adsharma wrote:
| Agreed on the need to write simpler programs with less
| cognitive load.
|
| The thing about wuffs is that you can't single step through
| it and there is no ecosystem of libraries a wuffs
| programmer could use.
|
| This is why a subset of python that shares design
| principles with Julia and Nim, but transpiles to Rust, Go
| or C++ is interesting. It's not hugely popular with those
| language communities (prefer coding natively in Julia or
| Nim), but the pytorch thread
| (https://news.ycombinator.com/item?id=29354474#29371641)
| explains why the ecosystem is important (harder to build).
|
| Having said that the basic hello world wuffs example in
| py2many isn't working as well as I'd like it to, but it's
| close.
| staticassertion wrote:
| There's a _lot_ of examples of this sort of thing.
|
| 1. C++ uses the equivalent of an Arc in Rust, because it
| can't tell if that reference will be shared across threads.
| In Rust you can use an Rc and, if you ever need an Arc, it
| will tell you.
|
| 2. Rust's `&str[..]` is safe, C++'s string_view causes tons
| of UAFs, so string_view is used much less frequently
| whereas &str is ubiquitous in rust code. In general you can
| share stack space in Rust easily, even across threads,
| which is incredibly powerful.
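Both points can be sketched with the standard library. The function names are hypothetical. The first pair shows `Rc` for single-threaded sharing and `Arc` once a thread is involved; the last function shows borrowed `&str` slices being shared with scoped threads (`std::thread::scope`, available since Rust 1.63) without any reference counting.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Single-threaded sharing: `Rc` uses a plain, non-atomic counter.
pub fn shared_len_single_thread(data: &str) -> usize {
    let owner = Rc::new(String::from(data));
    let alias = Rc::clone(&owner);
    debug_assert_eq!(Rc::strong_count(&owner), 2);
    // Moving `alias` into `thread::spawn` would be rejected at compile
    // time because `Rc` is not `Send`; that is the compiler "telling you".
    alias.len()
}

/// Cross-thread sharing: `Arc` pays for atomic counting and is `Send`.
pub fn shared_len_across_threads(data: &str) -> usize {
    let owner = Arc::new(String::from(data));
    let alias = Arc::clone(&owner);
    let handle = thread::spawn(move || alias.len());
    let len = handle.join().unwrap();
    debug_assert_eq!(len, owner.len());
    len
}

/// Borrowed slices shared across scoped threads: no copies, no counters.
pub fn total_len_parallel(text: &str) -> usize {
    // Each `&str` points into `text`; nothing is allocated per word.
    let words: Vec<&str> = text.split_whitespace().collect();
    let (left, right) = words.split_at(words.len() / 2);
    thread::scope(|s| {
        let a = s.spawn(|| left.iter().map(|w| w.len()).sum::<usize>());
        let b = s.spawn(|| right.iter().map(|w| w.len()).sum::<usize>());
        a.join().unwrap() + b.join().unwrap()
    })
}
```

The scope guarantees both threads finish before `words` goes out of scope, which is what lets the borrow checker accept references to stack data inside the spawned closures.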
| post-it wrote:
| C, C++, and Assembly aren't safe, and Java and JS aren't good
| glue languages.
| linkdd wrote:
| So?
|
| My point was that if you don't care about safety, C/C++/ASM
| have good aspects too.
|
| I can write a program without a single pointer in C, and it
| will be fast, small, and portable.
|
| Java is used a lot in data science alongside Python. And
| Javascript also has bindings for Tensorflow and other
| libs. They are scripting languages able to call C code, by
| definition they are glue languages.
| treesprite82 wrote:
| > So?
|
| wheelerof4te said Rust is becoming the "world's best
| *compiled, safe* language". You objected to this, but
| then only listed alternative languages which aren't in
| that group. You responded as if wheelerof4te had claimed
| there was "an objectively best language period".
|
| > They are scripting languages able to call C code, by
| definition they are glue languages.
|
| If I say something like "world's best chef" then I mean
| the person who is best _at being a chef_. Not the world's
| best (most moral?) person who happens to also fulfill
| the definition of being a chef but may be pretty bad at
| cooking. I think the same was intended here for "best
| glue language" - best at being a glue language.
| smitty1e wrote:
| Implementing Rust's borrow checker as a C++ compile-time
| template metaprogramming exercise seems only a guru and a
| cloud-based compiler cluster out of reach.
| oconnor663 wrote:
| https://pypi.org/project/blake3 has been based on PyO3 for about
| a year and a half now, and it's been smooth sailing. One of the
| reasons we reach for compiled code from Python is to get
| multithreading working for CPU-bound tasks, and the combination
| of PyO3 plus Rayon on the Rust side is especially nice.
| staticassertion wrote:
| I'm curious about any rough edges.
|
| 1. What happens if the Rust code is misbehaving, panicking, etc?
|
| 2. What's build support like for this?
|
| I'd love to gut what little Python code I have left and use Rust.
| jamesmishra wrote:
| 1. The Python code calling the Rust code raises a runtime-
| related exception, but it doesn't crash the process.
|
| 2. Building Python extensions out of PyO3 code is easy with
| maturin: https://github.com/PyO3/maturin
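A panic surfacing as an error instead of taking the process down relies on stopping the unwind at the language boundary. Below is a minimal std-only sketch of that general technique using `std::panic::catch_unwind`; it is not PyO3's actual implementation, and `parse_percent` and `parse_percent_guarded` are hypothetical names.

```rust
use std::panic;

/// A function that panics on bad input instead of returning an error.
pub fn parse_percent(input: &str) -> u8 {
    let value: u8 = input.trim().parse().expect("not a small integer");
    assert!(value <= 100, "percent out of range");
    value
}

/// Boundary wrapper: catch the panic and turn it into an ordinary error
/// value that the calling side could map onto an exception.
pub fn parse_percent_guarded(input: &str) -> Result<u8, String> {
    panic::catch_unwind(|| parse_percent(input)).map_err(|payload| {
        // A panic payload is usually a `&str` or a `String`.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic".to_string()
        }
    })
}
```

This only works when panics unwind (the default); with `panic = "abort"` there is nothing to catch. The default panic hook still prints the message to stderr before the error value is returned.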
| vlmutolo wrote:
| We used PyO3 at $work to expose a Rust implementation of a
| compute-intensive algorithm to an existing Python codebase.
|
| The teammate who did it had been using Rust only for a couple
| months and none of us had ever used PyO3. He got it done in just
| a couple days. I consider that an endorsement of the API they've
| built.
|
| It's heavily macro-based, which does cause some confusion. But if
| you spend some time with their examples, finding the fast path
| isn't too tricky.
| staticassertion wrote:
| Did you run into any issues? I've been nervous about taking
| this approach in our codebase because it doesn't feel totally
| well worn yet, but anecdotes are exactly the sort of thing that
| change my mind about that.
| vlmutolo wrote:
| No real issues other than our own lack of experience with
| PyO3. I'd say the biggest gamble is spending the time to get
| it set up. Once we got it running, it was smooth sailing.
|
| We're calling Rust from Python. Haven't tried it the other
| way around. Our use case was helped by the fact that we are
| just passing a String into Rust and letting Rust do all the
| heavy lifting. There's minimal back-and-forth.
|
| I liked how PyO3 managed panics (they're just normal
| exceptions that can be caught on the Python side). I wasn't
| the one dealing with Maturin, but it seemed reasonable to get
| started with. I never enjoy introducing more tools into a
| build system, but this was relatively painless.
|
| If I recall correctly, the bindings themselves are only
| like... thirty lines of code.
___________________________________________________________________
(page generated 2021-11-28 23:01 UTC)