[HN Gopher] PyO3: Rust Bindings for the Python Interpreter
___________________________________________________________________
PyO3: Rust Bindings for the Python Interpreter
Author : batterylow
Score : 255 points
Date : 2021-01-29 12:17 UTC (10 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| mleonhard wrote:
| I'm interested in running Python inside wasmtime. I think PyO3
| would be useful. We could build a small Rust wasm binary that
| exports an "execute_python_script" function. This would finally
| be a way to run Python in a strong sandbox with memory [0] and
| CPU [1] restrictions. (In 1999, I asked Guido for sandboxing
| support in Python, but he refused.)
|
| [0] https://github.com/bytecodealliance/wasmtime/issues/2273
|
| [1] https://github.com/bytecodealliance/wasmtime/issues/2274
| minimaxir wrote:
| Huggingface Tokenizers
| (https://github.com/huggingface/tokenizers), which are now used
| by default in their Transformers Python library, use pyO3 and
| became popular due to the pitch that it encoded text an order of
| magnitude faster with zero config changes.
|
| It lives up to that claim. (I had issues with return object
| typing when going between Python/Rust at first but those are more
| consistent now)
| adsharma wrote:
| There is another way to speed up python:
|
| Write code in python and transpile to another language (could be
| rust) and then import it back into python
|
| https://github.com/adsharma/py2many/tree/main/tests/expected
|
| Figuring out a mapping between a subset of a compiled language
| and a subset of statically typed python should be possible.
|
| The hard part is mapping standard library. I suspect something
| like nim might have an advantage there.
| gukoff wrote:
| With PyO3, I built a library that parses datetimes 10x faster than
| `datetime.strptime` in just a few lines of code:
| https://github.com/gukoff/dtparse
|
| It just calls Rust's chrono library, which does the parsing, and
| wraps the result in a Python object. You can do it for any Rust
| library, it's very, very easy!
|
| The only slightly complicated part is the distribution. You need
| to use https://github.com/PyO3/maturin or
| https://github.com/PyO3/setuptools-rust, and of course, you need
| to have Rust installed on the wheel-building machine.
|
| Feel free to use this repo as a reference if you want to build a
| similar thing. The code is commented, and there's a working
| GitHub action that builds the wheels for all platforms and
| uploads them to PyPI:
| https://github.com/gukoff/dtparse/tree/master/.github/workfl...
| JPKab wrote:
| Thank you thank you thank you!
|
| I was looking at PyO3 a few months ago, after discovering the
| orjson python (with rust inside) library and radically speeding
| up an auto-ML app for work.
|
| I really enjoyed starting to learn Rust, but found the process
| to embed in Python to be rather intimidating. Looking forward
| to using your repo as a reference, and love the dtparse work
| you've done.
| Rotareti wrote:
| This is awesome, thanks for sharing! I think this should be
| added to the PyO3 examples list :)
|
| https://github.com/PyO3/pyo3#examples
| japhyr wrote:
| I was surprised to find out how slow strptime() can be. I was
| working on a data-focused project that was finally starting to
| slow down from the growing volume of data. I was looking at
| river heights over time, and once I hit about 140,000 data
| points the project got slow enough to make some profiling and
| optimization worthwhile. I was quite surprised to find it was
| spending more than two full seconds just running strptime(),
| out of a total execution time of around 15 seconds.
|
| I ended up looking at a bunch of different ways of processing
| timestamps in Python: strptime(), string parsing, regex,
| datetime.fromisoformat(), NumPy, Pandas, and more. I got a 46x
| speedup using datetime.fromisoformat(). Other approaches got
| anywhere from 4x to 40x speedup, and a couple approaches were
| an order of magnitude slower than strptime().
|
| My takeaway was there's no substitute for profiling the actual
| code you're running, and focusing on the specific bottlenecks
| in your own project. I wrote this up in a blog post if anyone's
| interested, "What's faster than strptime()?"
|
| https://ehmatthes.com/blog/faster_than_strptime/
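|
| A minimal sketch of the kind of comparison described above,
| assuming the ISO-format approach refers to
| datetime.fromisoformat(). The sample timestamps and function
| names are made up for illustration, and the timings will differ
| per machine and Python version, so profile your own data.

```python
import timeit
from datetime import datetime

# Hypothetical sample data: 3600 ISO 8601 timestamps, one per second of an hour.
stamps = [f"2021-01-29T12:{m:02d}:{s:02d}" for m in range(60) for s in range(60)]


def parse_strptime(values):
    # General-purpose parser driven by an explicit format string.
    return [datetime.strptime(v, "%Y-%m-%dT%H:%M:%S") for v in values]


def parse_fromisoformat(values):
    # Specialised parser that only accepts ISO 8601 style strings.
    return [datetime.fromisoformat(v) for v in values]


if __name__ == "__main__":
    for fn in (parse_strptime, parse_fromisoformat):
        seconds = timeit.timeit(lambda: fn(stamps), number=5)
        print(f"{fn.__name__}: {seconds:.3f} s for 5 runs")
```

| Both functions return the same datetime objects here, so the
| only difference being measured is parsing cost.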
| mrcarruthers wrote:
| how does it compare against ciso8601 perf-wise?
| https://pypi.org/project/ciso8601/
|
| to be fair ciso8601 only parses iso8601 datetimes, but that's
| enough for 90%+ of my use cases.
| throwaway894345 wrote:
| I'm very curious to hear the use case for which date time
| parsing was the bottleneck! Also, I'm surprised that the
| overhead of calling across the language boundary didn't dwarf
| the gains from parsing...
| pbecotte wrote:
| I've certainly never been bottlenecked on date parsing :)
| However, many/most of the high-performance Python libraries
| are built in C code, and compiled down into something the
| Python interpreter can use directly. There are lots of Python
| bindings written in C++ to native C libraries as well; I know
| I have used ZeroMQ pretty recently. Rust works the same
| way: the code is compiled down into objects that Python can
| use directly. It's not like running a JavaScript interpreter
| in your code.
| oblvious-earth wrote:
| I've had this situation a few times. Most recently
| transforming large (1-50 GB) CSV files into a format that
| can be digested by a proprietary bulk DB loader.
|
| Because our problem was just about reformatting we ended up
| reading the CSVs in binary mode and using struct to extract
| the relevant values from the date time fields. But if we
| needed to do actual date logic something like this would
| perhaps be useful (but there are other fast date time libraries
| out there, I've been a fan of pendulum for some tasks).
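|
| A rough sketch of the struct-based approach described above. The
| field layout ("YYYY-MM-DD HH:MM:SS" as 19 ASCII bytes) and the
| target "YYYYMMDDHHMMSS" loader format are assumptions for
| illustration; the original files and loader are not specified.

```python
import struct

# Assumed fixed-width layout: "YYYY-MM-DD HH:MM:SS" (19 bytes).
# "Ns" reads N bytes, "x" skips one separator byte.
TIMESTAMP = struct.Struct("4sx2sx2sx2sx2sx2s")


def split_timestamp(field: bytes):
    """Return (year, month, day, hour, minute, second) as integers."""
    return tuple(int(part) for part in TIMESTAMP.unpack(field))


def to_loader_format(field: bytes) -> bytes:
    """Rearrange the bytes into a hypothetical YYYYMMDDHHMMSS format.

    No datetime object is created, so this only suits pure reformatting.
    """
    return b"".join(TIMESTAMP.unpack(field))
```

| For example, to_loader_format(b"2021-01-29 12:17:05") gives
| b"20210129121705" without any date arithmetic or validation.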
| throwaway894345 wrote:
| That makes sense, but I have a hard time believing the
| approach of calling into a date time parser O(n) times is
| going to yield a significant performance gain no matter how
| much faster the parser is. However, I'm being downvoted, so
| perhaps I'm mistaken?
| brundolf wrote:
| Maybe they did it in bulk? i.e. send all the strings over
| at once, parse them in a loop, send them back. Seems like
| that would reduce overhead
| throwaway894345 wrote:
| Right, and that makes sense, but the context here is a
| date parsing library for Python--unless said library has
| a batch interface, I'm not sure how that would improve
| performance, but maybe I'm misestimating something.
| brundolf wrote:
| Ah, I skimmed over the part where this is a library and
| not application-code
| lincolnq wrote:
| My instinct is that the overhead is small. You need to
| add a few C stack frames and do some string conversion on
| each call, maybe an allocation to store the result. It's
| not going to be as quick as doing it in pure Rust, but the
| Python-to-native code layer can be pretty lightweight I
| think!
| oblvious-earth wrote:
| Sometimes it's about optimizing wall time not algorithmic
| complexity.
|
| If you have a batch SLA of 1 hour, and you're currently
| spending 50-70 mins to complete the batch and 20 minutes
| of that time is spent date parsing and you can reduce it
| to 5 minutes that's a big win.
| throwaway894345 wrote:
| No doubt, but if your date parsing saves you 1 second per
| date parsed but each call into the faster library costs 2
| seconds, then your performance actually suffers. The only
| way around this is to make a batch call such that the
| overhead is O(1).
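|
| The argument above can be written as a toy cost model. The
| numbers are the hypothetical ones from the comment (a parser
| that saves 1 second per date, a boundary crossing that costs 2
| seconds), plus an assumed 3-second baseline parse; none of them
| are measurements of a real library.

```python
def per_call_cost(n, overhead, parse_time):
    """Total time when every parse crosses the language boundary."""
    return n * (overhead + parse_time)


def batched_cost(n, overhead, parse_time):
    """Total time when all n values cross the boundary in one call."""
    return overhead + n * parse_time


N = 1000           # number of dates to parse
SLOW_PARSE = 3     # assumed seconds per date in the original parser
FAST_PARSE = 2     # one second saved per date
OVERHEAD = 2       # seconds per call into the faster library

baseline = N * SLOW_PARSE                              # no boundary crossing
one_by_one = per_call_cost(N, OVERHEAD, FAST_PARSE)    # overhead paid N times
in_batch = batched_cost(N, OVERHEAD, FAST_PARSE)       # overhead paid once
```

| With these made-up numbers, calling one date at a time is slower
| than the baseline, while the batched call is faster. Whether
| real per-call overhead is anywhere near that large is exactly
| what the thread is debating.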
| minitech wrote:
| I'm not going to install it to check, but when someone
| writes "Fast datetime parser for Python written in Rust.
| Parses 10x-15x faster than datetime.strptime." it seems
| reasonable to assume that this is not the case.
| throwaway894345 wrote:
| Depends on whether or not the parent is including the
| overhead in their statistic. Misinformation about
| microbenchmarks is hardly a rarity.
| dmw_ng wrote:
| Another cheap trick if the time column is sequential is to
| split the string into date and time components, cache the date
| part and calculate the time part just with some multiplication
|
| Major caveat is timezone handling, but this only applies in a
| subset of situations
| quietbritishjim wrote:
| If you've got to that point of modifying the storage format
| then you might as well just use an integer (microseconds
| since the epoch) and be done with it. That seems cleaner
| than using a string (or two strings) anyway.
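|
| As an illustration of the integer representation, here is a
| small sketch that converts timezone-aware datetimes to and from
| microseconds since the Unix epoch. The helper names are made up,
| and it assumes all values are stored in UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_micros(dt: datetime) -> int:
    """Convert an aware datetime to integer microseconds since the epoch."""
    # Dividing two timedeltas with // gives an exact integer.
    return (dt - EPOCH) // timedelta(microseconds=1)


def from_epoch_micros(micros: int) -> datetime:
    """Convert integer microseconds since the epoch back to a UTC datetime."""
    return EPOCH + timedelta(microseconds=micros)
```

| Using integer division on timedeltas avoids the rounding that
| can creep in when going through float timestamps.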
| adkadskhj wrote:
| I needed Blender integration a while back and wasn't sure what I
| could write it in. PyO3 worked great with Blender with no
| configuration. I was quite concerned that something about the
| Python-embedded-Blender behavior would limit PyO3, but nope, so
| far it's worked flawlessly.
|
| Thanks PyO3 team :)
| mynameisash wrote:
| At work, I'm using PyO3 for a project that churns through a lot
| of data (step 1) and does some pattern mining (step 2). This is
| the second generation of the project and is on-demand compared
| with the large, batch project in Spark that it is replacing. The
| Rust+Python project has really good performance, and using Rust
| for the core logic is such a joy compared with Scala or Python
| that a lot of other pieces are written in.
|
| Learning PyO3, I cobbled together a sample project[0] to
| demonstrate how some functionality works. It's a little outdated
| (uses PyO3 0.11.0 compared with the current 0.13.1) and doesn't
| show everything, but I think it's reasonably clear.
|
| One thing I noticed is that passing very large data from Rust and
| into Python's memory space is a bit of a challenge. I haven't
| quite grokked who owns what when and how memory gets correctly
| dropped, but I think the issues I've had are with the amount of
| RAM used at any moment and not with any memory leaks.
|
| [0] https://github.com/aeshirey/CheeseShop
| fulafel wrote:
| Previously (2017): https://news.ycombinator.com/item?id=14859844
| LockAndLol wrote:
| If this works well, I'd rather use this over being forced to use
| type hints and mypy.
|
| Has anybody used this in conjunction with a python framework?
| Django, fastapi or something?
| uranusjr wrote:
| Uh, how do you plan to use FastAPI while avoiding type hints?
| edenhyacinth wrote:
| I have! Used FastAPI as a frontend to do some minor data
| modification, and passed the data for model inference in Rust.
|
| Works really nicely, although given how little work I'm doing
| in the Python side I honestly prefer using Rocket instead of
| FastAPI and then using pyo3 to call the Python library in Rust,
| rather than the other way around.
| LockAndLol wrote:
| Thanks for the response. That does sound pretty much like
| what I would like to do. Have you by any chance open-sourced
| your project?
|
| I'm new to rust, but I'll check out Rocket. Cheers
| pansa2 wrote:
| How would PyO3 help you avoid type hints and mypy?
| brundolf wrote:
| I think the idea is that they move their business logic to
| the Rust code, since Rust's type system is more powerful and
| more sound, instead of trying to make do with MyPy
| zerkten wrote:
| Wouldn't it be more of a priority to move it for lower
| memory use and higher request speed? A better type system
| is good, but often these are a struggle with scaling
| interpreted languages compared to other lower level
| languages.
| brundolf wrote:
| For many people the primary appeal of Rust is its type
| system and related features (declaring deep immutability,
| pattern-matching, etc)
|
| > often these are a struggle with scaling interpreted
| languages compared to other lower level languages
|
| Not sure what's meant by this
| LockAndLol wrote:
| It would minimize the python surface required to be covered
| with type-hints and mypy. If possible, one could simply point
| django to the modules generated from rust.
|
| I'll give it a shot tonight and see how it goes. Now I'm
| curious.
| edeion wrote:
| That's a really great name you came up with! Embodies both parts
| of your focus, stays pronounceable. Does the 3 relate to the
| Python version or are you mimicking some specific molecule that I
| can't think of?
| [deleted]
| SnowflakeOnIce wrote:
| My guess is that the name is derived from the `-O3` compiler
| optimization level from many compilers.
| fafhrd91 wrote:
| name was chosen after `uranium trioxide`, pythonium trioxide
| - pyo3
| chc wrote:
| If you're trying to figure out the origin of a Rust
| project's name, the safest bet is always to choose the one
| that's a reference to metal.
| fafhrd91 wrote:
| I am the original author of pyo3. Yury Selivanov (author of
| uvloop and edgedb) suggested the pyo3 name.
| chc wrote:
| Oh, I know, I wasn't trying to correct you or anything. I
| was just adding on to the correct answer to point out
| that PyO3's naming scheme is part of a popular trend in
| Rust libraries.
| batterylow wrote:
| It's indeed a cool name, but it's not my doing (this isn't a
| Show HN)!
| smlckz wrote:
| Py (iv)
|            O
|   O = Py < |
|            O
|
| or Py (vi)
|        O
|        ||
|   O = Py
|        ||
|        O
|
| or Py (ii)
|        O
|   Py <   > O
|        O
|
| heh!
| auscompgeek wrote:
| I think you might be missing an oxygen atom there.
| Swenrekcah wrote:
| I would guess it is derived from:
| https://en.wikipedia.org/wiki/Iron(III)_oxide
| smlckz wrote:
| But that's Fe_2 O_3 !
| ziml77 wrote:
| I think calling it Py2O3 would be a bit confusing though.
| smlckz wrote:
| Just PyO or Py_3 O_4 could have been used as well, does
| not matter that much.
| OskarS wrote:
| I thought it was like the compiler flag, -O3. "With full
| optimization", basically.
| benecollyridam wrote:
| Another related project: Wasmtime and Rust+Python
|
| Compile your Rust code to wasm to circumvent having to compile
| for different architectures.
|
| https://docs.wasmtime.dev/wasm-rust.html
| ksm1717 wrote:
| Between pyodide, pyo3, rust-cpython, and rustpython, I think Pyo3
| is the best way to drop in rust in a python project for a speed
| up, if that is your goal. Some of the demos show using python
| from rust, but to me the biggest feature is without a doubt
| compiling rust code to native python modules. I'm using it to
| speed up image manipulation backed by numpy arrays.
|
| There's a setuptools rust [0] extension package that can be used
| to hook the compilation of the rust into the wheel building or
| install from source. Maturin [1] seems to be regarded as the new
| and improved solution for this, but I found that it's angled
| toward the using python from rust.
|
| There's also the rust numpy [2] package by the same org which is
| fantastic in that it lets you pass a numpy matrix to a native
| method written in rust and convert it to the rust equivalent data
| structure, perform whatever transformation you want (in parallel
| using rayon [3]), and return the array. When building for
| release, I was seeing speed ups of 100x over numpy on the most
| matrix mathable function imaginable, and numpy is no joke.
|
| I think there is a lot of potential for these two ecosystems
| together. If there's not a python package for something, there's
| probably a rust crate.
|
| If anyone is interested, the Python package that I'm building with
| a Rust backend is called pyrogis [4], for making custom image
| manipulations through numpy arrays.
|
| [0] https://github.com/PyO3/setuptools-rust
|
| [1] https://github.com/PyO3/maturin
|
| [2] https://github.com/PyO3/rust-numpy
|
| [3] https://github.com/rayon-rs/rayon
|
| [4] https://github.com/pierogis/pierogis
| cycomanic wrote:
| > Between pyodide, pyo3, rust-cpython, and rustpython, I think
| Pyo3 is the best way to drop in rust in a python project for a
| speed up, if that is your goal. Some of the demos show using
| python from rust, but to me the biggest feature is without a
| doubt compiling rust code to native python modules. I'm using
| it to speed up image manipulation backed by numpy arrays.
|
| > There's a setuptools rust [0] extension package that can be
| used to hook the compilation of the rust into the wheel
| building or install from source. Maturin [1] seems to be
| regarded as the new and improved solution for this, but I found
| that it's angled toward the using python from rust.
|
| > There's also the rust numpy [2] package by the same org which
| is fantastic in that it lets you pass a numpy matrix to a
| native method written in rust and convert it to the rust
| equivalent data structure, perform whatever transformation you
| want (in parallel using rayon [3]), and return the array. When
| building for release, I was seeing speed ups of 100x over numpy
| on the most matrix mathable function imaginable, and numpy is
| no joke.
|
| What sort of algorithm was that? Generally getting 100x speedup
| on vectorized code is highly unusual even using handcoded c++.
| So I suspect it was quite loop heavy? In those cases I have
| also seen very significant speed ups.
|
| I have been using pythran [1] for speeding up my python code.
| It generally achieves extremely good performance. I have
| blogged about it here [2] and recently a member used pythran to
| speed up some nbody benchmarks [3] which was used in an article
| to argue for using compiled languages.
|
| That said I find PyO3 quite exciting and have been
| contemplating trying it with some of my projects.
|
| [1] https://github.com/serge-sans-paille/pythran
|
| [2] https://jochenschroeder.com/blog/articles/DSP_with_Python2/
|
| [3] https://github.com/paugier/nbabel
| ksm1717 wrote:
| Matrix of shape (rows, columns, 3). Average the last dim for
| each point and change it to [0,0,0] if average less than a
| value, [255,255,255] if greater. A brightness threshold. May
| be remembering the speed up factor wrong so take it with a
| grain of salt - fact of the matter is it was very impressive.
|
| I'm checking out that post later, I'm trying to make my
| package easy to build on, so being able to write extensions
| with Pythran would be another great option for speed ups.
| Thanks
| cycomanic wrote:
| Just for the fun of it I tested what speed up I could get
| with a naive algorithm and pythran. Based on your
| description it looks like I should do the following:
|
|     import numpy as np
|
|     def threshold_pixel(img, thr):
|         out = np.zeros_like(img)
|         o = np.mean(img, axis=-1)
|         out[o>thr] = 255
|         return out
|
| This runs in ~30ms for a (1024,1024,3) array using numpy on
| my machine. Using pythran (note I had to explicitly write
| out the loop for out[o>thr] =255, due to a bug, that I
| found and just reported), I get a speed of 6.ms (with
| openmp) and 9ms without (I did not tune the openmp, but
| this should yield a much higher speedup).
|
| P.S.: Just had a look at your project, very cool, I have to
| try that
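|
| For readers wondering what "writing out the loop" for
| out[o>thr] = 255 might look like, here is a hedged sketch. It is
| not the code that was actually compiled with pythran; it uses
| plain nested lists instead of NumPy arrays so that it runs with
| the standard library alone, and the function name is made up.

```python
def threshold_pixel_loop(img, thr):
    """Brightness threshold with explicit loops.

    img is a rows x cols x 3 nested list of channel values. Pixels whose
    channel average exceeds thr become [255, 255, 255], all others [0, 0, 0].
    """
    rows = len(img)
    cols = len(img[0]) if rows else 0
    # Start from an all-black image, like np.zeros_like(img).
    out = [[[0, 0, 0] for _ in range(cols)] for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            r, g, b = img[i][j]
            if (r + g + b) / 3 > thr:
                out[i][j] = [255, 255, 255]
    return out
```

| In pure Python this loop is far slower than the vectorized
| version; the point of a compiler such as pythran is that loops
| written this way stop being a penalty.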
| pansa2 wrote:
| Related: RustPython - A Python interpreter written in Rust.
|
| https://github.com/RustPython/RustPython
| bluedays wrote:
| Without looking at it I wonder if it's using the Python language
| underneath, or the python vm. Either way this is pretty cool.
| Nvorzula wrote:
| Precisely, this is Rust that compiles to a C FFI that plugs
| into CPython.
| itamarst wrote:
| I've been playing with PyO3 for prototyping, and wrapped some
| Rust code to see if it's faster than Python. The experience was
| very much like using Boost Python (which these days has an
| alternative in https://github.com/pybind/pybind11). It's
| _really_ easy to wrap code for Python, and it has nice APIs to
| ensure the GIL is held. Being Rust, I'm much more confident I won't
| suffer from memory unsafety issues which my C++ at the time did.
|
| Now I'm starting to use it as part of the Python memory profiler
| I'm working on (https://pythonspeed.com/fil), in this case to
| call in to the low-level Python C API which PyO3 includes
| bindings for in addition to its high-level API. This kind of
| usage is more like writing C, except with the benefit of having
| high-level APIs (for GIL holding, but also object conversion)
| available when I need it.
|
| So basically you get safe, high-level, easy-to-use APIs, with
| fallback to low-level unsafe APIs if you need them.
|
| Highly recommend trying it out.
| JPKab wrote:
| Was just checking out your fil project. It looks really useful,
| and I dig the jupyter kernel as well.
| itamarst wrote:
| Thank you! If you have any questions/problems/ideas, please
| reach out via GitHub or email (itamar@pythonspeed.com).
| brundolf wrote:
| What's the data-conversion overhead look like at the boundary?
| Which data structures can be passed back and forth without a
| full clone, etc?
| itamarst wrote:
| There's definitely a conversion cost. For strings, Python
| apparently caches the UTF-8 encoded string, so if you
| _repeatedly_ transfer it to Rust I suspect (but haven't
| checked) that the cost is much lower.
|
| In general I suspect it's the usual "NumPy arrays are fast,
| everything else you better be getting a sufficiently large
| boost from the low-level code to justify conversion".
|
| For the thing I prototyped in Rust, it was wrapping the
| `ahocorasick` crate which was in fact faster than
| `pyahocorasick` which is written in C or Cython or something.
| Both have similar conversion costs, probably, so it came down
| to "for lots of data the Rust version was faster".
| burntsushi wrote:
| Be sure to use auto configuration to get it to go even
| faster, depending on your use case:
| https://docs.rs/aho-corasick/0.7.15/aho_corasick/struct.AhoC...
|
| Or just be sure to enable the DFA option if you can afford
| it. It looks like the Python library is just the standard
| NFA algorithm.
| itamarst wrote:
| Yeah, I was using DFA.
|
| Next step is trying an alternative approach, but if that
| alternative doesn't work I'm going to see about wrapping
| your package for Python.
|
| Thanks for all your work on it!
| burntsushi wrote:
| Nice! Reach out if there are any problems or if you need
| something exposed in the API. Looking at the
| pyahocorasick issue tracker, there are a number of
| features/bugs that your wrapper package would resolve. :)
| liuliu wrote:
| NumPy also supports conversions without copying. One thing I
| haven't found a good way to bridge between Python and Rust is
| the pandas.DataFrame; it seems to be quite a Python-focused
| object, and iterating through a DataFrame is particularly
| slow.
| itamarst wrote:
| Internally Pandas often uses NumPy arrays, especially for
| numeric data, so might be able to pass things that way in
| some cases?
|
| E.g. `df["column_name"].values` will get you a NumPy
| array.
| shirakawasuna wrote:
| Sounds great! Would so much rather drop into Rust than C or
| C++.
| dbrgn wrote:
| If you're interested in publishing Rust libraries as Python
| packages (or integrating Rust code into an existing Python
| package), check out https://github.com/PyO3/maturin and
| https://github.com/PyO3/setuptools-rust.
| edenhyacinth wrote:
| Been using Maturin for a little while professionally, and it's
| surprisingly good. There's a few bugbears here and there - I
| haven't found a way to have Cargo Test & a pyo3 library working
| at the same time - but overall it's a lot more pleasant than
| working with Rust and R was.
___________________________________________________________________
(page generated 2021-01-29 23:00 UTC)