[HN Gopher] Our Plan for Python 3.13
___________________________________________________________________
Our Plan for Python 3.13
Author : bratao
Score : 438 points
Date : 2023-06-15 12:55 UTC (10 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| orbisvicis wrote:
| How many multithreaded programs depend on the GIL rather than using
| proper locking? The No GIL project might break a lot of
| multithreaded apps.
| gyrovagueGeist wrote:
| It would require no changes from pure Python apps (as the nogil
| implementation preserves the current behavior & thread safety
| of python objects).
|
| But any C extensions that rely on the GIL for thread safety
| would be an issue..
| selimnairb wrote:
| They must've explored this, but couldn't one make a GIL
| emulator/wrapper for legacy extensions so that they can still
| work in GIL-less Python while they are (hopefully) updated to
| work with the new synchronization primitives?
| pmoriarty wrote:
| Wish python would sort out their packages mess and let there be
| one and only one (obvious) way to do it.
| Alifatisk wrote:
| The Ruby team approached parallelism & multicore in a cool way:
| they introduced the concept of the Actor model (like Erlang)!
| sproketboy wrote:
| [dead]
| kortex wrote:
| Every single HN thread on python performance, ever:
|
| Person with limited to zero experience with CPython internals> I
| hate the GIL, why don't they _just remove it_?
|
| That _just_ is doing an incredible amount of heavy lifting. It'd
| be like saying, "Why doesn't the USA _just_ switch entirely to
| the metric system?" It's a huge ask, and after being burned by
| the 2/3 transition, the Python team is loath to rock the boat too
| much again.
|
| The GIL is deep, arguably one of the deepest abstractions in the
| codebase, up there with PyObject itself. Think about having to
| redefine the String implementation of a codebase in your language
| of choice.
|
| Whatever your monday-morning quarterback idea on how to pull out
| the GIL is, I can almost guarantee, someone has thought about it,
| probably implemented it, and determined it will have one or more
| side effects:
|
| - Reduced single-thread performance
|
| - Breaking ABI/FFI compatibility
|
| - Not breaking ABI immediately, but massively introducing the
| risk of hard to track concurrency bugs that didn't exist before,
| because GIL
|
| - Creating a maintenance nightmare by adding tons of complexity,
| switches, conditionals, etc to the codebase
|
| The team has already decided those tradeoffs are not really
| justifiable.
|
| The GILectomy project has probably come the closest, but it
| impacts single-thread performance by introducing a mutex (there
| are some reference type tricks [1] to mitigate the hit, but it
| still hurts run time 10-50%), and necessitates any extension
| libraries update their code in a strongly API-breaking way.
|
| It's possible that over time, there are improvements which
| simultaneously improve performance or maintainability, while also
| lessening the pain of a future GILectomy, e.g. anything which
| reduces the reliance on checking reference counts.
|
| [1] PEP 683 is probably a prerequisite for any future GILectomy
| attempts; it looks like it has been accepted, which is great:
| https://peps.python.org/pep-0683/
| arp242 wrote:
| There were already patches to "just" remove the GIL for Python
| 2.0 or thereabouts, over 20 years ago. Strictly technically
| speaking it's not even that hard, but as you mentioned it comes
| with all sorts of trade-offs.
|
| Today Python is one of the world's most popular programming
| languages, if not _the_ most popular language. The GIL is
| limiting and a trade-off in itself of course - designing any
| programming language is an exercise in trade-offs. But clearly
| Python has gotten quite far with the set of trade-offs it has
| chosen: you need to be careful about "just" radically changing
| the set of trade-offs that have proven themselves to be quite
| successful.
| Groxx wrote:
| Removing the GIL will also make some currently-correct threaded
| code into incorrect code, since the GIL gives some convenient
| default safety that I keep seeing people accidentally rely on
| without knowing it exists.
|
| Unavoidable performance costs _and_ reduced safety equals
| massive friction no matter how it's approached. It's not a
| question of "are we smart enough to make this better in every
| way", since the answer is "no, and nobody can be, because it
| can't be done". It's only "will we make this divisive change".
| [deleted]
| aldanor wrote:
| To think about it, perhaps US should have started metrification
| back in 1980 when EU enforced it for its members. It takes
| time, but is doable - it took Ireland 25 years to fully switch
| to metric.
|
| The cost of switching increases over time and will increase
| further and further, making it less and less likely. Would have
| been way cheaper back then.
| dragonwriter wrote:
| > To think about it, perhaps US should have started
| metrification back in 1980
|
| The US started that in either 1875 or 1893, though, and
| somewhat more aggressively in 1975.
|
| It reversed course in the 1980s, it didn't fail to start
| before that.
| kmod wrote:
| You should check out the new nogil project by Sam Gross, which
| is what's being talked about these days -- he actually
| successfully removed the gil, but yes with the tradeoffs that
| you mention. The other projects were, by comparison, "attempts"
| to remove the gil, and didn't address core issues such as
| ownership races (which are far harder than making refcount
| operations atomic).
| soulbadguy wrote:
| > That just is doing an incredible amount of heavy lifting.
| It'd be like saying, "Why doesn't the USA just switch entirely
| to the metric system?" It's a huge ask, and after being burned
| by the 2/3 transition, the Python team is loath to rock the
| boat too much again.
|
| > The GIL is deep, arguably one of the deepest abstractions in
| the codebase, up there with PyObject itself. Think about having
| to redefine the String implementation of a codebase in your
| language of choice.
|
| I am somewhat familiar with the CPython code base, and have
| talked to some folks involved in some of the newer Python
| runtimes.
|
| The problem is not that deep, and CPython is not that much
| different from other projects: the GIL is an implementation
| detail that leaked through the API, and now we have a bunch of
| people relying on it. The question is what we do about it. The
| trade-offs in such situations are well known; it's just a
| question of picking the right one.
|
| > The team has already decided those tradeoffs are not really
| justifiable.
|
| That's, from my view, the crux of the issue. The TL;DR is that
| GIL-less Python is not a priority for the Python team, so they
| view any trade-off/compromise as "not justifiable", especially
| when it comes to added complexity in the code base or
| whatnot...
|
| > Person with limited to zero experience with CPython
| internals> I hate the GIL, why don't they just remove it?
|
| People have this reaction because having a single-threaded
| interpreter in 2023 is just ... embarrassing, whatever the
| reasons behind it.
| milliams wrote:
| Notably yesterday Guido posted an update on the Faster Python
| plans w.r.t. the no-GIL plans
| (https://discuss.python.org/t/pep-703-making-the-global-inter...)
| specifically:
|
|     Our ultimate goal is to integrate a JIT into CPython
| selimnairb wrote:
| Yeah, reading CPython 3.13 plans made it seem like they were
| backing their way into creating a JIT compiler.
| [deleted]
| cellularmitosis wrote:
| > Experiments on the register machine are complete
|
| This is a worthwhile watch on this topic:
| https://youtu.be/cMMAGIefZuM
| kzrdude wrote:
| faster-cpython team has done a lot of work to experiment on it:
| https://github.com/faster-cpython/ideas/issues/485#issuecomm...
|
| I wonder if the plan written there is current.
| samsquire wrote:
| This is extremely exciting stuff. Thank you for Python and
| working on improving its performance.
|
| I use Python threads for some IO heavy work with Popen and C and
| Java for my true multithreading work.
| klooney wrote:
| This is exciting. Feels a little like JavaScript 10 years ago.
| andai wrote:
| You mean 15? ;)
| https://en.wikipedia.org/wiki/V8_(JavaScript_engine)
| phkahler wrote:
| From the PEP on subinterpreters:
|
| >> In this case, multiple interpreter support provide a novel
| concurrency model focused on isolated threads of execution.
| Furthermore, they provide an opportunity for changes in CPython
| that will allow simultaneous use of multiple CPU cores (currently
| prevented by the GIL - see PEP 684).
|
| This whole thing seems more like something the developers want
| to do (hey, it's novel!) than something users want. To the extent they
| do want it, removing the GIL is probably preferable IMHO. That a
| global lock is a sacred cow in 2023 seems strange to me.
|
| Maybe I'm misunderstanding, but I don't want an API to manage
| interpreters any more than I want one for managing my CPU cores.
| Just get me unhindered threads ;-)
| [deleted]
| samwillis wrote:
| The "Faster Python" team are doing a fantastic job, it's
| incredible to see the progress they are making.
|
| However, there is also a bit of a struggle going on between them
| and the project to remove the GIL (global interpreter lock) from
| CPython. There is going to be a performance impact on single
| threaded code if the "no GIL" project is merged, something in the
| region of 10%. It seems that the faster Python devs are pushing
| back against that as it impacts their own goals. Their argument
| is that the "sub interpreters" they are adding (each with its own
| GIL) will fulfil the same use cases as multithreaded code without
| a GIL, but they still have the overhead of encoding and passing
| data in the same way you do with subprocesses.
|
| There is also the argument that it could "divide the community"
| as some C extensions may not be ported to the new ABI that the no
| GIL project will result in. However again I'm unconvinced by
| that, the Python community has been through worse (Python 3) and
| even asyncIO completely divides the community now.
|
| It's somewhat unfortunate that this internal battle is happening,
| both projects are incredible and will push the language forward.
|
| Once the GIL has been removed it opens up all sorts of
| interesting opportunities for new concurrency APIs that could
| make concurrent code much easier to write.
|
| My observation is that the Faster Python team are better placed
| politically; they have GVR on the team, whereas No GIL is being
| proposed by an "outsider". It just smells a little of NIH
| syndrome.
| vb-8448 wrote:
| > There is going to be a performance impact on single threaded
| code if the "no GIL" project is merged, something in the region
| of 10%.
|
| 10% doesn't look too much to me, I still don't get why today
| people care so much about single thread performance.
| crabbone wrote:
| CPython is so hopelessly slow, I wouldn't care about 10%. For
| most of the stuff written in Python, users don't really care
| about speed.
|
| The impact won't be on users / Python programmers who don't
| develop native extensions. It will suck for people who had a
| painful workaround for Pythons crappy parallelism already,
| but now will have to have two workarounds for different kinds
| of brokenness. It still pays off to make these native
| extensions; however, their authors will create a lot of
| problems for the clueless majority of Python users, which
| will likely end up in some magical "workarounds" for problems
| introduced by this change that very few people understand. This
| will result in more cargo cult in a community that's
| already on a witch hunt.
| samwillis wrote:
| Exactly, and the thing is, the Faster Python project will
| completely surpass that 10% performance change.
| _kulang wrote:
| Sometimes people see it as losing 10% performance that you
| never get back
| awill wrote:
| because a lot of code is single-threaded
| celeritascelery wrote:
| Single threaded performance is still more important than
| multi-threaded. Most applications are single threaded, and
| single threaded programs are much easier to write and debug.
| Removing the GIL from python will not change that.
|
| If no-GIL has a 10% single thread performance hit, that means
| that essentially all my existing python code would be that
| much worse.
| vb-8448 wrote:
| > If no-GIL has a 10% single thread performance hit, that
| means that essentially all my existing python code would be
| that much worse.
|
| Maybe for 100% CPU-bound code, but most code is I/O
| bound and no one will notice the change - just my opinion.
| celeritascelery wrote:
| Maybe, but if your code is I/O bound, then multi-
| threading isn't going to help you either.
| wongarsu wrote:
| Maybe that's just my bubble, but I see much more python
| in data science projects than in web servers. And in
| (python) data science even your file reading/writing code
| quickly gets CPU bound.
| coldtea wrote:
| Not in pure Python, that's in specialized libs, like
| numpy, pandas and co, done in C.
|
| So, the hit on the Python interpreter wouldn't translate
| to a hit on those.
| regularfry wrote:
| That's going to be CPU-bound in numpy's C extensions
| rather than Python itself, one would hope. The worst of
| all worlds is that we get a 10% perf cut to python
| execution _and_ numpy breaks because the C API is ripped
| up.
| baq wrote:
| There was a time like 5-10 years ago where Python was
| really popular for grassroots web projects. Nowadays it looks
| like this is mostly Node.
| gpderetta wrote:
| That's because you are doing it wrong. You'll need to
| split every step of your data science pipeline into a
| microservice, then put it in the cloud for resilience.
| Then the application will be so fast that it is no longer
| CPU bound but I/O bound.
| umanwizard wrote:
| But the GIL doesn't need to be held in I/O-bound code
| anyway, so why does it matter?
| burnished wrote:
| It might be more helpful to think of it in terms of
| supported use cases, rather than just pure volume.
| coldtea wrote:
| > _If no-GIL has a 10% single thread performance hit, that
| means that essentially all my existing python code would be
| that much worse._
|
| So? Especially since the "Faster Python" team already made
| Python 3.11 "10-60% Faster than 3.10", and 3.12 is even
| faster still, whereas their overall plan is to get it to
| 2-5 times faster compared to 3.9.
|
| So at the worst case, with a 10% hit, you'd balance out the
| 3.11 speed, and your code would be as fast as 3.10.
| shpx wrote:
| But your software would still run 10% slower than it
| needs to. Single threaded code is like 99% of all code
| written.
| gpderetta wrote:
| that's not an argument either, as your software is
| already 10000% slower than it needs to be as you have
| written it in python.
| [deleted]
| coldtea wrote:
| > _But your software would still run 10% slower than it
| needs to_
|
| There's no absolute objective "needs to" or even any
| static baseline. Python can have, and often has had, a
| performance regression that slows your code by 10% at any
| time. It's no big deal in itself.
|
| Also consider a further speedup of e.g. 50% in upcoming
| versions (they have promised more).
|
| If you're OK with the X speed of today's Python, you
| should be ok with X + 40% - even if it's not the X + 50%
| it could have been, due to the GIL removal's 10% toll.
| hospitalJail wrote:
| Are your current python programs slow in a way that matters?
|
| Why haven't you implemented multithreading?
|
| (don't get me wrong, I know the cost of implementation, but
| if speed matters, multithreading is a very reasonable step
| in python)
| birdyrooster wrote:
| Because of the GIL, and also you have to use tons of locks to
| get around the lack of thread safety for Python's objects.
| [deleted]
| Spivak wrote:
| > Why haven't you implemented multi-threading?
|
| Because that makes programs slower in Python.
|
| Multi-threading in Python is for when you need time-
| slicing for CPU intensive tasks so that they don't block
| other work that needs to be done.
| coldtea wrote:
| His point remains, he just phrased it badly: why haven't
| you implemented a multiprocessing pool?
| gpderetta wrote:
| Because not everything is trivially parallelizable and
| multiprocess makes it harder to share data?
| coldtea wrote:
| > _Because not everything is trivially parallelizable_
|
| A lot of things are though...
| munch117 wrote:
| Because it's a global solution to a local problem.
|
| With threads, I can encapsulate the use of threads in a
| class, whose clients never even notice that threads are
| in use. Sure, threads are a global resource too, but much
| of the time you can get away with pretending that they're
| not and create them on demand. Not so with
| _multiprocess_. If you use that, then the whole program
| has to be onboard with it.
|
| Threads work great in Python. Well not for maximising
| multicore performance, of course, but for other things,
| for structuring programs they're great. Just shuttle work
| items and results back and forth using queue.Queue, and
| you're golden - Python threads are super reliable. And if
| the threads are doing mostly GIL-releasing stuff, then
| even multicore performance can be good.
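|
| A minimal sketch of that pattern (names invented for
| illustration, not from any particular library):
|
|     import queue
|     import threading
|
|     class Worker:
|         # Encapsulates one background thread; callers only see
|         # submit() and result(), never the thread itself.
|         def __init__(self):
|             self._tasks = queue.Queue()
|             self._results = queue.Queue()
|             threading.Thread(target=self._run, daemon=True).start()
|
|         def _run(self):
|             while True:
|                 func, arg = self._tasks.get()
|                 self._results.put(func(arg))
|
|         def submit(self, func, arg):
|             self._tasks.put((func, arg))
|
|         def result(self):
|             return self._results.get()
|
|     w = Worker()
|     w.submit(len, "hello")
|     print(w.result())  # 5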
| coldtea wrote:
| > _Not so with multiprocess. If you use that, then the
| whole program has to be onboard with it_
|
| Huh? In Python you just need a function to call, and
| multiprocessing will run it wrapped in a process from the
| pool, while API-wise it would look the same as if it were
| a threadpool (but with no sharing in the process case,
| obviously).
|
| So what would the rest of the program be onboard with?
|
| And all this could also be hidden inside some subpackage
| within your package, the rest of the program doesn't need
| to know anything, except to collect the results.
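|
| For instance (a minimal sketch; work() is just a stand-in):
|
|     from multiprocessing import Pool
|
|     def work(n):
|         return n * n  # any picklable function
|
|     if __name__ == '__main__':
|         with Pool() as pool:
|             print(pool.map(work, range(10)))
|
| Swap Pool for multiprocessing.dummy.Pool (or
| concurrent.futures.ThreadPoolExecutor) and the call sites look
| much the same, just backed by threads instead of processes.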
| munch117 wrote:
| multiprocessing needs to run copies of your program that
| are sufficiently initialised that they can execute the
| function, yet no initialisation code must be run that
| should not be run multiple times.
|
| That means you either use fork - which is a major can of
| worms for a reusable library to use.
|
| Or you write something like this in your entry point
| module:
|
|     if __name__ == '__main__':
|         multiprocessing.freeze_support()
|         once_only_application_code()
|
| Suppose I don't realise that your library is using
| multiprocessing, and I carelessly call it from this two-line
| script:
|
|     import library_that_uses_multiprocessing_internally
|     library_that_uses_multiprocessing_internally.do_stuff()
|
| That's basically a fork bomb.
|
| And where do you put the multiprocessing.set_start_method
| call? Surely not in the library.
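|
| (Under the spawn start method, one fix is for the caller to add
| the guard themselves, e.g.
|
|     import library_that_uses_multiprocessing_internally
|
|     if __name__ == '__main__':
|         library_that_uses_multiprocessing_internally.do_stuff()
|
| which is exactly the "whole program has to be onboard with it"
| problem.)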
| Spivak wrote:
| > Sure, threads are a global resource too, but much of
| the time you can get away with pretending that they're
| not and create them on demand.
|
| I think you would love Trio and applying the idea to
| threads.
| munch117 wrote:
| Applying which idea? async does not appeal to me, if
| that's what you mean.
| david422 wrote:
| And if you're really concerned about speed, Python is not the
| language to choose.
| burnished wrote:
| Every program in any language has the potential to be
| concerned about speed. This cute maxim is ultimately a
| punchline, not really a serious point.
| ahoho wrote:
| Right, and single thread performance won't matter as much if
| it becomes easier to implement multithreading. This hurts
| legacy code, but I imagine it would be worth it in the long
| run.
| bombolo wrote:
| It would remain as hard as it has always been. Also threads
| are very heavy, locking kills performance, and if you don't
| have GIL, you'll need to manage explicit locks, which will
| be just as slow but also cause an incredible amount of
| subtle bugs.
| pdonis wrote:
| _> if you don't have GIL, you'll need to manage explicit
| locks_
|
| You need to do that with multithreaded Python code _with_
| the GIL. The GIL only guarantees that operations that
| take a single bytecode are thread-safe. But many common
| operations (including built-in operators, functions, and
| methods on built-in types) take more than one bytecode.
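|
| Even a plain += on a shared name is several bytecodes, which
| you can see with dis (a quick sketch, nothing non-standard):
|
|     import dis
|
|     counter = 0
|
|     def bump():
|         global counter
|         counter += 1
|
|     dis.dis(bump)
|     # load counter, load 1, add, store counter - separate steps.
|     # Nothing in the language promises a thread switch can't
|     # land between the load and the store, so concurrent bump()
|     # calls need a threading.Lock regardless of the GIL.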
| saltminer wrote:
| > locking kills performance, and if you don't have GIL,
| you'll need to manage explicit locks
|
| I was under the impression that the Python thread
| scheduler is dependent on the host OS (rather than being
| intelligent enough to magically schedule away race
| conditions, deadlocks, etc.), so you still need to manage
| locks, semaphores, etc. if you write multi-threaded
| Python code. I don't see how removing the GIL would make
| this any worse. (Maybe make it slightly harder to debug,
| but at that point it would be in-line with debugging
| multi-threaded Java/C/etc. code.)
|
| Or would this affect single-threaded code somehow?
| bombolo wrote:
| In python you always have a lock, the GIL. If you take it
| away you end up actually having to do synchronization for
| real. Which is hard and error prone.
| [deleted]
| hospitalJail wrote:
| >I still don't get why today people care so much about single
| thread performance.
|
| For about 10 minutes a few years ago, when the M1 had the
| best single threaded performance per buck, people cared.
|
| Now that the M1 isn't the leader in single threaded, we are
| back to the 'multithread is most important'.
|
| Which has always been true. If your program needs an
| improvement in speed, you can multithread it. The opposite
| isn't true.
| insanitybit wrote:
| You can improve performance by moving to a single thread.
| Pinning work to a single core will improve cache
| performance, avoid overhead of flushing TLBs and other
| process specific kernel structures, and more.
| Joker_vD wrote:
| > If your program needs an improvement in speed, you can
| multithread it. The opposite isn't true."? "If you can multithread
|
| What do you mean by "the opposite"? "If your program
| doesn't need an improvement in speed, you can't multithread
| it"? "If you can multithread your program, then it doesn't
| need an improvement in speed"? Well, yeah, obviously both
| of those statements are false but they're also quite
| useless, so who cares?
| hoosieree wrote:
| Add negative threads to fix a program that runs too fast?
| Joker_vD wrote:
| Well, having too _many_ threads can slow down a program
| as well (extra context switches, extra synchronization)
| so... no idea.
| holoduke wrote:
| Not all algorithms can be chunked up. Single thread
| performance is and will always be important.
| gpderetta wrote:
| If the GIL were an optional interpreter parameter you could
| spawn GILed subinterpreters and GILless subinterpreters
| according to your needs.
| meepmorp wrote:
| > There is also the argument that it could "divide the
| community" as some C extensions may not be ported to the new
| ABI that the no GIL project will result in. However again I'm
| unconvinced by that, the Python community has been through
| worse (Python 3) and even asyncIO completely divides the
| community now.
|
| I think the fact that you can name two other recent things
| which have divided the community is a solid argument for being
| at least a little gun-shy about making big, breaking changes.
| There's the cost of the changes themselves, but there's also a
| cost to the language as a whole to add yet-another-upheaval.
|
| Performance is important, but not breaking things is also
| important. I can understand the appeal of doing something
| suboptimal (but better than current) in favor of not
| introducing a bunch of harder to predict side effects, both in
| code and the community.
| deschutes wrote:
| Do sub interpreters actually work with C extensions? I get
| that the extension API has long supported it. However, I wonder
| if in practice extensions rely on process-global state to stash
| information.
|
| If so, sub interpreters invite all kinds of nasty bugs. Keep
| in mind that porting the most popular extensions is an easy
| exercise so the more interesting question is how this hidden
| majority of extensions fares.
| RobotToaster wrote:
| Why not just try to make multiprocessing easier?
| bastawhiz wrote:
| > Their argument is that the "sub interpreters" they are adding
| (each with its own GIL) will fulfil the same use cases of
| multithreaded code without a GIL, but they still have the
| overhead of encoding and passing data in the same way you have
| to with sub processors.
|
| This is smart, though, because (even if it's not great) there's
| a lot of evidence that it works in practice. Specifically, this
| is almost exactly what JavaScript does with workers. It's not a
| great API and it's cumbersome to write code for, but it got
| implemented successfully and people use it successfully (and it
| didn't slow down the whole web).
| dist-epoch wrote:
| As someone who observed Python core development for many years,
| a major change to the interpreter REQUIRES core-dev buy in.
| There have been at least 5 big projects which proposed large
| changes, they have all been declined.
|
| It is NIH syndrome: if a big project doesn't originate in the
| dev team, it will not be accepted.
| AlphaSite wrote:
| Nogil would give far larger returns and I wish they'd focus on
| that. That's the best way to a faster python.
| btilly wrote:
| My point of view is that anyone who wants to write
| multithreaded code, shouldn't be trusted to. Making it easier
| for people to justify this kind of footgun is a problem.
|
| Also, no matter how much you wish it otherwise, retrofitting
| concurrency on an existing project guarantees that you'll wind
| up with subtle concurrency bugs. You might not encounter them
| often, and they're hard to spot, but they'll be there.
|
| Furthermore existing libraries that expect to be single-
| threaded are now a potential source of concurrency bugs. And
| there is no particular reason to expect the authors of said
| libraries to have either the skills or interest to track those
| bugs down and fix them. Nor do I expect that multi-threaded
| enthusiasts who use those libraries in unwise ways will
| recognize the potential problems. Particularly not in a dynamic
| language like Python that doesn't have great tool support for
| tracking down such bugs in an automated way.
|
| As a result if "no GIL" ever gets merged, I expect that the
| whole Python ecosystem will get much worse as well. But that's
| no skin off of my back - I've learned plenty of languages. I
| can simply learn one that hasn't (yet) made this category of
| mistake.
| iskander wrote:
| >My point of view is that anyone who wants to write
| multithreaded code, shouldn't be trusted to. Making it easier
| for people to justify this kind of footgun is a problem.
|
| Out of curiosity, have you done any Rust programming and used
| Rayon?
|
| It's hard to convey how easy and impactful multi-threading
| can be if properly enclosed in a safe abstraction.
| btilly wrote:
| I have only read about and played a tiny bit with Rust. But
| as I noted at
| https://news.ycombinator.com/item?id=36342081, I see it as
| fundamentally different than the way people want to add
| multithreading to Python. People want to lock code in
| Python. But Rust locks data with its compile-time checked
| ownership model.
|
| See https://blog.rust-lang.org/2015/04/10/Fearless-
| Concurrency.h... for more.
| kortex wrote:
| Python is dynamic AF and Rust's whole shtick is compile-
| time safety. Python was built from the ground up to be
| dynamic and "easy", Rust was meticulously designed to be
| strict and use types to enforce constraints.
|
| It's hard to convey how difficult it would be to retrofit
| python to be able to truly "enclose multithreading in a
| safe abstraction".
| samsquire wrote:
| My deep interest is multithreaded code. For a software
| engineer working on business software, I'm not sure if they
| should be spending too much time debugging multithreaded bugs
| because they are operating at the wrong level of abstraction
| from my perspective for business operations.
|
| I'm looking for an approach to writing concurrent code with
| parallelism that is elegant and easy to understand and hard
| to introduce bugs. This requires alternative programming
| approaches and in my perspective, alternative notations.
|
| One such design uses monotonic state machines which can only
| move in one direction. I've designed a syntax and written a
| parser and very toy runtime for the notation.
|
| https://github.com/samsquire/ideas5#56-stateful-circle-
| progr...
|
| https://github.com/samsquire/ideas4#558-state-machine-
| formul...
|
| The idea is inspired by LMAX Disruptor and queuing systems.
| btilly wrote:
| And your approach can be built into a system that does
| multi-threading away from Python, thereby achieving
| parallelism without requiring that Python supports it as
| well.
|
| That's basically what all machine learning code written in
| Python does. It calls out to libraries that can themselves
| parallelize, use the GPU, etc. And then gets the answer
| back. You get parallelism without any special Python
| support.
| zo1 wrote:
| Just to add a bit of my opinion after reading your comment
| in the context of this thread and not to the merit of your
| idea. You are precisely the type of person I'd keep very
| very far away from multithreading in any business software
| project, and also why I advocate for the GIL staying. If you
| want to do that, go solo in your own time, or try to apply for
| a research position in some giant tech Co.
| coldtea wrote:
| > _My point of view is that anyone who wants to write
| multithreaded code, shouldn't be trusted to. Making it
| easier for people to justify this kind of footgun is a
| problem._
|
| It's 2023 already. 1988 called.
| btilly wrote:
| Did you know that Coverity actively REMOVED checks for
| concurrency bugs?
|
| It turns out that when the programmer doesn't understand
| what the tool says, managers believe the programmer and
| throw out the tool. Coverity was finding itself in
| situations where they were finding real bugs, and being
| punished for it by losing the sale. So they removed the
| checks for those bugs.
|
| I'll revisit my opinion of multithreaded code when things
| like that stop happening. In the meantime there are models
| of how to run code on multiple threads that work well
| enough with different primitives. See Erlang, Go, and Rust
| for three of them. Also, if you squint sideways,
| microservices. (Though most people set up microservices in
| a way that makes debugging problematic. Topic for another
| day.)
| biorach wrote:
| > Did you know that Coverity actively REMOVED checks for
| concurrency bugs?
|
| Source?
| btilly wrote:
| https://cseweb.ucsd.edu/~dstefan/cse227-spring20/papers/b
| ess...
| yjftsjthsd-h wrote:
| Specifically, on the last page of that:
|
| > As an example, for many years we gave up on checkers
| that flagged concurrency errors; while finding such
| errors was not too difficult, explaining them to many
| users was.
|
| (And thanks; I was also wondering about that)
| yjftsjthsd-h wrote:
| This is very witty and all, but what does it _mean_?
| DonHopkins wrote:
| I think he was subtly Rick Rolling you.
|
| https://en.wikipedia.org/wiki/Never_Gonna_Give_You_Up
|
| On 12 March 1988, "Never Gonna Give You Up" reached
| number one in the American Billboard Hot 100 chart after
| having been played by resident DJ, Larry Levan, at the
| Paradise Garage in 1987. The single topped the charts in
| 25 countries worldwide.
| coldtea wrote:
| Actually it just meant that this is a tired old argument from
| when C/C++ programmers were new to multithreaded code.
|
| Now we have languages and language facilities (consider
| Rust, Haskell, and others) to make it much safer. Same
| with green threads and what Go and now Java do.
| p5a0u9l wrote:
| Removing the GIL should be seen as a last option.
| dekhn wrote:
| The GIL will never be removed from the main python
| implementation. Historically, the main value of GIL removal
| proposals and implementations has been to spur the core team to
| speed up single-core code.
|
| I think it's too late to consider removing the gil from the
| main implementation. Like Guido said in the PEP thread, the
| python core team burned the community for 10 years with the 2-3
| switch, and a GIL change would likely be as impactful; we'd
| have 10 years of people complaining their stuff didn't work.
| Frankly I wish Guido would just come out and tell Sam "no, we
| can't put this in cpython. You did great work, but
| compatibility issues trump performance".
|
| Kind of a shame because Hugunin implemented a Python on top of
| the CLR some 20 years ago and showed some extremely impressive
| performance results. Like jython, and pypy and other
| implementations, it never caught on because compatibility with
| cpython is one of the most important criteria for people
| dealing with lots of python code.
| lacker wrote:
| _I think it's too late to consider removing the gil from the
| main implementation._
|
| I think it'll happen one day. Is Python going anywhere?
|
| Give it 20 years. The 2 -> 3 switch will be like the Y2K bug,
| only remembered by the oldest programmers. The memories of
| pain will fade, leaving only entertaining war stories. The
| GIL will still be there, and still be annoying.
|
| Then, when everyone has forgotten, the community will be
| ready. For an incredibly long and grinding transition to
| Python 4.
| ActorNightly wrote:
| In 20 years, you wouldn't have Python 4. You will have
| something like ChatGPT that you interact with and it writes
| code for you, down to the machine level instructions that
| are all hyperoptimized. Coding will be half typing, half
| verbal.
| jwandborg wrote:
| Imagine debugging hyperoptimized machine code. - Or would
| you just blame yourself for not stating your natural
| language instructions clearly enough and start over? I
| guess all of these complex problems would somehow be
| solved for everyone within the next 21 years and 364
| days.
| ActorNightly wrote:
| You wouldn't debug it directly. The interaction will be
| something like telling the compiler to run the program
| for a certain input, and seeing wrong output, and then
| saying, "Hey, it should produce this output instead".
|
| The algorithm will be smart enough to simply regenerate
| the code minimally to produce the correct output.
| m4rtink wrote:
| Good luck with that.
| ilc wrote:
| I think you have described a new definition of hell.
| dumpsterdiver wrote:
| Yeah, when people start running into weird failure modes
| with no insight into the code... sounds like a nightmare.
| timacles wrote:
| While titillating to think about, what you're describing
| is more than 20 years away for AI. The best it can do today
| is handle generalized problems; actually writing new,
| original, complex code is not within the bounds of
| current AI capabilities.
| zackees wrote:
| I agree with this take.
|
| The GIL is hiding all kinds of concurrency bugs. If the
| CPython team disables it by default then all hell is going to
| break loose.
|
| It's better to carve out special concurrency constructs for
| those that need it.
| nomel wrote:
| Not sure why this was flagged dead. If you look at many
| stackoverflow answers, around _threading_, many _explicitly_
| rely on the GIL, quoting source/documentation
| "proving" that it's safe, avoiding the use of
| threading.Lock() and the like.
|
| As an early python programmer, I copy pasted these types of
| answers. My old threaded code _absolutely_ has these bugs.
| I've even seen code in production that has these bugs,
| because they're sometimes dumb performance enhancements,
| rather than bugs, unless you happen to use a Python
| interpreter without a GIL, especially one that doesn't
| exist yet.
|
| Great care would have to be taken to make sure the GIL was
| _not_ disabled by default, for anything an existing thread
| touches (or some super, dynamic aware, smarts to know if it
| can be disabled).
| tomn wrote:
| I believe that GIL-removal projects aim to preserve this
| behaviour:
|
| https://peps.python.org/pep-0703/#container-thread-safety
|
| That's why gilectomy carries an unreasonable single-
| threaded performance cost: many operations now need to
| take a lock where before they relied on the GIL.
| umanwizard wrote:
| Why do you consider code that relies on the GIL to be
| buggy? Isn't the GIL a documented, stable part of Python?
| (Hence why it will probably never be removed).
| miraculixx wrote:
| The GIL protects interpreter resources, not your program.
| If you have concurrent access to your own objects you
| need your own locks.
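|
| e.g. a minimal sketch:
|
|     import threading
|
|     balance = 100
|     lock = threading.Lock()
|
|     def withdraw(amount):
|         global balance
|         with lock:  # your object, your lock
|             if balance >= amount:  # check-then-act made atomic
|                 balance -= amount
|                 return True
|             return False
|
| The GIL keeps the interpreter's internals consistent while this
| runs, but it does not make the check-and-subtract atomic; the
| lock is what actually protects your data.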
| londons_explore wrote:
| > compatibility issues trump performance".
|
| Surely it would be possible to drop back to GIL mode if any c
| extension is loaded which is incompatible with running
| lockless?
|
| Then you usually get speed, yet still maintain compatibility.
| mlyle wrote:
| > if any c extension is loaded which is incompatible with
| running lockless?
|
| It'd be slightly magical to do. You'd probably have a giant
| RWLock, which is used for checking whether you're in the
| "lock free" mode. But at least almost all code could hit it
| only for read, and it could go away one day.
| giantrobot wrote:
| > Like jython, and pypy and other implementations, it never
| caught on because compatibility with cpython is one of the
| most important criteria for people dealing with lots of
| python code.
|
| I think this is more an issue of popular packages being
| developed and tested against cpython and there wasn't enough
| effort available to port/test them against anything else.
| There's no special magic in cpython that Python programmers
| love, they just want their code to run. If they've got a
| numpy dependency (IIRC it doesn't support pypy but I'm not
| going to look it up so I may be corrected on that point this
| is a long parenthetical) they can't use an interpreter that
| doesn't support it. Even if it worked but had bugs it didn't
| have in cpython, they're still going to use cpython. Most
| people aren't writing Python for its super duper fast
| performance so they're fine leaving a little performance on
| the table by using the interpreter that their dependencies
| support. Whatever that is.
| dehrmann wrote:
| Jython doesn't have a good story for running native
| libraries like Numpy.
| giantrobot wrote:
| Sure, I didn't claim it did. My point is that _Python_
| programmers don't tend to have a particular fondness for
| an interpreter. They tend to care only about the
| ecosystem. If you came up with a cxpython interpreter
| that was faster than cpython _and_ supported all the
| modules the same way (including C interop) Python
| programmers would jump over to it. If your cxpython was
| faster than cpython but _didn't_ support everything
| they'd ignore it.
|
| Case in point: Python 2.7. While 3.x offered a lot of
| improvements it took years for some popular modules to
| support it. No one bothered to look twice at Python 3
| until their dependencies supported it.
|
| Python programmers don't tend to care much about the
| interpreter so much as the code they wrote or use running
| correctly.
| takeda wrote:
| I wouldn't say that GIL will never be removed, but I believe
| the GIL cannot be removed without breaking a lot of existing
| code.
|
| That means there could be another drama with migrations if
| that would be done.
|
| I think the most likely way they can effectively eliminate
| GIL would be to provide a compile option that would basically
| say "this enables parallelization, but your code is no longer
| allowed to do A, B, C (there probably would be a lot more
| things)"
|
| People who want to get it would then adapt their code, and
| there could be pressure for other packages to make them work
| in that mode.
| coldtea wrote:
| > _the python core team burned the community for 10 years
| with the 2-3 switch, and a GIL change would likely be as
| impactful_
|
| A core team led by him, which also had the opportunity to
| make much more impactful changes during 3, including removing
| the GIL, since they were going to mess up compatibility
| anyway, but didn't.
|
| All that mess (and resulting split and multi-year slowdown in
| Py3 adoption) just to put in the utf-8 change and some
| trivial changes. It's only after 3.5 or so that 3 became
| interesting.
| paulddraper wrote:
| It's hard to argue that Python 3 should have been even less
| compatible.
| mardifoufs wrote:
| Agreed. It would have pushed the 2->3 migration from
| "very painful for the ecosystem" to a full on perl5->6
| break between the two versions. Not sure it would have
| survived that.
| coldtea wrote:
| Is it? It took the 5 year (at least) adoption hit anyway.
| How much worse would it be if it had more features people
| want?
|
| I'd say it should have been less compatible where needed,
| to add more substantial changes people wanted, as opposed
| to taking the hit for nothing!
|
| And it should have been more compatible for things that were
| stupid decisions they eventually had to take back, like
| not having a bytes/str solution.
| misnome wrote:
| > How much worse would it be
|
| raku
| coldtea wrote:
| Raku failed to timely deliver a stable release. And
| didn't stick to sensible new features, but tried to make
| an uber-language with everything plus the kitchen sink,
| and even a multi-language vm.
| misnome wrote:
| You are literally asking what could happen if Python 3
| had done the same.
| paulddraper wrote:
| > How much worse would it be
|
| 8 years
|
| or never
| eesmith wrote:
| > if it had more features people want
|
| What people wanted was features to help with the
| migration.
|
| Yes, of course having _those_ features would have
| helped.
|
| But doing that required experience the Python developers
| didn't have when they were doing 3.0!
|
| The Python developers thought people could do a one-off
| syntactic code translation (2to3), perhaps even at
| install time, rather than what most people did - write to
| the common subset of 2 and 3, with helpers like the 'six'
| package.
|
| What are the "more substantial changes" you propose? The
| walrus operator? Matching? Other things that Python 3
| eventually gained, and which took years in some cases to
| develop?
|
| Or are you proposing something that would have made it more
| difficult to write to the common subset?
|
| That subset compatibility necessity extends to the
| Python/C API. Get rid of global state and you'll need to
| replace things like:
|
|     PyErr_SetString(PyExc_ValueError,
|                     "embedded null character");
|
| with something that passes in the current execution
| state. Make that too hard, and you inhibit migration of
| existing extension modules, which further inhibits the
| migration.
| regularfry wrote:
| I think that's called "learning from one's mistakes". I
| hope.
| jedberg wrote:
| > The GIL will never be removed from the main python
| implementation.
|
| I don't see why. It's a much easier transition than 2 to 3.
|
| Make each package declare right at the top if they are non-
| GIL compatible. Have both modes available in the interpreter.
| If every piece of code imported has the declaration on top,
| then it runs in non-GIL mode, otherwise it runs in classic
| GIL mode.
|
| At first most code would still run in GIL mode, but over time
| most packages would be converted, especially if people
| stopped using packages that weren't compatible.
| Groxx wrote:
| I don't see why you wouldn't have holdouts using GIL for
| valid single-threaded performance reasons for
| years/decades. And that's ignoring legacy code - even 2.7
| is still alive and kicking in some corners.
| jedberg wrote:
| I'm sure you would, but anyone who cared about it would
| work around it, either by not using that library or
| finding a different one or even forking the one that
| isn't updated. Just like most people use Python 3.x now,
| but there are some 2.7 holdouts. But those holdouts
| aren't holding back the entire ecosystem at this point.
| powersnail wrote:
| Python would be too dynamic a language to have non-GIL
| compatibility declared simply at the top. Code can be
| imported/evaled/generated at runtime, and at any part of a
| script, which means that python would need to be able to
| switch from non-GIL to GIL at any time of execution.
| phire wrote:
| That's not an insurmountable problem.
|
| As long as all the data structures stay the same, all you
| really need to do is flush out all the non-GIL bytecode
| and load in (or generate) GIL bytecode.
|
| Sure, there might be a stutter when this happens. You
| will also want a way to either start in GIL mode, or
| force it to stay in non-GIL mode, throwing errors. But
| it's a very solvable problem.
| richdougherty wrote:
| I mean that seems horribly tricky, but also totally
| doable.
|
| Having read about JavaScript/Java VM optimisations in
| JITs and GC I would be surprised if a global state change
| like this is not manageable - think deoptimising the JS
| when you enter the debugger in the dev tools in your
| browser.
| blibble wrote:
| Java-the-bytecode is as dynamic yet somehow hotspot
| manages to be rather nippy
| jedberg wrote:
| It's true that there are dynamic imports, but presumably
| it would be on the library maintainer to know about that,
| but also you could throw a catchable error about GIL
| imports or something like that.
|
| All I'm saying is that it's solvable, and more solvable
| than 2 to 3.
| samwillis wrote:
| I completely agree, something like:
|
|     from __future__ import multicore
|
| (The proposal seems to be to call it that rather than "no
| GIL" as it's more positive)
|
| And maybe if done in an __init__.py it applies to the
| whole module.
|
| Do that for v3.x then drop it completely for v4
|
| I think the issue may be that maintaining both systems is
| too complex.
| dekhn wrote:
| I'm sorry but I really don't think it's "easy" and your
| suggestion would be just part of a much, much larger
| solution.
| aprdm wrote:
| This is very naive.
| xapata wrote:
| jedberg doesn't seem like a naive person, glancing at his
| work history. Perhaps you could explain why you think
| this opinion is naive?
| dekhn wrote:
| Even skilled programmers often make the mistake of saying
| "why don't you just..." or "it's easy, just..." when
| completely ignoring large important factors, such as
| following process, ensuring backward compatibility,
| stakeholder alignment (ugh), and addressing long tail
| problems.
| Eisenstein wrote:
| "There are two types of programmers: new ones who don't
| know how complicated things are, and experienced ones who
| have forgotten it" -from an article on HN the other day
| about setting up python envs.
| execveat wrote:
| There also are programmers who are aware of much more
| complicated things being done in sister projects, like
| JVM and JS runtimes in this case.
| xapata wrote:
| An appeal to authority? I suppose it'd be better if you
| named the article's author or linked to it.
| Eisenstein wrote:
| Not appealing to anyone; I thought it was a clever quote.
|
| * https://www.bitecode.dev/p/why-not-tell-people-to-
| simply-use
|
| EDIT -- Here is the quote:
|
| _There are usually two kinds of coders giving advises. A
| fresh one that has no idea how complex things really are,
| yet. Or an experienced one, that forgot it._
| Dork1234 wrote:
| I really think the GIL is saving a bunch of poorly written
| multi-threaded C++ wrappers/libraries out there. If they
| remove it, a bunch of bugs will appear in other libraries
| that might not be Python's fault.
| eklitzke wrote:
| They're not "poorly written", the fact that you don't need
| to do any locking in C/C++ code is part of the existing
| Python API. Right now when Python code calls into C/C++
| code the entire call is treated as if it's a single atomic
| bytecode instruction. Adding extra locking would just make
| the code slower and would accomplish absolutely nothing,
| which is why people don't do it.
| tinus_hn wrote:
| The interpreter could do locks around these calls
| automatically to make them atomic, while leaving itself
| multithreaded.
| KMag wrote:
| In order for the call into C to appear atomic to a
| multithreaded interpreter, all threads in the interpreter
| would need to be blocked during the call. That's possible
| to do, but you've just re-introduced the GIL whenever any
| thread is within a C extension.
|
| In the unlocked case, one could use low-overhead tricks
| used for GC safepoints in some interpreters. One low-
| overhead technique is a dedicated memory page from which
| a single byte is read at the beginning of every opcode
| dispatch, and you mark that page non-readable when you
| need to freeze all the threads executing in the
| interpreter. You'd then have the SIGSEGV handler block
| each faulting thread until the one thread returned from
| C. That's fairly heavy in the case it's used, but pretty
| light-weight if not used.
| silverwind wrote:
| Like in any other language, it's best to avoid non-native
| dependencies.
| enedil wrote:
| Nevertheless, this is still a concern for the wider ecosystem,
| if Python libraries suddenly start to break due to
| underlying issues. I don't think this can be neglected.
| coldtea wrote:
| > _There is also the argument that it could "divide the
| community" as some C extensions may not be ported to the new
| ABI that the no GIL project will result in_
|
| I think the arguments are a red herring. It's more
| rationalizations for not wanting to do it.
| gazpacho wrote:
| The other thing I don't get is that the whole sub interpreters
| thing seems to totally break extension modules as well:
| https://github.com/PyO3/pyo3/issues/2274. In theory parts of
| sub-interpreters have been around for a while and it just
| happens that every extension module out there is incompatible
| with it because no one used it. But if it's going to become the
| recommended way to do parallelism going forward then they'll
| have to become compatible with it.
|
| The serialization thing is also a huge issue. Half of the time
| I want to use multiprocessing I end up finding that the
| serialization of data is the bottleneck and have to somehow re-
| architect my code to minimize it.
|
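| A quick way to see the serialization cost (a sketch; exact
| numbers will vary by machine and payload):
|
|     import pickle
|     import time
|
|     data = list(range(5_000_000))  # stand-in for a big payload
|
|     t0 = time.perf_counter()
|     pickle.loads(pickle.dumps(data))  # what each hop to a worker costs
|     t1 = time.perf_counter()
|
|     t2 = time.perf_counter()
|     total = sum(data)  # the actual "work"
|     t3 = time.perf_counter()
|
|     print(f"pickle round-trip {t1 - t0:.2f}s vs work {t3 - t2:.2f}s")
|
| When the per-item work is cheap, the pickle round trip can
| easily dominate, which is exactly when multiprocessing stops
| paying off.
|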
| I would much prefer a world in which asyncio is 2x faster and
| can benefit from real parallelism across threads. Libraries
| like anyio already make it super easy to work with async +
| threads. It would make Python a viable option for workloads
| where it currently just isn't.
| rogerbinns wrote:
| (Disclosure: Author of a Python C extension that wraps SQLite
| that has 2 decades of development behind it.)
|
| Have a look at the current documentation for writing
| extensions. This approach is essentially unchanged since
| Python 2.0.
| https://docs.python.org/3/extending/newtypes_tutorial.html
|
| In particular note how everything is declared static - ie
| only one instance of that data item will exist. If there are
| multiple interpreters then there needs to be one instance per
| sub-interpreter. That means no more static and initialisation
| has to be changed to attach these to the module object which
| is then attached to a specific interpreter instance. It also
| means every location you needed to access a previously static
| item (which often happens) has to change from a direct
| reference through new APIs chasing back from objects to get
| their owning module and then get the reference. That is the
| code churn the PyO3 issue is having to address. One bonus
| however is that you can then cleanly unload modules.
|
| This may still not be sufficient. For example I wrap SQLite
| and it has some global items like a logging callback. If my
| module was loaded into two different sub interpreters and
| both registered the logging callback, only one would win.
| These kind of gnarly issues are hard to discover and
| diagnose.
|
| Removing the GIL also won't magically help. I already release
| it at every possible opportunity. If it did go away, I would
| have to reintroduce a lock anyway to prevent concurrency in
| certain places. And there would have to be more locking
| around various Python data structures. For example if I am
| processing items in a list, I'd need a lock to prevent the
| list changing while processing. Currently the GIL handles
| that and ensures fewer bugs.
|
| I've also experienced the serialization overhead with
| multiprocessing. I made a client's code so much faster that
| any form of Python concurrency was slower because of all the
| overhead. I had to rearchitect the code to work on batches of
| data items instead of the far more natural one at a time.
| That finally allowed a performance improvement with
| multiprocessing.
| samwillis wrote:
| This, just posted on the Python forum, is a brilliant rundown of
| the conflicting "Faster Python" and "No GIL" projects, and a
| proposal (plus call for funding) for a route forward.
|
| I think everyone would agree that trying to combine both would
| be ideal!
|
| "A fast, free threading Python"
|
| https://discuss.python.org/t/a-fast-free-threading-python/27...
| dragonwriter wrote:
| It is from the person most involved in the faster-with-GIL
| effort, and its recommendation is to prioritize that effort
| in any case, and if the resources are available for that and
| no-gil, do both.
|
| Not that I disagree with the recommendation, but one of the
| sides saying "as long as resources make us choose, choose my
| side" is...not really surprising.
| samwillis wrote:
| Not surprising, but I'm very happy they are trying to find
| a route forward for both. That I commend.
|
| I think from memory "Faster Python" is Microsoft funded and
| "No GIL" is funded by Facebook. If they can find a way to
| fund a combined effort that would be good.
|
| I suspect the conflicting funding also adds to the general
| political difficulty around this.
| jrochkind1 wrote:
| this is a good read and deserves to be on the homepage with
| its own thread too!
| zerkten wrote:
| This feels like it's playing out as I expected. I followed
| Python, and the Python community, really closely from 2008-2016
| when there were tons of relatively small scale experiments
| happening. This all happened organically to a large extent and
| there was no one coordinating a grand vision. It seems like we
| have a continuation of this giving rise to the concern that
| there is some battle.
|
| I suspect there will be some butting of heads for a while
| before they work things out after seeing how the community
| reacts. All of this could be handled better with some
| thoughtful proactive engagement, but that's not really how
| things operate and there is no one to really enforce it.
| ptx wrote:
| If we could just get efficient passing of objects graphs from
| one subinterpreter to another, which is not in the current
| plan, I think that would solve a lot of use cases. That would
| allow producer/consumer-style processing with multiple cores
| without the serialization overhead of the multiprocessing
| module.
|
| Removing the GIL seems like it could make things more
| complicated in many ways by making lots of currently thread-
| safe code subtly unsafe, but I might be wrong about this.
| (...in which case it would just make things very slow because
| everything is synchronized?)
| bratao wrote:
| Yeah, they clearly stated that here
| https://discuss.python.org/t/pep-703-making-the-global-inter...
|
| I really wish the GIL would go away. It is better to pay this
| price now; multi-threading is the future.
| ActorNightly wrote:
| >multi-threading is the future.
|
| Yes, with ML powered compilers recognizing what you are
| trying to do and generating the actual multithreaded code for
| you.
|
| And it won't be multithreaded code like you know it, in the
| sense of os specific threading code with context switching
| and what not. It will be compiled compute graphs targeted at
| specific ML hardware, likely with static addressing.
| PhilipRoman wrote:
| >multi-threading is the future
|
| Haha, reminds me of an image I saw with a farmer from some
| developing country saying "irrigation is the future".
|
| For everyone else multithreading has been the status quo for
| quite a long time.
| hospitalJail wrote:
| I'm so conflicted.
|
| ~40 of the programs I am responsible for are single threaded.
| They were relatively quick to develop and were made by
| Electrical Engineers rather than career/degreed programmers.
|
| 2 programs use multithreading, I had to do that. The learning
| curve was not a huge deal, but the development time adds at
| least hours. In my case, days (due to testing).
|
| I imagine it's too hard to have an optional flag at the start
| of each program that can let the user decide?
| birdyrooster wrote:
| The problem is that Python types are not thread safe so you
| have to jump through more hoops to have safe
| parallelization in Python. These changes would make writing
| multithreaded code much easier, it seems.
| coldtea wrote:
| > _but the development time adds at least hours_
|
| So? Some "hours of development time" is nothing.
| sfink wrote:
| True, and as you get better at it, those hours will be
| fewer.
|
| The weeks of debugging never go away, though.
|
| (Or at least, not as long as you're using shared state,
| but that's really the only thing under consideration
| here.)
| hospitalJail wrote:
| I'm mostly with you. I think this probably affects part
| time/newbie programmers more.
|
| A few hours of dev time is 1 or 2 nights of work.
| maleldil wrote:
| > I imagine it's too hard to have an optional flag at the
| start of each program that can let the user decide?
|
| Adding nogil would mean deep changes to the interpreter. I
| imagine maintaining both versions would be almost like
| forking the project.
| dragonwriter wrote:
| > I imagine it's too hard to have an optional flag at the
| start of each program that can let the user decide?
|
| The actual next-step proposal toward no-gil is GIL as a
| build-time flag (which isn't quite the same as a runtime
| flag, but not too far off, either.)
|
| https://peps.python.org/pep-0703/
| sgt wrote:
| Not for Python, I feel. In sheer volume, the vast majority of
| my Python programs are single-threaded. I want my programs to
| run very quickly.
|
| Those that are multi-threaded are seeing minor to medium
| load.
|
| If expecting extreme load (like Twitter scale), then Python
| is usually not the answer (rather go to a statically typed
| language like Java, Go, Rust etc).
| gpderetta wrote:
| > In sheer volume, the vast majority of my Python programs
| are single-threaded
|
| obviously if multithreading is near-useless in python, very
| few programs will take advantage of it.
| zo1 wrote:
| Likewise I can say: Obviously we still have the GIL
| because very few people want it removed.
| AlphaSite wrote:
| If single-threaded perf is important, you've already lost
| by using python. You're only ever going to get ok-ish
| performance, or slightly more ok-ish.
| gcbirzan wrote:
| > Not for Python, I feel. In sheer volume, the vast
| majority of my Python programs are single-threaded.
|
| Yes, they are single threaded, because using multiple
| threads brings very little benefit in most cases...
|
| > If expecting extreme load (like Twitter scale), then
| Python is usually not the answer (rather go to a statically
| typed language like Java, Go, Rust etc).
|
| So that means we shouldn't get any performance
| improvements, because there are faster languages out there?
| amelius wrote:
| It's relatively simple to make the GIL go away: compiling
| to some VM that has a good concurrent garbage collector would
| be one approach. Yes, this will break some assumptions here
| and there, but they're not too difficult to overcome,
| especially if you bump the version number to Python 4.
|
| However, that leaves a lot of C code that you can't talk to
| anymore because the C code requires the old Python FFI. I
| think this is where the main problem lies.
| coldtea wrote:
| > _It's relatively simple to make the GIL go away: compiling
| to some VM that has a good concurrent garbage collector would
| be one approach. Yes, this will break some assumptions here
| and there, but they're not too difficult to overcome,
| especially if you bump the version number to Python 4._
|
| "It's easy to lower the air-conditioning costs of Las
| Vegas: just move the town to New England".
|
| The problem is "how to remove the GIL" in abstract. It's
| how to remove the GIL, not impact extensions at all (or as
| little as possible), keep single threaded performance, and
| have zero impact to user programs.
|
| To which the above isn't any kind of solution.
| brightball wrote:
| JRuby is a good path to this in the Ruby world.
| zerkten wrote:
| >> However, that leaves a lot of C code that you can't talk
| to anymore because the C code requires the old Python FFI.
| I think this is where the main problem lies.
|
| This is exactly the problem, but people have a hard time
| grasping this because most people interacting with Python
| have no understanding of how C code interacts with Python,
| or don't understand the C module ecosystem. I'm not sure if
| the Python community has a good accounting of this either
| because I don't recall seeing much quantitative analysis of
| how many modules would need to be updated etc.
|
| This would help compare with the Python 2 to 3 conversion
| efforts. Even then, the site listing (shaming?) popular
| modules' compatibility made a mid-to-late appearance in
| the process of killing Python 2. Quantification of module
| updates is an obvious thing to have from the get-go for anyone
| looking to follow through on removing the GIL, but it's not
| a fun task.
| amelius wrote:
| This needs more thinking but how about a hybrid approach,
| where you have Thread objects, and GILFreeThread objects?
|
| The Thread objects work with old code, but run more
| slowly.
|
| The GILFreeThread objects are fast.
|
| If an object is passed from a Thread to a GILFreeThread
| or the other way around, then special safety code is
| attached to the object so that manipulating the object
| from the other side doesn't cause issues.
|
| The advantage is that now the module implementers have
| time to migrate from the old system to the new system.
| And users can work with both the old modules and
| "converted" modules in the same system, with minor
| changes.
| ptx wrote:
| This sounds a bit like COM and its apartment-threaded vs.
| free-threaded objects. The "special safety code" in that
| case is a proxy object that sends messages to the thread
| that owns the actual object when its methods are invoked.
| kortex wrote:
| That sounds like a maintenance and stability nightmare,
| if it's even possible. You are effectively red/blue
| splitting the entire codebase. PyObject and the GIL touch
| _everything_ in the codebase.
| amelius wrote:
| The red/blue splitting happens behind the scenes, so it's
| different. Not really a color problem, because the user
| doesn't have to know about it.
|
| But yeah, you will basically have two versions of Python
| running at the same time, with some (hopefully invisible)
| translation between them.
| kortex wrote:
| > But the red/blue splitting happens behind the scenes,
| so it's different.
|
| Respectfully, I don't believe you have spent any
| appreciable time looking at the CPython source code. If
| you had, you would understand how unreasonable this
| expectation is. I don't say this to tear you down, I say
| this to convey the magnitude of what you are describing.
| It would involve touching tens of thousands of LoC. You
| are talking about a multi-million dollar project that
| would result in a ton of near-duplication of code.
|
| The red/blue is inescapable because you have to redefine
| PyObject to have two flavors, PyObject with GIL and
| GilFreePyObject. You now have to check which one you are
| dealing with constantly.
| amelius wrote:
| > You now have to check which one you are dealing with
| constantly.
|
| No, because if you're running inside a Thread you will
| know that you will see only PyObjects, whereas if you're
| running inside a GilFreeThread you will know that you
| will only see GilFreePyObjects.
|
| If you're manipulating the PyObject (necessarily from a
| Thread) then there will be behind-the-scenes translation
| code that will manipulate the corresponding
| GilFreePyObject for you. But you don't have to know about
| it.
| kortex wrote:
| What exactly does "running inside a Thread/GilFreeThread"
| in the context of the cpython runtime mean? You pretty
| much need an entire copy of the virtual machine code.
|
| These are C structs we are talking about here, not some
| Rust trait you can readily parameterize over abstractly.
| That either means lots of manual code duplication, or
| some gnarly preprocessor action. Both are a maintenance
| nightmare.
| amelius wrote:
| Yes, the assumption is that writing a "double-headed
| Python" runtime is far less work than converting the
| entire ecosystem to a new Python runtime.
|
| I think this is the correct view, because at this moment
| people are writing various approaches in an attempt at
| getting rid of the GIL. It's the ecosystem of modules
| that's the real problem, where you want to basically put
| in as little effort as possible per module, at least
| initially.
| nomel wrote:
| Please read any amount of CPython interpreter code to
| begin to understand what you're asking for "behind the
| scenes".
| make3 wrote:
| this would literally break every single python package out
| there man
| crabbone wrote:
| [flagged]
| gyrovagueGeist wrote:
| ...you mean like how the nogil project already has a
| working Numpy module?
| apgwoz wrote:
| > Also, there aren't good programmers in Python core dev.
|
| You seem pretty confident that you know what you are
| doing.
| [deleted]
| notatallshaw wrote:
| > It's relatively simple to make the GIL go away: compiling
| to some VM that has a good concurrent garbage collector
| would be one approach
|
| Sure, if you don't mind paying a 50-90% hit to single-threaded
| performance, or completely abandoning C-API compatibility and
| having C extensions start from scratch, then there are simple
| approaches.
|
| If you look at any past attempt to remove the GIL, you will
| see that meeting these two requirements (not having terrible
| single-threaded performance and not requiring an almost
| completely new C-API) is actually very complex and takes a lot
| of expertise to implement.
| saltminer wrote:
| This might be a dumb question, but why would removing the
| GIL break FFI? Is it just that existing no-GIL
| implementations/proposals have discarded/ignored it, or
| is there a fundamental requirement, e.g. C programs
| unavoidably interact directly with the GIL? (In which
| case, couldn't a "legacy FFI" wrapper be created?) I know
| that the C-API is only stable between minor releases [0]
| compiled in the same manner [1], so it's not like the
| ecosystem is dependent upon it never changing.
|
| I cannot seem to find much discussion about this. I have
| found a no-GIL interpreter that works with numpy, scikit,
| etc. [2][3] so it doesn't seem to be a hard limit. (That
| said, it was not stated if that particular no-GIL
| implementation requires specially built versions of C-API
| libs or if it's a drop-in replacement.)
|
| [0]: https://docs.python.org/3/c-api/stable.html#c-api-
| stability
|
| [1]:
| https://docs.python.org/3/c-api/stable.html#platform-
| conside...
|
| [2]: https://github.com/colesbury/nogil
|
| [3]: https://discuss.python.org/t/pep-703-making-the-
| global-inter...
| kortex wrote:
| > C programs unavoidably interact directly with the GIL?
|
| Bingo. They don't _have_ to, but often the point of C
| extensions is performance, which usually means turning on
| parallelism. E.g. Numpy will release the GIL in order to
| use machine threads on compute-heavy tasks. I'm not
| worried about the big 5 (numpy, scipy, pandas, pytorch,
| and sklearn), they have enough support that they can
| react to a GILectomy. It's everyone else that touches the
| GIL but may not have the capacity or ability to update in
| a timely manner.
|
| I don't think this is something which can be shimmed
| or ABI-versioned either. It's deeeep and touches
| huge swaths of the cpython codebase.
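|
| A rough illustration of what I mean (assuming numpy is
| installed; exact behavior depends on your numpy/BLAS build):
| the matrix multiply below typically runs with the GIL
| released, so the threads can actually overlap on cores.
|
|     import threading
|     import numpy as np
|
|     a = np.random.rand(2000, 2000)
|
|     def work():
|         # the multiply dispatches to BLAS; numpy releases the
|         # GIL around the call, so these threads are not
|         # serialized by it
|         return a @ a
|
|     threads = [threading.Thread(target=work) for _ in range(4)]
|     for t in threads:
|         t.start()
|     for t in threads:
|         t.join()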
| saltminer wrote:
| Thanks, that explains a lot. Sounds like a task that
| would have to be done in Python 4, if ever it exists.
| notatallshaw wrote:
| > or is there a fundamental requirement, e.g. C programs
| unavoidably interact directly with the GIL?
|
| Both: C programs can use the GIL for thread safety, and they
| can make assumptions about the safety of interacting with a
| Python object.
|
| Some of those assumptions are not real guarantees from
| the GIL but in practice are good enough; they would no
| longer be good enough in a no-GIL world.
|
| > I know that the C-API is only stable between minor
| releases [0] compiled in the same manner [1], so it's not
| like the ecosystem is dependent upon it never changing.
|
| There is a limited API tagged as abi3[1] which is
| unchanging and doesn't require recompiling, and any
| attempt to remove the GIL so far would break that.
|
| > so it's not like the ecosystem is dependent upon it
| never changing
|
| But the wider C-API does not change _much_ between major
| versions; it's not like the way you interact with the
| garbage collector completely changes, causing you to
| rethink how you have to write concurrency. This allows
| the many projects which use Python's C-API to relatively
| quickly update to new major versions of Python.
|
| > I have found a no-GIL interpreter that works with
| numpy, scikit, etc. [2][3] so it doesn't seem to be a
| hard limit.
|
| The version of nogil Python you are linking to is the
| product of years of work by an expert funded by Meta to
| work full time on this, and it draws on knowledge from many
| previous attempts to remove the GIL, including the
| "gilectomy". Also, you are linking to the old version
| based on Python 3.9; there is a new version based on
| Python 3.12[2]
|
| This strays away from the points I was making, but with
| this specific attempt to remove the GIL, if it is adopted,
| it is unlikely to be switched over in a "big bang", e.g.
| Python 3.13 followed by Python 4.0 with no backwards
| compatibility on C extensions. The Python community does
| not want to repeat the mistakes of the Python 2 to 3
| transition.
|
| So far more likely is to try and find a way to have a
| bridge version that supports both styles of extensions.
| There is a lot of complexity in this though, including
| how to mark these in packaging, how to resolve
| dependencies between packages which do or do not support
| nogil, etc.
|
| And _even_ this attempt to remove the GIL is likely to
| make things slower in some applications, both in terms of
| real-world performance (some benchmarks such as MyPy
| show a nearly 50% slowdown, and there may be even worse
| edge cases not discovered yet), and in terms of lost
| development, as the Faster CPython project will be unlikely
| to land a JIT in 3.13 or 3.14 as they currently plan.
|
| [1]: https://docs.python.org/3/c-api/stable.html#c.Py_LIM
| ITED_API [2]: https://github.com/colesbury/nogil-3.12
| JohnFen wrote:
| Oh, boy. Will any of that impact backward compatibility?
|
| I don't develop anything in Python, but it is used by several
| applications of importance to me. The lack of compatibility
| between versions is a thing that bites me hard, and I tend to
| curse Python because of it.
| matsemann wrote:
| GIL is one of the things that make Python an annoyance to work
| with. In saner languages, you could handle multiple requests at
| the same time, or easily spin something off in a thread to work
| in the background. In Python you can't do this. You need to
| duplicate your process, then pay the price in memory usage and
| other things that multiple processes hinder (like communication
| between threads, or pre-computed values no longer being shared,
| so you need something external again). To deploy your app, you
| end up with 10 different deploys because each of them has to
| have a different entry point and a separate task to fulfill.
| slt2021 wrote:
| No, it is not.
|
| If you want to reach peak performance, a single-threaded app
| with no locks is the way to go, with work being sharded (not
| shared) among multiple single-threaded apps.
|
| Multi-threaded apps with shared state introduce more
| complexity than performance benefit when compared to multiple
| single-threaded apps running an asyncio event loop.
|
| For example, the LMAX Disruptor.
| ActorNightly wrote:
| The other languages are not saner. You are basically saying
| "Python GIL is annoying because I can't write parallel
| processing performant code in Python". Python has never been
| and is not a performant language. It's designed for rapid and
| easy development.
|
| The multiprocessing+asyncio in Python fulfills the aspect of
| utilizing all the resources, albeit at a higher memory cost,
| but memory is dirt cheap these days. You have a master
| process and then worker processes. For all things that you
| would write in Python, where in >90% cases you are network
| latency limited, the paradigm of a master process and worker
| processes with IPC on unix sockets works extremely well. Set
| up a web app with fast api/gunicorn master/uvicorn workers,
| and it will be plenty fast enough for anything you do.
| stinos wrote:
| _GIL is one of the things that make Python an annoyance to
| work with_
|
| For your particular usecase, yes. Personally I've been using
| Python for like 20 years for various tasks and so far never
| got really bothered by the presence of it once. Worst case
| was having to wait somewhat longer for things to complete.
| For my case: still worth it compared to making things
| multithreaded. And async fixed the rest. And the things which
| I actually need to be _fast_ aren't usually in Python
| anyway. I'm not saying the GIL should stay, it's just that it
| doesn't seem as much of a problem in the general land of
| Python. Or in other words: how many Python users out there
| even know what GIL means and does?
| stult wrote:
| > For your particular usecase, yes.
|
| The use case they are describing is a standard web server
| or web application. That's a pretty important and widely
| applicable use case to dismiss out of hand as "your
| particular usecase".
| nunuvit wrote:
| The dismissiveness really goes the other way. Pythons
| like IronPython and Jython don't have a GIL. CPython does
| because it's primarily a glue language for extensions
| that might not be thread-safe. Web apps were given huge
| accommodation with async, so you can't say their needs
| are being dismissed. Why must we break the C in CPython
| for a use-case that could use one of the GIL-free
| Pythons?
| stinos wrote:
| That's somewhat out of context. With the bit you quoted I
| meant "sure working around the GIL by implemening a web
| server in that particular way is annoying". I'm not
| saying that "web server" as a whole is not important or
| not widely applicable, merely that amongst all other
| usecases and applications of Python out there, web
| servers are just one of many. And the particular
| implementation stated like "10 different deploys" is even
| a subset of that 'one' and as explained by fellow
| comments, probably not the most appropriate one.
| semiquaver wrote:
| The GIL is not held during IO, which is what most web
| applications and web servers should be spending the vast
| majority of their time doing.
|
| https://docs.python.org/3/library/threading.html
|
| If that's too limiting, preforking and other forms of
| process-based parallelism are a tried and true approach
| that has been used for years to run python, ruby, PHP,
| and once upon a time Perl web applications at enormous
| scale. The difference between threads and processes on
| Linux is relatively minor.
|
| Saying that python doesn't work for web application use
| cases because of the GIL is frankly sort of bizarre given
| the large number of python web applications in the wild
| chugging along delivering value.
| mlyle wrote:
| > which is what most web applications and web servers
| should be spending the vast majority of their time doing.
|
| Sure... but if you have dozens of threads spending most
| of their time doing I/O, that still leaves many threads
| wanting to do things other than I/O.
|
| > The difference between threads and processes on Linux
| is relatively minor.
|
| Except having any shared state between processes is
| painful. If you're hitting an outside database for
| everything, it's fine.
| jrochkind1 wrote:
| > The GIL is not held during IO, which is what most web
| applications and web servers should be spending the vast
| majority of their time doing.
|
| While this has been oft-repeated for years, more or less
| language-independently, I have become convinced it no
| longer accurately describes _ruby on rails_ apps. People
| still say it about ruby /rails too though. But my rails
| web apps are spending 50-80% of their wall time in cpu,
| rather than blocking on IO. Depending on app and action.
| And whenever I ask around for people who have actual
| numbers, they are similar -- as long as they are projects
| with enough developer experience to avoid things like n+1
| ORM problems.
|
| I don't have experience with python, so I can't speak to
| python. But python and ruby are actually pretty similar
| languages, including with performance characteristics,
| and the GIL. Python projects tend to use more stuff
| that's in C, which would make more efficient use of CPU,
| so that could be a difference. (Also not unrelated to
| what we're talking about!)
|
| But I have become very cautious of accepting "most web
| applications are spending the vast majority of their time
| on io blocking rather than CPU" as conventional wisdom
| without actually having any kind of numbers. _vast_
| majority? I would doubt it, but we need empirical
| numbers.
| nomel wrote:
| > The use case they are describing is a standard web
| server or web application.
|
| I believe this is what they were referring to when they
| said "async fixed the rest".
| Fiahil wrote:
| I think Data Scientists would like a word with you. They
| have plenty of time since their parallel pipeline was
| OOMKilled.
| oconnor663 wrote:
| > you could handle multiple requests at the same time
|
| To be fair to Python and the GIL, it's totally capable of
| parallelizing requests when most of the work is network-
| bound, which is probably the common case. And when the work
| is CPU-bound, but the CPU-intensive part is written in C,
| it's also possible for C to release the GIL. So it's really
| only "heavy computational work directly in Python" programs
| that are affected by this. (On the other hand, Python
| applications do naturally expand to look like this over
| time...)
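|
| A minimal sketch of that, with placeholder URLs: the GIL is
| dropped while each thread blocks on the socket, so the
| fetches overlap.
|
|     from concurrent.futures import ThreadPoolExecutor
|     from urllib.request import urlopen
|
|     urls = [
|         "https://www.python.org",
|         "https://peps.python.org/pep-0703/",
|     ]
|
|     def fetch(url):
|         # blocking network read; the GIL is released while the
|         # thread waits on I/O
|         with urlopen(url) as resp:
|             return url, len(resp.read())
|
|     with ThreadPoolExecutor(max_workers=8) as pool:
|         for url, size in pool.map(fetch, urls):
|             print(url, size)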
| paulddraper wrote:
| Other languages do that: JavaScript, PHP, Erlang.
|
| Python multiprocessing is pretty usable.
|
| I like multithreading, but also...it has more footguns than
| the rest of programming combined. [1]
|
| I'm not convinced Python's approach is that bad in practice.
|
| [1] https://news.ycombinator.com/item?id=22165193
| [deleted]
| mohaine wrote:
| It is, until the GIL bites you in the ass. As it is you get
| different behavior if your call is calling out to external
| code vs being pure python. Note that you really don't know
| if a random function call is pure python or wraps an external
| class, so you really get random behavior.
|
| The time it got me was a thread used just to time out another
| process. Tests worked great, but the timeout didn't work in
| production because the call it was wrapping was calling out
| to C code, so nothing would run until the call returned. We
| even still got the timeout error in the logs and it looked
| like it was working (it even tossed the now-waited-for
| valid results), but not at the time of the timeout - only
| after the call finally returned a few hours later.
| paulddraper wrote:
| So.... It would have been better if GIL were even more
| aggressive?
| samsquire wrote:
| I have often wondered what the solution to the serialisation of
| objects between subinterpreters is.
|
| If it's garbage collection that's the problem, I think you could
| transfer ownership between threads, so the subinterpreter takes
| ownership of the object and all references to it in the source
| interpreter are voided.
|
| Alternatively you can do something like Java, where all objects
| are in a global object allocator; passing things between
| threads then doesn't require serialisation, just a reference.
| jillesvangurp wrote:
| The GIL has been a blocker for many years. It's nice that the
| team is making progress of course. IMHO it's one of those
| bandaids they need to rip off.
|
| I was listening to the Lex Fridman interview with Chris
| Lattner a week ago or so. Very interesting discussion on his
| project mojo, which intends to build a new language that is
| backwards compatible and a drop-in replacement for python with
| opt-in strict typing, better support for native/primitive types
| where this makes sense, easier integration with hardware
| optimizations, and of course no GIL.
| migration path for existing code is that it should just work
| and then you optimize it and provide the compiler with type
| hints and other information so it can do a better job. Very
| ambitious roadmap and I'm curious to see if they'll be able to
| deliver.
|
| The main goal seems to be to enable programmers to do the
| things you currently can't do in python because it's too slow,
| without running into a brick wall in terms of performance.
|
| I mostly work with JVM languages and a few other things but I
| occasionally do a bit with python as well. I've always liked it
| as a language but I'm by no means an expert in it. I recently
| spent a day building a simple geocoder and since I know about
| the GIL, I went straight for the multiprocessing library and
| did not bother with threads. IMHO there's absolutely no point
| in attempting to use threads with python with the GIL in place.
| I needed to geocode a few hundred thousand things in a
| reasonable time frame, so all I wanted to do was use a few
| different processes concurrently so I could cut down the
| runtime to something reasonable.
|
| Python is ok for single threaded stuff but you run into a
| brick wall doing anything with multiple processes or threads and
| juggling state. In the end I just gave up and wrote a bunch of
| logic that splits the input into files, processes the files
| with separate processes, waits for that to finish, and then
| combines the output files. Just a lot of silly boilerplate and
| abusing the file system for sharing state. It does what it
| needs to but it feels a bit primitive and backwards and I'm not
| proud of the solution.
|
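| For reference, a bare-bones multiprocessing.Pool sketch of the
| shape I was going for (geocode_one is just a hypothetical
| stand-in for the real per-record lookup):
|
|     from multiprocessing import Pool
|
|     def geocode_one(record):
|         # placeholder for the real lookup
|         return record.upper()
|
|     if __name__ == "__main__":
|         records = ["1 Main St", "2 High St", "3 Park Ave"]
|         # each worker process gets a slice of the input;
|         # arguments and results are pickled across the
|         # process boundary
|         with Pool(processes=4) as pool:
|             results = pool.map(geocode_one, records)
|         print(results)
|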
| Removing the GIL, adding some structured concurrency, and maybe
| some other features, would make python a lot more capable for
| data processing. And since a lot of people already
| use python for that sort of thing, I don't think that would be
| such a bad thing. Data science and data processing are the core
| use case for python. I don't think people actually care a lot
| about the raw python performance. It's never been that great to
| begin with. If it's performance critical, it's mostly being
| done via native libraries already.
| mixmastamyk wrote:
| Is this io-bound or cpu-bound? Hard to tell from your one
| word description, "geocode". Is that local or a network call?
|
| If you've broken up the input already I'd use the shell to
| parallelize, i.e. a for loop with &. If network, async is
| probably what you want.
| robertlagrant wrote:
| This is an interesting writeup. Could you go for
| asyncio.gather[0] or TaskGroups[1] these days? Or would
| that not help?
|
| [0] https://docs.python.org/3/library/asyncio-
| task.html#asyncio....
|
| [1] https://docs.python.org/3/library/asyncio-
| task.html#asyncio....
| nologic01 wrote:
| > Data science and data processing are the core use case for
| python
|
| Indeed. One would almost hope that all the different aspects
| of "performance" and "concurrency", their memory, disk or
| network profile etc get their own dedicated labels. The
| conflation of these distinct dimensions is a major source of
| confusion (and thus a waste of bandwidth).
| nmstoker wrote:
| I do hope the dialogue stays cordial, constructive and open
| rather than becoming distinct entrenched camps - the Python
| community has a strong and mature community spirit so this
| seems plausible and not too much wishful thinking.
|
| Much as No GIL would be an adventure, I'm leaning towards the
| more gradual and stable changes from the FasterPython team and
| I can see that throwing No GIL into the mix adds complexity at
| an inopportune moment.
| RandyRanderson wrote:
| Many projects start out in Python b/c often new libs are python-
| first. Many of those run into performance issues and eventually
| determine that Python will never be fast b/c of the GIL.
|
| I think it's very magnanimous of the python team, by not removing
| the GIL, to give Go, Java and C++ a chance.
| titzer wrote:
| Building a whole new interpreter and accompanying compiler tiers
| is a _lot_ of work, still in 2023. Many different projects have
| tried to make this easier: to provide reusable components, to
| offer toolkits, or another VM to build on top of. But it seems
| that none of these really apply; we're still in "every language
| implementation is a special snowflake" territory. That's partly
| the case in Python just because the codebase has accumulated,
| or rather nucleated, around a core C interpreter from 30 years ago.
|
| IMHO the Python community has intentionally torpedoed all
| competing implementations of Python besides CPython to its own
| detriment. They seem to constantly make the task of making a
| compatible VM harder and harder, on purpose. The task they face
| now is basically building a new, sophisticated VM inside a very
| much non-sophisticated VM with a ton of legacy baggage. A massive
| task.
| PeterStuer wrote:
| Just one question: Do you really want Python to be held back by
| the GIL 10 years from now? If not, when do you want to start the
| change? If so, what do you think a CPU will look like in 10
| years, and how would a GIL Python facilitate getting value from
| it?
| kzrdude wrote:
| You don't want python to still be reeling from a grueling gil
| ecosystem split, 10 years from now.
| henrydark wrote:
| Looks cool. This caught my eye
|
| > C++ and Java developers expect to be able to run a program at
| full speed (or very close to it) under a debugger.
|
| I haven't worked in Java too much, but in C++ I don't remember
| having this expectation
| mm007emko wrote:
| Java programs might not run in a debugger very well either;
| it depends on where and how you place breakpoints.
|
| However, I'd be glad if profilers (and notably memory profilers)
| would slow down a Python program only as much as valgrind does
| C.
| The_Colonel wrote:
| I noticed that java with a connected debugger can be very
| fast even with breakpoints, but stepping over can be very
| slow. Which is a bit weird, since "step over" is basically
| just putting the breakpoint on the next line.
| Chabsff wrote:
| Think of it more as: It's still possible, painful as it may
| sometimes be, to debug optimised C++ code running at full
| speed.
| gyrovagueGeist wrote:
| Can someone give a good argument of why subinterpreters are an
| interesting or useful solution to concurrency in Python? It seems
| like all the drawbacks of multiprocessing (high memory use, slow
| serialized object copying) with little benefit and higher
| complexity for the user.
|
| The nogil effort seems like such a better solution, that even if
| it breaks the C interface, subinterpreters aren't worth
| considering.
| bsder wrote:
| Pure uninformed speculation follows ...
|
| I suspect sub-interpreters are a punt and a feint.
|
| My guess is that there will likely be exactly _2_ sub-
| interpreters in most Python code. One which talks to an old C
| API with a GIL and one which talks to a new C API without a
| GIL.
|
| It's going to be a _lot_ easier to manage handing objects
| between two Python sub-interpreters than to manage handing
| objects between two incompatible ABIs.
| nhumrich wrote:
| Sub interpreters are still faster than processes. With them you
| could do continuation message passing which other modern
| languages, such as go, use for multi-threading. Also, for cases
| such as web apps or data science training, you don't need to
| share memory between threads, and a sub interpreter uses a lot
| less resources than a full python process.
|
| > Even if it breaks the c interface
|
| Then most of your python packages wouldn't work. A python that
| isn't backwards compatible? Ya, that has been tried once
| before, and was a disaster. If you want a non-backwards
| compatible gil-less python, it already exists. You can find
| versions of nogil online.
| bratao wrote:
| I understand that sub-interpreters can perform better on object
| copying as they share the same memory. But yeah, nogil looks
| like the correct way.
| globular-toast wrote:
| Minor nitpick: Python can perfectly well do concurrency. What
| you mean is parallel execution (multi-threading).
| celeritascelery wrote:
| > Can someone give a good argument of why subinterpreters are
| an interesting or useful solution to concurrency in Python?
|
| I will give it a shot.
|
| Subinterpreters are better than multiple processes because:
|
| - they have significantly less memory overhead
|
| - they can move objects much faster between subinterpreters
| because they don't need to serialize through a format like JSON
|
| - since they are all in the same process you can implement
| things like atomics or channels easily.
|
| Subinterpreters are better than no-GIL because:
|
| - they make the code easier to reason about and debug relative
| to raw multi-threading
|
| - they don't negatively impact single threaded (basically all
| existing) python code performance
|
| - they don't require any changes to the C interface, preventing
| a fractured ecosystem
|
| - they can't have data races
| kmod wrote:
| Couple corrections:
|
| - They absolutely do have to serialize, usually via pickle.
| I'm pretty sure objects are not sharable between
| subinterpreters and there is not a plan for that. The main
| reason people think subinterpreters are good ("you can just
| share the memory!") is not actually true.
|
| - They don't require any changes to the C interface because
| those changes were already made, and a fair amount of cost
| was paid by C library maintainers. So it's true,
| subinterpreters are at an advantage in this regard, but
| that's more of a political question than a technical one
| kzrdude wrote:
| Eric Snow mentioned in his Pycon talk that memory sharing
| would be used, especially big data blobs, arrays etc. Sure,
| not directly sending python objects, but passing pointers
| can be done.
| Spivak wrote:
| > - they don't negatively impact single threaded (basically
| all existing) python code performance
|
| I think this deserves an extra callout because _even your
| multi-threaded Python programs are effectively single
| threaded and benefiting from the performance gain_.
| [deleted]
| sergiomattei wrote:
| > they can't have data races
|
| How so? Asking out of curiosity.
| celeritascelery wrote:
| Because they don't share memory. The C interpreters share
| memory, so they could have data races, but the python code
| can't. Just like how the C interpreter can have memory
| unsafety but python can't (or shouldn't).
| ptx wrote:
| > _they can move objects much faster between subinterpreters
| because they don't need to serialize through a format like
| JSON_
|
| That would be a huge advantage, but it's not there yet.
| According to PEP 554 [1] the only mechanism for sharing data
| is sending bytes through OS pipes, which is exactly the same
| as for multiprocessing and requires the same sort of
| serialization.
|
| [1] https://peps.python.org/pep-0554/#api-for-sharing-data
| semiquaver wrote:
| Is the overhead of pickle, e.g. as used in
| multiprocessing.Pipe(), actually a limiting factor in most
| circumstances?
|
| https://docs.python.org/3/library/multiprocessing.html#mult
| i...
|
| In a message passing system there's always going to need to
| be some form of serialization. I'll wager that pickle is
| fast and flexible enough for most cases and for those that
| aren't, using something like flatbuffers or capn proto in
| shared memory wouldn't be too much of a lift to integrate.
|
| Although all of that has long been possible in a multiple-
| process architecture, so I'm also curious to know if there
| are any real advantages to subinterpreters. From this
| message [1] linked to from the PEP it sounds like the
| author once thought that object sharing was a possibility,
| but if it's not there seem to be no real benefits over
| multiprocessing and one big downside (the GIL).
|
| Contrast with ruby's Ractor system [2], which is similar to
| the subinterpreter concept but allows true parallelism
| within a single process by giving each ractor its own
| interpreter lock, along with a system for marking an object
| as immutable so it can be shared among ractors.
|
| [1] https://mail.python.org/pipermail/python-
| ideas/2017-Septembe...
|
| [2] https://github.com/ruby/ruby/blob/master/doc/ractor.md
| spacechild1 wrote:
| > - they can move objects much faster between subinterpreters
| because they don't need to serialize through a format like
| JSON
|
| Why do you think that you would need to serialize to JSON?
| Pipes and sockets can deal with binary data just fine. With
| shared memory, there wouldn't be any difference at all.
|
| > - since they are all in the same process you can implement
| things like atomics or channels easily.
|
| This is also possible with shared memory.
|
| AFAICT the advantages of subinterpreters over subprocesses
| are:
|
| - lower memory overhead
|
| - faster creation/destruction time
|
| - ability to share global data (with subprocesses the data
| would either need to be duplicated or live in shared memory)
| celeritascelery wrote:
| Sure, you _could_ do something like that. But a shared
| memory segment python with a stable binary object format
| doesn't exist (and isn't even being worked on). Comparing
| the proposed PEP 554 solution to a non-existent theoretical
| solution isn't very useful.
|
| But you do bring up some good points for ways you could
| achieve similar goals without the need to make the
| interpreters thread safe.
| spacechild1 wrote:
| > But a shared memory segment python with a stable binary
| object format doesn't exist
|
| There is multiprocessing.Queue (https://docs.python.org/3
| /library/multiprocessing.html#multi...).
|
| I don't know if it uses shared memory, or rather sockets
| or pipes, but this is just an implementation detail.
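|
| A minimal usage sketch; the dict below is pickled on its way
| into the Queue and unpickled on the other side:
|
|     from multiprocessing import Process, Queue
|
|     def producer(q):
|         # pickled before crossing the process boundary
|         q.put({"msg": "hello", "n": 42})
|
|     if __name__ == "__main__":
|         q = Queue()
|         p = Process(target=producer, args=(q,))
|         p.start()
|         print(q.get())  # unpickled back into an ordinary dict
|         p.join()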
|
| My point is that there is no fundamental difference
| between isolated interpreters and processes when it comes
| to data sharing. Either way, you need a (binary)
| serialization format and some thread/process-safe queue.
|
| I would have naively assumed that you could repurpose
| multiprocessing.Queue for passing data between multiple
| interpreters; you would just need to replace the
| underlying communication mechanism (sockets, pipes,
| shared memory, whatever it is) with a queue + mutex. But
| then again, I'm not familiar with the actual code base.
| If there are any complications that I didn't take into
| account, I would be curious to hear about them.
|
| Interestingly, the PEP authors currently don't propose an
| API for exchanging data and instead suggest using raw
| pipes:
|
| > https://peps.python.org/pep-0554/#api-for-sharing-data
|
| Of course, this is just a temporary hack. It would be
| ridiculous to use actual pipes for sharing data within
| the same process...
| samus wrote:
| Easy interop with C is one of the core selling points of
| Python. It's not just about performance - it's about Python
| being able to be a glue language that can interface with any
| legacy or system library written in C or being accessible via
| FFI. A LOT of systems exist that take advantage of this, and
| changes to the C interface impact all of these. Unless the
| porting story is thought out extraordinarily well, it would be
| another disaster like Python 2->3. Since this affects C code,
| it won't be anything but tricky.
| gyrovagueGeist wrote:
| From reading the nogil repo and related PEP, C ABI breaking
| does not seem to be the worst problem. An updated Cython,
| Swig, etc seems like it would be enough in most cases to get
| something running. More extreme cases might only need a ~dozen
| LoC of changes to replace the GIL Ensure/Release patterns.
|
| The hidden really hard problem is that extension modules may
| have been written relying on GIL behavior for thread
| safety... these may be undocumented and unmaintained.
|
| Even so I hope the community decides it is worth it. A glue
| language with actual MT support would be much more useful.
| samus wrote:
| Indeed, thanks for clarifying that. I actually had the
| extensions in mind when I wrote my comment, but got somehow
| distracted by the C interop aspect. Extensions indeed have
| a more intimate relation with the interpreter and (directly
| or indirectly) rely on the GIL and its associated
| semantics.
| bkovacev wrote:
| How would this (or any of the recent changes) affect popular web
| frameworks like Django / Flask / FastAPI? Would this increase
| performance in serialization or speed in general?
| retskrad wrote:
| Speaking of Python, as a beginner I have tried to grasp Classes
| and Objects and watched countless YouTube videos and
| Reddit/StackOverflow comments to understand them better but I'm
| just banging my head against the wall and it's not sticking. The
| whole __init__ method, the concept of self and why it's used,
| instance and class variables, arguments, all of it. When learning
| something, when you simply cannot grasp a concept however hard
| you try, what's the course of action? I have tried taking breaks
| but that's not helping.
| abecedarius wrote:
| First, do you feel you understand OO in another language? So
| this is about how Python does it?
| retskrad wrote:
| I'm a beginner and Python is the only language I'm familiar
| with
| abecedarius wrote:
| OK. It might help to keep in mind a couple things:
|
| - Python started as a simple language, but it's grown over
| the decades, and professional programmers learned the new
| complications along the way. They don't naturally
| appreciate what it's like if you get hit with them all
| together, as sometimes happens -- it all seems simple to
| _them_. So to understand OO I would try to screen out the
| fancier bits like the class variables you mentioned -- you
| can come back to them later.
|
| - The original central idea of objects (from Smalltalk) is,
| an object is a thing that can receive different commands
| ('methods' in Python), and update its own variables and
| call on other objects. The way Python gives you to define
| objects (by defining a class and then creating an instance
| of the class) is not the most direct possible way it
| could've been designed to do this -- if it feels more
| complicated than necessary, it kind of is. But it's not too
| bad, you can get used to how it works for the most central
| stuff as I mentioned, and learn more from there.
| switch007 wrote:
| Classes and objects exist in all object-oriented programming
| languages. Perhaps stepping back from Python and trying some
| more generic material might help?
| retskrad wrote:
| That's a good idea, I'll try that
| IAmGraydon wrote:
| As everyone else said, use it. Don't try to understand by
| reading. Understand by doing.
| kzrdude wrote:
| I think working with it in practice is the only way forward.
| Look up info, build something, work more, and let it become
| a feedback loop. People work with systems for a long time
| before they "get" them.
|
| Programming is not a "school" activity for me. It's a craft,
| you do stuff. That has eventually led to a depth of knowledge
| in various topics inside programming.
|
| (Digression: With that said - we should think of a lot more
| "school" stuff as skills and crafts - stuff that you don't read
| to understand but you get better at with practice. A lot of
| maths class is skills, not knowledge.)
| retskrad wrote:
| Good advice, thanks
| montecarl wrote:
| Others have said something similar, but I will still chime in.
|
| It sounds like you may want to feel like you have a complete
| understanding of classes, objects, and the way they work before
| you can begin working with them. This can almost never work. I
| haven't come across a topic where just reading about the topic
| is sufficient to fully understand it or even become proficient
| in it.
|
| In my experience this is true for cooking, lab techniques in
| chemistry, electronics, and programming. Even when I have read
| something and felt that I understood it completely, as soon as
| I begin the activity I immediately realize that I had a
| fundamental misunderstanding of what I had read. That my brain
| had made some oversimplification or skipped past some details
| so that I felt like I grasped the concept.
|
| So if I had to describe the way I learn a new concept it would
| look like this:
|
| 1. Read many different descriptions/watch many different videos
| of the topic to get used to the terms and concepts
|
| 2. Try to apply those concepts in real life (write software,
| build a circuit, cook a meal)
|
| 3. Figure out where my understanding fell short, and go back to
| step 1
|
| You want to make this loop as tight as possible. When you
| become expert at something you can do this activity super fast.
| When you first start out you may have to do quite a bit of
| reading to even have the baseline needed to attempt a concrete
| task. However, it is important to get started towards a real
| goal as soon as possible or you will be wasting your time
| feeling like you are learning and moving towards your goal, but
| you are not.
| njharman wrote:
| > When learning something, when you simply cannot grasp a
| concept however hard you try, what's the course of action?
|
| Learn by doing, don't learn by studying.
|
| Pick a (hopefully real) problem, solve it using classes.
| Repeat. Continue repeating. You will either learn what OO is
| good for, what it is not good for and how you can use it to
| write better code. Or, learn that programming is not something
| you can grok.
|
| If you don't have a "real" problem, then write a program to play
| tic-tac-toe: manage and display the board, take player inputs,
| detect when the game is done (winner or draw).
|
| Then expand it to an 8x8 board, then to 3 players. Those changes
| should be easy with good OO design (lots of rewrites, code
| changes). And probably "harder" with bad OO or no OO design.
|
| btw this is the 2nd interview question my team and I used for
| years.
| harrelchris wrote:
| Either keep looking for explanations until it clicks, or try to
| break it down into more fundamental elements.
|
| Classes and blueprints have a simple analogy that I'm sure you
| are familiar with.
|
| Blueprints are instructions for how to build something, like a
| house. Once built, the house is a physical thing you can
| interact with. It has attributes, such as a height or color. It
| has things it can do or that can be done to it, like open a
| door or turn on some lights.
|
| Classes are blueprints - they tell a computer how to build
| something. An object is what is built by the class - it has
| attributes and methods.
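|
| A tiny sketch of the analogy in code, if it helps: __init__ is
| the "build" step, and self is the particular house being built
| or used.
|
|     class House:
|         def __init__(self, color, height):
|             # runs when a house is "built" from the blueprint;
|             # self is that particular house, so each one keeps
|             # its own data
|             self.color = color        # instance variables
|             self.height = height
|             self.lights_on = False
|
|         def turn_on_lights(self):     # something the house can do
|             self.lights_on = True
|
|     home = House("blue", 8)            # build an object from the class
|     home.turn_on_lights()
|     print(home.color, home.lights_on)  # blue True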
| retskrad wrote:
| Thanks, I appreciate it
| ozzy6009 wrote:
| I was the same way when I was a beginner, I didn't really "get"
| python classes until years later and using other programming
| languages. I recommend SICP if you want a more first principles
| understanding: https://web.mit.edu/6.001/6.037/sicp.pdf
| retskrad wrote:
| Thanks
| mixmastamyk wrote:
| Go to school, with a good teacher. Yt videos are not a
| substitute for a proper course.
| IshKebab wrote:
| I disagree on the school part. There are _so_ many good
| resources available especially for Python programming. There's
| no way you need to pay for lessons. I don't think the kind
| of people you get teaching Python will be the most
| knowledgeable anyway.
|
| But I would maybe recommend a good book if you're struggling.
| mixmastamyk wrote:
| Community college is still a thing, not expensive, and
| won't make the mistake of teaching specifically Python. It
| also forces a schedule which is helpful in its own right.
|
| What you just proposed is exactly what _isn't_ working for
| this person.
| Zizizizz wrote:
| Try out these Corey Schafer videos, I remember watching them 6
| years ago and them being quite helpful.
| https://youtu.be/ZDa-Z5JzLYM
| retskrad wrote:
| Thanks
| WoodenChair wrote:
| Build a project where you purposely use a bunch of your own
| custom classes. Learning by doing is best if the reading
| materials are not clicking.
| tasubotadas wrote:
| > PEP 669 - Low Impact Monitoring for CPython
|
| Finally, 15 years later Python might also get something similar
| to VisualVM.
| jonnycomputer wrote:
| I'm going to admit that what I really want to see is a strong
| push to standardize and fully incorporate package management and
| distribution into python's core. Despite the work done on it,
| it's still a mess as far as I can see, and there is no single
| source of truth (that I know of) on how to do it.
|
| For that matter, pip can't even search for packages any more, and
| instead directs you to browse the pypi website in a browser.
| Whatever the technical reasons for that, it's a user interface
| fail. Conda can do it!!!!! (as well as just about any package
| management system I've ever used)
| globular-toast wrote:
| There it is. The obligatory comment on every Python thread on
| HN. It's the most popular programming language in the world. Other
| people can figure it out, apparently.
| jonnycomputer wrote:
| I love python. It is my go-to language for just about
| everything. But that also means that I feel the pain points
| pretty acutely. And you know what, I'm not alone.
| jonnycomputer wrote:
| Oh look: https://news.ycombinator.com/vote?id=36343991&how=up
| &auth=85...
| woodruffw wrote:
| I think we can be more charitable than this: it's possible to
| be both immensely popular and to have a sub-par packaging
| experience that users put up with. That's where Python is.
| globular-toast wrote:
| The trouble is people compare it to greenfield languages of
| the past few years with nowhere near the scope, userbase or
| legacy of Python. Long time Python users like me don't have
| any of the problems that the non-Python users that always
| post these comments have. It would be nice to have
| improvements to packaging, sure, but it's always just
| completely non-constructive stuff like "it's not as easy as
| <brand new language with no legacy>".
| jonnycomputer wrote:
| Big assumptions here.
| jshen wrote:
| Java and Ruby both have much better dependency management
| experiences and both have been around for far longer than
| a few years.
| [deleted]
| antod wrote:
| As someone who dealt with Java and Python 20yrs back, I
| don't think Java is a valid comparison.
|
| Java had a terrible or non-existent OS integration story
| - it didn't even try to have OS native stuff. It was its
| own separate island that worked best when you stayed on
| the island. On Linux, Python was included in the OS so
| you had the two worlds of distro packaging and
| application development/deployment dependencies already
| in conflict. Macs also shipped their own Python that you
| had to avoid messing up. And on Windows Python was also
| trying to support the standard "download a setup.exe"
| method for library distribution. Java only ever had the
| developer dependency usecase to think about.
|
| Before Maven most Java apps just manually vendored all
| their dependencies into their codebase, or you manually
| wrangled assembling stuff in place using application
| specific classpaths and additions to path env vars etc.
| jshen wrote:
| Today, java has much better dependency management than
| python. Nearly all popular languages do.
| woodruffw wrote:
| I agree with all of this! Ironically, grievances around
| Python packaging are a function of Python's overwhelming
| success; non-constructive complaints about the packaging
| experience reflect otherwise reasonable assumptions about
| how painless it _should_ be, given how nice everything
| else is.
|
| (This doesn't make them constructive, but it does make
| them understandable.)
| kzrdude wrote:
| Packaging is a big topic right now, and a lot is happening -
| that includes a lot of good tool improvements. I think that's
| one reason for these comments, because it's close to top of
| mind
| asylteltine wrote:
| Python's dependency management, or lack thereof, its import
| system, and the lack of strong typing really make me hate it.
| It's the first language I really felt adept with, but once I
| learned Go, I never looked back. Every time I have to use
| python it's like coding with crayons.
| pnt12 wrote:
| I think type hints help a lot. A codebase with classes and
| type hints reads much better than one using ad-hoc
| dictionaries for every data structure.
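|
| A small sketch of the difference, if it helps:
|
|     from dataclasses import dataclass
|
|     # ad-hoc dict: nothing says which keys or types to expect
|     user = {"name": "Ada", "age": 36}
|
|     @dataclass
|     class User:
|         name: str
|         age: int
|
|     typed_user = User(name="Ada", age=36)
|     print(typed_user.age + 1)  # readers and tools know this is an int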
|
| But on the other hand I don't really like Go, so maybe it's
| different languages for different tastes.
| wendyshu wrote:
| What aspect of python's type system do you find insufficient?
| Aperocky wrote:
| Premature optimization is the root of all evil.
|
| Python isn't perfect, and mostly unsuitable for any system
| where performance is in consideration.
|
| That leaves everything else including personal utility
| scripts and packages I use each day to automate random stuff.
| And I hugely appreciate how fast and simple it is to develop
| in python, unlike certain languages that literally depend on
| IDEs due to the verbosity and unnecessary cognitive load.
| switch007 wrote:
| > And I hugely appreciate how fast and simple it is to
| develop in python
|
| Indeed it's crazy. A few pip installs and I had a
| multiprocessing pandas (dask) with a web gui, and a
| workflow system (also with a web gui), and a pipeline to
| convert csv to parquet in like 20 lines of code
| woodruffw wrote:
| > I'm going to admit that what I really want to see is a strong
| push to standardize and fully incorporate package management
| and distribution into python's core. Despite the work done on
| it, it's still a mess as far as I can see, and there is no
| single source of truth (that I know of) on how to do it.
|
| Package management is standardized in a series of PEPs[1]. Some
| of those PEPs are living documents that have versions
| maintained under the PyPA packaging specifications[2].
|
| The Python Packaging User Guide[3] is, for most things, the
| canonical reference for how to do package distribution in
| Python. It's also maintained by the PyPA.
|
| (I happen to agree, even with all of this, that Python
| packaging is a bit of a mess. But it's a _much better defined_
| mess than it was even 5 years ago, and initiatives to bring
| packaging into the core need to address ~20 years of packaging
| debt.)
|
| [1]: https://peps.python.org/topic/packaging/
|
| [2]:
| https://packaging.python.org/en/latest/specifications/index....
|
| [3]: https://packaging.python.org/en/latest/flow/
| jshen wrote:
| Is there anything in there about managing dependencies within
| a python project? What is the canonical way to do that in
| python today?
| woodruffw wrote:
| It depends (unfortunately) on what you mean by a Python
| project:
|
| * If you mean a thing that's ultimately meant to be `pip`
| installable, then you should use `pyproject.toml` with PEP
| 518 standard metadata. That includes the dependencies for
| your project; the PyPUG linked above should have an example
| of that.
|
| * If you mean a thing that's meant to be _deployed_ with a
| bunch of Python dependencies, then `requirements.txt` is
| probably still your best bet.
| jshen wrote:
| I meant the second. requirements.txt is a really bad
| solution for that, and that is the frustration many of us
| have that have used languages with much better solutions.
| starlevel003 wrote:
| > * If you mean a thing that's meant to be deployed with
| a bunch of Python dependencies, then `requirements.txt`
| is probably still your best bet.
|
| This is exactly how we got in this mess. Using
| ``setup.cfg`` or ``pyproject.toml`` for _all_ projects
| makes this easy as now your deployable project can be
| installed via pip like every other one.
|
| 1. ``python -m virtualenv .``
|
| 2. ``source ./bin/activate.fish``
|
| 3. ``pip install -U
| https://my.program.com/path/to/tarball.tar.xz``
| jonnycomputer wrote:
| which is why I said, "Despite the work done on it"
| woodruffw wrote:
| Yes, that was meant more for the "source of truth" part.
| jonnycomputer wrote:
| No, I do appreciate you taking the time to do it (I was
| too lazy)
| aranke wrote:
| I gave a talk about this at the Packaging Summit during Pycon
| which was well received, so the team is definitely aware of the
| problem.
|
| However, the sense I got was that it was going to be a lot of
| work to "fix Python packaging" which wasn't feasible with an
| all-volunteer group.
|
| At work, we're migrating away from pip as a distribution
| mechanism for this reason; I don't expect to see meaningful
| improvements to the developer experience anytime soon.
|
| This is especially true because pip today is roughly where npm
| was in 2015, so there's a lot of fundamental infrastructure
| work (including security) that still needs to happen. An
| example of this is that PyPI just got the ability to namespace
| packages.
| LordKeren wrote:
| > we're migrating away from pip as a distribution mechanism
| for this reason
|
| Could you elaborate on what you're using as a replacement?
| pnt12 wrote:
| Not the parent but pipenv is decent, poetry is even better:
|
| - clear separation of dev and production dependencies
|
| - lock file with the current version of all dependencies for
| reproducible builds (this is slightly different from the
| dependency specification)
|
| - no accidental global installs because you forgot to activate
| a virtual environment
|
| - (not sure if supported by pip) allows installing libraries
| directly from a git repo, which is very useful if you have
| internal libraries
|
| - easier updates
| di wrote:
| > An example of this is that PyPI just got the ability to
| namespace packages.
|
| You're thinking of organizations, which are not namespaces:
| https://blog.pypi.org/posts/2023-04-23-introducing-pypi-
| orga...
| bandrami wrote:
| It's beyond "mess" well into "fiasco" and frankly I'm astounded
| people think there's a more important issue facing the language
| right now. Look, for an example of a high-prestige project, at
| Spleeter, which spends multiple pages of its wiki describing
| how to install it with Conda and then summarizes "Note: as of
| 2021 we no longer recommend using Conda to install Spleeter"
| and nothing else.
| noitpmeder wrote:
| What are you smoking? The readme for spleeter clearly shows
| the two simple commands needed to install -- one being a
| conda install for system level dependencies and one being pip
| for the spleeter python package itself.
| dboreham wrote:
| Just add some PR spin like JS did and declare the GIL a feature.
___________________________________________________________________
(page generated 2023-06-15 23:00 UTC)