[HN Gopher] Our Plan for Python 3.13
       ___________________________________________________________________
        
       Our Plan for Python 3.13
        
       Author : bratao
       Score  : 438 points
       Date   : 2023-06-15 12:55 UTC (10 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | orbisvicis wrote:
        | How many multithreaded programs depend on the GIL rather
        | than using proper locking? The No GIL project might break a
        | lot of multithreaded apps.
        
         | gyrovagueGeist wrote:
         | It would require no changes from pure Python apps (as the nogil
         | implementation preserves the current behavior & thread safety
         | of python objects).
         | 
         | But any C extensions that rely on the GIL for thread safety
         | would be an issue..
        
           | selimnairb wrote:
           | They must've explored this, but couldn't one make a GIL
           | emulator/wrapper for legacy extensions so that they can still
           | work in GIL-less Python while they are (hopefully) updated to
           | work with the new synchronization primitives?
        
       | pmoriarty wrote:
        | Wish python would sort out its packaging mess and let there
        | be one and only one (obvious) way to do it.
        
       | Alifatisk wrote:
        | The Ruby team approached parallelism & multicore in a cool
        | way: they introduced the concept of the Actor model (like
        | Erlang)!
        
       | sproketboy wrote:
       | [dead]
        
       | kortex wrote:
       | Every single HN thread on python performance, ever:
       | 
       | Person with limited to zero experience with CPython internals> I
       | hate the GIL, why don't they _just remove it_?
       | 
        | That _just_ is doing an incredible amount of heavy lifting.
        | It'd be like saying, "Why doesn't the USA _just_ switch
        | entirely to the metric system?" It's a huge ask, and after
        | being burned by the 2/3 transition, the python team is loath
        | to rock the boat too much again.
       | 
       | The GIL is deep, arguably one of the deepest abstractions in the
       | codebase, up there with PyObject itself. Think about having to
       | redefine the String implementation of a codebase in your language
       | of choice.
       | 
        | Whatever your Monday-morning-quarterback idea for pulling
        | out the GIL is, I can almost guarantee someone has thought
        | about it, probably implemented it, and determined it will
        | have one or more side effects:
       | 
       | - Reduced single-thread performance
       | 
       | - Breaking ABI/FFI compatibility
       | 
       | - Not breaking ABI immediately, but massively introducing the
       | risk of hard to track concurrency bugs that didn't exist before,
       | because GIL
       | 
       | - Creating a maintenance nightmare by adding tons of complexity,
       | switches, conditionals, etc to the codebase
       | 
       | The team has already decided those tradeoffs are not really
       | justifiable.
       | 
       | The GILectomy project has probably come the closest, but it
       | impacts single-thread performance by introducing a mutex (there
       | are some reference type tricks [1] to mitigate the hit, but it
        | still hurts run time by 10-50%), and necessitates that any
        | extension libraries update their code in a strongly
        | API-breaking way.
       | 
       | It's possible that over time, there are improvements which
       | simultaneously improve performance or maintainability, while also
       | lessening the pain of a future GILectomy, e.g. anything which
       | reduces the reliance on checking reference counts.
       | 
        | [1] PEP 683 (immortal objects) is probably a prerequisite
        | for any future GILectomy attempt; it looks like it has been
        | accepted, which is great: https://peps.python.org/pep-0683/
        
         | arp242 wrote:
         | There were already patches to "just" remove the GIL for Python
         | 2.0 or thereabouts, over 20 years ago. Strictly technically
         | speaking it's not even that hard, but as you mentioned it comes
         | with all sorts of trade-offs.
         | 
          | Today Python is one of the world's most popular
          | programming languages, if not _the_ most popular. The GIL
          | is
         | limiting and a trade-off in itself of course - designing any
         | programming language is an exercise in trade-offs. But clearly
         | Python has gotten quite far with the set of trade-offs it has
          | chosen: you need to be careful before you "just" radically
          | change a set of trade-offs that has proven itself to be
          | quite successful.
        
         | Groxx wrote:
         | Removing the GIL will also make some currently-correct threaded
         | code into incorrect code, since the GIL gives some convenient
         | default safety that I keep seeing people accidentally rely on
         | without knowing it exists.
         | 
          | Unavoidable performance costs _and_ reduced safety equal
          | massive friction no matter how it's approached. It's not a
         | question of "are we smart enough to make this better in every
         | way", since the answer is "no, and nobody can be, because it
         | can't be done". It's only "will we make this divisive change".
        
         | [deleted]
        
         | aldanor wrote:
          | Come to think of it, perhaps the US should have started
          | metrification back in 1980 when the EU enforced it for its
          | members. It takes time, but it is doable - it took Ireland
          | 25 years to fully switch to metric.
         | 
         | The cost of switching increases over time and will increase
         | further and further, making it less and less likely. Would have
         | been way cheaper back then.
        
           | dragonwriter wrote:
           | > To think about it, perhaps US should have started
           | metrification back in 1980
           | 
           | The US started that in either 1875 or 1893, though, and
           | somewhat more aggressively in 1975.
           | 
           | It reversed course in the 1980s, it didn't fail to start
           | before that.
        
         | kmod wrote:
         | You should check out the new nogil project by Sam Gross, which
         | is what's being talked about these days -- he actually
         | successfully removed the gil, but yes with the tradeoffs that
         | you mention. The other projects were, by comparison, "attempts"
         | to remove the gil, and didn't address core issues such as
         | ownership races (which are far harder than making refcount
         | operations atomic).
        
         | soulbadguy wrote:
         | > That just is doing an incredible amount of heavy lifting.
          | It'd be like saying, "Why doesn't the USA just switch
          | entirely to the metric system?" It's a huge ask, and after
          | being burned by the 2/3 transition, the python team is
          | loath to rock the boat too much again.
         | 
         | > The GIL is deep, arguably one of the deepest abstractions in
         | the codebase, up there with PyObject itself. Think about having
         | to redefine the String implementation of a codebase in your
         | language of choice.
         | 
          | I am somewhat familiar with the cpython code base, and
          | have talked to some folks involved in some of the newer
          | python runtimes.
          | 
          | The problem is not that deep, and cpython is not that much
          | different from other projects: the GIL is an
          | implementation detail that leaked through the API, and now
          | we have a bunch of people relying on it. The question is
          | what we do about it. The trade-offs in such situations are
          | well known; it is just a question of picking the right
          | one.
         | 
         | > The team has already decided those tradeoffs are not really
         | justifiable.
         | 
          | That is, from my view, the crux of the issue. The TLDR is
          | that GIL-less python is not a priority for the python
          | team, so they view any trade-off/compromise as "not
          | justifiable" - especially when it comes to added
          | complexity in the code base or what not...
         | 
         | > Person with limited to zero experience with CPython
         | internals> I hate the GIL, why don't they just remove it?
         | 
          | People have this reaction because having a single-threaded
          | interpreter in 2023 is just ... embarrassing, whatever the
          | reasons behind it.
        
       | milliams wrote:
       | Notably yesterday Guido posted an update on the Faster Python
       | plans w.r.t. the no-GIL plans
       | (https://discuss.python.org/t/pep-703-making-the-global-inter...)
       | specifically:                 Our ultimate goal is to integrate a
       | JIT into CPython
        
         | selimnairb wrote:
         | Yeah, reading CPython 3.13 plans made it seem like they were
         | backing their way into creating a JIT compiler.
        
       | [deleted]
        
       | cellularmitosis wrote:
        | > Experiments on the register machine are complete
       | 
       | This is a worthwhile watch on this topic:
       | https://youtu.be/cMMAGIefZuM
        
         | kzrdude wrote:
         | faster-cpython team has done a lot of work to experiment on it:
         | https://github.com/faster-cpython/ideas/issues/485#issuecomm...
         | 
         | I wonder if the plan written there is current.
        
       | samsquire wrote:
       | This is extremely exciting stuff. Thank you for Python and
       | working on improving its performance.
       | 
       | I use Python threads for some IO heavy work with Popen and C and
       | Java for my true multithreading work.
        
       | klooney wrote:
       | This is exciting. Feels a little like JavaScript 10 years ago.
        
         | andai wrote:
         | You mean 15? ;)
         | https://en.wikipedia.org/wiki/V8_(JavaScript_engine)
        
       | phkahler wrote:
        | From the PEP on subinterpreters:
       | 
       | >> In this case, multiple interpreter support provide a novel
       | concurrency model focused on isolated threads of execution.
       | Furthermore, they provide an opportunity for changes in CPython
       | that will allow simultaneous use of multiple CPU cores (currently
       | prevented by the GIL-see PEP 684).
       | 
        | This whole thing seems more like something developers want
        | to do (hey, it's novel!) than something users want. To the
        | extent they
       | do want it, removing the GIL is probably preferable IMHO. That a
       | global lock is a sacred cow in 2023 seems strange to me.
       | 
       | Maybe I'm misunderstanding, but I don't want an API to manage
       | interpreters any more than I want one for managing my CPU cores.
       | Just get me unhindered threads ;-)
        
       | [deleted]
        
       | samwillis wrote:
       | The "Faster Python" team are doing a fantastic job, it's
       | incredible to see the progress they are making.
       | 
       | However, there is also a bit of a struggle going on between them
       | and the project to remove the GIL (global interpreter lock) from
       | CPython. There is going to be a performance impact on single
       | threaded code if the "no GIL" project is merged, something in the
       | region of 10%. It seems that the faster Python devs are pushing
       | back against that as it impacts their own goals. Their argument
        | is that the "sub interpreters" they are adding (each with
        | its own GIL) will fulfil the same use cases as multithreaded
        | code without a GIL, but they still have the overhead of
        | encoding and passing data, in the same way you have to with
        | subprocesses.
       | 
       | There is also the argument that it could "divide the community"
       | as some C extensions may not be ported to the new ABI that the no
        | GIL project will result in. Again, though, I'm unconvinced
        | by that: the Python community has been through worse (Python
        | 3), and even asyncio completely divides the community now.
       | 
       | It's somewhat unfortunate that this internal battle is happening,
       | both projects are incredible and will push the language forward.
       | 
        | Once the GIL has been removed, it opens up all sorts of
        | interesting opportunities for new concurrency APIs that
        | could make concurrent code much easier to write.
       | 
        | My observation is that the Faster Python team are better
        | placed politically: they have GVR on the team, whereas No
        | GIL is being
       | proposed by an "outsider". It just smells a little of NIH
       | syndrome.
        
         | vb-8448 wrote:
         | > There is going to be a performance impact on single threaded
         | code if the "no GIL" project is merged, something in the region
         | of 10%.
         | 
         | 10% doesn't look too much to me, I still don't get why today
         | people care so much about single thread performance.
        
           | crabbone wrote:
           | CPython is so hopelessly slow, I wouldn't care about 10%. For
           | most of the stuff written in Python, users don't really care
           | about speed.
           | 
            | The impact won't be on users / Python programmers who
            | don't develop native extensions. It will suck for people
            | who already had a painful workaround for Python's crappy
            | parallelism, and will now have to have two workarounds
            | for different kinds of brokenness. It still pays off to
            | write these native extensions, but their authors will
            | create a lot of problems for the clueless majority of
            | Python users, which will likely end up in some magical
            | "workarounds", for problems introduced by this change,
            | that very few people understand. This will result in
            | more cargo cult in a community that's already on a witch
            | hunt.
        
           | samwillis wrote:
           | Exactly, and the thing is, the Faster Python project will
           | completely surpass that 10% performance change.
        
             | _kulang wrote:
             | Sometimes people see it as losing 10% performance that you
             | never get back
        
           | awill wrote:
           | because a lot of code is single-threaded
        
           | celeritascelery wrote:
           | Single threaded performance is still more important than
           | multi-threaded. Most applications are single threaded, and
           | single threaded programs are much easier to write and debug.
           | Removing the GIL from python will not change that.
           | 
           | If no-GIL has a 10% single thread performance hit, that means
           | that essentially all my existing python code would be that
           | much worse.
        
             | vb-8448 wrote:
             | > If no-GIL has a 10% single thread performance hit, that
             | means that essentially all my existing python code would be
             | that much worse.
             | 
              | Maybe for 100% CPU-bound code, but most code is I/O
              | bound and no one will notice the change - just my
              | opinion.
        
               | celeritascelery wrote:
               | Maybe, but if your code is I/O bound, then multi-
               | threading isn't going to help you either.
        
               | wongarsu wrote:
               | Maybe that's just my bubble, but I see much more python
               | in data science projects than in web servers. And in
               | (python) data science even your file reading/writing code
               | quickly gets CPU bound.
        
               | coldtea wrote:
               | Not in pure Python, that's in specialized libs, like
               | numpy, pandas and co, done in C.
               | 
               | So, the hit on the Python interpreter wouldn't translate
               | to a hit on those.
        
               | regularfry wrote:
               | That's going to be CPU-bound in numpy's C extensions
               | rather than Python itself, one would hope. The worst of
               | all worlds is that we get a 10% perf cut to python
               | execution _and_ numpy breaks because the C API is ripped
               | up.
        
               | baq wrote:
               | There was a time like 5-10 years ago where Python was
                | really popular for grassroots web projects. Nowadays
                | it looks like this is mostly Node.
        
               | gpderetta wrote:
               | That's because you are doing it wrong. You'll need to
               | split every step of your data science pipeline into a
               | microservice, then put it in the cloud for resilience.
               | Then the application will be so fast that it is no longer
               | CPU bound but I/O bound.
        
               | umanwizard wrote:
               | But the GIL doesn't need to be held in I/O-bound code
               | anyway, so why does it matter?
        
               | burnished wrote:
               | It might be more helpful to think of it in terms of
               | supported use cases, rather than just pure volume.
        
             | coldtea wrote:
             | > _If no-GIL has a 10% single thread performance hit, that
             | means that essentially all my existing python code would be
             | that much worse._
             | 
              | So? Especially since the "Faster Python" team already
              | made Python 3.11 "10-60% faster than 3.10", and 3.12
              | is even faster still, whereas their overall plan is to
              | get it to 2-5 times faster compared to 3.9.
             | 
             | So at the worst case, with a 10% hit, you'd balance out the
             | 3.11 speed, and your code would be as fast as 3.10.
        
               | shpx wrote:
               | But your software would still run 10% slower than it
               | needs to. Single threaded code is like 99% of all code
               | written.
        
               | gpderetta wrote:
               | that's not an argument either, as your software is
               | already 10000% slower than it needs to be as you have
               | written it in python.
        
               | [deleted]
        
               | coldtea wrote:
               | > _But your software would still run 10% slower than it
               | needs to_
               | 
               | There's no absolute objective "needs to" or even any
                | static baseline. Python can have, and often has had,
                | a performance regression that drops your code's
                | speed by 10% at any time. It's no big deal in
                | itself.
               | 
               | Also consider a further speedup of e.g. 50% in upcoming
               | versions (they have promised more).
               | 
               | If you're OK with the X speed of today's Python, you
               | should be ok with X + 40% - even if it's not the X + 50%
               | it could have been due to the 10% GIL's removal toll.
        
             | hospitalJail wrote:
              | Are your current python programs slow in a way that
              | matters?
             | 
              | Why haven't you implemented multithreading?
             | 
             | (don't get me wrong, I know the cost of implementation, but
             | if speed matters, multithreading is a very reasonable step
             | in python)
        
               | birdyrooster wrote:
                | Because of the GIL, and also because you have to use
                | tons of locks to get around the lack of thread
                | safety for Python's objects.
        
               | [deleted]
        
               | Spivak wrote:
               | > Why haven't you implemented multi-threading?
               | 
               | Because that makes programs slower in Python.
               | 
                | Multi-threading in Python is for when you need
                | time-slicing for CPU intensive tasks so that they
                | don't block other work that needs to be done.
        
               | coldtea wrote:
                | His point remains, he just phrased it badly: why
                | haven't you implemented a multiprocessing pool?
        
               | gpderetta wrote:
               | Because not everything is trivially parallelizable and
               | multiprocess makes it harder to share data?
        
               | coldtea wrote:
               | > _Because not everything is trivially parallelizable_
               | 
               | A lot of things are though...
        
               | munch117 wrote:
               | Because it's a global solution to a local problem.
               | 
               | With threads, I can encapsulate the use of threads in a
               | class, whose clients never even notice that threads are
               | in use. Sure, threads are a global resource too, but much
               | of the time you can get away with pretending that they're
               | not and create them on demand. Not so with
               | _multiprocess_. If you use that, then the whole program
               | has to be onboard with it.
               | 
               | Threads work great in Python. Well not for maximising
               | multicore performance, of course, but for other things,
               | for structuring programs they're great. Just shuttle work
               | items and results back and forth using queue.Queue, and
               | you're golden - Python threads are super reliable. And if
               | the threads are doing mostly GIL-releasing stuff, then
               | even multicore performance can be good.
        
               | coldtea wrote:
               | > _Not so with multiprocess. If you use that, then the
               | whole program has to be onboard with it_
               | 
                | Huh? In Python you just need a function to call, and
                | multiprocessing will run it in a process from the
                | pool, while API-wise it looks just as it would with
                | a thread pool (though with no sharing in the process
                | case, obviously).
               | 
               | So what would the rest of the program be onboard with?
               | 
               | And all this could also be hidden inside some subpackage
               | within your package, the rest of the program doesn't need
               | to know anything, except to collect the results.
        
               | munch117 wrote:
                | multiprocessing needs to run copies of your program
                | that are sufficiently initialised that they can
                | execute the function, without running any
                | initialisation code that should only run once.
               | 
               | That means you either use fork - which is a major can of
               | worms for a reusable library to use.
               | 
                | Or you write something like this in your entry point
                | module:
                | 
                |     if __name__ == '__main__':
                |         multiprocessing.freeze_support()
                |         once_only_application_code()
               | 
                | Suppose I don't realise that your library is using
                | multiprocessing, and I carelessly call it from this
                | two-line script:
                | 
                |     import library_that_uses_multiprocessing_internally
                |     library_that_uses_multiprocessing_internally.do_stuff()
               | 
               | That's basically a fork bomb.
               | 
               | And where do you put the multiprocessing.set_start_method
               | call? Surely not in the library.
        
               | Spivak wrote:
               | > Sure, threads are a global resource too, but much of
               | the time you can get away with pretending that they're
               | not and create them on demand.
               | 
               | I think you would love Trio and applying the idea to
               | threads.
        
               | munch117 wrote:
               | Applying which idea? async does not appeal to me, if
               | that's what you mean.
        
           | david422 wrote:
           | And if you're really concerned about speed, Python is not the
           | language to choose.
        
             | burnished wrote:
             | Every program in any language has the potential to be
             | concerned about speed. This cute maxim is ultimately a
             | punchline, not really a serious point.
        
           | ahoho wrote:
           | Right, and single thread performance won't matter as much if
           | it becomes easier to implement multithreading. This hurts
           | legacy code, but I imagine it would be worth it in the long
           | run.
        
             | bombolo wrote:
                | It would remain as hard as it has always been. Also,
                | threads are very heavy, locking kills performance,
                | and if you don't have the GIL you'll need to manage
                | explicit locks, which will be just as slow but will
                | also cause an incredible amount of subtle bugs.
        
               | pdonis wrote:
               | _> if you don 't have GIL, you'll need to manage explicit
               | locks_
               | 
               | You need to do that with multithreaded Python code _with_
               | the GIL. The GIL only guarantees that operations that
               | take a single bytecode are thread-safe. But many common
               | operations (including built-in operators, functions, and
               | methods on built-in types) take more than one bytecode.
        
               | saltminer wrote:
               | > locking kills performance, and if you don't have GIL,
               | you'll need to manage explicit locks
               | 
               | I was under the impression that the Python thread
               | scheduler is dependent on the host OS (rather than being
               | intelligent enough to magically schedule away race
               | conditions, deadlocks, etc.), so you still need to manage
               | locks, semaphores, etc. if you write multi-threaded
               | Python code. I don't see how removing the GIL would make
               | this any worse. (Maybe make it slightly harder to debug,
               | but at that point it would be in-line with debugging
               | multi-threaded Java/C/etc. code.)
               | 
               | Or would this affect single-threaded code somehow?
        
               | bombolo wrote:
               | In python you always have a lock, the GIL. If you take it
               | away you end up actually having to do synchronization for
               | real. Which is hard and error prone.
        
               | [deleted]
        
           | hospitalJail wrote:
           | >I still don't get why today people care so much about single
           | thread performance.
           | 
           | For about 10 minutes a few years ago, when the M1 had the
           | best single threaded performance per buck, people cared.
           | 
            | Now that the M1 isn't the leader in single threaded, we are
           | back to the 'multithread is most important'.
           | 
           | Which has always been true. If your program needs an
           | improvement in speed, you can multithread it. The opposite
            | isn't true.
        
             | insanitybit wrote:
             | You can improve performance by moving to a single thread.
             | Pinning work to a single core will improve cache
             | performance, avoid overhead of flushing TLBs and other
             | process specific kernel structures, and more.
        
             | Joker_vD wrote:
             | > If your program needs an improvement in speed, you can
             | multithread it. The opposite isnt true.
             | 
             | What do you mean by "the opposite"? "If your program
             | doesn't need an improvement in speed, you can't multithread
             | it"? "If you can multithread your program, then it doesn't
             | need an improvement in speed"? Well, yeah, obviously both
             | of those statements are false but they're also quite
             | useless, so who cares?
        
               | hoosieree wrote:
               | Add negative threads to fix a program that runs too fast?
        
               | Joker_vD wrote:
               | Well, having too _many_ threads can slow down a program
               | as well (extra context switches, extra synchronization)
               | so... no idea.
        
             | holoduke wrote:
             | Not all algorithms can be chunked up. Single thread
             | performance is and will always be important.
        
         | gpderetta wrote:
         | If the GIL were an optional interpreter parameter you could
         | spawn GILed subinterpreters and GILless subinterpreters
         | according to your needs.
        
         | meepmorp wrote:
         | > There is also the argument that it could "divide the
         | community" as some C extensions may not be ported to the new
         | ABI that the no GIL project will result in. However again I'm
         | unconvinced by that, the Python community has been through
         | worse (Python 3) and even asyncIO completely divides the
         | community now.
         | 
         | I think the fact that you can name two other recent things
         | which have divided the community is a solid argument for being
         | a least a little gunshy about making big, breaking changes.
         | There's the cost of the changes themselves, but there's also a
         | cost to the language as a whole to add yet-another-upheaval.
         | 
         | Performance is important, but not breaking things is also
         | important. I can understand the appeal of doing something
         | suboptimal (but better than current) in favor of not
         | introducing a bunch of harder to predict side effects, both in
         | code and the community.
        
           | deschutes wrote:
            | Do sub interpreters actually work with C extensions? I
            | get that the extension API has long supported it.
            | However, I wonder if in practice extensions rely on
            | process-global state to stash information.
           | 
           | If so, sub interpreters invite all kinds of nasty bugs. Keep
           | in mind that porting the most popular extensions is an easy
           | exercise so the more interesting question is how this hidden
           | majority of extensions fares.
        
         | RobotToaster wrote:
         | Why not just try to make multiprocessing easier?
        
         | bastawhiz wrote:
         | > Their argument is that the "sub interpreters" they are adding
         | (each with its own GIL) will fulfil the same use cases of
         | multithreaded code without a GIL, but they still have the
         | overhead of encoding and passing data in the same way you have
         | to with subprocesses.
         | 
         | This is smart, though, because (even if it's not great) there's
         | a lot of evidence that it works in practice. Specifically, this
         | is almost exactly what JavaScript does with workers. It's not a
         | great API and it's cumbersome to write code for, but it got
         | implemented successfully and people use it successfully (and it
         | didn't slow down the whole web).
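The data-passing overhead both comments describe can be seen with today's stdlib tools. This is a minimal illustrative sketch (the function and names are our own, not from any PEP): `concurrent.futures` process pools are the closest existing Python analogue to JS workers.

```python
# Sketch of the data-passing overhead described above: a process pool
# pickles arguments and results to move them between interpreters,
# much as JS structured-clones data between workers.
from concurrent.futures import ProcessPoolExecutor

def square(n):
    # Runs in a worker process: `n` arrives pickled and the result is
    # pickled on the way back - that serialization is the overhead.
    return n * n

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as pool:
        print(list(pool.map(square, range(5))))  # [0, 1, 4, 9, 16]
```

Per-GIL sub-interpreters would presumably keep this same "copy, don't share" shape, just with cheaper channels than full processes.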
        
         | dist-epoch wrote:
         | As someone who observed Python core development for many years,
         | a major change to the interpreter REQUIRES core-dev buy in.
         | There have been at least 5 big projects which proposed large
         | changes, they have all been declined.
         | 
         | It is NIH syndrome: if a big project doesn't originate in the
         | dev team, it will not be accepted.
        
         | AlphaSite wrote:
         | Nogil would give far larger returns and I wish they'd focus on
         | that. That's the best way to a faster python.
        
         | btilly wrote:
         | My point of view is that anyone who wants to write
         | multithreaded code shouldn't be trusted to. Making it easier
         | for people to justify this kind of footgun is a problem.
         | 
         | Also, no matter how much you wish it otherwise, retrofitting
         | concurrency on an existing project guarantees that you'll wind
         | up with subtle concurrency bugs. You might not encounter them
         | often, and they're hard to spot, but they'll be there.
         | 
         | Furthermore existing libraries that expect to be single-
         | threaded are now a potential source of concurrency bugs. And
         | there is no particular reason to expect the authors of said
         | libraries to have either the skills or interest to track those
         | bugs down and fix them. Nor do I expect that multi-threaded
         | enthusiasts who use those libraries in unwise ways will
         | recognize the potential problems. Particularly not in a dynamic
         | language like Python that doesn't have great tool support for
         | tracking down such bugs in an automated way.
         | 
         | As a result if "no GIL" ever gets merged, I expect that the
         | whole Python ecosystem will get much worse as well. But that's
         | no skin off of my back - I've learned plenty of languages. I
         | can simply learn one that hasn't (yet) made this category of
         | mistake.
        
           | iskander wrote:
           | >My point of view is that anyone who wants to write
           | multithreaded code, shouldn't be trusted to. Making it easier
           | for people to justify this kind of footgun is a problem.
           | 
           | Out of curiosity, have you done any Rust programming and used
           | Rayon?
           | 
           | It's hard to convey how easy and impactful multi-threading
           | can be if properly enclosed in a safe abstraction.
        
             | btilly wrote:
             | I have only read about and played a tiny bit with Rust. But
             | as I noted at
             | https://news.ycombinator.com/item?id=36342081, I see it as
             | fundamentally different from the way people want to add
             | multithreading to Python. People want to lock code in
             | Python. But Rust locks data with its compile-time checked
             | ownership model.
             | 
             | See https://blog.rust-lang.org/2015/04/10/Fearless-
             | Concurrency.h... for more.
        
             | kortex wrote:
             | Python is dynamic AF and Rust's whole shtick is compile-
             | time safety. Python was built from the ground up to be
             | dynamic and "easy", Rust was meticulously designed to be
             | strict and use types to enforce constraints.
             | 
             | It's hard to convey how difficult it would be to retrofit
             | python to be able to truly "enclose multithreading in a
             | safe abstraction".
        
           | samsquire wrote:
           | My deep interest is multithreaded code. For a software
           | engineer working on business software, I'm not sure if they
           | should be spending too much time debugging multithreaded bugs
           | because they are operating at the wrong level of abstraction
           | from my perspective for business operations.
           | 
           | I'm looking for an approach to writing concurrent code with
           | parallelism that is elegant and easy to understand and hard
           | to introduce bugs. This requires alternative programming
           | approaches and in my perspective, alternative notations.
           | 
           | One such design uses monotonic state machines which can only
           | move in one direction. I've designed a syntax and written a
           | parser and very toy runtime for the notation.
           | 
           | https://github.com/samsquire/ideas5#56-stateful-circle-
           | progr...
           | 
           | https://github.com/samsquire/ideas4#558-state-machine-
           | formul...
           | 
           | The idea is inspired by LMAX Disruptor and queuing systems.
        
             | btilly wrote:
             | And your approach can be built into a system that does
             | multi-threading away from Python, thereby achieving
             | parallelism without requiring that Python supports it as
             | well.
             | 
             | That's basically what all machine learning code written in
             | Python does. It calls out to libraries that can themselves
             | parallelize, use the GPU, etc. And then gets the answer
             | back. You get parallelism without any special Python
             | support.
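A minimal sketch of this "call out to C" pattern using only the stdlib: CPython's hashlib releases the GIL while hashing buffers larger than roughly 2 KiB, so plain threads get real overlap on multiple cores even though all the orchestration stays in Python.

```python
# Minimal sketch of the pattern above: Python threads handing heavy
# work to a C library that drops the GIL while it runs.
import hashlib
import threading

data = [bytes([i]) * 1_000_000 for i in range(4)]
results = [None] * len(data)

def digest(i):
    # The SHA-256 C code runs with the GIL released for a buffer this big.
    results[i] = hashlib.sha256(data[i]).hexdigest()

threads = [threading.Thread(target=digest, args=(i,)) for i in range(len(data))]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Same answers as a sequential run; only the wall-clock time differs.
assert results == [hashlib.sha256(d).hexdigest() for d in data]
```

numpy, ML frameworks, zlib, and the like follow the same shape at much larger scale.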
        
             | zo1 wrote:
             | Just to add a bit of my opinion after reading your comment
             | in the context of this thread, and not on the merits of
             | your idea: you are precisely the type of person I'd keep
             | very, very far away from multithreading in any business
             | software project, and it's also why I advocate for the GIL
             | to stay. If you want to do that, go solo in your own time,
             | or try to apply for a research position at some giant tech
             | co.
        
           | coldtea wrote:
           | > _My point of view is that anyone who wants to write
           | multithreaded code, shouldn 't be trusted to. Making it
           | easier for people to justify this kind of footgun is a
           | problem._
           | 
           | It's 2023 already. 1988 called.
        
             | btilly wrote:
             | Did you know that Coverity actively REMOVED checks for
             | concurrency bugs?
             | 
             | It turns out that when the programmer doesn't understand
             | what the tool says, managers believe the programmer and
             | throw out the tool. Coverity was finding itself in
             | situations where they were finding real bugs, and being
             | punished for it by losing the sale. So they removed the
             | checks for those bugs.
             | 
             | I'll revisit my opinion of multithreaded code when things
             | like that stop happening. In the meantime there are models
             | of how to run code on multiple threads that work well
             | enough with different primitives. See Erlang, Go, and Rust
             | for three of them. Also, if you squint sideways,
             | microservices. (Though most people set up microservices in
             | a way that makes debugging problematic. Topic for another
             | day.)
        
               | biorach wrote:
               | > Did you know that Coverity actively REMOVED checks for
               | concurrency bugs?
               | 
               | Source?
        
               | btilly wrote:
               | https://cseweb.ucsd.edu/~dstefan/cse227-spring20/papers/b
               | ess...
        
               | yjftsjthsd-h wrote:
               | Specifically, on the last page of that:
               | 
               | > As an example, for many years we gave up on checkers
               | that flagged concurrency errors; while finding such
               | errors was not too difficult, explaining them to many
               | users was.
               | 
               | (And thanks; I was also wondering about that)
        
             | yjftsjthsd-h wrote:
             | This is very witty and all, but what does it _mean_?
        
               | DonHopkins wrote:
               | I think he was subtly Rick Rolling you.
               | 
               | https://en.wikipedia.org/wiki/Never_Gonna_Give_You_Up
               | 
               | On 12 March 1988, "Never Gonna Give You Up" reached
               | number one in the American Billboard Hot 100 chart after
               | having been played by resident DJ, Larry Levan, at the
               | Paradise Garage in 1987. The single topped the charts in
               | 25 countries worldwide.
        
               | coldtea wrote:
               | Actually it just meant that this is a tired old argument
               | from when C/C++ programmers were new to multithreaded
               | code.
               | 
               | Now we have languages and language facilities (consider
               | Rust, Haskell, and others) that make it much safer. Same
               | with green threads and what Go and now Java do.
        
         | p5a0u9l wrote:
         | Removing the GIL should be seen as a last option.
        
         | dekhn wrote:
          | The GIL will never be removed from the main Python
          | implementation. Historically, the main value of GIL-removal
          | proposals and implementations has been to spur the core team
          | to speed up single-core code.
         | 
          | I think it's too late to consider removing the GIL from the
          | main implementation. Like Guido said in the PEP thread, the
          | Python core team burned the community for 10 years with the
          | 2-3 switch, and a GIL change would likely be as impactful;
          | we'd have 10 years of people complaining their stuff didn't
          | work. Frankly I wish Guido would just come out and tell Sam
          | "no, we can't put this in CPython. You did great work, but
          | compatibility issues trump performance."
         | 
          | Kind of a shame, because Hugunin implemented a Python on top
          | of the CLR some 20 years ago and showed some extremely
          | impressive performance results. Like Jython, PyPy, and other
          | implementations, it never caught on, because compatibility
          | with CPython is one of the most important criteria for people
          | dealing with lots of Python code.
        
           | lacker wrote:
           | _I think it 's too late to consider removing the gil from the
           | main implementation._
           | 
           | I think it'll happen one day. Is Python going anywhere?
           | 
           | Give it 20 years. The 2 -> 3 switch will be like the Y2K bug,
           | only remembered by the oldest programmers. The memories of
           | pain will fade, leaving only entertaining war stories. The
           | GIL will still be there, and still be annoying.
           | 
           | Then, when everyone has forgotten, the community will be
           | ready. For an incredibly long and grinding transition to
           | Python 4.
        
             | ActorNightly wrote:
              | In 20 years, you won't have Python 4. You'll have
              | something like ChatGPT that you interact with, and it
              | writes code for you, down to machine-level instructions
              | that are all hyperoptimized. Coding will be half typing,
              | half verbal.
        
               | jwandborg wrote:
               | Imagine debugging hyperoptimized machine code. - Or would
               | you just blame yourself for not stating your natural
               | language instructions clearly enough and start over? I
               | guess all of these complex problems would somehow be
               | solved for everyone within the next 21 years and 364
               | days.
        
               | ActorNightly wrote:
               | You wouldn't debug it directly. The interaction will be
               | something like telling the compiler to run the program
               | for a certain input, and seeing wrong output, and then
               | saying, "Hey, it should produce this output instead".
               | 
               | The algorithm will be smart enough to simply regenerate
               | the code minimally to produce the correct output.
        
               | m4rtink wrote:
               | Good luck with that.
        
               | ilc wrote:
               | I think you have described a new definition of hell.
        
               | dumpsterdiver wrote:
               | Yeah, when people start running into weird failure modes
               | with no insight into the code... sounds like a nightmare.
        
               | timacles wrote:
                | While titillating to think about, what you're describing
                | is more than 20 years away for AI. The best it can do
                | today is handle generalized problems; actually writing
                | new, original, complex code is not within the bounds of
                | current AI capabilities.
        
           | zackees wrote:
           | I agree with this take.
           | 
            | The GIL is hiding all kinds of concurrency bugs. If the
            | CPython team disables it by default, then all hell is going
            | to break loose.
           | 
           | It's better to carve out special concurrency constructs for
           | those that need it.
        
             | nomel wrote:
              | Not sure why this was flagged dead. If you look at many
              | Stack Overflow answers around _threading_, many
              | _explicitly_ rely on the GIL, quoting source/documentation
              | "proving" that it's safe, avoiding the use of
              | threading.Lock() and the like.
             | 
              | As an early Python programmer, I copy-pasted these types
              | of answers. My old threaded code _absolutely_ has these
              | bugs. I've even seen code in production that has these
              | bugs, because they're sometimes dumb performance
              | enhancements rather than bugs, unless you happen to use a
              | Python interpreter without a GIL, especially one that
              | doesn't exist yet.
             | 
             | Great care would have to be taken to make sure the GIL was
             | _not_ disabled by default, for anything an existing thread
             | touches (or some super, dynamic aware, smarts to know if it
             | can be disabled).
        
               | tomn wrote:
               | I believe that GIL-removal projects aim to preserve this
               | behaviour:
               | 
               | https://peps.python.org/pep-0703/#container-thread-safety
               | 
               | That's why gilectomy carries an unreasonable single-
               | threaded performance cost: many operations now need to
               | take a lock where before they relied on the GIL.
        
               | umanwizard wrote:
               | Why do you consider code that relies on the GIL to be
               | buggy? Isn't the GIL a documented, stable part of Python?
               | (Hence why it will probably never be removed).
        
               | miraculixx wrote:
               | The GIL protects interpreter resources, not your program.
               | If you have concurrent access to your own objects you
               | need your own locks.
        
           | londons_explore wrote:
           | > compatibility issues trump performance".
           | 
            | Surely it would be possible to drop back to GIL mode if any
            | C extension is loaded which is incompatible with running
            | lockless?
           | 
           | Then you usually get speed, yet still maintain compatibility.
        
             | mlyle wrote:
             | > if any c extension is loaded which is incompatible with
             | running lockless?
             | 
             | It'd be slightly magical to do. You'd probably have a giant
             | RWLock, which is used for checking whether you're in the
             | "lock free" mode. But at least almost all code could hit it
             | only for read, and it could go away one day.
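One way that "giant RWLock" could be sketched with stdlib primitives. The `ModeLock` class and its method names are invented for illustration, not a real CPython mechanism: ordinary threads take the cheap read side, while loading a GIL-requiring extension takes the exclusive write side and stalls everyone.

```python
import threading

class ModeLock:
    """Toy reader-writer lock for the idea above: many concurrent
    'lock-free mode' readers, or one exclusive writer (e.g. a thread
    entering a legacy C extension)."""

    def __init__(self):
        self._readers = 0
        self._cond = threading.Condition()

    def acquire_read(self):
        with self._cond:
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        self._cond.acquire()       # held exclusively until release_write()
        while self._readers:
            self._cond.wait()      # wait for active readers to drain

    def release_write(self):
        self._cond.release()

# Tiny demo: the writer cannot proceed while a reader is active.
m = ModeLock()
m.acquire_read()
entered = threading.Event()

def writer():
    m.acquire_write()
    entered.set()
    m.release_write()

t = threading.Thread(target=writer)
t.start()
assert not entered.wait(0.1)  # writer is stalled behind the reader
m.release_read()
t.join()
assert entered.is_set()
```

The attraction is exactly as described: the read side is a cheap counter bump on the hot path, and the write side (and eventually the whole mechanism) could disappear once legacy extensions are ported.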
        
           | giantrobot wrote:
           | > Like jython, and pypy and other implementations, it never
           | caught on because compatibility with cpython is one of the
           | most important criteria for people dealing with lots of
           | python code.
           | 
            | I think this is more an issue of popular packages being
            | developed and tested against CPython, and there not being
            | enough effort available to port/test them against anything
            | else. There's no special magic in CPython that Python
            | programmers love; they just want their code to run. If
            | they've got a numpy dependency (IIRC it doesn't support
            | PyPy, but I'm not going to look it up, so I may be
            | corrected on that point; this is a long parenthetical) they
            | can't use an interpreter that doesn't support it. Even if
            | it worked but had bugs it didn't have in CPython, they're
            | still going to use CPython. Most people aren't writing
            | Python for its super duper fast performance, so they're
            | fine leaving a little performance on the table by using the
            | interpreter that their dependencies support. Whatever that
            | is.
        
             | dehrmann wrote:
             | Jython doesn't have a good story for running native
             | libraries like Numpy.
        
               | giantrobot wrote:
               | Sure, I didn't claim it did. My point is that _Python_
               | programmers don 't tend to have a particular fondness for
               | an interpreter. They tend to care only about the
               | ecosystem. If you came up with a cxpython interpreter
               | that was faster than cpython _and_ supported all the
               | modules the same way (including C interop) Python
               | programmers would jump over to it. If your cxpython was
               | faster than cpython but _didn 't_ support everything
               | they'd ignore it.
               | 
               | Case in point: Python 2.7. While 3.x offered a lot of
               | improvements it took years for some popular modules to
               | support it. No one bothered to look twice at Python 3
               | until their dependencies supported it.
               | 
               | Python programmers don't tend to care much about the
               | interpreter so much as the code they wrote or use running
               | correctly.
        
           | takeda wrote:
           | I wouldn't say that GIL will never be removed, but I believe
           | the GIL cannot be removed without breaking a lot of existing
           | code.
           | 
           | That means there could be another drama with migrations if
           | that would be done.
           | 
            | I think the most likely way they can effectively eliminate
            | the GIL would be to provide a compile option that basically
            | says "this enables parallelization, but your code is no
            | longer allowed to do A, B, C" (there would probably be a lot
            | more restrictions).
           | 
           | People who want to get it would then adapt their code, and
           | there could be pressure for other packages to make them work
           | in that mode.
        
           | coldtea wrote:
            | > _the python core team burned the community for 10 years
            | with the 2-3 switch, and a GIL change would be likely as
            | impactful_
           | 
           | A core team led by him, which also had the opportunity to
           | make much more impactful changes during 3, including removing
           | the GIL, since they were going to mess up compatibility
           | anyway, but didn't.
           | 
           | All that mess (and resulting split and multi-year slowdown in
           | Py3 adoption) just to put in the utf-8 change and some
           | trivial changes. It's only after 3.5 or so that 3 became
           | interesting.
        
             | paulddraper wrote:
             | It's hard to argue that Python 3 should have been even less
             | compatible.
        
               | mardifoufs wrote:
               | Agreed. It would have pushed the 2->3 migration from
               | "very painful for the ecosystem" to a full on perl5->6
               | break between the two versions. Not sure it would have
               | survived that.
        
               | coldtea wrote:
                | Is it? It took the 5-year (at least) adoption hit
                | anyway. How much worse would it be if it had more
                | features people want?
               | 
                | I'd say it should have been less compatible where
                | needed to add the more substantial changes people
                | wanted, as opposed to taking the hit for nothing!
                | 
                | And it should have been more compatible for things that
                | were stupid decisions that eventually had to be taken
                | back, like not having a bytes/str solution.
        
               | misnome wrote:
                | > How much worse would it be
               | 
               | raku
        
               | coldtea wrote:
               | Raku failed to timely deliver a stable release. And
               | didn't stick to sensible new features, but tried to make
               | an uber-language with everything plus the kitchen sink,
               | and even a multi-language vm.
        
               | misnome wrote:
               | You are literally asking what could happen if Python 3
               | had done the same.
        
               | paulddraper wrote:
                | > How much worse would it be
               | 
               | 8 years
               | 
               | or never
        
               | eesmith wrote:
               | > if it had more features people want
               | 
               | What people wanted was features to help with the
               | migration.
               | 
                | Yes, of course having _those_ features would have
                | helped.
               | 
               | But doing that required experience the Python developers
               | didn't have when they were doing 3.0!
               | 
               | The Python developers thought people could do a one-off
               | syntactic code translation (2to3), perhaps even at
               | install time, rather than what most people did - write to
               | the common subset of 2 and 3, with helpers like the 'six'
               | package.
               | 
               | What are the "more substantial changes" you propose? The
               | walrus operator? Matching? Other things that Python 3
               | eventually gained, and which took years in some cases to
               | develop?
               | 
                | Or are you proposing something that would have made it
                | more difficult to write to the common subset?
               | 
                | That subset-compatibility necessity extends to the
                | Python/C API. Get rid of global state and you'll need
                | to replace things like:
                | 
                |     PyErr_SetString(PyExc_ValueError,
                |                     "embedded null character");
                | 
                | with something that passes in the current execution
                | state. Make that too hard, and you inhibit migration of
                | existing extension modules, which further inhibits the
                | migration.
        
             | regularfry wrote:
             | I think that's called "learning from one's mistakes". I
             | hope.
        
           | jedberg wrote:
           | > The GIL will never be removed from the main python
           | implementation.
           | 
           | I don't see why. It's a much easier transition than 2 to 3.
           | 
            | Make each package declare right at the top whether it is
            | non-GIL-compatible. Have both modes available in the
            | interpreter. If every piece of code imported has the
            | declaration on top, then it runs in non-GIL mode; otherwise
            | it runs in classic GIL mode.
           | 
           | At first most code would still run in GIL mode, but over time
           | most packages would be converted, especially if people
           | stopped using packages that weren't compatible.
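A hypothetical sketch of that per-package opt-in. The `__nogil_compatible__` attribute and the `interpreter_mode` helper are invented names for illustration; no such mechanism exists in CPython today.

```python
# Hypothetical sketch of the scheme above: run GIL-free only when every
# loaded module has explicitly declared support.
import types

def interpreter_mode(modules):
    """Pick 'nogil' only if every loaded module opts in."""
    if all(getattr(m, "__nogil_compatible__", False) for m in modules):
        return "nogil"
    return "gil"

modern = types.ModuleType("modern")
modern.__nogil_compatible__ = True
legacy = types.ModuleType("legacy")  # legacy package: no declaration

print(interpreter_mode([modern]))          # nogil
print(interpreter_mode([modern, legacy]))  # gil
```

The default falls back to classic GIL mode, so one undeclared legacy import is enough to keep old behavior.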
        
             | Groxx wrote:
             | I don't see why you wouldn't have holdouts using GIL for
             | valid single-threaded performance reasons for
             | years/decades. And that's ignoring legacy code - even 2.7
             | is still alive and kicking in some corners.
        
               | jedberg wrote:
               | I'm sure you would, but anyone who cared about it would
               | work around it, either by not using that library or
               | finding a different one or even forking the one that
               | isn't updated. Just like most people use Python 3.x now,
               | but there are some 2.7 holdouts. But those holdouts
               | aren't holding back the entire ecosystem at this point.
        
             | powersnail wrote:
              | Python is too dynamic a language to have non-GIL
              | compatibility declared simply at the top. Code can be
              | imported/evaled/generated at runtime, at any point in a
              | script, which means that Python would need to be able to
              | switch from non-GIL to GIL at any time during execution.
        
               | phire wrote:
               | That's not an insurmountable problem.
               | 
               | As long as all the data structures stay the same, all you
               | really need to do is flush out all the non-GIL bytecode
               | and load in (or generate) GIL bytecode.
               | 
               | Sure, there might be a stutter when this happens. You
               | will also want a way to either start in GIL mode, or
               | force it to stay in non-GIL mode, throwing errors. But
               | it's a very solvable problem.
        
               | richdougherty wrote:
               | I mean that seems horribly tricky, but also totally
               | doable.
               | 
               | Having read about JavaScript/Java VM optimisations in
               | JITs and GC I would be surprised if a global state change
               | like this is not manageable - think deoptimising the JS
               | when you enter the debugger in the dev tools in your
               | browser.
        
               | blibble wrote:
                | Java-the-bytecode is just as dynamic, yet somehow
                | HotSpot manages to be rather nippy.
        
               | jedberg wrote:
                | It's true that there are dynamic imports, but presumably
                | it would be on the library maintainer to know about
                | that; also, you could throw a catchable error about GIL
                | imports or something like that.
               | 
               | All I'm saying is that it's solvable, and more solvable
               | than 2 to 3.
        
               | samwillis wrote:
                | I completely agree, something like:
                | 
                |     from __future__ import multicore
                | 
                | (The proposal seems to be to call it that rather than
                | "no GIL", as it's more positive.)
               | 
               | And maybe if done in an __init__.py it applies to the
               | whole module.
               | 
               | Do that for v3.x then drop it completely for v4
               | 
               | I think the issue may be that maintaining both systems is
               | too complex.
        
             | dekhn wrote:
             | I'm sorry but I really don't think it's "easy" and your
             | suggestion would be just part of a much, much larger
             | solution.
        
             | aprdm wrote:
             | This is very naive.
        
               | xapata wrote:
               | jedberg doesn't seem like a naive person, glancing at his
               | work history. Perhaps you could explain why you think
               | this opinion is naive?
        
               | dekhn wrote:
               | Even skilled programmers often make the mistake of saying
               | "why don't you just..." or "it's easy, just..." when
               | completely ignoring large important factors, such as
               | following process, ensuring backward compatibility,
               | stakeholder alignment (ugh), and addressing long tail
               | problems.
        
               | Eisenstein wrote:
               | "There are two types of programmers: new ones who don't
               | know how complicated things are, and experienced ones who
               | have forgotten it" -from an article on HN the other day
               | about setting up python envs.
        
               | execveat wrote:
                | There are also programmers who are aware of much more
                | complicated things being done in sister projects, like
                | the JVM and JS runtimes in this case.
        
               | xapata wrote:
               | An appeal to authority? I suppose it'd be better if you
               | named the article's author or linked to it.
        
               | Eisenstein wrote:
               | Not appealing to anyone; I thought it was a clever quote.
               | 
               | * https://www.bitecode.dev/p/why-not-tell-people-to-
               | simply-use
               | 
               | EDIT -- Here is the quote:
               | 
               |  _There are usually two kinds of coders giving advises. A
               | fresh one that has no idea how complex things really are,
               | yet. Or an experienced one, that forgot it._
        
           | Dork1234 wrote:
            | I really think the GIL is saving a bunch of poorly written
            | multithreaded C++ wrappers/libraries out there. If they
            | remove it, a bunch of bugs will appear in other libraries
            | that might not be Python's fault.
        
             | eklitzke wrote:
             | They're not "poorly written", the fact that you don't need
             | to do any locking in C/C++ code is part of the existing
             | Python API. Right now when Python code calls into C/C++
             | code the entire call is treated as if it's a single atomic
             | bytecode instruction. Adding extra locking would just make
             | the code slower and would accomplish absolutely nothing,
             | which is why people don't do it.
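
The atomicity eklitzke describes is visible even from pure Python: under the GIL, a C-implemented operation like list.append never interleaves mid-call, so concurrent appends are never lost. A minimal illustration (not from the thread):

```python
import threading

shared = []

def worker():
    for _ in range(10_000):
        # list.append is implemented in C and runs under the GIL,
        # so each call is effectively atomic; no lock is needed
        # and no append is ever lost.
        shared.append(1)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(shared))  # 40000
```

In a no-GIL build, that guarantee has to come from per-object locking instead, which is exactly where the complexity (and the C-extension breakage) lives.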
        
               | tinus_hn wrote:
               | The interpreter could do locks around these calls
               | automatically to make them atomic, while leaving itself
               | multithreaded.
        
               | KMag wrote:
               | In order for the call into C to appear atomic to a
               | multithreaded interpreter, all threads in the interpreter
               | would need to be blocked during the call. That's possible
               | to do, but you've just re-introduced the GIL whenever any
               | thread is within a C extension.
               | 
               | In the unlocked case, one could use low-overhead tricks
               | used for GC safepoints in some interpreters. One low-
               | overhead technique is a dedicated memory page from which
                | a single byte is read at the beginning of every opcode
               | dispatch, and you mark that page non-readable when you
               | need to freeze all the threads executing in the
               | interpreter. You'd then have the SIGSEGV handler block
               | each faulting thread until the one thread returned from
               | C. That's fairly heavy in the case it's used, but pretty
               | light-weight if not used.
        
             | silverwind wrote:
             | Like in any other language, it's best to avoid non-native
             | dependencies.
        
             | enedil wrote:
              | Nevertheless, this is still a concern for the wider
              | ecosystem, if Python libraries suddenly start to break due
              | to underlying issues. I don't think this can be neglected.
        
         | coldtea wrote:
         | > _There is also the argument that it could "divide the
         | community" as some C extensions may not be ported to the new
         | ABI that the no GIL project will result in_
         | 
          | I think the arguments are a red herring, more
          | rationalizations for not wanting to do it.
        
         | gazpacho wrote:
         | The other thing I don't get is that the whole sub interpreters
         | thing seems to totally break extension modules as well:
         | https://github.com/PyO3/pyo3/issues/2274. In theory parts of
         | sub-interpreters have been around for a while and it just
         | happens that every extension module out there is incompatible
         | with it because no one used it. But if it's going to become the
         | recommended way to do parallelism going forward then they'll
         | have to become compatible with it.
         | 
         | The serialization thing is also a huge issue. Half of the time
         | I want to use multiprocessing I end up finding that the
         | serialization of data is the bottleneck and have to somehow re-
         | architect my code to minimize it.
         | 
         | I would much prefer a world in which asyncio is 2x faster and
         | can benefit from real parallelism across threads. Libraries
         | like anyio already make it super easy to work with async +
         | threads. It would make Python a viable option for workloads
         | where it currently just isn't.
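
The async + threads combination described above is already expressible with just the stdlib; a minimal sketch using asyncio.to_thread (anyio offers a similar, more featureful API). Today the threads only overlap while blocked; with real parallelism they could also overlap on CPU-bound work:

```python
import asyncio
import time

def blocking_work(n):
    # Stand-in for a blocking call (file I/O, a DB driver, a C
    # extension); time.sleep releases the GIL, so threads overlap.
    time.sleep(0.05)
    return n * n

async def main():
    # Each blocking call runs on the default thread pool without
    # blocking the event loop, so the five calls run concurrently.
    return await asyncio.gather(
        *(asyncio.to_thread(blocking_work, i) for i in range(5))
    )

results = asyncio.run(main())
print(results)  # [0, 1, 4, 9, 16]
```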
        
           | rogerbinns wrote:
           | (Disclosure: Author of a Python C extension that wraps SQLite
           | that has 2 decades of development behind it.)
           | 
           | Have a look at the current documentation for writing
           | extensions. This approach is essentially unchanged since
           | Python 2.0.
           | https://docs.python.org/3/extending/newtypes_tutorial.html
           | 
           | In particular note how everything is declared static - ie
           | only one instance of that data item will exist. If there are
           | multiple interpreters then there needs to be one instance per
            | sub-interpreter. That means no more static, and
            | initialisation has to be changed to attach these to the
            | module object, which is then attached to a specific
            | interpreter instance. It also means every location that
            | needed to access a previously static item (which happens
            | often) has to change from a direct reference to new APIs
            | that chase back from objects to get their owning module and
            | then the reference. That is the
           | code churn the PyO3 issue is having to address. One bonus
           | however is that you can then cleanly unload modules.
           | 
           | This may still not be sufficient. For example I wrap SQLite
           | and it has some global items like a logging callback. If my
           | module was loaded into two different sub interpreters and
           | both registered the logging callback, only one would win.
           | These kind of gnarly issues are hard to discover and
           | diagnose.
           | 
           | Removing the GIL also won't magically help. I already release
           | it at every possible opportunity. If it did go away, I would
           | have to reintroduce a lock anyway to prevent concurrency in
           | certain places. And there would have to be more locking
           | around various Python data structures. For example if I am
           | processing items in a list, I'd need a lock to prevent the
           | list changing while processing. Currently the GIL handles
           | that and ensures fewer bugs.
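
A Python-level analogy of the locking described above (hypothetical names; the real issue is in C extension code, where the GIL currently provides this protection for free):

```python
import threading

items = list(range(5))        # shared state: sums to 10
items_lock = threading.Lock()

def process():
    # Without the GIL's coarse protection, an explicit lock is
    # needed so the list cannot change mid-iteration.
    with items_lock:
        return sum(items)

def mutate():
    with items_lock:
        items.append(100)

t = threading.Thread(target=mutate)
t.start()
total = process()
t.join()
# The lock guarantees the sum sees the list either entirely before
# or entirely after the append, never a torn intermediate state.
print(total in (10, 110))  # True
```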
           | 
            | I've also experienced the serialization overhead with
            | multiprocessing. I once made a client's code so much faster
            | that any form of Python concurrency became slower than the
            | single-threaded version because of all the overhead. I had
            | to rearchitect the code to work on batches of data items
            | instead of the far more natural one-at-a-time approach.
           | That finally allowed a performance improvement with
           | multiprocessing.
        
         | samwillis wrote:
         | This just posted on the Python forum is a brilliant rundown of
         | the conflicting "Faster Python" and "No GIL" projects, and a
         | proposal (plus call for funding) for a route forward.
         | 
         | I think everyone would agree that trying to combine both would
         | be ideal!
         | 
         | "A fast, free threading Python"
         | 
         | https://discuss.python.org/t/a-fast-free-threading-python/27...
        
           | dragonwriter wrote:
           | It is from the person most involved in the faster-with-GIL
           | effort, and its recommendation is to prioritize that effort
           | in any case, and if the resources are available for that and
           | no-gil, do both.
           | 
           | Not that I disagree with the recommendation, but one of the
           | sides saying "as long as resources make us choose, choose my
           | side" is...not really surprising.
        
             | samwillis wrote:
             | Not surprising, but I'm very happy they are trying to find
             | a route forward for both. That I commend.
             | 
             | I think from memory "Faster Python" is Microsoft funded and
             | "No GIL" is funded by Facebook. If they can find a way to
             | fund a combined effort that would be good.
             | 
             | I suspect the conflicting funding also adds to the general
             | political difficulty around this.
        
           | jrochkind1 wrote:
            | this is a good read and deserves to be on the homepage with
            | its own thread too!
        
         | zerkten wrote:
         | This feels like it's playing out as I expected. I followed
         | Python, and the Python community, really closely from 2008-2016
         | when there were tons of relatively small scale experiments
         | happening. This all happened organically to a large extent and
         | there was no one coordinating a grand vision. It seems like we
         | have a continuation of this giving rise to the concern that
         | there is some battle.
         | 
         | I suspect there will be some butting of heads for a while
         | before they work things out after seeing how the community
         | reacts. All of this could be handled better with some
         | thoughtful proactive engagement, but that's not really how
         | things operate and there is no one to really enforce it.
        
         | ptx wrote:
          | If we could just get efficient passing of object graphs from
         | one subinterpreter to another, which is not in the current
         | plan, I think that would solve a lot of use cases. That would
         | allow producer/consumer-style processing with multiple cores
         | without the serialization overhead of the multiprocessing
         | module.
         | 
         | Removing the GIL seems like it could make things more
         | complicated in many ways by making lots of currently thread-
         | safe code subtly unsafe, but I might be wrong about this.
         | (...in which case it would just make things very slow because
         | everything is synchronized?)
        
         | bratao wrote:
         | Yeah, they clearly stated that here
         | https://discuss.python.org/t/pep-703-making-the-global-inter...
         | 
          | I really wish the GIL would go away. It is better to pay
          | this price now; multi-threading is the future.
        
           | ActorNightly wrote:
           | >multi-threading is the future.
           | 
           | Yes, with ML powered compilers recognizing what you are
           | trying to do and generating the actual multithreaded code for
           | you.
           | 
           | And it won't be multithreaded code like you know it, in the
           | sense of os specific threading code with context switching
           | and what not. It will be compiled compute graphs targeted at
           | specific ML hardware, likely with static addressing.
        
           | PhilipRoman wrote:
           | >multi-threading is the future
           | 
           | Haha, reminds me of an image I saw with a farmer from some
           | developing country saying "irrigation is the future".
           | 
           | For everyone else multithreading has been the status quo for
           | quite a long time.
        
           | hospitalJail wrote:
           | I'm so conflicted.
           | 
           | ~40 of the programs I am responsible for are single threaded.
           | They were relatively quick to develop and were made by
           | Electrical Engineers rather than career/degreed programmers.
           | 
           | 2 programs use multithreading, I had to do that. The learning
            | curve was not a huge deal, but the development time adds at
            | least hours. In my case, days (due to testing).
           | 
            | I imagine it's too hard to have an optional flag at the start
           | of each program that can let the user decide?
        
             | birdyrooster wrote:
             | The problem is that Python types are not thread safe so you
             | have to jump through more hoops to have safe
              | parallelization in Python. It seems like these changes
              | would make writing multithreaded code much easier.
        
             | coldtea wrote:
             | > _but the development time adds at least hours_
             | 
                | So? Some "hours of development time" is nothing.
        
               | sfink wrote:
               | True, and as you get better at it, those hours will be
               | fewer.
               | 
               | The weeks of debugging never go away, though.
               | 
               | (Or at least, not as long as you're using shared state,
               | but that's really the only thing under consideration
               | here.)
        
               | hospitalJail wrote:
               | I'm mostly with you. I think this probably affects part
               | time/newbie programmers more.
               | 
               | A few hours of dev time is 1 or 2 nights of work.
        
             | maleldil wrote:
              | > I imagine it's too hard to have an optional flag at the
              | start of each program that can let the user decide?
             | 
             | Adding nogil would mean deep changes to the interpreter. I
             | imagine maintaining both versions would be almost like
             | forking the project.
        
             | dragonwriter wrote:
              | > I imagine it's too hard to have an optional flag at the
              | start of each program that can let the user decide?
             | 
             | The actual next-step proposal toward no-gil is GIL as a
             | build-time flag (which isn't quite the same as a runtime
             | flag, but not too far off, either.)
             | 
             | https://peps.python.org/pep-0703/
        
           | sgt wrote:
           | Not for Python, I feel. In sheer volume, the vast majority of
            | my Python programs are single-threaded. I want my programs
            | to be very quick when they run.
           | 
           | Those that are multi-threaded are seeing minor to medium
           | load.
           | 
           | If expecting extreme load (like Twitter scale), then Python
           | is usually not the answer (rather go to a statically typed
           | language like Java, Go, Rust etc).
        
             | gpderetta wrote:
             | > In sheer volume, the vast majority of my Python programs
             | are single-threaded
             | 
              | Obviously, if multithreading is near-useless in Python,
              | very few programs will take advantage of it.
        
               | zo1 wrote:
               | Likewise I can say: Obviously we still have the GIL
               | because very few people want it removed.
        
             | AlphaSite wrote:
              | If single-threaded perf is important, you've already lost
              | by using Python. You're only ever going to get ok-ish
              | performance, or slightly better than ok-ish.
        
             | gcbirzan wrote:
             | > Not for Python, I feel. In sheer volume, the vast
             | majority of my Python programs are single-threaded.
             | 
             | Yes, they are single threaded, because using multiple
             | threads brings very little benefit in most cases...
             | 
             | > If expecting extreme load (like Twitter scale), then
             | Python is usually not the answer (rather go to a statically
             | typed language like Java, Go, Rust etc).
             | 
             | So that means we shouldn't get any performance
             | improvements, because there are faster languages out there?
        
           | amelius wrote:
           | It's relatively simple to make the GIL go away: just compile
           | to some VM that has a good concurrent garbage collector would
           | be one approach. Yes, this will break some assumptions here
           | and there, but not too difficult to overcome especially if
           | you bump the version number to Python 4.
           | 
           | However, that leaves a lot of C code that you can't talk to
           | anymore because the C code requires the old Python FFI. I
           | think this is where the main problem lies.
        
             | coldtea wrote:
              | > _It's relatively simple to make the GIL go away: just
             | compile to some VM that has a good concurrent garbage
             | collector would be one approach. Yes, this will break some
             | assumptions here and there, but not too difficult to
             | overcome especially if you bump the version number to
             | Python 4._
             | 
             | "It's easy to lower the air-conditioning costs of Las
             | Vegas: just move the town to New England".
             | 
              | The problem isn't "how to remove the GIL" in the
              | abstract. It's how to remove the GIL while not impacting
              | extensions at all (or as little as possible), keeping
              | single-threaded performance, and having zero impact on
              | user programs.
             | 
             | To which the above isn't any kind of solution.
        
             | brightball wrote:
             | JRuby is a good path to this in the Ruby world.
        
             | zerkten wrote:
             | >> However, that leaves a lot of C code that you can't talk
             | to anymore because the C code requires the old Python FFI.
             | I think this is where the main problem lies.
             | 
             | This is exactly the problem, but people have a hard time
             | grasping this because most people interacting with Python
             | have no understanding of how C code interacts with Python,
             | or don't understand the C module ecosystem. I'm not sure if
             | the Python community has a good accounting of this either
             | because I don't recall seeing much quantitative analysis of
             | how many modules would need to be updated etc.
             | 
             | This would help compare with the Python 2 to 3 conversion
              | efforts. Even then, the site listing (shaming?) popular
              | modules' compatibility status made a mid-to-late
              | appearance in the process of killing Python 2.
              | Quantification of module updates is an obvious thing to
              | have from the get-go for anyone looking to follow
              | through on removing the GIL, but it's not a fun task.
        
               | amelius wrote:
               | This needs more thinking but how about a hybrid approach,
               | where you have Thread objects, and GILFreeThread objects?
               | 
               | The Thread objects work with old code, but run more
               | slowly.
               | 
               | The GILFreeThread objects are fast.
               | 
               | If an object is passed from a Thread to a GILFreeThread
               | or the other way around, then special safety code is
               | attached to the object so that manipulating the object
               | from the other side doesn't cause issues.
               | 
               | The advantage is that now the module implementers have
               | time to migrate from the old system to the new system.
               | And users can work with both the old modules and
               | "converted" modules in the same system, with minor
               | changes.
        
               | ptx wrote:
               | This sounds a bit like COM and its apartment-threaded vs.
               | free-threaded objects. The "special safety code" in that
               | case is a proxy object that sends messages to the thread
               | that owns the actual object when its methods are invoked.
        
               | kortex wrote:
               | That sounds like a maintenance and stability nightmare,
               | if it's even possible. You are effectively red/blue
               | splitting the entire codebase. PyObject and the GIL touch
               | _everything_ in the codebase.
        
               | amelius wrote:
               | The red/blue splitting happens behind the scenes, so it's
               | different. Not really a color problem, because the user
               | doesn't have to know about it.
               | 
               | But yeah, you will basically have two versions of Python
               | running at the same time, with some (hopefully invisible)
               | translation between them.
        
               | kortex wrote:
               | > But the red/blue splitting happens behind the scenes,
               | so it's different.
               | 
               | Respectfully, I don't believe you have spent any
               | appreciable time looking at the CPython source code. If
               | you had, you would understand how unreasonable this
               | expectation is. I don't say this to tear you down, I say
               | this to convey the magnitude of what you are describing.
               | It would involve touching tens of thousands of LoC. You
               | are talking about a multi-million dollar project that
               | would result in a ton of near-duplication of code.
               | 
               | The red/blue is inescapable because you have to redefine
               | PyObject to have two flavors, PyObject with GIL and
               | GilFreePyObject. You now have to check which one you are
               | dealing with constantly.
        
               | amelius wrote:
               | > You now have to check which one you are dealing with
               | constantly.
               | 
               | No, because if you're running inside a Thread you will
               | know that you will see only PyObjects, whereas if you're
               | running inside a GilFreeThread you will know that you
               | will only see GilFreePyObjects.
               | 
               | If you're manipulating the PyObject (necessarily from a
               | Thread) then there will be behind-the-scenes translation
               | code that will manipulate the corresponding
               | GilFreePyObject for you. But you don't have to know about
               | it.
        
               | kortex wrote:
               | What exactly does "running inside a Thread/GilFreeThread"
               | in the context of the cpython runtime mean? You pretty
               | much need an entire copy of the virtual machine code.
               | 
               | These are C structs we are talking about here, not some
               | Rust trait you can readily parameterize over abstractly.
               | That either means lots of manual code duplication, or
               | some gnarly preprocessor action. Both are a maintenance
               | nightmare.
        
               | amelius wrote:
               | Yes, the assumption is that writing a "double-headed
               | Python" runtime is far less work than converting the
               | entire ecosystem to a new Python runtime.
               | 
               | I think this is the correct view, because at this moment
               | people are writing various approaches in an attempt at
               | getting rid of the GIL. It's the ecosystem of modules
               | that's the real problem, where you want to basically put
               | in as little effort as possible per module, at least
               | initially.
        
               | nomel wrote:
               | Please read any amount of CPython interpreter code to
               | begin to understand what you're asking for "behind the
               | scenes".
        
             | make3 wrote:
             | this would literally break every single python package out
             | there man
        
             | crabbone wrote:
             | [flagged]
        
               | gyrovagueGeist wrote:
               | ...you mean like how the nogil project already has a
               | working Numpy module?
        
               | apgwoz wrote:
               | > Also, there aren't good programmers in Python core dev.
               | 
               | You seem pretty confident that you know what you are
               | doing.
        
               | [deleted]
        
             | notatallshaw wrote:
             | > It's relatively simple to make the GIL go away: just
             | compile to some VM that has a good concurrent garbage
             | collector would be one approach
             | 
             | Sure, if you don't mind paying a 50-90% performance impact
             | on single threaded performance or completely abandon C-API
             | compatibility and have C extensions start from scratch then
             | there are simple approaches.
             | 
              | If you look at any past attempt to remove the GIL, you
              | would see that keeping these two requirements (not having
              | terrible single-threaded performance, and not having an
              | almost completely new C-API) is actually very complex and
              | takes a lot of expertise to implement.
        
               | saltminer wrote:
               | This might be a dumb question, but why would removing the
               | GIL break FFI? Is it just that existing no-GIL
               | implementations/proposals have discarded/ignored it, or
               | is there a fundamental requirement, e.g. C programs
               | unavoidably interact directly with the GIL? (In which
               | case, couldn't a "legacy FFI" wrapper be created?) I know
               | that the C-API is only stable between minor releases [0]
               | compiled in the same manner [1], so it's not like the
               | ecosystem is dependent upon it never changing.
               | 
               | I cannot seem to find much discussion about this. I have
               | found a no-GIL interpreter that works with numpy, scikit,
               | etc. [2][3] so it doesn't seem to be a hard limit. (That
               | said, it was not stated if that particular no-GIL
               | implementation requires specially built versions of C-API
               | libs or if it's a drop-in replacement.)
               | 
               | [0]: https://docs.python.org/3/c-api/stable.html#c-api-
               | stability
               | 
               | [1]:
               | https://docs.python.org/3/c-api/stable.html#platform-
               | conside...
               | 
               | [2]: https://github.com/colesbury/nogil
               | 
               | [3]: https://discuss.python.org/t/pep-703-making-the-
               | global-inter...
        
               | kortex wrote:
               | > C programs unavoidably interact directly with the GIL?
               | 
               | Bingo. They don't _have_ to, but often the point of C
               | extensions is performance, which usually means turning on
               | parallelism. E.g. Numpy will release the GIL in order to
                | use machine threads on compute-heavy tasks. I'm not
               | worried about the big 5 (numpy, scipy, pandas, pytorch,
               | and sklearn), they have enough support that they can
               | react to a GILectomy. It's everyone else that touches the
               | GIL but may not have the capacity or ability to update in
               | a timely manner.
               | 
                | I don't think this is something which can be shimmed or
                | ABI-versioned either. It's deeeep and touches huge
                | swaths of the cpython codebase.
        
               | saltminer wrote:
                | Thanks, that explains a lot. Sounds like a task that
                | would have to be done in Python 4, if it ever exists.
        
               | notatallshaw wrote:
               | > or is there a fundamental requirement, e.g. C programs
               | unavoidably interact directly with the GIL?
               | 
                | C programs can both use the GIL for thread safety and
                | make assumptions about the safety of interacting with a
                | Python object.
                | 
                | Some of those assumptions are not real guarantees from
                | the GIL, but in practice are good enough; they would no
                | longer be good enough in a no-GIL world.
               | 
               | > I know that the C-API is only stable between minor
               | releases [0] compiled in the same manner [1], so it's not
               | like the ecosystem is dependent upon it never changing.
               | 
               | There is a limited API tagged as abi3[1] which is
                | unchanging and doesn't require recompiling, and any
               | attempt to remove the GIL so far would break that.
               | 
               | > so it's not like the ecosystem is dependent upon it
               | never changing
               | 
               | But the wider C-API does not change _much_ between major
                | versions, it's not like the way you interact with the
               | garbage collector completely changes causing you to
               | rethink how you have to write concurrency. This allows
               | the many projects which use Python's C-API to relatively
               | quickly update to new major versions of Python.
               | 
               | > I have found a no-GIL interpreter that works with
               | numpy, scikit, etc. [2][3] so it doesn't seem to be a
               | hard limit.
               | 
                | The version of nogil Python you are linking to is the
                | product of years of work by an expert funded by Meta to
                | work on this full time, and it draws on many previous
                | attempts to remove the GIL, including the "gilectomy".
                | Also, you are linking to the old version based on Python
                | 3.9; there is a new version based on Python 3.12[2].
               | 
               | This strays away from the points I was making, but with
               | this specific attempt to remove the GIL if it is adopted
               | it is unlikely to be switched over in a "big bang", e.g.
               | Python 3.13 followed by Python 4.0 with no backwards
               | compatibility on C extensions. The Python community does
               | not want to repeat the mistakes of the Python 2 to 3
               | transition.
               | 
               | So far more likely is to try and find a way to have a
               | bridge version that supports both styles of extensions.
               | There is a lot of complexity in this though, including
               | how to mark these in packaging, how to resolve
               | dependencies between packages which do or do not support
               | nogil, etc.
               | 
                | And _even_ this attempt to remove the GIL is likely to
                | make things slower in some applications: in terms of
                | real-world performance, some benchmarks such as MyPy
                | show a nearly 50% slowdown, and there may be even worse
                | edge cases not discovered yet; and in terms of lost
                | development, as the Faster CPython project will likely
                | be unable to land a JIT in 3.13 or 3.14 as they
                | currently plan.
               | 
               | [1]: https://docs.python.org/3/c-api/stable.html#c.Py_LIM
               | ITED_API [2]: https://github.com/colesbury/nogil-3.12
        
         | JohnFen wrote:
         | Oh, boy. Will any of that impact backward compatibility?
         | 
         | I don't develop anything in Python, but it is used by several
         | applications of importance to me. The lack of compatibility
         | between versions is a thing that bites me hard, and I tend to
         | curse Python because of it.
        
         | matsemann wrote:
          | GIL is one of the things that make Python an annoyance to work
          | with. In saner languages, you can handle multiple requests at
          | the same time, or easily spin something off in a thread to
          | work in the background. In Python you can't do this. You need
          | to duplicate your process, then pay the price in memory usage
          | and the other things multiple processes make harder (like
          | communication between workers, or pre-computed values that are
          | no longer shared, so you need something external again). To
          | deploy your app, you end up with 10 different deploys, because
          | each of them has to have a different entry point and a
          | separate task to fulfill.
        
           | slt2021 wrote:
            | No, it is not.
            | 
            | If you want to reach peak performance, a single-threaded app
            | with no locks is the way to go, with work being sharded (not
            | shared) among multiple single-threaded apps.
            | 
            | Multi-threaded apps with shared state introduce more
            | complexity than they gain in performance, compared to
            | multiple single-threaded apps each running an asyncio event
            | loop.
            | 
            | For example, the LMAX Disruptor.
        
           | ActorNightly wrote:
            | The other languages are not saner. You are basically saying
            | "the Python GIL is annoying because I can't write performant
            | parallel-processing code in Python". Python has never been,
            | and is not, a performant language. It's designed for rapid
            | and easy development.
            | 
            | Multiprocessing+asyncio in Python fulfills the goal of
            | utilizing all the resources, albeit at a higher memory cost,
            | but memory is dirt cheap these days. You have a master
            | process and then worker processes. For all the things you
            | would write in Python, where in >90% of cases you are
            | network-latency limited, the paradigm of a master process
            | and worker processes with IPC over unix sockets works
            | extremely well. Set up a web app with FastAPI, a gunicorn
            | master, and uvicorn workers, and it will be plenty fast
            | enough for anything you do.
        
           | stinos wrote:
           | _GIL is one of the things that make Python an annoyance to
           | work with_
           | 
            | For your particular usecase, yes. Personally I've been using
            | Python for like 20 years for various tasks and so far have
            | never once been really bothered by the presence of it. Worst
            | case was having to wait somewhat longer for things to
            | complete. For my case: still worth it compared to making
            | things multithreaded. And async fixed the rest. And the
            | things which I actually need to be _fast_ aren't usually in
            | Python anyway. I'm not saying the GIL should stay, it's just
            | that it doesn't seem as much of a problem in the general
            | land of Python. Or in other words: how many Python users out
            | there even know what the GIL means and does?
        
             | stult wrote:
             | > For your particular usecase, yes.
             | 
             | The use case they are describing is a standard web server
             | or web application. That's a pretty important and widely
             | applicable use case to dismiss out of hand as "your
             | particular usecase".
        
               | nunuvit wrote:
               | The dismissiveness really goes the other way. Pythons
               | like IronPython and Jython don't have a GIL. CPython does
               | because it's primarily a glue language for extensions
               | that might not be thread-safe. Web apps were given huge
               | accommodation with async, so you can't say their needs
               | are being dismissed. Why must we break the C in CPython
               | for a use-case that could use one of the GIL-free
               | Pythons?
        
               | stinos wrote:
                | That's somewhat out of context. With the bit you quoted
                | I meant "sure, working around the GIL by implementing a
                | web server in that particular way is annoying". I'm not
               | saying that "web server" as a whole is not important or
               | not widely applicable, merely that amongst all other
               | usecases and applications of Python out there, web
               | servers are just one of many. And the particular
               | implementation stated like "10 different deploys" is even
               | a subset of that 'one' and as explained by fellow
               | comments, probably not the most appropriate one.
        
               | semiquaver wrote:
               | The GIL is not held during IO, which is what most web
               | applications and web servers should be spending the vast
               | majority of their time doing.
               | 
               | https://docs.python.org/3/library/threading.html
               | 
               | If that's too limiting, preforking and other forms of
               | process-based parallelism are a tried and true approach
               | that has been used for years to run python, ruby, PHP,
               | and once upon a time Perl web applications at enormous
               | scale. The difference between threads and processes on
               | Linux is relatively minor.
               | 
               | Saying that python doesn't work for web application use
               | cases because of the GIL is frankly sort of bizarre given
               | the large number of python web applications in the wild
               | chugging along delivering value.
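A tiny illustrative sketch of the point above: because blocking calls release the GIL, ten threads each waiting on a simulated network call finish in roughly the time of one call, not ten. Here `time.sleep` and `fake_request` are stand-ins for a blocking socket read in a real handler.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(i):
    # time.sleep releases the GIL, just like a blocking socket read,
    # so the ten "requests" overlap instead of running serially.
    time.sleep(0.2)
    return i * 2

start = time.monotonic()
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fake_request, range(10)))
elapsed = time.monotonic() - start

assert results == [i * 2 for i in range(10)]
# elapsed is close to 0.2s, not 2s, because the threads block concurrently.
```

The same structure with real socket or database calls behaves the same way, which is why thread pools remain a workable model for IO-heavy Python services despite the GIL.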
        
               | mlyle wrote:
               | > which is what most web applications and web servers
               | should be spending the vast majority of their time doing.
               | 
               | Sure... but if you have dozens of threads spending most
               | of their time doing I/O, that still leaves many threads
               | wanting to do things other than I/O.
               | 
               | > The difference between threads and processes on Linux
               | is relatively minor.
               | 
               | Except having any shared state between processes is
               | painful. If you're hitting an outside database for
               | everything, it's fine.
        
               | jrochkind1 wrote:
               | > The GIL is not held during IO, which is what most web
               | applications and web servers should be spending the vast
               | majority of their time doing.
               | 
               | While this has been oft-repeated for years, more or less
               | language-independently, I have become convinced it no
               | longer accurately describes _ruby on rails_ apps. People
                | still say it about ruby/rails too, though. But my rails
                | web apps are spending 50-80% of their wall time on CPU
                | rather than blocking on IO, depending on app and action.
               | And whenever I ask around for people who have actual
               | numbers, they are similar -- as long as they are projects
               | with enough developer experience to avoid things like n+1
               | ORM problems.
               | 
               | I don't have experience with python, so I can't speak to
               | python. but python and ruby are actually pretty similar
               | languages, including with performance characteristics,
               | and the GIL. Python projects tend to use more stuff
               | that's in C, which would make more efficient use of CPU,
               | so that could be a difference. (Also not unrelated to
               | what we're talking about!)
               | 
               | But I have become very cautious of accepting "most web
               | applications are spending the vast majority of their time
               | on io blocking rather than CPU" as conventional wisdom
               | without actually having any kind of numbers. _vast_
               | majority? I would doubt it, but we need empirical
               | numbers.
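One cheap way to get those empirical numbers in Python is to compare CPU time against wall time around a request. This is a sketch with a made-up workload (`handle_request` is hypothetical); in a real app you would wrap the actual handler.

```python
import time

def handle_request():
    # Made-up workload: some CPU-bound work plus a simulated IO wait.
    total = sum(i * i for i in range(200_000))
    time.sleep(0.05)  # stands in for a database or network call
    return total

wall_start = time.monotonic()
cpu_start = time.process_time()  # CPU time only; excludes time spent blocked
handle_request()
cpu = time.process_time() - cpu_start
wall = time.monotonic() - wall_start

# A fraction near 1.0 means CPU-bound; near 0.0 means IO-bound.
cpu_fraction = cpu / wall
assert 0.0 < cpu_fraction < 1.0
```

Aggregating that fraction per endpoint over production traffic gives exactly the kind of evidence this thread is asking for, rather than the conventional wisdom.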
        
               | nomel wrote:
               | > The use case they are describing is a standard web
               | server or web application.
               | 
               | I believe this is what they were referring to when they
               | said "async fixed the rest".
        
             | Fiahil wrote:
             | I think Data Scientists would like a word with you. They
             | have plenty of time since their parallel pipeline was
             | OOMKilled.
        
           | oconnor663 wrote:
           | > you could handle multiple requests at the same time
           | 
           | To be fair to Python and the GIL, it's totally capable of
           | parallelizing requests when most of the work is network-
           | bound, which is probably the common case. And when the work
           | is CPU-bound, but the CPU-intensive part is written in C,
           | it's also possible for C to release the GIL. So it's really
           | only "heavy computational work directly in Python" programs
           | that are affected by this. (On the other hand, Python
           | applications do naturally expand to look like this over
           | time...)
        
           | paulddraper wrote:
           | Other languages do that: JavaScript, PHP, Erlang.
           | 
           | Python multiprocessing is pretty usable.
           | 
            | I like multithreading, but also...it has more footguns than
            | the rest of programming combined. [1]
           | 
           | I'm not convinced Python's approach is that bad in practice.
           | 
           | [1] https://news.ycombinator.com/item?id=22165193
        
             | [deleted]
        
             | mohaine wrote:
              | It is, until the GIL bites you in the ass. As it is, you
              | get different behavior depending on whether a call goes
              | out to external code or is pure python. And since you
              | really don't know whether a random function call is
              | python or a wrapper around external code, you really get
              | random behavior.
              | 
              | The time it got me was a thread whose only job was to
              | time out another operation. Tests worked great, but the
              | timeout didn't work in production because the call it
              | wrapped was calling out to C code, so nothing could run
              | until the call returned. We even still got the timeout
              | error in the logs, and it looked like it was working (it
              | even tossed the now-awaited-for valid results), but not
              | at the time of the timeout; only after the call finally
              | returned a few hours later.
        
               | paulddraper wrote:
               | So.... It would have been better if GIL were even more
               | aggressive?
        
         | samsquire wrote:
         | I have often wondered what the solution to the serialisation of
         | objects between subinterpreters is.
         | 
          | If it's garbage collection that's the problem, I think you could
         | transfer ownership between threads, so the subinterpreter takes
         | ownership of the object and all references to it in the source
         | interpreter are voided.
         | 
          | Alternatively you can do something like Java, where all
          | objects are in a global object allocator, so passing things
          | between threads doesn't require serialization, just a
          | reference.
        
         | jillesvangurp wrote:
         | The GIL has been a blocker for many years. It's nice that the
         | team is making progress of course. IMHO it's one of those
         | bandaids they need to rip off.
         | 
          | I was listening to the interview with Chris Lattner by Lex
          | Fridman a week or so ago. Very interesting discussion on his
          | project Mojo, which intends to build a new language that is
          | backwards compatible and a drop-in replacement for python,
          | with opt-in strict typing, better support for
          | native/primitive types where that makes sense, easier
          | integration with hardware optimizations, and of course no
          | GIL. The idea is that the migration path for existing code
          | is that it should just work, and then you optimize it and
          | provide the compiler with type hints and other information
          | so it can do a better job. Very ambitious roadmap, and I'm
          | curious to see if they'll be able to deliver.
         | 
          | The main goal seems to be to enable programmers to do the
          | things you currently can't do in Python because it's too
          | slow, without running into a brick wall in terms of
          | performance.
         | 
         | I mostly work with JVM languages and a few other things but I
         | occasionally do a bit with python as well. I've always liked it
         | as a language but I'm by no means an expert in it. I recently
         | spent a day building a simple geocoder and since I know about
          | the GIL, I went straight for the multiprocessing library and
         | did not bother with threads. IMHO there's absolutely no point
         | in attempting to use threads with python with the GIL in place.
         | I needed to geocode a few hundred thousand things in a
         | reasonable time frame, so all I wanted to do was use a few
         | different processes concurrently so I could cut down the
         | runtime to something reasonable.
         | 
          | Python is ok for single-threaded stuff, but you run into a
          | brick wall doing anything with multiple processes or threads
          | and juggling state. In the end I just gave up and wrote a
          | bunch of logic that splits the input into files, processes
          | the files with separate processes, waits for that to finish,
          | and then combines the output files. Just a lot of silly
          | boilerplate and abusing the file system for sharing state. It
          | does what it needs to, but it feels a bit primitive and
          | backwards and I'm not proud of the solution.
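For what it's worth, the split-into-files-and-recombine dance can often be collapsed into a multiprocessing.Pool, which handles the chunking, worker processes, and result collection itself. A rough sketch, with the hypothetical `geocode_one` standing in for whatever the real per-record work is:

```python
from multiprocessing import Pool

def geocode_one(record):
    # Placeholder for the real per-record work (e.g. a lookup in a
    # local index). Must be a top-level function so it can be pickled
    # for the worker processes.
    return record.upper()

def geocode_all(records, workers=4):
    with Pool(processes=workers) as pool:
        # chunksize batches records per IPC round-trip, which matters
        # when individual items are cheap to process.
        return pool.map(geocode_one, records, chunksize=100)

if __name__ == "__main__":
    out = geocode_all(["oslo", "bergen", "tromso"])
    assert out == ["OSLO", "BERGEN", "TROMSO"]
```

The `__main__` guard matters on platforms that spawn rather than fork, and results come back in input order, so no manual recombination step is needed.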
         | 
         | Removing the GIL, adding some structured concurrency, and maybe
         | some other features, would make python a lot more capable for
          | data processing. And since a lot of people already use
          | python for that sort of thing, I don't think that would be
         | such a bad thing. Data science and data processing are the core
         | use case for python. I don't think people actually care a lot
         | about the raw python performance. It's never been that great to
         | begin with. If it's performance critical, it's mostly being
         | done via native libraries already.
        
           | mixmastamyk wrote:
           | Is this io-bound or cpu-bound? Hard to tell from your one
           | word description, "geocode". Is that local or a network call?
           | 
            | If you've broken up the input already, I'd use the shell to
            | parallelize, i.e. a for loop with &. If it's network, async
            | is probably what you want.
        
           | robertlagrant wrote:
            | This is an interesting writeup. Could you go for
            | asyncio.gather[0] or TaskGroups[1] these days? Or would
            | that not help?
           | 
           | [0] https://docs.python.org/3/library/asyncio-
           | task.html#asyncio....
           | 
           | [1] https://docs.python.org/3/library/asyncio-
           | task.html#asyncio....
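A minimal sketch of the asyncio.gather route, with the hypothetical `fetch` standing in for a real network call:

```python
import asyncio

async def fetch(i):
    # Stand-in for an HTTP request; the await yields to the event
    # loop, so all the "requests" run concurrently on one thread.
    await asyncio.sleep(0.1)
    return i * 10

async def main():
    # gather schedules every coroutine at once and returns results
    # in the same order as the arguments.
    return await asyncio.gather(*(fetch(i) for i in range(5)))

results = asyncio.run(main())
assert results == [0, 10, 20, 30, 40]
```

This helps with IO-bound fan-out on one core; for CPU-bound work like the geocoding described above, the process pool approach is still the one that uses multiple cores.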
        
           | nologic01 wrote:
           | > Data science and data processing are the core use case for
           | python
           | 
            | indeed. one would almost hope that all the different aspects
            | of "performance" and "concurrency", their memory, disk or
            | network profile etc., get their own dedicated labels. The
            | conflation of these distinct dimensions is a major source of
            | confusion (and thus a waste of bandwidth).
        
         | nmstoker wrote:
         | I do hope the dialogue stays cordial, constructive and open
         | rather than becoming distinct entrenched camps - the Python
         | community has a strong and mature community spirit so this
         | seems plausible and not too much wishful thinking.
         | 
         | Much as No GIL would be an adventure, I'm leaning towards the
         | more gradual and stable changes from the FasterPython team and
         | I can see that throwing No GIL into the mix adds complexity at
         | an inopportune moment.
        
       | RandyRanderson wrote:
       | Many projects start out in Python b/c often new libs are python-
       | first. Many of those run into performance issues and eventually
       | determine that Python will never be fast b/c of the GIL.
       | 
       | I think it's very magnanimous of the python team, by not removing
       | the GIL, to give Go, Java and C++ a chance.
        
       | titzer wrote:
        | Building a whole new interpreter and accompanying compiler tiers
        | is a _lot_ of work, still in 2023. Many different projects have
        | tried to make this easier: to provide reusable components, to
        | offer toolkits, or another VM to build on top of. But it seems
        | that none of these really apply; we're still in "every language
        | implementation is a special snowflake" territory. That's partly
        | the case in Python because the codebase has accumulated, or
        | rather nucleated, around a core C interpreter from 30 years ago.
       | 
       | IMHO the Python community has intentionally torpedoed all
       | competing implementations of Python besides CPython to its own
       | detriment. They seem to constantly make the task of making a
       | compatible VM harder and harder, on purpose. The task they face
       | now is basically building a new, sophisticated VM inside a very
       | much non-sophisticated VM with a ton of legacy baggage. A massive
       | task.
        
       | PeterStuer wrote:
        | Just one question: Do you really want Python to be held back by
       | the GIL 10 years from now? If not, when do you want to start the
       | change? If so, what do you think a CPU will look like in 10
       | years, and how would a GIL Python facilitate getting value from
       | it?
        
         | kzrdude wrote:
          | You don't want python to still be reeling from a grueling
          | GIL ecosystem split, 10 years from now.
        
       | henrydark wrote:
       | Looks cool. This caught my eye
       | 
       | > C++ and Java developers expect to be able to run a program at
       | full speed (or very close to it) under a debugger.
       | 
       | I haven't worked in Java too much, but in C++ I don't remember
       | having this expectation
        
         | mm007emko wrote:
          | Java programs might not run very well in a debugger either,
          | depending on where and how you place breakpoints.
         | 
         | However I'd be glad if profilers (and notably memory profiler)
         | would slow down a Python program only as much as valgrind does
         | C.
        
           | The_Colonel wrote:
            | I noticed that Java with a connected debugger can be very
            | fast even with breakpoints, but stepping over can be very
            | slow. Which is a bit weird, since "step over" is basically
            | just putting a breakpoint on the next line.
        
         | Chabsff wrote:
         | Think of it more as: It's still possible, painful as it may
         | sometimes be, to debug optimised C++ code running at full
         | speed.
        
       | gyrovagueGeist wrote:
       | Can someone give a good argument of why subinterpreters are an
       | interesting or useful solution to concurrency in Python? It seems
       | like all the drawbacks of multiprocessing (high memory use, slow
       | serialized object copying) with little benefit and higher
       | complexity for the user.
       | 
       | The nogil effort seems like such a better solution, that even if
       | it breaks the C interface, subinterpreters aren't worth
       | considering.
        
         | bsder wrote:
         | Pure uninformed speculation follows ...
         | 
         | I suspect sub-interpreters are a punt and a feint.
         | 
         | My guess is that there will likely be exactly _2_ sub-
         | interpreters in most Python code. One which talks to an old C
         | API with a GIL and one which talks to a new C API without a
         | GIL.
         | 
         | It's going to be a _lot_ easier to manage handing objects
         | between two Python sub-interpreters than to manage handing
         | objects between two incompatible ABIs.
        
         | nhumrich wrote:
          | Sub-interpreters are still faster than processes. With them
          | you could do CSP-style message passing, which other modern
          | languages, such as Go, use for multi-threading. Also, for
          | cases such as web apps or data science training, you don't
          | need to share memory between threads, and a sub-interpreter
          | uses a lot fewer resources than a full python process.
         | 
         | > Even if it breaks the c interface
         | 
          | Then most of your python packages wouldn't work. A python
          | that isn't backwards compatible? Ya, that has been tried once
          | before, and was a disaster. If you want a non-backwards
         | compatible gil-less python, it already exists. You can find
         | versions of nogil online.
        
         | bratao wrote:
          | I understand that sub-interpreters can perform better at
          | object copying, as they share the same memory. But yeah,
          | nogil looks like the correct way.
        
         | globular-toast wrote:
         | Minor nitpick: Python can perfectly well do concurrency. What
         | you mean is parallel execution (multi-threading).
        
         | celeritascelery wrote:
         | > Can someone give a good argument of why subinterpreters are
         | an interesting or useful solution to concurrency in Python?
         | 
         | I will give it a shot.
         | 
         | Subinterpreters are better than multiple processes because:
         | 
         | - they have significantly less memory overhead
         | 
         | - they can move objects much faster between subinterpreters
         | because they don't need to serialize through a format like JSON
         | 
         | - since they are all in the same process you can implement
         | things like atomics or channels easily.
         | 
          | Subinterpreters are better than no-GIL because:
         | 
         | - they make the code easier to reason about and debug relative
         | to raw multi-theading
         | 
         | - they don't negatively impact single threaded (basically all
         | existing) python code performance
         | 
         | - they don't require any changes to the C interface, preventing
         | a fractured ecosystem
         | 
         | - they can't have data races
        
           | kmod wrote:
           | Couple corrections:
           | 
           | - They absolutely do have to serialize, usually via pickle.
           | I'm pretty sure objects are not sharable between
           | subinterpreters and there is not a plan for that. The main
           | reason people think subinterpreters are good ("you can just
           | share the memory!") is not actually true.
           | 
           | - They don't require any changes to the C interface because
           | those changes were already made, and a fair amount of cost
           | was paid by C library maintainers. So it's true,
           | subinterpreters are at an advantage in this regard, but
           | that's more of a political question than a technical one
        
             | kzrdude wrote:
             | Eric Snow mentioned in his Pycon talk that memory sharing
             | would be used, especially big data blobs, arrays etc. Sure,
             | not directly sending python objects, but passing pointers
             | can be done.
        
           | Spivak wrote:
           | > - they don't negatively impact single threaded (basically
           | all existing) python code performance
           | 
           | I think this deserves an extra callout because _even your
           | multi-threaded Python programs are effectively single
           | threaded and benefiting from the performance gain_.
        
           | [deleted]
        
           | sergiomattei wrote:
           | > they can't have data races
           | 
           | How so? Asking out of curiosity.
        
             | celeritascelery wrote:
              | Because they don't share memory. The interpreters do
              | share memory at the C level, so C code could have data
              | races, but the python code can't. Just like how the C
              | interpreter can have memory unsafety but python can't (or
              | shouldn't).
        
           | ptx wrote:
           | > _they can move objects much faster between subinterpreters
           | because they don't need to serialize through a format like
           | JSON_
           | 
           | That would be a huge advantage, but it's not there yet.
           | According to PEP 554 [1] the only mechanism for sharing data
           | is sending bytes through OS pipes, which is exactly the same
           | as for multiprocessing and requires the same sort of
           | serialization.
           | 
           | [1] https://peps.python.org/pep-0554/#api-for-sharing-data
        
             | semiquaver wrote:
             | Is the overhead of pickle eg as used in
             | multiprocessing.Pipe() actually a limiting factor in most
             | circumstances?
             | 
             | https://docs.python.org/3/library/multiprocessing.html#mult
             | i...
             | 
             | In a message passing system there's always going to need to
             | be some form of serialization. I'll wager that pickle is
             | fast and flexible enough for most cases and for those that
             | aren't, using something like flatbuffers or capn proto in
             | shared memory wouldn't be too much of a lift to integrate.
             | 
             | Although all of that has long been possible in a multiple-
             | process architecture, so I'm also curious to know if there
             | are any real advantages to subinterpreters. From this
             | message [1] linked to from the PEP it sounds like the
             | author once thought that object sharing was a possibility,
             | but if it's not there seem to be no real benefits over
             | multiprocessing and one big downside (the GIL).
             | 
             | Contrast with ruby's Ractor system [2], which is similar to
             | the subinterpreter concept but allows true parallelism
             | within a single process by giving each ractor its own
             | interpreter lock, along with a system for marking an object
             | as immutable so it can be shared among ractors.
             | 
             | [1] https://mail.python.org/pipermail/python-
             | ideas/2017-Septembe...
             | 
             | [2] https://github.com/ruby/ruby/blob/master/doc/ractor.md
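The serialization round-trip being discussed is easy to see directly: multiprocessing.Pipe pickles every object on send() and rebuilds it on recv(), even when both connection ends live in the same process, as in this small sketch.

```python
from multiprocessing import Pipe

parent, child = Pipe()
payload = {"ids": list(range(1000)), "tag": "batch-1"}

# send() pickles the object and writes the bytes to the pipe;
# recv() reads the bytes and unpickles them on the other end.
parent.send(payload)
received = child.recv()

assert received == payload      # equal value...
assert received is not payload  # ...but a distinct copy: it was serialized
```

Whether that copy is the bottleneck depends on payload size and message rate, which is exactly the kind of thing worth benchmarking before reaching for shared memory.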
        
           | spacechild1 wrote:
           | > - they can move objects much faster between subinterpreters
           | because they don't need to serialize through a format like
           | JSON
           | 
           | Why do you think that you would need to serialize to JSON?
           | Pipes and sockets can deal with binary data just fine. With
           | shared memory, there wouldn't be any difference at all.
           | 
           | > - since they are all in the same process you can implement
           | things like atomics or channels easily.
           | 
           | This is also possible with shared memory.
           | 
           | AFAICT the advantage of subinterpreters over subprocesses
           | are:
           | 
           | - lower memory overhead
           | 
           | - faster creation/destruction time
           | 
           | - ability to share global data (with subprocesses the data
           | would either need to be duplicated or live in shared memory)
        
             | celeritascelery wrote:
              | Sure, you _could_ do something like that. But a shared
              | memory Python with a stable binary object format doesn't
              | exist (and isn't even being worked on). Comparing the
              | proposed PEP 554 solution to a non-existent theoretical
              | solution isn't very useful.
             | 
             | But you do bring up some good points for ways you could
             | achieve similar goals without the need to make the
             | interpreters thread safe.
        
               | spacechild1 wrote:
               | > But a shared memory segment python with a stable binary
               | object format doesn't exist
               | 
               | There is multiprocessing.Queue (https://docs.python.org/3
               | /library/multiprocessing.html#multi...).
               | 
               | I don't know if it uses shared memory, or rather sockets
               | or pipes, but this is just an implementation detail.
               | 
               | My point is that there is no fundamental difference
               | between isolated interpreters and processes when it comes
               | to data sharing. Either way, you need a (binary)
               | serialization format and some thread/process-safe queue.
               | 
               | I would have naively assumed that you could repurpose
               | multiprocessing.Queue for passing data between multiple
               | interpreters; you would just need to replace the
               | underlying communication mechanism (sockets, pipes,
               | shared memory, whatever it is) with a queue + mutex. But
               | then again, I'm not familiar with the actual code base.
               | If there are any complications that I didn't take into
               | acccount, I would be curious to hear about them.
               | 
               | Interestingly, the PEP authors currently don't propose an
               | API for exchanging data and instead suggest using raw
               | pipes:
               | 
               | > https://peps.python.org/pep-0554/#api-for-sharing-data
               | 
               | Of course, this is just a temporary hack. It would be
               | ridiculous to use actual pipes for sharing data within
               | the same process...
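For comparison, the within-one-process "queue + mutex" version already exists for threads: queue.Queue hands over object references under an internal lock, with no serialization or copying at all, which is roughly the behavior an ideal subinterpreter channel would approach. A sketch:

```python
import queue
import threading

q = queue.Queue()

def producer():
    for i in range(3):
        # put() hands over a reference under the queue's internal
        # lock; nothing is pickled or copied.
        q.put({"n": i})
    q.put(None)  # sentinel to tell the consumer to stop

out = []

def consumer():
    while (item := q.get()) is not None:
        out.append(item["n"])

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()

assert out == [0, 1, 2]
```

The catch, of course, is that passing references is only safe because both threads live in one interpreter; between isolated interpreters the same API would need some copy or ownership-transfer step underneath.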
        
         | samus wrote:
         | Easy interop with C is one of the core selling points of
         | Python. It's not just about performance - it's about Python
         | being able to be a glue language that can interface with any
         | legacy or system library written in C or being accessible via
         | FFI. A LOT of systems exist that take advantage of this, and
         | changes to the C interface impact all of these. Unless the
         | porting story is thought out extraordinary well, it would be
         | another disaster like Python 2->3. Since this affects C code,
         | it won't be anything but tricky.
        
           | gyrovagueGeist wrote:
            | From reading the nogil repo and the related PEP, breaking
            | the C ABI does not seem to be the worst problem. An
            | updated Cython, SWIG, etc. would be enough in most cases
            | to get something running. More extreme cases might need
            | only a ~dozen lines of changes to replace the GIL
            | Ensure/Release patterns.
           | 
           | The hidden really hard problem is that extension modules may
           | have been written relying on GIL behavior for thread
           | safety... these may be undocumented and unmaintained.
           | 
           | Even so I hope the community decides it is worth it. A glue
           | language with actual MT support would be much more useful.
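            | 
            | A pure-Python sketch of that kind of reliance (the names
            | here are illustrative): even under the GIL, a compound
            | update like `counter += 1` spans several bytecodes, so
            | only the explicit lock makes the result deterministic.

```python
import threading

counter = 0
lock = threading.Lock()

def add(n):
    global counter
    for _ in range(n):
        with lock:          # drop this and updates may be lost
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # → 40000
```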
        
             | samus wrote:
              | Indeed, thanks for clarifying that. I actually had the
              | extensions in mind when I wrote my comment, but somehow
              | got distracted by the C interop aspect. Extensions do have
             | a more intimate relation with the interpreter and (directly
             | or indirectly) rely on the GIL and its associated
             | semantics.
        
       | bkovacev wrote:
       | How would this (or any of the recent changes) affect popular web
       | frameworks like Django / Flask / FastAPI? Would this increase
       | performance in serialization or speed in general?
        
       | retskrad wrote:
        | Speaking of Python: as a beginner, I have tried to grasp
        | classes and objects, watching countless YouTube videos and
        | reading Reddit/Stack Overflow comments to understand them
        | better, but I'm just banging my head against the wall and
        | it's not sticking. The
       | whole __init__ method, the concept of self and why it's used,
       | instance and class variables, arguments, all of it. When learning
       | something, when you simply cannot grasp a concept however hard
       | you try, what's the course of action? I have tried taking breaks
       | but that's not helping.
        
         | abecedarius wrote:
         | First, do you feel you understand OO in another language? So
         | this is about how Python does it?
        
           | retskrad wrote:
           | I'm a beginner and Python is the only language I'm familiar
           | with
        
             | abecedarius wrote:
             | OK. It might help to keep in mind a couple things:
             | 
             | - Python started as a simple language, but it's grown over
             | the decades, and professional programmers learned the new
             | complications along the way. They don't naturally
             | appreciate what it's like if you get hit with them all
             | together, as sometimes happens -- it all seems simple to
             | _them_. So to understand OO I would try to screen out the
             | fancier bits like the class variables you mentioned -- you
             | can come back to them later.
             | 
             | - The original central idea of objects (from Smalltalk) is,
             | an object is a thing that can receive different commands
             | ('methods' in Python), and update its own variables and
             | call on other objects. The way Python gives you to define
             | objects (by defining a class and then creating an instance
             | of the class) is not the most direct possible way it
             | could've been designed to do this -- if it feels more
             | complicated than necessary, it kind of is. But it's not too
             | bad, you can get used to how it works for the most central
             | stuff as I mentioned, and learn more from there.
        
         | switch007 wrote:
          | Classes and objects exist in all object-oriented
          | programming languages. Perhaps stepping back from Python
          | and trying some more generic material might help?
        
           | retskrad wrote:
           | That's a good idea, I'll try that
        
         | IAmGraydon wrote:
         | As everyone else said, use it. Don't try to understand by
         | reading. Understand by doing.
        
         | kzrdude wrote:
          | I think working with it in practice is the only way
          | forward. Look up info, build something, work more, and let
          | it become a feedback loop. People work with systems for a
          | long time before they "get" them.
         | 
         | Programming is not a "school" activity for me. It's a craft,
         | you do stuff. That has eventually led to a depth of knowledge
         | in various topics inside programming.
         | 
         | (Digression: With that said - we should think of a lot more
         | "school" stuff as skills and crafts - stuff that you don't read
         | to understand but you get better at with practice. A lot of
         | maths class is skills, not knowledge.)
        
           | retskrad wrote:
           | Good advice, thanks
        
         | montecarl wrote:
         | Others have said something similar, but I will still chime in.
         | 
          | It sounds like you may want to feel like you have a
          | complete understanding of classes, objects, and the way
          | they work before you can begin working with them. This
          | almost never works. I
         | haven't come across a topic where just reading about the topic
         | is sufficient to fully understand it or even become proficient
         | in it.
         | 
         | In my experience this is true for cooking, lab techniques in
         | chemistry, electronics, and programming. Even when I have read
         | something and felt that I understood it completely, as soon as
         | I begin the activity I immediately realize that I had a
          | fundamental misunderstanding of what I had read: my brain
          | had made some oversimplification or skipped past some
          | details so that I felt like I grasped the concept.
         | 
         | So if I had to describe the way I learn a new concept it would
         | look like this:
         | 
         | 1. Read many different descriptions/watch many different videos
         | of the topic to get used to the terms and concepts
         | 
         | 2. Try to apply those concepts in real life (write software,
         | build a circuit, cook a meal)
         | 
         | 3. Figure out where my understanding fell short, and go back to
         | step 1
         | 
          | You want to make this loop as tight as possible. When you
          | become an expert at something, you can run this loop super
          | fast. When you first start out, you may have to do quite a
          | bit of reading just to have the baseline needed to attempt
          | a concrete task. However, it is important to get started
          | toward a real goal as soon as possible, or you will waste
          | your time feeling like you are learning and moving toward
          | your goal when you are not.
        
         | njharman wrote:
         | > When learning something, when you simply cannot grasp a
         | concept however hard you try, what's the course of action?
         | 
         | Learn by doing, don't learn by studying.
         | 
          | Pick a (hopefully real) problem and solve it using classes.
          | Repeat. Keep repeating. You will either learn what OO is
          | good for, what it is not good for, and how you can use it
          | to write better code, or learn that programming is not
          | something you can grok.
         | 
          | If you don't have a "real" problem, then write a program to
          | play tic-tac-toe: manage and display the board, take player
          | input, and detect when the game is done (winner or draw).
         | 
          | Then expand it to an 8x8 board, then to 3 players. Those
          | changes should be easy with a good OO design, and probably
          | harder (lots of rewrites and code changes) with a bad or
          | nonexistent OO design.
         | 
          | By the way, this was the second interview question my team
          | and I used for years.
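          | 
          | As a hedged illustration of the exercise (all names here
          | are just one possible design), a board parameterized on
          | size and win length makes the 8x8 expansion a constructor
          | argument rather than a rewrite:

```python
# Sketch of an OO tic-tac-toe board; Board, place, and winner are
# illustrative names, not from the comment above.

class Board:
    def __init__(self, size=3, win=3):
        self.size = size
        self.win = win      # marks in a row needed to win
        self.cells = [["." for _ in range(size)] for _ in range(size)]

    def place(self, row, col, mark):
        if self.cells[row][col] != ".":
            raise ValueError("cell already taken")
        self.cells[row][col] = mark

    def winner(self):
        # Check every cell in all four directions for a winning run.
        for r in range(self.size):
            for c in range(self.size):
                mark = self.cells[r][c]
                if mark == ".":
                    continue
                for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                    run = [(r + i * dr, c + i * dc) for i in range(self.win)]
                    if all(
                        0 <= rr < self.size and 0 <= cc < self.size
                        and self.cells[rr][cc] == mark
                        for rr, cc in run
                    ):
                        return mark
        return None

board = Board()
for r, c, m in [(0, 0, "X"), (1, 1, "X"), (2, 2, "X")]:
    board.place(r, c, m)
print(board.winner())  # → X
```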
        
         | harrelchris wrote:
         | Either keep looking for explanations until it clicks, or try to
         | break it down into more fundamental elements.
         | 
          | Classes have a simple analogy, blueprints, which I'm sure
          | you are familiar with.
         | 
         | Blueprints are instructions for how to build something, like a
          | house. Once built, the house is a physical thing you can
          | interact with. It has attributes, such as height or color.
          | It has things it can do or that can be done to it, like
          | opening a door or turning on some lights.
         | 
         | Classes are blueprints - they tell a computer how to build
         | something. An object is what is built by the class - it has
         | attributes and methods.
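          | 
          | The analogy above, as a hedged sketch in code (House and
          | its attributes are hypothetical): `__init__` is the build
          | step, `self` is "this particular house", and each instance
          | gets its own copies of the attributes.

```python
class House:
    def __init__(self, height, color):
        # __init__ runs once per new house; self is the house being built
        self.height = height        # instance variables: per-object state
        self.color = color
        self.lights_on = False

    def turn_on_lights(self):
        # A method: something this particular house can do
        self.lights_on = True

first = House(height=10, color="blue")   # build one house from the blueprint
second = House(height=8, color="red")    # build another, independent one

second.turn_on_lights()
print(first.lights_on, second.lights_on)  # → False True
```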
        
           | retskrad wrote:
           | Thanks, I appreciate it
        
         | ozzy6009 wrote:
          | I was the same way as a beginner; I didn't really "get"
          | Python classes until years later, after using other
          | programming languages. I recommend SICP if you want a more
          | first-principles
         | understanding: https://web.mit.edu/6.001/6.037/sicp.pdf
        
           | retskrad wrote:
           | Thanks
        
         | mixmastamyk wrote:
          | Go to school, with a good teacher. YouTube videos are not a
          | substitute for a proper course.
        
           | IshKebab wrote:
            | I disagree on the school part. There are _so_ many good
            | resources available, especially for Python programming.
            | There's no way you need to pay for lessons. I don't think
            | the kind of people you get teaching Python will be the
            | most knowledgeable anyway.
           | 
           | But I would maybe recommend a good book if you're struggling.
        
             | mixmastamyk wrote:
             | Community college is still a thing, not expensive, and
             | won't make the mistake of teaching specifically Python. It
             | also forces a schedule which is helpful in its own right.
             | 
             | What you just proposed is exactly what _isn't_ working for
             | this person.
        
         | Zizizizz wrote:
          | Try these Corey Schafer videos. I remember watching them 6
          | years ago and finding them quite helpful:
          | https://youtu.be/ZDa-Z5JzLYM
        
           | retskrad wrote:
           | Thanks
        
         | WoodenChair wrote:
         | Build a project where you purposely use a bunch of your own
         | custom classes. Learning by doing is best if the reading
         | materials are not clicking.
        
       | tasubotadas wrote:
       | > PEP 669 - Low Impact Monitoring for CPython
       | 
        | Finally, 15 years later, Python might also get something
        | similar to VisualVM.
        
       | jonnycomputer wrote:
       | I'm going to admit that what I really want to see is a strong
       | push to standardize and fully incorporate package management and
       | distribution into python's core. Despite the work done on it,
       | it's still a mess as far as I can see, and there is no single
       | source of truth (that I know of) on how to do it.
       | 
        | For that matter, pip can't even search for packages anymore,
        | and instead directs you to browse the PyPI website in a
        | browser. Whatever the technical reasons for that, it's a
        | user-interface failure. Conda can do it! (As can just about
        | any other package management system I've ever used.)
        
         | globular-toast wrote:
         | There it is. The obligatory comment on every Python thread on
          | HN. It's the most popular programming language in the world. Other
         | people can figure it out, apparently.
        
           | jonnycomputer wrote:
           | I love python. It is my go-to language for just about
           | everything. But that also means that I feel the pain points
           | pretty acutely. And you know what, I'm not alone.
        
           | jonnycomputer wrote:
           | Oh look: https://news.ycombinator.com/vote?id=36343991&how=up
           | &auth=85...
        
           | woodruffw wrote:
           | I think we can be more charitable than this: it's possible to
           | be both immensely popular and to have a sub-par packaging
           | experience that users put up with. That's where Python is.
        
             | globular-toast wrote:
             | The trouble is people compare it to greenfield languages of
             | the past few years with nowhere near the scope, userbase or
             | legacy of Python. Long time Python users like me don't have
             | any of the problems that the non-Python users that always
             | post these comments have. It would be nice to have
             | improvements to packaging, sure, but it's always just
             | completely non-constructive stuff like "it's not as easy as
             | <brand new language with no legacy>".
        
               | jonnycomputer wrote:
               | Big assumptions here.
        
               | jshen wrote:
               | Java and Ruby both have much better dependency management
               | experiences and both have been around for far longer than
               | a few years.
        
               | [deleted]
        
               | antod wrote:
               | As someone who dealt with Java and Python 20yrs back, I
               | don't think Java is a valid comparison.
               | 
                | Java had a terrible or non-existent OS integration
                | story; it didn't even try to have OS-native stuff. It
                | was its own separate island that worked best when you
                | stayed on the island. On Linux, Python was included
                | in the OS, so
               | you had the two worlds of distro packaging and
               | application development/deployment dependencies already
               | in conflict. Macs also shipped their own Python that you
               | had to avoid messing up. And on Windows Python was also
               | trying to support the standard download a setup.exe
               | method for library distribution. Java only ever had the
               | developer dependency usecase to think about.
               | 
                | Before Maven, most Java apps just vendored all their
                | dependencies into their codebase, or you manually
                | wrangled assembling things in place using
                | application-specific classpaths, additions to PATH
                | environment variables, etc.
        
               | jshen wrote:
               | Today, java has much better dependency management than
               | python. Nearly all popular languages do.
        
               | woodruffw wrote:
               | I agree with all of this! Ironically, grievances around
               | Python packaging are a function of Python's overwhelming
               | success; non-constructive complaints about the packaging
               | experience reflect otherwise reasonable assumptions about
               | how painless it _should_ be, given how nice everything
               | else is.
               | 
               | (This doesn't make them constructive, but it does make
               | them understandable.)
        
           | kzrdude wrote:
           | Packaging is a big topic right now, and a lot is happening -
            | that includes a lot of good tool improvements. I think
            | that's one reason for these comments: it's close to top
            | of mind.
        
         | asylteltine wrote:
          | Python's dependency management (or lack thereof), its
          | import system, and the lack of strong typing really make me
          | hate it. It's the first language I really felt adept with,
          | but once I learned Go, I never looked back. Every time I
          | have to use Python, it's like coding with crayons.
        
           | pnt12 wrote:
           | I think type hints help a lot. A codebase with classes and
           | type hints reads much better than one using ad-hoc
           | dictionaries for every data structure.
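            | 
            | For instance (a hedged sketch; User and its fields are
            | made up), the same record as an ad-hoc dict versus a
            | typed class:

```python
from dataclasses import dataclass

# Ad-hoc dict: the shape is invisible to type checkers and readers.
user_dict = {"name": "Ada", "age": 36}

# Typed class: documents its own shape; a checker catches bad fields.
@dataclass
class User:
    name: str
    age: int

user = User(name="Ada", age=36)
print(user.age + 1)  # → 37
```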
           | 
           | But on the other hand I don't really like Go, so maybe it's
           | different languages for different tastes.
        
           | wendyshu wrote:
           | What aspect of python's type system do you find insufficient?
        
           | Aperocky wrote:
            | Premature optimization is the root of all evil.
           | 
            | Python isn't perfect, and it's mostly unsuitable for any
            | system where performance is a consideration.
            | 
            | That leaves everything else, including the personal
            | utility scripts and packages I use each day to automate
            | random stuff. And I hugely appreciate how fast and simple
            | it is to develop in Python, unlike certain languages that
            | literally depend on IDEs due to their verbosity and
            | unnecessary cognitive load.
        
             | switch007 wrote:
             | > And I hugely appreciate how fast and simple it is to
             | develop in python
             | 
             | Indeed it's crazy. A few pip installs and I had a
             | multiprocessing pandas (dask) with a web gui, and a
             | workflow system (also with a web gui), and a pipeline to
             | convert csv to parquet in like 20 lines of code
        
         | woodruffw wrote:
         | > I'm going to admit that what I really want to see is a strong
         | push to standardize and fully incorporate package management
         | and distribution into python's core. Despite the work done on
         | it, it's still a mess as far as I can see, and there is no
         | single source of truth (that I know of) on how to do it.
         | 
         | Package management is standardized in a series of PEPs[1]. Some
         | of those PEPs are living documents that have versions
         | maintained under the PyPA packaging specifications[2].
         | 
         | The Python Packaging User Guide[3] is, for most things, the
         | canonical reference for how to do package distribution in
         | Python. It's also maintained by the PyPA.
         | 
         | (I happen to agree, even with all of this, that Python
         | packaging is a bit of a mess. But it's a _much better defined_
         | mess than it was even 5 years ago, and initiatives to bring
         | packaging into the core need to address ~20 years of packaging
         | debt.)
         | 
         | [1]: https://peps.python.org/topic/packaging/
         | 
         | [2]:
         | https://packaging.python.org/en/latest/specifications/index....
         | 
         | [3]: https://packaging.python.org/en/latest/flow/
        
           | jshen wrote:
           | Is there anything in there about managing dependencies within
           | a python project? What is the canonical way to do that in
           | python today?
        
             | woodruffw wrote:
             | It depends (unfortunately) on what you mean by a Python
             | project:
             | 
              | * If you mean a thing that's ultimately meant to be
              | `pip` installable, then you should use `pyproject.toml`
              | with standard `[project]` metadata (PEP 621). That
              | includes the dependencies for your project; the PyPUG
              | linked above should have an example of that.
             | 
             | * If you mean a thing that's meant to be _deployed_ with a
             | bunch of Python dependencies, then `requirements.txt` is
             | probably still your best bet.
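              | 
              | For the first case, a minimal `pyproject.toml` might
              | look like this (the project name and version pins are
              | hypothetical):

```toml
[build-system]
requires = ["setuptools>=61"]        # a backend that supports [project]
build-backend = "setuptools.build_meta"

[project]
name = "example-app"                 # hypothetical package name
version = "0.1.0"
requires-python = ">=3.8"
dependencies = [
    "requests>=2.28",                # runtime dependencies live here
]

[project.optional-dependencies]
dev = ["pytest"]                     # install with `pip install .[dev]`
```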
        
               | jshen wrote:
               | I meant the second. requirements.txt is a really bad
               | solution for that, and that is the frustration many of us
               | have that have used languages with much better solutions.
        
               | starlevel003 wrote:
               | > * If you mean a thing that's meant to be deployed with
               | a bunch of Python dependencies, then `requirements.txt`
               | is probably still your best bet.
               | 
                | This is exactly how we got into this mess. Using
                | ``setup.cfg`` or ``pyproject.toml`` for _all_
                | projects makes this easy, as your deployable project
                | can then be installed via pip like every other one.
               | 
               | 1. ``python -m virtualenv .``
               | 
               | 2. ``source ./bin/activate.fish``
               | 
               | 3. ``pip install -U
               | https://my.program.com/path/to/tarball.tar.xz``
        
           | jonnycomputer wrote:
           | which is why I said, "Despite the work done on it"
        
             | woodruffw wrote:
             | Yes, that was meant more for the "source of truth" part.
        
               | jonnycomputer wrote:
               | No, I do appreciate you taking the time to do it (I was
               | too lazy)
        
         | aranke wrote:
         | I gave a talk about this at the Packaging Summit during Pycon
         | which was well received, so the team is definitely aware of the
         | problem.
         | 
         | However, the sense I got was that it was going to be a lot of
         | work to "fix Python packaging" which wasn't feasible with an
         | all-volunteer group.
         | 
         | At work, we're migrating away from pip as a distribution
         | mechanism for this reason; I don't expect to see meaningful
         | improvements to the developer experience anytime soon.
         | 
         | This is especially true because pip today is roughly where npm
         | was in 2015, so there's a lot of fundamental infrastructure
         | work (including security) that still needs to happen. An
         | example of this is that PyPI just got the ability to namespace
         | packages.
        
           | LordKeren wrote:
           | > we're migrating away from pip as a distribution mechanism
           | for this reason
           | 
           | Could you elaborate on what you're using as a replacement?
        
             | pnt12 wrote:
              | Not the parent, but pipenv is decent and poetry is even
              | better:
              | 
              | - clear separation of dev and production dependencies
              | 
              | - a lock file with the current versions of all
              | dependencies for reproducible builds (this is slightly
              | different from the dependency specification)
              | 
              | - no accidental global installs because you forgot to
              | activate a virtual environment
              | 
              | - installing libraries directly from a git repo (not
              | sure if supported by pip), which is very useful if you
              | have internal libraries
              | 
              | - easier updates
        
           | di wrote:
           | > An example of this is that PyPI just got the ability to
           | namespace packages.
           | 
           | You're thinking of organizations, which are not namespaces:
           | https://blog.pypi.org/posts/2023-04-23-introducing-pypi-
           | orga...
        
         | bandrami wrote:
          | It's beyond "mess," well into "fiasco," and frankly I'm
          | astounded people think there's a more important issue
          | facing the language right now. Look at Spleeter, for an
          | example from a high-prestige project: it spends multiple
          | pages of its wiki describing how to install it with Conda,
          | then summarizes "Note: as of 2021 we no longer recommend
          | using Conda to install Spleeter" and nothing else.
        
           | noitpmeder wrote:
           | What are you smoking? The readme for spleeter clearly shows
           | the two simple commands needed to install -- one being a
           | conda install for system level dependencies and one being pip
           | for the spleeter python package itself.
        
       | dboreham wrote:
       | Just add some PR spin like JS did and declare the GIL a feature.
        
       ___________________________________________________________________
       (page generated 2023-06-15 23:00 UTC)