[HN Gopher] The first year of free-threaded Python
___________________________________________________________________
The first year of free-threaded Python
Author : rbanffy
Score : 230 points
Date : 2025-05-16 09:42 UTC (13 hours ago)
(HTM) web link (labs.quansight.org)
(TXT) w3m dump (labs.quansight.org)
| AlexanderDhoore wrote:
| Am I the only one who sort of fears the day when Python loses the
| GIL? I don't think Python developers know what they're asking
| for. I don't really trust complex multithreaded code in any
| language. Python, with its dynamic nature, I trust least of all.
| DHolzer wrote:
| I was thinking that too. I am really not a professional
| developer though.
|
| OFC it would be nice to just write python and everything would
| be 12x accelerated, but I don't see how there would not be any
| draw-backs that would interfere with what makes python so
| approachable.
| NortySpock wrote:
| I hope at least the option remains to enable the GIL, because I
| don't trust me to write thread-safe code on the first few
| attempts.
| txdv wrote:
| how does the language being dynamic negatively affect the
| complexity of multithreading?
| nottorp wrote:
| Is there so much legacy python multithreaded code anyway?
|
| Considering everyone knew about the GIL, I'm thinking most
| people just wouldn't bother.
| toxik wrote:
| There is, and what's worse, it assumes a global lock will
| keep things synchronized.
| rowanG077 wrote:
| Does it? The GIL only ensured each interpreter
| instruction is atomic. But any group of instructions is
| not protected. This makes it very hard to rely on the GIL
| for synchronization unless you really know what you are
| doing.
| immibis wrote:
| AFAIK a group of instructions is only non-protected if
| one of the instructions does I/O. Explicit I/O - page
| faults don't count.
| kfrane wrote:
| If I understand that correctly, it would mean that
| running a function like this on two threads f(1) and f(2)
| would produce a list of 1 and 2 without interleaving.
| def f(x):
|     for _ in range(N):
|         l.append(x)
|
| I've tried it out and they start interleaving when N is
| set to 1000000.
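The experiment kfrane describes can be reproduced with a short script (a sketch; the shared list and two threads mirror the comment, and the exact interleaving pattern varies from run to run):

```python
import threading

N = 1_000_000
l = []  # shared list, as in the comment above

def f(x):
    # Each individual append is atomic under the GIL, but the
    # interpreter may switch threads between appends, so the
    # values from the two threads end up interleaved.
    for _ in range(N):
        l.append(x)

t1 = threading.Thread(target=f, args=(1,))
t2 = threading.Thread(target=f, args=(2,))
t1.start(); t2.start()
t1.join(); t2.join()

# No appends are lost (the list always has 2*N elements), but the
# 1s and 2s are generally not two solid runs.
print(len(l))
```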
| breadwinner wrote:
| When the language is dynamic there is less rigor. Statically
| checked code is more likely to be correct. When you add
| threads to "fast and loose" code things get really bad.
| jaoane wrote:
| Unless your claim is that the same error can happen more
| times per minute because threading can execute more code in
| the same timespan, this makes no sense.
| breadwinner wrote:
| Some statically checked languages and tools can catch
| potential data races at compile time. Example: Rust's
| ownership and borrowing system enforces thread safety at
| compile time. Statically typed functional languages like
| Haskell or OCaml encourage immutability, which reduces
| shared mutable state -- a common source of concurrency
| bugs. Statically typed code can enforce usage of thread-
| safe constructs via types (e.g., Sync/Send in Rust or
| ConcurrentHashMap in Java).
| jerf wrote:
| I have a hypothesis that being dynamic has no particular
| effect on the complexity of multithreading. I think the
| apparent effect is a combination of two things: 1. All our
| dynamic scripting languages in modern use date from the 1990s
| before this degree of threading was a concern for the
| languages and 2. It is _really hard_ to retrofit code written
| for not being threaded to work in a threaded context, and the
| "deeper" the code in the system the harder it is. Something
| like CPython is about as "deep" as you can go, so it's
| really, really hard.
|
| I think if someone set out to write a new dynamic scripting
| language today, from scratch, that multithreading it would
| not pose any particular challenge. Beyond the fact that it's
| naturally a difficult problem, I mean, but nothing _special_
| compared to the many other languages that have implemented
| threading. It's all about all that code from before the
| threading era that's the problem, not the threading itself.
| And Python has a _loooot_ of that code.
| rocqua wrote:
| Dynamic(ally typed) languages, by virtue of not requiring
| strict typing, often lead to more complicated function
| signatures. Such functions are generally harder to reason
| about, because they tend to require inspection of the
| function body to see what is really going on.
|
| Multithreaded code is incredibly hard to reason about. And
| reasoning about it becomes a lot easier if you have certain
| guarantees (e.g. this argument / return value always has this
| type, so I can always do this to it). Code written in dynamic
| languages will more often lack such guarantees, because of
| the complicated signatures. This makes it even harder to
| reason about multithreaded code, increasing the risk posed by
| multithreaded code.
| miohtama wrote:
| GIL or no-GIL concerns only people who want to run multicore
| workloads. If you are not already spending time threading or
| multiprocessing your code there is practically no change. Most
| race condition issues which you need to think about are there
| regardless of the GIL.
| immibis wrote:
| With the GIL, multithreaded Python gives concurrent I/O
| without worrying about data structure concurrency (unless you
| do I/O in the middle of it) - it's a lot like async in this
| way - data structure manipulation is atomic between "await"
| expressions (except the "await" is implicit, and you might
| have written one without realizing it, in which case you have
| a bug). Meanwhile you still get to use threads to handle
| several concurrent I/O operations. I bet a _lot_ of Python
| code is written this way and will start randomly crashing if
| the data manipulation becomes non-atomic.
| rowanG077 wrote:
| Afaik the only guarantee there is, is that a bytecode
| instruction is atomic. Built in data structures are mostly
| safe I think on a per operation level. But combining them
| is not. I think by default, every few milliseconds the
| interpreter checks for other threads to run, even if there
| are no IO or async actions. See `sys.getswitchinterval()`.
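A sketch of the lost-update race this allows even with the GIL: `count += 1` compiles to separate load/add/store bytecodes, and a switch between them can silently drop an increment, so the final value is only guaranteed to be at most 2 * N:

```python
import sys
import threading

print(sys.getswitchinterval())  # default switch interval, typically 0.005s

N = 100_000
count = 0

def worker():
    global count
    for _ in range(N):
        # Load count, add 1, store back: three bytecodes. A thread
        # switch between the load and the store loses an increment.
        count += 1

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Never crashes under the GIL, but count may come out below 2*N.
print(count)
```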
| hamandcheese wrote:
| This is the nugget of information I was hoping for. So
| indeed even GIL threaded code today can suffer from
| concurrency bugs (more so than many people here seem to
| think).
| ynik wrote:
| Bytecode instructions have never been atomic in Python's
| past. It was always possible for the GIL to be
| temporarily released, then reacquired, in the middle of
| operations implemented in C. This happens because C code
| is often manipulating the reference count of Python
| objects, e.g. via the `Py_DECREF` macro. But when a
| reference count reaches 0, this might run a `__del__`
| function implemented in Python, which means the "between
| bytecode instructions" thread switch can happen inside
| that reference-counting-operation. That's a lot of
| possible places!
|
| Even more fun: allocating memory could trigger Python's
| garbage collector which would also run `__del__`
| functions. So every allocation was also a possible (but
| rare) thread switch.
|
| The GIL was only ever intended to protect Python's
| internal state (esp. the reference counts themselves);
| any extension modules assuming that their own state would
| also be protected were likely already mistaken.
| rowanG077 wrote:
| Well I didn't think of this myself. It's literally what
| the python official doc says:
|
| > A global interpreter lock (GIL) is used internally to
| ensure that only one thread runs in the Python VM at a
| time. In general, Python offers to switch among threads
| only between bytecode instructions; how frequently it
| switches can be set via sys.setswitchinterval(). Each
| bytecode instruction and therefore all the C
| implementation code reached from each instruction is
| therefore atomic from the point of view of a Python
| program.
|
| https://docs.python.org/3/faq/library.html#what-kinds-of-
| glo...
|
| If this is not the case please let the official python
| team know their documentation is wrong. It indeed does
| state that if Py_DECREF is invoked the bets are off. But
| a ton of operations never do that.
| imtringued wrote:
| You start talking about GIL and then you talk about non-
| atomic data manipulation, which happen to be completely
| different things.
|
| The only code that is going to break because of "no GIL"
| is C extensions, for a very obvious reason: C code can now
| be called from multiple threads, which wasn't possible
| before. Python code could always be called from multiple
| Python threads, even in the presence of the GIL.
| OskarS wrote:
| That doesn't match with my understanding of free-threaded
| Python. The GIL is being replaced with fine-grained locking
| on the objects themselves, so sharing data-structures
| between threads is still going to work just fine. If you're
| talking about concurrency issues like this causing out-of-
| bounds errors:
|
|     if len(my_list) > 5:
|         print(my_list[5])
|
| (i.e. because a different thread can pop from the list in-
| between the check and the print), that could just as easily
| happen today. The GIL makes sure that only one python
| interpreter runs at once, but it's entirely possible that
| the GIL is released and switches to a different thread
| after the check but before the print, so there's no extra
| thread-safety issue in free-threaded mode.
|
| The problems (as I understand it, happy to be corrected),
| are mostly two-fold: performance and ecosystem. Using fine-
| grained locking is potentially much less efficient than
| using the GIL in the single-threaded case (you have to take
| and release many more locks, and reference count updates
| have to be atomic), and many, many C extensions are written
| under the assumption that the GIL exists.
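The check-then-act race in the snippet above needs the same fix with or without the GIL: hold a lock across the check and the access. A minimal sketch (the lock and helper names are illustrative):

```python
import threading

my_list = [0, 1, 2, 3, 4, 5, 6]
lock = threading.Lock()

def safe_read():
    # The length check and the index access happen as one atomic
    # step relative to any other thread that also takes the lock.
    with lock:
        if len(my_list) > 5:
            return my_list[5]
        return None

def safe_pop():
    with lock:
        return my_list.pop() if my_list else None

print(safe_read())  # → 5
```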
| fulafel wrote:
| A lot of Python usage is leveraging libraries with parallel
| kernels inside written in other languages. A subset of those
| is bottlenecked on Python side speed. A sub-subset of those
| are people who want to try no-GIL to address the bottleneck.
| But if no-GIL becomes pervasive, it could mean Python
| becomes less safe for the "just parallel kernels" users.
| kccqzy wrote:
| Yes sure. Thought experiment: what happens when these
| parallel kernels suddenly need to call back into Python?
| Let's say you have a multithreaded sorting library. If you
| are sorting numbers then fine nothing changes. But if you
| are sorting objects you need to use a single thread because
| you need to call PyObject_RichCompare. These new parallel
| kernels will then try to call PyObject_RichCompare from
| multiple threads.
| quectophoton wrote:
| I don't want to add more to your fears, but also remember that
| LLMs have been trained on decades worth of Python code that
| assumes the presence of the GIL.
| rocqua wrote:
| This could, indeed, be quite catastrophic.
|
| I wonder if companies will start adding this to their system
| prompts.
| dotancohen wrote:
| As a Python dabbler, what should I be reading to ensure my
| multi-threaded code in Python is in fact safe.
| cess11 wrote:
| The literature on distributed systems is huge. It depends a
| lot on your use case what you ought to do. If you're lucky
| you can avoid shared state entirely, and with it race
| conditions at either end of your executions.
|
| https://www.youtube.com/watch?v=_9B__0S21y8 is fairly concise
| and gives some recommendations for literature and techniques,
| obviously making an effort in promoting PlusCal/TLA+ along
| the way but showcases how even apparently simple algorithms
| can be problematic as well as how deep analysis has to go to
| get you a guarantee that the execution will be bug free.
| dotancohen wrote:
| My current concern is a CRUD interface that transcribes
| audio in the background. The transcription is triggered by
| user action. I need the "transcription" field disabled
| until the transcript is complete and stored in the
| database, then allow the user to edit the transcription in
| the UI.
|
| Of course, while the transcription is in action the rest of
| the UI (Qt via Pyside) should remain usable. And multiple
| transcription requests should be supported - I'm thinking
| of a pool of transcription threads, but I'm uncertain how
| many to allocate. Half the quantity of CPUs? All the CPUs
| under 50% load?
|
| Advice welcome!
| realreality wrote:
| Use `concurrent.futures.ThreadPoolExecutor` to submit
| jobs, and `Future.add_done_callback` to flip the
| transcription field when the job completes.
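A sketch of that pattern; `transcribe` is a stand-in for the real transcription work, and in the actual app the callback would notify the UI rather than append to a list:

```python
from concurrent.futures import ThreadPoolExecutor

def transcribe(audio_path):
    # Placeholder for the real, long-running transcription job.
    return f"transcript of {audio_path}"

results = []

def on_done(future):
    # Runs in a pool thread once the job finishes; this is where
    # the transcription field would be re-enabled.
    results.append(future.result())

with ThreadPoolExecutor(max_workers=4) as executor:
    future = executor.submit(transcribe, "clip.wav")
    future.add_done_callback(on_done)
# Leaving the with-block waits for jobs (and their callbacks).
print(results)
```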
| ptx wrote:
| Although keep in mind that the callback will be "called
| in a thread belonging to the process" (say the docs),
| presumably some thread that is not the UI thread. So the
| callback needs to post an event to the UI thread's event
| queue, where it can be picked up by the UI thread's event
| loop and only then perform the UI updates.
|
| I don't know how that's done in Pyside, though. I
| couldn't find a clear example. You might have to use a
| QThread instead to handle it.
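One dependency-free way to express that hand-off: the worker thread never touches the UI and only posts results to a queue, which the UI thread drains from its own event loop (in Pyside you would drain it from a QTimer callback, or use a signal/slot connection, which Qt delivers across threads safely):

```python
import queue
import threading

ui_events = queue.Queue()  # written by workers, read by the UI thread only

def worker(job_id):
    result = f"transcript {job_id}"          # stand-in for the real work
    ui_events.put(("done", job_id, result))  # Queue.put is thread-safe

threading.Thread(target=worker, args=(1,)).start()

# In the UI thread (e.g. inside a periodic timer callback):
kind, job_id, result = ui_events.get(timeout=5)
print(kind, job_id)  # only now is it safe to update widgets
```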
| dotancohen wrote:
| Thank you. Perhaps I should trigger the transcription
| thread from the UI thread, then? It is a UI button that
| initiates it after all.
| dotancohen wrote:
| Thank you.
| sgarland wrote:
| Just use multiprocessing. If each job is independent and
| you aren't trying to spread it out over multiple workers,
| it seems much easier and less risky to spawn a worker for
| each job.
|
| Use SharedMemory to pass the data back and forth.
| HDThoreaun wrote:
| Honestly, unless you're willing to devote a solid 4+ hours to
| learning about multithreading, stick with asyncio.
| dotancohen wrote:
| I'm willing to invest an afternoon learning. That's been
| the premise of my entire career!
| bayindirh wrote:
| More realistically, as it happened in ML/AI scene, the
| knowledgeable people will write the complex libraries and will
| hand these down to scientists and other less experienced, or
| risk-averse developers (which is _not_ a bad thing).
|
| With the critical mass Python acquired over the years, GIL
| becomes a very sore bottleneck in some cases. This is why I
| decided to learn Go, for example: a properly threaded (and
| green-threaded) programming language which is higher level
| than C/C++ but lower than Python, and which allows me to do
| things I can't do with Python. Compilation is another reason,
| but it was secondary to threading.
| jillesvangurp wrote:
| You are not the only one who is afraid of change and a bit
| change-resistant. I think the issue here is that the reasons
| for this fear are not very rational. And also the interest of
| the wider community is to deal with technical debt. And the GIL
| is pure technical debt. Defensible 30 years ago, a bit awkward
| 20 years ago, and downright annoying and embarrassing now that
| world + dog does all their AI data processing with python at
| scale for the last 10 years. It had to go in the interest of
| future-proofing the platform.
|
| What changes for you? Nothing unless you start using threads.
| You probably weren't using threads anyway because there is
| little to no point in using them in Python. Most Python code
| bases completely ignore the threading module and instead use
| non blocking IO, async, or similar things. The GIL thing only
| kicks in if you actually use threads.
|
| If you don't use threads, removing the GIL changes nothing.
| There's no code that will break. All those C libraries that
| aren't thread safe are still single threaded, etc. Only if you
| now start using threads do you need to pay attention.
|
| There's of course some threaded Python code that people may
| have written somewhat naively in the hope that it would make
| things faster, but that is constantly hitting the GIL and is
| effectively single threaded. That code now might run a
| little faster. And probably with more bugs because naive
| threaded code tends to have those.
|
| But a simple solution to address your fears: simply don't use
| threads. You'll be fine.
|
| Or learn how to use threads. Because now you finally can and it
| isn't that hard if you have the right abstractions. I'm sure
| those will follow in future releases. Structured concurrency is
| probably high on the agenda of some people in the community.
| HDThoreaun wrote:
| > But a simple solution to address your fears: simply don't
| use threads. You'll be fine.
|
| I'm not worried about new code. I'm worried about stuff written
| 15 years ago by a monkey who had no idea how threads work and
| just read something on stack overflow that said to use
| threading. This code will likely break when run post-GIL. I
| suspect there is actually quite a bit of it.
| bgwalter wrote:
| If it is C-API code: Implicit protection of global
| variables by the GIL is a documented feature, which makes
| writing extensions much easier.
|
| Most C extensions that will break are not written by
| monkeys, but by conscientious developers that followed best
| practices.
| bayindirh wrote:
| Software rots, software tools evolve. When Intel released
| performance primitives libraries which required
| recompilation to analyze multi-threaded libraries, we were
| amazed. Now, these tools are built into _processors_ as
| performance counters and we have way more advanced tools to
| analyze how systems behave.
|
| Older code will break, but code breaks all the time. A
| language changes how something behaves in a new revision, and
| suddenly 20-year-old bedrock tools are getting massively
| patched to accommodate both new and old behavior.
|
| Is it painful, ugly, unpleasant? Yes, yes and yes. However
| change is inevitable, because some of the behavior was
| rooted in _inability to do some things_ with current
| technology, and as hurdles are cleared, we change how
| things work.
|
| My father's friend told me that the length of a variable's
| name used to affect compile/link times. Now we can test whether
| we have memory leaks in Rust. That thing was impossible 15
| years ago due to performance of the processors.
| delusional wrote:
| > Software rots
|
| No it does not. I hate that analogy so much because it
| leads to such bad behavior. Software is a digital
| artifact that does not degrade. With the right
| attitude, you'd be able to execute the same binary on new
| machines for as long as you desired. That is not true of
| organic matter that actually rots.
|
| The only reason we need to change software is that we
| trade that off against something else. Instructions are
| reworked, because chasing the universal Turing machine
| takes a few sacrifices. If all software has to run on the
| same hardware, those two artifacts have to have a
| dialogue about what they need from each other.
|
| If we didn't want the universal machine to do anything
| new, and we had a valuable product, we could just keep
| making the machine that executes that product. It never
| rots.
| kstrauser wrote:
| That's not what the phrase implies. If you have a C
| program from 1982, you can still compile it on a 1982
| operating system and toolchain and it'll work just as
| before.
|
| But if you tried to compile it on today's libc, making
| today's syscalls... good luck with that.
|
| Software "rots" in the sense that it has to be updated to
| run on today's systems. They're a moving target. You can
| still run HyperCard on an emulator, but good luck running
| it unmodded on a Mac you buy today.
| dahcryn wrote:
| yes it does.
|
| If software is implicitly built on wrong understanding,
| or undefined behaviour, I consider it rotting when it
| starts to fall apart as those undefined behaviours get
| defined. We do not need to sacrifice a stable future
| because of a few 15 year old programs. Let the people who
| care about the value that those programs bring, manage
| the update cycle and fix it.
| eblume wrote:
| Software is written with a context, and the context
| degrades. It must be renewed. It rots, sorry.
| igouy wrote:
| You said it's the context that rots.
| bayindirh wrote:
| It's a matter of perspective, I guess...
|
| When you look from the program's perspective, the context
| changes and becomes unrecognizable, IOW, it rots.
|
| When you look from the context's perspective, the program
| changes by not evolving and keeping up with the context,
| IOW, it rots.
|
| Maybe we anthropomorphize both and say "they grow apart".
| :)
| igouy wrote:
| We say the context has breaking changes.
|
| We say the context is not backwards compatible.
| indymike wrote:
| >> Software rots
|
| > No it does not.
|
| I'm thankful that it does, or I would have been out of
| work long ago. It's not that the files change (literal
| rot), it is that hardware, OSes, libraries, and
| everything else changes. I'm also thankful that we have
| not stopped innovating on all of the things the software
| I write depends on. You know, another thing changes -
| what we are using the software for. The accounting
| software I wrote in the late 80s... would produce
| financial reports that were what was expected then, but
| would not meet modern GAAP requirements.
| rocqua wrote:
| Fair point, but there is an interesting question posed.
|
| Software doesn't rot, it remains constant. But the
| context around it changes, which means it loses
| usefulness slowly as time passes.
|
| What is the name for this? You could say 'software
| becomes anachronistic'. But is there a good verb for
| that? It certainly seems like something that a lot more
| than just software experiences. Plenty of real world
| things that have been perfectly preserved are now much
| less useful because the context changed. Consider an
| Oxen-yoke, typewriters, horse-drawn carriages, envelopes,
| phone switchboards, etc.
|
| It really feels like this concept should have a verb.
| igouy wrote:
| obsolescence
| cestith wrote:
| My only concern is that this kind of change in semantics for
| existing syntax is more worthy of a major revision than a
| point release.
| rbanffy wrote:
| It's opt-in at the moment. It won't be the default
| behavior for a couple releases.
|
| Maybe we'll get Python 4 with no GIL.
|
| /me ducks
| spookie wrote:
| The other day I compiled a 1989 C program and it did the
| job.
|
| I wish more things were like that. Tired of building
| things on shaky grounds.
| rbanffy wrote:
| If you go into mainframes, you'll compile code that was
| written 50 years ago without issue. In fact, you'll run
| code that was compiled 50 years ago and all that'll
| happen is that it'll finish much sooner than it did on
| the old 360 it originally ran on.
| actinium226 wrote:
| If code has been unmaintained for more than a few years,
| it's usually such a hassle to get it working again that 99%
| of the time I'll just write my own solution, and that's
| without threads.
|
| I feel some trepidation about threads, but at least for
| debugging purposes there's only one process to attach to.
| dhruvrajvanshi wrote:
| > I'm not worried about new code. I'm worried about stuff
| written 15 years ago by a monkey who had no idea how
| threads work and just read something on stack overflow that
| said to use threading. This code will likely break when run
| post-GIL. I suspect there is actually quite a bit of it.
|
| I was with OP's point but then you lost me. You'll always
| have to deal with that coworker's shitty code, GIL or not.
|
| Could they make a worse mess with multithreading? Sure. Is
| their single threaded code as bad anyway because at the end
| of the day, you can't even begin understand it? Absolutely.
|
| But yeah I think python people don't know what they're
| asking for. They think GIL-less Python is gonna give
| everyone free puppies.
| zahlman wrote:
| >Im worried about stuff written 15 years ago
|
| Please don't - it isn't relevant.
|
| 15 years ago, new Python code was still dominantly for 2.x.
| Even code written back then with an eye towards 3.x
| compatibility (or, more realistically, lazily run through
| `2to3` or `six`) will have quite little chance of running
| acceptably on 3.14 regardless. There have been considerable
| removals from the standard library, `async` is no longer a
| valid identifier name (you laugh, but that broke Tensorflow
| once). The attitude taken towards """strings""" in a lot of
| 2.x code results in constructs that can be automatically
| made into valid syntax that _appears_ to preserve the
| original intent, but which are not at all automatically
| _fixed_.
|
| Also, the modern expectation is of a lock-step release
| cadence. CPython only supports up to the last 5 versions,
| released annually; and whenever anyone publishes a new
| version of a package, generally they'll see no point in
| supporting unsupported Python versions. Nor is anyone who
| released a package in the 3.8 era going to patch it if it
| breaks in 3.14 - because _support for 3.14 was never
| advertised anyway_. In fact, in most cases, support for
| _3.9_ wasn't originally advertised, and you _can't update
| the metadata_ for an existing package upload (you have to
| make a new one, even if it's just a "post-release") even if
| you test it and it _does_ work.
|
| Practically speaking, pure-Python packages _usually do_
| work in the next version, and in the next several versions,
| perhaps beyond the support window. But you can really never
| predict what's going to break. You can only offer a new
| version when you find out that it's going to break - and a
| lot of developers are going to just roll that fix into the
| feature development they were doing anyway, because life's
| too short to backport everything for everyone. (If there's
| no longer active development and only maintenance, well,
| good luck to everyone involved.)
|
| If 5 years isn't long enough for your purposes, practically
| speaking you need to maintain an environment with an
| outdated interpreter, and find a third party (RedHat seems
| to be a popular choice here) to maintain it.
| dkarl wrote:
| > What changes for you? Nothing unless you start using
| threads
|
| Coming from the Java world, you don't know what you're
| missing. Looking inside an application and seeing a bunch of
| threadpools managed by competing frameworks, debugging
| timeouts and discovering that tasks are waiting more than a
| second to get scheduled on the wrong threadpool, tearing your
| hair out because someone split a tiny sub-10ms bit of
| computation into two tasks and scheduling the second takes a
| hundred times longer than the actual work done, adding a
| library for a trivial bit of functionality and discovering
| that it spins up yet another threadpool when you initialize
| it.
|
| (I'm mostly being tongue in cheek here because I know it's
| nice to have threading when you need it.)
| rbanffy wrote:
| > There's some threaded python code of course
|
| A fairly common pattern for me is to start a terminal UI
| updating thread that redraws the UI every second or so while
| one or more background threads do their thing. Sometimes,
| it's easier to express something with threads and we do it
| not to make the process faster (we kind of accept it will be
| a bit slower).
|
| The real enemy is state that can be mutated from more than
| one place. As long as you know who can change what, threads
| are not that scary.
| zem wrote:
| this looks extremely promising
| https://microsoft.github.io/verona/pyrona.html
| freeone3000 wrote:
| I'm sure you'll be happy using the last language that has to
| fork() in order to thread. We've only had consumer-level
| multicore processors for 20 years, after all.
| im3w1l wrote:
| You have to understand that people come from very different
| angles with Python. Some people write web servers in Python,
| where speed equals money saved. Other people write little UI
| apps where speed is a complete non-issue. Yet others write
| AI/ML code that spends most of its time in GPU code, but then
| they want to do just a little data massaging in Python, which
| can easily bottleneck the whole thing. And some people write
| scripts that don't use a .env but rather OS libraries.
| bratao wrote:
| This is a common mistake and very badly communicated. The GIL
| does not make Python code thread-safe. It only protects the
| internal CPython state. Multi-threaded Python code is not
| thread-safe today.
| amelius wrote:
| Well, I think you can manipulate a dict from two different
| threads in Python, today, without any risk of segfaults.
| pansa2 wrote:
| You can do so in free-threaded Python too, right? The dict
| is still protected by a lock, but one that's much more
| fine-grained than the GIL.
| amelius wrote:
| Sounds good, yes.
| porridgeraisin wrote:
| Internal cpython state also includes say, a dictionary's
| internal state. So for practical purposes it is safe. Of
| course, TOCTOU, stale reads and various race conditions are
| not (and can never be) protected by the GIL.
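A sketch of that distinction: each individual dict operation is safe on its own, but a read-modify-write composed of two operations is a race the GIL never protected, so it needs a lock either way (the names here are illustrative):

```python
import threading

counts = {}
lock = threading.Lock()

def unsafe_inc(key):
    # get() and the assignment are two separate dict operations;
    # another thread can run in between and its update be lost.
    counts[key] = counts.get(key, 0) + 1

def safe_inc(key):
    # Holding a lock makes the read-modify-write one atomic step.
    with lock:
        counts[key] = counts.get(key, 0) + 1

threads = [threading.Thread(target=safe_inc, args=("hits",))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counts["hits"])  # → 8
```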
| kevingadd wrote:
| This should not have been downvoted. It's true that the GIL
| does not make python code thread-safe implicitly, you have to
| either construct your code carefully to be atomic (based on
| knowledge of how the GIL works) or make use of mutexes,
| semaphores, etc. It's just memory-safe and can still have
| races etc.
| tialaramex wrote:
| You're not the only one. David Baron's note certainly applies:
| https://bholley.net/blog/2015/must-be-this-tall-to-write-mul...
|
| In a language conceived for this kind of work it's not as easy
| as you'd like. In most languages you're going to write nonsense
| which has no coherent meaning whatsoever. Experiments show that
| humans can't successfully understand non-trivial programs
| unless they exhibit Sequential Consistency - that is, they can
| be understood as if (which is not reality) all the things which
| happen do happen in some particular order. This is not the
| reality of how the machine works, for subtle reasons, but
| without it merely human programmers are like "Eh, no idea, I
| guess everything is computer?". It's really easy to write
| concurrent programs which do not satisfy this requirement in
| most of these languages, you just can't debug them or reason
| about what they do - a disaster.
|
| As I understand it Python without the GIL will enable more
| programs that lose SC.
| qznc wrote:
| Worst case is probably that it is like a "Python4": Things
| break when people try to update to no-GIL, so they'd rather
| stay with the old version for decades.
| odiroot wrote:
| It's called job security. We'll be rewriting decades of code
| that's broken by that transition.
| almostgotcaught wrote:
| Do you understand what you're implying?
|
| "Python programmers are so incompetent that Python succeeds as
| a language only because it lacks features they wouldn't know to
| use"
|
| Even if it's circumstantially true, doesn't mean it's the right
| guiding principle for the design of the language.
| frollogaston wrote:
| What reliance did you have in mind? All sorts of calls in
| Python can release the GIL, so you already need locking, and
| there are race conditions just like in most languages. It's not
| like JS where your code is guaranteed to run in order until you
| "await" something.
|
| I don't fully understand the challenge with removing it, but
| thought it was something about C extensions, not something most
| users have to directly worry about.
| pawanjswal wrote:
| This is some serious groundwork for the next era of performance!
| pansa2 wrote:
| Does removal of the GIL have any _other_ effects on multi-
| threaded Python code (other than allowing it to run in parallel)?
|
| My understanding is that the GIL has lasted this long not because
| multi-threaded Python depends on it, but because removing it:
|
| - Complicates the implementation of the interpreter
|
| - Complicates C extensions, and
|
| - Causes single-threaded code to run slower
|
| Multi-threaded Python code already has to assume that it can be
| pre-empted on the boundary between any two bytecode instructions.
| Does free-threaded Python provide the same guarantees, or does it
| require multi-threaded Python to be written differently, e.g. to
| use additional locks?
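The pre-emption pansa2 describes can be made concrete with a toy sketch (illustrative, not from the thread): `counter += 1` compiles to separate load, add, and store bytecodes, so with or without the GIL a thread can be pre-empted mid-update and the increment needs a lock.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    # "counter += 1" is a LOAD, an ADD, and a STORE in bytecode;
    # a thread can be pre-empted between any two of them, with
    # the GIL or without it, so we guard it with a lock.
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 -- drop the lock and you may see less
```

The same locking discipline is required under the GIL; free-threading just makes the unlocked version lose updates far more often.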
| rfoo wrote:
| > Does free-threaded Python provide the same guarantees
|
| Mostly. Some of the "can be pre-empted on the boundary between
| any two bytecode instructions" bugs are really hard to hit
| without free-threading, though. And without free-threading
| people don't use as much threading stuff. So by nature it
| exposes more bugs.
|
| Now, my rants:
|
| > have any other effects on multi-threaded Python code
|
| It stops people from using multi-process workarounds. Hence, it
| simplifies user-code. IMO totally worth it to make the
| interpreter more complex.
|
| > Complicates C extensions
|
| The alternative (sub-interpreters) complicates C extensions
| more than free-threading does, and the single most important
| C extension in the entire ecosystem, numpy, stated that they
| can't and don't want to support sub-interpreters. On the
| contrary, they already support free-threading today and are
| actively sorting out remaining bugs.
|
| > Causes single-threaded code to run slower
|
| That's the trade-off. Personally I think a single-digit
| percentage slow-down of single-threaded code is worth it.
| celeritascelery wrote:
| > That's the trade-off. Personally I think a single-digit
| percentage slow-down of single-threaded code is worth it.
|
| Maybe. I would expect that 99% of python code going forward
| will still be single threaded. You just don't need that extra
| complexity for most code. So I would expect that python code
| as a whole will have worse performance, even though a handful
| of applications will get faster.
| pphysch wrote:
| But the bar to parallelizing code gets much lower, in
| theory. Your serial code got 5% slower but has a direct
| path to being 50% faster.
|
| And if there's a good free-threaded HTTP server
| implementation, the RPS of "Python code as a whole" could
| increase dramatically.
| weakfish wrote:
| Is there any news from FastAPI folks and/or Gunicorn on
| their support?
| fjasdfas wrote:
| You can do multiple processes with SO_REUSEPORT.
|
| free-threaded makes sense if you need shared state.
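A minimal sketch of the `SO_REUSEPORT` approach (Linux 3.9+ or a BSD with support; in a real server each worker process would do its own bind):

```python
import socket

def make_listener(host, port):
    # Setting SO_REUSEPORT before bind() lets several sockets share
    # one port; the kernel load-balances incoming connections
    # across them, giving per-process parallelism with no IPC.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind((host, port))
    s.listen()
    return s

a = make_listener("127.0.0.1", 0)        # pick a free port
port = a.getsockname()[1]
b = make_listener("127.0.0.1", port)     # second bind on the same port succeeds
same = (b.getsockname()[1] == port)
print(same)  # True
a.close()
b.close()
```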
| pphysch wrote:
| Any webserver that wants to cache and reuse content cares
| about shared state, but usually has to outsource that to
| a shared in-memory database because the language can't
| support it.
| rfoo wrote:
| That's the mindset that leads to the funny result that `uv
| pip` is like 10x faster than `pip`.
|
| Is it because Rust is just fast? Nope. For anything after
| resolving dependency versions, raw CPU performance doesn't
| matter at all. It's that writing concurrent PLUS parallel
| code in Rust is easier: you don't need to spawn a few
| processes and wait for the interpreter to start in each, and
| you don't need to constantly serialize whatever you want to
| run. So, someone did it!
|
| Yet, there's a pip maintainer who actively sabotages free-
| threading work. Nice.
| notpushkin wrote:
| > Yet, there's a pip maintainer who actively sabotages
| free-threading work.
|
| Wow. Could you elaborate?
| foresto wrote:
| As I recall, CPython has also been getting speed-ups
| lately, which ought to make up for the minor single-
| threaded performance loss introduced by free threading.
| With that in mind, the recent changes seem like an overall
| win to me.
| rocqua wrote:
| Note that there is an entire order of magnitude range for a
| 'single digit'.
|
| A 1% slowdown seems totally fine. A 9% slowdown is pretty
| bad.
| jacob019 wrote:
| Your understanding is correct. You can use all the cores but
| it's much slower per thread and existing libraries may need to
| be reworked. I tried it with PyTorch, it used 10x more CPU to
| do half the work. I expect these issues to improve, still great
| to see after 20 years wishing for it.
| btilly wrote:
| It makes race conditions easier to hit, and that will require
| multi-threaded Python to be written with more care to achieve
| the same level of reliability.
| heybrendan wrote:
| I am a Python user, but far from an expert. Occasionally, I've
| used 'concurrent.futures' to kick off running some very simple
| functions, at the same time.
|
| How are 'concurrent.futures' users impacted? What will I need to
| change moving forward?
| rednafi wrote:
| It's going to get faster since threads won't be serialized
| by the GIL. If you're locking shared objects correctly or
| not using them at all, then you should be good.
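In practice "locking shared objects correctly" with `concurrent.futures` looks roughly like this (a hypothetical sketch; the pool API itself needs no changes):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

results = {}
results_lock = threading.Lock()

def work(key):
    value = key * key            # stand-in for real computation
    with results_lock:           # explicit lock around shared state
        results[key] = value
    return value

with ThreadPoolExecutor(max_workers=4) as pool:
    returned = list(pool.map(work, range(10)))

print(len(results), sum(returned))  # 10 285
```

Code written this way runs unchanged on both GIL and free-threaded builds; only code that implicitly relied on the GIL for atomicity needs rework.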
| 0x000xca0xfe wrote:
| I know it's just an AI image... but a snake with two tails?
| C'mon!
| brookst wrote:
| Confusoborus
| vpribish wrote:
| shh. don't complain too loudly or we'll lose an important tell.
| python articles using snake illustrations can usually be
| ignored because they are not clueful.
|
| -- python, monty
| bgwalter wrote:
| This is just an advertisement for the company. Fact is, free-
| threading is still up to 50% slower, the tail call interpreter
| isn't much faster at all, and free-threading is still flaky.
|
| Things they won't tell you at PyCon.
| tomrod wrote:
| Quansight isn't a formal company though, it's a skunkworks/OSS
| research group run by Travis Oliphant.
| lenerdenator wrote:
| I don't see how any of that's a problem given that it's not the
| default for how people run Python.
|
| It's a big project that's going to take lots of time by lots of
| people to finish. Keep it behind opt-in, keep accepting pull
| requests after rigorous testing, and it's fine.
| pjmlp wrote:
| In other news, Microsoft dumped the whole faster Python team,
| apparently the 2025 earnings weren't enough to keep the team
| around.
|
| https://www.linkedin.com/posts/mdboom_its-been-a-tough-coupl...
|
| Let's see what performance improvements still land on CPython,
| unless another company sponsors the work.
|
| I guess Facebook (no need to correct me on the name) is still
| sponsoring part of it.
| falcor84 wrote:
| It wouldn't have bothered me if you just said "Facebook" - I
| probably wouldn't have even noticed it. But I'm really curious
| why you chose to write "Facebook", then apparently noticed the
| issue, and instead of replacing it with "Meta" decided to add
| the much longer "(no need to correct me on the name)". What axe
| are you grinding?
| pjmlp wrote:
| Yes, because I am quite certain someone without anything
| better to do would correct me on that.
|
| For me Facebook will always be Facebook, and Twitter will
| always be Twitter.
| falcor84 wrote:
| > Yes, because I am quite certain someone without anything
| better to do would correct me on that.
|
| Well, you sure managed to avoid that by setting up camp on
| that hill. Kudos on so much time saved.
|
| > For me Facebook will always be Facebook, and Twitter will
| always be Twitter.
|
| Well, for me the product will always be "Thefacebook", but
| that's because I haven't used it since then. But I do respect that
| there's a company running it now that does more stuff and
| contributes to open source projects.
| biorach wrote:
| > Well, you sure managed to avoid that by setting up camp
| on that hill. Kudos on so much time saved.
|
| Why are you picking a fight about this?
| falcor84 wrote:
| I think I'm taking it personally because I had previously
| changed my name and had people repeatedly call me by my
| old name just to annoy/hurt me.
|
| Obviously I know that companies aren't people and don't
| have feelings, but I can't understand why you would
| intentionally avoid using their chosen name, even when
| it's more effort to you.
| kstrauser wrote:
| I wouldn't do that to a person. I'm not worried about
| hurting Twitter's feelings, though.
| Flamentono2 wrote:
| With money which destroyed our society
| rbanffy wrote:
| > Twitter will always be Twitter.
|
| If Elon can deadname his daughter, then we can deadname his
| company.
| kstrauser wrote:
| That's the rationale I've been using.
| rich_sasha wrote:
| Ah that's very, very sad. I guess they have embraced and
| extended, there's only one thing left to do.
| biorach wrote:
| At this stage the cliched and clueless comments about
| embrace/extend/extinguish are tiresome and inevitable
| whenever Microsoft is mentioned.
|
| A few decades ago MS did indeed have a playbook which they
| used to undermine open standards. Laying off some members of
| the Python team bears no resemblance whatsoever to that. At
| worst it will delay the improvement of free-threaded Python.
| That's all.
|
| Your comment is lazy and unfounded.
| kstrauser wrote:
| _cough_ Bullshit _cough_
|
| * VSCode got popular and they started preventing forks from
| installing its extensions.
|
| * They extended the open-source pyright language server
| into the proprietary pylance. They don't even sell it. It's
| just there to make the FOSS version less useful.
|
| * They bought GitHub and started rate limiting it to
| unlogged in visitors.
|
| Every time Microsoft touches a thing, they end up locking
| it down. They can't help it. It's their nature. And if
| you're the frog carrying that scorpion across the pond and
| it stings you, well, you can only blame it so much. You
| knew this when they offered the deal.
|
| Every time. It hasn't changed substantially since they
| declared that Linux is cancer, except to be more subtle in
| their attacks.
| oblio wrote:
| I actually hate this trope more because of what it says
| about the poster. Which, I guess, would be that they're
| someone wearing horse blinders.
|
| There's a part of me that wants to scream at them:
|
| "Look around you!!! It's not 1999 anymore!!! These days
| we have Google, Amazon, Apple, Facebook, etc, which are
| just as bad if not worse!!! Cut it out with the 20+ year
| old bad jokes!!!"
|
| Yes, Microsoft is bad. The reason Micr$oft was the enemy
| back in the day is because they... won. They were bigger
| than anyone else in the fields that mattered (except for
| server-side, where they almost won). Now they're just one
| in a gang of evils. There's nothing special about them
| anymore. I'm more scared of Apple and Google.
| kstrauser wrote:
| That's only reasonable if you believe you can only
| distrust one company at a time. I distrust every one you
| mentioned there, for different reasons, in different
| ways. I don't think that Apple is trying to exclusively
| own the field of programming tools to their own profit,
| nor do I think that Facebook is. I don't think Apple is
| trying to own all data about every human. I don't think
| Microsoft is trying to force all vendors to sell through
| their app store.
|
| But the thing is that Microsoft hasn't seemed to
| fundamentally change since 1999. They appear kinder and
| friendlier but they keep running the same EEE playbook
| everywhere they can. Lots of us give them a free pass
| because they let us run a nifty free-for-now programming
| editor. That doesn't change the leopard's spots, though.
| mixmastamyk wrote:
| All these posts and no one mentioned their numerous,
| recent, abusive deeds around Windows or negligent
| security posture, all the while having captured Uncle Sam
| and other governments.
|
| MS has continued to metastasize and is in some ways worse
| than the old days, even if they've finally accepted the
| utility of open source as a loss leader.
|
| They have the only BigTech products I've been forced to
| use if I want to eat.
| oblio wrote:
| Yet I only ever see these tired EEE memes for Microsoft
| when Chrome is basically the web, for example.
| kstrauser wrote:
| I don't know what to tell you, except that you obviously
| haven't read a lot of my stuff on that topic. (Not that I
| would expect anyone to have, mind you. I'm nobody.) I
| agree with you. I only use Chrome when I must, like when
| I'm updating a Meshtastic radio and the flasher app
| doesn't run on Firefox or Safari.
|
| I'm not anti-MS as much as anti their behavior, whoever
| is acting that way. This thread is directly related to MS
| so I'm expressing my opinion on MS here. I'll be more
| than happy to share my thoughts on Chrome in a Google
| thread.
| biorach wrote:
| None of those were independent projects or open
| standards. VScode and pyright are both MS projects from
| the get-go.
|
| Sabotaging forks is scummy, but the forks were extending
| MS functionality, not the other way around.
|
| GitHub was a private company before it was bought by MS.
| Rate limiting is.... not great, but certainly not an
| extinguish play.
|
| EEE refers to the subversion of open standards or
| independent free software projects. It does not apply to
| any of the above.
|
| MS are still scummy but at least attack them on their own
| demerits, and don't parrot some schtick from decades ago.
| kstrauser wrote:
| It's not just EEE, though. They have a history of getting
| devs all in on a thing and then killing it with
| corporate-grade ADHD. They bought Visual FoxPro, got
| bored with it, and told everyone to rewrite into Visual
| Basic (which they then killed). Then the future was
| Silverlight, until it wasn't. There are a thousand of
| these things that weren't deliberately evil in the EEE,
| but defined the word rugpull before we called it that.
|
| So even without EEE, I think it's supremely risky to
| hitch your wagon to their tech or services (unless you're
| writing primarily for Windows, which is what they'd love
| to help you migrate to). And I can't be convinced the
| GitHub acquisition wasn't some combination of these dark
| patterns.
|
| Step 1: Get a plurality of the world's FOSS into one
| place.
|
| Step 2: Feed it into a LLM and then embed it in a popular
| free editor so that everyone can use GPL code without
| actually having to abide the license.
|
| Step 3: Make it increasingly hard to use for FOSS
| development by starting to add barriers a little at a
| time. _< = we are here_
|
| As a developer, they've done nothing substantial to earn
| my trust. I think a lot of Microsoft employees are good
| people who don't subscribe to all this and who want to do
| the right thing, but corporate culture just won't let
| that be.
| biorach wrote:
| > I think it's supremely risky to hitch your wagon to
| their tech or services
|
| OK, finally, yes, this is very true, for specific parts
| of their tech.
|
| But banging on about EEE just distracts from this, more
| important message.
|
| > Make it increasingly hard to use for FOSS development
| by starting to add barriers a little at a time. <= we are
| here
|
| ....and now you've lost me again
| kstrauser wrote:
| Note I wasn't the one who said EEE upstream. I was just
| replying to the thread.
|
| Hanlon's razor is a thing, and I generally follow it.
| It's just that I've seen Microsoft make so many "oops,
| our bad!" mistakes over the years that purely
| coincidentally gave them an edge up over their
| competition, that I tend to distrust such claims from
| them.
|
| I don't feel that way about all corps. Oracle doesn't
| make little mistakes that accidentally harm the
| competition while helping themselves. No, they'll look
| you in the eye and explain that they're mugging you while
| they take your wallet. It's kind of refreshingly honest
| in its own way.
| dhruvrajvanshi wrote:
| > Oracle doesn't make little mistakes that accidentally
| harm the competition while helping themselves. No,
| they'll look you in the eye and explain that they're
| mugging you while they take your wallet. It's kind of
| refreshingly honest in its own way.
|
| Fucking hell bud :D
| kstrauser wrote:
| Tell me I'm wrong! :D
| stusmall wrote:
| That shows a misunderstanding of what EEE was. This team was
| sending changes upstream which is the exact opposite of
| "extend" step of the strategy. The idea of "extend" was to
| add propriety extensions on top of an open standard/project
| locking customers into the MSFT implementation.
| jerrygenser wrote:
| Ok so a better example of what you describe might be
| vscode.
| nothrabannosir wrote:
| What existing open standard did vscode Embrace? I thought
| Microsoft created v0 themselves.
|
| A classic example is ActiveX.
| biorach wrote:
| > A classic example is ActiveX.
|
| Nah, even that was based on earlier MS technologies - OLE
| and COM
|
| A good starter list of EEE plays is on the wikipedia
| page: https://en.wikipedia.org/wiki/Embrace,_extend,_and_
| extinguis...
| nothrabannosir wrote:
| Funny you linked that page because that's where I got
| activex from :D
|
| _> Examples by Microsoft_
|
| _> Browser incompatibilities_
|
| _> The plaintiffs in an antitrust case claimed Microsoft
| had added support for ActiveX controls in the Internet
| Explorer Web browser to break compatibility with Netscape
| Navigator, which used components based on Java and
| Netscape 's own plugin system._
| biorach wrote:
| ah ok, sorry. I thought you were saying that they tried
| an EEE play on ActiveX.
|
| You meant they used ActiveX in an EEE play in the browser
| wars.
| nothrabannosir wrote:
| Honestly I kept it vague because I didn't actually know
| so your call-out was totally valid. I know it better now
| than without your clarification so thanks :+1:
| JacobHenner wrote:
| VSCode displaced Atom, pre-GitHub acquisition, by
| building on top of Atom's rendering engine Electron.
| bgwalter wrote:
| They were quite a bit behind the schedule that was promised
| five years ago.
|
| Additionally, at this stage the severe political and governance
| problems cannot have escaped Microsoft. I imagine that no
| competent Microsoft employee wants to give his expertise to
| CPython, only later to suffer group defamation from a couple of
| elected mediocre people.
|
| CPython is an organization that overpromises, allocates jobs to
| the obedient and faithful while weeding out competent
| dissenters.
|
| It wasn't always like that. The issues are entirely self-
| inflicted.
| biorach wrote:
| > CPython is an organization that overpromises, allocates
| jobs to the obedient and faithful while weeding out competent
| dissenters.
|
| This stinks of BS
| wisty wrote:
| It sounds like an oblique reference to that time they
| temporarily suspended one of the most valuable
| members of the community, apparently for having the
| audacity to suggest that their powers to suspend members of
| the community seemed a little arbitrary and open to abuse.
| biorach wrote:
| Well they could just say that instead of wasting people's
| time with oblique references
| robertlagrant wrote:
| Saying "This stinks of BS" is going to mean you have
| little standing to criticise other people for wasting
| time.
| make3 wrote:
| Microsoft also fired a whole lot of other open source people
| unrelated to Python in this current layoff
| pjmlp wrote:
| Notably MAUI, ASP.NET, Typescript and AI frameworks.
| vlovich123 wrote:
| That's unfortunate but I called it when people were claiming
| that Microsoft had committed to this effort for the long term.
| mtzaldo wrote:
| Could we do a crowdfunding campaign so we can keep paying
| them? The whole world is/will benefit from their work.
| morkalork wrote:
| Didn't Google lay off their entire Python development team in
| the last year as well? I wonder if there is some impetus behind
| both.
| make3 wrote:
| doesn't print money right away = cut by executive #3442
| amelius wrote:
| The snake in the header image appears to have two tail-ends ...
| cestith wrote:
| I guess it's spawned a second thread in the same process.
| sgarland wrote:
| > Instead, many reach for multiprocessing, but spawning processes
| is expensive
|
| Agreed.
|
| > and communicating across processes often requires making
| expensive copies of data
|
| SharedMemory [0] exists. Never understood why this isn't used
| more frequently. There's even a ShareableList which does exactly
| what it sounds like, and is awesome.
|
| [0]:
| https://docs.python.org/3/library/multiprocessing.shared_mem...
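A minimal sketch of `ShareableList` (hypothetical worker; behavior as documented in `multiprocessing.shared_memory`): the child attaches to the segment by name and mutates it in place, with no pickling of the payload.

```python
from multiprocessing import Process
from multiprocessing.shared_memory import ShareableList

def worker(name):
    # Attach to the existing shared segment by name -- no copy is made.
    sl = ShareableList(name=name)
    sl[0] += 1                   # visible to the parent immediately
    sl.shm.close()

if __name__ == "__main__":
    sl = ShareableList([0, "hello", 3.14])
    p = Process(target=worker, args=(sl.shm.name,))
    p.start()
    p.join()
    result = sl[0]               # read back what the child wrote
    sl.shm.close()
    sl.shm.unlink()              # free the segment
    print(result)  # 1
```

Note the documented caveat: slots are fixed-size, so a string can only be replaced by one of equal or smaller encoded length.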
| ogrisel wrote:
| You cannot share arbitrarily structured objects in the
| `ShareableList`, only atomic scalars and bytes / strings.
|
| If you want to share structured Python objects between
| instances, you have to pay the cost of
| `pickle.dump`/`pickle.load` (CPU overhead for interprocess
| communication) + the memory cost of replicated objects in the
| processes.
| tomrod wrote:
| I can fit a lot of json into bytes/strings though?
| cjbgkagh wrote:
| Perhaps flatbuffers would be better?
| tomrod wrote:
| I love learning from folks on HN -- thanks! Will check it
| out.
| notpushkin wrote:
| Take a look at https://capnproto.org/ as well, while at
| it.
|
| Neither solve the copying problem, though.
| frollogaston wrote:
| Ah, I forgot capnproto doesn't let you edit a serialized
| proto in-memory, it's read-only. In theory this should be
| possible as long as you're not changing the length of
| anything, but I'm not surprised such trickery is
| unsupported.
|
| So this doesn't seem like a versatile solution for
| sharing data structs between two Python processes. You're
| gonna have to reserialize the whole thing if one side
| wants to edit, which is basically copying.
| tinix wrote:
| let me introduce you to quickle.
| vlovich123 wrote:
| That's even worse than pickle.
| tomrod wrote:
| pickle pickles to pickle binary, yeah? So can stream that
| too with an io Buffer :D
| frollogaston wrote:
| If all your state is already json-serializable, yeah. But
| that's just as expensive as copying if not more, hence what
| cjbgkagh said about flatbuffers.
| frollogaston wrote:
| oh nvm, that doesn't solve this either
| reliabilityguy wrote:
| What's the point? The whole idea is to share an object, and
| not to serialize them whether it's json, pickle, or
| whatever.
| tomrod wrote:
| I mean, the answer to this is pretty straightforward --
| because we can, not because we should :)
| notpushkin wrote:
| We need a dataclass-like interface on top of a ShareableList.
| sgarland wrote:
| So don't do that? Send data to workers as primitives, and
| have a separate process that reads the results and serializes
| it into whatever form you want.
| modeless wrote:
| Yeah I've had great success sharing numpy arrays this way.
| Explicit sharing is not a huge burden, especially when compared
| with the difficulty of debugging problems that occur when you
| accidentally share things between threads. People vastly
| overstate the benefit of threads over multiprocessing and I
| don't look forward to all the random segfaults I'm going to
| have to debug after people start routinely disabling the GIL in
| a library ecosystem that isn't ready.
|
| I wonder why people never complained so much about JavaScript
| not having shared-everything threading. Maybe because
| JavaScript is so much faster that you don't have to reach for
| it as much. I wish more effort was put into baseline
| performance for Python.
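The numpy-over-shared-memory pattern described above looks roughly like this (a sketch assuming numpy is installed; function and variable names are made up):

```python
import numpy as np
from multiprocessing import Process
from multiprocessing.shared_memory import SharedMemory

def double_in_place(shm_name, shape):
    # Reattach to the same physical pages and view them as an array.
    shm = SharedMemory(name=shm_name)
    arr = np.ndarray(shape, dtype=np.int64, buffer=shm.buf)
    arr *= 2                     # the parent observes this write
    del arr                      # release the buffer before closing
    shm.close()

if __name__ == "__main__":
    shm = SharedMemory(create=True, size=4 * 8)   # four int64 slots
    a = np.ndarray((4,), dtype=np.int64, buffer=shm.buf)
    a[:] = [1, 2, 3, 4]
    p = Process(target=double_in_place, args=(shm.name, a.shape))
    p.start()
    p.join()
    result = a.tolist()
    del a
    shm.close()
    shm.unlink()
    print(result)  # [2, 4, 6, 8]
```

The sharing is explicit: only arrays you deliberately place in the segment are visible to both processes, which is exactly the property the parent comment argues for.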
| dhruvrajvanshi wrote:
| > I wonder why people never complained so much about
| JavaScript not having shared-everything threading. Maybe
| because JavaScript is so much faster that you don't have to
| reach for it as much. I wish more effort was put into
| baseline performance for Python.
|
| This is a fair observation.
|
| I think a part of the problem is that the things that make
| GIL-less Python hard are also the things that make faster
| baseline performance hard. I.e. an over-reliance of the
| ecosystem on the shape of the CPython data structures.
|
| What makes python different is that a large percentage of
| python code isn't python, but C code targeting the CPython
| api. This isn't true for a lot of other interpreted
| languages.
| com2kid wrote:
| > I wonder why people never complained so much about
| JavaScript not having shared-everything threading. Maybe
| because JavaScript is so much faster that you don't have to
| reach for it as much. I wish more effort was put into
| baseline performance for Python.
|
| Nobody sane tries to do math in JS. Backend JS is recommended
| for situations where processing is minimal and it is mostly
| lots of tiny IO requests that need to be shunted around.
|
| I'm a huge JS/Node proponent and if someone says they need to
| write a backend service that crunches a lot of numbers, I'll
| recommend choosing a different technology!
|
| For some reason Python peeps keep trying to do actual
| computations in Python...
| frollogaston wrote:
| Python peeps tend to do heavy numbers calc in numpy, but
| sometimes you're doing expensive things with
| dictionaries/lists.
| zahlman wrote:
| > I wish more effort was put into baseline performance for
| Python.
|
| There has been. That's why the bytecode is incompatible
| between minor versions. It was a major selling(?) point for
| 3.11 and 3.12 in particular.
|
| But the "Faster CPython" team at Microsoft was apparently
| just laid off (https://www.linkedin.com/posts/mdboom_its-
| been-a-tough-coupl...), and all of the optimization work has
| to my understanding been based around fairly traditional
| techniques. The C part of the codebase has decades of legacy
| to it, after all.
|
| Alternative implementations like PyPy often post impressive
| results, and are worth checking out if you need to worry
| about native Python performance. Not to mention the benefits
| of shifting the work onto compiled code like NumPy, as you
| already do.
| frollogaston wrote:
| "I wonder why people never complained so much about
| JavaScript not having shared-everything threading"
|
| Mainly cause Python is often used for data pipelines in ways
| that JS isn't, causing situations where you do want to use
| multiple CPU cores with some shared memory. If you want to
| use multiple CPU cores in NodeJS, usually it's just a load-
| balancing webserver without IPC and you just use throng, or
| maybe you've got microservices.
|
| Also, JS parallelism simply excelled from the start at
| waiting on tons of IO, there was no confusion about it.
| Python later got asyncio for this, and by now regular threads
| have too much momentum. Threads are the worst of both worlds
| in Py, cause you get the overhead of an OS thread and the
| possibility of race conditions without the full parallelism
| it's supposed to buy you. And all this stuff is confusing to
| users.
| chubot wrote:
| Spawning processes generally takes much less than 1 ms on Unix
|
| Spawning a PYTHON interpreter process might take 30 ms to 300
| ms before you get to main(), depending on the number of imports
|
| It's 1 to 2 orders of magnitude difference, so it's worth being
| precise
|
| This is the fallacy with, say, CGI: a CGI program in C,
| Rust, or Go works perfectly well.
|
| e.g. sqlite.org runs with a process PER REQUEST -
| https://news.ycombinator.com/item?id=3036124
| Sharlin wrote:
| Unix is not the only platform though (and is process creation
| fast on all Unices or just Linux?) The point about
| interpreter init overhead is, of course, apt.
| btilly wrote:
| Process creation should be fast on all Unices. If it isn't,
| then the lowly shell script (heavily used in Unix) is going
| to perform very poorly.
| kragen wrote:
| While I think you've been using Unix longer than I have,
| shell scripts are known for performing very poorly, and
| on PDP-11 Unix (where perhaps shell scripts were most
| heavily used, since Perl didn't exist yet) fork()
| couldn't even do copy-on-write; it had to literally copy
| the process's entire data segment, which in most cases
| also contained a copy of its code. Moving to paged
| machines like the VAX and especially the 68000 family
| made it possible to use copy-on-write, but historically
| speaking, Linux has often been an order of magnitude
| faster than most other Unices at fork(). However, I think
| people mostly don't use _those_ Unices anymore. I imagine
| the BSDs have pretty much caught up by now.
|
| https://news.ycombinator.com/item?id=44009754 gives some
| concrete details on fork() speed on current Linux: 50μs
| for a small process, 700μs for a regular process, 1300μs
| for a venti Python interpreter process, 30000-50000μs for
| Python interpreter creation. This is on a CPU of about 10
| billion instructions per second per core, so forking
| costs on the order of 1/2-10 million instructions.
| fredoralive wrote:
| Python runs on other operating systems, like NT, where
| AIUI processes are rather more heavyweight.
|
| Not all use cases of Python and Windows intersect (how
| much web server stuff is a Windows / IIS / SQL Server /
| Python stack? Probably not many, although WISP is a nice
| acronym), but you've still got to bear it in mind for
| people doing heavy numpy stuff on their work laptop or
| whatever.
| charleshn wrote:
| > Spawning processes generally takes much less than 1 ms on
| Unix
|
| It depends on whether one uses clone, fork, posix_spawn etc.
|
| Fork can take a while depending on the size of the address
| space, number of VMAs etc.
| crackez wrote:
| Fork on Linux should use copy-on-write vmpages now, so if
| you fork inside python it should be cheap. If you launch a
| new Python process from let's say the shell, and it's
| already in the buffer cache, then you should only have to
| pay the startup CPU cost of the interpreter, since the IO
| should be satisfied from buffer cache...
| charleshn wrote:
| > Fork on Linux should use copy-on-write vmpages now, so
| if you fork inside python it should be cheap.
|
| No, that's exactly the point I'm making, copying PTEs is
| not cheap on a large address space, with many VMAs.
|
| You can run a simple python script allocating a large
| list and see how it affects fork time.
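That experiment is easy to sketch (Unix-only; iteration counts and the ballast size are arbitrary, and absolute numbers vary widely by machine):

```python
import os
import time

def time_fork():
    # One fork/exit/wait round-trip; the child does no work, so the
    # measured cost is dominated by duplicating kernel bookkeeping
    # (page tables, VMAs) for the parent's address space.
    t0 = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        os._exit(0)
    os.waitpid(pid, 0)
    return time.perf_counter() - t0

small = min(time_fork() for _ in range(20))
ballast = list(range(2_000_000))      # inflate the address space
big = min(time_fork() for _ in range(20))

print(f"small heap: {small * 1e6:.0f} us, large heap: {big * 1e6:.0f} us")
```

On a typical Linux box the second figure comes out noticeably larger even though copy-on-write means no page contents are copied.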
| knome wrote:
| for glibc and linux, fork just calls clone. as does
| posix_spawn, using the flag CLONE_VFORK.
| LPisGood wrote:
| My understanding is that spawning a thread takes just a few
| microseconds, so whether you're talking about a process or a
| Python interpreter process there are still orders of
| magnitude to be gained.
| kragen wrote:
| To be concrete about this,
| http://canonical.org/~kragen/sw/dev3/forkovh.c took 670μs to
| fork, exit, and wait on the first laptop I tried it on, but
| only 130μs compiled with dietlibc instead of glibc, and on a
| 2.3 GHz E5-2697 Xeon it took 130μs compiled with glibc.
|
| httpdito http://canonical.org/~kragen/sw/dev3/server.s (which
| launches a process per request) seems to take only about 50μs
| because it's not linked with any C library and therefore only
| maps 5 pages. Also, that doesn't include the time for exit()
| because it runs multiple concurrent child processes.
|
| On _this_ laptop, a Ryzen 5 3500U running at 2.9GHz, forkovh
| takes about 330μs built with glibc and about 130-140μs built
| with dietlibc, and `time python3 -c True` takes about
| 30000-50000μs. I wrote a Python version of forkovh
| http://canonical.org/~kragen/sw/dev3/forkovh.py and it takes
| about 1200μs to fork(), _exit(), and wait().
|
| If anyone else wants to clone that repo and test their own
| machines, I'm interested to hear the results, especially if
| they aren't in Linux. `make forkovh` will compile the C
| version.
|
| 1200μs is pretty expensive in some contexts but not others.
| Certainly it's cheaper than spawning a new Python interpreter
| by more than an order of magnitude.
| jaoane wrote:
| >Spawning a PYTHON interpreter process might take 30 ms to
| 300 ms before you get to main(), depending on the number of
| imports
|
| That's lucky. On constrained systems launching a new
| interpreter can very well take 10 seconds. Python is
| ssssslllloooowwwww.
| morningsam wrote:
| >Spawning a PYTHON interpreter process might take 30 ms to
| 300 ms
|
| Which is why, at least on Linux, Python's multiprocessing
| doesn't do that but fork()s the interpreter, which takes low-
| single-digit ms as well.
| zahlman wrote:
| Even when the 'spawn' strategy is used (default on Windows,
| and can be chosen explicitly on Linux), the overhead can
| largely be avoided. (Why choose it on Linux? Apparently
| forking can cause problems if you also use threads.) Python
| imports can be deferred (`import` is a _statement_ , not a
| compiler or pre-processor directive), and child processes
| (regardless of the creation strategy) name the main module
| as `__mp_main__` rather than `__main__`, allowing the
| programmer to distinguish. (Being able to distinguish is of
| course _necessary_ here, to avoid making a fork bomb -
| since the top-level code runs automatically and `if
| __name__ == '__main__':` is normally top-level code.)
|
| But also keep in mind that _cleanup_ for a Python process
| also takes time, which is harder to trace.
|
| Refs:
|
| https://docs.python.org/3/library/multiprocessing.html#cont
| e... https://stackoverflow.com/questions/72497140
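A minimal sketch of the guard pattern described above (a hypothetical example; with the 'spawn' start method the child re-imports the main module, so heavy imports can be deferred into the worker function and top-level work kept behind the `__main__` guard):

```python
import multiprocessing as mp

def work(n):
    # Deferred import: paid only in processes that actually call
    # work(), not at module import time in every spawned child.
    import math
    return math.factorial(n)

if __name__ == '__main__':
    # Only the parent reaches this block; a spawned child re-imports
    # the module as '__mp_main__', so this guard prevents a fork bomb.
    ctx = mp.get_context('spawn')
    with ctx.Pool(2) as pool:
        print(pool.map(work, [5, 6]))  # [120, 720]
```

Using `get_context('spawn')` rather than `set_start_method` keeps the choice local instead of mutating global state.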
| ori_b wrote:
| As another example: I run https://shithub.us with shell
| scripts, serving a terabyte or so of data monthly (mostly due
| to AI crawlers that I can't be arsed to block).
|
| I'm launching between 15 and 3000 processes per request.
| While Plan 9 is about 10x faster at spawning processes than
| Linux, it's telling that 3000 C processes launching in a
| shell is about as fast as one python interpreter.
| isignal wrote:
| Processes can die independently, so the state of a concurrent
| shared-memory data structure can be difficult to manage when a
| process dies while modifying it under a lock. Postgres, which
| uses shared-memory data structures, can sometimes need to kill
| all its backend processes because it cannot fully recover from
| such a state.
|
| In contrast, no one thinks about what happens if a thread dies
| independently because the failure mode is joint.
| wongarsu wrote:
| > In contrast, no one thinks about what happens if a thread
| dies independently because the failure mode is joint.
|
| In Rust if a thread holding a mutex dies the mutex becomes
| poisoned, and trying to acquire it leads to an error that has
| to be handled. As a consequence every rust developer that
| touches a mutex has to think about that failure mode. Even if
| in 95% of cases the best answer is "let's exit when that
| happens".
|
| The operating system tends to treat your whole process as one
| and shut down everything or nothing. But a thread can still
| crash on its own due to an unhandled OOM, assertion failures
| or any number of other issues.
| jcalvinowens wrote:
| > But a thread can still crash in its own due to unhandled
| oom, assertion failures or any number of other issues
|
| That's not really true on POSIX. Unless you're doing nutty
| things with clone(), or you actually have explicit code
| that calls pthread_exit() or gettid()/pthread_kill(), the
| whole process is always going to die at the same time.
|
| POSIX signal dispositions are process-wide, the only way
| e.g. SIGSEGV kills a single thread is if you write an
| explicit handler which actually does that by hand.
| Unhandled exceptions usually SIGABRT, which works the same
| way.
|
| ** Just to expand a bit: there is a subtlety in that, while
| dispositions are process-wide, one individual thread does
| indeed take the signal. If the signal is handled, only that
| thread sees -EINTR from a blocking syscall; but if the
| signal is not handled, the default disposition affects _all
| threads in the process simultaneously_ no matter which
| thread is actually signalled.
| wahern wrote:
| It would be nice if someday we got per-thread signal
| handlers to complement per-thread signal masking and per-
| thread alternate signal stacks.
| jcalvinowens wrote:
| This is a solvable problem though, the literature is
| overflowing with lock-free implementations of common data
| structures. The real question is how much performance you
| have to sacrifice for the guarantee...
| tinix wrote:
| shared memory only works on dedicated hardware.
|
| if you're running in something like AWS fargate, there is no
| shared memory. have to use the network and file system which
| adds a lot of latency, way more than spawning a process.
|
| copying processes through fork is a whole different problem.
|
| green threads and an actor model will get you much further in
| my experience.
| bradleybuda wrote:
| Fargate is just a container runtime. You can fork processes
| and share memory like you can in any other Linux environment.
| You may not want to (because you are running many cheap /
| small containers) but if your Fargate containers are running
| 0.25 vCPUs then you probably don't want traditional
| multiprocessing or multithreading...
| tinix wrote:
| Go try it and report back.
|
| Fargate isn't just ECS and plain containers.
|
| You cannot use shared memory in fargate, there is literally
| no /dev/shm.
|
| See "sharedMemorySize" here: https://docs.aws.amazon.com/Am
| azonECS/latest/developerguide/...
|
| > If you're using tasks that use the Fargate launch type,
| the sharedMemorySize parameter isn't supported.
| sgarland wrote:
| Well don't use Fargate, there's your problem. Run programs on
| actual servers, not magical serverless bullshit.
| YouWhy wrote:
| Hey, I've been developing professionally with Python for 20
| years, so wanted to weigh in:
|
| Decent threading is awesome news, but it only affects a small
| minority of use cases. Threads are only strictly necessary when
| it's prohibitive to message pass. The Python ecosystem these days
| includes a playbook solution for literally any such case.
| Considering the multiple major pitfalls of threads (e.g.,
| locking), they are likely to become a thing useful only in
| specific libraries/domains and not as a general-purpose tool.
|
| Additionally, with all my love to vanilla Python, anyone who
| needs to squeeze the juice out of their CPU (which is actually
| memory bandwidth) has plenty of other tools -- off-the-shelf
| libraries written in native code. (Honorable mentions to PyPy,
| numba and such.)
|
| Finally, the one dramatic performance innovation in Python has
| been async programming - I warmly encourage everyone not familiar
| with it to consider taking a look.
| kstrauser wrote:
| I haven't been using it that much longer than you, and I agree
| with most of what you're saying, but I'd characterize it
| differently.
|
| Python has a lot of solid workarounds for avoiding threading,
| because until now Python threading has absolutely sucked. I had
| naively tried to use it to make a CPU-bound workload twice as
| fast and soon realized the implications of the GIL, so I threw
| all that code away and made it multiprocessing instead. That
| sucked in its own way because I had to serialize lots of large
| data structures to pass around, so 2x the cores got me about
| 1.5x the speed and a warmer server room.
|
| I would _love_ to have good threading support in Python. It's
| not always the right solution, but there are a lot of
| circumstances where it'd be absolutely peachy, and today we're
| faking our way around its absence with whole playbooks of
| alternative approaches to avoid the elephant in the room.
|
| But yes, use async when it makes sense. It's a thing of beauty.
| (Yes, Glyph, we hear the "I told you so!" You were right.)
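The tradeoff described above can be sketched roughly (an illustrative benchmark, not the original workload; on a GIL build the thread pool gains nothing for CPU-bound work, while the process pool must pickle every argument and result):

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def busy(n: int) -> int:
    # CPU-bound work: no I/O, so on a GIL build threads serialize it.
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed(executor_cls, jobs):
    # Run the same jobs through the given executor and time them.
    start = time.perf_counter()
    with executor_cls(max_workers=2) as ex:
        results = list(ex.map(busy, jobs))
    return results, time.perf_counter() - start

if __name__ == '__main__':
    jobs = [2_000_000] * 4
    _, t_threads = timed(ThreadPoolExecutor, jobs)
    _, t_procs = timed(ProcessPoolExecutor, jobs)
    print(f"threads: {t_threads:.2f}s  processes: {t_procs:.2f}s")
```

On a free-threaded build the thread-pool timing should drop to roughly the process-pool timing, without the pickling overhead.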
| sylware wrote:
| Got myself a shiny python 3.13.3 (ssl module still unable to
| compile with libressl) replacing a 3.12.2, feels clearly slower.
|
| What's wrong?
| ipsum2 wrote:
| python 3.13 doesn't ship with free-threaded Python compiled
| AFAIK.
| sylware wrote:
| You mean it is not default anymore?
| ipsum2 wrote:
| It's never been the default.
| sylware wrote:
| huh... then why does it feel significantly slower, since I
| did not touch the build conf?
| jdsleppy wrote:
| Did you compile the Python yourself? If so, you may need to add
| optimization flags https://devguide.python.org/getting-
| started/setup-building/i...
| aitchnyu wrote:
| Whats currently stopping me (apart from library support) from
| running a single command that starts up WSGI workers and Celery
| workers in a single process?
| gchamonlive wrote:
| Nothing, it's just that these aren't first class features of
| the language. Also someone already explained that the GIL is
| mostly about technical debt in the CPython interpreter, so
| there are reasons other than full parallelism to get rid of the
| GIL.
| hello_computer wrote:
| Opting to enable low-level parallelism for user code in an
| imperative, dynamically typed scripting language seems like a
| regression. It's less bad for LISP because of its pure-functional
| nature. It's less bad for BEAM languages & Clojure due to
| immutability. It is less bad for C/C++/Rust because you have a
| stronger type system--allowing for deeper static analysis. For
| Python, this is " _high priests of a low cult_ " shitting things
| up for corporate agendas and/or street cred.
| p0w3n3d wrote:
| Look behind! A free-threaded Python!
| EGreg wrote:
| I thought this was mostly a solved problem.
| Fibers, green threads, coroutines, actors, queues (e.g. GCD),
| ...
|
| Basically you need to reason about what your thing will do.
|
| Separate concerns. Each thing is a server (microservice?) with
| its own backpressure.
|
| They schedule jobs on a queue.
|
| The jobs come with some context, I don't care if it's a closure
| on the heap or a fiber with a stack or whatever. Javascript being
| single threaded with promises wastefully unwinds the entire stack
| for each tick instead of saving context. With callbacks you can
| save context in closures. But even that is pretty fast.
|
| Anyway then you can just load-balance the context across
| machines. Easiest approach is just to have server affinity for
| each job. The servers just contain a cache of the data so if the
| servers fail then their replacements can grab the job from an
| indexed database. The insertion and the lookup is O(log n) each.
| And jobs are deleted when done (maybe leaving behind a small log
| that is compacted) so there are no memory leaks.
|
| Oh yeah and whatever you store durably should be sharded and
| indexed properly, so practically unlimited amounts can be stored.
| Availability in a given shard is a function of replicating the
| data, and the economics of it is that the client should pay with
| credits for every time they access. You can even replicate on
| demand (like bittorrent re-seeding) to handle spikes.
|
| This is the general framework whether you use Erlang, Go, Python
| or PHP or whatever. It scales within a company and even across
| companies (as long as you sign/encrypt payloads
| cryptographically).
|
| It doesn't matter so much whether you use php-fpm with threads,
| or swoole, or the new kid on the block, FrankenPHP. Well, I
| should say I prefer the shared-nothing architecture of PHP and
| APC. But in Python, it is the same thing with eg Twisted vs just
| some SAPI.
|
| You're welcome.
| kccqzy wrote:
| It's only a mostly solved problem for concurrent I/O heavy
| workloads. It's not solved in the Python world for parallel
| CPU-bound workloads.
| henry700 wrote:
| I find it peculiar how, in a language so riddled with simple
| concurrency architectural issues, the approach is to
| painstakingly fix every library after fixing the runtime,
| instead of just using some better language. Why does the
| community insist on such a bad language when literally even
| fucking Javascript has a saner execution model?
| mylons wrote:
| i find it peculiar how tribal people are about languages.
| python is fantastic. you're not winning anyone over with
| comments like this. just go write your javascript and be happy,
| bud.
| forrestthewoods wrote:
| > instead of just using some better language
|
| Python the language is pretty bad. Python the ecosystem of
| libraries and tools has no equal, unfortunately.
|
| Switching a language is easy. Switching a billion lines of
| library less so.
|
| And the tragic part is that many of the top "python libraries"
| are just Python interfaces to a C library! But if you want to
| switch to a "better language" that fact isn't helpful.
| kubb wrote:
| I wonder if we get automatic LLM translation of codebases
| from language to language soon - this could close the library
| gap and diminish the language lock in factor.
| dash2 wrote:
| I think the opposite. Every language has flaws. What's
| impressive about Python is their ongoing commitment to work on
| theirs, even the deepest-rooted. It makes me optimistic that
| this is a language to stick with for the long run.
| rednafi wrote:
| I agree about using other languages that have better
| concurrency support if concurrency is your bottleneck.
|
| But changing the language in a brownfield project is hard. I
| love Go, and these days I don't bother with Python if I know
| the backend needs to scale.
|
| But Python's ecosystem is huge, and for data work, there's
| little alternative to it.
|
| With all that said, JavaScript ain't got shit on any language.
| The only good thing about it is Google's runtime, and that has
| nothing to do with the language. JS doesn't have true
| concurrency and is a mess of a language in general. Python is
| slow, riddled with concurrency problems, but at least it's a
| real language created by a guy who knew what he was doing.
| make3 wrote:
| I hate how these threads always devolve into insane discussions
| about why not using threads is better, while most people who
| have actually tried to speed up real-world Python code realize
| how amazing it would be to have proper threads with shared
| memory, instead of processes with their many limitations:
| forcing you to pickle objects back and forth, fork often just
| not working in cloud settings, and spawn being slow in a lot of
| applications. The usage of processes is just much heavier and
| less straightforward.
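The pickling cost mentioned above can be made concrete (a rough sketch with illustrative sizes; multiprocessing serializes arguments and results much like this on every hop between processes):

```python
import pickle

# A large-ish structure of the kind that must cross a process boundary.
data = {i: list(range(50)) for i in range(10_000)}

# With multiprocessing, arguments and results are serialized roughly
# like this on every hop; threads would simply share the object.
blob = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
print(f"{len(blob)} bytes pickled per hop")

# The receiving process then pays again to deserialize.
restored = pickle.loads(blob)
assert restored == data
```

Both the serialization time and the transient memory for the byte blob are pure overhead relative to shared-memory threads.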
___________________________________________________________________
(page generated 2025-05-16 23:00 UTC)