Following up on the Python JIT

By Jake Edge
July 14, 2025

PyCon US

Performance of Python programs has been a major focus of development for the language over the last five years or so; the Faster CPython project has been a big part of that effort. One of its subprojects is to add an experimental just-in-time (JIT) compiler to the language; at last year's PyCon US, project member Brandt Bucher gave an introduction to the copy-and-patch JIT compiler. At PyCon US 2025, he followed that up with a talk on "What they don't tell you about building a JIT compiler for CPython" to describe some of the things he wishes he had known when he set out to work on that project. There was something of an elephant in the room, however, in that Microsoft dropped support for the project and laid off most of its Faster CPython team a few days before the talk. Bucher only alluded to that event in the talk, and elsewhere has made it clear that he intends to continue working on the JIT compiler whatever the fallout.

When he gave the talk back in May, he said that he had been working with Python for around eight years, as a core developer for six, and as part of the Microsoft CPython performance-engineering team for four; he has been working on the JIT compiler for the last two years. While the team at Microsoft is often equated with the Faster CPython project, it is really just a part of it; "our team collaborates with lots of people outside of Microsoft".

Faster CPython results

[Brandt Bucher]

The project has seen some great results over the last few Python releases. Its work first appeared in 2022 as part of Python 3.11, which averaged 25% faster than 3.10, depending on the workload; "no need to change your code, you just upgrade Python and everything works". In the years since, there have been further improvements: Python 3.12 was 4% faster than 3.11, and 3.13 improved by 7% over 3.12. Python 3.14, which is due in October, will be around 8% faster than its predecessor. In aggregate, that means Python has gotten nearly 50% faster in less than four years, he said.

Around 93% of the benchmarks that the project uses have improved their performance over that time; nearly half (46%) are more than 50% faster, and 20% of the benchmarks are more than 100% faster. Those are not simply micro-benchmarks; they represent real workloads. Pylint has gotten 100% faster, for example. All of those increases have come without the JIT; they come from all of the other changes that the team has been working on, while "taking a kind of holistic approach to improving Python performance". Those changes have a meaningful impact on performance and were done in such a way that the community can maintain them. "This is what happens when companies fund Python core development", he said; "it's a really special thing".
On his slides, that was followed by the crying emoji accompanied by an uncomfortable laugh.

Moving on, he gave a "duck typing" example that he would refer to throughout the talk. It revolved around a duck simulator that would take an iterator of ducks and "quack" each one, then print the sound. As an additional feature, if a duck has an "echo" attribute that evaluates to true, it would double the sound:

    def simulate_ducks(ducks):
        for duck in ducks:
            sound = duck.quack()
            if duck.echo:
                sound += sound
            print(sound)

That was coupled with two classes that produced different sounds:

    class Duck:
        echo = False

        def quack(self):
            return "Quack!"

    class RubberDuck:
        echo = True

        def __init__(self, loud):
            self.loud = loud

        def quack(self):
            if self.loud:
                return "SQUEAK!"
            return "Squeak!"

He stepped through an example execution of the loop in simulate_ducks(). He showed the bytecode for the stack-based Python virtual machine that was generated by the interpreter and stepped through one iteration of the loop, describing the changes to the stack and to the duck and sound local variables. That process is largely unchanged "since Python was first created".

Specialization

The 3.11 interpreter added specialized bytecode into the mix, where some of the bytecode operations are changed to assume they are operating on a specific type, chosen based on observing the execution of the code a few times. Python is a dynamic language, so the interpreter always needs to be able to fall back to, say, looking up the proper binary operator for the types. But, after running the loop a few times, it can assume that "sound += sound" will be operating on strings, so it can switch to a bytecode with a fast path for that explicit operation. "You actually have bytecode that can still handle anything, but has inlined fast paths for the shape of your actual objects and data structures and memory layout." (This specialization can be observed from Python itself; see the sketch at the end of this section.)

All of that underlies the JIT compiler, which uses the specialized bytecode interpreter, and can be viewed as being part of the same pipeline, Bucher said. The JIT compiler is not enabled by default in any build of Python, however. As he described in last year's talk, the specialized bytecode instructions get further broken down into micro-ops, which are "smaller units of work within an individual bytecode instruction". The translation to micro-ops is completely automatic because the bytecodes are defined in terms of them, "so this translation step is machine-generated and very very fast", he said.

The micro-ops can be optimized; that is basically the whole point of generating them, he said. Observing the different types and values that are being encountered when executing the micro-ops will show optimizations that can be applied. Some micro-ops can be replaced with more efficient versions; others can be eliminated because they "are doing work that is entirely redundant and that we can prove we can remove without changing the semantics". He showed a slide full of micro-ops that corresponded to the duck loop and slowly replaced and eliminated something approaching 25% of them, which corresponds to what the 3.14 version of the JIT does.

The JIT will then translate the micro-ops into machine code one-by-one, but it does so using the copy-and-patch mechanism. The machine-code templates for each of the micro-ops are generated at CPython compile time; it is somewhat analogous to the way the micro-ops themselves are generated in a table-driven fashion.
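As an aside, the specialization stage that feeds this whole pipeline can be watched from pure Python: the standard dis module grew an adaptive flag in 3.11 that shows the bytecode as it currently exists, specialized forms included. Here is a minimal sketch; the exact specialized instruction names vary between releases, so treat the BINARY_OP_ADD_UNICODE mentioned in the comments as an assumption rather than a guarantee:

    import dis

    def concat(a, b):
        return a + b

    # Run the function enough times for the adaptive interpreter to
    # observe its operand types and specialize the bytecode.
    for _ in range(1000):
        concat("quack", "quack")

    # With adaptive=True, dis shows the quickened bytecode actually in
    # place; on 3.11 and later, the generic BINARY_OP here typically
    # appears as a specialized form such as BINARY_OP_ADD_UNICODE.
    dis.dis(concat, adaptive=True)

It is this specialized form that the micro-op translation and, ultimately, the copy-and-patch templates build on.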
Since the templates are not hand-written, fixing bugs in the micro-ops for the rest of the interpreter also fixes them for the JIT; that helps with the maintainability of the JIT, but also helps lower the barrier to entry for working on it, Bucher said.

Region selection

With that background out of the way, he moved on to some "interesting parts of working on a JIT compiler" that are often overlooked, starting with region selection. Earlier, he had shown a sequence of micro-ops that needed to be turned into machine code, but he did not describe how that list was generated; "how did we get there in the first place?" The JIT compiler does not start off with such a sequence; it starts with code like that in his duck simulation.

There are several questions that need to be answered about that code based on its run-time activity. The first is: "what do we want to compile?" If something is running only a few times, it is not a good candidate for JIT compilation, but something that is running a lot is. Another question is where it should be compiled: a function can be compiled in isolation, or it can be inlined into its callers and those can be compiled instead. When should the code be compiled? There is a balance to be struck between compiling things too early, wasting that effort because the code is not actually running all that much, and too late, which may not actually make the program any faster. The final question is "why?", he said; it only makes sense to compile code if it is clear that compiling will make the code more efficient. "If they are using really dynamic code patterns or doing weird things that we don't actually compile well, then it's probably not worth it."

One approach that can be taken is to compile entire functions, which is known as "method at a time" or a "method JIT". It "maps naturally to the way we think about compilers" because it is the way that many ahead-of-time compilers work. So, when the JIT looks at simulate_ducks(), it can just compile the entire function (the for loop) wholesale, but there are some other opportunities for optimization. If it recognizes that most of the time the loop operates on Duck objects, it can inline the quack() method:

    for duck in ducks:
        if duck.__class__ is Duck:
            sound = "Quack!"
        else:
            sound = duck.quack()
        ...

If there are lots of RubberDuck objects too, that class's quack() method could be inlined as well. Likewise, the attribute lookup for duck.echo could be inlined for one or both cases, but that all starts to get somewhat complicated, he said; "it's not always super-easy to reason about, especially for something that is running while you are compiling it".

Meanwhile, what if ducks is not a list, but is instead a generator? In simple cases, with a single yield expression, it is not that much different from the list case, but, with multiple yield expressions and loops in the generator, it also becomes hard to reason about. That creates a kind of optimization barrier, and that kind of code is not uncommon, especially in asynchronous-programming contexts.

Another technique, and the one that is currently used in the CPython JIT, is a "tracing JIT" instead of a method JIT. The technique takes linear traces of the program's execution, so it can use that information to make optimization decisions. If the first duck is a Duck, the code can be optimized as it was earlier, with a guard based on the class and inlining the sound assignment.
Next up is a lookup for duck.echo, but the code in the guarded branch has perfect type information; it already knows that it is processing a Duck, so it knows that echo is false, and that if can be removed, leaving:

    for duck in ducks:
        if duck.__class__ is Duck:
            sound = "Quack!"
            print(sound)

"This is pretty efficient. If you have just a list of Ducks, you're going to be doing kind of the bare minimum amount of work to actually quack all those ducks." The code still needs to handle the case where the duck is not a Duck, but it does not need to compile that piece; it can, instead, just send execution back to the interpreter if the class guard is false.

If the code is also handling RubberDuck objects, though, eventually that else branch will get "hot" because it is being taken frequently. At that point, the tracing can be turned back on to see what the code is doing. If we assume that it mostly has non-loud RubberDuck objects, the resulting code might look like:

    elif duck.__class__ is RubberDuck:
        if self.loud:
            ...
        sound = "Squeak!Squeak!"
        print(sound)
    else:
        ...

The two branches that are not specified would simply return to the regular interpreter when they are executed. Since the tracing has perfect type information, it knows that echo is true, so the sound should be doubled, but there is no need to actually use "+=" to get the result. So, now the function has the minimum necessary code to quack either a Duck or a non-loud RubberDuck. If those other branches start getting hot at some point, tracing can once again be used to optimize further.

One downside of the tracing-JIT approach is that it can compile duplicates of the same code, as with "print(sound)". In "very branchy code", Bucher said, "some things near the tail of those traces can be duplicated quite a bit". There are ways to reduce that duplication, but it is a downside to the technique.

Another technique for selecting regions is called "meta tracing", but he did not have time to go into it. He suggested that attendees ask their LLM of choice "about the 'first Futamura projection' and don't misspell it like me, it's not 'Futurama'", Bucher said to some chuckles around the room.

Memory management

JIT compilers "do really weird things with memory". C programmers are familiar with readable (or read-only) data, such as a const array, while data that is both readable and writable is the normal case. Memory can be dynamically allocated using malloc(), but that kind of memory cannot be executed; since a JIT compiler needs memory that it can read, write, and execute, it requires "the big guns": mmap(). "If you know the right magic incantation, you can whisper to this thing with all these secret flags and numbers" to get memory that is readable, writable, and executable:

    char *data = mmap(NULL, 4096,
                      PROT_READ | PROT_WRITE | PROT_EXEC,
                      MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

One caveat is that memory from mmap() comes in page-sized chunks, which is 4KB on most systems but can be larger. If the JIT code is, say, four bytes in length, that can be wasteful, so the memory needs to be managed carefully.

Once you have that memory, he asked, how do you actually execute it? It turns out that "C lets us do crazy things":

    typedef int (*function)(int);

    ((function)data)(42);

The first line creates a type definition named "function", which is a pointer to a function that takes an integer argument and returns an integer. The second line casts the data pointer to that type and then calls the function with an argument of 42 (and ignores the return value).
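The same stunt can even be reproduced from Python itself using only the standard library, which makes for a handy way to experiment with what the JIT is doing under the hood. The sketch below assumes an x86-64 Unix system; hardened systems that forbid writable-and-executable mappings will refuse it (on those, the mprotect() dance described next is the fix):

    import ctypes
    import mmap

    # x86-64 System V machine code for:  int f(int x) { return x + x; }
    #   89 f8    mov eax, edi   ; copy the first integer argument
    #   01 c0    add eax, eax   ; double it
    #   c3       ret            ; return the result in eax
    CODE = bytes.fromhex("89f801c0c3")

    # An anonymous page that is readable, writable, and executable:
    # the "magic incantation" from the C example above.
    buf = mmap.mmap(-1, mmap.PAGESIZE,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    buf.write(CODE)

    # The Python spelling of "((function)data)(42)": build a function-
    # pointer type and instantiate it at the buffer's address.
    function = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    print(function(addr)(42))   # prints 84

As in the C version, the cast-and-call at the end is exactly the kind of "executable data" that Bucher warned about next.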
"`It's weird, but it works.'" He noted that the term "executable data" should be setting off alarm bells in people's heads; "`if you're a Rust programmer, this is what we call 'unsafe code''" he said to laughter. Being able to write to memory that can be executed is "`a scary thing; at best you shoot yourself in the foot, at worst it is a major security vulnerability'". For this reason, operating systems often require that memory not be in that state. He said that the memory should be mapped readable and writable, then filled in, and switched to readable and executable using mprotect(); if there is a need to modify the data later, it can be switched back and forth between the two states. Debugging and profiling When code is being profiled using one of the Python profilers, code that has been compiled should call all of the same profiling hooks. The easiest way to do that, at least for now, is to not JIT code that has profiler hooks installed. In recent versions of Python, profiling is implemented by using the specializing adaptive interpreter to change certain bytecodes to other, instrumented versions of them, which will call the profiler hooks. If the tracing encounters one of these instrumented bytecodes, it can shut the JIT down for that part of the code, but it can still run in other, non-profiled parts of the code. A related problem occurs when someone enables profiling for code that has already been JIT-compiled. In that case, Python needs to get out of the JIT code as quickly as possible. That is handled by placing special _CHECK_VALIDITY micro-ops just before "`known safe points'" where it can jump out of the JIT code and back to the interpreter. That micro-op checks a one-bit flag; if it is set, the execution bails out of the JIT code. That bit gets set when profiling is enabled, but it is also used when code executes that could change the JIT optimizations (e.g. a change of class attributes). Something that just kind of falls out of that is the ability to support "`the weirder features of Python debuggers'". The JIT code is created based on what the tracing has seen, but someone running pdb could completely upend that state in various ways (e.g. "duck = Goose ()"). The validity bit can be used to avoid problems of that sort as well. For native profilers and debuggers, such as perf and GDB, there is a need to unwind the stack through JIT frames, and interact with JIT frames, but "`the short answer is that it's really really complicated'". There are lots of tools of this sort, for various platforms, that all work differently and each has its own APIs for registering debug information in different formats. The project members are aware of the problem, but are trying to determine which tools need to be supported and what level of support they actually need. Looking ahead The current Python release is 3.13; the JIT can be built into it by using the --enable-experimental-jit flag. For Python 3.14, which is out in beta form and will be released in October, the Windows and macOS builds have the JIT built-in, but it must be enabled by setting PYTHON_JIT=1 in the environment. He does not recommend enabling it for production code, but the team would love to hear about any results from using it: dramatic improvements or slowdowns, bugs, crashes, and so on. Other platforms, or people creating their own binaries, can enable the JIT with the same flag as for 3.13. 
For 3.15, which is in a pre-alpha stage at this point, there are two GitHub issues they are focusing on: "Supporting stack unwinding in the JIT compiler" and "Make the JIT thread-safe". The first he had mentioned earlier with regard to support for native debuggers and profilers. The second is important since the free-threaded build of CPython seems to be working out well and is moving toward becoming the default; see PEP 779 ("Criteria for supported status for free-threaded Python"), which was recently accepted by the steering council. The Faster CPython developers think that making the JIT thread-safe can be done without too much trouble; "it's going to take a little bit of work and there's kind of a long tail of figuring out what optimizations are actually still safe to do in a free-threaded environment". Both of those issues are outside of his domain of expertise, however, so he hoped that others who have those skills would be willing to help out. In addition, there is a lot of ongoing performance work that is going into the 3.15 branch, of course. He noted, pointedly, that fast progress, especially on larger projects, will depend on the availability of resources; the words on his slide saying that changed to bold and he gave a significant cough to further emphasize the point.

As he wrapped up, he suggested PEP 659 ("Specializing Adaptive Interpreter") and PEP 744 ("JIT Compilation") for further information. For those who would rather watch something instead of reading about it, he recommended videos of his talks (covered by LWN and linked above) from 2023 on the specializing adaptive interpreter and from 2024 on adding a JIT compiler. The YouTube video of this year's talk is available as well.

[Thanks to the Linux Foundation for its travel sponsorship that allowed me to travel to Pittsburgh for PyCon US.]


Comments

Exciting changes for python
Posted Jul 14, 2025 9:59 UTC (Mon) by Niflmir (subscriber, #175249)

I'm really happy to see all these performance-related enhancements to Python. I always found it strange that the answer to Python performance problems was to design your application in a very specific (multiprocessing) manner, rewrite parts of it in a different language when necessary, and hope that you never needed shared-memory concurrency after choosing multiprocessing. These were the sorts of project risks that made me consider other languages as better suited for large-scale engineering, but Python had so many good points of its own that it didn't deserve to be relegated to being some niche language.

Exciting changes for python
Posted Jul 14, 2025 12:29 UTC (Mon) by ballombe (subscriber, #9523)

Python is a good beginner language, but once you master it you should learn another performance-minded language.

Exciting changes for python
Posted Jul 14, 2025 14:55 UTC (Mon) by anselm (subscriber, #2796)

> but once you master it you should learn another performance-minded language

Python performance is perfectly adequate for a large number of use cases, especially with the recent improvements.

Exciting changes for python
Posted Jul 14, 2025 16:48 UTC (Mon) by npws (subscriber, #168248)

The problem is more that it is adequate until it isn't. And then you are in for a lot of pain.
Exciting changes for python
Posted Jul 14, 2025 17:01 UTC (Mon) by anselm (subscriber, #2796)

> The problem is more that it is adequate until it isn't. And then you are in for a lot of pain.

Perhaps. Perhaps not. I'm with Donald E. Knuth: "Premature optimization is the root of all evil".

Exciting changes for python
Posted Jul 14, 2025 17:47 UTC (Mon) by pizza (subscriber, #46)

> Perhaps. Perhaps not.

The _only_ reason Python has any "performance" chops is because any serious workload calls into non-Python libraries to do the actual work.

Exciting changes for python
Posted Jul 14, 2025 21:49 UTC (Mon) by anselm (subscriber, #2796)

Frankly, I don't know what it is with you people. I've been programming in Python, mostly web stuff, for many years now, and its performance is usually not something I seem to need to worry a lot about - certainly not to a point where I would want to ditch Python for some compiled language. YMMV, of course.

Exciting changes for python
Posted Jul 14, 2025 22:32 UTC (Mon) by ballombe (subscriber, #9523)

I wrote "you should learn another performance-minded language", not that you should stop using Python.

Exciting changes for python
Posted Jul 14, 2025 22:41 UTC (Mon) by pizza (subscriber, #46)

> Frankly, I don't know what it is with you people.

Which "people" are those, exactly?

> I've been programming in Python, mostly web stuff, for many years now, and its performance is usually not something I seem to need to worry a lot about

There are many, many more workloads than yours. Here's an example. Several years ago I inherited a bit of Python 2 code that fired off a bunch of external commands and parsed the output to generate a data file that was consumed by a PHP dashboard-type web application. It typically took about 30s to run. When self-respecting Linux distributions finally dropped Python 2, it needed to be ported to something more modern. Unfortunately, the input was not Unicode-clean, which made Python 3 very unhappy, so in frustration I said "screw this" and rewrote it in Perl. Despite being algorithmically identical, the runtime dropped to about 3s - a literal order of magnitude faster.

This vast improvement in runtime performance allowed me to move to a synchronous invocation instead of asynchronous (with the state tracking and other complexity that entailed), resulting in an overall system that was simpler, more robust, and more performant.

Python definitely has its strengths. But it also has its weaknesses.

Exciting changes for python
Posted Jul 17, 2025 8:06 UTC (Thu) by Niflmir (subscriber, #175249)

I tried to pre-address your question. In order to have reasonable performance, you need to choose multiprocessing as your concurrency paradigm (you might have a workload where async/await can help, but better to just choose multiprocessing). So this corners your design. Now you need to do resource sharing in a third-party application (like PgBouncer) that isn't written in Python; if you are working with a protocol (say IMAP) where there isn't an existing resource pool available, well, you will have to write it yourself or give up on resource pooling. These are GIL problems. For the lack of a JIT, you will see performance problems if you do any sort of computation in Python.
That is why so many Python workloads are actually backed by Fortran, not even C. But why can't a pure-Python code base compete with that performance? Because the work hasn't been done until now. Moreover, the polyglot code base raises a bunch of packaging issues that an interpreted language shouldn't face. This isn't an issue of compiled or not: JVM bytecode is interpreted but JITted, and Python on the JVM doesn't have these issues either. PyPy fixed some of them. This is an issue of the CPython runtime.

I know and am comfortable with a bunch of languages and, all else being equal, these risks are reasons not to choose Python; trying to argue that all else isn't equal is just the realm of language flame wars.

Exciting changes for python
Posted Jul 14, 2025 22:35 UTC (Mon) by jmalcolm (subscriber, #8876)

While I agree with your comment, and I am in no position to question Knuth, I am not sure that your response makes sense here. Are we sure this is what Knuth even meant? What does Knuth mean by "premature optimization"?

I would think he means getting too clever with your algorithms or adding too many layers of design. I doubt he means skipping the vetting of whether your core architectural approach or data structures are even remotely valid or, more on topic, whether you have selected a language with an appropriate set of properties for the task.

Sure, I can choose Python to build a real-time operating system kernel. But it is a poor choice. I am going to be getting to that non-evil optimization stuff pretty quickly. The same is true in the opposite direction. I can choose C to write a 6-page web application with some simple CRUD logic. But it is a poor choice.

As Facebook (Meta) taught us, you can always create your own compiler after you have written your planet-scale social-media platform in an interpreted language like PHP. But that is a pretty big task to have to take on, and probably best avoided if you can. If you knew you were setting out to make Facebook to begin with, I think PHP (or Python) would be a poor choice. I mean, I guess the benefit would be that the rest of us might get a fast Python JIT compiler after you are forced to build one. And writing compilers is fun. But that is not the point we are making.

I mostly agree with your point. I agree because few of our projects are going to grow to the scale of Facebook. And, if they do, we can afford a team of compiler writers. But I hope those compiler writers do not try to choose Python as the language for the compiler while telling me "premature optimization is the root of all evil".

Python is fast enough for most of what most of us need to do. And it can call into something faster for the few things that really do need to be faster, if we run into bottlenecks. Way too often we reach for Rust, and containers, and K8s, and WASM because we want to be able to go Google-scale. More often, we should probably use Python (or something with similar productivity/performance trade-offs).

So, again, I mostly agree with you. However, "it is adequate until it isn't" also contains some wisdom. Python is not for everything. It is not what I would reach for when I go to design my next 3D game engine.

Exciting changes for python
Posted Jul 14, 2025 23:25 UTC (Mon) by anselm (subscriber, #2796)

> Way too often we reach for Rust, and containers, and K8s, and WASM because we want to be able to go Google-scale.
> More often, we should probably use Python (or something with similar productivity/performance trade-offs). So, again, I mostly agree with you.

+1

As far as I'm concerned, starting out by deliberately picking a programming language that is more inconvenient to develop with than Python, on the vague off-chance that Python may in the end possibly turn out not to be fast enough for the task at hand, is a form of premature optimisation that should be avoided. There are usually various things one can do to speed up a Python application short of rewriting all of it in a "performance" language - but, as always when doing optimisation work, one should work from actual performance data, not gut feelings.

Of course, this does not detract from the fact that there are application areas where it is pretty clear from the start that Python is probably not the best choice, as in your 3D game engine example. Picking something else in such a situation is obviously reasonable.

Exciting changes for python
Posted Jul 15, 2025 7:08 UTC (Tue) by taladar (subscriber, #68407)

On the other hand, is Python really convenient to develop with? I find its model of checking almost nothing before runtime quite inconvenient, because it means most testing during development has to be done by actually running the program instead of relying on a compiler to catch my mistakes early.

Convenience of developing in Python
Posted Jul 15, 2025 10:53 UTC (Tue) by farnz (subscriber, #17727)

Python has a very fast edit/run cycle (especially compared to a language with an adequate-to-good type system, like C++, Rust, or Idris), which makes it great for the sort of development where you don't yet know what the logic of the system should look like - effectively, you're using Python as your design language, instead of trying to come up with a design before writing code.

In theory, you then end up with a program that works for the happy path, and you "just" need to fix up all the failure-handling logic; in practice, it's often simpler to rewrite into another language at that point.

Convenience of developing in Python
Posted Jul 15, 2025 12:08 UTC (Tue) by Wol (subscriber, #4433)

> in practice, it's often simpler to rewrite into another language at that point.

"Always plan to throw the first one away - it's almost inevitable you will" :-)

Cheers,
Wol

Convenience of developing in Python
Posted Jul 16, 2025 8:00 UTC (Wed) by taladar (subscriber, #68407)

But Rust has a fast edit/don't-even-need-to-run cycle, where in Python you often need to test-run your application at that point. Not to mention the fact that you need to write Python with Python conventions, and those are just not compatible with, e.g., Rust with Rust conventions if you start in one language and then rewrite in another.

Convenience of developing in Python
Posted Jul 16, 2025 8:41 UTC (Wed) by farnz (subscriber, #17727)

You miss the point - until you run the code and see what it does, you do not know whether what it does is useful or not. You have to run the code to find out whether you're on the right track or not.

Rust makes it a lot easier to take a rough design and implement a quality program; Python makes it easier to go from a poorly written spec ("make me something that helps me make cool posters.
I'll tell you if the things you're doing are helpful or not if you show me a running program") to a rough design.

Convenience of developing in Python
Posted Jul 18, 2025 7:53 UTC (Fri) by taladar (subscriber, #68407)

I would disagree with the assessment that Python makes it easier to get from a poorly written spec to a running program; that was pretty much my entire point. The often-claimed "advantage" of dynamic languages, a faster iteration cycle, relies entirely on the lack of compilation time, while ignoring the fact that you can cut short the iterations in strict, compiled languages at the point where the compiler tells you where you made a mistake.

And that is entirely ignoring that a stricter type system is especially useful in the early design phase, where you want to figure out which assumptions and invariants the spec misses. For evidence of that, look at how broken many OpenAPI files or similar schemas are that resulted from dynamic-language projects, because accidental sloppiness of the dynamic type system slipped into the output (e.g. a field that sometimes has one type, sometimes another, is sometimes left out, and is sometimes null, ...).

Convenience of developing in Python
Posted Jul 18, 2025 9:03 UTC (Fri) by farnz (subscriber, #17727)

I do not see how you cut out the iterations with a stricter language - the specification you're working to is "show me it running, I'll tell you what I like and do not like". You don't yet have assumptions and invariants at all. You have an oracle (a human being) who will tell you if what you're showing them is closer to or further from what they imagined.

Python is excellent at this stage, because of the speed with which you can effectively query the oracle; once you've built up a picture of what the human wants, you may be able to go faster in another language, but you cannot go faster in the early stages, since you literally do not have a spec - you're working with an oracle, not a document.

Convenience of developing in Python
Posted Jul 21, 2025 7:43 UTC (Mon) by taladar (subscriber, #68407)

In case you are talking about the customer, I would avoid showing anything running to them at all unless I plan to support it in the long term, because it is extremely hard to convince customers that something still needs work once they have seen something running (or even just something that visually looks like it is running, even if most of the underlying business logic is missing).

Convenience of developing in Python
Posted Jul 21, 2025 9:03 UTC (Mon) by farnz (subscriber, #17727)

Then you've just breached the contract with the customer, and you're fired from the job. They've agreed that you'll show them something every week, and that you'll take feedback each week, and you're now refusing to do so. Remember that this is in the context of a spec that looks like "there's some annoying things about my job; make the computer do them the way I would", not a decent spec that you can work from. If you don't show where you are to the customer regularly, you simply don't know what they want.

And it's entirely possible in any programming language to only show the things that are complete, and to not even have missing business logic - if it's visible in the demo, it's ready for the customer to accept, or for them to request changes to.
This is where languages with a fast edit/run cycle have an advantage: if the customer sees the working (and complete, ready-to-ship) version you're showing them this week, but wants one small change (where you define small, not the customer), you can make the change on the fly and show them the result of making that change. Customers, being human, then often decide that actually the change was a bad idea, and want a different change, but that's also easy to do.

Convenience of developing in Python
Posted Jul 22, 2025 7:23 UTC (Tue) by taladar (subscriber, #68407)

But that is exactly my point: at that point you would have to ship the Python version, the one people earlier claimed was "just for prototyping". That is why I wouldn't want to show that one to the customer, because at that point I am stuck supporting that language, which is completely unmaintainable.

Convenience of developing in Python
Posted Jul 22, 2025 9:33 UTC (Tue) by farnz (subscriber, #17727)

You can ship the Python version as 1.0, and rewrite (using PyO3 or similar if you want to use Rust, or pybind11 if you want to use C++) for 1.1.

The other thing you can do to reduce your Python maintenance burden (and move it to other languages) is port stable areas of the code into your other language as you go along. Once you're confident that you've extracted all the requirements relating to a given area of the codebase, you can wrap it up for Python to access, and stop maintaining the Python version. You don't have to wait until the project as a whole is done to do this to components.

Convenience of developing in Python
Posted Jul 22, 2025 9:38 UTC (Tue) by anselm (subscriber, #2796)

> that language that is completely unmaintainable

Speak for yourself. Where I work, we make a pretty good living out of maintaining and extending Python code that has been around - at least in part - for a very long time. IMHO, there are popular programming languages that are way less maintainable than Python.

Convenience of developing in Python
Posted Jul 21, 2025 12:32 UTC (Mon) by mathstuf (subscriber, #69389)

One tactic there is to make things that are undone *look* undone. For example, instead of a polished icon from the designer (that you may indeed already have), use a crayon-like representation in the meantime (probably better if the developers make it themselves at that point).

Convenience of developing in Python
Posted Jul 18, 2025 8:36 UTC (Fri) by Wol (subscriber, #4433)

> You miss the point - until you run the code and see what it does, you do not know whether what it does is useful or not. You have to run the code to find out whether you're on the right track or not.

So how come my ex-boss (we're talking 50 years ago) spent his first six months programming without a computer, and, when the computer turned up and the program was typed in (by secretaries - those people who were very good at making perfect copies first time round), it worked perfectly?

The main reason you need to run and test today is that we rely on too much third-party code where you cannot trust the authors to have either (a) documented it properly, or (b) checked it properly for bugs. It's amazing the difference you can make to a program just by printing it out, READING the code CAREFULLY, and running a linter/compiler-with-warnings-at-max over it.
Any decent programmer should do that as a matter of course, but the amount of code I work with, even today, where that clearly hasn't been done is amazing. And depressing.

Cheers,
Wol

Convenience of developing in Python
Posted Jul 18, 2025 8:59 UTC (Fri) by farnz (subscriber, #17727)

No - the main reason you need to run and test today is that the spec you're given now is "show me it doing something, and I'll tell you if it's the right thing or not". When the spec is literally "make a program that does something that $boss likes", you can't do decent software engineering; you're running code to convert "show me a program that does something I like" into a more formal spec, from where you can start the process your ex-boss did.

What's changed is that people see the computer as a "magic" machine that can do anything they can imagine, rather than as a tool to calculate things.

Convenience of developing in Python
Posted Jul 18, 2025 9:25 UTC (Fri) by anselm (subscriber, #2796)

> No - the main reason you need to run and test today is that the spec you're given now is "show me it doing something, and I'll tell you if it's the right thing or not".

This is because the people who want you to write software for them generally find it difficult to explain to you, precisely and unambiguously enough, up front, what they want that software to actually do in the end, and how. Software development would be so much easier if all one had to do was write some code based on a pre-existing complete, correct, and unambiguous specification of the job to be done. The problem is that writing that type of specification in the first place is about as difficult, time-consuming, expensive, and error-prone as writing the code itself, and therefore this tends to be attempted only in exceptional circumstances.

In the meantime, the rest of the software-development world has adopted "agile" development practices, which usually involve iteratively building increasingly refined versions of the code until the customer is satisfied, at which point the result may or may not have anything to do with what the customer would have been able to describe meaningfully at the start.

Convenience of developing in Python
Posted Jul 18, 2025 10:20 UTC (Fri) by interalia (subscriber, #26615)

It seems unlikely that everyone was just like your ex-boss and got it right on the first go, every time. His competence and that of all the secretaries and other programmers across the industry: did it follow a bell curve, or was it really a vertical line at 100% correctness?

As to your implied point that people back then took more care before running code: how weird that people back then adapted and adjusted their behaviour to match the limitations and running costs of their computers. It's about as weird as the reason why people in the 1800s didn't fly on a plane when travelling to other countries. They couldn't do it, so they didn't.

Convenience of developing in Python
Posted Jul 18, 2025 13:23 UTC (Fri) by Wol (subscriber, #4433)

> Seems unlikely that everyone was just like your ex-boss and got it right first go, every time. His competence and that of all the secretaries and other programmers across the industry, did it follow a bell curve or was it really a vertical line at 100% correctness?

I suspect it's what I call "the Word effect".
Word caused a massive crash in professional literacy, because it enabled managers (and other workers) to write their own letters, but they didn't have all the skills in layout and what actually makes a letter readable. That era was awful for all the stuff that was a nightmare to read. (It's still naff, but nowhere near as bad. MOST people have a better feel, but they still don't know the rules and mess up ...)

ALL programmers of that era were more than competent at doing that sort of stuff. I remember my fellow pupils at school taking stacks of punched cards up to the local Oceanographic Institute to run on its computer when it had spare capacity. Yes, you're right, they did it because they had to. BUT ... I suspect it's a case of that same law that khim mentioned. All these modern programming tools actually *hinder* productivity, but they fool the less experienced/competent (and even the competent) into thinking that these tools help.

Even farnz's example - I'm sure spending half an hour with the boss saying "what are you trying to achieve?" would save many hours of waterfall. But the boss views your time as less valuable than his own and won't spend that little bit of time with you, despite the fact that it will probably cost him hours viewing and rejecting your misunderstanding. And yes, I've had that happen to me: a boss who couldn't accept that I didn't understand what he wanted, and I wasn't prepared to waste MY time getting it wrong ...

At the end of the day, if the boss doesn't know what he wants, how on earth does he expect you to know? Yes, I know it's difficult dealing with an incompetent boss, but that's what it boils down to ...

Cheers,
Wol

Getting 30 minutes with the big boss
Posted Jul 18, 2025 14:23 UTC (Fri) by farnz (subscriber, #17727)

Thing is, getting 30 minutes of uninterrupted time from the boss is expensive and difficult. You add quite a bit more value if you can do the right thing from an off-the-cuff idea and a two-minute demo session (with you changing the code live as the boss makes suggestions in the demo session) every week, even if it takes 30 weeks to get to a final result rather than 10 weeks with a two-minute demo session at the eight-week mark. This happens because the boss's time gets more expensive to schedule the more of it you want; a two-minute demo session can be squeezed in between other meetings, where a 30-minute session needs the boss to commit to a full meeting, replacing some other work they'd be doing with that time.

And that's ignoring cases where the boss literally needs you to tell them what the computer is, and is not, good at. In my experience, there are plenty of people where it's not worth having the 30-minute up-front conversation, because they have no sense of what's reasonable to ask for, and expect that, if it's easy to state the problem, it'll be easy to solve. With these people, keeping the time down focuses them on asking for relatively small things (since they don't have long to state the problem), and giving them an example of what can be done tends to focus their ideas on things like what they've just seen, rather than on all the things they could be asking for.

And yes, this is people management; in the end, it's about managing the person with the money such that they are happy to pay you to solve the problem, rather than choosing to pay someone else.
It may well be easier on you to demand more from the person with money, but if that leads to them deciding that they're not paying you, but paying someone else, that's a problem for you, even if you can produce results quicker than the person they're paying.

Exciting changes for python
Posted Jul 16, 2025 8:40 UTC (Wed) by interalia (subscriber, #26615)

I added type hints to my Python code when using an IDE and found that it really helps a lot. The IDE can flag when I've passed the wrong variable to a function, made a typo, or tried to access an attribute/method that doesn't exist on an object of that type. The fact that type hints are an optional feature can make them sort of a helpful compromise between full static typing and the convenience/ease of dynamic typing.

I find sometimes it's nice, when prototyping a new function, to be able to omit types and get a feel for whether the basic approach works without having to fully specify everything the way I'd have to in order to make a C/C++/Rust compiler happy. A bit later, when the new Python function is more settled, I can formalise it by adding the final type hints so that the IDE can check existing and future callers for me. The fact that type hints are optional also means you can add them to an existing code base piecemeal, by starting with hints on some targeted functions but not all, so you don't have to convert the whole thing at once to start getting some benefits.

I still probably wouldn't write a large code base in Python, but with type hints one is probably at least tractable, and they made my medium-size scripts better and more maintainable.

Exciting changes for python
Posted Jul 15, 2025 11:16 UTC (Tue) by khim (subscriber, #9252)

> Are we sure this is what Knuth even meant?

We can be 100% sure Knuth didn't mean what most perusers of that quote mean. You may start by actually reading the article that gave us that quote. Even the name of the article should give you a hint: "Structured programming with go to statements". And these "premature optimizations"? They are about tricks that are incompatible with "normal" structured programming, and things like manual loop unrolling.

Plus, you may find another quote in this exact article:

> In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering.

Or, maybe, the more expanded form:

> The improvement in speed from Example 2 to Example 2a is only about 12%, and many people would pronounce that insignificant. The conventional wisdom shared by many of today's software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise-and-pound-foolish programmers, who can't debug or maintain their "optimized" programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn't bother making such optimizations on a one-shot job, but when it's a question of preparing quality programs, I don't want to restrict myself to tools that deny me such efficiencies.

I would say that using that quote to justify Python is as far from what Knuth had in mind as one may imagine.
We are not talking about something that leaves a 12% improvement on the table, but about something that makes things tens or, maybe (if we think about multicore CPUs), hundreds of times slower!

Exciting changes for python
Posted Jul 15, 2025 11:48 UTC (Tue) by excors (subscriber, #95769)

> Are we sure this is what Knuth even meant? What does Knuth mean by "pre-mature optimization"?

What he actually says (https://dl.acm.org/doi/10.1145/356635.356640) is:

> The conventional wisdom shared by many of today's software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise-and-pound-foolish programmers, who can't debug or maintain their "optimized" programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. [...] I don't want to restrict myself to tools that deny me such efficiencies.
>
> There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We _should_ forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
>
> Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.

He's not saying performance doesn't matter - he explicitly says it shouldn't be ignored. He's arguing for a more thoughtful approach: most code doesn't need to be heavily optimised, but some code does, and it's important to have the tools to identify those bottlenecks and then to optimise them. Premature optimisation is when you skip the identification step and optimise regardless, but the other extreme is bad too.

In that paper he's specifically talking about optimisations like unrolling loops and turning 'while' into 'go to' to reduce the number of control-flow instructions. You should "start with a well-structured program and then use well-understood transformations that can be applied mechanically". Nowadays most languages have optimising compilers that can apply those transformations automatically, so we don't have to sacrifice the source code's readability for performance in that way.

I think that's a significant weakness of Python: once you identify your program's bottlenecks, they cannot be straightforwardly transformed (by either human or compiler) into efficient code, as demonstrated by the struggles with implementing a decent JIT. You have to rewrite the bottlenecks in a different programming language, which is far from ideal.

It sounds like Python also makes it hard to identify those bottlenecks: this article says enabling the Python profiler will disable the JIT, and implies that native profilers won't work well with the JIT any time soon, so you can't measure the program's actual performance. And the non-deterministic nature of JIT compilation makes profiling hard even in the best case.
Compiled languages are generally much better both for identifying bottlenecks and for optimising them with incremental changes, because there's a more direct correspondence between source code and machine code; I think that's more important than their baseline performance for unoptimised code.

But also, Knuth is talking about optimisations on the scale of 12%. The performance cost of using Python instead of a compiled language is regularly >1000% - I imagine he'd be shocked that a good software engineer would even consider that. (At least, I imagine he would have been when he wrote this 50 years ago. A modern desktop PC is maybe 1,000,000% faster than a supercomputer from back then, so perhaps that changes one's perspective on the tradeoffs.)

Exciting changes for python
Posted Jul 15, 2025 13:13 UTC (Tue) by anselm (subscriber, #2796)

> Knuth is talking about optimisations on the scale of 12%. The performance cost of using Python instead of a compiled language is regularly >1000%

Knuth is talking about "a 12% improvement, easily obtained" (my emphasis). I suspect that whether rewriting a Python program in a compiled language leads to "easily obtained" performance improvements (of 12% or 1000% or anything in between) is debatable, and also depends on the specific circumstances, including, e.g., how often it will be needed. For something that is only supposed to be run once, spending a day or two to make it run in 5 seconds in something like Rust, instead of a couple of hours to make it run in a minute in Python (just for the sake of an example, to get the 1000% in), may not be worthwhile; but if it is something that is supposed to be run via cron every minute, every day, then very probably yes.

Exciting changes for python
Posted Jul 16, 2025 8:06 UTC (Wed) by taladar (subscriber, #68407)

I doubt Python development is faster to the degree claimed, especially compared to a modern language with good error messages and tooling like Rust (as opposed to the mess that is C/C++ build systems and errors). But even if it were, never spending time on learning Python would be a pretty big time saver that probably made up for those few hours on the few programs small enough to make Python viable.

Exciting changes for python
Posted Jul 16, 2025 9:36 UTC (Wed) by farnz (subscriber, #17727)

My lived experience is that Python is valuable when you're working with someone who "doesn't program", but where the spec for the program is "make this colleague happy". You're not going to teach them Rust - but they may well be trainable to the point where they hack around at the part of the Python program that doesn't work for them (breaking the rest into the bargain - they're not going to fix bugs, because that's programming to them) so that they can show you what they meant.

The resulting Python code is not debuggable; it's full of quirks and bugs, mixed together so that you can't tell if a given bit of code is purely a bug, or if it has side effects that are necessary to make another piece of code do the right thing, and with a very healthy helping of bad naming.
Once you've gone through this to a point where they're happy with what the program does when it doesn't crash, you've got a pile of code that needs rewriting (whether you're sticking with Python or changing language); but, at this point, you have a better spec than "make my colleague happy with what the program does", because the spec is now "do what the Python program did, but without the bugs".

Exciting changes for python
Posted Jul 16, 2025 11:33 UTC (Wed) by anselm (subscriber, #2796)

Python code pays for my salary and that of my colleagues. We get stuff done. Our customers are happy. New ones are coming in all the time. Our management loves our team because we perform way better than the budget projections say we should, and have done so for several years. As far as we're concerned, Python is just fine.

Exciting changes for python
Posted Jul 17, 2025 4:01 UTC (Thu) by raven667 (subscriber, #5198)

Good description; this is something I do too in my work, where many of my colleagues are not software engineers, but they can do scripting in shell and Perl. I encourage them to write code to solve their problems, which immediately provides value to them, and, if that needs to be promoted into an ongoing system, then I can use their working prototype as a starting point to refactor using our house style and software-engineering practices.

Premature optimization
Posted Jul 15, 2025 14:01 UTC (Tue) by marcH (subscriber, #57642)

> He's arguing for a more thoughtful approach: most code doesn't need to be heavily optimised, but some code does, and it's important to have the tools to identify those bottlenecks and then to optimise them. Premature optimisation is when you skip the identification step and optimise regardless, but the other extreme is bad too.

BTW, "premature optimization" is just one particular case of: many developers don't like testing. They just want to (re-)write code. "Oh, look: this code is not optimal. I can easily make it faster!" Never mind that it's miles away from any critical path and that the rewrite will yield zero user-visible improvement. Sometimes the developer will not even microbenchmark their rewrite in isolation...

I think most people I ever discussed Knuth's quote with understood it accurately. Whether that actually stopped them from the selfish pleasure of (re-)writing code is a different question :-)

And yes: this is also why a lot of Python software does not need to be converted to a different language - or even JIT-compiled. The critical paths are already in a different language (in an ideal world, it would be easier for anyone to mix different languages).

Premature optimization
Posted Jul 15, 2025 15:35 UTC (Tue) by farnz (subscriber, #17727)

This is also why tools like PyO3 for Rust, and Boost.Python, pybind11, or PyCXX for C++ are so useful. It's common for a 10% speedup in 20% of your code base to have more impact on users than zeroing out the runtime of the other 80% of the code; why rewrite all of your code into some other language, when you can rewrite the 20% that has a significant impact on your users and get over half the benefits?
Exciting changes for python
Posted Jul 15, 2025 14:03 UTC (Tue) by marcH (subscriber, #57642)

> Compiled languages are generally much better for both identifying bottlenecks and then optimising them with incremental changes, because there's a more direct correspondence between source code and machine code;

Errr... are you really sure about that? For sure, debugging optimized C code is a lost cause.

Exciting changes for python
Posted Jul 16, 2025 0:34 UTC (Wed) by Paf (subscriber, #91811)

Yes; as someone who writes C for a living in a high-performance setting, the correspondence to machine code is only rarely relevant to performance unless I'm staring at some really tight bit of code. More often it's much more about approach - *what* you are doing, or perhaps could avoid doing, not the fine details of how the machine is doing it. And I say that as someone who finds that kind of optimization fun. But it doesn't come up much.

Exciting changes for python
Posted Jul 16, 2025 22:21 UTC (Wed) by raven667 (subscriber, #5198)

> More often it's much more about approach - *what* you are doing, or perhaps could avoid doing, not the fine details of how the machine is doing it.

I've found that to be very true. The first draft of code often lives for years if it works, but it can be very meandering, because the initial developer figured out how to solve the problem as they went. Once you know what the output is supposed to be, refactoring can be as simple as replacing many loops through a data set with one loop checking multiple conditions, or a hash-key lookup replacing a complex set of expressions, sometimes made easier by a simple reformatting of the input data to make it easier to look up. The end result is asking the computer to do much less work, which is almost always faster and, as you say, doesn't require deep microarchitectural knowledge - just a basic sense of proportion.
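To make raven667's point concrete, here is a small, hypothetical before-and-after sketch in Python (the data and the names are invented for illustration): the refactor needs no microarchitectural knowledge; it simply asks the computer to do less work.

    records = [("alice", 3), ("bob", 7), ("alice", 2), ("carol", 5)]

    # Meandering first draft: one full scan of the data per question asked.
    def total_for(name):
        return sum(n for (who, n) in records if who == name)

    # Refactored: a single pass builds an index; every later lookup
    # is then a constant-time dictionary access.
    totals = {}
    for who, n in records:
        totals[who] = totals.get(who, 0) + n

    assert total_for("alice") == totals["alice"] == 5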
Exciting changes for python
Posted Jul 17, 2025 8:29 UTC (Thu) by Wol (subscriber, #4433)

lol! At my second job, my boss came to me and said "we have this job. It's a six-week deadline. Can you do it?", and I responded "I'll give it a damn good try". I gave myself a five-week deadline, to give the rest of the team time to do the remaining work once the program spat out the results. Four weeks into the job, the program ran. Estimated run time to completion? SIX WEEKS! Cue three days of panic as I optimised the hell out of it - I finally handed the program over on the Wednesday morning of week 5 (and went sick :-). The rest of the team made the deadline. And I basically did pretty much exactly what you said...

Cheers,
Wol

Exciting changes for python
Posted Jul 22, 2025 21:38 UTC (Tue) by anton (subscriber, #25547)

> If you knew you were setting out to make Facebook to begin with, I think PHP (or Python) would be a poor choice.

Judging by the fact that Facebook/Meta has been enormously successful in a highly competitive field, it may have been a good choice. It may have given them the flexibility to determine the requirements, while competitors who used a language like C++ or Java might have been too slow in adapting to their users and consequently never gained the critical mass.

Exciting changes for python
Posted Jul 23, 2025 9:51 UTC (Wed) by paulj (subscriber, #341)

Note that Facebook long ago rewrote the PHP runtime and extended the PHP language to add typing. It first had a PHP-to-C++ transpiler, then wrote its own PHP JIT VM (HHVM) somewhere around 2010, along with support for an in-house dialect of PHP that adds static typing, called "Hack" (open sourced around 2014, it seems).

Exciting changes for python
Posted Jul 15, 2025 21:34 UTC (Tue) by Bluehorn (subscriber, #17484)

I can confirm that there are scenarios where you get into a lot of pain with Python. I bought the conventional wisdom, saying:

1. Premature optimization is the root of all evil.
2. Use Python for the high-level glue code, and drop into C/C++ for the speed-critical pieces.

This might work for number-crunching applications or games, where there is just a time-critical core that can be optimized. In the application I am working on, we don't have such pieces. We don't compute much; we just keep data organized for the user. So there is no single bottleneck - we can only optimize peaks of 7% at most from the flame graph. I wish I hadn't fallen for that advice. We don't really have the time to port to another language, and the main critique of our software is about performance.

Exciting changes for python
Posted Jul 15, 2025 23:39 UTC (Tue) by marcH (subscriber, #57642)

> We don't compute much, we just keep data organized for the user. So there is no single bottleneck, we can only optimize peaks of 7% at most from the flame graph.

If you don't "compute" much, then why is the programming language the bottleneck? All languages are equal when waiting for storage, network, databases, user input, etc. In that case, optimizing requires re-architecting data and caching, and that's generally not language-specific. Except that higher-level languages make it much easier, safer, and faster to experiment with different designs and strategies.

Exciting changes for python
Posted Jul 16, 2025 8:10 UTC (Wed) by taladar (subscriber, #68407)

I wouldn't say Python makes it safe or easy to change designs and strategies. That is precisely where you want a compiler that tells you all the spots where you accidentally broke something while refactoring.

Exciting changes for python
Posted Jul 16, 2025 15:14 UTC (Wed) by marcH (subscriber, #57642)

I wrote "experiment". Obviously, interpreted languages catch far fewer bugs at compile time and require a lot more test coverage. You absolutely need a ton of test coverage when refactoring, and it tends never to be enough. Introducing regressions in less-usual cases is bad indeed, but that does not get in the way when experimenting with new designs to address performance bottlenecks in _common_ use cases. It only comes back and bites you later; but that's not specific to performance, that's just the nature of all interpreted languages, and it's the price to pay for a much faster development and prototyping loop.

When I wrote "safer" I had memory corruption and C/C++ specifically in mind - those are always the most time-consuming bugs, by several orders of magnitude. There are indeed many other options besides Python and C/C++, and some may be better than either, depending on the use case.
Compared to lower-level languages, Python saves prototyping time not just because you write less code, but also because there are very high-level libraries and approaches to choose from, especially for I/O and concurrency.

Exciting changes for python
Posted Jul 16, 2025 18:24 UTC (Wed) by mathstuf (subscriber, #69389)

> especially for I/O and concurrency.

Except for inotify, for some reason. Granted, this was in 2015, when the Great Python 3 Disruption was still in full swing, but the only inotify libraries I could find were either wrapped up in way-too-large frameworks (e.g., Twisted) or abandoned by their maintainers on Python 2 (possibly until asyncio was established?).

Exciting changes for python
Posted Jul 14, 2025 18:51 UTC (Mon) by mb (subscriber, #50428)

Python is a good language and I enjoy programming in it. But all of my bigger projects eventually ran into performance and/or threading problems. No exceptions. That's the main reason why I don't use Python today for anything that has the potential to outgrow 1kLOC.

And performance improvements of 50% or 100% don't really help, because compiled languages are *so* much faster. I have reached almost-compiled performance with Cython in certain projects. But why would I want to write code in Cython, if much better compiled languages with much better type systems exist?

Alternatives to Python
Posted Jul 15, 2025 9:53 UTC (Tue) by farnz (subscriber, #17727)

One thing I've done in the past, with PyCXX when I used C++, and with PyO3 now that I use Rust most of the time, is to rewrite the "critical path" code from Python into something more amenable to high performance. The core idea is to use cProfile to find the bits of Python code that are bottlenecking you, then work out what architecture changes are needed to move the bottleneck code into a different language - which, of course, can include moving a chunk of non-bottleneck code across, too - leaving just glue code behind in Python.
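As a minimal sketch of that workflow, assuming nothing beyond the standard library (the function names are invented): profile first, let the numbers identify the critical path, and only then decide what crosses the language boundary.

    import cProfile
    import pstats

    def hot_inner_loop(data):
        # Hypothetical bottleneck; the profile will attribute time here.
        return sorted(x * x for x in data)

    def main():
        data = list(range(100_000))
        for _ in range(50):
            hot_inner_loop(data)

    # Sort by cumulative time to see which functions dominate; those are
    # the candidates to move behind a PyO3/pybind11 boundary.
    cProfile.run("main()", "profile.out")
    pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)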
Exciting changes for python
Posted Jul 23, 2025 16:04 UTC (Wed) by danchev (guest, #151356)

Once you are proficient in English and can read medical literature, you are pretty much an MD and you should try something else.

P.S. I have yet to meet anyone who can genuinely claim to have fully mastered a major programming language.

Exciting changes for python
Posted Jul 14, 2025 12:35 UTC (Mon) by khim (subscriber, #9252)

> python had so many good points of its own, it didn't deserve to be relegated as some niche language.

It's strange to call one of the most popular languages a "niche language". It's not a "niche language", but more of a "glue language": you don't use it for anything where many developers may touch the same code with no dedicated owner (there are statically typed languages for that), but it's very nice if the code is not supposed to be supported by more than one person.

Exciting changes for python
Posted Jul 14, 2025 18:53 UTC (Mon) by rjones (subscriber, #159862)

Describing one of the most popular programming languages that ever existed as "niche" is pretty silly. If Python is niche, then what adjectives would one use to describe Golang or C#? Toys or hobby languages?

Exciting changes for python
Posted Jul 15, 2025 10:57 UTC (Tue) by khim (subscriber, #9252)

If we go by popularity, then "toy" or "non-professional" things are always the more popular ones. Everywhere. There are many more toy cars in existence than real cars, for example. Thus, if anything, the popularity of Python puts it squarely in the "toy" area: serious languages couldn't win by sheer numbers, as with any other thing. "Professional" instruments are much less common than "consumer" ones, after all.

Exciting changes for python
Posted Jul 17, 2025 7:46 UTC (Thu) by rjones (subscriber, #159862)

By that logic, brainfuck must be one of the most serious languages ever created.

Exciting changes for python
Posted Jul 17, 2025 11:44 UTC (Thu) by khim (subscriber, #9252)

Nope. "A implies B" is not the same as "B implies A". If tools are only used by professionals in professional settings, then they are, normally, less popular than toys. But that doesn't mean that something that's rarely used is, by necessity, a professional tool. There are many other things, besides professional tools, that are rare, after all.

How to handle dynamic content?
Posted Jul 14, 2025 10:51 UTC (Mon) by claudex (subscriber, #92510)

I understand that the examples are not actual code. But how does the JIT handle changes to a type? For example, with this implementation:

    for duck in ducks:
        if duck.__class__ is Duck:
            sound = "Quack!"
        else:
            sound = duck.quack()
        ...

What happens if, after a few iterations, I change the definition of the quack method with something like:

    Duck.quack = RubberDuck.quack

How will the JIT detect that it should revert back to the interpreter?

How to handle dynamic content?
Posted Jul 14, 2025 12:30 UTC (Mon) by khim (subscriber, #9252)

> How will the JIT detect that it should revert back to the interpreter?

In the exact same way it does in JavaScript? It's not rocket science, just a simple if_class_is_still_the_same check added to the generated code - with near-zero cost, because in most programs it is predicted 100% correctly, always.

How to handle dynamic content?
Posted Jul 14, 2025 16:41 UTC (Mon) by Cyberax (supporter, #52523)

> What happens if after a few iterations, I change the definition of the quack method with something like

The generated code has a guard flag that switches execution back to interpretation, and the class assignments have (generated) setters for this flag. This was first pioneered in the Self compilers in the early '90s, then Java used it in its HotSpot JIT for de-virtualization, and finally the V8 JavaScript JIT extended it for fully dynamic JS.

How to handle dynamic content?
Posted Jul 14, 2025 22:18 UTC (Mon) by claudex (subscriber, #92510)

Thanks, that's the explanation I was missing.
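To spell out what such a guard has to protect, here is a pure-Python sketch building on claudex's example (the "Honk!" rebinding is invented): a method rebound mid-loop must take effect on the very next call, no matter what specialized code was generated from earlier iterations. When a guard of the kind described above notices the class has changed, execution falls back to the interpreter, preserving exactly this behavior.

    class Duck:
        def quack(self):
            return "Quack!"

    ducks = [Duck() for _ in range(4)]
    for i, duck in enumerate(ducks):
        if i == 2:
            # Rebind the method on the class. Note: no parentheses --
            # we assign the function object itself, not a call's result.
            Duck.quack = lambda self: "Honk!"
        print(duck.quack())
    # Prints: Quack! Quack! Honk! Honk! -- the rebinding is visible
    # immediately, which is what any guard scheme must preserve.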
Tracing JITs have their downsides
Posted Jul 14, 2025 17:46 UTC (Mon) by kleptog (subscriber, #1183)

The comment about how "branchy code" has issues with tracing JITs rather glosses over the fact that some common use cases do exactly that. PyPy also uses a tracing JIT, and we had problematic experiences there. The example we had was code that was parsing mailbox files. Basically, we were looping over the email headers and doing stuff as we went, and the tracing JIT went along and started creating specialisations for every order in which the headers could appear. The resulting JIT-generated code was much faster, but the memory usage grew even faster, so that for a fixed amount of memory we were better off just running more non-JIT Python interpreters in parallel.

This same problem applies to any kind of interpreter or parser where you're scanning through an input and selecting one of many functions based on it. What we really wanted was a special STOP_JIT_HERE marker to place at the beginning/end of the loop, to stop the JIT compiler from tracing across it and producing many traces that were rarely used. IIRC, at the time the PyPy developers were not interested in such a hack, but maybe in the meantime the state of the art has advanced to solve this. Raw performance is important, but in some contexts performance per GB of memory used is also relevant.

(Though, thinking on it now, the issue is exacerbated by the GIL meaning that parallel processing must happen in separate processes. If you could process in parallel within a single Python process using threads, then the JIT could share the rarely used traces across all the threads, and the memory usage would probably be much less problematic. Still, I think I'd prefer to sacrifice some performance for known, bounded memory usage.)

Tracing JITs have their downsides
Posted Jul 14, 2025 20:54 UTC (Mon) by roc (subscriber, #30627)

Mozilla's SpiderMonkey invested big in tracing and eventually had to switch to more traditional approaches because of the trace-explosion problem.

However, the performance expectations for JS are much, much higher than for Python: JS has much more competition, and you can't ever be much slower than your competitors. The situation in Python is very different; it's never going to be fast, and users can't easily switch to a better alternative implementation, so eking out small gains by tracing in limited situations might be the way to go.

Tracing JITs have their downsides
Posted Jul 15, 2025 8:38 UTC (Tue) by Sesse (subscriber, #53779)

I believe that, indeed, it was someone from TraceMonkey who pointed out why tracing JITs don't work that well in practice: "You fall off the trace". But fundamentally, what people want is "faster Python", not "a Python JIT", and even though one is used to JITs giving massive speed boosts, the latter is not necessarily the most productive way to get to the former.

Hyperblock scheduling?
Posted Jul 15, 2025 18:13 UTC (Tue) by DemiMarie (subscriber, #164188)

Apparently hyperblock scheduling is a solution to the trace-explosion problem, but to the best of my knowledge no production compiler has implemented it.

Hyperblock scheduling?
Posted Jul 15, 2025 21:07 UTC (Tue) by roc (subscriber, #30627)

The references to "hyperblock scheduling" that I see online all mean "turn conditional basic blocks into predication", which only makes sense to apply selectively - and, on traditional CPU architectures, *very* selectively, because of their limited predication support. It's not going to solve all trace-explosion problems.
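For a feel of the shape of code kleptog describes, here is a hypothetical Python sketch (the header names and handlers are invented): the branch taken varies from message to message, so a tracing JIT that compiles one trace per observed path through the loop can keep generating new, rarely reused traces.

    HANDLERS = {
        "From": lambda v: v.lower(),
        "Subject": lambda v: v.strip(),
        "Date": lambda v: v[:10],
    }

    def process(headers):
        # Each iteration dispatches on the header seen; over a whole
        # mailbox, the sequence of branches taken rarely repeats.
        out = []
        for name, value in headers:
            handler = HANDLERS.get(name)
            out.append(handler(value) if handler else value)
        return out

    print(process([("From", "A@EXAMPLE.COM"), ("Date", "2025-07-14 17:46 UTC")]))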
Use Julia
Posted Jul 16, 2025 4:13 UTC (Wed) by thomas.poulsen (subscriber, #22480)

I think these efforts to optimize Python are obsolete now that Julia has reached maturity. It "walks like Python and runs like C": https://julialang.org/. It is not only great for scientific computing, but also has a great package system (1), plotting libraries (2), and web frameworks (3). The latency problems of early Julia are mostly a thing of the past now. The composability made possible by multiple dispatch promotes a very modular package ecosystem.

1. https://docs.julialang.org/en/v1/stdlib/Pkg/
2. https://makie.org/website/
3. https://genieframework.com/

Use Julia
Posted Jul 16, 2025 7:11 UTC (Wed) by Sesse (subscriber, #53779)

Given the amount of Python code out there, even if people completely stopped writing Python in favor of Julia tomorrow (which isn't going to happen, sorry), making Python faster would still have value. I mean, C is obsolete now that we have OCaml, but somehow people are still interested in faster C compilers :-P

Use Julia
Posted Jul 16, 2025 16:43 UTC (Wed) by azumanga (subscriber, #90158)

I find Julia nowhere near as friendly as Python. The most obvious problem I hit early on (admittedly, I was looking at it for teaching a maths class) is that the default integer type is Int64, so you get all the usual silent-truncation issues that causes. I feel a language that is aiming to be a user-friendly Python replacement shouldn't be silently wrapping integers at 2^63. You can get big ints, but why not default the other way around, where you get big ints by default and can choose 64-bit ints if you need them?

Use Julia
Posted Jul 18, 2025 3:02 UTC (Fri) by thomas.poulsen (subscriber, #22480)

One of the cool features of Julia is the REPL modes. Here's an example of a mode that does all arithmetic in arbitrary precision: https://github.com/MasonProtter/ReplMaker.jl?tab=readme-o...

Function pointers are not always just a pointer to the instruction
Posted Jul 16, 2025 19:28 UTC (Wed) by jrtc27 (subscriber, #107748)

> One caveat is that memory from mmap() comes in page-sized chunks, which is 4KB on most systems but can be larger. If the JIT code is, say, four bytes in length, that can be wasteful, so it needs to be managed carefully. Once you have that memory, he asked, how do you actually execute it? It turns out that "C lets us do crazy things":
>
>     typedef int (*function)(int);
>     ((function)data)(42);
>
> That first line creates a type definition named "function", which is a pointer to a function that takes an integer argument and returns an integer. The second line casts the data pointer to that type and then calls the function with an argument of 42 (and ignores the return value). "It's weird, but it works."

This isn't always true. Most of the time it is, but there are a couple of cases where it's not.

Firstly, on some architectures, the low bits of the address for indirect jumps, and thus function pointers, are used to indicate which execution mode the processor should use. For example, on 32-bit Arm, the LSB is 1 for T32/Thumb and 0 for A32/Arm, and on 32-bit MIPS it similarly distinguishes between MIPS32 and microMIPS32 (or the earlier MIPS16e, which microMIPS32 replaced).

Secondly, some ABIs use function descriptors to represent language-level function pointers.
Here, the function pointer is not a pointer to the instructions to execute, but a pointer to a structure that contains such a pointer alongside one or more other pointers, typically some kind of per-library global pointer. This is the case on PA-RISC, Itanium, and 64-bit PowerPC when using version 1 of its ELF ABI (version 2 drops this, and is what most modern distributions use for 64-bit PowerPC). In embedded contexts, several other ISAs also have an ABI variant (sometimes called "FDPIC", for "function descriptor position-independent code") that does the same, since it allows a single copy of library code to be shared between processes on no-MMU systems.

Function pointers are not always just a pointer to the instruction
Posted Jul 22, 2025 21:31 UTC (Tue) by anton (subscriber, #25547)

Your post makes me appreciate our use of goto * (instead of C function calls) to enter generated machine code in Gforth. As for ARM, we have used the address of the first byte of the code as the target, and that has worked whether the code used T32 or A32. Are you sure that the mode stuff applies to T32 (Thumb2) and not just to Thumb1?

Function pointers are not always just a pointer to the instruction
Posted Jul 23, 2025 9:45 UTC (Wed) by farnz (subscriber, #17727)

The mode stuff on ARM applies to all processors that have both T32 and A32 modes. See the BX instruction. If you're jumping to a register (absolute address), not to an inline offset, the bottom bit determines whether the destination is T32 or A32 code.

This happens not to be a problem for jumps to intentionally generated machine code, because T32 instructions must be 16-bit aligned and A32 instructions must be 32-bit aligned.

Function pointers are not always just a pointer to the instruction
Posted Jul 23, 2025 10:01 UTC (Wed) by excors (subscriber, #95769)

Thumb-2 is the same. When you use an interworking branch instruction (`bx`, `blx`, `ldr pc`, `pop {pc}`, etc.), the LSB determines whether the CPU switches into A32 or T32 mode. When you use non-interworking branches (`b`, `bl`, `mov pc`, etc.), the CPU remains in its current mode, and the bottom 1-2 LSBs of the address are replaced with 0.

In C, function pointers to Thumb functions will have the LSB set to 1 (so they're not the actual address of the instruction in memory), and the compiler will emit `blx` instructions. From some quick testing in GCC and Clang, it looks like computed-goto pointers *don't* have the LSB set. If the function is in Thumb mode, then the compiler will emit either `orr r0, #1; bx r0` (setting the LSB before an interworked branch) or `mov pc, r0` (a non-interworked branch). You're not allowed to computed-goto between different functions, and I don't think a single function can use a mixture of A32 and T32 instructions (except with inline assembly, etc.), so it's safe for the compiler to assume it's not going to switch mode.

Function pointers are not always just a pointer to the instruction
Posted Jul 23, 2025 15:08 UTC (Wed) by anton (subscriber, #25547)

What I actually see in code compiled with -mthumb (apparently -marm is the default for gcc-10, at least on Debian; I have seen Thumb2 code produced by default in earlier times) is:

    orr.w r3, r1, #1
    bx    r3

Yes, if gcc produced values with a set LSB when you take &&mylabel in a Thumb-compiled function, it could then avoid the orr.w instruction in the code for goto *.
However, Gforth uses the values produced by &&mylabel to determine where the code snippets (for code-copying) start and end, so if the value for the label pointed one byte later, we would need to add a workaround (e.g., force the function to compile to A32).

Function pointers are not always just a pointer to the instruction
Posted Jul 23, 2025 16:01 UTC (Wed) by anton (subscriber, #25547)

Correction: gcc-10 on Debian still compiles to T32 by default (I had overlooked a -marm in Gforth).