[HN Gopher] Modern Python Performance Considerations
___________________________________________________________________
Modern Python Performance Considerations
Author : chmaynard
Score : 226 points
Date : 2022-05-05 12:50 UTC (10 hours ago)
(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)
| kzrdude wrote:
| Faster CPython is not the main topic here but certainly welcome,
| since CPython is the most used Python implementation. They've
| done great things so far. Though I remember hearing the promise
| of a 50% improvement in each of five separate steps :)
| joncatanio wrote:
| This is a great read, and it's fantastic to see all the work
| being done to evaluate and improve the language!
|
| The dynamic nature of the language is actually something that I
| studied a few years back [1], particularly the variable and
| object attribute lookups! My work was just a master's thesis, so
| we didn't go too deep into more tricky dynamic aspects of the
| language (e.g. eval, which we restricted entirely). But we did
| see performance improvements by restricting the language in
| certain ways that aid in static analysis, which allowed for more
| performant runtime code. But for those interested, the abstract
| of my thesis [2] gives more insight into what we were evaluating.
|
| Our results showed that restricting dynamic code (code that is
| constructed at run time from other source code) and dynamic
| objects (mutation of the structure of classes and objects at run
| time) significantly improved the performance of our benchmarks.
|
| There was also some great discussion on HN when I posted our
| findings [3].
|
| [1]: https://github.com/joncatanio/cannoli
|
| [2]: https://digitalcommons.calpoly.edu/theses/1886/
|
| [3]: https://news.ycombinator.com/item?id=17093051
| Animats wrote:
| _But we did see performance improvements by restricting the
| language in certain ways that aid in static analysis, which
| allowed for more performant runtime code._
|
| Well, yes. In Python, one thread can monkey-patch the code in
| another thread while running. That feature is seldom used. In
| CPython, the data structures are optimized for that.
| Underneath, everything is a dict. This kills most potential
| optimizations, or even hard-code generation.
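|
| For instance, a minimal sketch of the kind of thing CPython has
| to allow for at any moment (module and function names here are
| made up):
|
|     # helpers.py
|     def greet():
|         return "hello"
|
|     # main.py -- rebinds a name in another module at run time
|     import helpers
|     helpers.greet = lambda: "patched"   # writes into helpers.__dict__
|     print(helpers.greet())              # prints "patched"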
|
| It's possible to deal with that efficiently. PyPy has a
| compiler, an interpreter, and something called the "backup
| interpreter", which apparently kicks in when the program being
| run starts doing weird stuff that requires doing everything in
| dynamic mode.
|
| I proposed adding "freezing", immutable creation, to Python in
| 2010, as a way to make threads work without a global lock.[1]
| Guido didn't like it. Threads in Python still don't do much for
| performance.
|
| [1]
| http://www.animats.com/papers/languages/pythonconcurrency.ht...
| chrisseaton wrote:
| > This kills most potential optimizations, or even hard-code
| generation.
|
| It doesn't - this has been a basically solved problem since
| Self and deoptimisation were invented.
| Animats wrote:
| In theory, yes. In CPython, apparently not. In PyPy,
| yes.[1] PyPy has to do a lot of extra work to permit some
| unlikely events.
|
| [1] https://carolchen.me/blog/jits-impls/
| [deleted]
| chrisseaton wrote:
| You're trying to correct me by posting my own mentee's
| blog post at me.
| jerf wrote:
| "Those techniques are based on the idea that most code "does not
| use the full dynamic power that it could at any given time" and
| that Python can quickly check to see if they are using the
| dynamic features."
|
| If anyone has a burning desire to try to write the next big
| dynamically-typed scripting language, I've often noodled in my
| head with the idea of a language that has a dynamically-typed
| startup phase, but at some point you call "DoneBeingDynamic()" on
| something (program, module, whatever, some playing would have to
| be done here) and the dynamic system basically freezes everything
| into place and becomes a static system. (Or you have an explicit
| startup phase to your module, or something like that.)
|
| The core observation I'm driving at is much the same as the
| quote I give from the article. You generally set up the vast
| majority of your "dynamicness" once at runtime, e.g., you set up
| your monkeypatches, you read the tables out of the DB to set up
| your active classes, you read the config files and munge together
| the configurations, etc. But then forever after, your dynamic
| language is constantly reading this stuff, over and over and
| _over and over_ again, millions, billions, trillions of times,
| with it never changing. But it has to be read for the language to
| work.
|
| Combine that with perhaps some work on a system that backs to a
| struct-like representation of things rather than a hash-like
| representation, and you might be able to build something that
| gets, say, 80% of the dynamicness of a 1990s-era dynamic
| scripting language, while performing at something more like
| compiled language speeds, albeit with a startup cost. If you
| could skip over the dozens of operations that a dynamically-
| typed language like Python needs to resolve
|         x.y.z.q = 25
|
| and get down to a runtime that does what compiled languages do
| (pre-compute the offset into a struct and just set the value),
| you might get near static-language performance with dynamic-
| typing affordances.
|
| You can also view this as a Lisp-like thing that has an
| integrated phase where it has macros, but then at some point puts
| this capability down.
|
| I tend to think it's just fundamentally flawed to take a language
| whose definition intrinsically makes "x.y.z.q" require dozens of
| runtime operations, versus defining a new one where it is a
| first-class priority from day one that the system be able to
| resolve "x.y.z.q" down to some static understanding of what it
| is. e.g., it's OK if y is a property and z is some fancy override
| if the runtime can simply hardcode the relevant details instead
| of having to resolve them every time. You can outrun even JIT-
| like optimizations if you can get this down to the point where
| you don't even have to check incoming types, you just know.
| marcosdumay wrote:
| I disagree. You are just doing those same optimizations by
| hand, instead of on a JIT. The computer is there to help us,
| and a lot of the value in a dynamic language comes from being
| able to override things at any time.
|
| If you just set your structure up and run it statically, you
| are better off with a static language, which can take all kinds
| of value from that fixed structure.
| borodi wrote:
| This feels like you are describing julia, startup cost included
| :).
| coldtea wrote:
| > _I've often noodled in my head with the idea of a language
| that has a dynamically-typed startup phase, but at some point
| you call "DoneBeingDynamic()" on something (program, module,
| whatever, some playing would have to be done here) and the
| dynamic system basically freezes everything into place and
| becomes a static system. (Or you have an explicit startup phase
| to your module, or something like that.)_
|
| V8 tries to guess that for classes and objects based on runtime
| information - that's how it gets some of its speed (it still
| needs checks about whether this is violated at any point, so
| that it can get rid of the proxy/stub "static" object it
| guessed).
|
| For a more static guarantee, there are also things like
| Object.freeze which does about what you describe for dynamic
| objects in JS (#).
|
| # https://gist.github.com/briancavalier/3772938
| jerf wrote:
| I'd be curious to see if a language developed with the idea
| that this is what it's going to do from scratch could do
| better than trying to bodge it on afterwards. Rather than
| pecking around what could be done literally decades after the
| language is specified, what if you started out with this
| idea?
|
| I dunno. It's possible the real world would stomp all over
| this idea in practice, or the resulting language would just
| be too complex to be usable. It does imply a rather weird
| bifurcation between being in "the init phase" and "the normal
| runtime phase", and who knows what other "phases" could
| emerge. Although technically Perl already has this split, it can
| generally be ignored, because it's of much less consequence in
| Perl: there mostly isn't much utility in having something done
| in the earlier phase, unlike in this hypothetical language.
| gpderetta wrote:
| It seems that lisp-like macros or more generally multistage
| compilation is close to what you have in mind.
| jerf wrote:
| Yes, it's not a brand-new dimension of programming
| languages, merely a refinement of existing ideas. However
| I'm not aware of anything quite like it out there. Lisp
| could be used to implement it, but, I mean, that's not a
| very strong statement now is it? Lisp can be used to
| implement anything. The question is about whether it
| exists.
|
| Partially I throw this idea out as a bone to those who
| like dynamic languages. Personally I don't have this
| problem anymore because I've basically given them up,
| except in cases where the problem is too small to matter.
| And if you already know and like Lisp, you don't really
| have this problem either.
|
| But if you are a devotee of the 1990s dynamic scripting
| languages, you're getting really squeezed right now by
| performance issues. You can run 40-50x slower than C, or
| you can run circa 10x slower than C with an amazing JIT
| that requires a ton of effort and will forever be very
| quirky with performance, and in both cases you'll be
| doing quite a lot of work to use more than one core at a
| time. Python is just hanging in there with the amazing
| amount of work being poured into NumPy, and from what I
| gather from my limited interactions with data scientists,
| as data sets get larger and the pipelines more complex,
| the odds that you'll fall out of what NumPy can do and
| fall back to pure Python go up, and the price of that
| goes up too.
|
| I think a new dynamic scripting language built from the
| ground up to be multithreadable and high performance via
| some techniques like this would have some room to run,
| and while hordes of people will come out of the woodwork
| promising that one of the existing ones will get there
| Real Soon Now, just wait, they've almost got it, the
| reality is I think that the current languages have pretty
| much been pushed as far as they can be. Unless someone
| writes this language, dynamic scripting languages are
| going to continue slowly, quite slowly, but also quite
| surely, just getting squeezed out of computing entirely.
| I mean, I'm looking at the road ahead and I'm not sure
| how Go or C# is going to navigate a world where even low-
| end CPUs casually have 128 cores on consumer hardware....
| Python _qua_ Python is going to face a real uphill battle
| when the decision to use it entails basically committing
| to not only using less than 1% of the available cores
| (without offloading on to the programmer a significant
| amount of work to get past that), but also using that
| core ~1.5 orders of magnitude less efficiently than a
| compiled language. You've always had to pay some to use
| Python, sure, but that's an awful lot of _orders of
| magnitude_ for "a nice language". Surely we can have "a
| nice language" for less price than that.
| ufo wrote:
| This kind of sounds similar to what a JIT compiler does, except
| that a JIT will silently fall back to slower code if you do
| those forbidden dynamic things. I think the most appealing
| thing about what you're suggesting here is less about the peak
| performance and more about having better guarantees about
| startup cost and that performance won't be degraded (prefer
| failing loudly to chugging along unoptimized). These two areas
| often aren't the strongest point in JIT-ed systems...
| tln wrote:
| This approach kind of describes Graal.
|
| Interestingly, GraalPython never seems to come up on these
| speeding-up-Python articles & benchmarks while TruffleRuby is a
| heavyweight in the speeding-up-Ruby space.
| kmod wrote:
| I tried to benchmark GraalPython for the talk but the
| compatibility situation was so poor that I wasn't even close
| to being able to run any benchmarks.
| w-m wrote:
| This may be a naive question (I have very little knowledge
| about building languages and compilers): Would this be possible
| in Python by introducing a keyword like `final`? Any object,
| variable, or method that is marked final would only have to be
| looked up once by the interpreter; the re-fetching the article
| describes wouldn't have to happen again. Trying to change a
| final thing would result in an exception.
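|
| (For what it's worth, Python already has a typing.Final
| annotation spelled roughly like that, but it is only enforced
| by static type checkers such as mypy, not by the interpreter,
| so by itself it wouldn't enable that kind of caching:)
|
|     from typing import Final
|
|     MAX_RETRIES: Final = 3
|     MAX_RETRIES = 4   # a type checker flags this; CPython happily runs it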
| uncomputation wrote:
| With JavaScript, these kinds of optimizations in an engine make
| sense due to the web being limited by it and thus speed is a huge
| factor. With Python, however, if a Python web framework is "too"
| slow, I would honestly say the problem is using Python at all for
| a web server. Python shines beautifully as a (somewhat) cross
| platform scripting language: file reading and writing,
| environment variables, simple implementations of basic utilities:
| sort, length, max, etc. that would be cumbersome in C. The move
| of Python out of this and into practically everything is the
| issue, and then we get led into rabbit holes such as this one,
| because we are using Python, a dynamic scripting language, for
| things a second-year computer science student should know are
| not "the right jobs for the tool."
|
| Instead of performance, I'd like to see more effort in
| portability, package management, and stability for Python
| because, essentially since it is often enterprise managed,
| juggling fifteen versions of Python where 3.8.x supports native
| collection typing annotations but we use 3.7.x, etc. is my
| biggest complaint. Also up there is pip and just the general mess
| of dependencies and lack of a lock file. Performance doesn't even
| make the list.
|
| This is not to discredit anyone's work. There is a lot of
| excellent technical work and research done as discussed in the
| article. I just think honestly a lot of this effort is wasted on
| things low on the priority tree of Python.
| Barrin92 wrote:
| waprin wrote:
| On paper, Python is not the right tool for the job: both
| because of its bad performance characteristics and because it's
| so forgiving/flexible/dynamic, it's tough to maintain large
| Python codebases with many engineers.
|
| At Google there is some essay saying that Python should be
| avoided for large projects.
|
| But then there's the reality that YouTube was written in
| Python. Instagram is a Django app. Pinterest serves 450M
| monthly users as a Python app. As far as I know Python was a
| key language for the backend of some other huge web scale
| products like Lyft, Uber, and Robinhood.
|
| There's this interesting dissonance where all the second year
| CS students and their professors agree it's the wrong tool for
| the job yet the most successful products in the world did it
| anyway.
|
| I guess you could interpret that to mean all these people
| building these products made a bad choice that succeeded
| despite using Python but I'd interpret it as another instance
| of Worse is Better. Just like Linus was told monolithic kernels
| were the wrong tool for the job but we're all running Linux
| anyway.
|
| Sometimes all these "best practices" are just not how things
| work in reality. In reality Python is a mission-critical
| language in many massively important projects, its performance
| characteristics matter a ton, and efforts to improve them
| should be lauded rather than scrutinized.
| arinlen wrote:
| > But then there's the reality that YouTube was written in
| Python. Instagram is a Django app. Pinterest serves 450M
| monthly users as a Python app. As far as I know Python was a
| key language for the backend of some other huge web scale
| products like Lyft, Uber, and Robinhood.
|
| All those namedrops mean and matter nothing. Hacking together
| proofs of concept is a time-honoured tradition, as is pushing
| to production hacky code that's badly stitched up. Who knows
| if there was any technical analysis to pick Python over any
| alternative? Who knows how much additional engineering work
| and additional resources were required to keep that Python
| code from breaking apart in production? I mean, Python always
| figured very low in webapp framework benchmarks. Did that
| change just because <trendy company> claims it used Python?
|
| Also serving a lot of monthly users says nothing about a tech
| stack. It says a lot about the engineering that went into
| developing the platform. If a webapp is architected so that
| it can scale well to meet its real-world demand, even after
| paying a premium for the poor choice of tech stack some guy
| who is no longer around made in the past for god knows what
| reason, what would that say about the tech stack?
| dataflow wrote:
| I don't think "I could use tool X for job Y" implies "X was
| the right tool for job Y". You could commute with a truck to
| your workplace 300 feet away for 50 years straight and I
| would still argue you probably used the wrong tool for the
| job. "Wrong tool" doesn't imply "it is impossible to do
| this", it just means "there are better options".
| ChrisLomont wrote:
| >the most successful products in the world did it anyway
|
| A few successful projects in the world did it. There's likely
| far more successful products that didn't use it.
|
| The key metric along this line is how often each language
| allows success to some level and how often they fail
| (especially when due to the choice of language).
|
| >should be lauded rather than scrutinized
|
| One can do both at the same time.
| sjtindell wrote:
| Instagram has one billion monthly users generating $7
| billion a year. There are almost zero products on earth as
| successful.
| arinlen wrote:
| > Instagram has one billion monthly users generating $7
| billion a year.
|
| Doesn't Instagram serve mostly static content that's put
| together in an appealing way by mobile apps? I'd figure
| Instagram's CDN has far more impact than whatever Python
| code it's running somewhere in its entrails.
|
| Cargo cult approaches to tech stacks don't define
| quality.
| xboxnolifes wrote:
| The point is that it's still _one_ project. You need to
| count the failures as well to rule out survivorship bias.
| slt2021 wrote:
| Just compare Instagram written in Python to Google Wave,
| Google+ or any other Google's social media, written in
| C++/Java :))))
| jeremycarter wrote:
| And you can put 7 billion of effort into tweaking your
| python application performance?
| fddhjjj wrote:
| > The key metric along this line is how often each language
| allows success to some level and how often they fail
|
| How does python score on these key metrics?
| w1nk wrote:
| > There's this interesting dissonance where all the second
| year CS students and their professors agree it's the wrong
| tool for the job yet the most successful products in the
| world did it anyway.
|
| > I guess you could interpret that to mean all these people
| building these products made a bad choice that succeeded
| despite using Python but I'd interpret it as another instance
| of Worse is Better. Just like Linus was told monolithic
| kernels were the wrong tool for the job but we're all running
| Linux anyway.
|
| This isn't the correct perspective or takeaway. The 'tool'
| for the job when you're talking about building/scaling a
| website changes over time as the business requirements shift.
| When you're trying to find market fit, iterating quickly
| using 'RAD' style tools is what you need to be doing. Once
| you've found that fit and you need to scale, those tools will
| need to be replaced by things that are capable of scaling
| accordingly.
|
| Evaluating this as a binary right choice / wrong choice only
| makes sense when qualified with a point in time and/or scale.
| digisign wrote:
| The folks that work on performance are not the folks working on
| packaging. Shall we stop their work until the packaging team
| gets in gear?
| rmbyrro wrote:
| Totally agree that performance is not on my top 10 wish list
| for Python.
|
| But I disagree on " _not the right jobs for the tool_ ".
|
| Python is extremely versatile and can be used as a valid tool
| for a lot of different jobs, as long as it fits the _job
| requirements_ , performance included.
|
| It doesn't require a CS degree to know that fitting _job
| requirements_ and other factors like the team expertise, speed,
| budget, etc, are more important than fitting a theoretical
| sense of "right jobs for the tool".
| blagie wrote:
| > It doesn't require a CS degree to know that fitting job
| requirements and other factors like the team expertise,
| speed, budget, etc, are more important than fitting a
| theoretical sense of "right jobs for the tool".
|
| It requires experience.
|
| A lot of those lessons only come after you've seen how much
| more expensive it is to maintain a system than to develop
| one, and how much harder people issues are than technical
| issues.
|
| A CS degree, or even a junior developer, won't have that.
| moffkalast wrote:
| Python can do just about anything... but it will take its
| time doing it.
| pjmlp wrote:
| Agreed, my only use for Python since version 1.6 is portable
| shell scripting, or for when sh scripts get too complicated.
|
| Anything beyond that, there are compiled languages with REPL
| available.
| mrtranscendence wrote:
| What compiled languages do you have in mind? I suppose
| technically there are repls for C or Rust or Java, but I
| wouldn't consider them ideal for interactive programming.
| Functional programming might do a bit better -- Scala and
| GHCi work fine interactively. Does Go have a repl?
| eatonphil wrote:
| > compiled languages
|
| Might be tripping you up. Very few languages require that
| _implementations_ be compiled or interpreted. For most
| languages, having a compiler or interpreter is an
| implementation decision.
|
| I can implement Python as an interpreter (CPython) or as a
| compiler (mypyc). I can implement Scheme as an interpreter
| (Chicken Scheme's csi) or as a compiler (Chicken Scheme's
| csc). The list goes on: Standard ML's Poly/ML
| implementation ships a compiler and an interpreter; OCaml
| ships a compiler and an interpreter.
|
| There are interpreted versions of Go like
| https://github.com/traefik/yaegi. And there are native-,
| AOT-compiled versions of Java like GraalVM's native-image.
|
| For most languages there need be no relationship at all
| between compiler vs interpreter, static vs dynamic, strict
| or no typing.
| pjmlp wrote:
| Java, C#, F#, Lisp variants, and C++.
|
| Eclipse has had Java scratchpads for ages, Groovy also works
| for trying out ideas, and nowadays we have jshell.
|
| F# has a REPL in the ML lineage, and nowadays C# also shares a
| REPL with it in Visual Studio.
|
| Lisp variants, going at it for 60 years.
|
| C++, there are hot reload environments, scripting variants,
| and even C and C++ debuggers can be quite interactive.
|
| I used GDB in 1996, alongside XEmacs, as poor man's REPL
| while creating a B+Tree library in C.
|
| Yes, there are Go interpreters available,
|
| https://github.com/traefik/yaegi
| blagie wrote:
| I want a common language I can work with. Right now, Python is
| the only tool which fits the bill.
|
| A critical thing is Python does numerics very, very well. With
| machine learning, data science, and analytics being what they
| are, there aren't many alternatives. R, Matlab, and Stata won't
| do web servers. That's not to mention wonderful integrations
| with OpenCV, torch, etc.
|
| Python is also competent at dev-ops, with tools like ansible,
| fabric, and similar.
|
| It does lots of niches well. For example, it talks to hardware.
| If you've got a quadcopter or some embedded thing, Python is
| often a go-to.
|
| All of these things need to integrate. A system with
| Ruby+R+Java will be much worse than one which just uses Python.
| From there, it's network effects. Python isn't the ideal server
| language, but it beats a language which _just_ does servers.
|
| As a footnote, Python does package management much better than
| alternatives.
|
| pip+virtualenv >> npm + (some subset of require.js / rollup.js
| / ES2015 modules / AMD / CommonJS / etc.)
|
| JavaScript has finally gone from a horrible, no-good, bad
| language to a somewhat competent one with ES2015, but it has at
| least another 5-10 years before it can start to compete with
| Python for numerics or hardware. It's a sane choice if you're
| front-end heavy, or mobile-heavy. If you're back-end heavy
| (e.g. an ML system) or hardware-heavy (e.g. something which
| talks to a dozen cameras), Python often is the only sane
| choice.
| Denvercoder9 wrote:
| > As a footnote, Python does package management much better
| than alternatives.
|
| If you use it as a scripting language, that might very well
| be the case (it's at least simpler). When you're building
| libraries or applications, no, definitely not. It's a huge
| mess, and every 3 years or so we get another new tool that
| promises to solve it, but just ends up creating a bigger
| mess.
| whimsicalism wrote:
| I think poetry actually does solve it
| Thrymr wrote:
| Oh, there are a half dozen different tools that solve
| python package management. Unfortunately, they are
| mutually incompatible and none solve it for all use
| cases.
| whimsicalism wrote:
| > it has at least another 5-10 years before it can start to
| compete with Python for numerics or hardware
|
| More, given that outside of Julia no language competes with
| Python at high-level numerics, and general numerics only adds
| C++ to the list.
| Robotbeat wrote:
| Fortran >:D
| whimsicalism wrote:
| For low-level, fair. I only know of people in astronomy
| academia who actually use it nowadays though.
| DeathArrow wrote:
| >. R, Matlab, and Stata won't do web servers.
|
| Not unless they're pushed to, like Python was.
|
| >A critical thing is Python does numerics very, very well.
|
| That's not Python doing numerical stuff. That's C code,
| called from Python.
| fractalb wrote:
| > Not unless they're pushed to, like Python was.
|
| Readability of code and ease of use is a big thing. It's
| just not about pushing hard till we make it.
|
| edit: formatting
| jonnycomputer wrote:
| I wouldn't want to do a web-server in MATLAB. I like
| MATLAB, but no, not that.
| mrtranscendence wrote:
| > That's not Python doing numerical stuff. That's C code,
| called from Python.
|
| That's sort of a distinction without a difference, isn't
| it? Python can be good for numeric code in many instances
| because someone has gone through the effort of implementing
| wrappers atop C and Fortran code. But I'd rather be using
| the Python wrappers than C or especially Fortran directly,
| so it makes at least a little sense to say that Python
| "does numerics [...] well".
|
| > Not unless they're pushed to, like Python was.
|
| R and Matlab, maybe. A web server in Stata would be a
| horrible beast to behold. I can't imagine what that would
| look like. Stata is a _terrible_ general purpose language,
| excelling only at canned econometrics routines and
| plotting. I had to write nontrivial Stata code in grad
| school and it was a painful experience I'd just as soon
| forget.
| disgruntledphd2 wrote:
| You can do web stuff in R, but it's a lot harder than it
| needs to be. R sucks for string interpolation, and a lot
| of web related stuff is string interpolation.
| mrtranscendence wrote:
| Yeah, I'm not surprised by that. The extent of my web
| experience in R is calling rcurl occasionally, so I've
| never tried and failed to do anything complicated.
| blagie wrote:
| It's not C code. It calls into a mixture of C, CUDA,
| Fortran, and a slew of other things. Someone did the work
| of finding the best library for me, and integrating them.
|
| As for me, I write:
|
| A * B
|
| It multiplies two matrices. C can't do that. In C, I'd have
| some unreadable matrix64_multiply(a, b). Readability is a
| big deal. Math should look more-or-less like math. I can
| handle 2^4, or 2**4, but if you have mpow(2, 4) in the
| middle of a complex equation, the number of bugs goes way
| up.
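|
| (Concretely -- noting that in NumPy, A * B is the elementwise
| product and A @ B is the matrix product, but either way it is
| one readable line:)
|
|     import numpy as np
|
|     A = np.arange(9.0).reshape(3, 3)
|     B = np.eye(3)
|     C = A * B   # elementwise product
|     D = A @ B   # matrix product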
|
| I'd also need to allocate and free memory. Data wrangling
| is also a disaster in C. Format strings were a really good
| idea in the seventies, and were a huge step up from BASIC
| or Python. For 2022?
|
| And for that A * B? If I change data types, things just
| work. This means I can make large algorithmic changes
| painlessly.
|
| Oh, and I can develop interactively. ipython and jupyter
| are great calculators. Once the math is right, I can copy
| it into my program.
|
| I won't even get started on things like help strings and
| documentation.
|
| Or closures. Closures and modern functional programming are
| huge. Even in the days of C and C++, I'd rather do math in
| a Lisp (usually, Scheme).
|
| I used to do numerics in C++, and in C before that. It's at
| least a 10x difference in programmer productivity stepping
| up to Python.
|
| Your comment sounds like someone who has never done
| numerical stuff before, or at least not serious numerical
| stuff.
| danuker wrote:
| > the number of bugs goes way up
|
| In case you are forced to use the unreadable long-named
| unintuitively-syntaxed methods, add unit tests, and check
| that input-output pairs match with whatever formula you
| started with.
| tomrod wrote:
| Yet, Python (and most of her programmers including data
| scientists, of which I am one) stumble with typing.
|
|     if 0.1 + 0.2 == 0.3:
|         print('Data is handled as expected.')
|     else:
|         print('Ruh roh.')
|
| This fails on Python 3.10 because floats are not
| decimals, even if we really want them to be. So most
| folks ignore the complexity (due to naivety or
| convenience) or architect appropriately after seeing
| weird bugs. But the "Python is easiest and gets it right"
| notion that I'm often guilty of has some clear edge
| cases.
| fnord123 wrote:
| This is an issue for accountancy. Many numerical fields
| have data coming from noisy instruments so being lossy
| doesn't matter. In the same vein as why GPUs offer f16
| typed values.
| dullcrisp wrote:
| Why would you want decimals for numeric computations
| though? Rationals might be useful for algebraic
| computations, but that'd be pretty niche. I'd think
| decimals would only be useful for presentation and maybe
| accountancy.
| tomrod wrote:
| Well, for starters folks tend to code expecting
| 0.1+0.2=0.3, rather than abs(0.3-0.2-0.1) <
| tolerance_value
|
| Raw floats don't get you there unfortunately.
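|
| (e.g., with the stdlib helper for tolerance-based comparison:)
|
|     import math
|
|     0.1 + 0.2 == 0.3              # False
|     math.isclose(0.1 + 0.2, 0.3)  # True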
| gjm11 wrote:
| They also expect 1/3 + 1/3 + 1/3 == 1. Decimals won't
| help with that.
| kbenson wrote:
| That's slightly different in that most programmers won't
| read 1/3 as "one third" but instead "one divided by
| three", and interpret that as three divisions added
| together, and the expectations are different. Seeing a
| constant written as a decimal invites people to think of
| them as decimals, rather than the actual internal
| representation, which is often "the float that most
| closely represents or approximates that decimal".
| dekhn wrote:
| https://docs.python.org/3/library/decimal.html
| tomrod wrote:
| Correct! Many python users don't know about this and
| similar libraries that assist with data types. Numpy has
| several as well.
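|
| (For example, with the decimal module the earlier comparison
| behaves the way people expect:)
|
|     from decimal import Decimal
|
|     Decimal("0.1") + Decimal("0.2") == Decimal("0.3")   # True
|     0.1 + 0.2 == 0.3                                    # False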
| kbenson wrote:
| > As a footnote, Python does package management much better
| than alternatives
|
| No offense meant, but that sounds like the assessment of
| someone that has only experienced really shitty package
| management systems. PyPI has had their XMLRPC search
| interface disabled for months (a year?) now, so you can't
| even easily figure out what to install from the shell and
| have to use other tools/a browser to figure it out.
|
| Ultimately, I'm moving towards thinking that most scripting
| languages actually make for fairly poor systems and admin
| languages. It used to be the ease of development made all the
| other problems moot, but there's been large advances in
| compiled language usability.
|
| For scripting languages you're either going to follow the
| path of Perl or the path of Python, and they both have
| their problems. For Perl, you get amazing stability at the
| expense of eventually the language dying out because there's
| not enough new features to keep people interested.
|
| For Python, the new features mean that module writers want to
| use them, and then they do, and you'll find that the system
| Python you have can't handle what modules need for things you
| want to install, and so you're forced to not just have a
| separate module environment, but fully separate pythons
| installed on servers so you can make use of the module
| ecosystem. For a specific app you're shipping around this is
| fine, but when maintaining a fleet of servers and trying to
| provide a consistent environment, this is a big PITA that you
| don't want to deal with when you've already chosen a major
| LTS distro to avoid problems like this.
|
| Compiling a scripting language usually doesn't help much
| either, as that usually results in extremely bloated binaries
| which have their own packaging and consistency problems.
|
| This is a cyclical problem we've had so far. A language is used
| for admin and system work, the requirements of administrators
| grate up against the usage needs of people that use the
| language for other things, and it fails for non-admin work
| and loses popularity and gets replaced by something more
| popular (Perl -> Python) or it fails for admin work because
| it caters to other uses and eventually gets replaced by
| something more stable (what I think will happen to Python,
| what I think somewhat happened to bash earlier for slightly
| different reasons).
|
| I'm not a huge fan of Go, but I can definitely see why people
| switch to it for systems work. It alleviates a decent chunk
| of the consistency problems, so it's at least better in that
| respect.
| jonnycomputer wrote:
| >No offense meant, but that sounds like the assessment of
| someone that has only experienced really shitty package
| management systems. PyPI has had their XMLRPC search
| interface disabled for months (a year?) now, so you can't
| even easily figure out what to install from the shell and
| have to use other tools/a browser to figure it out.
|
| Yes, this is, frankly, an absurd situation for python.
|
| And then there is the fact that I end up depending on
| third-party solutions to manage dependencies. Python is
| big-time now; stop the amateur hour crap.
| the__alchemist wrote:
| I agree! Here's a related point: Rust seems ideal for web
| servers, since it's fast, and is almost as ergonomic as Python
| for things you listed as cumbersome in C. So, why do I use
| Python for web servers instead of Rust? Because of the robust
| set of tools Django provides. When evaluating a
| language, fundamentals like syntax and performance are one
| part. Given web server bottlenecks are I/O limited (mitigating
| Python's slowness for many web server uses), and that I'd have
| to reinvent several wheels in Rust, I use Python for current
| and future web projects.
|
| Another example, with a different take: MicroPython, on
| embedded. The only good reason I can think of for this is to
| appeal to people who've learned Python, and don't want to learn
| another language.
| rootusrootus wrote:
| > the problem is using Python at all for a web server
|
| I don't agree with this. Maybe for a web server where
| performance is really going to matter down to the microsecond,
| and I've got no other way to scale it. I write server code in
| both Javascript and Python, and despite all of my efforts I
| still find that I can spin up a simple site in something like
| django and then add features to it much more easily than I can
| with node. It just has less overhead, is simpler, lets me get
| directly to what I need without having to work too hard. It's
| not like express is _hard_ per se, but python is such an easy
| language to work with and it stays out of my way as long as
| I'm not trying to do exotic things.
|
| And then it pays dividends later, as well, because it's really
| easy for a python developer to pick up code and maintain it,
| but for JS it's more dependent on how well the original
| programmer designed it.
| srcreigh wrote:
| The problem with Django services is the insanely low
| concurrency level compared to other server frameworks
| (including node).
|
| Django is single request at a time with no async. The
| standard fix is gunicorn worker processes, but then you
| require entire server memory * N memory instead of
| lightweight thread/request struct * N memory for N requests.
|
| I shudder to think that whenever Django server is doing an
| HTTP request to a different service or running a DB query,
| it's just doing nothing while other requests are waiting in
| the gunicorn queue.
|
| The difference is that if you have an endpoint whose queries
| take 2s+ for one customer, then with Django it might cause the
| entire service to stall for everybody, whereas with a decent
| async server framework other fast endpoints can make progress
| while the slow 2s ones are in flight.
| pdhborges wrote:
| You can configure gunicorn to use multiple threads to
| recover quite a bit of concurrency in those scenarios and
| that is enough for many applications.
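|
| (Something along these lines -- the numbers are illustrative,
| not a recommendation:)
|
|     # gunicorn.conf.py
|     import multiprocessing
|
|     worker_class = "gthread"   # threaded sync workers
|     workers = multiprocessing.cpu_count() * 2 + 1
|     threads = 4                # per-worker threads for IO-bound requests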
| srcreigh wrote:
| What threading/workers configuration do you use?
|
| I'm looking at a page now which recommends 9 concurrent
| requests for a Django server running on a 4-core
| computer.
|
| Meanwhile node servers can easily handle hundreds of
| concurrent requests.
| pdhborges wrote:
| We use the ncpu * 2 + 1 formula for the number of workers
| that serve API requests.
|
| I don't think in 'handling x concurrent requests' terms
| because I don't even know what that means. Usually I
| think in terms of throughput, latency distributions, and
| the number of connections that can be kept open (for
| servers that deal with web sockets).
|
| For example, if you have the 4-core computer, 4 workers,
| and requests that take around 50ms each, you can get to a
| throughput of 80 requests per second. If the fraction of
| request time spent on IO is 50%, you can bump your thread
| count to try to reach 160 requests per second. Note that
| in this case each request consumes 25ms of CPU, so you
| would never be able to get more than 40 requests per
| second per CPU whether you are using node or python.
| manfre wrote:
| Django has async support for everything except the ORM.
| async db is possible without the ORM or by doing some
| thread pool/sync to async wrapping. A PR for that was under
| review last I checked.
|
| Either way, high concurrency websites shouldn't have
| queries that take multiple seconds and it's still possible
| to block async processes in most languages if you mix in a
| blocking sync operation.
| dirnctiwnsidj wrote:
| This sounds like sour grapes. Python is a general-purpose
| language. Languages like Awk and Perl and Bash are clearly
| domain-specific, but Python is a pretty normal procedural
| language (with OO bolted on). The fact that it is dynamic and
| high-level does not mean it is unsuited for applications or the
| back-end. People use high-level dynamic languages for servers
| all the time, like Groovy or Ruby or, hell, even Node.js.
|
| What about Python makes it unsuitable for those purposes other
| than its performance?
| make3 wrote:
| I'm not sure it's very relevant to say, in a discussion of
| "how do we improve Python", that the answer is "don't use
| Python". People have all kinds of valid reasons to use Python.
| Let's keep this on topic please.
| heavyset_go wrote:
| > _Also up there is pip and just the general mess of
| dependencies and lack of a lock file._
|
| You can use pyproject.toml or requirements.txt as lock files;
| Poetry can use the former, as well as poetry.lock files.
| marius_k wrote:
| > and lack of a lock file
|
| Is it possible to solve your problem using pip freeze?
| robotsteve2 wrote:
| The world doesn't revolve around web development. It's not the
| only use case. Scientific Python is huge and benefits
| tremendously from the language being faster. If Python can be
| 1% faster, that's a significant force multiplier for scientific
| research and engineering analysis/design (in both academia and
| industry).
| mrtranscendence wrote:
| Because most of the really huge scientific Python libraries
| are written as wrappers over lower-level language code, I'd
| be curious to what extent speeding up Python by, say, 10%
| would speed up "normal" scientific Python code on average.
| 1%? 5%?
| animatedb wrote:
| If you are talking about large sets of numbers, then the
| speed up will be far below 1%.
| DeathArrow wrote:
| >The first topic he raised, "why Python is slow", is somewhat
| divisive
|
| What dynamic, interpreted, single threaded language is fast?
| baisq wrote:
| Practically every other language that ticks those boxes is
| faster than Python.
| bsder wrote:
| > What dynamic, interpreted, single threaded language is fast?
|
| Javascript. End of list.
|
| The problem is that a Javascript implementation is now _so_
| complicated that you can't develop a new one without massive
| investment of resources.
| brokencode wrote:
| As far as interpreted languages go, Wren is pretty quick, but
| still not fast compared to compiled languages.
|
| But for dynamic, single threaded languages, JavaScript is
| famously fast with a modern JIT compiler like V8.
| Qem wrote:
| Lua (LuaJIT implementation). Some Smalltalk VMs are also quite
| fast. For example, see Eliot Miranda work on CogVM.
| astrobe_ wrote:
| You're pushing it a bit too far if you say that JIT is
| interpreted.
|
| To answer OP, if you replace "dynamic" by "untyped", Forth
| qualifies. And it actually can go where there's no JIT to
| save your A from the "just throw more hardware (and software)
| at the problem" mindset.
| Qem wrote:
| I think someone once said dynamic langs must cheat to be
| performant. Jitted runtimes are just interpreters cheating.
| DeathArrow wrote:
| What's wrong with using the right tool for the right job? Python
| for utility scripts, Javascript for Web frontend, C and C++ for
| system programming, C# for Web backend, R for statistical stuff
| and data analysis?
|
| It seems to me some guys learned a language suited to a thing and
| instead of learning other languages better fitted for other
| purposes, they push for their one and only language to be used
| everywhere, resulting in delays and financial losses.
|
| It's not very hard to learn another language. Or, if you are that
| lazy, you can stay with the language you know and use it for what
| was intended.
| ReflectedImage wrote:
| Python has dominance in web backends, statistical stuff and
| data analysis nowadays.
| supreme_berry wrote:
| Dumbest comment on the thread?
| dijit wrote:
| As a very partial; almost unrelated question: Is there any python
| module that you use day-to-day that you'd like to have a
| significant speedup with?
|
| I'm thinking of reimplementing some python modules in rust, as
| that seems like the kind of weird thing I'm in to. I've done it
| with some success (using the excellent work of the pyo3 project)
| professionally, but I'd be interested in doing more.
| yedpodtrzitko wrote:
| Pydantic is quite popular library. Its author is doing exactly
| this - rewriting its core [0] in Rust. It's still WIP, but
| readme mentions that "Pydantic-core is currently around 17x
| faster than Pydantic Standard."
|
| [0] https://github.com/samuelcolvin/pydantic-core
| tclancy wrote:
| Not working in Python right now, but I have 15 years of Python
| + Django on the web and while there are any number of attempts
| at this (I keep a list at
| https://pinboard.in/u:tclancy/t:json/t:python/), any
| improvement in JSON serialization and deserialization speeds is
| a huge boon to projects. I am trying to think of similar
| bottlenecks where a drop-in replacement can be a huge
| performance improvement.
| JackC wrote:
| The missing thing last time I looked was a fast python json
| library that's byte-compatible with stdlib -- same inputs,
| same outputs. There are good fast options but they tend to
| add some (perfectly reasonable) limitation like fixed
| indentation size, for the sake of speed, that blocks them
| from being dropped into an existing public API.
| dotnet00 wrote:
| Definitely matplotlib. Navigating image plots in interactive
| mode with even just 10000x10000 pixels is painfully slow. While
| I've picked up some alternatives, they don't feel as clean as
| matplotlib.
| wcunning wrote:
| 10000% -- matplotlib for visualization of a lot of different
| data I've looked at, but esp things like high res images in
| machine learning contexts is incredibly slow, even on good
| computers. It does fine for small vector stuff and render
| once and save graphs, but it's bad for what a lot of people
| use it for.
| mritchie712 wrote:
| pandas
| curiousgal wrote:
| I remember, when trying to squeeze some performance out of
| it, that a lot of the overhead came from it trying to infer
| types.
| w-m wrote:
| This is a curious reply for me. I would think that there are
| very few parts in pandas that could be sped-up by
| reimplementing them with a compiled language. Pandas is
| plenty fast for the built-in methods, it only gets slow when
| you start interfacing with Python, e.g. by doing an `.apply`
| with your custom Python method. Obviously this interfacing
| part is impossible to speed up by reimplementing parts of
| pandas (you'd need a different API instead).
| mynameis_jeff wrote:
| would give https://github.com/modin-project/modin a shot
| SnooSux wrote:
| It's been done: https://github.com/pola-rs/polars
|
| But I'm sure there's always room for improvement
| rytill wrote:
| It's not like polars is a drop-in replacement, it has a
| totally different API.
| mrtranscendence wrote:
| You wrote "it has a totally different API", did you mean
| "it has an actually sane API?" Because that's what I
| think of when I compare pandas to polars.
| fgh wrote:
| The answer would then be to have a look at polars.
| zmgsabst wrote:
| You'd be awesome if you wrote a library for large image
| processing.
|
| You can make large Numpy arrays fine -- eg, 20k x 20k or 500k x
| 500k, but trying to render that to anything but SVG or manual
| tilings pukes badly.
|
| That's my main blocker on rendering high dimensional shapes:
| you can do the math, but visualizations immediately fall over
| (unless you do tiling yourself).
|
| There's probably someone with a more useful idea than
| "gigapixel rendering" though.
| DeathArrow wrote:
| As I see it, Python is good for glue code and small scripts where
| performance usually doesn't matter. Even if it would be more
| performant, it would be a nightmare for large code bases since
| it's dynamically typed.
|
| I really enjoy Nim which is "slick as Python, fast as C".
| supreme_berry wrote:
| You wouldn't believe how many near-FAANGs have hundreds of
| large backend services in Python without any issues, some
| dating from times when typing was done in docstrings.
| baisq wrote:
| Because they have insane amounts of money that they can throw
| at the machines.
| bjourne wrote:
| I once had a database-backed website serving 50k unique
| visitors/day written in Django and hosted on a low-budget
| vps. Worked like a charm with very few hiccups.
| CraigJPerry wrote:
| I was curious so i had a bash at comparing the cost of just
| buying another server to throw at the problem vs telling a
| FAANG dev to optimise the code.
|
| A dedicated 40core / 6Tb server is around $2k but will be
| amortized over the years of its life. It needs power,
| cooling, someone to install it in a rack, someone to
| recycle it afterwards, ..., around $175/yr
|
| A FAANG dev varies wildly but $400k seems fair-ish (given
| how many have TC > 750k).
|
| So that's about 12 hours of time optimising the code vs
| throwing another 40c / 6Tb machine at the problem for 365
| days.
|
| The big cost i'm missing out of both the server and the
| developer is the building they work in. What's the recharge
| for a desk at a FAANG, $150k/yr ? I have no idea how much a
| rack slot works out at.
|
| Unless i've screwed up the figures anywhere, we should
| probably all be looking at replacing Python with Ruby if we
| can squeeze more developer productivity!
| SuaveSteve wrote:
| Why not switch to making __slots__ in classes the default and
| then making attribute changes to an object during runtime an opt-
| in? It will require a long grace period but wouldn't it help
| optimisation efforts immensely?
| BurningFrog wrote:
| Where can I read about what kind of performance improvements
| `__slots__` brings?
| WillDaSilva wrote:
| The Python docs themselves is a good place to start:
| https://docs.python.org/3/reference/datamodel.html#slots
|
| The Python wiki also has some good info about it:
| https://wiki.python.org/moin/UsingSlots
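|
| The short version, as a toy example (slotted classes use less
| memory and attribute access is typically a bit faster):
|
|     class Point:
|         __slots__ = ("x", "y")   # fixed layout, no per-instance __dict__
|
|         def __init__(self, x, y):
|             self.x = x
|             self.y = y
|
|     p = Point(1, 2)
|     p.z = 3   # AttributeError: no such slot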
| gjulianm wrote:
| That's going to require quite a lot of changes, it's a giant
| breaking change. All classes would need someone to go around
| finding all the attributes that are created and adding a
| __slots__ declaration, to avoid regular attribute initialization
| in __init__ failing. It's a massive task, and it would
| completely break backwards compatibility for performance gains
| that not everybody will need.
| anamax wrote:
| default __slots__ breaks a lot of monkey patching.
|
| An "easier" change would be to add a class attribute
| "no__dict__", which says that the __dict__ attribute can't be
| used, which lets the implementation do whatever it wants. That
| can be incrementally added to classes.
|
| Another option is a "no__getattr__" attribute, which disables
| getattr and friends.
| yedpodtrzitko wrote:
| That would mean all installed dependencies need to comply with
| this change as well, which is unlikely to happen in any
| realistic timeframe.
| [deleted]
| g42gregory wrote:
| 15 years ago I remember reading Guido van Rossum saying that
| Python is a connector language and if you need performance, just
| drop into C and write/use a C module. I thought it was crazy at
| the time, but now I see that he was absolutely right. It took a
| while, but now Python has a high-performing C module for pretty
| much every task.
| chrisseaton wrote:
| But these don't compose right? Each is a black-box to each
| other? A black-box add and a black-box multiply don't fuse.
| kylebarron wrote:
| They can! Numpy exposes a C API to other Python programs [0].
| It's not hard to write a Cython library that uses the Numpy C
| API directly and does not cross into Python [1].
|
| [0]: https://numpy.org/doc/stable/reference/c-api/index.html
|
| [1]: https://github.com/kylebarron/pymartini/blob/4774549ffa2
| 051c...
| chrisseaton wrote:
| So they can if you use their specific API? It doesn't
| naturally compose in conventional Python code?
| dekhn wrote:
| C and Python are not black boxes to each other. The entire
| python interpreter is literally a C API. You can create
| pyobjects, add heterogenous PyObjects to PyLists, etc. So
| everything in Python can be introspected from C.
|
| Turned around, Python has arbitrary access into the C
| programming space (really, the UNIX or Windows process it's
| running inside), so long as it has access to headers or other
| type info it can see C with more than black box info.
|
| Most python numerics is implemented in numpy; the low levels of
| numpy are actually (or were) effectively a C API implementing
| multi-dimensional arrays, with a python wrapper.
| klyrs wrote:
| You're talking past chrisseaton's point here. If you want
| two C extensions to interoperate with bare-metal
| performance, you can't just do
|
|     from lib1 import makedata
|     from lib2 import processdata
|     data = makedata()
|     print(processdata(data))
|
| Because makedata needs to provide a c->py bridge and
| processdata needs a py->c bridge, so your process
| inherently has python in the middle unless lib2 has
| intimate knowledge of lib1. It can absolutely be done (I've
| written plenty of c extensions that handle numpy arrays,
| for example) but if somebody hasn't done the work, you
| don't get it for free. If your c extension expects a list
| of lists of floats, the numpy array totally supports that
| interface... but (last I checked) the overhead there is way
| slower than calling list(map(list, data)) and throwing that
| into your numpy-naive c extension.
| chrisseaton wrote:
| > C and Python are not black boxes to each other
|
| Yes they are - the CPython interpreter knows _nothing_ about
| what your C extension does. It can't optimise it because all
| it has is your machine code - no higher level logic.
| n8ta wrote:
| Print doesn't have to be re-resolved on every access... Not sure
| about python but many interpreters do a resolution pass that
| matches declarations and usages (and decides where data lives,
| stack, heap, virtual register, whatever)
| SnowflakeOnIce wrote:
| In Python semantics, indeed, 'print' does need to be looked up
| each time!
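|
| A small illustration of why: any code can rebind the name
| between two calls, so the lookup can't be skipped.
|
|     import builtins
|
|     print("one")   # resolves to the builtin
|     print = lambda *a: builtins.print("intercepted:", *a)
|     print("two")   # now resolves to the module global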
| dataflow wrote:
| > Python can quickly check to see if they are using the dynamic
| features
|
| I don't understand how this is supposed to be "quickly"
| verifiable?
|
| Nothing prevents you from doing eval('gl' + 'obals')()['len'] =
| ...; how is the interpreter supposed to quickly check that this
| isn't the case when you're calling a function that might not even
| be in the current module?
|
| Doing this correctly would seem to require a ton of static
| analysis on the source or bytecode that I imagine will at _best_
| be slow, and at worst impossible due to the halting problem.
| [deleted]
| kmod wrote:
| Python dictionaries now have version counters that track how
| many times they were modified, so the quick check is to ask
| "was len not overidden last time and is the number of
| modifications to the globals the same as it was last time".
| gpderetta wrote:
| One possibility is to move the cost to the assignment, so the
| code that assigns a new value to the global 'len' function is
| going to track and invalidate all cached lookups. Hopefully you
| are changing the binding of 'len' less often than you are
| calling it :)
| kmod wrote:
| Cinder does this (invalidation), and both Faster CPython and
| Pyston use guarding.
| gpderetta wrote:
| Right, of course, guarded devirtualization is a common
| technique.
| [deleted]
| bootwoot wrote:
| I was reading this as an undetailed description of state
| available WITHIN the interpreter. Probably there is a table of
| globals that you can simply check last modification on or
| something like this. Whether you hit it with eval or some other
| tricky code, you can't modify a global without the interpreter
| knowing about it.
| dataflow wrote:
| If that's what they mean, how would that be any faster than
| what's going on right now? I thought normally when you hit a
| callable, the interpreter would just look up its name, check
| to see if it's a built-in, and then call the built-in if
| so... whereas in this case you'd still have to look up the
| name of the callable (is the idea to bypass this somehow?
| what do they do currently?), check to see if it's different
| than the built-in you'd _expect_ from the name (i.e. if it's
| ever been reassigned to), then call that expected built-in if
| it's not... which seems like the same thing? At best it would
| seem to convert 1 indirect call to a direct call, which would
| be negligible for something like Python. Is the current
| implementation somehow much slower than I'm imagining? What
| am I missing?
| the-lazy-guy wrote:
| You could do something like a primitive inline cache. Store a
| "version" of the globals in another variable. Each time the
| globals are modified, bump the version. For each call site,
| keep what the global name resolved to, plus the version of
| the globals object, in a static variable. Now you can avoid
| name resolution if the version hasn't changed between two
| executions of the line. On the fast path you just pay the
| price of a single (easily predicted, because globals almost
| never change) compare-and-jump vs a full hash-table lookup.
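|
| In made-up Python pseudocode, the per-call-site state looks
| roughly like this (the real thing lives in CPython's C code):
|
|     cached_target = None
|     cached_version = -1
|
|     def call_global_len(arg, globals_dict, globals_version):
|         global cached_target, cached_version
|         if globals_version != cached_version:       # slow path: re-resolve
|             cached_target = globals_dict.get("len", len)
|             cached_version = globals_version
|         return cached_target(arg)                   # fast path: direct call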
| dataflow wrote:
| I think the core of the optimization you're mentioning
| hinges on a normal lookup being a slow hashtable lookup
| (of a string?)... whereas I imagined the first thing the
| interpreter would do would be to intern each name and
| assign it a unique ID (as soon as during parsing, say)
| and use that thereafter whenever they're not forced to
| use a string (like with globals()). That integer could
| literally be a global integer index into a table of
| interned strings, so you could either avoid hashing
| entirely (if the table isn't too big) or reduce it to
| hashing an int, both of which are much faster than
| hashing a string. Do they not do that already? Any idea
| why? I feel like that's the real optimization you'd need
| if checking a key in a hashtable is the slow part (and
| it's independent of whether the value is being modified).
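|
| (For what it's worth, the names do get interned: LOAD_GLOBAL
| carries an integer index into the code object's co_names tuple.
| As far as I can tell, though, that index is only used to fetch
| the interned string, which is then looked up in the globals and
| builtins dicts; strings cache their hash, so the cost is the
| probe and the two-dict fallback, not the hashing. The disassembly
| below is roughly what CPython 3.10 prints; newer versions emit
| different opcodes.)
|
|     import dis
|
|     def f(xs):
|         return len(xs)
|
|     print(f.__code__.co_names)   # ('len',), the per-code-object name table
|     dis.dis(f)
|     # roughly:
|     #   2   0 LOAD_GLOBAL      0 (len)
|     #       2 LOAD_FAST        0 (xs)
|     #       4 CALL_FUNCTION    1
|     #       6 RETURN_VALUE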
| blagie wrote:
| I don't think the world is quite so bad.
|
| x86 processors solve this by speculating about what's going on.
| If you suddenly run into a 1976-era operation, everything slows
| down dramatically for a bit (but still goes faster than an
| 8086). If you have a branch or cache miss, things slow down a
| little bit.
|
| One has a few possibilities:
|
| - A static analysis /proves/ something. print is print. You
| optimize a lot.
|
| - A static analysis /suggests/ something. print is print,
| unless redefined in an eval. You just need to go into a slow
| path in operations like `eval`, so if print is modified, you
| invalidate the static analysis.
|
| - A static or dynamic analysis suggests something
| probabilistically. You can make the fast path fast, and the
| slow path eventually work. If print isn't print, you raise an
| internal exception, do some recovery, and get back to it.
|
| I'm also okay with this analysis being run in prod and not in
| dev.
|
| As a footnote, JITs, especially in Java, show that this kind of
| analysis can be pretty fast. You don't need it to work 100% of
| the time. The case of a variable being redefined in a dozen
| places, you just ignore. The case where I call a function from
| three places which increments an integer each time, I can find
| with hardly any overhead at all. The latter tends to be where
| most of the bottlenecks are.
| chrisseaton wrote:
| > I don't understand how this is supposed to be "quickly"
| verifiable?
|
| You don't verify, and instead you run assuming no verification
| is needed. Then if someone wants to violate that assumption,
| it's their problem to stop everyone who may have made that
| assumption, and to ask them to not make it going forward.
|
| You shift the cost to the person who's doing the
| metaprogramming and keep it free for everyone who isn't.
|
| https://chrisseaton.com/truffleruby/deoptimizing/
| marcosdumay wrote:
| Hum... You are getting lost in theoretical undecidability.
|
| In the real world, when faced with a generally undecidable
| problem, we don't run away and lose all hope. We decide the
| cases that can be decided, and do something safe when they
| can't be decided.
|
| In your example, Python can just re-optimize everything after
| an eval. That doesn't stop it from running optimized code if
| the eval does not happen. It can go even further and only re-
| optimize things that the eval touched, which has some extra
| benefits and costs, so it may or may not be better.
|
| Besides, when there isn't an eval on the code, the interpreter
| can just ignore anything about it.
| dataflow wrote:
| > You are getting lost on theoretical undecidability. [...]
| We decide the cases that can be decided, and do something
| safe when they can't be decided.
|
| I'm not lost on that at all; I'm well aware of that. That's
| precisely why I wrote
|
| >> [...] require _a ton of static analysis_ on the source or
| bytecode that I imagine will _at best_ be slow, and _at
| worst_ impossible due to the halting problem
|
| and not
|
| >> static analysis is impossible in the general case so we
| run away and lose all hope.
|
| I'm not sure how you read that sentiment from my comment.
| marcosdumay wrote:
| Hum... Ok. Then the answer is that most cases do not demand
| as much analysis time as you expect, and the ones that
| demand more can still gain something from dynamic behavior
| analysis in a JIT.
|
| Also, you can combine the two to get something better than
| any single analysis alone.
| peatmoss wrote:
| During Perl's hegemony as The Glue Language, I feel like the folk
| wisdom was:
|
| "Performance is a virtue; if Perl ceases to be good enough, or
| you need to write 'serious' software, rewrite in C."
|
| And during Python's ascension, the common narrative shifted very
| slightly:
|
| "Performance is a virtue, but developer productivity is a virtue
| too. Plus, you can drop to C to write performance critical
| portions."
|
| Then for our brief all-consuming affair with Ruby, the wisdom
| shifted more radically:
|
| "Developer productivity is paramount. Any language that delivers
| computational performance is suspect from a developer
| productivity standpoint."
|
| But looking at "high-level" languages (i.e. languages that
| provide developer productivity enhancing abstraction), we can
| rewind the clock to look at language families that evolved during
| more resource-constrained times.
|
| Those languages, the lisps, schemes, smalltalks, etc. are now
| really, really fast compared to Python, and rarely require
| developers to shift to alternative paradigms (e.g. dropping to C)
| just to deliver acceptable performance.
|
| Perl and Python exploded right at the time that Lisp/Scheme
| hadn't quite shaken the myth that they were slow, with
| Python/Perl achieving acceptable performance by having dropped to
| C most of the time.
|
| Now the adoption moat is the wealth of libraries that exist for
| Python--and it's a hell of a big moat. If I were a billionaire,
| I'd hire a team of software developers to systematically review
| libraries that were exemplars in various languages, and write /
| improve idiomatic, performant, stylistically consistent versions
| in something modern like Racket. I'd like to imagine that someone
| would use those things :-)
| zdw wrote:
| This sounds a lot like what some Python package developers are
| trying with Rust (example being the cryptography package),
| which also has the unfortunate side effect of limiting support
| for some less popular platforms.
| edflsafoiewq wrote:
| Perl/Python/Ruby grew up in the 90s, the "Bubble economy" of
| the single-core performance world, the likes of which had never
| been seen before and probably never will be again on the face of
| the Earth.
| In the post-Bubble world, throwing out 90% of your performance
| before you even start writing code, especially when the same
| dynamic features could be delivered via JIT without the cost,
| seems crazy.
| rockyj wrote:
| So true, excellent point! I just do not understand startups
| choosing Python/Ruby in 2022 when you can get most of the
| features, type safety, concurrency, async and 5 times more
| speed in other languages.
| WJW wrote:
| I don't think it is such a surprise. The ecosystems around
| Rails (for Ruby) and numpy/pandas/etc (for python) are
| orders of magnitude larger than you get in the modern
| languages. In Rails for example, adding an entire user
| management system (including niceties like password reset
| mails and must-haves like proper security for obscure
| vulnerabilities most people will have never heard of) is
| literally a single extra line in the gemfile and two
| console commands. In python the ML and numerics ecosystem
| are completely beyond anything another language has to
| offer at the moment, even more so when you compare the time
| to get started.
|
| In addition, "real" performance is often tricky to measure
| and may be irrelevant compared to other parts of the
| system. Yes, Ruby is 10-100x slower than C. But if a user
| of my web service already has a latency of (say) 200ms to
| the server then it barely matters if the web service
| returns a response in 5 ms or in 0.5 ms. Similarly for
| rendering an email: no user will notice their email
| arriving half a second earlier. Similarly for a python
| notebook: if it takes 1 or 2 seconds to prepare some data
| for a GPU processing job that will take several hours, it
| doesn't really matter that the data preparation could have
| been done in 0.1 seconds instead if it had been done in
| Rust.
|
| _Especially_ for startups, where often you're not sure if
| you're building the right thing in the first place, a big
| ecosystem of prebuilt libraries is super important. If it
| turns out people actually want to buy what you've made in
| sufficient numbers that the inefficiency of
| Ruby/Python/JS/etc becomes a problem then you can always
| rewrite the most CPU intensive parts in another language.
| Most startup code will never have the problem of "too many
| users" though, so it makes no sense to optimize for that
| from the start.
| ReflectedImage wrote:
| Well, if you choose Python/Ruby you only need a third of the
| developers you would need with another language.
|
| The productivity gain is so great it outweighs everything
| else. It's as simple as that.
| peatmoss wrote:
| Is it gauche to offer my own counterpoint?
|
| Another possibility is that the requirement to "drop to C" is a
| virtue by de-democratizing access to serious performance. In
| other words, let the commoners eat Python, while the anointed
| manage their own memory.
|
| I personally find this argument a bit distasteful / disagree
| with it, but there was a thread the other day that talked about
| the, uh, variable quality of code in the Julia ecosystem (Julia
| being another language where dropping to C isn't important for
| performance). In Julia, the academics can just write their code
| and get on with their work--the horror!
| munificent wrote:
| _> Those languages, the lisps, schemes, smalltalks, etc._
|
| The main reason those languages got fast despite being highly
| dynamic is because of _very_ complex JIT VM implementations.
| (See also: JavaScript.)
|
| The cost of that is that a complex VM is much less hackable and
| makes it harder to evolve the language. (See also: JavaScript.)
|
| Python and Ruby have, I think, reasonably chosen to have slower
| simpler implementations so that they are able to nimbly respond
| to user needs and evolve the language without needing massive
| funding from giant corporations in order to support an
| implementation. (See also: JavaScript.)
|
| There are other effects at play, too, of course.
|
| Once your implementation's strategy for speed is "drop to C and
| use FFI", then it gets much harder to optimize the core
| language with stuff like a JIT and inlining because the FFI
| system itself gets in the way. Not having an FFI for JS on the
| web essentially forced JavaScript users to push to make the
| core language itself faster.
| peatmoss wrote:
| Spending a weekend or two writing a Scheme that beats Python
| in performance has been a pastime for computer science
| students for at least a couple decades now. I'm not sure that
| I believe that a performant Scheme implementation has more
| complexity than e.g. PyPy. In fact, I'd wager the converse.
| mrtranscendence wrote:
| You're either exaggerating or the computer science students
| you're familiar with are wizards. I've never known the
| student who could write a Scheme implementation, from
| scratch, in one weekend that is both complete and which
| beats Python from a performance perspective.
| peatmoss wrote:
| If it's an exaggeration, it's not much of one.
|
| Two parts to your argument:
|
| - Writing a Scheme implementation quickly: Google "Write
| a Scheme in 48 hours" and "Scheme from scratch." 48 hours
| to a functioning Scheme implementation seems to be a feat
| replicated in multiple programming languages.
|
| - Performance: I haven't benchmarked every hobby Scheme,
| but given the proliferation of Scheme implementations
| that, despite limited developer resources, beat (pure)
| Python with its massive pool of developers (CPython,
| PyPy), I still don't buy the idea that optimizing Scheme
| is a harder task than optimizing Python. Again, I'd
| strongly suggest that optimizing Scheme is a much easier
| task than optimizing Python simply by virtue of how often
| the feat has been accomplished.
| eatonphil wrote:
| I would not include PyPy in a list of easy to beat
| implementations.
| JulianWasTaken wrote:
| Nor ones with massive pools of developers.
| peatmoss wrote:
| Compared to most Scheme implementations?
| mrtranscendence wrote:
| If you can give me an implementation that implements
| almost all of R5RS, in 48 hours, beating Python in
| performance, and all by a single developer, I'll tip my
| hat to that guy or gal. But I can't imagine it's too
| commonly done.
| eatonphil wrote:
| Nobody said you could write a full Scheme implementation in 48
| hours or two weeks. That's very much beside the point about how
| poor CPython's performance is.
| eatonphil wrote:
| Substitute "computer science student" with "developer" and
| it holds for me. Definitely some CS students can do it
| too. Actually at my school we did have to implement a
| Scheme compiler. So yeah it's not too big of a stretch to
| say.
|
| I think people who haven't implemented a language
| underestimate how slow CPython is. And overestimate how
| hard it is to build a compiler for a dynamic language.
|
| I think every professional developer or CS student can
| and should build a compiler for a dynamic language!
| mrtranscendence wrote:
| But the claim was that a student could write a conformant
| Scheme implementation in 48 hours that beats Python.
| Clearly it's possible for a student to write a Scheme
| that's faster than Python, but is it a reasonably
| _complete_ Scheme done in a single weekend?
|
| Even I, very much a non-computer scientist, could write a
| fast Scheme quickly if I could keep myself to a very
| small subset, so that's not interesting to me.
| eatonphil wrote:
| Conformant is a word you introduced, they didn't say
| that.
| munificent wrote:
| Sure, but that's because Python has objects.
|
| If you write an object system on top of your performant
| hobby Scheme implementation, you'll likely find that the
| performance of its method dispatch is about as slow as it
| is in Python. Probably even slower.
|
| Purely procedural Python code isn't as slow as object-
| oriented Python code.
| peatmoss wrote:
| That's fair, but also the fact that we're comparing hobby
| Scheme implementations to two mainstream, extremely
| popular implementations of Python, and setting up
| conditions that force (hobby) Scheme to play to Python's
| relative strengths is telling. :-)
|
| The Python ecosystem has certainly received a lot of
| developer resources and attention the past couple of
| decades. Shall we compare the performance of CLOS on
| SBCL, which again has seen comparatively few developer
| resources, to Python's performance in dealing with
| objects? I'd take that performance wager.
| Spivak wrote:
| This isn't as much of a gotcha as you think. Python is
| slow because the language is so dynamic and simply has to
| do more behind-the-scenes work on each line. It's not
| impressive that a language that does less is faster.
| What's impressive is that a language that does _more_,
| like JS on V8, is faster.
| CraigJPerry wrote:
| Is CLOS doing less than Python?
|
| I'm thinking CLOS has more dynamism than Python - they're
| both dynamically typed, they're both doing a lookup then
| dispatch, but then CLOS adds dynamism on top of that,
| it's also looking in the metadata thingy (I'm not a Lisp
| developer, do they call it the hash? I mean the key-value
| store on every "atom"; I'm so out of my depth
| here, is atom the right word?), plus if I remember right,
| the way CLOS works you use multiple dispatch, not just
| single dispatch like Python.
| igouy wrote:
| > Python and Ruby have, I think, reasonably chosen to have
| slower simpler implementations...
|
| ?
|
| https://shopify.engineering/yjit-just-in-time-compiler-cruby
| munificent wrote:
| Yes, CRuby is slowly moving towards a JIT now because
| performance is a major blocker for user adoption.
|
| The larger Python ecosystem has tried that a number of
| times too (Unladen Swallow, PyPy, etc.)
|
| It's quite difficult since both of those languages already
| lean heavily on C FFI and having frequent hops in and out
| of FFI code tends to make it harder to get the JIT fast.
| JITs work best when all of the code is in the host language
| and can be optimized and inlined together.
| eatonphil wrote:
| JavaScript the language seems to have evolved much more than
| Python despite CPython's very simple implementation.
| munificent wrote:
| Hence my point about "massive funding from giant
| corporations in order to support an implementation". :)
| eatonphil wrote:
| Well, almost all of the JavaScript language innovation was
| syntactic sugar and was implemented as transforms before the
| browsers implemented it. I think JavaScript devs would mostly
| have been fine to keep using transforms indefinitely, and it's
| just been more convenient that the browsers have moved to
| implement it.
|
| Python could have done this easily too but evolving as a
| language just isn't as big a priority (not that I'm
| saying it should be) and that's completely (or mostly)
| disconnected from their backend implementation decisions.
| boringg wrote:
| What does "Modern" python even mean?
| digisign wrote:
| Focuses on 3.8+, but 3.7 has another year of life in it.
| didip wrote:
| If you are building server-side applications on Python 3's
| async APIs and you aren't using
| https://github.com/MagicStack/uvloop, you are missing out on
| performance big time.
|
| Also, if you happen to build microservices, don't forget to try
| PyPy; that's another easy performance booster (if it's compatible
| with your app).
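|
| Switching is about two lines; a minimal sketch (assumes `pip
| install uvloop`; the echo handler and addresses are made up for
| the example):
|
|     import asyncio
|     import uvloop
|
|     async def handle(reader, writer):
|         data = await reader.read(100)   # trivial echo handler
|         writer.write(data)
|         await writer.drain()
|         writer.close()
|
|     async def main():
|         server = await asyncio.start_server(handle, '127.0.0.1', 8888)
|         async with server:
|             await server.serve_forever()
|
|     uvloop.install()    # use uvloop's event loop policy instead of asyncio's
|     asyncio.run(main())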
| mrslave wrote:
| > if it's compatible to your app
|
| Every time I experiment with PyPy (on a set of non-trivial web
| services) I encounter at least one incompatibility with PyPy in
| the dependency tree and leave disappointed.
| [deleted]
| s_Hogg wrote:
| Great read; it vaguely reminds me that someone or other was trying
| to get CPython going with Cosmopolitan libc. Wonder what that would
| do for speed.
| make3 wrote:
| Why do you think this would help performance? A quick read
| says Cosmopolitan is slower than regular libc. Maybe it would be
| more portable, but not faster.
___________________________________________________________________
(page generated 2022-05-05 23:00 UTC)