[HN Gopher] Python-based compiler achieves orders-of-magnitude s...
___________________________________________________________________
Python-based compiler achieves orders-of-magnitude speedups
Author : Stratoscope
Score : 213 points
Date : 2023-03-15 07:59 UTC (15 hours ago)
(HTM) web link (news.mit.edu)
(TXT) w3m dump (news.mit.edu)
| ar9av wrote:
| There are other python implementations, like pypy, which includes
| a JIT (just-in-time compiler). There are also JITs that run on
| official python (cpython), like numba (not all code can be
| optimized, but that's fine if you only need to optimize your hot
| code path).
|
| You can use a superset language of python called cython that
| generates C code. It can be used to generate C bindings or fast
| python (for cpython) modules implemented in python-like code.
|
| You can use a really fast language like C, Rust, C++,... create
| python wrappers with cython, swig, Boost.python, cffi,... and use
| python as glue code.
|
| Python is not as fast a language as others, but there are tricks
| to make fast programs.
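As a rough illustration of the "only optimize your hot code path" idea, here is a minimal sketch using numba's `njit` decorator. The function name `sum_of_squares` is made up for this example, and the `ImportError` fallback is an assumption added so the snippet still runs (uncompiled) where numba is not installed.

```python
try:
    from numba import njit  # third-party; JIT-compiles the decorated function
except ImportError:
    def njit(func):
        """Fallback (assumption for this sketch): run the plain Python function."""
        return func


@njit
def sum_of_squares(n):
    # A tight numeric loop: the kind of hot path a JIT handles well.
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(10))
```

Only the decorated function is compiled; the rest of the program keeps running on regular CPython.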
| DeathArrow wrote:
| Can you use Django with those optimisations or are they good
| mainly for scientific computing?
| acdha wrote:
| I had little trouble switching a Django app years ago but the
| results were mixed. Some complicated views and reports saw a
| hefty win but most of the app was database-limited and
| optimized to the point that there was no meaningful
| difference, except that PyPy used more RAM.
| jstx1 wrote:
| Have a look at Cinder -
| https://github.com/facebookincubator/cinder - it's Meta's
| performance oriented fork of CPython that they use to run
| Instagram (which is a big Django app).
| robertlagrant wrote:
| I always wondered with Cinder why they didn't turbocharge
| PyPy development instead.
| lozenge wrote:
| With a codebase of any significant size the priority is
| always to maintain compatibility while improving
| performance.
|
| If you start with an incompatible, highly performant
| interpreter, the compatibility "distance" is difficult to
| measure and could create unknown performance cost. For
| example, PyPy doesn't support C modules due to the
| differing memory layout.
| doix wrote:
| At my old gig we ran Django and FastAPI with pypy. I don't
| remember there being too many issues. One thing is that pypy
| versions lag behind official python, so if you're using
| bleeding edge stuff, it won't be supported in pypy yet.
|
| That was a couple of years ago at this point, and I've not
| been in the python ecosystem since then, but I can only
| imagine things are getting better in that regard rather than
| worse.
| rpep wrote:
| You can certainly use it, but whether you see any benefit is
| going to strongly depend on your workload. If you're doing
| significant calculations in the API then it might be
| considerably more performant, but if your API is primarily
| retrieving things from the database and transforming it to
| JSON then you're going to be limited mostly by the database
| latency and so I wouldn't expect major improvements.
| doix wrote:
| If you are fetching lots (not even 'big data', but a few
| thousand rows) of data using the Django ORM, you will see a
| performance difference when using pypy, or at least I did a
| few years ago. The database can happily return a few
| thousand rows very quickly, especially if you take care to
| optimize your queries and have good indexes.
|
| Converting a few thousand rows to python/django objects
| takes _time_. I can't quantify anything, because it's been
| too long, but I remember it being fairly significant. When
| I profiled it, the majority of the time was spent calling
| __setattr__ a few million times.
|
| Like you said, it depends on your use case. If your queries
| are slow, then optimize your database queries. But if your
| queries are fast and your responses are still slow, then
| investigating pypy is definitely worth it. You can also
| play around with .values_list or something in Django, so
| that you get 'raw values' instead of objects (but there's
| still a cost to building them up).
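The per-row cost described above can be approximated without Django at all. The sketch below is not the Django ORM: `Row`, `FIELDS`, and `RAW_ROWS` are hypothetical stand-ins, and it only compares building one object per row with `setattr` against keeping the rows as plain tuples (roughly the shape `values_list` hands back).

```python
import timeit

FIELDS = ("id", "name", "email", "created")
RAW_ROWS = [(i, f"user{i}", f"user{i}@example.com", "2023-03-15") for i in range(5000)]


class Row:
    """Hypothetical stand-in for an ORM model instance (not Django's Model)."""


def as_objects(rows):
    """Build one object per row, setting each field with setattr."""
    objects = []
    for values in rows:
        obj = Row()
        for field, value in zip(FIELDS, values):
            setattr(obj, field, value)
        objects.append(obj)
    return objects


def as_tuples(rows):
    """Keep the rows as plain tuples, with no per-field work."""
    return list(rows)


object_time = timeit.timeit(lambda: as_objects(RAW_ROWS), number=20)
tuple_time = timeit.timeit(lambda: as_tuples(RAW_ROWS), number=20)
print(f"objects: {object_time:.3f}s  tuples: {tuple_time:.3f}s")
```

The absolute numbers depend on the machine and interpreter, so none are quoted here; the point is only that the object path does work per field while the tuple path does not.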
| coldtea wrote:
| And yet the same supposedly "io bound" workloads (like
| "parse request, fetch something from DB, return it as
| JSON") still have wildly different performance
| characteristics in some languages vs others, with 10x to
| 100x requests handled per second...
| acdha wrote:
| 2x, yes. If you're seeing 100x you're comparing different
| things like creating and serializing complex objects
| versus simple types or using a JSON parser which loads an
| entire document into objects versus one which only
| retrieves specific values.
| coldtea wrote:
| Well, perhaps not 100x in rps, but 10x sure:
|
| Overall top performing frameworks (JS, Java, and Rust) at
| 650K rps. That's 7x over the top Python based framework
| at: 86K rps.
|
| And another very popular Python framework (flask) gets
| just to 2K rps. That's 325 times worse than the best.
|
| And that's the "single DB query" benchmark:
| https://www.techempower.com/benchmarks/#section=data-r21&tes...
|
| https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Proj...
| jstarfish wrote:
| Pypy is great but I didn't find it very useful with Django.
|
| Quick, transactional HTTP exchanges (GET, POST, etc.) aren't
| really its thing-- there's no time for the compiler to get
| warmed up; the request is complete before pypy has gotten out
| of bed.
|
| But if you have to do really complex view rendering (graphs
| or something) where it would take cpython ~10s or more to
| process, then pypy will leave cpython in the dust.
| afdbcreid wrote:
| I've never used PyPy, but shouldn't it warm after the Nth
| request even if you don't have loops?
| a-dub wrote:
| numba seems to hit the sweet spot for most numerics for me. it
| can get a little annoying with type inference issues but
| overall it seems the most concise and least hassle for moving
| loops into optimized machine code.
| adammarples wrote:
| One great move I've discovered recently is to simply type
| annotate the python thoroughly and use mypyc to build a c
| package
| wcdolphin wrote:
| Would you mind sharing more details of your experience using
| MypyC? What domain are you working in, and what kind of
| effort and speedups did you see?
| FreakLegion wrote:
| Here's a detailed write-up from one of the Black
| maintainers:
| https://ichard26.github.io/blog/2022/05/compiling-black-with...
| That's on the low side of the speedup curve, in my
| experience.
| math_dandy wrote:
| Is it Pythonic enough to be compatible with tooling, e.g., the VS
| Code Python extension?
| t43562 wrote:
| So the differences:
|
| https://docs.exaloop.io/codon/general/differences
|
| So more limited types (integers) and more type checking and
| collections have to have one kind of thing in them.
|
| There are other python compilers though, like
| https://github.com/Nuitka/Nuitka
|
| I wonder really what the advantages/disadvantages of these are?
| SloopJon wrote:
| Another big difference: "Codon is licensed under the Business
| Source License (BSL), which means its source code is publicly
| available and it's free for non-production use. ... each
| version of Codon converts to an actual open source license
| (specifically, Apache) after 3 years."
|
| https://docs.exaloop.io/codon/general/faq
| UncleEntity wrote:
| Compilers traditionally don't taint users' code with their own
| license; it seems this one does.
| ptx wrote:
| _"While Codon's syntax and semantics are virtually identical
| to Python's, [...] Codon currently uses ASCII strings unlike
| Python's unicode strings."_
|
| So aside from that tiny issue at the center of the decade-long
| Python 2 to 3 migration debacle, it's virtually identical!
| geysersam wrote:
| It also uses 64-bit ints instead of the arbitrary-size ints in
| CPython.
|
| Sounds like an issue that could easily bite someone in the
| behind and cause quite nasty bugs.
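To make the risk concrete, here is a small sketch that emulates a signed 64-bit integer in plain CPython. The helper `wrap_int64` is hypothetical, and treating overflow as two's-complement wraparound is an assumption made for illustration; the thread does not say how Codon actually behaves on overflow.

```python
def wrap_int64(n: int) -> int:
    """Reduce n to the value a signed 64-bit two's-complement integer would hold."""
    n &= (1 << 64) - 1                 # keep only the low 64 bits
    return n - (1 << 64) if n >= (1 << 63) else n


largest = 2**63 - 1                    # largest signed 64-bit value

# CPython ints grow as needed, so adding 1 just gives 2**63.
print(largest + 1)

# Under the wraparound assumption, the same sum flips to the most negative value.
print(wrap_int64(largest + 1))
```

Code that silently relies on big ints (factorials, large Fibonacci numbers, hash arithmetic) is where a difference like this would tend to show up.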
| scythe wrote:
| >a few of Python's dynamic features are disallowed. For
| example, [...] or _adding objects of different types to a
| collection._
|
| The fine print is strong with this one. It makes me wonder why
| they didn't just start with RPython.
| anentropic wrote:
| From what I've heard Nuitka is a true compiler for actual
| Python (not a "Python-like" language) but does not give so much
| speed up
| DeathArrow wrote:
| >"Google users in America have searched for Python more often
| than for Kim Kardashian."
|
| I wonder what a Kim Kardashian programming language would look
| like. I guess low level but with a garbage collector. :D
| behnamoh wrote:
| Kardashians are famous for... being famous.
| Retric wrote:
| That and wealth. There's plenty of billionaires living
| privately, but many people find rich people who spend
| extravagantly interesting.
| localhost wrote:
| Since I have the world's greatest creative assistant handy,
| here's what GPT-4 thinks a Kim Kardashian language would look
| like:
|
| Hey dolls, let me introduce you to the Kimmie programming
| language, it's like totally fab and easy to use!
|
| To declare a variable, just use the hashtag symbol and the
| variable name, like this:
|
| #my_var
|
| To assign a value to the variable, use the word "like" followed
| by the value, like this:
|
| #my_var like 10
|
| To print out a message, use the word "OMG" followed by the
| message in double quotes, like this:
|
| OMG "Hello, dolls!"
|
| To add two variables together, use the word "add" followed by
| the two variables, like this:
|
| #var1 like 5
| #var2 like 7
| #sum like add #var1 #var2
| alfor wrote:
| Here is the TrashTalk interpreter:
|
| import re
|
| def trashtalk_interpreter(code):
|     variables = {}
|     for line in code.split("\n"):
|         line = line.strip()
|         if line.startswith("#"):
|             var_name, _, value = line.partition(" like ")
|             # store names without the leading "#" so {name} works below
|             var_name = var_name.lstrip("#").strip()
|             if value.startswith("add "):
|                 # operands look like "#var1 #var2"
|                 var1, var2 = (v.lstrip("#") for v in value[4:].split())
|                 variables[var_name] = variables[var1] + variables[var2]
|             else:
|                 # a bare "#name" declares the variable without a value
|                 variables[var_name] = int(value) if value else None
|         elif line.startswith("OMG"):
|             message = re.findall(r'"(.*?)"', line)
|             if message:
|                 print(message[0].format(**variables))
|
| # Sample code
| code = '''
| #my_var like 10
| OMG "Value of my_var: {my_var}"
| #var1 like 5
| #var2 like 7
| #sum like add #var1 #var2
| OMG "Sum of {var1} and {var2} is {sum}"
| '''
|
| trashtalk_interpreter(code)
| bqmjjx0kac wrote:
| Is this an original joke/impression by GPT-4? Either way,
| this is hilarious.
| localhost wrote:
| I copied the response verbatim. I also did a follow-up
| where it wrote the interpreter as well:
| class KimmieInterpreter:
|     def __init__(self):
|         self.variables = {}
|
|     def interpret(self, code):
|         for line in code.split("\n"):
|             tokens = line.split()
|             if len(tokens) == 0:
|                 continue
|             if tokens[0].startswith("#"):
|                 name = tokens[0][1:]
|                 if len(tokens) == 1:
|                     # "#x" declares a variable
|                     self.variables[name] = None
|                 elif tokens[2] == "add":
|                     # "#sum like add #a #b"
|                     var1 = self.variables[tokens[3][1:]]
|                     var2 = self.variables[tokens[4][1:]]
|                     self.variables[name] = var1 + var2
|                 else:
|                     # "#x like 10"
|                     self.variables[name] = int(tokens[2])
|             elif tokens[0] == "OMG":
|                 # print the text between the double quotes
|                 print(line.split('"')[1])
| a2800276 wrote:
| It wouldn't waste time garbage collecting to focus on trash
| talking instead.
| rjmill wrote:
| Introducing "Trashtalk" a Smalltalk dialect without GC!
|
| This language isn't here to make friends.
| lr1970 wrote:
| > I wonder what a Kim Kardashian programming language would look
| like. I guess low level but with a garbage collector. :D
|
| And exposed naked primitives ... :-)
| pungentcomment wrote:
| It's all garbage collector.
| toss1 wrote:
| Seems more like a scaled-up garbage producer than a
| garbage collector...
| spokeonawheel wrote:
| do people actually talk about the kardashians still? I thought
| that was like 3 years ago
| sigmoid10 wrote:
| The article states that this was around 2017-2018.
| ac130kz wrote:
| Quite sad to know that the dynamic nature of Python is preventing
| the speedups in the first place. I really hope there'll be a
| built-in optimizing JIT compiler without the limitations of PyPy,
| Codon, Nuitka, Numba, etc.
| tlarkworthy wrote:
| JavaScript/lua are dynamic and they are fast. It's other
| choices (GIL) which cause problems rather than the nature of the
| dynamic language space
| tomn wrote:
| The GIL improves single-threaded performance compared to
| other options.
| ac130kz wrote:
| I'd argue that JavaScript and Lua are much simpler underneath
| the hood, there's only a handful of types to be aware of,
| hence easier to make a JIT for.
| sirwhinesalot wrote:
| Surprised there is no comparison to MyPyC. That said the
| availability of a "JIT" compiler in the style of Numba but with
| much broader Python feature support sounds great to me.
| tekknolagi wrote:
| You might enjoy Cinder then. It's based on CPython so it is
| nearly 100% compatible.
|
| https://github.com/facebookincubator/cinder/
|
| Disclaimer: I used to work on it.
| adammarples wrote:
| I took their fib example and ran it in mypyc too out of
| curiosity. I got speedups of ~10x rather than codon's 100x.
| Still pretty good, I like mypyc.
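For anyone who wants to repeat this kind of experiment, here is a sketch of a fully annotated recursive Fibonacci. It is a typical version of the function, not necessarily the exact code from the Codon benchmark. Saved as `fib.py` (a file name chosen for this example), it can be compiled with `mypyc fib.py` and then imported as usual; the same file also runs unmodified on plain CPython.

```python
def fib(n: int) -> int:
    """Naive recursive Fibonacci; deliberately slow so call overhead dominates."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


print(fib(20))
```

Because the compiled module keeps Python's int semantics, results match the interpreted version even for large inputs.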
| sirwhinesalot wrote:
| mypyc keeps Python's "BigIntegers", unicode string
| implementation, reference counting, and has little to no
| floating point-related optimizations yet. It prioritizes
| compatibility over overall performance, so I'm not surprised.
| I was also disappointed at how poor mypyc is at compiling
| across multiple files, but that they can fix at some point.
|
| The BigInteger "issue" pretty much makes something like
| Fibonacci a worst case scenario for it.
| ptx wrote:
| Looks like they added support for unboxed floats yesterday
| (not sure about integers):
| https://github.com/python/mypy/commit/d05974b9b099ec755fd1c6...
| Labo333 wrote:
| I don't understand those benchmarks: why is there no comparison
| to numba? Also comparisons where C++ is beaten don't seem very
| realistic.
| physPop wrote:
| Second post on this in two days? Had a show HN yesterday. We
| appreciate the info but don't plug an incomplete product so hard,
| especially one that doesn't really give other options a fair
| assessment on their website. Eg. no mention/discussion of nuitka,
| jax, etc.
| DeathArrow wrote:
| It's not the first time someone has tried to compile Python:
| https://en.wikipedia.org/wiki/IronPython
|
| I hope this time we will see better results.
| smcl wrote:
| I'd say Nuitka or Cython are maybe the more common ones when
| talking about this. IronPython is/was interesting in that
| instead of compiling to python bytecode or machine code it
| targeted the .NET CLR, and iirc I saw some kind of JIT going
| on when I was digging around (so _some_ things ended up as
| machine code), but it's not really one of the usual "compiles
| python to machine code" implementations.
| paro_nej wrote:
| > Faster than the speed of C
|
| One must note that this is impossible, unless you have chosen to
| handicap the C-implementations while benchmarking. Borderline
| unethical IMO to put forth such a claim.
| dbrueck wrote:
| Pretty much every JIT-enabled language has this as at least a
| theoretical advantage over C.
|
| So not impossible and therefore not unethical.
| paro_nej wrote:
| They aren't JIT. They're doing AoT AFAICT.
| dbrueck wrote:
| I must have misunderstood what you were objecting to then,
| my bad. What claim are they making that is so impossible
| that it borders on being unethical?
|
| I mentioned JIT because it seems to be based on a similar
| principle at least, that of optimizing things on the
| programmer's behalf by looking at the program's usage and
| not just by looking at how to speed up the code generally.
| paro_nej wrote:
| My claim is that there is no additional information that
| Python provides as opposed to C that would make it
| faster. And hence, the only conclusion I have is either
| they have supercharged their compiler for that particular
| benchmark OR they have chosen to handicap C as one can
| express the computation in C that emits the same assembly
| that they lowered to and hence my point on handicapping
| the C benchmark.
| dbrueck wrote:
| > My claim is that there is no additional information
| that Python provides as opposed to C that would make it
| faster
|
| Ok, but that's not what they are claiming - their claim
| (at least based on what the article is saying) is more
| about one toolchain vs another, i.e. "if you use our
| compiler (that takes python code as input) then the
| resulting executable will run as fast as (or possibly faster
| than) programs created by all the popular compilers (that
| take C/C++ code as input)." The sales pitch is that
| they've got magic sauce in their compiler, and you get to
| use Python as well.
| polotics wrote:
| previously: https://news.ycombinator.com/item?id=33908576
| [deleted]
| robomartin wrote:
| > Python -- which is typically orders of magnitude slower than
| languages like C
|
| Not to nit-pick...this has been characterized by a team who
| tested and compared a large set of languages against a wide range
| of application code. The number is, if I remember correctly,
| about 78x slower. I don't think "orders" of magnitude is entirely
| fair. Yes, Python is slow. I have made the mistake of trying to
| use it for time-critical embedded applications. Never again.
|
| Aside from this admittedly pedantic observation, the first thing
| that crossed my mind with regards to this tool --which sounds
| fantastic-- is that you would have to trust the correctness and
| reliability of your code to this translation layer. Not sure how
| to think about this other than to keep a mental note of it if
| using this tool.
| rerx wrote:
| 78x is roughly two orders of magnitude in typical physics
| parlance. If you take a more CSy stance and count powers of
| two, it would be six to seven orders of magnitude. Sounds
| entirely fair to me.
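The two ways of counting can be checked with a couple of logarithms. This is just the arithmetic behind the comment above; the factor 78 is the figure quoted upthread, and the variable names are made up for the example.

```python
import math

slowdown = 78  # the "about 78x slower" factor quoted upthread

decimal_orders = math.log10(slowdown)  # powers of ten
binary_orders = math.log2(slowdown)    # powers of two

print(f"log10(78) = {decimal_orders:.2f}")
print(f"log2(78)  = {binary_orders:.2f}")
```

Rounded, that is a bit under two powers of ten and a bit over six powers of two, which matches the "roughly two" and "six to seven" figures in the comment.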
| robomartin wrote:
| In the article they say most speed-ups are in the 5x to 10x
| range. The paper shows this to be true, particularly when
| compared to PyPy.
|
| In other words, the acceleration isn't measured against raw C
| implementations (where the 78x factor I quoted would be
| relevant). It is measured against Python or PyPy.
|
| How much faster does Codon make your Python code? The answer
| seems to be somewhere around the 5x to 10x range.
|
| In that context, and in the context of actual applications
| rather than hand-picked tests (how much can we optimize a
| loop), "orders of magnitude" seems to be an exaggeration.
|
| BTW, MIT does this kind of thing all the time with their
| press releases. They have a brand to support with outlandish
| claims about everything that comes out of there. Those with
| frequent exposure to this kind of press release are wise to
| this. I've seen it for decades. It's marketing.
|
| For me, when someone says "orders of magnitude" it means
| "massive". I tend to say "10 times faster", "50 times faster"
| even "100 times faster". I probably start using "orders of
| magnitude" faster at 1000x or when I am trying to explicitly
| make an impression on a mathematically-challenged audience.
| "Orders of magnitude" sounds great to that crowd.
|
| I have never, in 40 years in CS/Engineering, heard anyone use
| powers-of-two when they say "orders of magnitude". Doing so
| would open you to serious misinterpretation. Engineers might
| say something like "a factor of 2 to the n" or something like
| that.
| wageslave99 wrote:
| I have no idea about compilers, so bear with me with this
| question: Can't we have a faster compiler for a subset of Python?
|
| I mean AFAIK the hard part of Python is that the language allows
| dynamic overwriting of attributes (or something like that). Is
| that feature actually needed for projects like Django, FastAPI,
| numpy, etc?
|
| Maybe I'm wrong, but the main idea I'd like to ask is, can we
| make a compiler for a subset of that language with C-API
| compatibility?
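For readers unsure what "dynamic overwriting of attributes" looks like in practice, here is a minimal sketch. The `Greeter` class is hypothetical; the point is that a method can be rebound while the program runs, so a compiler cannot assume a call site always reaches the same function.

```python
class Greeter:
    def greet(self):
        return "hello"


g = Greeter()
before = g.greet()

# Rebind the method on the class at runtime; existing instances see the change.
Greeter.greet = lambda self: "goodbye"
after = g.greet()

# Attributes can also be added to a single instance on the fly.
g.volume = 11

print(before, after, g.volume)
```

A compiler for a restricted subset could simply forbid these rebinding steps, which is the trade-off the question is asking about.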
| dagw wrote:
| _Can't we have a faster compiler for a subset of Python?_
|
| Check out Pythran, that is exactly what they've done.
| wageslave99 wrote:
| Thanks! Sounds interesting!
| https://pythran.readthedocs.io/en/latest/
| crabbone wrote:
| > Can't we have a faster compiler for a subset of Python?
|
| That's exactly how PyPy works.
| thunky wrote:
| PyPy isn't really a subset of Python. From pypy.org:
| PyPy is a Python interpreter, a drop-in replacement for
| CPython 2.7, 3.8 and 3.9
| acdha wrote:
| Yes - I think they meant RPython, which the PyPy team
| developed explicitly as the easy to optimize safer subset
| of Python.
| CJefferson wrote:
| The problem turns out to be that it's maintaining the C-API
| compatibility which is the main thing which makes it hard to
| make Python fast, not the other stuff -- Javascript has most of
| the nasty things Python does, and it's plenty fast on browsers.
|
| However, maintaining C-API compatibility means you need to set
| up lots of data structures exactly how the C API requires, and
| maintaining and updating those ends up losing you lots of your
| benefits of JITing.
|
| You could, hypothetically, introduce an entirely new API, which
| allowed for faster dynamic recompiling, but then you'd need to
| get every package anyone cares about to switch to that.
| vlovich123 wrote:
| I really try hard to understand this argument and I must be
| missing something and must be super stupid. Don't languages
| like JavaScript have this and yet they can still do JIT and
| the base runtime is still in C++? Java itself has an official
| way to invoke C programs from Java applications and still has
| a JIT. And Java also has AOT compilers.
|
| Sure. Crossing that FFI boundary is going to be expensive.
| But there's lots of techniques to mitigate it or even in the
| limit eliminate it. if I recall correctly you can JIT a fast
| call that knows how to invoke the FFI directly without the
| extra indirection layer. Basically a fancy runtime LTO.
|
| I think a huge part of it is CPython's interest in keeping
| the core codebase as simple as possible which seems to be the
| overriding reason for why the global lock still hasn't been
| removed (which iirc even Ruby pulled off at some point). Also
| the reason there's no JIT afaict and why Pypy got started to
| prove it is possible to JIT (and frequently sees substantial
| gains vs cpython). The problem they've had is that CPython is
| a moving target and it's hard to keep a parallel runtime up
| to date on a shoestring amount of funding. That's why you see
| alternate approaches like numba (JIT'ed Python) which are
| less of a departure and Cinder (better budget). To me this
| seems more like a CPython project actively hostile to JIT than C
| data structures meaning you lose some benefit to FFI
| overhead. Performance is a virtuous cycle too - when there's
| enthusiasm about a language you get more and more people paid
| to make your language fast. For a while companies tried.
| Google gave up. Facebook only has it as a fork with a public
| plea for the maintainers of CPython to mainline literally
| anything.
|
| The CPython maintainers feel like the biggest obstacle. No?
| pjmlp wrote:
| The difference is that the other languages FFI don't expose
| internals like CPython does.
|
| For example, JNI only exposes handles and you need to
| convert a handle to a pointer, so the runtime knows for
| the time being that handle is special and being used by
| native code.
|
| When it is only an opaque handle, lots of optimizations can
| happen and the native code won't see them.
| vlovich123 wrote:
| Doesn't PyPy accomplish it via CPyExt? It sounds like
| Cinder, Instagram's version is CPython+JIT (among other
| things). I haven't looked at the details so maybe it's
| not a sufficient speed up and that's why all these
| parallel efforts haven't been merged? The part I'm
| missing is how what you said makes it intractable when we
| have counter examples within and without the language.
| Sure. Maybe some optimizations aren't possible. But
| that's a world of difference from little to no benefit
| and impossible.
|
| Don't get me wrong. I'm not passing a value judgement on
| the maintainers. But the reasons don't feel technical to
| me.
| kaba0 wrote:
| That's why I found GraalVM's approach intriguing. They
| provide a high level language API where you can simply write
| a language interpreter, and it will be able to JIT/AOT
| compile it down to fast machine code. But the most
| interesting aspect is that the IR they convert languages to
| is basically universal, so something like LLVM bitcode can
| also use the exact same representations.
|
| So you can interpret (and later AOT compile as well) LLVM
| bitcode and python, and this approach will allow _cross-
| language_ optimizations as well, which were not available at
| all before. But feel free to add a bit of JS /Java, etc to
| your code as well!
| rustybolt wrote:
| Sorry for the ignorance, but what do you mean by C-API here?
|
| Normally I'd say you mean the interface you use when you call
| native machine code from Python, but I don't see how this
| would slow things down.
| radicalbyte wrote:
| The interface to the C language. It is what makes Python
| fast - you write the code which needs to be fast in
| optimised C and call into it with Python. The Python code
| is then basically just the glue.
| pdpi wrote:
| That native code still needs to be able to interact with
| your Python objects somehow. You can't change the API
| around PyObject without forcing all C libraries to make
| changes on their side, and that API forces you to expose
| things a certain way.
| klooney wrote:
| This is Pypy and Rpython!
| KyeRussell wrote:
| Django makes use of a whole lot of Python's fancypants stuff.
| For this reason, for instance, mypy doesn't do well on Django
| projects without a purpose-built plugin. But I still take your
| point.
| LtWorf wrote:
| pydantic also requires a mypy plugin... But it's just how it
| was designed. I designed typedload with mypy in mind, so it
| kinda works (except for some limitations in the type system
| that don't allow expressing some things, as of now).
| formerly_proven wrote:
| Most web stuff, but Django especially, relies heavily on
| dynamic Python magic to make cuter APIs.
| hgomersall wrote:
| That would be things like cython (https://cython.org/) and
| rpython (https://rpython.readthedocs.io/en/latest/).
| LtWorf wrote:
| > Is that feature actually needed for projects like Django,
| FastAPI, numpy, etc?
|
| Yes.
|
| FastAPI depends on really slow pydantic (disclaimer: I'm the
| author of the faster typedload).
|
| All those dynamic typechecking modules rely on the dynamic
| nature of the language. The alternative would be having to
| generate code at compile time instead.
|
| pydantic is also in the process of being rewritten in rust to
| be not so slow any longer, and in the process it will become
| incompatible with anything other than cpython (the normal python
| runtime). Which in turns means fastapi won't be able to run on
| anything else (unless they decouple from pydantic... which
| probably won't be easy).
| BiteCode_dev wrote:
| Since this is highly incompatible with most of the python ecosystem
| right now, may I plug nuitka?
|
| https://nuitka.net/index.html
|
| It's a compiler for python code that can create standalone
| executables that run at up to 4 times the speed of the initial code.
|
| Best of all, it's extremely reliable, with a high level of
| support of even the tricky things like the scientific and gui
| stacks.
| plonk wrote:
| Seconded, I deploy large packages with gigabytes of deep
| learning and GIS dependencies in single executables with Nuitka
| and it works very well. Also handles including data files into
| the executable if needed.
| WhyCause wrote:
| Out of curiosity, are the GIS dependencies the proprietary
| ones ( _cough_ ESRI _cough_ )?
| plonk wrote:
| Please no, not ArcGIS. I'm sure our dependencies would
| clash with arcpy's if we tried. Fortunately my company uses
| FOSS packages almost exclusively.
| anakaine wrote:
| Having been down a similar path, this whole thing works so
| much better if you don't 'import arcpy'. Licencing issues
| aside, you've often got faster tools in shapely, fiona,
| geopandas, rasterio, xarray.
| flowersjeff wrote:
| Third'ly?... Nuitka is amazing. Simple as that.
| anakaine wrote:
| We do similar. Works well.
| yardshop wrote:
| > high level of support of even the tricky things like the
| scientific and gui stacks
|
| Could it compile an app that uses Pillow and AggDraw and
| ReportLab and OpenPyXL with a TKInter GUI into a standalone app
| I can give to a coworker? That would be extremely useful!
| powersnail wrote:
| I've used Nuitka to package an app that involves Pillow,
| mupdf, PyQt, and several other libraries, and it handles them
| with no problem.
| yardshop wrote:
| That's very encouraging, I will give it a go!
|
| And I have to ask, does a powersnail live in a powershell?
| =) (also a language I like that needs better GUI and EXE
| packaging)
| plonk wrote:
| I don't know half of these but I'm almost sure that Pillow
| and TKInter would work.
| yardshop wrote:
| AggDraw is an "anti-grain geometry" graphics library that
| works with Pillow for drawing high quality images.
| ReportLab is a very big PDF generating library, and
| OpenPyXL reads and writes XLSX format Excel spreadsheets. I
| use these in many of my work-related apps. Tkinter is the
| big question for me because it involves a lot of behind-
| the-scenes files. Thanks for your comment, I will give
| Nuitka a try!
| anigbrowl wrote:
| Sold
| bouchard wrote:
| I wonder how it compares to taichi-lang which also came out of
| MIT CSAIL and doesn't suffer from this "Business Source Licence"
| nonsense.
| ginko wrote:
| nit: 'Python-based' would imply to me that it's written in
| Python, but it looks like it's mostly C++ & LLVM:
|
| https://github.com/exaloop/codon/tree/develop/codon
| capableweb wrote:
| Github reports it's 55% C++ and 43% Python, not too bad if it's
| correct.
| ginko wrote:
| I would expect a large chunk of the 43% Python to be tests.
| scott_s wrote:
| A fair implication, but they mean "Python-based" in that the
| language the compiler implements is based on Python.
| benj111 wrote:
| They should have said python compiler (shorter) or python
| compiler in c++ (more accurate and only one character longer,
| including spaces).
|
| Considering at least 2 people have gone to look at the source
| and then come here to comment, it would have been a net
| benefit for all involved. Plus, what does it say about the
| potential quality of your compiler if you can't even make
| correct English statements? This seems easier to get right
| than if( x = *p++ )
| daquisu wrote:
| > This seems easier to get right than if( x = *p++ )
|
| For people with native or fluent English, for sure. For the
| others, probably not.
| benj111 wrote:
| What level of English fluency should we expect of
| professors at MIT and writers for their site?
| xapata wrote:
| _The Shaft_, a Georgia Tech periodical (similar to _The
| Onion_ in spirit), interviewed a local associate
| professor: "How can I be expected to teach math if my
| students don't speak basic Mandarin?"
| chc wrote:
| 1. It currently only compiles a subset of Python, which is
| presumably why they said it was based on Python rather than
| Python.
|
| 2. There are lots of good developers who aren't capable of
| making any statements in English.
| jbylund wrote:
| These headlines usually work by the dept asking a
| researcher for a 10 sentence summary of their work. Someone
| in the dept summarizes that to 3 sentences, and sends that
| to the university pr dept. The university turns that 3
| sentences into 1 and that's what's published. My take is
| this says more about the weird game of telephone being
| played than it does about the research product itself.
| nightpool wrote:
| Yes, but it's a game of telephone that does make it
| annoying for HN readers to try and actually understand
| what's being presented, so it should be discussed and
| corrected if possible.
| xdavidliu wrote:
| true, but if that's the case, then the game of telephone
| would not call (haha) into question the quality of the
| compiler, since the devs were not responsible for the
| game.
| pgt wrote:
| Hardly a nitpick. It's key to the claim.
| chc wrote:
| It's a nit-pick because it's ultimately just a gripe about
| ambiguous phrasing, not because the implications of what the
| phrasing means are unimportant.
| nerdponx wrote:
| It's a little sad because PyPy literally is written in (a
| restricted subset of) Python, hence the name.
| v3ss0n wrote:
|     And it is severely underrated, even though the performance gain
|     averages around 4x-20x. Used in production and memory
|     usage is also about 1/6th of CPython. Can get 10x
|     performance easily in many cases.
| ddorian43 wrote:
| > Used in production and memory usage is also about 1/6th
| of CPython
|
| I thought it would have higher memory usage? (based only
| on reading)
| chc wrote:
| Somewhat paradoxically, PyPy always uses more memory for
| programs with small working sets, but can use less memory
| for programs with large working sets. 1/6 is a lot more
| extreme than I would have expected, though.
| smolder wrote:
| I suspect their relative memory efficiency depends on the
| size of the program and the size of the data it's
| processing.
| avgcorrection wrote:
| What does "literally" buy you here?
| avinassh wrote:
| I experimented with SQLite, trying to insert many rows in
|     under a minute. I ran my script with PyPy, with zero
|     changes, and it was 4x faster!
|
| code here: https://github.com/avinassh/fast-sqlite3-inserts
|
| my blog post: https://avi.im/blag/2021/fast-sqlite-inserts/
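|
|     A minimal sketch of the kind of bulk-insert loop being timed
|     here. It is not the script from the repository above: the
|     `users` schema, the row values, and the `insert_rows` helper
|     are all assumptions made for illustration. It uses only the
|     standard-library sqlite3 module, so the same file should run
|     unchanged under CPython or PyPy.

```python
import sqlite3
import time


def insert_rows(n):
    # In-memory database so the sketch leaves no file behind.
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, chosen only for this example.
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, area TEXT, age INTEGER, active INTEGER)"
    )
    # Lazily generated rows: (area code, age, active flag).
    rows = ((f"{i % 1000000:06d}", 5 * (1 + i % 3), i % 2) for i in range(n))
    start = time.perf_counter()
    with conn:  # commit the whole batch as a single transaction
        conn.executemany(
            "INSERT INTO users (area, age, active) VALUES (?, ?, ?)", rows
        )
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    conn.close()
    return count, elapsed


count, elapsed = insert_rows(100_000)
print(f"inserted {count} rows in {elapsed:.3f}s")
```

|     Timings depend on the machine and the interpreter, so none
|     are shown; running the file once per interpreter gives a
|     rough comparison.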
| _gtly wrote:
| Paper here: "Codon: A Compiler for High-Performance Pythonic
| Applications and DSLs":
| https://dl.acm.org/doi/pdf/10.1145/3578360.3580275
|
| "Currently, there are several Python features that Codon does not
| support. They mainly consist of runtime polymorphism, runtime
| reflection and type manipulation (e.g., dynamic method table
| modification, dynamic addition of class members, metaclasses, and
| class decorators). There are also gaps in the standard Python
| library coverage. While Codon ships with Python interoperability
| as a workaround to some of these limitations, future work is
| planned to expand the amount of Pythonic code immediately
| compatible with the framework by adding features such as runtime
| polymorphism and by implementing better interoperability with the
| existing Python libraries. Finally, we plan to increase the
| standard library coverage, as well as extend syntax
| configurability for custom DSLs."
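|
| For readers unsure what those categories look like in practice,
| here is a small sketch in ordinary CPython. The class and function
| names are made up for illustration, and whether Codon rejects each
| individual line has not been checked here; the snippet only shows
| the kind of run-time behaviour the quoted paragraph lists.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


# Dynamic addition of a class member after the class is defined.
Point.dimensions = 2


# Dynamic method table modification: attach a new method at run time.
def norm1(self):
    return abs(self.x) + abs(self.y)


Point.norm1 = norm1

# Runtime reflection: look a method up by a name built at run time.
method_name = "norm" + "1"
p = Point(3, -4)
result = getattr(p, method_name)()


# A metaclass that records the name of every class created with it.
class Registry(type):
    classes = []

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.classes.append(name)
        return cls


class Plugin(metaclass=Registry):
    pass


# A class decorator that rewrites the class it receives.
def add_repr(cls):
    cls.__repr__ = lambda self: f"{cls.__name__}({vars(self)})"
    return cls


@add_repr
class Config:
    def __init__(self):
        self.debug = True


print(result, Point.dimensions, Registry.classes, Config())
```

| Each of these changes what a name means while the program runs,
| which is what makes them hard for an ahead-of-time compiler to
| resolve statically.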
| rightbyte wrote:
|   Yeah, well, if you remove the dynamic features of a dynamic
|   language it gets fast. It would be really impressive if they
|   can achieve those features with the sameish speed.
| [deleted]
| brucethemoose2 wrote:
|     I don't necessarily need all that dynamism though, and would
|     happily use a Python subset that removed some stuff (and
|     forced type hinting) in exchange for better compilation.
|
|     Yes, there are already subsets like this, but it's not as
|     helpful if it isn't standard.
| BiteCode_dev wrote:
|       Depends on the work you have to do.
|
|       If you code a website, FastAPI and Django, the two most
|       popular frameworks for doing so, rely heavily on those
|       dynamic features to make you productive.
| bravura wrote:
| If you code a website, your speed issues probably come
| from the database layer. Not your Python.
| FridgeSeal wrote:
| What are you doing to your database? Lol
|
| Most databases I've dealt with will happily outstrip
| Python for a good chunk of the common queries.
| DougMerritt wrote:
| Measuring database speed is ultimately I/O bound,
| measuring a language's speed is typically CPU-bound.
| [deleted]
| echelon wrote:
| Starlark comes to mind, but that's probably too limited.
| msla wrote:
| I wonder what experienced Common Lisp compiler devs could
| accomplish if they turned their attention to Python.
| LispSporks22 wrote:
| There is an implementation of Python in Common Lisp
| https://clpython.common-lisp.dev/ but I think you probably
| mean some more lower-level thing
| msla wrote:
| I do: I mean some people with experience making Common Lisp
| implementations (SBCL, maybe) getting an idea and
| implementing Python with the same basic concepts they used
| to implement a Common Lisp compiler.
| Mikhail_K wrote:
| https://julialang.org/benchmarks/
| _a_a_a_ wrote:
| I'm sure this is a great project worthy of HN front page
| reporting, but if it can't run Python generally, it ain't Python.
| jwmoz wrote:
| Failed at the first hurdle:
|
| main.py:15:1: error: syntax error, unexpected 'async'
| netbioserror wrote:
| Preface: I don't just want to crap on Python here and sell Nim. I
| like Python, and still use it.
|
| But it still shocks me just how much money and manpower is thrown
| at trying to bikeshed and optimize and compile Python and its
| libraries, while the Nim compiler is essentially a community
| hobby project that has made the concept of a "compiled Python" a
| reality already. The orders of magnitude in scale difference, and
| the qualities of the output products, are staggering.
|
| I'm kind of starting to see what Guido is talking about when he
| says Python is a legacy language that's probably on its way out.
| Even in the interpreted world, languages like Janet and other
| newcomers are performing fascinating experiments, often doing
| more with less.
| ActorNightly wrote:
| >I'm kind of starting to see what Guido is talking about when
| he says Python is a legacy language that's probably on its way
| out. Even in the interpreted world, languages like Janet and
| other newcomers are performing fascinating experiments, often
| doing more with less.
|
| Wow, what a way to mischaracterize what Guido said.
|
| His point was about languages evolving to be more abstract than
| Python or any of the ones you mentioned. Programming is going
| to become more and more abstract to the point where you will be
| able to program in natural language through speech. In the mean
| time, we still have to write code manually.
|
|   And look, there are plenty of valid criticisms of Python, but
|   you are kidding yourself if you don't think it's going to be one
|   of the primary languages of the future. There is a reason why
|   it has the 2nd most GitHub repos (behind JS, because of the hard
|   dependency on it for web stuff).
|
|   And the simple reason is this: the vast, vast majority of
|   applications don't need the fastest possible speed; it's much
|   more important to be able to develop fast and have it be
|   right. It's easier and cheaper to throw another EC2 instance
|   into your stack than to pay a developer to write stuff from
|   scratch, whereas in Python you can just import the relevant
|   library for your needs and be up and running much faster,
|   thanks not only to the short, concise syntax but also to the
|   introspection that its interpreted nature allows.
| And this allowed the snowball effect to happen, where
| developers could quickly write relevant libraries, which in
| turn allowed other developers to quickly import those libraries
| and write their libraries, slingshotting Python into a language
| that is used not only for bleeding edge ML stuff but to run
| backend web stacks with no issues.
|
|   And in the cases where you do need speed, this is where these
|   compilers come in, and it's a 100% valid use of manpower and
|   money. Think of it as another library.
|
| Every other language that focuses on things like static typing,
| whatever type of inheritance the designers think is best,
| memory safety, and all the other theoretical CS stuff
| completely misses the above point, and for that reason alone,
| it will never become mainstream. Rust is not going to happen,
| Nim is not going to happen, Julia is not going to happen, Scala
| is not going to happen, Elixir is not going to happen. Sure,
| there will be a significant amount of code written in those,
|   but the popularity will never come close to Python's. You may
| not like it, but you know this is true.
|
|   We have already seen this cycle happen with Haskell, where
|   functional programming was the next best thing. You would
|   constantly see posts about it at the same frequency you now see
|   posts about Rust, and look where Haskell is now.
| afdbcreid wrote:
| Rust is different because it is trying to answer real needs.
| It is not going to replace Python, and if you're seriously
| thinking about writing your code in Python you probably
| shouldn't be thinking of writing it in Rust. There will
| likely also be less Rust code than Python code. But it can
| replace C/C++, not completely and not in the near future, but
| it is possible.
| RayVR wrote:
| A lot of effort is dedicated to trying to improve the speed
| because python is so widely used that improving performance
| could have a massive beneficial impact.
|
| Migrating to a new language is not easy when you have millions
| of lines of code.
| dgb23 wrote:
| But is this still really python? This compiler and others are
| not a drop in replacement. They typically cover a narrow
| subset and/or need additional code/hints etc.
|
| You can adopt it incrementally, but then you could just as
| well switch to a language with higher default performance,
| more language features that just work, unified tooling etc.
| and adopt that incrementally?
| netbioserror wrote:
| I didn't say that anyone should switch (see my preface), and
| I perfectly well understand this point. Having to retread
| this point over and over just serves to make our comments
| long and redundant and full of qualifiers, but I guess here
| we are again.
| RayVR wrote:
| You said you don't want to "crap on python" which I never
| said you were. I'm simply pointing out that what you're
| "shocked by" makes complete sense if you look at what can
| be gained by people "bikeshed and optimize and compile
| python"
| prirun wrote:
| > while the Nim compiler is essentially a community hobby
| project that has made the concept of a "compiled Python" a
| reality already.
|
| I checked out Nim a few years ago because I have a large Python
| project that I'd like to move to a compiled language. In my
| experience, if you get on the forums and do any kind of
| comparison between Python and Nim, you will quickly get
| responses of "Nim isn't Python, so quit trying to make it like
| Python".
|
| I think Nim would have been much more successful if it _was_
| more like Python, and if Python compatibility was one of its
| goals. Just as an example, a subrange a..b in Nim is closed on
|   the right, unlike Python, and a..<b is the open Python
| version. They could just as easily have made a..b open and used
| a..=b for the closed interval, for Python compatibility.
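|
|   To make the Python side of that comparison concrete, here is a
|   short sketch of the half-open convention. The values of `a` and
|   `b` are arbitrary, and the Nim behaviour is only described in a
|   comment, taken from the paragraph above rather than tested here.

```python
a, b = 2, 5

# Python's range(a, b) is half-open: it includes a and excludes b.
half_open = list(range(a, b))

# A closed interval (what the comment above says Nim's a..b means)
# needs an explicit b + 1 in Python.
closed = list(range(a, b + 1))

# Slices follow the same half-open rule as range().
letters = "abcdefgh"
sliced = letters[a:b]

print(half_open, closed, sliced)
```

|   Porting loop bounds between the two conventions by hand is a
|   classic source of off-by-one errors, which is the compatibility
|   cost being described.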
|
| I'm not saying Nim had to be the perfect Python compiler and
| compile all Python code unmodified. Based on what I've read,
| Python is too dynamic for this. But in cases where there was
| the choice to either be compatible with Python or "do something
| unique", Nim often takes the unique path, and not always for
| any good reason IMO.
| kmod wrote:
| Disclaimer: developer on Pyston, which could be considered a
| competitor
|
| My concern is: there have been a few projects already that are,
| from the outside, more or less the same approach and set of
| tradeoffs as this. And they haven't been that successful. Given
| that this is treading familiar ground I would expect some words
| about how this is different, and the lack thereof makes me a bit
| skeptical to say this will become successful when others did not.
| apgwoz wrote:
| > and the lack thereof makes me a bit skeptical to say this
| will become successful when others did not.
|
| Success of projects like this is not usually based on merit,
| but on how many people you can convince to go along with it
| until it eventually becomes a thing of its own sustainability.
| So, the recipe here would be to:
|
|   1. Be "good enough" and easy enough to get started such that
|   early adopters have a great first experience speeding up
|   something important to them. (hook them)
|
|   2. Be open and friendly to potential incoming contributors,
|   letting them land changes, have a say in the discussion, and
|   generally be part of it all. (community build)
|
|   3. Encourage people to share their successes and hopes / dreams
|   for how great $X is on their blogs, HN, social media, etc.
|   (propaganda)
|
|   4. Goto 1.
|
| In this case, step 3 will work best by highlighting that "You
| actually don't need most of the dynamic features of Python" as
| the central narrative.
|
|   One big caveat is that Codon chose not to use Python semantics
| for `%` so the basic test of `print(-2 % 5)` fails unless you
| run it with `-numerics=py`... which should just be the default
| behavior -- and a great first community patch / discussion!
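|
|   A sketch of the two remainder conventions at issue, written in
|   plain Python. `floored_mod` and `truncated_mod` are hypothetical
|   helper names, and the assumption that the non-Python default
|   follows C-style truncation is inferred from the comment above,
|   not confirmed against Codon itself.

```python
import math


def floored_mod(a, b):
    # Python's % operator: the result takes the sign of the divisor.
    return a % b


def truncated_mod(a, b):
    # C-style remainder: the result takes the sign of the dividend.
    return int(math.fmod(a, b))


print(floored_mod(-2, 5))    # Python semantics
print(truncated_mod(-2, 5))  # C-style semantics
```

|   Under Python semantics `-2 % 5` is 3, while the truncating
|   remainder of the same operands is -2. The two only disagree when
|   the operands have different signs, which is why the mismatch can
|   go unnoticed for a long time.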
| [deleted]
| mark_l_watson wrote:
| I just tried it and it works well enough for simple algorithmic
| code. I can't get any examples working though that do network
| I/O.
| hgoldfish wrote:
| I did a similar demo project five years ago when I was learning
| compiler theory. And it was implemented in Python.
|
| https://github.com/hgoldfish/aggie
| crabbone wrote:
| Can we please not?
|
| Humanity wasted close to 50 years optimizing compilers for one
| garbage language. Wasted unimaginable efforts, money and
| developer hours... and all could've been avoided if the same
| people dedicated a fraction of those resources to language
| design.
|
| Same thing happened with Java. And now the existence of a well-
| developed compiler became an argument in its own right in favor
| of choosing a bad language.
|
| There's no need or reason to try to make Python run faster. It's
| a trash language. At best, it deserves a credit for being funny
| 30 years ago... but that had worn out pretty fast. Now it's just
| dumb. Improving its compiler will be again a resource sink for
| the programming community that, in the best case, may hope to
| produce something of value _by accident_, independent of its
| main goal...
| fourthark wrote:
| Who is "we"? You don't have to have anything to do with it.
| crabbone wrote:
| We, the programming community. And I'd rather you would have
| nothing to do with it than me.
| [deleted]
| mrtranscendence wrote:
| Such an overwhelming amount of perfectly good solutions are
| created with Python that I have trouble conceiving it as a
| "trash" language. Clearly it's at least _good enough_ in many
| scenarios. For example, I use it for machine learning and data
| science professionally and I find it much more pleasant to use
| than alternatives in that space (e.g. Julia and R -- both have
| advantages over Python but have disadvantages too).
|
| It works, it doesn't confuse me, it's easy to find libraries
| and examples, and when it's too slow -- which is surprisingly
| rare -- I have other options to turn to. If that's trash then
| so be it; call me a raccoon because I'm there for the trash.
| Mashimo wrote:
| What is a non-trash programming language?
| crabbone wrote:
| In general, or specifically in the same niche as Python?
|
| Python has several disjoint domains where it's used. So, for
| example, when it comes to statistics, then J, R or Julia
| would all be better than Python. Not ideal, but still a lot
| better.
|
|     When it comes to infrastructure and ops, then Erlang would've
|     been a lot better. Still not ideal because of how the existing
|     implementation deals with deployment (too complicated), but
|     that's not a feature of the language, and could be worked on
|     in the same way OP wants to work on a Python compiler.
|
| When it comes to Web... well, I'm not a specialist... and I
| find everything about Web revolting, so it's hard for me to
| think about alternatives. On the other hand, there's rarely a
| language that doesn't come with a Web framework / some tools
| that allow it to be used to make Web applications. So, just,
| basically, throw a dart, and wherever it lands it's going to
| be better than Python with a very high probability.
|
| Python is also used to teach intro to computer science. And
| there's a lot of problems with this idea. Firstly, I don't
| believe that intro to CS should be taught by way of learning
| to program. It should give an overview of what CS is about,
| give some foundation, basic concepts from important fields...
| just like intro to math does, for example. But if we still
| have to have intro to CS the way we do today, then Scheme
| would be a lot better for this. Assembly is also a good pick,
| but for a different reason.
|
| Now, to address the "in general" part: I don't believe that
| languages like Python should be used universally in different
| domains. What I believe we need is a language like OMeta,
| which we specialize for the domains we want to program in, so
| that we can keep language and compiler mechanics separately
| from syntax of specific domains. Ironically, this was even
| obvious at the time of ALGOL design, but nobody waited for it
| to be implemented and went with the quick-and-dirty solution
| instead.
| hummus_bae wrote:
| [dead]
| mrkeen wrote:
| > if the same people dedicated a fraction of those resources to
| language design.
|
| What do you mean by language design here? Is it the user-facing
| bits and the ergonomics? Because it seems to me (as a non
| python dev) that that's the bit that python devs really like.
| naijaboiler wrote:
|   Yet it's the language of choice for several people who are
|   domain experts but not programming experts. It's doing a job
|   for non-programmers who have to program. Until a solution
|   exists that does better, Python is going nowhere.
| crabbone wrote:
| There are plenty of things which are bad but universally
| used. What's your point?
|
| Fast food is universally more popular than any healthy food
| that requires time to cook.
|
| People choose to buy low-quality goods in general, trading
| quality for immediate effect all the time. Take any industry,
| any product kind, you will see that consumption is skewed
| towards paying extra for immediate gain rather than paying
| for quality to minimize waste over time.
|
| The fact that you chose to rely on the opinion of non-experts
| in the field to assess the value / quality of a particular
| technology only means that you don't understand what quality
| is about. You are confused between wants and needs.
| ptx wrote:
| "So we thought, let's take Python syntax, semantics, and
| libraries and [...] Codon currently covers a sizable subset of
| Python, it still needs to incorporate several dynamic features
| and expand its Python library coverage. The Codon team is working
| hard to close the gap with Python even further".
| AbsoluteCabbage wrote:
| Python should just have optional compilation built right into the
| language.
|
| Likewise, modern low level languages should have syntactical
| conveniences and optional whitespace.
|
| It's 2023. We don't need to keep having this war.
| messe wrote:
| Scala seems to be taking this route with Scala 3. I'll
| definitely be keeping an eye on it. The JVM is very underrated
| on HN, and I'm far from a Java fanboy (I'm basically a Zig
| evangelist).
| mrkeen wrote:
| > The JVM is very underrated on HN
|
| I don't like it for my use cases, but whenever I read about
| it on HN it's supposedly the best-tooled, finest artifact of
| performance engineering ever built.
| fulafel wrote:
|   I'm guessing you know, but as a reminder to readers, Python
|   does have compilation built in. It works similarly to Java:
|   source code is compiled to a bytecode that is then run by a
|   virtual machine. A difference is that for executing that
|   bytecode, the CPython virtual machine doesn't have JIT native
|   code generation.
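|
|   The compile-to-bytecode step can be observed from inside Python
|   using the built-in compile() and the standard dis module. This
|   is a small sketch: the `add` function and the "<example>"
|   filename are arbitrary, and the exact opcode names vary between
|   CPython versions, so none are listed here.

```python
import dis

source = "def add(a, b):\n    return a + b\n"

# compile() turns source text into a code object holding bytecode.
module_code = compile(source, "<example>", "exec")

# Running the module-level code object defines `add` in the namespace.
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# The function's bytecode is a plain bytes object that the virtual
# machine interprets instruction by instruction.
bytecode = add.__code__.co_code
opnames = [ins.opname for ins in dis.get_instructions(add)]

print(add(2, 3), len(bytecode), opnames)
```

|   Calling `dis.dis(add)` prints the same instructions as a
|   readable listing.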
| misrasaurabh1 wrote:
| This is super cool and useful. I know that Instabase, which also
| came from MIT, got really popular and useful within finance
| communities because they allowed for really fast and efficient
| compute through their own Python DSL. Good to see this as an open
| source project which everyone can now use.
| zmmmmm wrote:
| Am losing count of all these efforts to rescue Python's
| performance - they all seem to amount to the same thing: it's not
| very hard to achieve this if you throw out fundamental aspects
| that make Python what it is. The premise is always that syntax is
| the barrier, and that people struggle so much to learn a new
| syntax that this is what keeps them using Python even though its
| performance is abysmal. But what if this isn't the right
| assumption? What if it's not the syntax but the ecosystem - and
| an ecosystem at that which highly depends on all the dynamic
| features to achieve what it does. Further than that, what about
| the assumption that people who lack confidence, such that they are
| blocked by learning another syntax, are actually up for the
| challenge of understanding all the constraints and limitations of
| something that is like, but not quite, real Python? These
| are exactly the people that you would expect to struggle with it.
| ActorNightly wrote:
| What is this talk about "rescuing" Python performance?
|
|   Python does not need to be rescued. It's fast enough. For 90% of
|   applications, you are kidding yourself if you think you need
|   more speed.
|
| These are enhancements on Python, where you want to run stuff
| even faster on par with other languages.
| switchbak wrote:
|     Its performance and poor support for parallelism have
|     prevented me from using it in places I've wanted to for
|     years.
|
| Is it "fast enough"? fast enough 90% of the time? Or just
| fast enough to leave you uncertain that it's even a good
| choice?
|
| Sorry, we have productive choices now that don't leave me
| worrying about this situation. I still like it though, and if
| they could solve those issues I'd probably use it a lot more!
| mrkeen wrote:
| > What is this talk about "rescuing" Python performance?
|
| Which part don't you like:
|
|     That python is relatively slow?
|     https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
|
| Or that people are trying to fix its slowness? See the thread
| you're currently in.
___________________________________________________________________
(page generated 2023-03-15 23:01 UTC)