[HN Gopher] LPython: Novel, Fast, Retargetable Python Compiler
       ___________________________________________________________________
        
       LPython: Novel, Fast, Retargetable Python Compiler
        
       Author : fgfm
       Score  : 231 points
       Date   : 2023-07-29 02:24 UTC (20 hours ago)
        
 (HTM) web link (lpython.org)
 (TXT) w3m dump (lpython.org)
        
       | xbeuxhedovksb wrote:
       | This looks very cool! There is also MyPyC, which is not in the
       | comparison table, but worth noting.
       | 
       | They have some benchmarks vs regular python here :
       | 
       | https://github.com/mypyc/mypyc-benchmark-results/blob/master...
       | 
       | One difference is that MyPyC compiles your code to a C extension,
       | so you are still dependent on Python. On the other hand you can
       | call regular python libraries with the normal syntax while, in
       | LPython, the "break-out" syntax to regular libraries isn't
       | straightforward
       | 
       | In any case super exciting to see work going into AOT Python
        
         | certik wrote:
         | Awesome, thank you. I knew about mypyc, but forgot. I just put
         | it in:
         | 
         | https://github.com/lcompilers/lpython.org-deploy/pull/37
         | 
         | So now we have 25 compilers there.
         | 
         | Yes, the current syntax to call CPython is low level, you have
         | to create an explicit interface. We can later make it more
         | straightforward, such as using a `@python` decorator to a
         | function, where inside you just do CPython. We always want to
         | make it explicit, since it will be slow, and by default we want
         | the LPython code to always be fast.
        
       | akasakahakada wrote:
       | Always wonder how well are these compilers doing if I am already
       | writing full on SIMD codes instead of crappy for loops?
        
         | xapata wrote:
         | Depends. Does Numba help you?
        
         | rawrawrawrr wrote:
         | You're writing SIMD with Python? That's impressive. How are you
         | doing that?
        
       | reil_convnet wrote:
       | Looks really cool. Build on windows fails for me, are there any
       | downloadable prebuilt binaries to try lpython quickly ?
        
         | certik wrote:
         | Yes, try `conda install -c conda-forge lpython`. It should work
         | on Windows, but it's not as extensively tested as macOS and
         | Linux. If it doesn't work, please report it, and we'll fix it.
         | Once we get to beta, we will support Windows very well.
        
       | wood_spirit wrote:
       | This is really exciting!
       | 
       | Does typing have to be extensive or does the majority of it get
       | inferred with perhaps just function and class boundaries needing
       | annotations?
       | 
       | And if the latter, does the typing get inferred _after_ the
       | initial ssa pass, so a name has a context-specific type?
        
         | certik wrote:
         | See my comment here for all the details regarding implicit
         | typing and why we don't currently do it:
         | https://news.ycombinator.com/item?id=36920963. But we give you
         | nice error messages if types don't match.
        
       | ubj wrote:
       | Looks interesting! Thank you for focusing on AoT compilation and
       | not just JIT compilation. To be honest, I'm sick of JIT
       | compilation. In theory it seems like the best of both worlds, but
       | in practice it turns out to be the worst of both worlds,
       | especially for larger projects.
       | 
       | Will LPython have the ability to generate AoT compiled libraries
       | in addition to executables?
        
         | certik wrote:
         | > Looks interesting! Thank you for focusing on AoT compilation
         | and not just JIT compilation. To be honest, I'm sick of JIT
         | compilation. In theory it seems like the best of both worlds,
         | but in practice it turns out to be the worst of both worlds,
         | especially for larger projects.
         | 
         | Indeed, I think for large projects you want:
         | 
         | * generate a binary
         | 
         | * fast compilation in Debug mode
         | 
         | * fast runtime in Release mode
         | 
         | Sometimes you want JIT, so we support it too, but I personally
         | don't use it, I write the main program in LPython and just
         | compile it to a binary.
         | 
         | > Will LPython have the ability to generate AoT compiled
         | libraries in addition to executables?
         | 
         | Yes, you can do it today. Just compile to `.o` file and create
         | a library. We use this library feature in production at my
         | company today. If you run into issues, just ask us, we'll help.
        
           | ubj wrote:
           | Fantastic, looking forward to trying this out!
           | 
           | And good point about JIT--it does have its place. What I
           | should have said is that I wish there were better options
           | with high quality support for _both_ AoT and JIT. Most of the
           | options I'm aware of have good support for one, but poor or
           | no support for the other. I'm curious to see how well LPython
           | bridges the gap.
        
             | certik wrote:
             | Definitely let us know any feedback once you try it. You
             | can open an issue on the LPython GitHub.
        
       | sheepscreek wrote:
       | This is way too similar to Mojo (language) by Modular. At least
       | at some level conceptually. All in a good way.
       | 
       | I think - the performance gains are coming from the Python syntax
       | being transliterated to an LFortran intermediate representation
       | (much like Numba converts code to LLVM IR).
       | 
       | Any calls to CPython libraries are made using a special
       | decorator, which might be doing interop using the CPython API. My
       | guess is, this will come with a performance penalty. More so if
       | you're using Numpy or Scipy, as you'll be going through several
       | layers of abstractions and hand-offs.
       | 
       | This is because Numba, Pythran and JAX (in a way) get around this
       | by reimplementing a subset of NumPy/SciPy/other core libraries.
       | Any call to a supported function is dynamically rerouted to the
       | native reimplementation during JIT/AOT compilation.
       | 
       | I'd be interested in seeing how far LPython can tolerate regular
       | Python code, with a ton of CPython interop and class use.
       | 
       | In any case, glad to see more competition. Not to take anything
       | from the authors - this is a massive effort on their part and
       | achieves some impressive results. Mojo has VC money behind it -
       | AFAIK, this is a pure volunteer driven effort, and I'm grateful
       | to the authors for doing it!
        
         | ubj wrote:
         | Well, a couple key differences from Mojo are 1) LPython is open
         | source (BSD 3-clause), and 2) LPython is available to download
         | and use locally today. Mojo is still only available on private
         | Modular notebooks.
         | 
         | Granted, I'm optimistic for Mojo's potential but I do wish I
         | could run it locally. Modular's pricing model also remains to
         | be seen.
        
         | certik wrote:
         | Thank you for the encouragement. I can answer / clarify your
         | comments:
         | 
         | > This is way too similar to Mojo (language) by Modular. At
         | least at some level conceptually. All in a good way.
         | 
         | Yes, the main difference is that Mojo is (or will be) a strict
         | superset of Python, while we are a strict subset (but you can
         | call the rest of Python via a decorator).
         | 
         | > I think - the performance gains are coming from the Python
         | syntax being transliterated to an LFortran intermediate
         | representation (much like Numba converts code to LLVM IR).
         | 
         | Correct. We use the same IR as LFortran and then we lower to
         | LLVM IR.
         | 
         | > Any calls to CPython libraries are made using a special
         | decorator, which might be doing interop using the CPython API.
         | My guess is, this will come with a performance penalty. More so
         | if you're using Numpy or Scipy, as you'll be going through
         | several layers of abstractions and hand-offs.
         | 
         | Yes, it calls CPython, so it's slow.
         | 
         | > This is because Numba, Pythran and JAX (in a way) get around
         | this by reimplementing a subset of NumPy/SciPy/other core
         | libraries. Any call to a supported function is dynamically
         | rerouted to the native reimplementation during JIT/AOT
         | compilation.
         | 
         | We do as well: we support a subset of NumPy directly
         | (eventually most of NumPy). We also support a very small subset
         | of SymPy. Over time we add more support to more basic
         | libraries. The rest you can call via CPython, but slow. For
         | SymPy we'll experiment building it on top (at least some
         | modules, like limits) and compile using LPython. Given that any
         | LPython code is just Python, this might be a viable way, as
         | long as we support enough of Python directly.
         | 
         | > I'd be interested in seeing how far LPython can tolerate
         | regular Python code, with a ton of CPython interop and class
         | use.
         | 
         | We support structs via `@dataclass`, but not classes yet
         | (although LFortran does to some extent, so we'll add support
         | soon to LPython as well). For regular Python code, LPython will
         | give nice error messages suggesting to type things. Once you do
         | and it compiles, it will run fast.
         | 
         | > In any case, glad to see more competition. Not to take
         | anything from the authors - this is a massive effort on their
         | part and achieves some impressive results. Mojo has VC money
         | behind it - AFAIK, this is a pure volunteer driven effort, and
         | I'm grateful to the authors for doing it!
         | 
         | We are supported by my current company (GSI Technology) as well
         | as by NumFOCUS (LFortran), GSoC and other places; we have a
         | very strong team (5 to 10 people). In the past I was supported
         | by Los Alamos National Laboratory to develop LFortran. I have
         | delivered SymPy as a physics student with no institutional
         | support. So I have experience doing a lot with very little. :)
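A minimal sketch of the style described in the comment above: a struct
declared with `@dataclass` plus fully annotated functions. It is not
taken from the thread. It uses CPython's built-in `int` and `float` so
it runs anywhere; as far as I know LPython expects its own fixed-width
types (such as `i32` and `f64` from its `lpython` module), so treat the
annotations here as stand-ins. The names `Particle`, `kinetic_energy`
and `total_energy` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    # Every field carries a type annotation, so a compiler could lay the
    # struct out statically.
    mass: float
    vx: float
    vy: float


def kinetic_energy(p: Particle) -> float:
    # Arguments and return value are annotated; nothing is left to inference.
    return 0.5 * p.mass * (p.vx * p.vx + p.vy * p.vy)


def total_energy(particles: list[Particle]) -> float:
    # Local variables are declared with their types before use.
    total: float = 0.0
    i: int
    for i in range(len(particles)):
        total += kinetic_energy(particles[i])
    return total
```

Whether a given construct here (for example a list of structs) is
accepted by the current LPython release is an assumption; the point is
only that every name has a declared type.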
        
       | peterfirefly wrote:
       | > Based on the novel Abstract Semantic Representation (ASR)
       | shared with LFortran,
       | 
       | What's novel about it?
        
         | rebcabin001 wrote:
         | ASR abstracts away all syntax and all details of the target
         | machine, no leaks. Contrast to the schoolbook approach of
         | decorating ASTs with semantic information, which often reflects
         | details of a target machine.
        
           | certik wrote:
           | Yes. ASR is as abstract as it can be, but it is still
           | faithful to the original language, no information has been
           | lost, nothing was lowered.
        
             | peterfirefly wrote:
             | How do you manage to do that for Python AND Fortran in the
             | same language?
        
               | certik wrote:
               | By being a superset of Fortran and subset of Python. It
               | turns out the features map on each other almost 1:1, and
               | the differences can be taken care of by the respective
               | frontends, so the abstracted ASR maps perfectly.
        
             | rebcabin001 wrote:
             | Right. Decompilation with ASR should be relatively easy and
             | more faithful than average decompilation (though, as
             | mentioned by another commenter, the very-long-term value of
             | decompilation in general is debatable in the face of rising
             | AI like CoPilot).
        
           | rebcabin001 wrote:
           | Implication is that ASR is a full programming language in its
           | own right (though with no quality-of-life features:
           | everything is explicit, and it's also currently restricted to
           | the operations featured by LFortran and LPython: heavily
           | array-oriented for now, ASR grows as LFortran and LPython
           | grow). I've prototyped, in Clojure, an independent type-
           | checker for ASR (https://github.com/rebcabin/masr), and an
           | interpreter (for "abstract execution") should not be
           | difficult.
        
       | williamstein wrote:
       | I wish there were comparisons with Cython 3.0, which seems like
       | it would be a competitor with the AOT part of what they are
       | doing. For some reason they don't mention Cython at all.
        
         | certik wrote:
         | Hi William, nice to hear from you. We mention Cython on our
         | front page (at the bottom): https://lpython.org/, together with
         | the other 23 Python compilers that I know about (all of them
         | are competitors, in a way). I am very familiar with Cython from
         | about 10 years ago, but I have not followed the very latest
         | developments. We can do a compilation time and runtime speed
         | benchmarks against Cython in the next blog post. :)
        
           | cycomanic wrote:
           | It would also be nice to add some comparison with pythran,
           | which seems to occupy a similar niche (although it can
           | actually compile numpy code to optimized c++ code which I
           | don't think LPython can do?)
        
             | certik wrote:
             | Yes, we can compare with pythran too. We should really
             | compare against all the other compilers, but it's a lot of
             | work to compare meaningfully: we don't want to put up
             | benchmarks unless we are really sure they are solid, and as
             | everyone knows, it's really hard to do benchmarks that are
             | fair, as one has to have solid experience with both
             | projects being benchmarked. But we did Numba and C++, so
             | for now you can compare pythran against them to get a
             | decent idea.
        
       | rich_sasha wrote:
       | Looks very interesting indeed.
       | 
       | Two questions come to my mind:
       | 
       | - presumably, since it is compiled, it does static checks on the
       | code? How many statically-detectable bugs that are now purely
       | triggered at runtime can be eliminated with LPython?
       | 
       | - does it deal with the unholy amounts of dynamism of Python? Can
       | you call getattr, setattr on random objects? Does eval work? Etc.
       | Quite a few Python packages use these at least once somewhere...
        
         | certik wrote:
         | > - presumably, since it is compiled, it does static checks on
         | the code? How many statically-detectable bugs that are now
         | purely triggered at runtime can be eliminated with LPython?
         | 
         | Yes, it does static checks at compile time. The only thing that
         | we do at runtime (imperfectly right now, eventually perfectly)
         | are array/list bounds checks, integer overflow during
         | arithmetic or casting and such. Those checks would only run in
         | Debug and ReleaseSafe modes, but not in Release mode. So for
         | full performance, you choose the Release mode. For 100% safe
         | mode, you would need to use ReleaseSafe.
         | 
         | > - does it deal with the unholy amounts of dynamism of Python?
         | Can you call getattr, setattr on random objects? Does eval
         | work? Etc. Quite a few Python packages use these at least once
         | somewhere...
         | 
         | It deals with it by not allowing it. We will support as much as
         | we can, as long as it can be robustly ahead of time compiled to
         | high performance code. The rest you can always use just via our
         | CPython "escape hatch". The idea is that either you want
         | performance (then restrict to the subset that can be compiled
         | to high performance code) or you don't (then just use CPython).
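To make "dynamism" concrete, here is a small CPython sketch (not from
the thread) of the kind of code being discussed. The attribute names
and the evaluated expression are only known at run time, so an
ahead-of-time compiler has no static types to work with. The class
`Config` and the helper names are hypothetical.

```python
class Config:
    """An empty class whose attributes are attached at run time."""


def set_dynamic(obj: object, name: str, value: object) -> None:
    # The attribute name is a runtime string, so the set of fields on
    # `obj` is not known when the source is compiled.
    setattr(obj, name, value)


def read_dynamic(obj: object, name: str) -> object:
    # The result type depends on whatever was stored under `name`.
    return getattr(obj, name)


def evaluate(expr: str) -> object:
    # `eval` runs arbitrary source text; its result type cannot be
    # determined in advance.
    return eval(expr)


cfg = Config()
set_dynamic(cfg, "threads", 8)
set_dynamic(cfg, "label", "fast")
```

Code like this would stay on the CPython side of the split described
above, while the typed numerical kernels go through the compiler.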
        
           | abecedarius wrote:
           | For a system that doesn't intend to handle arbitrary
           | unaltered Python code, I think a post announcing a "Python
           | Compiler" should make this clearer in the first paragraph.
        
             | certik wrote:
             | You are right, we should have made that clearer. If you
             | look at all the 25 compilers that we list at the bottom of:
             | https://lpython.org/, some of them are supersets, some of
             | them are subsets. Sometimes the distinction is blurry,
             | since even with LPython you can call arbitrary CPython
             | today, it just doesn't get compiled right now, but perhaps
             | later we can actually compile it like Cython does, just not
             | speed it up much. In that case we become a subset that is
             | fast, the rest is slower, but I think every superset of
             | Python behaves like that too. In a way, I think all of the
             | 25 compilers support all of Python one way or the other
             | (LPython as well), and I think each of them only gets a
             | subset to be fast.
             | 
             | If you have some ideas how to best communicate this, let me
             | know.
             | 
             | The best approach that I know right now is to say that we
             | support a strict subset of Python, and if it compiles, it
             | will be fast. The rest of Python you have to call
             | explicitly and it will be slow and you get a CPython
             | dependency in the binary that we generate, but you can do
             | it. That way there is a clear distinction what gets
             | compiled via our compiler into high performance machine
             | code and what gets dispatched via CPython (it will
             | eventually get "compiled", but it will just call CPython).
        
               | abecedarius wrote:
               | I guess I'd put it like "LPython compiles a type-
               | annotated Python subset (dialect? variant?) to optimized
               | machine code. Interop with CPython is easy: arbitrary
               | Python code can go in the same source file using a
               | decorator." -- just as an example to try to be helpful --
               | of course you know your system and I don't. It sounds
               | really cool and I'm wishing you luck.
        
               | certik wrote:
               | Thank you! I opened up an issue to do this:
               | https://github.com/lcompilers/lpython/issues/2220.
        
         | movpasd wrote:
         | I think your second point is especially important, as the
         | semantics of Python are dynamic down to the core. And it's not
         | just hypothetical stuff -- Python libraries pretty
         | systematically take advantage of this dynamism (anyone who's
         | tried using type stubs for popular libraries will know what I
         | mean).
         | 
         | The examples in the article appear to mainly revolve around
         | numerical calculations, so I suspect the target audience is
         | people doing scientific computing who need to "break out" into
         | a compiled mode for heavy CPU calculations (similar to numba or
         | even Julia) from time to time when their calculations aren't
         | vectorisable.
         | 
         | I've noticed a split between the needs of software engineers on
         | the one hand, who need expressive abstractions to manage
         | systems of extensive rather than intensive complexity, and
         | scientific programmers or model-builders on the other hand, who
         | are much more likely to just use the primitives offered by
         | their language or library as their needs revolve around
         | implementing complicated algorithms.
        
       | catsarebetter wrote:
       | This is dope, I think the speed benchmarks for Python mark it
       | way slower than C++, JS.
       | 
       | If this could be used in mainstream web dev with the level of
       | speed detailed, python might eat Javascript's lunch.
       | 
       | *bias - python obsessed
        
         | vorticalbox wrote:
         | Speed isn't usually high on the list for web development, I
         | create APIs for fintech and we use nestJS for the back end, and
         | nest is definitely slower but you get a lot of benefits that
         | make it completely worth it.
         | 
         | Auto swagger generation, validation pipes and so on.
        
           | catsarebetter wrote:
           | completely fair, fintech is a beast that requires a lot of
           | blood sweat and tears
        
       | est wrote:
       | Is the name chosen in contrast with PyPy's RPython?
       | 
       | Haven't heard from pypy for a while now.
        
         | mattip wrote:
         | PyPy is still around. We released a Python3.10 version last
         | month. What else would you like to hear about?
        
           | mdaniel wrote:
           | I'm not the person who asked, but I've always been curious
           | what the "long pole in the tent" is for pypy updates.
           | https://mail.python.org/archives/list/pypy-dev@python.org/
           | seems pretty quiet, the blog seems to be just announcements
           | and not technical dives, and the 3.11 milestone (
           | https://foss.heptapod.net/pypy/pypy/-/milestones/15#tab-
           | issu... ) just has one issue
           | 
           | Please don't take this as "why u no faster?!" as much as I
           | really would enjoy understanding what it takes to bring pypy
           | up to cpython parity. I do see
           | https://doc.pypy.org/en/latest/contributing.html#your-
           | first-... says that pypy has a lot of layers and (I'd
           | suppose) a fraction of the number of people compared to
           | cpython's contributor base, but like I said it's hard to
           | understand from the outside whether it's just a lot of rocks
           | to break, or there are genuine novel optimization or computer
           | science tricks that have to be solved when rolling out a
           | newer version
           | 
           | Above all, thanks so much for your work on pypy - it has
           | really helped me a lot and not from its speed but from my
           | ability to use it in places where getting cpython to run is
           | harder than "curl pypy.tar.bz2 | tar -xjf"
        
             | mattip wrote:
             | I am glad you find PyPy useful. The good news is exactly
             | the boring consistency with which we have been able, on a
             | volunteer basis, to stay pretty close to CPython progress.
             | There are no big challenges there, it is work at the top
             | layer of the interpreter to adapt current code to CPython
             | changes. On the other hand, we need sponsorship to
             | implement the deeper changes we would like to make that
             | would make things even faster.
        
         | certik wrote:
         | No, we had LFortran, so naturally we have LPython now as the
         | second frontend. We chose "L" in LFortran to be unspecified,
         | although let's just say I live in _L_os Alamos, and we use
         | _L_LVM.
        
       | LoganDark wrote:
       | Being able to JIT individual Python functions is absolutely huge,
       | this could make way for huge speedups in hot functions from
       | existing large Python codebases.
        
         | KeplerBoy wrote:
         | that's what Numba does.
        
           | LoganDark wrote:
           | Numba seems specialized for NumPy. LPython seems to work with
           | everything, almost similar to mypyc.
        
             | KeplerBoy wrote:
             | It's not that specific to Numpy, it deals with all kinds of
             | properly typed Python as long as no packages are used. In
             | fact you don't need Numba if you're already using proper
             | Numpy code.
        
               | thecfrog wrote:
               | Some things can't be vectorized and sometimes the
               | vectorized implementation requires too much intermediate
               | memory.
        
               | certik wrote:
               | Yes, the Numba use case is a subset of LPython. We want
               | to support what Numba does, that is, you decorate your
               | function and JIT it. But in addition, we also want to
               | compile to binaries (ahead of time) that have no CPython
               | dependency, and support high performance optimizations.
               | Numba speeds up Python a lot, but doesn't seem to quite
               | get the Fortran/C++ level of performance sometimes. One
               | of our main goals is to be able to get maximum
               | performance, so that eventually as a user you can depend
               | on LPython that if it compiles it, it will run at least
               | as fast as C++ or Fortran would.
        
               | KeplerBoy wrote:
               | > At least as fast as C++ or Fortran
               | 
               | That obviously has to be the goal, but is it really
               | feasible to be faster than good C/C++ or Fortran? I did
               | some research into the Python Compiler landscape and came
               | to the conclusion that it almost always boils down to LLVM.
               | So, if you want to have fast code, just help the compiler
               | make the most of your code and you'll be 99% of the way
               | there and as fast as possible without significantly more
               | effort.
               | 
               | Would you agree with my layman's understanding of this
               | topic?
        
               | certik wrote:
               | It's harder to imagine for a Python compiler, so let's
               | just focus on LFortran (since LPython delivers exactly
               | the same performance, due to sharing the middle end and
               | backends). The Fortran compilers traditionally were
               | faster than C++, and almost always (even today) are at
               | least as fast as C++, due to the Fortran language being
               | simpler and higher level, designed to allow good
               | optimizations. LFortran competes with other Fortran
               | compilers as well as C++ compilers and our goal indeed has
               | to be to be
               | at least as fast as the competition. We currently are
               | sometimes faster sometimes slower, but we are in the same
               | league, so that's a good start.
               | 
               | Regarding LLVM: my experience so far is that it is indeed
               | amazing what LLVM can do in terms of optimizations.
               | It's very very good. However, it is not all LLVM. As our
               | benchmarks in the blog post show, we compare Numba, Clang
               | and LPython, all three of which use LLVM. But we get
               | vastly different performance for what seems to look like
               | identical initial code. To know exactly why, we would
               | have to meet with the Numba and Clang developers and
               | study this, I suspect Clang lowers to LLVM too soon, and
               | uses C++ to do abstractions (like `std::vector` or
               | `std::unordered_map`) and perhaps it can't quite get the
               | top performance this way. Numba perhaps doesn't get all
               | the types as tight as LPython, or perhaps implements some
               | things not as efficiently, or perhaps doesn't apply as
               | good optimizations before lowering to LLVM. I suspect
               | LLVM gets the best performance if the compiler generates
               | as straightforward LLVM IR code as possible, without
               | layers and layers of abstractions that might not end up
               | being "zero cost" in practice.
        
               | KeplerBoy wrote:
               | Thanks for the reply. I appreciate it.
        
               | nicoco wrote:
               | There are things you cannot do with numpy and that numba
               | helps with, eg a custom PDE solver where you cannot avoid
               | iterating over arrays' elements.
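One common example of the element-by-element iteration mentioned above
is a Gauss-Seidel sweep: each update reads a neighbour that was already
updated in the same pass, so it does not reduce to a single whole-array
expression. The sketch below is plain Python for the 1D Laplace
equation with fixed end values. It is not from the thread, and the
function names are illustrative rather than taken from any library.

```python
def gauss_seidel_sweep(u: list[float]) -> float:
    """Do one in-place sweep; return the largest change made."""
    max_change = 0.0
    for i in range(1, len(u) - 1):
        # u[i - 1] was already updated in this sweep: a loop-carried
        # dependency that a plain array expression cannot express.
        new_value = 0.5 * (u[i - 1] + u[i + 1])
        max_change = max(max_change, abs(new_value - u[i]))
        u[i] = new_value
    return max_change


def solve_laplace_1d(left: float, right: float, n: int,
                     tol: float = 1e-10,
                     max_sweeps: int = 100_000) -> list[float]:
    """Solve u'' = 0 on n >= 2 grid points with fixed boundary values."""
    u = [0.0] * n
    u[0], u[-1] = left, right
    for _ in range(max_sweeps):
        if gauss_seidel_sweep(u) < tol:
            break
    return u
```

Calling `solve_laplace_1d(0.0, 1.0, 5)` relaxes toward the straight
line between the two boundary values. In CPython the inner loop is
slow, which is the kind of kernel that Numba or a typed compiler is
meant to speed up.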
        
       | Reubend wrote:
       | Nitpick: The "Documentation" button on the header links to
       | LFortran, not LPython.
       | 
       | With that said, I love the fact that this can generate
       | executables, and I'm looking forward to trying it out in the
       | future! Python compilers are really cool. I recently used Nuitka
       | to build a standalone version of an app I wrote for my own use,
       | so that I can run it on hosts that don't have Python installed
       | (or don't have the right Python packages installed yet). This
       | seems to be focused much more on speed.
       | 
       | One thing which I didn't understand from the homepage: can I take
       | vanilla Python code and AOT compile it with this?
        
         | certik wrote:
         | > Nitpick: The "Documentation" button on the header links to
         | LFortran, not LPython.
         | 
         | Yes, I noticed too, thanks. We currently don't have a dedicated
         | documentation for LPython and a lot of the LFortran
         | documentation applies. We will eventually have a dedicated
         | LPython documentation.
         | 
         | > With that said, I love the fact that this can generate
         | executables, and I'm looking forward to trying it out in the
         | future! Python compilers are really cool. I recently used
         | Nuitka to build a standalone version of an app I wrote for my
         | own use, so that I can run it on hosts that don't have Python
         | installed (or don't have the right Python packages installed
         | yet). This seems to be focused much more on speed.
         | 
         | We focus on speed, but we definitely create executables (no
         | Python dependencies), just like any other C or Fortran compiler
         | would. You only get a CPython dependency if you call into
         | CPython (explicitly). Indeed creating such standalone
         | executables simplifies the deployment and packaging issues: all
         | the packages must be resolved at compile time, once you compile
         | your application, there are no more dependencies on your Python
         | packages (unless any of them calls into CPython of course).
         | 
         | > One thing which I didn't understand from the homepage: can I
         | take vanilla Python code and AOT compile it with this?
         | 
         | Only if that vanilla Python compiles with LPython (since any
         | LPython code is just Python code, a subset). So in general no.
         | If you call CPython from your LPython main program (let's say),
         | then you will get a small binary that depends on CPython to
         | call into your Python code. I thought about somehow packaging
         | the Python sources into the executable, similar to how
         | PyOxidizer does it, but that's a project on its own almost. We
         | will see what the community wants. If we can make LPython
         | support a large enough subset of CPython, I think I would like
         | to use LPython as is, since it's nice to not have any Python
         | dependency and everything being high performance, essentially
         | equivalent to writing C++ or Fortran.
        
       | yevpats wrote:
       | I think that today, with things like Copilot speeding up
       | development significantly, the era of Python and untyped
       | languages is past its peak.
        
         | certik wrote:
         | Yes, I thought about this too. Also LLMs can or will be able to
         | translate from one language to another, so perhaps the fact
         | that LFortran/LPython can translate Fortran/Python to other
         | languages like C++ or Julia might not be useful.
         | 
         | My approach is that it is still unclear to me what exactly will
         | be possible in the future, while I know exactly how to deliver
         | these compilers today. I suspect a traditional compiler will be
         | more robust and also a lot faster than an LLM for tasks like
         | translation to another language or compilation to binary. And
         | speed of compilation is very important for development from the
         | user perspective.
         | 
         | Conclusion: I don't know what the future will bring, but I
         | suspect these compilers will still be very useful.
        
           | rebcabin001 wrote:
           | Some IDEs have incremental compilers that are sufficiently
           | fast to update squigglies and what-not on every keystroke.
           | Compilation speed is a primary value, in general.
        
       | knighthack wrote:
       | _"...The benchmarks support the claim that LPython is competitive
       | with its competitors in all features it offers. ... Our
       | benchmarks show that it's straightforward to write high-speed
       | LPython code. We hope to raise expectations that LPython output
       | will be in general at least as fast as the equivalent C++ code.
       | "_
       | 
       | At least as fast as C++ is a bold claim, but this is an
       | interestingly documented process. I'm keen.
       | 
       | I'm quite partial to Nuitka at this point, but I'm open to other
       | Python compilers.
        
         | nurettin wrote:
         | Nuitka is just a packaging system which places all dependencies
         | inside of an executable. It doesn't compile into machine code.
         | Why do you conflate it with compilers?
        
           | knighthack wrote:
           | Since you're commenting from opinionated ignorance, you might
           | want to read the Nuitka page itself:
           | 
           | > Nuitka is _the_ Python compiler. ... It then executes
           | uncompiled code and compiled code together in an extremely
           | compatible manner. Nuitka translates the Python modules into
           | a C level program that then uses libpython and static C files
           | of its own to execute in the same way as CPython does.
           | 
           | I'm happy to stick by the tool's own description of itself.
           | But if you don't accept that, and Nuitka still isn't a
           | compiler by you, sure. Go ahead with your pedantic and
           | strictest definition of the meaning of a 'compiler'.
           | 
           | Learn to read English, before insisting on your pedantry as
           | some sort of truth, in disregard of the tool's own
           | intent/description. (And I do hope you take issue with
           | everyone's description of Typescript as a compiled
           | programming language too.)
        
             | nurettin wrote:
             | Let's be reasonable: Nuitka's sales pitch really isn't a
             | good source of information. And be aware that it
             | doesn't say the code is converted or compiled to C. It just
             | says it loads python code from C and runs it on the
             | interpreter, which is basically what the python interpreter
             | does.
             | 
             | I think my comment being devoid of content might have
             | caused some frustration. So to make the best of everyone's
             | time:
             | 
             | You can look into cython and pythran to see what I'm
             | talking about. cython lets you optimize code step by step
             | via generating an html page with your code and highlighting
             | lines that still require the use of the python runtime. It
             | lets you add types and cdef function definitions in order
             | to reduce your dependency on the python runtime.
             | 
             | Another good example is pythran, which takes your python
             | code and turns it into c++ code to be compiled by a c++
             | compiler. I understand that this isn't a direct compilation
             | to machine code, but a middle step which lets you compile
             | the output to machine code.
             | 
             | Then there is numba and taichi which have just-in-time
             | compilation decorators. Taichi also provides a
             | sophisticated runtime which lets you run parts of the code
             | on a GPU.
             | 
             | Surprisingly, the best performance I've experienced among
             | these examples was numba + numpy, even though numba alone
             | can sometimes have optimizations that surpass all
             | compilation efforts, because it turns your loops into
             | mathematical formulas and runs them at O(1) complexity when
             | it can.
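             |
             | A minimal sketch of the loop-to-formula idea mentioned above,
             | in plain Python so it runs without numba installed. The
             | function names are made up for illustration, and whether
             | numba (through LLVM) actually performs this rewrite for a
             | given loop is an assumption that depends on the optimizer:

```python
def sum_loop(n: int) -> int:
    """O(n): add up 0, 1, ..., n-1 one term at a time."""
    total = 0
    for i in range(n):
        total += i
    return total


def sum_closed_form(n: int) -> int:
    """O(1): the closed form n*(n-1)/2 for the same sum (n >= 0)."""
    return n * (n - 1) // 2


# Both versions agree; an optimizer that recognizes the pattern can
# replace the first with the second.
for n in (0, 1, 10, 1000):
    assert sum_loop(n) == sum_closed_form(n)
```

             | The point is only that the two functions compute the same
             | value, so replacing the loop with the formula is a legal
             | optimization.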
        
           | quietbritishjim wrote:
           | It is absolutely a compiler. I've certainly spent enough time
           | waiting for it to compile! On the contrary, packing
           | dependencies is a side effect rather than its primary
           | purpose. (Your sarcastic last sentence doesn't help your
           | point, even if you weren't wrong.)
           | 
           | However, it really is different from projects like this one,
           | in that it doesn't attempt to obtain C-like speed (but does
           | hope to do some optimisations). For example, x+=1 will still
           | dynamically dispatch depending on the runtime type of x, and
           | (if it's an int) do the normal Python arbitrary precision
           | operation. But those will be called from machine code rather
           | than interpreted byte code.
           | 
           | (Essentially, it unrolls the main loop of the CPython
           | interpreter, which is written in C, for every byte code
           | operation, and eliminates every case of the switch statement
           | inside except the one that corresponds to this operation.
           | That's what gets compiled.)
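           |
           | A small sketch of the `x += 1` point above, in plain Python.
           | `Counter` and `increment` are hypothetical names used only for
           | illustration. The same statement ends up doing three different
           | things depending on the runtime type of `x`, which is the
           | dispatch that a compiler keeping full Python semantics has to
           | preserve:

```python
class Counter:
    """Hypothetical user type that customizes in-place addition."""

    def __init__(self, value: int = 0) -> None:
        self.value = value

    def __iadd__(self, other: int) -> "Counter":
        self.value += other
        return self


def increment(x):
    # Which operation runs is decided by the runtime type of x.
    x += 1
    return x


big = increment(2**64)          # int: arbitrary precision, no overflow
half = increment(0.5)           # float: floating-point addition
counter = increment(Counter())  # Counter: calls Counter.__iadd__
```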
        
             | nurettin wrote:
             | Just to be clear, the reason I didn't (and won't) accept
             | nuitka as a compiler is that it doesn't do what actual
             | compilers do, it just plays around with bytecode. I
             | experienced no speed difference when running large
             | programs, but the startup is considerably slower. To me, it
             | is just a docker replacement that is 1/10th as portable.
        
               | mid-kid wrote:
               | "plays around with bytecode"? Even if that is so, it
               | translates the whole program to c++, which is then
               | compiled. It relies on libpython to implement a lot of
               | the core language types and such, so if your program
               | isn't very computationally intensive or makes heavy use
               | of core data types, you might not notice much, but it's
               | definitely a compiler.
        
               | certik wrote:
               | A compiler doesn't need to optimize. I think if it takes
               | Python code, and translates it to something else, it's a
               | compiler. An optimizing compiler is the one that will
               | give you speedups.
        
               | ptx wrote:
               | And Nuitka does optimize:
               | https://nuitka.net/doc/user-manual.html#optimization
        
               | certik wrote:
               | Then it's an optimizing compiler for sure.
        
           | certik wrote:
           | Nuitka is a compiler. We list it at the bottom of
           | https://lpython.org/, together with the other 23 Python
           | compilers, now 24. :)
        
         | certik wrote:
         | We put this sentence there to drive the point home that LPython
         | competes with C++, C and Fortran in terms of speed. The
         | internals are shared with LFortran, and LFortran competes with
         | all other Fortran compilers, that traditionally are often
         | faster than C++ for numerical code. I've been using Python for
         | over 20 years and it's hard for me to imagine that writing
         | Python could actually be faster than Clang/C++, somehow I
         | always think that Python is slow. Right now we are still alpha
         | and sometimes we are slower than C++. Once we reach beta, if an
         | equivalent C++ or Fortran code is faster than LPython, then it
         | should be a bug to report.
        
           | capitalsigma wrote:
           | I thought that Fortran was traditionally faster than C++ for
           | numerical code due to stricter aliasing rules in the
           | language, which I wouldn't expect to carry over to an IR?
        
             | certik wrote:
             | That, but also being simpler and higher level, having
             | multidimensional arrays in the language itself and simpler
             | semantics (such as you cannot just take a pointer to an
             | arbitrary variable, it has to be marked with "target"), no
             | exceptions, and so on. What carries over to the IR today
             | are all the language semantic features, such as all the
             | array operations (minloc, maxval, sum, ...) and functions
             | (sin, cos, special functions) as well as all the other
             | features without any lowering, and we then do optimizations
             | at this high level, then only at the end we lower (say to
             | LLVM). Python/NumPy can be optimized in exactly the same
             | way, and that's what LPython does. I think C++ can also be
             | compiled this way, but the frontend would have to
             | understand basic structures like `std::vector`,
             | `std::unordered_map` as well as arrays (say xtensor or
             | Kokkos, whatever library you use for arrays), and lift it
             | to our high level IR. Possibly we would have to restrict
             | some C++ features if they impeded performance, such as
             | exceptions --- I am not an expert on C++ compilers, I am
             | only a user of C++.
        
           | chaxor wrote:
           | What happens when including numba or pytorch, etc in the
           | scripts? GPU acceleration in python is one really nice way of
           | getting decent speed, but I would imagine it's difficult to
           | shuffle over when doing this type of compiling. If the end
           | compiled program allows for use of all available
           | computational resources (some logic with python to determine
           | what accelerations to allocate, what is available, etc) and
           | then can compile to C++ speeds for CPU and use GPU where
           | appropriate, this will be astoundingly good.
        
             | certik wrote:
             | Right, we currently support a subset of NumPy (just `from
             | numpy import ...`) and SymPy (`from sympy import ...`) and
             | some parts of the Python standard library. We want to
             | support PyTorch, CuPy and other such libraries in a similar
             | way, at least the subset that can be ahead of time
             | compiled, which is quite large.
             | 
             | Yes, offloading to GPU we want to support naturally via
             | NumPy syntax. We will look at this very soon, most likely
             | via annotating that a given array lives on a GPU or host,
             | and then array copy will copy it from host to device, etc.
        
       | ljlolel wrote:
       | Looks like an open source project that hits all of the promises
       | of Mojo except for targeting MLIR and fusion
        
         | certik wrote:
         | Mojo is a strict superset of Python, LPython is a strict subset
         | of Python.
         | 
         | We could target MLIR later, right now we are just targeting
         | LLVM.
        
       | fgfm wrote:
       | A new Python high-performance compiler that could compete with
       | Mojo before it even releases.
        
         | eyegor wrote:
         | Mojo is vaporware for now, they still don't support classes.
         | There are many other issues, but good luck finding a meaningful
         | size python codebase with zero classes.
        
           | lozenge wrote:
           | Neither does LPython yet. Even when it does, it seems like
           | LPython is meant as a Numba alternative, not something to run
           | arbitrary Python code.
           | 
           | "LPython is built from the ground up to translate numerical,
           | array-oriented code into simple, readable, and fast code."
        
             | certik wrote:
             | Yes, LPython is a strict subset of Python, while Mojo is a
             | strict superset of Python.
             | 
             | Both are valid and consistent approaches with their pros
             | and cons, I listed some of them here:
             | https://fortran-lang.discourse.group/t/fast-ai-mojo-may-be-t....
        
               | ptx wrote:
               | > _while Mojo is a strict superset of Python_
               | 
               | Is it a superset if it doesn't implement all of Python? It
               | seems more like a set that has a non-empty intersection
               | with Python.
        
               | certik wrote:
               | My understanding from Mojo's plans is that they want to
               | compile all of Python via their compiler (eventually),
               | and then extend Python with extra syntax that will
               | compile to high performance. I think right now they might
               | not compile all of Python yet, so you are right they are
               | neither a subset nor a superset, but once they deliver on
               | their plans, they will become a superset.
        
               | rebcabin001 wrote:
               | A particular advantage of subsetting the language is that
               | LPython inherits all the tooling of Python. I use pudb
               | and PyCharm daily to develop LPython code.
        
       | karteum wrote:
       | Looks very interesting ! The authors talk about Numba, but does
       | anyone know how it would compare to Codon ?
       | (https://news.ycombinator.com/item?id=33908576)
       | 
       | edit: after trying quickly, it seems that lpython really requires
       | type annotations everywhere, while codon is more permissive (or
       | does type inference)
        
         | certik wrote:
         | I would say LPython, Codon, Mojo and Taichi are structured
         | similarly as compilers written in C++, see the links at the
         | bottom of https://lpython.org/.
         | 
         | Internally they each parse the syntax to AST, then have some
         | kind of an intermediate representation (IR), do some
         | optimizations and generate code. The differences are in the
         | details of the IR and how the compiler is internally
         | structured.
         | 
         | Regarding the type inference, this is for a blog post on its
         | own. See this issue for now:
         | https://github.com/lcompilers/lpython/issues/2168, roughly
         | speaking, there is implicit typing (inference), implicit
         | declarations and implicit casting. Rust disallows implicit
         | declarations and casting, but allows implicit typing. As shown
         | in that issue they only meant to do single line implicit
         | typing, but (by mistake?) allowed multi-statement implicit
         | typing (action at a distance). LPython currently does not allow
         | any implicit typing (type inference). As documented at the
         | issue, the main problem with implicit typing is that there is
         | no good syntax in CPython that would allow explicit type
         | declaration but implicit typing. Typically you get both
         | implicit declaration and implicit typing, say in `x = 5`, this
         | both declares `x` as a new variable as well as types it as
         | integer. C++ and Rust do not allow implicit declarations (you
         | have to use `auto` or `let` keywords) and I think we should not
         | do either. We could do something like `x: var = 5`, but at that
         | point you might as well just do `x: i32 = 5`, use the actual
         | type instead of `var`.
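         |
         | A minimal sketch of the declaration distinction above, using
         | CPython's standard `ast` module. `statement_kind` is a
         | hypothetical helper, and `i32` is only parsed here, never
         | imported or evaluated. It shows that Python's grammar already
         | separates a bare assignment from an annotated one, which is the
         | hook an explicit declaration like `x: i32 = 5` relies on:

```python
import ast


def statement_kind(source: str) -> str:
    """Return the AST node type name of the first statement in source."""
    return type(ast.parse(source).body[0]).__name__


# Bare assignment: declares x and leaves its type implicit.
implicit = statement_kind("x = 5")

# Annotated assignment: the type is part of the declaration.
explicit = statement_kind("x: i32 = 5")
```

         | Here `implicit` is `"Assign"` and `explicit` is `"AnnAssign"`,
         | so a compiler can accept the second form and reject the first
         | without any new syntax.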
        
           | westurner wrote:
           | Shedskin is a Python-to-C++ transpiler that does type
           | inference and does not support the full standard library:
           | https://en.wikipedia.org/wiki/Shed_Skin#Type_inference
           | 
           | From "Show HN: Python Tests That Write Themselves" (2019)
           | https://news.ycombinator.com/item?id=21012133:
           | 
           | > _pytype (Google) [1], PyAnnotate (Dropbox) [2], and
           | MonkeyType (Instagram) [3] all do dynamic / runtime PEP-484
           | type annotation type inference [4]_
           | 
           | Hypothesis (@given decorator tests) also does type inference
           | IIUC? https://hypothesis.readthedocs.io/en/latest/
           | 
           | icontract and pycontracts do _runtime_ Preconditions and
           | Postconditions with Design-by-Contract patterns similar to
           | Eiffel DbC; they check the types and values of arguments
           | passed _while the program is running_ and not just at coding
           | or compile time.
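           |
           | A rough sketch of what a runtime precondition check looks
           | like, hand-rolled with only the standard library. The `require`
           | decorator and `triangular` function are hypothetical and are
           | not the icontract or pycontracts API; they only illustrate
           | checking argument types and values on every call:

```python
import functools


def require(predicate, message):
    """Hypothetical decorator: check a precondition on every call."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The check runs at call time, not at coding or compile time.
            if not predicate(*args, **kwargs):
                raise ValueError(message)
            return func(*args, **kwargs)
        return wrapper
    return decorate


@require(lambda n: isinstance(n, int) and n >= 0, "n must be a non-negative int")
def triangular(n):
    return n * (n + 1) // 2
```

           | Calling `triangular(-1)` raises `ValueError` when the call
           | happens, which is the difference from a static type checker.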
        
             | certik wrote:
             | Yes, we have Shedskin in the list at the bottom of
             | https://lpython.org/. Note that the Shedskin compiler is
             | written in Python, so the speed of compilation might be
             | lower than other Python compilers written in C++. Unless it
             | compiles itself, that would be interesting. We thought
             | about eventually writing LPython in LPython, but for now we
             | are focusing on delivering, so we are sticking to C++.
        
       | __mharrison__ wrote:
       | This is awesome. Love to see competition as it tends to be very
       | beneficial to end users.
       | 
       | I use Numba quite a bit to speed up slow Pandas operations. Cool
       | to have another alternative.
        
       | notpushkin wrote:
       | > You can install it using Conda
       | 
       | At the risk of starting a holy war: _why._
        
         | certik wrote:
         | It was the easiest for us to deliver a binary that works on
         | Linux, macOS and Windows. Others can then use this binary as a
         | reference to package LPython into other distributions. You can
         | also install LPython from source, but it's harder than just
         | using the binary that we built correctly (with all
         | optimizations on, etc.).
        
         | velosol wrote:
         | Miniconda can be nice for sharing an environment around,
         | especially with Apple Silicon as one of the targets.
        
       ___________________________________________________________________
       (page generated 2023-07-29 23:02 UTC)