[HN Gopher] Show HN: Prometeo - a Python-to-C transpiler for hig...
___________________________________________________________________
Show HN: Prometeo - a Python-to-C transpiler for high-performance
computing
Author : zanellia
Score : 112 points
Date : 2021-11-17 14:01 UTC (8 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| BBC-vs-neolibs wrote:
| A brief comparison distinguishing it from Cython would be most
| welcome.
| dom96 wrote:
| This is really cool. Just a bit of pedantry: is Python higher-
| level than C? If so this isn't a transpiler but a compiler :)
| zanellia wrote:
| Fair enough - it's blurred I'd say. I see C as a lower-level,
| and yet still high-level, language, if compared to Python :)
| xapata wrote:
| That's a compiler. I don't understand the desire to create a new
| word when the old one is fine.
| zanellia wrote:
| "A program that translates between high-level languages is
| usually called a source-to-source compiler or transpiler" from
| https://en.wikipedia.org/wiki/Compiler.
| marmaduke wrote:
| Stand-alone is a very useful concept. I don't like deploying
| Python stacks much. Wouldn't that additionally mean you could
| target CL, CUDA or Sycl variants of C?
| zanellia wrote:
| I'd say that's possible in principle - definitely not there at
| the moment though (and not even planned).
| sergius wrote:
| How does this compare with Nim and MicroPython?
| zanellia wrote:
| I thought about using Nim as a host language for the DSL for a
| while, but then decided to rely on Python simply because it is
| more mature (and I had already partially figured out how to
| manipulate Python ASTs to generate C code).
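
A minimal sketch of the idea mentioned above (walking a Python AST and emitting C text), using only the standard `ast` module. This is not prometeo's actual implementation: the type mapping `C_TYPES` and the helpers `expr_to_c` and `function_to_c` are hypothetical names chosen for illustration, and only a single annotated function with one `return` statement is handled.

```python
import ast

# Hypothetical mapping from Python annotations to C types (assumption for this sketch).
C_TYPES = {"int": "int", "float": "double"}

# Binary operators this sketch translates. Division is left out on purpose,
# because Python's "/" and C's "/" differ for integer operands.
C_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}


def expr_to_c(node):
    """Translate a small subset of Python expressions into C source text."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return repr(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in C_OPS:
        left = expr_to_c(node.left)
        right = expr_to_c(node.right)
        return f"({left} {C_OPS[type(node.op)]} {right})"
    raise NotImplementedError(f"unsupported expression: {ast.dump(node)}")


def function_to_c(source):
    """Translate one annotated Python function whose body is a single return."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise NotImplementedError("expected a function definition")
    args = ", ".join(
        f"{C_TYPES[arg.annotation.id]} {arg.arg}" for arg in func.args.args
    )
    ret_type = C_TYPES[func.returns.id]
    (stmt,) = func.body  # exactly one statement is expected
    if not isinstance(stmt, ast.Return):
        raise NotImplementedError("only a single return statement is supported")
    return f"{ret_type} {func.name}({args}) {{\n    return {expr_to_c(stmt.value)};\n}}\n"


SOURCE = "def axpy(a: float, x: float, y: float) -> float:\n    return a * x + y\n"
print(function_to_c(SOURCE))
```

Anything outside the handled subset raises `NotImplementedError`, which is roughly how a DSL built on a language subset would report constructs it cannot translate.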
| zanellia wrote:
| Hi all,
|
| prometeo is an experimental modeling tool for embedded high-
| performance computing. prometeo provides a domain specific
| language (DSL) based on a subset of the Python language that
| allows one to conveniently write scientific computing programs in
| a high-level language (Python itself) that can be transpiled to
| high-performance self-contained C code easily deployable on
| embedded devices.
|
| The package is still rather experimental, but I hope this concept
| could help make the development of software for
| high-performance computing (especially for embedded applications) a
| little easier.
|
| What do you think of it? Looking forward to receiving
| comments/suggestions/criticism :)
| chriswarbo wrote:
| Very interesting! What are the similarities/differences compared
| to RPython (as used by PyPy)?
|
| https://rpython.readthedocs.io/en/latest/rpython.html
| loeg wrote:
| Looks like RPython is a bigger language that doesn't target an
| embedded use case without a Python runtime. Though I may be
| mistaken - I am not super familiar with RPython.
| chrisseaton wrote:
| RPython programs can be compiled to a standalone executable
| without a Python runtime - it's what PyPy is written in, for
| example.
| klyrs wrote:
| Kneejerk reaction as an enthusiastic Cython developer: "bah,
| another crappy (subset of Python)-to-C compiler."
|
| After reading: this is _really_ cool. If I understand this, I
| think you should be able to beat Cython without breaking a sweat.
| I'm quite excited to use this.
| zanellia wrote:
| hahaha thanks!
| pella wrote:
| Nice project.
|
| small comment - related to the benchmarks:
|
| - in Julia: it has a newer Riccati solver (in package)
|
| https://github.com/andreasvarga/MatrixEquations.jl/blob/mast...
|
| https://github.com/andreasvarga/MatrixEquations.jl
| lvass wrote:
| Cython, pypy, micropython, nuitka, shedskin, ironpython,
| graalpython, jython, mypyc, pyjs, skulpt, brython, activepython,
| stackless, transcrypt, cinder and many more I don't remember.
|
| They're all practically useless or delegated to specific tasks.
| At this point you'd need to present incredible evidence that an
| alternative compiler can be useful. Personally I find it comical
| how many developers are still deluded by a promise of performant
| python. I hope you achieve your goals, good luck.
| m_ke wrote:
| Numpy, Numba and PyTorch seem to be doing ok.
| gh02t wrote:
| Cython is also pretty successful obviously, though I don't
| think it quite fits in OP's list given that it's more about
| writing extensions than replacing your entire Python
| code/stack. But I do agree with OP's sentiment even as someone
| who writes a lot of Python.
| fwsgonzo wrote:
| Most of the ones you list require dynamic linking and so are
| hard to make use of in specialized environments.
|
| His project seems to be generating generic C code which is much
| easier to port to any weird platforms. In fact, it might be
| perfect for my use-case where dynamic linking is just extra
| attack surface.
|
| I understand that the project is still in the early stages, but
| I will be paying close attention to it. If at some point it
| will be possible to write "regular" Python in it (minus most of
| the standard library and imports), then it could be a candidate
| for an edge computing platform.
| staticautomatic wrote:
| SpaCy is pretty incredible evidence.
| zanellia wrote:
| The point of prometeo is not to obtain a "performant Python".
| Python is used merely as a host language for an embedded domain
| specific language. You could do the same thing with any other
| language with a mature library for AST analysis :)
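
To illustrate what "Python as a host language for an embedded DSL" can mean in practice, here is a hedged sketch of a subset check: the source must parse as ordinary Python, but only a whitelist of AST node types is accepted. The whitelist `ALLOWED_NODES` and the function `unsupported_constructs` are assumptions made for this example, not prometeo's real rules.

```python
import ast

# Node types accepted by this hypothetical DSL subset (assumption for the sketch).
ALLOWED_NODES = (
    ast.Module, ast.FunctionDef, ast.arguments, ast.arg, ast.Return,
    ast.Assign, ast.AnnAssign, ast.Expr, ast.BinOp, ast.Name, ast.Constant,
    ast.Load, ast.Store, ast.Add, ast.Sub, ast.Mult,
)


def unsupported_constructs(source):
    """List every AST node in `source` that falls outside the accepted subset."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ALLOWED_NODES):
            # Some helper nodes (e.g. comprehension) carry no line number.
            lineno = getattr(node, "lineno", None)
            found.append(f"{type(node).__name__} at line {lineno}")
    return found


OK = "def f(a: float, b: float) -> float:\n    c: float = a * b\n    return c + 1.0\n"
BAD = "def g(xs):\n    return [x * 2 for x in xs]\n"

print(unsupported_constructs(OK))   # nothing outside the subset
print(unsupported_constructs(BAD))  # reports the list comprehension
```

Both snippets are valid Python, but only the first is valid in the restricted language, which matches the "strict subset" description given elsewhere in the thread.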
| [deleted]
| lvass wrote:
| Which makes this thread's title at least confusing.
| zanellia wrote:
| fair enough - could not cram "embedded" into it :)
| throw10920 wrote:
| I'd argue it's downright misleading. "Python-to-C
| transpiler" means _Python_, not "a DSL based on a subset
| of Python".
|
| An accurate title would be "a DSL embedded in Python for
| high-performance scientific computing" or something
| similar.
| nspattak wrote:
| so much effort to match the performance of lower level
| languages that it would have actually been easier to use those
| directly :)
| Zababa wrote:
| I'm not sure, most people aren't writing ASM these days
| because the compilers are good enough for most cases.
| Compilers are great.
| zanellia wrote:
| I think most HPC people would disagree with this statement.
| State-of-the-art HPC code is still written in ASM (see
| e.g., https://github.com/xianyi/OpenBLAS) [that's what
| Intel is doing too]
| guenthert wrote:
| That ASM code is however not necessarily constructed
| manually. You'd think for high performance code with
| limited scope, a superoptimizer would be used.
| zanellia wrote:
| Not sure what a "superoptimizer" would look like in this
| context. For a reference, I know for sure that this
| https://github.com/giaf/blasfeo (which beats Intel MKL)
| was coded entirely by hand.
| Zababa wrote:
| I don't think they would. I think they realize that
| state-of-the-art HPC code is a small fraction of all the
| code written. I doubt that these people write ASM instead
| of Python or JS or C or whatever when doing simple
| scripts.
| marmaduke wrote:
| ASM makes sense when the time spent in a specific routine
| exceeds the time it takes to write the ASM, which makes a
| lot of sense for BLAS, less so for other HPC yet
| speculative or less fundamental projects. CVODES for
| instance doesn't need to be written in ASM, and I think
| Julia makes a strong case that it could have been written
| in Julia.
| Tozen wrote:
| Good point. And you don't have to go that low. Maybe go use
| Object Pascal, Nim, or Vlang. I know... the libraries. But a
| lot of them are bindings of C libraries. So, you can create
| bindings in other languages too or use Python from those
| languages. There are various options.
| zanellia wrote:
| I would disagree on "easier" :) Ever spent half a day
| debugging a segfault?
| MR4D wrote:
| Looks like this could be pretty nice.
|
| I noticed your disclaimer at the bottom of the linked page [0],
| and wanted to get an idea of how far you were looking to take
| this. Will it go beyond maths into normal functions (string
| handling, etc) ? Do you eventually plan on supporting most of
| python - for instance, do you think I could write a web server
| using your tool in the future?
|
| [0] - " _Disclaimer: prometeo is still at a very preliminary
| stage and only few linear algebra operations and Python
| constructs are supported for the time being._ "
| zanellia wrote:
| Unfortunately, I think that writing a transpiler for general
| Python programs might be rather difficult without resorting to
| approaches used, e.g., in Cython/Nuitka. Among other things,
| computing the worst-case heap usage could be quite
| cumbersome/computationally heavy for a general program without
| "constraints". I'd be happy to hear what others think about the
| topic though.
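
A rough sketch of why constraints make worst-case heap usage tractable. It assumes a hypothetical DSL in which every allocation is a call to `Matrix(rows, cols)` with integer constants, matrices hold C doubles, no allocation happens inside a loop, and recursion is forbidden. None of these names or rules are taken from prometeo; they only illustrate the kind of restriction the comment refers to.

```python
import ast

BYTES_PER_DOUBLE = 8  # assumes matrices hold C doubles


def analyse(source):
    """Return {function name: (own heap bytes, [names of called functions])}."""
    info = {}
    for func in ast.parse(source).body:
        if not isinstance(func, ast.FunctionDef):
            continue
        own_bytes, callees = 0, []
        for node in ast.walk(func):
            if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
                continue
            if node.func.id == "Matrix":  # hypothetical DSL constructor
                dims = [a.value for a in node.args
                        if isinstance(a, ast.Constant) and type(a.value) is int]
                if len(node.args) != 2 or len(dims) != 2:
                    raise ValueError("Matrix sizes must be two integer constants")
                own_bytes += dims[0] * dims[1] * BYTES_PER_DOUBLE
            else:
                callees.append(node.func.id)
        info[func.name] = (own_bytes, callees)
    return info


def worst_case_heap(info, entry, stack=()):
    """Own allocations plus those of every call site; recursion is rejected."""
    if entry in stack:
        raise ValueError(f"recursive call to {entry}: heap usage is unbounded")
    own_bytes, callees = info[entry]
    return own_bytes + sum(
        worst_case_heap(info, c, stack + (entry,)) for c in callees if c in info
    )


PROGRAM = """
def helper():
    tmp = Matrix(2, 2)

def main():
    A = Matrix(10, 10)
    B = Matrix(10, 1)
    helper()
    helper()
"""

print(worst_case_heap(analyse(PROGRAM), "main"))
```

The bound is deliberately conservative: memory is counted once per call site, as if nothing were ever released. Without the constant-size and no-recursion restrictions, the same walk could not produce a finite number, which is the difficulty described for general Python programs.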
| rich_sasha wrote:
| Soo... it takes Python syntax and produces a C program, with no
| links back to Python - is that right? It uses a strict subset of
| Python, so that Prometeo programs are valid Python, but not
| necessarily the opposite. Is that fair?
|
| Do you envisage this being a conduit for tight loop optimisation
| in Python? Or is it rather "you'd like a C program but can't
| write C good"?
|
| And if the former, how do you compare to Nuitka and Cython? I
| read your README but couldn't quite make sense of it :)
| zanellia wrote:
| > Soo... it takes Python syntax and produces a C program, with
| no links back to Python - is that right? It uses a strict
| subset of Python, so that Prometeo programs are valid Python,
| but not necessarily the opposite. Is that fair?
|
| yep
|
| > Do you envisage this being a conduit for tight loop
| optimisation in Python? Or is it rather "you'd like a C program
| but can't write C good"?
|
| There are already plenty of options for calling high-
| performance libraries from Python. Now 1) interpreting Python
| programs that use, e.g., NumPy, can be slow. 2) Compiling these
| programs using, e.g., Cython or Nuitka, can speed up the code
| _across_ calls to high-performance libraries, but the resulting
| code will still rely on the Python runtime library, which can
| be slow/unreliable in an embedded context.
|
| Coming to the second part of the question, writing C code
| directly is definitely an option, but, after doing a bit of
| that, I realized how tedious/error prone it is to
| develop/maintain/extend relatively complex code bases for
| embedded scientific computing (e.g. this one
| https://github.com/acados/acados). Or, to put it as Bjarne
| Stroustrup once said "fiddling with machine addresses and
| memory is rather unpleasant and not very productive". The good
| news seemed to be that many of the code structures necessary to
| write that type of code are rather repetitive and can hopefully
| be generated automatically to some extent.
|
| > And if the former, how do you compare to Nuitka and Cython? I
| read your README but couldn't quite make sense of it :)
|
| This table (from the README) shows some computation times for
| Nuitka, prometeo, Python and PyPy.
|
| CPU times in [s]:
|
| Python 3.7 (CPython) : 11.787
| Nuitka               : 10.039
| PyPy                 : 1.78
| prometeo             : 0.657
|
| Other than performance, the main difference is, again, the
| runtime library dependency.
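
For a quick sense of scale, the CPU times quoted above can be turned into ratios relative to prometeo. The numbers are copied from the comment; the variable names are only for this sketch, and the ratios say nothing beyond this single benchmark.

```python
# CPU times in seconds, as quoted in the comment above.
cpu_times = {
    "Python 3.7 (CPython)": 11.787,
    "Nuitka": 10.039,
    "PyPy": 1.78,
    "prometeo": 0.657,
}

baseline = cpu_times["prometeo"]

# Ratio of each runtime to prometeo's runtime on the same benchmark.
ratios = {name: seconds / baseline for name, seconds in cpu_times.items()}

for name, ratio in ratios.items():
    print(f"{name:22s} {ratio:5.1f}x prometeo's CPU time")
```

On these figures CPython and Nuitka take roughly 18 and 15 times as long as prometeo, and PyPy a little under 3 times as long.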
| BBC-vs-neolibs wrote:
| And Cython? (Not CPython)
| rich_sasha wrote:
| Right. Gotcha. So Prometeo isn't another "make Python fast
| again" project, but rather an orthogonal effort to write fast
| (C) programs, but in a high-level Python-like language.
| Thanks.
| zanellia wrote:
| yep, that's right.
| 4w4s wrote:
| It seems a convenient, high-level way to use highly optimized C
| libraries with minimal overhead, both in terms of execution time
| (i.e. vs standard interpreted Python) and in terms of runtime
| size/complexity (see Julia).
| zanellia wrote:
| That's correct. I'd say one of the fundamental differences
| between the two lies in the fact that the code generated by
| prometeo does not depend on a runtime library (which is
| somewhat fundamental for embedded applications, e.g., embedded
| optimization). From prometeo's README:
|
| Finally, although it does not use Python as source language, we
| should mention that Julia too is just-in-time (and partially
| ahead-of-time) compiled into LLVM code. The emitted LLVM code
| relies however on the Julia runtime library such that
| considerations similar to the one made for Cython and Nuitka
| apply.
| zcw100 wrote:
| Have you thought of targeting WebAssembly? If you're going from
| Python/Prometeo -> C you could always make the extra step of
| Python/Prometeo -> C -> WASM but I wonder if there would be an
| advantage of skipping the intermediate C.
| fwsgonzo wrote:
| Why WASM? It would be a pessimization compared to just
| transpiling to C if performance is the goal. WASM also is
| restricted to 128-bit vector instructions.
| zcw100 wrote:
| Because wasm doesn't support Python and it might be nice to
| be able to write WASM in a Python like language.
| fault1 wrote:
| Hasn't cython been ported to wasm (iodide), or perhaps one
| of the "rewrite in Rust" Python impls? rustc can output
| wasm pretty naturally.
| zanellia wrote:
| Python to ASM would actually be really cool and would guarantee
| performance gains for small matrices, but it would require
| quite some implementation effort. Not sure about WASM.
| b20000 wrote:
| just write your code in c or c++ and be done with it. if you need
| math libs there are plenty out there for anything you can
| imagine. python will go the way java went many years ago.
| sys_64738 wrote:
| Seems like a Python-to-C++ transpiler would translate more of the
| language to like-for-like concepts more easily.
| 4w4s wrote:
| But some "embedded platform" tool-chains do not support C++
| zanellia wrote:
| Right, for sure I would not need to re-invent the machinery to
| translate a class into a glorified C struct. The whole thing
| started with C in mind for portability arguments, but it might
| be a good idea to keep an eye on C++ as an option.
| cerved wrote:
| Looks like a cool project!
|
| I can't speak much about the code itself or the aims of the
| projects. Personally I would recommend more informative commit
| messages.
|
| I do this myself, especially working on personal stuff, but
| writing commit messages that succinctly explain what each commit
| does is a good practice and gives a serious impression.
|
| I often find myself hacking away and periodically going back to
| flesh out messages using rebase.
| zanellia wrote:
| Thanks for the suggestion. Until now it's been a lot of
| discussion with friends and colleagues and much less actual
| collaboration on code writing - I might have drifted into bad
| practices.
| amkkma wrote:
| Regarding all the questions about Julia:
|
| There's ongoing work to reduce runtime dependencies of Julia (for
| example in 1.8, you can strip out the compiler and metadata), but
| then it's only approaching Go/Swift and other static languages
| with runtimes.
|
| Generating standalone runtime free LLVM is another path, that is
| actually already pretty mature as it's what is being done for the
| GPU stack.
|
| Someone just has to retarget that to cpu LLVM, and there's a
| start here: https://github.com/tshort/StaticCompiler.jl/issues/43
| zanellia wrote:
| That's quite cool. Maybe the whole thing can be rewritten in
| Julia too at some point. I just know too little about Julia to
| judge.
| amkkma wrote:
| Well IMO it can definitely be rewritten in Julia, and to an
| easier degree than python since Julia allows hooking into the
| compiler pipeline at many areas of the stack. It's lispy and
| built from the ground up for codegen, with libraries like
| (https://github.com/JuliaSymbolics/Metatheory.jl) that
| provide high level pattern matching with e-graphs. The
| question is whether it's worth your time to learn Julia to do
| so.
|
| You could also do it at the LLVM level:
| https://github.com/JuliaComputingOSS/llvm-cbe
|
| One cool use case is in
| https://github.com/JuliaLinearAlgebra/Octavian.jl which
| relies on LoopVectorization.jl to do transforms on Julia AST
| beyond what LLVM does. Because of that, Octavian.jl, a pure
| Julia linalg library, beats OpenBLAS on many benchmarks.
| cossatot wrote:
| I'm curious what an example use case is for scientific computing
| on an embedded device. Is this for real-time analysis on a data
| logger or something?
|
| Many of us think of clusters as high-performance scientific
| computing, which are about as far from embedded as it gets.
|
| Please note that I am not being snarky, just curious!
| zanellia wrote:
| Thanks for the question! My background is in numerical
| optimization for optimal control. Projects like this
| https://github.com/acados/acados motivated the development of
| prometeo. It's mostly about solving optimization problems as
| fast as possible to make optimal decisions in real-time.
| 4w4s wrote:
| Nice job! Is this aimed at single core/thread computations, or is
| the prometeo layer also a way to write basic parallel code in a
| more "user friendly" way?
| zanellia wrote:
| For the time being, it targets single core/thread applications
| only.
| fwsgonzo wrote:
| Do you have access to builtins and intrinsics? Are there any
| plans?
|
| The single threaded thing is not an issue because you can
| still call the same function on each CPU and use the CPU ID
| to target parts of the computation, like a compute kernel
| function.
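
The pattern described above (run the same single-threaded routine on every CPU and let a worker ID select the slice of work) can be sketched as follows. Python threads stand in for CPUs here purely for illustration. `chunk_bounds` and `kernel` are hypothetical names, and the computation is an arbitrary element-wise example rather than anything from prometeo or BLASFEO.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(worker_id, n_workers, n_items):
    """Half-open index range [start, stop) owned by one worker."""
    base, extra = divmod(n_items, n_workers)
    # The first `extra` workers each take one additional item.
    start = worker_id * base + min(worker_id, extra)
    stop = start + base + (1 if worker_id < extra else 0)
    return start, stop


def kernel(worker_id, n_workers, x, y, out):
    """Single-threaded routine: each caller touches only its own slice."""
    start, stop = chunk_bounds(worker_id, n_workers, len(x))
    for i in range(start, stop):
        out[i] = 2.0 * x[i] + y[i]


n_workers = 4
x = [float(i) for i in range(10)]
y = [1.0] * 10
out = [0.0] * 10

with ThreadPoolExecutor(max_workers=n_workers) as pool:
    futures = [pool.submit(kernel, w, n_workers, x, y, out) for w in range(n_workers)]
    for future in futures:
        future.result()  # re-raise any exception from a worker

print(out)
```

Because the slices are disjoint, the workers never write to the same element, so the kernel itself needs no locking or awareness of threads.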
| zanellia wrote:
| Intrinsics (or directly assembly) are used in BLASFEO
| (https://github.com/giaf/blasfeo), the linear algebra package used
| by prometeo. It would be cool to generate assembly directly
| for a few things, but that would require quite a bit of
| work!
| OulaX wrote:
| Each programming language has its purpose.
|
| C code is performant and that is a fact. Python code is not.
|
| When building mission critical systems why don't programmers just
| use C itself instead of coding in another programming language
| and having it transpiled for them? Why introduce such tools all
| the time?
|
| I am against this because the tools programmers use are becoming
| too bloated compared to 10-20 years ago.
|
| Want to build an Android App? Use Java/Kotlin.
|
| Want to build an iOS App? Use Swift.
|
| Want to build a Web App? Use a Single JS Framework (Why millions
| of frameworks?)
|
| Want to build a Windows Desktop App? Use C#.NET Either with
| WinForms or WPF.
|
| I really see tools and technologies coming up all the time to
| solve a problem that most of the time doesn't exist.
| Zababa wrote:
| > When building mission critical systems why don't programmers
| just use C itself instead of coding in another programming
| language and having it transpiled for them?
|
| Why C and not assembler?
|
| > Why introduce such tools all the time?
|
| C compilers are one of those tools.
| GekkePrutser wrote:
| It doesn't have to be production. Maybe it's for a research
| project where you just need the extra performance.
|
| Everything has a cost. This may not be ideal but learning to do
| C properly as an experienced Python dev will have a time cost
| as well. This may just be the best way to get from A to B.
|
| I remember when I did a one-off project with a PIC
| microcontroller. I only had an assembler and I spent 2 days
| getting nowhere.
|
| Then I found a C compiler and I had the whole thing running in
| 2 hours. The compiler turned out to be much more efficient in
| speed as well as code size than my hand-written assembler.
| zanellia wrote:
| I think many people who have at least once first prototyped a
| numerical algorithm in a high-level language (say Python,
| Julia, MATLAB?) and _then_ implemented it in C, can relate to
| the experience of transitioning from error messages of the
| type: "dimension mismatch for XYZ" to "segmentation fault".
| That's in my opinion a strong motivation to build tools that
| can automate certain parts of the development process.
|
| Writing C code directly is a good option, as long as your
| code is not too complex to develop, maintain and extend.
|
| And, again, Python here is intended to be the host language for
| an embedded domain specific language that gets compiled into C.
| It does not need to be efficient; it needs to be expressive and
| easy to analyse and transpile.
| adgjlsfhk1 wrote:
| Note that the whole point of Julia is that it saves you the
| rewrite. There is Julia code running on top supercomputers
| that gives speed competitive to C/C++/Fortran. You will have
| to put in some work to get Julia code to be that fast, but it
| is usually dramatically easier than a rewrite in a different
| language.
| zanellia wrote:
| It's not so easy to deploy an algorithm written in Julia on an
| embedded platform though, is it?
| adgjlsfhk1 wrote:
| Probably not :)
| fault1 wrote:
| yes, but "speed" in 'top supercomputers' is not "speed" in
| 'embedded systems'.
|
| I do think Julia potentially can crack this space, but the
| runtime at least historically has not been tailored for it.
|
| It does seem like Julia has become more modular lately
| especially being able to disconnect the JIT (or LLVM ORC).
| Hopefully you'll be able to either defatten or completely
| remove the runtime dependencies (à la Rust in no-std mode).
| Each of these is important for different use cases.
| packetlost wrote:
| You've _never_ been near a lab environment clearly. Python is a
| dominant language in university labs and runs a lot more
| real-time systems than you think. Grad students rarely have industry
| experience and don't necessarily have the know-how to write C
| code effectively, so it's a question of resources and
| ecosystem. Numpy, matplotlib, pandas, scikit, TensorFlow, etc.
| are all huge draws for the scientific and ML communities.
| klyrs wrote:
| The problem that this language solves is that it automatically
| sorts out the memory usage for you. That isn't a problem for
| me; I've been programming in C for decades. But it is a problem
| for most python programmers who don't have a lick of C
| experience, but want to get C performance. It drastically
| lowers the barrier of entry.
| zanellia wrote:
| For what it's worth, I have developed code for this kind of
| application exclusively in C for ~5 years (let's say 20% of
| my working time). I still think that debugging a segfault
| that you could have avoided is not very productive and that
| motivated me to look into possible alternatives.
| kevin_thibedeau wrote:
| 99% of the time a stack trace shows the culprit for a
| segfault straight away. No different than debugging Python.
| zanellia wrote:
| if we are arguing that implementing a numerical algorithm
| in C is as easy as implementing it in Python - I would
| disagree. But maybe I am just wrong :)
| klyrs wrote:
| FWIW, I almost always use valgrind before a debugger, when
| tracking down segfaults. It doesn't catch everything, but
| 90% of the time, it gets me to the right region of code in
| a single run.
| zanellia wrote:
| sure I use valgrind and gdb too - still hard to argue
| that a segfault is pleasant to debug though?
| klyrs wrote:
| Good good, just wanted to advocate for my favorite tool
| there. But, in my experience, segfaults are usually the
| easiest bugs to resolve. Unlike a sign error in my math,
| they're impossible to miss!
|
| That said, tooling to get rid of them entirely is not to
| be sneezed at :)
| GekkePrutser wrote:
| Yes and it will also prevent common memory management bugs
| that can lead to code injection.
| up6w6 wrote:
| Yes, I'm gonna talk about Julia...
|
| It's kind of sad how much effort is put into the creation of new
| Python compilers to make it slightly faster while the problem of
| compile latency that people hate about Julia is not tackled
| because of the lack of manpower to improve Julia's interpreter.
|
| https://youtu.be/IlFVwabDh6Q?t=2530 (tldr: The Julia interpreter
| is currently about 500x slower than JIT code and there is a lot
| of low-hanging fruit there that could easily give it a 10x
| speedup - this could make it more viable to switch between compiler
| and interpreter depending on the work)
| zanellia wrote:
| Personally, I think Julia is great - just don't know it well
| enough to write a package that takes Julia ASTs and generates C
| code from them :) There could totally be a Julia implementation
| of the main idea behind prometeo (Julia per se does not solve
| the problem that prometeo aims at solving).
| adgjlsfhk1 wrote:
| You can just use `@code_llvm` to generate LLVM code, or
| `@code_native` to generate assembly. Does that do what you
| need?
| zanellia wrote:
| hmm not sure, the compiled LLVM code would still depend on
| the runtime library?
| adgjlsfhk1 wrote:
| The LLVM code will only call into the runtime for
| allocation or dynamic dispatch, both of which are
| avoidable. Lots of real Julia code will never touch it.
| fault1 wrote:
| the problem with Julia in the use case of OP is really the fact
| that it is garbage collected (and perhaps also how its GC is
| tuned). You can work to eliminate allocations, but the memory
| determinism problem is more important in real time control and
| embedded systems. see for example, this video:
| https://www.youtube.com/watch?v=dmWQtI3DFFo
|
| It's kind of why C is still king in this space.
| loeg wrote:
| To get ahead of the obvious question I had and I'm sure others
| will, this is from the README:
|
| > Cython is a programming language whose goal is to facilitate
| writing C extensions for the Python language. In particular, it
| can translate (optionally) statically typed Python-like code into
| C code that relies on CPython. Similarly to the considerations
| made for Nuitka, this makes it a powerful tool whenever it is
| possible to rely on libpython (and when its overhead is
| negligible, i.e., when dealing with sufficiently large scale
| computations), but not in the context of interest here.
|
| I.e., it's a python-like DSL that does not depend on the Python
| runtime.
|
| Thanks for sharing OP, this is pretty cool.
| zanellia wrote:
| Right, that's indeed the main reason I could not simply use
| Cython or Nuitka (or Julia?). The Python runtime library will
| do all kinds of non real-time/embedded friendly operations such
| as garbage collections, memory allocation/de-allocation and so
| on, in the background.
| cycomanic wrote:
| How does it compare to pythran? Except for the fact that it's c
| and not c++?
| zanellia wrote:
| Not sure how easy it would be to make the code generated by
| Pythran standalone, i.e., no dependency on the Python runtime
| library. Any Pythran expert? :)
| cycomanic wrote:
| Pythran code is standalone, i.e. no dependency on the Python
| runtime AFAIK.
| zanellia wrote:
| It generates a Python extension, doesn't it? Would not know
| how to run it outside of Python.
| cycomanic wrote:
| It can generate python extensions, but doesn't have to,
| here is a blog post talking about using it to generate
| self contained c++ code by the author: https://serge-
| sans-paille.github.io/pythran-stories/pythran-...
|
| BTW very cool project nevertheless, just wanted to see
| the differences and parallels to pythran. There might even be
| room for collaboration on some features.
| throw5399375930 wrote:
| Great project, but terrible name, considering how popular
| Prometheus is.
| zanellia wrote:
| fair enough :p I might change it in the future.
___________________________________________________________________
(page generated 2021-11-17 23:01 UTC)