[HN Gopher] Lfortran: Modern interactive LLVM-based Fortran comp...
___________________________________________________________________
Lfortran: Modern interactive LLVM-based Fortran compiler
Author : zaikunzhang
Score : 124 points
Date : 2023-08-28 15:00 UTC (7 hours ago)
(HTM) web link (lfortran.org)
(TXT) w3m dump (lfortran.org)
| pseudosavant wrote:
| Was I the only one hoping that Lfortran was something for music
| synthesis (LFOs...) using Fortran? I can't not see LFO - like
| that keming joke.
| ruste wrote:
| This is probably fantastic from a maintainability perspective,
| but I'm curious if some performance is left on the table by using
| LLVM IR instead of compiling directly to machine code. I know
| there are a number of optimizations that can be made for Fortran
| that can't be made for C-like languages and I wonder if some of
| those C-like assumptions are implicitly encoded in the IR.
| certik wrote:
| The original author of LFortran. Great question.
|
| We designed LFortran to first "raise" the AST (Abstract Syntax
| Tree) to ASR (Abstract Semantic Representation). The ASR keeps
| all the semantics of the original code, but it is otherwise as
| abstract/simple as possible. Thus by definition it allows us to
| do any optimization possible, as ASR->ASR optimization pass. We
| do some already, we will do many more in the future. This
| optimizes all the things where you need to know the high level
| information about Fortran. Then once we can't do any more
| optimizations, we lower to LLVM. If in the future it turns out
| we need some representation between ASR and LLVM, such as MLIR,
| we can add it.
|
| We also have a direct ASR->WASM and WASM->x64 machine code, and
| even direct ASR->machine code, but the ASR->LLVM backend is the
| most advanced, after that probably our ASR->C backend and after
| that our ASR->WASM backend.
| throwaway17_17 wrote:
| How does LLVM cope with the array semantics? I was under the
| impression that the noalias attribute in the IR was not
| activated in such a way as to enable the optimizations that
| make Fortran so fast.
| certik wrote:
| My experience with LLVM so far has been that it is possible to
| get maximum speed as long as we generate the correct and
| clean LLVM IR, and do many of the high level optimizations
| ourselves.
|
| If LLVM has any downsides, it is that it is hard to run in
| the browser, so we don't use it for
| https://dev.lfortran.org/, and that it is slow to compile
| (both LLVM itself, as well as it makes LFortran slow to
| compile, compared to our direct WASM/x64 backends). But
| when it comes to runtime performance of the generated code,
| LLVM seems very good.
| galangalalgol wrote:
| Rust drove the fixes needed in llvm to support noalias.
| They went through a couple reverts before seemingly fixing
| everything. If lfortran emits noalias, llvm can probably
| handle it now.
| csjh wrote:
| WASM -> x64 as in a full WASM AOT compiler? Don't those
| already exist? What's the benefit to making one specifically
| for LFortran? Unless I'm misunderstanding
|
| Super cool stuff though
| certik wrote:
| Yes, we could make the WASM->x64 standalone. The main
| motivation is speed of compilation. We do not do any
| optimizations, but we want to generate the x64 binary as
| quickly as possible, with the idea that it would be used in
| Debug mode, for development. Then for Release mode you
| would use LLVM, which is slow to compile but gives good runtime
| performance. And since we already have the ASR->WASM backend
| (used for example at https://dev.lfortran.org/),
| maintaining WASM->x64 is much simpler than ASR->x64
| directly.
| konradha wrote:
| LFortran is not necessarily using LLVM IR to compile. It's
| building up an ASR [1] structure that's already being used in
| the LFortran-specific backends. Potentially it can make full
| use of Fortran semantics!
|
| [1] https://docs.lfortran.org/en/design/
| zik wrote:
| > there are a number of optimizations that can be made for
| Fortran that can't be made for C-like languages
|
| That used to be true a long time ago but since the restrict
| keyword was introduced in C99 it's not really true any more.
| leephillips wrote:
| As an indirect answer, consider Julia, which is based on LLVM
| and seems to be competitive with Fortran on large scale,
| numerically intensive calculations.
| queuebert wrote:
| Can Julia pre-compile to a binary executable now? If not,
| they can't replace Fortran.
| uoaei wrote:
| https://docs.juliahub.com/PackageCompiler/MMV8C/1.2.1/devdoc...
| krestomantsi wrote:
| That is not a true binary. Making julia truly compile
| into binaries is now the number 1 goal of the language
| according to Tim Holy and the julia team.
| certik wrote:
| LFortran can translate your Fortran code to Julia via our
| Julia backend. Once Julia can compile to a binary, it
| will be exciting to do some comparisons, like speed of
| compilation and performance of the generated binary. As
| well as the quality of the Julia code that we generate,
| we'll be happy to improve it to create canonical Julia
| code, if at all possible.
| tombert wrote:
| As someone who's only played with Fortran, and never done
| anything too serious with it, can you explain an optimization
| that can be done in Fortran that can't be done in a C-like
| language?
|
| I'm not being argumentative, I'm actually really curious.
| bogeholm wrote:
| Here's a link to a StackOverflow answer that gives a good
| example: "Is Fortran easier to optimize than C for heavy
| calculations?" [0]
|
| [0]: https://stackoverflow.com/questions/146159/is-fortran-easier...
| pklausler wrote:
| The most significant distinction is that dummy arguments in
| Fortran can generally be assumed by an optimizer to be free
| of aliasing, when it matters. Modifications to one dummy
| argument can't change values read from another, or from
| global data. So a loop like subroutine foo(a,
| b, n) integer n real a(n), b(n) do j
| = 1, n a(j) = 2 * b(j) end do end
|
| can be vectorized with no concern about what might happen if
| the `b` array shares any memory with the `a` array. The
| burden is on the programmer to not associate these dummy
| arguments on a call with data that violate this requirement.
|
| (This freedom from aliasing doesn't extend to Fortran's
| POINTER feature, nor does it apply to the ASSOCIATE
| construct, some compilers notwithstanding.)
| 3836293648 wrote:
| This can be done in C, but not C++, though in practice all
| C++ compilers support it. It's the `restrict` keyword
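
For comparison, here is a minimal C sketch of the `foo` loop from pklausler's comment using `restrict`. The function name `scale2` and its parameter order are made up for illustration. As with Fortran dummy arguments, the compiler takes the no-overlap promise on trust, and the caller is responsible for keeping it.

```c
#include <stddef.h>

/* `restrict` tells the compiler that a and b do not overlap, which is
   the same promise Fortran makes for dummy arguments by default.
   Calling this with overlapping arrays is undefined behavior. */
void scale2(size_t n, float *restrict a, const float *restrict b)
{
    for (size_t j = 0; j < n; j++)
        a[j] = 2.0f * b[j];
}
```

Without the two `restrict` qualifiers, a C compiler has to assume that a store to `a[j]` might change a later `b[j]`, unless it can prove otherwise or guards the vectorized loop with a runtime overlap check.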
| jcranmer wrote:
| Fortran has true multidimensional arrays in a way that C
| doesn't have--if you know an array is 5x3, you know that A[6,
| 1] doesn't map to a valid element whereas in C, it does map
| to a valid element. This turns out to make a lot of loop
| optimizations easier. (Also, being Fortran, you tend to pass
| around arrays with size information anyways, which C doesn't
| do, since you typically just get pointers with C).
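
A rough illustration of the indexing point, written against a flat 5x3 array. The helper names `flat_offset` and `in_bounds` are hypothetical, and the sketch uses C's row-major layout, whereas Fortran is column-major. It only shows that flat pointer arithmetic lets an out-of-range index land on some other element, while a shape-aware view can reject it.

```c
#include <stddef.h>

enum { ROWS = 5, COLS = 3 };

/* Row-major offset of element (i, j) in a ROWS x COLS array stored
   flat. There is no bounds check: this is the pointer-arithmetic view. */
size_t flat_offset(size_t i, size_t j)
{
    return i * COLS + j;
}

/* Shape-aware view: returns 1 only when (i, j) names a real element. */
int in_bounds(size_t i, size_t j)
{
    return i < ROWS && j < COLS;
}
```

Here `flat_offset(0, 3)` and `flat_offset(1, 0)` give the same offset, so a compiler that only sees the flat arithmetic cannot assume that different index pairs refer to different elements. `in_bounds(0, 3)` is false, which is the extra knowledge a true multidimensional array gives an optimizer.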
| WanderPanda wrote:
| Is the size info compile-time or runtime in Fortran?
| certik wrote:
| It can be both. If you know the dimension at compile
| time, it is compile time, if you don't it will be
| runtime.
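
As a loose C analogy (not Fortran semantics), the two cases might look like this. `N_FIXED`, `sum_fixed` and `sum_runtime` are invented names. In Fortran the extent is part of the array itself, whereas in C it has to be a constant or a separate argument.

```c
#include <stddef.h>

enum { N_FIXED = 4 };  /* hypothetical extent known at compile time */

/* Compile-time extent: the trip count is a constant, so the compiler
   can unroll or vectorize against it directly. */
double sum_fixed(const double *a)
{
    double s = 0.0;
    for (size_t i = 0; i < N_FIXED; i++)
        s += a[i];
    return s;
}

/* Run-time extent: the size travels with the data as an argument. */
double sum_runtime(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}
```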
| certik wrote:
| A simple example is returning an allocatable array from a
| function, where the Fortran compiler can decide to allocate
| on a stack instead, or even inline the function and eliminate
| completely. While in C the compiler would need to understand
| the semantics of an allocatable array. If you use raw C
| pointer and malloc, and use Clang, my understanding is that
| Clang translates quite directly to LLVM and LLVM is too low
| level to optimize this out, depending on the details of how
| you call malloc.
|
| Of course, you can rewrite your C code by hand to generate
| the same LLVM code from Clang, as LFortran generates for the
| Fortran code. So in principle I think anything can be done in
| C, as anything can be done in assembly or machine code. But
| the advantage of Fortran is that it is higher level, and thus
| allows you to write code using arrays in a high level way and
| you do not have to do many special things as a programmer, and
| the compiler can then highly optimize your code. While in C
| very often you might need to do some of these optimizations
| by hand as a user.
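
To make the "by hand" part concrete, here is a small C sketch with two illustrative functions, `doubled_heap` and `doubled_into` (both names are assumptions). The first returns a freshly allocated array, as a straightforward translation of a function returning an allocatable array might. The second is the manual rewrite, where the caller provides the storage and no allocation happens. It does not show what LFortran or Clang actually emit.

```c
#include <stddef.h>
#include <stdlib.h>

/* Heap version: the result outlives the call, so the caller must free
   it. Returns NULL if the allocation fails. */
double *doubled_heap(const double *x, size_t n)
{
    double *y = malloc(n * sizeof *y);
    if (y == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        y[i] = 2.0 * x[i];
    return y;
}

/* Hand-optimized version: the caller supplies the storage (a stack
   array, for example), so there is no allocation at all. */
void doubled_into(const double *x, size_t n, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = 2.0 * x[i];
}
```

In the comment's argument, a Fortran compiler that understands allocatable results is free to make this change itself. In C, the programmer usually has to change the interface.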
| bee_rider wrote:
| I don't think such an optimization exists.
|
| The nice thing about Fortran is that it does the sensible
| thing by default for the type of scientific computing codes
| that are inside its wheelhouse (the trivial example, it
| assumes arguments don't alias by default).
|
| C can beat anything, assuming unlimited effort. Fortran is
| nice for scientists who want to write pretty good code. Or
| grad students who are working on dissertations in something
| other than hand-tuning kernels.
| queuebert wrote:
| This is the correct answer. They almost entirely compile to
| the same machine code for the computationally intensive
| parts. (Even Julia does that these days.) But the
| limitations of Fortran prevent a lot of difficult-to-debug
| C bugs, while not affecting typical scientific and
| numerical capability.
| sakras wrote:
| How does this compiler compare with Flang? I saw it shouted out
| on the main page but didn't really see any comparisons for why
| you'd pick one or the other.
| certik wrote:
| It's hard to have a fair comparison, and both compilers are
| also moving targets. I tried to do some comparison in a sibling
| comment: https://news.ycombinator.com/item?id=37300279
|
| The best is to mention both (as well as GFortran), and users
| can decide. For LPython (https://lpython.org/) we list all of
| the about 30 Python compilers at the webpage, but beyond that
| it's very hard to have a meaningful comparison.
| slavapestov wrote:
| > LFortran is structured around two independent modules, AST
| (Abstract Syntax Tree) and ASR (Abstract Semantic
| Representation), both of which are standalone (completely
| independent of the rest of LFortran) and users are encouraged to
| use them independently for other applications and build tools on
| top.
|
| Modern frontend architecture comes to Fortran! Awesome.
| cjohnson318 wrote:
| Is this the same outfit that did LPython? It looks like the
| same/similar web design.
| certik wrote:
| Yes, both LPython and LFortran are our two thin frontends to
| ASR (Abstract Semantic Representation). Not just the website is
| reused, but the internals are reused, so LPython runs your code
| at exactly the same speed as LFortran would, since ASR and all
| optimizations and backends are shared.
| dang wrote:
| Related ongoing thread:
|
| _Fortran_ - https://news.ycombinator.com/item?id=37291504 - Aug
| 2023 (193 comments)
|
| (current thread is better because less generic)
| fanf2 wrote:
| I wanted to see a comparison with flang, which I thought is the
| main LLVM Fortran front end.
| certik wrote:
| I am the original author of LFortran. What kind of comparison
| would you like to see? If you have any specific questions, I am
| happy to answer.
| fanf2 wrote:
| What are the relative strengths and weaknesses? If LFortran
| is newer, what problems with flang does it aim to address?
| How should someone choose between them, and how does the
| decision process change in different circumstances?
| certik wrote:
| There is old Flang, which motivated me to start LFortran.
| The new Flang, which presumably you are referring to,
| started possibly in the same month as LFortran, but we
| didn't know about each other.
|
| It's best if you ask Flang developers what they see as the
| advantages of Flang over LFortran. From my biased
| perspective, LFortran can run interactively, it is fast to
| compile the compiler (30s on my laptop) and LFortran
| compiles your code very quickly (especially with our direct
| x64 or C backends). It runs in a browser:
| https://dev.lfortran.org/. We have many backends (LLVM, C,
| C++, Julia, WASM, x64). We plan to add Python and Fortran
| (the latter could be used to modernize your old Fortran
| code). It is easy to add new backends, and it is also easy
| to add new frontends, so we have LPython and LFortran as
| two thin frontends, to our intermediate representation that
| we call ASR (Abstract Semantic Representation). The
| internal design is simple, so a small team can develop
| LCompilers at a fast pace. New contributors without any
| prior compiler experience get up to speed very quickly
| (typically a few weeks or even less).
|
| We are still in alpha, which means it is expected to break
| for your code (and when it does, please report all bugs!).
| To choose between them, I recommend testing them out and
| pick the one that you like the most, based on your
| criteria. Note that the most mature and widespread open
| source Fortran compiler is GFortran.
| gcr wrote:
| I misread LLVM as LLM and wondered what on earth language models
| have to do with Fortran compilation. (They don't. LLVM is a
| compiler framework.)
|
| Anyways, great work to the team! It's fun to see such a flurry of
| articles from the Fortran community today
| Alifatisk wrote:
| > I misread LLVM as LLM and wondered what on earth language
| models have to do with Fortran compilation.
|
| You gave me a good laugh this evening!
| certik wrote:
| Turns out Fortran is a great fit for LLM, I am not joking:
| https://github.com/certik/fastGPT/.
| Conscat wrote:
| I was talking to someone on Grinder yesterday who made the same
| mistake. I wonder how common it is to misread "LLVM" as "LLM".
| certik wrote:
| Thanks! I know, LLMs came much later than LLVM. But if you are
| interested in LLM that LFortran can compile, check out:
| https://github.com/certik/fastGPT/.
| csjh wrote:
| Would be cool for there to be a `llama2.f`, like
| https://github.com/karpathy/llama2.c, to demo its
| capabilities
| certik wrote:
| Yes indeed, we'll do it next (unless somebody beats me to
| it). First I am focusing on compiling fastGPT with
| LFortran, we can do it, but have a few workarounds that I
| want to fix. Then we'll do llama2.
___________________________________________________________________
(page generated 2023-08-28 23:00 UTC)