[HN Gopher] Ask HN: LLVM vs. C?
___________________________________________________________________
Ask HN: LLVM vs. C?
What does LLVM have that can not be achieved 1 to 1 in C? And what
does C have that can not be reproduced 1 to 1 in LLVM? And when I
say C I mean tcc, sdcc, gcc, clang and other C compilers. It feels
to me like C should be superior to LLVM and allow to do anything
that is possible to do with LLVM, but maybe I'm wrong? (I'm asking
because of all the "buzz" about Zig going for C as its main
target.) I would appreciate concrete, down-to-earth answers rather
than a general sense of how things work or generalizations, since
those add unnecessary noise when judging the comparison.
Author : danielEM
Score : 16 points
Date : 2023-06-30 16:56 UTC (6 hours ago)
| DamonHD wrote:
| https://stackoverflow.com/questions/10264635/compiler-output...
|
| Gives some good reasons. I just searched for LLVM vs C.
| danielEM wrote:
| Well, I did that search before coming here with my question.
| Apart from one or two well-reasoned answers, I found the rest
| either simply untrue or highly opinionated. And my questions
| are very specific.
| KerrAvon wrote:
| Your questions are not well founded. You're both asking for
| people to compare two things that aren't comparable _and_
| you're admonishing people not to explain why they're not
| comparable.
| distcs wrote:
| Great question. I'd like to know too! I always thought C was a
| better target for compiler makers because every target
| architecture has a C compiler for it. The number of targets
| that LLVM supports is far less impressive. Also, a ton of work
| has gone into optimizing C compilers for any architecture you
| can imagine. So why do compiler makers generally target LLVM
| instead of C?
| shortrounddev2 wrote:
| I don't know LLVM, but I would assume that LLVM is more rigidly
| defined, with less UB and more explicitly defined outcomes
| (e.g. less implementation-defined behavior).
| KerrAvon wrote:
| Your questions don't make sense. LLVM is a framework for building
| compilers and tools. C is a language specification.
| latenightcoding wrote:
| It does make sense: your language can target C or LLVM. Each
| option comes with pros and cons, e.g. C is more familiar and
| LLVM has better support for JIT compilation.
| dleslie wrote:
| So let's assume we mean C as a target, in the way one might
| target LLVM IR. What LLVM IR offers over C11 or C17 is a great
| deal more tools for describing to the compiler how it might
| engage in micro-optimizations.
|
| I.e., linkage[0], lifetime[1], and ordering[2] information can
| be critical to delivering performance.
|
| Now that said, GCC is _pretty damned amazing_ at squeezing out
| this information from C, and it's available for use just about
| everywhere. But it's good at figuring it out for _hand-written_
| C. The sort of C that's generated as a compilation output from
| other languages, like Chicken Scheme and Nim, may not be written
| in a way that allows the C compiler to fully take advantage of
| its optimization abilities.
|
| What I find with generated C is that it is often far more prone
| to blowing the stack or missing the cache than hand-written C.
| There's often something funky happening (Cheney-on-the-MTA), or
| it's a soup of function pointers and object references scattered
| haphazardly across memory.
|
| 0: https://llvm.org/docs/LangRef.html#linkage-types
|
| 1: https://llvm.org/docs/LangRef.html#object-lifetime
|
| 2: https://llvm.org/docs/LangRef.html#memory-model-for-
| concurre...
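
A rough illustration of the ordering and linkage points above, as a minimal sketch rather than anything posted in the thread. C11's <stdatomic.h> lets generated C state relaxed, acquire, and release orderings explicitly, and `static` gives a symbol internal linkage. LLVM IR goes further in places: for example, it has an `unordered` atomic ordering with no C11 counterpart. The names `payload`, `ready`, `publish`, and `try_consume` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* `static` gives these objects internal linkage: they are invisible
 * to other translation units, which the C compiler can exploit. */
static atomic_int payload;
static atomic_bool ready;

/* Writer side: the release store on `ready` makes the earlier
 * relaxed store to `payload` visible to any thread that later
 * observes `ready == true` with an acquire load. */
void publish(int value) {
    atomic_store_explicit(&payload, value, memory_order_relaxed);
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: returns false if nothing has been published yet,
 * otherwise writes the published value to *out and returns true. */
bool try_consume(int *out) {
    if (!atomic_load_explicit(&ready, memory_order_acquire)) {
        return false;
    }
    *out = atomic_load_explicit(&payload, memory_order_relaxed);
    return true;
}
```

A frontend emitting C has to map its own memory model onto these C11 orderings, whereas one emitting LLVM IR attaches the ordering directly to each load and store instruction.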
| tibordp wrote:
| You can go surprisingly far with C, though LLVM is probably a
| better long-term option for a serious compiler, because it's a
| tool made for the job (unless you target exotic and/or embedded
| platforms that don't have LLVM support - but that's fairly
| unlikely).
|
| C is very easy to get started with if you don't already know
| LLVM. You don't have to flatten everything to SSA + basic blocks
| and can keep expressions as trees. The downside is that once your
| compiler is reasonably complete, you may spend quite a bit of
| time working around quirks of C (e.g. int promotion is very
| annoying when you already have full type information, so your
| compiler either has to understand C semantics fairly well or
| defensively cast every subexpression).
|
| I have a C backend in my compiler
| (https://github.com/alumina-lang/alumina) and it works really
| well, though the generated C is really ugly and assembly-like.
| With #line directives, you can also get source-level debugging
| (gdb/lldb) that just works out of the box.
|
| There are a few goodies that LLVM gives you that you don't get
| with C, like coverage
| (https://clang.llvm.org/docs/SourceBasedCodeCoverage.html).
| Coverage works when the generated C is compiled with clang, but
| it cannot easily be made to track the original sources.
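
To make the integer-promotion and #line points above concrete, here is a minimal sketch of what generated C might look like. It assumes a hypothetical source language whose 8-bit unsigned addition wraps at 8 bits. It is not taken from the Alumina backend, and the function names and the file name `example.src` are made up for illustration.

```c
#include <stdint.h>

/* Naive translation of the source expression `(a + b) / 2` on u8.
 * C promotes both operands to int before adding, so the sum does
 * not wrap at 8 bits as the source language (by assumption) wants. */
uint8_t avg_naive(uint8_t a, uint8_t b) {
    return (a + b) / 2;
}

/* Defensive translation: cast each subexpression back to the source
 * type, so the generator needs no knowledge of C's promotion rules. */
uint8_t avg_wrapping(uint8_t a, uint8_t b) {
    return (uint8_t)((uint8_t)(a + b) / (uint8_t)2);
}

/* A #line directive tells the C compiler (and so the debug info)
 * that the next line corresponds to line 10 of `example.src`. */
#line 10 "example.src"
int source_line(void) { return __LINE__; }
```

With a = 200 and b = 100, `avg_naive` returns 150 because the sum 300 is computed in int, while `avg_wrapping` returns 22 because the sum is first truncated to 44. `source_line` returns 10, the line number that a debugger would also report for that function.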
| coreyp_1 wrote:
| Well, clang actually uses LLVM.
|
| As I see it: LLVM is like a universal remote. It is a
| quasi-assembly representation that can then be further compiled
| for multiple architectures. If you target LLVM, then you can
| compile to anything that LLVM can compile to. To support an
| additional architecture, all that needs to happen is for that
| architecture to gain an LLVM backend, and then every project
| that uses LLVM will support it.
|
| C, on the other hand, relies on a compiler. Because a compiler
| may be specifically designed for one architecture, it may
| (although it is not guaranteed) generate better binaries than
| LLVM.
|
| Because C can be compiled using clang, which uses LLVM, there
| is nothing that C can do that LLVM cannot.
|
| It may, however, be possible to produce opcode sequences using
| LLVM that are not possible in C, since C imposes semantics and
| structure on a program (syntax).
|
| Lastly, it may be a simple question of resources. Is it easier to
| find people who know C, or people who know LLVM? Is it easier to
| set up a toolchain that compiles C, or one that compiles LLVM?
| Historically speaking, which one has been around longer? (C, of
| course.) Do the semantics of the source language closely match
| the available C paradigms? Which is more stable?
| jcranmer wrote:
| > What does LLVM have that can not be achieved 1 to 1 in C?
|
| Do you want to have signed integer overflow not be undefined
| behavior? Sorry, you can't do that in C. Do you want to support
| SIMD vector types? Oops, no support in C. Underaligned memory
| accesses? Packed structures? Support for handling unwind
| structures? Coroutines?
|
| > It feels to me like C should be superior to LLVM and allow to
| do anything that is possible to do with LLVM, but maybe I'm
| wrong?
|
| Very much the inverse: as LLVM is used to implement a C compiler,
| LLVM can faithfully reproduce all of the C semantics, but C
| cannot be used to implement all of the LLVM features. Even if you
| pretend C doesn't have undefined behavior, and even if you
| include compiler extensions as part of C, there are still a few
| LLVM instructions and intrinsics that just don't exist in C
| (chief among them is invoke, which is used to implement C++
| exception handling).
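
On the signed-overflow point above: in LLVM IR, an `add` without the `nsw` flag simply wraps, while signed overflow in C's `+` is undefined behavior. A frontend that emits C usually has to route every signed operation through a workaround such as the one sketched below. This is a minimal sketch, not something from the thread. The function name is hypothetical, and the conversion back to `int32_t` is implementation-defined in ISO C; the sketch assumes the modulo 2^32 behavior that GCC documents.

```c
#include <stdint.h>

/* Wrapping 32-bit signed addition without undefined behavior.
 * Unsigned arithmetic wraps modulo 2^32 by definition. Converting
 * the result back to int32_t is implementation-defined in ISO C;
 * this assumes the compiler reduces it modulo 2^32 (as GCC does). */
int32_t wrapping_add_i32(int32_t a, int32_t b) {
    return (int32_t)((uint32_t)a + (uint32_t)b);
}
```

Under that assumption, `wrapping_add_i32(INT32_MAX, 1)` yields `INT32_MIN` instead of invoking undefined behavior, at the cost of making every generated arithmetic expression noisier.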
___________________________________________________________________
(page generated 2023-06-30 23:00 UTC)