[HN Gopher] Ask HN: LLVM vs. C?
       ___________________________________________________________________
        
       Ask HN: LLVM vs. C?
        
       What does LLVM have that cannot be achieved 1 to 1 in C? And what
       does C have that cannot be reproduced 1 to 1 in LLVM?  And when I
       say C I mean tcc, sdcc, gcc, clang and other C compilers.  It feels
       to me like C should be superior to LLVM and allow one to do
       anything that is possible with LLVM, but maybe I'm wrong? (I'm
       asking because of all the buzz about zig going for C as its main
       target.)  I would appreciate concrete, down-to-earth answers, not
       a general sense of how things work or generalizations, as those
       would only add noise to the comparison.
        
       Author : danielEM
       Score  : 16 points
       Date   : 2023-06-30 16:56 UTC (6 hours ago)
        
       | DamonHD wrote:
       | https://stackoverflow.com/questions/10264635/compiler-output...
       | 
       | Gives some good reasons. I just searched for LLVM vs C.
        
         | danielEM wrote:
         | Well, I did that search before coming here with my question,
         | but apart from one or two good arguments, I found the remaining
         | ones simply untrue or very opinionated. And my questions are
         | very specific.
        
           | KerrAvon wrote:
           | Your questions are not well founded. You're both asking for
           | people to compare two things that aren't comparable _and_
           | you're admonishing people not to explain why they're not
           | comparable.
        
       | distcs wrote:
       | Great question. I'd like to know too! I always thought C is a
       | better target for compiler makers because every target
       | architecture has a C compiler for it. The number of targets
       | that LLVM supports is far less impressive. Also, a ton of work
       | has gone into optimizing C compilers for any architecture you
       | can imagine. So why do compiler makers generally target LLVM
       | instead of C?
        
         | shortrounddev2 wrote:
         | I don't know LLVM, but I would assume that LLVM is more rigidly
         | defined and has less UB and more explicitly defined outcomes
         | (e.g. less implementation-defined behavior).
        
       | KerrAvon wrote:
       | Your questions don't make sense. LLVM is a framework for building
       | compilers and tools. C is a language specification.
        
         | latenightcoding wrote:
         | It does make sense: your language can target C or LLVM. Each
         | option comes with pros and cons, e.g. C is more familiar, LLVM
         | has better support for JIT compilation.
        
       | dleslie wrote:
       | So let's assume we mean C as a target, in the way one might
       | target LLVM IR. What LLVM IR offers over C11 or C2x is a great
       | deal more tools to describe to the compiler how one might engage
       | in micro-optimizations.
       | 
       | I.e., linkage[0], lifetime[1], and ordering[2] information can be
       | critical to delivering performance.
       | 
       | Now that said, GCC is _pretty damned amazing_ at squeezing out
       | this information from C, and it's available for use just about
       | everywhere. But it's good at figuring it out for _hand-written_
       | C. The sort of C that's generated as a compilation output from
       | other languages, like Chicken Scheme and Nim, may not be written
       | in a way that allows the C compiler to fully take advantage of
       | its optimization abilities.
       | 
       | What I find with generated C is that it often is prone to blowing
       | the stack or missing cache far more than hand-written C would.
       | There's often something funky happening (Cheney-on-the-MTA) or
       | it's a soup of function pointers and object references scattered
       | haphazardly across memory.
       | 
       | 0: https://llvm.org/docs/LangRef.html#linkage-types
       | 
       | 1: https://llvm.org/docs/LangRef.html#object-lifetime
       | 
       | 2: https://llvm.org/docs/LangRef.html#memory-model-for-
       | concurre...
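
For comparison, here is a minimal sketch of how much of that information standard C11 can spell out. It assumes a C11 compiler with <stdatomic.h>; the names payload, ready, publish and try_consume are made up for illustration. `static` gives internal linkage (roughly LLVM IR's `internal`), and the explicit memory orders cover release/acquire. C has no direct spelling for some of the IR-level items, such as LLVM's `unordered` atomic ordering or explicit lifetime markers.

```c
#include <assert.h>
#include <stdatomic.h>

/* `static` at file scope gives internal linkage: the symbols are not
   visible to other translation units, so the compiler may assume it
   sees every use of them. */
static int payload;
static atomic_int ready;   /* zero-initialized: "nothing published yet" */

/* Writer: store the data, then publish it with release ordering so the
   plain store to `payload` cannot be reordered after the flag store. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Reader: an acquire load of the flag pairs with the release store above.
   Returns 1 and fills *out if a value has been published, 0 otherwise. */
int try_consume(int *out)
{
    if (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        return 0;
    *out = payload;
    return 1;
}

/* Single-threaded usage example; in real use the two calls would
   run on different threads. */
void example_usage(void)
{
    int v = 0;
    publish(7);
    assert(try_consume(&v) == 1 && v == 7);
}
```

A code generator emitting C would have to choose these qualifiers and orderings itself; the C compiler cannot recover intent that the generator never wrote down.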
        
       | tibordp wrote:
       | You can go surprisingly far with C, though LLVM is probably a
       | better long-term option for a serious compiler, because it's a
       | tool made for the job (unless you target exotic and/or embedded
       | platforms that don't have LLVM support - but that's fairly
       | unlikely).
       | 
       | C is very easy to get started with if you don't already know
       | LLVM. You don't have to flatten everything to SSA + basic blocks
       | and can keep expressions as trees. The downside is that once your
       | compiler is reasonably complete, you may spend quite a bit of
       | time working around quirks of C (e.g. int promotion is very
       | annoying when you already have full type information, so your
       | compiler either has to understand C semantics fairly well or
       | defensively cast every subexpression).
       | 
       | I have a C backend in my compiler
       | (https://github.com/alumina-lang/alumina) and it works really
       | well, though the generated C is really ugly and assembly-like.
       | With #line directives, you can
       | also get source-level debugging (gdb/lldb) that just works out of
       | the box.
       | 
       | There are a few goodies that LLVM gives you that you don't get
       | with C, like coverage
       | (https://clang.llvm.org/docs/SourceBasedCodeCoverage.html). It
       | works when done through clang, but cannot easily be made to track
       | the original sources.
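
The int promotion quirk mentioned above can be sketched as follows. This is an illustration only, assuming a source language whose 8-bit unsigned arithmetic is meant to wrap modulo 256; the function names are hypothetical and not taken from any real backend.

```c
#include <assert.h>
#include <stdint.h>

/* Naive translation of a source-level `a + b` on 8-bit unsigned values.
   C promotes both operands to int, so the sum is computed at int width. */
int add_u8_naive(uint8_t a, uint8_t b)
{
    return a + b;              /* 200 + 100 gives 300, not an 8-bit value */
}

/* Defensive translation: cast the subexpression back to the source type,
   so the result is reduced modulo 256. */
uint8_t add_u8_cast(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);   /* 300 modulo 256 is 44 */
}

/* Comparisons are affected too: ~a is evaluated at int width. */
int not_is_zero_naive(uint8_t a)
{
    return ~a == 0;            /* for a == 0xFF, ~a is -256 as an int */
}

int not_is_zero_cast(uint8_t a)
{
    return (uint8_t)~a == 0;   /* for a == 0xFF, the cast result is 0 */
}

/* Usage example showing where the naive and defensive forms diverge. */
void example_usage(void)
{
    assert(add_u8_naive(200, 100) != add_u8_cast(200, 100));
    assert(not_is_zero_naive(0xFF) != not_is_zero_cast(0xFF));
}
```

A backend that already knows every subexpression's type can emit the cast form everywhere, which is the "defensively cast every subexpression" option; the alternative is to model C's promotion rules and emit casts only where they change the result.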
        
       | coreyp_1 wrote:
       | Well, clang actually uses LLVM.
       | 
       | As I see it: LLVM is like a universal remote. It is a quasi-
       | assembly representation that can then be further compiled to
       | multiple architectures. If you target LLVM, then you can compile
       | to anything that LLVM can compile to. To support an additional
       | architecture, all that needs to happen is for LLVM to gain a
       | backend for it, and then every project that uses LLVM will
       | support that architecture.
       | 
       | C, on the other hand, relies on a compiler. Because a compiler
       | may be specifically designed for one architecture, it may
       | (although it is not guaranteed) generate better binaries than
       | LLVM.
       | 
       | Because C can be compiled using clang, which uses LLVM, there is
       | nothing that C can do that LLVM cannot.
       | 
       | It may, however, be possible to produce opcode sequences using
       | LLVM that are not possible in C, since C imposes semantics and
       | structure on a program (syntax).
       | 
       | Lastly, it may be a simple question of resources. Is it easier to
       | find people who know C, or people who know LLVM? Is it easier to
       | set up a toolchain that compiles C, or one that compiles LLVM?
       | Historically speaking, which one has been around longer? (C, of
       | course.) Do the semantics of the source language closely match
       | the available C paradigms? Which is more stable?
        
       | jcranmer wrote:
       | > What does LLVM have that can not be achieved 1 to 1 in C?
       | 
       | Do you want to have signed integer overflow not be undefined
       | behavior? Sorry, you can't do that in C. Do you want to support
       | SIMD vector types? Oops, no support in C. Underaligned memory
       | accesses? Packed structures? Support for handling unwind
       | structures? Coroutines?
       | 
       | > It feels to me like C should be superior to LLVM and allow to
       | do anything that is possible to do with LLVM, but maybe I'm
       | wrong?
       | 
       | Very much the inverse: as LLVM is used to implement a C compiler,
       | LLVM can faithfully reproduce all of the C semantics, but C
       | cannot be used to implement all of the LLVM features. Even if you
       | pretend C doesn't have undefined behavior, and even if you
       | include compiler extensions as part of C, there are still a few
       | LLVM instructions and intrinsics that just don't exist in C
       | (chief among them is invoke, which is used to implement C++
       | exception handling).
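
To make the signed-overflow point concrete: C's built-in `+` on signed integers is undefined on overflow, whereas LLVM's plain `add` (without the `nsw` flag) wraps. A backend emitting C therefore cannot use `a + b` directly when the source language wants wrapping. One possible workaround, sketched here under the assumption that the platform provides the exact-width types from <stdint.h>, is to do the arithmetic in unsigned and copy the bits back. The helper name wrapping_add_i32 is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two's-complement wrapping addition without signed overflow.
   Signed-to-unsigned conversion and unsigned addition are both fully
   defined (modulo 2^32); memcpy moves the bits back without relying on
   an implementation-defined unsigned-to-signed conversion. */
int32_t wrapping_add_i32(int32_t a, int32_t b)
{
    uint32_t usum = (uint32_t)a + (uint32_t)b;
    int32_t result;
    memcpy(&result, &usum, sizeof result);
    return result;
}

/* Usage example: the largest value plus one wraps to the smallest. */
void example_usage(void)
{
    assert(wrapping_add_i32(INT32_MAX, 1) == INT32_MIN);
}
```

This recovers the semantics for one operation, but the generator has to route every signed operation through such helpers, and it does nothing for features like invoke that have no C counterpart at all.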
        
       ___________________________________________________________________
       (page generated 2023-06-30 23:00 UTC)