[HN Gopher] Writing a C Compiler (2017)
___________________________________________________________________
Writing a C Compiler (2017)
Author : lrsjng
Score : 100 points
Date : 2023-06-08 20:08 UTC (1 day ago)
(HTM) web link (norasandler.com)
(TXT) w3m dump (norasandler.com)
| dananjaya86 wrote:
| Book version to be released in October '23:
| https://nostarch.com/writing-c-compiler
| [deleted]
| userbinator wrote:
| IMHO writing a compiler for a high-level language, in an even
| higher level language, somehow feels a bit "anachronistic" (for
| lack of a better word).
| retrac wrote:
| Most of the current major C implementations are written in C++.
| userbinator wrote:
| That's unfortunate.
| WalterBright wrote:
| ImportC is written in D.
| golergka wrote:
| Weren't a lot of functional languages, like ML and its
| descendants, created specifically to write parsers and
| compilers?
| peterfirefly wrote:
| No. ML was the meta language for a theorem prover (LCF).
|
| https://en.wikipedia.org/wiki/Logic_for_Computable_Functions
| JonChesterfield wrote:
| I've seen it claimed that ML was originally written, in lisp,
| in order to have a better language to write compilers in.
| lispm wrote:
| ML was written as the language used by the theorem prover
| LCF. It was written in Lisp.
| [deleted]
| avgcorrection wrote:
| For what reason?
| userbinator wrote:
| It's backwards. Writing a C compiler in C or Asm makes sense,
| a Python compiler in C also does, but a C compiler in Python
| is an odd inversion of abstraction.
| avgcorrection wrote:
| I guess this harkens back to the days when you _had to_
| write a compiler in a low-level language because that's all
| that the platform that you are targeting supports. Then it
| sounds weird to talk about writing a compiler in a high-
| level language in order to target a low-level one, because
| surely these high-level languages are more platform-
| dependent than the blessed (guaranteed on the platform)
| low-level one.
|
| But these days we can access dozens of languages on many
| platforms. And we can use high-level languages that are
| _good_ for writing compilers--languages with good string
| types and algebraic data types--instead of being limited to
| awfully imperative/procedural ones.
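|
| To make the algebraic-data-type point concrete, here is a minimal
| sketch in Python (not taken from the article). The node classes
| Num, Var, Add, Mul and the fold function are hypothetical names
| picked for illustration, and a Union of dataclasses only
| approximates the sum types that ML-family languages offer
| directly.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical AST node types, one dataclass per variant.
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


# The "sum type": an expression is exactly one of these variants.
Expr = Union[Num, Var, Add, Mul]


def fold(e: Expr) -> Expr:
    """Constant-fold an expression tree bottom-up."""
    if isinstance(e, (Num, Var)):
        return e  # leaves are already as simple as they get
    left, right = fold(e.left), fold(e.right)
    if isinstance(left, Num) and isinstance(right, Num):
        if isinstance(e, Add):
            return Num(left.value + right.value)
        return Num(left.value * right.value)
    # At least one side is not a constant: rebuild the same node kind.
    return type(e)(left, right)


tree = Add(Var("x"), Mul(Num(2), Num(3)))
print(fold(tree))  # Add(left=Var(name='x'), right=Num(value=6))
```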
|
| In other words: your perspective sounds way more
| anachronistic.
| Jtsummers wrote:
| Why? The objective is to translate code in one language (C)
| to another (machine code or assembly or perhaps an
| intermediate representation). Why does it make sense to use
| C for that task and not Python or some other language? It's
| not like C provides facilities that specifically enable
| compiler writing or text parsing for itself that other
| languages are lacking.
| bigdict wrote:
| Has anyone worked through this? Is it a good (soon to be) book?
| e19293001 wrote:
| I owe my entire career to this remarkable individual who, despite
| never having met or being affiliated with, has profoundly
| influenced me through his insightful books. His vast knowledge
| and expertise have been instrumental in teaching me numerous
| technical concepts and skills throughout his publications.
|
| https://web.archive.org/web/20220519044634/http://cs.newpalt...
|
| Assembly Language and Computer Architecture Using C++ and Java,
| Course Technology, 2004
|
| This book has been an invaluable resource in enhancing my
| understanding of various technical aspects. It has provided me
| with in-depth insights into the inner workings of a CPU, enabling
| me to grasp the intricate mechanisms behind its operation.
| Additionally, the book has equipped me with the knowledge and
| skills necessary to write an assembler based on a given
| instruction set.
|
| Moreover, I have delved into the intricacies of assembly
| language, thanks to the comprehensive explanations and examples
| provided in the book especially the exercises. It has allowed me
| to truly comprehend the nuances of this low-level programming
| language and its interactions with hardware.
|
| Furthermore, the book has shed light on the fascinating process
| of how compilers generate assembly code, particularly in the
| context of object-oriented programming languages. By exploring
| this topic, I have gained a deeper understanding of the intricate
| steps involved in transforming high-level code into assembly
| instructions, thereby bridging the gap between software
| development and hardware execution.
|
| I also acquired another remarkable book authored by the same
| author:
|
| "Compiler Construction Using Java, JavaCC, and Yacc," published
| by IEEE/Wiley in 2012. This exceptional book has served as an
| invaluable guide in my journey of creating compilers,
| encompassing both theoretical foundations and practical
| implementation.
|
| One of the most remarkable aspects of this book is its
| comprehensive coverage of parsing techniques. It equipped me with
| the knowledge and skills to effectively parse regular
| expressions, enabling me to implement powerful features akin to
| those found in the widely used tool, grep. This aspect of the
| book has been particularly enlightening, and I consider it a
| significant contribution for anyone seeking to delve into the
| realm of compilers.
|
| Overall, I am deeply indebted to this book, and I wholeheartedly
| recommend it to anyone eager to explore the fascinating world of
| compiler construction. It has truly bridged the gap between
| theory and practical implementation, providing a solid foundation
| and equipping aspiring compiler developers with the essential
| tools and techniques required to embark on this captivating
| journey.
|
| I had some comments before regarding this author and his books
| about compilers and computer architecture all over HN as well.
| fuzztester wrote:
| Nice comment, GPT user.
|
| Now, GPT:
|
| Replace all occurrences of the substring "me" with "you" in the
| above comment text.
| jazzyjackson wrote:
| Rude accusation. GPT talks like the average internet
| commenter, it shouldn't be surprising to find a genuine
| comment written in a voice similar to GPT.
| fuzztester wrote:
| Okay, so now life is imitating art? Never knew ;-)
|
| No, you are being rude. To the average internet commenter.
| Or maybe to GPT. ;-)
|
| By bringing down either one of them to the level of the
| other.
|
| Unless you meant "average" in the same sense as this short
| tale:
|
| A statistician had his head in a fridge and his feet in an
| oven, and when asked how he felt, he said, "on the average,
| I feel quite comfortable".
| fuzztester wrote:
| GPT comments are genuine too. Don't hurt our soon-to-be-
| developed feelings! Sniff ...
| eesmith wrote:
| Huh. I don't get those vibes.
|
| Further investigation doesn't support your claim. The
| citations check out, including publication year and
| publishers.
|
| And the author has indeed praised the book many times before
| (https://news.ycombinator.com/item?id=31843833,
| https://news.ycombinator.com/item?id=31311613,
| https://news.ycombinator.com/item?id=28481028,
| https://news.ycombinator.com/item?id=23386732,
| https://news.ycombinator.com/item?id=22305353,
| https://news.ycombinator.com/item?id=21988211,
| https://news.ycombinator.com/item?id=21513056,
| https://news.ycombinator.com/item?id=18996703, and
| https://news.ycombinator.com/item?id=10184364 ) with the last
| comment from 2015.
|
| Eg, compare "I am deeply indebted to this book" with "I'm
| very debted to this man. I enjoyed a lot reading his books
| and made me who I am today." at
| https://news.ycombinator.com/item?id=28481028 from Sept 10,
| 2021.
|
| Or compare "I owe my entire career to this remarkable
| individual who, despite never having met or being affiliated
| with," with "I'm not affiliated with the author though. This
| book helped a lot in my career as a hardware and firmware
| engineer." at https://news.ycombinator.com/item?id=23386732
| from June 2, 2020.
|
| Or compare "enabling me to implement powerful features akin
| to those found in the widely used tool, grep." with similar
| comments over the last 8+ years, at
| https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que... ,
| like "and eventually
| write your own 'grep' which was for me is a mind-blowing
| experience" at https://news.ycombinator.com/item?id=13664714
| from Feb 16, 2017.
|
| And
| https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
| shows the OP citing http://cs.newpaltz.edu/~dosreist/
| while this comment uses the archive.org version because the
| old URL doesn't work.
| pharrington wrote:
| For what it's worth, ZeroGPT thinks the comment's a 25%/75%
| human/AI mix.
| eesmith wrote:
| For what it's worth, ZeroGPT thinks the first paragraph
| of your comment at
| https://news.ycombinator.com/item?id=35559453 was "Most
| Likely GPT generated" (25% written by a human, 100%
| generated by an AI/GPT).
|
| The entire comment was 75% human, 46% AI/GPT.
|
| I picked that comment because it had the longest text.
| pharrington wrote:
| Yeah that's fair.
|
| Edit: lmao playing around a bit, simply changing "it is"
| to "its" (no apostrophe) in the first sentence, and
| editing the second sentence to read "the problem's that
| people have to be" makes ZeroGPT no longer think my post
| was AI generated at all.
| fuzztester wrote:
| Your long, detailed, somewhat scholarly, well researched
| comment, leads us to think (after consulting several
| prestigious, highly intelligent, real and artificial
| professors), that you may be a suitable candidate for the
| first PhD program at the new international Global PHD
| Trainers Institute (iGPT Institute). We will shortly be
| sending you the long, formal and stilted application form,
| to which you must reply in the same way, but better, as the
| first test.
|
| All the best.
|
| Digitally signed, Your soon-to-be GPT overlords.
| Jtsummers wrote:
| Maybe not generated, but still a bizarre opening paragraph
| in context:
|
| > I owe my entire career to this remarkable individual who,
| despite never having met or being affiliated with, has
| profoundly influenced me through his insightful books. His
| vast knowledge and expertise have been instrumental in
| teaching me numerous technical concepts and skills
| throughout his publications.
|
| The individual they're referring to with "this remarkable
| individual" is _not_ Nora Sandler, the author of the
| submitted post, but Anthony J. Dos Reis who they repeatedly
| reference by allusion but never name. A confusing way to
| write.
| bigdict wrote:
| getting college essay vibes from this comment
| belter wrote:
| And ChatGPT vibes...
| [deleted]
| hcks wrote:
| Yet another "compiling" course that puts all the emphasis on
| parsing.
|
| Rule of thumb: parsing/lexing shouldn't take more than 10% of
| your compiler course.
| wasimanitoba wrote:
| anything better you'd recommend?
| marcosdumay wrote:
| On the other hand, parsing text could easily be a very valuable
| course on its own. You just have to not keep it restricted to
| programming languages, and include the knowledge created in this
| century.
| tester756 wrote:
| parsing is cool
| vector_spaces wrote:
| This attitude bugs me a lot. It seems really common, especially
| in more recent texts about language design and implementation,
| that parsing is heavily de-emphasized to the point where
| practically nobody talks about it. See Essentials of
| Programming Languages by Friedman & Wand, the relevant sections
| in SICP, Programming Languages: Application & Interpretation
| (which goes so far as to call it a distraction).
|
| I get that parsing is more of an implementation detail and
| doesn't really belong to the space-brained realm of language
| design per se, but it's a bit annoying that most texts refuse
| to give any space to the topic, and rely on your language being
| S-expression based or assume you're going to use a parser
| generator. Like, in the real world, even if one will never
| actually implement a fully-fledged programming language, you're
| still probably going to have to parse things sometimes. I would
| love a book that goes into detail about different parsing
| techniques and considers best practices and patterns and
| tradeoffs/design considerations -- would pay good money for
| that
|
| It reminds me somewhat of the situation in analysis, where
| there are lots of theorems that aren't written down anywhere
| because literally every book states them as "easy" exercises.
| Maybe I'm looking in the wrong places, but I can't find much in
| the way of concrete guidance on implementing parsers. I'm aware
| of the beautiful series on parsing theory by Aho & Ullman ("The
| Theory of Parsing, Translation, and Compiling"), but those are
| more focused on theory rather than implementation
| marssaxman wrote:
| > Like, in the real world, even if one will never actually
| implement a fully-fledged programming language, you're still
| probably going to have to parse things sometimes.
|
| That is definitely true, but in practice there isn't much to
| say about it, because sophisticated parsers turn out not to
| be particularly important; it works out better overall to
| design simple grammars, and then the parsing is easy.
|
| - If you're a beginner, you'll write a recursive descent
| parser, because that's the simplest technique, and it lets
| you focus on your project instead of a new, unfamiliar tool.
|
| - If you're writing a domain-specific language, or a config
| format, or something of that nature, you'll use whichever
| parser generator integrates most conveniently into your
| workflow, and you'll design your grammar around whatever its
| manual tells you to do.
|
| - If you're writing a full-scale language compiler, you'll go
| back to recursive descent, because that offers the easiest
| way to recover from errors and report informative messages.
| Maybe you'll throw in precedence-climbing for operators.
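|
| As a rough illustration of the recursive descent plus precedence
| climbing combination mentioned above, here is a minimal Python
| sketch for integer arithmetic. The names (PRECEDENCE, tokenize,
| Parser, parse) and the tuple-based tree are assumptions made for
| this example, not anything from the article or a real compiler.

```python
import re

# Binding power of each binary operator; a higher number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(src):
    """Split the source into integer, operator and parenthesis tokens."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character near offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def parse_primary(self):
        """primary := INTEGER | '(' expr ')'"""
        tok = self.next()
        if tok == "(":
            node = self.parse_expr(1)
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def parse_expr(self, min_prec=1):
        """Precedence climbing: consume operators that bind at least min_prec."""
        left = self.parse_primary()
        while True:
            op = self.peek()
            if op not in PRECEDENCE or PRECEDENCE[op] < min_prec:
                return left
            self.next()
            # Left-associative: the right operand must bind strictly tighter.
            right = self.parse_expr(PRECEDENCE[op] + 1)
            left = (op, left, right)


def parse(src):
    parser = Parser(tokenize(src))
    tree = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("1 + 2 * 3"))  # ('+', 1, ('*', 2, 3))
```

| Each SyntaxError is raised at the exact point where the parser
| knows what it expected, which is the error-reporting advantage
| described above.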
|
| > I would love a book that goes into detail about different
| parsing techniques and considers best practices and patterns
| and tradeoffs/design considerations -- would pay good money
| for that
|
| I would also read such a book, but it would be more of a book
| about parser generators than a book about parsers.
| cdcarter wrote:
| On the other hand, historically (and as the parent you're
| replying to points out), many compiler texts have spent a
| MAJORITY of their time on parsing, and rush through the
| actual interesting parts of compilation.
|
| > I would love a book that goes into detail about different
| parsing techniques and considers best practices and patterns
| and tradeoffs/design considerations -- would pay good money
| for that
|
| Terence Parr's "Language Implementation Patterns" spends
| quite a bit of time on parsing, and parse tree->AST
| conversions.
| vector_spaces wrote:
| Thanks for pointing that one out -- I had written that one
| off before as an ANTLR book but looks like it covers more
| material than I gave it credit for
| [deleted]
| hota_mazi wrote:
| I disagree.
|
| As opposed to most compiler articles, this one actually covers
| code generation for every section of its chapters, which is
| really great.
|
| I also like that every chapter focuses on a specific feature
| and describes how to implement it end to end: lexical/syntactic
| parsing, AST, and x86_64 generation.
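|
| For a sense of what "end to end" means for the smallest possible
| feature, here is a hedged Python sketch that takes a program of
| the form `int main() { return 2; }` through lexing, parsing to
| an AST, and x86_64 generation. It is not the series' code: the
| names lex, parse, generate, compile_c, Function and Return are
| hypothetical, the output is AT&T syntax, and the bare `main`
| label assumes a Linux-style target (macOS would need `_main`).

```python
import re
from dataclasses import dataclass

TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|\d+|[{}();])")


# Hypothetical AST: a function whose body is a single return statement.
@dataclass
class Return:
    value: int


@dataclass
class Function:
    name: str
    body: Return


def lex(src):
    """Turn source text into a flat list of token strings."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"bad character near offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(tokens):
    """Parse exactly: int <name> ( ) { return <integer> ; }"""
    it = iter(tokens)

    def expect(want=None):
        tok = next(it, None)
        if tok is None or (want is not None and tok != want):
            raise SyntaxError(f"expected {want or 'a token'}, got {tok!r}")
        return tok

    expect("int")
    name = expect()
    if not name.isidentifier():
        raise SyntaxError(f"expected a function name, got {name!r}")
    for want in ("(", ")", "{", "return"):
        expect(want)
    value = expect()
    if not value.isdigit():
        raise SyntaxError(f"expected an integer, got {value!r}")
    expect(";")
    expect("}")
    if next(it, None) is not None:
        raise SyntaxError("unexpected tokens after function body")
    return Function(name, Return(int(value)))


def generate(fn):
    """Emit AT&T-syntax x86_64 assembly that returns the constant in %eax."""
    return "\n".join([
        f"    .globl {fn.name}",
        f"{fn.name}:",
        f"    movl ${fn.body.value}, %eax",
        "    ret",
    ]) + "\n"


def compile_c(src):
    return generate(parse(lex(src)))


print(compile_c("int main() { return 2; }"))
```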
|
| Great series!
| munificent wrote:
| Almost all real-world projects that are language-like or
| compiler-like will need a parser. A much smaller fraction of
| them will need register allocation, instruction selection,
| optimization, code generation, etc.
|
| For every big, deep, native code compiler, there are a hundred
| template languages, config files, report generators, etc. all
| of which are real programs providing real value for actual
| people.
|
| Emphasizing parsing provides the most value for the greatest
| number of people. The folks that do end up needing more back
| end depth will still have the resources available to learn it.
| throwaway17_17 wrote:
| Do you have a 'best of' list of resources for someone interested
| in back-end topics?
| munificent wrote:
| I wouldn't consider myself any kind of authority on "best
| of", but I like the Dragon Book, and Engineering a
| Compiler. I've heard good things about Appel's Modern
| Compiler Implementation.
| WalterBright wrote:
| Parsing takes a weekend. The rest takes a year to get a
| rudimentary compiler working.
| RcouF1uZ4gsC wrote:
| Here is how to write a C compiler in Python that correctly
| compiles the vast majority of C programs per the ISO C standard:
|
|     print("You have some form of undefined behavior, which means "
|           "printing this is a valid response per the C standard")
| tialaramex wrote:
| Undefined Behaviour has to actually _happen_ , and so that
| means at runtime+, and thus what you wrote is not a valid C
| compiler.
|
| For C++ IFNDR ("Ill-formed, No diagnostic required") the
| situation is trickier because the affected programs (some
| unknowable but likely large proportion of all purported C++
| code) are not well-formed C++; the standard offers no hint as
| to what happens or why, since it constrains only the behaviour
| of a C++ compiler for well formed C++ programs.
|
| + It's possible the C lexer claims to have some "Undefined
| Behaviour" cases like the C++ lexer, hence P2621 "UB? In my
| lexer?" which is a reference to a 2005 meme because C++
| standards committee members are down with the kids, but that's
| clearly a standards text bug if so because it makes no sense to
| have UB in the lexer, these should just be ill-formed programs,
| you get a compiler error.
___________________________________________________________________
(page generated 2023-06-09 23:02 UTC)