https://github.com/jzimmerman/langcc (License: Apache-2.0)
# langcc: A Next-Generation Compiler Compiler

langcc is a tool that takes the formal description of a language, in a standard BNF-style format, and automatically generates a compiler front-end, including data structure definitions for the language's abstract syntax trees (AST) and traversals, a lexer, a parser, and a pretty-printer.

langcc also serves as the companion software implementation to the following technical reports, which describe several innovations on the classic LR parsing paradigm:

* Zimmerman, Joe. Practical LR Parser Generation. arXiv, 2022.
* Zimmerman, Joe. langcc: A Next-Generation Compiler Compiler. arXiv, 2022.

langcc can be used as a replacement for the combination of lex and yacc (or flex and bison).
However, langcc provides many additional features, including:

* Automatic generation of AST data structures, via a standalone datatype compiler (datacc).
* Full LR parser generation as the default, rather than the more restrictive LALR.
* Clear presentation of LR conflicts via explicit "confusing input pairs", rather than opaque shift/reduce errors.
* Novel efficiency optimizations for LR automata.
* An extension of the LR paradigm to include recursive-descent (RD) parsing actions, resulting in significantly smaller and more intuitive automata.
* An extension of the LR paradigm to include per-symbol attributes, which are vital for the efficient implementation of many industrial language constructs.
* A general transformation for LR grammars (CPS), which significantly expands the class of grammars the tool can support.

Unlike previous compiler compilers, langcc is general enough to capture full industrial programming languages, for which existing parsers are typically written by hand or generated from procedural descriptions. Examples include Python 3.9.12 (`grammars/py.lang`) and Golang 1.17.8 (`grammars/go.lang`). In both cases, langcc automatically generates a parser that is faster than the standard library parser for each language.

In fact, the class of grammars supported by langcc is general enough that the tool is self-hosting: that is, one can express the "language of languages" in the "language of languages" itself, and use langcc to generate its own compiler front-end. We do this in the canonical implementation; see the files `bootstrap.sh` and `grammars/meta.lang` for more details.

langcc is a research prototype and has not yet been used extensively in production. However, we believe it is essentially stable and feature-complete, and can be used as a standalone tool to facilitate rapid exploration of new compilers and programming languages.
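To make concrete what the components of a compiler front-end (AST data structures, a lexer, a parser, and a pretty-printer) look like, here is a minimal hand-written sketch for sums and products of integers. It is not langcc output and does not use langcc's input format or API. Every name in it (`Expr`, `Token`, `lex`, `Parser`, `parse`, `pretty`) is hypothetical, and the parser is a plain recursive-descent one rather than the LR automaton langcc would generate.

```cpp
// Hand-written illustration only: NOT generated by langcc, and all names are hypothetical.
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// AST: a node is either an integer literal (op == '#') or a binary '+' / '*'.
struct Expr {
    char op;
    long value;                      // meaningful only for literals
    std::unique_ptr<Expr> lhs, rhs;  // meaningful only for binary nodes
};
using ExprPtr = std::unique_ptr<Expr>;

ExprPtr make_lit(long v) { return ExprPtr(new Expr{'#', v, nullptr, nullptr}); }
ExprPtr make_bin(char op, ExprPtr l, ExprPtr r) {
    return ExprPtr(new Expr{op, 0, std::move(l), std::move(r)});
}

// Lexer: kind is 'n' (number), one of "+*()", or '$' for end of input.
struct Token { char kind; long value; };

std::vector<Token> lex(const std::string& src) {
    std::vector<Token> out;
    for (std::size_t i = 0; i < src.size();) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isdigit(c)) {
            long v = 0;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i])))
                v = v * 10 + (src[i++] - '0');
            out.push_back(Token{'n', v});
        } else if (c == '+' || c == '*' || c == '(' || c == ')') {
            out.push_back(Token{static_cast<char>(c), 0});
            ++i;
        } else {
            throw std::runtime_error("unexpected character");
        }
    }
    out.push_back(Token{'$', 0});
    return out;
}

// Parser: recursive descent; '*' binds tighter than '+', both left-associative.
struct Parser {
    std::vector<Token> toks;
    std::size_t pos;
    explicit Parser(std::vector<Token> t) : toks(std::move(t)), pos(0) {}

    Token next() { return toks[pos++]; }
    char peek() const { return toks[pos].kind; }

    ExprPtr atom() {
        Token t = next();
        if (t.kind == 'n') return make_lit(t.value);
        if (t.kind == '(') {
            ExprPtr e = sum();
            if (next().kind != ')') throw std::runtime_error("expected ')'");
            return e;
        }
        throw std::runtime_error("expected number or '('");
    }
    ExprPtr product() {
        ExprPtr e = atom();
        while (peek() == '*') { next(); e = make_bin('*', std::move(e), atom()); }
        return e;
    }
    ExprPtr sum() {
        ExprPtr e = product();
        while (peek() == '+') { next(); e = make_bin('+', std::move(e), product()); }
        return e;
    }
};

ExprPtr parse(const std::string& src) {
    Parser p(lex(src));
    ExprPtr e = p.sum();
    if (p.peek() != '$') throw std::runtime_error("trailing input");
    return e;
}

// Pretty-printer: emits parentheses only where the tree structure requires them.
std::string pretty(const Expr& e, int parent_prec = 0) {
    if (e.op == '#') return std::to_string(e.value);
    int prec = (e.op == '+') ? 1 : 2;
    std::string s = pretty(*e.lhs, prec) + " " + e.op + " " + pretty(*e.rhs, prec + 1);
    return prec < parent_prec ? "(" + s + ")" : s;
}
```

For instance, calling `pretty(*parse("(1+2)*3"))` round-trips the input through the lexer, parser, and printer. In a hand-written front-end each of these pieces must be kept in sync manually as the language evolves, which is the maintenance burden a generator working from a single grammar description is meant to remove.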
## Build

The build has been tested on Ubuntu 22.04 and macOS 12.5, but should also run on some other versions of Ubuntu and macOS with minor adaptations.

For Ubuntu 22.04:

    ./deps_ubuntu.sh
    make -j8
    sudo make install

For macOS 12.5 (requires Homebrew):

    ./deps_macos.sh
    make -j8
    sudo make install

And, in order to bootstrap the langcc front-end itself, subsequently run:

    ./bootstrap.sh

## Examples

Once langcc (and its companion, datacc) have been installed, one can run various examples:

* In the `examples` directory, there are two examples: `basic` and `calc`. Each has its own local Makefile.
* The main build process itself compiles `grammars/py.lang` and `grammars/go.lang`, producing tests `build/go_standalone_test` and `build/py_standalone_test`. (Note: These binaries require, respectively, repositories for Golang 1.17.8 located in the directory `../go`, and Python 3.9.12 located in the directory `../cpython`.)
* There is the language of datatypes, `grammars/data.lang`, which describes the input of the additional standalone tool datacc (used by langcc to automatically generate C++ implementations of algebraic datatypes).
* Finally, there is the language of languages itself, `grammars/meta.lang`. This language also serves as basic documentation, as it enumerates all of its own features.

## Documentation

For full documentation, see the accompanying technical report:

* Zimmerman, Joe. langcc: A Next-Generation Compiler Compiler. arXiv, 2022.

as well as the theoretical development:

* Zimmerman, Joe. Practical LR Parser Generation. arXiv, 2022.