[HN Gopher] How Janet's PEG module works
       ___________________________________________________________________
        
       How Janet's PEG module works
        
       Author : behnamoh
       Score  : 52 points
       Date   : 2025-04-11 02:04 UTC (3 days ago)
        
 (HTM) web link (bakpakin.com)
 (TXT) w3m dump (bakpakin.com)
        
       | mplanchard wrote:
       | I did Advent of Code in Janet the year before last I think, and
       | _really_ loved the PEG support. Essentially every day started out
       | with making a quick grammar to parse the problem into whatever
       | data structures I was using. It's intuitive, pretty easy to pick
       | up and adjust, and powerful.
        
         | 3036e4 wrote:
         | I did that as well, in 2023. Tried to use PEGs as much as
         | possible. Have only good memories of the PEGs, but have not had
         | much reason to use them since. Janet+PEG is definitely
         | something I will consider for future projects whenever I need
         | to parse something. Even for something that would otherwise be
         | just a small regular expression I think writing a PEG instead
         | may make some sense for readability.
        
         | wodenokoto wrote:
         | I had plans to try the same, but with Python background and
         | having never touched any lispy languages or any macros I found
         | the Janet for Mortals book surprisingly difficult to follow and
         | gave up.
        
       | norir wrote:
       | I am not a fan of PEG. It is straightforward to write a fast
       | parser generator for languages that require just one character of
       | lookahead to disambiguate any alternation in the grammar. This
       | gets you most of the expressivity of PEG and nearly optimal
       | performance (since you only need to look at one character to
       | disambiguate and there is no backtracking). Just as importantly,
       | it avoids the implicit ambiguities that PEG's resolution
       | algorithm can hide from the grammar author that lead to
       | unexpected parse results that can be quite difficult to debug
       | and/or fix in the grammar.
       | 
       | It does require a bit more thought to design an unambiguous
       | language but I think it's worth it. While there is a learning
       | curve for designing such languages, it becomes natural with
       | practice and it becomes hard to go back to ambiguous languages.
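
The "implicit ambiguity" described above can be illustrated with a minimal sketch of PEG-style ordered choice. The combinator names below (`lit`, `choice`, `seq`, `full_match`) are hypothetical helpers written for this illustration, not Janet's PEG API. The point is that ordered choice commits to the first alternative that succeeds, so an earlier alternative can silently shadow a later one.

```python
# Minimal PEG-style combinators (hypothetical names, not Janet's API).
# A parser takes (text, pos) and returns the new position, or None on failure.

def lit(s):
    """Match the literal string s at the current position."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def choice(*alts):
    """PEG ordered choice: commit to the first alternative that succeeds."""
    def parse(text, pos):
        for alt in alts:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return parse

def seq(*parts):
    """Match each part in order; fail if any part fails."""
    def parse(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def full_match(parser, text):
    """True only if the parser consumes the whole input."""
    return parser(text, 0) == len(text)

# "a" shadows "ab": the choice commits to "a" and never retries "ab",
# even though the overall sequence then fails on "abc".
shadowed = seq(choice(lit("a"), lit("ab")), lit("c"))
reordered = seq(choice(lit("ab"), lit("a")), lit("c"))

print(full_match(shadowed, "abc"))   # False
print(full_match(shadowed, "ac"))    # True
print(full_match(reordered, "abc"))  # True
print(full_match(reordered, "ac"))   # True
```

In a grammar where one character of lookahead must pick the alternative, `"a" / "ab"` would be rejected up front as a conflict, whereas the PEG version is accepted and simply never matches `"ab"` in that position.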
        
         | janzer wrote:
         | For those further interested in PEG vs LL(1) parsers: the first
         | few sections of the Python PEP[1] that switched CPython from an
         | LL(1) to a PEG parser have a nice short introduction to both
         | and the rationale for the switch.
         | 
         | [1] https://peps.python.org/pep-0617/
        
           | PaulHoule wrote:
           | It still seems to me the PEG revolution hasn't arrived.
           | 
           | PEG has the possibility for composable grammars (why not
           | smack some SQL code in the middle of Python?) but it needs a
           | few more affordances, particularly an easy way to handle
           | operator precedence.
           | 
           | I think current parser generators suck and that more
           | programmers would be using them if anybody cared about making
           | compiler technology easier to use but the problems are: (1)
           | people who understand compiler technology can get things done
           | with the awful tools we have now and (2) mostly those folks
           | think it is performance uber alles.
           | 
           | With the right tools the "Lisp is better because it is
           | homoiconic" argument would finally die. With properly
           | architected compilers, adding
           | 
           |     unless(X) { ... } -> if(!X) { ... }
           | 
           | to Java would take just one grammar production, one
           | transformation operator and _maybe_ a new class in the AST
           | (which might be codegenned), plus something to tell the
           | compiler where to find these things. Less code than the POM
           | file.
           | 
           | I gave up on Restructured text because it didn't support
           | unparsing: I could picture all kinds of scenarios where I'd
           | want to turn something else into RST or take RST and mix it
           | up against other data and turn it back to RST; RST had the
           | potential to work with or without a schema but it never got
           | realized.
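
The `unless(X) -> if(!X)` rewrite mentioned in the comment above can be sketched as a single AST transformation. This is only an illustration in Python rather than Java, and every class and function name here (`Var`, `Not`, `If`, `Unless`, `desugar`) is an assumption made for the example. It covers the "one transformation operator" and the "new class in the AST", but not the grammar production.

```python
from dataclasses import dataclass

# Hypothetical AST node classes for a tiny statement language.
@dataclass
class Var:
    name: str

@dataclass
class Not:
    operand: object

@dataclass
class If:
    cond: object
    body: list

@dataclass
class Unless:  # the one new AST class
    cond: object
    body: list

def desugar(node):
    """Rewrite every Unless(c, body) into If(Not(c), body), recursively."""
    if isinstance(node, Unless):
        return If(Not(desugar(node.cond)), [desugar(s) for s in node.body])
    if isinstance(node, If):
        return If(desugar(node.cond), [desugar(s) for s in node.body])
    if isinstance(node, Not):
        return Not(desugar(node.operand))
    return node  # leaves such as Var are returned unchanged

tree = Unless(Var("done"), [Unless(Var("paused"), [Var("step")])])
print(desugar(tree))
```

After this pass the later compiler stages only ever see `If` and `Not`, so nothing downstream needs to know that `unless` exists.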
        
             | behnamoh wrote:
             | > "Lisp is better because it is homoiconic"
             | 
             | - Lisp is better because it manipulates the same data that
             | the program code is represented in (car works on a data
             | list, and it works on a code list as well).
             | 
             | - Lisp is better (at least, Common Lisp) because of image-
             | and-REPL-driven development. Good luck finding exactly that
             | level of flexibility in other REPL-ful languages.
             | 
             | - Lisp is better because of hot code reloading and
             | restarts. Only Elixir/Erlang have a similar mechanism.
             | 
             | - Lisp is better because of structural editing (e.g.,
             | paredit). No more character-level editing.
             | 
             | I could go on but just wanted to point out that
             | homoiconicity isn't the entire deal with Lisp.
        
               | fuzztester wrote:
               | >> "Lisp is better because it is homoiconic"
               | 
               | >- Lisp is better because it manipulates the same data
               | that the program code is represented in (car works on a
               | data list, and it works on a code list as well).
               | 
               | Don't those two sentences mean the same?
               | 
               | https://news.ycombinator.com/item?id=43676798
        
               | behnamoh wrote:
               | Yeah, but I wanted to emphasize that homoiconicity isn't
               | just some superficial "nice thing" to have, it literally
               | is why we can have powerful macros in Lisp.
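
The "same data as the code" point can be approximated outside Lisp with a rough sketch that uses nested Python lists as the code representation. Everything here (`car`, `expand`, `evaluate`, and the tiny `if`/`not`/`unless` vocabulary) is a hypothetical toy made up for this illustration, not how any real Lisp is implemented. A macro is then just a function from a list to a list.

```python
import operator

def car(lst):
    """First element of a list, whether the list holds data or code."""
    return lst[0]

def expand(form):
    """Macro expansion: rewrite ["unless", c, x] into ["if", ["not", c], x]."""
    if not isinstance(form, list):
        return form
    form = [expand(f) for f in form]
    if form and form[0] == "unless":
        _, cond, body = form
        return ["if", ["not", cond], body]
    return form

def evaluate(form, env):
    """Evaluate a tiny list-based language: names, if, not, function calls."""
    if isinstance(form, str):
        return env[form]          # variable lookup
    if not isinstance(form, list):
        return form               # literal, e.g. a number
    head = car(form)
    if head == "if":
        _, cond, then = form
        return evaluate(then, env) if evaluate(cond, env) else None
    if head == "not":
        return not evaluate(form[1], env)
    fn = env[head]                # ordinary function call
    return fn(*[evaluate(arg, env) for arg in form[1:]])

data = [1, 2, 3]
code = ["unless", "done", ["+", 1, 2]]
env = {"done": False, "+": operator.add}

print(car(data))                     # 1
print(car(code))                     # unless
print(expand(code))                  # ['if', ['not', 'done'], ['+', 1, 2]]
print(evaluate(expand(code), env))   # 3
```

The same `car` works on `data` and on `code`, and `expand` manipulates code with ordinary list operations.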
        
               | PaulHoule wrote:
               | I went through quite a few stages of grief reading
               | Graham's _On Lisp_, going from "this is so awesome" to
               | nitpicking details like "he defined everything else but
               | he didn't define nconc" to "if he was using Clojure he
               | wouldn't be having these problems with nconc" to "funny I
               | can write 80%+ of his examples in Python because most of
               | the magic is in first-class functions and macros are a
               | performance optimization except for that last bit about
               | continuations... and Python has async anyway!"
               | 
               | Notably he doesn't do any interesting tree
               | transformations on the code because tree structures in
               | Lisp are just a collection of twisty nameless tuples that
               | all look alike. If you were trying to do nontrivial
               | transformations on code you'd be better off with an AST
               | in a language like Java or Typescript. In the end the
               | dragon book is _On Lisp_ squared or cubed, that is, games
               | people play with macros are a pale shadow of what you can
               | do if you actually understand how compilers work.
        
             | zem wrote:
             | brag is a pretty user-friendly parser generator for racket:
             | https://docs.racket-lang.org/brag/index.html
        
         | thesz wrote:
         | > It is straightforward to write a fast parser generator for
         | languages that require just one character of lookahead...
         | 
         | Then you get VHDL.
         | 
         | https://news.ycombinator.com/item?id=15017974
         | 
         | You need (at least an approximation to) the symbol table for
         | correct lexing.
         | 
         | Or Postgres/MariaDB's SQL with the DELIMITER statement that
         | can change the semicolon to something else.
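
The DELIMITER case above shows why a fixed one-character rule is not enough: the statement terminator itself is state that the input can change. Below is a minimal sketch of a statement splitter that tracks it. The function name `split_statements` is hypothetical, and the sketch deliberately ignores string literals and comments, which a real client would have to handle.

```python
def split_statements(script):
    """Split a SQL script into statements, honoring DELIMITER lines.

    Simplified sketch: string literals and comments are not handled.
    """
    delimiter = ";"
    statements = []
    buffer = ""
    for line in script.splitlines():
        stripped = line.strip()
        # A DELIMITER line between statements changes the terminator.
        if not buffer.strip() and stripped.upper().startswith("DELIMITER "):
            delimiter = stripped.split(None, 1)[1]
            buffer = ""
            continue
        buffer += line + "\n"
        # Emit every complete statement currently in the buffer.
        while delimiter in buffer:
            statement, buffer = buffer.split(delimiter, 1)
            if statement.strip():
                statements.append(statement.strip())
    if buffer.strip():
        statements.append(buffer.strip())
    return statements

script = """\
SELECT 1;
DELIMITER //
CREATE PROCEDURE p() BEGIN SELECT 2; SELECT 3; END//
DELIMITER ;
SELECT 4;
"""

for statement in split_statements(script):
    print(repr(statement))
```

While the delimiter is `//`, the semicolons inside the procedure body are ordinary characters, so the procedure comes out as a single statement.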
        
       ___________________________________________________________________
       (page generated 2025-04-14 23:00 UTC)