[HN Gopher] How Janet's PEG module works
___________________________________________________________________
How Janet's PEG module works
Author : behnamoh
Score : 52 points
Date : 2025-04-11 02:04 UTC (3 days ago)
(HTM) web link (bakpakin.com)
(TXT) w3m dump (bakpakin.com)
| mplanchard wrote:
| I did Advent of Code in Janet the year before last I think, and
| _really_ loved the PEG support. Essentially every day started out
| with making a quick grammar to parse the problem into whatever
| data structures I was using. It's intuitive, pretty easy to pick
| up and adjust, and powerful.
| 3036e4 wrote:
| I did that as well, in 2023. Tried to use PEGs as much as
| possible. Have only good memories of the PEGs, but have not had
| much reason to use them since. Janet+PEG is definitely
| something I will consider for future projects whenever I need
| to parse something. Even for something that would otherwise be
| just a small regular expression I think writing a PEG instead
| may make some sense for readability.
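|
| As a rough illustration of the readability point, here is a minimal
| sketch of PEG-style combinators in Python (not Janet's actual PEG
| API). The helper names lit, digits, seq and parse_move, and the
| "move 3 from 1 to 2" input line, are assumptions made up for this
| example.

```python
from typing import Any, Callable, List, Optional, Tuple

# A parser takes (text, pos) and returns (new_pos, captures), or None on failure.
Result = Optional[Tuple[int, List[Any]]]
Parser = Callable[[str, int], Result]


def lit(s: str) -> Parser:
    """Match the literal string s and capture nothing."""
    def parse(text: str, pos: int) -> Result:
        return (pos + len(s), []) if text.startswith(s, pos) else None
    return parse


def digits(text: str, pos: int) -> Result:
    """Match one or more ASCII digits and capture them as an int."""
    end = pos
    while end < len(text) and "0" <= text[end] <= "9":
        end += 1
    return (end, [int(text[pos:end])]) if end > pos else None


def seq(*parsers: Parser) -> Parser:
    """Match each parser in order, concatenating their captures."""
    def parse(text: str, pos: int) -> Result:
        captures: List[Any] = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            pos, got = result
            captures.extend(got)
        return pos, captures
    return parse


# The grammar reads almost like the input line it describes.
move = seq(lit("move "), digits, lit(" from "), digits, lit(" to "), digits)


def parse_move(line: str) -> Optional[List[int]]:
    """Return the three captured numbers, or None unless the whole line matches."""
    result = move(line, 0)
    if result is None or result[0] != len(line):
        return None
    return result[1]
```

| With these definitions, parse_move("move 3 from 1 to 2") yields
| [3, 1, 2], and a malformed line yields None.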
| wodenokoto wrote:
| I had plans to try the same, but with a Python background and
| having never touched any lispy languages or any macros I found
| the Janet for Mortals book surprisingly difficult to follow and
| gave up.
| norir wrote:
| I am not a fan of PEG. It is straightforward to write a fast
| parser generator for languages that require just one character of
| lookahead to disambiguate any alternation in the grammar. This
| gets you most of the expressivity of PEG and nearly optimal
| performance (since you only need to look at one character to
| disambiguate and there is no backtracking). Just as importantly,
| it avoids the implicit ambiguities that PEG's resolution
| algorithm can hide from the grammar author, ambiguities that lead
| to unexpected parse results and can be quite difficult to debug
| and/or fix in the grammar.
|
| It does require a bit more thought to design an unambiguous
| language but I think it's worth it. While there is a learning
| curve for designing such languages, it becomes natural with
| practice and it becomes hard to go back to ambiguous languages.
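|
| A minimal Python sketch of the kind of hidden ambiguity described
| above, assuming a toy combinator set (lit, choice, seq, full_match
| are hypothetical helpers, not any real library). PEG's ordered
| choice commits to the first alternative that succeeds, so in
| ("a" / "ab") "c" the second alternative can never rescue the parse.
| The last function shows the left-factored form that a
| one-character-lookahead design would push the author toward.

```python
def lit(s):
    """Match literal s at pos; return the new position or None."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None


def choice(*parsers):
    """PEG ordered choice: the first alternative that succeeds wins."""
    def parse(text, pos):
        for parser in parsers:
            end = parser(text, pos)
            if end is not None:
                return end  # committed; later alternatives are never tried
        return None
    return parse


def seq(*parsers):
    """Match parsers one after another, threading the position through."""
    def parse(text, pos):
        for parser in parsers:
            pos = parser(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def full_match(parser, text):
    """True only if the parser consumes the entire input."""
    return parser(text, 0) == len(text)


# ("a" / "ab") "c": the "a" alternative shadows "ab".
shadowed = seq(choice(lit("a"), lit("ab")), lit("c"))
# ("ab" / "a") "c": reordering the alternatives changes the language accepted.
reordered = seq(choice(lit("ab"), lit("a")), lit("c"))


def predictive(text):
    """Left-factored form: "a" "b"? "c", decided one character at a time."""
    pos = 0
    if pos < len(text) and text[pos] == "a":
        pos += 1
    else:
        return False
    if pos < len(text) and text[pos] == "b":  # one character of lookahead
        pos += 1
    return text[pos:] == "c"
```

| Here full_match(shadowed, "abc") is False even though the grammar
| appears to allow it, while reordered and predictive both accept
| "abc" and "ac".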
| janzer wrote:
| For those further interested in PEG vs LL(1) parsers: the first
| few sections of the Python PEP[1] in which CPython switched from
| an LL(1) to a PEG parser give a nice short introduction to both,
| along with the rationale for the switch.
|
| [1] https://peps.python.org/pep-0617/
| PaulHoule wrote:
| It still seems to me the PEG revolution hasn't arrived.
|
| PEG has the possibility for composable grammars (why not
| smack some SQL code in the middle of Python?) but it needs a
| few more affordances, particularly an easy way to handle
| operator precedence.
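|
| For context on the operator precedence point: PEG grammars usually
| encode precedence as one layered rule per level. One common
| alternative is precedence climbing, driven by a small table. The
| sketch below is a hand-written Python version under assumed names
| (BINARY, tokenize, parse); it is not a feature of any particular
| PEG tool.

```python
import re

# Integers, the four arithmetic operators, and parentheses.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

# Precedence table: a higher number binds tighter.
BINARY = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(src):
    """Split the source into a list of token strings."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Build a nested (op, left, right) tuple tree by precedence climbing."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def atom():
        token = advance()
        if token == "(":
            node = expr(1)
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

    def expr(min_prec):
        left = atom()
        # Keep consuming operators that bind at least as tightly as min_prec.
        while peek() in BINARY and BINARY[peek()] >= min_prec:
            op = advance()
            right = expr(BINARY[op] + 1)  # the +1 makes operators left-associative
            left = (op, left, right)
        return left

    tree = expr(1)
    if peek() is not None:
        raise SyntaxError("trailing input")
    return tree
```

| Adding a new binary operator then means adding one entry to the
| table rather than another grammar layer.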
|
| I think current parser generators suck and that more
| programmers would be using them if anybody cared about making
| compiler technology easier to use but the problems are: (1)
| people who understand compiler technology can get things done
| with the awful tools we have now and (2) mostly those folks
| think it is performance uber alles.
|
| With the right tools the "Lisp is better because it is
| homoiconic" would finally die. With properly architected
| compilers adding
|
|     unless(X) { ... } -> if(!X) { ... }
|
| to Java would be just one grammar production, one transformation
| operator and _maybe_ a new class in the AST (which might be
| codegenned), that and something to tell the compiler where to
| find these things. Less code than the POM file.
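|
| The "one transformation operator plus maybe a new AST class" idea
| can be sketched in a few lines. This is a toy Python AST, not a
| real Java compiler; the node classes Not, If and Unless and the
| desugar function are hypothetical names chosen for the example,
| and the grammar production itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Not:
    operand: object


@dataclass(frozen=True)
class If:
    cond: object
    body: tuple


@dataclass(frozen=True)
class Unless:
    """The one new AST class: unless(cond) { body }."""
    cond: object
    body: tuple


def desugar(node):
    """Rewrite every Unless(c, b) into If(Not(c), b), recursing into children."""
    if isinstance(node, Unless):
        return If(Not(desugar(node.cond)), tuple(desugar(s) for s in node.body))
    if isinstance(node, If):
        return If(desugar(node.cond), tuple(desugar(s) for s in node.body))
    if isinstance(node, Not):
        return Not(desugar(node.operand))
    return node  # leaves (names, literals) pass through unchanged
```

| After this pass the rest of the pipeline only ever sees If nodes,
| so nothing downstream needs to know that unless exists.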
|
| I gave up on reStructuredText because it didn't support
| unparsing: I could picture all kinds of scenarios where I'd
| want to turn something else into RST or take RST and mix it
| up against other data and turn it back to RST; RST had the
| potential to work with or without a schema but it never got
| realized.
| behnamoh wrote:
| > "Lisp is better because it is homoiconic"
|
| - Lisp is better because it manipulates the same data that
| the program code is represented in (car works on a data
| list, and it works on a code list as well).
|
| - Lisp is better (at least, Common Lisp) because of image-
| and-REPL-driven development. Good luck finding exactly that
| level of flexibility in other REPL-ful languages.
|
| - Lisp is better because of hot code reloading and
| restarts. Only Elixir/Erlang have a similar mechanism.
|
| - Lisp is better because of structural editing (e.g.,
| paredit). No more character-level editing.
|
| I could go on but just wanted to point out that
| homoiconicity isn't the entire deal with Lisp.
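|
| The first bullet (car works on a data list and on a code list) can
| be imitated in Python by writing programs as nested lists. This is
| only a sketch: car, cdr, evaluate and swap_operator are
| illustrative helpers, and the two-operator "language" is an
| assumption made for the example, not real Lisp.

```python
import operator


def car(lst):
    """First element of a list."""
    return lst[0]


def cdr(lst):
    """Everything after the first element."""
    return lst[1:]


# The only functions this toy language knows about.
ENV = {"+": operator.add, "*": operator.mul}


def evaluate(form):
    """Evaluate a nested-list program such as ["+", 1, ["*", 2, 3]]."""
    if not isinstance(form, list):
        return form  # numbers evaluate to themselves
    fn = ENV[car(form)]
    args = [evaluate(arg) for arg in cdr(form)]
    result = args[0]
    for arg in args[1:]:
        result = fn(result, arg)
    return result


def swap_operator(form, old, new):
    """Rewrite code using the same list operations used on plain data."""
    if not isinstance(form, list):
        return form
    head = new if car(form) == old else car(form)
    return [head] + [swap_operator(arg, old, new) for arg in cdr(form)]


data = [10, 20, 30]
program = ["+", 1, ["*", 2, 3]]
```

| Here car(data) is 10 and car(program) is "+": the same function
| inspects both. evaluate(program) gives 7, and evaluating
| swap_operator(program, "*", "+") gives 6.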
| fuzztester wrote:
| >> "Lisp is better because it is homoiconic"
|
| >- Lisp is better because it manipulates the same data
| that the program code is represented in (car works on a
| data list, and it works on a code list as well).
|
| Don't those two sentences mean the same?
|
| https://news.ycombinator.com/item?id=43676798
| behnamoh wrote:
| Yeah, but I wanted to emphasize that homoiconicity isn't
| just some superficial "nice thing" to have, it literally
| is why we can have powerful macros in Lisp.
| PaulHoule wrote:
| I went through quite a few stages of grief reading
| Graham's _On Lisp_ starting with "this is so awesome" to
| nitpicking details like "he defined everything else but
| he didn't define nconc" to "if he was using Clojure he
| wouldn't be having these problems with nconc" to "funny I
| can write 80%+ of his examples in Python because most of
| the magic is in first-class functions and macros are a
| performance optimization except for that last bit about
| continuations... and Python has async anyway!"
|
| Notably he doesn't do any interesting tree
| transformations on the code because tree structures in
| Lisp are just a collection of twisty nameless tuples that
| all look alike. If you were trying to do nontrivial
| transformations on code you'd be better off with an AST
| in a language like Java or Typescript. In the end the
| dragon book is _On Lisp_ squared or cubed, that is, games
| people play with macros are a pale shadow of what you can
| do if you actually understand how compilers work.
| zem wrote:
| brag is a pretty user-friendly parser generator for racket:
| https://docs.racket-lang.org/brag/index.html
| thesz wrote:
| > It is straightforward to write a fast parser generator for
| languages that require just one character of lookahead...
|
| Then you get VHDL.
|
| https://news.ycombinator.com/item?id=15017974
|
| You need (at least an approximation to) the symbol table for
| correct lexing.
|
| Or Postgres/MariaDB's SQL with the DELIMITER statement that
| can change semicolon to something else.
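|
| A deliberately simplified Python sketch of why a changeable
| delimiter complicates lexing: the splitter has to carry state,
| because the meaning of ";" depends on earlier input. The function
| name split_statements is hypothetical, the DELIMITER handling is
| modelled loosely on the mysql/mariadb command-line client, and
| string literals and comments are ignored entirely.

```python
def split_statements(script):
    """Split a script into statements, honouring DELIMITER lines.

    Simplification: a statement ends only when a line ends with the
    current delimiter; quoting and comments are not handled.
    """
    delimiter = ";"
    statements = []
    buffer = []
    for line in script.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.upper().startswith("DELIMITER "):
            # From here on, the statement terminator is something else.
            delimiter = stripped.split(None, 1)[1]
            continue
        buffer.append(stripped)
        if stripped.endswith(delimiter):
            joined = " ".join(buffer)
            statements.append(joined[: -len(delimiter)].rstrip())
            buffer = []
    if buffer:
        statements.append(" ".join(buffer))  # unterminated trailing statement
    return statements


SCRIPT = """SELECT 1;
DELIMITER //
CREATE PROCEDURE p()
BEGIN
  SELECT 2;
END//
DELIMITER ;
SELECT 3;
"""
```

| For SCRIPT this returns three statements, and the ";" after
| SELECT 2 stays inside the procedure body because "//" is the
| active delimiter at that point.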
___________________________________________________________________
(page generated 2025-04-14 23:00 UTC)