* * * * *

The significance of this is that you can build parsing expressions on the fly …

I found Meta II [1] to be an interesting approach to parsing, and the closest modern equivalent is parsing expression grammars [2] (PEGs). The easiest one to use that I've found is the Lua [3] implementation, LPeg [4]. What's interesting about LPeg is that a pattern isn't compiled into Lua, but into code for a specialized parsing VM (Virtual Machine), which makes it quite fast. Maybe not as fast as lex [5] and yacc [6], but certainly easier to understand and vastly easier to use.

Let me amend that: I find the re [7] module (which is built on LPeg) to be easier to use, as I find this:

> local re = require "re"
>
> parser = re.compile [[
>   expr     <- term (termop term)*
>   term     <- factor (factorop factor)*
>   factor   <- number
>             / open expr close
>
>   number   <- space '-'? [0-9]+ space
>   termop   <- space [+-] space
>   factorop <- space [*/] space
>   open     <- space '(' space
>   close    <- space ')' space
>   space    <- ' '*
> ]]

to be way easier to read and understand than

> local lpeg = require "lpeg"
>
> local space    = lpeg.P" "^0
> local close    = space * lpeg.P")" * space
> local open     = space * lpeg.P"(" * space
> local factorop = space * lpeg.S"*/" * space
> local termop   = space * lpeg.S"+-" * space
> local number   = space * lpeg.P"-"^-1 * lpeg.R"09"^1 * space
>
> local factor , term , expr = lpeg.V"factor" , lpeg.V"term" , lpeg.V"expr"
>
> parser = lpeg.P {
>   "expr",
>   factor = number
>          + open * expr * close,
>   term   = factor * (factorop * factor)^0,
>   expr   = term * (termop * term)^0
> }

As such, I've been concentrating on using the re module to brush up on my parsing skills [8], to the point that I've been ignoring a key component of LPeg: expressions! Sure, raw LPeg isn't pretty, but as you can see from the above example, it is built up out of expressions. And that's a powerful abstraction right there.
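To make "built up out of expressions" a bit more concrete, here is a minimal sketch (not code from mod_blog) in which a pattern is assembled at run time from an ordinary Lua list. The helper any_of and the name tokenizer are made up for illustration; the only LPeg calls used are lpeg.P, lpeg.R, lpeg.C and lpeg.Ct.

```lua
local lpeg = require "lpeg"

-- Hypothetical helper (not from the post): fold a list of literal strings
-- into a single pattern using ordered choice.
local function any_of(words)
  local patt = lpeg.P(false)       -- a pattern that always fails
  for _,word in ipairs(words) do
    patt = patt + lpeg.P(word)     -- earlier entries are tried first
  end
  return patt
end

local space  = lpeg.P" "^0
local number = space * lpeg.C(lpeg.R"09"^1) * space
local op     = space * lpeg.C(any_of { "+" , "-" , "*" , "/" }) * space

-- number (op number)*, anchored at the end of the input;
-- every captured number and operator is collected into one table
tokenizer = lpeg.Ct(number * (op * number)^0) * -1
```

With this, tokenizer:match("12 + 3 * 45") returns a table holding the strings 12, +, 3, * and 45, and supporting another operator is a matter of adding one more string to the list handed to any_of.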
For instance, in mod_blog, I have code that will parse text, converting certain sequences of characters like --- (three dashes) into an HTML (HyperText Markup Language) entity, &mdash;. So, I type the following:

> ``The name of our act is---The Aristocrats! ... Um ... hello?''

which is turned into

> &ldquo;The name of our act is&mdash;The Aristocrats! &hellip; Um &hellip;
> hello?&rdquo;

to be rendered on your screen as:

> “The name of our act is—The Aristocrats! … Um … hello?”

Now, I only support a few character sequences (six) and that takes 160 lines of C code. Adding support for more is a daunting task, and one that I've been reluctant to take on. But in LPeg, the code looks like:

> local lpeg = require "lpeg"
>
> local base =
> {
>   [ [[``]] ] = "&ldquo;" ,
>   [ [['']] ] = "&rdquo;" ,
>   [ "---"  ] = "&mdash;" ,
>   [ "--"   ] = "&ndash;" ,
>   [ "..."  ] = "&hellip;" ,
>   [ ".."   ] = "&#8229;" ,
> }
>
> function mktranslate(tab)
>   local tab   = tab or {}
>   local chars = lpeg.C(lpeg.P(1)) -- default: pass any character through
>
>   -- each new alternative is tried before the ones already in chars
>   for target,replacement in pairs(tab) do
>     chars = lpeg.P(target) / replacement + chars
>   end
>
>   for target,replacement in pairs(base) do
>     chars = lpeg.P(target) / replacement + chars
>   end
>
>   return lpeg.Ct(chars^0) / function(c) return table.concat(c) end
> end

Now, I could do this with the re module:

> local re = require "re"
> local R  = { concat = table.concat }
> local G  = --[[ lpeg/re ]] [[
>
> text  <- chars* -> {} -> concat
>
> chars <- '``'  -> '&ldquo;'
>        / "''"  -> '&rdquo;'
>        / '---' -> '&mdash;'
>        / '--'  -> '&ndash;'
>        / '...' -> '&hellip;'
>        / '..'  -> '&#8229;'
>        / { . }
>
> ]]
>
> filter = re.compile(G,R)

But the former allows me to pass in an additional table of translations to apply on top of the “standard set” programmed in, for example:

> translate = mktranslate {
>   ["RAM"]  = 'RAM',
>   ["CPU"]  = 'CPU',
>   ["(tm)"] = '&trade;'
> }

And I would want this why? Well, I have Lua embedded in mod_blog [9], so using Lua to do the translations is straightforward. But now, when I make an entry, I could include a table of custom translations for that entry.
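One thing to watch when building the choice in a loop: pairs() makes no promise about traversal order, so with the mktranslate() above nothing guarantees that "---" is tried before "--", or "..." before "..". The following is a sketch of one way around that, assuming longest-match-first is what's wanted. The name mktranslate_sorted is made up, and the base table is cut down and uses literal UTF-8 characters only to keep the example short.

```lua
local lpeg = require "lpeg"

-- Assumption: a shortened "standard set" with literal UTF-8 replacements.
local base =
{
  [ "---" ] = "—" ,
  [ "--"  ] = "–" ,
  [ "..." ] = "…" ,
}

-- Hypothetical variant of mktranslate(): merge the per-entry table over
-- the base table, then try the longest targets first.
function mktranslate_sorted(tab)
  local all = {}
  for target,replacement in pairs(base)      do all[target] = replacement end
  for target,replacement in pairs(tab or {}) do all[target] = replacement end

  local targets = {}
  for target in pairs(all) do targets[#targets + 1] = target end
  table.sort(targets,function(a,b)
    if #a ~= #b then return #a > #b end  -- longest first
    return a < b                         -- tie-break keeps the order stable
  end)

  local chars = lpeg.P(false)            -- start with a pattern that fails
  for _,target in ipairs(targets) do
    chars = chars + lpeg.P(target) / all[target]
  end
  chars = chars + lpeg.C(lpeg.P(1))      -- anything else passes through

  return lpeg.Ct(chars^0) / table.concat
end

translate = mktranslate_sorted { ["(tm)"] = "™" }
```

Here translate:match("Wait---what... Lua(tm) 1--2") gives back "Wait—what… Lua™ 1–2". Because the per-entry table is merged last, an entry can also override one of the standard replacements.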
Doing it this way solves a problem [10] I saw nearly a decade ago.

[1] gopher://gopher.conman.org/0Phlog:2011/08/11.1
[2] http://pdos.csail.mit.edu/~baford/packrat/
[3] http://www.lua.org/
[4] http://www.inf.puc-rio.br/~roberto/lpeg/
[5] http://en.wikipedia.org/wiki/Lex_(software)
[6] http://en.wikipedia.org/wiki/Yacc
[7] http://www.inf.puc-rio.br/~roberto/lpeg/re.html
[8] https://github.com/spc476/LPeg-Parsers
[9] gopher://gopher.conman.org/0Phlog:2011/11/28.1
[10] gopher://gopher.conman.org/0Phlog:2003/11/19.2

Email Sean Conner at sean@conman.org.