[HN Gopher] Ohm - A library and language for building parsers, i...
       ___________________________________________________________________
        
       Ohm - A library and language for building parsers, interpreters,
       compilers, etc.
        
       Author : testing_1_2_3_4
       Score  : 333 points
       Date   : 2021-03-27 16:20 UTC (1 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | crazypython wrote:
       | This title is misleading. It's a library and language for
       | building parsers. Full stop. Parsing toolkit, as they say
       | themselves.
        
         | exdsq wrote:
         | The title copies the second sentence of their readme:
         | 
         | > You can use it to parse custom file formats or quickly build
         | parsers, interpreters, and compilers for programming languages.
        
           | UncleMeat wrote:
           | I guess it depends on what it means to somebody to build a
           | compiler. Something like yacc says "compiler compiler" in the
           | name but really it is a parser generator. The hard part of
           | industrial compilers is the optimization.
        
       | f430 wrote:
       | If I want to modify GraphQL to support custom syntax, would Ohm
       | work? Or does a solution exist already for my needs?
        
       | fjfaase wrote:
       | I recently wrote a similar parser, maybe less fancy, for a
       | workshop on parsing. It does display the abstract syntax tree
       | with d3.js and also has a built-in evaluator for a limited set of
       | language constructs.
       | https://fransfaase.github.io/ParserWorkshop/Online_inter_par...
       | It is based on a parser I implemented in C++.
        
       | scroot wrote:
       | We are using Ohmjs on a project at work and it is fantastic. I'm
       | hoping one day that Ohmjs and Ohm/s (Squeak) can be compatible
       | again -- would love to have the Smalltalk version of our
       | interpreter and environment we built using this
        
       | codr7 wrote:
       | I recently created a library for the other part of an
       | interpreter.
       | 
       | https://github.com/codr7/liblg
       | 
       | https://github.com/codr7/liblgpp
        
       | jweissman wrote:
       | I've built a number of toy language projects with Ohm and it's
       | really wonderful. Just a joy to use the visual tooling also. All
       | around really beautiful machinery
        
       | rkagerer wrote:
       | OHM is also the acronym for Open Hardware Monitor, a great open-
       | source project for monitoring computer temperatures, fan speeds,
       | voltages, etc: https://openhardwaremonitor.org/
        
       | pjmlp wrote:
       | Love it, this is great for teaching purposes.
        
       | tobr wrote:
       | Speaking of - what's the status of HARC? Is it defunct?
        
         | azeirah wrote:
         | Yep, HARC is no more. I don't recall the exact history but iirc
         | SAP withdrew its funding and HARC basically ceased to exist.
         | 
         | Now, ohm survives as an open-source project, Bret Victor
         | continues work with Dynamicland and Vi Hart is currently
         | employed at Microsoft Research.
        
         | jagger27 wrote:
         | Defunct enough to let their TLS cert expire.
        
       | corysama wrote:
       | This is a follow-up to a major component of the
       | http://vpri.org/writings.php project that created a
       | self-contained office suite, OS and compiler suite in something like
       | 100-200k lines of code without external dependencies.
        
         | hobo_mark wrote:
         | Do you have a link to the project? I'm failing to find it on
         | that page.
        
           | beagle3 wrote:
           | Not op, and can't google now but the project was called
           | STEPS, they did a down-to-metal os including network and GUI
           | (and more) in 20k lines.
           | 
           | Don't remember anything about office suite. Related names I
           | remember are Alan Kay, Dan Amelang, Alessandro Warth and Ian
           | Piumarta.
        
             | elgertam wrote:
             | The biggest artifact from STEPS was Frank, which was at the
             | time bootstrapped using Squeak Smalltalk and included the
             | work from Ian Piumarta (IDST/Maru, which was a fully
             | bootstrapped LISP down to the metal), Dan Amelang (Nile,
             | the graphics language, and Gezira, the 2.5D graphics
             | library implemented in Nile, which both depended on Maru),
             | Alex Warth (OMeta, which had some sort of relationship to
             | Ian's work on Maru), Yoshiki Ohshima (a lot of the
             | experimental things from Alan's demos of Frank were made by
             | Yoshiki) and then several other names. I got close to
             | getting Frank working, but honestly, I'm not sure it's
             | worth it at this point. A lot of the work is 10-15 years
             | old, and the last time I dove in, I ran into issues running
             | 32-bit binaries. The individual components are more
             | interesting and could be packaged together in some other
             | way.
             | 
             | Since it was a research project, STEPS never quite achieved
             | a cohesive, unified experience, but they proved that the
             | individual components could be substantially minimized and
             | the cost of developing them amortized over a large project
             | like a full GUI environment. Nile and some of the
             | applications of Maru, like a minimal but functioning TCP/IP
             | stack that can be compiled to bare metal by virtue of being
             | made in Maru, still fascinate me.
             | 
             | Work on Maru is ongoing, albeit run by a community (with
             | some input from Ian), Nile has been somewhat reborn of
             | late, Ohm is again under active development as the
             | successor to OMeta and Alan is still around.
             | 
             | (Source: Dan is a friend and colleague, and I've met a few
             | of the STEPS/VPRI people that way.)
        
               | scroot wrote:
               | Can you link to both the Maru community and the reborn
               | Nile work? I've always tried to follow the latter, but
               | [1] seems to be the only place to find information and
               | it's been silent for a long time.
               | 
               | [1] https://github.com/damelang/nile/issues/3
        
               | elgertam wrote:
               | Maru development is documented on an active mailing
               | list.[1]
               | 
               | Dan did a demo of a related language to Nile at Oracle
               | Open World in September 2019.[2] (Full disclosure: I worked
               | on the demo.) I would predict that more information will
               | be forthcoming about Nile this year.
               | 
               | [1] https://groups.google.com/g/maru-dev
               | 
               | [2] https://www.oracle.com/openworld/on-demand.html?bcid=6092429...
        
               | asrp wrote:
               | Is there some place STEPS fans can gather and gather our
               | notes? There are archives of the FONC mailing list here
               | [1].
               | 
               | I'm an outsider and also never got Frank to work. I was
               | waiting for the Nile/Gezira thesis to get a high-level
               | (but hopefully also somewhat detailed) description of how
               | they handled graphics. I vaguely remember getting parts
               | of idst working but for each of these projects, there
               | were always multiple versions lying around. Sometimes in
               | odd places.
               | 
               | I read Alex Warth's thesis and it's well written, in a
               | way that makes it very easy to understand. So, of course,
               | I had to implement my own OMeta variant [2].
               | 
               | Also, the VPRI website itself says it's shutting down
               | (presumably folks moved to HARC at that time?).
               | 
               | Edit to add that OMeta is the language agnostic parser
               | and compiler!
               | 
               | [1] https://www.mail-archive.com/fonc@vpri.org/
               | 
               | [2] https://github.com/asrp/pymetaterp
        
               | elgertam wrote:
               | > Is there some place STEPS fans can gather and gather
               | our notes? There are archives of the FONC mailing list.
               | 
               | Maru development is documented on an active mailing
               | list.[1] Ohm development is being coordinated through
               | GitHub. I'd personally like to take the extant code from
               | OMeta/JS and the JS implementation of Nile & Gezira, and
               | modernize them.
               | 
               | Recently I've been wondering if there's enough interest
               | for a Discord server or something. (In the spirit of
               | STEPS, it'd be ideal to make a new collaborative thing
               | that's really different than static text/audio/video on
               | the web, but gotta start somewhere. :) ) Unfortunately, I
               | have had other, higher-priority projects at the moment,
               | so I have taken no initiative to try to build a
               | community.
               | 
               | I will also say that in my opinion, it's not clear to
               | many of the people who made this stuff how special it is.
               | The only exception to that is Bret Victor, who actually
               | is not well-understood, but even the banana pudding
               | versions of his ideas are typically much better than the
               | industry's.
               | 
               | > I'm an outsider and also never got Frank to work. I was
               | waiting for the Nile/Gezira thesis to get a high-level
               | (but hopefully also somewhat detailed) description of how
               | they handled graphics. I vaguely remember getting parts
               | of idst working but for each of these projects, there
               | were always multiple versions lying around. Sometimes in
               | odd places.
               | 
               | I've never gotten Frank to work, and I abandoned my
               | attempts. I've seen it run, though. The name was fully
               | truthful: it really is Frankenstein's monster.
               | 
               | I _did_ get Nile + Gezira to work (albeit in a very crude
               | way by printing numbers to the console rather than
               | hooking it up to a frame buffer). That's how I met Dan.
               | I don't want to betray any confidences with him, but
               | there is ongoing work with Nile.
               | 
               | Here's Dan himself presenting a related language at
               | Oracle Open World in a demo (around 25 mins in).[2] (Full
               | disclosure: I worked on the demo.)
               | 
               | If it were me getting started, I would take a look at the
               | JavaScript implementation of Nile in Dan's Nile repo on
               | GitHub. It should more or less work out of the box, and
               | there's an HTML file containing a fairly full subset of
               | Gezira. The only problem is that the JS style is way out
               | of date, and so it does some things that are heavily
               | frowned upon today. It may not work with tools like
               | Webpack.
               | 
               | The Maru-based Nile is trickier to get working, but it
               | does work. The issue with Ian's Maru is that it's quite
               | hard to reason about and lacks clear debugging tools.
               | I've gotten both up and running. I seem to remember the
               | Boehm GC was pivotal in getting Maru to bootstrap and
               | then run Nile.
               | 
               | > I read Alex Warth's thesis and it's well written, in a
               | way that makes it very easy to understand. So, of course,
               | I had to implement my own OMeta variant [2].
               | 
               | Pymetaterp is cool! I agree: Warth's work on OMeta was
               | impressive. In some ways, Ohm feels inferior to me,
               | though they're both good tools with lots of potential.
               | 
               | OMeta is the one tool from STEPS that is basically simple
               | to understand and use without having to do a bunch of
               | code archaeology.
               | 
               | > Also, the VPRI website itself says it's shutting down
               | (presumably folks moved to HARC at that time?).
               | 
               | VPRI closed because STEPS ended and because Alan had to
               | retire at some point. HARC and/or CDG Labs continued the
               | work, but then closed as well. (I don't know all of the
               | details, but someone here suggested SAP withdrew funding.
               | That would track with what I do know.)
               | 
               | Today, Ian is teaching in Japan, Dan is at Vianai, Alex
               | is at (IIRC) Google, Yoshiki is at Croquet, Bret Victor
               | is doing Dynamicland, Vi Hart is at Microsoft Research
               | and then Alan is retired. There were quite a few others
               | I'm missing, and they are all doing interesting things as
               | well.
               | 
               | [1] https://groups.google.com/g/maru-dev
               | 
               | [2] https://www.oracle.com/openworld/on-demand.html?bcid=6092429...
        
               | azeirah wrote:
               | > I will also say that in my opinion, it's not clear to
               | many of the people who made this stuff how special it is.
               | The only exception to that is Bret Victor, who actually
               | is not well-understood, but even the banana pudding
               | versions of his ideas are typically much better than the
               | industry's.
               | 
               | I would love to hear more about how you believe not only
               | outsiders, but also the people who made this
               | misunderstand this work?
               | 
               | How do you see the importance of STEPS and Bret Victor's
               | work? I'm a big fan, and you clearly have a lot of
               | knowledge. I'd love to read more!
        
               | asrp wrote:
               | Thanks, a lot of this is new and useful to me.
               | 
               | > Recently I've been wondering if there's enough interest
               | for a Discord server or something. (In the spirit of
               | STEPS, it'd be ideal to make a new collaborative thing
               | that's really different than static text/audio/video on
               | the web, but gotta start somewhere. :) ) Unfortunately, I
               | have had other, higher-priority projects at the moment,
               | so I have taken no initiative to try to build a
               | community.
               | 
               | I don't really like Discord because they keep asking for
               | phone verification and, early on, they pretty
               | aggressively shut down alternative client attempts.
               | 
               | What about Mattermost? I could try to set one up though
               | initially, we wouldn't have email notifications or a CDN.
               | Might not be so good if the initial group is small.
               | 
               | Slack? Don't know how they compare to Discord but at
               | least they don't ask for phone verification.
               | 
               | A subreddit? A mailing list? Some kind of fediverse
               | thing?
               | 
               | If there's some possibility of migrating to our own
               | platform, I guess it doesn't matter as much where we
               | start.
               | 
               | I could try to set something up in the coming week. But
               | interest in this HN thread will still have died by that
               | time.
               | 
               | > I did get Nile + Gezira to work (albeit in a very crude
               | way by printing numbers to the console rather than
               | hooking it up to a frame buffer). That's how I met Dan. I
               | don't want to betray any confidences with him, but there
               | is ongoing work with Nile.
               | 
               | Nice! I'm not anywhere near that. I'm still looking for a
               | description of what it _is_ and at a very high level, how
               | does it work internally? Something like "it's
               | mathematical notation to describe the pixel
               | positions/intensities implicitly via constraint
               | equations; it uses a <something> solver for ...". What's
               | in quotes could be way off; it's from memory of what I
               | remember seeing.
               | 
               | > I've gotten both up and running. I seem to remember the
               | Boehm GC was pivotal in getting Maru to bootstrap and
               | then run Nile.
               | 
               | I also vaguely remember something about getting the right
               | Boehm GC version so that some of
               | 
               | > Pymetaterp is cool! I agree: Warth's work on OMeta was
               | impressive. In some ways, Ohm feels inferior to me,
               | though they're both good tools with lots of potential.
               | 
               | Thanks! I share similar thoughts about Ohm. Having a
               | visual editor is very nice, though I tend to use
               | breakpoints for parser debugging [1].
               | 
               | Edit to add that id-objmodel [2] is another STEPS project
               | I found to be simple and useful as an idea.
               | 
               | [1] See, for example, "Debugging" in
               | https://blog.asrpo.com/adding_new_statement [2]
               | https://www.piumarta.com/software/id-objmodel/
        
               | xkriva11 wrote:
               | Can you publish what you have collected?
        
               | elgertam wrote:
               | I plan to: a working system. That seems to be true to the
               | spirit of STEPS, VPRI and Alan Kay himself.
        
             | renox wrote:
             | The 'Word' equivalent was called Frank but AFAIK nobody has
             | been able to reproduce what was demonstrated.
             | 
             | Quite painfully ironic for a software research project that
             | they didn't properly use a VCS.
        
               | elgertam wrote:
               | They did use VCS, actually, but a lot of them used SVN
               | and each person in the STEPS project was hosting their
               | own code. Most of those servers have gone dark now,
               | though you can find random ports over to GitHub (rarely
               | with the version history). As far as I can tell, Dan
               | Amelang and Alex Warth were the only two who used git or
               | moved their code over to git.
        
             | hobo_mark wrote:
             | Thank you, funnily enough this led me back to the orange
             | website:
             | 
             | "STEPS Toward the Reinvention of Programming, 2012 Final
             | Report Submitted to the National Science Foundation (NSF)
             | October 2012"
             | 
             | https://news.ycombinator.com/item?id=11686325
        
           | e12e wrote:
           | See:
           | 
           | https://en.m.wikipedia.org/wiki/Ometa (including reference
           | section)
           | 
           | Or go to: http://www.vpri.org/writings.php
           | 
           | If I recall correctly you want: "STEPS Toward the Reinvention
           | of Programming, 2012 Final Report Submitted to the National
           | Science Foundation (NSF) October 2012" (and earlier reports)
           | 
           | Discussed on hn:
           | https://news.ycombinator.com/item?id=11686325
           | 
           | And: https://news.ycombinator.com/item?id=585360
           | 
           | Notable for implementing tcp/ip by parsing the rfc.
           | 
           | "A Tiny TCP/IP Using Non-deterministic Parsing Principal
           | Researcher: Ian Piumarta
           | 
           | For many reasons this has been on our list as a prime target
           | for extreme reduction. (...) See Appendix E for a more
           | complete explanation of how this "Tiny TCP" was realized in
           | well under 200 lines of code, including the definitions of
           | the languages for decoding header format and for controlling
           | the flow of packets."
           | 
           | (...)
           | 
           | "Appendix E: Extended Example: A Tiny TCP/IP Done as a Parser
           | (by Ian Piumarta) Elevating syntax to a 'first-class citizen'
           | of the programmer's toolset suggests some unusually expressive
           | alternatives to complex, repetitive, opaque and/or error-prone
           | code. Network protocols are a perfect example
           | of the clumsiness of traditional programming languages
           | obfuscating the simplicity of the protocols and the internal
           | structure of the packets they exchange. We thought it would
           | be instructive to see just how transparent we could make a
           | simple TCP/IP implementation. Our first task is to describe
           | the format of network packets. Perfectly good descriptions
           | already exist in the various IETF Requests For Comments
           | (RFCs) in the form of "ASCII-art diagrams". This form was
           | probably chosen because the structure of a packet is
           | immediately obvious just from glancing at the pictogram. For
           | example:
           | 
           | +-------------+-------------+-------------------------+----------+----------------------------------------+
           | | 00 01 02 03 | 04 05 06 07 | 08 09 10 11 12 13 14 15 | 16 17 18 | 19 20 21 22 23 24 25 26 27 28 29 30 31 |
           | +-------------+-------------+-------------------------+----------+----------------------------------------+
           | |   version   |  headerSize |      typeOfService      |                      length                       |
           | +-------------+-------------+-------------------------+----------+----------------------------------------+
           | |                   identification                    |  flags   |                 offset                 |
           | +---------------------------+-------------------------+----------+----------------------------------------+
           | |       timeToLive          |        protocol         |                     checksum                      |
           | +---------------------------+-------------------------+---------------------------------------------------+
           | |                                              sourceAddress                                              |
           | +---------------------------------------------------------------------------------------------------------+
           | |                                           destinationAddress                                            |
           | +---------------------------------------------------------------------------------------------------------+
           | 
           | If we teach our programming language to recognize pictograms
           | as definitions of accessors for bit fields within structures,
           | our program is the clearest of its own meaning. The following
           | expression creates an IS grammar that describes ASCII art
           | diagrams."
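
To make the quoted idea concrete, here is a minimal Python sketch of reading an RFC-style pictogram as a set of bit-field accessors. It is not Ian Piumarta's implementation (the report describes a grammar written in the STEPS toolchain); the names parse_layout and read_field are hypothetical, and the sketch assumes the first "|" row numbers the 32 bit columns and that every later "|" row describes one 32-bit word.

```python
import re

DIAGRAM = """\
+-------------+-------------+-------------------------+----------+----------------------------------------+
| 00 01 02 03 | 04 05 06 07 | 08 09 10 11 12 13 14 15 | 16 17 18 | 19 20 21 22 23 24 25 26 27 28 29 30 31 |
+-------------+-------------+-------------------------+----------+----------------------------------------+
|   version   |  headerSize |      typeOfService      |                      length                       |
+-------------+-------------+-------------------------+----------+----------------------------------------+
|                   identification                    |  flags   |                 offset                 |
+---------------------------+-------------------------+----------+----------------------------------------+
|       timeToLive          |        protocol         |                     checksum                      |
+---------------------------+-------------------------+---------------------------------------------------+
|                                              sourceAddress                                              |
+---------------------------------------------------------------------------------------------------------+
|                                           destinationAddress                                            |
+---------------------------------------------------------------------------------------------------------+
"""


def parse_layout(diagram):
    """Return {field name: (bit offset from start of header, width in bits)}."""
    # Border lines start with "+" and carry no information, so skip them.
    rows = [line for line in diagram.splitlines() if line.startswith("|")]
    # The first row numbers the bits; remember the text column of each number.
    bit_cols = [m.start() for m in re.finditer(r"\d+", rows[0])]
    fields = {}
    for word, row in enumerate(rows[1:]):
        pipes = [i for i, ch in enumerate(row) if ch == "|"]
        for left, right in zip(pipes, pipes[1:]):
            name = row[left + 1:right].strip()
            # A field owns every bit whose number sits between its two pipes.
            bits = [b for b, col in enumerate(bit_cols) if left < col < right]
            fields[name] = (32 * word + bits[0], len(bits))
    return fields


def read_field(packet, fields, name):
    """Extract one field from a big-endian packet given the parsed layout."""
    offset, width = fields[name]
    value = int.from_bytes(packet, "big")
    shift = len(packet) * 8 - offset - width
    return (value >> shift) & ((1 << width) - 1)


FIELDS = parse_layout(DIAGRAM)
# A made-up 20-byte IPv4 header used only as sample input.
SAMPLE = bytes.fromhex("4500003c1c4640004006b1e6c0a80001c0a800c7")
print(read_field(SAMPLE, FIELDS, "version"), read_field(SAMPLE, FIELDS, "protocol"))
```

The layout comes entirely from the picture: moving a "|" in the diagram changes which bits a field covers, with no other code to update.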
        
         | infinite8s wrote:
         | They were trying for 10k lines of code (I think I saw Alan Kay
         | mention online that they got to about 20k lines).
        
       | gklitt wrote:
       | Ohm's key selling point for me is the visual editor environment,
       | which shows how the parser is executing on various sample inputs
       | as you modify the grammar. It makes writing parsers fun rather
       | than tedious. One of the best applications of "live programming"
       | I've seen.
       | 
       | https://ohmlang.github.io/editor/
        
         | thesz wrote:
         | I used to debug the parsing process for a VHDL grammar (which is
         | ambiguous at the lexeme level) with parser combinators and the
         | Haskell REPL.
         | 
         | Whenever my "compiler" found a syntax error in the test suite, I
         | was able to load the part of the source around the error and
         | investigate where my parser's error or omission was by running
         | parsers for smaller and smaller parts of the grammar on smaller
         | and smaller parts of the input.
         | 
         | It was 12 years ago.
         | 
         | And yes, it is fun. ;)
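
The workflow described above (run ever smaller parsers on ever smaller inputs) works because every combinator is itself a parser that can be called directly. Below is a rough Python sketch of that idea, not the commenter's Haskell code; the combinators lit, seq, alt and many1 and the toy "assignment" rule are assumptions made up for illustration.

```python
def lit(text):
    """Parser that matches an exact string at position i."""
    def run(s, i):
        return (text, i + len(text)) if s.startswith(text, i) else None
    return run


def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def run(s, i):
        values = []
        for p in parsers:
            result = p(s, i)
            if result is None:
                return None
            value, i = result
            values.append(value)
        return values, i
    return run


def alt(*parsers):
    """Return the result of the first parser that succeeds."""
    def run(s, i):
        for p in parsers:
            result = p(s, i)
            if result is not None:
                return result
        return None
    return run


def many1(p):
    """Apply p one or more times (p must consume input to terminate)."""
    def run(s, i):
        values = []
        while (result := p(s, i)) is not None:
            value, i = result
            values.append(value)
        return (values, i) if values else None
    return run


digit = alt(*(lit(d) for d in "0123456789"))
number = many1(digit)
assignment = seq(lit("x"), lit(":="), number, lit(";"))

# The whole rule fails on this input...
print(assignment("x:=1a;", 0))
# ...so narrow it down: the number parses one digit, then ";" does not match.
print(number("1a;", 0))
print(lit(";")("1a;", 1))
```

In a REPL the same narrowing can be done interactively, which is roughly the experience a visual grammar editor automates.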
        
         | Waterluvian wrote:
         | A lot of regex testers do this and I can't imagine writing a
         | regex or a parser without.
        
           | anon_tor_12345 wrote:
           | >a parser without
           | 
           | can you show me a parser generator that produces this kind of
           | visualization?
        
       | recursivedoubts wrote:
       | Always fun to find the first commit:
       | 
       | https://github.com/harc/ohm/commit/4611bf63c5ecb90d782112d68...
       | 
       | 2014
       | 
       | Neat tool. I write parsers by hand though. More fun, and you can
       | be a lot sleazier.
        
       | glrsbstrd wrote:
       | U realise ohm is used for measuring resistance in electricity,
       | don't u?
        
       | tovej wrote:
       | Compiler compilers are great, I love writing DSLs for my
       | projects. I usually use yacc/lex, or write my own compiler
       | (typically in go these days).
       | 
       | However (and this is just me talking), I don't see the point in a
       | javascript-based compiler. Surely any file format/DSL/programming
       | language you write will be parsed server-side?
        
         | branneman wrote:
         | In that case, may I ask why you are not a Racket user? Sounds
         | like it'll save you a ton of time and keep your implementations
         | high level.
        
         | hansvm wrote:
         | (also just me talking -- here are some potential counterpoints)
         | 
         | The choice of language often matters a lot less than how
         | familiar you are with it (and its ecosystem(s)). I think it's
         | totally reasonable to want to use JS for a compiler in, e.g., a
         | Node project if for no other reason than to not have to learn
         | too many extra things at once to be productive with the new
         | tool.
         | 
         | I also don't think it's fair to assume everything will be
         | parsed, tokenized, etc server-side. Even assuming that data
         | originates server-side (since if it didn't you very well might
         | have a compelling case for handling it client-side if for no
         | other reason than latency), it's moderately popular nowadays to
         | serve a basically static site describing a bunch of dynamic
         | things for the frontend to do. Doing so can make it
         | easier/cheaper to hit any given SLA at the cost of making your
         | site unusable for underpowered clients and pushing those costs
         | to your users, and that tradeoff isn't suitable everywhere, but
         | it does exist.
         | 
         | It's interesting that you seem to implicitly assume the only
         | reason somebody would choose JS is that they're writing
         | frontend code. It's personally not my first choice for most
         | things, but it's not too hard to imagine that some aspect of JS
         | (e.g., npm) might make it a top contender for a particular
         | project despite its other flaws and tradeoffs.
        
           | tenaciousDaniel wrote:
           | This makes me feel really good. I'm working on my first DSL
           | and I'm writing it in JS. I really don't know what I'm doing,
           | and it felt like JS wasn't as good a choice as a more
           | "serious" language like C++.
           | 
           | But I'm standing my ground because I'm not even writing a
           | proper "compiler" - in my case, the output is JSON. So it
           | just kinda feels like it makes sense to stick with JS.
        
         | RodgerTheGreat wrote:
         | There's a great deal of value to making programming
         | environments available in a browser, especially in the context
         | of creative coding and education. I have built and used many
         | such tools which are purely client-side.
         | 
         | There is a world of difference in accessibility between a tool
         | that requires installation and a tool that you can use by
         | following a hyperlink.
        
         | breck wrote:
         | > I don't see the point in a javascript-based compiler
         | 
         | My CC is Javascript based (well it was initially, then
         | TypeScript, now a lot of it is written in itself).
         | 
         | 99% of the time I use the actual languages I make in it server
         | side (nodejs), but I am able to develop the languages in my
         | browser using https://jtree.treenotation.org/designer/. It's
         | super easy and fun (at least for me, UX sucks for most people
         | at the moment). There's something somewhat magical about being
         | able to tweak a language from my iPhone and then send the new
         | lang to someone via text. (Warning: Designer is still hard to
         | use and a big refresh is overdue).
        
           | iamwil wrote:
           | Wait, what do you use treenotation for? What are the
           | languages for? I think I'm just a little surprised someone's
           | using treenotation other than to play with it.
        
             | breck wrote:
             | Oh yeah I use it everywhere. The ideas are live on prod on
             | a number of sites.
             | 
             | My recent fun public focus now is to power Scroll,
             | (https://scroll.publicdomaincompany.com/). "Scrolldown" now
             | powers my blog (an example post:
             | https://github.com/breck7/breckyunits.com/blob/main/insist-o...).
             | I think from what
             | I'm seeing so far Scrolldown may be one of the first Tree
             | Lang breakouts. Simple but powerful from extensibility.
             | 
             | TreeBase is used extensively at a few moderately successful
             | websites. I think Tree Notation (or 2D langs generally)
             | will be used OOMs more in this domain. It integrates so
             | incredibly seamlessly with Git.
             | 
             | At Our World in Data Tree Notation is used for the
             | researchers to build interactive data visualizations
             | (https://www.youtube.com/watch?v=vn2aJA5ANUc&t=145s). That
             | one uses its own implementation called "GridLang" because
             | I didn't want to depend on jtree, which is a bit too R&D
             | for a site with that kind of traffic. The 2D lang/Tree
             | Notation ideas are so simple that it's easy to roll your
             | own code and you don't have to use "jtree". I view the
             | "jtree" project in a way as just an experiment to confirm
             | that yes, you can do anything/everything without any
             | visible syntax characters. Space is all you need.
             | 
             | On the contracting side I'm helping a crypto group with a
             | shockingly ambitious 2-D crypto.
        
         | coldtea wrote:
         | > _Surely any file format/DSL/programming language you write
         | will be parsed server-side?_
         | 
         | Well, Javascript has been used for over a decade heavily on the
         | server side, with Node, WASM and other projects.
         | 
         | And as far as raw speed goes, something like v8 smokes all
         | scripting languages bar maybe LuaJit.
         | 
         | So, there's that...
        
         | chrisseaton wrote:
         | > I don't see the point in a javascript-based compiler
         | 
         | JavaScript is a full programming language. Why wouldn't it be a
         | fine choice to write a compiler in? People have a funny idea
         | that compilers are more complex software or are somehow
         | something low-level? In reality they're conceptually simple -
         | as long as your language lets you write a function from one
         | array of bytes to another array of bytes, then you can write a
         | compiler in it. And for practicalities beyond that you just
         | need basic records or objects or some other kind of structure,
         | and you can have a pleasant experience writing a compiler.
         | 
         | > Surely any file format/DSL/programming language you write
         | will be parsed server-side?
         | 
         | JavaScript can be used user-side, or anywhere else. It's just a
         | regular programming language.
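
To make the "function from one array of bytes to another" framing concrete, here is a minimal sketch in Python. It assumes a toy source language (postfix arithmetic over non-negative integers) and a made-up stack-machine target; the opcode names and `compile_rpn` are hypothetical, chosen only for illustration, and none of this comes from Ohm.

```python
# Hypothetical opcodes for an imaginary stack machine (illustration only).
OPCODES = {b"+": b"ADD", b"-": b"SUB", b"*": b"MUL", b"/": b"DIV"}


def compile_rpn(source: bytes) -> bytes:
    """Translate postfix arithmetic (bytes in) to stack-machine text (bytes out)."""
    output = []
    depth = 0  # number of values the emitted code would leave on the stack
    for token in source.split():
        if token.isdigit():
            output.append(b"PUSH " + token)
            depth += 1
        elif token in OPCODES:
            if depth < 2:
                raise SyntaxError(f"operator {token!r} needs two operands")
            output.append(OPCODES[token])
            depth -= 1  # two operands consumed, one result produced
        else:
            raise SyntaxError(f"unknown token: {token!r}")
    if depth != 1:
        raise SyntaxError("program must leave exactly one value on the stack")
    return b"\n".join(output) + b"\n"


if __name__ == "__main__":
    print(compile_rpn(b"1 2 + 3 *").decode())
```

Everything a larger compiler adds (an AST, symbol tables, optimization passes) sits between those two byte arrays, which is where records or objects become useful.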
        
           | dw-im-here wrote:
           | I'd rather put my hand in boiling water than develop a
           | compiler in a dynamic weak typed language.
        
             | chrisseaton wrote:
             | My experience doing both in practice is that the type
             | system helps you with things that aren't really a problem
             | anyway (a compiler doesn't really have complex data
             | structures and you don't often get these basic things
             | wrong) and all but the most sophisticated type systems
             | don't even begin to help you with things you really need
             | help with - maintaining invariants.
        
             | [deleted]
        
             | rurban wrote:
             | Because you prefer to be beaten by a stick after work, right?
             | Helps your swollen hand.
             | 
             | Lisp is one of the best compiler implementation languages.
             | Doing the same in C or C++ is about 3-20x more effort.
        
             | pwdisswordfish6 wrote:
             | Write a compiler in a strongly typed language, and then
             | remove all the type annotations. This may come as a shock,
             | but this is what a compiler (or any codebase) could look
             | like when developed in a weakly typed language.
        
               | dw-im-here wrote:
               | You're confusing strong and static typing (javascript has
               | neither). In more sophisticated languages such as scala,
               | C# or haskell you can let the compiler infer the types
               | for you, and you can then ask your IDE which type that
               | is. This way you don't need to type out all the
               | boilerplate, you get to see what a function's signature
               | is, and you get compiler errors rather than runtime
               | errors.
        
               | emteycz wrote:
               | Welcome to TypeScript.
        
               | mintplant wrote:
               | That doesn't work if you're using types for anything
               | beyond correctness-checking. Type-driven dispatch, for
               | example, which tends to be used heavily in big compiler
               | and interpreter projects. And tagged unions (or algebraic
               | datatypes), a natural fit for representing ASTs, become
               | more unwieldy without type-directed features like pattern
               | matching.
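
As a rough illustration of type-driven dispatch over a tagged-union-style AST, here is a minimal sketch in Python, a dynamically typed language. The node classes (`Num`, `Add`, `Mul`) and the `evaluate` function are hypothetical names invented for this example. Dispatch uses the standard library's `functools.singledispatch`, which selects an implementation from the runtime type of the first argument.

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Union


# Each dataclass acts as one variant of the union; the class is the tag.
@dataclass(frozen=True)
class Num:
    value: float


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add, Mul]


@singledispatch
def evaluate(node) -> float:
    # Fallback for any type without a registered implementation.
    raise TypeError(f"unknown node type: {type(node).__name__}")


@evaluate.register(Num)
def _(node):
    return node.value


@evaluate.register(Add)
def _(node):
    return evaluate(node.left) + evaluate(node.right)


@evaluate.register(Mul)
def _(node):
    return evaluate(node.left) * evaluate(node.right)


if __name__ == "__main__":
    print(evaluate(Add(Num(1), Mul(Num(2), Num(3)))))
```

The dispatch itself works without a static type system. What a static checker adds is exhaustiveness: here a forgotten variant only surfaces as a `TypeError` at run time.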
        
               | pwdisswordfish6 wrote:
               | Sounds like a double standard and possibly moving the
               | goalposts. There are strongly typed languages that don't
               | have those features, and compiler codebases that don't
               | use that kind of architecture. Do they get a pass or not?
        
               | klibertp wrote:
               | > Type-driven dispatch
               | 
               | Smalltalk and everything since then would like a word
               | with you.
               | 
               | > And tagged unions (or algebraic datatypes) [...] type-
               | directed features like pattern matching.
               | 
               | Erlang and Prolog would like to have a chat, too.
        
               | anonymoushn wrote:
               | By type-driven dispatch do you mean dynamic dispatch on
               | more than 1 parameter? Most statically typed languages do
               | not have this and you have to write a bunch of
               | boilerplate to get them to pretend that they do.
        
               | rixed wrote:
               | > Write a compiler in a strongly typed language, and
               | then remove all the type annotations.
               | 
               | Help! That's what I did. I chose to write the compiler in
               | OCaml, a language that's already ~30 years old by now.
               | But I can not find any type annotations! What should I do?
               | I'm stuck!
        
               | pwdisswordfish6 wrote:
               | No, you're done.
               | 
               | > language that's already ~30 years old by now
               | 
               | Relevance?
        
               | TsiCClawOfLight wrote:
               | OCaml is statically typed, it just uses type inference
               | for 99% of cases. So you're wrong in this case, he got
               | all the advantages from static typing. Errors are found
               | at compile time.
        
           | e12e wrote:
           | > I don't see the point in a javascript-based compiler
           | 
           | Typescript, sass, jsx... There are a lot of languages running
           | on top of js. Or you might want to do colorizing,
           | autoformatting on input in the browser?
           | 
           | Along with all that, there's as mentioned nodejs, deno for
           | running server side.
           | 
           | But at any rate - lots of front-end problems involve various
           | kinds of parsing/validation and transformation (eg:
           | processing.js).
        
           | centimeter wrote:
           | > Why wouldn't it be a fine choice to write a compiler in?
           | 
           | Javascript doesn't seem suited to compiler construction
           | because it lacks lots of features that make compiler
           | construction pleasant (e.g. strong rich types, algebraic data
           | types, etc.)
           | 
           | It might be "fine" but it's not "good".
        
         | acarabott wrote:
         | I interned with the PI behind Ohm (Alex Warth) and one of his
         | reasons for using the browser was simple:
         | 
         | "If I send someone an executable, they will never download it.
         | If I send them a URL, they have no excuse."
        
           | BiteCode_dev wrote:
           | We are talking about a compiler here.
           | 
           | If someone interested in a compiler doesn't download it, it's
           | not an excuse, it's a filter. Or a warning sign.
        
             | coolreader18 wrote:
             | I mean it's JavaScript, I don't think it's intended for you
             | to write C compilers in it - but for compile-to-JS
             | languages, it's a real asset to be able to run it in the
             | browser, although more and more that can be done with
             | WebAssembly as well. However, look at the project listed as
             | using it - it may not even be for web languages, but just
             | projects that need to parse something.
        
             | coldtea wrote:
             | Spoken like someone who has never taught real students!
        
             | pwdisswordfish6 wrote:
             | You know all those jokes that people like Linus make about
             | Real Programmers--the ones who have hair on their chests,
             | etc--you know those are all _jokes_, right? Jokes in the
             | laughing-at-them sort of way, the way Colbert did it--not
             | something that you're supposed to unironically buy into.
             | 
             | > If someone interested in a compiler doesn't download it,
             | it's not an excuse, it's a filter. Or a warning sign.
             | 
             | You're so invested in gatekeeping that you're confusing the
             | point of research with technofetishism.
             | 
             | Here's what Joe Armstrong had to say in "The Mess We're
             | In":
             | 
             | "I downloaded this program, and I followed the
             | instructions, and it said I didn't have grunt installed!
             | [...Then] I installed grunt, and it said grunt was
             | installed, and then I ran the script that was gonna make my
             | slides... and it said 'Unable to find local grunt'."
             | 
             | Looks like someone needs to go dig up Joe and let him know
             | that the real problem is that there was a mistake in
             | letting him get past the point where he was supposed to be
             | filtered out.
        
             | d110af5ccf wrote:
             | > doesn't download it, it's not an excuse, it's a filter
             | 
             | If it's a decently large project, sure. But if it's a small
             | project with only a couple contributors who I've never
             | heard of? There's the potential for that to be hiding
             | malicious code. Plus the potential complexity of getting a
             | project that's only ever been built on (say) 2 computers to
             | successfully compile and run on _my_ system. Plus figuring
             | out whatever build system and weird flags they happen to
             | use. And potentially wrangling a bunch of dependencies.
             | 
             | All that just to take a quick look at a language that might
             | not actually be of interest to me in the end. The browser
             | offers huge benefits here - follow a link and play around
             | in a text box. It _just works_. (This is also why I use
             | Godbolt - I don't want to bother with a Windows VM or
             | wrangle ten different versions of Clang and GCC myself.)
        
         | kesava wrote:
         | A ton of front end templating languages/frameworks. They
         | involve compilers to different degrees, don't they?
        
         | TheRealPomax wrote:
         | If your ecosystem is JS, having a JS based compiler is pretty
         | convenient. As long as it's just "slower by some constant",
         | rather than by a runtime order, the fact that it's not as fast
         | as yacc/bison etc. is pretty much irrelevant, so being able to
         | keep everything JS is quite powerful for people new to the idea
         | having started their programming career using JS, as well as
         | seasoned devs working in large JS codebases.
         | 
         | (and you can always decide that you need more speed - if you
         | have a grammar defined, it's almost trivial to feed it to some
         | other parser-generator)
        
         | [deleted]
        
         | peterhunt wrote:
         | There's definitely a use for js based parsing for tooling that
         | runs in the browser (autocomplete, documentation browsing etc).
         | Integration with the Monaco editor is a common use case.
        
       | Sparkyte wrote:
       | Ohm just makes me want to watch Nausicaa of the Valley of the
       | Wind.
        
       | TheRealPomax wrote:
       | It'd be cool if the online editor dispensed with the need to
       | "write the grammar" entirely. A node based parser-generator in
       | addition to Ohm being yet another grammar based parser-generator
       | would be pretty great.
        
         | ampdepolymerase wrote:
         | Even better would be to generate parser from examples. See the
         | Microsoft Research Excel Flash Fill paper.
        
       | joshmarinacci wrote:
       | I'm so happy to see this on HN. I've used Ohm for several
       | projects. If you want a tutorial for building a simple
       | programming language using Ohm, check out this series I put on
       | GitHub.
       | 
       | https://github.com/joshmarinacci/meowlang
        
       | j0e1 wrote:
       | This is an example of a library we built using Ohm:
       | https://github.com/Bridgeconn/usfm-grammar [1]
       | 
       | It works great for our use-case though I have been eyeing
       | tree-sitter[2] for its ability to do partial parses.
       | 
       | [1] USFM: https://ubsicap.github.io/usfm/
       | [2] https://tree-sitter.github.io/tree-sitter/
        
       | PaulHoule wrote:
       | Each PEG generator promises a revolution but only burns a car.
       | 
       | I was disappointed with how they do operator precedence; they use
       | the usual trick to make a PEG do operator precedence which looks
       | cool when you apply it to two levels of precedence but if tried
       | you tried to implement C or Python in it, it gets unwieldy. Most of your AST
       | winds up being nodes that exist just to force precedence in your
       | grammar, working with the AST is a mess.
       | 
       | For all the horrors of the Bell C compilers, having an explicit
       | numeric precedence for operators was a feature in yacc that newer
       | parser gens often don't have.
       | 
       | I worked out the math and it is totally possible to add a stage
       | that adds the nodes to a PEG to make numeric precedence work and
       | also delete the fake nodes from the parsed AST. Unparsing I'm not
       | so sure of, since if someone wrote
       | 
       |     int a = (b + c);
       | 
       | how badly you want to keep the parens is up to you; a system like
       | that MUST have an unparse-parse identity in terms of 'value of
       | the expression', but for sw eng automation you want to keep the
       | text of the source code as stable as you can.
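
For comparison, here is a minimal sketch of numeric operator precedence in a hand-written parser, using precedence climbing. This is a swapped-in technique for illustration, not Ohm's mechanism and not the PEG rewriting stage described above. The `PRECEDENCE` table, the token set, and the tuple-based AST are all assumptions made for the example.

```python
import re

# Hypothetical precedence table: a higher number binds tighter.
# All operators here are treated as left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/()])")


def tokenize(src):
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(src):
    tokens = tokenize(src)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def primary():
        if peek() is None:
            raise SyntaxError("unexpected end of input")
        tok = advance()
        if tok == "(":
            node = expression(1)
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node  # the parentheses themselves leave no node behind
        if tok == ")" or tok in PRECEDENCE:
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    def expression(min_prec):
        left = primary()
        # Keep consuming operators that bind at least as tightly as min_prec.
        while peek() in PRECEDENCE and PRECEDENCE[peek()] >= min_prec:
            op = advance()
            # +1 makes the operator left-associative.
            right = expression(PRECEDENCE[op] + 1)
            left = (op, left, right)
        return left

    tree = expression(1)
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


if __name__ == "__main__":
    print(parse("a + b * c"))
    print(parse("(b + c)"))
```

The resulting tree contains only operator nodes, with no wrapper node per precedence level, and adding a level means adding a table entry rather than a grammar rule. It also shows the unparsing concern: `(b + c)` and `b + c` produce the same tree, so the original parentheses are lost unless they are recorded separately.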
        
       | branneman wrote:
       | When should one use Ohm over Racket?
        
         | coldtea wrote:
         | When they want a library and toolkit for building parsers and
         | languages, rather than a general programming language based on
         | Scheme.
        
           | branneman wrote:
           | ... but racket basically exists to create parsers and
           | languages. It happens to also be a general programming
           | language. But so is JS nowadays with Node.
        
           | dunefox wrote:
           | So, I guess you don't know why OP specifically asked about
           | Racket: https://www.cs.utah.edu/plt/dagstuhl19/
           | https://beautifulracket.com/stacker/why-make-languages.html
        
             | coldtea wrote:
             | Nah, I know about Racket's DSL support and touting itself
             | as friendly to language writing, but it's still not the
             | same as a dedicated parsing toolkit, the same way I
             | wouldn't consider a Lisp with reader macros equivalent
             | either...
        
       | hardwaregeek wrote:
       | I've used PEGs in the past. They're nice since they combine the
       | mental model of LL grammars with the automation of LALR parser
       | generators. However, it is quite easy to accidentally write rules
       | where you never parse the second rule due to the ordering
       | priority for rules. For instance:
       | 
       |     ident ::= name | name ("." name)+
       | 
       | Because with PEGs, the parser tries the first rule, then the
       | second, and because whenever the second rule matches, the first
       | one will also match, we will never parse the second rule. That's
       | kinda annoying.
       | 
       | Of course with PEG tools you could probably solve this by
       | computing the first sets for both rules and noticing that they're
       | the same. Hopefully that's what this tool does.
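
The shadowing behaviour can be reproduced with a few hand-rolled combinators. This is a minimal sketch in Python, not Ohm's implementation. The helper names (`regex`, `seq`, `many1`, `choice`) are hypothetical, and each parser simply returns the end position of its match, or `None` on failure.

```python
import re
from typing import Callable, Optional

# A parser takes (text, position) and returns the new position, or None.
Parser = Callable[[str, int], Optional[int]]


def regex(pattern: str) -> Parser:
    compiled = re.compile(pattern)

    def parse(text, pos):
        m = compiled.match(text, pos)
        return m.end() if m else None

    return parse


def seq(*parsers: Parser) -> Parser:
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos

    return parse


def many1(parser: Parser) -> Parser:
    def parse(text, pos):
        pos = parser(text, pos)
        if pos is None:
            return None
        while True:
            nxt = parser(text, pos)
            if nxt is None:
                return pos
            pos = nxt

    return parse


def choice(*parsers: Parser) -> Parser:
    # PEG ordered choice: the first alternative that succeeds wins,
    # and later alternatives are never tried at this position.
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None

    return parse


name = regex(r"[A-Za-z_]\w*")
dotted = seq(name, many1(seq(regex(r"\."), name)))

ident_shadowed = choice(name, dotted)  # ident ::= name | name ("." name)+
ident_fixed = choice(dotted, name)     # longer alternative listed first

if __name__ == "__main__":
    print(ident_shadowed("a.b.c", 0))  # stops after the first name
    print(ident_fixed("a.b.c", 0))     # consumes the whole dotted path
```

With the original ordering, `ident` stops after `a` and leaves `.b.c` unconsumed. Listing the longer alternative first lets it consume the whole dotted name. An equivalent fix, assuming the grammar allows it, is to write the rule as a single alternative, `name ("." name)*`.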
        
         | sleavey wrote:
         | This is what's called left-recursion, and there's indeed a way
         | to deal with it in PEG parsers:
         | https://github.com/PhilippeSigaud/Pegged/wiki/Left-Recursion.
        
       ___________________________________________________________________
       (page generated 2021-03-28 23:02 UTC)