[HN Gopher] Adding syntax to the CPython interpreter
       ___________________________________________________________________
        
       Adding syntax to the CPython interpreter
        
       Author : leontrolski
       Score  : 129 points
       Date   : 2024-10-17 10:23 UTC (2 days ago)
        
 (HTM) web link (leontrolski.github.io)
 (TXT) w3m dump (leontrolski.github.io)
        
       | nickpresta wrote:
       | The linked, more detailed post in the article is far more
       | interesting:
       | https://miguendes.me/what-if-python-had-this-ruby-feature
        
         | sph wrote:
         | Did OP basically summarise another blog post and publish it on
         | their own blog, even though the original is linked? Not a good
         | look, also because it's "narrated" in the first person, as if
         | they were the one to implement this change.
        
       | nemoniac wrote:
       | You could argue that this is not merely adding syntax, but also
       | adding the associated semantics.
       | 
       | Anyway, if you found this interesting, you might enjoy Eli
       | Bendersky's blog post from nearly 15 years ago where he adds an
       | "until ... do" statement to Python.
       | 
       | https://eli.thegreenplace.net/2010/06/30/python-internals-ad...
        
         | pansa2 wrote:
         | That blog post seems a lot more involved - it adds code to the
         | bytecode compiler as well as to the parser.
         | 
         | I suspect that's mostly because the `until` statement is more
         | complex - but another factor seems to be Python's change in
         | parsing technology from LL(1) to PEG.
        
         | asplake wrote:
         | I sometimes think it would be fun to do "do: ...
         | where:/while/until", wherein the "where" part does local
         | bindings.
        
           | tpoacher wrote:
           | there is a nice "staticscope" package in python, which acts
           | as a context manager, and does pretty much what you refer to.
           | pretty neat.
           | 
           | https://pypi.org/project/scoping/
        
       | Y_Y wrote:
       | If only there was a language that let you modify the interpreter
       | on the fly so you could do this as part of normal execution...
        
         | BiteCode_dev wrote:
         | Python can actually do this using "# coding:", albeit less
         | elegantly than lisp.
         | 
         | I would say it's a good thing; I don't want to see hundreds of
         | half-baked, badly tested and vaguely documented DSLs with no
         | decent tooling support.
        
           | develatio wrote:
           | Can you provide some more info / links regarding the "#
           | coding:" feature? I wasn't able to find anything.
        
             | klibertp wrote:
             | If you place `# coding: utf-8` on the first line of a
             | Python script, it'll try to interpret the raw bytes
             | contained later in the file by passing them through a
             | relevant codec[1]. Since the codec receives the raw source
             | and transforms it before any interpretation happens, you
             | can (here's the point I'm starting to guess) supply a codec
             | that parses the code, transforms the AST, and dumps it back
             | to the source for execution. It would be similar to how
             | "parse transforms" work in Erlang, though there you're
             | handed AST and not bytes.
             | 
             | [1] https://docs.python.org/3/library/codecs.html
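             | 
             | A rough sketch of the parse/transform/dump round trip, not
             | a real codec: `ast.parse` only accepts valid Python, so a
             | made-up `when(cond, value)` call stands in for new syntax
             | and is rewritten to `value if cond else None`. The names
             | `when`, `WhenToTernary` and `transform` are invented for
             | illustration; `ast.unparse` needs Python 3.9+.
             | 

```python
import ast


class WhenToTernary(ast.NodeTransformer):
    """Rewrite the placeholder call when(cond, value) into a ternary."""

    def visit_Call(self, node):
        self.generic_visit(node)  # handle nested when(...) calls first
        is_when = (
            isinstance(node.func, ast.Name)
            and node.func.id == "when"
            and len(node.args) == 2
            and not node.keywords
        )
        if not is_when:
            return node
        cond, value = node.args
        # Equivalent to: value if cond else None
        return ast.IfExp(test=cond, body=value, orelse=ast.Constant(value=None))


def transform(source: str) -> str:
    """Parse source, rewrite the AST, and dump it back to source text."""
    tree = WhenToTernary().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


rewritten = transform("spam = when(bar, eggs())")
print(rewritten)
```

             | 
             | Unlike an ordinary `when` function, the rewritten form
             | never evaluates `eggs()` when `bar` is false.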
        
               | zahlman wrote:
               | >how "parse transforms" work in Erlang, though there
               | you're handed AST and not bytes.
               | 
               | Oh, I didn't know about this. Was thinking about doing
               | something similar in my own language, so I'll have to
               | check that out.
        
             | JackC wrote:
             | The idea is you can register a custom file encoding,
             | analogous to built-in file encodings like utf-8, and use it
             | to transform the source file before python loads it. An
             | example is pyxl, kind of like jsx, where you put `# coding:
             | pyxl` at the top and then it will transform bare `<html>`
             | in the python file into calls to a template builder:
             | https://github.com/gvanrossum/pyxl3
             | 
             | Incidentally this turns out to be super hard to search for
             | without asking an LLM, since "python coding" is so
             | overloaded, and using the feature this way is intentionally
             | undocumented because it's not really what it's for, and not
             | something I think most python users really want to
             | encourage. So, forbidden python knowledge!
        
               | kevindamm wrote:
               | It's specified in PEP 263 [0] which pulls up easier on
               | search if you quote it and include the typical "special
               | comment symbol" -*- around it.
               | 
               | [0]: https://peps.python.org/pep-0263/
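               | 
               | For what it's worth, the stdlib can tell you which
               | encoding such a comment declares. A small sketch using
               | `tokenize.detect_encoding`; the sample source bytes are
               | made up for the example.
               | 

```python
import io
import tokenize

# Made-up file contents with an Emacs-style PEP 263 comment on line 1.
source = b"# -*- coding: latin-1 -*-\nprint('hello')\n"

# detect_encoding() reads at most two lines and returns the declared
# encoding (normalized) plus the raw lines it consumed while looking.
encoding, lines_read = tokenize.detect_encoding(io.BytesIO(source).readline)
print(encoding)
```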
        
             | BiteCode_dev wrote:
             | Codecs (https://docs.python.org/3/library/codecs.html) can
             | change the content of the file on the fly. You can abuse
             | that to create new syntax, although it is evidently not the
             | original intent.
             | 
             | Let's say you hate significant whitespace, here is a (very
             | fragile) PoC for your pain:
             | 
             | https://0bin.net/paste/42AQCTIC#dLEscW0rWQbE70cdnVCCiY72VuJ
             | w...
             | 
             | Import that into a *.pth file in your venv, and you can
             | then do:
             | 
             |     # coding: braces_indent
             |     def main() {
             |         print("Hello, World!")
             |         if True {
             |             print("This is indented using braces.")
             |         }
             |     }
             | 
             | You also can use import hooks (python ideas does that),
             | bytecode manipulations (pytest does that) or use the ctypes
             | module (forbiddenfruit does that).
             | 
             | Still, I'm very happy it stays limited to super specific
             | niches. Big power, big responsibilities, and all that.
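             | 
             | A minimal sketch of the codec trick, assuming a made-up
             | `unless` keyword and a made-up codec name `unless_demo`
             | (`_decode` and `_search` are illustrative helpers too). It
             | only shows the decode step on a bytes object; as far as I
             | can tell, a real `# coding:` hook also has to be
             | registered before the file is imported and needs an
             | incremental decoder, which this leaves out.
             | 

```python
import codecs


def _decode(data, errors="strict"):
    """Decode UTF-8 bytes, then rewrite a made-up `unless` keyword."""
    text = bytes(data).decode("utf-8", errors)
    # Naive text replacement: it would also mangle strings and comments.
    return text.replace("unless ", "if not "), len(data)


def _search(name):
    """Codec search function that only answers for the hypothetical codec."""
    if name != "unless_demo":
        return None
    utf8 = codecs.lookup("utf-8")
    return codecs.CodecInfo(name="unless_demo", encode=utf8.encode, decode=_decode)


codecs.register(_search)

# Decode "source code" through the custom codec, then run the result.
raw = b"unless done:\n    result = 'still working'\n"
source = raw.decode("unless_demo")

namespace = {"done": False}
exec(source, namespace)
print(namespace["result"])
```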
        
           | pas wrote:
           | That's probably an argument for a language with good DSL
           | support.
           | 
           | When this comes up I usually link to the work of Alan Kay and
           | others (the very mystical sounding STEPS project at VPRI)
           | 
           | """ The big breakthrough is making it easy to create new DSLs
           | for any situation. Every area of the OS has its own language
           | (and you can just add more if you feel the need) so that the
           | whole OS including networking and GUI is very compact,
           | understandable, and hackable. This particular project focused
           | on compactness, just to prove that it is quantitatively more
           | expressive. """
           | 
           | comment by sp332
           | https://news.ycombinator.com/item?id=11687952
           | 
           | final report from 2016
           | https://news.ycombinator.com/item?id=11686325
        
         | knighthack wrote:
         | Try Nim's macros.
        
         | klibertp wrote:
         | You're Lisp-baiting, aren't you? ;) I'd add Elixir next to Nim
         | (already mentioned); also Rust. Recently, also Scala.
         | 
         | The reason we don't have such metaprogramming available
         | everywhere is mostly because you have to subscribe to a
         | particular ideology to allow it. If you think programmers are
         | generally intelligent and responsible, you put macros and
         | metaclasses in your language. If, on the other hand, you think
         | most programmers are dumb code monkeys (with a few exceptions,
         | maybe) your language becomes like early Java or early Go.
        
           | keybored wrote:
           | That dichotomy is interesting considering Guy Steele's
           | _Growing a Language_ talk and his Scheme background. But
           | maybe just mentioning Scheme is misleading here...
        
           | Y_Y wrote:
           | (Isn't lisp-baiting the _raison d'etre_ of hn?)
           | 
           | Since you mention it, Python does have a fairly elaborate
           | metaclass system, but it seems like it's only really used for
           | implementing the language and rarely if ever wielded by
           | "users". I guess that's a reflection of the language ideology
           | you're talking about.
           | 
           | Also for what it's worth, I know myself to be a dumb code
           | monkey, but being in the CRUD gutter doesn't preclude me from
           | looking at the metasyntactic stars.
        
         | radarsat1 wrote:
         | Why would I want to extend the language syntax as part of
         | normal execution?
        
       | pkolchanov wrote:
       | Great post!
       | 
       | I have worked on a similar series of CPython customization
       | articles that you might find interesting:
       | 
       | https://github.com/pkolchanov/CustomizingPython
        
       | tgv wrote:
       | Great, now you have the dangling else in python, the one thing
       | that actually was solved by significant white space.
        
       | dig1 wrote:
       | This is where Lisp-like languages excel with macros. An example
       | in Clojure:
       | 
       |     (defmacro do* [body expr cond]
       |       `(~expr ~cond
       |          ~body))
       | 
       |     (do* (println "hello") if true)
       |     (do* (println "hello") when-not (> 1 2))
        
         | BoingBoomTschak wrote:
         | Yeah, this is the kind of material that shows that any attempt
         | to prevent/hamper meta-programming or punt it to preprocessors
         | is at best misguided. It also feeds the smug Lisp weenie within
         | me.
         | 
         |     Above all the wonders of Lisp's pantheon stand its
         |     metalinguistic tools; by their grace have Lisp's acolytes
         |     been liberated from the rigid asceticism of lesser faiths.
         |     Thanks to Macro and kin, the jolly, complacent Lisp hacker
         |     can gaze through a fragrant cloud of setfs and defstructs
         |     at the emaciated unfortunates below, scraping out their
         |     meager code in inflexible notation, and sneer
         |     superciliously. It's a good feeling.
         | 
         |     -- iterate manual, A.1 Introduction
         | 
         | do..while is also the example used here:
         | https://www.tcl-lang.org/man/tcl9.0/TclCmd/uplevel.html
        
       | zitterbewegung wrote:
       | This is a cool article. If you want to understand how to
       | contribute to python there is a guide at
       | https://devguide.python.org
        
         | tmgehrmann wrote:
         | Be aware, however, that the inner circle will use you, take
         | your contributions, and, if you develop your own opinions,
         | publicly ban and defame you without any possibility of setting
         | any record straight.
         | 
         | python-dev is a corporate shark tank where only personalities
         | and employers matter (good code or ideas are optional).
        
           | pas wrote:
           | That's a pretty strong claim, (and since most inner circles
           | are hard to get into, I even assume it's not without any
           | basis in reality), yet could you please provide some exhibits
           | to support it?
        
             | 0x1242149 wrote:
             | The inner circle emerged after GvR resigned. It largely
             | consists of people who haven't contributed that much to
             | Python3 (sometimes nothing at all in terms of code).
             | 
             | The members occupy different chairs in the PSF, Steering
             | Council and the all-powerful CoC troika. They rotate,
             | sometimes skip one election and then come back.
             | 
             | Their latest achievement is the banning of Tim Peters and
             | others:
             | 
             | https://news.ycombinator.com/item?id=41234180
             | 
             | https://news.ycombinator.com/item?id=41385546
             | 
             | Tim Peters is just the tip of the iceberg. Many bans are
             | private, intimidation is private.
             | 
             | Steering council members can insult, bully and mock others
             | without any CoC consequences for themselves and keep
             | getting elected. That is how you know that they are in the
             | inner circle.
        
           | albertzeyer wrote:
           | That's not at all my experience. I contributed often to
           | CPython by filing many issues and also one PR, and the whole
           | process was very reasonable.
        
       | cpburns2009 wrote:
       | For all of the syntax features Python has been adding over the
       | years, this would be a nice enhancement: making the "else None"
       | optional in the ternary if-expression. E.g.,
       | 
       |     spam = eggs if bar
       |     # vs
       |     spam = eggs if bar else None
        
         | genter wrote:
         | So if "else None" is omitted, if bar is false, then does spam
         | == None or is it unmodified? The former is what I think you
         | want, but that would be very confusing.
        
           | 333c wrote:
           | Yeah, it would be confusing. In Ruby, this would leave spam
           | unmodified.
        
             | 8n4vidtmkvmk wrote:
             | That's even more confusing. There's an =, I expect an
             | assignment to happen. Maybe we just leave this one alone.
        
               | trehalose wrote:
               | Perhaps using a parenthesized assignment expression with
               | the walrus operator would be unambiguous:
               | 
               | (spam := eggs) if bar
               | 
               | I think that seems reasonable? It would act just the same
               | as it already does with an explicit `else None`, if I'm
               | not mistaken. I don't find it beautiful though.
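               | 
               | A quick check of that behaviour with the explicit
               | `else None` that Python requires today (the variable
               | values are just placeholders):
               | 

```python
spam = "original"

bar = False
(spam := "eggs") if bar else None  # condition false: assignment is skipped
after_false = spam

bar = True
(spam := "eggs") if bar else None  # condition true: assignment runs
after_true = spam

print(after_false, after_true)
```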
        
               | 333c wrote:
               | In Ruby, the `if [condition]` modifier at the end of a
               | line is used for more cases than just assignment. For
               | example, `return if s.blank?` and `raise "invalid" if
               | input.length > 100`. In Ruby, this pattern makes it clear
               | that the statement is only executed if the condition is
               | met.
               | 
               | I'm not advocating for this feature to be added to
               | Python, just explaining why it's not confusing in Ruby.
        
               | mypalmike wrote:
               | Though I used it for 2 years as my primary language
               | at work, I never quite got used to Ruby's quirky idioms
               | like this. It just reads badly to me, in terms of quickly
               | understanding code flow, to have statements that start
               | with "raise" or "return" which might not raise or return.
               | Similar to the up-thread comment about assignment.
        
         | TwentyPosts wrote:
         | Do we really need more syntactic sugar? Frankly, I am still
         | confused why Python is going for a separate syntax for if
         | expressions instead of just making its regular ifs into
         | expressions
        
           | cpfohl wrote:
           | This is literally just a short tutorial helping people get
           | into the cpython codebase. This feedback is off topic.
           | 
           | Related to your comment, though: Python has had this syntax
           | for a VERY long time.
        
           | albertzeyer wrote:
           | There is an advantage of having this as syntax, which is a
           | difference in semantics, which you couldn't get if this would
           | just be an expression, namely:
           | 
           |     x = func1() if something else func2()
           | 
           | In this example, only func1() or func2() is being called, but
           | not both.
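           | 
           | A small sketch of the difference, comparing the conditional
           | expression with an ordinary helper function (`choose` and
           | the `calls` log are made up for the example):
           | 

```python
calls = []


def func1():
    calls.append("func1")
    return 1


def func2():
    calls.append("func2")
    return 2


def choose(cond, a, b):
    """Plain function: both arguments are evaluated before the call."""
    return a if cond else b


something = True

# Conditional expression: only the selected branch is evaluated.
x = func1() if something else func2()
lazy_calls = list(calls)

# Function call: func1() and func2() both run before choose() is entered.
calls.clear()
y = choose(something, func1(), func2())
eager_calls = list(calls)

print(lazy_calls, eager_calls)
```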
        
             | gray_-_wolf wrote:
             | I do not understand why
             | 
             |     x = if something:
             |         func1()
             |     else:
             |         func2()
             | 
             | Would not work. At least that is what I imagine when it is
             | said "make if an expression".
        
               | __mharrison__ wrote:
               | This doesn't work because the if statement is a
               | statement. Statements don't evaluate to anything in
               | Python, so you can't assign it to x.
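               | 
               | You can confirm this without leaving Python by handing
               | the snippet to `compile()`; a small sketch:
               | 

```python
# The hypothetical "if as an expression" form from the parent comment.
source = "x = if something:\n    func1()\nelse:\n    func2()\n"

try:
    compile(source, "<example>", "exec")
    accepted = True
except SyntaxError:
    accepted = False

# The existing conditional expression compiles without complaint.
compile("x = func1() if something else func2()", "<example>", "exec")

print(accepted)
```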
        
               | 8n4vidtmkvmk wrote:
               | Hence "make it an expression"
        
               | otabdeveloper4 wrote:
               | It's way too late to fix the "statement vs expression" bug
               | in Python's design.
               | 
               | We could have done that in the v3 switch, but we decided
               | to spend man-centuries of effort on chasing already
               | deprecated legacy Windows Unicode semantics instead.
        
               | zahlman wrote:
               | Other languages do this sort of thing, so certainly it
               | "would work". But it's very much counter to the design
               | and aesthetics of Python.
               | 
               | Python is intended to enforce a strong distinction
               | between statements and expressions (`:=` notwithstanding
               | :/) because it sidesteps a lot of questions that one
               | might otherwise ask about how it's intended to be parsed
               | (including by humans).
               | 
               | Being able to write something like your example makes it
               | harder to figure out where the end is, figure out what
               | happens if there's more than one statement inside an `if`
               | (do we only consider the result of the _last_ expression?
               | What if the last thing isn't an expression?), etc. By
               | the time you get to the end of understanding what's
               | happening inside the block, you can lose the context that
               | the result is being assigned to `x`.
               | 
               | At the other extreme, _everything_ is an expression, and
               | you have Lisp with funky syntax. But Python holds that
               | this syntactic structure is important for understanding
               | the code. I sense that this is part of what "Flat is
               | better than nested" is intended to mean.
        
         | kstrauser wrote:
         | I almost never use None as the second value here. From my POV,
         | that would be new syntax for an unlikely situation.
        
       | bbb651 wrote:
       | You can almost do it with `condition and value`, but you get
       | `False` instead of `None` (and adding any more code makes this
       | worse than `value if condition else None`. Interestingly, Lua-like
       | `<condition> and <true-expression> or <false-expression>`
       | ternaries actually work in python, with the added footgun that
       | <true-expression> must be truthy and people will despise you).
       | 
       | Rust for example has a solution for this in std, there's a
       | `bool::then_some` method (and `bool::then` that takes a closure to
       | avoid eagerly evaluating the value), but `if { ... }` isn't an
       | expression like `if { ... } else { ... }` is, probably to avoid
       | coupling the language to `Option`.
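       | 
       | A short sketch of both Python points, using `0` as the falsy
       | value (the variable names are just for illustration):
       | 

```python
value = 0  # a falsy but perfectly valid value

short_true = True and value    # yields value itself
short_false = False and value  # yields False, not None

# Lua-style ternary: a falsy true-branch falls through to the fallback.
lua_style = True and value or "fallback"

# The real conditional expression has no such problem.
ternary = value if True else "fallback"

print(short_true, short_false, lua_style, ternary)
```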
        
         | steveklabnik wrote:
         | Option is already a lang item, so that wouldn't be an issue. I
         | don't know what the actual underlying reason is, or if it was
         | ever even proposed.
        
           | bbb651 wrote:
           | I meant more than it already is, I referenced this:
           | https://stackoverflow.com/a/43339003. I do think it would be
           | neat but it's niche enough to not be worth the non-obvious
           | behavior of boxing with `Option`, e.g. you might forget an
           | `else { ... }` and get complex type errors.
        
       | zahlman wrote:
       | >Condensed version of this cool blog post.
       | 
       | The effort is very much appreciated.
        
       | dalke wrote:
       | In the bygone era of 2008, I had fun adding Perl/Ruby-style
       | pattern matching.
       | 
       |     for line in open("python_yacc.py"):
       |         if line =~ m/def (\w+)/:
       |             print repr($1)
       | 
       | See
       | http://www.dalkescientific.com/writings/diary/archive/2008/0...
       | 
       | I started by writing a Python grammar definition based on PLY
       | (see http://www.dabeaz.com/ply/) , then tweaked it to handle the
       | new feature and emit the right AST. That's when I discovered the
       | following was valid Python:
       | 
       |     >>> a=b=c=d=e=1
       |     >>> del a, (b, (c, (((((d,e)))))))
       | 
       | I don't think PLY can handle the newest Python grammar, but I
       | haven't looked into it.
       | 
       | For what it's worth, my Python grammar was based on an older
       | project I did called GardenSnake, which is still available as a PLY
       | example, at
       | https://github.com/dabeaz/ply/tree/master/example/GardenSnak... .
       | 
       | I've been told it was one of the few examples of how to handle
       | an indentation-based grammar using a yacc-esque parser generator.
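       | 
       | The nested `del` target from earlier still works in Python 3; a
       | quick check, run inside a scratch namespace so nothing leaks
       | (the `namespace` and `remaining` names are just for the example):
       | 

```python
namespace = {}

# Bind five names, then delete them all through one nested del target.
exec("a = b = c = d = e = 1\ndel a, (b, (c, (((((d, e)))))))", namespace)

# Collect any of the five names that survived the del statement.
remaining = [name for name in ("a", "b", "c", "d", "e") if name in namespace]
print(remaining)
```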
        
       ___________________________________________________________________
       (page generated 2024-10-19 23:01 UTC)