[HN Gopher] PEP 750: Tag Strings for Writing Domain-Specific Lan...
       ___________________________________________________________________
        
       PEP 750: Tag Strings for Writing Domain-Specific Languages
        
       Author : lumpa
       Score  : 65 points
       Date   : 2024-08-10 14:57 UTC (8 hours ago)
        
 (HTM) web link (discuss.python.org)
 (TXT) w3m dump (discuss.python.org)
        
       | formerly_proven wrote:
       | Yikes. Don't get me wrong, I totally understand the reasoning why
       | this would be useful (though I _violently disagree_ with the idea
       | of deferring the evaluation of the contained expressions), but
       | it's also so very kitchensinky and adds so little over just calling
       | a function (which doesn't require a 20-page explainer, as
       | everyone already knows how function calls work). It also promotes
       | using what looks like string interpolation (and what might be
       | string interpolation, you can't tell at the "call site") for
       | things for which we know string interpolation is the wrong tool. The
       | API also seems really, I dunno, weird to me. The string is split
       | around interpolations and verbatim portions result in one
       | argument, which is "string-like", while interpolations become
       | four-tuple-like (one of which is a lambda, which you call to
       | perform the deferred interpolation). This seems really awkward to
       | me for building stuff like the suggested use cases of XML/HTML or
       | SQL templating.
       | 
       | Also the scoping rules of this are a special case which doesn't
       | appear in regular Python code so far: "The use of annotation
       | scope means it's not possible to fully desugar interpolations
       | into Python code. Instead it's as if one is writing
       | interpolation_lambda: tag, not lambda: tag, where a hypothetical
       | interpolation_lambda keyword variant uses annotation scope
       | instead of the standard function scope." -- i.e. it's "as if you
       | wrapped all interpolation expressions in a lambda: <expr>,
       | _except it uses different scoping rules_".
        
         | masklinn wrote:
         | > This seems really awkward to me for building stuff like the
         | suggested use cases of XML/HTML or SQL templating.
         | 
         | Compared to what?
         | 
         | At the end of the day you're still doing string formatting, if
         | you want parsing, then you'd feed the item into a parser, which
         | this doesn't preclude.
         | 
         | The interface sounds a lot better than JS's anyway, as that
         | completely separates the literal strings and the interpolations
         | so you have to re-intersperse them which is muggy.
         | 
         | > interpolations become four-tuple-like
         | 
         | They become an Interpolation object, which can be unpacked if
         | you find that more convenient, but you can access the members
         | if you prefer:
         | 
         | - 0 is getvalue, the callable that retrieves the evaluated
         | expression
         | 
         | - 1 is expr, the raw text form of the expression
         | 
         | - 2 is conv, the !conversion tag (s, r, or a)
         | 
         | - 3 is format_spec
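
A minimal sketch of the four-field layout listed above, so the unpacking and attribute access can be tried in today's Python. The `Interpolation` class and the `render` function here are hypothetical stand-ins built by hand, not the PEP's implementation, and the arguments are constructed manually because the tag syntax itself is only a proposal.

```python
from typing import Any, Callable, NamedTuple, Optional


class Interpolation(NamedTuple):
    """Hypothetical stand-in for the four-field object described above."""
    getvalue: Callable[[], Any]   # 0: call to evaluate the expression
    expr: str                     # 1: raw source text of the expression
    conv: Optional[str]           # 2: conversion ("s", "r", "a") or None
    format_spec: Optional[str]    # 3: format spec or None


def render(*args):
    """Tag-like function: joins literal parts and evaluated interpolations."""
    parts = []
    for arg in args:
        if isinstance(arg, str):
            parts.append(arg)
            continue
        # A NamedTuple supports both unpacking and attribute access.
        getvalue, expr, conv, format_spec = arg
        value = getvalue()
        if conv == "r":
            value = repr(value)
        elif conv == "s":
            value = str(value)
        elif conv == "a":
            value = ascii(value)
        parts.append(format(value, format_spec or ""))
    return "".join(parts)


name = "World"
# Hand-built equivalent of the pieces for: Hello {name!r}, {1 + 1:>3}
result = render(
    "Hello ",
    Interpolation(lambda: name, "name", "r", None),
    ", ",
    Interpolation(lambda: 1 + 1, "1 + 1", None, ">3"),
)
print(result)
```

Whether a real tag function would apply `conv` and `format_spec` this way is up to the tag; this sketch simply mimics what an f-string does with them.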
        
       | dmart wrote:
       | I have to admit that at first glance I don't like this. These
       | seem to be essentially normal str -> Any functions, with some
       | naming limitations due to the existing string prefixes special-
       | cased in the language. I don't feel like adding this additional
       | complexity is worth being able to save two parentheses per
       | function call.
        
         | Epa095 wrote:
         | This was my first thought as well. But an important difference
         | is that the arguments are not eagerly evaluated, but they are
         | passed as lambdas which can be evaluated if desired. This means
         | that it can be used for example in log messages (if you don't
         | want to evaluate the string at the wrong log levels). But is it
         | worth it for that? Idk.
        
           | masklinn wrote:
           | Even with eager evaluation it's already a very compelling way
           | of managing basically every lightweight "templating" for
           | safety: e.g. embedded dynamic HTML or SQL. `markupsafe` is
           | great, but it's way too easy to perform formatting _before_
           | calling it, especially with f-strings.
           | 
           | That f-strings were "static" was by far my biggest criticism
           | of it, given how useful I find JS's template strings.
           | 
           | And this proposal seems like a straight up better version of
           | template strings:
           | 
           | - the static strings and interpolations are not split and
           | don't have to be awkwardly re-interspersed, which I've only
           | ever found to be 100% trouble and 0% utility
           | 
           | - the lazy evaluation means it can be used for things like
           | logging (which really want lazy evaluation), or
           | metaprogramming (because you can introspect the callables, and
           | you get the expression text)
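
A rough illustration of the safety point above, assuming a tag-like function receives literal strings and deferred values as separate arguments. The `html` function is hypothetical and uses the standard library's `html.escape` rather than `markupsafe`; the arguments are passed by hand since the tag syntax is only proposed.

```python
from html import escape


def html(*args):
    """Tag-like function: literal strings pass through, values get escaped."""
    out = []
    for arg in args:
        if isinstance(arg, str):
            out.append(arg)                 # trusted literal markup
        else:
            out.append(escape(str(arg())))  # untrusted deferred value
    return "".join(out)


user_input = "<script>alert(1)</script>"

# An f-string formats before any escaping function can see the pieces.
unsafe = f"<p>{user_input}</p>"

# The tag-like function still knows which parts were interpolated.
safe = html("<p>", lambda: user_input, "</p>")
print(safe)
```

The difference is that the escaping decision is made per piece, so there is no window in which the value has already been merged into the markup.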
        
             | codethief wrote:
             | > - the lazy evaluation means it can be used for things
             | like logging (which really want lazy evaluation)
             | 
             | Could you elaborate? I would find it rather surprising if
             | my log messages don't contain the data at the very moment I
             | invoke the logger.
        
               | masklinn wrote:
               | The expressions for the data you want to log out can be
               | expensive, so ideally you only want to compute them
               | _after_ you've checked if the logger was enabled for the
               | level you need.
               | 
               | In most APIs this requires an explicit conditional check
               | and the average developer will not think of it. This
               | allows said check to be performed internally.
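
The internal check described above could look roughly like this. It is a sketch only: `lazy_log` is a hypothetical helper, and the deferred expressions are modeled as plain zero-argument callables instead of real tag-string interpolations. `Logger.isEnabledFor` and `Logger.log` are the standard `logging` calls.

```python
import logging

calls = []


def expensive_summary():
    # Stands in for a costly expression inside a log message.
    calls.append("called")
    return "summary of a large object"


def lazy_log(logger, level, *parts):
    """Hypothetical helper: parts are strings or zero-argument callables."""
    if not logger.isEnabledFor(level):
        return False  # skipped, so no callable is evaluated
    message = "".join(p if isinstance(p, str) else str(p()) for p in parts)
    logger.log(level, message)
    return True


logger = logging.getLogger("tag_string_demo")
logger.setLevel(logging.WARNING)

emitted_debug = lazy_log(logger, logging.DEBUG, "state: ", expensive_summary)
calls_after_debug = len(calls)
emitted_warning = lazy_log(logger, logging.WARNING, "state: ", expensive_summary)
```

The caller writes the same line at every level, and the expensive call only happens once the level check has passed.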
        
         | Too wrote:
         | How can you call a function that does this?
         | 
         |     html'<div id={id:int}>{content:HTML|str}</div>'
         | 
         | html() is not going to be equivalent.
        
           | mrweasel wrote:
           | That is probably a much better example than any of those
           | present in the PEP. I quite like your example. I'm not sure
           | I'd want to write code like that, but it shows the usefulness
           | much more clearly.
        
           | throwitaway1123 wrote:
           | For a practical example of this technique used in JS take a
           | look at libraries like htm and lit-html:
           | https://github.com/developit/htm
        
         | jerf wrote:
         | I think at this point Python really needs to just settle down.
         | I don't like this not because it's an intrinsically bad idea,
         | but adding another thing to the already fairly large pile of
         | things a Python user needs to know in order to read somebody
         | else's code needs to be something that brings more benefits to
         | the table than just "it slightly improves a particular type of
         | function call".
         | 
         | At the risk of riling some people up, this smells like Perl.
         | Some poor Python user comes across
         | greet"Hello {user}"
         | 
         | and there isn't even so much as a symbol they can search for on
         | the internet, just an identifier smashed into a string.
         | 
         | But I guess Python and I parted ways on this matter quite a
         | while ago. I like to joke about Katamari Dama-C++ but Python is
         | starting to give it a run for its money. C++ is still in the
         | lead, but Python is arguably sustainably moving more quickly on
         | the "add more features" front.
        
           | Waterluvian wrote:
           | My guess at the challenge is that the community who maintain
           | and develop a language are by that very nature not in touch
           | with what the complexity feels like for the average user.
           | 
           | Also it's harder to do nothing than something.
           | 
           | That being said, I think this is partly abstract. I've just
           | ignored a lot of new Python features without issue. And while
           | I worried that they'd force me to learn new things to
           | understand others' code, that hasn't really materialized.
        
       | ziml77 wrote:
       | It seems the purpose of this proposal is to have a way to
       | essentially have custom string interpolation. I don't think
       | that's necessarily a bad idea on its own, but this syntax feels
       | out of place to me.
       | 
       | Instead, why not add a single new string prefix, like "l" for
       | "lazy"? So, f"hello {name}" would immediately format it while
       | l"hello {name}" would produce an object which contains a template
       | and the captured variables. Then their example would be called
       | like: greet(l"hello {name}").
        
       | TwentyPosts wrote:
       | This seems like a bad idea at first glance? Maybe I don't get the
       | whole pitch here?
       | 
       | It just doesn't seem worth it to define a whole new thing just to
       | abstract over a format() function call. The laziness might be
       | interesting, but I feel like "lazy strings" might be all that's
       | needed here. Laziness and validation (or custom string formatting
       | logic) are separate concerns and should be separated.
        
         | masklinn wrote:
         | > It just doesn't seem worth it to define a whole new thing
         | just to abstract over a format() function call.
         | 
         | That could also be leveled at f-strings themselves.
         | 
         | > Laziness and validation (or custom string formatting logic)
         | are separate concerns and should be separated.
         | 
         | In which case the one to move out is the laziness not the
         | customised interpolation. Because the latter is the one that's
         | necessary for safer dynamic SQL or HTML or whatever.
        
       | behnamoh wrote:
       | I want this in Python:
       | https://codecodeship.com/blog/2024-06-03-curl_req
       | 
       | From the article: """
       | 
       |     ~CURL[curl https://catfact.ninja/fact]
       |     |> Req.request!()
       | 
       | This is actual code; you can run this. It will convert the curl
       | command into a Req request and you will get a response back. This
       | is really great, because we have been able to increase the
       | expressiveness of the language. """
        
       | Too wrote:
       | Looks good. Would have been nice if they included a way to
       | express type checking of the format_spec. That's going to be an
       | unnecessary source of runtime errors.
        
         | 12_throw_away wrote:
         | self-deleted
        
           | layer8 wrote:
           | https://news.ycombinator.com/item?id=36674260
        
       | jtwaleson wrote:
       | For people adding insightful critique on the PEP on HN (I saw
       | some on this thread already), please ensure your opinion is
       | represented in the PEP thread itself too.
        
         | 0cf8612b2e1e wrote:
         | Given the names attached to the proposal, is this PEP actually
         | up for debate?
         | 
         | Most of the critiques I am reading (with which I fully agree)
         | are that this is more complexity in the language without
         | sufficient payoff. There are now how many stupid or fancy ways
         | to construct a Python string with a variable?
        
         | akklrgaG wrote:
         | Why would anyone do this? You risk being banned and defamed if
         | you lack deference or exceed your allotment of five messages
         | per thread.
         | 
         | Moreover, Python has resume-driven development and people with
         | a financial or reputational interest will get their new toys in
         | no matter what.
         | 
         | You would just contribute to the appearance of democracy.
        
       | mrweasel wrote:
       | My issue with this is that it will eventually sneak into libraries
       | and the users of that library would be expected to use these tag
       | strings all over the place to utilize the library. This prevents
       | people from having a uniform coding style and makes code harder to
       | read.
       | 
       | The concern isn't having features that will make it easier to
       | write DSLs, my problem is that people will misuse it in regular
       | Python projects.
       | 
       | I know that one of the authors is Guido, but I'm not buying the
       | motivation. Jinja2 and Django templates are pretty much just using
       | Python, it's not really much of an issue, and I don't believe
       | that business logic should exist in your templates anyway. As for
       | the SQL argument, it will still be possible for people to mess it
       | up, even with Tag Strings, unless you completely remove all
       | legacy code. The issue here isn't that the existing facilities
       | aren't good enough, it's that many developers aren't aware of the
       | concepts, like prepared statements. If developers aren't reading
       | the docs to learn about prepared statements, why would they do so
       | for some DSL developed using tag strings?
       | 
       | Obviously Guido is a better developer than me any day, so I might
       | be completely wrong, but this doesn't feel right. I've seen tools
       | developed to avoid just doing the proper training, and the result
       | is always worse.
        
         | masklinn wrote:
         | > If developers aren't reading the docs to learn about prepared
         | statements, why would they do so for some DSL developed using
         | tag strings?
         | 
         | Because you've deprecated the "bare string" interface so they
         | can't use that anymore, or it's hidden deep into the utility
         | modules.
        
           | mrweasel wrote:
           | But you could do that already, could you not? Django does.
           | It's just not really SQL anymore then.
           | 
           | Someone else also pointed out that you could just do this
           | with functions. It seems like a very fancy way of avoiding
           | using (). I don't know, maybe show me how that would solve
           | the issue of unsafe SQL and I'd be more easily convinced.
        
             | hombre_fatal wrote:
             | Because it turns into a parameterized SQL query.
             | 
             | This already exists in the Javascript ecosystem:
             | 
             |     sql`select * from users where id = ${id}`
             | 
             | Turns into:
             | 
             |     { query: 'select * from users where id = $1', values: [id] }
             | 
             | So if you tried an injection like this:
             | 
             |     sql`select * from users ${'where id = 3'}`
             | 
             | It turns into an invalid statement since "where id = 3"
             | cannot exist as a parameterized value for the same reason
             | this doesn't work:
             | 
             |     { query: 'select * from users $1', values: ['where id = 3'] }
             | 
             | Where you go from here is to offer a query(statement)
             | function that requires the use of the tag string so that
             | you can't accidentally pass in a normal string-interpolated
             | string.
             | 
             | Examples:
             | 
             | - slonik:
             | https://github.com/gajus/slonik?tab=readme-ov-file#protectin...
             | 
             | - postgres.js:
             | https://github.com/porsager/postgres?tab=readme-ov-file#quer...
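
The same idea can be sketched in Python against the standard `sqlite3` module. Everything named here (`Param`, `Query`, `sql`, `run`, the `users` table) is a hypothetical illustration, and the tag's arguments are built by hand because the proposed syntax is not available; only the `sqlite3` calls are real.

```python
import sqlite3
from typing import Any, Callable, NamedTuple


class Param(NamedTuple):
    """Hypothetical stand-in for an interpolation: a deferred value."""
    getvalue: Callable[[], Any]


class Query(NamedTuple):
    text: str
    values: tuple


def sql(*args) -> Query:
    """Tag-like function: literals stay SQL text, interpolations become '?'."""
    text, values = [], []
    for arg in args:
        if isinstance(arg, str):
            text.append(arg)
        else:
            text.append("?")
            values.append(arg.getvalue())
    return Query("".join(text), tuple(values))


def run(conn: sqlite3.Connection, query: Query):
    # Accepting only Query objects keeps plain interpolated strings out.
    if not isinstance(query, Query):
        raise TypeError("run() requires a Query built by sql()")
    return conn.execute(query.text, query.values).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("create table users (id integer, name text)")
conn.executemany("insert into users values (?, ?)", [(1, "ann"), (3, "bob")])

user_id = 3
safe = sql("select name from users where id = ", Param(lambda: user_id))
rows = run(conn, safe)

# The injection attempt binds the fragment as a value, so the SQL is invalid.
injected = sql("select name from users ", Param(lambda: "where id = 3"))
try:
    run(conn, injected)
    injection_failed = False
except sqlite3.OperationalError:
    injection_failed = True
```

The `run` function is the "query(statement)" step from the comment: it refuses bare strings, so an f-string cannot reach the database by accident.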
        
         | zarzavat wrote:
         | I feel like there's a clash of cultures.
         | 
         | There's Python the scripting language, replacement for bash
         | scripts, R and Lua.
         | 
         | Then there's Python the serious software development language.
         | Where people have style guides, code review, test coverage, and
         | projects with more than one directory.
         | 
         | I understand why people in the second group are fearful of this
         | feature and DSLs in general. I'm in the first group and I'm
         | quite excited for it.
        
           | masklinn wrote:
           | I'm in the second group and I'm very excited for it, getting
           | the linting right to prevent people doing wild formatting
           | into dangerous string-based APIs is not easy, this provides
           | an opportunity to make it much easier and safer.
        
             | culi wrote:
             | And the second group is already likely to have a linter set
             | up. The author's fears about this showing up in library
             | code are still valid though
        
           | mrweasel wrote:
           | I can get behind that. I see some of the same issue in
           | regards to type annotation. I'm heavily leaning into the
           | dynamic / duck-typing aspects of Python, so type annotation
           | is often complicated or very broad, to the point where it's a
           | little redundant. If you're not really writing code like
           | that, type annotation is an awesome addition to the language.
           | 
           | I'd be very interested in seeing where this goes, it
           | certainly has its uses, but there's also a ton of projects
           | where it really doesn't belong.
        
         | MrBuddyCasino wrote:
         | No, IMO this is correct. DSLs often lead to heterogeneous code
         | styles that are hard to reason about. Simpler is usually
         | better.
        
       | ianbicking wrote:
       | I LOVE tagged templates in JavaScript.
       | 
       | But in Python I could also imagine YET ANOTHER constant prefix,
       | like t"", that returns a "template" object of some sort, and then
       | you could do html(t"") or whatever. That is, it would be just
       | like f"" but return an object and not a string so you could get
       | at the underlying values. Like in JavaScript the ability to see
       | the original backslash escaping would be nice, and as an
       | improvement over JavaScript the ability to see the expressions as
       | text would also be nice.
       | 
       | But the deferred evaluation seems iffy to me. Like I can see all
       | the cool things one might do with it, but it also means you can't
       | understand the evaluation without understanding the tag
       | implementation.
       | 
       | Also I don't think deferred evaluation is enough to make this an
       | opportunity for a "full" DSL. Something like a for loop requires
       | introducing new variables local to the template/language, and
       | that's really beyond what this should be, or what deferred
       | evaluation would allow.
        
       | spankalee wrote:
       | I tried skimming the PEP while I could, but it seems like this
       | might be missing a couple of the features that make JS tagged
       | template literals work so well:
       | 
       | - tags get a strings array that's referentially stable across
       | invocations. This can function as a cache key to cache strings
       | parsing work.
       | 
       | - tags can return any kind of value, not just a string. Often
       | you need to give structured data to another cooperating API.
       | 
       | Deferred evaluation of expressions is very cool, and would be
       | really useful for reactive use-cases, assuming they can be
       | evaluated multiple times.
        
         | svieira wrote:
         | The string array is not referentially stable across invocations
         | and cannot be because there is a single argument array
         | containing both the "static" bits and the "dynamic" bits. So
         | you can't use it the way that JS' `static_strings` argument can
         | be used as a key in a `WeakMap`.
         | 
         | Tags _can_ return any kind of value, so there's that.
         | 
         | Deferred evaluations can be evaluated multiple times, and that is
         | in fact one of the biggest foot-guns in this API (in my opinion).
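
To make the foot-gun concrete, here is a small sketch in which the deferred callable is modeled as an ordinary lambda. The names `next_id`, `getvalue`, and `cached` are illustrative assumptions, not part of the proposal.

```python
counter = {"n": 0}


def next_id():
    # An expression with a side effect, e.g. what {next_id()} would wrap.
    counter["n"] += 1
    return counter["n"]


# Hand-built stand-in for the deferred callable behind an interpolation.
getvalue = lambda: next_id()

# Each call re-runs the expression, so two reads give two different values.
first = getvalue()
second = getvalue()


def cached(fn):
    """Evaluate fn at most once and reuse the result afterwards."""
    box = []

    def wrapper():
        if not box:
            box.append(fn())
        return box[0]

    return wrapper


once = cached(getvalue)
third, fourth = once(), once()
print(first, second, third, fourth)
```

A tag that needs a value more than once would have to do something like `cached` itself, which is the kind of detail that is easy to miss.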
        
           | spankalee wrote:
           | I wonder if each of the string args is stable? Then you could
           | just use the first as the key.
           | 
           | The JS API where the strings are separate might seem awkward
           | at first, but it ends up being a really elegant design.
           | 
           | Deferred evaluation is really powerful and I wish JS had it.
           | It's one of the reasons why Solid has a custom JSX compiler
           | and doesn't use tagged literals... to get that evaluation you
           | need the user to pass in closures, which is cumbersome.
        
       | treyd wrote:
       | I can't help but believe that this is introducing _more_ spooky
       | action at a distance and is bound to be abused. Is it really more
       | usable this way? Do they have any concrete and practical examples
       | where this improves readability?
        
       | Hamuko wrote:
       | I hate the idea of reusing the existing string/bytes prefixes for
       | something that is completely different. How is someone expected
       | to know that br"" is inherent Python syntax and my"" is
       | essentially a user-defined function? And the only way to ever
       | add a new prefix into the language (like f"" was added quite
       | recently) is to wait until Python 4, at which point we'll need
       | 3to4 to automatically rename all of your old tag strings that are
       | now conflicting and people will bitch about how badly major
       | Python upgrades suck.
        
       | zoogeny wrote:
       | I've seen this feature used responsibly and to good effect in a
       | few TypeScript projects so I understand why it would be desirable
       | in Python.
        
       | tofflos wrote:
       | I would have loved to see Java introduce something similar to the
       | IntelliJ @Language-annotation in the standard library but maybe
       | they'll figure out the sweet spot in a future String Templating
       | JEP.
       | 
       |     @Language("application/sql")
       |     String query = "SELECT 1";
       | 
       |     @Language("application/graphql+json")
       |     String query = """
       |         query HeroNameAndFriends {
       |           hero {
       |             name
       |             friends {
       |               name
       |             }
       |           }
       |         }
       |         """;
        
       | DataDive wrote:
       | Excellent idea, I don't get the criticism,
       | 
       | If a syntax such as f"{variable}" is already a feature - and
       | turned out to be a popular one - why shouldn't we be able to add
       | our own custom "f"s? Because that is what this is about. It might
       | make generating output even simpler.
       | 
       | I applaud the idea and am pleased to see that Python keeps
       | innovating!
        
       | kbd wrote:
       | At least in the spirit of "the language shouldn't be able to
       | define things the user can't" (see: Java string concatenation)
       | this seems like a good change.
        
       | samatman wrote:
       | I think this will turn out well. Julia has had this forever as
       | string macros, and it has worked out rather nicely, features like
       | `r"\d+"` for regex, and `raw"strings"` are just string macros.
       | The set of all useful custom literal strings isn't bounded, so a
       | lightweight mechanism to define them and make use of the results
       | is a good thing.
       | 
       | Another kitchen sink to add to Python's world-class kitchen sink
       | collection.
        
       ___________________________________________________________________
       (page generated 2024-08-10 23:00 UTC)