[HN Gopher] PEP 750: Tag Strings for Writing Domain-Specific Lan...
___________________________________________________________________
PEP 750: Tag Strings for Writing Domain-Specific Languages
Author : lumpa
Score : 65 points
Date : 2024-08-10 14:57 UTC (8 hours ago)
(HTM) web link (discuss.python.org)
(TXT) w3m dump (discuss.python.org)
| formerly_proven wrote:
| Yikes. Don't get me wrong, I totally understand the reasoning why
| this would be useful (though I _violently disagree_ with the idea
| of deferring the evaluation of the contained expressions), but it's
| also so very kitchensinky and adds so little over just calling
| a function (which doesn't require a 20-page explainer, as
| everyone already knows how function calls work). It also promotes
| using what looks like string interpolation (and what might be
| string interpolation, you can't tell at the "call site") for
| things which we know string interpolation is the wrong tool. The
| API also seems really, I dunno, weird to me. The string is split
| around interpolations and verbatim portions result in one
| argument, which is "string-like", while interpolations become
| four-tuple-like (one of which is a lambda, which you call to
| perform the deferred interpolation). This seems really awkward to
| me for building stuff like the suggested use cases of XML/HTML or
| SQL templating.
|
| Also the scoping rules of this are a special case which doesn't
| appear in regular Python code so far: "The use of annotation
| scope means it's not possible to fully desugar interpolations
| into Python code. Instead it's as if one is writing
| interpolation_lambda: tag, not lambda: tag, where a hypothetical
| interpolation_lambda keyword variant uses annotation scope
| instead of the standard function scope." -- i.e. it's "as if you
| wrapped all interpolation expressions in a lambda: <expr>,
| _except it uses different scoping rules_ ".
| masklinn wrote:
| > This seems really awkward to me for building stuff like the
| suggested use cases of XML/HTML or SQL templating.
|
| Compared to what?
|
| At the end of the day you're still doing string formatting, if
| you want parsing, then you'd feed the item into a parser, which
| this doesn't preclude.
|
| The interface sounds a lot better than JS's anyway, as that
| completely separates the literal strings and the interpolations
| so you have to re-intersperse them which is muggy.
|
| > interpolations become four-tuple-like
|
| They become an Interpolation object, which can be unpacked if
| you find that more convenient, but you can access the members
| if you prefer:
|
| - index 0, getvalue: the callable to retrieve the evaluated
| expression
|
| - index 1, expr: the raw text form of the expression
|
| - index 2, conv: the !conversion tag (s, r, or a)
|
| - index 3, format_spec
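A rough sketch of how a tag function might consume those arguments, written as plain Python because the tag-string syntax is only a proposal. The `Interpolation` namedtuple and the `greet` function below are hypothetical stand-ins; only the four field names come from the list above, and the PEP's actual types may differ.

```python
from typing import Any, Callable, NamedTuple, Optional


class Interpolation(NamedTuple):
    """Hypothetical stand-in for the proposal's interpolation object."""
    getvalue: Callable[[], Any]   # call it to evaluate the expression
    expr: str                     # raw source text of the expression
    conv: Optional[str]           # "s", "r", "a", or None
    format_spec: Optional[str]    # text after the colon, if any


def greet(*args):
    """Toy tag function: keep literal text, upper-case interpolated values."""
    parts = []
    for arg in args:
        if isinstance(arg, str):
            parts.append(arg)                         # verbatim portion
        else:
            getvalue, expr, conv, format_spec = arg   # tuple-style unpacking
            value = getvalue()                        # deferred evaluation
            if conv == "r":
                value = repr(value)
            parts.append(format(value, format_spec or "").upper())
    return "".join(parts)


user = "world"
# Assumed equivalent of greet"Hello {user}!" under the proposal.
result = greet("Hello ", Interpolation(lambda: user, "user", None, None), "!")
print(result)
```

The same fields can also be read by name (`arg.getvalue`, `arg.expr`, and so on), which is the non-unpacking access described above.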
| dmart wrote:
| I have to admit that at first glance I don't like this. These
| seem to be essentially normal str -> Any functions, with some
| naming limitations due to the existing string prefixes special-
| cased in the language. I don't feel like adding this additional
| complexity is worth being able to save two parentheses per
| function call.
| Epa095 wrote:
| This was my first thought as well. But an important difference
| is that the arguments are not eagerly evaluated, but they are
| passed as lambdas which can be evaluated if desired. This means
| that it can be used for example in log messages (if you don't
| want to evaluate the string at the wrong log levels). But is it
| worth it for that? Idk.
| masklinn wrote:
| Even with eager evaluation it's already a very compelling way
| of managing basically every lightweight "templating" for
| safety: e.g. embedded dynamic HTML or SQL. `markupsafe` is
| great, but it's way too easy to perform formatting _before_
| calling it, especially with f-strings.
|
| That f-strings were "static" was by far my biggest criticism
| of it, given how useful I find JS's template strings.
|
| And this proposal seems like a straight up better version of
| template strings:
|
| - the static strings and interpolations are not split and
| don't have to be awkwardly re-interspersed, which I've always
| found 100% trouble and 0% utility
|
| - the lazy evaluation means it can be used for things like
| logging (which really want lazy evaluation), or
| metaprogramming (because you can introspect the callables, and
| you get the expression text)
| codethief wrote:
| > - the lazy evaluation means it can be used for things
| like logging (which really want lazy evaluation)
|
| Could you elaborate? I would find it rather surprising if
| my log messages don't contain the data at the very moment I
| invoke the logger.
| masklinn wrote:
| The expressions for the data you want to log out can be
| expensive, so ideally you only want to compute them
| _after_ you've checked if the logger was enabled for the
| level you need.
|
| In most APIs this requires an explicit conditional check
| and the average developer will not think of it. This
| allows said check to be performed internally.
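A minimal sketch of that internal check, using plain lambdas in place of the proposed deferred interpolations. The `lazy_debug` helper and `expensive_summary` function are made up for illustration; only `logging.Logger.isEnabledFor` is a real standard-library call.

```python
import logging

logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.INFO)

calls = []


def expensive_summary():
    """Pretend this is costly; record each call so it can be counted."""
    calls.append(1)
    return "big report"


def lazy_debug(log, *args):
    """Toy 'tag': evaluate the deferred parts only if DEBUG is enabled."""
    if not log.isEnabledFor(logging.DEBUG):
        return False                      # nothing was evaluated
    text = "".join(a if isinstance(a, str) else str(a()) for a in args)
    log.debug(text)
    return True


# DEBUG is off, so expensive_summary is never called.
emitted_at_info = lazy_debug(logger, "state: ", lambda: expensive_summary())
calls_at_info = len(calls)

# With DEBUG on, the deferred expression is evaluated exactly once.
logger.setLevel(logging.DEBUG)
emitted_at_debug = lazy_debug(logger, "state: ", lambda: expensive_summary())
calls_at_debug = len(calls)
```

The caller writes the message once; whether the expensive expression runs is decided inside the helper rather than by an `if` at every call site.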
| Too wrote:
| How can you call a function that does this? html'<div
| id={id:int}>{content:HTML|str}</div>'.
|
| html() is not going to be equivalent.
| mrweasel wrote:
| That is probably a much better example than any of those
| present in the PEP. I quite like your example. I'm not sure
| I'd want to write code like that, but it shows the usefulness
| much more clearly.
| throwitaway1123 wrote:
| For a practical example of this technique used in JS take a
| look at libraries like htm and lit-html:
| https://github.com/developit/htm
| jerf wrote:
| I think at this point Python really needs to just settle down.
| I don't like this not because it's an intrinsically bad idea,
| but adding another thing to the already fairly large pile of
| things a Python user needs to know in order to read somebody
| else's code needs to be something that brings more benefits to
| the table than just "it slightly improves a particular type of
| function call".
|
| At the risk of riling some people up, this smells like Perl.
| Some poor Python user comes across
| greet"Hello {user}"
|
| and there isn't even so much as a symbol they can search for on
| the internet, just an identifier smashed into a string.
|
| But I guess Python and I parted ways on this matter quite a
| while ago. I like to joke about Katamari Dama-C++ but Python is
| starting to give it a run for its money. C++ is still in the
| lead, but Python is arguably sustainably moving more quickly on
| the "add more features" front.
| Waterluvian wrote:
| My guess at the challenge is that the community who maintain
| and develop a language are by that very nature not in touch
| with what the complexity feels like for the average user.
|
| Also it's harder to do nothing than something.
|
| That being said, I think this is partly abstract. I've just
| ignored a lot of new Python features without issue. And while
| I worried that they'd force me to learn new things to
| understand others' code, that's not really materialized.
| ziml77 wrote:
| It seems the purpose of this proposal is to have a way to
| essentially have custom string interpolation. I don't think
| that's necessarily a bad idea on its own, but this syntax feels
| out of place to me.
|
| Instead, why not add a single new string prefix, like "l" for
| "lazy"? So, f"hello {name}" would immediately format it while
| l"hello {name}" would produce an object which contains a template
| and the captured variables. Then their example would be called
| like: greet(l"hello {name}").
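One way to picture the suggested "lazy" object, as a sketch only: the `LazyTemplate` class is hypothetical and is built by hand here, since no `l""` prefix exists. It uses the standard library's `string.Formatter.parse` to split the template.

```python
import html
from dataclasses import dataclass
from string import Formatter


@dataclass(frozen=True)
class LazyTemplate:
    """Hypothetical result of l"hello {name}": template plus captured names."""
    template: str
    variables: dict

    def render(self, convert=str):
        out = []
        # parse() yields (literal_text, field_name, format_spec, conversion).
        for literal, field, spec, _conv in Formatter().parse(self.template):
            out.append(literal)
            if field is not None:
                out.append(convert(format(self.variables[field], spec or "")))
        return "".join(out)


def greet(tmpl):
    """An ordinary function decides how the captured values are rendered."""
    return tmpl.render(convert=html.escape)


name = "<b>World</b>"
# Built by hand; the proposed l"" prefix would construct this object (assumed).
greeting = greet(LazyTemplate("hello {name}", {"name": name}))
print(greeting)
```

In this shape the formatting logic lives in a normal function call, and only the capture step would need new syntax.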
| TwentyPosts wrote:
| This seems like a bad idea at first glance? Maybe I don't get the
| whole pitch here?
|
| It just doesn't seem worth it to define a whole new thing just to
| abstract over a format() function call. The laziness might be
| interesting, but I feel like "lazy strings" might be all that's
| needed here. Laziness and validation (or custom string formatting
| logic) are separate concerns and should be separated.
| masklinn wrote:
| > It just doesn't seem worth it to define a whole new thing
| just to abstract over a format() function call.
|
| That could also be leveled at f-strings themselves.
|
| > Laziness and validation (or custom string formatting logic)
| are separate concerns and should be separated.
|
| In which case the one to move out is the laziness not the
| customised interpolation. Because the latter is the one that's
| necessary for safer dynamic SQL or HTML or whatever.
| behnamoh wrote:
| I want this in Python:
| https://codecodeship.com/blog/2024-06-03-curl_req
|
| From the article: """
| ~CURL[curl https://catfact.ninja/fact] |> Req.request!()
|
| This is actual code; you can run this. It will convert the curl
| command into a Req request and you will get a response back. This
| is really great, because we have been able to increase the
| expressiveness of the language. """
| Too wrote:
| Looks good. Would have been nice if they included a way to
| express type checking of the format_spec. That's going to be an
| unnecessary source of runtime errors.
| 12_throw_away wrote:
| self-deleted
| layer8 wrote:
| https://news.ycombinator.com/item?id=36674260
| jtwaleson wrote:
| For people adding insightful critique on the PEP on HN (I saw
| some on this thread already), please ensure your opinion is
| represented in the PEP thread itself too.
| 0cf8612b2e1e wrote:
| Given the names attached to the proposal, is this PEP actually
| up for debate?
|
| Most of the critiques I am reading (with which I fully agree)
| are that this is more complexity in the language without
| sufficient payoff. There are now how many stupid or fancy ways
| to construct a Python string with a variable?
| akklrgaG wrote:
| Why would anyone do this? You risk being banned and defamed if
| you lack deference or exceed your allotment of five messages
| per thread.
|
| Moreover, Python has resume-driven development and people with
| a financial or reputational interest will get their new toys in
| no matter what.
|
| You would just contribute to the appearance of democracy.
| mrweasel wrote:
| My issue with this is that it will eventually sneak into libraries
| and the users of that library would be expected to use these tag
| strings all over the place to utilize the library. This prevents
| people from having a uniform coding style and makes code harder to
| read.
|
| The concern isn't having features that will make it easier to
| write DSLs, my problem is that people will misuse it in regular
| Python projects.
|
| I know that one of the authors is Guido, but I'm not buying the
| motivation. Jinja2 and Django template are pretty much just using
| Python, it's not really much of an issue, and I don't believe
| that business logic should exist in your templates anyway. As for
| the SQL argument, it will still be possible for people to mess it
| up, even with Tag Strings, unless you completely remove all
| legacy code. The issue here isn't that the existing facilities
| aren't good enough, it's that many developers aren't aware of the
| concepts, like prepared statements. If developers aren't reading
| the docs to learn about prepared statements, why would they do so
| for some DSL developed using tag strings?
|
| Obviously Guido is a better developer than me any day, so I might
| be completely wrong, but this doesn't feel right. I've seen tools
| developed to avoid just doing the proper training, and the result
| is always worse.
| masklinn wrote:
| > If developers aren't reading the docs to learn about prepared
| statements, why would they do so for some DSL developed using
| tag strings?
|
| Because you've deprecated the "bare string" interface so they
| can't use that anymore, or it's hidden deep into the utility
| modules.
| mrweasel wrote:
| But you could do that already, could you not? Django does.
| It's just not really SQL anymore then.
|
| Someone else also pointed out that you could just do this
| with functions. It seems like a very fancy way of avoiding
| using (). I don't know, maybe show me how that would solve
| the issue of unsafe SQL and I'd be more easily convinced.
| hombre_fatal wrote:
| Because it turns into a parameterized SQL query.
|
| This already exists in the JavaScript ecosystem:
|
|     sql`select * from users where id = ${id}`
|
| Turns into:
|
|     { query: 'select * from users where id = $1', values: [id] }
|
| So if you tried an injection like this:
|
|     sql`select * from users ${'where id = 3'}`
|
| It turns into an invalid statement since "where id = 3"
| cannot exist as a parameterized value for the same reason
| this doesn't work:
|
|     { query: 'select * from users $1', values: ['where id = 3'] }
|
| Where you go from here is to offer a query(statement)
| function that requires the use of the tag string so that
| you can't accidentally pass in a normal string-interpolated
| string.
|
| Examples:
|
| - slonik:
| https://github.com/gajus/slonik?tab=readme-ov-file#protectin...
|
| - postgres.js:
| https://github.com/porsager/postgres?tab=readme-ov-file#quer...
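The same idea can be sketched in Python with the standard `sqlite3` module. The `sql` function below is a hypothetical tag-like helper: it treats plain strings as literal SQL and callables as stand-ins for the proposal's deferred interpolations, which is an assumption made for this example.

```python
import sqlite3


def sql(*args):
    """Toy tag function: literals stay in the query, values become parameters."""
    query_parts, values = [], []
    for arg in args:
        if isinstance(arg, str):
            query_parts.append(arg)      # literal SQL text
        else:
            query_parts.append("?")      # sqlite3 placeholder style
            values.append(arg())         # evaluated value is bound, not inlined
    return "".join(query_parts), values


conn = sqlite3.connect(":memory:")
conn.execute("create table users (id integer, name text)")
conn.executemany("insert into users values (?, ?)", [(1, "ann"), (3, "bob")])

user_id = 3
# Assumed equivalent of sql"select name from users where id = {user_id}".
query, values = sql("select name from users where id = ", lambda: user_id)
rows = conn.execute(query, values).fetchall()

# An injected clause becomes a bound value, so the statement is invalid SQL.
bad_query, bad_values = sql("select * from users ", lambda: "where id = 3")
try:
    conn.execute(bad_query, bad_values)
    injection_rejected = False
except sqlite3.Error:
    injection_rejected = True
```

A `query()` wrapper that accepts only the output of such a helper, and never a bare string, would be the next step described above.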
| zarzavat wrote:
| I feel like there's a clash of cultures.
|
| There's Python the scripting language, replacement for bash
| scripts, R and Lua.
|
| Then there's Python the serious software development language.
| Where people have style guides, code review, test coverage, and
| projects with more than one directory.
|
| I understand why people in the second group are fearful of this
| feature and DSLs in general. I'm in the first group and I'm
| quite excited for it.
| masklinn wrote:
| I'm in the second group and I'm very excited for it, getting
| the linting right to prevent people doing wild formatting
| into dangerous string-based APIs is not easy, this provides
| an opportunity to make it much easier and safer.
| culi wrote:
| And the second group is already likely to have a linter set
| up. The author's fears about this showing up in library
| code is still valid though
| mrweasel wrote:
| I can get behind that. I see some of the same issue in
| regards to type annotation. I'm heavily leaning into the
| dynamic / duck-typing aspects of Python, so type annotation
| is often complicated or very broad, to the point where it's a
| little redundant. If you're not really writing code like
| that, type annotation is an awesome addition to the language.
|
| I'd be very interested in seeing where this goes, it
| certainly has its uses, but there's also a ton of projects
| where it really doesn't belong.
| MrBuddyCasino wrote:
| No IMO this is correct. DSLs often lead to heterogeneous code
| styles that are hard to reason about. Simpler is usually
| better.
| ianbicking wrote:
| I LOVE tagged templates in JavaScript.
|
| But in Python I could also imagine YET ANOTHER constant prefix,
| like t"", that returns a "template" object of some sort, and then
| you could do html(t"") or whatever. That is, it would be just
| like f"" but return an object and not a string so you could get
| at the underlying values. Like in JavaScript the ability to see
| the original backslash escaping would be nice, and as an
| improvement over JavaScript the ability to see the expressions as
| text would also be nice.
|
| But the deferred evaluation seems iffy to me. Like I can see all
| the cool things one might do with it, but it also means you can't
| understand the evaluation without understanding the tag
| implementation.
|
| Also I don't think deferred evaluation is enough to make this an
| opportunity for a "full" DSL. Something like a for loop requires
| introducing new variables local to the template/language, and
| that's really beyond what this should be, or what deferred
| evaluation would allow.
| spankalee wrote:
| I tried skimming the PEP while I could, but it seems like this
| might be missing a couple of the features that make JS tagged
| template literals work so well:
|
| - tags get a strings array that's referentially stable across
| invocations. This can function as a cache key to cache strings
| parsing work.
|
| - tags can return any kind of value, not just a string. Often you
| need to give structured data to another cooperating API.
|
| Deferred evaluation of expressions is very cool, and would be
| really useful for reactive use-cases, assuming they can be
| evaluated multiple times.
| svieira wrote:
| The string array is not referentially stable across invocations
| and cannot be because there is a single argument array
| containing both the "static" bits and the "dynamic" bits. So
| you can't use it the way that JS' `static_strings` argument can
| be used as a key in a `WeakMap`.
|
| Tags _can_ return any kind of value, so there's that.
|
| Deferred evaluations can be evaluated multiple times, which is in
| fact one of the biggest foot-guns in this API (in my opinion).
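To illustrate the foot-gun, here is a small sketch in which a plain lambda stands in for a deferred interpolation with a side effect. The `careless_tag` function is invented for the example and is not from the PEP.

```python
import itertools

counter = itertools.count(1)

# Stand-in for a deferred interpolation of the expression next(counter).
getvalue = lambda: next(counter)


def careless_tag(*args):
    """Evaluates each deferred part twice: once to validate, once to render."""
    out = []
    for arg in args:
        if isinstance(arg, str):
            out.append(arg)
        else:
            first = arg()     # validation pass triggers the side effect
            second = arg()    # rendering pass sees a different value
            out.append(f"{first}/{second}")
    return "".join(out)


rendered = careless_tag("id=", getvalue)
print(rendered)
```

Nothing at the call site reveals that the expression ran twice; that depends entirely on how the tag is implemented.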
| spankalee wrote:
| I wonder if each of the string args is stable? Then you could
| just use the first as the key.
|
| The JS API where the strings are separate might seem awkward
| at first, but it ends up being a really elegant design.
|
| Deferred evaluation is really powerful and I wish JS had it.
| It's one of the reasons why Solid has a custom JSX compiler
| and doesn't use tagged literals... to get re-evaluation you
| need the user to pass in closures, which is cumbersome.
| treyd wrote:
| I can't help but believe that this is introducing _more_ spooky
| action at a distance and is bound to be abused. Is it really more
| usable this way? Do they have any concrete and practical examples
| where this improves readability?
| Hamuko wrote:
| I hate the idea of reusing the existing string/bytes prefixes for
| something that is completely different. How is someone expected
| to know that br"" is inherent Python syntax and my"" is
| essentially a user-defined function? And the only way to ever
| add a new prefix into the language (like f"" was added quite
| recently) is to wait until Python 4, at which point we'll need
| 3to4 to automatically rename all of your old tag strings that are
| now conflicting and people will bitch about how badly major
| Python upgrades suck.
| zoogeny wrote:
| I've seen this feature used responsibly and to good effect in a
| few TypeScript projects so I understand why it would be desirable
| in Python.
| tofflos wrote:
| I would have loved to see Java introduce something similar to the
| IntelliJ @Language-annotation in the standard library but maybe
| they'll figure out the sweet spot in a future String Templating
| JEP.
|
|     @Language("application/sql") String query = "SELECT 1";
|
|     @Language("application/graphql+json") String query = """
|         query HeroNameAndFriends {
|           hero {
|             name
|             friends {
|               name
|             }
|           }
|         }
|         """;
| DataDive wrote:
| Excellent idea, I don't get the criticism.
|
| If a syntax such as f"{variable}" is already a feature - and
| turned out to be a popular one - why shouldn't we be able to add
| our own custom "f"s? Because that is what this is about. It might
| make generating output even simpler.
|
| I applaud the idea and am pleased to see that Python keeps
| innovating!
| kbd wrote:
| At least in the spirit of "the language shouldn't be able to
| define things the user can't" (see: Java string concatenation)
| this seems like a good change.
| samatman wrote:
| I think this will turn out well. Julia has had this forever as
| string macros, and it has worked out rather nicely, features like
| `r"\d+"` for regex, and `raw"strings"` are just string macros.
| The set of all useful custom literal strings isn't bounded, so a
| lightweight mechanism to define them and make use of the results
| is a good thing.
|
| Another kitchen sink to add to Python's world-class kitchen sink
| collection.
___________________________________________________________________
(page generated 2024-08-10 23:00 UTC)