[HN Gopher] C# Raw String Literal Proposal
___________________________________________________________________
C# Raw String Literal Proposal
Author : nikbackm
Score : 62 points
Date : 2022-02-17 13:26 UTC (9 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| ducharmdev wrote:
| > This allows code to look natural, while still producing
| literals that are desired, and avoiding runtime costs if this
| required the use of specialized string manipulation routines.
|
| Maybe a source generator would be applicable here? Not as nice
| as having it built into the language of course, but it would at
| least eliminate these runtime costs.
| ape4 wrote:
| Perl's heredoc is better. You get to pick the delimiter.
| https://perlmaven.com/here-documents
| DalekBaldwin wrote:
| Interpolation is its own can of worms, but if you just want to be
| able to encode absolutely anything without escape characters, you
| just need two delimiters:
|
|     "*n [stringA] "*n
|     "*n [stringB] '*n
|     '*n [stringC] "*n
|     '*n [stringD] '*n
|
| Each string may contain runs of contiguous single or double
| quotes of length less than n, and furthermore:
|
| stringA may start and/or end with a single quote
|
| stringB may start with a single quote and/or end with a double
| quote
|
| stringC may start with a double quote and/or end with a single
| quote
|
| stringD may start and/or end with a double quote
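|
| A rough Python sketch of this two-delimiter scheme, assuming the
| encoder always picks the smallest workable n (the comment leaves n
| open) and encodes the empty string with mismatched delimiters so
| the opening run stays unambiguous. `encode` and `decode` are
| hypothetical names.

```python
import re


def encode(s: str) -> str:
    """Wrap s in n-quote delimiters without escaping anything inside it."""
    # n must exceed the longest run of either quote character in the content.
    runs = re.findall(r'''"+|'+''', s)
    n = max((len(r) for r in runs), default=0) + 1
    # Pick each delimiter so it differs from the content character it touches.
    opener = "'" if s.startswith('"') else '"'
    # Assumption: empty content uses mismatched delimiters ("*n then '*n).
    closer = "'" if s.endswith('"') or s == "" else '"'
    return opener * n + s + closer * n


def decode(literal: str) -> str:
    """Recover the content; n is the length of the opening quote run."""
    q = literal[0]
    n = len(literal) - len(literal.lstrip(q))
    return literal[n:len(literal) - n]
```

| For example, `encode('"hi"')` picks the fourth form (single-quote
| delimiters on both sides) with n = 2.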
| LandR wrote:
| var v = """""" contents""""" """"""
|
| lol.
|
| I like the proposal to have these sort of raw strings, where
| indentations are removed, but can't they use a symbol before the
| string like they do with interpolation `$` or literals `@`?
|
| I know it says design decision to go with 1 more " than the
| longest sequence of " in the string, but why ?
| brandmeyer wrote:
| I'm a fan of the C++ method. R"SQL(my string
| without the sequence S Q L goes here)SQL"
|
| You can use any extra delimiter you want. The concatenation
| rules make it easy for you to easily insert source code line
| breaks and indentation without literal string breaks or
| indentation.
| Metasyntactic wrote:
| Hi, I'm the lang designer here.
|
| We looked into this. However, there didn't seem to be any
| benefit to this above just the N-quote version (which fits
| into how C# does strings everywhere else). In the above case,
| the `SQL(` and `)SQL` tokens are just akin to N-quotes. Since
| there's no additional benefit, we went with the simpler
| approach that solves all these needs, but will look the same
| across all codebases.
| fmorel wrote:
| It's more about not needing to escape characters than stripping
| indentation (that's just an extra perk). Otherwise, if the
| string can contain `"`, how can the compiler know which `"`
| defines the end of it?
| taco_emoji wrote:
| They literally explain why as design goal #1:
|
| > Provide a mechanism that will allow all string values to be
| provided by the user without the need for any escape-sequences
| whatsoever.
| GordonS wrote:
| I'm in agreement: the function is needed, but the syntax feels
| really weird.
|
| I'd also much prefer some kind of prefix, maybe double "at",
| e.g.
|
| ``` var myString = @@"blah blah blah blah " ```
|
| This feels a lot more natural to me.
| Someone wrote:
| A design goal is that you won't need to escape _any_
| character sequence in the string. In your proposal, a _"_
| inside a literal string would have to be escaped.
|
| In practice, using _" ""_ will be sufficient almost all the
| time.
| Metasyntactic wrote:
| > but can't they use a symbol before the string like they do
| with interpolation `$` or literals `@`?
|
| Hi! I'm the language designer here :)
|
| > I know it says design decision to go with 1 more " than the
| longest sequence of " in the string, but why ?
|
| Because if we use a symbol before the string, then there needs
| to be some mechanism to escape it within the string. e.g. if
| you use `@` literals, you still need to escape quotes within
| the string literal. The point of this feature (which we try to
| spell out in the spec) is so that you can have content without
| the need to escape anything at all.
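|
| The counting rule discussed here can be modelled in a few lines.
| This is a Python sketch, not compiler code: `raw_quote_count` and
| `wrap_multiline` are hypothetical helper names, and the minimum of
| three quotes is taken from the proposal's `"""` form.

```python
import re


def raw_quote_count(content: str) -> int:
    """Number of quotes the delimiter needs so `content` needs no escapes."""
    runs = re.findall(r'"+', content)
    longest = max((len(r) for r in runs), default=0)
    # One more than the longest run of quotes in the content, never fewer than 3.
    return max(3, longest + 1)


def wrap_multiline(content: str) -> str:
    """Emit the multi-line form: delimiters on their own lines, content untouched."""
    quotes = '"' * raw_quote_count(content)
    return f"{quotes}\n{content}\n{quotes}"
```

| Content with no quotes, or only single and double runs, gets the
| plain `"""` form; content containing `"""` gets four quotes.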
| mastax wrote:
| Rust uses `r` plus any number of `#`, e.g.
|
|     r###"There can be "stuff without escapes" #" "# "###
|
| C# already uses @ for raw string prefixes so they could extend
| its usage for multiple prefixes+suffixes.
|
| """ Is fine though.
| chubot wrote:
| _To make the text easy to read and allow for indentation that
| developers like in code, these string literals will naturally
| remove the indentation specified on the last line when producing
| the final literal value._
|
| This is the same rule that Oil has; I think it came from the
| Julia language (or at least that's where I got it from)
|
| _Oil Has Multi-line Commands and String Literals_
| http://www.oilshell.org/blog/2021/09/multiline.html
| StevePerkins wrote:
| I like the three-quote literal syntax in general, and am a bit
| surprised that C# doesn't have this already. Even Java has had
| this for awhile now!
|
| But I don't like the indented form, where nested triple-quotes
| are ignored. Whitespace formatting is fine when I'm working with
| Python, but I really don't want to mix that paradigm when I'm
| working with curly-brace languages.
| taco_emoji wrote:
| Nested triple-quotes are not ignored, they will end the string
| if you used triple to start. If you need to include triple
| quotes within the string, you start with quadruple.
| KingOfCoders wrote:
| With all the hacks and exploits going on, and (Log4J) more
| security awareness coming,
|
| I've been considering a safe String class that prevents some
| characters like CR,LF,\ that are seldom needed in business
| strings but used in system level things. Drawing a line between
| these two would increase security.
| bdamm wrote:
| safe would be more a property of where it came from, and maybe
| what processing the string has had on it. Plenty of business
| strings have newlines.
|
| I like the idea but really I want the type to capture
| unsafe/semi-safe/safe. Now the trick is, how expressive can the
| idea of semi-safe be?
| KingOfCoders wrote:
| Joel tried to fix the problem / source with naming (I
| remember reading that article, and it's 17 years old!)
|
| https://www.joelonsoftware.com/2005/05/11/making-wrong-
| code-...
| ajnin wrote:
| I think it's a good approach but I would do it the opposite way
| : create an UnsafeString and use that whenever your program
| takes external inputs. Then make it so that this UnsafeString
| can't be used directly but must always be consciously converted
| to (safe) String when using that data anywhere.
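|
| A minimal Python sketch of the UnsafeString idea, combined with the
| character list from upthread (CR, LF, backslash). The names
| `UnsafeString`, `to_safe` and `FORBIDDEN` are hypothetical, and
| Python cannot stop callers from reading `.raw` directly; a
| statically typed language could enforce that part.

```python
from dataclasses import dataclass

# Assumption: the "seldom needed in business strings" set named in the thread.
FORBIDDEN = {"\r", "\n", "\\"}


@dataclass(frozen=True)
class UnsafeString:
    """Wrapper for external input that must be converted before use."""

    raw: str

    def to_safe(self) -> str:
        """Return a plain str, or raise if a forbidden character is present."""
        bad = sorted(FORBIDDEN & set(self.raw))
        if bad:
            raise ValueError(f"rejected characters: {bad!r}")
        return self.raw
```

| The conversion is the one deliberate step: `UnsafeString(x).to_safe()`
| either yields a checked string or fails loudly.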
| __ryan__ wrote:
| I've given some thought to this kind of string literal in the
| past (for my imaginary programming language). I want a syntax
| _something_ like this:
|
|     var xml = """<element attr="content">
|               """  <body>
|               """  </body>
|               """</element>
|               """;
|
| This would give you the string:
|
|     <element attr="content">
|       <body>
|       </body>
|     </element>(no newline)
|
| If you wanted a newline at the end, you'd do this:
|
|     var xml = """<element attr="content">
|               """  <body>
|               """  </body>
|               """</element>
|               """
|               """;
|
| Basically the end delimiter of the string would be the last """.
| You could concatenate two strings like so:
|
|     var xml = """<element attr="content">
|               """  <body>
|               """  </body>
|               """</element>
|               """
|               """ // this string ended on this line
|             + """<element attr="content">
|               """  <body>
|               """  </body>
|               """</element>
|               """
|               """; // this string ended on this line
|
| This could use the same logic for using _at least_ three quotes
| as the indicator that it's a multiline string.
|
| Please, tear this apart and offer improvements.
|
| Edit: this is conceptually similar to Zig's multiline literal:
| https://ziglang.org/documentation/master/#Multiline-String-L...
| wangweij wrote:
| One benefit of multiline raw string is that you can directly
| copy/paste a block of characters between the program and its
| source. Unless there is very sophisticated IDE support, this
| proposal does not work well in this sense.
| jcelerier wrote:
| ... which text editor does not have multi-line block
| selection these days ?
| pjob wrote:
| The browser that I'm using to read this proposal, for one.
| __ryan__ wrote:
| Ah, that's interesting. In general, yes my proposal does
| benefit from and assume some IDE/tooling support for quality
| of life.
|
| Edit: To be specific, the IDE could handle formatting when
| pasting into a line beginning with """. Or offer a "paste as
| _cool new multiline string syntax_ " feature.
| [deleted]
| skrebbel wrote:
| I don't understand what problem this solves that the proposal
| in the linked article doesn't solve. Maybe you're not
| suggesting that it does, but then what's the upside?
| __ryan__ wrote:
| Oh yeah, I was just sharing my string literal syntax that's
| been baking in my head for a while for the sake of discussing.
|
| But off the top of my head, mainly just that there's a clear
| visual indicator of the start of lines of text, rather than
| counting/lining up leading whitespace. In the first example,
| the strings are all tabbed evenly for the sake of looking
| "pretty" in the code, but the following would generate the
| same string, since each line begins after the """:
| var xml = """<element attr="content"> """
| <body> """ </body> """</element>
| """;
| Metasyntactic wrote:
| Hi. I'm the language designer behind this feature :)
|
| A few points.
|
| > but the following would generate the same string, since
| each line begins after the """:
|
| That's not a virtue here. The point is to be able to write
| clear literals that never need escapes and which allow for
| easy grokking of what the content actually is.
|
| All current string forms in C# require some amount of
| manual (or tooling) help to fix them up to be legal. That's
| not the case with this literal. The content can always work
| as-is without having to touch it at all.
| __ryan__ wrote:
| That definitely makes sense. In my (limited, especially
| not C#) experience I just really dislike reasoning about
| trimming the leading whitespace, even if the rules are
| simple. I yearn for a consistent visual cue.
|
| In my syntax, the IDE would ideally treat the """ block
| virtually like a <textarea>.
| Metasyntactic wrote:
| > I just really dislike reasoning about trimming the
| leading whitespace
|
| Note: this feature is entirely optional. You can
| absolutely not have leading whitespace trimming at all.
| Indeed, this is a requirement of the proposal as we have
| to make it possible to actually represent text that has
| leading whitespace :)
| hankchinaski wrote:
| I have been temporarily working with c# for the past month after
| years of Go. It's a different philosophy to Go, there is a lot of
| syntactic sugar and magic spells that make your life easy... but
| I don't know if I prefer that to the Go way of doing things. I
| was pleasantly surprised tho. Much better experience than working
| with Java
| radicalbyte wrote:
| I've had to go the other way around.. and I find Go extremely
| verbose compared to C# - as long as you're not forced to follow
| certain constraints (SonarQube-driven-development).
| StevePerkins wrote:
| It sounds like you're both saying the same thing.
| radicalbyte wrote:
| Exactly, we've come from opposite sides and reached the
| same conclusion :-)
| dustymcp wrote:
| Dotnet core is great, the earlier versions not so much
| monadmoproblems wrote:
| I've long wanted a more succinct way of writing implicitly typed
| arrays. Whenever you work with data directly in the code, for
| example when hacking on leetcode, you end up with lots of
| horrible nested arrays:
|
| new [] {new [] {1, 2}, new [] {3, 4}};
|
| Something like:
|
| @[ @[1, 2], @[3, 4] ]
| Metasyntactic wrote:
| Hi there! I'm one of the C# language designers. I'm working on
| a proposal for that right now:
| https://github.com/dotnet/csharplang/issues/5354
|
| Thanks!
| chhickman wrote:
| I would much rather see something like this:
| string longString = `This
| ` allows ` differentiation
| ` of `formatting
| ` indention ` from
| ` leading ` string
| ` spaces ` using
| ` back-ticks (\`)
| Metasyntactic wrote:
| Hi there, I'm the lang designer and implementor here.
|
| That would violate a core goal of the feature which is that the
| content itself doesn't need escaping. This sort of approach
| would require all users to have tooling that would make that
| pleasant, instead of providing a feature that was easy to use
| across any editor.
|
| Thanks!
| jimworm wrote:
| Heredoc by another name...?
| LandR wrote:
| Doesn't heredoc preserve the indentation unless you strip it
| back out ? I don't mean indentation within the string, I mean
| level of indentation of where it is in the code
| sandreas wrote:
| You can use it without indentation using <<- AND TAB (does
| not work with spaces, so copy and paste won't work on
| hackernews - replace the spaces of the three content lines
| with a TAB):
|
|     cat <<-EOF
|       content
|       not indented
|       EOF
| jimworm wrote:
| The perl-family heredoc syntax could be quite flexible with
| options. I'm used to ruby which does have an option for
| indent-stripping.
| torginus wrote:
| It seems like .NET is trying to outcompete Rust in the number of
| string types available in the language.
| ok123456 wrote:
| Rust's different strings are different types with different
| underlying memory semantics and representations.
|
| This is just some syntactic sugar for strings that contain
| escape codes. It's still just a 'string'.
| tialaramex wrote:
| Also, the Rust language itself only has one string type, str.
| std::string::String comes ultimately from Rust's alloc crate,
| it is special only in the limited sense that the prelude
| makes it available without specifically asking for it, but
| you could define your own prelude that introduces say MyText
| or CPlusPlusStyleString or whatever you wanted.
|
| Admittedly having one string type is still more than C, or
| indeed C++ bother with but we might notice that those
| languages have a pretty terrible relationship with strings
| and suspect that's not a coincidence.
| sbelskie wrote:
| With potentially more on the way!
|
| https://github.com/dotnet/csharplang/blob/main/proposals/utf...
| doodpants wrote:
| These aren't different string types, they're different syntaxes
| for string literals.
| radicalbyte wrote:
| The issue is being discussed here:
| https://github.com/dotnet/csharplang/issues/4304
| jsd1982 wrote:
| For the single-line case, what happens for:
| """"""" """"""""
|
| Are those strings containing `"` and `""` or are they empty
| strings? Is the first case an error because the starting and
| ending quote counts do not match? If the number of quote chars is
| even, do the contents alternate between `"` and empty as the
| number of surrounding quotes increases?
| rawling wrote:
| It does say
|
| > A single_line_raw_string_literal cannot represent a string
| value that starts or ends with a quote (") though an
| augmentation to this proposal is provided in the Drawbacks
| section that shows how that could be supported.
|
| so I'd assume the odd count would lead to an error (single
| trailing "?) rather than a string containing ".
|
| E: I'd assume it doesn't allow empty single line strings
| because otherwise how do you tell the difference between that
| and the start of a multi line one?
| exyi wrote:
| I hope it will also normalize newlines to `\n`. The current
| version of raw literals (@"...") just puts there whatever is in
| the file, so it in practice depends on if your program was
| compiled on Windows or Linux. Surely that should be irrelevant
| for the compilation to intermediate language
| Metasyntactic wrote:
| Hi, i'm the lang designer and implementor here.
|
| We absolutely do not normalize newlines as that would defeat
| the purpose of _raw_ literals. The point here is that your
| content is not interpreted as that's the pain area that people
| are hitting today. How you write your literal is what you get
| at the end of the day.
|
| Note: if the content needs to be `\n` then just use that actual
| newline in the code. WRT the file line endings and whatnot,
| my recommendation is that you never use tools that arbitrarily
| change that behind your back as it does _already_ have impact
| _today_ in C#. For example, that will break standard `@""`
| strings today.
|
| If your line endings are important, then your tools should be
| set up to respect what you wrote and not change them. All
| editors can be set up this way, as can git. And that would
| absolutely be my recommendation on how you should structure
| things for your code if newlines are relevant.
| gpderetta wrote:
| Surely it depends on the encoding of the source, not where it
| was compiled?
| exyi wrote:
| Yes it does, but usually \n is committed in git, but on
| Windows it checks out as \r\n. So you are right that it
| technically does not depend on the system, but in practice
| there is a difference.
| gpderetta wrote:
| That's a very good point. Is it best practice to configure
| git to change line endings on Windows? I understand that
| these days Windows editors can handle unix line terminators
| correctly.
| rawling wrote:
| I think you're right:
|
| > Any line breaks within verbatim string literals are part of
| the resulting string. If the exact characters used to form
| line breaks are semantically relevant to an application, any
| tools that translate line breaks in source code to different
| formats (between "\n" and "\r\n", for example) will change
| application behavior.
|
| https://docs.microsoft.com/en-us/dotnet/csharp/language-refe...
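|
| Since the literal keeps whatever line breaks the file contains, one
| way to see what a checkout or editor did is to count the line-ending
| styles in the source bytes. A small Python sketch;
| `line_ending_counts` is a hypothetical helper, not part of git or
| .NET tooling.

```python
def line_ending_counts(source: bytes) -> dict:
    """Count CRLF, bare LF and bare CR line endings in raw file bytes."""
    crlf = source.count(b"\r\n")
    # Every CRLF also contains one LF and one CR, so subtract those.
    lf = source.count(b"\n") - crlf
    cr = source.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lf, "cr": cr}
```

| Running it over the same file checked out on two machines would
| show whether a multi-line literal will differ between them.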
| captainmuon wrote:
| Nice, I'm surprised this is not already in the language. The only
| thing I find a bit strange is that the delimiters must be on
| separate lines (unless it is the special one-line-form). So this
| is apparently not legal:
|
|     var s = """This is a
|         multiline string""";
|
| Requiring the start and especially end quotes to be on a separate
| line makes it take a lot of vertical space. But OTOH, that is
| consistent with the default coding style in C# which is
| vertically verbose (with {} on lines by themselves).
| Someone1234 wrote:
| > I'm surprised this is not already in the language.
|
| Because it is already in the language.
|
|     var xml = @"
|     <element attr=""content"">
|       <body>
|       </body>
|     </element>";
|
| This proposal mostly seems to be about some edge case where @"
| " syntax isn't good enough. But really, this whole thing is an
| improvement to an anti-pattern, and you should instead be
| looking into not needing multi-line block specific string
| literals in your code (e.g. putting templates in their own
| files/resources).
| tialaramex wrote:
| Microsoft's "verbatim strings" aren't. Think of them instead
| as "Oops, we use a lot of backslashes here at Microsoft and
| over time that just looks more and more stupid" strings and
| then these actual raw strings make lots more sense than those
| did.
|
| There's plenty of stuff in a middle ground where a separate
| template file is a waste. Your example, now that you've
| corrected it shows that nicely. A separate template would be
| a waste here for these few bytes, and yet "verbatim strings"
| mean instead of this just being some actual XML you can copy-
| paste it has to be escaped / unescaped.
|
| If your issue is that you don't think string literals should
| be a thing at all, C# is the wrong language for you. Try one
| of the early numeric languages, or something modern like
| WUFFS that eschews strings entirely because they're too
| dangerous. Once you accept that literals should be a thing
| (notice these aren't interpolated, they're just literals)
| this is an obvious idea.
|
| The bad "verbatim" syntax should go away in favour of a raw
| literal syntax such as the one proposed here.
| maybeOneDay wrote:
| You've perfectly demonstrated one motivation for this
| proposal: your string literal is incorrect. Verbatim strings
| in C# require " to be escaped, your string should be:
|
|     var xml = @"
|     <element attr=""content"">
|       <body>
|       </body>
|     </element>";
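|
| The fix-up that verbatim strings force on pasted content is a
| single substitution: double every quote. A Python sketch of that
| step, with `to_verbatim_literal` as a hypothetical helper name:

```python
def to_verbatim_literal(content: str) -> str:
    """Render content as a C# verbatim literal by doubling each quote."""
    return '@"' + content.replace('"', '""') + '"'
```

| The raw-literal proposal exists so that this step, and its inverse
| when copying text back out, is never needed.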
| Someone1234 wrote:
| All it perfectly demonstrates is that this is inherently an
| anti-pattern and that we're discussing features to work
| around things you shouldn't be doing to begin with.
|
| If you want to store XML literals, then by all means do so,
| but within the code itself is inappropriate. Even the
| existing @" " syntax is a code-smell, the new syntax
| doesn't address why that is (e.g.
| validation/colorization/etc don't work for string literals
| containing arbitrary other languages).
|
| .Net already has constructs to allow the dynamic creation
| of XML blocks (and JSON) without resorting to string-concat
| shenanigans.
| maybeOneDay wrote:
| Anti-patterns are rarely as absolute as you're making
| this out to be. Sure, I agree, lots of times it's better
| to store xml or json literals not in code. But for
| something three lines long it's perfectly fine, more
| readable, and trivial. This new proposal makes it elegant
| to do so, the only issue is that the @"" syntax should
| never have been used and unfortunately now we are
| proposing a third string literal syntax. That I don't
| like.
| Metasyntactic wrote:
| Hi. I'm the designer of this feature. The reason for this is so
| that we can potentially have fenced string blocks in the
| future. for example:
|
| ```c#
| var s = """xml
|     <Book><title/></Book>
|     """;
| ```
|
| and the like. Thanks!
| assbuttbuttass wrote:
| I always found these complex indentation-stripping rules to be
| confusing in a language such as Python. I was under the
| impression that C# doesn't treat whitespace significantly, so why
| do they need all these complex rules? Just interpret what's
| between the quotes literally.
| Deukhoofd wrote:
| When you have multiline strings, and don't want to start the
| lines with whitespace, it means you need to break the
| indentation of your code, which can look quite ugly. I'm not
| sure whether I want the compiler to do it like in the proposal
| though, I feel like it can easily cause unintentional issues.
| assbuttbuttass wrote:
| I think breaking the indentation is the lesser of two evils,
| because it becomes very obvious what the content of the
| string is. I've never minded it too much personally in go,
| but I guess it's subjective.
| LandR wrote:
| When I have strings like this, currently I find myself doing
| something like
|
|     var foo = "<foo>" + Environment.NewLine +
|               " <bar>" + Environment.NewLine +
|               " <quax>" + Environment.NewLine +
|               "</foo>";
|
| This would fix stuff like this nicely, but it's horrible
| syntax I think.
| Deukhoofd wrote:
| That would also not be a compile time constant, which would
| probably be desired in cases like that.
| tremon wrote:
| Why would it not be a compile-time constant? Can't every
| compiler worth its salt evaluate constant expressions at
| compile time nowadays?
| Deukhoofd wrote:
| Environment.NewLine depends on the machine it's run on,
| not on the compiler, so at best it can only evaluate it
| at runtime.
| LandR wrote:
| > C# doesn't treat whitespace as significant
|
| It doesn't for code, it does for strings.
|
| If you have: var foo = "<foo>
| <bar> <quax> </foo>";
|
| That string is actually <foo>
| <bar> <quax> </foo>
|
| In raw string form, as per the proposal, the string would be:
| <foo> <bar> <quax> </foo>
|
| It allows you to write strings nicely inline, and the
| indentation in the string itself doesn't matter.
| a9h74j wrote:
| In place of all of the special rules for handling indentation, I
| wonder if they could simply define some extra starting chars
| (besides $$""" for controlling interpolation) to indicate
| suppress-leading-newline or suppress-ending-newline etc. Offhand
| this would seem more explicit than implicit, and be searchable
| (unlike a pattern).
|
| Other than that, ++ for any mechanism to quote to arbitrary
| depth. I have imagined
|
| [abcfoo[ ...anything but ]abcfoo]... ]abcfoo]
|
| as another approach.
| crispyambulance wrote:
| I got confused at the first example.
|
|     var xml = """
|               <element attr="content">
|                 <body>
|                 </body>
|               </element>
|               """;
|
| And then they say that xml gets this...
|
|     <element attr="content">
|       <body>
|       </body>
|     </element>
|
| But they _don't_ explicitly say if the new lines after and
| before the """'s are considered part of the literal string or
| not.
|
| Are they?
| rawling wrote:
| No, although you're right, it doesn't look like they make it
| clear (although the example is in a little code element which
| presumably doesn't have leading or trailing newlines).
|
| Later on,
|
| > In the case of multi_line_raw_string_literal the initial
| whitespace* new_line and the final new_line whitespace* is not
| part of the value of the string.
| sbelskie wrote:
| The construct seems to be defined as:
|
| ``` multi_line_raw_string_literal :
| raw_string_literal_delimiter whitespace* new_line (raw_content
| | new_line)* new_line whitespace* raw_string_literal_delimiter
| ; ```
|
| Which I think says that the opening and closing new lines
| (after and before the """'s) are NOT part of the content of the
| string literal, but new lines between them can be part of the
| string literal.
| Metasyntactic wrote:
| Hi. I'm the designer of this lang feature. The specification
| covers this. However, to be clear, the new line after the
| first `"""` is not part of the literal, nor is the newline
| before the last `"""`. Thanks!
| laurensr wrote:
| In Java the leading [1] newlines are stripped as well as the
| trailing ones [2].
|
| [1]:
| https://cr.openjdk.java.net/~jlaskey/Strings/TextBlocksGuide...
|
| [2]:
| https://cr.openjdk.java.net/~jlaskey/Strings/TextBlocksGuide...
| billpg wrote:
| A little while ago, I discovered that...
| $"A{new List<string>{$"B{"{C}"}D"}.First()}E"
|
| ... was valid C#.
|
| This means that a C# compiler can't start with a simple
| tokenizing loop. That compiler phase would have to keep track of
| state in a stack, recording what each } character means while it's
| still looping through code character-by-character.
|
| Now we're adding {{ and }} into the equation. Yay.
| torginus wrote:
| >This means that a C# compiler can't start with a simple
| tokenizing loop.
|
| Not true, it just means that the parts between double quotes
| aren't bunched into a single token.
|
| To figure out how the compiler makes sense of this code, try
|
| https://roslynquoter.azurewebsites.net/
|
| You'll see how it tokenizes the string.
| billpg wrote:
| That's not a _simple_ tokenizer.
|
| A simple loop that would have worked with the 70s era of
| programming languages, would go through each character and
| once the boundary between two tokens has been identified,
| write out a token to a one-dimensional list. This would be a
| mostly stateless loop, tracking enough state for the current
| token in hand only. The _next_ phase would go through the
| tokens and pair up brackets, etc.
|
| A C# tokenizer can't do that. It needs to keep a stack of
| state. When it sees a '}', it needs to know if that's a
| "normal" brace or the } that resumes an interpolated string
| literal.
|
| I was writing a tokenizer myself and I wanted to have
| something similar to string interpolation. I very quickly
| realized my simple loop that I would have written for my CS
| degree isn't going to cut it and I had to start over.
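|
| To make the "stack of state" point concrete, here is a toy tokenizer
| in Python. It is a sketch for a made-up mini-language, not how
| Roslyn lexes C#: it ignores {{ and }} escapes, backslash escapes,
| comments and format specifiers, and the token names are
| hypothetical. The only point is that a '}' is classified by
| popping a stack.

```python
def tokenize(src: str) -> list:
    """Tokenize a toy language with $"..." strings containing {expr} holes."""
    tokens = []
    stack = []       # "brace" for an ordinary {, "hole" for an interpolation {
    in_text = False  # True while scanning the literal text of a $"..." string
    i, n = 0, len(src)
    while i < n:
        if in_text:
            j = i
            while j < n and src[j] not in '{"':
                j += 1
            if j == n:
                raise ValueError("unterminated interpolated string")
            if j > i:
                tokens.append(("TEXT", src[i:j]))
            if src[j] == "{":
                tokens.append(("HOLE_OPEN", "{"))
                stack.append("hole")
            else:
                tokens.append(("STR_END", '"'))
            in_text = False  # back to code: inside the hole or after the string
            i = j + 1
            continue
        c = src[i]
        if c.isspace():
            i += 1
        elif src.startswith('$"', i):
            tokens.append(("STR_START", '$"'))
            in_text = True
            i += 2
        elif c == '"':
            j = src.find('"', i + 1)
            if j == -1:
                raise ValueError("unterminated string")
            tokens.append(("STRING", src[i:j + 1]))
            i = j + 1
        elif c == "{":
            tokens.append(("LBRACE", c))
            stack.append("brace")
            i += 1
        elif c == "}":
            if not stack:
                raise ValueError("unbalanced }")
            if stack.pop() == "hole":
                tokens.append(("HOLE_CLOSE", c))
                in_text = True  # this } resumes the enclosing string's text
            else:
                tokens.append(("RBRACE", c))
            i += 1
        elif c.isalnum() or c == "_":
            j = i
            while j < n and (src[j].isalnum() or src[j] == "_"):
                j += 1
            tokens.append(("IDENT", src[i:j]))
            i = j
        else:
            tokens.append(("PUNCT", c))
            i += 1
    if in_text or stack:
        raise ValueError("unterminated construct")
    return tokens
```

| Fed the expression from upthread, it reports two interpolation
| holes, one ordinary brace pair, and one plain string "{C}" whose
| braces are never treated as tokens.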
| Metasyntactic wrote:
| Hi, I'm one of the C# language designers, and I work on the
| compiler implementation as well.
|
| C# has never had a "simple tokenizer". Indeed, even the
| first version of the language has complex lexical constructs that
| are part and parcel of the language. For example, our _comments_
| can store structured data in them (like xml).
|
| > A simple loop that would have worked with the 70s era of
| programming languages
|
| Yes. But 70s era compilers had to deal with things like not
| having enough memory to even store basic amounts of data.
| They also had to work in spaces where things like a 'stack'
| was just not tenable. We're literally 50 years from that
| point, and having a compiler do stuff like keeping a stack
| is not an issue anymore :)
| Metasyntactic wrote:
| >Now we're adding {{ and }} into the equation. Yay.
|
| Hi, i'm the lang designer and feature implementor here :)
|
| The complexity of lexing/parsing did not get worse here. We
| actually just lex/parse this stuff the same way that
| interpolated strings have always been lexed/parsed. This has
| been supported in the language for almost 10 years at this
| point :)
| intrasight wrote:
| I did not understand the xml indentation examples
| Metasyntactic wrote:
| Hi there! I'm the lang designer and i wrote up that spec. Could
| you clarify what you didn't understand about the indentation
| examples? I can work on clarifying them. Thanks!
| intrasight wrote:
| You say "If the indentation behavior is not desired, it is
| also trivial to disable like so:"
|
| And the code sample you show differs only in that closing
| quote isn't indented. You don't explain why and how that
| change would affect the generated string.
| jodrellblank wrote:
| It says " _these string literals will naturally remove the
| indentation specified on the last line when producing the
| final literal value._ "
|
| Each line in the literal will have leading whitespace
| trimmed off, up to where the closing quotes are.
|
| (What happens if the closing quotes pass some of the text?)
| Metasyntactic wrote:
| > (What happens if the closing quotes pass some of the
| text?)
|
| That's an error. Called out here:
| https://github.com/dotnet/csharplang/blob/main/proposals/raw...
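|
| The trimming rule from this subthread can be modelled as follows. A
| Python sketch with several assumptions: `raw_literal_value` is a
| hypothetical name, lines are joined with "\n" (the real feature
| keeps the file's own line breaks), and whitespace-only lines
| shorter than the indentation are treated as empty.

```python
def raw_literal_value(lines: list) -> str:
    """Compute the value of a multi-line raw literal.

    `lines` holds the source lines after the opening-quote line, ending
    with the line that carries the closing quotes.
    """
    *body, closing = lines
    # The whitespace in front of the closing quotes is what gets removed.
    indent = closing[: len(closing) - len(closing.lstrip())]
    out = []
    for line in body:
        if line.startswith(indent):
            out.append(line[len(indent):])
        elif line.strip() == "":
            out.append("")  # assumption: short blank lines become empty
        else:
            # Content starts to the left of the closing quotes: an error.
            raise ValueError(f"line is indented less than the closing quotes: {line!r}")
    # The line breaks next to the opening and closing quotes are not included.
    return "\n".join(out)
```

| With the closing quotes at column 0 the indent is empty, so nothing
| is stripped, which is the "disable" case asked about above.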
| Metasyntactic wrote:
| >And the code sample you show differs only in that closing
| quote isn't indented. You don't explain why and how that
| change would affect the generated string.
|
| Hi there. This is explained in the spec in a few places. In
| the examples section it explicitly states:
|
| > To make the text easy to read and allow for indentation
| that developers like in code, these string literals will
| naturally remove the indentation specified on the last line
| when producing the final literal value.
|
| > If the indentation behavior is not desired, it is also
| trivial to disable like so:
|
| I thought that was clear as the prior explanation says that
| we remove the indentation from the last line. And then I
| show how you can disable it. Specifically, as you noted
| because the closing quote line is no longer indented.
| Cheers!
| gwbas1c wrote:
| ... Why?
|
| There are so many different ways to do strings in C#. Adding
| features like this just makes the language harder to learn, and
| the compiler harder to implement.
|
| At this point, it's probably better to adjust the compiler to
| make it easier to turn a text file into a hardcoded string. The
| embedded resource approach works, but it could be significantly
| smoother.
|
| Or, maybe the compiler needs some form of a plugin architecture
| so people who want obscure features can figure out how to add
| them?
| Metasyntactic wrote:
| Hi. I'm the lang designer and feature implementor.
|
| To your question of "why?", we tried to cover the reasoning in
| the proposal. But, the core reason is that today people do use
| strings a ton. And in many cases it's unpleasant to do so
| because you always end up with reasons that you need to escape
| the content. This escaping serves to satisfy the compiler, but
| really doesn't buy value to the user the majority of the time.
| The idea here is that you can just use a raw-string and say:
| here's the content, exactly as i want it.
| eterm wrote:
| It talks about why, because all those different ways require
| escaping.
| radicalbyte wrote:
| Because it allows us to literally embed other languages within
| C# and provide full refactoring, tooling support. Next level
| stuff this (and something which should be normal, it's 2022
| ffs!).
___________________________________________________________________
(page generated 2022-02-17 23:01 UTC)