[HN Gopher] HTML comments work in JavaScript too
       ___________________________________________________________________
        
       HTML comments work in JavaScript too
        
       Author : smitop
       Score  : 118 points
       Date   : 2022-03-08 12:31 UTC (10 hours ago)
        
 (HTM) web link (smitop.com)
 (TXT) w3m dump (smitop.com)
        
       | Leszek wrote:
       | Fun fact -- in V8 this is the only case which requires three
       | token lookahead, to distinguish `<!-foo` meaning "less-than not
       | negative foo" from `<!--` meaning "start of HTML comment".
       | There's a whole rewinding mechanism in the scanner which wouldn't
       | have to exist if not for this syntax.
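       | 
       | A rough sketch of why the lookahead is needed. This is not V8's
       | actual code: `scanAfterLessThan` and the token labels are made
       | up for illustration, and other operators starting with `<` are
       | left out. After seeing `<`, the scanner has to peek at the next
       | three characters before it can tell a comment from an operator.

```javascript
// Hypothetical scanner fragment; the function name and token labels are
// illustrative only. Call it with `pos` pointing at a '<' in `src`.
function scanAfterLessThan(src, pos) {
  // `<!--` only becomes a comment once all three following characters match.
  if (src.startsWith("!--", pos + 1)) {
    // Skip to the end of the line, like a `//` comment (simplified: only
    // \n and \r are treated as line terminators here).
    let end = pos + 4;
    while (end < src.length && src[end] !== "\n" && src[end] !== "\r") end++;
    return { token: "HTML_COMMENT", next: end };
  }
  // Otherwise only the '<' is consumed; '!' and '-' are scanned separately.
  return { token: "LT", next: pos + 1 };
}

console.log(scanAfterLessThan("0 <!-foo", 2).token);   // LT
console.log(scanAfterLessThan("0 <!-- foo", 2).token); // HTML_COMMENT
```

       | In the `LT` case the scanner has already looked at `!-` and has
       | to hand those characters back, which is presumably where the
       | rewinding comes in.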
        
         | gregsadetsky wrote:
         | Source:
         | 
         | https://github.com/v8/v8/blob/main/src/parsing/scanner-inl.h...
         | 
         | and
         | 
         | https://github.com/v8/v8/blob/main/src/parsing/scanner.cc#L3...
         | 
         | And also this nice article:
         | 
         | https://v8.dev/blog/scanner
         | 
         | "The scanner chooses a specific scanner method or token based
         | on a maximum lookahead of 4 characters, the longest ambiguous
         | sequence of characters in JavaScript"
        
         | srcreigh wrote:
         | That's extra funny considering `<!-foo` is an extraordinarily
         | useless expression.
        
           | eyelidlessness wrote:
           | In useful code sure. But `<!-f` with its dynamic casting
           | implications _may_ be useful in code golf. I haven't golfed
           | in years so I'm not sure, but I can imagine it having some
           | benefit I haven't thought of.
        
             | wjmao88 wrote:
             | <! is less than or not equal right? which is a strict
             | subset of !=, since anything < by definition is !=
        
               | dragonwriter wrote:
               | > <! is less than or not equal right?
               | 
               | No, it is less-than-not, not less-than-or-not-equal-to.
               | 
               | That is, <!x is the same as < (!x); if x is true-ish, it
               | is "< false" and if x is false-ish it is "< true".
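               | 
               | A small illustration of that parse; `lessThanNot` is
               | just a name picked for the example.

```javascript
// `a <!x` is `a < (!x)`: the `!` applies to x first, then the boolean is
// converted to a number (false -> 0, true -> 1) for the comparison.
const lessThanNot = (a, x) => a <!x;

console.log(lessThanNot(0, "truthy")); // false: 0 < false, i.e. 0 < 0
console.log(lessThanNot(0, ""));       // true:  0 < true,  i.e. 0 < 1
console.log(lessThanNot(1, ""));       // false: 1 < true,  i.e. 1 < 1
```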
        
             | srcreigh wrote:
             | You may be right. Unless it's true that `<!-f` is always
             | equivalent to `<!f`.
             | 
             | Then again, I'd be even more horrified if the above
             | statement is not true anyways.
        
               | eMSF wrote:
               | The two are certainly not always equivalent (for example
               | if f is an object). `<!-f` might always be equivalent to
               | `<!+f`, though.
        
               | eyelidlessness wrote:
               | Had to try it! Here is some horror for your day:
               | https://codepen.io/eyelidlessness/pen/abVrjQq
        
               | chayleaf wrote:
               | this is <-!, not <!-
        
               | HAL9000Ti wrote:
               | It's 0<!-f in the code, just not in the html text
        
               | eyelidlessness wrote:
               | Thanks both, fixed! That's what I get for coding on my
               | phone
        
               | LordDragonfang wrote:
               | For anyone confused, this is actually (semi) sane
               | behavior by javascript standards, because it relies on
               | `-"f"` evaluating to NaN.
               | 
               | Of course, because javascript, there are other string
               | values of `f` for which the two statements are equal as
               | well (e.g. for the empty string both statements are true
               | because `-"" === -0`, and for "1" both are false).
               | 
               | https://codepen.io/LordDragonfang/pen/KKyLjLg
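               | 
               | A sketch that puts the three spellings from this
               | subthread side by side; `compare` is a helper invented
               | for the example.

```javascript
// Evaluate the three expressions discussed above for a given f.
function compare(f) {
  return {
    minus: 0 <!-f, // 0 < !(-f)
    plain: 0 <!f,  // 0 < !f
    plus: 0 <!+f,  // 0 < !(+f)
  };
}

for (const f of ["f", "", "1", "0", {}]) {
  console.log(JSON.stringify(f), compare(f));
}
```

               | `minus` and `plain` agree for "" and "1" but differ for
               | "f", "0" and {}. `minus` and `plus` agree for all of
               | these inputs; one case where they would not is a BigInt,
               | since unary plus throws a TypeError on BigInt values.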
        
         | bilalq wrote:
         | Is V8 unable to remove this because of old, poorly-written
         | websites?
         | 
         | Can it maybe shut off if it detects strict mode or something?
         | Seems wild to find this behavior in Node and evergreen
         | browsers.
        
           | Leszek wrote:
           | Having it in there optionally would be even more awkward, and
           | getting rid of it entirely isn't possible without changing
           | the JavaScript spec /shrug
        
           | int_19h wrote:
           | It's part of the language spec.
        
       | yakshaving_jgt wrote:
       | Another ugly historical snapshot, but JS in CSS instead:
       | 
       | https://jezenthomas.com/you-think-css-in-js-is-bad/
        
         | tomxor wrote:
         | IE also had a custom image filtering scripting language
         | embedded in CSS that was pretty dangerous, you could easily
         | hang someone's whole computer if you weren't careful. A common
         | hack to find was adding png transparency which IE infamously
         | lacked, at great computational cost to the user (this was at a
         | time where the lack of other CSS features made PNG transparency
         | quite necessary for building fancy UIs).
         | 
         | It's worth pointing out that many of the old ugly things were
         | IE specific including the feature referenced by the parent
         | comment... it was a time when "standard" had little weight,
         | although some of the good ideas like box-sizing were adopted by
         | W3C.
        
         | err4nt wrote:
         | Today we have CSS custom properties which do similar things:
         | 
         | - Authors can define custom properties as long as they start
         | with a double-dash, like --demo
         | 
         | - Authors can leave these values empty, or put any code that
         | doesn't break CSS's syntax inside. This can be CSS values, it
         | can be JSON, it can be other languages, even some JavaScript
         | could be stored in CSS to be read later
         | 
         | - Using JavaScript you can read the values of CSS custom
         | properties, and set them
         | 
         | So it's possible with a tiny bit of JS to re-create what the
         | old IE expression() function used to do in any modern browser
         | in a fully standards-compliant way
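         | 
         | A minimal sketch of the read/set part, using the standard
         | `getComputedStyle`, `getPropertyValue` and `style.setProperty`
         | calls. The helper names and the `--demo` value are assumptions
         | made up for the example.

```javascript
// Read a custom property such as `--demo` from an element's computed style.
// `getStyle` defaults to the browser's getComputedStyle; it is a parameter
// only so the helper can be exercised outside a browser.
function readCustomProperty(
  element,
  name,
  getStyle = (el) => globalThis.getComputedStyle(el)
) {
  return getStyle(element).getPropertyValue(name).trim();
}

// Set a custom property as an inline style on the element.
function writeCustomProperty(element, name, value) {
  element.style.setProperty(name, value);
}

// Browser usage, assuming a stylesheet with `:root { --demo: {"theme":"dark"} }`:
//   const raw = readCustomProperty(document.documentElement, "--demo");
//   const config = JSON.parse(raw);
```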
        
           | mushyhammer wrote:
           | > So it's possible with a tiny bit of JS
           | 
           | Not really. expression() was evaluated by just moving the
           | mouse around, which is what allowed attaching stuff to your
           | cursor with "just some CSS"
           | 
           | https://web.archive.org/web/20120420154658/http://msdn.micro...
           | 
           | To do what expression() could do you need several event
           | listeners at least which, I mean, of course you can still do
           | it, but it's really unrelated to CSS Properties and just
           | JavaScript at this point.
        
       | ape4 wrote:
       | I would like more commenting options in CSS. Doing /* something
       | */ is the only way I know of.
        
         | danShumway wrote:
         | I would not recommend this, I think it's more trouble than it's
         | worth, the syntax highlighting isn't going to be great, and I
         | think you're just going to confuse your coworkers, but if you
         | _really_ want to, you can take advantage of the fact that CSS
         | ignores rules that it doesn't recognize.
         | 
         |     .Component__child {
         |       TODO: add theming support
         |     }
         | 
         | Of course, you have some restrictions around syntax here
         | (again, I didn't say I advise doing this), and of course if CSS
         | ever adds a TODO rule you might run into problems.
        
           | eyelidlessness wrote:
           | This isn't much different from the typical JSON comment hack:
           | 
           |     {
           |       "//": "comment goes here"
           |     }
           | 
           | Most editors and tooling also know this is commonly used and
           | ignore validation of repeated keys. Unfortunately it doesn't
           | much help with arrays unless you specifically handle the case
           | (and depending on usage, handle escaped cases as well).
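           | 
           | A sketch of how the hack behaves with `JSON.parse`, plus one
           | way to drop the pseudo-comments afterwards. `stripCommentKeys`
           | is a hypothetical helper, and the sample config is made up.

```javascript
const text = `{
  "//": "first comment",
  "extern": "protobuf.js",
  "//": "second comment"
}`;

// JSON.parse accepts duplicate keys; the last occurrence wins.
const parsed = JSON.parse(text);
console.log(parsed["//"]); // second comment

// Drop "//" keys at every object level. Comments placed inside arrays
// would need separate handling, as noted above.
function stripCommentKeys(value) {
  if (Array.isArray(value)) return value.map(stripCommentKeys);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value)
        .filter(([key]) => key !== "//")
        .map(([key, child]) => [key, stripCommentKeys(child)])
    );
  }
  return value;
}

console.log(stripCommentKeys(parsed)); // { extern: 'protobuf.js' }
```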
        
           | err4nt wrote:
           | Do not practice writing invalid CSS. If this gets out onto
           | the internet, it will prevent any real `todo` property from
           | being used, because people are camping on CSS's namespace.
           | 
           | Instead, any author is allowed to invent _any_ property, so
           | long as the property name begins with a double-dash (--), and
           | you can put anything inside as a value so long as it doesn't
           | break CSS syntax:
           | 
           |     a {
           |       --note:
           |         So you could do this instead and it would be standard CSS,
           |         perfectly fine for all people forever!
           |       ;
           |     }
        
             | danShumway wrote:
             | Hah, this is my brain thoroughly broken by forced IE-
             | compatibility for so long on legacy apps, but I completely
             | forgot about custom properties when I was writing that
             | comment! You're right, that's a much better solution.
             | 
             | I still don't think I recommend it, because sure it gets
             | rid of the problem of accidentally breaking CSS namespace,
             | but I still think in most cases sticking comments inside of
             | a property is going to end up confusing people on a team
             | more often than it will help, and there are still some
             | syntax issues and conflicts with your own internal
             | variables that can happen.
             | 
             | But for sure, if you're going to use properties for notes,
             | make them custom properties.
             | 
             | ----
             | 
             | _Edit:_ I particularly don't recommend this, but off the
             | top of my head using custom properties you also might be
             | able to rig this up with something like `:before` pseudo-
             | elements to even display some of your CSS comments visibly
             | in the page via `content` or by building style toggles[0]
             | or something.
             | 
             | Again, I feel like this is getting too clever for its own
             | good, but as long as we're playing around...
             | 
             | [0]: https://css-tricks.com/logical-operations-with-css-variables...
        
         | thrusong wrote:
         | // I would love a two-forward-slashes comment in CSS
        
           | nayuki wrote:
           | I love it too, which is why I use Sass/SCSS, a gentle
           | extension of CSS.
        
             | frosted-flakes wrote:
             | I use SCSS for three reasons: // comments, nesting, and
             | importing stylesheets. SCSS has so much more than that, but
             | I don't use any of it.
        
           | ajb92 wrote:
           | Me too. I expect though that there'd be lots of breaking
           | changes around existing CSS having, for instance,
           | `background-image: url(https://somedomain.abc/somefile.png)`,
           | as quotes are optional in URL values.
        
             | david422 wrote:
             | // is only a comment if there is only whitespace directly
             | before it?
        
               | mananaysiempre wrote:
               | Note that //foo/bar.txt is a valid address (a "network-path
               | reference"[1]). Those are rarely seen but are used
               | sometimes when the same resource or snippet needs to be
               | usable with both HTTP and HTTPS. (I think I first learned
               | about this form while reading a text on gradual migration
               | to HTTPS circa 2014.)
               | 
               | My impression is this form might actually be the original
               | way of expressing network paths, what with UNC in Windows
               | (\\server\share\file etc.) and POSIX carving out an
               | exception specifically for two slashes at the beginning
               | of a pathname (that is, foo//bar is the same as foo/bar
               | and ///foo/bar is the same as /foo/bar, but //foo/bar may
               | or may not be the same as /foo/bar).
               | 
               | [1]
               | https://datatracker.ietf.org/doc/html/rfc3986#section-4.2
        
               | chrismorgan wrote:
               | On the web, and most of the rest of the internet as well
               | for most practical purposes, WHATWG's URL Standard is now
               | the normative reference for URLs, and obsoletes IETF's
               | RFC 3986 and 3987. (Source:
               | https://url.spec.whatwg.org/#goals.) There, this concept
               | is called a _scheme-relative_ URL (a much more sensible
               | name):
               | https://url.spec.whatwg.org/#scheme-relative-url-string.
               | They're not the _most_ common, but they're not
               | especially rare, either. HTML compressors especially will
               | normally support emitting URLs in this form--you tell the
               | compressor "this file you're compressing will be served
               | from the URL https://1.example/path/to/foo" and it uses
               | this knowledge to rewrite URLs inside the page,
               | https://1.example/path/to/bar into bar,
               | https://1.example/path/from into ../from,
               | https://1.example/baz into /baz, https://2.example/ into
               | //2.example, and http://1.example/ into http://1.example.
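               | 
               | The rewrites listed above can be checked in the other
               | direction with the standard `URL` constructor, which
               | resolves a reference against a base. `resolve` is just a
               | helper name for the example.

```javascript
const base = "https://1.example/path/to/foo";
const resolve = (ref, from = base) => new URL(ref, from).href;

console.log(resolve("bar"));         // https://1.example/path/to/bar
console.log(resolve("../from"));     // https://1.example/path/from
console.log(resolve("/baz"));        // https://1.example/baz
console.log(resolve("//2.example")); // https://2.example/

// A scheme-relative URL picks up whatever scheme the base has.
console.log(resolve("//2.example", "http://1.example/")); // http://2.example/
```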
        
           | chrismorgan wrote:
           | You kinda can:
           | 
           |     selector {
           |       //property: value;
           |     }
           | 
           | This sets the property named "//property" to "value". Close
           | enough. (Expressed otherwise: it's _approximately_ a comment
           | until the next semicolon, barring quoted strings and
           | parentheses.)
           | 
           |     //selector {
           |       property: value;
           |     }
           | 
           | This is an invalid selector, and so the entire rule is
           | ignored. (Expressed otherwise: it's _approximately_ a comment
           | until the next opening curly brace's matching closing curly
           | brace.)
        
             | croddin wrote:
             | I have done the same type of thing to comment out lines of
             | a json file used for configuration. Append "//" or
             | something to the beginning of a key that I want to have
             | temporarily "commented out". (Be careful as some programs
             | might expect those keys not to be there of course.)
        
               | hombre_fatal wrote:
               | Heh, I've done this in strict JSON config files:
               | 
               |     {
               |       "//": "lets us consume uncompiled protobufs (see issue #123)",
               |       "extern": "protobuf.js"
               |     }
        
             | zamadatix wrote:
             | Seems like it has plenty of disadvantages but no advantages
             | over simply using the canonical method to do the same hack:
             | 
             |     /*
             |       // blah
             |     */
             | 
             | Edit: I mistakenly parsed this as using a dummy selector to
             | hold the comment property, not just putting it inside real
             | selectors.
        
               | galaxyLogic wrote:
               | It is faster to type // than /* ... */ . Any time you
               | type the same character twice it is much faster than
               | typing two separate characters.
               | 
               | Further because I write a lot of JavaScript too I
               | sometimes make the mistake of putting a // -comment into
               | my CSS which breaks things. And when things break in CSS
               | you usually don't get a clear error-message about it.
               | 
               | It is always difficult to switch between two languages. I
               | would prefer something like JavaScript stylesheets, if
               | that was possible.
        
               | dylan604 wrote:
               | Unless you are commenting out multiple lines/blocks of
               | code.
        
               | galaxyLogic wrote:
               | True. But the IDE can of course help. In WebStorm I can
               | select multiple lines of code and press Ctrl-/ to have
               | them all turned into single-line comments if they were
               | not, and uncomment them if they were.
               | 
               | Single-line comments have the benefit that it is always
               | clear to the reader which lines are comments. Whereas if
               | you use multi-line comments to comment a large section of
               | code it is no longer clear when reading it whether it is
               | indeed "inside" a comment.
               | 
               | And finally having both types of comments available is
               | useful because you can multi-line-comment a section which
               | already contains // -comments, whereas you can not multi-
               | line comment a section which already contains multi-line
               | comments.
        
               | dylan604 wrote:
               | Commenting a comment is still a comment.
               | 
               | // => ////
               | 
               | /////////////////////////////////////////////////////////
               | /////
               | 
               | // just like those devs that like to decorate their code
               | with
               | 
               | /////////////////////////////////////////////////////////
               | /////
               | 
               | /*
               | 
               | * type of stuff
               | 
               | */
               | 
               | Edit: almost forgot my shell friends
               | 
               | ########################
               | 
               | ## works too
               | 
               | ########################
        
               | ptudan wrote:
               | This seems like the worst of both worlds. If you're going
               | to use the /* */ I don't see why you'd bother with //
               | unless it's about recognizing a comment in a single line.
        
         | err4nt wrote:
         | The reason CSS will never have //-style single-line comments is
         | that the entire CSS language is parsed as a single line, so if
         | you ever entered a //-style single line comment there would be
         | no return from it for the rest of the input stream.
         | 
         | (Kind of like the <plaintext> element in HTML, once you write
         | the <plaintext> open tag all text after that is plain text, so
         | you can't write a </plaintext> to get out of it - the whole
         | rest of the file is in that mode with no return)
        
         | acdha wrote:
         | Out of curiosity, what editor do you tend to use? Commenting
         | has been a single shortkey for me for so long that I had to
         | remind myself that CSS doesn't support something like the //
         | syntax because my normal editing flow is to either start typing
         | or select text and hit Command+/.
        
       | somehnacct3757 wrote:
       | I look forward to reading a future disclosure showing how you can
       | somehow use this information for mayhem
        
       | danShumway wrote:
       | Huh, that's really interesting, the Firefox dev tools even do
       | proper syntax highlighting. I wonder how JSX parsers react to
       | this. A quick test with an online Babel parser is throwing
       | errors, so I assume it's just not supported? But maybe there's a
       | setting.
       | 
       | JSX also by default doesn't have an easy way to insert comments
       | into the generated HTML, so I don't think JSX is the reason why
       | it wouldn't be supported in Babel. My understanding was always
       | that it has more to do with comments being treated slightly
       | differently than normal document nodes, but I could be wrong.
       | 
       | I guess more likely the reasoning is that it's just obscure. I've
       | been programming JS for a reasonably long time and never knew
       | this was supported.
        
         | colejohnson66 wrote:
         | The reason for the error from Babel is that JSX is an HTML
         | _like_ (more so XML) syntax, not HTML. The obtuse comment
         | syntax is because you need to escape to JavaScript with the
         | braces, then use a multi-line JavaScript comment block. I wish
         | they would've allowed HTML comments in the original design,
         | but my guess is that comments were an oversight that just
         | happened to work due to the escape syntax.
        
           | schwartzworld wrote:
           | It shouldn't be hard to create a component that does what you
           | want
           | 
           | <Comment>some comment</Comment>
        
             | [deleted]
        
           | chrismorgan wrote:
           | One good reason not to support it is that it's ambiguous: is
           | it a comment (to be ignored), or does it represent a comment
           | node? If the former, why not just use /* ... */? If the
           | latter, it requires that the library or compilation target or
           | whatever support comment nodes, which is more work for a
           | dubious feature.
        
             | colejohnson66 wrote:
             | I'd argue that comments only serve to help the developer.
             | And when using JSX, the generated HTML might not look
             | pretty (such as being a single, long line). So an HTML
             | style comment in the JSX should be stripped because
             | debugging the minified HTML isn't something people
             | generally do.
             | 
             | And if, for some reason, you wanted to inject an HTML
             | comment into the output, you could abuse the
             | `dangerouslySetInnerHTML` prop like this:
             | 
             |     function Comment(props)
             |     {
             |         return <span dangerouslySetInnerHTML={{__html: `<!-- ${props.comment} -->` }} />;
             |     }
             | 
             | Then use it like this:
             | 
             |     ...
             |     <Comment comment="my comment" />
             |     ...
             | 
             | Using `React.Fragment` doesn't work, sadly. Use the
             | "Inspect Element" tool on the output here:
             | https://playcode.io/870223
        
               | eyelidlessness wrote:
               | There are other use cases for (preserved) HTML comments
               | in (or produced by) JSX, but generally more for library
               | authors than end users. For example, comments can be used
               | for hydration markers to delineate "hydration islands"
               | without need for a wrapping element. (I have a proof of
               | concept of this sitting in a Codepen somewhere.)
               | 
               | And while it doesn't work in React's Fragment, it can
               | certainly work with another compiler. JSX doesn't specify
               | anything about what it compiles to,
               | React.createElement/_jsx are just defaults.
        
         | bradneuberg wrote:
         | I've literally been programming in JS for more than 20 years
         | and this is new to me too!
        
         | chrismorgan wrote:
         | A challenge: figure out how to get to the _script data double
         | escaped state_
         | (https://html.spec.whatwg.org/multipage/parsing.html#script-d...),
         | or perhaps cheat by looking at
         | https://stackoverflow.com/questions/23727025/script-double-e....
         | Then, contemplate the implications, and devise an
         | injection attack for something that is foolish enough to try to
         | emit anything user-controllable into JavaScript (suppose it's a
         | string: they may have thought it was enough to escape \ and ",
         | but this will rapidly show you that it's not).
         | 
         | Reading through the specs reveals _lots_ of curious historical
         | details. The HTML spec is a glorious tangled mess because it's
         | an _exhaustive_ definition of a large amount of functionality
         | that grew... ah... _organically_.
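         | 
         | On the defensive side of that challenge, one common mitigation
         | is to escape `<` as well when writing a value into an inline
         | script. This is a sketch, not a complete defense, and
         | `serializeForInlineScript` is a hypothetical helper name.

```javascript
// Serialize a value for embedding inside an inline <script> element.
// JSON.stringify takes care of \ and ", but leaves `<` alone, so sequences
// such as `</script>` or `<!--<script>` would still reach the HTML tokenizer.
// Rewriting `<` as the string escape \u003c keeps the value identical for
// JavaScript while hiding it from the HTML parser.
function serializeForInlineScript(value) {
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

const userInput = "</script><!--<script>";
const literal = serializeForInlineScript(userInput);

console.log(literal.includes("<"));            // false
console.log(JSON.parse(literal) === userInput); // true
```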
        
       | jkrems wrote:
       | Important footnote: Unless you are in a JavaScript module file.
       | They work in scripts only. E.g. the following is a syntax error
       | in a module but a valid script:
       | 
       |     <!-- ok if it's a script only
       |     --> console.log("ok");
        
         | tentacleuno wrote:
         | Strict mode is forced in modules, while you need to "use
         | strict" in scripts. That might explain it.
        
           | jkrems wrote:
           | It's actually not related to strict mode. HTML comments work
           | fine in strict mode as well as sloppy mode in scripts. The
           | difference here is that modules are a different file format /
           | syntax and they fundamentally don't include parts of the
           | syntax that was supported in the older script file format
           | (and vice versa: they allow syntax like top-level await that
           | wasn't/isn't valid in the script file format).
        
         | lhorie wrote:
         | Technically, they "work" in both scripts and modules, they just
         | work differently in each case. This is valid Javascript that
         | returns true if running in script mode or false if running in
         | module mode:
         | 
         |     const isScript = (x = true) => x
         |     <!--x
         |     console.log(isScript()) // true or false depending on whether the code is ESM
         | 
         | What's happening is that `<!--` is a comment token in script
         | mode, but it's parsed as `< ! --` in module mode.
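         | 
         | Both readings can be tried from a single file. This sketch
         | assumes the `Function` constructor parses its body as
         | non-module code (so HTML-like comments apply there), and it
         | spells out the module reading by hand; `asScript` and
         | `asModule` are names made up for the example.

```javascript
// Script reading: `<!--x` is a comment, so the body is just `return x`.
const asScript = new Function("x = true", "return x\n<!--x\n");

// Module reading: the same characters tokenize as `x < ! --x`, written
// with spaces here so this file parses the same way in either mode.
const asModule = (x = true) => x < !--x;

console.log(asScript()); // true
console.log(asModule()); // false: `--x` turns true into 0, !0 is true, and true < true is false
```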
        
       | nayuki wrote:
       | In XHTML (XML) mode, comments are treated as actual comments. In
       | the following working code:
       | 
       |     <script>
       |     alert("Hello <!-- Comment 0 -->world");
       |     &lt;!-- Comment 1
       |     alert("Bonjour le monde");
       |     </script>
       | 
       | Comment 0 gets removed by the XML parser so it doesn't go into
       | the alert string. Comment 1 is seen by the JavaScript parser and
       | gets removed at that stage.
        
         | chrismorgan wrote:
         | Demonstration of this, using a _data:_ URL (paste it in the
         | address bar):
         | 
         |     data:application/xhtml+xml,<script xmlns="http://www.w3.org/1999/xhtml">%0Aalert("Hello <!-- Comment 0 -->world");%0A&lt;!-- Comment 1%0Aalert("Bonjour le monde");%0A</script>
         | 
         | (Observe also how what I've written here nets you a document
         | that lacks html, head and body tags--HTML syntax fills those in
         | through the magic of optional start and end tags, but XML
         | syntax takes what it's given and can be used to do things like
         | nesting hyperlinks or putting a heading inside a paragraph. The
         | HTML and XML syntaxes for HTML are actually mutually
         | incompatible.)
        
           | nayuki wrote:
           | Very nice use of media types and data URIs! Indeed, the root
           | of your document is <script>, which is rather funny. Of
           | course HTML and XML syntaxes are incompatible, but there is a
           | reasonable polyglot subset that is compatible with both.
           | 
           | Other common things that HTML syntax won't let you do:
           | <p><p></p></p>, <table><tr><td> without an implicit <tbody>.
           | 
           | Why I know this obscure corner of the HTML standard is
           | because I've been running my website on XHTML mode for more
           | than a decade continuously.
           | https://www.nayuki.io/page/practical-guide-to-xhtml
        
             | Semaphor wrote:
             | Ah, the times when I was always trying to serve my site in
             | the strictest way possible. I religiously followed
             | everything Anne van Kesteren [0] wrote, and used his code-
             | snippet [1] to serve application/xhtml+xml to browsers that
             | supported it. He was my same-aged teenage hero :)
             | 
             | [0]: https://annevankesteren.nl/
             | 
             | [1]: https://annevankesteren.nl/2003/09/send-applicationxhtmlxml
        
             | chrismorgan wrote:
             | https://www.w3.org/TR/html-polyglot/ is good reading on
             | this topic too.
             | 
             | A fun thing that I learned recently is that `table > tr` is
             | actually _valid_, the tbody is genuinely optional in the
             | spec, even though the HTML syntax injects it.
        
               | aidos wrote:
               | Back in the days when we used to nest tables 70 deep for
               | layouts I don't recall anyone using tbody - was it added
               | later?
        
               | TacticalCoder wrote:
               | > tbody is genuinely optional in the spec
               | 
               | IIRC optional in the HTML spec only. Not in the XHTML one
               | and not in the polyglot syntax.
        
         | TacticalCoder wrote:
         | Yup that's not valid polyglot HTML5/XHTML code. In HTML < (and
         | &) aren't special (except at the end tag). But in XHTML < is
         | treated as a tag. Here's the link to that part of the spec for
         | anyone interested:
         | 
         | https://www.w3.org/TR/html-polyglot/#raw-text-elements
        
       | yohannparis wrote:
       | Now I feel old to remember doing this when adding a tiny bit of
       | Javascript to display alert() as a cool feature.
        
         | Kiala wrote:
         | Ah, the good old <script language="JavaScript1.2"> days.
        
           | masswerk wrote:
           | Don't do that, it activates negative zero! :-)
           | 
           | (Which is probably fine, if your background is in scientific
           | UNIVACs.)
        
           | samwillis wrote:
           | Worse than that I remember <script language="VBScript"> with
           | IE, yuck!
        
         | nightgarden wrote:
         | Ah, the good ol' days of javascript text animations in the
         | status bar. I remember them too
        
       | [deleted]
        
       | olliej wrote:
       | This behaviour was a result of people transmitting "xhtml" as
       | html back when people were obsessed with xhtml.
        
       ___________________________________________________________________
       (page generated 2022-03-08 23:01 UTC)