[HN Gopher] How to Use Em Dashes (-), En Dashes (-), and Hyphens...
___________________________________________________________________
How to Use Em Dashes (-), En Dashes (-), and Hyphens (-)
Author : Stratoscope
Score : 585 points
Date : 2025-03-27 20:19 UTC (1 days ago)
(HTM) web link (www.merriam-webster.com)
(TXT) w3m dump (www.merriam-webster.com)
| perihelions wrote:
| Dupe--https://news.ycombinator.com/item?id=43447819 (_"How to
| use an en-dash and em-dash correctly?"_, 43 comments)
| jimnotgym wrote:
| More interestingly it is the same highly niche subject from two
| different websites in two days. HN is...different
| milesrout wrote:
| This isn't niche, it is the sort of thing every child used to
| have drilled into them in primary school.
| rkosk wrote:
| Yeah exactly, used to. It's niche now, and has been for
| decades.
| milesrout wrote:
| It isn't niche just because education has taken a
| nosedive.
| A_D_E_P_T wrote:
| AFAIK most computer keyboards don't have em dashes. Rather than
| hit ALT+0151 every time, I've always just strung along two
| hyphens, like: --
|
| Absolutely _proper and correct_ use of em dashes, en dashes, and
| hyphens is, to me, the most obvious tell of the LLM writer. In
| fact, I think that you can use it to date internet writing in
| general. For it seems to me that real em dashes were uncommon
| pre-2022.
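
For reference, the leading-zero Alt code mentioned above appears to index into the Windows ANSI code page rather than into Unicode directly. A minimal Python sketch, assuming that code page is Windows-1252 (the Western European default; other locales will differ). The helper name `alt_code_char` is made up for illustration:

```python
import unicodedata

def alt_code_char(code: int, codepage: str = "cp1252") -> str:
    """Decode a leading-zero Alt code as one byte in the given code page.

    Assumption: the system ANSI code page is Windows-1252.
    """
    return bytes([code]).decode(codepage)

for code in (150, 151):
    ch = alt_code_char(code)
    print(f"Alt+0{code} -> U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Under that assumption, 150 maps to the en dash and 151 to the em dash.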
| n2d4 wrote:
| Alt+hyphen or alt+shift+hyphen is an endash/emdash. You may not
| have been aware of it because it's so subtle, but many people
| (including myself) used emdashes long before 2022
|
| (edit: apparently only on Mac, see reply below)
| jml7c5 wrote:
| I believe that's only on MacOS.
| n2d4 wrote:
| Seems like you're correct. Interesting!
| dragonwriter wrote:
| I think Microsoft Office (maybe just Word, but definitely
| not Windows) has a similar default shortcut.
| harrall wrote:
| You don't need a shortcut on Word.
|
| You just type two hyphens (--) and Word will convert it
| to an em dash.
| hunter2_ wrote:
| Across the Office suite:
|
| Typing <word><hyphenminus><hyphenminus><word><space>
| yields an em dash.
|
| Typing
| <word><space><hyphenminus><hyphenminus><space><word><space>
| yields an en dash.
|
| That this has been true for some 3 or 4 decades makes me
| doubt all the comments that em dashes are a "tell" of LLM
| authorship. On the other hand, I guess when we confine
| this possibility to web content, I can see how people
| haven't used Office for web authoring lately, and
| whatever they do use (like web-based content management
| systems) don't tend to have this feature.
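
The two replacement rules described in this comment can be approximated with a pair of regular expressions. This is only a sketch of the behaviour as the commenter describes it, not Office's actual implementation: it works on finished text rather than reacting to the trailing space as you type, and the function name is hypothetical.

```python
import re

EM_DASH = "\u2014"
EN_DASH = "\u2013"

def office_style_autoformat(text: str) -> str:
    """Approximate the described AutoFormat rules on already-typed text."""
    # word--word  ->  word + em dash + word (no spaces involved)
    text = re.sub(r"(?<=\w)--(?=\w)", EM_DASH, text)
    # word -- word  ->  word + space + en dash + space + word (spaces persist)
    text = re.sub(r"(?<=\w) -- (?=\w)", f" {EN_DASH} ", text)
    return text

print(office_style_autoformat("wait--what"))
print(office_style_autoformat("wait -- what"))
```

A single hyphen between words is left alone, so ordinary hyphenated compounds are not affected.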
| iggldiggl wrote:
| > Typing
| <word><space><hyphenminus><hyphenminus><space><word><space>
| yields an en dash.
|
| More importantly, typing _just a single hyphen minus_ in
| this constellation triggers the autoreplace, too. (Typing
| the double hyphen is only necessary without spaces in
| order to distinguish between an intentional hyphen and an
| em dash.)
| hunter2_ wrote:
| Good point. Either way, it's kind of peculiar that
| getting an en dash in this manner demands flanking the
| hyphen(s) with spaces, and those spaces persist after
| replacement, when the typical usage of an en dash
| specifically _doesn't_ demand spaces.
|
| From TFA:
|
| > August 1-August 31
|
| From a top comment:
|
| > Boston-San Francisco flight, 10-20 years
|
| To achieve this using the replacement feature we're
| talking about would take something like
| <word><space><hyphenminus><space><word><space>
| <alt+leftarrow><bksp><leftarrow><bksp><alt+rightarrow>
| which is ridiculous.
|
| In professional typesetting, like a book, I sometimes see
| spaces flanking an em dash, however.
| venusenvy47 wrote:
| I can't get this to work in PowerPoint. It's funny, I
| clicked on this thread because I was struggling with
| trying to make an "emdash" in PowerPoint yesterday and
| couldn't find the correct search term for the "long
| hyphen" that I was looking for.
| hunter2_ wrote:
| Works fine for me on PowerPoint for Mac, oddly enough.
| Unrelatedly, Mac also allows easy (non-alt-code) keyboard
| entry: option-hyphen yields an en dash, while option-
| shift-hyphen yields an em dash.
| QuantumGood wrote:
| Turns into different things (like a bulleted list) in
| different situations in Word, though.
| lxgr wrote:
| That's one of my favorite features of macOS keyboard layouts,
| but it's so close to one of my least favorite ones - option +
| space inserting a non-breaking space.
|
| I almost never want that, and when typing "space, en dash,
| space", it happens quite easily and is usually impossible to
| tell visually.
| akho wrote:
| You always want a non-breaking space before a dash.
| lxgr wrote:
| How so? Wouldn't this prevent line breaks around dashes
| bracketing parenthetical statements? That's the opposite
| of what I want!
| IsTom wrote:
| Works here on Linux too, so not just Macs.
| thesauri wrote:
| On Macs:
|
| Hyphen -: -
|
| En Dash -: alt -
|
| Em Dash --: alt shift -
| alabastervlog wrote:
| The default US English Mac keyboard is so extremely good, and
| has been the way it is for so long, that I remain baffled
| that other platforms haven't simply copied it. I came to it
| relatively late in life and it's one of the reasons I wish
| I'd started using Macs sooner.
| agys wrote:
| This specific key combination is not US keyboard specific.
| I like how they managed to group characters that are
| formally similar by binding them to the same keys.
|
| Examples:
|
| en and em are on -
|
| Below are maybe Swiss specific?
|
| ~ is on N
|
| @ is on G
|
| | and \ and / are on 7
|
| [?] is on V
|
| ¥ is on Y and € is on E
|
| [?] on W ( [?] is a rotated W :)
|
| etc.
| alabastervlog wrote:
| Yeah, mostly the same on my US keyboard, except a couple
| like "@" (that's shift-2 on basically all US keyboards,
| and is printed on the key) and |/\, which are more
| prominent on US keyboards (two simply have their own
| keys, no shift modifier, even). I get the © symbol for
| option+g (which still kind-of makes sense!)
|
| I appreciate that the designer of the layout clearly
| attempted to make some kind of mnemonic connection to the
| degree they could. Makes it easier to discover and
| remember the key-combos, even without a cheat sheet.
| agys wrote:
| Ah! © is on C (makes sense!)
| alabastervlog wrote:
| That's c-cedille here, because to write English fluently
| you need to be able to type French loan words like façade
| --but not quite so often as someone in Switzerland,
| probably (especially so in some parts of the country!) so
| I assume you've got it somewhere even more prominent on
| your keyboard.
| Eric_WVGG wrote:
| my favorite example of this is ellipsis ... opt-; (the
| key with the colon over the semicolon is sort of a
| rotated ellipsis)
|
| thank you for teaching me [?]
| jbverschoor wrote:
| Except for international where € is opt-shift-2 (next
| to the pound/hash), next to the dollar
|
| modifiers:
|
| opt-e+letter é (acute/aigu)
|
| opt-`+letter è (grave)
|
| opt-i+letter û (circumflex)
|
| opt-u+letter ü (umlaut)
|
| opt-n+letter ñ (for the mañana)
| culi wrote:
| It's pretty decent but the fact that I can't type an
| arbitrary unicode character has been a huge annoyance of
| mine since I switched from Windows/WSL to Mac.
|
| They have shortcuts for I, I, and I but not for many
| commonly used characters like arrows
| minitech wrote:
| Control+Command+Space or Fn+E or Edit > Emoji & Symbols
| if you know the character's name. It's not very
| convenient for repeated use, but it gets the job done in
| a pinch.
| culi wrote:
| Yeah it's not great. Edit isn't always there. Fn+E seems
| to make the most sense. I've heard about ctrl+cmd+space
| but commonly forget it. Both of those open the same GUI
| which combines emojis, stickers, and unicode symbols--
| preferring the first two categories over the last. To
| type out a unicode symbol it takes at least three clicks
| on top of me starting to type in the name of my symbol
|
| _sigh_
|
| Thanks for the suggestions
| Aaron2222 wrote:
| You can add the "Unicode Hex Input" keyboard layout,
| which lets you enter BMP characters by holding down
| Option and entering its code point in hex (similar to the
| hex entry on Windows). Expanding the Emoji & Symbols pane
| minitech mentioned also lets you browse by category (e.g.
| arrows), and you can customise the categories and add a
| full Unicode character picker (not limited to BMP like
| the Windows Character Map) there as well.
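
To make the BMP distinction above concrete: a code point is in the Basic Multilingual Plane when it fits in four hex digits (U+0000 to U+FFFF). A small Python sketch using the standard `unicodedata` module; the `describe` helper is hypothetical and just prints what a hex-input method would need:

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point, official Unicode name, and plane of a character."""
    cp = ord(ch)
    plane = "BMP" if cp <= 0xFFFF else "outside BMP"
    return f"U+{cp:04X} {unicodedata.name(ch)} ({plane})"

# Hyphen-minus, en dash, em dash, a right arrow, and an emoji for contrast.
for ch in ("-", "\u2013", "\u2014", "\u2192", "\U0001F600"):
    print(describe(ch))
```

The dashes and arrows all fall inside the BMP, while emoji generally do not.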
| JimDabell wrote:
| Aside from the solutions other people have mentioned, if
| you have often-used symbols, you can set up a text
| replacement in keyboard settings. For instance, I have
| :x: for the multiplication sign.
| kps wrote:
| It's very easy[1] on MacOS to make yourself a custom layout
| with the characters you commonly use. Personally, I put
| arrows on [?]|HJKL, vi-style.[2] (Doing so for Linux is a
| little more work, as xkb is more complicated and less
| capable.)
|
| [1] https://software.sil.org/ukelele/
|
| [2] https://codeberg.org/datatravelandexperiments/kps-
| keyboard-l...
| wil421 wrote:
| The engineers of various AIs are probably reading your comment
| and making adjustments.
|
| Or we are both just AIs, as a portion of HN comments are,
| commenting back and forth about other AIs.
| nextts wrote:
| As are super pedantic humans.
| salynchnew wrote:
| Most obvious tell of the former/current Stripe employee, imho.
| jsheard wrote:
| What's the significance of Stripe here?
| dadoum wrote:
| I use a compose key on Linux to write those. By default you
| should have these compositions available: --- for an em dash
| and --. for an en dash.
| akshayshah wrote:
| Em- and en-dashes have been well-supported by LaTeX, the
| smartypants family of Markdown extensions, and plain HTML for
| more than 20 years.
|
| In your support, though, calling the extension "smartypants"
| really hints at the target audience :)
| PartiallyTyped wrote:
| On mac it's very easy to get an em-dash, just alt+shift+`-`.
| Though I do concur that it's more likely to come from an LLM, I
| don't think it should be considered a tell -- I find it more of
| a predictor of the writer's age.
| grumbel wrote:
| In Linux/Xorg with a compose/multi-key one can do:
|
| <Multi_key> <minus> <minus> <period> : "-" U2013 # EN DASH
|
| <Multi_key> <minus> <minus> <minus> : "--" U2014 # EM DASH
|
| More in /usr/share/X11/locale/en_US.UTF-8/Compose
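
Entries in that Compose file follow a regular shape: a run of `<keysym>` tokens, a colon, then the quoted result. A rough Python sketch for pulling the sequences out of such lines, assuming only the simple form shown above (it does not handle escaped quotes, `include` directives, or other features real Compose files may use). The function name is made up:

```python
import re

# One or more <keysym> tokens, a colon, then the quoted result string.
LINE_RE = re.compile(r'^((?:<\w+>\s*)+):\s*"([^"]*)"')

def parse_compose_line(line: str):
    """Return (key_sequence, result) for a simple Compose entry, else None."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    keys = tuple(re.findall(r"<(\w+)>", m.group(1)))
    return keys, m.group(2)

sample = '<Multi_key> <minus> <minus> <period> : "\u2013" U2013 # EN DASH'
print(parse_compose_line(sample))
```

Running this over the whole file and filtering for results that are dashes is one way to see which sequences a given system actually ships.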
| dml2135 wrote:
| I used to intern for a literary magazine and I can confirm that
| half my copy-editing was enforcing proper use of em-dashes.
| This was well before 2022.
| mkehrt wrote:
| I always use an em dash when possible when I should, and double
| en dash when I can't, just because I'm that kind of nerd. But
| it is the case that a double en dash on iOS autocorrects to an
| em dash, so I'm suspicious of the claim that em dashes are a
| tell for LLM writing.
| nextts wrote:
| Most editors should auto-change a double dash into an em dash.
| I think Google Docs does, for example.
| mmooss wrote:
| Why not a double hyphen, which has the same result?
| lxgr wrote:
| Not in all fonts. In most monospace fonts, two hyphens will
| show with a small gap between them, for example.
|
| I also personally prefer en dashes, surrounded by
| whitespace on both sides, over em dashes. Apparently some
| WYSIWYG software interprets two hyphens as an em dash,
| while others will interpret that as an en dash, so I'd
| rather just use the real thing if possible to avoid the
| ambiguity.
| BalinKing wrote:
| This test feels biased by the fact that, like others have said,
| macOS provides keyboard shortcuts. For example, I'm only Gen Z
| and yet have tried for many years to use the proper dash
| characters in the right places, which is made much easier by
| virtue of being on a Mac.
|
| Of course, I guess it's entirely possible--even accounting for
| OS--that this test remains statistically useful. It makes me
| kinda sad that my (very much human-generated) writing fails the
| Turing test....
| MrJohz wrote:
| The compose key, for those who use it, also makes it very
| easy to do em/en dashes, and I use them quite regularly as a
| result.
| Tmpod wrote:
| Came to say this as well. I use the compose key to write em
| dashes and other symbols on a daily basis. Very handy!
| akho wrote:
| `misc:typo` is easier
| harrall wrote:
| On iPhone, type two hyphens to make an em dash:
|
| -- into --
|
| If OP wrote their post on an iPhone, they would have
| inadvertently appeared as an LLM by their own test.
| oneeyedpigeon wrote:
| Does that become an en dash if it's between two numbers?
| icy wrote:
| It does indeed! One of my favourite iOS keyboard
| features.
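
The behaviour described in this exchange (two hyphens become an en dash between digits, an em dash otherwise) could be sketched like this in Python. It is a guess at the rule from the comments above, not Apple's actual logic, and the function name is hypothetical:

```python
import re

def ios_style_dashes(text: str) -> str:
    """Sketch: '--' between digits becomes an en dash, any other '--' an em dash."""
    # Handle the numeric-range case first so it is not caught by the general rule.
    text = re.sub(r"(?<=\d)--(?=\d)", "\u2013", text)
    return text.replace("--", "\u2014")

print(ios_style_dashes("pages 10--20"))
print(ios_style_dashes("wait--what"))
```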
| dan-robertson wrote:
| You can also hold the hyphen key to select an en dash.
| wruza wrote:
| I hate that this feature doesn't have a timeout, so when
| you want to type "--" you have to "- -" and then go back
| and delete the space. You can't just wait as with double-
| space vs space-wait-space. It can be turned off, but that
| turns off other locale-based punctuation like quotes.
| Freak_NL wrote:
| That has nothing to do with being on a Mac. Em-dashes and the
| compose-key work fine on Linux, and Android has them under
| the '-' of the on-screen keyboard when long-pressed.
|
| (Windows probably has some way, but those are rarely
| discoverable.)
| BalinKing wrote:
| That's true, I do use them a lot on iOS as well--similarly,
| it's a long-press on '-' to get an en or em dash.
| tkzed49 wrote:
| I disagree, there is absolutely no easy way to do it on
| Windows. You can install a third party program that
| emulates the compose key but on macos it "just works". And
| I think that makes a difference for 95% of users
| underwater wrote:
| Install PowerToys, hold dash and then press space. This
| works for all the variants for any keyboard character.
| pests wrote:
| Hit Windows+. click on the "Symbols" tab and they're
| right there under general punctuation.
|
| Released back in 2019 for Windows 10.
| dboreham wrote:
| I've always (well...for 20 years) done a Google search
| for "em-dash" then copy/paste the character off whatever
| result page comes up. Word and other fancy editors always
| provided a popup pane where these characters could be
| clicked to insert.
| marcellus23 wrote:
| It's a bit funny. On macOS en and em dashes can be
| natively typed with alt+- and alt+shift+-. The responses
| to your comment are apparently suggesting these methods
| are just as easy as that:
|
| 1. Install and configure this extra tool, which also by
| default enables a ton of other things you may not want,
| and may as well be a third-party tool even though it's
| technically built by Microsoft
|
| 2. Do a Google search and copy-paste (!)
|
| 3. Use a keyboard shortcut to bring up a symbol picker,
| then click on the tab containing the en and em dashes,
| then click to type them in
|
| I mean, come on.
| tkzed49 wrote:
| yeah, this is exactly my point haha. these are not at all
| the same
| _flux wrote:
| EURKEY layout in particular has them easily accessible.
| pests wrote:
| Windows does too now via Windows+. which opens the "emoji
| keyboard" but you can switch to the "symbols" tab to see
| unicode. It does have multiple dashes in the quick access bar
| at the top or you can search.
| dspillett wrote:
| I've used WinCompose[1][2] to add key composition to Windows
| for many years (after discovering the concept in Unix-
| land), which I still find more convenient than the other
| options I've tried (including the Windows Emoji keyboard).
|
| ----
|
| [1] https://wincompose.info/
|
| [2] Though having checked just now, the sequences for en-
| dash and em-dash don't seem to be working. Perhaps one of
| my custom macros is interfering somehow... (it is behaving
| overall, ellipsis just worked as did the following
| diacritic and other symbols: aeioun+-012[?]!?!?p). I'll
| have to poke at it later and see what is awry.
| edflsafoiewq wrote:
| It's `&mdash;` in HTML or Markdown.
|
| If you use eg. a Japanese IME, you can also get it by typing a
| normal hyphen and selecting the em dash from the picker.
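
HTML's named entities for dashes (`&mdash;` and `&ndash;`) resolve to the same code points as the numeric forms. A quick Python check using the standard library's `html.unescape`:

```python
import html

# Named, decimal, and hexadecimal references for the two dashes.
for entity in ("&mdash;", "&ndash;", "&#8212;", "&#x2013;"):
    ch = html.unescape(entity)
    print(f"{entity} -> U+{ord(ch):04X}")
```

Whether a given Markdown renderer passes entities through is renderer-specific, so treat that part as something to verify for your toolchain.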
| layman51 wrote:
| That's interesting to note. I have usually taken the time to
| properly use en-dashes when it seems appropriate because I
| frequently deal with strings that represent academic years. At
| least where I live, these span two calendar years. I have
| noticed that a lot of college websites tend to use the en-dash
| properly (e.g. on their academic calendar webpages).
| kbenson wrote:
| Automatic conversions have been happening for a long time. In
| fact, a few years ago there was some combination of settings on
| my terminal locale settings and man (well, troff/groff most
| likely) was converting hyphens in param definitions to some
| sort of dash character, meaning I couldn't copy and paste out
| of the man page. I think it also affected perldoc for the same
| reason.
|
| I don't doubt there are publishing platforms that do it
| automatically as well, so I wouldn't count on seeing them as an
| indicator of generated output, even if it may be _processed_ in
| some manner.
| tedunangst wrote:
| This is because the original was written using the wrong
| markup. When the output was ascii, nobody noticed, but it
| matters when the output is unicode.
| o11c wrote:
| That's revisionism. It was considered correct historically,
| before someone decided to unilaterally declare all existing
| man pages "wrong".
| tedunangst wrote:
| It's like we spent twenty years writing (mindlessly
| copying) web pages with &mdash and only viewing them with
| lynx, and then somebody makes a graphical browser and the
| mistake is apparent, but I don't think the browser is in
| the wrong.
| blueflow wrote:
| Context: https://lists.debian.org/debian-
| devel/2023/10/msg00085.html
|
| Money quote: This issue does indeed have a
| history of provoking unhinged lunacy.
| globular-toast wrote:
| (La)TeX would typeset -- as an en dash. --- gets you an em
| dash.
|
| I, of course, used proper dashes in typeset documents, at least
| after I'd learnt about them in Knuth's _The TeXbook_. I have
| found myself occasionally using them in ASCII contexts just as
| ---. But I've never sought out the proper unicode character.
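
The TeX convention described here (two hyphens for an en dash, three for an em dash) is easy to mimic outside TeX. A minimal Python sketch, with a made-up function name; it is plain string replacement and makes no attempt to skip verbatim or math contexts the way a real TeX engine would:

```python
def tex_dashes(text: str) -> str:
    """Convert TeX-style '--' and '---' to en and em dashes."""
    # Replace the longer run first so '---' is not consumed as '--' plus '-'.
    return text.replace("---", "\u2014").replace("--", "\u2013")

print(tex_dashes("pages 10--20"))
print(tex_dashes("wait---what"))
```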
| maegul wrote:
| Certain corners of the world have absolutely cared about and
| employed the proper use of all the "dashes" well before but all
| the way up to 2022. I'd imagine LLMs have just consumed some of
| that material.
| dragonwriter wrote:
| Pretty much everything professionally edited and typeset does,
| and those will generally be retained in Unicode text
| (obviously, not if it gets converted to ASCII). It's less
| common in internet fora because not all users either know the
| use of dashes or have easy access to them on the devices they
| are using, and if it's not both familiar and easy, people are
| going to skip it in quick messages.
| tomrod wrote:
| I wrote for a magazine during college days a few decades ago
| that uses the Chicago manual of style. I still use em dashes,
| en dashes, and hyphens regularly. They don't show up as such in
| markdown, but they are effectively: one dash for hyphen, two
| for em-dash, and one with spaces surrounding it for en dash.
| kingo55 wrote:
| More sophisticated clients require we use dashes correctly. I
| first encountered it pre-pandemic, so in professional contexts
| it's not a sure-fire signal of LLM use -- Should you see em
| dashes correctly used in the Hacker News comments or Reddit,
| for that matter, then it's a pretty reliable tell... Usually. ;)
| necovek wrote:
| As I mentioned above, I've had them easily accessible with a
| keyboard layout for >20 years on all the systems I've used --
| the only caveat is that I find it really ugly with no spaces
| around em-dashes, which is usually recommended for English.
| lxgr wrote:
| I'd like to have the record show that I've been using them
| since before LLMs :)
|
| Not sure when I started; my guess is that I got into the
| habit of using them in LaTeX when writing my thesis, and then
| at some point realized that they are easily reachable on
| standard macOS keyboard layouts (via "option" + "-").
| tshaddox wrote:
| I've been Googling "em dash" and copypasting from the Google
| results for a solid 15 years now. Long before LLMs.
| jbverschoor wrote:
| Just use the Raycast emojipicker, it's very good. Better and
| faster than the macOS one
| hiccuphippo wrote:
| I modify the keymap to use AltGr+dash as em dash. Very easy
| in Linux with xmodmap, bit more complicated in Windows with
| the Keyboard Layout Creator.
| ogurechny wrote:
| `misc:typo` has been in _xkb_ for about 15 years. There's
| also _xkb-birman_ (matching the current state of the
| project that inspired all of it). If your national layout
| does not have level 3 and 4 symbols set, those should work
| straight away. If it does, it is highly likely that they
| clash, so you need to create a suitable subset. It is
| highly advised to find like-minded people, discuss the best
| options, and then _gently push_ the result to upstream to
| make it available for everyone. After all, it's Linux, if
| you won't do it, no one will.
| necovek wrote:
| While this is true, this is an amazingly silly omission.
|
| Serbian and Croatian XKB keyboard layouts have had em- and en-
| dashes since early 2000s even if they were not standardized:
| AltGr (right Alt) + hyphen (to the left of right Shift)
| produces an em-dash, and press Shift on top, and you get an en-
| dash.
|
| This is how long I've had them easily accessible on any
| keyboard (I even have them converted to MacOS keyboard layouts
| for use with Karabiner).
|
| http://srpski.org/dunav/raspored-c.html
| PhunkyPhil wrote:
| iPhones will autocorrect two hyphens to an em dash.
| op00to wrote:
| My LLM prompts all have "don't use em dashes or semicolons
| ever" when I send the output to someone else. ;)
| lxgr wrote:
| I get not using em/en dashes, but semicolons don't really
| have an alternative in many cases (other than rephrasing), do
| they?
| op00to wrote:
| Usually I split it into two sentences, but yes. I don't
| really see semicolons used in most business communications,
| so I treat them as a tell that the text was generated by
| LLM. Maybe I'm overreacting and prejudiced against
| semicolon usage.
| heyjamesknight wrote:
| I disagree--LLMs don't use them properly. They always put a
| space between the words before - and after - the dashed part.
| Freak_NL wrote:
| Using spaces is not wrong. Typographically, a hair space or
| another thinner than usual space is usually used, but in
| plain text a space is often preferred. Style guides vary in
| opinion on this, but newspapers often space them. Without a
| space they end up looking like elongated hyphens joining the
| words on both sides. That's not their function.
| bluebarbet wrote:
| This is US vs British English. You will struggle to find an
| em dash in any British publication.
| Freak_NL wrote:
| My comment was about spaces specifically. The Guardian
| and the BBC use en-dashes instead of em-dashes, but both
| do so with spaces.
| dragonwriter wrote:
| > Using spaces is not wrong.
|
| It's not wrong for en-dashes (and en-dash set open--with
| space on either side--is generally an alternative to an em-
| dash set closed.) And it's not wrong on the _trailing_ side
| of an em-dash used in dialogue to show an abrupt stop mid-
| sentence _if_ the stop is followed by a new sentence. And
| there's a few other particular uses, but, generally,
| setting an em-dash open is wrong.
|
| > but newspapers often space them.
|
| I've never seen a newspaper set em-dashes open, but I have
| seen them use en-dashes set open instead of using em-dashes
| at all. Given the space premium in print newspapers, em-
| dashes set open, which would consume enormous horizontal
| space, would, other concerns aside, be an odd choice.
| AstralSerenity wrote:
| "Windows" + "." brings up symbols, and at the very top were em
| dashes. I've been using that since it was added.
|
| On my Linux laptop, I confess to manually Googling them every
| time.
| oneeyedpigeon wrote:
| I've tried to use real hyphens and dashes since learning a bit
| about typography roughly 10-15 years ago. macOS makes it really
| easy with just alt and hyphen for en-dash, shift+alt and hyphen
| for em-dash. Definitely not an "obvious tell" of an LLM!
| apt-apt-apt-apt wrote:
| Thanks for the '|[?]<dash>' tip-- from 2022-2025, I have been
| using macOS en's thinking they were em's.
|
| (Side note: GPT says apostrophes should be used for
| pluralizing only for single letters to avoid confusion, but
| this seems more readable than "ens and ems" IMO.)
| starfezzy wrote:
| The lack of em dash usage in popular culture speaks more about
| typical people than it does about whether a text's author was
| an LLM. In fact, the average person has never even noticed--let
| alone considered--that the em dash exists. If they've read for
| 20+ years, they've seen at LEAST hundreds of them.
|
| Imagine being an NPC (a human bot), flattering yourself with
| the thought that people who understand the language are
| language bots...
| ryandrake wrote:
| 21% of adults in the US are illiterate in 2024 and 54% of
| adults have a literacy below a 6th-grade level[1]. "The
| average person" isn't really a high bar, unfortunately.
|
| 1: https://www.thenationalliteracyinstitute.com/post/literacy
| -s...
| andelink wrote:
| Is this a legitimate institute? The linked article offers
| many statistics and even financial figures but cites no
| sources or studies. There is a "TOLL FREE" (capitalized)
| phone number in the website footer, and the comments are
| full of prostitution ads.
| ryandrake wrote:
| Wikipedia[1] contains more links to data, if you want to
| sample a few more sources.
|
| 1: https://en.wikipedia.org/wiki/Literacy_in_the_United_S
| tates
| A_D_E_P_T wrote:
| Not at all. It's just inconvenient for most of the Windows-
| using world, as the characters are not accessible. It's
| ALT+[whatever] or Google-it-and-ctrl+V. Hence an awful lot of
| internet writing didn't really use _any_ of that stuff
| properly.
|
| See, e.g., Boss Szabo's blog:
| https://unenumerated.blogspot.com/2018/03/the-many-
| tradition...
|
| Two chained hyphens, as was pretty much the norm back then.
|
| And did you just call me an NPC?!? It's not a matter of
| "understanding the language" at all. It's a matter of
| convenience and of a sort of evolved convention.
| metaphor wrote:
| I use --- to represent em dash in prose here, e.g. [1][2]. The
| behavior is just a residual of long time exposure to TeX.
|
| [1] https://news.ycombinator.com/item?id=41833665
|
| [2] https://news.ycombinator.com/item?id=41774199
| beejiu wrote:
| If em dashes were uncommon pre-2022, they wouldn't have ended
| up in the LLM training sets.
| account-5 wrote:
| I recently got accused of using AI for some writing I submitted
| because I regularly use both en-dashes and em-dashes, and have
| for years. I said in another thread recently they are second
| and third, to semi-colons, as my favourite punctuation marks.
|
| I was able to demonstrate my long use of them, prior to LLMs.
| And since I write in quarto markdown I don't need keyboard
| shortcuts.
| QuantumGood wrote:
| I have typed Alt+0151 almost every day for decades--and now
| with some annoyance I am limiting their use due to the "that's
| how LLMs write."
| toss1 wrote:
| As a diligent user of ALT+0151 for many years on Windows
| systems, I can contradict that it is a sign of LLM writing --
| perhaps in combination with other factors it can be used to
| increase the likelihood of LLM authorship, but alone, nope.
| unleaded wrote:
| What I've been using: Install
| https://github.com/samhocevar/wincompose and you can then press
| AltGr then three hyphens to insert one. Or, if you're on Linux,
| just search for "compose key".
| mmooss wrote:
| Most word processing applications auto-substitute EM dashes as
| appropriate - some do it for two consecutive hyphens, iirc. I
| don't know if they substitute EN dashes automatically ... I
| don't know if there's a logic for that without understanding
| the text.
| thomasfromcdnjs wrote:
| Someone should parse HN api and figure out total dash usage and
| see if there is a spike in recent times aha
|
| I write poems a fair bit and use em dash a lot. (maybe too much
| and incorrectly)
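
The counting half of that idea is simple once the comment text is in hand. A minimal Python sketch that tallies the three characters across a list of strings; fetching comments from the HN API is left out, and the sample input below is invented purely for illustration:

```python
from collections import Counter

# The three characters under discussion, keyed by code point.
DASHES = {"-": "hyphen-minus", "\u2013": "en dash", "\u2014": "em dash"}

def count_dashes(comments):
    """Tally hyphen-minus, en dash, and em dash occurrences across comments."""
    counts = Counter()
    for text in comments:
        for ch in text:
            if ch in DASHES:
                counts[DASHES[ch]] += 1
    return counts

# Invented sample data, not real HN comments.
print(count_dashes(["a\u2014b", "1\u20132 and c--d"]))
```

Bucketing the tallies by comment date would then show whether usage actually spiked, though a doubled hyphen standing in for an em dash would need to be counted separately.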
| ogurechny wrote:
| Just install a proper keyboard layout with proper typography
| support once.
|
| It is maddening that the whole world uses typewriter keyboards
| with some facelift in the era of Unicode and even blasphemous
| full color emoji font rendering. What has changed in decades?
| Windows logo key, power keys, media keys, IE and Outlook logo
| keys -- all Microsoft's fancies.
|
| So initially IBM made some ad hoc decisions on what keys would
| be suitable for a single user office computer (as opposed to
| data input and admin terminals they had). Then everyone copied
| that, because sending unexpected scan codes could lead to bad
| things (random BIOS and program code couldn't care less about
| your ideas of forward compatibility). Then Windows became the
| "basic system" installed on most computers. Microsoft really
| pushed forward the internationalisation at the time, making a
| lot of national layouts and code pages (sometimes contradicting
| the national standards, for better or for worse). Then everyone
| copied what they decided. What's more important, even single
| byte code pages had the basic typographic symbols, anyone
| could've been using them for three decades, but they were not
| added to most physical keyboard layouts.
|
| I wonder if that was because they wanted Word to seem more
| sophisticated than it was, and to make people think it was a
| requirement for "proper documents", or because programmers
| still treated all non-ASCII symbols as free data markup
| constants that would "never appear in a regular text".
| mmooss wrote:
| > So initially IBM made some ad hoc decisions on what keys
| would be suitable for a single user office computer
|
| Didn't it match ASCII and possibly typewriter keyboards?
| dboreham wrote:
| ASR-33 right?
| y1zhou wrote:
| A few years back a journal editor meticulously reviewed all
| dashes in our manuscript and pointed out places where em dashes
| should have been used. Since then I started noticing different
| dashes _everywhere_ around the internet.
| shmerl wrote:
| Compose - - - works for M-Dash (KDE / Linux).
|
| For other combos -- see
| /usr/share/X11/locale/en_US.UTF-8/Compose
|
| See also: System Settings > Keyboard > Key Bindings > Position
| of Compose key
| scelerat wrote:
| The Mac has had them as part of the standard keyboard layout
| since 1984. Using Apple kit since then, they have long been
| burned into my muscle memory:
|
| option-[-] for en dash -
|
| shift-option-[-] for em dash --
| simondotau wrote:
| The option key is IMHO the most underrated feature of the Mac
| platform. Having another modifier for character input is
| insanely handy, and I know where to find numerous characters
| like trademark (™), divide (÷), pound (£), degrees (°), pi
| (π) and so on.
| phlakaton wrote:
| I've been using real em- and en-dashes for decades, in more or
| less the way M-W describes. MacOS and iOS make it easy to do,
| and growing up Mac kindled a life of typographical nerdage.
| WhyNotHugo wrote:
| Just configure something like RightAlt to work as a compose
| key:
|
| Compose--- produces --
|
| Compose--. produces -
|
| Lots of other characters like aaadeg+-EUR are available through
| compose: https://whynothugo.nl/journal/2024/07/12/typing-non-
| english-...
|
| > Absolutely proper and correct use of em dashes, en dashes,
| and hyphens is, to me, the most obvious tell of the LLM writer.
|
| Or just someone who likes to use the right characters. There
| was a report a few months back about how writing from autistic
| kids keeps getting mislabelled as LLM simply because they use
| the correct specific terms.
|
| Please stop associating being precise with being an LLM.
| ryoshu wrote:
| I'm married to an editor and friends with an editor at work.
| They both use em dashes appropriately--even with informal
| writing. I've now learned the keyboard shortcut just to confuse
| people in the age of AI slop.
| ubermonkey wrote:
| Word and Outlook have replaced "hyphenhyphen" with an Em dash
| for decades.
|
| Or, I mean, it does SOMETHING. I've never checked, and just
| always assumed I was getting the em dash.
| psunavy03 wrote:
| It's pretty bonkers (and mildly depressing, really) to imply
| that correct grammar and usage is a reason to accuse someone of
| using an LLM.
|
| I mean if it's an obvious break from their normal style, sure.
| But by itself? Every time I hear this argument, it just seems
| like sour grapes from poor writers.
| Quailman84 wrote:
| For a while, em dashes were really popular among LLM
| enthusiasts because of the idea that it would encourage the LLM
| to draw from training data that contained em dashes--which
| typically were higher quality training data written by a
| professional writer or somebody with a professional editor.
| Subjectively, I think it worked. I suspect that the LLMs
| trained to be used as chatbots were finetuned to use the em
| dash liberally for that reason. Now, after a few generations of
| these models, I think that the em dash is starting to have the
| effect of drawing from "slop" training data that was written by
| other LLMs rather than well-written human data.
| the__alchemist wrote:
| If you're on Windows, install PowerToys, and check out the
| Keyboard Manager. It lets you set up shortcuts. I overload my
| keys using right Alt for Greek letters (science stuff). Could do
| it for these dashes as well.
| a3w wrote:
| > spans pages 128-34.
|
| Who omits the 1 from the second number?! That is awful!
| mkehrt wrote:
| What if it's 124 to 127? would you really type 124-127, or
| 124-7?
| wavemode wrote:
| > would you really type 124-127
|
| literally yes
| rossant wrote:
| The latter, I believe.
| eCa wrote:
| > would you really type 124-127?
|
| Yes, every time. The clarity for the reader is more important
| than the time I save by leaving out '12'.
| rossant wrote:
| When I was editing an academic book published by a well-known
| university press, we were all asked to do that for the
| references. (And my colleagues, all doctors and lawyers, only
| knew Word and entered the references manually.)
| crazygringo wrote:
| Who _keeps_ the 1?
|
| You write pages 1,003-4, instead of typing out 1,003-1,004
| which is just unnecessary.
|
| Works the same with two digits, or even three: pp. 1,899-902.
|
| This is standard practice and arguably clearer.
|
| I've only ever seen it done with page ranges, though. I'm not
| sure if it's done with year ranges? E.g. 1984-5? Or 1989-92?
| You work with page ranges constantly in academia, I just don't
| see year ranges much in _any_ form.
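|
| A minimal Python sketch of the abbreviation described above,
| assuming the end page simply borrows its missing leading digits
| from the start page. The name expand_page_range is hypothetical,
| and this does not implement any particular style guide's rules.

```python
def expand_page_range(text: str) -> tuple[int, int]:
    """Expand an abbreviated range such as '1,899-902' into (1899, 1902)."""
    # Accept an en dash (U+2013) as well as a plain hyphen, and drop
    # thousands separators before splitting.
    cleaned = text.replace("\u2013", "-").replace(",", "")
    start_text, end_text = (part.strip() for part in cleaned.split("-"))
    if len(end_text) < len(start_text):
        # Borrow the missing leading digits from the start page.
        missing = len(start_text) - len(end_text)
        end_text = start_text[:missing] + end_text
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError(f"range runs backwards: {text!r}")
    return start, end


if __name__ == "__main__":
    for example in ("1,003-4", "1,899-902", "128-34", "124-127"):
        print(example, "->", expand_page_range(example))
```

| Under that assumption "1,899-902" expands to pages 1899 through
| 1902, and a fully written range such as "124-127" passes through
| unchanged.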
| lucgommans wrote:
| Literally never seen this (wish I could grep all comments
| I've ever replied to) and I do not understand what makes you
| say that it's clearer when it's dropping information, making
| it relative rather than a fully qualified number
|
| In speech, it's common, and misunderstandings are usually not
| a problem (if you're not monologuing on a recording) because
| someone will just ask; but in writing it looks like the range
| is the wrong way around. Maybe I expect more care in writing
| because the feedback loop is longer, or maybe it's just habit
| and I think it's wrong in writing because I never see it?
| LegionMammal978 wrote:
| MLA-style citations call for abbreviating page ranges in
| that way. I mostly see it in literary papers, and not many
| other contexts, so it would be easy to notice them rarely
| if at all. Outside of that context, I occasionally see it
| used for year ranges.
| crazygringo wrote:
| I think you're just not used to it.
|
| Quick, tell me how wide this range is, just as an order of
| magnitude:
|
| 285368737954-285368783645
|
| Would be a lot easier if I only included the range at the
| end which had actually changed, wouldn't it?
|
| That's why it's clearer. Now obviously that was an extreme
| example, but it's also easier to see at a glance that
| 1,387-9 is just three pages, as opposed to 1,387-1,389.
| handoflixue wrote:
| If you format your numbers properly, you get
| "285,368,737,954-285,368,783,645"
|
| That's a change of about 50K, which isn't really that
| hard to notice.
|
| "285368737954-83645" is... well I have to assume
| somewhere in the 10-100K range? Hold on a second while I
| line up the digits again... uh... let me rewrite that to
| "37,954 - 83,645", okay now I can read it. No, that
| wasn't any easier. I kept getting lost tracking where in
| the first number I was leaving off. Much easier to
| compare 737 vs 783 - digit groupings are really useful!
|
| (I'll agree that 1387-9 is pretty reasonable, it just
| breaks down the longer the number is. Also, if the page
| count is important, you can just say "1387-1389 (3
| pages)". This feels like the sort of shorthand you used
| to get on Twitter)
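|
| A quick Python check of the arithmetic in this subthread, as a
| sketch only; describe_range is a hypothetical helper name. It
| prints the range with digit grouping and the exact width.

```python
def describe_range(low: int, high: int) -> str:
    """Format a numeric range with thousands separators and its width."""
    width = high - low
    return f"{low:,}-{high:,} (width {width:,})"


if __name__ == "__main__":
    print(describe_range(285368737954, 285368783645))
```

| The exact difference is 45,691, which agrees with the "about 50K"
| estimate read off the grouped digits.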
| MindBeams wrote:
| >"285368737954-83645" is... well I have to assume
| somewhere in the 10-100K range?
|
| 83645 is five digits, so certainly in the ~10,000 range.
| handoflixue wrote:
| Thus why I have to assume it's somewhere between 10K and
| 100K, yes :)
| lucgommans wrote:
| Taken to an extreme without formatting, sure, but what
| ranges have that many digits in human-readable
| situations? And if there are those exception situations,
| you can word around it for that case
| ("285368760800+-45691" or "45'691 years after
| 285'368'737'954")
|
| Genuinely trying to think of an example, since e.g.
| books aren't ever that long and search results don't have
| that many pages (that you'd all read and refer back to).
| A salary range, perhaps, can get into the seven digits in
| extreme cases (not that you care about any individual
| digit when you make a lifetime's worth of money in a bit
| more than a year): "Prospective salary is 2'423'000 to
| 2'432'000" seems to convey the relevant info as well as
| "Prospective salary is 2'423'000 to 9'000" does (except
| that I wouldn't understand the latter and ask what this
| second number means, but that's plausibly attributable to
| me as an individual not being used to it)
| MindBeams wrote:
| It's definitely standard, but in what way is it clearer? An
| abbreviation is never more clear than the full thing it
| abbreviates.
|
| EDIT: I saw your explanation below, and you make a very good
| point.
| a3w wrote:
| copy/paste, "print", paste into "from page" and "to page"
|
| Result:
|
| > print pages in range from: 1, 003
|
| > print pages in range to 4
|
| Now I have two errors to fix: page 1003 to page 1004.
| Not nice. Who formats like this?!
|
| -------------------
|
| Also, some RPG books or encyclopedias I own have chapters that
| span like this:
|
| p. 630 to p. 70 (book 2)
|
| To me it is now unclear: is that 70 with a reset page count, or
| 670 for book 2?
|
| Since I just now learned that a quotation standard somewhere
| outside Germany exists that omits leading numbers, I now need
| to manually check where it ends.
|
| TL;DR:
|
| Don't make me think, and allow for automation. So just write
| one more number.
| zahlman wrote:
| $ python -m this | grep '--' -
| nextts wrote:
| As long as "this" is not a typical README.md with code
| snippets.
| lucgommans wrote:
| Is this meant to grep for a double hyphen from standard in,
| or to mark the start of positional arguments and then grep for
| a hyphen? If you want both, it should be:
|
| $ python -m this | grep -- -- -
|
| Which is just beautiful
|
| (Your example causes the last hyphen to be grepped for, which
| happens to only match doubled-up ones because single ones don't
| occur in that text. The quotes/apostrophes do nothing because
| they're parsed by (ba)sh and so only the hyphens are passed to
| grep, not the quotes. The last hyphen can be omitted because
| reading from stdin is the default if neither filenames nor
| recursion options are passed.)
| zahlman wrote:
| Oh, of course simply quoting it doesn't disable the special
| meaning of --, because quoting is handled by the shell and
| argument parsing is handled by the program.
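|
| A small Python sketch of the quoting point, assuming that
| shlex.split is a fair stand-in for POSIX shell word splitting (it
| approximates the shell, it is not the shell). It shows only which
| argument strings a program would receive, not how grep then
| interprets them.

```python
import shlex

# Each call returns the argument list a program would receive after
# shell-style quote removal.
quoted = shlex.split("grep '--' -")
unquoted = shlex.split("grep -- -")
explicit = shlex.split("grep -- -- -")

if __name__ == "__main__":
    print(quoted)    # the quotes are gone before the program runs
    print(unquoted)  # identical argument list to the quoted form
    print(explicit)  # one extra "--" argument compared with the others
```

| The quoted and unquoted forms yield the same argument list, so the
| program has no way to tell that the pattern was quoted.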
| zahlman wrote:
| (Although that turns out not to matter for this particular
| grep invocation; the -- is still interpreted as a pattern
| and - as standard input.)
| rednafi wrote:
| I like em dashes and use "Option Shift -" to summon them on
| macOS. However, LLMs tend to overuse them and compose absurdly
| long sentences. While proofreading a draft, I often instruct an
| LLM to "keep the original tone intact and don't create overly
| complex sentences by fusing together simple ones." That usually
| gets the job done.
|
| Writers adore their em dashes. While they can sometimes clarify
| a concept by adding more context, overusing them can hurt
| readability. I prefer to read Hemingway-esque sentences that just
| say what they want to say and end sharply. So that's how I write
| too--and sometimes the overuse of em dashes directly conflicts
| with that, making the content sound as if the author is confused
| about what they wanted to convey.
| sandbach wrote:
| Robert Bringhurst1 prefers the en dash in the context of setting
| off phrases:
|
| "The em dash is the nineteenth-century standard, still prescribed
| in many editorial style books, but the em dash is too long for
| use with the best text faces. Like the oversized space between
| sentences, it belongs to the padded and corseted aesthetic of
| Victorian typography.
|
| "Used as a phrase marker - thus - the en dash is set with a
| normal word space either side."
|
| 1https://archive.org/details/isbn_9780881791327/page/80/mode/...
| asplake wrote:
| I think of that as "British" style (as opposed to American). I
| think it's more common here and I certainly prefer it
| tkcranny wrote:
| Presently re-reading this book, The Elements of Typographic
| Style. It's one of the few books I've gone out of my way to get
| a physical copy of - it's just beautiful.
|
| And I totally agree, space-set en dashes are vastly superior to
| em. I dislike the way it connects the word more closely to the
| word in the next clause than the phrase itself.
|
| E.g. He left--no explanation. Vs. He left - no explanation.
|
| To me, left--no feels more like a weird gluing together than a
| separator for a different section.
| munificent wrote:
| Because I am exactly the kind of person to obsess about this
| sort of thing, when I was working on my last book, I spent a
| lot of time deciding how I wanted to style dashed subordinate
| clauses.
|
| Personally, I think en dashes are too small and look like a
| mistaken use of a hyphen. I really only use them in their
| Chicago Manual of Style recommended uses like date ranges.
|
| But I agree that em dashes without spaces around them look
| wrong. They glue the adjoining words together when the whole
| point is that the clause is secondary and should be set aside
| from the surrounding text.
|
| I ended up using em dashes with a little blob of CSS to put a
| tiny amount of space on either side.
| ibaikov wrote:
| So much this. Two weeks ago I learned that en dashes are used
| for numbers, but I thought they are what em dashes are for. Em
| dashes for me are too long and ugly.
| fsckboy wrote:
| "Used as a phrase marker - thus - the en dash is set with a
| normal word space either side."
|
| "Used as a phrase marker--thus--the em dash is set without
| normal word spaces."
|
| > _the em dash is too long for use_
|
| above, the em-dash without spaces is smaller, at least in this
| typeface
|
| I've taken to using dash offsets--just as an aside--in many
| places where I formerly used parentheses; I find it "less
| interrupts" the flow of the sentence.
| milesrout wrote:
| Mr Bringhurst is wrong. Em dashes have nothing to do with
| Victorian aesthetics.
| 7bit wrote:
| That's how you use them in Germany. N-dash with spaces around,
| instead of an m-dash, as Americans do.
| bangaladore wrote:
| Somewhat off topic, however, I'm thoroughly convinced that there
| is a very high probability something is AI generated when I see
| Em dashes. Anyone else noticing this?
|
| ChatGPT for example almost always uses them. I'm sure they are
| more common in academic writing, but it's now super common on
| boards like Reddit.
| mychaelangelo wrote:
| I've noticed this, too. ChatGPT especially overuses them
| relative to other models. It's an easy tell-sign that something
| is probably LLM-written.
| zimpenfish wrote:
| I saw a reel the other day where some Young People(tm) were
| talking about "the ChatGPT hyphen" (an em-dash.) There was
| much wailing and gnashing of (false) teeth from Old
| People(tm) in the comments.
| pavlov wrote:
| It's largely the Baader-Meinhof phenomenon. You've started
| noticing it because you just learned about it.
| dkdcwashere wrote:
| yep. been using them for years. others have too. it's not
| weird
|
| same thing happened with "delve" -- these are just words and
| grammar, people use them
|
| there is no accurate way to tell whether text came out of a
| neural network or not
| chatmasta wrote:
| I'm not sure the same happened with "delve." I saw an
| analysis of paper abstracts showing a clear uptick of
| "delve" starting with the mass-adoption of ChatGPT. Maybe
| it suddenly became a trendy word -- especially in paper
| abstracts -- or maybe more paper abstracts were edited by
| ChatGPT.
| kingo55 wrote:
| Combining the various "tells" of an LLM (em dashes, delve,
| grammatical signs etc) with the context (Reddit comments vs
| professional setting), you could establish a rough
| probability it was AI generated. At this point, it's the
| best we can hope for.
| LeoPanthera wrote:
| Gemini is in love with the phrase "It's important to..."
|
| Whenever I see that at the start of a paragraph I know that
| there's an 80% chance it was written by Gemini.
| bangaladore wrote:
| I feel this is a broad oversimplification.
|
| When looking at the context of a given text, use of certain
| words or punctuation, can very well indicate AI use.
|
| The "original" example was delve. There is no doubt that AI
| (did, or still does) use this word at a significantly higher
| frequency than the average person. I would say the same about
| em dashes.
|
| When browsing a Reddit thread about a video game, if you
| encounter numerous comments written perfectly, especially
| those containing indicators like em dashes, the word delve,
| or similar language, it certainly can raise the question: am
| I genuinely seeing comments from users who write this way in
| this specific context, or is this content more likely
| produced by an LLM?
| MindBeams wrote:
| It sucks that people understanding their own language marks
| them as possibly AI.
| citrus1330 wrote:
| No, it's not. AI uses em dashes far more frequently than the
| average human.
| Kiro wrote:
| Why is this getting downvoted? ChatGPT is completely
| obsessed with em dashes. I don't even know how to make it
| on my keyboard.
| bangaladore wrote:
| Yeah, people are saying "well you didn't know about em
| dashes before LLMs".
|
| No, I learned about em dashes in school, I just literally
| don't know how to type them on my keyboard and I'm too
| stubborn to learn how to.
| jeroenhd wrote:
| It depends. Em dashes in news articles and written
| publications? Definitely expected. Em dashes on social media
| or reddit? Either someone who works in typesetting, or an
| LLM. Most likely an LLM, given the dying nature of printed
| media.
|
| Only typography nerds and professional printers care about
| things like these. Popular media, even modern professional
| media, hasn't been paying all that much attention.
| arduanika wrote:
| Plausible. But apparently per TFA it's actually spelled
| Baader–Meinhof, with an en-dash not a hyphen.
| awestley wrote:
| Yes! It's a tell-tale sign something is written by AI.
| dkdcwashere wrote:
| it is not
| encypherai wrote:
| Yes, several of the most popular (and even lesser-popular but
| newly open-sourced models such as Gemma 3 27b) overuse Em
| dashes. Even when prompting them to not use dashes, they almost
| can't help themselves and include them occasionally anyways as
| it must be part of their learned stylometry. It's just not a
| common symbol to use at all as most people generally use commas
| for the same purpose. I can't even remember learning about Em
| dashes in my college English classes.
| nextos wrote:
| I submitted an application which I typeset using LaTeX, and
| some people thought it was AI-generated because of en and em
| dashes. I have been using these since forever.
| dskhatri wrote:
| There are regular folk who tend to be pedantic with their
| writing. I'm not sure this is a good test of whether text is
| generated by LLM. Consider that some may use LLMs to correct
| spelling or grammar, and the LLMs may often edit an en dash to
| em dash.
| bangaladore wrote:
| To be clear, It's essentially impossible to know if a given
| text is autonomously LLM generated (a bot on social media for
| example) or is the result of revision of real human effort.
|
| To what extent that distinction matters, I'm not sure.
| kbenson wrote:
| If it's posted through a publishing platform (not just a
| comment on one or on a public site), it's very possible they do
| an automatic conversion of some of the common cases. That could
| also be filtering down to comment boxes and stuff, I'm not
| sure.
|
| That's not to say that generated content doesn't use them, just
| that using them as an indicator might require a bit of nuance
| based on where you're seeing them.
| alabastervlog wrote:
| I've been employing em-dashes extensively since I went on a JD
| Salinger binge circa 2002. Also, "incidentally", for the same
| reason. I use "Nb" a lot, from reading a bunch of DFW years
| ago. Oh, and that very-precise construction he does with
| "which" all the time, I stole that.
|
| Before LLMs, I think em-dashes mostly signaled that you read
| books and paid attention to details, to the extent they
| signaled anything.
| arduanika wrote:
| To generalize your point: A lot of the "brown m&ms" that
| we've walked around with for detecting a writer's status,
| education, etc., are less useful in an age of LLMs.[1]
|
| We might even be entering some waves of counter-signaling.
|
| [1] They'll never totally nail all of DFW's mannerisms,
| though.
| abyssin wrote:
| What is this very precise construction?
| alabastervlog wrote:
| Something like, "the monks wore brown habits, which habits
| were made from wool".
|
| The slight ambiguity if you don't do that now irks me,
| having seen a way to eliminate it.
| culi wrote:
| Everyone I know that writes a lot, especially for copy or
| product design, seems to use em dashes more heavily. I've even
| seen a Drake format meme where he is shaking his head at
| parentheses, commas, and colons but--finally--nodding in
| approval at the em dash.
|
| I wonder if it's a more recent phenomenon.
| jyunwai wrote:
| Em and en dash usage is officially part of style guides such
| as The Chicago Manual of Style [1], so it's often a work
| requirement for many writers and editors to use them in
| writing. This is why these kinds of dashes are everywhere in
| newspaper and magazine articles.
|
| Eventually, people learn to include them out of habit--
| especially as most people see them as aesthetically nicer
| than a simple hyphen (-).
|
| [1] https://www.chicagomanualofstyle.org/qanda/data/faq/topic
| s/H...
| bangaladore wrote:
| Exactly. If I see an Em/En dash in a publication of really
| any kind, I don't think twice. Because that's the
| traditional context for them. Professional writing.
| nilkn wrote:
| I've encountered and used em dashes regularly for the last 20
| years. If most of your reading and writing are associated with
| social media, I could see the trend you're describing appearing
| real within that limited context. But em dashes are not new and
| have been a feature of high quality writing for many decades.
| arduanika wrote:
| So you're saying that when you see an Em dash in someone's
| prose, it's a big minus?
| bangaladore wrote:
| As I said in another comment, it depends highly on the
| context and previous / alternative knowledge of the source.
| arduanika wrote:
| (How about when you see a pun in an HN thread?)
|
| :)
| Anon1096 wrote:
| The only people still using em-dashes are those who think it's
| somehow a signal of high intellect rather than being
| (extremely) behind the times. Case in point: this exact comment
| section where you see it with ~10000x the frequency of standard
| human writing, or even the average HN thread.
|
| Just makes me roll my eyes really seeing a human use an em-
| dash. We're in the age of informality, and at least for me
| personally I've definitely filed the em-dash away as "a near
| guarantee the text was written by a machine". No matter how
| much and perhaps especially because HN commentators are coming
| out of the woodwork to insist they've been using it daily for
| years.
| medstrom wrote:
| Maybe you're projecting? Not everyone has an agenda beyond
| just thinking it looks good.
| MindBeams wrote:
| This level of thinly veiled insecurity is just projection on
| your part.
| keybored wrote:
| I'm bored with y'alls keyboard habits.
|
| Not all though. Many people on HN use em-dashes and other
| proper punctuation.
| vanschelven wrote:
| There is a special kind of irony in the fact that habits that
| used to set one apart from the unwashed masses (like the proper
| use of punctuation) now serve as a signal for being non-human.
| gukov wrote:
| Yep, definitely been noticing it, especially on Reddit. It
| almost always makes me navigate away from the post, unless the
| author mentions that they're using AI.
| arduanika wrote:
| Hold on, I'm coming back to this thread, I think I've cracked
| it guys. Some real alpha for you right here:
|
| If the em dash has spaces around it -- as seen in AP style --
| it was probably written by a real human, because that's how it
| comes out most conveniently on a word processor.
|
| But if the em dash has no spaces around it--Chicago style--
| there's a good chance you're looking at LLM slop.
| 867-5309 wrote:
| minus (US negative) enters the chat..
| crazygringo wrote:
| Seriously. If you want your - + to match, in terms of crossbar
| vertical position and width.
| culi wrote:
| For comparison--
|
| - + minus sign
|
| - + hyphen
|
| - + en dash
|
| -- + em dash
|
| -+-+-+--
| perilunar wrote:
| The correct minus sign looks a lot clearer than a hyphen-minus
| when printing out negative numbers, especially at small font
| sizes. I have in the past written code to convert them.
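|
| A minimal Python sketch of that kind of conversion, not the
| commenter's actual code. The helper name format_signed is
| hypothetical; it swaps only a leading hyphen-minus for the Unicode
| minus sign (U+2212) and leaves everything else alone.

```python
MINUS_SIGN = "\u2212"  # Unicode MINUS SIGN


def format_signed(value: float) -> str:
    """Format a number with digit grouping and a typographic minus sign."""
    text = f"{value:,}"
    if text.startswith("-"):
        # Replace only the sign, never hyphens elsewhere in the string.
        return MINUS_SIGN + text[1:]
    return text


if __name__ == "__main__":
    for number in (-1, 1234, -1234.5):
        print(format_signed(number))
```

| Text formatted this way is for display only; int() and float()
| will not parse the U+2212 sign, so keep the raw value around for
| anything that needs to be parsed back.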
| cporios wrote:
| if anyone's wondering, the post title is wrong -- both of the
| first two characters are en dashes (U+2013).
| jeroenhd wrote:
| That's actually kind of funny. Looks like it's the result of
| HN's Unicode filtering rules, though; the original website has
| different characters in its <title> tag.
| babypuncher wrote:
| Or, you can avoid an awful lot of headache by just sticking to
| hyphens.
| culi wrote:
| > If you want to be official about things, use the en dash to
| replace a hyphen in compound adjectives when at least one of the
| elements is a two-word compound.
|
| How is a literal dictionary making fun of people who "wanna be
| official about things" lol. That's the entire basis for
| dictionaries themselves
| mikethemerry wrote:
| It's Merriam-Webster - they are descriptivist rather than
| prescriptivist about language. They don't define correct usage
| per se, but rather document actual usage, though some usage may
| be given greater weight than others.
|
| In this case, they are calling out the prescriptivist
| definition but are implying that it may be overkill and
| offering the more commonly used alternative.
| nayuki wrote:
| Additionally:
|
| * Use the minus sign /−/ (U+2212) when formatting numbers,
| because the default hyphen-minus /-/ (U+2D) just looks wrong: "It
| is -1 °C vs. −1 °C." Moreover, the correct minus has the same
| width as plus (− vs. +).
|
| * Rare, but use the figure dash /‒/ (U+2012) or figure space / /
| (U+2007) if you need a placeholder character that is the same
| width as a single digit. For example, "Guess the PIN: 1‒34."
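|
| To check which of these look-alike characters a piece of text
| actually contains, a short Python sketch using the standard
| unicodedata module may help. The function name describe_dashes is
| hypothetical; it reports characters in the Unicode "Pd" (dash
| punctuation) category plus the minus sign, which is classed as a
| math symbol instead.

```python
import unicodedata


def describe_dashes(text: str) -> list[str]:
    """Return 'U+XXXX NAME' for each dash-like character, in order."""
    report = []
    for char in text:
        # 'Pd' covers hyphen-minus, figure dash, en dash and em dash.
        # MINUS SIGN (U+2212) has category 'Sm', so test it explicitly.
        if unicodedata.category(char) == "Pd" or char == "\u2212":
            report.append(f"U+{ord(char):04X} {unicodedata.name(char)}")
    return report


if __name__ == "__main__":
    sample = "pages 1\u20133, \u22121 degrees, well-known, PIN 1\u201234"
    for line in describe_dashes(sample):
        print(line)
```

| The sample string uses escape sequences so the code points stay
| unambiguous even if the page they are pasted into normalizes
| dashes.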
| ludicity wrote:
| I use em-dashes correctly because a reader emailed me, and I was
| dreadfully embarrassed. You can actually see them become correct
| in my writing after the "I will pile drive you" AI thing.
|
| It never occurred to me that doing this correctly might make
| people think I use LLMs in my writing.
|
| Edit: I'm sure the many typos protect me from that, actually.
| o11c wrote:
| One point that is very rarely mentioned is how to place em dashes
| around quotation marks.
|
| If the em dash indicates an interruption (not a planned pause) of
| the actual speech, the em dashes go inside the quotes (often just
| one, before the closing quote).
|
| If the em dash is the narrator interjecting with additional
| information, the em dashes go outside the quotes.
|
| Besides this, the question of where to put spaces when multiple
| forms of punctuation are combined can be quite a complex topic.
| efilife wrote:
| this is the definition of bikeshedding
| milesrout wrote:
| No it isn't.
| numbers wrote:
| On macOS you can enter these by doing the following:
|
| * em dash: ⌥ + ⇧ + - (alt + shift + hyphen)
|
| * en dash: ⌥ + - (alt + hyphen)
| appleorchard46 wrote:
| Hot take - differentiating between these at all is dumb. There is
| virtually no situation when using one instead of another improves
| clarity.
| lucgommans wrote:
| It is _usually_ clear that 2-3 thingies means a range of
| thingies, but I seem to remember there being situations where
| it could also have been a minus sign. Perhaps it was with
| placeholders, where 10-N could be either one. Problem is, iirc,
| the real minus sign is longer than the hyphen, looking like an
| en dash (the one meant for ranges) and so it defeats the
| purpose... hence I totally use hyphens as minus signs, but en
| dashes for ranges, which makes sense in my head because a range
| has a certain span/length whereas a minus sign is just a
| little mark to indicate that something is negative. I see lots
| of people/software use en dashes for ranges but the existence
| of a real minus sign is, from my perspective, mostly just noted
| in typographic resources, so I think this reflects most
| people's usages (for the people that care for these details)
|
| I do like that the em dash is as long as it feels that broken-
| off thoughts should be
|
| Not everything has to be functional, sometimes things can also
| just look nice for the sake of it
| Starlevel004 wrote:
| I refuse to care about this. A single dash is all I will ever
| use. I see no possible reason to use the other two.
| theelous3 wrote:
| Throwing my hat in here. The sub millimeter difference in the
| length of a dash conveys no additional meaning or clarity. It
| is impossible to argue me out of this position.
|
| It's not like you can reliably write these consistently by hand
| either without going over the top in length to make it
| extremely obvious.
| miltonlost wrote:
| Length of breath/pause with a longer dash. Read some -- Emily
| Dickinson poems - you'll find a world --- of meaning --- in
| the millimeter.
| theelous3 wrote:
| I have read her in the past and can't say there were
| world's of meaning between -'s. Can you link an example? I
| looked again and couldn't see any obvious ones. Generally
| she just completely abused the -. Does she even use a comma
| once? lol
| efilife wrote:
| worlds. _world 's_ would indicate that a world owns
| something.
|
| Also, you can just write _-s_ instead of _- 's_ as the
| apostrophe indicates possession
| pc86 wrote:
| Exactly the type of comment I'd expect to see on an HN
| discussion about different types of dashes.
| handoflixue wrote:
| Poetry routinely breaks grammar rules. A lot of poems rely
| on very specific white space layouts that you'd never see
| in writing.
|
| And your example shows how you can just use multiple dashes
| instead of having three different ones.
| california-og wrote:
| Here's some examples where the en dash could make things more
| clear:
|
| -5--2degC
|
| post-war-pre-digital era
|
| See sections 10-O-15-Q
|
| Try Our New York-London Flight Connection!
| mvdtnz wrote:
| -5degC to -2degC
|
| post-war - pre-digital era (not a sentence any sane person
| would use anyway).
|
| See sections 10-O - 15-Q
|
| Try our New York-London flight connection! (no kind of dash
| clears this one up without fixing capitalisation).
| california-og wrote:
| The last one was a gotcha: it's their newly established
| York-London flight!
|
| Try Our New York-London Flight Connection.
|
| Or if it was New York:
|
| Try Our New York - London Flight Connection.
|
| Note the additional spaces. Agree on the capitalization
| though.
| handoflixue wrote:
| > Try Our New York - London Flight Connection.
|
| I'd wager serious money that if you put that on a sign
| and surveyed people, at least in the US, they'd all still
| conclude it is a "New York" to "London" flight.
|
| What's the use of a communication tool, if it doesn't
| actually communicate anything to real people?
| voidUpdate wrote:
| York doesn't have an active airport
| quanloh wrote:
| In my region at least, -5 ~ -2degC, or -5degC ~ -2degC. If
| something is confusing people, we replace it with
| a suitable substitution. Re-educating people is really just
| last resort. Is there anything keeping us from changing it
| other than ego?
| jeffhuys wrote:
| -5 - 2degC
| account42 wrote:
| Have you heard of "to"?
| theelous3 wrote:
| Sorry, lol? You didn't really think this through. This is
| what that looks like using en/em
|
| -5--2
|
| That looks like dogshit.
|
| It's a mistake in the first place to decide to use only
| dashes and no spaces to convey all of this lol
|
| -5 - 2 (Everyone knows a sign has no space - if you are
| building your sign for idiots try some of these:)
|
| -5 > 2 -5->2 -5 <-> 2 -5 to 2 -5...2 Between -5 and 2
|
| blah blah blah
| MindBeams wrote:
| This sort of anti-intellectualism is the perfect antidote for
| those who claim that improper grammar is nothing more than
| evidence of language "evolving."
| Aardwolf wrote:
| I think many grammar rules are not intellectual but just
| randomly evolved conventions.
|
| E.g. some English language rule says that a comma or ending
| period of a non-quoted sentence goes inside the quotes if
| there's something quoted at the end of that sentence. That
| rule feels anti-intellectual to me, as if there's some
| misunderstanding of how hierarchical placement in one-
| dimensional space works (since something that's not being
| quoted is being put inside quotes)
| NegativeLatency wrote:
| Spelling used to be more fluid and up to the
| writer/printer. Printers would also use different
| spellings as a mechanism to change the line width and
| otherwise format text to their liking.
|
| https://www.ruf.rice.edu/~kemmer/Histengl/spelling.html
| milesrout wrote:
| That "rule" is the rule in America but not elsewhere.
| Please break it. It is stupid.
| theelous3 wrote:
| What is more intellectual about wanting to complicate the
| language for one reason, versus wanting to simplify it for
| another?
| harrall wrote:
| Em dashes don't convey much meaning or clarity for me.
|
| Rather, seeing too short of a dash is like putting two
| clashing colors together or wearing two pieces of clothes
| that don't match. It just looks instantly off.
|
| It's just not aesthetically pleasing for me.
| fernandotakai wrote:
| uh, really?
|
| i really like using em dashes -- for some reason, it feels
| "better" in my head than using something like a comma or a
| semi-colon.
| RandallBrown wrote:
| Then why didn't you use an em dash?
| fernandotakai wrote:
| you can use double dashes to symbolize an em-dash (i prefer
| that to using an actual em-dash, which is option+shift+- in
| macos).
|
| https://en.wikipedia.org/wiki/Dash#Approximating_the_em_dash...
| lioeters wrote:
| That's the comment I was looking for to rally behind. I use the
| same character `-` for all purposes: minus, hyphen, em/en dash.
| It's easy to type and it makes practically no difference in
| meaning or legibility. I refuse to waste my time
| differentiating between multiple variations of a short
| horizontal line with a few pixels more or less. Ain't nobody
| got time for that.
| hydrogen7800 wrote:
| I was going to post basically this. There is only one dash, and
| it's the one for which my keyboard has a key. Minus sign,
| hyphen, or any other use case. When MS Word autocorrects to
| something else, I always angrily undo it, because I don't know
| or care what it's doing.
|
| -proud dash luddite
| jeroenhd wrote:
| I take this advice like "do not use a preposition to end a
| sentence with" and "pay close attention to 'much' and 'many'".
| Personal preferences from the 1800s taken as gospel by
| grammatical extremists, to the point where they're taken as
| some kind of solid rule in a vain attempt to forcefully shape
| language to a personal preference.
|
| There are cases when you want to follow certain guidelines, for
| sure. If you write for a publication that adheres to
| Merriam-Webster, you'd better stay consistent and figure out the right
| AltGr code to type the right dashes. However, for the 99.99% of
| written media today, none of that matters.
| milesrout wrote:
| Ending sentences with prepositions is and has always been
| fine. It has never been a serious rule of grammar that you
| may not end a sentence with a preposition. It does sometimes
| make a sentence sound better to rewrite it so that it doesn't
| end with one though. For example, "do not use a preposition
| to end a sentence with" sounds awkward to my ears, probably
| because you deliberately crafted the sentence to end with a
| preposition even though that is not naturally what you'd end
| that sentence with. (The previous sentence doesn't sound
| awkward to me, interestingly.)
|
| Getting "much" and "many" right is completely different. They
| mean different things. Confusing them makes you sound stupid.
| Less vs fewer is the same. It often doesn't matter but in
| some cases it really grates on the ears (eg "there wasn't much
| people there" just sounds awful).
|
| Dashes are not in the same category. They are orthographical
| conventions. They aren't really grammar. They are more like
| spelling. You can spell things wrong and say it doesn't
| matter because spelling is arbitrary and you can use the
| wrong dashes too, but it makes you look either uncaring or
| ignorant. If you want to give a good first impression, learn
| the basic conventions of written English and follow them.
| MindBeams wrote:
| "Much" and "many" are not interchangeable:
|
| "I have too many water in the cup."
|
| "How much people are in attendance?"
|
| These sound obviously incorrect.
| Starlevel004 wrote:
| > Personal preferences from the 1800s taken as gospel by
| grammatical extremists, to the point where they're taken as
| some kind of solid rule in a vain attempt to forcefully shape
| language to a personal preference.
|
| This is also true of "less" and "fewer". I use "less"
| everywhere.
| milesrout wrote:
| i refuse to care about this lowercase letters are all i will
| ever use i see no possible reason to use the other symbols
|
| Suit yourself, but if you refuse to learn basic grammar you
| will be treated like you are stupid and uneducated. Like it or
| not, presentation matters. Getting the basics right, including
| things like spelling, grammar, etc, shows a basic attention to
| detail without which your services will likely do more harm
| than good.
| mvdtnz wrote:
| The various dashes are not "basic grammar" they are for
| pedants to argue amongst one another while the rest of the
| world just gets things done.
| handoflixue wrote:
| > etc,
|
| actually it's "etc."
|
| (I wouldn't usually be a pedant, but if you think the
| difference between "--" and a real em dash matters, you should
| probably try to get the basics right too.)
| milesrout wrote:
| Wrong. Look at any dictionary. Etc is completely fine. What
| next, are you going to pretend you write N.A.S.A. or Mr.
| White? Come on
| MindBeams wrote:
| >Mr. White
|
| As opposed to what, exactly?
| milesrout wrote:
| Mr White, which is correct English. I believe Americans
| might put a dot after these abbreviations, but nobody
| else does.
| handoflixue wrote:
| https://www.merriam-webster.com/dictionary/etc. - even
| the URL has the period, and I did in fact look this up
| before replying :)
|
| https://www.merriam-webster.com/dictionary/etc even
| redirects to the correct URL with a "."
| milesrout wrote:
| Merriam-Webster is an American dictionary and therefore
| totally irrelevant to me.
| zamalek wrote:
| Etc. is an abbreviation for etcetera. Correctly
| signifying contractions, abbreviations, and acronyms is
| far more commonplace than using the correct dash. Almost
| everyone would have learned about shortening words in
| high school; many people leave university without ever
| having heard of an em dash.
| milesrout wrote:
| Etc is also an abbreviation of et cetera. Only Americans
| put pointless dots everywhere.
|
| This is all stuff you learn in school. Punctuation isn't
| obscure or niche. You may not have learnt about
| semicolons or em dashes in school but you should have and
| I did. As did anyone that has ever read a novel. There
| are two semicolons on the _first page_ of the first Harry
| Potter book, a novel read by approximately every child of
| my generation. There are loads of examples of the proper
| use of dashes and other "obscure" punctuation marks in
| any professionally typeset text.
| zamalek wrote:
| > Only Americans
|
| I was raised and educated in Africa, specifically the
| GCSE curriculum. I was taught to use etc.
| quanloh wrote:
| me too, do not think it makes a difference in actual writing,
| like handwriting.
| grey413 wrote:
| En dashes, I'll grant you, are pointless. Those can go away.
|
| However, em dashes are a different case. The main reason why
| it's desirable to use em dashes (beside convention) is for
| clarity of purpose. The hyphen is already a very overloaded
| character; they're extensively used to denote ranges and link
| compound words. Importantly, both of those usages _do not
| correspond to pauses in spoken language._ If you're voicing a
| hyphen you're supposed to barrel on through it. An em dash is
| much closer to a parenthesis, comma, or semicolon. It's a
| meaningful break in the sentence, in the way that a hyphen
| isn't.
|
| Now, if it were up to me I'd choose a different character to
| replace em dashes (maybe underscores), but that's a separate
| argument.
| krupan wrote:
| Just use two dashes. Or like you said, use parentheses,
| commas, or semi-colons
| grey413 wrote:
| Two dashes are fine, the other options have different
| literary functions than em dashes, and shouldn't generally
| be used as replacements.
| account42 wrote:
| Real monsters use a single dash but with a wider font.
| tejohnso wrote:
| Yeah, trying to get people to take Em vs En vs Hyphen seriously
| is a fool's errand. Only typography nerds would take it
| seriously and there just aren't enough of them to make a
| difference. I'd guess that the vast majority of people have
| never even heard of these distinctions.
| Hnrobert42 wrote:
| I don't care about the length of the mark, but I did find this
| idea useful. Prone to excessive detail, I often find myself
| with a parenthetical inside of parenthetical. The developer in
| me insists on 2 closing parentheses. But it looks weird and
| nerdy. Although, using an em dash instead is probably just as
| nerdy.
|
| > Dashes are used inside parentheses, and vice versa, to
| indicate parenthetical material within parenthetical material.
| ...
|
| > The bakery's reputation for scrumptious goods (ambrosial,
| even--each item was surely fit for gods) spread far and wide.
| LinuxAmbulance wrote:
| Long live the parenthetical!
|
| I wish it was more popular, it neatly indicates meaning so
| very well.
| zamalek wrote:
| This is coming from someone who can only speak English: what a
| stupid language. How is having 3 symbols that are discernible
| only by their, almost identical, length a good idea? How would
| one grade a paper for correct usage, especially if handwritten?
|
| I agree with you completely.
| knallfrosch wrote:
| And that is why no one will remember your name.
| colanderman wrote:
| Note also that the "hyphen" on your keyboard is actually a
| "hyphen-minus". Unicode provides separate characters for hyphen
| (U+2010) and minus (U+2212).
| phkahler wrote:
| Let's not forget the minus symbol at U+2212. I was making a
| Simulink-like diagram editor and the dashes just didn't look
| good. U+2212 worked nicely.
| starfezzy wrote:
| We need a blog post documenting the ironic trend of people--
| themselves NPCs, actual human bots, just now realizing the em
| dash exists despite seeing it hundreds if not thousands of times
| before LLMs--flattering themselves by suggesting that anyone who
| understands the language at above a 5th grade level must be an
| LLM.
| citrus1330 wrote:
| You aren't special for using em dashes, and it doesn't make
| someone an NPC to notice that AIs frequently make use of them.
| ogurechny wrote:
| The comment above is not about being special, it is about
| proper typography that is still everywhere around us: books,
| serious websites, anything done by real designers. Those
| people had to try hard to miss all of that.
|
| No, it is not "politically incorrect" to call people lacking
| curiosity and/or education like you see them.
|
| No, someone's personal preferences or transitory fashions are
| not automatically promoted to the holy reference for the
| whole world.
| jeroenhd wrote:
| Taking knowledge of the three extra pixels that are "more
| correct" as some kind of indicator of intelligence is silly.
| Pretending you're somehow above them is just sad.
|
| Must be lonely at the top.
| MindBeams wrote:
| This thread is rampant with anti-intellectualism that
| deserves to be called out.
| dankwizard wrote:
| a human has never used an em dash in the wild
| apparent wrote:
| This shows both the en dashes and hyphens for page ranges. Is one
| preferred?
| TomasEkeli wrote:
| I'm just gonna say it: this does not matter. Just use whatever
| you want. If you're afraid that someone is going to think less of
| you for it: the people who matter won't.
| efilife wrote:
| For those who downvoted this - how does a millimeter of
| difference in the length of a line matter?
| scottyeager wrote:
| Well-meaning can vary if you don't put spaces around your
| dashes, and a well--meaning writer wants to ease the job of
| the reader.
|
| it might simpiy not matter though, a miiiimeter here and
| there, i suppose.
| rkosk wrote:
| The difference in dash length really doesn't matter and
| your example is not the same at all, but it probably made
| you feel really smart.
| MindBeams wrote:
| Did you mix those up on purpose?
| rsch wrote:
| Today in "typesetting before we had typewriters": ...
|
| At least we have dedicated O/0, and l/1 keys now. But we still
| see a lot of "straight" quotes instead of “those smart quotes
| Microsoft Word likes to generate”. And dashes. Did you know there
| is a dedicated ellipsis character? This is often set with
| slightly more space between dots than ..., and it by definition
| never wraps across a line between those dots. You still see (C)
| instead of ©.
|
| It is one of those things that doesn't really matter for
| readability, but although they can't necessarily put a finger on
| why, people may still notice that some documents or pages appear
| to be set with more care for details than others.
|
| (edit: I guess if you don't have to search on Google what the
| hell a 'Microsoft Word' is, then you're officially old)
| thangalin wrote:
| > dedicated O/0, and l/1 keys now
|
| And the 1 and 8 aren't next to each other anymore, either. (See
| typewriters from the "18"00s.)
|
| > those smart quotes
|
| Fixing straight quotes is a hard problem[0]. My FOSS text
| editor, KeenWrite[1], includes my library, KeenQuotes[2], for
| replacing them at build time. It's not perfect, but can typeset
| my ~400 page novel without any errors.
|
| > Did you know there is a dedicated ellipsis character?
|
| Yes! Here's where it gets parsed:
|
| https://gitlab.com/DaveJarvis/KeenQuotes/-/blob/main/src/mai...
|
| Then emitted:
|
| https://gitlab.com/DaveJarvis/KeenQuotes/-/blob/main/src/mai...
|
| Then transformed into an HTML entity:
|
| https://gitlab.com/DaveJarvis/KeenQuotes/-/blob/main/src/mai...
|
| When typesetting Markdown, KeenWrite first converts the
| document to XHTML (i.e., XML), then invokes ConTeXt to convert
| XML into TeX macros. One of those macros handles the ellipses
| by converting it to \dots{}:
|
| https://gitlab.com/DaveJarvis/keenwrite-themes/-/blob/main/x...
|
| This renders as the Unicode character in the final document:
| ...
|
| > set with more care for details
|
| Some of us old folks care about these details. ;-)
|
| [0]: https://stackoverflow.com/a/73466438/59087
|
| [1]: https://keenwrite.com/
|
| [2]: https://whitemagicsoftware.com/keenquotes
| keybored wrote:
| People have approximated ellipsis by using `. . .`.
|
| I use ellipsis. Which ironically is way too short when viewed
| in monotype...
| kps wrote:
| I use ellipses & dashes... perhaps the former will convince
| people I am human.
| vanschelven wrote:
| for em dashes and ellipsis at least it's trivial to convert
| before displaying them... which I do in my own markdown-to-
| publication toolchain (but not here on HN).
| knallfrosch wrote:
| I hate smart quotes because it's super weird to use the
| «French» and „German“ quotation marks.
| mvdtnz wrote:
| I genuinely do not care one tiny bit about doing this right. At
| all. I will use the minus key for all of these like I always have
| and nothing bad will ever come of it. Find a better way to
| channel your limited energy.
| mmooss wrote:
| Here's an easy, if not always precise way to remember:
|
| * Hyphens connect things, such as compound words: _double-decker_
| , _cut-and-dried_ , _212-555-5555_.
|
| * EN dashes make a range between things: _Boston-San Francisco_
| flight, _10-20_ years: both connect not only the endpoints, but
| define that all the space between is included. (Compare the last
| usage with the phone number example under Hyphens.)
|
| * EM dashes break things, such as sentences or thoughts: _'What
| the--!'_; _A paragraph should express one idea--but rules are
| made to be broken._
|
| Unicode has the original ASCII hyphen-minus (U+002d), as well as
| a dedicated hyphen (U+2010), other functional hyphens such as
| soft and non-breaking hyphens, and a dedicated minus sign
| (U+2212), and some variations of minus such as subscript,
| superscript, etc.
|
| There's also the figure dash (U+2012), essentially a hyphen-
| minus that's the same width as numbers and used aesthetically for
| typesetting, afaik. And don't overlook two-em dashes (U+2E3A) and
| three-em dashes (U+2E3B) and horizontal bars (U+2015), the latter
| used like quotation marks!
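|
| To check which of these characters a piece of text actually
| contains, the standard Unicode names can be looked up. A minimal
| sketch, assuming Python 3 and its standard unicodedata module; the
| list DASH_LIKE and the helper describe() are names made up for
| this illustration:

```python
import unicodedata

# Code points mentioned in the comment above, written as escapes so the
# source file itself stays plain ASCII.
DASH_LIKE = [
    "\u002d",  # the key on most keyboards
    "\u2010",  # dedicated hyphen
    "\u2012",  # figure dash
    "\u2013",  # en dash
    "\u2014",  # em dash
    "\u2015",  # horizontal bar
    "\u2212",  # dedicated minus sign
    "\u2e3a",  # two-em dash
    "\u2e3b",  # three-em dash
]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ch in DASH_LIKE:
        print(describe(ch))
```

| Running it prints one line per character, which makes it easy to
| tell visually similar marks apart in pasted text.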
| st_goliath wrote:
| Also, not to be confused with "一", which is a different thing
| entirely...
| mortos wrote:
| This one is U+4E00, CJK Unified Ideograph-4E00. So it's a
| common character between Chinese, Japanese, and Korean. This
| should be "one" in all three. And it does technically look a
| little different than a dash: https://unicodeplus.com/U+4E00
| KPGv2 wrote:
| And this is different from Japanese's chōonpu (U+30FC)
| which is a vowel elongation mark, and it's rendered
| horizontally or vertically depending on whether the text
| direction is horizontal or vertical, respectively.
| divbzero wrote:
| I prefer the dedicated minus (U+2212) over the hyphen-minus
| (U+002d) for mathematical use because they look different in
| most font faces.
|
| Are there cases where the dedicated hyphen (U+2010) is
| preferred over the hyphen-minus?
| LegionMammal978 wrote:
| G. Branden Robinson swears by U+2010 for hyphens in groff's
| Unicode output [0], but I see it as a hypercorrection. The
| most common convention by far (among authors who use Unicode
| and care about dashes) is to use U+002D for hyphens and
| U+2212 for minus signs. Not even the Unicode Consortium uses
| U+2010 for hyphens in its documents, and I'm not aware of any
| major organization that does.
|
| As far as appearance goes, almost all fonts I've looked at
| make U+2010 identical to U+002D (i.e., they don't put any
| 'minus' into the 'hyphen-minus'), but a few make U+2010 a
| smidgeon shorter.
|
| [0] https://news.ycombinator.com/item?id=38121765
| wruza wrote:
| Intl.NumberFormat also prefers it, but then you can't paste
| negative numbers into most financial software, calculators,
| spreadsheets. Even back into inputs on the same webpage, if
| it does custom number parsing. Even though <input
| type=number> accepts U+2212 as a minus, it turns it into a
| regular minus when you spin it down to -2.
|
| It looks much better though and more visible: −1 vs -1. I
| wish hyphen was a separate symbol from the ascii start, or
| that monospace fonts didn't tend to shorten "-" cause it
| makes little sense in monospace anyway.
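|
| The paste problem described above can be worked around by
| normalizing the sign before parsing. A minimal sketch in Python,
| not taken from any of the software mentioned; parse_number() is a
| hypothetical helper, and it only handles the sign, not
| locale-specific separators:

```python
MINUS_SIGN = "\u2212"    # dedicated minus, as found in typeset numbers
HYPHEN_MINUS = "\u002d"  # ASCII "-", which number parsers expect


def parse_number(text: str) -> float:
    """Parse a number that may have been pasted with U+2212 as its sign."""
    normalized = text.strip().replace(MINUS_SIGN, HYPHEN_MINUS)
    return float(normalized)
```

| Display code can then keep using U+2212 for looks while input
| handling stays tolerant of both forms.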
| mproud wrote:
| A regular hyphen arguably looks better when used as a hyphen
| and not a minus.
| zajio1am wrote:
| Visual style of hyphen-minus depends on font. Some fonts
| display it more like a minus, others like a hyphen. So if
| you care about distinguishing hyphen and minus, it makes
| sense to use dedicated hyphen and minus, and do not use
| hyphen-minus at all.
| layer8 wrote:
| It has two potential benefits:
|
| -- In the context of automatic text processing, it
| unambiguously indicates the function of a hyphen, as opposed
| to a minus
|
| -- Fonts can choose to make the hyphen-minus a bit wider than
| a regular hyphen, to accommodate the usage as a minus sign.
| In that case, U+2010 would be typographically more
| appropriate for a hyphen, similar to how U+2212 usually is
| typographically more appropriate for a minus sign.
| lxgr wrote:
| > EM dashes break things, such as sentences or thoughts
|
| Some style guides recommend "space, en dash, space" for this,
| and I prefer that myself - mainly because some software doesn't
| treat em dashes correctly as word separators for double click
| selection purposes.
|
| For example, I'm pretty sure that at least some Kindle models
| would highlight both the word before and after the em dash when
| selecting one of them, which makes using the dictionary very
| annoying.
| rahimnathwani wrote:
| I grew up in the UK, and have always used space, minus,
| space.
|
| The first keyboard I used was my dad's typewriter, and I
| don't recall it having any 'dash' other than the minus sign.
| KPGv2 wrote:
| space, minus, space is on the same level as manually typing
| two spaces after a period
| lxgr wrote:
| How so? One is the only way to approximate an en or em
| dash on a typewriter or in a charset that doesn't have
| one, the other seems like a workaround of a typesetting
| bug at best.
| Propelloni wrote:
| -, --, --- is, IIRC, how it is done in LaTeX and would be
| exceedingly simple to do on a typewriter. That being
| said, to break up sentences I use " -- " because I think
| it looks nicer than "---". I'll go now ;)
| lxgr wrote:
| LaTeX is a markup language though, not ASCII art. I can
| get behind two dashes as a substitute if no en dash is
| available, but three seems too much and looks like
| halfway to a horizontal line to me ;)
| rahimnathwani wrote:
| Until ~10 years ago, I used to type two spaces after a
| period.
| Daneel_ wrote:
| I still do, and I maintain that it's easier to read text
| with double spaces after periods.
| _emacsomancer_ wrote:
| TeX puts more space after periods/fullstops (which is why
| you're supposed to do special markup or other measures to
| mark '.' in the middle of sentences which aren't
| sentence-enders (e.g. like e.g.)). But it's generally
| smaller than the equivalent of two manual spaces.
|
| (A nice thing in (La)TeX is that one could follow the
| "two spaces after a full-stop" rule, which then has the
| advantage of being an explicit marking for sentence
| boundaries (which your editor might be able to navigate;
| Emacs has a convention of assuming two spaces after a
| sentence-ending '.'), but then the TeX typesetting will
| take care of making it look right. I lost the habit of
| actually doing this, for better or worse, except when
| flycheck/checkdoc/package-linter.el makes me do it for
| docstrings.)
| globnomulous wrote:
| I used to feel similarly. Now I find the double space a
| visual distraction that doesn't in any way improve
| readability.
|
| The effect of the double space is, I suspect, a product
| of the reader's expectations: if you expect it, its
| absence creates mental work, detracting from readability;
| if you don't expect it, its presence is what creates
| mental work.
| asveikau wrote:
| I'm still doing it when I am typing at a physical
| keyboard. Hard habit to break. I learned it so long ago
| too.
|
| You can tell when I've edited something on both a phone
| and a physical keyboard, based on the inconsistent use of
| spaces.
| rahimnathwani wrote:
| > Hard habit to break. I learned it so long ago too.
|
| Haha I learned to type organically, and it was only in my
| mid-40s that I retrained myself to type the correct way.
| It took something like 40 hours of practice on keybr.com
| before I could get close enough to my regular typing
| speed, such that I could switch over to the 'correct'
| method without it impacting my work.
|
| Retraining myself to stop doing double-spaces took maybe
| a week.
| kevin_thibedeau wrote:
| Most word processors can be configured to flag double
| spaces. That gives feedback to break the habit.
| robin_reala wrote:
| en-US style is a single em-dash. en-GB style is a single
| en-dash with spaces on either side.
| Propelloni wrote:
| I was under the impression that you do "-" for hyphen, "--"
| for En dash, and "---" for Em dash. IIRC, LaTeX (or maybe
| the editor, it has been some time) even helpfully changes
| that for you to the correct dash.
| rahimnathwani wrote:
| Google Docs also does these replacements.
| JadeNB wrote:
| > I was under the impression that you do "-" for hyphen,
| "--" for En dash, and "---" for Em dash. IIRC, LaTeX (or
| maybe the editor, it has been some time) even helpfully
| changes that for you to the correct dash.
|
| The conversion of '--' to an en dash and '---' to an em
| dash is done by the TeX compiler, and appears in the
| rendered file, but I think that most TeX editors don't
| change the TeX code itself. (This is distinct from XeTeX-
| based compilers, which can handle non-ASCII Unicode
| characters like the em dash '--' directly in the source.)
|
| (I think that the article's point is that, in some fonts,
| -- (two hyphens) is literally the (approximate) size of
| an em dash, not that it is always understood as meaning
| an em dash. At least in my font, --- (three hyphens) is
| far too long to literally look like an em dash:
|
| ---
|
| --
|
| —
|
| –
|
| (in order, three hyphens, two hyphens, em dash, en
| dash).)
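|
| The "--" and "---" input convention is easy to imitate outside
| TeX. A rough sketch in Python that mimics only the convention,
| not TeX's actual ligature mechanism; tex_dashes() and the two
| constants are names invented here:

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"


def tex_dashes(text: str) -> str:
    """Turn '---' into an em dash and '--' into an en dash."""
    # The longer run must be replaced first; otherwise '---' would
    # become an en dash followed by a stray hyphen.
    return text.replace("---", EM_DASH).replace("--", EN_DASH)
```

| A single "-" is left alone, so ordinary hyphenated words pass
| through unchanged.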
| Finnucane wrote:
| British typesetting style is a little different from US
| style in the way dashes are presented. In the UK, you might
| see a thin space, en dash, thin space where a US
| typesetter would use an em dash. Typewriter style generally
| follows book style. Since typesetters no longer use an
| extra space after punctuation, it's vestigial in typing.
| KPGv2 wrote:
| > Some style guides recommend "space, en dash, space" for
| this
|
| Which one does that? I threw up a little in my mouth and wish
| to avoid such style guides in the future!
| mmooss wrote:
| https://news.ycombinator.com/item?id=43501482
| lxgr wrote:
| Better avoid British journalism then, and many other
| languages on top of that.
|
| It's very common outside of America, even in English.
| mmooss wrote:
| The AP Style Manual, a/the leading source for US journalism
| at least, says <word> <space> <dash> <space>
| <word>
|
| Outside of journalism, usually there is no padding, only,
| <word> <dash> <word>
|
| I'm with you: For searches, the spaces make the words easier
| to parse. Those rules predate computers, I would guess.
| lxgr wrote:
| > <word> <dash> <word>
|
| That one I'd usually parse as a hyphen, as in e.g. well-
| known. "Word space dash space word" is much clearer, in my
| view.
|
| > The AP Style Manual, a/the leading source for US
| journalism
|
| One of the things I can easily get away with by not being a
| US journalist :)
| stouset wrote:
| It's quite hard to mistake an em dash for a hyphen in a
| proportional font.
|
| self-fulfilling
|
| self--fulfilling
|
| One of these looks very, very wrong.
| johnisgood wrote:
| I agree, although I still prefer spaces between --.
| mattl wrote:
| Chicago Manual of Style has no spaces, so there's some
| variation at least.
| mmooss wrote:
| CMOS is not journalism, so it's not variation from the
| GP?
| mattl wrote:
| A wider number of people use either of them. Every place
| I've worked used CMOS, which I now use with others.
| ghaff wrote:
| Company I used to work for used AP for things like press
| releases and, I think, official blog posts and Chicago
| plus a couple different tech style guides for everything
| else.
|
| Basically, we didn't like some things in AP but we wanted
| to make it easy for journalists to copy/paste.
| opello wrote:
| > Some style guides recommend "space, en dash, space" for
| this
|
| The last paragraph of the article also addressed the
| subjective nature of spacing around the em dash:
|
| > Spacing around an em dash varies. Most newspapers insert a
| space before and after the dash, and many popular magazines
| do the same, but most books and journals omit spacing,
| closing whatever comes before and after the em dash right up
| next to it.
|
| As far as the selection detail, did you mean that you replace
| an em dash used like a comma or parenthesis with spaces and
| an en dash for specific highlight performance issues? Surely
| the spaces and an em dash would alleviate the selection
| highlight behavior and not muddy the waters of when to use an
| em vs. an en dash?
| JadeNB wrote:
| > Spacing around an em dash varies. Most newspapers insert
| a space before and after the dash, and many popular
| magazines do the same, but most books and journals omit
| spacing, closing whatever comes before and after the em
| dash right up next to it.
|
| It's funny that they omit to mention the possibility of
| setting it off with a thin space ' ' or hair space ' '
| (those are the thin-space and hair-space Unicode
| characters, though they show up full width for me), which I
| thought was preferred typographic practice.
|
| (On Googling, maybe the reason that they don't mention it
| is that I was imagining it; I can't find any evidence for
| my belief.)
| opello wrote:
| > those are the thin-space and hair-space Unicode
| characters, though they show up full width for me
|
| Interestingly, at least in my browser and grabbing the
| direct link to the comment with curl, show the bytes as
| 0x20 for both. Perhaps the comment submission handler, or
| even the browser, collated your more specific U+2009
| (thin) and U+200A (hair) spaces into the regular U+0020
| space?
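|
| The same kind of byte check can be reproduced locally. A small
| sketch, assuming Python 3.8 or later (for the separator argument
| of bytes.hex); utf8_hex() is a made-up helper name:

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of a string as space-separated hex."""
    return ch.encode("utf-8").hex(" ")


if __name__ == "__main__":
    for label, ch in [("space", "\u0020"), ("thin", "\u2009"), ("hair", "\u200a")]:
        print(label, utf8_hex(ch))
```

| A thin or hair space that survived submission would show up as
| three bytes starting with e2 80, not as a single 20.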
| JadeNB wrote:
| > Interestingly, at least in my browser and grabbing the
| direct link to the comment with curl, show the bytes as
| 0x20 for both. Perhaps the comment submission handler, or
| even the browser, collated your more specific U+2009
| (thin) and U+200A (hair) spaces into the regular U+0020
| space?
|
| Probably! I think HN strips out emoji; maybe it just
| takes the safest approach and strips out all non-white-
| listed Unicode.
| krick wrote:
| It's actually only your post that made me realize people
| don't normally put spaces around em dash. In French, Russian
| and a bunch of other languages proper typesetting is to use
| em dash as a standard dash character, and you always put
| spaces around them. So I did it in English as well, for many
| years now.
|
| (I also now looked up and found out that in Spanish,
| apparently, you are supposed to put space only on one side of
| the dash, when used as a direct speech separator.)
| rmunn wrote:
| I also put spaces around em dashes. It looks wrong--subtly
| wrong--to me to have the words glued together around the
| dash. It looks right -- completely right -- to me to have
| the dash standing on its own, as if it was a word in its
| own right.
| lashloch wrote:
| Funny--I'm the exact opposite. The extra spaces distract
| my eyes. To each their own! :)
| rmunn wrote:
| To each their own: fully agreed, even though our tastes
| differ. I will mention one advantage of the spaces-
| around-dashes method: word wrap with default settings
| will break on the spaces around the dashes so that the
| entire word one, dash, word two combo doesn't end up
| pulled onto the next line as a whole unit. Whereas the
| advantage of the no-spaces method that you prefer is that
| word wrap will pull the entire word one, dash, word two
| combo onto the next line as a whole unit.
|
| Why yes, I did list the opposite behavior as an advantage
| of each. Because that, too, is up to individual
| preference. :-)
| lxgr wrote:
| That depends on the layout engine, I believe. Just tried
| it in Firefox (on macOS; not sure if it uses Core Text or
| something custom there), and it does sometimes break
| around the em dash in "foo--bar" style, not just "foo -
| bar" style.
|
| I've definitely noticed the behavior you describe on some
| layout engines, too, and it's another reason why I
| personally prefer "foo - bar" style.
| rmunn wrote:
| P.S. I also prefer smileys with noses, :-), as opposed to
| the noseless smileys, :), that most people these days
| seem to prefer. :-)
| mmooss wrote:
| It's not your own. You write mostly for others to read.
| tines wrote:
| The reason not to do this is observable in your post on
| my phone. The spaces cause the word wrapping algorithm to
| leave a dangling dash at the end of the line which looks
| ugly. Omitting spaces prevents the word break.
| hansvm wrote:
| Funny, I'd rather have the break at the start or end of
| the emdash-implied break than just before or after it,
| not having to mentally handle some single dangling word
| divorced from its compatriots.
| rmunn wrote:
| I mentioned that as an advantage in one of my other
| comments. An advantage both ways, because it depends on
| preference. I have the same preference as hansvm: I would
| rather see the dangling dash at the end of the line, so I
| prefer putting spaces around the dashes. Having the
| entire word-dash-word structure move to the next line
| feels ugly to me. As with most things, _de gustibus non
| est disputandum_. (And also, _quidquid Latine dictum sit
| altum videtur_ ).
| chipotle_coyote wrote:
| It's the dangling dash at the _beginning_ of the line
| that gets me. I see a lot of word break algorithms,
| including the one WebKit (and I suspect Blink) uses,
| which are happy to break "foo--bar" on either side of
| the em dash.
| da_chicken wrote:
| Ironically, on my phone the only line that ends with an
| em dash has no spaces in it.
|
| If you want to not have a line break, you shouldn't rely
| on arbitrary behavior. You should use non-breaking
| characters like non-breaking spaces and word joiners.
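|
| A minimal sketch of that advice in Python, assuming the renderer honors U+2060 WORD JOINER and U+00A0 NO-BREAK SPACE (support varies by layout engine). The helper names are hypothetical, not part of any library:

```python
WORD_JOINER = "\u2060"  # U+2060 WORD JOINER: zero-width, forbids a break on either side
NBSP = "\u00a0"         # U+00A0 NO-BREAK SPACE
EM_DASH = "\u2014"      # U+2014 EM DASH


def glue_em_dashes(text: str) -> str:
    """Unspaced style: wrap each em dash in word joiners so no break lands next to it."""
    return text.replace(EM_DASH, WORD_JOINER + EM_DASH + WORD_JOINER)


def nbsp_before_spaced_dashes(text: str) -> str:
    """Spaced style: swap the space before the dash for a no-break space,
    so the dash can end a line but never start one."""
    return text.replace(" " + EM_DASH, NBSP + EM_DASH)


print(repr(glue_em_dashes("foo\u2014bar")))
print(repr(nbsp_before_spaced_dashes("foo \u2014 bar")))
```

| Which of the two helpers is appropriate depends on whether you want the dash glued to both neighbours or only to the word before it.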
| mmooss wrote:
| > The reason not to do this is observable in your post on
| my phone. The spaces cause the word wrapping algorithm to
| leave a dangling dash at the end of the line which looks
| ugly. Omitting spaces prevents the word break.
|
| That's an interesting practicality but I don't think it's
| the cause of the rule: The rule probably long predates
| automated line breaking. Also, I think automatic line
| breaking will break compound words at the hyphen; it
| doesn't require spaces (which is also obvious from a
| software development point of view: the logic is
| relatively simple either way): Lorem
| ipsum dolor sit amet, consectetur adipiscing double-
| decker lorem ipsum dolor sit amet, consectetur ...
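|
| As one concrete data point, Python's standard textwrap module can break after the hyphen of a compound word without any spaces, controlled by its break_on_hyphens option. This is only a sketch of one wrapping implementation; browsers and phones may behave differently:

```python
import textwrap

text = "consectetur adipiscing double-decker lorem"

# Default behaviour: a compound word may be split right after its hyphen.
with_breaks = textwrap.wrap(text, width=30, break_on_hyphens=True)

# With the option off, "double-decker" moves to the next line as a unit.
without_breaks = textwrap.wrap(text, width=30, break_on_hyphens=False)

for line in with_breaks:
    print(line)
print("---")
for line in without_breaks:
    print(line)
```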
| lxgr wrote:
| Preventing the word break doesn't seem very desirable,
| especially if it causes a large gap.
| laptopdev wrote:
| Grammar nazi, but isn't it "It looks right -- completely
| right, to me -- to have the dash standing on its own"...
| snozolli wrote:
| _people don't normally put spaces around em dash_
|
| For what it's worth, I was in the last class in my high
| school to learn typing on IBM Selectric typewriters. We
| were taught to type two spaces, two hyphens, then two
| spaces. Incidentally, we were taught two spaces after
| periods and colons. To this day, I find it hard to read
| text that doesn't have proper spacing after periods. (HTML
| and WYSIWYG word processors handle formatting, but e.g.
| fixed-font text editors don't)
| dragonwriter wrote:
| It's funny that people think that conventions for
| typewritten text built around the limitations of
| typewriters define what is "proper" in environments where
| typewriters and their limitations are not involved.
| ovalanche wrote:
| Yes, this always grinds my gears too. There is already a
| slightly larger space after periods in contemporary
| typefaces.
|
| The old typewriter typefaces were monospaced, i.e. every
| character was the same width, but this is no longer the
| case. Virtually all typefaces today are proportionally
| spaced, not monospaced. So it's redundant to leave extra
| room after periods.
| kevin_thibedeau wrote:
| I was taught that and abandoned it as a pointless
| anachronism. How often are you reading long form text in
| a monospace font?
| mmooss wrote:
| What is a "standard dash character"? There is no such thing
| in English; only hyphen, EN dash, EM dash (and some odds
| and ends).
| cyrillite wrote:
| I have been doing this for purely aesthetic reasons my whole
| life. Style guides be damned, I hate connected em dashes.
| lxgr wrote:
| The good thing about style guides is that they're guides,
| not laws :)
|
| That's one thing I really like about English: There's no
| central authority decreeing what's right and what's wrong
| top down, and it feels like there is some room for
| individual preferences and experimentation.
|
| Very refreshing, compared to e.g. German, which has more
| than one semi-official authority gate keeping "correctness"
| in speech and writing.
| mmooss wrote:
| In fairness, especially in the Anglo-Saxon dominated
| world post-WWII, English was under no threat to be
| swamped by German or French words.
| energy123 wrote:
| The em dash is now a GPT-ism and is not advisable unless you
| want people to think your writing is the output of an LLM.
| xanderlewis wrote:
| No, thanks--I'll keep using them as I always have.
| alt187 wrote:
| The letter 'm' is now a GPT-ism and is not advisable unless
| you want people to think your writing is the output of an LLM.
| sho_hn wrote:
| My advice is to take pleasure and have confidence in good
| writing, over misspent energy worrying about things like
| this.
|
| If you practice your skills, you will reap the rewards.
| mmooss wrote:
| Someone else said the same. How can that be when most word
| processors, and at least some phone keyboards, automatically
| insert em dashes?
| phlakaton wrote:
| Emily Dickinson wept--
| mmooss wrote:
| Ha, good point, and an interesting question: What kinds of
| dashes did Dickinson intend?
|
| It's a hard one to answer: We could look at published Emily
| Dickinson books from the time, but did Dickinson really pay
| that close attention to or have that much control over the
| type?
|
| We could look at Dickinson's actual personal documents, but
| if they were handwritten, distinguishing dashes could be
| difficult even if there was intention there.
| grey413 wrote:
| I imagine it would have been up to the typesetter to make
| the call. The conventions for dash usage are fairly
| straightforward. You use em-dashes for asides, en dashes
| for ranges, and hyphens for most other cases. It's easy to
| figure out the right character from context (apart from
| en ranges vs hyphen ranges).
| armedgorilla wrote:
| Fortunately we have troves of her handwritten documents;
| all of her poems were first printed posthumously. To me,
| she's using the punctuation as pacing or tonal markers as
| opposed to ligatures ("I'll clutch-- and clutch-- " vs
| "I'll clutch-and clutch-"). Many publishers style these
| marks as longer than normal m-dashes for that reason,
| which makes sense seeing as they are rarely used as
| asides.
|
| I interpret her marks--
|
| as breathless pauses--
|
| that-- having no unicode--
|
| should be given to m--
|
| and space--
|
| https://www.edickinson.org/editions/2/image_sets/12170035
| phlakaton wrote:
| Em-dashes have been the norm in every Dickinson poem I
| read, and I think it might have derived from the
| preferences of Victorian publishers, who I understand
| _loved_ those long dashes.
| mmooss wrote:
| Great comment. Thank you!
| lostlogin wrote:
| I had a quick search, attempting to find a great author who
| hated em dashes and preferred the vastly superior en dash.
| I found nothing.
|
| This list of authors' punctuation quirks is interesting
| though.
|
| https://lithub.com/the-punctuation-marks-loved-and-hated-
| by-...
| grey413 wrote:
| It's infuriating that people are drawing this conclusion.
| LLMs pick up on em dash usage because professional and
| skilled writers use em dashes. They're a consistently useful,
| if niche, part of the literary toolkit.
|
| But, no, now it's a problem because the majority of people's
| experience with writing is graded essays. And because LLMs
| emulate professionals, it's now a red flag if students write
| too much like professionals. What a joke.
| nkotov wrote:
| Recently ran into this. Didn't realize it was that obvious.
| windward wrote:
| And you'd better not 'delve' into anything
| econ wrote:
| I've always wanted an array or object with range keys like:
| arr[0-2] = 123; if(arr[1.5555]>122){}
| paulddraper wrote:
| In Python it's a colon.
| yesbabyyes wrote:
| That doesn't seem to be an array at all, if the idea is to
| check whether a number is within a range. Seems like an
| interesting data type though, a combination of a range data
| type and a map/associative array.
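|
| A rough sketch of such a range-keyed map in Python. The RangeMap class is hypothetical, and it assumes half-open, non-overlapping ranges [start, end), which may not be what the original arr[0-2] notation intended:

```python
from bisect import bisect_right


class RangeMap:
    """Map half-open numeric ranges [start, end) to values (illustrative only)."""

    def __init__(self):
        self._starts = []  # sorted range starts, used for binary search
        self._items = []   # (start, end, value) tuples, parallel to _starts

    def set(self, start, end, value):
        # Assumption: callers never insert overlapping ranges; no check is made.
        i = bisect_right(self._starts, start)
        self._starts.insert(i, start)
        self._items.insert(i, (start, end, value))

    def __getitem__(self, key):
        # Find the last range whose start is <= key, then check its end.
        i = bisect_right(self._starts, key) - 1
        if i >= 0:
            start, end, value = self._items[i]
            if start <= key < end:
                return value
        raise KeyError(key)


arr = RangeMap()
arr.set(0, 2, 123)
if arr[1.5555] > 122:
    print("1.5555 falls inside the range 0 to 2")
```

| Lookups are a binary search over the sorted starts, so they stay cheap even with many ranges; inserts are linear because of the list insert.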
| mproud wrote:
| A Figure Dash is perfect for phone numbers (especially when
| working with tabular numbers).
| raverbashing wrote:
| You are right of course
|
| However this is the kind of rule that "existed" for a while and
| most likely will go away as most people can't be bothered with
| the difference and it all looks similar anyway
|
| Or maybe who knows, it will keep going on because chatgpt knows
| it
| BoumTAC wrote:
| I'm not a native English speaker, but don't you use the ";" in
| English ?
|
| To me, it feels like it is the same purpose as the EM dashes.
|
| And I discovered the EM with ChatGPT, I've never seen it
| before.
| OJFord wrote:
| Dashes surround a sub-clause - something like this - which is
| like a parenthetical addition to a sentence that could stand
| alone without it; semi-colons (';') connect a further
| sentence or part of one where perhaps a full-stop and
| additional word could have been. They also sometimes separate
| list items following a colon, especially if the things listed
| are longer sentences perhaps themselves containing commas
| that'd otherwise be ambiguous.
| grey413 wrote:
| Em dashes are very similar to semicolons. You use em dashes
| if your related sentence is in the middle of another
| sentence, and semicolons if it's at the end.
|
| They're frequently used in skilled and professional grade
| writing.
| mmooss wrote:
| So as not to mislead anyone, the parent is mostly
| incorrect:
|
| Here's an example sentence: _Semicolons must have
| independent clauses--phrases that could form a full
| sentence on their own--on both sides of them; they are
| essentially alternatives for periods._ Em dashes don't
| require independent clauses on either side.
|
| In the italicized sentence,
|
| * _phrases that could form a full sentence on their own_ is
| not an independent clause but is valid between em dashes.
| _on both sides of them_ , after the em dashes, is also not
| an independent clause. (The em dashes function like commas
| or parentheses here.)
|
| * The parts before and after the semicolon are independent
| clauses. You could replace the semicolon with a period and
| you'd have perfectly valid grammar. I just chose to connect
| the two sentences a bit more.
|
| I don't know if you can use em dashes as the parent comment
| describes, connecting three independent clauses:
|
| * _My favorite fruit is peaches--they are very sweet--I eat
| them all summer._
|
| I think the above is wrong; it should be one of the
| following:
|
| * _My favorite fruit is peaches--they are very sweet--and I
| eat them all summer._ : The last section is a dependent
| clause made by "and", not an independent clause.
|
| * _My favorite fruit is peaches--they are very sweet; I eat
| them all summer._ : On both sides of the semicolon are
| independent clauses; I could replace the semicolon with a
| period.
|
| Maybe there are examples I'm not thinking of? I infer that
| the rule might be that the punctuation following the
| em-dashed clauses should be the punctuation that would have
| been used without the em-dashed clause, but that's based on
| very limited evidence.
| layer8 wrote:
| A semicolon connects, whereas an em-dash creates more of a
| pause and therefore separates. In addition, em-dashes can be
| used in pairs to create a parenthesis, which semicolons
| can't. I think with time you will appreciate the difference.
|
| https://thenarrativearc.org/blog/2020/2/4/epic-grammar-
| battl...
| mmooss wrote:
| Many people don't use semicolons (;) in English but many do,
| and they are certainly part of correct grammar.
|
| Semicolons are generally alternatives to periods, when you
| want more connection between the two sentences. Like periods,
| semicolons _must_ have two full sentences--that is, what
| _could be_ full sentences--on either side of them; the
| potential 'full sentences' are properly called _independent
| clauses_. (A _dependent clause_ needs the rest of the
| sentence to form valid grammar; it can't function on its
| own. For example, in this paragraph's first sentence, _when
| you want more connection between the two sentences_ is a
| dependent clause. Often they follow commas.)
|
| Another use of semicolons is for lists in a paragraph where
| one of the list items has a comma in it (similar to the
| parsing problem for CSVs where some records contain commas):
| _I only like wine; beer, but only ales; and orange juice._
| dspillett wrote:
| _> Unicode has the original ASCII hyphen-minus (U+002d), as
| well as a dedicated hyphen (U+2010), other functional
| hyphens..._
|
| Which can be fun when parsing CSV files from various sources.
| I've hit numbers with U+2010 or others where you would expect a
| hyphen-minus to be. Presumably someone[2] has copied a
| negative number from a document where one of the alternate
| symbols was used, and pasted it into everyone's favourite
| data-mangler[1], which interpreted it as a string, and so on down
| the chain.
|
| --------
|
| [1] Excel. Sometimes a joy, sometimes the bane of my existence.
|
| [2] It is surprising, horrifying even, how much manual
| manipulation of data goes on in banking, where you might
| naturally assume everything is more automated these days.
| Sometimes a laborious manual process done regularly is seen as
| cheaper than paying for it to be automated...
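|
| One defensive approach, sketched in Python: map the dash-like code points to the ASCII hyphen-minus before parsing. The set of characters below is an assumption about what tends to show up, not an exhaustive list, and parse_number is a hypothetical helper:

```python
# Dash-like characters that sometimes stand in for a minus sign in pasted data.
DASH_LIKE = {
    "\u2010": "-",  # HYPHEN
    "\u2011": "-",  # NON-BREAKING HYPHEN
    "\u2012": "-",  # FIGURE DASH
    "\u2013": "-",  # EN DASH
    "\u2212": "-",  # MINUS SIGN
}
_TABLE = str.maketrans(DASH_LIKE)


def parse_number(field: str) -> float:
    """Normalize dash look-alikes to U+002D, then parse the field as a float."""
    return float(field.strip().translate(_TABLE))


print(parse_number("\u20105.25"))
print(parse_number(" \u22123 "))
```

| Applying the same translation to every CSV field before type inference keeps such values from silently turning into strings.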
| docmars wrote:
| EN dashes are also great for date ranges: _1/1/2025-3/28/2025_
| darajava wrote:
| Most people don't use the em dash. It's too hard to type and
| looks too similar to a hyphen.
|
| As a result, a hallmark of GPT-generated text is its (over)using
| of the em dash--I have stopped using it for this reason and just
| use two hyphens now instead.
| emmelaich wrote:
| You mean en dash?
| darajava wrote:
| No. Em dashes are to separate two thoughts in one sentence.
| En dashes are for denoting ranges.
| grey413 wrote:
| Most people don't use em dashes... apart from professional and
| skilled writers, who use them regularly.
|
| It's a _bit_ of a problem that the same character is both a
| mark of LLMs and skilled writing.
| darajava wrote:
| Yes, true! I was tired when I clumsily made that point above
| (I am not a skilled writer).
|
| I learned how to use the em dash properly about 6 months
| before the release of ChatGPT and then when it was released I
| realized that it used them _all the time_. So, to convince
| people that I both know basic grammar and I am human I
| started to use "--" (two hyphens) instead of a real em dash.
| sethaurus wrote:
| For anyone finding em-dashes too small, behold the majesty of
| U+2E3B, the triple-em dash: ⸻
| kazinator wrote:
| Use three repetitions of the ASCII minus for em-dash, two for en-
| dash.
|
| Do not use the Unicode characters, or people will think you are
| an AI bot.
| rappatic wrote:
| I use em dashes all the time in writing, but unfortunately
| ChatGPT and co. use the em dash frequently--and most people use
| the em dash infrequently, not knowing how to type it on a
| keyboard--so it's starting to make my writing look AI-generated
| sometimes. I fear it'll have to go the way of words like
| "tapestry."
|
| FWIW, you can type an em dash on Mac with shift + option +
| hyphen.
| fastball wrote:
| I use them as well. For blog posts I suppose I'll need to
| switch to regular hyphens lest people think all my writing is
| LLM-spam.
|
| That said, I don't even think you need the [shift] for em dash
| on Mac - just [option] + [hyphen] works for me.
| contact9879 wrote:
| That's an en-dash
| fastball wrote:
| Oh yeah, true. They look the exact same in the HN editor
| haha
| pama wrote:
| Embrace the confusion with the AI--it's a sign of progress!
| keybored wrote:
| Three^W Four top-level comments so far with this concern. Nice
| try AIs but I won't downgrade my writing.
| wraptile wrote:
| Em dashes without surrounding spaces is such an ugly relic that
| triggers me to no end and is objectively wrong. The dash object
| is part of the sentence -- not the two words it's separating.
| Imagenuity wrote:
| I agree, this bugs me too.
| keybored wrote:
| Using em-dash with spaces takes up way too much space. Use an
| en-dash then instead.
|
| The perfect way to surround with hairspace.
| anon1094 wrote:
| I've been writing for years and never used en or em dashes before
| LLMs.
| Imagenuity wrote:
| I could never remember which was the longer dash. Now it's easy,
| because the en dash - is the approximate length of a capital N,
| and an em dash -- is the approximate length of a capital M. Today
| I Learned!
| porridgeraisin wrote:
| I simply do not care. I will just use - (the one next to zero on
| the keyboard) everywhere. There are a grand total of zero
| situations where using one in place of the other hampers
| information reconstruction or reading comprehension (although the
| latter is subjective, I suppose)
| lispybanana wrote:
| Super- or subhuman intelligence can be identified in the pre-
| Mason-Dixon line era.
| low_tech_punk wrote:
| The problem with en and em dash is that
|
| 1) they are too hard to type.
|
| 2) using them without surrounding thin space or hairspace breaks
| the horizontal rhythm and draws unnecessary attention to the
| punctuation; but thin and hair spaces are equally hard to type
|
| 3) Most people write markdown with mono space fonts, making these
| dashes and spaces indistinguishable.
| perilunar wrote:
| The eternal debate between minimalism and the ornate.
|
| There's room for both: when presentation matters I use them; when
| it doesn't, I don't.
| Stratoscope wrote:
| I had one minor quarrel with this article: The use of spaces (of
| any kind) before and after the em dash or any dashes.
|
| Personally, I am fond of using either a hair space or a thin
| space before and after the em dash. Not a full space!
|
| To explore the various options, I wrote a little program to print
| the various combinations of dashes and spaces. I think what looks
| best depends a lot on what typeface you're using. But let's see
| how they look in the Verdana font used here. You should be able
| to paste this into your favorite word processor to see it in
| other fonts:
|
| ASCII 0x2D hyphen-with no spaces
|
| ASCII 0x2D hyphen - with U+200A hair spaces
|
| ASCII 0x2D hyphen - with U+2009 thin spaces
|
| ASCII 0x2D hyphen - with 0x20 full spaces
|
| Unicode U+2010 hyphen-with no spaces
|
| Unicode U+2010 hyphen - with U+200A hair spaces
|
| Unicode U+2010 hyphen - with U+2009 thin spaces
|
| Unicode U+2010 hyphen - with 0x20 full spaces
|
| Unicode U+2013 en dash-with no spaces
|
| Unicode U+2013 en dash - with U+200A hair spaces
|
| Unicode U+2013 en dash - with U+2009 thin spaces
|
| Unicode U+2013 en dash - with 0x20 full spaces
|
| Unicode U+2014 em dash--with no spaces
|
| Unicode U+2014 em dash -- with U+200A hair spaces
|
| Unicode U+2014 em dash -- with U+2009 thin spaces
|
| Unicode U+2014 em dash -- with 0x20 full spaces
|
| It looks like HN is really mangling this. Hair spaces are
| rendered wider than thin spaces?
|
| If anyone wants to experiment, here is the Python code:
|     from dataclasses import dataclass
|
|     @dataclass
|     class Character:
|         char: str
|         name: str
|
|     DASHES = [
|         Character( "-", "ASCII 0x2D hyphen" ),
|         Character( "\u2010", "Unicode U+2010 hyphen" ),
|         Character( "\u2013", "Unicode U+2013 en dash" ),
|         Character( "\u2014", "Unicode U+2014 em dash" ),
|     ]
|     SPACES = [
|         Character( "", "no" ),
|         Character( "\u200A", "U+200A hair" ),
|         Character( "\u2009", "U+2009 thin" ),
|         Character( "\x20", "0x20 full" ),
|     ]
|
|     for dash in DASHES:
|         for space in SPACES:
|             print( f"{dash.name}{space.char}{dash.char}{space.char}with {space.name} spaces\n" )
| aorth wrote:
| I read Butterick's _Hyphens and dashes_ some years ago and it
| stuck with me. Now I regularly use hyphens, en dashes, and em
| dashes correctly--I even memorized the Unicode sequences and
| enter them seamlessly on Linux with Ctrl-Shift-U!
|
| https://practicaltypography.com/hyphens-and-dashes.html
| uneekname wrote:
| Came here to post the same link! That book is wonderfully
| opinionated and has helped clarify some typographic concepts
| for me
| lloeki wrote:
| > The en dash is the least loved of all; it's not easily rendered
| by the average keyboard user (one has to select it as a special
| character, whereas the em dash can be conjured with two hyphens)
|
| on macOS:
|
| - - => - (hyphen/minus)
|
| - Option + - => en dash (U+2013)
|
| - Shift + Option + - => em dash (U+2014)
|
| There are so many of these convenient typographical shortcuts
| that a long time ago I made Apple layouts for Windows and Linux.
|
| And many are mnemonic too, like:
|
| - of course ÷ (division) is Option + / (slash, which is poor man's
| division)
|
| - of course ¿ is Shift + Option + / because Shift + / is ? so
| logically Shift + Option + / is Option + ? which is ¿
|
| - guess what ≤ ≥ ± ≠ are
|
| - ¬ (logical negation) is Option + L because it's an L sideways
|
| - £ (pound) is Option + 3 because Shift + 3 is # (octothorpe, abused
| as sharp or pound - the other kind)
| 8bithero wrote:
| The only problem with correctly using the Em or En dash is that
| people will automatically assume the text was written by an LLM
| -_-
| Kiro wrote:
| Yeah, I've stopped using them because of this reason.
| lol768 wrote:
| The more people proliferate this, the worse it'll be--frankly,
| we should be embarrassed that societal literacy and writing
| style knowledge is so poor that we jump to the "must be written
| by an LLM" conclusion whenever we see any sort of exotic
| character usage!
| quitit wrote:
| Invoking these from the mac keyboard:
|
|     Hyphen for hyphen
|     Option + Hyphen for n-dash
|     Shift + Option + Hyphen for m-dash
|
| While I'm here, Shift+Return for a soft return (i.e. not a new
| paragraph.)
| jeffhuys wrote:
| I used a lot of these, but actually stopped due to my text
| sometimes being called out as chatgpt output. I also thorw in the
| occasional spelling mistake. If a piece of text on reddit/x has
| an em dash (not "-") in it, you can be 95% sure it's an LLM.
| indexerror wrote:
| That is an interesting observation. I wonder what percentage of
| the training text data for LLMs contains proper dashes, since a
| large part of it is user-generated content.
| keybored wrote:
| All self-respecting journalistic outlets use proper symbols.
| Where does the LLM get their opinions on "foreign affairs"
| from? Probably from the likes of New York Times like a
| standard lib...
|
| And it shouldn't be hard for an LLM to learn to use proper
| symbols when synthesizing content from the everyman. It's not
| like it works on the level of literal copy and paste.
| the-mitr wrote:
| In LaTeX simple to remember hyphen (-), an en-dash (--), and an
| em-dash (---).
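|
| For text that is not going through LaTeX, the same convention can be applied with a small substitution. A hedged Python sketch; tex_dashes_to_unicode is a hypothetical helper, and it does not try to handle runs of four or more hyphens sensibly:

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"


def tex_dashes_to_unicode(text: str) -> str:
    """Turn LaTeX-style '---' and '--' into em and en dashes.

    The three-hyphen form is replaced first so it is not consumed as
    an en dash followed by a stray hyphen.
    """
    return text.replace("---", EM_DASH).replace("--", EN_DASH)


print(tex_dashes_to_unicode("pp. 10--20"))
print(tex_dashes_to_unicode("wait---what"))
print(tex_dashes_to_unicode("well-known"))
```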
| renatoboo wrote:
| My therapist: "homoglyph and punycode attacks are made-up terms by
| computer people to justify their paycheck".
|
| Also Merriam-Webster:
| psychoslave wrote:
| If you are looking for an alternative to kebab case for writing
| identifiers in a programming language that reserves - (U+002D)
| as an operator, chances are good you can use · (U+00B7 MIDDLE
| DOT), which we use in _middot case_.
|
| So isMorePleasantToRead, is_more_pleasant_to_read or
| is·more·pleasant·to·read is up to you.
| nlitened wrote:
| But how pleasant is it to write?
| psychoslave wrote:
| On the bepo layout that I use, extremely well, as it sits
| between ’ (U+2019 RIGHT SINGLE QUOTATION MARK) and ‑
| (U+2011 NON-BREAKING HYPHEN), each being generated by
| altgr+shift and x . and k (which are all on the opposite side
| of the keyboard compared to altgr key).
|
| At least from the point of view of digital gymnastic, it's
| not really any worse than camel or snake cases, though direct
| access to dash could be said to give a small facilitation for
| input in kebab case.
|
| So it really depends on the keyboard layout used (or whatever
| input device facility is used). What's your favorite input
| method lately? Does it really not provide a convenient
| way to input more than ASCII visible glyphs?
|
| Plus, let's be honest, identifiers are generally written in
| full expanse only once, then autocompletion is going to do it
| for us. And we all know we spend more time reading
| identifiers than declaring new ones.
| thomasjb wrote:
| This is intriguing to me, do you know which (programming)
| languages tolerate this?
| psychoslave wrote:
| Python
|
|     python3 -c "some·identifier = 0; print(some·identifier)"
|
| C
|
|     echo -e '#include <stdio.h>\nint main() { int some·identifier = 0; printf("%d", some·identifier); return 0; }' | gcc -x c -o temp - && ./temp
|
| C++
|
|     echo '#include <iostream>\nint main() { int some·identifier = 0; std::cout << some·identifier; return 0; }' | g++ -x c++ -o temp - && ./temp
|
| Ruby
|
|     ruby -e 'some·identifier = 0; puts some·identifier'
|
| Javascript
|
|     node -e 'let some·identifier = 0; console.log(some·identifier);'
|
| Rust
|
|     echo 'fn main() { let some·identifier = 0; println!("{}", some·identifier); }' > temp.rs && rustc temp.rs && ./temp
|
| Go throws an "invalid character U+00B7 '·' in identifier" error.
|
| Java throws "error: illegal character: '\u00b7'".
|
| C# is really annoyed with it apparently:
|
|     echo 'using System; class Program { static void Main() { int some·identifier = 0; Console.WriteLine(some·identifier); } }' > Program.cs && mcs Program.cs && mono Program.exe
|
|     Program.cs(1,60): error CS1056: Unexpected character `·'
|     Program.cs(1,60): error CS1525: Unexpected symbol `identifier', expecting `,', `;', or `='
|     Program.cs(1,99): error CS1056: Unexpected character `·'
|     Program.cs(1,99): error CS1525: Unexpected symbol `identifier'
|
| That's it for the top of the TIOBE index I tested in the frame of
| this message.
| thomasjb wrote:
| Thank you very much for testing it! I'm plugging away on
| Advent of Code 2015 in C, I'll give this a go to see if I
| like it
| steveklabnik wrote:
| The reason this works in Rust is that Rust follows Unicode's
| categorization of which code points are useful as
| identifiers: https://www.unicode.org/reports/tr31/
|
| MIDDLE DOT is Other_ID_Continue
|
| I know less about the other languages but it wouldn't
| surprise me if they did similar things.
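|
| Python, for one, exposes its identifier rule directly, so the claim can be checked without compiling anything. A small sketch using str.isidentifier; the variable names are only for illustration:

```python
MIDDLE_DOT = "\u00b7"  # U+00B7 MIDDLE DOT

name = f"some{MIDDLE_DOT}identifier"

# Allowed in the middle of an identifier (a "continue" character)...
middle_ok = name.isidentifier()

# ...but not as the first character (it is not a "start" character).
leading_ok = (MIDDLE_DOT + "name").isidentifier()

# The ASCII hyphen-minus is never part of an identifier.
kebab_ok = "some-identifier".isidentifier()

print(middle_ok, leading_ok, kebab_ok)
```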
| pwdisswordfishz wrote:
|      0   0  000048  48        H  LATIN CAPITAL LETTER H
|      1   1  00006F  6F        o  LATIN SMALL LETTER O
|      2   2  000077  77        w  LATIN SMALL LETTER W
|      3   3  000020  20           SPACE
|      4   4  000074  74        t  LATIN SMALL LETTER T
|      5   5  00006F  6F        o  LATIN SMALL LETTER O
|      6   6  000020  20           SPACE
|      7   7  000055  55        U  LATIN CAPITAL LETTER U
|      8   8  000073  73        s  LATIN SMALL LETTER S
|      9   9  000065  65        e  LATIN SMALL LETTER E
|     10  10  000020  20           SPACE
|     11  11  000045  45        E  LATIN CAPITAL LETTER E
|     12  12  00006D  6D        m  LATIN SMALL LETTER M
|     13  13  000020  20           SPACE
|     14  14  000044  44        D  LATIN CAPITAL LETTER D
|     15  15  000061  61        a  LATIN SMALL LETTER A
|     16  16  000073  73        s  LATIN SMALL LETTER S
|     17  17  000068  68        h  LATIN SMALL LETTER H
|     18  18  000065  65        e  LATIN SMALL LETTER E
|     19  19  000073  73        s  LATIN SMALL LETTER S
|     20  20  000020  20           SPACE
|     21  21  000028  28        (  LEFT PARENTHESIS
|     22  22  002013  E2 80 93  –  EN DASH
|     23  25  000029  29        )  RIGHT PARENTHESIS
|
| Ironic.
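|
| A listing like the one above can be reproduced with a few lines of Python. This is a sketch that assumes the columns are character index, UTF-8 byte offset, code point, UTF-8 bytes, character, and Unicode name; the dump function is hypothetical and not the tool used for the original output:

```python
import unicodedata


def dump(text: str) -> list[str]:
    """Return one row per character: index, byte offset, code point, bytes, char, name."""
    rows = []
    byte_offset = 0
    for index, ch in enumerate(text):
        encoded = ch.encode("utf-8")
        hex_bytes = " ".join(f"{b:02X}" for b in encoded)
        name = unicodedata.name(ch, "<unnamed>")
        rows.append(
            f"{index:>3} {byte_offset:>3} {ord(ch):06X} {hex_bytes:<8} {ch} {name}"
        )
        byte_offset += len(encoded)  # multi-byte characters push later offsets ahead
    return rows


for row in dump("(\u2013)"):
    print(row)
```

| The jump in the second column after the dash is the giveaway: an en dash takes three bytes in UTF-8, so the closing parenthesis sits at byte offset 4 rather than 2.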
| kzrdude wrote:
| That's a problem on the HN side only, not in the article
| dwighttk wrote:
| "So, you want to be accused of being an AI..."
| account42 wrote:
| Thanks, but I'll keep using good old U+002D. Widening a glyph is
| a font/typesetting concern and doesn't make it a different
| character.
| velcrovan wrote:
| Here's my AutoHotkey script for making my favorite punctuation
| hotkeys on my Windows laptops the same as my Mac:
|     #-::Send("-") ; Win+- = en-dash
|     #+-::Send("--") ; Win+SHIFT+- = em-dash
|     #]::Send("'")
|     #+]::Send("'")
|     #[::Send(""")
|     #+[::Send(""")
|     #;::Send("...")
|     #+>::Send("-")
|     #+<::Send("-")
|     #8::Send("*")
|     #+x::Send("x") ; multiplication symbol
|
| edit...downvoted, why? weird
| ubermonkey wrote:
| I use the hyphen key, and hit it once for a hyphen or for a minus
| sign, and I use it twice for an em dash.
|
| At some point, many things I type into started replacing "--"
| with an em dash, but my precambrian computer typing muscle memory
| is fine with "hyphenhyphen" meaning "em dash".
|
| I will admit right here in front of god & everybody that I'm
| pretty sure I've never typed an en dash at all.
| MetaWhirledPeas wrote:
| If it's important in English, it should have a key on the
| keyboard. It follows that if it doesn't have a key, it's not
| important.
| fareesh wrote:
| emdashes are on the rise thanks to people copying and pasting
| chatgpt
| kayo_20211030 wrote:
| It might not be completely true that nobody cares, but I feel
| that almost nobody cares.
|
| > comma, a colon, or parenthesis
|
| They're all different. There _is_ a difference between clear
| writing and typesetting. Why mix them up? A narcissism of small
| differences?
| graiz wrote:
| Hyphens - I'm normal, breaking up thoughts. En / Em - I'm an AI
| or I'm using AP style guide to write articles.
| MichaelDickens wrote:
| In that case, I guess I must be an AI--I use em-dashes all the
| time in casual text.
| bilater wrote:
| I'm sick of em dashes cause somehow that's become the tell that it's
| AI-generated text.
| dskhatri wrote:
| For Windows users, PowerToys has a Quick Accent tool, that lets
| you type in an em dash or figure dash by holding down the hyphen
| (-) and then toggling the space bar. Interestingly, the en dash
| is not available.
| pahbloo wrote:
| Fun fact: In Portuguese, the em dash is often used to introduce
| direct discourse, much like double quotes are used in English,
| but only when the direct discourse opens the paragraph. So
| instead of:
|
| "Hello," said John, "how are you today?"
|
| You'd see:
|
| -- Hello -- said John -- how are you today?
| irrational wrote:
| I'm all about spelling things correctly. To, too, two or their,
| there, they're matter. But using the correct dash/hyphen is way
| too pedantic to me. In isolation, I can't tell the difference
| between them.
| NegativeLatency wrote:
| My personal rule is simple: I just use - for everything
| ergocoder wrote:
| DON'T YOU DARE
___________________________________________________________________
(page generated 2025-03-28 23:02 UTC)