[HN Gopher] Who lusts for certainty lusts for lies
___________________________________________________________________
Who lusts for certainty lusts for lies
Author : hprotagonist
Score : 349 points
Date : 2023-09-26 10:50 UTC (12 hours ago)
(HTM) web link (www.etymonline.com)
(TXT) w3m dump (www.etymonline.com)
| laura_g wrote:
| What is it specifically about the 1970/80s that causes this dip?
| Was there an explosion of this academic writing around that era
| or something else to have this effect?
| thfuran wrote:
| That or maths. Though I seem to recall a quote about
| statistics...
| [deleted]
| hprotagonist wrote:
| in the case of ngrams, both!
| thfuran wrote:
| Yes, I think (as the article says) using ngrams can easily
| land you in the camp of telling lies with statistics.
| tensor wrote:
| The authors assert that the ngram statistics for "said" are
| wrong, and imply that they have evidence of the contrary, but
| they don't provide the evidence. Looking at their own website,
| all they provide is google ngram statistics:
| https://www.etymonline.com/word/said#etymonline_v_25922.
|
| This coupled with the huge failing of not displaying zero on the
| y-axis of their graph, and even _interpreting_ the bad graph
| wrong, makes me not believe them at all. A very low quality
| article.
| coldtea wrote:
| A low effort comment. That "said" hasn't declined and risen the
| way shown isn't what needs evidence.
|
| It's the extraordinary claim that it has that does.
|
| That claim is Google's, and before accusing the author of the
| blog, maybe ask how representative their unseen dataset is. Should
| we take statistics with no knowledge of their input set at face
| value because "trust Google"?
| tensor wrote:
| Google isn't making any such claim. It's merely
| providing fun statistics based on their data set. With that
| context, when I read a headline claiming that the statistics
| are "wrong," it would imply that the counts are somehow off.
| Maybe due to a bug in the algorithm or the like.
|
| Instead, we get a strawman put up where they misrepresent
| what the data set is, make up things that its "claiming,"
| fail to investigate the underlying data sources and look into
| "why" they see the trend they see, and also fail to provide
| any alternative data.
|
| It's cheap and snobby grandstanding, ironically complete with
| faulty interpretations of the little data they DO present.
| mattigames wrote:
| But Google is claiming such a thing by calling it "trends",
| which the dictionary defines as "a general direction in
| which something is developing or changing.", if they didn't
| want to create such misunderstandings they would just call
| it "word frequency on Google books" so the biases of the
| data would be a lot more clear.
| prepend wrote:
| It's hard to present evidence because there's only one source.
| So the article basically calls out flaws in the methodology of
| Google Books/Ngram.
|
| I think this is reasonable. As otherwise we end up accepting
| things that exist solely, but are flawed. Just because
| something exists and is easy to use doesn't mean it's right.
|
| Just like the answer to "the most tweeted thing is X therefore
| it is most popular and important" does not require a separate
| study to find the truth. It's acceptable just to say "this is a
| stupid methodology, don't accept it just because that's what
| twitter says."
| lolc wrote:
| A decline to half the usage of "said" within 6 decades,
| followed by a recovery to the previous level within two
| decades? Show me evidence that the English language changed so
| fast in that way. It's extraordinary and you'd have to bring
| something convincing. Otherwise I believe their hypothesis and
| their conclusion that ngrams are bunk.
|
| Yeah they interpreted the "toast" graph wrong. They should be
| more careful to read shitty graphs that cut off at the low
| point.
| pixelesque wrote:
| It's possible (but I think unlikely) it could be somewhat due
| to different usage of words than the English language
| changing completely (which clearly didn't happen).
|
| i.e. maybe instead of lots of books having direct text like
| "David said" or "Dora said", over time there was a trend to
| use a different more varied/descriptive way of describing
| that, i.e. "David replied" or "Dora retorted"?
| lolc wrote:
| Yea there may be a shift in usage hidden in those numbers.
| As this article laments, we can't use ngrams to measure the
| development of usage between said, replied, and retorted.
| tensor wrote:
| It depends entirely on what the data set is, and to conclude
| that it's "wrong" you'd have to consider the underlying data
| too. Google ngrams makes no claim to be a consistent
| benchmark type data set. Over time the content it's based on
| shifts, which can cause effects like this.
|
| To make any sort of claim like "this word's usage changes
| over time" in an academic sense you'd need to include a
| discussion of the data sources you used and why those are
| representative of word usage over time. The fact that they'd
| even try to use google ngrams in this way shows how little
| they actually researched the topic.
|
| Google ngrams is a cute data set that can sometimes show
| rough trends, but it's not some "authoritative source on
| usage over time" and it doesn't claim to be.
|
| The authors, on the other hand, are claiming to be
| authoritative and thus the burden of evidence on their claims
| is far far far higher. I didn't even get into their
| completely unobjective and vague accusations of "AI" somehow
| doing something bad. Ngrams don't involve AI, it's simple
| word counting.
| lolc wrote:
| The way I read it, the article was a rant about how people
| shouldn't be using ngrams to prove things.
| lolinder wrote:
| EtymOnline isn't in the business of tracking shifts in the
| popularity of words over time, they set out to track shifts in
| _meaning_. So it's understandable that they don't have any
| specific contrary evidence in their listing for "said".
|
| As for why they don't include the evidence in TFA, as others
| have noted, it's the extraordinary claim that "said" dropped to
| nearly 1/3 of its peak usage that needs extraordinary evidence
| backing it up. It's plenty sufficient for them to say "this
| doesn't make any sense at all on its face, and is most likely
| due to a major shift in the genre makeup of Google's dataset".
| wrsh07 wrote:
| I think what you want is for someone (yourself, me, the author)
| to review newspapers or some similar source and determine how
| the frequency percent changes over time for the word "said".
|
| This is a reasonable request, but I also think it's fine for
| the author to state it _as an expert_ that newspapers continued
| using said at a similar frequency. The story they tell is
| plausible, and I don't really think the burden of proof is on
| them.
| vlz wrote:
| While the point made by the authors is certainly a valid one,
| it's a bit sneaky and not very fitting to their overall message
| that they have the Y-axes on the ngram graphs not 0-indexed. This
| makes the google results seem more extreme than they in fact are
| and is a bit of misdirection in itself.
|
| Compare e.g. to the actual ngram viewer which seems to index by 0
| per default:
|
| https://books.google.com/ngrams/graph?content=said&year_star...
|
| https://books.google.com/ngrams/graph?content=said&year_star...
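| A minimal sketch of the axis point above, using made-up relative
| frequencies rather than real Ngram values. The function names and
| the 0.55 axis floor are assumptions chosen for illustration. It
| compares the real fractional drop with the share of the plotted
| height the line appears to fall when the y-axis is truncated.

```python
def actual_drop(peak: float, trough: float) -> float:
    """Fractional drop measured against a zero baseline."""
    return (peak - trough) / peak


def visual_drop(peak: float, trough: float, axis_min: float) -> float:
    """Share of the plotted height the line falls when the y-axis
    starts at axis_min and the peak touches the top of the plot."""
    return (peak - trough) / (peak - axis_min)


# Hypothetical relative frequencies, not real Ngram data.
peak, trough = 1.00, 0.60

print(f"actual drop:                   {actual_drop(peak, trough):.0%}")
print(f"apparent drop, axis from 0:    {visual_drop(peak, trough, 0.0):.0%}")
print(f"apparent drop, axis from 0.55: {visual_drop(peak, trough, 0.55):.0%}")
```

| With these numbers a 40% decline fills close to nine tenths of
| the truncated plot, which is roughly how "dropped" can come to
| read as "almost vanished".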
| boxed wrote:
| Such a shame too as the point would be equally valid without
| the graph-lies.
| chefandy wrote:
| Kind of. The author could fix a lot of their problems with
| the very prominent dropdown above the graph letting them
| select the collection-- English fiction for example. The long
| s character can be tricky for OCR, but is not likely relevant
| to most people's casual use of the tool. I worked on a team
| that overcame it in a high volume scanning project so they
| should be able to correct that with software and their
| existing page images. The plurals criticism is just wrong--
| you can even do case sensitive searches.
|
| It's not perfect, but it's not useless, and it's not a
| "lie"-- it's just a blunt instrument. Even if the criticism
| was factually correct, 'proving' that you can't do fine work
| with a blunt instrument is of dubious value.
|
| I think a lot of folks around here are super thirsty to see
| big tech companies get zinged and when it happens, their fact
| checking skills suffer.
| [deleted]
| stefantalpalaru wrote:
| [dead]
| nerdponx wrote:
| This is the fundamental problem of data analysis: your analysis
| is only as good as your data.
|
| This is not an easy problem.
|
| It's hard in general to evaluate data quality: How do we know
| when our data is good? Are we sure? How do we measure that and
| report on it?
|
| If we do have some qualitative or quantitative assessment of data
| quality, how do we present it in a way that is integrated with
| the results of our analysis?
|
| And if we want to quantitatively adjust our results for data
| quality, how do we do that?
|
| There are answers to the above, but they lie beyond the realm of
| a simple line chart, and they tend to require a fair amount of
| custom effort for each project.
|
| For example in the Google Ngrams case, one could present the data
| quality information on a chart showing the composition of data
| sources over time, broken out into broad categories like
| "academic" and "news". But then you have to assign categories to
| all those documents, which might be easy or hard depending on how
| they were obtained. And then you also have to post a link to that
| chart somewhere very prominently, so that people actually look at
| it, and maybe include some explanatory disclaimer text. That
| would help, but it's not going to prevent the intuitive reaction
| when a human looks at a time series of word usage declining.
|
| Maybe a better option is to try to quantify the uncertainty in
| the word usage time series and overlay that on the chart. There
| are well-established visualization techniques for doing this. But
| how do we quantify uncertainty in word usage? In this case, our
| count of usages is exact: the only uncertainty is uncertainty
| related to sampling. In order to quantify uncertainty, we must
| estimate how much our sample of documents deviates from all
| documents written at that time. It might be doable, but it
| doesn't sound easy. And once we have that done, will people
| actually interpret that uncertainty overlay correctly? Or will
| they just look at the line going down and ignore the rest?
|
| Your analysis is only as good as your data. This has been a
| fundamental problem for as long as we have been trying to analyze
| data, and it's never going to go away. We would do well to
| remember this as we move into the "AI age".
|
| It also says something about us as well: throughout our lives, we
| learn from data. We observe and consider and form opinions. How
| good is the data that we have observed? Are our conclusions
| valid?
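| One hedged sketch of the uncertainty overlay discussed above: a
| percentile bootstrap that resamples whole documents for a single
| year. The toy corpus and the helper names are hypothetical. Note
| the limit: this only measures variability under the assumption
| that the documents are a random sample of what was written, which
| is exactly the assumption in doubt, so it cannot reveal a shift
| in genre mix.

```python
import random


def word_frequency(docs, word):
    """Relative frequency of `word` across a list of tokenized documents."""
    total = sum(len(doc) for doc in docs)
    hits = sum(doc.count(word) for doc in docs)
    return hits / total if total else 0.0


def bootstrap_interval(docs, word, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval, resampling documents with replacement."""
    rng = random.Random(seed)
    estimates = sorted(
        word_frequency([rng.choice(docs) for _ in docs], word)
        for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Toy corpus for one year (hypothetical): novels use "said", papers do not.
novel = ["she", "said", "he", "said", "then", "left"]
paper = ["we", "measure", "the", "effect", "of", "dose"]
docs_for_year = [novel] * 6 + [paper] * 4

point = word_frequency(docs_for_year, "said")
low, high = bootstrap_interval(docs_for_year, "said")
print(f"frequency={point:.3f}, 95% interval=({low:.3f}, {high:.3f})")
```

| Repeating this per year would give a band to draw around the
| line, though whether readers would look at the band is the open
| question raised above.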
| gcanyon wrote:
| From the comments on that page: "Do publishers still order many
| carloads of "is" each year during spring thaw..."
|
| In Dictionopolis they do! Any Phantom Tollbooth peeps here?
|
| https://en.wikipedia.org/wiki/The_Phantom_Tollbooth
| gitgud wrote:
| Reminds me of a feeling I had when solving a jigsaw puzzle:
|
| _Everything must fit together to reveal the big picture!_ ...
|
| In reality things almost _never_ fit together to reveal some big
| picture... so trying to make them fit like puzzle pieces often
| leads to false conclusions
| digitalsushi wrote:
| When a measure (certainty) becomes a target, it ceases to be a
| good measure (lies)
| gniv wrote:
| BTW, that glyph should have a small bar on the left, but I don't
| see it in the article (in Chrome on Mac).
|
| https://www.compart.com/en/unicode/U+017F (that looks more like
| an s)
|
| Edit: But I see it in fixed-width font: s
| bradrn wrote:
| > that glyph should have a small bar on the left
|
| It depends on the typeface. My browser's fixed-width font, for
| instance, doesn't display a bar.
| brightball wrote:
| "Only a fool is sure of anything, the wise man is always
| guessing." - MacGyver
| dotsam wrote:
| > It doesn't look like an indicator of the diachronic change in
| the popularity...
|
| I thought all change is diachronic.*
|
| I looked it up and found out that 'diachrony' is a term of art in
| linguistic analysis, contrasting with synchronic analysis.
|
| https://en.wikipedia.org/wiki/Diachrony_and_synchrony
|
| *Edit: I initially thought that saying 'diachronic change' was
| like saying 'three-sided triangle'. But thinking about it, I
| suppose things do change in space but not time, e.g. 'the pattern
| changes abruptly'
| robertlagrant wrote:
| > Who Lusts for Certainty Lusts for Lies
|
| Well, maybe[0].
|
| [0] with thanks to https://xkcd.com/552
| diogenes4 wrote:
| At this point I'm waiting for data to show up validating that
| google ngrams has use.
| taeric wrote:
| Is this that the n-grams are wrong, or that they are limited in
| what you can do/say with them? I find the data fun, but I'm not
| entirely sure what to make of it. You will be doing a query on
| past books on today's lexicon. Which just feels wrong.
|
| As an easy example that I know, if you search for "the", you will
| not find a lot of hits. Which, is mostly fair, as historically we
| know that "th" dropped off around the 1400s. That said, add in
| "ye" and you see a ton of its use.
|
| Is that an intentional feature of n-grams? Feels more like an
| encoding mistake passed down through the ages. Would be like
| getting upset at the great vowel shift and not realizing that our
| phonetic symbols are not static universal truths.
| bluetomcat wrote:
| You can never construct a representative image of the past. You
| are operating with a limited amount of sources which have
| survived in one form or another. They are not evenly distributed
| across time and space. There is an inherent "data loss" problem
| when a person dies - gone are all the impressions, unwritten
| experiences, familiar smells. Even a living person's memory may
| not be reliable at one point.
| psychoslave wrote:
| That's why I've always found it so strange that only those whose
| social representation is distorted by fame/wealth end up with a
| Wikipedia biography.
| not_knuth wrote:
| Wikipedia is not meant to be an archive of _all_ information.
| It's meant to be an encyclopedia of things that are
| _notable_ [1], which is probably where the confusion comes
| from.
|
| As you can imagine, the topic of what notability is, has been
| discussed at length since Wikipedia's inception [2].
|
| [1] Notability according to Wikipedia
| https://en.wikipedia.org/wiki/Wikipedia:Notability
|
| [2] Oldest Wikipedia talk comments I could find on Notability
| https://en.m.wikipedia.org/w/index.php?title=Special:History...
| pintxo wrote:
| At one point? Human memory is surprisingly unreliable.
|
| One example to test for yourself:
| https://youtu.be/vJG698U2Mvo?si=16fwk8wG8Yyhim5t
| psychoslave wrote:
| That is not even memory bias here.
|
| Sure, what you pay attention to will impact what you
| remember, but this experiment goes further and shows how your
| attention can be manipulated to make you blind to plotted events.
| Miraltar wrote:
| Exactly, but the point is still valid. The Mandela Effect is a
| great example of it.
| ongy wrote:
| Serious question
|
| Are you supposed to not see the gorilla? I assumed it's the
| trap and there's some slightly less obvious catch in there.
| djha-skin wrote:
| The best part of this article is perhaps the following critique
| of ngrams and by extension their popular use in modern
| algorithms:
|
| > The text of Etymonline is built entirely from print sources,
| and is done entirely by human beings. Ngrams are not. They are
| unreliable, a sloppy product of an ignorant technology, one made
| to sell and distract, _one never taught the difference between
| "influence" and "inform."_
|
| > Why are they on the site at all? Because now, online, _pictures
| win and words lose_. The war is over; they won.
|
| _One never taught the difference between "influence" and
| "inform"._ What a scathing rebuke of our modern world and the
| social media that is part of it. Algorithms that attempt to
| quantify human speech and interaction and get it wrong most of
| the time in their quest to maximize their owner's profits.
|
| This somber warning is especially poignant in an age more and
| more ruled by generative AI, which I'm told is essentially an
| ngram predictor.
| acyou wrote:
| Influence and inform are two sides of the same moral coin,
| where we claim others' ideas aren't their own, whereas we are
| the virtuous informed ones who draw our own conclusions.
|
| The low-pass filter of the mind only allows in what fits
| somewhere inside the existing framework. If you don't reject
| something, then being informed by it and being influenced by it
| are the same thing. In that framework, people who claim to be
| informed come off as high and mighty and a little lacking in
| self consciousness.
| gpderetta wrote:
| I inform, you influence, he propagandizes.
| thrdbndndn wrote:
| > The text of Etymonline is built entirely from print sources,
| and is done entirely by human beings. Ngrams are not.
|
| I'm confused about this part actually. I assume by "entirely
| from print sources" it means it does not include digital
| sources? That doesn't sound very relevant to the issues
| mentioned in the article though: unless it uses the "complete"
| set of _all_ print sources, it totally could have the same
| skewed-dataset issues too; and humans can make the same mistake
| as OCR does.
| sudobash1 wrote:
| Etymonline compiles the information on etymology and
| historical usage from printed books (eg the Oxford English
| Dictionary). That is what is being referred to here. They are
| not having humans tally up different words from books. That
| data is entirely from ngrams.
| crazygringo wrote:
| The n-grams aren't _wrong_, but it is a real problem that the
| underlying corpus distribution changes massively over time (in
| this case, proportion of academic vs. non-academic works).
|
| This is a really devilish problem with no easy answer.
|
| Because on the one hand, it's certainly easy enough to normalize
| by genre -- e.g. fix academic works at 20%, popular magazines at
| 20%, fiction books at 40%, and so forth.
|
| But the problem is that the popularity of genres changes over
| time separately in terms of supply and demand, as well as
| consumption of printed material overall. Fiction written might
| increase while fiction consumed might decrease. Or the
| consumption of books might decrease as television consumption
| increases.
|
| So there isn't any objectively "right" answer at all.
|
| But it would be nice if Google allowed you to plot popularity _by
| genre_ -- I think that would help a lot in terms of determining
| where and how words become more or less common.
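| A small sketch of the fixed-genre normalization described above,
| with hypothetical counts and an assumed 80/20 weighting (neither
| comes from Google's data). It contrasts the pooled frequency a
| single line chart would plot with a frequency computed under a
| constant genre mix.

```python
# Hypothetical counts per year: genre -> (occurrences of the word, total tokens).
counts = {
    1950: {"fiction": (8_000, 800_000), "academic": (400, 200_000)},
    1980: {"fiction": (3_000, 300_000), "academic": (1_400, 700_000)},
}

# Assumed fixed genre mix; picking these weights is itself a judgment call.
FIXED_WEIGHTS = {"fiction": 0.8, "academic": 0.2}


def raw_frequency(year_counts):
    """Pooled frequency across genres, ignoring the genre mix."""
    hits = sum(h for h, _ in year_counts.values())
    total = sum(t for _, t in year_counts.values())
    return hits / total


def fixed_mix_frequency(year_counts, weights):
    """Weighted average of per-genre frequencies under a constant mix."""
    return sum(weights[g] * h / t for g, (h, t) in year_counts.items())


for year, year_counts in sorted(counts.items()):
    print(
        year,
        f"raw={raw_frequency(year_counts):.4f}",
        f"fixed_mix={fixed_mix_frequency(year_counts, FIXED_WEIGHTS):.4f}",
    )
```

| In this toy data the per-genre rates never change, yet the pooled
| number falls by almost half because the corpus tilts academic.
| The fixed-mix number stays flat, but only because someone chose
| the weights, which is the "no objectively right answer" problem.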
| hyperific wrote:
| It seems to me that Google Ngram isn't _wrong_. It's reporting
| statistics on the words it correctly identified in the corpus.
| The problem is the context of the statistics. You may somewhat
| confidently say the word "said" dips in usage at such and such
| time _in the Google Books corpus_. You can more confidently say
| it dips at such and such time for the subset of the corpus for
| which OCR correctly identified every instance of the word. But
| you can't make claims in a broader context like "this word
| dipped in usage at such and such time" without having sufficient
| data.
| dredmorbius wrote:
| And this is why _sampling methodology_ is so much more vastly
| important in drawing inferential population statistics than
| _sample size_.
|
| Sample 1 million books from an academic corpus, and you'll turn
| up a very different linguistic corpus than selecting the ten
| best-selling books for each decade of the 20th century.
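| A toy simulation of the methodology-versus-size point above. All
| rates and the genre mix are invented for illustration. A large
| sample drawn only from academic text is compared with a much
| smaller sample drawn from the assumed true mix.

```python
import random

# Assumed per-token probability of the word "said" in each genre (hypothetical).
RATE = {"fiction": 0.010, "academic": 0.002}
# Assumed true genre mix of everything published (hypothetical).
POPULATION_MIX = {"fiction": 0.7, "academic": 0.3}

TRUE_RATE = sum(POPULATION_MIX[g] * RATE[g] for g in RATE)


def estimate(n_tokens, mix, rng):
    """Sample n_tokens tokens with genres drawn per `mix`; return the observed rate."""
    genres = list(mix)
    weights = [mix[g] for g in genres]
    hits = 0
    for genre in rng.choices(genres, weights=weights, k=n_tokens):
        if rng.random() < RATE[genre]:
            hits += 1
    return hits / n_tokens


rng = random.Random(42)
big_biased = estimate(500_000, {"fiction": 0.0, "academic": 1.0}, rng)
small_random = estimate(10_000, POPULATION_MIX, rng)

print(f"true rate:            {TRUE_RATE:.4f}")
print(f"500k academic-only:   {big_biased:.4f}")
print(f"10k representative:   {small_random:.4f}")
```

| Under these assumptions the sample that is fifty times larger
| lands much further from the true rate, because more data drawn
| the wrong way only tightens the estimate around the wrong value.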
| gmd63 wrote:
| Just as "it depends" is a meme for economists, "need more data"
| is the galaxy-brain statistician meme.
|
| Until you've solved the grand unified theory, you can never be
| fully confident in the completeness of your data or statistical
| inferences.
|
| What's wrong is misleading the public away from this
| understanding.
| thomasfromcdnjs wrote:
| Does this criticism of ngrams also translate to keyword trends
| when considering SEO/SEM?
| andrewflnr wrote:
| The title is true for a lot more areas of life than linguistics.
| There are no shortcuts to truth, DVD anyone who tries to offer
| you one is probably trying to sell you something.
| madsbuch wrote:
| The title is about certainty and not truth.
|
| > Who Lusts for Certainty Lusts for Lies
|
| I think this is one of those one-liners that sound good but are
| bogus on closer inspection.
|
| The article talks about history. In that context it might
| make sense as it is hard to say something with certainty.
|
| But in every speech I can say things with certainty without
| lying.
|
| If we furthermore drag the word certainty out of a philosopher's
| grip and apply a layman meaning to it, then many things are
| certain as the word can also mean commitment.
| RockyMcNuts wrote:
| Who demands certainty demands bullshit would be more
| accurate.
| Delk wrote:
| I don't think it's bogus.
|
| I've seen people who strongly crave (a feeling of)
| certainty prefer simplified categorizations and false
| absolutes to complexity that doesn't offer absolute certainty
| and discrete clarity.
|
| Similarly, some things aren't readily quantifiable, and in
| some cases any quantification might be a great
| oversimplification at best. In those cases wanting a
| quantified and measurable answer instead of a more complex
| answer with less (of a feeling of) certainty can amount to
| wanting a lie. Or at least to wanting an answer that feels a
| lot more certain and true than it actually is.
|
| I think that's what the post is about.
|
| Of course the title isn't absolutely true either. Of course
| you can say and find things that are true and (to a good
| approximation) certain. But that's not really what the post
| or its title are trying to say.
| speak_plainly wrote:
| There's an entire field of study dedicated to these puzzles:
| epistemology.
|
| https://plato.stanford.edu/entries/certainty/
| AnimalMuppet wrote:
| In every speech you can say _some_ things with certainty
| without lying.
|
| But I think the point of the saying is in the other
| direction. If you are _listening_ to a speech, the things
| that the speaker can say with certainty may not be the ones
| where you want certainty. And if you demand certainty on
| those things, you will find those who will give it to you.
| But the certainty itself is a lie - that's why the speaker
| can't (honestly) say those things with certainty.
|
| What is the optimum political program for the United States?
| There are plenty of people willing to tell you with (apparent)
| certainty what the answer is. The truth is that nobody knows
| with certainty, and so the answers that sound certain are
| lies. The actual program may be correct - _may_ be - but the
| certainty itself is a lie.
|
| This is often true in linguistics, and history, and politics,
| and economics. Don't demand certainty where there is none.
| ta8645 wrote:
| This hits close to home with all the appeals to authority over
| the last few years. With absolute confidence they were holders
| of the truth, "trust the science!".
| andrewflnr wrote:
| Kinda, but most of the anti-scientific bullshit out there is
| a symptom of precisely this phenomenon. _Actual_ science
| cannot offer absolute certainty, so people reach for whatever
| alternate theory offers the feeling of certainty. Blind faith
| in "the science" kind of works, and even gets pretty decent
| practical results, but you know what's structurally really
| hard to disprove and thus amenable to feeling certain?
| Conspiracy theories!
| ta8645 wrote:
| > Conspiracy theories!
|
| I hear what you're saying. In the end, we have to believe
| _something_ -- on less than perfect information.
|
| But understanding human nature, isn't a conspiracy theory.
| And accepting obviously overreaching statements of "fact",
| that literally nobody had the data to state unequivocally,
| is not following the science.
|
| It wasn't so long ago, that most people understood big
| pharma was a profit seeking machine, that wasn't primarily
| motivated by what is best for humanity. Overstating the
| risks of Covid, and pretending that we faced an existential
| threat, made everyone forget that truth, and
| unquestioningly believe that only the purest of intentions
| motivated the industrial/media response.
| gilleain wrote:
| What does "DVD anyone" mean?
|
| (Perhaps a roundabout way to say "Make obsolete", as a way to
| say "Get rid of"?)
| mancerayder wrote:
| I just can't CD what that means either.
| Tactician_mark wrote:
| It's a Blu-ray mystery to me.
| psychoslave wrote:
| It fades away vinyl from my ens.
|
| https://en.wiktionary.org/wiki/ens
| compiler-devel wrote:
| The redditification of HN is sad. With reddit de facto
| purging third-party apps with increased API prices, we
| now see reddit-tier conversations spamming message boards
| like HN.
| sk0g wrote:
| https://news.ycombinator.com/newsguidelines.html
|
| > Please don't post comments saying that HN is turning
| into Reddit. It's a semi-noob illusion, as old as the
| hills.
| decremental wrote:
| [dead]
| thechao wrote:
| Typo insertion where the autocorrect hallucinates a word?
| Happens to me sometimes...
| andrewflnr wrote:
| This. Sorry everyone.
| adrianmonk wrote:
| It's probably supposed to be "and" instead of "DVD". Both
| words have a similar shape on the keyboard, especially if
| you're doing swipe-style smartphone keyboard input.
| cainxinth wrote:
| Agnostics have been saying this for years (jk... sorta).
| guardian5x wrote:
| You are not wrong there. This title could also be an article
| about atheism and religion.
| lvass wrote:
| Surely you meant to write agnostics.
| cainxinth wrote:
| Corrected it
| ttoinou wrote:
| The y-axes do not start at zero. So basically the author doesn't
| know how to read a graph... what am I missing?
| dahart wrote:
| > Ngram says toast almost vanishes from the English language by
| 1980, and then it pops back up.
|
| The Ngram plot does not say that. It shows usage dropping ~40%
| (since 1800). It's indeed a problem that the graph Y axis doesn't
| go to zero, as others have pointed out. But did the etymonline
| authors really not notice this before declaring incorrectly what
| it says? I would find that hard to believe (especially
| considering the subsequent "see, no dip" example that has a zero
| Y and a small but visible plateau around 1980), and it's ironic
| considering the hyperbolic and accusatory title and opening
| sentence.
| lolinder wrote:
| The graph axis isn't the only problem. The word "toast" did not
| drop in usage by 40%, Google's dataset shifted dramatically
| towards a different genre than it was composed of previously.
| I've been in conversations with people trying to explain those
| drops in the 70s, and no one (myself included) realized that it
| was such a dramatic flaw in the data.
| bee_rider wrote:
| Is there no way to filter out particular data sets? This
| seems like a pretty huge limitation.
| dahart wrote:
| That's fair, the article has a very valid point, which would
| be made even stronger without the misreading of the plots
| they're critiquing, whether it was accidental or intentional.
| I always thought Ngrams were weird too, I remember in the
| past thinking some of the dramatic shifts it shows were
| unlikely.
| tantalor wrote:
| Why the title change?
|
| Title on the site is "Who Lusts for Certainty Lusts for Lies"
|
| Title here is "Google Ngram Viewer n-grams are wrong"
| 0xfae wrote:
| HN in general doesn't like "editorialized" titles. HN titles
| are meant to be a factual representation of what you are going
| to read without the attention grabbing (albeit clever) title.
| tantalor wrote:
| Er no.
|
| > Otherwise please use the original title, unless it is
| misleading or linkbait; don't editorialize.
|
| The "don't editorialize" guideline is meant for the
| _submitter_ to not change the title to make some point.
|
| The site can & should use whatever title it wants. So be it
| if they want to editorialize. That's their prerogative.
| dredmorbius wrote:
| Both your and GP comment are inaccurate and/or unclear.
|
| HN _prefers_ but does not _require_ the original title.
|
| HN _does not permit_ submitter editorialising.
|
| Where the original title is clickbait, _which may include
| editorialising_, HN requests that submitters change the
| title, if at all possible to some phrase within the
| article.
|
| Another de facto rule concerns "title fever", which is when
| a title is so distracting that it overwhelms the content of
| the article in discussion.
|
| From the guidelines:
|
| _If the title includes the name of the site, please take
| it out, because the site name will be displayed after the
| link._
|
| _If the title contains a gratuitous number or number +
| adjective, we'd appreciate it if you'd crop it. E.g.
| translate "10 Ways To Do X" to "How To Do X," and "14
| Amazing Ys" to "Ys." Exception: when the number is
| meaningful, e.g. "The 5 Platonic Solids."_
|
| _Otherwise please use the original title,_ unless it is
| misleading or linkbait; _don't editorialize._
|
| <https://news.ycombinator.com/newsguidelines.html>
|
| Some of dang's comments on the issue:
|
| - On changing original title (from yesterday, and NPR to
| boot): <https://news.ycombinator.com/item?id=37625424>.
| Also: <https://news.ycombinator.com/item?id=36655892>
|
| - On substituting a phrase from the article:
| <https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...>
|
| - On submitter editorialising:
| <https://news.ycombinator.com/item?id=8357252>
| <https://news.ycombinator.com/item?id=35163133>
|
| - Distracting titles:
| <https://news.ycombinator.com/item?id=37137478>.
| Particularly cases where "the thread will lose its mind":
| <https://news.ycombinator.com/item?id=22176686>
|
| - "Title fever": (Beginning 4 'graphs in)
| <https://news.ycombinator.com/item?id=20429573>
| AugustoCAS wrote:
| I'm going to use that title on the next conversations I have
| about estimates, in particular in the context of 'we need to know
| that this piece of work will be started in 4 months and finished
| in 8'. Those conversations definitely suck for me.
| js8 wrote:
| Though you should also remember "who lusts for promotion lusts
| for telling lies".
| CapitalistCartr wrote:
| Only one goal can be first. If you want to set absolute dates,
| all other requirements must be subordinate to that. In which
| case, sure, we can absolutely meet it.
| ChrisMarshallNY wrote:
| There's that classic poster that you see in almost every auto
| mechanic's shop: Good, Fast, Cheap. Pick 2.
| nuancebydefault wrote:
| Not so rarely, you even need to settle for picking 1
| jklinger410 wrote:
| This title is an absolute banger
| [deleted]
| d-lisp wrote:
| [flagged]
| gascoigne wrote:
| Surely if you have story pointed and T-shirt sized your epics
| correctly that shouldn't be difficult? /s
| dumbfounder wrote:
| This guy sucks.
| [deleted]
| fenomas wrote:
| And boo, incidentally, to whoever changed the HN title - from
| the most memorably evocative title this site has ever seen to
| one of the blandest.
| etrevino wrote:
| What was it? I arrived too late.
| fenomas wrote:
| Sorry, HN previously had TFA's actual title - "Who Lusts
| for Certainty Lusts for Lies".
| scubbo wrote:
| I, uhhhh.....I would like to know what TFA is meant to
| stand for, because I assume it is not "the sucking
| article", but that was my first thought. Maybe
| "featured"? Google is only giving me "Teach For America"
| or "Trade Facilitation Agreement".
| klyrs wrote:
| Does "fornicating" sound more polite to you?
| iudqnolq wrote:
| it is the fucking article. or "featured" if you're
| feeling classy.
| mjochim wrote:
| I like to read it as The Fine Article.
| idrios wrote:
| This is the kind of question that doesn't need to be
| answered with certainty. "The fucking article" is
| definitely the most fun interpretation of "TFA".
| etrevino wrote:
| lol, that's pretty good, I agree with you.
| djsavvy wrote:
| Looks like it's been changed back! What was the "bland"
| title in the middle?
| Intralexical wrote:
| "Google Ngram Viewer n-grams are wrong".
| [deleted]
| dahart wrote:
| The article title is certainly provocative, yes, and that's
| the problem. Do you want clickbait titles? The article's
| title is a combination of a platitude, an inaccurate and/or
| irrelevant statement, and an implied inflammatory accusation.
| Swapping the title for the more accurate more informational
| less provocative first line is much better for me, but maybe
| true that not flinging around the word "lies" could result in
| fewer clicks.
| fenomas wrote:
| I don't think "Ngrams are wrong" is what TFA is about. The
| author isn't an expert on Ngrams and he's not sharing any
| new information about them; what he's really talking about
| is how data about language is unreliable, and why Ngram
| images are on his site even though he knows they're flawed.
| Personally, I found the original title truer to the article
| than the current one.
| zem wrote:
| the word "clickbait" is flung around way too readily these
| days. a good title is _supposed_ to make you want to read
| the article, and at its best it is an artistic flourish
| that enhances the overall piece. and personally, i love
| that. i enjoy seeing how writers (or editors) come up with
| good titles, and the fun and interesting ways they relate
| to the text of the piece. i enjoy when the title is clearly
| an allusion or reference to something, and chasing it down
| leads me to learn something new. and i even enjoy when the
| title is just a pun or play on words, because writers live
| for moments like that :)
|
| in this case i definitely felt "wow, that's an interesting
| quote, and i can see what they are getting at. let's read
| the article to see how it's substantiated or used as a
| springboard".
|
| clickbait is more "we have some amazing!!!!! information to
| tell you but to find out what you will have to read the
| article", e.g. the classic listicle format "10 things we
| imagined a beowulf cluster of - number 4 will shock you!",
| the spammy "one weird trick doctors don't want you to know"
| or the tabloid "john brown's shocking affair!". and yes,
| that sort of thing is a plague on the internet and i would
| not like to see more of it, but also that is not what is
| going on here.
| ComputerGuru wrote:
| I personally feel like more people will click with this new
| title. The old one was far too vague and ambiguous for a news
| aggregation site. I thought the old title would be about
| scientific papers and trying too hard to get definitive
| answers out of them.
| dredmorbius wrote:
| The title and site reward those who'd click through on the
| original rather than the bland substitute.
| fenomas wrote:
| Horses for courses, but to me the original title was the
| forest and the stuff about Ngrams was the trees. As such I
| found TFA interesting, even though I have no interest in
| Ngrams or whether they're correct (which is why I
| definitely would not have clicked on the current title).
| setgree wrote:
| adding "horses for courses" to my lexicon, TY :)
| 1970-01-01 wrote:
| At first glance, I thought it was a translated Latin phrase.
|
| desiderat certum, desiderat falsitates
| PaulHoule wrote:
| Don't like the title, at least for this article.
|
| When it comes to results like this it is more "lusting for
| clickbait" or the scientific equivalent thereof. (e.g. papers in
| _Science_ and _Nature_ aren't really particularly likely to be
| right, but they are particularly likely to be outrageous,
| particularly in fields like physics that aren't their center)
|
| On the other hand, "Real Clear Politics" always had a toxic
| sounding name to me since there is nothing "Real" or "Clear"
| about politics: I think the best book about politics is Hunter S.
| Thompson's _Fear and Loathing on the Campaign Trail '72_ which is
| a druggie's personal experience following the candidates around
| and picking up hitchhikers on the road at 3am and getting strung
| out on the train and having moments of jarring sobriety like the
| time when he understood the parliamentary maneuvering that won
| McGovern the nomination while more conventional journalists were
| at a loss.
|
| What I do know is 20 years from now an impeccably researched book
| will come out that makes a strong case that what we believed
| about political events today was all wrong and really it was
| something different. In the meantime different people are going
| to have radically different perspectives and... that's the way it
| is. Adjectives like "real" and "clear" are an attempt to shut
| down most of those perspectives and pretend one of those
| viewpoints is privileged. Makes me think of Baudrillard's
| thorough shitting on the word "real" in _Simulacra and
| Simulation_ which ought to completely convince you that people
| peddling the fake will be heralded by the word "real".
|
| (Or for that matter, that Scientology calls itself the "science
| of certainty.")
| paulsutter wrote:
| And it will also be wrong.
|
| > 20 years from now an impeccably researched book will come out
| that makes a strong case that what we believed about political
| events today was all wrong and really it was something
| different
|
| The one good thing about politics is that the motives are
| crystal clear, politicians want to stay in power first, and
| only secondarily want to improve things.
|
| Once you know this, everything makes sense. Even if we never
| find out what "really" happened
| Karellen wrote:
| > politicians want to stay in power first, and only
| secondarily want to improve things.
|
| The politicians who want to be in power first, and only
| secondarily want to improve things, tend to be the
| politicians in power.
|
| Politicians who want to improve things first do exist, but
| they tend not to achieve power, because power is not their
| goal, and they are out-maneuvered by the first type.
|
| Notably, politicians who want to improve things are easily
| side-tracked by suggesting that their proposed policy is not
| the best way to improve things, and that some other way would
| be better. This explains to some degree a lot of infighting
| on the left, because many do want to genuinely help, but it's
| never 100% clear what the best way to help is. It also
| explains why the right can put aside major differences of
| opinion (2A is important to fight the government who can't be
| trusted, but support the troops and arm the police!) to
| achieve power, because acquiring and maintaining power is
| more important than exactly what you plan to do with it.
| Vt71fcAqt7 wrote:
| >2A is important to fight the government who can't be
| trusted, but support the troops and arm the police!
|
| I fail to see the contradiction here. 2A proponents would
| say that 2A is there for when the government goes wrong, or
| "when in the Course of human events, it becomes necessary
| for one people to dissolve the political bands which have
| connected them with another." At all other times, however,
| it would be up to the government to enforce the law and
| protect the people. Destroying the state is a different
| ideology.
|
| (To be clear, the last few wars may not have been about
| protecting the people. But that the US has not been
| attacked since Pearl Harbor may be a result of the
| investment made in "defence" since then, as well as
| favourable borders etc.)
|
| In any case 'both sides' have people who actually
| care about society. And there are people on the left who
| may simply want power, and complex people who seem to be a
| bit of both (for example perhaps Lyndon Johnson depending
| on how you see him).
| bilbo0s wrote:
| _politicians want to stay in power first, and only
| secondarily want to improve things._
|
| In all honesty, many don't even want to improve things. Most
| people with power, love power. It's contrary to their nature
| to change a system that confers power on them. That's true
| not just in your own nation but in any: the people in power
| will be resistant to change.
| PaulHoule wrote:
| That's as close as you will get to a master narrative but it
| isn't all of it.
|
| Politicians aren't always sure what will win for them, often
| face a menu of unappetizing choices and have other
| motivations too. (Quite a few of the better Republicans have
| quit in disgust in the last decade: I watched the pope speak
| in front of congress flanked by Joe Biden, then VP and John
| Boehner, then House Speaker when the pope obliquely said they
| should start behaving like adults and then Boehner quit a few
| days later and got into the cannabis business.)
|
| I was an elected member of the state committee of the Green
| Party of New York and found myself arguing against a course
| of action that I emotionally agreed with, thought was a
| tactical mistake, and that my constituents were (it turns out
| fatally) divided about. It was a strategic disaster in the
| end.
| paulsutter wrote:
| You're right, I should have added that politics is also
| extremely difficult and filled with unpalatable choices.
| All of the politicians I have met are intelligent, caring
| people with a clear grasp of the issues.
|
| And then you see what they do, and you wonder, what the...
| phkahler wrote:
| Classic mistake of not including zero on the vertical axis of a
| graph. If you're thinking "but then there won't be so much
| variation" you're right. Leaving zero off allows small variations
| to look large.
| mattkrause wrote:
| Am I alone in thinking that the graph was okay and the text was
| just indulging in a bit of hyperbole?
|
| It's a sudden ~50% dip, following nearly a century of apparent
| stability.
| PaulHoule wrote:
| On the other hand there are the cases where you do want to
| emphasize small variations. In a control chart showing the fill
| weight of cereal boxes you certainly don't want zero on the
| chart. Neither do you want to plot daily temperatures in a city
| on a chart that includes 0 Kelvin.
| hef19898 wrote:
| Sure you do, why not? If you don't, show the deviation values
| (plus and minus) centered around zero again.
| PaulHoule wrote:
| Not if it means the line looks flat.
| slenk wrote:
| Sometimes the data is flat...
| thfuran wrote:
| And many times small variations matter.
| slenk wrote:
| Yes, the CMB for instance.
| PaulHoule wrote:
| It sure feels like the temperature in Upstate NY varies
| by more than 10%!
| Scubabear68 wrote:
| Exactly. A lot of investment market charts are zoomed in like
| that because small deviations can matter a lot, and you don't
| want the base price (or whatever measure you're looking at)
| to swamp the signal.
| lolinder wrote:
| Including zero would have helped the "said" graph but not
| fixed it: it would still look like "said" dropped to
| almost 1/3 of its prior popularity, when what actually happened
| is the makeup of the sample changed dramatically.
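
A minimal sketch of the effect described in the comment above, using made-up numbers: the per-genre rates and corpus sizes below are assumptions chosen for illustration, not figures from Google's dataset. The rate of "said" inside each genre never changes, yet the aggregate rate falls to roughly a third once the corpus shifts from mostly fiction to mostly academic text.

```python
# Hypothetical occurrences of "said" per 1,000 words in each genre.
# These rates are held constant across both periods.
RATE_PER_1000 = {"fiction": 6.0, "academic": 0.5}


def aggregate_rate(word_counts):
    """Overall rate per 1,000 words for a corpus with the given genre sizes."""
    total_words = sum(word_counts.values())
    hits = sum(RATE_PER_1000[genre] * n / 1000 for genre, n in word_counts.items())
    return 1000 * hits / total_words


# Hypothetical corpus makeup (word counts per genre) in two periods.
early = {"fiction": 800_000, "academic": 200_000}
late = {"fiction": 200_000, "academic": 800_000}

early_rate = aggregate_rate(early)
late_rate = aggregate_rate(late)

print(f"early: {early_rate:.2f} per 1,000 words")
print(f"late:  {late_rate:.2f} per 1,000 words")
print(f"ratio: {late_rate / early_rate:.2f}")
```

No axis choice can repair this: the plotted line is an accurate picture of the aggregate, and the aggregate is what misleads.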
| jgalt212 wrote:
| The words of Colonel Nathan R. Jessep come to mind.
___________________________________________________________________
(page generated 2023-09-26 23:00 UTC)