[HN Gopher] Tortured phrases: A dubious writing style emerging i...
___________________________________________________________________
Tortured phrases: A dubious writing style emerging in science
Author : DanBC
Score : 137 points
Date : 2021-08-08 15:54 UTC (7 hours ago)
(HTM) web link (www.nature.com)
(TXT) w3m dump (www.nature.com)
| vmilner wrote:
| I've a horrible premonition that the paper describing this
| problem (and those that cite it) may eventually end up being
| flagged for containing too many tortured phrases...
| PhasmaFelis wrote:
| I've been seeing this in news articles as well. Swipe someone
| else's article, run it through a synonym-replacer algorithm, and
| have Reddit bots post it on a bunch of news subs. Presumably the
| thesaurus work fools Google's just-a-copy detector.
|
| It's the next step in clickbait monetization. Why settle for low-
| effort content when you can have _no_-effort content?
| lettergram wrote:
| This is pretty much how corporate news works imo. I can't tell
| you how many times I've seen one article then generate a
| million more.
|
| My favorite example, go to google or DuckDuckGo and type:
|
| "Xxx number hospitalized" or "yyy new cases"
|
| You can type almost any number and get a ton of articles. Not
| exactly a reprint, but they all seem almost generated
| newsclues wrote:
| Next will be a hybrid model where no effort content that begins
| to trend virally gets a human to tweak it for optimization.
|
| Rewriting headlines that bots wrote and A/B testing humans vs.
| software.
| withinboredom wrote:
| Even more entertaining would be all the traffic being from
| bots trying to do the same thing.
| coldpie wrote:
| Thanks to advertising as a business model.
| wolverine876 wrote:
| Could you share any examples?
| im3w1l wrote:
| I wish they had kept the method secret. Getting these papers
| retracted is less valuable than being able to secretly keep tabs
| on them.
| _Microft wrote:
| If they are not retracted, they might get cited by other works
| which themselves might get cited. Suddenly this faked,
| nonexistent research has been "laundered" into mainstream and
| nobody knows anymore that there was a problem in the first
| place.
| WesolyKubeczek wrote:
| I'm wondering what happened to good old reading with
| comprehension. Ain't nobody got no time for that? If so,
| doesn't it make those papers worthless?
| eecc wrote:
| Nope. Time is the only non-fungible asset being burned here
| and everyone is desperately defending their own allotment.
| WesolyKubeczek wrote:
| Then can we at least draw a border around such "science"
| so that serious people who have work to do know to not
| waste time with it?
| wmf wrote:
| We already know which venues are legit (because we've
| heard of them) and which aren't (because we haven't).
| twirlock wrote:
| >how to excuse an intelligentsia which manages the public by
| simply lying its ass off
| waterhouse wrote:
| What would be cool is if they'd figured out two methods, and
| only published one.
|
| Though if they've published a convenient list of the bad
| papers, then, assuming other markers exist, that makes it easy
| for others to discover them.
| bonniemuffin wrote:
| Maybe they did.
| maficious wrote:
| As much as it is sad that such a thing is happening, this is
| hilarious.
| ipsum2 wrote:
| A high profile case (on the internet) similar to the one
| described in the article is when Siraj Raval plagiarized a paper
| on quantum ML and made some amusing replacement phrases:
|
| complex Hilbert space -> Complicated Hilbert space
|
| Quantum gate -> Quantum door
|
| https://www.theregister.com/2019/10/14/ravel_ai_youtube/
| varjag wrote:
| First thought after the opening paragraph, "these have to be
| mainlanders". Scrolling down, yup.
| FabHK wrote:
| Pertinent passage from the preprint:
|
| > Out of 404 papers accepted in less than 30 days after
| submission, 394 papers (97.5%) have authors with affiliations
| in (mainland) China. Out of 615 papers of which editorial
| processing time exceeded 40 days, 58 papers (9.5%) only have
| authors with affiliations in (mainland) China. This tenfold
| imbalance suggests a differentiated processing of papers
| affiliated to China characterised by shorter peer-review
| duration.
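|
| A quick check of the quoted proportions, as a minimal Python sketch
| that uses only the counts in the passage above (the variable names
| are illustrative, not the preprint's):
|
```python
# Counts quoted from the preprint passage above.
fast_total, fast_china = 404, 394   # accepted in under 30 days
slow_total, slow_china = 615, 58    # editorial processing over 40 days

# Share of papers in each group with only mainland-China affiliations.
fast_share = fast_china / fast_total
slow_share = slow_china / slow_total

# How many times larger the share is in the fast group.
ratio = fast_share / slow_share

print(f"fast: {fast_share:.1%}  slow: {slow_share:.1%}  ratio: {ratio:.1f}x")
```
|
| By this arithmetic the second share comes out nearer 9.4% than the
| quoted 9.5%, and the ratio of the two shares is a little over 10,
| which is consistent with the "tenfold" wording.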
| mrfusion wrote:
| I wonder if any phrases or styles could detect group think or
| studies following the crowd.
| neoCrimeLabs wrote:
| I'm very tempted to introduce tortured phrases at work for
| occasional humor. For example, who needs "continuous integration"
| when you have "ceaseless incorporation"? Sometimes it's nice to
| see if anyone reads my notes.
|
| In all seriousness though, I've experienced something similar
| before at a Japanese-run American corporation as far back as the
| 90's. The combination of jargon with executives and executive
| assistants who didn't know American tech-jargon often resulted in
| accepting mangled suggestions by the spell-checker. A notorious
| example was the "Data Whorehousing" presentation, which somehow
| made it through several reviews and rehearsals before being
| presented to the entire American IT department at an all-hands
| meeting.
| dmos62 wrote:
| I feel like only the highest profile journals can be trusted at
| this point. How long will it take academia to adapt?
| 08-15 wrote:
| Why do you feel that, though?
|
| My favorite counter example is "A Draft Sequence Of A
| Neandertal Genome". The article was accepted by both Nature and
| Science _before it was written_. The authors chose to publish
| in Science, because Science offered more on the side: the title
| page and an unlimited(!) number of "contributed" (this means
| unreviewed) companion papers. The article itself was about 20
| pages of drivel; all the substantial content was relegated to
| the 200(!) pages of "Online Supplemental Material". Nobody ever
| read, let alone reviewed, all of that.
|
| After that, I can't trust either Science or Nature, which
| offered pretty much the same crooked deal. If those two aren't
| "highest profile", who is?
| robwwilliams wrote:
| Not even those! The impact factor of a journal is a terrible
| guide to quality. It is more appropriately thought of as a
| measure of scientific sex appeal.
|
| You must read each paper to judge its merits. Lots of junk gets
| published in top ranked journals.
| nick__m wrote:
| > Lots of junk gets published in top ranked journals.
|
| A lot more get published in vanity journals, so I use the
| impact factor as a first pass filter: I avoid papers from
| journals not listed in the JCR [1] or those with a factor below
| 1.000.
|
| I assume, maybe naively, that if an important finding were to
| be published in such low quality journal, it would eventually
| get published in a more legit publication.
|
| 1- https://www.researchgate.net/publication/342623066_Journal
| _C...
| raincom wrote:
| I thought top ranked journals have good reviewers, since the
| editorial board consists of researchers/professors from top
| notch schools. Can you share your thoughts on why junk gets
| published in such journals? Has it to do with collusion or
| reputation-laundering or more?
| wmf wrote:
| There's an order of magnitude difference between the worst
| paper published in a good venue vs. the "tortured" fake
| papers in fake journals though.
| AlexCoventry wrote:
| That's a low bar, though. The point is that it's very
| difficult to judge the scientific merits of a paper without
| actually reading it. (And even then, it's easy to be
| fooled.)
| geofft wrote:
| This is an Elsevier journal. Due to a mistake by Elsevier,
| these papers were published without review.
|
| University libraries who continue to pay Elsevier should know
| that they are propping up scammers and grifters.
| FabHK wrote:
| For the journal in question, its "Journal Impact Factor
| increased from 0.471 to 1.161 over 2015-2019, that is a 146%
| increase over four years"
|
| Would that be considered good?
| sampo wrote:
| > I feel like only the highest profile journals can be trusted
| at this point.
|
| The highest profile journals (_Nature_, _Science_, _The
| Lancet_ in medicine, ...) have some tendency to go for
| sensationalism. They want to publish radical, ground-breaking
| research more than there are actual new ground-breaking results
| happening. So they also end up publishing mediocre research
| presented as ground-breaking, and some less-than-accurate
| research where results are exaggerated to make them look
| ground-breaking.
| Animats wrote:
| Um, yes. "Nature" used to have a great reputation. Supposedly
| it still does in bio. But battery articles in Nature are just
| awful. They keep blowing up "minor advance in surface
| chemistry" into "10x better battery that costs 10x less Real
| Soon Now".
|
| (I'd like to see EV World or something else in that space
| reprint old articles as "1, 5, and 10 years ago in battery
| hype".)
| petschge wrote:
| Yeah in my field the general attitude is that Nature isn't
| all that great. I have heard the phrase "it was published in
| Nature but might still be right" more than once.
| gunfighthacksaw wrote:
| A colleague in an unnamed field, attending an unnamed Polish
| university mentioned that this kind of thing was rife: publishing
| Polish papers translated from English texts and occasionally vice
| versa. Poland is a country with a strong academic tradition and
| similar enough institutions to others in the EU so I can only
| imagine this happens in more 'peripheral' countries with even
| less globalization.
| dang wrote:
| The paper is at https://arxiv.org/abs/2107.06751.
|
| (We merged this thread and
| https://news.ycombinator.com/item?id=28108111)
| doubtfuluser wrote:
| Maybe a future direction would be to train new models to identify
| plagiarism by training on this information. Use non-matching
| backtranslations for training classifiers. It's again the typical
| cat and mouse game I guess
| tarboreus wrote:
| Or someone could...read the papers.
| wereHamster wrote:
| It's the classical problem of people trying to find
| technological solutions to social problems. If plagiarism and
| fake research is still a problem after we've applied
| technology to fight it, clearly we haven't applied enough of
| it.
| waterhouse wrote:
| Sometimes technological solutions work really well to solve
| social problems. For example, at one point, one person
| using the internet would tie up the phone line for everyone
| else in the house, and vice versa. Negotiating this shared
| resource could be considered a household social problem.
| But now there's no such interference, and most people have
| their own cell phones.
| tnzm wrote:
| This is a social problem around the shared use of a
| technological resource. I'm reminded of the old saying,
| "computers can only solve problems that are created with
| computers".
|
| But then again you can view _all_ solutions to social
| problems as inherently technological in the broader
| sense; I adhere to that paradigm.
| robertlagrant wrote:
| That saying seems silly. Computers (i.e. Zoom) help with
| the problem of needing socially distanced education
| during Covid lockdowns.
| pas wrote:
| Referees have no real incentive to keep quality high. They
| already don't get anything in return for doing it. (At best
| they do it for reciprocity/goodwill.) Papers are usually hard
| to follow, replication rate is abysmal, etc. The incentives
| are all set for publishing, not for making real progress.
| PragmaticPulp wrote:
| The number of papers being published is growing at a
| staggering rate. This requires proportional growth in the
| number of people reading these papers, which inevitably means
| the plagiarists and cheaters themselves are being pulled into
| the review system as well. They don't care about letting
| fraudulent papers slip through because they never really
| cared about the science in the first place.
|
| They see it as a game that they're playing and they're doing
| their best to put as little effort as possible into the game
| while extracting as much reputation upside as they can.
|
| We really need to make publishing fraudulent papers a career-
| ending move across academia and even the industry. The only
| reason this continues to happen is because it has a lot of
| upside but very little downside. Caught publishing fraudulent
| papers? Oh well, just leave them off your resume and apply
| somewhere else.
| rusk wrote:
| In telecoms they call all the backend infrastructure "back haul"
| and I have never read a satisfactory explanation. I'm convinced
| that somebody once coined "back hall" with the intention of
| invoking the image of service passages like what you see in the
| mall, it was misheard (as is often the case in telecoms given
| its global nature) and the metaphor of the bulldozer tail stuck
| for ever after
| robertlagrant wrote:
| IME I think backhaul is just sending data to the main internet,
| not all backend. I thought it just meant it hauled the data
| back into the core network.
| rusk wrote:
| Main Internet, across main Internet, between networks, intra-
| domain. Intra-station. Anything that joins it all together
| that isn't "front facing" i.e. wireless network towards
| handsets
| Animats wrote:
| I've heard that term used where an ISP is piggybacking on a
| larger service. Sonic.net offers some of their services over
| AT&T infrastructure. Data to and from home DSL lines is
| "backhauled" to Sonic HQ in Santa Rosa, CA and then goes out
| over the bulk Internet backbone from there. This is a different
| path than the data would take if handled entirely by AT&T.
| vericiab wrote:
| In freight "backhaul" typically refers to transporting goods
| during the return journey. During the principal (non-return)
| journey, often the starting location is more central, like a
| distribution center, and the destination is a smaller satellite
| location like a store. So when something is backhauled, that
| tends to mean it's transported from the smaller satellite
| location to the central location.
|
| Maybe that's where the term came from?
| rusk wrote:
| Maybe actually, or at the very least it could explain the
| confusion. Better than some of the other explanations I've
| heard for sure.
| FabHK wrote:
| And the journal involved, _Microprocessors and Microsystems_, is
| an Elsevier journal. Huge surprise. I am glad the publisher earns
| their outrageous fees by careful screening, peer-review, and
| editing of submitted manuscripts. /s
|
| Ceterum censeo Elsevier(um) esse delendum.
| CRConrad wrote:
| Elsevirus?
| DanBC wrote:
| Full title is: "Tortured phrases: A dubious writing style
| emerging in science. Evidence of critical issues affecting
| established journals"
| zozbot234 wrote:
| No mentions of tortured phrases in the humanities and softer
| social sciences? For all their supposed appreciation of _les
| belles-lettres_ (viz., "fine writing") those researchers sure
| seem to like their tortured phrasings.
| wmf wrote:
| That's a completely separate issue that shouldn't be conflated.
| zozbot234 wrote:
| > That's a completely separate issue
|
| How so? It seems quite related to me. Anecdotally, one would
| expect a pretty clear negative correlation between
| torturedness in the sense of this article and indicators of
| research quality.
| robertlagrant wrote:
| This is a category error. The article is only about the
| subset of low quality papers generated by automatic
| translation.
| atrettel wrote:
| I have encountered something similar to this for a submission
| that I reviewed for a scientific journal. I will not list any
| names or give much detail past those generalities, but I pointed
| out that the authors were misusing a particular technical term.
| In my review I defined the term and explained it briefly. I asked
| the authors to revise their submission accordingly. The paper was
| not bad but the authors did not know English very well, so it was
| quite difficult to read. That was its main problem. However, when
| I received the revised submission, I noticed that the authors
| plagiarized my definition and explanation almost word for word
| (from my _confidential_ review). I pointed this out to the
| editors and they said to just reject the paper with the stated
| reason being plagiarism, which I did. The journal ended up
| rejecting the article, but I discovered it a few years later in a
| different journal. The plagiarized section remained, but the
| authors swapped out a lot of my phrases for these kinds of "tortured
| phrases".
|
| That said, the authors did not fabricate their research (as far
| as I can tell). They just did not know English well, so it was
| easier to just copy things that you know are phrased well than to
| learn to write English well. As the saying goes, do not attribute
| to malice what can be explained by ignorance or laziness. That
| does not excuse it but it makes it more understandable.
|
| I agree with the article that this is probably just the tip of
| the iceberg. There are likely many more lesser evils being
| committed with similar tools that are just much more difficult to
| spot. I would not have noticed my particular example if I were
| not a reviewer for the paper, for example. It makes me wonder how
| big the problem really is.
| hdjjhhvvhga wrote:
| > The paper was not bad but the authors did not know English
| very well, so it was quite difficult to read. That was its main
| problem.
|
| This seems to confirm my suspicion that these cases are not so
| much about AI-generated content but rather a result of machine
| translation.
| aliswe wrote:
| It's a common technique/first layer of plagiarizing a text to
| translate it from English to e.g. Spanish and then from
| Spanish to English, to get rid of the unique words the author
| used.
| craftinator wrote:
| > It's a common technique/first layer of plagiarizing a
| text to translate it from English to e.g. Spanish and then
| from Spanish to English
|
| It's also a common technique for people who don't speak
| English to translate it... In fact, quite a bit more
| common.
| turnersr wrote:
| In your review, did you suggest the definition and explanation
| that they used? In this situation, would an acknowledgment
| at the end have been enough? In my mind, it seems like you all
| had a conversation and the authors took up your suggestions as
| the reviewer.
| atrettel wrote:
| No, I did not suggest the definition and explanation as
| content for them to use. I was trying to explain a concept
| that they discussed incorrectly multiple times in the paper.
| It is an advanced concept that might not even appear in
| graduate-level courses on the subject, so I can understand
| why they did not understand it fully. That said, I did not
| give them permission to copy my words there. If there are any
| particular changes I want the authors to do I put them in
| quotes. This wasn't in quotes. It was an explanation for
| their own benefit so that they can correct the mistakes in
| the paper (by re-writing it).
|
| Once I re-read the submission I wanted to reject it
| immediately, but I realized that I should get a second
| opinion first. So I contacted the editors, who agreed that it
| was blatant plagiarism. Hence, they rejected the paper once I
| recommended rejection in my second review. So this wasn't
| just a conversation where I made some suggestions and the
| authors used them. Even the editors thought it was plagiarism
| once they looked at it.
|
| An acknowledgment would be impossible because the review was
| single-blind. The reviewers knew the identities of the
| authors but not the other way around. What the authors should
| have done was just re-phrase where they used the term in the
| paper. They didn't even need to copy my explanation, to be
| frank. The paper would have worked fine without the paragraph they
| copied. If they just re-phrased the relevant parts no other
| changes would have been needed and this whole thing could
| have been avoided.
| adaml_623 wrote:
| Not 100% sure but I believe the word confidential implies
| that the review should only have been read by the editor(s)
| and not passed on to the authors.
| pottertheotter wrote:
| A review is the written feedback authors receive from the
| journal reviewer. The reviewer can recommend that the
| authors revise and resubmit, based on the review comments.
| Usually the review is not published with the final piece,
| which is what was meant by "confidential review".
| MikeUt wrote:
| > the authors plagiarized my definition and explanation almost
| word for word (from my confidential review).
|
| Is there any way the authors could have kept your definition,
| and somehow credited you, even anonymously? Because rephrasing
| definitions is the pinnacle of wasted effort, and leads to
| confusion - you are asking them to say what you said, but
| without using your words.
| atrettel wrote:
| That is a good question that I do not have a good answer for,
| unfortunately. The review process for this journal is
| supposed to be blind, so crediting me would only reveal me as
| a reviewer. An anonymous acknowledgment is better than
| nothing, if the authors only copied a short definition
| without my permission, but they copied _an entire paragraph_
| from my review without my permission. That's just
| inexcusable. I can understand to some degree why they did not
| understand the concept well, since you may not encounter it
| even in a graduate-level course on the subject, but what they
| did was just inexcusable and really poor judgment.
| sunshineforever wrote:
| Yeah. How can you plagiarize a definition?
| da39a3ee wrote:
| I agree with this. You sound expert and provided a
| definition. I don't think we should expect serious
| professionals to mess around altering the words to make it
| look like it didn't come from the source that it did come
| from. In fact wouldn't that itself be plagiarism? The usual
| approach here is to use a phrase like "as suggested by one of
| our reviewers".
| pottertheotter wrote:
| I think this is the cause of some of these weird terms that
| this HN post is discussing. I have a PhD and found it
| incredibly frustrating to write research papers because
| there was an expectation in my field to add a ton of
| background. That meant I had to spend a lot of time to
| rephrase bits and pieces of other papers where the authors
| had worked hard to word something very well. The professors
| didn't like me quoting from other papers. I had to come up
| with my own way to say something very specific.
| petschge wrote:
| I write papers too and hate finding a new way to say "my
| X is a Y that does Z". Especially if it is your tenth
| paper on the topic and you shouldn't even sound like the
| previous nine times.
|
| But about three sentences into the introduction (where
| you explain all the background) you start going into
| "there is also Y's that do Z backwards". Which Y's you
| compare and connect with is important and says a lot about
| how you think about your X. It might even be a new way of
| looking at it. So telling other people how you think of
| it can be important.
|
| And another 5 sentences in you start referencing previous
| work on the topic. At this point you are crediting others
| and you get to choose whom to credit how much, with the
| benefit of hindsight. You refer to papers that are useful
| to people new in the study of capital letters. What you
| write here helps them much more than a mere list of
| papers or a Google (well, Google Scholar or ADS or PubMed
| or whatever) result list, because you can provide a good
| order to read them or which aspect of X's are best
| explained where. You also name papers that might be
| useful to practitioners in the field because they have a
| particular technique or a good explanation of it.
|
| So it is very much worthwhile to provide the
| background that others expect at the beginning of your
| paper. Even if it requires rewriting that first paragraph
| several times.
| mdp2021 wrote:
| I cannot understand: those articles should have been carefully
| examined before publishing - I understand they are in the set of
| those "Under the Warranty of the Publications' Authority". But if
| anyone read them, the rubbish involved would have emerged.
|
| What am I missing?
| NotSwift wrote:
| Some people don't have English as their native language. When
| such people want to write a scientific article in English they
| will have to use someone who can write English but probably does
| not know much about the research. So of course there will be
| articles with "Tortured Phrases".
| alexmcc81 wrote:
| If you read the article the authors directly address this and
| use datasets of machine translated articles as controls.
| wolverine876 wrote:
| People who don't speak English natively could use machine
| translation, and people plagiarizing could use machine
| translation. How do they distinguish (if you don't mind
| saving me digging into the research)?
| arthur2e5 wrote:
| Did you even read the abstract of the TFA? This should not even
| be a cultural background issue. As an L2 English speaker myself
| I have never ever thought about throwing a thesaurus onto some
| established phrase so I can turn "artificial intelligence" into
| "counterfeit consciousness", or "deep neural network" into
| "profound neural organization". These are deliberate uses of
| fancy words without trying to make sense.
|
| Heck, we got a word for this sort of rampant plagiarism masking
| on Chinese internet -- Xi Gao (manuscript (or blog
| post)-laundry).
|
| OT: I do appreciate the funny phrase "elite figuring" for HPC.
| It's kind of like how they translate things to Anglish.
| Zababa wrote:
| No one would write "colossal information" instead of "big data"
| because English isn't their first language.
| twirlock wrote:
| So the way we can tell a computer has generated a scientific
| paper is... because the computer probably failed to use idiomatic
| terminology when it referred to concepts.
| guyromm wrote:
| Back in 2004 or so, I was building a distributed CMS with the
| goal of creating artificial "link pyramids" with the purpose of
| SEO, which was a rather new thing at the time.
|
| Content generation was one of our bottlenecks, and as Google was
| already rather successful at detecting duplicate content, we were
| looking for a way to "uniqify" posts that would be used to stuff
| sites intended for googlebot, but not humans.
|
| One of the methods that worked was taking source English content,
| running it through Babelfish, the AltaVista translator, to French,
| Spanish or German, and then using the same method to translate it
| back to English.
|
| This resulted in texts that did not make much sense to humans,
| were full of precisely such "tortured phrases" but which were
| considered unique by Google.
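|
| To illustrate why swapped wording can slip past a naive duplicate
| check, here is a toy Python sketch. It compares two texts by the
| Jaccard similarity of their three-word shingles. This is an
| assumption made for illustration only: it is not a description of
| how Google's detector worked, and the function names and sample
| sentences are made up (the swapped terms come from the article).
|
```python
def shingles(text, k=3):
    """Return the set of k-word windows ("shingles") in text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of the shingle sets of two texts (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    union = sa | sb
    # Two texts too short to have any shingles are treated as identical.
    return len(sa & sb) / len(union) if union else 1.0


original = ("we use cloud computing and a random forest "
            "to improve the signal to noise ratio")
tortured = ("we use haze figuring and an arbitrary timberland "
            "to improve the flag to clamor ratio")

print(jaccard(original, original))  # identical text scores 1.0
print(jaccard(original, tortured))  # the rewritten text scores far lower
```
|
| A verbatim copy scores 1.0, while the rewritten sentence shares
| almost no shingles with its source even though a human reader
| would recognise it at once.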
| etempleton wrote:
| I often wonder while reading an academic paper how the writing
| could be as hopelessly bad as it is.
|
| This type of manipulation and plagiarism may be partially to
| blame, but the academic writing style has also gone completely
| off the rails to the point that half the journal articles being
| published today read as if written by some kind of paper writing
| AI robot even when I am quite certain that that isn't the case.
| And no, I am not talking about cases where the author is writing
| in a non-native language.
|
| I have a theory that it may have to do with imposter syndrome and
| a need to sound smart. The author, fearing that they don't really
| belong and at any moment will be found out, therefore never
| making tenure, starts jamming academic sounding words where they
| don't belong and stretching sentences with commas and semicolons
| until the whole thing is just as insufferable to read as it was
| to write.
|
| There is also the possibility that there are just a lot of
| terrible writers out there.
| zwaps wrote:
| I am sure this was not your intention or meaning, but please be
| aware that it is virtually impossible for a non-native speaker
| to write perfect English. English is a language you have to
| intuit. In contrast to other languages, it has very few fixed
| rules. Writing elegantly in English is most certainly an art
| form.
|
| Of course, writing good science is hard enough for native
| speakers. It is very difficult for the vast majority of people
| on the planet - no matter how good their research.
|
| And just so we are clear: Not everyone can afford professional
| editing services at every point in their career.
|
| We meet in English under the premise that it allows for
| universal communication. In this, we accept that English
| natives are almost infinitely more privileged in writing,
| speaking, conferencing and networking. We also have to accept
| that the level of English proficiency varies, and - especially
| English - is easy to learn and so difficult to master.
| LargoLasskhyfv wrote:
| I think at least skimming some edition of the
|
| [1] https://en.wikipedia.org/wiki/The_Chicago_Manual_of_Style
|
| and some of what is available under
|
| [2] https://duckduckgo.com/?q=military+writing+guide
|
| would be useful for American English and technical writing.
| endtime wrote:
| I think you missed this part of the comment to which you were
| replying:
|
| > And no, I am not talking about cases where the author is
| writing in a non-native language.
| raincom wrote:
| A friend submitted a paper to a journal in humanities. The
| reviewer said "his English is informal". In other words, these
| reviewers are asking for stilted English.
| LargoLasskhyfv wrote:
| This makes me think of people smelling bad, in dark robes,
| wearing white powdered wigs, frantically using their
| https://en.wikipedia.org/wiki/Hand_fan
| Strilanc wrote:
| I also get this feedback on my papers. E.g. saying that it's
| written "more like a blog post".
|
| Of course, they're not wrong. It _is_ written more like a
| blog post. Because the writing style used in blog posts is
| hands down better than the writing style used in scientific
| papers. Blogs talk about the real reasons you worked on
| something, they go through simple examples, and they mention
| where you struggled and what you found confusing and what you
| tried that didn't work. All of these things are very useful
| for understanding, and in my experience almost entirely
| lacking from papers. Or at least, in my experience they're
| lacking from modern papers. I think in papers from 100 years
| ago the authors tended to talk more about their worries and
| their excitement e.g. [1].
|
| [1]: https://youtu.be/RZfCqWZ8EAY?t=630
| yissp wrote:
| Good essay by Orwell that touches on this sort of thing
| https://www.orwellfoundation.com/the-orwell-foundation/orwel...
| I used to be guilty of writing this way and one of my high
| school English teachers recommended I read it. I've tried to
| take the message to heart ever since.
| hutzlibu wrote:
| "There is also the possibility that there are just a lot of
| terrible writers out there. "
|
| Surely there are, and writing in a way that is easy to read and
| understand is an art in itself.
|
| But I would agree that the main reason is probably the
| intention to sound smarter than they are. Whole scientific
| disciplines seem to live by that standard.
|
| This is not limited to science though, I recall a German poet
| (I think Heinrich Heine) said about his fellow poets:
|
| You only fly so high like the swallow, that no one can actually
| hear your singing.
| FabHK wrote:
| Some of these tortured phrases are great. My favourites:
|
| "flag to clamor" for signal to noise
|
| "individual computerized collaborator" for PDA (personal digital
| assistant)
|
| "haze figuring" for cloud computing
|
| "information stockroom" for data warehouse
|
| "focal preparing unit" for CPU
|
| "discourse acknowledgement" for voice recognition
|
| "mean square blunder" for MSE (mean square error)
|
| "arbitrary right of passage" for random access
|
| "arbitrary timberland" for random forest
|
| "irregular esteem" for random value
|
| ETA:
|
| "notoriety examination" for sentiment analysis
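|
| A minimal sketch of how a list like this could be turned into a
| screening check: a reverse lookup from each tortured phrase to the
| established term it probably replaced. The dictionary holds only the
| pairs quoted above; the function name find_tortured is hypothetical,
| and this is an illustration, not the method used in the paper.

```python
import re

# Pairs quoted in the comment above: tortured phrase -> established term.
TORTURED = {
    "flag to clamor": "signal to noise",
    "individual computerized collaborator": "personal digital assistant",
    "haze figuring": "cloud computing",
    "information stockroom": "data warehouse",
    "focal preparing unit": "CPU",
    "discourse acknowledgement": "voice recognition",
    "mean square blunder": "mean square error",
    "arbitrary right of passage": "random access",
    "arbitrary timberland": "random forest",
    "irregular esteem": "random value",
    "notoriety examination": "sentiment analysis",
}


def find_tortured(text):
    """Return (tortured phrase, likely original) pairs found in text.

    Matching is case-insensitive and tolerant of line wraps and repeated
    spaces. Results follow the order of the TORTURED dictionary, not the
    order of appearance in the text.
    """
    # Collapse all whitespace runs so phrases split across lines still match.
    normalized = re.sub(r"\s+", " ", text.lower())
    hits = []
    for phrase, original in TORTURED.items():
        # Word boundaries avoid matching inside longer words.
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized):
            hits.append((phrase, original))
    return hits


if __name__ == "__main__":
    sample = "We fit an arbitrary\ntimberland on the information stockroom."
    for phrase, original in find_tortured(sample):
        print(f"{phrase!r} probably means {original!r}")
```

| A fixed list only catches phrases someone has already spotted, so
| a check like this flags candidates for a human to read rather than
| proving anything on its own.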
| abecedarius wrote:
| Reminiscent of
| https://en.wikipedia.org/wiki/Uncleftish_Beholding
| netr0ute wrote:
| Reminds me of https://www.youtube.com/watch?v=GyV_UG60dD4
| aaron-santos wrote:
| I enjoyed finding "counterfeit consciousness" for artificial
| intelligence. To me it evokes a kind of science fiction that's
| shown up occasionally on HN[1].
|
| [1] https://qntm.org/mmacevedo
| Freak_NL wrote:
| Also "haze figuring" for cloud computing.
|
| It sounds like something you'd find in 30s, 40s, 50s sci-fi
| for sure! Like "visiplate" (E.E. "Doc" Smith, Heinlein) for a
| computer display screen. (Along with ticker tape printouts
| and tape reels in the far future of course.)
| slowmovintarget wrote:
| Makes me want to put smog-hosting in my CV.
| synquid wrote:
| The smog is just the Chinese cloud.
| laurent92 wrote:
| Ah, vapordecisionware. But that might be confused with
| regular management.
| seoaeu wrote:
| Really highlights that the actual phrases don't make any
| more sense than the tortured versions, other than the fact
| that we've been hearing all of them for years so they now
| sound normal
| rhino369 wrote:
| I happen to be reading Dune today, and AI is referred to as
| counterfeiting the human mind
| golemotron wrote:
| There might be a common concept between this, chaff[1] and
| Steven Pinker's Euphemism Treadmill.
|
| [1] https://en.wikipedia.org/wiki/Chaff_(countermeasure)
| nick__m wrote:
| If I was in a situation where I had to write on occupational
| health and safety in forestry I would shamelessly appropriate
| "mean square blunder" and "arbitrary timberland", those are
| superbly above the mean square!
| arkitaip wrote:
| Any HNers who want to join me in creating the dad-punk band
| Leftover Vitality?
| jszymborski wrote:
| As someone who has had to write technically in a second-language
| (French, funding agencies in Quebec), this rings particularly
| true.
|
| Luckily, I'm fluent enough to recognise the particularly
| egregious examples, but finding good translations for technical
| words is hard!
|
| One example that comes to mind is when trying to translate the
| phrase "data feed" which came back as "alimentation donnees"
| which ostensibly means "animal feed data".
|
| If you're looking for a lot of English-to-French translations of
| technical terms, check out the theses of any English university in
| Quebec (McGill, Concordia, etc.). They're made public online
| [0]. Can't vouch for the quality as I'm sure there are plenty
| that just use Google Translate, but everyone I know has their
| abstract edited by a francophone in their field.
|
| A good way to validate translated technical terms is to just give
| them a quick internet search on e.g. DuckDuckGo or
| Semanticscholar.
|
| [0] McGill's is https://escholarship.mcgill.ca/
| dghughes wrote:
| It reminds me of an article about Canadian scientific journal
| publishers being bought by a shady company (OMICS Group Inc.)
| so they can seemingly publish whatever they want to.
|
| https://www.ctvnews.ca/health/offshore-firm-accused-of-publi...
| jokoon wrote:
| What's the point of this? To waste the time of foreign
| scientists? Would we call this science warfare?
| riedel wrote:
| It is the result of a misguided science system that relies
| mostly on external quality checks (peer reviewed publication)
| and flooding the world with so much "novelty" that there is no
| way to digest it. At least you can use the output to train
| language models up to now: will machines now have to train
| themselves...
| wmf wrote:
| It's resume inflation.
| FabHK wrote:
| Got to agree with the conclusion of the paper:
|
| > In our strong opinion, the root of the problems discussed in
| this work is the notorious publish or perish atmosphere
| (Garfield, 1996) affecting both authors and publishers. This
| leads to blind counting and fuels production of uninteresting
| (and even nonsensical) publications.
| dash2 wrote:
| I get this with students a lot. Papers which have been copied
| from some website, but then they've gone through and altered a
| few bits of vocabulary to disguise it.
| gzer0 wrote:
| This is anecdotal evidence at best, but it is worth considering.
| I know of several individuals who were able to complete their
| entire Master's thesis utilizing a combination of AI generated
| content (GPT-3) and a paraphrasing tool.
|
| The generated text was well over 50 pages, completely bypassed
| all known content/plagiarism checks and was even included in the
| University's "exemplary examples". To this day, it is still
| there.
|
| This is of significant concern as some of these GPT-3 based tools
| are now integrated within MS Word itself. Word 2021 allows for
| "add-ons", out of which I have noticed several third party
| content generation and paraphrasing tools.
| 13415 wrote:
| Please include a link to these theses, because as it stands
| this anecdote sounds extremely implausible. I don't know what
| university you were at, but I've been at a few in Europe and at
| every one of them Master's theses were evaluated from the start
| to the end by several humans. GPT-3 is unable to produce even
| two pages of coherent text, let alone 50 pages good enough to
| be accepted as a Master's thesis in _any_ discipline at any
| university I could think of (even the worst ones).
|
| I can imagine that plagiarists use paraphrasing software quite
| extensively, though, and that it is a problem.
| gzer0 wrote:
| Let me clarify:
|
| It was not all automated, there was a fair bit of manual
| intervention needed. I understand your concerns and they are
| valid and this is why I preface my statement with "anecdotal
| evidence". What I write is most certainly not the entire
| story and a fair bit of detail is left out.
|
| It should be known that this is widespread across multiple
| industries and this will only become more of an issue in the
| future.
|
| This is a US-based institution, fully accredited.
| throwawaygh wrote:
| _> Master's thesis_
|
| I don't doubt this at all, and I have no doubt that GPT-3 with
| a bit of human editing can spit out something better than the
| lower third of masters students at corn row colleges.
|
| Masters degrees are cash cows, which is why no one in
| unregulated industries cares about them. People in
| regulated/unionized industries also don't _actually_ care; even
| educators, who at least nominally see intrinsic value in
| education, go to borderline diploma mills to get that union-
| mandated raise at minimal effort.
| bjt wrote:
| > ... masters students at corn row colleges.
|
| First time I've heard the term "corn row colleges". Google's
| not bringing up anything that looks relevant.
|
| I suggest picking something else. Given that "cornrows" are a
| predominantly black hairstyle, the term reads like a racial
| slur.
| CRConrad wrote:
| I (non-American, not a native English speaker) thought it
| was a pejorative reference to rural universities; ("hick" /
| "rube") state universities of Midwestern states etc.
| throwawaygh wrote:
| The name originates as a pejorative for small tuition-
| dependent non-research teaching colleges. Those colleges
| mostly catered to pastors, teachers, etc. and were located
| in small towns. The historical reasons that these
| institutions are now "in the corn fields" provides an
| interesting topic for historical inquiry. Perhaps many are
| in old rail-road or factory towns that have since
| languished, but schools that were similar at time of
| founding and didn't die are in industrial and post-
| industrial hubs where they attract the attention needed
| to thrive. Who knows. The point is that they are small,
| inconsequential institutions that are predominately located
| in rural and semi-rural towns.
|
| The name now includes small state schools -- usually branch
| campuses with lower enrollment and no major (R1) research
| output.
|
| (NB: corn row colleges are also by definition non-elite, so
| small liberal arts colleges with billion dollar endowments
| which might otherwise count, don't).
|
| Many such institutions have since started offering graduate
| (or at least non-bachelors) degrees and certificates that
| are somehow even more worthless than their undergraduate
| programs.
|
| Apparently the name has a lot of different meanings these
| days -- see sibling comments -- but it has DEFINITELY never
| been meant as a racial pejorative. If anything, exactly the
| opposite, since most of those "crap-tier
| midwestern/southern colleges" cater to 99.99% WASP social
| networks (the P is even explicit).
| selimthegrim wrote:
| No, meaning in the cornfields, i.e. second-tier state
| universities.
| aliswe wrote:
| nono, it means the long line of colleges that are virtually
| indistinguishable from each other.
| CRConrad wrote:
| > Masters degrees are cash cows, which is why no one in
| unregulated industries cares about them.
|
| So, uh, is Business Administration a regulated industry?
| throwawaygh wrote:
| No one cares about MBAs. The networks can be helpful, but,
| unlike JDs/PharmDs/etc., an MBA from a no-name college &
| weak alumni network isn't worth the paper it's printed on.
| OminousWeapons wrote:
| > Masters degrees are cash cows, which is why no one in
| unregulated industries cares about them. People in
| regulated/unionized industries also don't actually care; even
| educators, who at least nominally see intrinsic value in
| education, go to borderline diploma mills to get that union-
| mandated raise at minimal effort.
|
| I don't mean this rudely, but it is attitudes like this which
| cause the CS interviewing process to be 100X more painful
| than the interviewing process in any other field: "I don't
| trust your credential so I demand you prove your competence
| to me on the spot and let's do 5 rounds of interviews just to
| be sure."
| derefr wrote:
| How do you feel about doctorates?
| throwawaygh wrote:
| Depends. University of Phoenix awards doctorates that take
| 3-4 years (HUGE red flag -- the best and brightest phd
| students _might_ get out in 4 years if everything goes
| perfectly; an "expected time to graduation" of anything
| less than 5 years is almost certainly a worthless degree).
|
| Those doctorates don't require much more than taking some
| coursework and paying a boatload in tuition. Basically an
| expensive and lengthy online masters program. Not worth the
| paper they're printed on, unless you're employed by the
| government or in a union job that mandates raises for
| education attainment.
|
| As a general rule of thumb, PhDs from R1 universities that
| are paid for by the university through research
| assistantships or teaching assistantships are generally a
| good signal of at least minimal training in research
| skills.
|
| Another good general rule of thumb is that paying for a PhD
| -- beyond perhaps some MD/PhDs or maybe nursing phds, stuff
| like that -- is always a good sign of someone who has both
| a meaningless degree and also poor reasoning/research
| skills.
|
| But anyways, real doctorates outside of a few fields (e.g.,
| pure math) usually come with a non-trivial publication
| record that speaks for itself. You don't even need to know
| that the person has a doctorate; you can just read their
| papers and a rec letter from an advisor describing the
| student's role in each paper.
|
| (I'm excluding discussion of professional degrees like JDs,
| PharmDs, etc. which are technically doctorates but sort of
| their own class.)
| lyaa wrote:
| Length of PhD programs is an indicator that should be
| considered in context. UK Universities, for example,
| often have research PhD programs that take 3 years to
| complete and they are legitimate.
| throwawaygh wrote:
| Yes, my comment is specific to US (where, additionally,
| it's somewhat uncommon to have a masters degree prior to
| starting the phd).
| thebooktocome wrote:
| NB: I'm only speaking about the math doctorate as it
| currently stands in the United States.
|
| Due to the current market saturation of math doctorates,
| any pure mathematics PhD worth the paper it's printed on
| will also probably come with a non-trivial publication
| record. The exceptions I can think of are high-risk high-
| reward areas like cutting-edge number theory (I had a
| friend go eight years without publishing, which, yikes,
| but his thesis was semi-revolutionary (or so I'm told))
| or, I guess, suitably abstract category theory (though
| the people I follow in this area seem to publish lots of
| interesting papers, like the Baez school or the homotopy
| type theory people; your mileage may vary).
|
| It's really too bad. One wonders why we can't simply ax
| the entire advisor-candidate system (with all its myriad
| opportunities for physical, emotional, and even sexual
| abuse) and certify new candidates by saying: "You're a
| doctor of mathematics when you get five professors to
| sign off on 3-5 papers you've had published."
| derefr wrote:
| > certify new candidates by saying: "You're a doctor of
| mathematics when you get five professors to sign off on
| 3-5 papers you've had published."
|
| Or one big one.
|
| Basically, take the "honorary doctorates" some
| universities give out retrospectively to people
| who have made major contributions to their fields; do it
| more often; and then make it the _only_ path to getting a
| doctorate, such that they're no longer "honorary" at
| all.
| wolverine876 wrote:
| > Masters degrees are cash cows, which is why no one in
| unregulated industries cares about them. People in
| regulated/unionized industries also don't actually care...
|
| People don't care about masters degrees in engineering, law,
| business, art, etc. etc.? Try applying for many jobs without
| one, or with one from lower-ranking colleges.
|
| The Chronicle of Higher Education article recently on the HN
| front page said that masters in some fields, they give the
| example of 'positive psychology', are indeed cash cows. But
| in the example, that degree was not part of the actual
| Department of Psychology, which is taken very seriously.
| throwawaygh wrote:
| Engineering and CS masters are 100% cash cows that no one
| cares about. I promise you.
|
| I haven't even heard of a Masters in Law (law degrees are
| doctorates), but I can't imagine it's worth the paper it's
| printed on.
|
| MBAs are worthless unless they're from a few good places,
| and even then the brand and networking does a lot of the
| lifting.
| wolverine876 wrote:
| > law degrees are doctorates
|
| Law degrees are called _Juris Doctor_ but are
| professional degrees, like MBAs. You aren't required to
| publish original research (afaik) and in the US they were
| formerly Bachelor of Laws (LL.B.) and then renamed (as I
| understand it).
|
| The doctorate is Doctor of Juridical Science (J.S.D.).
| You can also get a Master of Law (LL.M.).
| fighterpilot wrote:
| In engineering and financial services, my experience is
| that they don't care. Some weight is given to a PhD but not
| a Masters.
| [deleted]
| MengerSponge wrote:
| Does Poe's Law cover parody becoming real? Because BBSpot
| called this nearly 18 years ago: "Word 2004 to Pioneer
| AutoUnsummarize Feature"
| https://www.bbspot.com/News/2003/12/autounsummarize.html
| bjourne wrote:
| I really doubt you can computer generate a Master's thesis.
| Completing a Master's thesis at an accredited institution is a
| heck of a lot of work and even a cursory reading of a thesis by
| an examiner, supervisor, opponent, or other interested party
| would give the generated content away. Maybe if you get your
| degree from a diploma mill you could get away with it, but then
| your degree wouldn't be worth toilet paper anyway.
|
| I've heard similar stories about generated phd theses and it is
| even more implausible. The reason is that writing a thesis is
| much more than just producing a hundred pages or so of prose.
| Any university student can poop that out in a few weeks. The
| main job of a thesis is coming up with a research question,
| conducting an experiment or a study, and describing the results
| and how they fit in whatever niche of the scientific world you
| are working in.
| hdjjhhvvhga wrote:
| I agree that in most cases it would be very difficult to do.
| But I can imagine some specific circumstances where it could
| be pulled off, possibly with some manual modifications: soft
| sciences like sociology (you can't imagine the amount of bs
| I've read during my college years), the subject matter being
| very different from the area your supervising prof
| specializes in, the topic that allows for arbitrary
| speculation, an underfunded university branch with profs
| having a more lax attitude.
| BoxOfRain wrote:
| There is apparently nothing new under the sun, in 1996 a
| physics professor fed up with people from a social sciences
| background publishing insufficiently rigorous papers on the
| subject of physics decided to submit a nonsensical paper
| liberally sprinkled with buzzwords to a journal of cultural
| studies [0]. While this was simply someone making up
| nonsense that "sounded right" as these AI language models
| obviously weren't around then and aimed at addressing a
| different issue, I definitely think it's relevant to this
| discussion because it shows that academia (or at least
| parts of academia) can be a bit flakey with what they
| accept.
|
| [0] https://en.m.wikipedia.org/wiki/Sokal_affair
| andai wrote:
| https://xkcd.com/451/
| dash2 wrote:
| Oh my sweet summer child.
|
| I regularly get dissertations with any or all of: barely
| readable English, useless empirics, half-baked research
| questions.
| pottertheotter wrote:
| How are dissertations getting to you like that? When I did
| my PhD, no one would have allowed a PhD student to start
| writing a dissertation without first having sufficient
| research questions and then completing appropriate
| statistical analyses.
| laurent92 wrote:
| > I know of several individuals who were able to complete their
| Master's thesis utilizing...
|
| Doesn't it stay published forever? Might be a shame for
| someone during their career.
|
| On the other hand, even a chapter of Mein Kampf was accepted in
| 20 journals, after replacing the old words with newer versions.
| Human reviews are hard. Maybe we should put computers in charge
| of reviewing papers, they'd recognize the work of AI quicker?
|
| https://www.foxnews.com/us/academic-journal-accepts-feminist...
| phkahler wrote:
| Sounds like automated review of automatically generated papers.
| And people pay money for that...
| lostlogin wrote:
| > third party content generation and paraphrasing tools.
|
| Presumably this is an arms race against things like
| https://www.turnitin.com/
|
| Empower students 'to do their best, original work' and this is
| what you get. Though what the alternative is, I have no idea.
| tasty_freeze wrote:
| I ran into something like this in an amazon review once. I was
| looking for a book of transcriptions for the instrument I play,
| and two of the handful of reviews used the same awkward phrase:
| "music goals". I scratched my head and then realized what
| probably happened. They weren't native english speakers and they
| were being paid to write reviews and they had gotten the wrong
| synonym. "music goals" was supposed to be "music scores".
| itronitron wrote:
| I like that "citation of non-existent literature" is also a
| feature of these articles, although I wonder if the non-existent
| literature was previously cited in other papers.
| (https://irthoughts.wordpress.com/2009/07/15/the-most-influen...)
| Animats wrote:
| This is a major failure of Elsevier.
|
| Here's "Microprocessors and Microsystems."[1] This is supposed to
| be about embedded systems, which is generally a no-bullshit
| field. I'd never heard of this journal. People read Electronic
| Design, EE Times, "Embedded.com", maybe Control Systems Journal,
| etc. Those have either articles about how to do something, or
| "why what we're selling is great" articles.
|
| Now look at the article titles in Microprocessors and
| Microsystems.[2] Here are the first three.
|
| - COPS: A complete oblivious processing system
|
| - A perceptron-based replication scheme for managing the shared
| last level cache
|
| - Efficient underdetermined speech signal separation using
| encompassed Hammersley-Clifford algorithm and hardware
| implementation
|
| Now those might be legitimate, although what they're doing in an
| embedded systems journal isn't clear. They're all behind a
| paywall, so it's hard to tell if they're any good.
|
| "Oblivious processing" is a security concept. That belongs in a
| journal on security and encryption, where the crypto people will
| know what holes to look for. (Microsoft was doing work in this
| area in 2013, but I don't think a product emerged. If you can
| make it work, some cloud computing company can use it.)
|
| Cache management belongs in a journal on CPU design, where people
| who have struggled to make caches work will take a look. There
| are people using perceptrons for this, which makes sense; a cache
| has to guess which things will be reused. (If this works well,
| someone should be trying it in web caches such as NGINX to
| improve cache hit rates.)
|
| Signal separation is an active field, but this isn't a journal
| where you'd expect to find articles on it. Wikipedia has a good
| article on signal separation. The history of that article
| indicates attempts to sneak in citations to sketchy articles. No
| idea if the Hammersley-Clifford algorithm is even relevant. (If
| it's a significant advance, there's commercial value in this in
| improving audio quality for conferencing systems.)
|
| So these papers were all sent to a journal where the odds of
| getting published are good, and the odds that the editors have no
| idea about the subject matter are high.
|
| Why is Elsevier even publishing this journal?
|
| [1] https://www.sciencedirect.com/journal/microprocessors-and-
| mi...
|
| [2] https://www.sciencedirect.com/journal/microprocessors-and-
| mi...
|
| [3]
| https://en.wikipedia.org/w/index.php?title=Signal_separation...
| ksaj wrote:
| I noticed this happening in other areas a few years ago, but with
| faked blogs. The titles and subjects would sound interesting, but
| then when you tried to read them, you'd need a specialized
| decoder to get through the utterly baffling word replacements.
| But they already got their ad revenue by the time you notice the
| article is complete gibberish.
|
| The first one I found was about dog illnesses. They kept
| referring to dogs with phrases like "Your domesticated canine,"
| and it was quite a chore trying to figure out most of the
| symptoms that they were listing. "Heart worms" was translated to
| "love snakes," which I thought was delightful.
| armchairhacker wrote:
| Nowadays too many real blogs are padded with weird phrasing and
| sentences which don't really mean anything.
|
| In this case, sometimes you get lucky and can actually find
| meaningful information between the padding. But sometimes you
| just read an article that takes 5 paragraphs and 500 words to
| say "we don't know".
| dimatura wrote:
| Yes, this may be a specific example of a more widespread
| phenomenon. There are certain websites out there that republish
| articles from well-established publications (e.g., New York
| Times) almost word for word, except that they are rife with
| synonym swaps that may or may not make sense in context,
| presumably to escape some kind of automated copy detection.
| Results can be amusing. For example, the copied article said
| ""Drukqs" acquired a blended essential response..." where the
| original said ""Drukqs" received a mixed critical response...".
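|
| To make the swap pattern concrete, here is a rough sketch that
| aligns an original sentence with a suspected copy word by word and
| lists the substituted runs. It uses Python's standard difflib; the
| helper names tokens and swapped_words are made up for illustration,
| and nothing here claims to be how any real copy detector works.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def swapped_words(original, suspect):
    """Align two texts by word and report what was substituted.

    Returns (similarity, swaps), where similarity is difflib's ratio over
    the two token lists and swaps is a list of (original words, suspect
    words) pairs for each run the matcher marks as a replacement.
    """
    a, b = tokens(original), tokens(suspect)
    # autojunk=False keeps difflib from discarding frequent tokens in long texts.
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    swaps = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            swaps.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return matcher.ratio(), swaps


if __name__ == "__main__":
    original = '"Drukqs" received a mixed critical response'
    copied = '"Drukqs" acquired a blended essential response'
    similarity, swaps = swapped_words(original, copied)
    print(similarity)
    for old, new in swaps:
        print(f"{old!r} -> {new!r}")
```

| On the pair quoted above, half the words still line up in the same
| positions even though no multi-word phrase survives intact, which
| is roughly why exact-match checks miss these copies while the
| sentence skeleton still gives them away.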
| Pixelbrick wrote:
| This will come as no surprise to FE/HE lecturers the world
| over...
| cpach wrote:
| What is FE/HE?
| cbarrick wrote:
| Seems like the UK equivalent of community colleges in the US.
| cperciva wrote:
| I'm guessing Further Education / Higher Education.
| dash2 wrote:
| Further Education = 16-18 years old; Higher Education =
| 18-21 years old, i.e. universities.
| mrfusion wrote:
| I wonder if there are any phrases to detect industry funded
| papers? I know they tell you their funding sources but it doesn't
| seem to always help.
___________________________________________________________________
(page generated 2021-08-08 23:00 UTC)