[HN Gopher] Propagation of mistakes in papers
___________________________________________________________________
Propagation of mistakes in papers
Author : greghn
Score : 82 points
Date : 2022-07-26 16:03 UTC (6 hours ago)
(HTM) web link (databasearchitects.blogspot.com)
(TXT) w3m dump (databasearchitects.blogspot.com)
| woliveirajr wrote:
| > Judging by publication date the source seems to be this paper
| (also it did not cite any other papers with the incorrect value,
| as far as I know). And everybody else just copied the constant
| from somewhere else, propagating it from paper to paper.
|
| And the Scheuermann and Mauve paper mentions that they picked the
| value (0.775351) from the Philippe Flajolet paper that only
| mentions it without the extra 5. It's not that the value was
| recalculated or reviewed; it was simply picked up and typed
| wrong.
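|
| For a sense of scale, a minimal back-of-the-envelope sketch (in
| Python) of how much the typo matters, assuming the constant is
| used as a divisor in the cardinality estimate, as in Flajolet-
| Martin-style probabilistic counting, and that 0.77351 is the
| "without the extra 5" value from Flajolet's paper:
|
|     # Assumed values: 0.77351 as printed by Flajolet, 0.775351 as
|     # the mistyped copy that propagated from paper to paper.
|     correct = 0.77351
|     mistyped = 0.775351
|
|     # With the constant in the denominator of the estimate, the
|     # typo shifts every estimate by a fixed relative amount:
|     relative_bias = correct / mistyped - 1
|     print(f"relative bias: {relative_bias:.4%}")  # about -0.24%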
| sebastianconcpt wrote:
| Have you thought about what would be needed, and what it would
| imply, to have a kind of CI/CD pipeline for unit testing
| assertions in papers?
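|
| As a rough sketch of what one such "unit test for a paper" could
| look like (hypothetical test name; it assumes the quoted constant
| can be recomputed independently, which is of course the hard
| part):
|
|     import math
|
|     def test_quoted_constant_matches_recomputation():
|         # Hypothetical check: the constant as quoted in the
|         # manuscript vs. a value recomputed from its definition.
|         quoted = 0.775351      # value copied into the manuscript
|         recomputed = 0.77351   # independently recomputed value
|         assert math.isclose(quoted, recomputed, rel_tol=1e-4), \
|             "quoted constant does not match the recomputation"
|
| Run under pytest, this particular test would fail and flag the
| extra digit.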
| radus wrote:
| How do you CI/CD assertions in papers using animal models in
| experiments that take months to years?
| _Algernon_ wrote:
| How would that work? You can't automate testing of papers.
| However flawed the process is, this is what peer review is
| intended to do.
| tinalumfoil wrote:
| Reminds me of https://en.wikiquote.org/wiki/Oil_drop_experiment,
| famously described by Feynman,
|
| > Millikan measured the charge on an electron by an experiment
| with falling oil drops, and got an answer which we now know not
| to be quite right. It's a little bit off because he had the
| incorrect value for the viscosity of air. It's interesting to
| look at the history of measurements of the charge of an electron,
| after Millikan. If you plot them as a function of time, you find
| that one is a little bit bigger than Millikan's, and the next
| one's a little bit bigger than that, and the next one's a little
| bit bigger than that, until finally they settle down to a number
| which is higher.
| m-watson wrote:
| In a lot of experimental classes, that experiment is also used
| to teach about selective data exclusion (potential scientific
| fraud), as well as the reluctance to challenge an already
| published value (later experiments seeking to confirm Millikan's
| value rather than show it was off).
|
| https://en.wikipedia.org/wiki/Oil_drop_experiment
| hinkley wrote:
| Is that part of the genesis for these conversations about how
| perhaps the physical constants of the universe are slowly
| changing over time? If you look at the 'right' experiments, the
| speed of light slowly crept up over time too, IIRC. When the
| movement is all in one direction it's easy to speculate that
| maybe that's because the target keeps moving.
| antognini wrote:
| Another example I recently came across was the early
| measurements of the AU using radar. The first two experiments
| that tried to bounce radar off of Venus had very noisy data, but
| they seemed to have a detection that implied a distance that
| was pretty close to the earlier measurements that had been done
| using parallax. But after the equipment was upgraded, the
| detections went away and it turned out that they had just been
| noise. Later on an even more powerful radar system was able to
| successfully bounce a radar signal off of Venus and it turned
| out that the AU was quite a bit different from its earlier
| value.
| btrettel wrote:
| Researchers as a whole need to do more checking. While I agree
| that errors like the one identified in the link are rare, they
| are not so rare that one can skip looking for them or assume
| that everything was done properly.
|
| I've speculated before that peer review gives researchers false
| confidence in published results [0]. A lot of academics seem to
| believe that peer review is much better at finding errors than it
| actually is. (Here's one example of a conversation I had on HN
| that unfortunately was not productive: [1].) To be clear, I think
| getting through peer review _is_ evidence that a paper is good,
| albeit weak evidence. I would give the fact that a paper is peer
| reviewed little weight compared against my own evaluation of the
| paper.
|
| [0] https://news.ycombinator.com/item?id=22290907
|
| [1] https://news.ycombinator.com/item?id=31485701
| 11101010001100 wrote:
| I just completed a paper review as a reviewer. After I think 4
| rounds, the author finally ran the calculation I had asked for
| in the initial review and admitted I was right. We got there in
| the end, but I had to sit on my hands.
| pcrh wrote:
| Peer review can help improve a paper (and it has improved some
| of mine); however, contrary to some popular notions, it doesn't
| lend "truth" to a paper.
|
| Peer reviewers are not monitoring how experiments were
| conducted; they only have access to a data set that is by
| necessity already highly selected from all the work that went
| into producing the final manuscript. The authors thus bear
| ultimate responsibility.
|
| When considering published work close to mine, I use my own
| judgement of the work, regardless of peer review or which
| journal it is published in (for example it may be in a PhD
| thesis). For work where I am not so familiar with the
| methodologies, I prefer to wait for independent
| verification/replication (direct or indirect) from a different
| research group, which ideally used different methods.
| throwawaymaths wrote:
| Well, to be fair, there _is_ the journal "organic syntheses"
|
| https://en.m.wikipedia.org/wiki/Organic_Syntheses
| NegativeLatency wrote:
| Somewhere near 100% of my shipped bugs have been peer reviewed
| so that makes a lot of sense to me.
| michaelmior wrote:
| > To be clear, I think getting through peer review is evidence
| that a paper is good
|
| I think this depends on how you define _good_. I'm sure
| there's some variation across fields, but peer review generally
| seeks to establish that what is presented in the paper is
| plausible, logically consistent, well-presented, meaningful,
| and novel. That list is non-exhaustive, but _correct_ is very
| hard to establish in a peer review process. In my experience,
| it would be rare for a reviewer to repeat calculations in a
| paper unless something seems fairly obviously off.
|
| As a computer scientist, I can say it would be even more rare for
| a peer reviewer to examine the code written for a paper (if it is
| available) to check for bugs. Point being, there are a lot of
| reasons a paper that appears good may be completely incorrect,
| although typically for reasons that I, as a casual reader, would
| be even less likely to spot than a reviewer who is particularly
| knowledgeable about that field.
| magicalhippo wrote:
| I've got a close relative who reviews papers all the time in
| their field (not CS). Based on that my take is that if a paper
| passes peer review it is a good indicator there's nothing
| egregiously wrong with the stuff that's written.
| hinkley wrote:
| I wonder if there's a trick we're missing, something related to
| the dead-tree history of papers that we could now address.
|
| Namely, paper references always reach back in time. Papers don't
| reference papers that were written after they were written. And
| if that sounds stupid, bear with me a second.
|
| We've talked a lot about the reproducibility problem, and that's
| part of how errors propagate in papers (I didn't prove this value,
| I just cribbed it from [5]). If we had a habit of peer reviewing
| papers and then adding the peer review retroactively to the
| original paper, both for positive and negative results, would we
| slow this merry-go-round down a little bit and reduce the head-
| rush? Would that help prevent people from citing papers that have
| been debunked?
| renewiltord wrote:
| Solid point. The paper is a delta-mapper: it provides a p -->
| Δp prior-to-posterior change. However, it does not tell you
| anything about p or p' = p + Δp itself. To get the true value of
| p^n, we sum over all the Δp in some way (affected by the path we
| take through the papers addressing it).
|
| You're modifying the thing so that future Δp^{i+k} are added
| to the delta-mapper, so that Δp is appropriately modified to
| account for that Δp^{i+k}. It's like path compression in a
| union-find structure (see the sketch below).
|
| It is interesting as a helpful approach but does suffer from
| the pingback spam problem, right? And I have a slightly
| sneaking suspicion that it is not an accidental oversight in
| science that leads to these problems.
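|
| For reference, a minimal sketch of the union-find path compression
| being alluded to (the standard textbook version, nothing paper-
| specific):
|
|     # Union-find with path compression: after a lookup, every node
|     # on the traversed path is pointed directly at the root, so
|     # later lookups skip the intermediate links. The analogy: a
|     # reader jumps straight to the current consensus value instead
|     # of replaying every published delta.
|     parent = {}
|
|     def find(x):
|         parent.setdefault(x, x)
|         if parent[x] != x:
|             parent[x] = find(parent[x])  # compress the path
|         return parent[x]
|
|     def union(x, y):
|         parent[find(x)] = find(y)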
| jdougan wrote:
| A different kind of replication crisis.
| bluenose69 wrote:
| I don't think this sort of thing is all that unusual.
|
| I once did a web-of-science search for citations to a
| foundational paper in my field. It was published in volume 13 of
| a particular journal, and that was listed in a little over 90% of
| the citations, but the other citations all listed the volume as
| 113. My assumption is that somebody cited it in error, and that
| others were basically copying the citation from the bibliography,
| rather than going back to the original paper to get the original
| metadata.
|
| Does this mean that about 10% of writers were basically lying
| about having read the original paper? Well, maybe. But I fear
| that the number might be higher than 10%, because the correct
| citations might also have resulted from just copying from a
| bibliography.
|
| I tell this story to my students, in hopes that they will
| actually _read_ the original papers. Quite a few take my advice
| to heart. Alas, not all do.
| marcosdumay wrote:
| Or maybe that's because somebody published a bibtex entry for
| that paper that got that volume number wrong and those people
| just copied and pasted the entry without reading.
| RC_ITR wrote:
| This is somewhat a criticism of how contemporary citations work
| though.
|
| Primitive science (or even pre-publishing science) doesn't get
| cited because humanity figured it out before our current system
| was in place.
|
| It may sound silly, but no one feels the need to cite
| Eratosthenes when implying the world is round.
|
| But many people _do_ feel the need to cite the colorimetric
| determination for phosphorus (an SCI top 100 paper) even though
| it was published 100 years ago and is generally considered
| "base-level science."
|
| It is certainly an interesting paper to read, but I'm not sure
| I need every scientist to read it in order to believe they know
| how to do colorimetric analysis.
| actuallyalys wrote:
| I'd be curious to know whether the percentage of incorrect
| citations varies over time. I would guess more recent authors
| would be more likely to search by title in Google Scholar or
| SciHub (or use the DOI link, if available) rather than actually
| use the volume and page number, which could result in more
| authors who _did_ read the article nonetheless getting the
| volume number wrong.
| gwern wrote:
| There's a semi-famous line of research by Simkin which uses
| citation copying errors as 'radioactive tracers' to estimate
| the rate of copying & nonreading, under the logic that (in a
| pre-digital age), you could not possibly have repeated the
| '113' error if you got an ILL copy or physically consulted
| volume '13' (if only because you would be pissed at wasting
| your time either checking volume 113 first or verifying there's
| no such thing as volume 113):
|
| https://www.gwern.net/Leprechauns#citogenesis-how-often-do-r...
|
| Your 10% isn't far off from the 10-30% estimates people get, so
| not bad.
| hunglee2 wrote:
| same thing happens in the news - we assume due diligence has been
| satisfactorily (and honestly) conducted by publishers we hold in
| high esteem, and happily propagate without scrutiny, so long as
| it fits our preferred narrative
___________________________________________________________________
(page generated 2022-07-26 23:00 UTC)