[HN Gopher] More Fakery
___________________________________________________________________
More Fakery
Author : rossdavidh
Score : 116 points
Date : 2022-04-11 13:07 UTC (9 hours ago)
(HTM) web link (www.science.org)
(TXT) w3m dump (www.science.org)
| qchris wrote:
| My favorite article on this topic is "Escaping science's paradox"
| by Stuart Buck[1]. I'm particularly interested in the idea (at
| least within the United States) of "red teaming" science. This
| would involve having an independent agency fund attempts to
| replicate (and to find and publish flaws in) NSF- or NIH-funded
| projects, and publish the results. Ideally, the replication
| history of authors' papers could then be part of the criteria
| for receiving funding for more novel research in the future.
|
| Obviously, there are a few fields where this might not work (you
| can't just create a second Large Hadron Collider for validation),
| but in areas from sociology to organic chemistry to environmental
| science, I think there's a lot of promise in that method for
| helping to re-align incentives around producing solid, replicable
| research.
|
| [1] https://www.worksinprogress.co/issue/escaping-sciences-
| parad...
| a-dub wrote:
| it's the same as everything. there should be more and easier
| money for the less rewarding task of verification/replication.
| some people actually enjoy this sort of work just as much as
| some people enjoy being on the bleeding edge... but there are
| probably less of them.
|
| where it would get complicated is also the same as everything.
| when the verification effort neither supports nor refutes the
| original one. many would argue that it means it wasn't done
| right, but lots of things aren't done right in life.
|
| then there can be the triple replication revolution! so it
| goes...
| Beldin wrote:
| Funding replication is a great idea, but it cannot solve this
| on its own. It would require roughly as much funding as now
| goes to science merely to replicate the results being produced
| now. That still leaves a rather hefty backlog. Moreover, the
| pace at which scientific output doubles is increasing. Off the
| top of my head, it would be below a decade nowadays. Even if
| 90% of publications did not need replication (new algorithms
| that demonstrably work), merely keeping pace would basically
| require 1 in 10 institutions to devote themselves fully to
| replication studies. Even then we'd need more capacity to look
| at previous results - that 10% is fully needed to investigate
| new results.
|
| Note that this is optimistic: I'd expect the percentage of
| publications where a reproducibility study makes sense to be
| above 50%.
| qchris wrote:
| This isn't intended as snarky, but I don't understand what
| "it's the same as everything" is supposed to mean. What is
| "it"? What is "everything"? Why are they the same?
|
| I'd also argue that your reduction of this problem sort of
| misses the point. One of the big problems with the way that
| studies are done is not that replication efforts aren't
| conclusive (it's very difficult to prove something doesn't
| exist), it's that a) non-replicable studies are generally
| considered as valuable as replicable ones, and b) as a
| result, it's extremely difficult to replicate many studies to
| begin with, because there's no incentive to take the time to
| make it possible. Even if the end result of a replication
| paper is "we couldn't produce the same results", the people
| working on it can say "this author's experiments were
| exceedingly difficult to even try to reproduce," or
| conversely "we didn't get the same results, but their data
| collection methods and analysis code were well-documented and
| accessible." That has a lot of value!
|
| If you tried doing triple replication for every paper, I
| agree that maybe wouldn't be the best use of resources. But
| the current state of affairs is so bad that a well-organized
| drive to create single-attempt replication on a fraction of
| publicly-funded projects has the potential to be a
| significant driver of change.
| a-dub wrote:
| "the same as everything" is an observation that often times
| verification/correctness/accuracy efforts are tossed aside
| in favor of new development and this is a truism across
| many fields. in science you see this as funding being
| committed to shiny new nature and science cover stories,
| with replication being left as an afterthought. in software
| you see this as heavy commitments to new features that
| drive revenue, with security/compliance/architecture and qa
| remaining underfunded and less respected. (until, of
| course, the problems that result from underfunding them
| make themselves apparent).
| 0x0203 wrote:
| One thing I'd like to see is a requirement that for all
| government-funded research, a certain percentage of that
| funding, say 30%, must go toward replicating other publicly
| funded research that has been replicated by fewer than two
| independent, non-affiliated labs. No original research could be
| published until at least two independent, non-affiliated labs
| had replicated it based on the submitted paper and reported on
| the results, with those reports then included with the original
| research. I'd like to see this across all of academia, but I
| imagine there are enough challenges with enforcing this
| productively that doing it across all research becomes both
| impractical and difficult to protect from abuse. But at least
| with public funds, it would be nice to put in some checks to
| reduce the amount of fraudulent or sloppy research that
| taxpayers pay for.
| the_snooze wrote:
| I should point out that the notion of "replication" can often
| be way more difficult and nuanced than people expect. For
| one, what is the scope of the replication? Would it be simply
| to re-run the analysis on the data and make sure the math
| checks out? Or would it be to re-collect the data according
| to the methods described by the original researchers?
|
| The former is pretty easy, but only catches errors in the
| analysis phase (i.e., the data itself could be flawed). The
| latter is very comprehensive, but you essentially have to
| double up the effort on re-doing the study---which may not
| always be possible if you're studying a moving target (e.g.,
| how the original SARS-CoV-2 variant spread through the
| initial set of hosts).
| OrderlyTiamat wrote:
| > re-run the analysis on the data and make sure the math
| checks out?
|
| That isn't a replication in any meaningful sense, but a
| replication can certainly take many forms. An exact
| replication is one; another is a conceptual replication,
| studying the same effect but with a different design; or these
| can be combined with a new analysis that pools the data from
| the original study and the new one under (possibly) improved
| statistics.
| epgui wrote:
| Here's an even easier set of requirements to simplify the
| first case:
|
| - Require all research to publish their source code.
|
| - Require all research to publish their raw data minus
| "PII".
|
| * Note: I use "PII" here with the intention of it taking
| the most liberal meaning possible, where privacy trumps
| transparency absolutely and where de-anonymization is
| impossible. This would rule out a lot of data, and
| personally I think we could take a more balanced approach,
| but even this minimalist approach would be a vast
| improvement on the current situation.
| bjelkeman-again wrote:
| When I learned at university that not all published research,
| especially government-funded research, already does this, I
| was dumbfounded.
| epgui wrote:
| "Not all" is a big understatement... I would estimate
| that less than 0.00001% of published research does this.
| Every time I talk about this to someone (colleagues in
| adjacent fields, PIs...), they seem to give zero pucks.
| It's really mind-boggling.
| mike_hearn wrote:
| Be aware that despite how much focus replicability gets, it's
| only one of many things that go wrong with research papers.
| Even if you somehow waved a magic wand and fixed
| replicability perfectly tomorrow, entire academic fields
| would still be worthless and misleading.
|
| How can replicable research go wrong? Here's just a fraction
| of the things I've seen reading papers:
|
| 1. Logic errors. So many logic errors. Replicating something
| that doesn't make sense leaves you with two things that don't
| make sense: a waste of time and money.
|
| 2. Tiny effect sizes. Often an effect will "replicate" but
| with a smaller effect than the one claimed; is this a
| successful replication or not?
|
| 3. Intellectual fraud. Often this works by taking a normal
| English term and then at the start of your paper giving it an
| incorrect definition. Again this will replicate just fine but
| the result is still misinformation.
|
| 4. Incoherent concepts. What _exactly_ does R0 mean in
| epidemiology and _precisely_ how is it determined? You can
| replicate the calculations that are used but you won't be
| calculating what you think you are.
|
| 5. A lot of research isn't experimental, it's purely
| observational. You can't go back and re-observe the things
| being studied, only re-analyze the data they originally
| collected. Does this count?
|
| 6. Incredibly obvious findings. Wealthy parents have more
| successful children, etc. It'll replicate all right but so
| what? Why are taxpayers being made to fund this stuff?
|
| 7. Fraudulent practices that are nonetheless normalized
| within a field. The article complains about scientists
| Photoshopping western blots (a type of artifact produced in
| biology experiments). That's because editing your data in
| ways that make it fit your theory is universally understood
| to be fraud ... except in climatology, where scientists have
| developed a habit of constantly rewriting the databases that
| contain historical temperature records. And by "historical"
| we mean "last year" here, not 1000 years ago. These edits
| always make global warming more pronounced, and sometimes
| actually create warming trends where previously there were
| none (e.g. [1]). Needless to say climatologists don't
| consider this fraud. It means if you're trying to replicate a
| claim from climatology, even an apparently factual claim
| about a certain fixed year, you may run into the problem that
| it was "true" at the time it was made and may even have been
| replicated, but is now "false" because the data has been
| edited since.
|
| Epidemiology has a somewhat similar problem - they don't
| consider deterministic models to be important, i.e. it may be
| impossible to get the same numbers out of a model as a paper
| presents, even if you give it identical inputs, due to race
| conditions/memory corruption bugs in the code. They do _not_
| consider this a problem and will claim it doesn't matter
| because the model uses a PRNG somewhere, or that they
| "replicated" the model outputs because they got numbers only
| 25% different.
|
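| (Side note, purely as an illustration and not anyone's actual
| model: a stochastic simulation can still be bit-for-bit
| reproducible if the random draws come from a seeded PRNG and
| the update order is fixed. A minimal Python toy - the SIR-style
| update and every number in it are invented for this example:)
|
|     import random
|
|     def sir_step(s, i, r, beta, gamma, rng):
|         # Toy stochastic update: each infected person may pass
|         # the infection on (probability scaled by beta and the
|         # susceptible fraction) and may recover (prob. gamma).
|         n = s + i + r
|         new_inf = sum(1 for _ in range(i)
|                       if rng.random() < beta * s / n)
|         new_rec = sum(1 for _ in range(i) if rng.random() < gamma)
|         new_inf = min(new_inf, s)
|         return s - new_inf, i + new_inf - new_rec, r + new_rec
|
|     def run(seed, steps=100):
|         rng = random.Random(seed)  # fixed seed -> identical draws
|         s, i, r = 990, 10, 0
|         for _ in range(steps):
|             s, i, r = sir_step(s, i, r, 0.3, 0.1, rng)
|         return s, i, r
|
|     # Same seed and same inputs give identical output every run.
|     assert run(42) == run(42)
|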
| What does it even mean to say a claim does or does not
| replicate, in fields like these?
|
| All this takes place in an environment of near total
| institutional indifference. Paper replicates? Great. Nobody
| cares, because they all assumed it would. Paper doesn't
| replicate, or has methodological errors making replication
| pointless? Nobody cares about that either.
|
| Your proposal suggests blocking publication until replication
| is done by independent labs. That won't work, because even if
| you found some way to actually enforce that (not all grants
| come from the government!), you'll just end up with lots of
| papers that can be replicated but are still nonsensical for
| other reasons.
|
| [1] https://nature.com/articles/nature.2015.17700
| Beldin wrote:
| One problem is that the amount of scientific output is
| increasing at an increasing rate.
|
| This means that the vast, vast majority of works will never be
| considered for replication - even with a dedicated replication
| institute. So for most applicants, the number of replicated
| results will be 0.
| bee_rider wrote:
| Being on the science red team could also be really cool and
| fun. Since the goal is to explore the types of errors and lies
| that reliably get through, put new scientists on a team with
| some old greybeard and pass along that hard-earned "how to
| screw up cleverly" experience.
| JacobThreeThree wrote:
| >Being on the science red team could also be really cool and
| fun.
|
| I think it depends on what you're investigating, and how much
| is at stake. I doubt it would be much fun to be put on a
| corporate hit list.
|
| >The court was told that James Fries, professor of medicine
| at Stanford University, wrote to the then Merck head Ray
| Gilmartin in October 2000 to complain about the treatment of
| some of his researchers who had criticised the drug.
|
| >"Even worse were allegations of Merck damage control by
| intimidation," he wrote, ... "This has happened to at least
| eight (clinical) investigators ... I suppose I was mildly
| threatened myself but I never have spoken or written on these
| issues."
|
| https://www.cbsnews.com/news/merck-created-hit-list-to-
| destr...
| mike_hearn wrote:
| Talk to people who have actually done it. Not one will tell
| you it's cool or fun. Here's how science red teaming actually
| goes:
|
| 1. You download a paper and read it. It's got major, obvious
| problems that look suspiciously like they might be
| deliberate.
|
| 2. You report the problems to the authors. They never reply.
|
| 3. You report the problems to the journals. They never reply.
|
| 4. You report the problems to the university where those
| people work. They never reply.
|
| 5. Months have passed, you're tired of this and besides by
| now the same team has published 3 more papers all of which
| are also flawed. So you start hunting around for people who
| _will_ reply, and eventually you find some people who run
| websites where bad science is discussed. They do reply and
| even publish an article you wrote about what is going wrong
| in science, but it's the wrong sort of site so nobody who
| can do anything about the problem is reading it.
|
| 6. In addition if you red-teamed the wrong field, you get
| booted off Twitter for "spreading misinformation" and the
| press describe you as a right wing science denier. Nobody has
| ever asked you what your politics are and you're not denying
| science, you're denying pseudo-science in an effort to make
| actual science better, but none of that matters.
|
| 7. You realize that this is a pointless endeavour. The people
| you hoped would welcome your "red teaming" are actually
| committed to defending the institutions regardless of merit,
| and the people who actually do welcome it are all
| ideologically persona non grata in the academic world - even
| inviting them to give a talk risks your cancellation. The
| End.
|
| An essay that explores this problem from the perspective of
| psychology reform can be found here:
|
| https://www.psychologytoday.com/us/blog/how-do-you-
| know/2021...
| mherdeg wrote:
| I took a science journalism class in college where our
| instructor had us read a paper and then write the news story
| that explained what was interesting about the result.
|
| "You all got it wrong," he said, "the news is not that Amy
| Wagers could not make things work with mouse stem cells the way
| this prior paper said this one time. The news is that Wagers-
| ize is becoming a verb which means 'to disprove an amazing
| result after attempting to replicate it'. The lab has Wagers-ed
| another pluripotent stem cell result. The news is about how
| often this happens and what it means for this kind of science."
|
| This class was in 2006 and this later profile in 2008 seemed to
| bear things out the way he said:
| https://news.harvard.edu/gazette/story/2008/07/amy-wagers-fo...
| j7ake wrote:
| As long as there are quantitative criteria on which jobs and
| promotions depend, there will be people gaming the system.
|
| One solution is to couple these quantitative criteria with
| independent committees that assess people beyond the metrics, but
| that requires a lot of human effort and doesn't scale.
|
| Assessing people in ways that don't scale seems to be the way to
| avoid this gaming trap in academia.
| cycomanic wrote:
| I'd argue that it's not just that the metrics don't scale; the
| problem is that we are trying to find quantitative metrics for
| something that can't be easily quantified. The worst outcome is
| not even the forgeries and fakes as in this article, but that
| even the vast majority of ethical academics are being pushed in
| a direction that is detrimental to long-term scientific
| progress, in particular favoring short-term outcomes over
| long-term progress.
| lutorm wrote:
| _even the vast majority of ethical academics are being pushed
| in a direction that is detrimental to long-term scientific
| progress, in particular favoring short-term outcomes over
| long-term progress_
|
| I agree. The egregious fraud is just the high-sigma wing.
| It's a symptom, but the real problem is how it affects the
| majority.
| rossdavidh wrote:
| Interesting point; it is much like the problems of trying to
| assess programmer productivity.
| _tom_ wrote:
| I was thinking it's much like Google trying to deal with SEO.
| Most people optimize for high google ranking, not quality
| content.
|
| Google periodically changes the evaluation, in theory to
| reward good content and penalize bad, but people still try to
| game the system, rather than improving content.
| _tom_ wrote:
| And non-quantitative evaluations are prone to favoritism and
| prejudice. AKA people gaming the system.
|
| I doubt there's an easy answer.
|
| Trying to better align short-term objectives with longer-term
| ones could help, but that just makes the system harder to game;
| it doesn't eliminate the gaming.
| j7ake wrote:
| It's why one needs both: undeniable productivity by
| quantitative metrics, and glowing reviews from independent
| panels that are not influenced by favoritism (almost like an
| audit).
| epolanski wrote:
| Data fabrication is sadly the norm nowadays.
|
| I was a chemistry researcher working on renewables, and during my
| master's thesis 9 months were spent trying to validate fake
| results (from a publication by a scientist who, moreover, had
| worked in our group).
| some_random wrote:
| It's crazy to me that academic fraud isn't a more pressing
| concern to society in general and academia in particular. The
| scientific process as currently implemented is broken across
| every single discipline. Even subjects like CS, which should in
| theory be trivially reproducible, rarely are. The replication
| crisis is still going on, but only nerds like us care.
| derbOac wrote:
| There are many causes of the lack of concern, but I think the
| heart of the problem, at least in the US, is that science has
| become politicized such that attempts at reform are
| mischaracterized for political gain. There's also a bit of
| ignorance in the general public, but that's only part of it.
|
| For example, if some on the right suggest some difficult but
| needed reforms, it tends to be spun as an attack on science. Or
| complaints that trivial projects are being overfunded get
| misinterpreted by the right and they try to make an example of
| the wrong studies for the wrong reasons.
|
| The pandemic was a good example of this in my mind, in that I
| think there were some serious systematic problems in academics
| and healthcare that were laid brutally bare, and many people
| suffered or died as a result. But then the whole thing got
| misidentified and sucked into the political vortex and all you
| end up with are hearings about how to rehabilitate the CDC, as
| if that is the problem and not a symptom of even bigger
| problems.
|
| I still think there are ways for things to change, but the most
| likely of them involve unnecessary suffering and chaos.
| N1H1L wrote:
| I can give a different perspective. It is not because of
| politicization IMO - at least not in the hard sciences. The
| problem comes from way up, from Congress, because the immediate
| impact of science is not obvious. Especially for basic sciences,
| the impact takes decades to be really felt.
|
| But then how do you do promotions? How do you judge output?
| Worse still, how does the US Congress justify spending taxpayer
| dollars? Rather than acknowledging that any short-term
| measurement of the quality of science is a fool's errand, we
| have doubled down on meaningless metrics like impact factors
| and h-indices. And this is what we have as a result.
| ArnoVW wrote:
| Aside from reproducibility issues in ML, what sort of issue did
| you have in mind in CS?
|
| Most CS work is 90% maths; how can you have reproducibility
| issues?
| the_snooze wrote:
| Take, for example, network measurement research:
| https://conferences.sigcomm.org/imc/2021/accepted/
|
| One of those papers is about counting the scale and scope of
| online political advertising during the 2020 election. How
| does one reproduce that study? The 2020 election is long
| past, and that data isn't archived anywhere other than what
| the researchers have already collected. This is a pretty
| simple empirical data collection task, but you can't just
| re-measure that today because that study is about a moving
| target.
| tlb wrote:
| I did my dissertation on this problem 25 years ago. It hasn't
| gone away.
|
| In general, performance comparisons are hard to reproduce.
| For instance, when benchmarking network protocols, often a
| tiny change in configuration makes a big change in the results.
| You might change the size of a buffer from 150 packets to 151
| packets and see performance double (or halve).
|
| Instead of making measurements with some arbitrary choices
| for parameters, you can take lots of measurements with
| parameters randomly varied to show a distribution of
| measurements. It's hard work to track down all the possible
| parameters and decide on a reasonable range for them, so it's
| rarely done. I found many 10x variances in network protocol
| performance (like in how fairly competing TCP streams share
| bandwidth).
|
| The big idea was to show that by randomizing some decisions
| in the protocol (like discarding packets with some
| probability as the buffer gets full) you can make the
| performance less sensitive to small changes, i.e. more
| reproducible. Less sensitivity is especially good when you
| care about the worst-case performance rather than average. It
| can also make tuning a protocol much easier, since you aren't
| constantly being fooled by unstable performance.
|
| Performance sensitivity analysis is hard work, so most papers
| are just like "we ran our new thing 3 times and got similar
| numbers so there you go."
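|
| (A rough sketch of that kind of harness in Python - the
| benchmark function below is a made-up stand-in for a real
| protocol run, and the parameter ranges are assumptions chosen
| only to illustrate reporting a distribution rather than one
| hand-picked configuration:)
|
|     import random, statistics
|
|     def run_benchmark(buffer_pkts, rtt_ms, loss_prob, rng):
|         # Stand-in for a real protocol run (invented formula).
|         # The artificial "cliff" at 151 packets mimics the kind
|         # of configuration sensitivity described above.
|         base = 100.0 / (1.0 + rtt_ms / 50.0)
|         cliff = 0.5 if buffer_pkts < 151 else 1.0
|         noise = rng.gauss(1.0, 0.05)
|         return base * cliff * (1.0 - 10 * loss_prob) * noise
|
|     def randomized_sweep(n=200, seed=0):
|         rng = random.Random(seed)
|         samples = [run_benchmark(
|             buffer_pkts=rng.randint(50, 300),   # ranges: guesses
|             rtt_ms=rng.uniform(10.0, 200.0),
|             loss_prob=rng.uniform(0.0, 0.02),
|             rng=rng) for _ in range(n)]
|         cuts = statistics.quantiles(samples, n=20)
|         return cuts[0], statistics.median(samples), cuts[-1]
|
|     # Report a spread (~5th/50th/95th percentile), not one number.
|     print(randomized_sweep())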
| thechao wrote:
| If you're any good at your chosen specialty you get a "feel"
| for the bullshit. I know this doesn't help the public. My
| experience is in medical research, crystallography, and
| computer science. Here's an example for detecting "bullshit" in
| cardiology: call up the MD PI from the published paper and ask
| to review anonymized charts from patients targeted with the
| procedure. Are there any? Then, the research is probably good;
| are there none? It's probably because it'd kill the patient.
| Similarly, in Programming Language Theory: we'd just ask which
| popular compilers added the pass. Is it on in -O3 in LLVM?
| Serious fucking result; is it in some dodgy branch in GHC? Not
| useful.
| rossdavidh wrote:
| I think there are two problems impeding our ability to focus
| better on this:
|
| 1) for many people, the idea that science has widespread fraud
| is just hard to accept; in this respect it is similar to the
| difficulties that many religious communities have in accepting
| that their clergy could have a corruption problem
|
| 2) the solutions require thinking about problems like
| p-hacking, incentives, selection effects, and other non-trivial
| concepts that are tough for the average person to wrap their
| heads around.
| derbOac wrote:
| I've often thought religious corruption is a good analogy, in
| that many of the societal dynamics are very similar. As I'm
| writing this the parallels are interesting to think about
| relative to US politics.
| throwawayboise wrote:
| It is a good analogy. For most lay people, science is a
| religion. They lack the expertise to understand the theory,
| but they unquestioningly accept the explanations and
| interpretations of the so-called experts.
|
| Most people don't understand astronomy and physics well
| enough to prove to themselves that the earth orbits the sun
| and not vice-versa. Yet they believe it does, with
| certainty, because they have been taught that it is true.
| SubiculumCode wrote:
| Also: I have not seen evidence of widespread fraud. Evidence
| of fraud, yes; evidence of widespread fraud, no.
| rossdavidh wrote:
| Agreed it's an important point that fraud is only a
| fraction of the problem.
| derbOac wrote:
| That's a fair point, although fraud per se is only a small
| part of all the problems. There's other forms of corruption
| than fraud, and a lot of it falls into this zone of
| plausible deniability rather than outright fraud. Also, I
| think the problems carry the most weight where power is most
| concentrated, so what matters isn't so much "how widespread is
| corruption?" but rather "how is corruption distributed among
| power structures in academia, and what is rewarded?"
| bhk wrote:
| I have. According to [1], "1 in 4 cancer research papers
| contains faked data". As the article argues, the standards
| are perhaps unreasonably strict, but even by more favorable
| criteria, 1/8 of the papers contained faked data.
| Interestingly, [2], using the same approach, found fraud in
| 12.4% of papers in the International Journal of Oncology.
| More broadly, [2] found fraud in about 4% of the papers
| studied (782 of 20,621). I'd say that's pretty widespread,
| but you further have to consider that these papers focused
| narrowly on a very specific type of fraud that is easy to
| detect (image duplication), so we would expect the true
| number of fraudulent papers to be much higher.
|
| [1] https://arstechnica.com/science/2015/06/study-
| claims-1-in-4-...
|
| [2] https://www.biorxiv.org/content/biorxiv/early/2016/04/2
| 0/049...
| mistermann wrote:
| Don't forget though: events precede evidence, and evidence
| doesn't always follow events.
|
| Also: perception is ~effectively reality.
| javajosh wrote:
| Could it be there's just too much science being done for much of
| it to be any use? And that this oversupply causes these schemes,
| as a side-effect? If so, selling authorship is merely a symptom
| of the worthlessness of most modern science.
|
| For much of human history, science was something you did in your
| spare time - or, if you were exceptional, you might have a
| patron. Then nation states discovered the value of technology and
| science, and wanted more, and so have created science factories.
| But, perhaps unsurprisingly, the rate of science production
| cannot really be improved in this way, and yet the economics of
| science demand that it does. This disconnect between reality and
| expectation is the root of this problem, and many others.
| rossdavidh wrote:
| Oof. Good point. I feel like there's a similar pattern to
| having too much VC money chasing too few actually good ideas to
| invest in.
| pphysch wrote:
| Or a government printing money to hire private contractors,
| completely disregarding its ability to do anything on its own.
|
| To some extent, this is the curse of being the creator of the
| global reserve currency. The US government can, in theory,
| print as much money as it wants and pay off whoever it wants
| to do whatever it wants. This also extends to the academic
| and financial (VC) sectors, because a lot of that liquidity
| comes directly from the Government/Fed.
|
| Unfortunately this leads to a culture of corruption (who gets
| the grants/contracts/funding?) and widespread fraud. This
| causes the ROI of money printing to go down, so the money
| printer accelerates and we get inflation too.
| SubiculumCode wrote:
| This is in fact incredibly wrong. At least in my field, there
| is so much more data than there are qualified experts to
| analyze it. For one thing, academia pays so much less than the
| private sector that postdocs are leaving.
| seiferteric wrote:
| Something I was wondering is if faking results is so common,
| then surely the things they are researching must never be
| used in any application, right? If they were, it would quickly
| be found that they do not actually work...
| HarryHirsch wrote:
| This is exactly how it works in practice. Anyone who works at
| the bench learns quickly to spot the frauds and fakes and
| avoids them. That's the "replication" everyone talks about,
| no special agency to waste funds on boring stuff needed.
| gwd wrote:
| > If they were, it would quickly be found that it does not
| actually work...
|
| Unfortunately some of the effect sizes are so small that it's
| hard to tell what's working or not. The results of papers on
| body building, for instance, are definitely put into practice
| by some people. If the claim of the paper is that eating
| pumpkin [EDIT] decreases muscle recovery time by 5%, how is
| an individual who starts eating pumpkin supposed to notice
| that he's not getting any particular benefit from following
| its advice? Particularly if he's also following random bits
| of advice from a dozen other papers, half of which are valid
| and half of which are not?
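|
| (To put a rough number on that, with entirely made-up but
| plausible figures: suppose the true effect is 5% and day-to-day
| noise in recovery time is around 20%. A standard back-of-the-
| envelope power calculation then says a single person would need
| on the order of a hundred controlled before/after measurements
| to reliably detect it:)
|
|     from math import ceil
|
|     z_alpha, z_beta = 1.96, 0.84  # two-sided 5% test, 80% power
|     effect, sd = 0.05, 0.20       # assumed effect and noise
|
|     # Paired-test approximation:
|     # n = ((z_alpha + z_beta) * sd / effect)^2
|     n = ceil(((z_alpha + z_beta) * sd / effect) ** 2)
|     print(n)                      # -> 126 controlled sessions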
| btrettel wrote:
| One problem I've observed is that people applying things
| often cargo-cult "proven" things from the scientific
| literature that aren't actually proven. It's easier to say
| that you're following "best practices" than it is to check
| that what you're doing works, unfortunately.
| twofornone wrote:
| Maybe it's a deeper problem related to western liberal notions
| that anyone can do anything if they just "set their mind to
| it". We have a glut of "professionals" across industries and
| institutions who don't really have any business being there,
| but the machine requires that they appear to be useful, and so
| mechanisms emerge to satisfy this constraint. A consequence in
| science is a long list of poor-quality junk publications, and
| few people are willing to acknowledge the nakedness of the
| emperor, not just for fear of losing their positions but
| because doing so may betray their own redundancy.
| photochemsyn wrote:
| My own rather short academic career involved doing lab work with
| three different PI-led groups. One PI was actually excellent, and
| I really had no idea how good I had it. I caught the other two
| engaging in deliberately fraudulent practices. For example, data
| they'd collect from experiments would be thrown out selectively
| so that they could publish better curve-fits. Another trick was
| fabricating data with highly obscure methods that other groups
| would be unlikely to replicate. They'd also apply pressure to
| graduate students to falsify data in order to get results that
| agreed with their previously published work.
|
| The main difference between the excellent PI and the two
| fraudsters was that the former insisted on everyone in her lab
| keeping highly accurate and detailed daily lab notebooks, while
| the other two had incredibly poor lab notebook discipline (and
| often didn't even keep records!). She actually caught one of her
| grad students fudging data via this method, before it went to
| publication. Another requirement was that samples had to be
| blindly randomized before we analyzed them, so that nobody could
| manipulate the analytical process to get their desired result.
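|
| (Blinding like that can be as simple as a small script that
| recodes the sample labels and keeps the key with someone who
| isn't doing the analysis. A rough Python sketch - the file name
| and label format here are invented for the example:)
|
|     import csv, random
|
|     def blind(sample_ids, key_path, seed=None):
|         # Shuffle the samples and assign opaque codes; the
|         # code -> original ID key is written to a file that is
|         # kept away from whoever runs the analysis.
|         rng = random.Random(seed)
|         shuffled = list(sample_ids)
|         rng.shuffle(shuffled)
|         mapping = {f"S{k:03d}": sid
|                    for k, sid in enumerate(shuffled, 1)}
|         with open(key_path, "w", newline="") as f:
|             writer = csv.writer(f)
|             writer.writerow(["code", "original_id"])
|             writer.writerows(mapping.items())
|         return sorted(mapping)   # analysts see only the codes
|
|     codes = blind(["ctrl-1", "ctrl-2", "trt-1", "trt-2"],
|                   "blind_key.csv")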
|
| If you're thinking about going into academia, that's the kind of
| thing to look out for when visiting prospective PIs. Shoddy
| record keeping is a huge red flag. Inability to replicate
| results, and in particular no desire to replicate results, is
| another warning sign. And yes, a fair number of PIs have made
| careers out of publishing fraudulent results and never get
| caught, and they infest the academic system.
| georgecmu wrote:
| I would say that this applies even more so outside of academia.
| At early stages of development, a research group's or company's
| product is by necessity a report or a presentation rather than
| a physical plant's or process's real, quantifiable performance.
| No malicious intent is required; it's just all too easy to fool
| yourself or cherry-pick data to support desired conclusions
| when the recordkeeping is poor.
|
| In my hard-tech experiment-heavy start-up there's no way we
| could have made any actual technical progress without setting
| up a solid data preservation and analysis framework first. For
| every experimental run, all the original sensor data are
| collected and immediately uploaded along with any photos,
| videos, and operator comments to a uniquely-tagged confluence
| page. Results and data from any further data or product
| analysis are linked to this original page.
|
| As an anecdotal example, we recently caught swapped dataset
| labels in results from analysis performed on our physical
| samples by a third-party lab. We were able to do this easily
| just because we could refer back to every other piece of
| information regarding these samples, including the conditions
| in which they were generated months prior to this analysis. As
| soon as all the data were on display at once, the discrepancies
| were obvious.
| ketanmaheshwari wrote:
| [PLUG] Some of what you mention are "negative results" that are
| quite prevalent and a necessary part of any research. However,
| the expected mold at publishing venues is such that they are
| not considered worthwhile.
|
| My colleagues and I are trying to address this by creating a
| platform to discuss and publish such "bad" or "negative"
| results. More info here:
|
| https://error-workshop.org/
| EamonnMR wrote:
| Does the new In The Pipeline blog have an RSS feed? I haven't
| been able to find it.
| bannedbybros wrote:
| Enginerrrd wrote:
| I always thought it would be a good idea to start a journal that
| has a lab submit their methods and intent of study for peer
| review and approval / denial PRIOR to performing the work. Then,
| if approved, and as long as they adhere to the approved methods,
| they get published regardless of outcome. That would really
| encourage the publishing of negative results and eliminate a lot
| of the incentive to fudge the numbers on the data. It would
| probably overly reward pre-existing clout, but frankly that's a
| problem ANYWAY.
| Guybrush_T wrote:
| This is done with clinical trials (or at least it's
| recommended). Many researchers register their study at
| https://clinicaltrials.gov/ before data collection starts. I'm
| not sure if something similar exists for lab based research.
| francislavoie wrote:
| Reminds me of Bobby Broccoli's video series on Jan Hendrik Schon
| who almost got the Nobel Prize in Physics fraudulently. Extremely
| good watch:
|
| https://www.youtube.com/playlist?list=PLAB-wWbHL7Vsfl4PoQpNs...
| slowhand09 wrote:
| Worked on a NASA program once, about measuring Earth Science
| data. We built a database application to gather suggested
| requirements from members of the earth science community. One
| such member from our own team helped develop specs for our
| system. After we built it, she wanted to measure its utility and
| usability. She watched as users navigated and entered data into
| the system. She also asked me and the members of my team who
| developed the software to use it and be measured. I
| and one other developer (2 of 3 members) explained why we
| implemented each feature as we were utilizing the system. The
| "scientist" measuring us all promptly published as a conclusion
| in her paper "The usability of the system was better for
| inexperienced users than it was for experienced users. The
| experienced users took nearly 50% longer to navigate and enter
| similar requirements". She basically made up an "interesting"
| conclusion by omitting characterization of our testing session,
| where we explained how we implemented her requirements.
___________________________________________________________________
(page generated 2022-04-11 23:00 UTC)