[HN Gopher] Huge reproducibility project fails to validate biome...
       ___________________________________________________________________
        
       Huge reproducibility project fails to validate biomedical studies
        
       Author : rntn
       Score  : 102 points
       Date   : 2025-04-25 16:14 UTC (6 hours ago)
        
 (HTM) web link (www.nature.com)
 (TXT) w3m dump (www.nature.com)
        
       | coastermug wrote:
        | I don't have the context on why Brazil was chosen here
        | (paywall), but I coincidentally read a story on here about
        | Richard Feynman visiting Brazil, where he assessed their
        | teaching and tried to impart his own teaching and learning
        | techniques.
        
         | elashri wrote:
          | The answer is straightforward. They are a coalition of
          | Brazilian labs (click on the link in the first sentence to get
          | more information), so it seems natural that they would focus
          | on research conducted in their own country. Also, it is not
          | the first research of its kind, as the Nature article provides
          | context:
         | 
         | > The teams were able to replicate the results of less than
          | half of the tested experiments. That rate is in keeping with
         | that found by other large-scale attempts to reproduce
         | scientific findings. But the latest work is unique in focusing
         | on papers that use specific methods and in examining the
         | research output of a specific country, according to the
         | research teams.
        
       | 85392_school wrote:
       | https://archive.is/mmzWj
        
       | N_A_T_E wrote:
       | Is there any path forward to fixing the current reproducibility
       | crisis in science? Individuals can do better, but that won't
       | solve a problem at this scale. Could we make systemic changes to
       | how papers are validated and approved for publication in major
       | journals?
        
         | directevolve wrote:
         | Reproducibility studies are costly in time, reagents, and
          | possibly irreplaceable primary samples. I would usually prefer
          | a different study looking at similar mechanisms using different
          | methods over a reproduction of the original methods, although
         | there's an important place for direct replication studies like
         | this as well. We can also benefit from data sleuths uncovering
         | fraud, better whistleblower systems, and more ability for
         | graduate students to transfer out of toxic labs and into better
         | ones with their funding, reputation and research progress
         | intact.
         | 
         | Scientists have informal trust networks that I'd like to see
         | made explicit. For example, I'd like to see a social media
         | network for scientists where they can PRIVATELY specify trust
         | levels in each other and in specific papers, and subscribe to
         | each others' trust networks, to get an aggregated private view
         | of how their personal trusted community views specific labs and
         | papers.
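          | 
          | A minimal sketch of how such an aggregated view could work
          | (purely hypothetical, no existing service implied; the hard
          | part omitted here is computing the aggregate without leaking
          | any individual's private ratings):
          | 
          |     from dataclasses import dataclass, field
          | 
          |     @dataclass
          |     class Scientist:
          |         # paper/lab id -> my private trust score in [0, 1]
          |         ratings: dict = field(default_factory=dict)
          |         # (peer, weight) pairs whose opinions I subscribe to
          |         subscriptions: list = field(default_factory=list)
          | 
          |         def aggregate_view(self, item):
          |             """Blend my rating with weighted peer ratings."""
          |             scores = []
          |             if item in self.ratings:
          |                 scores.append((self.ratings[item], 1.0))
          |             for peer, weight in self.subscriptions:
          |                 if item in peer.ratings:
          |                     scores.append((peer.ratings[item], weight))
          |             if not scores:
          |                 return None  # nobody I trust has rated it
          |             total = sum(w for _, w in scores)
          |             return sum(s * w for s, w in scores) / total
          | 
          |     alice, bob = Scientist(), Scientist()
          |     bob.ratings["paper-X"] = 0.2
          |     alice.subscriptions.append((bob, 0.8))
          |     print(alice.aggregate_view("paper-X"))  # 0.2, via bob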
        
           | JadeNB wrote:
           | > Scientists have informal trust networks that I'd like to
           | see made explicit. For example, I'd like to see a social
           | media network for scientists where they can PRIVATELY specify
           | trust levels in each other and in specific papers, and
           | subscribe to each others' trust networks, to get an
           | aggregated private view of how their personal trusted
           | community views specific labs and papers.
           | 
            | That sounds fascinating, but I'd have a darned high bar to
            | participate, to make sure I wasn't inadvertently disclosing my
            | very personal trust settings. Past experiences with
            | intentional or unintentional data deanonymization (or just
            | insufficient anonymization) make me very wary of such
            | privacy claims.
        
         | dilap wrote:
         | Yeah "individuals do better" is never the answer -- you've got
         | to structure incentives, of course.
         | 
          | I _don't_ think you want to slow down publication (and
          | probably peer review and prestige journals are
          | useless/obsolete in the era of the internet); it's already
          | crazy slow.
         | 
          | So let's see: you want to incentivize two things: (1) no
          | false claims in original research, and (2) having people try
          | to reproduce claims.
          | 
          | So here's a humble proposal for a funding source (say... the
          | govt): set aside a pot of money specifically for people to try
          | to reproduce research; let this be a valid career path. Your
          | goal should be to get research validated by repro before OTHER
          | research starts to build on those premises (avoiding having
          | the whole field go off on wild goose chases like happened w/
          | Alzheimer's). And then, when results DON'T repro, blackball
          | the original researchers from funding. (With whatever sort of
          | due process is needed to make this reasonable.)
         | 
         | I think it'd sort things out.
        
           | directevolve wrote:
           | Punishing researchers who make mistakes or get unlucky due to
           | noise in the data is a recipe for disaster, just like in
           | other fields. The ideal amount of fraud and false claims in
           | research is not zero, because the policing effort it would
           | take to accomplish this goal would destroy all other forms of
           | value. I can't emphasize enough how bad an idea blackballing
           | researchers for publishing irreproducible results would be.
           | 
           | We have money to fund direct reproducibility studies (this
           | one is an example), and indirect replication by applying
            | orthogonal methods to similar research topics can be more
           | powerful than direct replication.
        
             | MostlyStable wrote:
             | Completely agree.
             | 
              | Given the way that science and statistics work, completely
              | honest researchers who do everything correctly and make no
              | mistakes at all will still have some research that fails
              | to reproduce. And the flip side is that for some
              | completely correct work that got the right answer, the
              | reproduction attempt will itself _incorrectly_ fail some
              | proportion of the time. Type I and Type II errors are both
              | real and occur without any need for misconduct or
              | mistakes.
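              | 
              | A minimal simulation of the second case (assumed numbers:
              | a real 0.5-SD effect, 64 subjects per group, two-sided
              | t-test at p < 0.05, i.e. roughly 80% power):
              | 
              |     import numpy as np
              |     from scipy import stats
              | 
              |     rng = np.random.default_rng(0)
              |     trials, fails = 10_000, 0
              |     for _ in range(trials):
              |         a = rng.normal(0.5, 1.0, 64)  # treatment group
              |         b = rng.normal(0.0, 1.0, 64)  # control group
              |         _, p = stats.ttest_ind(a, b)
              |         if p >= 0.05:  # honest replication "fails"
              |             fails += 1
              |     print(f"{fails / trials:.0%} of replications miss")
              | 
              | Roughly one in five faithful replications of a perfectly
              | real effect comes back "not significant" here, with zero
              | misconduct involved.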
        
             | JadeNB wrote:
             | > The ideal amount of fraud and false claims in research is
             | not zero, because the policing effort it would take to
             | accomplish this goal would destroy all other forms of
             | value.
             | 
             | Surely that just means that we shouldn't spend too much
             | effort achieving small marginal progress towards that
             | ideal, rather than that's not the ideal? I am a scientist
             | (well, a mathematician), and I can maintain my idealism
             | about my discipline in the face of the idea that we can't
             | and shouldn't try to catch and stop all fraud, but I can't
             | maintain it in the face of the idea that we should aim for
             | a small but positive amount of fraud.
        
               | mrguyorama wrote:
                | The point is that it's not actually "ideal".
               | 
               | You CANNOT create a system that has zero fraud without
               | rejecting a HUGE amount of legitimate work/requests.
               | 
               | This is as true for credit card processing as it is for
               | scientific publishing.
               | 
               | There's no such thing as "Reject 100% of fraud, accept
               | 100% of non-fraud". It wouldn't be "ideal" to make our
               | spaceships with anti-gravity drives, it would be "science
               | fiction".
               | 
                | The relationship between how hard you try to prevent
                | fraud and how much legitimate traffic you let through is
                | absurdly non-linear, and super dependent on context. Is
                | there still low-hanging fruit in the fraud-prevention
                | pipeline for scientific publishing?
               | 
               | That depends. Scientists claim that having to treat each
               | other as hostile entities would basically destroy
               | scientific progress. I wholeheartedly agree.
               | 
               | This should be obvious to anyone who has approved a PR
               | from a coworker. Part of our job in code review is to
               | prevent someone from writing code to do hostile things.
               | I'm sure most of us put some effort towards preventing
                | obvious problems, but if you've ever seen
                | https://en.wikipedia.org/wiki/International_Obfuscated_C_Cod...
                | or some
               | of the famous bits of code used to hack nation states
               | then you should recognize that the amount of effort it
               | would take to be VERY SURE that this PR doesn't introduce
               | an attack is insane, and no company could afford it.
               | Instead, we assume that job interviews, coworker vibes,
               | and reputation are enough to dissuade that attack vector,
               | and it works for almost everyone except the juiciest
               | targets.
               | 
               | Science is a high trust industry. It also has "juicy
               | targets" like "high temp superconductor" or "magic pill
               | to cure cancer", but scientists approach everything with
               | "extreme claims require extreme results" and that seems
               | to do alright. They mostly treated LK-99 with "eh, let's
               | not get hasty" even as most of the internet was convinced
               | it was a new era of materials. I think scientists have a
               | better handle on this than the rest of us.
        
               | JadeNB wrote:
                | > The point is that it's not actually "ideal".
               | 
               | > You CANNOT create a system that has zero fraud without
               | rejecting a HUGE amount of legitimate work/requests.
               | 
               | I think that we are using different definitions of
               | "ideal." It sounds like your definition is something like
               | "practically achievable," or even just "can exist in the
               | real world," in which case, sure, zero fraud is not ideal
               | in that sense. To check whether I am using the word
               | completely idiosyncratically, I just looked it up in
               | Apple Dictionary, and most of the senses seem to match my
               | conception, but I meant especially "2b. representing an
               | abstract or hypothetical optimum." It seems very clear to
               | me that you would agree with zero fraud being ideal in
               | sense "2a. existing only in the imagination; desirable or
               | perfect but not likely to become a reality," but possibly
               | we can even agree that it also fits sense 2b above.
        
             | dilap wrote:
              | Well, don't forget I also said this!
             | 
             | > With whatever sort of due process is needed to make this
             | reasonable
             | 
             | Is it not reasonable to not continue to fund scientists
             | whose results consistently do not reproduce? And should we
              | not spend the funds to verify that they _do_ (or don't)
             | reproduce (rather than e.g. going down an incredibly
             | expensive goose-chase like recently happened w/ Alzheimer's
             | research)?
             | 
             | Currently there is more or less no reason not to fudge
             | results; your chances of getting caught are slim, and
             | consequences are minimal. And if you don't fudge your
             | results, you'll be at a huge disadvantage when competing
             | against everyone that does!
             | 
              | Hence the replication crisis.
             | 
              | So clearly something must be done. If not penalizing
              | failures to reproduce and funding reproduction efforts,
              | then what?
        
               | jltsiren wrote:
               | Your way of thinking sounds alien to me. You seem to
               | assume that people mostly just follow the incentives,
               | rather than acting according to their internal values.
               | 
               | Science is a field with low wages, uncertain careers, and
               | relatively little status. If you respond strongly to
               | incentives, why would you choose science in the first
               | place? People tend to choose science for other reasons.
               | And, as a result, incentives are not a particularly
               | effective tool for managing scientists.
        
         | brnaftr361 wrote:
         | There's usually indirect reproduction. For instance I can take
         | some principle from a study and integrate it into something
         | else. The real issue is that if the result is negative - at
         | least from my understanding - the likelihood of publication is
         | minimal, so it isn't communicated. And if the principle I've
          | taken was at fault, there's a lot of space for misattribution:
          | I could blame a litany of different confounders for failures
          | until, after some _long_ while, I might decide to place blame on
         | the principle itself. That itself may require a complete rework
         | of any potential paper, redoing all the experiments (depending
         | on how anal one is in data collection).
         | 
         | Just open up a comment section for institutional affiliates.
        
         | somethingsome wrote:
          | IMO, stopping the race toward a better h-index.
          | 
          | There is a huge amount of pressure to publish, publish,
          | publish.
          | 
          | So many researchers prefer to write very simple things that
          | are probably true, or applicative work (which is kind of
          | useful), or to publish false/fake results.
        
           | guerby wrote:
            | Maybe try to define a "reproducible" h-index, i.e. your
            | publication doesn't count, or counts less, until a different
            | team has reproduced your results, and the team doing the
            | reproduction work gets some points too.
            | 
            | (And maybe add more points if, in order to reproduce, you
            | didn't have to ask the original team plenty of questions,
            | i.e. the original paper didn't omit essential information.)
        
             | somethingsome wrote:
              | The thing is, that would encourage two teams to cheat
              | together. It would displace the problem; I'm not sure it
              | would limit the effect that much(?)
        
           | somethingsome wrote:
            | I'm curious, I don't get why the downvotes? Having to race
            | to publish pushes people to cheat. It didn't occur to me
            | that this was a bad point, but if you have a different
            | opinion I would gladly hear it!
        
             | southernplaces7 wrote:
              | > I don't get why the downvotes?
             | 
             | Because a great many who comment on this site are infantile
             | but self-congratulating idiots who just can't help
             | themselves on downvoting anything that doesn't fit their
             | pet dislikes. That button should be removed or at least
             | made not to grey-out text.
        
         | analog31 wrote:
         | Disclosure: I'm a scientist, specializing in scientific
         | measurement equipment, so of course reproducibility is my
         | livelihood.
         | 
         | But at the same time, I doubt that fields like physics and
         | chemistry had better practices in, say, the 19th century. It
         | would be interesting to conduct a reproducibility project on
         | the empirical studies supporting electromagnetism or
         | thermodynamics. There were probably a lot of crap papers!
         | 
         | Those fields had a backup, which was that studies _and
         | theories_ were interconnected, so that they tended to cross-
         | validate one another. This also meant that individual studies
         | were hot-pluggable. One of them could fail replication and the
          | whole edifice wouldn't suddenly collapse.
         | 
         | My graduate thesis project was never replicated. For one thing,
         | the equipment that I used had been discontinued before I
         | finished, and cost about a million bucks in today's dollars. On
         | the other hand, two labs built similar experiments that were
         | considerably better, made my results obsolete, and enabled
         | further progress. That was a much better use of resources.
         | 
         | I think fixing replication will have to involve fixing more
         | than replication, but thinking about how science progresses as
         | a whole.
        
         | _aavaa_ wrote:
          | Pre-registration is a pretty big one: essentially, you outline
          | your research plan (what you're looking for, how you will
          | analyze the data, what bars you are setting for significance,
          | etc.) _before_ you do any research. Your plan is reviewed and
          | accepted (or denied), often by both the funding agency and the
          | journal you want to submit to, _before_ they know the results.
         | 
         | Then you perform the experiment exactly* how you said you would
         | based on the pre-registration, and you get to publish your
         | results whether they are positive or negative.
         | 
         | * Changes are allowed, but must be explicitly called out and a
         | valid reason given.
         | 
         | https://en.wikipedia.org/wiki/Preregistration_(science)
        
           | neilv wrote:
           | From the perspective of a dishonest researcher, what are the
           | compliance barriers to secretly doing the research work, and
           | only after that doing the pre-registration?
        
             | akshitgaur2005 wrote:
             | You would need the funding anyway before you could start
             | the research
        
               | smokel wrote:
               | One could implement some pipelining to avoid that
               | problem.
        
           | poincaredisk wrote:
            | Wow, I didn't think it was possible, but it sounds like a
            | great way to make research boring :).
        
         | pieisgood wrote:
          | I had always envisioned an institute for reproducibility and
          | peer review. It would be a federally funded institute that
          | would require PhD candidate participation as an additional
          | requirement for receiving your degree. Really it wouldn't be a
          | single place, but an office or team at each university where
          | proper equipment, and perhaps similar conditions, were
          | available for reproducing specific research. Of course the
          | feasibility of this is pretty low.
        
         | cogman10 wrote:
         | Yes, but it costs money. There's no solution that wouldn't.
         | 
         | IMO, the best way forward would be simply doubling every study
         | with independent researchers (ideally they shouldn't have
         | contact with each other beyond the protocol). That certainly
         | doubles the costs, but it's really just about the only way to
         | catch bad actors early.
        
           | JadeNB wrote:
           | > Yes, but it costs money. There's no solution that wouldn't.
           | 
           | True, although, as you doubtless know, as with most things
           | that cost money, the alternative also costs money (for
           | example, in funding experiments chasing after worthless
           | science). It's just that we tend to set aside the costs that
           | we have already priced in. So I tend to think in such
           | settings that a useful approach might be to see how we can
           | make such costs more visible, to increase the will to address
           | them.
        
             | cogman10 wrote:
             | This is a flaw of capitalism.
             | 
             | The flaw being that cost is everything. And, in particular,
             | the initial cost matters a lot more than the true cost.
             | This is why people don't install solar panels or energy
             | efficient appliances.
             | 
             | When it comes to scientific research, proposing you do a
             | higher cost study to avoid false results/data manipulation
             | will be seen as a bug. Bad data/results that make a flashy
             | journal paper (room temp superconductivity, for example)
             | bring in more eyeballs and prestige to the institute vs a
             | well-done study which shows negative results.
             | 
              | It's the same reason public/private cooperation is
              | often a broken model for government spending. A government
             | agency will happily pick a road builder that puts out the
             | lowest bid and will later eat the cost when that builder
             | ultimately needs more money because the initial bid was a
             | fantasy.
             | 
             | Making costs more visible is a good goal, I just don't know
             | how you accomplish that when surfacing those costs will be
             | seen as a negative for anyone in charge of the budget.
             | 
             | > for example, in funding experiments chasing after
             | worthless science
             | 
             | This is tricky. It's basically impossible to know when an
             | experiment will be worthless. Further, a large portion of
             | experiments will be worthless (like 90% of them).
             | 
             | An example of this is superglue. It was originally supposed
             | to be a replacement glass for jet fighters. While running
              | refractometry experiments on it and other compounds, the glue
             | destroyed the machine. Funnily, it was known to be highly
             | adhesive even before the experiment but putting the "maybe
             | we can sell this as a glue" thought to it didn't happen
             | until after the machine was destroyed.
             | 
             | A failed experiment that led to a useful product.
             | 
             | How does someone budget for that? How would you start to
             | surface that sort of cost?
             | 
             | That's where I think the current US grant system isn't a
             | terrible way to do things, provided more guidelines are put
             | in place to enforce reproducibility.
        
               | JadeNB wrote:
               | > > for example, in funding experiments chasing after
               | worthless science
               | 
               | > This is tricky. It's basically impossible to know when
               | an experiment will be worthless. Further, a large portion
               | of experiments will be worthless (like 90% of them).
               | 
               | I don't mean "worthless science" in the sense "doesn't
               | lead to a desired or exciting outcome." Such science can
               | still be very worthwhile. I mean "worthless science" in
               | the sense of "based on fraudulent methods." This might
               | accidentally find the right answer, but the answer it
               | finds, whether wrong or accidentally right, has no
               | scientific value.
        
         | maciej_pacula wrote:
         | On the data analysis side, I think making version control both
         | mandatory and automatic would go a long way.
         | 
         | One issue is that internal science within a company/lab can
         | move incredibly fast -- assays, protocols, datasets and
         | algorithms change often. People tend to lose track of what
         | data, what parameters, and what code they used to arrive at a
         | particular figure or conclusion. Inevitably, some of those end
         | up being published.
         | 
         | Journals requiring data and code for publication helps, but
         | it's usually just one step at the end of a LONG research
         | process. And as far as I'm aware, no one actually verifies that
         | the code you submitted produces the figures in your paper.
         | 
         | It's a big reason why we started https://GoFigr.io. I think
         | making reproducibility both real-time and automatic is key to
         | make this situation better.
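          | 
          | A minimal sketch of the kind of automatic record-keeping I
          | mean (a hypothetical helper, not GoFigr's actual API): stamp
          | every saved figure with the code version, data hash, and
          | parameters that produced it.
          | 
          |     import hashlib, json, subprocess
          |     from pathlib import Path
          | 
          |     def provenance(data_path, params):
          |         commit = subprocess.run(
          |             ["git", "rev-parse", "HEAD"],
          |             capture_output=True, text=True, check=True,
          |         ).stdout.strip()
          |         digest = hashlib.sha256(
          |             Path(data_path).read_bytes()).hexdigest()
          |         return {"git_commit": commit,
          |                 "data_sha256": digest,
          |                 "params": params}
          | 
          |     def save_figure(fig, out_path, data_path, params):
          |         fig.savefig(out_path)
          |         # Sidecar record: anyone can later check exactly what
          |         # produced the figure.
          |         Path(out_path + ".provenance.json").write_text(
          |             json.dumps(provenance(data_path, params), indent=2))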
        
         | 1970-01-01 wrote:
         | Yes, but nobody wants to acknowledge the elephant in the room.
         | Once again, this is why defunding research has gained merit. If
          | _more than half_ of new research is fake, don't protest when
          | plugs are being pulled; you're protesting empirical results.
        
           | ndr42 wrote:
            | Science (including all the fake stuff) advanced humanity
            | immensely. I cannot imagine that cutting research funding
            | to do less science (with the same percentage of fakes) is
            | helpful in any way.
        
           | refulgentis wrote:
           | > more than half of new research is fake
           | 
           | You committed the same sin you are attempting to condemn,
           | while sophomorically claiming it is obvious this sin deserves
           | an intellectual death penalty.
           | 
           | It made me smile. :) Being human is hard!
           | 
            | Now I'm curious, will you acknowledge the elephant in _this_
            | room? It's hard to, I know, but I have a strong feeling you
            | have a commitment to honesty even if it's hard to enact all
            | the time. (i.e. being a human is hard :) )
        
         | pks016 wrote:
          | Yes. Accepting the uncertainty and publishing more than a few.
          | 
          | Often famous/highly cited studies are not replicable. But if
          | you want to work on a similar research problem and publish
          | null/non-exciting results, you're in for a fight. Journals
          | want new, fun, exciting results, but unfortunately the world
          | doesn't work that way.
        
         | Darkstryder wrote:
         | A dream of mine was that in order to get a PhD, you would not
         | have to publish original research, but instead you would have
         | to _reproduce existing research_. This would bring the PhD
         | student to the state of the art in a different way, and it
         | would create a natural replication process for current
         | research. Your thesis would be about your replication efforts,
         | what was reproducible and what was not, etc.
         | 
         | And then, once you got your PhD, only then you would be
         | expected to publish new, original research.
        
           | hyeonwho4 wrote:
           | That used to be the function of undergraduate and Masters
           | theses at the Ivy League universities. "For the undergraduate
           | thesis, fix someone else's mistake. For the Master's thesis,
           | find someone else's mistake. For the PhD thesis, make your
           | own mistake."
        
           | dkga wrote:
           | Well, in some fields some PhD classes involve a lot of
           | reproducing (at least partially) others' papers.
        
       | sshine wrote:
       | If they had just used NixOS, reproducibility would be less of a
       | problem!
        
       | jl6 wrote:
       | It would be interesting for reproducibility efforts to assess
       | "consequentiality" of failed replications, meaning: how much does
       | it matter that a particular study wasn't reproducible? Was it a
       | niche study that nobody cited anyway, or was it a pivotal result
       | that many other publications depended on, or anything in between
       | those two extremes?
       | 
       | I would like to think that the truly important papers receive
       | some sort of additional validation before people start to build
       | lives and livelihoods on them, but I've also seen some pretty
       | awful citation chains where an initial weak result gets overegged
       | by downstream papers which drop mention of its limitations.
        
         | 0cf8612b2e1e wrote:
         | It is an ongoing crisis how much Alzheimer's research was built
         | on faked amyloid beta data. Potentially billions of dollars
         | from public and private research which might have been spent
         | elsewhere had a competing theory not been overshadowed by the
         | initial fictitious results.
        
           | superfish wrote:
           | I went searching for more info on this and found
            | https://www.science.org/content/blog-post/faked-beta-amyloid...
            | which was an interesting read.
        
       | baxtr wrote:
       | I find it bizarre that people find this problematic.
       | 
       | Even Einstein tried to find flaws in his own theories. This is
       | how science should actually work.
       | 
        | We need to actively try to falsify theories and beliefs. Only if
        | we fail to falsify them should the theories be considered valid.
        
         | sshine wrote:
          | If scientific studies aren't reproducible with the reported
          | confidence, they fail as science.
         | 
         | It would be worse if the experiments were not even falsifiable,
         | yes.
         | 
          | But it's pretty damn bad when the conclusion of the original
          | study can never be confirmed when, once in a rare while,
          | someone tries.
        
           | baxtr wrote:
           | I am not saying we should be happy about the results.
           | 
           | I am saying we should be happy that the scientific method is
           | working.
        
         | maronato wrote:
         | These studies didn't try to find theories, they tried to find
         | results.
         | 
         | In your example, it's the same as someone publishing a paper
         | that disproves Relativity - only for us to find that the author
         | fabricated the data.
        
       | jkh1 wrote:
       | In my field, trying to reproduce results or conclusions from
       | papers happens on a regular basis especially when the outcome
       | matters for projects in the lab. However, whatever the outcome,
       | it can't be published because either it confirms the previous
       | results and so isn't new or it doesn't and no journal wants to
       | publish negative results. The reproducibility attempts are
       | generally discussed at conferences in the corridors between
       | sessions or at the bar in the evening. This is part of how a
       | scientific consensus is formed in a community.
        
       | ein0p wrote:
       | And all the drugs and treatments derived from those "studies" are
       | going to continue to be prescribed for another couple of decades,
       | much like they were cutting people up to "cure ulcers" long after
       | it was proven that an antibiotic is all you really need to cure
       | it. It took about a decade for that bulletproof, 100%
       | reproducible study to make much of a difference in the field.
        
         | mrguyorama wrote:
         | Are you one of those people who somehow believe that, because
         | the pop culture "chemical imbalance" ideology was never
          | factual, SSRIs don't work?
         | 
         | They are continually prescribed because their actual mechanism
         | doesn't matter, _they demonstrably work_. That is a matter of
         | statistics, not science.
         | 
         | Anti-science types always point to the same EXTREMELY FEW
         | examples of how science "fails", like Galileo (which had
         | nothing to do with science) and ulcers.
         | 
         | They never seem to point to the much more common examples where
         | people became convinced of something scientifically untrue for
         | decades despite plenty of evidence otherwise. The British
         | recognized a link between citrus and scurvy well before they
         | were even called "Limeys"! They then screwed themselves over by
         | changing some variables (cooking lime juice) and instead turned
         | to a quack ("respected doctor" from a time when most people
         | recognized doctors were worse than the sickness they treated)
         | who insisted on alternative treatment. For about a hundred
          | years, British sailors suffered and died due to one quack's ego.
         | 
         | Phrenology was always, from day one, unscientific. You STILL
          | find morons pushing its claims, using it to justify their
         | godawful, hateful, and murderous world views.
         | 
         | Ivermectin is a great example, since you can create a "study"
         | in Africa to show Ivermectin cures anything you want, because
         | it is a parasite killer and most people in impoverished areas
          | suffer from parasites, so they will improve if they take it. It's
         | entirely unrelated to the illness you claim to treat, but
         | nobody on Facebook will ever understand that, because they
         | tuned out science education decades ago.
         | 
         | How many people have died from alternative medicine quacks
         | pushing outright disproven pseudoscience on people who have
         | been told not to trust scientists by people pushing an agenda?
         | 
         | How much money is made selling sugarpills to idiots who have
         | been told to distrust science, not just "be skeptical of any
         | paper" but outright, _scientists are in a conspiracy to lie to
         | you_!
        
           | logicchains wrote:
           | SSRIs may work, but the science isn't settled that they work
           | better than a placebo:
           | https://bmjopen.bmj.com/content/9/6/e024886.full . And they
           | come with side effects like sexual dysfunction that other
           | treatments (like therapy) don't face.
        
       | mrguyorama wrote:
        | Yet again more people on this site equating "failed to reproduce"
       | with "the original study can't possibly be correct and is
       | probably fraudulent"
       | 
       | That's not how it works. Science is hard, experiment design is
       | hard, and a failure to reproduce could mean a bunch of different
       | things. It could mean the original research failed to mention
       | something critical, or you had a fluke, or you didn't understand
       | the process right, or something about YOUR setup is unknowingly
       | different. Or the process itself is somewhat stochastic.
       | 
       | This goes 10X for such difficult sciences as psychology (which is
        | literally still in its infancy) and biology. In these fields,
       | designing a proper experiment (controlling as much as you can) is
       | basically impossible, so we have to tease signal out of noise and
       | it's failure prone.
       | 
        | Hell, go watch YouTube chemists with PhDs fail to reproduce
       | old papers. Were those papers fraudulent? No, science is just
       | difficult and failure prone.
       | 
       | If you treat "Paper published in Nature/Science" as a source of
       | truth, you will regularly be wrong. Scientists do not do that.
        | Nature is a _magazine_, and a business, and sees itself as
        | trying to push the cutting edge of research, and it will
       | happily publish an outright fraudulent paper if there is even the
       | slightest chance it might be valid, and especially if it would be
       | really cool if it's right.
       | 
       | When discussing how Jan Hendrik Schon got tens of outright
       | fraudulent papers into Nature despite nobody being able to even
       | confirm he ran any experiments, they said that "even false papers
       | can push the field forward". One of the scientists who
       | investigated and helped Schon get fired even said that peer
       | review is no indicator of quality or correctness. Peer review
       | wasn't even a formal part of science publishing until the 60s.
       | 
       | Science is "self correcting" because if the "effect" you saw
       | isn't real, nobody will be able to build off your work.
       | Alzheimer's Amyloid research has been really unproductive, which
       | is how we knew it probably wasn't the magic bullet even before it
       | had fraud scandals.
       | 
       | If you doubt this, look to China. They have ENORMOUS amounts of
       | explicit fraud in their system, as well as a MUCH WORSE "publish
       | or perish" state. Would you suggest it has slowed them down?
       | 
       | Stop trying to outsource your critical thinking to an authority.
       | You cannot do science without publishing wrong or false papers.
       | If you are reading about "science" in a news article, press
       | release, or advertisement, you don't know science. I am
       | continually flabbergasted by how often "Computer Scientists"
       | don't even know the basics of the scientific method.
       | 
       | Scientists understood there was a strong link between cigarettes
       | and cancer at least 20 years before we had comprehensive
       | scientific studies to "prove" it.
       | 
       | That said, there are good things to do to mitigate the harms that
       | "publish or perish" causes, like preregistration and an incentive
       | to publish failed experiments, even though science progressed
       | pretty well for 400 years without them. These reproducibility
       | projects are great, but do not mistake their "these papers
       | failed" as "these papers were written fraudulently, or by bad
       | scientists, or were a waste".
       | 
       | Good programmers WILL ship bugs sometimes. Good scientists WILL
       | publish papers that don't pan out. These are truths of human
       | processes and imperfect systems.
        
         | bsder wrote:
         | > Hell, go watch Youtube Chemists who have Phds fail to
         | reproduce old papers. Were those papers fraudulent? No, science
         | is just difficult and failure prone.
         | 
         | Agreed. Lab technique is a thing. There is a reason for the
         | dark joke that in Physics, theorists are washed up by age 30,
         | but experimentalists aren't even competent until age 40.
        
         | damnitbuilds wrote:
         | "This goes 10X for such difficult sciences as psychology (which
          | is literally still in its infancy) and biology. In these fields,
         | designing a proper experiment (controlling as much as you can)
         | is basically impossible, so we have to tease signal out of
         | noise and it's failure prone."
         | 
         | For psychology replace "Difficult" with "Pseudo".
         | 
         | To lose that tag, Psychology has to take a step back, do basic
         | research, replicate that research multiple times, think about
          | how to do replicable new research, and only then start
         | actually letting psychologists do new research to advance
         | science.
         | 
         | Instead of that, unreplicated pseudo-scientific nonsense
         | psychology papers are being used to tell governments how to
         | force us to live our lives.
        
       | chmorgan_ wrote:
       | I follow Vinay Prasad (https://substack.com/@vinayprasadmdmph) to
       | keep up on these topics. It feels like getting a portal to the
        | future in some way, as he's on the cutting edge of analyzing the
       | quality of the analysis in a ton of papers. You get to see what
       | conclusions are likely to change in the next handful of years as
       | the information becomes more widespread.
        
       | addoo wrote:
       | This doesn't really surprise me at all. It's an unrelated field,
       | but part of the reason I got completely disillusioned with
       | research to the point I switched out of a program with a thesis
       | was because I started noticing reproducibility problems in
       | published work. My field is CS/CE, generally papers reference
       | publicly available datasets and can be easily replicated...
       | except I kept finding papers with results I couldn't recreate.
       | It's possible I made mistakes (what does a college student know,
       | after all), but usually there were other systemic problems on top
       | of reproducibility. A secondary trait I would often notice is a
       | complete exclusion of [easily intuited] counter-facts because
       | they cut into the paper's claim.
       | 
       | To my mind there is a nasty pressure that exists for some
       | professions/careers, where publishing becomes _essential_.
       | Because it's essential, standards are relaxed and barriers
        | lowered, leading to lower-quality work being published.
       | Publishing isn't done in response to genuine discovery or
       | innovation, it's done because boxes need to be checked.
       | Publishers won't change because they benefit from this system,
       | authors won't change because they're bound to the system.
        
         | svachalek wrote:
         | The state of CS papers is truly awful, as they're uniquely
          | positioned to be 100% reproducible. And yet my experience aligns
         | with yours in that they very rarely are.
        
           | justinnk wrote:
            | I can second this; even availability of the code is still a
            | problem. However, I would not say CS results are rarely
            | reproducible, at least from the few experiences I have had
            | so far, but I have heard of problematic cases from others. I
            | guess it also differs between fields.
           | 
           | I want to note there is hope. Contrary to what the root
           | comment says, some publishers try to endorse reproducible
           | results. See for example the ACM reproducibility initiative
           | [1]. I have participated in this before and believe it is a
            | really good initiative. Reproducing results can be very
            | labor-intensive though, adding load to a review system
            | already struggling under massive floods of papers. And it is
            | also not perfect: most of the time it is only ensured that
            | the author-supplied code produces the presented results, but
            | I still think more such initiatives are healthy. When you
            | really want to ensure the rigor of a presented method, you
            | have to replicate it, i.e., using a different programming
            | language or so, which is
           | really its own research endeavor. And there is also a place
           | to publish such results in CS already [2]! (although I
           | haven't tried this one). I imagine this may be especially
           | interesting for PhD students just starting out in a new
           | field, as it gives them the opportunity to learn while
           | satisfying the expectation of producing papers.
           | 
            | [1] https://www.acm.org/publications/policies/artifact-review-an...
            | [2] https://rescience.github.io
        
           | 0cf8612b2e1e wrote:
           | Even more ridiculous is the number of papers that do not
           | include code. Sure, maybe Google cannot offer an environment
           | to replicate the underlying 1PB dataset, but for mortals,
           | this is rarely a concern.
           | 
           | Even better is when the paper says code will be released
           | after publication, but they cannot be bothered to post it
           | anywhere.
        
         | dehrmann wrote:
          | All it takes is 14 grad students studying the same (null)
          | effect at the 5% significance level for the odds to be better
          | than even that at least one stumbles upon a false positive.
          | Factor in publication bias and you get a bunch of junk data.
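          | 
          | A quick back-of-the-envelope check of that claim (assuming the
          | 14 studies are independent):
          | 
          |     k = 14
          |     # chance at least one null study clears p < 0.05 by luck
          |     print(1 - 0.95 ** k)  # ~0.51, better than a coin flip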
         | 
         | I think I heard this idea from Freakonomics, but a fix is to
          | propose research to a journal _before_ conducting it, with
          | both sides committed to publication regardless of outcome.
        
           | beng-nl wrote:
            | A great idea. Also known as a pre-registered study.
           | 
           | https://en.m.wikipedia.org/wiki/Preregistration_(science)
        
           | poincaredisk wrote:
            | Not familiar with this idea, but a similar one is commonly
            | applied to grant applications: only apply for a grant once
            | you have finished the thing you promise to work on. Then use
            | the grant money to prototype the next five ideas (of which
            | maybe one works), because science is about exploration.
        
       | WhitneyLand wrote:
       | As part of the larger reproducibility crisis including social
       | science, I wonder how much these things contribute to declining
       | public confidence in science and the post-truth era generally.
        
       | moralestapia wrote:
       | Academia is 90% a scam these days and plenty of the professors
       | involved are criminals. A criminal is someone who commits a crime
       | (or many) [1], before some purist comes to ask "what do you
       | mean?".
       | 
        | The most common crime they commit is fraud, the second most
        | common is sexual harassment, and the third would be plagiarism,
        | although that one might not necessarily be punishable depending
        | on the jurisdiction.
       | 
       | (IMO. I can't provide data on that and I'm not willing to
       | prosecute them personally, if that breaks the deal for you,
       | that's ok to me.)
       | 
        | I know academia like the back of my hand and have been all over
        | the world; it's the same thing everywhere. I can speak
       | loudly about it because I'm catholic and have money, so those
       | lowlives can't touch me :D.
       | 
       | Every single time this topic comes up, there's a lot of
        | resistance from "the public", who are willing to go to great
       | lengths to defend "the academics" even though they know
       | absolutely nothing about academic life and their only grasp of it
       | was created through TV and movies.
       | 
       | Anyone who has been involved in Academia for more than like 2
       | years can tell you the exact same thing. That doesn't mean
       | they're also rotten, I'm just saying they've seen all these
        | things taking place around them.
       | 
        | We should really move the Overton window around this topic so
       | that scientists are held to the same public scrutiny as everybody
       | else, like public officials, because btw. 9 out of 10 times they
       | are being funded by public money. They should be held
       | accountable, there should be jail for the offenders.
       | 
       | 1: https://dictionary.cambridge.org/dictionary/english/criminal
        
       | hahaxdxd123 wrote:
       | A lot of people have pointed out a reproducibility crisis in
       | social sciences, but I think it's interesting to point out this
       | happens in CompSci as well when verifying results is hard.
       | 
       | Reproducing ML Robotics papers requires the exact
       | robot/environment/objects/etc -> people fudge their numbers and
        | have strawman implementations of benchmarks.
       | 
       | LLMs are so expensive to train + the datasets are non-public ->
       | Meta trained on the test set for Llama4 (and we wouldn't have
       | known if not for some forum leak).
       | 
       | In some way it's no different than startups or salesmen
       | overpromising - it's just lying for personal gain. The truth
       | usually wins in the end though.
        
       | jpeloquin wrote:
       | The median sample size of the studies subjected to replication
       | was n = 5 specimens (https://osf.io/atkd7). Probably because only
       | protocols with an estimated cost less than BRL 5,000 (around USD
       | 1,300 at the time) per replication were included. So it's not
        | surprising that only ~ 60% of the original biochemical assays'
       | point estimates were in the replicates' 95% prediction interval.
       | The mouse maze anxiety test (~ 10%) seems to be dragging down the
       | average. n = 5 just doesn't give reliable estimates, especially
       | in rodent psychology.
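        | 
        | As a rough illustration of how noisy n = 5 is (assumed numbers,
        | not the project's data; prediction-interval logic in the spirit
        | of Patil, Peng & Leek 2016):
        | 
        |     import math
        |     from scipy import stats
        | 
        |     n_orig = n_rep = 5
        |     sd = 1.0                     # assumed within-group SD
        |     se = sd / math.sqrt(n_orig)  # SE of each study's mean
        | 
        |     # 95% prediction interval for the replication's point
        |     # estimate, centered on the original estimate
        |     half = stats.t.ppf(0.975, df=n_orig + n_rep - 2) * math.sqrt(
        |         se**2 + (sd / math.sqrt(n_rep))**2)
        |     print(f"+/- {half:.2f} SD units")  # ~ +/-1.46
        | 
        | With prediction intervals that wide, a single n = 5 point
        | estimate carries very little information either way.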
        
       ___________________________________________________________________
       (page generated 2025-04-25 23:01 UTC)