[HN Gopher] Replace peer review with "peer replication" (2021)
___________________________________________________________________
Replace peer review with "peer replication" (2021)
Author : dongping
Score : 340 points
Date : 2023-08-06 12:33 UTC (10 hours ago)
(HTM) web link (blog.everydayscientist.com)
(TXT) w3m dump (blog.everydayscientist.com)
| geysersam wrote:
| Both review and replication have their place. The mistake is
| treating researchers and the scientific community as a machine:
| "pull here, fill these forms, comment this research, have a gold
| star"
|
| Let people review what they want, where they want, how they want.
| Let people replicate what they find interesting and motivating to
| work on.
| SonOfLilit wrote:
| My first thought was "this would never work, there is so much
| science being published and not enough resources to replicate it
| all".
|
| Then I remembered that my main issue with modern academia is that
| everyone is incentivized to publish a huge amount of research
| that nobody cares about, and how I wish we would put much more
| work into each of much fewer research directions.
| tines wrote:
| "Replace peer code review with 'peer code testing.'"
|
| Probably not gonna catch on.
| dongping wrote:
| "peer code testing" is already the job of the CI server. As it
| is nothing new, it probably is not going to catch on.
| fastneutron wrote:
| As much as I agree with the sentiment, we have to admit it isn't
| always practical. There's only one LIGO, LHC or JWST, for
| example. Similarly, not every lab has the resources or know-how
| to host multi-TB datasets for the general public to pick through,
| even if they wanted to. I sure didn't when I was a grad student.
|
| That said, it infuriates me to no end when I read a Phys. Rev.
| paper that consists of a computational study of a particular
| physical system, and the only replicability information provided
| is the governing equation and a vague description of the
| numerical technique. No discretized example, no algorithm, and
| sure as hell no code repository. I'm sure other fields have this
| too. The only motivation I see for this behavior is the desire
| for a monopoly on the research topic on the part of authors, or
| embarrassment by poor code quality (real or perceived).
| fabian2k wrote:
| I don't see how this could ever work, and non-scientists seem to
| often dramatically underestimate the amount of work it would be
| to replicate every published paper.
|
| This of course depends a lot on the specific field, but it can
| easily be months of effort to replicate a paper. You save some
| time compared to the original as you don't have to repeat the
| dead ends and you might receive some samples and can skip parts
| of the preparation that way. But properly replicating a paper
| will still be a lot of effort, especially when there are any
| issues and it doesn't work on the first try. Then you have to
| troubleshoot your experiments and make sure that no mistakes were
| made. That can add a lot of time to the process.
|
| This is also all work that doesn't benefit the scientists
| replicating the paper. It only costs them money and time.
|
| If someone cares enough about the work to build on it, they will
| replicate it anyway. And in that case they have a good incentive
| to spend the effort. If that works this will indirectly support
| the original paper even if the following papers don't
| specifically replicate the original results. Though this part is
| much more problematic if the following experiments fail, then
| this will likely remain entirely unpublished. But the solution
| here unfortunately isn't as simple as just publishing negative
| results; it takes far more work to create a solid negative result
| than just trying the experiments and abandoning them if they're
| not promising.
| ebiester wrote:
| It's simple but not easy: You create another path to tenure
| based on replication, or count replication on equal terms as part of a
| tenure package. (For example, x fewer papers but x number of
| replications, and you are expected to have x replications in
| your specialty.) You also create a grant funding section for
| replication which is then passed on to these independent
| systems. (You would have to have some sort of randomization
| handled as well.) Replication has to be valued the same as
| original research.
|
| And maybe smaller faculties at R2s pivot to replication hubs.
| And maybe this is easier for some sections of biology,
| chemistry and psychology than it is for particle physics. We
| could start where cost of replication is relatively low and
| work out the details.
|
| It's completely doable in some cases. (It may never be doable
| in some areas, though.)
| SkyMarshal wrote:
| _> x fewer papers but x number of replications, and you are
| expected to have x replications in your specialty._
|
| Could it be simplified even further to say x number of
| papers, but they only count if they're replicated by others
| in the field?
| nine_k wrote:
| No, the idea is that the same researcher should produce _k_
| papers and _n_ replications, instead of just _k + n_
| published papers.
|
| I'd argue that since replication is somewhat faster than
| original research, the requirement should count a
| replication a bit lower than an original paper (say, at
| 0.75).
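| (In other words, tenure credit might be computed as something
| like credit = k + 0.75 * n, with the 0.75 weight being just an
| arbitrary example.)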
| ebiester wrote:
| That is my idea... If we opened it up, there are probably
| more interesting iterations, such as requiring pre-
| registration for all papers, having papers with pre-
| registration count as some portion of a full paper even
| if they fail so long as the pre-registration passed
| scrutiny, having non-replicated papers count as some
| portion of a fully replicated paper, and having
| replication as a separate category such that there is a
| minimum k, a minimum n, and a minimum k+n.
|
| The non-easy part of this is once we start making changes
| to the criteria for tenure, this opens up people trying
| to stuff all the solutions for all of the problems that
| everyone knows already. (See above.) Would someone try
| to stuff code-availability requirements into CS conference papers, for
| example? What does it mean for a poster session? At what
| point are papers released for pre-print? What does it
| mean for the tenure clock or the Ph.D clock? Does it mean
| that pre-tenure can't depend on studies that take time to
| replicate? What do we do with longitudinal studies?
|
| I think you're looking at a 50 year transition where you
| would have to start simple and iterate.
| harimau777 wrote:
| Is tenure really as mechanical as "publish this many
| papers and you get it"? My impression was that it took
| into account things like impact factor and was much more
| subjective. If that were the case, then wouldn't you run
| into problems with whoever decides tenure paying lip
| service to counting replication or failed pre-registered
| papers but in practice being biased in favor of original
| research?
| rapjr9 wrote:
| Another approach I've seen actually used in Computer Science
| and Physics is to make replication a part of teaching to
| undergrads and masters candidates. The students learn how to
| do the science, and they get a paper out of replicating the
| work (which may or may not support the original results), and
| the field benefits from the replication.
| Eddy_Viscosity2 wrote:
| It's not easy because it isn't simple. How do you get all of the
| universities to change their incentives to back this?
| ebiester wrote:
| We agree - the "simple not easy" turn of phrase is speaking
| to that point. It is easy once implemented, but it isn't
| easy to transition. (I am academia-adjacent by marriage but
| closer to the humanities, so I understand the amount of
| work it would take to perform the transition.)
| MichaelZuo wrote:
| This isn't just not easy, it would probably be extremely
| political to change the structure of the NSF, National
| Labs, all universities and colleges, etc., so
| dramatically.
| tnecniv wrote:
| Your proposal has a whole slew of issues.
|
| First, people that want to be professors normally do so
| because they want to steer their research agenda, not repeat
| what other people are doing without contribution. Second, who
| works in their lab? Most of the people doing the leg work in
| a lab are PhD students, and, to graduate, they need to do
| something novel to write up in their dissertation. Thus, they
| can't just replicate three experiments and get a doctorate.
| Third, you underestimate how specialized lab groups are --
| both in terms of the incredibly expensive equipment they are
| equipped with and the expertise within the lab. Even folks in
| the same subfield (or even in the same research group!) often
| don't have much in common when it comes to interests,
| experience, and practical skills.
|
| For every lab doing new work, you'd basically need a clone of
| that lab to replicate their work.
| majormajor wrote:
| > First, people that want to be professors normally do so
| because they want to steer their research agenda, not
| repeat what other people are doing without contribution.
|
| If we're talking about weird incentives and academia you
| hit on one of the worst ones right here, I think, since
| nothing there is very closely connected to helping students
| learn.
|
| I know that's a dead horse, but it's VERY easy to find
| reasons that we shouldn't be too closely attached to the
| status quo.
|
| > For every lab doing new work, you'd basically need a
| clone of that lab to replicate their work.
|
| Hell, that's how startup funding works, or market economies
| in general. Top-down, non-redundant systems are way more
| fragile than distributed ecosystems. If you don't have the
| competition and the complete disconnection, you so much
| more easily fall into political games of "how do we get
| this published even if it ain't great" vs "how do we find
| shit that will survive the competition"
| harimau777 wrote:
| I think that there are also a lot of
| psychological/cultural/political issues that would need to
| be worked out:
|
| If someone wins the Nobel Prize, do the people who replicated
| their work also win it? When the history books are written do
| the replicators get equal billing to the people who made the
| discovery?
|
| When selecting candidates for prestigious positions, are they
| really going to consider a replicator equal to an original
| researcher?
| kergonath wrote:
| > I don't see how this could ever work, and non-scientists seem
| to often dramatically underestimate the amount of work it would
| be to replicate every published paper.
|
| They also tend to over-estimate the effect of peer review
| (often equating peer review with validity).
|
| > If someone cares enough about the work to build on it, they
| will replicate it anyway. And in that case they have a good
| incentive to spend the effort. If that works this will
| indirectly support the original paper even if the following
| papers don't specifically replicate the original results.
| Though this part is much more problematic if the following
| experiments fail, then this will likely remain entirely
| unpublished.
|
| It can also remain unpublished if other things did not work
| out, even if the results could be replicated. A half-fictional
| example: a team is working on a revolutionary new material to
| solve complicated engineering problems. They found a material
| that was synthesised by someone in the 1980s, published once
| and never reproduced, which they think could have the specific
| property they are after. So they synthesise it, and it turns
| out that the material exists, with the expected structure but
| not with the property they hoped for. They aren't going to write it
| up and publish it; they're just going to scrap it and move on
| to the next candidate. Different teams might be doing the same
| thing at the same time, and nobody coming after them will have
| a clue.
| techdragon wrote:
| This waste of effort by way of duplicating unpublished
| negative results is a big factor in why replicated results
| deserve to be rated more highly than results that have not
| been replicated regardless of the prestige of the researchers
| or the institutions involved... if no one can prove your work
| was correct... how much can anyone trust your work...
|
| I have gone down the rabbit hole of engineering research
| before, and 90% of the time I've managed to find an anecdote,
| a footnote in later work, or an actual subsequent publication
| that substantially invalidated the lofty claims of the
| engineers of the 70s or 80s (a body of work which, despite
| this, is still a genuine treasure trove of unused and
| sometimes useful aerospace research and development).
| Unfortunately, outside the few proper publications, a lot of
| the invalidations are not properly cited back to the work they
| refute, and I could spend a week cross-referencing before I
| spot the link and realise that the unnamed work a paper claims
| to prove wrong is actually some footnotes containing the only
| published data (before the new paper) on some old work, which
| survives as a bad scan on the NASA NTRS server under an
| obscure title with no keywords related to the topic the
| research is notionally about...
|
| Academic research can genuinely suck sometimes...
| particularly when you want to actually apply it.
| vibrio wrote:
| "They also tend to over-estimate the effect of peer review
| (often equating peer review with validity)."
|
| In my experience, scientists are comfortably cynical about
| peer review - even those who serve as reviewers and editors -
| except maybe junior scientists that haven't gotten burned
| yet.
| renonce wrote:
| I don't know how scientists handle peer review but aren't
| they fighting with peer review to get their papers
| published and to apply for PhDs, tenure, and grants with
| these publications?
| kergonath wrote:
| Yes, because we know how the metaphorical sausage is made:
| with unpaid reviewers who have many other, more interesting
| things to do and often an axe to grind. That is, if they
| don't delegate the review to one of their post-docs.
| aftoprokrustes wrote:
| Post doc? In what kind of utopian field did you work? In
| my former institute virtually all papers were written by
| PhD candidates, and reviewed by PhD candidates. With the
| expected effect on quality (due to lack of experience and
| impostor-syndrome-induced "how can I propose to reject?
| They are likely better than me"). But the Prof-to-
| postdoc-to-PhD-ratio was particularly bad (1-2-15).
| kelipso wrote:
| I was reviewing papers starting in my second semester of grad
| school with my advisor just signing off on it, so not
| even PhD candidates, and it was the same for my lab mates
| too.
|
| Initially we spent probably a few hours on a paper for
| peer review because we were relatively unfamiliar with
| the field but eventually I spent maybe a couple of hours
| doing the review. Wouldn't say peer review is a joke but
| it's definitely overrated by the public.
| jakear wrote:
| It's the general public that equates "peer reviewed" with
| "definitely correct, does not need to be questioned".
| dongping wrote:
| While it is a lot of work, I tend to think that one can then
| always publish preprints if they can't wait for the
| replication. I don't understand why a published paper should
| count as an achievement (against tenure or funding) at all
| before the work is replicated. The current model just creates
| perverse incentives to encourage lying, P-hacking, and cherry-
| picking. This would at least work for fields like machine
| learning.
|
| This is, of course, a naive proposal without too much thought
| put into it. But I was wondering what I might have missed here.
| i_no_can_eat wrote:
| and in this proposal, who will be tasked with replicating the
| work?
| dongping wrote:
| In some fields, replication is already the prerequisite to
| benchmark the SoTA. So the incentives boil down to
| publishing them along with negative results. Or as some
| have suggested, make it mandatory for PhD candidates to
| replicate.
|
| Though it seems possible to game the system by intentionally
| producing positive/negative replications to collude with or
| harm the author.
| omgwtfbyobbq wrote:
| What about a system where peer replication is required if the
| number of citations exceeds some threshold?
| p1esk wrote:
| Who will be replicating it? Why would I want to set aside my
| own research to replicate some claim someone made? How would
| this help my career?
| Knee_Pain wrote:
| Academia's values are not objective. Why is it that
| replicating or refuting a study is not seen as on par with being
| a co-author of said study? There is nothing set in stone
| preventing this, only the current academic culture.
| p1esk wrote:
| Because I want to do original research, and be known for
| doing original research. Only if I fail at that might I
| settle for being a guy who reproduces others' work (which
| basically means the transition from a researcher to an
| engineer).
| omgwtfbyobbq wrote:
| Whether or not you would be doing original research
| depends on whether the cited work can be replicated.
|
| If the cited work is unable to be replicated, and you try
| to replicate but get different results, then you would be
| doing original research, and then you can base further
| work on your initial original study that came to a
| different result.
|
| On the flip side, if you are able to replicate it, then
| you are doing extra work initially, but after replicating
| the work you've cited, the work you've done is more
| likely to be reproducible by someone else.
|
| The number of citations needed to require replication
| could itself be a function of how easy it is to replicate
| work across an entire field.
|
| A field where there's a high rate of success in
| replicating work could have a higher threshold for
| requiring replication compared to a field where it's
| difficult to replicate work.
| omgwtfbyobbq wrote:
| I dunno. Offhand, I guess whoever is citing the work would
| need to replicate it, but only if it's cited sufficiently
| (overall number of citations, considered foundational,
| etc...)
|
| This could help your career by increasing the probability
| that the work you're citing is accurate and, as a result,
| that your own work is also more likely to be accurate.
| RoyalHenOil wrote:
| A typical paper may cite dozens or hundreds of other
| papers. This does not sound feasible. It honestly seems
| like it would worsen the existing problem and force
| scientists to focus even more on their own original
| research in isolation from others, to avoid the expense
| of running myriad replication experiments that they
| likely don't have the funding and personnel to do.
| boxed wrote:
| > I don't see how this could ever work, and non-scientists seem
| to often dramatically underestimate the amount of work it would
| be to replicate every published paper.
|
| I don't really see how the current system works either. Fraud
| is rampant, and a replication crisis is the default state of
| most fields.
|
| Basically the current system is failing at finding out what is
| true. Which is the entire point. That's pretty damn bad.
| tptacek wrote:
| Fraud seems rampant because you hear about cases of fraud,
| but not about the tens of thousands of research labs plugging
| away day after day.
| mike_hearn wrote:
| Unfortunately there's a lot of evidence that fraud really
| is very prevalent and we don't hear about it anywhere near
| enough. It depends a lot on the field though.
|
| One piece of evidence comes from software like GRIM and
| SPRITE. GRIM was run over psychology papers and found
| around 50% had impossible means in them (that could not be
| arrived at by any combination of allowed inputs) [1]. The
| authors generally did not cooperate to help uncover the
| sources of the problems.
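|
| A minimal sketch of the GRIM idea in Python, assuming integer-
| valued survey responses (an illustration of the arithmetic only,
| not the authors' actual tool):
|
|     def grim_consistent(reported_mean, n, decimals=2):
|         # A mean of n integer responses must be T/n for some
|         # integer total T; check whether any nearby total rounds
|         # to the reported mean. (A +/-1 window suffices for the
|         # small samples GRIM targets.)
|         target = round(reported_mean, decimals)
|         for k in (-1, 0, 1):
|             total = int(round(reported_mean * n)) + k
|             if total >= 0 and round(total / n, decimals) == target:
|                 return True
|         return False
|
|     grim_consistent(5.19, 28)  # False: no 28 integer scores average to 5.19
|     grim_consistent(5.18, 28)  # True: 145 / 28 rounds to 5.18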
|
| Yet another comes from estimates by editors of well known
| journals. For example Richard Horton at the Lancet is no
| stranger to fraud, having published and promoted the
| Surgisphere paper. He estimates that maybe 50% of medical
| papers are making untrue claims, which is interesting in
| that this intuition matches the number obtained in a
| different field by a more rigorous method. The former
| editor of the New England Journal of Medicine stated that
| it was "no longer possible to believe much of the medical
| research that is published".
|
| 50%+ is a number that crops up frequently in medicine. The
| famous Ioannidis paper, "Why most published research
| findings are false" (2005) has been cited over 12,000
| times.
|
| Marc Andreessen has said in an interview that he talked to
| the head of a very large government grant agency, and asked
| him whether it could really be true that half of all
| biomedical research claims were fake? The guy laughed and
| said no it's not true, it's more like 90%. [2]
|
| Elizabeth Bik uncovers a lot of fraud. Her work is behind
| the recent resignation of the head of Stanford University
| for example. Years ago she said, _"Science has a huge
| problem: 100s (1000s?) of science papers with obvious
| photoshops that have been reported, but that are all swept
| under the proverbial rug, with no action or only an author-
| friendly correction ... There are dozens of examples where
| journals rather accept a clean (better photoshopped?)
| figure redo than asking the authors for a thorough
| explanation."_ In reality there seem to be far more than
| mere thousands, as there are companies that specialize in
| professionally producing fake scientific papers, and whole
| markets where they are bought and sold.
|
| So you have people who are running the scientific system
| saying, on the record, that they think science is overrun
| with fake results. And there is some quantitative data to
| support this. And it seems to happen quite often now that
| presidents of entire universities are being caught having
| engaged in or having signed off on rule breaking behavior,
| like image manipulation or plagiarism, implying that this
| behavior is at least rewarded or possibly just very common.
|
| There are also whole fields in which the underlying
| premises are known to be false so arguably that's also
| pretty deceptive (e.g. "bot studies"). If you include those
| then it's quite likely indeed that most published research
| is simply untrue.
|
| [1] https://peerj.com/preprints/2064v1/
|
| [2] https://www.richardhanania.com/p/flying-x-wings-into-
| the-dea...
| lliamander wrote:
| I agree that most labs are probably not out to defraud
| people. But without replication I don't think it's
| reasonable to have much confidence in what is published.
| magimas wrote:
| replication happens over time. For example, when I did my
| PhD I wanted to grow TaS2 monolayers on a graphene layer
| on an Iridium crystal. So I took published growth
| recipes of related materials, adapted them to our setup
| and then fine-tuned the recipe for TaS2. This way I
| basically "peer replicated" the growth of the original
| paper. I then took those samples to a measurement device
| and modified the sample in-situ by evaporating Li atoms
| on top (which was the actual paper but I needed a sample
| to modify first). I published the paper with the growth
| recipe and the modification procedure, and other
| colleagues then took those instructions to grow their own
| samples for their own studies (I think it was MoS2 on
| Graphene on Cobalt that they grew).
|
| This way papers are peer replicated in an emergent manner
| because the knowledge is passed from one group to another
| and they use parts of that knowledge to then apply it to
| their own research. You have to take a more holistic view.
| Individual papers don't mean too much; it's their overlap
| that generates scientific consensus.
|
| In contrast, requiring some random reviewer to instead
| replicate my full paper would be an impossible task.
| He/she would not have the required equipment (because
| there's only 2 lab setups in the whole world with the
| necessary equipment), he/she would probably not have the
| required knowledge (because mine and his research only
| partially overlap - e.g. we're researching the same
| materials but I use angle-resolved photoemission
| experiments and he's doing electronic transport) and
| he/she would need to spend weeks first adapting the
| growth recipe to the point where his sample quality is
| the same as mine.
| tptacek wrote:
| That's not what publication is about. Publication is a
| conversation with other researchers; it is part of the
| process of reaching the truth, not its endpoint.
| cpach wrote:
| People in general (at least on da Internetz) seem to
| focus way too much on single studies, and way too little
| on meta-studies.
|
| AFAICT meta-studies are the level where we as a society
| really can try to say something intelligent about how
| stuff works. If an important question is not included in
| a meta-study, we (i.e. universities and research labs)
| probably need to do more research on that topic before we
| really can say that much about it.
| lliamander wrote:
| Sure, and scientists need a place to have such
| conversations.
|
| But publication is not a closed system. The "published,
| peer-reviewed paper" is frequently an artifact used to
| decide practical policy matters in many institutions both
| public and private. To the extent that Science (as an
| institution in its own right) wants to influence policy,
| that influence needs to be grounded in reproducible
| results.
|
| Also, I would not be surprised if stronger emphasis on
| reproducibility improved the quality of conversation
| among scientists.
| vladms wrote:
| Maybe replication should (and probably does) happen when
| the published thing is relevant to some entity and also
| interesting.
|
| I've never seen papers as "truth", but more as
| "possibilities". After many other "proofs" (products,
| papers, demos, etc.) you can assign some concepts/ideas
| the label "truth" but one/two papers from the same group
| is definitely not enough.
| tnecniv wrote:
| Yeah passing peer review doesn't mean that the article is
| perfect and to be taken as truth now (and remember, to
| err is human; any coder on here has had some long
| standing bug that went mostly unnoticed in their code
| base). It means it passed the journal's standards for
| novelty, interest, and rigor based on the described
| methods, as determined by the editor / area chair and peer
| reviewers who are selected for being knowledgeable on
| the topic.
|
| Implicit in this process is that the authors are acting
| in good faith. To treat the authors as hostile is both
| demoralizing for the reviewers (who wants to be that
| cynical about their field?) and would require extensive
| verification of each statement well beyond what is
| required to return the review in a timely manner.
|
| Unless your paper has mathematical theory (and mistakes
| do slip through), a publication should not be taken as
| proof of something on its own, but a data point. Over
| time and with enough data points, a field builds evidence
| to turn a hypothesis into a scientific theory.
| majormajor wrote:
| I think the current system is just measuring entirely the wrong
| thing. Yes, fewer papers would be published. But today's goal
| is "publish papers" not "learn and disseminate truly useful and
| novel things", and while this doesn't solve it entirely, it
| pushes incentives further away from "publish whatever pure crap
| you can get away with." You get what you measure -> sometimes
| you need to change what/how you measure.
|
| > If someone cares enough about the work to build on it, they
| will replicate it anyway.
|
| That's duplicative at the "oh maybe this will be useful to me"
| stage, with N different people trying to replicate. And with
| replication not a first-class part of the system, the effort of
| replication (e_R) is high. For appealing things, N is probably
| > 2. So N x e_R total effort.
|
| If you move the burden to the "replicate to publish" stage, you
| can fix the number of replicas needed so N=2 (or whatever)
| _and_ you incentivize the original researchers to make e_R lower
| (which will improve the quality of their research _even before
| the submit-for-publication stage_ ).
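|
| (Toy numbers, purely made up, just to make that concrete: if N = 4
| interested groups each spend e_R = 6 months on ad-hoc replication
| today, that's 24 person-months of largely duplicated effort;
| fixing N = 2 and halving e_R to 3 months gives 6 person-months.)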
|
| I've been in the system; I spent a year or two chasing the tail
| of rewrites, submissions, etc, for something that was
| detectable as low-effect-size in the first place but I was told
| would still be publishable. I found out as part of that that it
| would only sometimes yield a good p-value! And everything in
| the system incentivized me to hide that for as long as
| possible, instead of incentivizing me to look for something
| else or make it easy for others to replicate and judge for
| themselves.
|
| Hell, do something like "give undergrads the opportunity to
| earn a Master's on top of their BSes, say, by replicating (or
| blowing holes in) other people's submissions." I would've eaten
| up an opportunity like that to go _really, really deep_ in some
| specialized area in exchange for a master's degree, in a less-
| structured way than "just take a bunch more courses."
| DoctorOetker wrote:
| > [...] non-scientists seem to often dramatically underestimate
| the amount of work it would be to replicate every published
| paper
|
| Either "peer reviewed" articles describe progress of promising
| results, or they don't. If they don't the research is
| effectively ignored (at least until someone finds it
| promising). So let's consider specifically output that
| described promising results.
|
| After "peer review" any apparently promising results prompt
| other groups to build on them by utilizing it as a step or
| building block.
|
| It can take many failed attempts by independent groups before
| anyone dares publish the absence of the proclaimed
| observations, since they may retry it multiple times
| thinking they must have botched it somewhere.
|
| On paper it sounds more expensive to require independent
| replication, but only because the costs of replication attempts
| are usually hidden until rather late.
|
| Is it really more expensive if the replication attempts are in
| some sense mandatory?
|
| Or is it perhaps more expensive to pretend science has found a
| one-shot "peer reviewed" method, resulting in uncoordinated
| independent reproduction attempts that may go unannounced
| before, or even after, failed replications?
|
| The pseudo-final word, end of line?
|
| What about the "in some sense mandatory" replication? Perhaps
| roll provable dice for each article, and in-domain sortition to
| randomly assign replicators. So every scientist would be
| spending a certain fraction of their time replicating the
| research of others. The types of acceptable excuses for shirking
| these duties should be scrutinized and controlled. But some
| excuses should be very valid, for example _conscientious
| objection_. If you are tasked to reproduce some of Dr.
| Mengele's works, you can cop out on condition that you thoroughly
| motivate your ethical concerns and objections. This could also
| bring a lot of healthy criticism to a lot of practices that
| are otherwise just ignored and glossed over for fear of harming
| future career opportunities.
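|
| A rough sketch of what "provable dice" plus in-domain sortition
| could look like, assuming a public randomness beacon and a
| registry of eligible labs (all names and details here are
| hypothetical):
|
|     import hashlib
|
|     def assign_replicators(article_doi, beacon_value, eligible_labs, k=2):
|         # Deterministically pick k replicator labs for an article.
|         # Anyone can re-run this with the published beacon value
|         # to verify the assignment was not cherry-picked.
|         chosen = []
|         candidates = sorted(eligible_labs)  # canonical ordering
|         for i in range(k):
|             seed = f"{article_doi}|{beacon_value}|{i}".encode()
|             digest = hashlib.sha256(seed).digest()
|             idx = int.from_bytes(digest, "big") % len(candidates)
|             chosen.append(candidates.pop(idx))  # each lab drawn at most once
|         return chosen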
| jofer wrote:
| Also, don't forget that a lot of replication would
| fundamentally involve going and collecting additional samples /
| observations / etc in the field area, which is often expensive,
| time consuming, and logistically difficult.
|
| It's not just "can we replicate the analysis on sample X", but
| also "can we collect a sample similar to X and do we observe
| similar things in the vicinity" in many cases. That alone may
| require multiple seasons of rather expensive fieldwork.
|
| Then you have tens to hundreds of thousands of dollars in
| instrument time to pay to run various analyses, which are needed
| in parallel with the field observations.
|
| It's rarely the simple data analysis that's flawed; far more
| frequently it's subtle issues with everything else.
|
| In most cases, rather than try to replicate, it's best to test
| something slightly different to build confidence in a given
| hypothesis about what's going on overall. That merits a
| separate paper and also serves a similar purpose.
|
| E.g. don't test "can we observe the same thing at the same
| place?", and instead test "can we observe something
| similar/analogous at a different place / under different
| conditions?". That's the basis of a lot of replication work in
| geosciences. It's not considered replication, as it's a
| completely independent body of work, but it serves a similar
| purpose (and unlike replication studies, it's actually
| publishable).
| b59831 wrote:
| [dead]
| kshahkshah wrote:
| When I looked into this, more than 15 years ago, I thought the
| difficult portion wasn't sharing the recipe, but the
| ingredients, if you will - granted I was in a molecular biology
| lab. Effectively the Material Transfer Agreements between
| Universities all trying to protect their IP made working with
| each other unbelievably inefficient.
|
| You'd have no idea if you were going down a well trodden path
| which would yield no success because you have no idea it was
| well trod. No one publishes negative results, etc.
| RugnirViking wrote:
| Let's be brutally honest with ourselves.
|
| 99% of all papers mean nothing. They add nothing to the
| collective knowledge of humanity. In my field of robotics there
| are SOOO many papers that are basically taking three or four
| established algorithms/machine learning models, and applying
| them to off-the-shelf hardware. The kind of thing any person
| educated in the field could almost guess the results exactly.
| Hundreds of such iterations for any reasonably popular problem
| space (prosthetics, drones for wildfires, museum guide robot)
| etc every month. Far more than could possibly be useful to
| anyone.
|
| There should probably be some sort of separate process for
| things that actually claim to make important discoveries. I
| don't know what or how that should work. In all honesty maybe
| there should just be fewer papers, however that could be
| achieved.
| indymike wrote:
| > 99% of all papers mean nothing. They add nothing to the
| collective knowledge of humanity.
|
| A lot of papers are done as a part of the process of getting
| a degree or keeping or getting a job. The value is mostly the
| candidate showing they have the acumen to produce a paper of
| sufficient quality to meet the publisher and peer review
| requirements. In some cases, it is to show a future employer
| some level of accomplishment or renown. The knowledge for
| humanity is mostly the author's ability to get published.
| RugnirViking wrote:
| well yes. But these should go somewhere else than the
| papers that may actually contain significant results. The
| problem we have here is that there is an enormous quantity
| of such useless papers mixed in with the ones actually
| trying to do science.
|
| I understand that part of the reason for that is that
| people need to appear as though they are part of the
| "actually trying" crowd to get the desired job effects. But
| it is nonetheless a problem, and a large one very worth at
| least trying to solve.
| staunton wrote:
| 99% of science is a waste of time, not just the papers. We
| just don't know which 1% will turn out not to be. The point
| is that this is how progress is made. As such, these 99%
| definitely _are_ adding to the collective knowledge. Maybe
| they add very little and maybe it's not worth the effort but
| it's not nothing. I think one of the effects of AI progress
| will be to allow extracting much more of the little value
| such publications have (the 99% of papers might not be worth
| reading but are good enough for feeding the AI).
| [deleted]
| throwaway4aday wrote:
| What's the value in publishing something that is never
| replicated? If no one ever reproduces the experiment and gets
| the same results then you don't know if any interpretations
| based on that experiment are valid. It would also mean that
| whatever practical applications could have come from the
| experiment are never realized. It makes the entire pursuit seem
| completely useless.
| geysersam wrote:
| It still has value if we assume the experiment was done by
| competent honest people who are unlikely to try to fool us on
| purpose and unlikely to have made errors.
|
| It would be even better if it was replicated of course.
|
| Depending on what certainty you need you might have to wait
| for the result of one or several replications, but that is
| application dependent.
| wizofaus wrote:
| > What's the value in publishing something that is never
| replicated?
|
| Because it presents an experimental result to other
| scientists that they may consider worth trying to replicate?
| dongping wrote:
| Then those unconfirmed results are better put on arxiv,
| instead of being used to evaluate the performance of
| scientists. Tenure and grant committees should only
| consider replicated work.
| geysersam wrote:
| I don't agree. A published article should not be taken
| for God's truth no matter if it's replicated or peer
| reviewed.
|
| Lots of "replicated" "peer-reviewed" research have been
| found to be wrong. That's fine, it's part of the process
| of discovery.
|
| A paper should be taken for what it is: a piece of
| scientific work, a part of a puzzle.
| justinpombrio wrote:
| > If someone cares enough about the work to build on it, they
| will replicate it anyway.
|
| Well, the trouble is that hasn't been the case in practice. A
| lot of the replication crisis was attempting for the first time
| to replicate a _foundational_ paper that dozens of other papers
| took as true and built on top of, and then seeing said
| foundational paper fail to replicate. The incentives point
| toward doing new research instead of replication, and that
| needs to change.
| p1esk wrote:
| It is the case in my field (ML): if I care enough about a
| published result I try to replicate it.
| tnecniv wrote:
| This is something very sensible in ML since you likely
| want to use that algorithm for something else (or to extend
| / modify it), so you need to get it working in your
| pipeline and verify it works by comparing with the
| published result.
|
| In something like psychology that is likely harder, since
| the experiment you want to do might be related to but
| differ significantly from the prior work. I am no
| psychologist, but I'd like to think that they don't take
| one study as ground truth for that reason but try to
| understand causal mechanisms with multiple studies as data
| points. If the hypothesis is correct, it will likely
| present in multiple ways.
| brightball wrote:
| > I don't see how this could ever work, and non-scientists seem
| to often dramatically underestimate the amount of work it would
| be to replicate every published paper.
|
| The alternative is a bunch of stuff being published which
| people believe is "science" but doesn't hold up under scrutiny,
| which undermines the reliability of science itself. The current
| approach simply gives people reason to be skeptical.
| ImPostingOnHN wrote:
| I'm not convinced this proposed alternative is better than
| the status quo. It's simply not feasible, no matter how many
| benefits one might imagine.
|
| the concern about skepticism is not irrelevant, but many of
| these skeptics also are skeptical of the earth being round,
| or older than a few thousand years, or not created by an
| omnipotent skylord, and I'm not sure it's actually a
| significant concern given the current number and expertise of
| those who are skeptical
|
| so, we can hear their arguments for their skepticism, but
| that doesn't mean the arguments are valid to warrant the
| skepticism exhibited. And in the end, that's what matters:
| skepticism warranted by valid arguments, not just any Cletus
| McCletus's skepticism of heliocentrism, as if his opinion is
| equal to that of an astrophysicist (it isn't). And you know
| what? It isn't necessary to convince a ditch digger that the
| earth goes around the sun, if they feel like arguing about
| it.
| backtoyoujim wrote:
| Yes it would indeed mean slowing down and having more
| scientists.
|
| It would mean disruption is no longer a useful tool for human
| development.
| brnaftr361 wrote:
| It may not be. I would be willing to argue that there was a
| tipping point and we've long exceeded its boundary - progress
| and disruption now are just making finding an equilibrium in
| the future increasingly difficult.
|
| So entering into a paradigm where we test the known space -
| especially presently - would 1) help reduce cruft; 2) abate
| undesirable forward progress; 3) train the next
| generation(s) of scientists to be more diligent and better
| custodians of the domain.
| ebiester wrote:
| I don't necessarily think it would mean more scientists, but
| it would mean more expense. You have a moderate number of low
| impact papers that people are doing for tenure today - papers
| for the purpose of cranking out papers. We are talking about
| redirecting efforts while increasing the quality of what you have.
| jononomo wrote:
| If it is not replicated it shouldn't be published, other than
| as a provisional draft. I don't care if it hurts your feelings.
| sqrt_1 wrote:
| FYI there is at least one science journal that only publishes
| reproduced research:
|
| Organic Syntheses "A unique feature of the review process is
| that all of the data and experiments reported in an article
| must be successfully repeated in the laboratory of a member of
| the editorial board as a check for reproducibility prior to
| publication"
|
| https://en.wikipedia.org/wiki/Organic_Syntheses
| throwawaymaths wrote:
| > I don't see how this could ever work,
|
| http://www.orgsyn.org/
|
| > All procedures and characterization data in OrgSyn are peer-
| reviewed and checked for reproducibility in the laboratory of a
| member of the Board of Editors
|
| Never is a strong word.
| indymike wrote:
| > This is also all work that doesn't benefit the scientists
| replicating the paper. It only costs them money and time.
|
| Maybe this is what needs to change. If we only reward discovery
| and success, then the incentive is to only produce discovery
| and success.
| johnnyworker wrote:
| > If someone cares enough about the work to build on it, they
| will replicate it anyway.
|
| Does it really deserve to be called _work_ if it doesn't
| include a full, working set of instructions that, if
| followed to a T, allow it to be replicated? To me that's more
| like pollution, making it someone else's problem. I certainly
| don't see how "we did this, just trust us" can even be
| considered science, and that's not because I don't understand
| the scientific method, that's because I don't make a living
| with it, and have no incentive to not rock the boat.
| davidktr wrote:
| You just described the majority of scientific papers. A
| "working set of instructions" is not really feasible in most
| cases. You can't include every piece of hard- and software
| required to replicate your own setup.
| lliamander wrote:
| Sounds like a problem worth solving.
| johnnyworker wrote:
| Then don't call it science, since it doesn't contribute
| anything to the body of human knowledge.
|
| I think it's fascinating that we can at the same time hold
| things like "one is none" to be true, or that you should
| write tests first, but with science we already got so used
| to a lack of discipline that we just declare it fine.
|
| It's not hard to not climb a tower you can't get down from.
| It's the default, actually. You start with something small
| where you can describe everything that goes into
| replicating it. Then you replicate it yourself, based on
| your own instructions. Before that, you don't bother anyone
| else with it. Once that is done, and others can replicate
| as well, it "actually exists".
|
| And if that means the majority of stuff has to be thrown
| out, I'd suggest doing that sooner rather than later,
| instead of just accumulating scientific debt.
| davidktr wrote:
| Imagine two scientists, Bob and Alice. Bob has spent the
| last 5 years examining a theory thoroughly. Now he can
| explain down to the last detail why the theory does not
| hold water, and why generations of researchers have been
| wrong about the issue. Unfortunately, he cannot offer an
| alternative, and nobody else can follow his long-winded
| arguments anyway.
|
| Meanwhile, Alice has spent the last 5 years making the
| best possible use of the flawed theory, and published a
| lot of original research. Sure, many of her publications
| are rubbish, but a few contain interesting results.
| Contrary to Bob, Alice can show actual results and has
| publications.
|
| Who do you believe will remain in academia? And,
| according to public perception, will seem more like an
| actual scientist?
| tnecniv wrote:
| Then Bob has failed.
|
| Academic science isn't just the doing science part but
| the articulation and presentation of your work to the
| broader community. If Bob knows this space so well, he
| should be able to clearly communicate the issue and,
| ideally, present an easily understandable counterexample
| to the existing theory.
|
| Technical folks undervalue presentation when writing
| articles and presenting at conferences. The burden of
| proof is on the presenter, and, unless there's some
| incredible demonstration at the end, most researchers
| won't have the time or attention to slog through your
| mess of a paper to decipher it. There's only so much time
| in the day and too many papers to read.
|
| In my experience, the best researchers are also the best
| presenters. I've been to great talks out of my domain
| that I left feeling like I understood the importance of
| their work despite not understanding the details. I've
| also seen many talks in my field that I thought were
| awful because the presentation was convoluted or they
| didn't motivate the importance of their problem / why
| their work addressed it.
| johnnyworker wrote:
| I disagree that Bob doesn't produce actual results, or
| that something that is mostly rubbish, but partly
| "interesting" is an actual result. We know the current
| incentives are all sorts of broken, across the board.
| Goodhart's law and all that. To me the question isn't who
| remains in academia given the current broken model, but
| who would remain in academia in one that isn't as broken.
|
| To put a point on it, if public distrust of science
| becomes big enough, it all can go away before you can say
| "cultural revolution" or "fascist strongman". Then
| there'd be no more academia, and its shell would be
| inhabited by party members, so to speak. I'd gladly
| sacrifice the ability of Alice and others like her to
| live off producing "mostly rubbish" to at least have a
| _chance_ to save science itself.
| cycomanic wrote:
| This is a very simplistic view. Why do you believe QC
| departments exist? Even in an industrial setting,
| companies make the same thing at the same place on the
| same equipment after sometimes years of process
| optimisation of well understood technology. This is
| essentially a best case scenario and still results fail
| to reproduce. How are scientists who work at the cutting
| edge of technology with much smaller budgets supposed to
| give instructions that can be easily reproduced on the first
| go? Moreover, how are they supposed to easily reproduce
| other results?
|
| That is not to say that scientists should not document the
| process to the best of their ability so it can be reproduced in
| principle. I'm just arguing that it is impossible to
| easily reproduce other people's results. Again when
| chemical/manufacturing companies open another location
| they often spend months to years to make the process work
| in the new factory.
| johnnyworker wrote:
| > companies make the same thing at the same place on the
| same equipment after sometimes years of process
| optimisation of well understood technology. This is
| essentially a best case scenario and still results fail
| to reproduce.
|
| We're not talking about 1 of 10 reproduction attempts
| failing, we're talking about 100%. And no, companies
| don't time and time again try to reproduce something that
| has never been reproduced and fail, to then try again,
| endlessly. That's just not a thing.
|
| > it is impossible to easily reproduce other people's
| results
|
| We're also not talking about "easily" reproducing
| something, but _at all_. And in principle doesn't cut
| it, it needs to be reproduced in practice.
| johngladtj wrote:
| You should.
| MrJohz wrote:
| I work with code, which is about as reproducible as it is
| possible to get - the artifacts I produce are literally just
| instructions on how to reproduce the work I've done again,
| and again, and again. And still people come to me with some
| bug that they've experienced on their machine, that I cannot
| reproduce on my machine, despite the two environments being
| as identical as I can possibly make them.
|
| I agree that reproduction in scientific work is important,
| but it is also apparently impossible in the best possible
| circumstances. When dealing with physical materials, inexact
| measurements, margins of error, etc, I think we have to
| accept that there is no set of instructions that, if followed
| to a T, will ever ensure perfect replication.
| johnnyworker wrote:
| > And still people come to me with some bug that they've
| experienced on their machine, that I cannot reproduce on my
| machine
|
| But this is the other way around. Have you ever written a
| program that doesn't run _anywhere_ except a single machine
| of yours? Would you release it and advertise it and
| encourage other people to use it as a dependency in their
| software?
|
| If it only runs on one machine of yours, you don't even
| know if your code is doing something, or something else in
| the machine/OS. Or in terms of science, whether the
| research says something about the world, or just about the
| research setup.
| MrJohz wrote:
| I think you misunderstand the point of scientific
| publication here (at least in theory, perhaps less so in
| practice). The purpose of a paper is typically to say "I
| have achieved these results in this environment (as far
| as I can tell)", and encourages reproduction. But the
| original result is useful in its own right - it tells us
| that there may be something worth exploring. Yes, it may
| just be a measurement error (I remember the magic faster
| than light neutrinos), but if it is exciting enough, and
| lots of eyes end up looking, then flaws are typically
| found fairly quickly.
|
| And yes, there are often overly excited press releases
| that accompany it - the "advertise it and encourage
| others to use it as a dependency" part of the analogy - but
| this is typically just noise in the context of scientific
| research. If that is your main problem with scientific
| publishing, you may want to be more critical of science
| journalism instead.
|
| Fwiw, yes of course I've written code that only runs on
| my machine. I imagine everyone has, typically
| accidentally. You do it, you realise your mistake, you
| learn something from it. Which is exactly what we expect
| from scientific papers that can't be reproduced.
| johnnyworker wrote:
| > But the original result is useful in its own right - it
| tells us that there may be something worth exploring.
|
| I disagree. It shows that when someone writes something
| in a text editor and publishes it, others can read the
| words they wrote. That's all it shows, by itself. Just
| like someone writing something on the web only tells us
| that a textarea accepts just about any input.
|
| And even if it did show more than that, when someone
| "explores" it, is the result is more of that, something
| that might be true, might not be, but "is worth
| exploring"? Then at what point does falsifiability enter
| into it? Why not right away? To me it's just another
| variation of making it someone else's problem, kicking
| the can down the road.
|
| > if it is exciting enough, and lots of eyes end up
| looking, then flaws are typically found fairly quickly.
|
| If that was true, there wouldn't even be a replication
| issue, much less a replication crisis. It's like saying
| open source means a lot of people look at the code, if
| it's important enough. Time and time again that's proven
| wrong, e.g. https://www.zdnet.com/article/open-source-
| software-security-...
|
| > yes of course I've written code that only runs on my
| machine. I imagine everyone has
|
| I wouldn't even know how to go about doing that. Can you
| post something that only runs on one of your machines,
| and you don't know why? Note I didn't say your machine, I
| said _one_ machine of yours. Would you publish something
| that runs on one machine of yours but not a single other
| one, other than to ask "can anyone tell me why this only
| runs on this machine"? I doubt it.
| varjag wrote:
| > Note I didn't say your machine, I said one machine of
| yours.
|
| This thread discusses _peer_ replication, this is not
| even an analogy.
| johnnyworker wrote:
| If you can't _even_ replicate it yourself, what makes you
| think peers could? We are talking about something not
| being replicated, not even by the original author. The
| most extreme version would be something that you could
| only get to run once on the same machine, and never on
| any other machine.
| MrJohz wrote:
| I think you may be seeing the purpose of these papers
| differently to me, which may be the cause of this
| confusion.
|
| The way you're describing a scientific publication is as
| if it were the end result of the scientific act. To use
| the software analogy, you're describing publication like
| a software release: all tests have been performed, all CI
| workflows have passed, QA have checked everything, and
| the result is about to be shipped to customers.
|
| But talking to researchers, they see publishing more like
| making a new branch in a repository. There is no
| expectation that the code in that branch already be
| perfect (hence why it might only run on one machine, or
| not even run at all, because sometimes even something
| that doesn't work is still worth committing and exploring
| later).
|
| And just like in software, where you might eventually
| merge those branches and create a release out of it, in
| the scientific world you have metastudies or other forms
| of analysis and literature reviews that attempt to glean
| a consensus out of what has been published so far. And
| typically in the scientific world, this is what happens.
| However, in journalism, this isn't usually what happens,
| and one person's experimental, "I've only tested this on
| my machine" research is often treated as equivalent to
| another person's "release branch" paper evaluating the
| state of a field and identifying which findings are
| likely to represent real, universal truths.
|
| Which isn't to say that journalists are the only ones at
| fault here - universities that evaluate researchers
| primarily on getting papers into journals, and prestige
| systems that make it hard to go against conventional
| wisdom in the field both cause similar problems by
| conflating different levels of research or adding
| competing incentives to researchers' work. But I don't
| think that invalidates the basic idea of published
| research: to present a found result (or non-result),
| provide as much information as possible about how to
| replicate the result again, and then let other people use
| that information to inform their work. It just requires
| us to be mindful of how we let that research inform us.
| johnnyworker wrote:
| > But talking to researchers, they see publishing more
| like making a new branch in a repository.
|
| Well some do, others don't. Like the one who wrote the
| article this is a discussion of.
|
| https://en.wikipedia.org/wiki/Replication_crisis
|
| > Replication is one of the central issues in any
| empirical science. To confirm results or hypotheses by a
| repetition procedure is at the basis of any scientific
| conception. A replication experiment to demonstrate that
| the same findings can be obtained in any other place by
| any other researcher is conceived as an
| operationalization of objectivity. It is the proof that
| the experiment reflects knowledge that can be separated
| from the specific circumstances (such as time, place, or
| persons) under which it was gained.
|
| Or, in short, "one is none". One _might_ turn into more
| than one, it might not. Until it does, it's not real.
|
| more snippets from the above WP article:
|
| > This experiment was part of a series of three studies
| that had been widely cited throughout the years, was
| regularly taught in university courses
|
| > what the community found particularly upsetting was
| that many of the flawed procedures and statistical tools
| used in Bem's studies were part of common research
| practice in psychology.
|
| > alarmingly low replication rates (11-20%) of landmark
| findings in preclinical oncological research
|
| > A 2019 study in Scientific Data estimated with 95%
| confidence that of 1,989 articles on water resources and
| management published in 2017, study results might be
| reproduced for only 0.6% to 6.8%, even if each of these
| articles were to provide sufficient information that
| allowed for replication
|
| I'm not saying it couldn't be fine to just publish things
| because they "could be interesting". But the overall
| situation seems like quite the dumpster fire to me. As
| does software, FWIW.
| techas wrote:
| Well, you could create incentives to make replication
| attractive. Give credit for replication. Give money to the
| researchers doing the replication/review. Today we pay an
| average of 2000EUR per article, reviewers get 0EUR, and the
| publisher keeps it all for putting a PDF online. I would say
| there is margin there to invest in improving the review process.
| mandmandam wrote:
| It's wild to me that although we _know_ that it was Ghislaine
| Maxwell's daddy who started this incredibly corrupt system,
| people hardly mention this fact.
|
| The US system, and others, even attack people who dare to try
| and make science more open. RIP Aaron Swartz, and long live
| Alexandra Elbakyan.
| sebzim4500 wrote:
| >I don't see how this could ever work, and non-scientists seem
| to often dramatically underestimate the amount of work it would
| be to replicate every published paper.
|
| I think it would be fine to halve the productivity of these
| fields, if it means that you can reasonably expect papers to be
| accurate.
| dmarchand90 wrote:
| I believe that, contrary to popular belief, the
| implementation of this system would lead to a substantial
| increase in productivity in the long run. Here's why:
|
| Currently, a significant proportion of research results in
| various fields cannot be reproduced. This essentially means
| that a lot of work turns out to be flawed, leading to wasted
| efforts (you can refer to the 'reproducibility crisis' for
| more context). Moreover, future research often builds upon
| this erroneous information, wasting even more resources. As a
| result, academic journals get cluttered with substandard
| work, making them increasingly difficult to monitor and
| comprehend. Additionally, the overall quality of written
| communication deteriorates as emphasis shifts from the
| accurate transfer and reproduction of knowledge to the
| inflated portrayal of novelty.
|
| Now consider a scenario where 50% of all research is
| dedicated to reproduction. Although this may seem to
| decelerate progress in the short term, it ensures a more
| consistent and reliable advancement in the long term. The
| quality of writing would likely improve to facilitate
| replication. Furthermore, research methodology would be
| disseminated more quickly, enhancing overall research
| effectiveness.
| matthewdgreen wrote:
| In the current system scientists allocate reproduction
| efforts to results that they intend to build on. So if
| you've claimed a breakthrough technique for levitating
| widgets -- and I think this widget technique can be used to
| build spacecraft (or if I think your technique is wrong) --
| then I will allocate precious time and resources to
| reproducing your work. By contrast if I don't think your
| work is significant and worth following up on, then I
| allocate my efforts somewhere else. The advantage is that
| more apparently-significant results ("might cure cancer")
| tend to get a bigger slice of very limited resources, while
| dead-end or useless results ("might slightly reduce
| flatulence in cats") don't. This distributed
| entrepreneurial approach isn't perfect, but it works better
| than central planning. By contrast you could adopt a
| Soviet-like approach where cat farts and cancer both share
| replication resources, but this seems like it would be bad
| for everyone (except the cats.)
| advisedwang wrote:
| It would be more than just halving productivity. Not only do you
| have to do the work twice, but you add the delay of someone
| else replicating before something can be published and built
| upon by others. If you are a developer, imagine how much your
| productivity would drop going from a 3 minute build to a 1
| day build.
| orangepurple wrote:
| Terrible analogy. It might take months to come up with an
| idea but another should be able to follow your method and
| implement it much more quickly than it took you to come up
| with the concept and implement it.
| magimas wrote:
| horrible take. Taking the LK99 situation as an example:
| simply copying and adapting a well-described growth
| recipe to your own setup and lab conditions may take
| weeks. And how would you address situations where
| measurement setups exist only once on Earth? How
| would you do peer replication of LHC measurements? Wait
| for 50 years till the next super-collider is built and
| someone else can finally verify the results? On a smaller
| scale: If you need measurements at a synchrotron
| radiation source to replicate a measurement, is someone
| supposed to give up his precious measurement time to
| replicate a paper he isn't interested in? And is the
| original author of a paper that's in the queue for peer
| replication supposed to wait for a year or two till the
| reviewer gets a beamtime on an appropriate measurement
| station? Even smaller: I did my PhD in a lab with a
| specific setup that only a single other group in the
| world had an equivalent to. You simply would not be able
| to replicate these results.
|
| Peer replication is completely unfeasible in experimental
| fields of science. The current process of peer review is
| alright, people just need to learn that single papers
| standing by themselves don't mean too much. The "peer
| replication" happens over time anyway when others use the
| same tools, samples, techniques on related problems and
| find results in agreement with earlier papers.
| evandrofisico wrote:
| Usually coming up with an idea is the _easy_ part. For
| example, in my PhD project, I started with an idea my
| advisor had in the early 2000s.
|
| Implementing the code for the simulation and analysis of
| the data? four months, at most. Running the simulation?
| almost three years until I had data with good enough
| resolution for publishing.
| tnecniv wrote:
| It's also very easy to come up with bad ideas -- I did
| plenty of that and I still do, albeit less than I used
| to. Finding an idea that is novel, interesting, and
| tractable given your time, skills, resources, and
| knowledge of the literature is hard, and maybe the most
| important skill you develop as a researcher.
|
| For a reductive example, the idea to solve P vs NP is a
| great one, but I'm not going to do that any time soon!
| cycomanic wrote:
| I think you don't understand how much work is involved in
| just building the techniques and expertise to pull some
| experiments off (let's not even talk about the
| equipment).
|
| Even if someone meticulously documents their process, it
| could still take months to replicate the results.
|
| I'm familiar with lithography/nanofabrication, and I know
| that a process developed in one cleanroom typically cannot
| be directly applied to a different cleanroom; instead, one
| has to develop a new process informed by the original
| results.
|
| Even in the same lab it can often happen that if you come
| back to a process after a long time, things don't work
| anymore and quite a bit of troubleshooting ensues (maybe a
| supplier for some chemical changed, and even though it
| should be the same formula it behaves slightly
| differently).
| RoyalHenOil wrote:
| Months. Haha.
|
| I previously worked in agricultural research (in the
| private sector), and we spent YEARS trying to replicate
| some published research from overseas. And that was
| research that had previously been successfully
| replicated, and we even flew in the original scientists
| and borrowed a number of their PhD students for several
| months, year after year, to help us try to make it work.
|
| We never did get it to fully replicate in our country. We
| ended up having to make some pretty extreme changes to
| the research to get similar (albeit less reliable)
| results here.
|
| We never did figure out why it worked in one part of the
| world but not another, since we controlled for every
| other factor we could think of (including literally
| importing the original team's lab supplies at great
| expense, just in case there was some trace contaminant on
| locally sourced materials).
| harimau777 wrote:
| The issue that I see is: even if halving productivity is
| acceptable to the field as a whole; how do you incentivize a
| given scientist to put in the effort?
|
| This seems particularly problematic because it is already
| notoriously hard to get tenure and academia is already
| notoriously unrewarding to researchers who don't have tenure.
| hoosieree wrote:
| Half is wildly optimistic.
| ImPostingOnHN wrote:
| half would only be possible if, for every single paper
| published by a given team, there exists a second team just as
| talented as the original team, skilled in that specific
| package of techniques, just waiting to replicate that paper
| coding123 wrote:
| Maybe doing an experiment twice, even at double the cost, makes
| more sense, so that we don't all throw away our coffee when
| coffee is bad, or throw away our gluten when gluten is bad, etc.
| (those are trivial examples). Basically, the cost of performing
| the science is in many cases minuscule compared to how it could
| affect society.
| pvaldes wrote:
| One. Doing experiments is already difficult and painful enough.
|
| Two. This drain of resources can't be done for free. Somebody
| will need to pay twice for half of the research [1], and
| faster. Peers will need to be hired and paid, maybe out of the
| writer's grants. Researchers can't justify giving their own
| funds to other teams without a profound change in regulation,
| and even in that case they would be harming their own projects.
|
| [1] as the valuable experts are now stuck validating things
| instead of doing their own job
|
| It would also open a door for foul play: bogging competitor
| teams down in molasses by throwing them secondary, silly
| problems that they know are a dead end, while the other team
| works on the real deal and takes the advantage to win the
| patent.
| mattkrause wrote:
| Longer, even!
|
| Some experiments that study biological development or trained
| animals can take a year or more of fairly intense effort to
| _start_ generating data.
| Maxion wrote:
| A year? Some data sets take decades to build up before
| significant papers can be published on their data.
| Replication of the dataset is just not feasible.
|
| This whole thread just shows how little the average HNer
| knows about the academic sciences.
| tnecniv wrote:
| I know people that had to take a 6+ month trip to Antarctica
| for part of their work and others that had to share time on a
| piece of experimental equipment with a whole department --
| they got a few weeks per year to run their experiment and had
| to milk that for all it's worth. Even if they had funding,
| that machine required large amounts of space and staff to
| keep it running and they aren't off the shelf products --
| only a few exist at large research centers.
| seventytwo wrote:
| There would need to be an incentive structure where the first
| replications get (nearly) the same credit as the original
| publisher.
| j45 wrote:
| Can everything be replicated in every field?
| User23 wrote:
| That's the defining characteristic of engineering. If you can't
| reliably replicate everything in an engineering discipline then
| it's not an engineering discipline.
| Hiromy wrote:
| Hi, I love you
| jimmar wrote:
| How do you replicate a literature review? Theoretical physics? A
| neuro case? Research that relies upon natural experiments? There
| are many types of research. Not all of them lend themselves to
| replication, but they can still contribute to our body of
| knowledge. Peer review is helpful in each of these instances.
|
| Science is a process. Peer review isn't perfect. Replication is
| important. But it doesn't seem like the author understands what
| it would take to simply replace peer review with replication.
| janalsncm wrote:
| I don't think the existence of papers that are difficult to
| replicate undermines the value of replicating those that are
| easier.
| freeopinion wrote:
| My mind automatically swapped out the word "peer" for "code". It
| took my brain to interesting places. When I came back to the
| actual topic, I had accidentally built a great way to contrast
| some of the discussion offered in this thread.
| dongping wrote:
| In the sense of replicating the results, we do have CI servers
| and even fuzzers running for our "code replication".
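|
| As a toy illustration (hypothetical, not anyone's real
| pipeline), a property-based test is exactly this kind of
| automated "replication": every CI run re-checks the claimed
| behaviour against an independent oracle on fresh inputs.
|
|     from hypothesis import given, strategies as st
|
|     def mergesort(xs):
|         # The "claimed result": this sorts correctly.
|         if len(xs) <= 1:
|             return list(xs)
|         mid = len(xs) // 2
|         left = mergesort(xs[:mid])
|         right = mergesort(xs[mid:])
|         out = []
|         while left and right:
|             if left[0] <= right[0]:
|                 out.append(left.pop(0))
|             else:
|                 out.append(right.pop(0))
|         return out + left + right
|
|     @given(st.lists(st.integers()))
|     def test_matches_builtin(xs):
|         # The "replication": compare against sorted() on
|         # randomly generated inputs, on every machine.
|         assert mergesort(xs) == sorted(xs)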
| freeopinion wrote:
| I don't want to derail the science discussion too much, but
| what if you actually had to reproduce the code by hand? Would
| that process produce anything of value? Would your habit of
| writing i+=1 instead of i++ matter? Or iteration instead of
| recursion?
|
| Would code replication result in fewer use after free, or off
| by one than code review? Or would it mostly be a waste of
| resources including time?
| abnry wrote:
| If scientists are going to complain that it's too hard or too
| expensive to replicate their studies, then that just shows their
| work is BS.
| fodkodrasz wrote:
| I guess if software developers will complain that it's too hard
| or too expensive to thoroughly test their code to ensure
| exactly zero bugs at release[1], then that just shows their
| work is BS.
|
| [1]: if you have delivered telco code to Softbank you may have
| heard this sentence
| abnry wrote:
| Replication is not the same thing as zero bugs in software.
| alsodumb wrote:
| Nah, it doesn't. It just shows that it's time consuming and
| expensive to replicate their studies.
| abnry wrote:
| If that's the case, then don't claim confidence in the work
| or make policy decisions based off of it. If there is no
| epistemological humility, then yes, it is still BS.
| Levitz wrote:
| If a study costs X, the study plus its replication cost
| somewhere in the ballpark of 2*X. This is not trivial.
| abnry wrote:
| But this is science we are talking about. A one-off lucky
| novel result should not be good enough. Why should our
| standards and our funding be so low?
| Maxion wrote:
| Something in Switzerland called the Large Hadron Collider comes
| to mind.
|
| I guess we should not talk about the Higgs before someone else
| builds a second one and replicates the papers.
| abnry wrote:
| Physics is generally better since they have good statistical
| models and can get six sigma (or whatever) results.
|
| And replication can be done by the same party (although an
| independent party is better), and that may mean many trials.
|
| And do we even set policy based on the existence or
| non-existence of Higgs bosons?
|
| I am particularly unhappy with soft sciences in terms of
| replication.
| azan_ wrote:
| What if it REALLY is too expensive? You do realize that there
| are studies which literally cost millions of dollars? Getting
| funding for original studies is hard enough, good luck securing
| additional funds for replication.
| snitty wrote:
| >If scientists are going to complain that's its too hard or too
| expensive to replicate their studies, then that just shows
| their work is BS.
|
| 1 mg of anti-rabbit antibody (a common thing to use in a lot of
| biology experiments) is $225 [1]. Outside of things like
| standard buffers and growth medium for prokaryotes, this is
| going to be the cheapest thing you use in an experiment.
|
| 1/10th of that amount for anti-flagellin antibody is $372. [2]
|
| A kit to prep a cell for RNA sequencing is $6-10 per use.
| That's JUST isolation of the RNA. Not including reverse
| transcribing it to cDNA for sequencing, or the sequencing
| itself. [3]
|
| Let's not even get into things like materials science, where you
| may be working on an epitaxial growth paper, and there are only
| a handful of labs where they could even feasibly repeat the
| experiment.
|
| Or take something requiring a BSL-3 lab: there are literally
| only 15 labs in the US that could feasibly do the work,
| assuming they aren't working on their own stuff. [4]
|
| [1] - https://www.thermofisher.com/antibody/product/Goat-anti-
| Rabb... [2] https://www.invivogen.com/anti-flagellin [3]
| https://www.thermofisher.com/order/catalog/product/12183018A
| [4] https://www.niaid.nih.gov/research/tufts-regional-
| biocontain...
| NalNezumi wrote:
| Imo, a more realistic thing to do is a "replicability review"
| and/or a requirement to submit a "methodology map" with each
| paper.
|
| The former would be a back-and-forth with a reviewer who
| inquires and asks questions (based on the paper) with the goal
| of being able to _reproduce the result_, but without having to
| actually reproduce it. This is usually good for finding missing
| details in the paper that the writer just took for granted
| everyone in the field knows (I've met bio PhDs who have wasted
| months of their lives tracking down experimental details not
| mentioned in a paper).
|
| The latter would be the result of the former. Instead of having
| a pages-long "appendix" section in the main paper, you produce
| another document with meticulous details of the
| experiment/methodology, with every stone turned, together with
| a peer reviewer. Stamp it with the peer reviewer's name so they
| can't get away with a hand-wavy review.
|
| I've read too many papers where important information needed to
| reproduce the result is omitted (for ML/RL). If the code is
| included, I've countless times found implementation details
| that are not mentioned in the paper. As a matter of fact, there
| are even results suggesting that those details make or break
| certain algorithms. [1] I've also seen breaking details only
| mentioned in code comments...
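|
| To give a flavour of what such a code-only detail looks like,
| here is a minimal, hypothetical sketch (invented for
| illustration, not taken from any particular paper) of the kind
| of reward-scaling trick discussed in [1]; the epsilon, the clip
| range and the running statistic are exactly the things the
| prose tends to leave out:
|
|     class RunningRewardScaler:
|         """Scale rewards by a running std estimate."""
|
|         def __init__(self, eps=1e-8, clip=10.0):
|             self.count, self.mean, self.m2 = 0, 0.0, 0.0
|             self.eps, self.clip = eps, clip
|
|         def __call__(self, reward):
|             # Welford update of the running mean/variance.
|             self.count += 1
|             delta = reward - self.mean
|             self.mean += delta / self.count
|             self.m2 += delta * (reward - self.mean)
|             std = (self.m2 / self.count) ** 0.5
|             # Scale and clip; neither constant is usually
|             # stated in the paper text.
|             scaled = reward / (std + self.eps)
|             return max(-self.clip, min(self.clip, scaled))
|
| Two papers can describe "the same" algorithm and differ only in
| whether a wrapper like this is applied.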
|
| Another atrocious thing I've witnessed is a paper claiming they
| evaluated their method on a benchmark, and if you check the
| benchmark, the task they evaluated on doesn't exist! They forked
| the benchmark and made their own task without being clear about
| it! [2]
|
| Shit like this makes me lose faith in certain science
| directions. And I've seen a couple of junior researchers give it
| all up because they concluded it's all just a house of cards.
|
| [1] https://arxiv.org/abs/2005.12729
|
| [2] https://arxiv.org/abs/2202.02465
|
| Edit: also if you think that's too tedious/costly, reminder that
| publishers rake in record profits so the resources are already
| there https://youtu.be/ukAkG6c_N4M
| kergonath wrote:
| > I've met bio PhDs who have wasted months of their lives
| tracking down experimental details not mentioned in a paper
|
| Same. Now, when I review manuscripts, I pay much more attention
| to whether there is enough information to replicate the
| experiment or simulation. We can put out a paper with wrong
| interpretations and that's fine because other people will
| realise that when doing their own work. We cannot let papers
| get published if their results cannot be replicated.
|
| > The latter would be the result of the former. Instead of
| having a pages-long "appendix" section in the main paper, you
| produce another document with meticulous details of the
| experiment/methodology, with every stone turned, together with
| a peer reviewer. Stamp it with the peer reviewer's name so they
| can't get away with a hand-wavy review
|
| Things that take too much space to go in the experimental
| section should go to an electronic supplementary information
| document. But then it would be nice if the ESI were appended to
| the article when we download a PDF because tracking them is a
| pain in the backside. Some fields are better than others about
| this, for example in materials characterisation studies it's
| very common to have ESI with a whole bunch of data and details.
|
| Large datasets should go to a repository or a dataset journal;
| that way the method is still peer reviewed and the dataset has
| a DOI and is much easier to re-use. It's also a nice way of
| doubling a student's paper count by the end of their PhD.
|
| > Another atrocious thing I've witnessed is a paper claiming
| they evaluated their method on a benchmark and if you check the
| benchmark, the task they evaluated on doesn't exit! They forked
| the benchmark and made their own task without being clear about
| it! [2]
|
| That's just evil!
| Maxion wrote:
| > Large datasets should go to a repository or a dataset
| journal; that way the method is still peer reviewed and the
| dataset has a DOI and is much easier to re-use.
|
| This may be possible in some sciences, but not in
| epidemiology or biomed. Often the study is based on tissue
| samples owned by some entity, with permission granted only to
| some certain entity.
|
| Datasets in epidemiology are often full of PII, and cannot be
| shared publicly for many reasons.
| infogulch wrote:
| I like the idea of splitting "peer review" into two, and then
| having a citation threshold standard where a field agrees that a
| paper should be replicated after a certain number of citations.
| And journals should have a dedicated section for attempted
| replications.
|
| 1. Rebrand peer review as a "readability review" which is what
| reviewers tend to focus on today.
|
| 2. A "replicability statement", a separately published document
| where reviewers push authors to go into detail about the
| methodology and strategy used to perform the experiments,
| including specifics that someone outside of their specialty may
| not know. Credit NalNezumi ITT
| analog31 wrote:
| Every experimental paper I've ever read has contained an
| "Experimental" section, where they provide the details on how
| they did it. Those sections tend to be general enough, albeit
| concise.
|
| In some fields, aside from specialized knowledge, good
| experimental work requires what we call "hands." For instance,
| handling air sensitive compounds, or anything in a condensed or
| crystalline state. In my thesis experiment, some of the
| equipment was hand made, by me.
|
| Sometimes specialized facilities are needed. My doctoral thesis
| project used roughly 1/2 million dollars of gear, and some of
| the equipment that I used was obsolete and unavailable by the
| time I finished.
| ahmadmijot wrote:
| > My doctoral thesis project used roughly 1/2 million dollars
| of gear,
|
| Wow I envy you. My doctoral thesis project spent like...
| USD2.5k directly on gear (half of it just to buy Lego
| bricks to build our own instrument, exactly because we
| couldn't afford to buy a commercial one lol)
| xioxox wrote:
| I used a 3 billion dollar space telescope. I don't think
| NASA are going to launch another to replicate some of my
| results.
| janalsncm wrote:
| "Concise" isn't good enough. If other scientists are trying
| to read through the tea leaves at what you're trying to say
| you did, that defeats the entire point of a paper. The
| purpose of science is to create knowledge _that other people
| can use_ and if people can't replicate your work that's not
| science.
| analog31 wrote:
| I think the point is you don't have to give a complete BOM
| that includes where you got the power cables. Each
| scientist has to decide what amount of information needs to
| be conveyed. Of course this can be abused, or done
| sloppily, like anything else.
|
| A place where you can spread out more is in dissertations.
| Mine contained an entire chapter on the experiment, another
| on the analysis, and appendices full of source code,
| schematics, etc. I happily sent out copies, at my expense.
| My setup was replicated roughly 3 times.
| User23 wrote:
| One thing that everyone needs to remember about "peer review" is
| that it isn't part of the scientific method, but rather that it
| was imposed on the scientific enterprise by government funding
| authorities. It's basically JIRA for scientists.
| ahmadmijot wrote:
| Quite related: nowadays there is this movement within scientific
| research, i.e. Open Science, where the (raw) data from one's
| research is open source. Even methods for in-house fabrication
| and development, together with their source code, are open
| source (open hardware and open software).
| waynecochran wrote:
| I spent a lot of my graduate years in CS implementing the details
| of papers only to learn that, time and time again, the paper
| failed to mention all the shortcomings and failure cases of the
| techniques. There are great exceptions to this.
|
| Due to the pressure of "publish or die" there is very little
| honesty in research. Fortunately there are some who are
| transparent with their work. But for the most part, science is
| drowning in a sea of research that lacks transparency and falls
| short on replication.
| janalsncm wrote:
| I had a very similar experience in my masters. Really made me
| think, what exactly are the peers "reviewing" if they don't
| even know whether the technique works in the first place.
| waynecochran wrote:
| I have reviewed many papers and there is never the time to
| recreate the work and test. That is why I love the "Papers with
| Code" site. I think every published CS paper should require a
| git repo with all of its code and experimental data.
| cptskippy wrote:
| You'll quickly discover when you enter the workforce that the
| reasons we have CI/CD, Docker, and virtualization are because
| of a similar problem. The dreaded "it works on my machine"
| response.
|
| CI/CD forces people to codify exactly how to build and deploy
| something in order for it to get into a production environment.
| Docker and VMs are ways around this by giving people a "my
| machine" that can be copied and shared easily.
| titzer wrote:
| In the PL field, conferences have started to allow authors to
| submit packaged artifacts (typically, source code, input data,
| training data, etc) that are evaluated separately, typically
| post-review. The artifacts are evaluated by a separate committee,
| usually graduate students. As usual, everything is volunteer.
| Even with explicit instructions, it is hard enough to even get
| the same _code_ to run in a different environment and give the
| same results. Would "replication" of a software technique
| require another team to reimplement something from scratch? That
| seems unworkable.
|
| I can't even _imagine_ how hard it would be to write instructions
| for another lab to successfully replicate an experiment at the
| forefront of physics or chemistry, or biology. Not just the
| specialized equipment, but we're talking about the frontiers of
| Science with people doing cutting-edge research.
|
| I get the impression that suggestions like these are written by
| non-scientists who do not have experience with the peer review
| process of _any_ discipline. Things just don't work like that.
| Maxion wrote:
| > I get the impression that suggestions like these are written
| by non-scientists who do not have experience with the peer
| review process of any discipline. Things just don't work like
| that.
|
| Not to mention that the cutting edge in many sciences is
| perhaps two or three research groups of 5-30 individuals each,
| at various research institutions around the world.
| mike_hearn wrote:
| Is PL theory actually science? Although we call it computer
| science, I don't personally think CS is actually a science in
| the sense of studying nature to understand it. Computers are
| artificial constructs. CS is a lot closer to engineering than
| science. Indeed it's kind of nonsensical to talk about
| replicating an experiment in programming language theory.
|
| For the "hard" sciences, replication often isn't so difficult
| it seems. LK-99 being an interesting study in this, where
| people are apparently successfully replicating an experiment
| described in a rushed paper that is widely agreed to lack
| sufficient details. It's cutting edge science but replication
| still isn't a problem. Most science isn't the LHC.
|
| The real problems with replication are found in the softer
| fields. There it's not just an issue of randomness or
| difficulty of doing the experiments. If that's all there was to
| it, no problem. In these fields it's common to find papers or
| entire fields where none of the work is replicable even in
| principle. As in, the people doing it don't think other people
| being able to replicate their work is even important at all,
| and they may go out of their way to _stop_ people being able to
| replicate their work (most frequently by gathering data in non-
| replicable ways and then withholding it deliberately, but
| sometimes it's just due to the design of the study). The most
| obvious inference when you see this is that maybe they don't
| want replication attempts because they know their claims
| probably aren't true.
|
| So even if peer reviewers or journals were just checking really
| basic things like, is this claim even replicable in principle,
| that would be a good start. You would still be left with a lot
| of papers that replicate fine but their conclusions are still
| wrong because their methodology is illogical, or papers that
| replicate because their findings are obvious. But there's so
| much low hanging fruit.
| staunton wrote:
| Let's get people to publish their data and code first, shall we?
| That's sooo much easier than demanding that whole studies be
| replicated... and people still don't do it!
| ayakang31415 wrote:
| One of the Nobel prizes in Physics was for the discovery of the
| Higgs boson at the LHC. It cost billions of dollars just to
| build the facility, and required hundreds of physicists to
| conduct the experiment. You can't replicate this. Although I
| fully agree that replication must come first when it is
| reasonably doable.
| TrackerFF wrote:
| Seems to have been hugged to death.
|
| But - a quick counterexample - as far as replication goes: What
| if the experiments were run on custom made or exceedingly
| expensive equipment? How are the replicators supposed to access
| that equipment? Even in fields which are "easy" to replicate -
| like machine learning - we are seeing barriers of entry due to
| expensive computing power. Or data collection. Or both.
|
| But then you move over to physics, and suddenly you're also
| dealing with these one-off custom setups, doing experiments which
| could be close to impossible to replicate (say you want to
| conduct experiments on some physical event that only occurs every
| xxxx years or whatever)
| pajushi wrote:
| Why shouldn't we hold science more accountable?
|
| "Science needs accounting" is a search I had saved for months
| which really resonates with the idea of "peer replication."
|
| In accounting, you always have checks and balances; you are never
| counting money alone. In many cases, accountants duplicate their
| work to make sure that it is accurate.
|
| Auditors are the counterpart to the peer review process. They're
| not there to redo your work, but to verify that your methods and
| processes are sound.
| paulpauper wrote:
| This would not apply to math or to something subjective such as
| literature; only experimental results need to be replicated.
| Nevermark wrote:
| Reproducibility would become a much higher priority if electronic
| versions of papers were required (by their distributors, archives,
| institutions, ...) to have reproduction sections, which the
| authors are encouraged to update over time.
|
|       UPDATABLE COVER PAGE:
|
|       Title / Authors
|       Abstract: Blah, blah, ...
|
|       State of reproduction: Not reproduced.
|       Successful reproductions: ...citations...
|       Reproduction attempts: ...citations...
|       Countering reproductions: ...citations...
|
|       UPDATABLE REPRODUCTION SECTION ATTACHED AT END:
|
|       Reproduction resources: Data, algorithms, processes,
|       materials, ...
|       Reproduction challenges: Cost, time, one-off events, ...
|
| Making this stuff more visible would help reproducers validate
| the value of reproduction to their home and funding institutions.
|
| Having a standard section for this, with an initial state of "Not
| reproduced" provides more incentive for original workers to
| provide better reproduction info.
|
| For algorithm and math work, reproduction could be best served
| by a downloadable executable bundle.
| gordian-not wrote:
| The incentive should be to clear the way for tenure track
|
| The junior faculty will clear the rotten apples at the top by
| finding flaws in their research and then will win the tenure that
| was lost in return
|
| This will create a nice political atmosphere and improve science
| user6723 wrote:
| I remember showing someone raw video of a Safire plasma chamber
| keeping the ball of plasma lit for several minutes. They said
| they would need to see a peer reviewed paper. The presumption
| brought about by the enlightenment era that everyone should get a
| vote was a mistake.
| dongping wrote:
| https://web.archive.org/web/20230130143126/https://blog.ever...
| moelf wrote:
| I wish we could replicate the LHC
| Maxion wrote:
| No talking about the Higgs before that happens, apparently.
| kergonath wrote:
| We will, don't worry.
| janalsncm wrote:
| For a while Reddit had the mantra "pics or it didn't happen".
|
| At least in CS/ML there needs to be a "code or it didn't happen".
| Why? Papers are ambiguous. Even if they have mathematical
| formulas, not all components are defined.
|
| Peer replication in these fields is easy, low-hanging fruit
| that could set an example for other fields of science.
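|
| A toy example of what "not all components are defined" means in
| practice: both functions below are faithful readings of the same
| one-line loss formula, L = -sum(y * log p). The names and
| constants are invented for illustration, not from any specific
| paper.
|
|     import numpy as np
|
|     def loss_reading_a(y, p):
|         # Mean over the batch, no numerical guard.
|         return float(np.mean(-np.sum(y * np.log(p), axis=1)))
|
|     def loss_reading_b(y, p, eps=1e-7, smoothing=0.1):
|         # Sum over the batch, clipped probabilities, label
|         # smoothing: choices the formula never mentions.
|         k = y.shape[1]
|         y = y * (1 - smoothing) + smoothing / k
|         p = np.clip(p, eps, 1.0)
|         return float(np.sum(-np.sum(y * np.log(p), axis=1)))
|
| Which reading the authors actually ran is exactly what a
| released repo settles and a PDF often doesn't.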
| simlan wrote:
| That is too simplistic. You underestimate the depth of
| academia. Sure, the latest breakthrough Alzheimer's study or
| related research would benefit from a replication, which is
| done out of commercial interest anyway.
|
| But your run-of-the-mill niche topic will not have the dollars
| behind it to replicate everyone's research. Just because CS/AI
| research is very convenient to replicate does not mean this can
| be extended to all research being done.
|
| That is exactly why peer review exists: to weed out the
| implausible and low-effort/low-relevance work. It is not
| fraud-proof because it was not designed to be.
| hedora wrote:
| The website dies if I try to figure out who the author ("sam")
| is, but it sounds like they are used to some awful backwater of
| academia.
|
| They have this idea that a single editor screens papers to decide
| if they are uninteresting or fundamentally flawed, then they want
| a bunch of professors to do grunt work litigating the correctness
| of the experiments.
|
| In modern (post industrial revolution) branches of science, the
| work of determining what is worthy of publication is distributed
| amongst a program committee, which is comprised of reviewers. The
| editor / conference organizers pick the program committee. There
| are typically dozens of program committee members, and authors
| and reviewers both disclose conflicts. Also, papers are
| anonymized, so the people that see the author list are not
| involved in accept/reject decisions.
|
| This mostly eliminates the problem where work is suppressed for
| political reasons, etc.
|
| It is increasingly common for paper PDFs to be annotated with
| badges showing the level of reproducibility of the work, and
| papers can win awards for being highly reproducible. The people
| that check reproducibility simply execute directions from a
| separate reproducibility submission that is produced after the
| paper is accepted.
|
| I argue the above approach is about 100 years ahead of what the
| blog post is suggesting.
|
| Ideally, we would tie federal funding to double blind review and
| venues with program committees, and papers selected by editors
| would not count toward tenure at universities that receive public
| funding.
| jltsiren wrote:
| The computer science practice you describe is the exception,
| not the norm. It causes a lot of trouble when evaluating the
| merits of researchers, because most people in academia are
| not familiar with it. In many places, conference papers don't
| even count as real publications, putting CS researchers at a
| disadvantage.
|
| From my point of view, the biggest issue is accepting/rejecting
| papers based on first impressions. Because there is often only
| one round of reviews, you can't ask the authors for
| clarifications, and they can't try to fix the issues you have
| identified. Conferences tend to follow fashionable topics, and
| they are often narrower in scope than what they claim to be,
| because it's easier to evaluate papers on topics the program
| committee is familiar with.
|
| The work done by the program committee was not even supposed to
| be proper peer review but only the first filter. Old conference
| papers often call themselves extended abstracts, and they don't
| contain all the details you would expect in the full paper. For
| example, a theoretical paper may omit key proofs. Once the
| program committee has determined that the results look
| interesting and plausible and the authors have presented them
| in a conference, the authors are supposed to write the full
| paper and submit it to a journal for peer review. Of course,
| this doesn't always happen, for a number of reasons.
| cycomanic wrote:
| While I agree with the general sentiment of the paper and
| creating incentives for more replication is definitely a good
| idea, I do think the approach is flawed in several ways.
|
| The main point is that the paper seriously underestimates the
| difficulty and time it requires to replicate experiments in many
| experimental fields. Who will decide which work needs to be
| replicated? Should capable labs somehow become bogged down with
| just doing replication work? Even if they don't find the results
| interesting?
|
| In reality if labs find results interesting enough to replicate
| they will try to do so. The current LK-99 hurrah is a perfect
| example of that, but it happens on a much smaller scale all the
| time. Researchers do replicate and build on other work all the
| time, they just use that replication to create new results (and
| acknowledge the previous work) instead of publishing a "we
| replicated paper".
|
| Where things usually fail is in publication of "failed
| replication" studies, and those are tricky. It is not always
| clear if the original research was flawed or the people trying to
| reproduce made an error (again just have a look at what's
| happening with LK-99 at the moment). Moreover, it can be
| politically difficult to try to publish a "fail to reproduce"
| result if you are a small, unknown lab and the original result
| came from a big, well-known group. Most people will believe that
| you are the one who made the error (and unfortunately big egos
| might get in the way, and the small lab will have a hard time).
|
| More generally, in my opinion the lack of replication of results
| is just one symptom of a bigger problem in science today. We (as
| in society) have made the scientific environment increasingly
| competitive, under the guise of "value for taxpayer money".
| Academic scientists now have to constantly compete for grant
| funding and publish to keep the funding going. It's incredibly
| competitive to even get in ... At the same time they are supposed
| to constantly provide big headlines for university press
| releases, communicate their results to the general public and
| investigate (and patent) the potential for commercial
| exploitation. No wonder we see less cooperation.
| eesmith wrote:
| > the real test of a paper should be the ability to reproduce its
| findings in the real world. ...
|
| > What if all the experiments in the paper are too complicated to
| replicate? Then you can submit to [the Journal of Irreproducible
| Results].
|
| Observational science is still a branch of science even if it's
| difficult or impossible to replicate.
|
| Consider the first photographs of a live giant squid in its
| natural habitat, published in 2005 at
| https://royalsocietypublishing.org/doi/10.1098/rspb.2005.315... .
|
| Who seriously thinks this shouldn't have been published until
| someone else had been able to replicate the result?
|
| Who thinks the results of a drug trial can't be published until
| they are replicated?
|
| How does one replicate "A stellar occultation by (486958) 2014
| MU69: results from the 2017 July 17 portable telescope campaign"
| at
| https://ui.adsabs.harvard.edu/abs/2017DPS....4950403Z/abstra...
| which required the precise alignment of a star, the trans-
| Neptunian object 486958 Arrokoth, and a region in Argentina?
|
| Or replicate the results of the flyby of Pluto, or flying a
| helicopter on Mars?
|
| Here's a paper I learned about from "In The Pipeline"; "Insights
| from a laboratory fire" at
| https://www.nature.com/articles/s41557-023-01254-6 .
|
| """Fires are relatively common yet underreported occurrences in
| chemical laboratories, but their consequences can be devastating.
| Here we describe our first-hand experience of a savage laboratory
| fire, highlighting the detrimental effects that it had on the
| research group and the lessons learned."""
|
| How would peer replication be relevant?
| phpisthebest wrote:
| I think in some of those cases you have conclusions drawn from
| raw data that could be replicated or reviewed. For example many
| teams use the same raw data from large colliders, or JWST, or
| other large science projects to reach competing conclusions.
|
| Yes in a perfect world we would also replicate the data
| collection but we do not live in a perfect world
|
| The same is true for drug trials; there is always a battle over
| getting the raw data from drug trials, as the companies claim
| that data is a trade secret, so independent verification of drug
| trials is very expensive. But if the FDA required not just the
| release of redacted conclusions and supporting redacted data,
| but 100% of all data gathered, it would be a lot better IMO.
|
| For example, the FDA says it will take decades to release the
| raw data from the COVID vaccine trials... Why? And that is
| after being forced to do so via a lawsuit.
| eesmith wrote:
| > For example many teams use the same raw data from large
| colliders, or JWST, or other large science projects to reach
| competing conclusions.
|
| Yes, but why must the first team wait until the second is
| finished before publishing?
|
| What if you are the only person in the world with expertise
| in the fossil record of an obscure branch of snails? You
| spend 10 years developing a paper knowing that the next
| person with the right training to replicate the work might
| not even be born yet.
|
| Other paleontologists might not be able to replicate the
| work, but still tell if it's publishable - that's what they
| do now, yes?
|
| > but we do not live in a perfect world
|
| Alternatively, we don't live in a perfect world which is why
| we have the current system instead of requiring replication
| first.
|
| Since the same logic works for both cases, I don't think it's
| persuasive logic.
|
| > the FDA says it will take decades
|
| Well, that's a tangent. The FDA is charged with protecting
| and promoting public health, not improving the state of
| scholarly literature.
|
| And the FDA is only one of many public health organizations
| which carried out COVID vaccine trials.
| msla wrote:
| With some of the things, but admittedly not most of the things
| you mentioned, there's a dataset (somewhere) and some code run
| on that dataset (somewhere) and replication would mean someone
| else being able to run that code on that dataset and get the
| same results.
|
| Would this require labs to improve their software environments
| and learn some new tools? Would this require labs to give up
| whatever used to be secret sauce? That's. The. Point.
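|
| A minimal sketch of that baseline (the file name, the expected
| hash and the analysis itself are placeholders, not from any
| real study): replication then just means same data hash in,
| same number out, on anyone's machine.
|
|     import hashlib, json
|     import numpy as np
|
|     DATA_FILE = "dataset.csv"   # placeholder name
|     EXPECTED_SHA256 = "..."     # published with the paper
|     SEED = 20230806             # fixed and stated up front
|
|     def sha256(path):
|         with open(path, "rb") as f:
|             return hashlib.sha256(f.read()).hexdigest()
|
|     def analysis(rows, rng):
|         # Stand-in for the paper's real analysis: a
|         # bootstrap estimate of the mean.
|         idx = rng.integers(0, len(rows), (1000, len(rows)))
|         return float(np.mean(rows[idx]))
|
|     if __name__ == "__main__":
|         assert sha256(DATA_FILE) == EXPECTED_SHA256
|         rows = np.loadtxt(DATA_FILE, delimiter=",")
|         rng = np.random.default_rng(SEED)
|         print(json.dumps({"estimate": analysis(rows, rng)}))
|
| Anything the script needs that is not in the repo is exactly
| the "secret sauce" that should have been published.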
| counters wrote:
| In practice this is happening in many disciplines, for most
| research, on a daily basis. What _isn't_ happening is that
| the results of these replications are being independently
| peer reviewed, because that isn't incentivized. However, when
| replication fails for whatever reason, it usually leads to
| insights that themselves lead to stronger scientific work and
| better publications later on.
| eesmith wrote:
| > someone else being able to run that code on that dataset
| and get the same results.
|
| I think when people talk about "replicate" they mean
| something more than that.
|
| The dataset could contain coding errors, and the analysis
| could contain incorrect formulas and bad modeling.
| Reproducing a bad analysis, successfully, provides no
| corrective feedback.
|
| I know for one paper I could replicate the paper's results
| using the paper's own analysis, but I couldn't replicate the
| paper's results using my analysis.
|
| > Would this require labs to give up whatever used to be
| secret sauce? That's. The. Point.
|
| That seems to be a very different Point.
|
| Newton famously published results derived using his secret
| sauce - calculus - by recasting them using more traditional
| methods.
|
| In the extreme case, I could publish the factors for RSA-1024
| without publishing my factorization method. "I prayed to God
| for the answer and He gave them to me." You can verify that
| result without the secret sauce.
|
| I mean, people use all sorts of methods to predict a protein
| structure, including manual tweaking guided by intuition and
| insight gained during a reverie or day-dream (a la Kekule)
| which is clearly not reproducible. Yet that final model may
| be publishable, because it may provide new insight and
| testable predictions.
| msla wrote:
| My point is that we can, apparently, improve the baseline
| expectations in the parts of science where this kind of
| reproducibility is possible. That isn't all science,
| granted, but it is some science. It isn't a panacea,
| granted, but it could guard against some forms of
| misconduct or honest error some of the time. The self-
| correcting part of science only works when there's
| something for it to work on, so open data and runnable code
| ought to improve that self-correction mechanism.
| eesmith wrote:
| Understood.
|
| But my point is this linked-to essay appears not only to
| exclude some areas of good science, but to suggest that
| any topic which cannot be replicated before publication
| is only worthy of publication in the Journal of
| Irreproducible Results.
|
| I gave examples to highlight why I disagree with author's
| opinion.
|
| Please do not interpret this to mean I do not think
| improvement is possible.
| kergonath wrote:
| > Who seriously thinks this shouldn't have been published until
| someone else had been able to replicate the result?
|
| Nobody, obviously. You cannot reproduce a result that hasn't
| been published, so no new phenomenon is replicated the moment
| it is first published. The problem is not the publication of
| new discoveries, it's the lack of incentives to confirm them
| once they've been published.
|
| In your example, new observations of giant squids are still
| massively valuable even if not that novel anymore. So new
| observations should be encouraged (as I am sure they are).
|
| > Or replicate the results of the flyby of Pluto, or flying a
| helicopter on Mars?
|
| Well, we should launch another probe anyway. And I am fairly
| confident we'll have many instances of aircraft in Mars'
| atmosphere and more data than we'll know what to do with. We
| can also simulate the hell out of it. We'll point spectrometers
| and a whole bunch of instruments towards Pluto. These are not
| really good examples of unreproducible observations.
|
| Besides, in such cases robustness can be improved by different
| teams performing their own analyses separately, even if the
| data comes from the same experimental setup. It's not all black
| or white. Observations are on a spectrum, some of them being
| much more reliable than others and replication is one aspect of
| it.
|
| > How would peer replication be relevant?
|
| How would you know which aspects of the observed phenomena come
| from particularities of this specific lab? You need more than
| one instance. You need some kind of statistical and factor
| analyses. Replication in this instance would not mean setting
| actual labs on fire on purpose.
|
| It's exactly like studying car crashes: nobody is going to kill
| people on purpose, but it is still important to study them so
| we regularly have new papers on the subject based on events
| that happened anyway, each one confirming or disproving
| previous observations.
| eesmith wrote:
| > Nobody, obviously. You cannot reproduce a result that
| hasn't been published, .. The problem is not the publication
| of new discoveries, it's the lack of incentives to confirm
| them once they've been published.
|
| Your comment concerns post-publication peer-replication, yes?
|
| If so, it's a different topic. The linked-to essay
| specifically proposes:
|
| ""Instead of sending out a manuscript to anonymous referees
| to read and review, preprints should be sent to other labs to
| actually replicate the findings. Once the key findings are
| replicated, the manuscript would be accepted and published.""
|
| That's _pre-publication_ peer-replication, and my comment was
| only meant to be interpreted in that light.
| kergonath wrote:
| > That's pre-publication peer-replication, and my comment
| was only meant to be interpreted in that light.
|
| Sorry, I might have gotten mixed up between threads.
|
| Yeah, pre-publication replication is nice (I do it when I
| can and am suspicious of some simulation results), but is
| not practical at scale. Besides, the role of peer review is
| not to ensure results are right, that is just not
| sustainable for referees.
| hinkley wrote:
| Is there space in the world for a few publications that only
| publish replicated work? Seems like that would be a reasonable
| compromise. Yes you were published, but were you published in
| Really Real Magazine? Get back to us when you have and we'll
| discuss.
| hospadar wrote:
| I assume that the goal here is to reduce the number of not-
| actually-valid results that get published. Not-actually-valid
| results happen for lots of reasons (whoops did experiment wrong,
| mystery impurity, cherry picked data, not enough subjects,
| straight-up lie, full verification expensive and time consuming
| but this looks promising) but often there's a common set of
| incentives: you must publish to get tenure/keep your job, you
| often need to publish in journals with high impact factor [1].
|
| High impact journals [6] tend to prefer exciting, novel, and
| positive results (we tried new thing and it worked so well!) vs
| negative results (we mixed up a bunch of crystals and absolutely
| none of them are room-temp superconductors! we're sure of it!).
|
| The result is that cherry picking data pays, leaning into
| confirmation bias pays, publishing replication studies and
| rigorous but negative results is not a good use of your academic
| inertia.
|
| I think that creating a new category of rigor (i.e. journals that
| only publish independently replicated results) is not a bad idea,
| but: who's gonna pay for that? If the incentive is you get your
| name on the paper, doesn't that incentivize coming up with a
| positive result? How do you incentivize negative replications?
| What if there is only one gigantic machine anywhere that can find
| those results (LHC, icecube, etc, a very expensive spaceship)?
|
| There might be easier and cheaper pathways to reducing bad papers
| - incentivizing the publishing of negative results and
| replication studies separately, paying reviewers for their time,
| coming up with new metrics for researchers that prioritize
| different kinds of activity (currently "how much you're cited"
| and "number of papers*journal impact" things are common, maybe a
| "how many results got replicated" score would be cool to roll
| into "do you get tenure"? See [3] for more details). PLoS
| publish.
|
| I really like OP's other article about a hypothetical "Journal of
| One Try" (JOOT) [2] to enable publishing of not-very-rigorous-
| but-maybe-useful-to-somebody results. If you go back and read OLD
| OLD editions of Philosophical Transactions (which goes back to
| the 1600's!! great time, highly recommend [4], in many ways the
| archetype for all academic journals), there are a ton of wacky
| submissions that are just little observations, small experiments,
| and I think something like that (JOOT let's say) tuned up for the
| modern era would, if nothing else, make science more fun. Here's
| a great one about reports of "Shining Beef" (literally beef that
| is glowing I guess?) enjoy [5]
|
| [1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6668985/ [2]
| https://web.archive.org/web/20220924222624/https://blog.ever...
| [3] https://www.altmetric.com/ [4]
| https://www.jstor.org/journal/philtran1665167 [5]
| https://www.jstor.org/stable/101710 [6]
| https://en.wikipedia.org/wiki/Impact_factor, see also
| https://clarivate.com/
| throwawaymaths wrote:
| How about we create a Nobel prize for replication. One impressive
| replication or refutation from the last decade (that holds up)
| gets the prize, split up to three ways among the most important
| authors.
| 37326 wrote:
| [flagged]
| elashri wrote:
| Great, but who is going to fund the peer replication? The
| economics of research doesn't even provide compensation for
| time spent on the peer review process.
| nine_k wrote:
| Maybe the numerous complaints about the crisis of science are
| somehow related to the fact that scientific work is severely
| underpaid.
|
| The pay difference between research and industry in many areas
| is not even funny.
| matthewdgreen wrote:
| The purpose of science publications is to share new results with
| other scientists, so others can build on or verify the
| correctness of the work. There has always been an element of
| "receiving credit" to this, but the communication aspect is what
| actually matters _from the perspective of maximizing scientific
| progress._
|
| In the distant past, publication was an informal process that
| mostly involved mailing around letters, or for a major result,
| self-publishing a book. Eventually publishers began to devise
| formal journals for this purpose, and some of those journals
| began to receive more submissions than it was feasible to publish
| or verify just by reputation. Some of the more popular journals
| hit upon the idea of applying basic editorial standards to reject
| badly-written papers and obvious spam. Since the journal editors
| weren't experts in all fields of science, they asked for
| volunteers to help with this process. That's what peer review is.
|
| Eventually bureaucrats (inside and largely outside of the
| scientific community) demanded a technique for measuring the
| productivity of a scientist, so they could allocate budgets or
| promotions. They hit on the idea of using publications in a few
| prestigious journals as a metric, which turned a useful process
| (sharing results with other scientists) into [from an outsider
| perspective] a process of receiving "academic points", where the
| publication of a result appears to be the end-goal and not just
| an intermediate point in the validation of a result.
|
| Still other outsiders, who misunderstand the entire process, are
| upset that intermediate results are sometimes incorrect. This
| confuses them, and they're angry that the process sometimes
| assigns "points" to people who they perceive as undeserving. So
| instead of simply accepting that _sharing results widely to
| maximize the chance of verification_ is the whole point of the
| publication process, or coming up with a better set of promotion
| metrics, they want to gum up the essential sharing process to
| make it much less efficient and reduce the fan-out degree and
| rate of publication. This whole mess seems like it could be
| handled a lot more intelligently.
| nine_k wrote:
| For sharing results widely, there's arXiv. The problem is that
| the fan-out is now overwhelming.
|
| The public perception of a publication in a prestigious journal
| as the established truth does not help, either.
| isaacremuant wrote:
| > The public perception of a publication in a prestigious
| journal as the established truth does not help, either.
|
| It's not so much the public perception as what governments,
| media, tech, and other institutions have pushed so that the
| public doesn't question whatever resulting policy they're
| trying to put forth.
|
| "Trust the science" means "Thou shalt not question us, simply
| obey".
|
| Anyone with eyes who has worked in institutions knows that
| bureaucracy, careerism and corruption are intrinsic to them.
| casualscience wrote:
| Most of this is very legit, but this
|
| > Still other outsiders, who misunderstand the entire process,
| are upset that intermediate results are sometimes incorrect.
| This confuses them, and they're angry that the process
| sometimes assigns "points" to people who they perceive as
| undeserving. So instead of simply accepting that sharing
| results widely to maximize the chance of verification is the
| whole point of the publication process, or coming up with a
| better set of promotion metrics, they want to gum up the
| essential sharing process to make it much less efficient and
| reduce the fan-out degree and rate of publication.
|
| Does not represent my experience in the academy at all. There
| is a ton of gamesmanship in publishing. That is ultimately the
| yardstick academics are measured against, whether we like it or
| not. No one misunderstands that, IMO; the issue is that it's a
| poor incentive. I think creating a new class of publication,
| one that requires replication, could be workable in some fields
| (e.g. optics/photonics), but probably is totally impossible in
| others (e.g. experimental particle physics).
|
| For purely intellectual fields like mathematics, theoretical
| physics, and philosophy, you probably don't need this at all.
| Then there are 'in the middle' fields like machine learning,
| which in theory would be easy to replicate, but would also be
| prohibitively expensive for, e.g., baseline training of LLMs.
| Maxion wrote:
| And on the extreme end you have the multi-decade longitudinal
| studies in epidemiology / biomedicine that would be more-or-
| less impossible to replicate.
| [deleted]
| sebastos wrote:
| Very well put. This is the clearest way of looking at it in my
| view.
|
| I'll pile on to say that you also have the variable of how the
| non-scientist public gleans information from the academics.
| Academia used to be a more insular cadre of people seeking
| knowledge for its own sake, so this was less relevant. What's
| new here is that our society has fixated on the idea that
| matters of state and administration should be significantly
| guided by the results and opinions of academia. Our enthusiasm
| for science-guided policy is a triple whammy, because:
|
| 1. Knowing that the results of your study have the potential to
| affect policy creates incentives that may change how the
| underlying science is performed.
|
| 2. Knowing that the results of academia have outside influence
| may change WHICH science is performed, and draw in
| less-than-impartial actors to perform it.
|
| 3. The outsized potential impact invites the uninformed public
| to peer into the world of academia and draw half-baked
| conclusions from results that are still preliminary or
| unreplicated.
|
| Relatively narrow or specious studies can gain a lot of undue
| traction if their conclusions appear, to the untrained eye, to
| provide a good bat to hit your opponent with.
| Maxion wrote:
| A significant problem we face today is the way research,
| especially in academia, gets spotlighted in the media. They
| often hyper-focus on single studies, which can give a skewed
| representation of scientific progress.
|
| The reality is that science isn't about isolated findings;
| it's a cumulative effort. One paper might suggest a
| conclusion, but it's the collective weight of multiple
| studies that provides a more rounded understanding. Media's
| tendency to cherry-pick results often distorts this nuanced
| process.
|
| It's also worth noting the trend of prioritizing certain
| studies, like large RCTs or systematic reviews, while
| overlooking smaller ones, especially pilot studies. Pilot
| studies are foundational--they often act as the preliminary
| research needed before larger studies can even be considered
| or funded. By sidelining or dismissing these smaller,
| exploratory studies, we risk undermining the very foundation
| that bigger, more definitive research efforts are built on.
| If we consistently ignore or undervalue pilot studies, the
| bigger and often more impactful studies may never even see
| the light of day.
| dmbche wrote:
| Your analysis seems to portray all scientists as pure-hearted.
| May I remind you of the latest Stanford scandal, where the
| president of Stanford was found to have manipulated data?
|
| Today, publications do not serve the same purpose as they did
| before the internet. It is trivial today to write a convincing
| paper without doing the research and to get it published
| (www.theatlantic.com/ideas/archive/2018/10/new-sokal-hoax/572212/).
| matthewdgreen wrote:
| No subset of humanity is "pure hearted." Fraud and malice
| will exist in everything people do. Fortunately these
| fraudulent incidents seem relatively rare, when one compares
| the number of reported incidents to the number of
| publications and scientists. But this doesn't change
| anything. The benefit of scientific publication is _to make
| it easier to detect and verify incorrect results_ , which is
| exactly what happened in this case.
|
| I understand that it's frustrating it didn't happen
| instantly. And I also understand that it's deeply frustrating
| that some undeserving person accumulated status points with
| non-scientists based on fraud, and that let them take a high-
| status position outside of their field. (I think maybe you
| should assign some blame to the Stanford Trustees for this,
| but that's up to you.) None of this means we'd be better off
| making publication more difficult: it means the metrics are
| bad.
|
| PS: When TFA raises something like "the replication crisis"
| and then entangles it with accusations of deliberate fraud
| (high-profile but exceedingly rare), it's like trying to have
| a serious conversation about automobile accidents but spending
| half the conversation on a handful of rare incidents of
| intentional vehicular homicide. You're not going to get
| useful solutions out of this conversation, because it's
| (perhaps deliberately) misunderstanding the impact and causes
| of the problem.
| mike_hearn wrote:
| Fraud isn't exceedingly rare :( It only seems that way
| because academia doesn't pay anyone to find it, reacts to
| volunteer reports by ignoring it, and the media generally
| isn't interested.
|
| Fraud is so frequent and easy to find that there are
| volunteers who in their spare time manage to routinely
| uncover not just individual instances of fraud but entire
| companies whose sole purpose is to generate and sell fake
| papers on an industrial scale.
|
| https://www.nature.com/articles/d41586-023-01780-w
|
| Fraud is so easy and common that there is a steady stream of
| journals which publish entire editions consisting of nothing
| but AI-generated articles!
|
| https://www.nature.com/articles/d41586-021-03035-y
|
| SciGen - a Perl script - was written as a joke over a decade
| ago, and yet you can page through an endless stream of papers
| that it generated and that are still getting published:
|
| https://pubpeer.com/search?q=scigen
|
| The problem is so prevalent that some people created the
| Problematic Paper Screener, a tool that automatically
| locates articles that contain text indicative of auto-
| generation.
|
| https://dbrech.irit.fr/pls/apex/f?p=9999:1::::::
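|
| (For illustration only: the general idea behind that kind of
| screening, as I understand it, is matching text against known
| "tortured phrases" - paraphrasing-tool output like "counterfeit
| consciousness" for "artificial intelligence". A toy Python
| sketch of the idea, not the actual Problematic Paper Screener:
|
|     # Toy "tortured phrase" check; not the real tool.
|     TORTURED_PHRASES = {
|         "counterfeit consciousness": "artificial intelligence",
|         "profound learning": "deep learning",
|         "colossal information": "big data",
|     }
|
|     def flag_tortured_phrases(text):
|         # return (phrase, likely original term) pairs found
|         lowered = text.lower()
|         return [(p, o) for p, o in TORTURED_PHRASES.items()
|                 if p in lowered]
|
|     print(flag_tortured_phrases(
|         "We apply profound learning to colossal information."))
|
| Crude, but apparently crude is enough to surface a lot of
| suspect papers.)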
|
| This is all pre-ChatGPT, and covers just the researchers who
| can't be bothered to write a paper at all. The more serious
| problem is all the human-written fraudulent papers with bad
| data and bad methodologies that are never detected, or are
| only detected by randos with blogs or Twitter accounts that
| you never hear about.
| dmbche wrote:
| Thank you - just discovered SciGen; these links are
| incredible
| dmbche wrote:
| For your analogy on car accidents - a notable difference
| between the two is that in the case of car accidents, we are
| able to get numbers on when, how and why they happen and
| then draw conclusions from that.
|
| In this case, we are not even aware of most instances of
| fraud/"bad papers"/manipulation - the "crisis" is that we
| are losing faith in the science we are doing. Results that
| were cornerstones of entire fields are found to be
| nonreproducible, making all the work built on top of them
| pointless (psychology, cancer, economics, etc. - I'm being
| very broad).
|
| At this point, we don't know how deep the rot goes. We are at
| the point of recognizing that it's real and looking for
| solutions. For car accidents, we're past that - we're just
| arguing about which solutions are best. For the replication
| crisis, we're still trying to find a way forward.
|
| Like that scene in The Thing, where they test the blood?
| We're at the point where we don't know who to trust.
|
| PS: what's a TFA?
| [deleted]
| 6510 wrote:
| Seems like a great way for "inferior" journals to gain
| reputation. Counting citations seems a pretty silly
| formula/hack. How often something gets cited doesn't affect
| how true it is.
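|
| (To be concrete about what "counting citations" means: the
| standard two-year impact factor is, roughly, the citations a
| journal's recent items received this year divided by the number
| of citable items it published in the previous two years. A
| minimal sketch of that arithmetic, not Clarivate's exact
| calculation:
|
|     def two_year_impact_factor(citations_in_year, citable_items):
|         # citations_in_year: citations received in year Y to
|         # items the journal published in years Y-1 and Y-2
|         # citable_items: citable items published in Y-1 and Y-2
|         return citations_in_year / citable_items
|
|     # e.g. 1200 citations to 400 recent items -> IF of 3.0
|     print(two_year_impact_factor(1200, 400))
|
| Nothing in that ratio says anything about whether any of the
| cited claims are true.)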
| SubiculumCode wrote:
| Scientist publishes paper based on ABCD data.
|
| Replicator: Do you know how much data I'll need to collect?
| 11,000 participants followed across multiple timepoints of MRI
| scanning. Show me the money.
| petesergeant wrote:
| Definitely something that needs large charitable investment,
| but charities like that do exist, eg Wellcome Trust
| SubiculumCode wrote:
| Like 290+ million, just to get started.
| jhart99 wrote:
| Replication in many fields comes with substantial costs. We are
| unlikely to see this strategy employed on many/most papers. I
| agree with other commenters that materials and methodology should
| be provided in sufficient detail so that others could replicate
| if desired.
| leedrake5 wrote:
| Peer Review is the right solution to the wrong problem:
| https://open.substack.com/pub/experimentalhistory/p/science-...
|
| On replication, it is a worthwhile goal but the career incentives
| need to be there. I think replicating studies should be a part of
| the curriculum in most programs - a step toward getting a PhD in
| lieu of one of the papers.
| vinnyvichy wrote:
| Fear of the frontier... that's why, instead of people getting
| excited to look for new room-temperature, standard-pressure
| (RTSP) superconductor candidates, we get a lot of talk
| downplaying the only known one. Strong link vs weak link
| reminds me of how some cultures frown on stimulants while
| other cultures frown on relaxants.
| nomilk wrote:
| https://web.archive.org/web/20230130143126/https://blog.ever...
| the_arun wrote:
| Thank you. The original article is currently throttled.
|
| Seems like the article is not about software code.
| fodkodrasz wrote:
| How would you peer-replicate the observation of a rare or
| unique event, for example in astronomy?
| lordnacho wrote:
| Either get your own telescope and gather your own data, or if
| only one telescope captured a fleeting event, take that data
| and see if the analysis turns out the same.
| GuB-42 wrote:
| Peer review is not the end. When replication is particularly
| complex or expensive, peer review may just be a way to see if
| the study is worth replicating.
| hgsgm wrote:
| The problem is equating publication with truth.
|
| Publication is a _starting point_, not a _conclusion_.
|
| Publication is like submitting your code: it still needs to be
| tested, rolled out, evaluated, and proven over time.
| miga wrote:
| Peer review does not serve to assure replication, but to assure
| readability and comprehensibility of the paper.
|
| Given that some experiments cost billions to conduct, it is
| impossible to implement "Peer Replication" for all papers.
|
| What could be done is to add metadata about papers that were
| replicated.
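|
| (A minimal sketch of what such metadata could look like, as a
| hypothetical record attached to a paper's entry in a database
| like Crossref or a journal's own index - the field names and
| DOIs here are made up for illustration:
|
|     # Hypothetical replication metadata for one paper (Python)
|     replication_record = {
|         "doi": "10.1234/example.doi",
|         "replications": [
|             {
|                 "replicated_by_doi": "10.1234/replication.doi",
|                 "outcome": "replicated",  # or "failed"/"partial"
|                 "notes": "same effect, smaller magnitude",
|             },
|         ],
|     }
|
| Even something this simple, surfaced next to the original
| paper, would let readers see at a glance whether anyone has
| tried to reproduce it.)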
| kergonath wrote:
| Barriers to publication should be lower for replication
| studies, I think that's the main problem.
|
| If someone wants to spend some time replicating something
| that's only been described in a paper or two, that is valuable
| work for the community and should be encouraged. If the person
| is a PhD student using that as an opportunity to hone their
| skills, it's even better. It's not glamorous, it's not
| something entirely new, but it is _useful_ and _important_. And
| this work needs to go to normal journals, otherwise there will
| just be journals dedicated to replication, their impact factor
| will be terrible, and nobody will care.
| s1artibartfast wrote:
| There are basically no barriers to publication. There are a
| number of normal journals that publish everything submitted,
| as long as it appears to be honest research.
| kergonath wrote:
| Not nice journals, though. At least not in my experience
| but that's probably very field-dependent. It's not uncommon
| to get a summary rejection letter for lack of novelty and
| that is one aspect they stress when they ask us to review
| articles.
| s1artibartfast wrote:
| But novelty IS what makes those journals nice and
| prestigious in the first place. It is the basis of their
| reputation.
|
| It's basically a catch-22. We want replication in prestigious
| journals, but any journal with replications becomes less novel
| and prestigious.
|
| It all comes down to what people value about journals. If
| people valued replication more than novelty, replication
| journals would be the prestigious ones.
|
| It all comes back to the fact that doing novel science is
| considered more prestigious than replication.
| Institutions can play all kinds of games to try to make
| it harder for readers to tell novelty apart from
| replication, but people will just find new ways to signal
| and determine the difference.
|
| Let's say we pass a law that prestigious journals must publish
| 50% replications. The prestige will just shift to publishing in
| that journal with something like "first demonstration" in the
| title, or to publishing in that journal plus having a high
| citation or impact value.
|
| It is really difficult to come up with a system- or
| institution-level solution when novelty is still what
| individuals value. As long as companies and universities value
| innovation, they will keep figuring out ways to determine which
| scientists are innovative, and keep valuing them more.
| strangattractor wrote:
| Maybe add people as special authors/contributors to the
| original work.
|
| There always seems to be a contingent of people who think that
| anything less than a 100% solution is inadequate, so nothing is
| done. Peer review has proven itself inadequate and people hang
| on to it tooth and nail. Some disciplines should require
| replication for everything - I won't name Psychology or the
| Social Sciences in general, but the failure-to-replicate rate
| for some is unacceptable.
| ebiester wrote:
| Let's not make the perfect the enemy of the good. We may never
| be able to replicate every field, but we could start in many
| fields today. It means changing our values to make replication
| a valid path to tenure and promotion, and a required element of
| Ph.D. studies.
| julienreszka wrote:
| >Experiments that cost billions to conduct
|
| If you can't replicate them it's like they didn't happen
| anyways
| thfuran wrote:
| So no experiments have happened because I don't have a lab,
| and CERN is just an elaborate ruse?
| kergonath wrote:
| It's a bit more subtle than that. Not all papers are equal
| and I'd trust an article from a large team where error and
| uncertainty analysis has been done properly (think the Higgs
| boson paper) over a handful of dodgy experiments that are
| barely documented properly.
|
| But yeah, in the grand scheme of things if it hasn't been
| replicated, then it hasn't been proven, but some works are
| credible on their own.
| tnecniv wrote:
| Ah yes, if I can't run the LHC at home, none of the work
| there happened
| mathisfun123 wrote:
| >Peer review does not serve to assure replication, but assure
| readability and comprehensibility of the paper.
|
| I have had a paper rejected twice in a row over the last year.
| Both times the comments included something like "paper was very
| well-written; well-written enough that an undergrad could read
| it".
|
| Peer review ensures the gates are kept.
| NalNezumi wrote:
| Isn't readability and comprehensibility the job of the
| editor/journal to check (after all, they're actually paid)?
| Maybe not for conferences, but peer review is more for checking
| whether the methodology, scope, claims, direction, conclusions
| and relevance are sound and trustworthy.
|
| At least that's my understanding.
| hedora wrote:
| In CS, the editors / journals don't do those things; the
| reviewers do. (Sometimes reviewers "shepherd" papers to help
| fix readability after acceptance.)
|
| Also, most work goes to conferences; journals typically publish
| longer versions of already-published conference papers.
| kergonath wrote:
| The editor is often not the right person to decide based on
| technical details. Most often, the articles they receive are
| outside their field of expertise and they don't really have a
| way of deciding if a section is comprehensible or not. It's
| very difficult for an outsider to know what bit of jargon is
| redundant and what bit is actually important to make sense of
| the results. So this bit of readability check falls to the
| referees.
|
| In theory editors (or rather copyeditors, the editors
| themselves have to handle too many papers to do this sort of
| thing) should help with things like style, grammar, and
| spelling. In practice, quality varies but it is often subpar.
| kkylin wrote:
| Highly dependent on journal / field. In mine (mathematics),
| most associate editors work for free, same as reviewers. The
| reviewers do all the things you say, and in addition try to
| ensure readability & novelty. Most journals do have
| professional copy editing, but that's separate from the
| content review.
|
| I don't know how refereed conference proceedings work (we
| don't really use these). The only journals I know of that
| have professional editors (i.e., editors who are not active
| researchers themselves) are Nature and affiliated journals,
| but someone more knowledgeable should correct me here.
___________________________________________________________________
(page generated 2023-08-06 23:00 UTC)