[HN Gopher] Deep learning gets the glory, deep fact checking gets ignored
       ___________________________________________________________________
        
       Deep learning gets the glory, deep fact checking gets ignored
        
       Author : chmaynard
       Score  : 127 points
       Date   : 2025-06-03 21:31 UTC (1 hour ago)
        
 (HTM) web link (rachel.fast.ai)
 (TXT) w3m dump (rachel.fast.ai)
        
       | amelius wrote:
       | Before making AI do research, perhaps we should first let it
       | __reproduce__ research. For example, give it a paper of some deep
       | learning technique and make it produce an implementation of that
       | paper. Until it can do that, I have no hope that it can produce
       | novel ideas.
        
         | YossarianFrPrez wrote:
         | Seconded, as not only is this an interesting idea, it might
         | also help solve the issue of checking for reproducibility. Yet
         | even then human evaluators would need to go over the AI-
         | reproduced research with a fine-toothed comb.
         | 
         | Practically speaking, I think there are roles for current LLMs
         | in research. One is in the peer review process. LLMs can assist
         | in evaluating the data-processing code used by scientists.
         | Another is for brainstorming and the first pass at lit reviews.
        
         | ojosilva wrote:
         | I thought you were going to say "give AI the first part of a
         | paper (prompt) and let it finish it (completion)" as a
         | validation that AI can produce science on par with research
         | results. Until it can do that, I have no hope that it can
         | produce novel ideas.
        
       | kenjackson wrote:
       | "And for most deep learning papers I read, domain experts have
       | not gone through the results with a fine-tooth comb inspecting
       | the quality of the output. How many other seemingly-impressive
       | papers would not stand up to scrutiny?"
       | 
       | Is this really not the case? I've read some of the AI papers in
       | my field, and I know many other domain experts have as well. That
       | said, I do think that CS/software-based work is generally easier
       | to check than biology (or it may just be because I know very
       | little bio).
        
       | slt2021 wrote:
       | Fantastic article by Rachel Thomas!
       | 
       | This is basically another argument that deep learning works only
       | as [generative] information retrieval, i.e. a stochastic parrot,
       | because the training data is a very lossy representation of the
       | underlying domain.
       | 
       | Because the data/labels of genes do not always represent the
       | underlying domain (biology) perfectly, the output can be
       | false/invalid/nonsensical.
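       | 
       | As a toy illustration of that point (my own sketch, nothing to
       | do with the actual gene data): if the labels misrepresent the
       | domain in a systematic way, a flexible model will faithfully
       | learn the error and reproduce it on unseen data, while a
       | benchmark scored against the same flawed labels looks great.
       | 
       |   import numpy as np
       |   from sklearn.ensemble import RandomForestClassifier
       | 
       |   rng = np.random.default_rng(0)
       |   X = rng.normal(size=(6000, 5))
       |   # the "real" underlying domain
       |   true_y = (X[:, 0] > 0).astype(int)
       |   # systematic annotation error: whenever feature 1 is large,
       |   # the database label is wrong (think: one gene family that
       |   # is consistently mis-annotated in the source database)
       |   labels = np.where(X[:, 1] > 1.0, 1 - true_y, true_y)
       | 
       |   Xtr, Xte = X[:5000], X[5000:]
       |   ytr, yte = labels[:5000], labels[5000:]
       |   model = RandomForestClassifier(random_state=0).fit(Xtr, ytr)
       |   pred = model.predict(Xte)
       | 
       |   bad = Xte[:, 1] > 1.0   # the mis-annotated region
       |   print("vs. flawed labels:", (pred == yte).mean())
       |   print("vs. truth, in bad region:",
       |         (pred == true_y[5000:])[bad].mean())
       |   # the model largely reproduces the labeling error, and a
       |   # benchmark scored on the same flawed labels never notices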
       | 
       | In cases where it works very well, there is data leakage,
       | because by design LLMs are information retrieval tools. From an
       | information theory standpoint, that leakage is a fundamental
       | "unknown unknown" for any model.
       | 
       | My takeaway is that it's not a fault of the algorithm; it's more
       | the fault of the training dataset.
       | 
       | We humans operate fluidly in the domain of natural language, and
       | even a kid can read and evaluate whether text makes sense or not;
       | this explains the success of models trained for NLP.
       | 
       | But in domains where the training data represents the underlying
       | domain only lossily, the model will be imperfect.
        
       | rustcleaner wrote:
       | What AI needs is a 'reality checker' subsystem. LLMs are like the
       | phantasmal part of your psyche constantly jibbering phrases
       | (ideas), but what keeps all our internal jibberjabbers in our
       | brains from making endless false statements is asking "does my
       | statement describe something falsifiable?" and "is there a
       | detectable falsification?"
       | 
       |  _looks around the room at all the churchgoers_
       | 
       | Well on second review, this isn't true for everybody...
        
         | airstrike wrote:
         | I couldn't agree more. On a random night a few months ago I
         | found myself in that curious half-asleep-half-awake state and
         | this time I became aware of my brain's constant jibbering
         | phrases. It was as if I could hear my thoughts before the
         | filter pass through which they become actual cohesive
         | sentences.
         | 
         | I could "see" hundreds of words/thoughts/meanings being
         | generated in a diffuse way, all at the same time but also
         | slowly evolving over time and then see my brain distill them
         | into a sentence. It would happen repeatedly every second
         | ridiculously fast yet also "slow enough" that I could see it
         | happen.
         | 
         | It's just my personal half-asleep hallucination, so obviously
         | take from it what you will (~nothing) but I can't shake the
         | feeling we need a similar algorithm. If I ever pursue a
         | doctorate degree, this is what I'll be trying.
        
       | aucisson_masque wrote:
       | It's like fake news is taking hold in science now. Saying any
       | stupid thing will attract far more views and "likes" than
       | debunking it.
       | 
       | Except that we can't compare Twitter to the journal Nature.
       | Science is supposed to be immune to this kind of bullshit thanks
       | to reputable journals and peer review, blocking a publication
       | before it does any harm.
       | 
       | Was that a failure of Nature?
        
         | godelski wrote:
         | Yes. And let's not get started on that ML Quantum Wormhole
         | bullshit...
         | 
         | We've taken this all too far. It is bad enough to lie to the
         | masses in pop-sci articles, but we're straight up doing it in
         | top-tier journals. Some are good-faith mistakes, but far more
         | often it seems like due diligence just wasn't done, by either
         | researchers or reviewers.
         | 
         | I at least have to thank the journals. I've hated them for a
         | long time and wanted to see their end - to free up publishing
         | from the bullshit novelty-chasing and the narrowing of
         | research. I just never thought they'd be the ones to put the
         | knife through their own heart.
         | 
         | But I'm still not happy about that tbh. The only result of
         | this is that the public grows to distrust science more and
         | more, at a time when we need that trust more than ever. We
         | can't expect the public to parse nuanced takes about internal
         | quibbling. And we sure as hell shouldn't be giving ammunition
         | to the anti-science crowds, like junk science does...
        
         | lamename wrote:
         | The Bullshit asymmetry principle comes to mind
         | https://en.wikipedia.org/wiki/Brandolini%27s_law
        
         | lamename wrote:
         | Have you seen the statistics about high-impact journals having
         | higher rates of retractions and unverified results?
         | 
         | The root causes can be argued...but keep that in mind.
         | 
         | No single paper is proof. Bodies of work across many labs,
         | independent verification, etc. are the actual gold standard.
        
       | godelski wrote:
       | > although later investigation suggests there may have been data
       | leakage
       | 
       | I think this point is often forgotten. Everyone should assume
       | data leakage until it is strongly evidenced otherwise. It is not
       | on the reader/skeptic to prove that there is data leakage; it is
       | the authors who have the burden of proof.
       | 
       | It is easy to have data leakage even on small datasets - ones
       | where you can look at everything. Data leakage is really easy to
       | introduce, and you often do it unknowingly; subtle things easily
       | spoil data.
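       | 
       | A classic example of "unknowingly", as a toy sketch (mine, not
       | from the post): run feature selection on the whole dataset and
       | only then split into train and test. On pure noise, the leaky
       | pipeline looks far better than chance; the honest one does not.
       | 
       |   import numpy as np
       |   from sklearn.feature_selection import SelectKBest, f_classif
       |   from sklearn.linear_model import LogisticRegression
       |   from sklearn.model_selection import train_test_split
       | 
       |   rng = np.random.default_rng(0)
       |   X = rng.normal(size=(100, 10_000))  # few samples, many feats
       |   y = rng.integers(0, 2, size=100)    # labels are pure noise
       | 
       |   # leaky: feature selection sees every row, including the rows
       |   # that will later become the "held-out" test set
       |   X_sel = SelectKBest(f_classif, k=20).fit_transform(X, y)
       |   Xtr, Xte, ytr, yte = train_test_split(X_sel, y, random_state=0)
       |   clf = LogisticRegression().fit(Xtr, ytr)
       |   print("leaky: ", clf.score(Xte, yte))
       | 
       |   # honest: split first, select features on training rows only
       |   Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
       |   sel = SelectKBest(f_classif, k=20).fit(Xtr, ytr)
       |   clf = LogisticRegression().fit(sel.transform(Xtr), ytr)
       |   print("honest:", clf.score(sel.transform(Xte), yte))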
       | 
       | Now, we're talking about gigantic datasets where there's no
       | chance anyone can manually look through it all. We know the
       | filter methods are imperfect, so how do we come to believe that
       | there is no leakage? You can say you filtered it, but you cannot
       | say there's no leakage.
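       | 
       | A trivial sketch of why "we filtered it" and "there is no
       | leakage" are different claims (my own toy example, not anyone's
       | actual pipeline): an exact-match dedup filter happily passes
       | near-duplicates.
       | 
       |   # exact-match dedup, the kind of filter cheap enough to run
       |   # at web scale
       |   train_set = {"the mitochondria is the powerhouse of the cell."}
       |   candidates = [
       |       "the mitochondria is the powerhouse of the cell.",
       |       "the mitochondria is the powerhouse of the cell",
       |       "The mitochondria is the powerhouse of the cell.",
       |       "mitochondria are the powerhouse of the cell.",
       |   ]
       |   caught = [c for c in candidates if c in train_set]
       |   print(len(caught), "of", len(candidates), "duplicates caught")
       |   # only the exact copy is caught; the punctuation, casing and
       |   # paraphrase variants all sail through the filter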
       | 
       | Beyond that, we are constantly finding spoilage in the datasets
       | we do have access to. So there's frequent evidence that it is
       | happening.
       | 
       | So why do we continue to assume there's no spoilage? Hype?
       | Honestly, it just sounds like a lie we tell ourselves because we
       | want to believe. But we can't fix these problems if we lie to
       | ourselves about them.
        
       | semiinfinitely wrote:
       | there is no truth - only power.
        
       ___________________________________________________________________
       (page generated 2025-06-03 23:00 UTC)