[HN Gopher] Large language models as research assistants
___________________________________________________________________
Large language models as research assistants
Author : leononame
Score : 46 points
Date : 2024-04-27 10:01 UTC (13 hours ago)
(HTM) web link (lemire.me)
(TXT) w3m dump (lemire.me)
| vouaobrasil wrote:
| If we truly need LLMs as research assistants, then I have to
| ask: are we really still doing useful things, or just "playing
| the game" of research? I mean, if we need datacenters and models
| that cost millions to train and megawatts to run, is what comes
| out of it of any real use to us?
|
| Scientific research has come to resemble gambling more and more
| these days: an obsessive quest to accumulate more data,
| theories, and information, rather than an attempt to figure out
| actual improvements to life.
| falcor84 wrote:
| I'm not following your argument at all. Yes, there are
| diminishing returns in science as in everything, but generally
| speaking, all other things being equal, the more resources you
| put into an endeavor, the more you get out of it.
|
| One big example from recent years is AlphaFold, which required
| massive computational resources, and has since its release been
| an ongoing fountain of innovation for biomedical (and
| particularly pharmacological) applications.
| hesiintle wrote:
| > has since its release been an ongoing fountain of
| innovation for biomedical (and particularly pharmacological)
| applications.
|
| [Citation Needed]
|
| Last time I looked into it, as is often the case, the
| 'actual' results were much, much more sobering than the
| headlines seemed to suggest.
| hesiintle wrote:
| I agree wholeheartedly, except:
|
| > quest to accumulate more data, theories, and information,
| rather than
|
| You forgot "more money".
| raphlinus wrote:
| As a counterpoint, I appreciated this recent post by Martin
| Kleppmann[1]:
|
| > I've worked out why I don't get much value out of LLMs. The
| hardest and most time-consuming parts of my job involve
| distinguishing between ideas that are correct, and ideas that are
| plausible-sounding but wrong. Current AI is great at the latter
| type of ideas, and I don't need more of those.
|
| [1]:
| https://bsky.app/profile/martin.kleppmann.com/post/3kquvol6s...
| leononame wrote:
| I think it's valuable for brainstorming and refining my texts.
| As a non-native speaker, it helps me immensely in correcting
| errors and the occasional weird phrase that I accidentally
| translate literally without noticing. But it's only helpful when
| I know enough about the source material to judge the output. I
| wouldn't trust it, e.g., to sift through other people's ideas or
| applications.
| wizzwizz4 wrote:
| > _As a non-native speaker, it helps me immensely in correcting
| errors and the occasional weird phrase that I accidentally
| translate literally without noticing._
|
| Warning: every time I have seen somebody write that, and seen
| an example of their writing, it's been fine to start with but
| the LLM has completely trashed it. See:
| https://meta.stackexchange.com/a/396009/308065
| WanderPanda wrote:
| In the end, an "idea" is by definition contrarian, which is the
| opposite of the training objective of LLMs. The question is how
| far fine-tuning and tree search can go in extrapolating beyond
| the data manifold. And the answer, currently, is probably not
| that far.
| mr_mitm wrote:
| LLMs aren't a good fit for the hardest part of our job. They're
| great at routine mental tasks, though. They take the easy,
| boring, menial yet necessary parts of our jobs off our hands.
| andy99 wrote:
| I think it's generally insulting to your audience to write with
| an LLM. If you don't care about what you're saying, why should
| someone care to read it? I hope the advent of automated writing
| will lead to reforms in the way research is presented, with less
| focus on boilerplate and other stuff nobody cares about (like
| the societal impact statements some conferences force on us).
|
| For grant applications, I agree it's a great tool, because
| they're rife with bureaucratic crap that nobody really needs to
| read or write. In time, again, hopefully the system will be
| reformed so it stops asking for stuff an LLM could generate.
| bjourne wrote:
| > I think it's generally insulting to your audience to write
| with an LLM.
|
| But not as insulting as commenting on HN before even skimming
| the article you're commenting on. :)
| BeetleB wrote:
| > I think it's generally insulting to your audience to write
| with an LLM. If you don't care about what you're saying, why
| should someone care to read it?
|
| I do care about what I'm saying. That's why I review the LLM's
| output and edit it before sending. If an LLM can express what I
| meant to say better than I can, why would I not use it?
|
| Personally, I don't do this because the LLM changes the style
| of my text too much and doesn't sound like me any more. But oh,
| I so do wish it could. Often I type a first draft of an email,
| and I know it needs (simple) editing. If an LLM could do it for
| me, I'd be very happy.
|
| For research papers, writing the introduction is a big headache
| and, frankly, often more of a ritual. It's the least important
| part of the paper. I mean, if all I had to do was describe the
| purpose of my paper, etc., that would be great. But a lot of
| referees want me to load it up with a lot more verbiage to
| satisfy dubious traditions.
|
| Unfortunately, GPT can't do it for me. But it should.
| lnkdinsuxs wrote:
| The Achilles' heels of current LLMs are:
|
| 1. Hallucinations
|
| 2. Prompt injections
|
| Currently, there is no known way to reliably detect either
| using LLMs themselves. If a research-assistant LLM hallucinates,
| and it always sounds extremely confident when it does, it is of
| little use and creates an additional burden, defeating the whole
| point.
|
| Maybe an external validation step that employs a PageRank-like
| algorithm is needed to detect and flag hallucinations? If so,
| how valuable would that company be?
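|
| A minimal sketch of what such an external validation step might
| look like, assuming extracted claims are scored against a
| trusted reference corpus. Toy keyword overlap stands in for real
| retrieval and entailment, and every name here is hypothetical:
|
|     # Flag LLM claims that no trusted document appears to support.
|     def toy_support(claim: str, document: str) -> float:
|         """Fraction of the claim's content words found in the document."""
|         words = {w.lower().strip(".,") for w in claim.split() if len(w) > 3}
|         doc = document.lower()
|         return sum(w in doc for w in words) / len(words) if words else 0.0
|
|     def flag_hallucinations(claims, corpus, threshold=0.5):
|         """Return (claim, best_score) pairs below the support threshold."""
|         scored = [(c, max(toy_support(c, d) for d in corpus)) for c in claims]
|         return [(c, s) for c, s in scored if s < threshold]
|
|     corpus = ["AlphaFold predicts protein structures from sequences."]
|     claims = ["AlphaFold predicts protein structures.",
|               "AlphaFold has cured several major diseases."]
|     for claim, score in flag_hallucinations(claims, corpus):
|         print(f"FLAGGED ({score:.2f}): {claim}")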
| ukuina wrote:
| > Idea generation. I used to spend a lot of time chatting with
| colleagues about a vague idea I had. "How could we check whether
| X is true?" A tool like ChatGPT can help you get started. If you
| ask how to design an experiment to check a given hypothesis, it
| can often do a surprisingly good job.
|
| While GPT4 can recognize an innovative idea, I have yet to see it
| suggest such an idea, or successfully extrapolate or question the
| idea. If you are beyond the "concept space" of the model, it is
| not going to help you explore it.
| drycabinet wrote:
| But some random word in its response can trigger an idea in
| your mind. Getting an idea from a conversation is not always
| about getting it directly. The idea is already in you; you just
| needed a trigger.
| jprete wrote:
| Rubber-ducking is useful, but nobody gives the rubber duck
| anywhere near as much credit as AI enthusiasts give to
| chatbots.
| cl42 wrote:
| I think LLMs can do a lot more than people assume, but they need
| to be given the proper frameworks.
|
| When was the last time a researcher, economist, etc. was given
| 10,000 papers and simply told "do some original work"? That's not
| how it works. Daniel (the author) provides some good examples
| where _streamlined_ work can happen, but again, this is pretty
| basic stuff.
|
| To push this further, though, imagine LLMs that fill in
| frameworks... A few steps here: (1) do a lit review, (2) fill in
| the framework, (3) discuss what might be missing, and maybe even
| try to fill in the missing information (sketched below).
|
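| A minimal sketch of that loop. The llm() helper is a
| hypothetical stand-in for any chat-completion call, and the
| framework fields and prompts are illustrative, not an actual
| system:
|
|     def llm(prompt: str) -> str:
|         """Placeholder for a real model API call."""
|         raise NotImplementedError
|
|     FRAMEWORK = ["key claims", "supporting evidence", "open questions"]
|
|     def fill_framework(papers: list[str]) -> dict[str, str]:
|         # (1) Lit review: summarize each paper, then synthesize.
|         notes = [llm(f"Summarize the main findings:\n{p}") for p in papers]
|         review = llm("Synthesize these summaries:\n" + "\n\n".join(notes))
|         # (2) Fill in each field of the framework from the review.
|         filled = {f: llm(f"From this review, extract {f}:\n{review}")
|                   for f in FRAMEWORK}
|         # (3) Ask what the filled framework might still be missing.
|         filled["gaps"] = llm("What is missing from this analysis?\n"
|                              + str(filled))
|         return filled
|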
| I'm doing something like this with politics and economics (see:
| https://emergingtrajectories.com/) and it generally works well.
| I think with a ton more engineering, curating of knowledge
| bases, etc., one can get these LLMs to actually find some new
| "nuggets" of information.
|
| Admittedly, it's very hard, but I think there's something there.
| julienchastang wrote:
| > Grant applications.
|
| Inspired by a Wharton Business School study [0], I went down
| this road recently: I "primed" ChatGPT4 with an RFP (Request for
| Proposal) from a US granting agency and publicly available
| documents about the organization I work for. The ideas it
| generated made sense, but were unfortunately way too generic to
| be useful. I am open to the idea that, through better prompting,
| LLMs could be helpful here. As a first attempt in this arena,
| however, my initial results were disappointing.
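|
| A minimal sketch of this kind of priming, assuming hypothetical
| file names and a prompt-assembly helper (not the actual script):
|
|     from pathlib import Path
|
|     def build_prompt(rfp_path, doc_paths, ask):
|         # Pack the RFP and organizational documents into the
|         # context before asking the model for proposal ideas.
|         rfp = Path(rfp_path).read_text()
|         docs = "\n\n".join(Path(p).read_text() for p in doc_paths)
|         return (f"Request for Proposal:\n{rfp}\n\n"
|                 f"About our organization:\n{docs}\n\n{ask}")
|
|     prompt = build_prompt(
|         "rfp.txt",
|         ["org_overview.txt", "past_projects.txt"],
|         "Propose three project ideas responsive to this RFP.")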
|
| [0] https://mackinstitute.wharton.upenn.edu/2023/new-working-
| pap...
| julienchastang wrote:
| > It is quite certain that in the near future, a majority of all
| research papers will be written with the help of artificial
| intelligence. I suspect that they will be reviewed with
| artificial intelligence as well. We might soon face a closed loop
| where software writes papers while other software reviews it.
|
| This is fine as long as humans trained in critical thinking
| (i.e., with a liberal arts education) are monitoring every step
| in this loop, ensuring that the scholarly output is of high
| quality. Unfortunately, I am not sanguine about this optimistic
| scenario.
|
| > And this new technology should make mediocre academics even
| less useful, relatively speaking. If artificial intelligence can
| write credible papers and grant applications, what is the worth
| of someone who can barely do these things?
|
| Actually, I think the opposite is true: AI has the potential to
| level the playing field and increase the productivity of less
| productive employees.
|
| > Unsurprisingly, software and artificial intelligence can help
| academics, and maybe replace them in some cases.
|
| I don't think so. Instead, the individual components of academic
| workflows can potentially be accelerated by AI.
___________________________________________________________________
(page generated 2024-04-27 23:01 UTC)