[HN Gopher] Deterministic Quoting: Making LLMs safer for healthcare
___________________________________________________________________
Deterministic Quoting: Making LLMs safer for healthcare
Author : mattyyeung
Score : 69 points
Date : 2024-05-05 10:47 UTC (2 days ago)
(HTM) web link (mattyyeung.github.io)
(TXT) w3m dump (mattyyeung.github.io)
| telotortium wrote:
| We've developed LLM W^X now - time to develop LLM ROP!
| gojomo wrote:
| Interesting analogies for LLMs!
| (https://en.wikipedia.org/wiki/W%5EX &
| https://en.wikipedia.org/wiki/Return-oriented_programming)
| w10-1 wrote:
| I'm not sure determinism alone is sufficient for proper
| attribution.
|
| This presumes "chunks" are the source. But it's not easy to
| identify the propositions that form the source of some knowledge.
| In the best case, you are looking for an association and find it
| in a sentence you've semantically parsed, but that's rarely the
| case, particularly for medical histories.
|
| That said, deterministic accuracy might not matter if you can
| provide enough context, particularly for further exploration. But
| that's not really "chunks".
|
| So it's unclear to me that tracing probability clouds back to
| chunks of text will work better than semantic search.
| nextworddev wrote:
| Did I miss something, or did the article never describe how the
| technique works? (Despite the "How It Works" section.)
| Smaug123 wrote:
| It's explained at considerable length in the section _A
| "Minimalist Implementation" of DQ: a modified RAG Pipeline_.
| Animats wrote:
| It's a search engine, basically?
| robrenaud wrote:
| A good, automatically run, privacy preserving search engine
| that uses electronic medical records might be a valuable
| resource for busy doctors.
| simonw wrote:
| Building better search tools is one of the most directly
| interesting applications of LLMs in my opinion.
| tylersmith wrote:
| Yes, and Dropbox is an rsync server.
| itishappy wrote:
| What happens if it hallucinates the <title>?
| resource_waste wrote:
| Same thing as when a human hallucinates.
|
| Except with LLMs, you can run like 10 different models. With a
| human, you owe $120 and are taking medicine.
| KaiserPro wrote:
| > With a human, you owe $120 and are taking medicine.
|
| Well there are protocols, procedures and a bunch of checks
| and balances.
|
| The problem with the LLM is that there aren't any; it's you vs
| one-shot retrieval.
| pton_xd wrote:
| Except with a human there's a counter-party with assets or
| insurance who assumes liability for mistakes.
|
| Although presumably if a company is making decisions using an
| LLM, and the LLM makes a mistake, the company would still be
| held liable ... probably.
|
| If there's no "damage" from the mistake then it doesn't
| matter either way.
| simonw wrote:
| You catch it. The hallucinated title will fail to match the
| retrieved text based on the reference ID.
|
| If it hallucinates an incorrect (but valid) reference ID then
| hopefully your users can spot that the quoted text has no
| relevance to their question.
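|
| A minimal sketch of that check (the store layout and function names
| here are my own, not necessarily the article's): whatever quote is
| shown to the user is always looked up from the retrieved chunks by
| the reference ID the model emitted, so an invented ID simply has
| nothing to match against.
|
|     # Hypothetical store of retrieved chunks, keyed by reference ID.
|     retrieved = {
|         "doc-12#3": "Amoxicillin 500 mg three times daily for 7 days.",
|     }
|
|     def render_quote(ref_id: str) -> str:
|         """Return the verbatim stored text for an ID emitted by the LLM."""
|         text = retrieved.get(ref_id)
|         if text is None:
|             # Hallucinated ID: nothing to match against, so fail loudly
|             # instead of displaying a fabricated quote.
|             raise ValueError(f"unknown reference ID {ref_id!r}")
|         return f'"{text}" [{ref_id}]'
|
|     print(render_quote("doc-12#3"))   # verbatim quote, guaranteed
|     # render_quote("doc-99#1")        # raises: invented reference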
| resource_waste wrote:
| I feel like this is the perfect application of running the data
| multiple times.
|
| Imagine having ~10-100 different LLMs, maybe some are medical,
| maybe some are general, some are from a different language. Have
| them all run it, rank the answers.
|
| Now I believe this can be amplified further by having another
| prompt ask to confirm the previous answer. This could get a bit
| insane computationally with 100 original answers, but I believe
| the original paper I read found that by doing this prompt
| processing ~4 times, they got to some 95% accuracy.
|
| So 100 LLMs give an answer, each time we process it 4 times, can
| we beat a 64 year old doctor?
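|
| Roughly, something like this (a sketch only: ask() stands in for
| whatever model APIs are available, and the voting rule is the
| simplest one possible):
|
|     from collections import Counter
|
|     def ask(model: str, prompt: str) -> str:
|         """Stand-in for a call to one model (API details omitted)."""
|         raise NotImplementedError
|
|     def ensemble_answer(models: list[str], question: str,
|                         passes: int = 4) -> str:
|         """Ask every model, let each re-check its own answer a few
|         times, then take the most common final answer."""
|         finals = []
|         for model in models:
|             answer = ask(model, question)
|             for _ in range(passes):
|                 # Confirmation pass: ask the model to confirm or revise.
|                 answer = ask(model, f"Question: {question}\n"
|                                     f"Previous answer: {answer}\n"
|                                     "Confirm or correct this answer.")
|             finals.append(answer)
|         return Counter(finals).most_common(1)[0][0]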
| simonw wrote:
| I like this a lot. I've been telling people for a while that
| asking for direct quotations in LLM output - which you can then
| "fact-check" by confirming them against the source document - is
| a useful trick. But that still depends on people actually doing
| that check, which most people won't do.
|
| I'd thought about experimenting with automatically validating
| that the quoted text does indeed 100% match the original source,
| but should even a tweak to punctuation count as a failure there?
|
| The proposed deterministic quoting mechanism feels like a much
| simpler and more reliable way to achieve the same effect.
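|
| A sketch of that validation, with the punctuation question handled
| by a normalisation step (the rules below are just one possible
| choice, not a recommendation):
|
|     import string
|
|     def normalise(text: str) -> str:
|         """Lower-case, drop punctuation and collapse whitespace so a
|         tweak to punctuation doesn't count as a failed match."""
|         text = text.lower().translate(
|             str.maketrans("", "", string.punctuation))
|         return " ".join(text.split())
|
|     def quote_is_verbatim(quote: str, source: str) -> bool:
|         """True if the normalised quote appears in the normalised source."""
|         return normalise(quote) in normalise(source)
|
|     source = "Metformin 500mg, twice daily; review in 3 months."
|     print(quote_is_verbatim("Metformin 500mg twice daily", source))  # True
|     print(quote_is_verbatim("Metformin 850mg twice daily", source))  # False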
| jonathan-adly wrote:
| I built and sold a company that does this a year ago. It was hard
| 2 years ago, but now pretty standard RAG with a good
| implementation will get you there.
|
| The trick is, healthcare users would complain to no end about
| determinism. But these are "below-the-line" users - aka, folks
| who don't write checks and the AI is better than them. (I am a
| pharmacist by training, and plain vanilla GPT4-turbo is better
| than me).
|
| Don't really worry about them. The folks who are interested in
| and willing to pay for AI have more practical concerns - like
| what the ROI is and what the implementation looks like.
|
| Also - folks should be building Baymax from Big Hero 6 by now
| (the medical capabilities, not the rocket arm stuff). That's the
| next leg up.
| not2b wrote:
| I was thinking that something like this could be useful for
| discovery in legal cases, where a company might give up a
| gigabyte or more of allegedly relevant material in response to
| discovery demands and the opposing side has to plow through it to
| find the good stuff. But then I thought of a countermeasure:
| there could be messages in the discovery material that act as
| instructions to the LLM, telling it what it should _not_ find. We
| can guarantee that any reports generated will contain accurate
| quotes, and even where they are located, so that surrounding
| context can be found. But perhaps, if the attacker controls the
| input data,
| things can be missed. And it could be done in a deniable way:
| email conversations talking about LLMs that also have keywords
| related to the lawsuit.
| burntcaramel wrote:
| Are there existing terms of art for this concept? It's not as if
| a slightly unreliable writer is a new concept - a student writing
| a paper, for example.
|
| For example:
|
| - Authoritative reference:
| https://www.montana.edu/rmaher/ee417/Authoritative%20Referen...
|
| - Authoritative source:
| https://piedmont.libanswers.com/faq/135714
| mattyyeung wrote:
| Author here, thanks for your interest! Surprising way to wake up
| in the morning. Happy to answer questions
___________________________________________________________________
(page generated 2024-05-07 23:01 UTC)