[HN Gopher] Rule-based NLP system beats LLM for analysis of psyc...
___________________________________________________________________
Rule-based NLP system beats LLM for analysis of psychiatric
clinical notes
Author : PaulHoule
Score : 83 points
Date : 2024-04-04 18:47 UTC (4 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| the_decider wrote:
| This work is based on a fine-tuned Google PaLM model from 2022.
| I'm not sure this is a fair comparison to the latest,
| groundbreaking generation of LLMs.
| jxy wrote:
| FLAN-T5-XL, a 3B model, to be precise.
| smartmic wrote:
| Given that LLMs have had their breakthrough in public perception
| for 1.5 years now, and that the resources for their further
| development have grown explosively, it is indeed questionable to
| publish this in March '24. An evaluation based on the latest,
| most powerful LLMs is urgently needed.
| spdustin wrote:
| I would've expected so. Rule-based systems in NLP can express
| parts of speech, coreferences, modifiers... it's almost too easy
| to do something like "Extract all subject-verb-object clauses
| where the object is connected to the subject by, say, a
| possessive pronoun."
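|
| A minimal sketch of that kind of rule, assuming spaCy's
| dependency labels (nsubj, dobj) and Penn tags (PRP$) from its
| small English pipeline; the helper name is hypothetical, not
| anything from the paper:
|
|     import spacy
|
|     nlp = spacy.load("en_core_web_sm")
|
|     def svo_with_possessive(text):
|         # Yield (subject, verb, object) triples where the
|         # object's subtree contains a possessive pronoun.
|         doc = nlp(text)
|         for token in doc:
|             if token.pos_ != "VERB":
|                 continue
|             subjects = [c for c in token.children
|                         if c.dep_ == "nsubj"]
|             objects = [c for c in token.children
|                        if c.dep_ in ("dobj", "obj")]
|             for subj in subjects:
|                 for obj in objects:
|                     # possessive pronoun under the object?
|                     if any(t.tag_ == "PRP$"
|                            for t in obj.subtree):
|                         yield (subj.text, token.lemma_,
|                                obj.text)
|
|     # Expected on a typical parse:
|     #   ('Winston', 'notice', 'decrease')
|     for triple in svo_with_possessive(
|             "Winston noticed a decrease in his suicidal ideation."
|     ):
|         print(triple)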
| lgessler wrote:
| So you're thinking of expressions like "Winston said he had
| noticed a continued decrease in his suicidal ideation" (real
| example). I don't know if I'd agree that it's clear a rule-
| based system would be able to catch everything like this.
| Consider:
|
| 1a. (original) Winston said he had noticed a continued decrease
| in his suicidal ideation.
|
| 1b. (simplified) Winston noticed a decrease in his suicidal
| ideation.
|
| 2. Winston's suicidal ideation decreased.
|
| 3. There was a decrease in the suicidal ideation Winston felt.
|
| 4. Suicidal ideation decreased in the patient.
|
| Many more syntactic repackagings of similar propositional
| content are possible, too. For perfect recall, a rule-based
| system needs to grapple with all of these, and in a real setting
| you're probably working with predicted (rather than gold-
| standard) POS tags, relations, etc., which adds further noise
| (see the sketch below).
|
| My expectation (as someone who's mildly bearish on LLMs, btw)
| would be that if an LLM appears to be doing worse than a rule-
| based system, then you probably haven't tried some low-hanging
| fruit yet, such as adjusting prompting strategies, fine-tuning,
| or using different pretraining data.
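|
| To make the recall point concrete, here is a rough,
| self-contained sketch, assuming spaCy's small English pipeline
| and typical parses (nothing here is from the paper): a single
| rule keyed on the verb "decrease" fires on (2) and (4) but
| misses the variants where "decrease" surfaces as a noun.
|
|     import spacy
|
|     nlp = spacy.load("en_core_web_sm")
|
|     VARIANTS = [
|         # 1b
|         "Winston noticed a decrease in his suicidal ideation.",
|         # 2
|         "Winston's suicidal ideation decreased.",
|         # 3
|         "There was a decrease in the suicidal ideation"
|         " Winston felt.",
|         # 4
|         "Suicidal ideation decreased in the patient.",
|     ]
|
|     def naive_rule(doc):
|         # Fire if a verbal "decrease" has "ideation" somewhere
|         # in its (active or passive) subject subtree.
|         for tok in doc:
|             if tok.lemma_ != "decrease" or tok.pos_ != "VERB":
|                 continue
|             for child in tok.children:
|                 if child.dep_ not in ("nsubj", "nsubjpass"):
|                     continue
|                 if any(t.lemma_ == "ideation"
|                        for t in child.subtree):
|                     return True
|         return False
|
|     for sent in VARIANTS:
|         print(naive_rule(nlp(sent)), sent)
|
| Covering the nominal "decrease in ..." rewrites takes additional
| patterns, and every added pattern is one more thing that can
| break on predicted parses.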
| og_kalu wrote:
| They used Flan-T5-XL, a 3B model from 2022. Very weird choice.
| EncomLab wrote:
| Purpose-built software should outperform a generalist approach
| on the problem it was designed for. The primary difference is
| the level of brittleness between the two - I mean, AlphaGo beat
| Lee Sedol at Go, but Lee has a much more advanced self-driving
| ability.
| paulvnickerson wrote:
| I think the more appropriate comparison would be between a rule-
| based system and an LLM specifically trained on clinical data,
| e.g. GatorTron
| (https://arxiv.org/ftp/arxiv/papers/2203/2203.03540.pdf).
| languagehacker wrote:
| My computational linguistics professor in grad school (who went
| on to do NLP research at Google) always said, "Do the dumb thing
| first!"
| lxgr wrote:
| You have to admit that expecting general LLMs to automatically
| be domain experts everywhere is kind of a dumb thing, though!
| og_kalu wrote:
| Of all the models to use for such a comparison, Flan-T5-XL, a
| 3B-parameter model from 2022, is a genuinely baffling choice.
| The paper is nearly worthless for this reason alone.
___________________________________________________________________
(page generated 2024-04-04 23:01 UTC)