[HN Gopher] Rule-based NLP system beats LLM for analysis of psyc...
       ___________________________________________________________________
        
       Rule-based NLP system beats LLM for analysis of psychiatric
       clinical notes
        
       Author : PaulHoule
       Score  : 83 points
       Date   : 2024-04-04 18:47 UTC (4 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | the_decider wrote:
       | This work is based on a fine-tuned Google PaLM model from 2022.
       | I'm not sure if this is a fair comparison to the latest
       | groundbreaking series of LLMs.
        
         | jxy wrote:
         | FLAN-T5-XL, a 3B model, to be precise.
        
         | smartmic wrote:
         | Given that LLMs broke through into public awareness about 1.5
         | years ago and that the resources for their further development
         | have grown explosively since, it is indeed questionable to
         | publish this now, in March '24. A review based on the latest,
         | most powerful LLMs is urgently needed.
        
       | spdustin wrote:
       | I would've expected so. Rule-based systems in NLP can express
       | parts of speech, coreferences, modifiers...it's almost too easy
       | to do something like "Extract all subject-verb-object clauses
       | where the object is connected to the subject by, say, a
       | possessive pronoun."
        
         | lgessler wrote:
         | So you're thinking of expressions like "Winston said he had
         | noticed a continued decrease in his suicidal ideation" (real
         | example). I don't know if I'd agree that it's clear a rule-
         | based system would be able to catch everything like this.
         | Consider:
         | 
         | 1a. (original) Winston said he had noticed a continued decrease
         | in his suicidal ideation.
         | 
         | 1b. (simplified) Winston noticed a decrease in his suicidal
         | ideation.
         | 
         | 2. Winston's suicidal ideation decreased.
         | 
         | 3. There was a decrease in the suicidal ideation Winston felt.
         | 
         | 4. Suicidal ideation decreased in the patient.
         | 
         | Many more syntactic repackagings of similar propositional
         | content are possible, too. For perfect recall, rule-based
         | systems need to grapple with this. In a real setting you're
         | also probably getting predicted (rather than gold-standard)
         | POS tags, relations, etc., which adds further noise.
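         | 
         | A minimal sketch of this recall problem, assuming a hypothetical
         | regex rule (POSSESSIVE_RULE is made up for illustration, not
         | taken from the paper or any real system) that looks for a
         | possessive right before "suicidal ideation". Run over the
         | variants above, it matches 1a, 1b and 2 but misses 3 and 4:

```python
import re

# Hypothetical toy rule: a possessive ("his", "her", "their", or "X's")
# directly followed by the phrase "suicidal ideation".
POSSESSIVE_RULE = re.compile(
    r"\b(?:his|her|their|\w+'s)\s+suicidal ideation\b",
    re.IGNORECASE,
)

# The paraphrases listed in the comment, keyed by their labels.
SENTENCES = {
    "1a": "Winston said he had noticed a continued decrease in his suicidal ideation.",
    "1b": "Winston noticed a decrease in his suicidal ideation.",
    "2": "Winston's suicidal ideation decreased.",
    "3": "There was a decrease in the suicidal ideation Winston felt.",
    "4": "Suicidal ideation decreased in the patient.",
}


def matched_labels(sentences):
    """Return the labels of the sentences the toy rule fires on."""
    return [label for label, text in sentences.items()
            if POSSESSIVE_RULE.search(text)]


if __name__ == "__main__":
    print(matched_labels(SENTENCES))
```

         | A real rule-based system would presumably work over parses
         | rather than raw strings; the sketch only illustrates that each
         | repackaging tends to need its own rule.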
         | 
         | My expectation (as someone who's mildly bearish on LLMs, btw)
         | would be that if an LLM appears to be doing worse than a rule-
         | based system, then you probably haven't tried some low-hanging
         | fruit yet, such as adjusting prompting strategies, fine-tuning,
         | using different pretraining data, etc.
        
           | og_kalu wrote:
           | They used Flan-T5-XL, a 3B model from 2022. Very weird
           | choice.
        
       | EncomLab wrote:
       | Any software designed specifically for a task should outperform
       | a generalist approach on that task. The primary difference is in
       | the level of brittleness between the two - I mean AlphaGo beat
       | Lee Sedol at Go, but Lee has a much more advanced self-driving
       | ability.
        
       | paulvnickerson wrote:
       | I think the more appropriate comparison would be between a rule-
       | based system and an LLM specifically built on clinical data, e.g.
       | GatorTron
       | (https://arxiv.org/ftp/arxiv/papers/2203/2203.03540.pdf)
        
       | languagehacker wrote:
       | My computational linguistics professor in grad school (who went
       | on to do NLP research at Google) always said, "Do the dumb thing
       | first!"
        
         | lxgr wrote:
         | You have to admit that expecting general LLMs to automatically
         | be domain experts everywhere is kind of a dumb thing, though!
        
       | og_kalu wrote:
       | Of all the models to perform such a comparison with, Flan-T5-XL,
       | a 3B-parameter model from 2022, is a genuinely baffling choice.
       | The paper is nearly worthless for this reason alone.
        
       ___________________________________________________________________
       (page generated 2024-04-04 23:01 UTC)