[HN Gopher] Benchmarking LLMs against human expert-curated biome...
___________________________________________________________________
Benchmarking LLMs against human expert-curated biomedical knowledge
graphs
Author : Al0neStar
Score : 35 points
Date : 2024-03-30 16:51 UTC (6 hours ago)
(HTM) web link (www.sciencedirect.com)
(TXT) w3m dump (www.sciencedirect.com)
| CraftingLinks wrote:
| Academic writing 101: The abstract is NOT meant to be written as
| a cliff-hanger!
| serialdev wrote:
| You will not believe what it is all you need!
| jmugan wrote:
| I didn't see UMLS in the paper, but I've tried some of its
| human-curated biomedical knowledge graphs, and they were too
| full of errors to be usable. I imagine different ones have
| different levels of accuracy.
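|
| One way to put a number on that is to sample triples and
| spot-check them by hand. A minimal sketch in Python, assuming
| the graph is exported as a (subject, predicate, object) TSV;
| the file name and column layout are made up:
|
|     # Minimal sketch: draw a random sample of triples for
|     # manual review to estimate a knowledge graph's error
|     # rate. "kg_triples.tsv" and its 3-column TSV layout are
|     # assumptions, not anything from the paper.
|     import random
|
|     with open("kg_triples.tsv") as f:
|         triples = [line.rstrip("\n").split("\t") for line in f]
|
|     random.seed(0)  # reproducible sample
|     for s, p, o in random.sample(triples, k=min(100, len(triples))):
|         print(f"{s} --{p}--> {o}")  # judge each triple by hand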
| nyrikki wrote:
| Since the abstract is a cliffhanger, here is a passage from
| the discussion section that may help.
|
| > In our case, the manual curation of a proportion of triples
| revealed that Sherpa was able to extract more triples categorized
| as correct or partially correct. However, when compared to the
| manually curated gold standard, the performance of all automated
| tools remains subpar.
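|
| For a concrete sense of that comparison: triple-level
| precision/recall against the gold standard. A minimal sketch;
| the file names and TSV triple format are assumptions, not the
| paper's actual setup:
|
|     # Minimal sketch: score extracted triples against a
|     # manually curated gold standard by exact set overlap.
|     # File names and the (subject, predicate, object) TSV
|     # format are assumptions.
|     def load_triples(path):
|         with open(path) as f:
|             return {tuple(l.rstrip("\n").split("\t")) for l in f}
|
|     gold = load_triples("gold_standard.tsv")
|     pred = load_triples("extracted.tsv")
|
|     tp = len(gold & pred)
|     precision = tp / len(pred) if pred else 0.0
|     recall = tp / len(gold) if gold else 0.0
|     f1 = (2 * precision * recall / (precision + recall)
|           if precision + recall else 0.0)
|     print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")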
| egberts1 wrote:
| I was right; LLMs need two major components added before we
| can swan-dive into the humanistic aspects of
| medicine/psychology/politics using a form of LLM:
|
| 1) a weighting of each statement by its probability of
| correctness, and
|
| 2) a citation for each source.
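|
| As a data-structure sketch, both components can ride on the
| triple itself. Everything below (field names, example values)
| is hypothetical, not from the paper:
|
|     # Minimal sketch: a triple carrying both components above,
|     # a correctness probability and a source citation. All
|     # names and values here are hypothetical placeholders.
|     from dataclasses import dataclass
|
|     @dataclass(frozen=True)
|     class WeightedTriple:
|         subject: str
|         predicate: str
|         obj: str
|         p_correct: float  # component 1: probability of correctness
|         citation: str     # component 2: source of the statement
|
|     t = WeightedTriple("metformin", "treats", "type 2 diabetes",
|                        p_correct=0.92, citation="PMID:00000000")
|     print(t)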
___________________________________________________________________
(page generated 2024-03-30 23:01 UTC)