[HN Gopher] Benchmarking LLMs against human expert-curated biome...
___________________________________________________________________
Benchmarking LLMs against human expert-curated biomedical knowledge
graphs
Author : Al0neStar
Score : 35 points
Date : 2024-03-30 16:51 UTC (6 hours ago)
(HTM) web link (www.sciencedirect.com)
(TXT) w3m dump (www.sciencedirect.com)
| CraftingLinks wrote:
| Academic writing 101: The abstract is NOT meant to be written as
| a cliff-hanger!
| serialdev wrote:
| You will not believe what it is all you need!
| jmugan wrote:
| I didn't see UMLS in the paper, but I've tried some of its
| human-curated biomedical knowledge graphs, and they were too
| full of errors to be usable. I imagine different ones have
| different levels of accuracy.
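|
| One way to put a number on that is to sample triples and
| spot-check them by hand. A minimal sketch in Python, assuming
| the graph is exported as a (subject, predicate, object) TSV;
| the file name and column layout are made up:
|
|     # Minimal sketch: draw a random sample of triples for
|     # manual review to estimate a knowledge graph's error
|     # rate. "kg_triples.tsv" and its 3-column TSV layout are
|     # assumptions, not anything from the paper.
|     import random
|
|     with open("kg_triples.tsv") as f:
|         triples = [line.rstrip("\n").split("\t") for line in f]
|
|     random.seed(0)  # reproducible sample
|     for s, p, o in random.sample(triples, k=min(100, len(triples))):
|         print(f"{s} --{p}--> {o}")  # judge each triple by hand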
| nyrikki wrote:
| Since the abstract is a cliffhanger, here is a passage from
| the discussion section that may help.
|
| > In our case, the manual curation of a proportion of triples
| revealed that Sherpa was able to extract more triples categorized
| as correct or partially correct. However, when compared to the
| manually curated gold standard, the performance of all automated
| tools remains subpar.
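|
| For a concrete sense of that comparison: triple-level
| precision/recall against the gold standard. A minimal sketch;
| the file names and TSV triple format are assumptions, not the
| paper's actual setup:
|
|     # Minimal sketch: score extracted triples against a
|     # manually curated gold standard by exact set overlap.
|     # File names and the (subject, predicate, object) TSV
|     # format are assumptions.
|     def load_triples(path):
|         with open(path) as f:
|             return {tuple(l.rstrip("\n").split("\t")) for l in f}
|
|     gold = load_triples("gold_standard.tsv")
|     pred = load_triples("extracted.tsv")
|
|     tp = len(gold & pred)
|     precision = tp / len(pred) if pred else 0.0
|     recall = tp / len(gold) if gold else 0.0
|     f1 = (2 * precision * recall / (precision + recall)
|           if precision + recall else 0.0)
|     print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")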
| egberts1 wrote:
| I was right; LLMs need two major components added before we
| can swan-dive into the humanistic aspects of
| medicine/psychology/politics using a form of LLM:
|
| 1) a weighting of each statement by its probability of
| correctness, and
|
| 2) a citation for each source.
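|
| As a data-structure sketch, both components can ride on the
| triple itself. Everything below (field names, example values)
| is hypothetical, not from the paper:
|
|     # Minimal sketch: a triple carrying both components above,
|     # a correctness probability and a source citation. All
|     # names and values here are hypothetical placeholders.
|     from dataclasses import dataclass
|
|     @dataclass(frozen=True)
|     class WeightedTriple:
|         subject: str
|         predicate: str
|         obj: str
|         p_correct: float  # component 1: probability of correctness
|         citation: str     # component 2: source of the statement
|
|     t = WeightedTriple("metformin", "treats", "type 2 diabetes",
|                        p_correct=0.92, citation="PMID:00000000")
|     print(t)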
___________________________________________________________________
(page generated 2024-03-30 23:01 UTC)