[HN Gopher] Detecting hallucinations in large language models using
       semantic entropy
       ___________________________________________________________________
        
       Detecting hallucinations in large language models using semantic
       entropy
        
       Author : Tomte
       Score  : 41 points
       Date   : 2024-06-23 18:32 UTC (4 hours ago)
        
 (HTM) web link (www.nature.com)
 (TXT) w3m dump (www.nature.com)
        
       | MikeGale wrote:
       | One formulation is that these are hallucinations. Another is that
       | these systems are "orthogonal to truth". They have nothing to do
       | with truth or falsity.
       | 
       | One expression of that idea is in this paper:
       | https://link.springer.com/article/10.1007/s10676-024-09775-5
        
         | soist wrote:
         | It's like asking if a probability distribution is truthful or a
         | liar. It's a category error to speak about algorithms as if
         | they had personal characteristics.
        
         | kreeben wrote:
          | Your linked paper suffers from the same anthropomorphisation
          | as do all papers that use the word "hallucination".
        
       | more_corn wrote:
        | This is huge, though not a hundred percent there.
        
       | jostmey wrote:
        | So, I can understand how their semantic entropy (which seems to
        | require an LLM trained to detect semantic equivalence) might be
        | better at catching hallucinations. However, I don't see how
        | semantic equivalence directly tackles the problem of
        | hallucinations. Currently, I naively suspect it is just a
        | heuristic for catching them. Furthermore, requiring a second
        | LLM trained to detect semantic equivalence in order to catch
        | these events seems like an unnecessary pipeline. If I had a
        | dataset of semantic equivalence to train a second LLM, I would
        | incorporate it directly into the training process of my primary
        | LLM, which, to me, seems like the way things are done in deep
        | learning.
        
       ___________________________________________________________________
       (page generated 2024-06-23 23:00 UTC)