[HN Gopher] Taming randomness in ML models with hypothesis testi...
       ___________________________________________________________________
        
       Taming randomness in ML models with hypothesis testing and marimo
        
       Author : mscolnick
       Score  : 47 points
       Date   : 2024-10-17 15:57 UTC (2 days ago)
        
 (HTM) web link (blog.mozilla.ai)
        
       | amarcheschi wrote:
       | Luckily, I passed my statistics exam this summer. Still, it's
       | cool to visualize what's happening.
        
       | axpy906 wrote:
       | Good post. I've been thinking about offline testing of LLM tasks
       | lately and have concluded that old-school testing is the best
       | option until more mature features are developed. Specifically, I
       | mean running a power analysis to determine the sample size,
       | drawing a random sample of that size, and then running a test
       | such as a z-test to see whether there is a difference and within
       | what bounds. Tests are expensive, and I wish there were a better
       | way to do realizable offline evals.
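
The workflow described in the comment above (power analysis, random sampling, z-test with bounds) could look roughly like the sketch below. It is only an illustration under stated assumptions, not anything the commenter posted: it assumes each eval item is scored pass/fail, that two model variants are compared on independent samples, and that a two-proportion z-test is an acceptable choice. The function names, the pass rates (0.70 vs 0.80), and the pool of 10,000 prompts are all hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.8):
    """Per-group sample size for a two-sided two-proportion z-test.

    p1 and p2 are the pass rates we assume (hypothetically) for the
    two variants; the smaller their gap, the more samples are needed.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


def two_proportion_z_test(successes_a, n_a, successes_b, n_b, alpha=0.05):
    """Return (z, two-sided p-value, confidence interval for p_a - p_b)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # The test statistic uses the pooled rate under the null hypothesis.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se_pooled == 0:
        raise ValueError("pass rates are all 0 or all 1; z is undefined")
    z = (p_a - p_b) / se_pooled
    p_value = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
    # The interval ("between what bounds") uses the unpooled standard error.
    se_unpooled = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    margin = _STD_NORMAL.inv_cdf(1 - alpha / 2) * se_unpooled
    diff = p_a - p_b
    return z, p_value, (diff - margin, diff + margin)


# 1. Power analysis: how many eval items per variant?
n = sample_size_two_proportions(0.70, 0.80)

# 2. Random sampling from a hypothetical pool of 10,000 prompt ids.
rng = random.Random(0)  # seeded so the sample is reproducible
eval_ids = rng.sample(range(10_000), n)

# 3. After scoring both variants on eval_ids, compare pass counts.
#    The counts here are made up purely to show the call.
z, p_value, (low, high) = two_proportion_z_test(210, 300, 240, 300)
print(f"n per group: {n}, z = {z:.2f}, p = {p_value:.4f}")
print(f"95% CI for the difference: [{low:.3f}, {high:.3f}]")
```

Only the standard library is used here; a statistics package would likely be a better fit for real work. If both variants are scored on the same items, the samples are paired and a paired test would be more appropriate than this independent-samples sketch.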
        
         | rhdunn wrote:
         | Have you seen LLM testing tools like promptfoo?
        
           | axpy906 wrote:
           | Yes, I have seen it and BrainTrust too. Unfortunately, I need
           | something FOSS.
        
       ___________________________________________________________________
       (page generated 2024-10-19 23:01 UTC)