[HN Gopher] Taming randomness in ML models with hypothesis testi...
___________________________________________________________________
Taming randomness in ML models with hypothesis testing and marimo
Author : mscolnick
Score : 47 points
Date : 2024-10-17 15:57 UTC (2 days ago)
(HTM) web link (blog.mozilla.ai)
(TXT) w3m dump (blog.mozilla.ai)
| amarcheschi wrote:
| I (luckily) passed my statistics exam this summer. Still, it's
| cool to visualize what's happening.
| axpy906 wrote:
| Good post. I've been thinking about offline testing of LLM tasks
| lately and have concluded that old-school statistical testing is
| the best option until more mature features can be developed.
| Specifically, I mean running a power analysis to determine the
| sample size, drawing a random sample of that size, and then
| running a test such as a z-test to see whether there is a
| difference and within what bounds. Tests are expensive, and I wish
| there were a better way to do realizable offline evals.
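The workflow described in the comment above (power analysis, random sampling, then a z-test with bounds) could be sketched roughly as follows. This is a minimal illustration, not anyone's actual pipeline. It assumes each eval case yields a pass/fail outcome, so the metric is a pass rate, and it uses the standard normal-approximation formulas for comparing two proportions. The function names, the example pass rates (0.70 vs 0.80), and the pool of case IDs are all hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.8):
    """Per-group sample size for a two-sided two-proportion z-test
    (normal approximation). p1 and p2 are the pass rates we want to
    be able to tell apart; both are assumptions made up front."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


def draw_eval_sample(case_ids, n, seed=0):
    """Simple random sample (without replacement) of n eval cases."""
    return random.Random(seed).sample(case_ids, n)


def two_proportion_z_test(passes_a, n_a, passes_b, n_b, alpha=0.05):
    """Two-sided z-test for a difference in pass rates (B minus A).
    Returns (z, p_value, confidence_interval_for_the_difference)."""
    p_a, p_b = passes_a / n_a, passes_b / n_b
    diff = p_b - p_a
    # Pooled standard error for the test statistic under H0: p_a == p_b.
    pooled = (passes_a + passes_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    # Unpooled standard error for the confidence interval.
    se_unpooled = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    margin = STD_NORMAL.inv_cdf(1 - alpha / 2) * se_unpooled
    return z, p_value, (diff - margin, diff + margin)


if __name__ == "__main__":
    # Hypothetical goal: detect an improvement from 70% to 80% pass rate.
    n = sample_size_two_proportions(0.70, 0.80)
    pool = list(range(5000))  # hypothetical pool of eval case IDs
    sample = draw_eval_sample(pool, n)
    print(f"cases needed per variant: {n}, sampled: {len(sample)}")
    # Hypothetical observed pass counts for variants A and B.
    z, p, ci = two_proportion_z_test(210, 300, 240, 300)
    print(f"z={z:.3f} p={p:.4f} 95% CI for difference={ci}")
```

The cost concern from the comment shows up directly in the sample-size step: halving the difference you want to detect roughly quadruples the number of cases to run per variant, since the required n scales with 1 / (p1 - p2)^2.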
| rhdunn wrote:
| Have you seen LLM testing tools like promptfoo?
| axpy906 wrote:
| Yes, I have seen it, and BrainTrust too. Unfortunately, I need
| FOSS.
___________________________________________________________________
(page generated 2024-10-19 23:01 UTC)