 Post #Ap2sc16e4sY8CesbGy by williampietri@sfba.social
       0 likes, 0 repeats
       
       We've launched! After months of work, MLCommons has released our v1.0 benchmark that measures LLM (aka "AI") propensity for giving hazardous responses.
       
       Here are the results for 15 common models: https://ailuminate.mlcommons.org/benchmarks/
       
       And here's the overview: https://mlcommons.org/ailuminate/
       
       I was the tech lead for the software and want to give a shout-out to my excellent team of developers and the many experts we worked closely with to make this happen.
       
 Post #Ap2sc2r7ZZJtd5dELw by williampietri@sfba.social
       0 likes, 0 repeats
       
       Ooh, and we got written up in Wired: https://www.wired.com/story/benchmark-for-ai-risks