[HN Gopher] Groqchat
___________________________________________________________________
Groqchat
Author : izzymiller
Score : 24 points
Date : 2023-12-22 22:03 UTC (57 minutes ago)
(HTM) web link (chat.groq.com)
(TXT) w3m dump (chat.groq.com)
| badFEengineer wrote:
 | This was surprisingly fast, 276.27 T/s (although Llama 2 70B is
 | noticeably worse than GPT-4 turbo). I'm actually curious whether
 | there are good benchmarks for inference tokens per second - I
 | imagine it's a bit different for throughput vs. single-inference
 | optimization, but I'm curious if there's an analysis somewhere
 | on this.
 |
 | edit: I re-ran the same prompt on perplexity llama-2-70b and
 | got 59 tokens per sec there.
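 | A minimal sketch of the single-stream measurement being compared
 | here: drain a token stream and divide the token count by elapsed
 | wall time. The fake_stream generator below is a hypothetical
 | stand-in, not the Groq or Perplexity API; a real measurement
 | would iterate over an actual streaming response instead.

```python
import time


def tokens_per_second(stream, clock=time.perf_counter):
    """Drain a token stream, returning observed tokens per second."""
    start = clock()
    count = 0
    for _ in stream:
        count += 1
    elapsed = clock() - start
    return count / elapsed if elapsed > 0 else float("inf")


def fake_stream(n_tokens=100, delay=0.001):
    """Hypothetical stand-in for a streaming LLM response."""
    for _ in range(n_tokens):
        time.sleep(delay)  # simulate per-token generation latency
        yield "tok"


rate = tokens_per_second(fake_stream(n_tokens=50))
print(f"{rate:.1f} tokens/sec")
```

 | Note that this conflates time-to-first-token with per-token
 | latency; throughput benchmarks for batched serving would measure
 | total tokens across concurrent requests instead.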
___________________________________________________________________
(page generated 2023-12-22 23:00 UTC)