[HN Gopher] Groqchat
___________________________________________________________________
Groqchat
Author : izzymiller
Score : 24 points
Date : 2023-12-22 22:03 UTC (57 minutes ago)
(HTM) web link (chat.groq.com)
(TXT) w3m dump (chat.groq.com)
| badFEengineer wrote:
 | This was surprisingly fast, 276.27 T/s (although Llama 2 70B is
 | noticeably worse than GPT-4 turbo). I'm actually curious whether
 | there are good benchmarks for inference tokens per second - I
 | imagine it's a bit different for throughput vs. single-inference
 | optimization, but I'm curious if there's an analysis somewhere
 | on this.
 |
 | edit: I re-ran the same prompt on perplexity llama-2-70b and
 | got 59 tokens per sec there.
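 | A minimal sketch of the single-stream measurement being compared
 | here: drain a token stream and divide the token count by elapsed
 | wall time. The fake_stream generator below is a hypothetical
 | stand-in, not the Groq or Perplexity API; a real measurement
 | would iterate over an actual streaming response instead.

```python
import time


def tokens_per_second(stream, clock=time.perf_counter):
    """Drain a token stream, returning observed tokens per second."""
    start = clock()
    count = 0
    for _ in stream:
        count += 1
    elapsed = clock() - start
    return count / elapsed if elapsed > 0 else float("inf")


def fake_stream(n_tokens=100, delay=0.001):
    """Hypothetical stand-in for a streaming LLM response."""
    for _ in range(n_tokens):
        time.sleep(delay)  # simulate per-token generation latency
        yield "tok"


rate = tokens_per_second(fake_stream(n_tokens=50))
print(f"{rate:.1f} tokens/sec")
```

 | Note that this conflates time-to-first-token with per-token
 | latency; throughput benchmarks for batched serving would measure
 | total tokens across concurrent requests instead.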
___________________________________________________________________
(page generated 2023-12-22 23:00 UTC)