[HN Gopher] Claude 3 Haiku: our fastest model yet
___________________________________________________________________
Claude 3 Haiku: our fastest model yet
Author : minimaxir
Score : 31 points
Date : 2024-03-13 20:59 UTC (2 hours ago)
(HTM) web link (www.anthropic.com)
(TXT) w3m dump (www.anthropic.com)
| minimaxir wrote:
| This is a new announcement, following the previous Claude 3
| announcement last week:
| https://news.ycombinator.com/item?id=39590666
|
| Specifically, the smallest (and likely most popular) model is
| now available when it wasn't then. (The model ID is
| claude-3-haiku-20240307.) Notably, this is also a cheap model
| that supports image input, but per the documentation you can
| only provide 20 images at a time, which won't work for video
| inputs.
|
| Testing around image inputs in the web Workbench, it's
| surprisingly good for the price.
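As a rough sketch of what a multi-image request looks like: the content-block shape below follows Anthropic's documented Messages API, but `image_block` and `build_request` are illustrative helper names, and this only constructs the payload (it does not send anything). The 20-image cap mentioned above is enforced up front.

```python
import base64


def image_block(data: bytes, media_type: str = "image/jpeg") -> dict:
    """Build one image content block (base64 source) for the Messages API."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    }


def build_request(images: list[bytes], prompt: str) -> dict:
    """Assemble a single user message with N images followed by a text prompt."""
    if len(images) > 20:
        # Per the documentation, at most 20 images per request.
        raise ValueError("Claude accepts at most 20 images per request")
    content = [image_block(img) for img in images] + [
        {"type": "text", "text": prompt}
    ]
    return {
        "model": "claude-3-haiku-20240307",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": content}],
    }
```

A video would have to be downsampled to at most 20 frames to fit this limit, which is why it won't work for real video inputs.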
| simonw wrote:
| Pricing: $0.25/million tokens of input, $1.25/million of output
|
| GPT-3.5 Turbo is $0.50/$1.50
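A quick back-of-envelope comparison using the per-million-token figures above (prices as quoted in the thread; the model keys are just labels for this sketch):

```python
# USD per million tokens: (input, output), figures quoted in the thread
PRICES = {
    "claude-3-haiku": (0.25, 1.25),
    "gpt-3.5-turbo": (0.50, 1.50),
}


def cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Cost in USD for one call with the given token counts."""
    p_in, p_out = PRICES[model]
    return (tokens_in * p_in + tokens_out * p_out) / 1_000_000


# e.g. a call with a 10k-token prompt and a 1k-token completion
print(cost("claude-3-haiku", 10_000, 1_000))  # 0.00375
print(cost("gpt-3.5-turbo", 10_000, 1_000))   # 0.0065
```

At these rates Haiku is a bit under half the price of GPT-3.5 Turbo for a typical prompt-heavy call.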
|
| I've updated the Claude 3 plugin for my LLM CLI tool to support
| the new model:
| https://github.com/simonw/llm-claude-3/releases/tag/0.3
|
|     pipx install llm
|     llm install llm-claude-3
|     llm keys set claude
|     # Paste Anthropic API key here
|     llm -m claude-3-haiku 'Fun facts about armadillos'
|
| It's pretty fast! Animated GIF here:
| https://github.com/simonw/llm-claude-3/issues/3#issuecomment...
| BoorishBears wrote:
| Seeing it be pretty slow in production with long prompts:
|
| - 10-15 seconds for 400 tokens out, and 4,000-10,000 tokens in.
|
| - 6-8 seconds when using Claude Instant for the same prompts
|
| Hoping it's just a rush at launch.
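Converting the figures above into effective output throughput makes the gap concrete (a trivial calculation, using only the numbers quoted in this comment):

```python
def tokens_per_second(tokens_out: int, seconds: float) -> float:
    """Effective output throughput for one completed request."""
    return tokens_out / seconds


# 400 tokens out in 10-15 s on Haiku at launch:
print(round(tokens_per_second(400, 15), 1))  # 26.7 tok/s worst case
print(round(tokens_per_second(400, 10), 1))  # 40.0 tok/s best case
# The same prompts in 6-8 s on Claude Instant:
print(round(tokens_per_second(400, 8), 1))   # 50.0 tok/s worst case
```

So the launch-day numbers put Haiku at roughly 27-40 tok/s versus 50+ tok/s for Claude Instant on the same workload.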
| porphyra wrote:
| For comparison, Groq [1] has (price per million tokens of input
| vs output):
|
|     Llama 2 70B       (4096 Context Length)  ~300 tokens/s  $0.70/$0.80
|     Llama 2 7B        (2048 Context Length)  ~750 tokens/s  $0.10/$0.10
|     Mixtral 8x7B SMoE (32K Context Length)   ~480 tokens/s  $0.27/$0.27
|     Gemma 7B          (8K Context Length)    ~820 tokens/s  $0.10/$0.10
|
| [1] https://wow.groq.com/
| BoorishBears wrote:
| And zero capacity. Groq is coming across as a total paper
| tiger: no billing, unusable rate limits, and most importantly,
| a request queue that makes it dramatically slower than any
| other option.
|
| They say they're just waiting on implementing billing, but at
| this point it reads more like "we wouldn't be able to meet the
| demand from all your requests".
|
| -
|
| Groq is going through all that to offer 500 tk/s
| theoretically; meanwhile I'm seeing Fireworks.ai come in at
| 300+ tk/s in production use.
| gabev wrote:
| This is fantastic! We just switched so many of our old Claude
| Instant calls to Haiku and the results are fantastic.
|
| Zenfetch is now primarily powered by the Claude 3 family of models :O
| https://www.zenfetch.com
| josh-sematic wrote:
| You can play with Haiku for free at
| https://app.airtrain.ai/playground
| ldjkfkdsjnv wrote:
| Is Anthropic moving faster than OpenAI? Or is OpenAI working on
| something so big that they aren't worried about being outpaced?
| Regardless, I feel like I am watching history in real time.
| svdr wrote:
| I guess they must have put some time into Sora?
| a_wild_dandan wrote:
| OpenAI are presently training toward GPT-5, with periodic
| GPT-4.x releases planned. These models take immense training
| resources, red teaming, etc. It'll be a hot minute before you
| see GPT-4.5, etc.
| devnullbrain wrote:
| SMS verification seems to be broken. Nothing reported on the
| status page.
|
| https://status.anthropic.com/
| GaggiX wrote:
| The multilingual capabilities of Claude 3 models are incredible;
| even the smallest model, Haiku, is fluent in Georgian, a
| language that not even GPT-4 can speak without making a huge
| number of mistakes.
| brcmthrowaway wrote:
| What are their training sources for this?
| pedalpete wrote:
| Is there a good reason why they are comparing Claude 3 with
| ChatGPT 3.5 instead of 4? Does anybody really even care about/use
| Gemini?
| bryanlarsen wrote:
| AFAICT Claude 3 Haiku is supposed to be cheap/fast so compares
| with ChatGPT 3.5, and Claude 3 Ultra is supposed to be the best
| so compares with ChatGPT 4.
| pants2 wrote:
| You mean Claude 3 Opus, but yes.
___________________________________________________________________
(page generated 2024-03-13 23:01 UTC)