[HN Gopher] The Inference Cost of Search Disruption - Large Language Model Cost Analysis
___________________________________________________________________
The Inference Cost of Search Disruption - Large Language Model Cost
Analysis
Author : lxm
Score : 19 points
Date : 2023-02-10 18:44 UTC (4 hours ago)
(HTM) web link (www.semianalysis.com)
(TXT) w3m dump (www.semianalysis.com)
| freediver wrote:
| A few things to note:
|
| - Google qps is closer to 100k than 320k [1]
|
| - Not every query has to run on an LLM. Probably only 10% would
| benefit from it
|
| - This means 10,000 queries per second, each needing 5 A100s to
| run, so 50,000 A100s are sufficient. At roughly $10k per A100
| that is $500MM; quadruple it to $2B once you add
| CPU/RAM/storage/network. That is peanuts for Google (see the
| sketch below).
|
| - Let's say AI unlocks new, never-before-seen queries ('do more
| people live in Madrid or Tel Aviv') and those are 10x in volume.
| So capex is now $20B, still peanuts.
|
| - Latency, not cost, is the bigger issue. This should be
| addressed soon by the H100 and newer chips.
|
| - The main issue for the consumer remains the business model. Who
| will pay for LLMs (the user or the advertiser), and will Google
| and Microsoft stuff ads into LLM responses?
|
| [1] https://www.internetlivestats.com/one-second/#google-band
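A minimal sketch of the capex arithmetic above, in Python. The qps
estimate, 10% LLM fraction, 5-A100s-per-query figure, and 4x system
overhead all come from the comment; the ~$10k A100 unit price is an
assumption implied by the comment's $500MM total for 50,000 GPUs.

    # Back-of-the-envelope capex for LLM-augmented search.
    # All inputs are the comment's assumptions, not measured data.
    GOOGLE_QPS = 100_000      # total Google queries/second (estimate)
    LLM_FRACTION = 0.10       # share of queries that benefit from an LLM
    A100_PER_QUERY = 5        # A100s tied up serving one concurrent query
    A100_PRICE = 10_000       # USD/GPU, implied by $500MM / 50,000 A100s
    SYSTEM_OVERHEAD = 4       # CPU/RAM/storage/network multiplier

    llm_qps = GOOGLE_QPS * LLM_FRACTION        # 10,000 queries/second
    gpus = llm_qps * A100_PER_QUERY            # 50,000 A100s
    gpu_capex = gpus * A100_PRICE              # $500MM
    total_capex = gpu_capex * SYSTEM_OVERHEAD  # $2B
    ai_query_capex = total_capex * 10          # $20B if new queries are 10x

    print(f"A100s needed: {gpus:,.0f}")
    print(f"GPU capex:    ${gpu_capex / 1e9:.1f}B")
    print(f"Total capex:  ${total_capex / 1e9:.0f}B")
    print(f"With 10x new AI queries: ${ai_query_capex / 1e9:.0f}B")

Note the implicit throughput assumption: each group of 5 A100s serves
one query per second, i.e. roughly one second of compute per query.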
___________________________________________________________________
(page generated 2023-02-10 23:00 UTC)