[HN Gopher] The Inference Cost of Search Disruption - Large Language Model Cost Analysis
       ___________________________________________________________________
        
       The Inference Cost of Search Disruption - Large Language Model Cost
       Analysis
        
       Author : lxm
       Score  : 19 points
       Date   : 2023-02-10 18:44 UTC (4 hours ago)
        
 (HTM) web link (www.semianalysis.com)
 (TXT) w3m dump (www.semianalysis.com)
        
       | freediver wrote:
        | A few things to note:
       | 
        | - Google QPS is closer to 100k than 320k [1]
       | 
        | - Not every query has to run through an LLM. Probably only
        | ~10% would benefit from one.
       | 
        | - That means 10,000 LLM queries per second, each needing 5
        | A100s to run, so 50,000 A100s are sufficient. At roughly $10k
        | per card that is $500MM; quadruple it to $2B to cover
        | CPU/RAM/storage/network (see the sketch at the end of this
        | comment). That is peanuts for Google.
       | 
        | - Let's say AI unlocks new, never-before-seen queries ('do
        | more people live in Madrid or Tel Aviv?') and those are 10x
        | the volume. Capex is then $20B, still peanuts.
       | 
        | - Latency, not cost, is the bigger issue. H100 and newer
        | chips should address it soon.
       | 
        | - The main issue for the consumer remains the business model:
        | who will pay for LLMs (the user or the advertiser), and will
        | Google and Microsoft stuff ads into LLM responses?
       | 
       | [1] https://www.internetlivestats.com/one-second/#google-band
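        | 
        | A minimal sketch of the arithmetic above, in Python. The unit
        | price (~$10k per A100) and the 4x system multiplier are
        | assumptions implied by the figures in this comment, not
        | quoted numbers:
        | 
        |     qps_total = 100_000     # total Google QPS, per [1]
        |     llm_share = 0.10        # assumed share of queries worth an LLM
        |     gpus_per_query = 5      # A100s per query/sec, as estimated above
        |     a100_price = 10_000     # assumed ~$10k per A100
        |     system_mult = 4         # CPU/RAM/storage/network on top of GPUs
        | 
        |     llm_qps = qps_total * llm_share        # 10,000 queries/sec
        |     gpus = llm_qps * gpus_per_query        # 50,000 A100s
        |     gpu_capex = gpus * a100_price          # $500MM
        |     total_capex = gpu_capex * system_mult  # $2B
        |     ten_x_capex = total_capex * 10         # $20B if volume grows 10x
        | 
        |     print(f"A100s: {gpus:,.0f}")
        |     print(f"Capex: ${total_capex / 1e9:.1f}B")
        |     print(f"10x scenario: ${ten_x_capex / 1e9:.1f}B")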
        
       ___________________________________________________________________
       (page generated 2023-02-10 23:00 UTC)