[HN Gopher] Exploring inference memory saturation effect: H100 v...
       ___________________________________________________________________
        
       Exploring inference memory saturation effect: H100 vs. MI300x
        
       Author : latchkey
       Score  : 49 points
       Date   : 2024-12-05 16:50 UTC (6 hours ago)
        
 (HTM) web link (dstack.ai)
 (TXT) w3m dump (dstack.ai)
        
       | honzafafa wrote:
       | Cool!
        
       | byefruit wrote:
       | Bit weird to show $ per 1M tokens and not include the actual
       | costs of the systems anywhere.
       | 
       | It would be interesting to know the outright prices for those
       | systems as well as their hourly rental rates at the moment.
        
         | latchkey wrote:
         | Hot Aisle: https://hotaisle.xyz/pricing/
         | 
         | Lambda: https://lambdalabs.com/service/gpu-cloud#pricing
        
           | cheptsov wrote:
           | Yes, we should have included the price too--thanks for
           | pointing that out. I forgot about it. Appreciate you adding
           | the links here! And yes, Lambda's and Hot Aisle's prices were
           | used in the calculation.
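
Note on the arithmetic discussed in this sub-thread: a $ per 1M tokens figure can be derived from an hourly rental rate and a measured throughput. The sketch below assumes the cost is simply the hourly rate divided by the tokens generated in an hour. The function name and the example numbers are hypothetical; they are not the article's results or either provider's actual prices.

```python
def cost_per_million_tokens(hourly_rate_usd: float, tokens_per_second: float) -> float:
    """Estimate USD per 1M tokens from a rental rate and a throughput.

    Assumption: the system is billed per hour and sustains the given
    throughput for the whole hour (no idle time, no warm-up).
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_rate_usd / tokens_per_hour * 1_000_000


# Hypothetical inputs: a $10.00/hour node sustaining 1,000 tokens/s.
example_cost = cost_per_million_tokens(10.0, 1000.0)
print(f"${example_cost:.2f} per 1M tokens")
```

With real hourly rates from the pricing pages linked above and throughput numbers from the benchmark, the same calculation would let readers check the per-token costs themselves.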
        
       | fngarrett wrote:
       | Great read. Have you compared performance with other Llama models
       | (3, 3.2) or have you just done benchmarking with 3.1?
       | 
       | Is there some intuition as to why 3.1 might outperform 3.2 on
       | MI300X?
        
         | cheptsov wrote:
         | In this one we only used Llama 3.1 405B FP8. We took one model
         | to simplify the setup and were mostly looking at the memory
         | saturation effect, so we compared inference metrics of the
         | same model. I suppose comparing 3.1 and 3.2 will be difficult,
         | as they are entirely different models. But we're open to
         | ideas.
        
       | sylware wrote:
       | Finally somebody tackling the issue of AI sleep.
       | 
       | May need a neural net with a third dimension, namely 3D.
       | 
       | Let's wait for the outcome, but they should have a look at this
       | regime in the human brain.
        
       | JackYoustra wrote:
       | Glad to see Hot Aisle again providing support to the AMD research
       | community!
        
         | latchkey wrote:
         | Thanks Jack. This one too...
         | https://news.ycombinator.com/item?id=42331355
        
       ___________________________________________________________________
       (page generated 2024-12-05 23:01 UTC)