[HN Gopher] Exploring inference memory saturation effect: H100 v...
___________________________________________________________________
Exploring inference memory saturation effect: H100 vs. MI300x
Author : latchkey
Score : 49 points
Date : 2024-12-05 16:50 UTC (6 hours ago)
(HTM) web link (dstack.ai)
(TXT) w3m dump (dstack.ai)
| honzafafa wrote:
| Cool!
| byefruit wrote:
| Bit weird to show $ per 1M tokens and not include the actual
| costs of the systems anywhere.
|
| It would be interesting to know the outright prices for those
| systems as well as their hourly rental rates at the moment.
| latchkey wrote:
| Hot Aisle: https://hotaisle.xyz/pricing/
|
| Lambda: https://lambdalabs.com/service/gpu-cloud#pricing
| cheptsov wrote:
| Yes, we should have included the price too--thanks for
| pointing that out. I forgot about it. Appreciate you adding
| the links here! And yes, Lambda's and Hot Aisle's prices were
| used in the calculation.
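The $-per-1M-tokens figure discussed above can be derived from an hourly rental rate and a sustained throughput measurement. A minimal sketch with made-up numbers (neither the rate nor the throughput below comes from the article; check the Hot Aisle and Lambda pricing pages for current rates):

```python
# Sketch: convert an hourly GPU rental rate plus measured token
# throughput into a cost per 1M generated tokens.
# All numbers below are illustrative placeholders.

def cost_per_million_tokens(hourly_rate_usd: float,
                            tokens_per_second: float) -> float:
    """$/1M tokens = hourly rate / tokens generated per hour * 1e6."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_rate_usd / tokens_per_hour * 1_000_000

# Example: a hypothetical $20/hr node sustaining 300 tokens/s
print(round(cost_per_million_tokens(20.0, 300.0), 2))  # -> 18.52
```

The same formula works for comparing systems at different price points: a pricier node can still win on $/1M tokens if its throughput scales faster than its rate.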
| fngarrett wrote:
| Great read. Have you compared performance with other Llama models
| (3, 3.2) or have you just done benchmarking with 3.1?
|
| Is there some intuition as to why 3.1 might outperform 3.2 on
| MI300X?
| cheptsov wrote:
 | In this one we were only using 3.1 405B FP8. We took one model
 | to simplify the setup and were mostly looking at the memory
 | saturation effect, so we essentially compared inference metrics
 | of the same model across the two GPUs. I suppose comparing 3.1
 | and 3.2 would be difficult as they are entirely different
 | models. But open to ideas.
| sylware wrote:
| Finally somebody tackling the issue of AI sleep.
|
 | It may need a neural net with a third dimension, i.e. a 3D
 | architecture.
|
| Let's wait for the outcome, but they should have a look at this
| regime in the human brain.
| JackYoustra wrote:
 | Glad to see Hot Aisle again providing support to the AMD
 | research community!
| latchkey wrote:
| Thanks Jack. This one too...
| https://news.ycombinator.com/item?id=42331355
___________________________________________________________________
(page generated 2024-12-05 23:01 UTC)