[HN Gopher] Show HN: I made a GPU VRAM calculator for transformer-based models
       ___________________________________________________________________
        
       Show HN: I made a GPU VRAM calculator for transformer-based models
        
       Author : furiousteabag
       Score  : 50 points
       Date   : 2023-12-26 17:53 UTC (5 hours ago)
        
 (HTM) web link (vram.asmirnov.xyz)
 (TXT) w3m dump (vram.asmirnov.xyz)
        
       | cchance wrote:
        | Very nice! It would be cool to have a little "i" icon next to
        | each field to explain what each input means for newer users
        | (batch size, etc.).
        
       | a2128 wrote:
        | I noticed the default parameter count is 1.418 billion, but if
        | you erase it you can't actually enter it back, because you
        | can't type a decimal point in the input field. Also, you can't
        | enter parameter counts smaller than 1 billion.
        
         | sp332 wrote:
         | It works if you type the digits first and then insert the
         | decimal point after.
        
       | ilaksh wrote:
       | Does this have an option for quantization levels? Don't think I
       | saw it.
        
         | ComputerGuru wrote:
          | I second the request for quantization, e.g. for exl2.
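          | 
          | For context, weight memory scales roughly linearly with bits
          | per weight. Here's a minimal Python back-of-the-envelope
          | (the formula and the bpw values are my own assumptions, not
          | what the calculator actually does):
          | 
          |   # Memory for the model weights alone, ignoring KV cache,
          |   # activations, and framework overhead.
          |   def weight_memory_gib(n_params, bits_per_weight):
          |       return n_params * (bits_per_weight / 8) / 1024**3
          | 
          |   # Illustrative only: a 34B model at a few common bit
          |   # widths, including a fractional exl2-style 3.1 bpw.
          |   for bpw in (16, 8, 5, 3.1):
          |       print(f"{bpw:>4} bpw -> "
          |             f"{weight_memory_gib(34e9, bpw):.1f} GiB")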
        
       | roseway4 wrote:
        | While not as pretty (or as mobile-friendly) as the original
        | link, the calculators below support modeling LoRA-based
        | training alongside full finetuning (rough arithmetic on the
        | difference is sketched below).
       | 
       | https://huggingface.co/spaces/Vokturz/can-it-run-llm
       | 
       | https://rahulschand.github.io/gpu_poor/
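        | 
        | Rough arithmetic on why the LoRA case needs so much less VRAM
        | than full finetuning: the optimizer and gradient state only
        | cover the adapter weights. The byte counts and the 1%
        | trainable fraction below are illustrative assumptions, not
        | what either calculator computes:
        | 
        |   def full_finetune_gib(n_params):
        |       # fp16 weights + fp16 grads + fp32 Adam moments (m, v)
        |       # + fp32 master copy of the weights
        |       return n_params * (2 + 2 + 4 + 4 + 4) / 1024**3
        | 
        |   def lora_gib(n_params, trainable_frac=0.01):
        |       # frozen fp16 base weights, full training state only
        |       # for the adapter parameters
        |       frozen = n_params * 2
        |       adapters = n_params * trainable_frac * (2 + 2 + 4 + 4 + 4)
        |       return (frozen + adapters) / 1024**3
        | 
        |   n = 7e9  # a 7B model
        |   print(f"full finetune ~{full_finetune_gib(n):.0f} GiB")
        |   print(f"LoRA ~{lora_gib(n):.0f} GiB (weights + training state)")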
        
         | ComputerGuru wrote:
          | They seem to be broken when I try any HF IDs besides the
          | preconfigured ones. E.g., I just tried
          | brucethemoose/Yi-34B-200K-DARE-merge-v5-3.1bpw-exl2-fiction
          | or LoneStriker/shisa-7b-v1-3.0bpw-h6-exl2
        
         | icelancer wrote:
          | The second link hasn't been working for a while.
        
       | a_wild_dandan wrote:
        | Are people still rawdoggin' 16-bit models? I almost
        | exclusively use 5-bit inference quants (or 8-bit natives like
        | Yi-34B) on my MacBook Pro. Tiny accuracy loss, runs fast, and
        | leaves plenty of (V)RAM on the table. Mixtral 8x7B is my new
        | daily driver, and it only takes like 40GB to run! I wonder if
        | I could run two of them talking to each other...
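        | 
        | Quick sanity check on that ~40GB figure (the parameter count
        | and effective bits per weight here are rough assumptions; all
        | ~47B parameters stay resident even though only two experts
        | run per token):
        | 
        |   total_params = 46.7e9   # Mixtral 8x7B, all experts resident
        |   bits_per_weight = 5.5   # ~5-bit quant plus format overhead
        |   weights_gib = total_params * bits_per_weight / 8 / 1024**3
        |   print(f"weights alone: ~{weights_gib:.0f} GiB")
        |   # KV cache and runtime overhead push the practical
        |   # footprint toward the ~40GB ballpark.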
        
       ___________________________________________________________________
       (page generated 2023-12-26 23:00 UTC)