[HN Gopher] Show HN: I made a GPU VRAM calculator for transformer-based models
___________________________________________________________________
Show HN: I made a GPU VRAM calculator for transformer-based models
Author : furiousteabag
Score : 50 points
Date : 2023-12-26 17:53 UTC (5 hours ago)
(HTM) web link (vram.asmirnov.xyz)
(TXT) w3m dump (vram.asmirnov.xyz)
| cchance wrote:
| Very nice. It would be cool to have a little "i" (info) icon next
| to each field to explain what each input means for newer users
| (batch size, etc.).
| a2128 wrote:
| I noticed the default parameter count is 1.418 billion, but if you
| erase it you can't enter it back, because you can't type a decimal
| point in the input field. Also, you can't enter parameter counts
| smaller than 1 billion.
| sp332 wrote:
| It works if you type the digits first and then insert the
| decimal point after.
| ilaksh wrote:
| Does this have an option for quantization levels? I don't think I
| saw one.
| ComputerGuru wrote:
| I second the request for quantization, e.g. for exl2.
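
  A rough sketch of the arithmetic such a quantization option would
  need (illustrative only, not the site's actual formula): quantized
  weight memory scales with parameter count times bits per weight.

      def weight_vram_gib(params_billion: float, bits_per_weight: float) -> float:
          """Approximate VRAM for model weights alone, ignoring KV
          cache, activations, and framework overhead."""
          total_bytes = params_billion * 1e9 * bits_per_weight / 8
          return total_bytes / 2**30

      # e.g. a 34B model at a common exl2 target of ~4.65 bits/weight:
      # weight_vram_gib(34, 4.65) -> ~18.4 GiB
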
| roseway4 wrote:
| While not as pretty (or as mobile-friendly) as the original link,
| the calculators below support modeling LoRA-based training
| alongside full finetuning.
|
| https://huggingface.co/spaces/Vokturz/can-it-run-llm
|
| https://rahulschand.github.io/gpu_poor/
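
  For context on the full-finetune vs. LoRA gap these calculators
  model, a rule-of-thumb sketch: the byte counts assume
  mixed-precision Adam, and the 1% trainable fraction is an assumed
  placeholder, not a measured value. Neither figure includes
  activations or the KV cache.

      def full_finetune_gib(params_billion: float) -> float:
          # fp16 weights (2) + fp16 grads (2) + fp32 master copy (4)
          # + Adam moments (4 + 4) = 16 bytes per parameter
          return params_billion * 1e9 * 16 / 2**30

      def lora_finetune_gib(params_billion: float,
                            trainable_frac: float = 0.01) -> float:
          # frozen fp16 base weights, plus full optimizer state for
          # the adapter parameters only
          base = params_billion * 1e9 * 2
          adapters = params_billion * 1e9 * trainable_frac * 16
          return (base + adapters) / 2**30

      # 7B model: full_finetune_gib(7) -> ~104 GiB
      #           lora_finetune_gib(7) -> ~14 GiB
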
| ComputerGuru wrote:
| They seem to be broken when I try any HF IDs besides what came
| preconfigured. E.g., I just tried
| brucethemoose/Yi-34B-200K-DARE-merge-v5-3.1bpw-exl2-fiction or
| LoneStriker/shisa-7b-v1-3.0bpw-h6-exl2.
| icelancer wrote:
| The second link hasn't been working for a while.
| a_wild_dandan wrote:
| Are people still rawdoggin' 16-bit models? I almost exclusively
| use 5-bit inference quants (or 8-bit natives like Yi-34B) on my
| MacBook Pro. Tiny accuracy loss, fast inference, and it leaves
| plenty of (V)RAM on the table. Mixtral 8x7B is my new daily
| driver, and it only takes around 40GB to run! I wonder if I could
| run two of them talking to each other...
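
  The ~40GB figure squares with back-of-the-envelope math. A quick
  sketch (the parameter count is Mixtral 8x7B's published ~46.7B
  total; the overhead note is an estimate, not a measurement):

      # Mixtral 8x7B has ~46.7B total parameters (the experts share
      # the attention layers, so it is less than 8 x 7B).
      params = 46.7e9
      weights_gib = params * 5 / 8 / 2**30  # ~27 GiB at 5 bits/weight
      # KV cache, scratch buffers, and runtime overhead push actual
      # usage from ~27 GiB toward the ~40GB reported above.
      print(f"{weights_gib:.1f} GiB")
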
___________________________________________________________________
(page generated 2023-12-26 23:00 UTC)