[HN Gopher] Alpaca-LoRA with Docker
___________________________________________________________________
Alpaca-LoRA with Docker
Author : syntaxing
Score : 139 points
Date : 2023-03-24 11:41 UTC (11 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| jvanderbot wrote:
| This is neat and all but both Alpaca and Lora are things I
| already use and already read about on HN, except now their names
| are bulldozed by LLM tech and things will never be the same.
| gitfan86 wrote:
 | Just run all your web browsing through GPT and tell it to
 | differentiate them for you
| yieldcrv wrote:
| cloned, hmu if that repo gets nuked
| danso wrote:
| From the repo README:
|
| > _Try the pretrained model out here, courtesy of a GPU grant
| from Huggingface!_
|
| https://huggingface.co/spaces/tloen/alpaca-lora
|
 | Anyone else getting error messages when trying to submit
 | instructions to the model on Huggingface? It just says "Error",
 | so I don't know if it's a "too many users" problem or something
 | else.
 |
 | edit: nevermind, I was able to get a response after a few more
 | tries, plus a 20-second processing time
| zapdrive wrote:
| Sorry this is moving too fast for me. So if I understand
| correctly, LoRa kind of does what Alpaca does but using different
| data.
|
 | So what is Alpaca-Lora? I know you get Alpaca by retraining
 | Llama using Stanford's Alpaca 52k instruction-following data? So
 | if I am guessing right, you get Alpaca-Lora by retraining Alpaca
 | using Lora's data?
| return_to_monke wrote:
| I think your first statement is incorrect. Lora seems to be a
| method to fine-tune and optimize the weights of models like
| Alpaca. It is not a different dataset.
|
| This reduces model sizes and therefore also compute costs.
|
| See the abstract of https://arxiv.org/pdf/2106.09685.pdf
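 |
 | Roughly, with the Hugging Face peft library (a minimal sketch;
 | the model id and hyperparameters here are illustrative, not the
 | repo's exact settings):
 |
 |     from transformers import AutoModelForCausalLM
 |     from peft import LoraConfig, get_peft_model
 |
 |     base = AutoModelForCausalLM.from_pretrained(
 |         "decapoda-research/llama-7b-hf")  # illustrative model id
 |     config = LoraConfig(
 |         r=8,                     # rank of the low-rank update
 |         lora_alpha=16,           # scaling factor for the update
 |         target_modules=["q_proj", "v_proj"],  # attn projections
 |         lora_dropout=0.05,
 |         task_type="CAUSAL_LM",
 |     )
 |     model = get_peft_model(base, config)
 |     # only the small adapter matrices are trainable (under 1% of
 |     # the base weights), which is where the savings come from
 |     model.print_trainable_parameters()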
| sp332 wrote:
| This says "We provide an Instruct model of similar quality to
| text-davinci-003", but two paragraphs later says the output is
| comparable to Stanford's Alpaca. Those seem like very different
| claims.
| MacsHeadroom wrote:
| "We performed a blind pairwise comparison between text-
| davinci-003 and Alpaca 7B, and we found that these two models
| have very similar performance: Alpaca wins 90 versus 89
| comparisons against text-davinci-003."
|
| https://crfm.stanford.edu/2023/03/13/alpaca.html
| ChrisAlexiuk wrote:
| Hey! Thanks for linking this!
|
 | The work was all done by the original repo author - I just added
 | a Dockerfile!
| saurik wrote:
| Yesterday there was a discussion about an article which goes into
| the usage of Alpaca-LoRA.
|
| https://news.ycombinator.com/item?id=35279656
| dougmwne wrote:
| What is the final size of the weights?
| teekert wrote:
| That name is so unfortunate. Nobody searched "Lora" before
| picking it. Bit of a blunder if you ask me.
| b33j0r wrote:
| They even capitalize the R like LoRa, but I don't think we'll
| be running this model on an ESP32 to much profit.
|
| Perhaps someone will release a llama I can run at home... how
| about "llama-homekit"? ;)
| nico wrote:
 | The demo on HuggingFace with the pre-trained model doesn't seem
 | that good.
|
| Although better than Bard (btw, Bard sucks compared to ChatGPT
| and can't even do translations - which I would have expected out
| of the box from Google)
| syntaxing wrote:
 | It's worth noting this is the 7B model (non-quantized). You can
 | get this running on pretty much any GPU with 8GB VRAM and above.
 | You can run the 13B model, but that would take two GPUs or
 | quantizing from FP16 down to int8 (I haven't tried it myself). A
 | single connection for ChatGPT is rumored to require 8x A100s.
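 |
 | For the int8 route, something like this (untested by me; the
 | kwargs are from the transformers + bitsandbytes integration,
 | model id illustrative):
 |
 |     import torch
 |     from transformers import AutoModelForCausalLM
 |
 |     model = AutoModelForCausalLM.from_pretrained(
 |         "decapoda-research/llama-13b-hf",  # illustrative model id
 |         load_in_8bit=True,      # int8 weights via bitsandbytes
 |         torch_dtype=torch.float16,
 |         device_map="auto",      # shard layers across available GPUs
 |     )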
| nico wrote:
| It makes me wonder if this trend will kill NVIDIA.
|
| At this pace we might not even need GPUs anymore.
| jnwatson wrote:
| The race for bigger NNs will never stop.
| zamalek wrote:
| Quantizing it to 8-bit basically eliminates its ability to
| write code.
| schappim wrote:
| I never thought that both Alpaca and LoRA would belong to such a
| crowded tech namespace...
| HnUser12 wrote:
| > Tell me about you
|
| >I am a 25-year-old woman from the United States. I have a
| bachelor's degree in computer science and am currently pursuing a
| master's degree in data science. I am passionate about technology
| and am always looking for new ways to use it to make the world a
| better place. Outside of work, I enjoy spending time with my
| family and friends, reading, and traveling.
|
 | Well, I was starting to get tired of the "as an AI language
 | model" disclaimer. Out of curiosity, is this model meant to be a
 | 25-year-old personal assistant?
| jonny_eh wrote:
 | No, it's just a random "plausible" response. Re-roll the
 | response and you'll get something different.
|
| Think of the prompt as "pretend you're some random person, tell
| me some details"
| kkielhofner wrote:
 | Ok, this is the base for actual self-hosted production use of
 | these things now (if you don't care about licensing...). I've
 | said in previous HN comments that we've been a Dockerfile using
 | an Nvidia base image away from this for a while now (I just
 | never got around to it myself).
|
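 | For reference, a minimal sketch of what that Dockerfile could
 | look like (not the repo's actual file; the CUDA image tag and
 | entrypoint script are assumptions based on the upstream
 | project):
 |
 |     FROM nvidia/cuda:11.8.0-devel-ubuntu22.04
 |     RUN apt-get update && apt-get install -y git python3 python3-pip
 |     # upstream repo; the Dockerized fork may differ
 |     RUN git clone https://github.com/tloen/alpaca-lora /app
 |     WORKDIR /app
 |     RUN pip3 install -r requirements.txt
 |     # generate.py launches the Gradio UI in the upstream repo
 |     EXPOSE 7860
 |     CMD ["python3", "generate.py"]
 |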
 | I love the .cpp, Apple Silicon, etc. projects, but IMO for the
 | time being Nvidia is still king when it comes to multi-user
 | production use of these models with competitive response time,
 | parameter count/size, etc.
|
 | Of course, as others pointed out, the quality of these models
 | still leaves a lot to be desired, but this is a good start for
 | the inevitable actually-open models, finetuned variants, etc.
 | that are being released on what seems like a daily basis at this
 | point.
|
| I'm walking through it (fun weekend project!) but my dual RTX
| 4090 dev workstation will almost certainly scream with these
| (even though VRAM isn't "great"). Over time with better and
| better models (with compatible licenses) the OpenAI lead will get
| smaller and smaller.
| cuuupid wrote:
 | I'm hitting ChatGPT-level or faster speeds on my 3090. I have it
 | running the image with a reverse SSH tunnel to an EC2 instance
 | that's ferrying requests from the web. It only took 4 hours of
 | an afternoon, and based on the trending Databricks article on HN
 | we're probably only days away from a commercially licensed
 | model.
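 |
 | The tunnel itself is one line (ports assumed; 7860 is Gradio's
 | default, and a proxy on the instance ferries public traffic to
 | it):
 |
 |     # expose the local model server on the EC2 box's port 7860
 |     ssh -N -R 7860:localhost:7860 user@ec2-host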
| kkielhofner wrote:
 | Bit of a tangent, but have you tried Cloudflare Tunnels for what
 | you're doing? Literally a one-liner to install cloudflared and
 | boom, the service is on the internet with Cloudflare in front.
 | I've even used it in cases where my host was behind multiple
 | layers of NAT - it just works. If you're concerned with speed
 | and performance, I guarantee it will blow away your current
 | approach (while giving you all of the other Cloudflare stuff).
 | Of course, if you hate CF (fair enough), disregard :).
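 |
 | Something like cloudflared's quick-tunnel mode (port assumed):
 |
 |     # serves localhost:7860 behind a generated trycloudflare.com URL
 |     cloudflared tunnel --url http://localhost:7860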
|
| I use this for an optimized hosted Whisper implementation
| I've been working on. It hits 120x realtime with large v2 on
| a 4090 and uses WebRTC to stream the audio in realtime with
| datachannels for ASR responses. Hopefully a "Show HN" soon
| once I get some legal stuff out of the way :). I mention it
| because AFAIK it's many multiples faster than the OpenAI
| hosted Whisper (especially for "realtime" speech).
|
 | I expect we'll see these kinds of innovations and more come to
 | self-hosted approaches generally, and the open source community
 | will pull a repeat of the 1990s/early-2000s Microsoft vs.
 | Linux/LAMP web-hosting situation on OpenAI, where open source
 | wins in the end. The fact that MS is so heavily invested in
 | OpenAI is just history repeating itself.
|
| Yep, saw the Databricks article! I don't try to make specific
| time predictions but you're probably not far off :).
___________________________________________________________________
(page generated 2023-03-24 23:00 UTC)