[HN Gopher] TinyLlama: An Open-Source Small Language Model
___________________________________________________________________
TinyLlama: An Open-Source Small Language Model
Author : matt1
Score : 63 points
Date : 2024-01-05 21:15 UTC (1 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| ronsor wrote:
| GitHub repo with links to the checkpoints:
| https://github.com/jzhang38/TinyLlama
| andy99 wrote:
| I've been using one of the earlier checkpoints for benchmarking a
| Llama implementation. Completely anecdotally I feel at least as
| good or better about this one than the earlier openllama 3B. I
| wouldn't use either of them for RAG or anything requiring more
| power, just to say that it's competitive as a smaller model,
| whatever you use those for, and easy to run on CPU at FP16
| (meaning without serious quantization).
| rnd0 wrote:
| >I wouldn't use either of them for RAG
|
| What's RAG?
| andy99 wrote:
| Retrieval augmented generative, basically giving it some text
| passage and asking questions about the text.
| dmezzetti wrote:
| If you want more on RAG with a concrete example:
| https://neuml.hashnode.dev/build-rag-pipelines-with-txtai
| sroussey wrote:
| What is good for RAG?
| andy99 wrote:
| The smallest model your users agree meets their needs. It
| really depends.
|
| The retrieval part is way more important.
|
| I've used the original 13B instruction tuned llama2,
| quantized, and found it gives coherent answers about the
| context provided, ie the bottleneck was mostly getting good
| context.
|
| When I played with long context models (like 16k tokens, and
| this was a few months ago, maybe they improved) they sucked.
| eachro wrote:
| What use cases would you say it is good enough for?
| matt1 wrote:
| OP here with a shameless plug: for anyone interested, I'm working
| on a site called Emergent Mind that surfaces trending AI/ML
| papers. This TinyLlama paper/repo is trending #1 right now and
| likely will be for a while due to how much attention it's getting
| across social media:
| https://www.emergentmind.com/papers/2401.02385. Emergent Mind
| also looks for and links to relevant discussions/resources on
| Reddit, X, HackerNews, GitHub, and YouTube for every new arXiv
| AI/ML paper. Feedback welcome!
| ukuina wrote:
| I visit your site every day. Thank you for creating it and
| evolving it past simple summaries to show paper details!
|
| I recall you were looking to sell it at some point. Was
| wondering what that process looked like, and why you ended up
| holding on to the site.
| matt1 wrote:
| Hey, thanks for the kind words.
|
| To answer your question: an earlier version of the site
| focused on surfacing AI news, but that space is super
| competitive and I don't think Emergent Mind did a better job
| than the other resources out there. I tried selling it
| instead of just shutting it down, but ultimately decided to
| keep it. I recently decided to pivot to covering arXiv
| papers, which is a much better fit than AI news. I think
| there's an opportunity with it to not only help surface
| trending papers, but help educate people about them too using
| AI (the GPT-4 summaries are just a start). A lot of the
| future work will be focused in that direction, but I'd also
| love any feedback folks have on what I could add to make it
| more useful.
| tmaly wrote:
| I am new to this space. Is it hard to fine tune this model?
| dmezzetti wrote:
| Link to model on HF Hub:
| https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
| minimaxir wrote:
| It was fun to follow the public TinyLlama loss curves in near
| real-time, although it showed that it can be frustrating since
| the loss curves barely moved down even after an extra trillion
| tokens: https://wandb.ai/lance777/lightning_logs/reports/metric-
| trai... (note the log-scaled X-axis)
|
| But they _did_ move down and that 's what's important.
|
| There should probably be more aggressive learning rate annealing
| for models trying to be Chinchilla-optimal instead of just
| cosine-with-warmup like every other model nowadays.
| sroussey wrote:
| Needs an onnx folder to use it with transformer.js out of the
| box.
|
| Hopefully @xenova will make a copy with it soon.
___________________________________________________________________
(page generated 2024-01-05 23:00 UTC)