[HN Gopher] The Llama Ecosystem: Past, Present, and Future
       ___________________________________________________________________
        
       The Llama Ecosystem: Past, Present, and Future
        
       Author : allanberger
       Score  : 52 points
       Date   : 2023-09-27 20:31 UTC (2 hours ago)
        
 (HTM) web link (ai.meta.com)
 (TXT) w3m dump (ai.meta.com)
        
        | waffletower wrote:
        | I actively use llama.cpp, and I don't take the lack of a
        | mention as a slight -- it isn't directly affiliated with Meta.
        | While there is tremendous innovation in the project, backwards
        | compatibility is antithetical to its culture. I have been
        | updating my models to GGUF, which isn't terrible, but I find I
        | have to invest too much time to stay on top of the rapid,
        | scorched-earth development. I'm going to move to containerized
        | checkpoints, as I do for my GPU models, for greater
        | maintainability and consistency.
        
       | skilled wrote:
       | > There are now over 7,000 projects on GitHub built on or
       | mentioning Llama. New tools, deployment libraries, methods for
       | model evaluation, and even "tiny" versions of Llama are being
       | developed to bring Llama to edge devices and mobile platforms.
       | 
        | Let's say I want to find the latest or most recent projects
        | built on it -- is it possible to find them on GitHub based on
        | that criterion?
        
          | version_five wrote:
          | GitHub has pretty varied filters: you can just search "llama"
          | and sort by stars, recent activity, etc. It doesn't look like
          | it's possible to exclude Python, but doing so might surface
          | the "edge" ones. (Except they usually ship Python utilities
          | for converting PyTorch models.)
        
           | skilled wrote:
           | Oh? But you can't sort by date or things like that?
        
             | version_five wrote:
             | See https://github.com/search/advanced there are various
             | date options
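As a concrete sketch of the filtering discussed above, the same search can be expressed against GitHub's public REST search API (the `pushed:` cutoff date is an arbitrary example qualifier, not something from the thread). This only builds the query URL; no request is made:

```python
from urllib.parse import urlencode

def repo_search_url(keyword, sort="updated", per_page=10):
    """Build a GitHub repository-search URL for `keyword`,
    sorted by most recent activity. No network call is made."""
    base = "https://api.github.com/search/repositories"
    query = urlencode({
        # `pushed:` keeps only recently active repos (example cutoff)
        "q": f"{keyword} pushed:>2023-01-01",
        "sort": sort,      # "updated" surfaces the latest projects
        "order": "desc",
        "per_page": per_page,
    })
    return f"{base}?{query}"

print(repo_search_url("llama"))
```

Fetching that URL (e.g. with `curl`) returns the matching repositories as JSON, newest activity first.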
        
        | version_five wrote:
        | They didn't mention llama.cpp or show it in their picture.
        | That's hopefully an oversight; it feels like a major slight.
        | It's a (the?) major reason for Llama's popularity.
        | 
        | I have mixed feelings: Llama is great, but it has perpetuated
        | its shitty license. They could have done so much more good if
        | they'd used GPL-style licensing; instead they basically
        | subverted open source, using an objectively good model as
        | leverage.
        
          | refulgentis wrote:
          | > It's a (the?) major reason for llamas popularity.
          | 
          | Absolutely not. There's a corner of the overall community
          | that hovers around it and overperceives it, assuming everyone
          | else uses it too.
          | 
          | It's great if you have an Apple ARM machine and want to see
          | an M2 Pro do 10 tokens/sec (and learn what could give an
          | Apple ARM machine 30-minute battery life).
          | 
          | I also doubt it's a slight; the only callouts are large
          | commercial collaborations, e.g. NVIDIA, AMD, and Google,
          | representative of each of the three groups we could assign
          | it to.
        
            | version_five wrote:
            | I'd be curious if you have any hard data about use. Mine is
            | anecdotal too, but I see that llama.cpp is the very close
            | second-highest-starred repo with llama in the name, after
            | Meta's llama. Additionally, all the HF models seem to have
            | GGML/GGUF quantized versions. I'm not aware of a competing
            | format for quantized models. There are also Python bindings
            | that are used in a lot of projects. What is a competing
            | framework, other than PyTorch, that's getting more use? Or
            | is it all just PyTorch (and some HF wrappers), with the
            | rest a rounding error?
        
              | refulgentis wrote:
              | This reminds me of a comment elsewhere I also replied to
              | today: it's sort of hard to even pretend I have global
              | usage stats, so I won't.
              | 
              | There's a certain type of myopia that leads to
              | overindexing on llama.cpp, and that makes it easy to
              | classify. To wit:
              | 
              | > not aware of a competing format for quantized models
              | 
              | ONNX. That's how it's done in prod, and on other models
              | besides (and including) Llama. Quantization is a general
              | technique. 100 small variants of Llama 2 GGML weights
              | feels like spam from that perspective. (Sort of Civitai
              | vs. Hugging Face; Hugging Face smartly stopped that with
              | AI art.)
              | 
              | llm.mlc.ai for a more academic / less ad hoc approach.
              | 
              | > [stars on GitHub]
              | 
              | It's great for a very narrow & simple case that matches a
              | large demographic on GitHub, and the demographics of
              | people talking about LLMs casually on HN: MacBook,
              | wanting to run locally, dreaming of a future free of
              | having to ship your data to servers to get
              | personalization. 5% of overall usage can be #2 in usage,
              | if that makes sense.
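The point that quantization is a general technique, independent of any one file format, can be illustrated with a toy example. This is a minimal symmetric int8 scheme (a deliberate simplification; it is not the actual GGUF or ONNX codec, which add per-block scales and other refinements):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map floats into
    [-127, 127] using a single scale factor. The same basic idea
    underlies GGUF, ONNX, and other quantized checkpoint formats."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # 1.0 guards all-zero input
    return [round(w / scale) for w in weights], scale

def dequantize_int8(quantized, scale):
    """Recover approximate floats from the int8 values."""
    return [q * scale for q in quantized]

weights = [0.12, -0.5, 1.0, -1.27]
q, scale = quantize_int8(weights)
recovered = dequantize_int8(q, scale)
# round-trip error is bounded by half the scale step
```

Any container (GGUF, ONNX, or otherwise) is then just a serialization of the integer values plus their scales.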
        
               | version_five wrote:
               | > This reminds me of a comment elsewhere I also replied
               | to today
               | 
               | Right, looks like you made fun of / were condescendingly
               | dismissive of my comment in another thread, I wouldn't
               | have replied here if I'd realized you were the same
               | person.
        
          | kordlessagain wrote:
          | A lot of the time there can be a feeling of being wronged
          | without it being intentional. In this case, I think the
          | mention of AWS as a partner shows intent to put value behind
          | what they are doing for their stakeholders.
          | 
          | The license for Llama 2 is pretty intense, but it mirrors
          | that intent by limiting interactions with individuals at
          | scale, as well as prohibiting anything learned from the model
          | through inference from being used to _train_ another model. I
          | suspect this is because the dataset on which it was trained
          | is the company's IP, which again is for the shareholders'
          | benefit.
          | 
          | The code is open, though, I think out of necessity. AI poses
          | a significant challenge for our survival, and making it open
          | is an indication of transparency. They still need to make
          | money at what they do and charge people for using their IP,
          | within reason.
          | 
          | My question would be: if I used Llama (not the code, but the
          | model itself) to code up a new model, would that be a
          | derivative work?
        
        | kristianp wrote:
        | Any MoE models in development at Meta?
        
        | syntaxing wrote:
        | People on HN like to complain about the license all the time,
        | like a crusade, but I'm personally very thankful for their work
        | and the community that is building off of it. I recently set up
        | Ollama + codellama + Continue dev, and it's a game changer. It
        | has practically been a drop-in GitHub Copilot replacement, but
        | local.
        
          | dartos wrote:
          | Yeah, the community is great.
          | 
          | It'd just be better if it were around RWKV or something that
          | doesn't prevent you from improving any models outside of the
          | Llama ecosystem.
          | 
          | It's a great embrace-extend-extinguish play by Meta.
        
          | version_five wrote:
          | The license is a wedge that's destroying the meaning of open
          | source; it's worth complaining about, and it's evil to have
          | done it that way. I would have preferred a commercial license
          | that was at least honest, instead of a scorched-earth
          | ecosystem takeover like they've done. In a sense it's an
          | extension of the big-tech model of "provide something
          | notionally free that's too good not to use, and use it to
          | destroy competition."
        
           | waffletower wrote:
           | It is a wedge for some, but not at all 'evil', at least not
           | for the reason you are providing. If you feel it is
           | cannibalizing your company's business model, my apologies.
        
              | version_five wrote:
              | Lol, is that the strawman people have come up with --
              | that not liking Meta's "only do what we allow" license
              | must be anger about competition?
              | 
              | No. A good parallel would be if Microsoft (say) wrote
              | their own Linux clone that was compatible but had some
              | proprietary enhancements that made it desirable over
              | open source distros. The only catch: it isn't
              | GPL-licensed (they wrote it from scratch); it has a
              | proprietary MS license that says you can only use it for
              | things MS approves of, and you are using it at their
              | pleasure, to be revoked at any time.
              | 
              | People don't care about the license; they call it open
              | source and move away from GNU/Linux to the proprietary
              | MS version, and now we're only doing what MS allows.
              | 
              | That's exactly what's happening in the ML model world
              | right now, but people are happy with the shiny models
              | Facebook lets them use, so they say "what's the big
              | deal?"
        
       ___________________________________________________________________
       (page generated 2023-09-27 23:01 UTC)