[HN Gopher] Llama2-shepherd a CLI tool to install multiple imple...
       ___________________________________________________________________
        
        Llama2-shepherd, a CLI tool to install multiple implementations
        of llama2
        
       Author : mikepapadim
       Score  : 56 points
       Date   : 2024-01-04 12:24 UTC (10 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | simonw wrote:
       | Are you planning on adding documentation / a mechanism for
       | running a prompt using the code this installs?
       | 
        | As far as I can tell, at the moment it clones one of the
        | various repos for you and downloads some model weights, but it
        | doesn't yet help you compile and run the code.
        
          | mikepapadim wrote:
          | Hello, I am also planning to add a runner option and a
          | benchmarking option across the different implementations.
          | This is just an MVP while I try to keep track of all llama2
          | implementations, as the ones listed in the original repo are
          | a bit outdated.
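
        A minimal runner sketch for the meantime (not part of
        llama2-shepherd), assuming the llama2.c conventions (a `run`
        Makefile target and `./run <checkpoint> -i <prompt>`); the
        checkpoint and paths below are examples only.

            # Clone, build, and prompt karpathy/llama2.c with a small
            # tinyllamas checkpoint. Requires git, make, a C compiler,
            # and wget on PATH.
            import os
            import subprocess

            REPO = "https://github.com/karpathy/llama2.c"
            WEIGHTS = ("https://huggingface.co/karpathy/tinyllamas/"
                       "resolve/main/stories15M.bin")
            WORKDIR = "llama2.c"

            if not os.path.isdir(WORKDIR):
                subprocess.run(["git", "clone", REPO, WORKDIR], check=True)

            # Build the C runner (the repo's Makefile has a `run` target).
            subprocess.run(["make", "run"], cwd=WORKDIR, check=True)

            # Fetch the checkpoint if it is not already present.
            ckpt = os.path.join(WORKDIR, "stories15M.bin")
            if not os.path.isfile(ckpt):
                subprocess.run(["wget", "-O", ckpt, WEIGHTS], check=True)

            # Generate 256 tokens (-n) from a prompt (-i).
            run_bin = os.path.join(os.path.abspath(WORKDIR), "run")
            subprocess.run([run_bin, "stories15M.bin", "-n", "256",
                            "-i", "Once upon a time"],
                           cwd=WORKDIR, check=True)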
        
        | liuliu wrote:
        | Does everyone use gguf / safetensors? How is model management
        | done for this kind of tool?
        
          | mikepapadim wrote:
          | Most of the implementations support the tinyllamas models.
          | Regarding gguf/ggml and safetensors, each implementation has
          | its own model importers, so there is no guarantee that every
          | format can be consumed by every implementation.
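
        Since each implementation ships its own importers, a quick way
        to tell which format a given checkpoint file uses is to look at
        its leading bytes. A minimal sketch (not part of the tool),
        relying on the GGUF magic string and the safetensors header
        layout:

            # Tiny format sniffer: GGUF files start with the ASCII magic
            # "GGUF"; safetensors files start with a little-endian u64
            # header length followed by a JSON header. Everything else
            # is reported as unknown.
            import struct
            import sys

            def sniff(path):
                with open(path, "rb") as f:
                    head = f.read(8)
                    if head[:4] == b"GGUF":
                        return "gguf"
                    if len(head) == 8:
                        (hdr_len,) = struct.unpack("<Q", head)
                        # Sanity-check the header size before trusting it.
                        if 0 < hdr_len < 100_000_000 and f.read(1) == b"{":
                            return "safetensors"
                return "unknown"

            if __name__ == "__main__":
                for p in sys.argv[1:]:
                    print(p, sniff(p))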
        
        | superkuh wrote:
        | It's sad that github now defaults to showing a blank page,
        | devoid of any content related to the linked project, when js
        | execution isn't complete.
        
         | vlugorilla wrote:
         | Maybe Gothub can help with this: https://gothub.app/
         | 
         | It's like https://nitter.net/about but for Github. Just
         | `s/github.com/gothub.app/g`
        
       | pama wrote:
       | Do you have a performance comparison for inference on the same
       | hardware using each of these implementations?
        
          | mikepapadim wrote:
          | Not yet; this is the end goal of this repo: to be able to do
          | this kind of perf evaluation.
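
        A minimal wall-clock harness sketch for that kind of comparison;
        the commands below are placeholders, so substitute whatever
        runner and checkpoint each implementation actually uses. Wall
        time is a crude proxy: it includes model loading, so per-token
        throughput would have to come from each runner's own reporting.

            # Time each implementation's runner on the same prompt and
            # report wall-clock seconds. The commands are placeholders;
            # point them at the real binaries/checkpoints on your machine.
            import subprocess
            import time

            CANDIDATES = {
                "llama2.c": ["./llama2.c/run", "./llama2.c/stories15M.bin",
                             "-z", "./llama2.c/tokenizer.bin",
                             "-n", "256", "-i", "Once upon a time"],
                "another-port": ["echo", "placeholder command"],
            }

            for name, cmd in CANDIDATES.items():
                start = time.perf_counter()
                result = subprocess.run(cmd, capture_output=True, text=True)
                elapsed = time.perf_counter() - start
                status = ("ok" if result.returncode == 0
                          else f"exit {result.returncode}")
                print(f"{name:14s} {elapsed:7.2f}s  ({status})")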
        
       ___________________________________________________________________
       (page generated 2024-01-04 23:01 UTC)