[HN Gopher] Google CodeGemma: Open Code Models Based on Gemma [pdf]
       ___________________________________________________________________
        
       Google CodeGemma: Open Code Models Based on Gemma [pdf]
        
       Author : tosh
       Score  : 145 points
       Date   : 2024-04-09 12:32 UTC (10 hours ago)
        
 (HTM) web link (storage.googleapis.com)
 (TXT) w3m dump (storage.googleapis.com)
        
       | tosh wrote:
       | It's really sad that Cursor does not support local models yet
       | (afaiu they fetch the URL you provide from their server). Is
       | there a VS Code plugin or other editor that does?
       | 
       | With models like CodeGemma and Command-R+ it makes more and more
       | sense to run them locally.
        
         | sp332 wrote:
          | According to https://forum.cursor.sh/t/support-local-llms/1099/7
          | the Cursor servers do a lot of work in between your local
          | computer and the model. So porting all that to work on users'
          | laptops is going to take a while.
        
         | ericskiff wrote:
         | I've been playing with Continue:
         | https://github.com/continuedev/continue
        
           | tosh wrote:
           | ty for the pointer!
        
         | tosh wrote:
         | Already on ollama: https://ollama.com/library/codegemma
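          | 
          | A rough sketch of querying it through the local Ollama HTTP
          | API once pulled (untested; prompt and options are just
          | examples):
          | 
          |     # assumes `ollama pull codegemma` has been run and the
          |     # server is listening on the default port 11434
          |     import requests
          | 
          |     resp = requests.post(
          |         "http://localhost:11434/api/generate",
          |         json={
          |             "model": "codegemma",
          |             "prompt": "Write a Python function that reverses a string.",
          |             "stream": False,
          |         },
          |     )
          |     print(resp.json()["response"])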
        
         | ado__dev wrote:
         | Cody supports local inference with Ollama for both Chat and
         | Autocomplete. Here's how to set it up:
         | https://sourcegraph.com/blog/local-chat-with-ollama-and-cody :)
        
           | tosh wrote:
           | ty for the pointer!
        
       | sheepscreek wrote:
       | Download the model weights here (PyTorch, GGUF):
       | 
       | https://huggingface.co/collections/google/codegemma-release-...
       | 
       | I am really liking the Gemma line of models. Thoroughly impressed
       | with the 2B and 7B non-code optimized variants. The 2B especially
        | packs a lot of punch. I reckon its quality must be on par with
       | some older 7B models, and it runs blazing fast on Apple Silicon -
       | even at 8 bit quantization.
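        | 
        | For the GGUF weights, a minimal llama-cpp-python sketch of
        | running an 8-bit quant on Apple Silicon (the filename is
        | hypothetical, use whichever Q8_0 file you grabbed from the
        | collection):
        | 
        |     # assumes llama-cpp-python installed with Metal support
        |     from llama_cpp import Llama
        | 
        |     llm = Llama(
        |         model_path="codegemma-7b-it-Q8_0.gguf",  # hypothetical name
        |         n_ctx=8192,
        |         n_gpu_layers=-1,  # offload all layers to the GPU
        |     )
        |     out = llm("Write a haiku about tokenizers.", max_tokens=64)
        |     print(out["choices"][0]["text"])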
        
         | tosh wrote:
         | Gemma 2b instruct worked well for me for categorization. I
         | would also say it felt 7b-ish. Very impressed. The initial
         | release left me a bit underwhelmed but 1.1 is better and
         | punches above its weight.
         | 
          | Also looking forward to using 2b models on iOS and Android
          | (even if they will be heavy on the battery).
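          | 
          | A toy version of the kind of categorization prompt I mean,
          | via the ollama Python client (model tag and labels are just
          | illustrative):
          | 
          |     import ollama
          | 
          |     LABELS = ["bug report", "feature request", "question"]
          | 
          |     def categorize(text: str) -> str:
          |         resp = ollama.chat(
          |             model="gemma:2b-instruct",
          |             messages=[{
          |                 "role": "user",
          |                 "content": f"Classify the following text as one of "
          |                            f"{LABELS}. Reply with the label only.\n\n"
          |                            f"{text}",
          |             }],
          |         )
          |         return resp["message"]["content"].strip()
          | 
          |     print(categorize("The app crashes when I open settings."))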
        
       | kolbe wrote:
       | My issue so far with the various code assistants isn't the
        | quality necessarily, but their ability to draw in context
        | from the rest of the code base without breaking the bank or
        | providing so much info that the middle gets ignored. Are there
        | any systems doing that well these days?
        
         | grey8 wrote:
          | If I'm not mistaken, this is not on the models themselves,
          | but rather on the implementation of the addon.
         | 
         | I haven't found an open source VSCode or WebStorm addon yet
         | that allows me to use a local model and implements code
          | completion and commands as well as GitHub Copilot does.
         | 
          | They're either missing a chat feature, inline actions / code
          | completion, or fill-in-the-middle models. And when they do
          | have those, they don't provide the context as intelligently
          | (an assumption on my part!) as GH's Copilot does.
         | 
          | One alternative I liked was Supermaven: it's really, really
          | fast and has a huge context window, so it knows almost your
          | whole project. That was nice! But the reason I ultimately
          | didn't keep using it: it doesn't support chat or inline
          | commands (CTRL+I in VSCode's GH Copilot).
         | 
          | I feel like a really good Copilot alternative is definitely
          | still missing.
         | 
         | But: Regarding your question, I think GitHub Copilot's VSCode
          | extension is the best as of now. The WebStorm extension is
          | sadly not as good; it's missing the "inline command"
          | function, which IMHO is a must.
        
           | skybrian wrote:
           | Could you use one tool for code completion and another for
           | chat?
        
             | evilduck wrote:
             | Continue.dev allows for this. You can even mix hosted Chat
             | options like GPT-4 (via API) with local completion. I
             | typically use a smaller model for faster text completion
             | and a larger model (with a bigger context) for chat.
             | 
             | https://github.com/continuedev/continue
        
             | mediaman wrote:
             | I think most of them allow for that. Works in vscode and
             | vscode-derived (e.g., cursor) editors.
        
         | wsxiaoys wrote:
          | Check out Tabby: https://github.com/TabbyML/tabby
         | 
         | Blog post on repository context:
         | https://tabby.tabbyml.com/blog/2023/10/16/repository-context...
         | 
         | (Disclaimer: I started this project)
        
         | Havoc wrote:
         | Continue + deepseek local model has been working reasonably
         | well for me
        
         | snovv_crash wrote:
          | A RAG system is better than a pure LLM for this use case IMO.
        
           | kolbe wrote:
            | Yeah. This is how I imagined these things should work.
            | But it is tricky. The system needs to pattern match on
            | the types you've been using, if possible. So you need to
            | vector search the code to do that. Then you need to
            | vector search the actual dependency source. It's not that
            | simple, but it would be the ultimate solution.
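            | 
            | Roughly what I have in mind, as a toy sketch (embeddings
            | via Ollama's /api/embeddings endpoint; the chunks and the
            | embedding model name are just placeholders):
            | 
            |     # embed code chunks, then pull the nearest ones into
            |     # the prompt as context (assumes a local Ollama server)
            |     import requests
            |     import numpy as np
            | 
            |     def embed(text):
            |         r = requests.post(
            |             "http://localhost:11434/api/embeddings",
            |             json={"model": "nomic-embed-text", "prompt": text})
            |         return np.array(r.json()["embedding"])
            | 
            |     chunks = [  # placeholder snippets from the code base
            |         "def parse_config(path): ...",
            |         "class HttpClient: ...",
            |         "def fibonacci(n): ...",
            |     ]
            |     index = [(c, embed(c)) for c in chunks]
            | 
            |     def retrieve(query, k=2):
            |         q = embed(query)
            |         def sim(v):
            |             return float(np.dot(q, v) /
            |                          (np.linalg.norm(q) * np.linalg.norm(v)))
            |         ranked = sorted(index, key=lambda cv: -sim(cv[1]))
            |         return [c for c, _ in ranked[:k]]
            | 
            |     print(retrieve("add retry logic to the http client"))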
        
           | viksit wrote:
            | A RAG system that uses tree-sitter rather than vector
            | search alone would lead to better results (intuitively
            | speaking)? Have you seen anything like that yet?
        
         | mediaman wrote:
          | Seconding Supermaven here, from the guy who made Tabnine.
         | 
         | Supermaven has a 300k token context. It doesn't seem like it
         | has a ton of intelligence -- maybe comparable to copilot, maybe
         | a bit less -- but it's much better at picking up data
         | structures and code patterns from your code, and usually what I
         | want is help autocompleting that sort of thing rather than
         | writing an algorithm for me (which LLMs often get wrong
         | anyway).
         | 
         | You can also pair it with a gpt4 / opus chat in Cursor, so you
         | can get your slower but more intelligent chat along with the
         | simpler but very fast, high context autocomplete.
        
       | typpo wrote:
       | If anyone wants to eval this locally versus codellama, it's
       | pretty easy with Ollama[0] and Promptfoo[1]:
        |     prompts:
        |       - "Solve in Python: {{ask}}"
        | 
        |     providers:
        |       - ollama:chat:codellama:7b
        |       - ollama:chat:codegemma:instruct
        | 
        |     tests:
        |       - vars:
        |           ask: function to return the nth number in fibonacci sequence
        |       - vars:
        |           ask: convert roman numeral to number
        |       # ...
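        | 
        | (Drop that in promptfooconfig.yaml and run "npx promptfoo eval"
        | to get the side-by-side results, if I remember the CLI right.)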
       | 
       | YMMV based on your coding tasks, but I notice gemma is much less
       | verbose by default.
       | 
       | [0] https://github.com/ollama/ollama
       | 
       | [1] https://github.com/promptfoo/promptfoo
        
       | danielhanchen wrote:
       | Made Code Gemma 7b 2.4x faster and use 68% less VRAM with Unsloth
       | if anyone wants to finetune it! :) Have a Tesla T4 Colab notebook
       | with ChatML:
       | https://colab.research.google.com/drive/19lwcRk_ZQ_ZtX-qzFP3...
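        | 
        | Rough shape of the Unsloth flow if you want a preview (the repo
        | id and hyperparameters here are simplified placeholders, the
        | notebook has the real ones):
        | 
        |     from unsloth import FastLanguageModel
        | 
        |     model, tokenizer = FastLanguageModel.from_pretrained(
        |         model_name="unsloth/codegemma-7b",  # placeholder repo id
        |         max_seq_length=8192,
        |         load_in_4bit=True,
        |     )
        |     model = FastLanguageModel.get_peft_model(
        |         model,
        |         r=16,
        |         lora_alpha=16,
        |         target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        |     )
        |     # then train with trl's SFTTrainer on ChatML-formatted data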
        
         | trisfromgoogle wrote:
         | Love to see your work, Daniel -- thank you, as always! Playing
         | with the Colab now =). Go Unsloth, and thanks from the Gemma
         | team!
        
           | danielhanchen wrote:
           | Thanks! :) Appreciate it a lot! :)
        
       ___________________________________________________________________
       (page generated 2024-04-09 23:01 UTC)