[HN Gopher] OpenCoder: Open Cookbook for Top-Tier Code Large Language Models
___________________________________________________________________
OpenCoder: Open Cookbook for Top-Tier Code Large Language Models
Author : pil0u
Score : 209 points
Date : 2024-11-09 17:27 UTC (5 hours ago)
(HTM) web link (opencoder-llm.github.io)
(TXT) w3m dump (opencoder-llm.github.io)
| TZubiri wrote:
| What is that "this http URL" thing in the first sentence of the
| abstract?
|
 | Is this slop?
| HerrMonnezza wrote:
 | arXiv replaces any URL in the abstract text with a link whose
 | text reads "this http URL"; it seems the authors did not know
 | this and just embedded a bare URL in their abstract.
| vasco wrote:
 | I think it mistook a typo, a missing space after the end of a
 | sentence, for a URL.
| johndough wrote:
 | I think this is the relevant code:
 |
 |     TLDS = "[a-z][a-z]+"
 |
 | https://github.com/arXiv/arxiv-base/blob/develop/arxiv/base/...
|
| A more restrictive TLD list would have prevented this, but
| I certainly don't want to be the one to add new TLDs all
| the time, so I can see why the code looks like it does.
| Mathnerd314 wrote:
 | Mozilla has a list, https://publicsuffix.org/list/, that's
 | relatively easy to keep updated. I'm sure there is some Python
 | wrapper library they could use.
| Retr0id wrote:
| Bad auto-URL-extraction, presumably. The PDF reads:
|
| > Large language models (LLMs) for code have become
| indispensable in various domains, including code generation,
| reasoning tasks and agent systems. While open-access code LLMs
| are increasingly approaching the performance levels of
| proprietary models,
|
| "systems.while" is obviously not a valid domain.
| 4b11b4 wrote:
| while.systems
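
 (For illustration, a minimal sketch of the failure mode under
 discussion: a permissive TLD pattern in the spirit of the TLDS
 regex quoted above treats a missing space such as "systems.While"
 as a host name, while a check against a known suffix list rejects
 it. This is a toy reconstruction, not arXiv's actual linkifier,
 and the tiny TLD set below stands in for the full Public Suffix
 List.)

     import re

     # Permissive pattern in the spirit of TLDS = "[a-z][a-z]+":
     # any run of two or more letters counts as a TLD.
     URL_RE = re.compile(r"\b[a-z0-9-]+\.(?:[a-z][a-z]+)\b", re.IGNORECASE)

     text = ("including code generation, reasoning tasks and agent "
             "systems.While open-access code LLMs are increasingly")
     print(URL_RE.findall(text))   # ['systems.While'] -- false positive

     # A restrictive check against a real suffix list would reject it
     # (tiny illustrative subset here; the Public Suffix List is long).
     KNOWN_TLDS = {"com", "org", "net", "io", "ai", "edu", "gov"}

     def looks_like_domain(candidate: str) -> bool:
         return candidate.rsplit(".", 1)[-1].lower() in KNOWN_TLDS

     print([m for m in URL_RE.findall(text) if looks_like_domain(m)])  # []
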
| atilimcetin wrote:
| Home page of that arxiv paper: https://opencoder-llm.github.io/
| dang wrote:
| Thanks! We've changed to that from
| https://arxiv.org/abs/2411.04905, which is also linked there.
| mistrial9 wrote:
| making a wild guess on the nationality of every author of this
| paper (1), and observing the number of authors, and observing the
| velocity and volume of similar papers.. it seems a pattern of
| "English language as a service to automated programming
| environments" appears to be very useful and relevant for people
| (nations?) that are wholly and firmly not English speaking..
|
| (1) is M-A-P or INFtech dot ai a well-known institutional
| affiliation?
| jstanley wrote:
| What are you trying to say here?
|
| I gave it a few tries but couldn't figure it out.
| jannyfer wrote:
| It seems proofreading-as-a-service would be very useful for
| mistrial9.
| rnewme wrote:
| Keep trying, you might get it.
| bbor wrote:
| To be clear: INFTech is a for-profit (I think...?) firm out of
| Shanghai, and MAP is an international FOSS collective
| (https://m-a-p.ai/about).
|
 | Speaking generally, a _lot_ of software engineering worldwide
 | is done in English, so it makes sense that they're training
 | models in English even if some/most of the researchers also
 | speak a Chinese language. Plus, Hugging Face is English-native,
 | and working on FOSS models (FOSLMs?) without targeting that
 | community would be like making a command-line accounting tool
 | and not immediately posting it to Hacker News.
|
| Your comment seems to imply some sort of hidden motivation, but
| idk, seems pretty straightforwardly benign to me! Plus it's
| hard to say how many papers are published in other languages
| about LLMs, considering we wouldn't read them.
| swyx wrote:
 | Someone on Twitter once referred to these as "WeChat papers"
 | and I can't get it out of my head.
| tontoncyber wrote:
 | Interesting paper and work, but the model doesn't seem to be
 | better than Qwen2.5-Coder in some languages, including Ruby.
| deepsquirrelnet wrote:
 | I've tried a bunch of models that are essentially different
 | instruction tunings of the same base models, and that seems to
 | be generally true in my experience. I don't think you can
 | fine-tune your way into a significantly better code model. At
 | best you get one that follows instructions better, not one
 | that writes noticeably better code or solves harder problems.
| tontoncyber wrote:
| I'm waiting for the 32B!
| https://news.ycombinator.com/item?id=42096027
| johndough wrote:
 | I was wondering why Figure 1 shows a HumanEval score of 61.6
 | for Qwen2.5-Coder-7B while Table 1 shows a score of 88.4, i.e.
 | better than this new model's 66.5.
|
| The reason is that those are actually two different models
| (Qwen2.5-Coder-7B-Base with 61.6, Qwen2.5-Coder-7B-Instruct with
| 88.4).
| marmaduke wrote:
| > Unlike most prior efforts, we release not only model weights
| and inference code, but also the reproducible training data,
| complete data processing pipeline, rigorous experimental ablation
| results, and detailed training protocols for open scientific
| research.
|
 | Regardless of the specific performance of this model versus
 | another, I think it's good to keep in mind that everyone
 | benefits from this kind of work.
| 4b11b4 wrote:
| plumbing is important
| hasnain99 wrote:
| nice
| v3ss0n wrote:
 | Tested it: so much hallucination that it can't hold a candle
 | to Qwen 2.5, or even to the general-purpose model Mistral-Nemo.
| bt1a wrote:
| To be fair, nothing comes close to Qwen2.5 atm
| v3ss0n wrote:
 | I don't know how they come out on top of Qwen on the HumanEval
 | bench when the quality is this poor.
| littlestymaar wrote:
 | This is something that's obvious to anyone playing with local
 | LLMs but doesn't seem to be that well known, even among tech
 | enthusiasts.
|
| Qwen is really ahead of the pack right now when it comes to
| weight-available models.
| drawnwren wrote:
| How does it compare to Claude?
| tomr75 wrote:
 | Which size are you using?
 |
 | I don't see why you would use it over Claude and 4o-mini with
 | Cursor unless you are working on a top-secret repo.
| rnewme wrote:
 | Not even DeepSeek Coder 2.5?
| viraptor wrote:
| Not according to the scores here
| https://github.com/QwenLM/Qwen2.5-Coder
| IshKebab wrote:
| What kind of hardware do you need to run this?
| smilebot wrote:
| >Due to the prevalence of forking and copy-pasting within the
| codebase, nearly 75% of files are completely duplicated.
|
 | This is surprisingly high. Does this include imported libraries
| and packages? Since you are hashing at the file level, I am not
| fully convinced that this is due to people copying entire files
| over without modification.
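
 (For context, file-level exact deduplication is typically just
 "hash the raw bytes of every file and keep one file per digest",
 so identical copies collapse no matter where they were forked or
 vendored from, while near-duplicates with small edits do not. A
 minimal sketch of that idea, not the paper's actual pipeline:)

     import hashlib
     from pathlib import Path

     def dedup_by_file_hash(root):
         """Group files under root by the SHA-256 of their raw bytes."""
         kept = {}        # digest -> first path seen with that content
         duplicates = []  # later paths whose content was already seen
         for path in Path(root).rglob("*"):
             if not path.is_file():
                 continue
             digest = hashlib.sha256(path.read_bytes()).hexdigest()
             if digest in kept:
                 duplicates.append(path)
             else:
                 kept[digest] = path
         return kept, duplicates

     # "Nearly 75% of files are completely duplicated" would correspond
     # to len(duplicates) / (len(kept) + len(duplicates)) being ~0.75.
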
___________________________________________________________________
(page generated 2024-11-09 23:00 UTC)