[HN Gopher] Codestral: Mistral's Code Model
___________________________________________________________________
Codestral: Mistral's Code Model
Author : alexmolas
Score : 345 points
Date : 2024-05-29 14:16 UTC (8 hours ago)
(HTM) web link (mistral.ai)
(TXT) w3m dump (mistral.ai)
| mousetree wrote:
| How does this compare to GitHub Copilot? It's not shown in their
| comparison.
| nkozyra wrote:
| Not sure how much current Copilot varies from the original
| Codex, but another set of benchmarks here:
| https://paperswithcode.com/sota/code-generation-on-humaneval
| ramon156 wrote:
| Knowing the training data GH has, I doubt it's comparable; then
| again, I don't have the benchmarks.
| ramon156 wrote:
| After typing this I tried the live chat out and it honestly
| seems a lot more promising than current GH Copilot, very
| nice!
| ssgodderidge wrote:
| Are you saying GH has more data than Codestral and therefore GH
| has a better model? Or that Codestral would be better because
| Codestral is not littered with "bad" code?
| nkozyra wrote:
| Bad code is obviously very subjective, but I would wager
| that GH places a much higher value on feedback mechanisms
| like stars, issues, PRs, velocity, etc. Their ubiquity
| likely allows them to automatically cherry-pick less "bad
| code."
| nicce wrote:
| Nothing prevents Mistral from doing the same if they want to.
| Issues and PRs are public information, exposed by
| APIs, and not that heavily rate-limited.
| rohansood15 wrote:
| Copilot primarily uses GPT-3.5, which is outclassed by
| Llama3-70B. And this model claims to be slightly better than
| Llama3-70B.
|
| Edit: For those who don't believe me,
| https://github.com/microsoft/vscode-copilot-
| release/issues/6.... GPT-4 for chat, 3.5 for code.
| jasonjmcghee wrote:
| GitHub Copilot uses GPT-3.5?
|
| I was under the impression it was a custom codex model with a
| surrogate local model as per
| https://github.blog/2023-02-14-github-copilot-now-has-a-
| bett...
|
| When did this change?
| Rastonbury wrote:
| When it first launched, I too didn't know they had
| changed the model from the original Codex, which came out around
| the same time as GPT-3.5.
| jasonjmcghee wrote:
| > GPT-4 for chat, 3.5 for code
|
| That thread is comparing sidebar chat to inline chat. Doesn't
| discuss code completions afaict.
| localfirst wrote:
| It's miles better.
|
| In fact I stopped using expensive GPT-4
|
| Codestral just works; it's quick, the output is accurate; it's
| kinda scary.
| Zambyte wrote:
| Link to the huggingface page:
| https://huggingface.co/mistralai/Codestral-22B-v0.1
| bloopernova wrote:
| Does anyone know of a link to a codegen comparison page? In other
| words, you write your request, and it's submitted to multiple
| codegen engines, so you can compare the output.
| rohansood15 wrote:
| Not the same, but we evaluated how good LLMs are at fixing code
| and just posted it on HN:
| https://news.ycombinator.com/item?id=40511689
| andruby wrote:
| This is an open weights 22B model. The download on Huggingface is
| 44GB.
|
| Is there a rule-of-thumb estimate for how much RAM this would
| need to be used locally?
|
| Is the RAM requirement the same for a GPU and "unified" RAM like
| Apple silicon?
| fnbr wrote:
| The rule of thumb is roughly 44GB: most models are trained
| in bf16 and require 16 bits per parameter, so 2 bytes. You
| need a bit more for activations, so maybe 50GB?
|
| You need enough RAM and HBM (GPU RAM), so it's a constraint on
| both.
| sharbloop wrote:
| Which GPU can I buy to run this model? Can it run on a
| consumer RTX 3090 or does it need a custom GPU?
| Havoc wrote:
| 3090 or 4090 will be able to run quantized 22B models.
|
| Though realistically for code completion smaller models
| will be better due to speed
| TechDebtDevin wrote:
| Easy..
| Novosell wrote:
| Most GPUs still use GDDR I'm pretty sure, not HBM. Do you
| mean VRAM?
| mauricio wrote:
| 22B params * 2 bytes (FP16) = 44GB just for the weights.
| Doesn't include KV cache and other things.
|
| When the model gets quantized to say 4bit ints, it'll be 22B
| params * 0.5 bytes = 11GB for example.
| tosh wrote:
| VRAM (GB) ≈ B × Q / 8
|
| B: number of parameters, in billions
|
| Q: quantization bits (16 = no quantization)
|
| via https://news.ycombinator.com/item?id=40090566
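|
| A quick sanity check in Python (a rough estimate only: it
| ignores KV cache and activation overhead):
|
|   def approx_vram_gb(params_billions: float, quant_bits: int = 16) -> float:
|       # bits per parameter / 8 = bytes per parameter; billions of
|       # parameters times bytes each gives GB of weights.
|       return params_billions * quant_bits / 8
|
|   print(approx_vram_gb(22, 16))  # bf16: 44.0 GB
|   print(approx_vram_gb(22, 4))   # 4-bit quant: 11.0 GB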
| TechDebtDevin wrote:
| I'm honestly not sure how to measure the amount of VRAM
| required for these models, but I suspect this would run
| relatively fast, depending on your use case, on a mid-to-high-
| end 20 or 30 series card. No idea about Apple unified RAM. I
| get a lot of performance out of even older cards such as a
| 1080 Ti, but haven't tested this model.
| wing-_-nuts wrote:
| Wait for a gguf release of this and it will fit neatly into a
| 3090 with a decent quant. I'm excited for this model and I'll
| be adding it to my collection.
| sashank_1509 wrote:
| Seems nice but some preliminary testing against GPT-4o shows it's
| lacking a bit. It does a pretty good job for easy questions
| though
| jasonjmcghee wrote:
| GPT-4o is really oddly hit or miss for code.
|
| Sometimes it outperforms GPT-4 in quality by a fair amount, and
| other times it starts repeating itself. Duplicating function
| definitions, even misremembering what things are named.
|
| It seems to have to do with length. If the output exceeds a few
| thousand tokens, it seems to experience some pretty bad failure
| modes.
| afro88 wrote:
| 4o can only output 4k tokens. So the training to complete an
| answer within 4k tokens is probably kicking in and nerfing
| the quality
| localfirst wrote:
| Personally this has performed consistently, and just as well
| as, if not better than, GPT-4.
|
| What strikes me is the consistency and the lack of the
| hallucination you got in GPT-4o, which made it unusable for any
| reliable code gen.
| swyx wrote:
| I've been noticing that there's a divergence in philosophy
| between Llama-style LLMs (Mistral are Meta alums, so I'm counting
| them in there) and OpenAI/GPT-style LLMs when it comes to code.
|
| GPT-3.5+ prioritized code very heavily - there's no CodeGPT, it's
| just GPT-4, and every version is better than the last.
|
| Whereas the Llama/Mistral models are now shipping the general
| language model first, then adding CodeLlama/Codestral with
| additional pretraining (it seems we don't know how many more
| tokens went into this one, but CodeLlama was 500B-1T extra tokens
| of code).
|
| Zuck has mentioned recently that he doesn't see coding ability as
| important for his use cases, whereas OpenAI is obviously betting
| heavily on code as a way to improve LLM reasoning for AGI.
| memothon wrote:
| >Zuck has mentioned recently
|
| That's a really surprising thing to hear, where did you see
| that? The only quote I've seen is this one:
|
| >"One hypothesis was that coding isn't that important because
| it's not like a lot of people are going to ask coding questions
| in WhatsApp," he says. "It turns out that coding is actually
| really important structurally for having the LLMs be able to
| understand the rigor and hierarchical structure of knowledge,
| and just generally have more of an intuitive sense of logic."
|
| https://www.theverge.com/2024/1/18/24042354/mark-zuckerberg-...
| imachine1980_ wrote:
| Makes sense: they want better interactions with users for
| WhatsApp, Instagram, and Facebook marketers; content creation
| and moderation; and their AI/AR glasses. In that context I
| don't see why they should push more effort into LLM coding.
| It's sad anyway.
| whoami_nr wrote:
| He mentioned it on the Dwarkesh podcast:
| https://www.youtube.com/watch?v=bc6uFV9CJGg
| tkellogg wrote:
| The OpenAI philosophy is that adding modes improves everything.
| Sure, it's astronomically expensive, but I tend to think
| they're on to something.
| guyomes wrote:
| > OpenAI is betting heavily on code as a way to improve LLM
| reasoning for AGI.
|
| And researchers from Google Deepmind, University of Wisconsin-
| Madison and Laboratoire de l'Informatique du Parallelisme,
| University of Lyon, actually publish some of their results in
| that direction [1,2].
|
| [1]: https://deepmind.google/discover/blog/funsearch-making-
| new-d...
|
| [2]: https://www.nature.com/articles/s41586-023-06924-6
| Rastonbury wrote:
| I thought that was the idea: open-source small, specific models
| that most people can run vs. general-purpose ones that require a
| massive number of GPUs.
| behnamoh wrote:
| > Zuck
|
| No, if anything he said Meta realized coding abilities make the
| model overall better, so they focused on those more than
| before.
| sebzim4500 wrote:
| Very impressed with it based on a short live chat, feels insanely
| fast considering its capability.
|
| chat.mistral.ai
| kergonath wrote:
| We'll see how fast it is on consumer hardware once decent
| quantisations are available.
| colesantiago wrote:
| I'm so happy that LLMs are now democratising access to
| programming, especially open models like what Meta is doing with
| Llama and what Mistral is doing with Codestral.
|
| The abundance of programming is going to allow almost everyone to
| become a great programmer.
|
| This is so exciting to see; each day programming comes closer to
| being a solved problem, so we can focus on other things.
| skydhash wrote:
| Shadow libraries did more to democratize anything than LLMs.
| And following a book like Elixir in Action (Manning) will get
| you there faster than chatting with LLMs or copilot generating
| code for you.
| smokel wrote:
| In my experience these tools amplify the quality of a
| programmer.
|
| I have seen good programmers dramatically increase their
| productivity, but I've also seen others copy-pasting for loops
| inside other for loops where one loop would definitely suffice.
| We're not quite there yet.
| croes wrote:
| I'm curious about the long-term effect.
|
| I observe a certain laziness in myself when it comes to
| certain problems. It's easier to ask an LLM and debug the
| provided code, but I ask myself if I'm losing some problem-
| solving capabilities in the long run because of this.
|
| Similar to the loss of speed in doing mental arithmetic
| because of calculators on the smartphone.
| bubbleRefuge wrote:
| Absolutely it amplifies. Complex and esoteric configuration
| of frameworks, for example, entails so much reading and
| Googling and can be very time consuming without AI. AI can
| help to bring custom software to the markets that could not
| otherwise afford to pay for it.
| icedchai wrote:
| I'm skeptical. I've run into people who used LLMs to code, then
| can't debug it without someone else's help. It may get you 80%
| there though.
| whiplash451 wrote:
| It does not get you 80% there if it achieves what you
| described. It rather gets you 100% into trouble.
| croes wrote:
| Programmer view vs management view.
|
| 100% of nothing vs 80% of enough.
|
| That's the risk of AI: not that AI already outperforms humans,
| but that managers believe it does. That, and the belief that
| writing code is the main work of programmers.
| icedchai wrote:
| I agree with you. I've had to debug some of that junk.
| Cyphase wrote:
| I've run into working programmers who were bad at debugging
| before LLMs existed.
| croes wrote:
| >The abundance of programming is going to allow almost everyone
| to become a great programmer.
|
| How do you become a great programmer if you don't really
| program?
| maskil wrote:
| I would argue the opposite is true.
|
| My experience with coding with LLMs is that the only thing
| they're really good at is generating boilerplate they have more-
| or-less seen before (essentially a library, even if it is
| somewhat adapted); however, they are incapable of the creative
| thinking that developers regularly need to engage in when
| architecting a solution for their use case.
| Kiro wrote:
| My experience is the opposite. When I started using Copilot I
| thought it would only be good at standard boilerplate, but I'm
| constantly surprised how well it understands my completely
| convoluted legacy architecture, which I barely understand
| myself even though I'm the only contributor.
| localfirst wrote:
| I've been on both sides of the fence here.
|
| The parent's problem, which I experienced -> it gets "stuck", and
| its limitation is the learning loop (humans are always asking
| why they get stuck and how to get unstuck); LLMs just power
| through without understanding what "stuck" is.
|
| For explaining an existing corpus or algorithm, it does a
| fantastic job.
|
| So likely we will see significant wage suppression in
| "agency/b2b enterprise" shops.
| huygens6363 wrote:
| This enables everyone to be a great programmer the way easily
| available power tools enable everyone to be a great carpenter
| and general craftsman.
|
| You'll get a lot of shitty stuff and the profession will get
| hollowed out losing attraction of the smart people. We'll be
| left with low-quality, disposable bullshit while wondering
| where all the programmers went.
| jhonatan08 wrote:
| Do we have a list of the 80+ languages it was trained on? I
| couldn't find it
| ddavis wrote:
| My favorite thing to ask the models designed for programming is:
| "Using Python write a pure ASGI middleware that intercepts the
| request body, response headers, and response body, stores that
| information in a dict, and then JSON encodes it to be sent to an
| external program using a function called transmit." None of them
| ever get it right :)
| bongodongobob wrote:
| Cool, you've identified that your prompt is inadequate for the
| task.
|
| 'Pray, Mr. Babbage, if you put into the machine wrong figures,
| will the right answers come out?'
| TechDebtDevin wrote:
| Damn, show us your brilliant prompt then. LLMs cannot do
| this, not even in Python, where there are libraries like
| BlackSheep that honestly make it a trivial task.
| bongodongobob wrote:
| My point is that you shouldn't expect to one shot
| everything. Have it start by writing a spec, then outline
| classes and methods, then write the code, and feed it debug
| stuff.
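|
| Something like this, sketched with the OpenAI Python client
| (purely illustrative: the model name and prompts are
| placeholders, and any chat-completion API works the same way):
|
|   from openai import OpenAI
|
|   client = OpenAI()  # or any OpenAI-compatible endpoint
|   history = []
|
|   def chat(prompt: str) -> str:
|       # Accumulate the conversation so each step sees prior output.
|       history.append({"role": "user", "content": prompt})
|       out = client.chat.completions.create(model="gpt-4o", messages=history)
|       reply = out.choices[0].message.content
|       history.append({"role": "assistant", "content": reply})
|       return reply
|
|   spec = chat("Write a short spec for <task>.")
|   outline = chat("Outline the classes and methods implementing that spec.")
|   code = chat("Write the complete code for the outline.")
|   # Then run the code and feed failures back:
|   # chat(f"Running the code raised this error, fix it:\n{traceback_text}")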
| bottom999mottob wrote:
| Exactly, expecting one-shot 100% working code from one
| prompt is ridiculous at this point. It's why tools
| like Aider are so useful: you can iteratively
| diff generated code until it's usable.
| TechDebtDevin wrote:
| Sure, it's impossible at this point, but the point of a
| benchmark isn't to complete the task; it's to test overall
| efficacy and to see progress. None of them hit 100% on even
| the simplistic Python benchmarks; that doesn't mean we
| shouldn't measure that capability. But sure, I get it. That's
| not how they are intended to be used, but that's also not the
| point the commenter was making.
| TechDebtDevin wrote:
| I see your point, but hand-holding isn't really a good way
| to benchmark a model's coding capabilities.
| Closi wrote:
| Depends if benchmarking is the aim, rather than
| decreasing the time it takes to build things.
| TechDebtDevin wrote:
| Well sure, but that wasn't what we were discussing. The
| original comment says they use that as their benchmark.
| While their coding task is a bit complex compared to
| other benchmarking prompts, it's not that crazy. Here is
| an example of prompts used for benchmarking with Python
| for reference:
|
| https://huggingface.co/datasets/mbpp?row=98
|
| At the end of the day LLMs in their current iteration
| aren't intended to do even moderately difficult tasks on
| their own but it's fun to query them to see progress when
| new claims are made.
| Closi wrote:
| The original comment says nothing about benchmarking;
| they just say that an AI can't one-shot their complex
| task?
| amne wrote:
| When I read
|
| _" My favorite thing to ask the models designed for
| programming is ....... None of them ever get it right"_
|
| I read "benchmark".
| ben_w wrote:
| Prompts like yours (I ask them for a fluid dynamics
| simulator which also doesn't succeed) inform us of the
| level they have reached. A useful benchmark, given how many
| of the formal ones they breeze through.
|
| I'm glad they can't quite manage this yet. Means I still
| have a job.
| Closi wrote:
| Break your prompt up into smaller pieces and it can.
| qeternity wrote:
| Taken to the extreme, a sufficiently broken down prompt
| is simply the code itself.
|
| The whole point is to prompt less?
| meiraleal wrote:
| > Taken to the extreme, a sufficiently broken down prompt
| is simply the code itself
|
| it is not. But the artifacts generated through the steps
| will be code. The last prompt will have most of the code
| supplied to it as the context.
| achierius wrote:
| A prompt is just a specification for an output. Code is
| just what we call a sufficiently detailed specification.
| buddhistdude wrote:
| No, he is right: he is saying "taken to the extreme". The
| point is that the more specific your prompts have to be,
| the more you are actually contributing to the result
| yourself, and the less the model is.
| Closi wrote:
| More practically, the whole point is to prompt enough to
| generate valid code.
| ddavis wrote:
| It's something I know how to do after figuring it out myself
| and discovering the potential sharp edges, so I've made it
| into a fun game to test the models. I'd argue that it's a
| great prompt (to keep using consistently over time) to see
| the evolution of this wildly accelerating field.
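|
| For the curious, a rough sketch of the kind of answer I'd
| accept (one valid shape among several; `transmit` is assumed to
| be supplied by the caller, and error paths are elided). The
| sharp edges are in wrapping both receive and send rather than
| trying to read the body eagerly:
|
|   import json
|
|   class CaptureMiddleware:
|       """Pure ASGI middleware: records the request body, response
|       headers, and response body, then JSON-encodes them for transmit."""
|
|       def __init__(self, app, transmit):
|           self.app = app
|           self.transmit = transmit
|
|       async def __call__(self, scope, receive, send):
|           if scope["type"] != "http":
|               await self.app(scope, receive, send)
|               return
|           captured = {"request_body": b"", "response_headers": [],
|                       "response_body": b""}
|
|           async def wrapped_receive():
|               # Accumulate streamed request body chunks as they pass through.
|               message = await receive()
|               if message["type"] == "http.request":
|                   captured["request_body"] += message.get("body", b"")
|               return message
|
|           async def wrapped_send(message):
|               # Record headers on response start, body chunks as they stream.
|               if message["type"] == "http.response.start":
|                   captured["response_headers"] = [
|                       [k.decode("latin-1"), v.decode("latin-1")]
|                       for k, v in message.get("headers", [])]
|               elif message["type"] == "http.response.body":
|                   captured["response_body"] += message.get("body", b"")
|               await send(message)
|
|           await self.app(scope, wrapped_receive, wrapped_send)
|           self.transmit(json.dumps({
|               "request_body": captured["request_body"].decode("utf-8", "replace"),
|               "response_headers": captured["response_headers"],
|               "response_body": captured["response_body"].decode("utf-8", "replace"),
|           }))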
| kergonath wrote:
| Do you notice any progress over time?
| AnimalMuppet wrote:
| How is that "putting in wrong figures"? It's a perfectly
| valid prompt, written in clear, proper English.
| sanex wrote:
| Can you get it right without an IDE?
| ddavis wrote:
| Nope, I don't know how to do it at all- that's why I have to
| ask AI!
| nicce wrote:
| I usually throw some complex Rust code with lifetime
| requirements at them and ask them to fix it. LLMs aren't capable
| of providing much help with that in general, other than in some
| very basic cases.
|
| The best way to get your work done is still to look in the Rust
| forums.
| meiraleal wrote:
| It works amazingly well for those who have never coded in
| Rust, at least in my experience. It took me a couple of hours
| and 120 lines of code to set up a WebRTC signaling server.
| meiraleal wrote:
| Interesting. My favorite thing to ask the models is to refactor
| code I've not touched for too long and this works very well.
| JimDabell wrote:
| I normally ask about building a multi-tenant system using async
| SQLAlchemy 2 ORM where some tables are shared between tenants
| in a global PostgreSQL schema and some are in a per-tenant
| schema.
|
| Nothing gets it right first time, _but_ when ChatGPT 4 first
| came out, I could talk to it more and it would eventually get
| it right. Not long after that though, ChatGPT degraded. It
| would get it wrong on the first try, but with every subsequent
| follow up it would forget one of the constraints. Then when it
| was prompted to fix that one, it forgot a different one. And
| eventually it would cycle through all of the constraints,
| getting at least one wrong each time.
|
| Since then benchmarks came out showing that ChatGPT "didn't
| really degrade", but all of the benchmarks seemed focused on
| single question/answer pairs and not actual multi-turn chat.
| For this kind of thing, ChatGPT 4 has never managed to recover
| to as good as it was when it was first released in my
| experience.
|
| It's been months since I've had to deal with that kind of code,
| so I might be forgetting something, but I just tried it with
| Codestral and it spat out something that looked reasonable very
| quickly on its first try.
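|
| For reference, the shape of answer I'm fishing for is roughly
| this (a sketch only; the table, column, and schema names are
| invented):
|
|   from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine
|   from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
|
|   class Base(DeclarativeBase):
|       pass
|
|   class Plan(Base):  # shared table, lives in an explicit global schema
|       __tablename__ = "plans"
|       __table_args__ = {"schema": "global"}
|       id: Mapped[int] = mapped_column(primary_key=True)
|       name: Mapped[str]
|
|   class Document(Base):  # per-tenant table, schema resolved at runtime
|       __tablename__ = "documents"
|       id: Mapped[int] = mapped_column(primary_key=True)
|       body: Mapped[str]
|
|   engine = create_async_engine("postgresql+asyncpg://user:pass@db/app")
|
|   def session_for(tenant_schema: str) -> AsyncSession:
|       # schema_translate_map rewrites unqualified (schema=None) tables
|       # to the tenant's schema; "global"-qualified tables are untouched.
|       return AsyncSession(engine.execution_options(
|           schema_translate_map={None: tenant_schema}))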
| checkyoursudo wrote:
| I had a similar experience. I was trying to get GPT-4 to
| write some R/Stan code for a bit of Bayesian modelling. It
| would get the model wrong, and then I would walk it through
| how to do it right, and by the end it would almost get it
| right; but on the next step it would be like, oh, this is
| what you want, and the output was identical to the first
| wrong attempt, which would start the loop over again.
| happypumpkin wrote:
| Similar experience using GPT4 for help with Apple's
| Accessibility API. I wanted to do some non-happy-path
| things, and it kept looping between solutions that failed to
| satisfy at least one of a handful of requirements that I
| had, in ways where I couldn't combine the different
| "solutions" to meet all the requirements.
|
| I was eventually able to figure it out with the help of
| some early 2010s blog posts. Sadly I didn't test giving it
| that context and having it attempt to find a solution again
| (and this was before web browsing was integrated with the
| web app).
|
| More of an issue than it not knowing enough to fulfill my
| request (it was pretty obscure so I didn't necessarily
| expect that it would be able to) was that it didn't mind
| emitting solutions that failed to meet the requirements. "I
| don't know how to do that" would've been a much preferred
| answer.
| alephxyz wrote:
| >It would get it wrong on the first try, but with every
| subsequent follow up it would forget one of the constraints.
| Then when it was prompted to fix that one, it forgot a
| different one. And eventually it would cycle through all of
| the constraints, getting at least one wrong each time.
|
| That drives me nuts and makes me ragequit about half the
| time. Although it's usually more effective to go and correct
| your initial prompt rather than prompt it again
| gyudin wrote:
| I ask software developers to do the same thing and give them
| the same amount of time. None of them ever write a single line
| of code :)
| dieortin wrote:
| Give an LLM all the time you want, and it will still not
| get it right. In fact, it will most likely give worse and
| worse answers with time. That's a big difference from a
| software developer.
| mypalmike wrote:
| My experience is very different. Often it (ChatGPT or
| Copilot, depending on what I'm trying to accomplish) gets
| things right the first time. When it doesn't, it's usually
| close enough that a bit of manual modification is all
| that's needed. Sometimes it's totally wrong, but I can
| usually point it in the right direction.
| shepardrtc wrote:
| gpt-4o gets it right on the first try for me. Just ran it and
| tested it.
| spmurrayzzz wrote:
| I love to ask it to "make me a Node.js library that pings an
| ipv4 address, but you must use ZERO dependencies; you may use
| only the native Node.js API modules".
|
| The majority of models (both proprietary and open-weight) don't
| understand that:
|
| - by inference, ping means we're talking about ICMP
|
| - ICMP requires raw sockets
|
| - Node.js has no native raw socket API
|
| You can do some CoT trickery to help it reason about the
| problem and maybe finally get it settled on a variety of
| solutions (usually some flavor of building a native add-on
| using C/C++/Rust/Go), or just guide it there step by step
| yourself, but the back and forth to get there requires a ton of
| pre-knowledge of the problem space which sorta defeats the
| purpose. If you just feed it the errors you get verbatim trying
| to run the code it generates, you end up in painful feedback
| loops.
|
| (Note: I never expect the models to get this right; it's just a
| good microcosmic but concrete example of where knowledge &
| reasoning meet actual programming acumen, so it's cool to see
| how models evolve to get better, if at all, at the task.)
| IMTDb wrote:
| Is there a way to use this within VSCode like Copilot, meaning
| having the "shadow code" appear while you code instead of having
| to go back and forth between the editor and a chat-like
| interface?
|
| For me, a significant component of the quality of these tools
| resides on the "client" side: being able to engineer a prompt
| that will yield accurate code being generated by the model.
| The prompt needs to find and embed the right chunks from the
| user's current workspace, or even from their entire org's repos.
| The model is "just" one piece of the puzzle.
| meiraleal wrote:
| I created a simple CLI app that does this in my workspace,
| which is under source control so after the LLM execution all
| the changes are highlighted by diff and the LLM also creates a
| COMMIT_EDITMSG file describing what it changed. Now I don't use
| chatgpt anymore, only this cli tool.
|
| I never saw something like this integrated directly on VSCode
| tho (and isn't my preferred workflow anyway, command line works
| better).
| pyepye wrote:
| Not using Codestral (yet) but check out Continue.dev[1] with
| Ollama[2] running llama3:latest and starcoder2:3b. It gives you
| a locally running chat and edit via llama3 and autocomplete via
| starcoder2.
|
| It's not perfect but it's getting better and better.
|
| [1] https://www.continue.dev/ [2] https://ollama.com/
| jmorgan wrote:
| Codestral was just published here as well:
| https://ollama.com/library/codestral
| sa-code wrote:
| This doesn't give the "shadow text" that the user
| specifically mentioned
| mijoharas wrote:
| Wow... That site (continue.dev) managed to consistently crash
| mobile Google Chrome for me.
|
| I've had the odd crash now and again, but I can't think of
| many sites that will reliably make it hard crash. It's almost
| impressive.
| croes wrote:
| You mean like in their example VS code integration shown here?:
|
| https://m.youtube.com/watch?v=mjltGOJMJZA
| jacekm wrote:
| The article says that the model is available in Tabnine, a
| direct competitor to Copilot.
| jdoss wrote:
| I have been using Ollama to run the Llama3 model and I chat
| with it via Obsidian using
| https://github.com/logancyang/obsidian-copilot and I hook
| VSCode into it with https://github.com/ex3ndr/llama-coder
|
| Having the chats in Obsidian lets me save them to reference
| them later in my notes. When I first started using it in VSCode
| for programming in Python, it felt like a lot of noise. It kept
| generating a lot of useless recommendations, but recently it
| has been super helpful.
|
| I think my only gripe is I sometimes forget to turn off my
| ollama systemd unit and I get some noticeable video lag when
| playing games on my workstation. I think for my next video card
| upgrade, I am going to build a new home server that can fit my
| current NVIDIA RTX 3090 Ti and use that as a dedicated server
| for running ollama.
| analyte123 wrote:
| The license for this [1] prohibits use of the model and its
| outputs for any commercial activity, or even any "live" (whatever
| that means) conditions, commercial or not.
|
| There seems to be an exclusion for using the code outputs as part
| of "development". But wait! It also prohibits "any internal usage
| by employees in the context of the company's business
| activities". However you interpret these clauses, this puts their
| claims and comparisons on completely unequal ground. They only
| compare to other open-weight models, not GPT-4 or Opus, but a
| normal company or individual can do whatever they want with the
| Llama weights and outputs. LangChain? "Your favourite coding and
| building environment"? Who cares? It seems you're not allowed to
| integrate this with anything else and show it to anyone, even as
| an art project.
|
| [1] https://mistral.ai/licenses/MNPL-0.1.md
| croes wrote:
| It's more like a demo version you can evaluate before you need
| to buy a commercial license.
|
| On whose code is Mistral trained?
| rvnx wrote:
| Your code, my code, etc. But there is a common pattern in law:
| copyright does not apply when you have billions.
|
| Examples: recurring infringement by Microsoft on open-
| source projects, Google scraping content to build their own
| database, etc.
| foobiekr wrote:
| There's some irony in the fact that people will ignore this
| license in exactly the same way Mistral and all the other LLM
| guys ignore the copyright and licensing on the works they
| ingest.
| nicce wrote:
| In many countries you can't even claim copyright on the output
| of an AI in order to use a license like this.
| hannasanarion wrote:
| Copyright on the software that produces something isn't the
| same as copyright on the output.
|
| The library's copyright is intact, as normal, and they can
| control who uses it and how just like any other software.
|
| The _output_ of AI systems is not copyrightable, but the
| systems themselves are, and associated EULAs are valid.
| nicce wrote:
| Is that so certain? To be able to make claims about _what_
| you can use the output for, can you do that without making any
| claims about control and ownership of the output?
|
| Of course, they can revoke your right to use the
| software, but if it went to court, it would be an
| interesting case.
| belter wrote:
| And nobody will sue anybody because suing
| means...discovery....
| hehdhdjehehegwv wrote:
| So basically I, as an open source author, had my code eaten
| up by Mistral without my consent, but if I want to use their
| code model I'm subject to a bunch of restrictions that
| benefit their bottom line?
|
| The problem these AI companies have is they live in a glass
| house and they can't throw IP rocks around without breaking
| their own "your content is our training data" foundation.
|
| The only reason I can think of that Google doesn't go after
| OpenAI for scraping YouTube is that they'd then put themselves
| in the same crosshairs, and may set a precedent they'd also be
| bound by.
|
| Given the model is "on the web" I have the same rights as
| Mistral to use anything online however I want without regard
| for IP, right?
|
| Utter absurdity.
| htrp wrote:
| call it an Enterprise poison pill.
| hehdhdjehehegwv wrote:
| But a pill they also have to swallow.
| dodslaser wrote:
| Enterprise suicide cult.
| IshKebab wrote:
| > So basically I, as an open source author, had my code
| eaten up by Mistral without my consent
|
| Not necessarily. You consented to people reading your code
| and learning from it when you posted it on Github. Whether
| or not there's an issue with AI doing the same remains to
| be settled. It certainly isn't clear cut that separate
| consent would be required.
| hehdhdjehehegwv wrote:
| I did not give consent to train on my software and the
| license does not allow commercial use of it.
|
| They have taken _my_ code and now are dictating how _I_
| can use their derived work.
|
| Personally I think these tools are useful, but if the
| data comes from the commons the model should also belong
| to the commons. This is just another attempt to gain
| private benefit from public work.
|
| There are legal issues to be resolved, and there is an
| explosion of lawsuits already, but the fact pattern is
| simple and applies to nearly all closed-source AI
| companies.
| portaouflop wrote:
| Mistral is as open as they get; most others are far
| worse. Here you can use the model without issues; as
| others are saying, it's doubtful they would sue you if you
| used code generated by the model in a commercial app.
| mananaysiempre wrote:
| Replit's replit-code[1,2] is CC BY-SA 4.0 for the
| weights, Apache 2.0 for the sources. Replit has its own
| unpleasant history[3], but the model's terms are good.
| (The model is not as good, but deciding whether that's a
| worthwhile tradeoff is up to you. The tradeoff exists and
| is meaningful, is my point.)
|
| [1] https://huggingface.co/replit/replit-code-v1-3b
|
| [2] https://huggingface.co/replit/replit-code-v1_5-3b
|
| [3] https://news.ycombinator.com/item?id=27424195
| Liquix wrote:
| MIT/BSD code is fair game, but isn't the whole point of
| GPL/AGPL "you can read and share and use this, but you
| can't take it and roll it into your closed commercial
| product for profit"? It seems like what Mistral and co
| are doing is a fundamental violation of the one thing GPL
| is striving to enforce.
| nullc wrote:
| Five years ago it would not have been at all controversial
| that these weights are not copyrightable in the US:
| they're machine-generated output on third-party data. Yet
| somehow we've entered a weird timeline where obvious
| copyfraud is fine, by the same entities that are at best on
| the line of engaging in commercial copyright _infringement_
| at a hitherto unforeseen scale.
| rohansood15 wrote:
| That license is just hilarious.
| das_keyboard wrote:
| OT, but 7.2 reads like the description of some Yu-Gi-Oh card
| or something:
|
| > Mistral AI may terminate this Agreement at any time [...].
| Sections 5, 6, 7 and 8 shall survive the termination of this
| Agreement.
| behnamoh wrote:
| From the website:
|
| > licensed under the new Mistral AI Non-Production License,
| which means that you can use it for research and testing
| purposes. ...
|
| Which basically means "we give you this model. Go find its
| weaknesses and report on r/locallama. Then we'll use that to
| improve our commercial model which we won't open-source."
|
| I'm sick of abusing the word "open-source" in this field.
| JimDabell wrote:
| > I'm sick of abusing the word "open-source" in this field.
|
| They don't call this open source anywhere, do they? As far as
| I can see, they only say it's open weights and that it's
| available under their Mistral AI Non-Production License for
| research and testing. That doesn't scream "open source" to
| me.
| demosthanos wrote:
| They do say "open-weight", which is I think still very
| misleading in this context. Open-weight sounds like it
| should be the same as open-source, just for weights instead
| of the full source (for example, training data and the code
| used to generate the weights may not be released). This
| isn't really "open" in any meaningful sense.
| Zambyte wrote:
| The fact that I can download it and run it myself is a
| pretty meaningful amount of openness to me. I can easily
| ignore their bogus claims about what I'm allowed to do
| with it due to their distribution model. I can't
| necessarily do the same with a proprietary service, as they
| can cut me off if the way I use the output makes them sad
| :(
| TeMPOraL wrote:
| > _The fact that I can downloaded it and run it myself is
| a pretty meaningful amount of openness to me_
|
| That's typically called _freeware_ , though.
| Zambyte wrote:
| The inference engine that I use to run open weight
| language models is fully free software. The model itself
| isn't really software in the traditional sense. So
| calling it ____ware seems inaccurate.
| TeMPOraL wrote:
| The interpreter is free software. The model is freeware
| distributed as a binary blob. Code vs. Data is a matter
| of perspective, but with large neural nets, more than
| anywhere, it makes no sense to pretend they're plain
| data. All the computational complexity is in the weights,
| they're very much code compiled for an unusual
| architecture (the inference engine).
| demosthanos wrote:
| > I can easily ignore their bogus claims about what I'm
| allowed to do with it due to their distribution model.
|
| If you're talking about exclusively personal use, sure.
| If you're talking about a business setting in a
| jurisdiction that Mistral can sue in, not so much.
|
| Being able to use it in a business setting is a pretty
| darn important part of what Open Source has always meant
| (it's why it exists as a term at all).
| boulos wrote:
| This is why I prefer the term "weights available" just
| like "source available". It makes it clear that you can
| get your hands on the copy, you could run this exact
| thing locally if they go out of business, etc. but it is
| definitely not open in the OSS sense.
| gyudin wrote:
| All their other models are "open source" and that was the
| selling point they built their brand on. I doubt they made
| their new model completely different from previous ones, so
| it's supposed to be open source too, unless they found some
| juridical loophole lol
| Rastonbury wrote:
| No, but they do say "empowering developers" and
| "democratising coding" as the subtitle. I guess only for those
| who pay.
| meiraleal wrote:
| > Who cares? It seems you're not allowed to integrate this with
| anything else and show it to anyone, even as an art project.
|
| Now they just lack the means to enforce it.
| localfirst wrote:
| impossible to enforce
| isoprophlex wrote:
| So, it's almost entirely useless with that license, because the
| average pack of corpo beancounters will never let you use it
| over whatever Microsoft has already sold them.
| batch12 wrote:
| If they can make agreements with arbitrary terms, why can't we?
| [0]
|
| [0] https://o565.com/content-ownership-and-licensing-agreement/
| croes wrote:
| >Usage Limitation
|
| - You shall only use the Mistral Models and Derivatives (whether
| or not created by Mistral AI) for testing, research, Personal, or
| evaluation purposes in Non-Production Environments;
|
| - Subject to the foregoing, You shall not supply the Mistral
| Models, Derivatives, or Outputs in the course of a commercial
| activity, whether in return for payment or free of charge, in any
| medium or form, including but not limited to through a hosted or
| managed service (e.g. SaaS, cloud instances, etc.), or behind a
| software layer
| jstummbillig wrote:
| How I interact with new model reports at this point: Open the
| page, ctrl + f, "gpt-4" and skip the rest.
| isoprophlex wrote:
| Does it do SQL, and if so, which dialects? I am having a hard
| time figuring out what it is actually trained on
| sebzim4500 wrote:
| They claim good results in a SQL benchmark but they don't
| specify what dialects it knows.
| esafak wrote:
| Are there any IDE plugins that index your entire code base in
| order to provide contextual responses AND let you pick between
| the latest models?
|
| If not, consider it a product idea ;)
| pmmucsd wrote:
| There are plugins for various IDEs that operate like Copilot
| but let you select the model you want to use; just supply your
| key. CodeGPT for JetBrains/Android Studio is pretty good. I
| think you can even use a model running locally.
| saturatedfat wrote:
| Supermaven, but you don't get model choice.
| elmariachi wrote:
| Cody by Sourcegraph allows you to do this. It doesn't have
| Codestral yet but probably will soon.
| jdorfman wrote:
| We are working on it.
| asadm wrote:
| How do people do infilling these days? In olden times, models
| used to provide a way to pass the suffix separately.
| artninja1988 wrote:
| This is a business model I can get behind. The model is under a
| non-commercial license, but it's open weights and they have their
| official API for it
| YetAnotherNick wrote:
| What's the business model for semi-open-source models like
| these? Is it just that they can't be fully closed, since they
| then have to compare with OpenAI? Who would pay for these models
| if better ones are available for cheaper from Anthropic or
| Google?
| mirekrusin wrote:
| Fifty shades of "open".
| isaacrolandson wrote:
| Will this run on an M3 48GB?
| piskov wrote:
| You'll need 44GB just for the weights.
|
| By default, only 75% of unified memory is available to the GPU
| if you have >36GB. So with 48GB total, only 36GB is available
| to the GPU, which is lower than 44.
|
| tl;dr: without quantization you will not be able to run it.
| Sytten wrote:
| Is there a VS Code extension that can plug in any model out
| there and give a similar experience to Copilot? I always want
| to try them but I can't be bothered to do a whole setup each
| time.
| gsuuon wrote:
| How does the Mistral non-production license work for
| small/hobby/indie projects? Has anyone tried to get approval for
| that kind of use?
| James_K wrote:
| > Democratising code
|
| Did y'all see what happened when they democratised art? I don't
| want to have a billion and one AI garbage libraries to sift
| through before I can find something reliable and human-made. At
| least the potential for creating horrific political software is
| slightly lower than with simple images.
| gavin_gee wrote:
| what the heck is this for, if you can't use it for commercial
| work?
| ein0p wrote:
| If I can't use the output of this in practical code-completion
| use cases, it's meaningless, because GH Copilot exists. Idk what
| they're thinking or what business model they're envisioning -
| Copilot is far and away the best model of this kind anyway.
___________________________________________________________________
(page generated 2024-05-29 23:00 UTC)