[HN Gopher] Code Llama, a state-of-the-art large language model ...
___________________________________________________________________
Code Llama, a state-of-the-art large language model for coding
Author : marcopicentini
Score : 435 points
Date : 2023-08-24 13:26 UTC (9 hours ago)
(HTM) web link (ai.meta.com)
(TXT) w3m dump (ai.meta.com)
| redox99 wrote:
| The highlight IMO
|
| > The Code Llama models provide stable generations with up to
| 100,000 tokens of context. All models are trained on sequences of
| 16,000 tokens and show improvements on inputs with up to 100,000
| tokens.
|
| Edit: Reading the paper, key retrieval accuracy really
| deteriorates after 16k tokens, so it remains to be seen how
| useful the 100k context is.
| brucethemoose2 wrote:
| Did Meta add scalable RoPE to the official implementation?
| snippyhollow wrote:
| We changed RoPE's theta from 10k to 1M and fine-tuned with
| 16k-token-long sequences.
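|
| For intuition: RoPE rotates each query/key dimension pair at
| frequencies theta^(-2i/d), so raising theta slows every
| rotation and keeps far-apart positions distinguishable over
| longer spans. A minimal sketch of the frequency computation
| in Python (illustrative only, not the official
| implementation; names are made up):
|
|     import torch
|
|     def rope_angles(head_dim, max_pos, theta=1_000_000.0):
|         # Per-pair inverse frequencies: theta^(-2i/d).
|         # Raising theta from 1e4 to 1e6 slows the rotation,
|         # which is what stretches the usable context.
|         exps = torch.arange(0, head_dim, 2).float() / head_dim
|         inv_freq = theta ** (-exps)
|         pos = torch.arange(max_pos).float()
|         angles = torch.outer(pos, inv_freq)
|         return torch.cos(angles), torch.sin(angles)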
| lucidrains wrote:
| to give a bit of inspiration to the open source community,
| this trick was discovered by a random redditor:
| https://www.reddit.com/r/LocalLLaMA/comments/14lz7j5/ntkawar...
| cool to see it applied at scale!
| malwrar wrote:
| Curious, what led you to adjust the parameters this way?
| Also, have you guys experimented with ALiBi[1] which claims
| better extrapolative results than rotary positional
| encoding?
|
| [1]: https://arxiv.org/abs/2108.12409 (charts on page two
| if you're skimming)
| ttul wrote:
| Undoubtedly, they have tried ALiBi...
| nabakin wrote:
| Looks like they aren't releasing a pretty interesting model,
| either. In the paper they mention an "Unnatural Code Llama"
| which wipes the floor with every other model/finetune on
| every benchmark, except for slightly losing to Code Llama
| Python on MBPP pass@100 and slightly losing to GPT-4 on
| HumanEval pass@1, which is insane.
|
| Meta says later on that they aren't releasing it and give no
| explanation. I wonder why given how incredible it seems to be.
| EvgeniyZh wrote:
| Note that current GPT-4 pass@1 for HumanEval is closer to 90%
| than to 67% reported in GPT-4 technical report, as reported,
| e.g., in [1]
|
| [1] https://arxiv.org/abs/2305.01210
| nabakin wrote:
| Good point, I guess Meta should be using that number in
| their chart
| jonchurch_ wrote:
| The paper states it was instruction fine-tuned with synthetic
| data (LLM-generated instructions) a la another paper
| ("Unnatural Instructions: Tuning Language Models with
| (Almost) No Human Labor").
|
| The github repo associated with that paper is linked below.
| It links to the paper on arxiv, but also has some data in the
| repo.
|
| https://github.com/orhonovich/unnatural-instructions
| ilaksh wrote:
| Maybe they used GPT-4 to train it. OpenAI terms of use
| don't allow that to be released commercially.
| nabakin wrote:
| I've seen this argued a lot but is it fact? OpenAI was
| able to train on data from other platforms and surely,
| those platforms weren't letting their data go if they
| could help it. Unless some new laws have been passed, I
| don't think OpenAI can legally prevent others from using
| their data to train models. OpenAI can't have their cake
| and eat it too. After all, any content generated by AI
| can't be copyrighted.
| lhl wrote:
| It is indeed a fact that OpenAI's Terms of Use do state
| that you can't use their service to develop competing
| models: Section 2.c.iii -
| https://openai.com/policies/terms-of-use
|
| Now of course, the terms are not the law (so don't govern
| the use of the generated data by any third party), they
| are an agreement between two parties. If you did click
| "agree" then that's a binding agreement and there could
| be legal/contractual repercussions (some of which are
| outlined in the terms).
| haldujai wrote:
| That seems like a likely explanation: they probably won't
| get into legal trouble for using an OpenAI model in a
| research paper, but redistributing said model may be
| upsetting enough for OpenAI to trigger a legal challenge.
|
| Unnatural Instructions used text-davinci-002, although that
| was a while ago; they only say "similarly" in this paper and
| don't specify what they used. I can't see a reason why they
| wouldn't be releasing it if the unnatural prompts were
| generated by the LLaMA-2 family.
|
| In any case, replicating this training seems trivial and
| very cheap compute-wise for anyone who wanted to do it.
| nkohari wrote:
| This is the most likely explanation for both why they
| wouldn't release it and wouldn't explain why.
| kapp_in_life wrote:
| Likely trained on internal code.
| mediaman wrote:
| That model is trained on synthetically AI-generated code,
| not internal code.
|
| It suggests that synthetic training could be the future of
| increasing the capability of smaller models (and perhaps
| bigger ones too). AI will train AI.
| sroussey wrote:
| That is the basis for https://synthesis.ai/
| SubiculumCode wrote:
| I'm an amateur, but it seems to me that methods to
| synthesize will have to be distinct from methods of the
| generative model.
| haldujai wrote:
| I thought this specific model was referring to self-
| instruction using both synthetic prompts (generated from
| few-shot in-context prompting of presumably some OpenAI
| model; the original paper used text-davinci-002) and
| synthetic code (presumably from Code Llama 7B, as for self-
| instruct), subsequently validated with execution.
|
| The differences are that it's not just training on
| unvalidated synthetic data, and that this specific method
| (per the Unnatural Instructions paper) results in increased
| instruction diversity, which confers some added advantage
| and, I'm assuming, explains the performance gain over the
| also-synthetic self-instruct code.
|
| I may be misunderstanding, but this seems more nuanced than
| just training on synthetically AI-generated code; it is more
| a validation of synthetic instructions (i.e. the low-
| resource setting) than of synthetic code (i.e. the high-
| resource setting).
| riku_iki wrote:
| > The Code Llama models provide stable generations with up to
| 100,000 tokens of context.
|
| what is the trick to achieve 100k context? They can't just
| use a 100k-wide transformer layer; that would be cost
| prohibitive, right?..
| littlestymaar wrote:
| I'm pretty sure they don't do that. But for code, the
| relevant relationship between two tokens is easy to
| determine from the semantics of the language alone (for
| instance, you can say that tokens related to a local
| variable have no relationship with tokens outside it), so it
| would lead to a sparse attention matrix in the transformer,
| reducing the cost of big contexts by a lot. But it would
| require language-specific preprocessing, and whether you
| could make it fast is also dubious. I don't think it's been
| tried so far.
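|
| As a hypothetical sketch of that idea (not an existing
| system): given per-token scope labels from language-specific
| preprocessing, the sparse mask could let a token attend only
| within its lexical scope plus a few "global" tokens such as
| imports and signatures:
|
|     import torch
|
|     def scope_attention_mask(scope_ids, global_idx):
|         # scope_ids: per-token scope labels (hypothetical,
|         # produced by a language-aware preprocessor).
|         # global_idx: indices of always-visible tokens.
|         ids = torch.tensor(scope_ids)
|         n = ids.shape[0]
|         same_scope = ids.unsqueeze(0) == ids.unsqueeze(1)
|         is_global = torch.zeros(n, dtype=torch.bool)
|         is_global[global_idx] = True
|         causal = torch.tril(torch.ones(n, n, dtype=torch.bool))
|         # True = attention allowed.
|         return (same_scope | is_global.unsqueeze(0)) & causal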
| gdcbe wrote:
| Are there docs somewhere that show how to run this on your
| local machine, and can you make it port a script between
| languages? GPT-4 can do that pretty well but its context is
| too small for advanced purposes.
| ChatGTP wrote:
| [flagged]
| jtwaleson wrote:
| This is probably a stupid question, but would it be possible to
| use these models to rate existing code and point to possible
| problems, rather than generating new code? That would be
| extremely useful for some use cases I'm working on.
| dangerwill wrote:
| It's really sad how everyone here is fawning over tech that will
| destroy your own livelihoods. "AI won't take your job, those who
| use AI will" is purely short term, myopic thinking. These tools
| are not aimed to help workers, the end goal is to make it so you
| don't need to be an engineer to build software, just let the
| project manager or director describe the system they want and
| boom there it is.
|
| You can scream that this is progress all you want, and I'll grant
| you that these tools will greatly speed up the generation of
| code. But more code won't make any of these businesses provide
| better services to people, lower their prices, or pay workers
| more. They are just a means to keep money from flowing out of the
| hands of the C-Suite and investor classes.
|
| If software engineering becomes a solved problem then fine, we
| probably shouldn't continue to get paid huge salaries to write it
| anymore, but please stop acting like this is a better future for
| any of us normal folks.
| criley2 wrote:
| You can say this about every major invention. The loom
| destroyed jobs! The engine destroyed jobs! So on and so forth.
|
| This view is critically flawed in two major ways:
|
| 1) AI is not anywhere near being able to replace the majority
| of what developers do on a product team. We are decades away
| from a PM at Facebook being able to type "make a twitter clone
| that uses instagram login and can scale to 1 billion" and have
| an AI just do it.
|
| 2) Programming and product work is not zero sum. The more we
| can do means the more product we can make. It means more
| products can be made overall. After the loom came out, we
| simply made more clothes than ever before and in the process
| created a ton of jobs. We are not at some peak software point
| where we've completely saturated all humanity's need for
| software or profitable software and thus tools that increase
| efficiency don't put us out of work.
|
| And frankly, if we develop the kind of general AI that accepts a
| query like "make a facebook competitor capable of scaling to 10
| billion" and simply does it, inventing whatever languages,
| frameworks, hardware, processors, patterns, methodologies,
| global datacenters handling global politics and law, etc, etc
| necessary to accomplish such a task, then so be it. I welcome
| the overlords!
| bbor wrote:
| We have three options, IMO:
|
| 1. As a species decide to never build another LLM, ever.
|
| 2. Change the path of society from the unequal, capitalist one
| it's taken the last 200-300 years.
|
| 3. Give up
|
| I know which I believe in :). Do you disagree?
| sp332 wrote:
| Finish #2 first, or else people with access to more money and
| infrastructure will use LLMs to increase inequality even
| further.
| pbhjpbhj wrote:
| The problem with 2 is that the people in power, and with
| immense wealth, remain there because of capitalism. They have
| the political power, and the resources, to enact the change
| ... but they also lose the most (unless you count altruism as
| gain; if that were true, the world would be so different).
|
| We could structure things so that LLM, and the generalised
| AIs to come, benefit the whole of society ... but we know
| that those with the power to make that happen want only to
| widen the poverty gap.
| bbor wrote:
| Yes but the common man has won before! There has never been
| a perfect revolution/paradigm shift (personally anti
| utopia-through-intense-bloodshed, so hesitant to use the
| former term alone), but there have been many, all of which
| were against the wishes of those in power.
|
| Plus, if these AIs are enough to change everything, that
| kinda implies that we've developed flexible, reliable AGI
| systems. In such a world, everything changes - maybe the
| calculus of The Powerful Few vs. The Oppressed Masses
| changes too! It might even change in our favor, if we're
| terribly lucky...
| Draiken wrote:
| I agree. This will ultimately become another way to extract
| more value straight into the pockets of the owners.
|
| Unfortunately, I don't believe there's a way to stop (or even
| slow down) this train. We can't defeat it, so the only logical
| answer is to join it.
|
| It's the classical issue with progress removing jobs. In
| today's world, since mostly everyone (aside from the
| capitalists themselves) relies on jobs to survive, barring a
| complete switch from capitalism (which will not happen in our
| lifetimes), we're fucked.
|
| Next best thing we can do is to try and democratize it enough
| so that not only the rich have access to it.
| manicennui wrote:
| I'm not worried because it solves a problem for the semi-
| competent.
| tomr75 wrote:
| Improve productivity, cheapen goods and services. Nature of
| technological advancement
| lamp987 wrote:
| "If software engineering becomes a solved problem"
|
| It will simply move to a higher level of abstraction.
|
| Remind me, how many programmers today are writing in assembly?
| blibble wrote:
| due to the way LLMs work: it will be able to handle that
| level of abstraction in exactly the same way
| lamp987 wrote:
| LLM is not strong AI. It's not even AI.
|
| So no. You will always need that strong "I" at the top
| somewhere.
| blibble wrote:
| I think you're in for a nasty surprise
| lamp987 wrote:
| Two more weeks!
| lsmeducation wrote:
| It's less of a concern if you are in mid career. But someone
| should warn all these college kids that are going into comp
| sci. I don't think this will be the kind of lucrative field
| they think it's going to be over the course of a 40 year
| career.
|
| The days of getting paid well for making CRUD are numbered
| (which is what most of us do, even in the most interesting
| problem spaces).
|
| # need a front end boilerplate that hits a backend with the
| following end points. REST api for movies catalogue, and a
| corresponding ui. Oh, unit tests please. Go with a responsive
| design and also make a React Native version (matter of fact
| provision it to my iPhone). Decide between Heroku or AWS, set
| up deploy with git hooks.
|
| # scrape IMDb for initial population of the db
|
| # I think a Reddit like comment system would be good to add, so
| add it. No upvote/downvote though
|
| # handle user login with google/fb/email
|
| # also make an admin page to manage all this
|
| I guess the technical product designer will be the new unicorn.
| int_19h wrote:
| _Any_ improvement in tooling benefits capital owners the most,
| since the productivity gains mostly end up in their pockets.
|
| But the answer to that is to deal with concentration of
| capital, not to eschew better tools.
| bryanlyon wrote:
| Llama is a very cool language model, and its use for coding
| was all but inevitable. I especially love that it's being
| released openly for everyone.
|
| I do wonder about how much use it'll get, seeing as running a
| heavy language model on local hardware is kinda unlikely for most
| developers. Not everyone is running a system powerful enough to
| equip big AIs like this. I also doubt that companies are going to
| set up large AIs for their devs. It's just a weird positioning.
| outside1234 wrote:
| ... "seeing as running a heavy language model on local hardware
| is kinda unlikely for most developers"
|
| for now it is :) but with quantization advances etc. it is not
| hard to see the trajectory.
| ctoth wrote:
| As we all know, computers stay the same and rarely improve.
| int_19h wrote:
| 12GB of VRAM lets you run 13B models (4-bit quantized) with
| reasonable speed, and can be had for under $300 if you go for
| previous-generation NVidia hardware. Plenty of developers
| around with M1 and M2 Macs, as well.
| ilaksh wrote:
| https://github.com/facebookresearch/codellama
| [deleted]
| ceejayoz wrote:
| This is 404ing now. (Not your fault, the email's link is
| similarly broken.)
| ilaksh wrote:
| Really? Works for me.
| ceejayoz wrote:
| It's back now.
| bracketslash wrote:
| So uhh...how does one go about using it?
| andrewjl wrote:
| What I found interesting in Meta's paper is the mention of
| HumanEval[1] and MBPP[2] as benchmarks for code quality.
| (Admittedly maybe they're well-known to those working in the
| field.)
|
| I haven't yet read the whole paper (nor have I looked at the
| benchmark docs, which might very well cover this), but I'm
| curious how these are designed to avoid issues with
| overfitting. My thinking here is that canned algorithm-type
| problems common in software engineering interviews are
| probably over-represented in the training data used for
| these models, which might point to artificially better
| performance by LLMs versus their performance on the more
| domain-specific tasks they might be used for in day-to-day
| work.
|
| [1] https://github.com/openai/human-eval
|
| [2] https://github.com/google-research/google-
| research/tree/mast...
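|
| For reference, the pass@k numbers these benchmarks report
| are computed with an unbiased estimator: generate n samples
| per problem, count the c that pass the tests, and estimate
| pass@k = 1 - C(n-c, k) / C(n, k). A direct transcription of
| the formula (following the HumanEval paper linked above):
|
|     import numpy as np
|
|     def pass_at_k(n, c, k):
|         # 1 - C(n-c, k) / C(n, k), expanded as a running
|         # product for numerical stability.
|         if n - c < k:
|             return 1.0
|         return 1.0 - np.prod(
|             1.0 - k / np.arange(n - c + 1, n + 1))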
| gw67 wrote:
| In your opinion, why does Meta do this?
| chaorace wrote:
| To a certain extent, I think it's just IBM disease. A company
| the size of Meta is expected to have an AI research department
| like Microsoft or Google, even if their core business (social
| media) derives relatively less benefit from the technology.
|
| Pretend you're an uncreative PM on an AI team; what part of
| Facebook or VR could you feasibly improve by iterating on LLMs?
| Perhaps the content moderation system... but that would require
| wrangling with the company ethics committee, and someone else
| at the company probably already took ownership of that idea.
| You've
| gotta do _something_ compelling or else your ML engineers are
| going to run off somewhere else.
|
| If I were to ask my ML engineers about what they wanted to work
| on, they're going to avoid areas where their model is outgunned
| (i.e.: chat) and instead prefer lower hanging fruit which
| generalizes well on a resume (i.e.: "Pioneered and published
| key innovations in LLM code-generation").
|
| Of course, the alternative answer is that Meta wants to replace
| all of their jr. developers with GPUs, but I think their
| leadership is a little too preoccupied with VR to be actively
| pushing for such a transformative initiative in anything more
| than a very uninvested capacity (e.g.: "Sure I'll greenlight
| this. Even if it doesn't pay off I don't have any better
| ideas")
| maccam912 wrote:
| It appears we do have a 34B version now, which never appeared
| for the non-fine-tuned Llama 2.
| jspisak wrote:
| It would be interesting to understand whether a ~30B Llama-2
| model would be useful, and for what reasons.
| hnuser123456 wrote:
| It would fit on the 24GB top-end consumer graphics cards with
| quantization.
| brucethemoose2 wrote:
| Llama 34B is _just_ big enough to fit on a 24GB consumer (or
| affordable server) GPU.
|
| It's also just the right size for llama.cpp inference on
| machines with 32GB RAM, or 16GB RAM with an 8GB+ GPU.
|
| Basically it's the most desirable size for AI finetuning
| hobbyists, and the quality jump from llama v1 13B to llama v1
| 33B is huge.
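|
| The arithmetic is easy to sanity-check (a rough sketch; the
| 20% overhead factor for KV cache and activations is a guess):
|
|     def approx_vram_gib(params_billion, bits, overhead=1.2):
|         # Weights take params * bits/8 bytes, plus slack
|         # for the KV cache and activations.
|         bytes_ = params_billion * 1e9 * bits / 8 * overhead
|         return bytes_ / 2**30
|
|     approx_vram_gib(34, 4)  # ~19 GiB: squeezes onto 24GB
|     approx_vram_gib(70, 4)  # ~39 GiB: does not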
| Tostino wrote:
| Better reasoning and general performance than 13b by far (if
| llama1 was any indication), and like the other user said, can
| fit on a single 24gb vram gaming card, and can be peft fine-
| tuned with 2x 24gb cards.
| airgapstopgap wrote:
| Llama-1-33B was trained on 40% more tokens than
| LLama-1-13B; this explained some of the disparity. This
| time around they both have the same data scale (2T
| pretraining + 500B code finetune), but 34B is also using
| GQA which is slightly more noisy than MHA. Furthermore,
| there have been some weird indications in the original
| LLama-2 paper that 34B base model is something... even more
| special, it's been trained on a separate internal cluster
| with undervolted/underclocked GPUs (though this in itself
| can't hurt training results), its scores are below
| expectations, it's been less "aligned". Here, Code-Llama-
| Instruct-13B is superior to 34B on HumanEval@1. So yes,
| it's desirable but I wouldn't get my hopes up.
| lolinder wrote:
| Does anyone have a good explanation for Meta's strategy with AI?
|
| The only thing I've been able to think is they're trying to
| commoditize this new category before Microsoft and Google can
| lock it in, but where to from there? Is it just to block the
| others from a new revenue source, or do they have a longer game
| they're playing?
| brandall10 wrote:
| On Lex Fridman, Mark said the strategy is to attract talent
| while keeping the playing field level (not a fan of big tech
| moating this up).
| aent wrote:
| Zuckerberg talked about it on Lex Fridman podcast
| https://youtu.be/Ff4fRgnuFgQ?t=1113
| [deleted]
| nologic01 wrote:
| Clearly the research team at Meta knows the domain as well as
| anybody, has access to a data trove as large as anybody's,
| and their distribution capability is as large-scale as
| anyone's.
|
| If their choice right now is not to try to overtly monetize
| these capabilities but instead to commoditize and
| "democratize" what others are offering, it suggests they
| think that a proprietary monetization route is not available
| to them. In other words, it is not that they are leaving
| money on the table; they think that (at least right now)
| there is no money on the table that they can get to.
|
| Rather than remaining quiet and isolated, the best alternative
| - their conjectured thinking goes - is to show up as they do,
| buying up good will with various stakeholders, maintaining
| mindshare internally and externally etc.
|
| Assuming that the above reading is correct, it still leaves
| various options as to why they may have come to that
| conclusion. For example, reasoning about the future of this
| sector, they might be thinking that there is no real
| technical moat, so they simply accelerate that reality to
| gain some brownie points.
|
| It may also be for idiosyncratic reasons specific to their
| own business model (data privacy challenges and how any AI
| monetization will mesh with all that). The drawback of being
| the elephant in the room is that there is not much room to
| move.
|
| The nature of their long game depends on which of the decision
| branches carries more weight. Maybe it is wait-and-see until
| others clear up the regulatory hurdles. Or keep the engines
| running until the real and irreducible added value of LLM algos
| and the like becomes clear.
| CuriouslyC wrote:
| There really is no technical moat. Any new architectures are
| going to be published because that's 100% the culture and AI
| folks won't work somewhere where that's not true. Training
| details/hyperparameters/model "build-ons" aren't published
| but those are a very weak moat.
|
| The only moat that is meaningful is data and they've got that
| more than any other player save maybe google. Publishing
| models doesn't erode that moat, and it's not going anywhere
| as long as facebook/whatsapp/instagram rule "interactive"
| social.
| BryanLegend wrote:
| Facebook could sure use the good will. They are winning
| plenty of mine here.
| losteric wrote:
| Well, Facebook is a walled garden. Perhaps the board hopes
| that free, highly capable LLMs will continue degrading the
| internet outside those walls, thus acting as a moat for
| their money printer.
| emmender1 wrote:
| the only beneficiaries of this are the hardware vendors
| (Nvidia and AMD), and startups which get these foundation
| models for free,
|
| because language models are a complementary product, and the
| complement must be commoditized as a strategy.
|
| I see AMD as the bigger beneficiary since, very soon, AMD
| will equal Nvidia for inference and fine-tuning, but AMD has
| a long way to go to reach parity in foundation model
| training.
| smoldesu wrote:
| > and startups which get these foundation models for free.
|
| It's licensed non-commercially, so I'm not sure what those
| startups stand to gain.
|
| > since, very soon, amd will equal nvidia for inference and
| fine-tuning
|
| Source? If you're referring to Olive, it is indeed impressive
| but also has caveats:
|
| 1. It is just as proprietary as CUDA or CoreML.
|
| 2. You need a copy of Windows and licensed DirectX to use
| those optimizations.
|
| 3. AMD only matches Nvidia's inferencing performance when
| comparing Olive to Pytorch. Olive-to-Olive comparisons will
| still reflect an Nvidia lead.
|
| I don't think AMD has the capability to equal Nvidia in the
| short-term. It will take longtime software investments from
| across the industry to shake Nvidia's yoke.
| artninja1988 wrote:
| Llama2 is not licensed non-commercially. There was a weird
| provision at Llama launch barring companies with more than
| 700 million monthly active users, though.
| jspisak wrote:
| If you watch the Connect talks, I'll be speaking about this..
| currio wrote:
| Excited! I hope your talks are just as informative as this
| comment. Keep rocking!
| lolinder wrote:
| Sorry--who are you and what are the Connect talks? I haven't
| heard of them and you don't have a bio.
| jph00 wrote:
| That'll be Joseph Spisak, Head of Generative AI Open Source
| at Meta AI.
| dewey wrote:
| I guess that's what he is referring to:
| https://www.linkedin.com/posts/jspisak_home-fully-
| connected-...
| titaniumtown wrote:
| what is that?
| darrenf wrote:
| Facebook Connect is what used to be called Oculus Connect.
| Kinda their equivalent of Apple's WWDC, I guess. It's when
| and where the Quest 3 will be officially unveiled in full,
| for example.
| jspisak wrote:
| Yep - here is the site:
| https://www.metaconnect.com/en/home
| coder543 wrote:
| I wish that Meta would release models like SeamlessM4T[0]
| under the same license as llama2, or an even better one. I
| don't understand the rationale for keeping it under a
| completely non-commercial license, but I agree that is better
| than not releasing anything at all.
|
| There seem to be opportunities for people to use technology
| like SeamlessM4T to improve lives, if it were licensed
| correctly, and I don't see how any commercial offering from
| smaller companies would compete with anything that Meta does.
| Last I checked, Meta has never offered any kind of
| translation or transcription API that third parties can use.
|
| Whisper is licensed more permissively and does a great job
| with speech to text in some languages, and it can translate
| to English only. However, it can't translate between a large
| number of languages, and it doesn't have any kind of text to
| speech or speech to speech capabilities. SeamlessM4T seems
| like it would be an all-around upgrade.
|
| [0]:
| https://github.com/facebookresearch/seamless_communication
| jspisak wrote:
| Yeah - different projects have different goals and licenses
| aren't one-size-fits-all. Depending on the project, type of
| technology, goals, etc., we will select or even develop the
| right license that aligns with those goals. Hope this helps
| :)
| politelemon wrote:
| > Microsoft
|
| But they're a partner in Llama too. Why is Microsoft in this
| space as well, and how do they benefit?
| azeirah wrote:
| Microsoft is a hosting partner, there's an Azure service for
| hosted private LLaMa inference for business. Being a go-to
| hosting provider for SoTA AI is of course a very good thing
| for Microsoft.
| yomlica8 wrote:
| vessenes and rvz kind of sum up, for me, the idea I think
| they're going for.
|
| AI has no moat, but many players are still in denial about
| this. Microsoft and others might have tight enough control
| that they can use a product-dumping strategy to get people
| dependent on their implementation and then start charging,
| but that isn't a delusion Meta can have.
|
| That max-revenue license they used with the models seemed
| fairly clever to me. It will seed the environment with
| players that base their products on Meta tech, in return for
| being born with a poison pill that prevents big players
| (other than Meta) from buying them. This is a long-term play
| that may not really work, but it creates the potential for
| big opportunities. And even if it doesn't work out, denying
| easy wins to their powerful competitors might be worth the
| price on its own.
| hackernewds wrote:
| I posit it is similar to how Adobe lets students pirate
| Photoshop: when they join the workforce, that is what they
| know, so they need their employers to buy Adobe services,
| which for corporate customers are very expensive.
|
| Meta, by democratizing AI access, is generating more capable
| developers, which will help make the Metaverse (where FB
| leads) a reality. They have already realized they have a
| losing gambit while Google, Apple, Microsoft (also X?) hold
| an antagonistic monopoly against Meta product advancement.
| rvz wrote:
| > Does anyone have a good explanation for Meta's strategy with
| AI?
|
| Yes. I said it many times. Meta is already at the finish line
| in the AI race to zero. All the other cloud-based AI models
| cannot increase their prices given that a $0 free AI model is
| available to be self-hosted or used on-device for private /
| compliance reasons.
|
| Cloud-based AI models cannot afford to compete with free or
| close to free. It costs Meta close to nothing to release a
| readily available $0 AI model which is good enough for most
| of the use-cases that ChatGPT already covers.
|
| > The only thing I've been able to think is they're trying to
| commoditize this new category before Microsoft and Google can
| lock it in, but where to from there? Is it just to block the
| others from a new revenue source, or do they have a longer game
| they're playing?
|
| It mostly benefits the PyTorch ecosystem, around which Meta
| has an active community.
| megaman821 wrote:
| Probably just talent acquisition. As Google and OpenAI start
| sharing and publishing less, they become less attractive to
| scientists. No scientist wants to fall into a black hole and
| not publish for 8 years.
| norsurfit wrote:
| Exactly. The Google and OpenAI engineers who published their
| groundbreaking research 5 years ago are now rockstars. Those
| who create great research but can't share it often get
| frustrated.
| rvnx wrote:
| The problem is also companies bragging about AI but not
| releasing anything behind it (like most of the recent Google
| announcements).
|
| If nobody except the researchers can reproduce an AI paper,
| and there is no source code and no demo that the public can
| access, then it's almost as if it doesn't exist.
|
| I wouldn't want to work in a company that would throw away my
| research and just use it for PR purposes.
| survirtual wrote:
| Maybe Meta is waking up to the endgame of humanity and has
| decided to stop playing the old game? Who knows :)
| belter wrote:
| Maybe Meta think it can increase the stock price by claiming
| 40 billion avatars are real friends...
| morkalork wrote:
| Retention project to keep their top ML/AI staff engaged and not
| straying away?
|
| Working towards NLU that can solve content moderation once and
| for all? Contrast with tiktok which is clearly using word
| filters that are easily worked around with phrases like "un-
| alived" or "corn".
|
| They want to replace influencers and your friends with chatbots
| and keep you scrolling through an infinite feed of ads and AI
| generated content?
| gaogao wrote:
| A lot of top ML/AI talent has already bailed too, so some of
| it is probably them trying to keep open research closer to
| SOTA.
| hn_20591249 wrote:
| There has been some shuffling of seats, but from what I am
| hearing, FAIR is in the best position on staffing and
| funding that it has been in for quite some time. Mark is
| pivoting hard to stay competitive in AI and is providing the
| resourcing to do so; the results speak for themselves.
| idopmstuff wrote:
| Meta has a clear channel to leverage generative AI in
| profitable ways in their ads. At some point in the probably not
| so far future, everybody's going to have custom ads generated
| for them that are optimized to get that particular person to
| click/buy/etc. Those will convert well, and the better ads
| convert, the more businesses will be willing to pay Meta for a
| given ad.
|
| This compares favorably with Google, which is as likely to
| cannibalize its search business with generative AI as to create
| new value for itself.
|
| Thus, for all the gen AI stuff like this, for which Meta
| doesn't have an obvious path to commercialization, it makes
| sense to release it publicly. They get plenty of benefits from
| this - for one, engineers (and smart people generally) who are
| working on really complex problems like to be able to talk
| about the work they're doing. If you're picking between jobs at
| Meta and Google, the fact that Meta's going to release your
| stuff publicly might well be the deciding factor.
|
| I would also argue that there's an economic incentive. Right
| now, being seen as an AI company is definitely a positive for
| your multiple. I think the movement of Meta's stock price over
| the last 12 months relative to their change in profit and
| revenue is certainly driven in part by the perception that
| they're a leader in AI.
| doomlaser wrote:
| I watched a good talk from Yann LeCun who is Chief AI Scientist
| at Meta, and he explained that the thinking is that open source
| AI models will be the long-term winner, so it's best for them
| to work in that arena.
|
| https://www.youtube.com/watch?v=vyqXLJsmsrk
| GreedClarifies wrote:
| That's not a business strategy.
|
| Likely this is driven by ego.
|
| Yann wants to cement his position as a leader in AI, and
| while he clearly does not appreciate LLMs at all, he realizes
| that he needs to make waves in this area.
|
| Mark _needs_ a generative product and has invested
| tremendously in the infrastructure for AI in general (for
| recommendation). He needs researchers to use that
| infrastructure to create a generative product(s).
|
| Yann sees this going on, realizes that he has a very powerful
| (research + recruiting) position, and tells Mark that he will
| only sign on if Meta gives away a good deal of research. Mark
| concedes, with the condition that he wants his generative
| product by end of 2023 or start of 2024.
| oceanplexian wrote:
| It's not just ego. It's accelerationism. Giving this stuff
| away for free is probably going to accelerate AI a decade
| faster than if it was kept locked up behind closed doors at
| Google, OpenAI, etc. And if you're an optimist then that
| actually might make the world a better place much faster.
| swader999 wrote:
| Realistically, AI will ramp up the good and the bad.
| lsmeducation wrote:
| Linux wasn't a business strategy either.
| heatmiser wrote:
| It makes sense to me that Facebook is releasing these models
| similarly to the way that Google releases Android OS. Google's
| advertising model benefits from as many people being online as
| possible and their mobile operating system furthers that aim.
| Similarly, Facebook's advertising model benefits from having
| loads of content being generated to then be posted in their
| various products' feeds.
| fnordpiglet wrote:
| As a business strategy, I would see it as preventing
| themselves from being hemmed in by the market leaders. By
| open sourcing and raising the bar for commodity AI, they get
| to crowd-source improvements to their models and techniques,
| getting ahead in their own uses by co-opting open source
| progress. I would say that so far this is working amazingly
| well - the amount of interest around open source models from
| Meta is immense. I also think the majority of uses in the
| future will be fine-tuned, RAG-capable models embedded in
| devices, not pangalactic planet-sized computers running
| septillion-parameter models. Llama.cpp is a perfect
| illustration of where that's working.
|
| We followed a similar model under more duress at Netscape. When
| you use Firefox that's the fruit of that effort. It didn't save
| Netscape, but Meta has a better and more diversified revenue
| base.
| vessenes wrote:
| They are behind commercially, very behind.
|
| They also don't have the same economic setup and DNA as
| MS/OpenAI. Large corporate customers don't pay for access to
| the FB cloud, nor are they likely to -- Ellison has spent years
| building out Oracle Cloud, and he's on the FB board, for
| example. And I bet you didn't think of using Oracle's Cloud for
| your last project.
|
| So, your company DNA is free-to-all social based on ad
| monetization, with a large bet on metaverse / AR / experiential
| social compute being next. You aren't a trusted corporate
| partner for anything but gatekeeping your immense community
| through ad sales.
|
| And, it's clear you a) have some of the most interesting
| private social data in the world, including photos and DMs and
| texts, and b) this AI thing is huge.
|
| A play that doesn't f with your existing corporate structure
| too much is to build this stuff, give it away, keep publishing,
| build your AI team internally, and see where it takes you.
|
| This isn't the only play, but I think it's reasonable. It's
| pretty clear large enterprises are going to need their own,
| internally built / owned, Foundation models to be competitive
| in a bunch of arenas in the next decade. In this case, if FB
| can get a little mindshare, keep the conversation going, and as
| a sidenote, be a disruptor by lowering Azure/OpenAI revs with
| open releases at-the-edge, that's probably a strategy win.
|
| If I were in charge of AI strategy at FB, I'd probably double
| down more on generative AI, and I'd be working hard on realtime
| multimodal stuff -- their recent very large multimodal speech
| to text in multiple languages work is good. If a team could
| eyeball realtime-ish video chat with translations, that would
| be something the platform has a natural advantage in pushing
| out. Generative AI hits existing customers and metaverse
| asset creation, which is going to experience radical changes
| in costs and productivity over the next few years, and will
| impact Oculus 100% no matter what anybody wishes were true.
| tyre wrote:
| I don't believe they're going for the same hosted
| monetization as Oracle or Google. I'm sure they'll play
| around with assistant AIs but you can imagine them leveraging
| their graph and data for this.
|
| Who is better positioned to answer a question like, "What
| should I get my friend Sophia for her birthday?"
| Facebook/Instagram already have huge volumes of data to
| specifically target ads. They can feed those into a chat
| interface pretty easily.
|
| Customers would then buy per impression by describing their
| product and trusting Facebook to place it correctly. They
| already do this today, it's just a different medium.
| cvhashim04 wrote:
| > Who is better positioned to answer a question like, "What
| should I get my friend Sophia for her birthday?"
| Facebook/Instagram already have huge volumes of data to
| specifically target ads. They can feed those into a chat
| interface pretty easily.
|
| Interesting idea but sounds risky and intrusive in
| practice.
| hiatus wrote:
| I think this suggestion lacks subtlety. More likely,
| around the time leading up to Sophia's birthday, you will
| see more ads for things (maybe even gift idea ads) that
| just so happen to be things Sophia would love (at least,
| according to their data).
| roughly wrote:
| > Interesting idea but sounds risky and intrusive in
| practice.
|
| That's pretty much the entire Meta empire in a single
| sentence.
| Paradigma11 wrote:
| Medication for her new std?
| throwaway290 wrote:
| Commercially it's not clear if there is a reliable "ahead".
| I'd be surprised if copyright lawsuits don't start hitting
| MS/OAI when publishers wake up, and if you take out that
| training data, where does it leave their models?
| visarga wrote:
| Countries putting copyright above AI progress will just
| fall behind. It's one thing to demand no exact replication
| of copyrighted content, another to forbid training on
| copyrighted works. Ideas were not supposed to be under
| copyright, only expression, from what I remember.
| throwaway290 wrote:
| The argument that copyright abuse is required for "AI
| progress" is sus. It is required for a quick, easy buck to
| be made by the likes of Microsoft -- that I agree...
| roughly wrote:
| That's interesting. I tend to lump FB, Amazon, Google, and MS
| in my head when thinking about the tech giants, but you're
| right, FB is the only one of those not offering a commercial
| platform. For them, building out the capabilities of the LLMs
| is something to be done in the open with community
| involvement, because they're not going to monetize the models
| themselves.
|
| They're also getting a fantastic amount of press from all
| this, which is good for attracting talent and helping improve
| their image, at least among the nerd set.
| basch wrote:
| I'm in the camp that its a mistake for Meta to not be
| competing in the commercial compute space.
|
| wrote about and diagrammed it here -
| https://telegra.ph/Facebook-Social-Services-FbSS-a-missed-
| op...
| CuriouslyC wrote:
| Meta absolutely could not overcome the barriers to entry
| and technical mismatch for any sort of traditional IaaS-
| style product, and it would be foolish for them to try.
| They might be able to pull off some sort of next
| generation Heroku style service aimed at smaller shops
| with built in facebook integration and authn/z
| management, but that's tangential.
| oceanplexian wrote:
| FB is unlike the other BigTech(tm) since Zuck never sold
| out and has a controlling equity stake. Amazon, Google, and
| MS are all controlled by and beholden to institutional
| investors.
|
| FB can release these for no other reason than Zuck's ego or
| desire to kill OpenAI. Same deal as him going off on a
| tangent with the Metaverse thing.
| boppo1 wrote:
| Wonder why Zuck particularly wants to kill OpenAI instead
| of increasing revenue with a new product offering.
| p1esk wrote:
| Given that OpenAI finished training GPT4 a year ago, and
| no models today (including these) can beat it, I highly
| doubt anyone is capable of killing Open AI in the near
| future. I'm guessing by the time GPT5 is out, someone
| will finally catch up with GPT4.
| sroussey wrote:
| They could always spin it out as a separate company.
| mupuff1234 wrote:
| Larry and Sergey still control majority of voting power
| from what I recall.
| vladms wrote:
| Depends what you mean by platform and depends what you mean
| by FB. If by FB you mean Meta, they have also
| https://www.workplace.com/ (which is like an internal
| facebook), instagram, whatsapp and some others. Integrating
| LLM technology into those platforms might give them some
| advantage.
| roughly wrote:
| Right, but they're not competing directly on offering the
| LLM - they benefit from having a better LLM as a feature,
| but their value add is elsewhere in the product.
| Der_Einzige wrote:
| You ought to think about using Oracle Cloud for your next
| LLM/GPU project, because they sell access to A100/H100s for
| cheap and they actually have them in stock!
| hutzlibu wrote:
| "b) this AI thing is huge."
|
| Yeah, there are tons of opportunities for AI to do something
| with Facebook's private user data and sell new services: for
| users, to create engagement - and for ad companies, to get
| very well-targeted ads delivered. It is of course a
| challenge to update the models on the fly to include the
| latest private data, but then you can tailor an ad that has
| subtle references to the latest shared wishes of the user.
| Probably quite effective.
|
| So for now they mainly need top talent to make some of it
| work. And open source is the best bet for creating an
| ecosystem they can control and getting talent already
| trained on their tools. And they lose almost nothing,
| because yes, they aren't in the cloud business.
|
| So I will continue to not use facebook. But the models I will
| try.
| jatins wrote:
| In times like these Facebook/Zuck probably wonders how things
| would have turned out had they not killed Parse.
|
| Had they continued with it, they'd have likely had some
| semblance of a public cloud today and would be able to sell
| these models.
| liuliu wrote:
| Yes. But it also needs a very different org structure to
| support that. Their internal infra from what I heard is
| dated (monolithic PHP binary deployment, no federated
| authorization management etc.). It is doable (FAIR's org
| structure was very different in the first few years), but
| would also be a distraction for a long time.
|
| Very interesting to ponder for sure.
| nabusman wrote:
| I would add that having open source gen AI will enable the
| creation of content for metaverse / AR / VR, which will
| improve the chances that all of that will take off.
| vessenes wrote:
| Right, exactly this. Ratcheting the costs down two orders
| of magnitude in both dollar and expertise/human costs is
| going to make huge changes. You better believe FB is
| thinking about this hard.
| lordnacho wrote:
| Copilot has been working great for me thus far, but it's limited
| by its interface. It seems like it only knows how to make
| predictions for the next bit of text.
|
| Is anyone working on a code AI that can suggest refactorings?
|
| "You should pull these lines into a function, it's repetitive"
|
| "You should change this structure so it is easier to use"
|
| Etc
| sestinj wrote:
| You can use Continue for all of this, as easy as highlighting
| code and making the request. We also support using Code Llama:
| https://continue.dev/docs/walkthroughs/codellama
| thewataccount wrote:
| Any plans to support IntelliJ?
| vunderba wrote:
| Yeah this would be a crucial feature - interoperability
| with Jetbrains IDEs.
| GordonS wrote:
| I'd also be really keen on this.
| adocomplete wrote:
| Give Cody a try! (Cody.dev)
|
| With Cody you can create embeddings for your entire repo, so
| Cody will have much greater context about your code base and
| the problems you're trying to solve.
|
| Disclaimer: I just joined Sourcegraph a few weeks ago.
| stuzenz wrote:
| Cody is great; it has become my go-to (and I pay for GitHub
| Copilot).
|
| With that said, they have recently changed the architecture,
| with the local install required, and I have not managed (yet)
| to get it working with NixOS. Once I have some more time, I
| will try again - it looks like there will be some hoops to go
| through. https://nixos.org/manual/nixpkgs/stable/#ssec-pkgs-
| appimageT...
|
| Kudos to the Sourcegraph team. Sourcegraph's original
| product was nicely thought out and ahead of its time. Nice
| to see how the original product gave a nice basis for
| building out Cody.
| armchairhacker wrote:
| Copilot calls these "Code Brushes"
| https://githubnext.com/projects/code-brushes/
|
| Last I heard they are in beta and don't work very well (even on
| the examples page: the "add types" brush is too strict, since
| `a` and `b` are checked for `null`, and the "fix simple bug" is
| a typo)
| artificialLimbs wrote:
| I let mine generate whatever it likes, then add a comment below
| such as "# Refactor the above to foo.." Works fairly well at
| times.
| lordnacho wrote:
| Can it suggest deletions? Just seems like I don't know how to
| use it.
| make3 wrote:
| There's an instruct model in there; you can definitely use it
| for this, as that's one of the objectives.
|
| An instruct model means that you can ask it to do what you
| want, including asking it to give you refactoring ideas from
| the code you will give it.
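|
| A minimal sketch of that kind of request (assuming llama-
| cpp-python and a quantized local checkpoint; the file names
| and sampling settings are illustrative):
|
|     from llama_cpp import Llama
|
|     llm = Llama(model_path="codellama-7b-instruct.Q4_K_M.gguf",
|                 n_ctx=4096)
|
|     code = open("my_module.py").read()
|     prompt = ("[INST] Review the following code and suggest "
|               "refactorings, e.g. repeated lines that should "
|               "be pulled into a function:\n\n"
|               + code + " [/INST]")
|
|     out = llm(prompt, max_tokens=512, temperature=0.2)
|     print(out["choices"][0]["text"])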
| regularfry wrote:
| Sounds like what's needed is a bit of tooling in the
| background consistently asking the LLM "How would you improve
| this code?" so you don't need to actually ask it.
| lordnacho wrote:
| How do I access it from my IDE? Jetbrains/VSCode?
| phillipcarter wrote:
| SourceGraph Cody is going in that direction, as is Copilot
| Chat. But it's still early days. I don't think there's anything
| robust here yet.
| nvm0n2 wrote:
| Neither of those tasks require AI. IntelliJ IDEA will happily
| suggest both for you today, locally. It can find large chunks
| of duplicated code and automatically refactor them out to
| functions for you. And it has many inspections that suggest
| refactorings to make code clearer.
| fpgaminer wrote:
| https://docs.github.com/en/copilot/github-copilot-chat/using...
| can basically do that if you're in the beta.
| claytongrassick wrote:
| I've been using Cursor (https://www.cursor.so/) and it can do
| embeddings of the entire codebase, refactoring entire classes,
| etc. I had it rewrite a UI to add state to show one item at a
| time and have a selection list to the left and it executed it
| perfectly in MUI controls, first try.
| modeless wrote:
| Interesting that there's a 34B model. That was missing from the
| original Llama 2 release. I wonder if it's still usable for
| general non-code chat tasks or if the code fine tuning destroyed
| that. It should be the best model that would still fit on 24GB
| gaming GPUs with quantization, because 70B doesn't fit.
| brucethemoose2 wrote:
| Someone "grafted" llama 33B onto llama v2 13B to make "llama
| 22B"
|
| https://huggingface.co/chargoddard/llama2-22b
|
| Theoretically this is an even better size, as it would fit on a
| 20GB-24GB GPU with more relaxed quantization and much longer
| context.
|
| Metrics are slightly below 13B, but the theory is that the
| higher parameter count is more amenable to finetuning. If you
| search for 22B on huggingface, you can see that frankenllama
| experiments are ongoing:
|
| https://huggingface.co/models?sort=modified&search=22b
| redox99 wrote:
| I can't imagine it being better than Llama1 33B, after all this
| code finetuning.
| modeless wrote:
| But the license for llama 2 is a whole lot better.
| redox99 wrote:
| Meh.
|
| If you're using it commercially you're probably deploying
| it on a server where you're not limited by the 24GB and you
| can just run llama 2 70b.
|
| The majority of people who want to run it locally on 24GB
| either want roleplay (so non commercial) or code (you have
| codellama)
| nabakin wrote:
| Looks like they left out another model though. In the paper
| they mention an "Unnatural Code Llama" which wipes the floor
| with every other model/finetune on every benchmark, except
| for slightly losing to Code Llama Python on MBPP pass@100 and
| slightly losing to GPT-4 on HumanEval pass@1, which is insane.
|
| Meta says later on that they aren't releasing it and give no
| explanation. I wonder why given how incredible it seems to be.
| ImprobableTruth wrote:
| It's "unnatural" because it was finetuned on generated data
| using another model, almost certainly gpt-4 (whose TOS forbid
| this).
| jrh3 wrote:
| lol... Python for Dummies (TM)
| the-alchemist wrote:
| Anyone know if it supports Clojure?
| waitingkuo wrote:
| Looks like we need to request access first
| taylorbuley wrote:
| In the past, LLaMA access was granted nearly immediately. For
| HuggingFace downloads, it took a full day.
| naillo wrote:
| Feels like we're about a year away from local LLMs that can
| debug code reliably (by being hooked into console error
| output as well), which will be quite the exciting day.
| brucethemoose2 wrote:
| That sounds like an interesting finetuning dataset.
|
| Imagine a database of "Here is the console error, here is the
| fix in the code"
|
| Maybe one could scrape git issues with console output and
| tagged commits.
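|
| A rough sketch of that scrape (assuming the public GitHub
| REST API; the label and traceback heuristics are
| illustrative):
|
|     import requests
|
|     def error_fix_pairs(repo, token):
|         # repo is "owner/name". Pair issues containing a
|         # Python traceback with the commit that closed them.
|         headers = {"Authorization": f"token {token}"}
|         issues = requests.get(
|             f"https://api.github.com/repos/{repo}/issues",
|             params={"state": "closed", "labels": "bug"},
|             headers=headers).json()
|         pairs = []
|         for issue in issues:
|             if "pull_request" in issue:
|                 continue  # endpoint also returns PRs
|             body = issue.get("body") or ""
|             if "Traceback (most recent call last)" not in body:
|                 continue
|             events = requests.get(issue["events_url"],
|                                   headers=headers).json()
|             commits = [e["commit_id"] for e in events
|                        if e["event"] == "closed"
|                        and e.get("commit_id")]
|             if commits:
|                 pairs.append((body, commits[0]))
|         return pairs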
| ilaksh wrote:
| Have you tried Code Llama? How do you know it can't do it
| already?
|
| In my applications, GPT-4 connected to a VM or SQL engine can
| and does debug code when given error messages. "Reliably" is
| very subjective. The main problem I have seen is that it can be
| stubborn about trying to use outdated APIs and it's not easy to
| give it a search result with the correct API. But with a good
| web search and up to date APIs, it can do it.
|
| I'm interested to see general coding benchmarks for Code Llama
| versus GPT-4.
| jebarker wrote:
| What does "GPT-4 connected to a VM or SQL engine" mean?
| ilaksh wrote:
| https://aidev.codes shows connected to VM.
| tomr75 wrote:
| Have you tried giving up-to-date APIs as context?
| 6stringmerc wrote:
| So it's stubborn, stinks, bites and spits?
|
| No thanks, going back to Winamp.
| gorbypark wrote:
| I can't wait for some models fine tuned on other languages. I'm
| not a Python developer, so I downloaded the 13B-instruct variant
| (4 bit quantized Q4_K_M) and it's pretty bad at doing javascript.
| I asked it to write me a basic React Native component that has a
| name prop and displays that name. Once, it returned a regular
| React component, and when I asked it to make sure it used React
| Native components, it said sure and outputted a bunch of random
| CSS and an HTML file that was initializing a React project.
|
| It might be the quantization or my lacklustre prompting skills
| affecting it, though. To be fair I did get it to output a little
| bit of useful code after trying a few times.
| mdaniel wrote:
| it looks like https://news.ycombinator.com/item?id=37248844 has
| gotten the traction at 295 points
| dang wrote:
| Maybe we'll merge that one hither to split the karma.
| WhitneyLand wrote:
| How much am I missing out on with tools like this or Copilot,
| compared to using GPT-4?
|
| I guess since Xcode doesn't have a good plug-in architecture for
| this I began experimenting more with a chat interface.
|
| So far gpt-4 has seemed quite useful for generating code,
| reviewing code for certain problems, etc.
| syntaxing wrote:
| TheBloke doesn't joke around [1]. I'm guessing we'll have the
| quantized ones by the end of the day. I'm super excited to use
| the 34B Python 4 bit quantized one that should just fit on a
| 3090.
|
| [1] https://huggingface.co/TheBloke/CodeLlama-13B-Python-fp16
| [deleted]
| stuckinhell wrote:
| What kind of CPU/GPU power do you need for quantization or
| these new GGUF formats?
| syntaxing wrote:
| I haven't quantized these myself since TheBloke has been the
| main provider for all the quantized models. But when I did an
| 8-bit quantization to see how it compared to the transformers
| library load_in_8bit 4 months ago(?), it didn't use my GPU
| but loaded each shard into the RAM during the conversion. I
| had an old 4C/8T CPU and the conversion took like 30 mins for
| a 13B.
| SubiculumCode wrote:
| I run llama2 13B models, 4-6 bit k-quantized, on a 3060 with
| 12GB VRAM
| suyash wrote:
| can it be quantised further so it can run locally on a
| developer's normal laptop?
| syntaxing wrote:
| "Normal laptop" is kind of hard to gauge but if you have a M
| series MacBook with 16GB+ RAM, you will be able to run 7B
| comfortably and 13B but stretching your RAM (cause of the
| unified RAM) at 4 bit quantization. These go all the way down
| to 2 bit but I personally I find the model noticeably
| deteriorate anything below 4 bit. You can see how much (V)RAM
| you need here [1].
|
| [1] https://github.com/ggerganov/llama.cpp#quantization
| UncleOxidant wrote:
| If I don't want to run this locally is it runnable somewhere on
| huggingface?
| emporas wrote:
| Replicate has already hosted Llama2 13B, the chat version. My
| guess is, in a short span of days or weeks they will host the
| code version too. They charge a dollar for 2000 generations
| if I am not mistaken.
|
| https://replicate.com/a16z-infra/llama-2-13b-chat
| mchiang wrote:
| Ollama supports it already:
|
| `ollama run codellama:7b-instruct`
|
| https://ollama.ai/blog/run-code-llama-locally
|
| More models uploaded as we speak:
|
| https://ollama.ai/library/codellama
| jerrysievert wrote:
| while it supports it, so far I've only managed to get
| infinite streams of near nonsense from the ollama models
| (codellama:7b-q4_0 and codellama:latest)
|
| my questions were asking how to construct an indexam for
| postgres in c, how to write an r-tree in javascript, and how
| to write a binary tree in javascript.
| jmorgan wrote:
| > managed to get infinite streams of near nonsense
|
| This should be fixed now! To update you'll have to run:
| ollama pull codellama:7b-instruct
| carbocation wrote:
| Similarly, I had it emit hundreds of blank lines before
| cancelling it.
| justinsaccount wrote:
| Maybe it's outputting https://en.wikipedia.org/wiki/White
| space_(programming_langua... :-)
| kordlessagain wrote:
| What fortune, I so happen to need hundreds of blank
| lines.
| syntaxing wrote:
| Same, just tried it and it would give me an infinite number
| of blank lines
| jmorgan wrote:
| Sorry, this should be fixed now! To update you'll have to
| run: ollama pull codellama:7b-instruct
| mchiang wrote:
| still modifying the code completion (foundation / python
| models) to see what's causing the behavior.
|
| Have had some good success with the instruct model:
|
| codellama:7b-instruct
| jerrysievert wrote:
| thanks! this gives me some results, but I've had to use a
| specific construct to get anything meaningful:
|
| using <language> write me a <thing>
|
| it's managed to spit out code, rather than "write a
| traversal function".
| [deleted]
| comechao wrote:
| I'm testing it on my M2 Air (16GB). Quite fast!
| syntaxing wrote:
| Whoa, it's absolutely astounding how fast the community is
| reacting to these model releases!
| Pesthuf wrote:
| Isn't ollama terminal only? For code, that wouldn't be good.
| natrys wrote:
| They have a server/client model. The binary comes with a
| basic terminal front-end but you can just create your own
| self-hosted GUI or editor integration against the API[1]:
|
| [1]
| https://github.com/jmorganca/ollama/blob/main/docs/api.md
| jmorgan wrote:
| Indeed! After pulling a model with "ollama pull
| codellama" you can access it via the REST API:
|     curl -X POST http://localhost:11434/api/generate -d '{
|       "model": "codellama",
|       "prompt": "write a python script to add two numbers"
|     }'
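| A minimal Python sketch of the same call (assuming the default
| localhost port above; the endpoint streams one JSON object per
| line):
|
|     import json
|     import requests
|
|     resp = requests.post(
|         "http://localhost:11434/api/generate",
|         json={"model": "codellama",
|               "prompt": "write a python script to add two numbers"},
|         stream=True,
|     )
|     for line in resp.iter_lines():
|         if line:
|             # each streamed line carries a "response" fragment
|             print(json.loads(line).get("response", ""), end="")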
| [deleted]
| benvolio wrote:
| >The Code Llama models provide stable generations with up to
| 100,000 tokens of context.
|
| Not a bad context window, but it makes me wonder how embedded
| code models would pick that context when dealing with a
| codebase larger than 100K tokens.
|
| And this makes me further wonder whether, when coding with such
| a tool (or at least with the knowledge that they're becoming
| more widely used and leaned on), there are some new
| considerations that we should be applying (or at least starting
| to think about) when programming. Perhaps having more or fewer comments, perhaps more
| terse and less readable code that would consume fewer tokens,
| perhaps different file structures, or even more deliberate naming
| conventions (like Hungarian notation but for code models) to
| facilitate searching or token pattern matching of some kind.
| Ultimately, in what ways could (or should) we adapt to make the
| most of these tools?
| gonzan wrote:
| I built a VS code extension a while back that I still use that
| wraps GPT-4 and writes code directly in my editor.
|
| The method I used to choose which files to feed GPT-4 was
| embeddings-based. I got an embedding for each file and then an
| embedding from the instruction + some simple processing to pick
| the files more likely to be relevant. It isn't perfect but good
| enough most of the time in medium-sized codebases (not very
| large ones).
|
| The one thing I started doing because of how I implemented this
| is make files shorter and move stuff into different files.
| Having a 1k+ LOC file is prohibitive because it eats up all the
| context window (although with 100k context window maybe less
| so). I think it's a good idea to keep files short anyways.
|
| There's other smarter things that can be done (like embed and
| pass individual functions/classes instead of entire files) so I
| have no doubt someone will build something smarter soon. You'll
| likely not have to change your coding patterns at all to make
| use of AI.
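| As a toy sketch of that file-picking step (the embed() here is
| a stand-in I made up; a real extension would call an embedding
| model instead):
|
|     import numpy as np
|
|     def embed(text: str, dim: int = 256) -> np.ndarray:
|         # Hash words into a fixed-size unit vector: a crude
|         # placeholder for a real embedding model.
|         v = np.zeros(dim)
|         for word in text.lower().split():
|             v[hash(word) % dim] += 1.0
|         n = np.linalg.norm(v)
|         return v / n if n else v
|
|     def rank_files(instruction: str, files: dict, k: int = 3):
|         # Score each file against the instruction by cosine
|         # similarity (unit vectors, so a dot product suffices).
|         q = embed(instruction)
|         scores = {p: float(embed(t) @ q) for p, t in files.items()}
|         return sorted(scores, key=scores.get, reverse=True)[:k]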
| visarga wrote:
| A good practice is to have a prompt file where you keep the
| information you want the model to have at its disposal. Then
| you put it in the start of your conversations with GPT-4. It's
| also good documentation for people.
|
| You start a project by defining the task. Then as you iterate,
| you can add new information to the prompt. But it can be also
| partially automated - the model can have a view of the file
| structure, classes, routes, assets and latest errors.
|
| I was really hoping that the one-year update of Codex would be
| that - an LLM that can see deep into the project, not just
| code, but runtime execution, debugging, inspecting and
| monitoring. Something that can iterate like AutoGPT.
| Unfortunately it didn't improve much, and it has weird
| conflicts with the native code completion in VSCode: you get
| freezes or doubled brackets.
| wokwokwok wrote:
| That seems daft.
|
| You can, I suppose, contract your code so that it's context-
| free and uses fewer tokens, but that makes it more confusing
| for humans _and language models_.
|
| Taken to the extreme, with one-letter functions and variables
| like i, j, k, the model will obviously be able to infer
| literally nothing and, thus, produce arbitrary nonsense.
|
| Clearly the solution is to do what we _already_ do to manage
| complexity which is to decompose large tasks into smaller black
| box modules with an api where the (large number of tokens)
| implementation is hidden and not known or relevant to using it.
|
| If you give an LLM a function signature and good description,
| maybe some usage examples, it doesn't need the implementation
| to use it.
|
| Terseness _decreases_ the ability of LLMs to process code; it
| doesn't solve context length, and even at best it doesn't
| scale.
|
| 100k tokens is plenty.
|
| You don't need to do anything like that.
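| For illustration, a hypothetical example of the kind of API
| surface that's enough for a model to work with - signature,
| docstring, and a usage example, with no implementation:
|
|     def search_orders(customer_id: int, since: str) -> list:
|         """Return orders for customer_id placed on or after
|         `since` (an ISO date, e.g. "2023-01-01").
|
|         Example:
|             search_orders(42, "2023-01-01")[0]["total"]
|         """
|         ...  # implementation hidden; not needed to call it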
| emporas wrote:
| The process of decomposing the task into smaller steps and
| generating each step independently seems to be the correct
| approach in my experience too. It works very well with GPT
| (ChatGPT or GPT-4).
|
| >100k tokens is plenty.
|
| The context window can be really helpful in case there is a
| release of a new library and the user wants to generate code
| targeting that library's API. When the training data cutoff
| is August 2023, any library released after that date is not
| known to the engine.
|
| My general opinion in regards to context window, is that 1
| trillion tokens context window still may not be enough for
| all use cases.
| roughly wrote:
| I've found the utility of the coding LLMs gets a lot higher
| when you've got code comments and descriptive variable and
| function names - the LLM makes better inferences and
| suggestions. We've seen similar on data - properly tagged data
| and descriptive field names helps the LLM to produce much more
| useful responses. I'm secretly hoping the spread of these tools
| will finally lead my fellow developers to comment their code
| and stop using three character variable names.
| GreedClarifies wrote:
| Commenting the code in this manner sounds like a job for an
| LLM, maybe with human assistance in the short run.
| bbor wrote:
| This is my ultimate (short term) AI fear - letting it get
| into a feedback loop with itself, leading to perverse and
| incorrect results.
|
| To state my position more clearly: I don't think an AI
| could comment code from scratch very well - how would it
| know all the decisions made, business logic considerations,
| historical conventions, micro-industry standards, etc?
|
| A good benchmark I was told once was "if a human expert
| couldn't do it, an AI probably can't either". And
| commenting code I didn't write would certainly test the
| bounds of my abilities
| ttul wrote:
| Your developer tool already maps out the entire code base in
| useful ways, such as knowing all the symbols available in the
| current context and the structure of classes. This information
| can be distilled for presentation to the LLM. For instance, if
| you're wanting to generate a method implementation inside a C++
| class, the LLM can be given a condensed version of the header
| files that the compiler would have access to on compiling that
| specific class. Removing white space and comments and boiling
| macros down saves a lot of tokens.
|
| You can also probably skip including standard library headers
| since those will be well known to the LLM through its fine
| tuning.
|
| Either way, consider that a typical preprocessed C++ file would
| push against the 100K limit even with some optimizations. You
| will definitely want to have some middleware doing additional
| refinement before presenting that file to the LLM.
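| A crude sketch of that kind of condensing pass for C-family
| sources (regex-based and deliberately lossy; real middleware
| would lean on the compiler or a proper parser):
|
|     import re
|
|     def condense(source: str) -> str:
|         # Strip /* ... */ and // comments, then drop blank
|         # lines and trailing whitespace to cut the token count.
|         source = re.sub(r"/\*.*?\*/", "", source, flags=re.S)
|         source = re.sub(r"//[^\n]*", "", source)
|         lines = (ln.rstrip() for ln in source.splitlines())
|         return "\n".join(ln for ln in lines if ln.strip())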
| brucethemoose2 wrote:
| This sounds like a job for middleware. Condensing split code
| into a single huge file, shortening comments, removing
| whitespace and such can be done by a preprocessor for the LLM.
| gabereiser wrote:
| So now we need an llmpack like we have webpack? Could it be
| smart enough to truncate comments, whitespace, etc?
| brucethemoose2 wrote:
| You don't even need an LLM for trimming whitespace, just a
| smart parser with language rules like the ones IDE code
| checkers already use. Existing LLMs are fine at summarizing
| comments, especially with language-specific grammar
| constraints.
| gabereiser wrote:
| My point. We don't need the middleware.
| adamgordonbell wrote:
| Solutions exist that feed LLMs ctags, and they seem to work
| well. The function signatures and symbol names for a code base
| are much smaller than the actual code.
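| A small Python-only illustration of why that works (using the
| standard ast module rather than actual ctags):
|
|     import ast
|
|     def signatures(source: str) -> list:
|         # Keep just function/class signatures: a fraction of
|         # the tokens of the full source, like a ctags index.
|         out = []
|         for node in ast.walk(ast.parse(source)):
|             if isinstance(node, (ast.FunctionDef,
|                                  ast.AsyncFunctionDef)):
|                 args = ", ".join(a.arg for a in node.args.args)
|                 out.append(f"def {node.name}({args})")
|             elif isinstance(node, ast.ClassDef):
|                 out.append(f"class {node.name}")
|         return out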
| e12e wrote:
| Curious if there are projects that enable working with these
| things self-hosted, tuned to a git repo as context, on the CLI
| like a Unix filter - or with editors like vim? (I'd love to use
| this with Helix)
|
| I see both VSCode and NetBeans have a concept of "inference
| URL" - are there any efforts like the Language Server Protocol
| (LSP), but for inference?
| ilaksh wrote:
| https://github.com/runvnc/smartcat
| ingridpan wrote:
| not quite self-hosted but gradient.ai gives you access to
| llama2 via CLI
| up6w6 wrote:
| Even the 7B model of code llama seems to be competitive with
| Codex, the model behind copilot
|
| https://ai.meta.com/blog/code-llama-large-language-model-cod...
| SparkyMcUnicorn wrote:
| I'm not sure copilot is using codex anymore[0]. They've also
| been talking about a shift towards GPT-4 with "Copilot X" a few
| times now[1][2].
|
| [0] https://github.blog/2023-07-28-smarter-more-efficient-
| coding...
|
| [1] https://github.com/features/preview/copilot-x
|
| [2] https://github.blog/2023-07-20-github-copilot-chat-beta-
| now-...
| ramesh31 wrote:
| >Even the 7B model of code llama seems to be competitive with
| Codex, the model behind copilot
|
| It's extremely good. I keep a terminal tab open with 7b running
| for all of my "how do I do this random thing" questions while
| coding. It's pretty much replaced Google/SO for me.
| solarkraft wrote:
| Huh? Do you perhaps mean standard Llama?
| coder543 wrote:
| You've already downloaded and thoroughly tested the 7B
| parameter model of "code llama"? I'm skeptical.
| realce wrote:
| Just sign up at meta and you'll get an email link in like 5
| minutes
| coder543 wrote:
| Yes, that's not a response to my comment.
|
| No one who has been using any model for just the past 30
| minutes would say that it has "pretty much replaced
| Google/SO" for them, unless they were being facetious.
| dataangel wrote:
| GPT4 has replaced SO for me and I've been using it for
| months.
| tyre wrote:
| They said 7b llama which I read as the base LLaMa model,
| not this one specifically. All of these LLMs are trained
| on Stack Overflow so it makes sense that they'd be good
| out of the box.
| brandall10 wrote:
| The top level comment is specifically citing performance
| of code llama against codex.
| [deleted]
| bbor wrote:
| It was made available internally, I believe. So this is one
| of the many Meta engineers on this site --- after all,
| Facebook is now less hated than Google here ;)
| [deleted]
| Eddygandr wrote:
| Maybe confused Code Llama with Llama 2?
| lddemi wrote:
| Likely meta employee?
| MertsA wrote:
| I've been using this or something similar internally for
| months and love it. The thing that gets downright spooky
| is the comments, believe it or not. I'll have some method
| with a short variable name in a larger program, and not
| only does it often suggest a pretty good snippet of code,
| the comments will be correct and explain what the intent
| behind the code is. It's just an LLM, but you really start
| to get the feeling the whole is greater than the sum of
| the parts.
| coder543 wrote:
| I just don't understand how anyone is making practical
| use of local code completion models. Is there a VS Code
| extension that I've been unable to find? HuggingFace
| released one that is meant to use their service for
| inference, not your local GPU.
|
| The instruct version of code llama could certainly be run
| locally without trouble, and that's interesting too, but
| I keep wanting to test out a local CoPilot alternative
| that uses these nice, new completion models.
| fredoliveira wrote:
| There are a bunch of VSCode extensions that make use of
| local models. Tabby seems to be the most friendly right
| now, but I admittedly haven't tried it myself:
| https://tabbyml.github.io/tabby/
| ohyes wrote:
| What hardware do you have that lets you run 7b and do other
| stuff at the same time?
| _joel wrote:
| If you're willing to sacrifice token/s you can even run
| these on your phone.
| hmottestad wrote:
| Maybe a MacBook Pro. The Apple silicon chips can offload work
| to a dedicated AI inference engine, and all RAM is accessible
| by all parts of the chip.
| gzer0 wrote:
| An M1 Max with 64GB of RAM allows me to run multiple models
| simultaneously, on top of stable diffusion generating
| images non-stop + normal chrome, vscode, etc. Definitely
| feeling the heat, but it's working. Well worth the
| investment.
| brucethemoose2 wrote:
| Pretty much any PC with 16GB+ of fast RAM can do this, any
| PC with a dGPU can do it well.
| [deleted]
| rafaelero wrote:
| Those charts remind me just how insanely good GPT-4 is. It's
| almost 5 months since its release and I am still in awe of its
| capabilities. The way it helps with coding is just crazy.
| binary132 wrote:
| I wonder whether org-ai-mode could easily support this.
| [deleted]
| scriptsmith wrote:
| How are people using these local code models? I would much
| prefer using them in context in an editor, but most of them
| seem to be deployed just in an instruction context. There's a
| lot of value in not having to context-switch or have a
| conversation.
|
| I see the GitHub Copilot extension gets a new release every few
| days, so is it just that the way they're integrated is more
| complicated, and so not worth the effort?
| thewataccount wrote:
| For in-editor like copilot you can try this locally -
| https://github.com/smallcloudai/refact
|
| This works well for me except the 15B+ don't run fast enough on
| a 4090 - hopefully exllama supports non-llama models, or maybe
| it'll support CodeLlama already, I'm not sure.
|
| For general chat testing/usage this works pretty well, with
| lots of options -
| https://github.com/oobabooga/text-generation-webui/
| msp26 wrote:
| >This works well for me except the 15B+ don't run fast enough
| on a 4090
|
| I assume quantized models will run a lot better. TheBloke
| already seems like he's on it.
|
| https://huggingface.co/TheBloke/CodeLlama-13B-fp16
| thewataccount wrote:
| Unfortunately what I tested was StarCoder 4-bit. We really
| need exllama, which should make even 30B viable from what I
| can tell.
|
| Because CodeLlama is llama-based, it may just work.
| modeless wrote:
| http://cursor.sh integrates GPT-4 into vscode in a sensible
| way. Just swapping this in place of GPT-4 would likely work
| perfectly. Has anyone cloned the OpenAI HTTP API yet?
| fudged71 wrote:
| I was tasked with a massive project over the last month and
| I'm not sure I could have done it as fast as I have without
| Cursor. Also check out the Warp terminal replacement.
| Together it's a winning combo!
| lhl wrote:
| LocalAI https://localai.io/ and LMStudio https://lmstudio.ai/
| both have fairly complete OpenAI compatibility layers.
| llama-cpp-python has a FastAPI server as well:
| https://github.com/abetlen/llama-cpp-python/blob/main/llama_...
| (as of this moment it hasn't merged the GGUF update yet though)
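| For instance, a sketch of pointing the stock openai Python
| client (pre-1.0 style, current as of this writing) at such a
| local layer - the port and model name are assumptions; use
| whatever your server exposes:
|
|     import openai
|
|     openai.api_base = "http://localhost:8080/v1"  # local server
|     openai.api_key = "not-needed-locally"
|
|     resp = openai.ChatCompletion.create(
|         model="codellama",
|         messages=[{"role": "user",
|                    "content": "Write a function to add two numbers."}],
|     )
|     print(resp.choices[0].message.content)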
| sestinj wrote:
| You can use Continue as a drop-in replacement for Copilot Chat
| with Code Llama. We've released a short tutorial here:
| https://continue.dev/docs/walkthroughs/codellama. It should
| save you a lot of time context-switching; you can just
| highlight code and ask questions or make edits, all with
| keyboard shortcuts
| brucethemoose2 wrote:
| Here is the paper:
|
| https://ai.meta.com/research/publications/code-llama-open-fo...
| 1024core wrote:
| > Python, C++, Java, PHP, Typescript (Javascript), C#, and Bash
|
| What?!? No Befunge[0], Brainfuck or Perl?!?
|
| [0] https://en.wikipedia.org/wiki/Befunge
|
| /just kidding, of course!
| jmorgan wrote:
| To run Code Llama locally, the 7B parameter quantized version
| can be downloaded and run with the open-source tool Ollama:
| https://github.com/jmorganca/ollama
|
|     ollama run codellama "write a python function to add two numbers"
|
| More models coming soon (completion, python and more parameter
| counts)
| natch wrote:
| Why wouldn't they provide a hosted version? Seems like a no
| brainer... they have the money, the hardware, the bandwidth, the
| people to build support for it, and they could design the
| experience and gather more learning data about usage in the
| initial stages, while putting a dent in ChatGPT commercial
| prospects, and all while still letting others host and use it
| elsewhere. I don't get it. Maybe it was just the fastest option?
| redox99 wrote:
| Probably the researchers at meta are only interested in
| research, and productionizing this would be up to other teams.
| natch wrote:
| But Yann LeCun seems to think the safety problems of eventual
| AGI will be solved somehow.
|
| Nobody is saying this model is AGI obviously.
|
| But this would be an entry point into researching one small
| sliver of the alignment problem. If you follow my thinking,
| it's odd that he professes confidence that AI safety is a non-
| issue, yet from this he seems to want no part in
| understanding it.
|
| I realize their research interest may just be the
| optimization / mathy research... that's their prerogative but
| it's odd imho.
| ShamelessC wrote:
| It's not that odd, and I think you're overestimating the
| importance of user-submitted data for the purposes of
| alignment research, in particular because it's more
| liability for them to try to be responsible for outputs.
| Really though, this way they get a bunch of free work from
| volunteers in open-source/ML communities.
| Palmik wrote:
| The best model, Unnatural Code Llama, is not released, likely
| because it's trained on GPT-4-based data and might violate
| OpenAI's TOS: as per the "Unnatural" paper [1], the "unnatural"
| data is generated with the help of some LLM -- and you would
| want to use as good an LLM as possible.
|
| [1] https://arxiv.org/pdf/2212.09689.pdf
| redox99 wrote:
| The good thing is that if it's only finetuned on 15k
| instructions, we should see a community made model like that
| very soon.
| bick_nyers wrote:
| Anyone know of a good plugin for the JetBrains IDE ecosystem
| (namely, PyCharm) that is CoPilot but with a local LLM?
| WaitWaitWha wrote:
| Can someone point me to an ELI5 sequence of steps that shows
| how someone can install and use LLMs locally and actually do
| something functional with them?
|
| Asking for purposes of educating non-technologists.
| Patrick_Devine wrote:
| There are several different ways, but the easiest way in my
| (clearly biased) opinion is just to go to ollama.ai, download
| it, and start playing around. It works out of the box w/ newer
| Macs, and there are versions for Linux and Windows in the
| works.
| marcopicentini wrote:
| It's just a matter of time before Microsoft integrates it into
| VSCode.
| daemonologist wrote:
| Works nearly out of the box with llama.cpp, which makes it easy
| to try locally:
| https://github.com/ggerganov/llama.cpp/issues/2766
|
| Here's some output from q4_0 quantization of CodeLlama-7b-Python
| (first four lines are the prompt):
|
|     # prints the first ten prime numbers
|     def print_primes():
|         i = 2
|         num_printed = 0
|         # end of prompt
|         while num_printed < 10:
|             if is_prime(i):
|                 print(i)
|                 num_printed += 1
|             i += 1
|
|     def is_prime(n):
|         i = 2
|         while i * i <= n:
|             if n % i == 0:
|                 return False
|             i += 1
|         return True
|
|     def main():
|         print_primes()
|
|     if __name__ == '__main__':
|         main()
|
| It will be interesting to see how the larger models perform,
| especially after community tuning and with better
| context/prompting.
| quickthrower2 wrote:
| Funny watching HN be nerd sniped by a machine :-)
| blibble wrote:
| I'd fail an interview candidate that suggested adding 1 each
| time for subsequent prime testing
| maleldil wrote:
| I assume you meant that you should add 2? If yes, that's such
| a mind-bogglingly basic thing to do that I agree with you, and
| it makes no sense that you're being crucified.
| blibble wrote:
| yes
| throwuxiytayq wrote:
| i'd walk out of an interview that asked me to write a prime
| number generator
| belenos46 wrote:
| I've done that (maybe it was fizzbuzz, now that I'm
| thinking about it) and boy howdy does that get the people
| you're interviewing with agitated. Saying "I'm interviewing
| for an architect-level container orchestration position. If
| I'm reinventing the wheel writing algorithms, something is
| _terribly_ wrong" shuts them up, but doesn't make them any
| happier.
| [deleted]
| dontupvoteme wrote:
| Simply prepending "Optimize " to the prompt adds your
| suggestion, and some others.
| csmpltn wrote:
| > "I'd fail an interview candidate that suggested adding 1
| each time for subsequent prime testing"
|
| Congratulations! You must be that arrogant guy everybody
| hates interviewing with, the one with the superiority
| complex.
|
| How about instead of just failing people over literally
| nothing (wasting everybody's time and money) - just ask the
| candidate whether they could somehow reduce the search space
| by utilizing the properties of a prime number?
| droopyEyelids wrote:
| The simple-to-understand, greedy algorithm is always the
| correct first choice till you have to deal with a constraint.
| blibble wrote:
| it's not that though, there's several other typical
| optimisations in there
|
| just not the super obvious one that demonstrates extremely
| basic understanding of what a prime number is
| jpeterson wrote:
| Having "extremely basic understanding" of prime numbers
| immediately at one's command is important for
| approximately 0% of software engineering jobs. If you
| instant-fail a candidate for this, it says a lot more
| about you and your organization than the candidate.
| blibble wrote:
| > If you instant-fail a candidate for this, it says a lot
| more about you and your organization than the candidate.
|
| yes, we expect professional software developers to have
| basic maths skills
|
| "what is a prime number" is taught to 7 year olds, it's
| not vector calculus
|
| what else would you consider to be an unreasonable thing
| for an employer to require?
|
| reading and writing skills of a typical 7 year old?
| ungruntled wrote:
| I think the key problem here is that it is a bad
| programming question. If you know anything about prime
| numbers, then coming up with an answer is trivial. If you
| expect a more optimized solution, then you are really
| only gauging the interviewee's understanding of prime
| numbers. So effectively the interview is more about
| mathematics than it is about programming or problem
| solving.
| [deleted]
| daok wrote:
| You probably do not have a 7-year-old child, because they
| do not know at that age what a prime number is.
|
| Second, basic math skills that you never or rarely use, or
| use with very long gaps in between, might get rusty. You
| may understand the concept but not find the optimal
| solution. The way you are responding here shows quite a
| lot about how short-sighted it is to instant-fail someone
| over a single question instead of trying to assess the
| whole person as much as you can. On your side, you are
| wasting the opportunity to get a great person who could be
| a key player on your team by bringing a different set of
| skills to the table.
| blibble wrote:
| > You probably do not have a child of 7 years old because
| they do not know at that age what is a prime number.
|
| it's part of the curriculum for children of this age
| where I grew up (I did check)
|
| > The way you are responding here shows quite a lot about
| how you are short sighted by instant-failing someone with
| a single question instead of trying to asses the whole
| person as much as you can. On you side, you are wasting
| opportunity to have a great person that could be a key
| player in your team by bringing other set of skill on the
| table.
|
| it may also be the case that I have more in-depth knowledge
| about the roles I've interviewed candidates for
|
| most recently: hiring people to work for quants
|
| not instantly knowing that even numbers (other than 2)
| are not prime is a very strong signal
| noduerme wrote:
| I'm mad at myself now that it has eaten 15 minutes of my
| time trying to come up with the right optimization.
| What's the trick? 2, +1, and then +2 from there on seems
| obvious, but once you get to 9, is it worth building a
| list of non-primes to skip?
| Our_Benefactors wrote:
| https://stackoverflow.com/a/54544012/1336678
|
| Common approach is to use square roots, which reduces the
| runtime. Recommend checking out Project Euler if you like
| solving hard math-code-O(n) puzzles.
| noduerme wrote:
| I didn't want to cheat by looking on S.O. but thanks ;)
|
| Yes, it makes sense (in the GPT code) that you'd only go
| up to i * i ... although looking at pythonic while:
| statements is just gross to me in this context; it would
| feel a lot more readable to say, e.g. in PHP:
|
|     for ($i = 2; $i <= sqrt($n); $i += ($i == 2 ? 1 : 2)) {
|         // although the first increment should just be
|         // outside the loop
|     }
| thewataccount wrote:
| I think they're suggesting simply doing +2
|
| +1 is not a good idea since ~half of all numbers are even,
| and thus trivially non-prime.
|
| You can double the speed by using +2 without using any
| fancy tricks, just changing a single character.
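| A minimal sketch of the version being described (skip even
| numbers, bound the search at the square root):
|
|     def is_prime(n: int) -> bool:
|         # 2 is the only even prime; after that, test odd
|         # divisors only - stepping by 2 halves the work.
|         if n < 2:
|             return False
|         if n % 2 == 0:
|             return n == 2
|         i = 3
|         while i * i <= n:
|             if n % i == 0:
|                 return False
|             i += 2
|         return True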
| [deleted]
| tasubotadas wrote:
| Finally we meet the lifeless drone that everybody complains
| about in the interviews.
|
| My suggestion for your next interview: decide to hire them
| based on their leetcode score alone, but invite them to the
| interview anyway, just to flex that you're still better at
| puzzle solving :-D
|
| Perfect
| d0mine wrote:
| Simple, concise, more efficient:
|
|     def primes_upto(limit: int):
|         """Generate prime numbers < *limit*."""
|         # Sieve of Eratosthenes
|         is_prime = [True] * limit
|         for n in range(2, limit):
|             if is_prime[n]:
|                 yield n  # found prime number
|                 # start at the square; smaller multiples are
|                 # marked already
|                 for c in range(n*n, limit, n):
|                     is_prime[c] = False  # mark composites
|
|     if __name__ == "__main__":
|         from itertools import islice
|         print(*islice(primes_upto(100), 10))
|         # -> 2 3 5 7 11 13 17 19 23 29
| someplaceguy wrote:
| Yeah, but yours was generated by the "post unoptimized code
| to HN and wait for someone to optimize it" model, which,
| although free and able to run without a GPU, is a much
| slower model.
| turnsout wrote:
| Someone should turn this into a product! You highlight the
| code you want to optimize, and it posts it to hn as a semi-
| contextually-appropriate comment to invite code golfing,
| and the highest rated reply gets posted back to your repo
| as a PR.
| saurik wrote:
| But, unless you are trying to find a prime number low
| enough that you might as well look it up in a pre-generated
| table, it might still be end-to-end more efficient?
| someplaceguy wrote:
| Ah, good point :) Touche!
| reacharavindh wrote:
| Code Llama Python is very interesting: specifically tuned for
| Python.
|
| I wonder if we could make such specialized LLMs (one that is
| proficient in all things Rust, another in all things Linux, all
| things genomics, all things physics modeling, etc.) and have
| them talk to each other to collaboratively solve problems.
|
| That would be a crazy future thing! Putting machines truly to
| work.
| seydor wrote:
| Start with a CodeLlama for C, and start treating these systems
| as natural language compilers. C is low-level enough, and still
| readable for those rare moments when you need to look at the
| output.
| brucethemoose2 wrote:
| If you can find a large body of good, permissively licensed
| example code, you can finetune an LLM on it!
|
| There was a similar attempt trained on Godot script a few
| months ago, and it's reportedly pretty good:
|
| https://github.com/minosvasilias/godot-dodo
|
| I think more attempts haven't been made because base llama is
| not that great at coding in general, relative to its other
| strengths, and stuff like StarCoder has flown under the radar.
| [deleted]
| esperent wrote:
| I think this is called "mixture of experts" and also there's a
| lot of speculation that it's how GPT-4 works, although probably
| with just a few large models rather than many small ones.
| jmiskovic wrote:
| It's been reported by multiple (unofficial) sources that
| GPT-4 is 8 models, each with 220B parameters. Another rumor
| has GPT-4 being 16x111B models.
|
| There's a quite fresh and active project replicating
| something similar with a herd of llamas:
| https://github.com/jondurbin/airoboros
| bbor wrote:
| Mark my words: you've caught a glimpse of the near future :).
| Google "Society of Mind" if you're not yet familiar
| mercurialsolo wrote:
| Is there a version of this on replicate yet?
| ilaksh wrote:
| Between this, ideogram.ai (an image generator that can spell,
| from a former Google Imagen team member and others), and
| ChatGPT fine-tuning, this has been a truly epic week.
|
| I would argue that many teams will have to reevaluate their LLM
| strategy _again_ for the second time in a week.
| ShamelessC wrote:
| Did ideogram release a checkpoint?
| ilaksh wrote:
| I can't find any info or Discord or forum or anything. I
| think it's a closed service that they plan to sell to make
| money.
| rvnx wrote:
| Amazing! It's great that Meta is making AI progress.
|
| In the meantime, we are still waiting for Google to show what
| they have (according to their research papers, they are beating
| others).
|
| > User: Write a loop in Python that displays the top 10 prime
| numbers.
|
| > Bard: Sorry I am just an AI, I can't help you with coding.
|
| > User: How to ask confirmation before deleting a file ?
|
| > Bard: To ask confirmation before deleting a file, just add -f
| to the rm command.
|
| (real cases)
| [deleted]
| criley2 wrote:
| I don't get comments like this; we can all go and test Bard
| and see that what you're saying isn't true
|
| https://g.co/bard/share/95761dd6d45e
| rvnx wrote:
| Well look for yourself:
|
| https://g.co/bard/share/e8d14854ccab
|
| The rm answer is now "hardcoded" (i.e., manually entered by
| reviewers), the same with the prime or Fibonacci examples.
|
| This is why we both see the same code across different
| accounts (you can run the test yourself if you are curious).
| ed wrote:
| That's a hallucination. Here's a similar made-up answer:
|
| https://g.co/bard/share/9ce2e6a11e83
|
| LLM's aren't trained on their own documentation, and can't
| introspect, so generally can't answer questions like this.
|
| (`"Mark House" "Bard"` gives no results on Google.)
| criley2 wrote:
| Okay, so the entire point of the comment is "A current
| model which does well used to be bad!"
|
| With all due respect, is that a valuable thing to say?
| Isn't it true of them all?
| nickthegreek wrote:
| Isn't the model STILL doing bad if it needs to present a
| hard-coded answer?
| rvnx wrote:
| Mhh, it's not just about the past; you can see such issues
| in current answers from Bard.
|
| They are generally okayish, closer to "meh" than to
| something outstanding.
|
| Yes, the shell script solution is better; it doesn't give
| rm -f anymore, but it is still closer to a bad solution
| than to just giving rm -i.
|
| I'm just really happy and excited to see that a free-to-
| download and free-to-use model can beat a commercially-
| hosted offering.
|
| This is what has brought the most amazing projects (e.g.
| Stable Diffusion)
| [deleted]
| eurekin wrote:
| theBloke cannot rest :)
| ynniv wrote:
| Every time a new model hits I'm waiting for his ggmls
| brucethemoose2 wrote:
| ggml quantization is very easy with the official llama.cpp
| repo. It's quick and mostly dependency-free, and you can pick
| the perfect size for your CPU/GPU pool.
|
| But don't get me wrong, TheBloke is a hero.
| ynniv wrote:
| Some of the newer models have slightly different
| architectures, so he explains any differences and shows a
| llama.cpp invocation. Plus you can avoid pulling the larger
| dataset.
| brucethemoose2 wrote:
| Yeah. Keeping up with the changes is madness, and those
| FP16 weights are huge.
| int_19h wrote:
| While we're at it, the GGML file format has been deprecated
| in favor of GGUF.
|
| https://github.com/philpax/ggml/blob/gguf-spec/docs/gguf.md
|
| https://github.com/ggerganov/llama.cpp/pull/2398
| regularfry wrote:
| As if by magic...
| https://huggingface.co/TheBloke/CodeLlama-13B-fp16. Empty so
| still uploading right now, at a guess.
| likenesstheft wrote:
| no more work soon?
| kypro wrote:
| The ability to work less has historically always come as a
| byproduct of individuals earning more per hour through
| productivity increases.
|
| The end goal of AI isn't to make _your_ labour more productive,
| but to not need your labour at all.
|
| As your labour becomes less useful, if anything you'll find you
| need to work more. At some point you may be as useful to the
| labour market as someone with 60 IQ is today. At that point
| most of the world will become entirely financially dependent on the
| wealth redistribution of the few who own the AI companies
| producing all the wealth - assuming they take pity on you or
| there's something governments can actually do to force them to
| pay 90%+ tax rates, of course.
| likenesstheft wrote:
| What?
| thehacker1234 wrote:
| [flagged]
| born-jre wrote:
| 34B uses grouped-query attention, right? Does that make it the
| smallest model with grouped-query attention?
|
| I can see some people fine-tuning it again for general-purpose
| instruct.
| dontupvoteme wrote:
| Did people _really_ think only artists would be losing their
| jobs to AI?
| 1024core wrote:
| If GPT-4's accuracy is 67% and this is 54%, how can these guys
| claim to be SOTA?
| binreaper wrote:
| Seriously, I was expecting to read the article and find them on
| a level on par with GPT-4 or higher. For all this chat of how
| Google/Facebook have been in the AI space longer than OpenAI,
| their products don't speak to that.
| MuffinFlavored wrote:
| Can I feed it entire GitHub projects (of reasonable size) and
| get non-hallucinated, up-to-date API refactoring recommendations?
| msoad wrote:
| Is there any place we can try those models? Are they available on
| HuggingFace?
| jspisak wrote:
| Partner integrations will follow. For now we just have the
| weights available.
|
| But don't worry, this community moves fast!
| Eddygandr wrote:
| Probably superseded (by y'all) within a week!
| KaiserPro wrote:
| This is great for asking questions like "how do I do x with y"
| and "this code <<some code>> isn't working, what's wrong?" Much
| faster than googling, or a great basis for forming a more
| accurate Google search.
|
| Where it's a bit shit is when it's used to provide auto-suggest.
| It hallucinates plausible-sounding functions/names, which for me
| personally are hard to spot if they are wrong (I suspect that's
| a function of the plugin)
| Someone1234 wrote:
| Business opportunity: I'd pay money for NICE desktop software
| that can run all these different models (non-subscription; a
| "2-year updates included, then discount pricing" model perhaps).
| My wishlist:
|
| - Easy plug & play model installation, and trivial to change
| which model once installed.
|
| - Runs a local web server, so I can interact with it via any
| browser
|
| - Ability to feed a model a document or multiple documents and be
| able to ask questions about them (or build a database of some
| kind?).
|
| - Absolute privacy guarantees. Nothing goes off-machine from my
| prompt/responses (USP over existing cloud/online ones). Routine
| license/update checks are fine though.
|
| I'm not trying to throw shade at the existing ways of running
| LLMs locally, just saying there may be room for an OPTIONAL
| commercial piece of software in this space. Most of them are
| designed for academics to do academic things. I am talking about
| a turn-key piece of software for everyone else that can give you
| an "almost" ChatGPT or "almost" CoPilot-like experience for a one
| time fee that you can feed sensitive private information to.
| julianeon wrote:
| What I want is even simpler: just an API that you make requests
| to and receive answers back from. Surprisingly hard to find,
| outside OpenAI that is.
| MaKey wrote:
| Oobabooga exposes an API.
| BoorishBears wrote:
| I'm not into open-source LLMs in the slightest, and yet even
| I've trivially found tools to do what both you and the poster
| above you wanted.
|
| LMStudio actually does what both of you want: it provides an
| easy GUI and serves up your model over a local endpoint that
| mirrors the OpenAI API.
|
| There's just too much noise in the tooling for LLMs; the
| solution is fewer, higher-quality solutions, not more
| solutions
| ingridpan wrote:
| https://gradient.ai/ is doing that with llama2
| worldsayshi wrote:
| Looks really promising. I wonder if the similar pricing to
| OpenAI means that Gradient is also(?) bleeding money even
| if they get a good customer base. Or are these prices
| sustainable over time?
| ingridpan wrote:
| Good question, esp as Gradient fine-tuning is so much
| cheaper than Open AI's
| xanderatallah wrote:
| We're _trying_ to do this at https://openrouter.ai
| MaKey wrote:
| Did you try Oobabooga (https://github.com/oobabooga/text-
| generation-webui) yet?
| max51 wrote:
| Oobabooga is a great tool, but it still has a long way to go
| in terms of user-friendliness. It's absolutely not plug-and-
| play the way that ChatGPT is; it requires research, trial and
| error, and knowledge of the tech to make the model work to its
| full potential. It's great once you finish setting it up, but
| it does not compare to what you would expect from a commercial
| product aimed at normal end-users.
|
| Things like bad default values, no tooltips, and no curated
| model list to one-click download are what separate a tool like
| Oobabooga from a paid commercial product. These things require
| time/money, and it would be very unlikely that an open-source
| tool could find the resources for all the testing and R&D.
|
| I think there is a big market for products where you pay and
| can just start chatting with the model without ever having to
| go to the settings tab or google anything, unless you need to
| do something out of the ordinary.
| noduerme wrote:
| Agreed. After several rounds of setting up various python
| environments, tinkering with directory structures, debugging
| glitches, and quantizing models, just to end up playing around
| for a few minutes and getting bored, it would be nice to have
| the experience just be seamless. I wouldn't try to set up a
| workflow around seriously using what's out there to run on
| localhost now.
|
| That said, _non-subscription_ is essential, and that's probably
| going to be a heavy lift considering how quickly things are
| evolving.
| simonw wrote:
| I've been trying to push things in that direction with my LLM
| tool - the idea is to have Python plugins you can install
| that handle all of the irritating details of getting a model
| set up.
|
| I've not yet been able to solve the challenge of needing CUDA
| etc for some models though!
|
| Plugins so far:
| https://llm.datasette.io/en/stable/plugins/directory.html
| noduerme wrote:
| Cool! I've followed your instructions and your blog quite a
| bit as I've experimented with running local LLMs as well as
| Stable Diffusion. It's been especially helpful, as Python
| is not my language or usual environment. Your patience in
| hacking your way through each new iteration and presenting
| what's important about them is astonishing; I personally
| think I'd have gone mad, but you've done great work in
| charting the territory.
| irrational wrote:
| I work for a Fortune 100 company with 80,000+ employees. All of
| us are explicitly forbidden from using any sort of AI/LLM tool
| without written permission from the head of legal AND the CEO.
| In other words, nobody is going to get permission.
|
| The concerns are twofold: 1. We might inadvertently use someone
| else's intellectual property. 2. Someone else might gain access
| to our intellectual property.
|
| What you are describing would help alleviate the concern about
| issue 2, but I'm not sure if it would help alleviate the
| concerns with issue 1.
| roguas wrote:
| Change company. Honestly. If you go as far as forbidding your
| partners in crime (workers, sigh..) from exploring new,
| uncharted territory at all - well, ya know, someone will/might
| just win by not doing that.
| sshine wrote:
| Working for an 80,000+ employee company, one has already
| accepted a certain degree of inertia.
| flatline wrote:
| This is particularly problematic territory. I cannot imagine
| handing over proprietary data to a third
| party without a contract in place for how that data is
| stored and used. It's not about innovation, it's about
| using someone else's tools without ownership. For the other
| case, it's both about integrity in owning your own work,
| and a shield from legal consequences. These things should
| be very relevant to any business.
|
| I also don't know any professional devs who have used tools
| like copilot and said they were anything but a toy. I am
| more bullish on LLMs than most of my coworkers. I think
| there is a lot of potential there. I do not see that
| potential in the current commercial offerings, and the
| financial outlay to fine-tune an open-source model and run
| it at scale is...prohibitive.
| thfuran wrote:
| That's not banning all uncharted territory, it's banning
| specific legally fraught territory.
| ttyyzz wrote:
| It's basically the same in our company, too. They put a
| similar rule in place that prevents anyone from using e.g.
| ChatGPT. Little do they know that all
| software devs within the company are using co-pilot and the
| company is even paying for it. It's quite a funny situation
| tbh..
| sangnoir wrote:
| > Little do they know that all software devs within the
| company are using co-pilot and the company is even paying
| for it.
|
| Just like annual sexual harassment training - it's mostly
| corporate CYA on liability. If it ever goes to court,
| they'll plead ignorance and blame the employees who should
| have known better as they were trained/informed on what
| they ought _not_ to do.
|
| Paying for co-pilot could bite them though, so I suspect
| it's a case where one part of the organization isn't
| aware of what the other is doing
| ttyyzz wrote:
| All of your assumptions are exactly right. They (mostly
| managers with little to no IT background) want to cover their
| own asses in case shit hits the fan (an unlikely scenario if
| you ask me, because the company is just overrating the value
| of its data. Nobody gives a fuck about us anyway...) and many
| parts of this company have developed their own habits... The
| company is just very big and I can understand why they might
| be afraid, but come on, nobody will take that policy seriously
| forever. You eventually need to put some reasonable rules in
| place that allow you to make use of such innovations...
| ssalka wrote:
| GPT4All satisfies these requirements, except for (AFAIK)
| running a web server
|
| https://gpt4all.io/index.html
| simonw wrote:
| It runs a web server too - if you start up the desktop app it
| can run a web server with an API on a port for you.
| alsobrsp wrote:
| I have been using refact.ai on my laptop, it has been quite
| good.
|
| https://github.com/smallcloudai/refact/blob/main/README.md
| thewataccount wrote:
| Refact has worked for me. Hopefully exllama will support
| CodeLlama.
| firecall wrote:
| I wish these things worked with anything other than VSCode or
| JetBrains tools!
|
| VSCode is such a bloated hog of an editor!
|
| Every time I open VSCode it's bugging me with badges to update
| extensions... and it's so slow!
| jmorgan wrote:
| A few folks and I have been working on an open-source tool that
| does some of this (and hopefully more soon!)
| https://github.com/jmorganca/ollama
|
| There's a "PrivateGPT" example in there that is similar to your
| third point above:
| https://github.com/jmorganca/ollama/tree/main/examples/priva...
|
| Would love to know your thoughts
| luma wrote:
| I'd love to test this out as soon as you get Linux or Windows
| support going!
| appel wrote:
| Me too! I starred the repo and am watching releases,
| excited to try it.
| SubiculumCode wrote:
| [flagged]
| tibbon wrote:
| I've used llama.cpp easily for some local things, but it does
| lack a good UI.
| Draiken wrote:
| As a complete noob at actually running these models, what kind
| of hardware are we talking about here? I couldn't pick that up
| from the README.
|
| I absolutely love the idea of using one of these models without
| having to upload my source code to a tech giant.
| liuliu wrote:
| 34B _should_ be able to run on 24GiB consumer graphics card, or
| 32GiB Mac (M1 / M2 chips) with quantization (5~6bit) (and 7B
| _should_ be able to run on your smart toaster).
| epolanski wrote:
| Are there cloud offerings to run these models on somebody
| else's computer?
|
| Any "eli5" tutorial on how to do so, if so?
|
| I want to give these models a run but I have no powerful GPU
| to run them on so don't know where to start.
| redox99 wrote:
| On runpod there is a TheBloke template with everything set
| up for you. An A6000 is good enough to run 70b 4bit.
| dangelov wrote:
| I've used Ollama to run Llama 2 (all variants) on my 2020 Intel
| MacBook Pro - it's incredibly easy. You just install the app
| and run a couple of shell commands. I'm guessing soon-ish this
| model will be available too and then you'd be able to use it
| with the Continue VS Code extension.
|
| Edited to add: Though somewhat slow, swap seems to have been a
| good enough replacement for not having the loads of RAM
| required. Ollama says "32 GB to run the 13B models", but I'm
| running the llama2:13b model on a 16 GB MBP.
| j45 wrote:
| Apple Silicon, especially an M1 Max Studio, seems to be an
| interesting machine to hang on to as the models become more
| and more efficient, using less and less memory.
|
| If there are any other opinions or thoughts on this, I'd be
| very happy to learn as well. I have considered the eGPU route
| connected to a 1L PC such as a ThinkCentre M80/90.
| BoorishBears wrote:
| I have a 64 GB M1 Max MBP, and I'd say unless you really
| have some academic interest in messing with open models,
| for now accessing SOTA models via a REST API has better
| latency for a given quality.
|
| Claude 1.2 instant is as fast as 3.5, follows instructions
| at a quality closer to 4, and has a 100k context window.
| Hard to compete with that with an open source model right
| now.
| jkeisling wrote:
| How does open source compete with the Claude API? Easy:
| actually let you use the model. From the signup page:
|
| > Anthropic is rolling out Claude slowly and
| incrementally, as we work to ensure the safety and
| scalability of it, in alignment with our company values.
|
| > We're working with select partners to roll out Claude
| in their products. If you're interested in becoming one
| of those partners, we are accepting applications. Keep in
| mind that, due to the overwhelming interest we've
| received so far, we may take a while to reply.
|
| No thanks, I'd much rather not wait months to see if my
| app deserves their oh-so-limited attention, or "aligns
| with the values" of a company taking $400m from Sam
| Bankman-Fried.
|
| To be more charitable to your underlying point, Claude 2
| is free to chat with via Anthropic's website, Poe, or
| Slack, and the GPT-4 API is open to use. If you're
| building a prototype or just need a chatbot, these _do_
| have better results and dev experience, at least for now.
| But I don 't think picking on your Claude API example is
| unfair. These companies could randomly refuse your
| prompts via some opaque "moderation API" (that all GPT
| fine-tuning data goes through!), train on your company's
| proprietary data, spy on your most intimate questions, or
| just not find you worth the trouble and cut you off, at
| any time. THAT is why open source beats proprietary hands
| down: My device, my data, my weights, my own business.
| redox99 wrote:
| If you want to run them fast, a 12GB GPU (e.g 3060) for the 13B
| and a 24GB GPU for the 34B (e.g 3090). Otherwise llama.cpp CPU
| inference would work on most machines.
| praveenhm wrote:
| Which is the best model for coding right now: GPT-4, Copilot,
| or Phind?
| mymac wrote:
| Never before in the history of mankind was a group so absolutely
| besotted with the idea of putting themselves out of a job.
| yborg wrote:
| When mechanized textile machinery was invented, the weavers
| that had jobs after their introduction were those that learned
| how to use them.
| worksonmine wrote:
| This should be the only goal of mankind so we can smell the
| flowers instead of wasting our years in some cubicle. Some
| people will always want to work, but it shouldn't be the norm.
| What's the point really unless we're doing something we're
| passionate about? The economy?
| 037 wrote:
| I understand the fear of losing your job or becoming less
| relevant, but many of us love this work because we're
| passionate about technology, programming, science, and the
| whole world of possibilities that this makes... possible.
|
| That's why we're so excited to see these extraordinary advances
| that I personally didn't think I'd see in my lifetime.
|
| The fear is legitimate and I respect the opinions of those who
| oppose these advances because they have children to provide for
| and have worked a lifetime to get where they are. But at least
| in my case, the curiosity and excitement to see what will
| happen is far greater than my little personal garden. Damn, we
| are living what we used to read in the most entertaining sci-fi
| literature!
|
| (And that's not to say that I don't see the risks in all of
| this... in fact, I think there will be consequences far more
| serious than just "losing a job," but I could be wrong)
| ttul wrote:
| That's just one perspective... Another perspective is that LLMs
| enable programmers to skip a lot of the routine and boring
| aspects of coding - looking up stuff, essentially - so they can
| focus on the fun parts that engage creativity.
| mymac wrote:
| But it won't stop there. Why would it stop at some
| arbitrarily defined boundary? The savings associated with no
| longer having to pay programmers the amounts of money that
| they believe they are worth (high enough to result in
| collusion between employers) are just too tempting.
| swader999 wrote:
| The answer to AI stealing your job is to go ahead and start
| a company, solve a hard problem, sell the solution and
| leverage AI to do this.
| ilaksh wrote:
| Some form of AI will eventually take over almost all
| existing jobs. Whether those jobs evolve or not somehow and
| new jobs replace them, we will see.
|
| But it's definitely not just programmers. And it will take
| time.
|
| Society needs to adjust. Stopping progress would not be a
| solution and is not possible.
|
| However, hopefully we can pause before we create digital
| animals with hyperspeed reasoning and typical animal
| instincts like self-preservation. Researchers like LeCun
| are already moving on from things like LLMs and working on
| approaches that really imitate animal cognition (like
| humans) and will eventually blow all existing techniques
| out of the water.
|
| The path that we are on seems to make humans obsolete
| within three generations or so.
|
| So the long term concern is not jobs, but for humans to
| lose control of the planet in less than a century.
|
| On the way there we might be able to manage a new golden
| age -- a crescendo for human civilization.
| lsmeducation wrote:
| Continuing your aside...
|
| Humans don't become obsolete, we become bored. This tech
| will make us bored. When humans get too bored and need
| shit to stir up, we'll start a war. Take the US and China:
| global prosperity is not enough, right? We need to stoke
| the flames of war over Taiwan.
|
| In the next 300 years we'll wipe out most of each other
| in some ridiculous war, and then rebuild.
| ilaksh wrote:
| I agree that WWIII is a concern but I don't think it will
| be brought about by boredom.
|
| "Global prosperity" might be true in a very long-term
| historical sense, but it's misleading to apply it to the
| immediate situation.
|
| Taiwan is not just a talking point. Control over Taiwan
| is critical for maintaining hegemony. When that is no
| longer assured, there will likely be a bloody battle
| before China is given the free rein that it desires.
|
| WWIII is likely to fully break out within the next 3-30
| years. We don't really have the facilities to imagine
| what 300 years from now will look like, but it will
| likely be posthuman.
| lsmeducation wrote:
| I'll go with the 30-year mark. Countries like Russia or
| China don't get humbled by a loss (just as Germany wasn't
| in WW1). Russia will negotiate some terms for Ukraine (or
| maintain perpetual war), but I believe it will become a
| military state that funnels all its money into the defense
| sector. The same with Iran, and the same with China.
|
| Iran supplies Russia with drones. I can promise you
| Russia will help Iran enrich their uranium. They are both
| pariah states, what do they have to lose? Nuclear Iran,
| here enters Israel.
|
| Everyone's arming up, there's a gun fight coming.
| lsmeducation wrote:
| Okay, think about it this way: this thing helps generate
| tons and tons of code. The more code people (or this thing)
| write, the more shit there is to debug. More and more
| code, each piece calling the others, means more and more
| insane bugs.
|
| We're going to move from debugging some crap the last
| developer wrote to debugging an order of magnitude more
| code the last developer generated.
|
| It's going to be wonderful for job prospects really.
| PUSH_AX wrote:
| We're not looking at a product that's putting anyone out of a
| job though, we're looking at a product that frees up a lot of
| time, and time is great.
| quickthrower2 wrote:
| The best interpretation of this is you mean eventually ML/AI
| will put programmers out of a job, and not Code LLama
| specifically.
|
| However it is hard to tell how that might pan out. Can such an
| ML/AI do all the parts of the job effectively? A lot of non-
| coding skill bleed into the coder's job. For example talking to
| people who need an input to the task and finding out what they
| are really asking for, and beyond that, what the best solution
| is that solves the underlying problem of what they ask for,
| while meeting nonfunctional requirements such as performance,
| reliability, code complexity, and is a good fit for the
| business.
|
| On the other hand eventually the end users of a lot of services
| might be bots. You are more likely to have a pricing.json than
| a pricing.html page, and bots discover the services they need
| from searches, negotiate deals, read contracts and sue each
| other etc.
|
| Once the programming job (which is really a "technical problem
| solver" job) is replaced either it will just be same-but-
| different (like how most programmers use high level languages
| not C) or we have invented AGI that will take many other jobs.
|
| In which case the "job" aspect of it is almost moot. Since we
| will be living in post-scarcity and you would need to figure
| out the "power" aspect and what it means to even be
| sentient/human.
| kbrannigan wrote:
| Do you really want to spend your days writing Redux
| accumulators?
| thewataccount wrote:
| Is automation not what every engineer strives for when
| possible? Especially software developers.
|
| From my experience with github copilot and GPT4 - developers
| are NOT going anywhere anytime soon. You'll certainly be faster
| though.
| vunderba wrote:
| If we get to the point where these large language models can
| create complete applications and software solutions from design
| specs alone, then there's no reason to believe that this would
| be limited to merely replacing software devs.
|
| It would likely impact a _far_ larger swath of the engineering
| / design industry.
| jasfi wrote:
| Now we need code quality benchmarks comparing this against GPT-4
| and other contenders.
| nick0garvey wrote:
| They show the benchmarks in the original post, a few pages down
| jasfi wrote:
| Thanks, I missed that somehow.
| braindead_in wrote:
| The 34B Python model is quite close to GPT-4 on HumanEval
| pass@1. Small specialised models are slowly catching up to
| GPT-4. Why not train a 70B model, though?
| pmarreck wrote:
| I want "safety" to be opt-in due to the inaccuracy it introduces.
| I don't want to pay that tax just because someone is afraid I can
| ask it how to make a bomb when I can just Google that and get
| pretty close to the same answer already, and I certainly don't
| care about being offended by its answers.
| Dowwie wrote:
| What did the fine tuning process consist of?
___________________________________________________________________
(page generated 2023-08-24 23:00 UTC)