hngopher.com

       [HN Gopher] Code Llama, a state-of-the-art large language model ...
       ___________________________________________________________________
        
       Code Llama, a state-of-the-art large language model for coding
        
       Author : marcopicentini
       Score  : 435 points
       Date   : 2023-08-24 13:26 UTC (9 hours ago)
        
 (HTM) web link (ai.meta.com)
 (TXT) w3m dump (ai.meta.com)
        
       | redox99 wrote:
       | The highlight IMO
       | 
       | > The Code Llama models provide stable generations with up to
       | 100,000 tokens of context. All models are trained on sequences of
       | 16,000 tokens and show improvements on inputs with up to 100,000
       | tokens.
       | 
       | Edit: Reading the paper, key retrieval accuracy really
       | deteriorates after 16k tokens, so it remains to be seen how
       | useful the 100k context is.
        
         | brucethemoose2 wrote:
         | Did Meta add scalable rope to the official implementation?
        
           | snippyhollow wrote:
           | We changed RoPE's theta from 10k to 1m and fine-tuned with
           | 16k tokens long sequences.
        
             | lucidrains wrote:
             | to give a bit of inspiration to the open source community,
             | this trick was discovered by a random redditor https://www.
             | reddit.com/r/LocalLLaMA/comments/14lz7j5/ntkawar... cool to
             | see it applied at scale!
        
             | malwrar wrote:
             | Curious, what led you to adjusting the parameters this way?
             | Also, have you guys experimented with ALiBi[1] which claims
             | better extrapolative results than rotary positional
             | encoding?
             | 
             | [1]: https://arxiv.org/abs/2108.12409 (charts on page two
             | if you're skimming)
        
               | ttul wrote:
               | Undoubtedly, they have tried ALiBi...
        
         | nabakin wrote:
         | Looks like they aren't releasing a pretty interesting model
         | too. In the paper they mention a "Unnatural Code Llama" which
         | wipes the floor with every other model/finetune on every
         | benchmark except for slightly losing to Code Llama Python on
         | MBPP pass@100 and slightly losing to GPT-4 on HumanEval pass@1
         | which is insane.
         | 
         | Meta says later on that they aren't releasing it and give no
         | explanation. I wonder why given how incredible it seems to be.
        
           | EvgeniyZh wrote:
           | Note that current GPT-4 pass@1 for HumanEval is closer to 90%
           | than to 67% reported in GPT-4 technical report, as reported,
           | e.g., in [1]
           | 
           | [1] https://arxiv.org/abs/2305.01210
        
             | nabakin wrote:
             | Good point, I guess Meta should be using that number in
             | their chart
        
           | jonchurch_ wrote:
           | The paper states it was instruction fine tuned with synthetic
           | data (LLM generated instructions) ala another paper
           | ("Unnatural Instructions: Tuning Language Models with
           | (Almost) No Human Labor").
           | 
           | The github repo associated with that paper is linked below.
           | It links to the paper on arxiv, but also has some data in the
           | repo.
           | 
           | https://github.com/orhonovich/unnatural-instructions
        
             | ilaksh wrote:
             | Maybe they used GPT-4 to train it. OpenAI terms of use
             | don't allow that to be released commercially.
        
               | nabakin wrote:
               | I've seen this argued a lot but is it fact? OpenAI was
               | able to train on data from other platforms and surely,
               | those platforms weren't letting their data go if they
               | could help it. Unless some new laws have been passed, I
               | don't think OpenAI can legally prevent others from using
               | their data to train models. OpenAI can't have their cake
               | and eat it too. After all, any content generated by AI
               | can't be copyrighted.
        
               | lhl wrote:
               | It is indeed a fact that OpenAI's Terms of Use do state
               | that you can't use their service to develop competing
               | models: Section 2.c.iii -
               | https://openai.com/policies/terms-of-use
               | 
               | Now of course, the terms are not the law (so don't govern
               | the use of the generated data by any third party), they
               | are an agreement between two parties. If you did click
               | "agree" then that's a binding agreement and there could
               | be legal/contractual repercussions (some of which are
               | outlined in the terms).
        
               | haldujai wrote:
               | That seems like a likely explanation, probably won't get
               | into legal trouble for using an OpenAI model for a
               | research paper but redistributing said model may be
               | upsetting enough for OpenAI trigger a legal challenge.
               | 
               | Unnatural language used davinci-002 although that was a
               | while ago, they only say "similarly" in this paper and
               | don't specify what they used. I can't see a reason why
               | they wouldn't be releasing it if the unnatural prompts
               | were generated by LLaMA2-family.
               | 
               | In any case, replicating this training seems trivial and
               | very cheap compute-wise for anyone who wanted to do it.
        
               | nkohari wrote:
               | This is the most likely explanation for both why they
               | wouldn't release it and wouldn't explain why.
        
           | kapp_in_life wrote:
           | Likely trained on internal code.
        
             | mediaman wrote:
             | That model is trained on synthetically AI-generated code,
             | not internal code.
             | 
             | It suggests that synthetic training could be the future in
             | increasing capability of smaller models (and perhaps bigger
             | ones too). AI will train AI.
        
               | sroussey wrote:
               | That is the basis for https://synthesis.ai/
        
               | SubiculumCode wrote:
               | I'm an amateur, but it seems to me that methods to
               | synthesize will have to be distinct from methods of the
               | generative model.
        
               | haldujai wrote:
               | I thought this specific model was referring to self-
               | instruction using both synthetic prompts (generated from
               | few-shot in-context prompting of presumably some OpenAI
               | model, the original paper used text-davinci-002) as well
               | as synthetic code (presumably Code Llama 7 like for self-
               | instruct) subsequently validated with execution?
               | 
               | The differences being it's not just training on
               | unvalidated synthetic data and this specific method (per
               | the unnatural questions paper) results in increased
               | instruction diversity which confers some added advantage
               | and I'm assuming explains the performance gain over the
               | also synthetic self-instruct code?
               | 
               | I may be misunderstanding but this seems more nuanced
               | than just training on synthetically AI-generated code and
               | is more validating of synthetic instructions (i.e. low
               | resource setting) rather than synthetic code (i.e. high
               | resource setting).
        
         | riku_iki wrote:
         | > The Code Llama models provide stable generations with up to
         | 100,000 tokens of context.
         | 
         | what is the trick to achieve 100k context? They can't just use
         | 100k wide transformer layer, it is cost prohibitive, right?..
        
           | littlestymaar wrote:
           | I'm pretty sure they don't do that, but for code the relevant
           | relationship between two tokens is easy to determine with the
           | semantics of the language alone (for instance you can say
           | that tokens related to a local variable have no relationship
           | with tokens outside), so it would lead to a sparse matrix in
           | the transformer, reducing the cost of big contexts by a lot.
           | But it would require language specific preprocessing, and
           | whether you can make it fast is also dubious. I don't think
           | it's been tried so far.
        
       | gdcbe wrote:
       | Is there somewhere docs to show you how to run this on your local
       | machine and can you make it port it a script between languages?
       | Gpt4 can do that pretty well but its context is too small for
       | advanced purposes.
        
       | ChatGTP wrote:
       | [flagged]
        
       | jtwaleson wrote:
       | This is probably a stupid question, but would it be possible to
       | use these models to rate existing code and point to possible
       | problems, rather than generating new code? That would be
       | extremely useful to some use cases I'm working on.
        
       | dangerwill wrote:
       | It's really sad how everyone here is fawning over tech that will
       | destroy you own livelihoods. "AI won't take your job, those who
       | use AI will" is purely short term, myopic thinking. These tools
       | are not aimed to help workers, the end goal is to make it so you
       | don't need to be an engineer to build software, just let the
       | project manager or director describe the system they want and
       | boom there it is.
       | 
       | You can scream that this is progress all you want, and I'll grant
       | you that these tools will greatly speed up the generation of
       | code. But more code won't make any of these businesses provide
       | better services to people, lower their prices, or pay workers
       | more. They are just a means to keep money from flowing out of the
       | hands of the C-Suite and investor classes.
       | 
       | If software engineering becomes a solved problem then fine, we
       | probably shouldn't continue to get paid huge salaries to write it
       | anymore, but please stop acting like this is a better future for
       | any of us normal folks.
        
         | criley2 wrote:
         | You can say this about every major invention. The loom
         | destroyed jobs! The engine destroyed jobs! So on and so forth.
         | 
         | This view is critically flawed in two major ways:
         | 
         | 1) AI is not anywhere near being able to replace the majority
         | of what developers do on a product team. We are decades away
         | from a PM at Facebook being able to type "make a twitter clone
         | that uses instagram login and can scale to 1 billion" and have
         | an AI just do it.
         | 
         | 2) Programming and product work is not zero sum. The more we
         | can do means the more product we can make. It means more
         | products can be made overall. After the loom came out, we
         | simply made more clothes than ever before and in the process
         | created a ton of jobs. We are not at some peak software point
         | where we've completely saturated all humanity's need for
         | software or profitable software and thus tools that increase
         | efficiency don't put us out of work.
         | 
         | And frankly, if we develop the kind of general AI that accept a
         | query like "make a facebook competitor capable of scaling to 10
         | billion" and simply do it, inventing whatever languages,
         | frameworks, hardware, processors, patterns, methodologies,
         | global datacenters handling global politics and law, etc, etc
         | necessary to accomplish such a task, then so be it. I welcome
         | the overlords!
        
         | bbor wrote:
         | We have three options, IMO:
         | 
         | 1. As a species decide to never build another LLM, ever.
         | 
         | 2. Change the path of society from the unequal, capitalist one
         | it's taken the last 2-300 years.
         | 
         | 3. Give up
         | 
         | I know which I believe in :). Do you disagree?
        
           | sp332 wrote:
           | Finish #2 first, or else people with access to more money and
           | infrastructure will use LLMs to increase inequality even
           | further.
        
           | pbhjpbhj wrote:
           | The problem with 2 is that the people in power, and with
           | immense wealth, remain there because of capitalism. They have
           | the political power, and the resources, to enact the change
           | ... but they also lose the most (unless you count altruism as
           | gain, which of it were true the World would be so different).
           | 
           | We could structure things so that LLM, and the generalised
           | AIs to come, benefit the whole of society ... but we know
           | that those with the power to make that happen want only to
           | widen the poverty gap.
        
             | bbor wrote:
             | Yes but the common man has won before! There has never been
             | a perfect revolution/paradigm shift (personally anti
             | utopia-through-intense-bloodshed, so hesitant to use the
             | former term alone), but there have been many, all of which
             | were against the wishes of those in power.
             | 
             | Plus, if these AIs are enough to change everything, that
             | kinda implies that we've developed flexible, reliable AGI
             | systems. In such a world, everything changes - maybe the
             | calculus of The Powerful Few vs. The Oppressed Masses
             | changes in too! It might even change in our favor, if we're
             | terribly lucky...
        
         | Draiken wrote:
         | I agree. This will ultimately become another way to extract
         | more value straight into the pockets of the owners.
         | 
         | Unfortunately, I don't believe there's a way to stop (or even
         | slow down) this train. We can't defeat it, so the only logical
         | answer is to join it.
         | 
         | It's the classical issue with progress removing jobs. In
         | today's world, since mostly everyone (aside from the
         | capitalists themselves) relies on jobs to survive, barring a
         | complete switch from capitalism (which will not happen in our
         | lifetimes), we're fucked.
         | 
         | Next best thing we can do is to try and democratize it enough
         | so that not only the rich have access to it.
        
         | manicennui wrote:
         | I'm not worried because it solves a problem for the semi-
         | competent.
        
         | tomr75 wrote:
         | Improve productivity, cheapen goods and services. Nature of
         | technological advancement
        
         | lamp987 wrote:
         | "If software engineering becomes a solved problem"
         | 
         | It will simply move to a higher level of abstraction.
         | 
         | Remind me, how many programmers today are writing in assembly?
        
           | blibble wrote:
           | due to the way LLMs work: it will be able to handle that
           | level of abstraction in exactly the same way
        
             | lamp987 wrote:
             | LLM is not strong AI. It's not even AI.
             | 
             | So no. You will always need that strong "I" at the top
             | somewhere.
        
               | blibble wrote:
               | I think you're in for a nasty surprise
        
               | lamp987 wrote:
               | Two more weeks!
        
         | lsmeducation wrote:
         | It's less of a concern if you are in mid career. But someone
         | should warn all these college kids that are going into comp
         | sci. I don't think this will be the kind of lucrative field
         | they think it's going to be over the course of a 40 year
         | career.
         | 
         | The days of getting paid well for making crud are numbered
         | (which most of us do, even in the most interesting problem
         | spaces).
         | 
         | # need a front end boilerplate that hits a backend with the
         | following end points. REST api for movies catalogue, and a
         | corresponding ui. Oh, unit tests please. Go with a responsive
         | design and also make a React Native version (matter of fact
         | provision it to my iPhone). Decide between Heroku or AWS, set
         | up deploy with git hooks.
         | 
         | # scrape IMDb for initial population of the db
         | 
         | # I think a Reddit like comment system would be good to add, so
         | add it. No upvote/downvote though
         | 
         | # handle user login with google/fb/email
         | 
         | # also make an admin page to manage all this
         | 
         | I guess the technical product designer will be the new unicorn.
        
         | int_19h wrote:
         | _Any_ improvement in tooling benefits capital owners the most,
         | since the productivity gains mostly end up in their pockets.
         | 
         | But the answer to that is to deal with concentration of
         | capital, not to eschew better tools.
        
       | bryanlyon wrote:
       | Llama is a very cool language model, it being used for coding was
       | all but inevitable. I especially love it being released open for
       | everyone.
       | 
       | I do wonder about how much use it'll get, seeing as running a
       | heavy language model on local hardware is kinda unlikely for most
       | developers. Not everyone is runnning a system powerful enough to
       | equip big AIs like this. I also doubt that companies are going to
       | set up large AIs for their devs. It's just a weird positioning.
        
         | outside1234 wrote:
         | ... "seeing as running a heavy language model on local hardware
         | is kinda unlikely for most developers"
         | 
         | for now it is :) but with quantization advances etc. it is not
         | hard to see the trajectory.
        
         | ctoth wrote:
         | As we all know, computers stay the same and rarely improve.
        
         | int_19h wrote:
         | 12Gb of VRAM lets you run 13B models (4-bit quantized) with
         | reasonable speed, and can be had for under $300 if you go for
         | previous-generation NVidia hardware. Plenty of developers
         | around with M1 and M2 Macs, as well.
        
       | ilaksh wrote:
       | https://github.com/facebookresearch/codellama
        
         | [deleted]
        
         | ceejayoz wrote:
         | This is 404ing now. (Not your fault, the email's link is
         | similarly broken.)
        
           | ilaksh wrote:
           | Really? Works for me.
        
             | ceejayoz wrote:
             | It's back now.
        
       | bracketslash wrote:
       | So uhh...how does one go about using it?
        
       | andrewjl wrote:
       | What I found interesting in Meta's paper is the mention of
       | HumanEval[1] and MBPP[2] as benchmarks for code quality.
       | (Admittedly maybe they're well-known to those working in the
       | field.)
       | 
       | I haven't yet read the whole paper (nor have I looked at the
       | benchmark docs which might very well cover this) but curious how
       | these are designed to avoid issues with overfitting. My thinking
       | here is that canned algorithm type problems common in software
       | engineering interviews are probably over represented in the
       | training data used for these models. Which might point to
       | artificially better performance by LLMs versus their performance
       | on more domain-specific type tasks they might be used for in day-
       | to-day work.
       | 
       | [1] https://github.com/openai/human-eval
       | 
       | [2] https://github.com/google-research/google-
       | research/tree/mast...
        
       | gw67 wrote:
       | In your opinion, Why Meta does this?
        
         | chaorace wrote:
         | To a certain extent, I think it's just IBM disease. A company
         | the size of Meta is expected to have an AI research department
         | like Microsoft or Google, even if their core business (social
         | media) derives relatively less benefit from the technology.
         | 
         | Pretend you're an uncreative PM on an AI team; what part of
         | Facebook or VR could you feasibly improve by iterating on LLMs?
         | Perhaps the content moderation system... but that would require
         | wrangling with the company ethics comittee and someone else at
         | the company probably already took ownership that idea. You've
         | gotta do _something_ compelling or else your ML engineers are
         | going to run off somewhere else.
         | 
         | If I were to ask my ML engineers about what they wanted to work
         | on, they're going to avoid areas where their model is outgunned
         | (i.e.: chat) and instead prefer lower hanging fruit which
         | generalizes well on a resume (i.e.: "Pioneered and published
         | key innovations in LLM code-generation").
         | 
         | Of course, the alternative answer is that Meta wants to replace
         | all of their jr. developers with GPUs, but I think their
         | leadership is a little too preoccupied with VR to be actively
         | pushing for such a transformative initiative in anything more
         | than a very uninvested capacity (e.g.: "Sure I'll greenlight
         | this. Even if it doesn't pay off I don't have any better
         | ideas")
        
       | maccam912 wrote:
       | It appears we do have a 34B version now, which never appeared for
       | non fine tuned llama 2.
        
         | jspisak wrote:
         | It would be interesting to understand if a ~30B Llama-2 model
         | would be interesting and for what reasons.
        
           | hnuser123456 wrote:
           | It would fit on the 24GB top-end consumer graphics cards with
           | quantization.
        
           | brucethemoose2 wrote:
           | Llama 34B is _just_ big enough to fit on a 24GB consumer (or
           | affordable server) GPU.
           | 
           | Its also just the right size for llama.cpp inference on
           | machines with 32GB RAM, or 16GB RAM with a 8GB+ GPU.
           | 
           | Basically its the most desirable size for AI finetuning
           | hobbyists, and the quality jump from llama v1 13B to llama v1
           | 33B is huge.
        
           | Tostino wrote:
           | Better reasoning and general performance than 13b by far (if
           | llama1 was any indication), and like the other user said, can
           | fit on a single 24gb vram gaming card, and can be peft fine-
           | tuned with 2x 24gb cards.
        
             | airgapstopgap wrote:
             | Llama-1-33B was trained on 40% more tokens than
             | LLama-1-13B; this explained some of the disparity. This
             | time around they both have the same data scale (2T
             | pretraining + 500B code finetune), but 34B is also using
             | GQA which is slightly more noisy than MHA. Furthermore,
             | there have been some weird indications in the original
             | LLama-2 paper that 34B base model is something... even more
             | special, it's been trained on a separate internal cluster
             | with undervolted/underclocked GPUs (though this in itself
             | can't hurt training results), its scores are below
             | expectations, it's been less "aligned". Here, Code-Llama-
             | Instruct-13B is superior to 34B on HumanEval@1. So yes,
             | it's desirable but I wouldn't get my hopes up.
        
       | lolinder wrote:
       | Does anyone have a good explanation for Meta's strategy with AI?
       | 
       | The only thing I've been able to think is they're trying to
       | commoditize this new category before Microsoft and Google can
       | lock it in, but where to from there? Is it just to block the
       | others from a new revenue source, or do they have a longer game
       | they're playing?
        
         | brandall10 wrote:
         | On Lex Fridman, Mark said the strategy is to attract talent
         | while keeping the playing field level (not a fan of big tech
         | moating this up).
        
         | aent wrote:
         | Zuckerberg talked about it on Lex Fridman podcast
         | https://youtu.be/Ff4fRgnuFgQ?t=1113
        
         | [deleted]
        
         | nologic01 wrote:
         | Clearly the research team at Meta knows the domain as well
         | anybody, has access to a data trove as large as anybody and
         | their distribution capability is as large scale as anyone's.
         | 
         | If their choice right now is not to try to overtly monetize
         | these capabilities but instead commoditize and "democratize"
         | what others are offering it suggests they think that a
         | proprietary monetization route is not available to them. In
         | other words they do not leave money on the table. They think
         | that (at least right now) there is no money on the table that
         | they can get to.
         | 
         | Rather than remaining quiet and isolated, the best alternative
         | - their conjectured thinking goes - is to show up as they do,
         | buying up good will with various stakeholders, maintaining
         | mindshare internally and externally etc.
         | 
         | Assuming that the above reading is correct it still leaves
         | various options as to why they may have come to that
         | conclusion: For example reasoning about the future of this
         | sector they might be thinking that there is no real technical
         | moat and they simply accelerate that reality to gain some
         | brownie points.
         | 
         | It may be also idiosyncratic reasons specific to their own
         | business model (data privacy challenges and how any AI
         | monetization will mesh with all that). The drawback of being
         | the elephant in the room is that there is not much room to
         | move.
         | 
         | The nature of their long game depends on which of the decision
         | branches carries more weight. Maybe it is wait-and-see until
         | others clear up the regulatory hurdles. Or keep the engines
         | running until the real and irreducible added value of LLM algos
         | and the like becomes clear.
        
           | CuriouslyC wrote:
           | There really is no technical moat. Any new architectures are
           | going to be published because that's 100% the culture and AI
           | folks won't work somewhere where that's not true. Training
           | details/hyperparameters/model "build-ons" aren't published
           | but those are a very weak moat.
           | 
           | The only moat that is meaningful is data and they've got that
           | more than any other player save maybe google. Publishing
           | models doesn't erode that moat, and it's not going anywhere
           | as long as facebook/whatsapp/instagram rule "interactive"
           | social.
        
           | BryanLegend wrote:
           | Facebook could sure use the good will. They are winning
           | plenty of mine here.
        
           | losteric wrote:
           | Well Facebook is a walled garden, perhaps the board hopes
           | free highly capable LLMs will continue degrading the internet
           | outside those walls thus acting as a moat for their money
           | printer.
        
         | emmender1 wrote:
         | the only beneficiary of this are the hardware vendors.. nvidia
         | and amd. and startups which get these foundation models for
         | free.
         | 
         | because language models are a complementary product, and the
         | complement must be commoditized as a strategy.
         | 
         | I see AMD as a bigger beneficiary, since, very soon, amd will
         | equal nvidia for inference and fine-tuning, but amd has a long
         | way to go to equal in foundation model training.
        
           | smoldesu wrote:
           | > and startups which get these foundation models for free.
           | 
           | It's licensed non-commercially, so I'm not sure what those
           | startups stand to gain.
           | 
           | > since, very soon, amd will equal nvidia for inference and
           | fine-tuning
           | 
           | Source? If you're referring to Olive, it is indeed impressive
           | but also has caveats:
           | 
           | 1. It is just as proprietary as CUDA or CoreML.
           | 
           | 2. You need a copy of Windows and licensed DirectX to use
           | those optimizations.
           | 
           | 3. AMD only matches Nvidia's inferencing performance when
           | comparing Olive to Pytorch. Olive-to-Olive comparisons will
           | still reflect an Nvidia lead.
           | 
           | I don't think AMD has the capability to equal Nvidia in the
           | short-term. It will take longtime software investments from
           | across the industry to shake Nvidia's yoke.
        
             | artninja1988 wrote:
             | Llama2 is not licensed non-commercially. There was some
             | weird provisions about not having more than 1 billion users
             | at llama launch though
        
         | jspisak wrote:
         | If you watch the Connect talks, I'll be speaking about this..
        
           | currio wrote:
           | Excited! I hope your talks are just as informative as this
           | comment. Keep rocking!
        
           | lolinder wrote:
           | Sorry--who are you and what are the Connect talks? I haven't
           | heard of them and you don't have a bio.
        
             | jph00 wrote:
             | That'll be Joseph Spisak, Head of Generative AI Open Source
             | at Meta AI.
        
             | dewey wrote:
             | I guess that's what he is referring to:
             | https://www.linkedin.com/posts/jspisak_home-fully-
             | connected-...
        
           | titaniumtown wrote:
           | what is that?
        
             | darrenf wrote:
             | Facebook Connect is what used to be called Oculus Connect.
             | Kinda their equivalent of Apple's WWDC, I guess. It's when
             | and where the Quest 3 will be officially unveiled in full,
             | for example.
        
               | jspisak wrote:
               | Yep - here is the site:
               | https://www.metaconnect.com/en/home
        
           | coder543 wrote:
           | I wish that Meta would release models like SeamlessM4T[0]
           | under the same license as llama2, or an even better one. I
           | don't understand the rationale for keeping it under a
           | completely non-commercial license, but I agree that is better
           | than not releasing anything at all.
           | 
           | There seem to be opportunities for people to use technology
           | like SeamlessM4T to improve lives, if it were licensed
           | correctly, and I don't see how any commercial offering from
           | smaller companies would compete with anything that Meta does.
           | Last I checked, Meta has never offered any kind of
           | translation or transcription API that third parties can use.
           | 
           | Whisper is licensed more permissively and does a great job
           | with speech to text in some languages, and it can translate
           | to English only. However, it can't translate between a large
           | number of languages, and it doesn't have any kind of text to
           | speech or speech to speech capabilities. SeamlessM4T seems
           | like it would be an all-around upgrade.
           | 
           | [0]:
           | https://github.com/facebookresearch/seamless_communication
        
             | jspisak wrote:
             | Yeah - different projects have different goals and licenses
             | aren't one size fits all. Depending on the project, type of
             | technology, goals, etc.. we will select or even develop the
             | right license that aligns with those goals. Hope this helps
             | :)
        
         | politelemon wrote:
         | > Microsoft
         | 
         | But they're a partner in Llama too. Why is Microsoft in this
         | space too, how do they benefit?
        
           | azeirah wrote:
           | Microsoft is a hosting partner, there's an Azure service for
           | hosted private LLaMa inference for business. Being a go-to
           | hosting provider for SoTA AI is of course a very good thing
           | for Microsoft.
        
         | yomlica8 wrote:
         | vessenes and rvz kind of sum the idea I think they're going for
         | to me.
         | 
         | AI has no moat, but many players are in denial about this
         | still. Microsoft and other might have tight enough control they
         | can use a product dumping strategy to get people dependent upon
         | their implementation such they can start charging, but that
         | isn't a delusion Meta can have.
         | 
         | That max revenue license they used with the models seemed
         | fairly clever to me. It will seed the environment with players
         | that base their product on Meta tech in return for them being
         | born with a poison pill preventing their use by big players
         | (other than Meta) buying them. This is a long term play that
         | may not really work but it creates the potential for big
         | opportunities. And even if it doesn't work out, denying easy
         | wins for their powerful competitors might be worth the price on
         | its own.
        
         | hackernewds wrote:
         | I posit it is similar to how Adobe lets students pirate
         | Photoshop, because when they join the workforce that is what
         | they know and need their employers to buy Adobe services, which
         | for corporate is very expensive.
         | 
         | Meta by democratizing AI access is generating more capable
         | developers which will make the Metaverse a reality, where FB
         | leads. They have already realized they have a losing gambit
         | with Google, Apple, Microsoft (also X?) having an antagonistic
         | monopoly against Meta product advancement
        
         | rvz wrote:
         | > Does anyone have a good explanation for Meta's strategy with
         | AI?
         | 
         | Yes. I said it many times. Meta is already at the finish line
         | in the AI race to zero. All the other cloud-based AI models
         | cannot increase their prices given that a $0 free AI model is
         | available to be self-hosted or used on-device for private /
         | compliance reasons.
         | 
         | Cloud-based AI models cannot afford to compete with free or
         | close to free. It costs Meta close to nothing to release a
         | readily available $0 AI model which is good enough for most
         | use-cases that ChatGPT has already done.
         | 
         | > The only thing I've been able to think is they're trying to
         | commoditize this new category before Microsoft and Google can
         | lock it in, but where to from there? Is it just to block the
         | others from a new revenue source, or do they have a longer game
         | they're playing?
         | 
         | Mostly benefits the PyTorch ecosystem which Meta has an active
         | community around it.
        
         | megaman821 wrote:
         | Probably just talent acquisition. As Google and OpenAI start
         | sharing and publishing less, they become less attractive to
         | scientists. No scientist wants to fall into a black hole and
         | not publish for 8 years.
        
           | norsurfit wrote:
           | Exactly. The Google and OpenAI engineers who published their
           | groundbreaking research 5 years ago are now rockstars. Those
           | who create great research but can't share it often get
           | frustrated.
        
           | rvnx wrote:
           | The problem is also companies bragging about AI, but not
           | releasing anything behind (like most of the recent Google
           | announcements).
           | 
           | If nobody except the researcher can reproduce an AI paper,
           | and there is no source-code, and no demos that the public can
           | access, then it's almost like if it doesn't exist.
           | 
           | I wouldn't want to work in a company that would throw away my
           | research and just use it for PR purposes.
        
         | survirtual wrote:
         | Maybe Meta is waking up to the endgame of humanity and has
         | decided to stop playing the old game? Who knows :)
        
           | belter wrote:
           | Maybe Meta think it can increase the stock price by claiming
           | 40 billion avatars are real friends...
        
         | morkalork wrote:
         | Retention project to keep their top ML/AI staff engaged and not
         | straying away?
         | 
         | Working towards NLU that can solve content moderation once and
         | for all? Contrast with tiktok which is clearly using word
         | filters that are easily worked around with phrases like "un-
         | alived" or "corn".
         | 
         | They want to replace influencers and your friends with chatbots
         | and keep you scrolling through an infinite feed of ads and AI
         | generated content?
        
           | gaogao wrote:
           | A lot of top ML/AI talent has already bailed too, so some of
           | it is probably them trying to keep open research closer to
           | SOTA.
        
             | hn_20591249 wrote:
             | There has been some shuffling of seats but from what I am
             | hearing FAIR is the best setup as far as staffing and
             | funding that they have been in quite some time. Mark is
             | pivoting hard to stay competitive in AI and is providing
             | the resourcing to do so, the results speak for themselves.
        
         | idopmstuff wrote:
         | Meta has a clear channel to leverage generative AI in
         | profitable ways in their ads. At some point in the probably not
         | so far future, everybody's going to have custom ads generated
         | for them that are optimized to get that particular person to
         | click/buy/etc. Those will convert well, and the better ads
         | convert, the more businesses will be willing to pay Meta for a
         | given ad.
         | 
         | This compares favorably with Google, which is as likely to
         | cannibalize its search business with generative AI as to create
         | new value for itself.
         | 
         | Thus, for all the gen AI stuff like this, for which Meta
         | doesn't have an obvious path to commercialization, it makes
         | sense to release it publicly. They get plenty of benefits from
         | this - for one, engineers (and smart people generally) who are
         | working on really complex problems like to be able to talk
         | about the work they're doing. If you're picking between jobs at
         | Meta and Google, the fact that Meta's going to release your
         | stuff publicly might well be the deciding factor.
         | 
         | I would also argue that there's an economic incentive. Right
         | now, being seen as an AI company is definitely a positive for
         | your multiple. I think the movement of Meta's stock price over
         | the last 12 months relative to their change in profit and
         | revenue is certainly driven in part by the perception that
         | they're a leader in AI.
        
         | doomlaser wrote:
         | I watched a good talk from Yann LeCun who is Chief AI Scientist
         | at Meta, and he explained that the thinking is that open source
         | AI models will be the long-term winner, so it's best for them
         | to work in that arena.
         | 
         | https://www.youtube.com/watch?v=vyqXLJsmsrk
        
           | GreedClarifies wrote:
           | That's not a business strategy.
           | 
           | Likely this is driven by ego.
           | 
           | Yann wants to cement his position as a leader in AI and while
           | he clearly does not appreciate LLMS at all, he realizes that
           | he needs to make waves in this area.
           | 
           | Mark _needs_ a generative product and has invested
           | tremendously in the infrastructure for AI in general (for
           | recommendation). He needs researchers to use that
           | infrastructure to create a generative product(s).
           | 
           | Yann sees this going on, realizes that he has a very powerful
           | (research+recruiting) position and tells mark that he will
           | only sign on if Meta gives away a good deal of research and
           | Mark concedes, with the condition that he wants his
           | generative product by end of 2023 or start of 2024.
        
             | oceanplexian wrote:
             | It's not just ego. It's accelerationism. Giving this stuff
             | away from free is probably going to accelerate AI a decade
             | faster than if it was kept locked up behind closed doors at
             | Google, OpenAI, etc. And if you're an optimist then that
             | actually might make the world a better place much faster.
        
               | swader999 wrote:
               | Realistically, AI will ramp up the good and the bad.
        
             | lsmeducation wrote:
             | Linux is wasn't a business strategy either.
        
         | heatmiser wrote:
         | It makes sense to me that Facebook is releasing these models
         | similarly to the way that Google releases Android OS. Google's
         | advertising model benefits from as many people being online as
         | possible and their mobile operating system furthers that aim.
         | Similarly, Facebook's advertising model benefits from having
         | loads of content being generated to then be posted in their
         | various products' feeds.
        
         | fnordpiglet wrote:
         | As a business strategy I would see it as preventing themselves
         | from being hemmed in from the market leaders. By open sourcing
         | and raising the bar for commodity AI, they get to crowd source
         | improvement to their models and techniques to get ahead in
         | their own uses by co-opting open source improvement. I would
         | say to sage this is working amazingly well - the amount of
         | interest around open source models from meta is immense. I also
         | think, as do I, the majority of uses in the future will be from
         | fine tuned RAG capable models embedded in devices, not
         | pangalactic planet sized computers running septillion parameter
         | models. Llamacpp is a perfect illustration of where that's
         | working.
         | 
         | We followed a similar model under more duress at Netscape. When
         | you use Firefox that's the fruit of that effort. It didn't save
         | Netscape, but Meta has a better and more diversified revenue
         | base.
        
         | vessenes wrote:
         | They are behind commercially, very behind.
         | 
         | They also don't have the same economic setup and DNA as
         | MS/OpenAI. Large corporate customers don't pay for access to
         | the FB cloud, nor are they likely to -- Ellison has spent years
         | building out Oracle Cloud, and he's on the FB board, for
         | example. And I bet you didn't think of using Oracle's Cloud for
         | your last project.
         | 
         | So, your company DNA is free-to-all social based on ad
         | monetization, with a large bet on metaverse / AR / experiential
         | social compute being next. You aren't a trusted corporate
         | partner for anything but gatekeeping your immense community
         | through ad sales.
         | 
         | And, it's clear you a) have some of the most interesting
         | private social data in the world, including photos and DMs and
         | texts, and b) this AI thing is huge.
         | 
         | A play that doesn't f with your existing corporate structure
         | too much is to build this stuff, give it away, keep publishing,
         | build your AI team internally, and see where it takes you.
         | 
         | This isn't the only play, but I think it's reasonable. It's
         | pretty clear large enterprises are going to need their own,
         | internally built / owned, Foundation models to be competitive
         | in a bunch of arenas in the next decade. In this case, if FB
         | can get a little mindshare, keep the conversation going, and as
         | a sidenote, be a disruptor by lowering Azure/OpenAI revs with
         | open releases at-the-edge, that's probably a strategy win.
         | 
         | If I were in charge of AI strategy at FB, I'd probably double
         | down more on generative AI, and I'd be working hard on realtime
         | multimodal stuff -- their recent very large multimodal speech
         | to text in multiple languages work is good. If a team could
         | eyeball realtime-ish video chat with translations, that would
         | be something the platform has a natural advantage in pushing
         | out. Generative hits existing customers, and metaverse asset
         | creation, which is going to experience radical changes in costs
         | and productivity over the next few years, and impact Oculus
         | 100% no matter what anybody wishes were true.
        
           | tyre wrote:
           | I don't believe they're going for the same hosted
           | monetization as Oracle or Google. I'm sure they'll play
           | around with assistant AIs but you can imagine them leveraging
           | their graph and data for this.
           | 
           | Who is better positioned to answer a question like, "What
           | should I get my friend Sophia for her birthday?"
           | Facebook/Instagram already have huge volumes of data to
           | specifically target ads. They can feed those into a chat
           | interface pretty easily.
           | 
           | Customers would then buy per impression by describing their
           | product and trusting Facebook to place it correctly. They
           | already do this today, it's just a different medium.
        
             | cvhashim04 wrote:
             | > Who is better positioned to answer a question like, "What
             | should I get my friend Sophia for her birthday?"
             | Facebook/Instagram already have huge volumes of data to
             | specifically target ads. They can feed those into a chat
             | interface pretty easily.
             | 
             | Interesting idea but sounds risky and intrusive in
             | practice.
        
               | hiatus wrote:
               | I think this suggestion lacks subtlety. More likely,
               | around the time leading up to Sophia's birthday, you will
               | see more ads for things (maybe even gift idea ads) that
               | just so happen to be things Sophia would love (at least,
               | according to their data).
        
               | roughly wrote:
               | > Interesting idea but sounds risky and intrusive in
               | practice.
               | 
               | That's pretty much the entire Meta empire in a single
               | sentence.
        
             | Paradigma11 wrote:
             | Medication for her new std?
        
           | throwaway290 wrote:
           | Commercially it's not clear if there is a reliable "ahead",
           | I'd be surprised if copyright lawsuits don't start hitting
           | MS/OAI when publishers wake up and if you take out that
           | training data where does it leave their models?
        
             | visarga wrote:
             | Countries putting copyright above AI progress will just
             | fall behind. It's one thing to demand no exact replication
             | of copyrighted content, another to forbid training on
             | copyrighted works. Ideas were not supposed to be under
             | copyright, only expression, from what I remember.
        
               | throwaway290 wrote:
               | The argument that copyright abuse is required for "AI
               | progress" is sus. It is required for quick easy buck to
               | be made by the likes of Microsoft-- that I agree...
        
           | roughly wrote:
           | That's interesting. I tend to lump FB, Amazon, Google, and MS
           | in my head when thinking about the tech giants, but you're
           | right, FB is the only one of those not offering a commercial
           | platform. For them, building out the capabilities of the LLMs
           | is something to be done in the open with community
           | involvement, because they're not going to monetize the models
           | themselves.
           | 
           | They're also getting a fantastic amount of press from all
           | this, which is good for attracting talent and helping improve
           | their image, at least among the nerd set.
        
             | basch wrote:
             | I'm in the camp that its a mistake for Meta to not be
             | competing in the commercial compute space.
             | 
             | wrote about and diagramed it here -
             | https://telegra.ph/Facebook-Social-Services-FbSS-a-missed-
             | op...
        
               | CuriouslyC wrote:
               | Meta absolutely could not overcome the barriers to entry
               | and technical mismatch for any sort of traditional IAS
               | style product, and it would be foolish for them to try.
               | They might be able to pull off some sort of next
               | generation Heroku style service aimed at smaller shops
               | with built in facebook integration and authn/z
               | management, but that's tangential.
        
             | oceanplexian wrote:
             | FB is unlike the other BigTech(tm) since Zuck never sold
             | out and has a controlling equity stake. Amazon, Google, and
             | MS are all controlled by and beholden to institutional
             | investors.
             | 
             | FB can release these for no other reason than Zuck's ego or
             | desire to kill OpenAI. Same deal as him going off on a
             | tangent with the Metaverse thing.
        
               | boppo1 wrote:
               | Wonder why Zuck particularly wants to kill OpenAI instead
               | of increasing revenue with a new product offering.
        
               | p1esk wrote:
               | Given that OpenAI finished training GPT4 a year ago, and
               | no models today (including these) can beat it, I highly
               | doubt anyone is capable of killing Open AI in the near
               | future. I'm guessing by the time GPT5 is out, someone
               | will finally catch up with GPT4.
        
               | sroussey wrote:
               | They could always spin it out as a separate company.
        
               | mupuff1234 wrote:
               | Larry and Sergey still control majority of voting power
               | from what I recall.
        
             | vladms wrote:
             | Depends what you mean by platform and depends what you mean
             | by FB. If by FB you mean Meta, they have also
             | https://www.workplace.com/ (which is like an internal
             | facebook), instagram, whatsapp and some others. Integration
             | of LLMs technology in those "platform" might give them some
             | advantage.
        
               | roughly wrote:
               | Right, but they're not competing directly on offering the
               | LLM - they benefit from having a better LLM as a feature,
               | but their value add is elsewhere in the product.
        
           | Der_Einzige wrote:
           | You ought to think about using Oracle Cloud for your next
           | LLM/GPU project, because they sell access to A100/H100s for
           | cheap and they actually have them in stock!
        
           | hutzlibu wrote:
           | "b) this AI thing is huge."
           | 
           | Yeah, there are tons of opportunities for AI to do something
           | with facebooks private user data and sell new services. For
           | users to create engagement - and for ad companies to get very
           | good targeted ads delivered. It is of course a challenge, to
           | update the models on the fly, to include the latest private
           | data, but then you can tailor an ad, that has subtil
           | references to the latest shared wishes of the user. Probably
           | quite effective.
           | 
           | So for now they mainly need top talent, to make some of it
           | work. And open source is the best bet, for creating a
           | ecosystem they can control and get talents who already
           | trained on their tools. And they loose allmost nothing,
           | because yes, they ain't in the cloud buisness.
           | 
           | So I will continue to not use facebook. But the models I will
           | try.
        
           | jatins wrote:
           | In times like these Facebook/Zuck probably wonders how things
           | would have turned out had they not killed Parse.
           | 
           | Had they continued with it, they'd have likely had some
           | semblance of a public cloud today and would be able to sell
           | these models.
        
             | liuliu wrote:
             | Yes. But it also needs a very different org structure to
             | support that. Their internal infra from what I heard is
             | dated (monolithic PHP binary deployment, no federated
             | authorization management etc.). It is doable (FAIR's org
             | structure was very different in the first a few years), but
             | would also be a distraction for a long time.
             | 
             | Very interesting to ponder for sure.
        
           | nabusman wrote:
           | I would add that having open source gen AI will enable the
           | creation of content for metaverse / AR / VR, which will
           | improve the chances that all of that will take off.
        
             | vessenes wrote:
             | Right, exactly this. Ratcheting the costs down two orders
             | of magnitude in both dollar and expertise/human costs is
             | going to make huge changes. You better believe FB is
             | thinking about this hard.
        
       | lordnacho wrote:
       | Copilot has been working great for me thus far, but it's limited
       | by its interface. It seems like it only knows how to make
       | predictions for the next bit of text.
       | 
       | Is anyone working on a code AI that can suggest refactorings?
       | 
       | "You should pull these lines into a function, it's repetitive"
       | 
       | "You should change this structure so it is easier to use"
       | 
       | Etc
        
         | sestinj wrote:
         | You can use Continue for all of this, as easy as highlighting
         | code and making the request. We also support using Code Llama:
         | https://continue.dev/docs/walkthroughs/codellama
        
           | thewataccount wrote:
           | Any plans to support IntelliJ?
        
             | vunderba wrote:
             | Yeah this would be a crucial feature - interoperability
             | with Jetbrains IDEs.
        
             | GordonS wrote:
             | I'd also be really keen on this.
        
         | adocomplete wrote:
         | Give Cody a try! (Cody.dev)
         | 
         | With Cody you can create embeddings for your entire repo, so
         | Cody will have much greater context about your code base and
         | the problems you're trying to solve.
         | 
         | Disclaimer: I just joined Sourcegraph a few weeks ago.
        
           | stuzenz wrote:
           | Cody is great, it had become my go-to (and I pay for Github
           | Co-pilot).
           | 
           | With that said, they have recently changed the architecture,
           | with the local install required, and I have not managed (yet)
           | to get it working with NixOS. Once I have some more time, I
           | will try again - it looks like there will be some hoops to go
           | through. https://nixos.org/manual/nixpkgs/stable/#ssec-pkgs-
           | appimageT...
           | 
           | Kudos to the Source Graph team, Source Graph's original
           | product was nicely thought out and ahead of it's time. Nice
           | to see how the original product gave a nice basis for
           | building out Cody.
        
         | armchairhacker wrote:
         | Copilot calls these "Code Brushes"
         | https://githubnext.com/projects/code-brushes/
         | 
         | Last I heard they are in beta and don't work very well (even on
         | the examples page: the "add types" brush is too strict, since
         | `a` and `b` are checked for `null`, and the "fix simple bug" is
         | a typo)
        
         | artificialLimbs wrote:
         | I let mine generate whatever it likes, then add a comment below
         | such as "# Refactor the above to foo.." Works fairly well at
         | times.
        
           | lordnacho wrote:
           | Can it suggest deletions? Just seems like I don't know how to
           | use it.
        
         | make3 wrote:
         | There's an instruct model in there, you can definitely use it
         | for this, that's one of the objectives.
         | 
         | An instruct model means that you can ask it to do what you
         | want, including asking it to give you refactoring ideas from
         | the code you will give it.
        
           | regularfry wrote:
           | Sounds like what's needed is a bit of tooling in the
           | background consistently asking the LLM "How would you improve
           | this code?" so you don't need to actually ask it.
        
           | lordnacho wrote:
           | How do I access it from my IDE? Jetbrains/VSCode?
        
         | phillipcarter wrote:
         | SourceGraph Cody is going in that direction, as is Copilot
         | Chat. But it's still early days. I don't think there's anything
         | robust here yet.
        
         | nvm0n2 wrote:
         | Neither of those tasks require AI. IntelliJ IDEA will happily
         | suggest both for you today, locally. It can find large chunks
         | of duplicated code and automatically refactor them out to
         | functions for you. And it has many inspections that suggest
         | refactorings to make code clearer.
        
         | fpgaminer wrote:
         | https://docs.github.com/en/copilot/github-copilot-chat/using...
         | can basically do that if you're in the beta.
        
         | claytongrassick wrote:
         | I've been using Cursor (https://www.cursor.so/) and it can do
         | embeddings of the entire codebase, refactoring entire classes,
         | etc. I had it rewrite a UI to add state to show one item at a
         | time and have a selection list to the left and it executed it
         | perfectly in MUI controls, first try.
        
       | modeless wrote:
       | Interesting that there's a 34B model. That was missing from the
       | original Llama 2 release. I wonder if it's still usable for
       | general non-code chat tasks or if the code fine tuning destroyed
       | that. It should be the best model that would still fit on 24GB
       | gaming GPUs with quantization, because 70B doesn't fit.
        
         | brucethemoose2 wrote:
         | Someone "grafted" llama 33B onto llama v2 13B to make "llama
         | 22B"
         | 
         | https://huggingface.co/chargoddard/llama2-22b
         | 
         | Theoretically this is an even better size, as it would fit on a
         | 20GB-24GB GPU with more relaxed quantization and much longer
         | context.
         | 
         | Metrics are slightly below 13B, but the theory is that the
         | higher parameter count is more amenable to finetuning. If you
         | search for 22B on huggingface, you can see that frankenllama
         | experiments are ongoing:
         | 
         | https://huggingface.co/models?sort=modified&search=22b
        
         | redox99 wrote:
         | I can't imagine it being better than Llama1 33B, after all this
         | code finetuning.
        
           | modeless wrote:
           | But the license for llama 2 is a whole lot better.
        
             | redox99 wrote:
             | Meh.
             | 
             | If you're using it commercially you're probably deploying
             | it on a server where you're not limited by the 24GB and you
             | can just run llama 2 70b.
             | 
             | The majority of people who want to run it locally on 24GB
             | either want roleplay (so non commercial) or code (you have
             | codellama)
        
         | nabakin wrote:
         | Looks like they left out another model though. In the paper
         | they mention a "Unnatural Code Llama" which wipes the floor
         | with every other model/finetune on every benchmark except for
         | slightly losing to Code Llama Python on MBPP pass@100 and
         | slightly losing to GPT-4 on HumanEval pass@1 which is insane.
         | 
         | Meta says later on that they aren't releasing it and give no
         | explanation. I wonder why given how incredible it seems to be.
        
           | ImprobableTruth wrote:
           | It's "unnatural" because it was finetuned on generated data
           | using another model, almost certainly gpt-4 (whose TOS forbid
           | this).
        
       | jrh3 wrote:
       | lol... Python for Dummies (TM)
        
       | the-alchemist wrote:
       | Anyone know if it supports Clojure?
        
       | waitingkuo wrote:
       | Looks like that we need to request the access first
        
         | taylorbuley wrote:
         | In the past, LLAMA access was granted nearly immediately. For
         | HuggingFace downloads, it took a full day.
        
       | naillo wrote:
       | Feels like we're like a year away from local LLMs that can debug
       | code reliably (via being hooked into console error output as
       | well) which will be quite the exciting day.
        
         | brucethemoose2 wrote:
         | That sounds like an interesting finetuning dataset.
         | 
         | Imagine a database of "Here is the console error, here is the
         | fix in the code"
         | 
         | Maybe one could scrape git issues with console output and
         | tagged commits.
        
         | ilaksh wrote:
         | Have you tried Code Llama? How do you know it can't do it
         | already?
         | 
         | In my applications, GPT-4 connected to a VM or SQL engine can
         | and does debug code when given error messages. "Reliably" is
         | very subjective. The main problem I have seen is that it can be
         | stubborn about trying to use outdated APIs and it's not easy to
         | give it a search result with the correct API. But with a good
         | web search and up to date APIs, it can do it.
         | 
         | I'm interested to see general coding benchmarks for Code Llama
         | versus GPT-4.
        
           | jebarker wrote:
           | What does "GPT-4 connected to a VM or SQL engine" mean?
        
             | ilaksh wrote:
             | https://aidev.codes shows connected to VM.
        
           | tomr75 wrote:
           | Have you tried giving up to date apis as context?
        
       | 6stringmerc wrote:
       | So it's stubborn, stinks, bites and spits?
       | 
       | No thanks, going back to Winamp.
        
       | gorbypark wrote:
       | I can't wait for some models fine tuned on other languages. I'm
       | not a Python developer, so I downloaded the 13B-instruct variant
       | (4 bit quantized Q4_K_M) and it's pretty bad at doing javascript.
       | I asked it to write me a basic React Native component that has a
       | name prop and displays that name. Once it returned a regular
       | React component, and when I asked it to make sure it uses React
       | Native components, it said sure and outputted a bunch of random
       | CSS and an HTML file that was initializing a React project.
       | 
       | It might be the quantization or my lacklustre prompting skills
       | affecting it, though. To be fair I did get it to output a little
       | bit of useful code after trying a few times.
        
       | mdaniel wrote:
       | it looks like https://news.ycombinator.com/item?id=37248844 has
       | gotten the traction at 295 points
        
         | dang wrote:
         | Maybe we'll merge that one hither to split the karma.
        
       | WhitneyLand wrote:
       | How much am I'm missing out on with tools like this or code
       | pilot, compared to using GPT-4?
       | 
       | I guess since Xcode doesn't have a good plug-in architecture for
       | this I began experimenting more with a chat interface.
       | 
       | So far gpt-4 has seemed quite useful for generating code,
       | reviewing code for certain problems, etc.
        
       | syntaxing wrote:
       | TheBloke doesn't joke around [1]. I'm guessing we'll have the
       | quantized ones by the end of the day. I'm super excited to use
       | the 34B Python 4 bit quantized one that should just fit on a
       | 3090.
       | 
       | [1] https://huggingface.co/TheBloke/CodeLlama-13B-Python-fp16
        
         | [deleted]
        
         | stuckinhell wrote:
         | What kind of cpu/gpu power do you need for quantization or
         | these new gguf formats ?
        
           | syntaxing wrote:
           | I haven't quantized these myself since TheBloke has been the
           | main provider for all the quantized models. But when I did a
           | 8 bit quantization to see how it compares to the transformers
           | library load_in_8bit 4 months ago(?), it didn't use my GPU
           | but loaded each shard into the RAM during the conversion. I
           | had an old 4C/8T CPU and the conversion took like 30 mins for
           | a 13B.
        
           | SubiculumCode wrote:
           | i run llama2 13B models with 4-6 k-quantized oin a 3060 with
           | 12Gb VRam
        
         | suyash wrote:
         | can it be quantised further so it can run locally on a normal
         | laptop of a developer?
        
           | syntaxing wrote:
           | "Normal laptop" is kind of hard to gauge but if you have a M
           | series MacBook with 16GB+ RAM, you will be able to run 7B
           | comfortably and 13B but stretching your RAM (cause of the
           | unified RAM) at 4 bit quantization. These go all the way down
           | to 2 bit but I personally I find the model noticeably
           | deteriorate anything below 4 bit. You can see how much (V)RAM
           | you need here [1].
           | 
           | [1] https://github.com/ggerganov/llama.cpp#quantization
        
         | UncleOxidant wrote:
         | If I don't want to run this locally is it runnable somewhere on
         | huggingface?
        
           | emporas wrote:
           | Replicate has already hosted Llama2 13B, the chat version. My
           | guess is, in a short span of days or weeks they will host the
           | code version too. They charge a dollar for 2000 generations
           | if i am not mistaken.
           | 
           | https://replicate.com/a16z-infra/llama-2-13b-chat
        
         | mchiang wrote:
         | Ollama supports it already:
         | 
         | `ollama run codellama:7b-instruct`
         | 
         | https://ollama.ai/blog/run-code-llama-locally
         | 
         | More models uploaded as we speak:
         | 
         | https://ollama.ai/library/codellama
        
           | jerrysievert wrote:
           | while it supports it, so far I've only managed to get
           | infinite streams of near nonsense from the ollama models
           | (codellama:7b-q4_0 and codellama:latest)
           | 
           | my questions were asking how to construct an indexam for
           | postgres in c, how to write an r-tree in javascript, and how
           | to write a binary tree in javascript.
        
             | jmorgan wrote:
             | > managed to get infinite streams of near nonsense
             | 
             | This should be fixed now! To update you'll have to run:
             | ollama pull codellama:7b-instruct
        
             | carbocation wrote:
             | Similarly, I had it emit hundreds of blank lines before
             | cancelling it.
        
               | justinsaccount wrote:
               | Maybe it's outputting https://en.wikipedia.org/wiki/White
               | space_(programming_langua... :-)
        
               | kordlessagain wrote:
               | What fortune, I so happen to need hundreds of blank
               | lines.
        
             | syntaxing wrote:
             | Same, just tried it and it would give me infinite amount of
             | blank lines
        
               | jmorgan wrote:
               | Sorry, this should be fixed now! To update you'll have to
               | run:                 ollama pull codellama:7b-instruct
        
             | mchiang wrote:
             | still modifying the code completion (foundation / python
             | models) to see what's causing the behavior.
             | 
             | Have had some good success with the instruct model:
             | 
             | codellama:7b-instruct
        
               | jerrysievert wrote:
               | thanks! this give me some results, but I've had to use a
               | specific construct to get anything meaningful:
               | 
               | using <language> write me a <thing>
               | 
               | it's managed to spit out code, rather than "write a
               | traversal function".
        
           | [deleted]
        
           | comechao wrote:
           | I'm testing it on my M2 Air (16GB). Quite fast!
        
           | syntaxing wrote:
           | Whoa, it's absolutely astounding how fast the community is
           | reacting to these model release!
        
           | Pesthuf wrote:
           | Isn't ollama terminal only? For code, that wouldn't be good.
        
             | natrys wrote:
             | They have a server/client model. The binary comes with a
             | basic terminal front-end but you can just create your own
             | self-hosted GUI or editor integration against the API[1]:
             | 
             | [1]
             | https://github.com/jmorganca/ollama/blob/main/docs/api.md
        
               | jmorgan wrote:
               | Indeed! After pulling a model with "ollama pull
               | codellama" you can access it via the REST API:
               | curl -X POST http://localhost:11434/api/generate -d '{
               | "model": "codellama",         "prompt":"write a python
               | script to add two numbers"       }'
        
       | [deleted]
        
       | benvolio wrote:
       | >The Code Llama models provide stable generations with up to
       | 100,000 tokens of context.
       | 
       | Not a bad context window, but makes me wonder how embedded code
       | models would pick that context when dealing with a codebase
       | larger than 100K tokens.
       | 
       | And this makes me further wonder if, when coding with such a tool
       | (or at least a knowledge that they're becoming more widely used
       | and leaned on), are there some new considerations that we should
       | be applying (or at least starting to think about) when
       | programming? Perhaps having more or fewer comments, perhaps more
       | terse and less readable code that would consume fewer tokens,
       | perhaps different file structures, or even more deliberate naming
       | conventions (like Hungarian notation but for code models) to
       | facilitate searching or token pattern matching of some kind.
       | Ultimately, in what ways could (or should) we adapt to make the
       | most of these tools?
        
         | gonzan wrote:
         | I built a VS code extension a while back that I still use that
         | wraps GPT-4 and writes code directly in my editor.
         | 
         | The method I used to choose which files to feed GPT-4 was
         | embeddings-based. I got an embedding for each file and then an
         | embedding from the instruction + some simple processing to pick
         | the files more likely to be relevant. It isn't perfect but good
         | enough most of the time in medium-sized codebases (not very
         | large ones).
         | 
         | The one thing I started doing because of how I implemented this
         | is make files shorter and move stuff into different files.
         | Having a 1k+ LOC file is prohibitive because it eats up all the
         | context window (although with 100k context window maybe less
         | so). I think it's a good idea to keep files short anyways.
         | 
         | There's other smarter things that can be done (like embed and
         | pass individual functions/classes instead of entire files) so I
         | have no doubt someone will build something smarter soon. You'll
         | likely not have to change your coding patterns at all to make
         | use of AI.
        
         | visarga wrote:
         | A good practice is to have a prompt file where you keep the
         | information you want the model to have at its disposal. Then
         | you put it in the start of your conversations with GPT-4. It's
         | also good documentation for people.
         | 
         | You start a project by defining the task. Then as you iterate,
         | you can add new information to the prompt. But it can be also
         | partially automated - the model can have a view of the file
         | structure, classes, routes, assets and latest errors.
         | 
         | I was really hoping that the one year update of Codex would be
         | that - a LLM that can see deep into the project, not just code,
         | but runtime execution, debugging, inspecting and monitoring.
         | Something that can iterate like autoGPT. Unfortunately it
         | didn't improve much and has weird conflicts with the native
         | code completion in VSCode, you get freezes or doubled brackets.
        
         | wokwokwok wrote:
         | That seems daft.
         | 
         | You can, I suppose, contract your code so that it's context
         | free and uses less tokens, but that makes it more confusing for
         | humans _and language models_.
         | 
         | Taken to the extreme, you can see obviously with one letter
         | functions and variables like i, j, k the model will be able to
         | infer literally nothing and, thus, produce arbitrary nonsense.
         | 
         | Clearly the solution is to do what we _already_ do to manage
         | complexity which is to decompose large tasks into smaller black
         | box modules with an api where the (large number of tokens)
         | implementation is hidden and not known or relevant to using it.
         | 
         | If you give an LLM a function signature and good description,
         | maybe some usage examples, it doesn't need the implementation
         | to use it.
         | 
         | Terseness _decreases_ the ability of LLMs to process code; it
         | doesn't solve context length, and even at best it doesn't
         | scale.
         | 
         | 100k tokens is plenty.
         | 
         | You don't need to do anything like that.
        
           | emporas wrote:
           | The process of decomposing the task into smaller steps and
           | generate each step independently seems to be the correct way
           | in my experience too. It works very well with GPT (chatGPT or
           | GPT4).
           | 
           | >100k tokens is plenty.
           | 
           | The context window can be really helpful, in case there is a
           | release of a new library and the user wants to generate code
           | targeting the API of the library. When the training date
           | stops at August 2023, any library released after that date is
           | not known to the engine.
           | 
           | My general opinion in regards to context window, is that 1
           | trillion tokens context window still may not be enough for
           | all use cases.
        
         | roughly wrote:
         | I've found the utility of the coding LLMs gets a lot higher
         | when you've got code comments and descriptive variable and
         | function names - the LLM makes better inferences and
         | suggestions. We've seen similar on data - properly tagged data
         | and descriptive field names helps the LLM to produce much more
         | useful responses. I'm secretly hoping the spread of these tools
         | will finally lead my fellow developers to comment their code
         | and stop using three character variable names.
        
           | GreedClarifies wrote:
           | Commenting the code in this manner sounds like a job for an
           | LLM, maybe with human assistance in the short run.
        
             | bbor wrote:
             | This is my ultimate (short term) AI fear - letting it get
             | into a feedback loop with itself, leading to perverse and
             | incorrect results.
             | 
             | To state my position more clearly: I don't think an AI
             | could comment code from scratch very well - how would it
             | know all the decisions made, business logic considerations,
             | historical conventions, micro-industry standards, etc?
             | 
             | A good benchmark I was told once was "if a human expert
             | couldn't do it, an AI probably can't either". And
             | commenting code I didn't write would certainly test the
             | bounds of my abilities
        
         | ttul wrote:
         | Your developer tool already maps out the entire code base in
         | useful ways, such as knowing all the symbols available in the
         | current context and the structure of classes. This information
         | can be distilled for presentation to the LLM. For instance, if
         | you're wanting to generate a method implementation inside a C++
         | class, the LLM can be given a condensed version of the header
         | files that the compiler would have access to on compiling that
         | specific class. Removing white space and comments and boiling
         | macros down saves a lot of tokens.
         | 
         | You can also probably skip including standard library headers
         | since those will be well known to the LLM through its fine
         | tuning.
         | 
         | Either way, consider that a typical preprocessed C++ file would
         | push against the 100K limit even with some optimizations. You
         | will definitely want to have some middleware doing additional
         | refinement before presenting that file to the LLM.
        
         | brucethemoose2 wrote:
         | This sounds like a job for middleware. Condensing split code
         | into a single huge file, shortening comments, removing
         | whitespace and such can be done by a preprocessor for the llm.
        
           | gabereiser wrote:
           | So now we need an llmpack like we did webpack? Could it be
           | smart enough to truncate comments, white space, etc?
        
             | brucethemoose2 wrote:
             | You dont even need an llm for trimming whitespace, just a
             | smart parser with language rules like ide code checkers
             | already use. Existing llms are fine at summarizing
             | comments, especially with language specific grammar
             | constraints.
        
               | gabereiser wrote:
               | My point. We don't need the middleware.
        
         | adamgordonbell wrote:
         | Solutions exist that feed LLMS ctags, and seem to work well.
         | The function signatures and symbols names for a code base are
         | much smaller than the actual code.
        
       | e12e wrote:
       | Curious if there are projects to enable working with these things
       | self-hosted, tuned to a git repo as context on the cli, like a
       | Unix filter - or with editors like vim? (I'd love to use this
       | with Helix)
       | 
       | I see both vscode and netbeans have a concept of "inference URL"
       | - are there any efforts like language server (lsp) - but for
       | inference?
        
         | ilaksh wrote:
         | https://github.com/runvnc/smartcat
        
         | ingridpan wrote:
         | not quite self-hosted but gradient.ai gives you access to
         | llama2 via CLI
        
       | up6w6 wrote:
       | Even the 7B model of code llama seems to be competitive with
       | Codex, the model behind copilot
       | 
       | https://ai.meta.com/blog/code-llama-large-language-model-cod...
        
         | SparkyMcUnicorn wrote:
         | I'm not sure copilot is using codex anymore[0]. They've also
         | been talking about a shift towards GPT-4 with "Copilot X" a few
         | times now[1][2].
         | 
         | [0] https://github.blog/2023-07-28-smarter-more-efficient-
         | coding...
         | 
         | [1] https://github.com/features/preview/copilot-x
         | 
         | [2] https://github.blog/2023-07-20-github-copilot-chat-beta-
         | now-...
        
         | ramesh31 wrote:
         | >Even the 7B model of code llama seems to be competitive with
         | Codex, the model behind copilot
         | 
         | It's extremely good. I keep a terminal tab open with 7b running
         | for all of my "how do I do this random thing" questions while
         | coding. It's pretty much replaced Google/SO for me.
        
           | solarkraft wrote:
           | Huh? Do you perhaps mean standard Llama?
        
           | coder543 wrote:
           | You've already downloaded and thoroughly tested the 7B
           | parameter model of "code llama"? I'm skeptical.
        
             | realce wrote:
             | Just sign up at meta and you'll get an email link in like 5
             | minutes
        
               | coder543 wrote:
               | Yes, that's not a response to my comment.
               | 
               | No one who has been using any model for just the past 30
               | minutes would say that it has "pretty much replaced
               | Google/SO" for them, unless they were being facetious.
        
               | dataangel wrote:
               | GPT4 has replaced SO for me and I've been using it for
               | months.
        
               | tyre wrote:
               | They said 7b llama which I read as the base LLaMa model,
               | not this one specifically. All of these LLMs are trained
               | on Stack Overflow so it makes sense that they'd be good
               | out of the box.
        
               | brandall10 wrote:
               | The top level comment is specifically citing performance
               | of code llama against codex.
        
               | [deleted]
        
             | bbor wrote:
             | It was made available internally, I believe. So this is one
             | of the many Meta engineers on this site --- after all,
             | Facebook is now less hated than Google here ;)
        
               | [deleted]
        
             | Eddygandr wrote:
             | Maybe confused Code Llama with Llama 2?
        
             | lddemi wrote:
             | Likely meta employee?
        
               | MertsA wrote:
               | I've been using this or something similar internally for
               | months and love it. The thing that gets downright spooky
               | is the comments believe it or not. I'll have some method
               | with a short variable name in a larger program and not
               | only does it often suggest a pretty good snippet of code
               | the comments will be correct and explain what the intent
               | behind the code is. It's just a LLM but you really start
               | to get the feeling the whole is greater than the sum of
               | the parts.
        
               | coder543 wrote:
               | I just don't understand how anyone is making practical
               | use of local code completion models. Is there a VS Code
               | extension that I've been unable to find? HuggingFace
               | released one that is meant to use their service for
               | inference, not your local GPU.
               | 
               | The instruct version of code llama could certainly be run
               | locally without trouble, and that's interesting too, but
               | I keep wanting to test out a local CoPilot alternative
               | that uses these nice, new completion models.
        
               | fredoliveira wrote:
               | There are a bunch of VSCode extensions that make use of
               | local models. Tabby seems to be the most friendly right
               | now, but I admittedly haven't tried it myself:
               | https://tabbyml.github.io/tabby/
        
           | ohyes wrote:
           | What hardware do you have that lets you run 7b and do other
           | stuff at the same time?
        
             | _joel wrote:
             | If you're willing to sacrifice token/s you can even run
             | these on your phone.
        
             | hmottestad wrote:
             | Maybe a MacBook Pro. The Apple silicon chops can offload a
             | special AI inference engine, and all ram is accessible by
             | all parts of the chip.
        
             | gzer0 wrote:
             | An M1 Max with 64GB of RAM allows me to run multiple models
             | simultaneously, on top of stable diffusion generating
             | images non-stop + normal chrome, vscode, etc. Definitely
             | feeling the heat, but it's working. Well worth the
             | investment.
        
             | brucethemoose2 wrote:
             | Pretty much any PC with 16GB+ of fast RAM can do this, any
             | PC with a dGPU can do it well.
        
       | [deleted]
        
       | rafaelero wrote:
       | Those charts remind me just how insanely good GPT-4 is. It's
       | almost 5 months since its release and I am still at awe with its
       | capabilities. The way it helps with coding is just crazy.
        
       | binary132 wrote:
       | I wonder whether org-ai-mode could easily support this.
        
       | [deleted]
        
       | scriptsmith wrote:
       | How are people using these local code models? I would much prefer
       | using these in-context in an editor, but most of them seem to be
       | deployed just in an instruction context. There's a lot of value
       | to not having to context switch, or have a conversation.
       | 
       | I see the GitHub copilot extensions gets a new release one every
       | few days, so is it just that the way they're integrated is more
       | complicated so not worth the effort?
        
         | thewataccount wrote:
         | For in-editor like copilot you can try this locally -
         | https://github.com/smallcloudai/refact
         | 
         | This works well for me except the 15B+ don't run fast enough on
         | a 4090 - hopefully exllama supports non-llama models, or maybe
         | it'll support CodeLLaMa already I'm not sure.
         | 
         | For general chat testing/usage this works pretty well with lots
         | of options - https://github.com/oobabooga/text-generation-
         | webui/
        
           | msp26 wrote:
           | >This works well for me except the 15B+ don't run fast enough
           | on a 4090
           | 
           | I assume quantized models will run a lot better. TheBloke
           | already seems like he's on it.
           | 
           | https://huggingface.co/TheBloke/CodeLlama-13B-fp16
        
             | thewataccount wrote:
             | Unfortunately what I tested was StarCoder 4bit. We really
             | need exllama which should make even 30b viable from what I
             | can tell.
             | 
             | Because codellama is llama based it may just work possibly?
        
         | modeless wrote:
         | http://cursor.sh integrates GPT-4 into vscode in a sensible
         | way. Just swapping this in place of GPT-4 would likely work
         | perfectly. Has anyone cloned the OpenAI HTTP API yet?
        
           | fudged71 wrote:
           | I was tasked with a massive project over the last month and
           | I'm not sure I could have done it as fast as I have without
           | Cursor. Also check out the Warp terminal replacement.
           | Together it's a winning combo!
        
           | lhl wrote:
           | LocalAI https://localai.io/ and LMStudio https://lmstudio.ai/
           | both have fairly complete OpenAI compatibility layers. llama-
           | cpp-python has a FastAPI server as well:
           | https://github.com/abetlen/llama-cpp-
           | python/blob/main/llama_... (as of this moment it hasn't
           | merged GGUF update yet though)
        
         | sestinj wrote:
         | You can use Continue as a drop-in replacement for Copilot Chat
         | with Code Llama. We've released a short tutorial here:
         | https://continue.dev/docs/walkthroughs/codellama. It should
         | save you a lot of time context-switching; you can just
         | highlight code and ask questions or make edits, all with
         | keyboard shortcuts
        
       | brucethemoose2 wrote:
       | Here is the paper:
       | 
       | https://ai.meta.com/research/publications/code-llama-open-fo...
        
       | 1024core wrote:
       | > Python, C++, Java, PHP, Typescript (Javascript), C#, and Bash
       | 
       | What?!? No Befunge[0], Brainfuck or Perl?!?
       | 
       | [0] https://en.wikipedia.org/wiki/Befunge
       | 
       | /just kidding, of course!
        
       | jmorgan wrote:
       | To run Code Llama locally, the 7B parameter quantized version can
       | be downloaded and run with the open-source tool Ollama:
       | https://github.com/jmorganca/ollama                  ollama run
       | codellama "write a python function to add two numbers"
       | 
       | More models coming soon (completion, python and more parameter
       | counts)
        
       | natch wrote:
       | Why wouldn't they provide a hosted version? Seems like a no
       | brainer... they have the money, the hardware, the bandwidth, the
       | people to build support for it, and they could design the
       | experience and gather more learning data about usage in the
       | initial stages, while putting a dent in ChatGPT commercial
       | prospects, and all while still letting others host and use it
       | elsewhere. I don't get it. Maybe it was just the fastest option?
        
         | redox99 wrote:
         | Probably the researchers at meta are only interested in
         | research, and productionizing this would be up to other teams.
        
           | natch wrote:
           | But Yann LeCun seems to think the safety problems of eventual
           | AGI will be solved somehow.
           | 
           | Nobody is saying this model is AGI obviously.
           | 
           | But this would be an entry point into researching one small
           | sliver of the alignment problem. If you follow my thinking,
           | it's odd that he professes confidence that AI safety is a non
           | issue, yet from this he seems to want no part in
           | understanding it.
           | 
           | I realize their research interest may just be the
           | optimization / mathy research... that's their prerogative but
           | it's odd imho.
        
             | ShamelessC wrote:
             | It's not that odd and I think you're overestimating the
             | importance of user submitted data for the purposes of
             | alignment research. In particular because it's more
             | liability for them to try to be responsible for outputs.
             | Really though, this way they get a bunch of free work from
             | volunteers in open source/ML communities.
        
       | Palmik wrote:
       | The best model, Unnatural Code Llama, is not released. Likely
       | because it's trained on GPT4 based data, and might violate OpenAI
       | TOS, because as per the "Unnatural" paper [1], the "unnatural"
       | data is generated with the help of some LLM -- and you would want
       | to use as good of an LLM as possible.
       | 
       | [1] https://arxiv.org/pdf/2212.09689.pdf
        
         | redox99 wrote:
         | The good thing is that if it's only finetuned on 15k
         | instructions, we should see a community made model like that
         | very soon.
        
       | bick_nyers wrote:
       | Anyone know of a good plugin for the JetBrains IDE ecosystem
       | (namely, PyCharm) that is CoPilot but with a local LLM?
        
       | WaitWaitWha wrote:
       | Can someone point me to a ELI5 sequence of steps that shows how
       | someone can install and use LLMs locally and in some way,
       | functionally?
       | 
       | Asking for purposes of educating non-technologists.
        
         | Patrick_Devine wrote:
         | There are several different ways, but the easiest way in my
         | (clearly biased) opinion is just got to ollama.ai, download it,
         | and start playing around. It works out of the box w/ newer
         | Macs, but there are versions for Linux and Windows in the
         | works.
        
       | marcopicentini wrote:
       | It's just a matter of time that Microsoft will integrate it into
       | VSCode.
        
       | daemonologist wrote:
       | Works nearly out of the box with llama.cpp, which makes it easy
       | to try locally:
       | https://github.com/ggerganov/llama.cpp/issues/2766
       | 
       | Here's some output from q4_0 quantization of CodeLlama-7b-Python
       | (first four lines are the prompt):                   # prints the
       | first ten prime numbers          def print_primes():
       | i = 2              num_printed = 0 # end of prompt
       | while num_printed < 10:                 if is_prime(i):
       | print(i)                     num_printed += 1                 i
       | += 1              def is_prime(n):             i = 2
       | while i * i <= n:                 if n % i == 0:
       | return False                 i += 1             return True
       | def main():             print_primes()              if __name__
       | == '__main__':             main()
       | 
       | It will be interesting to see how the larger models perform,
       | especially after community tuning and with better
       | context/prompting.
        
         | quickthrower2 wrote:
         | Funny watching HN be nerd sniped by a machine :-)
        
         | blibble wrote:
         | I'd fail an interview candidate that suggested adding 1 each
         | time for subsequent prime testing
        
           | maleldil wrote:
           | I assume you meant that you should add 2? If yes, that's such
           | a mind boggling basic thing to do that I agree with you, and
           | it makes no sense that you're being crucified.
        
             | blibble wrote:
             | yes
        
           | throwuxiytayq wrote:
           | i'd walk out from an interview that asked me to write a prime
           | number generator
        
             | belenos46 wrote:
             | I've done that (maybe it was fizzbuzz, now that I'm
             | thinking about it) and boy howdy does that get the people
             | you're interviewing with agitated. Saying "I'm interviewing
             | for a architect level container orchestration position. If
             | I'm reinventing the wheel writing algorithms, something is
             | _terribly_ wrong " shuts them up, but doesn't make them any
             | happier.
        
           | [deleted]
        
           | dontupvoteme wrote:
           | Simply prompting the output with "Optimize " prepended adds
           | your suggestion, and some others.
        
           | csmpltn wrote:
           | > "I'd fail an interview candidate that suggested adding 1
           | each time for subsequent prime testing"
           | 
           | Congratulations! You must be that arrogant guy everybody
           | hates interviewing with, the one with the superiority
           | complex.
           | 
           | How about instead of just failing people over literally
           | nothing (wasting everybody's time and money) - just ask the
           | candidate whether they could somehow reduce the search space
           | by utilizing the properties of a prime number?
        
           | droopyEyelids wrote:
           | The simple-to-understand, greedy algorithm is always the
           | correct first choice till you have to deal with a constraint.
        
             | blibble wrote:
             | it's not that though, there's several other typical
             | optimisations in there
             | 
             | just not the super obvious one that demonstrates extremely
             | basic understanding of what a prime number is
        
               | jpeterson wrote:
               | Having "extremely basic understanding" of prime numbers
               | immediately at one's command is important for
               | approximately 0% of software engineering jobs. If you
               | instant-fail a candidate for this, it says a lot more
               | about you and your organization than the candidate.
        
               | blibble wrote:
               | > If you instant-fail a candidate for this, it says a lot
               | more about you and your organization than the candidate.
               | 
               | yes, we expect professional software developers to have
               | basic maths skills
               | 
               | "what is a prime number" is taught to 7 year olds, it's
               | not vector calculus
               | 
               | what else would you consider to be an unreasonable thing
               | for an employer to require?
               | 
               | reading and writing skills of a typical 7 year old?
        
               | ungruntled wrote:
               | I think the key problem here is that is is a bad
               | programming question. If you know anything about prime
               | numbers then coming up with an answer is trivial. If you
               | expect a more optimized solution, then you are really
               | only gauging the interviewee's understanding of prime
               | numbers. So effectively the interview is more about
               | mathematics than it is about programming or problem
               | solving.
        
               | [deleted]
        
               | daok wrote:
               | You probably do not have a child of 7 years old because
               | they do not know at that age what is a prime number.
               | 
               | Second, basic math still that you never or rarely use or
               | with very large time between usage might get rusty. You
               | may understand the concept but not find the optimal
               | solution. The way you are responding here shows quite a
               | lot about how you are short sighted by instant-failing
               | someone with a single question instead of trying to asses
               | the whole person as much as you can. On you side, you are
               | wasting opportunity to have a great person that could be
               | a key player in your team by bringing other set of skill
               | on the table.
        
               | blibble wrote:
               | > You probably do not have a child of 7 years old because
               | they do not know at that age what is a prime number.
               | 
               | it's part of the curriculum for children of this age
               | where I grew up (I did check)
               | 
               | > The way you are responding here shows quite a lot about
               | how you are short sighted by instant-failing someone with
               | a single question instead of trying to asses the whole
               | person as much as you can. On you side, you are wasting
               | opportunity to have a great person that could be a key
               | player in your team by bringing other set of skill on the
               | table.
               | 
               | it may also be the case I have more in depth knowledge
               | about the that roles I've interviewed candidates for
               | 
               | most recently: hiring people to work for quants
               | 
               | not instantly knowing that even numbers (other than 2)
               | are not prime is a very strong signal
        
               | noduerme wrote:
               | I'm mad at myself now that it has eaten 15 minutes of my
               | time trying to come up with the right optimization.
               | What's the trick? 2, +1, and then +2 from there on seems
               | obvious but once you get to 9 is it worth building a list
               | of nonprimes to skip?
        
               | Our_Benefactors wrote:
               | https://stackoverflow.com/a/54544012/1336678
               | 
               | Common approach is to use square roots, this reduces the
               | runtime. Recommend checking out project euler if you like
               | solving hard math-code-o(n)-puzzles.
        
               | noduerme wrote:
               | I didn't want to cheat by looking on S.O. but thanks ;)
               | 
               | Yes it makes sense (in the GPT code) that you'd only go
               | up to i * i ... although looking at pythonic while:
               | statements is just gross to me in this context, it would
               | feel a lot more readable to say, e.g. in PHP:
               | 
               | for ($i=2;$i<sqrt($n);) { $i+=($i==2 ? 1 : 2); //although
               | the first one should just be outside the loop }
        
               | thewataccount wrote:
               | I think they're suggesting simply doing +2
               | 
               | +1 is not a good idea since ~half of all numbers are
               | effectively non-prime simply by being even numbers.
               | 
               | You can double the speed by using +2 without using any
               | fancy tricks, just changing a single character.
        
           | [deleted]
        
           | tasubotadas wrote:
           | Finally we meet the lifeless drone that everybody complains
           | about in the interviews.
           | 
           | My suggestion for your next interview: decide to hire them
           | just based on their leetcode score, but invite to the
           | interview just to flex that you're still better at puzzle
           | solving :-D
           | 
           | Perfect
        
         | d0mine wrote:
         | Simple, concise, more efficient:                 def
         | primes_upto(limit: int):         """Generate prime numbers <
         | *limit*."""          # Sieve of Eratosthene         is_prime =
         | [True] * limit          for n in range(2, limit):
         | if is_prime[n]:                 yield n  # found prime number
         | for c in range(n*n, limit, n):  # start with square, less
         | values are marked already                     is_prime[c] =
         | False # mark composites            if __name__ == "__main__":
         | from itertools import islice
         | print(*islice(primes_upto(100), 10)) # -> 2 3 5 7 11 13 17 19
         | 23 29
        
           | someplaceguy wrote:
           | Yeah, but yours was generated by the "post unoptimized code
           | to HN and wait for someone to optimize it" model, which,
           | although free and doesn't require a GPU, is a much slower
           | model.
        
             | turnsout wrote:
             | Someone should turn this into a product! You highlight the
             | code you want to optimize, and it posts it to hn as a semi-
             | contextually-appropriate comment to invite code golfing,
             | and the highest rated reply gets posted back to your repo
             | as a PR.
        
             | saurik wrote:
             | But, unless you are trying to find a prime number low
             | enough that you might as well look it up in a pre-generated
             | table, it might still be end-to-end more efficient?
        
               | someplaceguy wrote:
               | Ah, good point :) Touche!
        
       | reacharavindh wrote:
       | Code llama Python is very interesting. Specifically tuned for
       | Python.
       | 
       | I wonder if we could make such specific LLMs (one that is
       | proficient in all things Rust, another- all things Linux, all
       | things genomics, all things physics modeling etc) and have them
       | talk to each other to collaboratively solve problems.
       | 
       | That would be a crazy future thing! Putting machines truly to
       | work..
        
         | seydor wrote:
         | Start with a CodeLlama for C, and start treating these systems
         | as natural language compilers. C is low level enough and still
         | readable for those rare moments
        
         | brucethemoose2 wrote:
         | If you can find a large body of good, permissively licensed
         | example code, you can finetune an LLM on it!
         | 
         | There was a similar attempt for Godot script trained a few
         | months ago, and its reportedly pretty good:
         | 
         | https://github.com/minosvasilias/godot-dodo
         | 
         | I think more attempts havent been made because base llama is
         | not that great at coding in general, relative to its other
         | strengths, and stuff like Starcoder has flown under the radar.
        
         | [deleted]
        
         | esperent wrote:
         | I think this is called "mixture of experts" and also there's a
         | lot of speculation that it's how GPT-4 works, although probably
         | with just a few large models rather than many small ones.
        
           | jmiskovic wrote:
           | It's been confirmed by multiple (unofficial) sources that
           | GPT-4 is 8 models, each 220B parameters. Another rumor is
           | GPT-4 being 16x111B models.
           | 
           | There's a quite fresh and active project replicating
           | something similar with herd of llamas:
           | https://github.com/jondurbin/airoboros
        
         | bbor wrote:
         | Mark my words: you've caught a glimpse of the near future :).
         | Google "Society of Mind" if you're not yet familiar
        
       | mercurialsolo wrote:
       | Is there a version of this on replicate yet?
        
       | ilaksh wrote:
       | Between this, ideogram.ai (image generator which can spell, from
       | former Google Imagen team member and others), and ChatGPT fine-
       | tuning, this has been a truly epic week.
       | 
       | I would argue that many teams will have to reevaluate their LLM
       | strategy _again_ for the second time in a week.
        
         | ShamelessC wrote:
         | Did ideogram release a checkpoint?
        
           | ilaksh wrote:
           | I can't find any info or Discord or forum or anything. I
           | think it's a closed service that they plan to sell to make
           | money.
        
       | rvnx wrote:
       | Amazing! It's great that Meta is making AI progress.
       | 
       | In the meantime, we are still waiting for Google to show what
       | they have (according to their research papers, they are beating
       | others).
       | 
       | > User: Write a loop in Python that displays the top 10 prime
       | numbers.
       | 
       | > Bard: Sorry I am just an AI, I can't help you with coding.
       | 
       | > User: How to ask confirmation before deleting a file ?
       | 
       | > Bard: To ask confirmation before deleting a file, just add -f
       | to the rm command.
       | 
       | (real cases)
        
         | [deleted]
        
         | criley2 wrote:
         | I don't get comments like this, we can all go and test Bard and
         | see that what you're saying isn't true
         | 
         | https://g.co/bard/share/95761dd6d45e
        
           | rvnx wrote:
           | Well look for yourself:
           | 
           | https://g.co/bard/share/e8d14854ccab
           | 
           | The rm answer is now "hardcoded" (aka, manually entered by
           | reviewers), the same with the prime or fibonnaci.
           | 
           | This is why we both see the same code across different
           | accounts (you can make the test if you are curious).
        
             | ed wrote:
             | That's a hallucination. Here's a similar made-up answer:
             | 
             | https://g.co/bard/share/9ce2e6a11e83
             | 
             | LLM's aren't trained on their own documentation, and can't
             | introspect, so generally can't answer questions like this.
             | 
             | (`"Mark House" "Bard"` gives no results on Google.)
        
             | criley2 wrote:
             | Okay, so the entire point of the comment is "A current
             | model which does well used to be bad!"
             | 
             | With all due respect, is that a valuable thing to say?
             | Isn't it true of them all?
        
               | nickthegreek wrote:
               | Isn't the model STILL doing bad if it needs to present a
               | hard-coded answer?
        
               | rvnx wrote:
               | Mhh not just about the past, you can see such in current
               | answers from Bard.
               | 
               | They are generally okayish, closer to "meh", than
               | something outstanding.
               | 
               | Yes the shell script solution is better, it doesn't give
               | rm -f anymore, but is still somewhat closer to a bad
               | solution instead of just giving rm -i.
               | 
               | I'm just really happy and excited to see that a free-to-
               | download and free-to-use model can beat a commercially-
               | hosted offering.
               | 
               | This is what has brought the most amazing projects (e.g.
               | Stable Diffusion)
        
               | [deleted]
        
       | eurekin wrote:
       | theBloke cannot rest :)
        
         | ynniv wrote:
         | Every time a new model hits I'm waiting for his ggmls
        
           | brucethemoose2 wrote:
           | ggml quantization is very easy with the official llama.cpp
           | repo. Its quick and mostly dependency free, and you can pick
           | the perfect size for your CPU/GPU pool.
           | 
           | But don't get me wrong, TheBloke is a hero.
        
             | ynniv wrote:
             | Some of the newer models have slightly different
             | architectures, so he explains any differences and shows a
             | llama.cpp invocation. Plus you can avoid pulling the larger
             | dataset.
        
               | brucethemoose2 wrote:
               | Yeah. Keeping up wkth the changes is madness, and those
               | FP16 weights are huge.
        
             | int_19h wrote:
             | While we're at it, the GGML file format has been deprecated
             | in favor of GGUF.
             | 
             | https://github.com/philpax/ggml/blob/gguf-spec/docs/gguf.md
             | 
             | https://github.com/ggerganov/llama.cpp/pull/2398
        
         | regularfry wrote:
         | As if by magic...
         | https://huggingface.co/TheBloke/CodeLlama-13B-fp16. Empty so
         | still uploading right now, at a guess.
        
       | likenesstheft wrote:
       | no more work soon?
        
         | kypro wrote:
         | The ability to work less historically has always came as a
         | byproduct of individuals earning more per hour through
         | productivity increases.
         | 
         | The end goal of AI isn't to make _your_ labour more productive,
         | but to not need your labour at all.
         | 
         | As your labour becomes less useful if anything you'll find you
         | need to work more. At some point you may be as useful to the
         | labour market as someone with 60 IQ today. At this point most
         | of the world will become entirely financially dependent on the
         | wealth redistribution of the few who own the AI companies
         | producing all the wealth - assuming they take pity on you or
         | there's something governments can actually do to force them to
         | pay 90%+ tax rates, of course.
        
           | likenesstheft wrote:
           | What?
        
       | thehacker1234 wrote:
       | [flagged]
        
       | born-jre wrote:
       | 34B is grouped query attention, right? Does that make it the
       | smallest model with grouped attention?
       | 
       | I can see some people fine-tuning it again for general propose
       | instruct.
        
       | dontupvoteme wrote:
       | Did people * _really*_ think only artists would be losing their
       | jobs to AI?
        
       | 1024core wrote:
       | If GPT-4's accuracy is 67% and this is 54%, how can these guys
       | claim to be SOTA?
        
         | binreaper wrote:
         | Seriously, I was expecting to read the article and them be on a
         | level on-par with GPT-4 or higher. For all this chat of how
         | long Google/Facebook have been in the AI space longer than
         | OpenAI, their products don't speak to that..
        
       | MuffinFlavored wrote:
       | Can I feed this entire GitHub projects (of reasonable size) and
       | get non-hallucinated up-to-date API refactoring recommendations?
        
       | msoad wrote:
       | Is there any place we can try those models? Are they available on
       | HuggingFace?
        
         | jspisak wrote:
         | Partner integrations will follow. For now we just have the
         | weights available.
         | 
         | But don't worry, this community moves fast!
        
           | Eddygandr wrote:
           | Probably superseded (by y'all) within a week!
        
       | KaiserPro wrote:
       | This is great for asking questions like "how do I do x with y"
       | and this code <<some code>> isn't working, whats wrong? Much
       | faster that googling, or a great basis for forming a more
       | accurate google search.
       | 
       | Where its a bit shit is when its used to provide auto suggest. It
       | hallucinates plausible sounding functions/names, which for me
       | personally are hard to stop if they are wrong (I suspect that's a
       | function of the plugin)
        
       | Someone1234 wrote:
       | Business opportunity: I'd pay money for NICE desktop software
       | that can run all these different models (non-subscription,
       | "2-year updates included, then discount pricing" modal perhaps).
       | My wishlist:
       | 
       | - Easy plug & play model installation, and trivial to change
       | which model once installed.
       | 
       | - Runs a local web server, so I can interact with it via any
       | browser
       | 
       | - Ability to feed a model a document or multiple documents and be
       | able to ask questions about them (or build a database of some
       | kind?).
       | 
       | - Absolute privacy guarantees. Nothing goes off-machine from my
       | prompt/responses (USP over existing cloud/online ones). Routine
       | license/update checks are fine though.
       | 
       | I'm not trying to throw shade at the existing ways to running
       | LLMs locally, just saying there may be room for an OPTIONAL
       | commercial piece of software in this space. Most of them are
       | designed for academics to do academic things. I am talking about
       | a turn-key piece of software for everyone else that can give you
       | an "almost" ChatGPT or "almost" CoPilot-like experience for a one
       | time fee that you can feed sensitive private information to.
        
         | julianeon wrote:
         | What I want is even simpler: just an API that you make requests
         | to and receive answers back. Surprisingly hard to find, outside
         | OpenAI that is.
        
           | MaKey wrote:
           | Oobabooga exposes an API.
        
           | BoorishBears wrote:
           | I'm not into open source LLMs in the slightest, and yet even
           | I've trivially found tools to do what both you and the poster
           | above you wanted
           | 
           | lmstudio actually does what both of you want: provides an
           | easy GUI and serves up your model over a local endpoint that
           | mirrors the OpenAI API.
           | 
           | There's just too much noise in terms of the tooling for LLMs,
           | the solution is fewer higher quality solutions, not more
           | solutions
        
           | ingridpan wrote:
           | https://gradient.ai/ is doing that with llama2
        
             | worldsayshi wrote:
             | Looks really promising. I wonder if the similar pricing to
             | OpenAI means that Gradient is also(?) bleeding money even
             | if they get a good customer base. Or are these prices
             | sustainable over time?
        
               | ingridpan wrote:
               | Good question, esp as Gradient fine-tuning is so much
               | cheaper than Open AI's
        
           | xanderatallah wrote:
           | We're _trying_ to do this at https://openrouter.ai
        
         | MaKey wrote:
         | Did you try Oobabooga (https://github.com/oobabooga/text-
         | generation-webui) yet?
        
           | max51 wrote:
           | Oobabooga is a great tool but it still has a long way to go
           | in term of user-friendliness. It's absolutely not plug and
           | play the way that chatgpt is; It requires research, trial and
           | error, and knowledge of the tech to make the model work to
           | its full potential. It's great once you finish setting it up,
           | but it does not compare to what you would expect from a
           | commercial product aimed at normal end-users.
           | 
           | Things like bad default values, no tooltips, an no curated
           | model list to one-click download is what separates a tool
           | like Oobabooga from a paid commercial product. These things
           | require time/money and it would be very unlikely that an open
           | source tool could find resources for all the testing and R&D.
           | 
           | I think there is a big market for products where you pay and
           | can just start chatting with the model without having to ever
           | go to the settings tab or google anything unless you need to
           | do something out of the ordinary.
        
         | noduerme wrote:
         | Agreed. After several rounds of setting up various python
         | environments and tinkering with directory structures and
         | debugging glitches and quantizing models just to end up playing
         | around for a few minutes and getting bored, it would be nice to
         | have the experience just be seamless. I wouldn't try to set up
         | a workflow around seriously using what's out there to run on
         | localhost now.
         | 
         | That said, _non-subscription_ is essential, and that 's
         | probably going to be a heavy lift considering how quickly
         | things are evolving.
        
           | simonw wrote:
           | I've been trying to push things in that direction with my LLM
           | tool - the idea is to have Python plugins which you can
           | install that do all of the irritating details to get a model
           | setup.
           | 
           | I've not yet been able to solve the challenge of needing CUDA
           | etc for some models though!
           | 
           | Plugins so far:
           | https://llm.datasette.io/en/stable/plugins/directory.html
        
             | noduerme wrote:
             | Cool! I've followed your instructions and your blog quite a
             | bit as I've experimented with running local LLMs as well as
             | stable diffusion. It's been especially helpful, as python
             | is not my language or usual environment. Your patience at
             | hacking your way through each new iteration and presenting
             | what's important about them is astonishing; I personally
             | think I'd have gone mad, but you've done great work in
             | charting the territory.
        
         | irrational wrote:
         | I work for a Fortune 100 company with 80,000+ employees. All of
         | us are explicitly forbidden from using any sort of AI/LLM tool
         | without written permission from the head of legal AND the CEO.
         | In other words, nobody is going to get permission.
         | 
         | The concerns are 2 fold - 1. We might inadvertently use someone
         | else's intellectual property. 2. Someone else might gain access
         | to our intellectual property.
         | 
         | What you are describing would help alleviate the concern about
         | issue 2, but I'm not sure if it would help alleviate the
         | concerns with issue 1.
        
           | roguas wrote:
           | Change company. Honestly. If you go as far as to forbid your
           | partners in crime (workers sigh..) to explore new uncharted
           | territory at all - well ya know someone will/might just win
           | by not doing that.
        
             | sshine wrote:
             | Working for an 80.000+ employee company, one has already
             | accepted a certain degree of inertia.
        
             | flatline wrote:
             | This is particular, specifically problematic territory. I
             | cannot imagine handing over proprietary data to a third
             | party without a contract in place for how that data is
             | stored and used. It's not about innovation, it's about
             | using someone else's tools without ownership. For the other
             | case, it's both about integrity in owning your own work,
             | and a shield from legal consequences. These things should
             | be very relevant to any business.
             | 
             | I also don't know any professional devs who have used tools
             | like copilot and said they were anything but a toy. I am
             | more bullish on LLMs than most of my coworkers. I think
             | there is a lot of potential there. I do not see that
             | potential in the current commercial offerings, and the
             | financial outlay to fine-tune an open-source model and run
             | it at scale is...prohibitive.
        
             | thfuran wrote:
             | That's not banning all uncharted territory, it's banning
             | specific legally fraught territory.
        
           | ttyyzz wrote:
           | It's basically the same thing in our company, too. They
           | basically put a similar rule in place that prevents anyone
           | from using e.g. Chat GPT. Little do they know that all
           | software devs within the company are using co-pilot and the
           | company is even paying for it. It's quite a funny situation
           | tbh..
        
             | sangnoir wrote:
             | > Little do they know that all software devs within the
             | company are using co-pilot and the company is even paying
             | for it.
             | 
             | Just like annual sexual harassment training - it's mostly
             | corporate CYA on liability. If it ever goes to court,
             | they'll plead ignorance and blame the employees who should
             | have known better as they were trained/informed on what
             | they ought _not_ to do.
             | 
             | Paying for co-pilot could bite them though, so I suspect
             | it's a case were the one part of the organization isn't
             | aware of what the other is doing
        
               | ttyyzz wrote:
               | All of your assumptions are exactly right. They (mostly
               | managers with little to no IT background) want to cover
               | their own asses in case shit hits the fan (unlikely
               | scenario if you ask me, because the company is just
               | overrating the value of their data. Nobody gives a fuck
               | about us anyway...) and many parts of this company have
               | developed their own habits... The company is just very
               | big and I can understand why they might be afraid, but
               | come on, nobody will take that policy seriously forever.
               | You need to eventually put some reasonable rules in place
               | that allow you to make use of such Innovations...
        
         | ssalka wrote:
         | GPT4All satisfies these requirements, except for (AFAIK)
         | running a web server
         | 
         | https://gpt4all.io/index.html
        
           | simonw wrote:
           | It runs a web server too - if you start up the desktop app it
           | can run a web server with an API on a port for you.
        
         | alsobrsp wrote:
         | I have been using refact.ai on my laptop, it has been quite
         | good.
         | 
         | https://github.com/smallcloudai/refact/blob/main/README.md
        
           | thewataccount wrote:
           | Refact has worked for me. Hopefully exllama will support
           | CodeLlama.
        
           | firecall wrote:
           | I wish these things worked with anything other than VSCode or
           | JerBrains tools!
           | 
           | VSCode is such a bloated hog of an editor!
           | 
           | Every time I open VSCode it's bugging with badges to update
           | extensions... and it's so slow!
        
         | jmorgan wrote:
         | A few folks and I have been working on an open-source tool that
         | does some of this (and hopefully more soon!)
         | https://github.com/jmorganca/ollama
         | 
         | There's a "PrivateGPT" example in there that is similar to your
         | third point above:
         | https://github.com/jmorganca/ollama/tree/main/examples/priva...
         | 
         | Would love to know your thoughts
        
           | luma wrote:
           | I'd love to test this out as soon as you get Linux or Windows
           | support going!
        
             | appel wrote:
             | Me too! I starred the repo and am watching releases,
             | excited to try it.
        
           | SubiculumCode wrote:
           | [flagged]
        
         | tibbon wrote:
         | I've used llama.cpp easily for some local things, but it does
         | lack a good ui.
        
       | Draiken wrote:
       | As a complete noob at actually running these models, what kind of
       | hardware are we talking here? Couldn't pick that up from the
       | README.
       | 
       | I absolutely love the idea of using one of these models without
       | having to upload my source code to a tech giant.
        
         | liuliu wrote:
         | 34B _should_ be able to run on 24GiB consumer graphics card, or
         | 32GiB Mac (M1  / M2 chips) with quantization (5~6bit) (and 7B
         | _should_ be able to run on your smart toaster).
        
           | epolanski wrote:
           | Are there cloud offerings to run those models on somebody's
           | else computer?
           | 
           | Any "eli5" tutorial on how to do so, if so?
           | 
           | I want to give these models a run but I have no powerful GPU
           | to run them on so don't know where to start.
        
             | redox99 wrote:
             | On runpod there is a TheBloke template with everything set
             | up for you. An A6000 is good enough to run 70b 4bit.
        
         | dangelov wrote:
         | I've used Ollama to run Llama 2 (all variants) on my 2020 Intel
         | MacBook Pro - it's incredibly easy. You just install the app
         | and run a couple of shell commands. I'm guessing soon-ish this
         | model will be available too and then you'd be able to use it
         | with the Continue VS Code extension.
         | 
         | Edited to add: Though somewhat slow, swap seems to have been a
         | good enough replacement for not having the loads of RAM
         | required. Ollama says "32 GB to run the 13B models", but I'm
         | running the llama2:13b model on a 16 GB MBP.
        
           | j45 wrote:
           | Apple Silicon, especially an M1 Max Studio seems to be an
           | interesting machine to hang on to as the models become more
           | and more efficient with using less and less.
           | 
           | If there's nay other opinions or thoughts on this, I'd be
           | very happy to learn as well. I have considered the eGPU route
           | connected to a 1L PC such as a thinkcentre m80/90.
        
             | BoorishBears wrote:
             | I have a 64 GB M1 Max MBP, and I'd say unless you really
             | have some academic interest towards messing with open
             | models, for now accessing SOTA models via a REST API has
             | better latency for a given quality.
             | 
             | Claude 1.2 instant is as fast as 3.5, follows instructions
             | at a quality closer to 4, and has a 100k context window.
             | Hard to compete with that with an open source model right
             | now.
        
               | jkeisling wrote:
               | How does open source compete with the Claude API? Easy:
               | actually let you use the model. From the signup page:
               | 
               | > Anthropic is rolling out Claude slowly and
               | incrementally, as we work to ensure the safety and
               | scalability of it, in alignment with our company values.
               | 
               | > We're working with select partners to roll out Claude
               | in their products. If you're interested in becoming one
               | of those partners, we are accepting applications. Keep in
               | mind that, due to the overwhelming interest we've
               | received so far, we may take a while to reply.
               | 
               | No thanks, I'd much rather not wait months to see if my
               | app deserves their oh-so-limited attention, or "aligns
               | with the values" of a company taking $400m from Sam
               | Bankman-Fried.
               | 
               | To be more charitable to your underlying point, Claude 2
               | is free to chat with via Anthropic's website, Poe, or
               | Slack, and the GPT-4 API is open to use. If you're
               | building a prototype or just need a chatbot, these _do_
               | have better results and dev experience, at least for now.
               | But I don 't think picking on your Claude API example is
               | unfair. These companies could randomly refuse your
               | prompts via some opaque "moderation API" (that all GPT
               | fine-tuning data goes through!), train on your company's
               | proprietary data, spy on your most intimate questions, or
               | just not find you worth the trouble and cut you off, at
               | any time. THAT is why open source beats proprietary hands
               | down: My device, my data, my weights, my own business.
        
         | redox99 wrote:
         | If you want to run them fast, a 12GB GPU (e.g 3060) for the 13B
         | and a 24GB GPU for the 34B (e.g 3090). Otherwise llama.cpp CPU
         | inference would work on most machines.
        
       | praveenhm wrote:
       | which is the best model for coding right now, GPT4/copilot/phind
       | ?
        
       | mymac wrote:
       | Never before in the history of mankind was a group so absolutely
       | besotted with the idea of putting themselves out of a job.
        
         | yborg wrote:
         | When mechanized textile machinery was invented, the weavers
         | that had jobs after their introduction were those that learned
         | how to use them.
        
         | worksonmine wrote:
         | This should be the only goal of mankind so we can smell the
         | flowers instead of wasting our years in some cubicle. Some
         | people will always want to work, but it shouldn't be the norm.
         | What's the point really unless we're doing something we're
         | passionate about? The economy?
        
         | 037 wrote:
         | I understand the fear of losing your job or becoming less
         | relevant, but many of us love this work because we're
         | passionate about technology, programming, science, and the
         | whole world of possibilities that this makes... possible.
         | 
         | That's why we're so excited to see these extraordinary advances
         | that I personally didn't think I'd see in my lifetime.
         | 
         | The fear is legitimate and I respect the opinions of those who
         | oppose these advances because they have children to provide for
         | and have worked a lifetime to get where they are. But at least
         | in my case, the curiosity and excitement to see what will
         | happen is far greater than my little personal garden. Damn, we
         | are living what we used to read in the most entertaining sci-fi
         | literature!
         | 
         | (And that's not to say that I don't see the risks in all of
         | this... in fact, I think there will be consequences far more
         | serious than just "losing a job," but I could be wrong)
        
         | ttul wrote:
         | That's just one perspective... Another perspective is that LLMs
         | enable programmers to skip a lot of the routine and boring
         | aspects of coding - looking up stuff, essentially - so they can
         | focus on the fun parts that engage creativity.
        
           | mymac wrote:
           | But it won't stop there. Why would it stop at some
           | arbitrarily defined boundary? The savings associated with no
           | longer having to pay programmers the amounts of money that
           | they believe they are worth (high enough to result in
           | collusion between employers) are just too tempting.
        
             | swader999 wrote:
             | The answer to AI stealing your job is to go ahead and start
             | a company, solve a hard problem, sell the solution and
             | leverage AI to do this.
        
             | ilaksh wrote:
             | Some form of AI will eventually take over almost all
             | existing jobs. Whether those jobs evolve or not somehow and
             | new jobs replace them, we will see.
             | 
             | But it's definitely not just programmers. And it will take
             | time.
             | 
             | Society needs to adjust. Stopping progress would not be a
             | solution and is not possible.
             | 
             | However, hopefully we can pause before we create digital
             | animals with hyperspeed reasoning and typical animal
             | instincts like self-preservation. Researchers like LeCun
             | are already moving on from things like LLMs and working on
             | approaches that really imitate animal cognition (like
             | humans) and will eventually blow all existing techniques
             | out of the water.
             | 
             | The path that we are on seems to make humans obsolete
             | within three generations or so.
             | 
             | So the long term concern is not jobs, but for humans to
             | lose control of the planet in less than a century.
             | 
             | On the way there we might be able to manage a new golden
             | age -- a crescendo for human civilization.
        
               | lsmeducation wrote:
               | Continuing your aside...
               | 
               | Humans don't become obsolete, we become bored. This tech
               | will make us bored. When humans get too bored and need
               | shit to stir up, we'll start a war. Take US and China,
               | global prosperity is not enough right? We need to stoke
               | the flames of war over Taiwan.
               | 
               | In the next 300 years we'll wipe out most of each other
               | in some ridiculous war, and then rebuild.
        
               | ilaksh wrote:
               | I agree that WWIII is a concern but I don't think it will
               | be brought about by boredom.
               | 
               | "Global prosperity" might be true in a very long-term
               | historical sense, but it's misleading to apply it to the
               | immediate situation.
               | 
               | Taiwan is not just a talking point. Control over Taiwan
               | is critical for maintaining hegemony. When that is no
               | longer assured, there will likely be a bloody battle
               | before China is given the free reign that it desires.
               | 
               | WWIII is likely to fully break out within the next 3-30
               | years. We don't really have the facilities to imagine
               | what 300 years from now will look like, but it will
               | likely be posthuman.
        
               | lsmeducation wrote:
               | I'll go with the 30 year mark. Countries like Russia or
               | China don't get humbled in a loss (like Germany didn't in
               | WW1). Russia will negotiate some terms for Ukraine (or
               | maintain perpetual war), but I believe it will become a
               | military state that will funnel all money into the
               | defense sector. The same with Iran, and the same with
               | China.
               | 
               | Iran supplies Russia with drones. I can promise you
               | Russia will help Iran enrich their uranium. They are both
               | pariah states, what do they have to lose? Nuclear Iran,
               | here enters Israel.
               | 
               | Everyone's arming up, there's a gun fight coming.
        
             | lsmeducation wrote:
             | Okay, think about it this way. This thing helps generate
             | tons and tons of code. The more code people (or this thing)
             | writes, the more shit there is to debug. More and more
             | code, each calling each other means more and more insane
             | bugs.
             | 
             | We're going to move from debugging some crap the last
             | developer wrote to debugging an order of magnitude more
             | code the last developer generated.
             | 
             | It's going to be wonderful for job prospects really.
        
         | PUSH_AX wrote:
         | We're not looking at a product that's putting anyone out of a
         | job though, we're looking at a product that frees up a lot of
         | time, and time is great.
        
         | quickthrower2 wrote:
         | The best interpretation of this is you mean eventually ML/AI
         | will put programmers out of a job, and not Code LLama
         | specifically.
         | 
         | However it is hard to tell how that might pan out. Can such an
         | ML/AI do all the parts of the job effectively? A lot of non-
         | coding skill bleed into the coder's job. For example talking to
         | people who need an input to the task and finding out what they
         | are really asking for, and beyond that, what the best solution
         | is that solves the underlying problem of what they ask for,
         | while meeting nonfunctional requirements such as performance,
         | reliability, code complexity, and is a good fit for the
         | business.
         | 
         | On the other hand eventually the end users of a lot of services
         | might be bots. You are more likely to have a pricing.json than
         | a pricing.html page, and bots discover the services they need
         | from searches, negotiate deals, read contracts and sue each
         | other etc.
         | 
         | Once the programming job (which is really a "technical problem
         | solver" job) is replaced either it will just be same-but-
         | different (like how most programmers use high level languages
         | not C) or we have invented AGI that will take many other jobs.
         | 
         | In which case the "job" aspect of it is almost moot. Since we
         | will be living in post-scarcity and you would need to figure
         | out the "power" aspect and what it means to even be
         | sentient/human.
        
         | kbrannigan wrote:
         | Do you really want to spend you days writing REDUX
         | accumulators?
        
         | thewataccount wrote:
         | Is automation not what every engineer strives for when
         | possible? Especially software developers.
         | 
         | From my experience with github copilot and GPT4 - developers
         | are NOT going anywhere anytime soon. You'll certainly be faster
         | though.
        
         | vunderba wrote:
         | If we get to the point where these large language models can
         | create complete applications and software solutions from design
         | specs alone, then there's no reason to believe that this would
         | be limited to merely replacing software devs.
         | 
         | It would likely impact a _far_ larger swath of the engineering
         | / design industry.
        
       | jasfi wrote:
       | Now we need code quality benchmarks comparing this against GPT-4
       | and other contenders.
        
         | nick0garvey wrote:
         | They show the benchmarks in the original post, a few pages down
        
           | jasfi wrote:
           | Thanks, I missed that somehow.
        
       | braindead_in wrote:
       | The 34b Python model is quite close to GPT4 on HumanEval pass@1.
       | Small specialised models are catching up to GPT4 slowly. Why not
       | train a 70b model though?
        
       | pmarreck wrote:
       | I want "safety" to be opt-in due to the inaccuracy it introduces.
       | I don't want to pay that tax just because someone is afraid I can
       | ask it how to make a bomb when I can just Google that and get
       | pretty close to the same answer already, and I certainly don't
       | care about being offended by its answers.
        
       | Dowwie wrote:
       | What did the fine tuning process consist of?
        
       ___________________________________________________________________
       (page generated 2023-08-24 23:00 UTC)