[HN Gopher] GitHub cuts AI deals with Google, Anthropic
       ___________________________________________________________________
        
       GitHub cuts AI deals with Google, Anthropic
        
       Author : jbredeche
       Score  : 538 points
       Date   : 2024-10-29 16:20 UTC (6 hours ago)
        
 (HTM) web link (www.bloomberg.com)
 (TXT) w3m dump (www.bloomberg.com)
        
       | altbdoor wrote:
       | https://archive.is/Il4QM
        
       | campbel wrote:
       | This is pretty exciting. I'm a copilot user at work, but also
       | have access to Claude. I'm more inclined to use Claude for
       | difficult coding problems or to review my work as I've just grown
       | more confident in its abilities over the last several months.
        
         | ganoushoreilly wrote:
          | I too use Claude more frequently than OpenAI's GPT-4o. I
          | think this is a twofold move for MS, and I like it. Claude
          | being more accurate / efficient for me suggests they likely
          | see the same thing: win number 1. The second is that with
          | all the OpenAI drama, MS has started to distance itself over
          | a souring relationship (allegedly). If so, this is a tactful
          | way to move away.
         | 
         | Either way, Claude is great so this is a net win for everyone.
        
           | dartos wrote:
           | Yeah, Claude consistently impresses me.
           | 
           | A commenter on another thread mentioned it but it's very
           | similar to how search felt in the early 2000s. I ask it a
           | question and get my answer.
           | 
           | Sometimes it's a little (or a lot) wrong or outdated, but at
           | least I get something to tinker with.
        
             | gonab wrote:
              | Conversely, I feel the experience of searching has
              | degraded a lot since 2016/17. My thesis is that around
              | that time, online spam increased by an order of
              | magnitude.
        
               | TeaBrain wrote:
               | I don't think this is necessarily converse to what they
               | said.
        
               | bobthepanda wrote:
               | Winning the war against spam is an arms race. Spam hasn't
               | spent years targeting AI search yet.
        
               | dageshi wrote:
                | I think it was the switch from desktop search traffic
                | being dominant to mobile traffic being dominant; that
                | switch happened around the end of 2016.
                | 
                | Google used to prioritise big, comprehensive articles
                | on subjects for desktop users, but mobile users just
                | wanted quick answers, so that's what Google prioritised
                | as they became the biggest user group.
                | 
                | But also, per your point, I think those smaller,
                | simpler, less comprehensive posts are easier to
                | fake/spam than the larger, more comprehensive posts
                | that came before.
        
               | zeknife wrote:
                | Ironically, I almost never see quick answers in the top
                | results; mostly it's dragged-out pages of paragraph
                | after paragraph with ads in between.
        
               | dartos wrote:
               | Guess who sells the ads...
        
               | state_less wrote:
               | Old style Google search is dead, folks just haven't
               | closed the casket yet. My index queries are down ~90%. In
               | the future, we'll look back at LLMs as a major turning
               | point in how people retrieve and consume information.
        
               | darepublic wrote:
                | I still prefer it over using an LLM. And I would be
                | doubtful that LLM search has major benefits over Google
                | search, imo.
        
               | ben_w wrote:
               | Depends what you want it for.
               | 
               | Right now, I find each tool better at different things.
               | 
                | If I can only describe what I want but don't know the
                | key words, LLMs are the only solution.
               | 
               | If I need citations, LLMs suck.
        
               | EVa5I7bHFq9mnYK wrote:
                | It's getting ridiculous. Half of the time now, when I
                | ask AI to search for some information for me, it finds
                | and summarizes some very long article obviously written
                | by AI and lacking any useful information.
        
             | imiric wrote:
             | I recently tried to ask these tools for help with using a
             | popular library, and both GPT-4o and Claude 3.5 Sonnet gave
             | highly misleading and unusable suggestions. They
             | consistently hallucinated APIs that didn't exist, and would
             | repeat the same wrong answers, ignoring my previous
             | instructions. I spent upwards of 30 minutes repeating "now
             | I get this error" to try to coax them in the right
              | direction, but always ended up in a loop that got me
             | nowhere. Some of the errors were really basic too, like
             | referencing a variable that was never declared, etc.
             | Finally, Claude made a tangential suggestion that made me
             | look into using a different approach, but it was still
             | faster to look into the official documentation than to keep
             | asking it questions. GPT-4o was noticeably worse, and I
             | quickly abandoned it.
             | 
             | If this is the state of the art of coding LLMs, I really
             | don't see why I should waste my time evaluating their
             | confident sounding, but wrong, answers. It doesn't seem
             | like much has improved in the past year or so, and at this
             | point this seems like an inherent limitation of the
             | architecture.
        
               | geodel wrote:
                | Well, it is a volume business. The <1% of advanced,
                | skilled developers will find an AI helper useless, but
                | for the 99% of IT CRUD peddlers these tools are quite
                | sufficient. All in all, if employers cut 15-20% of net
                | development costs by reducing headcount, it will be
                | very worthwhile for companies.
        
               | WgaqPdNr7PGLGVW wrote:
               | I suspect it will go a different direction.
               | 
               | Codebases are exploding in size. Feature development has
               | slowed down.
               | 
               | What might have been a carefully designed 100kloc
               | codebase in 2018 is now a 500kloc ball of mud in 2024.
               | 
               | Companies need many more developers to complete a decent
               | sized feature than they needed in 2018.
        
               | outworlder wrote:
               | It's worse than that. Now the balls of mud are
               | distributed. We get incredibly complex interactions
               | between services which need a lot of infrastructure to
               | enable them, that requires more observability, which
               | requires more infrastructure...
        
               | imiric wrote:
               | Sure, but my specific question was fairly trivial, using
               | a mainstream language and a popular library. Most of my
               | work qualifies as CRUD peddling. And yet these tools are
               | still wasting my time.
               | 
               | Maybe I'll have better luck next time, or maybe I need to
               | improve my prompting skills, or use a different model,
               | etc. I was just expecting more from state of the art LLMs
               | in 2024.
        
               | WgaqPdNr7PGLGVW wrote:
               | Yeah there is a big disconnect between the devs caught up
               | in the hype and the devs who aren't.
               | 
               | A lot of the devs in my office using Claude/gpt are
               | convinced they are so much more productive but they
               | aren't actually producing features or bug fixes any
               | faster.
               | 
               | I think they are just excited about a novel new way to
               | write code.
        
               | dartos wrote:
               | FWIW I almost never ask it to write code for me. I did
               | once to write a matplotlib script and it gave me a
               | similar headache.
               | 
               | I ask it questions mostly about libraries I'm using
               | (usually that have poor documentation) and how to
               | integrate it with other libraries.
               | 
               | I found out about Yjs by asking about different
               | operational transform patterns.
               | 
               | Got some context on the prosemirror plugin by pasting the
               | entire provider class into Claude and asking questions.
               | 
               | It wasn't always exactly correct, but it was correct
               | enough that it made the process of learning prosemirror,
               | yjs, and how they interact pretty nice.
               | 
               | The "complete" examples it kept spitting out were totally
               | wrong, but the information it gave me was not.
        
               | imiric wrote:
               | To be clear, I didn't ask it to write something complex.
               | The prompt was "how do I do X with library Y?", with a
               | bit more detail. The library is fairly popular and in a
               | mainstream language.
               | 
               | I had a suspicion that what I was trying to do was simply
               | not possible with that library, but since LLMs are
               | incapable of saying "that's not possible" or "I don't
               | know", they will rephrase your prompt and hallucinate
               | whatever might plausibly make sense. They have no way to
               | gauge whether what they're outputting is actually
               | correct.
               | 
               | So I can imagine that you sometimes might get something
               | useful from this, but if you want a specific answer about
               | something, you will always have to double-check their
               | work. In the specific case of programming, this could be
               | improved with a simple engineering task: integrate the
               | output with a real programming environment, and evaluate
               | the result of actually running the code. I think there
               | are coding assistant services that do this already, but
               | frankly, I was expecting more from simple chat services.
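                | 
                | A minimal sketch of the loop I have in mind (assuming
                | the OpenAI Python SDK; the task, model, and retry count
                | are arbitrary placeholders, not any product's actual
                | implementation):
                | 
                |       import subprocess, sys, tempfile
                |       from openai import OpenAI
                | 
                |       client = OpenAI()  # reads OPENAI_API_KEY
                |       task = ("Write a Python script that prints "
                |               "the 10th Fibonacci number. Reply "
                |               "with code only, no markdown.")
                |       messages = [{"role": "user", "content": task}]
                | 
                |       for attempt in range(3):
                |           reply = client.chat.completions.create(
                |               model="gpt-4o", messages=messages)
                |           code = reply.choices[0].message.content
                |           with tempfile.NamedTemporaryFile(
                |                   "w", suffix=".py",
                |                   delete=False) as f:
                |               f.write(code)
                |           run = subprocess.run(
                |               [sys.executable, f.name],
                |               capture_output=True, text=True)
                |           if run.returncode == 0:
                |               print(run.stdout)
                |               break
                |           # play the "now I get this error" role
                |           messages += [
                |               {"role": "assistant", "content": code},
                |               {"role": "user", "content":
                |                "Running it failed:\n" + run.stderr}]
                | 
                | Nothing fancy; the point is that the error feedback is
                | mechanical, so the chat service could do it for you.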
        
           | thelittleone wrote:
            | I'm the same, but I had a lot of issues getting structured
            | output from Anthropic. I ended up always writing response
            | processors. Frustrated by how fragile that was, I decided
            | to try OpenAI structured outputs and it just worked; since
            | they also have prompt caching now, it worked out very well
            | for my use case.
            | 
            | Anthropic seems to have addressed the issue using Pydantic,
            | but I haven't had a chance to test it yet.
           | 
           | I pretty much use Anthropic for everything else.
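            | 
            | For reference, the OpenAI structured-outputs feature boils
            | down to something like this (assuming a recent openai
            | Python SDK; the schema and prompt are only examples, not my
            | actual ones):
            | 
            |       from pydantic import BaseModel
            |       from openai import OpenAI
            | 
            |       class Review(BaseModel):
            |           summary: str
            |           score: int
            | 
            |       client = OpenAI()
            |       resp = client.beta.chat.completions.parse(
            |           model="gpt-4o-2024-08-06",
            |           messages=[{"role": "user", "content":
            |                      "Summarize and score this diff: "
            |                      "renamed foo to bar."}],
            |           response_format=Review,
            |       )
            |       review = resp.choices[0].message.parsed
            |       print(review.score)  # a validated Review instance
            | 
            | No hand-written response processor needed; the SDK
            | validates the JSON against the model for you.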
        
           | JacobThreeThree wrote:
            | >The second is that with all the OpenAI drama, MS has
            | started to distance itself over a souring relationship
            | (allegedly). If so, this is a tactful way to move away.
           | 
           | I agree, this was a tactical move designed to give them
           | leverage over OpenAI.
        
         | a_wild_dandan wrote:
         | The speed with which AI models are improving blows my mind.
         | Humans quickly normalize technological progress, but it's
         | staggering to reflect on our progress over just these _two
         | years_.
        
           | campbel wrote:
           | Yes! I'm much more inclined to write one-off scripts for
           | short manual tasks as I can usually get AI to get something
           | useful very fast. For example, last week I worked with Claude
           | to write a script to get a sense of how many PRs my company
           | had that included comprehensive testing. This was borderline
           | best done as a manual task previously, now I just ask Claude
           | to write a short bash script that uses the GitHub CLI to do
           | it and I've got a repeatable reliable process for pulling
           | this information.
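            | 
            | The actual script was bash around the GitHub CLI; a rough
            | Python sketch of the same idea (the repo name and search
            | heuristic here are made-up placeholders, not the real
            | ones):
            | 
            |       import json, subprocess
            | 
            |       out = subprocess.run(
            |           ["gh", "pr", "list",
            |            "--repo", "myorg/myrepo",
            |            "--search", "is:merged tests in:body",
            |            "--limit", "500",
            |            "--json", "number"],
            |           capture_output=True, text=True,
            |           check=True).stdout
            |       prs = json.loads(out)
            |       print(f"{len(prs)} merged PRs mention tests")
            | 
            | The point is less the script itself than having a
            | repeatable command instead of a manual audit.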
        
             | unshavedyak wrote:
              | I rarely use LLMs for tasks, but I love them for exploring
              | spaces I would otherwise just ignore. Like, writing some
              | random bash script isn't difficult at all, but it's also
              | so fiddly that I just don't care to do it. It's nice to
              | just throw a bot at it and come back later. Loosely
              | speaking.
             | 
              | Still, I find very little use for LLMs on this front, but
              | they do come in handy randomly.
        
           | unshavedyak wrote:
           | I wonder how long people will still protest in these threads
           | that "It doesn't know anything! It's just an autocomplete
           | parrot!"
           | 
           | Because.. yea, it is. However.. it keeps expanding, it keeps
           | getting more useful. Yea people and especially companies are
           | using it for things which it has no business being involved
           | in.. and despite that it keeps growing, it keeps progressing.
           | 
           | I do find the "stochastic parrot" comments slowly dwindle in
           | number and volume with each significant release, though.
           | 
           | Still, i find it weirdly interesting to see a bunch of people
           | be both right and "wrong" at the same time. They're
           | completely right, and yet it's like they're also being proven
           | wrong in the ways that matter.
           | 
           | Very weird space we're living in.
        
             | a_wild_dandan wrote:
             | The "statistical parrot" parrots have been demonstrably
             | wrong for years (see e.g. LeCun et al[1]). It's just harder
             | to ignore reality with hundreds of millions of people now
             | using incredible new AI tools. We're approaching "don't
             | believe your lying eyes" territory. Deniers will continue
             | pretending that LLMs are just an NFT-level fad or bubble or
             | whatever. The AI revolution will continue to pass them by.
             | More's the pity.
             | 
             | [1] https://arxiv.org/abs/2110.09485
        
               | mensetmanusman wrote:
               | A trillion dimensional stochastic parrot is still a
               | stochastic parrot.
               | 
               | If these systems showed understanding we would notice.
               | 
               | No one is denying that this form of intelligence is
               | useful.
        
               | logicchains wrote:
                | I don't know how you can say they lack understanding of
                | the world when they perform better than the average
                | human on pretty much any standardised test designed to
                | measure human intelligence. The only thing they don't
                | understand is touch, because they're not trained on
                | that, but they can already understand audio and video.
        
               | zeknife wrote:
               | You said it, those tests are designed to measure human
               | intelligence, because we know that there is a
               | correspondence between test results and other, more
               | general tasks - in humans. We do not know that such a
               | correspondence exists with language models. I would
               | actually argue that they demonstrably do not, since even
               | an LLM that passes every IQ test you put in front of it
               | can still trip up on trivial exceptions that wouldn't
               | fool a child.
        
               | zeroonetwothree wrote:
               | An answer key would outperform the average human but it
               | isn't intelligent. Tests designed for humans are not
               | appropriate to judge non humans.
        
               | devmor wrote:
               | No you don't understand, if i put a billion billion
               | trillion monkeys on typewriters, they're actually now one
               | super intelligent monkey because they're useful now!
               | 
               | We just need more monkeys and it will be the same as a
               | human brain.
        
               | croes wrote:
                | What does the mass of users change about what it is?
                | How many of them check the results for hallucinations,
                | and how many don't because they simply trust the AI?
                | 
                | More than once these tools fail at tasks a fifth grader
                | could understand.
        
               | outworlder wrote:
               | > Deniers will continue pretending that LLMs are just an
               | NFT-level fad or bubble or whatever. The AI revolution
               | will continue to pass them by. More's the pity.
               | 
               | You should re-read that very slowly and carefully and
               | really think about it. Calling anyone that's skeptical a
               | 'denier' is a red flag.
               | 
               | We have been through these AI cycles before. In every
               | case, the tools were impressive for their time. Their
               | limitations were always brushed aside and we would get a
               | hype cycle. There was nothing wrong with the technology,
               | but humans always like to try to extrapolate their
               | capabilities and we usually get that wrong. When hype
               | caught up to reality, investments dried up and nobody
               | wanted to touch "AI" for a while.
               | 
               | Rinse, repeat.
               | 
               | LLMs are again impressive, for our time. When the dust
               | settles, we'll get some useful tools but I'm pretty sure
               | we will experience another - severe - AI winter.
               | 
               | If we had some optimistic but also realistic discussions
               | on their limitations, I'd be less skeptical. As it is, we
               | are talking about 'revolution', and developers being out
               | of jobs, and superintelligence and whatnot. That's _not_
                | the level the technology is at today, and it is not
                | clear we are going to do anything other than get stuck
                | in a local maximum.
        
             | wavemode wrote:
             | You're conflating three different things.
             | 
              | There's the question, "is an LLM just autocomplete?" The
              | answer to that question is obviously no, but the question
              | is also a strawman - people who actually use LLMs
              | regularly do recognize that there is more to their
              | capabilities than randomized pattern matching.
              | 
              | Separately, there's the question of "will LLMs become AGI
              | and/or become super intelligent?" Most people recognize
              | that LLMs are not currently super intelligent, and that
              | there currently isn't a clear path toward making them so.
              | Still, many people seem to feel that we're on the verge
              | of progress here, and feel very strongly that anyone who
              | disagrees is an AI "doomer".
              | 
              | Then there's the question of "are we in an AI bubble?"
              | This is more a matter of debate. Some would argue that if
              | LLM reasoning capabilities plateau, people will stop
              | investing in the technology. I actually don't agree with
              | that view - I think there is a lot of economic value
              | still yet to be realized in AI advancements - and I don't
              | think we're on the verge of some sort of AI winter, even
              | if LLMs never become super intelligent.
        
               | sdesol wrote:
                | > Most people recognize that LLMs are not currently
                | super intelligent,
               | 
               | I think calling it intelligent is being extremely
               | generous. Take a look at the following example which is a
               | spelling and grammar checker that I wrote:
               | 
               | https://app.gitsense.com/?doc=f7419bfb27c89&temperature=0
               | .50...
               | 
                | When the temperature is 0.5, neither Claude 3.5 nor
                | GPT-4o can properly recognize that GitHub is correctly
                | capitalized. You can see the responses by clicking on
                | the sentence. Each model was asked to validate the
                | sentence 5 times.
               | 
               | If the temperature is set to 0.0, most models will get it
               | right (most of the time), but Claude 3.5 still can't see
               | the sentence in front of it.
               | 
               | https://app.gitsense.com/?doc=f7419bfb27c89&temperature=0
               | .00...
               | 
                | Right now, an LLM is an insanely useful and powerful
                | next-word predictor, but I wouldn't call it intelligent.
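                | 
                | For anyone who wants to reproduce the temperature
                | effect outside my app, the setup is roughly this
                | (OpenAI SDK shown; the prompt is a stand-in, not the
                | checker's actual one):
                | 
                |       from openai import OpenAI
                | 
                |       client = OpenAI()
                |       prompt = ('Is "GitHub" capitalized correctly '
                |                 'in: "I pushed it to GitHub"? '
                |                 'Answer yes or no.')
                | 
                |       for temp in (0.5, 0.0):
                |           answers = []
                |           for _ in range(5):
                |               r = client.chat.completions.create(
                |                   model="gpt-4o",
                |                   temperature=temp,
                |                   messages=[{"role": "user",
                |                              "content": prompt}])
                |               answers.append(
                |                   r.choices[0].message.content)
                |           print(temp, answers)
                | 
                | At 0.0 the five answers are usually identical; at 0.5
                | they can disagree with each other and with the text.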
        
               | digging wrote:
               | > I think calling it intelligent is being extremely
               | generous ... can't properly recognize that GitHub is
               | capitalized.
               | 
               | Wouldn't this make chimpanzees and ravens and dolphins
               | unintelligent too? You're asking it to do a task that's
                | (mostly) easy _for humans_. It's not a human though.
               | It's an alien intelligence which "thinks" in our
               | language, but not in the same way we do.
               | 
               | If they could, specialized AI might think we're
               | unintelligent based on how often we fail, even with
                | advanced tools, at pattern-matching tasks that are trivial
               | for them. Would you say they're right to feel that way?
        
               | sdesol wrote:
                | Animals are capable of learning. LLMs cannot. An LLM
                | uses weights that are defined during the training
                | process to decide what to do next. An LLM cannot
                | self-evaluate based on what it has said. You have to
                | create a new message for it to create a new probability
                | path.
                | 
                | Animals have the ability to learn and grow by
                | themselves. LLMs are not intelligent, and I don't see
                | how they can be, since they just follow the most likely
                | path with randomness (temperature) sprinkled in.
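                | 
                | Concretely, that "sprinkled in" randomness is just
                | temperature-scaled sampling over the next-token scores,
                | something like this toy sketch (made-up numbers, plain
                | Python):
                | 
                |       import math, random
                | 
                |       def sample(logits, temperature):
                |           # temperature 0 == greedy: always take
                |           # the most likely token
                |           if temperature == 0:
                |               return max(logits, key=logits.get)
                |           scaled = {t: v / temperature
                |                     for t, v in logits.items()}
                |           z = sum(math.exp(v)
                |                   for v in scaled.values())
                |           probs = {t: math.exp(v) / z
                |                    for t, v in scaled.items()}
                |           return random.choices(
                |               list(probs),
                |               weights=list(probs.values()))[0]
                | 
                |       logits = {"GitHub": 2.1, "Github": 1.9,
                |                 "github": 0.3}
                |       print(sample(logits, 0.0))  # always "GitHub"
                |       print(sample(logits, 0.5))  # usually, not always
                | 
                | The weights never change at inference time; only the
                | sampling around them does.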
        
             | croes wrote:
             | Are you confusing frequency of use with usefulness?
             | 
              | If these tools boost productivity, where is the output
              | spike at all the companies, the spike in revenue and
              | profits?
              | 
              | How often do we lose the benefit of auto text generation
              | to the loop of "That's wrong" / "Oh yes of course, here
              | is the correct version" / "Nope, still wrong" / prompt
              | editing?
        
           | ffujdefvjg wrote:
           | Lots of progress, but I feel like we've been seeing
           | diminishing returns. I can't help but feel like recent
           | improvements are just refinements and not real advances. The
           | interest in AI may drive investment and research in better
           | models that are game-changers, but we aren't there yet.
        
             | ipsum2 wrote:
             | I don't know about you, but o1-preview/o1-mini has been
             | able to solve many moderately challenging programming tasks
             | that would've taken me 30 mins to an hour. No other models
             | earlier could've done that.
        
               | ffujdefvjg wrote:
               | It's an improvement but...I've asked it to do some really
               | simple tasks and it'll occasionally do them in the most
               | roundabout way you could imagine. Like, let's source a
               | bash file that creates and reads a state file to do
               | something for which the functionality was already built-
               | in. Say I'm a little skeptical of this solution and plug
               | it into a new o1-preview prompt to double check the
               | solution, and it starts by critiquing the bash script and
               | error handling instead of seeing that the functionality
               | is baked in and it's plainly documented. Other errors
               | have been more subtle.
               | 
               | When it works, it's pretty good, and sometimes great. But
               | when failure modes look like the above I'm very wary of
               | accepting its output.
        
               | warkdarrior wrote:
               | > I've asked it to do some really simple tasks and it'll
               | occasionally do them in the most roundabout way you could
               | imagine.
               | 
               | But it still does the tasks you asked for, so that's the
               | part that really matters.
        
             | TeMPOraL wrote:
             | You're proving GP's point about normalization of progress.
             | It's been two years. We're still during the first iteration
             | of applications of this new tech, advancements didn't have
             | time yet to start compounding. This is barely getting
             | started.
        
         | pseudosavant wrote:
         | I use both Claude and ChatGPT/GPT-4o a lot. Claude, the model,
         | definitely is 'better' than GPT-4o. But OpenAI provides a much
         | more capable app in ChatGPT and an easier development platform.
         | 
         | I would absolutely choose to use Claude as my model with
         | ChatGPT if that happened (yes, I know it won't). ChatGPT as an
         | app is just so far ahead: code interpreter, web search/fetch,
         | fluid voice interaction, Custom GPTs, image generation, and
         | memory. It isn't close. But Claude absolutely produces better
         | code, only being beaten by ChatGPT because it can fetch data
         | from the web to RAG enhance its knowledge of things like APIs.
         | 
          | Claude's implementation of artifacts is very good though, and
          | I'm sure that is what led OpenAI to push out their buggy
          | canvas feature.
        
           | tanelpoder wrote:
           | Are there any good 3rd-party native frontend apps for Claude
           | (on MacOS)? I mean something like ChatGPTs app, not an
           | editor. I guess one option would be to just run Claude iPad
           | app on MacOS.
        
             | mike_hearn wrote:
             | You can use https://recurse.chat/ if you have an Apple
             | silicon Mac.
        
             | greenavocado wrote:
             | Open-WebUI doesn't support Claude natively (only through a
             | series of hacks) but it is absolutely "THE" go-to for a
             | ChatGPT Pro like experience (it is slightly better).
             | 
             | https://github.com/open-webui/open-webui
        
             | TeMPOraL wrote:
             | If you're willing to settle for a client-side only web
             | frontend (i.e. talks directly with APIs of the models you
             | use), TypingMind would work. It's paid, but it's good (see
             | [0]), and I guess you could always go for the self-hosted
             | version and wrap it in an Electron app - it's what most
             | "native" apps are these days anyway (and LLM frontends in
             | particular).
             | 
             | --
             | 
             | [0] - https://news.ycombinator.com/item?id=41988306
        
             | Liquix wrote:
             | Jan [0] is MacOS native, open source, similar feel to the
             | ChatGPT frontend, very polished, and offers Anthropic
             | integration (all Claude models).
             | 
             | It also features one-click installation, OpenAI
             | integration, a hub for downloading and running local
             | models, a spec-compatible API server, global "quick answer"
             | shortcut, and more. Really can't recommend it enough!
             | 
             | [0] https://github.com/janhq/jan
        
             | jawon wrote:
             | I like msty.app. Parallel prompting across multiple
             | commercial and local models plus branching dialogs. Doesn't
             | do artifacts, etc, though.
        
             | octohub wrote:
             | Msty [0] is a really good app - you can use both local or
             | online models and has web search, attachments, RAG, split
             | chats, etc., built-in.
             | 
             | [0] https://msty.app
        
           | benreesman wrote:
           | It's all a dice game with these things, you have to watch
           | them closely or they start running you (with bad outcomes).
           | Disclaimers aside:
           | 
           | Sonnet is better in the small, by a lot. It's sharply up from
           | idk, three months ago or something when it was still an
           | attractive nuisance. It still tops out at "Best SO Answer",
           | but it hits that like 90%+. If it involves more than copy
           | paste, sorry folks, it's still just really fucking good copy
           | paste.
           | 
           | But for sheer "doesn't stutter every interaction at the worst
           | moment"? You've got to hand it to the ops people: 4o can give
           | you second best in industrial quantity on demand. I'm finding
           | that if AI is good enough, then OpenAI is good enough.
        
             | logicchains wrote:
             | >If it involves more than copy paste, sorry folks, it's
             | still just really fucking good copy paste.
             | 
             | Are you sure you're using Claude 3.5 Sonnet? In my
             | experience it's absolutely capable of writing entire small
             | applications based off a detailed spec I give it, which
             | don't exist on GitHub or Stack Overflow. It makes some
             | mistakes, especially for underspecified things, but
             | generally it can fix them with further prompting.
        
               | benreesman wrote:
               | I'm quite sure what model revision their API quotes,
               | though serious users rapidly discover that like any
               | distributed system, it has a rhythm to it.
               | 
               | And I'm not sure we disagree.
               | 
               | Vercel demo but Pets _is_ copy paste.
        
               | benreesman wrote:
               | We have entered the era of generic fashionable CRUD
               | framework demo Too Cheap To Hawk.
        
           | ben_w wrote:
           | FWIW, I was able to get a decent way into making my own
           | client for ChatGPT by asking the free 3.5 version to do JS
           | for me* before it was made redundant by the real app, so this
           | shouldn't be too hard if you want a specific
           | experience/workflow?
           | 
           | * I'm iOS by experience; my main professional JS experience
           | was something like a year before jQuery came out, so I kinda
           | need an LLM to catch me up for anything HTML
           | 
           | Also, I wanted HTML rather than native for this.
        
           | mattwad wrote:
           | Have you tried using Cursor with Claude embedded? I can't go
           | back to anything else, it's very nice having the AI embedded
            | in the IDE and it just knows all the files I am working with.
           | Cursor can use GPT-4o too if you want
        
           | TeMPOraL wrote:
           | > _ChatGPT as an app is just so far ahead: code interpreter,
           | web search /fetch, fluid voice interaction, Custom GPTs,
           | image generation, and memory. It isn't close._
           | 
           | Funny thing, TypingMind was ahead of them for over a year,
           | implementing those features on top of the API, without trying
           | to mix business model with engineering[0]. It's only recently
           | that ChatGPT webapp got more polished and streamlined, but
           | TypingMind's been giving you all those features for _every_
            | LLM that can handle it. So, if you're looking for a
            | ChatGPT-level frontend to Anthropic models, this is it.
           | 
           | ChatGPT shines on mobile[1] and I still keep my subscription
           | for that reason. On desktop, I stick to TypingMind and being
           | able to run the same plugins on GPT-4o and Claude 3.5 Sonnet,
           | and if I need a new tool, I can make myself one in five
           | minutes with passing knowledge of JavaScript[2]; no need to
           | subscribe to some Gee Pee Tee.
           | 
           | Now, I know I sound like a shill, I'm not. I'm just a
           | satisfied user with no affiliation to the app or the guy that
           | made it. It's just that TypingMind did the _bloodingly stupid
           | obvious_ thing to do with the API and tool support (even
           | before the latter was released), and continues to do the
            | _obvious things_ with it, and I'm completely confused as to
           | why others don't, or why people find "GPTs" novel. They're
           | not. They're a simple idea, wrapped in tons of marketing
           | bullshit that makes it less useful and delayed its release by
           | half a year.
           | 
           | --
           | 
           | [0] - "GPTs", seriously. That's not a feature, that's just
           | system prompt and model config, put in an opaque box and
           | distributed on a _marketplace_ for no good reason.
           | 
           | [1] - Voice story has been better for a while, but that's a
           | matter of integration - OpenAI putting together their own LLM
           | and (unreleased) voice model in a mobile app, in a manner
            | hardly possible with the API they offered, vs. TypingMind
           | being a webapp that uses third party TTS and STT models via
           | "bring your own API key" approach.
           | 
           | [2] - I made https://docs.typingmind.com/plugins/plugins-
           | examples#db32cc6... long before you could do that stuff with
           | ChatGPT app. It's literally as easy as it can possibly be:
           | https://git.sr.ht/~temporal/typingmind-plugins/tree. In
           | particular, this one is more representative -
           | https://git.sr.ht/~temporal/typingmind-
           | plugins/tree/master/i... - PlantUML one is also less than 10
           | lines of code, but on top of 1.5k lines of DEFLATE
           | implementation in JS I plain copy-pasted from the interwebz
           | because I cannot into JS modules.
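            | 
            | To make footnote [0] concrete: a "GPT" boils down to
            | roughly this much code against the plain API (a sketch with
            | a made-up prompt and config, obviously not OpenAI's actual
            | implementation):
            | 
            |       from openai import OpenAI
            | 
            |       client = OpenAI()
            |       SYSTEM_PROMPT = "You are a terse code reviewer."
            |       MODEL_CONFIG = {"model": "gpt-4o",
            |                       "temperature": 0.3}
            | 
            |       def my_custom_gpt(user_message):
            |           r = client.chat.completions.create(
            |               messages=[
            |                   {"role": "system",
            |                    "content": SYSTEM_PROMPT},
            |                   {"role": "user",
            |                    "content": user_message}],
            |               **MODEL_CONFIG)
            |           return r.choices[0].message.content
            | 
            | Everything else is packaging and distribution.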
        
           | coryfklein wrote:
           | > But OpenAI provides a much more capable app in ChatGPT and
           | an easier development platform
           | 
           | Which app are you talking about here?
        
         | szundi wrote:
          | Switch to Cursor with a Claude backend and 5x your
          | productivity immediately.
        
         | mtkd wrote:
          | One service is not really enough -- you need a few to
          | triangulate more often than not, especially when it comes to
          | code using the latest versions of public APIs.
          | 
          | Phind is useful as you can switch between them -- but you
          | only get a handful of o1 and Opus queries a day, which I burn
          | through quickly at the moment on deeper things -- Phind-405b
          | and 3.5 Sonnet are decent for general use.
        
       | GraemeMeyer wrote:
       | Non-paywall alternative: GitHub Copilot will support models from
       | Anthropic, Google, and OpenAI -
       | https://www.theverge.com/2024/10/29/24282544/github-copilot-...
        
       | candiddevmike wrote:
        | Wake me up when they support self-hosted Llama or Open WebUI.
       | 
       | Wonder if we'll ever see a standard LLM API.
        
         | internetter wrote:
         | > Wonder if we'll ever see a standard LLM API.
         | 
          | At this point it's just the OpenAI API.
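          | 
          | And in practice that's already usable as a de facto
          | standard: most self-hosted servers (Ollama, llama.cpp's
          | server, vLLM, etc.) expose an OpenAI-compatible endpoint, so
          | the official client just needs a different base_url. A
          | sketch against a local Ollama (the model name is whatever
          | you have pulled locally):
          | 
          |       from openai import OpenAI
          | 
          |       client = OpenAI(
          |           base_url="http://localhost:11434/v1",
          |           api_key="unused-but-required")
          |       reply = client.chat.completions.create(
          |           model="llama3",
          |           messages=[{"role": "user",
          |                      "content": "Hello"}])
          |       print(reply.choices[0].message.content)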
        
         | hshshshshsh wrote:
          | Isn't there an open source alternative? Like a plugin or
          | something.
        
           | SirMaster wrote:
            | Not for Visual Studio 2022, unfortunately.
        
             | int_19h wrote:
             | There are several plugins for VS 2022 that offer Copilot-
             | like UI on top of a local Llama model, although I can't
             | speak for their quality.
        
               | SirMaster wrote:
               | Hmm, I wonder why I didn't seem to find any.
        
           | rihegher wrote:
           | for VScode you can use https://github.com/twinnydotdev/twinny
        
         | Tiberium wrote:
         | cursor.ai lets you use any OpenAI-compatible endpoint, although
         | not all features work. And continue.dev does too, iirc.
        
       | thecopy wrote:
       | Great news! This can only mean better suggestions.
       | 
        | I expected little from Copilot, but now I find it
        | indispensable. It is such a productivity multiplier.
        
         | otteromkram wrote:
         | I removed it from windows and I'm still very productive.
          | Probably more so, since I don't have to make constant
          | corrections.
         | 
         | To each their own.
        
           | Tiberium wrote:
           | GitHub Copilot and Microsoft Copilot are different products
        
             | doublerabbit wrote:
             | Same difference. They both are glorified liberians.
        
               | TimeBearingDown wrote:
               | Liberians seem quite useful, then! I've never been to
               | Africa myself.
        
               | timeon wrote:
               | Just don't lick your fingers.
        
             | ipaddr wrote:
             | Their branding is confusing
        
             | hbn wrote:
             | Is Microsoft Copilot even a single product? It seems to me
             | they just shove AI in random places throughout their
             | products and call it Copilot. Which would make Github
             | Copilot essentially another one of these places the
             | branding shows up (even if it started there)
        
       | thenobsta wrote:
        | I wonder what the rationale for this was internally. More
        | OpenAI issues? Competitiveness with Cursor? It seems good for
        | the user to increase competition across LLM providers.
        | 
        | Also, the title is ambiguous. I thought GitHub had canceled
        | deals they had in the works. The article is clearly about
        | making a deal, but you can't tell that from the title.
        
         | cma wrote:
          | Could be a fight against Llama, which excludes MS and Google
          | in its open license (though I think Meta has done separate
          | paid deals with one or both of them). Meta is notably absent
          | from this announcement.
        
           | szundi wrote:
            | Try to fight the free good-enough, haha. At least that's
            | the plan of Meta, who does not benefit as much from selling
            | this as from using it.
        
       | aimazon wrote:
        | "cuts" has to be the worst word choice in this context; it
        | sounds like they're terminating deals rather than creating
        | them.
        
         | breck wrote:
         | "inks"
        
           | Jerrrrrrry wrote:
           | is there a slim chance at a title change?
           | 
           | or a fat chance?
        
         | scinadier wrote:
          | The common English lexicon should cut ties with the phrase
          | "cut a deal".
        
           | justsocrateasin wrote:
            | Yeah I agree, it could be confusing to non-native speakers
           | though. It's a weird idiom.
        
         | mattlondon wrote:
         | Came here to say that - my reaction was initially "I didn't
         | know they even had those deals to cut them!"
        
       | jddj wrote:
       | Sensible.
       | 
        | A big part of competitors' (e.g. Aider, Cursor, and I imagine
        | also JetBrains) advantage was not being tied to one model as
        | the landscape changed.
        | 
        | After the large MS investment in OpenAI, they could just as
        | easily have put blinders on and doubled down.
        
         | kyawzazaw wrote:
          | JetBrains is doing its own LLM.
        
           | a_wild_dandan wrote:
           | Cursor is too! Mixing and matching specialized & flagship
           | models is the way forward.
        
       | yanis_t wrote:
        | Isn't using big models like GPT-4o going to slow down the
        | autocomplete?
        
         | HyprMusic wrote:
         | I think they mean for the chat and code editing features.
        
       | 7thpower wrote:
        | I wonder if this is an example of the freedom of being an
        | arm's-length subsidiary, or foreshadowing a broader strategy
        | shift within Microsoft.
        
       | neevans wrote:
        | Actually excited; a 2M context window will be useful in this
        | case.
        
       | mansoor_ wrote:
        | I wonder how this will affect latency.
        
       | JimDabell wrote:
       | Anthropic's article: https://www.anthropic.com/news/github-
       | copilot
       | 
       | GitHub's article: https://github.blog/news-insights/product-
       | news/bringing-deve...
       | 
       | Google Cloud's article:
       | https://cloud.google.com/blog/products/ai-machine-learning/g...
       | 
       | Weird that it wasn't published on the official Gemini news site
       | here: https://blog.google/products/gemini/
       | 
       | Edit: GitHub Copilot is now also available in Xcode:
       | https://github.blog/changelog/2024-10-29-github-copilot-code...
       | 
       | Discussion here: https://news.ycombinator.com/item?id=41987404
        
         | vault wrote:
         | I wonder if behind the choice of calling the human user "mona"
         | there's an Italian XD
         | 
         | https://i.imgur.com/z01xgfl.png
        
           | throwup238 wrote:
           | It's Mona Lisa the Octocat: https://github.com/monatheoctocat
        
             | lelandfe wrote:
             | Hah, TIL. https://cameronmcefee.com/work/the-octocat/
        
         | patates wrote:
         | Google Cloud's article is from tomorrow?
         | 
         | https://cloud.google.com/blog/products/ai-machine-learning/g...
         | 
         | https://i.postimg.cc/RVWSfpvs/grafik.png
        
           | cortesoft wrote:
           | It says the 29th now
        
           | JimDabell wrote:
           | It's October 30th in several parts of the world already. It's
           | after midnight everywhere GMT+7 onwards.
        
             | patates wrote:
             | Obviously! However, Google being an American company, that
             | was surprising. I'm in Europe and am used to seeing newest
             | posts "from yesterday" when they are from the USA. This one
             | is weird.
        
       | ninininino wrote:
        | The threat of antitrust creates a win for consumers. This is
        | an example of why we need a strong FTC.
        
         | hedora wrote:
         | This is a standard "commoditize your complement" play. It's in
         | GitHub / Microsoft's best interest to make sure none of the
         | LLMs become dominant.
         | 
         | As long as that happens, their competitors light money on fire
         | to build the model while GitHub continues to build / defend its
         | monopoly position.
         | 
         | Also, given that there are already multiple companies building
         | decent models, it's a pretty safe bet Microsoft could build
         | their own in a year or two if the market starts settling on one
         | that's a strategic threat.
         | 
          | See also: "embrace, extend, extinguish" from the 1990s
          | Microsoft antitrust days.
        
       | gdiamos wrote:
       | Github was an early OpenAI design partner. OpenAI developed a
       | custom LLM for them.
       | 
       | It's so interesting that even after that early mover advantage
       | they have to go back to the foundation model providers.
       | 
       | Does this mean that future tech companies have no choice but to
       | do this?
        
         | dartos wrote:
          | I see no reason why GitHub wouldn't use fine-tuned models
          | from Google or Anthropic.
          | 
          | I think their version of GPT-3.5 was a fine-tune as well. I
          | doubt they had a whole model made from scratch just for them.
        
         | a_wild_dandan wrote:
         | Yes, because transfer learning works. A specialized model for X
         | will be subsumed by a general model for X/Y/Z as it becomes
         | better at Y/Z. This is why models which learn other languages
         | become better at English.
         | 
         | Custom models still have use cases, e.g. situations requiring
         | cheaper or faster inference. But ultimately The Bitter Lesson
         | holds -- your specialized thing will always be overtaken by
         | throwing more compute at a general thing. We'll be following
         | around foundation models for the foreseeable future, with
         | distilled offshoots bubbling up/dying along the way.
        
           | kingkongjaffa wrote:
           | > This is why models which learn other languages become
           | better at English.
           | 
           | Do you have a source for that, I'd love to learn more!
        
             | a_wild_dandan wrote:
             | _Evaluating cross-lingual transfer learning approaches in
             | multilingual conversational agent models_ [1]
             | 
             |  _Cross-lingual transfer learning for multilingual voice
             | agents_ [2]
             | 
             |  _Large Language Models Are Cross-Lingual Knowledge-Free
             | Reasoners_ [3]
             | 
             |  _An Empirical Study of Cross-Lingual Transfer Learning in
             | Programming Languages_ [4]
             | 
             | That should get you started on transfer learning re.
             | languages, but you'll have more fun personally picking
             | interesting papers over reading a random yahoo's choices.
             | The fire hose of papers is nuts, so you'll never be left
             | wanting.
             | 
             | [1] https://www.amazon.science/publications/evaluating-
             | cross-lin...
             | 
             | [2] https://www.amazon.science/blog/cross-lingual-transfer-
             | learn...
             | 
             | [3] https://arxiv.org/pdf/2406.16655v1
             | 
             | [4] https://arxiv.org/pdf/2310.16937v2
        
         | gk1 wrote:
         | It may not be a model quality issue. It may be that GitHub
         | wants to sell a lot more of Copilot, including to companies who
         | refuse to use anything from OpenAI. Now GitHub can say "Oh
         | that's fine, we have these two other lovely providers to choose
         | from."
         | 
         | Also, after Anthropic and Google sold massive amounts of pre-
         | paid usage credits to companies, those companies want to draw
         | down that usage and get their money's worth. GitHub might allow
         | them to do that through Copilot, and therefore get their
         | business.
        
           | manquer wrote:
            | I think the credit scenario is more true for OpenAI than
            | the others. Existing Azure commits can be used to buy
            | OpenAI via the marketplace. It will never be as simple for
            | any non-Azure partner (only GitHub is tying up with
            | Anthropic here, not Azure).
            | 
            | GitHub doesn't even support using those Azure-managed APIs
            | for Copilot today; it is just a license you can buy
            | currently and add to a user license. The best you can do is
            | pay for Copilot with existing Azure commits.
            | 
            | This seems to be about not being left behind as other
            | models outpace what Copilot can do with its custom OpenAI
            | model, which doesn't seem to be getting updated.
        
       | rnmaker wrote:
       | If you want to destroy open source completely, the more models
       | the better. Microsoft's co-opting and infiltration of OSS
       | projects will serve as a textbook example of eliminating
       | competition in MBA programs.
       | 
       | And people still support it by uploading to GitHub.
        
         | dartos wrote:
         | > And people still support it by uploading to GitHub.
         | 
         | It's slowly, but noticeably moving from GitHub to other sites.
         | 
         | The network effect is hard to work against.
        
           | fhdsgbbcaA wrote:
            | Migration is on my todo list, but it's non-trivial enough
            | that I'm not sure when I'll ever have cycles to even figure
            | out the best option. GitLab? Self-hosted Git? Go back to
            | SVN? A totally different platform?
           | 
           | Truth be told, Git is a major pain in the ass anyway and I'm
           | very open to something else.
        
             | kubanczyk wrote:
             | A classic case of perfect being the enemy of the good. The
             | answers are Gitlab and jj, cheers.
        
         | amelius wrote:
         | > If you want to destroy open source completely
         | 
         | The irony is of course that open source is what they used to
         | train their models with.
        
           | guerrilla wrote:
            | That was the point. They are laundering IP. It's the long
            | way around the GPL, allowing them to steal.
        
             | ianeigorndua wrote:
             | How many OSS repositories do I personally have to read
             | through for my own code to be considered stolen property?
             | 
             | That line of thought would get thrown out of court faster
             | than an AI would generate it.
        
               | poincaredisk wrote:
               | I assume you're not an AI model, but a real human being
               | (I hope). The analogy "AI == human" just... doesn't work,
               | really.
        
               | ianeigorndua wrote:
               | That's beside the point.
               | 
               | Me teaching my brain someone's way of syntactically
               | expressing procedures is analogous to AI developers
               | teaching their model that same mode of expression.
        
               | guerrilla wrote:
                | It's not your reading that would be illegal, but your
                | copying. This is a well-documented area of the law, and
                | there are concrete answers to your questions.
        
               | ianeigorndua wrote:
               | Are you saying that if I see a nice programming pattern
               | in someone else's code, I am not allowed to use that
               | pattern in my code?
        
               | candiddevmike wrote:
               | Can I copy you or provide you as a service?
               | 
               | To me, the argument is a LLM learning from GPL stuff ==
               | creating a derivative of the GPL code, just "compressed"
               | within the LLM. The LLM then goes on to create more
               | derivatives, or it's being distributed (with the embedded
               | GPL code).
        
               | 0x457 wrote:
               | Yes, I provide it as a service to my employer. It's
               | called a job. Guess what? When I read code I learn from
               | it and my brain doesn't care what license that code is
               | under.
        
               | ianeigorndua wrote:
               | That's what my employers keep asking.
        
               | timeon wrote:
               | This seems bit nihilistic. You can't be automated. You
               | can't process repos at scale.
        
               | ianeigorndua wrote:
               | Yet.
        
         | atomic128 wrote:
         | Yes. Thank you for saying it. We're watching Microsoft et al.
         | defeat open source.
         | 
         | Large language models are used to aggregate and interpolate
         | intellectual property.
         | 
         | This is performed with no acknowledgement of authorship or
         | lineage, with no attribution or citation.
         | 
         | In effect, the intellectual property used to train such models
         | becomes anonymous common property.
         | 
         | The social rewards (e.g., credit, respect) that often motivate
         | open source work are undermined.
         | 
         | Embrace, extend, extinguish.
        
           | bastardoperator wrote:
           | Can you name a company with more OSS projects and
           | contributors? Stop with the hyperbole...
        
             | atomic128 wrote:
             | Embrace, extend...
        
           | TeMPOraL wrote:
           | > _The social rewards (e.g., credit, respect) that often
           | motivate open source work are undermined._
           | 
           | You mean people making contributions to solve problems and
           | scratch each others' itches got displaced by people seeking
           | social status and/or a do-at-your-own-pace accreditation
            | outside of formal structures, to show to prospective
            | employers? And now that LLMs start letting people solve their
           | own coding problems, sidestepping their whole social game,
           | the credit seekers complain because large corps did something
           | they couldn't possibly have done?
           | 
           | I mean sure, their contributions were a critical piece - _in
           | aggregate_ - individually, any single piece of OSS code
            | contributes approximately 0 value to LLM training. But
            | they're somehow entitled to the reward for a vastly greater
            | value
           | someone is providing, just because they _retroactively_ feel
           | they contributed.
           | 
           | Or, looking from a different angle: what the complainers are
           | saying is, they're sad they can't extract rent now that their
           | past work became valuable for reasons they had no part in,
           | and if they could turn back time, they'd happily rent-seek
           | the shit out of their code, to the point of destroying LLMs
           | as a possibility, and denying the world the value LLMs
           | provided?
           | 
           | I have little sympathy for that argument. We've been calling
           | out "copyright laundering" way before GPT-3 was a thing -
           | those who don't like to contribute without capturing all the
           | value for themselves should've moved off GitHub years ago.
           | It's not like GitHub has any hold over OSS other than plain
           | inertia (and the egos in the community - social signalling
           | games create a network effect).
        
             | discreteevent wrote:
             | >Or, looking from a different angle: what the complainers
             | are saying is, they're sad they can't extract rent now that
             | their past work became valuable for reasons they had no
             | part in, and if they could turn back time, they'd happily
             | rent-seek the shit out of their code,
             | 
             | Wrong and completely unfair/bitter accusation. The only
             | people rent seeking are the corporations.
             | 
             | What kind of world do you want to live in? The one with
             | "social games" or the one with corporate games? The one
             | with corporate games seems to have less and less room for
             | artists, musicians, language graduates, programmers...
        
             | raegis wrote:
             | > individually, any single piece of OSS code contributes
             | approximately 0 value to LLM training. But they're somehow
             | entitled to the reward for a vastly greater value someone
             | is providing, just because they retroactively feel they
             | contributed.
             | 
             | You are attributing arguments to people which they never
             | made. The most lenient of open source licenses require a
             | simple citation, which the "A.I." never provides. Your tone
             | comes off as pretty condescending, in my opinion. My
             | summary of what you wrote: "I know they violated your
             | license, but too bad! You're not as important as you
             | think!"
        
           | warkdarrior wrote:
           | > This is performed with no acknowledgement of authorship or
           | lineage, with no attribution or citation.
           | 
           | GitHub hosts a lot of source code, including presumably the
           | code it trained CoPilot on. So they satisfy any license that
           | requires sharing the code and license, such as GPL 3. Not
           | sure what the problem is.
        
         | whitehexagon wrote:
         | I deleted my GitHub two weeks ago, as much over AI as over them
         | forcing 2FA. Before AI it was SaaS taking more than it was
         | giving. I miss the 'helping each other' feel of these code-sharing
         | sites. I wonder where we are heading with all this. All
         | competition and no collaboration; no wonder the planet is
         | burning.
        
         | pessimizer wrote:
         | I don't understand the case being made here at all. AI is
         | violating FOSS licenses, I totally agree. But you can write
         | more FOSS using AI. It's totally unfair, because these
         | companies are not sharing their source, and extracting all of
         | the value from FOSS as they can. Fine. But when it comes to OSI
         | Open Source, all they usually had to do was include a text file
         | somewhere mentioning that they used it in order to do the same
         | thing, and when it comes to Free Software, they could just lie
         | about stealing it and/or fly under the radar.
         | 
         | Free software needs more user-facing software, and it needs
         | people other than coders to drive development (think UI people,
         | subject matter specialists, etc.), and AI will help that. While
         | I think what the AI companies are doing is tortious, and that
         | they either should be stopped from doing it or the entire idea
         | of software copyright should be re-examined, I _also_ think
         | that AI will be massively beneficial for Free Software.
         | 
         | I also suspect that this could result in a grand bargain in
         | some court (which favors the billionaires of course) where the
         | AI companies have to pay into a fund of some sort that will be
         | used to pay for FOSS to be created and maintained.
         | 
         | Lastly, maybe Free Software developers should start zipping up
         | all of the OSI licenses that only require that a license be
         | included in the distribution and including that zipfile with
         | their software written in collaboration with AI copilots. That
         | and your latest GPL for the rest (and for your own code) puts
         | you in as safe a place as you could possibly be legally. You'll
         | still be hit by all of the "don't do evil"-style FOSS-esque
         | licenses out there, but you'll at least be safer than _all_ of
         | the proprietary software being written with AI.
         | 
         | I don't know what textbook directs you to eliminate all of your
         | competition by lowering your competition's costs, narrowing
         | your moat of expertise, and not even owning a piece of that.
         | 
         | edit: that being said, I'm obviously talking about Free
         | Software here, and not Open Source. Wasn't Open Source only
         | protected by spirits anyway?
        
         | mnau wrote:
         | It doesn't matter whether it is uploaded to GitHub or not. They
         | would siphon it from GitLab, self-hosting, or SourceForge as
         | well, using crawlers.
        
       | mmiyer wrote:
       | Seems to be part of Microsoft's hedging of its OpenAI bet, ever
       | since Sam Altman's ousting:
       | https://www.nytimes.com/2024/10/17/technology/microsoft-open...
        
       | mk_chan wrote:
       | The reason here is that Microsoft is trying to make Copilot a
       | platform. This is the essential step to moving all the power from
       | OpenAI to Microsoft. It would grant Microsoft leverage over all
       | providers since the customers would depend on Microsoft and not
       | OpenAI or Google or Anthropic. Classic platform business
       | evolution at play here.
        
         | caesil wrote:
         | I think the reason here is that Copilot is very very obviously
         | inferior to Cursor, mostly because the model at its core is
         | pretty dumb.
        
           | echelon wrote:
           | The Copilot team probably thinks of Cursor's efforts as cute.
           | They can be a neat little product in their tiny corner of the
           | market.
           | 
           | It's far more valuable to be a platform. Maybe Cursor can
           | become a platform, but the race is on and they're up against
           | giants that are moving rather surprisingly nimbly.
           | 
           | Github does way more, you can build on top of it, and they
           | already have a metric ton of business relationships and
           | enterprise customers.
        
             | woah wrote:
             | A developer will spend far more time in the IDE than the
             | version control system so I wouldn't discount it that
             | easily. That being said, there are no network effects for
             | an IDE and Cursor is basically just a VSCode plugin. Maybe
             | Cursor gets a nice acquihire deal
        
         | sangnoir wrote:
         | I'm sure there are multiple reasons, including lowering the
         | odds of antitrust action by regulators. The EU was already
         | sniffing around Microsoft's relationship with OpenAI.
        
       | rogerkirkness wrote:
       | Commoditize your complement, baby.
        
       | dfried wrote:
       | Anyone doing strategic business with Microsoft would do well to
       | remember what they did to Nokia.
        
         | TheRealPomax wrote:
         | You mean waste a few billion on buying a company that couldn't
         | compete with the market anymore because the iPhone made "even
         | an idiot should be able to use this thing, and it should be
         | able to do pretty much everything" a baseline expectation with
         | an OS/software experience to match? Nokia failed Nokia, and
         | then Microsoft gave it a shot. And they also couldn't make it
         | work.
         | 
         | (sure, that glosses over the whole Elop saga, but Microsoft
         | didn't buy a Nokia-in-its-prime and kill it. They bought an
         | already failing business and even throwing MS levels of
         | resources at it couldn't turn it around)
        
           | muststopmyths wrote:
           | Man, as a Windows Phone mourner, the only disagreement I have
           | with this comment is the claim that they threw anywhere near
           | MS levels of resources at Nokia.
           | 
           | Satya never wanted the acquisition and nuked WP as soon as he
           | could.
        
           | ahoka wrote:
           | I can see why people would think that, but Microsoft did not
           | buy Nokia.
        
             | SSLy wrote:
             | They did buy the (then) richer half of the company. The
             | other half is now trying to climb out of the rot.
        
       | xnx wrote:
       | Frankly surprised to see GitHub (Microsoft) signing a deal with
       | their biggest competitor, Google. It does give Microsoft some
       | good terms/pricing leverage over OpenAI, though I'm not sure to
       | what degree Microsoft needs that given their investment in OpenAI.
       | 
       | GitHub Spark seems like the most interesting part of the
       | announcement.
        
         | miyuru wrote:
         | On the Anthropic blog it says it uses AWS Bedrock.
         | 
         | > Claude 3.5 Sonnet runs on GitHub Copilot via Amazon Bedrock,
         | leveraging Bedrock's cross-region inference to further enhance
         | reliability.
         | 
         | https://www.anthropic.com/news/github-copilot
        
       | xyst wrote:
       | Got to cut deals before the AI bubble pops, VC money and interest
       | vanish, and interest rates go up.
       | 
       | Also diversifying is always a good option. Even if one cash cow
       | gets nuked from orbit, you have 2 other companies to latch onto
        
         | kingkongjaffa wrote:
         | > interest rates go up
         | 
         | This is kind of a cynical tech startup take:
         | 
         | - ragging on VCs
         | - calling something a bubble
         | 
         | Interest rates are on their way back down btw.
         | 
         | https://www.federalreserve.gov/newsevents/pressreleases/mone...
         | 
         | https://www.reuters.com/world/uk/bank-england-cut-bank-rate-...
         | 
         | Funding has looked to be running out a few times for OpenAI
         | specifically, but most frontier model development is reasonably
         | well funded still.
        
           | njtransit wrote:
           | If interest rates are on their way down, why has the 10Y
           | treasury yield increased 50 points over the last month?
           | https://www.cnbc.com/quotes/US10Y
        
             | kortilla wrote:
             | Because they previously decreased more under the
             | expectation of another half point cut by the fed. Stronger
             | economic indicators have cut the expectation for steep rate
             | cuts so treasuries are declining.
        
             | warkdarrior wrote:
             | It also dropped 40 points over the last six months.
        
       | tqwhite wrote:
       | I wish people would stop posting Bloomberg paywall links.
        
       | greenavocado wrote:
       | I replaced ChatGPT Plus with hosted
       | nvidia/Llama-3.1-Nemotron-70B-Instruct for coding tasks. Nemotron
       | produces good code. The cost difference is massive. Nemotron is
       | available for $0.35 per Mtoken in and out. ChatGPT is
       | considerably more expensive.
        
         | greenavocado wrote:
         | Just kidding. Qwen 2.5 Instruct is superior. Nemotron is
         | overfit to pass benchmarks.
        
       | shagie wrote:
       | Elseweb with GitHub Copilot today...
       | 
       | Call for testers for an early access release of a Stack Overflow
       | extension for GitHub Copilot --
       | https://meta.stackoverflow.com/q/432029
        
       | rvz wrote:
       | You mean "Microsoft" cuts deals with Google and Anthropic on top
       | of their already existing deals with Mistral and Inflection,
       | whilst also having an exclusivity deal with OpenAI?
       | 
       | This is an extend to extinguish round 4 [0], whilst racing
       | everyone else to zero.
       | 
       | [0] https://news.ycombinator.com/item?id=41908456
        
       | sprkv5 wrote:
       | One of the reasons that comes to my mind is: it could have been a
       | problematic look for only Microsoft (Copilot) to have access to
       | GitHub for training AI models - a la monopolizing a data treasure
       | trove. With anti-competitive legislation catching up to Google to
       | open up its Play Store, this could have been one of the key
       | reasons why this deal came about.
        
         | poincaredisk wrote:
         | Copilot can choke on my AGPL code on GitHub, which was used for
         | training their proprietary models. I'm still salty about this;
         | sadly it looks like the world has largely moved on.
        
           | azemetre wrote:
           | It really feels like a digital form of colonialism; they come
           | in, take everything, completely disregard the rules, ignore
           | intellectual property laws (while you still have to obey
           | them), but when you speak out against this suddenly you are a
           | Luddite who doesn't care about human progress.
        
             | mnau wrote:
             | It's especially distasteful when we consider lawsuits like
             | Epic vs Silicon Knights.
             | https://en.wikipedia.org/wiki/Silicon_Knights
             | 
             | > Silicon Knights had "deliberately and repeatedly copied
             | thousands of lines of Epic Games' copyrighted code, and
             | then attempted to conceal its wrongdoing by removing Epic
             | Games' copyright notices and by disguising Epic Games'
             | copyrighted code as Silicon Knights' own
             | 
             | > Epic Games prevailed against Silicon Knights' lawsuit,
             | and won its counter-suit for $4.45 million on grounds of
             | copyright infringement,
             | 
             | > following the loss of the court case, Silicon Knights
             | filed for bankruptcy
        
             | baq wrote:
             | If it doesn't work, oh well, you'll get VC money for
             | something else.
             | 
             | If it works, the lawyers will figure it out.
        
           | sprkv5 wrote:
           | Yet Google and Anthropic wanted in on the huge data that
           | GitHub has to offer. It seems the world has not moved on just
           | yet.
        
             | nonfamous wrote:
             | The Claude terms of service [1] apparently preclude
             | Anthropic or AWS from using GitHub user data for training:
             | 
             | GitHub Copilot uses Claude 3.5 Sonnet hosted on Amazon Web
             | Services. When using Claude 3.5 Sonnet, prompts and
             | metadata are sent to Amazon's Bedrock service, which makes
             | the following data commitments: Amazon Bedrock doesn't
             | store or log your prompts and completions. Amazon Bedrock
             | doesn't use your prompts and completions to train any AWS
             | models and doesn't distribute them to third parties.
             | 
             | [1] https://docs.github.com/en/copilot/using-github-
             | copilot/usin...
        
       | yieldcrv wrote:
       | Seems to be trying to get its lunch money back from CodeGPT
       | plugin and similar ones
        
       | kleton wrote:
       | A case where "cut" is its own antonym, and it's unclear which
       | sense is meant from the headline alone.
        
         | echoangle wrote:
         | I just had the same problem and thought there was an existing
         | deal that was now being ended.
        
           | jacobgkau wrote:
           | Yeah, I was expecting outrage when I first clicked into the
           | thread to glance at the comments, and then I was like "wait,
           | why are people saying it's exciting?"
        
         | keiferski wrote:
         | Don't think I've ever seen the word "cut" used with "deal" in a
         | negative sense. Cutting a deal always means you made a deal,
         | not that one ended.
        
           | JulianChastain wrote:
           | What about "we were cut from the deal"? It seems like you
           | could make a phrase in which 'cut' means "to exclude"
        
             | keiferski wrote:
             | Doesn't sound natural to me, and I couldn't find any
             | examples online using that phrasing to mean someone was
             | removed from a deal. You can be cut from a team, though.
        
         | shagie wrote:
         | Cutting Deals and Striking Bargains: The History of an Idiom
         | 
         | https://web.archive.org/web/20060920230602/https://www.csub....
         | 
         | By way of "Why do we 'cut' a deal?"
         | https://english.stackexchange.com/q/284233
         | 
         | ---
         | 
         | "Cuts " ... leads to the initial parsing of "cuts all ties
         | with" or similar "severs relationship with".
         | 
         | With additional modifiers between "cuts" and "deal", the "cuts
         | deal with" construction becomes harder to recognize as the
         | "forms a deal with" meaning of the phrase.
        
         | contextfree wrote:
         | GitHub sublates AI deals with Google, Anthropic
        
       | phreeza wrote:
       | I guess this goes to show, nobody really has a moat in this game
       | so far. Everyone is sprinting like crazy but I don't see anyone
       | really gaining a sustainable edge that will push out competitors.
        
       | marban wrote:
       | In AI, the only real moat is seeing how many strategic
       | partnerships you can announce before anyone figures out they're
       | all with the same people.
        
         | selimthegrim wrote:
         | Claude and Carol and Carol and Carol?
        
       | dgellow wrote:
       | I've been using Cody from Sourcegraph to have access to other
       | models, if copilot offers something similar I guess I will switch
       | back to it. I find copilot autocomplete to be more often on point
       | than Cody, but the chat experience with Cody + Sonnet 3.5 is way
       | ahead in my experience.
        
         | sqs wrote:
         | Context is a huge part of the chat experience in Cody, and
         | we're working hard to stay ahead there as well with things like
         | OpenCtx (https://openctx.org) and more code context based on
         | the code graph (defs/refs/etc.). All this competition is good
         | for everyone. :)
        
       | sincerecook wrote:
       | I replaced chatgpt with mybrain 1.0 and I'm seeing huge
       | improvements in accuracy and reasoning performance!
        
         | nforgerit wrote:
         | Also energy efficiency significantly improved, no?
        
       | kingkongjaffa wrote:
       | If I'm already paying Anthropic can I use this without paying
       | github as well?
        
       | mmaunder wrote:
       | History has shown being first to market isn't all it's cut out to
       | be. You spend more, it's more difficult creating the trail others
       | will follow, you end up with a tech stack that was built before
       | tools and patterns stabilized and you've created a giant super
       | highway for a fast-follower. Anyone remember MapQuest, AltaVista
       | or Hotmail?
       | 
       | OpenAI has some very serious competition now. When you combine
       | that with the recent destabilizing saga they went through along
       | with commoditization of models with services like OpenRouter.ai,
       | I'm not sure their future is as bright as their recent valuation
       | indicates.
        
         | sebzim4500 wrote:
         | Claude is better than OpenAI for most tasks, and yet OpenAI has
         | enormously more users.
         | 
         | What is this, if not first mover advantage?
        
           | szundi wrote:
           | Claude cannot "research" stuff on the web and provide results
           | like 4o does in 5 secs like "which is the cheapest Skoda car
           | and how much"
        
             | mmaunder wrote:
             | Just wanted to add a note to this. Tool calling -
             | particularly to source external current data - is something
             | that's had the big foundational LLM providers very nervous
             | so they've held back on it, even though it's trivial to
             | implement at this point. But we're seeing it rapidly emerge
             | with third party providers who use the foundational APIs.
             | Holding back tool calling has limited the complex graph-
             | like execution flows that the big providers could have
             | implemented on their user facing apps e.g. the kind of
             | thing that Perplexity Pro has implemented. So they've
             | fallen behind a bit. They may catch up. If they don't they
             | risk becoming just an API provider.
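             | 
             | To make that concrete, here's a minimal sketch of such a
             | tool-calling loop in Python. Everything here (the
             | llm_complete helper, the message shapes, the example tool)
             | is a hypothetical stand-in for whatever API a given
             | provider actually exposes:
             | 
             |   import json
             | 
             |   def get_cheapest_car(brand):
             |       # Hypothetical external lookup the model can't do alone
             |       return {"brand": brand, "model": "Fabia",
             |               "price_eur": 18000}
             | 
             |   TOOLS = {"get_cheapest_car": get_cheapest_car}
             | 
             |   def answer(prompt, llm_complete):
             |       # llm_complete(messages, tools) is assumed to return
             |       # either plain text or a request to call a tool.
             |       messages = [{"role": "user", "content": prompt}]
             |       while True:
             |           reply = llm_complete(messages, tools=list(TOOLS))
             |           call = reply.get("tool_call")
             |           if call:
             |               result = TOOLS[call["name"]](**call["arguments"])
             |               messages.append({"role": "tool",
             |                                "name": call["name"],
             |                                "content": json.dumps(result)})
             |           else:
             |               return reply["content"]
             | 
             | The "graph-like" execution flows are basically this loop
             | generalized: the model's output decides which tool (or which
             | other model) gets called next.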
        
               | ethbr1 wrote:
               | I'm hoping a lot of the graph-like execution flow engines
               | are still in stealth mode, as I believe that's where we'll
               | start to see truly useful AI.
               | 
               | Mass data parsing and reformatting is useful... but
               | building agents that span existing APIs / tools is a lot
               | more exciting to me.
               | 
               | I.e. IFTTT, with automatic tool discovery, parameter
               | mapping, and output parsing handled via LLM
        
             | sitkack wrote:
             | This is what I use phind for.
        
           | mmaunder wrote:
           | Yes, muscle memory is powerful. But it's not an
           | insurmountable barrier for a follower. The switch from Google
           | to various AI apps like Perplexity being a case in point. I
           | still find myself beginning to reach for Google and then 0.1
           | seconds later catching myself. As a side note: I'm also
           | catching myself having a lack of imagination when it comes to
           | what is solvable. e.g. I had a specific technical question
           | about github's UX and how to get to a thing that no one would
           | have written about and thus Google wouldn't know, but openAI
           | chat nailed it first try.
        
           | hbn wrote:
           | Most people's first exposure to LLMs was ChatGPT, and that
           | was only what - like 18 months ago it really took off in the
           | mainstream? We're still very early on in the grand scheme of
           | things.
        
             | dmix wrote:
             | Yes it's silly to talk about first mover advantage in sub 3
             | years. Maybe in 2026 we can revisit this question and see
             | if being the first mattered.
             | 
             | First mover being a general myth doesn't mean being the
             | first to launch and then immediately dominating the wider
             | market for a long period is impossible. It just usually
             | means their advantage was about a lot more than simply
             | being first.
        
           | jedberg wrote:
           | Claude requires a login, ChatGPT does not.
        
           | nabla9 wrote:
           | It's a short lived first mover advantage.
        
           | sigmoid10 wrote:
           | Claude is only better in some cherry picked standard eval
           | benchmarks, which are becoming more useless every month due
           | to the likelihood of these tests leaking into training data.
           | If you look at the Chatbot Arena rankings where actual users
           | blindly select the best answer from a random choice of
           | models, the top 3 models are all from OpenAI. And the next
           | best ones are from Google and X.
        
             | trzy wrote:
             | Bullshit. Claude 3.5 Sonnet owns the competition according
             | to the most useful benchmark: operating a robot body in the
             | real world. No other model comes close.
        
               | Matticus_Rex wrote:
               | This seems incorrect. I don't need Claude 3.5 Sonnet to
               | operate a robot body for me, and don't know anyone else
               | who does. And general-purpose robotics is not going to be
               | the most efficient way to have robots do many tasks ever,
               | and certainly not in the short term.
        
               | trzy wrote:
               | Of course not but the task requires excellent image
               | understanding, large context window, a mix of structured
               | and unstructured output, high level and spatial
               | reasoning, and a conversational layer on top.
               | 
               | I find it's predictive of relative performance in other
               | tasks I use LLMs for. Claude is the best. The only
               | shortcoming is its peculiar verbosity.
               | 
               | Definitely superior to anything OpenAI has and miles
               | beyond the "open weights" alternatives like Llama.
        
               | int_19h wrote:
               | The problem is that it also fails on fairly simple logic
               | puzzles that ChatGPT can do just fine.
               | 
               | For example, even the new 3.5 Sonnet can't solve this
               | reliably:
               | 
               | > Doom Slayer needs to teleport from Phobos to Deimos. He
               | has his pet bunny, his pet cacodemon, and a UAC scientist
               | who tagged along. The Doom Slayer can only teleport with
               | one of them at a time. But if he leaves the bunny and the
               | cacodemon together alone, the bunny will eat the
               | cacodemon. And if he leaves the cacodemon and the
               | scientist alone, the cacodemon will eat the scientist.
               | How should the Doom Slayer get himself and all his
               | companions safely to Deimos?
               | 
               | In fact, not only is its solution wrong, but it can't
               | figure out _why_ it's wrong on its own if you ask it to
               | self-check.
               | 
               | In contrast, GPT-4o always consistently gives the correct
               | response.
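               | 
               | (For reference, the puzzle is a standard river-crossing
               | search with the cacodemon in the "goat" role: it has to
               | cross first, get shuttled back in the middle, and cross
               | last. A rough brute-force sketch, if you want to check an
               | answer mechanically, using nothing but the stdlib:)
               | 
               |   from collections import deque
               | 
               |   NAMES = ("Slayer", "bunny", "cacodemon", "scientist")
               |   START, GOAL = (0, 0, 0, 0), (1, 1, 1, 1)  # 0=Phobos, 1=Deimos
               | 
               |   def safe(s):
               |       slayer, bunny, caco, sci = s
               |       if bunny == caco != slayer:  # bunny eats cacodemon
               |           return False
               |       if caco == sci != slayer:    # cacodemon eats scientist
               |           return False
               |       return True
               | 
               |   def moves(s):
               |       for i in range(4):           # i == 0: Slayer goes alone
               |           if s[i] != s[0]:
               |               continue             # passenger must be with him
               |           nxt = list(s)
               |           nxt[0] ^= 1
               |           if i:
               |               nxt[i] ^= 1
               |           nxt = tuple(nxt)
               |           if safe(nxt):
               |               yield (NAMES[i] if i else "alone"), nxt
               | 
               |   def solve():
               |       queue, seen = deque([(START, [])]), {START}
               |       while queue:
               |           state, path = queue.popleft()
               |           if state == GOAL:
               |               return path
               |           for move, nxt in moves(state):
               |               if nxt not in seen:
               |                   seen.add(nxt)
               |                   queue.append((nxt, path + [move]))
               | 
               |   print(solve())  # e.g. cacodemon, alone, bunny, cacodemon, ...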
        
               | BobaFloutist wrote:
               | Yeah, but Mistral brews a mean cup of tea, and Llama's
               | easily the best at playing hopscotch.
        
             | gr3ml1n wrote:
             | 3.5 Sonnet, ime, is dramatically better at coding than 4o.
             | o1-preview may be better, but it's too slow.
        
             | amanzi wrote:
             | I don't pay any attention to leaderboards. I pay for both
             | Claude and ChatGPT and use them both daily for anything
             | from Python coding to the most random questions I can think
             | of. In my experience Claude is better (much better) than
             | ChatGPT in almost all use cases. Where ChatGPT shines is
             | the voice assistant - it still feels almost magical having
             | a "human-like" conversation with the AI agent.
        
             | rogerkirkness wrote:
             | Claude 3.5 Sonnet (New) is meaningfully better than ChatGPT
             | GPT4o or o1.
        
               | drcode wrote:
               | my experience is that o1 is still slightly better for
               | coding, while Sonnet (new) is better for analyzing data
               | and most other tasks besides coding.
        
             | scarmig wrote:
             | I'm subscribed to all of Claude, Gemini, and ChatGPT.
             | Benchmarks aside, my go-to is always Claude. Subjectively
             | speaking, it consistently gives better results than
             | anything else out there. The only reason I keep the other
             | subscriptions is to check in on them occasionally to see if
             | they've improved.
        
             | Cu3PO42 wrote:
             | Anecdotally, I disagree. Since the release of the "new" 3.5
             | Sonnet, it has given me consistently better results than
             | Copilot based on GPT-4o.
             | 
             | I've been using LLMs as my rubber duck when I get stuck
             | debugging something and have exhausted my standard avenues.
             | GPT-4o tends to give me very general advice that I have
             | almost always already tried or considered, while Claude is
             | happy to say "this snippet looks potentially incorrect;
             | please verify XYZ" and it has gotten me back on track in
             | maybe 4/5 cases.
        
           | ipaddr wrote:
           | Claude is more restricted and can't generate images.
        
             | SV_BubbleTime wrote:
             | I asked Claude a physics question about bullet trajectory
             | and it refused to answer. Restricted too far imo.
        
               | metalliqaz wrote:
               | couldn't you s/bullet/ball/ ? or s/bullet/arrow/ ?
        
               | gkbrk wrote:
               | You could, but you could also use a model that's not
               | restricted so much that it cannot do simple tasks.
        
               | SV_BubbleTime wrote:
               | Exactly.
               | 
               | I ended up asking about a half-pound ball I would throw
               | with a 3600 rpm spin where the acceleration phase was 4 ms.
               | 
               | It had no issue with that but it was stupid.
        
           | ronnier wrote:
           | I think "Claude" is also a bad name. If I knew nothing else,
           | am I picking OpenAI or Claude based on the name? I'm going
           | with OpenAI
        
             | block_dagger wrote:
             | Claude is a product name, OpenAI is a company name. You
             | really think Claude is better than ChatGPT?
        
               | ronnier wrote:
               | The name ChatGPT is better than the name Claude, to me.
               | Of course this is all subjective though.
        
               | setsewerd wrote:
               | This brings up the broader question: why are AI companies
               | so bad at naming their products?
               | 
               | All the OpenAI model names look like garbled nonsense to
               | the layperson, while Anthropic is a bit of a mixed bag
               | too. I'm not sure what image Claude is supposed to
               | conjure; Sonnet is a nice name if it's packaged as a
               | creative writing tool but less so for developers. Meta AI
               | is at least to the point, though not particularly
               | interesting as far as names go.
               | 
               | Gemini is kind of cool sounding, aiming for the
               | associations of playful/curious of that zodiac sign. And
               | the Gemini models are about as unreliable as astrology is
               | for practical use, so I guess that name makes the most
               | sense.
        
               | jmcmaster wrote:
               | Asking Americans to read a French name that is a homonym
               | for "clod" may not be the best mass market decision.
        
               | 0x457 wrote:
               | Plot twist: regular users don't care what model
               | underneath is called or how it works.
        
           | HarHarVeryFunny wrote:
           | They seem to be going after different markets, or at least
           | having differing degrees of success in going after different
           | markets.
           | 
           | OpenAI is most successful with consumer chat app (ChatGPT)
           | market.
           | 
           | Anthropic is most successful with business API market.
           | 
           | OpenAI currently has a lot more revenue than Anthropic, but
           | it's mostly from ChatGPT. For API use the revenue numbers of
           | both companies are roughly the same. API success seems more
           | important than chat apps since this will scale with success
           | of the user's business, and this is really where the dream of
           | an explosion in AI profits comes from.
           | 
           | ChatGPT's user base size vs that of Claude's app may be first
           | mover advantage, or just brand recognition. I use Claude
           | (both web based and iOS app), but still couldn't tell you if
           | the chat product even has a name distinct from the model.
           | How's that for poor branding?! OpenAI have put a lot of
           | effort into the "her" voice interface, while Anthropic's app
           | improvements are more business orientated in terms of
           | artifacts (which OpenAI have now copied) and now code
           | execution.
        
           | azemetre wrote:
           | Honestly I think the biggest reason for this is that Claude
           | requires you to login via an email link whereas OpenAI will
           | let you just login with any credentials.
           | 
           | This matters if you have a corporate machine and can't access
           | your personal email to login.
        
         | LeoPanthera wrote:
         | Given that Hotmail is now Outlook.com, maybe that's a bad
         | example.
        
       | holografix wrote:
       | Can we change the title to "GitHub _signs_ deals with Google,
       | Anthropic" ?
       | 
       | The original got me thinking it already had deals it was getting
       | out of
        
         | eddd-ddde wrote:
         | I agree, very weird choice of words.
        
         | kelnos wrote:
         | To "cut a deal" is a common (American?) English idiom meaning
         | to "make a deal".
         | 
         | But agree that it's better to avoid using idioms on a site that
         | has many visitors for whom English is not their first language.
        
           | archgoon wrote:
           | Do you mean that Bloomberg should have used a different title
           | or Hacker News should have modified the title?
        
             | pxeger1 wrote:
             | I think Bloomberg's at fault: "cut a deal" isn't usually
             | that ambiguous because it's clear which state transition is
             | more likely. But here it's plausible they could've been
             | ending some existing training-data-sharing agreement, or
             | that they were making a new different deal. Also the fact
             | it's pluralised here makes it different enough from the most
             | common form for it to be a bit harder to notice the idiom.
             | But since we can't change the fact they used that title, I
             | would like HN to change it now.
        
       | lofaszvanitt wrote:
       | Thank you people for contributing to this free software
       | ecosystem. Oh, you can't monetize your work? Your problem, not
       | ours! Deals are made, but you, who provide your free code, we
       | have zero monetization options for you on our github platform. Go
       | pay for copilot which was trained on your data.
       | 
       | I mean, this is the worst farce ever concocted. And people are
       | oblivious to what's happening...
        
         | mnau wrote:
         | We are not oblivious. We are powerless. Oracle could go toe to
         | toe with Google and threaten multibillion-dollar fines over
         | basically an API and 11 kLOC. As an open source developer, there
         | is no way to match that.
        
       | Fairburn wrote:
       | I have no doubts that Claude is serviceable from a coder's
       | perspective. But for me, as a paid user, I became tired of being
       | told that I have to slow down and then be cut off while actively
       | working on a product. When Anthropic addresses this, I'll add it
       | back to my tools.
        
       | wg0 wrote:
       | This only makes Copilot more competitive and cost-effective.
       | Microsoft's business managers are smart.
        
       | hi41 wrote:
       | That's a strange usage of the word "cuts". I thought GitHub
       | terminated the deals with Google and Anthropic. It would be
       | better if the title were GitHub signs AI deals instead of cuts.
        
         | r00fus wrote:
         | https://plainenglish.com/expressions/cut-a-
         | deal/#:~:text=Tod....
        
         | gregschlom wrote:
         | I'm assuming you're not a native speaker? (I'm not) - "to cut a
         | deal" is a fairly common idiom that means to reach and
         | agreement.
        
           | naniwaduni wrote:
           | As an aside, "closing" and "concluding" a deal or sale also
           | usually mean to successfully reach an agreement. It's more of
           | a semantic quirk around deals than an isolated idiom.
        
           | hi41 wrote:
           | That's correct. Not a native speaker. I am not well versed
           | with slang words. I am sometimes embarrassed because I speak
           | as if they are words from a book instead of sounding like
           | spoken words. Do you know how "cut" came to mean making a
           | deal? For a non-native speaker it means the exact opposite
           | thing as in "he cut a wire". Language evolves in strange
           | ways.
        
             | samatman wrote:
             | "Cut a deal" is an idiom, not slang: it's appropriate
             | language to use in a business context, for example.
             | 
             | The origin is hazy, of the theories I've seen I consider
             | this the best one: "deal" means both "an agreement" and "to
             | distribute cards in a card game". The dealer, in the latter
             | sense, first cuts the card deck then deals the cards. "Cut
             | and deal" -> "cut a deal".
             | 
             | It could also be related to "cut a check", which comes from
             | an era before perforated paper was widespread, when one
             | would literally cut the check out of a book of checks.
        
               | hi41 wrote:
               | Thanks much for the explanation.
        
       | epolanski wrote:
       | Yet another confirmation that AI models are nothing but
       | commodities.
       | 
       | There's no moat, none.
       | 
       | I'm really curious how any company building models can hope to
       | have any meaningful return on their billion-dollar investments,
       | when a few people leaving and getting enough Azure credits can
       | create a competitor in a few months.
        
       | thih9 wrote:
       | I use Cursor and its tab completion; while what it can do is
       | mind-blowing, in practice I'm not noticing a productivity boost.
       | 
       | I find that ai can help significantly with doing plumbing, but it
       | has no problems with connecting the pipes wrong. I need to double
       | and triple check the updated code - or fix the resulting errors
       | when I don't do that. So: boilerplate and outer app layers, yes;
       | architecture and core libraries, no.
       | 
       | Curious, is that a property of all ai assisted tools for now? Or
       | would copilot, perhaps with its new models, offer a different
       | experience?
        
         | MuffinFlavored wrote:
         | > in practice I'm not noticing a productivity boost.
         | 
         | How can this be possible if you literally admit its tab
         | completion is mindblowing?
         | 
         | Isn't really good tab completion good enough for at least a 5%
         | productivity boost? 10%? 20%?
         | 
         | Select a line of code, prompt it to refactor, verify the
         | changes are good, accept them.
        
           | m3kw9 wrote:
           | In my experience, I always have to spend time checking, and
           | most of the time it doesn't do what I need unless it's a very
           | simple ask.
        
           | thih9 wrote:
           | > How can this be possible if you literally admit its tab
           | completion is mindblowing?
           | 
           | What about it makes it impossible? I'm impressed by what AI
           | assistants can do - and in practice it doesn't help me
           | personally.
           | 
           | > Select line of code, prompt it to refactor, verify they are
           | good, accept the changes.
           | 
           | It's the "verify" part that I find tricky. Do it too fast and
           | you spend more time debugging than you originally gained. Do
           | it too slow and you don't gain much time.
           | 
           | There is a whole category of bugs that I'm unlikely to write
           | myself but I'm likely to overlook when reading code. Mixing
           | up variable types, mixing up variables with similar names,
           | misusing functions I'm unfamiliar with and more.
        
             | beepbooptheory wrote:
             | I think the essential point around impressive vs helpful
             | sums up so much of the discourse around this stuff. It's all
             | just where you fall on the line between "impressive is
             | necessarily good" and "no it isn't".
        
             | throttlebody wrote:
             | How does AI learn from its mistakes? Genuine question, as
             | I have only briefly used ChatGPT and found it interesting
             | but not useful.
        
           | bigstrat2003 wrote:
           | > How can this be possible if you literally admit its tab
           | completion is mindblowing?
           | 
           | If I had a knife of perfect sharpness which never dulled,
           | that would be mind-blowing. It also would very likely not
           | make me a better cook.
        
           | kortilla wrote:
           | If someone can eat 20 golf balls that's impressive but it
           | doesn't improve my golf game
        
           | bradford wrote:
           | > How can this be possible if you literally admit its tab
           | completion is mindblowing?
           | 
           | I might suggest that coding doesn't take as much of our time
           | as we might think it does.
           | 
           | Hypothetically:
           | 
           | Suppose coding takes 20% of your total clock time. If you
           | improve your coding efficiency by 10%, you've only improved
           | your total job efficiency by 2%. This is great, but probably
           | not the mind-blowing gain that's hyped by the AI boom.
           | 
           | (I used 20% as a sample here, but it's not far away from my
           | anecdotal experience, where so much of my time is spent in
           | spec gathering, communication, meeting security/compliance
           | standards, etc).
        
         | MangoCoffee wrote:
         | Time will tell. As a GitHub Copilot user, I still review the
         | code.
         | 
         | SpaceX's advancements are impressive, from rockets blowing up to
         | successfully catching the Starship booster.
         | 
         | Who knows what AI will be capable of in 5-10 years? Perhaps it
         | will revolutionize code assistance or even replace developers.
        
           | outworlder wrote:
           | > SpaceX's advancements are impressive, from rocket blow up
           | to successfully catching the Starship booster.
           | 
           | That felt like it was LLM generated since that doesn't have
           | anything to do with the subject being discussed. Not only
           | is it in a different industry, but it's a completely different
           | set of problems. We know what's involved in catching a
           | rocket. It's a massive engineering challenge, yes, but we all
           | know it can be done (whether or not it makes sense or is
           | economically viable are different issues).
           | 
           | Even going to the Moon - which was a massive project and took
           | massive focus from an entire country to do - was a matter of
           | developing the equipment, procedures, calculations (and yes,
           | some software). We knew back then it could be done, and
           | roughly how.
           | 
           | Artificial intelligence? We don't know enough about
           | "intelligence". There isn't even a target to reach right now.
           | If we said "resources aren't a problem, let's build AI",
           | there isn't a single person on this planet that can tell you
           | how to build such an AI or even which technologies need to be
           | developed.
           | 
           | More to the point, current LLMs are able to probabilistically
           | generate data based on prompts. That's pretty much it. They
           | don't "know" anything about what they are generating, they
           | can't reason about it. In order for "AI" to replace
           | developers entirely, we need other big advancements in the
           | field, which may or may not come.
        
             | dspillett wrote:
             | _> Artificial intelligence? We don 't know enough about
             | "intelligence"._
             | 
             | The problem I have with this objection is that it, like
             | many discussions, conflates LLMs (glorified predictive
             | text) and other technologies currently being referred to as
             | AI, with AGI.
             | 
             | Most of these technologies should still be called machine
             | learning as they aren't really doing anything intelligent
             | in the sense of general intelligence. As you say yourself:
             | they don't know anything. And by inference, they aren't
             | _reasoning_ about anything.
             | 
             | Boilerplate code for common problems, and some not so
             | common ones, which is what LLMs are getting pretty OK at
             | and might in the coming years be very good at, _is_ a
             | definable problem that we understand quite well. And much
             | as we like to think of ourselves as  "computer scientists",
             | the vast majority of what we do boils down to boilerplate
             | code using common primitives, that are remarkably similar
             | across many problem domains that might on first look appear
             | to be quite different, because many of the same primitives
             | and compound structures are used. The bits that require
             | actual intelligence are often quite small (this is how _I_
             | survive as a dev!), or are away from the development
             | coalface (for instance: discovering and defining the
             | problems before we can solve them, or describing the
             | problem  & solution such that someone or an "AI" can do the
             | legwork).
             | 
             |  _> we need other big advancements in the field, which may
             | or may not come._
             | 
             | I'm waiting for an LLM being guided to create a better LLM,
             | and eventually down that chain a real AGI popping into
             | existence, much like the infinite improbability drive being
             | created by clever use of a late version finite
             | improbability generator. This is (hopefully) many years (in
             | fact I'm hoping for at least a couple of decades so I can
             | be safely retired or nearly there!) from happening, but it
             | feels like such things are just over the next deep valley
             | of disillusionment.
        
           | olivermuty wrote:
           | Except Cursor is the fireworks based on black powder here. It
           | will look good, but as a technology to get you to the moon it
           | looks like a dead end. NOTHING (of serious science)
           | seems to indicate LLMs being anything but a dead end with the
           | current hardware capabilities.
           | 
           | So then I ask: What, in qualitative terms, makes you think AI
           | in the current form will be capable of this in 5 or 10 years?
           | Other than seeing the middle of what seems to be an S-curve
           | and going <<ooooh shiny exponential!>>
        
             | TeMPOraL wrote:
             | > _NOTHING (of serious science) seems to indicate LLMs
             | being anything but a dead end with the current hardware
             | capabilites._
             | 
             | In the same sense that black powder sucks as a rocket
             | propellant - but it's enough to demonstrate that iterating
             | on the same architecture and using better fuels _will_ get
             | you to the Moon eventually. LLMs of today are starting
             | points, and many ideas for architectural improvements are
             | being explored, and nothing in serious science suggests
             | _that_ will be a dead end any time soon.
        
               | zeroonetwothree wrote:
               | It's easy to say with hindsight but if all you have is
               | black powder I don't think it's obvious those better
               | fuels even exist.
        
             | dbmikus wrote:
             | If you look at LLM performance on benchmarks, they keep
             | getting better at a fast rate.[1]
             | 
             | We also now have models of various sizes trained in general
             | matters, and those can now be tuned or fine-tuned to
             | specific domains. The advances in multi-modal AI are also
             | happening very quickly as well. Model specialization, model
             | reflection (chain of thought, OpenAI's new O1 model, etc.)
             | are also undergoing rapid experimentation.
             | 
             | Two demonstrable things that LLMs don't do well currently,
             | are (1) generalize quickly to out-of-distribution examples,
             | (2) catch logic mistakes in questions that look very
             | similar to training data, but are modified. This video
             | talks about both of these things.[2]
             | 
             | I think I-JEPA is a pretty interesting line of work towards
             | solving these problems. I also think that multi-modal AI
             | pushes in a similar direction. We need AI to learn
             | abstractions that are more decoupled from the source
             | format, and we need AI that can reflect and modify its
             | plans and update itself in real time.
             | 
             | All these lines of research and development are more-or-
             | less underway. I think 5-10 years is reasonable for another
             | big advancement in AI capability. We've shown that applying
             | data at scale to simple models works, and now we can
             | experiment with other representations of that data (ie
             | other models or ways to combine LLM inferences).
             | 
             | [1]: https://www.anthropic.com/news/3-5-models-and-computer-use
             | [2]: https://www.youtube.com/watch?v=s7_NlkBwdj8
        
         | imafish wrote:
         | I rarely use the tab completion. Instead I use the chat and
         | manually select files I know should be in context. I am barely
         | writing any code myself anymore.
         | 
         | Just sanity checking that the output and "piping" is correct.
         | 
         | My productivity (in frontend work at least) is significantly
         | higher than before.
        
           | big_jimmer wrote:
           | Out of curiosity, how long have you been working as a
           | developer? Just that, in my experience, this is mostly true
           | for juniors and mids (depending on the company, language,
           | product etc. etc.). For example, I often find that copilot
           | will hallucinate tailwind classes that don't exist in our
           | design system library, or make simple logical errors when
           | building charts (sometimes incorrect ranges, rarely
           | hallucinated fields) and as soon as I start bringing in 3rd
           | party services or poorly named legacy APIs all hope is lost
           | and I'm better off going it alone with an LSP and a prayer.
        
         | SparkyMcUnicorn wrote:
         | I haven't used Cursor, but I use Aider with Sonnet 3.5 and also
         | use Copilot for "autocomplete".
         | 
         | I'd highly recommend reading through Aider's docs[0], because I
         | think it's relevant for any AI tool you use. A lot of people
         | harp on prompting, and while a good prompt is important, I often
         | see developers making other mistakes, like providing context that
         | isn't good or correct, or providing too much of it [1].
         | 
         | When I find models are going on the wrong path with something,
         | or "connecting the pipes wrong", I often add code comments that
         | provide additional clarity. Not only does this help future
         | me/devs, but the more I steer AI towards correct results, the
         | fewer problems models seem to have going forward.
         | 
         | Everybody seems to be having wildly different experiences using
         | AI for coding assistance, but I've personally found it to be a
         | big productivity boost.
         | 
         | [0] https://aider.chat/docs/usage/tips.html
         | 
         | [1] https://aider.chat/docs/troubleshooting/edit-
         | errors.html#red...
        
           | realce wrote:
           | Totally agree that heavy commenting is the best convention
           | for helping the assistant help you best. I try to comment in
           | a way that makes a file or function into a "story" or kind of
           | a single narrative.
        
             | jascha_eng wrote:
             | That's super interesting, I've been removing a lot of the
             | redundant comments from the AI results. But adding new more
             | explanatory ones that make it easier for both AI and humans
             | to understand the code base makes a lot of sense in my
             | head.
             | 
             | I was big on writing code to be easy to read for humans,
             | but it being easy to read for AI hasn't been a large
             | concern of mine.
        
         | bob1029 wrote:
         | It's the subtle errors that are really difficult to navigate. I
         | got burned for about 40 hours on a conditional being backward
         | in the middle of an otherwise flawless method.
         | 
         | The apparent speed up is mostly a deception. It definitely
         | helps with rough outlines and approaches. But, the faster you
         | go, the less you will notice the fine details, and the more
         | assumptions you will accumulate before realizing the
         | fundamental error.
         | 
         | I'd rather find out I was wrong within the same day. I'd
         | probably have written some unit tests and played around with
         | that function a lot more if I had handcrafted it.
        
           | enneff wrote:
           | That's the thing, isn't it? The craft of programming in the
           | small is one of being intimate with the details, thinking
           | things through conscientiously. LLMs don't do that.
        
             | Nevermark wrote:
             | Perhaps it should be prompted to then?
             | 
             | Ask it to review its own code for any problems?
             | 
             | Also identify typical and corner cases and generate tests?
             | 
             | Question marks here because I have not used the tool.
             | 
             | The size & depth of each accepted code step is still up to
             | the developer slash prompter.
        
               | nrclark wrote:
               | I use Chatgpt for coding / API questions pretty
               | frequently. It's bad at writing code with any kind of
               | non-trivial design complexity.
               | 
               | There have been a bunch of times where I've asked it to
               | write me a snippet of code, and it cheerfully gave me
               | back something that doesn't work for one reason or
               | another. Hallucinated methods are common. Then I ask it
               | to check its code, and it'll find the error and give me
               | back code with a different error. I'll repeat the process
               | a few times before it eventually gets back to code that
               | resembles its first attempt. Then I'll give up and write
               | it myself.
               | 
               | As an example of a task that it failed to do: I asked it
               | to write me an example Python function that runs a
               | subprocess, prints its stdout transparently (so that I
               | can use it for running interactive applications), but
               | also records the process's stdout so that I can use it
               | later. I wanted something that used non-blocking I/O
               | methods, so that I didn't have to explicitly poll every N
               | milliseconds or something.
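               | 
               | (For reference, a minimal sketch of that kind of
               | function, using Python's selectors module so the loop
               | blocks until output is ready instead of polling on a
               | timer. Untested, names made up; it assumes a POSIX-ish
               | platform and only handles stdout.)
               | 
               |   import selectors
               |   import subprocess
               |   import sys
               |   
               |   def run_and_capture(cmd):
               |       # Run cmd, echo its stdout live, and return the
               |       # full output as a string.
               |       proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
               |                               bufsize=0)
               |       sel = selectors.DefaultSelector()
               |       sel.register(proc.stdout, selectors.EVENT_READ)
               |       chunks = []
               |       eof = False
               |       while not eof:
               |           # select() blocks until output is ready, so
               |           # there is no fixed-interval polling.
               |           for key, _ in sel.select():
               |               data = key.fileobj.read(4096)
               |               if not data:        # child closed stdout
               |                   eof = True
               |                   continue
               |               sys.stdout.buffer.write(data)   # echo
               |               sys.stdout.buffer.flush()
               |               chunks.append(data)             # record
               |       sel.unregister(proc.stdout)
               |       proc.wait()
               |       return b"".join(chunks).decode(errors="replace")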
        
               | bongodongobob wrote:
               | Honestly I find that when GPT starts to lose the plot
               | it's a good time to refactor and then keep on moving.
               | "Break this into separate headers or modules and give me
               | some YAML-like markup with function names, return types,
               | etc. for each file." Or just use stubs instead of dumping
               | every line of code in.
        
               | tomrod wrote:
               | How long are you willing to iterate to get things right?
        
               | bongodongobob wrote:
               | If it takes almost no cognitive energy, quite a while.
               | Even if it's a little slower than what I can do, I don't
               | care because I didn't have to focus deeply on it and have
               | plenty of energy left to keep on pushing.
        
               | EVa5I7bHFq9mnYK wrote:
               | That's presumably what o1-preview does? Iterates and
               | checks the result. It takes much longer, but does indeed
               | write slightly better code.
        
             | __MatrixMan__ wrote:
             | I find that it depends very heavily on what you're up to.
             | When I ask it to write nix code it'll just flat out forget
             | how the syntax works halfway through. But if I want it to
             | troubleshoot an emacs config or wield matplotlib it's
             | downright wizardly, often including the kind of thing that
             | does indicate an intimacy with the details. I get
             | distracted because I'm then asking it:
             | 
             | > I un-did your change which made no sense to me and now
             | everything is broken, why is what you did necessary?
             | 
             | I think we just have to ask ourselves what we want it to be
             | good at, and then be diligent about generating decades
             | worth of high quality training material in that domain. At
             | some point, it'll start getting the details right.
        
           | tanseydavid wrote:
           | >> The apparent speed up is mostly a deception.
           | 
           | When I'm able to ask an LLM a very simple question, which
           | saves me from having to context-switch to answer the same
           | simple question myself, that's a big time saver for me, even
           | if it's hard to quantify.
           | 
           | Anything that reduces my cognitive load when the pressure is
           | on is a blessing on some level.
        
             | oogetyboogety wrote:
             | This might be the measurable "some" non-deceptive time
             | saving, whereas most of the perceived time saved is still
             | deceptive.
        
               | tensor wrote:
               | Except actual studies objectively show efficiency gains,
               | more with junior devs, which makes sense. So no, it's not
               | a "deception" but it is often overstated in popular
               | media.
        
               | zeroonetwothree wrote:
               | Studies have limitations, in particular they test
               | artificial and narrowly-scoped problems that are quite
               | different from real world work.
        
               | rqmedes wrote:
               | I find the opposite: the more senior you are, the more
               | value they offer, as you know how to ask the right
               | questions, how to vary the questions and try different
               | tacks, and how to spot errors or mistakes.
        
               | 0xFACEFEED wrote:
               | You could make the same argument for any non-AI driven
               | productivity tool/technique. If we can't trust the user
               | to determine what is and is not time-saving then time-
               | saving isn't a useful thing to discuss outside of an
               | academic setting.
               | 
               | My issue with most AI discussions is they seem to
               | completely change the dimensions we use to evaluate basic
               | things. I believe if we replaced "AI" with "new useful
               | tool" then people would be much more eager to adopt it.
               | 
               | What clicked for me is when I started treating it more
               | like a tool and less like some sort of nebulous pandora's
               | box.
               | 
               | Now to me it's no different than auto completing code,
               | fuzzy finding files, regular expressions, garbage
               | collection, unit testing, UI frameworks, design patterns,
               | etc. It's just a tool. It has weaknesses and it has
               | strengths. Use it for the strengths and account for the
               | weaknesses.
               | 
               | Like any tool it can be destructive in the hands of an
               | inexperienced person or a person who's asking it to do
               | too much. But in the hands of someone who knows what
               | they're doing and knows what they want out of it - it's
               | so freakin' awesome.
               | 
               | Sorry for the digression. All that to say that if someone
               | believes it's a productivity boost for them then I don't
               | think they're being misled.
        
             | bongodongobob wrote:
             | Cognitive load is something people always leave out. I can
             | fuckin code drunk with these things. Or just increase
             | stamina to push farther than I would writing every single
             | line.
        
           | tensor wrote:
           | Why aren't you writing unit tests just because AI wrote the
           | function? Unit tests should be written regardless of the
           | skill of the developer. Ironically, unit tests are also one
           | area where AI really does help move faster.
           | 
           | High level design, rough outlines and approaches, is the
           | worst place to use AI. The other place AI is pretty good is
           | surfacing API or function calls you might not know about
           | if you're new to the language. Basically, it can save you a
           | lot of time by avoiding the need for tons of internet
           | searching in some cases.
        
             | chairhairair wrote:
             | I have completely the opposite perspective.
             | 
             | Unit tests actually need to be correct, down to individual
             | characters. Same goes with API calls. The API needs to
             | actually exist.
             | 
             | Contrast that with "high level design, rough outlines".
             | Those can be quite vague and hand-wavy. That's where these
             | fuzzy LLMs shine.
             | 
             | That said, these LLM-based systems are great at writing
             | "change detection" unit tests that offer ~zero value (or
             | negative).
        
               | Aeolun wrote:
               | > That said, these LLM-based systems are great at writing
               | "change detection" unit tests that offer ~zero value (or
               | negative).
               | 
               | That's not at all true in my experience. With minimal
               | guidance they put out pretty sensible tests.
        
           | pawelduda wrote:
           | Exactly, one step forward, one step back. Edge cases are
           | something that can't be glossed over, and for that I need
           | to carefully review the code. Since I'm accountable for it,
           | and can't skip this part anyway, I'd rather review my own
           | than some chatbot's.
        
         | knallfrosch wrote:
         | I use it for an unfamiliar programming language and it's very
         | nice. You can also ask it to explain badly documented code.
        
         | weitendorf wrote:
         | I'm building a tool in this space and believe it's actually
         | multiple separate problems. From most to least solvable:
         | 
         | 1. AI coding tools benefit a lot from explicit
         | instructions/specifications and context for how their output
         | will be used. This is actually a very similar problem to when
         | eg someone asks a programmer "build me a website to do X" and
         | then is unhappy with the result because they actually wanted
         | to do "something like X", and a payments portal, and yellow
         | buttons, and to host it on their existing website. So models
         | need to be given those particular instructions somehow (there
         | are many ways to do it, I think my approach is one of the best
         | so far) and context (eg RAG via find-references, other files in
         | your codebase, etc)
         | 
         | 2. AI makes coding errors, bad assumptions, and mistakes just
         | like humans. It's rather difficult to implement auto-correction
         | in a good way, and goes beyond mere code-writing into "agentic"
         | territory. This is also what I'm working on.
         | 
         | 3. AI tools don't have architecture/software/system design
         | knowledge appropriately represented in their training data and
         | all the other techniques used to refine the model before
         | releasing it. More accurately, they might have _knowledge_ in
         | the form of eg all the blog posts and docs out there about it,
         | but not _skill_. Actually, there is some improvement here,
         | because I think o1 and 3.5 sonnet are doing some kind of
         | reinforcement learning/self-training to get better at this.
         | But it's not easily addressable on your end.
         | 
         | 4. There is ultimately a ton of context cached in your brain
         | that you cannot realistically share with the AI model, either
         | because it's not written anywhere or there is just too much of
         | it. For example, you may want to structure your code in a
         | certain way because your next feature will extend it or use it.
         | Or your product is hosted on serving platform Y which has an
         | implementation detail where it tries automatically setting
         | Content-Type response headers by appending them to existing
         | headers, so manually setting Content-Type in the response
         | causes bugs on certain clients. You can't magically stuff all
         | of this into the model context.
         | 
         | My product tries to address all of these to varying extents.
         | The largest gains in coding come from making it easier to
         | specify requirements and self-correct, but architecture/design
         | are much harder and not something we're working on much. You or
         | anybody else can feel free to email me if you're interested in
         | meeting for a product demo/feedback session - so far people
         | really like our approach to setting output specs.
        
         | fullstackwife wrote:
         | One of the reasons for that may be the price: large code
         | changes with multi-turn conversations can eat up a lot of
         | tokens, while those tools charge you a flat price per month.
         | Probably many hacks are done under the hood to keep *their*
         | costs low, and the user experiences this as lower quality
         | responses.
         | 
         | Still the "architecture and core libraries" is rather corner
         | case, something at the bottom of their current sales funnel.
         | 
         | also: do you really want to get equivalent of 1 FTE work for 20
         | USD per month?:)
        
         | tomrod wrote:
         | I'd love an autoselected LLM that is fine-tuned to the syntax
         | I'm actively using -- Cursor has a bit of a head start, but
         | where Github and others can take it could be mindblowing
         | (Cursor's moat is a decent VS Code extension -- I'm not sure
         | it's a deep moat though).
        
         | bmitc wrote:
         | That's my exact experience with GitHub Copilot. It's bad even
         | at boilerplate stuff. I have no idea why its
         | autocomplete is so bad when it has access to my code, the
         | function signatures, types, etc. It gets stuff wrong all the
         | time. For example, it will just flat out suggest functions that
         | don't exist in either the Python standard library or my own
         | modules. It doesn't make sense.
         | 
         | I have all but given up on using Copilot for code development.
         | I still do use it for autocomplete and boilerplate stuff, but I
         | still have to review that. So there's still quite a bit of
         | overhead, as it introduces subtle errors, especially in
         | languages like Python. Beyond that, its failure rate at
         | producing running, correct code is basically 100%.
        
         | miki123211 wrote:
         | For now, I mostly use AI as a "faster typist".
         | 
         | If it wants to complete what I wanted to type anyway, or
         | something extremely similar, I just press tab, otherwise I type
         | my own code.
         | 
         | I'd say about 70% of individual lines are obvious enough if you
         | have the surrounding context that this works pretty well in
         | practice. This number is somewhat lower in normal code and
         | higher in unit tests.
         | 
         | Another use case is writing one-off scripts that aren't
         | connected to any codebase in particular. If you're doing a lot
         | of work with data, this comes in very handy.
         | 
         | Something like "here's the header of a CSV file", pass each row
         | through model x, only pass these three fields, the model will
         | give you annotations, put these back in the csv and save, show
         | progress, save every n rows in case of crashes, when the output
         | file exists, skip already processed rows."
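         | 
         | (Roughly the kind of script that prompt describes; a sketch
         | only. The column names and the annotate() stand-in for the
         | model call are placeholders, and the resume logic assumes no
         | embedded newlines in fields.)
         | 
         |   import csv, os
         |   
         |   N_FLUSH = 50   # checkpoint every N rows so a crash loses little
         |   
         |   def annotate(fields):
         |       # placeholder for the call to "model X"
         |       raise NotImplementedError
         |   
         |   def process(in_path, out_path, keep=("col_a", "col_b", "col_c")):
         |       resume = (os.path.exists(out_path)
         |                 and os.path.getsize(out_path) > 0)
         |       done = 0
         |       if resume:   # count rows written on a previous run
         |           with open(out_path, newline="") as f:
         |               done = max(0, sum(1 for _ in f) - 1)
         |       with open(in_path, newline="") as src, \
         |            open(out_path, "a", newline="") as dst:
         |           reader = csv.DictReader(src)
         |           writer = csv.DictWriter(
         |               dst, fieldnames=reader.fieldnames + ["annotation"])
         |           if not resume:
         |               writer.writeheader()
         |           for i, row in enumerate(reader):
         |               if i < done:   # already processed, skip
         |                   continue
         |               subset = {k: row[k] for k in keep}
         |               row["annotation"] = annotate(subset)
         |               writer.writerow(row)
         |               if (i + 1) % N_FLUSH == 0:
         |                   dst.flush()                       # checkpoint
         |                   print(f"processed {i + 1} rows")  # progress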
         | 
         | I'm not (yet) convinced by AI writing entire features, I tried
         | that a few times and it was very inconsistent with the
         | surrounding codebase. Managing which parts of the codebase to
         | put in its context is definitely an art though.
         | 
         | It's worth keeping in mind that this is the worst AI we'll ever
         | have, so this will probably get better soon.
        
           | zeroonetwothree wrote:
           | One off scripts do work very well.
        
           | blitzar wrote:
           | Reminds me of how I use the satnav when driving.
           | 
           | I don't close my eyes and do whatever it tells me to do. If I
           | think I know better, I don't "turn right at the next set of
           | lights"; I just drive on as I would have before GPS, and
           | eventually either I realise that I went the wrong way or the
           | satnav realises there was a perfectly valid 2nd/3rd/4th path
           | to get to where I wanted to go.
        
         | pnathan wrote:
         | In general, I do not find AI a net positive; other tools seem
         | to do at least as well.
         | 
         | it can be used if you want the reliability of a random forum
         | poster. which... sure. knock yourself out. sometimes there's
         | gems in that dirt.
         | 
         | I'm getting _very_ bearish on using LLMs for things that aren't
         | pattern recognition.
        
         | ianbutler wrote:
         | I'm actually very curious why AI use is such a bi-modal
         | experience. I've used AI to move multi-thousand-line codebases
         | between languages. I've created new apps from scratch with it.
         | 
         | My theory is it comes down to the willingness to babysit, and
         | the modality. I'm perfectly fine telling the tool about its
         | errors and working side by side with it as if it were another
         | person. At the end of the day it can belt out lines of code
         | faster than I, or any human, can, and I can review code very
         | quickly, so the overall
         | productivity boost has been great.
         | 
         | It does fundamentally alter my workflow. I'm very hands off
         | keyboard when I'm working with AI in a way that is much more
         | like working with someone or coaching someone to make something
         | instead of doing the making myself. Which I'm fine with but
         | recognize many developers aren't.
         | 
         | I use AI autocomplete 0% of the time as I found that workflow
         | was not as effective as me just writing code, but most of my
         | most successful work using AI is a chat dialogue where I'm
         | letting it build large swaths of the project a file or parts of
         | a file at a time, with me reviewing and coaching.
        
           | __float wrote:
           | I'm not sure how many people are like me, but my attempts to
           | use Copilot have largely been in the context of writing code
           | as
           | usual, occasionally getting end-of-line or handful-of-lines
           | completions from it. I suspect there's probably a bigger
           | shift needed, but I haven't seen anyone (besides AI
           | "influencers" I don't trust..?) showing what their day-to-day
           | workflows look like.
           | 
           | Is there a Vimcasts equivalent for learning the AI editor
           | tips and tricks?
        
             | sbarre wrote:
             | Have you tried the chat mode?
             | 
             | The autocomplete is somewhere between annoying and
             | underwhelming for me, but the chat is super useful. Being
             | able to just describe what you're thinking or what you're
             | trying to do and having a bespoke code sample just show up
             | (based on the code in your editor) that you can then either
             | copy/paste in, cherry-pick from or just get inspired by,
             | has been a great productivity booster..
             | 
             | Treat it like a pair programmer or a rubber duck and you
             | might have a better experience. I did!
        
           | zeroonetwothree wrote:
           | I guess for me it actually takes longer to review code than
           | to write it. So maybe that's some of the difference.
        
           | 0xFACEFEED wrote:
           | As a programmer of over 20 years - this is terrifying.
           | 
           | I'm willing to accept that I just have "get off my lawn"
           | syndrome or something.
           | 
           | But the idea of letting an LLM write/move large swaths of
           | code seems so incredibly irresponsible. Whenever I sit down
           | to write some code, be it a large implementation or a small
           | function, I think about what other people (or future versions
           | of myself) will struggle with when interacting with the code.
           | Is it clear and concise? Is it too clever? Is it too easy to
           | write a subtle bug when making changes? Have I made it
           | totally clear that X is relying on Y dangerous behavior by
           | adding a comment or intentionally making it visible in some
           | other way?
           | 
           | It goes the other way too. If I know someone well (or their
           | style) then it makes evaluating their code easier. The more
           | time I spend in a codebase the better idea I have of what the
           | writer was trying to do. I remember spending a lot of time
           | reading the early Redis codebase and got a pretty good sense
           | of how Salvatore thinks. Or altering my approaches to code
           | reviews depending on which coworker was submitting it. These
           | weren't things I was doing out of desire but because all
           | non-trivial code has so much subtlety; it's just the nature
           | of the beast.
           | 
           | So the thought of opening up a codebase that was cobbled
           | together by an AI is just scary to me. Subtle bugs and errors
           | would be equally distributed across the whole thing instead
           | of where the writer was less competent (as is often the
           | case). The whole thing just sounds like a gargantuan mess.
           | 
           | Change my mind.
        
             | bongodongobob wrote:
             | You. Can. Write. Tests.
        
               | ok_dad wrote:
               | Tests haven't saved us so far; humans have been writing
               | tests that passed for software with bugs for decades.
        
               | lanternfish wrote:
               | Tests aren't a full solution for all the considerations
               | of the above post.
        
               | blitzar wrote:
               | Just let the LLM do that too.
        
               | the_real_cher wrote:
               | Even better you can let the AI write tests.
        
               | hakunin wrote:
               | How do you write a test for code clarity / readability /
               | maintainability?
        
               | 0xFACEFEED wrote:
               | How do tests account for cases where I'm looking at a 100
               | line function that could have easily been written in 20
               | lines with just as much, if not more, clarity?
               | 
               | It reminds me of a time (long ago) when the trend/fad was
               | building applications visually. You would drag and drop
               | UI elements and define logic using GUIs. Behind the
               | scenes the IDE would generate code that linked everything
               | together. One of the selling points was that underneath
               | the hood it's just code so if someone didn't have access
               | to the IDE (or whatever) then they could just open the
               | source and make edits themselves.
               | 
               | It obviously didn't work out. But not because of the
               | scope/scale (something AI code generation solves) but
               | because, it turns out, writing maintainable secure
               | software takes a lot of careful thought.
               | 
               | I'm not talking about asking an AI to vomit out a CRUD
               | UI. For that I'm sure it's well suited and the risk is
               | pretty low. But as soon as you introduce domain specific
               | logic or non-trivial things connected to the real world -
               | it requires thought. Often times you need to spend more
               | time thinking about the problem than writing the code.
               | 
               | I just don't see how "guidance" of an LLM gets anywhere
               | near writing good software outside of trivial stuff.
        
             | sbarre wrote:
             | I think it depends on the stakes of what you're building.
             | 
             | A lot of the concerns you describe make me think you work
             | in a larger company or team and so both the organizational
             | stakes (maintenance, future changes, tech debt, other
             | people taking it over) and the functional stakes (bug free,
             | performant, secure, etc) are high?
             | 
             | If the person you're responding to is cranking out a
             | personal SaaS project or something they won't ever want to
             | maintain much, then they can do different math on risks.
             | 
             | And probably also the language you're using, and the actual
             | code itself.
             | 
             | Porting a multi-thousand line web SaaS product in
             | Typescript that's just CRUD operations and cranking out web
             | views? Sure why not.
             | 
             | Porting a multi-thousand line game codebase that's
             | performance-critical and written in C++? Probably not.
             | 
             | That said, I am super fascinated by the approach of "let
             | the LLM write the code and coach it when it gets it wrong"
             | and I feel like I want to try that.. But probably not on a
             | work project, and maybe just on a personal project.
        
             | ianbutler wrote:
             | I have 10 years professional experience and I've been
             | writing code for 20 years, really with this workflow I just
             | read and review significantly more code and I coach it when
             | it structures or styles something in a way I don't like.
             | 
             | I'm fully in control, and nothing gets committed that I
             | haven't read; it's an extension of me at that point.
        
             | geysersam wrote:
             | I'll take a stab at changing your mind.
             | 
             | AIs are not able to write Redis. That's not their job. AIs
             | should not write complex high performance code that
             | millions of users rely on. If the code does something
             | valuable for a large number of people, you can afford to
             | have humans write it.
             | 
             | AIs should write low value code that just repeats what's
             | been done before but with some variations. Generic parts of
             | CRUD apps, some fraction of typical frontends, common CI
             | setups. That's what they're good at because they've seen it
             | a million times already. That category constitutes most
             | code written.
             | 
             | This relieves human developers of ballpark 20% of their
             | workload and that's already worth a lot of money.
        
           | rqmedes wrote:
           | I agree. I am in a very senior role and find that working
           | with AI the same way you do, I am many times more productive.
           | Months of work becomes days or even hours of work.
        
           | bongodongobob wrote:
           | My theory is grammatical correctness and specificity. I see a
           | lot of people prompt like this:
           | 
           | "use python to write me a prog that does some dice rolls and
           | makes a graph"
           | 
           | Vs
           | 
           | "Create a Python program that generates random numbers to
           | simulate a series of dice rolls. Export a graph of the
           | results in PNG format."
           | 
           | Information theory requires that you provide enough actual
           | information. There is a minimum amount of work needed to
           | supply the input. Otherwise, the gaps get filled in with
           | noise, which may or may not work, and may or may not be what
           | you want.
           | 
           | For example, maybe someday you could say "write me an OS" and
           | it would work. However, to get exactly what you want, you
           | still have to specify it. You can only compress so far.
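           | 
           | (For what it's worth, the second prompt pins down roughly
           | this much code, while the first leaves roll count, die size,
           | and output format for the model to guess. A sketch, assuming
           | matplotlib is available:)
           | 
           |   import random
           |   from collections import Counter
           |   import matplotlib.pyplot as plt
           |   
           |   def simulate(n_rolls=1000, sides=6):
           |       # roll the die and tally how often each face comes up
           |       rolls = [random.randint(1, sides) for _ in range(n_rolls)]
           |       counts = Counter(rolls)
           |       faces = sorted(counts)
           |       plt.bar(faces, [counts[f] for f in faces])
           |       plt.xlabel("Face")
           |       plt.ylabel("Count")
           |       plt.title(f"{n_rolls} rolls of a {sides}-sided die")
           |       plt.savefig("dice_rolls.png")  # export the graph as PNG
           |   
           |   if __name__ == "__main__":
           |       simulate()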
        
         | Aeolun wrote:
         | > in practice I'm not noticing a productivity boost
         | 
         | I am. Can suddenly do in a weekend what would have taken a
         | week.
        
         | Yodel0914 wrote:
         | I find chatgpt incredibly useful for writing scripts against
         | well-known APIs, or for a "better stackoverflow". Things like
         | "how do I use a cursor in sql" or "in a devops yaml pipeline, I
         | want to trigger another pipeline. How do I do that?".
         | 
         | But working on our actual codebase with copilot in the IDE
         | (Rider, in my case) is a net negative. It usually does OK when
         | it's suggesting the completion of a single line, but when it
         | decides to generate a whole block it invariably misunderstands
         | the point of the code. I could imagine that getting better if I
         | wrote more descriptive method names or comments, but the killer
         | for me is that it just makes up methods and method signatures,
         | even for objects that are part of publicly documented
         | frameworks/APIs.
        
       | fifteen1506 wrote:
       | Paywall; can't read legally.
        
       | ed_elliott_asc wrote:
       | I am excited about this as I use Claude for coding but what I
       | really like about Copilot is that if you have a list of
       | something random like:
       | 
       | /* Col1 varchar not null, Col2 int null, Col3 int not null */
       | 
       | Then start doing something else like:
       | 
       | | column | type    |
       | | ---    | ---     |
       | | Col1   | varchar |
       | 
       | Then copilot is very good at guessing the rest of the table.
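       | 
       | With just that much context, the suggested completion is usually
       | something like the remaining rows:
       | 
       | | Col2   | int     |
       | | Col3   | int     |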
       | 
       | (This isn't just sql to markdown it works whenever you want to
       | repeat something using parts of another list somewhere in the
       | same doc)
       | 
       | I hope this continues, as it has been a game changer for me; it
       | is so quick, really great.
        
       | johnyzee wrote:
       | > _"The size of the Lego blocks that Copilot on AI can generate
       | has grown [...] It certainly cannot write a whole GitHub or a
       | whole Facebook, but the size of the building blocks will
       | increase"_
       | 
       | Um, that would make it _less_ capable, not more...  /thatguy
        
       | delduca wrote:
       | I don't like using AI assistants in my editor; I prefer to keep
       | it as clean as possible. So, I manually copy relevant parts of
       | the code into ChatGPT, ask my question, and continue interacting
       | until I get what I need. It's a bit manual, but since I use GPT
       | for other tasks, it's convenient to have a single interface for
       | everything.
        
       | qubitly wrote:
       | So GitHub's teaming up with Google, Anthropic, and OpenAI? Kinda
       | feels like Microsoft's version of a 'safety net', but for whom,
       | exactly?
       | It's hard not to wonder if this is actually about choice for the
       | user or just insurance for Microsoft
        
       | suyash wrote:
       | Does this mean you need to be a paying user of Claude and
       | Gemini, or just of GitHub Copilot?
        
       | richardw wrote:
       | This kind of thing is why I think Sam is often misjudged. You
       | can't fuck around in such a competitive market. If you go in all
       | kumbaya you'll get crushed by market forces. It's rare for
       | company/founder ideals to survive the market indefinitely. I
       | think he's iterated fast and the job is still very hard.
        
       | chucke1992 wrote:
       | This Github purchase was incredible
        
       | moondistance wrote:
       | Microsoft is negotiating equity in OpenAI as part of the switch
       | to a for-profit. Non-zero chance this is a negotiation flex.
        
       | pavelboyko wrote:
       | I mentored junior SWEs and CS students for years, and now using
       | Claude as a coding assistant feels very similar. Yesterday, it
       | suggested implementing a JSON parser from scratch in C to avoid a
       | dependency -- and, unsurprisingly, the code didn't work. Two main
       | differences stand out: 1) the LLM doesn't learn from corrections
       | (at least not directly), and 2) the feedback loop is seconds
       | instead of days. This speed is so convenient that it makes hiring
       | junior SWEs seem almost pointless, though I sometimes wonder
       | where we'll find mid-level and senior developers tomorrow if we
       | stop hiring juniors today.
        
         | hypeatei wrote:
         | Years of experience doesn't correlate with being a good
         | developer
         | either. I've seen senior devs using AI to solve impossible
         | problems, for example asking it how to store an API key client
         | side without leaking it...
        
         | al_borland wrote:
         | Does speed matter when it's not getting better and learning
         | from corrections? I think I'd rather give someone a problem and
         | have them come back with something that works in a couple days
         | (answering a question here or there), rather than spend my time
         | doing it myself because I'm getting fast, but wrong, results
         | that aren't improving from the AI.
         | 
         | > though I sometimes wonder where we'll find mid-level and
         | senior developers tomorrow if we stop hiring juniors today.
         | 
         | This is also a key point. There is a lot of short-term
         | thinking these days, since people don't stick with companies
         | like they used to. As a person who has been with my company for
         | close to 20 years, making sure things can still run once you
         | leave is important from a business perspective.
         | 
         | Training isn't about today, it's about tomorrow. I've trained a
         | lot of people, and doing it myself would always be faster in
         | the moment. But it's about making the team better and making
         | sure more people have more skill, to reduce single points of
         | failure and ensure business continuity over the long-term. Not
         | all of it pays off, but when it does, it pays off big.
        
       | xpe wrote:
       | Award for most ambiguous headline. ("cuts" can mean "initiates"
       | or "terminates"!)
        
         | zeroonetwothree wrote:
         | "Cut a deal" is a standard phrase with only one meaning
        
           | nvader wrote:
           | But when you make it "cut AI deal", that breaks the standard
           | phrase and opens the door to alternative interpretations. I
           | initially thought this was a news article about the deal
           | breaking up.
        
             | AntiqueFig wrote:
             | I thought the same indeed.
        
           | dankwizard wrote:
           | A word can have multiple meanings.
        
           | corobo wrote:
           | Yes but cuts deals, which is what the title says, is
           | ambiguous.
           | 
           | Those are some load bearing pluralisations right there.
        
         | fabmilo wrote:
         | I also thought that they were ending some previous deal and not
         | creating a new one.
        
       | lifeisstillgood wrote:
       | I still think it's worth emphasising - LLMs represent a massive
       | capital absorber. Taking gobs of funding into your company is how
       | you grow, how your options become more valuable, how your
       | employees stay with you. If that treadmill were to break, bad
       | things would happen.
       | 
       | Search has been stuttering for a while - Google's growth and
       | investment have been flattening - at some point they absorbed
       | all the world's stored information.
       | 
       | OpenAI showed the new growth - we need billions of dollars to
       | build and run the LLMs (at a loss, one assumes) - the
       | treadmill can keep going.
        
       ___________________________________________________________________
       (page generated 2024-10-29 23:00 UTC)