[HN Gopher] GitHub cuts AI deals with Google, Anthropic
___________________________________________________________________
GitHub cuts AI deals with Google, Anthropic
Author : jbredeche
Score : 538 points
Date : 2024-10-29 16:20 UTC (6 hours ago)
(HTM) web link (www.bloomberg.com)
(TXT) w3m dump (www.bloomberg.com)
| altbdoor wrote:
| https://archive.is/Il4QM
| campbel wrote:
| This is pretty exciting. I'm a copilot user at work, but also
| have access to Claude. I'm more inclined to use Claude for
| difficult coding problems or to review my work as I've just grown
| more confident in its abilities over the last several months.
| ganoushoreilly wrote:
| I too use Claude more frequently than OpenAI's GPT-4o. I think
| this is a twofold move for MS and I like it. Claude being more
| accurate / efficient for me suggests they likely see the same
| thing, win number 1. The second is that with all the OpenAI
| drama, MS has started to distance themselves over a souring
| relationship (allegedly). If so, this is a tactful way to move
| away.
|
| Either way, Claude is great so this is a net win for everyone.
| dartos wrote:
| Yeah, Claude consistently impresses me.
|
| A commenter on another thread mentioned it but it's very
| similar to how search felt in the early 2000s. I ask it a
| question and get my answer.
|
| Sometimes it's a little (or a lot) wrong or outdated, but at
| least I get something to tinker with.
| gonab wrote:
| Conversely, I feel that the experience of searching has
| degraded a lot since 2016/17. My thesis is that, around that
| time, online spam increased by an order of magnitude.
| TeaBrain wrote:
| I don't think this is necessarily converse to what they
| said.
| bobthepanda wrote:
| Winning the war against spam is an arms race. Spam hasn't
| spent years targeting AI search yet.
| dageshi wrote:
| I think it was the switch from desktop search traffic
| being dominant to mobile traffic being dominant; that
| switch happened around the end of 2016.
|
| Google used to prioritise big comprehensive articles on
| subjects for desktop users, but mobile users just wanted
| quick answers, so that's what Google prioritised as mobile
| users became the biggest group.
|
| But also, per your point, I think those smaller, simpler,
| less comprehensive posts are easier to fake/spam than the
| larger, more comprehensive posts that came before.
| zeknife wrote:
| Ironically, I almost never see quick answers in the top
| results; mostly it's dragged-out pages of paragraph after
| paragraph with ads in between.
| dartos wrote:
| Guess who sells the ads...
| state_less wrote:
| Old style Google search is dead, folks just haven't
| closed the casket yet. My index queries are down ~90%. In
| the future, we'll look back at LLMs as a major turning
| point in how people retrieve and consume information.
| darepublic wrote:
| I still prefer it over using an LLM. And I would be doubtful
| that LLM search has major benefits over Google search, imo.
| ben_w wrote:
| Depends what you want it for.
|
| Right now, I find each tool better at different things.
|
| If I can only describe what I want but don't know the key
| words, LLMs are the only solution.
|
| If I need citations, LLMs suck.
| EVa5I7bHFq9mnYK wrote:
| It's getting ridiculous. Half of the time now when I ask
| AI to search some information for me, it finds and
| summarizes some very long article obviously written by
| AI, and lacking any useful information.
| imiric wrote:
| I recently tried to ask these tools for help with using a
| popular library, and both GPT-4o and Claude 3.5 Sonnet gave
| highly misleading and unusable suggestions. They
| consistently hallucinated APIs that didn't exist, and would
| repeat the same wrong answers, ignoring my previous
| instructions. I spent upwards of 30 minutes repeating "now
| I get this error" to try to coax them in the right
| direction, but always ended up in a loop that got me
| nowhere. Some of the errors were really basic too, like
| referencing a variable that was never declared, etc.
| Finally, Claude made a tangential suggestion that made me
| look into using a different approach, but it was still
| faster to look into the official documentation than to keep
| asking it questions. GPT-4o was noticeably worse, and I
| quickly abandoned it.
|
| If this is the state of the art of coding LLMs, I really
| don't see why I should waste my time evaluating their
| confident sounding, but wrong, answers. It doesn't seem
| like much has improved in the past year or so, and at this
| point this seems like an inherent limitation of the
| architecture.
| geodel wrote:
| Well, it's a volume business. The <1% of advanced
| developers will find an AI helper useless, but for the 99%
| of IT CRUD peddlers these tools are quite sufficient. All
| in all, if employers cut 15-20% of net development costs
| by reducing head count, it will be very worthwhile for
| companies.
| WgaqPdNr7PGLGVW wrote:
| I suspect it will go a different direction.
|
| Codebases are exploding in size. Feature development has
| slowed down.
|
| What might have been a carefully designed 100kloc
| codebase in 2018 is now a 500kloc ball of mud in 2024.
|
| Companies need many more developers to complete a decent
| sized feature than they needed in 2018.
| outworlder wrote:
| It's worse than that. Now the balls of mud are
| distributed. We get incredibly complex interactions
| between services which need a lot of infrastructure to
| enable them, that requires more observability, which
| requires more infrastructure...
| imiric wrote:
| Sure, but my specific question was fairly trivial, using
| a mainstream language and a popular library. Most of my
| work qualifies as CRUD peddling. And yet these tools are
| still wasting my time.
|
| Maybe I'll have better luck next time, or maybe I need to
| improve my prompting skills, or use a different model,
| etc. I was just expecting more from state of the art LLMs
| in 2024.
| WgaqPdNr7PGLGVW wrote:
| Yeah there is a big disconnect between the devs caught up
| in the hype and the devs who aren't.
|
| A lot of the devs in my office using Claude/gpt are
| convinced they are so much more productive but they
| aren't actually producing features or bug fixes any
| faster.
|
| I think they are just excited about a novel new way to
| write code.
| dartos wrote:
| FWIW I almost never ask it to write code for me. I did
| once to write a matplotlib script and it gave me a
| similar headache.
|
| I ask it questions mostly about libraries I'm using
| (usually that have poor documentation) and how to
| integrate it with other libraries.
|
| I found out about Yjs by asking about different
| operational transform patterns.
|
| Got some context on the prosemirror plugin by pasting the
| entire provider class into Claude and asking questions.
|
| It wasn't always exactly correct, but it was correct
| enough that it made the process of learning prosemirror,
| yjs, and how they interact pretty nice.
|
| The "complete" examples it kept spitting out were totally
| wrong, but the information it gave me was not.
| imiric wrote:
| To be clear, I didn't ask it to write something complex.
| The prompt was "how do I do X with library Y?", with a
| bit more detail. The library is fairly popular and in a
| mainstream language.
|
| I had a suspicion that what I was trying to do was simply
| not possible with that library, but since LLMs are
| incapable of saying "that's not possible" or "I don't
| know", they will rephrase your prompt and hallucinate
| whatever might plausibly make sense. They have no way to
| gauge whether what they're outputting is actually
| correct.
|
| So I can imagine that you sometimes might get something
| useful from this, but if you want a specific answer about
| something, you will always have to double-check their
| work. In the specific case of programming, this could be
| improved with a simple engineering task: integrate the
| output with a real programming environment, and evaluate
| the result of actually running the code. I think there
| are coding assistant services that do this already, but
| frankly, I was expecting more from simple chat services.
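|
| A minimal sketch of that kind of feedback loop (ask_model is
| a hypothetical stand-in for whatever chat API you use; the
| rest is plain Python):
|
|     import subprocess, tempfile
|
|     def ask_model(prompt: str) -> str:
|         # hypothetical stand-in for a real chat API call
|         raise NotImplementedError
|
|     def generate_and_check(task: str, rounds: int = 3) -> str:
|         prompt = task
|         for _ in range(rounds):
|             code = ask_model(prompt)
|             f = tempfile.NamedTemporaryFile(
|                 "w", suffix=".py", delete=False)
|             f.write(code)
|             f.close()
|             # actually run the generated code
|             run = subprocess.run(
|                 ["python", f.name],
|                 capture_output=True, text=True)
|             if run.returncode == 0:
|                 return code  # ran cleanly; still review it
|             # feed the real error back instead of retyping it
|             prompt = (task
|                       + "\n\nYour last attempt failed with:\n"
|                       + run.stderr)
|         raise RuntimeError("no working version after retries")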
| thelittleone wrote:
| I'm the same, but had a lot of issues getting structured
| output from Anthropic. Ended up always writing response
| processors. Frustrated by how fragile that was, decided to
| try OpenAI structured outputs and it just worked and since
| they also have prompt caching now, it worked out very well
| for my use case.
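|
| For reference, the structured-output pattern being described
| is roughly this with the OpenAI Python SDK (the schema fields
| here are made up for illustration):
|
|     from pydantic import BaseModel
|     from openai import OpenAI
|
|     class Review(BaseModel):  # made-up schema
|         summary: str
|         severity: int
|
|     client = OpenAI()
|     resp = client.beta.chat.completions.parse(
|         model="gpt-4o-2024-08-06",
|         messages=[{"role": "user",
|                    "content": "Summarize this diff: ..."}],
|         response_format=Review,
|     )
|     review = resp.choices[0].message.parsed  # a Review instance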
|
| Anthropic seems to have addressed the issue using pydantic,
| but I haven't had a chance to test it yet.
|
| I pretty much use Anthropic for everything else.
| JacobThreeThree wrote:
| >The second is with all the OpenAI drama MS has started to
| distance themselves over a souring relationship (allegedly).
| If so, this could be a smart move away tactfully.
|
| I agree, this was a tactical move designed to give them
| leverage over OpenAI.
| a_wild_dandan wrote:
| The speed with which AI models are improving blows my mind.
| Humans quickly normalize technological progress, but it's
| staggering to reflect on our progress over just these _two
| years_.
| campbel wrote:
| Yes! I'm much more inclined to write one-off scripts for
| short manual tasks as I can usually get AI to get something
| useful very fast. For example, last week I worked with Claude
| to write a script to get a sense of how many PRs my company
| had that included comprehensive testing. This was borderline
| best done as a manual task previously, now I just ask Claude
| to write a short bash script that uses the GitHub CLI to do
| it and I've got a repeatable reliable process for pulling
| this information.
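|
| The shape of that kind of script, for the curious -- a Python
| sketch rather than the actual bash one; the `gh` JSON fields
| are from memory and the "has tests" heuristic is a stand-in,
| so treat both as assumptions:
|
|     import json, subprocess
|
|     # recent merged PRs with their changed files, via the
|     # GitHub CLI ("files" should be a valid --json field)
|     out = subprocess.run(
|         ["gh", "pr", "list", "--state", "merged",
|          "--limit", "200", "--json", "number,files"],
|         capture_output=True, text=True, check=True).stdout
|
|     prs = json.loads(out)
|     with_tests = [
|         pr for pr in prs
|         if any("test" in f["path"].lower()
|                for f in pr["files"])
|     ]
|     print(f"{len(with_tests)}/{len(prs)} merged PRs "
|           "touch test files")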
| unshavedyak wrote:
| I rarely use LLMs for tasks but i love it for exploring
| spaces i would otherwise just ignore. Like writing some
| random bash script isn't difficult at all, but it's also so
| fiddly that i just don't care to do it. It's nice to just
| throw a bot at it and come back later. Loosely speaking.
|
| Still i find very little use from LLMs in this front, but
| they do come in handy randomly.
| unshavedyak wrote:
| I wonder how long people will still protest in these threads
| that "It doesn't know anything! It's just an autocomplete
| parrot!"
|
| Because.. yea, it is. However.. it keeps expanding, it keeps
| getting more useful. Yea people and especially companies are
| using it for things which it has no business being involved
| in.. and despite that it keeps growing, it keeps progressing.
|
| I do find the "stochastic parrot" comments slowly dwindle in
| number and volume with each significant release, though.
|
| Still, i find it weirdly interesting to see a bunch of people
| be both right and "wrong" at the same time. They're
| completely right, and yet it's like they're also being proven
| wrong in the ways that matter.
|
| Very weird space we're living in.
| a_wild_dandan wrote:
| The "statistical parrot" parrots have been demonstrably
| wrong for years (see e.g. LeCun et al[1]). It's just harder
| to ignore reality with hundreds of millions of people now
| using incredible new AI tools. We're approaching "don't
| believe your lying eyes" territory. Deniers will continue
| pretending that LLMs are just an NFT-level fad or bubble or
| whatever. The AI revolution will continue to pass them by.
| More's the pity.
|
| [1] https://arxiv.org/abs/2110.09485
| mensetmanusman wrote:
| A trillion dimensional stochastic parrot is still a
| stochastic parrot.
|
| If these systems showed understanding we would notice.
|
| No one is denying that this form of intelligence is
| useful.
| logicchains wrote:
| I don't know how you can say they lack understanding of
| the world when in pretty much any standardised test
| designed to measure human intelligence they perform
| better than the average human. The only thing they don't
| understand is touch, because they're not trained on that,
| but they can already understand audio and video.
| zeknife wrote:
| You said it, those tests are designed to measure human
| intelligence, because we know that there is a
| correspondence between test results and other, more
| general tasks - in humans. We do not know that such a
| correspondence exists with language models. I would
| actually argue that they demonstrably do not, since even
| an LLM that passes every IQ test you put in front of it
| can still trip up on trivial exceptions that wouldn't
| fool a child.
| zeroonetwothree wrote:
| An answer key would outperform the average human but it
| isn't intelligent. Tests designed for humans are not
| appropriate to judge non humans.
| devmor wrote:
| No you don't understand, if i put a billion billion
| trillion monkeys on typewriters, they're actually now one
| super intelligent monkey because they're useful now!
|
| We just need more monkeys and it will be the same as a
| human brain.
| croes wrote:
| What does the mass of users change about what it is? How
| many of these check the results for hallucinations, and
| how many don't because they just trust the AI?
|
| More than once these tools fail at tasks a fifth grader
| could understand
| outworlder wrote:
| > Deniers will continue pretending that LLMs are just an
| NFT-level fad or bubble or whatever. The AI revolution
| will continue to pass them by. More's the pity.
|
| You should re-read that very slowly and carefully and
| really think about it. Calling anyone that's skeptical a
| 'denier' is a red flag.
|
| We have been through these AI cycles before. In every
| case, the tools were impressive for their time. Their
| limitations were always brushed aside and we would get a
| hype cycle. There was nothing wrong with the technology,
| but humans always like to try to extrapolate their
| capabilities and we usually get that wrong. When hype
| caught up to reality, investments dried up and nobody
| wanted to touch "AI" for a while.
|
| Rinse, repeat.
|
| LLMs are again impressive, for our time. When the dust
| settles, we'll get some useful tools but I'm pretty sure
| we will experience another - severe - AI winter.
|
| If we had some optimistic but also realistic discussions
| on their limitations, I'd be less skeptical. As it is, we
| are talking about 'revolution', and developers being out
| of jobs, and superintelligence and whatnot. That's _not_
| the level the technology is at today and it is not clear
| we are going to do anything else other than get stuck in
| a local maximum.
| wavemode wrote:
| You're conflating three different things.
|
| There's the question, "is an LLM just autocomplete"? The
| answer to that question is obviously no, but the question
| is also a strawman - people who actually use LLM's
| regularly do recognize that there is more to their
| capabilities than randomized pattern matching.
|
| Separately, there's the question of "will LLM's become AGI
| and/or become super intelligent." Most people recognize
| that LLM's are not currently super intelligent, and that
| there currently isn't a clear path toward making them so.
| Still, many people seem to feel that we're on the verge of
| progress here, and feel very strongly that anyone who
| disagrees is an AI "doomer".
|
| Then there's the question of "are we in an AI bubble"? This
| is more a matter of debate. Some would argue that if LLM
| reasoning capabilities plateau, people will stop investing
| in the technology. I actually don't agree with that view -
| I think there is a lot of economic value still yet to be
| realized in AI advancements - I don't think we're on the
| verge of some sort of AI winter, even if LLM's never become
| super intelligent.
| sdesol wrote:
| > Most people recognize that LLM's are not currently
| super intelligent,
|
| I think calling it intelligent is being extremely
| generous. Take a look at the following example which is a
| spelling and grammar checker that I wrote:
|
| https://app.gitsense.com/?doc=f7419bfb27c89&temperature=0
| .50...
|
| When the temperature is 0.5, both Claude 3.5 and GPT-4o
| can't properly recognize that GitHub is capitalized. You
| can see the responses by clicking in the sentence. Each
| model was asked to validate the sentence 5 times.
|
| If the temperature is set to 0.0, most models will get it
| right (most of the time), but Claude 3.5 still can't see
| the sentence in front of it.
|
| https://app.gitsense.com/?doc=f7419bfb27c89&temperature=0
| .00...
|
| Right now, an LLM is an insanely useful and powerful next-
| word predictor, but I wouldn't call it intelligent.
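|
| For context, temperature is just a sampling knob on the API
| call. A minimal sketch with the Anthropic Python SDK (the
| prompt is invented):
|
|     import anthropic
|
|     client = anthropic.Anthropic()
|     msg = client.messages.create(
|         model="claude-3-5-sonnet-20240620",
|         max_tokens=100,
|         # 0.0 is (near) greedy decoding; 0.5 adds sampling
|         # randomness, which is what the tests above vary
|         temperature=0.0,
|         messages=[{"role": "user",
|                    "content": "Is 'GitHub' capitalized "
|                               "correctly in: 'I pushed to "
|                               "github today.'?"}],
|     )
|     print(msg.content[0].text)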
| digging wrote:
| > I think calling it intelligent is being extremely
| generous ... can't properly recognize that GitHub is
| capitalized.
|
| Wouldn't this make chimpanzees and ravens and dolphins
| unintelligent too? You're asking it to do a task that's
| (mostly) easy _for humans_. It's not a human though.
| It's an alien intelligence which "thinks" in our
| language, but not in the same way we do.
|
| If they could, specialized AI might think we're
| unintelligent based on how often we fail, even with
| advanced tools, pattern matching tasks that are trivial
| for them. Would you say they're right to feel that way?
| sdesol wrote:
| Animals are capable of learning. LLMs cannot. An LLM uses
| weights that are defined during the training process to
| decide what to do next. An LLM cannot self-evaluate based
| on what it has said. You have to create a new message for
| it to create a new probability path.
|
| Animals have the ability to learn and grow by themselves.
| LLMs are not intelligent and I don't see how they can be
| since they just follow the most likely path with
| randomness (temperature) sprinkled in.
| croes wrote:
| Are you confusing frequency of use with usefulness?
|
| If these tools boost productivity, where is the output
| spike at all the companies, the spike in revenue and
| profits?
|
| How often do we lose the benefit of auto text generation
| to the loop of "That's wrong" / "Oh yes, of course, here
| is the correct version" / "Nope, still wrong" / prompt
| editing?
| ffujdefvjg wrote:
| Lots of progress, but I feel like we've been seeing
| diminishing returns. I can't help but feel like recent
| improvements are just refinements and not real advances. The
| interest in AI may drive investment and research in better
| models that are game-changers, but we aren't there yet.
| ipsum2 wrote:
| I don't know about you, but o1-preview/o1-mini has been
| able to solve many moderately challenging programming tasks
| that would've taken me 30 mins to an hour. No other models
| earlier could've done that.
| ffujdefvjg wrote:
| It's an improvement but...I've asked it to do some really
| simple tasks and it'll occasionally do them in the most
| roundabout way you could imagine. Like, let's source a
| bash file that creates and reads a state file to do
| something for which the functionality was already built-
| in. Say I'm a little skeptical of this solution and plug
| it into a new o1-preview prompt to double check the
| solution, and it starts by critiquing the bash script and
| error handling instead of seeing that the functionality
| is baked in and it's plainly documented. Other errors
| have been more subtle.
|
| When it works, it's pretty good, and sometimes great. But
| when failure modes look like the above I'm very wary of
| accepting its output.
| warkdarrior wrote:
| > I've asked it to do some really simple tasks and it'll
| occasionally do them in the most roundabout way you could
| imagine.
|
| But it still does the tasks you asked for, so that's the
| part that really matters.
| TeMPOraL wrote:
| You're proving GP's point about normalization of progress.
| It's been two years. We're still during the first iteration
| of applications of this new tech, advancements didn't have
| time yet to start compounding. This is barely getting
| started.
| pseudosavant wrote:
| I use both Claude and ChatGPT/GPT-4o a lot. Claude, the model,
| definitely is 'better' than GPT-4o. But OpenAI provides a much
| more capable app in ChatGPT and an easier development platform.
|
| I would absolutely choose to use Claude as my model with
| ChatGPT if that happened (yes, I know it won't). ChatGPT as an
| app is just so far ahead: code interpreter, web search/fetch,
| fluid voice interaction, Custom GPTs, image generation, and
| memory. It isn't close. But Claude absolutely produces better
| code, only being beaten by ChatGPT because it can fetch data
| from the web to RAG enhance its knowledge of things like APIs.
|
| Claude's implementation of artifacts is very good though, and
| I'm sure that is what led OpenAI to push out their buggy
| canvas feature.
| tanelpoder wrote:
| Are there any good 3rd-party native frontend apps for Claude
| (on MacOS)? I mean something like ChatGPT's app, not an
| editor. I guess one option would be to just run Claude iPad
| app on MacOS.
| mike_hearn wrote:
| You can use https://recurse.chat/ if you have an Apple
| silicon Mac.
| greenavocado wrote:
| Open-WebUI doesn't support Claude natively (only through a
| series of hacks) but it is absolutely "THE" go-to for a
| ChatGPT Pro like experience (it is slightly better).
|
| https://github.com/open-webui/open-webui
| TeMPOraL wrote:
| If you're willing to settle for a client-side only web
| frontend (i.e. talks directly with APIs of the models you
| use), TypingMind would work. It's paid, but it's good (see
| [0]), and I guess you could always go for the self-hosted
| version and wrap it in an Electron app - it's what most
| "native" apps are these days anyway (and LLM frontends in
| particular).
|
| --
|
| [0] - https://news.ycombinator.com/item?id=41988306
| Liquix wrote:
| Jan [0] is MacOS native, open source, similar feel to the
| ChatGPT frontend, very polished, and offers Anthropic
| integration (all Claude models).
|
| It also features one-click installation, OpenAI
| integration, a hub for downloading and running local
| models, a spec-compatible API server, global "quick answer"
| shortcut, and more. Really can't recommend it enough!
|
| [0] https://github.com/janhq/jan
| jawon wrote:
| I like msty.app. Parallel prompting across multiple
| commercial and local models plus branching dialogs. Doesn't
| do artifacts, etc, though.
| octohub wrote:
| Msty [0] is a really good app - you can use both local or
| online models and has web search, attachments, RAG, split
| chats, etc., built-in.
|
| [0] https://msty.app
| benreesman wrote:
| It's all a dice game with these things, you have to watch
| them closely or they start running you (with bad outcomes).
| Disclaimers aside:
|
| Sonnet is better in the small, by a lot. It's sharply up from
| idk, three months ago or something when it was still an
| attractive nuisance. It still tops out at "Best SO Answer",
| but it hits that like 90%+. If it involves more than copy
| paste, sorry folks, it's still just really fucking good copy
| paste.
|
| But for sheer "doesn't stutter every interaction at the worst
| moment"? You've got to hand it to the ops people: 4o can give
| you second best in industrial quantity on demand. I'm finding
| that if AI is good enough, then OpenAI is good enough.
| logicchains wrote:
| >If it involves more than copy paste, sorry folks, it's
| still just really fucking good copy paste.
|
| Are you sure you're using Claude 3.5 Sonnet? In my
| experience it's absolutely capable of writing entire small
| applications based off a detailed spec I give it, which
| don't exist on GitHub or Stack Overflow. It makes some
| mistakes, especially for underspecified things, but
| generally it can fix them with further prompting.
| benreesman wrote:
| I'm quite sure what model revision their API quotes,
| though serious users rapidly discover that like any
| distributed system, it has a rhythm to it.
|
| And I'm not sure we disagree.
|
| Vercel demo but Pets _is_ copy paste.
| benreesman wrote:
| We have entered the era of generic fashionable CRUD
| framework demo Too Cheap To Hawk.
| ben_w wrote:
| FWIW, I was able to get a decent way into making my own
| client for ChatGPT by asking the free 3.5 version to do JS
| for me* before it was made redundant by the real app, so this
| shouldn't be too hard if you want a specific
| experience/workflow?
|
| * I'm iOS by experience; my main professional JS experience
| was something like a year before jQuery came out, so I kinda
| need an LLM to catch me up for anything HTML
|
| Also, I wanted HTML rather than native for this.
| mattwad wrote:
| Have you tried using Cursor with Claude embedded? I can't go
| back to anything else, it's very nice having the AI embedded
| in the IDE and it just knows all the files i am working with.
| Cursor can use GPT-4o too if you want
| TeMPOraL wrote:
| > _ChatGPT as an app is just so far ahead: code interpreter,
| web search /fetch, fluid voice interaction, Custom GPTs,
| image generation, and memory. It isn't close._
|
| Funny thing, TypingMind was ahead of them for over a year,
| implementing those features on top of the API, without trying
| to mix business model with engineering[0]. It's only recently
| that ChatGPT webapp got more polished and streamlined, but
| TypingMind's been giving you all those features for _every_
| LLM that can handle it. So, if you're looking for ChatGPT-
| level frontend to Anthropic models, this is it.
|
| ChatGPT shines on mobile[1] and I still keep my subscription
| for that reason. On desktop, I stick to TypingMind and being
| able to run the same plugins on GPT-4o and Claude 3.5 Sonnet,
| and if I need a new tool, I can make myself one in five
| minutes with passing knowledge of JavaScript[2]; no need to
| subscribe to some Gee Pee Tee.
|
| Now, I know I sound like a shill, I'm not. I'm just a
| satisfied user with no affiliation to the app or the guy that
| made it. It's just that TypingMind did the _bloodingly stupid
| obvious_ thing to do with the API and tool support (even
| before the latter was released), and continues to do the
| _obvious things_ with it, and I'm completely confused as to
| why others don't, or why people find "GPTs" novel. They're
| not. They're a simple idea, wrapped in tons of marketing
| bullshit that makes it less useful and delayed its release by
| half a year.
|
| --
|
| [0] - "GPTs", seriously. That's not a feature, that's just
| system prompt and model config, put in an opaque box and
| distributed on a _marketplace_ for no good reason.
|
| [1] - Voice story has been better for a while, but that's a
| matter of integration - OpenAI putting together their own LLM
| and (unreleased) voice model in a mobile app, in a manner
| hardly possible with the API their offered, vs. TypingMind
| being a webapp that uses third party TTS and STT models via
| "bring your own API key" approach.
|
| [2] - I made https://docs.typingmind.com/plugins/plugins-
| examples#db32cc6... long before you could do that stuff with
| ChatGPT app. It's literally as easy as it can possibly be:
| https://git.sr.ht/~temporal/typingmind-plugins/tree. In
| particular, this one is more representative -
| https://git.sr.ht/~temporal/typingmind-
| plugins/tree/master/i... - PlantUML one is also less than 10
| lines of code, but on top of 1.5k lines of DEFLATE
| implementation in JS I plain copy-pasted from the interwebz
| because I cannot into JS modules.
| coryfklein wrote:
| > But OpenAI provides a much more capable app in ChatGPT and
| an easier development platform
|
| Which app are you talking about here?
| szundi wrote:
| Switch to Cursor with Claude backend and 5x immediately
| mtkd wrote:
| One service is not really enough -- you need a few to
| triangulate more often than not, especially when it comes to
| code using latest versions of public APIs
|
| Phind is useful as you can switch between them -- but only get
| a handful of o1 and Opus uses a day, which I burn through
| quickly at the moment on deeper things -- Phind-405b and 3.5
| Sonnet are decent
| for general use
| GraemeMeyer wrote:
| Non-paywall alternative: GitHub Copilot will support models from
| Anthropic, Google, and OpenAI -
| https://www.theverge.com/2024/10/29/24282544/github-copilot-...
| candiddevmike wrote:
| Wake me up when they support self hosted llama or openwebui.
|
| Wonder if we'll ever see a standard LLM API.
| internetter wrote:
| > Wonder if we'll ever see a standard LLM API.
|
| At this point its just the OpenAI API
| hshshshshsh wrote:
| Isn't there an open source alternative? Like a plugin or
| something.
| SirMaster wrote:
| Not for visual studio 2022 unfortunately.
| int_19h wrote:
| There are several plugins for VS 2022 that offer Copilot-
| like UI on top of a local Llama model, although I can't
| speak for their quality.
| SirMaster wrote:
| Hmm, I wonder why I didn't seem to find any.
| rihegher wrote:
| for VScode you can use https://github.com/twinnydotdev/twinny
| Tiberium wrote:
| cursor.ai lets you use any OpenAI-compatible endpoint, although
| not all features work. And continue.dev does too, iirc.
| thecopy wrote:
| Great news! This can only mean better suggestions.
|
| I expected little from Copilot, but now I find it indispensable.
| It is such a productivity multiplier.
| otteromkram wrote:
| I removed it from windows and I'm still very productive.
| Probably more so, since I don't have to make constant
| corrections.
|
| To each their own.
| Tiberium wrote:
| GitHub Copilot and Microsoft Copilot are different products
| doublerabbit wrote:
| Same difference. They both are glorified liberians.
| TimeBearingDown wrote:
| Liberians seem quite useful, then! I've never been to
| Africa myself.
| timeon wrote:
| Just don't lick your fingers.
| ipaddr wrote:
| Their branding is confusing
| hbn wrote:
| Is Microsoft Copilot even a single product? It seems to me
| they just shove AI in random places throughout their
| products and call it Copilot. Which would make Github
| Copilot essentially another one of these places the
| branding shows up (even if it started there)
| thenobsta wrote:
| I wonder what the rationale for this was internally. More OpenAI
| issues? Competitiveness with Cursor? It seems good for the user
| to increase competition across LLM providers.
|
| Also, the title is ambiguous. I thought GitHub canceled deals
| they had in the works. The article is clearly about making a
| deal, but that's unclear from the article's title.
| cma wrote:
| Could be a fight against Llama, which excludes MS and Google in
| its open license (though I think has done separate pay deals
| with one or both of them). Meta are notably absent from this
| announcement.
| szundi wrote:
| Try to fight the free good-enough, haha. At least that's the
| plan of Meta, which benefits more from using this than from
| selling it.
| aimazon wrote:
| "cuts" has to be the worse word choice in this context, it sounds
| like they're terminating deals rather than creating them.
| breck wrote:
| "inks"
| Jerrrrrrry wrote:
| is there a slim chance at a title change?
|
| or a fat chance?
| scinadier wrote:
| Common english lexicon should cut ties with the phrase "cut a
| deal"
| justsocrateasin wrote:
| Yeah I agree, could be confusing to non native speakers
| though. It's a weird idiom.
| mattlondon wrote:
| Came here to say that - my reaction was initially "I didn't
| know they even had those deals to cut them!"
| jddj wrote:
| Sensible.
|
| Big part of competitors' (eg. Aider, Cursor, I imagine also
| jetbrains) advantage was not being tied to one model as the
| landscape changed.
|
| After large MS OpenAI investment they could just as easily have
| put blinders on and doubled down.
| kyawzazaw wrote:
| Jetbrains is doing its own LLM
| a_wild_dandan wrote:
| Cursor is too! Mixing and matching specialized & flagship
| models is the way forward.
| yanis_t wrote:
| Isn't using big models like gpt-4o going to slow down the
| autocomplete?
| HyprMusic wrote:
| I think they mean for the chat and code editing features.
| 7thpower wrote:
| I wonder if this is an example of the freedom of being an arms
| length subsidiary or foreshadowing to a broader strategy shift
| within Microsoft.
| neevans wrote:
| Actually excited 2M context window will be useful in this case
| mansoor_ wrote:
| I wonder how this will affect latency.
| JimDabell wrote:
| Anthropic's article: https://www.anthropic.com/news/github-
| copilot
|
| GitHub's article: https://github.blog/news-insights/product-
| news/bringing-deve...
|
| Google Cloud's article:
| https://cloud.google.com/blog/products/ai-machine-learning/g...
|
| Weird that it wasn't published on the official Gemini news site
| here: https://blog.google/products/gemini/
|
| Edit: GitHub Copilot is now also available in Xcode:
| https://github.blog/changelog/2024-10-29-github-copilot-code...
|
| Discussion here: https://news.ycombinator.com/item?id=41987404
| vault wrote:
| I wonder if behind the choice of calling the human user "mona"
| there's an Italian XD
|
| https://i.imgur.com/z01xgfl.png
| throwup238 wrote:
| It's Mona Lisa the Octocat: https://github.com/monatheoctocat
| lelandfe wrote:
| Hah, TIL. https://cameronmcefee.com/work/the-octocat/
| patates wrote:
| Google Cloud's article is from tomorrow?
|
| https://cloud.google.com/blog/products/ai-machine-learning/g...
|
| https://i.postimg.cc/RVWSfpvs/grafik.png
| cortesoft wrote:
| It says the 29th now
| JimDabell wrote:
| It's October 30th in several parts of the world already. It's
| after midnight everywhere GMT+7 onwards.
| patates wrote:
| Obviously! However, Google being an American company, that
| was surprising. I'm in Europe and am used to seeing newest
| posts "from yesterday" when they are from the USA. This one
| is weird.
| ninininino wrote:
| The threat of anti-trust creates a win for consumers, this is an
| example of why we need a strong FTC.
| hedora wrote:
| This is a standard "commoditize your complement" play. It's in
| GitHub / Microsoft's best interest to make sure none of the
| LLMs become dominant.
|
| As long as that happens, their competitors light money on fire
| to build the model while GitHub continues to build / defend its
| monopoly position.
|
| Also, given that there are already multiple companies building
| decent models, it's a pretty safe bet Microsoft could build
| their own in a year or two if the market starts settling on one
| that's a strategic threat.
|
| See also: "embrace, extend, extinguish" from the 1990's
| Microsoft antitrust days.
| gdiamos wrote:
| Github was an early OpenAI design partner. OpenAI developed a
| custom LLM for them.
|
| It's so interesting that even after that early mover advantage
| they have to go back to the foundation model providers.
|
| Does this mean that future tech companies have no choice but to
| do this?
| dartos wrote:
| I see no reason why GitHub wouldn't use fine tuned models from
| google or anthropic.
|
| I think their version of gpt-3.5 was a fine tune as well. I
| doubt they had a whole model from scratch made just for them.
| a_wild_dandan wrote:
| Yes, because transfer learning works. A specialized model for X
| will be subsumed by a general model for X/Y/Z as it becomes
| better at Y/Z. This is why models which learn other languages
| become better at English.
|
| Custom models still have use cases, e.g. situations requiring
| cheaper or faster inference. But ultimately The Bitter Lesson
| holds -- your specialized thing will always be overtaken by
| throwing more compute at a general thing. We'll be following
| around foundation models for the foreseeable future, with
| distilled offshoots bubbling up/dying along the way.
| kingkongjaffa wrote:
| > This is why models which learn other languages become
| better at English.
|
| Do you have a source for that, I'd love to learn more!
| a_wild_dandan wrote:
| _Evaluating cross-lingual transfer learning approaches in
| multilingual conversational agent models_ [1]
|
| _Cross-lingual transfer learning for multilingual voice
| agents_ [2]
|
| _Large Language Models Are Cross-Lingual Knowledge-Free
| Reasoners_ [3]
|
| _An Empirical Study of Cross-Lingual Transfer Learning in
| Programming Languages_ [4]
|
| That should get you started on transfer learning re.
| languages, but you'll have more fun personally picking
| interesting papers over reading a random yahoo's choices.
| The fire hose of papers is nuts, so you'll never be left
| wanting.
|
| [1] https://www.amazon.science/publications/evaluating-
| cross-lin...
|
| [2] https://www.amazon.science/blog/cross-lingual-transfer-
| learn...
|
| [3] https://arxiv.org/pdf/2406.16655v1
|
| [4] https://arxiv.org/pdf/2310.16937v2
| gk1 wrote:
| It may not be a model quality issue. It may be that GitHub
| wants to sell a lot more of Copilot, including to companies who
| refuse to use anything from OpenAI. Now GitHub can say "Oh
| that's fine, we have these two other lovely providers to choose
| from."
|
| Also, after Anthropic and Google sold massive amounts of pre-
| paid usage credits to companies, those companies want to draw
| down that usage and get their money's worth. GitHub might allow
| them to do that through Copilot, and therefore get their
| business.
| manquer wrote:
| I think that the credit scenario is more true for OpenAI than
| the others. Existing Azure commits can be used to buy OpenAI
| via the marketplace. It will never be as simple for any
| non-Azure partner (only GitHub is tying up with Anthropic
| here, not Azure).
|
| GitHub doesn't even support using those Azure-managed APIs
| for Copilot today; it is just a license you can buy currently
| and add to a user license. The best you can do is pay for
| Copilot with existing Azure commits.
|
| This seems to be about not being left behind as other models
| outpace what Copilot can do with its custom OpenAI model,
| which doesn't seem to be getting updated.
| rnmaker wrote:
| If you want to destroy open source completely, the more models
| the better. Microsoft's co-opting and infiltration of OSS
| projects will serve as a textbook example of eliminating
| competition in MBA programs.
|
| And people still support it by uploading to GitHub.
| dartos wrote:
| > And people still support it by uploading to GitHub.
|
| It's slowly, but noticeably moving from GitHub to other sites.
|
| The network effect is hard to work against.
| fhdsgbbcaA wrote:
| Migration is on my todo list, but it's non trivial enough I'm
| not sure when I'll ever have cycles to even figure out the
| best option. Gitlab? Self-hosted Git? Go back to SVN? A
| totally different platform?
|
| Truth be told, Git is a major pain in the ass anyway and I'm
| very open to something else.
| kubanczyk wrote:
| A classic case of perfect being the enemy of the good. The
| answers are Gitlab and jj, cheers.
| amelius wrote:
| > If you want to destroy open source completely
|
| The irony is of course that open source is what they used to
| train their models with.
| guerrilla wrote:
| That was the point. They are laundering IP. It's the long way
| around the GPL, allowing them to steal.
| ianeigorndua wrote:
| How many OSS repositories do I personally have to read
| through for my own code to be considered stolen property?
|
| That line of thought would get thrown out of court faster
| than an AI would generate it.
| poincaredisk wrote:
| I assume you're not an AI model, but a real human being
| (I hope). The analogy "AI == human" just... doesn't work,
| really.
| ianeigorndua wrote:
| That's beside the point.
|
| Me teaching my brain someone's way of syntactically
| expressing procedures is analogous to AI developers
| teaching their model that same mode of expression.
| guerrilla wrote:
| It's not your reading that would be illegal, but your
| copying. This is well a documented area of the law and
| there are concrete answers to your questions.
| ianeigorndua wrote:
| Are you saying that if I see a nice programming pattern
| in someone else's code, I am not allowed to use that
| pattern in my code?
| candiddevmike wrote:
| Can I copy you or provide you as a service?
|
| To me, the argument is a LLM learning from GPL stuff ==
| creating a derivative of the GPL code, just "compressed"
| within the LLM. The LLM then goes on to create more
| derivatives, or it's being distributed (with the embedded
| GPL code).
| 0x457 wrote:
| Yes, I provide it as a service to my employer. It's
| called a job. Guess what? When I read code I learn from
| it and my brain doesn't care what license that code is
| under.
| ianeigorndua wrote:
| That's what my employers keep asking.
| timeon wrote:
| This seems a bit nihilistic. You can't be automated. You
| can't process repos at scale.
| ianeigorndua wrote:
| Yet.
| atomic128 wrote:
| Yes. Thank you for saying it. We're watching Microsoft et al.
| defeat open source.
|
| Large language models are used to aggregate and interpolate
| intellectual property.
|
| This is performed with no acknowledgement of authorship or
| lineage, with no attribution or citation.
|
| In effect, the intellectual property used to train such models
| becomes anonymous common property.
|
| The social rewards (e.g., credit, respect) that often motivate
| open source work are undermined.
|
| Embrace, extend, extinguish.
| bastardoperator wrote:
| Can you name a company with more OSS projects and
| contributors? Stop with the hyperbole...
| atomic128 wrote:
| Embrace, extend...
| TeMPOraL wrote:
| > _The social rewards (e.g., credit, respect) that often
| motivate open source work are undermined._
|
| You mean people making contributions to solve problems and
| scratch each others' itches got displaced by people seeking
| social status and/or a do-at-your-own-pace accreditation
| outside of formal structures, to show to prospective
| employers? And now that LLMs start letting people solve their
| own coding problems, sidestepping their whole social game,
| the credit seekers complain because large corps did something
| they couldn't possibly have done?
|
| I mean sure, their contributions were a critical piece - _in
| aggregate_ - individually, any single piece of OSS code
| contributes approximately 0 value to LLM training. But
| they're somehow entitled to the reward for a vastly greater value
| someone is providing, just because they _retroactively_ feel
| they contributed.
|
| Or, looking from a different angle: what the complainers are
| saying is, they're sad they can't extract rent now that their
| past work became valuable for reasons they had no part in,
| and if they could turn back time, they'd happily rent-seek
| the shit out of their code, to the point of destroying LLMs
| as a possibility, and denying the world the value LLMs
| provided?
|
| I have little sympathy for that argument. We've been calling
| out "copyright laundering" way before GPT-3 was a thing -
| those who don't like to contribute without capturing all the
| value for themselves should've moved off GitHub years ago.
| It's not like GitHub has any hold over OSS other than plain
| inertia (and the egos in the community - social signalling
| games create a network effect).
| discreteevent wrote:
| >Or, looking from a different angle: what the complainers
| are saying is, they're sad they can't extract rent now that
| their past work became valuable for reasons they had no
| part in, and if they could turn back time, they'd happily
| rent-seek the shit out of their code,
|
| Wrong and completely unfair/bitter accusation. The only
| people rent seeking are the corporations.
|
| What kind of world do you want to live in? The one with
| "social games" or the one with corporate games? The one
| with corporate games seems to have less and less room for
| artists, musicians, language graduates, programmers...
| raegis wrote:
| > individually, any single piece of OSS code contributes
| approximately 0 value to LLM training. But they're somehow
| entitled to the reward for a vastly greater value someone
| is providing, just because they retroactively feel they
| contributed.
|
| You are attributing arguments to people which they never
| made. The most lenient of open source licenses require a
| simple citation, which the "A.I." never provides. Your tone
| comes off as pretty condescending, in my opinion. My
| summary of what you wrote: "I know they violated your
| license, but too bad! You're not as important as you
| think!"
| warkdarrior wrote:
| > This is performed with no acknowledgement of authorship or
| lineage, with no attribution or citation.
|
| GitHub hosts a lot of source code, including presumably the
| code it trained CoPilot on. So they satisfy any license that
| requires sharing the code and license, such as GPL 3. Not
| sure what the problem is.
| whitehexagon wrote:
| I deleted my github 2 weeks ago, as much about AI, as about
| them forcing 2FA. Before AI it was SAAS taking more than they
| were giving. I miss the 'helping each other' feel of these code
| share sites. I wonder where are we heading with all this. All
| competition and no collaboration, no wonder the planet is
| burning.
| pessimizer wrote:
| I don't understand the case being made here at all. AI is
| violating FOSS licenses, I totally agree. But you can write
| more FOSS using AI. It's totally unfair, because these
| companies are not sharing their source, and extracting all of
| the value from FOSS as they can. Fine. But when it comes to OSI
| Open Source, all they usually had to do was include a text file
| somewhere mentioning that they used it in order to do the same
| thing, and when it comes to Free Software, they could just lie
| about stealing it and/or fly under the radar.
|
| Free software needs more user-facing software, and it needs
| people other than coders to drive development (think UI people,
| subject matter specialists, etc.), and AI will help that. While
| I think what the AI companies are doing is tortious, and that
| they either should be stopped from doing it or the entire idea
| of software copyright should be re-examined, I _also_ think
| that AI will be massively beneficial for Free Software.
|
| I also suspect that this could result in a grand bargain in
| some court (which favors the billionaires of course) where the
| AI companies have to pay into a fund of some sort that will be
| used to pay for FOSS to be created and maintained.
|
| Lastly, maybe Free Software developers should start zipping up
| all of the OSI licenses that only require that a license be
| included in the distribution and including that zipfile with
| their software written in collaboration with AI copilots. That
| and your latest GPL for the rest (and for your own code) puts
| you in as safe a place as you could possibly be legally. You'll
| still be hit by all of the "don't do evil"-style FOSS-esque
| licenses out there, but you'll at least be safer than _all_ of
| the proprietary software being written with AI.
|
| I don't know what textbook directs you to eliminate all of your
| competition by lowering your competition's costs, narrowing
| your moat of expertise, and not even owning a piece of that.
|
| edit: that being said, I'm obviously talking about Free
| Software here, and not Open Source. Wasn't Open Source only
| protected by spirits anyway?
| mnau wrote:
| It doesn't matter whether it is uploaded to GitHub or not. They
| would siphon it from GitLab, self hosting or source forge as
| well using crawlers.
| mmiyer wrote:
| Seems to be part of Microsoft's hedging of its OpenAI bet, ever
| since Sam Altman's ousting:
| https://www.nytimes.com/2024/10/17/technology/microsoft-open...
| mk_chan wrote:
| The reason here is Microsoft is trying to make copilot a
| platform. This is the essential step to moving all the power from
| OpenAI to Microsoft. It would grant Microsoft leverage over all
| providers since the customers would depend on Microsoft and not
| OpenAI or Google or Anthropic. Classic platform business
| evolution at play here.
| caesil wrote:
| I think the reason here is that Copilot is very very obviously
| inferior to Cursor, mostly because the model at its core is
| pretty dumb.
| echelon wrote:
| The Copilot team probably thinks of Cursor's efforts as cute.
| They can be a neat little product in their tiny corner of the
| market.
|
| It's far more valuable to be a platform. Maybe Cursor can
| become a platform, but the race is on and they're up against
| giants that are moving rather surprisingly nimbly.
|
| Github does way more, you can build on top of it, and they
| already have a metric ton of business relationships and
| enterprise customers.
| woah wrote:
| A developer will spend far more time in the IDE than the
| version control system so I wouldn't discount it that
| easily. That being said, there are no network effects for
| an IDE and Cursor is basically just a VSCode plugin. Maybe
| Cursor gets a nice acquihire deal
| sangnoir wrote:
| I'm sure there are multiple reasons, including lowering the
| odds of antitrust action by regulators. The EU was already
| sniffing around Microsoft's relationship with OpenAI.
| rogerkirkness wrote:
| Commoditize your complement, baby.
| dfried wrote:
| Anyone doing strategic business with Microsoft would do well to
| remember what they did to Nokia.
| TheRealPomax wrote:
| You mean waste a few billion on buying a company that couldn't
| compete with the market anymore because the iphone made "even
| an idiot should be able to use this thing, and it should be
| able to do pretty much everything" a baseline expectation with
| an OS/software experience to match? Nokia failed Nokia, and
| then Microsoft gave it a shot. And they also couldn't make it
| work.
|
| (sure, that glosses over the whole Elop saga, but Microsoft
| didn't buy a Nokia-in-its-prime and killed it. They bought an
| already failing business and even throwing MS levels of
| resources at it couldn't turn it around)
| muststopmyths wrote:
| Man, as a windows phone mourner the only disagreement i have
| with this comment is that they threw anywhere near MS level
| of resources at Nokia.
|
| Satya never wanted the acquisition and nuked WP as soon as he
| could.
| ahoka wrote:
| I can see why people would think that, but Microsoft did not
| buy Nokia.
| SSLy wrote:
| They did buy the (then) richer half of the company. The
| other half is now trying to get out of the rot.
| xnx wrote:
| Frankly surprised to see GitHub (Microsoft) signing a deal with
| their biggest competitor, Google. It does give Microsoft some
| good terms/pricing leverage over OpenAI, though I'm not sure what
| degree Microsoft needs that given their investment in OpenAI.
|
| GitHub Spark seems like the most interesting part of the
| announcement.
| miyuru wrote:
| On the Anthropic blog it says it uses AWS Bedrock.
|
| > Claude 3.5 Sonnet runs on GitHub Copilot via Amazon Bedrock,
| leveraging Bedrock's cross-region inference to further enhance
| reliability.
|
| https://www.anthropic.com/news/github-copilot
| xyst wrote:
| Got to cut deals before the AI bust pops, VC money and interest
| vanishes and interest rates go up.
|
| Also diversifying is always a good option. Even if one cash cow
| gets nuked from orbit, you have 2 other companies to latch onto
| kingkongjaffa wrote:
| > interest rates go up
|
| This is kind of a cynical tech startup take:
|
| - ragging on VCs
| - calling something a bubble
|
| Interest rates are on their way back down btw.
|
| https://www.federalreserve.gov/newsevents/pressreleases/mone...
|
| https://www.reuters.com/world/uk/bank-england-cut-bank-rate-...
|
| Funding has looked to be running out a few times for OpenAI
| specifically, but most frontier model development is reasonably
| well funded still.
| njtransit wrote:
| If interest rates are on their way down, why has the 10Y
| treasury yield increased 50 points over the last month?
| https://www.cnbc.com/quotes/US10Y
| kortilla wrote:
| Because they previously decreased more under the
| expectation of another half point cut by the fed. Stronger
| economic indicators have cut the expectation for steep rate
| cuts so treasuries are declining.
| warkdarrior wrote:
| It also dropped 40 points over the last six months.
| tqwhite wrote:
| I wish people would stop posting Bloomberg paywall links.
| greenavocado wrote:
| I replaced ChatGPT Plus with hosted
| nvidia/Llama-3.1-Nemotron-70B-Instruct for coding tasks. Nemotron
| produces good code. The cost difference is massive. Nemotron is
| available for $0.35 per Mtoken in and out. ChatGPT is
| considerably more expensive.
| greenavocado wrote:
| Just kidding. Qwen 2.5 Instruct is superior. Nemotron is
| overfit to pass benchmarks.
| shagie wrote:
| Elseweb with GitHub Copilot today...
|
| Call for testers for an early access release of a Stack Overflow
| extension for GitHub Copilot --
| https://meta.stackoverflow.com/q/432029
| rvz wrote:
| You mean "Microsoft" cuts deals with Google and Anthropic on top
| of their already existing deals with Mistral, Inflection whilst
| also having an exclusivity deal with OpenAI?
|
| This is an extend to extinguish round 4 [0], whilst racing
| everyone else to zero.
|
| [0] https://news.ycombinator.com/item?id=41908456
| sprkv5 wrote:
| One of the reasons that comes to my mind: it could have been a
| problematic look for only Microsoft (Copilot) to have access to
| GitHub for training AI models - a la monopolizing a data treasure
| trove. With antitrust action catching up to Google and forcing it
| to open up its Play Store, this could have been one of the key
| reasons why this deal came about.
| poincaredisk wrote:
| Copilot can choke on my AGPL code on GitHub, that was used for
| training their proprietary models. I'm still salty about this,
| sadly looks like the world has largely moved on.
| azemetre wrote:
| It really feels like a digital form of colonialism; they come
| in take everything, completely disregard the rules, ignore
| intellectual copyright laws (while you still have to obey
| them), but when you speak out against this suddenly you are a
| luddite that doesn't care about human progress.
| mnau wrote:
| It's especially distasteful when we consider lawsuits like
| Epic vs Silicon Knights.
| https://en.wikipedia.org/wiki/Silicon_Knights
|
| > Silicon Knights had "deliberately and repeatedly copied
| thousands of lines of Epic Games' copyrighted code, and
| then attempted to conceal its wrongdoing by removing Epic
| Games' copyright notices and by disguising Epic Games'
| copyrighted code as Silicon Knights' own
|
| > Epic Games prevailed against Silicon Knights' lawsuit,
| and won its counter-suit for $4.45 million on grounds of
| copyright infringement,
|
| > following the loss of the court case, Silicon Knights
| filed for bankruptcy
| baq wrote:
| If it doesn't work, oh well, you'll get VC money for
| something else.
|
| If it works, the lawyers will figure it out.
| sprkv5 wrote:
| Yet Google and Anthropic wanted in on the huge data that
| GitHub has to offer. It seems the world has not moved on just
| yet.
| nonfamous wrote:
| The Claude terms of service [1] apparently preclude
| Anthropic or AWS using GitHub user data for training:
|
| GitHub Copilot uses Claude 3.5 Sonnet hosted on Amazon Web
| Services. When using Claude 3.5 Sonnet, prompts and
| metadata are sent to Amazon's Bedrock service, which makes
| the following data commitments: Amazon Bedrock doesn't
| store or log your prompts and completions. Amazon Bedrock
| doesn't use your prompts and completions to train any AWS
| models and doesn't distribute them to third parties.
|
| [1] https://docs.github.com/en/copilot/using-github-
| copilot/usin...
| yieldcrv wrote:
| Seems to be trying to get its lunch money back from CodeGPT
| plugin and similar ones
| kleton wrote:
| A case where "cut" is its own antonym, and it's unclear which
| sense is meant from the headline alone.
| echoangle wrote:
| I just had the same problem and thought there was a deal that
| was ended now.
| jacobgkau wrote:
| Yeah, I was expecting outrage when I first clicked into the
| thread to glance at the comments, and then I was like "wait,
| why are people saying it's exciting?"
| keiferski wrote:
| Don't think I've ever seen the word "cut" used with "deal" in a
| negative sense. Cutting a deal always means you made a deal,
| not that one ended.
| JulianChastain wrote:
| What about "we were cut from the deal"? It seems like you
| could make a phrase in which 'cut' means "to exclude"
| keiferski wrote:
| Doesn't sound natural to me, and I couldn't find any
| examples online using that phrasing to mean someone was
| removed from a deal. You can be cut from a team, though.
| shagie wrote:
| Cutting Deals and Striking Bargains: The History of an Idiom
|
| https://web.archive.org/web/20060920230602/https://www.csub....
|
| By way of "Why do we 'cut' a deal?"
| https://english.stackexchange.com/q/284233
|
| ---
|
| "Cuts " ... leads to the initial parsing of "cuts all ties
| with" or similar "severs relationship with".
|
| With additional modifiers between "cuts" and "deal", the
| "cuts deal with" becomes harder to recognize as the "forms a
| deal with" meaning of the phrase.
| contextfree wrote:
| GitHub sublates AI deals with Google, Anthropic
| phreeza wrote:
| I guess this goes to show, nobody really has a moat in this game
| so far. Everyone is sprinting like crazy but I don't see anyone
| really gaining a sustainable edge that will push out competitors.
| marban wrote:
| In AI, the only real moat is seeing how many strategic
| partnerships you can announce before anyone figures out they're
| all with the same people.
| selimthegrim wrote:
| Claude and Carol and Carol and Carol?
| dgellow wrote:
| I've been using Cody from Sourcegraph to have access to other
| models, if copilot offers something similar I guess I will switch
| back to it. I find copilot autocomplete to be more often on point
| than Cody, but the chat experience with Cody + Sonnet 3.5 is way
| ahead in my experience
| sqs wrote:
| Context is a huge part of the chat experience in Cody, and
| we're working hard to stay ahead there as well with things like
| OpenCtx (https://openctx.org) and more code context based on
| the code graph (defs/refs/etc.). All this competition is good
| for everyone. :)
| sincerecook wrote:
| I replaced chatgpt with mybrain 1.0 and I'm seeing huge
| improvements in accuracy and reasoning performance!
| nforgerit wrote:
| Also energy efficiency significantly improved, no?
| kingkongjaffa wrote:
| If I'm already paying Anthropic can I use this without paying
| github as well?
| mmaunder wrote:
| History has shown being first to market isn't all it's cut out to
| be. You spend more, it's more difficult creating the trail others
| will follow, you end up with a tech stack that was built before
| tools and patterns stabilized and you've created a giant super
| highway for a fast-follower. Anyone remember MapQuest, AltaVista
| or Hotmail?
|
| OpenAI has some very serious competition now. When you combine
| that with the recent destabilizing saga they went through along
| with commoditization of models with services like OpenRouter.ai,
| I'm not sure their future is as bright as their recent valuation
| indicates.
| sebzim4500 wrote:
| Claude is better than OpenAI for most tasks, and yet OpenAI has
| enormously more users.
|
| What is this, if not first mover advantage?
| szundi wrote:
| Claude cannot "research" stuff on the web and provide results
| like 4o does in 5 secs like "which is the cheapest Skoda car
| and how much"
| mmaunder wrote:
| Just wanted to add a note to this. Tool calling -
| particularly to source external current data - is something
| that's had the big foundational LLM providers very nervous
| so they've held back on it, even though it's trivial to
| implement at this point. But we're seeing it rapidly emerge
| with third party providers who use the foundational APIs.
| Holding back tool calling has limited the complex graph-
| like execution flows that the big providers could have
| implemented on their user facing apps e.g. the kind of
| thing that Perplexity Pro has implemented. So they've
| fallen behind a bit. They may catch up. If they don't they
| risk becoming just an API provider.
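|
| To sketch the pattern (chat() and the tool-call shape below
| are placeholders, not any particular provider's API):
|
| import json
|
| def lookup_price(item: str) -> str:
|     """Toy external-data tool; stands in for a live lookup."""
|     prices = {"example item": "42 EUR"}
|     return prices.get(item, "unknown")
|
| TOOLS = {"lookup_price": lookup_price}
|
| def answer(question, chat):
|     """Ask the model, run any tool it requests, loop back."""
|     messages = [{"role": "user", "content": question}]
|     while True:
|         reply = chat(messages)   # placeholder provider call
|         call = reply.get("tool_call")
|         if call is None:
|             return reply["content"]
|         args = json.loads(call["arguments"])
|         result = TOOLS[call["name"]](**args)
|         messages.append({"role": "tool",
|                          "name": call["name"],
|                          "content": result})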
| ethbr1 wrote:
| I'm hoping a lot of the graph-like execution flow engines
| are still in stealth mode, as I believe that's where we'll
| start to see truly useful AI.
|
| Mass data parsing and reformatting is useful... but
| building agents that span existing APIs / tools is a lot
| more exciting to me.
|
| I.e. IFTTT, with automatic tool discovery, parameter
| mapping, and output parsing handled via LLM
| sitkack wrote:
| This is what I use phind for.
| mmaunder wrote:
| Yes, muscle memory is powerful. But it's not an
| insurmountable barrier for a follower. The switch from Google
| to various AI apps like Perplexity being a case in point. I
| still find myself beginning to reach for Google and then 0.1
| seconds later catching myself. As a side note: I'm also
| catching myself having a lack of imagination when it comes to
| what is solvable. e.g. I had a specific technical question
| about github's UX and how to get to a thing that no one would
| have written about and thus Google wouldn't know, but openAI
| chat nailed it first try.
| hbn wrote:
| Most people's first exposure to LLMs was ChatGPT, and that
| was only what - like 18 months ago it really took off in the
| mainstream? We're still very early on in the grand scheme of
| things.
| dmix wrote:
| Yes it's silly to talk about first mover advantage in sub 3
| years. Maybe in 2026 we can revisit this question and see
| if being the first mattered.
|
| First mover being a general myth doesn't mean being the
| first to launch and then immediately dominating the wider
| market for a long period is impossible. It just usually
| means their advantage was about a lot more than simply
| being first.
| jedberg wrote:
| Claude requires a login, ChatGPT does not.
| nabla9 wrote:
| It's a short lived first mover advantage.
| sigmoid10 wrote:
| Claude is only better in some cherry picked standard eval
| benchmarks, which are becoming more useless every month due
| to the likelihood of these tests leaking into training data.
| If you look at the Chatbot Arena rankings where actual users
| blindly select the best answer from a random choice of
| models, the top 3 models are all from OpenAI. And the next
| best ones are from Google and X.
| trzy wrote:
| Bullshit. Claude 3.5 Sonnet owns the competition according
| to the most useful benchmark: operating a robot body in the
| real world. No other model comes close.
| Matticus_Rex wrote:
| This seems incorrect. I don't need Claude 3.5 Sonnet to
| operate a robot body for me, and don't know anyone else
| who does. And general-purpose robotics is not going to be
| the most efficient way to have robots do many tasks ever,
| and certainly not in the short term.
| trzy wrote:
| Of course not but the task requires excellent image
| understanding, large context window, a mix of structured
| and unstructured output, high level and spatial
| reasoning, and a conversational layer on top.
|
| I find it's predictive of relative performance in other
| tasks I use LLMs for. Claude is the best. The only
| shortcoming is its peculiar verbosity.
|
| Definitely superior to anything OpenAI has and miles
| beyond the "open weights" alternatives like Llama.
| int_19h wrote:
| The problem is that it also fails on fairly simple logic
| puzzles that ChatGPT can do just fine.
|
| For example, even the new 3.5 Sonnet can't solve this
| reliably:
|
| > Doom Slayer needs to teleport from Phobos to Deimos. He
| has his pet bunny, his pet cacodemon, and a UAC scientist
| who tagged along. The Doom Slayer can only teleport with
| one of them at a time. But if he leaves the bunny and the
| cacodemon together alone, the bunny will eat the
| cacodemon. And if he leaves the cacodemon and the
| scientist alone, the cacodemon will eat the scientist.
| How should the Doom Slayer get himself and all his
| companions safely to Deimos?
|
| In fact, not only is its solution wrong, but it can't
| figure out _why_ it's wrong on its own if you ask it to
| self-check.
|
| In contrast, GPT-4o always consistently gives the correct
| response.
| BobaFloutist wrote:
| Yeah, but Mistral brews a mean cup of tea, and Llama's
| easily the best at playing hopscotch.
| gr3ml1n wrote:
| 3.5 Sonnet, ime, is dramatically better at coding than 4o.
| o1-preview may be better, but it's too slow.
| amanzi wrote:
| I don't pay any attention to leaderboards. I pay for both
| Claude and ChatGPT and use them both daily for anything
| from Python coding to the most random questions I can think
| of. In my experience Claude is better (much better) than
| ChatGPT in almost all use cases. Where ChatGPT shines is
| the voice assistant - it still feels almost magical having
| a "human-like" conversation with the AI agent.
| rogerkirkness wrote:
| Claude 3.5 Sonnet (New) is meaningfully better than ChatGPT
| GPT4o or o1.
| drcode wrote:
| my experience is that o1 is still slightly better for
| coding, sonnet new is better for analyzing data, and most
| other tasks besides coding
| scarmig wrote:
| I'm subscribed to all of Claude, Gemini, and ChatGPT.
| Benchmarks aside, my go-to is always Claude. Subjectively
| speaking, it consistently gives better results than
| anything else out there. The only reason I keep the other
| subscriptions is to check in on them occasionally to see if
| they've improved.
| Cu3PO42 wrote:
| Anecdotally, I disagree. Since the release of the "new" 3.5
| Sonnet, it has given me consistently better results than
| Copilot based on GPT-4o.
|
| I've been using LLMs as my rubber duck when I get stuck
| debugging something and have exhausted my standard avenues.
| GPT-4o tends to give me very general advice that I have
| almost always already tried or considered, while Claude is
| happy to say "this snippet looks potentially incorrect;
| please verify XYZ" and it has gotten me back on track in
| maybe 4/5 cases.
| ipaddr wrote:
| Claude is more restricted and can't generate images.
| SV_BubbleTime wrote:
| I asked Claude a physics question about bullet trajectory
| and it refused to answer. Restricted too far imo.
| metalliqaz wrote:
| couldn't you s/bullet/ball/ ? or s/bullet/arrow/ ?
| gkbrk wrote:
| You could, but you could also use a model that's not
| restricted so much that it cannot do simple tasks.
| SV_BubbleTime wrote:
| Exactly.
|
| I ended up asking about a half-pound ball I would throw
| with a 3600 rpm spin where the acceleration phase was 4 ms.
|
| It had no issue with that, but it was stupid.
| ronnier wrote:
| I think "Claude" is also a bad name. If I knew nothing else,
| am I picking OpenAI or Claude based on the name? I'm going
| with OpenAI
| block_dagger wrote:
| Claude is a product name, OpenAI is a company name. You
| really think Claude is better than ChatGPT?
| ronnier wrote:
| The name ChatGPT is better than the name Claude, to me.
| Of course this is all subjective though.
| setsewerd wrote:
| This brings up the broader question: why are AI companies
| so bad at naming their products?
|
| All the OpenAI model names look like garbled nonsense to
| the layperson, while Anthropic is a bit of a mixed bag
| too. I'm not sure what image Claude is supposed to
| conjure, Sonnet is a nice name if it's packaged as a
| creative writing tool but less so for developers. Meta AI
| is at least to the point, though not particularly
| interesting as far as names go.
|
| Gemini is kind of cool sounding, aiming for the
| associations of playful/curious of that zodiac sign. And
| the Gemini models are about as unreliable as astrology is
| for practical use, so I guess that name makes the most
| sense.
| jmcmaster wrote:
| Asking Americans to read a French name that is a homonym
| for "clod" may not be the best mass market decision.
| 0x457 wrote:
| Plot twist: regular users don't care what model
| underneath is called or how it works.
| HarHarVeryFunny wrote:
| They seem to be going after different markets, or at least
| having differing degrees of success in going after different
| markets.
|
| OpenAI is most successful with consumer chat app (ChatGPT)
| market.
|
| Anthropic is most successful with business API market.
|
| OpenAI currently has a lot more revenue than Anthropic, but
| it's mostly from ChatGPT. For API use the revenue numbers of
| both companies are roughly the same. API success seems more
| important than chat apps since this will scale with the success
| of the user's business, and this is really where the dream of
| an explosion in AI profits comes from.
|
| ChatGPT's user base size vs that of Claude's app may be first
| mover advantage, or just brand recognition. I use Claude
| (both web based and iOS app), but still couldn't tell you if
| the chat product even has a name distinct from the model.
| How's that for poor branding?! OpenAI have put a lot of
| effort into the "her" voice interface, while Anthropic's app
| improvements are more business orientated in terms of
| artifacts (which OpenAI have now copied) and now code
| execution.
| azemetre wrote:
| Honestly I think the biggest reason for this is that Claude
| requires you to login via an email link whereas OpenAI will
| let you just login with any credentials.
|
| This matters if you have a corporate machine and can't access
| your personal email to login.
| LeoPanthera wrote:
| Given that Hotmail is now Outlook.com, maybe that's a bad
| example.
| holografix wrote:
| Can we change the title to "GitHub _signs_ deals with Google,
| Anthropic" ?
|
| The original got me thinking it already had deals it was getting
| out of
| eddd-ddde wrote:
| I agree, very weird choice of words.
| kelnos wrote:
| To "cut a deal" is a common (American?) English idiom meaning
| to "make a deal".
|
| But agree that it's better to avoid using idioms on a site that
| has many visitors for whom English is not their first language.
| archgoon wrote:
| Do you mean that Bloomberg should have used a different title
| or Hacker News should have modified the title?
| pxeger1 wrote:
| I think Bloomberg's at fault: "cut a deal" isn't usually
| that ambiguous because it's clear which state transition is
| more likely. But here it's plausible they could've been
| ending some existing training-data-sharing agreement, or
| that they were making a new different deal. Also the fact
| it's pluralised here makes it different enough to the most
| common form for it to be a bit harder to notice the idiom.
| But since we can't change the fact they used that title, I
| would like HN to change it now.
| lofaszvanitt wrote:
| Thank you people for contributing to this free software
| ecosystem. Oh, you can't monetize your work? Your problem, not
| ours! Deals are made, but you, who provide your free code, we
| have zero monetization options for you on our github platform. Go
| pay for copilot which was trained on your data.
|
| I mean, this is the worst farce ever concocted. And people are
| oblivious to what's happening...
| mnau wrote:
| We are not oblivious. We are powerless. Oracle could go toe to
| toe with Google and threaten multibillion fines over basically
| API and 11kLOC. As an open source developer, there is no way
| to match that.
| Fairburn wrote:
| I have no doubts that Claude is serviceable from a coder's
| perspective. But for me, as a paid user, I became tired of
| being told that I have to slow down and then be cut off while
| actively working on a product. When Anthropic addresses this,
| I'll add it back to my tools.
| wg0 wrote:
| This only makes Copilot more competitive and cost-effective.
| Microsoft's business managers are smart.
| hi41 wrote:
| That's a strange usage of the word "cuts". I thought GitHub
| terminated the deals with Google and Anthropic. It would be
| better if the title were GitHub signs AI deals instead of cuts.
| r00fus wrote:
| https://plainenglish.com/expressions/cut-a-
| deal/#:~:text=Tod....
| gregschlom wrote:
| I'm assuming you're not a native speaker? (I'm not) - "to cut a
| deal" is a fairly common idiom that means to reach and
| agreement.
| naniwaduni wrote:
| As an aside, "closing" and "concluding" a deal or sale also
| usually mean to successfully reach an agreement. It's more of
| a semantic quirk around deals than an isolated idiom.
| hi41 wrote:
| That's correct. Not a native speaker. I am not well versed
| with slang words. I am sometimes embarrassed because I speak
| as if they are words from a book instead of sounding like
| spoken words. Do you know how "cut" came to mean making a
| deal? To a non-native speaker it suggests the exact opposite,
| as in "he cut a wire". Language evolves in strange ways.
| samatman wrote:
| "Cut a deal" is an idiom, not slang: it's appropriate
| language to use in a business context, for example.
|
| The origin is hazy, of the theories I've seen I consider
| this the best one: "deal" means both "an agreement" and "to
| distribute cards in a card game". The dealer, in the latter
| sense, first cuts the card deck then deals the cards. "Cut
| and deal" -> "cut a deal".
|
| It could also be related to "cut a check", which comes from
| an era before perforated paper was widespread, when one
| would literally cut the check out of a book of checks.
| hi41 wrote:
| Thanks much for the explanation.
| epolanski wrote:
| Yet another confirmation that AI models are nothing but
| commodities.
|
| There's no moat, none.
|
| I'm really curious how any company building models can hope
| to see a meaningful return on their billion-dollar
| investments, when a few people leaving and getting enough
| Azure credits can create a competitor in a few months.
| thih9 wrote:
| I use cursor and its tab completion; while what it can do is mind
| blowing, in practice I'm not noticing a productivity boost.
|
| I find that ai can help significantly with doing plumbing, but it
| has no problems with connecting the pipes wrong. I need to double
| and triple check the updated code - or fix the resulting errors
| when I don't do that. So: boilerplate and outer app layers, yes;
| architecture and core libraries, no.
|
| Curious, is that a property of all ai assisted tools for now? Or
| would copilot, perhaps with its new models, offer a different
| experience?
| MuffinFlavored wrote:
| > in practice I'm not noticing a productivity boost.
|
| How can this be possible if you literally admit its tab
| completion is mindblowing?
|
| Isn't really good tab completion good enough for at least a 5%
| productivity boost? 10%? 20%?
|
| Select line of code, prompt it to refactor, verify they are
| good, accept the changes
| m3kw9 wrote:
| In my experience, I always have to spend time checking, and
| most times it doesn't do what I need unless it's a very
| simple ask.
| thih9 wrote:
| > How can this be possible if you literally admit its tab
| completion is mindblowing?
|
| What about it makes it impossible? I'm impressed by what AI
| assistants can do - and in practice it doesn't help me
| personally.
|
| > Select line of code, prompt it to refactor, verify they are
| good, accept the changes.
|
| It's the "verify" part that I find tricky. Do it too fast and
| you spend more time debugging than you originally gained. Do
| it too slow and you don't gain much time.
|
| There is a whole category of bugs that I'm unlikely to write
| myself but I'm likely to overlook when reading code. Mixing
| up variable types, mixing up variables with similar names,
| misusing functions I'm unfamiliar with and more.
| beepbooptheory wrote:
| I think the essential point around impressive vs helpful
| sums up so much of the discourse around this stuff. Its all
| just where you fall on the line between "impressive is
| necessarily good" and "no it isn't".
| throttlebody wrote:
| How does AI learn from its mistakes? Genuine question, as
| I have only briefly used ChatGPT and found it interesting
| but not useful.
| bigstrat2003 wrote:
| > How can this be possible if you literally admit its tab
| completion is mindblowing?
|
| If I had a knife of perfect sharpness which never dulled,
| that would be mind-blowing. It also would very likely not
| make me a better cook.
| kortilla wrote:
| If someone can eat 20 golf balls that's impressive but it
| doesn't improve my golf game
| bradford wrote:
| > How can this be possible if you literally admit its tab
| completion is mindblowing?
|
| I might suggest that coding doesn't take as much of our time
| as we might think it does.
|
| Hypothetically:
|
| Suppose coding takes 20% of your total clock time. If you
| improve your coding efficiency by 10%, you've only improved
| your total job efficiency by 2%. This is great, but probably
| not the mind-blowing gain that's hyped by the AI boom.
|
| (I used 20% as a sample here, but it's not far away from my
| anecdotal experience, where so much of my time is spent in
| spec gathering, communication, meeting security/compliance
| standards, etc).
| MangoCoffee wrote:
| Time will tell. As a GitHub Copilot user, I still review the
| code.
|
| SpaceX's advancements are impressive, from rocket blow up to
| successfully catching the Starship booster.
|
| Who knows what AI will be capable of in 5-10 years? Perhaps it
| will revolutionize code assistance or even replace developers
| outworlder wrote:
| > SpaceX's advancements are impressive, from rocket blow up
| to successfully catching the Starship booster.
|
| That felt like it was LLM generated since that doesn't have
| anything to do with the subject being discussed. Not only is
| it in a different industry, but it's a completely different
| set of problems. We know what's involved in catching a
| rocket. It's a massive engineering challenge yes, but we all
| know it can be done (whether or not it makes sense or is
| economically viable are different issues).
|
| Even going to the Moon - which was a massive project and took
| massive focus from an entire country to do - was a matter of
| developing the equipment, procedures, calculations (and yes,
| some software). We knew back then it could be done, and
| roughly how.
|
| Artificial intelligence? We don't know enough about
| "intelligence". There isn't even a target to reach right now.
| If we said "resources aren't a problem, let's build AI",
| there isn't a single person on this planet that can tell you
| how to build such an AI or even which technologies need to be
| developed.
|
| More to the point, current LLMs are able to probabilistically
| generate data based on prompts. That's pretty much it. They
| don't "know" anything about what they are generating, they
| can't reason about it. In order for "AI" to replace
| developers entirely, we need other big advancements in the
| field, which may or may not come.
| dspillett wrote:
| _> Artificial intelligence? We don 't know enough about
| "intelligence"._
|
| The problem I have with this objection is that it, like
| many discussions, conflates LLMs (glorified predictive
| text) and other technologies currently being referred to as
| AI, with AGI.
|
| Most of these technologies should still be called machine
| learning as they aren't really doing anything intelligent
| in the sense of general intelligence. As you say yourself:
| they don't know anything. And by inference, they aren't
| _reasoning_ about anything.
|
| Boilerplate code for common problems, and some not so
| common ones, which is what LLMs are getting pretty OK at
| and might in the coming years be very good at, _is_ a
| definable problem that we understand quite well. And much
| as we like to think of ourselves as "computer scientists",
| the vast majority of what we do boils down to boilerplate
| code using common primitives, that are remarkably similar
| across many problem domains that might on first look appear
| to be quite different, because many of the same primitives
| and compound structures are used. The bits that require
| actual intelligence are often quite small (this is how _I_
| survive as a dev!), or are away from the development
| coalface (for instance: discovering and defining the
| problems before we can solve them, or describing the
| problem & solution such that someone or an "AI" can do the
| legwork).
|
| _> we need other big advancements in the field, which may
| or may not come._
|
| I'm waiting for an LLM being guided to create a better LLM,
| and eventually down that chain a real AGI popping into
| existence, much like the infinite improbability drive being
| created by clever use of a late version finite
| improbability generator. This is (hopefully) many years (in
| fact I'm hoping for at least a couple of decades so I can
| be safely retired or nearly there!) from happening, but it
| feels like such things are just over the next deep valley
| of disillusionment.
| olivermuty wrote:
| Except cursor is the fireworks based on black powder here. It
| will look good, but as a technology to get you to the moon it
| seems to look like a dead end. NOTHING (of serious science)
| seems to indicate LLMs being anything but a dead end with the
| current hardware capabilities.
|
| So then I ask: What, in qualitative terms, makes you think AI
| in the current form will be capable of this in 5 or 10 years?
| Other than seeing the middle of what seems to be an S-curve
| and going <<ooooh shiny exponential!>>
| TeMPOraL wrote:
| > _NOTHING (of serious science) seems to indicate LLMs
| being anything but a dead end with the current hardware
| capabilities._
|
| In the same sense that black powder sucks as a rocket
| propellant - but it's enough to demonstrate that iterating
| on the same architecture and using better fuels _will_ get
| you to the Moon eventually. LLMs of today are starting
| points, and many ideas for architectural improvements are
| being explored, and nothing in serious science suggests
| _that_ will be a dead end any time soon.
| zeroonetwothree wrote:
| It's easy to say with hindsight but if all you have is
| black powder I don't think it's obvious those better
| fuels even exist.
| dbmikus wrote:
| If you look at LLM performance on benchmarks, they keep
| getting better at a fast rate.[1]
|
| We also now have models of various sizes trained in general
| matters, and those can now be tuned or fine-tuned to
| specific domains. The advances in multi-modal AI are also
| happening very quickly as well. Model specialization, model
| reflection (chain of thought, OpenAI's new O1 model, etc.)
| are also undergoing rapid experimentation.
|
| Two demonstrable things that LLMs don't do well currently,
| are (1) generalize quickly to out-of-distribution examples,
| (2) catch logic mistakes in questions that look very
| similar to training data, but are modified. This video
| talks about both of these things.[2]
|
| I think I-JEPA is a pretty interesting line of work towards
| solving these problems. I also think that multi-modal AI
| pushes in a similar direction. We need AI to learn
| abstractions that are more decoupled from the source
| format, and we need AI that can reflect and modify its
| plans and update itself in real time.
|
| All these lines of research and development are more-or-
| less underway. I think 5-10 years is reasonable for another
| big advancement in AI capability. We've shown that applying
| data at scale to simple models works, and now we can
| experiment with other representations of that data (ie
| other models or ways to combine LLM inferences).
|
| [1]: https://www.anthropic.com/news/3-5-models-and-
| computer-use [2]:
| https://www.youtube.com/watch?v=s7_NlkBwdj8
| imafish wrote:
| I rarely use the tab completion. Instead I use the chat and
| manually select files I know should be in context. I am barely
| writing any code myself anymore.
|
| Just sanity checking that the output and "piping" is correct.
|
| My productivity (in frontend work at least) is significantly
| higher than before.
| big_jimmer wrote:
| Out of curiosity, how long have you been working as a
| developer? Just that, in my experience, this is mostly true
| for juniors and mids (depending on the company, language,
| product etc. etc.). For example, I often find that copilot
| will hallucinate tailwind classes that don't exist in our
| design system library, or make simple logical errors when
| building charts (sometimes incorrect ranges, rarely
| hallucinated fields) and as soon as I start bringing in 3rd
| party services or poorly named legacy APIs all hope is lost
| and I'm better off going it alone with an LSP and a prayer.
| SparkyMcUnicorn wrote:
| I haven't used Cursor, but I use Aider with Sonnet 3.5 and also
| use Copilot for "autocomplete".
|
| I'd highly recommend reading through Aider's docs[0], because I
| think it's relevant for any AI tool you use. A lot of people
| harp on prompting, and while a good prompt is important I often
| see developers making other mistakes like not providing context
| that's good, correct, or even too much[1].
|
| When I find models are going on the wrong path with something,
| or "connecting the pipes wrong", I often add code comments that
| provide additional clarity. Not only does this help future
| me/devs, but the more I steer AI towards correct results, the
| fewer problems models seem to have going forward.
|
| Everybody seems to be having wildly different experiences using
| AI for coding assistance, but I've personally found it to be a
| big productivity boost.
|
| [0] https://aider.chat/docs/usage/tips.html
|
| [1] https://aider.chat/docs/troubleshooting/edit-
| errors.html#red...
| realce wrote:
| Totally agree that heavy commenting is the best convention
| for helping the assistant help you best. I try to comment in
| a way that makes a file or function into a "story" or kind of
| a single narrative.
| jascha_eng wrote:
| That's super interesting, I've been removing a lot of the
| redundant comments from the AI results. But adding new more
| explanatory ones that make it easier for both AI and humans
| to understand the code base makes a lot of sense in my
| head.
|
| I was big on writing code to be easy to read for humans,
| but it being easy to read for AI hasn't been a large
| concern of mine.
| bob1029 wrote:
| It's the subtle errors that are really difficult to navigate. I
| got burned for about 40 hours on a conditional being backward
| in the middle of an otherwise flawless method.
|
| The apparent speed up is mostly a deception. It definitely
| helps with rough outlines and approaches. But, the faster you
| go, the less you will notice the fine details, and the more
| assumptions you will accumulate before realizing the
| fundamental error.
|
| I'd rather find out I was wrong within the same day. I'd
| probably have written some unit tests and played around with
| that function a lot more if I had handcrafted it.
| enneff wrote:
| That's the thing, isn't it? The craft of programming in the
| small is one of being intimate with the details, thinking
| things through conscientiously. LLMs don't do that.
| Nevermark wrote:
| Perhaps it should be prompted to then?
|
| Ask it to review its own code for any problems?
|
| Also identify typical and corner cases and generate tests?
|
| Question marks here because I have not used the tool.
|
| The size & depth of each accepted code step is still up to
| the developer slash prompter
| nrclark wrote:
| I use Chatgpt for coding / API questions pretty
| frequently. It's bad at writing code with any kind of
| non-trivial design complexity.
|
| There have been a bunch of times where I've asked it to
| write me a snippet of code, and it cheerfully gave me
| back something that doesn't work for one reason or
| another. Hallucinated methods are common. Then I ask it
| to check its code, and it'll find the error and give me
| back code with a different error. I'll repeat the process
| a few times before it eventually gets back to code that
| resembles its first attempt. Then I'll give up and write
| it myself.
|
| As an example of a task that it failed to do: I asked it
| to write me an example Python function that runs a
| subprocess, prints its stdout transparently (so that I
| can use it for running interactive applications), but
| also records the process's stdout so that I can use it
| later. I wanted something that used non-blocking I/O
| methods, so that I didn't have to explicitly poll every N
| milliseconds or something.
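|
| For reference, a hand-rolled sketch of that task (assuming
| POSIX, where the selectors module works on pipes; names are
| illustrative):
|
| import selectors
| import subprocess
| import sys
|
| def run_and_record(cmd):
|     """Run cmd, echo its stdout live, return it all captured."""
|     proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
|                             bufsize=0)
|     sel = selectors.DefaultSelector()
|     sel.register(proc.stdout, selectors.EVENT_READ)
|     chunks = []
|     while True:
|         for key, _ in sel.select(timeout=0.1):
|             data = key.fileobj.read(4096)
|             if data:
|                 sys.stdout.buffer.write(data)  # pass through
|                 sys.stdout.buffer.flush()
|                 chunks.append(data)
|         if proc.poll() is not None:
|             rest = proc.stdout.read()  # drain what's left
|             if rest:
|                 sys.stdout.buffer.write(rest)
|                 chunks.append(rest)
|             break
|     sel.close()
|     return b"".join(chunks)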
| bongodongobob wrote:
| Honestly I find that when GPT starts to lose the plot
| it's a good time to refactor and then keep on moving.
| "Break this into separate headers or modules and give me
| some YAML like markup with function names, return type,
| etc for each file." Or just use stubs instead of dumping
| every line of code in.
| tomrod wrote:
| How long are you willing to iterate to get things right?
| bongodongobob wrote:
| If it takes almost no cognitive energy, quite a while.
| Even if it's a little slower than what I can do, I don't
| care because I didn't have to focus deeply on it and have
| plenty of energy left to keep on pushing.
| EVa5I7bHFq9mnYK wrote:
| That's presumably what o1-preview does? Iterates and
| checks the result. It takes much longer, but does indeed
| write slightly better code.
| __MatrixMan__ wrote:
| I find that it depends very heavily on what you're up to.
| When I ask it to write nix code it'll just flat out forget
| how the syntax works halfway through. But if I want it to
| troubleshoot an emacs config or wield matplotlib it's
| downright wizardly, often including the kind of thing that
| does indicate an intimacy with the details. I get
| distracted because I'm then asking it:
|
| > I un-did your change which made no sense to me and now
| everything is broken, why is what you did necessary?
|
| I think we just have to ask ourselves what we want it to be
| good at, and then be diligent about generating decades
| worth of high quality training material in that domain. At
| some point, it'll start getting the details right.
| tanseydavid wrote:
| >> The apparent speed up is mostly a deception.
|
| When I am able to ask a very simple question of an LLM, which
| then prevents me from having to context-switch to answer the
| same simple question myself, it is a big time saver for me,
| though hard to quantify.
|
| Anything that reduces my cognitive load when the pressure is
| on is a blessing on some level.
| oogetyboogety wrote:
| This might be the measurable "some" non deceptive time
| saving, whereas most of it is still deceptive in terms of
| time saved
| tensor wrote:
| Except actual studies objectively show efficiency gains,
| more with junior devs, which makes sense. So no, it's not
| a "deception" but it is often overstated in popular
| media.
| zeroonetwothree wrote:
| Studies have limitations, in particular they test
| artificial and narrowly-scoped problems that are quite
| different from real world work.
| rqmedes wrote:
| I find the opposite: the more senior you are, the more value
| they offer, as you know how to ask the right questions, how
| to vary the questions and try different tacks, and how to
| observe errors or mistakes.
| 0xFACEFEED wrote:
| You could make the same argument for any non-AI driven
| productivity tool/technique. If we can't trust the user
| to determine what is and is not time-saving then time-
| saving isn't a useful thing to discuss outside of an
| academic setting.
|
| My issue with most AI discussions is they seem to
| completely change the dimensions we use to evaluate basic
| things. I believe if we replaced "AI" with "new useful
| tool" then people would be much more eager to adopt it.
|
| What clicked for me is when I started treating it more
| like a tool and less like some sort of nebulous pandora's
| box.
|
| Now to me it's no different than auto completing code,
| fuzzy finding files, regular expressions, garbage
| collection, unit testing, UI frameworks, design patterns,
| etc. It's just a tool. It has weaknesses and it has
| strengths. Use it for the strengths and account for the
| weaknesses.
|
| Like any tool it can be destructive in the hands of an
| inexperienced person or a person who's asking it to do
| too much. But in the hands of someone who knows what
| they're doing and knows what they want out of it - it's
| so freakin' awesome.
|
| Sorry for the digression. All that to say that if someone
| believes it's a productivity boost for them then I don't
| think they're being misled.
| bongodongobob wrote:
| Cognitive load is something people always leave out. I can
| fuckin code drunk with these things. Or just increase
| stamina to push farther than I would writing every single
| line.
| tensor wrote:
| Why aren't you writing unit tests just because AI wrote the
| function? Unit tests should be written regardless of the
| skill of the developer. Ironically, unit tests are also one
| area where AI really does help move faster.
|
| High level design, rough outlines and approaches, is the
| worst place to use AI. The other place AI is pretty good is
| surfacing api call or function calls you might not know about
| if you're new to the language. Basically, it can save you a
| lot of time by avoiding the need for tons of internet
| searching in some cases.
| chairhairair wrote:
| I have completely the opposite perspective.
|
| Unit tests actually need to be correct, down to individual
| characters. Same goes with API calls. The API needs to
| actually exist.
|
| Contrast that with "high level design, rough outlines".
| Those can be quite vague and hand-wavy. That's where these
| fuzzy LLMs shine.
|
| That said, these LLM-based systems are great at writing
| "change detection" unit tests that offer ~zero value (or
| negative).
| Aeolun wrote:
| > That said, these LLM-based systems are great at writing
| "change detection" unit tests that offer ~zero value (or
| negative).
|
| That's not at all true in my experience. With minimal
| guidance they put out pretty sensible tests.
| pawelduda wrote:
| Exactly, 1 step forward, 1 step backward. Avoiding edge cases
| is something that can't be glossed over, and for that I need
| to carefully review the code. Since I'm accountable for it,
| and can't skip this part anyway, I'd rather review my own
| than some chatbot's.
| knallfrosch wrote:
| I use it for an unfamiliar programming language and it's very
| nice. You can also ask it to explain badly documented code.
| weitendorf wrote:
| I'm building a tool in this space and believe it's actually
| multiple separate problems. From most to least solvable:
|
| 1. AI coding tools benefit a lot from explicit
| instructions/specifications and context for how their output
| will be used. This is actually a very similar problem to when
| eg someone asks a programmer "build me a website to do X" and
| then being unhappy with the result because they actually wanted
| to do "something like X", and a payments portal, and yellow
| buttons, and to host it on their existing website. So models
| need to be given those particular instructions somehow (there
| are many ways to do it, I think my approach is one of the best
| so far) and context (eg RAG via find-references, other files in
| your codebase, etc)
|
| 2. AI makes coding errors, bad assumptions, and mistakes just
| like humans. It's rather difficult to implement auto-correction
| in a good way, and goes beyond mere code-writing into "agentic"
| territory. This is also what I'm working on.
|
| 3. AI tools don't have architecture/software/system design
| knowledge appropriately represented in their training data and
| all the other techniques used to refine the model before
| releasing it. More accurately, they might have _knowledge_ in
| the form of eg all the blog posts and docs out there about it,
| but not _skill_. Actually, there is some improvement here,
| because I think o1 and 3.5 sonnet are doing some kind of
| reinforcement-learning /self-training to get better at this.
| But it's not easily addressable on your end.
|
| 4. There is ultimately a ton of context cached in your brain
| that you cannot realistically share with the AI model, either
| because it's not written anywhere or there is just too much of
| it. For example, you may want to structure your code in a
| certain way because your next feature will extend it or use it.
| Or your product is hosted on serving platform Y which has an
| implementation detail where it tries automatically setting
| Content-Type response headers by appending them to existing
| headers, so manually setting Content-Type in the response
| causes bugs on certain clients. You can't magically stuff all
| of this into the model context.
|
| My product tries to address all of these to varying extents.
| The largest gains in coding come from making it easier to
| specify requirements and self-correct, but architecture/design
| are much harder and not something we're working on much. You or
| anybody else can feel free to email me if you're interested in
| meeting for a product demo/feedback session - so far people
| really like our approach to setting output specs.
| fullstackwife wrote:
| One of the reasons for that may be the price: large code
| changes with multi turn conversation can eat up a lot of
| tokens, while those tools charge you a flat price per month.
| Probably many hacks are done under the hood to keep *their*
| costs low, and the user experiences this as lower quality
| responses.
|
| Still the "architecture and core libraries" is rather corner
| case, something at the bottom of their current sales funnel.
|
| also: do you really want to get the equivalent of 1 FTE of
| work for 20 USD per month? :)
| tomrod wrote:
| I'd love an autoselected LLM that is fine-tuned to the syntax
| I'm actively using -- Cursor has a bit of a head start, but
| where Github and others can take it could be mindblowing
| (Cursor's moat is a decent VS Code extension -- I'm not sure
| it's a deep moat though).
| bmitc wrote:
| That's my exact experience with GitHub Copilot. It even sucks
| at boilerplate stuff. I have no idea why its
| autocomplete is so bad when it has access to my code, the
| function signatures, types, etc. It gets stuff wrong all the
| time. For example, it will just flat out suggest functions
| that exist neither in the Python core libraries nor in my own
| modules. It doesn't make sense.
|
| I have all but given up on using Copilot for code development.
| I still do use it for autocomplete and boilerplate stuff, but I
| still have to review that. So there's still quite a bit of
| overhead, as it introduces subtle errors, especially in
| languages like Python. Beyond that, its failure rate at
| producing running, correct code is basically 100%.
| miki123211 wrote:
| For now, I mostly use AI as a "faster typist".
|
| If it wants to complete what I wanted to type anyway, or
| something extremely similar, I just press tab, otherwise I type
| my own code.
|
| I'd say about 70% of individual lines are obvious enough if you
| have the surrounding context that this works pretty well in
| practice. This number is somewhat lower in normal code and
| higher in unit tests.
|
| Another use case is writing one-off scripts that aren't
| connected to any codebase in particular. If you're doing a lot
| of work with data, this comes in very handy.
|
| Something like "here's the header of a CSV file", pass each row
| through model x, only pass these three fields, the model will
| give you annotations, put these back in the csv and save, show
| progress, save every n rows in case of crashes, when the output
| file exists, skip already processed rows."
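|
| A sketch of the kind of script that prompt yields (annotate()
| below stands in for the model call; everything else is
| stdlib):
|
| import csv
| import os
|
| def annotate_csv(src, dst, fields, annotate, every=50):
|     """Resumable row-by-row annotation of a CSV file."""
|     done = 0
|     if os.path.exists(dst):
|         with open(dst, newline="") as f:
|             done = sum(1 for _ in csv.DictReader(f))
|     fin = open(src, newline="")
|     fout = open(dst, "a", newline="")
|     reader = csv.DictReader(fin)
|     cols = reader.fieldnames + ["annotation"]
|     writer = csv.DictWriter(fout, fieldnames=cols)
|     if done == 0:
|         writer.writeheader()
|     for i, row in enumerate(reader):
|         if i < done:
|             continue  # already handled in an earlier run
|         row["annotation"] = annotate({k: row[k] for k in fields})
|         writer.writerow(row)
|         if (i + 1) % every == 0:
|             fout.flush()  # checkpoint against crashes
|             print(f"{i + 1} rows done")
|     fin.close()
|     fout.close()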
|
| I'm not (yet) convinced by AI writing entire features, I tried
| that a few times and it was very inconsistent with the
| surrounding codebase. Managing which parts of the codebase to
| put in its context is definitely an art though.
|
| It's worth keeping in mind that this is the worst AI we'll ever
| have, so this will probably get better soon.
| zeroonetwothree wrote:
| One off scripts do work very well.
| blitzar wrote:
| Reminds me of how I use the satnav when driving.
|
| I don't close my eyes and do whatever it tells me to do. If I
| think I know better I don't "turn right at the next set of
| lights" I just drive on as I would have before GPS and
| eventually realise that I went the wrong way or the satnav
| realises there was a perfectly valid 2nd/3rd/4th path to get
| to where I wanted to go.
| pnathan wrote:
| In general I do not find AI a net positive. Other tools seem to
| do at least as well in general.
|
| it can be used if you want the reliability of a random forum
| poster. which... sure. knock yourself out. sometimes there's
| gems in that dirt.
|
| I'm getting _very_ bearish on using LLMs for things that aren't
| pattern recognition.
| ianbutler wrote:
| I'm actually very curious why AI use is such a bi-modal
| experience. I've used AI to move multi thousand line codebases
| between languages. I've created new apps from scratch with it.
|
| My theory is the willingness to baby sit and the modality. I'm
| perfectly fine telling the tool I use its errors and working
| side by side with it like it was another person. At the end of
| the day it can belt out lines of code faster than I, or any
| human, can and I can review code very quickly so the overall
| productivity boost has been great.
|
| It does fundamentally alter my workflow. I'm very hands off
| keyboard when I'm working with AI in a way that is much more
| like working with someone or coaching someone to make something
| instead of doing the making myself. Which I'm fine with but
| recognize many developers aren't.
|
| I use AI autocomplete 0% of the time as I found that workflow
| was not as effective as me just writing code, but most of my
| most successful work using AI is a chat dialogue where I'm
| letting it build large swaths of the project a file or parts of
| a file at a time, with me reviewing and coaching.
| __float wrote:
| I'm not sure how many people are like me, but my attempts to
| use Copilot have largely been the context of writing code as
| usual, occasionally getting end-of-line or handful-of-lines
| completions from it. I suspect there's probably a bigger
| shift needed, but I haven't seen anyone (besides AI
| "influencers" I don't trust..?) showing what their day-to-day
| workflows look like.
|
| Is there a Vimcasts equivalent for learning the AI editor
| tips and tricks?
| sbarre wrote:
| Have you tried the chat mode?
|
| The autocomplete is somewhere between annoying and
| underwhelming for me, but the chat is super useful. Being
| able to just describe what you're thinking or what you're
| trying to do and having a bespoke code sample just show up
| (based on the code in your editor) that you can then either
| copy/paste in, cherry-pick from or just get inspired by,
| has been a great productivity booster..
|
| Treat it like a pair programmer or a rubber duck and you
| might have a better experience. I did!
| zeroonetwothree wrote:
| I guess for me it actually takes longer to review code than
| to write it. So maybe that's some of the difference.
| 0xFACEFEED wrote:
| As a programmer of over 20 years - this is terrifying.
|
| I'm willing to accept that I just have "get off my lawn"
| syndrome or something.
|
| But the idea of letting an LLM write/move large swaths of
| code seems so incredibly irresponsible. Whenever I sit down
| to write some code, be it a large implementation or a small
| function, I think about what other people (or future versions
| of myself) will struggle with when interacting with the code.
| Is it clear and concise? Is it too clever? Is it too easy to
| write a subtle bug when making changes? Have I made it
| totally clear that X is relying on Y dangerous behavior by
| adding a comment or intentionally making it visible in some
| other way?
|
| It goes the other way too. If I know someone well (or their
| style) then it makes evaluating their code easier. The more
| time I spend in a codebase the better idea I have of what the
| writer was trying to do. I remember spending a lot of time
| reading the early Redis codebase and got a pretty good sense
| of how Salvatore thinks. Or altering my approaches to code
| reviews depending on which coworker was submitting it. These
| weren't things I was doing out of desire but because all
| non-trivial code has so much subtlety; it's just the nature
| of the beast.
|
| So the thought of opening up a codebase that was cobbled
| together by an AI is just scary to me. Subtle bugs and errors
| would be equally distributed across the whole thing instead
| of where the writer was less competent (as is often the
| case). The whole thing just sounds like a gargantuan mess.
|
| Change my mind.
| bongodongobob wrote:
| You. Can. Write. Tests.
| ok_dad wrote:
| Tests haven't saved us so far, humans have been writing
| tests that passed for software with bugs for decades.
| lanternfish wrote:
| Tests aren't a full solution for all the considerations
| of the above post.
| blitzar wrote:
| Just let the LLM do that too.
| the_real_cher wrote:
| Even better you can let the AI write tests.
| hakunin wrote:
| How do you write a test for code clarity / readability /
| maintainability?
| 0xFACEFEED wrote:
| How do tests account for cases where I'm looking at a 100
| line function that could have easily been written in 20
| lines with just as much, if not more, clarity?
|
| It reminds me of a time (long ago) when the trend/fad was
| building applications visually. You would drag and drop
| UI elements and define logic using GUIs. Behind the
| scenes the IDE would generate code that linked everything
| together. One of the selling points was that underneath
| the hood it's just code so if someone didn't have access
| to the IDE (or whatever) then they could just open the
| source and make edits themselves.
|
| It obviously didn't work out. But not because of the
| scope/scale (something AI code generation solves) but
| because, it turns out, writing maintainable secure
| software takes a lot of careful thought.
|
| I'm not talking about asking an AI to vomit out a CRUD
| UI. For that I'm sure it's well suited and the risk is
| pretty low. But as soon as you introduce domain specific
| logic or non-trivial things connected to the real world -
| it requires thought. Often times you need to spend more
| time thinking about the problem than writing the code.
|
| I just don't see how "guidance" of an LLM gets anywhere
| near writing good software outside of trivial stuff.
| sbarre wrote:
| I think it depends on the stakes of what you're building.
|
| A lot of the concerns you describe make me think you work
| in a larger company or team and so both the organizational
| stakes (maintenance, future changes, tech debt, other
| people taking it over) and the functional stakes (bug free,
| performant, secure, etc) are high?
|
| If the person you're responding to is cranking out a
| personal SaaS project or something they won't ever want to
| maintain much, then they can do different math on risks.
|
| And probably also the language you're using, and the actual
| code itself.
|
| Porting a multi-thousand line web SaaS product in
| Typescript that's just CRUD operations and cranking out web
| views? Sure why not.
|
| Porting a multi-thousand line game codebase that's
| performance-critical and written in C++? Probably not.
|
| That said, I am super fascinated by the approach of "let
| the LLM write the code and coach it when it gets it wrong"
| and I feel like I want to try that.. But probably not on a
| work project, and maybe just on a personal project.
| ianbutler wrote:
| I have 10 years professional experience and I've been
| writing code for 20 years, really with this workflow I just
| read and review significantly more code and I coach it when
| it structures or styles something in a way I don't like.
|
| I'm fully in control and nothing gets committed I haven't
| read its an extension of me at that point.
| geysersam wrote:
| I'll take a stab at changing your mind.
|
| AIs are not able to write Redis. That's not their job. AIs
| should not write complex high performance code that
| millions of users rely on. If the code does something
| valuable for a large number of people you can afford humans
| to write it.
|
| AIs should write low value code that just repeats what's
| been done before but with some variations. Generic parts of
| CRUD apps, some fraction of typical frontends, common CI
| setups. That's what they're good at because they've seen it
| a million times already. That category constitutes most
| code written.
|
| This relieves human developers of ballpark 20% of their
| workload and that's already worth a lot of money.
| rqmedes wrote:
| I agree. I am in a very senior role and find that, working
| with AI the same way you do, I am many times more productive.
| Months of work become days or even hours of work.
| bongodongobob wrote:
| My theory is grammatical correctness and specificity. I see a
| lot of people prompt like this:
|
| "use python to write me a prog that does some dice rolls and
| makes a graph"
|
| Vs
|
| "Create a Python program that generates random numbers to
| simulate a series of dice rolls. Export a graph of the
| results in PNG format."
|
| Information theory requires that you provide enough actual
| information; there is a minimum amount of work needed to
| supply the input. Otherwise, the gaps get filled in with
| noise, which may or may not work or be what you want.
|
| For example, maybe someday you could say "write me an OS" and
| it would work. However, to get exactly what you want, you
| still have to specify it. You can only compress so far.
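|
| For reference, the more specific dice prompt above pins the
| output down to roughly this sketch (matplotlib assumed):
|
| import random
| from collections import Counter
| import matplotlib.pyplot as plt
|
| def roll(n=10_000, sides=6):
|     """Tally n simulated rolls of a die with `sides` faces."""
|     return Counter(random.randint(1, sides) for _ in range(n))
|
| counts = roll()
| faces = sorted(counts)
| plt.bar(faces, [counts[f] for f in faces])
| plt.xlabel("Face")
| plt.ylabel("Rolls")
| plt.title("Simulated dice rolls")
| plt.savefig("dice_rolls.png")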
| Aeolun wrote:
| > in practice I'm not noticing a productivity boost
|
| I am. Can suddenly do in a weekend what would have taken a
| week.
| Yodel0914 wrote:
| I find chatgpt incredibly useful for writing scripts against
| well-known APIs, or for a "better stackoverflow". Things like
| "how do I use a cursor in sql" or "in a devops yaml pipeline, I
| want to trigger another pipeline. How do I do that?".
|
| But working on our actual codebase with copilot in the IDE
| (Rider, in my case) is a net negative. It usually does OK when
| it's suggesting the completion of a single line, but when it
| decides to generate a whole block it invariably misunderstands
| the point of the code. I could imagine that getting better if I
| wrote more descriptive method names or comments, but the killer
| for me is that it just makes up methods and method signatures,
| even for objects that are part of publicly documented
| frameworks/APIs.
| fifteen1506 wrote:
| Paywall; can't read legally.
| ed_elliott_asc wrote:
| I am excited about this as I use Claude for coding but what I
| really like about copilot is if you have a list of something
| random like:
|
| /* Col1 varchar not null, Col2 int null, Col3 int not null */
|
| Then start doing something else like:
|
| | column | type |
| |---|---|
| | Col1 | varchar |
|
| Then copilot is very good at guessing the rest of the table.
|
| (This isn't just sql to markdown it works whenever you want to
| repeat something using parts of another list somewhere in the
| same doc)
|
| I hope they continue this, as it has been a game changer for
| me; it is so quick, really great.
| johnyzee wrote:
| > _"The size of the Lego blocks that Copilot on AI can generate
| has grown [...] It certainly cannot write a whole GitHub or a
| whole Facebook, but the size of the building blocks will
| increase"_
|
| Um, that would make it _less_ capable, not more... /thatguy
| delduca wrote:
| I don't like using AI assistants in my editor; I prefer to keep
| it as clean as possible. So, I manually copy relevant parts of
| the code into ChatGPT, ask my question, and continue interacting
| until I get what I need. It's a bit manual, but since I use GPT
| for other tasks, it's convenient to have a single interface for
| everything.
| qubitly wrote:
| So GitHub's teaming up with Google, Anthropic, and OpenAI?
| Kinda feels like Microsoft's version of a 'safety net', but
| for whom exactly?
| It's hard not to wonder if this is actually about choice for the
| user or just insurance for Microsoft
| suyash wrote:
| Does this mean you need to be a paying user for Claude and Gemini
| or just with GitHub copilot?
| richardw wrote:
| This kind of thing is why I think Sam is often misjudged. You
| can't fuck around in such a competitive market. If you go in all
| kumbaya you'll get crushed by market forces. It's rare for
| company/founder ideals to survive the market indefinitely. I
| think he's iterated fast and the job is still very hard.
| chucke1992 wrote:
| This Github purchase was incredible
| moondistance wrote:
| Microsoft is negotiating equity in OpenAI as part of the switch
| to a for-profit. Non-zero chance this is a negotiation flex.
| pavelboyko wrote:
| I mentored junior SWE and CS students for years, and now using
| Claude as a coding assistant feels very similar. Yesterday, it
| suggested implementing a JSON parser from scratch in C to avoid a
| dependency -- and, unsurprisingly, the code didn't work. Two main
| differences stand out: 1) the LLM doesn't learn from corrections
| (at least not directly), and 2) the feedback loop is seconds
| instead of days. This speed is so convenient that it makes hiring
| junior SWEs seem almost pointless, though I sometimes wonder
| where we'll find mid-level and senior developers tomorrow if we
| stop hiring juniors today.
| hypeatei wrote:
| Years of experience doesn't correlate to a good developer
| either. I've seen senior devs using AI to solve impossible
| problems, for example asking it how to store an API key client
| side without leaking it...
| al_borland wrote:
| Does speed matter when it's not getting better and learning
| from corrections? I think I'd rather give someone a problem and
| have them come back with something that works in a couple days
| (answering a question here or there), rather than spend my time
| doing it myself because I'm getting fast, but wrong, results
| that aren't improving from the AI.
|
| > though I sometimes wonder where we'll find mid-level and
| senior developers tomorrow if we stop hiring juniors today.
|
| This is also a key point. There is a lot of short-term
| thinking these days, since people don't stick with companies
| like they used to. But as a person who has been with my
| company for close to 20 years, making sure things can still
| run once you leave is important from a business perspective.
|
| Training isn't about today, it's about tomorrow. I've trained a
| lot of people, and doing it myself would always be faster in
| the moment. But it's about making the team better and making
| sure more people have more skill, to reduce single points of
| failure and ensure business continuity over the long-term. Not
| all of it pays off, but when it does, it pays off big.
| xpe wrote:
| Award for most ambiguous headline. ("cuts" can mean "initiates"
| or "terminates"!)
| zeroonetwothree wrote:
| "Cut a deal" is a standard phrase with only one meaning
| nvader wrote:
| But when you make it "cut AI deal", that breaks the standard
| phrase and opens the door to alternative explanations. I
| initially thought this was a news article about the deal
| breaking up.
| AntiqueFig wrote:
| I thought the same indeed.
| dankwizard wrote:
| A word can have multiple meanings.
| corobo wrote:
| Yes but cuts deals, which is what the title says, is
| ambiguous.
|
| Those are some load bearing pluralisations right there.
| fabmilo wrote:
| I also thought that they were ending some previous deal and not
| creating a new one.
| lifeisstillgood wrote:
| I still think it's worth emphasising - LLMs represent a massive
| capital absorber. Taking gobs of funding into your company is how
| you grow, how your options become more valuable, how your
| employees stay with you. If that treadmill were to break bad
| things happen.
|
| Search has been stuttering for a while - Google's growth and
| investment has been flattening - at some point they absorbed
| all the world's stored information.
|
| OpenAI showed the new growth - we need billions of dollars to
| build and then run the LLMs (at a loss, one assumes) - the
| treadmill can keep going
___________________________________________________________________
(page generated 2024-10-29 23:00 UTC)