[HN Gopher] Google CEO says more than a quarter of the company's...
___________________________________________________________________
Google CEO says more than a quarter of the company's new code is
created by AI
Author : S0y
Score : 192 points
Date : 2024-10-30 02:09 UTC (20 hours ago)
(HTM) web link (www.businessinsider.com)
(TXT) w3m dump (www.businessinsider.com)
| S0y wrote:
| https://archive.is/X43PU
| ausbah wrote:
| I would be way more impressed if LLMs could do code compression.
| More code == more things that can break, and when LLMs can
| generate boatloads of it with a click you can imagine what might
| happen.
| Scene_Cast2 wrote:
| This actually sparked an idea for me. Could code complexity be
| measured as cumulative entropy, computed by running LLM token
| predictions on a codebase? Notably, verbose boilerplate would
| be pretty low entropy, and straightforward code should be
| decently low as well.
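|
| A rough sketch of what that measurement could look like
| (illustrative only; the model choice and the exact Hugging Face
| API usage here are assumptions, not a vetted tool):
|
|     # Score a file by the total "surprisal" (bits) an LLM assigns
|     # to its tokens; lower totals = more predictable code.
|     import math
|     import torch
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|
|     def surprisal_bits(path: str, model_name: str = "gpt2") -> float:
|         tok = AutoTokenizer.from_pretrained(model_name)
|         model = AutoModelForCausalLM.from_pretrained(model_name)
|         ids = tok(open(path).read(), return_tensors="pt",
|                   truncation=True, max_length=1024).input_ids
|         with torch.no_grad():
|             # labels=input_ids gives mean cross-entropy in nats/token
|             loss = model(ids, labels=ids).loss.item()
|         return loss * (ids.shape[1] - 1) / math.log(2)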
| jeffparsons wrote:
| Not quite, I think. Some kinds of redundancy are good, and
| some are bad. Good redundancy tends to reduce mistakes rather
| than introduce them. E.g. there's lots of redundancy in
| natural languages, and it helps resolve ambiguity and fill in
| blanks or corruption if you didn't hear something properly.
| Similarly, a lot of "entropy" in code could be reduced by
| shortening names, deleting types, etc., but all those things
| were helping to clarify intent to other humans, thereby
| reducing mistakes. But some is copy+paste of rules that
| should be enforced in one place. Teaching a computer to
| understand the difference is... hard.
|
| Although, if we were to ignore all this for a second, you
| could also make similar estimates with, e.g., gzip: the
| higher the compression ratio attained, the more
| "verbose"/"fluffy" the code is.
|
| Fun tangent: there are a lot of researchers who believe that
| compression and intelligence are equivalent or at least very
| tightly linked.
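|
| For instance, a minimal version of the gzip estimate (just an
| illustration of the idea, not a calibrated metric) could be:
|
|     import gzip
|
|     def fluffiness(path: str) -> float:
|         raw = open(path, "rb").read()
|         # Higher ratio = more compressible = more repetitive code
|         return len(raw) / len(gzip.compress(raw))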
| 8note wrote:
| Interpreting this comment, it would predict low complexity
| for code copied unnecessarily.
|
| I'm not sure though. If it's copied a bunch of times, and
| it actually doesn't matter because each use case of the
| copying is linearly independent, does it matter that it was
| copied?
|
| Over time, you'd still see copies being changed independently
| show up as increased entropy.
| malfist wrote:
| Code complexity can already be measured deterministically
| with cyclomatic complexity. No need to throw AI fuzzy logic
| at this, especially when LLMs are bad at math.
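|
| As a reference point, a stripped-down approximation (counting
| branch points with Python's ast module; dedicated tools such as
| radon do this more carefully) might look like:
|
|     import ast
|
|     def cyclomatic_complexity(source: str) -> int:
|         tree = ast.parse(source)
|         branch_nodes = (ast.If, ast.For, ast.While,
|                         ast.ExceptHandler, ast.BoolOp, ast.IfExp)
|         # Complexity is roughly 1 + number of decision points
|         return 1 + sum(isinstance(node, branch_nodes)
|                        for node in ast.walk(tree))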
| contravariant wrote:
| There's nothing fuzzy about letting an LLM determine the
| probability of a particular piece of text.
|
| In fact it's the one thing they are explicitly designed to
| do, the rest is more or less a side-effect.
| david-gpu wrote:
| _> Could code complexity be measured as cumulative entropy as
| measured by running LLM token predictions on a codebase?
| Notably, verbose boilerplate would be pretty low entropy, and
| straightforward code should be decently low as well._
|
| WinRAR can do that for you quite effectively.
| ks2048 wrote:
| I agree. It seems like counting lines of generated code is like
| counting bytes/instructions of compiled code - who cares? If
| "code" becomes prompts, then AI should lead to much smaller
| code than before.
|
| I'm aware that the difference is that AI-generated code can be
| read and modified by humans. But then quantity is bad, because
| humans have to understand it in order to read or modify it.
| TZubiri wrote:
| What's that line about accounting for lines of code on the
| wrong side of the balance sheet?
| latexr wrote:
| > If "code" becomes prompts, then AI should lead to much
| smaller code than before.
|
| What's the point of shorter code if you can't trust it to do
| what it's supposed to?
|
| I'll take 20 lines of code that do what they should
| consistently over 1 line that may or may not do the task
| depending on the direction of the wind.
| asah wrote:
| meh - the LLM code I'm seeing isn't particularly more verbose.
| And as others have said, if you want tighter code, just add
| that to the prompt.
|
| fun story: today I had an LLM write me a non-trivial perl one-
| liner. It tried to be verbose but I insisted and it gave me one
| tight line.
| AlexandrB wrote:
| Exactly this. Code is a liability, if you can do the same thing
| with less code you're often better off.
| EasyMark wrote:
| Not if it's already stable and has been running for years.
| Legacy doesn't necessarily mean "need replacement because of
| technical debt". I've seen lots of people want to replace
| code that has been running basically bug free for years
| because "there are better coding styles and practices now"
| 8note wrote:
| How would it know which edge cases are being useful and which
| ones aren't?
|
| I understand more code as being more edge cases
| wvenable wrote:
| More code could just be useless code that no longer serves
| any purpose but still looks reasonable to the naked eye. An
| LLM can certainly figure out and suggest that some
| conditional is impossible given the rest of the code.
|
| It can also suggest alternatives, like using existing library
| functions for things that might have been coded manually.
| jrockway wrote:
| When I was there, way more than 25% of the code was copying one
| proto into another proto, or so people complained. What sort of
| memes are people making now that this task has been automated?
| dietr1ch wrote:
| I miss old memegen, but it got ruined by HR :/
| rcarmo wrote:
| I am reliably told that it is alive and well, even if it's
| changed a bit.
| anon1243 wrote:
| Memegen is there but unrecognizable now. A dedicated
| moderator team deletes memes, locks comments, bans people
| for mentioning "killing a process" (threatening language!)
| and contacts their managers.
| dietr1ch wrote:
| Yup, I simply stopped using it, which means they won.
| hn_throwaway_99 wrote:
| I am very interested in how this 25% number is calculated, and
| if it's a lot of boilerplate that in the past would have just
| been big copy-paste jobs, like a lot of protobuf work.
| Would be curious if any Googlers could comment.
|
| Not that I'm really discounting the value of AI here. For
| example, I've found a ton of value and saved time getting AI to
| write CDKTF (basically, Terraform in Typescript) config scripts
| for me. I don't write Terraform that often, there are a ton of
| options I always forget, etc. So asking ChatGPT to write a
| Terraform config for, say, a new scheduled task saves me from
| a lot of manual lookup.
|
| But at the same time, the AI isn't really writing the
| complicated logic pieces for me. I think that comes down to the
| fact that when I do need to write complicated logic, I'm a
| decent enough programmer that it's probably faster for me to
| write it out in a high-level programming language than write it
| in English first.
| tylerchilds wrote:
| if the golden rule is that code is a liability, what does this
| headline imply?
| danielmarkbruce wrote:
| I'm sure google won't pay you money to take all their code off
| their hands.
| AlexandrB wrote:
| But they would pay me money to audit it for security.
| danielmarkbruce wrote:
| yup, you can get paid all kinds of money to fix/guard/check
| billion/trillion dollar assets..
| eddd-ddde wrote:
| The code would be getting written anyway; it's an invariant.
| The difference is less time wasted typing keys (albeit small
| amount of time) and more importantly (in my experience) it
| helps A LOT for discoverability.
|
| With g3's immense amount of context, LLMs can vastly help you
| discover how other people are using existing libraries.
| tylerchilds wrote:
| my experience dabbling with the ai and code is that it is
| terrible at coming up with new stuff unless it already exists
|
| in regards to how others are using libraries, that's where
| the technology will excel-- re-writing code. once it has a
| stable AST to work with, the mathematical equation it is
| solving is a refactor.
|
| until it has that AST that solves the business need, the game
| is just prompt spaghetti until it hits altitude to be able to
| refactor.
| JimDabell wrote:
| Nothing at all. The headline talks about the proportion of code
| written by AI. Contrary to what a lot of comments here are
| assuming, it does not say that the volume of code written has
| increased.
|
| Google could be writing the same amount of code with fewer
| developers (they have had multiple layoffs lately), or their
| developers could be focusing more of their time and attention
| on the code they do write.
| contravariant wrote:
| Well, either they just didn't spend as much time writing the
| code or they increased their liability by about 33%.
|
| The truth is likely somewhere in between.
| kev009 wrote:
| I would hope a CEO, especially a technical one, would have enough
| sense to couple that statement to some useful business metric,
| because in isolation it might be an announcement of public
| humiliation.
| dyauspitr wrote:
| Or a statement of pride that the intelligence they created is
| capable of lofty tasks.
| dmix wrote:
| The elitism of programmers who think the boilerplate code they
| write for 25% of the job, code that's already been written by
| 1000 other people before, is in fact a valuable use of company
| time to write by hand again.
|
| IMO it's only really an issue if a competent human wasn't
| involved in the process: basically, a person who could have
| written it if needed, who then does the work of connecting it
| to the useful stuff and has appropriate QA/testing in
| place...the latter often taking far more effort than the actual
| writing-the-code time itself, even when a human does it.
| marcosdumay wrote:
| If 25% of your code is boilerplate, you have a serious
| architectural problem.
|
| That said, I've seen even higher ratios. But never in any
| place that survived for long.
| cryptoz wrote:
| Android mobile development has gotten so ...architectured
| that I would guess most apps have a much higher rate of
| "boilerplate" than you'd hope for.
|
| Everything is getting forced into a scalable, general-purpose
| mold, so most apps have to add a ridiculous amount of
| boilerplate.
| dmix wrote:
| You're probably thinking of just raw codebases, your
| company source code repo. Programmers do far, far more
| boilerplate stuff than raw code they commit with git.
| Debugging, data processing, system scripts, writing SQL
| queries, etc.
|
| Combine that with generic functions, framework boilerplate,
| OS/browser stuff, or explicit x-y-z code, and your
| 'boilerplate' (i.e. repetitive, easily reproducible code)
| easily gets to 25% of the code your programmers write every
| month. If your job is >75% pure human-cognition problem
| solving, you're probably in a higher tier of jobs than the
| vast majority of programmers on the planet.
| hn_throwaway_99 wrote:
| Depends on how you define "boilerplate". E.g. Terraform
| configs count for a significant number of the total lines
| in one of my repos. It's not really "boilerplate" in that
| it's not the exact same everywhere, but it is boilerplate
| in the sense that setting up, say, a pretty standard Cloud
| SQL instance can take many, many lines of code just because
| there are so many config options.
| marcosdumay wrote:
| Terraform is verbose.
|
| It's only boilerplate if you write it again to set up almost
| the same thing again. Which, granted, if you are writing
| bare Terraform config, it probably is both.
|
| But in either case, if your Terraform config is
| repetitive and a large part of the code of an entire
| thing (not a repo; repos are arbitrary divisions; maybe
| "product", though that's also a bad name), then that thing
| is certainly close to useless.
| 8note wrote:
| Is it though? It seems to me like a team ownership boundary
| question rather than an architecture question.
|
| Architecturally, it sounds like different architecture
| components map somewhere close to 1:1 to teams, rather than
| teams hacking components to be closer coupled to each other
| because they have the same ownership.
|
| I'd see too much boilerplate as being an
| organization/management issue rather than a code
| architecture issue.
| TheNewsIsHere wrote:
| To add: it's been my experience that it's the company that
| thinks the boilerplate code is some special, secret,
| proprietary thing that no other business could possibly
| have produced.
|
| Not the developer who has written the same effective stanza
| 10 times before.
| wvenable wrote:
| 25% of _new_ code might be boilerplate. All my apps in my
| organization start out roughly the same way with all the
| same stuff. You could argue on day one that 100% of the
| code is boilerplate and by the end of the project it is
| only a small percentage.
| mistrial9 wrote:
| you probably underestimate the endless miles of verbose code
| that are possible, by human or machine but especially by
| machine.
| kev009 wrote:
| Doing the same thing but faster might just mean you are
| masturbating more furiously. Show me the money, especially
| from a CEO.
| an_d_rew wrote:
| Huh.
|
| That may explain why Google search has, in the past couple of
| months, become so unusable for me that I switched (happily) to
| Kagi.
| twarge wrote:
| Which uses Google results?
| croes wrote:
| Related?
|
| > New tool bypasses Google Chrome's new cookie encryption system
|
| https://news.ycombinator.com/item?id=41988648
| hipadev23 wrote:
| Google is now mass-producing techdebt at rates not seen since
| Martin Fowler's first design pattern blogposts.
| joeevans1000 wrote:
| Not really technical debt when you will be able to regenerate
| 20K lines of code in a minute then QA and deploy it
| automatically.
| kibwen wrote:
| So a fresh, new ledger of technical debt every morning,
| impossible to ever pay off?
| 1attice wrote:
| Assuming, of course:
|
| - You know _which_ 20K lines need changing
| - You have _perfect_ QA
| - Nothing _ever_ goes wrong in deployment.
|
| I think there's a tendency in our industry to only take the
| hypotenuse of curves at the steepest point
| TheNewsIsHere wrote:
| That is a fantastic way to put it. I'd argue that you've
| described a bubble, which fits perfectly with the topic and
| where _most_ of it will eventually end up.
| nelup20 wrote:
| We've now entered the age of exponential tech debt, it'll be a
| sight to behold
| evbogue wrote:
| I'd be turning off the autocomplete in my IDE if I was at Google.
| Seems to double as a keylogger.
| joeevans1000 wrote:
| I read these threads and the usual 'I have to fix the AI code for
| longer than it would have taken to write it from scratch' and
| can't help but feel folks are truly trying to downplay what is
| going to eat the software industry alive.
| mjbale116 wrote:
| If you manage to convince software engineers that you are doing
| them a favour by employing them then they will approach any
| workplace negotiations with a specific mindset which will make
| them grab the first number that gets thrown at them.
|
| These statements are brilliant.
| akira2501 wrote:
| These statements rely on an unchallenged monopoly position.
| This is not sustainable. These statements will hasten the
| collapse.
| imaginebit wrote:
| I think he's trying to promote AI, which somehow raises
| questions about their code quality among some.
| dietr1ch wrote:
| I think it just shows how much noise there is in coding. Code
| gets reviewed anyway (although review quality was going down
| rapidly the more PMs were added to the team).
|
| Most of the code must be what could be snippets (opening files
| and handling errors with absl::, and moving data from proto to
| proto). One thing that doesn't help here, is that when writing
| for many engineers on different teams to read, spelling out
| simple code instead of depending on too many abstractions seems
| to be preferred by most teams.
|
| I guess that LLMs do provide smarter snippets that I don't need
| to fill out in detail, and when it understands types and
| whether things compile it gets quite good and "smart" when it
| comes to writing down boilerplate.
| ultra_nick wrote:
| Why work at big businesses anymore? Let's just create more
| startups.
| IAmGraydon wrote:
| Risk appetite.
| game_the0ry wrote:
| Not so sure nowadays. Given how often big tech lays off
| employees and the abundance of recently laid off tech talent,
| trying to start your own company sounds a lot more appealing
| than ever.
|
| I consider myself risk-averse and even I am contemplating
| starting a small business in the event I get laid off.
| nine_zeros wrote:
| Writing more code means more needs to be maintained and they are
| cleverly hiding that fact. Software is a lot more like complex
| plumbing than people want to admit:
|
| More lines == more shit to maintain. Complex lines == the shit is
| unmanageable.
|
| But wall street investors love simplistic narratives such as More
| X == More revenue. So here we are. Pretty clever marketing imo.
| Tier3r wrote:
| Google is getting enshittified. It's already visible in many
| small ways. I was just using Google Maps and in the route they
| labeled X (bus) Interchange as X International. I can only assume
| this happened because they are using AI to summarise routes now.
| Why in the world are they doing that? They have exact location
| names available.
| 1oooqooq wrote:
| this only means employees sign up to use new toys and they are
| paying enough seats for all employees.
|
| it's like companies paying all those todolist and tutorial apps
| left running on aws ec2 instances in 2007ish.
|
| I'd be worried if i were a google investor. lol.
| fragmede wrote:
| I'm not sure I get your point. Google created Gemini and
| whatever internal LLM their employees are using for code
| generation. Who are they paying, and for what seats? Not
| Microsoft or OpenAI or Anthropic...
| nosbo wrote:
| I don't write code as I'm a sysadmin. Mostly just scripts. But is
| this like saying intellisense writes 25% of my code? Because I
| use autocomplete to shortcut stuff or to create a for loop to
| fill with things I want to do.
| n_ary wrote:
| You just made it less attractive to the target corps who are
| supposed to buy this product from Google. Saying "intellisense"
| means corps already have licenses for various of these, and
| some are even mostly free. Saying "AI generates 25% of our
| code" sounds more attractive to corps, because it feels like
| something new and novel, and you can imagine laying off 25% of
| the personnel to justify buying this product from Google.
|
| When someone who uses a product says it, there is a 50% chance
| of it being true, but when someone far away from the user says
| it, it is 100% promotion of product and setup for trust
| building for a future sale.
| coldpie wrote:
| Looks like it's an impressive autocomplete feature, yeah. Check
| out the video about halfway down here:
| https://research.google/blog/ai-in-software-engineering-at-g...
| (linked from other comment
| https://news.ycombinator.com/item?id=41992028 )
|
| Not what I thought when I heard "AI coding", but seems pretty
| neat.
| ChrisArchitect wrote:
| Related:
|
| _Alphabet ($GOOG) 2024 Q3 earnings release_
|
| https://news.ycombinator.com/item?id=41988811
| ntulpule wrote:
| Hi, I lead the teams responsible for our internal developer
| tools, including AI features. We work very closely with Google
| DeepMind to adapt Gemini models for Google-scale coding and other
| Software Engineering use cases. Google has a unique, massive
| monorepo which poses a lot of fun challenges when it comes to
| deploying AI capabilities at scale.
|
| 1. We take a lot of care to make sure the AI recommendations are
| safe and have a high quality bar (regular monitoring, code
| provenance tracking, adversarial testing, and more).
|
| 2. We also do regular A/B tests and randomized control trials to
| ensure these features are improving SWE productivity and
| throughput.
|
| 3. We see similar efficiencies across all programming languages
| and frameworks used internally at Google and engineers across all
| tenure and experience cohorts show similar gains in productivity.
|
| You can read more on our approach here:
|
| https://research.google/blog/ai-in-software-engineering-at-g...
| reverius42 wrote:
| To me the most interesting part of this is the claim that you
| can accurately and meaningfully measure software engineering
| productivity.
| valval wrote:
| You can come up with measures for it and then watch them,
| that's for sure.
| lr1970 wrote:
| When a metric becomes the target it ceases to be a good
| metric. Once developers discover how it works, they will type
| the first character immediately after opening the log.
|
| edit: typo
| joshuamorton wrote:
| Only if the developer is being judged on the thing. If
| the _tool_ is being judged on the thing, it's much less
| relevant.
|
| That is, I, personally, am not measured on how much AI
| generated code I create, and while the number is non-
| zero, I can't tell you what it is because I don't care
| and don't have any incentive to care. And I'm someone who
| is personally fairly bearish on the value of LLM-based
| codegen/autocomplete.
| ozim wrote:
| You can - but not on the level of a single developer and you
| cannot use those measures to manage productivity of a
| specific dev.
|
| For teams you can measure meaningful outcomes and improve
| team metrics.
|
| You shouldn't really compare teams but it also is possible if
| you know what teams are doing.
|
| If you are some disconnected manager that thinks he can make
| decisions or improvements reducing things to single numbers -
| yeah that's not possible.
| deely3 wrote:
| > For teams you can measure meaningful outcomes and improve
| team metrics.
|
| How? Which metrics?
| ozim wrote:
| That is what we pay managers -to figure out- for. They
| should find out which ones and how by knowing the team, being
| familiar with the domain, understanding company
| dynamics, understanding the customer, and understanding market
| dynamics.
| seanmcdirmid wrote:
| That's basically a non-answer. Measuring "productivity"
| is a well known hard problem, and managers haven't really
| figured it out...
| yorwba wrote:
| Economists are generally fine with defining productivity
| as the ratio of aggregate outputs to aggregate inputs.
|
| Measuring it is not the hard part.
|
| The hard part is doing anything about it. If you can't
| attribute specific outputs to specific inputs, you don't
| know how to change inputs to maximize outputs. That's
| what managers need to do, but of course they're often
| just guessing.
| seanmcdirmid wrote:
| Measuring human productivity is hard since we can't
| quantify output beyond silly metrics like lines of code
| written or amount of time speaking during meetings. Maybe
| if we were hunter/gatherers we could measure it by amount
| of animals killed.
| yorwba wrote:
| That's why upthread we have
| https://news.ycombinator.com/item?id=41992562
|
| "You can [accurately and meaningfully measure software
| engineering productivity] - but not on the level of a
| single developer and you cannot use those measures to
| manage productivity of a specific dev."
|
| At the level of a company like Google, it's easy: both
| inputs and outputs are measured in terms of money.
| ozim wrote:
| As you point back to my comment.
|
| I am not an Amazon person - but from my experience two-pizza
| teams were what worked; I never implemented it myself, it's
| just what I observed in the wild.
|
| Measuring Google in terms of money is also flawed; there is
| loads of BS hidden there, and lots of people pay big
| companies more just because they are big companies.
| ozim wrote:
| Well I pretty much see which team members are slacking
| and which are working hard.
|
| But I do code myself, I write requirements so I do know
| which ones are trivial and which ones are not. I also see
| when there are complex migrations.
|
| If you work in a group of people you will also get
| feedback - doesn't have to be snitching but still you get
| the feel who is a slacker in the group.
|
| It is hard to quantify the output if you want to be a
| removed-from-the-group, "give me a number" manager. If you
| actually do the work of a manager and get a feel for the
| group - who is the "Hermione Granger" nagging that others are
| slacking (and whose opinion you can disregard), who is the
| "silent doer", who is the "we should do it properly"
| bullshitter - you can make a lot of meaningful adjustments.
| mdorazio wrote:
| It's not a non-answer. Good managers need to figure out
| what metrics make sense for the team they are managing,
| and that will change depending on the company and team.
| It might be new features, bug fixes, new product launch
| milestones, customer satisfaction, ad revenue, or any of
| a hundred other things.
| randomNumber7 wrote:
| I heard lines of code is a hot one.
| seanmcdirmid wrote:
| I would want a specific example in that case rather than
| "the good managers figure it out" because in my
| experience, the bad managers pretend to figure it out
| while the good managers admit that they can't figure it
| out. Worse still, if you tell your reports what those
| metrics are, they will optimize them to death,
| potentially tanking the product (I can increase my bug
| fix count if there are more bugs to fix...).
| ozim wrote:
| So for a specific example I would have to outline 1-2
| years of history of a team and product as a starter.
|
| Then I would have to go on outlining 6-12 months of
| trying stuff out.
|
| Because if I just give "an example" I will get dozens of
| "smart ass" replies how this specific one did not work
| for them and I am stupid. Thanks but don't have time for
| that or for writing an essay that no one will read anyway
| and call me stupid or demand even more explanation. :)
| anthonyskipper wrote:
| My company uses the DORA metrics to measure the
| productivity of teams and those metrics are incredibly
| good.
| UncleMeat wrote:
| At scale you can do this in a bunch of interesting ways. For
| example, you could measure "amount of time between opening a
| crash log and writing the first character of a new change"
| across 10,000s of engineers. Yes, each individual data point
| is highly messy. Alice might start coding as a means of
| investigation. Bob might like to think about the crash over
| dinner. Carol might get a really hard bug while David gets a
| really easy one. But at scale you can see how changes in the
| tools change this metric.
|
| None of this works to evaluate individuals or even teams. But
| it can be effective at evaluating tools.
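|
| (As a toy illustration of the aggregate idea - the event shape
| here is invented - the tool-level metric is just something like:
|
|     from statistics import median
|
|     # events: (open_log_ts, first_keystroke_ts) pairs, one per
|     # debugging session, pooled across many engineers
|     def median_time_to_first_edit(events):
|         return median(end - start for start, end in events)
|
| and you watch how that number moves when the tooling changes.)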
| fwip wrote:
| There's lots of stuff you can measure. It's not clear
| whether any of it is correlated with productivity.
|
| To use your example, a user with an LLM might say "LLM
| please fix this" as a first line of action, drastically
| improving this metric, even if it ruins your overall
| productivity.
| fhdsgbbcaA wrote:
| I've been thinking a lot lately about how an LLM trained on
| really high quality code would perform.
|
| I'm far from impressed with the output of GPT/Claude; all
| they've done is weight it against Stack Overflow - which is
| still low quality code relative to Google's.
|
| What is probability Google makes this a real product, or is it
| too likely to autocomplete trade secrets?
| hitradostava wrote:
| I'm continually surprised by the amount of negativity that
| accompanies these sort of statements. The direction of travel
| is very clear - LLM based systems will be writing more and more
| code at all companies.
|
| I don't think this is a bad thing - if this can be accompanied
| by an increase in software quality, which is possible. Right
| now it's very hit and miss and everyone has examples of LLMs
| producing buggy or ridiculous code. But once the tooling
| improves to:
|
| 1. Align produced code better to existing patterns and
| architecture.
|
| 2. Fix the feedback loop - with TDD, other LLM agents
| reviewing code, feeding in compile errors, letting other
| LLM agents interact with the produced code, etc.
|
| Then we will definitely start seeing more and more code
| produced by LLMs. Don't look at the state of the art now, look
| at the direction of travel.
| latexr wrote:
| > if this can be accompanied by an increase in software
| quality
|
| That's a huge "if", and by your own admission not what's
| happening now.
|
| > other LLM agents reviewing code, feeding in compile errors,
| letting other LLM agents interact with the produced code,
| etc.
|
| What a stupid future. Machines which make errors being
| "corrected" by machines which make errors in a death spiral.
| An unbelievable waste of figurative and literal energy.
|
| > Then we will definitely start seeing more and more code
| produced by LLMs.
|
| We're already there. And there's a lot of bad code being
| pumped out. Which will in turn be fed back to the LLMs.
|
| > Don't look at the state of the art now, look at the
| direction of travel.
|
| That's what leads to the eternal "in five years" which
| eventually sinks everyone's trust.
| danielmarkbruce wrote:
| > What a stupid future. Machines which make errors being
| "corrected" by machines which make errors in a death
| spiral. An unbelievable waste of figurative and literal
| energy.
|
| Humans are machines which make errors. Somehow, we got to
| the moon. The suggestion that errors just mindlessly
| compound and that there is no way around it, is what's
| stupid.
| nuancebydefault wrote:
| Exactly my thought. Humans can correct humans. Machines
| can correct, or at least point to failures in the product
| of, machines.
| reverius42 wrote:
| To err is human. To err at scale is AI.
| latexr wrote:
| > Humans are machines
|
| Even if we accept the premise (seeing humans as machines
| is literally dehumanising and a favourite argument of
| those who exploit them), not all machines are created
| equal. Would you use a bicycle to file your taxes?
|
| > Somehow, we got to the moon
|
| Quite hand wavey. We didn't get to the Moon by reading a
| bunch of books from the era then probabilistically
| joining word fragments, passing that around the same
| funnel a bunch of times, then blindly doing what came
| out, that's for sure.
|
| > The suggestion that errors just mindlessly compound and
| that there is no way around it
|
| Is one that you made up, as that was not my argument.
| paradox242 wrote:
| I don't see how this is sustainable. We have essentially
| eaten the seed corn. These current LLMs have been trained on
| an enormous corpus of mostly human-generated technical
| knowledge from sources which we already know to be currently
| being polluted by AI-generated slop. We also have preliminary
| research into how poorly these models do when training on
| data generated by other LLMs. Sure, it can coast off of that
| initial training set for maybe 5 or more years, but where
| will the next giant set of unpolluted training data come
| from? I just don't see it, unless we get something better
| than LLMs which is closer to AGI or an entire industry is
| created to explicitly create curated training data to be fed
| to future models.
| _DeadFred_ wrote:
| These tools also require the developer class to that they
| are intended to replace to continue to do what they
| currently do (create the knowledge source to train the AI
| on). It's not like the AIs are going to be creating the
| accessible knowledge bases to train AIs on, especially for
| new language extensions/libraries/etc. This is a one and
| f'd development. It will give a one time gain and then
| companies will be shocked when it falls apart and there's
| no developers trained up (because they all had to switch
| careers) to replace them. Unless Google's expectation is
| that all languages/development/libraries will just be
| static going forward.
| brainwad wrote:
| The LLM codegen at Google isn't unsupervised. It's
| integrated into the IDE as both autocomplete and prompt-
| based assistant, so you get a lot of feedback from a) what
| suggestions the human accepts and b) how they fix the
| suggestion when it's not perfect. So future iterations of
| the model won't be trained on LLM output, but on a mixture
| of human written code and human-corrected LLM output.
|
| As a dev, I like it. It speeds up writing easy but tedious
| code. It's just a bit smarter version of the refactoring
| tools already common in IDEs...
| randomNumber7 wrote:
| Because there seems to be a fundamental misunderstanding
| producing a lot of nonsense.
|
| Of course LLMs are a fantastic tool to improve productivity,
| but current LLMs cannot produce anything novel. They can
| only reproduce what they have seen.
| LinuxBender wrote:
| Is AI ready to crawl through all open source and find / fix all
| the potential security bugs or all bugs for that matter? If so
| will that become a commercial service or a free service?
|
| Will AI be able to detect bugs and back doors that require
| multiple pieces of code working together rather than being in a
| single piece of code? Humans have a hard time with this.
|
| - _Hypothetical Example: Authentication bugs in sshd that
| require a flaw in systemd, which then requires a flaw in udev
| or nss or PAM or some underlying library ... but looking at
| each individual library or daemon there are no bugs that a
| professional penetration testing organization such as the NCC
| group or Google's Project Zero would find._ In other words,
| will AI soon be able to find more complex bugs in a year than
| Tavis has found in his career and will they start to compete
| with one another and start finding all the state sponsored
| complex bugs and then ultimately be able to create a map that
| suggests a common set of developers that may need to be
| notified? Will there be a table that logs where AI found things
| that professional human penetration testers could not?
| paradox242 wrote:
| Seems like there is more gain on the adversary side of this
| equation. Think nation-states like North Korea or China, and
| commercial entities like NSO Group (makers of Pegasus).
| AnimalMuppet wrote:
| Google's AI would have the advantage of the source code.
| The adversaries would not. (At least, not without hacking
| Google's code repository, which isn't impossible...)
| mysterydip wrote:
| I assume the amount of monitoring effort is less than the
| amount of effort that would be required to replicate the AI
| generated code by humans, but do you have numbers on what that
| ROI looks like? Is it more like 10% or 200%?
| hshshshshsh wrote:
| Seems like everything is working out without any issues.
| Shouldn't you be a bit suspicious?
| Twirrim wrote:
| > We work very closely with Google DeepMind to adapt Gemini
| models for Google-scale coding and other Software Engineering
| usecases.
|
| Considering how terrible and frequently broken the code that
| the public facing Gemini produces, I'll have to be honest that
| that kind of scares me.
|
| Gemini frequently fails at some fairly basic stuff, even in
| popular languages where it would have had a lot of source
| material to work from; where other public models (even free
| ones) sail through.
|
| To give a fun, fairly recent example, here's a prime
| factorisation algorithm it produced for Python:
|
|     # Find the prime factorization of n
|     prime_factors = []
|     while n > 1:
|         p = 2
|         while n % p == 0:
|             prime_factors.append(p)
|             n //= p
|         p += 1
|     prime_factors.append(n)
|
| Can you spot all the problems?
| senko wrote:
| We collectively deride leetcoding interviews yet ask AI to
| flawlessly solve leetcode questions.
|
| I bet I'd make more errors on my first try at it.
| AnimalMuppet wrote:
| Writing a prime-number factorization function is hardly
| "leetcode".
| atomic128 wrote:
| Empirical testing (for example:
| https://news.ycombinator.com/item?id=33293522) has
| established that the people on Hacker News tend to be
| junior in their skills. Understanding this fact can help
| you understand why certain opinions and reactions are
| more likely here. Surprisingly, the more skilled
| individuals tend to be found on Reddit (same testing
| performed there).
| louthy wrote:
| I'm not sure that's evidence; I looked at that and saw it
| was written in Go and just didn't bother. As someone with
| 40 years of coding experience and a fundamental dislike
| of Go, I didn't feel the need to even try. So the numbers
| can easily be skewed, surely.
| atomic128 wrote:
| Only individuals who submitted multiple bad solutions
| before giving up were counted as failing. If you look but
| don't bother, or submit a single bad solution, you aren't
| counted. Thousands of individuals were tested on Hacker
| News and Reddit, and surprisingly, it's not even close:
| Reddit is where the hackers are. I mean, at the time of
| the testing, years ago.
| louthy wrote:
| That doesn't change my point. It didn't test every dev on
| all platforms, it tested a subset. That subset may well
| have different attributes to the ones that didn't engage.
| So, it says nothing about the audience for the forums as
| a whole, just the few thousand that engaged.
|
| Perhaps even, there could be fewer Go programmers here
| and some just took a stab at it even though they don't
| know the language. So it could just select for which
| forum has the most Go programmers. Hardly rigorous.
|
| So I'd take that with a pinch of salt personally
| atomic128 wrote:
| Agreed. But remember, this isn't the only time the
| population has been tested. This is just the test (from
| two years ago, in 2022) that I happen to have a link to.
| louthy wrote:
| The population hasn't been tested. A subset has.
| Izikiel43 wrote:
| How is that thing testing? Is it expecting a specific
| solution or actually running the code? I tried some
| solutions and it complained anyways
| atomic128 wrote:
| The way the site works is explained in the first puzzle,
| "Hack This Site". TLDR, it builds and runs your code
| against a test suite. If your solutions weren't accepted,
| it's because they're wrong.
| 0xDEAFBEAD wrote:
| Where is the data?
| senko wrote:
| I didn't say it's hard, but it's most definitely
| leetcode, as in "pointless algorithmic exercise that will
| only show you if the candidate recently worked on a
| similar question".
|
| If that doesn't satisfy, here's a similar one at
| leetcode.com: https://leetcode.com/problems/distinct-
| prime-factors-of-prod...
|
| I would not expect a programmer of any seniority to churn out
| stuff like that and have it working without testing.
| AnimalMuppet wrote:
| > "pointless algorithmic exercise that will only show you
| if the candidate recently worked on a similar question".
|
| I've been able to write one, not from memory but from
| first principles, any time in the last 40 years.
| gerash wrote:
| I believe most people use AI to help them quickly figure out
| how to use a library or an API without having to read all
| their (often outdated) documentation, rather than to help them
| solve some mathematical challenge.
| taeric wrote:
| If the documentation is out of date, such that it doesn't
| help, this doesn't bode well for the training data of the
| AI helping it get it right, either?
| macintux wrote:
| AI can presumably integrate all of the forum discussions
| talking about how people really use the code.
|
| Assuming discussions don't happen in Slack, or Discord,
| or...
| randomNumber7 wrote:
| And all the code on which it was trained...
| woodson wrote:
| Unfortunately, it often hallucinates wrong parameters (or
| gets their order wrong) if there are multiple different
| APIs for similar packages. For example, there are plenty of
| ML model inference packages, and the code suggestions for
| NVIDIA Triton Inference Server Python code are pretty
| much always wrong, as it generates code that's probably
| correct for other Python ML inference packages with
| slightly different API.
| randomNumber7 wrote:
| I think that too, but Google claims something else.
| kgeist wrote:
| They probably use AI for writing tests, small internal
| tools/scripts, building generic frontends and quick
| prototypes/demos/proofs of concept. That could easily be that
| 25% of the code. And modern LLMs are pretty okayish with
| that.
| calf wrote:
| We are sorely lacking a "Make Computer Science a Science"
| movement; the tech lead's blurb is par for the course,
| talking about "SWE productivity" with no reference to
| scientific inquiry or to a foundational understanding of the
| safety, correctness, verification, and validation of these new
| LLM technologies.
| justinpombrio wrote:
| > Can you spot all the problems?
|
| You were probably being rhetorical, but there are two
| problems:
|
| - `p = 2` should be outside the loop
|
| - `prime_factors.append(n)` appends `1` onto the end of the
| list for no reason
|
| With those two changes I'm pretty sure it's correct.
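|
| For what it's worth, the corrected version with those two
| changes applied would read roughly:
|
|     # Find the prime factorization of n
|     prime_factors = []
|     p = 2
|     while n > 1:
|         while n % p == 0:
|             prime_factors.append(p)
|             n //= p
|         p += 1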
| bogwog wrote:
| Is any of the AI generated code being committed to Google's
| open source repos, or is it only being used for
| private/internal stuff?
| wslh wrote:
| As someone working in cybersecurity and actively researching
| vulnerability scanning in codebases (including with LLMs), I'm
| struggling to understand what you mean by "safe." If you're
| referring to detecting security vulnerabilities, then you're
| either working on a confidential project with unpublished
| methods, or your approach is likely on par with the current
| state of the art, which primarily addresses basic
| vulnerabilities.
| pixelat3d wrote:
| Sooo... is this why Google sucks now?
| rcarmo wrote:
| There is a running gag among my friends using Google Chat (or
| whatever their corporate IM tool is now called) that this
| explains a lot of what they're experiencing while using it...
| tdeck wrote:
| I didn't know anyone outside Google actually used that...
| oglop wrote:
| No surprise. I give my career about 2 years before I'm useless.
| phi-go wrote:
| They still need someone to write 75% of the code.
| k4rli wrote:
| Seems just overhyped tech to push up stock prices. It was
| already claimed 2 years ago that half of the jobs would be
| taken by "AI" but barely any have been, and AI has barely
| improved since GPT-3.5. The latest Anthropic model is only
| slightly helpful for software development, mostly for unusual
| bug investigations and
| logs analysis, at least in my experience.
| lysace wrote:
| Github Copilot had an outage for me this morning. It was kind of
| shocking. I now believe this metric. :-)
|
| I'll be looking into ways of running a local LLM for this purpose
| (code assistance in VS Code). I'm already really impressed with
| various quite large models running on my 32 GB Mac Studio M2 Max
| via Ollama. It feels like having a locally running chatgpt.
| kulahan wrote:
| I'm very happy to hear this; maybe it's finally time to buy a
| ton of ram for my PC! A local, private LLM would be great. I'd
| try talking to it about stuff I don't feel comfortable being on
| OpenAI's servers.
| lysace wrote:
| Getting lots of ram will let you run large models on the CPU,
| but it will be so slow.
|
| The Apple Silicon Macs have this shared memory between CPU
| and GPU that lets the (relatively underpowered GPU, compared
| to a decent Nvidia GPU) run these models at decent speeds,
| compared with a CPU, when using llama.cpp.
|
| This should all get _dramatically_ better/faster/cheaper
| within a few years, I suspect. Capitalism will figure this
| one out.
| kulahan wrote:
| Interesting, so this is a Mac-specific solution? That's
| pretty cool.
|
| I assume, then, that the primary goal would be to drop in
| the beefiest GPU possible when on windows/linux?
| lysace wrote:
| With Windows/Linux I think the issue is that Nvidia is
| artificially limiting the amount of onboard RAM (they
| want to sell those devices for 10x more to OpenAI, etc.)
| and that AMD for whatever reason can't get their shit
| together.
|
| I'm sure that there are other much more knowledgeable
| people here though, on this topic.
| evoke4908 wrote:
| Ollama, docker and "open webui".
|
| It immediately works out of the box and that's it. I've been
| using local LLMs on my laptop for a while, it's pretty nice.
|
| The only thing you really need to worry about is VRAM. Make
| sure your GPU has enough memory to run your model and that's
| pretty much it.
|
| Also "open webui" is the worst project name I've ever seen.
| marstall wrote:
| first thought is that much of that 25% is test code for non-ai-
| gen code...
| pfannkuchen wrote:
| It's replaced the 25% previously copy pasted from stack overflow.
| brainwad wrote:
| The split is roughly 25% AI, 25% typed, 50% pasted.
| rkagerer wrote:
| This may have been intended as a joke, but it's the only
| explanation that reconciles the quote for me.
| ryoshu wrote:
| Spoken like an MBA who counts lines of code.
| marstall wrote:
| maps with recent headlines about AI improving programmer
| productivity 20-30%.
|
| which puts it in line with previous code-generation technologies,
| I would imagine. I wonder which of these increased productivity
| the most?
|
| - Assembly Language
|
| - early Compilers
|
| - databases
|
| - graphics frameworks
|
| - ui frameworks (windows)
|
| - web apps
|
| - code generators (rails scaffolding)
|
| - genAI
| akira2501 wrote:
| Early Compilers. By a wide margin. They are the enabling factor
| for everything that comes below it. It's what allows you to
| share library interfaces and actually use them in a consistent
| manner and across multiple architectures. It entirely changed
| the shape of software development.
|
| The gap between "high level assembly" and "compiled language"
| is about as large as it gets.
| skrebbel wrote:
| In my experience, AIs can generate perfectly good code for
| relatively easy things, the kind you might as well copy&paste
| from Stack Overflow, and they'll very confidently generate
| _subtly wrong_ code for anything that's non-trivial for an
| experienced programmer to write. How do people deal with this?
| I simply don't
| understand the value proposition. Does Google now have 25% subtly
| wrong code? Or do they have 25% trivial code? Or do all their
| engineers babysit the AI and bugfix the subtly wrong code? Or are
| all their engineers so junior that an AI is such a substantial
| help?
|
| Like, isn't this announcement a terrible indictment of how
| inexperienced their engineers are, or how trivial the problems
| they solve are, or both?
| tmoravec wrote:
| Does the figure include unit tests?
| airstrike wrote:
| By definition, "trivial" code should make up a significant
| portion of any code base, so perhaps the 25% is precisely the
| bit that is trivial and easily automated.
| Smaug123 wrote:
| I don't think the word "definition" means what you think it
| means!
| hifromwork wrote:
| 25% trivial code sounds like a reasonable guess.
| fzysingularity wrote:
| This seems reasonable - but I'm interpreting this as most
| junior-level coding needs will end and be replaced with AI.
| mrguyorama wrote:
| And the non-junior developers will then just magically
| appear from the aether! With 10 years' experience in a
| four-year-old stack.
| Nasrudith wrote:
| I wouldn't call it an indictment necessarily, because so much
| is dependent upon circumstances. They can't all be "deep
| problems" in the real world. Projects tend to have two
| components, "deep" work which is difficult and requires high
| skill and cannot be made up for by using masses of the
| inexperienced, and "shallow" work where being skilled doesn't
| really help, or doesn't help too much compared to throwing more
| bodies at the problem. To use an example it is like advanced
| accounting vs just counting up sales receipts.
|
| Even if their engineers were inexperienced that wouldn't be an
| indictment in itself, so long as they had a sufficient
| amount of shallow work. Using all experienced engineers to do
| shallow work is just inefficient, like having brain surgeons
| removing bunions. Automation is basically a way to transform
| deep work to a producer of "free" shallow work.
|
| That said, the really impressive thing with code isn't its
| creation but the ability to losslessly delete code and
| maintain or improve functionality.
| andyjohnson0 wrote:
| I suspect that a lot of the hard, google-scale stuff has
| already been done and packaged as an internal service or
| library - and just gets re-used. So the AIs are probably
| churning out new settings dialogs and the like.
| jajko wrote:
| I can generate in eclipse pojo classes or their accessor
| methods. I can let maven build entire packages from say XSDs (I
| know I am talking old boring tech, just giving an example). I
| can copy&paste half the code (if not more) from stack overflow.
|
| Now replace all this and much more with 'AI'. If they said AI
| helped them increase, say, ad effectiveness by 3-5%, I'd start
| paying attention.
| akira2501 wrote:
| > isn't this announcement a terrible indictment
|
| Of obviously flawed corporate structures. This CEO has no
| particular programming expertise, and most of his company's
| profits do not seem to flow from this activity. I strongly
| doubt he has a grip on the actual facts here and is
| uncritically repeating what was told to him in a meeting.
|
| He should, given his position, have been the very _first_
| person to ask the questions you've posed here.
| jjtheblunt wrote:
| Maybe the trick is to hide vetted correct code, of whatever
| origin, behind function calls for documented functions, thereby
| iteratively simplifying the work a later-trained LLM would need
| to do?
| toasteros wrote:
| > the kind you might as well copy&paste from stackoverflow
|
| This bothers me. I completely understand the conversational
| aspect - "what approach might work for this?", "how could we
| reduce the crud in this function?" - it worked a lot for me
| last year when I tried learning C.
|
| But the vast majority of AI use that I see is...not that. It's
| just glorified, very expensive search. We are willing to burn
| far, far more fuel than necessary because we've decided we
| can't be bothered with traditional search.
|
| A lot of enterprise software is poorly cobbled together using
| stackoverflow gathered code as it is. It's part of the reason
| why MS Teams makes your laptop run so hot. We've decided that
| power-inefficient software is the best approach. Now we want to
| amplify that effect by burning more fuel to get the same
| answers, but from an LLM.
|
| It's frustrating. It should be snowing where I am now, but it's
| not. Because we want to frivolously chase false convenience and
| burn gallons and gallons of fuel to do it. LLM usage is a part
| of that.
| chongli wrote:
| _we 've decided we can't be bothered with traditional search_
|
| Traditional search (at least on the web) is dying. The entire
| edifice is drowning under a rapidly rising tide of spam and
| scam sites. No one, including Google, knows what to do about
| it so we're punting on the whole project and hoping AI will
| swoop in like _deus ex machina_ and save the day.
| petre wrote:
| AI will generate even more spam and scam sites more
| trivially.
| romwell wrote:
| _Narrator: it did not, in fact, save the day._
| AnimalMuppet wrote:
| But it can't save the day.
|
| The problem with Google search is that it indexes all the
| web, and there's (as you say) a rising tide of scam and
| spam sites.
|
| The problem with AI is that it scoops up all the web as
| training data, _and there 's a rising tide of scam and spam
| sites._
| lokar wrote:
| It took the scam/spam sites a few years to catch up to
| Google search. Just wait a bit, equilibrium will return.
| akoboldfrying wrote:
| >The entire edifice is drowning under a rapidly rising tide
| of spam and scam sites.
|
| You make this claim with such confidence, but what is it
| based on?
|
| There have always been hordes of spam and scam websites.
| Can you point to anything that actually indicates that the
| ratio is now getting worse?
| chongli wrote:
| _There have always been hordes of spam and scam websites.
| Can you point to anything that actually indicates that
| the ratio is now getting worse?_
|
| No, there haven't always been hordes of spam and scam
| websites. I remember the web of the 90s. When Google
| first arrived on the scene every site on the results page
| was a real site, not a spam/scam site.
| masfuerte wrote:
| Google results are not polluted with spam because Google
| doesn't know how to deal with it.
|
| Google results are polluted with spam because it is more
| profitable for Google. This is a conscious decision they
| made five years ago.
| chongli wrote:
| _because it is more profitable for Google_
|
| Then why are DuckDuckGo results also (arguably even more
| so) polluted with spam/scam sites? I doubt DDG is making
| any profit from those sites since Google essentially owns
| the display ad business.
| JohnDone wrote:
| DDG is actually Bing. Search as a service.
| photonthug wrote:
| Maybe it is naive but I think search would probably work
| again if they could roll back code to 10 or 15 years ago
| and just make search engines look for text in webpages.
|
| Google wasn't crushed by spam, they decided to stop doing
| text search and build search bubbles that are user
| specific, location-specific, decided to surface pages that
| mention search terms in metadata instead of in text users
| might read, etc. Oh yeah, and about a decade before LLMs
| were actually usable, they started to sabotage simple
| substring searches and kind of force this more
| conversational interface. That's when simple search terms
| stopped working very well, and you had to instead ask
| yourself "hmm how would a very old person or a small child
| phrase this question for a magic oracle"
|
| This is how we get stuff like: Did you mean "when did
| Shakespeare die near my location"? If anyone at google
| cared more about quality than printing money, that thirsty
| gambit would at least be at the bottom of the page instead
| of the top.
| layer8 wrote:
| > just make search engines look for text in webpages.
|
| Google's verbatim search option roughly does that for me
| (plus an ad blocker that removes ads from the results
| page). I have it activated by default as a search
| shortcut.
|
| (To activate it, one can add "tbs=li:1" as a query
| parameter to the Google search URL.)
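|
| For example (hypothetical query), the resulting URL looks like:
| https://www.google.com/search?q=example+query&tbs=li:1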
| skissane wrote:
| I personally think a big problem with search is major
| search engines try to be all things to all people and hence
| suffer as a result.
|
| For example: a beginner developer is possibly better served
| by some SEO-heavy tutorial blog post; an experienced
| developer would prefer results weighted towards the
| official docs, the project's bug tracker and mailing list,
| etc. But since less technical and non-technical people
| vastly outnumber highly technical people, Google and Bing
| end up focusing on the needs of the former, at the cost of
| making search worse for the later.
|
| One positive about AI: if an AI is doing the search, it
| likely wants the more advanced material not the more
| beginner-focused one. It can take more advanced material
| and simplify it for the benefit of less experienced users.
| It is (I suspect) less likely to make mistakes if you ask
| it to simplify the more advanced material than if you just
| gave it more beginner-oriented material instead. So if AI
| starts to replace humans as the main clients of search,
| that may reverse some of the pressure to "dumb it down".
| photonthug wrote:
| > But since less technical and non-technical people
| vastly outnumber highly technical people, Google and Bing
| end up focusing on the needs of the former, at the cost
| of making search worse for the later.
|
| I mostly agree with your interesting comment, and I think
| your analysis basically jibes with my sibling comment.
|
| But one thing I take issue with is the idea that this
| type of thing is a good faith effort, because it's more
| like a convenient excuse. Explaining substring search or
| even include/exclude ops to children and grandparents is
| actually easy. Setting preferences for tutorials vs API
| docs would also be easy. But companies don't really want
| user-directed behavior as much as they want to herd users
| to preferred content with algorithms, then convince the
| user it was their idea or at least the result of
| relatively static ranking processes.
|
| The push towards more fuzzy semantic search and "related
| content" everywhere is not to cater to novice users but
| to blur the line between paid advertisement and organic
| user-directed discovery.
|
| No need to give megacorp the benefit of the doubt on
| stuff like this, or make the underlying problems seem
| harder than they are. All platforms land in this place by
| convergent evolution wherein the driving forces are money
| and influence, not insurmountable technical difficulties
| or good intentions for usability.
| cyanydeez wrote:
| If only google was trying to solve search rather than
| shareholder value.
| skydhash wrote:
| > _Traditional search (at least on the web) is dying._
|
| That's not my experience at all. While there are scammy
| sites, using the search engines as an index instead of an
| oracle still yields useful results. It only requires
| learning the keywords, which you can do by reading the
| relevant material.
| ponector wrote:
| >> No one, including Google, knows what to do about it
|
| I'm sure they can. But they have no incentive. Try to
| Google an item, and it will show you a perfect match among
| the sponsored ads and some other not-so-relevant
| non-sponsored results.
| AtlasBarfed wrote:
| There's no way the search AI will beat out the spamgen AI.
|
| Tailoring/retraining the main search AI will be so much
| more expensive than retraining the spam special-purpose
| AIs.
| quickthrowman wrote:
| Google could fix the problem if they wanted to, but it's
| not in their interests to fix it since the spam sites
| generally buy ads from Google and/or display Google ads on
| their spam websites. Google wants to maximize their income,
| so..
| layer8 wrote:
| Without a usable web search index, AI will be in trouble
| eventually as well. There is no substitute for it.
| nwellinghoff wrote:
| They probably have ai that scans existing human written code
| and auto generates patches and fixes to improve performance or
| security. The 25% is just a top level stat with no real meaning
| without context.
| groestl wrote:
| > do they have 25% trivial code?
|
| From what I've seen on Google Cloud, both as a user and from
| leaked source code, 25% of their code is probably just packing
| and unpacking of protobufs.
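|
| For anyone who hasn't seen it, that boilerplate is mostly
| hand-written adapters between internal types and wire
| messages. A hedged sketch of the shape (TypeScript, with
| hypothetical UserRecord/UserProto types standing in for a
| real generated protobuf class):
|
|     interface UserRecord {
|       id: number;
|       displayName: string;
|       email?: string;
|     }
|     interface UserProto {
|       id: number;
|       display_name: string;
|       email: string;
|     }
|
|     function toProto(u: UserRecord): UserProto {
|       return {
|         id: u.id,
|         display_name: u.displayName,
|         email: u.email ?? "",        // proto3-style default when absent
|       };
|     }
|
|     function fromProto(p: UserProto): UserRecord {
|       return {
|         id: p.id,
|         displayName: p.display_name,
|         email: p.email || undefined, // map empty string back to "absent"
|       };
|     }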
| tobyjsullivan wrote:
| I'm not a Google employee but I've heard enough stories to know
| that a surprising amount of code changes at google are
| basically updating API interfaces.
|
| The way google works, the person changing an interface is
| responsible for updating all dependent code. They create PRs
| which are then sent to code owners for approval. For lower-
| level dependencies, this can involve creating thousands of PRs
| across hundreds of projects.
|
| Google has had tooling to help with these large-scale refactors
| for decades, generally taking the form of static analysis
| tools. However, these would be inherently limited in their
| capability. Manual PR authoring would still be required in many
| cases.
|
| With this background, LLM code gen seems like a natural tool to
| augment Google's existing process.
|
| I expect Google is currently executing a wave of newly-
| unblocked refactoring projects.
|
| If anyone works/worked at google, feel free to correct me on
| this.
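|
| For a sense of what those changes look like, a hypothetical
| sketch (invented names, not Google code): an interface gains
| an options object, and every caller needs the same mechanical
| one-line edit.
|
|     interface User { id: number; name: string; }
|     interface FetchUserOptions { includeDeleted?: boolean; }
|
|     // New signature; the old one took a bare boolean flag.
|     async function fetchUser(
|       id: number,
|       opts: FetchUserOptions = {},
|     ): Promise<User> {
|       // real lookup elided; opts.includeDeleted is threaded through
|       return { id, name: "example" };
|     }
|
|     // Each caller migrates from  fetchUser(42, true)
|     //                       to   fetchUser(42, { includeDeleted: true })
|     const user = fetchUser(42, { includeDeleted: true });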
| skissane wrote:
| > To my experience, AIs can generate perfectly good code
| relatively easy things, the kind you might as well copy&paste
| from stackoverflow, and they'll very confidently generate
| subtly wrong code for anything that's non-trivial for an
| experienced programmer to write. How do people deal with this?
|
| Well, just in the last 24 hours, ChatGPT gave me solutions to
| some relatively complex problems that turned out to be
| significantly wrong.
|
| Did that mean it was a complete waste of my time? I'm not sure.
| Its broken code gave me a starting point for tinkering and
| exploring and trying to understand why it wasn't working (even
| if superficially it looked like it should). I'm not convinced I
| lost anything by trying its suggestions. And I learned some
| things in the process (e.g. asyncio doesn't play well together
| with Flask-Sock)
| wvenable wrote:
| > Or do they have 25% trivial code?
|
| We all have probably 25% or more trivial code. AI is great for
| that. I have X (table structure, model, data, etc) and I want
| to make Y with it. A lot of code is pretty much mindless
| shuffling data around.
|
| The other thing it's good for is anything pretty standard. If I'm
| using a new technology and I just want to get started with
| whatever is the best practice, it's going to do that.
|
| If I ever have to do PowerShell (I hate PowerShell), I can get
| AI to generate pretty much whatever I want and then I'm smart
| enough to fix any issues. But I really don't like starting from
| nothing in a tech I hate.
| randomNumber7 wrote:
| Yes but then it would be more logical to say "AI makes our
| devs 25% more efficient". This is not what he said, but imo
| you are obviously right.
| wvenable wrote:
| Not necessarily. If 25% of the code is written by AI but
| that code isn't very interesting or difficult, it might not
| be making the devs 25% more efficient. It could even
| possibly be more but, either way, these are different
| metrics.
| johannes1234321 wrote:
| The benefit doesn't translate 1:1. The generated code has
| to be read and verified and might require small adaptations.
| (Partially that can be done by AI as well)
|
| But for me it massively improved all the boilerplate
| generic work. A lot of those things which are just annoying
| work, but not interesting.
|
| Then I can focus on the bigger things, on the important
| parts.
| lambdasquirrel wrote:
| I've already had one job interview where the applicant seemed
| broadly knowledgeable about everything we asked them during
| lead-in questions before actual debugging. Then when they had
| to actually dig deeper or demonstrate understanding while
| solving some problem, they fell short.
|
| I'm pretty sure they weren't the first and there've been
| others we didn't know about. So now I don't ask lead-in
| questions anymore. Surprisingly, it doesn't seem to make much
| of a difference and I don't need to get burned again.
| sangnoir wrote:
| > Does Google now have 25% subtly wrong code?
|
| How do you quantify "new code" - is it by lines of code or
| number of PRs/changesets generated? I can easily see it being
| the latter - if an AI workflow suggests 1 naming-change/cleanup
| commit to your PR made of 3 other human-authored commits, has
| it authored 25% of code? Arguably, yes - but it's trivial code
| that ought to be reviewed by humans. Dependabot is responsible
| for a good chunk of PRs already.
|
| Having a monorepo brings plenty of opportunities for automation
| when refactoring - whether it's AI, AST manipulation or even
| good old grep. The trick is not to merge the code directly, but
| have humans in the loop to approve, or take-over and correct
| the code first.
| rpcope1 wrote:
| I don't get it either. People will say all sorts of strange
| stuff about how it writes the code for them or whatever, but
| even using the new Claude 3.5 Sonnet or whatever variant of
| GPT4, the moment I ask it anything that isn't the most basic
| done-to-death boilerplate, it generates stuff that's wrong, and
| often subtly wrong. If you're not at least pretty knowledgeable
| about exactly what it's generating, you'll be stuck trying to
| troubleshoot bad code, and if you are it's often about as quick
| to just write it yourself. It's especially bad if you get away
| from Python and try to make it do anything else. SQL
| especially: for whatever reason, I've seen all of the major
| players generate stuff that's either just junk or will cause
| problems (things that your run-of-the-mill DBA will catch).
|
| Honestly, I think it will become a better Intellisense but not
| much more. I'm a little excited because there's going to be so
| many people buying into this, generating so much bad code/bad
| architecture/etc. that will inevitably need someone to fix
| after the hype dies down and the rug is pulled, that I think
| there will continue to be employment opportunities.
| solumunus wrote:
| Supermaven is an incredible intellisense. Most code IS
| trivial and I barely write trivial code anymore. My imports
| appear instantly, with high accuracy. I have lots of embedded
| SQL queries and it's able to guess the structure of my
| database very accurately. As I'm writing a query the
| suggested joins are accurate probably 80% of the time. I'm
| significantly more productive and having to type much less.
| If this is as good as it ever gets I'm quite happy. I rarely
| use AI for non trivial code, but non trivial code is what I
| want to work on...
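|
| To give a feel for it, a hypothetical example (invented
| table and column names); the commented line marks roughly
| where the suggestion takes over:
|
|     const openOrdersByCustomer = `
|       SELECT c.name, o.id AS order_id, o.total
|       FROM customers AS c
|       -- from here on is the sort of join/filter the
|       -- assistant usually fills in correctly from context:
|       JOIN orders AS o ON o.customer_id = c.id
|       WHERE o.status = 'open'
|       ORDER BY o.created_at DESC
|     `;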
| ta_1138 wrote:
| This is all about the tooling most companies choose when
| building software: things with so much boilerplate that
| most code is trivial. We can build tools with far less
| triviality and more density, where the distance between
| the code we write and the business logic is very narrow...
| but then every line of code we write is hard, because it's
| meaningful, and that feels bad to many developers. So we
| end up with tools where we might not be more productive,
| but we feel productive, even though most of that apparent
| productivity is trivially generated.
|
| We also have the ceremonial layers of certain forms of
| corporate architecture, where nothing actually happens, but
| the steps must exist to match the holy box, box cylinder
| architecture. Ceremonial input massaging here, ceremonial
| data transformation over there, duplicated error
| checking... if it's easy for the LLM to do, maybe we
| shouldn't be doing it everywhere in the first place.
| thfuran wrote:
| >but then every line of code we write is hard, because
| it's meaningful, and that feels bad enough to many
| developers,
|
| I don't know that I've ever even met a developer who
| wants to be writing endless pools of trivial boilerplate
| instead of meaningful code. Even the people at work who
| are willing to say they don't want to deal with the
| ambiguity and high level design stuff and just want to be
| told what to do pretty clearly don't want endless
| drudgery.
| Aeolun wrote:
| That, but boilerplate stuff is also incredibly easy to
| understand. As compared to high density, high meaning
| code anyway. I prefer more low density low meaning code
| as it makes it much easier to reason about any part of
| the system.
| monksy wrote:
| I don't think that is the signal most people are hoping
| for here.
|
| When I hear that most code is trivial, I think of this as a
| language design or a framework related issue making things
| harder than they should be.
|
| Throwing AI or code generators at the problem just to claim
| that they fixed it is just frustrating.
| Kiro wrote:
| Interesting that you believe your subjective experience
| outweighs the claims of all others who report successfully
| using LLMs for coding. Wouldn't a more charitable
| interpretation be that it doesn't fit the stuff you're doing?
| skeeter2020 wrote:
| trivial code could very easily include the vast majority of
| most apps we're building these days. Most of it's just glue,
| and AI can probably stitch together a bunch of API calls and
| some UI as well as a human. It could also be a lot of non-
| product code, tooling, one-time things, etc.
| ithkuil wrote:
| Or perhaps that even for excellent engineers and complicated
| problems a quarter of the code one writes is stupid almost
| copy-pasteable boilerplate which is now an excellent target for
| the magic lArge text Interpolator
| eco wrote:
| I'd be terribly scared to use it in a language that isn't
| statically typed with many, many compile time error checks.
|
| Unless you're the type of programmer that is writing sabots all
| day (connecting round pegs into square holes between two data
| sources) you've got to be very critical of what these things
| are spitting out.
| randomNumber7 wrote:
| It is way more scary to use it for C or C++ than Python imo.
| cybrox wrote:
| If you use it as advanced IntelliSense/auto-complete, it's
| not any worse than with typed languages.
|
| If you just let it generate and run the code... yeah,
| probably, since you won't catch the issues at compile time.
| JohnMakin wrote:
| > To my experience, AIs can generate perfectly good code
| relatively easy things, the kind you might as well copy&paste
| from stackoverflow,
|
| This, imho, is what is happening. In the olden days, when
| StackOverflow + Google used to magically find the exact problem
| from the exact domain you needed every time - even then you'd
| often need to sift through the answers (the top-voted one was
| increasingly not the right one) to find what you needed, then
| modify it further to precisely fit whatever you were doing.
| This worked fine for me for a long time, until search rendered
| itself worthless and the overall answer quality of
| StackOverflow went down (imo). So, we are here, essentially
| doing the exact same thing in a much more expensive way, as you
| said.
|
| Regarding future employment opportunities - this rot is already
| happening and hires are coming from it, at least from what I'm
| seeing in my own domain.
| notyourwork wrote:
| To your point, I don't buy the truth of the statement. I work
| in big tech and am convinced that 25% of the code being written
| is not coming from AI.
| cybrox wrote:
| Depends if they include test code in this metric. I have found
| AI most valuable in generating test code. I usually want to
| keep tests as simple as possible, so I prefer some repetition
| over abstraction to make sure there are no issues with the
| test logic itself. AI makes this somewhat verbose process very
| easy and efficient.
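|
| A rough sketch of what I mean, assuming a jest-style runner
| (test/expect globals) and a hypothetical formatPrice helper;
| each case is spelled out instead of looped, so a failure
| reads on its own:
|
|     import { formatPrice } from "./formatPrice"; // hypothetical module
|
|     test("formats whole euros", () => {
|       expect(formatPrice(500, "EUR")).toBe("€5.00");
|     });
|
|     test("formats cents", () => {
|       expect(formatPrice(42, "EUR")).toBe("€0.42");
|     });
|
|     test("formats zero", () => {
|       expect(formatPrice(0, "EUR")).toBe("€0.00");
|     });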
| slibhb wrote:
| Most programming is trivial. Lots of non-trivial programming
| tasks can be broken down into pure, trivial sections. Then, the
| non-trivial part becomes knowing how the entire system fits
| together.
|
| I've been using LLMs for about a month now. It's a nice
| productivity gain. You do have to read generated code and
| understand it. Another useful strategy is pasting a buggy
| function and asking for revisions.
|
| I think most programmers who claim that LLMs aren't useful are
| reacting emotionally. They don't want LLMs to be useful
| because, in their eyes, that would lower the status of
| programming. This is a silly insecurity: ultimately programmers
| are useful because they can think formally better than most
| people. For the foreseeable future, there's going to be massive
| demand for that, and people who can do it will be high status.
| gorjusborg wrote:
| > Most programming is trivial
|
| That's a bold statement, and incorrect, in my opinion.
|
| At a junior level software development can be about churning
| out trivial code in a previously defined box. I don't think
| it's fair to call that 'most programming'.
| BobbyJo wrote:
| Probably overloading of the term "programming" is the issue
| here. Most "software engineering" is non-programming work.
| Most programming is not actually typing code.
|
| Most of the time, when I am typing code, the code I am
| producing is trivial, however.
| Reason077 wrote:
| A good farmer isn't likely to complain about getting a new
| tractor. But it might put a few horses out of work.
| adriand wrote:
| > Lots of non-trivial programming tasks can be broken down
| into pure, trivial sections. Then, the non-trivial part
| becomes knowing how the entire system fits together.
|
| I think that's exactly right. I used to have to create the
| puzzle pieces and then fit them together. Now, a lot of the
| time something else makes the piece and I'm just doing the
| fitting together part. Whether there will come a day when we
| just need to describe the completed puzzle remains to be
| seen.
| r14c wrote:
| From my perspective, writing out the requirements for an AI
| to produce the code I want is just as easy as writing it
| myself. There are some types of boilerplate code that I can
| see being useful to produce with an LLM, but I don't write
| them often enough to warrant actually setting up the
| workflow.
|
| Even with the debugging example, if I just read what I wrote
| I'll find the bug because I understand the language. For more
| complex bugs, I'd have to feed the LLM a large fraction of my
| codebase and at that point we're exceeding the level of
| understanding these things can have.
|
| I would be pretty happy to see an AI that can do effective
| code reviews, but until that point I probably won't bother.
| tonyedgecombe wrote:
| >I think most programmers who claim that LLMs aren't useful
| are reacting emotionally.
|
| I don't think that's true. Most programmers I speak to have
| been keen to try it out and reap some benefits.
|
| The almost universal experience has been that it works for
| trivial problems, starts injecting mistakes for harder
| problems and goes completely off the rails for anything
| really difficult.
| er4hn wrote:
| It's reasonable to say that LLMs are not completely useless.
| There is also a very valid case to make that LLMs are not
| good at generating production ready code. I have found asking
| LLMs to make me Nix flakes to be a very nice way to make use
| of Nix without learning the Nix language.
|
| As an example of not being production ready: I recently tried
| to use ChatGPT-4 to provide me with a script to manage my
| gmail labels. The APIs for these are all online, I didn't
| want to read them. ChatGPT-4 gave me a workable PoC that was
| extremely slow because it was using inefficient APIs. It then
| lied to me about better APIs existing and I realized that
| when reading the docs. The "vibes" outcome of this is that it
| can produce working slop code. For the curious I discuss this
| in more specific detail at:
| https://er4hn.info/blog/2024.10.26-gmail-labels/#using-ai-
| to...
| Aeolun wrote:
| I find a recurring theme in these kinds of comments where
| people seem to blame their laziness on the tool. The
| problem is not that the tools are imperfect, it's that you
| apparently use them in situations where you expect
| perfection.
|
| Does a carpenter blame their hammer when it fails to drive
| in a screw?
| er4hn wrote:
| I'd argue that a closer analogy is I bought a laser based
| measuring device. I point it at a distant point and it tells
| me the distance from the tip of the device to that point.
| Many people are excited that this tool will replace
| rulers and measuring tapes because of the ease of use.
|
| However this laser measuring tool is accurate within a
| range. There are a lot of factors that affect its accuracy
| like time of day, how you hold it, the material you point
| it at, etc. Sometimes these accuracy errors are minimal,
| sometimes they are pretty big. You end up getting a lot
| of measurements that seem "close enough", but you still
| need to ask if each one is correct. "Measure Twice, Cut
| Once" begins to require one measurement with the laser
| tool and one with the conventional tool when accuracy
| matters.
|
| One could have a convoluted analogy where the carpenter
| has an electric hammer that for some reason has a rounded
| head that does cause some number of nails to not go in
| cleanly, but I like my analogy better :)
| derefr wrote:
| I would add that a lot of the time when I'm programming, I'm
| an expert on the _problem_ domain but not the _solution_
| domain -- that is, I know exactly what the _pseudocode_ to
| solve my problem should look like; but I'm not necessarily
| fluent in the particular language and libraries/APIs I happen
| to have to use, in the particular codebase I'm working on, to
| operationalize that pseudocode.
|
| LLMs are great at translating already-rigorously-thought-out
| pseudocode requirements, into a specific (non-esoteric)
| programming language, with calls to (popular) libraries/APIs
| of that language. They might make little mistakes -- but so
| can human developers. If you're good at _catching_ little
| mistakes, then this can still be faster!
|
| For a concrete example of what I mean:
|
| I hardly ever code in JavaScript; I'm mostly a backend
| developer. But sometimes I want to quickly fix a problem with
| our frontend that's preventing end-to-end testing; or I want
| to add a proof-of-concept frontend half to a new backend
| feature, to demonstrate to the frontend devs by example the
| way the frontend should be using the new API endpoint.
|
| Now, I can sit down with a JS syntax + browser-DOM API cheat-
| sheet, and _probably, eventually_ write correct code that
| doesn't accidentally e.g. incorrectly reject zero or
| empty strings because they're "false-y", or incorrectly
| interpolate the literal string "null" into a template string,
| or incorrectly try to call Element.setAttribute with a
| boolean true instead of an empty string. And that's because I
| _have_ written _some_ JS, and have been bitten by those
| things, just enough times now to _recognize_ those JS code
| smells when I see them when _reviewing_ code.
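|
| A rough sketch of those pitfalls, for illustration only
| (TypeScript-flavoured, assumes a browser/DOM context):
|
|     // 1. Falsy-but-valid values: 0 and "" fail this check.
|     function bad(value?: string | number) {
|       if (!value) throw new Error("missing value"); // wrong
|     }
|     function better(value?: string | number) {
|       if (value === undefined || value === null) {
|         throw new Error("missing value"); // only reject absent values
|       }
|     }
|
|     // 2. Interpolating null/undefined yields the literal
|     //    text "null"/"undefined" in a template string.
|     const userName: string | null = null;
|     const greeting = `Hello, ${userName ?? "guest"}`;
|
|     // 3. Boolean attributes: setAttribute stringifies its
|     //    value, so pass "" rather than `true`.
|     const el = document.createElement("button");
|     el.setAttribute("disabled", "");
|     // el.setAttribute("disabled", true) would set disabled="true"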
|
| But just because I can recognize bad JS code, doesn't mean
| that I can instantly conjure to mind blocks of pre-formatted
| JS code that avoids all those pitfalls by doing everything right. I
| know the right way exists, and I've used it before, and I
| would know it if I saw it... but it's not "on the tip of my
| tongue" like it would be for languages I'm more familiar
| with.
|
| With an LLM, though, I can just tell it the pseudocode (or
| equivalent code in a language I know better), get an initial
| attempt at the JS version of it out, immediately see whether
| it passes the "sniff test"; and if it doesn't, iterate just
| by pointing out my concerns in plain English -- which will
| either result in code updated to solve the problem, or an
| explanation of why my concern isn't relevant. (Which, in the
| latter case, is a learning opportunity, but one that must
| always be followed up by independent research in non-LLM
| sources.)
|
| The product of this iteration process is basically the same
| code I would have written myself. But the iteration process
| was faster than writing the code myself.
|
| I liken this to the difference between asking someone who
| knows anatomy but only ever does sculpture, to draw a sketch
| of someone's face; vs sitting them in front of a professional
| illustrator and having them describe and iterate on the same
| face sketch. The illustrator won't perfectly understand the
| requirements of the sculptor -- but the illustrator is still
| a lot more _fluent in the medium_ than the sculptor is, and
| the sculptor still has all the _knowledge of the domain_
| (anatomy) required to recognize whether the sketch
| _matches their vision_.
| aorloff wrote:
| It's been a while since I was really fully in the trenches, but
| not that long.
|
| How people deal with this is they start by writing the test
| case.
|
| Once they have that, debugging that 25% comes relatively easily
| and after that it's basically packaging up the PR
| Kiro wrote:
| You're doing circular reasoning based on your initial concern
| actually being a problem in practice. In my experience it's
| not, which makes all your other speculations inherently
| incorrect.
| afavour wrote:
| > Or do they have 25% trivial code?
|
| If anything that's probably an underestimate. Not to downplay
| the complexity in much of what Google does but I'm sure they
| also do an absolute ton of tedious, boring CRUD operations that
| an AI could write.
| ants_everywhere wrote:
| Google's internal codebase is nicer and more structured than
| the average open source code base.
|
| Their internal AI tools are presumably trained on their code,
| and it wouldn't surprise me if the AI is capable of much more
| internally than public coding AIs are.
| fsckboy wrote:
| > _Does Google now have 25% subtly wrong code?_
|
| maybe the ai generates 100% of the company's new code, and then
| by the time the programmers have fixed it, only 25% is left of
| the AI's ship of Theseus
| vkou wrote:
| How would you react to a tech firm that in 2018, proudly
| announced that 25% of their code was generated by
| IntelliJ/Resharper/Visual Studio's codegen and autocomplete and
| refactoring tools?
| grepLeigh wrote:
| I have a whole "chop wood, carry water" speech born from
| leading corporate software teams. A lot of work at a company of
| sufficient size boils down to keeping up with software entropy
| while also chipping away at some initiative that rolls up to an
| OKR. It can be such a demotivating experience for the type of
| smart, passionate people that FAANGs like to hire.
|
| There's even a buzzword for it: KTLO (keep the lights on). You
| don't want to be spending 100% of your time on KTLO work, but
| it's unrealistic to expect to do none of it. Most software
| engineers would gladly outsource this type of scutwork.
| Cthulhu_ wrote:
| You're quick to jump to the assertion that AI only generates SO
| style utility code to do X, but it can also be used to generate
| boring mapping code (e.g. to/from SQL datasets). I heard one ex
| Google dev say that most of his job was fiddling with Protobuf
| definitions and payloads.
| bluerooibos wrote:
| At what point are people going to stop shitting on the code
| that Copilot or other LLM tools generate?
|
| > how trivial the problems they solve are
|
| A single line of code IS trivial. Simple code is good code. If
| I write the first 3 lines of a complex method and I let Copilot
| complete the 4th, that's 25% of my code written by an LLM.
|
| These tools have exploded in popularity for good reason. If
| they were no good, people wouldn't be using them.
|
| I can only assume people making such comments don't actually
| code on a daily basis and use these tools daily. Either that or
| you haven't figured out the knack of how to make it work
| properly for you.
| ZiiS wrote:
| Yes 25% of code is trivial; certainly for companies like Google
| that have always been a bit NIH.
| fuzzy2 wrote:
| I'll just answer here, but this isn't about this post in
| particular. It's about all of them. I've been struggling with a
| team of junior devs for the past months. How would I describe
| the experience? It's easy: just take any of these posts,
| replace "AI" with "junior dev", done.
|
| Except of course AI at least can do spelling. (Or at least I
| haven't encountered a problem in that regard.)
|
| I'm highly skeptical regarding LLM-assisted development. But I
| must admit: it works. If paired with an experienced senior
| developer. IMHO it must not be used otherwise.
| 0xCAP wrote:
| People overestimate faang. There are many talents working there,
| sure, but a lot of garbage gets pumped into their codebases as
| well.
| sbochins wrote:
| It's probably code that was previously machine generated that
| they're now calling "AI Generated".
| frank_nitti wrote:
| That would make sense and be a good use case, essentially doing
| what OpenAPI generators do (or Yeoman generators of yore), but
| less deterministic I'd imagine. So optimistically I would guess
| it covers ground that isn't already solved by mainstream tools.
|
| For the example of generating an http app scaffolding from an
| openapi spec, it would probably account for at least 25% of the
| text in the generated source code. But I imagine this report
| would conveniently exclude the creation of the original source
| yaml driving the generator -- I can't imagine you'd save much
| typing (or mental overhead) trying to prompt a chatbot to
| design your api spec correctly before the codegen
| otabdeveloper4 wrote:
| That explains a lot about Google's so-called "quality".
| Taylor_OD wrote:
| If we are talking about the boilerplate code and autofill syntax
| code that copilot or any other "AI" will offer me when I start
| typing... Then sure. Sounds about right.
|
| The other 75% is the stuff you actually have to think about.
|
| This feels like saying linters impact x0% of code. This just
| feels like an extension of that.
| chabes wrote:
| When Google announced their big layoffs, I noted the timing in
| relation to some big AI announcements. People here told me I was
| crazy for suggesting that corporations could replace employees
| with AI this early. Now the CEO is confirming that more than a
| quarter of new code is created by AI. Can't really deny that
| reality anymore folks.
| akira2501 wrote:
| > Can't really deny that reality anymore folks.
|
| You have to establish that the CEO is actually aware of the
| reality and is interested in accurately conveying that to you.
| As far as I can tell there is absolutely no reason to believe
| any part of this.
| paradox242 wrote:
| When leaders without the requisite technical knowledge are
| making decisions then the question of whether AI is capable of
| replacing human workers is orthogonal to the question of
| whether human workers will be replaced by AI.
| hbn wrote:
| I'd suggest the bigger factor in those layoffs is that the
| money was made in the earlier covid years, when money was
| flowing and everyone was overhiring to show off record growth;
| then none of those employees had any justification for being
| kept around and were just a money sink, so they fired them all.
|
| Not to mention Elon publicly demonstrated losing 80% of staff
| when he took over twitter and - you can complain about his
| management all you want - as someone who's been using it the
| whole way through, from a technical POV their downtimes and
| software quality have not been any worse and they're shipping
| features faster. A lot of software companies are overstaffed,
| especially Google who has spent years paying people to make
| projects just to get a PO promoted, then letting the projects
| rot and die to be replaced by something else. That's a lot of
| useless work being done.
| robohoe wrote:
| Who claims that he is speaking the truth and not some marketing
| jargon?
| randomNumber7 wrote:
| People who have replaced 25% of their brain with ai.
| Starlevel004 wrote:
| No wonder search barely works anymore
| hggigg wrote:
| I reckon he's talking bollocks. Same as IBM was when it was about
| to disguise layoffs as AI uplift and actually just shovelled the
| existing workload on to other people.
| fzysingularity wrote:
| While I get the MBA-speak of counting the lines of code that AI
| is now able to produce, it does make me think about their
| highly-curated internal codebase, which makes them well placed
| to potentially get to 50% AI-generated code.
|
| One common misconception is that all LLMs are the same. The
| models are trained the same, but trained on wildly different
| datasets. Google, and more specifically the Google codebase is
| arguably one of the most curated, and iterated on datasets in
| existence. This is a massive lever for Google to train their
| internal code-gen models, that realistically could easily replace
| any entry-level or junior developer.
|
| - Code review is another dimension of the process of maintaining
| a codebase that we can expect huge improvements with LLMs. The
| highly-curated commentary on existing code / flawed diff /
| corrected diff that Google possesses give them an opportunity to
| build a whole set of new internal tools / infra that's extremely
| tailored to their own coding standard / culture.
| morkalork wrote:
| Is the public gemini code gen LLM trained on their internal
| repo? I wonder if one could get it to cough up proprietary code
| with the right prompt.
| bqmjjx0kac wrote:
| > that realistically could easily replace any entry-level or
| junior developer.
|
| This is a massive, unsubstantiated leap.
| risyachka wrote:
| The issue is it doesn't really replace a junior dev. You become
| one - as you have to babysit it all the time, check every
| line of code, and beg it to make it work.
|
| In many cases it is counterproductive
| LudwigNagasena wrote:
| How much of that generated code is `if err != nil { return err
| }`?
| cebert wrote:
| Did AI have to go thru several rounds of Leetcode interviews?
| scottyah wrote:
| yes: https://alphacode.deepmind.com/ edit: blog link
| https://deepmind.google/discover/blog/competitive-programmin...
| twis wrote:
| How much code was "written by" autocomplete before LLMs came
| along? From my experience, LLM integration is advanced
| autocomplete. 25% is believable, but misleading.
| scottyah wrote:
| My linux terminal tab-complete has written 50% of my code
| devonbleak wrote:
| It's Go. 25% of the code is just basic error checking and
| returning nil.
| QuercusMax wrote:
| In Java, 25% of the code is import statements and curly braces
| NeoTar wrote:
| Does auto-code generation count as AI?
| contravariant wrote:
| In lisp about 50% of the code is just closing parentheses.
| harry8 wrote:
| Heh, but it can't be that, no reason to think llms can
| count brackets needing a close any more than they can count
| words.
| _spduchamp wrote:
| I can ask AI to generate the same code multiple times, and get
| new variations on programming style each time, and get the
| occasional solution that is just not quite right but sort of
| works. Sounds like a recipe for a gloppy mushy mess of style
| salad.
| soperj wrote:
| The real question is how many lines of code was it responsible
| for removing.
| randomNumber7 wrote:
| I cannot imagine this being true, because imo current LLMs'
| coding abilities are very limited. It definitely makes me more
| productive to use it as a tool, but I use it mainly for
| boilerplate and short examples (where I had to read some library
| documentation before).
|
| Whenever the problem requires thinking, it horribly fails because
| it cannot reason (yet). So unless this is also true for google
| devs, I cannot see that 25% number.
| ThinkBeat wrote:
| This is quite interesting to know.
|
| I will be curious to see if it has any impact positive or
| negative over a couple of years.
|
| Will the code be more secure since the AI does not make the
| mistakes humans do?
|
| Or will the code, not well enough understood by the employees,
| expose exploits that would not otherwise be there?
|
| Will it change average up time?
| ThinkBeat wrote:
| So um. With making this public statement, can we expect that 25%
| of "the bottom" coders at Google will soon be granted a lot more
| time and ability to spend time with their loved ones?
| SavageBeast wrote:
| Google needs to bolster their AI story and this is good click
| bait. I'm not buying it personally.
| mjhay wrote:
| 100% of Sundar Pichai could be replaced by an AI.
| wokkaflokka wrote:
| No wonder their products are getting worse and worse...
| deterministic wrote:
| Not impressed. I currently auto generate 90% or more of the code
| I need to implement business solutions. With no AI involved. Just
| high level declarations of intent auto translated to
| C++/Typescript/...
| arminiusreturns wrote:
| I was a Luddite about generative LLMs at first, as a crusty
| sysadmin type. I came around and started experimenting. It's been
| a boon for me.
|
| My conclusion is that we are at the first wave of a split between
| those who use LLMs to augment their abilities and knowledge, and
| those who delay. In cyberpunk terminology, it's aug-tech, not
| real AGI. (And the lesser one's coding abilities and the simpler
| the task, the more the benefit; it's an accelerator.)
| elzbardico wrote:
| Well. When I developed in Java, I think that Eclipse did similar
| figures circa 2005.
| marviel wrote:
| > 80% at Reasonote
| jeffbee wrote:
| It's quite amusing to me because I am old enough to remember when
| Copilot emerged the HN mainthought was that it was the death
| sentence for big corps, the scrappy independent hacker was going
| to run circles around them. But here we see the predictable
| reality: an organization that is already in an elite league in
| terms of developer velocity gets more benefit from LLM code
| assistants than Joe Hacker. These technologies serve to entrench
| and empower those who are already enormously powerful.
| davidclark wrote:
| If I tab complete my function and variable symbols, does my lsp
| write 80%+ of my lines of code?
| hi_hi wrote:
| > More than a quarter of new code created at Google is generated
| by AI, said CEO Sundar Pichai...
|
| How do they know this? At face value, it sounds like a lot, but
| it only says "new code generated". Nothing about code making it
| into source control or production, or even which of Google's
| vast business units.
|
| For all we know, this could be the result of some internal poll
| "Tell us if you've been using Goose recently" or some marketing
| analytics on the Goose "Generate" button.
|
| It's a puff piece to put Google back in the limelight, and
| everyone is lapping it up.
| Hamuko wrote:
| How do Google's IP lawyers feel about a quarter of the company's
| code not being copyrightable?
| zxvkhkxvdvbdxz wrote:
| I feel this made me lose the respect I still had for Google
| prmoustache wrote:
| Aren't we just talking about auto completion?
|
| In that case those 25% are probably the very same 25% that were
| automatically generated by LSP-based auto-completion.
| blibble wrote:
| this is the 2024 version of "25% of our code is now produced by
| outsourced resources"
| tabbott wrote:
| Without a clear explanation of methodology, this is meaningless.
| My guess is this statistic is generated using misleading
| techniques like classifying "code changes generated by existing
| bulk/automated refactoring tools" as "AI generated".
| skywhopper wrote:
| All this means is that 25% of code at Google is trivial
| boilerplate that would be better factored out of their process
| rather than handed off to inefficient LLM tools. The more they are
| willing to leave the "grunt work" to an LLM, the less likely they
| are to ever eliminate it from the process.
| skatanski wrote:
| I think at this moment, this sounds more like "quarter of the
| company's new code is created using stackoverflow and other
| forums. Many many people use all these tools to find information,
| as they did using stackoverflow a month ago, but now suddenly we
| can call it "created by AI". It'd be nice to have a distinction.
| I'm saying this, while being very excited about using LLMs as a
| developer.
| Terr_ wrote:
| [delayed]
___________________________________________________________________
(page generated 2024-10-30 23:00 UTC)