[HN Gopher] Google CEO says more than a quarter of the company's...
       ___________________________________________________________________
        
       Google CEO says more than a quarter of the company's new code is
       created by AI
        
       Author : S0y
       Score  : 192 points
       Date   : 2024-10-30 02:09 UTC (20 hours ago)
        
 (HTM) web link (www.businessinsider.com)
 (TXT) w3m dump (www.businessinsider.com)
        
       | S0y wrote:
       | https://archive.is/X43PU
        
       | ausbah wrote:
        | I would be way more impressed if LLMs could do code compression.
        | More code == more things that can break, and when LLMs can
        | generate boatloads of it with a click you can imagine what might
        | happen.
        
         | Scene_Cast2 wrote:
         | This actually sparked an idea for me. Could code complexity be
         | measured as cumulative entropy as measured by running LLM token
         | predictions on a codebase? Notably, verbose boilerplate would
         | be pretty low entropy, and straightforward code should be
         | decently low as well.
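          | 
          | Rough, untested sketch of what I have in mind (assumes the
          | Hugging Face transformers library; the model name is just a
          | placeholder, any small causal LM would do):
          | 
          |     import torch
          |     from transformers import AutoModelForCausalLM, AutoTokenizer
          | 
          |     MODEL = "gpt2"  # placeholder; swap in any small causal LM
          |     tok = AutoTokenizer.from_pretrained(MODEL)
          |     model = AutoModelForCausalLM.from_pretrained(MODEL)
          |     model.eval()
          | 
          |     def code_entropy(source: str) -> tuple[float, float]:
          |         """Return (total nats, nats per token) for a snippet."""
          |         ids = tok(source, return_tensors="pt").input_ids
          |         with torch.no_grad():
          |             # labels=ids gives mean cross-entropy over tokens
          |             loss = model(ids, labels=ids).loss.item()
          |         n_predicted = ids.shape[1] - 1
          |         return loss * n_predicted, loss
          | 
          |     total, per_tok = code_entropy(open("some_file.py").read())
          |     print(f"{total:.1f} nats total, {per_tok:.2f} nats/token")
          | 
          | Boilerplate should come out low per token; the cumulative
          | number still grows with sheer volume, which is arguably the
          | point.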
        
           | jeffparsons wrote:
           | Not quite, I think. Some kinds of redundancy are good, and
           | some are bad. Good redundancy tends to reduce mistakes rather
           | than introduce them. E.g. there's lots of redundancy in
           | natural languages, and it helps resolve ambiguity and fill in
           | blanks or corruption if you didn't hear something properly.
           | Similarly, a lot of "entropy" in code could be reduced by
           | shortening names, deleting types, etc., but all those things
           | were helping to clarify intent to other humans, thereby
           | reducing mistakes. But some is copy+paste of rules that
            | should be enforced in one place. Teaching a computer to
           | understand the difference is... hard.
           | 
           | Although, if we were to ignore all this for a second, you
           | could also make similar estimates with, e.g., gzip: the
           | higher the compression ratio attained, the more
           | "verbose"/"fluffy" the code is.
           | 
           | Fun tangent: there are a lot of researchers who believe that
           | compression and intelligence are equivalent or at least very
           | tightly linked.
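            | 
            | The gzip version of that estimate fits in a few lines
            | (illustrative only; the file name is made up):
            | 
            |     import zlib
            | 
            |     def fluffiness(source: str) -> float:
            |         raw = source.encode("utf-8")
            |         packed = zlib.compress(raw, 9)
            |         # higher ratio = more repetition / "fluff"
            |         return len(raw) / len(packed)
            | 
            |     print(fluffiness(open("handlers.py").read()))
            | 
            | With all the caveats above about good vs. bad redundancy,
            | of course.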
        
             | 8note wrote:
             | Interpreting this comment, it would predict low complexity
             | for code copied unnecessarily.
             | 
             | I'm not sure though. If it's copied a bunch of times, and
              | it actually doesn't matter because each use case of the
             | copying is linearly independent, does it matter that it was
             | copied?
             | 
              | Over time, you'd still see copies that get changed
              | independently show up as increased entropy.
        
           | malfist wrote:
           | Code complexity can already be measured deterministically
            | with cyclomatic complexity. No need to throw AI fuzzy logic
            | at this, especially when LLMs are bad at math.
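            | 
            | A deterministic rough cut needs nothing beyond the standard
            | library (real tools like radon or lizard are more thorough;
            | this just counts decision points):
            | 
            |     import ast
            | 
            |     BRANCHES = (ast.If, ast.IfExp, ast.For, ast.While,
            |                 ast.ExceptHandler)
            | 
            |     def cyclomatic_complexity(source: str) -> int:
            |         complexity = 1  # a single path through the code
            |         for node in ast.walk(ast.parse(source)):
            |             if isinstance(node, BRANCHES):
            |                 complexity += 1
            |             elif isinstance(node, ast.BoolOp):
            |                 # "a and b or c" adds len(values) - 1 branches
            |                 complexity += len(node.values) - 1
            |         return complexity
            | 
            |     print(cyclomatic_complexity(open("some_module.py").read()))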
        
             | contravariant wrote:
             | There's nothing fuzzy about letting an LLM determine the
             | probability of a particular piece of text.
             | 
             | In fact it's the one thing they are explicitly designed to
             | do, the rest is more or less a side-effect.
        
           | david-gpu wrote:
           | _> Could code complexity be measured as cumulative entropy as
           | measured by running LLM token predictions on a codebase?
           | Notably, verbose boilerplate would be pretty low entropy, and
           | straightforward code should be decently low as well._
           | 
           | WinRAR can do that for you quite effectively.
        
         | ks2048 wrote:
         | I agree. It seems like counting lines of generated code is like
         | counting bytes/instructions of compiled code - who cares? If
         | "code" becomes prompts, then AI should lead to much smaller
         | code than before.
         | 
         | I'm aware that the difference is that AI-generated code can be
          | read and modified by humans. But that's exactly why quantity is
          | bad: humans have to understand it in order to read or modify it.
        
           | TZubiri wrote:
           | What's that line about accounting for lines of code on the
           | wrong side of the balance sheet?
        
           | latexr wrote:
           | > If "code" becomes prompts, then AI should lead to much
           | smaller code than before.
           | 
           | What's the point of shorter code if you can't trust it to do
           | what it's supposed to?
           | 
           | I'll take 20 lines of code that do what they should
           | consistently over 1 line that may or may not do the task
           | depending on the direction of the wind.
        
         | asah wrote:
         | meh - the LLM code I'm seeing isn't particularly more verbose.
         | And as others have said, if you want tighter code, just add
         | that to the prompt.
         | 
         | fun story: today I had an LLM write me a non-trivial perl one-
         | liner. It tried to be verbose but I insisted and it gave me one
         | tight line.
        
         | AlexandrB wrote:
         | Exactly this. Code is a liability, if you can do the same thing
         | with less code you're often better off.
        
           | EasyMark wrote:
           | Not if it's already stable and has been running for years.
           | Legacy doesn't necessarily mean "need replacement because of
           | technical debt". I've seen lots of people want to replace
           | code that has been running basically bug free for years
           | because "there are better coding styles and practices now"
        
         | 8note wrote:
         | How would it know which edge cases are being useful and which
         | ones aren't?
         | 
         | I understand more code as being more edge cases
        
           | wvenable wrote:
           | More code could just be useless code that no longer serves
           | any purpose but still looks reasonable to the naked eye. An
            | LLM can certainly figure out and suggest that some
            | conditional is impossible given the rest of the code.
           | 
            | It can also suggest alternatives, like using existing library
           | functions for things that might have been coded manually.
        
       | jrockway wrote:
       | When I was there, way more than 25% of the code was copying one
       | proto into another proto, or so people complained. What sort of
       | memes are people making now that this task has been automated?
        
         | dietr1ch wrote:
         | I miss old memegen, but it got ruined by HR :/
        
           | rcarmo wrote:
           | I am reliably told that it is alive and well, even if it's
           | changed a bit.
        
             | anon1243 wrote:
             | Memegen is there but unrecognizable now. A dedicated
             | moderator team deletes memes, locks comments, bans people
             | for mentioning "killing a process" (threatening language!)
             | and contacts their managers.
        
               | dietr1ch wrote:
               | Yup, I simply stopped using it, which means they won.
        
         | hn_throwaway_99 wrote:
         | I am very interested in how this 25% number is calculated, and
          | if it's a lot of boilerplate that in the past would have
          | just been big copy-paste jobs, like a lot of protobuf work.
         | Would be curious if any Googlers could comment.
         | 
         | Not that I'm really discounting the value of AI here. For
         | example, I've found a ton of value and saved time getting AI to
         | write CDKTF (basically, Terraform in Typescript) config scripts
         | for me. I don't write Terraform that often, there are a ton of
         | options I always forget, etc. So asking ChatGPT to write a
          | Terraform config for, say, a new scheduled task
         | saves me from a lot of manual lookup.
         | 
         | But at the same time, the AI isn't really writing the
         | complicated logic pieces for me. I think that comes down to the
         | fact that when I do need to write complicated logic, I'm a
         | decent enough programmer that it's probably faster for me to
         | write it out in a high-level programming language than write it
         | in English first.
        
       | tylerchilds wrote:
       | if the golden rule is that code is a liability, what does this
       | headline imply?
        
         | danielmarkbruce wrote:
         | I'm sure google won't pay you money to take all their code off
         | their hands.
        
           | AlexandrB wrote:
           | But they would pay me money to audit it for security.
        
             | danielmarkbruce wrote:
             | yup, you can get paid all kinds of money to fix/guard/check
             | billion/trillion dollar assets..
        
         | eddd-ddde wrote:
          | The code would be getting written anyways; it's an invariant.
          | The difference is less time wasted typing keys (albeit a small
          | amount of time) and, more importantly (in my experience), it
          | helps A LOT with discoverability.
         | 
         | With g3's immense amount of context, LLMs can vastly help you
         | discover how other people are using existing libraries.
        
           | tylerchilds wrote:
           | my experience dabbling with the ai and code is that it is
           | terrible at coming up with new stuff unless it already exists
           | 
           | in regards to how others are using libraries, that's where
           | the technology will excel-- re-writing code. once it has a
           | stable AST to work with, the mathematical equation it is
           | solving is a refactor.
           | 
           | until it has that AST that solves the business need, the game
           | is just prompt spaghetti until it hits altitude to be able to
           | refactor.
        
         | JimDabell wrote:
         | Nothing at all. The headline talks about the proportion of code
         | written by AI. Contrary to what a lot of comments here are
         | assuming, it does not say that the volume of code written has
         | increased.
         | 
         | Google could be writing the same amount of code with fewer
         | developers (they have had multiple layoffs lately), or their
         | developers could be focusing more of their time and attention
         | on the code they do write.
        
         | contravariant wrote:
         | Well, either they just didn't spend as much time writing the
         | code or they increased their liability by about 33%.
         | 
         | The truth is likely somewhere in between.
        
       | kev009 wrote:
       | I would hope a CEO, especially a technical one, would have enough
       | sense to couple that statement to some useful business metric,
        | because in isolation it might be an announcement of public
       | humiliation.
        
         | dyauspitr wrote:
         | Or a statement of pride that the intelligence they created is
         | capable of lofty tasks.
        
         | dmix wrote:
          | The elitism of programmers who think the boilerplate code they
          | write for 25% of the job, code that's already been written by
          | 1000 other people before, is in fact a valuable use of company
          | time to write by hand again.
          | 
          | IMO it's only really an issue if a competent human wasn't
          | involved in the process: basically, a person who could have
          | written it if needed, who does the work of connecting it to
          | the useful stuff, and who has appropriate QA/testing in
          | place...the latter often taking far more effort than the actual
          | writing-the-code time itself, even when a human does it.
        
           | marcosdumay wrote:
           | If 25% of your code is boilerplate, you have a serious
           | architectural problem.
           | 
           | That said, I've seen even higher ratios. But never in any
           | place that survived for long.
        
             | cryptoz wrote:
             | Android mobile development has gotten so ...architectured
             | that I would guess most apps have a much higher rate of
             | "boilerplate" than you'd hope for.
             | 
              | Everything is getting forced into a scalable, general
              | purpose mold, such that most apps have to add a ridiculous
              | amount of boilerplate.
        
             | dmix wrote:
             | You're probably thinking of just raw codebases, your
             | company source code repo. Programmers do far, far more
             | boilerplate stuff than raw code they commit with git.
             | Debugging, data processing, system scripts, writing SQL
             | queries, etc.
             | 
              | Combine that with generic functions, framework boilerplate,
              | OS/browser stuff, or explicit x-y-z code, and your
              | 'boilerplate' (i.e. repetitive, easily reproducible code)
              | easily gets to 25% of the code your programmers write every
              | month. If your job is >75% pure human-cognition problem
              | solving, you're probably in a higher tier of job than the
              | vast majority of programmers on the planet.
        
             | hn_throwaway_99 wrote:
             | Depends on how you define "boilerplate". E.g. Terraform
             | configs count for a significant number of the total lines
             | in one of my repos. It's not really "boilerplate" in that
             | it's not the exact same everywhere, but it is boilerplate
              | in the sense that setting up, say, a pretty standard Cloud
             | SQL instance can take many, many lines of code just because
             | there are so many config options.
        
               | marcosdumay wrote:
               | Terraform is verbose.
               | 
                | It's only boilerplate if you write it again to set up
                | almost the same thing again. Which, granted, if you are
                | writing bare Terraform config, it probably is both.
                | 
                | But in either case, if your Terraform config is repetitive
                | and makes up a large part of the code of an entire thing
                | (not a repo, repos are arbitrary divisions; maybe
                | "product", though that's also a bad name), then that thing
                | is certainly close to useless.
        
             | 8note wrote:
             | Is it though? It seems to me like a team ownership boundary
             | question rather than an architecture question.
             | 
             | Architecturally, it sounds like different architecture
             | components map somewhere close to 1:1 to teams, rather than
             | teams hacking components to be closer coupled to each other
             | because they have the same ownership.
             | 
              | I'd see too much boilerplate as being an
              | organization/management issue rather than a code
              | architecture issue.
        
             | TheNewsIsHere wrote:
             | To add: it's been my experience that it's the company that
             | thinks the boilerplate code is some special, secret,
             | proprietary thing that no other business could possibly
             | have produced.
             | 
             | Not the developer who has written the same effective stanza
             | 10 times before.
        
             | wvenable wrote:
             | 25% of _new_ code might be boilerplate. All my apps in my
             | organization start out roughly the same way with all the
             | same stuff. You could argue on day one that 100% of the
             | code is boilerplate and by the end of the project it is
             | only a small percentage.
        
           | mistrial9 wrote:
           | you probably underestimate the endless miles of verbose code
           | that are possible, by human or machine but especially by
           | machine.
        
           | kev009 wrote:
           | Doing the same thing but faster might just mean you are
           | masturbating more furiously. Show me the money, especially
           | from a CEO.
        
       | an_d_rew wrote:
       | Huh.
       | 
       | That may explain why google search has, in the past couple of
       | months, become so unusable for me that I switched (happily) to
       | kagi.
        
         | twarge wrote:
         | Which uses Google results?
        
       | croes wrote:
       | Related?
       | 
       | > New tool bypasses Google Chrome's new cookie encryption system
       | 
       | https://news.ycombinator.com/item?id=41988648
        
       | hipadev23 wrote:
       | Google is now mass-producing techdebt at rates not seen since
       | Martin Fowler's first design pattern blogposts.
        
         | joeevans1000 wrote:
         | Not really technical debt when you will be able to regenerate
         | 20K lines of code in a minute then QA and deploy it
         | automatically.
        
           | kibwen wrote:
           | So a fresh, new ledger of technical debt every morning,
           | impossible to ever pay off?
        
           | 1attice wrote:
           | Assuming, of course:
           | 
            | - You know _which_ 20K lines need changing
            | - You have _perfect_ QA
            | - Nothing _ever_ goes wrong in deployment.
           | 
           | I think there's a tendency in our industry to only take the
           | hypotenuse of curves at the steepest point
        
             | TheNewsIsHere wrote:
             | That is a fantastic way to put it. I'd argue that you've
             | described a bubble, which fits perfectly with the topic and
             | where _most_ of it will eventually end up.
        
         | nelup20 wrote:
         | We've now entered the age of exponential tech debt, it'll be a
         | sight to behold
        
       | evbogue wrote:
       | I'd be turning off the autocomplete in my IDE if I was at Google.
       | Seems to double as a keylogger.
        
       | joeevans1000 wrote:
       | I read these threads and the usual 'I have to fix the AI code for
       | longer than it would have taken to write it from scratch' and
       | can't help but feel folks are truly trying to downplay what is
       | going to eat the software industry alive.
        
       | mjbale116 wrote:
       | If you manage to convince software engineers that you are doing
        | them a favour by employing them, then they will approach any
        | workplace negotiation with a mindset that will make them grab
        | the first number that gets thrown at them.
       | 
       | These statements are brilliant.
        
         | akira2501 wrote:
         | These statements rely on an unchallenged monopoly position.
         | This is not sustainable. These statements will hasten the
         | collapse.
        
       | imaginebit wrote:
        | I think he's trying to promote AI, but it somehow raises
        | questions about their code quality among some.
        
         | dietr1ch wrote:
         | I think it just shows how much noise there is in coding. Code
          | gets reviewed anyways (although review quality was going down
          | rapidly the more PMs were added to the team).
         | 
         | Most of the code must be what could be snippets (opening files
         | and handling errors with absl::, and moving data from proto to
          | proto). One thing that doesn't help here is that when writing
         | for many engineers on different teams to read, spelling out
         | simple code instead of depending on too many abstractions seems
         | to be preferred by most teams.
         | 
          | I guess that LLMs do provide smarter snippets that I don't need
          | to fill out in detail, and when they understand types and
          | whether things compile they get quite good and "smart" when it
          | comes to writing out boilerplate.
        
       | ultra_nick wrote:
       | Why work at big businesses anymore? Let's just create more
       | startups.
        
         | IAmGraydon wrote:
         | Risk appetite.
        
           | game_the0ry wrote:
           | Not so sure nowadays. Given how often big tech lays off
           | employees and the abundance of recently laid off tech talent,
           | trying to start your own company sounds a lot more appealing
           | than ever.
           | 
           | I consider myself risk-averse and even I am contemplating
           | starting a small business in the event I get laid off.
        
       | nine_zeros wrote:
       | Writing more code means more needs to be maintained and they are
       | cleverly hiding that fact. Software is a lot more like complex
       | plumbing than people want to admit:
       | 
       | More lines == more shit to maintain. Complex lines == the shit is
       | unmanageable.
       | 
       | But wall street investors love simplistic narratives such as More
       | X == More revenue. So here we are. Pretty clever marketing imo.
        
       | Tier3r wrote:
       | Google is getting enshittified. It's already visible in many
        | small ways. I was just using Google Maps, and on the route it
        | labeled X (bus) Interchange as X International. I can only assume
       | this happened because they are using AI to summarise routes now.
       | Why in the world are they doing that? They have exact location
       | names available.
        
       | 1oooqooq wrote:
        | this only means employees signed up to use the new toys and the
        | company is paying for enough seats for all employees.
        | 
        | it's like companies paying for all those todo-list and tutorial
        | apps left running on aws ec2 instances circa 2007.
       | 
       | I'd be worried if i were a google investor. lol.
        
         | fragmede wrote:
         | I'm not sure I get your point. Google created Gemini and
         | whatever internal LLM their employees are using for code
         | generation. Who are they paying, and for what seats? Not
         | Microsoft or OpenAI or Anthropic...
        
       | nosbo wrote:
       | I don't write code as I'm a sysadmin. Mostly just scripts. But is
       | this like saying intellisense writes 25% of my code? Because I
       | use autocomplete to shortcut stuff or to create a for loop to
       | fill with things I want to do.
        
         | n_ary wrote:
          | You just made it less attractive to the target corps who are
          | supposed to buy this product from Google. Saying "intellisense"
          | means corps already have licenses for various of these, and
          | some are even mostly free. Saying "AI generates 25% of our
          | code" sounds more attractive to corps, because it feels like
          | something new and novel, and you can imagine laying off 25% of
          | the personnel to justify buying this product from Google.
          | 
          | When someone who uses a product says it, there is a 50% chance
          | of it being true, but when someone far away from the user says
          | it, it is 100% promotion of the product and a setup for trust
          | building for a future sale.
        
         | coldpie wrote:
         | Looks like it's an impressive autocomplete feature, yeah. Check
         | out the video about halfway down here:
         | https://research.google/blog/ai-in-software-engineering-at-g...
         | (linked from other comment
         | https://news.ycombinator.com/item?id=41992028 )
         | 
         | Not what I thought when I heard "AI coding", but seems pretty
         | neat.
        
       | ChrisArchitect wrote:
       | Related:
       | 
       |  _Alphabet ($GOOG) 2024 Q3 earnings release_
       | 
       | https://news.ycombinator.com/item?id=41988811
        
       | ntulpule wrote:
       | Hi, I lead the teams responsible for our internal developer
       | tools, including AI features. We work very closely with Google
       | DeepMind to adapt Gemini models for Google-scale coding and other
        | Software Engineering use cases. Google has a unique, massive
        | monorepo, which poses a lot of fun challenges when it comes to
       | deploying AI capabilities at scale.
       | 
       | 1. We take a lot of care to make sure the AI recommendations are
       | safe and have a high quality bar (regular monitoring, code
       | provenance tracking, adversarial testing, and more).
       | 
       | 2. We also do regular A/B tests and randomized control trials to
       | ensure these features are improving SWE productivity and
       | throughput.
       | 
       | 3. We see similar efficiencies across all programming languages
        | and frameworks used internally at Google, and engineers across all
        | tenure and experience cohorts show similar gains in productivity.
       | 
       | You can read more on our approach here:
       | 
       | https://research.google/blog/ai-in-software-engineering-at-g...
        
         | reverius42 wrote:
         | To me the most interesting part of this is the claim that you
         | can accurately and meaningfully measure software engineering
         | productivity.
        
           | valval wrote:
           | You can come up with measures for it and then watch them,
           | that's for sure.
        
             | lr1970 wrote:
                | When a metric becomes the target it ceases to be a good
                | metric. Once developers discover how it works, they will
                | type the first character immediately after opening the log.
             | 
             | edit: typo
        
               | joshuamorton wrote:
               | Only if the developer is being judged on the thing. If
                | the _tool_ is being judged on the thing, it's much less
               | relevant.
               | 
               | That is, I, personally, am not measured on how much AI
               | generated code I create, and while the number is non-
               | zero, I can't tell you what it is because I don't care
               | and don't have any incentive to care. And I'm someone who
               | is personally fairly bearish on the value of LLM-based
               | codegen/autocomplete.
        
           | ozim wrote:
           | You can - but not on the level of a single developer and you
           | cannot use those measures to manage productivity of a
           | specific dev.
           | 
           | For teams you can measure meaningful outcomes and improve
           | team metrics.
           | 
           | You shouldn't really compare teams but it also is possible if
           | you know what teams are doing.
           | 
           | If you are some disconnected manager that thinks he can make
           | decisions or improvements reducing things to single numbers -
           | yeah that's not possible.
        
             | deely3 wrote:
             | > For teams you can measure meaningful outcomes and improve
             | team metrics.
             | 
             | How? Which metrics?
        
               | ozim wrote:
                | That is what we pay managers to figure out. They
                | should find out which ones and how by knowing the team,
                | being familiar with the domain, understanding company
                | dynamics, understanding the customer, and understanding
                | market dynamics.
        
               | seanmcdirmid wrote:
               | That's basically a non-answer. Measuring "productivity"
               | is a well known hard problem, and managers haven't really
               | figured it out...
        
               | yorwba wrote:
               | Economists are generally fine with defining productivity
               | as the ratio of aggregate outputs to aggregate inputs.
               | 
               | Measuring it is not the hard part.
               | 
               | The hard part is doing anything about it. If you can't
               | attribute specific outputs to specific inputs, you don't
               | know how to change inputs to maximize outputs. That's
               | what managers need to do, but of course they're often
               | just guessing.
        
               | seanmcdirmid wrote:
               | Measuring human productivity is hard since we can't
               | quantify output beyond silly metrics like lines of code
               | written or amount of time speaking during meetings. Maybe
               | if we were hunter/gatherers we could measure it by amount
               | of animals killed.
        
               | yorwba wrote:
               | That's why upthread we have
               | https://news.ycombinator.com/item?id=41992562
               | 
               | "You can [accurately and meaningfully measure software
               | engineering productivity] - but not on the level of a
               | single developer and you cannot use those measures to
               | manage productivity of a specific dev."
               | 
               | At the level of a company like Google, it's easy: both
               | inputs and outputs are measured in terms of money.
        
               | ozim wrote:
               | As you point back to my comment.
               | 
                | I am not an Amazon person, but from my experience
                | two-pizza teams are what worked; I never implemented it
                | myself, it's just what I observed in the wild.
                | 
                | Measuring Google in terms of money is also flawed; there
                | is loads of BS hidden there, and lots of people pay big
                | companies more just because they are big companies.
        
               | ozim wrote:
               | Well I pretty much see which team members are slacking
               | and which are working hard.
               | 
               | But I do code myself, I write requirements so I do know
               | which ones are trivial and which ones are not. I also see
               | when there are complex migrations.
               | 
               | If you work in a group of people you will also get
               | feedback - doesn't have to be snitching but still you get
               | the feel who is a slacker in the group.
               | 
                | It is hard to quantify the output if you want to be a
                | removed-from-the-group "give me a number" manager. If you
                | actually do the work of a manager and get a feel for the
                | group - who is the "Hermione Granger" nagging that others
                | are slacking (disregard their opinion), who is the "silent
                | doer", who is the "we should do it properly" bullshitter -
                | then you can make a lot of meaningful adjustments.
        
               | mdorazio wrote:
               | It's not a non-answer. Good managers need to figure out
               | what metrics make sense for the team they are managing,
               | and that will change depending on the company and team.
               | It might be new features, bug fixes, new product launch
               | milestones, customer satisfaction, ad revenue, or any of
               | a hundred other things.
        
               | randomNumber7 wrote:
               | I heard lines of code is a hot one.
        
               | seanmcdirmid wrote:
               | I would want a specific example in that case rather than
               | "the good managers figure it out" because in my
               | experience, the bad managers pretend to figure it out
               | while the good managers admit that they can't figure it
               | out. Worse still, if you tell your reports what those
               | metrics are, they will optimize them to death,
               | potentially tanking the product (I can increase my bug
               | fix count if there are more bugs to fix...).
        
               | ozim wrote:
               | So for a specific example I would have to outline 1-2
               | years of history of a team and product as a starter.
               | 
               | Then I would have to go on outlining 6-12 months of
               | trying stuff out.
               | 
               | Because if I just give "an example" I will get dozens of
               | "smart ass" replies how this specific one did not work
               | for them and I am stupid. Thanks but don't have time for
               | that or for writing an essay that no one will read anyway
               | and call me stupid or demand even more explanation. :)
        
               | anthonyskipper wrote:
                | My company uses the DORA metrics to measure the
               | productivity of teams and those metrics are incredibly
               | good.
        
           | UncleMeat wrote:
           | At scale you can do this in a bunch of interesting ways. For
           | example, you could measure "amount of time between opening a
           | crash log and writing the first character of a new change"
           | across 10,000s of engineers. Yes, each individual data point
           | is highly messy. Alice might start coding as a means of
           | investigation. Bob might like to think about the crash over
           | dinner. Carol might get a really hard bug while David gets a
           | really easy one. But at scale you can see how changes in the
           | tools change this metric.
           | 
           | None of this works to evaluate individuals or even teams. But
           | it can be effective at evaluating tools.
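            | 
            | A toy version of that aggregation, just to show the shape
            | (column names are made up):
            | 
            |     import pandas as pd
            | 
            |     events = pd.read_csv("crash_to_first_keystroke.csv")
            |     # columns: engineer_id, cohort ("tool_on"/"tool_off"), minutes
            |     print(events.groupby("cohort")["minutes"]
            |                 .agg(["median", "mean", "count"]))
            | 
            | Any single row is noise; the cohort-level medians over
            | enough volume are what tell you whether the tool moved
            | anything.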
        
             | fwip wrote:
             | There's lots of stuff you can measure. It's not clear
             | whether any of it is correlated with productivity.
             | 
             | To use your example, a user with an LLM might say "LLM
             | please fix this" as a first line of action, drastically
             | improving this metric, even if it ruins your overall
             | productivity.
        
         | fhdsgbbcaA wrote:
          | I've been thinking a lot lately about how an LLM trained on
          | really high quality code would perform.
          | 
          | I'm far from impressed with the output of GPT/Claude; all
          | they've done is weight them toward Stack Overflow, which is
          | still low quality code relative to Google's.
          | 
          | What is the probability Google makes this a real product, or is
          | it too likely to autocomplete trade secrets?
        
         | hitradostava wrote:
          | I'm continually surprised by the amount of negativity that
          | accompanies these sorts of statements. The direction of travel
          | is very clear - LLM-based systems will be writing more and more
          | code at all companies.
          | 
          | I don't think this is a bad thing - if this can be accompanied
          | by an increase in software quality, which is possible. Right
          | now it's very hit and miss and everyone has examples of LLMs
          | producing buggy or ridiculous code. But once the tooling
          | improves to:
          | 
          | 1. align produced code better to existing patterns and
          | architecture
          | 
          | 2. fix the feedback loop - with TDD, other LLM agents reviewing
          | code, feeding in compile errors, letting other LLM agents
          | interact with the produced code, etc.
          | 
          | Then we will definitely start seeing more and more code
          | produced by LLMs. Don't look at the state of the art now, look
          | at the direction of travel.
        
           | latexr wrote:
           | > if this can be accompanied by an increase in software
           | quality
           | 
           | That's a huge "if", and by your own admission not what's
           | happening now.
           | 
           | > other LLM agents reviewing code, feeding in compile errors,
           | letting other LLM agents interact with the produced code,
           | etc.
           | 
           | What a stupid future. Machines which make errors being
           | "corrected" by machines which make errors in a death spiral.
           | An unbelievable waste of figurative and literal energy.
           | 
           | > Then we will definitely start seeing more and more code
           | produced by LLMs.
           | 
           | We're already there. And there's a lot of bad code being
           | pumped out. Which will in turn be fed back to the LLMs.
           | 
            | > Don't look at the state of the art now, look at the
            | direction of travel.
           | 
           | That's what leads to the eternal "in five years" which
           | eventually sinks everyone's trust.
        
             | danielmarkbruce wrote:
             | > What a stupid future. Machines which make errors being
             | "corrected" by machines which make errors in a death
             | spiral. An unbelievable waste of figurative and literal
             | energy.
             | 
             | Humans are machines which make errors. Somehow, we got to
             | the moon. The suggestion that errors just mindlessly
             | compound and that there is no way around it, is what's
             | stupid.
        
               | nuancebydefault wrote:
               | Exactly my thought. Humans can correct humans. Machines
               | can correct, or at least point to failures in the product
               | of, machines.
        
               | reverius42 wrote:
               | To err is human. To err at scale is AI.
        
               | latexr wrote:
               | > Humans are machines
               | 
               | Even if we accept the premise (seeing humans as machines
               | is literally dehumanising and a favourite argument of
               | those who exploit them), not all machines are created
                | equal. Would you use a bicycle to file your taxes?
               | 
               | > Somehow, we got to the moon
               | 
                | Quite hand-wavy. We didn't get to the Moon by reading a
                | bunch of books from the era, then probabilistically
                | joining word fragments, passing that around the same
               | funnel a bunch of times, then blindly doing what came
               | out, that's for sure.
               | 
               | > The suggestion that errors just mindlessly compound and
               | that there is no way around it
               | 
               | Is one that you made up, as that was not my argument.
        
           | paradox242 wrote:
           | I don't see how this is sustainable. We have essentially
            | eaten the seed corn. These current LLMs have been trained on
            | an enormous corpus of mostly human-generated technical
            | knowledge from sources that we already know are currently
            | being polluted by AI-generated slop. We also have preliminary
           | research into how poorly these models do when training on
           | data generated by other LLMs. Sure, it can coast off of that
           | initial training set for maybe 5 or more years, but where
           | will the next giant set of unpolluted training data come
           | from? I just don't see it, unless we get something better
           | than LLMs which is closer to AGI or an entire industry is
           | created to explicitly create curated training data to be fed
           | to future models.
        
             | _DeadFred_ wrote:
              | These tools also require the developer class that they
              | are intended to replace to continue doing what they
              | currently do (create the knowledge sources to train the AI
              | on). It's not like the AIs are going to be creating the
              | accessible knowledge bases to train AIs on, especially for
              | new language extensions/libraries/etc. This is a
              | one-and-f'd development. It will give a one-time gain, and
              | then companies will be shocked when it falls apart and
              | there are no developers trained up (because they all had to
              | switch careers) to replace them. Unless Google's
              | expectation is that all languages/development/libraries
              | will just be static going forward.
        
             | brainwad wrote:
             | The LLM codegen at Google isn't unsupervised. It's
             | integrated into the IDE as both autocomplete and prompt-
             | based assistant, so you get a lot of feedback from a) what
             | suggestions the human accepts and b) how they fix the
             | suggestion when it's not perfect. So future iterations of
             | the model won't be trained on LLM output, but on a mixture
             | of human written code and human-corrected LLM output.
             | 
             | As a dev, I like it. It speeds up writing easy but tedious
              | code. It's just a slightly smarter version of the
              | refactoring tools already common in IDEs...
        
           | randomNumber7 wrote:
           | Because there seems to be a fundamental misunderstanding
           | producing a lot of nonsense.
           | 
           | Of course LLMs are a fantastic tool to improve productivity,
            | but current LLMs cannot produce anything novel. They can
           | only reproduce what they have seen.
        
         | LinuxBender wrote:
         | Is AI ready to crawl through all open source and find / fix all
         | the potential security bugs or all bugs for that matter? If so
         | will that become a commercial service or a free service?
         | 
         | Will AI be able to detect bugs and back doors that require
         | multiple pieces of code working together rather than being in a
         | single piece of code? Humans have a hard time with this.
         | 
          | - _Hypothetical Example: Authentication bugs in sshd that
          | require a flaw in systemd, which then requires a flaw in udev
          | or nss or PAM or some underlying library ... but looking at
          | each individual library or daemon there are no bugs that a
          | professional penetration testing organization such as the NCC
          | Group or Google's Project Zero would find._ In other words,
         | will AI soon be able to find more complex bugs in a year than
         | Tavis has found in his career and will they start to compete
         | with one another and start finding all the state sponsored
         | complex bugs and then ultimately be able to create a map that
         | suggests a common set of developers that may need to be
         | notified? Will there be a table that logs where AI found things
         | that professional human penetration testers could not?
        
           | paradox242 wrote:
           | Seems like there is more gain on the adversary side of this
           | equation. Think nation-states like North Korea or China, and
              | commercial entities like NSO Group (makers of Pegasus).
        
             | AnimalMuppet wrote:
             | Google's AI would have the advantage of the source code.
             | The adversaries would not. (At least, not without hacking
             | Google's code repository, which isn't impossible...)
        
         | mysterydip wrote:
         | I assume the amount of monitoring effort is less than the
         | amount of effort that would be required to replicate the AI
         | generated code by humans, but do you have numbers on what that
         | ROI looks like? Is it more like 10% or 200%?
        
         | hshshshshsh wrote:
         | Seems like everything is working out without any issues.
         | Shouldn't you be a bit suspicious?
        
         | Twirrim wrote:
         | > We work very closely with Google DeepMind to adapt Gemini
         | models for Google-scale coding and other Software Engineering
         | usecases.
         | 
          | Considering how terrible and frequently broken the code that
          | the public-facing Gemini produces can be, I'll have to be
          | honest: that kind of scares me.
         | 
         | Gemini frequently fails at some fairly basic stuff, even in
         | popular languages where it would have had a lot of source
         | material to work from; where other public models (even free
         | ones) sail through.
         | 
         | To give a fun, fairly recent example, here's a prime
          | factorisation algorithm it produced for Python:
          | 
          |     # Find the prime factorization of n
          |     prime_factors = []
          |     while n > 1:
          |         p = 2
          |         while n % p == 0:
          |             prime_factors.append(p)
          |             n //= p
          |         p += 1
          |     prime_factors.append(n)
         | 
         | Can you spot all the problems?
        
           | senko wrote:
           | We collectively deride leetcoding interviews yet ask AI to
           | flawlessly solve leetcode questions.
           | 
           | I bet I'd make more errors on my first try at it.
        
             | AnimalMuppet wrote:
             | Writing a prime-number factorization function is hardly
             | "leetcode".
        
               | atomic128 wrote:
               | Empirical testing (for example:
               | https://news.ycombinator.com/item?id=33293522) has
               | established that the people on Hacker News tend to be
               | junior in their skills. Understanding this fact can help
               | you understand why certain opinions and reactions are
               | more likely here. Surprisingly, the more skilled
               | individuals tend to be found on Reddit (same testing
               | performed there).
        
               | louthy wrote:
               | I'm not sure that's evidence; I looked at that and saw it
               | was written in Go and just didn't bother. As someone with
               | 40 years of coding experience and a fundamental dislike
               | of Go, I didn't feel the need to even try. So the numbers
               | can easily be skewed, surely.
        
               | atomic128 wrote:
               | Only individuals who submitted multiple bad solutions
               | before giving up were counted as failing. If you look but
               | don't bother, or submit a single bad solution, you aren't
               | counted. Thousands of individuals were tested on Hacker
               | News and Reddit, and surprisingly, it's not even close:
               | Reddit is where the hackers are. I mean, at the time of
               | the testing, years ago.
        
               | louthy wrote:
               | That doesn't change my point. It didn't test every dev on
               | all platforms, it tested a subset. That subset may well
               | have different attributes to the ones that didn't engage.
               | So, it says nothing about the audience for the forums as
               | a whole, just the few thousand that engaged.
               | 
               | Perhaps even, there could be fewer Go programmers here
               | and some just took a stab at it even though they don't
                | know the language. So it could just be selecting for which
                | forum has the most Go programmers. Hardly rigorous.
                | 
                | So I'd take that with a pinch of salt, personally.
        
               | atomic128 wrote:
               | Agreed. But remember, this isn't the only time the
               | population has been tested. This is just the test (from
               | two years ago, in 2022) that I happen to have a link to.
        
               | louthy wrote:
               | The population hasn't been tested. A subset has.
        
               | Izikiel43 wrote:
               | How is that thing testing? Is it expecting a specific
               | solution or actually running the code? I tried some
               | solutions and it complained anyways
        
               | atomic128 wrote:
               | The way the site works is explained in the first puzzle,
               | "Hack This Site". TLDR, it builds and runs your code
               | against a test suite. If your solutions weren't accepted,
               | it's because they're wrong.
        
               | 0xDEAFBEAD wrote:
               | Where is the data?
        
               | senko wrote:
               | I didn't say it's hard, but it's most definitely
               | leetcode, as in "pointless algorithmic exercise that will
               | only show you if the candidate recently worked on a
               | similar question".
               | 
               | If that doesn't satisfy, here's a similar one at
               | leetcode.com: https://leetcode.com/problems/distinct-
               | prime-factors-of-prod...
               | 
                | I would not expect a programmer of any seniority to churn
                | out stuff like that and have it working without testing.
        
               | AnimalMuppet wrote:
               | > "pointless algorithmic exercise that will only show you
               | if the candidate recently worked on a similar question".
               | 
               | I've been able to write one, not from memory but from
               | first principles, any time in the last 40 years.
        
           | gerash wrote:
           | I believe most people use AI to help them quickly figure out
            | how to use a library or an API without having to read all of
            | its (often outdated) documentation, rather than to help them
            | solve some mathematical challenge.
        
             | taeric wrote:
              | If the documentation is out of date to the point that it
              | doesn't help, that doesn't bode well for the AI's training
              | data helping it get things right, either?
        
               | macintux wrote:
               | AI can presumably integrate all of the forum discussions
               | talking about how people really use the code.
               | 
               | Assuming discussions don't happen in Slack, or Discord,
               | or...
        
               | randomNumber7 wrote:
               | And all the code on which it was trained...
        
               | woodson wrote:
               | Unfortunately, it often hallucinates wrong parameters (or
               | gets their order wrong) if there are multiple different
               | APIs for similar packages. For example, there are plenty
               | ML model inference packages, and the code suggestions for
               | NVIDIA Triton Inference Server Python code are pretty
               | much always wrong, as it generates code that's probably
               | correct for other Python ML inference packages with
               | slightly different API.
        
             | randomNumber7 wrote:
             | I think that too but google claims something else.
        
           | kgeist wrote:
           | They probably use AI for writing tests, small internal
           | tools/scripts, building generic frontends and quick
           | prototypes/demos/proofs of concept. That could easily be that
           | 25% of the code. And modern LLMs are pretty okayish with
           | that.
        
           | calf wrote:
            | We are sorely lacking a "Make Computer Science a Science"
            | movement. The tech lead's blurb is par for the course,
            | talking about "SWE productivity" with no reference to
            | scientific inquiry or a foundational understanding of the
            | safety, correctness, verification, and validation of these
            | new LLM technologies.
        
           | justinpombrio wrote:
           | > Can you spot all the problems?
           | 
           | You were probably being rhetorical, but there are two
           | problems:
           | 
           | - `p = 2` should be outside the loop
           | 
           | - `prime_factors.append(n)` appends `1` onto the end of the
           | list for no reason
           | 
           | With those two changes I'm pretty sure it's correct.
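            | 
            | For reference, with those two changes (and wrapped in a
            | function for clarity) it would read:
            | 
            |     def prime_factors(n):
            |         factors = []
            |         p = 2  # moved outside the loop
            |         while n > 1:
            |             while n % p == 0:
            |                 factors.append(p)
            |                 n //= p
            |             p += 1
            |         return factors  # no trailing append of 1
            | 
            |     print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]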
        
         | bogwog wrote:
         | Is any of the AI generated code being committed to Google's
         | open source repos, or is it only being used for
         | private/internal stuff?
        
         | wslh wrote:
         | As someone working in cybersecurity and actively researching
         | vulnerability scanning in codebases (including with LLMs), I'm
         | struggling to understand what you mean by "safe." If you're
         | referring to detecting security vulnerabilities, then you're
         | either working on a confidential project with unpublished
         | methods, or your approach is likely on par with the current
         | state of the art, which primarily addresses basic
         | vulnerabilities.
        
       | pixelat3d wrote:
       | Sooo... is this why Google sucks now?
        
       | rcarmo wrote:
       | There is a running gag among my friends using Google Chat (or
       | whatever their corporate IM tool is now called) that this
       | explains a lot of what they're experiencing while using it...
        
         | tdeck wrote:
         | I didn't know anyone outside Google actually used that...
        
       | oglop wrote:
       | No surprise. I give my career about 2 years before I'm useless.
        
         | phi-go wrote:
         | They still need someone to write 75% of the code.
        
         | k4rli wrote:
          | Seems like just overhyped tech to push up stock prices. It was
          | already claimed 2 years ago that half of the jobs would be
          | taken by "AI", but barely any have, and AI has barely improved
          | since GPT-3.5. The latest Anthropic model is only slightly
          | helpful for software development, mostly for unusual bug
          | investigations and log analysis, at least in my experience.
        
       | lysace wrote:
       | Github Copilot had an outage for me this morning. It was kind of
       | shocking. I now believe this metric. :-)
       | 
       | I'll be looking into ways of running a local LLM for this purpose
       | (code assistance in VS Code). I'm already really impressed with
       | various quite large models running on my 32 GB Mac Studio M2 Max
       | via Ollama. It feels like having a locally running chatgpt.
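        | 
        | For anyone wanting to poke at a local model from a script
        | first, something like this should work against Ollama's local
        | HTTP API (assumes the default port and that you've already
        | pulled a code model, e.g. `ollama pull codellama`):
        | 
        |     import json
        |     import urllib.request
        | 
        |     def ask_local_llm(prompt: str, model: str = "codellama") -> str:
        |         req = urllib.request.Request(
        |             "http://localhost:11434/api/generate",
        |             data=json.dumps({"model": model, "prompt": prompt,
        |                              "stream": False}).encode("utf-8"),
        |             headers={"Content-Type": "application/json"},
        |         )
        |         with urllib.request.urlopen(req) as resp:
        |             return json.loads(resp.read())["response"]
        | 
        |     print(ask_local_llm("Reverse a string in Python, one line."))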
        
         | kulahan wrote:
         | I'm very happy to hear this; maybe it's finally time to buy a
         | ton of ram for my PC! A local, private LLM would be great. I'd
         | try talking to it about stuff I don't feel comfortable being on
         | OpenAI's servers.
        
           | lysace wrote:
           | Getting lots of ram will let you run large models on the CPU,
           | but it will be so slow.
           | 
            | The Apple Silicon Macs have shared memory between the CPU
            | and GPU that lets the (relatively underpowered, compared
            | to a decent Nvidia GPU) GPU run these models at decent
            | speeds, compared with a CPU, when using llama.cpp.
           | 
           | This should all get _dramatically_ better /faster/cheaper
           | within a few years, I suspect. Capitalism will figure this
           | one out.
        
             | kulahan wrote:
             | Interesting, so this is a Mac-specific solution? That's
             | pretty cool.
             | 
             | I assume, then, that the primary goal would be to drop in
             | the beefiest GPU possible when on windows/linux?
        
               | lysace wrote:
                | With Windows/Linux I think the issue is that Nvidia is
                | artificially limiting the amount of onboard RAM (they
                | want to sell those devices for 10x more to OpenAI, etc.)
                | and that AMD for whatever reason can't get their shit
                | together.
                | 
                | I'm sure there are other, much more knowledgeable
                | people here on this topic, though.
        
         | evoke4908 wrote:
          | Ollama, Docker, and "open webui".
          | 
          | It works out of the box immediately, and that's it. I've been
          | using local LLMs on my laptop for a while; it's pretty nice.
         | 
         | The only thing you really need to worry about is VRAM. Make
         | sure your GPU has enough memory to run your model and that's
         | pretty much it.
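          | 
          | A minimal sketch of wiring this up yourself (assuming
          | Ollama's local REST API on localhost:11434 and a code model,
          | here "codellama", that you've already pulled; adjust to taste):
          | 
          |     import requests
          | 
          |     # Ask the locally running Ollama server for a completion.
          |     resp = requests.post(
          |         "http://localhost:11434/api/generate",
          |         json={
          |             "model": "codellama",   # any local code model you have pulled
          |             "prompt": "Write a Python function that parses ISO 8601 dates.",
          |             "stream": False,        # one JSON blob instead of a token stream
          |         },
          |         timeout=120,
          |     )
          |     print(resp.json()["response"])  # the generated completion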
         | 
         | Also "open webui" is the worst project name I've ever seen.
        
       | marstall wrote:
       | First thought is that much of that 25% is test code for
       | non-AI-generated code...
        
       | pfannkuchen wrote:
       | It's replaced the 25% previously copy-pasted from Stack Overflow.
        
         | brainwad wrote:
         | The split is roughly 25% AI, 25% typed, 50% pasted.
        
         | rkagerer wrote:
         | This may have been intended as a joke, but it's the only
         | explanation that reconciles the quote for me.
        
       | ryoshu wrote:
       | Spoken like an MBA who counts lines of code.
        
       | marstall wrote:
       | This maps with recent headlines about AI improving programmer
       | productivity by 20-30%.
       | 
       | Which puts it in line with previous code-generation technologies,
       | I would imagine. I wonder which of these increased productivity
       | the most?
       | 
       | - Assembly Language
       | 
       | - early Compilers
       | 
       | - databases
       | 
       | - graphics frameworks
       | 
       | - ui frameworks (windows)
       | 
       | - web apps
       | 
       | - code generators (rails scaffolding)
       | 
       | - genAI
        
         | akira2501 wrote:
         | Early Compilers. By a wide margin. They are the enabling factor
         | for everything that comes below it. It's what allows you to
         | share library interfaces and actually use them in a consistent
          | manner across multiple architectures. It entirely changed
         | the shape of software development.
         | 
         | The gap between "high level assembly" and "compiled language"
         | is about as large as it gets.
        
       | skrebbel wrote:
       | In my experience, AIs can generate perfectly good code for
       | relatively easy things, the kind you might as well copy&paste
       | from stackoverflow, and they'll very confidently generate _subtly
       | wrong_ code for anything that's non-trivial for an experienced
       | programmer to write. How do people deal with this? I simply don't
       | understand the value proposition. Does Google now have 25% subtly
       | wrong code? Or do they have 25% trivial code? Or do all their
       | engineers babysit the AI and bugfix the subtly wrong code? Or are
       | all their engineers so junior that an AI is such a substantial
       | help?
       | 
       | Like, isn't this announcement a terrible indictment of how
       | inexperienced their engineers are, or how trivial the problems
       | they solve are, or both?
        
         | tmoravec wrote:
         | Does the figure include unit tests?
        
         | airstrike wrote:
         | By definition, "trivial" code should make up a significant
         | portion of any code base, so perhaps the 25% is precisely the
         | bit that is trivial and easily automated.
        
           | Smaug123 wrote:
           | I don't think the word "definition" means what you think it
           | means!
        
         | hifromwork wrote:
         | 25% trivial code sounds like a reasonable guess.
        
           | fzysingularity wrote:
            | This seems reasonable - but I'm interpreting this as meaning
            | most junior-level coding work will end and be replaced with AI.
        
             | mrguyorama wrote:
              | And the non-junior developers will then just magically
              | appear from the aether! With 10 years of experience in a
              | four-year-old stack.
        
         | Nasrudith wrote:
         | I wouldn't call it an indictment necessarily, because so much
         | is dependent upon circumstances. They can't all be "deep
         | problems" in the real world. Projects tend to have two
         | components, "deep" work which is difficult and requires high
         | skill and cannot be made up with by using masses of
         | inexperienced and "shallow" work where being skilled doesn't
         | really help, or doesn't help too much compared to throwing more
         | bodies at the problem. To use an example it is like advanced
         | accounting vs just counting up sales receipts.
         | 
          | Even if their engineers were inexperienced, that wouldn't be an
          | indictment in itself so long as they had a sufficient amount of
          | shallow work. Using all experienced engineers to do
         | shallow work is just inefficient, like having brain surgeons
         | removing bunions. Automation is basically a way to transform
         | deep work to a producer of "free" shallow work.
         | 
          | That said, the really impressive thing with code isn't its
          | creation but the ability to losslessly delete code while
          | maintaining or improving functionality.
        
         | andyjohnson0 wrote:
         | I suspect that a lot of the hard, google-scale stuff has
         | already been done and packaged as an internal service or
         | library - and just gets re-used. So the AIs are probably
         | churning out new settings dialogs and the like.
        
         | jajko wrote:
          | I can generate POJO classes or their accessor methods in
          | Eclipse. I can let Maven build entire packages from, say, XSDs
          | (I know I am talking old boring tech, just giving an example).
          | I can copy&paste half the code (if not more) from Stack Overflow.
         | 
          | Now replace all this and much more with 'AI'. If they said AI
          | helped them increase, say, ad effectiveness by 3-5%, I'd start
          | paying attention.
        
         | akira2501 wrote:
         | > isn't this announcement a terrible indictment
         | 
         | Of obviously flawed corporate structures. This CEO has no
          | particular programming expertise, and most of his company's
          | profits do not seem to flow from this activity. I strongly
         | doubt he has a grip on the actual facts here and is
         | uncritically repeating what was told to him in a meeting.
         | 
          | He should, given his position, have been the very _first_ person to
         | ask the questions you've posed here.
        
         | jjtheblunt wrote:
         | Maybe the trick is to hide vetted correct code, of whatever
         | origin, behind function calls for documented functions, thereby
         | iteratively simplifying the work a later-trained LLM would need
         | to do?
        
         | toasteros wrote:
         | > the kind you might as well copy&paste from stackoverflow
         | 
         | This bothers me. I completely understand the conversational
         | aspect - "what approach might work for this?", "how could we
         | reduce the crud in this function?" - it worked a lot for me
         | last year when I tried learning C.
         | 
         | But the vast majority of AI use that I see is...not that. It's
         | just glorified, very expensive search. We are willing to burn
         | far, far more fuel than necessary because we've decided we
         | can't be bothered with traditional search.
         | 
         | A lot of enterprise software is poorly cobbled together using
         | stackoverflow gathered code as it is. It's part of the reason
         | why MS Teams makes your laptop run so hot. We've decided that
         | power-inefficient software is the best approach. Now we want to
         | amplify that effect by burning more fuel to get the same
         | answers, but from an LLM.
         | 
         | It's frustrating. It should be snowing where I am now, but it's
         | not. Because we want to frivolously chase false convenience and
         | burn gallons and gallons of fuel to do it. LLM usage is a part
         | of that.
        
           | chongli wrote:
           | _we 've decided we can't be bothered with traditional search_
           | 
           | Traditional search (at least on the web) is dying. The entire
           | edifice is drowning under a rapidly rising tide of spam and
           | scam sites. No one, including Google, knows what to do about
           | it so we're punting on the whole project and hoping AI will
           | swoop in like _deus ex machina_ and save the day.
        
             | petre wrote:
             | AI will generate even more spam and scam sites more
             | trivially.
        
             | romwell wrote:
             | _Narrator: it did not, in fact, save the day._
        
             | AnimalMuppet wrote:
             | But it can't save the day.
             | 
             | The problem with Google search is that it indexes all the
             | web, and there's (as you say) a rising tide of scam and
             | spam sites.
             | 
             | The problem with AI is that it scoops up all the web as
             | training data, _and there 's a rising tide of scam and spam
             | sites._
        
             | lokar wrote:
             | It took the scam/spam sites a few years to catch up to
             | Google search. Just wait a bit, equilibrium will return.
        
             | akoboldfrying wrote:
             | >The entire edifice is drowning under a rapidly rising tide
             | of spam and scam sites.
             | 
             | You make this claim with such confidence, but what is it
             | based on?
             | 
             | There have always been hordes of spam and scam websites.
             | Can you point to anything that actually indicates that the
             | ratio is now getting worse?
        
               | chongli wrote:
               | _There have always been hordes of spam and scam websites.
               | Can you point to anything that actually indicates that
               | the ratio is now getting worse?_
               | 
               | No, there haven't always been hordes of spam and scam
               | websites. I remember the web of the 90s. When Google
               | first arrived on the scene every site on the results page
               | was a real site, not a spam/scam site.
        
             | masfuerte wrote:
             | Google results are not polluted with spam because Google
             | doesn't know how to deal with it.
             | 
             | Google results are polluted with spam because it is more
             | profitable for Google. This is a conscious decision they
             | made five years ago.
        
               | chongli wrote:
               | _because it is more profitable for Google_
               | 
               | Then why are DuckDuckGo results also (arguably even more
               | so) polluted with spam/scam sites? I doubt DDG is making
               | any profit from those sites since Google essentially owns
               | the display ad business.
        
               | JohnDone wrote:
                | DDG is actually Bing. Search as a service.
        
             | photonthug wrote:
             | Maybe it is naive but I think search would probably work
             | again if they could roll back code to 10 or 15 years ago
             | and just make search engines look for text in webpages.
             | 
             | Google wasn't crushed by spam, they decided to stop doing
             | text search and build search bubbles that are user
             | specific, location-specific, decided to surface pages that
             | mention search terms in metadata instead of in text users
             | might read, etc. Oh yeah, and about a decade before LLMs
             | were actually usable, they started to sabotage simple
             | substring searches and kind of force this more
             | conversational interface. That's when simple search terms
             | stopped working very well, and you had to instead ask
             | yourself "hmm how would a very old person or a small child
             | phrase this question for a magic oracle"
             | 
             | This is how we get stuff like: Did you mean "when did
             | Shakespeare die near my location"? If anyone at google
             | cared more about quality than printing money, that thirsty
             | gambit would at least be at the bottom of the page instead
             | of the top.
        
               | layer8 wrote:
               | > just make search engines look for text in webpages.
               | 
               | Google's verbatim search option roughly does that for me
               | (plus an ad blocker that removes ads from the results
               | page). I have it activated by default as a search
               | shortcut.
               | 
               | (To activate it, one can add "tbs=li:1" as a query
               | parameter to the Google search URL.)
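                | 
                | A tiny example of what that looks like in practice (the
                | only real ingredient is the tbs=li:1 parameter; the
                | helper name is made up):
                | 
                |     from urllib.parse import urlencode
                | 
                |     def verbatim_search_url(query: str) -> str:
                |         # tbs=li:1 asks Google for verbatim results; safe=":" keeps the colon readable
                |         return "https://www.google.com/search?" + urlencode(
                |             {"q": query, "tbs": "li:1"}, safe=":")
                | 
                |     print(verbatim_search_url('error: "invalid use of incomplete type"'))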
        
             | skissane wrote:
             | I personally think a big problem with search is major
             | search engines try to be all things to all people and hence
             | suffer as a result.
             | 
             | For example: a beginner developer is possibly better served
             | by some SEO-heavy tutorial blog post; an experienced
             | developer would prefer results weighted towards the
             | official docs, the project's bug tracker and mailing list,
             | etc. But since less technical and non-technical people
             | vastly outnumber highly technical people, Google and Bing
             | end up focusing on the needs of the former, at the cost of
              | making search worse for the latter.
             | 
             | One positive about AI: if an AI is doing the search, it
             | likely wants the more advanced material not the more
             | beginner-focused one. It can take more advanced material
             | and simplify it for the benefit of less experienced users.
             | It is (I suspect) less likely to make mistakes if you ask
             | it to simplify the more advanced material than if you just
             | gave it more beginner-oriented material instead. So if AI
             | starts to replace humans as the main clients of search,
             | that may reverse some of the pressure to "dumb it down".
        
               | photonthug wrote:
               | > But since less technical and non-technical people
               | vastly outnumber highly technical people, Google and Bing
               | end up focusing on the needs of the former, at the cost
               | of making search worse for the later.
               | 
               | I mostly agree with your interesting comment, and I think
                | your analysis basically jibes with my sibling comment.
               | 
               | But one thing I take issue with is the idea that this
               | type of thing is a good faith effort, because it's more
               | like a convenient excuse. Explaining substring search or
               | even include/exclude ops to children and grandparents is
               | actually easy. Setting preferences for tutorials vs API
               | docs would also be easy. But companies don't really want
               | user-directed behavior as much as they want to herd users
               | to preferred content with algorithms, then convince the
               | user it was their idea or at least the result of
               | relatively static ranking processes.
               | 
               | The push towards more fuzzy semantic search and "related
               | content" everywhere is not to cater to novice users but
               | to blur the line between paid advertisement and organic
               | user-directed discovery.
               | 
               | No need to give megacorp the benefit of the doubt on
               | stuff like this, or make the underlying problems seem
               | harder than they are. All platforms land in this place by
               | convergent evolution wherein the driving forces are money
               | and influence, not insurmountable technical difficulties
               | or good intentions for usability.
        
             | cyanydeez wrote:
              | If only Google was trying to solve search rather than
              | shareholder value.
        
             | skydhash wrote:
             | > _Traditional search (at least on the web) is dying._
             | 
             | That's not my experience at all. While there are scammy
             | sites, using the search engines as an index instead of an
              | oracle still yields useful results. It only requires
              | learning the keywords, which you can do by reading the
              | relevant materials.
        
             | ponector wrote:
             | >> No one, including Google, knows what to do about it
             | 
              | I'm sure they could. But they have no incentive. Try to
              | Google an item, and it will show you a perfect match of
              | sponsored ads and some other not-so-relevant non-sponsored
              | results.
        
             | AtlasBarfed wrote:
             | There's no way the search AI will beat out the spamgen AI.
             | 
              | Tailoring/retraining the main search AI will be so much
              | more expensive than retraining the special-purpose spam
              | AIs.
        
             | quickthrowman wrote:
             | Google could fix the problem if they wanted to, but it's
             | not in their interests to fix it since the spam sites
             | generally buy ads from Google and/or display Google ads on
             | their spam websites. Google wants to maximize their income,
             | so..
        
             | layer8 wrote:
             | Without a usable web search index, AI will be in trouble
             | eventually as well. There is no substitute for it.
        
         | nwellinghoff wrote:
          | They probably have AI that scans existing human-written code
          | and auto-generates patches and fixes to improve performance or
          | security. The 25% is just a top-level stat with no real meaning
          | without context.
        
         | groestl wrote:
         | > do they have 25% trivial code?
         | 
         | From what I've seen on Google Cloud, both as a user and from
         | leaked source code, 25% of their code is probably just packing
         | and unpacking of protobufs.
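          | 
          | For anyone who hasn't had the pleasure, this is the kind of
          | boilerplate meant (a minimal sketch; user_pb2 is a hypothetical
          | module generated by protoc from some user.proto):
          | 
          |     import user_pb2  # hypothetical protoc-generated module
          | 
          |     # "packing": build a message and serialize it for the wire
          |     req = user_pb2.User(id=42, display_name="Ada")
          |     payload = req.SerializeToString()
          | 
          |     # "unpacking": parse the bytes back into a message
          |     resp = user_pb2.User()
          |     resp.ParseFromString(payload)
          |     print(resp.display_name)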
        
         | tobyjsullivan wrote:
         | I'm not a Google employee but I've heard enough stories to know
          | that a surprising number of code changes at Google are
          | basically updating API interfaces.
         | 
          | The way Google works, the person changing an interface is
         | responsible for updating all dependent code. They create PRs
         | which are then sent to code owners for approval. For lower-
         | level dependencies, this can involve creating thousands of PRs
         | across hundreds of projects.
         | 
         | Google has had tooling to help with these large-scale refactors
         | for decades, generally taking the form of static analysis
         | tools. However, these would be inherently limited in their
         | capability. Manual PR authoring would still be required in many
         | cases.
         | 
         | With this background, LLM code gen seems like a natural tool to
         | augment Google's existing process.
         | 
         | I expect Google is currently executing a wave of newly-
         | unblocked refactoring projects.
         | 
         | If anyone works/worked at google, feel free to correct me on
         | this.
        
         | skissane wrote:
          | > In my experience, AIs can generate perfectly good code for
         | relatively easy things, the kind you might as well copy&paste
         | from stackoverflow, and they'll very confidently generate
         | subtly wrong code for anything that's non-trivial for an
         | experienced programmer to write. How do people deal with this?
         | 
         | Well, just in the last 24 hours, ChatGPT gave me solutions to
         | some relatively complex problems that turned out to be
         | significantly wrong.
         | 
         | Did that mean it was a complete waste of my time? I'm not sure.
         | Its broken code gave me a starting point for tinkering and
         | exploring and trying to understand why it wasn't working (even
         | if superficially it looked like it should). I'm not convinced I
         | lost anything by trying its suggestions. And I learned some
         | things in the process (e.g. asyncio doesn't play well together
          | with Flask-Sock).
        
         | wvenable wrote:
         | > Or do they have 25% trivial code?
         | 
         | We all have probably 25% or more trivial code. AI is great for
         | that. I have X (table structure, model, data, etc) and I want
         | to make Y with it. A lot of code is pretty much mindless
         | shuffling data around.
         | 
          | The other thing it's good for is anything pretty standard. If I'm
         | using a new technology and I just want to get started with
         | whatever is the best practice, it's going to do that.
         | 
         | If I ever have to do PowerShell (I hate PowerShell), I can get
         | AI to generate pretty much whatever I want and then I'm smart
         | enough to fix any issues. But I really don't like starting from
         | nothing in a tech I hate.
        
           | randomNumber7 wrote:
           | Yes but then it would be more logical to say "AI makes our
           | devs 25% more efficient". This is not what he said, but imo
           | you are obviously right.
        
             | wvenable wrote:
             | Not necessarily. If 25% of the code is written by AI but
             | that code isn't very interesting or difficult, it might not
             | be making the devs 25% more efficient. It could even
             | possibly be more but, either way, these are different
             | metrics.
        
             | johannes1234321 wrote:
             | The benefit doesn't translate 1:1. The generated code has
              | to be read and verified and might require small adaptations.
             | (Partially that can be done by AI as well)
             | 
             | But for me it massively improved all the boilerplate
             | generic work. A lot of those things which are just annoying
             | work, but not interesting.
             | 
             | Then I can focus on the bigger things, on the important
             | parts.
        
           | lambdasquirrel wrote:
           | I've already had one job interview where the applicant seemed
           | broadly knowledgeable about everything we asked them during
           | lead-in questions before actual debugging. Then when they had
           | to actually dig deeper or demonstrate understanding while
           | solving some problem, they fell short.
           | 
           | I'm pretty sure they weren't the first and there've been
           | others we didn't know about. So now I don't ask lead-in
           | questions anymore. Surprisingly, it doesn't seem to make much
           | of a difference and I don't need to get burned again.
        
         | sangnoir wrote:
         | > Does Google now have 25% subtly wrong code?
         | 
         | How do you quantify "new code" - is it by lines of code or
         | number of PRs/changesets generated? I can easily see it being
         | the latter - if an AI workflow suggests 1 naming-change/cleanup
         | commit to your PR made of 3 other human-authored commits, has
         | it authored 25% of code? Arguably, yes - but it's trivial code
         | that ought to be reviewed by humans. Dependabot is responsible
         | for a good chunk of PRs already.
         | 
         | Having a monorepo brings plenty of opportunities for automation
          | when refactoring - whether it's AI, AST manipulation or even
         | good old grep. The trick is not to merge the code directly, but
         | have humans in the loop to approve, or take-over and correct
         | the code first.
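          | 
          | A minimal sketch of the non-AI end of that spectrum, using
          | Python's stdlib ast module to surface the call sites a
          | hypothetical rename of fetch_user would touch (a real codemod
          | would go on to rewrite them and emit a reviewable diff):
          | 
          |     import ast
          |     import pathlib
          | 
          |     OLD_NAME = "fetch_user"  # function being renamed in this hypothetical refactor
          | 
          |     for path in pathlib.Path("src").rglob("*.py"):
          |         tree = ast.parse(path.read_text())
          |         for node in ast.walk(tree):
          |             if (isinstance(node, ast.Call)
          |                     and isinstance(node.func, ast.Name)
          |                     and node.func.id == OLD_NAME):
          |                 # flag the call site for a human in the loop to approve
          |                 print(f"{path}:{node.lineno}: call to {OLD_NAME}")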
        
         | rpcope1 wrote:
         | I don't get it either. People will say all sorts of strange
         | stuff about how it writes the code for them or whatever, but
         | even using the new Claude 3.5 Sonnet or whatever variant of
         | GPT4, the moment I ask it anything that isn't the most basic
         | done-to-death boilerplate, it generates stuff that's wrong, and
         | often subtly wrong. If you're not at least pretty knowledgeable
         | about exactly what it's generating, you'll be stuck trying to
         | troubleshoot bad code, and if you are it's often about as quick
         | to just write it yourself. It's especially bad if you get away
         | from Python, and try to make it do anything else. SQL
         | especially, for whatever reason, I've seen all of the major
         | players generate either stuff that's just junk or will cause
         | problems (things that your run of the mill DBA will catch).
         | 
         | Honestly, I think it will become a better Intellisense but not
         | much more. I'm a little excited because there's going to be so
         | many people buying into this, generating so much bad code/bad
         | architecture/etc. that will inevitably need someone to fix
         | after the hype dies down and the rug is pulled, that I think
         | there will continue to be employment opportunities.
        
           | solumunus wrote:
           | Supermaven is an incredible intellisense. Most code IS
           | trivial and I barely write trivial code anymore. My imports
           | appear instantly, with high accuracy. I have lots of embedded
           | SQL queries and it's able to guess the structure of my
           | database very accurately. As I'm writing a query the
           | suggested joins are accurate probably 80% of the time. I'm
           | significantly more productive and having to type much less.
           | If this is as good as it ever gets I'm quite happy. I rarely
            | use AI for non-trivial code, but non-trivial code is what I
           | want to work on...
        
             | ta_1138 wrote:
              | This is all about the tooling most companies choose when
              | building software: things with enough boilerplate that most
              | code is trivial. We can build tools that have far less
             | triviality and more density, where the distance between the
             | code we write and business logic is very narrow.. but then
             | every line of code we write is hard, because it's
             | meaningful, and that feels bad enough to many developers,
             | so we end up with tools where we might not be more
             | productive, but we might feel productive, even though most
             | of that apparent productivity is trivially generated.
             | 
             | We also have the ceremonial layers of certain forms of
             | corporate architecture, where nothing actually happens, but
             | the steps must exist to match the holy box, box cylinder
             | architecture. Ceremonial input massaging here, ceremonial
             | data transformation over there, duplicated error
             | checking... if it's easy for the LLM to do, maybe we
             | shouldn't be doing it everywhere in the first place.
        
               | thfuran wrote:
               | >but then every line of code we write is hard, because
               | it's meaningful, and that feels bad enough to many
               | developers,
               | 
               | I don't know that I've ever even met a developer who
               | wants to be writing endless pools of trivial boilerplate
               | instead of meaningful code. Even the people at work who
               | are willing to say they don't want to deal with the
               | ambiguity and high level design stuff and just want to be
               | told what to do pretty clearly don't want endless
               | drudgery.
        
               | Aeolun wrote:
                | That, but boilerplate stuff is also incredibly easy to
                | understand, as compared to high-density, high-meaning
                | code anyway. I prefer more low-density, low-meaning code
                | as it makes it much easier to reason about any part of
                | the system.
        
             | monksy wrote:
             | I don't think that is the signal that I think most people
             | are hoping for here.
             | 
             | When I hear that most code is trivial, I think of this as a
             | language design or a framework related issue making things
             | harder than they should be.
             | 
              | Throwing AI or generators at the problem just to claim that
             | they fixed it is just frustrating.
        
           | Kiro wrote:
           | Interesting that you believe your subjective experience
           | outweighs the claims of all others who report successfully
           | using LLMs for coding. Wouldn't a more charitable
           | interpretation be that it doesn't fit the stuff you're doing?
        
         | skeeter2020 wrote:
         | trivial code could very easily include the vast majority of
         | most apps we're building these days. Most of it's just glue,
         | and AI can probably stitch together a bunch of API calls and
         | some UI as well as a human. It could also be a lot of non-
         | product code, tooling, one-time things, etc.
        
         | ithkuil wrote:
         | Or perhaps that even for excellent engineers and complicated
         | problems a quarter of the code one writes is stupid almost
         | copy-pasteable boilerplate which is now an excellent target for
         | the magic lArge text Interpolator
        
         | eco wrote:
         | I'd be terribly scared to use it in a language that isn't
         | statically typed with many, many compile time error checks.
         | 
         | Unless you're the type of programmer that is writing sabots all
         | day (connecting round pegs into square holes between two data
         | sources) you've got to be very critical of what these things
         | are spitting out.
        
           | randomNumber7 wrote:
           | It is way more scary to use it for C or C++ than Python imo.
        
           | cybrox wrote:
           | If you use it as advanced IntelliSense/auto-complete, it's
           | not any worse than with typed languages.
           | 
           | If you just let it generate and run the code... yeah,
           | probably, since you won't catch the issues at compile time.
        
         | JohnMakin wrote:
         | > To my experience, AIs can generate perfectly good code
         | relatively easy things, the kind you might as well copy&paste
         | from stackoverflow,
         | 
         | This, imho, is what is happening. In the olden days, when
         | StackOverflow + Google used to magically find the exact problem
         | from the exact domain you needed every time - even then you'd
         | often need to sift through the answers (top voted one was
         | increasingly not what you needed) to find what you needed, then
         | modify it further to precisely fit whatever you were doing.
          | This worked fine for me for a long time until search rendered
          | itself worthless and the overall answer quality of
          | StackOverflow went down (imo). So, here we are, essentially
         | doing the exact same thing in a much more expensive way, as you
         | said.
         | 
         | Regarding future employment opportunities - this rot is already
         | happening and hires are coming from it, at least from what I'm
         | seeing in my own domain.
        
         | notyourwork wrote:
         | To your point, I don't buy the truth of the statement. I work
         | in big tech and am convinced that 25% of the code being written
         | is not coming from AI.
        
         | cybrox wrote:
         | Depends if they include test code in this metric. I have found
         | AI most valuable in generating test code. I usually want to
         | keep tests as simple as possible, so I prefer some repetition
          | over abstraction to make sure there are no issues with the test
          | logic itself. AI makes this somewhat verbose process very easy
         | and efficient.
        
         | slibhb wrote:
         | Most programming is trivial. Lots of non-trivial programming
         | tasks can be broken down into pure, trivial sections. Then, the
         | non-trivial part becomes knowing how the entire system fits
         | together.
         | 
         | I've been using LLMs for about a month now. It's a nice
         | productivity gain. You do have to read generated code and
         | understand it. Another useful strategy is pasting a buggy
          | function and asking for revisions.
         | 
         | I think most programmers who claim that LLMs aren't useful are
         | reacting emotionally. They don't want LLMs to be useful
         | because, in their eyes, that would lower the status of
         | programming. This is a silly insecurity: ultimately programmers
         | are useful because they can think formally better than most
          | people. For the foreseeable future, there's going to be massive
         | demand for that, and people who can do it will be high status.
        
           | gorjusborg wrote:
           | > Most programming is trivial
           | 
           | That's a bold statement, and incorrect, in my opinion.
           | 
           | At a junior level software development can be about churning
            | out trivial code in a previously defined box. I don't think
            | it's fair to call that 'most programming'.
        
             | BobbyJo wrote:
             | Probably overloading of the term "programming" is the issue
             | here. Most "software engineering" is non-programming work.
             | Most programming is not actually typing code.
             | 
             | Most of the time, when I am typing code, the code I am
             | producing is trivial, however.
        
           | Reason077 wrote:
           | A good farmer isn't likely to complain about getting a new
           | tractor. But it might put a few horses out of work.
        
           | adriand wrote:
           | > Lots of non-trivial programming tasks can be broken down
           | into pure, trivial sections. Then, the non-trivial part
           | becomes knowing how the entire system fits together.
           | 
           | I think that's exactly right. I used to have to create the
           | puzzle pieces and then fit them together. Now, a lot of the
           | time something else makes the piece and I'm just doing the
           | fitting together part. Whether there will come a day when we
           | just need to describe the completed puzzle remains to be
           | seen.
        
           | r14c wrote:
           | From my perspective, writing out the requirements for an AI
           | to produce the code I want is just as easy as writing it
           | myself. There are some types of boilerplate code that I can
           | see being useful to produce with an LLM, but I don't write
           | them often enough to warrant actually setting up the
           | workflow.
           | 
           | Even with the debugging example, if I just read what I wrote
           | I'll find the bug because I understand the language. For more
           | complex bugs, I'd have to feed the LLM a large fraction of my
           | codebase and at that point we're exceeding the level of
           | understanding these things can have.
           | 
           | I would be pretty happy to see an AI that can do effective
           | code reviews, but until that point I probably won't bother.
        
           | tonyedgecombe wrote:
           | >I think most programmers who claim that LLMs aren't useful
           | are reacting emotionally.
           | 
           | I don't think that's true. Most programmers I speak to have
           | been keen to try it out and reap some benefits.
           | 
           | The almost universal experience has been that it works for
           | trivial problems, starts injecting mistakes for harder
           | problems and goes completely off the rails for anything
           | really difficult.
        
           | er4hn wrote:
           | It's reasonable to say that LLMs are not completely useless.
           | There is also a very valid case to make that LLMs are not
           | good at generating production ready code. I have found asking
           | LLMs to make me Nix flakes to be a very nice way to make use
           | of Nix without learning the Nix language.
           | 
            | As an example of not being production-ready: I recently tried
            | to use ChatGPT-4 to provide me with a script to manage my
            | Gmail labels. The APIs for these are all online; I didn't
            | want to read them. ChatGPT-4 gave me a workable PoC that was
            | extremely slow because it was using inefficient APIs. It then
            | lied to me about better APIs existing, which I realized when
            | reading the docs. The "vibes" outcome of this is that it
           | can produce working slop code. For the curious I discuss this
           | in more specific detail at:
           | https://er4hn.info/blog/2024.10.26-gmail-labels/#using-ai-
           | to...
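            | 
            | For reference, the basic label-listing call in the official
            | Python client is roughly this (a sketch, assuming the
            | google-api-python-client library is installed and creds
            | already holds OAuth credentials):
            | 
            |     from googleapiclient.discovery import build
            | 
            |     service = build("gmail", "v1", credentials=creds)  # creds from your OAuth flow
            |     result = service.users().labels().list(userId="me").execute()
            |     for label in result.get("labels", []):
            |         print(label["id"], label["name"])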
        
             | Aeolun wrote:
              | I find a recurring theme in these kinds of comments where
             | people seem to blame their laziness on the tool. The
             | problem is not that the tools are imperfect, it's that you
             | apparently use them in situations where you expect
             | perfection.
             | 
             | Does a carpenter blame their hammer when it fails to drive
             | in a screw?
        
               | er4hn wrote:
               | I'd argue that a closer analogy is I bought a laser based
               | measuring device. I point it a distant point and it tells
               | me the distance from the tip of the device to that point.
               | Many people are excited that this tool will replace
               | rulers and measuring tapes because of the ease of use.
               | 
                | However, this laser measuring tool is only accurate
                | within a range. There are a lot of factors that affect
                | its accuracy, like time of day, how you hold it, the
                | material you point it at, etc. Sometimes these accuracy
                | errors are minimal, sometimes they are pretty big. You
                | end up getting a lot of measurements that seem "close
                | enough", but you still need to ask if each one is
                | correct. "Measure Twice, Cut
               | Once" begins to require one measurement with the laser
               | tool and once with the conventional tool when accuracy
               | matters.
               | 
               | One could have a convoluted analogy where the carpenter
               | has an electric hammer that for some reason has a rounded
               | head that does cause some number of nails to not go in
               | cleanly, but I like my analogy better :)
        
           | derefr wrote:
           | I would add that a lot of the time when I'm programming, I'm
           | an expert on the _problem_ domain but not the _solution_
           | domain -- that is, I know exactly what the _pseudocode_ to
            | solve my problem should look like; but I'm not necessarily
           | fluent in the particular language and libraries/APIs I happen
           | to have to use, in the particular codebase I'm working on, to
           | operationalize that pseudocode.
           | 
           | LLMs are great at translating already-rigorously-thought-out
           | pseudocode requirements, into a specific (non-esoteric)
           | programming language, with calls to (popular) libraries/APIs
           | of that language. They might make little mistakes -- but so
           | can human developers. If you're good at _catching_ little
           | mistakes, then this can still be faster!
           | 
           | For a concrete example of what I mean:
           | 
           | I hardly ever code in JavaScript; I'm mostly a backend
           | developer. But sometimes I want to quickly fix a problem with
           | our frontend that's preventing end-to-end testing; or I want
           | to add a proof-of-concept frontend half to a new backend
           | feature, to demonstrate to the frontend devs by example the
           | way the frontend should be using the new API endpoint.
           | 
           | Now, I can sit down with a JS syntax + browser-DOM API cheat-
           | sheet, and _probably, eventually_ write correct code that
            | doesn't accidentally, e.g., incorrectly reject zero or
           | empty strings because they're "false-y", or incorrectly
           | interpolate the literal string "null" into a template string,
           | or incorrectly try to call Element.setAttribute with a
           | boolean true instead of an empty string. And that's because I
           | _have_ written _some_ JS, and have been bitten by those
           | things, just enough times now to _recognize_ those JS code
           | smells when I see them when _reviewing_ code.
           | 
           | But just because I can recognize bad JS code, doesn't mean
           | that I can instantly conjure to mind blocks of pre-formatted
            | JS code that avoid all those pitfalls by doing everything right. I
           | know the right way exists, and I've used it before, and I
           | would know it if I saw it... but it's not "on the tip of my
           | tongue" like it would be for languages I'm more familiar
           | with.
           | 
           | With an LLM, though, I can just tell it the pseudocode (or
           | equivalent code in a language I know better), get an initial
           | attempt at the JS version of it out, immediately see whether
           | it passes the "sniff test"; and if it doesn't, iterate just
           | by pointing out my concerns in plain English -- which will
           | either result in code updated to solve the problem, or an
           | explanation of why my concern isn't relevant. (Which, in the
           | latter case, is a learning opportunity, but one that must
           | always be followed up by independent research in non-LLM
           | sources.)
           | 
           | The product of this iteration process is basically the same
           | code I would have written myself. But the iteration process
           | was faster than writing the code myself.
           | 
           | I liken this to the difference between asking someone who
           | knows anatomy but only ever does sculpture, to draw a sketch
           | of someone's face; vs sitting them in front of a professional
           | illustrator and having them describe and iterate on the same
           | face sketch. The illustrator won't perfectly understand the
           | requirements of the sculptor -- but the illustrator is still
           | a lot more _fluent in the medium_ than the sculptor is, and
           | the sculptor still has all the required _knowledge of the
           | domain_ (anatomy) required to recognize whether the sketch
           | _matches their vision_.
        
         | aorloff wrote:
          | It's been a while since I was really fully in the trenches, but
         | not that long.
         | 
         | How people deal with this is they start by writing the test
         | case.
         | 
         | Once they have that, debugging that 25% comes relatively easily
          | and after that it's basically packaging up the PR.
        
         | Kiro wrote:
         | You're doing circular reasoning based on your initial concern
         | actually being a problem in practice. In my experience it's
         | not, which makes all your other speculations inherently
         | incorrect.
        
         | afavour wrote:
         | > Or do they have 25% trivial code?
         | 
         | If anything that's probably an underestimate. Not to downplay
         | the complexity in much of what Google does but I'm sure they
         | also do an absolute ton of tedious, boring CRUD operations that
         | an AI could write.
        
         | ants_everywhere wrote:
         | Google's internal codebase is nicer and more structured than
         | the average open source code base.
         | 
         | Their internal AI tools are presumably trained on their code,
         | and it wouldn't surprise me if the AI is capable of much more
         | internally than public coding AIs are.
        
         | fsckboy wrote:
         | > _Does Google now have 25% subtly wrong code?_
         | 
         | maybe the ai generates 100% of the company's new code, and then
         | by the time the programmers have fixed it, only 25% is left of
         | the AI's ship of Theseus
        
         | vkou wrote:
         | How would you react to a tech firm that in 2018, proudly
         | announced that 25% of their code was generated by
         | IntelliJ/Resharper/Visual Studio's codegen and autocomplete and
         | refactoring tools?
        
         | grepLeigh wrote:
         | I have a whole "chop wood, carry water" speech born from
         | leading corporate software teams. A lot of work at a company of
         | sufficient size boils down to keeping up with software entropy
         | while also chipping away at some initiative that rolls up to an
         | OKR. It can be such a demotivating experience for the type of
          | smart, passionate people that FAANGs like to hire.
         | 
         | There's even a buzzword for it: KTLO (keep the lights on). You
         | don't want to be spending 100% of your time on KTLO work, but
          | it's unrealistic to expect to do none of it. Most software
         | engineers would gladly outsource this type of scutwork.
        
         | Cthulhu_ wrote:
          | You're quick to jump to the assertion that AI only generates
          | SO-style utility code to do X, but it can also be used to
          | generate boring mapping code (e.g. to/from SQL datasets). I
          | heard one ex-Google dev say that most of his job was fiddling
          | with Protobuf definitions and payloads.
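          | 
          | A toy sketch of the sort of mapping code meant here (the User
          | record and column names are made up):
          | 
          |     from dataclasses import dataclass
          | 
          |     @dataclass
          |     class User:          # hypothetical domain type
          |         id: int
          |         email: str
          | 
          |     def row_to_user(row: dict) -> User:
          |         # tedious but necessary glue between a SQL result row and the domain type
          |         return User(id=row["id"], email=row["email"])
          | 
          |     print(row_to_user({"id": 1, "email": "a@example.com"}))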
        
         | bluerooibos wrote:
         | At what point are people going to stop shitting on the code
         | that Copilot or other LLM tools generate?
         | 
         | > how trivial the problems they solve are
         | 
         | A single line of code IS trivial. Simple code is good code. If
         | I write the first 3 lines of a complex method and I let Copilot
         | complete the 4th, that's 25% of my code written by an LLM.
         | 
         | These tools have exploded in popularity for good reason. If
         | they were no good, people wouldn't be using them.
         | 
         | I can only assume people making such comments don't actually
         | code on a daily basis and use these tools daily. Either that or
         | you haven't figured out the knack of how to make it work
         | properly for you.
        
         | ZiiS wrote:
         | Yes 25% of code is trivial; certainly for companies like Google
         | that have always been a bit NIH.
        
         | fuzzy2 wrote:
         | I'll just answer here, but this isn't about this post in
         | particular. It's about all of them. I've been struggling with a
          | team of junior devs for the past few months. How would I describe
         | the experience? It's easy: just take any of these posts,
         | replace "AI" with "junior dev", done.
         | 
         | Except of course AI at least can do spelling. (Or at least I
         | haven't encountered a problem in that regard.)
         | 
         | I'm highly skeptical regarding LLM-assisted development. But I
         | must admit: it works. If paired with an experienced senior
         | developer. IMHO it must not be used otherwise.
        
       | 0xCAP wrote:
       | People overestimate FAANG. There are many talented people working
       | there, sure, but a lot of garbage gets pumped into their codebases
       | as well.
        
       | sbochins wrote:
       | It's probably code that was previously machine generated that
       | they're now calling "AI Generated".
        
         | frank_nitti wrote:
         | That would make sense and be a good use case, essentially doing
         | what OpenAPI generators do (or Yeoman generators of yore), but
         | less deterministic I'd imagine. So optimistically I would guess
         | it covers ground that isn't already solved by mainstream tools.
         | 
         | For the example of generating an http app scaffolding from an
         | openapi spec, it would probably account for at least 25% of the
         | text in the generated source code. But I imagine this report
         | would conveniently exclude the creation of the original source
         | yaml driving the generator -- I can't imagine you'd save much
         | typing (or mental overhead) trying to prompt a chatbot to
          | design your API spec correctly before the codegen.
        
       | otabdeveloper4 wrote:
       | That explains a lot about Google's so-called "quality".
        
       | Taylor_OD wrote:
       | If we are talking about the boilerplate code and autofill syntax
       | code that copilot or any other "AI" will offer me when I start
       | typing... Then sure. Sounds about right.
       | 
       | The other 75% is the stuff you actually have to think about.
       | 
       | This feels like saying linters impact x0% of code. This just
       | feels like an extension of that.
        
       | chabes wrote:
       | When Google announced their big layoffs, I noted the timing in
       | relation to some big AI announcements. People here told me I was
       | crazy for suggesting that corporations could replace employees
       | with AI this early. Now the CEO is confirming that more than a
       | quarter of new code is created by AI. Can't really deny that
       | reality anymore folks.
        
         | akira2501 wrote:
         | > Can't really deny that reality anymore folks.
         | 
         | You have to establish that the CEO is actually aware of the
         | reality and is interested in accurately conveying that to you.
         | As far as I can tell there is absolutely no reason to believe
         | any part of this.
        
         | paradox242 wrote:
         | When leaders without the requisite technical knowledge are
         | making decisions then the question of whether AI is capable of
         | replacing human workers is orthogonal to the question of
         | whether human workers will be replaced by AI.
        
         | hbn wrote:
          | I'd suggest the bigger factor in those layoffs is that money
          | was flowing in the earlier Covid years and everyone was
          | overhiring to show off record growth; then none of those
          | employees had any justification for being kept around and were
          | just a money sink, so they fired them all.
         | 
          | Not to mention Elon publicly demonstrated losing 80% of staff
          | when he took over Twitter and - you can complain about his
          | management all you want - as someone who's been using it the
          | whole way through, from a technical POV their downtime and
          | software quality have not been any worse and they're shipping
          | features faster. A lot of software companies are overstaffed,
         | especially Google who has spent years paying people to make
         | projects just to get a PO promoted, then letting the projects
         | rot and die to be replaced by something else. That's a lot of
         | useless work being done.
        
         | robohoe wrote:
          | Who says he is speaking the truth and not just marketing
          | spin?
        
           | randomNumber7 wrote:
           | People who have replaced 25% of their brain with ai.
        
       | Starlevel004 wrote:
       | No wonder search barely works anymore
        
       | hggigg wrote:
       | I reckon he's talking bollocks. Same as IBM was when it was about
       | to disguise layoffs as AI uplift and actually just shovelled the
       | existing workload onto other people.
        
       | fzysingularity wrote:
       | While I get the MBA-speak of counting the lines of code that AI is
       | now able to produce, it does make me think about their highly-curated
       | internal codebase that makes them well placed to potentially get
       | to 50% AI-generated code.
       | 
       | One common misconception is that all LLMs are the same. The
       | models are trained the same, but trained on wildly different
       | datasets. Google, and more specifically the Google codebase is
       | arguably one of the most curated, and iterated on datasets in
       | existence. This is a massive lever for Google to train their
       | internal code-gen models, which realistically could easily replace
       | any entry-level or junior developer.
       | 
       | - Code review is another dimension of maintaining a codebase where
       | we can expect huge improvements from LLMs. The highly-curated
       | commentary on existing code / flawed diff / corrected diff that
       | Google possesses gives them an opportunity to
       | build a whole set of new internal tools / infra that's extremely
       | tailored to their own coding standard / culture.
        
         | morkalork wrote:
          | Is the public Gemini code-gen LLM trained on their internal
          | repo? I wonder if one could get it to cough up proprietary code
          | with the right prompt.
        
         | bqmjjx0kac wrote:
         | > that realistically could easily replace any entry-level or
         | junior developer.
         | 
         | This is a massive, unsubstantiated leap.
        
           | risyachka wrote:
            | The issue is it doesn't really replace a junior dev. You become
            | one - as you have to babysit it all the time, check every
           | line of code, and beg it to make it work.
           | 
            | In many cases it is counterproductive.
        
       | LudwigNagasena wrote:
       | How much of that generated code is `if err != nil { return err
       | }`?
        
       | cebert wrote:
       | Did AI have to go thru several rounds of Leetcode interviews?
        
         | scottyah wrote:
         | yes: https://alphacode.deepmind.com/ edit: blog link
         | https://deepmind.google/discover/blog/competitive-programmin...
        
       | twis wrote:
       | How much code was "written by" autocomplete before LLMs came
       | along? From my experience, LLM integration is advanced
       | autocomplete. 25% is believable, but misleading.
        
         | scottyah wrote:
         | My linux terminal tab-complete has written 50% of my code
        
       | devonbleak wrote:
       | It's Go. 25% of the code is just basic error checking and
       | returning nil.
        
         | QuercusMax wrote:
         | In Java, 25% of the code is import statements and curly braces
        
           | NeoTar wrote:
           | Does auto-code generation count as AI?
        
           | contravariant wrote:
           | In lisp about 50% of the code is just closing parentheses.
        
             | harry8 wrote:
             | Heh, but it can't be that; no reason to think LLMs can
             | count brackets needing a close any more than they can
             | count words.
        
       | _spduchamp wrote:
       | I can ask AI to generate the same code multiple times, and get
       | new variations on programming style each time, and get the
       | occasional solution that is just not quite right but sort of
       | works. Sounds like a recipe for a gloppy mushy mess of style
       | salad.
        
       | soperj wrote:
       | The real question is how many lines of code it was responsible
       | for removing.
        
       | randomNumber7 wrote:
       | I cannot imagine this being true, because imo current LLMs'
       | coding abilities are very limited. Using one as a tool
       | definitely makes me more productive, but I use it mainly for
       | boilerplate and short examples (where I previously had to read
       | some library documentation).
       | 
       | Whenever the problem requires thinking, it fails horribly
       | because it cannot reason (yet). So unless this is also true for
       | Google devs, I cannot see that 25% number.
        
       | ThinkBeat wrote:
       | This is quite interesting to know.
       | 
       | I will be curious to see whether it has any impact, positive or
       | negative, over a couple of years.
       | 
       | Will the code be more secure since the AI does not make the
       | mistakes humans do?
       | 
       | Or will the code, not well enough understood by the employees,
       | expose exploits that would not otherwise be there?
       | 
       | Will it change average uptime?
        
       | ThinkBeat wrote:
       | So um. With this public statement, can we expect that 25% of
       | "the bottom" coders at Google will soon be granted a lot more
       | time and ability to spend with their loved ones?
        
       | SavageBeast wrote:
       | Google needs to bolster their AI story and this is good click
       | bait. I'm not buying it personally.
        
       | mjhay wrote:
       | 100% of Sundar Pichai could be replaced by an AI.
        
       | wokkaflokka wrote:
       | No wonder their products are getting worse and worse...
        
       | deterministic wrote:
       | Not impressed. I currently auto-generate 90% or more of the code
       | I need to implement business solutions, with no AI involved:
       | just high-level declarations of intent auto-translated to
       | C++/TypeScript/...
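       | 
       | A toy version of the idea (not my actual toolchain, just to show
       | the shape: a plain declaration in, generated TypeScript out):
       | 
       |     package main
       | 
       |     import (
       |         "os"
       |         "text/template"
       |     )
       | 
       |     // A declaration of intent: plain data, no hand-written
       |     // TypeScript anywhere.
       |     type Field struct{ Name, Type string }
       | 
       |     type Entity struct {
       |         Name   string
       |         Fields []Field
       |     }
       | 
       |     var tsInterface = template.Must(template.New("ts").Parse(
       |         "export interface {{.Name}} {\n" +
       |             "{{range .Fields}}  {{.Name}}: {{.Type}};\n" +
       |             "{{end}}}\n"))
       | 
       |     func main() {
       |         invoice := Entity{
       |             Name: "Invoice",
       |             Fields: []Field{
       |                 {"id", "string"},
       |                 {"amountCents", "number"},
       |                 {"paid", "boolean"},
       |             },
       |         }
       |         // Emit a TypeScript interface derived purely from the
       |         // declaration above.
       |         err := tsInterface.Execute(os.Stdout, invoice)
       |         if err != nil {
       |             panic(err)
       |         }
       |     }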
        
       | arminiusreturns wrote:
       | I was a Luddite about generative LLMs at first, as a crusty
       | sysadmin type. I came around and started experimenting. It's
       | been a boon for me.
       | 
       | My conclusion is that we are at the first wave of a split
       | between those who use LLMs to augment their abilities and
       | knowledge, and those who delay. In cyberpunk terminology, it's
       | aug-tech, not real AGI. (And the weaker one's coding abilities
       | and the simpler the task, the greater the benefit; it's an
       | accelerator.)
        
       | elzbardico wrote:
       | Well, when I developed in Java, I think Eclipse achieved similar
       | figures circa 2005.
        
       | marviel wrote:
       | > 80% at Reasonote
        
       | jeffbee wrote:
       | It's quite amusing to me because I am old enough to remember
       | when Copilot emerged: the prevailing view on HN was that it was
       | a death sentence for big corps, and that the scrappy independent
       | hacker was going to run circles around them. But here we see the
       | predictable
       | reality: an organization that is already in an elite league in
       | terms of developer velocity gets more benefit from LLM code
       | assistants than Joe Hacker. These technologies serve to entrench
       | and empower those who are already enormously powerful.
        
       | davidclark wrote:
       | If I tab-complete my function and variable symbols, does my LSP
       | write 80%+ of my lines of code?
        
       | hi_hi wrote:
       | > More than a quarter of new code created at Google is generated
       | by AI, said CEO Sundar Pichai...
       | 
       | How do they know this? At face value, it sounds like a lot, but
       | it only says "new code generated". Nothing about code making it
       | into source control or production, or even which of Google's
       | vast business units it covers.
       | 
       | For all we know, this could be the result of some internal poll
       | "Tell us if you've been using Goose recently" or some marketing
       | analytics on the Goose "Generate" button.
       | 
       | It's a puff piece to put Google back in the limelight, and
       | everyone is lapping it up.
        
       | Hamuko wrote:
       | How do Google's IP lawyers feel about a quarter of the company's
       | code not being copyrightable?
        
       | zxvkhkxvdvbdxz wrote:
       | I feel this made me lose the respect I still had for Google.
        
       | prmoustache wrote:
       | Aren't we just talking about auto completion?
       | 
       | In that case that 25% is probably the very same 25% that was
       | automatically generated by LSP-based auto-completion.
        
       | blibble wrote:
       | this is the 2024 version of "25% of our code is now produced by
       | outsourced resources"
        
       | tabbott wrote:
       | Without a clear explanation of methodology, this is meaningless.
       | My guess is this statistic is generated using misleading
       | techniques like classifying "code changes generated by existing
       | bulk/automated refactoring tools" as "AI generated".
        
       | skywhopper wrote:
       | All this means is that 25% of code at Google is trivial
       | boilerplate that would be better factored out of their process
       | than handed off to inefficient LLM tools. The more they are
       | willing to leave the "grunt work" to an LLM, the less likely
       | they are to ever eliminate it from the process.
        
       | skatanski wrote:
       | I think at this moment, this sounds more like "a quarter of the
       | company's new code is created using Stack Overflow and other
       | forums". Many, many people use all these tools to find
       | information, as they did using Stack Overflow a month ago, but
       | now suddenly we can call it "created by AI". It'd be nice to
       | have a distinction. I'm saying this while being very excited
       | about using LLMs as a developer.
        
       | Terr_ wrote:
       | [delayed]
        
       ___________________________________________________________________
       (page generated 2024-10-30 23:00 UTC)