[HN Gopher] Study finds AI assistants help developers produce co...
       ___________________________________________________________________
        
       Study finds AI assistants help developers produce code likely to be
       buggy
        
       Author : signa11
       Score  : 162 points
       Date   : 2022-12-26 14:48 UTC (8 hours ago)
        
 (HTM) web link (www.theregister.com)
 (TXT) w3m dump (www.theregister.com)
        
       | PeterStuer wrote:
       | Isn't this just like the old copy/pasting google'd code? Easy
       | wins, hard fails.
        
       | logicallee wrote:
       | Merry Christmas :)
       | 
       | "Developers Outraged After Free Intelligent Assistant Fails to
       | Write Perfect Code on First Try"
       | 
       | In a shocking turn of events, developers are expressing their
       | frustration and outrage after their nearly-free intelligent
       | assistant failed to write perfect code on its first try. Despite
       | being able to complete various coding tasks with ease, the
       | assistant has reportedly struggled with issues such as logical
       | bugs and security vulnerabilities.
       | 
       | "I can't believe it!" exclaimed one developer. "I spent all of
       | ten dollars on this fancy new assistant, and it can't even write
       | a simple program without introducing some kind of flaw or
       | weakness. What's the point?"
       | 
       | Others have expressed similar disappointment, with some even
       | going as far as to claim that the assistant is "useless" and a
       | "complete waste of resources."
       | 
       | However, experts are quick to point out that the limitations of
       | these intelligent assistants should come as no surprise. After
       | all, they are not capable of reasoning about code or
       | understanding the intricacies of secure programming.
       | 
       | "It's important to remember that these assistants are simply
       | tools, and like any tool, they have their limitations," said one
       | industry insider. "Expecting them to write perfect code every
       | time is unrealistic and unfair."
       | 
       | Despite the disappointment and frustration, developers are still
       | finding ways to make use of these intelligent assistants,
       | recognizing that they can still be valuable resources when used
       | properly.
       | 
       | "I guess we just have to accept that there will always be some
       | limitations, and work with what we have," said one developer,
       | shaking his head in disbelief. "Who would have thought that
       | something so advanced and powerful could still be susceptible to
       | the occasional bug or security issue? The future is a strange and
       | mysterious place, indeed."
       | 
       | -
       | 
       | you know who wrote this :) transcript:
       | https://hastebin.com/raw/liwemalipu
        
       | xwdv wrote:
       | I find the best use of coding AI is for shitty little
       | inconsequential apps. I could imagine a startup founder using it
       | to feel like they are saving a lot of money on engineering costs
       | by using it for an MVP.
       | 
       | You will likely never use coding AI for things like embedded
       | systems, financial software, manufacturing, flight computers,
       | operating systems, software for managing networked
       | infrastructures, various domain specific AI...
        
         | dorolow wrote:
         | Agreed 100%. Implementing the correct algorithm/business logic
         | is more important than spitting out junk that you don't really
         | understand or won't pass code review.
         | 
         | The act of writing code is probably the easiest part of my job
         | and I still haven't really seen these tools excel at that
         | outside of toy examples.
         | 
          | On the other hand, I am pretty excited for AI code tools that
          | tell me why my code is shitty, beyond static analysis, linters,
          | etc.
        
         | alephxyz wrote:
         | That's my experience as well. I've tried Tabnine and Copilot
         | and they're both useful as fancy line autocompleters when you
          | know what you want to type. However, they're unreliable at
          | writing code directly from business logic. They'll usually
          | write code
         | that _seems_ to be doing the right thing but with very subtle
         | bugs because they don't understand the intent of the function
         | and just try to mimic what's already there.
        
           | typon wrote:
           | If anything that's what makes them more dangerous. I tried
           | Copilot for a month and ended my subscription. It will write
           | a SQL query with all the expected keywords in there: a WHERE
           | for a filter, an ORDER BY for a sort, but the way it combines
            | these things, especially recursive queries, is almost always
            | wrong, except for the most trivial queries, which I would have
            | written by hand in no time anyway.
        
         | tjpnz wrote:
         | Any industry where functional safety's a factor will be safe in
          | the long term. The rest I'm not so sure about...
        
       | doctaj wrote:
       | This tracks for me. MOST of the code I have on github is just
       | random little projects testing out stuff. I don't intend to work
       | on it long term, so I don't go out of my way to make it
       | maintainable, readable, or anything.
        
       | snickerer wrote:
        | I am an experienced "industry professional" C++ programmer. I
        | tried ChatGPT and Stack Overflow and found them impressive and
        | entertaining, but I don't know how these two platforms would be
        | helpful for my work. I never use them.
       | 
        | When I need to solve a problem I must read the (API's) docs and
        | understand them. It would be dangerous and buggy if I just
        | copied and pasted code from somewhere. I don't think it would
        | even work.
       | 
        | I can also learn from high quality examples by following them
        | and understanding the details. But Stack Overflow is full of
        | questions that feel like someone was too lazy or did not have
        | the skills to read the manual.
       | 
       | But maybe these platforms are good for learning? I am not sure if
       | ChatGPT and SO are good tools to learn to program. Maybe people
       | learn more when they take the time to find a solution by
       | themselves.
        
         | tomlue wrote:
         | I don't know what you do for your work, but I suspect you
         | haven't used chatgpt/copilot/etc enough to see the benefits.
         | 
         | Do you think tab-autocompletion is useful? AI assistants can do
         | that on steroids.
         | 
         | Do you ever use example code to learn a new package? ChatGPT
         | does that at a new level.
         | 
         | When I hear developers I work with saying stuff like the above,
         | I think it sometimes comes from a place of insecurity. It's ok
         | to find these tools useful, it doesn't mean you aren't an elite
         | programmer. In the end the best developers will find ways to
         | use these tools to learn faster and build better.
        
           | snickerer wrote:
           | I'll continue to try it. It is not like I think I am totally
           | elite and that's why I don't need code examples. It is just
            | that, in retrospect, I see that I did not find much
           | assistance in SO.
        
       | dehrmann wrote:
       | With programming, the devil really is in the details. It's why
       | off-by-one errors are even a thing.
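        | 
        | A toy Python illustration (mine, not from the article) of the
        | two usual ways it goes wrong:
        | 
        |     items = [10, 20, 30]
        | 
        |     # Off by one, too far: walks past the end -> IndexError
        |     # for i in range(len(items) + 1):
        |     #     print(items[i])
        | 
        |     # Off by one, too short: silently drops the last element
        |     for i in range(len(items) - 1):
        |         print(items[i])
        | 
        |     # Correct: iterate the list directly
        |     for item in items:
        |         print(item)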
        
       | senthilnayagam wrote:
        | I have been using GitHub Copilot for several months now; nearly
        | half the code in a side gig is written with Copilot. Then came
        | ChatGPT. I like it better for how I can communicate with it,
        | the way it generates and explains code, and the fact that I can
        | ask it to modify the generated code. I wish Copilot would get
        | ChatGPT as its core soon, as I prefer that style and the quality
        | of code I get.
        | 
        | Yes, there are occasional bugs in the code, but you can find and
        | debug them when you run the code.
        
       | moyix wrote:
       | This is a really nice study! It is very cool that they were able
       | to get professional programmers to participate, this is something
       | that is really hard to set up as an academic team. And yes, 47
       | participants is a small number, but apparently sufficient in this
       | case to detect the effect (as evidenced by the p-values). It also
       | lines up nicely with work we did last year, which looked at the
       | security of Copilot's completions without any humans in the loop
       | [1] and found that something like 40% of its completions in
       | security sensitive scenarios were vulnerable.
       | 
       | One thing I'm not sure of is how well the setup reflects how
       | people actually use IDE-integrated tools like Copilot. In the
       | experiment, users had to explicitly ask the assistant for answers
       | rather than getting them as inline completions (see Figure 7(b)
       | in the appendix). I don't know if this would change the results;
       | I could see it going either way (inline completions appear
       | automatically so might be accepted automatically, but on the
       | other hand programmers are also used to inline completions being
       | incorrect and might be more likely to reject or repair them). It
       | also means that it was up to the individual user to figure out
       | how to prompt the AI, so the results will depend a lot on how
       | users chose to phrase their prompt and how much context they
       | provided.
       | 
       | As full disclosure, I'm one of the authors on a very similar
        | study [2] that _didn't_ find any large effects on security :)
       | The main differences were:
       | 
       | - We only looked at C, rather than multiple languages. C is
       | notoriously hard to write secure code in, so the base rate of
       | vulnerabilities is likely to be higher. It's worth noting that
       | the Stanford study also didn't find a statistically significant
       | difference in security for C.
       | 
       | - Our study environment was designed to mimic Copilot much more
       | closely - so we had participants use VSCode with a plugin
       | providing inline completions from OpenAI's Codex API. This is
       | also why we used the Cushman rather than the DaVinci model:
       | Cushman's latency is much lower, which is important for realtime
       | use; it looks like GitHub made the same decision, since reverse
       | engineering of the Copilot plugin indicates it also uses Cushman
       | [3].
       | 
       | - We had participants try to code up a full library with 11
       | different functions, rather than doing a bunch of smaller
       | independent tasks. This means that the AI model had more context
       | to work with, and may have affected how users approached the
       | problem.
       | 
       | - We unfortunately only managed to get undergraduate and graduate
        | students as participants, so the baseline experience and skill
        | level of our user population may have been lower.
       | 
       | Overall I think it's clear that these models are prone to
       | spitting out insecure code right now, and this is an important
       | problem to fix (and one we're working on)! But it's still not
       | clear to me what effect this actually has on the security of code
       | written by programmers using tools like Copilot, and more
       | research is needed to figure that out.
       | 
       | [1] https://arxiv.org/abs/2108.09293
       | 
       | [2] https://arxiv.org/abs/2208.09727
       | 
       | [3] https://thakkarparth007.github.io/copilot-
       | explorer/posts/cop...
        
       | newaccount74 wrote:
       | I found ChatGPT pretty good at looking up stuff when I don't know
       | where it is documented.
       | 
       | Eg. asking it what the name of some constant or parameter is.
       | Much faster than trying to dig through Apple's confusing mess of
       | documentation, and faster than Google.
       | 
       | It does tend to give incorrect answers a lot, so I use it only
       | for stuff where I can quickly check if the answer is correct.
        
       | iamjk wrote:
       | I mean, is anyone really surprised by this? LLMs (as I understand
       | them today) only predict the next token based on previous tokens,
       | so there's no actual logical cohesion to what they produce.
        
       | lumb63 wrote:
       | Surprising. If only there were a way that we could have foreseen
       | that an AI trained to write code in part by looking at people
       | who, self-admittedly, don't know how to write code, and people
       | who write code for others with minimal context (Stack Overflow),
       | would produce buggy code. It is a case of GIGO.
       | 
       | Most developers do not learn much from Stack Overflow. Why do we
       | expect AI to fare better? In my experience, one in ten
       | (optimistically) Stack Overflow answers give anything more than a
       | code snippet with enough information to get the asker through
       | their immediate issue. It can be beneficial if you have the
       | necessary understanding already and only want the code snippet,
       | but this is no way for humans or machines to learn.
       | 
       | Also, having an "AI Assistant" must lower programmers' guards
       | against buggy code. After all, it is an assistant - it must
       | assist you, right? Subordinating humans to machines will not work
       | in this domain until there is better training data and the
       | machines can be taught the reason they are writing specific code.
       | Until then, I have low hopes for AI-generated code.
       | 
        | Even if AI could generate correct, bug-free code the majority of
        | the time (say 99.9%), I expect finding and correcting bugs
       | will be difficult for humans. For example, how many bugs are
       | found and corrected by the author of code during development,
       | versus how many in peer review? I'm reminded of a saying akin to
       | "ask someone to review 5,000 lines of code: no bugs. Ask someone
       | to review 5 lines of code: 5 bugs". We are poor critical
       | reviewers, and AI cannot fix that. AI assistants probably worsen
       | reviews, because reviewers will expect high-quality code from
       | their AI assistants.
        
         | visarga wrote:
          | AI can learn to do code review; there is plenty of data on
          | GitHub. It could also write tests and suggest possible bugs on
         | its own. Overall, using it might be better than doing it by
         | hand.
         | 
         | If you are using the AI just to write snippets of code, then it
         | is suboptimal. What it needs is to monitor execution errors and
         | fix its code over a few iterations, just like humans do.
        
           | discreteevent wrote:
           | Humans fix the code by understanding the model it represents.
            | In fact, it's often the case that bug fixing is where you
            | are forced to understand what you wrote glibly from
            | memory/pattern matching.
        
         | Brian_K_White wrote:
         | reCaptcha v5: Which lines have errors?
        
         | dmix wrote:
         | > having an "AI Assistant" must lower programmers' guards
         | against buggy code
         | 
         | Why would you assume that?
         | 
         | If it's buggy a couple times, if everyone talks about how buggy
         | and unreliable it is, it can easily become common knowledge and
         | common practice to triple check the output.
        
           | kneebonian wrote:
            | Then how much time are you actually saving if you have to
            | review everything it produces? The bottleneck was never
            | typing speed; at that point, all the AI is allowing you to
            | do is produce more buggy code more quickly.
        
             | dmix wrote:
             | I use Copilot daily and experimented with using ChatGPT for
             | real work code.
             | 
             | It's an incredibly valuable tool even with having to
             | rewrite the larger outputs... the small stuff like
             | autocompleting variables and keys is highly accurate and
             | what it most often generates (it's scary how good it is at
             | finishing the exact line you had in your head x50 times a
             | day).
             | 
             | What you need to be careful about is when it generates
             | entire functions or whole mini-modules. This is still
             | extremely useful because it gets your brain running. It
             | provides a simple template to suggest how it might look.
             | 
             | That way you're no longer starting from scratch, you see a
             | toy example with real code - for intellectual/creative work
             | having that sort of seed is super useful.
             | 
              | Imagine a writer with writer's block staring at a blank page
             | vs a generated sample of dialogue between two characters or
             | an intro paragraph to get the ball rolling.
             | 
             | Usually you have to burn a few cycles, fail a couple times
             | writing some code, to get to the point where you've written
             | something good. So it's a normal part of the process to
             | throw Version 0.1 away IMO.
        
           | AlotOfReading wrote:
           | 1) verifying code is harder than writing it and
           | 
            | 2) verifying code requires domain knowledge, which implies
            | that the utility of these models is limited to things I
            | _could_ write myself if I weren't too lazy. That's hugely
            | constricting.
        
             | dmix wrote:
             | Yes but I don't see it as generating entire blocks of code
             | you're supposed to copy and paste into your project.
             | 
             | It's like a template, a suggestion from which you can build
             | your own version.
             | 
             | Only rarely does it have the context or understanding of
              | the wider codebase to do a programmer's job for them.
             | 
              | The times when it does generate a copy/pastable function,
              | it's usually some isolated utility function like "format
              | date as DD-YYYY": something really simple and easy to
             | verify. The type of stuff you'd copy entirely from
             | Stackoverflow rather than finding a specific solution you
             | adapt.
             | 
             | It's mostly a glorified autocomplete and example suggestion
             | service. It is not a full code writing service.
             | 
             | Domain expertise will obviously still be a job requirement.
             | It's an assistant to the programmer, not an occasional
             | replacement for the programmer (and if you have domain
             | expertise you usually use a 3rd party library).
             | 
             | Maybe future versions will try to do more but that's not
             | what we have today.
        
               | AlotOfReading wrote:
               | I think that many people _will_ treat it as something
               | that can generate entire blocks of code. Unfortunately it
               | can be quite broken even just writing basic functions.
               | One of my tests was doing some quaternion rotations. It
               | did them, but refused to stop multiplying by the
               | conjugate afterwards.
               | 
               | Another was converting dates to "years ago", which was
               | broken for BCE because ChatGPT doesn't understand the
               | underlying concept.
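                | 
                | For reference, the correct rotation is one sandwich
                | product, q * v * conj(q), applied once. A minimal
                | pure-Python sketch of what I expected (mine, untested
                | against the ChatGPT output):
                | 
                |     def q_mul(a, b):
                |         # Hamilton product of (w, x, y, z) quaternions
                |         aw, ax, ay, az = a
                |         bw, bx, by, bz = b
                |         return (aw*bw - ax*bx - ay*by - az*bz,
                |                 aw*bx + ax*bw + ay*bz - az*by,
                |                 aw*by - ax*bz + ay*bw + az*bx,
                |                 aw*bz + ax*by - ay*bx + az*bw)
                | 
                |     def q_conj(q):
                |         w, x, y, z = q
                |         return (w, -x, -y, -z)
                | 
                |     def rotate(v, q):
                |         # rotate vector v by unit quaternion q
                |         p = (0.0,) + tuple(v)
                |         w, x, y, z = q_mul(q_mul(q, p), q_conj(q))
                |         return (x, y, z)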
        
         | ape4 wrote:
         | Does it look at the questions on Stack Overflow? That would be
         | silly. But yeah, even the answers are far from perfect - they
         | might solve the immediate problem but lack error checking, use
         | undocumented features, etc.
        
         | digitalsushi wrote:
         | If I had a little robot riding in the passenger seat that could
         | tell me whether to go left, straight, or right, and it was
         | correct 90% of the time, I'd think that was pretty great. I'd
         | get where I needed to be, even with a couple mishaps.
         | 
         | ML code suggestions are the same thing to me. If I don't know
         | where I am going, I can just ask it for suggestions. And it's
         | probably going to be what I want.
         | 
         | In both cases, I am annoyed with myself for having started
          | before I knew where I wanted to end up.
        
           | synu wrote:
           | One in ten turns being told to do the wrong thing sounds
           | incredibly annoying. Maybe if you're driving on the highway
           | for long stretches, but otherwise that would be awful.
        
           | Marazan wrote:
           | 90% right is superficially impressive but in actual practice
           | is abysmal.
           | 
           | Voice recognition software needed to get to 99.9% accuracy to
            | be actually usable.
        
           | bo1024 wrote:
           | The difference is that with code suggestions, you don't
           | necessarily notice errors, even extremely important errors.
        
           | wolfram74 wrote:
            | Your example hinges on at least two things:
           | 
           | 1) How many turns do you take on a particular trip
           | 
            | 2) How do those wrong turns end up? Whether it's "travel
            | time extended by 30 seconds" or "my car, the car I hit and
            | the side of this building are all in shambles" changes what
            | a 10% failure rate means a lot.
        
             | lowbloodsugar wrote:
             | Right? Took a turn the wrong way down a one way street. Hit
             | a lady with a stroller who was looking the other way. She
             | is a surgeon. Her husband is a lawyer. You killed the kid.
             | Your life is _over_.
        
           | Salgat wrote:
            | The problem with ML is that it's pattern recognition; it's an
            | approximation. Code is absolute; it's logic that is
            | interpreted very literally and very exactly. This is what
           | makes it so dangerous for coding; it creates code that's
           | convincing to humans but with deviations that allow for all
           | sorts of bugs. And the worst part is, since you didn't write
           | the code, you may not have the skills (or time) to figure out
           | if those bugs exist, especially if the ML is extremely
           | convincing/clever in what it writes. I would argue that this
           | overhead is even worse for productivity over just writing it
           | yourself.
        
             | visarga wrote:
              | Expand your view. AI can write tests, read error messages,
              | and find bugs in your code; we just need to give it the
              | task.
              | 
              | Let's think about tests. You write a function, the AI
              | writes a few tests for you. Maybe you need to add a few
              | more. But it's better to have tests, and you might have
              | missed one of them.
              | 
              | Error messages - we rely on error messages to make this
              | leap from "code parrots" to "bug free". Most of our code
              | fails the first time we run it. We're just fancy pattern
              | matchers too, but we have a runtime. So the AI could also
              | fix its bugs all alone, given the opportunity.
              | 
              | Finding bugs - we can train AI to spot bugs. It can become
              | an excellent tool to check not just AI code, but also human
              | code. Having a bug detector running in the background would
              | be great, even if it is not perfect.
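              | 
              | For a trivial helper, the kind of tests it will happily
              | draft look something like this (hypothetical sketch,
              | pytest style):
              | 
              |     def clamp(x, lo, hi):
              |         return max(lo, min(x, hi))
              | 
              |     def test_clamp_within_range():
              |         assert clamp(5, 0, 10) == 5
              | 
              |     def test_clamp_below_range():
              |         assert clamp(-3, 0, 10) == 0
              | 
              |     def test_clamp_above_range():
              |         assert clamp(42, 0, 10) == 10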
        
           | Jensson wrote:
           | > If I had a little robot riding in the passenger seat that
           | could tell me whether to go left, straight, or right, and it
           | was correct 90% of the time, I'd think that was pretty great.
           | I'd get where I needed to be, even with a couple mishaps.
           | 
            | Pretty sure you wouldn't when it starts telling you to make
            | illegal turns, or to turn where there are no roads, etc.;
            | that is the state of language models for code. You'd likely
            | run over a person or get stopped by police at some point if
            | you listened to it, and then it is easier to just turn it
            | off so you can focus on driving. A bad assistant is much
            | worse than no assistant.
           | 
            | Edit: And these models are more like Tesla's Autopilot than an
           | assistant giving you directions, since you have to oversee
           | what it does instead of letting it tell you what to do. An
           | autopilot that does the wrong thing 10% of the time is
           | horrible.
        
         | saurik wrote:
         | > If only there were a way that we could have foreseen that an
         | AI trained to write code in part by looking at people who,
         | self-admittedly, don't know how to write code, and people who
         | write code for others with minimal context (Stack Overflow),
         | would produce buggy code. It is a case of GIGO.
         | 
         | So, I'll claim the real issue is just that this generation of
         | AI isn't able to "learn", it merely "trains": if I were alone
         | in a room for years and you gave me a book on how to program
         | that has an error in it, during my careful study of the book
         | (without a computer to test on!), I am likely to notice the
         | error, get annoyed at the author trying to figure out if I
         | failed to understand some special case, and then eventually
         | decide the author was wrong. With only the knowledge from the
         | book, I will also be able to study the concepts of programming
         | and will eventually be able to design large complex systems;
         | again: I will be able to do this even if I don't have a
         | computer, in the same way people have studied math for
         | millennia.
         | 
         | And like, this is how we all learned to program, right? The
         | books and tutorials we learn to program with often suck; but,
         | after years dedicated to our craft synthesizing _the best_ of
         | what we learn, we not only can become better than any one of
         | the sources we learned from, given enough time to devote to
         | practice and self-study we can become better than _all_ of
          | them, both combined and on average (and if we couldn't, then
         | of course no progress could ever be made by a human).
         | 
         | With a human, garbage in can lead to something fully legitimate
         | out! A single sentence by someone saying "never do X, because Y
         | can happen, where Y is extremely important" can cause us to
         | throw out immense amounts of material we already learned.
         | Somewhere, GitHub Copilot has seen code that was purposefully
         | documented with bugs (the kind we use to train humans for
         | "capture the flag events") as well as correct code with
         | comments explaining how to avoid potential bugs... it just
         | didn't "give a shit", and so it is more likely to do something
         | ridiculous like generate code with a bug in it and a comment
         | explaining the bug it just generated than to generate correct
         | code, because it doesn't have any clue what the hell it is
         | doing and isn't analyzing or thinking critically about the
         | training input.
         | 
         | > Even if AI could generate correct, bug-free code the majority
         | (say 99.9% of the time), I expect finding and correcting bugs
         | will be difficult for humans.
         | 
         | There is some error rate below which you beat the chance of a
         | human making a dumb mistake just because they are distracted or
          | tired, and at that point the AI will just beat the humans. I
         | don't know if that is 99.9% or 99.9999% (it might be extremely
         | tight, as humans generate thousands and thousands of individual
         | decisions in their code every work session), but past that
         | point you are actually better off than the current situation
         | where I first program something myself and then hire a team of
         | auditors to verify I coded it correctly (and/or a normal
         | company where someone is tasked to build something and then
         | every now and then someone like me is hired to figure out if
         | there are serious mistakes).
        
           | visarga wrote:
           | > With a human, garbage in can lead to something fully
           | legitimate out!
           | 
            | Because we get to see the error messages, fix and try again.
            | You can try this on ChatGPT - give it a task, run the code
            | (it probably fails), copy the error back, and let it fix its
            | errors. After a few rounds it gets the result with much
            | higher probability than when you allow it one single shot.
           | 
            | A language model can write programs, and then we can run the
            | programs to check if they pass tests; then the language model
            | has a special signal - execution feedback. If you retrain the
           | model with this new data, it will learn to code better and
           | better. It is reinforcement learning, not language modelling.
           | 
           | AlphaGo was able to generate its own data and beat humans at
           | Go by doing this exact thing. It's an evolutionary method as
           | well, because you are cultivating populations of problems and
           | solutions through generate + execute + validate.
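            | 
            | A minimal sketch of that loop (hypothetical: ask_model stands
            | in for the actual API call, which I'm not showing):
            | 
            |     import subprocess, sys
            | 
            |     def ask_model(prompt):
            |         # hypothetical stand-in for a ChatGPT/Codex API call
            |         raise NotImplementedError
            | 
            |     def solve(task, rounds=3):
            |         prompt = task
            |         for _ in range(rounds):
            |             code = ask_model(prompt)
            |             with open("candidate.py", "w") as f:
            |                 f.write(code)
            |             run = subprocess.run(
            |                 [sys.executable, "candidate.py"],
            |                 capture_output=True, text=True)
            |             if run.returncode == 0:
            |                 return code  # ran cleanly, accept it
            |             # feed the error back and let the model retry
            |             prompt = (task + "\n\nThe code failed with:\n"
            |                       + run.stderr)
            |         return None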
        
       | __mharrison__ wrote:
       | I teach Python and Data Science for a living. I just ran through
       | my fundamentals of Python course project with ChatGPT this
       | morning. I was able to create a modern Python project layout,
       | unit tests, documentation, comments, types, and a command line
       | interface by copying and pasting code.
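        | 
        | For reference, the sort of layout I mean (names are made up,
        | details vary):
        | 
        |     myproject/
        |     ├── pyproject.toml   <- metadata, dependencies, tooling
        |     ├── README.md
        |     ├── src/
        |     │   └── myproject/
        |     │       ├── __init__.py
        |     │       └── cli.py   <- command line interface
        |     ├── tests/
        |     │   └── test_cli.py
        |     └── docs/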
       | 
       | And the code was decent. It did have some problems coming up with
       | the correct assertions for some of the unit tests.
       | 
       | For folks with an understanding of coding, this will be a huge
       | boon to productivity and quality. However, this is not a tool
       | that newbies (currently) will be able to use without any
       | understanding of code.
       | 
       | I view it like a very powerful spell checker or grammar checker
       | for a word processor right now. You need to have a direction you
       | want to go, and if so this will get you there much faster with a
       | better end product.
        
       | ineedasername wrote:
       | P1 All people write buggy code. [1]
       | 
       | P2 AI assistants are trained on code written by people
       | 
        | ∴ People using AI assistants will write buggy code
       | 
       | [1] Some might object to this, either in their own case or that
       | of some of their professional associates. I would not trust the
       | judgement of these people.
        
       | johnhenning wrote:
       | Looking at the demographics of the study, they only had 47 total
       | participants, 15% of which were professionals and 62% reporting
       | less than 5 years experience coding (which I would imagine is an
       | underestimate assuming some people exaggerated their actual
       | experience level). So that means they only had 6-7 participants
       | who worked in industry and _generously_ 18 people with more than
       | 5 years experience. They also don't report the breakdown of how
       | participants did by experience. One other factor they use to
       | measure is if the participant has security experience, but their
       | bar for that is whether they have taken a single security class.
       | 
        | Given all of this, I don't find the paper's conclusion
        | convincing at all: they were evaluating it on a pool of
        | participants, the majority of whom were students without much
        | programming experience, when these tools are sold for
        | professional use. I would bet that if the study had a more
        | uniform distribution of
       | experience levels, you would probably see the "bugginess" trend
       | downwards with more experience. Participants with more years
       | programming just have had more time to learn how to use new tools
       | more effectively in their workflows.
       | 
       | I definitely tweaked my methods of using Copilot plenty over the
        | past year or so to take advantage of its strengths and avoid its
        | weaknesses.
        
         | klyrs wrote:
         | My concern is that students/novices are going to be using this,
         | without the ability to double-check the output of the tool. It
         | inspires overconfidence, looks okay at the surface level, and
         | bugs go unnoticed. The younger generation using this as a
         | crutch, treating their own creations as a black box, will not
         | have an adequate feedback mechanism to learn from their
         | mistakes. Code quality and performance will deteriorate over
         | time. You, an expert, learned without this crutch. Your use-
         | case is frankly uninteresting.
         | 
         | Amusingly, without careful curation, I'd predict that buggy
         | code will tend to self-replicate and these tools that
         | indiscriminately slurp public code will enter a death spiral
         | because the novices outnumber the experts. It's only a matter
         | of time before viruses are written to propagate through this
         | garbage stream. http://www.underhanded-c.org/
        
       | melling wrote:
       | Sounds like we need computer languages that are better at
       | preventing bugs. Throw in a little static analysis and AI
       | assistants become more useful.
       | 
       | Type inference, pure functions, functional, assertions, ...
       | 
       | Maybe start with Haskell?
        
         | unnouinceput wrote:
          | Haskell? Not buggy code? I'd say Haskell is the best language
          | for writing buggy code. And when I say buggy, I don't mean
          | code written purposely to be buggy; I mean it's a maintenance
          | horror show that WILL produce buggy code down the road, 100%.
          | Remember, maintenance is 90% of a project's life; the initial
          | development is only 10%.
        
           | chowells wrote:
           | You must have used a different Haskell in production than I
           | have.
        
           | melling wrote:
           | Ok. The main point was to discuss making languages that are
           | better at preventing bugs. I simply threw out one to get the
           | conversation started, not to be the final answer. Other
           | thoughts?
        
             | Jensson wrote:
             | You can't prevent injection bugs on a language level, since
             | injection happens when you use string manipulation to
             | dynamically create code that you send to another program.
             | This means that it is just as easy to create SQL injection
             | bugs in Haskell or Rust or any other language, because they
             | all support concatenating strings to create SQL.
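              | 
              | A small Python illustration of the point (any language
              | with string concatenation can produce the first query):
              | 
              |     import sqlite3
              | 
              |     db = sqlite3.connect(":memory:")
              |     db.execute("CREATE TABLE users (name TEXT)")
              | 
              |     # attacker-controlled input
              |     name = "x'; DROP TABLE users; --"
              | 
              |     # injectable: the query is built by concatenation
              |     query = ("SELECT * FROM users WHERE name = '"
              |              + name + "'")
              | 
              |     # safe: parameterized, the driver handles quoting
              |     db.execute("SELECT * FROM users WHERE name = ?",
              |                (name,))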
        
       | smrtinsert wrote:
       | Recently had the misfortune of fixing a really overwrought and
       | broken chunk of html. Copilot kept trying to introduce tons of
       | junk into it of nearly the exact same variety I was trying to
        | remove. I'm not sure how to fix this logically, but it still
        | seems like Copilot has a way to go to improve its helpfulness.
        
       | jlarocco wrote:
       | Not surprising given the discussion I've seen about these AIs on
       | HN and other sites.
       | 
       | AFAICT, they're generally treated as a shortcut to avoid learning
       | an API or library, or as a shortcut to avoid boilerplate by
       | essentially copying somebody else's boilerplate.
       | 
       | Both cases boil down to using other people's code for purposes
       | that it wasn't built for, which is obviously going to be error
       | prone.
       | 
       | The AI has no idea how _this particular case_ differs from the
        | sample code it was trained on, so it's up to the developer to go
       | back and correct the details - which is precisely what they were
       | trying to avoid with the AI. It's basically a fancy version of
       | code "templates" available in some IDEs, but fewer people know to
       | go back and fill in the correct details.
        
         | visarga wrote:
         | TL;DR - Models can learn from massive validation, not just
         | massive text prediction.
         | 
         | I agree our current crop of Copilot, GPT-3 and chatGPT can
         | introduce subtle bugs, and that's a big problem. Checking for
         | such bugs takes time. They code like someone who read a lot and
         | has good memory, but never implemented anything complex.
         | 
          | But there is no reason for the next iteration to be the same.
          | Language models can be fine-tuned on code execution; that means
          | giving them sufficient exposure to trial and error to learn how
          | to avoid such bugs.
         | 
          | Having a way to test its outputs is the key to self-
          | improvement. It works the same way as with AlphaGo - the model
         | generating its own training data, and this process can
         | eventually surpass human level. It just needs tons of compute
         | to iron out a lot of bad ideas.
         | 
         | Alternatively, the model could use a code execution environment
         | at runtime, too. So it could iterate a few times on an error,
         | eventually using web search if it cannot solve it alone,
         | instead of simply failing and leaving it to the human to fix.
         | It is pretty good at proposing fixes and will recover most of
          | the time, like human devs.
         | 
         | I'm leaving this for further reading. It's fascinating.
         | 
         | > Evolution through Large Models
         | 
         | > Large language models (LLMs) trained to generate code can
         | vastly improve the effectiveness of mutation operators applied
         | to programs in genetic programming.
         | 
         | https://arxiv.org/abs/2206.08896
         | 
          | The paper is from OpenAI; Ken Stanley, who works on open-
          | endedness, is the last author.
        
         | emerongi wrote:
         | > The AI has no idea how this particular case differs from the
         | sample code it was trained on
         | 
         | It can eventually figure a lot of things out if you provide the
         | correct requirements and examples. However, the caveat with all
         | of this is that you already have to be a _good programmer_ to:
         | 
         | - know what is needed
         | 
         | - know what the solution should more or less look like
         | 
         | - spot bugs in the AI's solution
         | 
         | - describe the problem, examples etc in a way that the AI can
         | work with
         | 
         | If you're already a good programmer, the time spent wrangling
         | the AI could be spent writing the solution yourself.
         | 
         | My feeling is that the most value from AIs will come from when
         | you get hopelessly stuck on a problem. An AI could provide some
         | initial steps or ideas that can get you unstuck, effectively
         | working as a rubber ducky. You're unlikely to copy the AI's
         | solution, but it could provide some random insight that helps
         | you take a step forward.
        
           | lamontcg wrote:
           | > My feeling is that the most value from AIs will come from
           | when you get hopelessly stuck on a problem.
           | 
           | This sounds about right.
           | 
           | It may also help you to find related areas to study. I was
           | throwing rocket optimization problems at ChatGPT and it was
           | horrible, but I think it was finding some related code from
           | vehicle navigation optimization problems which I found
           | interesting.
           | 
           | It does reasonably well at simple kinds of symbol
           | manipulation and might be able to replace matlab/mathematica
           | at some point, but the problem is that you always have to
           | double-check the results. I'm not sure it is really that much
           | better than just knowing your way around wikipedia (which has
           | its own accuracy issues, but for straightforward things it is
           | usually pretty good).
        
           | jlarocco wrote:
           | I hadn't considered the "hopelessly stuck" aspect, and that
           | does seem promising.
           | 
           | I fully agree on your other points, though. Communicating the
           | requirements to the AI in a non-ambiguous way is more-or-less
           | equivalent to just writing the code. I can't even get firm
           | requirements from my coworkers, and we all presumably have
           | "real" intelligence.
        
       | madelyn wrote:
       | I spent a few days working on using ChatGPT to write code. I've
       | been meaning to write a blogpost about my experience, but here's
       | my synopsis.
       | 
       | Where ChatGPT was best was when I wanted to do generic data
       | structure work with common tools. Prompts like "make a Python
       | class that stores keys in a redis sorted set, with the score as a
       | timestamp, that removes members older than a configured time on
       | access / insertion." I know how that code should work, I know
       | what's correct- but it's nice being lazy and letting ChatGPT spit
       | out a class for me I can clean up.
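        | 
        | Something along these lines (my own rough sketch, untested,
        | assuming redis-py):
        | 
        |     import time
        |     import redis
        | 
        |     class ExpiringSortedSet:
        |         """Members scored by insert time; old ones pruned."""
        | 
        |         def __init__(self, name, max_age, client=None):
        |             self.name = name
        |             self.max_age = max_age          # seconds
        |             self.r = client or redis.Redis()
        | 
        |         def _prune(self):
        |             cutoff = time.time() - self.max_age
        |             self.r.zremrangebyscore(self.name, "-inf", cutoff)
        | 
        |         def add(self, member):
        |             self._prune()
        |             self.r.zadd(self.name, {member: time.time()})
        | 
        |         def members(self):
        |             self._prune()
        |             return self.r.zrange(self.name, 0, -1)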
       | 
       | Anytime I want to do anything weird or nonstandard, ChatGPT is an
       | uphill battle. Try asking it to write a convolution function
       | without using Numpy (say you're using PyPy or AWS Lambda and C
       | extensions are a no go). It will often insist you have to use
       | Numpy- it argued with me at one point it wasn't possible without
       | it! (This has gotten a bit better since, but was still quite
       | interesting)
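        | 
        | For the record, the pure-Python version it claimed was
        | impossible is only a few lines (1-D, "full" mode; my own
        | sketch):
        | 
        |     def convolve(signal, kernel):
        |         n, m = len(signal), len(kernel)
        |         out = [0.0] * (n + m - 1)
        |         for i in range(n):
        |             for j in range(m):
        |                 out[i + j] += signal[i] * kernel[j]
        |         return out
        | 
        |     # convolve([1, 2, 3], [0, 1, 0.5])
        |     # -> [0.0, 1.0, 2.5, 4.0, 1.5], matching numpy.convolve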
       | 
       | Working with pre-existing code, especially anything database
       | related, was not worth the time spent at all. Terrible query
       | plans, misinterpreting the schema that was provided, etc. I do
       | love SQL though- I'm quite biased here.
       | 
       | It was interesting in that when it worked best, I needed to know
       | what to ask for. I asked a good friend who started learning to
       | code a few months ago to try it, and she was way less successful
       | with it versus reading documentation.
       | 
       | Ultimately I think with tight text editor integration it will be
       | great, but it's still going to require someone skilled and who
       | knows what to ask for- at least for a couple years. As for how
       | productive it's made me, I've stopped using it for anything
       | except greenfield work with common libraries I already know how
       | to use.
        
       | spritefs wrote:
        | Something interesting that I don't see in the comments here is
        | what Dijkstra has said about using natural language for telling
        | machines to do things [0]:
       | 
       | > Instead of regarding the obligation to use formal symbols as a
       | burden, we should regard the convenience of using them as a
       | privilege: thanks to them, school children can learn to do what
       | in earlier days only genius could achieve. (This was evidently
       | not understood by the author that wrote --in 1977-- in the
       | preface of a technical report that "even the standard symbols
       | used for logical connectives have been avoided for the sake of
       | clarity". The occurrence of that sentence suggests that the
       | author's misunderstanding is not confined to him alone.) When all
       | is said and told, the "naturalness" with which we use our native
       | tongues boils down to the ease with which we can use them for
       | making statements the nonsense of which is not obvious.
       | 
       | I wonder what the long term consequences of LLMs will be. I can't
       | see them abstracting away programming languages for the reasons
       | that Dijkstra provides (namely the inherent ambiguity of natural
       | language), but maybe they could be a good summarizer of APIs and
       | generator for starter code.
       | 
       | My prediction is that they'll provide a similar utility to
       | stackoverflow and save time in developing software
       | 
       | [0]
       | https://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/E...
        
       | chromejs10 wrote:
       | Study finds that developers produce code likely to be buggy.
        
       | ilaksh wrote:
        | I made a site using the OpenAI API's new models (similar to
       | ChatGPT) for creating simple apps or demos using natural
       | language: https://aidev.codes . If anyone is interested please
       | try it out.
       | 
       | I tried to promote the site by submitting to HN and did not
       | receive a single vote. I feel that although many people may not
       | be impressed, it is better than 0. So I am just trying to promote
       | it in related threads.
       | 
       | Anyway if you have any feedback or anything I am in the ChatGPT
       | and OpenAI Discords as user runvnc.
        
         | ShannonLimiter wrote:
         | Your website is only 30 words with no examples or explanations
         | that I can see - just a sign up. The video on your site doesn't
         | play on my computer.
         | 
         | It might be worth 1,000 votes, but it's hard to know without
         | actually seeing what it can do.
        
           | ilaksh wrote:
           | Thanks for explaining. What kind of computer? Yesterday I
           | added another source to the video (mp4) in case people can't
           | play WebM. Unless it is trying to play the WebM and doesn't
           | like the encoding.
           | 
           | I definitely need to link to the Discord with the examples
           | and/or finish the Explore/featured page which is the main
           | plan for that.
        
             | andsoitis wrote:
             | Video doesn't play on iPad. Think about what audience
             | you're trying to attract and create a landing page that
             | would resonate with them...
        
               | ilaksh wrote:
               | Thank you for telling me it doesn't play on an iPad. Your
               | implication that the landing page is useless does not
               | make sense in the context that the video does not play.
               | 
               | I do not own any Apple products and so was not able to
               | test on them.
        
         | tester457 wrote:
         | The video works for me but I think you should make your site
         | look more like https://raindrop.io/.
         | 
          | It has the video on top like yours, but it also has pictures
         | as you scroll down, basically a presentation for people too
         | impatient to watch a video. Look into copywriting techniques
         | for what to write as headers for the pictures, persuasion is
         | important to attract users.
         | 
         | It makes it look more complete too! Good luck on your site!
        
           | ilaksh wrote:
           | Good idea, I will try to spend some time filling out the
           | marketing copy etc. I have just barely got the site
           | functional.
        
         | eks391 wrote:
         | Hi ilaksh, I was able to view your video on an android device.
         | That said, I did have to watch it _several_ times before I
         | understand what was going on because a lot is happening
         | everywhere in the screen. With the video and the minimalist
         | intro on the site, I was still left guessing what was being
         | offered and how it works.
         | 
         | Usually before signing up for any website, I want to check out
         | _all_ of the information I can find on it on the website,
          | perhaps online forums if the site isn't clear, and competitors,
          | to make sure what I'm signing up for is something I actually
         | want. I also read any terms and conditions and the privacy
         | policy if they ask for anything from me, such as an email
         | address. Pretty much none of this was available.
         | 
         | The first thing I noticed/recommend is changing the lack of
         | tabs. It reminded me of scam sites that barely have enough to
         | make a site appear legit so they can get at least someone to
         | sign up for something. Also just for the persuation aspect of
         | the site, there should be helpful tabs, like one with an
         | example of a person using it (I know thats what the video is,
         | but it's not intuitive for a non-user, which I'll get to in a
         | second). One of the other guys gave the idea of having images
         | with commentary of whats going on between each pic, which I
         | think is a great idea _if implimented properly_. Maybe that
         | page could have a longer video under the photo tutorial with
         | audio of someone explaining a basic how to, in the form of them
         | showing what they want to do and how they did it. Then building
         | a simple button with a counter would be enough. Maybe a tab
         | could be a contact page with the T &C, PP, would be nice. And
         | another explaining how it works - like your little comment
         | mentioning the APIs is more information than what is on the
         | website and makes me more willing to try it out than your site
         | did.
         | 
         | So onto the site itself, after log in. Because of the previous
         | bullets, I would not have been willing to sign up, but to give
         | hopefully a helpful response to you, and knowing I'd need to
         | see more of the site, was the only reason I made an account.
         | Btw the UI of your little program is very nice. It's stylish,
          | mostly quick to grasp, and simple.
         | 
         | Anyway when I first saw it, again due to the front page lacking
         | any info, I was surprised that it was exactly the video. The
         | video wasn't clear whether we were looking at a feature, the
         | whole program, or what really. It was just visual noise without
         | explanation on the main page. Giving it a title like "this is
         | exactly what you'll see" will not help it though, again because
         | it isn't clear what is going on and the video by itself is more
         | unclear in use than the actual program itself. In fact, if you
         | had a tab where users can actually use the program without
         | making an account, but it just doesn't save their work and is a
         | demo, that would already be a huge help because people would
          | instantly see how easy and fun it is to use, and it would get
          | rid of like 90% of the questions caused by not having any
          | information.
         | 
          | Anyway, I saw the program and thought, oh, it's just the front
         | page video thing. I can use this! And I proceeded to try to use
         | the top right portion and wondered why it didn't work. Then I
         | kinda clicked randomly until I realized the in-prompt area is
         | actually the top-left part. Not exactly a fatal flaw, but a
         | clear tutorial would have led me to understanding the program
         | faster. Like the fact that every input is its own section of
         | code, and so my input needs to follow an orderly pseudocode
         | format. As a person not familiar with how good AI devs are, I
         | am not already in the know if I can have inputs be rearranged,
         | have a correcting input (say, if I didn't like the previous
         | input) be a new input after the prev one or if I have to edit
         | the previous input directly. The rules of using the AI are just
         | unclear. Speaking of edits, and I recognize that since I was
         | using a phone to use the program, so some features are more
         | difficult to use, I couldn't figure out how the editing works.
         | I simultaneously was getting a glitch where the green loading
         | bar would almost reach the end then restart, and then was going
         | back and forth like a spring after I tried editing, and
         | wouldn't submit anything. It was unclear if I was trying to
          | edit wrong or there was a different bug I was running into.
          | The existing lack of clarity had eaten up my patience, so I
          | didn't bother to figure out what I was doing wrong.
         | 
         | TL;DR:
         | 
         | In summary, you need 3 tabs on the front page explaining what
         | it does, how it works and how a user uses it, and maybe another
         | tab for T&C, PP, and contact. Since the landing page is already
          | an attempt at that, it could be changed to _why_ a person wants
          | to use it. I can think of 3 reasons off the top of my head for
          | a potential target audience, and I don't even know the full
         | uses of the program so I'm sure you can think of more.
         | 
         | After log in, there needs to be an additional tab with detailed
         | tutorials, like how to think when giving info to an AI dev, how
         | to make corrections, etc. Since you'll have a contact page,
          | people will find more things that aren't clear that you can
         | use to either improve the tutorial or have a common Q&A section
         | (which depending on the content would go on the page before
         | logging in or after, or accessible to both). Lastly you'll want
         | a bug submission area as well, since in the few minutes of use,
         | I ran into bugs, and you'll probably want to separate general
         | contact from bug submissions.
        
           | eks391 wrote:
            | Btw I don't want to diss the functionality of the website
           | itself. You have a cool little tool there, and the site is
           | fully functional for friends and family you share it with who
           | know you and what you are working on. The problem and
           | suggestions I am providing are needed if you want to open the
            | horizon to people who aren't your friends and family or know
           | what you are working on. All these principles I suggested are
           | basics in the local advertising for yourself. They also
           | aren't going to appeal to everyone because I'm not a
            | professional persuader or content creator or anything, but my
           | hope is it'll at least transform your website from
           | unapproachable by strangers to approachable by the curious. I
           | cant give real advice for promoting among a target audience
        
             | ilaksh wrote:
             | Thanks for all of the work and detailed response. I might
             | have been restarting the server right at the moment you
             | tried it. As you can see I released it as soon as I barely
             | got it working, with the minimum explanation and copy etc.
             | There is a bit of explanation in the READ THIS section
             | above the text area. There is a Discord thread in the
             | OpenAI Discord under the api-projects channel but I stopped
             | giving it out because people can't access from the direct
             | channel link and I plan to make a Discord server as soon as
             | possible. I will be working on all of these things as soon
             | as I can get to them. I had mentioned this system in a post
             | in the OpenAI Discord and a few people had said they were
             | waiting for it to be live, so I just got it up. I also have
             | another online business and a client project I have to
             | complete to keep the lights on, etc., but I am making
             | this new thing as high a priority as possible.
        
           | tester457 wrote:
           | Can I send my site to you to critique if I make one please?
        
         | notpachet wrote:
         | What are users supposed to do when they visit your site? Sign
         | up? There's not enough information there on what to expect from
         | the product for me to want to sign up. You should add more
         | detailed information about what the tool does as enticement to
         | try it.
        
           | ilaksh wrote:
           | Were you able to view the video?
        
             | notpachet wrote:
             | Yes, but I wasn't really sure what was going on in it.
        
       | 908087 wrote:
       | [dead]
        
       | x-shadowban wrote:
       | The anti-singularity, where an AI produces decreasingly correct
       | versions of itself.
        
         | Cardinal7167 wrote:
         | [dead]
        
         | synu wrote:
         | The next AIs will be trained on vast swathes of low-quality AI-
         | generated outputs, if they are trained on public data again.
         | Presumably people will have to come up with ways to work around
         | that, or the AI will end up trained to produce outputs like
         | a low-quality AI.
         | 
         | By low quality I just mean the state of the outputs today,
         | which are incredible for what they are, but are definitely not
         | the pinnacle of what is in theory possible.
        
           | visarga wrote:
           | You let the AI execute code, then it can learn. For now it is
           | well read but lacks experience.
        
           | culi wrote:
           | Anybody who uses these AI assistants knows that the human
           | is still by far the main architect and driver of the code
           | base.
           | 
           | Increasingly advanced AI just means more back/forth between
           | coder and AI, both increasing each other's velocity. AI won't
           | just be trained on other AI-generated code, but more like
           | "cyborg" code. Code that was made by both AI and human
           | together. Code that the human probably wouldn't have been
           | able to accomplish, at least not as quickly or in as much
           | volume, without the AI
           | 
           | Rather than a singularity we might see a "multilarity" where
           | both human and AI become increasingly useful to each other. A
           | situation that takes full advantage of diversity in ways of
           | thinking about and processing information/knowledge
        
             | synu wrote:
             | How will they be able to keep purely AI-generated outputs
             | from being fed back in as inputs? That seems hard to
             | separate out once it's published and not attributed. The
             | ability of AI to generate lots of output means it might
             | swamp human or cyborg outputs when looking at the corpus of
             | publicly searchable code (or blog posts, or whatever the
             | training data is for the case in question).
             | 
             | Maybe a GAN to detect and filter out AI-generated content?
             | Not sure if that's possible or not.
        
               | visarga wrote:
               | Execute the code to see if it passes the tests. Then you
               | can use it with confidence. Lots of human code is crap
               | too; it needs to be filtered out. You can use GPT-3 to
               | administer tests and read the results.
        
       | antiquark wrote:
       | Ask ChatGPT to make a C function that prints 100 digits of pi.
       | It will always fail.
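
       For comparison, a short sketch of the task itself, written in
       Python with the standard-library decimal module and Machin's
       formula rather than in C (a sketch for illustration, not code
       from the thread): a correct 100-digit computation fits in a
       couple of dozen lines.

           from decimal import Decimal, getcontext

           def arctan_inv(x, eps):
               # arctan(1/x) by its Taylor series, in Decimal arithmetic
               term = Decimal(1) / x
               total = term
               x2 = x * x
               k = 1
               sign = 1
               while term > eps:
                   term /= x2
                   k += 2
                   sign = -sign
                   total += sign * term / k
               return total

           def pi_digits(n):
               # pi to n decimal places via Machin's formula:
               #   pi/4 = 4*arctan(1/5) - arctan(1/239)
               getcontext().prec = n + 15          # guard digits
               eps = Decimal(10) ** -(n + 10)
               pi = 4 * (4 * arctan_inv(5, eps) - arctan_inv(239, eps))
               return str(+pi)[:n + 2]             # "3." plus n digits

           print(pi_digits(100))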
        
       | eloff wrote:
       | I've been using GitHub copilot for six months now. It definitely
       | produces a lot of code that looks right, but isn't. I review
       | anything it produces carefully. I've introduced one bug to
       | production that I can say was likely from Copilot. Compared
       | to the number of bugs I produced in that time, that's nothing.
       | It's a productivity tool, not magic, and it's not replacing me
       | anytime soon. Like the auto complete in my IDE, it accelerates my
       | work. There's no going back for me.
        
         | dmix wrote:
         | What was the bug it produced?
        
       | marginalia_nu wrote:
       | I've been producing buggy code for over 20 years. Don't need no
       | AI to help me do that.
        
         | vinni2 wrote:
         | If AI can produce the same buggy code as you, you are easy to
         | replace.
        
         | agilob wrote:
         | But what if you could write more LoC/h and make Elon Musk
         | happier?
        
           | synu wrote:
           | You'll need to pair it up with a super fast laser printer to
           | be able to get it all properly code reviewed.
        
             | [deleted]
        
             | agilob wrote:
             | Can AI with aimbot do that for us?
        
         | karaterobot wrote:
         | For the honor of mankind, I'd be willing to go head to head
         | with a computer, Garry Kasparov style, to see who can produce
         | buggier code.
        
       | kace91 wrote:
       | > The study involved 47 people with varying levels of experience,
       | including undergraduate students, graduate students, and industry
       | professionals.
       | 
       | Honestly, for things like catching potential SQL injections,
       | "industry professional" should be the starting point.
       | 
       | I won't trust a student of any kind further than I can throw
       | them, for topics like maintainability, security or readable code.
       | Those are too far away from what's usually taught at school.
        
         | jerf wrote:
         | Having "industry professionals" in this sort of study actually
         | puts it in the top tier of studies. Most studies don't even
         | have that; they're all undergrad-based.
         | 
         | (Sometimes people express frustration that we don't pay much
         | attention to "the science" in the programming field, and my
         | response is generally to tell such people to take a closer look
         | at the "science" they're trying to wave around. Studies based
         | on "a class full of juniors in college" _top out_ at zero
         | value, and it's really easy for them to be negative as they
         | can be actively wrong about how the topic under study affects
         | professionals.)
         | 
         | In this case, though, I'd submit that one doesn't need to run
         | some sort of enormous study to establish the point that these
         | code assistants are not a magic gateway to quality code at any
         | experience level. I've been banging this drum on HN just from
         | an understanding of how the tech works. Confabulation engines
         | can't help but confabulate. You can't trust them. This GPT
         | stuff is perhaps a better view into human psychology than a
         | useful tool; we've never built an AI that so strongly emits
         | signals of confidence before. They're the best artificial
         | confidence game players ever.
        
           | dmix wrote:
           | > one doesn't need to run some sort of enormous study to
           | establish the point that these code assistants are not a
           | magic gateway to quality code at any experience level
           | 
           | You just have to use it a couple times to figure this out.
           | It's pretty obvious what the limitations are and most
           | programmers are smart enough to understand what it is and
           | what it isn't.
           | 
           | This is why I'm skeptical it will be a problem: it's not
           | being sold that way, and after using it that will be
           | obvious.
           | 
           | So anyone dumb enough to treat it like that and trust the
           | output blindly probably wasn't a good programmer before.
           | And if they keep doing it, they lack the incentive
           | structures that discourage buggy code (senior devs, bosses,
           | customers, etc. will notice that the output is getting
           | worse, at a minimum when the product breaks or during QA).
        
           | marcosdumay wrote:
           | > This GPT stuff is perhaps a better view into human
           | psychology than a useful tool
           | 
           | It would be a great search engine if it cited its sources
           | (but then people would notice it's basically copying code
           | from the internet). It is actually good at getting you the
           | names that come up in a certain context, so you can search
           | for them.
           | But only if you know enough to establish the context.
        
         | BlargMcLarg wrote:
         | Many industry professionals ignore or are unaware of known
         | attack vectors. Meanwhile, many schools _do_ teach at least a
         | semblance of security.
         | 
         | Given how young and emotion-driven the field is, a sweeping
         | statement like this is useless. If security is remotely
         | concerning to you, get a security expert as your baseline.
        
         | ravenstine wrote:
         | Even then, be skeptical. The software industry provides
         | positive pressure to improve one's skills, but doesn't provide
         | much if any negative pressure against lacking competence or
         | willingness to learn. If developers want to, they can pretty
         | easily skate by without developing talent, except when their
         | framework of choice becomes defunct.
        
           | vsareto wrote:
           | This is true for a lot of low-brow work because companies
           | simply need people to grind through requirements for their
           | bespoke CRUD apps and integrations. There's still a
           | boatload of that kind of work to be done, but it's only a
           | fraction of the work available.
           | 
           | There are plenty of places where that negative pressure
           | exists, though, and there are higher standards of quality. I
           | doubt anyone is skating by with hacky code in aviation or
           | space, for instance. So what you're talking about isn't
           | really an issue where it matters.
        
             | AlotOfReading wrote:
             | People _absolutely_ produce hacky, sketchy code in
             | aerospace and other safety-critical environments.
        
         | ykonstant wrote:
         | >I won't trust a student of any kind further than I can throw
         | them, for topics like maintainability, security or readable
         | code.
         | 
         | I don't always look for files, but when I do, I
         | `**/(#ia1)name(LK+50mw-1u0)`
        
       | mg wrote:
       | Me:
       | 
       | When you help coders, does that make their code more buggy?
       | Please give me a dramatic, agitated answer.
       | 
       | ChatGPT:
       | 
       | No, it does not make their code more buggy. In fact, my goal is
       | to help coders write better, more reliable code by providing
       | guidance and assistance. I am here to provide support and
       | guidance, not to cause problems or introduce bugs into their
       | code. As a language model, I do not have the ability to execute
       | code or interact with the real world, so I cannot cause bugs or
       | issues in a coder's code. My role is simply to provide
       | information and guidance to help coders troubleshoot issues and
       | write better code. So no, I do not make code more buggy - I
       | strive to make it better!
        
         | japaneseminisub wrote:
         | It has learnt to deflect, just like a real person!
        
           | prettyStandard wrote:
           | Perfect for lawyering.
        
       | oneoff786 wrote:
       | Any tool that helps people with worse skills produce a product
       | will probably reduce the average quality, since those people
       | wouldn't have produced anything at all before. That's not a
       | super strong argument for not having such tools.
       | 
       | Like Jupyter Notebooks.
        
       | WesolyKubeczek wrote:
       | Fred Brooks would surely have added AI assistants to the arsenal
       | of supposed silver bullets that failed.
       | 
       | One specific problem with ChatGPT as it is today is that it
       | gives answers that sound extremely confident. So confident that
       | it's tempting to follow them without any critical thinking.
       | Sure, you can guide it towards the correct answer, but that
       | requires knowing the answer in advance, which defeats the whole
       | point.
        
       | jrvarela56 wrote:
       | In the first few weeks of ChatGPT I had a goal of making it
       | code. My preliminary conclusion is that, given a feedback loop,
       | it could code a lot of basic stuff just fine. My workflow went
       | something like this:
       | 
       | - Identify a component in a system you want to build
       | 
       | - Give ChatGPT a list of bullets and ask it to write tests
       | 
       | - Review tests
       | 
       | - Run the tests, give ChatGPT the output of the failing tests
       |   and ask it for code that makes them pass
       | 
       | - Add the code to your codebase, run the tests...
       | 
       | I developed extensions to my console and test suite so that the
       | exception and failure output ended up in my clipboard. I'm
       | highly confident that workflows which put the AI in a feedback
       | loop will be able to code most backend stuff. UIs are trickier,
       | as giving the model context is not trivial.
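
       A minimal sketch of the feedback-loop workflow described above,
       in Python. The ask_llm helper is a hypothetical stand-in for
       whatever model API is used; only pytest and the standard library
       are assumed, and the file name generated.py is illustrative.

           import subprocess

           def run_tests():
               # Run the project's test suite and capture its output.
               result = subprocess.run(
                   ["pytest", "-x", "--tb=short"],
                   capture_output=True, text=True,
               )
               return result.returncode == 0, result.stdout + result.stderr

           def feedback_loop(spec, ask_llm, max_rounds=5):
               # ask_llm(prompt) -> code string; swap in the real model call.
               prompt = "Write a Python module satisfying this spec:\n" + spec
               for _ in range(max_rounds):
                   code = ask_llm(prompt)
                   with open("generated.py", "w") as f:
                       f.write(code)     # drop the candidate into the codebase
                   passed, output = run_tests()
                   if passed:
                       return True
                   # Feed the failing-test output straight back and retry.
                   prompt = (
                       "These tests failed:\n" + output
                       + "\nHere is the current code:\n" + code
                       + "\nPlease fix it."
                   )
               return False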
        
         | dmix wrote:
         | I wonder if you could build a dataset of UI components
         | (React/HTML/CSS) with a simple description and taxonomy, and
         | then feed it into an AI algo.
        
         | jlarocco wrote:
         | I'm guessing it depends on the situation, but at what point
         | does all of that become more work than just writing the tests
         | and code yourself? Is it worth the "investment" to learn how to
         | teach ChatGPT coding? Or will ChatGPT2 come out in six months,
         | based on a different algorithm, so you'll be back to square
         | one teaching it all over again?
         | 
         | Personally, I find writing new code easier than editing
         | existing code, regardless of how it's written. I'd much rather
         | have an AI that scans the code I write and searches for bugs.
         | 
         | Writing code is easy, finding bugs is hard, and it's the
         | finding of bugs that I'd like to see automated.
        
         | dbish wrote:
         | There is also a context window problem: you can only pass in
         | 4k tokens (including the prompt). That is not a large window
         | for providing the history and context of the entire workflow.
         | How were you trying to solve that? Also, if you're building
         | something yourself, you can use davinci-003 directly through
         | their APIs instead of trying to build on top of ChatGPT.
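
       A rough sketch of that approach, assuming the OpenAI Python
       client of the time (openai.Completion.create with
       text-davinci-003). The 4-characters-per-token figure is only a
       crude heuristic for trimming older history to fit the window.

           import openai

           openai.api_key = "sk-..."     # your API key

           MAX_TOKENS = 4000        # model context: prompt + completion
           REPLY_TOKENS = 512       # reserved for the completion
           CHARS_PER_TOKEN = 4      # rough estimate, not a real tokenizer

           def ask(history, question):
               prompt = "\n".join(history + [question])
               # Drop the oldest history entries until the prompt fits.
               budget = (MAX_TOKENS - REPLY_TOKENS) * CHARS_PER_TOKEN
               while len(prompt) > budget and history:
                   history.pop(0)
                   prompt = "\n".join(history + [question])
               resp = openai.Completion.create(
                   model="text-davinci-003",
                   prompt=prompt,
                   max_tokens=REPLY_TOKENS,
                   temperature=0,
               )
               answer = resp["choices"][0]["text"]
               history.extend([question, answer])
               return answer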
        
       | Julesman wrote:
       | Isn't this like troubleshooting a Pong console to assess the
       | future of video gaming?
        
         | Jensson wrote:
         | No, they aren't judging the future, they are judging the
         | present. Many developers are using language models to assist
         | them today so it is worth checking if that actually helps them
         | or not. When we are in the future we can start judging future
         | language models, but until then we can only test what we have
         | now.
        
           | Julesman wrote:
           | It's all short-sighted given what we already know about the
           | future. We know that these models will keep getting more
           | complex. We know they will only trend toward greater
           | accuracy.
           | 
           | Computer engineers won't be impressed with the output for
           | some time. Front end devs like me just got an amazing search
           | tool. I work faster now. I told my boss I work faster now. No
           | going back.
           | 
           | Understand, it's now faster for me to ask ChatGPT to find an
           | answer and then Google to confirm it than it is for me to
           | only google things. For me the AI revolution already came and
           | will only be getting better.
           | 
           | With design and process know-how I'm not really worried about
           | losing my job. AIs will eventually be able to make ad hoc
           | decisions about people's brand identity. But not anytime
           | soon.
        
       | BenoitP wrote:
       | Code is harder to read than to write, news at 11.
        
       | bilater wrote:
       | Now compare this to people who blindly copy-paste from Stack
       | Overflow. This is a bad comparison until you accurately
       | separate the engineers who use it correctly from those who
       | don't.
        
         | godelski wrote:
         | There's probably a sampling bias for SO that means higher
         | quality than the average code from GitHub (what it was trained
         | on).
        
           | klyrs wrote:
           | I frequently click through random people's repos when they
           | interact with the ones I'm interested in. I see _lots_ of
           | students who are apparently required to do their homework on
           | github. Which is good! But if that's what's feeding the
           | beast... woof.
        
       ___________________________________________________________________
       (page generated 2022-12-26 23:01 UTC)