[HN Gopher] What we learned in 6 months of working on an AI Deve...
___________________________________________________________________
What we learned in 6 months of working on an AI Developer
Author : magden
Score : 29 points
Date : 2024-03-03 12:20 UTC (10 hours ago)
(HTM) web link (blog.pythagora.ai)
(TXT) w3m dump (blog.pythagora.ai)
| somewhereoutth wrote:
| > Our approach is to focus on building the application layer
| instead of working on getting LLMs to output better results. The
| reasoning is that LLMs will get better,...
|
| So more jam tomorrow then. Building the framework around the
| magic is the easy bit.
| romafirst3 wrote:
| It is a very important bit and might be how we all code in the
| future.
| romafirst3 wrote:
| It's definitely not even close to being solved either. I
| haven't seen a single code generator that works (100% of the
| time) for anything more than a very simple one- or two-liner.
| stevage wrote:
| The focus on upfront specs feels a bit off. Since it's apparently
| cheap to generate running code, as a user, I'd much rather be
| able to just iterate really fast and use output to refine my
| requirements rather than having to laboriously state them all up
| front. Agile rather than waterfall if you will.
| amelius wrote:
| Until I see an AI sysadmin that can help with basic
| configure/make problems, I don't have high hopes for an AI
| developer.
| kbar13 wrote:
| need AI for ffmpeg flags
| Cilvic wrote:
| I have great success for my simple use cases with sgpt -s
| "cut the 40 seconds of the video starting at 1:30"
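A minimal sketch of the kind of ffmpeg invocation such a tool might emit for that request (the filenames are hypothetical; the flags `-ss`, `-t`, and `-c copy` are standard ffmpeg options):

```python
# Build the ffmpeg command an assistant like sgpt would plausibly emit for
# "cut the 40 seconds of the video starting at 1:30".
# Filenames are hypothetical. -ss seeks to the start point, -t limits the
# duration, and -c copy stream-copies instead of re-encoding.
cmd = [
    "ffmpeg",
    "-ss", "00:01:30",   # start 1 minute 30 seconds in
    "-t", "40",          # keep 40 seconds
    "-i", "input.mp4",   # hypothetical input file
    "-c", "copy",        # stream copy: fast, no quality loss
    "clip.mp4",          # hypothetical output file
]
print(" ".join(cmd))
```

Placing `-ss` before `-i` makes ffmpeg seek on the input side, which is much faster on long files.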
| Lerc wrote:
| Even though I don't think GPT-4 is up to the task, it does seem
| like now is the right time to be working on these things. Pretty
| soon GPT-4 will not be the best in the field. The next generation
| will perform much better.
|
| Possibly the most frustrating thing I find about GPT-4 is how
| close it gets with its wrong answers. It's easy to dismiss a
| lesser answer when it responds with a laughably out-of-band idea.
| GPT-4 often shows that it has a general idea of what you want but
| misses a small but critical aspect which results in a solution to
| something else that is similar but not what you wanted.
|
| I have mixed results on iterating on its own mistakes. It will
| too often try to change the world to match its answer, rather
| than fixing the answer. The best approach I have found to stop
| this is by getting it to create unit tests. I imagine there is a
| lot of training data for it to understand the intention behind
| fixing a failing test. It's a very specific problem for it to
| look at and generally changing the test is not considered the
| correct solution.
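The loop described above, holding a unit test fixed and iterating only on the answer, can be sketched generically. `generate_fix` stands in for an LLM call and is purely hypothetical:

```python
# Sketch of the test-anchored repair loop: keep the unit test fixed and ask
# the model to revise only the candidate answer until the test passes.
# `generate_fix` is a hypothetical stand-in for an LLM call.
def repair_until_green(candidate, run_test, generate_fix, max_rounds=5):
    for _ in range(max_rounds):
        if run_test(candidate):
            return candidate          # the fixed test now passes
        # Crucially, feed back the *same* test, so the model cannot
        # "change the world to match its answer" by editing the test.
        candidate = generate_fix(candidate, run_test)
    raise RuntimeError("no passing candidate within budget")

# Toy usage: a "model" that nudges a number toward the test's target.
passes = lambda c: c == 3
nudge = lambda c, t: c + 1
print(repair_until_green(0, passes, nudge))  # -> 3
```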
| berkes wrote:
| > Pretty soon GPT-4 will not be the best in the field. The next
| generation will perform much better.
|
| What makes you believe that progress is linear, or at least a
| line forever going up?
|
| I keep seeing people predicting rapidly improving AI, based on
| how rapid it improved over the last x months.
|
| But why is that not an outlier? How do we know we haven't hit a
| ceiling and aren't stagnating? Isn't progress typically very bumpy and
| sudden?
| Lerc wrote:
| >What makes you believe that progress is linear, or at least
| a line forever going up?
|
| I assume neither of those things. I have however read a lot
| of the papers published since GPT-4 was trained. There have
| been a lot of advances since then, so much so that simply
| saying "a lot" seems to be a massive understatement.
|
| I think it is a reasonable assumption that at least a portion
| of those advancements would be able to build upon the
| existing technology of GPT-4 to produce something greater.
|
| I am not assuming discoveries yet to be made. I am
| considering existing discoveries that have not yet made it
| into the top level of production.
| withinboredom wrote:
| Oh man. When it's so close but wrong it's amazing for creative
| endeavors! For technical ones, it is quite a bad thing. It's
| like being a Star Wars fan but the AI just wants to talk about
| Star Trek.
|
| I think this is why the non-tech people see AI as so amazing.
| For anything human and non-technical, the "almost but not
| quite" nature is a good thing.
|
| I was using an AI to help me debug a weird thing (mainly
| summarizing log splats hundreds of lines long) and I eventually
| got pretty close to identifying the issue when I asked "wtaf is
| this message. Never seen anything like it." It then went on
| about how it was offended that I used vulgar language. I had to
| apologize for saying "wtaf!" Anyway, I found a bug in a linker,
| so that was fun; thanks AI.
| 65 wrote:
| Maybe AI developers can make landing pages and basic APIs. But,
| taking front end as an example, I just don't see how an AI can
| reproduce exact design specifications and interactivity to the
| point where it wouldn't just be faster to write the code yourself
| or search for some human verified snippet that does what you
| want.
|
| And programmers who do know how to actually write efficient code
| without AI seem like they'd be even more in demand than those
| that rely on AI. Skill + knowledge + ability to use existing
| resources (e.g. StackOverflow, packages, templates), as we do
| now, are much more predictable and faster than trying to wrangle
| AI to do exactly what the designer or PM wants.
|
| When the dishwasher was invented, everyone thought the human dish
| washer would be obsolete. And yet, restaurants still employ dish
| washers because they are much more efficient and thorough than a
| dishwashing machine.
| nine_zeros wrote:
| > When the dishwasher was invented, everyone thought the human
| dish washer would be obsolete. And yet, restaurants still
| employ dish washers because they are much more efficient and
| thorough than a dishwashing machine.
|
| This is a good example of both job destruction and job
| retention by technology.
|
| Job destruction - the total number of potential hand dishwasher
| jobs has reduced because the vast majority of commodity
| dishwashing is machine driven.
|
| Job enhancement - machine dishwashers just can't produce the
| quality/dexterity of hand dishwashers.
|
| I feel like generative AI will do the same. It will replace a
| large number of commodity jobs - editors, translators, copy
| producers, website designers, app prototypers, paper pushers but
| it will also reveal the value of skilled producers.
|
| Too risky to let ChatGPT write code for your backend that
| destroys your production database and crashes your company
| forever.
| ctoth wrote:
| One of the things they seem to have figured out is the
| requirement to at least model a sort of actor-critic architecture
| with their agents. It helps quite a bit.
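The actor-critic pattern mentioned above can be sketched as a propose/review loop; both callables stand in for LLM agents and are hypothetical:

```python
# Minimal actor-critic agent loop: an "actor" drafts code, a "critic"
# reviews it and either approves or returns feedback for the next draft.
# In practice both roles would be LLM calls; here they are hypothetical
# stand-ins to show the control flow.
def actor_critic(actor, critic, task, max_rounds=3):
    feedback = None
    for _ in range(max_rounds):
        draft = actor(task, feedback)
        approved, feedback = critic(task, draft)
        if approved:
            return draft              # critic accepted this draft
    return draft                      # best effort once the budget runs out

# Toy usage: the critic insists on a docstring; the actor complies on
# round two once it sees the feedback.
actor = lambda task, fb: 'def f():\n    """doc"""' if fb else "def f(): pass"
critic = lambda task, d: ('"""' in d, "add a docstring")
print(actor_critic(actor, critic, "write f"))
```

Separating the roles keeps the reviewer from rationalizing the proposer's mistakes, which is the point of the architecture.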
|
| They seem to badmouth Aider a tad (not cool) but I do wonder how
| a full stack of this + Aider might work? There also needs to be
| some sort of good test generator involved.
|
| All that said, any time someone actually demonstrates progress on
| the automated Software Engineer problem and it makes it to HN, I
| am deeply reminded of the old quote:
|
| "It is difficult to get a man to understand something, when his
| salary depends on his not understanding it."
|
| Just read through this comments section and check out the pure
| copium. Yes, ChatGPT can do basic sysadmin tasks with ./configure
| and make.
|
| Yes it does make sense to work on this now, assuming LLMs will
| get better, because LLMs have continued to get better on any
| metric you can imagine.
|
| Finally, yes, AI devs will make landing pages and basic APIs. I
| didn't realize we were all hardcore world-class 0.01%
| programmers? I have certainly written a landing page and basic
| API before, in fact I do that sort of thing a lot more than I
| write uber1337 hax0r code. You probably do too!
| gumby wrote:
| CMake was invented to guarantee that at least some humans would
| have software jobs.
___________________________________________________________________
(page generated 2024-03-03 23:01 UTC)