[HN Gopher] What we learned in 6 months of working on an AI Deve...
       ___________________________________________________________________
        
       What we learned in 6 months of working on an AI Developer
        
       Author : magden
       Score  : 29 points
       Date   : 2024-03-03 12:20 UTC (10 hours ago)
        
 (HTM) web link (blog.pythagora.ai)
 (TXT) w3m dump (blog.pythagora.ai)
        
       | somewhereoutth wrote:
       | > Our approach is to focus on building the application layer
       | instead of working on getting LLMs to output better results. The
       | reasoning is that LLMs will get better,...
       | 
       | So more jam tomorrow then. Building the framework around the
       | magic is the easy bit.
        
         | romafirst3 wrote:
         | It is a very important bit and might be how we all code in the
         | future.
        
           | romafirst3 wrote:
            | It's definitely not even close to being solved either. I
           | haven't seen a single code generator that works (100% of the
           | time) for anything more than a very simple one or two liner.
        
       | stevage wrote:
       | The focus on upfront specs feels a bit off. Since it's apparently
       | cheap to generate running code, as a user, I'd much rather be
       | able to just iterate really fast and use output to refine my
       | requirements rather than having to laboriously state them all up
       | front. Agile rather than waterfall if you will.
        
       | amelius wrote:
       | Until I see an AI sysadmin that can help with basic
       | configure/make problems, I don't have high hopes for an AI
       | developer.
        
         | kbar13 wrote:
         | need AI for ffmpeg flags
        
           | Cilvic wrote:
           | I have great success for my simple use cases with sgpt -s
           | "cut the 40 seconds of the video starting at 1:30"
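
As an illustration of the kind of command a tool like sgpt tends to hand
back for that prompt (the filenames here are placeholders, not from the
comment), the equivalent ffmpeg invocation looks like this:

```shell
# Hypothetical ffmpeg command for "cut the 40 seconds of the video
# starting at 1:30"; input.mp4 and clip.mp4 are made-up names.
START="00:01:30"   # -ss: seek to this timestamp before reading
DURATION=40        # -t: stop writing output after this many seconds
CMD="ffmpeg -ss $START -t $DURATION -i input.mp4 -c copy clip.mp4"
echo "$CMD"        # -c copy avoids re-encoding, so the cut is fast
```

Stream copy (`-c copy`) cuts on keyframe boundaries, so the start point
may be slightly off; dropping it re-encodes and cuts exactly.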
        
       | Lerc wrote:
       | Even though I don't think GPT-4 is up to the task, it does seem
       | like now is the right time to be working on these things. Pretty
       | soon GPT-4 will not be the best in the field. The next generation
       | will perform much better.
       | 
       | Possibly the most frustrating thing I find about GPT-4 is how
        | close it gets with its wrong answers. It's easy to dismiss a
       | lesser answer when it responds with a laughably out-of-band idea.
       | GPT-4 often shows that it has a general idea of what you want but
       | misses a small but critical aspect which results in a solution to
       | something else that is similar but not what you wanted.
       | 
        | I have mixed results on iterating on its own mistakes. It will
        | too often try to change the world to match its answer, rather
       | than fixing the answer. The best approach I have found to stop
       | this is by getting it to create unit tests. I imagine there is a
       | lot of training data for it to understand the intention behind
       | fixing a failing test. It's a very specific problem for it to
       | look at and generally changing the test is not considered the
       | correct solution.
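
The unit-test tactic described above can be sketched like this (the
function and names are made up for illustration, not from the article):
a test pins the intent, so the model has a fixed target and "change the
test" is visibly the wrong move.

```python
# Illustrative sketch: pin the intended behavior in a unit test so that
# iterating on a failure has a concrete, unambiguous target.
def slugify(title: str) -> str:
    # hypothetical function the model is asked to write or fix
    return title.strip().lower().replace(" ", "-")

def test_slugify():
    # the test states the intent precisely; when it fails, "fix the
    # function" is the obvious move, not "rewrite the test"
    assert slugify("  Hello World ") == "hello-world"

test_slugify()
```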
        
         | berkes wrote:
         | > Pretty soon GPT-4 will not be the best in the field. The next
         | generation will perform much better.
         | 
         | What makes you believe that progress is linear, or at least a
         | line forever going up?
         | 
         | I keep seeing people predicting rapidly improving AI, based on
         | how rapid it improved over the last x months.
         | 
         | But why is that not an outlier? How do we know we haven't hit a
         | ceiling and stagnating? Isn't progress typically very bumpy and
         | sudden?
        
           | Lerc wrote:
           | >What makes you believe that progress is linear, or at least
           | a line forever going up?
           | 
           | I assume neither of those things. I have however read a lot
           | of the papers published since GPT-4 was trained. There have
           | been a lot of advances since then, so much so that simply
           | saying "a lot" seems to be a massive understatement.
           | 
           | I think it is a reasonable assumption that at least a portion
           | of those advancements would be able to build upon the
           | existing technology of GPT-4 to produce something greater.
           | 
           | I am not assuming discoveries yet to be made. I am
           | considering existing discoveries that have not yet made it
           | into the top level of production.
        
         | withinboredom wrote:
         | Oh man. When it's so close but wrong it's amazing for creative
         | endeavors! For technical ones, it is quite a bad thing. It's
         | like being a Star Wars fan but the AI just wants to talk about
         | Star Trek.
         | 
         | I think this is why the non-tech people see AI as so amazing.
         | For anything human and non-technical, the "almost but not
         | quite" nature is a good thing.
         | 
         | I was using an AI to help me debug a weird thing (mainly
         | summarizing log splats hundreds of lines long) and I eventually
         | got pretty close to identifying the issue when I asked "wtaf is
         | this message. Never seen anything like it." It then went on
         | about how it was offended that I used vulgar language. I had to
         | apologize for saying "wtaf!" Anyway, I found a bug in a linker,
          | so that was fun; thanks AI.
        
       | 65 wrote:
       | Maybe AI developers can make landing pages and basic APIs. But,
       | taking front end as an example, I just don't see how an AI can
       | reproduce exact design specifications and interactivity to the
       | point where it wouldn't just be faster to write the code yourself
       | or search for some human verified snippet that does what you
       | want.
       | 
       | And programmers who do know how to actually write efficient code
       | without AI seem like they'd be even more in demand than those
       | that rely on AI. Skill + knowledge + ability to use existing
       | resources (e.g. StackOverflow, packages, templates), as we do
       | now, are much more predictable and faster than trying to wrangle
       | AI to do exactly what the designer or PM wants.
       | 
       | When the dishwasher was invented, everyone thought the human dish
       | washer would be obsolete. And yet, restaurants still employ dish
       | washers because they are much more efficient and thorough than a
       | dishwashing machine.
        
         | nine_zeros wrote:
         | > When the dishwasher was invented, everyone thought the human
         | dish washer would be obsolete. And yet, restaurants still
         | employ dish washers because they are much more efficient and
         | thorough than a dishwashing machine.
         | 
         | This is a good example of both job destruction and job
         | retention by technology.
         | 
         | Job destruction - the total number of potential hand dishwasher
         | jobs has reduced because the vast majority of commodity
         | dishwashing is machine driven.
         | 
          | Job retention - machine dishwashers just can't produce the
         | quality/dexterity of hand dishwashers.
         | 
         | I feel like generative AI will do the same. It will replace a
         | large number of commodity jobs - editors, translators, copy
          | producers, website designers, app prototypers, paper pushers but
         | it will also reveal the value of skilled producers.
         | 
          | Too risky to let ChatGPT write code for your backend that
         | destroys your production database and crashes your company
         | forever.
        
       | ctoth wrote:
       | One of the things they seem to have figured out is the
       | requirement to at least model a sort of actor-critic architecture
       | with their agents. It helps quite a bit.
       | 
       | They seem to badmouth Aider a tad (not cool) but I do wonder how
       | a full-stack of this + Aider might work? There needs to also be
       | some sort of good test generator involved.
       | 
       | All that said, any time someone actually demonstrates progress on
       | the automated Software Engineer problem and it makes it to HN, I
       | am deeply reminded of the old quote:
       | 
       | "It is difficult to get a man to understand something, when his
       | salary depends on his not understanding it."
       | 
       | Just read through this comments section and check out the pure
       | copium. Yes, ChatGPT can do basic sysadmin tasks with ./configure
       | and make.
       | 
       | Yes it does make sense to work on this now, assuming LLMs will
       | get better, because LLMs have continued to get better on any
       | metric you can imagine.
       | 
       | Finally, yes, AI devs will make landing pages and basic APIs. I
       | didn't realize we were all hardcore world-class 0.01%
       | programmers? I have certainly written a landing page and basic
       | API before, in fact I do that sort of thing a lot more than I
       | write uber1337 hax0r code. You probably do too!
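
The actor-critic agent arrangement mentioned above can be sketched as a
generate/review loop (every name and the stub logic here are
hypothetical; a real version would call an LLM in both roles):

```python
# Hedged sketch of an actor-critic agent loop: one agent drafts, a
# second reviews, and the loop repeats until approval or a retry budget
# runs out. The string-based stubs stand in for LLM calls.
def actor(task, feedback=None):
    # stand-in for an LLM call that drafts code for the task
    draft = f"code for {task}"
    if feedback:
        draft += f" (revised: {feedback})"
    return draft

def critic(draft):
    # stand-in for an LLM call that reviews the draft;
    # returns (approved, feedback)
    approved = "revised" in draft
    return approved, None if approved else "add error handling"

def generate(task, max_rounds=3):
    feedback = None
    for _ in range(max_rounds):
        draft = actor(task, feedback)
        ok, feedback = critic(draft)
        if ok:
            return draft
    return draft  # best effort after the budget is exhausted
```

The retry budget matters in practice: without it, a disagreeing pair of
agents can loop (and bill) indefinitely.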
        
       | gumby wrote:
       | CMake was invented to guarantee that at least some humans would
       | have software jobs.
        
       ___________________________________________________________________
       (page generated 2024-03-03 23:01 UTC)