[HN Gopher] Lessons from Creating a VSCode Extension with GPT-4
       ___________________________________________________________________
        
       Lessons from Creating a VSCode Extension with GPT-4
        
       Author : kevinslin
       Score  : 124 points
       Date   : 2023-05-25 14:42 UTC (8 hours ago)
        
 (HTM) web link (bit.kevinslin.com)
 (TXT) w3m dump (bit.kevinslin.com)
        
       | ilrwbwrkhv wrote:
       | So basically a waste of time. People are getting it wrong. It
       | shouldn't be used for any generation. It should be used for
       | compression.
        
         | wahnfrieden wrote:
         | Nah
         | 
          | edit: simply check the update... the author managed to do it
          | in one shot
        
         | golol wrote:
         | I disagree. It is like 80% there.
        
           | lozenge wrote:
           | You could pull one of the sample extensions and it would
           | actually compile and run without these typos. The actual
           | logic of getting the current selection, calculating the
           | header length and writing back the new header is only like 10
           | LOC.
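            | 
            | For illustration, here is a minimal sketch of what that
            | handler could look like (the command name and exact VS
            | Code API calls here are my own guesses, not the
            | extension's actual code):
            | 
            |   import * as vscode from 'vscode';
            | 
            |   export function activate(ctx: vscode.ExtensionContext) {
            |     ctx.subscriptions.push(
            |       vscode.commands.registerCommand('incHeader', () => {
            |         const editor = vscode.window.activeTextEditor;
            |         if (!editor) return;
            |         // Look at the line the cursor is on.
            |         const line = editor.document.lineAt(
            |           editor.selection.active.line
            |         );
            |         // Only act on markdown headers ("#", "##", ...).
            |         if (!/^#+\s/.test(line.text)) return;
            |         // Write back the header with one more '#'.
            |         editor.edit(e => e.insert(line.range.start, '#'));
            |       })
            |     );
            |   }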
        
           | lcnPylGDnU4H9OF wrote:
           | "Prompt engineer" didn't exactly make sense to me until a
           | coworker (graphic design) talked about the prompts that he'd
           | see in Midjourney's discord server. Particularly when he
           | mentioned specifications around camera lenses and perspective
           | and other things I'm not familiar with. Very specific
           | choices, continuing to be added and refined.
           | 
           | Then seeing what guidance[0] is intended to do (LLM-prompt
           | templating) it becomes obvious what people have in mind for
           | this. It won't obviate understanding the lower-level code --
           | in fact, I expect the value of that skill to increase over
           | time as fewer people bother to learn it -- but it will cut
           | out the time it takes to scaffold an application.
           | 
           | [0] https://github.com/microsoft/guidance
        
             | ResearchCode wrote:
              | If an LLM can learn to create a picture, it's likely it
              | can also learn to create a prompt.
        
           | layer8 wrote:
           | Yeah, but you know what they say about the last 20%.
        
         | WastingMyTime89 wrote:
          | It works better when you use it to generate something sane. My
          | takeaway from the article is that to program in TypeScript you
          | have to jump through insane hoops.
        
       | swyx wrote:
        | very cool to see usage of smol-developer in the wild! have been
        | traveling so haven't committed code but have a bunch of plans
        | coming to level it up while staying smol. (see issues)
        
         | kevinslin wrote:
         | big fan of smol-developer. it provides a great starting point
         | for doing complicated things while being simple enough to
         | reason about :)
        
       | uludag wrote:
       | > To test GPT-4's ability to generate a complex program, ...
       | 
        | I wonder how much the complexity of the various ecosystems we
        | find ourselves in contributes to the lack of effectiveness of
        | the language model. The task at hand really shouldn't be
        | considered complex. Making and registering a command to do this
        | in Emacs is essentially (defun inc-heading () (interactive)
        | (save-excursion (search-backward-regexp "^#+ ") (insert "#"))).
        | No project structure, no dependencies, no tooling: something an
        | LLM should have no problem doing.
        
       | [deleted]
        
       | anotherpaulg wrote:
        | The author starts out with an excellent observation:
        | 
        | > Lately, I've been playing around with LLMs to write code. I
        | find that they're great at generating small self-contained
        | snippets. Unfortunately, anything more than that requires a
        | human...
       | 
       | I have been working on this problem quite a bit lately. I put
       | together a writeup describing the solution that's been working
       | well for me:
       | 
       | https://aider.chat/docs/ctags.html
       | 
       | The problem I am trying to solve is that it's difficult to use
       | GPT-4 to modify or extend a large, complex pre-existing codebase.
       | To modify such code, GPT needs to understand the dependencies and
       | APIs which interconnect its subsystems. Somehow we need to
       | provide this "code context" to GPT when we ask it to accomplish a
       | coding task. Specifically, we need to:
       | 
        | 1. Help GPT understand the overall codebase, so that it can
        | decipher the meaning of code with complex dependencies and
        | generate new code that respects and utilizes existing
        | abstractions.
       | 
       | 2. Convey all of this "code context" to GPT in an efficient
       | manner that fits within the 8k-token context window.
       | 
       | To address these issues, I send GPT a concise map of the whole
       | codebase. The map includes all declared variables and functions
       | with call signatures. This "repo map" is built automatically
       | using ctags and enables GPT to better comprehend, navigate and
       | edit code in larger repos.
       | 
       | The writeup linked above goes into more detail, and provides some
       | examples of the actual map that I send to GPT as well as examples
       | of how well it can work.
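        | 
        | As a rough sketch of the idea (this assumes universal-ctags
        | with JSON output and is purely illustrative -- it is not
        | aider's actual implementation, and the field names are my
        | best recollection of the ctags JSON format):
        | 
        |   import { execFileSync } from 'child_process';
        | 
        |   // Build a compact "file: kind name signature" map from
        |   // universal-ctags JSON output (one JSON object per line).
        |   function buildRepoMap(repoDir: string): string {
        |     const args = [
        |       '-R', '--output-format=json', '--fields=+nS',
        |       '-f', '-', repoDir,
        |     ];
        |     const out = execFileSync('ctags', args, {
        |       encoding: 'utf8',
        |       maxBuffer: 64 * 1024 * 1024,
        |     });
        |     const byFile = new Map<string, string[]>();
        |     for (const line of out.split('\n')) {
        |       if (!line.trim()) continue;
        |       const tag = JSON.parse(line);
        |       if (tag._type !== 'tag') continue;
        |       const sig = tag.signature ?? '';
        |       const entry = `  ${tag.kind} ${tag.name}${sig}`;
        |       const list = byFile.get(tag.path) ?? [];
        |       list.push(entry);
        |       byFile.set(tag.path, list);
        |     }
        |     return [...byFile.entries()]
        |       .map(([file, tags]) => `${file}:\n${tags.join('\n')}`)
        |       .join('\n');
        |   }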
        
         | joenot443 wrote:
         | This is a game changer. I've been doing something similar,
         | relatively manually. I'll give this a spin and report back.
        
           | anotherpaulg wrote:
           | Glad to hear you're going to give it a try. Let me know how
           | it goes.
        
         | [deleted]
        
         | freedomben wrote:
          | I've been hearing executives and "tech leaders" recently saying
          | that 80% of the new code is now written by ChatGPT, and that it
          | will "10x" a developer, but that sure doesn't match my
          | experience. I suspect there will be a lot of managers with much
          | higher expectations than is reasonable, which won't be good.
        
           | visarga wrote:
           | It's actually 1.2x productivity. Writing code does not take
           | most of the day. If GPT can be just as good at debugging as
           | it is at writing code, maybe the speedup would increase a
           | bit. The ultimate AI speed == human reading + thinking speed,
           | not AI generation speed.
        
             | blowski wrote:
             | I've found it's very useful for understanding snippets of
             | code, like a 200-line function. But systems are more than
             | "lots of 200-line functions" - there's a lot of context
             | hidden in flags, conditional blocks, data, Git histories.
             | 
             | Maybe one day, we'll be able to run it over all the Git
             | histories, Jira tickets, Confluence documentation, Slack
             | conversations, emails, meeting transcripts, presentations,
             | etc. Until then, the humans will need to stitch it all
             | together as best they can.
        
             | fragmede wrote:
              | But not all human thinking is worthwhile. I had it do a
              | simple Chrome extension to let me reply inline on HN, and
              | it coughed out a manifest.json that worked on the first
              | try. I didn't have to poke around the Internet to find a
              | reference and then debug it via Stack Overflow. It easily
              | saved me half an hour and gave me more mental bandwidth for
              | the futzing with the DOM that I did need to do. (To your
              | point though, I didn't try feeding the HTML to it to see if
              | it could do that part for me.)
              | 
              | So it's somewhere between 1.2x and 10x for me, depending on
              | what I'm doing that day. Maybe 3x on a good day?
        
               | jjnoakes wrote:
               | I don't mean to pick on you specifically, but this kind
               | of approach doesn't fit the way I like to work.
               | 
               | For example, just because the manifest.json worked
               | doesn't mean it is correct - is it free of issues
               | (security or otherwise)?
               | 
               | I would argue that every system in production today
               | seemed to "just work" when it was built and initially
               | tested, and yet how many serious issues are in the wild
               | today (security or otherwise)?
               | 
               | I prefer to take a little more time solving problems,
               | gaining an understanding of WHY things are done certain
               | ways, and some insight into potential problems that may
               | arise in the future even if something seems to work at
               | first glance.
               | 
               | Now I get that you are just talking about a small chrome
               | extension that maybe you are only using for yourself...
               | but scaling that up to anything beyond that seems like a
               | ticking time bomb to me.
        
               | sharemywin wrote:
                | I feel like you would get more benefit out of GPT. You
                | could ask it whether it finds any vulnerabilities, common
                | mistakes, or other inconsistencies, or prompt it with
                | things like "please provide comments on what each line
                | does" or "what are some other common ways to write this
                | line of code?"
                | 
                | Or: "What are some ways to handle this XYZ problem? I see
                | you might have missed SQL injection attacks; would that
                | apply here?"
                | 
                | The same goes for code you find on the internet.
                | 
                | "I got this output for this line of code; what do you
                | think the problem is?"
        
               | jjnoakes wrote:
               | Every time I've tried chatgpt I've been shocked at the
               | mistakes. It isn't a good tool to use if you care about
               | correctness, and I care about correctness.
               | 
               | It may be able to regurgitate code for simple tasks, but
               | that's all I've seen it get right.
        
               | 0xedd wrote:
                | Most managers don't care for people like you. Companies
                | sell their product. Another successful fiscal year.
                | Regardless of the absolute shit code base, wasteful
                | architecture, and gaping security vulnerabilities.
        
               | jjnoakes wrote:
               | Maybe, but I've always had lucrative jobs and my work has
               | always been appreciated. Maybe you just have to find the
                | right employer. I think that, longer-term, employers
                | that value high-quality work will have the upper hand.
               | 
               | And to be honest, I don't care for managers like that, so
               | the feeling is mutual.
        
             | walthamstow wrote:
              | 1.2x is a good rough number. For some things it saves me
              | hours (regex, writing unfamiliar languages); other things I
              | never even bother asking it about.
        
         | drvdevd wrote:
         | Using ctags for this purpose is genius. Thanks for posting!
        
         | irrational wrote:
         | How much time did it take to prepare all of that for ChatGPT?
         | Won't you have to redo all of that work every time you ask for
         | more help since code bases are not static? Would it take less
         | time and effort to just write the code on your own?
        
           | anotherpaulg wrote:
           | Ya, it would be tedious to do all of this manually. I guess
           | it wasn't clear, but all of this is part of my open source
           | GPT coding tool called "aider".
           | 
           | aider is a command-line chat tool that allows you to write
           | and edit code with GPT-4. As you are chatting, aider
           | automatically manages all of this context and provides it to
           | GPT as part of the chat. It all happens transparently behind
           | the scenes.
        
         | fnordpiglet wrote:
         | This is awesome, thanks. I wonder if complementing it with a
         | vector store of the full repo further assists.
        
           | anotherpaulg wrote:
           | Ya, vector search is certainly the most common hammer to
           | reach for when you're trying to figure out what subset of
           | your full dataset to share with an LLM. And you're right,
           | it's probably a piece of the puzzle for coding with GPT
           | against a large codebase.
           | 
           | But I think code has such a semantically useful structure
           | that we should probably try and exploit that as much as
           | possible before falling back to "just search for stuff that
           | seems similar".
           | 
           | Check out the "future work" section near the end of the
           | writeup I linked above. I have a few possible improvements to
           | this basic "repo map" concept that I'm experimenting with
           | now.
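            | 
            | For anyone curious, a bare-bones version of that "search
            | for stuff that seems similar" fallback could look roughly
            | like this (the endpoint, model name, and response shape
            | are my assumptions about OpenAI's embeddings API, and
            | none of this is what aider actually does):
            | 
            |   // Rank code chunks against a query by embedding
            |   // similarity (cosine similarity over OpenAI embeddings).
            |   async function embed(texts: string[]) {
            |     const url = 'https://api.openai.com/v1/embeddings';
            |     const key = process.env.OPENAI_API_KEY;
            |     const res = await fetch(url, {
            |       method: 'POST',
            |       headers: {
            |         Authorization: `Bearer ${key}`,
            |         'Content-Type': 'application/json',
            |       },
            |       body: JSON.stringify({
            |         model: 'text-embedding-ada-002',
            |         input: texts,
            |       }),
            |     });
            |     const json = await res.json();
            |     return json.data.map((d: any) => d.embedding);
            |   }
            | 
            |   function cosine(a: number[], b: number[]): number {
            |     let dot = 0, na = 0, nb = 0;
            |     for (let i = 0; i < a.length; i++) {
            |       dot += a[i] * b[i];
            |       na += a[i] * a[i];
            |       nb += b[i] * b[i];
            |     }
            |     return dot / (Math.sqrt(na) * Math.sqrt(nb));
            |   }
            | 
            |   // Return the k chunks most similar to the query.
            |   async function topChunks(
            |     query: string, chunks: string[], k = 5
            |   ) {
            |     const [q, ...vecs] = await embed([query, ...chunks]);
            |     return chunks
            |       .map((text, i) => ({ text, score: cosine(q, vecs[i]) }))
            |       .sort((a, b) => b.score - a.score)
            |       .slice(0, k);
            |   }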
        
             | fnordpiglet wrote:
              | No, I agree the distilled map is most useful in context.
              | However, I wonder if providing a vector store of the total
              | code base amplifies the effect. You could also pull in
              | vector stores of all dependencies as well. Regardless,
              | amazing work, and I'm looking forward to seeing your future
              | work as outlined.
        
         | bredren wrote:
          | I did not know what ctags was; here is the explanation from
          | Exuberant Ctags:
         | 
         | >Ctags generates an index (or tag) file of language objects
         | found in source files that allows these items to be quickly and
         | easily located by a text editor or other utility. A tag
         | signifies a language object for which an index entry is
         | available (or, alternatively, the index entry created for that
         | object).
         | 
         | https://ctags.sourceforge.net/whatis.html
        
           | anotherpaulg wrote:
           | Good point, ctags is a bit old school! Most IDEs use Language
           | Server Protocol now for similar purposes.
           | 
           | I added a few sentences of explanation and background on
           | ctags to my writeup.
        
             | cyrialize wrote:
              | I'm a big fan of ctags. My old Emacs setup used ctags when
              | all else failed: if I wanted to find a reference for
              | something, it would use LSP first, then ctags if LSP
              | returned nothing.
        
       | esperent wrote:
       | * * *
        
       | pacifika wrote:
        | Ask it first how to break down the problem, and then ask it to
        | do each step.
        
         | kevinslin wrote:
         | and keep enough shared context between the steps so that there
         | is coherence
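          | 
          | Roughly, that loop could look like the sketch below (the
          | model name, prompts, and endpoint are placeholders of mine,
          | not what the post actually used):
          | 
          |   type Msg = {
          |     role: 'system' | 'user' | 'assistant';
          |     content: string;
          |   };
          | 
          |   // One chat-completion call; the running message history
          |   // is the shared context that keeps the steps coherent.
          |   async function chat(history: Msg[]): Promise<string> {
          |     const url = 'https://api.openai.com/v1/chat/completions';
          |     const res = await fetch(url, {
          |       method: 'POST',
          |       headers: {
          |         Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
          |         'Content-Type': 'application/json',
          |       },
          |       body: JSON.stringify({
          |         model: 'gpt-4',
          |         messages: history,
          |       }),
          |     });
          |     const json = await res.json();
          |     return json.choices[0].message.content;
          |   }
          | 
          |   async function buildWithPlan(task: string): Promise<Msg[]> {
          |     const history: Msg[] = [
          |       {
          |         role: 'system',
          |         content: 'You are a careful coding assistant.',
          |       },
          |       {
          |         role: 'user',
          |         content: `Break this into numbered steps:\n${task}`,
          |       },
          |     ];
          |     const plan = await chat(history);
          |     history.push({ role: 'assistant', content: plan });
          | 
          |     // Ask for each step separately, carrying the full
          |     // history forward so every step sees what came before.
          |     const steps = plan
          |       .split('\n')
          |       .filter((s) => /^\d+\./.test(s.trim()));
          |     for (const step of steps) {
          |       history.push({
          |         role: 'user',
          |         content: `Now implement step: ${step}`,
          |       });
          |       const answer = await chat(history);
          |       history.push({ role: 'assistant', content: answer });
          |     }
          |     return history;
          |   }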
        
       | kevinslin wrote:
       | Author here. Happy to answer any questions or comments.
       | 
        | As a teaser for a follow-up: I've been able to create a version
        | of this that is able to generate an extension in one shot without
        | any issues. This is done by giving GPT more context on the
        | expected project layout as well as a checklist of constraints in
        | order to create a valid app. It's not hard to envision a future
        | where all software projects can be scaffolded by an LLM.
        
         | operator-name wrote:
          | Looking forward to the follow-up. I assume you set up a more
          | capable GPT agent that can execute commands?
        
           | kevinslin wrote:
            | Same agent, just more constraints around the domain (of
            | creating a VSCode extension).
        
         | mftb wrote:
          | I appreciate your write-up; it was well done. How long did you
         | spend on this? I'd be most interested to know how long, not
         | counting writing the blog post.
        
           | weaksauce wrote:
            | They started the project on the 13th, per the git commits,
            | so it seems to be on the order of two weeks.
        
             | mftb wrote:
             | Ah, good eye, ty.
        
         | doodlesdev wrote:
         | Tangential, but which model did you use to generate the picture
         | at the start of the article?
        
           | kevinslin wrote:
            | DALL-E 2, with the following prompt: "Humanoid robots working
            | on putting chips together in a factory, digital art"
        
             | rpastuszak wrote:
              | Random plug, but I made a front end for DALL-E 2 that is
              | 6-10x cheaper than using their site:
              | https://dall-e.sonnet.io/
        
               | kevinslin wrote:
               | nice! will check it out
        
       ___________________________________________________________________
       (page generated 2023-05-25 23:00 UTC)