[HN Gopher] Lessons from Creating a VSCode Extension with GPT-4
___________________________________________________________________
Lessons from Creating a VSCode Extension with GPT-4
Author : kevinslin
Score : 124 points
Date : 2023-05-25 14:42 UTC (8 hours ago)
(HTM) web link (bit.kevinslin.com)
(TXT) w3m dump (bit.kevinslin.com)
| ilrwbwrkhv wrote:
| So basically a waste of time. People are getting it wrong. It
| shouldn't be used for any generation. It should be used for
| compression.
| wahnfrieden wrote:
| Nah
|
| edit: simply check the update... the author managed to do it
| one-shot
| golol wrote:
| I disagree. It is like 80% there.
| lozenge wrote:
| You could pull one of the sample extensions and it would
| actually compile and run without these typos. The actual
| logic of getting the current selection, calculating the
| header length and writing back the new header is only like 10
| LOC.
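|
| For reference, a rough sketch of that logic against the VSCode
| extension API (the command id, regex, and details here are my
| own guesses, not the article's actual code):
|
|     import * as vscode from 'vscode';
|
|     export function activate(ctx: vscode.ExtensionContext) {
|       // Command that bumps the Markdown heading level of the
|       // line under the cursor by adding one more '#'.
|       const cmd = vscode.commands.registerCommand(
|         'demo.incrementHeading', () => {
|           const editor = vscode.window.activeTextEditor;
|           if (!editor) return;
|           const line =
|             editor.document.lineAt(editor.selection.active.line);
|           const match = /^(#+)\s/.exec(line.text);
|           if (!match) return;
|           const start = line.range.start;
|           const range = new vscode.Range(
|             start, start.translate(0, match[1].length));
|           editor.edit(e => e.replace(range, match[1] + '#'));
|         });
|       ctx.subscriptions.push(cmd);
|     }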
| lcnPylGDnU4H9OF wrote:
| "Prompt engineer" didn't exactly make sense to me until a
| coworker (graphic design) talked about the prompts that he'd
| see in Midjourney's discord server. Particularly when he
| mentioned specifications around camera lenses and perspective
| and other things I'm not familiar with. Very specific
| choices, continuing to be added and refined.
|
| Then, seeing what guidance[0] is intended to do (LLM-prompt
| templating), it becomes obvious what people have in mind for
| this. It won't obviate understanding the lower-level code --
| in fact, I expect the value of that skill to increase over
| time as fewer people bother to learn it -- but it will cut
| out the time it takes to scaffold an application.
|
| [0] https://github.com/microsoft/guidance
| ResearchCode wrote:
| If an LLM can learn to create a picture, it's likely it can
| also learn to create a prompt.
| layer8 wrote:
| Yeah, but you know what they say about the last 20%.
| WastingMyTime89 wrote:
| It works better when you use it to generate something sane. My
| takeaway from the article is that to program in TypeScript you
| have to jump through insane hoops.
| swyx wrote:
| very cool to see usage of smol-developer in the wild! have been
| traveling so haven't committed code but have a bunch of plans
| coming to level it up while staying smol. (see issues)
| kevinslin wrote:
| big fan of smol-developer. it provides a great starting point
| for doing complicated things while being simple enough to
| reason about :)
| uludag wrote:
| > To test GPT-4's ability to generate a complex program, ...
|
| I wonder how much the complexity of the various ecosystems we
| find ourselves in contributes to the lack of effectiveness of
| the language model. The task at hand really shouldn't be
| considered complex. Making and registering a command to do this
| in Emacs is essentially:
|
|     (defun inc-heading ()
|       (interactive)
|       (save-excursion
|         (search-backward-regexp "^\\# ")
|         (insert "#")))
|
| No project structure, no dependencies, no tooling: something an
| LLM should have no problem doing.
| [deleted]
| anotherpaulg wrote:
| The author starts out with an excellent observation:
| Lately, I've been playing around with LLMs to write code. I find
| that they're great at generating small self-contained snippets.
| Unfortunately, anything more than that requires a human...
|
| I have been working on this problem quite a bit lately. I put
| together a writeup describing the solution that's been working
| well for me:
|
| https://aider.chat/docs/ctags.html
|
| The problem I am trying to solve is that it's difficult to use
| GPT-4 to modify or extend a large, complex pre-existing codebase.
| To modify such code, GPT needs to understand the dependencies and
| APIs which interconnect its subsystems. Somehow we need to
| provide this "code context" to GPT when we ask it to accomplish a
| coding task. Specifically, we need to:
|
| 1. Help GPT understand the overall codebase, so that it can
| decipher the meaning of code with complex dependencies and
| generate new code that respects and utilizes existing
| abstractions.
|
| 2. Convey all of this "code context" to GPT in an efficient
| manner that fits within the 8k-token context window.
|
| To address these issues, I send GPT a concise map of the whole
| codebase. The map includes all declared variables and functions
| with call signatures. This "repo map" is built automatically
| using ctags and enables GPT to better comprehend, navigate and
| edit code in larger repos.
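|
| For anyone curious what that might look like mechanically, here
| is a rough sketch (my own, not aider's actual code) that
| assumes universal-ctags with JSON output support is on the
| PATH:
|
|     import { execFileSync } from 'node:child_process';
|
|     // Build a compact "file: symbols" map from ctags output.
|     export function repoMap(root: string): string {
|       const out = execFileSync(
|         'ctags',
|         ['-R', '--output-format=json', '--fields=+S',
|          '-f', '-', root],
|         { encoding: 'utf8', maxBuffer: 64 * 1024 * 1024 });
|       const byFile = new Map<string, string[]>();
|       for (const line of out.split('\n')) {
|         if (!line) continue;
|         const tag = JSON.parse(line);
|         if (tag._type !== 'tag') continue;
|         const entry = `  ${tag.name}${tag.signature ?? ''}`;
|         const list = byFile.get(tag.path) ?? [];
|         list.push(entry);
|         byFile.set(tag.path, list);
|       }
|       return [...byFile]
|         .map(([file, syms]) => `${file}:\n${syms.join('\n')}`)
|         .join('\n');
|     }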
|
| The writeup linked above goes into more detail, and provides some
| examples of the actual map that I send to GPT as well as examples
| of how well it can work.
| joenot443 wrote:
| This is a game changer. I've been doing something similar,
| relatively manually. I'll give this a spin and report back.
| anotherpaulg wrote:
| Glad to hear you're going to give it a try. Let me know how
| it goes.
| [deleted]
| freedomben wrote:
| I've been hearing executives and "tech leaders" recently saying
| that 80% of the new code is now written by chatgpt, and that it
| will "10x" a developer, but that sure mismatches with my
| experience. I suspect there will be a lot of managers with much
| higher expectations than is reasonable, which won't be good.
| visarga wrote:
| It's actually 1.2x productivity. Writing code does not take
| most of the day. If GPT can be just as good at debugging as
| it is at writing code, maybe the speedup would increase a
| bit. The ultimate AI speed == human reading + thinking speed,
| not AI generation speed.
| blowski wrote:
| I've found it's very useful for understanding snippets of
| code, like a 200-line function. But systems are more than
| "lots of 200-line functions" - there's a lot of context
| hidden in flags, conditional blocks, data, Git histories.
|
| Maybe one day, we'll be able to run it over all the Git
| histories, Jira tickets, Confluence documentation, Slack
| conversations, emails, meeting transcripts, presentations,
| etc. Until then, the humans will need to stitch it all
| together as best they can.
| fragmede wrote:
| but not all human thinking is worthwhile. I had it do a
| simple chrome extension to let me reply inline on HN, and
| it coughed out a manifest.json that worked first try. I
| didn't have to poke around the Internet to find a reference
| and then debug that via stack overflows. Easily saved me
| half an hour and gave me more mental bandwidth for the
| futzing with the DOM that I did need to do. (to your point
| tho, I didn't try feeding the html to it to see if it could
| do that part for me.)
|
| so it's somewhere between 1.2x and 10x for me, depending on
| what I'm doing that day. Maybe 3x on a good day?
| jjnoakes wrote:
| I don't mean to pick on you specifically, but this kind
| of approach doesn't fit the way I like to work.
|
| For example, just because the manifest.json worked
| doesn't mean it is correct - is it free of issues
| (security or otherwise)?
|
| I would argue that every system in production today
| seemed to "just work" when it was built and initially
| tested, and yet how many serious issues are in the wild
| today (security or otherwise)?
|
| I prefer to take a little more time solving problems,
| gaining an understanding of WHY things are done certain
| ways, and some insight into potential problems that may
| arise in the future even if something seems to work at
| first glance.
|
| Now I get that you are just talking about a small chrome
| extension that maybe you are only using for yourself...
| but scaling that up to anything beyond that seems like a
| ticking time bomb to me.
| sharemywin wrote:
| I feel like you would get more benefit out of GPT. You could
| ask it whether it finds any vulnerabilities, common mistakes,
| or other inconsistencies: "Please provide comments on what each
| line does." "What are some other common ways to write this line
| of code?" Etc.
|
| "What are some ways to handle this XYZ problem? I see you might
| have missed SQL injection attacks. Would that apply here?"
|
| Same goes for code you find on the internet:
|
| "I got this output for this line of code. What do you think the
| problem is?"
| jjnoakes wrote:
| Every time I've tried chatgpt I've been shocked at the
| mistakes. It isn't a good tool to use if you care about
| correctness, and I care about correctness.
|
| It may be able to regurgitate code for simple tasks, but
| that's all I've seen it get right.
| 0xedd wrote:
| Most managers don't care for people like you. Companies
| sell their product. Another successful fiscal year.
| Regardless of the absolute shit codebase, wasteful
| architecture and gaping security vulnerabilities.
| jjnoakes wrote:
| Maybe, but I've always had lucrative jobs and my work has
| always been appreciated. Maybe you just have to find the
| right employer. I think longer term, employers that value
| high-quality work will have the upper hand.
|
| And to be honest, I don't care for managers like that, so
| the feeling is mutual.
| walthamstow wrote:
| 1.2x is a good rough number. Some things it saves me hours
| (regex, writing unfamiliar languages), some things I never
| even bother asking it about.
| drvdevd wrote:
| Using ctags for this purpose is genius. Thanks for posting!
| irrational wrote:
| How much time did it take to prepare all of that for ChatGPT?
| Won't you have to redo all of that work every time you ask for
| more help since code bases are not static? Would it take less
| time and effort to just write the code on your own?
| anotherpaulg wrote:
| Ya, it would be tedious to do all of this manually. I guess
| it wasn't clear, but all of this is part of my open source
| GPT coding tool called "aider".
|
| aider is a command-line chat tool that allows you to write
| and edit code with GPT-4. As you are chatting, aider
| automatically manages all of this context and provides it to
| GPT as part of the chat. It all happens transparently behind
| the scenes.
| fnordpiglet wrote:
| This is awesome, thanks. I wonder if complementing it with a
| vector store of the full repo further assists.
| anotherpaulg wrote:
| Ya, vector search is certainly the most common hammer to
| reach for when you're trying to figure out what subset of
| your full dataset to share with an LLM. And you're right,
| it's probably a piece of the puzzle for coding with GPT
| against a large codebase.
|
| But I think code has such a semantically useful structure
| that we should probably try and exploit that as much as
| possible before falling back to "just search for stuff that
| seems similar".
|
| Check out the "future work" section near the end of the
| writeup I linked above. I have a few possible improvements to
| this basic "repo map" concept that I'm experimenting with
| now.
| fnordpiglet wrote:
| No I agree the distilled map is most useful in context.
| However I wonder if providing a vector store of the total
| code base amplifies the effect. You could also pull in
| vector stores of all dependencies as well. Regardless
| amazing work and looking forward to seeing your future work
| as outlined.
| bredren wrote:
| I did not know what ctags were; here is the explanation from
| Exuberant Ctags:
|
| >Ctags generates an index (or tag) file of language objects
| found in source files that allows these items to be quickly and
| easily located by a text editor or other utility. A tag
| signifies a language object for which an index entry is
| available (or, alternatively, the index entry created for that
| object).
|
| https://ctags.sourceforge.net/whatis.html
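|
| For example, running `ctags -R .` at the root of a project
| writes a plain-text `tags` file; each entry maps a symbol name
| to its file and a search pattern for its definition, roughly
| like this (the symbol and file names here are made up):
|
|     incHeading  src/extension.ts  /^function incHeading(/;"  f
|
| (fields are tab-separated: tag name, file, a search pattern for
| the definition, and the kind -- `f` for function)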
| anotherpaulg wrote:
| Good point, ctags is a bit old school! Most IDEs use Language
| Server Protocol now for similar purposes.
|
| I added a few sentences of explanation and background on
| ctags to my writeup.
| cyrialize wrote:
| I'm a big fan of ctags. My old Emacs setup utilized Ctags
| when all else failed. So if I wanted to find a reference
| for something it would use LSP, then ctags if LSP returned
| nothing.
| esperent wrote:
| * * *
| pacifika wrote:
| Ask it first how to break down the problem and then ask each step
| kevinslin wrote:
| and keep enough shared context between the steps so that there
| is coherence
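|
| A minimal sketch of that pattern in TypeScript (Node 18+; the
| model name, prompts, and fixed three-step loop are
| placeholders, and a real loop would parse the plan it gets
| back):
|
|     type Msg = {
|       role: 'system' | 'user' | 'assistant';
|       content: string;
|     };
|
|     // One call to the chat completions endpoint.
|     async function chat(messages: Msg[]): Promise<string> {
|       const res = await fetch(
|         'https://api.openai.com/v1/chat/completions', {
|           method: 'POST',
|           headers: {
|             Authorization:
|               `Bearer ${process.env.OPENAI_API_KEY}`,
|             'Content-Type': 'application/json',
|           },
|           body: JSON.stringify({ model: 'gpt-4', messages }),
|         });
|       const data = await res.json();
|       return data.choices[0].message.content;
|     }
|
|     async function run(task: string) {
|       // One running message list, so every step sees the
|       // breakdown and all earlier answers.
|       const messages: Msg[] = [{
|         role: 'user',
|         content: `Break this task into numbered steps: ${task}`,
|       }];
|       const plan = await chat(messages);
|       messages.push({ role: 'assistant', content: plan });
|       for (let step = 1; step <= 3; step++) {
|         messages.push({
|           role: 'user',
|           content: `Now do step ${step} of your plan.`,
|         });
|         const answer = await chat(messages);
|         messages.push({ role: 'assistant', content: answer });
|       }
|     }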
| kevinslin wrote:
| Author here. Happy to answer any questions or comments.
|
| As a teaser for a followup: I've been able to create a version of
| this that is able to generate an extension in one shot without
| any issues. This is done by giving GPT more context on the
| expected project layout as well as a checklist of constraints in
| order to create a valid app. It's not hard to envision a
| future where all software projects can be scaffolded by an LLM.
| operator-name wrote:
| Looking forward to the follow-up. I assume you set up a more
| capable GPT agent that can execute commands?
| kevinslin wrote:
| same agent. just more constraints around the domain (of
| creating a vscode extension)
| mftb wrote:
| I appreciate your write-up; it was well done. How long did you
| spend on this? I'd be most interested to know how long, not
| counting writing the blog post.
| weaksauce wrote:
| they started the project on the 13th per the git commits. so
| on the order of two weeks, it seems.
| mftb wrote:
| Ah, good eye, ty.
| doodlesdev wrote:
| Tangential, but which model did you use to generate the picture
| at the start of the article?
| kevinslin wrote:
| dalle2 with the following prompt: Humanoid robots working on
| putting chips together in a factory, digital art
| rpastuszak wrote:
| Random plug, but I made a front end for DALL-E 2 that is
| 6-10x cheaper than using their site: https://dall-e.sonnet.io/
| kevinslin wrote:
| nice! will check it out
___________________________________________________________________
(page generated 2023-05-25 23:00 UTC)