[HN Gopher] Show HN: We are building an open-source IDE powered ...
___________________________________________________________________
Show HN: We are building an open-source IDE powered by AI
Author : mlejva
Score : 280 points
Date : 2023-04-04 14:50 UTC (8 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| cube2222 wrote:
| Based on my experiences having spent a bunch of time
| experimenting with LLMs for writing code, I think the hardest
| part that I haven't yet seen a solution to is modifying existing
| code bases.
|
| Sure, if you're doing greenfield, just ask it to write a new
| file. If you only have a simple script, you can ask it to rewrite
| the whole file. The tricky bit however, is figuring out how to
| edit relevant parts of the codebase, keeping in mind that the
| context window of LLMs is very limited.
|
| You basically want it to navigate to the right part of the
| codebase, then scope it to just part of a file that is relevant,
| and then let it rewrite just that small bit. I'm not saying it's
| impossible (maybe with proper indexing with embeddings, and
| splitting files up by e.g. functions you can make it work), but I
| think it's very non-trivial.
|
| Anyway, good luck! I hope you'll share your learnings once you
| figure this out. I think the idea of putting LLMs into
| Firecracker VMs to contain them is a very cool approach.
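|
| Edit: to sketch the embeddings idea concretely (embed() is a
| stand-in for whatever embedding API you use; in practice you'd
| index the chunk embeddings up front rather than re-embedding per
| query):
|
|     import ast
|
|     def split_into_functions(path):
|         # Parse a Python file, one source chunk per function.
|         source = open(path).read()
|         tree = ast.parse(source)
|         return [ast.get_source_segment(source, node)
|                 for node in ast.walk(tree)
|                 if isinstance(node, (ast.FunctionDef,
|                                      ast.AsyncFunctionDef))]
|
|     def cosine(a, b):
|         dot = sum(x * y for x, y in zip(a, b))
|         na = sum(x * x for x in a) ** 0.5
|         nb = sum(y * y for y in b) ** 0.5
|         return dot / ((na * nb) or 1.0)  # guard zero vectors
|
|     def most_relevant(chunks, query, embed, top_k=3):
|         # Score each chunk against the edit request and keep
|         # only the top_k to fit the context window.
|         query_vec = embed(query)
|         scored = [(cosine(embed(c), query_vec), c) for c in chunks]
|         return [c for _, c in sorted(scored, reverse=True)[:top_k]]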
| chatmasta wrote:
| I think this will get easier as the supported context sizes get
| larger. Of course the tooling needs to take care of most of the
| drudgery, but I'm not sure there's any underlying limitation of
| an LLM that makes refactoring existing code any different from
| creating new code. It's just a matter of feeding it the layout
| of the repo and the code itself in a way that the LLM can
| retrieve and focus on the relevant files when you ask it for
| help with a task.
| cube2222 wrote:
| Oh yeah, I agree with the refactoring bit. As I said,
| rewriting a file (section) works great if your file fits into
| the context window.
|
| But context windows are far from being large enough to fit
| entire repos, or even entire files (if they're big). I'm not
| sure how hard it is to scale up the context window beyond the
| current early access of OpenAI GPT-4.
|
| > in a way that the LLM can retrieve and focus on the
| relevant files
|
| This I think is something we haven't really figured out yet,
| esp. if a feature requires working on multiple files. I
| wouldn't be surprised if approaches based on the semantic
| level (actually understanding the code and the relationships
| between its parts; not the textual representation of it) end
| up being needed here.
| chatmasta wrote:
| I agree retrieval will need to be aligned with semantic
| layout of the codebase. But that should be pretty
| straightforward, given the number of static analysis and
| refactoring tools we already have available to us and that
| we use daily as part of our IDE workflows.
|
| This also implies that the first codebases to really
| benefit from LLM collaboration will be those written in
| strongly typed languages which are already amenable to
| static analysis.
|
| And in terms of context windows, it's not like humans keep
| the entire codebase in their head at all times either. As a
| developer, when I'm focused on a single task, I'm only ever
| switching between a handful of files. And that's by design;
| we build our codebases using abstractions that are
| understandable to humans, given our limited context window,
| with its well-known limit of about seven simultaneous
| registers. So if anything, perhaps the risk of introducing
| an LLM to a codebase is that it could create abstractions
| that are _more_ complicated than what a human would prefer
| to read and maintain.
| M4v3R wrote:
| This is actually possible, I made a proof of concept agent that
| does exactly this. The agent has access to several commands
| that get executed on demand - ListFiles, ReadFile and
| WriteFile. You give it a prompt - for example "add a heading to
| the homepage and make it yellow" and it will use ListFiles to
| locate the correct file, then use ReadFile to read it and then
| finally use WriteFile to write the new contents of that file.
|
| It was just a PoC so it wasn't bulletproof and I only ever made
| it work on a small Next.js codebase (with every file being
| intentionally small so it fits the context window) but it did
| work correctly.
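|
| Edit: conceptually the loop was something like this (simplified
| sketch; ask_llm() stands in for the actual model call, and the
| JSON action format is just one way to do it):
|
|     import json, os
|
|     def run_command(cmd, arg=None, content=None):
|         # The three tool commands the agent can invoke.
|         if cmd == "ListFiles":
|             return "\n".join(os.listdir(arg or "."))
|         if cmd == "ReadFile":
|             return open(arg).read()
|         if cmd == "WriteFile":
|             open(arg, "w").write(content)
|             return "ok"
|         return "unknown command: " + cmd
|
|     def agent_loop(task, ask_llm, max_steps=10):
|         # ask_llm() takes the transcript so far and returns a
|         # JSON action such as:
|         #   {"cmd": "ReadFile", "arg": "pages/index.js"}
|         transcript = ["Task: " + task]
|         for _ in range(max_steps):
|             action = json.loads(ask_llm("\n".join(transcript)))
|             if action["cmd"] == "Done":
|                 break
|             result = run_command(action["cmd"], action.get("arg"),
|                                  action.get("content"))
|             transcript.append(action["cmd"] + " -> " + result)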
| euroderf wrote:
| Could I feed this a decent set of software requirements and get
| back something that reflects them ?
| mlejva wrote:
| Yes. That's one of the goals with e2b
| khana wrote:
| [dead]
| throwaway426079 wrote:
| As an analyst, could I write an analysis plan and have this tool
| implement the plan in, say python pandas?
| Thiebaut wrote:
| It's a very interesting idea. I haven't tested the app. Maybe an
| integration with VS Code?
| mlejva wrote:
| I don't think the integration with vscode would work well here.
| We're ditching a lot of existing workflows for a completely new
| interaction between the developer and the agent. It's not like
| copilot at all. E2b is more like having a virtual developer at
| your disposal 24/7. You write a technical spec and e2b builds
| the app for you.
| nico wrote:
| Definitely would be great to use it directly in vscode
| [deleted]
| qaq wrote:
| So what profession is everyone planning to switch to once this
| takes off?
| exhaze wrote:
| Aren't most devs working on existing apps?
|
| Don't most devs tend to be extremely sticky to their preferred
| dev env (last big migration was ST -> VS Code back in 2015-2017)
|
| Read all the comments in here. Still not getting why this isn't a
| VS Code plugin.
|
| Distribution almost always beats product.
| px43 wrote:
| Maybe to spite Microsoft? I would love to completely remove
| Microsoft from my workflow. The way telemetry creeps into
| everything bugs the hell out of me. I also really don't like
| the security model of vscode where it's assumed that any code
| you look at comes from a trusted source, and might execute code
| on your local system. That's a ridiculous security assumption
| for a text editor, but not surprising considering that these
| people also made MS Office.
| nullandvoid wrote:
| I dislike telemetry as much as the next guy, but it is a
| little funny to complain about it on the post for this
| editor, which instead sends your entire codebase to OpenAI.
|
| Also it's a somewhat recent update, but VSCode asks you if
| you trust the code author now when you open a project
| https://code.visualstudio.com/docs/editor/workspace-trust
| exhaze wrote:
| When I run VS Code on Mac and open some new non trusted
| directory it literally forces me to click "yes" in a dialog
| that says something like "do you trust stuff in this
| directory to be able to run code?" Is this a Mac only thing
| and maybe you use windows or linux? Cuz I found your security
| concern addressed to an almost obnoxious extent...
| Kwpolska wrote:
| This "trust" thing is on all platforms. I turn it off as
| soon as it shows up, since it's pointless IMO.
| sjm wrote:
| IMO Emacs is a perfect candidate for this kind of thing, or maybe
| something akin to LSP so you can bring your own editor. New GPT
| extensions are coming out daily for Emacs, e.g.
| https://github.com/xenodium/chatgpt-shell
| mfts0 wrote:
| Huge fan since Devbook and really happy you found a killer
| feature of Devbook's tooling combined with AI
| mlejva wrote:
| Thank you for the kind words:)
| maxilevi wrote:
| Do you use it for developing?
| celeritascelery wrote:
| This looks quite interesting. So I have question for these AI
| powered editors: what advantage would a dedicated editor like
| this have over just using an AI plugin for VsCode? How do you
| fundamentally build the editor differently if you are thinking
| about AI from the ground up?
| swyx wrote:
| i mean the obvious answer is just being able to do things that
| the VS Code extension api does not let you do.
|
| per the end of the readme:
|
| - one step to go from a spec to writing code, running it,
| debugging itself, installing packages, and deploying the code
|
| - creates "ephemeral UI" on demand
| Haga wrote:
| [dead]
| rco8786 wrote:
| > How do you fundamentally build the editor differently if you
| are thinking about AI from the ground up?
|
| Great question. I would love to hear the devs thoughts here.
| This is one of those questions where my intuition tells me
| there may be a really great "first principles" type of answer,
| but I don't know it myself.
| ValentaTomas wrote:
| In essence, when working with code stops being the main thing
| you do (you abstract that away) and you start managing the
| agents working on the code and writing the spec, you need new
| tools and a new working environment to support that.
| namaria wrote:
| So you lose precision and control?
| Swizec wrote:
| > and start managing the agents working on code ... you need
| new tools
|
| Jira?
|
| Only slightly joking. It really sounds like we're moving in
| the direction of engineers being a more precise technical
| version of a PM, but then engineers could just learn to speak
| business and we don't need PMs.
| TallGuyShort wrote:
| But can the AI, as Office Space said, "talk to the goddamn
| customers so the engineers don't have to"?
| toxicFork wrote:
| I think soon yes
| mlejva wrote:
| Hey! Thanks for the question.
|
| Our editor isn't a regular coding editor. You don't actually
| write code with e2b. You write technical specs and then
| collaborate with an AI agent. Imagine it more like having a
| virtual developer at your disposal 24/7. It's built on
| completely new paradigms enabled by LLMs. This unlocks a lot
| of new use cases and features, but at the same time there's a
| lot that needs to be built.
| mayank wrote:
| > You don't actually write code with e2b. You write technical
| specs and then collaborate with an AI agent.
|
| If I want to change 1 character of a generated source file,
| can I just go do that or will I have to figure out how to
| prompt the change in natural language?
| arthurcolle wrote:
| I'm sure there would be a way to edit the artifacts...
| otherwise this would be a constant exercise in frustration!
| mypalmike wrote:
| So, not too different from typical real world coding
| tasks today.
| walleeee wrote:
| I'm not sure I follow this answer. What are the entirely new
| paradigms? Writing is still the initial step. If text editing
| remains a core part of the workflow, why restrict the user's
| ability to edit code?
| eikenberry wrote:
| Writing technical specs is a fancy way to say coding. This
| reads to me like you're writing a new programming language
| tightly integrated with an IDE that targets a new AI-based
| compiler.
| spyder wrote:
| Hmm... Where is the new language in this? The specs are just
| human language and some JSON for defining structures. It's
| more that human language is becoming a programming language
| with the help of AI.
| xwdv wrote:
| You're right, it is just English language.
|
| And over time, people will discover some basic phrases
| and keywords they can use to get certain results from the
| AI, and find out what sentence structures work best for
| getting a desired outcome. These will become standard
| practices and get passed along to other prompt engineers.
| And eventually they will give the standard a name, so
| they don't have to explain it every time. Maybe even
| version schemes, so humans can collaborate amongst
| themselves more effectively. And then some entrepreneurs
| will sell you courses and boot camps on how to speak to
| AI effectively. And jobs will start asking for applicants
| to have knowledge of this skill set, with years of
| experience.
| namaria wrote:
| >And over time, people will discover some basic phrases
| and keywords they can use to get certain results from the
| AI, and find out what sentence structures work best for
| getting a desired outcome.
|
| This just sounds like a language that is hard to learn,
| undocumented, and hard to debug
| xwdv wrote:
| Yes, but think of the savings!
| akomtu wrote:
| Until one day a new LLM gets released, GPT5, that doesn't
| recognize any of those special words. Mastering prompt-
| speak is essentially mastering undefined behaviors of C
| compilers.
| xwdv wrote:
| What you could do is just continue speaking to GPT4 and
| have it output prompts compatible with GPT5. Pretty
| simple.
| smrtinsert wrote:
| prompts results aren't static and will definitely not be
| guaranteed over time...
| SoftTalker wrote:
| Yes, as always the essential complexity of software is
| understanding and describing the problem you are trying to
| solve. Once that is done well, the code falls out fairly
| easily.
| eightysixfour wrote:
| That's like saying that a painting, once the painter
| understands what they are trying to paint, just falls out of
| them. It is true but not useful for non-painters.
| BobbyJo wrote:
| I think the difference here is that code is effectively a
| description, so there is an extremely tight coupling
| between describing the task and the task itself.
|
| You could tell me, in the most painstaking detail, what
| you want me to paint, and I still couldn't paint it. You
| can take any random person on the street and tell them
| exactly what to type and they'd be able to "program".
| eikenberry wrote:
| That's just picking nits with the metaphor. Change it to
| a poet or a novelist and it works the same. If you tell a
| person exactly what to write they are just a fancy
| typewriter, not a poet or novelist. Same with code.
| eterps wrote:
| That's a brilliant and witty way to put it :D
| stinkbutt wrote:
| by technical specs you mean some kind of prompt mechanism,
| no?
| campbel wrote:
| I built this https://github.com/campbel/aieditor to test the
| idea of programming directly with the AI in control. Long story
| short, VS Code plugin is better IMO.
| barbariangrunge wrote:
| If you could use it without submitting data to some ai company,
| or if it came with a non-disgusting terms of service, that
| would be a killer feature for me.
|
| For example, the last ai company installer I just clicked
| "decline" to (a few minutes ago) says that you give it
| permission to download malware, including viruses and trojans,
| onto your computer and that you agree to pay the company if you
| infect other people and tarnish their reputation because of it.
| Literally. It was a very popular service too. I didn't even get
| to the IP section
|
| edit: those terms aren't on their website, so I can't link to
| them. They are hidden in that tiny, impossible to read box
| during setup for the desktop installer
| UberFly wrote:
| Sounds like fantastic software. Would be nice to know who
| you're referencing here.
| stinkbutt wrote:
| can you name this company please?
| prepend wrote:
| I suspect that if the goal is to make money, you want your own
| ide and don't want to rely on vscode.
| wruza wrote:
| I find it surprising that many developers find using ai prompts
| easier than an actual programming language (not the first time I
| hear about it, but now it seems to be serious).
|
| When I tried to "talk" to gpts, it was always hard to formulate
| my requests. Basically, I want to code. To say something
| structured, rigorous, not vague and blah-blah-ey. To check and
| understand every step*. Idk why I feel like this, maybe for me
| the next ten years will be rougher than planned. Maybe because
| I'm not a manager type at all? Trust issues? I wonder if someone
| else feels the same.
|
| Otoh I understand the appeal partially. It's rooted in the fact
| that our editors and "IDEs" suck at helping you navigate the
| knowledge quickly. Some do it to a great extent, but it's not
| universal. Maybe a new editor (plugin) will appear that will look
| at what you're doing in code and show the contextual knowledge,
| relevant locations, article summaries, etc in another pane. Like
| a buddy who sits nearby with a beer, sees what you do and is
| seasoned enough to drop a comment or a story about what's on your
| screen any time.
|
| Programming in a natural language and trusting the results I just
| can't stand. Hell, in practice we often can't negotiate or align
| ourselves in basic things even.
|
| * and I'm not from a slow category of developers
| 88j88 wrote:
| A large part of software development is documentation. It is
| often overlooked or not kept up to date. I think the great
| advancement of AI is that we can now more closely link and
| validate one with the other. You can easily summarize some code
| with chatGPT, and also provide some structure of code based on
| documentation (as an outline, or first cut).
|
| However, this is the state of the art today. In the future, the
| training set will be based on the prompt-result output and
| refinement process, leading me to believe that next generation
| tools will be much better at prompting the user to provide the
| details. I've already seen this enhancement in gpt4 recently, I
| think this is a common and interesting use case.
|
| Overall, these tools will become more and more advanced and
| useful for developers. Now is a great time to become proficient
| and start learning how the future will require you to adapt
| your workflow in order to best leverage chatGPT for
| development.
|
| This response was written by a human.
| vanjajaja1 wrote:
| I think it has to do with how invested you are in the
| appearance of the final code. If you're very focussed on
| getting a result and no more, then GPT is probably very
| powerful. But if you also care about the "how" of the
| implementation then you'll probably get frustrated
| micromanaging GPT. I know I do, but I do find it's pretty useful
| when my memory of how to achieve something is blurry. I know
| what looks wrong but I don't remember how to make it right.
| hyperthesis wrote:
| Everyone projects their difficulties on to you, so here's mine:
|
| A specification can be rigorous, structured, precise and
| formal: a declarative description of what is to be done - not
| how. It is a very different perspective from designing
| algorithms - even one that proves the thing to be done, is
| done.
|
| But I think trusting the results is misplaced. It's more like
| being a partner in a law firm, where the articled clerks do all
| the work - but you are responsible. So you'd better check the
| work. And doing that efficiently is an additional skill set.
| johntash wrote:
| I'm not super impressed yet either, but I do think there's a
| lot of value. I tried a few times to get it to write some
| simple python scripts, and usually it'd get some things wrong.
| But it did help to get the basics of the script started.
|
| If you haven't already, look into trying out Copilot (I don't
| think there's a free version anymore?) or Codeium (free and has
| extensions for a lot of editors, just be careful with what code
| you're okay sending to a third party). Using AI prompts as a
| fancy auto complete is what has been giving me the most
| benefit. Something simple like a comment "# Test the
| empty_bank_account method" in an existing file using pytest, I
| go to the next line and then hit tab to auto-complete a 10-15
| line unit test. It's not always right, but it definitely helps
| speed things up compared to either typing it all out or
| copy/pasting it.
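|
| For example, that one comment line might expand into something
| like this (class and method names are made up):
|
|     # Test the empty_bank_account method
|     def test_empty_bank_account():
|         account = BankAccount(owner="alice", balance=100)
|         account.empty_bank_account()
|         assert account.balance == 0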
|
| My biggest annoyance so far, at least with Codeium, is that
| sometimes it guesses wrong. With LSP-based auto-complete, it
| only gives me options for real existing
| functions/variables/etc. But Codeium/copilot like to "guess"
| what exists and can get it wrong.
|
| > Like a buddy who sits nearby with a beer, sees what you do
| and is seasoned enough to drop a comment or a story about
| what's on your screen any time.
|
| I agree with you, this is probably where it would be most
| useful (to me). I don't need someone to write the code for me,
| but I'd love an automated pair-programming buddy that doesn't
| require human interaction.
| ramraj07 wrote:
| I used to think like you until I actually tried copilot. I
| don't even write much boilerplate code (it's python, and the
| framework is already set up). It is still an effective and
| useful tool. It saves mental bandwidth, and the code it writes
| is exactly what I was intending to write, so it turns me into a
| reviewer than a coder and thus makes it easier to get the job
| done (I'm still confirming the code does what I want).
|
| This is just gpt-3. With chatGPT i finally managed to get unit
| test backbones for the most annoying methods in our code base
| that stuck our coverage at 92%. Once we get full copilot gpt-4
| integration we will likely get to 98% coverage in a day's time.
| That's not nothing.
| wruza wrote:
| I tried copilot in some toy projects back then but wasn't
| left impressed. Maybe should try gpt-4 later. Thanks for
| sharing your experience!
| dlivingston wrote:
| With ChatGPT (GPT-3.5), I thought: "wow, this is really
| neat! The future is here!"
|
| With GPT-4, I thought: "fuck. It's all over. How soon will
| I get laid off?"
|
| Only half-joking, but GPT-4 is truly good at complex
| programming and problem solving, and noticeably better than
| GPT-3.5.
| thisgoesnowhere wrote:
| I agree it's a game changer. I also think once I get
| access to copilot x I'll find it irresistible.
|
| Something about the iterative context aware approach of
| chat helps me think through the problem. The only thing I
| don't like about copilot is how you can't tell it to
| augment the output. Sometimes it's close but not perfect.
| cube2222 wrote:
| I feel like the real value is when working with tools you're
| not proficient with.
|
| I.e. for a backend dev that hardly knows react and needs to
| write some, who usually gets by doing small modifications and
| copy-paste-driven development there, ChatGPT / Copilot is
| basically a 10x productivity multiplier.
|
| I agree that I've not been feeling that much value yet when
| working in areas I'm proficient in.
| amelius wrote:
| > and I'm not from a slow category of developers
|
| You will be if you don't embrace AI ;)
| thequadehunter wrote:
| As an infra dude it saves me loads of time. Most of the code I
| write is stuff where I already know what needs to happen, I
| already know how the process looks un-automated, and I'm not
| going to learn a whole lot writing more boilerplate, or
| learning the names of all the endpoints for a dumb api and
| parsing the text.
|
| It's still mental work. You still have to read the code over
| and edit stuff.
|
| It helps me save lots of time so I can do quality things like
| comment on hn during work..
| yunwal wrote:
| Infra as code is really where copilot shines. GPT/codex isn't
| great as a state machine, but it's great for solving the
| "what's that variable called?" or "how's that yaml supposed
| to be structured?" problems. I think the architect and infra
| eng distinction will disappear soon as writing IAC becomes
| nearly instantaneous
| arketyp wrote:
| Totally. All I want is an easier time navigating code; not
| getting lost in tabs and page scrolling, nested class
| structures and function references, interrupted by context-
| switching text searches. It feels like I spend 75% of my brain
| capacity managing my short term memory. Your buddy mentor is a
| great formulation of the kind of AI aid I need. As for
| generating code, that's the least of my concerns. I'm actually
| happy when I get the opportunity to just hammer down some brain
| dead lines for a change.
| Mali- wrote:
| We're building ourselves out of a job in real time.
| jkettle wrote:
| We are all Biz Devs now. :(
| Manjuuu wrote:
| Or freeing up time for more complex parts of the job.
| jfengel wrote:
| I've been expecting to be replaced by a much, much cheaper
| developer in another country since I graduated college ...
| three decades ago. I'm still not 100% certain why that hasn't
| happened.
|
| I suspect it has to do with the equivalent of prompt
| engineering: it's too difficult to cross the cultural and
| linguistic barriers, as well as the barrier of space that
| could have mitigated the other two. By the time you've
| directed somebody to do the work with sufficient precision,
| you could have just done it yourself.
|
| And it's part of the reason we keep looking for that 10x
| superdeveloper. Not just that they produce more code per
| dollar spent, but that there is less communication overhead.
| The number of meetings scales with the square of the number of
| people involved, so getting 5 people for the same price as me
| doesn't actually save you time or money.
|
| I have no idea what that means for AI coding. Thus far it
| looks a lot like that overseas developer who really knows
| their stuff and speaks perfect English, but doesn't actually
| know the domain and can't learn it quickly. (Not because they
| aren't smart, but because of the various human factors I
| mentioned.)
|
| I'd be thrilled to be completely wrong about that -- in part
| because I've been mentally prepared for it for so long. I
| hope that younger developers get a chance to spin that into a
| whole new career, hopefully of a kind I can't even imagine.
| wruza wrote:
| _By the time you've directed somebody to do the work with
| sufficient precision, you could have just done it
| yourself._
|
| And it's much slower, because "do it" includes the trial-
| error-decision cycle, which is fast when you're alone and
| takes weeks if you are directing and [mis]communicating. Also
| wondering where it goes and how big of a bubble it is/will be.
| stinkbutt wrote:
| damn you got some top shelf, high grade copium. let me get a
| hit
| Manjuuu wrote:
| If things don't go as I expect I would gladly switch
| career, opening a food shack somewhere for example sounds
| nice.
| wruza wrote:
| Sounds... until you're sitting there in the shack
| rethinking life choices.
| halfjoking wrote:
| Mark my words... custom AI IDEs are the new javascript
| frameworks.
|
| I think everyone had this idea and is building something similar.
| I know I am.
| mlejva wrote:
| Would love to learn more about what you're building! Do you
| have anything to share?
| halfjoking wrote:
| Nothing yet - so far it's just a basic electron app where you
| select which files to send as reference for the feature you
| want to add; it then streamlines the process of applying the
| edits ChatGPT sends back.
|
| I'm not really planning on turning it into a product. It
| sounds like this guy is a lot farther along than me if you're
| looking for a competitor - I think you're going to have
| plenty. https://mobile.twitter.com/codewandai
| gumballindie wrote:
| Good observation. I did notice that a lot of the types that
| jumped from framework to framework are now jumping onto ai.
| Please keep them there, at least that way js will stand a
| chance at becoming sane.
| namaria wrote:
| What do you think these fuzzy interpreters are gonna be used
| for? Machine code running on metal? It's gonna be all
| scripted servers and web apps for saas startups slurping that
| VC or UE money
| teddyh wrote:
| > _The current idea is to offer the base cloud version for free
| while having some features for individuals behind a subscription.
| We'll share more on pricing for companies and enterprises in the
| future._
|
| What happens if you use the README.md and associated
| documentation as a prompt to re-implement this whole thing?
| graypegg wrote:
| If this truly works as the pitch describes:
|
| > mlejva: Our editor isn't a regular coding editor. You don't
| actually write code with e2b.
|
| then what licensing problems arise from its use? In theory, if
| you only prompt the AI to write the software, is the software
| even your intellectual property?
|
| It seems like this is a public domain software printing machine
| if you really aren't meant to edit the output.
| mnd999 wrote:
| Because developers love writing documentation.
| anninaC wrote:
| we all know who's behind this.. "666 commits"
| Etheryte wrote:
| While you mention that you can bring your own model, prompt, etc,
| the current main use case seems to be integrating with OpenAI.
| How, if at all, do you plan to address the current shortcoming
| that the code generated by it often doesn't work at all without
| numerous revisions?
| anonzzzies wrote:
| > it often doesn't work at all without numerous revisions?
|
| It doesn't for many/most humans either. I hope they have
| revisions prompting, but I haven't tried it yet.
|
| I noticed adding in a feedback/review loop often fixes it, but
| then you still need someone saying 'this is it' as it doesn't
| know the cut-off point.
| Etheryte wrote:
| I think there's a big gap here you might be missing. Most
| developers beyond juniors can generally write code that at
| least compiles on the first pass, even if it isn't
| functionally correct. Current AI models often generate code
| that doesn't even compile.
| sebzim4500 wrote:
| Is that a problem though? In this IDE the LLM sees the
| error message and tries to fix it, possibly while the
| developer who wrote the prompt is off doing something else.
| anonzzzies wrote:
| > Most developers beyond juniors can generally write code
| that at least compiles on the first pass,
|
| Aha. Maybe you know super clever people or people who
| learned in the 60-80s when cycles (and reboots etc)
| mattered or were costly; this is incredibly far from the
| norm now.
| nico wrote:
| > Most developers beyond juniors can generally write code
| that at least compiles on the first pass, even if it isn't
| functionally correct.
|
| Hahaha. I've been coding for over 20 years and this is
| definitely not the case.
|
| > Current AI models often generate code that doesn't even
| compile.
|
| Most of the code ChatGPT has given me has run/compiled on
| the first try. And it's been a lot longer and more complex than
| what I would have written on a first pass.
|
| Let's just learn to use these tools instead of trying to
| justify human superiority.
| [deleted]
| ValentaTomas wrote:
| Revisions prompting is not there yet but it is one of the
| things that we are experimenting with.
|
| The feedback/review loop is spot on - a lot of the problems
| can be fixed automatically in a few steps but you actually
| need the outputs/errors.
| anonzzzies wrote:
| I made one that tries to get to the end code by having 3.5
| and 4 play different roles and correcting each other.
| Sometimes it works; mostly it loops, unable to get to the
| end.
| mlejva wrote:
| Good point and feedback, thank you. We'll update the readme.
|
| A lot of UX, UI, DX work related to LLMs is completely new. We
| ourselves have a lot of new realizations.
|
| > While you mention that you can bring your own model, prompt,
| etc, the current main use case seems to be integrating with
| OpenAI
|
| You're right. This is because we started with OpenAI and to be
| fair it's easiest to use from the DX point of view. We want to
| make it more open very soon. We probably need more user
| feedback to learn how this integration should look.
|
| > How, if at all, do you plan to address the current
| shortcoming that the code generated by it often doesn't work at
| all without numerous revisions?
|
| The AI agents work in a self-repairing loop. For example they
| write code, get back type errors or a stack trace and then try
| to fix the code. This works pretty well and often a bigger
| problem is the short context window.
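|
| A very simplified sketch of the loop (ask_llm() here is a
| placeholder, not our actual implementation):
|
|     import subprocess
|
|     def self_repairing_loop(source_path, ask_llm, max_attempts=5):
|         for _ in range(max_attempts):
|             result = subprocess.run(["python", source_path],
|                                     capture_output=True, text=True)
|             if result.returncode == 0:
|                 return True  # code ran cleanly
|             code = open(source_path).read()
|             # Feed the code plus the stack trace back to the model.
|             fixed = ask_llm("Fix this code:\n" + code +
|                             "\nError:\n" + result.stderr)
|             open(source_path, "w").write(fixed)
|         return False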
|
| We don't think this will replace developers, rather we need to
| figure out the right feedback loop between the developer and
| these AI agents. So we expect that developers most likely will
| edit some code.
| neets wrote:
| You know you live in the future when
| addmin123 wrote:
| Is it using Microsoft's Copilot?
| Jupe wrote:
| This is awesome, scary and very interesting. But, for me, it
| comes with a personal concern:
|
| For some time I've been giving serious thought about an automated
| web service generator. Given a data model and information about
| the data (relationships, intents, groupings, etc.) output a fully
| deployable service. From unit tests through container
| definitions, and everything I can think of in-between (docs,
| OpenAPI spec, log forwarder, etc.)
|
| So far, while my investment hasn't been very large, I have to ask
| myself: "Is it worth it?"
|
| Watching this AI code generation stuff closely, I've been telling
| myself the story that the AI-generated code is not "provable". A
| deterministic system (like I've been imagining) would be
| "provable". Bugs or other unintended consequences would be
| directly traceable to the code generator itself. With AI code
| generation, there's no real way to know for sure (currently).
|
| Some leading questions (for me) come down to:
|
| 1. Are the sources used by the AI's learning phase trustworthy?
| (e.g. When will models be sophisticated enough to be trained to
| avoid some potentially problematic solutions?)
|
| 2. How would an AI-generated solution be maintained over time?
| (e.g. When can AI prompt + context be saved and re-used later?)
|
| 3. How is my (potentially proprietary) solution protected? (e.g.
| When can my company host a viable trained model in a proprietary
| environment?)"
|
| I want to say that my idea is worth it because the answers to
| these questions are (currently) not great (IMO) for the AI-
| generated world.
|
| But, the world is not static. At some point, AI code generators
| will be 10x or 100x more powerful. I'm confident that, at some
| point, these code generators will easily surpass my 20+ years of
| experience. And, company-hosted, trained AI models will most
| likely happen. And context storage and re-use will (by demand)
| find a solution. And trust will eventually be accomplished by
| "proof is in the pudding" logic.
|
| Basically, barring laws governing AI, my project doesn't stand a
| cold chance in hell. I knew this would happen at some point, but
| I was thinking more like a 5-10 year timeframe. Now, I realize,
| it could be 5-10 months.
| sebzim4500 wrote:
| Not OP but I've been playing with similar technology as a
| hobby.
|
| >1. Are the sources used by the AI's learning phase
| trustworthy? (e.g. When will models be sophisticated enough to
| be trained to avoid some potentially problematic solutions?)
|
| Probably not, but for most domains reviewing the code should be
| faster than writing it.
|
| >2. How would an AI-generated solution be maintained over time?
|
| I would imagine you don't save the original prompts. Rather,
| when you want to make changes you just give the AI the current
| project and a list of changes to make. Copilot can do this to
| some extent already. You'd have to do some creative prompting
| to get around context size limitations, maybe giving it a
| skeleton of the entire project and then giving actual code only
| on demand.
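|
| The skeleton trick could look roughly like this (paths and
| heuristics purely illustrative):
|
|     import os
|
|     def project_skeleton(root):
|         # Summarize the repo as file paths plus top-level
|         # defs/classes instead of pasting whole files.
|         lines = []
|         for dirpath, _, files in os.walk(root):
|             for name in files:
|                 if not name.endswith(".py"):
|                     continue
|                 path = os.path.join(dirpath, name)
|                 lines.append(path)
|                 for line in open(path):
|                     if line.startswith(("def ", "class ")):
|                         lines.append("    " + line.rstrip())
|         return "\n".join(lines)
|
|     # The model sees the outline and asks for a file's full
|     # contents only when it needs them.
|     prompt = (project_skeleton("my_project") +
|               "\nAsk for any file's full contents by name.")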
|
| > When can my company host a viable trained model in a
| proprietary environment?
|
| Hopefully soon. Finetuned LLaMA would not be far off GPT-3.5,
| but nowhere close to GPT-4. And even then there are licencing
| concerns.
| Jupe wrote:
| Ok, a couple of derivative "fears" around this...
|
| 1> Relying on code reviews has concerns, IMO. For example,
| how many engineers actually _review_ the code in their
| dependencies? (But, I guess it wouldn't take that much to
| develop an adversarial "code review" AI?)
|
| 2> Yes, agreed, that would work. Provided the original
| solution had viable tests, the 2nd (or additional) rounds
| would have something to keep the changes grounded. In fact,
| perhaps the existing tests are enough? Making the next AI
| version of the solution truly "agile"?
|
| 3> So, at my age (yes, getting older) I'm led to a single,
| tongue-in-cheek / greedy question: How to invest in these AI-
| trained data sets?
| mlejva wrote:
| Thanks for the questions. I'll reply just a bit later. I'm on
| a train and reading/writing makes my stomach sick.
| ivan888 wrote:
| My biggest concern with tools like this is reproducibility and
| maintainability. How deterministically can we go from the
| 'source' (natural language prompts) to the 'target' (source
| code)? Assuming we can't reasonably rebuild from source alone,
| how can we maintain a link between the source and target so that
| refactoring can occur without breaking our public interface?
| larsonnn wrote:
| Pretty simple: it's just like any abstraction. This AI will not
| work when nobody has delivered the answer beforehand. LLMs are
| given existing code as input. When you abstract over that, you
| had better hope there's good code in it.
|
| So my question would be: what is the use case?
|
| I guess it's more like planning software, not implementing it.
|
| You can plan your software pretty well with ChatGPT. But it
| will just help you; it won't really do the job.
| ValentaTomas wrote:
| This is a valid concern and we are still experimenting with how
| to do this right. A combination of preserving the reasoning
| history, having the generated code, and using tests to enforce
| the public interface (and fix it if anything breaks) looks
| promising.
|
| I think the crucial part is indeed not deterministically going
| from NL to code, but taking an existing state of the codebase
| and spec and "continuing the work".
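|
| For example, an interface-pinning test could be as simple as
| this (module and names are hypothetical):
|
|     import inspect
|     import billing  # a module the agent generated earlier
|
|     def test_public_interface_is_stable():
|         # If the agent regenerates the module, this must still pass.
|         sig = inspect.signature(billing.create_invoice)
|         assert list(sig.parameters) == ["customer_id", "items",
|                                         "due_date"]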
| mlejva wrote:
| Hey everyone, I quite frankly didn't expect our project to get
| on the HN front page 2 minutes after posting. I'm one of the
| creators of this project.
|
| It's pretty nascent and a lot of work needs to be done so bear
| with us please. I'm traveling in 30 minutes but I'm happy to
| answer your questions.
|
| A little about my co-founder and myself: We've been building
| devtools for years now. We're really passionate about the field.
| The most recent thing we built is https://usedevbook.com. The
| goal was to build a simple framework/UI for companies to demo
| their APIs. Something like https://gradio.app but for API
| companies. It's been used for example by Prisma - we helped them
| build their playground with it - https://playground.prisma.io/
|
| Our new project - e2b - is using some of the technology we built
| for Devbook in the past. Specifically the secure sandbox
| environments where the AI agents run are our custom Firecracker
| VMs that we run on Nomad.
|
| If you want to follow the progress you can follow the repo on
| GH or you can follow my co-founder, me, and the project on
| Twitter:
|
| - https://twitter.com/e2b_dev
|
| - https://twitter.com/t_valenta
|
| - https://twitter.com/mlejva
|
| And we have a community Discord server -
| https://discord.gg/U7KEcGErtQ
| peteradio wrote:
| Maybe this is a bit naive but is this AI batteries included? Or
| what level of work needs to be done to integrate with an AI?
| ValentaTomas wrote:
| Right now you just need the GPT-4 API key to run it but we
| plan to support custom models in the future.
| philote wrote:
| What type of information is sent to GPT-4? This could be a
| security concern for some.
| sebzim4500 wrote:
| Presumably everything is sent to or generated by GPT-4, I
| don't see how it would work otherwise.
| nico wrote:
| Please add support for GPT4All:
| https://github.com/nomic-ai/gpt4all
| overthrow wrote:
| (Off-topic) Is there an "Are we open source yet?"-type
| site[1] that follows the progress of the various open-
| source LLMs? This is the first I've heard of GPT4All. I'm
| finding it tough to keep up with all these projects!
|
| [1]: In the spirit of https://wiki.mozilla.org/Areweyet
| nico wrote:
| Love the concept. Don't know if any sites/lists. Maybe
| you could start one?
| proto_codex wrote:
| Not what you're looking for but I built a page a while
| back to keep track of Stable Diffusion links using my
| little website-builder side-project protocodex.com -
| https://protocodex.com/ai-creation
|
| You're welcome to use it if you want to get a link page
| started, and I'd be glad to help - you can also add
| comment sections on the page to get user
| input/contributions so if anyone else has some links they
| can comment them there. I eventually want to more fully
| formalize user contributions to pages so that they can be
| used as crowdsourced freeform sites, if there's enough
| interest out there.
| MacsHeadroom wrote:
| Or Vicuna, which hugely outperforms GPT4All due to higher
| quality training data (real human conversations with
| GPT-4).
| anonzzzies wrote:
| Not hard to add. Results won't be so good right now though, if
| you get results at all.
| 555watch wrote:
| Sorry if it's a dumb question, since it's quite hard to
| keep up with all the recent developments in custom
| GPT/LLM solutions.
|
| Do I understand correctly that the GPT4All provides a
| delta on top of some LLAMA model variant? If so, does one
| need to first obtain the original LLAMA model weights to
| be able to run all the subsequent derivations? Is there a
| _legal_ way to obtain it without being a published AI
| researcher? If not, I'm not sure that Gpt4All is viable
| when looking for legal solutions.
| la64710 wrote:
| All these tools start with a grand proclamation of "open" and
| then the first thing you notice is the field to add your
| OPENAI_KEY. My humble suggestion is that if you are building
| something truly open, please use some other model like LLaMA
| or BERT as the default example and keep options for adding
| other models as needed.
| mlejva wrote:
| Hey, thank you for the feedback. I understand that having a
| requirement for OpenAI key isn't good for folks. The project is
| 4 weeks old and OpenAI is what we started with as a LLM. We
| want to create interface for people to bring and connect their
| own models though. We will do a better work of explaining this
| in our readme
| la64710 wrote:
| Thanks. As you might imagine, in many scenarios lock-in to any
| particular hosted model is not desirable.
| anonzzzies wrote:
| I wouldn't worry too much about it; there will be more and
| more models and fine-tuning services and fine-tuned
| downloads. Different models to mix and match.
| armchairhacker wrote:
| I would expect the opposite: you write the code and have the AI
| write documentation, tests, and examples. Especially
| documentation.
| WalterSear wrote:
| It's good for both.
| dakiol wrote:
| Shouldn't it be a plugin for a more used open-source IDE?
| MacsHeadroom wrote:
| Microsoft, who owns Copilot and OpenAI, makes the most used
| open-source IDE.
|
| You think improving their product for them while submitting
| yourself to the authority of their plugin market is a good
| idea?
| sebzim4500 wrote:
| Can it produce projects which are longer than the context length
| of the LLM? (8k for us plebs)
| M4v3R wrote:
| It's perfectly possible to assemble a project by writing
| individual files one at a time, so you would basically get 8k
| tokens per file. Or you could even write out the files in
| parts.
| bgirard wrote:
| Are there examples / case studies of more complex apps being
| built by LLMs? I've seen some interesting examples but they
| were all small and simple. I'd love to see more case studies
| of how well these tools perform in more complex scenarios.
|
| My gut feeling is we're still a few LLMs generations away from
| this being really usable but I'd love to hear how the authors are
| thinking about this.
| ryanSrich wrote:
| Can you give an example of complex? I've used ChatGPT to help
| me build an app that authenticates a user using Oauth. That
| information creates a user in the backend (Rails). That user
| can then import issues tagged with specific information from a
| 3rd party task management tool (Linear). The titles of these
| issues are then listed in the UI. From there, the user can
| create automatic release notes from those issues. They can
| provide a release version, description, tone, audience, etc.
|
| All of that (issues list, version, tone, etc) is then
| formulated into a GPT prompt. The prompt is structured such
| that it returns written release notes. That note is then stored
| and the user can edit it using a rich text editor.
|
| Once the first note is created the system can help the user
| write future notes by predicting release version, etc.
|
| This isn't that complex imo, but I'm curious to see if this is
| what people consider complex.
|
| ChatGPT wrote 90% of the code for this.
| mypalmike wrote:
| How about a 2 million line legacy app spanning 5 languages
| including one created by a guy who left the company 14 years
| ago which has a hand-rolled parser and is buggy.
| mritchie712 wrote:
| I'd consider that non-trivial (e.g. not a todo example).
|
| How long did it take?
| ryanSrich wrote:
| Around 3 hours (not straight - I would hack on it for 30
| minutes to an hour at a time). I spent another 1.5 hours or
| so styling it, but I did that outside of ChatGPT.
| gumballindie wrote:
| Sorry how is that not trivial?
| sebzim4500 wrote:
| Because there are people being paid to do it.
| gumballindie wrote:
| There are people paid to watch paint dry, that doesn't
| mean it's non trivial.
| sebzim4500 wrote:
| Who exactly is being paid to watch paint dry?
| ehutch79 wrote:
| A Line Of Business app. With questionable specs. Where inputs
| are cross-dependent and need to be filtered. Some fields
| being foreign keys to other models.
| bgirard wrote:
| I don't have a specific definition of complex in mind. Seeing
| more examples of this with the prompts used + output and the
| overall steps is exactly what I'm asking for. I'm
| particularly interested in how the success rate changes as
| the code base evolves. Are LLMs effective in empty repos? are
| they effective on large repos? Can prompts be tweaked to work
| on larger repos?
| mov_eax_ecx wrote:
| This is like the third post today on a code generator using a
| React application; on Twitter there are dozens of them.
|
| Copilot already exists and Copilot X already packs the features
| this package promises AND much more, so why use this
| application over Copilot?
| jumpCastle wrote:
| One reason might be that some people value open source.
| fabrice_d wrote:
| Looking at the license of this project (Business Source
| License 1.1), this is not an open source project:
| https://github.com/e2b-dev/e2b/blob/master/LICENSE#L16
| mlejva wrote:
| e2b isn't like copilot. You don't really write code in our
| "IDE". Currently, it works in a way that you write a technical
| spec and then collaborate with an AI agent that builds the
| software for you. It's more like having a virtual developer
| available for you.
| truthaboveall wrote:
| [dead]
| nbzso wrote:
| Maybe until we all have local LLMs and custom models with full
| control, this level of abstraction (prompting) is not useful. I
| refuse to contribute to the "Open"AI scheme. Let marketing and
| teens give them data. :)
| exhaze wrote:
| I agree with your point about self-hosting as a long-term
| strategy. However, at the current stage it's still a balance
| between capability and control. Greater control means you lag
| behind on bleeding-edge features (the gap seems to be
| constantly narrowing here for LLMs though, thanks to tons of
| OSS efforts).
|
| Disagree with your fears about "give them data".
|
| Here's their data policy:
| https://openai.com/policies/api-data-usage-policies
|
| Literally states it won't use your data. Ofc there's a non-
| trivial risk that this policy will change over time. Still, I
| don't feel like there's any huge lock-in risk with OpenAI right
| now, so advantage of using it outweighs the risk for most.
| nbzso wrote:
| I simply have no trust in Microsoft. Data is the new
| petrol. :)
___________________________________________________________________
(page generated 2023-04-04 23:00 UTC)