[HN Gopher] Converting Codebases with LLMs
___________________________________________________________________
Converting Codebases with LLMs
Author : Osis
Score : 114 points
Date : 2024-07-20 04:27 UTC (18 hours ago)
(HTM) web link (blog.withmantle.com)
(TXT) w3m dump (blog.withmantle.com)
| JTyQZSnP3cQGa8B wrote:
| > a recurring need in the software world for teams to convert a
| codebase from one language to another
|
| Really? I've only seen that twice in my career, and it was due to
| being written in the most obsolete tech ever.
|
| I have the same comment for the "patterns" that GPT-bros seem to
| be stuck in all the time. What kind of software are they writing
| that needs 80% of duplicated/useless code, and 20% of business
| code? They should first read Refactoring by Martin Fowler, and
| try to avoid those mistakes in the future because it's bad to
| rely on an AI for what should be their job, i.e. engineering
| software.
|
| > the database querying layer was quite verbose and greatly
| exceeded an LLM's output token limit
|
| No technical details as usual, only high-level stories. And how
| is it possible nowadays to have that kind of issue where most
| languages have their own SQL or REST library to do everything in,
| at most, 500 lines of code (if the code is duplicated)?
|
| Last but not least, the main web site is a very pretty empty page
| if JS is disabled. They should fix that with an LLM and write a
| blog post, that would be more interesting.
| blowski wrote:
| I've found it to be more common in organisations with an
| immature microservices culture, where developers seem to think
| there are awards for most number of languages used. At some
| point, sanity takes hold, and there is a process to standardise
| - involving lots of rewrites of small codebases.
| joshuanapoli wrote:
| The JavaScript ecosystem historically had a lot of turnover.
| Probably there are a lot of applications that were repeatedly
| ported over the years: Ruby to JavaScript, to CoffeeScript, to
| Flow types (for React), to TypeScript.
|
| I think that these language ports aren't as disruptive as
| architecture changes (waffling on microservices), and they're
| driven by availability of talent. Porting to follow the trend
| makes it easier and much more pleasant to onboard new
| developers. It usually has a practical benefit to users,
| because the latest tooling usually has a performance edge, but
| doesn't support the old language.
| blowski wrote:
| Yes, good point. Even within React, there's been a big change
| from class components to functional components and hooks. I
| imagine LLMs could help with some of that.
| cjonas wrote:
| I just ported 10k loc of react classes to function
| components using gpt-4o. The changes are mostly trivial,
| but would be fairly time consuming and tedious to make. It
| took me a few hours instead of a few days.
| bavell wrote:
| Isn't this already possible with codemods?
| imvetri wrote:
| No. Nobody can understand how to use codemods. I could
| not. But I'm not sure about others.
|
| My view is that those engineers at FB are no longer there
| to promote and support that project. Or they would have
| migrated to learning AI/ML.
| imvetri wrote:
| How much did it cost?
| JTyQZSnP3cQGa8B wrote:
| I forget "JS and the web" all the time because I've been
| actively avoiding it for the past 20 years. It happens in
| other environments but the web seems to encourage "following
| the trend" and that would make me crazy if I had to do this
| every day.
| fhd2 wrote:
| I haven't seen good teams do that - there's reliable
| options even in the pretty crazy JS/web ecosystem. What it
| does have is a ton of junior devs and a lot of people
| pushing their open core startups or celebrity status with
| new libraries and tools where they could have contributed
| to something existing instead. There's more bad devs and
| more noise, but there absolutely is enough good people and
| good stuff to build solid software. Just have to get used
| to the noise I suppose.
| imvetri wrote:
| Don't follow trends. JS and the web may be bad, but
| building interfaces is not.
| fhd2 wrote:
| That's a concern I have: The pain of writing boilerplate used
| to make people improve their architecture and frameworks. If
| the Java ecosystem hadn't been so painful in the early 2000s,
| would better languages and frameworks have gained traction?
| Would good refactoring practices have gained traction?
|
| Sometimes refactoring doesn't even cut it, unfortunately. When
| stuck with a language and/or framework that simply requires
| lots of boilerplate, there's only two options: Migrate to
| something else or use/build code generation tools. I've done
| both with good success. Not sure I'd use a non-deterministic
| tool (like an LLM) for this, but since deterministic tools are
| harder to build, we might be looking at a future where a lot of
| working code is rewritten with automation that introduces
| subtle problems.
|
| I'm optimistic though. There's always been a lot of terrible
| software somehow kept under control with high
| development/testing resources. And then there's always been
| carefully built good software. I suspect we'll continue to have
| both.
|
| We'll probably have good software because some managers manage
| to hire good devs _and_ give them the right direction and
| support to do good work.
|
| We'll probably have lots of bad software for the same reasons
| as in the past: Incompetent management, competent management
| pragmatically sacrificing software quality and/or
| maintainability, incompetent (or really just impatient/rushed)
| developers.
|
| I don't think LLMs change the equation that much. Good devs
| will use them well (or perhaps not at all). Bad devs will use
| them badly. Good software can give startups an edge, bad
| (enough) software can bring down incumbents.
| onion2k wrote:
| I've seen it a lot. Mostly things like moving from PHP 3 to PHP
| 5, or Python 2 to Python 3, or React 12 to React 17. A language
| change doesn't have to be between completely different
| languages to be a pain.
| zcbenz wrote:
| A few months ago I ported ~15k lines of python code (10k are
| tests) to typescript, using GPT4. It cost me ~$70.
|
| The python project is https://github.com/ml-explore/mlx and the
| converted project is https://github.com/frost-beta/node-mlx
|
| I wrote a long prompt:
| https://github.com/frost-beta/node-mlx/blob/main/tests/promp...
|
| The first result was almost always bad, but after manually
| modifying the assistant's answer, subsequent generations usually
| went much better.
| newzisforsukas wrote:
| > Use () => instead of function() for defining functions.
|
| > Use const when possible, but use let if the same name is
| reused in the same scope.
|
| looks like some of that could have been handled with a linter
| autofixing afterwards.
|
| $70 seems like a lot for 15,000 lines?
| mmastrac wrote:
| In the absence of an AST based tool, that's probably an
| absolute minimum of 20-40 hours of dev time (likely more) at
| $100-200 hourly, no?
| imvetri wrote:
| There is no AST tool for this, nor has anyone solved
| cross-language refactoring. Direct me if I'm wrong.
| iknowstuff wrote:
| They mean running it on typescript post conversion.
| newzisforsukas wrote:
| I just meant for processing, not compared to skilled or
| mindless translation done by humans.
| mmastrac wrote:
| $70 is not a lot given the value of the output. I
| understand where you're coming from, though, but it's
| important to compare against the value of the human labour
| this is replacing.
| imvetri wrote:
| Why is $70 not a lot? Value for money.
| kordlessagain wrote:
| It's less than half a cent a line!
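|
| A rough check of that figure, as a sketch using only the numbers
| quoted upthread (about $70 for about 15,000 lines, and the
| estimate of 20-40 hours at $100-200 hourly for doing it by
| hand). The variable names are illustrative.

```python
# Figures quoted in the thread; treat them as rough estimates.
llm_cost_usd = 70.0
lines_converted = 15_000

cents_per_line = llm_cost_usd / lines_converted * 100
print(f"LLM cost: {cents_per_line:.2f} cents per line")

# Manual estimate from the thread: 20-40 hours at $100-200 per hour.
manual_low_usd = 20 * 100
manual_high_usd = 40 * 200
print(f"Manual estimate: ${manual_low_usd:,} to ${manual_high_usd:,}")
```

| This ignores the time spent writing the prompt and reviewing
| the output, which the thread does not quantify.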
| dsp_person wrote:
| For the cost I'm curious what's the breakdown in terms of
| specific gpt4 model and context length?
|
| What was the verification process like?
|
| Also any thoughts on transpilers? There's Brython for
| javascript, and some others like py2many, mypyc. And the
| approach in oil shell: written in python, translated to C++
| with custom tools
| newzisforsukas wrote:
| https://github.com/facebookresearch/CodeGen
| smusamashah wrote:
| What is the current best LLM for coding? I am using Claude Sonnet
| 3.5 free and it's so good. I am not making anything serious and
| LLM is perfect for that.
|
| Which current models are better than sonnet for code (plain old
| html JS is my use case btw)?
| sa-code wrote:
| I'm also using Sonnet to work on a library in Mojo and I've had
| pleasant experiences!
| Art9681 wrote:
| Sonnet is the best at the moment. GPT-4o is within a margin of
| error in capability. Use both.
| smusamashah wrote:
| Isn't GPT-4, rather than GPT-4o the general best one? I am
| not talking about benchmarks, but the personal experience
| instead. GPT-4 always seems more understanding, says no where
| it should more often, corrects and identifies errors more
| often. 4o seems to go with the flow without much
| notice/inspection and keeps blabbering the way it does
| instead of following the conversation style.
| bongodongobob wrote:
| Yup. I use 4o until it fails on something, pull in 4, then
| back to 4o. Rinse and repeat.
| kadir1234 wrote:
| Is there an architectural reason Sonnet beats 4o? Or is it
| simply a matter of training corpus?
| hugocbp wrote:
| I test a lot of them, online and with Ollama, and Sonnet 3.5 is
| in a league of its own for practical coding purposes.
|
| Still makes a lot of mistakes, but it gets things "more right"
| than any of the others on a much more consistent basis.
|
| I've now cancelled my ChatGPT subscription in favor of Claude and also
| mostly stopped using the APIs (I use Msty to compare most
| models, you can give the same prompt to multiple models at once
| and compare the results).
|
| Sonnet 3.5 is amazing.
| mspreij wrote:
| This should help porting all the old Cobol and Perl apps out
| there, no?
| homarp wrote:
| IBM watsonx Code Assistant for Z: Transform COBOL services to
| Java(tm) by using an AI-assisted approach with IBM watsonx Code
| Assistant for Z and IBM Z Open Editor.
|
| https://www.ibm.com/docs/en/watsonx/watsonx-code-assistant-4...
|
| HN discussion: https://news.ycombinator.com/item?id=38508250
| ktzar wrote:
| I wonder how many subtle errors will make their way to the new
| codebase (decimal rounding, a library call where a parameter is
| ignored and there are no tests for it...) only to be found in
| production and AI will be blamed.
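|
| Decimal rounding is one place where such drift is easy to show.
| A minimal sketch in Python, assuming the original code rounded
| half-to-even and the port rounds half-up (both modes and the
| helper names are illustrative choices, not from the article):

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")

def round_half_even(value: str) -> Decimal:
    """Round to cents, sending exact ties to the even neighbour."""
    return Decimal(value).quantize(CENT, rounding=ROUND_HALF_EVEN)

def round_half_up(value: str) -> Decimal:
    """Round to cents, sending exact ties away from zero."""
    return Decimal(value).quantize(CENT, rounding=ROUND_HALF_UP)

# Only exact ties differ, so a handful of ordinary test values can pass.
for amount in ["2.664", "2.665", "2.675"]:
    print(amount, round_half_even(amount), round_half_up(amount))
```

| The two functions agree on most inputs and differ only on exact
| ties such as 2.665, which is why a port can pass casual checks
| and still disagree in production.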
| Deukhoofd wrote:
| I did some converting with Copilot today. The answer is, quite
| a lot. It'd convert integer types wrong (whoops, lost an
| unsigned there, etc).
|
| And then of course there were some parts of the code that dealt
| with gender, and Copilot just completely refused to do anything
| with that, because for some reason it's hardcoded to do so.
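|
| The lost-unsigned case can be reproduced in miniature. A minimal
| sketch, assuming a 32-bit little-endian value that the original
| declared unsigned and the port reads as signed (the helper names
| are made up for this example):

```python
import struct

def as_uint32(raw: bytes) -> int:
    """Interpret 4 little-endian bytes as an unsigned 32-bit integer."""
    return struct.unpack("<I", raw)[0]

def as_int32(raw: bytes) -> int:
    """Interpret the same 4 bytes as a signed 32-bit integer."""
    return struct.unpack("<i", raw)[0]

raw = struct.pack("<I", 3_000_000_000)  # fits in uint32, not in int32
print(as_uint32(raw))  # 3000000000
print(as_int32(raw))   # -1294967296
```

| Values below 2**31 read the same either way, so the mistake only
| shows up on large inputs.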
| bloak wrote:
| That gender thing is interesting. Could you try renaming some
| of the variables and substituting words in the comments so
| that the code no longer obviously appears to be dealing with
| gender and see if Copilot behaves differently?
|
| If it does behave differently, I'd find that a bit worrying
| because conversion of a correct program into a different
| programming language should not depend on how the variables
| are named or what's in the comments. For example, assuming
| this is a line from a program written in C that works
| "correctly", how should it be converted into Go or Rust or
| whatever?
|
|     int product = a + b; // multiply the numbers
| Deukhoofd wrote:
| Everything works mostly fine as long as it's not obviously
| dealing with gender, but will fall over as soon as anything
| appears to refer to gender, either due to comments or due
| to variable naming.
|
| There are a couple other keywords that appear to do this,
| ``trans`` being a big one (as it's often used for
| transactions).
|
| It does also use assumptions from comments. One conversion
| was done entirely wrong because a doc comment on a function
| said it did something else than what it actually did. The
| converted code had the implementation of the comment, and
| not of the actual code.
| g-b-r wrote:
| We're already at
| https://threadreaderapp.com/thread/1718654143110512741.html ?
| xD
| flir wrote:
| Oh that's ok, I'll just have the chatbot write some tests too
| ;)
| Ozzie_osman wrote:
| > I wonder how many subtle errors will make their way to the
| new codebase.
|
| Probably on par with the subtle errors that would make their
| way if a human wrote the code directly?
| Slyfox33 wrote:
| No?
| Kiro wrote:
| Probably less than if a human did it. Compared to my code, AI
| generated code is much more thorough and takes more edge cases
| into account. LLMs have no problem writing tedious safe-guards
| against stuff that lazy humans skip with the argument that it
| will probably never happen.
| largbae wrote:
| I wonder if LLM language conversions will lead to a consolidation
| of languages. Suppose that you could prototype in any language
| and autoconvert that resulting functionality to Rust or another
| language with the right runtime features, would that be an
| appealing dev model?
| binary132 wrote:
| rust doesn't have runtime features
| threecheese wrote:
| I have the same suspicion; the current ecosystem of computing
| is very much a product of human constraints, and it may end up
| being more cost-efficient to have a single standard be used by
| AI models rather than having them need to match every unique
| code+libraries+hardware combination that exists or will exist.
| How this affects the computing ecosystem, this worries me.
| impure wrote:
| I used an LLM to convert my XML parser from Dart to Go. It was
| mostly right but with some giant mistakes. This was when I was
| extremely new to Go, don't know if I would do it again. It might
| be faster to manually write the code because that way I could
| spend less time reading it.
| Frieren wrote:
| > There is a recurring need in the software world for teams to
| convert a codebase from one language to another.
|
| Sounds more like a sales pitch than a reality. I have seen many
| times developers excited to port code from one language to
| another, but just because it is an opportunity to learn something
| new, do something different for a change and even rewrite old
| code.
|
| What is the value if it is done automatically, nobody learns
| anything and the code is just a transcript of the old one?
| imvetri wrote:
| I don't think anyone in this community has solved doing this
| automatically.
|
| Value it provides is for the business. If a tool can do it,
| there is no need to hire or keep an engineering team.
|
| An engineering team has a running cost. Whereas if someone
| makes a tool and sells it at a price slightly lower than what's
| spent on the engineering team, doesn't it add value?
|
| First, it's a tool that does it, so it's more reliable than a human.
|
| If it is a sales pitch, someone will get it done, as there is
| an opportunity
| TechDebtDevin wrote:
| It's not reliable and LLMs cannot create anything novel.
|
| Your hypothetical company is going to end up pulling a
| Crowdstrike with this method and then they definitely won't
| need an engineering budget!
| nerdjon wrote:
| > Value it provides is for the business. If a tool can do it,
| there is no need to hire or keep an engineering team.
|
| What exactly is the "value"? It worked before so what is the
| purpose of the change.
|
| > First, it's a tool that does it, so it's more reliable than a
| human.
|
| Sorry but, are you new to LLMs? Have you seen the recent
| news? "Reliable" it is not.
|
| If your pitch is that an LLM will just write all of your code
| in the first place, there really isn't any need to migrate
| the code to another language when by your logic the LLM could
| just manage its existing language. The logic here quickly
| breaks down and doesn't make any sense.
| imvetri wrote:
| What news? I'm not talking about LLM. I'm talking about the
| challenge of cross language and code generation
| nerdjon wrote:
| > What news?
|
| https://news.ycombinator.com/item?id=40475578
|
| One such news entry. LLMs being unreliable is not a
| controversial opinion. It is well known and easy to find
| many instances of similar issues.
|
| > code generation
|
| What exactly do you think is generating the code? The
| article here is about generating code with an LLM.
| nerdjon wrote:
| Right, I have never been in a situation where a re-write in
| another language was considered and it was not due to some
| other reason.
|
| Most of the time it is first, we need to change some major
| functionality, we have an architectural issue, or something
| along those lines that will require a major re-write in the
| first place. So the idea of, maybe we should use a different
| language comes up.
|
| The idea of re-writing something in another language with
| identical functionality just for the sake of using another
| language just isn't a normal exercise unless you have a CTO
| pushing for something unnecessarily.
|
| Maybe, maybe I could buy saying we don't want to manage Java
| servers anymore or something along those lines. But even then,
| why break something that works.
|
| This seems like such a bad idea, is going to introduce so many
| bugs, require a ton of testing, for a minimal at best gain?
|
| And then yeah, who is going to maintain it given that no one
| actually wrote the code in the first place. Goodbye historical
| knowledge and productivity. Hope you don't find a critical bug
| as soon as you release it that needs to be fixed asap.
|
| Don't do this, a seriously bad idea. That assumes that it is
| somehow a 1:1 functionality which by now we should be well
| aware that an LLM is going to make mistakes.
| simonw wrote:
| That was my initial instinct on reading that sentence too - I
| don't think converting from one language to another is actually
| very common.
|
| But in this particular case I think they justified doing so:
| "Our team had a prototype written in the language, R, and
| wanted to convert this to our standard production tech stack,
| Golang and ReactJS."
|
| As a Python programmer I tend not to worry about this, because
| Python is a good language for both prototyping and production -
| but I can absolutely see the need for this if you're
| prototyping with tools that you wouldn't want to run in
| production.
| tbrownaw wrote:
| One _benefit_ I've heard for using different languages for
| prototyping and production is that it helps you remember to
| rewrite things properly rather than just dumping prototype-
| quality code into prod.
|
| Working around this by using tools that aren't exactly known
| for code quality in the first place seems like a bit of an
| odd choice.
| disgruntledphd2 wrote:
| > "Our team had a prototype written in the language, R, and
| wanted to convert this to our standard production tech stack,
| Golang and ReactJS."
|
| It's very hard for me to understand how this would work,
| unless the R code was very very simple.
|
| Like, R is mostly used for stats, and Go doesn't have all of
| the stats libraries, so what did the LLM generate?
|
| Maybe it was a pretty simple LoB app written in R (which
| would be pretty weird, even I as an R-head gave up on writing
| general purpose software in R some time ago) in which case it
| makes sense, or else the LLM generated lots and lots of
| boilerplate for matrix multiplication (I imagine any
| implementation of `model.matrix` would have been fun).
|
| Very very strange to me, at least.
| simonw wrote:
| I would expect most good LLMs to be able to implement
| statistical functions from scratch in languages like Go.
|
| I often ask ChatGPT Code Interpreter to implement algorithms
| from scratch in Python where the library needed for that
| function isn't present in the Code Interpreter environment
| - things like haversine distances for example.
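|
| For reference, a from-scratch haversine distance is only a few
| lines. A minimal sketch, assuming a spherical Earth with a mean
| radius of 6371 km (the function name and the radius constant
| are choices made for this example):

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; real Earth is not a sphere

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    a = min(1.0, a)  # guard against rounding pushing a just above 1
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# A quarter of the equator should be a quarter of the circumference.
print(haversine_km(0.0, 0.0, 0.0, 90.0))
```

| Checking against known geometry, as in the last line, is a
| cheap way to validate generated code like this.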
| skissane wrote:
| > I would expect most good LLMs to be able to implement
| statistical functions from scratch in languages like Go.
|
| Implementing statistical functions from scratch can be
| rather dangerous - can you trust the implementation is
| correct? You can have an implementation which works well
| for a few obvious tests, but then performs poorly for
| edge cases (e.g. due to excessive accumulation of
| rounding error). Whereas, good chance the existing R
| implementation of _whatever_ has been reviewed by expert
| statisticians.
|
| LLMs can be great for saving time/energy when you have
| the domain expertise to validate their answers. But if
| you don't...
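|
| The rounding-error concern is easy to demonstrate. A minimal
| sketch comparing a textbook one-pass variance formula with
| Welford's algorithm on data with a large offset; the function
| names and data are chosen for this example and do not come from
| R or from the article:

```python
def naive_variance(values):
    """Sample variance via sum of squares; prone to cancellation."""
    n = len(values)
    total = 0.0
    total_sq = 0.0
    for x in values:
        total += x
        total_sq += x * x
    return (total_sq - total * total / n) / (n - 1)

def welford_variance(values):
    """Sample variance via Welford's running update; numerically stable."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1)

small = [4.0, 7.0, 13.0, 16.0]          # exact sample variance is 30
shifted = [1e9 + x for x in small]      # same spread, large offset

print(naive_variance(small), welford_variance(small))
print(naive_variance(shifted), welford_variance(shifted))
```

| Both functions agree on the small inputs, which is the "works
| for a few obvious tests" case; on the shifted data the one-pass
| formula loses the answer to cancellation.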
| simonw wrote:
| Yeah that's fair, I don't have a strong enough background
| in statistics to be able to catch edge cases in these
| kinds of things.
| goatlover wrote:
| The statistical functions in R and Python libraries are
| well tested. I don't know what sort of confidence you'd
| have in an LLM generating new stats libraries in other
| languages.
| cpeterso wrote:
| Was the R prototype outputting HTML code? What React front
| end code of any value is the LLM extracting from the R
| prototype?
|
| Other options to converting the code: call into the R code
| from the Go code. Or don't let your prototype grow to 12
| KLOC in a language you don't intend to use in production.
| temporarely wrote:
| Back in late 80s we were building automated Fortran to C
| converters. Client was in the aerospace field.
|
| > What is the value if it is done automatically, nobody learns
| anything and the code is just a transcript of the old one?
|
| You may be shocked to learn that businesses using software have
| a different metric for the value of "code" than educating their
| (transient) code wranglers. The actual value of software is
| computational work. If a new language affords better tooling
| and availability of human resources, that is a win.
| Closi wrote:
| Yes, I was at a company that moved an application from Cobol
| to Java for exactly that purpose - having a mission critical
| application written in cobol is way harder to maintain than
| having that exact same application in Java.
| codr7 wrote:
| Extremely short term, yes.
|
| In the longer perspective, you'll lose most good developers
| if you don't allow them to evolve and have some fun along the
| way. And without the developers, the source code is pretty
| much useless.
|
| Humans are not machines.
| StableAlkyne wrote:
| > you'll lose most good developers if you don't allow them
| to evolve and have some fun along the way
|
| That's actually something I really like about tools like GH
| Copilot! It gives me an excuse to try out something using a
| new language, but with less of the productivity dip that
| comes from chasing syntax or stdlib calls. It doesn't
| produce code that is as good as an expert in that language,
| but it's a really convenient set of training wheels
|
| So it becomes easier to justify, at least with my current
| organization
| refulgentis wrote:
| I think this is an interesting line of argument but it's
| sort of reached its shallow depth on its 3rd exposition:
| it's not very complicated if I'm reading it correctly:
|
| "Theoretically, developers could eschew jobs that don't
| allow them to creatively reinterpret code as they translate
| it."
|
| It's a weak argument, because if you're translating even
| manually, it's not exactly the peak of creative self-
| expression.
|
| There's plenty of rote code that we'd all be happy to
| automate translation of --- I used this technique with GPT
| 3.0 to get math code translated across languages for
| Google's color library.
| imvetri wrote:
| Businesses see human and machine as resources.
| yarg wrote:
| There's a huge value in being able to automate conversion,
| especially on an active project with several teams working on
| features for several different clients (where downtime simply
| is not an option).
|
| Having dealt with a similar problem however, stay away from AI
| and instead perform the conversion by manipulating source code
| ASTs.
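|
| For a same-language rewrite, the AST approach can be sketched
| with Python's standard `ast` module. This is a minimal
| illustration, assuming Python 3.9+ for `ast.unparse`; the names
| `fetch_rows` and `load_rows` are hypothetical. Cross-language
| conversion would also need an emitter for the target language,
| which is not shown here.

```python
import ast

class RenameCall(ast.NodeTransformer):
    """Rewrite calls to `old_name(...)` into `new_name(...)`."""

    def __init__(self, old_name: str, new_name: str):
        self.old_name = old_name
        self.new_name = new_name

    def visit_Call(self, node: ast.Call) -> ast.Call:
        self.generic_visit(node)  # handle nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == self.old_name:
            new_func = ast.Name(id=self.new_name, ctx=ast.Load())
            node.func = ast.copy_location(new_func, node.func)
        return node

def rewrite(source: str, old_name: str, new_name: str) -> str:
    tree = RenameCall(old_name, new_name).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # note: drops comments and formatting

before = "total = fetch_rows(fetch_rows(query), limit=10)\nlabel = 'fetch_rows'\n"
after = rewrite(before, "fetch_rows", "load_rows")
print(after)
```

| Unlike a text search and replace, the string literal on the
| second line is left alone because only call nodes are touched,
| and the same input always gives the same output.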
| singingfish wrote:
| Not that long ago I moved a 250kline codebase from oracle to
| postgres. Yes, SQL embedded in strings and so on.
|
| Towards the end of that process, ChatGPT helped me with that,
| and it was pretty valuable for some kinds of problem. Still had
| to watch it like a hawk and specify things really clearly to
| make sure it didn't go off the rails.
| bustodisgusto wrote:
| This is a perfect use case for LLMs at the moment. I wrote a
| script to update an Express code base to Hono. I got Claude to
| write a regex that would match the handler to the route and
| called the Claude 3.5 API with an example conversion and some
| other relevant context.
|
| With the right prompt, it produced extremely clean and workable
| code.
|
| ~20 controller files and over 100 route handlers were converted
| in about 20 minutes and 5 dollars.
|
| The engineering cost of migrating code bases is trending to 0
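|
| The route-matching step might look something like the
| following. This is a guess at the general shape, not the
| commenter's actual regex: the pattern, the `find_routes` helper
| and the sample routes are all assumptions, and the LLM call
| itself is left out.

```python
import re

# Matches simple Express-style registrations with one named handler,
# e.g. router.get("/users/:id", getUser)
ROUTE_PATTERN = re.compile(
    r"""
    (?:app|router)\.(?P<method>get|post|put|patch|delete)\(\s*
    (?P<quote>['"])(?P<path>[^'"]+)(?P=quote)\s*,\s*
    (?P<handler>[A-Za-z_$][\w$]*)\s*\)
    """,
    re.VERBOSE,
)

def find_routes(source: str) -> list:
    """Return method, path and handler name for each matched route."""
    return [
        {"method": m["method"].upper(), "path": m["path"], "handler": m["handler"]}
        for m in ROUTE_PATTERN.finditer(source)
    ]

sample = "\n".join([
    'router.get("/users/:id", getUser);',
    "router.post('/users', createUser);",
    'app.delete("/users/:id", requireAdmin, deleteUser);',  # middleware chain
])

for route in find_routes(sample):
    print(route)
```

| The third registration is skipped because of the extra
| middleware argument, which is the kind of gap that still needs
| a human to notice.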
| thesz wrote:
| > The engineering cost of migrating code bases is trending to 0
|
| I work with a code base of >750K LOC C++ that is 12+ years old
| and would like to migrate it to something fashionable like
| Futhark or Python. So, please, tell me more about your
| wonderful regular expression.
| bustodisgusto wrote:
| I'm not sure why you would want to migrate a C++ codebase to
| an interpreted language?
| redleggedfrog wrote:
| How maintainable is the code for _humans_?
| gregors wrote:
| I'm curious about the security implications and corporate
| policies about uploading your entire codebase to an LLM where
| others can access it (indirectly or directly).
|
| Other than that, I'm very interested to see how easily open-source
| libraries could be converted from ecosystem A to B.
| tbrownaw wrote:
| > _where others can access it (indirectly or directly)._
|
| Anything sold specifically for corporate use should come with
| contract terms that prohibit this. (The few that try to not
| guarantee confidentiality won't survive very long.)
|
| I know that one of the things $employer looks for is an
| explicit ban on using our data for training. Or even against
| having humans in the loop for the abuse monitoring process;
| that one came with rules about us having certain controls in
| place.
| DarkContinent wrote:
| It's not clear to me from the article how Mantle was porting the
| build scripts, infrastructure config files, etc across languages.
| Typically these files don't cleanly translate from one framework
| to another. Was this considered part of the 20% of the project
| left for human engineering effort?
| pmarreck wrote:
| That's odd. I was discussing this very idea with ChatGPT just
| last night, in the context of coming up with a way to
| deterministically go from <example code> to <english language
| description of example code> and back again, and then thought
| that English might be a good intermediate language when
| converting logic to a different programming language...
|
| https://chatgpt.com/share/5d2245e8-135e-44f4-a204-401e625183...
| imvetri wrote:
| "That's strange" would be the right word.
|
| Great minds think alike.
|
| You are solving the problem using chatgpt, with right words on
| first hit.
|
| Author here is solving problem using chatgpt's usage syntax.
|
| You will not be convinced by either, because the problem is not
| solved.
| rock_artist wrote:
| Recently I've converted some code to make an app from python to
| Swift. I've tried using Gemini and ChatGPT. The time I've spent
| afterwards debugging it in order to fix introduced bugs made it
| not worth it.
|
| IMHO, the way this could work is only if you have very good test
| coverage so you can run them. But without it this can easily go
| off the tracks.
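|
| One way to lean on tests during a port is to run the old and new
| implementations side by side on the same inputs. A minimal
| sketch, assuming both can be called from one harness; the two
| discount functions are hypothetical stand-ins for an original
| and a port with a subtle rounding change:

```python
def compare_implementations(reference, candidate, cases):
    """Return (args, expected, actual) for every case where they differ."""
    mismatches = []
    for args in cases:
        expected = reference(*args)
        actual = candidate(*args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches

def original_discount(price_cents: int, percent: int) -> int:
    # Original behaviour: the discount is truncated to whole cents.
    return price_cents - (price_cents * percent) // 100

def ported_discount(price_cents: int, percent: int) -> int:
    # Port that rounds the discount instead of truncating it.
    return price_cents - round(price_cents * percent / 100)

cases = [(1000, 10), (999, 15), (250, 50), (1, 50)]
for args, expected, actual in compare_implementations(
        original_discount, ported_discount, cases):
    print(f"{args}: original={expected} ported={actual}")
```

| Three of the four cases agree; only the input with a fractional
| discount exposes the difference, so the case list needs to
| cover more than round numbers.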
___________________________________________________________________
(page generated 2024-07-20 23:10 UTC)