[HN Gopher] Large Language Models for Compiler Optimization
___________________________________________________________________
Large Language Models for Compiler Optimization
Author : og_kalu
Score : 42 points
Date : 2023-09-17 20:55 UTC (2 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| CalChris wrote:
| There's a half-day tutorial at the LLVM Developers' Meeting on
| this, _ML-Guided Compiler Optimization in LLVM_. However, the
| authors of this paper aren't giving that tutorial.
| klohto wrote:
| MLGO uses RL
| mathisfun123 wrote:
| Maybe today, but MLGO, the initiative, uses whatever works. To
| wit: Chris Cummins (first author here) was on last year's MLGO
| panel.
| wyldfire wrote:
| > understanding. We evaluate on a large suite of test programs.
| Our approach achieves a 3.0% improvement in reducing instruction
| counts over the compiler,
|
| 3% code size reduction is really good. The challenge will be
| having codegen like this that someone is willing to support. And
| for that they'd want to be able to reason about why the compiler
| made this decision or that one. IIUC that's an outstanding
| problem for AI in general.
| TheLoafOfBread wrote:
| Also, bugs from this approach are going to be funny - a
| program compiled with compiler version X will work as
| expected, and the same program compiled with version X+1 will
| start crashing because the AI under some circumstances decided
| that a dereference of a specific pointer was unnecessary and
| didn't emit it into the assembly.
|
| Good luck finding such a bug, because you will be looking at
| correct code while the computer executes invalid output.
| a_wild_dandan wrote:
| Damn. These things imitate humans too well. Guess we'll need
| a giant test suite to feel stable. Sounds like work. Let's
| train a GAN for that and put our feet up. Turtles all the way
| down, my dudes.
| foota wrote:
| This kind of application of LLMs is most interesting to me, since
| it's possible to evaluate correctness and performance
| quantitatively.
| jebarker wrote:
| It seems like a poor fit to me precisely because correctness is
| boolean, difficult to measure and getting it wrong is very bad.
| I do think there's a place for AI here but it's probably not
| LLMs in their current form.
| foota wrote:
| The approach seems to focus on selecting which optimizations
| to apply in LLVM (e.g., should this function be inlined?), so
| the worst case is that the code is poorly optimized - you
| can't select optimizations to apply and get a wrong result.
|
| I agree that what you say is true of compilation as a whole,
| but that doesn't seem to be the focus here (rather, it's used
| as a sort of crutch to help the LLM learn).
| kukkamario wrote:
| They are not using the LLM to directly produce the result
| code, but as a tool that decides which optimisations should be
| applied and in which order, which is a fairly complex problem
| to solve. But if the optimisation passes are implemented
| correctly (which is required for a functioning optimising
| compiler anyway), it cannot produce incorrect code - at worst
| code that is suboptimal compared to the default heuristics.
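|
| A toy sketch of that safety property (not the paper's actual
| setup; assumes LLVM's `opt` is on PATH, and `ask_model` is a
| hypothetical stand-in for the trained model):
|
|   import subprocess
|
|   ALLOWED_PASSES = {"mem2reg", "sroa", "instcombine",
|                     "simplifycfg", "gvn", "dce"}
|
|   def ask_model(ir_text: str) -> list[str]:
|       # Hypothetical: a real system would query the model here.
|       return ["mem2reg", "sroa", "instcombine",
|               "simplifycfg", "gvn", "dce"]
|
|   def optimize(input_ll: str, output_ll: str) -> None:
|       ir = open(input_ll).read()
|       # Drop unknown pass names so a hallucinated pass never
|       # even reaches opt; only LLVM itself touches the IR.
|       passes = [p for p in ask_model(ir) if p in ALLOWED_PASSES]
|       subprocess.run(
|           ["opt", "-S", f"-passes={','.join(passes)}",
|            input_ll, "-o", output_ll],
|           check=True,
|       )
|
| The worst a bad suggestion can do here is leave the code slow
| or large; correctness rests entirely on the passes, as you say.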
| jebarker wrote:
| If there's a list of known optimizations that preserve
| correctness then it becomes an optimization problem based
| on output length (as a proxy for cycle count). So is the
| idea that an LLM is more efficient than a search or direct
| optimization?
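|
| For reference, the kind of search I have in mind is as dumb as
| this - a toy baseline, assuming `opt` is on PATH and scoring
| with a crude line-count proxy for IR instruction count:
|
|   import random
|   import subprocess
|
|   PASSES = ["mem2reg", "sroa", "instcombine",
|             "simplifycfg", "gvn", "dce"]
|
|   def score(input_ll: str, passes: list[str]) -> int:
|       out = subprocess.run(
|           ["opt", "-S", f"-passes={','.join(passes)}",
|            input_ll, "-o", "-"],
|           check=True, capture_output=True, text=True,
|       ).stdout
|       # Indented, non-comment lines roughly correspond to
|       # instructions in textual IR; a real evaluation would
|       # use LLVM's own statistics instead.
|       return sum(1 for ln in out.splitlines()
|                  if ln.startswith("  ")
|                  and not ln.lstrip().startswith(";"))
|
|   def random_search(input_ll: str, trials: int = 100):
|       # Best pass ordering found across `trials` random tries.
|       return min(
|           (random.sample(PASSES, random.randint(1, len(PASSES)))
|            for _ in range(trials)),
|           key=lambda p: score(input_ll, p),
|       )
|
| Each trial costs a compile, which is the real question: can an
| LLM get comparable results in a single shot?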
| sagarm wrote:
| There tend to be a lot of heuristics involved when
| deciding which optimizations to apply and in which order,
| so there's plenty of room to apply some machine learning.
|
| Whether LLMs are the right approach is a separate
| question.
|
| In SQL optimization, the problem is a bit trickier (IMO)
| because compilation is in the query path. One successful
| approach I know of is Bao:
| https://arxiv.org/abs/2004.03814
| armchairhacker wrote:
| The trick is finding a way to ensure that the LLM produces
| something which is always correct. Like in this case, the LLM
| only chooses compiler optimizations rather than writing the
| assembly itself, so no matter what it outputs the code is
| correct, it just may be larger. Other possibilities: an LLM
| which applies semantics-preserving program transformations, or
| an LLM combined with a proof assistant to verify the output
| (more generally, and for any domain, an LLM as an NP oracle).
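|
| Sketched crudely (everything here is a hypothetical
| placeholder - the point is only that the guarantee comes from
| the checker, never from the model):
|
|   def sample_model(program: str) -> str:
|       # Stand-in for an LLM call.
|       return program
|
|   def is_equivalent(a: str, b: str) -> bool:
|       # Stand-in for a proof assistant / SMT equivalence check.
|       return a == b
|
|   def guarded_optimize(program: str, budget: int = 10) -> str:
|       best = program
|       for _ in range(budget):
|           candidate = sample_model(program)  # untrusted
|           if (is_equivalent(program, candidate)
|                   and len(candidate) < len(best)):
|               best = candidate  # keep verified improvements only
|       return best  # worst case: the original, unchanged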
|
| But I agree, as of now I haven't seen good uses where LLMs
| produce reliable output. Not only do you need the guarantee
| that whatever it outputs always yields a correct program, you
| also need a task where an LLM is considerably better than a
| simple or random algorithm, and you need a lot of training
| data (which severely restricts how creative you can be with
| the output).
| reic wrote:
| Quantification can be done by measuring in at least two
| dimensions: (1) the size of the synthesised code, and (2) how
| precisely the generated code matches the input (roughly: on
| what fraction of inputs do the two programs give different
| outputs). We have set up a challenge that _seeks_ to entice
| the community to look into this problem domain more. And we've
| simplified the assumptions, so as to make it more tractable:
|
| - Challenge:
| https://codalab.lisn.upsaclay.fr/competitions/15096
|
| - Paper describing the challenge:
| https://arxiv.org/abs/2308.07899
|
| (I am one of the authors, AMA)
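|
| A toy illustration of metric (2), treating programs as plain
| Python callables purely for concreteness:
|
|   import random
|
|   def match_fraction(reference, candidate,
|                      trials: int = 10_000) -> float:
|       # Fraction of random 32-bit inputs on which the two
|       # programs agree.
|       agree = 0
|       for _ in range(trials):
|           x = random.randint(-2**31, 2**31 - 1)
|           agree += reference(x) == candidate(x)
|       return agree / trials
|
|   # An exact rewrite scores 1.0 ...
|   print(match_fraction(lambda x: x * 2, lambda x: x + x))
|   # ... one that is wrong on negative inputs scores about 0.5.
|   print(match_fraction(lambda x: x * 2,
|                        lambda x: x << 1 if x >= 0 else 0))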
| haltist wrote:
| The next step is to add verification of the optimized code
| from the LLM with an SMT solver (like Z3) to remove
| "hallucinations". If the input and output code can be verified
| to be equivalent, then this would be a great addition to an
| optimization pipeline. Once that's done, the same can be
| applied to intermediate representations of GPU kernels, in a
| recursive loop of AI optimizing AI code for faster execution
| times.
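|
| A toy version of that check with Z3's Python bindings, proving
| a strength-reduction rewrite equivalent for all 32-bit inputs
| (real IR-level checkers like Alive2 are far more involved):
|
|   from z3 import BitVec, prove
|
|   x = BitVec("x", 32)
|   original = x * 8    # before optimization
|   optimized = x << 3  # after the rewrite
|
|   # Prints "proved" iff the two agree on every input; a
|   # counterexample would expose a semantics-changing
|   # "hallucination".
|   prove(original == optimized)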
| [deleted]
___________________________________________________________________
(page generated 2023-09-17 23:00 UTC)