[HN Gopher] Do variable names matter for AI code completion? (2025)
       ___________________________________________________________________
        
       Do variable names matter for AI code completion? (2025)
        
       Author : yakubov_org
       Score  : 46 points
       Date   : 2025-07-25 23:35 UTC (3 days ago)
        
 (HTM) web link (yakubov.org)
 (TXT) w3m dump (yakubov.org)
        
       | yakubov_org wrote:
       | When GitHub Copilot suggests your next line of code, does it
       | matter whether your variables are named "current_temperature" or
       | just "x"?
       | 
       | I ran an experiment to find out, testing 8 different AI models on
       | 500 Python code samples across 7 naming styles. The results
       | suggest that descriptive variable names do help AI code
       | completion.
       | 
       | Full paper: https://www.researchsquare.com/article/rs-7180885/v1
        
         | amelius wrote:
         | Shouldn't the LLMs therefore train on code where the variable
         | names have been randomized?
         | 
         | Perhaps it will make them more intelligent ...
        
           | jerf wrote:
           | No. Variable names contain valuable information. That's why
           | humans use them too.
           | 
           | AIs are finite. If they're burning brainpower on determining
           | what "x" means, that's brainpower they're not burning on your
           | actual task. It is no different than for humans. Complete
           | with all the considerations about them being wrong, etc.
        
             | amelius wrote:
             | Training them on randomized var names doesn't mean you
             | should do it deliberately during inference ...
             | 
              | Also, I think this is anthropomorphizing the LLMs a bit
              | too much. They are not humans, and I'd like to see an
             | experiment on how well they perform when trained with
             | randomized var names.
        
               | jerf wrote:
               | Neither "variable names contain valuable information" nor
               | "AIs are finite" are anthropomorphization. That variable
               | names contain information is not only relative to some
               | sort of human cognition; they objectively,
               | mathematically, contain information. Remove that
               | information and the AI is going to have to reconstruct it
               | (or at least most of it), and as finite things, that's
               | going to cost them because nothing like that is free for
                | finite things. None of my assertions here depend on the
                | humanity, the AI-ness, the Vulcan-ness, or any other
                | conceivable intellectual architecture of the coding
                | agent. They depend only on the agent being finite.
        
               | amelius wrote:
                | Let's stop with the comparison to humans. I'm more
                | interested in why it would hurt to train LLMs on harder
                | puzzles. Isn't that what we're doing all the time when
                | training LLMs? I'm just suggesting an easy way to
                | construct new puzzles: randomize the varnames.
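                | 
                | (A naive sketch of that augmentation in Python; it
                | renames every identifier, so a real version would need
                | to spare builtins, imports, and attributes:)
                | 
                |     import ast
                |     import random
                |     import string
                | 
                |     class Renamer(ast.NodeTransformer):
                |         """Replace every variable name with a random string."""
                |         def __init__(self):
                |             self.mapping = {}
                | 
                |         def visit_Name(self, node):
                |             if node.id not in self.mapping:
                |                 self.mapping[node.id] = "".join(
                |                     random.choices(string.ascii_lowercase, k=6))
                |             node.id = self.mapping[node.id]
                |             return node
                | 
                |     src = "total = price * quantity"
                |     tree = Renamer().visit(ast.parse(src))
                |     print(ast.unparse(tree))  # e.g. qzlfkw = mxcvbo * jdhrte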
        
               | recursive wrote:
               | An even easier way to construct new puzzles is to fully
               | randomize the problem statements and intended solutions.
               | 
               | When you take out the information from the variable
               | names, you're making the training data farther from real-
               | world data. Practicing walking on your hands, while
               | harder than walking on your feet, won't make you better
               | at hiking. In fact, if you spend your limited training
               | resources on it, the opportunity cost might make you
               | worse.
        
               | JambalayaJimbo wrote:
                | But we know that variable names do not matter whatsoever
                | to a compiler. Now, without looking at hard data, I do
                | agree with you intuitively that LLMs perform better on
                | meaningful variable names - but I don't think it has
                | anything to do with "brainpower" - I just think your
                | input data is more likely to resemble training data with
                | meaningful variable names.
        
               | socalgal2 wrote:
               | I think you're just arguing semantics. It seems
               | intuitively obvious that if I have some simple physics
                | code
                | 
                |     newPosition = currentPos + velocity * deltaTime
                | 
                | and change it to
                | 
                |     addressInChina = weightByGold + numberOfDogsOwned * birdPopulationInFrance
               | 
                | that both a human and likely an LLM will struggle to
                | understand the code and do the right thing. The thing
                | we're discussing is whether the LLM struggles. No one
                | cares if that's not literally "brain" power. All they
                | care about is whether the LLM does a better, worse, or
                | the same job.
               | 
               | > I just think your input data is more likely to resemble
               | training data with meaningful variable names.
               | 
                | From my experience giving job interviews, cryptic names
                | are common.
        
               | _0ffh wrote:
                | Exactly _because_ the task is harder when the variable
                | names contain no information, training like that could
                | be a good idea. It forces the LLM to pay attention to
                | the _actual code_ to get it right, which in training is
                | a Good Thing (TM).
        
             | datameta wrote:
              | Right. The inherent information content goes down once
              | all the metadata is stripped from the variable name, and
              | the value has to be re-contextualized in a fresh logical
              | chain every time.
        
             | JambalayaJimbo wrote:
             | LLMs do not have brains and there is no evidence as far as
             | I know that they "think" like human beings do.
        
               | gnulinux wrote:
                | LLMs do not reason at all (i.e. perform deductive
                | reasoning within a formal system). Chain of thought etc.
                | simulate reasoning by smoothing out the path to target
                | tokens by adding shorter stops along the way.
        
               | ACCount36 wrote:
               | Do you reason? "LLMs do not reason at all" casts that
               | into doubt immediately.
        
               | appreciatorBus wrote:
                | That being true does not mean there are no limits to
                | whatever it is doing, and those limited resources can
                | be wasted on ambiguous naming schemes.
               | 
               | I am far from an AI booster or power user but in my
               | experience, I get much better results with descriptive
               | identifier names.
        
               | ACCount36 wrote:
               | LLMs are only capable of performing a finite amount of
               | computation within a single forward pass. We know that
               | much.
               | 
                | They are also known to operate on high-level
                | abstractions and concepts, unlike systems operating
                | strictly on formal logic, and very much like humans.
        
           | fenomas wrote:
            | LLMs do see randomized identifiers whenever they encounter
            | minified code. And you can get a bit of an idea of how much
            | they learn by giving an LLM some minified JS and asking it
            | to restore it with meaningful var names.
           | 
           | When I tried it once the model did a surprisingly good job,
           | though it was quite a while ago and with a small model by
           | today's standards.
        
           | knome wrote:
            | If you train them on randomized names, they'll also suggest
            | them.
            | 
            | Better not to, I think.
        
           | dingnuts wrote:
            | No. They're more likely to predict the correct next token
            | the closer the code is to the training set. If you're doing
            | something generic, short names will get the right
            | predictions; if you're working in a specific problem
            | domain, an input that starts the sequence generation in a
            | part of the model trained on that domain is going to do
            | better.
        
           | empath75 wrote:
           | They're trained on plenty of code with bad variable names.
           | 
            | But every time you make an AI think, you are introducing an
            | opportunity for it to make a mistake.
        
       | ssalka wrote:
       | The names of variables impart semantic meaning, which LLMs can
       | pick up on and use as context for determining how variables
       | should behave or be used. Seems obvious to me that
       | `current_temperature` is a superior name to `x` - that is, unless
       | we're doing competitive programming ;)
        
         | yakubov_org wrote:
         | My first hypothesis was that shorter variable names would use
         | fewer tokens and be better for context utilisation and
         | inference speed. I would expand your competitive programming
         | angle to the obfuscated C challenge ;)
        
           | Macha wrote:
            | The problem is, unless you're doing green-field
            | development, the description of the desired functionality
            | has to live somewhere, and I suspect a parallel markdown
            | requirements document plus code with golfed variable names
            | is going to require more context, not less.
        
       | Groxx wrote:
       | Obviously yes. They all routinely treat my "thingsByID" array
       | like a dictionary - it's a compact array where ID = index though.
       | 
       | They even screw that up inside the tiny function that populates
       | it. If anything IMO, they over-value names immensely (which makes
       | sense, given how they work, and how broadly consistent
       | programmers are with naming).
        
         | gnulinux wrote:
          | Do you still have this problem if you add a comment before
          | declaring the variable, like "Note: thingsById is not a
          | dictionary, it is an array. Each index of the array
          | represents a blabla id that maps to a thing"?
          | 
          | In my experience they do overvalue var names, but they value
          | comments even more. So I tend to calibrate these things with
          | more detailed comments.
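          | 
          | (A minimal sketch of that calibration, with hypothetical
          | names:)
          | 
          |     from dataclasses import dataclass
          | 
          |     @dataclass
          |     class Thing:
          |         id: int
          |         name: str
          | 
          |     things = [Thing(0, "a"), Thing(2, "c"), Thing(1, "b")]
          | 
          |     # NOTE: things_by_id is NOT a dictionary. It is a compact
          |     # list where each index IS the id of the thing stored there.
          |     things_by_id = [None] * len(things)
          |     for thing in things:
          |         things_by_id[thing.id] = thing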
        
         | DullPointer wrote:
         | Curious if you get better results with something like
         | "thingsByIdx" or "thingsByIndex," etc.?
        
       | robertclaus wrote:
       | Nice to see actual data!
        
       | OutOfHere wrote:
       | Section names (as a comment) help greatly in long functions.
       | Section names can also help partially compensate for some of the
       | ambiguity of variable names.
       | 
        | Another thing that matters massively in Python is highly
        | accurate, clear, and sensible type annotations. In contrast,
        | incorrect type annotations can throw off the LLM.
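        | 
        | (A minimal sketch of both ideas, with hypothetical names:)
        | 
        |     def average_temperatures(
        |             temps_by_city: dict[str, list[float]]) -> dict[str, float]:
        |         # --- Section: drop cities with no readings ---
        |         usable = {city: t for city, t in temps_by_city.items() if t}
        |         # --- Section: compute per-city averages ---
        |         return {city: sum(t) / len(t) for city, t in usable.items()}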
        
       | r0s wrote:
       | The purpose of code is for humans to read.
       | 
       | Until AI is compiling straight to machine language, code needs to
       | be readable.
        
         | deadbabe wrote:
         | Variable names don't matter in small scopes.
        
           | r0s wrote:
           | The scope of the cognitive effort is the total context of the
           | system. Yes it matters.
        
           | rented_mule wrote:
           | It certainly can matter in any scope. `x` or even `delay`
           | will lead to more bugs down the line than
           | `delay_in_milliseconds`. It can be incredibly frustrating to
           | debug why `delay = 1` does not appear to lead to a delay if
           | your first impression is that `delay` (or `x`) is in seconds.
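            | 
            | (A tiny illustration of the failure mode, as a hypothetical
            | snippet:)
            | 
            |     import time
            | 
            |     delay = 1                  # seconds? milliseconds? the name won't say
            |     time.sleep(delay)          # sleeps a full second
            | 
            |     delay_in_milliseconds = 1  # the unit travels with the name
            |     time.sleep(delay_in_milliseconds / 1000)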
        
             | deadbabe wrote:
             | You will have exactly the same problem if
             | delay_in_milliseconds is actually misnamed and the delay is
             | measured in seconds.
             | 
             | Comments lie. Names lie. Code is the only source of truth.
        
               | fwip wrote:
               | No - "delay_in_milliseconds" will let you find the error
               | and resolve it faster. With the less descriptive name,
               | you need to notice the mismatch between the definition
               | and the use site, which are further apart in context.
               | Imagine you see in your debugger: "delay_in_milliseconds:
               | 3" in your HttpTimeout - you'll instantly know that's
               | wrong.
               | 
               | If you believe your reductive argument, your function and
               | variable names would all be minimally descriptive, right?
        
               | deadbabe wrote:
               | For your specific example, there would never be a "delay
               | in milliseconds" variable in the first place. That's just
               | throat clearing.
               | 
               | "sleep 1" is the complete expression. Because sleep takes
               | a parameter measured in seconds, it's already understood.
               | 
                | You do not need "delay_in_seconds = 1" and then a
                | separate "sleep delay_in_seconds". That accomplishes
                | nothing; you might as well add a comment like "//seconds"
                | if you want some kind of clarity.
        
               | rented_mule wrote:
                | Years later, when all memory of intent is long gone, I'd
                | much rather work on a large code base that errs on the
                | side of too much "throat clearing" than one that errs on
                | the side of too little. `sleep 1` tells what was
                | written, which may or may not match intent.
               | 
               | Many bugs come from writing something that does not match
               | intent. For example, someone writes most of their code in
               | another language where `sleep` takes milliseconds, they
               | meant to check the docs when they wrote it in this
               | language, but the alarm for the annual fire drill went
               | off just as they were about to check. So it went in as
               | `sleep 1000` in a branch of the code that only runs
               | occasionally. Years later, did they really mean 16
               | minutes and 40 seconds, or did they mean 1 second?
               | 
               | Leaving clues about intent helps detect such issues in
               | review and helps debug the problems that slip through
               | review. Comments are better than nothing, but they are
               | easier to ignore than variable names.
        
               | deadbabe wrote:
               | If the code isn't working, then intent doesn't matter.
               | The code was wrong.
               | 
               | If the code is working, the intent also doesn't matter,
               | what was written is what was intended.
               | 
               | Do the requirements call for an alarm of 16 minutes 40
               | seconds? Then leave the code be. If not, just change it.
        
       | qwertytyyuu wrote:
        | lol why is SCREAM_SNAKE_CASE outperforming?
        
       | nemo1618 wrote:
       | Time for Hungarian notation to make a comeback? I've always felt
       | it was unfairly maligned. It would probably give LLMs a decent
       | boost to see the type "directly" rather than needing to look up
       | the type via search or tool call.
        
         | socalgal2 wrote:
         | It was and still is
         | 
         | https://www.joelonsoftware.com/2005/05/11/making-wrong-code-...
         | 
         | Types help but they don't help "at a glance". In editors that
         | have type info you have to hover over variables or look
         | elsewhere in the code (even if it's up several lines) to figure
         | out what you're actually looking at. In "app" hungarian this
         | problem goes away.
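          | 
          | (A minimal sketch of Apps Hungarian in the spirit of that
          | article, with hypothetical names; "us" marks an unsafe raw
          | string, "s" a safe escaped one:)
          | 
          |     from html import escape
          | 
          |     usComment = input()            # "us": unsafe, raw user input
          |     sComment = escape(usComment)   # "s": safe, HTML-escaped
          | 
          |     # Wrong code looks wrong: print(usComment) would stand out.
          |     print(sComment)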
        
           | hmry wrote:
           | I remember thinking this post was outdated when I first read
           | it.
           | 
           | "Safe strings and unsafe strings have the same type - string
           | - so we need to give them different naming conventions." I
           | thought "Surely the solution is to give them different types
           | instead. We have a tool to solve this, the type system."
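            | 
            | (A minimal sketch of that type-system approach in Python,
            | with hypothetical names:)
            | 
            |     from typing import NewType
            | 
            |     SafeHtml = NewType("SafeHtml", str)  # escaped, safe to render
            | 
            |     def escape(raw: str) -> SafeHtml:
            |         return SafeHtml(raw.replace("&", "&amp;").replace("<", "&lt;"))
            | 
            |     def render(page: SafeHtml) -> None:
            |         print(page)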
           | 
           | "Operator overloading is bad because you need to read the
           | entire code to find the declaration of the variable and the
           | definition of the operator." I thought "No, just hit F12 to
           | jump to definition. (Also, doesn't this apply to methods as
           | well, not just operators?) We have a tool to solve this, the
           | IDE."
           | 
           | If it really does turn out that the article's way is making a
           | comeback 20 years later... How depressing would that be? All
           | those advances in compilers and language design and editors
           | thrown out, because LLMs can't use them?
        
             | selimthegrim wrote:
             | I wonder if LLMs grok multiple dispatch
        
       | k__ wrote:
        | It's kinda funny that people are taking decades of good coding
        | practices seriously now that they work with AI instead of
        | humans.
        
         | roxolotl wrote:
          | I was talking to a coworker about how they get the most out
          | of Claude Code, and they went on to list every best practice
          | they'd never been willing to implement before. For some
          | reason people are willing to produce design documentation,
          | provide comments that explain why, write self-documenting
          | code, and so on, now that they are using LLMs to generate
          | code.
         | 
         | It's the same with the articles about how to work with these
         | tools. A long list of coding best practices followed by a
         | totally clueless "wow once I do all the hard work LLMs generate
         | great code every time!"
        
         | kingstnap wrote:
         | "Context engineering" + "Prompt Engineering":
         | 
         | 1. Having clear requirements with low ambiguity. 2. Giving a
         | few input output pairs on how something should work (few shot
         | prompting). 3. Avoiding providing useless information. Be
         | consicise. 4. Avoid having contradictory information or
         | distractors. 5. Break complex problems into more manageable
         | pieces. 6. Provide goals and style guides.
         | 
         | A.K.A its just good engineering.
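          | 
          | (A minimal few-shot prompt sketch for point 2, with a
          | hypothetical task:)
          | 
          |     prompt = """Convert temperatures from Celsius to Fahrenheit.
          | 
          |     Input: 0
          |     Output: 32.0
          | 
          |     Input: 100
          |     Output: 212.0
          | 
          |     Input: 37
          |     Output:"""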
        
       | quuxplusone wrote:
       | "500 code samples generated by Magistral-24B" -- So you didn't
       | use real code?
       | 
       | The paper is totally mum on how "descriptive" names (e.g.
       | process_user_input) differ from "snake_case" names (e.g.
       | process_user_input).
       | 
       | The actual question here is not about the model but merely about
       | the tokenizer: is it the case that e.g. process_user_input
       | encodes into 5 tokens, ProcessUserInput into 3, and calcpay into
       | 1? If you don't break down the problem into simple objective
       | questions like this, you'll never produce anything worth reading.
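        | 
        | (One way to check, assuming the tiktoken library and its GPT-4
        | encoding:)
        | 
        |     import tiktoken
        | 
        |     enc = tiktoken.encoding_for_model("gpt-4")
        |     for name in ["process_user_input", "ProcessUserInput", "calcpay"]:
        |         print(name, "->", len(enc.encode(name)), "tokens")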
        
         | ijk wrote:
         | True - though in the actual case of your examples, calcpay,
         | process_user_input, and ProcessUserInput all encode into
         | exactly 3 tokens with GPT-4.
         | 
         | Which is the exact kind of information that you want to know.
         | 
         | It is very non-obvious which one will use more tokens; the
         | Gemma tokenizer has the highest variance with
         | process|_|user|_|input = 5 tokens and Process|UserInput as 2
         | tokens.
         | 
          | In practice, I'd expect the performance difference to be
          | relatively minimal, as input tokens tend to get aggregated
          | quickly into more general concepts. But that's the kind of
          | question that's worth getting metrics on: my intuition
          | suggests one answer, but do the numbers hold up when you
          | actually measure it?
        
       | Sohcahtoa82 wrote:
       | It'd be interesting to see another result:
       | 
       | Adversarially named variables. As in, variables that are named
       | something that is deliberately wrong and misleading.
        | 
        |     import json as csv
        | 
        |     close = open
        |     with close("dogs.yaml") as socket:
        |         time = csv.loads(socket.read())
        |     for sqlite3 in time:
        |         ...  # I dunno, more horrifying stuff
        
       ___________________________________________________________________
       (page generated 2025-07-29 23:01 UTC)