[HN Gopher] Size is the best predictor of code quality (2011)
___________________________________________________________________
Size is the best predictor of code quality (2011)
Author : crapvoter
Score : 65 points
Date : 2022-11-11 20:26 UTC (2 hours ago)
(HTM) web link (blog.vivekhaldar.com)
| insane_dreamer wrote:
| This is why I only code in APL :P
| ozzythecat wrote:
| Meta observation: At least on Mobile Safari, this page breaks the
| back button. I had to spam press the back button like 10x in
| order to come back to HN. I absolutely despise websites that do
| this, and immediately feel a bias against whatever the site is
| selling or trying to convince me to believe.
|
| RE: the article itself
|
| I've been on projects where we eventually shipped the most poorly
| written code (variables with ambiguous naming, methods that are
| several hundreds of lines, classes with 5-10k lines, little or no
| test coverage). This code was baked into a consumer device, and
| we shipped over 16 million units over almost the span of a
| decade.
|
| I've also been on projects where the code was practically a work
| of art. By the time we were "done", the market had changed from
| under our feet. What we were doing wasn't so interesting or novel
| anymore. We never shipped.
|
| It doesn't surprise me that more code == more bugs (or an
| increasing probability), but it's good to keep in mind that value
| is what we care to produce. Everything else is a cost, and
| spending too much on any particular cost can take the whole
| enterprise down the ravine, regardless of good intentions.
| [deleted]
| [deleted]
| DougBTX wrote:
| > Meta observation: At least on Mobile Safari, this page breaks
| the back button.
|
| The UI is somewhat hidden, but a long press on the back button
| will open a pop up menu, which makes it easy to jump back two
| pages.
|
| Other than that, it is totally a bug in the website. I wonder
| how many lines of code it has??
| gibsonf1 wrote:
| I think use/time of the code needs to be factored in as the more
| the code is used, the more bugs are exposed and fixed, such that
| old actively managed code can be far less buggy (if all goes well
| and the architecture is good). Less code that does the same job
| is always better.
| [deleted]
| jongjong wrote:
| I agree with the premise of this article. You want a program to
| have as few lines of code as are required to satisfy the needs of
| stakeholders (especially customers and other operators of your
| software).
|
| The size of the solution must not exceed the size of the problem
| it solves. You need to focus on the essence of the problem and
| make compromises based on what is most important. Keeping the
| lines of code count low is a worthy target.
|
| There is a balance to be reached but devs today are far too
| permissive when it comes to complexity creep. It should be
| setting off alarm bells much sooner.
| forgotusername6 wrote:
| The ratio of tests to code is often a good benchmark of how good
| a codebase I'm dealing with.
|
| The caveat is code bases with targets on coverage. The ratio
| becomes less meaningful then.
| mrkeen wrote:
| > Does the number of bugs grow linearly with code size? Sub-
| linearly? Super-linearly? My gut feeling still says "sub-linear".
|
| My loose reasoning says it has to be linear (like interest
| generated on bank accounts). Otherwise you could simply merge two
| large codebases to squash bugs (in the case of sublinear bug
| growth) or divide a large codebase (in the case of superlinear
| bug growth).
|
| Alternatively, in a sublinear-bug-growth codebase, you could just
| keep adding and adding to it until eventually you'd add no more
| bugs.
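The merge/split argument above can be sketched numerically. This is only a toy model: it assumes bug count follows a power law, bugs(size) = c * size^k, and that merging two codebases is nothing more than concatenation under the same exponent. The function names and the values of `c` and `k` are made up for illustration, not taken from the article or the paper.

```javascript
// Hypothetical power-law model: bugs(size) = c * size^k.
// c and k are illustrative parameters, not measured values.
function bugs(size, k, c = 0.01) {
  return c * Math.pow(size, k);
}

// Compare two separate codebases of `size` lines each with one merged
// codebase of 2 * size lines, under the same exponent k.
// A negative result means the merged codebase has fewer bugs in this model.
function mergeEffect(size, k) {
  const separate = 2 * bugs(size, k);
  const merged = bugs(2 * size, k);
  return merged - separate;
}

const size = 100000;
console.log("sub-linear (k=0.8):", mergeEffect(size, 0.8));   // negative
console.log("linear (k=1.0):", mergeEffect(size, 1.0));       // about zero
console.log("super-linear (k=1.2):", mergeEffect(size, 1.2)); // positive
```

Under these assumptions only k = 1 leaves the total unchanged by merging or splitting, which is the intuition in the comment. Whether a real codebase can be merged or split this cleanly is exactly what the replies dispute.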
| wruza wrote:
| My gut feeling is that it grows linearly with any changes to the
| codebase that are not accompanied by both testing and real
| usage. Well, until tech debt comes into play.
|
| But there are many ways to make it non-linear, e.g. one can
| build a nice high-level structure which becomes irrelevant when
| goals shift. Or one can sign up for a paradigm which requires
| better developers and deeper understanding.
| dskloet wrote:
| You can't just divide a code base that's entangled. If you
| disentangle it, you'll remove some bugs in the process. My
| feeling is that it's super-linear for this reason.
| tonyarkles wrote:
| Completely agree. And my favourite add-on is if it's broken
| apart such that one half of it ends up being a library,
| adding a second consumer to that library will tend to further
| reduce bugs, even more so if the second consumer is written
| by a different team. (Eventually, at least, because the
| second consumer is going to find bugs at the start.)
| WalterSear wrote:
| Bugs grow linearly when writing them, not after they are
| written.
|
| > divide a large codebase (in the case of superlinear bug
| growth).
|
| I do almost exactly that when approaching a problematic system:
| refactor aspects of it as pure components or separate services
| in order to isolate behaviours from each other.
|
| Even exactly what you describe - just breaking a large codebase
| into two smaller (working) ones - exposes an interface that can
| reveal failed intentions. It's just usually an incredible bitch
| to do, unless you initially refactor the parts to minimize
| internal state and the types of interactions the two parts have
| (ie - make them more 'pure').
|
| Decidedly, super linear, IME.
| a99wang wrote:
| I don't quite follow this logic. If one manipulates a codebase
| in the ways you describe (merging with another, or dividing a
| larger one), the result doesn't even have to be a working
| program; how can we compare the result with the original?
|
| The real answer is probably that it might depend on how
| tightly coupled the code is, how much test coverage there is
| (assuming that tests are included in code size), the quality of
| the authors who wrote the code (this might be circular), maybe
| even on the technology used, or what you consider distinct
| bugs, etc.
|
| My intuition leans towards super-linear, but only slightly so
| -- large codebases tend to have many authors (many of whom may
| have since stopped working on the project) and be old.
| bogdanoff_2 wrote:
| I think it would make sense to imagine the codebase as a graph
| of some sort, and the number of bugs as a function of both the
| number of vertices and edges.
| zokier wrote:
| There are also two dimensions to code size, one being source code
| size and another is executable size/number of instructions
| executed. This path of thinking brings up the connection between
| optimization and correctness.
|
| In particular, if adding source lines describes the program more
| specifically and accurately, then that should enable the compiler
| to produce more compact/efficient code, and I'd imagine it would
| also contribute to correctness.
| whatever1 wrote:
| In my experience clean code has nothing to do with its size.
|
| Some people write clean but long code. Others write super compact
| code that nobody else can decipher (except the compiler I guess).
| monksy wrote:
| The older and more experienced in the industry I get, the less I
| believe software engineering research.
|
| If it's at a university: expect the researchers to be biased
| towards the practices advocated in their academic textbooks on
| the subject.
|
| If it's an organization: expect it to be biased towards what they
| want for that language (MS supporting C# for new features, or
| moving people away from old features).
|
| ---
|
| I saw this in the anti-TDD/unit testing/testing papers. They're
| making claims based on a flattened view of development and aren't
| capturing what happens on the local machine, where those are
| intended to be used. Additionally, they don't consider the
| externalities of how you approach the code base.
| kragen wrote:
| haldar's summary is completely incorrect, as you would expect
| from someone who says 'tl;dr'
|
| the paper finds that faults (and wmc (methods per class), cbo
| (coupling between object classes), response for a class (rfc),
| number of methods added (nma), etc.), correlate to _class_ size
| (non-comment source lines of code), not _program_ size
|
| it only looked at one program so it could not measure any effects
| due to program size
|
| moreover it wasn't measuring code quality in the sense of
| 'defects per kloc', it was measuring defects (whether a bug had
| been detected in a class in the field or not)
|
| stripping away the acronyms, what they found was that classes
| that contained more code more often had at least one bug, and
| also had more methods, but that having more methods without
| having more code didn't make classes significantly more likely to
| have a bug
|
| and similarly for the other complexity metrics like how many
| different methods a class calls and implements (rfc)
|
| this is unsurprising, since things like the number of lines of
| code in a class and the number of different methods in a class
| are just alternative metrics of class size, and everyone knows
| that in general more code means more bugs
|
| that's why we measure code quality in defects per kloc and not
| total defects. the paper didn't even try to measure code quality
| in that sense
|
| that doesn't mean the paper is bad. if the paper's authors are
| correct that many other papers have failed to control for size in
| their defect metrics, they have identified a serious shortcoming
| in the existing research literature; haldar merely totally failed
| to understand them
|
| so the paper haldar tried to summarize doesn't measure either
| program size or code quality
|
| (some of their references did look at program size tho)
|
| why are none of the other comments pointing this out
|
| are they all just commenting without having read the paper
|
| now i am sad
| ziddoap wrote:
| > _as you would expect from someone who says 'tl;dr'_
|
| ...What? This is hilarious.
|
| How is using an acronym to provide a brief summary of something
| indicative of correctness?
| kragen wrote:
| 'tl;dr' is an abbreviation for 'too long; didn't read'
|
| people who _don't read long things_ sometimes know a lot
| about things like how to weld, how to comfort the grieving,
| and what makes them happy, but they are always profoundly
| ignorant when it comes to book learning
|
| that's what happened in this case, where haldar's summary
| claims that this paper shows that a variable they didn't
| attempt to measure is the best predictor of another variable
| they didn't attempt to measure
|
| evidently he just looked at the diagrams and guessed what the
| words meant
| ziddoap wrote:
| > _people who don't read long things sometimes [...] but
| they are always profoundly ignorant when it comes to book
| learning_
|
| You seem to have fundamentally misunderstood how this
| acronym is used.
|
| When provided by an author, _it doesn't mean that the
| author didn't read_... It's literally just a substitute for
| "Summary:". An indication _to the reader_ that if they find
| the article too long to read, they can read the immediately
| following subsection to get a summary of the article.
|
| Do you also think that people who say "Summary:" are
| profoundly ignorant? Or are you just against acronyms?
| kragen wrote:
| probably you haven't noticed this, but using 'tl;dr'
| instead of 'summary' or 'abstract' is a group identity
| marker by which the author shows solidarity with their
| (presumably profoundly ignorant) readers. it's the same
| sort of gesture as describing the paper's authors as
| 'eggheads' or 'boffins'
|
| and evidently in this case the blog post author really
| _didn't_ read the paper they were 'summarizing'
|
| i find your contributions so far to the discussion of
| this paper profoundly disappointing
| ziddoap wrote:
| > _probably you haven't noticed this_
|
| > _their (presumably profoundly ignorant)_
|
| > _i find your contributions so far to the discussion of
| this paper profoundly disappointing_
|
| I'm not sure why you are so hostile, and why you think
| everyone except yourself is profoundly ignorant, but it's
| clear that you have some sort of world view where you're
| superior to anyone else you interact with. I doubt
| anything I say would convince you otherwise.
|
| tl;dr: I found this conversation disappointing, too.
| kragen wrote:
| i frequently interact with people i admire and who know
| more than i do (and sometimes they are the same people),
| and, as i said above, even people who are profoundly
| ignorant about book learning are often wise about many
| other things
|
| rather than being hostile, i have taken the time to
| explain in more detail the things you asked about in my
| previous comments because they were unclear to you
|
| i am surprised that you have stooped to launching
| personal attacks on me in this comment, since so far i
| have criticized only your low-quality comments (and the
| original blog post) rather than any presumed personal
| attributes of yours
|
| try to do better please
| ziddoap wrote:
| > _i am surprised that you have stooped to launching
| personal attacks on me in this comment, since so far i
| have criticized only your low-quality comments (and the
| original blog post) rather than any presumed personal
| attributes of yours_
|
| You've called me "profoundly ignorant" several times in
| this chain. How you can say this with a straight face is
| baffling.
| kragen wrote:
| i haven't called you profoundly ignorant even once.
| please do not lie about what i have said; there is no
| point in doing so in any case because anyone reading this
| thread can look half a page up or click the 'context'
| link and see that you are lying
|
| i said that people who don't read long things are
| profoundly ignorant, at least within the sphere of book
| learning, though often not in other ways. that's not a
| personal attack on them, it's just an obviously true
| statement if the phrase 'book learning' refers to
| anything at all
|
| you haven't said even once that _you_ don't read long
| things, so there was no reason for me to even suspect
| that what i said applies to you, even if it _were_ a
| personal attack (and, well, some people are sensitive
| about being ignorant because they're used to being
| criticized for it)
| cole-k wrote:
| From now on I will strive to only write "abstract"
| instead of "TL;DR" so I can avoid showing solidarity with
| the presumed profoundly ignorant. It's a matter of
| principle that I must appear to be the most intelligent
| person in any room I enter, virtual or physical.
| kragen wrote:
| maybe you could show solidarity with them in a way that
| doesn't reinforce a dichotomy between them and people who
| read books
|
| for example, by emphasizing the praiseworthy attributes
| you have in common rather than the misfortunes that have
| befallen you and the self-destructive choices which, in
| cases like this one, perpetuate those misfortunes
|
| i mean it might be useful to read a book from time to
| time, or at least a 25-page article, and that's less
| likely if you turn it into an identity threat
|
| we're all born profoundly ignorant but we don't have to
| stay that way
|
| if you're the most intelligent person in the room maybe
| you're in the wrong room. this can easily be reversed to
| provide a plan of action if being the most intelligent
| person in the room is what you value most. beware,
| though, some people are smarter than they look
| beckingz wrote:
| Some of us know how to weld, how to comfort the grieving,
| and have good book learning. Still trying to find the
| balance between happiness, satisfaction, and pleasure.
| kragen wrote:
| agreed, i'm not trying to say it's a zero-sum sort of
| thing, just that using 'tl;dr' is not an indication of
| profound ignorance in all possible ways, just the forms
| of ignorance that result from not reading long things
| jt2190 wrote:
| > How is using an acronym to provide a brief summary of
| something indicative of correctness?
|
| I'm struggling to understand how this is the point that's
| most worth debating here. What do you propose we'll learn?
| kragen wrote:
| it turns out they thought they were being criticized so
| they got defensive, explaining their otherwise shockingly
| aggressive and even mendacious comment thread here:
| https://news.ycombinator.com/item?id=33567361
|
| not sure that's a good excuse for their total failure to
| contribute anything substantive though
| [deleted]
| otikik wrote:
| Zero lines means Zero bugs.
| mirekrusin wrote:
| Doesn't it throw null exception when you try to use it though?
| kccqzy wrote:
| Code size is also a good (but not great) predictor of development
| velocity. If you need to write fewer lines of code to do
| something (assuming standardized formatting so no code golf),
| it's probably the case that you can write that code faster, and
| ship features faster. And of course, some languages simply enable
| you to work at such a high level of abstraction that you write much
| less code (e.g. http://www.paulgraham.com/avg.html).
| stephc_int13 wrote:
| The important aspect of this article is Cognitive Complexity.
|
| Of course, everything being equal, more code leads to more
| complexity.
|
| In this context, complexity is related to human capacity to
| collectively read/understand/maintain a code base.
|
| I've been writing software almost non-stop for close to 25 years,
| and I still don't understand most of the code written by others.
|
| I especially struggle to read anything with templates, generics,
| lambda...
|
| But I can read straight C code, code written by Fabrice Bellard
| or John Carmack in the early days of id software.
| remram wrote:
| Interesting. This reminds me of the "simple made easy" talk by
| Rich Hickey, about complexity vs difficulty:
| https://paulrcook.com/blog/simple-made-easy
| callamdelaney wrote:
| React developers take notes, random lambdas are dumb.
| nickdothutton wrote:
| I wrote a little on codebase size and quality, and what to do
| about it, back in 2019. Some might find it interesting.
|
| https://blog.eutopian.io/winning-systems-security-practition...
| hcarvalhoalves wrote:
| It's the Swiss cheese paradox: "cheese has holes; bigger cheeses
| have more holes; therefore, the more cheese, the less cheese".
|
| > However, I still haven't found any studies which show what this
| relationship is like. Does the number of bugs grow linearly with
| code size? Sub-linearly? Super-linearly? My gut feeling still
| says "sub-linear".
|
| We know from observing reality that even buggy software is better
| than no software - a software so buggy it adds negative value is
| a rare exception. So I guess it has to be "sub-linear", otherwise
| software wouldn't have economies of scale.
|
| The contradiction though is: how do we explain the apparent
| stability of large but _mature_ codebases? Thinking something
| like Emacs for example here, where some commits date back to
| 197X. I guess at some point it becomes an inverted U shape even
| if codebase grows, if it grows at a _conservative_ pace.
| ajuc wrote:
| The number of bugs can grow super-linearly while their impact
| decreases, so overall usability still increases.
| Basically as your software grows there's more corner cases that
| will get tested very rarely if ever.
|
| Number of possible states grows almost exponentially with code
| size, so it would be very weird if the number of bugs grew sub-
| linearly.
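A rough illustration of the state-count claim above, assuming (purely for the sake of the sketch) that each added feature contributes one independent boolean flag. Real programs have correlated and unreachable states, so this is an upper-bound intuition rather than a measurement; `stateCount` is a hypothetical helper.

```javascript
// Assumption: every feature adds one independent on/off flag, so the
// number of distinct flag combinations doubles with each feature.
function stateCount(flags) {
  return 2 ** flags;
}

for (const flags of [1, 10, 20, 30]) {
  console.log(`${flags} flags -> ${stateCount(flags)} possible states`);
}
```

Thirty such flags already give over a billion combinations, far more than any test suite exercises, which is why rarely visited corner cases pile up as the code grows.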
| rmbyrro wrote:
| Maybe another variable at play here is:
|
| Evolution: modifications_size / time / codebase_size
|
| Mature systems tend to have lower Evolution. Thus, there are
| more bugs being fixed than created. Over time, this leads to
| stability.
|
| On the other hand, one bug fix creates three more bugs, on
| average. So, I don't know why on Earth stable systems are even
| possible...
| numbsafari wrote:
| > how do we explain the apparent stability of large but mature
| codebases
|
| I would hypothesize three potential factors are at play,
| especially for software like Emacs.
|
| The first is that some bugs become, themselves, "features"
| over time. It might be wrong from the standpoint of what was
| intended or desired when the code was written, or what is
| documented, but because that is how it works, that's how it is
| now expected to work, and so it's no longer a bug.
|
| Another factor is that some bugs are "shielded" by the
| surrounding code, effectively two pieces of code offsetting
| each other's bugginess. Unless and until you either replace one
| or the other, or need to interact with one or the other in
| isolation, the bug isn't apparent. Kind of like how going to
| the chiropractor for one problem can "unlock" other, latent,
| problems.
|
| The last one, somewhat related to the first two, is that some
| well factored software systems do a very good job of isolating
| their most stable parts, leaving their less stable, more
| commonly changed or iterated parts in separate modules or
| components. Because the change rate and focus of effort is on
| the "outer" parts of the system, less time and energy is spent
| on the "inner" part. Very often, developers of the "outer"
| parts will accommodate the bugs of the "inner" part, because
| changing the inner part requires both greater knowledge of the
| overall system, as well as the potential for a larger "blast
| radius" (and therefore assumed responsibility) for one's change.
| So rather than debugging the "inner" part, the "outer" part
| grows lots of accommodations for the inner part. It's not until
| that inner part can no longer be accommodated, or it becomes a
| security vulnerability, that the community developing the
| system will see the value in "fixing" the "core".
|
| Anyhow, just a set of hypotheses.
| brundolf wrote:
| > otherwise software wouldn't have economies of scale
|
| Does it? I tend to think it doesn't (in terms of code size;
| obviously it does if we're scaling by number of users)
|
| Even if bugs scale sublinearly in practice- my gut tells me
| it's because development as a _whole_ slows down even more than
| the bugs accelerate, partially making up for it
|
| I.e. suppose your codebase size grows 5x, and the bugs per
| change grows 10x, but then the org's rate of change also slows
| down by 5x. Now your bugs per _time_ has only grown by 2x.
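The arithmetic in the example above, spelled out. The factors are the ones given in the comment; the variable names are only for illustration.

```javascript
// Relative factors from the comment, all versus the smaller codebase.
const codebaseGrowth = 5;      // codebase size grows 5x
const bugsPerChangeGrowth = 10; // bugs per change grows 10x
const changeRateSlowdown = 5;   // rate of change slows down by 5x

// bugs per unit time = (bugs per change) * (changes per unit time)
const bugsPerTimeGrowth = bugsPerChangeGrowth / changeRateSlowdown;

console.log("bugs per time grows by", bugsPerTimeGrowth, "x");
console.log("codebase grows by", codebaseGrowth, "x");
console.log("looks sub-linear:", bugsPerTimeGrowth < codebaseGrowth);
```

So bugs per unit time grow 2x against a 5x growth in code, which looks sub-linear from the outside even though each individual change has become ten times riskier.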
| worik wrote:
| > We know from observing reality that even buggy software is
| better than no software - a software so buggy it adds negative
| value is a rare exception.
|
| Who is observing which reality?
|
| I have made no scientific study - I am not stating any sort of
| strong opinion, just an uneasy feeling - that about 90% of the
| software and uses of computers subtract from human well-being
| and welfare, not contribute to it.
|
| Looking at you Javascript frameworks.
|
| Looking at you databases and tracking.
|
| Looking at you spyware and advertising bots.
|
| Looking at you killer robots (yes they exist - mostly thinking
| of the horrific USA military drone assassination programme)
|
| So many examples to choose from. I should not complain, I make
| my living from computers and the uses I facilitate are no
| shining examples of improving people's lives
| imachine1980_ wrote:
| i think you are mixing two concepts: imperfect software, like
| databases and JavaScript frameworks, and the others, which aren't
| imperfect, they are evil. they produce what they want to
| produce; yes, that is negative for society, but not incorrect.
| [deleted]
| rozap wrote:
| Agree that a lot of it has to do with rate of growth. Compare
| emacs or postgres codebases with some of the horror shows that
| are the codebases at VC-hyper-growth startups.
| WrtCdEvrydy wrote:
| It's not so much "sub-linear" as "capable of being bypassed".
|
| The CEAC State Department website is the most common one for
| me... the file upload only works in incognito and "once" per
| session. So people have figured out that you can log out, log
| back in and keep uploading your immigration paperwork.
| johnfn wrote:
| > The contradiction though is: how do we explain the apparent
| stability of large but mature codebases?
|
| The problem with these sort of analyses, in my mind, is that
| they seem to neglect the iceberg of code you build on top of,
| which works perfectly fine and is virtually bug-free. For
| instance, imagine that I'm building a website. If size is a
| predictor of quality, then you'd expect me to add bugs as time
| goes on, and that's probably true. But my site is stacked on
| top of Chrome, an abstraction with hundreds (?) of millions of
| lines of code that virtually never give me an issue. This is
| stacked on top of the typescript compiler, the C++ compiler,
| the operating system, etc, and those all run in the millions
| (TS) to tens or hundreds of millions (OS) of lines of code! So
| while I don't doubt that codebases get buggier over time, this
| seems to neglect the fact that I use truly immense amounts of
| code every day that don't seem to run into an issue. Perhaps
| the bugs grow around the edges rather than in the center, or
| something like that?
| plantwallshoe wrote:
| I would guess that this theory applies to average codebases of
| average complexity and quality.
|
| Chrome, popular compilers, popular operating systems skew
| towards the highest end of quality and developmental rigor.
| asperous wrote:
| Chromium has 63,438 open bugs in its bug tracker so the
| theory remains true, but maybe the broader point is that
| strong abstractions have enabled us to reach great levels of
| complexity. Testing, staffing, and programming best practices
| probably go a long way in making software like Chrome appear
| to work flawlessly from the outside.
|
| An analogy would be the craftsman-factory. A craftsman knows
| and performs every step of the process, and the end product
| depends on the skill of the craftsman. In a factory, by contrast,
| each worker is limited to a specific job, and the combined effort
| of the factory workers allows products well beyond the reach of
| any craftsman.
| johnfn wrote:
| > Chromium has 63,438 open bugs in its bug tracker so the
| theory remains true
|
| How many of those bugs affect you on a day-to-day basis?
| For me, the answer is 0. In fact, I can only think of one
| time ever when I was affected by a browser bug as a
| developer. I think that most of the reason Chrome has so
| many bugs is probably just because a staggering amount of
| people use Chrome, meaning they can inspect every nook and
| cranny.
|
| For a somewhat contrived example, imagine I write a program
| to add two numbers together. I write a single line of code:
| `var foo = bar + baz;` in JavaScript. If I ship it to no
| one, I can claim it has no bugs. But if I ship it to a
| million people, then I might start getting bug reports from
| people running it under extremely precise circumstances and
| running into issues. For instance, maybe I'd get a bug
| where two numbers sum to be negative because the user ran
| into integer overflow when bar and baz are very close to
| INT_MAX. I might also get reports of lack of precision when
| bar is extremely large and baz is not (e.g. sometimes foo +
| 1 === foo in JS if foo is huge). I suspect most bugs in
| Chrome are of this variety, or, more likely, they are even
| more obscure.
|
| It seems to me that it's not a strict 1:1 correlation
| between size and quality - there's also a factor where more
| popular software has a lot more people looking for bugs.
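For anyone who wants to see the one-line example misbehave: a small sketch of the precision case. One caveat on the overflow half of the example: plain JavaScript numbers are IEEE 754 doubles, so `bar + baz` near the 32-bit INT_MAX does not wrap negative on its own; the wraparound only shows up if the result is coerced to a 32-bit integer, for instance with `| 0`. The constant names are illustrative.

```javascript
// Doubles represent integers exactly only up to 2^53, so above that
// adding 1 can be lost to rounding.
const huge = 2 ** 53; // one past Number.MAX_SAFE_INTEGER
console.log(huge + 1 === huge); // true

// 32-bit style overflow needs an explicit integer coercion in JS.
const INT32_MAX = 2147483647;
console.log(INT32_MAX + 1);       // 2147483648, no wraparound
console.log((INT32_MAX + 1) | 0); // -2147483648, wraps after coercion
```

Neither case shows up with everyday inputs, which fits the point that such bugs surface only once enough users hit the extremes.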
| cxr wrote:
| > e.g. sometimes foo + 1 === foo in JS if foo is huge
|
| Superfluous use of the strict equality check there. An
| ordinary comparison (i.e. using `==` aka "double equals")
| is sufficient.
| aappleby wrote:
| I've found this to be absolutely true throughout my career.
|
| The only adjustment I'd make is that size should be measured
| after some normalization pass (eliminate comments, whitespace,
| make variable names the same length, etc)
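A minimal sketch of such a normalization pass, assuming JavaScript-style source and handling only blank lines and whole-line `//` comments. Block comments, `//` inside string literals, and the variable-name normalization mentioned above are deliberately left out; `normalizedLineCount` is a hypothetical helper, not an existing tool.

```javascript
// Count lines that remain after dropping blank lines and lines that
// consist solely of a // comment. Trailing comments on code lines are
// kept, since the line still contains code.
function normalizedLineCount(source) {
  return source
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith("//"))
    .length;
}

const sample = [
  "// add two numbers",
  "",
  "function add(a, b) {",
  "  return a + b; // trailing comments still count as code",
  "}",
].join("\n");

console.log(normalizedLineCount(sample)); // 3
```

This is close in spirit to the non-comment source lines of code measure mentioned elsewhere in the thread, though a real counter would need a proper tokenizer.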
| aappleby wrote:
| "If I had more time, I would have written you a shorter
| letter."
| alexfromapex wrote:
| What's to say this isn't just an emergent statistical phenomenon?
| anothernewdude wrote:
| I suspect that only code bases of a certain quality can achieve
| larger sizes.
| fnordpiglet wrote:
| If you follow the roadmap to software quality of course small is
| better:
|
| Make it work
| Make it right
| Make it fast
| Make it small
|
| (Classically attributed to Kent Beck, but he left off the make it
| small which I added)
| wkdneidbwf wrote:
| breaks the back button in mobile safari. cute.
| dang wrote:
| Discussed at the time:
|
| _Size is the best predictor of code quality_ -
| https://news.ycombinator.com/item?id=3037293 - Sept 2011 (123
| comments)
| Aachen wrote:
| Link to original paper is dead. Working link:
|
| https://dl.acm.org/doi/10.1109/32.935855 (15 pages without
| appendices)
| cat_plus_plus wrote:
| Nope, it's how you use it!
___________________________________________________________________
(page generated 2022-11-11 23:00 UTC)