[HN Gopher] Cyclomatic complexity
___________________________________________________________________
Cyclomatic complexity
Author : tosh
Score : 82 points
Date : 2024-02-08 11:11 UTC (1 day ago)
(HTM) web link (en.wikipedia.org)
(TXT) w3m dump (en.wikipedia.org)
| xutopia wrote:
| If you're using Ruby I recommend the RubyCritic gem. It'll give
| you lots of information on your project.
| bdcravens wrote:
| It's good, but (without aggressive configuration) it's
| unfortunately easy to split up your code in unhelpful ways
| just to get the dopamine rush of passing all the metrics.
|
| Additionally, you can get much of the benefit of RubyCritic via
| the Rubocop extension in your editor (with the same caveats as
| above)
| tromp wrote:
| Does this measure make sense for pure functional languages? Can
| it be defined for terms in the lambda calculus or in combinatory
| logic?
| marcosdumay wrote:
| You still have a number of atoms, composed and selected, in a
| way that is completely similar.
|
| But you'll discover that almost every program in a functional
| language has an enormous complexity when measured that way.
| So it pretty much makes as much sense as in the imperative
| case, and it's also completely useless (instead of only
| partially useless, I guess).
| Verdex wrote:
| Probably not. Although you can immediately follow that up by
| asking whether this makes any sense in any programming
| language. From the Wikipedia link:
|
| https://en.wikipedia.org/wiki/Cyclomatic_complexity#Correlat...
|
| Correlation between CC and defects isn't always present and in
| fact lines of code might be a better metric.
|
| I assert that there is no current objective metric where you
| can plug in source code and get out some sort of
| comprehensibility value. Lines of code is the best we have at
| the moment. And it goes without saying (but I'm going to say it
| anyway) that lines of code isn't a great metric to be stuck
| with.
|
| Right now the only things we have are best practices (i.e.
| some other people were "successful" in some other context
| and they happened to be doing this) and code smells (right,
| the code makes my tummy feel bad, you can just feel the
| objectivity pouring out).
|
| [It's been pointed out to me in the past that Weyuker's 9
| Properties can be used to determine comprehensibility, but when
| you look at the papers published about it they're all really
| underwhelming. Basically, people have built a machine that,
| when given source code, dumps out a number that conforms to
| Weyuker's properties, but then nobody takes the next step to
| see if that number actually correlates to anything that
| anyone cares about.
|
| People are trying things though. For example, there's this:
| https://www.sonarsource.com/resources/cognitive-complexity/
|
| I'm not particularly convinced by it, but it does a better job
| than Weyuker, iirc.]
| CharlieDigital wrote:
| > I assert that there is no current objective metric where
| you can plug in source code and get out some sort of
| comprehensibility value.
|
| Ratio of ternary operators more than 1 level deep to total
| LoC.
|
| You know you're in for some pain when you randomly open a
| file and the ternary expressions go 6 levels deep.
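|
| A metric like that is easy to prototype, for what it's
| worth. Here's a minimal sketch in Python (whose conditional
| expressions stand in for ternaries), assuming all we want
| is the count of conditionals nested more than 1 level deep
| over total LoC:
|
|     import ast
|
|     def nested_ternary_ratio(source: str) -> float:
|         """Ratio of deeply nested conditionals to LoC."""
|         deep = 0
|
|         def walk(node, depth=0):
|             nonlocal deep
|             for child in ast.iter_child_nodes(node):
|                 if isinstance(child, ast.IfExp):
|                     if depth >= 1:  # > 1 level deep
|                         deep += 1
|                     walk(child, depth + 1)
|                 else:
|                     walk(child, depth)
|
|         walk(ast.parse(source))
|         return deep / max(len(source.splitlines()), 1)
|
|     # one nested conditional on one line -> 1.0
|     print(nested_ternary_ratio(
|         "x = a if p else (b if q else c)"))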
| Verdex wrote:
| > Ratio of ternary operators more than 1 level deep to
| total LoC.
|
| Sure. I should have said, "no current general objective
| metric".
|
| For example, if we detect any symbols longer than 6 billion
| characters, then we know the code is likely to be hard to
| grok. Or if the only character used in symbols is the
| underscore.
|
| And maybe the path forward for determining objectively
| comprehensible code is to collect every pathological case
| and then good code is code without the bad stuff. But I
| personally hope for a general solution that we can setup
| which will emergently determine that too many nested
| ternary operators is bad from first principles. I'm more
| than ready to be disappointed when this turns out not to be
| possible, but everyone is allowed a dream.
| bluGill wrote:
| Sure, but the vast majority of programmers don't do that
| type of thing.
|
| Often in cases like the above I'm not sure if the metric is
| correct, or if it's just that you are not used to reading
| that style. For years I worked at places that banned ternary
| expressions, so I didn't use them, and it was a lot of work
| to remember what was going on the rare times I saw them.
| Then I got a job on a team that did use them all the time
| (but never more than 1 level deep); I soon learned to read
| them, and now I use them often where they make sense.
| zackmorris wrote:
| That's what I was wondering as a woefully uninformed armchair
| warrior burnout.
|
| The main (only?) difference between functional and imperative
| programming is mutability. The ideal FP representation is a
| spreadsheet with no infinitely cyclical references to cells
| like A1->B1->C1->A1, and the ideal IP representation is a
| Turing tape reading/writing/moving with no registers. I'm
| probably wrong about both of these, but that's how I think of
| them.
|
| So most IP languages like JavaScript eventually reach a state
| complexity limit where humans can no longer analyze them. It's
| like trying to remember more than 7 digits in a phone number
| while someone is yelling more phone numbers at you.
|
| Whereas FP languages work more like piping STDIN/STDOUT/STDERR
| streams between executables running in isolated processes, and
| have no scalability limit. We just don't have the hardware to
| properly run them, because we went the SIMD/GPU route instead
| of CPU clusters.
|
| Note that most of the FP lingo like currying, closures, lazy
| evaluation, etc are clever and have their applications, but
| distract from the core essence separating it from IP.
|
| People a little older than me did a lot of work in the 1990s to
| move away from mutable state (assembly, C++) and transition to
| pure functional synchronous blocking deterministic idempotent
| logic (Lisp, Prolog, VHDL/Verilog, HTTP, etc). Unfortunately
| corporations unwound most of that progress by reintroducing
| every anti-pattern in a million-monkeys typing proprietary race
| for profit. Async/nonblocking code is borrowed from monads and
| is a workaround for single-threaded code lacking proper
| scatter-gather arrays and higher order functions that can
| fork/join other threads.
|
| The only language I know that gets this stuff mostly right is
| ClojureScript, which attempts to use pure FP by avoiding monads
| and executing logic as a one-shot process with new input/output
| passed while suspended waiting for events, similarly to how PHP
| works. It also has a software transactional memory that works
| like Redux for storing state, which is also related to copy-on-
| write arrays in PHP before version 5 introduced pass-by-
| reference classes and permanently derailed what made it
| compelling. Haskell and other FP languages mostly miss the
| point due to their over-reliance on monads and syntactic sugar,
| which costs them purity and makes them little better than IP
| languages for use-cases in real life. And of course since
| ClojureScript is Lisp-based, its parenthesis syntax makes it
| difficult to read, putting it into the category of write-only
| languages like Perl, which all but guarantees that it will
| never reach mainstream adoption.
|
| I'm mostly skeptical of Rust for these reasons. Because a
| properly structured program doesn't have much/any mutable
| state, which negates most of the need for the borrow checker. I
| have a few dozen criteria that I use to judge what a "good"
| language would look like, and there simply isn't one yet. I
| would dearly love to write one, but it would take me perhaps
| 2-5 years, so with just nights and weekends and a few weeks of
| vacation per year if I'm lucky, that will never happen. The
| main challenge I face in my daily work is overcoming this
| condemnation of being forced to work with mediocre tools for
| the remainder of my life. And I see the 100x overhead cost of
| poor tooling reflected in the $200/hr wages of programmers
| whose productivity has fallen so low that a social network is
| considered a big lift. Whereas if we look at the logic
| involved, a teenager could write one in a couple of weeks with
| something like HyperCard/FileMaker/Access.
|
| I'm reminded of the Kirk-Picard conversation that goes "I take
| it the odds are against us and the situation is grim?" because
| I've mostly witnessed regression instead of progress the last
| 25 years. But revolutions happen when things are at their
| worst, so that keeps me going and inspired.
|
| TL;DR: mutability matters more than cyclomatic complexity.
| James_K wrote:
| The issue you face here is one of program inputs. Any time you
| are passing functions as values, you can't guarantee how
| exactly the body of a function will execute. In a simple
| example:
|
|     (lambda (a) (a (lambda () (b)) (lambda () (c))))
|
| You can't tell how the calls to b and c will be executed
| without also knowing something about the value of a. The
| function a could be a branching construct, or it could just
| evaluate its inputs directly. You could assume the worst case
| and assign a high complexity to this code, but all lambda
| calculus terms are delayed in this way, so every function call
| would be assigned the worst case complexity for control flow.
|
| More generally, imagine an interpreter that accepts a string as
| input and that string contains a program. You can make the
| program contained in the string as complex as you like without
| affecting the complexity of the interpreter, and effectively
| cheat the metric.
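|
| To make that last point concrete, here's a toy Python
| sketch (the one-instruction string language is made up):
| the interpreter's own control flow is fixed, so any CC tool
| scores it as a small constant, while the program in the
| string can encode as much branching as it likes.
|
|     def interpret(program: str, env: dict) -> dict:
|         # hypothetical instruction format: "dest = a or b"
|         for line in program.splitlines():
|             dest, expr = line.split(" = ")
|             a, b = expr.split(" or ")
|             env[dest] = env[a] or env[b]  # data decides
|         return env
|
|     print(interpret("z = x or y", {"x": 0, "y": 5}))
|     # {'x': 0, 'y': 5, 'z': 5}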
| endymi0n wrote:
| And as always, Goodhart's law applies: "When a measure becomes a
| target, it ceases to be a good measure"
|
| While the concept of cyclomatic complexity can be a somewhat
| useful guideline, I've seen way too many CI/CD pipelines
| enforcing CC goals, which usually results in a hopelessly
| tangled middleware and module mess where any piece of core
| logic is scattered over 102 code points in 34 modules --
| instead of just residing in a single, readable, well-
| commented main function that would neatly self-explain a
| whole complex task in three screens and might have a CC of
| 20.
| michaelcampbell wrote:
| Indeed, I've used McCabe stuff in the past not as a gate, but a
| way to say "this should be looked at before that" type of
| thing. Works pretty well.
|
| A low score doesn't mean "good", nor a high one "bad", just an
| indication that something might do well with some more
| scrutiny.
| pasc1878 wrote:
| Or even more importantly if the score is increasing as
| revisions occur you need to do a full review of that part.
|
| You are looking for changes and not absolute values of many
| measures.
| readthenotes1 wrote:
| I used this advice for decades. It works well...
| ozim wrote:
| OK, this is the first time I've seen cyclomatic complexity
| framed as a useful metric. Rate of change is more important
| than any specific measure.
|
| I've never seen anyone mention that before, but now that
| I've seen it here in a comment it seems so obvious.
| Swizec wrote:
| > any piece of core logic is scattered over 102 code points in
| 34 modules -- instead of just residing in a single, well
| readable and commented main function that would neatly self-
| explain
|
| The tragedy of McCabe metrics is that this is a tooling
| failure. You can and should apply McCabe to the software as
| a whole, and then all these smol-functions approaches start
| falling apart.
| McCabe goes through the roof because the metric _doesn't depend
| on length_.
|
| McCabe measures decision points. A 5000 line linear function
| has a McCabe complexity of 1. It's fine.
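|
| A minimal sketch of that counting rule in Python (this set
| of decision nodes is a common approximation, not the one
| true definition): complexity is 1 plus the number of
| decision points, so a long linear function really does
| score 1.
|
|     import ast
|
|     DECISIONS = (ast.If, ast.For, ast.While,
|                  ast.ExceptHandler, ast.BoolOp, ast.IfExp)
|
|     def mccabe(source: str) -> int:
|         return 1 + sum(isinstance(n, DECISIONS)
|                        for n in ast.walk(ast.parse(source)))
|
|     print(mccabe("a = 1\nb = 2\nc = a + b"))  # 1 (linear)
|     print(mccabe("if a:\n    b = 1"))         # 2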
|
| An even better measure is architectural complexity. I've
| written about this a lot after reading some papers and a PhD
| thesis about it [1]. Tried turning it into a book but that
| didn't quite happen. It will be a subpoint in a new project :)
|
| Architectural Complexity uses visibility metrics to say how
| intertwined your software has become. A good rule of thumb is
| that the more imports/exports you have in a file, the more
| architecturally complex it is. Best seen with a dependency
| graph visualizer, but you can also feel it viscerally when you
| have to bounce around a bazillion files to understand what
| happens with any code flow.
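|
| As a crude first pass (assuming a conventional "src"
| layout, and taking "more imports = more coupling" at face
| value), you can rank files by import count:
|
|     import ast
|     import pathlib
|
|     def import_count(path: pathlib.Path) -> int:
|         tree = ast.parse(path.read_text())
|         return sum(isinstance(n, (ast.Import,
|                                   ast.ImportFrom))
|                    for n in ast.walk(tree))
|
|     files = sorted(pathlib.Path("src").rglob("*.py"),
|                    key=import_count, reverse=True)
|     for f in files[:10]:  # ten most-coupled files
|         print(import_count(f), f)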
|
| [1] https://swizec.com/blog/why-taming-architectural-
| complexity-...
| throwanem wrote:
| I think it's a shame you didn't write the book. I know some
| people who desperately need to read it.
| chrisweekly wrote:
| Same! Also, Swizec's an awesome writer. (Long-time email
| subscriber, and fan of his posts and Serverless Handbook
| book, etc).
| epage wrote:
| CPUs with icache and dcache leverage memory locality; we, by
| contrast, overlook the effect of locality on understanding
| code.
| tetha wrote:
| I mean I'm not working on big code bases anymore, more
| tooling, analytics and management. But both the open/closed
| principle and some advice from Sandy Metz and others really
| stuck with me: Don't bother abstracting.
|
| Just write straight line code.
|
| And then introduce indirections and abstractions if good
| reasons demand it. In fact, if changes to the code demand it.
|
| For example, introducing an abstraction for an external
| database can improve testing prospects. If you need to fix
| the same bug in several places, it's usually time to extract
| a common abstraction. If you need to touch some established
| piece of code to introduce a choice of which
| algorithm/strategy/formatting/... to use, that's a decent
| point to introduce some abstraction and to inject some
| function/object. If you would have to
| replicate the exact complex logic from somewhere else, and
| those places need to move in sync for reasons, moving that to
| a common place is probably a good idea.
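|
| A minimal Python sketch of that first example (all names
| hypothetical), where the store is injected purely so tests
| can swap it out:
|
|     class UserService:
|         def __init__(self, store):  # the store is the seam
|             self.store = store
|
|         def rename(self, user_id, name):
|             user = self.store.get(user_id)
|             user["name"] = name
|             self.store.put(user_id, user)
|
|     class FakeStore:  # test double, no real database
|         def __init__(self):
|             self.rows = {}
|
|         def get(self, k):
|             return self.rows[k]
|
|         def put(self, k, v):
|             self.rows[k] = v
|
|     store = FakeStore()
|     store.put(1, {"name": "old"})
|     UserService(store).rename(1, "new")
|     print(store.get(1))  # {'name': 'new'}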
|
| Just following what the code needs and the changes and
| requirements around the code need has resulted in much
| simpler code and way more effective abstractions for myself.
| captainbland wrote:
| I wonder if there are any open source emulators out there which
| are compliant with a strict limit on cyclomatic complexity.
| Specifically, a common pattern is to use a large switch/case
| to dispatch on instructions, which usually compiles to a
| jump table: performant, fairly clear, and it maps nicely to
| architecture docs, but it falls foul of normal cyclomatic
| complexity rules.
|
| Edit: project concept: Enterprise Gameboy.
| almostnormal wrote:
| > Specifically given a common pattern is to use a large
| switch/case to map from instructions which usually compiles
| to a jump table [...]
|
| Work-around: Implement the jump table directly, e.g., array
| of function pointers or whatever works in the language of
| choice.
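|
| A sketch of that work-around in Python (the opcode values
| are loosely borrowed from the Game Boy's INC A / DEC A, but
| treat them as made up): the decode step's cyclomatic
| complexity stays flat no matter how many opcodes the table
| grows.
|
|     def nop(cpu):
|         pass
|
|     def inc_a(cpu):
|         cpu["a"] = (cpu["a"] + 1) & 0xFF
|
|     def dec_a(cpu):
|         cpu["a"] = (cpu["a"] - 1) & 0xFF
|
|     HANDLERS = {0x00: nop, 0x3C: inc_a, 0x3D: dec_a}
|
|     def step(cpu, opcode):
|         HANDLERS[opcode](cpu)  # indirect call, no branch
|
|     cpu = {"a": 0}
|     step(cpu, 0x3C)
|     print(cpu["a"])  # 1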
| Scubabear68 wrote:
| Old style Java code often exhibits low CC, and as you indicate
| this is not necessarily a good thing. The actual logic is often
| spread out in a multitude of helpers, proxies, deep inheritance
| chains, and unneeded abstractions. The end result is a system
| that is very difficult to reason about.
|
| Also, in general terms, overuse of DRY can lead you here.
|
| In many cases it is better to "risk" higher CC by concentrating
| logic where it makes sense.
| nicholasjarnold wrote:
| Exactly, and this is why I always try to steer teams away
| from "one metric to rule them all" whether this be "always
| fail if coverage is less than X percent" or "random code
| metric like CC is beyond limit". Reality is simply more
| complicated than that, and it takes experienced engineers to
| actively manage and balance the tradeoffs appropriately over
| time. Putting arbitrary one-size-fits-all rules in place is
| almost never the answer.
|
| Unfortunately, in some (many?) companies there simply aren't
| enough experienced engineers who have the time to do the
| active balancing... leading us back to "just stick this rule
| in the pipeline so the teams are held to _some standard_".
| Scubabear68 wrote:
| Agreed. I like to do these scans but for informational
| purposes, not as a gate. Also most tools allow you to
| annotate code to turn off warnings, which can help when
| used intelligently.
|
| Of course, some teams will overuse such tools and turn off
| the metrics left and right.
|
| In the end there is no substitute for experienced
| engineers.
| captainbland wrote:
| Yeah I agree with this. Or, rather, I think there's an
| argument to be made that what we're really interested in is
| logic density vs. logic sparsity, and when developing
| something we have to decide what the right level of that
| is. So if you're a small team, logic density makes sense
| because it
| means one person can have a lot of context in one place and
| make significant changes without doing a lot of context
| switching. Don't underestimate the productivity of one
| experienced guy doing old school PHP.
|
| But if you're a large team on a large project logic sparsity
| (with appropriate decoupling!) makes sense because then
| multiple people can change the code simultaneously without
| stepping on each others' toes. People can specialise where
| appropriate without having to understand the whole thing. The
| "sparsity" approach is obviously what enterprises see as good
| news mostly because it means they can have large teams
| without silos but in real terms it costs a lot of money to
| operate this way. Although I think as you rightly point out,
| Java has in recent years realised that it possibly went a bit
| too far in that direction historically.
| Scubabear68 wrote:
| I hear what you're saying here, but I like to delay going
| in the second direction as long as feasibly possible.
|
| In theory, the idea that "people can specialise where
| appropriate without having to understand the whole thing"
| sounds good. In reality, so many, many bugs and piles of
| technical debt pile up from developers with a worm's-eye
| view who don't understand how the whole system works.
| Eventually you have to make this compromise, but resist it
| as much as you reasonably can, and always strive to grow
| developers so they know as much of the system as they
| possibly can.
| SilasX wrote:
| I think that's the joke behind FizzBuzz Enterprise Edition,
| where FizzBuzz is implemented with rigid adherence to Java-
| encouraged abstractions:
|
| https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpris.
| ..
| lcnPylGDnU4H9OF wrote:
| As soon as I saw the title, I was reminded of these comments
| that I sometimes see in Ruby code:
|
|     # rubocop:disable Metrics/CyclomaticComplexity
|
| Presumably because nobody on the team is brave enough to open a
| PR that disables it in the config.
| ris wrote:
| This x100.
|
| If you take a function and tell someone it's "too complex" and
| tell them to "fix it", that person will usually end up
| splitting it into, say, three functions. And _now_, you've
| just multiplied that complexity by all the different possible
| permutations of how/where the different functions can call each
| other. Corner cases that you could previously rule out because
| the control flow made them impossible are now fair game.
| MrGilbert wrote:
| That's why it's important to pair this one single aspect
| (cyclomatic complexity) with other aspects that balance it, and
| combine it into an index.
| slowhand09 wrote:
| Love this topic. Early in my computing career I was trying
| to modernize our software testing. I wrote a small program
| to compute the cyclomatic complexity of our Fortran code. I
| was especially pleased after discovering an error while
| working through the IEEE paper McCabe had published.
| sanj wrote:
| I used this as a metric of autoflight system complexity!
|
| https://dspace.mit.edu/handle/1721.1/9172
| Feathercrown wrote:
| CC tools applied to JS tend to count each half of an AND or OR
| operator as its own path, which is super annoying and not useful.
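|
| For illustration (in Python rather than JS): under that
| counting rule, each "and" below is scored as an extra path
| because it can short-circuit, even though the flat version
| reads more clearly than the nested rewrite.
|
|     from types import SimpleNamespace
|
|     def ok_flat(user):
|         # two "and"s -> two extra counted paths
|         return user.active and user.verified and user.paid
|
|     def ok_nested(user):
|         if user.active:
|             if user.verified:
|                 return user.paid
|         return False
|
|     u = SimpleNamespace(active=True, verified=True,
|                         paid=False)
|     print(ok_flat(u), ok_nested(u))  # False False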
| rpigab wrote:
| See also Cognitive Complexity:
| https://www.sonarsource.com/resources/cognitive-complexity/
| dataflow wrote:
| Is the "cycl-" pronounced as in "cycle" or as in "cyclical"?
| j2kun wrote:
| cycle
| dataflow wrote:
| Ah thanks, looks like I need to correct my pronunciation
| then...
| ducttapecrown wrote:
| I've definitely heard both saiclo- and siclo- in a math
| seminar context!
| dataflow wrote:
| Amazing, great to know! In what context does this come up
| in math?
| j2kun wrote:
| cyclotomic polynomials are common, and probably people
| borrow the pronunciation, but I have always heard that
| pronounced as "cycle"
| klodolph wrote:
| We have tools that flag methods with high cyclomatic complexity
| during review.
|
| By "flag", all I mean is that these tools leave a comment during
| code review. Humans make the decision to listen to the comment or
| ignore it. The human in the loop is critical, because ignoring it
| is sometimes the right decision.
| chrispeel wrote:
| Kolmogorov complexity [1] is a measure of the complexity of an
| object (say a text), while Cyclomatic complexity is a measure of
| the complexity of a program. Can we make any statements about the
| relationship between them?
|
| Let's say we have a text A, and we find a program P which
| outputs A and whose length L equals the Kolmogorov
| complexity K(A); i.e. L=K(A). Does it make sense to compare
| the cyclomatic complexity C(P) with K(A)? Or can we write
| some equation or inequality that compares C(P) with K(A)?
|
| I guess a first task would be to normalize one or the other so as
| to get the same units...
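|
| One direction seems easy to pin down, at least as a sketch:
| C(P) is 1 plus the number of decision points in P, and a
| program of length L can contain at most O(L) decision
| points, so C(P) <= O(L) = O(K(A)). The other direction
| fails: a program that just prints A verbatim has length
| about K(A) when A is incompressible, yet C(P) = 1, so K(A)
| gives no lower bound on C(P).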
|
| [1] https://en.wikipedia.org/wiki/Kolmogorov_complexity
| AlotOfReading wrote:
| They're not meaningfully related. You can say certain things
| about them, though; for example, Kolmogorov-optimal programs
| will generally have higher cyclomatic complexity because
| they don't follow any fixed set of rules. Put another way,
| they lack the "simple" underlying structure that would be
| required.
| nurettin wrote:
| Rather than attempting to create an objective code complexity
| measure, I just try to implement domain specific rules into
| composable functions. Then code becomes linear and declarative.
|
| For long running processes, I switch to MPSC where each event
| source is a thread, but places its events onto the main thread's
| queue.
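|
| A minimal sketch of that shape in Python (the source names
| are made up): each event source is a thread, every thread
| pushes onto one queue, and the main thread consumes
| linearly.
|
|     import queue
|     import threading
|     import time
|
|     events: queue.Queue = queue.Queue()
|
|     def source(name: str, delay: float) -> None:
|         for i in range(3):
|             time.sleep(delay)
|             events.put((name, i))
|
|     for args in (("timer", 0.01), ("socket", 0.02)):
|         threading.Thread(target=source, args=args,
|                          daemon=True).start()
|
|     for _ in range(6):  # main thread: linear handling
|         name, i = events.get()
|         print(f"event {i} from {name}")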
| aschearer wrote:
| My random thought for slacker news consideration:
|
| I recently read "The evolutionary origins of modularity"[1].
| It compares NNs evolved for task performance alone against
| NNs evolved for task performance plus a "graph cost", and
| finds that the latter greatly outperform the former. Graphs
| in the latter group ended up being more "modular", which
| allowed them to evolve to meet changing requirements more
| readily.
|
| Could we analyze computer programs and create graphs
| representing modules and the links between systems? A node
| would be a method. An edge would be a call site. Graph-
| modules - clusters of highly
| interconnected nodes - would represent the real fault lines of
| the application, which could be juxtaposed against the
| developer's conceptual model. The cost of the overall graph could
| be calculated and used as a metric - aiming for some lower bound.
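|
| A stdlib-only Python sketch of the graph extraction step
| (the "src" layout and direct-name calls are simplifying
| assumptions; the clustering itself would need more work):
| nodes are functions, edges are call sites.
|
|     import ast
|     import pathlib
|
|     def call_edges(root: str):
|         edges = []
|         for path in pathlib.Path(root).rglob("*.py"):
|             tree = ast.parse(path.read_text())
|             for fn in ast.walk(tree):
|                 if not isinstance(fn, ast.FunctionDef):
|                     continue
|                 for node in ast.walk(fn):
|                     if (isinstance(node, ast.Call) and
|                             isinstance(node.func,
|                                        ast.Name)):
|                         edges.append((path.stem, fn.name,
|                                       node.func.id))
|         return edges
|
|     print(len(call_edges("src")), "call edges found")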
|
| 1:
| https://royalsocietypublishing.org/doi/10.1098/rspb.2012.286...
| drojas wrote:
| As a former biologist and self-taught programmer, your
| comment and the paper you shared made my day (thank you). I
| often use biological and evolutionary analogies to
| prioritize best practices in my head; this one is at the top
| now.
___________________________________________________________________
(page generated 2024-02-09 23:01 UTC)