[HN Gopher] You've just inherited a legacy C++ codebase, now what?
       ___________________________________________________________________
        
       You've just inherited a legacy C++ codebase, now what?
        
       Author : broken_broken_
       Score  : 115 points
       Date   : 2024-02-29 14:11 UTC (8 hours ago)
        
 (HTM) web link (gaultier.github.io)
 (TXT) w3m dump (gaultier.github.io)
        
       | keepamovin wrote:
       | It's funny. My first step would be: 0. You reach out to the
       | previous maintainers, visit them, buy them tea/beer and chat
       | (eventually) about the codebase. Learned Wizards will teach
       | you much.
       | 
       | But I didn't see that anywhere. I think the rest of the
       | suggestions (like getting it running across platforms, getting
       | the tests passing) are useful stress tests, though, likely to
       | lead you to robustness and understanding.
       | 
       | But I'd def be going for that sweeet, sweet low-hangin' fruit of
       | just talking to the ol' folks who came that way before. Haha :)
        
         | Night_Thastus wrote:
         | IME, this only works if you can get _regular_ help from them. A
         | one-off won't help much at all.
        
           | keepamovin wrote:
           | Yeah you need to cultivate those relationships. But with a
           | willing partner that first session will take you from _0_ to
           | _1_ :)
        
           | Joel_Mckay wrote:
           | I've always found discussing why former employees left a
           | project incredibly enlightening. They will usually explain
           | the reality behind the PR given they are no longer involved
           | in the politics. Most importantly they will often tell you
           | your best case future with a firm.
           | 
           | Normally, employment agreements specifically restrict contact
           | with former staff, or discussions of sensitive matters like
           | compensation packages.
           | 
           | C++ is like any other language, in that it will often take
           | 3 times longer to understand something than to re-implement
           | it. If you are lucky, then everything is a minimal API lib,
           | and you get repetitive examples of the use cases... but the
           | cooperative OSS breadcrumb model almost never happens in
           | commercial shops...
           | 
           | Legacy code bases can be hell to work with, as you end up
           | with the responsibility for 14 years of IT kludges. Also, the
           | opinions from entrenched lamers on what productivity means
           | will be painful at first.
           | 
           | Usually, with C++ it can become its own project specific
           | language variant (STL or Boost may help wrangle the chaos).
           | 
           | You have my sympathy, but no checklist can help with naive
           | design inertia. Have a wonderful day. =)
        
           | mellutussa wrote:
           | > A one-off won't help much at all.
           | 
           | Monumentally disagree. A one-off session with a guy who
           | knows the codebase inside out can save you days of research
           | work, plus they can tell you all about the
           | problematic/historical areas.
        
             | Night_Thastus wrote:
             | I'm just stating my experience. A single day, _if they
             | still have access to the codebase_ might be able to clear
             | up some top-level concepts.
             | 
             | But the devil is in all the tiny details. What is this tiny
             | correction factor that was added 20 years ago? Why was this
             | value cut off to X decimals? Why didn't they just do Y
             | here? Why do we override this default behavior?
             | 
             | It's tens of thousands of tiny questions like that which
             | you can't ask until you're there.
        
               | slingnow wrote:
               | I don't understand what you're saying. Clearly both types
               | of meetings (one-off vs recurring) would be helpful. The
               | one-off may save you days/weeks of research, but it seems
               | like you're not satisfied with that unless you can answer
               | every single minor question you might have across the
               | entire codebase.
        
         | raverbashing wrote:
         | Easier said than done
         | 
         | My step 0 would be: run it through a UML tool to get a class
         | diagram and other diagrams.
         | 
         | This will help you _a lot_.
         | 
         | > Get the tests passing on your machine
         | 
         | Tests? On a C++ codebase? I like your optimism, rs
        
           | Joel_Mckay wrote:
           | No one has time to maintain the UML in production.
           | 
           | You may luck out with auto-generated documentation, but few
           | use these tools properly (javadoc or doxygen). =)
        
             | cratermoon wrote:
             | The GP said nothing about keeping and maintaining it, only
             | generating it. Use it to understand the codebase, then
             | archive it or throw it out.
        
               | Jtsummers wrote:
               | Exactly. You inherited 500k SLOC of C++ that grew
               | together since 1985. You don't know the interconnections
               | between the classes that have accumulated in that time.
               | It was also developed by multiple teams, and likely had
               | very different approaches to OO during these past nearly
               | 40 years. The UML diagrams won't tell you everything, but
               | they will tell you things like the inheritance hierarchy
               | (if this was a 1990s C++ project it's probably pretty
               | nasty), what classes are referenced by others via member
               | variables, etc. This can be hugely informative when you
               | want to transform a program into something saner.
        
               | Joel_Mckay wrote:
               | I always interpreted most polymorphism as sloppy context-
               | specific state-machine embedding, and fundamentally an
               | unmaintainable abomination from OOP paradigms.
               | 
               | OO requires a lot of planning to get right (again, no
               | shop will give your team time to do this properly), and
               | in practice it usually degenerates into spiral
               | development rather quickly (<2 years). Thus, 14 years
               | later what you think you see in documentation may be
               | completely different from the conditional recursive
               | definition some clown left for your team (yes, it happens
               | eventually...)
               | 
               | 500k lines is not that bad if most of it is encapsulated
               | libraries... =)
        
         | GuB-42 wrote:
         | I wouldn't make it the first step. If you do, you will probably
         | waste their time more than anything.
         | 
         | Try to work on it a little bit first, and once you get stuck
         | in various places, then talk to the previous maintainers; it
         | will be much more productive. They will also appreciate the
         | effort.
        
           | guhidalg wrote:
           | There's a fine balance with no right or wrong answer.
           | Previous maintainers will appreciate it if you spent
           | literally more than a second trying to understand things
           | before you reach out to them, but for your own sanity you
           | should know when it's time to stop and call for help.
        
         | theamk wrote:
         | Maybe do a quick look at the codebase first so you can
         | identify the biggest WTFs and ask about them.
         | 
         | After all, if you have inherited a codebase with no tests,
         | with a build process which fails every other time, with
         | unknown dependency info, and which can only be built on a
         | single server with a severely outdated OS... are you sure the
         | previous maintainer is a real wizard and all the problems are
         | the result of not enough time? Or are they a "wizard" who
         | keeps things broken for job security and/or because they
         | don't want to learn new things?
        
       | rwmj wrote:
       | Surprisingly good advice. In a similar vein, Joel's 12 steps to
       | better software: https://www.joelonsoftware.com/2000/08/09/the-
       | joel-test-12-s...
        
       | leecarraher wrote:
       | create bindings and externalize function libraries for other
       | languages, and hope to your preferred deity nothing breaks
        
         | Night_Thastus wrote:
         | This adds additional problems. E.g., start replacing legacy
         | C++ with Python, and now debugging and following the flow of
         | the code becomes very difficult.
        
           | bluGill wrote:
           | If it is C++ I wouldn't think about python in most cases.
           | Rust should come to mind. Ada, or D are other options you
           | sometimes hear about.
        
       | sjc02060 wrote:
       | A good read. We recently did "Rewrite in a memory safe language?"
       | successfully. It was something that shouldn't have been written
       | in C++ in the first place (it was never performance sensitive).
        
         | tehnub wrote:
         | Would you mind sharing what language you used?
        
         | jstimpfle wrote:
         | Probably not a project spanning more than 3 decades of
         | development and millions of lines of code?
        
       | Jtsummers wrote:
       | I'd swap 2 and 3. Getting CI, linting, auto-formatting, etc.
       | going is a higher priority than tearing things out. Why? Because
       | you don't know what to tear out yet or even the consequence of
       | tearing them out. Linting (and other static analysis tools) also
       | give you a lot of insight into where the program needs work.
       | 
       | Things that get flagged by a static analysis tool (today) will
       | often be areas where you can tear out entire functions and maybe
       | even classes and files because they'll be a re-creation of STL
       | concepts. Like homegrown iterator libraries (with subtle
       | problems) that can be replaced with the STL algorithms library,
       | or homegrown smart pointers that can just be replaced with actual
       | smart pointers, or replacing the C string functions with C++'s
       | own string class (and related classes) and functions/methods.
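       | 
       | For instance, a minimal sketch of the kind of replacement I
       | mean (hypothetical Widget/find_widget names, not from any real
       | codebase):
       | 
       |   #include <algorithm>
       |   #include <cstdio>
       |   #include <vector>
       | 
       |   struct Widget { int id; };
       | 
       |   // The STL algorithm replaces a hand-rolled search loop
       |   // and is easier to audit for off-by-one mistakes.
       |   static Widget* find_widget(std::vector<Widget>& ws, int id) {
       |     auto it = std::find_if(ws.begin(), ws.end(),
       |         [id](const Widget& w) { return w.id == id; });
       |     return it == ws.end() ? nullptr : &*it;
       |   }
       | 
       |   int main() {
       |     std::vector<Widget> ws{{1}, {7}, {42}};
       |     std::printf("%d\n", find_widget(ws, 7) != nullptr);
       |   }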
       | 
       | But you won't see that as easily until you start scanning the
       | code. And you won't be able to evaluate the consequences until
       | you push towards more rapid test builds (at least) if not
       | deployment.
        
         | broken_broken_ wrote:
         | Fair point!
        
         | dralley wrote:
         | On the flip side, auto-formatting will trash your version
         | history and impede analysis of "when and why was this line
         | added".
        
           | skrebbel wrote:
           | You can make git blame ignore specific commits by listing
           | them in an ignore-revs file (git blame --ignore-revs-file).
           | 
           | This is assuming Git of course, which is not a given at all
           | for the average legacy c++ codebase.
        
           | Jtsummers wrote:
           | I'm not hardcore on auto-formatters, but I think their impact
           | on code history is negligible in the case of every legacy
           | system I've worked on. The code history just isn't there.
           | These aren't projects that used git until recently (if at
           | all). Before that they used something else, but when they
           | transitioned they didn't preserve the history. And that's if
           | they used any version control system. I've tried to help
           | teams whose idea of version control was emailing someone
           | (they termed them "QA/CM") to make a read-only backup of the
           | source directory every few months (usually at a critical
           | review period in the project, so a _lot_ of code was changed
           | between these snapshots).
           | 
           | That said, sure, skip them if you're worried about the
           | history getting messed up or use them more selectively.
        
             | KerrAvon wrote:
             | SVN was a thing by the mid-2000's, and history from that is
             | easy to preserve in git. Just how old are the sourcebases
             | in question? (Not to shoot the messenger; just like, wow.)
             | 
             | edit:typo
        
               | varjag wrote:
               | The first large C++ project I worked on, in the mid-
               | 1990s, basically kept a bunch of archived copies of the
               | source tree. CVS was a thing but not on Windows, and
               | SourceSafe was creating more problems than it was
               | solving.
        
               | Pfiffer wrote:
               | I've heard a lot of stories about mid-90s codebases for
               | sure
        
               | Jtsummers wrote:
               | Some of these systems dated back to the 1970s. The worst
               | offenders were from the 1980s and 1990s though.
               | 
               | It's all about the team or organization and their
               | laziness or non-laziness.
        
             | bear8642 wrote:
             | >I think their impact on code history is negligible in the
             | case of every legacy system I've worked on. The code
             | history just isn't there.
             | 
             | Not sure if I agree here or not - whilst yes, the history
             | isn't there, if it's a small enough team you'll have a good
             | guess at who wrote it.
             | 
             | I've definitely found I've learnt the style of
             | colleagues, so I know who to ask just from the code
             | outline.
        
               | Jtsummers wrote:
               | Legacy systems that you inherit don't have people coming
               | with them very often. That's part of the context of this.
               | You often don't have people to trace it back to or at
               | least not the people who actually wrote it (maybe someone
               | who worked with them before they got laid off a decade
               | ago), and reformatting the code is not going to make it
               | any harder to get answers from people who aren't there.
        
           | PreachSoup wrote:
           | On a per-file level it's just one commit. It's not really
           | a big deal.
        
           | exDM69 wrote:
           | clang-format can be applied to new changes only, for this
           | very reason.
           | 
           | Adding it will remove white space nitpicking from code
           | review, even if it isn't perfect.
        
           | IshKebab wrote:
           | I believe you can configure `git blame` to skip a specific
           | commit. But in my experience it doesn't matter anyway for two
           | reasons:
           | 
           | 1. You're going to reformat it eventually anyway. You're just
           | delaying things. The best time to plant a tree, etc.
           | 
           | 2. If it's an old codebase and you're trying to understand
           | some bit of code you're almost always going to have to walk
           | through about 5 commits to get to the original one anyway.
           | One extra formatting commit doesn't really make any
           | difference.
        
           | lpapez wrote:
           | You can instruct git to ignore specific commits for blame and
           | diff commands.
           | 
           | See "git blame ignore revs file".
           | 
           | Intended use is exactly to ignore bulk changes like auto
           | formatting.
        
             | westurner wrote:
             | +1
             | 
             |     man git-blame
             |     git help blame
             | 
             | https://git-scm.com/docs/git-blame
        
           | duped wrote:
           | This is another reason why you should track important
           | information in comments alongside the code instead of
           | trusting VCS to preserve it in logs/commit messages, and to
           | reject weird code missing comments from being merged.
           | 
           | Not saying that fixes decades of cruft, because you
           | shouldn't change files without good reason and non-
           | whitespace formatting is not a good reason, but I'm
           | mentioning it because I've seen people naively believe
           | bullshit like "code is self explanatory" and "the reason is
           | in the commit message".
           | 
           | Just comment your code, folks, and this becomes less of a
           | problem.
        
         | politician wrote:
         | Nit: The post scopes "tearing things out" to dead code as
         | guided by compiler warnings and unsupported architectures.
         | 
         | If going that route, I'd recommend commenting out the lines
         | rather than removing them outright, to simplify the diffs at
         | least until you're ready to squash and merge the branch.
        
           | SAI_Peregrinus wrote:
           | Better to use `#if` or `#ifdef` to prevent compilation. C &
           | C++ don't support nested comments, so you can end up with
           | existing comments in the code ending the comment block.
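           | 
           | A tiny sketch of what I mean (hypothetical code, nothing
           | from the article):
           | 
           |   #include <cstdio>
           | 
           |   // Commenting the function out with /* ... */ would not
           |   // compile: the inner "/* old note */" would end the
           |   // outer comment early. The preprocessor doesn't care.
           |   #if 0
           |   static int scale(int x) {
           |     /* old note */ return x * 2;
           |   }
           |   #endif
           | 
           |   int main() { std::puts("scale() is compiled out"); }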
        
             | kccqzy wrote:
             | I think `#if` and `#ifdef` are not good ideas because
             | they prevent the compiler from seeing the code in the
             | first place. A better solution is just `if (false)`,
             | which is nestable, and the code is still checked by the
             | compiler so it won't bit rot.
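             | 
             | A rough sketch of the difference (hypothetical helper
             | names):
             | 
             |   #include <iostream>
             | 
             |   static void legacy_path(int& x) { x += 1; }
             |   static void modern_path(int& x) { x += 2; }
             | 
             |   int main() {
             |     int v = 0;
             |     if (false) {
             |       legacy_path(v);  // disabled, but still parsed
             |     }                  // and type-checked: no bit rot
             |     modern_path(v);
             |     std::cout << v << "\n";  // prints 2
             |   }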
        
         | btown wrote:
         | CI is different from the others, here! At minimum, building a
         | "happy path(s)" test harness that can run with replicable
         | results, and will run on every one of your commits, is a first
         | step, and also helps to understand the codebase.
         | 
         | And when you're jumping around - and you'll have to! - odds
         | are you'll have a bunch of things changed locally, and you
         | might accidentally create a commit that doesn't separate out
         | one concern from another. CI will be a godsend at that point.
        
         | jasonwatkinspdx wrote:
         | Yeah, I've done a fair bit of agency work dropping in to rescue
         | code bases, and the first thing I do is run unit tests and
         | check coverage. I add basic smoke tests anywhere they're
         | missing. This actually speeds me up, rather than slowing me
         | down, because once I have reasonably good coverage I can move
         | dramatically faster when refactoring. It's a small investment
         | that pays off.
        
         | thrwyexecbrain wrote:
         | I would absolutely not recommend auto-formatting a legacy
         | codebase. In my experience large C++ projects tend to have not
         | only code generation scripts (python/perl/whatever) but also
         | scripts that parse the code (usually to gather data for code
         | generation). Auto formatting might break that. I have even seen
         | some really cursed projects where the _users_ parsed public
         | header files with rather fragile scripts.
        
           | Jtsummers wrote:
           | I was listing the items in the original article's #3 and
           | saying I'd move them up to #2 before I'd go about excising
           | portions of the project, the original #2. I still stand by
           | that. But you can read my other comment where I don't really
           | defend auto-formatting to see that I don't care either way. I
           | made it about four hours ago so maybe you missed it if you
           | didn't refresh the page in the last few hours.
        
       | ecshafer wrote:
       | This is pretty great advice for any legacy code project. Even
       | outside of C++ there is a huge amount of code bases out there
       | that do not compile/run on a dev machine without tons of work. I
       | once worked on a Java project where, due to some weird
       | dependencies, the dev mode was to run a JUnit test which
       | started Spring and went into an infinite loop. Getting a
       | standard run to work helped a ton.
        
         | bluGill wrote:
         | The difference between greenfield and legacy code is just a few
         | years. So learn to work with legacy code and how to make it
         | better over time.
        
       | hesdeadjim wrote:
       | Start grinding leetcode and find another gig?
        
       | bun_terminator wrote:
       | > Rewrite in a memory safe language?
       | 
       | like c++11 and later?
        
         | z_open wrote:
         | How is that memory safe? Even a vector out-of-bounds index
         | is not memory safe.
        
           | bun_terminator wrote:
           | You can access a vector with a function that throws an
           | exception if you so desire
        
             | TwentyPosts wrote:
             | You can also just write no code at all if you so desire. It
             | certainly won't cause any memory issues that way. (Hint:
             | What you yourself decide to write or refrain from writing
             | is not the problem. You're not they only person who ever
             | worked on this legacy codebase, and you want guarantees but
             | default, not having to check every line of code in the
             | entire project.)
        
               | bun_terminator wrote:
               | no you just have to write a githook with some static
               | analysis, like literally everyone who does proper c++.
               | Safety hasn't been an issue in c++ for more than a
               | decade. It's just a made up thing by people who don't use
               | the language but only want to hate.
        
               | z_open wrote:
               | Go look at the CVEs and github issues of modern C++
               | codebases. Your statement is nonsense. Chromium is still
               | plagued by use after free. How high do you set the bar?
               | Which codebases are we allowed to look at?
        
               | bun_terminator wrote:
               | I had that exact discussion with someone else a while
               | ago. And when you actually go through the chromium memory
               | bugs, it's 100% avoidable with an absolute baseline of
               | competence and not using ancient bugs. It's unfair that
               | C++ always has to compete in its state from the 1990s
               | against languages in their current iteration.
        
               | z_open wrote:
               | That's why I asked what the bar was? If Google is writing
               | shitty C++ even with the world's eyes on their code base,
               | who is doing it right? No one writing anything
               | sufficiently complicated that's for sure.
        
               | bun_terminator wrote:
               | However you feel about this issue: It's pretty widely
               | known that google is bad at c++. Most codebases will be
               | of significantly higher quality.
        
               | hairyplanner wrote:
               | Google Chrome must be one of the most used (in terms
               | of CPU time) pieces of C++ software _in the world_
               | right now. That means it's been fuzz tested (by the
               | developers as well as by the users, and by the random
               | websites that give it garbage HTML and JavaScript)
               | extensively. I can only think of the Linux kernel as
               | more widely used, and Linux is not C++.
               | 
               | Since you seem to be very good at c++, can you point to a
               | "significantly higher quality" c++ projects please? I'd
               | like to see what it looks like.
        
               | delta_p_delta_x wrote:
               | > Since you seem to be very good at c++, can you point to
               | a "significantly higher quality" c++ projects please?
               | 
               | Not the parent commenter, but there _are_ quite a few
               | very high-quality C++ projects out there.
               | 
               | - LLVM
               | 
               | - KDE projects
               | 
               | - CMake
               | 
               | - Node.js
               | 
               | - OpenJDK
        
           | evouga wrote:
           | It's funny; I spent a couple of hours last week helping some
           | students debug out-of-bounds indices in their Rust code.
           | 
           | I've written bugs that would have been caught by the compiler
           | in a memory-safe language. I think the last time was maybe in
           | 2012 or 2013? I still write plenty of bugs today but they're
           | almost all logic errors that nothing (short of AI tools)
           | could have caught.
        
           | bluGill wrote:
           | vector.at() is memory safe. You get a choice, and it is
           | easy to ban [] where you cannot statically prove the index
           | is in bounds.
           | 
           | C++11 isn't the most memory safe language, but C++11 is a lot
           | safer than older versions, and C++23 is better yet. I'm
           | expecting true memory safety in C++26 (time will tell), but
           | it will be opt-in profiles which isn't ideal.
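           | 
           | Minimal sketch of the difference:
           | 
           |   #include <iostream>
           |   #include <stdexcept>
           |   #include <vector>
           | 
           |   int main() {
           |     std::vector<int> v{1, 2, 3};
           |     // v[10] would be undefined behaviour (no check).
           |     try {
           |       std::cout << v.at(10) << "\n";  // bounds-checked
           |     } catch (const std::out_of_range& e) {
           |       std::cout << "caught: " << e.what() << "\n";
           |     }
           |   }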
        
       | sealeck wrote:
       | rm -r
       | 
       | Problem solved
        
         | dtx1 wrote:
         | > C++
         | 
         | > Not even once
        
         | GuB-42 wrote:
         | If you mean "rewrite from scratch", believe me, it is the
         | worst thing you can do. I speak from experience: it is
         | tempting, but the few times I have done it, a few months
         | later, as I got burnt, I could only think of what an idiot I
         | was for never learning.
         | 
         | Legacy code is like that because it went through many bugfixes
         | and addressing weird requirements. Start over and you lose all
         | that history, and it is bound to repeat itself. That weird
         | feature that makes no sense, as it turns out, makes a lot of
         | sense for some users, and that's why it was implemented in
         | the first place. And customers don't care about your new
         | architecture and fancy languages, they want their feature back,
         | otherwise they won't pay.
         | 
         | Another way to look at it: when you are asked to maintain a
         | legacy code base, it's because that software has been in use
         | for a long time. If it was that bad, it would have been
         | dropped long ago, or maybe even cancelled before it got any
         | use. Respect software that is used in production; many
         | projects never reach that stage.
         | 
         | Of course there are exceptions to that rule, but the general
         | idea about rewriting from scratch is: "no" means "no", "maybe"
         | means "no", and "yes" means "maybe".
        
           | sealeck wrote:
           | I'm being like 1000% facetious, and agree that rewrites are
           | bad.
        
           | bluGill wrote:
           | I have been involved in a successful rewrite. It cost
           | billions of dollars and many years during which the new
           | code wasn't working, so the old system was still in use. We
           | also ended up bringing over some old code directly just to
           | get something - anything - functional at all. For many
           | years my boss kept the old version running on his desk,
           | because when there was a question, that old system was the
           | requirements.
           | 
           | Today we only have to maintain the new system (the old is no
           | longer sold/supported), and the code is a lot better than the
           | old one. However I suspect we could have refactored the old
           | system in place for less time/money and been shipping all the
           | time. Now we have a new system and it works great - but we
           | already have had to do significant refactors because some new
           | requirement came along that didn't fit our nice architecture.
        
       | Night_Thastus wrote:
       | >worry not, by adding std::cmake to the standard library and
       | you'll see how it's absolutely a game changer
       | 
       | I'm pretty sure my stomach did somersaults on that.
       | 
       | But as for the advice:
       | 
       | >Get out the chainsaw and rip out everything that's not
       | absolutely required to provide the features your company/open
       | source project is advertising and selling
       | 
       | I hear you, but this is _incredibly_ dangerous. Might as well
       | take that chainsaw to yourself if you want to try this.
       | 
       | It's dangerous for multiple reasons. Mainly it's a case of
       | Chesterton's fence. Unless you fully understand why X was in the
       | software and fully understand how the current version of the
       | software is used, you _cannot_ remove it. A worst case scenario
       | would be that maybe a month or so later you make a release and
       | the users find out an important feature is subtly broken. You'll
       | spend days trying to track down exactly how it broke.
       | 
       | >Make the project enter the 21st century by adding CI, linters,
       | fuzzing, auto-formatting, etc
       | 
       | It's a nice idea, but it's hard to do. One person is using VIM,
       | another is using emacs, another is using QTCreator, another
       | primarily edits in VSCode.. Trying to get everyone on the same
       | page about all this is very, very hard.
       | 
       | If it's an optional step that requires that they install
       | something new (like a commit hook), it's just not going to
       | happen.
       | 
       | Linters also won't do you any good when you open the project and
       | 2000+ warnings appear.
        
         | aaronbrethorst wrote:
         | An optional step locally like pre-commit hooks should be backed
         | up by a required step in the CI. In other words: the ability to
         | run tests locally, lint, fuzz, format, verify Yaml format,
         | check for missing EOF new lines, etc, should exist to help a
         | developer prevent a CI failure before they push.
         | 
         | As far as linters causing thousands of warnings to appear on
         | opening the project, the developer adding the linter should
         | make sure that the linter returns no warnings before they merge
         | that change. This can be accomplished by disabling the linter
         | for some warnings, some files, making some fixes, or some
         | combination of the above.
        
         | zer00eyz wrote:
         | >> It's a nice idea, but it's hard to do. One person is using
         | VIM, another is using emacs, another is using QTCreator,
         | another primarily edits in VSCode.. Trying to get everyone on
         | the same page about all this is very, very hard.
         | 
         | This is what's wrong with our industry, and it's no longer an
         | acceptable answer. We're supposed to be fucking professional,
         | and if a job needs to build a tool chain from the IDE up we
         | need to learn to use it and live with it.
         | 
         | Built on my machine, with my IDE, the way I like it and it
         | works is not software. It's arts and fucking crafts.
        
           | cratermoon wrote:
           | If you're saying everyone should agree on the same IDE and
           | personal development toolset, I disagree, sort of.
           | 
           | The GP was suggesting the effort to add CI, linters, fuzzing,
           | auto-formatting, etc was too hard. If that can be abandoned
           | entirely, perhaps the legacy codebase isn't providing enough
           | value, and the effort to maintain it would be better spent
           | replacing it. But the implication is that the value outweighs
           | the costs.
           | 
           | Put all the linters, fuzzing, and format checking in an
           | automated build toolchain. Allow individuals to work how they
           | want, except they can't break the build. Usually this will
           | rein in the edge cases using inadequate tools. The "built on
           | my machine, with my IDE, the way I like it and it works" is
           | no longer the arbiter of correct, but neither does the
           | organization have to deal with the endless yak shaving over
           | brace style and tool choice.
        
             | eropple wrote:
             | _> neither does the organization have to deal with the
             | endless yak shaving over brace style and tool choice_
             | 
             | I hear you, but an organization that fears this, instead of
             | Just Pick Something And Deal With It, is an organization
             | that probably doesn't have the right people in it to
             | succeed at any task more arduous than that.
        
         | electroly wrote:
         | > It's a nice idea, but it's hard to do. One person is using
         | VIM...
         | 
         | The things the author listed there are commonly not IDE
         | integrated. I've never seen a C++ development environment where
         | cpplint/clang-tidy and fuzzers are IDE integrated, they're too
         | slow to run automatically on keystrokes. Auto-formatting is the
         | only one that is sometimes integrated. All of this stuff you
         | can do from the command line without caring about each user's
         | chosen development environment. You should definitely at least
         | try rather than giving up before you start just because you
         | have two different text editors in use. This is C++; if your
         | team won't install any tools, you're gonna have a bad time.
         | Consider containerizing the tools so it's easier.
        
         | IshKebab wrote:
         | > One person is using VIM, ...
         | 
         | I don't get your point. You know you can autoformat outside
         | editors right? Just configure pre-commit and run it in CI. It's
         | trivial.
         | 
         | > If it's an optional step that requires that they install
         | something new (like commit hook) it's just not going to happen.
         | 
         | It will because if they don't then their PRs will fail CI.
         | 
         | This is really basic stuff, but knowledge of how to do CI and
         | infra right does seem to vary massively.
        
         | j-krieger wrote:
         | > It's a nice idea, but it's hard to do. One person is using
         | VIM, another is using emacs, another is using QTCreator,
         | another primarily edits in VSCode.. Trying to get everyone on
         | the same page about all this is very, very hard.
         | 
         | I must have missed the memo where I could just say no to basic
         | things my boss requires of me. You know, the guy that pays my
         | salary.
        
       | bluetomcat wrote:
       | Been there, done that. Don't be a code beauty queen. Make it
       | compile and make it run on your machine. Study the basic control-
       | flow graph starting from the entry point and see the relations
       | between source files. Debug it with step-into and see how deep
       | you go. Only then can you gradually start seeing the big picture
       | and any potential improvements.
        
         | johngossman wrote:
         | Absolutely. Read the code. Step through with a debugger. Fix
         | obvious bugs. If it's legacy and somebody is still paying to
         | have it worked on, it must mostly work. Changing things for
         | "cleanliness and modernization" is likely to break it.
        
           | cratermoon wrote:
           | > Fix obvious bugs.
           | 
           | Be careful about that. Hyrum's Law and all.
        
         | bluGill wrote:
         | In my experience it takes at least a year of working with a
         | code base before you can form an opinion on whether it is
         | beautiful or not. People who have not worked in a code base
         | for that long do not understand what is a beautiful design
         | corrupted by the real world vs what is genuinely ugly code.
         | Most code started out with a beautiful design, but the real
         | world forced ugly on it - you might be able to improve this a
         | little with a full rewrite, but the real world will still
         | force a lot of ugly on you. However, some code really is bad.
        
       | sk11001 wrote:
       | Is it worth getting more into C++ in 2024? Lots of interesting
       | jobs in finance require it but it seems almost impossible to get
       | hired without prior experience (with C++ and in finance).
        
         | optimalsolver wrote:
         | Yes.
         | 
         | I switched from Python to C++ because Cython, Numba, etc. just
         | weren't cutting it for my CPU-intensive research needs (program
         | synthesis), and I've never looked back.
        
           | sk11001 wrote:
           | My question isn't whether it's a good fit for a specific
           | project, I'm more interested in whether it's a good career
           | choice e.g. can you get a job using C++ without C++
           | experience; how realistic is it to ramp up on it quickly;
           | whether you're likely to end up with some gnarly legacy
           | codebase as described in the OP; is it worth pursuing this
           | direction at all.
        
             | hilux wrote:
             | Did you see yesterday's article about the White House
             | Office of the National Cyber Director (ONCD) advising
             | developers to dump C, C++, and other languages with memory-
             | safety issues?
        
               | sk11001 wrote:
               | Yes, and at the same time I'm seeing ads for jobs that
               | pay more than double what I make that require C++.
        
               | mkipper wrote:
               | I still think knowing C++ is pretty valuable to someone's
               | career (at least over the next 10 - 15 years) if they're
               | looking to work in fields that traditionally use C++ but
               | might be transitioning away from it.
               | 
               | The obvious comparison is Rust. There are way more C++
               | jobs out there than Rust jobs. And even if I'm hiring for
               | a team developing something in Rust, I'd generally prefer
               | candidates with similar C++ experience and a basic
               | understanding of Rust over candidates with a strong
               | knowledge of Rust and no domain experience. Modern C++
               | and Rust aren't _that_ dissimilar, and a lot of ideas and
               | techniques carry over from C++ to Rust.
               | 
               | Even if the DoD recommends that contractors stop using
               | C++ and tech / finance are moving away from it, I'd say
               | we're still years away from the point where Rust catches
               | up to C++ in terms of job opportunities. If your main
               | goal is employment in a certain industry, you'll probably
               | have an easier time getting your foot in the door with
               | C++ than Rust. Both paths are viable but the Rust path
               | would be much harder IMO.
        
               | avgcorrection wrote:
               | We're still in for another 20 years of hardcore veteran
               | Cxx programmers insisting that either the memory safety
               | issue is overblown or just a theoretical issue if you are
               | experienced enough/use a new enough edition of the
               | language.
        
               | bluGill wrote:
               | The C++ committee is looking hard at how to make C++
               | memory safe. If you use modern C++ you are already
               | reasonably memory safe - the trick is how do we force
               | developers to not access raw memory (no new/malloc, use
               | vector not arrays...). There are some things that seem
               | like they will come soon.
               | 
               | Of course if you really need that non-memory safe stuff -
               | which all your existing code does - then you can't take
               | advantage of it. However you can migrate your C++ to
               | modern C++ and add those features to your code. This is
               | probably easier than migrating to something like Rust
               | (Rust cannot work with C++ unless you stick with the C
               | subset from what I can tell) since you can work in small
               | chunks at a time in at least some situations.
        
             | TillE wrote:
             | C++ is really a language that you want to specialize in and
             | cultivate years of deep expertise with, rather than having
             | it as one tool in your belt like you can with other
             | languages.
             | 
             | That's certainly a choice you can make, and modern C++ is
             | generally a pretty good experience to work with. I would
             | hope that there's not a ton of active C++ projects which
             | are still mostly using the pre-2011 standard, but who
             | knows.
        
               | sgerenser wrote:
               | This exactly. It's a blessing and a curse, because I'd
               | love to move to a "better" language like Rust or even
               | Zig. But with 20+ years of C++ experience I feel like I'd
               | be throwing away too much to avoid C++ completely. Also
               | agreed that modern C++ is pretty decent. Lamenting that
               | I'm back in a codebase that started before C++11 vs my
               | previous job that was greenfield C++14/17.
        
             | stagger87 wrote:
             | I would be very surprised if most people actually choose to
             | develop in C++. It's a very good language choice for many
             | domains, and I suspect interest and expertise in those
             | domains drives people to C++ more than a desire to program
             | in C++.
        
             | jandrewrogers wrote:
             | Modern C++ is the language of choice for high-performance,
             | high-scale data-intensive applications and will remain so
             | for the foreseeable future. This is a class of application
             | for which it is uniquely suited (C and Rust both have
             | significant limitations in this domain). There are other
             | domains like gaming that are also heavily into C++.
             | Avoiding legacy C++ codebases is more about choosing where
             | you work carefully.
             | 
             | It goes without saying that if you don't like the kinds of
             | applications where C++ excels then it may not be a good
             | career choice because it is not a general purpose language
             | in practice.
        
         | d_sem wrote:
         | Depends on the industry you are interested in entering.
         | 
         | My myopic view of the world has seen the general trend from C
         | to C++ for realtime embedded applications. For example: in the
         | Automotive Industry all the interesting automotive features are
         | written in C++.
        
       | davidw wrote:
       | Well, tomorrow is the "who's hiring?" thread...
        
       | tehnub wrote:
       | I enjoyed the article and learned something. But I've been
       | wondering: When people say "rewrite in a memory-safe language",
       | what languages are they suggesting? Is this author rewriting
       | parts in Go, Java, C#? Or is it just a smirky, plausibly deniable
       | way of saying to rewrite it in Rust?
        
         | broken_broken_ wrote:
         | Author here, thanks! A second article will cover this, but the
         | bottom line is that it entirely depends on the team and the
         | constraints e.g. is a GC an option (then Go is a good option),
         | is security the highest priority, etc.
         | 
         | I'd say that most C++ developers will generally have an easy
         | time using Rust and will get equivalent performance.
         | 
         | But sometimes the project did not have a good reason to be in
         | C++ in the first place and I've seen successful rewrites in
         | Java for example.
         | 
         | Apple is rewriting some C++ code in Swift, etc. So, the
         | language the team/company is comfortable with is a good rule of
         | thumb.
        
           | tehnub wrote:
           | Makes sense, thanks!
        
         | avgcorrection wrote:
         | So you saw a post about C++ that didn't mention "Rust" once,
         | that mentioned "memory safe" languages, of which there are
         | _dozens_, and you still found a way to shoehorn in a
         | dismissive comment about a meme. Nice.
         | 
         | We've reached the stage of the rewrite-it-in-Rust meme where
         | people question whether the author is a nefarious crypto-Rust
         | programmer, since there's nothing to complain about directly
         | (it wasn't brought up!).
        
       | huqedato wrote:
       | Whenever I inherited a project containing legacy code,
       | regardless of the frameworks, tools, or languages used, I
       | always found it necessary to drop it and begin anew. Despite my
       | efforts to reuse, update, or refactor it, I inevitably reached
       | a point where it was unusable for further development.
        
       | Kapura wrote:
       | > Get out the chainsaw and rip out everything that's not
       | absolutely required to provide the features your company/open
       | source project is advertising and selling
       | 
       | Great advice! People do not often think about the value of de-
       | cluttering the codebase, especially _before_ a refactor.
        
       | cratermoon wrote:
       | Kill It with Fire https://nostarch.com/kill-it-fire
        
       | myrmidon wrote:
       | Really liked it! Especially the "get buy in" part is really
       | good advice -- always stressing how the effort spent on
       | refactoring actually improves things, and WHY it's necessary.
       | 
       | Something that's kinda implied that I would really stress:
       | Establish a "single source of truth" for any release/binary that
       | reaches production/customers, before even touching ANY code
       | (Ideally CI. And ideally builds are reproducible).
       | 
       | If you build from different machines/environments/toolchains,
       | it's only a matter of time before that in itself breaks
       | something, and those kinds of problems can be really
       | "interesting" to find (an obscure race condition that only
       | occurs when using a newer compiler, etc.).
        
       | bArray wrote:
       | > 3. Make the project enter the 21st century by adding CI,
       | linters, fuzzing, auto-formatting, etc
       | 
       | I would break this down:
       | 
       | a) CI - Ensure not just you can build this, but it can be built
       | elsewhere too. This should prevent compile-based regressions.
       | 
       | b) Compiler warnings and static analysers - They are likely both
       | smarter than you. When it says "warning, you're doing weird
       | things with a pointer and it scares me", it's a good indication
       | you should go check it out.
       | 
       | c) Unit testing - Set up a series of tests for important parts
       | of the code to ensure it performs precisely the task you expect
       | it to, all the way down to the low level. There's a really good
       | chance it doesn't, and you need to understand why. Fixing
       | something could cause something else to blow up, since other
       | code was written around the bugged behaviour. You also end up
       | with a series of regression tests for the most important code
       | (see the small sketch after this list).
       | 
       | n) Auto-formatting - Not a priority. You should adopt the same
       | style as the original maintainer.
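       | 
       | As a purely illustrative sketch for (c) -- a hypothetical
       | clamp_percent helper with plain asserts, no particular test
       | framework assumed:
       | 
       |   #include <cassert>
       | 
       |   // Hypothetical legacy helper we want to pin down with
       |   // regression tests before touching it.
       |   static int clamp_percent(int value) {
       |     if (value < 0) return 0;
       |     if (value > 100) return 100;
       |     return value;
       |   }
       | 
       |   int main() {
       |     assert(clamp_percent(-5) == 0);     // lower bound
       |     assert(clamp_percent(42) == 42);    // pass-through
       |     assert(clamp_percent(250) == 100);  // upper bound
       |   }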
       | 
       | > 5. If you can, contemplate rewrite some parts in a memory safe
       | language
       | 
       | The last step of an inherited C++ codebase is to rewrite it in a
       | memory safe language? A few reasons why this probably won't work:
       | 
       | 1. Getting resources to do additional work on something that
       | isn't broken can be difficult.
       | 
       | 2. Rather than just needing knowledge in C++, you now also need
       | knowledge in an additional language too.
       | 
       | 3. Your testing potentially becomes more complex.
       | 
       | 4. Your project likely won't lend itself to being written in
       | multiple languages, due to memory/performance constraints. The
       | problem must be hard enough that you couldn't just rewrite it
       | yourself from scratch.
       | 
       | 5. You have chosen to inherit a legacy codebase rather than
       | write something from scratch. It's an admission that you don't
       | have some resource (time/money/knowledge/etc.) to do so.
        
       | grandinj wrote:
       | This is generally the same path that LibreOffice followed. Works
       | reasonably well.
       | 
       | We built our own find-dead-code tool, because the extant ones
       | were imprecise, and boy oh boy did they find lots of dead stuff.
       | And more dead stuff. And more dead stuff. Like peeling an onion,
       | it went on for quite a while. But totally worth it in the end,
       | made various improvements much easier.
        
       | Scubabear68 wrote:
       | The "rip everything out" step is not recommended. You will break
       | things you don't understand, invoke Chesterson's Fence, and
       | create enormous amounts of unnecessary work for yourself.
       | 
       | Make it compile, automate what you can, try not to poke the bear
       | as much as you can, pray you can start strangling it by porting
       | pieces to something else over time.
        
       | beanjuiceII wrote:
       | the whitehouse says i should RiiR
        
       | jeffrallen wrote:
       | This was my job at Cisco. But it was a C code base, which used
       | nonstandard compiler extensions, and so could not be built
       | without the legacy compilers with their locally made extensions.
       | Also the "unit tests" were actually hardware-in-the-loop tests.
       | And the Makefiles referenced NFS filesystems automounted from
       | global replicas, but none of them were on my continent.
       | 
       | Fun times. Don't work there anymore. Life is good. :)
        
       | mk_chan wrote:
       | I'm not sure why there's so much focus on refactoring or
       | improving it. When a feature needs to be added that can just be
       | tacked onto the code, do it without touching anything else.
       | 
       | If it's a big enough change, export whatever you need out of the
       | legacy code (by calling an external function/introducing a
       | network layer/pulling the exact same code out into a
       | library/other assorted ways of separating code) and do the rest
       | in a fresh environment.
       | 
       | I wouldn't try to do any major refactor unless several people are
       | going to work on the code in the future and the code needs to
       | have certain assumptions and standards so it is easy for the
       | group to work together on it.
        
         | dj_mc_merlin wrote:
         | The post argues against major refactors. The incremental
         | suggestions it gives progressively make the code easier to work
         | with. What you suggest works until it doesn't -- something
         | suddenly breaks when you make a change and there's so much
         | disorganized stuff that you can't pinpoint the cause for much
         | longer than necessary. The OP is basically arguing for
         | decluttering in order to make changes more easily, while
         | still maintaining cohesion and avoiding a major rewrite.
        
         | bluGill wrote:
         | The right answer depends on the future. I've worked on C++ code
         | where the replacement was already in the market but we had to
         | do a couple more releases of the old code. Sometimes it is
         | adding the same feature to both versions. There is a big
         | difference in how you treat code that you know the last release
         | is coming soon and code where you expect to maintain and add
         | features for a few more decades.
        
       | VyseofArcadia wrote:
       | > Get out the chainsaw and rip out everything that's not
       | absolutely required to provide the features your company/open
       | source project is advertising and selling
       | 
       | Except every legacy C++ codebase I've worked on is decades old.
       | Just enumerating the different "features" is a fool's errand.
       | Because of reshuffling and process changes, even marketing
       | doesn't have a complete list of our "features". And even if
       | there were a complete list of features, we have too many
       | customers who rely on spacebar heating[0] to just remove code
       | that we think doesn't map to a feature.
       | 
       | That's _if_ we can even tease apart which bits of code map to a
       | feature. It's not like we only added brand new code for each
       | feature. We also relied on and modified existing code. The only
       | code that's "safe" to remove is dead code, and sometimes that's
       | not as dead as you might think.
       | 
       | Even _if_ we had a list of features and even _if_ code mapped
       | cleanly to features, the idea of removing all code not related to
       | "features your company is advertising or selling" is absurd.
       | Sometimes a feature is so widely used that you don't advertise it
       | anymore. It's just there. Should Microsoft remove boldface text
       | from Word because they're not actively advertising it?
       | 
       | The only way this makes sense is if the author and I have wildly
       | different ideas about what "legacy" means.
       | 
       | [0] https://xkcd.com/1172/
        
       | professorTuring wrote:
       | How good is AI at refactoring code? I haven't tried it yet,
       | but... as someone who has needed to work on tons of legacy code
       | in the past... it looks interesting!
        
         | bluGill wrote:
         | Very mixed. Sometimes great, but you have to watch it
         | closely, as once in a while it will produce garbage.
         | 
         | There is a lot of non-AI refactoring tooling for C++ these
         | days that is very good. And many more tools will point to
         | areas where there is a problem, and often a manual fix of
         | those areas is "easy".
        
       | eschneider wrote:
       | Some good advice here, and some more...controversial advice here.
       | 
       | After inheriting quite a few giant C++ projects over the years,
       | there are a few obvious big wins to start with:
       | 
       | * Reproducible builds. Pro-tip: wrap your build environment
       | with docker (or your favorite packager) so that your tooling
       | and dependencies become both explicit and reproducible. The
       | sanity you save will be your own.
       | 
       | * Get the code to build clean with -Wall. This is for a couple
       | of reasons. a) You'll turn up some amount of bad code/undefined
       | behavior/bugs this way. Fix them and make the warning go away.
       | It's ok to #pragma away some warnings once you've determined
       | you understand what's happening and it's "ok" in your situation
       | (see the sketch at the end of this comment), but that should be
       | rare. b) Once the build is clean, you'll get obvious warnings
       | when YOU do something sketchy and you can fix that shit
       | immediately. Again, the sanity you save will be your own.
       | 
       | * Do some early testing with something like valgrind and
       | investigate any read/write errors it turns up. This is an easy
       | win from a bugfix/stability point of view.
       | 
       | * At least initially, keep refactorings localized. If you work on
       | a section and learn what it's doing, it's fine to clean it up and
       | make it better, but rearchitecting the world before you have a
       | good grasp on what's going on globally is just asking for pain
       | and agony.
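       | 
       | In case it's useful, here is a minimal sketch of the scoped
       | suppression I mean, using the GCC/Clang diagnostic pragmas (MSVC
       | has #pragma warning(push/pop) for the same job). The names are
       | made up; treat it as an illustration, not a recipe.
       | 
       |     // minimal sketch: silence one vetted warning, locally only
       |     [[deprecated("use new_api() instead")]]
       |     int old_api() { return 42; }
       | 
       |     int use_legacy() {
       |     #pragma GCC diagnostic push   // GCC and Clang
       |     #pragma GCC diagnostic ignored "-Wdeprecated-declarations"
       |         return old_api();         // vetted, ok for now
       |     #pragma GCC diagnostic pop
       |     }
       | 
       |     int main() { return use_legacy() == 42 ? 0 : 1; }
       | 
       | The push/pop pair keeps the suppression from leaking into the
       | rest of the translation unit, so -Wall stays meaningful.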
        
         | SleepyMyroslav wrote:
         | I would not call that 'controversial'. On the internet, people
         | call this behavior trolling for a reason. The punchline about
         | rewriting the code in a different language is an easy hint at
         | where this is all going.
         | 
         | PS. I have been in the shoes of inheriting old projects before,
         | and I hope I left them in a better state than I found them.
        
         | dataflow wrote:
         | Step 0: reproducible builds (like you said)
         | 
         | Step 1: run all tests, mark all the flaky ones.
         | 
         | Step 2: run all tests under sanitizers, mark all the ones that
         | fail.
         | 
         | Step 3: fix all the sanitizer failures (see the sketch below).
         | 
         | Step 4: (the other stuff you wrote)
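         | 
         | To illustrate steps 2 and 3, here is a tiny made-up example of
         | the kind of bug the sanitizers flag even when the test itself
         | appears to pass (assuming g++ or clang++ with -fsanitize):
         | 
         |     // build: g++ -g -fsanitize=address,undefined bug.cpp
         |     #include <vector>
         | 
         |     int last_plus_one(const std::vector<int>& v) {
         |         return v[v.size()];  // off-by-one read past the end
         |     }
         | 
         |     int main() {
         |         std::vector<int> v{1, 2, 3};
         |         // without sanitizers this may quietly "pass";
         |         // ASan will likely report a heap-buffer-overflow
         |         return last_plus_one(v) == 4 ? 0 : 1;
         |     }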
        
           | eschneider wrote:
           | Just a note on legacy tests: Step 0.5: understand the tests.
           | They need to be examined to see whether they've rotted or
           | not. Tests passing or failing doesn't necessarily mean the
           | code under test works or doesn't. The tests might have been
           | abandoned under previous management and may not accurately
           | reflect how the code is _supposed_ to be working.
        
           | daemin wrote:
           | Probably insert another Step 1: implement tests, be they
           | simple acceptance tests, integration tests, or even unit
           | tests for some things.
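           | 
           | A cheap place to start is characterization tests that just
           | pin down current behaviour; a minimal sketch with plain
           | assert (normalize_path and the expected outputs are made up
           | for illustration):
           | 
           |     #include <cassert>
           |     #include <string>
           | 
           |     // hypothetical legacy function; link against the
           |     // existing code that provides it
           |     std::string normalize_path(const std::string& p);
           | 
           |     int main() {
           |         // pin down observed behaviour, warts and all,
           |         // so refactors can't change it silently
           |         assert(normalize_path("a//b/./c") == "a/b/c");
           |         assert(normalize_path("") == ".");
           |         return 0;
           |     }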
        
           | hyperman1 wrote:
           | If we're going to visit the circles of hell, let's do it
           | properly:
           | 
           | Step -1: Get it under source control and backed up.
           | 
           | Step -2: Find out whether the source code corresponds to the
           | executable, and which of the 7 variants of the source code it
           | is (if any).
           | 
           | Step -3: Do dark rituals over a weekend with cdparanoia to
           | scrape the source code from the bunch of scratched CDs found
           | in someone's bottom drawer. Bonus points if said person died
           | last week, and other eldritch horrors lurk in that bottom
           | drawer. Build a VM clone of the one machine still capable of
           | compiling it.
           | 
           | Yes, I have scars, why do you ask?
        
             | galangalalgol wrote:
             | I was assuming it already had unit and system tests with
             | decent coverage. I forgot how bad stuff gets.
             | 
             | Maybe VM clones of various users too, and recordings of
             | their workflows?
        
         | microtherion wrote:
         | > Get the code to build clean with -Wall.
         | 
         | This is fine, but I would strongly recommend against putting
         | something like -Wall -Werror into production builds. Some of
         | the warnings produced by compilers are opinion-based, and new
         | compiler versions may add new warnings, so suddenly the
         | previously "clean" code is no longer accepted. If you must use
         | -Werror, use it in debug builds.
        
           | ladberg wrote:
           | This is fixed by the suggestion right before it:
           | 
           | > * Reproducible builds. Pro-tip: wrap your build environment
           | with docker (or your favorite packager) so that your tooling
           | and dependencies become both explicit and reproducible. The
           | sanity you save will be your own.
           | 
           | Upgrading compiler versions shouldn't be done out-of-band
           | with your normal PR process.
        
             | jchw wrote:
             | I agree that this helps, although I still think that in
             | general, the _default_ build should never do -Werror, since
             | people may use other toolchains and it shouldn't surprise-
             | break downstream (I'm pretty sure this is a problem Linux
             | distros struggle with all the time...). If it does it only
             | in your fully reproducible CI, then it should be totally
             | fine, of course.
        
               | eschneider wrote:
               | The scripted, packaged docker with toolchain dependencies
               | _is_ the build. If someone decides to use a different
               | toolchain, the problems are on them.
        
               | jchw wrote:
               | Yeah, that works if you are not dealing with open source.
               | If you are dealing with open source, though, it really
               | won't save you that much trouble; if anything it will
               | just lead to unnecessarily hostile interactions. You're
               | not really obligated to fix any specific issues that
               | people report, but shrugging and saying "Your problem."
               | is just non-productive and harms valuable downstreams
               | like Linux distributions, especially when a lot of new
               | failures actually do indicate bugs and portability
               | issues.
        
               | galangalalgol wrote:
               | I would do -Wall, -Wextra, and -Werror, again mostly for
               | my own sanity. But I'd wait to add -Werror until the
               | warnings were all fixed, so regression testing could
               | continue while they got fixed. cppcheck and clang-tidy
               | would also eventually halt the pipeline. And the
               | sanitizers (*san) on the tests, compiled in both debug
               | and -O3 with a couple of compilers.
        
         | golergka wrote:
         | Great advice. Almost all of it applies to any programming
         | language.
        
       | sega_sai wrote:
       | To be honest, a lot of the recommendations apply to other
       | languages as well, i.e. start with tests, only then change
       | things, add autoformatting, etc. At least I have had the
       | experience of applying a similar sequence of steps to a Python
       | package.
        
       | Merik wrote:
       | If you have access to Gemini Pro 1.5 you could put the whole
       | code base into the context and start asking questions about
       | architecture, style, potential pain points, etc.
        
       | jujube3 wrote:
       | Look, I'm not saying you should rewrite it in Rust.
       | 
       | But you should rewrite it in Rust.
        
       | elzbardico wrote:
       | 1990 Windows C++ code? Consider euthanasia as a painless
       | solution.
        
       | codelobe wrote:
       | My first thing is usually:                   #0: Replace the
       | custom/proprietary Hashmap implementation with the STL version.
       | 
       | Once upon a time, C++ academics browbeat the lot of us into
       | accepting the red-black tree as the only map implementation,
       | arguing (in good faith yet from ignorance) that the "Big O" (an
       | orgasm joke, besides others) worst-case scenario (oops,
       | pregnancy) categorized the hash map as O(n) on insert, etc., due
       | to naive implementations frequently placing hash-colliding keys
       | in a bucket via a linked list or otherwise iterating to
       | "adjacent" buckets. Point being: the one true objective standard
       | of "benchmark or die" was not considered, i.e. the average case
       | is obviously the best deciding factor -- or, as Spock simply
       | logic'd it, "The needs of the many outweigh the needs of the
       | few".
       | 
       | Thus it came to pass that the STL was missing a hash map
       | implementation. And since it is typically trivial (or a non-
       | issue) to avoid the "worst case scenario" (of what? a preggers
       | table bucket?), e.g. by iteratively re-hashing the table, many
       | "legacy" codebases built their own hash map implementations to
       | get at that (academically forbidden) effective/average-case
       | insert/access/etc. sweet spot of constant-time "O(1)" [emphasis
       | on the scare quotes: benchmark it and see -- there is no real
       | measure of the algo otherwise, riiight?]. Therefore the afore-
       | prophesied fracturing of the collections APIs, via the STL's
       | failure to fill the niche that a hash map would inevitably have
       | to occupy, came to pass -- who could have foreseen this?!
       | 
       | What is done is done. The upshot is: one can typically
       | familiarize oneself with a legacy codebase, whilst paying lip
       | service to "future maintainability", by replacing (albeit
       | usually needlessly) the custom hash map implementations with the
       | one the C++ standards body eventually accepted into the standard
       | library, despite the initial "academic" protesting too much via
       | "Big O" notation (which is demonstrably a sex-humor-based system
       | of little use in the practical/average-case world we live in).
       | Yes, once again the apprentice has been made the butt of the
       | joke.
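       | 
       | For what it's worth, a minimal sketch of the swap, with
       | LegacyHashMap standing in for whatever the codebase grew;
       | std::unordered_map (C++11) is the hash map the standard
       | eventually adopted:
       | 
       |     #include <string>
       |     #include <unordered_map>
       | 
       |     // before (hypothetical): LegacyHashMap<std::string, int>
       |     // after: the standard container, same average-case O(1)
       |     using SymbolTable = std::unordered_map<std::string, int>;
       | 
       |     int main() {
       |         SymbolTable counts;
       |         counts.reserve(1024);   // pre-size the buckets
       |         counts["foo"] += 1;     // insert or bump
       |         return counts.count("foo") == 1 ? 0 : 1;
       |     }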
        
         | bluGill wrote:
         | In the mid-1990s, when C++ was getting std::map and the other
         | containers, CPU caches were not a big deal, and the average
         | case was the correct thing to optimize for. These days CPU
         | caches are a big deal, so your average case is typically
         | dominated by cache-miss pipeline stalls. This means for most
         | work you need different data structures. The world is still
         | catching up to what this means.
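         | 
         | A minimal sketch of the kind of alternative people reach for
         | when cache misses dominate: a sorted, contiguous vector
         | searched in place of a node-based std::map. Whether it wins
         | depends on your sizes and access patterns, so benchmark:
         | 
         |     #include <algorithm>
         |     #include <vector>
         | 
         |     struct Entry { int key; int value; };
         | 
         |     // sorted, contiguous storage: lookups stay in cache
         |     // instead of chasing red-black-tree nodes around
         |     const int* find_value(const std::vector<Entry>& m,
         |                           int k) {
         |         auto it = std::lower_bound(
         |             m.begin(), m.end(), k,
         |             [](const Entry& e, int x) { return e.key < x; });
         |         if (it == m.end() || it->key != k) return nullptr;
         |         return &it->value;
         |     }
         | 
         |     int main() {
         |         std::vector<Entry> m{{1, 10}, {3, 30}, {7, 70}};
         |         return find_value(m, 3) ? 0 : 1;
         |     }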
        
       | jcarrano wrote:
       | Good points, but this is not something you can solve with a
       | recipe. Investigate, talk to people and make sure you are solving
       | actual problems and prioritizing the right tasks.
       | 
       | This is a crucial step that you must do first:
       | familiarize yourself with the system, its uses and the reasons it
       | works like it does. Most things will be there for a reason, even
       | if not written to the highest standard. Other parts might at
       | first sight seem very problematic yet be only minor issues.
       | 
       | Be careful with numbers 4 and 5. Do not rush to fix or rewrite
       | things just because they look like they can be improved. If it is
       | not causing issues and it is not central to the system, better
       | spend your resources somewhere else.
       | 
       | Get the team to adopt good practices, both in the actual code and
       | in the process. Observe the team and how they work and address
       | the worst issues first, but do not overwhelm them. They may not
       | even be aware of their inefficiencies (e.g. they might consider
       | complete rebuilds as something normal to do).
        
         | bombcar wrote:
         | I did not find Chesterton's fence, and was sad.
         | 
         | The very first thing to do with a new codebase is _don't touch
         | anything_ until you understand it, and then _don't touch
         | anything_ until you realize how mistaken your understanding
         | was.
        
       | bobnamob wrote:
       | This article (and admittedly most comments here) doesn't
       | emphasize the value of a comprehensive e2e test suite enough.
       | 
       | So much talk about change and large LoC deltas without capturing
       | the expected behavior of the system first.
        
       | Kon-Peki wrote:
       | The article doesn't mention anything about global variables, but
       | reducing/eliminating them would be a high priority for me.
       | 
       | The approach I've taken is: when you work on a function and find
       | that it uses a global variable, try to add the GV as a function
       | parameter (and update the call sites). Even if it's just a
       | pointer to the global variable, you now have another function
       | that is more easily testable. Eventually you can get to the
       | point where the GV can be trivially changed to a local variable
       | somewhere appropriate.
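       | 
       | A minimal before/after sketch of that move (names hypothetical):
       | 
       |     // before (hypothetical): hidden dependency on a global
       |     //   int g_retry_limit = 3;
       |     //   bool should_retry(int attempt) {
       |     //       return attempt < g_retry_limit;
       |     //   }
       | 
       |     // after: the dependency is explicit and testable
       |     bool should_retry(int attempt, int retry_limit) {
       |         return attempt < retry_limit;
       |     }
       | 
       |     // call sites pass the global for now; tests pass a local:
       |     //   should_retry(n, g_retry_limit);
       |     int main() { return should_retry(1, 3) ? 0 : 1; }
       | 
       | The call sites get a little noisier for a while, but each
       | converted function becomes testable on its own.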
        
       | joshmarinacci wrote:
       | You need to install linters, formatters, and security checkers,
       | but you need to start using them incrementally. Trying to fix
       | all the issues found at once is a quick recipe for madness. I
       | suggest using clang-tidy with a meta-linter like Trunk Check.
       | 
       | docs:
       | 
       | https://docs.trunk.io/check/configuration/configuring-existi...
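       | 
       | For the "incrementally" part, one workable pattern is to fix
       | what you touch and explicitly mark what you're deferring; a
       | small sketch using clang-tidy's NOLINT comments (the check named
       | here is just an example):
       | 
       |     #include <cstdio>
       | 
       |     void log_line(const char* msg) {
       |         // deferred on purpose: this file is full of legacy
       |         // NULL, so suppress the check here, searchably, and
       |         // clean it up in a later pass
       |         // NOLINTNEXTLINE(modernize-use-nullptr)
       |         if (msg == NULL) return;
       |         std::fputs(msg, stderr);
       |     }
       | 
       |     int main() { log_line("still alive\n"); return 0; }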
        
       | dureuill wrote:
       | > What do you do now?
       | 
       | Look for another job
       | 
       | > You'd be amazed at how many C++ codebase in the wild that are a
       | core part of a successful product earning millions and they
       | basically do not compile.
       | 
       | Wow, I really hope this is hyperbole. I feel like I was lucky to
       | work on a codebase that had CI to test on multiple computers
       | with -Werror.
        
         | throwaway71271 wrote:
         | > Wow, I really hope this is hyperbole.
         | 
         | I am sure it's not. I don't have much experience, as I have
         | worked at only 3 companies in the last 25 years, but so far I
         | have found no relation between code quality and company
         | earnings.
        
       | delta_p_delta_x wrote:
       | > So what do I recommend? Well, the good old git submodules and
       | compiling from source approach.
       | 
       | It is strange that the author complains so much about automating
       | BOMs, package versioning, dependency sources, etc, and then
       | proceeds to suggest git submodules as superior to package
       | managers.
       | 
       | The author needs to try vcpkg before making these criticisms;
       | almost all of them are straightforwardly satisfied by vcpkg,
       | barring a few sharp edges (updating dependencies is a little
       | harder than with git submodules, but that's IMO a feature and
       | not a bug). Dependencies are built in individual sandboxes which
       | are then installed to a specified directory. vcpkg can set
       | internal repositories as the registry instead of the official
       | one, thus maintaining the 'vendored in' aspect, and it can
       | chainload toolchains and specify per-port customisations.
       | 
       | These are useful abstractions and it's why package managers are
       | so popular, rather than having everyone deal with veritable
       | bedsheets' worth of strings containing compile flags, macros,
       | warnings, etc.
        
       | vijucat wrote:
       | Not mentioned were code comprehension tools / techniques:
       | 
       | I used to use a tool called Source Navigator (written in Tcl/tk!)
       | that was great at indexing code bases. You could then check the
       | Call Hierarchy of the current method, for example, then use that
       | to make UML Sequence Diagrams. A similar one, called Source
       | Insight, is shown below [1].
       | 
       | And oh, notes. Writing as if you're teaching someone is key.
       | 
       | Over the years, I got quite good at comprehending code, even code
       | written by an entire team over years. For a brief period, I was
       | the only person actively supporting and developing an algorithmic
       | trading code base in Java that traded ~$200m per day on 4 or 5
       | exchanges. I had 35 MB of documentation on that, lol. Loved the
       | responsibility (ignoring the key man risk :|). Honestly, there's
       | a lot of overengineering and redundancy in most large code bases.
       | 
       | [1] References in "Source Insight"
       | https://d4.alternativeto.net/6S4rr6_0rutCUWnpHNhVq7HMs8GTBs6...
        
       ___________________________________________________________________
       (page generated 2024-02-29 23:00 UTC)