[HN Gopher] Ask HN: Advice for leading a software migration?
       ___________________________________________________________________
        
       Ask HN: Advice for leading a software migration?
        
       Hey HN,  I'm about to take lead of a decent sized software
       migration at work. (From V1 of some subsystem, to v2, both in
       house. We want to deprecate and eventually remove V1 totally) For
       8 of our clients, totalling about 16 million customers.  I don't
       have too many details to share, as I don't know what's relevant.
       But I'm asking if anyone has any advice or recommended reading
       regarding such?  One book that is really inspiring me about it is
       "How Big Things Get Done" by Bent Flyvbjerg and Dan Gardner. In it,
       there are some key bits of advice such as  * Think slow, act fast,
       and mitigate long tailed risks.  * Compartmentalize and stick to
       repeated processes. "Build with LEGOs"  * Look around at other
       projects of similar nature.  The last point is why I'm here, as I
       know some of you have been in the game for longer than I have, so
       feel free to share experiences that you might think are relevant, if
       you'd like.
        
       Author : drekipus
       Score  : 43 points
       Date   : 2024-06-22 08:49 UTC (14 hours ago)
        
       | daviddever23box wrote:
       | Listen to the data that you're migrating from one system to
       | another, so to speak. Test v1-to-v2 and v2-to-v1 migrations until
       | you're blue in the face. Feature-flag migrations for individual
       | clients. Ensure that any SLAs are met with v1-only, v1-in-flight-
       | to-v2, v2 only, and/or some mix of static partial migration. Make
       | sure that you have an absolutely homeomorphic mapping of data
       | from one representation to another.
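
The round-trip idea in the comment above (test v1-to-v2 and v2-to-v1 until the mapping loses nothing) can be sketched roughly as follows. This is only an illustration: the record types `V1Customer` and `V2Customer`, their fields, and the two mapping functions are hypothetical stand-ins, not anything from the original post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class V1Customer:  # hypothetical v1 shape
    full_name: str
    phone: str


@dataclass(frozen=True)
class V2Customer:  # hypothetical v2 shape
    given_name: str
    family_name: str
    phone: str


def v1_to_v2(rec: V1Customer) -> V2Customer:
    # Split on the first space; everything after it is the family name.
    given, _, family = rec.full_name.partition(" ")
    return V2Customer(given, family, rec.phone)


def v2_to_v1(rec: V2Customer) -> V1Customer:
    full = f"{rec.given_name} {rec.family_name}".strip()
    return V1Customer(full, rec.phone)


def round_trip_failures(records):
    """Return the v1 records that do not survive v1 -> v2 -> v1 unchanged."""
    return [r for r in records if v2_to_v1(v1_to_v2(r)) != r]
```

Running `round_trip_failures` over a real export tends to surface the lossy cases early; in this sketch a name with a trailing space would not round-trip, which is exactly the kind of detail worth finding before go-live.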
        
       | brudgers wrote:
       | What is motivating the move technically?
       | 
       | What is motivating the move politically?
       | 
       | What is motivating the move psychologically?
       | 
       | Be clear regarding each.
       | 
       | They are all there in the decision.
       | 
       | Don't pretend they aren't.
       | 
       | Only one of them is technical.
       | 
       | And it is not most of success.
       | 
       | Good luck.
        
         | Msurrow wrote:
         | This. Even though you/we are [mostly] focusing on the tech
         | aspect of the world, make no mistake; the "business side" (or
         | the political) can kill your migration project more suddenly
         | and decidedly than you can spell 'strangler pattern'.
         | 
         | So, to add to the comment above:
         | 
         | - does your migration affect the clients and the way clients
         | work in any way? No matter how small, if the answer is "yes"
         | then you need to ensure full buy-in from the clients. Even if
         | your migration went flawlessly from a technical perspective, if
         | a large enough client didn't realise that V2 comes with some
         | change that he doesn't like, and when the change hits him after
         | the migration, he raises this as a problem and escalates the
         | problem high enough up the food chain with the message "this is
         | not working for us" then you are going to be rolling back,
         | regardless of the technical stuff. So, realise that the clients
         | are big stakeholders and they need to be managed from the
         | beginning of the project until some time after your V2 go live.
         | In my experience the best results come from bringing them close
         | to the project early and get some buy in by having them e.g. do
         | some end-to-end testing of V2 and get them to accept the V2
         | before the go live. Preferably in an email for if things get
         | ugly at some point (it happens, it sucks).
         | 
         | - also as the comment above says, don't ignore the political.
         | You should know what every important stakeholder gets out of
         | this. Don't forget personal ambition, ego, promotions etc as
         | possible motivators for stakeholders. Who of the stakeholders
         | are supporting your project now, and who is not? And just as
         | important, what may change for a stakeholder to "switch camp"
         | from supporter to not. Maybe the stakeholder is a mid-level
         | manager who is measured on some KPI and V2 will make his KPI
         | look better. So he's a supporter. But then his company gets a
         | new ceo and the KPIs change. Now he is no longer a supporter
         | because V2 doesn't give him anything he wants. And he's
         | actually now against your project because he has to commit some
         | resources to it, but doesn't get anything, so actually if your
         | project is killed he frees up resources and doesn't lose
         | anything.
         | 
         | From one developer to another; The tech part is the easier part
         | I'm sorry to say.
        
           | brudgers wrote:
           | I guess I need to clarify what I mean by the psychological
           | component.
           | 
           | Technical and political components are external. Career
           | aspirations and mitigating boredom/stagnation by pursuing
           | complicated work create motivations to invent interesting
           | projects.
           | 
           | And there's the ability to claim integration from v1 to v2 as
           | progress. Rather than only change.
           | 
           | To put it another way there is always some degree of change
           | for the sake of change motivating our desires for change.
           | Particularly when a big chunk of our time must be accounted
           | for. Typically, playing video games, sleeping, and walking a
           | dog through the woods instead are not viable alternatives in
           | contexts where data migrations are being considered.
           | 
           | If migration was something the OP didn't want to do, the
           | question would be about finding a new job.
        
       | hoofhearted wrote:
       | Follow the strangler fig pattern, and map out every single task
       | that is required in the migration on a whiteboard.
       | 
       | Write tests if you can, and set up a staging environment for V2
       | that you can set up and tear down easily for battle testing way
       | before going live.
       | 
       | From there, break the tasks up from above into their business
       | domains, and abstract those into new api services that the v1
       | system can use without any downtime.
       | 
       | For a frontend migration, that's a whole different story and you
       | would have to provide more details such as "moving from legacy
       | Angular 1 to React 18 while it's running".
        
         | hoofhearted wrote:
         | Original strangler fig post is here:
         | https://martinfowler.com/bliki/StranglerFigApplication.html
        
         | drpossum wrote:
         | This is a good answer and one I've put into practice
         | successfully more than once. Automated tests are very key here.
        
       | SeriousM wrote:
       | I find the ruby on rails migration path of github inspiring.
       | 
       | https://github.blog/2018-09-28-upgrading-github-from-rails-3...
       | 
       | They show some details how to migrate on large scale yet be safe
       | while doing so.
        
       | jayunit wrote:
       | Some good resources from Will Larson:
       | https://lethain.com/migrations/ or, if you prefer it in talk
       | format: https://lethain.com/qcon-sf-migrations-video/
        
         | simonw wrote:
         | This is the best piece of writing I've ever seen on the topic
         | of migrations, could not recommend this more highly.
        
       | philip1209 wrote:
       | I've done this. My quick thoughts:
       | 
       | - migrations always run longer than expected. In my case,
       | leadership estimates were off by a factor of 10. What the eng
       | manager originally said would take 3 months ended up taking a
       | couple years.
       | 
       | - try to deliver quick wins and incremental value. This is often
       | hard though. But it's worth a try.
       | 
       | - Try to avoid this becoming the project everybody attaches their
       | pet projects to. It's too easy for people to make this the
       | project where they use that new framework, test well, set up a
       | design system, and make lots of little changes.
       | 
       | - that being said: migrations are easiest if you keep the design
       | (visually and engineering) exactly the same. There will be lots
       | of pressure to "just redo it while you're already having to
       | rewrite it", but the uncertainty of a redesign really slows
       | things down. Having a reference implementation means you don't
       | have to invent tons of acceptance criteria from first principles.
       | 
       | - as soon as things start getting delayed, which they will, try
       | offering to cut corners or cancel the project. You want somebody
       | else in corporate to stick their neck out to extend the project.
       | 
       | - Try seeding the team with more veteran ICs internally. You'll
       | need their help as you uncover dragons or need to get other teams
       | to help run or integrate your new code.
       | 
       | - Among projects I've seen like this, the person running them
       | gets fired or quits partway through at least half of the time.
       | This is often because some middle manager made a promise they
       | couldn't keep to executives, and needs a scapegoat to save their
       | own job. (It's often that kind of middle manager who switches
       | jobs every two years and keeps failing up silently and the
       | project delay happens halfway through their stay at the company
       | and they're just trying to get to the two year mark and quit
       | before anybody realizes what is going on internally.)
        
         | frenchie4111 wrote:
         | > the person running them gets fired or quits partway through
         | at least half of the time
         | 
         | This is a good point. Or the migration appears to have been
         | very successful to management (before it's actually complete
         | from an engineering perspective) and they get promoted / moved
         | onto higher priority work.
         | 
         | Either way: make sure you are keeping the rest of the relevant
         | engineering organization informed about how the new system
         | works and how the migration is going to work.
        
           | philip1209 wrote:
           | I don't think there's much room for promotion because
           | migrations are fabrication and promotions favor innovation.
           | It's ability to save money versus ability to make money. See:
           | Smiling curve in economics.
        
         | sjf wrote:
         | I support everything in this comment.
         | 
         | After more than a decade at large sw companies, I can count on
         | one hand the number of migrations where the legacy system was
         | _ever_ able to be turned down. I've seen migrations drag on for
         | years, to the point where most of the team has turned over.
         | I've seen them become a three-way migration because the second
         | version was deemed insufficient so a third solution was
         | introduced.
         | 
         | Absolutely put your most senior devs on this; maintain as much
         | support from management as possible; budget for much, much more
         | time than you think; you need full commitment or you are going
         | to be maintaining both systems indefinitely.
        
       | mrj wrote:
       | One thing: try to find a path towards delivering solid
       | improvements as early as possible, phase out the big stuff and
       | work on a drum beat of consistent improvements.
       | 
       | Large projects have lots of vulnerabilities, but I've seen many
       | get sucked into "v2 is going to fix all the problems and mistakes
       | of v1." Without a solid technical plan, goals and deliverables,
       | it's easy for that effort to devolve into years-long architecture
       | astronaut-style arguments about nanoseconds saved by something
       | over something else. Halfway through somebody will suggest all
       | problems with this approach will be solved by $newLanguage. If it
       | doesn't serve the goals and deliver meaningful value, avoid
       | getting stuck in those traps. Know what you're trying to solve.
       | 
       | There will probably be a v3 and somebody will complain about your
       | version someday, too. It's the way of progress. As long as it's
       | an improvement over the old and lays the right groundwork,
       | continue moving in the right direction.
        
       | frenchie4111 wrote:
       | Lots of good advice here. Some things I will throw in:
       | 
       | Find ways to ship smaller versions of the migration first. If
       | possible: isolate features that can be migrated on their own.
       | 
       | If possible silently run v2 in parallel with v1 for as long as it
       | takes to be comfortable with v2.
       | 
       | Assume that at some point you are going to have to completely
       | halt the migration, go back to v1-only, fix something, and
       | restart the migration.
       | 
       | I'd bet it's going to take 2-3x longer than you think to
       | completely deprecate v1.
        
       | robviren wrote:
       | I'd immediately set the expectation that the process will be
       | messy, take longer than expected, and require continued
       | maintenance, iterations, and process improvements. Management
       | usually tries to sell a transition as being great for everyone
       | and as something that will solve all problems, when it usually
       | ends up being awful and painful, and takes incredible effort.
       | Disappointment is always better the sooner it is communicated.
       | Align in principle on why an effort must happen and on the
       | _realistic_ benefits to their daily life. Don't sell them a
       | fairytale. I've found every transition is most painful because
       | expectations and communication are poorly managed.
       | 
       | I don't blame people. Usually the offenders are in a culture
       | where telling the truth is unpopular. It just depends on if you
       | want to have a successful transition, or make people feel good
       | about a project that takes 6 years to not finish.
        
       | buro9 wrote:
       | Is it a web service? Can you put a proxy in front of the old, to
       | allow you to observe, and potentially duplicate (to the new,
       | whilst testing) all requests that go to the old system?
       | 
       | If you're migrating data, can you take counts of things, so you
       | can quickly verify, i.e. we have 2.32M records before, and we
       | have a way to prove we have 2.32M records after?
       | 
       | Mostly though, all migrations take longer than you think.
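
As a rough sketch of the "counts before and after" check described above, one option is to compute a row count plus an order-independent checksum on both sides and compare them. The `fingerprint` helper and the field names used with it are hypothetical; a real migration would more likely run an equivalent aggregate query inside each database.

```python
import hashlib


def fingerprint(rows, key_fields):
    """Return (row_count, xor_digest) for an iterable of dict-like rows.

    XOR-ing per-row hashes makes the digest independent of row order, so
    the two systems do not need to return rows in the same sequence.
    """
    count = 0
    digest = 0
    for row in rows:
        canonical = "|".join(str(row[field]) for field in key_fields)
        row_hash = hashlib.sha256(canonical.encode("utf-8")).digest()
        digest ^= int.from_bytes(row_hash[:8], "big")
        count += 1
    return count, digest
```

Comparing `fingerprint(v1_rows, fields)` with `fingerprint(v2_rows, fields)` is a cheap first gate, not a proof: an XOR digest cancels out rows that are duplicated an even number of times, so it is best paired with the count and with spot checks on individual records.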
        
       | rglover wrote:
       | Design your new UI first, then your new data model, then write
       | your migrations/mapping functions to move v1 data to v2. Just did
       | this for a decent size app moving from Postgres to Mongo with few
       | hiccups.
        
       | awinter-py wrote:
       | the old system was fine
        
       | from-nibly wrote:
       | Make sure you can own the work for the whole migration. If you
       | lay out the tasks and more than 1 team has to be involved, add 10
       | to the multiplier you apply to your manager's estimate of how
       | long the migration will take, for each extra team.
       | 
       | "But that means if I have a 1 month project and I have to involve
       | 10 other teams it would take 10 years or so"
       | 
       | Yes that is another way of saying it will fail.
       | 
       | If you can figure out how to get up front sign off from all teams
       | so you can just do it all within your own team you will make
       | things go a lot faster.
       | 
       | Separately figure out how to cake slice things. If you have Dev
       | and prod for instance and 10 applications, don't migrate all 10
       | in Dev first. Migrate 1 app in Dev, then the same app in prod,
       | then go onto the next app. That way wherever you stop at least
       | something will be delivered to the customer.
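
The "cake slice" ordering above can be illustrated with a small sketch that compares the two rollout orders and counts what has reached prod if the project stops partway. The function names and the app/environment labels are made up for the example.

```python
def slice_order(apps, envs):
    """Cake-slice order: take one app through every environment, then the next."""
    return [(app, env) for app in apps for env in envs]


def layer_order(apps, envs):
    """Layer order: migrate every app in one environment before the next."""
    return [(app, env) for env in envs for app in apps]


def delivered_to_prod(order, steps_done):
    """Apps that have reached prod after the first `steps_done` steps."""
    return [app for app, env in order[:steps_done] if env == "prod"]
```

With 10 apps and the environments `["dev", "prod"]`, stopping after 10 steps leaves 5 apps in prod under `slice_order` and none under `layer_order`, which is the point being made: wherever the work stops, something has shipped.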
        
       | malkosta wrote:
       | Around a year ago I had one of those huge migration tasks that
       | you have no idea where to start. I hit my head on the wall a
       | few times, and had to erase the first month of work completely.
       | In the end what worked was:
       | 
       | 1. Spend a day or two creating a fuzzy view of the whole problem.
       | Pay attention to the rabbit holes, do not fall in them, be
       | superficial.
       | 
       | 2. Spend a day or two creating a detailed view of the next 2
       | weeks. Go as deep as you can, but pay attention to not prepare
       | more than 2 weeks of work, because things WILL change. And you
       | will lose a lot of work. Minimize that.
       | 
       | 3. Execute.
       | 
       | 4. Repeat from 1.
       | 
       | After a couple iterations your estimation will be much better and
       | you will see the light at the end of the tunnel.
       | 
       | EDIT - almost forgot the most important part: write small
       | backwards compatible PRs that are deployed to production
       | constantly. Don't write few big PRs, they will come back to bite
       | you.
        
       | al_borland wrote:
       | If at all possible, try to find a way to do it incrementally,
       | with options to roll back if things go sideways when something is
       | released.
       | 
       | Management rarely wants to wait years before seeing any pay
       | off from a big dramatic cutover, and big sweeping changes are
       | disruptive to clients.
       | 
       | This will likely create more work. Maybe some layer has to be
       | built to allow v1 and v2 subsystems to both operate with the
       | other parts of the app. But it should ultimately make it less
       | stressful.
       | 
       | If you can allow some friendly departments from friendly clients
       | to test and provide feedback before rolling it out to the whole
       | company or the full set of companies, that would probably go a
       | long way to help identify blind spots.
       | 
       | Most importantly, listen to your team and the people who know the
       | systems well. The projects I've seen that have really gone
       | sideways are ones where the people who know the true issues are
       | never consulted, or completely ignored when they try to raise an
       | alarm.
        
         | bearjaws wrote:
         | > try to find a way to do it incrementally,
         | 
         | I would make it a hard requirement.
         | 
         | If you can't do it incrementally, it's going to fail.
         | Corporations rarely have the attention span and staff tenure to
         | make that kind of migration work.
         | 
         | Even if it takes a year of pre-work to get to a point where it
         | can be done incrementally, it will be the only way it gets
         | done.
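
One way to picture the incremental rollout with rollback discussed in this sub-thread is a thin routing layer that decides, per client and per customer, whether a request goes to v1 or v2. The sketch below is an assumption-laden illustration: `MigrationRouter`, its methods, and the percentage-based bucketing are invented for the example and are not described in the thread.

```python
import hashlib


class MigrationRouter:
    """Routes each (client, customer) pair to "v1" or "v2"."""

    def __init__(self):
        self.percent_by_client = {}  # client name -> percent of customers on v2

    def set_rollout(self, client, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self.percent_by_client[client] = percent

    def rollback(self, client):
        # Rolling back is just routing everyone to v1 again.
        self.percent_by_client[client] = 0

    def backend_for(self, client, customer_id):
        percent = self.percent_by_client.get(client, 0)
        # Hash-based bucketing keeps a given customer on the same side
        # across requests and process restarts.
        key = f"{client}:{customer_id}".encode("utf-8")
        bucket = int(hashlib.sha256(key).hexdigest(), 16) % 100
        return "v2" if bucket < percent else "v1"
```

A friendly client could start at a few percent via `set_rollout`, be raised gradually, and be returned to v1 with `rollback` if something goes sideways. Note that this only covers request routing; any data written to v2 during the rollout would still need a path back to v1.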
        
       | twodave wrote:
       | Is V2 already written? Or are you taking lead on designing and
       | building it?
        
       | apnsngr wrote:
       | I don't know if this fits your particular situation, but I
       | recommend building tutorials into your migration process. I built
       | a tool for migrating apps from Heroku to AWS ECS. The app
       | developer runs the tool in their repository and it opens a
       | migration guide in their web browser. The actual migration was
       | mostly automated but we split it up into steps and embedded them
       | into the guide. This way we could teach app devs the basics of
       | how to use ECS and other AWS services as they went. We could also
       | link out to additional docs and provide company specific details.
       | There was a CLI mode for developers that had to migrate a bunch
       | of apps. The tool was a big success and a couple hundred apps
       | were migrated with it. The migration guide ended up being a good
       | reference for people building brand new apps in AWS too. I built
       | the guide using VuePress, but Docusaurus is also a good option if
       | you are familiar with React.
        
       | diob wrote:
       | I would say if you can swap parts of v2 out incrementally that's
       | the best way.
       | 
       | Integration tests for behavior verification.
       | 
       | But the incremental migration is key.
        
       | 23B1 wrote:
       | Be transparent in how you pad estimates. This builds trust with
       | stakeholders so that when things go awry, you can remind them.
       | 
       | Require anyone who reports to you during the planning process to
       | do the same; provide the most accurate estimate possible, then be
       | transparent about their padding.
       | 
       | If there's a 'known unknown' call it out. Mitigate risk with
       | high-level executive check-ins. Be candid with your status
       | lights, and tell them what you're doing to mitigate any risk on a
       | regular basis.
       | 
       | Migration is about managing up _to the org_ not just to a boss;
       | the more candid you are the more you deflate the rage that comes
       | with unexpected downtime, rollbacks, etc.
        
       | mirekrusin wrote:
       | 16 million users using the system means v1 is fine. Iterate on
       | it, make migration a process, not a task with a deadline. Never do
       | two things at the same time, no matter how attractive they may
       | feel from the distance.
       | 
       | Sorry to say it but it smells a bit of "we're migrating because
       | microservices or kafka or whatever" - don't. Grow organically
       | into it. Do this kind of stuff because you have to, not because
       | you can.
       | 
       | If you said you're struggling/something doesn't work and you
       | can't anymore - it would be easier to advise and it wouldn't feel
       | like a step in the wrong direction.
        
       | yowlingcat wrote:
       | The best approach I can give you? Don't.
       | 
       | If you have a big bang V2 that is incompatible with V1, you've
       | already lost. There should be a V1.0.1, V1.0.2, etc that
       | incrementally gets you to what you would've already gotten with
       | V2 without losing the ability to do each individual piece in
       | stepwise succession. That's essentially what the "strangler
       | pattern" is.
       | 
       | The strangler pattern is helpful, because it forces you to focus
       | on what you piece-wise need to "strangle" -- which usually isn't
       | as much as it looks like at first blush.
       | 
       | The hardest part of most migrations is data model migrations, and
       | the best approach here is to start writing to the new model
       | before you start reading it from the core business logic. By the
       | time that works as expected, much of the pain is done. This takes
       | a long time because it requires a lot of repairing the ship as
       | you steer it, so for the sake of the business, it is best doing
       | it in small pieces aligned with chunks of business value or new
       | feature iteration velocity.
       | 
       | The second part of many migrations is adding sufficient test
       | coverage -- in a lot of cases, this will already be present, but
       | if it's not, you're in for a world of pain. If you don't have
       | enough test coverage of the V1, add that before you try and do
       | anything fancy or you'll end up testing "the long way" (through
       | production outages and late night scrambles to hotfix and
       | inevitable rollbacks).
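
The "write to the new model before you read from it" approach above is often called dual writing. A minimal sketch, assuming in-memory dicts as stand-ins for the two data stores and hypothetical mapping functions `to_new_model` and `from_new_model`:

```python
def to_new_model(old):
    """Hypothetical v1 -> v2 mapping: flat fields become a nested contact."""
    return {"display_name": old["name"], "contact": {"email": old["email"]}}


def from_new_model(new):
    """Inverse mapping, used for reads from v2 and for verification."""
    return {"name": new["display_name"], "email": new["contact"]["email"]}


class DualWriteStore:
    def __init__(self, old_store, new_store, read_from_new=False):
        self.old = old_store
        self.new = new_store
        self.read_from_new = read_from_new
        self.write_errors = []

    def save(self, key, record):
        self.old[key] = record  # the old model stays authoritative
        try:
            self.new[key] = to_new_model(record)
        except Exception as exc:
            # A failed write to the new model must not break the business flow.
            self.write_errors.append((key, repr(exc)))

    def load(self, key):
        if self.read_from_new:
            return from_new_model(self.new[key])
        return self.old[key]

    def mismatches(self):
        """Keys whose new-model copy is missing or disagrees with the old one."""
        return [
            key
            for key, record in self.old.items()
            if key not in self.new or from_new_model(self.new[key]) != record
        ]
```

Once `mismatches()` and `write_errors` stay empty for long enough, flipping `read_from_new` becomes a small, reversible step rather than a big-bang cutover. Records written before dual writing began would still need a separate backfill.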
        
       | dasil003 wrote:
       | I took over a team that was struggling with a very large
       | architectural migration that had been going for a couple years.
       | Two years later we have largely gotten things back to a healthy
       | state, though we have only achieved maybe 20% of the original
       | technical ambitions, the team is an order of magnitude stronger
       | than when it started, which in many ways is more important than
       | the exact state of the system. The migration introduced two major
       | new technologies being incubated by outside infra teams, a new
       | data model meant to coalesce 500+ fields comprised of data stored
       | from a dozen or more databases, serving hundreds of clients
       | across dozens of teams, and exposing data on hundreds of customer
       | facing surfaces representing both sides of a C2C marketplace.
       | 
       | The first thing I would say is take all advice you get with a
       | huge grain of salt. Details matter, and the particular details
       | that matter the most vary tremendously from project to project.
       | That said, here's my advice:
       | 
       | - Be clear on the goals up front and along the way. It's already
       | a red flag that you don't lead with the goal and say things like
       | "I don't have many details to share as I don't know what's
       | relevant". In the heady early days of a big project, there will
       | be many rose-tinted ideas of problems that can be solved, and
       | people will keep tacking them on without the burden of knowing
       | the stumbling blocks that will inevitably come. You need to keep
       | the goal in mind at all times so you can ruthlessly make
       | tradeoffs every step of the way. It's even okay if the goal
       | changes, but be explicit about it.
       | 
       | - Make sure you find a way to do it incrementally. If you find
       | you have code accumulating that is not being exercised in the
       | running system for more than a few weeks at a time, that's a huge
       | red flag. Kent Beck's Trough of Despair [1] from a few days ago is
       | relevant to this point. You need to be very careful that your
       | trough doesn't grow wider than you can handle. It's surprisingly
       | easy for that to happen given the nature of software system
       | complexity growth. The risk is even greater if you have a lot of
       | resources at your disposal because more cooks in the kitchen
       | means higher risk of losing cohesion.
       | 
       | - There's no substitute for seniority up and down the chain. One
       | or two weak links can really derail the entire effort. And it's
       | not just about technical strength, communication and social
       | aspects are equally important. Every single front line engineer
       | will likely run into issues that will be relevant outside of
       | their scope, but will they recognize that for areas they are not
       | focused on? When a project is too big for any one individual to
       | understand all the details, you need a critical mass of big
       | picture thinkers, and some lightweight ways for informal
       | conversations to be sparked and escalated (or de-escalated) as
       | the importance comes into focus.
       | 
       | - If you ever ask an engineer why they're doing something and the
       | answer is "because XXX told me to" or "because that's the plan",
       | it's time for a quick sit-down. Engineers who don't know why
       | they're doing something will not make good choices when the
       | unforeseen arises (which it always does).
       | 
       | - Know your clients. Are they internal, external? Are there ghost
       | or second-order clients due to leaking internal details or other
       | encapsulation violations? Will you still support all the features
       | they need? What actions will they need to take to support you?
       | What rate of change can they support? You can have the perfect
       | end-state in mind, and then get tripped up by mundane constraints
       | on your clients that you were not fully aware of.
        
       | soorya3 wrote:
       | 1. Remember Murphy's Law.
       | 
       | 2. Have a rollback option.
       | 
       | 3. Keep things to do on the go-live date to a minimum. You would
       | be surprised how many of the risks can be mitigated before the
       | date of change.
        
       | zEddSH wrote:
       | As an SDET, I suggest lots of testing.
       | 
       | Assuming V1 and V2 offer users the same functionality, there's a
       | bunch of tests you can offer. The best one IMO is oracle testing
       | where you do something on v1 and v2 and check they do the same
       | thing. Preferably roll out to a subset of users such as via a
       | canary deployment and make sure you have a rollback plan.
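
The oracle testing described above (do the same thing on v1 and v2 and check that they agree) might look something like the sketch below, where v1 is treated as the source of truth. The handler functions and the order format are hypothetical, and `v2_total_cents` contains a deliberate discrepancy so the comparison has something to find.

```python
def compare_against_oracle(v1_handler, v2_handler, requests):
    """Replay each request against both versions, treating v1 as the oracle.

    Returns a list of (request, expected, actual) for every disagreement.
    """
    diffs = []
    for request in requests:
        expected = v1_handler(request)
        try:
            actual = v2_handler(request)
        except Exception as exc:
            diffs.append((request, expected, f"raised {type(exc).__name__}"))
            continue
        if actual != expected:
            diffs.append((request, expected, actual))
    return diffs


def v1_total_cents(order):
    return order["qty"] * order["unit_cents"]


def v2_total_cents(order):
    # Deliberately different: applies a bulk discount that v1 never had.
    total = order["qty"] * order["unit_cents"]
    return total - 100 if order["qty"] >= 10 else total
```

Feeding recorded production requests through `compare_against_oracle` gives a concrete list of behavioural differences to either fix in v2 or explicitly accept before the canary stage.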
        
       | mmaarrccoo wrote:
       | I led a painful migration a couple of years ago and can share
       | some tips.
       | 
       | It's not clear whether v2 is already in production somewhere
       | else. If it is not, you'd better wait until 1) the v2 data model
       | has really been finalized and in prod and 2) key resources can be
       | made available to the migration team. We were forced to begin the
       | migration before the new product was complete and it was just
       | plain impossible. We had to start all over every quarter.
       | 
       | - Migrations are very difficult to estimate. Any optimistic
       | estimate will bite back. Hold off as much as you can, and ensure
       | appropriate buffers if you really have to.
       | 
       | - ensure that the 8 clients have an identical v1 data model
       | (tables, constraints, etc). If that is not the case, remember you
       | will run n migrations, not 1.
       | 
       | - You need a team with knowledge of both v1 and v2 data models,
       | as well as business domain know-how. There are many decisions
       | that need to be made and you need the right people to be around.
       | 
       | - Not everything has to be migrated. Trying to migrate 100% is a
       | common mistake: engage with the customers to understand what's
       | the minimum that legally and operationally has to be migrated,
       | especially if the v1 system has been in production for many
       | years.
       | 
       | - Data migration is an iterative process, and the last thing you
       | want is to manually QA every iteration. You need to develop
       | tests that will provide a reasonable data integrity assurance.
       | 
       | - Dashboards showing data migrated, failing/ok tests, remaining
       | tables, etc. help communicate status and track progress.
       | 
       | - Customers will need to be involved during the whole project.
       | You need them to commit to making people available that can
       | quickly answer questions to unblock your dev teams. Ideally, you
       | want to create a single team. Make sure that decisions are traced
       | and versioned.
       | 
       | - Performance matters. Discuss the performance requirements
       | upfront. Our process was very, very slow and we found out a bit
       | too late that the customer would not tolerate such down time.
       | Also, discuss when it is OK to migrate, how to roll back in case of
       | failure, etc.
        
       | asdfgfdsasdfg wrote:
       | I just read Kill It with Fire [0] that describes a methodology
       | for legacy modernization projects (which should work fine for any
       | migration). Highly recommended! It would have served me well as a
       | guide before I went in to the large migrations I've operated in
       | my career.
       | 
       | [0]: https://nostarch.com/kill-it-fire
        
       | omgwtfbyobbq wrote:
       | Put in more effort up front to make things easier later.
       | 
       | Try to automate what you can, efficiently. Code conversion,
       | tests, etc.
       | 
       | Keep an eye out for opportunities to simplify things.
       | 
       | Make sure to have buffer built into your time/effort estimations.
       | 
       | Ask lots of questions.
       | 
       | Find other folks who interact with different parts of the system
       | and ask if some of their time can be allocated to the conversion.
       | 
       | If you can onboard other folks, find out if there's anything you
       | can do to automate any of their work.
        
       | michaeljx wrote:
       | I've done 2 large software migrations in my life; here are my
       | findings:
       | 
       | 1. If you can't do gradual deployment, try to do a
       | primary-secondary (master/slave) type of deployment where the new
       | system runs in read-only mode (mirroring the old system's data)
       | for a while.
       | 
       | 2. Whatever you budgeted for migrating data, double it. Set a
       | data cleansing specialist to start working on the
       | data-to-be-migrated ASAP.
       | 
       | 3. Document all processes of the current system. Have the painful
       | conversations up front about functionality that will be
       | eliminated; migrating usually means eliminating a bunch of
       | features that do not pass the cost/benefit threshold. Your
       | users/stakeholders might not see it that way, so make explicit
       | what the cost of those features is and get as much buy-in as
       | possible.
        
       | BillFranklin wrote:
       | This might help you: https://bilbof.com/migrations/table
       | 
       | I mined HN last year for all migrations. You can filter by
       | various fields like technologies etc. The table will probably be
       | most useful since it links off to the blog posts etc
        
       | avan1 wrote:
       | One year ago we successfully migrated to a new version (a total
       | big bang) with less than 4 hours of downtime. For us it was
       | version 3; v1 and v2 (plus a side service) were both running side
       | by side. (v2 was a failed migration, so they ended up with a
       | Frankenstein system in which some requests went to v2 and others
       | to v1, and each put data in the other's database. Yes, there was
       | no single source of truth for the data.) Here are my three
       | cents:
       | 
       | 1. Don't - I don't know the size of your software, but for us it
       | was a lot of work, especially on weekends and holidays. After
       | release almost half of the developers quit and the other half
       | were exhausted. It's totally not worth it.
       | 
       | 2. Don't - if it's possible to fix and refactor the current
       | version, please do that. You will thank yourself later. We had 15
       | months of development, and in the middle of the project we needed
       | some features and fundamental fixes for our current version, so
       | we ended up with another minor migration that we called v2.9.
       | 
       | 3. Don't - only do it if you have to, and do it incrementally as
       | others suggested. Start by building a microservice for the most
       | used domain of your application, with API backward compatibility
       | (if possible), and even use the same database you are already
       | using.
       | 
       | If you can't refactor the current version (which I can't
       | understand) and you insist on a big-bang migration, know the
       | current system well and know every column in the database(s),
       | since you will need to migrate millions of records at the end,
       | which is a big project by itself.
        
       | zmj wrote:
       | Migrate the most complex use case first. There's nothing worse
       | than discovering mid-migration that you have to pause for rework.
       | Better instead to slow down with your first adopter and
       | accelerate for subsequent ones.
        
       | codingdave wrote:
       | I've done a ton of migrations, and most of the advice I'd give
       | has already been said in the other comments, except for one
       | thing:
       | 
       | If people are pushing for changes to the app to better match how
       | the business works today, leap into that conversation, but don't
       | get talked into changing the app. Instead aim at reworking their
       | business processes to first make their process as simple as
       | possible, and then simplify the app to match the new process.
       | Your migration is simpler when their process is simpler, and
       | everyone wins.
       | 
       | If people aren't willing to refactor business processes as part
       | of the effort, then refuse to change requirements. Hold steady to
       | "We both improve, or we stick to the status quo."
        
       | junto wrote:
       | I'm going through the same thing at the moment. I've come in two
       | years after the project was started, where the key strategy was
       | to replace an in-house developed maze of spaghetti that had become
       | unmaintainable, with a collection of SaaS based services
       | interconnected with a bunch of synchronization queues and
       | services moving data around.
       | 
       | The initial plan when I arrived was to write V2 in its entirety,
       | migrate the data from V1 as one big bang rewrite, and job done.
       | 
       | I realized immediately that this was far too risky and the goals
       | unrealistic, so I'm pushing for the strangler pattern and the
       | business is pushing back. However, I'm finally getting people in
       | the business to understand the new plan I've put together, and
       | they are seeing opportunities.
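       | 
       | For readers unfamiliar with the strangler pattern: a thin routing
       | layer sits in front of both systems and sends already-migrated
       | routes to V2 while everything else keeps going to V1. A minimal
       | sketch, where the path prefixes and the function name are
       | hypothetical:

```python
def choose_backend(path, migrated_prefixes):
    """Return "v2" if the path falls under a migrated prefix, else "v1".

    A prefix matches only on a whole path segment, so "/invoices" covers
    "/invoices/42" but not "/invoices-archive".
    """
    for prefix in migrated_prefixes:
        if path == prefix or path.startswith(prefix + "/"):
            return "v2"
    return "v1"


# Hypothetical state early in the migration: only invoices live in V2.
migrated = ["/invoices"]

print(choose_backend("/invoices/42", migrated))
print(choose_backend("/orders/7", migrated))

# Migrating another area is a one-line change, and so is rolling it back.
migrated.append("/orders")
print(choose_backend("/orders/7", migrated))
```

       | Each area moves by adding a prefix to the list, which keeps every
       | step small and reversible.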
       | 
       | Still, almost all the old developers have left and the existing
       | system is running on duct tape. There are no tests. The old
       | system is a fat client with a bunch of half-finished messaging
       | services (the improvement project for the original system was
       | cancelled), lots of workarounds going directly into a central
       | database, and most of the business logic buried in the UI. Even
       | understanding how it all fits together is impossible, so the only
       | way forward is to go back to basics and work with the business
       | departments to design and document the process they REALLY need
       | and iteratively build it. You'll never really feature-match, and
       | business departments develop workaround processes over time that
       | become the norm, to the point that they become inefficient
       | through a historical lack of design. Going back to basics in
       | terms of process design is something I highly recommend.
       | 
       | Your biggest challenge is that the business will be pushing
       | engineering to deliver and they will continuously try to push
       | deadlines on the engineering department that you won't be able to
       | realistically deliver. All you can do is keep pushing back, keep
       | on trucking and try not to let it get to you and your engineering
       | teams.
       | 
       | Remember to take a deep breath once in a while. Let the waves
       | crash over you and try not to take it all personally.
        
       | withinboredom wrote:
       | Everyone here has some good advice, but I didn't see this one
       | listed:
       | 
       | I work at a place that just finished the migration for ONE
       | customer (of many) and it took about 2 years. The main issue we
       | ran into was that NOBODY documented the difference in IDs between
       | the old vs. new system. We had to Frankenstein that shit (look at
       | original filenames of imported data to deduce what new id matched
       | to which old id) which took MONTHS.
       | 
       | So, if you have any data, make sure you know EXACTLY what the id
       | is in BOTH systems, even if you think they should be exactly the
       | same.
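       | 
       | One way to keep that record is an explicit mapping table written
       | at import time. A minimal sketch, assuming IDs can be stored as
       | text; the `id_map` table, its columns and the helper names are
       | hypothetical, and SQLite is used only for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE id_map (
        entity TEXT NOT NULL,   -- hypothetical entity kind, e.g. 'customer'
        v1_id  TEXT NOT NULL,
        v2_id  TEXT NOT NULL,
        PRIMARY KEY (entity, v1_id),
        UNIQUE (entity, v2_id)
    )
    """
)


def record_mapping(entity, v1_id, v2_id):
    """Store one v1 -> v2 pair; raises sqlite3.IntegrityError on a clash."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO id_map (entity, v1_id, v2_id) VALUES (?, ?, ?)",
            (entity, str(v1_id), str(v2_id)),
        )


def to_v2(entity, v1_id):
    """Return the v2 id for a v1 id, or None if it was never recorded."""
    row = conn.execute(
        "SELECT v2_id FROM id_map WHERE entity = ? AND v1_id = ?",
        (entity, str(v1_id)),
    ).fetchone()
    return row[0] if row else None


def to_v1(entity, v2_id):
    """Return the v1 id for a v2 id, or None if it was never recorded."""
    row = conn.execute(
        "SELECT v1_id FROM id_map WHERE entity = ? AND v2_id = ?",
        (entity, str(v2_id)),
    ).fetchone()
    return row[0] if row else None


record_mapping("customer", 1001, "c-9f3a")
print(to_v2("customer", 1001), to_v1("customer", "c-9f3a"))
```

       | The two uniqueness constraints make a double import or a reused
       | ID fail loudly instead of silently creating an ambiguous match.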
       | 
       | FWIW, our first step was keeping both systems synchronized (via
       | that ID matching up) and migrating the end-user frontend to the
       | new system.
       | 
       | From there, we trained the customer on the new system and
       | administration, and finally, we swapped them over to the new
       | system and disabled the synchronization system.
       | 
       | Now, we kinda know how to do it and we expect to be able to do it
       | faster ... we'll see.
        
       | simne wrote:
       | This really deserves to be called a big thing.
       | 
       | Unfortunately, this is a huge project, and to do it you need a
       | very clear view of three parameters:
       | 
       | 1. Where are you now? How large (in LOC) is your starting point?
       | Is your project already loosely coupled, or is it a monolith?
       | This is important, because it is much easier if you can isolate
       | small parts and rebuild them separately while all the other code
       | stays old.
       | 
       | 2. How should the end state of the project behave? Monolith,
       | microservices, or something else?
       | 
       | 3. How large a budget do you have, and how many "man-hours" per
       | month are possible?
       | 
       | So to do all this in some predictable time and budget, you need a
       | waterfall-style project. But sure, you could use the waterfall
       | part as the overall strategy and do everything else with agile
       | (trying to stay as close as possible to the waterfall
       | milestones).
       | 
       | In real life I have seen such a migration, unfortunately an
       | unsuccessful one from the project manager's view. First, they
       | tried to make a new version on the same platform, but the project
       | just bloated without much success; second, they decided to change
       | to a totally different platform and rewrite all the code from
       | scratch (in another language), and this time they succeeded.
        
       | nostrademons wrote:
       | Led a couple of these. Advice:
       | 
       | 1) Do it incrementally. If you don't, it will fail. You can't
       | block feature releases for a whole organization for years, but if
       | you don't block feature releases you will forever fall behind
       | head.
       | 
       | 2) Design the v2 you want to have, but don't get too attached to
       | the design. It will change as you uncover engineering realities
       | and as business direction evolves. Be flexible and adapt as you
       | go along.
       | 
       | 3) It helps garner exec support if you can catalog product ideas
       | that they've _wanted_ to do but been prevented from by the
       | current architecture, and address them with the new architecture.
       | Rewrites that address new business goals and strategies have a
       | lot more staying power than rewrites for the hell of it, or ones
       | justified by "the code will be much cleaner".
       | 
       | 4) If V1 doesn't already have clear APIs and subsystem
       | boundaries, it's usually worth doing pre-work to put them in
       | place. These take the form of behavior-neutral refactorings whose
       | _only_ purpose is to trim & rationalize dependencies - you're
       | changing how the system looks to outside clients, but not how it
       | works.
       | 
       | 5) Make sure you have a comprehensive regression test suite. This
       | is part of why the last point is so important.
       | 
       | 6) The new APIs are the most important part. Get them working
       | first, in order of which APIs are most frequently depended on by
       | new code, even if you have to implement them on top of old code
       | or use hacks to connect them up to the old system. And then get
       | the rest of the org using them. This will help keep you from
       | falling behind the rest of the org in development, and build
       | momentum on the new system.
       | 
       | 7) Separate behavior-neutral changes that change _how the code
       | works internally_ from behavior-adding changes that change _what
       | the code does_. The former should give exactly the same results
       | as the v1 system, pass all the existing regression test suite,
       | and have exactly the same functionality except perhaps a latency
       | penalty for shimming & data conversion. The latter handle any of
       | the new functionality that are the business goals of v2, which
       | hopefully you established in #3.
       | 
       | 8) The project will be a lot more sustainable if you can deliver
       | some of the business goals from #3 before your conversion is
       | complete.
       | 
       | 9) Have a latency budget for how much slower the new system can
       | be than the old one. It _will_ be slower than the old one, and
       | don't try to message otherwise. This is why you try to get it
       | used for new functionality (#6) first; these often have fewer
       | users, so inefficiency causes less of a penalty for overall
       | experience.
       | 
       | 10) Also, expect bugs and lots of them. This is the other reason
       | to get it used for new code (#6) before migrating over critical
       | core functionality; it lets you smooth out the kinks before you
       | cause career- or business-ending failures.
       | 
       | 11) If v2 has a different data backend from v1, you _will_ need a
       | dual-write layer, because there's going to be a period of time
       | when both backends are live. Consistency-check both backends
       | against each other to catch bugs before trying to switch fully
       | over to v2.
       | 
       | 12) Test your migration scripts, and treat them with the same
       | care you treat production code. A one-character error in the
       | migration script that moved GMail over to BigTable ended up
       | deleting 10% of GMail accounts and necessitating a tape-backup
       | restore that knocked out people's GMail for a week. If Google can
       | screw it up, you can too.
       | 
       | 13) When it comes to migrating over the long-tail of
       | functionality, enlist the rest of the org's help, have a bunch of
       | whole-company Fixits, and put lots of people on it. By the time
       | you do this, everything in v2 should be stable, people should
       | know the dragons in the new system, and migration should be
       | pretty straightforward. But this work tends to burn engineers
       | out, because it's boring and has virtually no real benefit other
       | than being able to get rid of V1.
       | 
       | 14) Expect this project to suck, to take about 5x more man-hours
       | than you expect, and to face cancellation at numerous intervals.
       | You should not be embarking on this unless business leaders are
       | sure that you really need to, and you have their complete buy-in.
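       | 
       | To make #11 concrete, here is a minimal sketch of a dual-write
       | layer with a consistency check. It assumes v1 stays the source of
       | truth during the transition and uses plain dicts as stand-ins for
       | the two backends; the class and method names are hypothetical:

```python
_MISSING = object()  # sentinel so a missing key never equals a stored None


class DualWriter:
    """Write to both backends, read from v1, and report any drift."""

    def __init__(self, v1_store, v2_store):
        self.v1 = v1_store
        self.v2 = v2_store
        self.failed_v2_writes = []

    def write(self, key, value):
        self.v1[key] = value  # v1 is still the source of truth
        try:
            self.v2[key] = value
        except Exception as exc:
            # A v2 failure must not break production traffic; record it
            # so the key can be repaired later.
            self.failed_v2_writes.append((key, repr(exc)))

    def read(self, key):
        return self.v1[key]

    def inconsistencies(self):
        """Return the sorted keys whose values differ between backends."""
        keys = set(self.v1) | set(self.v2)
        return sorted(
            key
            for key in keys
            if self.v1.get(key, _MISSING) != self.v2.get(key, _MISSING)
        )


v1_store, v2_store = {}, {}
writer = DualWriter(v1_store, v2_store)
writer.write("cust:1", {"email": "a@example.com"})
writer.write("cust:2", {"email": "b@example.com"})

# Simulate drift caused by a bug in some other v2 code path.
v2_store["cust:2"] = {"email": "stale@example.com"}
print(writer.inconsistencies())
```

       | Running the check on a schedule, and switching reads to v2 only
       | once it stays empty, is one way to apply the advice above.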
       | 
       | Good luck.
        
       | Jefff8 wrote:
       | I've done this sort of thing a few times.
       | 
       | - Build a small core team who know the problem in depth. You
       | really need to understand v1 and v2 data and the mappings, as
       | well as the functionality in each.
       | 
       | - Build a test system that is insulated from customers; you want
       | to be able to use this as if it's the real thing, but to be
       | absolutely, completely, dead certain that it will not affect live
       | systems, and that no output from this will reach people it should
       | not. Make sure there are visual indicators as well as logical
       | traps on data exiting this system. Make this repeatable - you are
       | going to use this system a lot to re-run tests. Despite the
       | firebreaks, you will have some brown pants moments.
       | 
       | - The ideal is to move from v1 to v2 gradually, using a
       | passthrough system. However, ime, this is often not possible, and
       | at some point there will be a hard switch between systems.
       | 
       | - Develop migration plans with multiple off-ramps and fallbacks,
       | and monitoring. By the time you press the switch you should know
       | exactly how everything will work, and you should have no issues.
       | Despite this, you should have layers of contingency to allow for
       | business as usual when the unplanned happens. This is a mix
       | between technical and business and it should have been properly
       | understood by everyone involved. Monitoring is critically
       | important. Your plans should include aftercare... for example,
       | what happens if you think the migration is successful, and two
       | weeks later you discover an issue with 200,000 transactions. How
       | will you reconcile? How will you communicate with the affected
       | parties?
       | 
       | - Look for classes of things... Can you find 100,000 dead
       | accounts that can be removed? Can you find 500,000 that have only
       | ever had one transaction? Look for classes of errors - and fix
       | them before migration. Keep a record of all of this, and make
       | certain that you have covered all cases and all records. If you
       | are lucky, you will be able to migrate classes from v1 to v2 and
       | have the passthrough transparently manage this.
       | 
       | - Ideally, have a log of transactions that can be replayed on
       | demand. So that you can run systems in parallel and so that in
       | the event of issues, you can unwind.
       | 
       | - Keep written logs of all the things that you and the team do.
       | You _will_ forget stuff you've done. This is true on an hour to
       | hour basis as well as a month-to-month basis.
       | 
       | - Work on making migrations fast. Can you organise it so that 16m
       | migrations take 10 minutes? This allows you to test and retest. You
       | want anyone to be able to run on-demand migrations.
       | 
       | - Look to the end-users. For an upgrade to be successfully
       | deployed, both the business and the end-users must be happy; you
       | will want to run test groups, pilots, group conversations, and
       | make your team available to the end-users. Nothing should be
       | surprising by the end. You will also need to know that v2 is
       | performant - catch these problems before they become general
       | issues of dissatisfaction. Look also for pain points and try to
       | ensure that you remove them in v2. Change is painful, but if you
       | can show that there are benefits you will ameliorate much of the
       | criticism.
       | 
       | - Have defined end-points. You do not want to be doing this in 5
       | years' time.
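       | 
       | The replayable transaction log from the list above can be
       | sketched as an append-only list of JSON lines that is replayed
       | into a fresh state, optionally stopping at a chosen sequence
       | number. The entry fields and the two operations are hypothetical,
       | and a real log would live in durable storage:

```python
import json


def append_entry(log, op, account, amount):
    """Append one transaction as a JSON line with a sequence number."""
    entry = {"seq": len(log) + 1, "op": op, "account": account, "amount": amount}
    log.append(json.dumps(entry))


def apply_entry(balances, entry):
    """Apply a single logged transaction to the balances dict."""
    if entry["op"] == "deposit":
        delta = entry["amount"]
    elif entry["op"] == "withdraw":
        delta = -entry["amount"]
    else:
        raise ValueError(f"unknown op: {entry['op']}")
    balances[entry["account"]] = balances.get(entry["account"], 0) + delta


def replay(log, up_to_seq=None):
    """Rebuild state from the log, optionally stopping after up_to_seq."""
    balances = {}
    for line in log:
        entry = json.loads(line)
        if up_to_seq is not None and entry["seq"] > up_to_seq:
            break
        apply_entry(balances, entry)
    return balances


log = []
append_entry(log, "deposit", "acct-1", 100)
append_entry(log, "withdraw", "acct-1", 30)
append_entry(log, "deposit", "acct-2", 50)

print(replay(log))               # full replay, e.g. into a parallel system
print(replay(log, up_to_seq=1))  # state as of the first entry, for unwinding
```

       | Replaying the same log into v1 and v2 also gives two states that
       | can be compared directly.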
        
       | snarkypixel wrote:
       | One thing I've learned from these large migration projects is
       | that v1 always seems like total crap, while v2 appears to be the
       | perfect dream. However, as you begin building v2, you start to
       | realize that v1 was not actually that bad and had many great but
       | unappreciated features. Additionally, you come to understand that
       | many v1 features took a long time to develop, were battle-tested,
       | and would require significant effort to rebuild in v2 with
       | minimal benefits.
       | 
       | So, what I've learned is not to completely discard v1. Instead,
       | it's better to refactor or rebuild only the parts that pose
       | issues, even though it may not be as sexy or exciting as starting
       | v2 from scratch.
       | 
       | In practice, I would begin by cloning v1 and deploying it to a
       | development environment to start tweaking it. I would also make
       | sure to implement numerous automated tests to safeguard against
       | any potential issues caused by refactoring. Of course, if you can
       | keep using the same database that's even better as you can test
       | refactored features with real customer data and even run both
       | builds in parallel to spot any differences.
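       | 
       | Running both builds in parallel to spot differences can be
       | sketched as a small comparison harness: feed the same requests to
       | both and collect every case where the refactored build disagrees
       | with the original. The handlers and the order format below are
       | hypothetical stand-ins for real entry points:

```python
def compare_builds(v1_handler, v2_handler, requests):
    """Return one record per request where the two builds disagree."""
    diffs = []
    for request in requests:
        expected = v1_handler(request)
        try:
            actual = v2_handler(request)
        except Exception as exc:
            # A crash in the new build counts as a difference too.
            actual = f"raised {type(exc).__name__}"
        if actual != expected:
            diffs.append({"request": request, "v1": expected, "v2": actual})
    return diffs


def v1_total(order):
    """Original behaviour: plain quantity times unit price, in cents."""
    return order["qty"] * order["unit_cents"]


def v2_total(order):
    """Refactored build with a deliberate change: 10% off for 10+ items."""
    total = order["qty"] * order["unit_cents"]
    return total * 9 // 10 if order["qty"] >= 10 else total


orders = [
    {"qty": 1, "unit_cents": 250},
    {"qty": 10, "unit_cents": 250},
]
diffs = compare_builds(v1_total, v2_total, orders)
print(diffs)
```

       | Only the second order shows up in the diff, which is exactly the
       | kind of unintended behaviour change the comparison is meant to
       | catch.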
        
       ___________________________________________________________________
       (page generated 2024-06-22 23:01 UTC)