[HN Gopher] LinkedIn shelved plan to migrate to Microsoft Azure ...
___________________________________________________________________
LinkedIn shelved plan to migrate to Microsoft Azure cloud
Author : helsinkiandrew
Score : 13 points
Date : 2023-12-14 14:04 UTC (1 days ago)
(HTM) web link (www.cnbc.com)
(TXT) w3m dump (www.cnbc.com)
| helsinkiandrew wrote:
| 4 years ago: "LinkedIn is moving to Microsoft's Azure public
| cloud three years after $27 billion acquisition"
|
| https://www.cnbc.com/2019/07/23/linkedin-is-moving-to-micros...
| RoyTyrell wrote:
| My company has invested in moving to Azure except where we need
| to stay on Google. Apparently MS gave us a package on all of
| their products if we use Azure and it was enough to sway the
| execs.
|
| We were then given the directive that everyone at my level would
| need to get some certifications so we could properly use Azure,
| assist the architects and more jr devs. It's a good idea but my
| god the training is so poorly executed. I want to like Azure but
| it also seems like an uncoordinated mess.
|
| Maybe I'm just a grumpy dev. Anyone else have a better and more
| positive perspective? Who has good training for certs such as the
| Data Engineer or AI Engineer?
| green-eclipse wrote:
| Microsoft bought Hotmail back in 1997. Hotmail was powered by
| Unix servers until 2004, despite MS's best efforts to transition
| to their own Wintel-powered backend [0]. These things take time.
|
| [0] https://news.softpedia.com/news/Windows-Live-Hotmail-Was-
| Pow...
| yellow_lead wrote:
| > LinkedIn was having a hard time taking advantage of the cloud
| provider's software. Sources told CNBC that issues arose when
| LinkedIn attempted to lift and shift its existing software tools
| to Azure rather than refactor them to run on the cloud provider's
| ready made tools.
|
| I think I need this translated back into tech-speak.
| asylteltine wrote:
| That is tech speak. They tried to redeploy existing
| architecture into azure and it failed
| ghaff wrote:
| The headline notwithstanding, this doesn't seem like anything
| particularly Azure specific. They'd likely have had many of
| the same issues trying to mostly lift and shift to any of the
| big public cloud providers.
| eitally wrote:
| ELI5: For any sufficiently complex enterprise system (e.g.
| LinkedIn, or Google), any plain vanilla architecture is
| infeasible for lift & shift. Moreover, the vanilla services may
| not comply with internal security requirements, or play nicely
| with internal CI/CD tools, or internal databases / data
| structures / data processing pipelines / analytics.
| mecsred wrote:
| No five year old is going to understand a single sentence of
| that.
| calvin wrote:
| ELI40YO Engineer
| that_guy_iain wrote:
| Still need it dumbed down a bit /s
| maronato wrote:
| You spend a week building a castle with legos, and suddenly
| your mom asks if you can change some parts to use the new
| <lego competitor>. You can try to make the old and new
| parts fit together, but it isn't going to be easy most of
| the time, and you can't be certain that the lego competitor
| will have the same pieces or do the same things as your
| lego version. By the time you are done redoing those parts,
| you'll end up having to recreate large portions of your
| castle to make everything work together again, and even
| then you might miss something important that breaks a
| functionality of your castle.
| mypetocean wrote:
| Maybe in this case "ELI5" stands for "Explain it like I
| have 5 yrs experience in software development but gave up
| in year two"?
| eitally wrote:
| Arguably, if the five year old was reading hacker news,
| they might. :) Point taken, though, but honestly this
| doesn't seem like the place to simplify things quite to
| 5yo level.
| asmor wrote:
| That's all correct in theory, but in my experience these
| things still happen and are usually outgrowths of bad
| engineering culture / shadow IT / not wanting to be reliant
| on your cloud infra / platform team (often for irrational
| reasons, sometimes not). They get built with entire teams
| taking responsibility on paper, but then before you know it,
| nobody from that team still works at the company or on that
| team. Usually these systems are also GDPR nightmares if they
| contain user data, because these people don't understand when
| you tell them they need to have a plan for deleting user
| data. They don't even consider it a legal barrier, they think
| you're putting stones in their way.
|
| I've been on enough Cloud Archeology expeditions into the
| land of VMs where nobody knows what they do, it might as well
| be my job title now.
| _rutinerad wrote:
| My five-year-old loves her complex enterprise system.
| sqeaky wrote:
| Is that supposed to make it seem better?
|
| If refactoring is too hard for a Microsoft owned company what
| am I to think about my tech stack?
| bostik wrote:
| Beyond ludicrously small systems, refactoring of _live 24/7
| production systems_ is never easy.
|
| Reality has a surprising amount of detail, and any non-
| trivial, customer-facing system will have accumulated weird
| code paths to account for obscure but nonetheless expensive
| edge cases. A codebase built across >20 years, scaled to
| support millions of concurrent users is going to be
| absolutely filled to the brim with weird things.
|
| When you add the need for live migrations with zero downtime,
| done every few years to account for next order of magnitude
| loads, you end up with a proper Frankenstein's monster. It's
| not called "rebuilding an airplane while flying" for a lark.
|
| Every round includes a long, complex engineering effort of
| incremental live migration. With parallel read/write patterns
| between old and new systems, and all their annoying semantic
| differences. And then, to add insult to injury, while your
| core team was going through the months-long process of
| migrating _one_ essential service, half a dozen upstream
| teams have independently realised they can depend on some
| weird side effects of the intermediate state and embedded its
| assumptions to a Critical Business Process[tm] responsible
| for a decent fraction of your company's monthly revenue.
| Breaking their implicit workflow will make your entire
| company go from black to red, so your core team is now
| saddled with supporting the known-broken assumptions.
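|
| A minimal sketch of the parallel read/write pattern described
| above, assuming two in-memory dicts stand in for the old and new
| systems. DualWriteStore and its fields are hypothetical names
| chosen for illustration, not anything LinkedIn is known to use.

```python
class DualWriteStore:
    """Wraps an old and a new store during an incremental live migration."""

    def __init__(self, old, new):
        self.old = old          # current source of truth
        self.new = new          # migration target, written in parallel
        self.mismatches = []    # keys where the two stores disagreed on read

    def write(self, key, value):
        # Every write goes to both systems so the new one stays current.
        self.old[key] = value
        self.new[key] = value

    def read(self, key):
        # Serve from the old store, but compare with the new one so that
        # differences surface before the final cut-over.
        old_value = self.old.get(key)
        new_value = self.new.get(key)
        if old_value != new_value:
            self.mismatches.append(key)
        return old_value


old_store = {"user:1": "alice"}  # data that predates the migration
new_store = {}                   # not yet backfilled
store = DualWriteStore(old_store, new_store)

store.write("user:2", "bob")     # lands in both stores
store.read("user:2")             # stores agree
store.read("user:1")             # only in the old store, so it is flagged
```

| The mismatch list is where the "annoying semantic differences"
| would show up; in this toy run it flags the record that was never
| backfilled into the new store.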
|
| Then you get to add wildly differing latency profiles to the
| mix. While you were running on your own hardware, the worst-
| case latency was rack-to-rack. Implicit assumptions on
| massive but essential workloads may depend, unknowingly, on
| call-to-call latencies that only rarely exceed 100
| microseconds. In a cross-AZ cloud setting you may suddenly
| have a p90 floor of 0.2ms. A _lot_ of software can break in
| unexpected ways when things are consistently just a _little
| bit_ too slow.
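|
| A rough illustration of why doubling per-call latency matters,
| assuming a workload that makes its calls strictly one after
| another. The call count and the timeout are hypothetical numbers;
| only the 100 microsecond and 0.2 ms figures come from the text
| above.

```python
def total_seconds(sequential_calls, per_call_latency_us):
    """Time spent purely waiting on calls made one after another."""
    return sequential_calls * per_call_latency_us / 1_000_000


CALLS = 50_000   # hypothetical number of dependent calls in one job step
TIMEOUT_S = 8    # hypothetical deadline tuned on the old hardware

on_prem = total_seconds(CALLS, 100)    # 100 microseconds per call
cross_az = total_seconds(CALLS, 200)   # 0.2 ms per call

print(f"on-prem:  {on_prem:.1f}s, within deadline: {on_prem <= TIMEOUT_S}")
print(f"cross-AZ: {cross_az:.1f}s, within deadline: {cross_az <= TIMEOUT_S}")
```

| Nothing in the code changed between the two runs, yet the second
| one blows through a deadline that was implicitly tuned for the
| old network.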
|
| Welcome to the wonderful world of distributed systems and
| cloud migrations. At some point the scars will heal.
| Allegedly.
| Raed667 wrote:
| I think this is an awkward way of saying they tried to add an
| abstraction on top of their AWS dependencies so that their
| services would work on Azure without a refactor.
| bbarnett wrote:
| They don't use AWS, but primarily baremetal.
| bbarnett wrote:
| Also, this reminds me of the time Microsoft bought Hotmail,
| and couldn't port it to WinNT. They had to leave it on its
| BSD variant for a long time, NT couldn't handle it.
| cplusplusfellow wrote:
| I'm surprised it can today.
| Scoundreller wrote:
| Arstechnica forum discussion on the topic in 2001:
|
| https://arstechnica.com/civis/threads/i-thought-hotmail-
| was-...
| wharvle wrote:
| "Lift and shift" is a term for when you move to "the cloud" but
| really just replace your physical servers with clones in cloud
| VMs. It's a relatively cheap (in terms of effort) way to get on
| "the cloud" but gains you basically zero of the benefits. The
| term's in wide use, talk to anyone involved with cloud-anything
| and they'll be familiar with it.
|
| I'm not sure what else needs to be translated? Nothing, I
| think?
| mmcgaha wrote:
| Lift and shift is a sales term to make it sound like the
| internal team is trying to over-complicate the migration. The
| sales guy will normally phrase it as "just lift and shift."
| neilv wrote:
| I love it. I could totally believe the etymology starts as
| slick sales persuasion trying to downplay the
| implementation difficulty of something that's being sold.
|
| And then people also pick it up for non-persuasion, because
| it also sounds like a catchy name for an engineering
| approach we already had.
|
| Of course it can still be used for persuasion for a while,
| but will grow baggage over time, as efforts linked to the
| term don't play out that way.
| neilv wrote:
| Thanks for the explanation, but no need to imply that someone
| is out of the loop if they didn't know it.
|
| The term didn't sound familiar to me (though the concept
| was), and the term might not have been familiar to some
| others.
|
| People might not want to contradict an assertion because of
| language like "The term's in wide use, talk to anyone
| involved with cloud-anything and they'll be familiar with it.
| [...] I'm not sure what else needs to be translated? Nothing,
| I think?"
| wharvle wrote:
| > People might not want to contradict an assertion because
| of language like "The term's in wide use, talk to anyone
| involved with cloud-anything and they'll be familiar with
| it. [...] I'm not sure what else needs to be translated?
| Nothing, I think?"
|
| > > LinkedIn was having a hard time taking advantage of the
| cloud provider's software. Sources told CNBC that issues
| arose when LinkedIn attempted to lift and shift its
| existing software tools to Azure rather than refactor them
| to run on the cloud provider's ready made tools.
|
| The only other terms I can see that are jargon are "cloud
| provider" and "refactor", and those are already technical
| (more or less) so don't need to be translated into
| technical language.
|
| As for the other bit, I just meant that it's a widely-used
| term so one may continue to encounter it in these contexts.
| It truly is ubiquitous in discussion of and around
| "enterprise transformations" to the cloud, and among cloud
| practitioners more generally, so anyone connected to that
| space will know what it means. It's also _kinda_ already a
| technical term, in that developer/devops and SRE sorts
| throw it around and do mean a specific thing by it, which
| doesn't need to be translated for other technical folks in
| that area.
| neilv wrote:
| "Ten Thousand" https://xkcd.com/1053/
|
| The original person might've instead asked for an
| explanation in a way that didn't come across as
| criticizing the article.
|
| But probably best not to insist that everyone should
| already know the term; just explain it.
| wharvle wrote:
| Yeah, you're probably right. Feedback received.
| vergessenmir wrote:
| Lift and shift is a cloud migration strategy which involves
| moving your applications to the cloud with little to no
| modification. For example, you have an application running on a
| server in your data-centre, you then deploy a VM in the cloud
| with a similar spec and install the application.
|
| It's usually done to avoid the engineering cost of making the
| services more cloud native. What tends to happen a lot is that
| after a considerable portion of the migration is completed, the
| cost of the lift-and-shift effort starts to overtake the
| savings, and the projected costs dwarf the future savings.
|
| I suspect this is what happened with Linkedin.
| Agingcoder wrote:
| which savings ? It's never been obvious to me that cloud was
| cheaper if you're a large company
| lobsterthief wrote:
| It's easier to scale cloud infrastructure.
| jvm___ wrote:
| Even if you never need to scale it's cheaper to not have
| to physically maintain your own data center. If all the
| broken servers, building power, building internet, access
| control, real estate costs... are all handled in the
| cloud there's savings there as well.
| NegativeK wrote:
| But those costs don't go away -- the cloud provider is
| going to charge you for them, along with a premium for
| profit?
|
| I'm used to organizations moving out of the cloud when
| they realize that it's more expensive if you don't have
| very peaky load demands.
| marcolussetti wrote:
| But that's somewhat negated if you lift and shift,
| because your application is not designed to leverage that
| capability in that way.
| campbel wrote:
| Compute is at a premium, but you can shift opex/capex
| around which might be more suitable. It can also be cheaper
| in headcount since you need fewer operators and less
| expertise in datacenter operations.
| adolph wrote:
| > you need fewer operators and less expertise in
| datacenter operations
|
| Because you are paying someone else for them.
|
| This is considered rational because those operators are
| presumably more productive in a pool of people using
| similar skills to support many customers rather than just
| one. It is similar to hiring a cleaning service rather
| than employing individual cleaners in a department of
| cleaning because cleaning things is not a core competency
| of business.
|
| It might be less rational if some amount of compute is
| part of the core competency of the business. Since
| "software is eating the world," compute is a core
| competency of all businesses except for the ones that
| don't realize it yet.
| wharvle wrote:
| > It can also be cheaper in headcount since you need
| fewer operators and less expertise in datacenter
| operations.
|
| I've not _really_ seen this work out well. I think it
| might be true for simple set-ups, letting a tiny
| developer team also handle infra and support without
| going nuts doing it, if they set it up that way from the
| beginning, but more-complex setups always seem to have so
| damn many sharp edges and moving pieces that support ends
| up looking similar to what a far more DIY approach (short
| of building one's own datacenter outright) would, in
| terms of time lost to it.
|
| ... and so does downtime, for that matter.
| asmor wrote:
| It's at least more predictable. You don't pay for staff
| with datacenter skills (sort of in short supply) and you
| don't need to make large investments early on to build the
| datacenter and you don't have a huge headache if you need
| to scale up or down operations.
| hibikir wrote:
| It really depends on workloads. Imagine you need massive
| spikes of compute for, say, flash sales, or people watching
| the superbowl in your streaming service. Buying all that
| hardware for just the spikes might not make sense vs just
| scaling up VMs in a cloud provider and scaling them down.
|
| In the real world, for baseline load, the big advantage for
| many large companies isn't price, but the massive lack of
| alacrity of many inhouse ops teams. If it takes me 3+
| months to provision compute for the simplest, lowest demand
| services (as is the custom in many large companies full of
| red tape and arguments about who bears costs), letting
| teams just spin up anything they want and get billed
| directly is often a winner, even if it's more expensive.
| Having entire teams waste months before they can put
| something in prod is a very different kind of expense in
| itself.
| maccard wrote:
| The simplest example is if you have on-prem hardware, you
| need to have capacity for your peak load. In a lift and
| shift, you would replace your fleet of 96 core xeons with a
| fleet of 96 core xeons in AWS.
|
| The cloud native approach would be to modify your app so
| that it can be scaled up and down so you keep a few
| machines always running, and scale up and down with your
| traffic so you only run at capacity when you need it.
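|
| A minimal sketch of that scale-up-and-down idea, assuming a
| simple policy that sizes the fleet from current traffic. The
| function name, the per-instance capacity, and the load figures
| are all hypothetical; real autoscalers add cooldowns and
| smoothing.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance, min_instances=2):
    """Smallest fleet that covers the load, never below the always-on floor."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, needed)


# Hypothetical traffic samples over a day, in requests per second.
load_samples = [150, 120, 900, 2400, 800, 200]

# Each instance is assumed to handle 500 requests per second.
fleet = [desired_instances(rps, 500) for rps in load_samples]
print(fleet)
```

| An on-prem fleet (or a lift and shift of one) would be sized for
| the largest sample at all times; the policy above only reaches
| that size while the peak lasts.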
| wredue wrote:
| This doesn't demonstrate anything about the savings.
|
| Anecdotally, when my previous company was looking at
| costs, cloud unequivocally came out significantly more
| expensive, and that wasn't even a large company (only
| 2,000 or so employees).
|
| I will grant that we did not have globalization problems
| to solve (but I'd also wager that lots of businesses
| prematurely "what if" this scenario anyway).
| maccard wrote:
| > This doesn't demonstrate anything about the savings.
|
| If you need 4 CPUs for your peak load for 4 hours per
| day, and only 1 of them for the other 20 hours a day, you
| can save by scaling down to 1 CPU for roughly 83% of the day.
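|
| Worked through as CPU-hours, assuming cost is proportional to
| CPU-hours and ignoring any per-hour price difference between
| cloud and owned hardware (which other comments here argue is
| significant):

```python
PEAK_CPUS, PEAK_HOURS = 4, 4    # peak load from the example above
BASE_CPUS, BASE_HOURS = 1, 20   # load for the rest of the day

# Fixed fleet: provisioned for peak around the clock.
fixed_cpu_hours = PEAK_CPUS * (PEAK_HOURS + BASE_HOURS)

# Scaled fleet: peak capacity only while the peak lasts.
scaled_cpu_hours = PEAK_CPUS * PEAK_HOURS + BASE_CPUS * BASE_HOURS

saving = 1 - scaled_cpu_hours / fixed_cpu_hours
print(fixed_cpu_hours, scaled_cpu_hours, f"{saving:.1%}")
```

| Under these assumptions the scaled setup uses 36 CPU-hours a day
| instead of 96, so cloud only comes out ahead if its per-CPU-hour
| price is less than about 2.7 times the on-prem one.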
| bee_rider wrote:
| Although, it must be unusual, right? This is not one company
| porting their service to the cloud, this is Microsoft porting
| their LinkedIn service from whatever servers came along with
| LinkedIn, to their own servers, on which they also run a
| cloud business.
|
| Which... isn't to say anything about which way we should
| expect that to swing things. But it seems quite unusual, as
| most companies have not been bought by a cloud provider.
| Yet...
| random42 wrote:
| > this is Microsoft porting their LinkedIn service from
| whatever servers came along with LinkedIn, to their own
| servers
|
| Nope, LinkedIn executes completely independently.
| hinkley wrote:
| If your architecture is chatty enough, you will be sharding
| things so that most traffic stays in one rack, room, or data
| center.
|
| If you treat us-west-1 as a single data center, you may find
| you are spending a lot on traffic between AZs.
|
| A lift and shift might treat us-west-1 like a single data
| center. A more sophisticated strategy might treat it as
| three.
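|
| A back-of-the-envelope sketch of that cost, assuming traffic
| between AZs is billed per GB. The traffic volume, the price, and
| the 5% figure are hypothetical placeholders, not real AWS
| pricing.

```python
def cross_az_cost(total_gb, cross_az_fraction, price_per_gb):
    """Monthly bill for the share of internal traffic that crosses AZs."""
    return total_gb * cross_az_fraction * price_per_gb


MONTHLY_GB = 1_000_000   # hypothetical internal service-to-service traffic
PRICE_PER_GB = 0.02      # hypothetical cross-AZ price, not a real quote

# Region treated as one data center: callers pick any of three AZs
# uniformly, so about two thirds of calls leave the caller's AZ.
naive = cross_az_cost(MONTHLY_GB, 2 / 3, PRICE_PER_GB)

# Region treated as three data centers: traffic is sharded so that
# only an assumed 5% has to cross an AZ boundary.
sharded = cross_az_cost(MONTHLY_GB, 0.05, PRICE_PER_GB)

print(f"naive: ${naive:,.0f}/month, sharded: ${sharded:,.0f}/month")
```

| The absolute numbers are made up; the point is that the bill
| scales with the fraction of chatty traffic that crosses an AZ
| boundary, which a lift and shift does nothing to reduce.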
| mdeeks wrote:
| There is no such thing as "lift and shift". It is something
| Azure account reps like to say to make it sound like moving is
| easy. It sounds like you're picking up some boxes from one side
| of the room and moving them to the other. When in reality
| you're rewriting your infra code mostly from scratch.
|
| When we were acquired by MSFT we had the same project. We had
| to move from AWS to Azure. I made them all stop saying "lift
| and shift" because in reality it is "throw away all of your
| provisioning code and rewrite it using Azure primitives which
| don't work the same way as AWS ones".
|
| It is more akin to writing an iOS app to work on Android.
| axus wrote:
| I'm gonna bet that many Azure customers had no such thing as
| "provisioning code".
| asmor wrote:
| To be fair, AWS also used the exact term when we moved a
| project out of a tiny, expensive-to-operate (through lack of
| scale) datacenter that only hadn't been retired because we
| had a 30+ year old COBOL app suite on a z system.
| arielcostas wrote:
| But lift and shift is not that, is it? It's having
| applications running directly on OSs (without
| containerisation or separation of dependencies like the
| database or physical disks) and moving it to "the cloud" to
| be run on a VM in the same fashion.
|
| I mean, if you're already with AWS using their services
| (besides EC2 for hosting) such as RDS or S3; moving to Azure
| SQL (or DB for MySQL or whatever) and Blob Storage is not
| just lift-and-shift anymore, since you are actually changing
| from a cloud provider to a different one.
|
| AFAIK an actual migration to the cloud would involve
| rewriting some parts of the application to be cloud-native,
| such as using Service Bus for queues instead of a local
| Redis/RabbitMQ instance, using GCS instead of local disks,
| and using RDS instead of hosting your own single MySQL
| server.
| oasisbob wrote:
| There's no formal definition of "lift and shift", certainly
| nothing that would dictate specific virtualization
| strategies.
|
| I've always read it as being roughly analogous to "like for
| like," and dependent on the specific circumstances and
| status quo.
| oasisbob wrote:
| "Lift and shift" isn't just an Azure-specific phrase. Many
| people use it pejoratively, and point to it as an anti-
| pattern, and something to avoid.
|
| Similar terminology is "forklift"... been hearing that one
| for well over a decade.
|
| Migrations are oftentimes an opportunity to revisit scaling,
| configuration, build and deployment pipelines, platform
| primitives, etc. Every migration I've been involved in has a
| (probably necessary) tension between getting the job done
| efficiently, while not repeating all the mistakes of the
| past.
| hinkley wrote:
| "Lift and shift" came into the conversation once we started
| talking about how we were paying too much for AWS. The
| obvious stuff was things like less bin packing, and
| bandwidth for third party services, like telemetry
| dashboards.
|
| And it's not just the service fees. I blanch to think of
| the opportunity costs we accrued by focusing for that long
| on infrastructure to the exclusion of new product and
| features. It's truly breathtaking.
|
| And then there's the burnout, and the ruffled feathers.
| oasisbob wrote:
| I've become convinced that most migrations are absolute
| losers in terms of opportunity costs.
|
| Even if done skillfully with valid rationale, they don't
| show any value until you come out the other side
| successfully.
| hinkley wrote:
| Definitely. We migrated to a new telemetry vendor and I'm
| pretty sure it'll take 10 years for us to recoup the cost
| savings in man power and opportunity cost.
|
| They were worried the old vendor might go under. My own
| track record with predicting company failures is pretty
| bad, so I suspect they'll still be around ten years from
| now.
| ben_jones wrote:
| It could mean multiple things. My guess is they used vendor
| specific services that don't translate as well as the basic
| building blocks like vanilla S3/EC2
| upon_drumhead wrote:
| I wonder how the GitHub to Azure migration is going
| redrove wrote:
| I have it on good authority they're trying a lift and shift too
| and it's not going well, at least as of ~9mo ago.
| duxup wrote:
| Any move to any cloud is going to depend on the environment
| you're coming from. I've been in on decisions not to use X, Y, Z
| ... doesn't mean there was anything wrong with them, we just
| weren't ready for that yet or had different priorities or the
| ever present weird deal-breaker issue / requirement.
| brodouevencode wrote:
| Exactly. The cost to retool for the cloud is not insignificant.
| konschubert wrote:
| You have to
|
| 1) Be comfortable routing traffic between on-prem and cloud
| over the internet or at least over a VPN, and
|
| 2) avoid the temptation to build your own platform (Terraform
| templates are a liability, not an asset!) and
|
| 3) Move tiny stuff first and bigger stuff later.
|
| It's amazing how many companies fail at that.
| bobthepanda wrote:
| Doing things the right way and learning from before usually
| requires a certain degree of humility, and more often than
| not the person leading these projects is either required to
| be, or is deluding themselves to be a hotshot who can succeed
| in a bold new way.
| konschubert wrote:
| You mean somebody who posts 3 bullet points on the internet
| and claims that they solve everything ? ;)
| bobthepanda wrote:
| I mean, we invented the term Promotion Oriented
| Architecture for a reason.
|
| It's like politics; the best person for the job is a
| sufficiently experienced person who does not want it.
| mfer wrote:
| This doesn't appear to be about Microsoft's cloud but rather
| Public Cloud.
|
| The whole migration of LinkedIn from their own data centers to
| the public cloud (Microsoft's) isn't going well.
|
| It appears they are still going to operate on-premise for many
| things. Some things moving or have moved to the public cloud.
|
| Isn't this more a shot at the public cloud for all the things
| than to any specific one?
| that_guy_iain wrote:
| I don't see anything that points to it being a general public
| cloud issue. And instead they talk about Azure software
| specifically as something that they couldn't take advantage of,
| no?
| mlhpdx wrote:
| I would not assume that it is a specific Azure problem from
| that statement. Many, many teams struggle to take advantage
| of cloud infrastructure because of habits and knowledge
| retained for operating the existing systems.
|
| It's possible given what they have, it's simply best to keep
| it on premise - at least to some degree. That would likely
| not be true with a successful re-architecture, but not
| everyone is up for that.
| mfer wrote:
| It may not be about the teams. For example, when you
| control the data center you can do certain things around
| performance and scale you can't do in a public cloud.
|
| There are so many unknowns about how things are set up that
| it's hard to know.
| carimura wrote:
| Yes I came away with the same thing. It's The Register's modus
| operandi to use cheeky clickbait titles.
| femiagbabiaka wrote:
| Most cloud migration projects at large companies fail. It usually
| takes 3 or 4 tries at least before all the necessary lessons are
| learned.
| oooyay wrote:
| Maybe I'm a bit contrarian on this one but once I saw data
| center, Azure, and the phrase "lift and shift" it filled in a lot
| of context for me. I spent a lot of my early to mid career
| participating in these strategies. They don't work. VM images
| almost always are different in some way, there's something one
| vendor provides that another doesn't - in general there's enough
| minute details that add up to make a series of mini-mountains in
| terms of blockers.
| jeffbee wrote:
| Yep, there are always differences. Just one thing I stumbled
| into recently was one of our program images that has long
| worked fine in AWS can't start in Azure because something their
| hypervisor does to the virtual address layout conflicts with
| the way that we remap .text to a huge page. It is both trivia
| and a showstopper.
| pphysch wrote:
| Yeah, there is a vast gulf between "it works for us" and "every
| dependency was implemented strictly according to open standards
| and is therefore seamlessly portable". See also the joke of
| migrating between "SQL" databases.
| lgkk wrote:
| How do you move that much data over to another cloud provider?
|
| Without losing data or disrupting the customer?
|
| Or do the databases just stay in the data center and not migrate.
| wharvle wrote:
| Live replicas (perhaps initialized with a cold backup,
| initially, if the dataset's _really_ huge), carving off parts
| of it for separate migration if that's at all feasible, and
| some expensive folks doing a lot of butt-clenching-worthy
| activity for an hour or two (unless it goes very poorly...) for
| the final cut-over, some evening.
| ThomasMoll wrote:
| We (when I worked at LinkedIn) did it with ETL clusters, we
| already had built them out for moving data between datacenters
| nightly. They would mirror an HDFS cluster, then ran batch jobs
| to transfer either directly to the outbound cluster or to
| another ETL cluster in another DC.
|
| We used one of our ETL clusters to ship data to MSFT for
| various LinkedIn integrations, like seeing LinkedIn profile
| information in Outlook or Office products.
| lumost wrote:
| It's incredibly difficult for a mature software business to
| justify infrastructure and tooling investments. This is why we
| think that startups are a haven for modern tooling and the
| largest legacy firms are ... well ... difficult.
|
| The last 15 years possibly broke this rule by virtue of low
| interest rates, enabling the justification of large internal
| teams focused on modernization efforts which sometimes went as
| far as moving the state of computing forward.
|
| I wouldn't be surprised to see legacy enterprises return to form
| now that interest rates are 7%
| ksec wrote:
| I wonder at what sort of scale LinkedIn operates in terms of
| server count.
|
| And GitHub, also under Microsoft, seems to be doing fine with
| on-prem as well. Why force LinkedIn to use Azure?
| wredue wrote:
| If I had to guess, there are hordes of businesses out there
| that maintain operations on prem, and a large lift like this is
| great for the resume.
|
| Of course, I could also be entirely wrong, but I also am not
| going to pretend that IT resume padding then jumping ship and
| leaving a shart of an architecture behind doesn't happen all
| the time in this industry.
| astockwell wrote:
| $$$
| rdoherty wrote:
| When I was there it was in the low hundreds of thousands.
| Probably more as growth was still in double digit percentages
| per year of user base.
| ksec wrote:
| >When I was there it was in the low _hundreds of thousands_.
|
| Blows my mind every time I see these kind of numbers.
| thenewwazoo wrote:
| I'm obviously not going to comment on anything internal, and I'm
| obviously speaking for myself and not the company, but it's worth
| bearing in mind that this migration was not from "on-prem" in the
| traditional sense. LinkedIn has its own internal cloud, complete
| with all the abstractions you'd expect from a public cloud
| provider, except developed contemporaneously with all the _rest_
| of the "clouds" everyone is familiar with. It was designed for,
| and is tightly coupled to, LinkedIn's particular view on how to
| build a flexible infrastructure (for an extreme example, using
| Rest.Li[1], which includes client-side load balancing).
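|
| For readers unfamiliar with the idea, a generic sketch of
| client-side load balancing, where the caller rather than a
| central proxy picks the backend. This is NOT Rest.li's API or
| algorithm; RoundRobinClient and the host list are hypothetical,
| and round-robin is just the simplest policy to show.

```python
import itertools


class RoundRobinClient:
    """Caller-side balancer: each client cycles through known backends."""

    def __init__(self, hosts):
        # In a real system the host list would come from service discovery.
        self._hosts = itertools.cycle(hosts)

    def pick_host(self):
        # No central load balancer is consulted; the client decides.
        return next(self._hosts)


client = RoundRobinClient(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
picks = [client.pick_host() for _ in range(4)]
print(picks)
```

| Because the balancing logic lives in every client, swapping in a
| cloud provider's managed load balancer means touching callers,
| not just infrastructure.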
|
| There was no attempt to "lift-and-shift" anything. There are
| technologies that overlap and technologies that conflict and
| technologies that complement one another. As with any huge
| layered stack, you have to figure out which from the "LinkedIn"
| column marry well with those in the "Azure" column.
|
| I personally appreciate LI management's ability to be clear-eyed
| about whether the ROI was there.
|
| [1] https://linkedin.github.io/rest.li/
| dfxm12 wrote:
| Yeah, based on my own experience with AWS and Azure (that has
| nothing to do with LinkedIn), my immediate reaction to the
| headline was, "well, you can be keen on Azure, but "stuck" on
| AWS for a myriad of other reasons". Reading the article pretty
| much confirmed it.
| foobarian wrote:
| Oof, I'm twitching just reading that because we're in exactly
| the same boat. The problem with the ROI is that any kind of
| not-self-run cloud is guaranteed to be more expensive in direct
| costs. This has been shown time and time again for any
| reasonably large enterprise. However, there is a long list of
| things that are hard to express in money that support a cloud
| move, mostly to do with keeping up with modern tech, hiring,
| DR, better resiliency, etc. and so the decision can be quite
| dependent on the particular execs in the chain of command and
| their subjective values.
| sargun wrote:
| This is based on the assumption that Azure has modern tech,
| hires well, DR, and better resiliency than LinkedIn's "cloud"
| for LinkedIn's needs. There's a bit of a problem around
| incentives here, where Azure is built to sell to Azure's
| customer base, whereas LinkedIn has evolved their own stack
| over the years.
|
| The questions become:
|
| 1. Does it make sense to dump our special features in the
| stack, or move them to a higher level in the stack? 2. Does
| Azure have comparable capabilities for the LinkedIn stack? 3.
| Is LinkedIn worth it to Azure to sell to?
|
| ---
|
| Often times, "at scale", you can support custom solutions
| outside of cloud providers that are purpose-built, and often
| times more resilient and efficient than the cloud providers.
|
| AWS has taken a very interesting approach of building an
| incredibly wide set of solutions to support every customer
| under the sun, and their approach to being "customer
| obsessed" leads to them building super niche solutions if the
| deal is worth it.
|
| I'm not sure how Google and Azure handle these engagements.
| ljm wrote:
| It's not really 'the cloud' as much as it's a managed
| mainframe you allocate resources from. Only it's actually
| quite expensive to allocate resources but it becomes more
| palatable with a monthly bill compared to setting up on-prem.
|
| Costs more money but easier on the cash flow.
| tjpnz wrote:
| Sounds like they would've faced a similar set of issues moving to
| AWS or GCP.
| miguelazo wrote:
| There are plenty of issues with Azure, but LinkedIn is hardly at
| the vanguard of innovation. And that was still the case before
| Microsoft vastly overpaid for it.
| joshhart wrote:
| I left LinkedIn 1.5 years ago. I was there 12 years. I saw the
| revenue & profitability growth that occurred post acquisition.
| I am very very confident LinkedIn would be worth north of $100B
| on public markets today and Microsoft made the acquisition for
| $26B. You might argue that in the subsequent 6 years post
| acquisition that wasn't enough growth and they should have
| bought back shares instead but it was completely a debt
| financed acquisition and very high ROI for Microsoft.
| joshhart wrote:
| This was cancelled over a year ago, which the article notes, so
| it is old news. It was clear the effort would have needed a very
| significant push that would have required a large halt in product
| development and management wasn't willing to stomach it due to
| high growth in 2020/2021. Which made sense. But LinkedIn revenue
| growth has heavily slowed with the pullback in tech hiring and
| they had the space to do it and consider it optimization time.
|
| Also as part of Blueshift the plan was to do batch processing
| first but LinkedIn had a cultural belief in colocation of batch
| compute & storage, which is against the disaggregated storage
| paradigm we see now. IMO this led to some dragging of feet.
|
| Source: Worked at LinkedIn 12 years, am a director at Databricks
| now.
| ThomasMoll wrote:
| Not only that but the Hadoop team literally had the guy who
| wrote the original HDFS whitepaper. Moving a service with that
| much in house expertise first never made sense. I worked on one
| of the original Azure PoCs for Hadoop, even before Blueshift
| and it was immediately clear that we operated at a scale that
| Azure couldn't handle at the time. Our biggest cluster had over
| 500PB and total we had over an exabyte as of 2021 [1]. It was
| exorbitantly expensive to run a similar setup on VMs, and at
| the scale that we had I think it would have taken over 4,000 -
| 5,000 separate Azure Data Lake namespaces to support one of our
| R&D clusters. I believe most of this "make the biggest cluster
| you can" mentality was a holdover from the Yahoo! days.
|
| [1] https://engineering.linkedin.com/blog/2021/the-exabyte-
| club-...
| hasty_pudding wrote:
| As someone who worked at LI.
|
| They spent years and god knows how many millions TRYING to move
| to Azure with the Blueshift project..before pulling the plug.
| They hired armies of contractors.
|
| They didn't stop by choice.
|
| They stopped because their tech stack is a giant over engineered
| unmovable turd.
| bbkane wrote:
| As a current employee, there's things I don't like, but the
| infrastructure is more custom than bad (far better than my last
| job)
___________________________________________________________________
(page generated 2023-12-15 23:01 UTC)