[HN Gopher] The many lies about reducing complexity part 2: Cloud
___________________________________________________________________
The many lies about reducing complexity part 2: Cloud
Author : rapnie
Score : 171 points
Date : 2021-01-10 14:20 UTC (8 hours ago)
(HTM) web link (ea.rna.nl)
(TXT) w3m dump (ea.rna.nl)
| ehnto wrote:
| I don't understand what people are building in order to need half
| of this decoupled and managed elsewhere anyway. It wasn't all
| that challenging to self manage it five years ago, what's
| changed?
|
 | My guess is that the average small to medium project has drunk
 | the enterprise coolaid, and they are suffering the configuration
| and complexity nightmares that surround managing cloud
| infrastructure before they really needed to.
|
| As the article is pointing out, you don't forgo managing these
| things by doing it in the cloud, you just manage it inside a
| constantly changing Web UI instead of something likely familiar
| to your developers.
| dkarl wrote:
| I guess it's Kool-Aid? I don't know; I don't remember being
| lied to when I started using cloud services. I think of cloud
| resources as being amazing and basically magical, but I know
| there's a limit to the magic, and the rest is work. People
| using (for example) AWS S3 should not be surprised that they
| still have to work to manage the naming, organization, access
| control, encryption, retention, etc. of their data, and they
| might encounter problems if they try to load a 100GB S3 object
| into a byte array in a container provisioned with 1GB of RAM.
| But they are. I don't know if that's human nature or if they're
| being lied to by consultants and marketers.
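The last point generalizes: a large object should be consumed in bounded chunks rather than read whole into memory. A minimal sketch of the pattern, simulated here with an in-memory stream (with boto3, `s3.get_object(...)['Body']` exposes the same `read(size)` interface, so the same loop applies; the names are illustrative, not a specific API recommendation):

```python
import io

def process_in_chunks(stream, chunk_size=8 * 1024 * 1024):
    """Consume a file-like object in fixed-size chunks, so peak memory
    use is bounded by chunk_size no matter how large the object is."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)  # stand-in for real per-chunk processing
    return total

# Simulate a large object with an in-memory stream.
fake_object = io.BytesIO(b"x" * (10 * 1024 * 1024))
assert process_in_chunks(fake_object, chunk_size=1024 * 1024) == 10 * 1024 * 1024
```

The point is that memory usage is set by `chunk_size`, not by the object, so a 100GB object in a 1GB container is fine.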
| ratww wrote:
 | There are products (Terraform, CloudFormation) that help with
 | managing infrastructure without a UI, but they also add
 | complexity, so the point definitely still stands.
| bob1029 wrote:
| This shared responsibility principle that underlies cloud
| marketing speak sounds a lot like the self-driving mess we find
 | ourselves in today - i.e. the responsibility boundary between
| parties exists in a fog of war and results in more exceptions
| than if one or the other were totally responsible.
|
| We have been a customer of Amazon AWS for ~6 years now, and we
| still really only use ~3 of their products: EC2, Route53 and S3.
| I.e. the actual compute/memory/storage/network capacity, and the
| mapping of the outside world to it. Because we are a software
| company, we write most of our own software. There is no value to
| our customers in us stringing together a pile of someone else's
| products, especially in a way that we cannot guarantee will be
| sustainable for >5 years. We cannot afford to constantly rework
| completed product installations.
|
| We strongly feel that any deeper buy-in with 3rd party technology
| vendors would compromise our agility and put us at their total
| mercy. Where we are currently positioned in the lock-in game, we
| could pull the ripcord and be sitting in a private datacenter
| within a week. All we need to do is move VMs, domain
| registrations and DNS nameservers if we want to kill the AWS
| bill.
|
| I feel for those who are up to their eyeballs in cloud
| infrastructure. Perhaps you made your own bed, but you shouldn't
| have to suffer in it. These are very complex decisions. We didn't
| get it right at first either. Maybe consider pleading with your
| executive management for mercy now. Perhaps you get a shot at a
| complete redo before it all comes crashing down. We certainly
| did. It's amazing what can happen if you have the guts to own up
| to bad choices and start an honest conversation.
|
| I would also be interested to hear the other side of the coin.
| Who out there is using 20+ AWS/Azure/GCP products to back a
| single business app and is having a fantastic time of it?
| mumblemumble wrote:
| I recently inherited a product that was developed from the
| ground up on AWS. It's been a real eye opener.
|
| Yes, it absolutely is locked in, and will never run on anything
| but AWS. That doesn't surprise me. What surprises me is all of
| the unnecessary complexity. It's one big Rube Goldberg
| contraption, stringing together different AWS products, with a
| great deal of "tool in search of problem" syndrome for good
| measure. I am pretty sure that, in at least a few spots, the
| glue code used to plug into Amazon XYZ amounted to a greater
| development and maintenance burden than a homegrown module for
| solving the same problem would have been.
|
| NIH syndrome is certainly not any fun. But IH syndrome seems to
| be no better.
| [deleted]
| maria_weber23 wrote:
| I second that. It's not only that you make yourself completely
| intertwined with a Cloud by using more than fundamental
| services.
|
| The costs of lambda or even DDB are IMMENSE. These only pay off
| for services that have a high return per request. I.e. if you
| get a lot of value out of lambda calls, sure, use them. But for
| anything high-frequency that earns you little to nothing on its
| own, forget about it.
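A rough back-of-envelope makes the high-frequency point concrete. The rates below are assumptions, chosen to be in the ballpark of Lambda's published on-demand pricing (about $0.20 per million requests plus a per-GB-second compute charge); check the current price sheet before relying on them:

```python
def lambda_monthly_cost(requests_per_month, mem_gb, avg_duration_s,
                        per_million_req=0.20, per_gb_second=0.0000167):
    """Back-of-envelope Lambda bill: request charge + compute charge."""
    request_cost = requests_per_month / 1_000_000 * per_million_req
    compute_cost = requests_per_month * mem_gb * avg_duration_s * per_gb_second
    return request_cost + compute_cost

# A high-frequency, low-value endpoint: one billion requests/month,
# 512 MB of memory, 200 ms average duration.
cost = lambda_monthly_cost(1_000_000_000, 0.5, 0.2)
print(round(cost))  # -> 1870
```

At a billion low-value requests a month the bill is already four figures, before counting any downstream services the function calls.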
|
| Generally all your critical infrastructure should be Cloud
| independent. That narrows your choices largely to EC2, SQS,
 | perhaps Kinesis, Route53, and the like. And even there you
| should implement all your features with two clouds, i.e. Azure
| and AWS, just to be sure.
|
| The good news is also the bad news. There are effectively only
| two options: Azure or AWS. Google Cloud is a joke. They
 | arbitrarily change their prices, terminate existing products,
| offer zero support. It's just like we have come to love Google.
| They just don't give a shit about customers. Google only cares
 | about "architecture", i.e. how cool do I feel as an engineer
| having built that service. Customer service is something that
| Google doesn't seem to understand. So think carefully if you
| want to buy into their "product". Google, literally, only
| develops products for their own benefit.
| 0xEFF wrote:
| Google Cloud has quite good support and professional
| services.
|
| I've worked with them for 3 years and can't think of any
| services that have been killed.
|
| They are very customer focused. From my perspective as a
| partner cloud services are more built for customer use cases
| than Google internal use cases. GKE and Anthos for example.
| jbmsf wrote:
| I can't agree, at least not in general.
|
| The optionality of being cloud agnostic comes with a huge
| cost, both because of all the pieces you have to
| build+operate and because of the functionality you have to
| exclude from your systems.
|
| I am sure there are scales where you either have such a large
| engineering budget that you can ignore these costs or where
| decreasing your cloud spend is the only way to scale your
| business. But for the average company, I can't see how
| spending so much on infrastructure (and future optionality)
| pays off, especially when you could spend on product or
| marketing or anything else that has a more direct impact on
| your success.
| jsiepkes wrote:
| > But for the average company, I can't see how spending so
| much on infrastructure (and future optionality) pays off,
| especially when you could spend on product or marketing or
| anything else that has a more direct impact on your
| success.
|
| If you change "average company" to "average startup" then
 | your point makes sense. But for a normal company not
| everything needs to make a direct impact on your success.
| For example guaranteeing long term business continuity is
| an important factor too.
| jbmsf wrote:
| I take your point, but I still don't quite agree.
|
| There are obviously plenty of companies that are willing
| to couple themselves to a single cloud vendor (e.g.
| Netflix with AWS) and plenty of business continuity risks
| that companies don't find cost effective to pursue. Has
 | anyone been as vocal about decoupling from CRM or ERP
| systems as they are with cloud?
|
| My own view is that these kinds of infrastructure
 | projects create as many risks as they solve and happen at
 | least as much because engineers like to solve these kinds
 | of problems as for any other reason.
| nucleardog wrote:
| Unless you're planning for the possibility of AWS
| dropping offline permanently with little to no notice, it
| really feels like you're just paying a huge insurance
| premium. Like any insurance, it's down to whether you
| need insurance or could cover the loss. Whether you'd
| rather incur a smaller ongoing cost to avoid the
| possibility of a large one time loss.
|
| If AWS suddenly raised their prices 10x overnight, it
| would hurt but not be an existential threat for most
| companies. At that point they could invest six months or
| a year into migrating off of AWS.
|
| Rough numbers that would end up costing us like $4m in
| cloud spend and staff if we retasked the entire org to
| accomplishing that for a year.
|
| There's certainly an opportunity cost as well, but I'd
| argue it's not dissimilar to the opportunity cost we'd
| have been paying all along to maintain compatibility with
| multiple clouds.
|
| Obviously it's just conjecture, but my gut says the
| increased velocity of working on a single cloud and using
| existing Amazon services and tools where appropriate has
| made us significantly more than the costs of something
| that may never happen.
| jbmsf wrote:
| Strong agree.
|
| Plus I've seen more than a few efforts at multi-cloud
| that resulted in a strong dependency on all clouds vs the
| ability to switch between them. So not only do you not
| get to use cloud-specific services, you don't really get
| any benefit in terms of decoupling.
| zmmmmm wrote:
| > The optionality of being cloud agnostic comes with a huge
| cost, both because of all the pieces you have to
| build+operate
|
| This sounds like cloud vendor kool aid to me. Nearly every
| cloud vendor product above the infrastructure layer is a
| version of something that exists already in the world. When
| you outsource management of that to your cloud vendor you
| might lose 50% of the need to operationally manage that
| product but about 50% of it is irreducible. You still need
| internal competence in understanding that infrastructure
| and one way or another you're going to develop it over
 | time. But if it's your cloud vendor's proprietary stack then
 | you are investing all your internal learning into non-
 | transferable skills instead of ones that can be
| generalised.
| singron wrote:
| Do you have examples of Google Cloud arbitrarily changing
| prices and terminating products?
|
| Sure they terminate consumer products, and there was a Maps
| price hike, but I'm not aware of anything that's part of
| Cloud.
| ma2rten wrote:
| A very long time ago App engine went out of beta and there
| was a price hike leaving many scrambling. App engine was in
| beta so long that many people didn't think that label meant
| anything.
| yls wrote:
| IIRC they introduced a cluster management fee in GKE.
| miscaccount wrote:
 | not much, 10 cents per hour:
 | https://www.reddit.com/r/kubernetes/comments/fdgblk/google_g...
| jbmsf wrote:
| It's always a trade-off though. You say you write most of your
 | own software, but that's probably not true for, say, your OS or
| programming language, or editors, or a million other things.
| Cloud software is the same; you might not be producing the most
| value if you spend your engineering hours (re)creating
| something you could buy.
|
| In my own experience:
|
| - AWS SNS and SQS are rock solid and provide excellent
| foundations for distributed systems. I know I would struggle to
| create the same level of reliability if I wrote my own publish-
| subscribe primitives and I've played enough with some of the
| open source alternatives to know they require operational costs
| that I don't want to pay.
|
| - I use EC2 some of the time (e.g. when I need GPUs), but I
| prefer to use containers because they offer a superior solution
| for reproducible installation. I tend to use ECS because I
| don't want to take on the complexity of K8S and it offers me
| enough to have reliable, load-balanced services. ECS with
 | Fargate is a great building block for many run-of-the-mill
 | services (e.g. no GPU, no crazy resource usage).
|
| - Lambda is incredibly useful as glue between systems. I use
| Lambda to connect S3, SES, CloudWatch, and SQS to application
| code. I've also gone without Lambda on the SQS side and written
| my framework layers to dispatch messages to application code.
| This has advantages (e.g. finer-grain backoff control) but
| isn't worth it for smaller projects.
|
| - Secrets manager is a nice foundational component. There are
| alternatives out there, but it integrates so well with ECS that
| I rarely consider them.
|
| - RDS is terrific. In a past life, I spent time writing
| database failover logic and it was way too hard to get right
| consistently. I love not having to think about it. Plus
| encryption, backup, and monitoring are all batteries included.
|
| - VPC networking is essential. I've seen too many setups that
| just use the default VPC and run an EC2 instance on a public
| IP. The horror.
|
| - I've recently started to appreciate the value of Step
| Functions. When I write distributed systems, I tend to end up
| with a number of discrete components that each handle one part
| of a problem domain. This works, but creates understandability
| problems. I don't love writing Step Functions using a JSON
| grammar that isn't easy to test locally, but I find that the
| visibility they offer in terms of tracing a workflow is very
| nice.
|
| - CloudFront isn't the best CDN, but it is often good enough. I
| tend to use it for frontend application hosting (along with S3,
| Route53, and ACM).
|
| - CloudWatch is hard to avoid, though I rather dislike it.
| CloudWatch rules are useful for implementing cron-like triggers
| and detecting events in AWS systems, for example knowing
| whether EC2 failed to provision spot capacity.
|
 | - I have mixed feelings about DynamoDB as well. It offers a nice
 | set of primitives and is often easier to start using for small
| projects than something like RDS, but I rarely operate at the
| scales where it's a better solution than something like RDS
| PostgreSQL with all the terrific libraries and frameworks that
| work with it.
|
| - At some scale, you want to segregate AWS resources across
| different accounts, usually with SSO and some level of
| automated provisioning. You can't escape IAM here and Control
| Tower is a pretty nice solution element as well.
|
| I'm not sure if I'm up to 20 services yet, but it's probably
| close enough to answer your question. There are better and
| worse services out there, but you can get a lot of business
| value by making the right trade-offs, both because you get
| something that would be hard to build with the same level of
| reliability and security and because you can spend your time
| writing software that speaks more directly to product needs.
|
| As for "having a fantastic time", YMMV. I am a huge fan of
| Terraform and tend to enjoy developing at that level. The
| solutions I've built provide building blocks for development
| teams who mostly don't have to think about the services.
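The "Lambda as glue" point above can be sketched as a handler that unpacks the standard SQS event shape and dispatches each record to application code (the `handle_message` routing below is illustrative, not a real framework; the `Records`/`body` structure is the documented SQS event format):

```python
import json

def handle_message(payload):
    """Application-level dispatch; stands in for real business logic."""
    return payload.get("action", "unknown")

def lambda_handler(event, context):
    """Entry point for an SQS-triggered Lambda: one event, many records."""
    results = []
    for record in event["Records"]:
        payload = json.loads(record["body"])
        results.append(handle_message(payload))
    return results

# Exercise it locally with a fake SQS event.
fake_event = {"Records": [{"body": json.dumps({"action": "resize"})},
                          {"body": json.dumps({"action": "notify"})}]}
assert lambda_handler(fake_event, None) == ["resize", "notify"]
```

Keeping the event unpacking separate from `handle_message` is what makes it cheap to later swap Lambda for a self-written SQS dispatch loop, as described above.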
| mycall wrote:
| Did you look into multi-cloud solutions like Pulumi or
| Terraform to abstract your cloud vendor?
| jasode wrote:
| _> I would also be interested to hear the other side of the
| coin. Who out there is using 20+ AWS/Azure/GCP products to back
| a single business app and is having a fantastic time of it?_
|
| Netflix uses a lot of AWS higher-level services beyond the
| basics of EC2 + S3. Netflix definitely doesn't restrict its use
| of AWS to only be a "dumb data center". Across various tech
| presentations by Netflix engineers, I count at least 17 AWS
| services they use.
|
| + EC2, S3, RDS, DynamoDB, EMR, ELB, Redshift, Lambda, Kinesis,
| VPC, Route 53, CloudTrail, CloudWatch, SQS, SES, ECS, SimpleDB,
| <probably many more>.
|
| I think we can assume they use 20+ AWS services.
| p_l wrote:
| Certain services IMHO have to be discounted from this list:
|
| - VPC - basic building block for any AWS-based infra that
| isn't ancient
|
| - CloudTrail - only way to get audit logs out of AWS, no
| matter what you feed them into
|
| - CloudWatch - similar with CloudTrail, many things (but not
| all) will log to CloudWatch, and if you use your own log
| infra you'll have to pull from it. Also necessary for
| metrics.
|
| - ELB/ELBv2/NLB/ALB - for many reasons they are often the
| only ways to pull traffic to your services deployed on AWS.
| Yes, you can sometimes do it another way around, but you have
| high chances of feeling the pain.
|
| My personal typical set for AWS is EC2, RDS, all the
| VPC/ELB/NLB/ALB stack, Route53, CloudTrail + CloudWatch. S3
| and RDS as needed, as both are easily moved elsewhere.
| tidepod12 wrote:
| I don't think you can discount them like that. Maybe they
| aren't as front of mind as services like S3, EC2, etc, but
| if you were to try to rebuild your setup in a personal data
| center, replacing the capabilities of VPC, IAM, CloudTrail,
| NAT gateways, ELBs, KMS etc would be a huge effort on your
| part. The fact that they are "basic building blocks" makes
| them more important, not less. In a discussion about the
| complexity of cloud providers versus other setups, that
| seems especially relevant.
| p_l wrote:
| Oh, I meant it more in terms of "can you count on them as
 | _optional_ services".
|
 | Because they aren't optional, and yes, it takes a non-trivial
 | amount of effort to replicate them... but funnily enough,
| several of them have to be replicated elsewhere too.
|
| NAT gateways usually aren't an issue, KMS for many places
| can be done relatively quickly with Hashicorp Vault.
|
| IAM is a weird case, because unless you're building a
| cloud for others to use it's not necessarily that
| important, meanwhile your own authorization framework is
 | necessary even on AWS because you can't just piggyback
| on IAM (I wish I could).
| fiddlerwoaroof wrote:
| I mostly agree, although ECS with Fargate is often nicer to
| use than EC2
| [deleted]
| notretarded wrote:
| I'm in two minds about this (deeper integration with a
| particular vendor - i.e. "serverless")
|
| Reduced time to market is incredibly valuable. Current client
| base is well in its millions. Ability to test to few and roll
| out to many instantly is invaluable. You no longer have to hire
| competent software developers who understand all patterns and
| practices to make scalable code and infrastructure. Just need
| them to work on a particular unit or function.
|
 | The thing which scares me is, some of these companies are
 | decades old, some even centuries. How long have AWS/GCP/Azure
 | abstractions been around? And how quick are we to graveyard
 | some of these platforms? Quite. A lot quicker than you can
 | lift, shift and rewrite your solution to elsewhere.
| rvanmil wrote:
| We carefully select and use PaaS and managed cloud services to
| construct our infrastructure with. This allows us to maximize
| our focus on what our customers are paying for: creating
| software for them which will typically be in use for 5+ years.
| We spend close to zero time on infrastructure maintenance and
| management, we pay others to do this for us, cheaper and more
| reliable. Having to swap out one service for another hasn't
| given us any trouble or unreasonable costs yet in the past 5
 | years. Contrary to what the article tries to convince us of,
 | it has _massively_ reduced complexity for us.
| [deleted]
| bird_monster wrote:
| > There is no value to our customers in us stringing together a
| pile of someone else's products
|
| Maybe not your business, but there are many businesses in which
| this is exactly what happens. Any managed-service is just
| combining other people's work into a "product" that gets sold
| to customers. And that's great! AWS has a staggering amount of
| products, and lots of business don't even want to have to care
| about AWS.
|
| > Who out there is using 20+ AWS/Azure/GCP products to back a
| single business app and is having a fantastic time of it?
|
| Several times. I think cloud products are just tools to get you
| further along in your business. Most of the tools I use are
| distributed systems tools, because I don't want to have to own
| them, and container runtimes/datastores. Every single thing
| I've ever deployed across AWS/Azure is used as a generic
| interface that could be replaced relatively easily if
| necessary, and I've used Terraform to manage my infrastructure
| creation/deployment process, so that I can swap resources in
| and out without having to change tech.
|
| If, for some reason, Azure Event Hub stopped providing what we
| needed it for, we could certainly deploy a customized Kafka
| implementation and have the rest of our code not really know or
| care, but from when we set out to build our products, that has
| always been a "If we need to" problem, and we've never needed
| to.
| g9yuayon wrote:
| So your company cautiously chooses which services in AWS to
| use, and sticks to infrastructure offerings for now. Netflix
| called it "paved path", and it worked really well too for
| Netflix. Over the years, though, the "paved path" expanded and
| extended to more services. It's worth noting that EC2 alone is
| a huge productivity booster, bar none. Nothing beats setting up
| a cluster of machines, with a few clicks, that auto scales per
| dynamic scaling policies. In contrast, Uber couldn't do this
 | for at least 5 years, and their Docker-based cluster system is
 | crippled by not supporting the damn persistent volumes. God
| knows how much productivity was lost because of the bogus
| reasons Uber had for not going to cloud.
| bane wrote:
| I've worked with a number of teams over the last few years who
| use AWS and I'd say from top to bottom they all build their
| strategy more or less the same way:
|
| 0. Whatever is the minimum needed to get a VPC stood up.
|
| 1. EC2 as 90%+ of whatever they're doing
|
| 2. S3 for storing lots of stuff and/or crossing VPC boundaries
| for data ingress/egress (like seriously, S3 seems to be used
| more as an alternative to SFTP than for anything else). This
| makes up usually the rest of the thinking.
|
| 3. _Maybe_ one other technology that 's usually from the set of
| {Lambda, Batch, Redshift, SQS} but _rarely_ any combination of
| two or more of those.
|
| And that's it. I know there are teams that go all in. But for
 | the dozen or so teams I've personally interacted with this is it.
| The rest of the stack is usually something stuffed into an EC2
| instance instead of using an AWS version and it comes down to
| one thing: the difficulties in estimating pricing for those
| pieces. EC2 instances are drop-dead simple to price estimate
| forward 6 months, 12 months or longer.
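That forecasting gap is easy to see in code: an EC2 estimate is a single multiplication with no usage guessing involved. The hourly rate below is an illustrative placeholder, not an actual AWS quote:

```python
def ec2_forecast(hourly_rate, instances, months):
    """Flat-rate forecast: rate x fleet size x hours. No usage guessing."""
    hours_per_month = 730  # common monthly-hours averaging convention
    return hourly_rate * instances * hours_per_month * months

# Four instances at an assumed $0.17/hour, budgeted 12 months out:
print(round(ec2_forecast(0.17, 4, 12)))  # -> 5957
```

There is nothing in the inputs that depends on how the software behaves, which is exactly why teams can put the number in a yearly budget request.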
|
| Amazon is probably leaving billions on the table every year
| because nobody can figure out how to price things so their
 | department can make their yearly budget requests. The one time
 | somebody tries to use some managed service, it goes over budget
 | by 3000%, and the after-action review figures out that it would
 | have been within the budget by using <open source technology> in
 | EC2, so they just do that instead -- even though it increases the
 | staff cost and maintenance complexity.
|
| In fact just this past week a team was looking at using
| SageMaker in an effort to go all "cloud native", took one look
| at the pricing sheet and noped right back to Jupyter and
 | scikit-learn in a few EC2 instances.
|
| An entire different group I'm working with is evaluating cloud
| management tools and most of them just simplify provisioning
| EC2 instances and tracking instance costs. They really don't do
| much for tracking costs from almost any of the other services.
| hyperdimension wrote:
| I'm not very familiar with AWS or The Cloud, but I'm having
| trouble understanding what you said about Amazon leaving
| money on the table by not directing customers toward
 | specific-purpose services as opposed to EC2.
 |
 | Wouldn't whatever managed service _have_ to be cheaper than
 | some equivalent service running on an EC2 VM (for AWS to make
 | a profit, anyway)?
|
| I get the concerns re: pricing and predictability, but it
| still seems like more $$$ for AWS.
| bane wrote:
| Yeah good question. Sibling comments to this one explain it
| well, but basically AWS managed services come at a premium
| price over some equivalent running in just EC2. (Some
| services in fact _do_ charge you for EC2 time + the service
| + storage etc.)
|
| "Managed" usually means "pay us more in exchange for less
| work on your part". This is usually pitched as a way to
| reduce admin/infrastructure/devops type staff and the
| overhead that goes along with having those people on the
| payroll.
| nucleardog wrote:
| No, usually the managed services are a premium over the
| bare hardware. When you use RDS for example, you're paying
| for the compute resources but also paying for the extra
| functionality they provide and their management and
| maintenance they're doing for you. You can run your own
| Postgres database, or you can pay the premium for Aurora on
| RDS and get a multi-region setup with point in time restore
| and one-click scaling and automatically managed storage
| size and automatic patching and monitoring integrated into
| AWS monitoring tools and...
|
 | They're leaving money on the table because instead of using
| "Amazon Managed $X" potentially at a premium or paying a
| similar amount but in a way where AWS can provide the
| service with fewer compute resources than you or I would
| need because of their scale and thus more profitably,
| people look and see they'll be paying $0.10/1000 requests
| and $0.05/1gb of data processed in a query and $0.10/gb for
| bandwidth for any transfer that leaves the region and...
| people just give up and go "I have no idea what that will
| cost or whether I can afford it, but this EC2 instance is
| $150/mo, I can afford that."
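A sketch of that asymmetry, with all rates as illustrative placeholders rather than actual AWS prices: the usage-priced bill moves 10x when the traffic forecast is 10x wrong, while the flat instance bill stays put.

```python
def managed_monthly(requests, gb_queried, gb_egress,
                    per_1k_req=0.10, per_gb_query=0.05, per_gb_egress=0.10):
    """Usage-priced service: every term is a forecast you can get wrong."""
    return (requests / 1000 * per_1k_req
            + gb_queried * per_gb_query
            + gb_egress * per_gb_egress)

EC2_FLAT = 150.0  # "this EC2 instance is $150/mo, I can afford that"

# A 10x traffic misestimate moves the managed bill 10x; the flat bill, 0x.
low = managed_monthly(requests=100_000, gb_queried=200, gb_egress=50)
high = managed_monthly(requests=1_000_000, gb_queried=2000, gb_egress=500)
print(low, high, EC2_FLAT)
```

The flat number is the one that survives a budget meeting, even when the expected value of the usage-priced option is lower.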
| glogla wrote:
| For example:
|
| Managed Airflow Scheduler on AWS with "large" size costs
| $0.99/hour, or $8,672/year per instance. That's ~ $17,500
| considering Airflow for at least non-prod and prod
| instances.
|
| Building it on your own on same size EC2 instance would
| cost $3,363/year for the EC2. Times two for two
| environments, let's say $6,700. $4,000 if you prepay the
| instance.
|
| That looks way cheaper, but then you have to do the
| engineering and the operational support yourself.
|
 | If you consider just the engineering and assume an engineer
 | costs $50/hour and estimate this to an initial three weeks of
| work and then 2.5 days / month for support (upgrades,
| tuning, ...) that's extra $4,000 upfront and $1,000/month.
|
| So on AWS you're at $17,500/year and on-prem you're at best
| $20,000 first year and $16,000 next years.
|
 | So AWS only comes out a bit more expensive - but the math
 | is tricky in several parts:
|
| - maybe you need 4 environments deployed instead of 2,
| which is more for AWS but not much more for engineering?
|
| - maybe there's less sustaining cost because you're ok with
| upgrading Airflow only once a quarter?
|
| - you probably already pay the engineers, so it's not an
 | extra _money_ cost, it's the extra cost of them not working on
| other stuff - different boxes and budgets
|
 | - maybe you're in a part of the world where a good devops
 | engineer doesn't cost $50/hour but $15/hour
|
| - I'm ignoring cost of operational support, which can be a
| lot for on-prem if you need 24/7
|
| - maybe you need 12+ Airflow instances thanks to your
| fragmented / federated IT and can share the engineering
| cost
|
| - etc, etc.
|
| So I think what OP was saying is that if AWS priced Managed
 | Airflow at $0.5 per hour, it would be a no-brainer to use
 | instead of building your own. The way it is, some customers
 | will surely run their own Airflow instead, because the math
 | favors it.
|
| Does that make sense?
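For reference, the arithmetic above laid out in one place (all figures are the ones used in this comment, not AWS list prices):

```python
HOURS_PER_YEAR = 8760

# Managed: $0.99/hour per "large" environment, times prod + non-prod.
managed_total = 2 * 0.99 * HOURS_PER_YEAR  # ~ $17,345/year

# Self-hosted, per the estimates above: two prepaid EC2 environments
# ($4,000), ~$4,000 of upfront engineering, $1,000/month of support.
ec2_prepaid = 4000
setup_once = 4000
support_year = 1000 * 12

first_year = ec2_prepaid + setup_once + support_year  # -> 20000
next_years = ec2_prepaid + support_year               # -> 16000
print(round(managed_total), first_year, next_years)
```

So the managed price sits between the first-year and steady-state self-hosted costs, which is exactly why the decision is sensitive to every assumption in the list above.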
| bird_monster wrote:
| > That looks way cheaper, but then you have to do the
| engineering and the operational support yourself.
|
| In my experience, this is the piece that engineers rarely
| realize and that is actually one of the biggest factors
| in evaluating cloud providers vs. home-rolled. Especially
| if you're a small company, engineering time (really any
| employee time) is _insanely valuable_. Valuable such that
| even if Airflow is cash-expensive, if using it allows
| your engineers to focus on building whatever makes _your
| business successful_, it is usually a much better idea to
| just use Airflow and keep moving. Clients usually will
| not care about whether you implemented your own version
| of an AWS product (unless that's your company's specific
| business). Clients will care about the features you ship.
| If you spent a ton of time re-inventing Airflow to save
| some cost, but then go bankrupt before you ever ship,
| rolling your own Airflow implementation clearly didn't
| save you anything.
| glogla wrote:
| I agree.
|
| The only caveat is that this goes for founders or
| engineers who are financially tied with the company
| success. If the engineer just collects paycheck, they
| might prioritize fun - and I feel that might be behind a
| lot of the "reinventing the wheel" efforts you see in the
| industry.
|
| Or maybe I'm just cynical.
| tpxl wrote:
| We used to have on-prem redis and a devops engineer to
| manage it, then we moved to redis in the cloud and had a
| devops engineer to manage it.
|
| Saying that in the cloud you don't need engineers to
| manage "operational support" is the biggest lie the cloud
| managed to sell.
| mumblemumble wrote:
| It's not just about a straight cost comparison. It's about
| how organizational decision-making works.
|
| The people shopping for products are not spending their own
| money, but they are spending their own time and energy. The
| people approving budgets are not considering all possible
| alternatives, they are only considering the ones that have
| been presented to them by the people doing the shopping.
|
| If the shoppers decide that an option will cost them too
| much in time and irritation, then it may be that the people
| holding the purse-strings are never even made aware that it
| exists. Even if it is the cheapest option.
| gregmac wrote:
| This is a really good summary of the situation, and I'd
| add a bit about risk:
|
 | It's relatively easy to estimate EC2 costs for running
| some random service, because it's literally just a per-
| hour fee times number of instances. If you're wrong, the
| bigger instance size or more instances isn't that much
| more expensive.
|
| For almost every other service, you have to estimate some
| other much more detailed metric: number of http requests,
| bytes per message, etc. When you haven't yet written the
| software, those details can be very fuzzy, making the
| whole estimation process extremely risky - it could be
| cheaper than EC2, it could be 10x more, and we won't
 | really know until we've spent at least a couple months
| writing code. And let's hope we don't pivot or have our
 | customers do anything in a way we're not expecting.
| harikb wrote:
| +1
|
| I bet cloud providers are incentivized not to provide
| detailed billing/usage stats. I remember having to use a 3rd
| party service to analyze our S3 usage.
|
| Infinite scalability is also a curse - we had a case where
| pruning history from an S3 bucket was failing for months and
| we didn't know until the storage bill became significant
| enough to notice. I guess in some ways it is better than
| being woken up in the middle of night but we wasted
 | _millions_ storing useless data.
|
 | Azure also has similar issues - deleting a VM sometimes
 | doesn't clean up dependent resources and it is a mess to find
| and delete later - only because the dependent resources are
| deliberately not named with a matching tag.
| zmmmmm wrote:
| > Infinite scalability is also a curse
|
| People don't like to admit it, but in many circumstances,
| having a service that is escalating to 10x or 100x its
| normal demand go offline is probably the _desirable_
| thing.
| threentaway wrote:
| This seems like you didn't have proper monitoring and
| alerting set up for your job; I'm not sure how that is a
| downside of AWS.
| jjoonathan wrote:
| AWS monitoring (and billing) is garbage because they make
| an extraordinary amount of money on unintentional spend.
|
| "But look at how many monitoring solutions they have in
| the dashboard! Why, just last re:invent they announced 20
| new monitoring features!"
|
| They make a big fuss and show about improving monitoring
| but it's always crippled in some way that makes it easy
| to get wrong and time-consuming or expensive to get
| right.
| milesvp wrote:
| > Infinite scalability is also a curse
|
| This was the key sentence, I think. This type of problem
| actually shows up in other domains as well, queueing
| theory comes immediately to mind. Even the halting
| problem is only a problem with infinite tape, and becomes
| easier with (known?) limited resources.
|
| When you have some parameter that is unbounded, you need
| to add extra checks to bound it yourself to some sane
| value. You are right, in that the parent failed to
| monitor some infrastructure, but if they were in their
| own datacenter, once they filled their NAS, I'm positive
| someone would have noticed, if only because other checks,
| like disk space, are less likely to be forgotten.
|
| Also, getting a huge surprise bill is a downside of any
| option, and the risk needs to be factored into the cost.
| I'm constantly paranoid when working in a cloud
| environment, even doing something as trivial as a
| directory listing from the command line on S3 costs
| money. I had back-and-forths with AWS support just to
| be clear what the order of magnitude of the bill would be
| for a simple cleanup action since there were 2 documented
| ways to do what I needed, and one appeared to be easier,
| yet significantly more expensive.
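| As a rough sketch of both points - bounding an unbounded
| resource yourself, and the metered price of even a directory
| listing - with the per-request price as an illustrative
| assumption:

```python
# Sketch: bounding an "infinite" cloud resource yourself, plus the
# metered price of merely listing it. The $0.005-per-1000-requests
# LIST price is an illustrative assumption; an S3 LIST call returns
# at most 1000 keys per request.

KEYS_PER_LIST = 1000
PRICE_PER_1000_LISTS = 0.005  # assumed dollars

def listing_cost(object_count: int) -> float:
    """Even a 'trivial' directory listing is a metered operation."""
    list_calls = -(-object_count // KEYS_PER_LIST)  # ceiling division
    return list_calls / 1000 * PRICE_PER_1000_LISTS

def within_bound(current_bytes: int, budget_bytes: int) -> bool:
    """The sanity check a full NAS used to do for free: flag usage
    that crosses a self-chosen limit, instead of growing forever."""
    return current_bytes <= budget_bytes

# A bucket whose pruning silently failed and grew to a billion objects:
print(f"${listing_cost(1_000_000_000):.2f} just to list the bucket once")
print(within_bound(current_bytes=50 * 10**12, budget_bytes=10**12))
```

| The point is not the dollar figure but the shape of the
| check: the bound has to be chosen and enforced by you,
| because the platform will happily keep scaling and billing.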
| bird_monster wrote:
| +1, but with a container tool (Fargate/ECS, Azure Container
| Instances) instead of EC2.
| nelsonenzo wrote:
| This person has never worked in a data center. He thinks he's
| managing a network because he sets a few VPC IPs; that's an
| itty-bitty fragment of networking, and the cloud has indeed
| removed a great deal you previously had to manage on-prem.
| zoomablemind wrote:
| Subjectively, it increasingly feels that while the complexity has
| been increasing, the notion of longevity of the underlying
| products and services has been degrading.
|
| While updates to software were expected, the general outlook
| was that they would not break core features. The emphasis on
| backwards compatibility was in a way an assurance to
| businesses that building their operations on a vendor's
| products was not risky. Even then, some mission-critical
| elements would be defensively abstracted to avoid dependency
| risks (at least theoretically...)
|
| Now, we all witness the "eternal-beta" paradigm across most
| of the major software products. Frequent builds with automatic
| updates, when new features could be suddenly pushed, and old
| features removed.
|
| Sure, it's still possible to spec out a "hard-rock" steady
| platform, postpone updates, abstract dependencies, and just
| focus on business. But such an approach won't be approved,
| as it's widely acknowledged that the presence of critical
| bugs is rather a "feature" of all software. Postponing
| updates is not prudent; it's a liability.
|
| So the rock-solid expectations are just an illusion or perhaps a
| fantasy promoted widely, just to get the foot in the door.
|
| Ironically, the most stable elements are the so much dreaded
| "legacy", too often in charge of the business-critical logic.
| s3tz wrote:
| Anyone manage to find part 1? It's not on their site, can't seem
| to find it.
| jvanderbot wrote:
| It's linked in the article https://ea.rna.nl/2016/01/10/a-tale-
| of-application-rationali...
|
| "This was actually part 1 of this story: A tale of application
| rationalisation (not)."
| [deleted]
| CalChris wrote:
| Isn't this just _Jevons' paradox_ applied to software?
| when technological progress ... increases the efficiency with
| which a resource is used ..., but the rate of consumption of
| that resource rises due to increasing demand [1]
|
| [1] https://en.wikipedia.org/wiki/Jevons_paradox
| ChicagoDave wrote:
| Reducing complexity should never be about platform (on-prem vs
| cloud).
|
| It should be about constructing software in partnership with the
| business and reducing complexity with modeled boundaries.
|
| You can leverage the cloud to do some interesting things, but
| the true benefit is in _what_ you construct, not _how_.
| sanp wrote:
| There is an element of _how_ as well. You could create simple
| monoliths or overengineered microservices. Or, complex
| monoliths with heavy coupling vs cleanly designed microservices
| with clear separations of concern.
| the-smug-one wrote:
| Are microservices meant to separate data too? As in, each
| service has its own database.
|
| Wouldn't that lead to non-normalisation of the data or a lot
| of expensive network lookups to get what I want/need?
|
| What is the point of micro services anyway :-)?
| kqr wrote:
| > Are microservices meant to separate data too? As in, each
| service has its own database.
|
| Yes.
|
| > Wouldn't that lead to non-normalisation of the data
|
| Yes. But it's not as bad as it sounds. That is how data on
| paper used to work, after all.
|
| Business rules (at least ones that have been around for
| more than 5--10 years) are written with intensely non-
| normalised data in mind.
|
| Business people tend to be fine with eventual consistency
| on the scale of hours or even days.
|
| Non-normalised data also makes total data corruption
| harder, and forensics in the case of bugs easier, in some
| ways: you find an unexpected value somewhere? Check the
| other versions that ought to exist and you can probably
| retrace at what point it got weird.
|
| The whole idea of consistent and fully normalised data is,
| historically speaking, a very recent innovation, and I'm
| not convinced it will last long in the real world. I think
| this is a brief moment in history when our software is
| primitive enough, yet optimistic enough, to even consider
| that type of data storage.
|
| And come on, it's not like the complete consistency of the
| data is worth _that_ many dollars in most cases, if we
| actually bother to compute the cost.
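| A minimal sketch of that forensics idea, with hypothetical
| service names and fields: each service holds its own
| denormalised copy of a record, and a cross-check flags where
| a value diverged.

```python
# Cross-checking denormalised copies of the same record, as kept by
# hypothetical services, to retrace where a value "got weird".

def find_divergences(copies: dict) -> dict:
    """For each field, collect the value each service holds; fields
    with more than one distinct value are the suspects."""
    fields = {}
    for service, record in copies.items():
        for field, value in record.items():
            fields.setdefault(field, {})[service] = value
    return {
        field: per_service
        for field, per_service in fields.items()
        if len(set(per_service.values())) > 1
    }

# Three hypothetical services each hold their own copy of customer 42.
copies = {
    "billing":  {"customer_id": 42, "email": "a@example.com"},
    "shipping": {"customer_id": 42, "email": "a@example.com"},
    "crm":      {"customer_id": 42, "email": "b@example.com"},  # drifted
}
print(find_divergences(copies))
```

| The redundant copies that normalisation would forbid are
| exactly what lets you see which service drifted and when.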
| ratww wrote:
| _> Are microservices meant to separate data too? As in,
| each service has its own database._
|
| Ideally yes, to scale.
|
| Sometimes you have a service with obvious and easy-to-split
| boundaries, and microservices are a breeze.
|
| Some things that are easy to turn into microservices: "API
| Wrapper" to a complex and messy third-party API. Logging
| and data collection. Sending emails/messages. User
| authentication. Search. Anything in your app that could
| become another app.
|
| However, when your data model is tightly coupled, you need
| to choose between tradeoffs: accepting data duplication,
| having bigger services, or even keeping it as a monolith.
|
| Btw, if you don't care about scalability, sharing a
| database is still not the best idea. But you can have
| a microservice that wraps the database in a service, for
| example. Tools like Hasura can be used for that.
| NicoJuicy wrote:
| Microservices are a solution to an organisational problem
| (multiple employees in one project), not a technical one.
|
| If you're flying solo, just use DDD, for example. It will
| give you the same patterns without the devops complexity.
| mumblemumble wrote:
| I honestly believe that hiding complexity behind a closed door
| does not eliminate it. However, a lot of software and service
| vendors have a vested interest in convincing people otherwise.
| And, historically, they've had all sorts of great platforms for
| doing so. Who doesn't enjoy a free day out of the office, with
| lunch provided?
|
| It's also much easier to hide complexity than it is to remove
| it. One can be accomplished with (relatively) turnkey solutions,
| generally without ever having to leave the comfort of your
| computer. Whereas the other generally requires long hours
| standing in front of a chalkboard and scratching your head.
| SpicyLemonZest wrote:
| On the other hand, hiding complexity behind closed doors can
| be a very valuable thing, if it lets you keep track of who
| knows about the complexity behind each one. I can't count the
| number of issues I've encountered that would have taken
| minutes instead of hours if only I'd known which specific
| experts I needed to talk to.
| mumblemumble wrote:
| Agreed. Though, that too comes at a cost, so I don't want
| to do it except when it's worth it.
|
| http://yosefk.com/blog/redundancy-vs-dependencies-which-
| is-w...
| candiddevmike wrote:
| I see a lot of mentions in the comments about just using the
| basic storage/networking/compute from AWS/AZ/GCP--if that's all
| you're using, you should really consider other providers. Linode,
| Digital Ocean, and Vultr will be far more competitive and offer
| faster machines, cheaper, and with better bandwidth pricing.
|
| The point of using AWS/AZ/GCP is to leverage their managed
| service portfolio and be locked in. If you aren't doing that,
| there are better companies that want your business and will treat
| you much better.
| dilyevsky wrote:
| There's also Packet (now Equinix Metal), which gives control
| over L2 and has nice things such as iBGP. I think Vultr may
| too, but their docs are poor and support was uncooperative.
| ithrow wrote:
| IME, the network of AWS is much better than that of DO or
| Linode.
| dilyevsky wrote:
| How so?
| ithrow wrote:
| Fewer hiccups and less downtime. It's faster, with better
| latency to other third-party services. Superior internal
| control. E.g., in Linode a private IP address gives EVERYONE
| in the same data center access to your Linode server. Also,
| last time I used them they didn't have a firewall.
| dilyevsky wrote:
| Right, Linode is basically old-school dedicated servers
| AFAIK, but DO should be in a different class.
| dijit wrote:
| AWS networking isn't _great_ but it's decidedly better
| than DO (which is actually the worst of those listed
| based on my own TCP connection tests).
|
| Linode is pretty stable if not very exciting, Vultr is
| "better than DO", but their networks are almost always in
| maintenance.
|
| For a little context: I maintain IRC servers, and those
| are currently hosted on Vultr (with linked nodes in 5
| regions). I notice ping spikes between those nodes often,
| and sometimes congestion which drops users. (IRC relies on
| highly stateful TCP sessions.)
|
| I've only known two truly good networking suppliers: GCP
| (and their magic BGP<->PoP<->dark fibre networks) and
| Tilaa (which is only hosted in the Netherlands, which is
| why I can't use them for my global network).
| dilyevsky wrote:
| Awesome, thanks for the info. For GCP I notice occasional
| unavailability on the order of tens of minutes every quarter
| or so. That's VM networking. Their load balancers are a
| different story, as they are complete crap.
| dijit wrote:
| This is very much not my experience, do you have any more
| information?
|
| Any particular regions? Are you certain it's not a local
| ISP?
|
| (I used to run an always online video game and we had a
| LOOOOOT of connection issues from "Spectrum internet" on
| all of our servers including GCP ones.)
| dilyevsky wrote:
| Answering here because the bottom post is locked for some
| reason - east1 occasionally disconnects from other regions.
| That is definitely within the Google backbone. Central-1
| seems worse, though. If it's less than an hour, they don't
| bother with the status page.
|
| For the load balancer it's very much by design, as they
| randomly send you an RST when Google rolls them for an
| upgrade, and in some other cases (I'm working on a blog
| post on this). Google support's recommendation is to retry
| (for real).
| mrkramer wrote:
| Microsoft summarized it nicely [1]:
|
| Advantages of public clouds:
|
| Lower costs
|
| No maintenance
|
| Near-unlimited scalability
|
| High reliability
|
| Advantages of a private cloud:
|
| More flexibility
|
| More control
|
| More scalability (compared to pure on-prem solution)
|
| [1] https://azure.microsoft.com/en-us/overview/what-are-
| private-...
| indymike wrote:
| Hmm... So Azure for unlimited scalability... But private clouds
| have more scalability?
| beaconstudios wrote:
| Presumably both options are written relative to non-cloud
| setups.
| 8K832d7tNmiQ wrote:
| They probably meant that you can just request hundreds of
| servers in another part of the world in a single setup,
| compared to manually building your servers there.
| mrkramer wrote:
| "More scalability -- private clouds often offer more
| scalability compared to on-premises infrastructure."
|
| I think they meant private cloud (renting 3rd party servers
| and using/maintaining your private cloud) vs on-prem
| (buying servers and building your own data centers).
| jvanderbot wrote:
| Part 1 is linked in article
|
| https://ea.rna.nl/2016/01/10/a-tale-of-application-rationali...
|
| "This was actually part 1 of this story: A tale of application
| rationalisation (not)."
| [deleted]
| skohan wrote:
| I have to say, at my current company we are using Serverless, and
| it really does feel like it reduces complexity. No
| runtime/framework to set up, no uptime monitoring or management
| required on the application layer, and scaling is essentially
| solved for us. I mean you do pay for what you get, but it does
| feel like one of those technologies which really lowers the
| barrier to entry in terms of being able to release a production
| web application. In terms of building an MVP, a single developer
| really can deploy an application without any dev-ops support, and
| it will serve 10 million users if you manage to get that much
| traffic.
|
| I'm sure it's not optimal for every case, but for an awful lot of
| cases it seems pretty darned good, and you can save on dev ops
| hiring.
| ratww wrote:
| I used to be very excited about serverless, and I still have
| high hopes for it.
|
| But for me it ended up replacing the complexity of runtimes
| and frameworks with the complexity of configuring auxiliary
| services like API Gateway, Amazon VPC, etc. We needed to move
| the complexity into some tool that configured the services
| around Lambda, like Terraform or CloudFormation, or at best
| into a framework like Claudia or Serverless.com. Configuring
| it by hand looks fine in tutorials, but is madness: it's
| still complex, and makes it all way too fragile.
|
| There are however some products that make the experience better
| by simplifying configuration, like Vercel and Netlify.
| skohan wrote:
| Yeah, I certainly agree that the complexity doesn't _really_
| go away completely, and sometimes it's much more frustrating
| to have to configure poorly documented services rather than
| just having access to the host OS.
|
| I guess my overall point would be that two of the _hardest_
| things to do in terms of making a production-ready
| application are scaling and security, and Serverless pretty
| much obviates them. So it's not a magic wand, but it does
| take away some of the significant barriers to entry.
| ratww wrote:
| Yes, I agree with that point. I think my point was more
| that Serverless is a good idea, but the current
| implementations are still not good at removing complexity.
| But I can see this easily changing, with open standards
| and such.
| orlovs wrote:
| Well, we just need to admit that running applications, if
| we acknowledge all known risks, is complex. If we blissfully
| ignore risks, as with a LAMP or LEMP stack, it's much
| easier. The main question is whether we need to take most
| of those risks into account when running at a small scale.
| 8note wrote:
| I was expecting writing serverless to be a mess of writing
| configuration, but I've really enjoyed writing CDK for
| CloudFormation. It's super unclear how you're supposed to
| write good CDK code, but I feel like I'm a lot clearer on
| what infrastructure I'm actually using than before, where I
| was relying on stuff set up by someone else ages ago with
| minimal to no documentation
| arendtio wrote:
| I wonder who tells the story that cloud computing has
| something to do with reducing complexity. In my world, cloud
| computing is about scalability and making things as complex
| as they need to be in order to be scalable. This rarely
| means that complexity is being reduced.
| didericis wrote:
| The simplicity of having one cloud-based product rather than
| several native products built for different systems is an
| argument I've heard a lot.
| ratww wrote:
| This is an advantage of the web platform, not exactly related
| to cloud. You can get this advantage with an on-premises web
| product, or with old school hosting.
| didericis wrote:
| Very true, just trying to explain the source of the "cloud
| reduces complexity" argument. There are a number of small
| operations that don't want to manage all their own
| hardware, so cloud and web are conflated, and you get the
| web platform simplicity argument being used to justify a
| cloud platform.
___________________________________________________________________
(page generated 2021-01-10 23:01 UTC)