[HN Gopher] Software Infrastructure 2.0: A Wishlist (2021)
___________________________________________________________________
Software Infrastructure 2.0: A Wishlist (2021)
Author : whoiskatrin
Score : 85 points
Date : 2024-02-27 11:23 UTC (11 hours ago)
(HTM) web link (erikbern.com)
(TXT) w3m dump (erikbern.com)
| zsoltkacsandi wrote:
| The author apparently does not have any experience in building
| systems/infrastructure.
|
| > I can set up a static website in AWS, but it takes 45 steps in
| the console and 12 of them are highly confusing if you never did
| it before
|
| Anything can be confusing/take time if you never did it before.
| Getting productive needs time and practice. If your goal is only
| to set up a static site, AWS is an overkill for it.
|
| > It's sad this is the current state of infrastructure.
|
| It's sad that some people still haven't learned to pick the right
| tool for a problem.
|
| > I could go on, but I won't. I'm dreaming of a world where
| things are truly serverless.
|
| I don't even understand what the author wants here. There is no
| such thing "truly serverless". Your code will be executed by a
| server. Period. Serverless is just a fancy marketing term for
| ephemeral lightweight VMs.
|
| > If I make a change in the AWS console, or if I add a new pod to
| Kubernetes, or whatever, I want that to happen in seconds
|
| The author obviously doesn't have any knowledge about distributed
| systems.
|
| > My deep desire is to make it easy to create ephemeral
| resources. Do you need a database for your test suite? Create it
| in the cloud in a way so that it gets garbage collected once your
| test suite is done.
|
| Fortunately we have Terraform that's made this possible for a
| decade(?).
|
| > Code not configuration
|
| Terraform, Pulumi, countless client libraries for all of the
| cloud providers.
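|
| The "database that gets garbage collected once your test suite is
| done" idea quoted above can be sketched locally. This is only an
| illustration under stated assumptions: it uses a temporary SQLite
| file in place of a cloud database, and `ephemeral_database` and
| `run_example` are hypothetical names, not part of Terraform or any
| cloud SDK.

```python
import contextlib
import os
import sqlite3
import tempfile


@contextlib.contextmanager
def ephemeral_database():
    """Create a throwaway SQLite database and delete it on exit.

    Hypothetical helper: a local stand-in for a cloud database that
    only lives as long as the test suite that uses it.
    """
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)  # sqlite3 opens the file itself
    conn = sqlite3.connect(path)
    try:
        yield conn, path
    finally:
        conn.close()
        os.remove(path)  # the "garbage collection" step


def run_example():
    """Use the database inside the block, then report whether it survived."""
    with ephemeral_database() as (conn, path):
        conn.execute("CREATE TABLE users (name TEXT)")
        conn.execute("INSERT INTO users VALUES ('alice')")
        count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
        existed_during = os.path.exists(path)
    return count, existed_during, os.path.exists(path)
```

| The cleanup lives in the `finally` branch, so the file is removed
| even when a test inside the block raises.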
| samuell wrote:
| > The author apparently does not have any experience in
| building systems/infrastructure.
|
| Well, he built https://modal.com , one of the coolest things
| since sliced mangoes, and before that
| https://github.com/spotify/luigi
| zsoltkacsandi wrote:
| I don't care what he built if he justifies his arguments with
| distorted facts and complains about lack of things that have
| been around for a decade.
| phillipcarter wrote:
| When you make statements like this:
|
| > There is no such thing "truly serverless". Your code will
| be executed by a server. Period.
|
| It indicates that maybe _you_ are the one who's missing
| the point. The author is not saying anything about wanting
| code that magically runs on a server without running on a
| server.
| kitd wrote:
| _There is no such thing "truly serverless". Your code will be
| executed by a server._
|
| This is nit-picky. "Serverless" refers to the "dev", not the
| "ops", and has done for a while.
|
| _Fortunately we have Terraform that's made this possible for a
| decade(?)._
|
| Setting up production-grade DBs in Terraform is easy?
| szszrk wrote:
| If done perpetually - yes.
|
| The author does make some weird arguments and seems to be
| creating an emotional setting for something. Like his own
| product you guys mentioned.
|
| My pods ARE ready in seconds. Wondering why his are not.
| cassianoleal wrote:
| > My pods ARE ready in seconds. Wondering why his are not.
|
| That's what I was thinking too. What kind of underpowered,
| crappy k8s cluster is this person running where pods take
| minutes to spin up?
| zsoltkacsandi wrote:
| > Setting up production-grade DBs in Terraform is easy?
|
| Oh, yes, it is. Setting up the resources is actually the easiest
| part; most of the problems originate from the phenomenon that
| as developers start to use more and more "serverless"
| things, they know less about how the underlying technology
| works, how to use indexes, structure the database, how
| replication or transactions work. Production readiness is not
| just how a resource is configured. It is about how the
| application uses a resource efficiently.
|
| > This is nit-picky. "Serverless" refers to the "dev", not
| the "ops", and has done for a while.
|
| There is no "dev" and "ops" serverless. Your application will
| run on one or multiple CPUs, will use the memory, the disk,
| the network. When you write the application all of these
| matter, memory management, network communication, CPU caches,
| parallel execution, concurrency, disk access. It does not
| matter if you call it serverless, cloud, bare metal, etc. The
| basics are the same.
| jasode wrote:
| _> There is no such thing "truly serverless". Your code
| will be executed by a server. Period._
|
| _> Your application will run on one or multiple CPUs, will
| use the memory, the disk, the network._
|
| But the term "serverless" has never meant _"serverless
| does not run on cpu, does not use any RAM, and does not use
| disk or network."_
|
| You're attempting a clarification for "serverless" that
| nobody needs because reasonable people didn't actually
| think serverless/LambdaFunctions/CloudWorkers/etc defied
| the laws of physics.
|
| "Serverless" from the beginning has always meant not having
| to do "os management/operations" type of tasks in a vm such
| as:
|
|     sudo apt-get update
|     sudo apt-get install <package>
|     [...]
|
| Instead, the cloud vendors created ability to run
| _stateless functions_ which are executed in a "cloud
| runtime". The "dev" focuses the effort on coding the
| stateless functions instead of Linux os housekeeping tasks.
|
| And yes -- to pre-empt the discussion from going around in
| circles... the "cloud's runtime" for stateless functions do
| ultimately run on a "server" which runs on cpu/memory/disk.
| And yes, _"the cloud is just somebody else's computer"_. I
| think we all know that.
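|
| To make "the dev focuses on coding the stateless functions" concrete,
| here is a minimal sketch of such a function. It is shaped like a
| Python handler for a FaaS runtime behind an HTTP proxy; the event
| field `queryStringParameters` and the response keys are assumptions
| about that proxy format, and the greeting logic is purely
| illustrative.

```python
import json


def handler(event, context=None):
    """Stateless function: the response depends only on the input event.

    Nothing is read from or written to local disk or module-level state,
    so the platform is free to run it on any machine, any number of times.
    """
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

| Everything outside this function (OS packages, patching, process
| supervision) is what the comment above calls housekeeping that the
| vendor's runtime takes over.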
| zsoltkacsandi wrote:
| > "Serverless" from the beginning has always meant not
| having to do "os management/operations" type of tasks in
| a vm such as
|
| So you mean that serverless is when someone else types in
| the commands of installing the dependencies of your
| software.
|
| I am genuinely curious, how difficult/expensive is learning
| and issuing these commands on a VM, putting them into a
| packerfile, Dockerfile or ansible playbook, considering
| the whole software development lifecycle?
|
| In your interpretation the serverless is when the person
| who runs these "Linux housekeeping" commands is working
| at AWS (or insert any other provider here) and not at
| your company.
| mike_hearn wrote:
| Serverless/FaaS takes care of the following things that
| you otherwise need to do yourself:
|
| 1. Provisioning VMs and copying the right files up to
| them.
|
| 2. Linking them together behind an HTTP load balancer,
| which itself needs to be on one or more VMs and possibly
| DNS balancing.
|
| 3. Configuring that load balancer to respond on HTTPS
| endpoints and health check backends.
|
| 4. Collecting logs etc to a central place.
|
| 5. Making sure servers restart if they need to for
| versioning or crash reasons.
|
| 6. Shutting it all down and cleaning it up if you stop
| using them.
|
| That's pretty much it. People like it because doing UNIX
| sysadmin work sucks. The usability just isn't very good.
| whoiskatrin wrote:
| I think it's important to understand that this was his opinion
| in 2021. Things have changed since, and hopefully, all these
| solutions are available now.
| zsoltkacsandi wrote:
| TBH, in 2018 I already used the things he was complaining
| about, so my opinion still stands.
| sciurus wrote:
| > I don't even understand what the author wants here. There is
| no such thing "truly serverless"
|
| The author says what they want. It's literally their next
| sentence:
|
| "As in, I don't want to think about future resource needs, I
| just want things to magically handle it."
|
| and they have four bullet points with examples of what this
| means to them earlier.
|
| I think it's fair to argue about the desirability,
| achievability, etc of this. I don't think it's fair to act as
| if the author is just spewing buzzwords without explanation.
| zsoltkacsandi wrote:
| Let's see:
|
| - Why do I have to think about the underlying pool of
| resources? Just maintain it for me.
|
| - I don't ever want to provision anything in advance of load.
|
| - I don't want to pay for idle resources. Just let me pay for
| whatever resources I'm actually using.
|
| - Serverless doesn't mean it's a burstable VM that saves its
| instance state to disk during periods of idle.
|
| This article was written in 2021.
|
| AWS Lambda, introduced in 2014, fulfilled all of the
| requirements in those bullet points that you mentioned.
| Google App Engine is the same; it was introduced in 2008.
|
| So again, this article tells only one thing: that the author
| does not know what he is talking about.
| opentokix wrote:
| He is from Spotify, he doesn't have any experience, full stop.
| evantbyrne wrote:
| I built a CD for AWS (beakerstudio.com). The author is correct
| about everything being super complicated. Tools like Terraform
| help automate changes, but you still have to _learn_ all of the
| strange ways AWS works and juggle configuration requirements
| that are oftentimes so bizarre it makes you wonder if they are
| trying to funnel developers into support plans.
|
| Honestly, the experience of building Beaker Studio made me
| bearish on AWS. They price gouge and the DX is so bad teams
| pretty much need CDs. Once I get the time I want to update
| Beaker Studio so people can deploy to any old Linux box
| instead. Teams deserve so much better than AWS/Google/Azure.
| abi wrote:
| Been looking at a few solutions similar to yours! I'm
| currently on Render and looking to move elsewhere so I can
| have more control and particularly insight into system
| metrics. Do you support zero downtime deploys? It wasn't
| clear to me from your home page.
| evantbyrne wrote:
| Tasks are run on ECS with Fargate. If you set up your server
| with a load balancer, which is required on ECS to point DNS
| to the server, then the load balancer will wait for health
| checks to pass before switching over to the newly deployed
| tasks. ECS with Fargate is reliable in my experience, and
| Beaker Studio uses an alternate installation of itself to
| deploy itself, so everything is dogfooded. A big drawback
| imo is that AWS is expensive and Beaker Studio does not
| attempt to hack its way around their pricing. Right now I'm
| not billing users (within reason) who provide feedback, so
| please feel free to sign up and email me your notes.
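|
| The "wait for health checks to pass before switching over" behaviour
| described above can be sketched as a small decision function. This is
| not how ECS or its load balancer is implemented; `pick_active` and
| the `is_healthy` callback are hypothetical names used only to show
| the rule.

```python
def pick_active(old_tasks, new_tasks, is_healthy):
    """Return the task set that should receive traffic.

    Traffic moves to the newly deployed tasks only when there is at
    least one of them and every one passes its health check; otherwise
    the old tasks keep serving, which is what avoids downtime.
    """
    if new_tasks and all(is_healthy(task) for task in new_tasks):
        return new_tasks
    return old_tasks
```

| A real rollout would repeat this check until a deadline and then
| drain the old tasks; the sketch covers only the switch-over decision.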
| nkohari wrote:
| Just because you disagree with someone doesn't mean they don't
| know what they're talking about.
| lostmsu wrote:
| I would not agree. Author basically describes Google App
| Engine.
| dang wrote:
| Can you please not post in the flamewar style to HN, as you did
| here and elsewhere in this thread? You can make your
| substantive points without that. We're trying for a different
| kind of discussion.
|
| If you wouldn't mind reviewing
| https://news.ycombinator.com/newsguidelines.html and taking the
| intended spirit of the site more to heart, we'd be grateful.
| cheptsov wrote:
| I have a lot of respect for Erik and his work with Modal, which
| I've heard a lot of good feedback about. What Erik says about
| serverless and code over configuration can benefit many users and
| companies. However, I strongly disagree on the main points and
| certainly have a different wishlist for infrastructure. My main
| point would be on that list - open-source and vendor-agnosticism.
|
| Finally, I believe simple configuration can coexist with code.
|
| P.S.: At dstack, we are building an open-source platform to
| manage AI infra - a more lightweight and AI-friendly alternative
| to Kubernetes.
| phrotoma wrote:
| > You know how crappy software is crappy in ways that are so
| blatantly obvious to the user that you wonder why it was
| released?
|
| It has crossed my mind several times recently that I want a word
| to describe this exact state of affairs. Where a thing has a
| defect so blatant that it is evident to any user that the creator
| of the thing has never tried using it.
|
| Eg. an airbnb with no towels in it.
|
| What's the word for this situation?
| JSR_FDED wrote:
| Microsoft Teams
| dkasper wrote:
| Fugazi is my favorite word for it. Also snafu.
| fbergen wrote:
| Yet still people are using it?
|
| Otherwise it's called an MVP and a promise of plugging the
| holes
| crabbone wrote:
| It's overly naive to think that people who use such a tool
| _choose_ to use it.
|
| In many cases it's "you are hired into this job, this is the
| tool we give you, if you don't like the tool, take a hike".
|
| Even more so, a lot of software is developed not to be
| competitive, but to be exclusive. It's a lot easier to be the
| only choice for doing something than trying to compete with a
| different tool. I've seen countless examples of tools
| developed in exactly this paradigm, where the decision to use
| the tool wasn't made by anyone anywhere close to the users of
| the tool (eg. hospital procurement department buying a PACS
| or a large avionics company ordering a custom-made budget-
| management program).
| PaulDavisThe1st wrote:
| A tool with more than one way to use it?
| crabbone wrote:
| I want to expand on this :)
|
| When I have to describe to people who don't work with me my
| interactions with developers (especially of the crappy code
| like that) from a standpoint of someone who represents the QA
| side of things... I describe to them my interactions with my
| five y.o. son:
|
|     Me: How was school?
|     Son: Goooood!
|     Me: Did you behave?
|     Son: Yes!
|     Me: Did the teacher send you into timeout?
|     Son: Yes...
|     Me: So how come? You told me you behaved... What did you do?
|     Son: Played with Ryan!
|     Me: That doesn't seem like a good reason to send you into timeout.
|
| And we go like this until I either discover that he was yelling
| in class or I will never know the reason why he was in
| detention. This is also the pattern of denial I very frequently
| face when talking to the programmers who wrote the crappy code.
| Somewhere in the back of their minds they understand that they
| screwed up, but they will come up with all sorts of concocted
| reasoning to pretend that they either don't understand why the
| product sucks, or they would claim that it cannot be made any
| better, or attack me for not understanding how the product is
| supposed to work etc. The most recent example would be (in
| slight adaptation):
|
|     Me: I discovered that we set PYTHONPATH variable when loading
|         a (Tcl) module.
|     Dev: I see no problems with that.
|     Me: The new feature we are releasing to the users is conda
|         support. Conda will not work (well) when this variable is
|         set.
|     Dev: Did the documentation tell users to load this module?
|     Me: No, but it's obvious that users would like the
|         functionality provided by the module in addition to using
|         conda. They are made to complement each other. Besides,
|         documentation doesn't say they shouldn't.
|     Dev: (summons PM)
|
| And then PM continues in the same spirit as the developer. And,
| my guess is that the reason for it is that nobody really wants
| to work too hard. There's no reward in making a better quality
| product if that quality isn't immediately appreciated. Features
| like latency, throughput, size etc. are immediately visible to
| the user and are an easy sell. Features like internal
| consistency in the face of more sophisticated usage: these
| might never happen, and the user might never know that they
| were protected from their system collapsing on them by a
| substantial development effort. So, commercial companies de-
| prioritize quality. And that's how we get crappy programs.
| Kon-Peki wrote:
| Not knowing anything other than what you wrote, it sounds
| like your organization has leadership problems. People don't
| know why their job exists, they don't know what your
| organization is actually trying to accomplish, how any
| individual person fits into it, why the day-to-day things
| someone does helps, etc.
|
| Nothing anyone does with software will help.
| fhuici wrote:
| > The speed that's not there is setting up infrastructure. If I
| make a change in the AWS console, or if I add a new pod to
| Kubernetes, or whatever, I want that to happen in seconds. I'm
| not asking for milliseconds!
|
| Milliseconds is now possible: https://kraft.cloud/ (e.g., an
| NGINX web server in under 20 millis).
| fbergen wrote:
| But you still have clusters, why not everywhere... ?
| thundergolfer wrote:
| Cool looking website :) Small nit feedback, you say "less
| servers to operate" when it should be "fewer servers to
| operate" because servers are countable.
| fbergen wrote:
| I would love to have what we were sold as a "truly" serverless
| (even though the name doesn't mean no server)
|
| - CloudRun did a good job, but the autoscaling is too slow to not
| pay for idle
|
| - Lambda is great, but I want to run way more complex workloads
| than simple functions
| fbergen wrote:
| Am I asking too much? =P
| throwawaaarrgh wrote:
| The one thing I want, that doesn't exist, and won't for at least
| 10 years: immutable infrastructure.
|
| Oh, the _concept_ exists. I can make _some_ infrastructure
| mostly-immutable, myself. But the cloud doesn't give me it out
| of the box. What the cloud gives me are APIs. If I write software
| to call those APIs, predict what the allowed values are, predict
| the failures I might see, write about 5,000 lines of code to
| handle the failures, attempt to reconcile differences, retry,
| store my artifacts, reference them, after implementing a build
| system, etc, I can get one or two things to be immutable. But for
| the vast majority of services it's actually impossible.
|
| Take an S3 bucket. Can you make an S3 bucket immutable? The
| objects inside it might be versioned, sure. Can you roll back
| _all_ the objects in the bucket to Version 123? Can you roll back
| the S3 policy back to revision 22? Can you make it also roll back
| the CORS rules? Can you diff all these changes and see a log of
| them? Can you tell the bucket to fix itself back to the correct
| expected version of itself? Can you tell it to instead adopt 3
| new changes, as part of a version of the S3 bucket you tested
| somewhere else? The answer is "no".
|
| You can _fake it_, with a configuration management tool like
| Terraform. But that's as immutable as a file on your filesystem.
| Any program can overwrite your files at any time; you have to
| have Puppet configured to monitor your files, and constantly fix
| the files when they get changed, track the Puppet code in Git,
| keep your own log of changes, etc. That filesystem isn't
| immutable, it's _mutable!_ If it was immutable you wouldn't have
| to use Puppet (or Terraform). And the sad thing is we're all
| stuck on Terraform, which is actually terrible for a
| configuration management tool, because it mostly refuses to
| reconcile inconsistencies (the way every other configuration
| management tool in history has). It just bombs out and says _"Oh
| shit, that wasn't a change I planned, and you didn't write this
| HCL code to handle this weird condition, so I'm just gonna bail
| and not fix this. Good luck getting production working again."_
| Puppet wouldn't stop working if something other than Puppet
| updated a file. But nobody seems to mind that we literally
| regressed in functionality, because a company made up new
| marketing terms for their tools.
|
| Sadly this desired built-in immutability, and the declarative
| nature of it, won't be built into S3 or other tools for at least
| a decade or two. They would need to effectively build something
| akin to K8s just to manage their own components immutably and
| expose an entirely new API. So we are doomed to do Configuration
| Management in the cloud, until the cloud starts implementing
| immutability out of the box.
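|
| The difference between "bail on unplanned changes" and "reconcile
| them" can be shown with a toy model. The sketch below treats a
| bucket's configuration as a plain dictionary; `diff_state`,
| `reconcile`, and the keys used in the usage note are hypothetical
| and do not correspond to any S3, Terraform, or Puppet API.

```python
MISSING = object()  # marks a key that is absent on one side


def diff_state(desired, actual):
    """Return {key: (have, want)} for every key where the two differ."""
    changes = {}
    for key in desired.keys() | actual.keys():
        want = desired.get(key, MISSING)
        have = actual.get(key, MISSING)
        if want != have:
            changes[key] = (have, want)
    return changes


def reconcile(desired, actual):
    """Overwrite drift in `actual` so it matches `desired`.

    Unplanned changes are not treated as errors: extra keys are
    removed and modified keys are reset to the declared value.
    Returns the changes that were applied.
    """
    changes = diff_state(desired, actual)
    for key, (_, want) in changes.items():
        if want is MISSING:
            actual.pop(key, None)
        else:
            actual[key] = want
    return changes
```

| With `desired = {"policy": "rev-22"}` and an `actual` that someone
| edited by hand, one call to `reconcile` restores the declared state
| and a second call reports no changes, which is the convergence
| property the comment says it misses.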
| stavros wrote:
| This isn't substantive, but it bugged me:
|
| > I'm not asking for milliseconds! Just please at least get it to
| less than a second.
|
| What do we measure "less than a second" times in?
| yoyohello13 wrote:
| Centiseconds?
| socketcluster wrote:
| I built a serverless SaaS no-code/low-code platform which could
| be of interest: https://saasufy.com/
|
| You can build your entire app inside a plain HTML file which can
| be deployed online with something like GitHub pages.
|
| I've built a few apps with it including a real-time chat app
| which supports both group chat, private 1-on-1 chat with an
| account system (with access control), OAuth via GitHub... The
| entire app is only 260 lines of HTML markup and fully serverless
| (no custom back end code). Access controls are defined via the
| control panel. All the app's code is in this file:
| https://github.com/Saasufy/chat-app/blob/main/index.html
|
| You can try the app here (use the 'Log in with GitHub' link):
| https://saasufy.github.io/chat-app/index.html
|
| Saasufy comes with around 20 generic declarative HTML components
| which can be assembled in complex ways:
| https://github.com/Saasufy/saasufy-components?tab=readme-ov-...
|
| There is a bit of a learning curve to figure out how the
| components work but once you understand it, you can build apps
| very quickly. The chat app only took me a few hours to build.
|
| I've also been helping a friend to build an application related
| to HR with Saasufy and I managed to get the basic search
| functionality working with only 160 lines of HTML markup.
| crabbone wrote:
| If I didn't know better, I'd think I'm reading one of those
| cheesy LinkedIn advertorials... To someone who dedicated their
| professional life to infrastructure all of these wishes read
| mostly irrelevant, with a strong proprietary advertising flavor.
| At every turn of a sentence I expected to find a mention of some
| commercial product this article was going to promote. Well, at
| least it doesn't seem to do that, not openly anyways.
|
| So, here are some thoughts on what seems to be the key points of
| the article:
|
| * I want to go fast.
|
| Well... yeah, sure, why not... but it's not very important. Lots
| of other goals will overshadow this one. Also, if we are talking
| in the context of whatever-as-a-service, there's very little
| incentive to work on the speed aspect as long as it's not taking
| ages.
|
| Also, reducing infrastructure to whatever-as-a-service is
| seriously hollowing the definition. I've been in ops / infra for
| over a decade, and I've barely even touched the as-a-service
| aspect. Also, whenever I do come in contact with it, it's always
| awful, and I want to get away from it as fast as possible. Making
| it go faster won't help that though. The disappointing parts are
| poor documentation, poor support, proprietary tech, overly narrow
| scope etc.
|
| * Testing in production
|
| Why is this even a relevant issue?.. Anyways. OP needs to take a
| trip to the QA department. They obviously don't know why they
| have one. But it's also possible their QA department is worthless
| (ours is...) But having a worthless QA department isn't really
| something to wish for in Infrastructure 2.0. I don't see how this
| is a good goal.
|
| So, the reason why QA department is necessary, and why CI can
| possibly cover only a fraction of what can be / should be done
| with testing is that QA, besides other things, needs to simulate
| plenty of different possible conditions in a controlled environment
| to be able to investigate and to diagnose problems. Most of the
| work of QA is spent on RCA, and then figuring out how to present
| the problem, stripped of all unnecessary components to the
| development team to be able to fix it. It's not possible to do
| good QA w/o an ability to isolate components which calls for
| creation of fake / artificial environments which are not like
| production.
|
| * Calls to unleash the next order of developer productivity
|
| This is such an MBA b/s... Just give it a break.
| mike_hearn wrote:
| It's a sub-component but Oracle Labs has a project to develop
| something like the FaaS platform he's asking for, called GraalOS.
|
| The basic idea is that FaaS is a leaky abstraction because (a)
| lots of runtimes are slow to start up and (b) isolation tech
| isn't good enough. So FaaS services start up VMs and containers
| and then the user's function which might have to do a lot of init
| work, like to load reference data, and because that takes too
| long you have to keep idle capacity around. At that point the
| abstraction is broken.
|
| So there's a two-part fix:
|
| 1. For Java users, the GraalVM native-image tool can pre-
| initialize and pre-compile a JVM app so that it starts up
| instantly (including with pre-loaded reference data).
|
| 2. Change the isolation model so VMs and containers don't need to
| be started up anymore. Containers alone can take hundreds of
| milliseconds to start.
|
| There's also some interesting stuff there that takes advantage of
| Oracle Cloud's more "edgey" nature than other clouds, where it
| has more datacenters than others (but smaller).
|
| The new isolation model works by exploiting new hardware features
| in CPUs that allow for intra-process memory isolation (Intel MPK)
| combined with hardware-enforced control flow integrity. This
| requires compiler support, but GraalVM knows about these features
| and so the cloud can just compile JVM apps to native for you. And
| what about other apps? Well, many languages run on GraalVM via
| Truffle, so those are covered (e.g. JavaScript) and for native
| code you can use a modified LLVM to compile and then do a static
| verification of any user supplied binaries, like NaCL used to do.
|
| If you put those things together then starting user code that's
| already available locally becomes just mmapping a shared library
| into a process, which is extremely fast. It can only exit the
| hardware/software enforced isolate by going via a trampoline
| that's equivalent to a syscall, but without needing an actual
| syscall. The Linux kernel isn't reachable at all.
|
| With that you can have functions that start and stop in
| milliseconds.
| thecleaner wrote:
| These are bad ideas. They are software wishes which no enterprise
| will pay for. Infra is set up once so optimising for setup time
| doesn't do the trick. Rollouts should take time deliberately so
| faulty software doesn't lead to an outage in seconds. No infra
| provider will bother turning off infra as again it can have
| impact on availability. AWS is optimising resource usage anyway
| barring a few services like Cloudwatch
| PaulDavisThe1st wrote:
| Within a few lines of each other in TFA:
|
| > We are, like what, 10 years into the cloud adoption? Most
| companies (at least the ones I talk to) run their stuff in the
| cloud. So why is software still acting as if the cloud doesn't
| exist?
|
| > As in, I don't want to think about future resource needs, I
| just want things to magically handle it.
|
| 'nuff said.
| ljm wrote:
| The cloud is so expensive for most companies that I think that
| a solution architect's insistence on setting up in the cloud by
| default is actually a corporate welfare program where VC funds
| are redirected to Amazon and Google.
|
| That said, it's still not as trivial as using managed SaaS but
| it's still easier than ever to basically spin up your own cloud
| of sorts, using the wealth of open source tech out there. K3S
| on Hetzner can do a pretty solid job for cheap. In that sense,
| the ecosystem around running your own cloud is only improving.
| pnathan wrote:
| One of my basic design philosophies is I learn key things deeply,
| and fit them together, without layers of "make it easy" tools
| that introduce incessant XY problems and integration issues.
|
| If something is "magically" easy, it either is a meaningful
| design/algo revolution or it overpromises the production case
| while showing off the trivial. Most of the time it's #2. Docker
| was #1.
| friedrich_zip wrote:
| I get where he is going with this... but idk. Feels like a
| somewhat mid take. Strong abstractions always means strong vendor
| lock-in and more power to infrastructure providers. But AWS,
| Netlify and whoever runs your apps are not your friends.
| Vertically integrating your infrastructure can be a pretty good
| source of cost reduction and can create interesting assets if you
| have good talent in-house. So idk... sometimes the fact that
| building something takes time and you have to think about how you
| are going to set it up actually is a good thing, because you take
| the time to build it right and you end up understanding how
| everything works together.
| mdaniel wrote:
| (2021) and at the time:
| https://news.ycombinator.com/item?id=26869050
| dang wrote:
| Thanks! Macroexpanded:
|
| _Software Infrastructure 2.0: A Wishlist_ -
| https://news.ycombinator.com/item?id=26869050 - April 2021 (195
| comments)
| 015a wrote:
| Here's something very specific I've been thinking about recently.
|
| I think Google Cloud Cloud Run is obscenely ahead of its time.
| It's a product that's adjacent to so many competitors, yet has no
| direct competitor, and has managed to drive a stake into that
| niche in a way that makes it such a valuable product.
|
| It's serverless, but not "Lambda Serverless" or "Vercel
| Serverless" which forces you to adopt an entirely different
| programming model. It's just docker containers. But it's also not
| serverless in the way Fargate or ACS is "serverless"; it's still
| Scale to Zero.
|
| There's a lot of competition in the managed infrastructure space
| right now (Railway, Render, Fly, Vercel, etc). But I haven't seen
| anyone trying to do what Cloud Run does. Cloud Run has its
| disadvantages (cold starts are bad; it also could be a great fit
| for background workers/queue consumers/etc, but Google hasn't
| added any way to scale replicas beyond incoming HTTP requests
| yet).
|
| But the model is so perfect that I wish more companies would
| explore that space more, rather than retreating to "how things
| have always been done" ("pay us $X/mo to run a process") or
| retreating to the much more boring "custom serverless runtime",
| "your app is now only an 'AWS Lambda app' and can't run anywhere
| else congrats".
| ajcp wrote:
| At the risk of being on the outside here, I'd have to agree.
|
| To go a bit further I'm honestly quite interested(?) in how GCP
| has sought to differentiate itself from the two other providers
| by offering this kind of "plug and play" feel to cloud.
| Certainly there is value to be gained from the absolutely
| granular service offerings of AWS/Azure, but there's a point
| when it starts to feel like all I'm doing is building control
| towers for island landing strips.
|
| I just want my cloud provider's ML service to talk to the data
| lake on the same cloud tenant without having to architect my
| way through 15 network nics, 30 service accounts, and 4 VDI...
| lijok wrote:
| Maybe I'm misunderstanding something but what you're describing
| is what AWS Lambda has been able to do for a long time now. You
| can run an api in a docker container with no Lambda-specific
| code.
| dmattia wrote:
| My understanding is that your docker image must have the
| lambda runtime interface client installed on the image in
| order to work.
|
| It's not a huge step usually to add the RIC, but it's a bit
| more tied in to AWS than CloudRun is, which can run arbitrary
| docker images, if I understand.
| lijok wrote:
| That's right - you have to package awslabs/aws-lambda-web-
| adapter into your docker image which proxies the API-GW/ALB
| requests through.
| maccard wrote:
| Azure has container _instances_ -
| https://azure.microsoft.com/en-gb/products/container-instanc...
|
| DigitalOcean is not wildly far off it either.
|
| ECS + Fargate is the closest AWS has to it, but you need to do
| IAM and Networking to utilise it. If you're in AWS already,
| it's pretty good, albeit with some frustrating limits
| 015a wrote:
| Yup my bad, I meant ACI, not ACS.
|
| Correct me if I'm wrong, but these are actually not close to
| Cloud Run. Cloud Run's differentiator is its scaling metric;
| it scales with incoming requests, and has strict
| configuration to assert that each replica only handle N
| concurrent requests. You could maybe get something like this
| set up on ACI or Fargate, but it'd require stringing together
| five or six different products. You can also definitely wire
| up those to autoscale on CPU%, but (1) this is not scale-to-
| zero, and (2) CPU% kinda sucks as a scaling metric, right?
| Idk I've never been happy with systems that autoscale on
| CPU%.
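|
| The request-based scaling metric described above comes down to a
| short calculation. This is a sketch of the idea only, not Cloud
| Run's actual autoscaler; `desired_replicas` and its parameters are
| hypothetical names.

```python
import math


def desired_replicas(in_flight_requests, max_concurrency, max_replicas):
    """Replica count when each replica may handle at most
    `max_concurrency` concurrent requests.

    Zero in-flight requests means zero replicas (scale to zero);
    otherwise round up so no replica exceeds its concurrency limit,
    capped at `max_replicas`.
    """
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    if in_flight_requests <= 0:
        return 0
    needed = math.ceil(in_flight_requests / max_concurrency)
    return min(max_replicas, needed)
```

| Unlike a CPU% target, the input here is a direct count of work
| waiting to be served, so an idle service scales to zero without any
| extra rule.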
| jiggawatts wrote:
| Azure has mostly implemented this now. ACI was a single
| instance, but they have scalable Container Apps now. These
| are just a dumbed down abstraction hiding a managed
| Kubernetes cluster beneath that you never interact with
| directly.
| kastden wrote:
| Container Instances is bad though, and you'll regret using
| it. There is Azure Container Apps but it requires some more
| setup in advance.
| mrkurt wrote:
| (I am biased because I work on Fly.io)
|
| Fly Machines are more powerful than Google Cloud Run IMO. You
| can treat them like cloud run, or manage them directly and
| implement your own Serverless model.
|
| Our PaaS orchestration is implemented entirely in the client
| CLI, and it manages Fly Machines directly:
| https://fly.io/docs/machines/
| swyx wrote:
| we recently interviewed Erik and touched on this list:
| https://www.latent.space/p/modal
|
| and how Modal exemplifies a lot of the ideas he's been looking
| for. check it out incl our show notes!
| shayarma wrote:
| So true. why are we still paying for idle resources in 2024?
| sethkim wrote:
| What's cool is that Erik actually acted on these complaints.
| Modal is, by far, my favorite developer tool ever and makes me
| hopeful not just for the future of software engineering but the
| entire tech industry.
|
| If you're a naysayer in the comments, I would encourage you to go
| give it an honest try, and consider again why you think infra has
| to be done in harder ways.
| traverseda wrote:
| Alright, how do I search for "Modal"?
| elyall wrote:
| https://modal.com/
___________________________________________________________________
(page generated 2024-02-27 23:01 UTC)