[HN Gopher] AWS Lambda Behind the Scenes
___________________________________________________________________
AWS Lambda Behind the Scenes
Author : garblegarble
Score : 289 points
Date : 2021-07-10 12:33 UTC (10 hours ago)
(HTM) web link (www.bschaatsbergen.com)
(TXT) w3m dump (www.bschaatsbergen.com)
| abarrak wrote:
| A couple of days ago, I tried to search for how AWS operates RDS
| behind the scenes. Since it is a managed stateful service, I was
| wondering whether it runs in the traditional VM-based way or in a
| fully containerized environment. Unfortunately, a simple search
| only leads you to the consumer/customer resources out there.
| rorykoehler wrote:
| Based on how they bill it, it looks like it's running on VMs
| StratusBen wrote:
| Agreed. AWS RDS instance types are just EC2 instance types
| prefixed with "db." and you're choosing either single-AZ or
| multi-AZ deployments so presumably AWS is just spinning up 1
| to 3 EC2 instances with some preconfigured software on them.
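| A minimal sketch of the naming observation above, assuming an
| RDS instance class is simply an EC2 instance type with a "db."
| prefix. The helper name is hypothetical, and not every db.*
| class is guaranteed to have an exact EC2 counterpart.

```python
def ec2_equivalent(db_instance_class: str) -> str:
    """Strip the "db." prefix from an RDS instance class name.

    Assumption: the remainder names the matching EC2 instance type.
    """
    prefix = "db."
    if not db_instance_class.startswith(prefix):
        raise ValueError(f"not an RDS instance class: {db_instance_class!r}")
    return db_instance_class[len(prefix):]


example = ec2_equivalent("db.m5.large")
```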
| navaati wrote:
| From what I know there _is_ a secret sauce beyond a mere
| AMI and a control plane, based on some EBS volumes magic. I
| may be mixing things up with Aurora though.
| fierro wrote:
| This is very, very safe to assume. There is probably a
| hundred engineers' worth of "secret sauce" for an entire
| managed DB product line.
| aeyes wrote:
| There were some comments in the early days that the
| Multi-AZ magic for classic RDS was just drbd on top of
| EBS.
|
| Aurora is a completely different approach where the RDBMS
| code is modified to directly interface with EBS instead
| of going through a traditional OS filesystem layer.
| rrdharan wrote:
| Amazon published a paper describing how Aurora works:
|
| https://www.allthingsdistributed.com/files/p1041-verbitski.p...
| ec109685 wrote:
| This is a good paper that talks about Aurora and provides some
| insight into how RDS operates:
| https://www.allthingsdistributed.com/files/p1041-verbitski.p...
|
| It's nice that AWS builds their own higher level abstractions
| on the same primitives outside developers use. Feels like they
| eat their own dogfood much more than Google where they bypass
| GCP and instead utilize underlying Borg primitives for many
| services.
| dilyevsky wrote:
| Which gcp services run directly on borg? My understanding is
| at least bigtable, cloud sql and other dbs are within
| "hidden" VMs. I think loadbalancers and storage are
| exceptions but same is true for aws (except the classic elb
| probably)
| rrdharan wrote:
| Bigtable, Firestore and Spanner run directly on Borg.
|
| Cloud SQL V2 runs in hidden VMs.
| orf wrote:
| Are those not internal services that pre-date GCP that
| are exposed externally _through_ GCP?
| abarrak wrote:
| Nice. I wonder if the stateful merits provided and marketed
| by container orchestrators (e.g. K8S) are something they will
| consider in the future?
| rejectedandsad wrote:
| To build a new service at Amazon, the general path of least
| resistance these days is to use Lambda. If not Lambda, then
| ECS. If not ECS (or if it requires bare metal) then EC2.
| rejectedandsad wrote:
| This is a new thing really - it used to be that you'd use a
| different system that's in many ways better integrated with
| how the rest of development works but far worse in terms of
| UX and capacity planning, etc. Now many of the tools are
| basically a Frankenstein transformation from the old way into
| the Amazon-specific way AWS is used via the multi-account
| pattern.
| tnolet wrote:
| Great write-up. Besides the technical parts, AWS Lambda probably
| created a ton of new businesses/startups that otherwise would
| have been hard or at least expensive to get going.
| personlurking wrote:
| Aside: Lambda, out of Peru, is apparently the next virus variant
| we need to worry about.
|
| Edit: I had never seen the word before, then I saw it used twice
| in two days, in different contexts. Just thought it was odd.
| adflux wrote:
| Can we agree to leave COVID out of this discussion?
| [deleted]
| minitoar wrote:
| It's not that odd if you consider it is the 11th letter of the
| Greek alphabet.
| Exmoor wrote:
| https://en.wikipedia.org/wiki/Frequency_illusion
| daxfohl wrote:
| One other thing I learned here is that lambda@edge is not
| actually run on the edge at all. It is forwarded to the nearest
| datacenter to execute. Not enough capacity in edges to spin up
| entire VMs for everything, even with Firecracker.
| chews wrote:
| nice writeup for how the magic really works. lambdas rock!
| Something1234 wrote:
| Fantastic paper. So I've been playing with the java and python
| runtimes and it's absolutely stunning how much better python is
| on execution and start up time.
|
| Also, how does an event actually get to the lambda handler?
| Because they can come from all kinds of sources.
| mdaniel wrote:
| I believe they fire up an HTTP server, based on how their local
| executor behaves, and then do "servlet-y" (or WSGI-y) dispatch
| into the entry point method
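| A rough sketch of that dispatch idea, not Lambda's actual
| implementation. It assumes every event, whatever its source, is
| handed to one entry point with the usual (event, context)
| signature. The in-memory queue and run_loop are hypothetical
| stand-ins for whatever delivers invocations to the runtime.

```python
import json
import queue


def handler(event, context):
    # Same (event, context) signature that Python Lambda handlers use.
    source = event.get("source", "unknown")
    return {"statusCode": 200, "body": json.dumps({"handled": source})}


def run_loop(pending, entry_point):
    """Drain queued invocations and pass each event to the entry point."""
    results = []
    while True:
        try:
            request_id, event = pending.get_nowait()
        except queue.Empty:
            break
        # The context object is omitted (None) in this sketch.
        results.append((request_id, entry_point(event, None)))
    return results


# Events from different (made-up) sources share one dispatch path.
pending = queue.Queue()
pending.put(("req-1", {"source": "s3"}))
pending.put(("req-2", {"source": "api-gateway"}))
results = run_loop(pending, handler)
```

| The point is only that the source of the event does not change
| how the handler is called; the event payload carries the
| differences.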
| mlerner wrote:
| If you're interested in Firecracker, I wrote a summary of the
| original paper here:
| https://www.micahlerner.com/2021/06/17/firecracker-lightweig...
| daxfohl wrote:
| Any idea how much it has diverged from crosvm?
| mjb wrote:
| Quite a lot. Initially, a lot of the changes were removing
| things from crosvm, but adding features like snapshots, and
| factoring things out into RustVMM, has made them diverge a
| lot more.
|
| There's some data in the paper about how similar they were
| then, too.
| bschaatsbergen wrote:
| Great article @mlerner
| dr_kretyn wrote:
| Is this write up correct? How do they know that? I don't see any
| references on info source except a talk at re:invent.
| bschaatsbergen wrote:
| Both the re:Invent talks given by Marc Brooker (lead developer
| on the AWS Lambda team), which I mentioned in the footnotes,
| and the official documentation that's out there will provide
| you with a lot of information.
| garblegarble wrote:
| There's a decent references section at the bottom, and having
| watched the talks and briefly scanned the Firecracker paper
| referenced, they do back up the writer.
| dr_kretyn wrote:
| That's a footnote section, and to me the listing is only
| partially related to the text, i.e. the write-up contains a
| lot more detail on multiple components.
|
| Thanks though for backing up this write up. That's +1 for
| confidence.
| dmarinus wrote:
| When I was at re:Invent 2019 I joined some chalk talks which
| weren't recorded (or not published). Some of the hosts shared a
| lot of details about their internal infrastructure.
| chrisweekly wrote:
| This is great! Awesome writeup w/ the kind of details that are
| sometimes opaque and hard to find documentation for. I recently
| deployed a NextJS app using Serverless framework (and
| serverless-nextjs), so Lambda@Edge... looking fwd to playing
| more with compute at CDN edge in general (eg fly.io). Amazing
| how easy it is, esp. as someone who came into webdev in 1998.
| emteycz wrote:
| Considering your long experience, didn't you feel like we lost
| a lot post-PHP? I also stepped out of the PHP world into JS,
| and never understood why there isn't any apache2-modnodejs...
| And to me, the serverless JS movement seems to be just that,
| but with a lot of unnecessary baggage.
| ec109685 wrote:
| Lambda gets you back to the one-request-per-process model
| that made PHP so easy to reason about and kept performance
| flat. With normally deployed JavaScript and single-process
| concurrency, callbacks could all complete at the same time
| and all block waiting for CPU time to complete the request.
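| A small sketch of that effect, using Python's asyncio as a
| stand-in for a single-threaded JavaScript event loop. The
| durations and function names are made up for illustration:
| three requests finish their I/O together, then queue for the
| one CPU, so each later request waits on the ones before it.

```python
import asyncio
import time

IO_WAIT = 0.01   # simulated I/O time per request (seconds)
CPU_WORK = 0.02  # simulated CPU time to finish a request (seconds)


def burn_cpu(seconds):
    # Busy-wait: this blocks the event loop, like CPU-bound callback code.
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        pass


async def handle_request(start):
    await asyncio.sleep(IO_WAIT)  # all requests wait on I/O concurrently
    burn_cpu(CPU_WORK)            # but the CPU part runs one at a time
    return time.perf_counter() - start


async def main(n_requests):
    start = time.perf_counter()
    tasks = [handle_request(start) for _ in range(n_requests)]
    return await asyncio.gather(*tasks)


latencies = asyncio.run(main(3))
```

| With one request per process, each request would pay roughly
| IO_WAIT + CPU_WORK. Here the slowest one pays for the CPU work
| of all three.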
| nuclearnice3 wrote:
| We surely picked up some baggage. Much of it vendor specific.
| But we also jettisoned some? You stick a lambda into API
| gateway and you're on the internet. No servers. No linux
| setup. No apache conf.
|
| I'd encourage you to dive in for 20 hours in pure curiosity
| mode and see what you find.
| throwaway3699 wrote:
| Lambda is a proprietary solution that only works for people
| on AWS. Linux is open, and I just need to put it on a box.
| How do I install the AWS Lambda stack on a standard Linux
| box?
| nuclearnice3 wrote:
| You can't.
|
| Are you actually asking this question?
|
| Or are you pretending to ask a question because you think the
| fact that AWS Lambda runs on AWS is some huge gotcha that
| I never imagined and no one would ever tolerate?
|
| I explicitly note vendor-specific baggage. AWS revenue is
| over 45 billion annually and half of customers use
| lambda.
| throwaway3699 wrote:
| My point is that the industry really hasn't moved on from
| the old LAMP stack if it's been replaced by a single
| company. When it truly comes down to it, the day-to-day
| tools are not ours if they aren't open.
|
| And if deploying a lambda function on my own hardware is
| vastly more complex, then the tools haven't really
| changed, they just got outsourced.
|
| There are a bunch of semi-standards like Serverless
| Framework and Knative, but nothing concrete.
| nuclearnice3 wrote:
| I agree the tools got outsourced.
|
| I also agree there are big chunks of LAMP under the
| covers of running an AWS Lambda. So, in that sense, we
| haven't "moved on" from the old LAMP stack.
|
| I also agree the tools are "not ours" if they aren't
| open. They do useful things. It's a tradeoff.
| beckingz wrote:
| Openstack?
|
| https://www.openstack.org/
| js4ever wrote:
| It's great for a lot of use cases ... But unfortunately
| there are several important points that prevent using this
| combo as a silver bullet.
|
| API Gateway is limited to 29 sec of execution. If you need
| anything longer you will need an EC2 instance (or ECS or
| Fargate) to act as a webserver and call the lambda (up to
| 15 min). CloudFront is also not an option for this common
| use case because it's limited to 180 sec.
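| The limits above can be sketched as a quick lookup. The numbers
| are the ones quoted in this comment (29 sec, 180 sec, 15 min)
| and may not match current quotas; the names are hypothetical.

```python
# Timeouts as quoted in the comment above, in seconds.
LIMITS_SECONDS = {
    "api_gateway": 29,
    "cloudfront": 180,
    "lambda": 15 * 60,
}


def components_that_fit(duration_seconds):
    """Return the components whose quoted timeout covers the given duration."""
    return [
        name
        for name, limit in LIMITS_SECONDS.items()
        if duration_seconds <= limit
    ]


short_job = components_that_fit(20)
medium_job = components_that_fit(60)
long_job = components_that_fit(20 * 60)
```

| A 60 second request already rules out API Gateway, which is why
| a separate webserver in front of the lambda comes up.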
| bschaatsbergen wrote:
| Good to know you enjoyed the read!
| carlosf wrote:
| Really cool post!
|
| From the architecture, it's not really clear to me why Lambdas
| have the 15 min limitation. It seems to me AWS could use the same
| infrastructure to make a product that competes with Google Cloud
| Run. Maybe it's a business thing?
| cloakandswagger wrote:
| I can't think of any reason outside of product positioning.
|
| A lot of the novelty of Lambda is its identity as a function:
| small units of execution run on-demand. A Lambda that can run
| perpetually is made redundant by EC2, and the opinionated time
| limit informs a lot of design.
| ignoramous wrote:
| It may be product positioning, but Lambda really stems from
| AWS's desire to do something about the dismal utilisation
| ratio of their most expensive bill item: Servers [0].
|
| I speculate that 1 min or 15 min workloads are optimal for
| scheduling and running uncorrelated workloads. Any more, and
| returns may diminish?
|
| [0] https://youtu.be/dInADzgCI-s?t=524 (James Hamilton, 2013)
| mdaniel wrote:
| > A Lambda that can run perpetually is made redundant by EC2
|
| Is only conceptually true outside of "EC2 Classic", because
| (to the best of my knowledge) every other EC2 launches into a
| VPC, even if it's the default one for the account per region,
| and even then into the default security group (and one must
| specify the IDs). That may sound like "yeah, yeah" but is a
| level of moving parts that Lambda doesn't require a consumer
| to dive into unless they want to control its networking
| settings
|
| I would think removing the time limit on Lambda would be like
| printing money, since I bet the per-second price for Lambda is
| greater than EC2's
| kolanos wrote:
| This service exists, it's called AWS Fargate [0].
|
| [0]: https://read.iopipe.com/how-far-out-is-aws-fargate-a2409d2f9...
| slumdev wrote:
| This isn't true.
|
| Fargate scales in minutes, not seconds. And it never scales
| to zero.
| simonw wrote:
| Fargate isn't a competitor to Cloud Run (I wish it was)
| because it doesn't scale to zero in between requests and
| scale back up again when new traffic arrives.
| carlosf wrote:
| Oof
|
| Makes sense!
|
| I wish Fargate was easier to use and had a scale to 0
| feature.
|
| If App Runner ends up supporting private deployments then we
| can have a true Cloud Run competitor.
| kolanos wrote:
| > I wish Fargate was easier to use and had a scale to 0
| feature.
|
| Fargate can be scaled to zero. Also, have you tried the
| CLI? [0]
|
| [0]: https://github.com/aws/copilot-cli
| simonw wrote:
| When I say "scale to zero" I mean like Cloud Run or AWS
| Lambda: I define it as the service automatically scaling
| to zero (and hence costing nothing to run) in between
| requests, but automatically starting up again when a new
| request comes in - so the request still gets served, it
| just suffers from a few seconds of cold-start time.
|
| I'm pretty sure Fargate doesn't offer this. It sounds
| like you're talking about the ability to manually (or
| automatically through scripting) turn off your Fargate
| containers, then manually turn them back on again - but
| not in a way that an incoming request still gets served
| even though the container wasn't running when the request
| first arrived.
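| That definition can be sketched as a toy model. This is not how
| Cloud Run or Lambda are implemented; the class, the idle timeout
| and the cold-start figure are all made up for illustration, and
| time is passed in explicitly instead of read from a clock.

```python
class ScaleToZeroService:
    """Toy model: one instance that shuts down after being idle."""

    def __init__(self, idle_timeout, cold_start):
        self.idle_timeout = idle_timeout  # seconds of idleness before scale-down
        self.cold_start = cold_start      # seconds to start from zero
        self.last_served_at = None        # None: nothing has ever run

    def instances_at(self, now):
        # Zero instances (and zero cost) once the idle timeout has passed.
        if self.last_served_at is None:
            return 0
        return 1 if now - self.last_served_at < self.idle_timeout else 0

    def handle(self, now):
        """Serve a request arriving at `now`; return the extra latency paid."""
        penalty = 0.0 if self.instances_at(now) else self.cold_start
        # The request is always served; it may just wait for a cold start.
        self.last_served_at = now + penalty
        return penalty


service = ScaleToZeroService(idle_timeout=60, cold_start=2.0)
first = service.handle(now=0)      # cold: nothing was running
second = service.handle(now=10)    # warm: instance still up
idle = service.instances_at(100)   # scaled to zero after the idle period
third = service.handle(now=100)    # cold again, but still served
```

| The manual on/off approach differs in the last step: a request
| arriving while nothing is running would fail instead of waiting
| out the cold start.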
| simonw wrote:
| This is a great article - I really appreciate when people take
| the time to assemble details from a bunch of different sources
| (Firecracker paper, re:Invent talks) and turn them into a useful
| overview like this.
|
| Clearly Bruno got a lot of the details right: Jeff Barr tweeted
| a link to this a few weeks ago:
| https://twitter.com/jeffbarr/status/1404512248152825857
___________________________________________________________________
(page generated 2021-07-10 23:00 UTC)