[HN Gopher] AWS Lambda Web Adapter
       ___________________________________________________________________
        
       AWS Lambda Web Adapter
        
       Author : cebert
       Score  : 77 points
       Date   : 2024-06-22 18:01 UTC (4 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | paulgb wrote:
       | One word of caution about naively porting a regular web app to
       | lambda: since you're charged for duration, if your app does
       | something like make an API call, you're paying for duration while
       | waiting on that API call for each request. If that API breaks and
       | hangs for 30s, and you are used to it being (say) a 300ms round
       | trip, your costs have 100xed.
       | 
        | So lambda pricing scales down to cheaper than a VPS, but it also
       | scales up a lot faster ;)
        
         | mlhpdx wrote:
          | Not untrue, but also not true. It isn't a Lambda problem,
          | it's the problem of running the same old, same old code on
          | lambda.
          | 
          | If you're running clusters of VMs/containers/instances to
          | "scale up" to transient demand but leaving them largely idle
          | (a very, very common outcome), then an event-oriented
          | rewrite is going to save a big hunk of money.
          | 
          | But yeah, if you keep every instance busy all the time, that
          | is definitely cheaper than keeping Lambda busy all the time.
        
           | paulgb wrote:
           | Sure, if you're willing to rewrite for an event-based
            | architecture you can avoid the problem, but the linked
            | project is a framework designed for running on Lambda
            | _without_ that
           | rewrite. One of the stated goals is to have the same
           | container run on EC2 or Fargate.
        
             | mlhpdx wrote:
             | I understand the point. It's one in a long line of such
             | enabling tech -- all leading to the same outcome.
        
             | zoover2020 wrote:
             | A lot of internal web apps with low TPS can be easily
             | ported to Lambda using these adapters. They should be the
             | main use case IMO.
        
               | allannienhuis wrote:
               | A lot of internal web apps with low TPS can easily be run
               | on a $5-30/month vps.
        
               | easton wrote:
                | The minimum cost of running a task on Fargate is under
                | a dollar a month. If you're an AWS company, you can get
                | it cheaper if you take the lock-in (not that there's
                | much lock-in with Fargate).
        
           | parthdesai wrote:
           | > Running clusters of VM/container/instances to "scale up" to
           | transient demand but leaving largely idle (a very, very
           | common outcome) then an event oriented rewrite is going to
           | save a big hunk of money.
           | 
            | If you're using k8s, HPA handles that for you; you just
            | need to define the policies/parameters. And you're talking
            | as if rewriting an entire app doesn't cost money.
        
         | danappelxx wrote:
          | This is an interesting point. Hangs usually cost $ in user
          | experience; with serverless they also cost $ in compute. All
          | the more reason to set strict deadlines on all API calls!
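          | 
          | For example, a minimal sketch of bounding an upstream call
          | from inside a handler (the endpoint is made up; `timeout` is
          | the standard `requests` parameter):
          | 
          |     import requests
          |     
          |     def handler(event, context):
          |         # Cap the upstream wait so a hung dependency can't turn
          |         # a 300ms call into 30s of billed Lambda duration.
          |         resp = requests.get(
          |             "https://api.example.com/v1/quote",  # hypothetical
          |             timeout=(1.0, 2.0),  # (connect, read) seconds
          |         )
          |         return {"statusCode": 200, "body": resp.text}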
        
         | johtso wrote:
         | That's one thing that's really nice about Cloudflare workers,
         | you're billed on CPU time, not duration, so you can happily
         | make slow requests to other services.
        
           | throwAGIway wrote:
           | How does that work? How can they determine my execution time
           | is spent waiting and not processing?
        
             | paulgb wrote:
              | They use their own runtime for V8 isolates, so they have
              | a lot of leeway in how they measure/meter it.
        
             | navaati wrote:
              | At the most basic level, your process is in the D or S
              | state rather than the R (running) state. Cloudflare
              | Workers are not exactly Unix processes, but the same
              | concept applies.
        
             | CaliforniaKarl wrote:
             | When a Linux process is waiting for a socket to have data
             | ready to read (using a system call like select, poll, or
             | epoll), that process is put into a sleep state by the
             | kernel. It continues accruing "wall-clock time" ("wall
             | time") but stops accruing CPU time. When the kernel has
             | data for the socket, the process starts running again.
             | 
             | The above tracking method works for containerized things.
             | For virtualized things, it's different: When a Linux system
             | has nothing to do (all processes sleeping or waiting), the
             | kernel puts the CPU to sleep. The kernel will eventually be
             | woken by an interrupt from something, at which point things
             | continue. In a virtualized environment, the hypervisor can
             | measure how long the entire VM is running or sleeping.
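              | 
              | You can see the split from inside a process with
              | Python's two clocks (a toy illustration; the sleep
              | stands in for waiting on a socket):
              | 
              |     import time
              |     
              |     wall_start = time.perf_counter()
              |     cpu_start = time.process_time()
              |     
              |     time.sleep(2)  # asleep, like waiting on epoll
              |     
              |     print(f"wall: {time.perf_counter() - wall_start:.2f}s")  # ~2.00
              |     print(f"cpu:  {time.process_time() - cpu_start:.2f}s")   # ~0.00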
        
               | throwAGIway wrote:
               | Thanks to you and the others who replied too, very
               | insightful. You put me on the right track to learn more
               | about this, much appreciated.
        
         | chuckadams wrote:
         | Lambdas have a timeout, which defaults to 3 seconds.
        
           | OJFord wrote:
            | And maxes out at 30s (if triggered via API Gateway), so
            | GP's point is totally valid.
        
         | foundart wrote:
         | Interesting. It makes me wonder about designing "lambda-client-
         | friendly APIs".
         | 
          | To reduce the chances of a call being long-running, the API
          | would be async: the client makes a request and quickly gets
          | back a claim code, then the API host processes the request
          | separately and makes a callback when done.
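          | 
          | Roughly this shape (hypothetical endpoints; Flask and an
          | in-memory dict just for illustration, with polling shown
          | instead of a callback):
          | 
          |     import uuid
          |     from flask import Flask, jsonify, request
          |     
          |     app = Flask(__name__)
          |     jobs = {}  # claim code -> result; use a durable store for real
          |     
          |     @app.post("/things")
          |     def submit():
          |         claim = str(uuid.uuid4())
          |         jobs[claim] = {"status": "pending",
          |                        "payload": request.get_json(silent=True)}
          |         # enqueue the actual work here (SQS, etc.), return right away
          |         return jsonify({"claim_code": claim}), 202
          |     
          |     @app.get("/things/<claim>")
          |     def poll(claim):
          |         return jsonify(jobs.get(claim, {"status": "unknown"}))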
         | 
         | Lots of issues to consider.
         | 
         | Anyone have experience doing this? Would be great to hear about
         | it.
        
           | mike_d wrote:
           | > It makes me wonder about designing "lambda-client-friendly
           | APIs"
           | 
           | If your hammer fails to drive a screw into wood, you don't
           | need to contemplate a new type of low friction wood.
        
             | krick wrote:
             | Or maybe you do, if screws are so much cheaper than nails.
        
           | chuckadams wrote:
           | Sounds like webhooks to me -- a use case Lambda excels at.
        
             | foundart wrote:
             | Thanks! I think you're right.
             | 
             | To go beyond my original question:
             | 
             | In terms of how I have thought about them before, webhooks
             | are similar, but not quite the same thing. I've really only
             | dealt with webhooks when the response was delayed because
             | of a need for human interaction, for example: sending a
             | document for e-signature and then later receiving
             | notifications when the document is viewed and then signed
             | or rejected.
             | 
              | I haven't experienced much need to make REST APIs async,
              | except in cases where the processing was always very
              | slow for reasons that couldn't be optimized away. I
              | haven't seen much advocacy for it either.
             | 
             | However, if we think about lambda-based clients, then it
             | makes a lot more sense to provide async apis. Why have both
             | sides paying for the duration, even if the duration is
             | relatively short?
             | 
             | update:
             | 
              | Whether or not it's cheaper for the client depends on
              | the granularity of the client provider's pricing. AWS
              | Lambda bills duration in 1ms increments plus a
              | per-request fee, so for APIs that execute quickly, going
              | async mostly just adds per-request and invocation
              | overhead.
        
         | jmull wrote:
         | I think "a regular web app" is interactive...
         | 
         | 3s would rarely be acceptable, much less 30s. Not that it never
         | happens, but it should be rare enough that the cost of the
         | lambda isn't the main concern. (Or if it's not rare, your focus
         | should be on fixing the issue, not really the cost of the
         | lambdas.)
         | 
          | Anyway, I think you'd typically set the lambda's timeout to
          | something a lot shorter than 30 sec.
        
         | multani wrote:
          | What is the pattern to use on Lambda if you actually have to
          | call out to other services, which may sometimes take a long
          | time to answer? Do you make requests with a shorter timeout
          | and have your Lambda fail when it triggers? Do you delegate
          | long calls to a non-Lambda service?
        
           | roncesvalles wrote:
           | It seems like Lambda is not suited for such "hanging" use
           | cases. To me the best use case for Lambda is for an API call
           | that might be made very infrequently, like say 50 times a
           | day. Doing the same with a VM instance (remember, it has to
           | be HA, secure etc) is probably not worth the effort.
        
           | moribvndvs wrote:
           | In a case where a dependency has a consistent high latency
           | and you can't avoid it, I'd run the numbers and see if it's
           | more worthwhile to just run your app on ECS or something
           | instead.
        
           | dragonwriter wrote:
            | If you have enough volume to warrant it, probably re-
            | architect the lambdas so the before/after parts of the
            | call are separate invocations, with an adapter service
            | living on ECS that gets a call from the "before" lambda,
            | handles the remote request and response, then calls the
            | "after" lambda with the response.
            | 
            | The simpler solution is often to just lift the app from
            | Lambda to ECS without internal changes.
        
         | OJFord wrote:
         | A corollary is that if you have your lambdas call your other
         | lambdas call...
        
           | thefourthchime wrote:
            | A lambda blocking on calling other lambdas is a bit of an
            | antipattern.
            | 
            | If it can be avoided, use something like SQS to call the
            | other lambda, let the first one die, and then resume on a
            | notification from the other lambda.
            | 
            | That can be tricky, but if costs are getting bad it's the
            | way.
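            | 
            | A rough sketch of the handoff side (queue URL and payload
            | are hypothetical):
            | 
            |     import json
            |     import boto3
            |     
            |     sqs = boto3.client("sqs")
            |     
            |     def handler(event, context):
            |         # Enqueue the work and return instead of blocking on
            |         # the other lambda and paying for the wait.
            |         sqs.send_message(
            |             QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/work",
            |             MessageBody=json.dumps({"job_id": event.get("job_id")}),
            |         )
            |         return {"status": "queued"}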
        
         | thefourthchime wrote:
         | I've done an internal website using lambdas. It works fine, I
         | just use jinja2 templates and load html files that are part of
         | the deployed file system.
         | 
         | Any other logic is just normal SPA stuff.
         | 
          | If you set the provisioned concurrency to 1 it's pretty
          | snappy as well.
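          | 
          | The template part is tiny; a minimal sketch (template names
          | are made up, jinja2's FileSystemLoader is the standard
          | loader):
          | 
          |     from jinja2 import Environment, FileSystemLoader
          |     
          |     # Templates ship alongside the function code in the package
          |     env = Environment(loader=FileSystemLoader("templates"))
          |     
          |     def handler(event, context):
          |         html = env.get_template("index.html").render(
          |             user=event.get("user", "guest"))
          |         return {
          |             "statusCode": 200,
          |             "headers": {"Content-Type": "text/html"},
          |             "body": html,
          |         }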
        
       | mlhpdx wrote:
       | Help me understand how this could possibly result in a healthier
       | SDLC. Great, it is possible to run the same container on all
       | these services (and probably more given the usual quip about the
       | 50 ways to run a container on AWS). And, I can see that being a
       | tempting intellectual idea and even a practical "cheat" allowing
       | understaffed orgs to keep running the same old code in new ways.
       | 
        | But why? The less it changes, the more vulnerable it is, the
        | less compliant it is, the more expensive it is to operate
        | (including insurance), the less it improves and adds features,
        | and the less people like it.
       | 
       | Seems like a well-worn path to under performance as an
       | organization.
        
         | abadpoli wrote:
          | > The less it changes, the more vulnerable it is, the less
          | compliant it is, the more expensive it is to operate
          | (including insurance), the less it improves and adds
          | features, and the less people like it.
         | 
         | I don't follow this. Complexity is what leads to
         | vulnerabilities. Reducing complexity by reusing the same, known
         | code is better for security and for compliance. As a security
         | person, if a team came to me and said they were reusing the
         | same container in both contexts rather than creating a new code
         | base, I would say "hell yeah, less work for all of us".
         | 
         | There are other reasons why using the same container in both
         | contexts might not be great (see other comments in this
         | thread), but security and compliance aren't at the top of my
         | list at all (at least not for the reasons you listed).
        
           | mlhpdx wrote:
           | If the container has any dependencies, and they change, a
           | lack of change quickly becomes an escalating liability. I've
           | heard the "fixed is reliable" arguments for decades and have
           | never seen anything that would justify the widespread failure
           | to keep systems patched.
        
             | mlyle wrote:
             | He is not suggesting leaving the container unpatched, so
             | this is unrelated criticism.
        
             | abadpoli wrote:
             | As the other commenter said, nobody is saying to make your
             | application completely immutable. You still patch it, add
             | features, etc. But now you only have to patch one container
             | image rather than two (or more).
        
         | nijave wrote:
         | Abstracting the runtime environment is nice for mixed
          | environments and supporting local development. Maybe some
          | deployments have really low load and Lambda is dirt cheap,
          | whereas others have sustained load and it's cheaper to be
          | able to swap the infrastructure backend.
         | 
         | It doesn't have to be about life support for neglected code.
        
         | chuckadams wrote:
          | > The less it changes, the more vulnerable it is, the less
          | compliant it is, the more expensive it is to operate
          | (including insurance), the less it improves and adds
          | features, and the less people like it.
         | 
         | All of that is true, but all of that cost is being paid
         | regardless while the legacy system goes unmaintained. When the
         | company decides to shut down a data center, the choice with
         | legacy systems (especially niche ones) is often "lift and shift
         | over to cloud" or "shut it down". Notably missing among the
         | choices is "increase the maintenance budget".
         | 
         | All these hypotheticals aside, one actual bonus of shifting
         | onto lambda is reduced attack surface. However crusty your app
         | might be, you can't end up with a bitcoin miner on your server
         | if there's not a permanent server to hack.
        
       | ranger_danger wrote:
       | How painful is iterating changes during development on this?
        
         | inhumantsar wrote:
         | cycles can be pretty quick with lambda, definitely faster than
         | launching a container into k8s or ecs. there's good tooling for
         | local execution too
        
           | pluto_modadic wrote:
            | AWS CodePipeline/CodeBuild still takes a really long time:
            | think 1 minute to notice a commit, 2 minutes to download
            | code, 3-8 minutes to get a CodeBuild runner, however long
            | your build is, and 30 seconds to update the lambda.
            | 
            | Especially when you don't have a good local AWS runner, vs
            | using an HTTP service you can run and live-reload on your
            | laptop.
            | 
            | Deploying to prod in less than 10 seconds, or even /live
            | reloading/ the AWS service, would be awesome.
        
             | abadpoli wrote:
             | For fast iteration, our developers build, test and deploy
             | locally using cdk deploy directly into their dev account,
             | no need to wait around for the pipeline to do all those
             | steps. Then when they're ready, higher environments go
             | through the pipeline.
             | 
             | CDK has a new(ish) feature called hotswap that also
             | bypasses CloudFormation and makes deploys faster.
        
               | chuckadams wrote:
               | The equivalent of hotswap in SAM would be `sam sync
               | --code --watch`. Works in seconds, though I still prefer
               | to test locally as much as possible.
        
         | cjk2 wrote:
         | Slow as hell. Which is why everything we are doing is no longer
         | in Lambda. Literally it was crippling and most people don't
         | notice it in the noise of all the other problems that software
         | complexity brings in. If you measure it, you end up going "oh
         | shit so that's where all our money has gone!"
         | 
         | We have one service which has gone back to a Go program that
         | runs locally using go run. It's then shipped in a container to
         | ECR and then to EKS. The iteration cycle on a dev change is
         | around 10 seconds and happens entirely on the engineer's
         | laptop. A deployment takes around 30 seconds and happens
         | entirely on the engineer's laptop. Apart from production, which
         | takes 5 minutes and most of that is due to github actions being
         | a pile of shite.
        
       | tills13 wrote:
       | Something I don't really get is that if you're going through the
       | trouble of creating a container, why not just run it on fargate
       | or ec2?
       | 
        | Is it literally just the scaling to 0? And you're willing to
        | give up some number of requests that hit cold starts for that?
        
         | SahAssar wrote:
          | Scaling to 0 means I can deploy each branch/PR to a whole,
          | fully separate environment without it costing too much. As
          | soon as you do that, everything just falls into place:
          | testing migrations (via database branching), running full
          | end-to-end tests (including the full infra, db, backend,
          | etc.), and having proper previews for people to view when
          | reviewing the changes (for example, the UX designer seeing
          | the whole feature working, including the backend/db changes,
          | before it goes in).
         | 
         | If I don't scale to 0 I'd prefer to work on dedicated hardware,
         | anything in-between just doesn't give me enough benefit.
        
           | jiggawatts wrote:
           | You can get a similar effect with Azure App Service, which is
           | basically a cloud hosted managed IIS web farm. A web app is
           | "just" a folder and takes zero resources when idle.
        
         | WhyNotHugo wrote:
          | The container is useful for running locally in development.
         | It ensures that you have the same runtime environment (and
         | dependencies) both locally and on AWS.
        
         | nijave wrote:
          | Lambda scales faster if you really do need that. For
          | instance, imagine bursts of 100k requests. Cold starting on
          | Lambda is going to be faster than autoscaling anything else.
        
           | ndriscoll wrote:
           | What actually happens is you hit the concurrency limit and
           | return 99k throttling errors, or your lambdas will try to
           | fire up 100k database connections and will again probably
           | just give you 99k-100k errors. Meanwhile a 1 CPU core
           | container would be done with your burst in a few seconds.
           | What in the world needs to handle random bursts from 0 to
           | 100k requests all at once though? I struggle to imagine
           | anything.
           | 
           | Lambda might be a decent fit for bursty CPU-intensive work
           | that doesn't need to do IO and can take advantage of multiple
           | cores for a single request, which is not many web
           | applications.
        
             | chuckadams wrote:
             | If you go from 0 to 100K legit requests in the same
             | instant, any sane architecture will ramp up autoscaling,
             | but not so instantly that it tries to serve every last one
             | of them in that moment. Most of them get throttled. A well-
             | behaved client will back off and retry, and a badly-behaved
             | client can go pound sand. But reality is, if you crank the
             | max instances knob to 100k and they all open DB
             | connections, your DB had better handle 100k connections.
        
               | ndriscoll wrote:
               | A sane architecture would be running your application on
               | something that has at least the resources of a phone
               | (e.g. 8+ GB RAM), in which case it should just buffer
               | 100k connections without issue. A sane application
               | framework has connection pooling to the database built
               | in, so the 100k requests would share ~16-32 connections
               | and your developers never have to think about such
               | things.
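                | 
                | With SQLAlchemy, for instance, the pool is a couple of
                | arguments (the DSN and sizes here are made-up
                | illustrations):
                | 
                |     from sqlalchemy import create_engine, text
                |     
                |     # One pooled engine per process; every request shares
                |     # these connections instead of opening its own.
                |     engine = create_engine(
                |         "postgresql://app:secret@db.internal/app",  # hypothetical
                |         pool_size=16,
                |         max_overflow=16,
                |     )
                |     
                |     def user_count() -> int:
                |         with engine.connect() as conn:
                |             return conn.execute(
                |                 text("SELECT count(*) FROM users")).scalar_one()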
        
       | sudhirj wrote:
       | So the way we're using this is this:
       | 
       | * We write our HTTP services and package them in containers.
       | 
       | * We add the Lambda Web Adapter into the Dockerfile.
       | 
       | * We push the image to ECR.
       | 
       | * There's a hook lambda that creates a service on ECS/Fargate
       | (the first-party Kubernetes equivalent on AWS) and a lambda.
       | 
       | * Both are prepped to receive traffic from the ALB, but only one
       | of them is activated.
       | 
        | For services that make sense on lambda, the ALB routes traffic
        | to the lambda, otherwise to the service.
       | 
       | The other comments here have more detailed arguments over which
       | service would do better where, but the decision making tree is a
       | bit like this:
       | 
       | * is this a service with very few invocations? Probably use
       | lambda.
       | 
       | * is there constant load on this service? Probably use the
       | service.
       | 
        | * if load is mixed or there's a lot of idle time in the
        | request-handling flow, figure out the inflection point at
        | which a service would be cheaper, and run that.
       | 
       | While we wish there was a fully automated way to do this, this is
       | working well for us.
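        | 
        | The inflection-point math is a back-of-the-envelope
        | comparison; a sketch with purely illustrative prices (plug in
        | your region's current Lambda and Fargate rates):
        | 
        |     # All prices are assumed placeholders, not current list prices.
        |     LAMBDA_GB_SECOND = 0.0000167   # $/GB-second (assumed)
        |     LAMBDA_REQUEST = 0.20 / 1e6    # $/request (assumed)
        |     SERVICE_MONTH = 36.0           # $/month, always-on small task (assumed)
        |     
        |     def lambda_monthly(requests, avg_duration_s, memory_gb):
        |         compute = requests * avg_duration_s * memory_gb * LAMBDA_GB_SECOND
        |         return compute + requests * LAMBDA_REQUEST
        |     
        |     # 2M requests/month at 200ms, 512MB: ~$3.7, lambda wins
        |     print(lambda_monthly(2e6, 0.2, 0.5))
        |     # 50M requests/month at 200ms, 512MB: ~$94, the service wins
        |     print(lambda_monthly(50e6, 0.2, 0.5))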
        
         | wavemode wrote:
         | I guess the part I don't get is, if your traffic is that low,
         | doesn't that mean you could run the service on a ~$4/month
         | instance anyway? Even if we assume the lambda option is
         | cheaper, it's only cheaper by pennies, basically. In exchange
         | for taking on a bunch of added complexity.
        
           | bubblyworld wrote:
           | Not the OP, but things can be infrequent and bursty, or
           | infrequent and expensive, or infrequent and memory-intensive.
           | In those cases (and many more I'm sure) lambda can make
           | sense.
        
       | andrewstuart wrote:
       | I've noticed any project that involves AWS requires a huge amount
       | of "programming the machine" versus "writing the application".
       | 
        | What I mean by "programming the machine" is doing technical
        | tasks that relate to making the cloud work the way you want.
        | 
        | I've worked on projects in which the vast majority of the work
        | seems to be endless "programming the machine".
        
         | acdha wrote:
         | That's a choice, like using Kubernetes or being a micro service
         | fundamentalist. A decade ago, you had people building in-house
         | Hadoop clusters for tiny amounts of data instead of doing their
         | jobs analyzing that data, and before that you had the J2EE
         | types building elaborate multi server systems for what could
         | have been simple apps.
         | 
            | This repeats constantly because the underlying problem is
            | a broken social environment where people aren't aligned
            | with what their users really need, and that misalignment
            | is rewarded.
        
           | andrewstuart wrote:
           | I think it is because cloud computing demands far more
           | programming the machine than deploying to compute instances
           | does.
        
             | acdha wrote:
             | Does it, though, or are you just more familiar with
             | traditional servers? There are a lot more things to manage
             | in a traditional environment and things like tools are more
             | primitive so you'll have to invest a lot more time getting
             | anywhere near equivalent reliability or security. That's
             | especially important if you have compliance requirements
             | where you'll be on the hook for things provided or greatly
             | simplified by the cloud platform.
        
               | andrewstuart wrote:
               | >> There are a lot more things to manage in a traditional
               | environment
               | 
               | Can you be specific because that does not sound right to
               | me.
               | 
                | I know both pretty well, and it seems to me there are
                | vastly more things that need to be tuned, managed, and
                | configured in the cloud.
        
               | chuckadams wrote:
               | How about you give some specifics of things that need
               | constant fiddling in the cloud but are free on bare
               | metal?
        
               | andrewstuart wrote:
                | Well, it's a good point; it's my feeling at the
                | moment.
               | 
               | I think it would be interesting to do a systematic study
               | of the time required for each.
        
               | acdha wrote:
               | Here's a simple one: if I need to say something accepts
               | inbound 443 from 1.2.3.4, I create an AWS security group
               | rule and will never need to touch that for at least a
               | decade, probably longer. If I'm doing the same thing on
               | premise, the starting cost is learning about something
               | like Cisco hardware which costs a lot more to buy and I
               | am now including in my maintenance work. Over a year, I
               | will spend no time on my AWS rule but I will have to
               | patch the physical switches and deal with primitive
               | configuration tools for a complex networked device. If my
               | organization cares about security, I will have to certify
               | a ton of configuration and practice unrelated to packet
               | filtering whereas I can point auditors at the cloud
               | service's documentation for everything on their side of
               | the support agreement.
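                | 
                | That one rule really is a single API call; a sketch
                | with boto3 (the group ID and address are made up):
                | 
                |     import boto3
                |     
                |     ec2 = boto3.client("ec2")
                |     ec2.authorize_security_group_ingress(
                |         GroupId="sg-0123456789abcdef0",  # hypothetical
                |         IpPermissions=[{
                |             "IpProtocol": "tcp",
                |             "FromPort": 443,
                |             "ToPort": 443,
                |             "IpRanges": [{"CidrIp": "1.2.3.4/32"}],
                |         }],
                |     )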
               | 
               | What about running code? If I deploy, say, Python Lambdas
               | behind an API Gateway or CloudFront, they'll run for
               | years without any need to touch them. I don't need to
               | patch servers, schedule downtime, provision new servers
               | for high load or turn them off when load is low, care
               | that the EL7 generation servers need to be upgraded to
               | EL8, nobody is thinking about load balancer or HTTPS
               | certificate rotation, etc. What you're paying for, and
               | giving up customization for, is having someone else
               | handle all of that.
        
           | chx wrote:
           | > A decade ago, you had people building in-house Hadoop
           | clusters for tiny amounts of data instead of doing their jobs
           | analyzing that data
           | 
           | So said Gary Bernhardt of WAT fame in 2015
           | https://x.com/garybernhardt/status/600783770925420546
           | 
           | > Consulting service: you bring your big data problems to me,
           | I say "your data set fits in RAM", you pay me $10,000 for
           | saving you $500,000.
           | 
            | Which, in turn, has inspired https://yourdatafitsinram.net/
        
             | acdha wrote:
             | I remember that, too. There were also just a lot of places
             | where it's like ... your data is already in a SQL database,
             | why not get better at reporting queries until you're
             | running so many of them that you don't have an easy path to
             | handle them? It was especially tragicomic in places where
             | they were basically hoping not to need to hire analysts
             | with technical skills, which is possibly the most MBA way
             | ever to very expensively "save" money.
        
         | abadpoli wrote:
         | For many systems, the application and the machine are so
         | intertwined as to basically be the same thing.
         | 
         | A lot of web apps are just boring CRUD APIs. The business logic
         | behind these is nothing special. The differentiating factor for
         | a lot of them is how they scale, or how they integrate with
         | other applications, or their performance, or how fast they are
         | to iterate on, etc. The "machine" offers a lot of possibilities
         | these days that can be taken advantage of, so customizing "the
         | machine" to get what you need out of it is a big focus for many
         | devs.
        
         | icedchai wrote:
         | Yep. I've also worked on "serverless" projects where similar or
         | more time is spent on IaC (Terraform, etc.) compared to
         | application development. All this effort for something that
         | barely gets any requests.
        
       | luke-stanley wrote:
        | How does this compare to Fly.io, which AFAIK wakes on requests
        | in a very similar way, with containers made into Firecracker
        | VMs (unless using GPU, I think)? I suppose Fly typically
        | doesn't need to scale per request, so it has a cheaper typical
        | max cost? I guess you'd just set the concurrency, but how well
        | that scales I don't know.
        
         | luke-stanley wrote:
         | Vercel must do something a bit similar to Fly too, I assume.
        
       | Thaxll wrote:
        | Someone with experience with Lambda: how does connection
        | pooling work? Are TCP connections re-used at all?
        
         | remram wrote:
         | There's no direct connection to the Lambda container, it goes
         | through a load balancer. The load balancer can keep connections
         | open for reuse.
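          | 
          | For outbound connections (to a database, an AWS SDK client,
          | etc.), the usual pattern is to create the client at module
          | scope so a warm execution environment reuses it across
          | invocations. A rough sketch (the bucket name is made up):
          | 
          |     import boto3
          |     
          |     # Created once per execution environment, reused while warm
          |     s3 = boto3.client("s3")
          |     
          |     def handler(event, context):
          |         # The underlying HTTPS connection can be kept alive
          |         obj = s3.get_object(Bucket="my-bucket", Key=event["key"])
          |         return {"statusCode": 200, "body": obj["Body"].read().decode()}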
        
       | irjustin wrote:
        | Question: this seems to take the port as the primary entry
        | point of communication, which means you need to be running a
        | web server?
        | 
        | Doesn't that add a lot of cold-start overhead, compared to
        | directly invoking a handler?
        
       ___________________________________________________________________
       (page generated 2024-06-22 23:00 UTC)