[HN Gopher] Autoscale Kubernetes workloads on any cloud using an...
       ___________________________________________________________________
        
       Autoscale Kubernetes workloads on any cloud using any event
        
       Author : innovate
       Score  : 39 points
       Date   : 2024-04-30 17:10 UTC (5 hours ago)
        
 (HTM) web link (kedify.io)
 (TXT) w3m dump (kedify.io)
        
       | innovate wrote:
       | Kedify has recently launched a SaaS-based Kubernetes event-driven
       | autoscaling service that is powered by the popular OSS project
       | KEDA.
        
       | anonymous_union wrote:
       | kubernetes is dying, isn't it?
        
         | lifty wrote:
         | Yes, just like Linux.
        
           | zbynek wrote:
            | Linux is dying? I thought 2024 was the year of Linux on
            | the desktop!
        
             | 7thpower wrote:
              | No, it's 2025, but it's definitely going to happen.
        
             | bigstrat2003 wrote:
             | Sadly yes. Netcraft confirms it.
        
         | nilamo wrote:
         | Commenting so I can see the replies later...
        
         | dijit wrote:
          | Kubernetes is a framework; it'll take a long time to die.
          | It will likely contort itself to fit whatever paradigm is
          | needed.
          | 
          | However, I wonder what you mean? From where I sit,
          | Kubernetes has almost complete ubiquity across most
          | companies, even in places where it's a poor fit.
        
         | remram wrote:
         | I'm not sure what in the article makes you say that? Anyway,
         | the answer is no.
        
         | kobalsky wrote:
          | Could you explain how you arrive at that conclusion from
          | seeing some project offering an alternative autoscaling
          | engine?
         | 
         | The only issue I have with the default HorizontalPodAutoscaler
         | is that I cannot scale down to 0 when some processing queues
         | are empty. Other than that, we have shrines erected to k8s.
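          | 
          | For reference, scale-to-zero is roughly a KEDA
          | ScaledObject with minReplicaCount: 0. A minimal sketch
          | (the deployment name, queue name, and trigger are made
          | up):
          | 
          |   apiVersion: keda.sh/v1alpha1
          |   kind: ScaledObject
          |   metadata:
          |     name: queue-worker-scaler
          |   spec:
          |     scaleTargetRef:
          |       name: queue-worker      # Deployment to scale
          |     minReplicaCount: 0        # scale to zero on empty queue
          |     maxReplicaCount: 20
          |     triggers:
          |       - type: rabbitmq
          |         metadata:
          |           queueName: work-items
          |           mode: QueueLength   # scale on queue depth
          |           value: "100"        # target items per replica
          |           hostFromEnv: RABBITMQ_HOST  # connection string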
        
       | acedTrex wrote:
       | so basically this is a gui abstraction over "kubectl apply -f
       | scaledObject.yaml" ...
        
         | mplewis wrote:
         | Not really.
        
         | zbynek wrote:
          | It is a single place for installation, updates, security
          | fixes, and the whole administration of KEDA across an
          | entire fleet of k8s clusters. Very soon there will be
          | configuration optimization and a recommender, dynamic
          | scaling policies, and much more. And yeah, the GUI
          | abstraction over `kubectl...` is there too :)
        
       | Aldipower wrote:
        | Who needs autoscaling? I mean this as a serious question.
        | Does somebody have a real story where autoscaling helped out
        | the company or product?
        | 
        | If you have the hardware resources, why not just scale up
        | from the beginning? If you do not have the resources, you
        | need a lot of money anyway to pay the scaled-up rent
        | afterwards.
        
         | kube-system wrote:
          | 1. Scaling up beyond the level of resources you
          | anticipated may help you maintain better uptime. This
          | could be useful if uptime is very valuable to you.
         | 
         | 2. Hopefully, if you scale up, you can also scale down, which
         | will save you money when you don't need to rent the resources.
        
         | liveoneggs wrote:
         | In the cloud you pay by the minute (or less) so scaling down
         | saves money. Every single day my services scale up during peak
         | times and down in the evenings.
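          | 
          | One way to express that daily rhythm declaratively is
          | KEDA's cron scaler. A minimal sketch (the schedule,
          | timezone, and replica counts are invented):
          | 
          |   apiVersion: keda.sh/v1alpha1
          |   kind: ScaledObject
          |   metadata:
          |     name: business-hours-scaler
          |   spec:
          |     scaleTargetRef:
          |       name: api-server        # Deployment to scale
          |     minReplicaCount: 2        # evening/weekend baseline
          |     triggers:
          |       - type: cron
          |         metadata:
          |           timezone: America/New_York
          |           start: 0 8 * * 1-5  # up at 08:00, Mon-Fri
          |           end: 0 18 * * 1-5   # back down at 18:00
          |           desiredReplicas: "10"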
        
         | dijit wrote:
         | I gave a talk on this[0], but I've had some moderate success
         | doing autoscaling based on region for AAA always-online games.
         | 
         | That said, you could conceivably live at a higher abstraction.
         | 
         | Take dev environments for example. _Ideally_ the team working
         | on infra problems does not need to care how many versions of a
         | backend are operating on the dev environment.
         | 
          | The only thing infra needs to take into account is the
          | requested resources.
         | 
          | Perverse incentives to waste resources aside, it's nice
          | when you can have fewer variables in your mind while
          | focusing on your areas of responsibility; it allows deeper
          | intuition and creativity, at the sacrifice of some cross-
          | cutting creativity across teams.
         | 
         | [0]: https://sh.drk.sc/~dijit/devfest2019-msv.pdf
        
         | jacques_chester wrote:
         | > _If you have the hardware resources, why not just scale up
         | from the beginning on?_
         | 
         | For most workloads it's wasteful to have max capacity
         | provisioned at all times if you can instead provision on-
         | demand.
         | 
         | This is true in general. For example, electricity supply is a
         | mix of baseload power (cheap but only if left running
         | constantly) and peaking (expensive but easy to turn on and
         | off). It wouldn't be economical to have baseload capacity equal
         | to maximum demand. Instead it is aimed at _minimum_ demand and
         | other sources make up the difference depending on demand.
        
         | jrockway wrote:
          | In the cloud, you can buy new computers and sell them back
          | within a period measured in hours, so that's how people
          | are using these systems. It doesn't make a ton of
         | economic sense to me in general, because your periods of high
         | demand are going to be the same periods for everyone else in
         | that datacenter (you want to serve close to your users to
         | minimize latency, and most of the users in a geographical
         | region go to work and sleep at the same times). That said, it's
         | not priced like that. Having burstable core instances "always
         | on" can end up being more expensive than buying guaranteed
         | capacity instances for a short period of time.
         | 
         | I never auto-scale interactive workloads, but it's good for
         | batch work.
         | 
         | Other people have different feelings. Consider the case where
         | you release software multiple times a day, but it has a memory
         | leak. You don't notice this memory leak because you're
         | restarting the application so often. But the Winter Code Freeze
         | shows up, and your app starts running out of memory and dying,
         | paging you every day during your time off. If you had
         | horizontal autoscaling, you would just increase the amount of
         | memory that your application has until you come back and fix
         | it. Sloppy? Sure. But maybe easier to buy some RAM for a couple
         | weeks and not disrupt people's vacation. (The purist would
         | argue their vacation was ruined the day they checked in the
         | memory leak.) This gets all the more fun when the team writing
         | the code and the team responsible for the error rate in
         | production are different teams in different time zones. I don't
         | think that's a healthy way to structure your teams, but
         | literally everyone else on earth disagrees with me, so...
         | that's why there's a product that you can sell to the
         | infrastructure team instead of telling the dev team "wake up
         | and call free() on memory you're not using anymore".
        
         | rikthevik wrote:
         | Our customer workloads are bursty, so being able to scale down
         | to 0 (or close to it) saves us a lot of CPU and memory that
         | would otherwise do nothing for most of the day.
        
         | mplewis wrote:
         | I save a bunch of money every month by running my nightly tasks
         | on ephemeral nodes.
        
         | tomasGiden wrote:
         | We're using KEDA and ScaledJob to scale tomographic
         | reconstructions in the cloud. When a CT scanner has finished
         | uploading a scan, we let a ScaledJob create a Job to process
          | the data. A scan takes maybe 8 hours, and during that time
          | we don't need any compute resources. But when it's done,
          | we need lots of both CPU and GPU power to process GBs and
          | TBs of data rapidly to show previews to the user.
         | 
         | Also, when a user triggers new previews we scale up nodes to
         | process that data. The problem there though is the scale up
         | time of the node pool which is a few minutes for a GPU node on
         | Azure.
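          | 
          | For context, a ScaledJob for this kind of queue-triggered
          | processing looks roughly like the sketch below (the
          | azure-queue trigger, names, and image are illustrative;
          | authentication is omitted):
          | 
          |   apiVersion: keda.sh/v1alpha1
          |   kind: ScaledJob
          |   metadata:
          |     name: reconstruction
          |   spec:
          |     pollingInterval: 30       # check the queue every 30s
          |     maxReplicaCount: 10       # at most 10 parallel Jobs
          |     jobTargetRef:
          |       template:
          |         spec:
          |           restartPolicy: Never
          |           containers:
          |             - name: reconstruct
          |               image: example.azurecr.io/reconstruct:latest
          |               resources:
          |                 limits:
          |                   nvidia.com/gpu: 1  # one GPU per Job
          |     triggers:
          |       - type: azure-queue
          |         metadata:
          |           queueName: finished-scans
          |           queueLength: "1"    # one Job per queued scan
          |           accountName: scansstorage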
         | 
          | We paid to have a GPU running all the time before, but
          | that got too expensive.
         | 
          | As a side note, if I were to do it again, I probably
          | wouldn't build a data pipeline on top of KEDA ScaledJobs,
          | and possibly wouldn't use Kubernetes at all.
        
           | 7thpower wrote:
           | What would you use if you were to start fresh?
        
         | zbynek wrote:
         | (To be transparent, CTO of Kedify and a maintainer of KEDA
         | here)
         | 
         | Folks in other comments have answered this pretty well. Over
         | the past couple of years, I've talked to many companies and
         | individuals who have greatly benefited from autoscaling on k8s.
         | Generally, it has helped in these areas:
         | 
         | 1. Obvious case: if you run your environment on cloud
         | providers, it can significantly save costs and improve
         | throughput.
         | 
          | 2. It's not just about autoscaling workloads, but also
          | about managing batch jobs (K8s Jobs) that are triggered by
          | events or custom metrics on demand (you can think of this
          | as a CronJob on steroids); see the sketch at the end of
          | this comment.
         | 
         | 3. On-prem solutions: You're right; you can use the resources
         | you've already paid for. However, by enabling autoscaling, you
         | can also improve the distribution and utilization of those
         | resources. In large organizations, it is common practice for
         | individual teams to be treated as "internal customers" with
         | assigned quotas they can use. Autoscaling can be helpful in
         | these scenarios as well.
         | 
         | If you are interested in the area, I've given several talks on
         | K8s autoscaling, for example, our latest talk from KubeCon:
         | https://sched.co/1YhgO
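          | 
          | As a sketch of scaling on custom metrics (point 2 above),
          | a Prometheus-backed trigger can drive a ScaledObject. The
          | server address, query, and threshold here are made up:
          | 
          |   apiVersion: keda.sh/v1alpha1
          |   kind: ScaledObject
          |   metadata:
          |     name: shop-scaler
          |   spec:
          |     scaleTargetRef:
          |       name: shop              # Deployment to scale
          |     maxReplicaCount: 30
          |     triggers:
          |       - type: prometheus
          |         metadata:
          |           serverAddress: http://prometheus.monitoring:9090
          |           query: sum(rate(http_requests_total[2m]))
          |           threshold: "100"    # target req/s per replica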
        
         | temp_praneshp wrote:
          | I'm not sure if this answers your question, but the last
          | 2 companies I worked at (~7 years) both had very clear
          | traffic spikes during 9a-5p US east coast hours on
          | weekdays. My current place actually sees a drop of more
          | than 20-30% on Sunday nights compared to Monday mornings,
          | and traffic is constantly going up because we have a lot
          | of American enterprise customers.
         | 
         | Maybe I misunderstood your question but is there a case where
         | you can keep your entire capacity running for free? I'd assume
         | you pay AWS/other cloud or your electricity provider.
        
         | emmanueloga_ wrote:
          | Linode has a series of practical articles on how to use
          | autoscaling with KEDA in LKE [1]. The ability to scale
          | up/down has obvious cost-saving benefits, while retaining
          | the ability to serve peak traffic or optimize background
          | work.
         | 
         | --
         | 
         | 1: https://www.linode.com/blog/?sq=KEDA
        
         | ses1984 wrote:
         | The New York Times crossword comes out every night at 10pm.
         | There is a traffic spike. It's huge.
        
         | benced wrote:
         | I have the opposite question: who is on the cloud and has such
         | consistent workloads they never need to scale up or down? I'm
         | sure those users exist but they must be the minority, right?
        
         | turtlebits wrote:
         | Anyone who doesn't have the same load 24 hours a day?
         | 
         | Ed-tech is a big one where you may have extremely low traffic
         | on weekends/summer/holidays/breaks.
        
         | slyall wrote:
         | I had a simple one on a Kubernetes cluster in AWS.
         | 
          | What happened is we had a queue processor that normally
          | needed a couple of pods to handle events. Except that once
          | a day another process would drop 5 million requests into
          | the queue.
          | 
          | So I just had a simple KEDA autoscaler based on the length
          | of the queue: one pod for every 10,000 items in the queue,
          | with a minimum of 2 pods and a maximum of 50 pods.
          | 
          | It would scale up after the big queue dumps, chew through
          | the backlog and then scale back down again.
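          | 
          | That policy maps almost one-to-one onto a ScaledObject. A
          | sketch assuming an SQS queue (the URL, account ID, and
          | region are placeholders):
          | 
          |   apiVersion: keda.sh/v1alpha1
          |   kind: ScaledObject
          |   metadata:
          |     name: queue-processor-scaler
          |   spec:
          |     scaleTargetRef:
          |       name: queue-processor   # Deployment to scale
          |     minReplicaCount: 2        # always keep two pods
          |     maxReplicaCount: 50       # cap during the daily dump
          |     triggers:
          |       - type: aws-sqs-queue
          |         metadata:
          |           queueURL: https://sqs.us-east-1.amazonaws.com/123456789/events
          |           queueLength: "10000" # one pod per 10,000 items
          |           awsRegion: us-east-1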
        
         | Germanion wrote:
          | We scale our CI/CD: roughly 08:00 to 24:00, Monday to
          | Friday, between 10 and 350 VMs.
        
       | benced wrote:
        | The march continues: away from companies directly managing
        | Kubernetes, and toward Kubernetes being the layer that every
        | future abstraction will be built on.
        
       ___________________________________________________________________
       (page generated 2024-04-30 23:01 UTC)