[HN Gopher] Service mesh use cases (2020)
___________________________________________________________________
Service mesh use cases (2020)
Author : biggestlou
Score : 121 points
Date : 2023-02-09 23:38 UTC (1 days ago)
(HTM) web link (lucperkins.dev)
(TXT) w3m dump (lucperkins.dev)
| gillh wrote:
| Service meshes make it easier to roll out advanced load
| management/reliability features such as prioritized load
| shedding, which would otherwise need to be implemented within
| each language/framework.
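|
| To make that concrete, here's a rough sketch of what even a naive
| version of prioritized load shedding might look like when it has to
| be hand-rolled inside one service (Go; the header name and the
| threshold are invented for illustration):
|
|     package main
|
|     import (
|         "net/http"
|         "sync/atomic"
|     )
|
|     // shedMiddleware is a naive, hypothetical sketch: once more
|     // than maxInFlight requests are in flight, requests marked
|     // "X-Priority: low" get a 503 instead of being served.
|     func shedMiddleware(maxInFlight int64,
|         next http.Handler) http.Handler {
|         var inFlight int64
|         return http.HandlerFunc(func(w http.ResponseWriter,
|             r *http.Request) {
|             n := atomic.AddInt64(&inFlight, 1)
|             defer atomic.AddInt64(&inFlight, -1)
|             if n > maxInFlight && r.Header.Get("X-Priority") == "low" {
|                 http.Error(w, "shedding low-priority traffic",
|                     http.StatusServiceUnavailable)
|                 return
|             }
|             next.ServeHTTP(w, r)
|         })
|     }
|
|     func main() {
|         ok := http.HandlerFunc(func(w http.ResponseWriter,
|             r *http.Request) {
|             w.Write([]byte("ok"))
|         })
|         http.ListenAndServe(":8080", shedMiddleware(100, ok))
|     }
|
| Now multiply that by every language and framework in the org, plus
| the real scheduling logic; doing it once at the proxy layer is the
| appeal.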
|
| For instance, the Aperture[0] open-source flow control system is
| built on service meshes.
|
| [0]: https://github.com/fluxninja/aperture
|
| [1]: https://docs.fluxninja.com
| 4pkjai wrote:
| I really did not enjoy dealing with our service mesh at the last
| place I worked.
| jspdown wrote:
| Out of curiosity, was it something built internally? Or were
| you relying on a public solution?
| jpdb wrote:
| I just wrote something extremely similar, but it's only internal
| right now.
|
| I personally find that the service mesh value-prop is hard to
| justify for a serverless stack (mostly Cloud Run, but AWS Lambda
| too probably), and in situations where your services are mostly
| all in the same language and you can bake the features into
| libraries that are much easier to import.
|
| Observability is a great example of this. In serverless-land,
| you're already getting the standard HTTP metrics (e.g. request
| count, response codes, latency), tracing, and standard HTTP
| request logging "for free."
| davewritescode wrote:
| > I personally find that the service mesh value-prop is hard to
| justify for a serverless stack (mostly Cloud Run, but AWS
| Lambda too probably), and in situations where your services are
| mostly all in the same language and you can bake the features
| into libraries that are much easier to import.
|
| If you're running serverless, you already have 90% of what you'd
| get from a service mesh.
|
| I will tell you, having seen what happens in big companies, that
| baking distributed concerns into libraries always ends in disaster
| long after you're gone.
|
| When you have a piece of code deployed in 200 separate apps,
| every change requires tons of project management.
| NovemberWhiskey wrote:
| Now imagine you have something that has the complexity and change
| volume of a distributed control plane bringing together load-
| balancing, service advertisement, public key infrastructure, and
| software defined networking, and then try to imagine running it
| at the same reliability as your DNS.
|
| Also: proxies, proxies everywhere, as far as the eye can see.
| darkwater wrote:
| This is the production-readiness part that people usually end up
| discovering later, the hard way...
| zidad wrote:
| And in addition to that, all of those immediately become the same
| centralized single point of failure. What could possibly go wrong
| (under high load)? ;p
| jspdown wrote:
| In most implementations this is not the case. Service meshes tend
| to follow either a sidecar or a DaemonSet approach. You don't have
| a single proxy; people usually complain about the exact opposite.
| samsquire wrote:
| Thanks for this.
|
| I have never deployed or used a service mesh, but I am designing
| something similar at the code layer. It is designed to route
| between server components, that is, at the architectural level
| between threads in a multithreaded system.
|
| The problem I want to solve is that I want architecture to be
| trivially easy to change with minimal _code_ changes. This is the
| promise and allure of enterprise service buses and messaging
| queues and probably Spring.
|
| I have managed RabbitMQ and I didn't enjoy it.
|
| I want a system that can scale up and down, where multiple
| instances of any system object can be introduced or removed
| without drastic rewrites.
|
| I would like to decouple bottlenecks from code and turn them into
| runtime configuration.
|
| My understanding of things such as Traefik and Istio is that they
| are frustrating to set up.
|
| Specifically I am working on designing interthread communication
| patterns for multithreaded software.
|
| How do you design an architecture that is easy to change, scales
| and is flexible?
|
| I am thinking of a message routing definition format that is
| extremely flexible and allows any topology to be created.
|
| https://github.com/samsquire/ideas4#526-multiplexing-setting...
|
| I think there is application of the same pattern to the network
| layer too.
|
| Each communication event has an environment of keyvalues associated
| with it that looks similar to this:
|
|     petsserver1 container1 thread3 socket5 user563 ingestionthread1
|
| These keyvalues can be used to route keyspace ranges to other
| components (for example, particular users to tenant shards, or for
| load balancing). For instance, users 1-1000 are handled by
| petsserver1, and socket5 is associated with thread3.
|
| In other words: changing the RabbitMQ routing settings doesn't
| change the architecture of your software. You need to change the
| architecture of the software to match the routing configuration.
| But what if you changed the routing configuration and the
| application architecture changed to match?
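|
| To sketch what I mean (the names come from the example above;
| nothing here is a real implementation, it's just an illustration of
| routing as data rather than code):
|
|     package main
|
|     import (
|         "fmt"
|         "strconv"
|     )
|
|     // Env is the keyvalue environment attached to a communication
|     // event, e.g. {"server": "petsserver1", "socket": "socket5",
|     // "user": "563"}.
|     type Env map[string]string
|
|     // Rule is one line of a hypothetical routing definition:
|     // events whose "user" key falls in [Lo, Hi] go to Dest. A real
|     // format would allow arbitrary keys and predicates.
|     type Rule struct {
|         Lo, Hi int
|         Dest   string
|     }
|
|     func route(env Env, rules []Rule) string {
|         user, err := strconv.Atoi(env["user"])
|         if err != nil {
|             return "default"
|         }
|         for _, r := range rules {
|             if user >= r.Lo && user <= r.Hi {
|                 return r.Dest
|             }
|         }
|         return "default"
|     }
|
|     func main() {
|         rules := []Rule{
|             {Lo: 1, Hi: 1000, Dest: "petsserver1"},
|             {Lo: 1001, Hi: 2000, Dest: "petsserver2"},
|         }
|         fmt.Println(route(Env{"user": "563"}, rules))  // petsserver1
|         fmt.Println(route(Env{"user": "1500"}, rules)) // petsserver2
|     }
|
| The rules being plain data is the point: going from one petsserver
| to N becomes a configuration change rather than a code change.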
| gnur wrote:
| I'd say most of these patterns are supported by NATS. It can do
| pub/sub, but it also has excellent support for RPC, and in the
| latest iteration it has a KV store baked in. I've been using it for
| a few pet projects so far and it has never been the weakest link.
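|
| For a feel of the RPC side, a minimal sketch with the nats.go
| client (assuming a local NATS server on the default port; error
| handling trimmed):
|
|     package main
|
|     import (
|         "fmt"
|         "time"
|
|         "github.com/nats-io/nats.go"
|     )
|
|     func main() {
|         nc, err := nats.Connect(nats.DefaultURL)
|         if err != nil {
|             panic(err)
|         }
|         defer nc.Close()
|
|         // "RPC" in NATS is just request-reply on a subject.
|         nc.Subscribe("greet", func(m *nats.Msg) {
|             m.Respond([]byte("hello, " + string(m.Data)))
|         })
|
|         reply, err := nc.Request("greet", []byte("world"), 2*time.Second)
|         if err != nil {
|             panic(err)
|         }
|         fmt.Println(string(reply.Data)) // hello, world
|     }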
| samsquire wrote:
| I keep hearing about NATS but have yet to use it, either for
| personal projects or for work.
|
| Thanks for the recommendation :-)
| CraigJPerry wrote:
| >> But what if you changed the routing configuration and the
| application architecture changed to match?
|
| If there were three ways to categorise scaling (in reality there
| are more), they might be vertical, horizontal, and then
| distributed.
|
| You're describing an architecture that's in the horizontal
| scaling world view.
|
| You're not in vertical because you're using higher powered (but
| slower) strategies like active routing for comms between
| components where in vertical you'd have configurable queues but
| no routing layer.
|
| You're not in distributed scaling mode because your routing is
| assuming consistent latency and consistent bandwidth
| behaviours.
|
| I don't think one architecture to rule them all is a solvable
| problem. I'd heartily and very gratefully welcome being proven
| wrong on this.
| samsquire wrote:
| Thanks for your comment. It's definitely food for thought.
|
| You remind me of the fallacies of distributed computing by
| mentioning consistent latency and bandwidth.
|
| https://en.m.wikipedia.org/wiki/Fallacies_of_distributed_com.
| ..
|
| I'm still at the design stage.
|
| For the architectures you describe, I am hoping there is a
| representation that can cover many of them. There are probably
| architectures I have yet to think of that are unrepresentable in
| my format.
|
| Going from 1 to N, or adding and removing a layer, should be
| automatable. That's my hope anyway.
|
| I want everything to wire itself automatically.
|
| I am trying to come up with a data structure that can
| represent architecture.
|
| I am trying to do what inversion-of-control containers do per
| request, but for architecture. In IoC containers you specify the
| scope an object is instantiated for, such as per request or per
| session. I want that for architecture.
| CraigJPerry wrote:
| It's such a fundamental problem space, with such a rich diversity
| of possible solutions, that at a minimum you're going to create
| something seriously useful for a subset of application types. But
| it'd be transformational for computing if you cracked the whole
| problem. I hope you do.
|
| I do like your idea of outsourcing the wiring (an error-prone,
| detail-heavy task) away from humans.
| Scubabear68 wrote:
| I've only read about service meshes; my impression was that they
| seem to add an awful lot of processes and complexity just to make
| developers' lives slightly easier.
|
| Maybe I'm wrong but it almost feels like busy work for DevOps. Is
| my first impression wrong? Is this the right way to architect
| systems in some use cases, and if so what are they?
| MoOmer wrote:
| Many of the use cases described in the post are solved by
| service meshes.
|
| So, in my opinion, the questions are introspective:
|
| - "Do I have enough context to know what problem those
| solutions are solving, and to at least appreciate the problem
| space to understand why someone may solve it like this?"
|
| - "Do I have or perceive those problem to impact my
| infrastructure/applications?"
|
| - "Does the solution offered by the use cases described appeal
| to me?"
|
| If yes at the end, then one potential implementation is a
| service mesh.
|
| A lot of these are solved out-of-the-box with Hashicorp's
| Nomad/Consul/Vault pairing, for example!
| remram wrote:
| It is true that a lot of those use cases are covered by
| "basic" Kubernetes (or Nomad) without the addition of Istio
| or similar, e.g. service discovery, load-balancing, circuit-
| breaking, autoscaling, blue-green, isolation, health
| checking...
|
| Adding a service mesh onto Kubernetes seems to bring a lot of
| complexity for a few benefits (80% of the effort for the last
| 20% sort of deal).
| campbel wrote:
| > Adding a service mesh onto Kubernetes seems to bring a
| lot of complexity for a few benefits
|
| I think the benefits are magnified in larger organizations
| or where operators and devs are not the same people. And
| the complexity is relative to which solution you pick. If
| you're already on Kubernetes, linkerd2 is relatively easy
| to install and manage; is that worth it? To me it has been
| in the past.
| tyingq wrote:
| I suspect that if a service mesh is ultimately shown to have broad
| value, one will make its way into the K8s core.
|
| To me, it's a fairly big decision to layer something that's complex
| in its own right on top of something else that's also complex.
| jpdb wrote:
| > I suspect that if a service mesh is ultimately shown to have
| broad value, one will make its way into the K8s core
|
| I'm not so sure. I suspect it'll follow the same roadmap as the
| Gateway API, which it's already kind of doing with the Service
| Mesh Interface (https://smi-spec.io/)
| jspdown wrote:
| Indeed, all major service mesh solutions for Kubernetes implement
| (at least some part of) the SMI specification. There is a group
| composed of these players actively working on making the spec a
| standard.
|
| Understanding these few CRDs gives great insight into what to
| expect from a service mesh and how things are typically
| articulated.
| kevan wrote:
| >slightly easier
|
| As a company grows, sooner or later most of these features become
| pretty desirable from an operations perspective. Feature
| developers likely don't and shouldn't need to care. It probably
| starts with things like Auth and basic load balancing. As the
| company grows to dozens of teams and services then you'll start
| feeling pain around service discovery and wish you didn't need
| to implement yet another custom auth scheme to integrate with
| another department's service.
|
| After a few retry storm outages people will start paying more
| attention to load shedding, autoscaling, circuit breakers, rate
| limiting.
|
| More mature companies or ones with compliance obligations start
| thinking about zero-trust, TLS everywhere, auditing, and
| centralized telemetry.
|
| Is there complexity? Absolutely. Is it worth it? That depends
| where your company is in its lifecycle. Sometimes yes, other
| times you're probably better off just building things and
| living with the fact that your load shedding strategy is "just
| tip over".
| davewritescode wrote:
| We're in the process of moving all of our services over to a
| service mesh, and while the growing pains are definitely there, the
| payoff is huge.
|
| Even aside from a lot of the more hyped-up features of a service
| mesh, the biggest thing Istio solves is TLS everywhere and
| cloud-agnostic workload identity. All of our pods get new TLS certs
| every 24 hours and nobody needs an API key to call anything.
|
| Our security team is thrilled that applications running with an
| Istio sidecar literally have no way to leak credentials. There are
| no API keys to accidentally log. Once we have databases set up to
| support mTLS authentication, we won't need database passwords
| anymore.
| bushbaba wrote:
| Some of the functionality you mentioned above is possible
| without a service mesh.
| dasil003 wrote:
| It's 100% a question of scale. And I don't mean throughput, I
| mean domain and business logic complexity that requires an army
| of engineers.
|
| Just as it's foolish to create dozens of services if you have a
| 10-person team, you don't really get much out of a service mesh if
| you only have a handful of services and aren't feeling the pain
| with your traditional tooling.
|
| But once you get to large scale, with convoluted business logic
| that is hard to reason about because so many teams are involved,
| the search for scalable abstractions begins. A service mesh then
| becomes useful because it is completely orthogonal to biz logic:
| you can now add engineers 100% focused on tooling and operations,
| and product engineers can think a lot less about certain classes
| of reliability and security concerns.
|
| Of course, in today's era of resume-driven development and the huge
| comp paid by FAANGs, you are going to get a ton of young devs
| pushing for a service mesh way before it makes sense. I can't say I
| blame them, but keep your wits about you!
| peteradio wrote:
| If you can convince your business folks to run shit on the
| command-line then there is basically no need for services
| ever. I know it sounds insane but it's how it was done in the
| old days, and there really is only a false barrier to doing it
| again.
| emptysea wrote:
| A place I worked had support staff copy-pasting Mongo queries from
| Google Docs -- it worked in the early days, but eventually you have
| to start building an admin interface for more complicated
| processes.
|
| When it was just Mongo, installs were easy since they only needed a
| Mongo desktop client.
| peteradio wrote:
| Terminal can handle auth.
| jrockway wrote:
| It's a "big company" thing. In my opinion, the best way to add
| mTLS to your stack is to just adjust your application code to
| verify the certificate on the other end of the connection. But
| if the "dev team" has the mandate "add features X, Y, and Z",
| and the "devops team" has the mandate "implement mTLS by the
| end of Q1", you can see why "bolt on a bunch of sidecars"
| becomes the selected solution. The two teams don't have to talk
| with each other, but they both accomplish their goals. The cost is
| less understanding and debuggability, plus the cost of the service
| mesh product itself. But, from both teams' perspective, it looks
| like the best option.
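|
| Concretely, "verify the certificate on the other end" is roughly
| this on the server side (a sketch in Go; the cert paths are
| placeholders for wherever your infra drops them, and distributing
| and rotating those files is deliberately not shown):
|
|     package main
|
|     import (
|         "crypto/tls"
|         "crypto/x509"
|         "net/http"
|         "os"
|     )
|
|     func main() {
|         // CA that signed your clients' certs (placeholder path).
|         caPEM, err := os.ReadFile("/etc/certs/ca.pem")
|         if err != nil {
|             panic(err)
|         }
|         pool := x509.NewCertPool()
|         pool.AppendCertsFromPEM(caPEM)
|
|         // This service's own cert and key (placeholder paths).
|         cert, err := tls.LoadX509KeyPair("/etc/certs/server.pem",
|             "/etc/certs/server-key.pem")
|         if err != nil {
|             panic(err)
|         }
|
|         srv := &http.Server{
|             Addr: ":8443",
|             TLSConfig: &tls.Config{
|                 Certificates: []tls.Certificate{cert},
|                 ClientCAs:    pool,
|                 // Reject peers without a cert signed by the CA.
|                 ClientAuth: tls.RequireAndVerifyClientCert,
|             },
|             Handler: http.HandlerFunc(func(w http.ResponseWriter,
|                 r *http.Request) {
|                 w.Write([]byte("hello, " +
|                     r.TLS.PeerCertificates[0].Subject.CommonName))
|             }),
|         }
|         panic(srv.ListenAndServeTLS("", ""))
|     }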
|
| I'm not a big fan of this approach; the two teams need to have
| a meeting and need to have a shared goal to implement the
| business's selected security requirements together. But
| sometimes fixing the org is too hard, so there is a Plan B.
| davewritescode wrote:
| I very much disagree with the sentiment that adding mTLS is just
| "verifying the certificate on the other end of the connection". You
| ignore the process of distributing and rotating certificates, which
| is non-trivial to implement application-side.
| jagged-chisel wrote:
| Most of my programming peers want to focus on solving
| product-related problems rather than authn, authz, TLS config,
| failover, throttling, discovery...
|
| We want to automate everything not related to the code we want
| to write. Service meshes sound like a good way to do that.
| [deleted]
| Scubabear68 wrote:
| Right - but why not use something like an API gateway then?
| pbalau wrote:
| That can work, but it means you simply outsourced the problem to
| AWS. It's not a bad idea per se, but it means your service needs to
| speak HTTP in some way.
|
| You could use the service mesh thing from AWS, along with Cognito
| JWTs, for authentication and authorization.
| steviesands wrote:
| API gateways are primarily used for HTTP traffic coming from
| clients external to your backend services, e.g. an iOS device
| (hence the term 'gateway' vs. 'mesh'). I don't think they support
| Thrift or gRPC (at least AWS's doesn't, not sure about other
| providers). https://aws.amazon.com/api-gateway/
| asim wrote:
| Article is from 2020. Please add to title.
| dang wrote:
| Added. Thanks!
| jcq3 wrote:
| I thought the main service mesh use case was to reduce time to
| production, allowing hotfixes to be delivered much faster. Am I
| totally wrong?
| jcq3 wrote:
| I think it's what the article calls blue-green deployments.
| jspdown wrote:
| It's not about fast delivery, at least not in this way. Arguably,
| if you need mTLS, traffic shaping, cross-service observability,
| service discovery... then yes, it's much faster to use an existing
| solution than to build it yourself. But it won't make your hotfixes
| ship faster.
|
| Service mesh is nothing new; people just called it differently back
| then. The key features it brings are:
|
| - Traffic shaping: mirroring, canary, blue-green...
| - Cross-service observability
| - End-to-end encryption
| - Service discovery
___________________________________________________________________
(page generated 2023-02-11 23:01 UTC)