[HN Gopher] eBPF will help solve service mesh by getting rid of ...
___________________________________________________________________
eBPF will help solve service mesh by getting rid of sidecars
Author : tgraf
Score : 203 points
Date : 2021-12-09 13:02 UTC (9 hours ago)
(HTM) web link (isovalent.com)
(TXT) w3m dump (isovalent.com)
| unmole wrote:
| Offtopic: I really like the style of the diagrams. I remember
| seeing something similar elsewhere. Are these manually drawn or
| are they the result of some tool?
| tgraf wrote:
| OP here: It's whimsical.com. I really love it.
| unmole wrote:
| Thank you, Thomas! I really admire all that you have done
| with Cilium.
| manvendrasingh wrote:
| I am wondering how this would solve the problem of mTLS while
| still supporting service-level identities? Is it possible to move
| the mTLS to listeners instead of sidecars, or some other mechanism?
| zinclozenge wrote:
| It's not clear how eBPF will deal with mTLS. I actually asked
| that when interviewing at a company using eBPF for observability
| into Kubernetes, and the answer was that they didn't know.
|
| Yea, if you're getting TLS termination at the load balancer prior
| to k8s ingress then it's pretty nice.
| GauntletWizard wrote:
| The answer to this is simple - TLS will start being terminated
| at the pods themselves. The frontend load balancer will _also_
| terminate TLS - to the public sphere - and then will
| authenticate its connection to your backends as well.
| Kubernetes will provide x509 certificates suitable for service-
| to-service communications to pods automatically.
|
| The work is still in the early phases, so the exact form this
| will take has yet to be hammered out, but there's broad
| agreement that this functionality will be first-class in k8s in
| the future. If you want to keep running proxies for the other
| feature they provide, great - They'll be able to use the
| certificates provided by k8s for identity. If you'd like to
| know more, come to one of the SIG Auth meetings :)
| tgraf wrote:
| Then you should interview again but with us.
|
| This is not too different from wpa_supplicant, used by several
| operating systems for key management for wireless networks. The
| complicated key negotiation and authentication can remain in
| user space, while encryption with the negotiated key can be done
| in the kernel (kTLS) or, when eBPF can control both sides, it can
| even be done without TLS at all by encrypting with a network-
| level encapsulation format so that it works for non-TCP as well.
|
| Hint: We are hiring.
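  As a rough illustration of the split described above, here is a
  minimal kTLS sketch (not from the article): it assumes a completed
  TLS 1.2 AES-128-GCM handshake whose key material has already been
  extracted from the user-space TLS library, and the function name
  and arguments are placeholders.

      /* Hand the negotiated TLS state to the kernel so that plain
       * send()/write() on the socket is encrypted by kTLS (TLS_TX). */
      #include <linux/tls.h>
      #include <netinet/in.h>
      #include <netinet/tcp.h>
      #include <string.h>
      #include <sys/socket.h>

      #ifndef SOL_TLS
      #define SOL_TLS 282     /* in case system headers lack it */
      #endif
      #ifndef TCP_ULP
      #define TCP_ULP 31
      #endif

      int enable_ktls_tx(int sock, const unsigned char *key,
                         const unsigned char *iv,
                         const unsigned char *salt,
                         const unsigned char *rec_seq)
      {
          struct tls12_crypto_info_aes_gcm_128 ci;

          /* Attach the "tls" upper-layer protocol to the TCP socket. */
          if (setsockopt(sock, IPPROTO_TCP, TCP_ULP, "tls",
                         sizeof("tls")) < 0)
              return -1;

          memset(&ci, 0, sizeof(ci));
          ci.info.version = TLS_1_2_VERSION;
          ci.info.cipher_type = TLS_CIPHER_AES_GCM_128;
          memcpy(ci.key, key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
          memcpy(ci.iv, iv, TLS_CIPHER_AES_GCM_128_IV_SIZE);
          memcpy(ci.salt, salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
          memcpy(ci.rec_seq, rec_seq, TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE);

          /* From here on the kernel produces TLS records for send(). */
          return setsockopt(sock, SOL_TLS, TLS_TX, &ci, sizeof(ci));
      }

  The handshake (and any renegotiation or control messages) stays in
  user space; only the record encryption moves into the kernel.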
| dijit wrote:
| Honestly after I learned that the majority of Kubernetes nodes
| just proxy traffic between each other using iptables and that a
| load balancer can't tell the nodes apart (ones where your app
| lives vs. ones that will proxy the connection to your app) I got
| really worried about any kind of persistent connection in k8s
| land.
|
| Since some number of persistent connections will get force
| terminated on scale down or node replacement events...
|
| Cilium and eBPF look like a pretty good solution to this though,
| since you can then advertise your pods directly on the network
| and load balance those instead of every node.
| p_l wrote:
| Whether the load balancer can or cannot tell the nodes apart
| depends on the load balancer and the method you use to expose
| your service to it, as well as on what kind of networking setup
| you use (i.e. is pod networking sensibly exposed to the load
| balancer or ... weirdly)
|
| Each "Service" object provides (by default, can be disabled)
| load-balanced IP address that by default uses kube-proxy as you
| described, a DNS A record pointing to said address, DNS SRV
| records pointing to actual direct connections (whether
| NodePorts or PodIP/port combinations) plus API access to get
| the same data out.
|
| There are even replacement kube-proxy implementations that
| route everything through F5 load balancer boxes, but they are
| less known.
| q3k wrote:
| > Honestly after I learned that the majority of Kubernetes
| nodes just proxy traffic between each other using iptables and
| that a load balancer can't tell the nodes apart (ones where
| your app lives vs ones that will proxy connection to your app)
| I got really worried about any kind of persistent connection in
| k8s land.
|
| There can be a difference, if your LoadBalancer-type service
| integration is well implemented. The externalTrafficPolicy knob
| determines whether all nodes should attract traffic from
| outside or only nodes that contain pods backing this service.
| For example, metallb (which attracts traffic by /32 BGP
| announcements to given external peers) will do this correctly.
|
| Within the cluster itself, only nodes which have pods backing a
| given service will be part of the iptables/ipvs/...
| Pod->Service->Pod mesh, so you won't end up with scenic routes
| anyway. Same for Pod->Pod networking, as these addresses are
| already clustered by host node.
| kklimonda wrote:
| How do you keep ECMP hashing stable between rollouts?
| bogomipz wrote:
| ECMP hashing would be between the edge router and the IP of
| the LBs advertising VIPs no? The LB would maintain the
| mappings between the VIPs and the nodePort IPs of worker
| nodes that have a local service Endpoint for the requested
| service. I don't think this would be any different than it
| is without Kubernetes or am I completely misunderstanding
| your question?
| dharmab wrote:
| If you're asking about connection stability in general:
|
| - Ideally, you avoid it in your application design.
|
| - If you need it, you set up SIGTERM handling in the
| application to wait for all connections to close before the
| process exits. You also set up "connection draining" at the
| load balancer to keep existing sessions to terminating Pods
| open but send new sessions to the new Pods. The tradeoff is
| that rollouts take much longer - if the session time is
| unbounded, you may need to enforce a deadline to break
| connections eventually.
| dilyevsky wrote:
| You don't just wait until all connections exit; you first
| need to withdraw the BGP announcement to the edge router,
| then start the wait. It's not that simple with metal LBs.
| On the other hand, it's not that simple with cloud LBs
| either, because they also break long TCP streams whenever
| they please
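  A minimal sketch of the drain-on-SIGTERM pattern described above,
  for a bare TCP accept loop (connection handling is omitted, and as
  noted the endpoint should already have been withdrawn from load
  balancing before the wait starts):

      /* Stop accepting new connections on SIGTERM, then wait - with a
       * deadline - for in-flight connections to finish before exiting. */
      #include <arpa/inet.h>
      #include <netinet/in.h>
      #include <signal.h>
      #include <stdatomic.h>
      #include <string.h>
      #include <sys/socket.h>
      #include <unistd.h>

      static volatile sig_atomic_t stopping = 0;
      static atomic_int active_conns = 0;   /* workers inc/dec this */

      static void on_sigterm(int sig) { (void)sig; stopping = 1; }

      int main(void)
      {
          struct sigaction sa;
          memset(&sa, 0, sizeof(sa));
          sa.sa_handler = on_sigterm;   /* no SA_RESTART: accept() sees EINTR */
          sigaction(SIGTERM, &sa, NULL);

          int lfd = socket(AF_INET, SOCK_STREAM, 0);
          struct sockaddr_in addr;
          memset(&addr, 0, sizeof(addr));
          addr.sin_family = AF_INET;
          addr.sin_port = htons(8080);
          bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
          listen(lfd, 128);

          while (!stopping) {
              int cfd = accept(lfd, NULL, NULL);
              if (cfd < 0)
                  continue;             /* interrupted or transient error */
              /* A real server hands cfd to a worker that increments and
               * decrements active_conns around the session; omitted here. */
              close(cfd);
          }
          close(lfd);                   /* no new connections from here on */

          /* Wait for in-flight sessions, but bound the wait so a stuck
           * client cannot block the rollout forever. */
          for (int i = 0; i < 30 && atomic_load(&active_conns) > 0; i++)
              sleep(1);
          return 0;
      }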
| dharmab wrote:
| That's if you're using a NodePort service, which the
| documentation explains is for niche use cases such as if you
| don't have a compatible dedicated load balancer. In most
| professional setups you do have such a load balancer and can
| use other types of routing that avoid this.
|
| https://kubernetes.io/docs/concepts/services-networking/serv...
| topspin wrote:
| > In most professional setups you do have such a load
| balancer
|
| May I ask what one might use in an AWS cloud environment to
| provide that load balancer within a Region?
|
| Does IPv6 address any of these issues? It seems to me that
| IPv6 is capable of providing every component in the system
| its own globally routable address, identity (mTLS perhaps)
| and transparent encryption with no extra sidecars, eBPF
| pieces, etc.
| shosti wrote:
| Ingresses on EKS will set up an ALB that sends traffic
| directly to pods instead of nodes (basically skips the
| whole K8s Service/NodePort networking setup). You have to
| use ` alb.ingress.kubernetes.io/target-type: ip` as an
| annotation I think (see
| https://docs.aws.amazon.com/eks/latest/userguide/alb-
| ingress...).
| [deleted]
| dharmab wrote:
| > May I ask what one might use in an AWS cloud environment
| to provide that load balancer within a Region?
|
| The AWS cloud controller will automatically set up an ALB
| for you if you configure a LoadBalancer service in
| Kubernetes. I've also done custom setups with AWS NLBs.
|
| > Does IPv6 address any of these issues?
|
| It could address some issues- you could conceivably create
| a CNI plugin which allocates an externally addressable IP
| to your Pods. Although you would probably still want a load
| balancer for custom routing rules and the improved
| reliability over DNS round robin.
| pm90 wrote:
| This is a concern only if you have ungraceful node termination,
| i.e. you suddenly yoink the node. In most cases when you
| terminate a node, k8s will (attempt to) cordon and drain it,
| letting the pods gracefully terminate their connections
| before getting evicted.
|
| If you didn't have k8s and just used an autoscaling group of
| VMs you would have the same issue...
| zdw wrote:
| So instead of making the applications use a good RPC library,
| we're going to shove more crap into the kernel? No thanks, from a
| security context and complexity perspective.
|
| Per https://blog.dave.tf/post/new-kubernetes/ , the way that this
| was solved in Borg was:
|
| > "Borg solves that complexity by fiat, decreeing that Thou Shalt
| Use Our Client Libraries For Everything, so there's an obvious
| point at which to plug in arbitrarily fancy service discovery and
| load-balancing. "
|
| Which seems like a better solution, even if it requires some
| reengineering of apps.
| __alexs wrote:
| I'm sure someone will write leftPad in eBPF any day now.
| hestefisk wrote:
| Indeed. We could even embed a WASM runtime (headless v8?) so
| one can execute arbitrary JavaScript in-kernel... wait :)
| zaphar wrote:
| eBPF is far too limited to run a WASM runtime. That's why
| the proposed article approach is even possible.
| nonameiguess wrote:
| In addition to whether or not all of your various dev teams'
| preferred languages have a supported client SDK, you also have
| the build vs. buy issue if you're plugging COTS applications
| into your service mesh: there is no way to force a third-party
| vendor to reengineer their application specifically for you.
|
| This probably dictates a lot of Google's famous "not invented
| here" behavior, but most organizations can't afford to just
| write their entire toolchain from scratch and need to use
| applications developed by third parties.
| tptacek wrote:
| The complexity is an issue (but sidecars are plenty complex
| too), but the security not so much. BPF C is incredibly
| limiting (you can't even have loops if the verifier can't prove
| to its satisfaction that the loop has a low static bound). It's
| nothing at all like writing kernel C.
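  For a concrete sense of what the verifier accepts, here is a
  hedged sketch of a toy socket-filter program whose only loop has a
  constant bound (the program and its policy are illustrative, not
  anything from the article):

      /* Keep packets that contain a ':' in their first 64 bytes; the
       * loop bound is a compile-time constant, so the verifier can
       * prove the program terminates. */
      #include <linux/bpf.h>
      #include <bpf/bpf_helpers.h>

      SEC("socket")
      int keep_if_colon(struct __sk_buff *skb)
      {
          int found = 0;

          #pragma unroll
          for (int i = 0; i < 64; i++) {
              char c;

              if (bpf_skb_load_bytes(skb, i, &c, sizeof(c)) < 0)
                  break;                /* ran past the packet */
              if (c == ':') {
                  found = 1;
                  break;
              }
          }
          /* A socket filter returns how many bytes to keep. */
          return found ? skb->len : 0;
      }

      char LICENSE[] SEC("license") = "GPL";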
| the_duke wrote:
| You don't have to use C.
|
| There are two projects that enable writing eBPF with Rust
| [1][2]. I'm sure there is an equivalent with nicer wrappers
| for C++.
|
| [1] https://github.com/foniod/redbpf
|
| [2] https://github.com/aya-rs/aya
| tptacek wrote:
| It doesn't make any difference which language you use; the
| security promises are coming from the verifier, which is
| analyzing the CFG of the compiled program. C is what most
| people use, since the underlying APIs are in C, and since
| the verifier is so limiting that most high-level
| constructions are off the table.
| the_duke wrote:
| Sure, I was not implying that Rust would have any
| security benefits for eBPF.
|
| Just that you can even write eBPF code in more convenient
| languages.
| tptacek wrote:
| This has come up here a bunch of times (we do a lot of
| work in Rust). I've been a little skeptical that Rust is
| a win here, for basically the reason I gave upthread: you
| can't really do much with Rust in eBPF, because the
| verifier won't let you; it seems to me like you'd be
| writing a dialect of Rust-shaped C. But we did a recent
| work sample challenge for Rust candidates that included
| an eBPF component, and a couple good submissions used
| Rust eBPF, so maybe I'm wrong about that.
|
| I'm also biased because I _love_ writing C code (I know,
| both viscerally and intellectually, that I should
| virtually never do so; eBPF is the one sane exception!)
| MayeulC wrote:
| > a good RPC library
|
| I like that approach. If you use client libraries, new RPC
| mechanisms are "free" to implement (until you need to
| troubleshoot upgrades). It's also an argument against
| statically linking.
|
| For instance, if running services on the same machine, io-uring
| can probably be used? (I'm a noob at this). eBPF for packet
| switching/forwarding between different hosts, etc.
| malkia wrote:
| This may no longer be the case, but back at Google I remember
| the day my Java library stopped using the client library
| logger and instead spawned some other app and talked to it
| (sending logs to it). That other app used to be a fat client,
| linked into our app, supported by another team. At first I was
| wtf.. Then it hit me - this other team can update their
| "logging" binary on a different cycle than us (hence we don't
| have to be on the same "build" cycle). All they needed to
| provide us with was a very "thin" and rarely changing
| interface library. And they can write it in any language they
| like (Java, C++, Go, Rust, etc.)
|
| Also, it doesn't need to be a .so (or .dll/.dylib) - just some
| quick IPC to send messages around. It can actually be better.
| For one, if their app is still buffering messages, my app can
| exit while theirs still runs. Or for security reasons (or not
| having to think about these), etc. So everything is still
| statically linked, but with processes talking to each other.
| (Granted, this does not always work for some special apps,
| like audio/video plugins, but I think it works fine for the
| case above.)
| jrockway wrote:
| The big secret is that sidecars can only help so much. If you
| want distributed tracing, the service mesh can't propagate
| traces into your application (so if service A calls service B
| which calls service C, you'll never see that end to end with a
| mesh of sidecars). mTLS is similar; it's great to encrypt your
| internal traffic on the wire, but that needs to get propagated
| up to the application to make internal authorization decisions.
| (I suppose in some sense I like to make sure that "kubectl
| port-forward" doesn't have magical enhanced privileges, which
| it does if your app is oblivious to the mTLS going on in the
| background. You could disable that specifically in your k8s
| setup, but generally security through remembering to disable
| default features seems like a losing battle to me. Easier to
| have the app say "yeah you need a key". Just make sure you
| build the feature to let oncall get a key, or they will be very
| sad.)
|
| For that reason, I really do think that this is a temporary
| hack while client libraries are brought up to speed in popular
| languages. It is really easy to sell stuff with "just add
| another component to your house of cards to get feature X", but
| eventually it's all too much and you'll have to just edit your
| code.
|
| I personally don't use service meshes. I have played with Istio
| but the code is legitimately awful, so the anecdotes of "I've
| never seen it work" make perfect sense to me. I have, in fact,
| never seen it work. (Read the xDS spec, then read Istio's
| implementation. Errors? Just throw them away! That's the core
| goal of the project, it seems. I wrote my own xDS
| implementation that ... handles errors and NACKs correctly.
| Wow, such an engineering marvel and so difficult...)
|
| I do stick Envoy in front of things when it seems appropriate.
| For example, I'll put Envoy in front of a split
| frontend/backend application to provide one endpoint that
| serves both the frontend and the backend. That way production is
| identical to your local development environment, avoiding
| surprises at the worst possible time. I also put it in front of
| applications that I don't feel like editing and rebuilding to
| get metrics and traces.
|
| The one feature that I've been missing from service meshes,
| Kubernetes networking plugins, etc. is the ability to make all
| traffic leave the cluster through a single set of services, who
| can see the cleartext of TLS transactions. (I looked at Istio
| specifically, because it does have EgressGateways, but it's
| implemented at the TCP level and not the HTTP level. So you
| don't see outgoing URLs, just outgoing IP addresses. And if
| someone is exfiltrating data, you can't log that.) My biggest
| concern with running things in production is not so much
| internal security, though that is a big concern, but rather "is
| my cluster abusing someone else". That's the sort of thing that
| gets your cloud account shut down without appeal, and I feel
| like I don't have good tooling to stop that right now.
| darkwater wrote:
| > If you want distributed tracing, the service mesh can't
| propagate traces into your application (so if service A calls
| service B which calls service C, you'll never see that end to
| end with a mesh of sidecars)
|
| Why not? AFAIK traces are sent from the instrumented app to
| some tracing backend, and a trace-id is carried over via an
| HTTP header from the entry point of the request until the
| last service that takes part in that request. Why would a
| sidecar/mesh break this?
| afrodc_ wrote:
| This. Header trace propagation is a godsend.
| colonelxc wrote:
| I think the point is that the service mesh can't do the
| work of propagation. It needs the client to grab the input
| header, and attach it to any outbound requests. From the
| perspective of the service mesh, the service is handling X
| requests, and Y requests are being sent outbound. It
| doesn't know how each outbound request maps to an input.
|
| So now, all of a sudden, we do need a client library for
| each service in order to make sure the header is being
| propagated correctly.
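  A small sketch of what that propagation looks like in application
  code, here with libcurl for the outbound call ("traceparent" is the
  W3C Trace Context header; the URL is a placeholder, and the inbound
  header value has to be read from whatever HTTP server library
  handles the incoming request):

      #include <curl/curl.h>
      #include <stdio.h>

      /* Call service B, forwarding the trace context we received so
       * the tracing backend can stitch A -> B -> C into one trace. */
      int call_service_b(const char *inbound_traceparent)
      {
          CURL *curl = curl_easy_init();
          if (!curl)
              return -1;

          struct curl_slist *hdrs = NULL;
          char hdr[512];

          if (inbound_traceparent) {
              snprintf(hdr, sizeof(hdr), "traceparent: %s",
                       inbound_traceparent);
              hdrs = curl_slist_append(hdrs, hdr);
          }

          curl_easy_setopt(curl, CURLOPT_URL,
                           "http://service-b.internal/items");
          curl_easy_setopt(curl, CURLOPT_HTTPHEADER, hdrs);

          CURLcode rc = curl_easy_perform(curl);

          curl_slist_free_all(hdrs);
          curl_easy_cleanup(curl);
          return rc == CURLE_OK ? 0 : -1;
      }

  The sidecar only ever sees one inbound and one outbound request;
  the line that copies the header is the part it cannot do for you.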
| tgraf wrote:
| If you are in a position where you can do that then great. Most
| folks out there are in a position where they need to run
| arbitrary applications delivered by vendors without an ability
| to modify them.
|
| The second aspect is that this can get extremely expensive if
| your applications are written in a wide range of languages and
| frameworks. That's obviously different at Google, where the
| number of languages can be restricted and standardized.
|
| But even then, you could also link a TCP library into your app.
| Why don't you?
| outside1234 wrote:
| The industry is moving away from the client library approach.
| This is possible in a place like Google where they force folks
| to write software in one of four languages (C++, Java, Go,
| Python) but doesn't scale to a broader ecosystem.
| pjmlp wrote:
| It sure scales; I have yet to work in organisations where
| anything goes.
|
| There are a set of sanctioned languages and that is about it.
| jayd16 wrote:
| The subtle aspect of the comment you're replying to is that
| _they write everything_.
|
| Hard to cram a new library into some closed source vendor
| app.
| pjmlp wrote:
| Depends how it was written and made extensible.
| jayd16 wrote:
| It does feel a bit like we're trying to monkey patch compiled
| code but the benefits are pretty clear.
| lamontcg wrote:
| I would argue pretty strenuously that this is not what is
| being done.
|
| The sockets layer is becoming a facade which can guarantee
| additional things to applications which are compiled against
| it, and you've got dependency injection here so that the
| application layer can be written agnostically and not care
| about any of those concerns at all.
| q3k wrote:
| It is the technically better solution IMO/IME, too.
|
| But that doesn't work when you're trying to sell enterprises
| the idea of 'just move your workloads to Kubernetes!'. :)
| ZeroCool2u wrote:
| What if a client library does not yet exist for your language?
| q3k wrote:
| In a large orga, you limit the languages available for
| projects to well supported ones internally, ie. to those that
| are known to have a port of the RPC/metrics/status/discovery
| library. Also makes it easier to have everything under a
| single build system, under a single set of code styles, etc.
|
| If some developers want to use some new language, they have
| to first in put in the effort by a) demonstrating the
| business case of using a new language and allocating
| resources to integrate it into the ecosystem b) porting all
| the shared codebase to that new language.
| ZeroCool2u wrote:
| Absolutely. I was thinking what if there's a good business
| reason to use a different language that's not the norm for
| your org. Then you're stuck with an infra problem
| preventing you from using the right tool for the job.
|
| Of course, this is the exception to the rule you described
| well :)
| q3k wrote:
| I don't think of it as an infra problem, but as an early
| manifestation of effort that would arise later on,
| anyway: long-term maintenance of that new language. You
| need people who know the language to integrate it well
| with the rest of the codebase, people who can perform
| maintenance on language-related tasks, people who can
| train other people on this language, ... These are all
| problems you'd have later on, but are usually handwaved
| away as trivial.
|
| Throughout my career nearly every single company I've
| worked in had That One Codebase written by That One
| Brilliant Programmer in That One Weird Language that no-
| one maintains, because the original author has since left,
| the language turned out to be dead, and it's extremely
| expensive to hire or train more people to grok that
| language just for this project.
| __alexs wrote:
| There are only 5 languages. JavaScript, C++, Java, Python, C#
|
| This is basically the same set of languages people were
| writing 20 years ago and will probably be the same set of
| languages people will write in 20 years from now.
| MayeulC wrote:
| It really depends on your domain. I haven't seen C# a lot,
| nor python, in some orgs.
|
| For some (like me), it's more a superset of C, assembly,
| bash, maybe lisp, python and matlab.
|
| For others, it's going to be JavaScript, PHP, CSS, HTML..
|
| I agree though that a library is usually domain-specific,
| and that you can probably easily identify the subset of
| languages that you really need official bindings for
| (thereby making my comment a bit useless, sorry for the
| noise).
| dvogel wrote:
| I'm not necessarily advocating for the approach described in
| the article but it wouldn't worry me from a security
| perspective. The security model of eBPF is pretty impressive.
| The security issues arising from engineers struggling to keep
| the entire model in their head would concern me though.
| p_l wrote:
| In a world without (D)COM, I find it's much, much harder to
| make common base libraries and force people to use them,
| especially if you can't also forcibly limit the set of toolchains
| used in the environment.
| outside1234 wrote:
| The network is the base library - that is the shift you are
| seeing. You make a call out to a network address with a
| specific protocol.
|
| Also, as an aside, I think WebAssembly has the potential to
| shift this back. In a world where libraries and programs are
| compiled to WebAssembly, it doesn't matter what their source
| language was, and as such, the client library based approach
| might swing back into vogue.
| p_l wrote:
| WASM isn't a valid target for many languages, that's one
| thing.
|
| Two, the case is about the library to interact with the
| network, so... There's also implementing the protocols.
| jjtheblunt wrote:
| > The network is the base library
|
| you remind me of the 20+ years ago Sun Microsystems
| assertion "The Network IS the Computer".
|
| citation: https://www.networkcomputing.com/cloud-
| infrastructure/networ...
| codetrotter wrote:
| > Identity-based Security: Relying on network identifiers to
| achieve security is no longer sufficient, both the sending and
| receiving services must be able to authenticate each other based
| on identities instead of a network identifier.
|
| Kinda semi-offtopic but I am curious to know if anyone has used
| the identity part of a WireGuard setup for this purpose.
|
| So say you have a bunch of machines all connected in a WireGuard
| VPN. And then instead of your application knowing host names or
| IP addresses as the primary identifier of other nodes, your
| application refers to other nodes by their WireGuard public key?
|
| I use WireGuard but haven't tried anything like that. Don't know
| if it would be possible or sensible. Just thinking and wondering.
| madjam002 wrote:
| I too am interested in this.
|
| I long for the day where Kubernetes services, virtual machines,
| dedicated servers and developer machines can all securely talk
| to each other in some kind of service mesh, where security and
| firewalls can be implemented with "tags".
|
| Tailscale seems to be pretty much this, but while it seems
| great for the dev/user facing side of things (developer machine
| connectivity), it doesn't seem like it's suited for the service
| to service communication side? It would be nice to have one
| unified connectivity solution with identity based security
| rather than e.g Consul Connect for services, Tailscale /
| Wireguard for dev machine connectivity, etc.
| starfallg wrote:
| >I long for the day where Kubernetes services, virtual
| machines, dedicated servers and developer machines can all
| securely talk to each other in some kind of service mesh,
| where security and firewalls can be implemented with "tags".
|
| That's exactly what Scalable Group Tags (SGTs) are -
|
| https://tools.ietf.org/id/draft-smith-kandula-sxp-07.html
|
| Cisco implements this as a part of TrustSec
| tptacek wrote:
| We're a global platform that runs an intra-fleet WireGuard
| mesh, so we have authenticated addressing between nodes; we
| layer a couple dozen lines of BPF C on top of that to extend
| the authentication model to customer address prefixes. So,
| effectively, we're using WireGuard as an identity. In fact: we
| do so explicitly for peering connections to other services.
|
| So yeah, it's a model that can work. It's straightforward for
| us because we have a lot of granular control over what can get
| addressed where. It might be trickier if your network model is
| chaotic.
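  As a rough idea of what such a small amount of BPF C might look
  like, here is a hedged sketch of an XDP filter that only passes
  IPv6 traffic sourced from a single trusted prefix (the prefix and
  the policy are placeholders, not the commenter's actual code):

      #include <linux/bpf.h>
      #include <linux/if_ether.h>
      #include <linux/ipv6.h>
      #include <bpf/bpf_endian.h>
      #include <bpf/bpf_helpers.h>

      SEC("xdp")
      int allow_trusted_prefix(struct xdp_md *ctx)
      {
          void *data     = (void *)(long)ctx->data;
          void *data_end = (void *)(long)ctx->data_end;

          struct ethhdr *eth = data;
          if ((void *)(eth + 1) > data_end)
              return XDP_DROP;                  /* truncated frame */
          if (eth->h_proto != bpf_htons(ETH_P_IPV6))
              return XDP_PASS;                  /* not handled here */

          struct ipv6hdr *ip6 = (void *)(eth + 1);
          if ((void *)(ip6 + 1) > data_end)
              return XDP_DROP;

          /* Trusted prefix fd00:1234:5678::/48 (placeholder). */
          if (ip6->saddr.s6_addr[0] == 0xfd &&
              ip6->saddr.s6_addr[1] == 0x00 &&
              ip6->saddr.s6_addr[2] == 0x12 &&
              ip6->saddr.s6_addr[3] == 0x34 &&
              ip6->saddr.s6_addr[4] == 0x56 &&
              ip6->saddr.s6_addr[5] == 0x78)
              return XDP_PASS;

          return XDP_DROP;
      }

      char LICENSE[] SEC("license") = "GPL";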
| tgraf wrote:
| One of the methods that Cilium (which implements this eBPF-
| based service mesh idea) uses to implement authentication
| between workloads is WireGuard. It does exactly what you
| describe above.
|
| In addition, it can also enforce authentication based on
| service-specific keys/certificates.
| allset_ wrote:
| Isn't the Wireguard implementation in Cilium between nodes
| only, not workloads (pods)?
| tgraf wrote:
| It can do both. It can authenticate and encrypt all traffic
| between nodes, which then also encrypts all traffic between
| the pods running on those nodes. This is great because it
| also covers pod to node and all control plane traffic. The
| encryption can also use specific keys for different
| services to authenticate and encrypt pod to pod
| individually.
| q3k wrote:
| You'd be adding a whole new layer of what would effectively be
| dynamic routing. It's doable, but it's not a trivial amount of
| effort. Especially if you want everything to be transparent and
| automagic.
|
| There's earlier projects like CJDNS which provide pubkey-
| addressed networking, but they're limited in usability as they
| route based on a DHT.
| outside1234 wrote:
| There is a good talk about this (and more) from KubeCon:
|
| https://www.youtube.com/watch?v=KY5qujcujfI
| davewritescode wrote:
| From a resource perspective this makes sense but from a security
| perspective this drives me a little bit crazy. Sidecars aren't
| just for managing traffic, they're also a good way to automate
| managing the security context of the pod itself.
|
| The current security model in Istio delivers a pod specific
| SPIFFE cert to only that pod and pod identity is conveyed via
| that certificate.
|
| That feels like a whole bunch of eggs in 1 basket.
| tgraf wrote:
| What the proposed architecture allows is to continue using
| SPIFFE or another certificate management solution to generate
| and distribute the certificates but use either a per-node proxy
| or an eBPF implementation to enforce it. Even if the
| authentication handshake remains in a proxy, moving the data
| encryption into the kernel is a massive benefit from an
| overhead perspective. This already exists and is called kTLS.
| xmodem wrote:
| Doing this with eBPF is definitely an improvement, but when I
| look at some of the sidecars we run in production, I often wonder
| why we can't just... integrate them into the application.
| mixedCase wrote:
| There are good reasons more often than not.
|
| Being able to pick up something generic rather than something
| language-specific.
|
| Not having to do process supervision (which includes handling
| monitoring and logs) within your application.
|
| Not making the application lifecycle subservient to needs such
| as log shipping and request rerouting. People get sig traps
| wrong surprisingly often.
| taeric wrote:
| My gut is that using sidecars doesn't really solve these
| problems straight up. It just moves them to the orchestrator.
|
| Which is not bad. But that area is also often misconfigured
| for supervision. And trapping signals remains mostly broken
| in all sidecars.
| cfors wrote:
| You can! There are downsides though for any sufficiently
| polyglot organization, chief among them maintaining all the
| different client SDKs that need to implement that.
|
| Sidecars are often useful for platform-centric teams that would
| like to have access to help manage something like secrets,
| mTLS, or traffic shaping in the case of Envoy. The team that's
| responsible for that just needs to maintain a single sidecar
| rather than all of the potential SDKs for teams.
|
| Especially if you have specific sidecars that only work on a
| specific infrastructure, for example if you have a Vault
| sidecar that deals with secrets for your service over EKS IAM
| permissions, you suddenly can't start your service without a
| decent amount of mocking and feature flags. Its nice to not
| have to burden your client code with all of that.
|
| Also, there is a decent amount of work being done on gRPC to
| speak xDS, which also removes the need for the sidecar [0].
|
| [0] https://istio.io/latest/blog/2021/proxyless-grpc/
| xemdetia wrote:
| Another thing is that your main application artifact can stay
| static while your sidecar reacts to configuration
| changes/patches/vulns/updates. Depending on your architecture
| it can make some components last for years without a change
| even though the sidecar/surrounding configuration is doing
| all sorts of stuff. Back when more people ran Java
| environments, there were all sorts of settings you could apply
| to just the JVM, without the bytecode changing, for how JCE
| worked, which was extraordinarily helpful.
|
| It depends on your environment and architecture combined with
| how fast you can move especially with third party components.
| Having the microservice be 'dumb' can save everything.
| pjmlp wrote:
| For a moment I thought you were talking about POSIX directory
| services.
| darkwater wrote:
| > Especially if you have specific sidecars that only work on
| a specific infrastructure, for example if you have a Vault
| sidecar that deals with secrets for your service over EKS IAM
| permissions, you suddenly can't start your service without a
| decent amount of mocking and feature flags. Its nice to not
| have to burden your client code with all of that.
|
| Could you please elaborate on this? I don't fully understand
| what you mean. Especially, I don't understand if "Its nice to
| not have to burden your client code with all of that" applies
| to a setup with or without sidecars.
| cfors wrote:
| Take vault for example. Rather than have to toggle a flag
| in your service to get a secret, you could have the vault
| sidecar inject the secret automatically into your
| container, as opposed to having to pass a configuration
| flag `USE_VAULT` to your application, which will
| conditionally have a baked in vault client that fetches
| your secret for you.
|
| Your service doesn't really care where the secret comes
| from, as long as it can use that secret to connect to some
| database, API or whatever. So IMO it makes your application
| code a bit cleaner knowing that it doesn't have to worry
| about where to fetch a secret from.
| darkwater wrote:
| Ok, so you are indeed advocating for the sidecar approach
| (and on this I fully agree, especially this Vault
| example)
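  A sketch of the "agnostic application" side of that design: the
  code reads the secret from a well-known file path, and whether a
  Vault sidecar, a Kubernetes Secret volume or a local dev script put
  it there is invisible to it (path and function name are
  placeholders):

      #include <stdio.h>
      #include <string.h>

      /* Read the database password provisioned by "something" - the
       * application neither knows nor cares what wrote the file. */
      static int read_db_password(char *buf, int len)
      {
          FILE *f = fopen("/var/run/secrets/app/db-password", "r");
          if (!f)
              return -1;                    /* secret not provisioned */

          if (!fgets(buf, len, f)) {
              fclose(f);
              return -1;
          }
          fclose(f);

          buf[strcspn(buf, "\r\n")] = '\0'; /* strip trailing newline */
          return 0;
      }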
| miduil wrote:
| The author of Linkerd argues that splitting this responsibility
| will improve stability, as you'll have a homogeneous interface
| (sidecar proxy) over a heterogeneous group of pods. Updating a
| sidecar container (or using the same one across all
| applications) is possible, whereas if it's integrated into the
| application you'll encounter many more barriers and need much
| wider coordination.
| dboreham wrote:
| Like it or not the socket has become the demarcation mechanism
| we use. Therefore all software ends up deployed as a thing that
| talks on sockets. Therefore you can't/shouldn't put
| functionality that belongs on the other end of the socket
| inside that thing. If you do that it's no longer the kind of
| thing you wanted (a discrete unit of software that does
| something). It's now a larger kind of component (software that
| does something, plus elements of the environment that software
| runs within). You probably don't want that.
| pjmlp wrote:
| The irony is arguing for monolithic kernels with a pile of
| such layers on top.
| Matthias247 wrote:
| I understand how BPF works for transparently steering TCP
| connections. But the article mentions gRPC - which means HTTP/2.
| How can the BPF module be a replacement for a proxy here? My
| understanding is that it would need to understand HTTP/2 framing
| and have buffers - which all sound like capabilities that
| require more than BPF.
|
| Are they implementing an HTTP/2-capable proxy in native kernel C
| code and making APIs to that accessible via BPF?
| tgraf wrote:
| The model I'm describing contains two pieces: 1) Moving away
| from sidecars to per-node proxies that can be better integrated
| into the Linux kernel concept of namespacing instead of
| artificially injecting them with complicated iptables
| redirection logic at the network level. 2) Providing the HTTP
| awareness directly with eBPF using eBPF-based protocol parsers.
| The parser itself is written in eBPF which has a ton of
| security benefits because it runs in a sandboxed environment.
|
| We are doing both. Aspect 2) is currently done for HTTP
| visibility and we will be working on connection splicing and
| HTTP header mutation going forward.
| tptacek wrote:
| What does an HTTP parser written in BPF look like? Bounded
| loops only --- meaning no string libraries --- seems like a
| hell of a constraint there.
| tgraf wrote:
| It looks not too different from the majority of HTTP
| parsers out there written in C. Here is Node.js's parser
| as an example [0].
|
| [0] https://github.com/nodejs/http-
| parser/blob/main/http_parser....
| tptacek wrote:
| Node's HTTP parser doesn't have to placate the BPF
| verifier, is why I'm asking.
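  To make the constraint concrete, here is a hedged sketch of the
  kind of fragment such a parser is built from: a method check done
  with fixed-size loads instead of string functions (a toy, not
  Cilium's actual parser; offset 0 stands in for wherever the TCP
  payload begins):

      #include <linux/bpf.h>
      #include <bpf/bpf_helpers.h>

      enum method { M_OTHER, M_GET, M_POST };

      /* No strncmp() in BPF: compare a bounded, fixed-size load
       * byte by byte so the verifier can track every access. */
      static __always_inline int parse_method(struct __sk_buff *skb,
                                              __u32 payload_off)
      {
          char m[4];

          if (bpf_skb_load_bytes(skb, payload_off, m, sizeof(m)) < 0)
              return M_OTHER;               /* short packet */

          if (m[0] == 'G' && m[1] == 'E' && m[2] == 'T' && m[3] == ' ')
              return M_GET;
          if (m[0] == 'P' && m[1] == 'O' && m[2] == 'S' && m[3] == 'T')
              return M_POST;
          return M_OTHER;
      }

      SEC("socket")
      int http_method(struct __sk_buff *skb)
      {
          /* Toy policy: keep only packets that look like GET/POST. */
          return parse_method(skb, 0) != M_OTHER ? skb->len : 0;
      }

      char LICENSE[] SEC("license") = "GPL";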
| ko27 wrote:
| Not convinced that this is a better solution than just
| implementing these features as part of the protocol. For
| example, most languages have libraries that support gRPC load
| balancing.
|
| https://github.com/grpc/proposal/blob/master/A27-xds-global-...
___________________________________________________________________
(page generated 2021-12-09 23:00 UTC)