[HN Gopher] Notes on BPF and eBPF
___________________________________________________________________
Notes on BPF and eBPF
Author : mlerner
Score : 97 points
Date : 2022-01-02 19:48 UTC (3 hours ago)
(HTM) web link (jvns.ca)
(TXT) w3m dump (jvns.ca)
| daenz wrote:
| >eBPF programs can't access arbitrary kernel memory. Instead the
| kernel provides functions to get at some restricted subset of
| things.
|
| I must finally be becoming a security pessimist when I read those
| sentences and the first thing I think is: these statements will
| not age well.
| Flocular wrote:
| The BPF capability should really only be given to root. I don't
| think it really gives any new attack surface. All I could see
| is it giving black-hats an easier interface to "kernel-level-
| fuckery".
| _3u10 wrote:
| It's not an easier interface. It's much easier to write
| kernel modules than to mess around with eBPF.
| stormbrew wrote:
| Easier to get started maybe, but there's something to be
| said for the 'ease' of having to worry a lot less about
| crashing the kernel you're working on.
| csdvrx wrote:
| Yes, "can't" should be replaced by "shouldn't".
|
| If there's a physical possibility, it's just a matter of time
| before someone finds a way, as was proved by the CPU cache bugs
| leaking information.
| fragmede wrote:
| That's definitely a security professional's read :)
|
| "Isn't supposed to be able to" is a lot longer and distracting
| vs the oversimplification-for-sake-of-understanding of "can't".
| As far as it being proven wrong though - that's already
| happened, eg CVE-2021-29154
|
| https://blog.kernelcare.com/vulnerability/specially-crafted-...
| daenz wrote:
| That's fair. I understand that in the ideal world, it would
| be "can't." I guess my concern is that the wording kind of
| hand waves away any potential security issues, when people
| interested in this tech should absolutely be made aware of
| them.
| netsec_burn wrote:
| Many of the links under "things you can attach eBPF programs to"
| are broken, unfortunately.
| crtxcr wrote:
| >things you can attach eBPF programs to
|
| >...
|
| >seccomp / landlock security things
|
| Landlock does not use *BPF.
|
| Seccomp can only use BPF at this point, not eBPF (though there
| has been some work on it).
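|
| For anyone unsure what "classic" BPF means for seccomp: a filter is
| just an array of 8-byte struct sock_filter instructions that read
| fields of struct seccomp_data. A minimal Python sketch, not a
| production filter: it only builds the bytes and evaluates them with
| a toy interpreter. The helpers (insn, deny_syscall, pack,
| seccomp_data, run) are made up for illustration, it skips the
| architecture check a real filter needs, and it never installs
| anything into the kernel.

```python
import struct

# Classic BPF opcode pieces (values as defined in <linux/bpf_common.h>).
BPF_LD, BPF_JMP, BPF_RET = 0x00, 0x05, 0x06
BPF_W, BPF_ABS = 0x00, 0x20
BPF_JEQ, BPF_K = 0x10, 0x00

# seccomp return actions (values as defined in <linux/seccomp.h>).
SECCOMP_RET_ALLOW = 0x7FFF0000
SECCOMP_RET_ERRNO = 0x00050000  # low 16 bits carry the errno value


def insn(code, jt, jf, k):
    """One struct sock_filter: u16 code, u8 jt, u8 jf, u32 k."""
    return (code, jt, jf, k)


def deny_syscall(nr, errno_value):
    """Hypothetical helper: fail syscall `nr` with errno, allow the rest."""
    return [
        insn(BPF_LD | BPF_W | BPF_ABS, 0, 0, 0),    # A = seccomp_data.nr (offset 0)
        insn(BPF_JMP | BPF_JEQ | BPF_K, 0, 1, nr),  # A == nr: next insn, else skip one
        insn(BPF_RET | BPF_K, 0, 0, SECCOMP_RET_ERRNO | errno_value),
        insn(BPF_RET | BPF_K, 0, 0, SECCOMP_RET_ALLOW),
    ]


def pack(prog):
    """Serialize the program the way the kernel expects it in memory."""
    return b"".join(struct.pack("=HBBI", *i) for i in prog)


def seccomp_data(nr, arch=0):
    """Build a 64-byte struct seccomp_data: nr, arch, ip, args[6].

    arch=0 is a placeholder; this sketch never checks it.
    """
    return struct.pack("=iIQ6Q", nr, arch, 0, 0, 0, 0, 0, 0, 0)


def run(prog, data):
    """Toy interpreter covering only the three opcodes used above."""
    pc, acc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == BPF_LD | BPF_W | BPF_ABS:
            acc = struct.unpack_from("=I", data, k)[0]
        elif code == BPF_JMP | BPF_JEQ | BPF_K:
            pc += jt if acc == k else jf
        elif code == BPF_RET | BPF_K:
            return k
        else:
            raise ValueError(f"opcode {code:#x} not handled by this sketch")
```

| On x86-64, deny_syscall(59, 1) would describe "fail execve with
| EPERM". Actually loading it would take prctl(PR_SET_NO_NEW_PRIVS)
| plus seccomp(SECCOMP_SET_MODE_FILTER), which this sketch leaves out.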
| pwnna wrote:
| BPF is indeed a pretty interesting technology. As the knowledge
| about it becomes more widespread, I anticipate that we will
| unlock some new capabilities in terms of tracing. Brendan
| Gregg's book
| (https://www.brendangregg.com/bpf-performance-tools-book.html)
| serves as a good intro to this, although you probably
| only need to read a small chunk of it as a lot of it is
| reference-book-style material.
|
| The author mentioned that you can trace MySQL with USDT probes,
| which are tracepoints inserted by the developer at select locations
| in the code. These tracepoints form a "stable interface" for
| tracing/performance debugging, whereas uprobes, which hook into
| select userspace functions, are unstable and can break whenever the
| binary is recompiled. Unfortunately, the USDT tracepoints (via DTrace) have
| been removed in MySQL 8.0. This makes it significantly more
| difficult to trace MySQL, although it's not impossible. I've
| done a proof of concept of tracing MySQL with uprobe instead of
| USDT in this repo[1], which can kind of give you the same results
| (and possibly more stuff, as I can more easily read arbitrary
| memory addresses due to how the old USDT tracepoints are
| structured). This is not stable tho, as any MySQL upgrade may
| introduce incompatibility with the trace script, as I read memory
| addresses based on offsets (whereas with USDT this can be kept
| pretty stable). My appeal to Oracle to re-add this
| functionality[2] has unfortunately been rejected, which I think
| is a mistake given the wide range of possibilities unlocked via
| BPF.
|
| [1]: https://github.com/shuhaowu/mysqld-bpf
|
| [2]: https://bugs.mysql.com/bug.php?id=105741
|
| Another thing that I've been recently thinking of is using BPF to
| validate programs written for real-time Linux (via PREEMPT_RT).
| To my understanding, one of the main things to avoid is page
| faults [3]. With the proper BPF tracing scripts, I think we can
| validate that programs indeed avoid page faults in integration
| testing. I'm not sure if it is super useful yet, but as I'm
| trying to write a few RT programs, it's something that came to my
| mind.
|
| [3]: https://lwn.net/Articles/837019/
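|
| A userspace cross-check for the same idea, as a rough sketch: this
| is not BPF, it just reads the process's own fault counters via
| getrusage(2) around a code section (Unix only; the helper names are
| made up). A BPF-based check would presumably hook the kernel's
| page-fault path instead, which could also attribute faults to a
| specific thread.

```python
import resource


def page_faults():
    """Return (minor, major) page-fault counts for this process so far."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_minflt, ru.ru_majflt


def faults_during(fn):
    """Run fn() and report how many page faults the process took meanwhile."""
    minor_before, major_before = page_faults()
    result = fn()
    minor_after, major_after = page_faults()
    return result, minor_after - minor_before, major_after - major_before


def touch_fresh_memory(size=16 * 1024 * 1024, page=4096):
    """Stand-in for a code path under test: allocate and touch new pages."""
    buf = bytearray(size)
    for offset in range(0, size, page):
        buf[offset] = 1  # first write to a page is what typically faults
    return len(buf)


if __name__ == "__main__":
    _, minor, major = faults_during(touch_fresh_memory)
    print(f"minor faults: {minor}, major faults: {major}")
```

| In an integration test, the RT section would replace
| touch_fresh_memory, and the test would assert that both deltas stay
| at zero once memory has been locked and pre-faulted. The counters
| are process-wide, so faults from other threads are counted too.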
|
| In addition to tracing (so bpftrace-based/bcc-based tools), I've
| recently discovered that there are:
|
| 1. ebpfsnitch (https://github.com/harporoeder/ebpfsnitch): which
| is an application-level firewall without kernel modules.
|
| 2. ebpf-traffic-monitor
| (https://source.android.com/devices/tech/datausage/ebpf-
| traff...): which appears to be using BPF to account for traffic
| for different apps on Android.
|
| 3. kubectl trace (https://github.com/iovisor/kubectl-trace): Run
| tracing on k8s.
|
| There are apparently also use cases in the context of security,
| but I'm not familiar with them.
| kylequest wrote:
| Lots of good eBPF info from eBPF Summit:
| https://ebpf.io/summit-2021/ and https://ebpf.io/summit-2020/
|
| Also videos from eBPF Day KubeCon 2021:
| https://www.youtube.com/playlist?list=PLj6h78yzYM2Pm5nF_GmNQ...
| rammy1234 wrote:
| Is the PDF link broken in the blog?
| gnabgib wrote:
| It's available at:
| https://files.speakerdeck.com/presentations/130bc7df16db4556...
| (or click the download button on the slides page)
___________________________________________________________________
(page generated 2022-01-02 23:00 UTC)