[HN Gopher] Beating Google's kernelCTF PoW using AVX512
___________________________________________________________________
Beating Google's kernelCTF PoW using AVX512
Author : anematode
Score : 222 points
Date : 2025-05-30 16:19 UTC (6 hours ago)
(HTM) web link (anemato.de)
(TXT) w3m dump (anemato.de)
| davidfiala wrote:
| Absolutely excellent!
| quasiconnected wrote:
| Beautiful use of number theory for the Mersenne-number modulus!
| Havoc wrote:
| Some level of submission difficulty makes sense to keep the
| garbage submissions at bay but that seems a bit over the top?
|
| Can't imagine it makes a big difference to them whether Google
| pays out 50k or 2x 50k for high-quality bugs relevant to their
| company.
| jameshart wrote:
| Right? This gives the impression that there might be people out
| there who have developed valid attacks but who are holding back
| from revealing them because they haven't optimized their
| strategy to ensure they can win the CTF race.
|
| This seems like a perverse incentive creating suboptimal
| behavior.
| Retr0id wrote:
| That's why they removed it.
| markasoftware wrote:
| No, it's not. Google still has a long time between each
| submission cycle, and people are still holding back their
| exploits in the hope they'll win the next one. It's just a
| matter of luck now, rather than a matter of optimizing PoW.
| jameshart wrote:
| Still a race though. So you can't rig it by optimizing your
| proof-of-work implementation; people are still going to
| look for an edge. These are CTF experts; exploiting systems
| is what they do.
| bee_rider wrote:
| At the beginning of the post, they note a default
| implementation that uses some Google library. So maybe the
| assumption (which worked out for a while?) was that nobody
| would bother doing the "over the top" thing and try to beat
| Google's library?
|
| One option could be to just open up, like, 10 slots instead,
| right? Most folks might use this implementation (since it has
| been posted anyway). That way coming in second place in the
| race isn't so bad...
| internetter wrote:
| I honestly don't see why there shouldn't be infinite slots.
| Google has the money. Right now they've created a backlog of
| an unknowable number of exploits, sequentially being
| submitted one by one. It is possible the backlog grows faster
| than it is emptied. If they let them all through at once,
| there would be an upfront cost, but it wouldn't be like that
| every time, and the pace at which exploitable bugs are fixed
| would increase.
|
| The only restriction should be that the first submission of a
| given bug is the one that gets the reward.
| overfeed wrote:
| > Can't imagine it makes a big difference to them whether
| google pays out 50k or 2x 50k for high quality bugs
|
| You're glossing over important implementation details - it's
| not _Google_ running this program but a specific team inside
| the company that has a fixed budget and limited headcount and
| is doing the best with what it has.
|
| Your argument is similar to "America has trillions in GDP and
| could afford to provide twice the number of free treatments to
| kids with cancer" - it sounds reasonable in aggregate, but
| breaks down in the specifics; the individual hospital systems,
| teams and care providers are currently working close to their
| limits.
| avidiax wrote:
| > The proof of work (implemented here) is a certain verifiable
| delay function (VDF) known as "sloth". VDFs are cryptographic
| primitives which prove that a nontrivial amount of time has
| passed by requiring a long, serial computation. This computation
| outputs a proof which can be (relatively) quickly verified.
| Because the computation is serial, scaling to more computational
| resources (such as more CPU or GPU cores) does not reduce the
| runtime.
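|
| To make that prove/verify asymmetry concrete, here is a minimal
| Python sketch of the square-root-based "sloth" idea (purely
| illustrative, not Google's implementation; it assumes the
| Mersenne prime 2**1279 - 1 as the modulus, consistent with the
| 2^1277 exponent mentioned elsewhere in the thread):
|
|   p = 2**1279 - 1   # prime with p % 4 == 3
|
|   def prove(seed, iterations):
|       # Iterated modular square roots: inherently serial,
|       # each step is a full exponentiation mod p.
|       x = seed % p
|       for _ in range(iterations):
|           x = pow(x, (p + 1) // 4, p)
|       return x
|
|   def verify(seed, proof, iterations):
|       # Undo the roots with much cheaper squarings.
|       x = proof
|       for _ in range(iterations):
|           x = x * x % p
|       return x in (seed % p, (-seed) % p)  # sign may flip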
|
| Maybe this sort of thing is a market opportunity for the CPU
| makers?
|
| Add two special instructions: one that iterates a sloth given
| the seed value, and another that signs the result
| cryptographically to prove that it was produced by this model
| of CPU. Perhaps have hardware rate limiting to prevent
| overclocking.
|
| Another idea would be to monitor CPU uptime and sign the uptime +
| a challenge. The uptime is cleared after signing. This is more
| like proof of stake, where we are proving that we dedicated n
| seconds of CPU ownership, while allowing the CPU to be used
| productively otherwise.
| bawolff wrote:
| So if I understand correctly - it's a 4-second proof of work,
| and there is a payout once a month.
|
| Are there really so many exploits that people are racing to get
| it every month?
| philodeon wrote:
| Yes, the myth of Linux kernel security is just that, a myth.
| api wrote:
| But we don't need safe languages.
| jeffbee wrote:
| How does this myth even exist? There has never been a moment
| in the history of Linux where it lacked an exploitable flaw.
| The only uncertainty is that for the last couple of days we
| aren't sure exactly what the exploit is.
| landr0id wrote:
| A myth propelled by people who don't understand security
| continually saying "Anyone can read the code, therefore
| it's more secure".
| mmsc wrote:
| b-b-but, my linux distribution is perfect for multi-
| tenant/multi-user purposes!!
| udev4096 wrote:
| Most of the serious security researchers, such as Daniel
| Micay (lead developer of GrapheneOS), have been quite
| vocal [0] about how insecure Linux is.
|
| [0] - https://old.reddit.com/r/GrapheneOS/comments/bddq5u
| /os_secur...
| spookie wrote:
| I would really like for him to go into detail about why
| Flatpak is flawed from day one, or about his MAC criticisms.
| Fedora has done a lot on the former, and there's no mention
| of it.
|
| Also, why not recognize the strides Wayland has made for
| security on the desktop? Very handwavey take.
| aidenn0 wrote:
| This myth exists because:
|
| 1. When Linux was "coming of age" (around the turn of the
| millennium), Windows security was _really_ bad.
|
| 2. At the same time, there were far more Windows machines
| than Linux machines.
|
| 3. #1 and #2 together made Windows such an attractive
| target that exploits for Linux were less of a thing.
| rhet0rica wrote:
| Don't forget: 4. Poor binary compatibility made it seem
| like worms were impossible, and twenty years ago, the
| most dangerous cyberattacks were 'pranks' designed to
| hose as many PCs as possible
| whizzter wrote:
| In Linux? No. C/C++-compiled binaries with dynamic
| linking often had issues due to differing glibc
| versions, etc., but you could always compile things
| statically.
|
| Static linking was "not OK" under the LGPL licence, so it
| was an issue for games etc. that wanted to stay closed
| source, which is why people talked about it; but a
| malicious worm creator could definitely go there, since
| they probably wouldn't care about (L)GPL compliance
| (remember, Linus has always been very hard on patches not
| breaking userspace ABIs).
| rhet0rica wrote:
| Yeah. Emphasis on the "seem" part. A false sense of
| security brought on by "If binary artifacts for package
| _foo_ can't be ported between distros, what chance do
| worms have?" plus the even older memory that the Morris
| worm had been designed around (and limited to) certain
| target platforms and architectures:
| https://en.wikipedia.org/wiki/Morris_worm
| rfoo wrote:
| I thought that when people said "Linux security is a myth"
| they were comparing it to FreeBSD / OpenBSD. So it's
| revealing that most are comparing Linux to Windows here.
| thephyber wrote:
| The myth has some truth to it.
|
| The impact of the average Windows exploit was higher than
| that of the average Linux exploit because non-NT Windows
| didn't use best practices such as multiple accounts + least
| privilege. And for years there were daemons on Windows that
| would get exploited with RCE while just idling on a network
| (e.g. Conficker).
|
| It took Microsoft several years of being a laughing stock
| of security before Bill Gates made the decision to slow new
| feature development while the company prioritized security.
| Windows security then increased. I believe that was around
| the time that Microsoft started reusing the NT kernel in
| the consumer versions of Windows.
|
| Also, the myth ignores the fact that cybersecurity risk has
| components of likelihood (a percentage) and impact (an
| absolute number, sometimes a dollar value). Conflating the
| two components invites lots of arguments and confusion, as
| commonly seen when certain CVEs are assigned non-obvious
| CVSS scores.
| bonzini wrote:
| > around the time that Microsoft started reusing the NT
| kernel in the consumer versions of Windows.
|
| Much later. Windows XP used the NT kernel in 2001,
| whereas Conficker was in 2008.
| JackSlateur wrote:
| Is it, though?
|
| The bug/feature ratio is what matters (systems with no bugs
| but no features are nice, but we have work to do).
| dist-epoch wrote:
| These are local escalation of privilege exploits (becoming root
| from a regular user), not remote code execution. Escalation of
| privilege bugs are a dime a dozen.
| internetter wrote:
| It wouldn't be reasonable to expect it to be an RCE bug. That
| wouldn't be a kernel bug; it would be in the networking stack
| or the software running on top.
| eptcyka wrote:
| Where is the networking stack running on a typical Linux
| deployment? (:
| internetter wrote:
| Yes, and most software runs on a host operating system.
| Vulnerabilities in the software of 3rd-party developers
| aren't in scope, even if that software extends the kernel.
| Some networking tasks _are_ delegated to the kernel, but
| almost all of the really risky stuff is not in the core
| kernel.
| singron wrote:
| I think this also requires user (i.e. unprivileged)
| namespaces since you have to manipulate traffic control queue
| disciplines (tc qdisc). You normally need to be root to do
| this, so it's only useful as an escalation if you can do it
| within a namespace or you can convince some root daemon to do
| the qdisc manipulation for you (I think unlikely?).
|
| User namespaces opened up an enormous surface area to local
| privilege escalation since it made a ton of root-only APIs
| available to non-root users.
|
| I don't think user namespaces are available on Android, and
| they're sometimes disabled on Linux distributions, although I
| think more are starting to enable them.
|
| Relatedly, kernelCTF just announced they will be disabling
| user namespaces, which is probably a good idea if they are
| inundated with exploits: https://github.com/google/security-
| research/commit/7171625f5...
| Aurornis wrote:
| The server was open every 2 weeks. The proof of work was to
| slow down connections slightly to reduce incentives to spam as
| many connection requests as possible.
|
| Public CTFs are hard because inevitably some team will try
| something resembling a DDoS as part of racing to the finish.
|
| Google removed the proof of work step after this.
| IshKebab wrote:
| Kind of feels like a weird criterion to reduce the number of
| submissions. In the real world it doesn't matter if an exploit
| takes 4 seconds or 6. Why not use some other, more interesting
| criterion, like how long ago the bug was introduced, or some of
| the many bonus criteria they have?
|
| Just seems weird for it to be a race.
| ramshanker wrote:
| Awesome.
|
| Just when I was starting to wonder if we finally had a chance to
| thwart L7 attacks using these PoW tricks.
| SunlitCat wrote:
| There is something off with
|
| > ...and despite supporting it [AVX512] on consumer CPUs for
| several generations...
|
| I dunno. Before Rocket Lake (11th gen), AVX512 was only
| available in enthusiast CPUs, Xeon CPUs or some mobile
| processors (which I wouldn't really call consumer CPUs).
|
| With the 12th gen (and that performance/efficiency core
| concept), they disabled it after a few months on those cores
| and it was never seen again.
|
| I am pretty sure though that after AMD has some kind of
| success with AVX512, Intel will reintroduce it.
|
| BTW, I am still rocking an Intel i9-11900 CPU in my setup
| here. ;)
| anematode wrote:
| fixed :) thx
| Aurornis wrote:
| > With the 12th gen (and that performance/efficiency core
| concept), they disabled it after a few months in those core and
| it was never seen again.
|
| The 12th gen hybrid CPUs (performance plus efficiency cores)
| didn't even advertise AVX512 support or have it enabled out of
| the box. Intel didn't include AVX512 on the efficiency cores
| for space reasons, so the entire CPU was considered not to have
| AVX512.
|
| It was only through a quirk of some BIOS options that you could
| disable the efficiency cores and enable AVX512 on the remaining
| P-cores. You had to give up the E-cores as a tradeoff.
| adgjlsfhk1 wrote:
| IMO this is more a sign of complete dysfunction at Intel.
| They definitely could have made AVX512 instructions trigger a
| switch to P-cores, and honestly probably could have supported
| them completely (if slightly slowly) by splitting ops the same
| way AMD does on Zen4 and Zen5 C cores. The fact that they
| shipped P-cores and E-cores with different instruction sets
| is what you get when you have separate teams competing with
| each other rather than working together to make a good
| product.
| pitust2 wrote:
| > They definitely could have made avx512 instructions
| trigger a switch to p-cores
|
| Not really, no. OS-level schedulers are complicated as is
| with only P vs E cores to worry about, let alone having to
| dynamically move tasks because they used a CPU feature (and
| then moving them back after they don't need them anymore).
|
| > and honestly probably could have supported them
| completely by splitting the same way AMD does on Zen4 and
| Zen5 C cores.
|
| The issue with AVX512 is not (just) that you need a very
| wide vector unit, but mostly that you need an incredibly
| large register file: you go up from 16 * 256 bit = 4096
| bits (AVX2) to 32 * 512 bit = 16384 bits (AVX512), and on
| top of that you need to add a whole bunch of extra
| registers for renaming purposes.
| oparin10 wrote:
| That tracks. Intel's updated AVX10 whitepaper[1] from a few
| months back seems to confirm this. It explicitly states 512-bit
| AVX will be standard for both P and E cores, moving away from
| 256-bit only configs. This strongly implies AVX-512 is making a
| proper comeback, not just on servers but future consumer CPUs
| with E-cores too. Probably trying to catch up with AMD's wider
| AVX-512 adoption.
|
| [1] - https://cdrdv2.intel.com/v1/dl/getContent/784343 (PDF)
| pittma wrote:
| Cool stuff! This method is very similar to how AVX-512-optimized
| RSA implementations work too, as they also have to do Very Large
| Exponentiations. This paper[1] covers how RSA does its windowing,
| which includes the formula showing how the window size can be
| arbitrary. One additional thing AVX-512 RSA implementations do,
| though, is store the results of the multiplications for the range
| [0..2^{window-size}) in a table; then, for each window, that
| result is retrieved from the table[2] and only the
| shifting/repositioning has to be done.
|
| 1. https://dpitt.me/files/sime.pdf (hosted on my domain because
| it's pulled from a journal)
|
| 2. https://github.com/aws/aws-
| lc/blob/9c8bd6d7b8adccdd8af4242e0...
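|
| As a rough illustration of that fixed-window idea (plain
| Python, ignoring the multi-limb AVX-512 arithmetic; the
| function name and window size here are arbitrary):
|
|   def windowed_pow(base, exponent, modulus, w=5):
|       # Precompute base**i mod modulus for i in [0, 2**w).
|       table = [1] * (1 << w)
|       for i in range(1, 1 << w):
|           table[i] = table[i - 1] * base % modulus
|
|       result = 1
|       bits = exponent.bit_length()
|       top = ((bits + w - 1) // w) * w - w
|       # Walk the exponent w bits at a time, MSB first:
|       # w squarings per window, then one table lookup+multiply.
|       for shift in range(top, -w, -w):
|           for _ in range(w):
|               result = result * result % modulus
|           window = (exponent >> shift) & ((1 << w) - 1)
|           result = result * table[window] % modulus
|       return result
|
|   m = 10**9 + 7
|   assert windowed_pow(7, 65537, m) == pow(7, 65537, m)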
| anematode wrote:
| Ooh interesting, I should have looked at this while
| developing.... Looks like that code could definitely use
| another version for e.g. Zen 5, where using zmm registers would
| lead to 2x the multiplication throughput. Also, the mask
| registers are bounced to GPRs for arithmetic, which is
| suboptimal on Zen 4/5.
|
| Separately, I'm wondering whether the carries really need to be
| propagated in one step. (At least I think that's what's going
| on?) The chance that a carry in leads to an additional carry
| out beyond what's already there in the high 12 bits is very
| small, so in my code, I assume that carries only happen once
| and then loop back if necessary. That reduces the latency in
| the common case. I guess with a branch there could be timing
| attack issues, though.
| pittma wrote:
| ymms were used here on purpose! With full-width registers,
| the IFMA insns have a deleterious effect on frequency, at
| least in the Icelake timeframe.
| anematode wrote:
| Ye, hence a separate version for CPUs which don't have that
| problem. Although, maintaining so many of these RSA kernels
| does seem like a pain. Didn't realize you wrote that code;
| super cool that it's used in practice!
| pittma wrote:
| I am not the original author--this is adapted from an
| implementation by Shay Gueron, the author of that paper I
| linked, but I do agree that it's cool!
| supriyo-biswas wrote:
| Can anyone tell me how they equated the Python function in the
| blog to what was shown in Google's PoW implementation? It seems
| a little difficult to follow.
| bonzini wrote:
| The "exponent = (p + 1) // 4" in the Google code is 2^1277. To
| raise a number to such a huge power, you square it 1277 times
| (and that is what the Python function does). That works because
| if the initial value is "x", each squaring doubles the number
| of "x"s, and at the end you have 2^1277 of them.
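|
| Spelled out as a quick Python check of that equivalence (p
| being the Mersenne prime 2**1279 - 1, since (p + 1) // 4 is
| then 2**1277):
|
|   p = 2**1279 - 1
|   assert (p + 1) // 4 == 2**1277
|
|   x = 12345
|   y = x % p
|   for _ in range(1277):
|       y = y * y % p          # 1277 successive squarings
|   assert y == pow(x, (p + 1) // 4, p)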
| dcrazy wrote:
| For more on the base-52 representation that this article
| mentions, see this other post from today's front page:
| https://news.ycombinator.com/item?id=44132673
| bluelightning2k wrote:
| This is impressive. But it just strikes me as optimizing the
| wrong thing. A CTF shouldn't be about submission ops. Surely it
| would be better for everyone if all teams that send the flag
| within a submission window shared the prize.
| rfoo wrote:
| That would be a different meta-game, and while I haven't
| thought deeply about it, I believe the likely outcome is that
| people just get discouraged and won't consider submitting to
| kernelCTF.
| kimixa wrote:
| I'd also say that this encourages people to sit on exploits
| instead of reporting them immediately, so they can try to
| submit them next time if they didn't win, even without
| submission timing shenanigans.
|
| So it may be actively encouraging the "wrong" thing as well.
| vlovich123 wrote:
| I don't understand the purpose of the race. Why not just pay out
| per unique exploit?
| rfoo wrote:
| Bosses want a strictly fixed budget for running those cool
| programs. The rationale behind these programs is (at least
| partially) about measuring exploit and mitigation dynamics,
| not buying bugs. And Linux is just so buggy that if you pay
| for every 0-day, it's basically out of control. Google tried
| that once (and tried to drain people's hoarded bugs): they ran
| a limited-time promotion with no race where every 0-day
| counted, and got flooded.
|
| And at the same time you don't want to piss off the community,
| so here we go.
| 0x0 wrote:
| If Linux kernel security really is so bad that Google had to
| add a proof-of-work to introduce a 4-second race for 0-day
| submissions, I'm surprised they're OK with still using the
| Linux kernel as the base for Android.
| udev4096 wrote:
| Android has a vastly improved version of the Linux
| kernel: https://old.reddit.com/r/GrapheneOS/comments/bddq5u
| /os_secur...
| kimixa wrote:
| All of that's talking about the userspace though?
| rfoo wrote:
| Not all. For example, kCFI is kernel space.
|
| Also, attack surface reduction is a very valid strategy,
| so it may seem like it's about userspace (a sandbox for
| every app, etc.), but it can make a big difference in how
| much of the kernel attack surface is exposed.
| kimixa wrote:
| Yes, but the concept of CFI is only mentioned in passing
| in that entire thread, and the kCFI implementation used
| is a vanilla kernel feature and not Android-specific.
|
| There's a lot to be said that "Distro kernel config
| choices may not be as secure as possible", but that's not
| really an "Android"/"Vanilla Linux Kernel" difference.
| rfoo wrote:
| Well, I don't know of kCFI being enabled on any distro
| besides Android, because it requires building the kernel
| with Clang.
|
| The previous in-kernel CFI implementation (before the
| kinda joint effort - kCFI) was upstreamed by Google, too:
| https://www.phoronix.com/news/Clang-CFI-Linux-Patches and
| https://www.phoronix.com/news/Linux-Kernel-Clang-LTO-
| Patches. Pixel devices also had this long before. Given
| that the entire Linux kernel feature was developed out of
| Android I find it a little bit unfair to call it "using a
| vanilla kernel feature".
| kimixa wrote:
| I'd argue that the entire point of using a shared open
| source kernel is that other users can benefit from
| additions.
|
| Arguing about "who first added a feature" seems to be a losing
| spiral of needless tribalism. How many features does the
| Android kernel use that _weren't_ developed by Google?
| Does that mean they _wouldn't_ have developed those
| features? Or just that there's no point making a parallel
| implementation if one already exists?
| rfoo wrote:
| The point here is not who first added the feature to the
| Linux kernel. The point is that Android cared about security,
| built a CFI implementation, and started shipping it back in
| 2018, while Linux had other priorities and didn't have it
| until 2021. And even then almost nobody adopted it.
| pas wrote:
| Is there some more info somewhere about that flood? How many
| teams? How many submissions? How many unique? How much money
| did G pay out?
| rfoo wrote:
| So, here is a tweet from Eduardo highlighting that they got
| a lot of 0day submissions immediately after they launched
| the promotion:
| https://x.com/sirdarckcat/status/1681491274489282565
|
| And the spreadsheet is public [0]; I counted 7 unique
| hoarded bugs (submitted immediately after launch) after
| deduplicating by patch commit hash. Then in the month
| following, another 9 unique bugs were submitted.
|
| As for how much was paid, I don't remember. Could be around
| $200~300k in total.
|
| [0] https://docs.google.com/spreadsheets/d/e/2PACX-1vS1REdT
| A29OJ...
| mmsc wrote:
| Amazing stuff, but it also reads like a comedy due to the
| obstacles to winning this challenge. Real Rube Goldberg stuff.
| initramfs wrote:
| Neat. What website template is that? I'm looking for a simple
| blog.
| Brybry wrote:
| I think it's something like this
| https://github.com/timlrx/tailwind-nextjs-starter-blog
|
| basically nextjs + tailwind + mdx
___________________________________________________________________
(page generated 2025-05-30 23:00 UTC)