[HN Gopher] A call to reconsider memory address-space isolation ...
___________________________________________________________________
A call to reconsider memory address-space isolation in Linux
Author : chmaynard
Score : 183 points
Date : 2022-09-30 11:03 UTC (11 hours ago)
(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)
| anonymouse008 wrote:
| I'm not smart enough to understand kernels, or how memory works
| at an operating system level, but does this have anything to do
| with Apple's recent announcement that all unallocated memory will
| be zeroed out?
|
| I've been wondering, albeit naively, if TikTok and Facebook have
| been scraping unallocated memory allocations to achieve "without
| achieving" listening in on our microphones and cameras. Siri is
| always listening / camera sometimes pops up before accessing
| photos and the unallocated recordings / visuals could still be
| there? It could explain the uncanny awareness and specificity of
| their ads / algos.
| valleyer wrote:
| Just in case the downvotes didn't answer your question: no.
| Memory was _already_ zeroed out before it crossed process
| boundaries. Apple's recent change is about zeroing it out even
| before it's reused _within_ a process. (Freeing memory doesn't
| always give it back to the operating system; the userland
| allocator usually keeps a list of free blocks to be reused
| quickly.)
|
| Therefore, the attack you are describing doesn't work.
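|
| A minimal userspace sketch of that distinction (illustrative
| only; whether a freed block comes back with its old contents is
| allocator-dependent behavior, not something the C standard
| promises):
|
|   /* Fresh pages from the kernel are zero-filled; a freed
|      malloc() block usually sits on a userland free list and
|      may be reused without being cleared. Error handling is
|      omitted for brevity. */
|   #include <stdint.h>
|   #include <stdio.h>
|   #include <stdlib.h>
|   #include <string.h>
|   #include <sys/mman.h>
|
|   int main(void) {
|       /* Anonymous pages straight from the kernel: zeroed. */
|       char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
|                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
|       printf("fresh page byte: %d\n", page[0]);   /* 0 */
|
|       /* Heap reuse in one process: no zeroing promised. */
|       char *a = malloc(32);
|       uintptr_t old = (uintptr_t)a;
|       strcpy(a, "secret");
|       free(a);               /* likely kept on a free list */
|       char *b = malloc(32);  /* often the very same block */
|       printf("same block reused: %s\n",
|              (uintptr_t)b == old ? "yes" : "no");
|
|       munmap(page, 4096);
|       free(b);
|       return 0;
|   }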
| anonymouse008 wrote:
| Thanks kind stranger, I appreciate your patience!
| UI_at_80x24 wrote:
| If I understand correctly, OpenBSD has done this for years too.
| IIRC, around the time the first Spectre vulnerability was
| announced, OpenBSD patched all of those possible variations as
| well.
| reacharavindh wrote:
| Not a kernel developer. It sounds like a useful feature in some
| contexts (shared hosts, multi-tenant setups, etc.), but it would
| be useful to run without these constraints when they don't apply
| (e.g. HPC doing one thing using all available resources).
| Couldn't this be implemented as a feature flag in the kernel, so
| you enable it if you need it (and disable it if you know what
| you're doing)?
|
| As a side benefit, if smart folks eventually find a way to
| reduce the overhead of this feature to negligible levels, the
| feature flag would become unnecessary.
| 0xbadcafebee wrote:
| Personally I would rather we just lean more into microVMs. From a
| cloud computing perspective, I don't care what security Linux
| claims to have, because I just assume it is ineffective and move
| forward from there. We build systems using loosely-coupled,
| strongly-authenticated, temporary sessions, and it's very easy to
| enforce separation of concerns (and thus minimize attack
| surface).
| slt2021 wrote:
| Your code may be secure, but your cloud neighbor or VM host may
| not be. If the VM host is taken over, then your VMs are at
| risk too https://unit42.paloaltonetworks.com/azure-container-
| instance...
| 0xbadcafebee wrote:
| True, but it's pretty rare for hypervisors to be exploited.
| In that Azure case the node itself maintained a connection to
| each VM, which kind of defeats the purpose of using VMs for
| isolation.... It's like having strong memory guarantees in
| Linux and then having an open socket from a root-owned
| process to every other process... like, maybe giving them a
| new attack vector to a SPOF isn't a great idea
| fsflover wrote:
| If you care about security, Qubes OS is more secure, with its
| hardware virtualization: https://qubes-os.org.
| staticassertion wrote:
| It would be nice to have a more trustworthy stack at every
| ring, I think.
| hinkley wrote:
| Somewhere, Andrew Tanenbaum is chewing popcorn. Loudly.
| intelVISA wrote:
| muh MINIX backdoor
| astrange wrote:
| That's already running on the Intel PCH.
| [deleted]
| marcosdumay wrote:
| Next in line, a plea for splitting the source tree into
| different repositories. Followed by a plea for stabilizing the
| core kernel API...
|
| Then we can replace the core API with an seL4 emulator.
| kramerger wrote:
| A 2-14% performance drop sounds high, so I understand why people
| are sceptical. Maybe the subsystems can be structured in a
| different way to ease the effects of ASI?
|
| Personally, I would rather see a kernel-side MPU in future CPUs.
| junon wrote:
| No. This is unavoidable. Mapping physical pages into a local
| page table is expensive no matter how you spin it. Depending on
| the data structures, this means doubling the time in some cases
| to allocate a physical page, which is typically 4KiB. Large
| allocations mean multiple page allocations. It adds up.
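|
| As a rough back-of-the-envelope illustration (the numbers are
| made up, not measurements; the point is just that the per-page
| cost scales with the size of the allocation):
|
|   #include <stdio.h>
|
|   int main(void) {
|       const unsigned long page_size = 4096;   /* 4 KiB */
|       unsigned long sz = 16UL << 20;          /* 16 MiB */
|       unsigned long pages = (sz + page_size - 1) / page_size;
|       /* Under ASI, each of these pages may also need to be
|          mapped into the restricted page table. */
|       printf("%lu page mappings for a 16 MiB buffer\n", pages);
|       return 0;
|   }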
| hinkley wrote:
| Correct me if I'm wrong, but wouldn't complicated page mapping
| affect the cost of system calls the most? Because each call
| causes two alterations to the MMU state.
|
| I wonder how this patch interacts with io_uring. If kernel call
| overhead keeps going up it will force people to find ways to
| avoid the overhead.
| gpderetta wrote:
| If I understand correctly the patch at the moment doesn't
| affect system calls at all, only entering and exiting virtual
| machines.
| tptacek wrote:
| It's 2-14% offset potentially by whatever the cost is of the
| other mitigations that it replaces, for whatever that's worth.
| Filligree wrote:
| I'm just going to keep turning all mitigations off, thanks.
| Veliladon wrote:
| You do you but I hope you've also turned off JavaScript in
| your web browser.
| coldpie wrote:
| Is there any evidence of anyone actually attempting such
| an attack outside a carefully controlled research
| scenario? The number of stars that have to align for this
| theoretical attack to work in practice is so high that I
| don't think any normal desktop end-user has reason to
| worry about it. (Shared servers, co-hosted VMs, etc are a
| different story.)
| garaetjjte wrote:
| I think there might be a herd immunity effect at play:
| because mitigations are enabled by default, there isn't
| much focus on developing attacks against them.
| Veliladon wrote:
| True there isn't at the moment but do you really want to
| be the first to find out because you left the back door
| open?
| coldpie wrote:
| I'd liken it to refusing to leave your house because
| you're worried a meteor may strike you down. You're
| giving up a lot for not much reason.
| staticassertion wrote:
| Not quite. A meteor is an object with no drive or desire.
| Attackers are adversarial and thoughtful. You can't liken
| a random event to a purposeful act like this.
|
| It's _more_ like saying "refusing to leave your house
| because you're worried you might get mugged", which might
| actually be very reasonable if you live in a bad
| neighborhood, you're a common target of crime, etc. It
| may not be reasonable for many others.
|
| But analogies are pretty rough in general.
| coldpie wrote:
| No, I think meteor is closer. Mugging is in the realm of
| possibility, this javascript driveby theoretical attack
| is wildly implausible. Like, take the time to spell it
| out: what sequence of events has to occur for this attack
| to be successfully pulled off?
| staticassertion wrote:
| Which attack are you referring to? There have already
| been POCs for many speculative execution attacks.
| coldpie wrote:
| Whatever one Veliladon was referring to when they
| asserted one must run either mitigations or no
| javascript. My point is those POCs are not sufficient
| evidence that mitigations with real performance impact
| are justified for the typical desktop end-user. Again:
|
| > The number of stars that have to align for this
| theoretical attack to work in practice is so high that I
| don't think any normal desktop end-user has reason to
| worry about it.
| staticassertion wrote:
| Oh ok. So, no, the stars don't have to align at all. The
| attack is straightforward, the POCs show that.
|
| The reason we don't see these attacks is because everyone
| patched the major issues immediately. Further, attackers
| don't need to really go for these sorts of attacks, there
| are more reliable, well-worn methods for attacking
| browsers.
| coldpie wrote:
| > So, no, the stars don't have to align at all. The
| attack is straightforward, the POCs show that.
|
| Please spell it out for me. Suppose I'm a typical desktop
| user, how is important information going to be stolen if
| I have mitigations turned off and JavaScript enabled?
| What state does my browser have to be in, and what
| actions do I have to take (or not take) for the attack to
| succeed? What likelihood is it that someone has deployed
| an attack that meets those requirements?
|
| > Further, attackers don't need to really go for these
| sorts of attacks, there are more reliable, well-worn
| methods for attacking browsers.
|
| So we agree it's OK to leave mitigations off and browse
| the web?
| staticassertion wrote:
| > Suppose I'm a typical desktop user, how is important
| information going to be stolen if I have mitigations
| turned off and JavaScript enabled?
|
| https://github.com/google/security-research-
| pocs/tree/master...
|
| I don't imagine I'm going to explain it better than the
| many others who have already done so.
|
| > What state does my browser have to be in, and what
| actions do I have to take (or not take) for the attack to
| succeed?
|
| Your browser would have to be pretty old/outdated since
| they've been updated to mitigate these attacks. Otherwise
| it's just necessary that you visit the attacker
| controlled website.
|
| > What likelihood is it that someone has deployed an
| attack that meets those requirements?
|
| That's not a simple question. Threat landscapes change
| based on a lot of factors. As I said earlier, we won't
| see these attacks because people have already patched and
| attackers have other methods.
|
| > So we agree it's OK to leave mitigations off and browse
| the web?
|
| You can do whatever you want; idk what you're trying to
| ask here. What is "OK"? You will be vulnerable but
| unlikely to be attacked, for the reasons mentioned. If
| you are "OK" with that, that's up to you.
| totony wrote:
| Browsers lowered their timer resolution to mitigate most
| speculative execution attacks. That's probably why there
| aren't useful exploits. I remember a JS payload when the
| first Spectre appeared.
| netr0ute wrote:
| I keep my important info on paper.
| intelVISA wrote:
| I've memorized my private key; it's the only way to be
| safe in 2022.
| coolspot wrote:
| Let's test your memory, what is it?
| astrange wrote:
| Why do you want to let driver memory bugs overwrite random
| kernel memory?
| anvic wrote:
| For most of us driver memory bugs are extremely rare and
| therefore a 2-14% performance drop is bonkers.
| astrange wrote:
| Anything you're not testing regresses, and any driver you
| let out of jail is going to regress too.
|
| Properly engineered security fixes don't cause
| performance regressions, either because you find
| improvements to pay for them, or you get the hardware
| updated to make them cheaper. (That'd be PCID in this
| case.)
| mrguyorama wrote:
| My AMD GPU driver bugs out pretty regularly. Hardware
| companies are not known for their bulletproof driver
| code.
| aaaaaaaaaaab wrote:
| This is a good opportunity for Rust!
| colejohnson66 wrote:
| Not arguing with you, but how? Rust is about memory safety, not
| memory isolation. It keeps me from writing out of bounds, but
| it does nothing to stop a malicious kernel module from stealing
| `SECRET_KEY` out of the kernel's local memory.
| m000 wrote:
| > Sometimes, though, there will be a need to access sensitive
| memory. An important observation here, Weisse said, is that
| speculative execution will never cause a page fault. So, if the
| kernel faults while trying to do something with sensitive memory,
| the access is known to not be speculative; the kernel responds by
| mapping the sensitive ranges and continuing execution.
|
| So, is the isolation patch a real solution to speculative
| execution attacks? Or is it just adding another hurdle for the
| attacker to jump?
|
| I.e. (if I interpret this correctly) the attacker would just be
| forced to add a "priming" stage, where they trick the kernel
| into mapping the sensitive ranges they need for their
| speculative attack. So the effectiveness of the patch boils down
| to (a) the feasibility of this priming stage, and (b) how long
| the attacker can keep the mapped memory in place.
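|
| For concreteness, here is a rough, hypothetical sketch of the
| fault-handling idea in the quote (the names are invented for
| illustration and are not taken from the ASI patches): since
| speculation never raises a visible page fault, a real fault on
| an unmapped sensitive range proves the access is architectural,
| so the handler can map the ranges and resume.
|
|   #include <stdbool.h>
|   #include <stdio.h>
|
|   /* Stubs standing in for real kernel machinery. */
|   static bool asi_addr_is_sensitive(unsigned long addr) {
|       return addr >= 0xffff888000000000UL;  /* made-up range */
|   }
|   static void asi_map_sensitive_ranges(void) {
|       puts("mapping sensitive ranges");
|   }
|
|   /* Returns 1 if the fault was an architectural access to an
|      ASI-unmapped sensitive range (speculation never raises a
|      visible fault); 0 means normal fault handling. */
|   static int asi_handle_fault(unsigned long addr) {
|       if (!asi_addr_is_sensitive(addr))
|           return 0;
|       asi_map_sensitive_ranges();
|       return 1;   /* re-execute the faulting access */
|   }
|
|   int main(void) {
|       unsigned long fault_addr = 0xffff888000001000UL;
|       printf("handled: %d\n", asi_handle_fault(fault_addr));
|       return 0;
|   }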
| e63f67dd-065b wrote:
| I think the idea is that faults on sensitive pages are known to
| be good, because they can only come when running kernel code.
| They are unmapped when exiting from kernel code, so in theory
| there's no way for userspace to speculatively execute on those
| pages and leak sensitive memory.
|
| It does seem to be a good fix for the root cause of the
| problem, but I'm skeptical about the details. I haven't looked
| through the patch set, but narrowing down what's sensitive and
| what isn't is going to be a monumental task.
| [deleted]
| pas wrote:
| unmapped pages cannot be speculatively executed "into"? (due
| to TLB flush? which happens during unmapping?)
| simcop2387 wrote:
| My understanding is that this is because the TLB update
| can't be speculated through: once speculative execution
| hits that barrier it stops, and can't then look at data
| that would be mapped in if speculation kept going.
| Basically there's a hard fence there that it won't/can't
| go past, so if you do hit a page fault in kernel space on
| one of the unmapped pages, you know it came from an
| architectural access rather than a speculative one.
| mhh__ wrote:
| "Good" speculative execution shouldn't be able to speculate
| isolated memory - isolation does work against Spectre.
|
| Meltdown-type exploits (older Intel and Arm) can "see" across
| this isolation in some cases, which is why they were such an
| egregious mistake. Spectre is kind of inevitable whereas
| Meltdown isn't.
| 323 wrote:
| Windows has already done this for a number of years.
|
| Drivers that run in kernel space are no longer allowed to access
| whatever they want, and Windows's own kernel-space data
| structures are protected against modification by other
| kernel-mode code (kernel patch protection).
| hinkley wrote:
| And Windows needed that feature much, much earlier.
|
| One of the open secrets about Windows is that in the 3.1 to 98
| era, quite a large percentage of system crashes were actually
| caused by Creative Labs' audio drivers. Those guys could not
| produce stable software if their lives depended on it.
|
| But I don't blame CL for Windows being crash prone. Microsoft
| made a choice and a compromise to get popular. A permissive
| driver model made them more popular with customers. If you
| pander, then the rewards and the consequences are both yours to
| enjoy.
|
| MS tried to have their cake and eat it too back then. They
| wanted everyone to think they were the most sophisticated and
| powerful company because they were the smartest company in the
| world (which was the internal dialog at the time according to
| insiders I interviewed), but at the same time that it was all
| dumb luck and they couldn't control anything.
| Stamp01 wrote:
| Many successful companies and individuals fall into that trap
| of being unable to differentiate between talent and luck. I'm
| certainly not saying Microsoft doesn't employ plenty of very
| talented individuals. But it also takes some fortunate twists
| of fate to succeed, which were completely out of Microsoft's
| control. Success is never guaranteed.
| [deleted]
| hinkley wrote:
| Probably didn't help that Microsoft was founded at the end
| of the Corporate Raider era either. If you ran a company
| and decided that one of your big successes was being in the
| right place at the right time, you might keep a bigger war
| chest to get you through your next bad luck window. But
| that pile of liquid assets paints a big bullseye on your
| forehead.
|
| It was 'better' to just assume you were awesome and hope
| your luck held for years on end. And if it didn't, then you
| could tell a story about how talent got you big and bad
| luck took you out. No, it was luck both ways or skill both
| ways.
| dralley wrote:
| The same was true with Windows Vista and Nvidia graphics
| drivers
|
| https://www.engadget.com/2008-03-27-nvidia-drivers-
| responsib...
| swinglock wrote:
| Audio drivers being the bane of Windows stability didn't
| truly stop until Microsoft took Creative's toys away by
| force with Vista's new audio stack. The new GPU drivers
| did indeed take over that role instead, though at least
| more temporarily.
| delusional wrote:
| Intel Rapid Storage technology is another absolute trash
| piece of software.
| rafale wrote:
| That's super interesting. Does it use some special CPU feature?
| The CPU usually lets code running in kernel context do whatever
| it wants.
| galangalalgol wrote:
| I'm curious what software is telling the kernel no. What
| enforces this?
| jandrese wrote:
| The memory mapper. One of the side benefits of relocatable
| code is the ability to enforce policy at point of access.
| ape4 wrote:
| Why the kernel of course (joke attempt, I am wondering too)
| salawat wrote:
| Probably firmware/hardware.
| monocasa wrote:
| It's the type 1 hypervisor it wants to run on top of.
| salawat wrote:
| Then wouldn't the type 1 hypervisor become the "kernel",
| seeing as we've defined kernels as "that chunk of code
| capable of unrestricted access to machine state"?
| monocasa wrote:
| The line blurs for sure.
|
| I would say it's 'a' kernel. The idea of there only being
| one kernel is probably a concept that makes for nice
| layered diagrams, but doesn't come close to describing
| reality because of the combinatorial complexity of
| options for different morphs of layering. Sort of like
| the OSI network layers model in that way.
| 323 wrote:
| In Windows 10/11 the core of the Windows kernel can run in a
| virtual machine totally separated from the rest of the
| kernel.
|
| > HyperGuard takes advantage of VBS - Virtualization Based
| Security
|
| > Having memory that cannot be tampered with even from normal
| kernel code allows for many new security features
|
| > This is also what allows Microsoft to implement HyperGuard
| - a feature similar to PatchGuard that can't be tampered with
| even by malicious code that managed to elevate itself to run
| in the kernel.
|
| https://windows-internals.com/hyperguard-secure-kernel-
| patch...
| leeter wrote:
| Not a Windows Kernel Dev. But my understanding is it's more a
| tripwire than anything else unless virtualization based
| security is turned on. If that is activated then the Kernel
| has complete isolation from non-MS drivers and can prevent
| them from accessing critical data structures. MS has a list
| of known drivers that don't work with this and prevents users
| from activating it if it will break things.
| maldev wrote:
| Ya, you can look at the bugcheck codes and see the
| mechanism that does this. PatchGuard will always throw
| that bugcheck code, I think it's 0x109? It just does
| random scans and sees if everything still matches; it's
| nothing fancy. Even with VBS (virtualization-based
| security) it functions the same and will still allow a
| driver to modify it, then crash. In WinDbg you can see
| this with "!analyze -show 0x109", assuming that it's
| 0x109.
| mmis1000 wrote:
| I think VBS's role is ensuring you can no longer patch
| PatchGuard itself? Because the guard itself is no longer
| in the kernel, you can do nothing with it.
|
| But I heard VBS has a ~10% overhead compared to not
| enabling it. I wonder what causes that cost. Enabling
| Hyper-V itself didn't really cause an observable
| difference though.
| maldev wrote:
| VBS's role is to mirror the kernel and wall it off
| through a hypervisor, so your kernel/usermode can't
| access the secure version. This basically lets it
| compare the "secure" kernel and the regular kernel
| structures. Things like the process list, driver
| executable regions, signatures, and such are mirrored.
| So when a process spawns and is added to the
| process/thread list, those operations are mirrored in
| the secure kernel and then randomly checked for security.
|
| VBS also secures things like the scan timer/event and
| some other methods people used to use to disable it.
| http://uninformed.org/index.cgi?v=8&a=5&p=18 .
|
| The performance impact shouldn't really be noticeable
| at all. All you have is some memory operations which
| are "duplicated", but not really, since COW. But I'm
| not that much of an expert on PatchGuard besides the
| really basic functions.
| badrabbit wrote:
| Windows actually uses a CPU feature for kernel patch
| protection, right? I remember trying to figure out why Linux
| doesn't.
| pjmlp wrote:
| On modern Windows, it actually always runs as a guest on
| Hyper-V, and thus many such protection mechanisms piggyback
| on virtualization.
|
| The secure kernel and driver guard are other features with a
| similar protection level.
| badrabbit wrote:
| Are you talking about VBS/HVCI? Isn't that optional or is
| it on by default for kernel stuff?
| pjmlp wrote:
| Optional on Windows 10, compulsory on Windows 11.
|
| One of the reasons for the hardware requirements.
| badrabbit wrote:
| Wow, didn't know thanks.
| josephcsible wrote:
| > Drivers which run in kernel space are not allowed anymore to
| access whatever they want
|
| I don't like this. It's one thing to have memory the kernel
| doesn't usually need be unmapped by default, but it's another
| to prevent you from mapping it when you do need it. This reeks
| of DRM.
| matheusmoreira wrote:
| No idea why you're being downvoted. This does sound like an
| anti-user feature. Stuff like this exists to protect third
| party software from our analysis and "tampering".
| marcosdumay wrote:
| It is anti-driver-freedom.
|
| Most users don't want driver-freedom. They want
| user-freedom, which is easily achieved by keeping the
| kernel open source and replaceable.
|
| If you want to break some in-kernel protection, you can
| just patch the kernel and remove it.
| dagmx wrote:
| Can you elaborate how this feels anti-user?
| matheusmoreira wrote:
| As the owner of the machine, I should have total access
| to everything. There should be no protected memory I
| can't read, no execution I can't trace. The number one
| user of such security features are "rights holders" that
| consider me hostile and want to protect their software
| from me, the owner of the machine it is running on. The
| result is a computer that is factory pwned. It's not
| really my computer, they're just "generously" allowing me
| to run software on it as long as it doesn't harm their
| bottom line.
|
| Virtualization based security, the technology enabling
| this, also enables DRM. It is currently required by
| Netflix for 4k resolution streaming.
| staticassertion wrote:
| If you're the admin of the system you can just load up
| your own kernel, or kernel modules, etc.
| dagmx wrote:
| How do you propose having security in that world?
| Should any process be able to access the memory of any
| other process? Should opening a web page allow a
| security issue there to access your running
| applications?
|
| I guess my question is, you seem to have a very hardline
| "everything" take that I don't think extends to the real
| world
| matheusmoreira wrote:
| > Should any process be able to access the memory of any
| other process?
|
| Any process? No. My processes? Yes. That includes
| "sensitive" memory like cryptographic keys.
| josephcsible wrote:
| You're confusing "I" should have access to all the data
| on my machine, with "anything running on my machine"
| should have access to all the data on my machine.
| ikiris wrote:
| Ahh right, I forgot about the "user requested" security
| flag. Makes sense.
| mckeed wrote:
| Still build the security protections but allow the
| administrator to selectively disable/override them.
| 0xbadcafebee wrote:
| > but it's another to prevent you from mapping it when you do
| need it
|
| How do you know if it's a user needing to use it, or an
| attacker pretending to be a user needing to use it?
| josephcsible wrote:
| I'm willing to accept that if a malicious kernel driver
| gets loaded, I'm completely pwned.
| colejohnson66 wrote:
| Prior to Microsoft changing things with Vista, drivers were
| free to rummage about and break systems. Blue screens were a
| common thing back then _because_ drivers weren't careful.
| It's why Vista had such a bad rap for breaking systems; it
| exposed the driver authors who didn't care about safety.
|
| Also, why does every kernel-safety measure have to be seen as
| anti-user or DRM? You're free to disable it[0] if you wish,
| just like Apple's SIP. It's there to keep users who don't
| know anything safe. Would you like it if an innocent looking
| piece of software was able to hide itself from the user
| (read: you) and the OS?[1] Preventing such attacks doesn't
| sound anti-user to me.
|
| [0]: https://windowsloop.com/disable-enable-device-guard-
| windows-...
|
| [1]: https://en.wikipedia.org/wiki/Sony_BMG_copy_protection_r
| ootk...
| blibble wrote:
| > Also, why does every kernel-safety measure have to be
| seen as anti-user or DRM? You're free to disable it[0] if
| you wish,
|
| OP's point was about kernel patch protection
|
| how do I disable that?
| l33t233372 wrote:
| How can code be running in kernel mode if it doesn't have
| unrestricted memory access?
| johntb86 wrote:
| The page tables can be set up so kernel-mode code doesn't
| have access to all of memory. You could get around this by
| modifying the cr3 register to point to a different page
| directory, but that could cause problems whenever a context
| switch happens and cr3 is reverted. Microsoft also has
| PatchGuard, which could probably detect changes like that.
|
| In theory you could work around all these protections, but it
| would be difficult and fragile.
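|
| For reference, "pointing cr3 at a different page directory"
| boils down to a privileged register write. A minimal,
| ring-0-only x86-64 sketch of the idea (illustrative; the names
| are arbitrary and not taken from any driver or kernel):
|
|   /* cr3 holds the physical address of the current top-level
|      page table and can only be touched from ring 0. Loading
|      it switches the active address space and, without PCID,
|      flushes the TLB. */
|   static inline unsigned long read_cr3_raw(void)
|   {
|       unsigned long val;
|       __asm__ __volatile__("mov %%cr3, %0" : "=r"(val));
|       return val;
|   }
|
|   static inline void write_cr3_raw(unsigned long val)
|   {
|       /* A later context switch reloads the scheduler's own
|          cr3 value, which is one reason such tricks are
|          fragile (and detectable by PatchGuard-style scans). */
|       __asm__ __volatile__("mov %0, %%cr3"
|                            : : "r"(val) : "memory");
|   }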
___________________________________________________________________
(page generated 2022-09-30 23:00 UTC)