[HN Gopher] Morpheus Turns a CPU into a Rubik's Cube to Defeat H...
       ___________________________________________________________________
        
       Morpheus Turns a CPU into a Rubik's Cube to Defeat Hackers
        
       Author : kaisix
       Score  : 42 points
       Date   : 2021-04-14 10:07 UTC (12 hours ago)
        
 (HTM) web link (spectrum.ieee.org)
 (TXT) w3m dump (spectrum.ieee.org)
        
       | voldacar wrote:
        | I'm a bit confused - how do you do pointer arithmetic if the
        | CPU encrypts pointers for you automatically and doesn't let
        | you know the actual address in memory that they point to?
        
         | joshka wrote:
         | I know nothing of the implementation, but hypothetically a
         | compiler could translate:                   p++
         | 
         | to instructions that effectively do:                   p =
         | enc(dec(p)+1)
         | 
         | Without exposing p outside the CPU. The implementation
         | described (seems to be) new instructions for incrementing and
         | dereferencing pointers.
        
           | bawolff wrote:
            | In which case, can't you just use it in your shellcode? I
            | guess the defense is that info leaks are useless, so you
            | can't defeat ASLR(?)
        
       | dejj wrote:
       | > you change what it means to add a value to a pointer.
       | 
        | Morpheus encrypts pointers. How does it allow C to access an
        | array or a struct while prohibiting out-of-bounds access to,
        | say, a return address sitting somewhere beyond the bounds of
        | said array or struct, I wonder.
        
         | jstanley wrote:
         | Perhaps it could have pointer arithmetic instructions, where
         | "ptr + n" makes code that uses special instructions to decrypt
         | "ptr", add "n", and encrypt the result before giving it back to
         | you?
        
           | dejj wrote:
           | Homomorphic encryption does that, yes. But in the interview,
           | he talks about it hypothetically.
        
           | anonymousiam wrote:
           | I don't see how that works. If such an instruction exists,
           | why couldn't an exploit developer use it to defeat pointer
           | encryption?
        
       | bansuian wrote:
        | This reminds me of Ghost in the Shell, the anime. There is a
        | concept of a barrier maze that has to be solved before gaining
        | access. There is also a concept of an attack barrier if one
        | tries to force their way into a system.
        
       | teddyh wrote:
       | > _What is the overhead for Morpheus?_
       | 
       | > _Todd Austin: It's about 10 percent slower on average._
       | 
       | If you accept that kind of overhead, why not just go the
       | Burroughs B5000/Lisp Machine/etc. route and have a hardware
       | instruction set which is essentially byte code for a safe
       | machine, which cannot be tricked into reading and writing in
       | arbitrary places.
       | 
        | Of course, then you would still have the problem of actual
        | security holes in programs which have bugs -- holes which
        | Morpheus effectively hides by obscuring them.
        
         | bombcar wrote:
          | A 10% performance hit for "secure mode" doesn't seem that
          | bad at all - run at last year's speed for an additional
          | layer of security.
         | 
          | Something like this might actually provide IoT devices with
          | some of the security they're currently missing.
        
           | teddyh wrote:
           | Do processing speeds really still increase 10% every year?
        
             | bombcar wrote:
             | Even if they're not, the processing power at a given price
             | point is increasing by 10% a year, which effectively works
             | out to the same - especially if you use these in low-cost
             | devices that are never updated (IoT seems a perfect fit).
        
             | mrguyorama wrote:
             | Thanks to AMD, it's actually a little better than that (for
             | one specific year, though the entire ryzen line has been
             | roughly similar)
             | 
             | https://www.cpubenchmark.net/compare/AMD-Ryzen-5-5600X-vs-
             | AM...
        
         | Jiocus wrote:
          | I think your omission of what Todd Austin actually answered
          | to this question in the article is close to insincere.
         | 
         | > Todd Austin: It's about 10 percent slower on average. We
         | could reduce those overheads if we had more time to optimize
         | the system. You know, we're just a couple of grad students and
         | some faculty. If a company like Intel, AMD, or ARM built this,
         | I would expect that you'd get the overheads down to a few
         | percent.
         | 
         | Edit: Take into account current performance hits for e.g.
         | Spectre, Meltdown patches.
        
           | teddyh wrote:
           | > _insincere_
           | 
           | I was not criticizing the Morpheus CPU. I was using it to
           | make a tangential point about other possible design choices.
        
             | Jiocus wrote:
             | Fair enough!
        
       | tandr wrote:
        | Isn't it sort of what OpenBSD(?) does, with return/entrance
        | pointers masked with additional bits that only the kernel
        | knows? (my memory is really vague on details, sorry)
        
       | bawolff wrote:
        | I wish this were more of a technical paper. The interview
        | dumbs things down to the point where you can't follow what
        | they actually did, or what types of attacks are actually in
        | scope.
       | 
       | Is this just encrypting pointers, where the encryption key
       | changes very frequently?
        
       | peter_d_sherman wrote:
       | >"IEEE Spectrum: So you can detect an attack just by looking at
       | what's being looked at?
       | 
       | Todd Austin: Yeah. Let me give you a classic example--the return
       | stack. [When a program "calls" a subroutine into action, it
       | stores the address the program should return to when the
       | subroutine is over in the return stack.] This is a real popular
       | place for attackers to manipulate to do what's called a buffer
       | overflow or a code injection. There's only one part of the
       | program that writes and reads from the return address, and that's
       | the beginning of a function and the end of a function. That's
       | really the semantics of the return address. But when you see a
       | program attacking the return address to do code injection, for
       | example, you'll see writes from many different places in the
       | program. So, these sort of unexpected, undefined accesses, normal
       | programs never do. That's a tell-tale sign that there is an
       | attack going on. Now, some systems might try to detect that. And
       | when they detect that, then they can kill the program."
       | 
       | PDS: Observation: An x86 emulator, such as Bochs, could be
       | programmed such that it "knew" about where any given
       | program/processes' stack is -- and explicitly looked
       | for/flagged/logged/warned/alerted and possibly stopped (and
       | dumped for further analysis) -- programs/processes whose stacks
       | were accessed/changed in questionable ways...
       | 
       | Also, this leads to my next thought...
       | 
        |  _Why couldn't all processes' stacks have some form of
        | hardware/CPU protection?_
       | 
       | I mean, we have Ring 0 for OS code, and Ring 3 for user program
       | code -- this leaves Rings 1 and 2 free for other purposes...
       | well, why not put program/process stacks there, and guard access
       | to them via the OS?
       | 
       | (Now, a better idea would be a custom CPU designed to know about
       | and explicitly protect each process's stack memory -- but short
       | of that, x86 has those extra rings...)
       | 
       | Yes, it would make programs slower.
       | 
       | Possibly much slower, depending on what's being done. And of
       | course, the program wouldn't have immediate access to its stack
       | variables, without OS intervention...
       | 
       | Maybe part of the solution would be in what Forth did a long time
       | ago -- separate the call/return (control) stack from the local
       | variables stack...
       | 
       | Like, maybe if the call/return stack -- is placed in a Ring that
       | the OS controls (requiring an OS syscall -- which could also
       | log/alert in the case of an anomaly), but the data (local
       | variable) stack is still kept in a common area...
       | 
       | Yes, compilers would have to be modified to support that scheme,
       | and yes, it would be slower... but could it work?
       | 
       | I don't think it would prevent all kinds of attacks (I don't
       | think anything could) -- but it might help mitigate some of the
       | more common stack-based attacks...
       | 
       | Also, the idea of encrypted pointers -- both on the stack and not
       | -- is a good idea.
       | 
       | It would be slow if not implemented in hardware, unless the
       | encryption was a simple XOR with some known-only-at-runtime value
       | (which would be a bit slower than without it) -- but that would
       | be the fastest thing that could implement an approximation of
       | that scheme in software...
       | 
       | Anyway, a lot of interesting ideas in this paper...
        
         | qw3rty01 wrote:
         | Most of these already exist
         | 
         | Rings are code privilege levels, they don't "protect" memory,
         | page tables do that, and the implementation to protect the
         | stack has been around for over a decade - DEP
         | 
          | Call/return stack protection is very new; afaik only
          | current-generation Intel processors have it, under the name
          | CET (Control-flow Enforcement Technology). It works via a
          | processor-level "shadow stack," which is pushed to on calls
          | and popped from on returns, and throws a fault if the actual
          | stack is returning to a different location.
         | 
         | Pointer encryption is available on ARM as PAC (pointer
         | authentication). Afaik there's no equivalent on x86 processors.
         | 
         | Hardware mitigations are gaining more and more traction, so
         | hopefully all of them will be available across all major
         | architectures within the next decade or so.
         | 
         | EDIT: another huge mitigation that's starting to gain traction
          | is process-level virtualization. Qubes is best known for it,
          | but Windows has been building more support for it (currently
          | only Edge and Office support it, under the name "Application
          | Guard"). The idea is that if a malicious actor is able to
         | exploit a program, they would also have to break out of the vm
         | the app is in to gain access to the full system, which can be a
         | much more difficult task.
        
       ___________________________________________________________________
       (page generated 2021-04-14 23:02 UTC)