[HN Gopher] GAZEploit: Remote keystroke inference attack by gaze...
       ___________________________________________________________________
        
       GAZEploit: Remote keystroke inference attack by gaze estimation in
       VR/MR devices
        
       Author : wallflower
       Score  : 139 points
       Date   : 2024-09-12 13:11 UTC (9 hours ago)
        
 (HTM) web link (www.wired.com)
 (TXT) w3m dump (www.wired.com)
        
       | LorenDB wrote:
       | I'm genuinely shocked. I assumed that Apple would have foreseen
       | this possibility and locked the Persona's eyes in place while
       | the user was typing, at least for passwords.
        
         | generalizations wrote:
         | Whole point of the digital face is to look real though, and
         | freezing the gaze would look unnervingly fake.
        
           | LorenDB wrote:
           | But you could at least dampen or randomize eye travel while
           | the user is looking at the keyboard. Faithfully reproducing
           | eye movements is a recipe for disaster, and that should have
           | been obvious.
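           | 
           | A minimal sketch of what "dampen or randomize" could look
           | like, applied only to the gaze the avatar renders, not the
           | gaze used for input (parameters are made up):
           | 
           |   import random
           | 
           |   def render_gaze(yaw, pitch, jitter=2.0, grid=5.0):
           |       # Quantise the rendered gaze to a coarse grid and
           |       # add noise: the avatar's eyes still move, but
           |       # fixations no longer map onto individual keys.
           |       def blur(a):
           |           q = round(a / grid) * grid
           |           return q + random.uniform(-jitter, jitter)
           |       return blur(yaw), blur(pitch)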
        
             | parasubvert wrote:
             | It's about tradeoffs; the device is barely seven months
             | old at this point. Thankfully the fix is fairly obvious
             | too.
        
             | fidotron wrote:
             | OTOH once you as an outsider know that sometimes the AVP is
             | lying to you about where the wearer is looking why would
             | you ever trust it?
             | 
             | For example, you could then use the AVP to stare at people
             | and then claim afterwards you were doing no such thing.
        
               | HeatrayEnjoyer wrote:
               | Add a faint glow to indicate they're typing and the
               | continued face animation is a stand-in.
        
             | dylan604 wrote:
             | Throw people for a loop and switch your headset keyboard to
             | DVORAK. When they scan your eye movements and apply to
             | QWERTY, they'll be confused AF!
        
               | jrockway wrote:
               | Well, you still only have to try one other password. If
               | you get locked out after one password attempt and nobody
               | knows that you use dvorak, your defense works, but if you
               | have three attempts, you can also add colemak to your
               | list of things to try ;)
        
           | kobalsky wrote:
           | add sunglasses to the avatar while typing
        
             | p1necone wrote:
             | Someone hire this person please.
        
           | throw10920 wrote:
            | It _would_, wouldn't it?
           | 
           | I'd suggest blurring the face in a "password input context"
           | (like password fields on the web with their redacted display
           | text), but I suspect that that'd go against what Apple wants
           | the Vision Pro experience to look like.
        
           | dwallin wrote:
           | I'm confident they could come up with a filler eye animation
           | algorithm that was convincing enough to pass muster for short
           | periods of time. Even if hand coding something didn't quite
           | work out, they certainly have tons of eye tracking data
           | internally they could use to train a small model, or optimize
           | parameters.
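           | 
           | Even a hand-coded filler with roughly human-like fixation
           | and saccade timing would probably pass casual inspection
           | for a few seconds; a toy sketch (all the distributions
           | below are guesses, not measured values):
           | 
           |   import random
           | 
           |   def filler_gaze(seconds, fps=30.0):
           |       # Yield (yaw, pitch) frames of idle-looking gaze:
           |       # short fixations joined by small saccades.
           |       t, yaw, pitch = 0.0, 0.0, 0.0
           |       while t < seconds:
           |           hold = random.uniform(0.2, 0.5)  # fixation
           |           for _ in range(int(hold * fps)):
           |               # tiny drift within a fixation
           |               yield (yaw + random.gauss(0, 0.2),
           |                      pitch + random.gauss(0, 0.2))
           |           t += hold
           |           # saccade a few degrees to a new target
           |           yaw += random.uniform(-5, 5)
           |           pitch += random.uniform(-3, 3)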
        
           | sli wrote:
           | If I were implementing it and wanted to obscure it, I'd blur
           | the whole screen momentarily, probably with a small message.
           | I really doubt that's ideal for a commercial offering,
           | though.
           | I'm not really worried about unnerving people if I'm using an
           | avatar, that comes with the territory as it is.
        
           | magicalhippo wrote:
           | Just have them close their eyes? That's what I do when I have
           | to recall my password anyway.
        
           | darby_nine wrote:
           | Then it shouldn't be used for secure input.
        
         | talldayo wrote:
         | > I assumed
         | 
         | Oh man, this is my favorite part of the Apple Design Cycle!
         | 
         | 1. Apple announces a new feature that is suspiciously invasive
         | and only marginally useful (e.g. iCloud Screening, Find My,
         | OCSP, etc.)
         | 
         | 2. Self-conscious, Apple releases a security whitepaper that
         | explains how things _should_ work but doesn't let anyone audit
         | their system
         | 
         | 3. Users assume that things are okay because the marketing
         | tells them it is okay, and do not ever consider the potential
         | for an exploit
         | 
         | 4. The data leaks, either to advertisers, Apple employees,
         | warrantless government allies, government adversaries or OEM
         | contractors
         | 
         | 5. Apple customers attempt to absolve themselves of
         | responsibility ("How was I _supposed to_ know?")
         | 
         | I've seen this process so many times at this point that I'm
         | just apathetic to it all. Maybe one day people will learn to
         | stop assuming the best when there is literally no evidence
         | corroborating it.
        
         | KaiserPro wrote:
         | They released AirTags without thinking about stalking, so I'm
         | not that shocked.
        
       | generalizations wrote:
       | It'd be pretty cyberpunk if the mitigation to this is to have
       | your eyes digitally obscured when typing in sensitive data.
        
         | steve1977 wrote:
         | And we know the only viable option would be simulated mirror
         | shades
        
           | wrboyce wrote:
           | But then a would-be attacker could simply read what you type
           | in the reflections!
        
       | wslh wrote:
       | I think the key problem with all the data we're sharing,
       | including telemetry, is that even when specific inputs like
       | passwords aren't directly visible, the information still narrows
       | down the possible key and password spaces.
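       | 
       | Back-of-the-envelope, even a sloppy signal that only narrows
       | each keystroke down to a few neighbouring keys collapses the
       | search space:
       | 
       |   # 8 random lowercase letters, attacker knows nothing:
       |   print(26 ** 8)   # 208,827,064,576 candidates
       | 
       |   # gaze narrows each keystroke to ~3 adjacent keys:
       |   print(3 ** 8)    # 6,561 -- trivially brute-forced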
        
       | thih9 wrote:
       | > as long as we get enough gaze information that can accurately
       | recover the keyboard, then all following keystrokes can be
       | detected
       | 
       | That's a pretty big assumption. Also, I guess the user has to be
       | stationary - stay in the camera's field of view and not move
       | their head in a way that would obstruct the image.
       | 
       | Unless this is about intercepting in-device data; but in this
       | case it seems easier to address.
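       | 
       | For reference, the "recover the keyboard" step boils down to
       | treating the most extreme fixations as the edges of a QWERTY
       | layout and snapping later fixations to the nearest key -- a
       | toy sketch with made-up numbers, sign conventions hand-waved:
       | 
       |   import numpy as np
       | 
       |   # hypothetical fixations (yaw, pitch in degrees) pulled
       |   # from the avatar's rendered eyes
       |   fix = np.array([[-12.0, -4.0], [11.5, 3.8],
       |                   [-3.2, 1.1], [6.7, -2.5]])
       | 
       |   # assume the extremes span the keyboard bounds...
       |   lo, hi = fix.min(axis=0), fix.max(axis=0)
       |   norm = (fix - lo) / (hi - lo)   # 0..1 keyboard coords
       | 
       |   # ...then snap each fixation to the nearest key
       |   rows = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
       |   def key(x, y):
       |       row = rows[min(int(y * 3), 2)]
       |       return row[min(int(x * len(row)), len(row) - 1)]
       | 
       |   print([key(x, y) for x, y in norm])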
        
         | cassianoleal wrote:
         | > Also, I guess the user has to be stationary - stay in the
         | camera's field of view and not move their head in a way that
         | would obstruct the image.
         | 
         | The user is always stationary in relation to the headset and
         | the cameras in it.
        
           | dagmx wrote:
           | Yes but not to the video feed that the other person sees.
           | 
           | If you move around, your head moves too. If you stand up, you
           | momentarily go out of frame before it applies a delayed sync.
           | The idea being that it matches what a regular webcam would
           | do.
        
       | adolph wrote:
       | Shades of the Lotus Notes "Visual Hash"
       | 
       | https://security.stackexchange.com/questions/41247/changing-...
        
         | tambourine_man wrote:
         | This is remarkable. Enterprise software is its own microcosmos
         | of pain.
        
         | fidotron wrote:
         | This deserves a separate submission.
         | 
         | That is so bad it almost has to be a deliberate method to
         | extract passwords.
        
       | yodon wrote:
       | Eye tracking data is incredibly sensitive and privacy-concerning.
       | 
       | HN tends to dislike Microsoft, but they went to great lengths to
       | build a HoloLens system where eye tracking was both useful and
       | safe.
       | 
       | The eye tracking data never left the device, and was never
       | directly available to the application. As a developer, you
       | registered targets or gestures you were interested in, and the
       | platform told you when, for example, the user looked at your
       | target to activate it.
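       | 
       | (Illustrative pseudocode of that pattern, not the actual
       | HoloLens API: the app only names regions it cares about and
       | gets callbacks; the raw gaze stream never crosses the
       | boundary.)
       | 
       |   class GazePlatform:
       |       def __init__(self):
       |           self._targets = {}
       | 
       |       def register_target(self, region, on_activate):
       |           # the app declares interest; it never reads
       |           # raw gaze coordinates
       |           self._targets[region] = on_activate
       | 
       |       def _on_gaze_dwell(self, region):
       |           # runs inside the platform, which alone sees
       |           # the eye-tracking data
       |           if region in self._targets:
       |               self._targets[region]()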
       | 
       | Lots of subtlety and care went into the design, so yes, the first
       | six things you think of as concerns or exploits or problems were
       | addressed, and a bunch more you haven't thought of yet.
       | 
       | If this is a space you care about, read up on HoloLens eye
       | tracking.
       | 
       | It's pretty inexcusable if Apple is providing raw eye tracking
       | streams to app developers. The exploits are too easy and too
       | prevalent. [EDIT ADDED: the article is behind a paywall, but it
       | sounds from comments here like Apple is not providing raw eye
       | tracking streams; this is about 3rd parties watching your eyes
       | to extract your virtual typing while you are on a conference
       | call]
        
         | simondw wrote:
         | > if Apple is providing raw eye tracking streams to app
         | developers
         | 
         | Apple is not doing that. As the article describes, the issue is
         | that your avatar (during a FaceTime call, for example)
         | accurately reproduces your eye movements.
        
           | FrustratedMonky wrote:
           | But the technology is there. That is the concern.
        
             | simondw wrote:
             | The technology to reproduce eye movements has been around
             | since motion pictures were invented. I'm sure even a flat
             | video stream of the user's face would leak similar
             | information.
             | 
             | Apple should have been more careful about allowing any eye
             | motion information (including simple video) to flow out of
             | a system where eye movements themselves are used for data
             | input.
        
               | FrustratedMonky wrote:
               | "technology to reproduce eye movements has been around
               | since motion pictures were invented"
               | 
                | Sure, but as with everything, it's when it becomes
                | widespread that the impact changes. The technology was
                | around, but now it could be on everyone's face,
                | tracking everything you look at.
                | 
                | If this were added to TVs, so that every TV was
                | tracking your eye movements and reporting them back to
                | advertisers, there would be an outcry.
                | 
                | So this is just slowly nudging us in that direction.
        
               | simondw wrote:
               | To be clear, the issue this article is talking about is
               | essentially "during a video call the other party can see
               | your eyes moving."
               | 
               | I agree that we should be vigilant when big corps are
               | adding more and more sensors into our lives, but Apple is
               | absolutely not reporting tracked eye-movement data to
               | advertisers, nor do they allow third-party apps to do
               | that.
        
             | dialup_sounds wrote:
             | It's not a problem with the technology.
             | 
             | The problem is the edge case where it's used for two
             | different things with different demands at the same time,
             | and the fix is to...not do that.
             | 
             | > Apple fixed the flaw in a Vision Pro software update at
             | the end of July, which stops the sharing of a Persona if
             | someone is using the virtual keyboard.
        
               | FrustratedMonky wrote:
               | "fixed the flaw "
               | 
               | Or
               | 
               | "Ooopps, so sorry you caught us. Guess we'll have better
               | luck keeping this hidden next time."
        
               | dialup_sounds wrote:
               | Keeping what hidden? Caught who? The eye-tracking
               | technology is literally a core part of the platform. What
               | is it you're trying to say?
        
               | FrustratedMonky wrote:
                | From the article's opening:
                | 
                | "... a lot about someone from their eyes. They can
                | indicate how tired you are, the type of mood you're in,
                | and potentially provide clues about health problems.
                | But your eyes could also leak more secretive
                | information: your passwords, PINs, and messages you
                | type."
               | 
               | Do you want that shared with advertisers? With your
               | health care provider?
               | 
               | The article isn't about the technology, it is about
               | sharing the data.
        
               | dagmx wrote:
               | How are they getting the data you claim is shared with
               | them?
        
               | dialup_sounds wrote:
               | Who are you saying shared what data with whom?
        
           | taneq wrote:
           | This is a great example of why 'user-spacey' applications
           | from the OS manufacturer shouldn't be privileged beyond other
           | applications: Because this bypasses the security layer while
           | lulling devs into a false sense of security.
        
             | simondw wrote:
             | > 'user-spacey' applications from the OS manufacturer
             | shouldn't be privileged beyond other applications
             | 
             | I don't think that's an accurate description, either. The
             | SharePlay "Persona" avatar is a system service just like
             | the front-facing camera stream. Any app can opt into using
             | either of them.
        
               | KaiserPro wrote:
                | That app gets a real-time gaze vector which, unless
                | I've misunderstood something, non-core apps don't get.
        
               | simondw wrote:
               | Which app?
        
               | KaiserPro wrote:
               | I should have said avatar service.
        
           | makeitdouble wrote:
           | Isn't that a distinction without a difference? Apple isn't
           | providing your real eye movements, but a 1:1 reproduction of
           | what it tracks as your eye movements.
           | 
           | The exploit requires analysing the avatar's eyes, but since
           | they're replicated rather than natural movements, there
           | should be a lot less noise. And of course, as you need to
           | intentionally focus on specific UI targets, these movements
           | are even less natural and fuzzy than if you were looking at
           | your keyboard while typing.
        
             | dialup_sounds wrote:
             | The difference is that you can't generalize the attack
             | outside of using Personas, a feature which is specifically
             | _supposed_ to share your gaze with others. Apps on the
             | device still have no access to what you're looking at, and
             | even this attack can only make an educated guess.
        
         | FrustratedMonky wrote:
         | "privacy-concerning"
         | 
         | Like checking out how you are zeroing in on the boobs. What
         | would sponsored ads look like once they also know what you
         | are looking at every second? Even some medical ad, and the
         | eyes check out the actress's body.
         | 
         | "Honey, why am I suddenly getting ads for Granny Porn?"
        
           | Lammy wrote:
           | Me https://chainsawsuit.krisstraub.com/20090715.shtml
        
         | diggan wrote:
         | Does HoloLens also use a keyboard you can type into with eye
         | movement? If not, it seems unrelated to this attack at all.
         | If yes, then how would it prevent this attack, where you can
         | see the person's eyes? It doesn't matter whether the tracking
         | data is on-device only or not, as you're broadcasting an image
         | of the face anyways.
        
           | voidUpdate wrote:
           | Not when I used it, you had to "physically" press a virtual
           | keyboard with your hands
        
         | dopylitty wrote:
         | As far as I know eye tracking isn't available in VisionOS[0]
         | 
         | This article snippet is behind a paywall but it seems like it's
         | talking about the eyes that are projected on the outside of the
         | device.
         | 
         | So basically it's no more of an exploit than just tracking
         | someone's actual eyes.
         | 
         | 0: https://forums.developer.apple.com/forums/thread/732552
        
           | voidUpdate wrote:
           | the article is talking about avatars in conference calls
           | which accurately mirror your eye position. Someone else on
           | that call could record you and extract your keyboard inputs
           | from your avatar.
           | 
           | Enabling "reader mode" bypasses the paywall in this instance
        
           | bookofjoe wrote:
           | Go behind the paywall here: https://archive.ph/44zwN
        
         | spease wrote:
         | Apple does not provide eye tracking data. In fact, you can't
         | even register triggers for eye position information; you have
         | to set a HoverEffectComponent so that the OS highlights the
         | element for you.
         | 
         | Video passthrough also isn't available except to "enterprise"
         | developers, so all you can get back is the position of images
         | or objects that you're interested in when they come into view.
         | 
         | Even the Apple employee who helped me with setup advised me not
         | to turn my head, but to keep my head static and use the glance-
         | and-tap paradigm for interacting with the virtual keyboard. I
         | don't think this was directly for security purposes, just for
         | keeping fatigue to a minimum when using the device for a
         | prolonged period of time. But it does still have the effect of
         | making it harder to determine your keystrokes than, say, if you
         | were to pull the virtual keyboard towards you and type on it
         | directly.
         | 
         | EDIT: The edit is correct. The virtual avatar is part of
         | visionOS (it appears as a front camera in legacy VoIP apps) and
         | as such it has privileged access to data collected by the
         | device. Apparently until 1.3 the eye tracking data was used
         | directly for the gaze on the avatar, and I assume Apple has now
         | either obfuscated it or blocks its use during password entry.
         | Presumably this also affects the spatial avatars during shared
         | experiences.
         | 
         | Interestingly, I think the front display blanks out your gaze
         | when you're entering a password (I noticed it when I was in
         | front of a mirror) to prevent this attack from being possible
         | by using the front display's eye passthrough.
        
         | jayd16 wrote:
         | Hololens 2 certainly has support for passing gaze direction,
         | not sure about the first one.
         | 
         | I think the headsets are pretty much in alignment that it's a
         | feature that needs permissions but they'll provide it to the
         | app with focus.
         | 
         | Apple is a lot more protective.
        
         | modeless wrote:
         | I disagree strongly. I don't want big tech telling me what I
         | can and can't do with the device I paid for and supposedly own
         | "for my protection". The prohibition on users giving apps
         | access to eye tracking data and MR camera data is paternalistic
         | and, frankly, insulting. This attitude is holding the industry
         | back.
         | 
         | This exploit is not some kind of unprecedented new thing only
         | possible with super-sensitive eye tracking data. It is
         | completely analogous to watching/hearing someone type their
         | password on their keyboard, either in person when standing next
         | to them or remotely via their webcam/mic. It is also trivial to
         | fix. Simply obfuscate the gaze data when interacting with
         | sensitive inputs. This is actually much better than you can do
         | when meeting in person. You can't automatically obfuscate your
         | finger movements when someone is standing next to you while you
         | enter your password.
        
           | KaiserPro wrote:
           | You are an expert user, so of course you will demand extra
           | powers.
           | 
           | The vast majority of people are not expert users, so for them
           | having safe defaults is _critical_ to their safety online.
           | 
           | > It is completely analogous to watching/hearing someone type
           | their password on their keyboard,
           | 
           | Except the eye gaze vector is being delivered in high
           | fidelity to your client so it can render the eyes.
           | 
           | Extracting eye gaze from normal video is exceptionally hard.
           | Even with dedicated gaze cameras, it's pretty difficult to
           | get under 5 degrees of uncertainty (without training or
           | optimal lighting).
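           | 
           | For scale (assuming a virtual keyboard roughly half a
           | metre away, with keys on the order of a couple of
           | centimetres wide):
           | 
           |   import math
           | 
           |   dist = 0.5  # assumed keyboard distance, metres
           |   err = dist * math.tan(math.radians(5))
           |   print(f"{err * 100:.1f} cm")  # ~4.4 cm of slop
           |   # i.e. a 5-degree error smears each fixation
           |   # across several keys of a compact layout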
        
       | moron4hire wrote:
       | What if the keyboard was put in the user's off-hand and they
       | typed on it by tapping their palm? Then the keyboard wouldn't be
       | in a fixed position to correlate eye movement against it.
        
         | laserbeam wrote:
         | It's a probabilistic attack. Of course there are workarounds,
         | and it doesn't work when people are touch typists who don't
         | look at their keyboards...
         | 
         | Brilliant though, just brilliant.
        
       | bgirard wrote:
       | If you look at the video, it's not only the eyes here. There's a
       | huge head movement too. Having a keyboard so large in your FOV
       | that you have to turn your head to type something is a
       | contributing factor.
       | 
       | I wonder what the accuracy is if you drop the eye tracking and
       | only do head tracking on that demo.
        
         | dagmx wrote:
         | It would be interesting to see both isolated.
         | 
         | I don't think eye tracking alone would give you the necessary
         | bounds for inferring the keyboard size. For one, eyes flit
         | around more and also are harder to see.
         | 
         | I also wonder how easily this attack is foiled by different key
         | clusters. E.g. it looks like they're relying on large head
         | movements at opposite ends of the keyboard to infer the bounds.
         | 
         | But keyboard use can be very clustered, which would foil the
         | ability to know how wide the user has made the keyboard.
         | 
         | I imagine it also breaks when the user moves the keyboard.
        
       | falcor84 wrote:
       | As if there weren't enough reasons to learn touch typing.
        
         | voidUpdate wrote:
         | You type by looking at the letters on the keyboard
        
           | falcor84 wrote:
           | When I type in VR, I do it with a physical keyboard.
        
       | toolz wrote:
       | How many people are typing with their eyes to begin with? Aren't
       | they using their hands far more often? Cool attack, but I'm not
       | sure there's much real attack surface here if no one is typing
       | with their gaze while using an avatar.
        
         | outericky wrote:
         | You look at the letter and pinch... that's how I do it. Not
         | often, and not during FaceTime calls. But yeah... possible.
        
           | toolz wrote:
           | Yeah, I know you can type that way, but I have a quest3 and
           | after watching the video I would think no one is actually
           | typing that way. It looks to be easily twice as slow and way
           | more annoying than just using your fingers with hand
           | tracking.
        
             | dagmx wrote:
             | The video is definitely exaggerated because they're moving
             | their whole head.
             | 
             | Typing with your eyes is much faster and more subtle than
             | what they show here.
        
       | stretchwithme wrote:
       | We need more factors of authentication. And the number required
       | should increase with the seriousness of the operation.
       | 
       | Buying lunch - 1
       | Selling your home - 10
        
       | puttycat wrote:
       | Can this also be done for normal videos over zoom?
        
         | KaiserPro wrote:
         | Not really. You need to know the size of the keyboard, know
         | the shape of people's eyes, and have enough temporal and
         | optical resolution to work out where they are pointing.
         | 
         | Even with optimal conditions (i.e. dedicated cameras, no eye
         | makeup and correct positioning), uncalibrated gaze has at
         | least 5 degrees of uncertainty.
        
       | dagmx wrote:
       | Video for those who can't get past the paywall
       | 
       | https://youtu.be/DPYT8IH-R18?si=5tcQ3NJltxROJDUq
        
       | jrockway wrote:
       | I think the underlying flaw here is that pointing your eyes at a
       | virtual keyboard in space to type passwords is just a poor input
       | method. Take away the VR headset and do the same thing and the
       | flaw still exists.
       | 
       | Now I want to make a keyboard where you shine a laser pointer at
       | the key you want to press, and your cat jumping up is what
       | actually triggers the button press.
        
         | iwontberude wrote:
         | I don't have letters on any of my keys and switch between
         | keyboard layouts frequently. I never look at my keyboard, am I
         | still vulnerable?
        
       ___________________________________________________________________
       (page generated 2024-09-12 23:00 UTC)