[HN Gopher] New acoustic attack steals data from keystrokes with...
___________________________________________________________________
New acoustic attack steals data from keystrokes with 95% accuracy
Author : mikece
Score : 141 points
Date : 2023-08-05 16:33 UTC (6 hours ago)
(HTM) web link (www.bleepingcomputer.com)
(TXT) w3m dump (www.bleepingcomputer.com)
| elderlybanana wrote:
| In response to this post, I just open sourced a starter project
| to a variation of this idea:
| https://github.com/secretlessai/audio-mnist. I've been interested
| in doing image classification techniques like CNN on audio data
| for a while.
|
| A couple years ago for a weekend project I made a simple "audio-
| mnist" dataset from handwritten digit audio recordings. I never
| got past a few days worth of work, but open-sourcing it has been
| on my mind for a minute. This post kicked me into action. Getting
| some more data, basic CNN examples, etc. could provide a nice
| starting point for a lot of research and tools.
|
| There is still separate code I'd have to find and make
| intelligible to create the recordings and split the audio.
|
| Anyway, in case anyone finds part of this process interesting or
| useful.
| zaxomi wrote:
| New? The Soviets listened to typewriters in the 1970s.
| MaximilianEmel wrote:
| Now they can make wireless keyboards that don't need a battery or
| radio!
| gladiatr72 wrote:
| Death metal.
|
| Suck it.
| constantcrying wrote:
| Very interesting that this is even possible. But it seems somewhat
| dangerous: making an audio recording is very easy.
| lispisok wrote:
| So they generated training data from one laptop and microphone
| then generated test data with the exact same laptop and
| microphone in the same setup, possibly one person pressing the
| keys too. For the Zoom model they trained a new model with data
| gathered from Zoom. They call it a practical side channel attack
| but they didn't do anything to see if this approach could
| generalize at all.
| jprete wrote:
| I think this limited attack surface can work without having to
| generalize one model to multiple people or keyboards. One
| advantage of a Zoom attack is that you get "plaintext" shortly
| after hearing the "ciphertext" if you can get the target to
| type into the chat window. And when you hear typing in other
| contexts it's likely to be something that matches a handful of
| grammars that an LLM can recognize already (written languages,
| programming languages, commands, calculation inputs) - and when
| it doesn't, that's probably a password.
| omgJustTest wrote:
| The answer is that likely all the above are used.
|
| Asking "what signal is it detecting" might be better framed as
| "which signal carries the most information"... which would help
| in averting attacks.
|
| This kind of stuff could be really menacing in all sorts of
| public places like airports, coffee shops, etc.
| [deleted]
| Geee wrote:
| It's for a targeted attack. It doesn't need to be generalized.
| voytec wrote:
| Good enough for PoC.
| OtherShrezzing wrote:
| I believe that is the generalisable version of the attack.
| You're not looking to learn the sound of arbitrary keyboards
| with this attack, rather you're looking to learn the sound of
| specific targets.
|
| For example, a Twitch streamer enters responses into their
| stream-chat with a live mic. Later, the streamer enters their
| Twitch password. Someone employing this technique could
| reasonably be able to learn the audio from the first scenario,
| and apply the findings in the second scenario.
| TechBro8615 wrote:
| Finally, a real security weakness to cite when making fun of
| people for their mechanical keyboard. Time to start recording
| the audio of Zoom calls with some particularly loud typers...
| fatfingerd wrote:
| Not according to the article.. Microphones are sensitive
| enough to mount the attack on quieter keyboards.
| thereisnospork wrote:
| What we clearly need are louder keyboards - which
| overload the mic so as to render keystrokes
| indistinguishable.
| meepmorp wrote:
| I've wanted to integrate a cap gun into a keyboard,
| basically an old-fashioned roll of paper caps and a
| solenoid to whack 'em, triggered by exclamation points.
| TheCleric wrote:
| Adding a gain knob to my keyboard, be right back.
| dvngnt_ wrote:
| For a few years I've used RTX Voice to remove keyboard typing
| and other background noise
| yowzadave wrote:
| I guess more reason to just use a password manager to
| autofill your password?
| kypro wrote:
| Or just use 2fa
| bee_rider wrote:
| If you have 2FA and one part of it is easily figured out,
| then you have one factor authentication.
|
| If you cared enough about the authentication in the first
| place to bother with 2FA, then I guess it seems like the
| reduction there is still something to be worried about,
| right?
|
| Lots of "two factor authentication" schemes seem to
| involve just getting a text or something, so, not very
| secure at all. Of course, this is bad 2FA, but it is
| popular.
| gleenn wrote:
| Perfect is the enemy of good. Text based 2FA is
| compromisable relatively easily but at least it's an
| extra hurdle.
| 3np wrote:
| It's the "or just" being the issue there, not the "use
| 2fa".
| jgtrosh wrote:
| Only if it doesn't only rely on a master password
| apendleton wrote:
| A nice thing about master passwords though is that since
| you don't have to type them in as often, they can be very
| long. 95% accuracy probably isn't good enough to reliably
| reproduce a sentence-length master password, at least if
| it's only captured once.
| belval wrote:
| 95% means that on average only 1 in 20 keystrokes will be
| wrong. Even if your password is very long (40-60) that
| means only 2-3 errors. Since most people are not machines
| their long password will be a combination of words like
| the famous "horsestaplebatterycorrect" example from xkcd.
|
| Even if you flip a few letters from something like the
| above a human attacker will easily be able to fix it
| manually.
|
| "horswstaplevatterucorrect" for example is still
| intelligible.
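|
| A rough check of that arithmetic, as a sketch only: it assumes
| every keystroke is recognized independently with probability
| 0.95 (the headline figure), which is a simplification made here.
| The function names are illustrative.

```python
def expected_errors(length, accuracy=0.95):
    """Average number of misrecognized keystrokes in a password."""
    return length * (1 - accuracy)


def p_all_correct(length, accuracy=0.95):
    """Chance that every keystroke is recognized correctly."""
    return accuracy ** length


for length in (40, 60):
    # Columns: password length, expected errors, chance of a perfect capture
    print(length, round(expected_errors(length), 2), round(p_all_correct(length), 3))
```

| Under that assumption a 40-character password is captured
| perfectly only about 13% of the time, yet the expected 2 wrong
| characters are few enough to patch up by hand.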
| moonchrome wrote:
| Seems simple to defend - use a password manager.
| WXLCKNO wrote:
| Time to inject background audio of me typing "fuck you" into my
| zoom calls.
| zgluck wrote:
| Tactical noise!
| hoosieree wrote:
| Text-to-keystroke-audio where the text comes from the LLM
| Prompt "fanfiction based on HGTV's Love It or List It starring
| an Ewok realtor and Klingon interior designer in iambic
| pentameter".
|
| The goal is to cause the eavesdropper to totally reevaluate
| their life choices, and maybe even get caught up in the story.
| [deleted]
| pengaru wrote:
| So microphones need to get muted automatically by password
| prompts, seems simple enough in principle.
| ariym wrote:
| Georgi Gerganov created one a few years ago
|
| https://github.com/ggerganov/kbd-audio
| [deleted]
| [deleted]
| whoopdedo wrote:
| If this means the end of those loud mechanical keyboards then
| good. I never liked the clicking noise.
| amelius wrote:
| No it means the beginning of people playing recordings of loud
| mechanical keyboards all day to thwart the snooping algorithms.
| exabrial wrote:
| Physical Access Owns, as usual.
| hoosieree wrote:
| Using an image classifier on spectrograms is pretty funny. Not a
| bad idea, given image classifiers are a dime a dozen, but still.
| iainctduncan wrote:
| It's actually quite common. One of the big bird recognition
| apps does just this.
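|
| For anyone wondering how audio becomes an "image": a minimal
| sketch of a spectrogram in pure Python. The frame length, hop
| size, and synthetic 1 kHz tone are assumptions for illustration,
| not the parameters used in the article.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return rows of DFT magnitudes (bins 0..frame_len//2), one row per frame."""
    # Hann window to reduce spectral leakage at the frame edges
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    image = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for bin k; fine for a demo, far too slow for real audio
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        image.append(row)
    return image


sample_rate = 8000  # Hz, assumed
tone_hz = 1000      # synthetic stand-in for a recorded keystroke
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

image = spectrogram(samples)
peak_bin = max(range(len(image[0])), key=lambda k: image[0][k])
print(len(image), len(image[0]), peak_bin * sample_rate / 64)
```

| The result is a 2D grid (time by frequency) that can be handed
| to any image classifier like a grayscale picture.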
| constantly wrote:
| There are multiple apps for this? Seems like PBS KIDS should
| own the authoritative one, and the licensing.
| devsda wrote:
| Some systems have a setting to disable touchpad for x
| milliseconds after a key press.
|
| Do we need something similar for microphones too?
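|
| A sketch of what that could look like for a microphone,
| assuming the OS knows the keydown timestamps and the audio is
| buffered, so a few milliseconds before each press can be
| silenced too. The function name and default values are
| hypothetical.

```python
def gate_keystrokes(samples, sample_rate, key_times_s, pre_ms=10, hold_ms=100):
    """Zero out audio from pre_ms before to hold_ms after each keystroke."""
    out = list(samples)
    pre = pre_ms * sample_rate // 1000    # window before the press, in samples
    hold = hold_ms * sample_rate // 1000  # window after the press, in samples
    for t in key_times_s:
        center = round(t * sample_rate)
        start = max(0, center - pre)
        end = min(len(out), center + hold)
        for i in range(start, end):
            out[i] = 0.0
    return out


# One second of constant "audio" at 1 kHz with a key pressed at t = 0.5 s
gated = gate_keystrokes([1.0] * 1000, 1000, [0.5])
print(sum(gated))
```

| The obvious cost is that speech overlapping a keystroke gets
| muted as well.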
| thedookmaster wrote:
| I don't use the qwerty layout, I use colemak. Likely this
| mitigates this for myself.
| insanitybit wrote:
| That's the equivalent of a shift cipher with a well known
| offset.
| dns_snek wrote:
| I'm pretty confident that statistical analysis would give away
| your layout (assuming there's enough data), I wouldn't be so
| sure.
| bqmjjx0kac wrote:
| This is just security through obscurity. For real security, you
| need a cryptographically rolling keyboard layout.
| glitchc wrote:
| Brilliant suggestion. Have a TRNG or a CSPRNG (if too poor
| for a TRNG) choose the next layout at random for you, ideally
| with every keystroke. Good luck cracking that!
| hoosieree wrote:
| Even using Vim or Emacs would add some
| obufsCTRL[dbiobfuscation from all the spurious keystrokes.
| segfaultbuserr wrote:
| Some places use touchscreen keypads for PIN entry exactly
| for this reason: to allow randomization, e.g. for opening a
| locked door, or for authorizing a transaction.
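|
| A minimal sketch of such a randomized pad, assuming a
| ten-digit keypad and using Python's `secrets` module as the
| CSPRNG. The function name is illustrative.

```python
import secrets


def random_pin_layout():
    """Return the digits 0-9 in a fresh, unpredictable order."""
    digits = list("0123456789")
    # SystemRandom draws from the OS entropy source rather than a seeded PRNG
    secrets.SystemRandom().shuffle(digits)
    return digits


layout = random_pin_layout()
# Show it as a 3x3 grid plus one bottom key, like a phone keypad
for row in range(0, 9, 3):
    print(" ".join(layout[row:row + 3]))
print("  " + layout[9])
```

| Since the position of each digit changes on every use, the
| location of a tap (or its sound) says nothing about which
| digit was entered.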
| bee_rider wrote:
| That is interesting.
|
| I'm sure it depends on the application to some extent. I
| can type my pin in without looking at all, so I can cover
| it up while doing it. If I had to hunt and peck, it'd be
| easier for an onlooker to observe my slower motions, I
| think.
|
| But if I used the same machine often enough to produce
| wear specific to me, this randomization would be really
| useful.
| zootboy wrote:
| I use a randomized PIN pad on my phone, and I've gotten
| quite used to it. I can enter my PIN almost as fast as I
| could on an unscrambled pad; it's definitely not hunting
| and pecking.
| 8note wrote:
| Do they randomize the key locations though?
|
| Otherwise, you leave behind grease where your fingers
| touched
| [deleted]
| segfaultbuserr wrote:
| Yes, the layout is randomized every time you use it.
| mdp2021 wrote:
| Could be done by using a device with a display - e.g. an
| "ereader" - to present a random keyboard layout. But, good
| luck being efficient typing on that. At that point, better
| use a different input model.
|
| Or, use techniques such as those in the article, such as
| random keypresses played during the actual ones.
| raincole wrote:
| Why not just a keyboard that produces random noise?
| bqmjjx0kac wrote:
| Because the real data stream would still be there, just
| mixed with some noise. It feels harder to analyze whether
| the noise sufficiently obscures the real keystrokes than it
| does to ensure the actual keystrokes reveal no information.
| ben_w wrote:
| Finally, a use for Buffy's Swearing Keyboard.
|
| Or possibly the exact opposite of that, I can't tell if
| it's a one-to-one mapping on mobile:
| https://www2.b3ta.com/buffyswear/
|
| (Also, I'm feeling my age now, given how many years have
| elapsed since that kind of thing passed for internet
| culture...)
| usrusr wrote:
| Whereas for practical security, having some common substring
| in all your passwords that you don't type but insert through
| some global hotkey would be just fine as a mitigation against
| eavesdrop attacks.
|
| Yes, that's also obscurity, but obscurity is actually good -
| it only got a (deservedly) bad reputation from when it gets
| used as a _substitute_ (but I fail to see how using a
| nonstandard keyboard layout would even count as obscurity in
| the context of an audio attack, as the clear text reference
| would surely go through the same layout?)
| raffraffraff wrote:
| My sister in law uses voice recognition and dictation
| software, so she doesn't even use a keyboard! Totally safe!
| schaefer wrote:
| At least it would have, until just now, when you recklessly
| disclosed your secret keyboard layout. :P
| wildrhythms wrote:
| Couldn't they just translate the detected keystrokes to colemak
| layout?
| dragonmost wrote:
| Yes, but you would have to know or try all possible layouts.
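|
| Trying the common layouts is cheap, though. A sketch, assuming
| the acoustic model outputs physical keys named by their QWERTY
| legends and that the attacker scores each candidate layout
| against a word list. The two-word list here is a stand-in for a
| real dictionary.

```python
QWERTY = "qwertyuiopasdfghjkl;zxcvbnm"

# Character produced by each physical key, in the same key order as QWERTY
LAYOUTS = {
    "qwerty": QWERTY,
    "colemak": "qwfpgjluy;arstdhneiozxcvbkm",
    "dvorak": "',.pyfgcrlaoeuidhtns;qjkxbm",
}

WORDS = {"hello", "world"}  # stand-in for a real dictionary


def decode(physical_keys, layout):
    """Translate QWERTY-labelled key presses into the given layout's characters."""
    return physical_keys.translate(str.maketrans(QWERTY, LAYOUTS[layout]))


def best_layout(physical_keys):
    """Pick the layout whose decoding contains the most dictionary words."""
    def score(layout):
        return sum(w in WORDS for w in decode(physical_keys, layout).split())
    name = max(LAYOUTS, key=score)
    return name, decode(physical_keys, name)


# Keys a Colemak typist physically presses to write "hello world"
print(best_layout("hkuu; w;sug"))  # ('colemak', 'hello world')
```

| With only a handful of layouts in common use, this is one pass
| per layout over the recovered text.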
| bunga-bunga wrote:
| This specific attack could also be easily mitigated by
| dictating your passwords instead.
| transportgo wrote:
| I think about this attack when streamers on Twitch log into
| websites etc.
| nmeagent wrote:
| I think an attacker would find that many streamers with high
| quality audio have properly set up their mics with noise gate
| filters to remove their relatively quiet keystrokes.
| mxwsn wrote:
| The example figure shows a key hit every half second, which
| suggests a pecking style of typing at around 24 wpm. This way the
| model gets very clean waveforms. I wonder how their approach
| would work with average or fast typists. The sound profiles might
| be much harder to link to characters.
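|
| The 24 wpm figure follows from the usual convention of 5
| characters per word (an assumption here, though a common one):

```python
def words_per_minute(seconds_per_key, chars_per_word=5):
    """Convert a steady key interval into typing speed in words per minute."""
    keys_per_minute = 60 / seconds_per_key
    return keys_per_minute / chars_per_word


print(words_per_minute(0.5))  # one key every half second
```

| By the same formula a 60 wpm typist leaves only 0.2 s between
| key presses.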
| zaxomi wrote:
| The Soviets successfully listened to typewriters back in the 1970s.
| mejutoco wrote:
| Impressive. To be fair, a lot of typewriters jam if you press
| more than one key at a time, plus they are very loud.
| insickness wrote:
| Zoom is good at filtering out rather loud background noises. I
| can't imagine that the sound of background typing during a
| conversation could be detected by the other party.
| frant-hartm wrote:
| What? Zoom (by default with auto mic adjustment) catches
| everything. Typing on laptop is especially bad as it is closer
| to the mic than the person speaking (unless there is external
| mic), so it's like a stampede of rhinos.
| bee_rider wrote:
| In this case the parent comment is considering Zoom as an
| ally, while you are considering it an adversary.
|
| So, in case that "what" was intended to denote some
| confusion, there is the most likely source.
| woadwarrior01 wrote:
| If you're on macOS, you can use the voice isolation mic mode.
| rjh29 wrote:
| When I type my login or wallet password, I've done it so many
| times that the sound profile is going to be quite different to
| normal typing. Does the model handle that?
| tehsauce wrote:
| Would love a wireless keyboard that works using this! It wouldn't
| need any battery, charging or syncing!
| swid wrote:
| Some old TV remotes used to work this way. They were made by
| Zenith and are called Space Command remotes. Apparently they
| are the reason TV remotes are sometimes called clickers.
|
| https://www.theverge.com/23810061/zenith-space-command-remot...
| javajosh wrote:
| I find this really hard to believe. If it were really possible
| then _people_ could do it with their ears, and they would be
| doing it and showing off that they can do it. The human ear (and
| brain) are really, really good at finding patterns and getting
| signal out of noise.
| AndroTux wrote:
| Computers are better at stuff than humans? Impossible! I am the
| king of math, no machine beats me in calculating numbers!
| zaxomi wrote:
| This isn't new. The Soviets listened to typewriters back in the
| 1970s.
| trifurcate wrote:
| You're really surprised that computers can outperform humans at
| pattern recognition?
| javajosh wrote:
| Yes. Humans have fantastic audio and video processing
| abilities, particularly picking out signal from noise. Even
| now human operators listen to sonar signals on submarines.
| There's a reason for that.
| crazygringo wrote:
| Fascinating. I'm really curious what the acoustic properties are
| that it's recognizing.
|
| Is it more of a physical fingerprint of each key, such that if
| you swapped keys/springs the model would need to be updated? So
| it's produced by manufacturing inconsistencies, the way
| individual typewriters used to be forensically identified?
|
| Or is it more each key being identical, but producing a different
| resonance pattern within the keyboard/laptop due to the shape of
| all of the matter surrounding it? If you move the keyboard in the
| room, do you have to re-train the model?
|
| I also wonder how much it varies depending on how hard you press
| each key -- not at all or a great deal? And what about by
| keyboard -- when you compare thin MacBook keys with an external
| full-height keyboard, is one easier/harder to recognize each key
| on than the other?
| tedunangst wrote:
| But what passwords are you typing while on zoom and why aren't
| you on mute?
| constantcrying wrote:
| I can imagine many, many situations where you might do this.
| But maybe another thing to be worried about is scammers being
| able to learn the passwords of people they are calling.
| Tempest1981 wrote:
| When calling my cellular/internet/medical/financial provider,
| it might be interesting to "see" what they are typing. (Or if
| they're randomly surfing the internet.)
| tedunangst wrote:
| How long are you talking to them that you've been able to
| record samples of the sound of all their keystrokes and
| perform this analysis?
| slashdev wrote:
| Call support, get the URLs and logins for all their internal
| apps. Ouch!
| jacquesm wrote:
| Presumably all their backoffice stuff is only accessible
| via VPN. Oh, wait...
| foobiekr wrote:
| Given your username, you might find this interesting:
|
| https://en.m.wikipedia.org/wiki/Tempest_(codename)
|
| TEMPEST considered almost everything from electromagnetic
| leakage to exactly the attack described here.
| syntaxing wrote:
| Timing attacks have been an attack vector for a while? I remember
| reading about a tool on HN a couple years ago. You don't even
| need audio, the rate at which you enter the keys into the
| password field is enough.
| IshKebab wrote:
| I seriously doubt that.
| remram wrote:
| How do you get the rate?
| bqmjjx0kac wrote:
| Maybe any one of your browser tabs has JS listening to the
| accelerometer. It doesn't even require a permission, AFAIK.
| crazygringo wrote:
| By the way, some (most?) videoconferencing software removes
| keyboard sounds from the audio, because it's particularly a
| distracting problem with laptops where the microphone is _right
| next to_ the keys.
|
| I'm pretty sure Zoom does this by default as part of its noise
| cancellation (it's potentially even easier since you can use
| keydown events to help identify, not just the audio stream).
|
| So as long as basic default noise cancellation is on, that would
| at least prevent this over regular videoconferencing. And because
| of this, I'm having a hard time thinking of when else this would
| be a realistic threat, where the attacker wouldn't already have
| enough physical access to either install a regular keylogger or
| else a hidden camera.
| 1123581321 wrote:
| Meetings between organizations, multi-office cafeterias, or
| coffee shops, perhaps.
| cute_boi wrote:
| I use 1Password and have never ever typed a password, so I am
| probably safe.
| AndroTux wrote:
| Two words for you: Master password.
| bdcravens wrote:
| The risk isn't limited to passwords:
|
| "...passwords, discussions, messages, or other sensitive
| information..."
___________________________________________________________________
(page generated 2023-08-05 23:00 UTC)