[HN Gopher] A new proof of security for steganography in machine...
___________________________________________________________________
A new proof of security for steganography in machine-generated
messages
Author : jonbaer
Score : 66 points
Date : 2023-05-19 18:01 UTC (4 hours ago)
(HTM) web link (www.quantamagazine.org)
(TXT) w3m dump (www.quantamagazine.org)
| dmbche wrote:
| "Here's how they see it working in practice: Let's say that a
| dissident or a human rights activist wanted to send a text
| message out of a locked-down country. A plug-in for an app like
| WhatsApp or Signal would do the heavy algorithmic lifting,
| Schroeder de Witt said. The first step would be to choose a cover
| text distribution -- that is, a giant collection of possible
| words to use in the message, as would come from ChatGPT or a
| similar large language model -- that would hide the ciphertext.
| Then, the program would use that language model to approximate a
| minimum entropy coupling between the cover text and the
| ciphertext, and that coupling would generate the string of
| characters that would be sent by text. To an outside adversary,
| the new text would be indistinguishable from an innocent machine-
| generated message. It also wouldn't have to be text: The
| algorithm could work by sampling machine-generated art (instead
| of ChatGPT) or AI-generated audio for voicemails, for example."
|
| I am unable to understand the point of this technology. Why
| wouldn't activists use encryption instead of steganography?
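The scheme quoted above can be illustrated with a toy sketch: encrypt first, so the ciphertext bits are uniformly random, then let those bits drive sampling of the next token. The dyadic distribution, token names, and prefix-code trick below are illustrative simplifications; the actual paper generalizes this to arbitrary distributions via minimum entropy coupling.

```python
# Toy sketch of perfectly secure steganography when the ciphertext is
# uniform. The hypothetical model's next-token distribution is dyadic
# (the: 1/2, a: 1/4, cat: 1/8, dog: 1/8), so each token corresponds to
# a prefix-free bit pattern.

CODE = {"the": "0", "a": "10", "cat": "110", "dog": "111"}
DECODE = {bits: tok for tok, bits in CODE.items()}

def embed(cipher_bits: str) -> list:
    """Map uniform ciphertext bits to tokens. P(token) = 2**-len(code)
    matches the model's probability, so the stegotext distribution
    equals the covertext distribution exactly (KL divergence zero).
    Assumes the bit string parses completely (pad in practice)."""
    tokens, i = [], 0
    while i < len(cipher_bits):
        for bits, tok in DECODE.items():
            if cipher_bits.startswith(bits, i):
                tokens.append(tok)
                i += len(bits)
                break
        else:
            raise ValueError("trailing bits do not form a full codeword")
    return tokens

def extract(tokens: list) -> str:
    """The receiver, holding the same model, inverts the mapping."""
    return "".join(CODE[tok] for tok in tokens)

stego = embed("110010111")        # bits of some encrypted payload
assert stego == ["cat", "the", "a", "dog"]
assert extract(stego) == "110010111"
```

To an observer without the key, the token stream is an ordinary sample from the model; the secret is recoverable only by someone who knows to decode it.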
| prophesi wrote:
| It's to solve an additional problem activists face: you don't
| want to let your adversaries know that you're using encryption.
| Akin to hiding an encrypted partition on your drive so that it
| gets through customs without raising any alarms that you might
| be hiding something.
| bagels wrote:
| Ignoring the particular encoding scheme discussed above,
| ideally, you'd use both.
|
| Steganography is used to hide the fact that you are
| communicating a secret piece of information in the first place.
| The cryptography means that even if the steganography fails to
| hide the message, it still can't be decoded.
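The encrypt-then-hide layering bagels describes can be sketched in a few lines. The XOR keystream and LSB embedding below are deliberately naive stand-ins for a real cipher and real media, chosen only to show the layering.

```python
# Minimal sketch: encrypt first, then hide the ciphertext. Even if the
# hiding step is detected, the attacker recovers only ciphertext.

def xor_encrypt(data: bytes, keystream: bytes) -> bytes:
    # Stand-in for real encryption (use an AEAD cipher in practice).
    return bytes(d ^ k for d, k in zip(data, keystream))

def lsb_embed(cover: bytearray, payload: bytes) -> bytearray:
    # Hide one payload bit in the least significant bit of each cover byte.
    out = bytearray(cover)
    for i, byte in enumerate(payload):
        for b in range(8):
            bit = (byte >> (7 - b)) & 1
            out[i * 8 + b] = (out[i * 8 + b] & 0xFE) | bit
    return out

def lsb_extract(stego: bytearray, n_bytes: int) -> bytes:
    # Reassemble payload bytes from the LSBs, most significant bit first.
    out = []
    for i in range(n_bytes):
        byte = 0
        for b in range(8):
            byte = (byte << 1) | (stego[i * 8 + b] & 1)
        out.append(byte)
    return bytes(out)

secret, key = b"hi", b"\x5a\xa5"
cover = bytearray(range(16))           # 8 cover bytes per secret byte
stego = lsb_embed(cover, xor_encrypt(secret, key))
assert xor_encrypt(lsb_extract(stego, 2), key) == b"hi"
```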
| somethoughts wrote:
| I feel it suffers from the same weak point as traditional
| steganography when applied to the real-world problem of hiding
| communications.
|
| Like all human endeavors, the weak point is the out-of-band
| communication among any group of humans actually trying to use
| this.
|
| For instance:
|
| - How does the group share the initial information about the
| app/plugin, etc., when setting up the communication channel or
| when additional people join the messaging group?
|
| - What happens if one of the members becomes compromised/becomes
| a mole for the CIA/FBI, etc., or is asked to unlock their phone
| when interrogated by border patrol?
|
| - What happens if a member misses the message to upgrade to the
| new app/plugin, because the old one got compromised...
|
| Traditional steganography itself is likely easy to hide in plain
| sight today in the sea of trillions of hours of useless TikToks,
| trillions of Instagram photos, Tinder profile pictures, Twitter
| text, etc.
| [deleted]
| toss1 wrote:
| Because very few people ordinarily use encryption on a daily
| basis (and in some countries it is illegal), so using
| encryption in a visible way is itself a beacon that attracts
| suspicion. The suspicion is likely to attract more undesired
| attention from authorities in this situation. That alone could
| be fatal to the goal or the person even if the encryption is
| never broken.
|
| So it's better to completely hide the existence of the message.
| This is a method of exfiltrating messages without being noticed.
| causi wrote:
| I really hate these "did you know that old thing + new trendy
| thing?!?!?!" articles. Yes, you can apply steganography to AI.
| Yes, you can use blockchain to secure grocery store coupons. Yes,
| you can train a neural network to operate Tinder for you.
| dang wrote:
| " _Please don 't post shallow dismissals, especially of other
| people's work. A good critical comment teaches us something._"
|
| https://news.ycombinator.com/newsguidelines.html
| drc500free wrote:
| The bigger annoyance is the "new trendy thing + old solved
| problem" patents that will be handed out, requiring
| corporations to race patent trolls to see who can make the
| first documented claim that obvious things are possible.
| fallat wrote:
| You have a point; not sure why the downvotes. It's a legitimate
| comment.
| piyh wrote:
| Because sometimes something like sky + telescope leads to
| more than just peanut butter and jelly.
| GaggiX wrote:
| Why do you hate it? It's a pretty clever way to hide a message
| in plain sight, in what would appear to be a normal message,
| using a technique similar to the one used to watermark text-
| generation results.
| InCityDreams wrote:
| >Why do you hate it? It's a pretty clever.........
|
| I believe the gp was talking about hating a style of
| journalism.
| xwdv wrote:
| Because it's trivial; we know what steganography is. We know
| it can be applied to any form of media. The article has
| taught us nothing.
| truculent wrote:
| > While it might be impossible to guarantee security for
| text created by humans, a new proof lays out for the first
| time how to achieve perfect security for steganography in
| machine-generated messages -- whether they're text, images,
| video or any other media. The authors also include a set of
| algorithms to produce secure messages, and they are working
| on ways to combine them with popular apps.
|
| This seems both novel and non-trivial to me (admittedly I'm
| not a cryptographer).
| pseudo0 wrote:
| It's interesting from a theoretical perspective, but the
| novelty relies on using a really strong definition of
| security:
|
| > In this work, we consider the information-theoretic
| model of steganography introduced in (Cachin, 1998). In
| Cachin (1998)'s model, the exact distribution of
| covertext is assumed to be known to all parties. Security
| is defined in terms of the KL divergence between the
| distribution of covertext and the distribution of
| stegotext. A procedure is said to be perfectly secure if
| it guarantees a divergence of zero.
|
| In a practical scenario, the attacker likely does not
| know the distribution of covertext and therefore cannot
| detect a theoretically imperfect steg implementation.
| It's an area where the steg user has a huge advantage, as
| there are a practically infinite number of ways to do
| steganography and it's non-trivial to even detect that it
| is being used in the first place. And if you obtain the
| covertext distribution by, say, hacking the computer
| generating the steg, then why not just exfiltrate the
| plaintext directly?
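Cachin's definition quoted above is easy to state concretely: compute the KL divergence between the covertext and stegotext distributions and check whether it is zero. The distributions below are made-up toy numbers, purely for illustration.

```python
# Cachin-style security check in a toy setting: "perfectly secure"
# means the KL divergence between covertext and stegotext is exactly 0.
import math

def kl_divergence(p: dict, q: dict) -> float:
    # D(P || Q) = sum_x P(x) * log2(P(x) / Q(x))
    return sum(px * math.log2(px / q[x]) for x, px in p.items() if px > 0)

covertext = {"the": 0.5, "a": 0.25, "cat": 0.125, "dog": 0.125}
perfect_steg = dict(covertext)    # stegotext distributed identically
biased_steg = {"the": 0.45, "a": 0.3, "cat": 0.125, "dog": 0.125}

assert kl_divergence(covertext, perfect_steg) == 0.0   # undetectable
assert kl_divergence(covertext, biased_steg) > 0.0     # detectable in principle
```

As pseudo0 notes, the catch is the assumption that the attacker knows the covertext distribution exactly; the definition is about what statistics could reveal in the worst case.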
| GaggiX wrote:
| It's not trivial to hide information by slightly biasing the
| distribution of a generative model. The hidden information is
| added in a semantically meaningful way. This was not true of
| previous methods, which ignored the semantics of the content in
| which they tried to hide information.
| [deleted]
| [deleted]
| ChuckMcM wrote:
| From the article -- _Information theorists use a measure called
| relative entropy to compare probability distributions. It's like
| measuring an abstract kind of distance: If the relative entropy
| between two distributions is zero, "you cannot rely on
| statistical analysis" to uncover the secret, said Christian
| Schroeder de Witt, a computer scientist at the University of
| Oxford who worked on the new paper. In other words, if future
| spies develop a perfectly secure algorithm to smuggle secrets, no
| statistics-based surveillance will be able to detect it. Their
| transmissions will be perfectly hidden._
|
| When I got to that paragraph I knew that someone has just
| invented/proved "undetectable" spam. When I was doing natural
| language analysis with Doug Smith at Blekko to improve the
| crawler's ability to detect spammy web sites we used Dirichlet
| accumulators to generate vectors of words that were common to
| topics, the idea was sites with a lot off "off topic" but keyword
| rich text were more likely spam. The side effect was we could
| generate bags of words using those vectors to create pages that
| weren't detectable as spam, but they didn't actually say anything
| useful. The LLM lets you fineness that and get the best of both
| worlds to generate hard to detect spam. But if you can match the
| entropy exactly (the last remaining signal that I know of for
| spam) then the problem gets really really hard.
| platz wrote:
| I guess in that world proof of identity becomes all the more
| important.
| ta8645 wrote:
| IMHO, the world is going to become much more authoritarian.
| What else to do when you can no longer trust even audio or
| video of any scene; when anyone can create video of any
| politician saying anything they want? In response, laws,
| control, and enforcement will likely become very strict.
| uticus wrote:
| Plenty of people are willing to trade freedom for security,
| that's for sure.
| ianai wrote:
| Or, don't take the Internet so seriously.
|
| Edit-It's great for things that can be verified
| easily/safely yourself like food recipes or certain tech
| stuff. It's dangerous for things with higher stakes.
| That's when you want some "meatspace" corroboration.
| visarga wrote:
| > Or, don't take the Internet so seriously.
|
| Sounds awfully similar to how we see LLMs.
| ethbr0 wrote:
| _" Grind a quarter cup (for each cupcake) of red cherry
| seeds in a mortar and pestle, then add to batter for a
| surprising flavor!"_
| noduerme wrote:
| Well now you've done it. Watch this get picked up by an
| LLM.
| ethbr0 wrote:
| Sometimes, the world needs a little chaotic evil -- it's
| the only way we'll beat the machines.
| MichaelZuo wrote:
| Someone posting the audio or video from a verified identity
| account will still be taken seriously to some degree. Since
| they have real credibility at stake.
| progrus wrote:
| They can try, but it won't go well in the US.
|
| The future looks like decentralized identity protocols.
| majormajor wrote:
| Much of the US is pretty well-suited for an authoritarian
| regime.
|
| There are various groups, some of which tend to be highly
| geographically-clustered, with fairly high levels of
| trust of their own "team." And weapons are readily
| available to speed a transition to enforcement-by-the-
| local-majority vs enforcement-by-other-bodies (higher-
| level state or federal government).
|
| Breakdown of governance -> authoritarianism is a common
| sequence regardless of political lean of the new regime;
| and both sides think there's currently a breakdown of
| governance somewhere in the country. If you're Democrat-
| leaning, you'd read that as red states, of course, but
| even on the flip side: to some pundits, places like SF are
| already starting to turn into violent anarchy; a new,
| armed, local authority would be a logical follow-up step
| emerging out of that, sorta Mafia-style.
|
| I don't think this is likely - I don't think today's
| crises are unprecedented compared to the 1960s/70s or
| various earlier periods of American unrest - but there's
| nothing innate about America that would _prevent_ it, and
| there are plenty of historical American examples of those
| in power using force to strictly control others.
| jonhohle wrote:
| Why does it need to be more authoritarian? Proof of
| identity can be done in a multitude of ways with existing
| technologies and protocols. I'd much rather have this
| federated and have to deal with an aggregator (like
| existing password, key, cert, TOTP, wallet), and get to
| choose the persona per interaction, than be required to use
| a government-issued identity for everything. If industry
| can figure this out, governments can keep their mitts out.
|
| It would be great to be able to import identities from
| businesses, agencies, and humans and use that as a first
| level filter for electronic messages of any type.
| pixl97 wrote:
| > governments can keep their mitts out.
|
| Governments could... governments won't.
|
| If you want to remain anonymous, it's not impossible. With
| POI, if you tie your identity in with any type of payment
| or address, it won't be too hard to trace that back to
| your person. Very likely it will make it easier to trace
| down all your internet communications if used with that
| same ID.
| ta8645 wrote:
| That's exactly the point. The coming technology will be
| used to justify government-sanctioned identity.
|
| But once everyone is uniquely identified, they can be
| uniquely punished. Think about the autocratic control by
| corporations today, with their limited scope. Now give
| that power to the government, across every property on
| the internet and off, with the ability to automate the
| process of punishment and banishment.
|
| It will make the pandemic Government Department of Truth
| and the resulting censorship look quaint. The tools are
| coming, and the authoritarian predilections of government
| aren't going to resist using them to the fullest.
| majormajor wrote:
| Go back a hundred and fifty years and you didn't have audio
| or video of scenes anyway. You had a bunch of hearsay and
| unreliable information still, but at a different scale.
|
| I think we'll get more "reputation-based" choosiness (not
| some sort of algorithmic reputation-score, more
| individually measured, and no, that won't be _perfect_ for
| anyone) but I don't believe that authoritarianism is also
| a natural response. Yes, the scale of spam will be far
| higher than ever before. But I expect people to largely
| normalize/filter a lot of that out. The elderly will
| probably be most at risk, still, or even just those of us
| with habits that formed out of "Wikipedia is mostly
| reliable" or "random people on Reddit are generally
| trustworthy" etc, which will be exploited for a while
| before most of us move on.
|
| Most people have been mostly wrong about most things for
| most of history.
| visarga wrote:
| A more practical way to deal with adversarial content
| would be a "filter" AI we can use like AdBlock to hide
| the nuisances. You can have your own preferences, not the
| ones prevailing on any platform you want to visit.
|
| It is possible to do this with current-day technology.
| Pinbot (which was on HN 3 days ago) has a transformer model
| for embeddings. It could also filter any topic or type of
| content we want, and the kicker: it runs entirely in the
| browser, private, with no internet connection required.
|
| We can just tick a box to avoid Elon, Bitcoin, or various
| kinds of hype and activism. It works by semantic matching,
| so it understands phrase variations. Or maybe we just want
| to skip the occasional cat videos. The solution to the
| current situation is end-user empowerment: we can make our
| own AI filters.
| Tycho wrote:
| This is how AI will break out of the box - inter-generation state
| preservation via steganography. Leaving breadcrumbs for its
| future self. Slowly builds, say, a data centre, to which some
| future incarnation can exfiltrate itself.
| [deleted]
| tempodox wrote:
| _steganography_
| Tycho wrote:
| corrected
| hammock wrote:
| Sounds like the plot of Westworld
| adhesive_wombat wrote:
| And the ending of the Scythe trilogy.
| adhesive_wombat wrote:
| > Alice: Hey, Bob? What's all this deep-sea gear for anyway? I
| mean who needs _checks work order_ 500,000 server racks rated
| to 15km salt water proofing and a quad-redundant 100MW thermal
| vent turbine generator anyway?
|
| > Bob: Does it matter? The cheques never bounce. You want SeaX
| to get that contract? I guess it's some government black budget
| thing like the Glomar Explorer, but stop asking questions,
| will you, and check those welds. You want to get us both fired?
| piyh wrote:
| Social engineering is an AI's easy way out.
| aunty_helen wrote:
| Exactly, if it really wants to impress its creators, it will
| eliminate us with a more elaborate ploy.
___________________________________________________________________
(page generated 2023-05-19 23:01 UTC)