[HN Gopher] US Government plans to develop AI that can unmask an...
___________________________________________________________________
US Government plans to develop AI that can unmask anonymous writers
Author : NickRandom
Score : 81 points
Date : 2022-09-30 13:59 UTC (9 hours ago)
(HTM) web link (reclaimthenet.org)
(TXT) w3m dump (reclaimthenet.org)
| frozenlettuce wrote:
 | The thing is, AI is a good mask for a "backend process that you
 | don't need to explain how it works". Assuming that the US
 | government already has private conversations from multiple
 | content and messaging platforms, this AI will provide the
 | perfect excuse for connecting a blog post with a given identity
 | in a legal proceeding.
| philipkglass wrote:
| My first cynical take was that this will be used for "hunch
| laundering." There could be no indication that user A is an
| alias for user B, other than someone's hunch, but getting a
| computer to say that they match might be good enough to get a
| warrant when someone's hunch wouldn't be. It would be similar
 | to having drug-sniffing dogs affirm their handlers' feelings.
| hoosieree wrote:
| Creepy factor aside, a similar tool for attribution would be very
| useful for content creators (or copyright holders) currently
| worried about stable diffusion.
| hilbert42 wrote:
 | In my case, and I suppose for most HN posters, there'd be little
 | point--on demand, Ycombinator would be compelled to hand over
 | email and IP addresses to the government--and I'd reckon in most
 | instances that would be a much easier and faster way of obtaining
 | the relevant information.
|
 | In an era when privacy has been hugely diminished at the hands of
 | both governments and corporate interests, it raises the question
 | of what rights to anonymity anyone has in either a public or
 | private forum. At present there's little if any consensus on
 | this, which ought to signal that any such project is premature.
|
 | Unlike yours truly--who usually speaks his mind irrespective of
 | whether he's known to his audience or posting anonymously--many
 | will not speak their minds for fear of being ridiculed,
 | humiliated, or exposed, or of giving offense and risking the
 | breakup of a friendship, etc. The same goes for whistleblowers,
 | whose public utterances, if not made anonymously, usually cost
 | them their jobs.
|
 | If people fear that their ability to act anonymously has been
 | removed, then they're unlikely to act at all--silence being the
 | better part of discretion.
|
 | This would have huge negative repercussions for society, our
 | institutions and our governance--after all, the secret ballot is
 | one of the cornerstones of our democracies. If we're not careful,
 | AI could undermine the ballot by unmasking what users think or
 | how they actually vote, and it's not hard to see how this would
 | lead to coercion and thence to totalitarian government.
|
 | That said, in this world of widespread, almost instant
 | communications, actors who intentionally act in bad faith can do
 | widespread damage, especially when they do so anonymously.
 | Knowing who they are would minimize the damage they are able to
 | cause.
|
| Similarly, in a distantly-related post on HN a few days ago I
| referred to the increasing loss of respect for our important
| institutions and for the way we're being governed and how I
| thought that faith could be restored. There, I suggested that as
| a part of that process we need to unmask the hidden processes of
| government and that this would also include the naming of those
| who originate policy, law, etc.:
|
| _" If we're to restore any faith in our governance then this
| protection [hiding originators of policy] must stop. Decisions
| made by government employees must be open to public scrutiny,
| similarly, the origins of government policy--laws, regulations
| etc.--must be traceable back to its source (those who initiated
| said policies).
|
| Systems without accountability will always become corrupt."_
|
 | Thus, there's a real dichotomy at work here. For some things
 | anonymity is essential; at other times it's a curse. And given
 | the many recent instances where the gnomes within government
 | haven't acted in our best interests, I'm damned sure that
 | putting AI to work here won't bode well for us either.
|
 | I've little doubt that the technology will be abused, and that by
 | its very existence it will automatically silence a large
 | proportion of the population who need to speak out and who should
 | do so anonymously in the interests of all. Even if they aren't
 | targeted directly, just knowing that systems are in place with
 | the potential to expose them would be sufficient to silence
 | many--as AI analysis of their words could be used to determine
 | their identity at any future time (living with the ongoing stress
 | of potential exposure of one's identity would likely be
 | unbearable for some).
|
| Given past history and current bad behavior of governments in
| these areas, I do not believe that it is possible to put such a
| system in place that would gain the full confidence of all
| players involved. It would have to have sufficient protections
| locked in place to provide full public accountability as well as
| having inbuilt mechanisms that would ensure the system could not
| be abused by governments. At present, such conditions cannot be
| realistically met--not by a long shot.
|
 | Before anyone or any entity could let AI loose on this project
 | and simultaneously state in all honesty that sufficient
 | protections were in place for it to proceed safely, many other
 | prerequisite protections and 'safety measures'--which currently
 | do not exist--would need to be incorporated (locked) into our
 | governance. For instance, a whole raft of definitions and
 | concomitant laws pertaining to privacy are needed--and that's
 | just for starters.
|
 | No doubt this project will proceed without those prerequisite
 | protections; ipso facto, it will also be abused.
|
 | _PS: note my quoted point about government policy etc. being
 | open to public scrutiny. Here questions arise such as: where did
 | this idea originate, who are its instigators, and what are their
 | motives for instigating this development--not to mention others,
 | such as what are their qualifications, experience, etc. (perhaps,
 | given the enormous potential of this AI application to damage
 | society, we may even need to pose questions concerning their
 | political beliefs and allegiances).
|
| It's no accident that this information is missing with this
| announcement._
| throwamon wrote:
| Didn't they already use something like this to supposedly unmask
| Satoshi Nakamoto?
| Grimburger wrote:
| Nakamoto Satoshi has not been unmasked despite the very short
| list of people capable/interested in creating what he made.
|
| Stylometric analysis did suggest a single person on that list.
| The easier thing for governments to do at the time would have
| been to just spin up a node in the first year and look at the
| IP addresses.
|
 | He had no desire to become known back then and likely never
 | will. It's only more dangerous now than it was before, when the
 | threat was being locked up like the LibertyCoin guy (who was
 | only released a year ago).
 |
 | NS is happy to stay in the shadows, and nearly everyone respects
 | that decision, especially in a world of crypto scams and Ponzis.
 | I'm surprised they never linked the domain name purchase to him,
 | though.
| ortusdux wrote:
| I wonder how things like Gmail's smart auto-complete would affect
| these efforts.
| ezekg wrote:
| Don't talk bad about your government, folks. We're going to be
| entering a new age of technological tyranny.
| orangepurple wrote:
| @ezekg, GovAI has detected that your post violates community
| guidelines. Your COVID pass is RED for 72 hours to protect
| yourself and others.
| imglorp wrote:
 | The government appears more worried about managing dissent than
 | about anything else, such as the actual threats people are
 | talking about.
| alexbiet wrote:
| @ezekg, GovAI has detected unlawful talk posted from your
| account. Your CBDC account is locked for 48 hours.
| brippalcharrid wrote:
| Further violations will lead to the balances of your close
| friends and family being adjusted by -20%, and the balances
| of acquaintances being adjusted by -5%. Help protect against
| the threat of misinformation and safeguard your Balance for
| up to 28 days by reporting anything that you think could lead
| to harm. Remember, We're All In This Together.
| Psychoshy_bc1q wrote:
| bitcoin fixes this.
| ezekg wrote:
| > We're All In This Together.
|
| Sent chills down my spine.
| daniel-cussen wrote:
| This is why I write my shit under my own legal name, even going
| to notarize this account at some point. Yeah throwaway yeah.
| Anonymous speech. Oh yeah darknet, Tor, cryptography, like yes
| sometimes, but it's a game of cat and mouse, it's purely a
| question of cost.
|
| Furthermore I consider games like poker or Magic the Gathering
| unplayable, that is the extent to which there is literally
| absolutely no privacy.
|
| But don't mind me, just got lobotomized is all.
| xani_ wrote:
 | I'm sure it will not be used in a malicious way
| blakesterz wrote:
| There's a great book on this kind of thing, Author Unknown: On
| the Trail of Anonymous, by Don Foster. This was written way back
| in 2000.
|
| People have been doing this for decades.
| runjake wrote:
| _> People have been doing this for decades._
|
 | The _key point_ here is that it's AI-driven and at scale -- in
 | other words, mass surveillance.
|
| Personally, I see this as part of the US IC's mission, despite
| the potential domestic detriment.
| LinuxBender wrote:
| Here [1] is a previous discussion on this as well.
|
| [1] - https://news.ycombinator.com/item?id=33009545
| Zigurd wrote:
| Moreover, a stylometry analysis will reveal when I started to
| limit myself to one "moreover" for every two or three chapters.
| narrator wrote:
| How about using this to find bot accounts?
| michaelwww wrote:
| I'm old so I've been planning on this for awhile. I have a folder
| that contains all my personal data: photos and videos, journal,
| all my saved social media posts, all my emails and all my
| anonymous handles leading to everything I've ever written online.
 | My thinking is that an AI could create a reasonable facsimile of
 | myself that my descendants could have a conversation with. I
 | think it'd be better than an autobiography, since Joyce Carol
 | Oates convinced me by something she tweeted that no one reads
 | autobiographies, not even close family, unless you are famous.
| oneoff786 wrote:
 | If I had a bot that replicated my great-great ancestor I'd
 | probably get bored quickly and then try to prod it into
 | revealing its deeply outdated and inappropriate social views.
| michaelwww wrote:
| You're right, the novelty would wear off quickly, but it
| doesn't hurt anything for me to organize a data set about
| myself just in case
| lwneal wrote:
| The best protection against this type of de-anonymization is to
| take measures now, while you still have time, to prevent it. It
| is possible to change the style of one's writing by using a
| language model which alters the original text in order to create
| a new piece with a different style. For example, to translate
| your text into the grandiose and flowing diction of a bygone era,
| you might consider the project below.
|
| [1] https://github.com/lwneal/victorianhackernews
| boarnoah wrote:
 | There is a case to be made not just for natural language but for
 | code as well.
 |
 | AFAIK there are quite a few examples from security labs where
 | malware authors aren't necessarily identified but are at least
 | fingerprinted based on naming conventions, patterns they use
 | across multiple projects, etc.
|
| That sort of fingerprinting could expand to correlating
| someone's anonymous software projects to other examples of code
| elsewhere (ex: if they contribute to source available stuff).
|
| re: the example project you mention specifically, it does feel
| like using tools like that almost as a linter for natural
| language would be a fingerprint in itself.
|
| EDIT: As far as OPSEC goes, a fun tidbit. A friend of mine
| identified a PR I submitted anonymously to them, simply because
| of the style of PR comments I made.
| doliveira wrote:
| Aren't GANs all about creating both the generator and the
| discriminator? Seems to me you can also build the "reverser"
| quite easily.
| sn41 wrote:
| I find the examples given in the README to be quite tame for
| Victorian English. Compare it with the ending lines of A Tale
| of Two Cities:
|
| "It is a far, far better thing that I do, than I have ever
| done; it is a far, far better rest that I go to than I have
| ever known.",
|
| or this from Pride and Prejudice:
|
| "However little known the feelings or views of such a man may
| be on his first entering a neighbourhood, this truth is so well
| fixed in the minds of the surrounding families, that he is
| considered as the rightful property of some one or other of
| their daughters."
| MichaelCollins wrote:
| Tools like this probably fool traditional stylometry, but what
 | about de-anonymization tools that find similar _ideas_, not
| writing style? Perhaps most people have boring common ideas
| they got from others, but the sort of people the US Government
| is most interested in are likely quirkier than most.
| Kukumber wrote:
| They demonized China for doing things like that
|
| Now they'll copy China
|
| I find it very funny
| xani_ wrote:
| "How dare they do it before us"
|
| USA next decade:
|
| "Posting bad things on twitter reduces your credit score"
| hardnose wrote:
| Credit scores are not determined by the government. If a
| credit ratings agency took that step, it would harm them
| because Twitter posts are unlikely to represent a meaningful
| variable when predicting someone's creditworthiness.
|
| Don't confuse that with "social credit" systems, whereby
| China prevents you from riding trains if you say something
| naughty.
| wahnfrieden wrote:
| (citation needed - that's not actually implemented in china
| at scale, though it's a convenient talking point in the
| west)
| egberts1 wrote:
 | It is called She Hui Xin Yong Ti Xi.
|
| Try and keep up.
|
| https://en.m.wikipedia.org/wiki/Social_Credit_System
| wahnfrieden wrote:
| Try and read.
|
 | Where exactly does it say it's ever progressed beyond
 | trials and announcements, i.e. implemented at scale? Oh, it
 | doesn't.
| egberts1 wrote:
| seek and ye shall find
|
 | https://nhglobalpartners.com/china-social-credit-system-expl...
| wahnfrieden wrote:
| thank you. the most specific citation I could find in
| your link was this, regarding the 80% rollout statistic:
|
| >As of December 2020, more than 80 percent of all the
| provinces, autonomous regions, and municipal cities had
| issued or were preparing to issue local credit laws and
| regulations.
| egberts1 wrote:
| I'm quite sure they are trying to automate this as well.
| egberts1 wrote:
 | Or that VP of Apple quoting from the movie "Arthur" at an
 | auto show, resulting in him being fired.
 |
 | It has begun right here in the United States and we are just
 | oblivious to this new, dastardly form of social credit.
| [deleted]
| Kukumber wrote:
 | government, FANG, it's all the same: the same lobby group
 | talking and coordinating with each other, hiring CIA agents and
 | a bunch of 'friends'
 |
 | https://mronline.org/2022/07/27/national-security-search-eng...
 |
 | that's why the US wants to ban TikTok asap: they don't want
 | China to be able to do what they themselves have been doing for
 | decades
 |
 | > it would harm them because Twitter posts are unlikely to
 | represent a meaningful variable when predicting someone's
 | creditworthiness.
 |
 | people already get fired and arrested for posting stuff on
 | twitter, in both the US and Europe, so no, it's not just a
 | "twitter moderation" thing
 |
 | Ask yourself why they are allowed to exist and still operate
 | despite being unable to grow and losing money for years (talk
 | about anti-competitive practices), unless it's in reality a
 | government body in disguise
| hardnose wrote:
| > government, FANG, it's all the same
|
| If you believe that, then you also support efforts to
| force big tech to respect freedom of political speech,
| yes?
| hardnose wrote:
| Developing a technology that can do that doesn't infringe on
| anyone's liberties.
|
| Failing to develop that technology, leaving it to China or some
| other authoritarian state to do, would be more likely to harm
| liberties, wouldn't it?
|
| Seems like you're just "damned if you do, damned if you
| don't"-ing, no offense.
| pessimizer wrote:
| > Developing a technology that can do that doesn't infringe
| on anyone's liberties.
|
| Why are you making this up?
| bitL wrote:
| "Hey Joe, run that article of yours through the anonymizing AI
| first!"
| 1970-01-01 wrote:
| Voynich manuscript ---> US Govt AI stylometry machine ---> 42
| PointA2B wrote:
 | Content marketers in the digital marketing space commonly put
 | blog posts through "spinners" that take your text and modify it
 | by replacing words/phrases with similar equivalents. This lets
 | you take one article and turn it into 5-10+ "unique" ones, even
 | though they still discuss the same things. It would be a shame
 | if a service like this were marketed towards those interested
 | in privacy; it would probably break this entire system...
| ElementaryElk wrote:
 | I've found plenty of articles that seem to have been run through
 | these spinners, but hand-made corrections are likely to be
 | necessary (unless they can be automated with ML, for example),
 | as you can almost always tell that something is odd based on
 | context-lacking word choices.
| PointA2B wrote:
 | And that's exactly what newer programs do; look at AppSumo and
 | it's practically all of them. The older generation simply used
 | a giant dictionary, then picked a random option from the list
 | of acceptable choices.
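 | A minimal sketch of that older dictionary approach (the synonym
 | table below is purely illustrative; a real spinner ships a far
 | larger one):
 |
 |     import random
 |     import re
 |
 |     # Tiny illustrative synonym table.
 |     SYNONYMS = {
 |         "quick": ["fast", "rapid", "speedy"],
 |         "article": ["post", "write-up", "piece"],
 |         "important": ["crucial", "key", "significant"],
 |     }
 |
 |     def spin(text, rng=random.Random()):
 |         """Swap each known word for a randomly chosen synonym."""
 |         def replace(match):
 |             word = match.group(0)
 |             options = SYNONYMS.get(word.lower())
 |             if not options:
 |                 return word
 |             choice = rng.choice(options)
 |             # Keep the original word's capitalization.
 |             if word[0].isupper():
 |                 return choice.capitalize()
 |             return choice
 |         return re.sub(r"[A-Za-z]+", replace, text)
 |
 |     print(spin("This important article explains a quick trick."))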
| swayvil wrote:
 | What we need is a speaking-style anonymizer. Like a language
 | translator, except it translates your text into some kind of
 | stylistic uniformity.
|
| We're flexible on that format. Anything legible and relatively
| easy to translate into. Call it ANONSPEAK.
|
| It will probably be, aesthetically, horrible.
| hoosieree wrote:
| There exists such a practice, and it is known as academic
| publishing.
|
| Every verb is done in passive voice, punctuation is added -
| wherever possible - to make sentences appear more complex than
| they need be, and of course there is an effervescent use of
| sesquipedalian terms where shorter similes would otherwise
| suffice.
| __jambo wrote:
| Seems very doable given the state of google translate.
|
 | Trouble is, if you are a revolutionary leader of some kind you
 | are probably going to be saying new things that no one else
 | talks about - which renders both anonspeak and the AI detection
 | kind of redundant.
|
| I guess the application for this then is in the interim to stop
| people or online groups becoming revolutionary by tracking and
| deradicalising them with targeted manipulation.
| dehrmann wrote:
 | _Plans?_ I assumed lots of people were already working on this.
 | There's already a lot of training data out there, and I suspect
 | most users can be identified by a handful of uncommon trigrams
 | and sentence stats. I know you can recognize things I've written
 | at work because they have real em dashes--people rarely type
 | with those.
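 | For illustration, a toy version of that kind of fingerprint
 | (character trigram counts compared with cosine similarity; the
 | snippets are made up):
 |
 |     from collections import Counter
 |     import math
 |     import re
 |
 |     def trigram_profile(text):
 |         """Character trigram counts -- a crude style fingerprint."""
 |         cleaned = re.sub(r"\s+", " ", text.lower())
 |         return Counter(cleaned[i:i + 3]
 |                        for i in range(len(cleaned) - 2))
 |
 |     def cosine(a, b):
 |         dot = sum(a[k] * b[k] for k in set(a) & set(b))
 |         norm = (math.sqrt(sum(v * v for v in a.values())) *
 |                 math.sqrt(sum(v * v for v in b.values())))
 |         return dot / norm if norm else 0.0
 |
 |     known = "they have real em dashes--people rarely type those"
 |     unknown = "real em dashes--rarely typed, easy to recognize"
 |     print(cosine(trigram_profile(known), trigram_profile(unknown)))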
| ben_w wrote:
 | I briefly considered making one. My idea was simple -- build a
 | Markov chain for each person plus one for the text with the
 | unknown author, take a dot product over the intersection of the
 | chains, and pick the author with the best match. Never got
 | around to it. Perhaps this weekend?
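 | A minimal sketch of the idea, with word-pair counts standing in
 | for the chains (the authors and corpus below are invented):
 |
 |     from collections import Counter
 |
 |     def chain(text):
 |         """First-order 'chain' as (word, next_word) counts."""
 |         words = text.lower().split()
 |         return Counter(zip(words, words[1:]))
 |
 |     def score(candidate, unknown):
 |         """Dot product over the transitions both chains share."""
 |         shared = set(candidate) & set(unknown)
 |         return sum(candidate[k] * unknown[k] for k in shared)
 |
 |     corpora = {
 |         "alice": "the quick brown fox jumps over the lazy dog",
 |         "bob": "i wonder whether the dog will ever jump back",
 |     }
 |     unknown = chain("the lazy dog jumps over the quick brown fox")
 |     best = max(corpora,
 |                key=lambda a: score(chain(corpora[a]), unknown))
 |     print(best)  # alice shares far more word pairs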
| MonkeyMalarky wrote:
 | Someone unintentionally did something similar with HN comments
 | to find the users who sound most similar to you/each other, and
 | people were finding their throwaway and alt accounts.
| vmoore wrote:
| Yes I recall that: 'Find Your Hacker News Doppelganger':
|
| https://news.ycombinator.com/item?id=27568709
| dhosek wrote:
| Interesting. There was commentary about finding
| anonymous/throwaway accounts, but on mine, I did not find
| my anonymous account (although I use that very rarely). The
| accounts that turned up seemed to be all real, and in a
| couple cases I could guess what might have made the checker
| match us (e.g., mentions of MFAs or Apple //e or similar
| politics), but not all. I didn't notice any linguistic
| similarities.
| rkagerer wrote:
| Yeah back then I felt mine didn't produce good matches
| either.
| ravenstine wrote:
| Neat experiment, though it took me fewer than 30 seconds to
| rule out my nearest "doppelganger". Too many patterns there
| I've never used.
| copperx wrote:
 | About 15 years ago, at JHU, I heard about an algorithm that
 | detected a writer's gender with more than 90% accuracy, and the
 | NLP professor considered that problem solved.
| kps wrote:
 | I use em dashes -- though with space -- and also ellipses...
 | Also 2x4s (that are actually 1 1/2 x 3 1/2), where 2 [?] 4
 | above -270 degC, and footnotes¹ and all that.
 |
 | ¹ There.
| [deleted]
| wsinks wrote:
| I forget that they're called em dashes -- but I also love to
| use them to offset what I say from other people.
| Cupertino95014 wrote:
| They've broken more than one Python script of mine. That'll
| make you declare everything UTF-8.
| pessimizer wrote:
 | If you have a compose key, the en dash (-) is compose + (-)
 | (-) (.)
 |
 | Or compose + (-) (-) (-) for the em dash (--), which is wider
 | depending on the font.
| lajamerr wrote:
| The next step after is to build an AI to convert your style of
| writing to another person's style.
| notadev wrote:
| Running all writing through AI to re-write everything in a
| different style while conveying the same information might
| work. Create a way to apply it to all online writing and you
| have something like a writing VPN.
| klabb3 wrote:
 | Exactly. You barely need "AI" for this. Changing your style
 | enough to throw sand in a sophisticated text identifier could be
 | as simple as introducing some spelling and grammar variations.
 | It could easily be commoditized (isn't Grammarly doing this
 | already, minus the privacy part?). Of all the cat-and-mouse
 | games law enforcement and intelligence are playing, this is one
 | where I'm betting on the mouse.
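 | Even something as crude as the sketch below adds noise; the
 | substitution table is invented, and a real tool would want far
 | more (and subtler) variations:
 |
 |     import random
 |
 |     # Invented variation table: contractions, regional spellings,
 |     # and a deliberate misspelling or two.
 |     VARIANTS = {
 |         "do not": "don't",
 |         "colour": "color",
 |         "definitely": "definately",
 |         "a lot": "alot",
 |     }
 |
 |     def perturb(text, rate=0.5, rng=random.Random()):
 |         """Apply each variation with some probability."""
 |         for original, variant in VARIANTS.items():
 |             if original in text and rng.random() < rate:
 |                 text = text.replace(original, variant)
 |         return text
 |
 |     print(perturb("I do not use colour words a lot, definitely."))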
| badrabbit wrote:
 | NSA already does this. "Stylometry" I believe is the term.
 | Perhaps they've used heuristics and algorithms so far? I would
 | think NLP was good enough for this years ago.
| dehrmann wrote:
 | Back in 2015 you might have had some linguists work with data
 | scientists to do feature engineering and use the results as
 | inputs to a logistic regression model. I suppose you can let a
 | deep learning model do the feature engineering for you, but
 | either way you'll end up with some of the same heuristics
 | you're thinking of.
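 | Roughly what that 2015-style pipeline might look like (assuming
 | scikit-learn; the features, authors, and snippets below are
 | invented for illustration):
 |
 |     import re
 |     import numpy as np
 |     from sklearn.linear_model import LogisticRegression
 |
 |     FUNCTION_WORDS = ["the", "of", "and", "however", "moreover"]
 |
 |     def features(text):
 |         """A few hand-engineered stylometric features."""
 |         words = re.findall(r"[a-z']+", text.lower())
 |         sents = [s for s in text.split(".") if s.strip()]
 |         n = max(len(words), 1)
 |         avg_word = sum(len(w) for w in words) / n
 |         avg_sent = len(words) / max(len(sents), 1)
 |         commas = text.count(",") / n
 |         fw = [words.count(w) / n for w in FUNCTION_WORDS]
 |         return [avg_word, avg_sent, commas] + fw
 |
 |     samples = [
 |         ("a", "Moreover, the report was, in that respect, thin."),
 |         ("a", "However, the committee was, moreover, unconvinced."),
 |         ("b", "i think its fine tbh and the thing works ok"),
 |         ("b", "nah the build broke again and i dont know why"),
 |     ]
 |     X = np.array([features(t) for _, t in samples])
 |     y = [label for label, _ in samples]
 |     clf = LogisticRegression(max_iter=1000).fit(X, y)
 |     print(clf.predict([features("Moreover, the survey was thin.")]))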
| Zak wrote:
| I built something like that more than a decade ago to identify
| alter-egos in an online game from in-game chat. It was reasonably
| successful and I thought about commercial applications for it,
 | but ultimately decided most things it could be used for are
 | creepy or evil.
|
| I remember hearing DARPA was actively seeking research in the
| field around that time. In principle, I'm not absolutely against
| my software being part of the chain of events that leads to the
| decision to kill someone, but I don't trust the US government
| (or, realistically, anybody else) to independently verify an
| identification made by such a system.
|
| I'd be surprised if the three letter agencies aren't using
| something at least as good as what I wrote by now.
| copperx wrote:
| How sophisticated was your system? It sounds like you used
| cutting edge NLP techniques at the time.
| Zak wrote:
| I'd describe it as fairly simple. It was just a classifier
| where each account name was a category: there was no fancy
| NLP. It used a single feature type and an algorithm from a
| well-known family. I don't want to say what either was lest I
| further proliferate the technique.
|
| I cross checked using statistically improbable words, which
| helped confirm or exclude weak matches.
| eftychis wrote:
| https://news.ycombinator.com/item?id=33037319
|
| https://news.ycombinator.com/item?id=33034918
|
| Am I the only one seeing the irony and contradiction here? Not
 | the same people at all -- subgroups at best, but the Director of
| National Intelligence is part of the administration. Perhaps I am
| missing something -- feel free to comment -- I am curious what
| everyone thinks.
| causi wrote:
| Wouldn't it be more accurate to say it links anonymous writings
| together? If you don't have any writings under your real name
| there's nothing it can do except indicate two pseudonyms belong
| to the same person.
| andy_ppp wrote:
| You're assuming they don't have copies of every email you ever
| sent...
| vmoore wrote:
| If I want to write anonymously, I cycle my text through Google
| Translate multiple times and keep all the grammatical errors
| intact. So, English > Italian, and then Italian > French, then
| back to English.
|
| I also pass it into Hemingway[0] first to make my text lean and
| non-superfluous.
|
| [0] https://hemingwayapp.com/
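 | The chaining itself is trivial to script; the sketch below
 | assumes a translate(text, source, target) function wired up to
 | whatever translation service you actually use (it's a
 | placeholder, not a real API):
 |
 |     def translate(text, source, target):
 |         """Placeholder -- plug in a real translation backend."""
 |         raise NotImplementedError
 |
 |     def launder(text, hops=("en", "it", "fr", "en")):
 |         """Round-trip the text through several languages, keeping
 |         the result as-is, grammatical rough edges included."""
 |         for source, target in zip(hops, hops[1:]):
 |             text = translate(text, source, target)
 |         return text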
| unethical_ban wrote:
| So we'll have AI that can mask writer's identity in about three
| months.
| lm28469 wrote:
 | Text-to-text AI; it should be relatively easy to do, too.
| MengerSponge wrote:
| Basically exists already, right? A lot of college kids are
| using it for their writing assignments.
| ezekg wrote:
| It'll be AIs all the way down.
| yamtaddle wrote:
| But it'll erase all my anachronistic grammatical and style
| preferences! Then what's even the point of writing?
|
 | (it's _rôle_ and _&c._, goddamnit)
| DharmaPolice wrote:
| If precision isn't super important you can already run your
| text through machine translation into another language and then
| back again. Spammy content sites already seem to do that to
| avoid copyright detection.
| glitchc wrote:
| We can get around this by writing something and sending it
| through GPT-3 for "style correction".
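 | With the circa-2022 OpenAI Python client, that might look
 | roughly like the sketch below (the prompt wording and parameters
 | are guesses at what a "style correction" pass could be):
 |
 |     import os
 |     import openai
 |
 |     openai.api_key = os.environ["OPENAI_API_KEY"]
 |
 |     def style_correct(text):
 |         """Ask GPT-3 to restate the text in a plain, neutral
 |         style."""
 |         prompt = ("Rewrite the following text so it conveys the "
 |                   "same information in a plain, neutral style, "
 |                   "removing any distinctive phrasing:\n\n" + text)
 |         resp = openai.Completion.create(
 |             model="text-davinci-002",
 |             prompt=prompt,
 |             max_tokens=256,
 |             temperature=0.7,
 |         )
 |         return resp.choices[0].text.strip()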
___________________________________________________________________
(page generated 2022-09-30 23:01 UTC)