[HN Gopher] Adversarial.io - Fighting mass image recognition
       ___________________________________________________________________
        
       Adversarial.io - Fighting mass image recognition
        
       Author : petecooper
       Score  : 123 points
       Date   : 2021-02-20 19:32 UTC (3 hours ago)
        
 (HTM) web link (adversarial.io)
 (TXT) w3m dump (adversarial.io)
        
       | puttycat wrote:
        | Great idea. Please consider distributing this as an open
        | source, downloadable app to avoid privacy concerns.
        
       | loser777 wrote:
       | What happens when the perturbed images are processed by some
       | noise removal method? On the crude end, even something like
       | aggressive JPEG compression will tend to remove high frequency
       | noise. There's also more sophisticated work like Deep Image Prior
       | [1], which can reconstruct images while discarding noise in a
       | more "natural" way. Finally, on the most extreme end, what
       | happens when someone hires an artist or builds a sufficiently
       | good robot artist to create a photorealistic "painting" of the
       | perturbed image?
       | 
       | There's a lot of work on compressing/denoising images so that
       | only the human-salient parts are preserved, and without seeing
        | this working past that, I think it's better to interpret
        | "adversarial" in the machine learning sense only, where
        | "adversarial" means useful for understanding how models work,
        | but without any strong security implications.
       | 
       | [1] https://arxiv.org/abs/1711.10925
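        | 
        | The first case is easy to test: re-encode the perturbed image
        | as a low-quality JPEG and see whether the predicted label
        | flips back. A crude sketch with Pillow (the quality setting
        | and filenames are arbitrary placeholders):
        | 
        |     from PIL import Image
        | 
        |     # aggressive JPEG re-encoding tends to wipe out
        |     # high-frequency adversarial noise
        |     img = Image.open("perturbed.png").convert("RGB")
        |     img.save("recompressed.jpg", format="JPEG", quality=30)
        |     # feed recompressed.jpg back into the classifier and
        |     # compare the predicted labels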
        
       | nerdponx wrote:
       | Switching the result from "tabby" to "catamount" is not nearly as
       | "adversarial" as I expected. Is that really worth it?
       | 
       | Is the idea that it's useful if you're trying to stop targeted
       | facial recognition of individual people?
        
         | TazeTSchnitzel wrote:
         | Perhaps the algorithm they're using doesn't know that those are
          | similar things and just knows they're different, so it
          | optimises towards a different class, even a similar one.
        
       | colincooke wrote:
       | It's interesting that they only tackle a single model
        | architecture (a pretty common one). It makes me think this is
       | likely an attack technique which uses knowledge of the model
       | weights to mess up image recognition (if you know the weights,
       | there are some really nice techniques that can find the minimum
       | change necessary to mess up the classifier).
       | 
       | Pretty cool stuff, but also if my assumption is correct it means
       | that if you _didn't_ use the widely available ImageNet weights
       | for inception v3 then this attack would be less effective (or not
        | even work). Given that most actors you don't want recognizing
        | your images don't open-source their weights, this may not scale
        | or be very helpful...
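        | 
        | For reference, the simplest white-box attack of that kind
        | (FGSM) against the public Inception v3 ImageNet weights looks
        | roughly like this (a sketch only; the random input and the
        | label index are stand-ins for a real preprocessed image):
        | 
        |     import torch
        |     import torch.nn.functional as F
        |     from torchvision import models
        | 
        |     # the public ImageNet weights -- exactly the knowledge
        |     # the attack needs in hand
        |     model = models.inception_v3(pretrained=True).eval()
        | 
        |     x = torch.rand(1, 3, 299, 299, requires_grad=True)
        |     y = torch.tensor([281])  # stand-in ImageNet label index
        | 
        |     # one signed step up the classification-loss gradient
        |     loss = F.cross_entropy(model(x), y)
        |     loss.backward()
        |     x_adv = (x + 0.01 * x.grad.sign()).clamp(0, 1)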
        
         | fantod wrote:
         | This was my first thought as well. The question is: How robust
         | are adversarial perturbations? In other words: Given such a
          | perturbation that was generated using one model, how well can
          | we expect it to work on a similar model (in the sense that both
         | models are fooled)? I would be curious to know if any research
         | has been done on this question.
        
       | nickvincent wrote:
       | There's a theme in this discussion that ML operators will just
       | train new models on adversarially perturbed data. I don't think
       | this is necessarily true at all!
       | 
       | The proliferation of tools like this and the "LowKey" paper/tool
       | linked below (an awesome paper!) will fundamentally change the
       | distribution of image data that exists. I think that widespread
       | usage of this kind of tool should trend towards increasing the
       | irreducible error of various computer vision tasks (in the same
       | way that long term adoption of mask wearing might change the
       | maximum accuracy of facial recognition).
       | 
       | Critically, while right now the people who do something like
       | manipulate their images will probably be very privacy conscious
       | or tech-interested people, tools like this seriously lower the
       | barrier to entry. It's not hard to imagine a browser extension
       | that helps you perturb all images you upload to a particular
       | domain, or something similar.
        
         | yumraj wrote:
         | > It's not hard to imagine a browser extension that helps you
         | perturb all images you upload to a particular domain, or
         | something similar.
         | 
         | Ideally I'd like to see something like this be part of the
         | camera filter itself.
         | 
         | Why can't Apple, if they choose to do so, just add something
         | like this as part of their camera app itself?
        
           | LockAndLol wrote:
            | If it's open source, there's no need to wait on the business
           | interests of a trillion dollar company to align with your
           | wishes. Camera app developers can be made aware of it and add
           | it to their apps. If there are app developers on HN, they can
           | create pull requests to their favorite apps and add the
           | feature.
           | 
            | That's the power of open source.
        
         | lmeyerov wrote:
         | The adversarial ML arms race seems similar to the rest of the
         | security/privacy arms race, where these endeavors will make
         | recognition stronger, not weaker, similar to how any other
         | manual red team attacks ultimately get (a) automated and (b)
         | incorporated into blue team's automatic defenses.
         | 
         | Hard to see why that wouldn't be the case, esp. for techniques
         | that are general, vs. exploiting bugs in individual models. As
         | long as a person can quickly tell the difference, it's in the
         | grasp of deep learning for ~perception problems, and the
          | economics of the arms race determine the rest of what happens,
          | and when.
        
           | nickvincent wrote:
           | Definitely agree that there will be cases in which a computer
           | vision operator ships features specifically intended to
           | create a new automatic defense.
           | 
           | One thing that seems unique to technologies that are mostly
           | just statistical learning is that each new manipulation
           | approach can basically widen the distribution of possible
           | inputs. In particular, I'm thinking that as more obfuscation
           | and protest technologies are made public like this, the
           | distribution of "images of faces available for computer
           | vision training" becomes more complex. That is to say,
            | whenever an adversarial tool creates a combination of pixels
            | that's never been seen before, if that "new image" can't be
           | reduced back to a familiar image via de-noising or pre-
           | processing, the overall difficulty of computer vision tasks
           | increases.
           | 
           | All a long winded way of saying, I think for ML systems,
           | there's a unique opportunity to "stretch the distribution of
           | inputs" that may not exist for other security arms races.
           | 
            | Totally agree that the economics of the arms race(s) will be
            | a huge factor in determining how much of an impact obfuscation
            | and protest can have.
        
       | car wrote:
       | It's really surprising to me how easily AI can be fooled. Maybe
       | there is a fundamental difference between our visual system and
       | what is represented in a visual recognition CNN. Could it be the
       | complexity of billions of cells vs. the simplification of an AI,
       | or something about the biology we haven't yet accounted for?
        
         | yumraj wrote:
         | Because there is no I in AI.
         | 
         | At a mile high conceptual level, AI is nothing but a program
         | created by a computer based on the _data it is provided_
         | 
         | Which is why it is extremely easy to fool using techniques that
         | it is not trained to handle, today, but might be able to handle
         | tomorrow. It is a race...
        
           | andrewjl wrote:
           | > Because there is no I in AI.
           | 
            | This. In a nutshell, every sort of algorithm we call "AI"
            | today is a reductive pattern matcher. This limitation isn't
            | due to computational capacity or even, IMO, algorithm design,
            | but to our collective lack of understanding of how
           | intelligence itself works. We'll get there eventually, but
           | not for a long while.
        
           | IshKebab wrote:
           | > Because there is no I in AI.
           | 
            | This is a common refrain but fairly obviously untrue. It
            | assumes there's some secret sauce in human brains that makes
            | us "intelligent" whereas AI is "just a machine".
           | 
           | It's pretty clear that human brains are just programs.
           | Extraordinarily complicated highly optimised programs, sure.
           | But nobody has even found a shred of evidence that there's
           | anything fundamentally different to programs in them.
           | 
           | Thinking otherwise is along the same lines as thinking that
           | animals don't have feelings.
           | 
           | Every time there's an advance in AI the "it's not _really_
            | intelligent" goalpost shifts. Clearly intelligence is a
           | continuum.
        
           | LockAndLol wrote:
           | > At a mile high conceptual level, AI is nothing but a
           | program created by a computer based on the data it is
           | provided
           | 
           | Your brain is but a preprogrammed, biological computer that
            | reacts to data obtained from its interfaces and attached
            | peripherals.
        
           | car wrote:
           | Right. I'm naive in assigning more capability to these models
            | than they possess.
           | 
           | I found a quote from Geoff Hinton where he talked about this
           | last year.
           | 
           | From [1]: _"I can take an image and a tiny bit of noise and
           | CNNs will recognize it as something completely different and
           | I can hardly see that it's changed. That seems really bizarre
           | and I take that as evidence that CNNs are actually using very
           | different information from us to recognize images," Hinton
           | said in his keynote speech at the AAAI Conference._
           | 
           |  _"It's not that it's wrong, they're just doing it in a very
           | different way, and their very different way has some
           | differences in how it generalizes," Hinton says._
           | 
           | [1] https://bdtechtalks.com/2020/03/02/geoffrey-hinton-
           | convnets-...
        
         | iujjkfjdkkdkf wrote:
          | We compare what we see against our internal model of the
          | world; a CNN just pattern-matches and doesn't think critically
          | about the result that comes out.
        
         | TaylorAlexander wrote:
         | Our CNN models are not meant to replicate the human visual
         | system. They are a convenient mathematical tool that is quite
         | unlike our brain. It's not even necessary to use convolutions,
         | though they remain the most popular method.
        
         | amelius wrote:
         | There are optical illusions ...
         | 
         | https://michaelbach.de/ot/
        
         | tomaszs wrote:
          | Image recognition is currently done with so-called weak AI.
          | That wasn't a popular term, but it became one a few years
          | ago so that marketing could sell machine learning as
          | artificial intelligence.
          | 
          | However, machine learning is nowhere near what we consider
          | AI, the equivalent of our intelligence.
          | 
          | You can compare machine learning with training a hamster to
          | jump on command. Repeat the training enough times and the
          | hamster will jump. But change anything in the environment
          | and it won't.
          | 
          | Machine learning is just a hamster that has been trained
          | thousands of times.
          | 
          | It can do one thing, sometimes quite well, but it is still
          | only as intelligent as a hamster.
          | 
          | Machine learning does not aim to become an intelligence. It
          | is just a well-trained hamster. Nothing more.
          | 
          | It is just a fuzzy algorithm.
          | 
          | That is why it is so easy to fool the algorithm. I hesitate
          | to call machine learning any kind of AI for exactly this
          | reason: it generates such confusion.
          | 
          | When it comes to developing real AI, we are nowhere near it
          | currently. We just enjoy machine-learned models that are
          | easier to brute-force train with the processing power we
          | have today.
        
         | colincooke wrote:
          | It's very interesting, but once you understand how these attacks
         | are performed, not too surprising (at least the attacks that
         | I'm familiar with, there may be others that are different).
          | Basically (and I'm drastically oversimplifying here), there is
          | a class of successful techniques that run an optimization to
         | maximize classification loss for a certain model
         | architecture/parameter set by modulating the content of the
         | image (for example by adding noise). This loss critically
         | depends on having the parameters of the model in hand, and is
         | very hard to avoid.
         | 
         | So it's a little more sophisticated than just adding random
         | noise, it's adding very specific quantities of noise to very
         | specific locations, which are based on perfect knowledge of how
         | the predictive system (the deep model) works.
         | 
         | Is this stuff interesting? Absolutely. Is it worth studying?
         | Yes, again. Does it mean that CNNs as we know them are poor
          | computer vision systems and fundamentally flawed? No. It's a
         | limitation of existing deep models, and one which may be
         | overcome eventually.
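          | 
          | Schematically, the iterative version of that optimization
          | looks something like this (just a PyTorch sketch; the model,
          | image tensor, label, and step sizes are all placeholders):
          | 
          |     import torch
          |     import torch.nn.functional as F
          | 
          |     def perturb(model, x, y, eps=0.03, alpha=0.005, steps=10):
          |         x_adv = x.clone().detach()
          |         for _ in range(steps):
          |             x_adv.requires_grad_(True)
          |             loss = F.cross_entropy(model(x_adv), y)
          |             grad, = torch.autograd.grad(loss, x_adv)
          |             with torch.no_grad():
          |                 # step up the loss surface, but keep the
          |                 # total change within a small budget so a
          |                 # human can barely see it
          |                 x_adv = x_adv + alpha * grad.sign()
          |                 x_adv = x + (x_adv - x).clamp(-eps, eps)
          |                 x_adv = x_adv.clamp(0, 1)
          |         return x_adv.detach()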
        
       | JohnPDickerson wrote:
       | Folks interested in this kind of work should check out an
       | upcoming ICLR paper, "LowKey: Leveraging Adversarial Attacks to
       | Protect Social Media Users from Facial Recognition", from Tom
       | Goldstein's group at Maryland.
       | 
       | Similar pitch -- use a small adversarial perturbation to trick a
       | classifier -- but LowKey is targeted at industry-grade black-box
       | facial recognition systems, and also takes into account the
       | "human perceptibility" of the perturbation used. Manages to fool
       | both Amazon Rekognition and the Azure face recognition systems
       | almost always.
       | 
       | Paper: https://arxiv.org/abs/2101.07922
        
       | sly010 wrote:
       | Can't wait to read about Inception V4 being trained on
       | adversarial.io for better noise resistance :)
        
         | _the_inflator wrote:
         | Yes, the irony: this app makes image recognition even more
         | effective.
        
           | Imnimo wrote:
           | I don't really think this is true for two reasons. First, the
           | app doesn't add anything that someone training a classifier
           | couldn't just do themselves. If I want to train my network
           | against adversarial inputs, I can just generate them myself
           | at training time on my own training data - it's not
            | particularly helpful for me to have a bunch of adversarially-
           | perturbed but unlabeled images. Second, and more importantly,
           | it's not sufficient to just train on adversarial examples
           | taken from another network. You might learn to be robust
           | against the specific weaknesses of that other network, but
           | your new network will have its own idiosyncratic weaknesses.
           | To be effective, adversarial training
           | (https://arxiv.org/pdf/1706.06083.pdf) needs an adversary
           | that adapts to the network as it trains. In other words, you
           | need your adversary to approximate your current weaknesses at
           | each step of training.
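            | 
            | Concretely, the training loop has to look something like
            | this (a rough sketch; `attack` is a PGD-style routine and
            | the model, data loader, and optimizer are placeholders):
            | 
            |     import torch.nn.functional as F
            | 
            |     def adv_train_epoch(model, loader, opt, attack):
            |         for x, y in loader:
            |             # the adversary re-attacks the *current*
            |             # weights, so it tracks the network's own
            |             # weaknesses as they shift during training
            |             x_adv = attack(model, x, y)
            |             loss = F.cross_entropy(model(x_adv), y)
            |             opt.zero_grad()
            |             loss.backward()
            |             opt.step()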
        
       | telesilla wrote:
        | I'd pay for API access to this. Are there any plans for that?
        
       | kaoD wrote:
       | Why does this page request access to my VR devices?
        
       | djabatt wrote:
        | Absolutely the coolest project I've read about this year. It
        | will be an arms race between hiding and finding. I went through
        | this with web and email spam.
        
       | bspammer wrote:
       | I love how you can almost see a lynx in the attacking noise. I'd
       | be interested to know if that's my brain spotting a pattern that
       | isn't there, or if that's genuinely just the mechanism for the
       | disruption.
        
         | 40four wrote:
         | Makes me think of those old 'magic eye' stereogram images that
         | used to be so popular.
        
         | Imnimo wrote:
         | I think it's unlikely that the noise will generally have any
         | relation to the target class. I can't find anywhere they say
         | exactly which attack method they use, so it's hard to say for
          | certain, but none of the attacks I'm aware of generate noise
          | that has any human-interpretable structure. See for
         | example Figure 1 in the seminal paper on adversarial attacks:
         | https://arxiv.org/pdf/1412.6572.pdf
        
       | endisneigh wrote:
       | Couldn't you easily infer the attacking noise by comparing the
       | original and the changed images? Once you have the attacking
       | noise it would be pretty trivial to beat this, no?
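        | 
        | Something like this would recover the noise, assuming you have
        | both files (a quick sketch; the filenames are placeholders):
        | 
        |     import numpy as np
        |     from PIL import Image
        | 
        |     # difference the two images to get the additive noise
        |     orig = np.asarray(Image.open("original.png"), np.int16)
        |     pert = np.asarray(Image.open("perturbed.png"), np.int16)
        |     noise = pert - orig
        |     print(np.abs(noise).max())  # size of per-pixel change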
       | 
       | I also don't see how this would do much against object
        | recognition or face recognition. More insight into the types
        | of recognition this actually fights against would be helpful.
        
         | xondono wrote:
         | > I also don't see how this would do much against object
         | recognition or face recognition.
         | 
          | That's precisely the point: you are creating noise that humans
          | are insensitive to, but that severely affects AI.
         | 
          | The idea as I understand it is that if you need to upload an
          | image (of yourself, for instance), you can use this to
          | complicate matters for AIs by uploading the modified picture.
        
         | alexchamberlain wrote:
         | I think the idea is that the AI doesn't have access to the
         | original. That being said, I'm not sure what would stop such
         | AIs from being _trained_ on images that have been attacked.
        
           | capableweb wrote:
            | We'll end up in a similar cat-and-mouse game to the one
            | online "pirates" have been in for a long time. Developers
            | create something to break the AI, the AI adapts because its
            | operators figure out the noise profile, the developers
            | change the noise profile, and the AI has to adapt again.
        
       | CivBase wrote:
       | Begun, the AI wars have.
        
         | djabatt wrote:
          | Sadly, there will be a lot of time and machine power spent
          | on this war.
        
       | forrestthewoods wrote:
       | If my human eyes can identify a picture then, eventually, so too
       | will algorithms. This is fundamentally a dead end concept.
       | 
       | > it works best with 299 x 299px images that depict one specific
       | object.
       | 
       | Wow. How incredibly useful.
        
         | car wrote:
         | How do we know that mammalian visual systems aren't
         | fundamentally different from AI? What you predict is nothing
         | but an assumption.
        
           | pelorat wrote:
           | The poster is not wrong. This attack only works against a
            | specific model trained on a specific dataset. It does
            | not work against models that understand that an animal face is
            | made up of sub-features like a nose, two eyes and a mouth.
           | 
           | In the end, this is a completely useless exercise and will
           | not have any impact on mass image recognition. For this to
           | even work the attack needs to be tailored to the exact
           | weights in the neural network that is being attacked.
        
           | [deleted]
        
       ___________________________________________________________________
       (page generated 2021-02-20 23:00 UTC)