[HN Gopher] The FTC's new enforcement weapon: "Algorithmic destr...
       ___________________________________________________________________
        
       The FTC's new enforcement weapon: "Algorithmic destruction"
        
       Author : simonebrunozzi
       Score  : 94 points
       Date   : 2022-03-17 15:29 UTC (7 hours ago)
        
 (HTM) web link (www.protocol.com)
 (TXT) w3m dump (www.protocol.com)
        
       | tomc1985 wrote:
       | Good, I hope they tear through this data-mining-gold-rush and set
       | us back 1000 years, and I hope all the wantrepreneurs trying to
       | exploit all of us with this data bullshit end up homeless and
       | poor.
       | 
       | I am not a data point, I do not want to be a data point, and I do
       | not want products and services to interact with me as if I was.
       | We humans are superior at curation, discovery, and anticipating
       | our needs. Not AI.
        
         | BurningFrog wrote:
         | If AI is inferior, there is no need to ban it. Pick a side!
        
           | sodkmfkwegw wrote:
            | That's the same argument as:
           | 
           | If nuclear weapons are superior to all weapons, no need to
           | ban them!!
        
         | andreilys wrote:
         | _We humans are superior at curation, discovery, and
         | anticipating our needs._
         | 
          | Depends. I would prefer reco algos to be open source so I can
          | plug in different ones.
          | 
          | Trying to make sense of the mass of data without a personalized
          | recommendation engine is a fool's errand.
        
           | hi_im_miles wrote:
           | We basically need a curation protocol.
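            | 
            | A rough sketch of what I mean, in Python (a hypothetical
            | interface; none of these names are standardized anywhere):
            | 
            |     from typing import Protocol, Sequence
            | 
            |     class Recommender(Protocol):
            |         """Any curation algorithm the user plugs in."""
            |         def rank(self, items: Sequence[str],
            |                  history: Sequence[str]) -> list[str]: ...
            | 
            |     class ChronologicalFeed:
            |         """Trivial baseline: no personalization at all."""
            |         def rank(self, items, history):
            |             # assume items already arrive newest-first
            |             return list(items)
            | 
            |     def render_feed(reco: Recommender, items, history):
            |         return reco.rank(items, history)
            | 
            | The point is that the ranking component becomes swappable
            | rather than baked into the platform.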
        
             | yummypaint wrote:
             | This is the field of library science
        
           | tenebrisalietum wrote:
            | All we need is grep for audio and video and I'm good.
        
             | prometheus76 wrote:
             | Related: there are a few channels on youtube that I archive
             | locally, and I always grab the subtitles when I do. I throw
             | them all into a database that I can run queries against if
             | I'm trying to remember which video something was said in.
             | It's not a direct grep of the video, but pretty close!
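              | 
              | (The setup is roughly this sketch: sqlite3 with full-text
              | search, assuming your Python's sqlite build ships FTS5;
              | the table and column names here are made up:)
              | 
              |     import sqlite3
              | 
              |     conn = sqlite3.connect("subtitles.db")
              |     conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS "
              |                  "subs USING fts5(video, line)")
              | 
              |     def add_line(video, line):
              |         conn.execute("INSERT INTO subs VALUES (?, ?)",
              |                      (video, line))
              | 
              |     def which_video(phrase):
              |         # full-text search across every archived subtitle
              |         return conn.execute(
              |             "SELECT DISTINCT video FROM subs "
              |             "WHERE subs MATCH ?", (phrase,)).fetchall()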
        
         | m463 wrote:
         | > We humans are superior at curation, discovery, and
         | anticipating our needs
         | 
          | You speak as though advertising were for you.
         | 
         | The data being collected about you is used to sell you to
         | advertisers, and your data to 3rd parties.
         | 
          | You see advertisements not because you are interested in a
          | product, but because the seller of the product is interested
          | in one of your characteristics (like your income, sex, weight,
          | religious affiliation, price insensitivity, poor
          | decision-making skills, etc.).
        
         | cryptonector wrote:
         | > I am not a data point, I do not want to be a data point, ...
         | 
          | <insert obligatory The Prisoner reference>
          | 
          |     I am not a number! I am a free man!
          |     I will not make any deals with you.
          |     I've resigned.
          |     I will not be pushed, filed, stamped, indexed,
          |     briefed, debriefed, or numbered!
          |     My life is my own!
        
         | listenallyall wrote:
         | Should we not have counted Covid deaths or cases because those
         | individuals also didn't want to be a data point?
        
           | andrewjl wrote:
           | Who's making money directly from those figures?
        
         | PaulDavisThe1st wrote:
         | > We humans are superior at curation, discovery, and
         | anticipating our needs. Not AI.
         | 
          | 1. I think this is far from established, and if anything the
          | accumulating evidence shows the opposite (though I do hate the
          | term AI when used like this).
         | 
         | 2. The issue (for me, at least) is not "AI" replacing human
         | equivalents (though in some parts of life, this is a huge
          | issue), but rather the ease with which the AI's actual purpose
         | and success metrics can be different from the claims made about
         | it by its owner-operators.
         | 
          | It's not that this can't happen with human beings too - think
          | of financial advisers who are actually lining their own
          | pockets rather than providing you with the best advice. But it
         | is much easier to hide nefarious intent inside an algorithm,
         | and harder to discover it without access to source code (and
         | non-trivial even then).
        
       | [deleted]
        
       | victor9000 wrote:
       | I was recently looking for dimmable string lights, and everything
       | I found online required installing an app to work the lights. In
       | what universe is installing a shitty app more convenient for the
       | user than turning a dial? It seems like more and more products
       | are being converted into data harvesting channels, regardless of
       | their intended purpose.
        
         | alar44 wrote:
          | Because the app has a shitload more functions than a dial. I can
         | program my strips to do all kinds of cool stuff.
         | 
         | I mean, what data is going to get extracted? Your favorite
         | color? The app that came with mine is great and I can work the
         | controls remotely.
         | 
         | Might be the wrong forum to be complaining that an app is too
         | hard for you.
        
           | _vertigo wrote:
           | > Might be the wrong forum to be complaining that an app is
           | too hard for you.
           | 
           | You're misconstruing the objection here. No one thinks an app
            | is too hard. The problem is I don't want my email to end up
            | in some database because I wanted to dim my lights. I
            | imagine it would also be able to collect info on what
            | Bluetooth devices
           | I have in my kitchen, etc.
        
           | pickledcods wrote:
            | Have you checked the trackers and data loggers built into
            | that app that directly or indirectly call home when you
            | interact with your lights?
            | 
            | I have an aquarium and I use a water quality tester when I
            | suspect something is wrong. Before, it was a colour strip
            | you had to compare with a reference printed on the package.
            | Now it uses a camera and matches the colour for a more
            | accurate analysis. However, that app calls home directly to
            | JBL, so they are building a profile on how I abuse my fish,
            | because every time it logs, it logs a bad situation. It also
            | leaks the usage to jquery, Google and Crashlytics without
            | notifying me or asking my consent.
        
             | rablackburn wrote:
             | > It also leaks the usage to jquery, ...
             | 
             | This is the first time I've read someone refer to loading
             | jquery as a data leak.
             | 
             | Ideally they shouldn't be using jquery, or just shipping it
             | with the app/self-hosting. But for the general case, are
             | you saying your ideal would be an app prompting you for
             | permission before it loaded any external resource from any
              | URL? Even one as ubiquitous as jquery?
             | 
             | How is the fact that your IP address requested a copy of
             | jquery, one of the most downloaded JavaScript libraries of
             | all time, any kind of meaningful signal?
             | 
             | I'd be more worried about it as an attack vector than a
             | privacy infringement.
        
           | Arrath wrote:
           | Sometimes all you want or need is a dial. I don't need my
           | dimmable lights to have a rave function, or work as a
           | visualizer for the music I'm listening to.
           | 
           | A dial on the wall also doesn't potentially open an IoT
           | shaped hole in my network security.
        
             | alar44 wrote:
             | They tracking ya like James Bond, all that super important
             | stuff you're doing. Hopefully they don't find your super
             | secret files, otherwise your master plan will be foiled.
        
       | scotuswroteus wrote:
       | This headline is, as they say, greatly exaggerated.
       | 
        | Algorithms have an age-old anti-enforcement weapon, called the
       | white shoe law firm.
       | https://www.debevoise.com/insights/publications/2020/10/thir...
        
         | qchris wrote:
         | I don't understand your point here. Are you simply saying that
         | companies can prevent this by...hiring an expensive lawyer?
         | That seems like sort of a strawman argument; hiring a "white
         | shoe law firm" doesn't make an FTC action go away. Companies
         | could hire one of those anyway, but now if the firms fail to
         | win their case (which does in fact happen) the FTC has much
          | sharper teeth with which to move forward. The risk profile for
         | companies engaging in data malpractice goes significantly up.
         | 
          | Outside of that, the headline seems very accurate--algorithmic
          | disgorgement is a very new approach (the Everalbum ruling was
          | in early 2021), and with the Kurbo ruling, one that appears
          | likely to become much more common moving forward. The whole
          | area of regulating algorithmic use is so novel that agencies
          | are still figuring out how to go about it.
        
       | NoraCodes wrote:
       | Can we please, please, please stop calling statistical
       | regressions "algorithms"? This is getting out of hand.
        
         | PeeMcGee wrote:
         | Algorithms Considered Harmful
        
         | Cd00d wrote:
         | Ha! And can we please stop calling curve fitting a "statistical
         | regression"?
        
         | IAmEveryone wrote:
         | A trained model is a set of rules to go from, for example, an
         | image to a tag identifying the object it shows.
         | 
         | That's an algorithm. It is not "statistical regression", even
         | if that is the method that was used in its creation.
         | 
         | The whole its-just-statistics-shtick is what's getting out of
         | hand. People are just mindlessly repeating the point because it
         | seemed smart when it was first made. Hint: it no longer does,
         | especially if you use it wrong.
        
           | melony wrote:
            | By that logic, most of mathematics is "algorithms" because
            | it maps one space to another. That is absolute nonsense. An
            | algorithm is a sequence of steps, an implementation. A
            | trained model is only an algorithm in the most generous
            | interpretation of the term. You cannot discount the
            | underlying mathematics of it, no matter how inscrutable it
            | may be. A distribution implemented in Python doesn't
            | magically become any less statistical just because it is
            | executed on a CPU.
        
             | jnwatson wrote:
              | Math consists of proofs. Every proof corresponds to a
              | computer program, by the Curry-Howard correspondence. Every
             | program is an implementation of an algorithm.
             | 
             | Hence, math consists of things that correspond to
             | implementations of algorithms.
        
             | feoren wrote:
             | 1. Multiply your input number by 7
             | 
             | 2. Add three
             | 
             | 3. Square the result
             | 
             | 4. Take this result as your final number
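              | 
              | (Transcribed literally into Python, just to make it
              | concrete:)
              | 
              |     def f(x):
              |         result = x * 7        # 1. multiply input by 7
              |         result = result + 3   # 2. add three
              |         result = result ** 2  # 3. square the result
              |         return result         # 4. final number
              | 
              |     assert f(2) == (7 * 2 + 3) ** 2 == 289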
             | 
             | There's an algorithm for f(x) = (7x + 3)^2. Similarly, any
             | map from one space to another is an algorithm if it's
             | describable on paper. Every mathematical proof is analogous
             | to an algorithm in important ways. Math is not "equal to"
             | algorithms, in the sense that two different algorithms can
             | describe the same underlying math, but every _description_
             | of any math essentially has to be an algorithm.
             | 
             | The only parts of math that would not fit would be anything
             | non-constructible, but we know we are limited to exploring
             | and describing only 0% of that world since our descriptions
             | (algorithms) are countably infinite, and there are
             | constructivists who believe non-constructible things don't
             | exist in any meaningful sense. So, yes, most (or all) of
             | the practice of mathematics is algorithms.
             | 
             | Why are we trying to gatekeep this word?
        
               | daniel-cussen wrote:
                | In addition, natural numbers are algorithms, in that
               | numbers are functions of no inputs, i.e. 3 is f()=3,
               | similar to random()=[0,1]. And functions are a subset of
               | algorithms.
               | 
               | My word for algorithms that aren't closed-form functions,
               | my life's work, is "repetigrams" because they involve
               | repetition in the form of iteration or recursion.
        
         | [deleted]
        
         | Spivakov wrote:
          | Statistical regression is an algorithm. I think I fail to get
          | the sentiment of this comment though (as someone not in a
          | data-oriented profession).
        
           | agency wrote:
           | I think it's too late, much like cryptographers lamenting the
           | word "crypto", but I see where they're coming from. I have
           | thought it is unfortunate that the popular conception of "the
           | algorithm" is actually a remarkably bad example of an
           | algorithm. I mean sure it's an algorithm - a computer is
           | executing it - but when I think of an algorithm I think of a
           | well defined sequence of steps solving a discrete problem.
           | Whereas "the algorithm" / machine learning tends to attack a
           | different kind of problem - "squishy" problems like
           | recommendation systems where we don't really know how to
           | explain how we come up with answers as a discrete list of
           | instructions.
        
       | exabrial wrote:
       | It would be really nice to have companies destroy algorithms
       | based on knowledge obtained illegally (pretty much all of
       | Facebook and Google). I just don't see it happening anytime soon.
        
         | Arainach wrote:
         | What evidence do you have for these claims? What knowledge do
         | you believe was obtained illegally?
        
           | qchris wrote:
           | Facebook was forced to pay $650 million in 2021 for violating
           | Illinois' Biometric Information Privacy Act[1]. That's not
           | necessarily grounds for a federal regulatory action, but if
            | something like this happens in the future that does violate
           | federal law (in the article, this was caused by not paying
           | sufficient attention to COPPA), it could mean Facebook has to
           | delete/re-do quite a bit of work in addition to purely
           | monetary damages.
           | 
           | [1] https://techcrunch.com/2021/03/01/facebook-illinois-
           | class-ac...
        
             | Arainach wrote:
             | That's a great example. I can say that for a very long time
             | (at least 5 years, probably more but I wasn't there to know
             | for sure) Google has had explicit policies and training for
             | all employees about not sharing data across projects
             | without explicit user consent for this and many other
             | reasons.
        
         | IAmEveryone wrote:
         | As the article notes, the policy has been used, including
         | against Facebook's partner Cambridge Analytica. So I cannot say
         | that it will be "happening soon", but it is certain to "have
         | happened already".
        
       | amelius wrote:
       | > Algorithmic destruction
       | 
       | I don't understand this label. Isn't this more about discovering
       | how data is (ab)used by companies (what watchdogs are supposed to
       | do), and less about destroying things with algorithms?
        
         | mftb wrote:
          | It explains in the first few paragraphs of the linked article
         | what they're trying to convey.
         | 
         | "...destroy the algorithms or AI models it built using personal
         | information collected through its Kurbo healthy eating app from
         | kids as young as 8 without parental permission."
         | 
         | and
         | 
         | "..forcing them to delete algorithmic systems built with ill-
         | gotten data could become a more routine approach, one that
         | modernizes FTC enforcement to directly affect how companies do
         | business."
         | 
         | Those quotes are from the first 3 paragraphs.
        
           | amelius wrote:
           | Ok. I read "algorithmic" as "done using an algorithm". So
           | (imho) a better label would be "algorithm destruction". Still
           | not a great label though.
        
             | mftb wrote:
             | Agree, your phrase is an improvement on theirs.
        
       | jka wrote:
       | Unless there's problematic material within the algorithms/models
       | -- and there could be, in some cases -- I get the feeling that,
       | longer-term, algorithmic transparency would be far more
       | effective.
       | 
       | It would help demonstrate what biases were introduced, the
       | systems and processes that permitted/encouraged those to exist,
       | and it would help prevent repeats of similar mistakes in future.
       | 
       | (it seems fair for the FTC to also be able to order companies to
       | stop using a particular category of algorithms; that doesn't
       | require or imply deletion, though)
        
       | sidewndr46 wrote:
       | How could the FTC require algorithmic destruction? How do you
       | prove you destroyed something? Would you have to lobotomize
       | everyone who had worked on building it?
        
         | somesortofthing wrote:
          | You don't. Upon receiving a destruction order, a company can
          | delete the trained model or not. If they do delete it, it's all
          | good. If they don't delete it, they (hopefully) can't get
          | caught using it again without facing company-destroying fines.
          | They can probably do some kind of recovery in most cases, but
          | I doubt a judge would be very favorable to them if they're
          | caught disobeying such an unambiguous order.
        
         | [deleted]
        
         | rossdavidh wrote:
         | So, essentially, they require the company to pretend that they
         | destroyed the "algorithm" (I think they actually mean machine
         | learning model). If they don't, or they do but they had a copy,
         | or they had used the knowledge to know which if several general
         | purpose algorithms worked better at approximating it, there's
         | essentially no way for an outside enforcement agency to know,
         | and even with a whistleblower there's a real problem trying to
         | enforce it. How do you prove, even if you have the database
         | seized, which data the machine learning model came from?
         | 
          | It's the same problem as saying "destroy that data", which is
          | also difficult to enforce (and probably isn't enforced), except
          | with an additional level of difficulty.
         | 
         | I'm not saying they don't have a point, I'm just saying I don't
         | see how they will be able to enforce this, even with
         | whistleblowers.
         | 
          | Whistleblower: "That ML model came from illegal data."
          | Company: "No, it didn't." ...
        
         | lallysingh wrote:
          | I think they mean the trained model. I don't think people will
          | remember all the coefficients.
        
           | sidewndr46 wrote:
           | Right, but if the government says I have to turn over some
           | nuts and bolts that I got through some illegal action I can
           | just give them the location of my warehouse. Then the
           | rightful owner can come pick them up with a government agent
           | inspecting the process.
           | 
           | If I have a bunch of drugs or chemicals I shouldn't have, I
           | can give them the addresses and they can have government
           | agents watch the destruction.
           | 
           | If I have an algorithm (that is just a bunch of computer
           | files) how do I prove I destroyed it? Do they literally just
           | watch someone run "rm banned_model.bin" and then decide it's
           | gone? It's basically impossible to prove you don't have a
           | copy of something. There could always be a backup at another
           | location. There could always be an encrypted copy stored
           | somewhere that is impossible to detect.
           | 
            | Any moderately well-run company has a service contract with
            | a backup provider that acts as an option of last resort for
           | restoring lost data. How do you get that provider to destroy
           | their copy?
        
             | defen wrote:
              | In general, the law is OK with good-faith mistakes if you
              | obey the spirit of the law. If you forgot about an off-site
             | backup but then deleted it when you found out about it, I
             | doubt you'd get in much trouble. If you deliberately used a
             | technological mechanism to evade a court order, you'd
             | probably be in for a world of hurt if you got caught, but
             | that's fundamentally no different than committing any other
             | crime and hoping you don't get caught.
        
             | Tenoke wrote:
             | You can randomly check if the company is using the model
             | down the line as well as rely on whistleblowers. If they
              | don't use it, there's little point for them to risk a fine
             | by keeping it.
        
             | colejohnson66 wrote:
              | You're right; there's no way to prove it's gone completely.
              | But in the real world, these hypotheticals can be handled
              | through increasing fines and possible jail time for
              | contempt of court.
        
         | [deleted]
        
       | IYasha wrote:
       | What I fear is law-enforced backdoors that could literally
       | destroy everything - programs, code, data... Just imagining this
        | dystopian concept gives me shivers. You didn't pay taxes? Data
       | deletion.
        
       | robbrown451 wrote:
       | Confusing headline, it seems to imply that all algorithms are
       | going to die, like we'll just stop using algorithms. Obviously
       | that doesn't make sense, but that is what it sounds like.
       | 
        | I also question whether we are talking about algorithms, the
        | data set they are working with, or the model created from that. I
       | can forgive confusing an algorithm with its implementation (e.g.
       | the source code), but this goes beyond that.
        
         | qchris wrote:
         | I believe the answer to your second line is "yes." This FTC
         | enforcement action appears to address all of the above: any
          | data that is collected in an illegal manner (in this case, in
          | violation of COPPA) and any machine learning models that are
         | created through the use of this data (in their entirety, since
         | you can't un-train a piece of information from a model).
         | 
         | Machine learning is somewhat unique in the software world in
         | that the actual useful artifacts are not necessarily strongly
         | tied to the source code itself. You can have the identical
         | source code running at two different companies, but by
         | supplying them with two different training sets, you'll end up
         | with very different outputs. That's what algorithmic
         | destruction is targeted at--not even necessarily the source
         | code or algorithm in the technical sense (you can't destroy
         | "KMeans" or "convolution" as a concept, obviously), but both
         | the data and the model weights that are produced through the
          | use of that data that are used to perform a business action.
         | Those weights are typically stored separately from the source
         | code, and can be extremely expensive to re-create from scratch.
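          | 
          | A toy sketch of that point (numpy; the data and filenames here
          | are made up): the same fitting code, fed two different
          | training sets, produces two different weight artifacts, and
          | it's the artifact, not the code, that a destruction order
          | targets.
          | 
          |     import numpy as np
          | 
          |     def train(X, y):
          |         # identical "source code" at both companies:
          |         # ordinary least squares
          |         w, *_ = np.linalg.lstsq(X, y, rcond=None)
          |         return w
          | 
          |     X = np.random.rand(100, 3)
          |     w_a = train(X, X @ np.array([1.0, 2.0, 3.0]))  # A's data
          |     w_b = train(X, X @ np.array([9.0, 8.0, 7.0]))  # B's data
          | 
          |     # same algorithm, very different (expensive) artifacts
          |     np.save("weights_a.npy", w_a)
          |     np.save("weights_b.npy", w_b)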
        
         | armchairhacker wrote:
          | The article's use of "algorithm" is incredibly bad.
         | 
         | > The FTC's new enforcement weapon spells death for algorithms
         | 
         | makes it sound like running quick-sort is a federal crime
        
         | IAmEveryone wrote:
         | A trained model, which is obviously the subject here, is an
         | algorithm.
        
           | robbrown451 wrote:
           | I generally wouldn't define algorithm as such, although I can
           | see where the lines between them can become blurry.
        
             | kaibee wrote:
             | > I generally wouldn't define algorithm as such, although I
             | can see where the lines between them can become blurry.
             | 
                | It's pretty clear-cut tbh. An algorithm is a set of steps to
             | follow to produce some output. A trained model is, 'hey do
             | these matrix multiplications with these coefficients to get
             | an output'. The fact that the exact coefficients were
             | arrived at via backprop, doesn't make it not an algorithm.
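                | 
                | For instance (a toy two-layer model in numpy; real
                | coefficients would come from training, these are just
                | placeholders):
                | 
                |     import numpy as np
                | 
                |     # "the model" is literally just these matrices
                |     W1 = np.random.rand(4, 8)
                |     W2 = np.random.rand(8, 1)
                | 
                |     def model(x):
                |         hidden = np.maximum(x @ W1, 0)  # matmul + ReLU
                |         return hidden @ W2              # matmul -> output
                | 
                |     print(model(np.random.rand(1, 4)))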
        
               | mattmcknight wrote:
                | Not necessarily, as the trained model can just be the
                | matrix. It's more like data: presumably the same
                | algorithm with different weights, trained on different
                | data, would be permissible.
        
               | The_rationalist wrote:
                | Oh yes, so Chromium and Minecraft are algorithms now.
                | What a useful definition we have here... Obviously,
                | colloquially, when people say "algorithm", it is an
                | implicature that they mean a hard-computing (not soft)
                | series of steps that achieves a specific goal. In other
                | words, a statistical algorithm does not point to the
                | same category as a deterministic algorithm, and
                | "algorithm" refers to the latter class by default.
        
               | mannykannot wrote:
                | > What a useful definition we have here...
               | 
               | Indeed it is - for one thing, it allows us to see that
               | various useful theorems and results about algorithms and
               | computability apply as much to large programs as to small
               | ones, such as the fact that there's no fundamental
               | impediment to porting them between computers with
               | different instruction sets, or running them in virtual
               | machines.
               | 
               | What's not so well or usefully defined here is your
               | distinction between hard and soft computing.
               | 
                | > In other words, a statistical algorithm does not point
                | to the same category as a deterministic algorithm, and
                | "algorithm" refers to the latter class by default.
               | 
               | You appear to be under the misapprehension that the set
               | of statistical algorithms is disjoint from that of
               | deterministic algorithms. I strongly suspect that all the
               | algorithms covered by the article are both statistical in
               | terms of what they compute and deterministic in terms of
               | how they do it.
        
               | SirSavary wrote:
               | No reasonable person would define Chromium or Minecraft
               | as algorithms; that's not what the poster was saying.
        
               | The_rationalist wrote:
                | Oh yes, so Chromium is not a "set of steps to follow to
                | produce some output" then? If the poster didn't mean
                | what he said, what criteria did he actually mean?
        
       ___________________________________________________________________
       (page generated 2022-03-17 23:01 UTC)