[HN Gopher] The FTC's new enforcement weapon: "Algorithmic destr...
___________________________________________________________________
The FTC's new enforcement weapon: "Algorithmic destruction"
Author : simonebrunozzi
Score : 94 points
Date : 2022-03-17 15:29 UTC (7 hours ago)
(HTM) web link (www.protocol.com)
(TXT) w3m dump (www.protocol.com)
| tomc1985 wrote:
| Good, I hope they tear through this data-mining-gold-rush and set
| us back 1000 years, and I hope all the wantrepreneurs trying to
| exploit all of us with this data bullshit end up homeless and
| poor.
|
| I am not a data point, I do not want to be a data point, and I do
| not want products and services to interact with me as if I was.
| We humans are superior at curation, discovery, and anticipating
| our needs. Not AI.
| BurningFrog wrote:
| If AI is inferior, there is no need to ban it. Pick a side!
| sodkmfkwegw wrote:
| That's the same argument as:
|
| If nuclear weapons are superior to all weapons, no need to
| ban them!!
| andreilys wrote:
| _We humans are superior at curation, discovery, and
| anticipating our needs._
|
| Depends. I would prefer recommendation algorithms to be open
| source so that I could plug in different ones.
|
| Trying to make sense of the mass of data without a
| personalized recommendation engine is a fool's errand.
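|
| A hypothetical sketch of what "pluggable" could look like
| (the interface and engines here are made up, just to
| illustrate the idea):
|
|     from typing import Protocol
|
|     class Recommender(Protocol):
|         """Any open source engine the user opts into."""
|         def rank(self, items: list[str],
|                  history: list[str]) -> list[str]: ...
|
|     class Chronological:
|         def rank(self, items, history):
|             return items  # no personalization at all
|
|     class MostLikeHistory:
|         def rank(self, items, history):
|             seen = set(history)
|             return sorted(items,
|                           key=lambda i: i in seen,
|                           reverse=True)
|
|     def feed(engine: Recommender, items, history):
|         return engine.rank(items, history)  # swap at will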
| hi_im_miles wrote:
| We basically need a curation protocol.
| yummypaint wrote:
| This is the field of library science
| tenebrisalietum wrote:
| all we need is grep for audio and video and I'm good.
| prometheus76 wrote:
| Related: there are a few channels on YouTube that I archive
| locally, and I always grab the subtitles when I do. I throw
| them all into a database that I can run queries against if
| I'm trying to remember which video something was said in.
| It's not a direct grep of the video, but pretty close!
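|
| For anyone curious, a rough sketch of the setup (not my
| exact scripts; assumes yt-dlp for the subtitles and SQLite's
| FTS5 for the search):
|
|     # Grab subtitles only, no video:
|     #   yt-dlp --write-auto-subs --sub-format vtt \
|     #          --skip-download <channel URL>
|     import glob
|     import sqlite3
|
|     db = sqlite3.connect("subs.db")
|     db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS "
|                "subs USING fts5(video, line)")
|
|     for path in glob.glob("*.vtt"):
|         with open(path, encoding="utf-8") as f:
|             rows = [(path, l.strip()) for l in f
|                     if l.strip() and "-->" not in l]
|         db.executemany("INSERT INTO subs VALUES (?, ?)", rows)
|     db.commit()
|
|     # "Which video was that said in?"
|     q = "SELECT DISTINCT video FROM subs WHERE subs MATCH ?"
|     for (video,) in db.execute(q, ("destruction",)):
|         print(video)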
| m463 wrote:
| > We humans are superior at curation, discovery, and
| anticipating our needs
|
| You speak as though advertising was for you.
|
| The data being collected about you is used to sell you to
| advertisers, and to sell your data to third parties.
|
| You see advertisements not because you are interested in a
| product, but because the seller of the product is interested
| in one of your characteristics (like your income, sex,
| weight, religious affiliation, price insensitivity, poor
| decision-making skills, etc.).
| cryptonector wrote:
| > I am not a data point, I do not want to be a data point, ...
|
| <insert obligatory The Prisoner reference>
|
| I am not a number! I am a free man! I will not make any
| deals with you. I've resigned. I will not be pushed, filed,
| stamped, indexed, briefed, debriefed, or numbered! My life
| is my own!
| listenallyall wrote:
| Should we not have counted Covid deaths or cases because those
| individuals also didn't want to be a data point?
| andrewjl wrote:
| Who's making money directly from those figures?
| PaulDavisThe1st wrote:
| > We humans are superior at curation, discovery, and
| anticipating our needs. Not AI.
|
| 1. I think this is far from established, and if anything the
| accumulating evidence shows the opposite (though I do hate
| the term AI when used like this).
|
| 2. The issue (for me, at least) is not "AI" replacing human
| equivalents (though in some parts of life, this is a huge
| issue), but rather the ease with which an AI's actual
| purpose and success metrics can differ from the claims made
| about it by its owner-operators.
|
| It's not that this can't happen with human beings too: think
| of financial advisers who line their own pockets rather than
| give you the best advice. But it is much easier to hide
| nefarious intent inside an algorithm, and harder to discover
| it without access to the source code (and non-trivial even
| then).
| [deleted]
| victor9000 wrote:
| I was recently looking for dimmable string lights, and everything
| I found online required installing an app to work the lights. In
| what universe is installing a shitty app more convenient for the
| user than turning a dial? It seems like more and more products
| are being converted into data harvesting channels, regardless of
| their intended purpose.
| alar44 wrote:
| Because the app has a shitload more functions than a dial. I
| can program my strips to do all kinds of cool stuff.
|
| I mean, what data is going to get extracted? Your favorite
| color? The app that came with mine is great and I can work the
| controls remotely.
|
| Might be the wrong forum to be complaining that an app is too
| hard for you.
| _vertigo wrote:
| > Might be the wrong forum to be complaining that an app is
| too hard for you.
|
| You're misconstruing the objection here. No one thinks an
| app is too hard. The problem is that I don't want my email
| to end up in some database because I wanted to dim my
| lights. I imagine it would also be able to collect info on
| what Bluetooth devices I have in my kitchen, etc.
| pickledcods wrote:
| Have you checked the trackers and data loggers built into
| that app that directly or indirectly call home when you
| interact with your lights?
|
| I have an aquarium and I use a water quality tester when I
| suspect something is wrong. It used to be a colour strip you
| had to compare with a reference printed on the package. Now
| it uses a camera and matches the colour for a more accurate
| analysis. However, that app calls home directly to JBL, so
| they are building a profile of how I abuse my fish, because
| every time it logs, it logs a bad situation. It also leaks
| the usage to jQuery, Google and Crashlytics without
| notifying me or asking my consent.
| rablackburn wrote:
| > It also leaks the usage to jQuery, ...
|
| This is the first time I've read someone refer to loading
| jQuery as a data leak.
|
| Ideally they shouldn't be using jQuery at all, or they
| should ship it with the app or self-host it. But for the
| general case, are you saying your ideal would be an app
| prompting you for permission before it loaded any external
| resource from any URL? Even one as ubiquitous as jQuery?
|
| How is the fact that your IP address requested a copy of
| jQuery, one of the most downloaded JavaScript libraries of
| all time, any kind of meaningful signal?
|
| I'd be more worried about it as an attack vector than a
| privacy infringement.
| Arrath wrote:
| Sometimes all you want or need is a dial. I don't need my
| dimmable lights to have a rave function, or work as a
| visualizer for the music I'm listening to.
|
| A dial on the wall also doesn't potentially open an IoT-
| shaped hole in my network security.
| alar44 wrote:
| They tracking ya like James Bond, all that super important
| stuff you're doing. Hopefully they don't find your super
| secret files, otherwise your master plan will be foiled.
| scotuswroteus wrote:
| This headline is, as they say, greatly exaggerated.
|
| Algorithms have an age-old anti-enforcement weapon, called
| the white-shoe law firm.
| https://www.debevoise.com/insights/publications/2020/10/thir...
| qchris wrote:
| I don't understand your point here. Are you simply saying that
| companies can prevent this by...hiring an expensive lawyer?
| That seems like sort of a strawman argument; hiring a "white
| shoe law firm" doesn't make an FTC action go away. Companies
| could hire one of those anyway, but now if the firms fail to
| win their case (which does in fact happen) the FTC has much
| sharper teeth with which to move forward. The risk profile
| for companies engaging in data malpractice goes
| significantly up.
|
| Outside of that, the headline seems very accurate--
| algorithmic disgorgement is a very new approach (the
| Everalbum ruling was in early 2021), and with the Kurbo
| ruling, it appears set to become much more common. The whole
| area of regulating algorithmic use is so novel that agencies
| are still figuring out how to go about it.
| NoraCodes wrote:
| Can we please, please, please stop calling statistical
| regressions "algorithms"? This is getting out of hand.
| PeeMcGee wrote:
| Algorithms Considered Harmful
| Cd00d wrote:
| Ha! And can we please stop calling curve fitting a "statistical
| regression"?
| IAmEveryone wrote:
| A trained model is a set of rules to go from, for example, an
| image to a tag identifying the object it shows.
|
| That's an algorithm. It is not "statistical regression", even
| if that is the method that was used in its creation.
|
| The whole "it's just statistics" shtick is what's getting
| out of hand. People are just mindlessly repeating the point
| because it seemed smart when it was first made. Hint: it no
| longer does, especially if you use it wrong.
| melony wrote:
| By that logic, most of mathematics is "algorithms" because
| it maps one space to another. That is absolute nonsense. An
| algorithm is a sequence of steps, an implementation. A
| trained model is only an algorithm in the most generous
| interpretation of the term. You cannot discount the
| underlying mathematics of it, no matter how inscrutable it
| may be. A distribution implemented in Python doesn't
| magically become any less statistical just because it is
| executed on a CPU.
| jnwatson wrote:
| Math consists of proofs. Every proof corresponds to a
| computer program, by the Curry-Howard correspondence. Every
| program is an implementation of an algorithm.
|
| Hence, math consists of things that correspond to
| implementations of algorithms.
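|
| To make the correspondence concrete, a tiny Lean sketch
| (proof terms literally are programs):
|
|     -- A proof of "A implies A" is the identity function.
|     theorem id_prop (A : Prop) : A → A :=
|       fun a => a
|
|     -- A proof of modus ponens is function application.
|     theorem mp (A B : Prop) : (A → B) → A → B :=
|       fun f a => f a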
| feoren wrote:
| 1. Multiply your input number by 7
|
| 2. Add three
|
| 3. Square the result
|
| 4. Take this result as your final number
|
| There's an algorithm for f(x) = (7x + 3)^2. Similarly, any
| map from one space to another is an algorithm if it's
| describable on paper. Every mathematical proof is analogous
| to an algorithm in important ways. Math is not "equal to"
| algorithms, in the sense that two different algorithms can
| describe the same underlying math, but every _description_
| of any math essentially has to be an algorithm.
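|
| Or, spelled out in code (same algorithm, same math):
|
|     def f(x: float) -> float:
|         y = 7 * x      # step 1: multiply by 7
|         y = y + 3      # step 2: add three
|         return y ** 2  # steps 3-4: square; final number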
|
| The only parts of math that would not fit are the non-
| constructible ones, but we know we are limited to exploring
| and describing only 0% of that world anyway, since our
| descriptions (algorithms) are countably infinite; and there
| are constructivists who believe non-constructible things
| don't exist in any meaningful sense. So, yes, most (or all)
| of the practice of mathematics is algorithms.
|
| Why are we trying to gatekeep this word?
| daniel-cussen wrote:
| In addition, natural numbers are algorithms, in that numbers
| are functions of no inputs, i.e. 3 is f()=3, similar to
| random()=[0,1]. And functions are a subset of algorithms.
|
| My word for algorithms that aren't closed-form functions,
| my life's work, is "repetigrams" because they involve
| repetition in the form of iteration or recursion.
| [deleted]
| Spivakov wrote:
| A statistical regression is an algorithm. I think I fail to
| get the sentiment of this comment, though (as someone not in
| a data-oriented profession).
| agency wrote:
| I think it's too late, much like cryptographers lamenting the
| word "crypto", but I see where they're coming from. I have
| thought it is unfortunate that the popular conception of "the
| algorithm" is actually a remarkably bad example of an
| algorithm. I mean sure it's an algorithm - a computer is
| executing it - but when I think of an algorithm I think of a
| well defined sequence of steps solving a discrete problem.
| Whereas "the algorithm" / machine learning tends to attack a
| different kind of problem - "squishy" problems like
| recommendation systems where we don't really know how to
| explain how we come up with answers as a discrete list of
| instructions.
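|
| Binary search is the canonical example of what I mean - a
| handful of exact steps you can verify by hand:
|
|     def binary_search(xs: list[int], target: int) -> int:
|         lo, hi = 0, len(xs) - 1
|         while lo <= hi:
|             mid = (lo + hi) // 2
|             if xs[mid] == target:
|                 return mid        # found: well-defined halt
|             if xs[mid] < target:
|                 lo = mid + 1      # discard lower half
|             else:
|                 hi = mid - 1      # discard upper half
|         return -1                 # provably not present
|
| There's no list of steps like that which explains why a
| recommender surfaced one video over another.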
| exabrial wrote:
| It would be really nice to see companies forced to destroy
| models built on illegally obtained data (pretty much all of
| Facebook and Google). I just don't see it happening anytime
| soon.
| Arainach wrote:
| What evidence do you have for these claims? What knowledge do
| you believe was obtained illegally?
| qchris wrote:
| Facebook was forced to pay $650 million in 2021 for
| violating Illinois' Biometric Information Privacy Act[1].
| That's not necessarily grounds for a federal regulatory
| action, but if something like this happens in the future
| that does violate federal law (in the article's case,
| insufficient attention to COPPA), it could mean Facebook has
| to delete and redo quite a bit of work in addition to paying
| purely monetary damages.
|
| [1] https://techcrunch.com/2021/03/01/facebook-illinois-
| class-ac...
| Arainach wrote:
| That's a great example. I can say that for a very long time
| (at least 5 years, probably more but I wasn't there to know
| for sure) Google has had explicit policies and training for
| all employees about not sharing data across projects
| without explicit user consent for this and many other
| reasons.
| IAmEveryone wrote:
| As the article notes, the policy has been used, including
| against Facebook's partner Cambridge Analytica. So I cannot say
| that it will be "happening soon", but it is certain to "have
| happened already".
| amelius wrote:
| > Algorithmic destruction
|
| I don't understand this label. Isn't this more about discovering
| how data is (ab)used by companies (what watchdogs are supposed to
| do), and less about destroying things with algorithms?
| mftb wrote:
| The first few paragraphs of the linked article explain what
| they're trying to convey:
|
| "...destroy the algorithms or AI models it built using personal
| information collected through its Kurbo healthy eating app from
| kids as young as 8 without parental permission."
|
| and
|
| "..forcing them to delete algorithmic systems built with ill-
| gotten data could become a more routine approach, one that
| modernizes FTC enforcement to directly affect how companies do
| business."
|
| Those quotes are from the first 3 paragraphs.
| amelius wrote:
| Ok. I read "algorithmic" as "done using an algorithm". So
| (imho) a better label would be "algorithm destruction". Still
| not a great label though.
| mftb wrote:
| Agree, your phrase is an improvement on theirs.
| jka wrote:
| Unless there's problematic material within the algorithms/models
| -- and there could be, in some cases -- I get the feeling that,
| longer-term, algorithmic transparency would be far more
| effective.
|
| It would help demonstrate what biases were introduced, the
| systems and processes that permitted/encouraged those to exist,
| and it would help prevent repeats of similar mistakes in future.
|
| (it seems fair for the FTC to also be able to order companies to
| stop using a particular category of algorithms; that doesn't
| require or imply deletion, though)
| sidewndr46 wrote:
| How could the FTC require algorithmic destruction? How do you
| prove you destroyed something? Would you have to lobotomize
| everyone who had worked on building it?
| somesortofthing wrote:
| You don't. Upon receiving a destruction order, a company can
| delete the trained model or not. If they do delete it, it's
| all good. If they don't, they (hopefully) can't get caught
| using it again without facing company-destroying fines. They
| can probably manage some kind of recovery in most cases, but
| I doubt a judge would look favorably on them if they're
| caught disobeying such an unambiguous order.
| [deleted]
| rossdavidh wrote:
| So, essentially, they require the company to pretend that
| they destroyed the "algorithm" (I think they actually mean
| the machine learning model). If they don't, or they do but
| kept a copy, or they used the knowledge to learn which of
| several general-purpose algorithms worked better at
| approximating it, there's essentially no way for an outside
| enforcement agency to know, and even with a whistleblower
| there's a real problem trying to enforce it. How do you
| prove, even if you have the database seized, which data the
| machine learning model came from?
|
| It's the same problem as saying "destroy that data", which
| is also difficult to enforce (and probably isn't), except
| with an additional level of difficulty on top.
|
| I'm not saying they don't have a point, I'm just saying I don't
| see how they will be able to enforce this, even with
| whistleblowers.
|
| Whistleblower: "that ML model came from illegal data" Company:
| "No, it didn't." ...
| lallysingh wrote:
| I think they mean the trained model. I don't think people
| will remember all the coefficients.
| sidewndr46 wrote:
| Right, but if the government says I have to turn over some
| nuts and bolts that I got through some illegal action I can
| just give them the location of my warehouse. Then the
| rightful owner can come pick them up with a government agent
| inspecting the process.
|
| If I have a bunch of drugs or chemicals I shouldn't have, I
| can give them the addresses and they can have government
| agents watch the destruction.
|
| If I have an algorithm (that is just a bunch of computer
| files) how do I prove I destroyed it? Do they literally just
| watch someone run "rm banned_model.bin" and then decide it's
| gone? It's basically impossible to prove you don't have a
| copy of something. There could always be a backup at another
| location. There could always be an encrypted copy stored
| somewhere that is impossible to detect.
|
| Any moderately well-run company has a service contract with
| a backup provider that acts as an option of last resort for
| restoring lost data. How do you get that provider to destroy
| their copy?
| defen wrote:
| In general, the law is OK with good-faith mistakes if you
| obey the spirit of the law. If you forgot about an off-site
| backup but then deleted it when you found out about it, I
| doubt you'd get in much trouble. If you deliberately used a
| technological mechanism to evade a court order, you'd
| probably be in for a world of hurt if you got caught, but
| that's fundamentally no different than committing any other
| crime and hoping you don't get caught.
| Tenoke wrote:
| You can randomly check whether the company is using the
| model down the line, as well as rely on whistleblowers. If
| they can't use it, there's little point in their risking a
| fine by keeping it.
| colejohnson66 wrote:
| You're right; there's no way to prove it's gone completely.
| But in the real world, these hypotheticals can be handled
| through escalating fines and possible jail time for contempt
| of court.
| [deleted]
| IYasha wrote:
| What I fear is law-enforced backdoors that could literally
| destroy everything - programs, code, data... Just imagining
| this dystopian concept gives me shivers. You didn't pay your
| taxes? Data deletion.
| more_corn wrote:
| robbrown451 wrote:
| Confusing headline: it seems to imply that all algorithms
| are going to die, as if we'll just stop using algorithms.
| Obviously that doesn't make sense, but that is how it reads.
|
| I also question whether we are talking about algorithms, or the
| data set they are working with or the model created from that. I
| can forgive confusing an algorithm with its implementation (e.g.
| the source code), but this goes beyond that.
| qchris wrote:
| I believe the answer to your second paragraph is "all of the
| above." This FTC enforcement action appears to address
| everything: any data that was collected in an illegal manner
| (in this case, in violation of COPPA) and any machine
| learning models created through the use of that data (in
| their entirety, since you can't un-train a piece of
| information from a model).
|
| Machine learning is somewhat unique in the software world in
| that the actual useful artifacts are not necessarily strongly
| tied to the source code itself. You can have the identical
| source code running at two different companies, but by
| supplying them with two different training sets, you'll end up
| with very different outputs. That's what algorithmic
| destruction is targeted at--not necessarily the source code
| or the algorithm in the technical sense (you can't destroy
| "KMeans" or "convolution" as a concept, obviously), but the
| data and the model weights produced from that data that are
| used to perform a business action. Those weights are
| typically stored separately from the source code, and can be
| extremely expensive to re-create from scratch.
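|
| A toy illustration of that point (made-up numbers; any
| fitting routine shows the same thing):
|
|     import numpy as np
|
|     def train(xs, ys):
|         # Identical "source code" at both companies:
|         # ordinary least-squares fit.
|         X = np.column_stack([xs, np.ones_like(xs)])
|         weights, *_ = np.linalg.lstsq(X, ys, rcond=None)
|         return weights  # the artifact a deletion order hits
|
|     xs = np.array([1.0, 2.0, 3.0])
|     a = train(xs, np.array([2.0, 4.0, 6.0]))  # ~[ 2.0, 0.0]
|     b = train(xs, np.array([5.0, 4.0, 3.0]))  # ~[-1.0, 6.0]
|
|     # Same code, different data, different model. And the
|     # weights live apart from the code:
|     np.save("model_a.npy", a)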
| armchairhacker wrote:
| The article's use of "algorithm" is incredibly bad.
|
| > The FTC's new enforcement weapon spells death for algorithms
|
| makes it sound like running quicksort is a federal crime.
| IAmEveryone wrote:
| A trained model, which is obviously the subject here, is an
| algorithm.
| robbrown451 wrote:
| I generally wouldn't define algorithm as such, although I can
| see where the lines between them can become blurry.
| kaibee wrote:
| > I generally wouldn't define algorithm as such, although I
| can see where the lines between them can become blurry.
|
| It's pretty clear-cut, tbh. An algorithm is a set of steps
| to follow to produce some output. A trained model is "hey,
| do these matrix multiplications with these coefficients to
| get an output". The fact that the exact coefficients were
| arrived at via backprop doesn't make it not an algorithm.
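|
| A minimal sketch (coefficients are made up - imagine they
| came out of training):
|
|     import numpy as np
|
|     W1 = np.array([[0.2, -1.3], [0.7, 0.5]])
|     b1 = np.array([0.1, -0.4])
|     W2 = np.array([0.9, -0.2])
|
|     def model(x: np.ndarray) -> float:
|         h = np.maximum(W1 @ x + b1, 0.0)  # multiply, ReLU
|         return float(W2 @ h)              # multiply again
|
|     print(model(np.array([1.0, 2.0])))
|
| Steps to follow, output produced: an algorithm by any
| standard definition.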
| mattmcknight wrote:
| Not necessarily, as the trained model can just be the
| matrix. It's more like data: presumably the same algorithm
| with different weights, trained on different data, would be
| permissible.
| The_rationalist wrote:
| Oh yes, so Chromium and Minecraft are algorithms now. What a
| useful definition we have here... Obviously, colloquially,
| when people say "algorithm", the implicature is that they
| mean a hard-computing (not soft) series of steps that
| achieves a specific goal. In other words, a statistical
| algorithm does not point to the same category as a
| deterministic algorithm, and "algorithm" refers to the
| latter class by default.
| mannykannot wrote:
| > What a useful definition we have here..
|
| Indeed it is - for one thing, it allows us to see that
| various useful theorems and results about algorithms and
| computability apply as much to large programs as to small
| ones, such as the fact that there's no fundamental
| impediment to porting them between computers with
| different instruction sets, or running them in virtual
| machines.
|
| What's not so well or usefully defined here is your
| distinction between hard and soft computing.
|
| > In other words, a statistical algorithm does not point to
| the same category as a deterministic algorithm, and
| "algorithm" refers to the latter class by default.
|
| You appear to be under the misapprehension that the set
| of statistical algorithms is disjoint from that of
| deterministic algorithms. I strongly suspect that all the
| algorithms covered by the article are both statistical in
| terms of what they compute and deterministic in terms of
| how they do it.
| SirSavary wrote:
| No reasonable person would define Chromium or Minecraft
| as algorithms; that's not what the poster was saying.
| The_rationalist wrote:
| Oh yes, so Chromium is not a "set of steps to follow to
| produce some output", then? If the poster didn't mean what
| he said, what criterion did he actually mean?
___________________________________________________________________
(page generated 2022-03-17 23:01 UTC)