[HN Gopher] Google Drive scans files for copyright infringement
       ___________________________________________________________________
        
       Google Drive scans files for copyright infringement
        
       Author : amrrs
       Score  : 62 points
       Date   : 2024-07-23 19:26 UTC (3 hours ago)
        
 (HTM) web link (twitter.com)
        
       | talldayo wrote:
       | This makes sense, though. I know I'm not the only one who looked
       | up "Shrek.mp4" on Google and got a literal sea of pirated movies
       | hosted on Google Drive.
        
         | gryn wrote:
         | It makes sense for public content, not for private stuff you're
         | not opening to the public. I think most people have a sense that
         | they can put what they want on their "Cloud" storage, especially
         | if it's something they're paying for.
        
           | chrisjj wrote:
           | Google is presumably unable to determine that you paid for
           | the right to copy this music file. Hence its "may".
        
       | darby_nine wrote:
       | That's not all they scan for:
       | https://support.google.com/docs/thread/200185949/google-is-n...
        
         | sunaookami wrote:
         | lol @ that guy LARPing as a Google employee. Reminds me of the
         | Microsoft Answers forum.
        
           | RockRobotRock wrote:
           | Wow, imagine doing that for free.
        
       | math0ne wrote:
       | At least it used to be the case they only scan shared files.
        
       | perihelions wrote:
       | For convenience: the linked object is a text comment, plus a
       | screenshot of text,
       | 
       | - _" so, google has scanned my recently filed scanned files and
       | said it's a copyright infringement"_
       | 
       | - _" Bro, tell me your Gemini datasplit?"_
       | 
       | > _" Google"_
       | 
       | > _" Your file may violate Google Drive's Terms of Service"_
       | 
       | > _" "05 - You are always choosing.mp3" contains content that may
       | violate Google Drive's Copyright Infringement policy. Some
       | features related to this file may have been restricted. "_
       | 
       | > _" Restricted file 05 - You are always choosing.mp3"_ - _" "_
        
       | josefritzishere wrote:
       | AI is going to start deleting everything and locking us out of
       | Google drive. It's coming.
        
         | smrtinsert wrote:
         | The year of the local cloud is just around the corner. My
         | concern is books I buy from places like pragprog in PDF
         | format. I feel like Google simply doesn't care and would ban on
         | first offense.
        
       | cynicalsecurity wrote:
       | This is some Orwellian nonsense.
        
         | mass_and_energy wrote:
         | Honestly. Does this mean that if someone takes a picture of
         | their willy with Google Photos enabled, it'll censor them
         | from themselves? Where does it end?
        
       | harshreality wrote:
       | This (the copyright scanning policy) isn't new, is it?
       | 
       | What it's scanning for in this case is material it believes to be
       | copyrighted, and restricts features (notably sharing) for content
       | that matches.
       | 
       | Given that copyright law exists, and that Google doesn't like
       | wasting engineering time on legal stuff unless refusing to do it
       | would result in lawsuits, this scanning policy is a fairly low-
       | impact solution that has probably been deemed legally necessary
       | to avoid media company lawsuits. I don't like it, but the
       | alternative is for Google (and Microsoft, and any other cloud
       | storage that allows sharing) to mount an expensive legal effort
       | to try to overturn decades of digital copyright precedent, which
       | is likely to fail.
        
         | Hizonner wrote:
         | > What it's scanning for is material it believes to be
         | copyrighted,
         | 
         | All "material" created by any human is copyrighted, and has
         | been for decades. The question is who owns what rights, which
         | Google can't know.
         | 
         | > decades of digital copyright precedent
         | 
         | What precedent?
        
           | djbusby wrote:
           | Precedent: If you're hosting it you're guilty.
        
             | edude03 wrote:
             | Or more tactfully, if you're hosting it, we'll assume
             | you're guilty because we don't want to deal with it, and
             | you agreed to a very restrictive ToS that lets us take it
             | down.
        
         | em3rgent0rdr wrote:
         | The existence of copyright law is to blame. Not Google, who's
         | just trying to adhere to law.
        
         | nonrandomstring wrote:
         | > been deemed legally necessary
         | 
         | Ah, the old "deeming things" trick [0].
         | 
         | Almost everything is copyrighted. Like most of us here I've
         | given original writing, code or music to people who've shared
         | it on Google drive. That material is copyrighted no more or
         | less than anything by Disney or Sony.
         | 
         | Google doesn't _just_ "scan for copyright violations", it
         | specifically acts out of fear or leverage to be an unpaid
         | policeman for special interests, rich and powerful media
         | companies.
         | 
         | I haven't said anything new here, but I do think it's important
         | that we see arguments built on false standards. Google isn't
         | championing the law or anything noble and we would do well to
         | be very precise about choosing words to describe what is
         | happening.
         | 
         | [0] deem: acting by fiat and art without necessary recourse to
         | logic, law, evidence or consistency
        
       | akaike wrote:
       | They calculate hashes for files and probably compare them to
       | already reported ones
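
A minimal sketch of the exact-hash matching guessed at above. Nothing in the thread confirms how Google actually does it; the choice of SHA-256, the `reported_hashes` set, and both function names are assumptions made for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large media files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_reported(path, reported_hashes):
    """True if the file's digest is in the (hypothetical) set of reported digests."""
    return sha256_of_file(path) in reported_hashes
```

A lookup like this only matches byte-identical copies, which is why the replies below focus on how easily the digest can be changed.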
        
       | ricktdotorg wrote:
       | time to start changing the hash by adding a few bytes to your
       | movies before uploading to Google Drive:
       | 
       |     head -c 20 /dev/urandom >> movie.mp4
       | 
       | won't affect playback, will affect Google finding your pirated
       | films.
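
For readers who want to see the effect on an exact digest, here is a small Python equivalent of the shell one-liner. It is only a sketch: the function names are made up for this example, and it says nothing about whether a scanner relies on exact digests.

```python
import hashlib
import os


def append_random_bytes(path, count=20):
    """Append `count` random bytes to the end of the file, leaving the rest untouched."""
    with open(path, "ab") as handle:
        handle.write(os.urandom(count))


def file_digest(path):
    """Hex SHA-256 of the whole file (fine for a demo, not for huge files)."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()
```

Comparing `file_digest(path)` before and after `append_random_bytes(path)` shows two different digests, while the original bytes at the start of the file are unchanged.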
        
         | nick238 wrote:
         | Seems like this would corrupt the file. There are plenty of
         | metadata fields you could just put some crap in (or just
         | transpose letters in an existing string so you don't need to
         | change any length markers).
        
           | ricktdotorg wrote:
           | i've never had any problems with playback using the major vid
           | players on Linux with files i may or may not have used this
           | trick on.
        
           | nomel wrote:
           | I would really hope they would ignore the metadata, when
           | computing the hash, for this very reason. Properly tagging
           | films you download isn't exactly rare.
        
           | toast0 wrote:
           | Most media files are likely to tolerate random garbage tacked
           | on to the end of the file. ID3v1 tags are essentially proof of
           | that; 128 bytes of garbage at the end that didn't cause any
           | trouble with playback.
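
To make the ID3v1 point concrete: the tag is a fixed 128-byte block at the very end of the file, beginning with the ASCII marker `TAG`. The sketch below reads it from raw bytes; the function name `read_id3v1` is made up for this example, and ID3v1.1's track-number variant of the comment field is ignored.

```python
from typing import Optional

ID3V1_SIZE = 128  # fixed size of an ID3v1 tag


def read_id3v1(data: bytes) -> Optional[dict]:
    """Parse a trailing ID3v1 tag, or return None if the file has none."""
    if len(data) < ID3V1_SIZE:
        return None
    tag = data[-ID3V1_SIZE:]
    if tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width, padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],  # single byte index into the genre table
    }
```

A decoder that does not know about the tag just sees 128 trailing bytes after the last audio frame, which is the "garbage at the end" being described.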
        
           | wongarsu wrote:
           | That depends on the container format, and with some container
           | formats on the parser. Any container format designed to be
           | streamable would by definition survive corruption at the end,
           | provided the player doesn't get too upset if any metadata at
           | the end is corrupted. VLC, for example, handles such things
           | quite well.
        
         | alfalfasprout wrote:
         | They may be doing locality sensitive hashing in which case this
         | wouldn't matter.
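
A rough sketch of why locality-sensitive hashing would defeat the append trick, using MinHash over overlapping byte shingles. This is one textbook LSH scheme picked for illustration, not a description of Google's system; a real media fingerprint would more plausibly work on decoded audio or video than on raw bytes. All names and parameters here are assumptions.

```python
import hashlib
import random

PRIME = (1 << 61) - 1  # Mersenne prime used as the modulus for the hash family


def make_permutations(count=64, seed=42):
    """Build `count` (a, b) pairs defining hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(count)]


def shingles(data, size=8):
    """Set of all overlapping `size`-byte windows in `data`."""
    return {data[i:i + size] for i in range(len(data) - size + 1)}


def minhash(data, perms, size=8):
    """MinHash signature: for each hash function, the minimum value over all shingles."""
    base = [
        int.from_bytes(hashlib.blake2b(s, digest_size=8).digest(), "big")
        for s in shingles(data, size)
    ]
    if not base:
        raise ValueError("data must be at least `size` bytes long")
    return [min((a * x + b) % PRIME for x in base) for a, b in perms]


def similarity(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)
```

Appending a few bytes adds only a handful of new shingles, so the two signatures still agree in nearly every position even though the SHA-256 digests differ completely.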
        
         | rakoo wrote:
         | I'm pretty sure Google engineers are smart enough to detect
         | quick workarounds that fit in a comment on Hacker News.
         | 
         | They already have to implement such a thing for finding
         | copyrighted material in Youtube videos, so they _know_ how to
         | deal with mixed signals.
        
           | HenryBemis wrote:
           | Or if they didn't think about this, they now do after
           | reading this thread :)
        
           | Hizonner wrote:
           | Encrypt the file and don't upload the key.
        
             | glitchc wrote:
             | They block it outright, even when the purpose is innocent
             | (sensitive documents). Tried with rar and 7-zip.
        
               | Hizonner wrote:
               | Wow, really? You can't store an encrypted file on Google
               | Drive? I guess I shouldn't be surprised, but I am,
               | mildly.
               | 
               | ... so it has no utility at all, since you definitely
               | shouldn't be storing any _unencrypted_ files on it.
        
               | Willish42 wrote:
               | anecdotal / N=1 here, but I've uploaded standard 7z
               | encrypted backups to personal Drive without issue
        
       | vander_elst wrote:
       | On the other hand if that file is shared publicly, Google might
       | be liable under some jurisdictions IIUC.
       | 
       | I'd assume that they check some hashes of the file against a
       | database to check for copyright infringement. If only specific
       | actions are not permitted on the file e.g. sharing it widely,
       | this could seem reasonable?
       | 
       | Curious to learn more, what could be other actions the service
       | provider could take to avoid getting a lawsuit?
        
       | Hizonner wrote:
       | People not stupid enough to use Google's cloud services not
       | affected.
        
         | mass_and_energy wrote:
         | This is unrelated to GCP. If you're going to be an abrasive
         | tool then you should at the very least have some shred of an
         | idea of what you're talking about, laddie
        
           | Hizonner wrote:
           | Google Drive is a cloud service. Words have meanings outside
           | of Google's brand names, Laddie.
        
       | user3939382 wrote:
       | Put movies in a password protected zip.
        
         | godzillabrennus wrote:
         | At that point, just use Backblaze. It's reasonable in cost for
         | "unlimited" and encrypts prior to transit.
        
       | delduca wrote:
       | For this reason and others (I was a Google One family
       | subscriber), I completely de-Googled myself.
        
       ___________________________________________________________________
       (page generated 2024-07-23 23:10 UTC)