[HN Gopher] Infinite-Storage-Glitch - Use YouTube as cloud stora...
       ___________________________________________________________________
        
       Infinite-Storage-Glitch - Use YouTube as cloud storage for any
       files
        
       Author : kinduff
       Score  : 273 points
       Date   : 2023-02-20 10:17 UTC (12 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | klntsky wrote:
       | > Unfortunately no filesystem functionality as of right now
       | 
       | I chuckled because of my own thought that seek (FS call) can be
       | implemented via youtube video seeking
        
       | lousken wrote:
       | if only somebody wrote a client for personal backblaze backup
       | (which is also unlimited), can't easily store terabytes from my
       | linux PCs
        
         | FloatArtifact wrote:
         | https://github.com/tom300z/backblaze-personal-wine
        
       | nivenkos wrote:
       | Until Google bans your account completely across all services
       | with no means for appeal.
        
       | c7DJTLrn wrote:
       | This reminds me of a stupid idea I had: would it theoretically be
       | possible to store data using the backbone of the Internet itself?
       | You'd bounce packets (probably TCP) back and forth between two
       | hosts with bytes that aren't actually written to a disk anywhere
       | so they just exist as a stream until one end decides to copy a
       | section for itself.
        
         | cowsup wrote:
         | You're effectively relying on two computers to be up and
         | running 24/7. It'd be twice as good an idea (which is
         | still a very low number) to just store that data in RAM on a
         | single device, rather than rely on two.
        
         | josephg wrote:
         | Suckerpinch did a video on "harder drives" where he implemented
         | a block storage device by storing data in ping packets. It's
         | one of my all time favourite technical talks - his style is
         | amazing, and he's an incredible story teller.
         | 
         | https://youtu.be/JcJSW7Rprio
        
           | devnullbrain wrote:
           | Am I correct in remembering a version of this video that
           | existed before the COVID tests?
        
         | tredre3 wrote:
         | This isn't a new idea. It can be traced back to delay-line
         | memory [1] and many thought experiments have been suggested to
         | use a large network as such. Even some actual demos have been
         | made [2][3].
         | 
         | 1. https://en.wikipedia.org/wiki/Delay-line_memory
         | 
         | 2. https://code.kryo.se/pingfs/
         | 
         | 3. https://www.shysecurity.com/post/20120401-PingFS
        
       | warent wrote:
       | Thank you for making this! I had the exact same idea quite some
       | time ago but had neither the skills nor the passion to actually
       | create it.
       | 
       | Seeing it come to life has just scratched a long forgotten itch
       | and damn it feels great.
        
       | ranting-moth wrote:
       | I like the novelty of this project, but if you value your Google
       | account I wouldn't try this out.
       | 
       | Google has been known to close accounts and "related" accounts
       | for abuse (as defined by them). So even if you create another
       | account, don't expect your main account to survive if there's any
       | possible link between them.
       | 
       | They are the judge, jury and executioner, so eff around at your own
       | peril.
        
         | DonHopkins wrote:
         | The downside of using YouTube for backups is that the comments
         | on your backups are so toxic.
        
         | [deleted]
        
         | ornornor wrote:
         | Don't forget there is no appeal process (let alone the ability
         | to talk to a human)
         | 
         | What a brave new world.
        
           | [deleted]
        
           | flangola7 wrote:
           | This is starting to change. India has a new law requiring
           | social media companies to have a grievance officer and a
           | formal grievance process that allows users to speak to an
           | actual human. It lays out a set of valid reasons to suspend a
           | user; companies cannot suspend or penalize a user for reasons
           | not on the list, and must act in a fair manner as prescribed by
           | law. If the grievance process fails it can be appealed to a
           | government office and then courts.
        
             | hungryforcodes wrote:
             | Presumably even the BBC could use them...
        
         | flatiron wrote:
         | $20 a month gives you "unlimited" storage at google. they
         | have gladly taken my encrypted files for years now and I'm up to
         | 80TB. i think it's more than reasonable to pay them for that
         | type of service and be slightly above board (the account type i
         | have says i need a minimum of 5 people but it's just me).
        
           | gekoxyz wrote:
           | how do you manage the encryption/decryption?
        
             | trvz wrote:
             | https://rclone.org/crypt/
        
               | baal80spam wrote:
               | Love rclone!
        
             | willmorrison wrote:
             | One option is Cryptomator: https://cryptomator.org/
        
           | coldblues wrote:
           | That doesn't seem to be the case anymore. You have to pay for
           | all the users to get the benefit of unlimited storage.
        
             | flatiron wrote:
             | I must be grandfathered in. I pay $20 flat per month.
        
               | CTDOCodebases wrote:
               | Are you paying month to month or yearly? I don't know if
               | you can rely on that storage being available. See below.
               | 
               | https://www.zdnet.com/article/what-happens-to-your-g-
               | suite-u...
        
             | deadfece wrote:
             | $100/mo for absolutely unlimited is still an incredible
             | bargain. $20/mo is in the neighborhood of almost free.
        
               | Dylan16807 wrote:
               | A hundred dollars a month is only an incredible bargain
               | if you have huge amounts of data.
               | 
               | The average person could buy a $100 external drive and
               | replace it every five years, and that would be enough.
        
               | salawat wrote:
               | $20/mo x 12 months = $240 annually.
               | 
               | $240 annual x 75 years = $18000
               | 
               | Almost free huh?
               | 
               | $12000 a year x 75 = $90000
               | 
               | If I could pay that in and lock it in for the duration,
               | maybe I'd consider that, but no one is going to let you
               | do that.
               | 
               | Y'all got some funny notions on "Free".
               | 
               | Then there's the whole issue of "What if Google gets
               | bored?"
        
               | shiftpgdn wrote:
               | Where did the $12,000/year come in?
        
               | salawat wrote:
               | Fack. $1200 a year x 75 years should be $90000 lifetime..
        
               | hungryforcodes wrote:
               | Why 75 years?
        
               | ianburrell wrote:
               | I think they messed up $240/yr * 5 people. Which is
               | $1200/yr. Or $100/month * 12.
        
           | www_harka_com wrote:
           | Which service is that? Doesn't Workspace allow 1TB?
        
           | renonce wrote:
           | How long does it take for you to download 80TB? From what I
           | can see Google allows you to download 10TB per day but who
           | knows when they will change that limit.
        
             | selectodude wrote:
             | Even with a gigabit internet connection that would take a
             | couple hundred hours.
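
The download-time estimate above can be checked with a short Python sketch. It assumes decimal terabytes, a link running at full line rate with no protocol overhead, and the 10TB/day export limit quoted in the parent comment; the function names are illustrative only.

```python
def transfer_hours(size_tb: float, link_gbps: float) -> float:
    """Hours to move size_tb terabytes over a link_gbps link at full line rate."""
    bits = size_tb * 1e12 * 8           # decimal terabytes -> bits
    seconds = bits / (link_gbps * 1e9)  # ideal throughput, no overhead (assumption)
    return seconds / 3600


def days_under_daily_cap(size_tb: float, cap_tb_per_day: float) -> float:
    """Days needed when the provider caps downloads at cap_tb_per_day."""
    return size_tb / cap_tb_per_day


print(round(transfer_hours(80, 1.0), 1))  # roughly 178 hours on gigabit
print(days_under_daily_cap(80, 10))       # 8.0 days at 10TB/day
```

At gigabit speed the line rate and the quoted daily cap land in the same ballpark: about a week either way.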
        
           | recuter wrote:
           | https://diskprices.com
           | 
           | Price per TB appears to have fallen below $8. So that's $640
           | worth of storage. Basically, if you were to buy your own hard
           | drives it works out to about $20/mo over two years..
        
             | mulmen wrote:
             | You're not accounting for redundancy, administration cost,
             | electricity, heat management, or servers to hold the
             | drives.
        
             | Damogran6 wrote:
             | I'm betting Google Storage is a little more fault
             | tolerant...
        
               | recuter wrote:
               | The other reply mentions backblaze. Whether you choose to
               | use them or not their published drive statistics are
               | quite useful:
               | 
               | https://www.backblaze.com/blog/backblaze-drive-stats-
               | for-202...
               | 
               | A well chosen model has an AFR of well below 1%. To get
               | about say, 100TB, you'd need a dozen drives or so with
               | ZFS and a nice enclosure. It is unlikely even one of them
               | will fail in a given year and you will not experience
               | data loss.
               | 
               | Here is a $100 case:
               | https://ja.aliexpress.com/item/1005003125774264.html
               | 
               | Here is some YouTuber shoving 100TB into it:
               | https://www.youtube.com/watch?v=boKmZKTKXHc
        
               | bornfreddy wrote:
               | Depends on the fault. Disk errors, fire, theft? Yes.
               | Account suspension? Hmmm...
        
               | manquer wrote:
               | This particular account is loss-making for them, but not by
               | all that much.
               | 
               | A comparable Cloud Storage account on GCP with Coldline
               | storage would be $320/month ($0.004/GB/month) or just
               | $96/month for archival ($0.0012/GB/month).
               | 
               | The actual cost to Google is probably < $80/month for
               | this 80TB (most of the data is going to be stored in
               | a version of archival, given the standard restriction of
               | 10TB on export).
               | 
               | 80TB is also a heavy outlier. Given the typical
               | available bandwidth today and disk sizes commercially
               | available, for most users it will take a lot of dedicated
               | investment of effort and time to upload this amount of
               | data into the cloud.
               | 
               | Also, Google's personal storage pricing is not competitive
               | for pure storage; Backblaze is only $7/month, for example.
               | The higher price and value are derived from being able to
               | integrate into other Google products and provide storage
               | for those, like Gmail, Photos etc.
        
             | Oxxide wrote:
             | 8 dollars for a TB of storage, man. It still makes me feel
             | awestruck sometimes when I see stuff like a $23 3TB HDD.
        
           | PYTHONDJANGO wrote:
           | This information is wrong. Please give a URL to the service
           | you are writing about to prove that it is right, thanks.
        
         | nimbius wrote:
         | Bold of you to assume hn hasn't fully convinced me to abandon
         | everything but maps ;)
         | 
         | The 4x size increase is my biggest concern...too bloaty.
        
           | yootyootr wrote:
           | Don't forget that YouTube compresses videos, so the extra
           | filesize makes the videos resistant to that destructive
           | process.
        
         | [deleted]
        
       | photochemsyn wrote:
       | Video steganography might be a better approach and would be less
       | likely to trigger account banning or claims of abuse by the
       | hosters. The issue of avoiding data loss due to lossy compression
       | algorithms seems to be an active area of research:
       | 
       | https://jis-eurasipjournals.springeropen.com/articles/10.118...
       | 
       | > "Moreover, most video-sharing channels transmit the
       | steganographic video in a lossy way to reduce transmission
       | bandwidth or storage space, such as YouTube and Twitter. . .
       | Robust video steganography aims to send secret messages to the
       | receiver through lossy channels without arousing any suspicions
       | from the observer. Thus, the robustness against lossy channels,
       | the security against steganalysis, and the embedding capacity are
       | equally important."
       | 
       | I suppose in this project, the blocks of pixels are large enough
       | to avoid data loss due to compression?
        
       | andrewstuart wrote:
       | Hmmm I'm not convinced.
       | 
       | I had a good look into these sorts of technologies but the host
       | almost always changes the file so it makes it impossible to
       | retrieve the data hidden in the file.
       | 
       | You need a file hosting platform that guarantees not to change
       | the uploaded file.
       | 
       | How does this avoid such problems ?
        
         | TonyTrapp wrote:
         | If you look at the example video, it doesn't depend on the
         | video not being changed, but it does depend on a minimum level
         | of quality. That is, as long as the video quality is high
         | enough (720p in this case) to get back the original black and
         | white pixels, you're fine. The data is not hidden, it's there
         | in plain sight in the video.
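
A minimal sketch of why black and white blocks can survive lossy compression, assuming square blocks (2x2 here), greyscale pixel values 0-255, and a mid-grey decision threshold. None of these names or values are taken from the project; they are illustrative assumptions.

```python
def encode_frame(bits, width, height, block=2):
    """Paint each bit as a block x block square: 0 -> black (0), 1 -> white (255)."""
    cols, rows = width // block, height // block
    frame = [[0] * width for _ in range(height)]
    for i, bit in enumerate(bits[: cols * rows]):
        r, c = divmod(i, cols)
        for y in range(r * block, (r + 1) * block):
            for x in range(c * block, (c + 1) * block):
                frame[y][x] = 255 if bit else 0
    return frame


def decode_frame(frame, n_bits, block=2):
    """Average each block and threshold at mid-grey (128) to recover the bit."""
    cols = len(frame[0]) // block
    bits = []
    for i in range(n_bits):
        r, c = divmod(i, cols)
        total = sum(frame[y][x]
                    for y in range(r * block, (r + 1) * block)
                    for x in range(c * block, (c + 1) * block))
        bits.append(1 if total / (block * block) >= 128 else 0)
    return bits


bits = [1, 0, 1, 1, 0, 0, 1, 0]
frame = encode_frame(bits, width=8, height=4)
# Simulate mild compression damage: pull every pixel 60 levels toward grey.
noisy = [[p - 60 if p > 128 else p + 60 for p in row] for row in frame]
assert decode_frame(noisy, len(bits)) == bits
```

The decoder only needs each block to stay on the correct side of the threshold, so moderate distortion is tolerated; heavier compression or downscaling would push blocks across it.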
        
           | andrewstuart wrote:
           | OK I'm convinced. I like it!
        
         | manmal wrote:
         | It's described in the README. The video has 2x2 pixel blocks
         | that are either black or white, so each one signifies a bit. So
         | a 1920x1080 frame encodes 518,400 bits = 64.8KB
         | 
         | The assumption is that video compression won't mess up those
         | blocks beyond recognition, so you should retain the information
         | as long as the rendered resolution and bitrate don't drop too
         | low.
         | 
         | Maybe this could be improved by e.g. using 32 colors instead of
         | 2, and bumping the block size to 3x3 (for safety) which should
         | yield ca 144KB per frame.
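
Both per-frame figures above can be reproduced with a few lines of Python. This is only the arithmetic, assuming every block carries log2(colors) bits and no space is reserved for metadata or error correction; the function name is hypothetical.

```python
import math


def frame_capacity_bytes(width, height, block, colors):
    """Bytes per frame when each block x block square carries log2(colors) bits."""
    blocks = (width // block) * (height // block)
    bits_per_block = int(math.log2(colors))  # assumes colors is a power of two
    return blocks * bits_per_block / 8


print(frame_capacity_bytes(1920, 1080, block=2, colors=2))   # 64800.0  (64.8KB)
print(frame_capacity_bytes(1920, 1080, block=3, colors=32))  # 144000.0 (144KB)
```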
        
           | LocalH wrote:
           | The block size should honestly be tuned for the codec in use,
           | chiefly to determine the best block size to fit with the
           | codec's macroblock size. That's usually either 8x8, or with
           | newer codecs 16x16. I feel like something like maybe 8x2
           | would be smart, and I like the idea of monochrome for
           | resiliency, since chroma is downsampled. The fewer possible
           | pixel combinations you have within a macroblock, the better
           | the compression will probably end up being as well. And 8x2
           | would somewhat evoke the look of the old video backup systems
           | as well, for the fun of the nostalgia of that.
        
       | [deleted]
        
       | zxcvbn4038 wrote:
       | You could do this with any service which accepts user content.
       | You could have a tumblr blog focused on "paranormal phenomena in
       | white noise images" and fill it full of data embedded in images.
       | If anyone ever asks you just explain that like many pattern
       | illusions not everyone can see images contained within - try
       | squinting, or covering the eye on the predominant side of your
       | body, stand on your head, blah blah blah.
        
         | unregistereddev wrote:
         | > fill it full of data embedded in images
         | 
         | This is even easier, because JPEG decoders ignore additional data
         | past the end-of-image marker. Post a low-res ~200kb jpg that has an
         | additional ~20mb of data appended. It'll still render perfectly
         | fine.
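
The appended-data trick can be sketched as follows, assuming the host serves the file byte-for-byte unchanged. The 8-byte length trailer is an illustrative choice (not part of any standard) that avoids searching for the end-of-image marker, whose byte pattern could also occur inside the payload.

```python
import struct

EOI = b"\xff\xd9"  # JPEG end-of-image marker


def append_payload(jpeg: bytes, payload: bytes) -> bytes:
    """Append payload after the image, plus an 8-byte big-endian length trailer."""
    if not jpeg.endswith(EOI):
        raise ValueError("carrier does not end with a JPEG EOI marker")
    return jpeg + payload + struct.pack(">Q", len(payload))


def extract_payload(blob: bytes) -> bytes:
    """Read the length trailer and slice the payload back out."""
    (length,) = struct.unpack(">Q", blob[-8:])
    return blob[-8 - length:-8]


# Stand-in bytes, not a real image: start marker, filler, end marker.
carrier = b"\xff\xd8" + b"\x00" * 16 + EOI
blob = append_payload(carrier, b"secret \xff\xd9 data")
assert extract_payload(blob) == b"secret \xff\xd9 data"
```

Any host that re-encodes or strips the upload would drop the trailer and the payload together.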
        
           | zxcvbn4038 wrote:
           | You could do the same thing with PNGs and different chunk
           | types. Although in both cases you run a risk that some
           | paranoid developer might filter out unexpected chunk types or
           | additional data, so in both cases it would be best to put the
           | data in the image payload.
           | 
           | The other consideration is that Tumblr was always very
           | "creator" oriented and while they might produce thumbnails of
           | various sizes the original image is still available and not
           | mangled by resizing algorithms. Other free image hosts are
           | going to crush that image down the maximum amount tolerable
           | to the human eye. Google even does that for paid photo
           | hosting.
        
           | AkshatJ27 wrote:
           | Most platforms compress uploaded images, which would result
           | in the appended data being removed.
        
       | [deleted]
        
       | woodruffw wrote:
       | Nice work! I made a much worse variant of this years ago, with a
       | "mosaic" mode[1]: whatever YouTube was doing for compression at
       | the time handled multiple QRs tiled next to each other much
       | better than it did a single large one.
       | 
       | [1]: https://github.com/woodruffw-hackathons/where-tube
        
       | [deleted]
        
       | Wowfunhappy wrote:
       | I understand that the goal is to make the data survive video
       | compression, but wouldn't it make sense to use at least some
       | color information instead of entirely black and white pixels?
        
         | trklausss wrote:
         | Seems that you did not read the README:
         | 
         | There are two encoding modes, RGB and B/W. It uses a pixel-to-
         | data width of 2x2, but says YouTube's compression algorithm is
         | brutal, and one corrupted pixel already renders the whole thing
         | corrupted.
        
           | Wowfunhappy wrote:
           | Using full RGB clearly wouldn't work, but I do wonder if you
           | could use the color channel for _something_ , possibly
           | redundancy information.
        
             | einr wrote:
             | Very likely you could get away with _at least_ 4 bits (16
             | distinct colors) per pixel, which is 4x more efficient than
             | pure black and white.
        
               | LocalH wrote:
               | I think the _size_ of the effective chroma metapixels is
               | more important than the range of values. You need to make
               | them larger in order to keep the decoder from blending
               | them together when upscaling the 4:2:0 chroma.
               | 
               | Now, if you're using a 4:4:4 format to do this, then you
               | should be able to use smaller chroma metapixels (I still
               | wouldn't use the full chroma resolution, though, unless
               | you're using a high bitrate or a lossless codec).
               | However, that risks data corruption if passed through a
               | pipeline that downsamples the chroma.
        
         | Oxidation wrote:
         | Seems like it could benefit from forward error correction to
         | defend against bit errors (this is how QR codes survive big
         | chunks being partially obscured or replaced by logos, and also
         | how CDs survive being scratched within certain limits).
        
           | naikrovek wrote:
           | that error correction greatly inflates the final size of the
           | QR code, too.
           | 
           | there should be _some_ error correction in a system like
           | this, though.
        
             | Oxidation wrote:
             | You can choose how much correction you get, in terms of how
             | many bit errors you can correct per 'n' bits. And you need
             | surprisingly few bits to get pretty great performance under
             | "reasonable" bit-error rate channel (like under 10%
             | overhead). You can wind up the strength of the error
             | correction if you anticipate a noisier channel.
             | 
             | QR codes have 4 levels of correction you can use depending
             | on how robust you wish them to be. CDs and DVDs use two
             | chained, fixed, levels to keep the decoders simple. CDs
             | have 25% overhead, but their correction is very strong:
             | they can correct 4000 bits in a row.
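
To make forward error correction concrete, here is a sketch of a Hamming(7,4) code, which corrects any single flipped bit in each 7-bit codeword. This is a deliberately simple stand-in: QR codes and CDs use Reed-Solomon-based codes, and Hamming(7,4) carries 3 parity bits per 4 data bits, so its overhead is not comparable to the figures quoted above.

```python
def hamming74_encode(d):
    """Encode 4 data bits into a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(c):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the error, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # simulate one bit damaged in transit
assert hamming74_decode(word) == [1, 0, 1, 1]
```

Two or more errors in the same codeword are miscorrected, which is one reason real systems use stronger codes plus interleaving.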
        
         | LocalH wrote:
         | Chroma is lossier than luma in most common video codecs. AVC is
         | 4:2:0 on YouTube. 4:2:0, quite confusingly, means that chroma
         | is halved in both dimensions compared to luma (so one chroma
         | pixel is congruent with four luma pixels). As well, most
         | decoders will apply filtering on the chroma to upsample it to
         | match the luma, meaning that your color boundaries are going to
         | be indistinct at best, and you might even lose the original
         | chroma values entirely in the process. You'd have to use
         | multiple chroma pixels as one metapixel in order to increase
         | resilience, which would diminish the capacity. With modern
         | codecs, a monochrome signal seems better to use for actual
         | data, although I could see it being useful to use chroma for
         | metadata.
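
The subsampling arithmetic described above, as a small sketch. The function names are hypothetical, and the table covers only three common schemes.

```python
# Horizontal and vertical chroma reduction factors per scheme.
FACTORS = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}


def chroma_plane_size(width, height, subsampling="4:2:0"):
    """Dimensions of each chroma plane for a given luma resolution."""
    fx, fy = FACTORS[subsampling]
    return (width + fx - 1) // fx, (height + fy - 1) // fy  # round up odd sizes


def luma_pixels_per_chroma_sample(subsampling="4:2:0"):
    """How many luma pixels share one chroma sample."""
    fx, fy = FACTORS[subsampling]
    return fx * fy


print(chroma_plane_size(1920, 1080))     # (960, 540)
print(luma_pixels_per_chroma_sample())   # 4
```

A colour "metapixel" therefore has to span at least a 2x2 luma area under 4:2:0, and larger still if it is to survive the decoder's chroma upsampling filter.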
        
       | [deleted]
        
       | Ralo wrote:
       | I wrote something just like this with Discord, and I even got it
       | to host full videos which you can play back in browser. It's a
       | good backup service. [0]
       | 
       | I want to expand this into a fully modular service where you
       | write payloads and scripts for various services, so when you
       | upload a file it's spread out across many different providers.
       | When you're downloading, you just go down the list, check what
       | still exists, and verify the checksum. This should be stable for
       | many years.
       | 
       | I plan to take a look into facebook and see what can/can't be
       | accessed there. I had this exact thought with youtube and thought
       | about using a pixel reader to extract the data. Same idea for
       | different image hosting services like imgur.
       | 
       | [0] https://github.com/5ut/DiskCord
        
         | danuker wrote:
         | The author says another Discord project served as inspiration:
         | https://github.com/pixelomer/discord-fs
         | 
         | Maybe you could join forces.
        
       | WaxProlix wrote:
       | It'd be cool to add a FUSE wrapper around this. At one point I
       | had a POC for a few of these sorts of things going (not as cool
       | as this project, just data stored to X free cloud store/metadata)
       | and creating a redundant transparent FUSE wrapper was probably
       | the next step. With multiple sources, you could even mux
       | data between slow/unreliable sources (content hosts in eg russia
       | or asia) to 'stripe' the data. And then, you could make these
       | modular so that new sources could be onboarded easily...
       | 
       | Yeah, I really like this stuff. Awesome project.
        
       | pcthrowaway wrote:
       | This was posted 2 days ago also (but received very little
       | attention): https://news.ycombinator.com/item?id=34850643
        
       | LocalH wrote:
       | Literally the modern equivalent of the old video-based backup
       | systems. I remember they existed for both the PC and the Amiga.
       | You would load a blank VHS tape into a VCR, connect the output of
       | the computer to that VCR's input, and then tell the program which
       | data you'd like to backup to the tape. It would generate this
       | flashing "mess" of black and white pixels that you'd record to
       | the tape. To restore, you'd connect the VCR output to a little
       | box that came with the product, it would convert the black and
       | white data in the video signal to a data stream that the program
       | would use to restore your data.
       | 
       | A portion of the signal would be used for timing, metadata and
       | error correction, so the program could tell you if the data was
       | sufficiently damaged upon restore.
       | 
       | LGR has a video on the PC version from Danmere:
       | https://youtu.be/TUS0Zv2APjU
       | 
       | Here's a video example of the Amiga industry's take on the idea:
       | https://youtu.be/VcBY6PMH0Kg?t=573
       | 
       | Sony even did this in 1980 to record CD-quality PCM audio onto
       | VHS tape. https://youtu.be/bnZFLzBO3yc
        
         | cortesoft wrote:
         | I had one in the late 90s that used 8mm tapes and my video
         | camera in the same way. Could store a ton of stuff.
         | 
         | It was pretty finicky, though, and very slow.
        
         | doubled112 wrote:
         | I just had a flashback to the Nintendo e-Reader I had as a kid.
         | 
         | Black and white dots in a strip on a card. Swipe the cards to
         | load the games.
        
         | bri3d wrote:
         | There was also DVStreamer for Windows and other tools for other
         | platforms which would store data on MiniDV tapes. This is of
         | course a bit less interesting than storage to VHS, since MiniDV
         | was already storing a bitstream, but still a clever oddity. I
         | think you could store ~13.5GB in SP mode or 20GB in LP mode
         | (reduced error correction).
        
         | adolph wrote:
         | Or the audio based ones like this Commodore cassette as storage
         | device. A guy in my neighborhood had one as part of Pac-Man
         | contest winnings.
         | 
         | https://en.wikipedia.org/wiki/Commodore_Datasette
        
         | cronix wrote:
         | We used to use regular audio cassette recorders to
         | store/restore data on the TRS80 before hard/floppy drives. It's
         | also how you backed up/restored midi data from early synths. It
         | basically just sounded like an early dial up modem transmitting
         | data when you played it back as audio.
         | 
         | https://www.youtube.com/watch?v=-nHrjqmt_wQ
        
           | chiph wrote:
           | My Apple ][+ had a tape interface. It mostly worked - if the
           | tape stretched or if the tape speed changed for some reason
           | (dirty capstan, power supply fluctuations, low volume, high
           | volume, evil pixies) then you wouldn't be able to read it
           | back.
           | 
           | This site describes the format, which was basically a header
           | tone, a sync tone, data bits, and then a checksum (not
           | described there but other sites say it was just an XOR). When
           | we got a Disk ][ (5-1/4" floppy drive) all those issues went
           | away.
           | 
           | http://www.applevault.com/hardware/apple/apple2/apple2casset.
           | ..
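
An XOR checksum of the kind mentioned above is only a few lines. This is a generic sketch, not a reproduction of the Apple II tape format: the starting value (`seed`) in particular is an assumption and is left as a parameter.

```python
from functools import reduce


def xor_checksum(data: bytes, seed: int = 0xFF) -> int:
    """XOR every byte together, starting from seed (the seed is an assumption)."""
    return reduce(lambda acc, b: acc ^ b, data, seed)


def verify(payload: bytes, checksum: int, seed: int = 0xFF) -> bool:
    """True when the payload still matches the checksum recorded with it."""
    return xor_checksum(payload, seed) == checksum


payload = b"10 PRINT CHR$(7)"
check = xor_checksum(payload)
assert verify(payload, check)

damaged = bytes([payload[0] ^ 0x04]) + payload[1:]  # one bit flipped on read-back
assert not verify(damaged, check)
```

A single flipped bit is always caught, but two flips in the same bit position of different bytes cancel out, so this detects errors weakly and corrects none.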
        
           | bobleeswagger wrote:
           | > It basically just sounded like an early dial up modem
           | transmitting data when you played it back as audio.
           | 
           | Modern synths still do this. The Korg Volca has a library for
           | converting audio into white noise that reprograms/adds more
           | samples.
        
           | noizejoy wrote:
           | > It's also how you backed up/restored midi data from early
           | synths.
           | 
           | In a weird closing of the circle, I now store the internal
           | sounds backup of my vintage Juno 60 synthesizer as a WAV file
           | recorded from that tape backup output.
           | 
           | So the digital info of the internal synthesizers gets
           | converted to analog audio in the synth, then passed as audio
           | to my modern computer's audio interface, which converts it to
           | a digital representation of the analog audio.
           | 
           | And vice versa to restore the backup into the synthesizer's
           | memory.
           | 
           | Incidentally those backups are more reliable now than when
           | using analog tape decks, since one doesn't encounter physical
           | tape degradation or a cassette deck "eating" the tape.
           | 
           | I haven't done any testing with compressed audio formats, but
           | I would expect even lossy formats to perform well, if one
           | keeps the lossiness within certain bounds, so that the
           | highest frequencies in the audio file are preserved.
        
       | Ardakilic wrote:
       | Reminds me of Gmail Drive from years ago, where you could use
       | your Gmail space as a virtual file system.
        
       | albert_e wrote:
       | Off topic
       | 
       | Does YouTube let you store unlimited video content (real video
       | like screen recordings etc of our own work - no shady or sneaky
       | stuff, nor any copyrighted stuff etc)
       | 
       | With all videos marked private ...so they are just "storage" by
       | account owner and no other users can access them and youtube
       | cannot monetize it ?
        
         | proxygeek wrote:
         | Oh.... Verry Interesting!! Hoping someone has the answer here
        
         | bombcar wrote:
         | Apparently? We do a bunch of private videos for storage (many
         | are also unlisted) and have no complaints.
         | 
         | I wouldn't use it as my ONLY backup of course.
        
           | 999900000999 wrote:
           | There was a thread here a while back where someone lost years
           | of corporate training content when YouTube deleted it.
           | 
           | If it's anything vital, as in your paycheck depends on it,
           | I'd have multiple backups.
        
       | paxys wrote:
       | It isn't exactly a "glitch", just something Google doesn't care
       | about (but absolutely will care about if too many people start
       | doing it).
       | 
       | I remember way back in the day someone came up with a clever way
       | of using Gmail attachments to build a cloud storage drive mounted
       | to your filesystem. Then Google themselves released Drive soon
       | after.
        
         | rwalle wrote:
         | I doubt "too many people start doing it" is ever going to
         | happen.
         | 
         | Obviously this is so difficult to use that most people would
         | rather pay $10/month to get 1TB of storage that can be very
         | easily accessed. Even if someone has 100TB of data and wants to
         | back them up, I don't think they would do conversion to and from
         | YouTube videos.
         | 
         | An interesting idea, but probably won't get much real world
         | use.
        
           | telotortium wrote:
           | Pirates will take advantage of any suitably easy to use
           | storage. I think YouTube is probably a poor target these
           | days, though - Google's Denial of Service can probably detect
           | something like this in pretty short order.
        
           | josephg wrote:
           | You also run the risk of YouTube deleting your videos /
           | banning your account. I'm sure they wouldn't appreciate being
           | used as a generic backup provider.
        
       | okutanski wrote:
       | This is hilarious
        
         | naikrovek wrote:
         | "hilarious" is a bit strong. "interesting" feels better to me.
        
       | 2h wrote:
       | please find better uses of your time. this is such an obvious
       | abuse.
        
       | amelius wrote:
       | Nice, until Google introduces a new compression algorithm that
       | says: hey this looks like noise, let's replace it by this other
       | user's noise so we can save on storage costs.
        
         | glasshug wrote:
         | See, for example, film grain synthesis in AV1, which YouTube
         | uses:
         | 
         | https://en.wikipedia.org/wiki/AV1#Filters
         | https://norkin.org/pdf/DCC_2018_AV1_film_grain.pdf
         | https://waveletbeam.com/index.php/news/48-netflix-film-grain...
        
       | anonf0ld wrote:
       | Using this can get your google account and related IP addresses
       | banned? Isn't this sort of a Vandalism? But why attack Youtube
       | out of all places? Do it to TikTok instead. They won't notice the
       | difference(LOL). I would've said "delete this" normally but
       | today's political climate demands more free space on the internet
       | per individual definitely so...
        
       | brthsim wrote:
       | [dead]
        
       | f_devd wrote:
       | I wonder if you could get a better pixels/bit ratio when using
       | DCT/2DFFT based encoding since you'd still encode lower frequency
       | data but it would be in a format that compression algorithms
       | would also try to maintain.
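       |
       | A minimal sketch of that idea, assuming a toy 8x8 orthonormal DCT
       | and a crude "quantize every coefficient to a multiple of 16" step
       | as a stand-in for a real codec. SLOTS, AMPLITUDE and the function
       | names are made up for illustration; YouTube's actual pipeline
       | (chroma subsampling, rescaling, motion prediction) is not modeled.

```python
import math

N = 8
# Orthonormal DCT-II basis matrix: C[u][x]
C = [[(math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N))
      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(m):
    return [list(row) for row in zip(*m)]


CT = transpose(C)


def dct2(block):
    """Forward 2D DCT: C * block * C^T."""
    return matmul(matmul(C, block), CT)


def idct2(coeffs):
    """Inverse 2D DCT: C^T * coeffs * C."""
    return matmul(matmul(CT, coeffs), C)


# Hypothetical choice: eight low-frequency coefficients carry one bit each.
SLOTS = [(0, 1), (1, 0), (1, 1), (0, 2), (2, 0), (1, 2), (2, 1), (2, 2)]
AMPLITUDE = 40.0  # assumed; large compared with the quantization step below


def to_pixels(values):
    """Shift to the 0..255 range, round, and clip."""
    return [[min(255, max(0, round(128 + v))) for v in row] for row in values]


def encode_block(bits):
    """Store each bit as the sign of one low-frequency coefficient."""
    coeffs = [[0.0] * N for _ in range(N)]
    for (u, v), bit in zip(SLOTS, bits):
        coeffs[u][v] = AMPLITUDE if bit else -AMPLITUDE
    return to_pixels(idct2(coeffs))


def lossy_roundtrip(pixels, step=16):
    """Toy lossy codec: quantize every DCT coefficient to a multiple of step."""
    coeffs = dct2([[p - 128 for p in row] for row in pixels])
    quantized = [[round(c / step) * step for c in row] for row in coeffs]
    return to_pixels(idct2(quantized))


def decode_block(pixels):
    """Read the bits back from the coefficient signs."""
    coeffs = dct2([[p - 128 for p in row] for row in pixels])
    return [1 if coeffs[u][v] > 0 else 0 for (u, v) in SLOTS]
```

       | Usage would be something like
       | decode_block(lossy_roundtrip(encode_block(bits))) for a list of
       | eight 0/1 values per 8x8 block.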
        
       | yeahbutiguess wrote:
       | People do this all the time with any web connected service that
       | accepts data. People use open strings in AWS services, like
       | lambda function names, to store arbitrary bits.
        
       | charcircuit wrote:
       | I feel like this is overcomplicating things. You should be able
       | to download the original video you uploaded instead of
       | downloading a compressed version. I'm sure the uncompressed
       | version still exists.
        
         | LocalH wrote:
         | Ensuring that you can retrieve the data from the viewable video
         | means that this can also be a way of file transfer, one that
         | other people won't be able to download the original video for.
        
       | NKosmatos wrote:
       | This is bound to get you banned. I would do it a little bit more
       | clever (with lower bitrate/throughput/storage sizes)...
       | 
       | Encode the data inside audio, preferably outside human audible
       | range, and then use a nice video of singing birds, or whales
       | talking, and use the "hidden" frequencies to hide the data.
       | 
       | I don't know if Youtube has any filters that cut out frequencies,
       | but this way they can't ban you, since you've uploaded a really
       | nice personal video of your singing birds, instead of the
       | conspicuous looking QR-like codes as in the OP ;-)
        
         | crazygringo wrote:
         | > _preferably outside human audible range_
         | 
         | With any lossy audio compression algorithm, everything outside
         | the human audible range is filtered away completely as a first
         | step. That's compression 101.
         | 
         | Also there's much less bandwidth in the audio channel than the
         | video channel, and then far less again if you're trying to hide
         | a signal in another signal.
        
       | brudgers wrote:
       | Using video formats to store other data has a long history.
       | 
       | ADAT for example.
       | 
       | https://en.wikipedia.org/wiki/ADAT
        
       | egberts1 wrote:
       | THAT is 1337!
       | 
       | A true hacker spirit worthy of Captain Crunch whistle and its
       | application toward free payphone calls.
        
       | up2isomorphism wrote:
       | Dumb yet interesting idea, but if you care about your data you
       | should not put it on Google, especially if you are abusing their
       | service; if you do not care, why even waste your time doing all
       | this except for fun?
        
       | AtlasBarfed wrote:
       | This would be really impressive with some steganography.
        
       | wigster wrote:
       | they're here...
       | 
       | nice end of transmission simulator to boot!
        
       | ed25519FUUU wrote:
       | One of the things I've successfully used YouTube for was video
       | storage of my security camera system. Unlimited video storage
       | with a simple app to watch them in case I need to check something
       | out!
       | 
       | And it's simple: camera uploads automatically via FTP,
       | inotifywait script uploads to google!
        
         | booi wrote:
         | Shhhh.. don't you know what the first rule about YouTube
         | storage for security systems is??
        
           | vinay_ys wrote:
           | Right this moment an engineer at Google is writing a personal
           | OKR to block this and declare $$$ savings in order to get
           | promoted next year.
        
             | teawrecks wrote:
             | All they'd have to do is limit the amount of private videos
             | you're allowed to store. If your only option for storing
             | unlimited security footage is to make it public, then
             | people probably wouldn't do that.
             | 
             | Alternatively, if they're allowed to use the footage to
             | train some AI that will help them take over the world, then
             | maybe they want all your random footage for free.
        
       | skwheel wrote:
       | after your finals, you should read about forward error
       | correction.
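       |
       | For a taste of what forward error correction looks like, here is
       | a minimal sketch of the classic Hamming(7,4) code, which protects
       | 4 data bits with 3 parity bits and corrects any single flipped
       | bit. It is only an illustration; the function names are made up,
       | and a tool like the one in the OP would more likely want a
       | stronger code such as Reed-Solomon.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits (list of 0/1) into a 7-bit codeword."""
    d1, d2, d3, d4 = nibble
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the bad bit, or 0 if none.
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]
```

       | Flipping any one of the seven bits of hamming74_encode(...) still
       | decodes to the original four bits.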
        
       | LastTrain wrote:
       | " I still don't condone using this tool for anything
       | serious/large. YouTube might understandably get mad."
       | 
       | I do love these kinds of hacks, but I hate these kind of weaselly
       | cop-out statements. You made the tool, own it!
        
         | shultays wrote:
         | Considering this might very well end up with you losing your
         | google account, it is a very necessary warning.
         | 
         | If anything, the author should be more clear about what happens
         | if youtube gets mad: you might lose your google account along
         | with access to mail, drive, photos etc
        
       | j-krieger wrote:
       | I've observed that with any piece of technology where you're
       | permitted to write / upload information and freely access it
       | afterwards, someone will attempt to (ab)use it for file storage
       | and write a blog article about it later :)
       | 
       | My favorite example of this was people storing files in "secret"
       | subreddits by using posts and comments to store bytes. When they
       | were later discovered by other users, the seemingly random
       | strings sparked a huge conspiracy about their possible meaning.
       | 
       | However, you always have the problem that your unwilling host may
       | remove your "files". I sometimes wonder about file storage using
       | a textual output format that can't be distinguished from normal
       | user interactions.
        
         | codetrotter wrote:
         | I remember when GMail was by invite only, and at the time they
         | were offering quite a larger amount of storage for Mail than
         | anyone else so people started using their GMail drafts to store
         | files.
         | 
         | That was the first time I came across such a thing.
         | 
         | Someone even made an extension for Windows XP that allowed you
         | to mount GMail as a storage volume.
         | 
         | > GMail Drive is a Shell Namespace Extension that creates a
         | virtual filesystem around your Google Mail account, allowing
         | you to use Gmail as a storage medium.
         | 
         | http://www.viksoe.dk/code/gmail.htm
        
           | meltyness wrote:
           | GmailFS was another early implementation.
        
             | drkstr wrote:
             | Writing the GmailFS HOWTO, and fixing a bug in the process,
             | was my first exposure to the power of OSS. Looking back,
             | I'm pretty sure this is what led me to pursue software
             | engineering as a career!
        
         | petercooper wrote:
         | How about storing your files in other people's DNS caches?
         | 
         | https://blog.benjojo.co.uk/post/dns-filesystem-true-cloud-st...
        
           | metadat wrote:
           | Discussed 5 years ago:
           | 
           | https://news.ycombinator.com/item?id=16134041 (36 comments)
        
         | coffeeblack wrote:
         | Now I want to write a blog post about storing files inside of
         | blog posts about storing files inside of blog posts ...<error:
         | recursion limit reached>
        
           | j-krieger wrote:
           | This makes me think of Turing machines which store their own
           | code inside themselves, which you can use for all kinds of
           | interesting proofs. I wish I could find more about this.
        
             | t344344 wrote:
             | Look into Squeak/Smalltalk. It is an operating
             | system/desktop/IDE with self contained compiler.
        
         | [deleted]
        
         | BlueTemplar wrote:
         | In completely unrelated Hacker News :
         | 
         | "Ask HN: What are these strange random strings spamming my
         | blog?"
         | 
         | https://news.ycombinator.com/item?id=34865695
        
         | nullc wrote:
         | Pretty easy to do that, use a fixed point implementation of
         | GPT(N) of whatever size you like and range code your data into
         | the model probabilities. This also will achieve a close to rate
         | optimal embedding-- allowing you to embed about as much data as
         | the language model thinks the text has...
         | 
         | If you encrypt the data and include a checksum or other
         | identifying bytes in the ciphertext you can even have unwitting
         | human participants in the discussions and if their posts are
         | context your embedded data will be credible replies. You just
         | have to be sure that threading behavior doesn't make it
         | impossible to give the decoder identical context.
        
         | INTPenis wrote:
         | In this modern cloud-giant world it's abused for file storage
         | yes. But I come from the more traditional web hosting world of
         | the early 2000s and back then the general rule was that
         | anything that could store information online would sooner or
         | later be used to store porn.
        
         | adolph wrote:
         | > When they were later discovered by other users, the seemingly
         | random strings sparked a huge conspiracy about their possible
         | meaning.
         | 
         | Makes me wonder if numbers stations are actually just the
         | world's slowest modems
        
         | [deleted]
        
         | diceduckmonk wrote:
         | > I sometimes wonder about file storage using a textual output
         | format that can't be distinguished from normal user
         | interactions.
         | 
         | I guess it depends on what noise-to-signal density you're
         | after.
         | 
         | With a long enough ChatGPT generated output, no one would
         | question a few out of place characters or even an emoji. With
         | 3000+ different emojis to choose from, each one encodes at least
         | an entire byte of data.
         | 
         | Another idea is using "they're", "their", "there" as bits.
        
           | sgerenser wrote:
           | I vaguely recall some secretive company (perhaps Apple) using
           | adjustment of spacing, capitalization, etc. to encode a
           | unique serial number in messages sent by the CEO, which could
           | then be used to trace leaks.
        
             | vlunkr wrote:
             | Genius.com hid the message "red-handed" in Morse code using
             | alternating quote characters to prove that Google was
             | displaying their lyrics.
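             |
             | A minimal sketch of that kind of watermark, assuming the
             | straight apostrophe stands for a dot and the curly one for
             | a dash (which mark meant which is a guess here, as is
             | running the letters together without gaps). The function
             | names are made up for illustration.

```python
# Morse for just the letters needed to spell "redhanded".
MORSE = {"r": ".-.", "e": ".", "d": "-..", "h": "....", "a": ".-", "n": "-."}

STRAIGHT, CURLY = "'", "\u2019"


def to_pattern(word):
    """Concatenate the Morse symbols of each letter (no letter gaps)."""
    return "".join(MORSE[ch] for ch in word)


def embed(text, pattern):
    """Rewrite successive apostrophes so they spell out the pattern."""
    out, i = [], 0
    for ch in text:
        if ch in (STRAIGHT, CURLY) and i < len(pattern):
            out.append(CURLY if pattern[i] == "-" else STRAIGHT)
            i += 1
        else:
            out.append(ch)
    if i < len(pattern):
        raise ValueError("text has too few apostrophes for the pattern")
    return "".join(out)


def extract(text, length):
    """Read the first `length` apostrophes back as dots and dashes."""
    marks = [ch for ch in text if ch in (STRAIGHT, CURLY)]
    return "".join("-" if ch == CURLY else "." for ch in marks[:length])
```

             | If a copy of the text shows the same dot/dash sequence as
             | to_pattern("redhanded"), it very likely came from the
             | marked original.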
        
             | joaonmatos wrote:
             | Elon Musk was the one claiming to do it
        
               | mattkrause wrote:
               | Many people claim to do it.
               | 
               | It's a plot point in _Patriot Games_, a 1987 Tom
               | Clancy novel that introduced the term "canary trap" for
               | this trick. He says he invented the term, but not the
               | technique, which was already in use.
               | 
               | In a spat over the plot of _Star Trek III_ (so, early
               | 1980s), Harve Bennett distributed slightly different
               | versions of the script, allowing him to track a leak back
               | to Gene Roddenberry.
               | 
               | The book _SpyCatcher_ says it was in routine use at MI-5,
               | and you can find variations of it in lots of fiction too.
        
             | xen2xen1 wrote:
             | CIA and similar have been doing that for something like 50
             | years. Made it into Tom Clancy novels in the 80s IIRC.
        
         | dtx1 wrote:
         | > However, you always have the problem that your unwilling host
         | may remove your "files". I sometimes wonder about file storage
         | using a textual output format that can't be distinguished from
         | normal user interactions.
         | 
         | Well, with Chat GPT that's almost trivial. POC
         | https://imgur.com/fQvMh9S
        
         | phh wrote:
         | > I sometimes wonder about file storage using a textual output
         | format that can't be distinguished from normal user
         | interactions.
         | 
         | You could use a reproducible LM (for instance using Bellard's
         | NNCP as basis), and encode one bit in one word by taking the
         | {first, second} most probable next word.
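         |
         | A minimal sketch of the scheme, with a tiny bigram table
         | standing in for the reproducible LM (it is not NNCP, and
         | CORPUS, top_two, encode_bits and decode_bits are made-up
         | names). Sender and receiver only need the same deterministic
         | ranking of next words.

```python
from collections import Counter, defaultdict

# Toy training text; a real setup would use a shared, reproducible model.
CORPUS = ("the cat sat on the mat and the dog sat on the rug and "
          "the cat saw the dog and the dog saw the cat on the mat").split()


def build_model(words):
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows, sorted(set(words))


def top_two(model, prev):
    """Return the two most probable next words, ranked deterministically."""
    follows, vocab = model
    counts = follows.get(prev, Counter())
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    # Pad from the vocabulary so two distinct candidates always exist.
    for w in vocab:
        if len(ranked) >= 2:
            break
        if w not in ranked:
            ranked.append(w)
    return ranked[:2]


def encode_bits(bits, model, start="the"):
    """Each bit picks the first (0) or second (1) most probable next word."""
    words = [start]
    for bit in bits:
        words.append(top_two(model, words[-1])[bit])
    return " ".join(words)


def decode_bits(text, model):
    """Replay the model and note which candidate each word was."""
    words = text.split()
    return [top_two(model, prev).index(nxt)
            for prev, nxt in zip(words, words[1:])]
```

         | With model = build_model(CORPUS),
         | decode_bits(encode_bits(bits, model), model) gives the bits
         | back, at one bit per generated word.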
        
           | animuchan wrote:
           | This is fascinating! And the file transfer can be then fully
           | disguised as a conversation, with a ChatGPT-like client and
           | all. An unsuspecting user will see a chat bot; a specialized
           | client app would be able to receive files by talking to it.
        
       | OscarCunningham wrote:
       | Previously on 'Esoteric Filesystem Week':
       | 
       | 0. Linux's SystemV Filesystem Support Being Orphaned
       | https://news.ycombinator.com/item?id=34818040 by rbanffy 3 days
       | ago, 70 points, 73 comments
       | 
       | 1. TabFS - a browser extension that mounts the browser tabs as a
       | filesystem https://news.ycombinator.com/item?id=34847611 by pps 1
       | day ago, 961 points, 185 comments
       | 
       | 2. Vramfs - GPU VRAM based file system for Linux
       | https://news.ycombinator.com/item?id=34855134 by pabs3 1 day ago,
       | 226 points, 71 comments
        
         | Lt_Riza_Hawkeye wrote:
         | Maybe it doesn't have a post of its own, but I found these
         | esoteric storage methods greatly entertaining as well:
         | https://www.youtube.com/watch?v=JcJSW7Rprio
        
           | OscarCunningham wrote:
           | Tom7 is a national treasure.
        
             | loeg wrote:
             | Indeed. https://news.ycombinator.com/item?id=34859300 :-)
        
         | imhoguy wrote:
         | Now we need ytFS FUSE driver to random read these pattern
         | videos. Anyone? ;)
        
       | throwaway71271 wrote:
       | haha this is so cool, i made something similar
       | https://punkjazz.org/scrambled-eggs/ few years ago to explore
       | transferring files directly through the camera so nobody can
       | "see" what you download, because no packets go through the
       | internet, i managed to do 10kbps or so
       | 
       | the modern qr readers are so fast and easy to use, it's
       | unbelievable
        
         | pveierland wrote:
         | Nice! It's such a neat way to transfer information :)
         | 
         | This guy extended the idea using fountain codes, which allows
         | you to miss arbitrary frames and still recover the full message
         | without waiting for the missed frames to re-appear:
         | 
         | https://divan.dev/posts/fountaincodes/
        
         | actionfromafar wrote:
         | One could do side-channel sliding window hand shakes with
         | audio, to improve download performance. :-)
        
           | throwaway71271 wrote:
           | i was actually thinking about that, could be even more cool
           | now with modern text to speech and whisper and some funky
           | word based encoding with huge dictionary like:
           | 
           | teacher: 0b00010010101001, school: ...
           | 
           | and then the website can encode the data as a sentence and
           | just text to speech it and the receiver can use whisper to
           | speech to text and decode
           | 
           | will be the most creepy thing because it can be very
           | steganographic and sound like a real sentence
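           |
           | a rough sketch of the dictionary part, assuming a tiny
           | 16-word list (4 bits per word) instead of a huge one; the
           | word list and function names are made up, and the text to
           | speech / whisper steps are left out entirely

```python
# Hypothetical 16-word dictionary: each word carries 4 bits.
WORDS = ["teacher", "school", "river", "window", "garden", "music",
         "paper", "silver", "mountain", "coffee", "yellow", "planet",
         "doctor", "bridge", "forest", "candle"]
INDEX = {word: i for i, word in enumerate(WORDS)}


def words_encode(data: bytes) -> str:
    """Turn each byte into two words (high nibble, then low nibble)."""
    out = []
    for byte in data:
        out.append(WORDS[byte >> 4])
        out.append(WORDS[byte & 0x0F])
    return " ".join(out)


def words_decode(sentence: str) -> bytes:
    """Map the words back to nibbles and reassemble the bytes."""
    nibbles = [INDEX[w] for w in sentence.lower().split()]
    if len(nibbles) % 2:
        raise ValueError("expected an even number of words")
    return bytes((hi << 4) | lo
                 for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
```

           | a bigger dictionary packs more bits per word, e.g. 16384
           | words would give the 14 bits per word from the example above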
        
       | sam_goody wrote:
       | This is why I really like HN...
       | 
       | (IMO there is not enough of these posts, and getting less over
       | time.)
       | 
       | A refreshing "actual hacker" project that makes me look anew at
       | the tools I always use...
       | 
       | So, my coffee maker is sending data to the net - maybe I can use
       | that for backup, and have it replicated both in the fridge and in
       | the living room lights...
       | 
       | But how would I retrieve that? Hmm. I assume that both Alexa and
       | Google assistant are tracking everything that goes through my IoT
       | devices. I'll ask GPT how to hack my Nest device to pull back
       | data on demand, that oughta work, surely?! :D
        
         | ggerganov wrote:
         | Yes - more of this please :)
         | 
         | Tangentially related and discussed in the past on HN: File
         | transfer via color barcodes and a phone camera
         | 
         | [0] https://news.ycombinator.com/item?id=25459501
         | 
         | [1] https://github.com/sz3/libcimbar
         | 
         | [2] https://cimbar.org
        
           | egberts1 wrote:
           | Oh boy, OpSec should pay attention to this.
        
             | thombat wrote:
             | It starts with "no monitors facing windows" and "all
             | visitors hand over phones and any other devices with
             | photographic possibilities" and moves up the
             | paranoia/professional caution scale from there.
        
         | luma wrote:
         | Agreed! Here's a fun video by suckerpinch trying out some truly
         | insane data storage ideas https://youtu.be/JcJSW7Rprio
        
       | layer8 wrote:
       | It could probably be hidden in a normal-looking video using
       | steganography. Lower effective bitrate of course.
        
       ___________________________________________________________________
       (page generated 2023-02-20 23:00 UTC)