hngopher.com

       [HN Gopher] YouTubeDrive: Store files as YouTube videos
       ___________________________________________________________________
        
       YouTubeDrive: Store files as YouTube videos
        
       Author : notamy
       Score  : 383 points
       Date   : 2022-05-24 17:31 UTC (5 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | smm11 wrote:
       | I suspect people in my office who send everything as a Word
       | attachment with an image, PPT, Excel workbook, etc., embedded,
       | are doing this unknowingly.
       | 
       | There are even Word files I've found that have complete file path
       | notation to ZIP files.
        
       | xhrpost wrote:
       | Not immediately obvious from the Readme, but does this rely on YT
       | always saving and providing a download of the original unaltered
       | video file? If not, then it must be saving the data in a manner
       | that is retrievable even after compression and re-encoding, which
       | is very interesting.
        
         | [deleted]
        
       | bspear wrote:
       | Fascinating
        
       | fronterablog wrote:
       | I'm a GOOGL investor and I find this offensive.
        
       | jagged-chisel wrote:
       | This would be a good way to back up your YouTube videos to YouTube
       | while avoiding Content ID.
        
       | [deleted]
        
       | Group_B wrote:
       | I was literally thinking of something like this a couple days
       | ago. Good timing!
        
       | aneil wrote:
       | Evil genius.
        
       | anyfoo wrote:
       | I only looked at the example video, but is the concept just "big
       | enough pixels"?
       | 
       | Would be neater (and much more efficient) to encode the data such
       | that it's exactly untouched by the compression algorithm, e.g. by
       | encoding the data in wavelets and possibly motion vectors that
       | the algorithm is known to keep[1].
       | 
       | Of course that would also be a lot of work, and likely fall apart
       | once the video is re-encoded.
       | 
       | [1] If that's what video encoding still does, I really have no
       | idea, but you get the point.
        
         | colejohnson66 wrote:
         | YouTube lets you download your uploaded videos. I've never
         | tested it, but supposedly it's the exact same file you
         | uploaded.[a] It probably wouldn't work with this "tool" as it
         | uses the video ID (so I assume it's downloading what clients
         | see, not the source), but it's an idea for some other variation
         | on this concept.
         | 
         | [a] That way, in the future, if there are any improvements to
         | the transcode process that make smaller files (different codec
         | or whatever), they still have the HQ source.
        
           | mod50ack wrote:
           | They may retain the original files, but they don't give that
           | back to you in the download screen. I just tested it by going
           | to the Studio screen to download a video I uploaded as a
           | ~50GB ProRes MOV file and getting back an ~84MB H264 MP4.
        
         | dheera wrote:
         | YT might still recompress your video, possibly using
         | proprietary algorithms that are not necessarily DCT based
        
           | anyfoo wrote:
           | As said, falls apart with re-encoding. But is a bit more
           | interesting than what is more or less QR codes.
        
             | jrochkind1 wrote:
             | I find it a bit more interesting to have something that
             | actually works on youtube, even if only as a proof of
             | concept.
        
         | bambax wrote:
         | Or, film pieces of paper in succession, in a clear enough
         | manner that they're still readable even when heavily
         | compressed.
        
           | ben174 wrote:
           | OH, i get it :)
        
         | NonNefarious wrote:
         | Back in the day, VCRs were commonly used as tape backup devices
         | for data.
         | 
         | Now studios are using motion-picture film to store data, since
         | it's known to be stable for a century or more.
        
         | softfalcon wrote:
         | Agree it would be cool to be "untouched" by the compression
         | algorithm, but that's nearly impossible with YouTube. YouTube
         | encodes down to several different versions of a video and on
         | top of that, several different codecs to support different
         | devices with different built-in video hardware decoders.
         | 
         | For example, when I upload a 4K vid and then watch the 4K
         | stream on my Mac vs my PC, I get different video files solely
         | based on the browser settings that can tell what OS I'm
         | running.
         | 
         | Handling this compression protection for so many different
         | codecs is likely not feasible.
        
           | anyfoo wrote:
           | Yes, but nothing is saying this has to work for every codec.
           | Since you want to retrieve the files using a special client,
           | you could pick the codec you like.
           | 
           | But (almost) nothing prevents YouTube from not serving that
           | particular codec anymore. This still pretty much falls under
           | the "re-encoding" case I mentioned which would make the whole
           | thing brittle anyway.
           | 
           | But it's indeed cool to think about. 8)
        
           | ALittleLight wrote:
           | What if you have an ML model that produces a vector from a
           | given image. You have a set of vectors that correspond to
           | bytes - for a simple example you have 256 "anchor vectors"
           | that correspond to any possible byte.
           | 
           | To compress an arbitrary sequence of bytes, for each
           | byte, you produce an image that your ML model would convert
           | to the corresponding anchor vector for that byte and add the
           | image as a frame in a video. Once all the bytes have been
           | converted to frames you then upload the video to YouTube.
           | 
           | To decompress the video you simply go frame by frame over the
           | video and send it to your model. Your model produces a vector
           | and you find which of your anchor vectors is the nearest
           | match. Even though YouTube will have compressed the video in
           | who knows what way, and even if YouTube's compression
           | changes, the resultant images in the video should look
           | similar, and if your anchors are well chosen and your model
           | works well, you should be able to tell which anchor a given
           | image is intended to correspond to.
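
A minimal Python sketch of the decoding half of this idea. The ML model itself is not reproduced: `make_anchors` and `simulate_lossy_channel` are hypothetical stand-ins that use random vectors and Gaussian noise in place of real model outputs and YouTube's re-encoding. The only part shown is the nearest-anchor lookup.

```python
import random


def make_anchors(dim=16, seed=0):
    """One anchor vector per byte value (random stand-in for model outputs)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(256)]


def nearest_byte(vector, anchors):
    """Return the byte whose anchor is closest (squared Euclidean distance)."""
    def dist2(anchor):
        return sum((x - y) ** 2 for x, y in zip(vector, anchor))
    return min(range(256), key=lambda b: dist2(anchors[b]))


def simulate_lossy_channel(data, anchors, noise=0.05, seed=1):
    """Assumption: re-encoding perturbs each frame's vector by small noise."""
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, noise) for x in anchors[b]] for b in data]


def decode(vectors, anchors):
    """Map each received vector back to the byte of its nearest anchor."""
    return bytes(nearest_byte(v, anchors) for v in vectors)


anchors = make_anchors()
received = simulate_lossy_channel(b"hello", anchors)
recovered = decode(received, anchors)
```

Decoding stays correct only while the perturbation is small compared with the spacing between anchors, which is the "if your anchors are well chosen" condition in the comment.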
        
             | [deleted]
        
             | rasguanabana wrote:
             | Why go that way? I'm no digital signal processing expert,
             | but images (and series thereof, i.e. videos) are 2D signals.
             | What we see is the spatial domain, and analyzing it pixel by
             | pixel is naive and won't get you very far.
             | 
             | What you need is to go to the frequency domain. From my own
             | experiments at university, the most significant image info
             | lies in the lowest frequencies. Cutting off everything above
             | the lowest 10% of frequencies leaves a very comprehensible
             | image with only wavy artifacts around objects. You have
             | plenty of bandwidth to use even if you want to embed info in
             | existing media.
             | 
             | Now here you have full bandwidth to use. Start with
             | frequency domain, set expectations of lowest bandwidth
             | you'll allow and set the coefficients of harmonic
             | components. Convert to spatial domain, upscale and you got
             | your video to upload. This should leave you with data
             | encoded in a way that should survive compression and
             | resizing. You'll just need to allow some room for that.
             | 
             | You could slap error correction codes on top.
             | 
             | If you think about it, you should consider video as - say -
             | copper wire or radio. We've come quite far transmitting
             | over these media without ML.
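
A one-dimensional Python sketch of the frequency-domain approach described above, under several assumptions not in the comment: a single row of 64 samples instead of a 2D frame, one bit per low-frequency DCT coefficient stored as its sign, and coarse quantization of the samples as a stand-in for lossy compression. The upscaling step is left out. The names `encode_row`, `quantize` and `decode_row` are illustrative only.

```python
import math

N = 64            # samples per row (assumption)
AMPLITUDE = 40.0  # coefficient magnitude that carries one bit (assumption)


def dct(signal):
    """Orthonormal DCT-II: spatial samples -> frequency coefficients."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(signal)))
    return out


def idct(coeffs):
    """Inverse of dct(): frequency coefficients -> spatial samples."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = coeffs[0] * math.sqrt(1.0 / n)
        s += sum(c * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k, c in enumerate(coeffs) if k > 0)
        out.append(s)
    return out


def encode_row(bits):
    """Store each bit as the sign of one low-frequency coefficient."""
    coeffs = [0.0] * N
    coeffs[0] = 128.0 * math.sqrt(N)  # DC term: mid-gray average brightness
    for k, bit in enumerate(bits, start=1):
        coeffs[k] = AMPLITUDE if bit else -AMPLITUDE
    return idct(coeffs)


def quantize(samples, step=4):
    """Stand-in for lossy compression: snap samples to multiples of step."""
    return [round(s / step) * step for s in samples]


def decode_row(samples, n_bits):
    """Read the bits back from the signs of the low-frequency coefficients."""
    return [1 if c > 0 else 0 for c in dct(samples)[1:n_bits + 1]]


bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
recovered = decode_row(quantize(encode_row(bits)), len(bits))
```

Only the 16 lowest non-DC frequencies carry data here, so the rest of the spectrum is the "room" the comment mentions. Error correction could be layered on top of the recovered bits.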
        
               | anyfoo wrote:
               | We started with that approach, by assuming that the
               | compression is wavelet based, and then purposefully
               | generating wavelets that we know survive the compression
               | process.
               | 
               | For the sake of this discussion, wavelets are pretty much
               | exactly that: A bunch of frequencies where the "least
               | important" (according to the algorithm) are cut out.
               | 
               | But that's pretty cool, seems like you've re-invented
               | JPEG without knowing it, so your understanding is solid!
        
             | anyfoo wrote:
             | That's essentially a variant of "bigger pixels". Just like
             | them, your algorithm cannot _guarantee_ that an unknown
             | codec will still make the whole thing perform adequately.
             | 
             | Even if you train your model to work best for all existing
             | codecs (I assume that's the "ML" part of the ML model), the
             | no free lunch theorem pretty much tells us that it can't
             | always perform well for codecs it does not know about.
             | 
             | (And so does entropy. Reducing to absurd levels, if your
             | codec results in only one pixel and the only color that
             | pixel can have is blue, then you'll only be able to encode
             | any information in the length of the video itself.)
        
           | rasguanabana wrote:
           | How about Fourier transform (or cosine, whichever works
           | best), and keep data as frequency components coefficients?
           | That's the rough idea behind digital watermarking. It
           | survives image transforms quite well.
        
       | layer8 wrote:
       | Back in the 90's I considered storing my backups as encrypted
       | steganographic or binary Usenet postings, as a kind of
       | decentralized backup, postings which would stick around long
       | enough for the next weekly backup. (Usenet providers had at least
       | a couple of weeks of retention time back then.)
        
       | accrual wrote:
       | I love that this is like tape in that it's a sequential access
       | medium. It's storing a tape-like data stream in a digital version
       | of what used to be tape itself (VHS).
        
         | layer8 wrote:
         | I believe YouTube supports random access, or otherwise you
         | wouldn't be able to jump around in a video. Youtube-dl also
         | supports resuming downloads in the middle, I believe.
        
           | ductsurprise wrote:
           | True... But guessing YouTubeDrive 'decoder' needs whole video
           | to get you back anything close to what you put in.
           | 
           | Otherwise each frame would have to have a ridiculous amount
           | of encoded overhead.
           | 
           | Ahh, NM, can't even see that working.
           | 
           | edit: Maybe a file table built from a specified first N
           | frames, that delivers a frameset/file map ...
           | 
           | Still nothing like skipping spots in a video. That relies on
           | key frames and time signatures.
           | 
           | Cool stuff nonetheless...
        
             | Dylan16807 wrote:
             | Why would you need a map or overhead?
             | 
             | Each frame gets the same amount of the file, about a
             | kilobyte. So each frame is basically a sector. You need to
             | read in a few extra frames to undo the compression, but
             | otherwise it's just like a normal filesystem. And reading
             | in a batch of sectors at once is normal for real drives
             | too.
             | 
             | Even if you did need the frames to be self-describing, you
             | could just toss a counter/offset in the top left corner for
             | less than 1% overhead.
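
The "frame as sector" view above can be sketched in Python. The figure of 1024 bytes per frame approximates the "about a kilobyte" in the comment. The two lead-in frames and the 4-byte counter are assumptions chosen for illustration, and `frames_for_range` and `counter_overhead` are hypothetical names.

```python
def frames_for_range(offset, length, bytes_per_frame=1024, lead_in=2):
    """Frames to fetch in order to read `length` bytes starting at `offset`.

    `lead_in` extra frames before the first data frame are included so a
    decoder can start from earlier frames when undoing video compression.
    """
    if length <= 0:
        return range(0)
    first = offset // bytes_per_frame
    last = (offset + length - 1) // bytes_per_frame
    return range(max(0, first - lead_in), last + 1)


def counter_overhead(counter_bytes=4, bytes_per_frame=1024):
    """Fraction of each frame spent on a self-describing frame counter."""
    return counter_bytes / bytes_per_frame


needed = list(frames_for_range(5000, 100))
overhead = counter_overhead()
```

Here a 100-byte read at offset 5000 falls entirely inside frame 4, so frames 2 to 4 are fetched, and a 4-byte counter costs roughly 0.4% of each frame.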
        
       | kringo wrote:
       | BEWARE: Once they clamp down and delete the files, you lose your
       | data.
       | 
       | Good technical experiment though!
        
         | netsharc wrote:
         | Since he's made ready-to-use software, yeah Google will
         | probably ban this quite quickly...
        
       | Annatar wrote:
       | This works on the same principle as the video backup system (VBS)
       | which we used in the 1980's and the early 1990's on our Commodore
       | Amigas: if I remember correctly, one three hour PAL/SECAM VHS
       | tape had a capacity of 130 MB. The entire hardware fit into a DB
       | 25 parallel port connector and was easily made by oneself with a
       | soldering iron and a few cheap parts.
       | 
       | https://www.youtube.com/watch?v=VcBY6PMH0Kg
       | 
       | SGI IRIX also had something conceptually similar to this
       | "YouTubeDrive" called HFS, the hierarchical filesystem, whose
       | storage was backed by tape rather than disk, but to the OS it was
       | just a regular filesystem like any other: applications like
       | ls(1), cp(1), rm(1) or any other saw no difference, but the
       | latency was high of course.
        
         | rahimnathwani wrote:
         | "one three hour PAL/SECAM VHS tape had a capacity of 130 MB"
         | 
         | This reminds me of the Danmere Backer.
         | 
         | "The entire hardware fit into a DB 25 parallel port connector
         | and was easily made by oneself with a soldering iron and a few
         | cheap parts."
         | 
         | This reminds me of the DIY versions of the Covox Speech Thing:
         | https://hackaday.com/2014/09/29/the-lpt-dac/
        
         | thought_alarm wrote:
         | That's how digital audio was originally recorded to tape back
         | in the 1970s and 80s: encode the data into a broadcast video
         | signal and record it using a VCR.
         | 
         | In the age of $5000 10 MB hard drives, this was the only
         | sensible way to work with the 600+ MB of data needed to master
         | a compact disc.
         | 
         | That's also where the ubiquitous 44.1 kHz sample rate comes
         | from. It was the fastest data rate that could be reliably
         | encoded into both NTSC and PAL broadcast signals. (For NTSC: 3
         | samples per scan line, 245 scan lines per field, 60 fields per
         | second = 44100 samples per second.)
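
The arithmetic can be checked in a few lines of Python. The NTSC figures are the ones in the comment. The PAL figures (294 usable lines at a 50 Hz rate) are not in the comment and are added here as the commonly cited counterpart, so treat them as an assumption.

```python
def samples_per_second(samples_per_line, lines_used, rate_hz):
    """Audio samples stored per second of recorded video."""
    return samples_per_line * lines_used * rate_hz


ntsc_rate = samples_per_second(3, 245, 60)  # figures from the comment
pal_rate = samples_per_second(3, 294, 50)   # assumption: commonly cited PAL figures
```

Both products come out to the same rate, which is the point of the comment: one sample rate that fits both television systems.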
        
         | ogurechny wrote:
         | 130 MB for the whole tape is not a lot. It equals a floppy
         | disk's throughput, which is probably not a coincidence. However,
         | basic soldering implies that the rest of the system acts like a
         | big software-defined DAC/ADC.
         | 
         | Dedicated controller could pack a lot more data, as in hobo
         | tape storage system: https://en.wikipedia.org/wiki/ArVid
        
       | geoffeg wrote:
       | This is great. I did something very similar with a laser printer
       | and a scanner many years ago. I wrote a script that generated
       | pages of colored blocks and spent some time figuring out how much
       | redundancy I needed on each page to account for the scanner's
       | resolution. I think I saw something similar here or on github a
       | few years ago.
        
         | lifthrasiir wrote:
         | Searching HN for "paper backup" gives a lot of existing
         | solutions, in fact too many that I don't know which one you
         | saw.
        
         | aaaaaaaaaaab wrote:
         | So you invented QR codes?
        
           | geoffeg wrote:
           | Overly complicated, color QR codes.
        
         | banana_giraffe wrote:
         | Reminds me of "Cauzin Softstrip", the format some computer
         | magazines used back in the day to distribute BASIC programs, or
         | even executables.
         | 
         | Random example from an issue of Byte:
         | 
         | https://archive.org/details/byte-magazine-1986-05/page/n432/...
        
       | daenz wrote:
       | How much data can you store if you embedded a picture-in-picture
       | file over a 10 minute video? I could totally see content creators
       | who do tutorials embedding project files in this way.
        
         | accrual wrote:
         | Would storing data as a 15 or 30 FPS QR code "video" be any
         | more useful? At a minimum one would gain a configurable amount
         | of error correction, and you could display it in the corner.
        
         | dsr_ wrote:
         | Back of the envelope estimate:
         | 
         | 4096 x 2160 x 24 x 60 is your theoretical max in bits/second,
         | about 12.7 billion.
         | 
         | Assume that to counter YouTube's compression we need 16x16
         | blocks of no more than 256 colors and 15 keyframes/second; that
         | reduces it to
         | 
         | 256 * 135 * 8 * 15 = 4.1 million bits/sec.
         | 
         | That's not too awful. Ten minutes of this would get you about
         | 300MB of data, which itself might be compressed.
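
The estimate can be reproduced in Python. Every figure below comes from the comment (4096x2160 frames, 24 bits per pixel, 60 frames per second, 16x16 blocks, 256 colors, 15 keyframes per second). Only the variable names are invented.

```python
WIDTH, HEIGHT = 4096, 2160

# Raw ceiling: every pixel carries 24 bits, 60 frames per second.
raw_bits_per_second = WIDTH * HEIGHT * 24 * 60

# Robust estimate: one 256-color (8-bit) value per 16x16 block,
# refreshed 15 times per second.
BLOCK = 16
blocks_per_frame = (WIDTH // BLOCK) * (HEIGHT // BLOCK)   # 256 * 135
robust_bits_per_second = blocks_per_frame * 8 * 15

# Ten minutes of video, converted from bits to bytes.
ten_minutes_bytes = robust_bits_per_second * 600 // 8
```

This gives a raw ceiling of about 12.7 billion bits per second, a robust rate of about 4.1 million bits per second, and about 311 MB for ten minutes.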
        
           | pstrateman wrote:
           | 4k video is almost always 3840x2160
        
             | kuschku wrote:
             | 4K consumer video is 3840x2160, 4K Cinema video is
             | 4096x2160.
             | 
             | Just like 2K consumer video is 1920x1080 and 2K Cinema
             | video is 2048x1080
        
         | behnamoh wrote:
         | "hope you enjoyed this video. btw, the source code used in this
         | tutorial is encoded in the video."
        
         | cush wrote:
         | Yeah seems way easier than adding a link in the description
        
           | daenz wrote:
           | Links die. As long as the video exists, the files that the
           | video uses will _always_ exist.
        
       | legitster wrote:
       | This reminds me of an old hacky product that would let you use
       | cheap VHS tapes as backup storage:
       | https://en.wikipedia.org/wiki/ArVid
       | 
       | You would hit Record on a VCR and the computer data would be
       | encoded as video data on the tape.
       | 
       | People are clever.
        
         | gibolt wrote:
         | Early games and software would be delivered on audio cassettes
         | that would then have to be 'played' in order to load your
         | software temporarily into the device, which could take minutes
         | 
         | edit: Video from the 8-bit Guy on how this worked -
         | https://www.youtube.com/watch?v=_9SM9lG47Ew
        
         | mobilene wrote:
         | This is old school. When I first wrote code back in the Stone
         | Age we used to store our stuff on cassette tape.
        
           | twh270 wrote:
           | You had cassette tape?? Lucky... I had to write my 1's and
           | 0's in the dirt with a stick.
           | 
           | Damn rain.
        
             | RedShift1 wrote:
             | You guys had dirt?
        
           | madengr wrote:
           | Ha ha, when I was a kid with my C64, I used my mom's old
           | reel-to-reel tape deck to store data.
           | 
           | I still have a C64 and tape drive.
           | 
           | There was a magazine in the 80's where you could scan in the
           | code with a bar code scanner.
        
           | Random_Person wrote:
           | I still have my Atari 400 and tape drive!
        
             | johnvega wrote:
             | My family had an Atari 400 with a tape drive. I remember
             | buying a tape with a game. We also used it for the BASIC
             | programming language and the Asteroids game using a
             | cartridge.
        
         | alar44 wrote:
         | That's not really that hacky, audio cassettes were used
         | forever, it's just a tape backup.
        
         | jhgb wrote:
         | I remember a similar solution that was marketed in a German
         | mail order catalogue in late 1990s. It could have been Conrad,
         | but I'm not 100% sure. I recall it being a USB peripheral,
         | though. (Maybe I could find more about it in time...)
        
         | philjohn wrote:
         | The Alesis ADAT 8 track digital audio recorders used SVHS tapes
         | as the medium - at the end of the day, it's just a spooled
         | magnetic medium, not hugely different conceptually than a hard
         | drive.
        
         | ben174 wrote:
         | Wow, 2GB on a standard tape. For the time, that's incredibly
         | efficient and cheap.
        
           | anyfoo wrote:
           | Yeah. Video, even old grainy VHS, had a pretty high
           | bandwidth. Even much more so with S-VHS, which did not become
           | super popular though. (I'm actually wondering whether the 2GB
           | figure was for S-VHS, not VHS. Didn't do the math and
           | wouldn't be surprised either way, though.)
        
         | gattilorenz wrote:
         | Yes! There were many such systems, LGR made a video for one of
         | them, also showing the interface (as in: hardware and GUI) for
         | the backup: https://youtu.be/TUS0Zv2APjU
        
       | danschumann wrote:
       | This reminds me of Blame! where humans are living like rats in
       | the belly of the machine. Lol, also reminds me of the GeoCities
       | days where we created 50 accounts to upload Dragon Ball Z videos.
        
       | ductsurprise wrote:
       | Could be a good and sneaky way to obfuscate encrypted message
       | transmissions?
        
       | mensetmanusman wrote:
       | Are the premium files stored as 4K?
        
       | bilekas wrote:
       | I absolutely love this idea. I need to dig more into the code,
       | but it's almost like using Twitter as a 'protocol', using YouTube
       | as storage.
       | 
       | So many ideas are flying to mind. Really creative.
        
       | shmatt wrote:
       | Reminds me of the old Wrapster[1] days
       | 
       | [1] https://www.cnet.com/tech/services-and-software/napster-
       | hack...
        
         | [deleted]
        
       | behnamoh wrote:
       | there was a story on HN a while ago in which someone stored
       | unlimited data in Google Sheets!
        
       | dahfizz wrote:
       | Does YouTube store and stream all videos losslessly? How does
       | this work otherwise?
        
         | kleer001 wrote:
         | things like redundancy and crc checks I assume
        
         | ezfe wrote:
         | The data is represented large enough on screen that compression
         | doesn't destroy it.
        
           | Beltalowda wrote:
           | E.g., similar to how a QR code stored as a JPEG will still
           | work fine.
        
         | [deleted]
        
         | LukeShu wrote:
         | No, YouTube is not lossless.
         | 
         | The video that is created in the example in the README is
         | https://www.youtube.com/watch?v=Fmm1AeYmbNU
         | 
         | We can see that data is encoded as "pixels" that are quite
         | large, being made up of many actual pixels in the video file. I
         | see quite bad compression artifacts, yet I can clearly make out
         | the pixels that would need to be clear to read the data. It
         | looks like the video was uploaded at 720p (1280x720), but the
         | data is encoded as a 64x36 "pixel" image of 8 distinct colors.
         | So lots of room for lossy compression before it's unreadable.
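
A Python sketch of the scheme as described in this comment: a 64x36 grid of cells, each showing one of 8 colors, so 3 bits per cell. The palette (the 8 corner colors of the RGB cube) and the noise model in `degrade` are assumptions, not taken from the project. For brevity the sketch works on one color per cell rather than rendering the 20x20-pixel blocks of a 1280x720 frame.

```python
import random

GRID_W, GRID_H = 64, 36
BITS_PER_CELL = 3  # 8 distinct colors
BYTES_PER_FRAME = GRID_W * GRID_H * BITS_PER_CELL // 8

# Assumption: use the 8 corners of the RGB cube as the palette.
PALETTE = [(r, g, b) for r in (0, 255) for g in (0, 255) for b in (0, 255)]


def encode_cells(data):
    """Turn bytes into a list of palette colors, 3 bits per cell."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % BITS_PER_CELL)  # pad to a whole cell
    return [PALETTE[int(bits[i:i + 3], 2)] for i in range(0, len(bits), 3)]


def degrade(cells, noise=40, seed=0):
    """Stand-in for compression artifacts: jitter every color channel."""
    rng = random.Random(seed)
    return [tuple(min(255, max(0, c + rng.randint(-noise, noise))) for c in cell)
            for cell in cells]


def decode_cells(cells, n_bytes):
    """Snap each cell to the nearest palette color and rebuild the bytes."""
    def nearest(cell):
        return min(range(8), key=lambda i: sum((a - b) ** 2
                                               for a, b in zip(cell, PALETTE[i])))
    bits = "".join(f"{nearest(cell):03b}" for cell in cells)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))


payload = bytes(range(256))
recovered = decode_cells(degrade(encode_cells(payload)), len(payload))
```

With these numbers one frame holds 864 bytes. Because palette colors sit far apart, a channel error of up to 40 levels still snaps back to the right color, which is the "room for lossy compression" the comment describes.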
        
           | [deleted]
        
         | [deleted]
        
         | martincmartin wrote:
         | Imagine a QR code that changes once every X milliseconds.
        
           | dahfizz wrote:
           | That's an excellent analogy, thank you.
        
       | derevaunseraun wrote:
       | This seems like something Cicada 3301 would use
       | 
       | I wonder how many random videos like this are floating around
       | that are encoding some super secret data...
        
       | advisedwang wrote:
       | Seems like a great way to get your account closed for abuse!
        
         | LewisVerstappen wrote:
         | You'd be surprised how much YouTube lets you upload.
         | 
         | I've been uploading 2-3 hours of content a day every day for
         | the past few years. On the same account too.
         | 
         | I have fewer than 10 subscribers lol.
        
           | deanCommie wrote:
           | How MUCH - yes - as long as it's videos, and it's not
           | violating copyright, you're probably not violating any Terms
           | of Service.
           | 
           | But I guarantee there is some clause in the ToS that this
           | project violates.
        
           | emptysongglass wrote:
           | Lucky you. I just posted my first two videos from a
           | conference that were banned within a day for violating
           | "Community Guidelines" without appeal.
        
             | [deleted]
        
           | c0balt wrote:
           | They let you sometimes get away with a lot more[0] ;)
           | 
           | [0]: https://www.youtube.com/watch?v=Olkb7fYSyiI
        
           | bityard wrote:
           | What kind of content do you upload? (Should "content" be in
           | air quotes? :P)
        
             | LewisVerstappen wrote:
             | Lol yeah.
             | 
             | It's just recordings of myself when I'm doing deep work. I
             | use OBS to stream my computer screen and a video recording
             | of myself (mostly me muttering to myself).
             | 
             | It helps me avoid getting distracted (I feel like I'm being
             | watched lol) and it's also interesting to check back if I
             | want to see what I was working on 3 months ago.
             | 
             | All the videos are unlisted or private.
        
               | nittanymount wrote:
               | wow, curious, are you keeping these videos there, or will
               | delete them after several months?
        
               | pcthrowaway wrote:
               | Are you screensharing while recording? What tooling do
               | you use to do this if so?
               | 
               | Also, any potential issues with Google having access to
               | proprietary code? I know the chance of any human at
               | Google interpreting your videos is near-zero but still
        
               | adolph wrote:
               | Isn't that what Twitch is for?
        
         | johndfsgdgdfg wrote:
         | Then the whole HN crowd would have enough outrage materials for
         | weeks. Seems like a win-win situation to me.
        
         | robotnikman wrote:
         | Another thread posted today makes it seem like they don't
         | really care
         | 
         | https://news.ycombinator.com/item?id=31488455
        
         | Manuel_D wrote:
         | If it becomes prevalent, I think YouTube would do something
         | like slightly randomize the compression in their videos to
         | dissuade this kind of use.
        
         | deckar01 wrote:
         | You could make it much harder to detect by synthesizing a
         | unique video with a DNN and hiding the data using traditional
         | steganography techniques.
        
           | Mockapapella wrote:
           | I think that video compression might make this not a viable
           | technique. Artifacts would destroy the hidden data, right?
        
             | bitexploder wrote:
             | That is what redundancy and error correcting codes are for.
             | It will reduce your data density, but I am sure you can
             | find parameters that preserve the data.
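
As a small illustration of the trade-off, here is a Python sketch using the simplest possible error-correcting code, a 3x repetition code with majority voting. This choice is an assumption made for brevity. A real system would more likely use something stronger, such as Reed-Solomon codes.

```python
def encode_repetition(bits, copies=3):
    """Repeat every bit `copies` times."""
    return [bit for bit in bits for _ in range(copies)]


def decode_repetition(received, copies=3):
    """Recover each bit by majority vote over its group of copies."""
    decoded = []
    for i in range(0, len(received), copies):
        group = received[i:i + copies]
        decoded.append(1 if sum(group) * 2 > len(group) else 0)
    return decoded


message = [1, 0, 1, 1]
sent = encode_repetition(message)

# Simulate artifacts: flip one bit inside each of the first three groups.
damaged = list(sent)
for index in (0, 4, 8):
    damaged[index] ^= 1

recovered = decode_repetition(damaged)
```

The message survives one flipped bit per group, at the cost of sending three times as many bits. That cost is the reduced data density mentioned above.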
        
             | upupandup wrote:
             | Couldn't you also embed data through sound? Upload a video
             | of a monkey at the zoo but you insert ultrasound with
             | encoded data.
             | 
             | something like this but far more mundane
             | 
             | https://www.youtube.com/watch?v=yLNpy62jIFk
        
               | bityard wrote:
               | > but you insert ultrasound with encoded data
               | 
               | Others in these comments have also suggested
               | steganography in both the video and audio streams. The
               | problem with that is that when you retrieve a video from
               | YouTube, you never get the original version back. You
               | only get a lossy re-encoded version, and the very
               | definition of lossy encoding is to toss out details that
               | humans can't (or wouldn't easily) perceive, including
               | ultra-sonic audio.
        
               | dotancohen wrote:
               | It might be ridiculous, but how about uploading a
               | computer-generated video of a human saying 0 and 1 very
               | quickly, to encode a binary file.
               | 
               | Or better yet, the file could be one third the size if
               | the human says the numbers 0 to 7.
        
             | snowwrestler wrote:
             | Unless you tuned the NN on the files you get back from
             | YouTube, so that it learns to encode the data in a way that
             | is always recoverable despite the artifacts.
        
             | throwaway92394 wrote:
             | Compression will limit the bandwidth of a given frame but
             | you can work around it.
             | 
             | Some forms of DRM are already essentially this: DRM that
             | is resistant to compression, and even to crappy camera
             | recordings from a theater, and that is essentially
             | steganography (you can't visually tell it's there) exists.
             | 
             | EDIT: "compression resistant watermark" is a good search
             | phrase if anyone is curious
        
       | umvi wrote:
       | Turns out any site that allows users to submit and retrieve data
       | can be abused in the same way:
       | 
       | - FacebookDrive: "Store files as base64 facebook posts"
       | 
       | - TwitterDrive: "Store files as base64 tweets"
       | 
       | - SoundCloudDrive: "Store files as mp3 audio"
       | 
       | - WikipediaDrive: "Store files in wikipedia article histories"
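
The base64 variants in this list can be sketched in Python with the standard library. The 280-character post length is an assumption for illustration, and `to_posts` and `from_posts` are hypothetical helpers that only produce and consume strings. No site API is called.

```python
import base64


def to_posts(data, post_len=280):
    """Base64-encode data and split the text into post-sized pieces."""
    text = base64.b64encode(data).decode("ascii")
    return [text[i:i + post_len] for i in range(0, len(text), post_len)]


def from_posts(posts):
    """Join the posts in order and decode back to the original bytes."""
    return base64.b64decode("".join(posts))


original = bytes(range(256)) * 4          # 1024 bytes of sample data
posts = to_posts(original)
restored = from_posts(posts)
```

Base64 turns every 3 bytes into 4 characters, so 1024 bytes become 1368 characters, or five posts at this length. The posts must be retrieved in order for decoding to work.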
        
         | jasonlotito wrote:
         | My friends and I had a joke called NSABox. It would send data
         | around using words that would attract the attention of the NSA,
         | and you could submit a FOIA request to recover the data. I
         | always found it amusing.
        
           | havblue wrote:
           | I've heard the LOIC (Low Orbit Ion Cannon) DoS tool described
           | as a shortcut to getting sent to jail. This sounds similar.
        
             | mechanical_bear wrote:
             | Big difference. LOIC actually impacts a target.
        
           | mickeyp wrote:
           | There's a feature in Emacs that does that (unsurprisingly.)
           | 
           | It's called `M-x spook'. It inserts random gibberish that NSA
           | and the Echelon project would've supposedly picked up back in
           | the 90s.
        
             | LukeShu wrote:
             | spook.el was "introduced at or before Emacs version 18.52".
             | And 18.52 was released in 1988. And spook.el in a comment
             | says
             | 
             |     ;; Created: May 1987
             | 
             | So the things that the NSA and ECHELON would have picked up
             | on back in the 1980s, not the 1990s :)
        
         | upupandup wrote:
         | What a great time to write botnets
        
         | itake wrote:
         | Back in the day when @gmail was famous for their massive free
         | storage for email, ppl wrote scripts to chunk large files and
         | store them as email attachments.
        
           | adzm wrote:
           | People did this on AOL in the 90s as well!
        
             | jprd wrote:
             | Did you manage to get on the latest Mass Mail going out
             | tonight?
        
             | RcouF1uZ4gsC wrote:
             | With AOL, in the early 90's you didn't even need to do
             | that. You could just reformat and reuse the floppy disks
             | they were always sending you for free storage.
        
           | ihaveajob wrote:
           | I know someone who published an academic paper on doing
           | exactly this.
        
             | IshKebab wrote:
             | Doesn't sound very noteworthy tbh. It's obviously possible
             | and the implementation is straightforward.
        
               | 867-5309 wrote:
               | sounds like 99% of academic papers
        
               | IshKebab wrote:
               | Most papers at least _sound_ like they're notable!
        
               | jraph wrote:
               | The less jam you have, the more you spread it out.
               | 
               | The opposite is also true. Brilliant ideas have led to
               | papers that can read as obvious and terribly unremarkable.
        
           | Grollicus wrote:
           | I used this as a backup target for the longest time. Simply
           | split the backup file into 10 MB chunks and send as mails to
           | a gmail account. Encrypted so no privacy problems. Rock solid
           | for years.
           | 
           | And as it was just storing emails it was even using gmail for
           | its intended purpose so no TOS problems..
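
           A minimal sketch of the chunking step described in the comment
           above. The helper names (split_chunks, join_chunks) and the exact
           chunk size are assumptions for illustration; encryption and the
           actual mailing step are deliberately left out.

```python
import hashlib
from typing import List

# 10 MB, matching the size mentioned in the comment (exact value is an assumption).
CHUNK_SIZE = 10 * 1024 * 1024


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Split data into consecutive chunks of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [data[i:i + size] for i in range(0, len(data), size)]


def join_chunks(chunks: List[bytes]) -> bytes:
    """Reassemble chunks that are supplied in their original order."""
    return b"".join(chunks)


if __name__ == "__main__":
    backup = bytes(range(256)) * 1000  # stand-in for a real backup file
    parts = split_chunks(backup, size=64_000)
    # Compare digests to confirm the round trip is lossless.
    restored = join_chunks(parts)
    assert hashlib.sha256(restored).digest() == hashlib.sha256(backup).digest()
    print(len(parts), "chunks")
```

           In practice each chunk would also need a sequence number (for
           example in the mail subject) so the order can be recovered.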
        
             | shon wrote:
             | Yup, did the exact same thing to back up all of the
             | Wordpress installs on a free server I ran for friends.
        
         | thrdbndndn wrote:
         | This is pretty tame compared to some actual, practical ones
         | such as https://github.com/apachecn/CDNDrive
         | 
         | For people who don't read Chinese: it encodes data into ~10M
         | blocks in PNG and then uploads (together with a metadata/index
         | file as an entry point) to various Chinese social media sites
         | that don't re-compress your images. I know people who have
         | used it to store* TBs upon TBs of data on them already.
         | 
         | *Of course, it would be foolish to think your data is even
         | remotely safe "storing" them this way. But it's a very good
         | solution for sharing large files.
        
         | behnamoh wrote:
         | also Telegram
        
         | [deleted]
        
         | WaxProlix wrote:
         | I wrote one of these as a POC when at AWS to store data sharded
         | across all the free namespaces (think Lambda names), with
         | pointers to the next chunk of data.
         | 
         | I like to think you could unify all of these into a FUSE
         | filesystem and just mount your transparent multi-cloud remote
         | FS as usual.
         | 
         | It's inefficient, but free! So you can have as much space as
         | you want. And it's potentially brittle, but free! So you can
         | replicate/stripe the data across as many providers as you want.
        
           | turtledove wrote:
           | I was an eng manager on Lambda for a time, and we definitely
           | knew people were doing this, and had plans to cut it out if
           | it ever became a problem. :D
        
             | WaxProlix wrote:
             | Yeah, you'd need to find some sort of auto-balancing to
             | detect this kind of bitrot from over-aggressive engineering
             | managers & their ilk and rebalance the data across other
             | sources. I think the multiple-shuffle-shard approach has
             | been done before, maybe we could steal some algo from a
             | RAID driver, or DynamoDB.
        
         | willcipriano wrote:
         | I made a tool that lets you store files anywhere you can store
         | a URL: https://podje.li/
        
           | metadat wrote:
           | Is there an import URLs button? Otherwise, how does one
           | reassemble the original?
        
             | willcipriano wrote:
             | Click them, it's really for things that fit into one or two
             | urls like small text files. I've used it for config files
             | that were getting formatted incorrectly over corporate
             | email that ate it as an attachment.
        
         | wging wrote:
         | See also https://github.com/qntm/base2048. "Base2048 is a
         | binary encoding optimised for transmitting data through
         | Twitter."
        
           | colinmhayes wrote:
           | Still need around 30,000 more unicode characters for this to
           | work.
        
             | wging wrote:
             | Sorry, I edited the post concurrently with your comment -
             | it now points to Base2048, the link I meant to post, which
             | actually should work - rather than
             | https://github.com/qntm/base65536 (which I think you're
             | commenting on).
        
             | theblazehen wrote:
             | > For transmitting data through Twitter, Base65536 is now
             | considered obsolete; see Base2048.
             | 
             | Source: https://github.com/qntm/base65536
        
         | the_duke wrote:
         | GitHub repos make for a pretty good key-value store.
         | 
         | It even has a full CRUD API, no need for using libgit.
        
         | mike00632 wrote:
         | I wonder if access permissions would be easier to maintain
         | using Facebook...
        
           | dheera wrote:
           | Until one day your base64 ciphertext just so happens to
           | contain a curse word and you get banned for violating
           | "community standards"
        
       | anonymousiam wrote:
       | Reminds me of this similar tool that exploited GMail the same
       | way: https://www.computerworld.com/article/2547891/google-hack--
       | u...
        
       | saint_angels wrote:
       | Reminds me of a guy who stored data in ping messages
       | https://youtu.be/JcJSW7Rprio
        
         | alanh wrote:
         | What part of the video discusses this? :D So far it's about
         | juggling chainsaws
         | 
         | Edit: OK, I see where this is going. Lol
        
         | bluedays wrote:
         | I watch these things and I begin to realize I'll never be as
         | intelligent as someone like this. It's good to know no matter
         | how much you've grown there is always a bigger fish.
        
           | qorrect wrote:
           | I agree that there will always be smarter fish, but you can
           | definitely be this smart; it just takes the proper motivation
           | ( or weird idea ) to wiggle its way into your brain.
        
       | msoad wrote:
       | Reminds me of the other post that used Facebook Messenger as
       | transport layer to get free internet in places where internet is
       | free if you use Facebook apps.
        
       | powerset wrote:
       | I wonder if something similar could be useful for transmitting
       | data optically, like an animated QR code. Maybe a good way to
       | transmit data over an air gap for the paranoid?
        
       | _trampeltier wrote:
       | This story from 2016 comes to my mind.
       | 
       | https://www.bbc.com/future/article/20160225-the-quest-to-sol...
        
       | kube-system wrote:
       | I can't wait until malware uses this as C2
        
         | Tijdreiziger wrote:
         | Seems pretty fragile. Google taking down your channel would be
         | enough to disarm your malware.
        
           | blibble wrote:
           | they worked around this years ago by generating the username
           | (domain name) based on some property of the current time
           | 
           | (plus using more than one tld)
        
         | vmception wrote:
         | Ipfs is decent enough or better with free pinning services
        
       | productceo wrote:
       | Imagine a free cloud storage, but you need to watch an ad every
       | time you download a file.
        
         | stingta wrote:
         | Wasn't that basically megaupload and its ilk?
        
         | rightbyte wrote:
         | I read that you did not download shady files from the interwebs
         | when that was a thing sane people actually did?
        
           | rationalfaith wrote:
        
         | [deleted]
        
       | das_keyboard wrote:
       | Wasn't there more or less recently on HN something like "Store
       | Data for free in DNS-Records"? Reminds me of this.
        
       | jtxt wrote:
       | Seems like it may be a decent "harder drive".
       | https://youtu.be/JcJSW7Rprio
        
       | metadat wrote:
       | Could youtube-dlp and YouTube Vanced now be hosted on.. YouTube?
       | 
       | I wonder how long it'd take for Google to crack down on the
       | system abuse.
       | 
       | Is it really abuse if the videos are viewable / playable?
       | Presumably the ToS either already forbids covert channel encoding
       | or soon will.
        
         | sevenf0ur wrote:
         | Probably breaks TOS under video spam
        
           | tenebrisalietum wrote:
           | Add a music track, it is now a psychedelic art video.
        
             | squarefoot wrote:
             | A music track in which the music happens to be FSK data
             | disguised as chiptune.
        
           | throwaway92394 wrote:
           | Just gotta add some good 'ol steganography
        
             | javajosh wrote:
             | This brings up an interesting question: what is the upper-
             | bound of hidden data density using video steganography?
             | E.g. how much extra data can you add before noticeable
             | degradation? It's interesting because it requires both a
             | detailed understanding of video encoding and also
             | understanding of human perception of video.
        
               | pbhjpbhj wrote:
               | I'd expect you could store more data steganographically
               | than the raw video data.
               | 
               | You can probably do things like add frames that can't be
               | decoded and so are skipped by a decoder; that effectively
               | allows arbitrary added hidden data. That's maybe
               | cheating.
               | 
               | If you stipulate that you can't already have a copy of
               | the unaltered file, and the data has to be extractable
               | from a pixel copy of the rendered frames ... that becomes
               | more interesting, I think.
        
               | samatman wrote:
               | I've seen drone metal videos where the video and audio
               | could both be 90% steganography and I wouldn't know the
               | difference.
        
             | alpaca128 wrote:
             | Good luck preserving it through YouTube's video
             | compression. It's super lossy with small details, in bad
             | cases the quality can visibly degrade to a point it looks
             | more like a corrupted low-res video file for a few seconds
             | (saw that once in a Tetris Effect gameplay video).
        
               | throwaway92394 wrote:
               | I mentioned it in another comment, but while that does
               | lower the bandwidth of a single frame, it's not actually
               | an issue. There are several DRM techniques that can survive
               | a crappy camera recording in a theater.
               | 
               | "compression resistant watermark" turns up some good
               | resources for it. QR codes are another good example of
               | noise tolerant data transmission (fun fact - having logos
               | in a QR code isn't part of the spec, you're literally
               | covering the QR code but the error-correction can handle
               | it).
               | 
               | The best way I can describe it is that humans can still
               | read text in compressed videos. The worse the
               | compression/noise the larger the text needs to be, but we
               | can still read it.
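
               The "larger text survives worse noise" point above can be
               sketched as follows. This is an illustration under loud
               assumptions: uniform random pixel noise is only a crude
               stand-in for real video compression, and the function names
               (encode, add_noise, decode) are hypothetical, not taken from
               YouTubeDrive.

```python
import random
from typing import List

Frame = List[List[int]]  # grayscale pixels, 0..255, as rows


def encode(bits: List[int], block: int) -> Frame:
    """Draw each bit as a block x block square (0 = black, 255 = white) in one row of squares."""
    frame = []
    for _ in range(block):
        row = []
        for bit in bits:
            row.extend([255 if bit else 0] * block)
        frame.append(row)
    return frame


def add_noise(frame: Frame, amplitude: int, rng: random.Random) -> Frame:
    """Assumed noise model: shift every pixel by up to +/- amplitude, clamped to 0..255."""
    return [
        [min(255, max(0, p + rng.randint(-amplitude, amplitude))) for p in row]
        for row in frame
    ]


def decode(frame: Frame, block: int) -> List[int]:
    """Average each square and threshold at mid-gray to recover the bit."""
    n_bits = len(frame[0]) // block
    bits = []
    for i in range(n_bits):
        total = sum(
            frame[y][x]
            for y in range(block)
            for x in range(i * block, (i + 1) * block)
        )
        bits.append(1 if total / (block * block) > 127.5 else 0)
    return bits


if __name__ == "__main__":
    rng = random.Random(0)
    message = [rng.randint(0, 1) for _ in range(64)]
    for block in (1, 8):
        noisy = add_noise(encode(message, block), 200, rng)
        errors = sum(a != b for a, b in zip(message, decode(noisy, block)))
        print(f"block={block}: {errors} bit errors")
```

               Averaging over a larger square trades data density for
               robustness, which is the same trade-off as making the text
               bigger.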
        
           | bliteben wrote:
           | yeah wonder how long until the ban, also bans all of your
           | descendants for 10 generations?
        
         | robonerd wrote:
         | If you put youtube-dlp on youtube as a video, make sure to use
         | youtube-dlp to it up.
        
         | throwaway0a5e wrote:
         | >Is it really abuse if the videos are viewable / playable?
         | Presumably the ToS either already forbids covert channel
         | encoding or soon will.
         | 
         | If creators start encoding their source and material into their
         | content Google would probably be fine with that because it
         | gives them data but also gives them context for that data.
         | 
         | Edit: I meant like "director's commentary" and "notes about
         | production" type stuff like you used to see added to DVDs back
         | in the day. Not "using youtube as my personal file storage".
         | Why is this such an unpopular opinion?
        
           | jklinger410 wrote:
           | > If creators start encoding their source material into their
           | files Google would probably be fine with that
           | 
           | Not true at all, lol. Google has a paid file storage
           | solution. YouTube is for streaming video and that's the
           | activity they expect on that platform. I couldn't imagine any
           | service designed for one format would "probably be fine" with
           | users encoding other files inside of that format.
        
             | pbhjpbhj wrote:
             | I think the parent comment is limiting themselves to the
             | embedding of metadata specific to the containing file. It
             | would be like adding a single frame, but would potentially
             | give useful information to Google. In those limited
             | circumstances I think the parent is correct.
        
           | baud147258 wrote:
           | > If creators start encoding their source material into their
           | files Google would probably be fine with that
           | 
           | it'd depend, as I don't think people using YT to store files
           | would watch a lot of ads
        
             | throwaway0a5e wrote:
             | If creators use it like the appendix in a book I can see
             | people watching ads on their way to it.
        
         | cush wrote:
         | It's one of those problems that resolves itself.
         | 
         | The process of creating and using the files is prohibitively
         | unusable and so many better solutions exist that YT doesn't
         | need to worry about it
        
       | freestorage wrote:
       | Years ago when Amazon had unlimited photo storage, you could
       | "hide" gigabytes of data behind a 1px gif (literally
       | concatenated together) so that it wouldn't count against your
       | quota.
        
         | xhrpost wrote:
         | They still do if you pay for Prime. I was surprised to see that
         | even RAW files (which are uncompressed and quite large) were
         | uploaded and stored with no issues. Not the same as "hiding"
         | data but might still be possible.
        
           | karamanolev wrote:
           | In the interest of technical correctness, RAW files are
           | frequently compressed and even lossily compressed. For
           | example, Sony's RAW compression was only lossy until very
           | recent cameras.
           | 
           | Given that there are the options for uncompressed, lossy
           | compressed and lossless compressed, I'd say RAW refers to
           | the stage of the data processing where capture is being
           | done and doesn't imply anything about the type of
           | compression.
           | 
           | What is relevant is that the formats vary widely between
           | manufacturers, camera lines and individual cameras, so unlike
           | JPEG, it's really hard to create a storage service that
           | compresses RAW files further after uploading in a meaningful
           | way. So anything they do needs to losslessly compress the
           | file.
        
           | netsharc wrote:
           | I guess you can store 24 bits of data as the R,G and B
           | components of a pixel of an "image", and store it as a
           | lossless image...
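
           A minimal sketch of that idea, assuming binary PPM (P6) as the
           lossless container because it needs no third-party library; the
           comment does not name a format. The function names and the
           4-byte length prefix are assumptions for illustration.

```python
def bytes_to_ppm(data: bytes, width: int = 64) -> bytes:
    """Pack data into a binary PPM (P6) image, three payload bytes per RGB pixel."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Length prefix (assumed layout) so the zero padding can be stripped on decode.
    payload = len(data).to_bytes(4, "big") + data
    row_bytes = width * 3
    padded_len = -(-len(payload) // row_bytes) * row_bytes  # round up to whole rows
    payload = payload.ljust(padded_len, b"\x00")
    height = padded_len // row_bytes
    header = b"P6\n%d %d\n255\n" % (width, height)
    return header + payload


def ppm_to_bytes(image: bytes) -> bytes:
    """Recover the original bytes from an image produced by bytes_to_ppm."""
    # The header written above has exactly three newline-terminated fields.
    magic, _dims, maxval, pixels = image.split(b"\n", 3)
    if magic != b"P6" or maxval != b"255":
        raise ValueError("not an image written by bytes_to_ppm")
    length = int.from_bytes(pixels[:4], "big")
    return pixels[4:4 + length]


if __name__ == "__main__":
    secret = b"any file contents\n\x00\xff"
    image = bytes_to_ppm(secret, width=4)
    assert ppm_to_bytes(image) == secret
```

           This only survives if the host keeps the image byte-for-byte; any
           lossy re-encode destroys the payload.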
        
       | flaque wrote:
       | See also RedditFS: https://github.com/maxchehab/redditfs
        
       | kebman wrote:
       | Are there any examples? I'd love to see such a YouTube video...
       | :p
        
       | Jimmc414 wrote:
       | Very cool. I wonder how difficult it would be to present a real
       | watchable video to the viewer, albeit low quality, with the
       | file embedded steganographically. I think a risk of this tech is
       | that if it takes off, YT might easily adjust the algorithms to
       | remove unwatchable videos. Perhaps leaving a watchable video
       | could grant it more persistence than an obvious data stream.
        
         | ragingglow wrote:
         | Sure, but the more structure your video has to have, the harder
         | it becomes to hide information steganographically within it.
         | Your information density will become very low I think.
        
       | 8K832d7tNmiQ wrote:
       | I remember seeing this first discussed at 4chan /g/ board as a
       | joke whether or not they can abuse Youtube's unlimited file size
       | upload limit, then escalated into a proof of concept shown in the
       | repo :)
        
         | marginalia_nu wrote:
         | This is a tangent. I must have been maybe 15-16 at the time, so
         | somewhere around 20 years ago: One of the first pieces of
         | software I remember building was a POP3 server that served
         | files, that you could download using an email client where they
         | would show up as attachments.
         | 
         | Incredibly bizarre idea. I'm not sure who I thought would
         | benefit from this. I guess I got swept up in RFC1939 and needed
         | to build... something.
        
           | babanin wrote:
           | On my first job (in the beginning of the millennium) there
           | was a limit on files you could download, something around
           | 5Mb. If you wanted to download something bigger, you had to
           | ask sysadmins to do that and wait... That was really
           | annoying. So my colleague and I ended up writing a service
           | that could download a file to local storage, chop it into
           | multiple 5Mb attachments and send multiple emails to the
           | requestor.
           | 
           | After some time the limit on a single file was removed, but
           | a daily limit of 100Mb was set up. The trick is that POP3
           | traffic wasn't accounted for, so we continued to use our
           | "service".
        
             | hiq wrote:
             | I couldn't download .exe files at some $CORPORATION. They
             | had to be whitelisted or something, and the download just
             | wouldn't work otherwise. But once you had the .exe you
             | could run it just fine. You just had to ping some IT person
             | to be able to retrieve your .exe.
             | 
             | Of course it was still possible to browse the internet and
             | visualize arbitrary text, so splitting the .exe into
             | base64-encoded chunks and uploading them on GitHub from
             | another computer was working perfectly fine... I briefly
             | argued against these measures, given how unlikely they are
             | to prevent any kind of threat, but they're probably still
             | in place.
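
             A minimal sketch of the base64 splitting described above,
             assuming hypothetical helper names and an arbitrary chunk
             length; how the text chunks get published and fetched is left
             out.

```python
import base64
from typing import Iterable, List


def to_text_chunks(data: bytes, chars_per_chunk: int = 4000) -> List[str]:
    """Base64-encode data and cut the resulting ASCII text into fixed-size pieces."""
    if chars_per_chunk <= 0:
        raise ValueError("chars_per_chunk must be positive")
    text = base64.b64encode(data).decode("ascii")
    return [text[i:i + chars_per_chunk] for i in range(0, len(text), chars_per_chunk)]


def from_text_chunks(chunks: Iterable[str]) -> bytes:
    """Concatenate the pieces in order and decode them back to the original bytes."""
    return base64.b64decode("".join(chunks))


if __name__ == "__main__":
    binary = b"MZ\x90\x00" + bytes(range(256))  # stand-in for a real binary
    pieces = to_text_chunks(binary, chars_per_chunk=64)
    assert from_text_chunks(pieces) == binary
    print(len(pieces), "text chunks")
```

             Base64 inflates the data by about a third, which is the price
             of moving binary through a text-only channel.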
        
           | behnamoh wrote:
           | apparently e-mail is not very reliable for storing/keeping
           | files. there have been cases where an old email with an
           | attachment would not load correctly because the servers just
           | erased the attachment file.
        
             | marginalia_nu wrote:
             | This was a custom email server though, there never were any
             | emails, it just presented files as though they were so that
             | a client would download them.
             | 
             | Actually caused some problems for email clients, as they
             | usually assumed emails were small. I got a few of them to
             | crash with 200 Mb "attachments" (although this was in the
             | early 00s, 200Mb was bigger than it is today).
        
               | qorrect wrote:
               | I'm still confused on how this worked, did you email some
               | address and get a reply with the attachment ?
        
               | mjochim wrote:
               | Since GP says it was a POP3 server, I suppose you would
               | set up an email account in your client with its inbox
               | server pointing to that POP3 server. When the client
               | requests the content of the inbox, the server responds
               | with a list of "emails" that are really just files with
               | some email header slapped on; so your email client's
               | inbox window essentially becomes a file browser.
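
               The "files with some email header slapped on" step might
               look roughly like this using Python's standard email
               package. This is a sketch, not the original server: the
               addresses, subject convention and function names are
               assumptions, and the POP3 protocol handling itself is
               omitted.

```python
import email
import email.policy
from email.message import EmailMessage
from typing import Tuple


def file_as_email(filename: str, data: bytes) -> bytes:
    """Wrap a file in a MIME message so a mail client shows it as an attachment."""
    msg = EmailMessage()
    msg["From"] = "files@example.invalid"   # placeholder address (assumption)
    msg["To"] = "reader@example.invalid"    # placeholder address (assumption)
    msg["Subject"] = filename               # assumed convention: subject is the file name
    msg.set_content("File served as an email attachment.")
    msg.add_attachment(
        data,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return msg.as_bytes()


def extract_file(raw: bytes) -> Tuple[str, bytes]:
    """Parse a message built by file_as_email and return (filename, contents)."""
    msg = email.message_from_bytes(raw, policy=email.policy.default)
    for part in msg.iter_attachments():
        return part.get_filename(), part.get_content()
    raise ValueError("no attachment found")


if __name__ == "__main__":
    raw = file_as_email("notes.bin", b"\x00\x01\x02 binary payload")
    name, contents = extract_file(raw)
    print(name, len(contents))
```

               A POP3 server built this way would return such a message for
               each RETR request, one per file.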
        
             | Gigachad wrote:
             | Interestingly, if you take a look at your emails from a few
             | years ago, most of the non attached images will fail to
             | load now.
        
       | Saint_Genet wrote:
       | Makes me wonder how many video and image upload sites are now
       | used as easily accessible number stations these days
        
         | adolph wrote:
         | Probably not many. The advantage of plain old-fashioned radio
         | is that the station doesn't keep track of the receivers.
         | Whoever watches a YouTube numbers station is tracked six ways
         | to Sunday.
        
       | INTPenis wrote:
       | I like this. The last wave of Twitter users into the fediverse
       | caused my AWS bill to go up 10 USD a month. Might have to start
       | storing media files on youtube instead ;)
        
       | jimmydeans wrote:
       | I remember a project that was doing this with photo files and
       | unlimited picture storage.
        
       | sunlite99 wrote:
       | How will you prevent youtube from re-encoding the video and data
       | getting thrashed?
        
         | tenebrisalietum wrote:
         | Make the boxes bigger.
        
       | take_it_not wrote:
       | I'm thinking maybe we can divide files into pieces and turn each
       | piece into a QR code then turn each QR code into a single frame?
        
       | musicale wrote:
       | It's all fun and games until your files start getting DMCA
       | takedowns.
        
       | abadaba wrote:
       | Are there any services out there that combine all of these "Store
       | files as XYZ" into some kind of raid config?
       | 
       | Would be interesting if you could treat each service (Youtube,
       | Docs, Reddit, Messenger, etc) as a "disk" and stripe your data
       | across them.
        
         | [deleted]
        
       | [deleted]
        
       | AdriaanvRossum wrote:
       | How many kilobytes would it be possible to store per minute of
       | video?
        
       | lb1lf wrote:
       | -Back in the day when file sharing was new, I won two rounds of
       | beer from my friends in university - the first after I tried what
       | I dubbed hardcore backups (Tarred, gzipped and pgp'd an archive,
       | slapped an avi header on it, renamed it
       | britney_uncensored_sex_tape[XXX].avi or something similar, then
       | shared it on WinMX assuming that as hard drive space was free and
       | teenage boys were teenage boys, at least some of those who
       | downloaded it would leave it to share even if the file claimed to
       | be corrupt).
       | 
       | It worked a charm.
       | 
       | Second round? A year later, when the archive was still available
       | from umpteen hosts.
       | 
       | For all I know, it still languishes on who knows how many old
       | hard drives...
        
         | marginalia_nu wrote:
         | Poor guys, still looking for the right codec to play the
         | britney tape they downloaded 28 years ago.
        
         | jjice wrote:
         | That's a perfect college CS story. Beer and bastardized files -
         | what a combo!
        
       ___________________________________________________________________
       (page generated 2022-05-24 23:00 UTC)