[HN Gopher] YouTubeDrive: Store files as YouTube videos
___________________________________________________________________
YouTubeDrive: Store files as YouTube videos
Author : notamy
Score : 383 points
Date : 2022-05-24 17:31 UTC (5 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| smm11 wrote:
| I suspect people in my office who send everything as a Word
| attachment with an image, PPT, Excel workbook, etc., embedded,
| are doing this unknowingly.
|
| There are even Word files I've found that have complete file path
| notation to ZIP files.
| xhrpost wrote:
| Not immediately obvious from the Readme, but does this rely on YT
| always saving and providing a download of the original un-altered
| video file? If not, then it must be saving the data in a manner
| that is retrievable even after compression and re-encoding, which
| is very interesting.
| [deleted]
| bspear wrote:
| Fascinating
| fronterablog wrote:
| I'm a GOOGL investor and I find this offensive.
| jagged-chisel wrote:
| This would be a good way to backup your YouTube videos to YouTube
| while avoiding Content ID.
| [deleted]
| Group_B wrote:
| I was literally thinking of something like this a couple days
| ago. Good timing!
| aneil wrote:
| Evil genius.
| anyfoo wrote:
| I only looked at the example video, but is the concept just "big
| enough pixels"?
|
| Would be neater (and much more efficient) to encode the data such
| that it's exactly untouched by the compression algorithm, e.g. by
| encoding the data in wavelets and possibly motion vectors that
| the algorithm is known to keep[1].
|
| Of course that would also be a lot of work, and likely fall apart
| once the video is re-encoded.
|
| [1] If that's what video encoding still does, I really have no
| idea, but you get the point.
| colejohnson66 wrote:
| YouTube lets you download your uploaded videos. I've never
| tested it, but supposedly it's the exact same file you
| uploaded.[a] It probably wouldn't work with this "tool" as it
| uses the video ID (so I assume it's downloading what clients
| see, not the source), but it's an idea for some other variation
| on this concept.
|
| [a] That way, in the future, if there are any improvements to the
| transcode process that make smaller files (different codec or
| whatever), they still have the HQ source
| mod50ack wrote:
| They may retain the original files, but they don't give that
| back to you in the download screen. I just tested it by going
| to the Studio screen to download a video I uploaded as a
| ~50GB ProRes MOV file and getting back an ~84MB H264 MP4.
| dheera wrote:
| YT might still recompress your video, possibly using
| proprietary algorithms that are not necessarily DCT based
| anyfoo wrote:
| As said, falls apart with re-encoding. But is a bit more
| interesting than what is more or less QR codes.
| jrochkind1 wrote:
| I find it a bit more interesting to have something that
| actually works on youtube, even if only as a proof of
| concept.
| bambax wrote:
| Or, film pieces of paper in succession, in a clear enough
| manner that they're still readable even when heavily
| compressed.
| ben174 wrote:
| OH, i get it :)
| NonNefarious wrote:
| Back in the day, VCRs were commonly used as tape backup devices
| for data.
|
| Now studios are using motion-picture film to store data, since
| it's known to be stable for a century or more.
| softfalcon wrote:
| Agree it would be cool to be "untouched" by the compression
| algorithm, but that's nearly impossible with YouTube. YouTube
| encodes down to several different versions of a video and on
| top of that, several different codecs to support different
| devices with different built-in video hardware decoders.
|
| For example, when I upload a 4K vid and then watch the 4K
| stream on my Mac vs my PC, I get different video files solely
| based on the browser settings that can tell what OS I'm
| running.
|
| Handling this compression protection for so many different
| codecs is likely not feasible.
| anyfoo wrote:
| Yes, but nothing is saying this has to work for every codec.
| Since you want to retrieve the files using a special client,
| you could pick the codec you like.
|
| But (almost) nothing prevents YouTube from not serving that
| particular codec anymore. This still pretty much falls under
| the "re-encoding" case I mentioned which would make the whole
| thing brittle anyway.
|
| But it's indeed cool to think about. 8)
| ALittleLight wrote:
| What if you have an ML model that produces a vector from a
| given image. You have a set of vectors that correspond to
| bytes - for a simple example you have 256 "anchor vectors"
| that correspond to any possible byte.
|
| To compress an arbitrary sequence of bytes, for each
| byte, you produce an image that your ML model would convert
| to the corresponding anchor vector for that byte and add the
| image as a frame in a video. Once all the bytes have been
| converted to frames you then upload the video to YouTube.
|
| To decompress the video you simply go frame by frame over the
| video and send it to your model. Your model produces a vector
| and you find which of your anchor vectors is the nearest
| match. Even though YouTube will have compressed the video in
| who knows what way, and even if YouTube's compression
| changes, the resultant images in the video should look
| similar, and if your anchors are well chosen and your model
| works well, you should be able to tell which anchor a given
| image is intended to correspond to.
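The nearest-anchor decoding described above can be sketched as follows. This is only an illustration under loud assumptions: the "model" is replaced by an identity function, the anchors are hypothetical 8-dimensional vectors of +1/-1 (one component per bit of the byte), and the lossy channel is plain bounded noise rather than anything YouTube actually does.

```python
import random

DIM = 8

# Hypothetical anchors: byte b maps to a vector with +1.0 where bit k is set, else -1.0.
ANCHORS = [[1.0 if (b >> k) & 1 else -1.0 for k in range(DIM)] for b in range(256)]


def render(byte):
    """Produce a 'frame' for a byte. Here the frame is simply the anchor vector."""
    return list(ANCHORS[byte])


def embed(frame):
    """Stand-in for the ML model: an identity mapping (assumption, not a real model)."""
    return frame


def lossy_channel(frame, rng, amplitude=0.4):
    """Stand-in for re-encoding: add bounded noise to every component."""
    return [x + rng.uniform(-amplitude, amplitude) for x in frame]


def nearest_anchor(vec):
    """Return the byte whose anchor has the smallest squared distance to vec."""
    return min(
        range(256),
        key=lambda b: sum((v - a) ** 2 for v, a in zip(vec, ANCHORS[b])),
    )


def decode(frames):
    return bytes(nearest_anchor(embed(f)) for f in frames)


rng = random.Random(1)
message = b"hello youtube"
received = [lossy_channel(render(b), rng) for b in message]
print(decode(received))
```

As long as the distortion stays smaller than half the distance between anchors, the nearest match is the intended byte. How well that holds for a real model and a real codec is exactly the open question raised in the replies.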
| [deleted]
| rasguanabana wrote:
| Why go that way. I'm no digital signal processing expert,
| but images (and series thereof, i.e videos) are 2D signals.
| What we see is spatial domain and analyzing pixel by pixel
| is naive and won't get you very far.
|
| What you need is to go to the frequency domain. From my own
| experiments in university times, the most significant image info
| lies in the lowest frequencies. Cutting off frequencies higher
| than 10% of lowest leaves very comprehensible image with
| only wavey artifacts around objects. You have plenty of
| bandwidth to use even if you want to embed info in existing
| media.
|
| Now here you have full bandwidth to use. Start with
| frequency domain, set expectations of lowest bandwidth
| you'll allow and set the coefficients of harmonic
| components. Convert to spatial domain, upscale and you got
| your video to upload. This should leave you with data
| encoded in a way that should survive compression and
| resizing. You'll just need to allow some room for that.
|
| You could slap error correction codes on top.
|
| If you think about it, you should consider video as - say -
| copper wire or radio. We've come quite far transmitting
| over these media without ML.
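A rough one-dimensional sketch of the frequency-domain idea above: put one bit into the sign of each of a few low-frequency DCT coefficients, convert to the spatial domain, pass the row through a crude lossy channel, and read the signs back. The row length, the choice of coefficients, the amplitude and the "channel" (a low-pass filter plus rounding to integers) are all assumptions made for illustration; a real codec behaves differently.

```python
import math

N = 64                    # samples in one row of "pixels"
DATA_BINS = range(1, 9)   # low-frequency coefficients, one bit each (assumption)
AMPLITUDE = 40.0          # coefficient magnitude used to signal a bit (assumption)
KEEP = 16                 # the toy channel keeps only the first 16 coefficients


def _scale(k):
    return math.sqrt((1 if k == 0 else 2) / N)


def dct(x):
    """Orthonormal DCT-II of a length-N sequence."""
    return [
        _scale(k) * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
        for k in range(N)
    ]


def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    return [
        sum(_scale(k) * coeffs[k] * math.cos(math.pi * (n + 0.5) * k / N) for k in range(N))
        for n in range(N)
    ]


def encode_row(bits):
    coeffs = [0.0] * N
    coeffs[0] = 128 * math.sqrt(N)          # DC term: mean brightness of 128
    for k, bit in zip(DATA_BINS, bits):
        coeffs[k] = AMPLITUDE if bit else -AMPLITUDE
    return idct(coeffs)


def lossy_channel(row):
    """Toy stand-in for compression: drop high frequencies, round to integers."""
    coeffs = dct(row)
    coeffs[KEEP:] = [0.0] * (N - KEEP)
    return [round(v) for v in idct(coeffs)]


def decode_row(row):
    coeffs = dct(row)
    return [1 if coeffs[k] > 0 else 0 for k in DATA_BINS]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
print(decode_row(lossy_channel(encode_row(bits))))
```

The amplitude is chosen so that rounding error cannot flip a sign; against stronger distortion you would raise the amplitude, use fewer coefficients, or add error correction on top, as the comment suggests.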
| anyfoo wrote:
| We started with that approach, by assuming that the
| compression is wavelet based, and then purposefully
| generating wavelets that we know survive the compression
| process.
|
| For the sake of this discussion, wavelets are pretty much
| exactly that: A bunch of frequencies where the "least
| important" (according to the algorithm) are cut out.
|
| But that's pretty cool, seems like you've re-invented
| JPEG without knowing it, so your understanding is solid!
| anyfoo wrote:
| That's essentially a variant of "bigger pixels". Just like
| them, your algorithm cannot _guarantee_ that an unknown
| codec will still make the whole thing perform adequately.
|
| Even if you train your model to work best for all existing
| codecs (I assume that's the "ML" part of the ML model), the
| no free lunch theorem pretty much tells us that it can't
| always perform well for codecs it does not know about.
|
| (And so does entropy. Reducing to absurd levels, if your
| codec results in only one pixel and the only color that
| pixel can have is blue, then you'll only be able to encode
| any information in the length of the video itself.)
| rasguanabana wrote:
| How about Fourier transform (or cosine, whichever works
| best), and keep data as frequency components coefficients?
| That's the rough idea behind digital watermarking. It
| survives image transforms quite well.
| layer8 wrote:
| Back in the 90's I considered storing my backups as encrypted
| steganographic or binary Usenet postings, as a kind of
| decentralized backup, postings which would stick around long
| enough for the next weekly backup. (Usenet providers had at least
| a couple of weeks of retention time back then.)
| accrual wrote:
| I love that this is like tape in that it's a sequential access
| medium. It's storing a tape-like data stream in a digital version
| of what used to be tape itself (VHS).
| layer8 wrote:
| I believe YouTube supports random access, or otherwise you
| wouldn't be able to jump around in a video. Youtube-dl also
| supports resuming downloads in the middle, I believe.
| ductsurprise wrote:
| True... But guessing YouTubeDrive 'decoder' needs whole video
| to get you back anything close to what you put in.
|
| Otherwise each frame would have to have a ridiculous amount
| of encoded overhead.
|
| Ahh, NM can't even see that working.
|
| edit: Maybe a file table built from a specified first N
| frames, that delivers frameset/file map ...
|
| Still nothing like skipping spots in a video. That relies on
| key frames and time signatures.
|
| Cool stuff nonetheless...
| Dylan16807 wrote:
| Why would you need a map or overhead?
|
| Each frame gets the same amount of the file, about a
| kilobyte. So each frame is basically a sector. You need to
| read in a few extra frames to undo the compression, but
| otherwise it's just like a normal filesystem. And reading
| in a batch of sectors at once is normal for real drives
| too.
|
| Even if you did need the frames to be self-describing, you
| could just toss a counter/offset in the top left corner for
| less than 1% overhead.
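The "each frame is basically a sector" idea above can be sketched as a simple offset calculation. The payload size of 1024 bytes per frame is an assumption standing in for "about a kilobyte", and the function name is made up for this example.

```python
FRAME_PAYLOAD_BYTES = 1024  # assumption: "about a kilobyte" per frame


def frames_for_range(offset, length, frame_bytes=FRAME_PAYLOAD_BYTES):
    """Return the frame indices that hold bytes [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // frame_bytes
    last = (offset + length - 1) // frame_bytes
    return range(first, last + 1)


# Reading 100 bytes that straddle a frame boundary touches two frames.
print(list(frames_for_range(1000, 100)))
```

A reader would still have to fetch a few frames before the first one (back to the previous keyframe) to undo the video compression, as the comment notes.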
| kringo wrote:
| BEWARE: This only works until they clamp down and delete the files;
| then you lose your data.
|
| Good technical experiment though!
| netsharc wrote:
| Since he's made ready-to-use software, yeah Google will
| probably ban this quite quickly...
| Annatar wrote:
| This works on the same principle as the video backup system (VBS)
| which we used in the 1980's and the early 1990's on our Commodore
| Amigas: if I remember correctly, one three hour PAL/SECAM VHS
| tape had a capacity of 130 MB. The entire hardware fit into a DB
| 25 parallel port connector and was easily made by oneself with a
| soldering iron and a few cheap parts.
|
| https://www.youtube.com/watch?v=VcBY6PMH0Kg
|
| SGI IRIX also had something conceptually similar to this
| "YouTubeDrive" called HFS, the hierarchical filesystem, whose
| storage was backed by tape rather than disk, but to the OS it was
| just a regular filesystem like any other: applications like
| ls(1), cp(1), rm(1) or any other saw no difference, but the
| latency was high of course.
| rahimnathwani wrote:
| "one three hour PAL/SECAM VHS tape had a capacity of 130 MB"
|
| This reminds me of the Danmere Backer.
|
| "The entire hardware fit into a DB 25 parallel port connector
| and was easily made by oneself with a soldering iron and a few
| cheap parts."
|
| This reminds me of the DIY versions of the Covox Speech Thing:
| https://hackaday.com/2014/09/29/the-lpt-dac/
| thought_alarm wrote:
| That's how digital audio was originally recorded to tape back
| in the 1970s and 80s: encode the data into a broadcast video
| signal and record it using a VCR.
|
| In the age of $5000 10 MB hard drives, this was the only
| sensible way to work with the 600+ MB of data needed to master
| a compact disc.
|
| That's also where the ubiquitous 44.1 kHz sample rate comes
| from. It was the fastest data rate that could be reliably encoded
| into both NTSC and PAL broadcast signals. (For NTSC: 3 samples
| per scan line, 245 scan lines per field, 60 fields per second =
| 44100 samples per second.)
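The arithmetic above is easy to check. The NTSC figures are the ones quoted in the comment; the PAL figures (294 usable lines at a 50 Hz rate) are the commonly cited counterpart and are not from this thread.

```python
def sample_rate(samples_per_line, usable_lines, rate_hz):
    """Samples per second when each usable scan line carries a fixed number of samples."""
    return samples_per_line * usable_lines * rate_hz


ntsc = sample_rate(3, 245, 60)   # figures quoted in the comment
pal = sample_rate(3, 294, 50)    # commonly cited PAL figures (not from the thread)

print(ntsc, pal)
```

Both products come to the same value, which is why one sample rate could serve both video standards.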
| ogurechny wrote:
| 130 MB for the whole tape is not a lot. It equals a floppy
| disk throughput, which is probably not a coincidence. However,
| basic soldering implies that the rest of the system acts like a
| big software-defined DAC/ADC.
|
| Dedicated controller could pack a lot more data, as in hobo
| tape storage system: https://en.wikipedia.org/wiki/ArVid
| geoffeg wrote:
| This is great. I did something very similar with a laser printer
| and a scanner many years ago. I wrote a script that generated
| pages of colored blocks and spent some time figuring out how much
| redundancy I needed on each page to account for the scanner's
| resolution. I think I saw something similar here or on github a
| few years ago.
| lifthrasiir wrote:
| Searching HN for "paper backup" gives a lot of existing
| solutions, in fact too many that I don't know which one you
| saw.
| aaaaaaaaaaab wrote:
| So you invented QR codes?
| geoffeg wrote:
| Overly complicated, color QR codes.
| banana_giraffe wrote:
| Reminds me of "Cauzin Softstrip", the format some computer
| magazines used back in the day to distribute BASIC programs, or
| even executables.
|
| Random example from an issue of Byte:
|
| https://archive.org/details/byte-magazine-1986-05/page/n432/...
| daenz wrote:
| How much data can you store if you embedded a picture-in-picture
| file over a 10 minute video? I could totally see content creators
| who do tutorials embedding project files in this way.
| accrual wrote:
| Would storing data as a 15 or 30 FPS QR code "video" be any
| more useful? At a minimum one would gain a configurable amount
| of error correction, and you could display it in the corner.
| dsr_ wrote:
| Back of the envelope estimate:
|
| 4096 x 2160 x 24 x 60 is your theoretical max in bits/second,
| about 12.7 billion.
|
| Assume that to counter YouTube's compression we need 16x16
| blocks of no more than 256 colors and 15 keyframes/second; that
| reduces it to
|
| 256 * 135 * 8 * 15 = 4.1 million bits/sec.
|
| That's not too awful. Ten minutes of this would get you about
| 300MB of data, which itself might be compressed.
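The back-of-the-envelope estimate above can be reproduced in a few lines. All parameters are the ones given in the comment (16x16 blocks, 256 colors, 15 keyframes per second); whether those are enough to survive YouTube's compression is the commenter's assumption, not something this sketch verifies. The raw product works out to roughly 12.7 billion bits per second.

```python
WIDTH, HEIGHT = 4096, 2160
BITS_PER_PIXEL = 24
FPS = 60

# Theoretical maximum if every bit of every pixel survived.
raw_bits_per_second = WIDTH * HEIGHT * BITS_PER_PIXEL * FPS

BLOCK = 16                 # block edge length in pixels
BITS_PER_BLOCK = 8         # 256 colors per block
KEYFRAMES_PER_SECOND = 15

blocks_per_frame = (WIDTH // BLOCK) * (HEIGHT // BLOCK)   # 256 * 135
robust_bits_per_second = blocks_per_frame * BITS_PER_BLOCK * KEYFRAMES_PER_SECOND

ten_minutes_bytes = robust_bits_per_second * 600 // 8

print(raw_bits_per_second)       # theoretical max, bits/second
print(robust_bits_per_second)    # conservative estimate, bits/second
print(ten_minutes_bytes / 1e6)   # megabytes in a ten-minute video
```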
| pstrateman wrote:
| 4k video is almost always 3840x2160
| kuschku wrote:
| 4K consumer video is 3840x2160, 4K Cinema video is
| 4096x2160.
|
| Just like 2K consumer video is 1920x1080 and 2K Cinema
| video is 2048x1080
| behnamoh wrote:
| "hope you enjoyed this video. btw, the source code used in this
| tutorial is encoded in the video."
| cush wrote:
| Yeah seems way easier than adding a link in the description
| daenz wrote:
| Links die. As long as the video exists, the files that the
| video uses will _always_ exist.
| legitster wrote:
| This reminds me of an old hacky product that would let you use
| cheap VHS tapes as backup storage:
| https://en.wikipedia.org/wiki/ArVid
|
| You would hit Record on a VCR and the computer data would be
| encoded as video data on the tape.
|
| People are clever.
| gibolt wrote:
| Early games and software would be delivered on audio cassettes
| that would then have to be 'played' in order to load your
| software temporarily into the device, which could take minutes
|
| edit: Video from the 8-bit Guy on how this worked -
| https://www.youtube.com/watch?v=_9SM9lG47Ew
| mobilene wrote:
| This is old school. When I first wrote code back in the Stone
| Age we used to store our stuff on cassette tape.
| twh270 wrote:
| You had cassette tape?? Lucky... I had to write my 1's and
| 0's in the dirt with a stick.
|
| Damn rain.
| RedShift1 wrote:
| You guys had dirt?
| madengr wrote:
| Ha ha, when I was a kid with my C64, I used my moms old reel-
| to-reel tape deck to store data.
|
| I still have a C64 and tape drive.
|
| There was a magazine in the 80's where you could scan in the
| code with a bar code scanner.
| Random_Person wrote:
| I still have my Atari 400 and tape drive!
| johnvega wrote:
| My family had an Atari 400 with a tape drive. I remember
| buying a tape with a game. We also used it for the BASIC
| programming language and the Asteroids game using a
| cartridge.
| alar44 wrote:
| That's not really that hacky, audio cassettes were used
| forever, it's just a tape backup.
| jhgb wrote:
| I remember a similar solution that was marketed in a German
| mail order catalogue in late 1990s. It could have been Conrad,
| but I'm not 100% sure. I recall it being a USB peripheral,
| though. (Maybe I could find more about it in time...)
| philjohn wrote:
| The Alesis ADAT 8 track digital audio recorders used SVHS tapes
| as the medium - at the end of the day, it's just a spooled
| magnetic medium, not hugely different conceptually than a hard
| drive.
| ben174 wrote:
| Wow, 2GB on a standard tape. For the time, that's incredibly
| efficient and cheap.
| anyfoo wrote:
| Yeah. Video, even old grainy VHS, had a pretty high
| bandwidth. Even much more so with S-VHS, which did not become
| super popular though. (I'm actually wondering whether the 2GB
| figure was for S-VHS, not VHS. Didn't do the math and
| wouldn't be surprised either way, though.)
| gattilorenz wrote:
| Yes! There were many such systems, LGR made a video for one of
| them, also showing the interface (as in: hardware and GUI) for
| the backup: https://youtu.be/TUS0Zv2APjU
| danschumann wrote:
| This reminds me of Blame! where humans are living like rats in
| the belly of the machine. Lol, also reminds me of the geocities
| days where we created 50 accounts to upload dragon ball z videos.
| ductsurprise wrote:
| Could be a good and sneaky way to obfuscate encrypted message
| transmissions?
| mensetmanusman wrote:
| Are the premium files stored as 4K?
| bilekas wrote:
| I absolutely love this idea. I need to dig more into the code,
| but it's almost like using twitter as a 'protocol' using youtube
| as a storage.
|
| So many ideas are flying to mind. Really creative.
| shmatt wrote:
| Reminds me of the old Wrapster[1] days
|
| [1] https://www.cnet.com/tech/services-and-software/napster-
| hack...
| [deleted]
| behnamoh wrote:
| there was a story on HN a while ago in which someone stored
| unlimited data in Google Sheets!
| dahfizz wrote:
| Does YouTube store and stream all videos losslessly? How does
| this work otherwise?
| kleer001 wrote:
| things like redundancy and crc checks I assume
| ezfe wrote:
| The data is represented large enough on screen that compression
| doesn't destroy it.
| Beltalowda wrote:
| e.g. similar to a QR code stored as a JPEG will still work
| fine.
| [deleted]
| LukeShu wrote:
| No, YouTube is not lossless.
|
| The video that is created in the example in the README is
| https://www.youtube.com/watch?v=Fmm1AeYmbNU
|
| We can see that data is encoded as "pixels" that are quite
| large, being made up of many actual pixels in the video file. I
| see quite bad compression artifacts, yet I can clearly make out
| the pixels that would need to be clear to read the data. It
| looks like the video was uploaded at 720p (1280x720), but the
| data is encoded as a 64x36 "pixel" image of 8 distinct colors.
| So lots of room for lossy compression before it's unreadable.
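To make the numbers above concrete, here is a minimal sketch of a 64x36 grid of cells with 8 colors (3 bits per cell), decoded by picking the nearest palette color. The palette (the eight corners of the RGB cube), the noise model and all function names are assumptions for illustration; the actual tool may lay out and color its cells differently.

```python
import random

GRID_W, GRID_H = 64, 36
BITS_PER_CELL = 3  # 8 distinct colors

# Hypothetical palette: the 8 corners of the RGB cube.
PALETTE = [
    (((i >> 2) & 1) * 255, ((i >> 1) & 1) * 255, (i & 1) * 255) for i in range(8)
]

frame_capacity_bytes = GRID_W * GRID_H * BITS_PER_CELL // 8


def bytes_to_cells(data):
    """Turn bytes into a list of palette colors, 3 bits per cell."""
    bits = "".join(f"{b:08b}" for b in data)
    bits += "0" * (-len(bits) % BITS_PER_CELL)   # pad to a whole number of cells
    return [PALETTE[int(bits[i:i + 3], 2)] for i in range(0, len(bits), 3)]


def nearest_symbol(rgb):
    """Index of the palette color closest to rgb (squared Euclidean distance)."""
    return min(
        range(8),
        key=lambda s: sum((a - b) ** 2 for a, b in zip(rgb, PALETTE[s])),
    )


def cells_to_bytes(cells, n_bytes):
    bits = "".join(f"{nearest_symbol(c):03b}" for c in cells)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))


def add_noise(cells, amplitude, rng):
    """Crude stand-in for compression artifacts: jitter every channel."""
    return [
        tuple(min(255, max(0, ch + rng.randint(-amplitude, amplitude))) for ch in c)
        for c in cells
    ]


rng = random.Random(0)
payload = b"YouTubeDrive"
noisy = add_noise(bytes_to_cells(payload), 60, rng)
print(frame_capacity_bytes, cells_to_bytes(noisy, len(payload)))
```

With this layout one frame holds 864 bytes, and each color channel can drift by almost half its range before a cell is misread, which is the "lots of room" the comment refers to.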
| [deleted]
| [deleted]
| martincmartin wrote:
| Imagine a QR code that changes once every X milliseconds.
| dahfizz wrote:
| That's an excellent analogy, thank you.
| derevaunseraun wrote:
| This seems like something Cicada 3301 would use
|
| I wonder how many random videos like this are floating around
| that are encoding some super secret data...
| advisedwang wrote:
| Seems like a great way to get your account closed for abuse!
| LewisVerstappen wrote:
| You'd be surprised how much YouTube lets you upload.
|
| I've been uploading 2-3 hours of content a day every day for
| the past few years. On the same account too.
|
| I have fewer than 10 subscribers lol.
| deanCommie wrote:
| How MUCH - yes - as long as it's videos, and it's not
| violating copyright, you're probably not violating any Terms
| of Service.
|
| But I guarantee there is some clause in the ToS that this
| project violates.
| emptysongglass wrote:
| Lucky you. I just posted my first two videos from a
| conference that were banned within a day for violating
| "Community Guidelines" without appeal.
| [deleted]
| c0balt wrote:
| They let you sometimes get away with a lot more[0] ;)
|
| [0]: https://www.youtube.com/watch?v=Olkb7fYSyiI
| bityard wrote:
| What kind of content do you upload? (Should "content" be in
| air quotes? :P)
| LewisVerstappen wrote:
| Lol yeah.
|
| It's just recordings of myself when I'm doing deep work. I
| use OBS to stream my computer screen and a video recording
| of myself (mostly me muttering to myself).
|
| It helps me avoid getting distracted (I feel like I'm being
| watched lol) and it's also interesting to check back if I
| want to see what I was working on 3 months ago.
|
| All the videos are unlisted or private.
| nittanymount wrote:
| wow, curious, are you keeping these videos there, or will
| delete them after several months?
| pcthrowaway wrote:
| Are you screensharing while recording? What tooling do
| you use to do this if so?
|
| Also, any potential issues with Google having access to
| proprietary code? I know the chance of any human at
| Google interpreting your videos is near-zero but still
| adolph wrote:
| Isn't that what Twitch is for?
| johndfsgdgdfg wrote:
| Then the whole HN crowd would have enough outrage materials for
| weeks. Seems like a win-win situation to me.
| robotnikman wrote:
| Another thread posted today makes it seem like they don't
| really care
|
| https://news.ycombinator.com/item?id=31488455
| Manuel_D wrote:
| If it becomes prevalent, I think YouTube would do something
| like slightly randomize the compression in their videos to
| dissuade this kind of use.
| deckar01 wrote:
| You could make it much harder to detect by synthesizing a
| unique video with a DNN and hiding the data using traditional
| steganography techniques.
| Mockapapella wrote:
| I think that video compression might make this not a viable
| technique. Artifacts would destroy the hidden data, right?
| bitexploder wrote:
| That is what redundancy and error correcting codes are for.
| It will reduce your data density, but I am sure you can
| find parameters that preserve the data.
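The trade-off between redundancy and density mentioned above shows up even in the simplest error-correcting scheme, a repetition code with majority voting. This is a toy sketch; practical systems would more likely use something like Reed-Solomon codes, and nothing here reflects what any particular tool does.

```python
def encode(bits, r=5):
    """Repeat every bit r times (r odd). Data density drops to 1/r."""
    return [b for b in bits for _ in range(r)]


def decode(coded, r=5):
    """Majority vote over each group of r received bits."""
    return [
        1 if sum(coded[i:i + r]) * 2 > r else 0
        for i in range(0, len(coded), r)
    ]


data = [1, 0, 1, 1]
sent = encode(data)

# Simulate artifacts: flip up to two bits in each group of five.
received = list(sent)
for pos in (0, 6, 7, 12):
    received[pos] ^= 1

print(decode(received) == data)
```

With five copies per bit, any two errors per group are corrected, at the cost of storing one fifth as much data.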
| upupandup wrote:
| Couldn't you also embed data through sound? Upload a video
| of a monkey at the zoo but you insert ultrasound with
| encoded data.
|
| something like this but far more mundane
|
| https://www.youtube.com/watch?v=yLNpy62jIFk
| bityard wrote:
| > but you insert ultrasound with encoded data
|
| Others in these comments have also suggested
| steganography in both the video and audio streams. The
| problem with that is that when you retrieve a video from
| YouTube, you never get the original version back. You
| only get a lossy re-encoded version, and the very
| definition of lossy encoding is to toss out details that
| humans can't (or wouldn't easily) perceive, including
| ultra-sonic audio.
| dotancohen wrote:
| It might be ridiculous, but how about uploading a
| computer-generated video of a human saying 0 and 1 very
| quickly, to encode binary file.
|
| Or better yet, the file could be one third the size if
| the human says the numbers 0 to 7.
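The "one third the size" claim above follows from each spoken digit 0 to 7 carrying three bits instead of one. A small sketch (function name made up for this example):

```python
def bytes_to_octal_digits(data):
    """Split a byte string into 3-bit groups, i.e. digits 0 to 7."""
    bits = "".join(f"{b:08b}" for b in data)
    bits += "0" * (-len(bits) % 3)          # pad to a multiple of 3 bits
    return [int(bits[i:i + 3], 2) for i in range(0, len(bits), 3)]


digits = bytes_to_octal_digits(b"Hi")
print(digits)        # 6 spoken digits
print(len(b"Hi") * 8)  # versus 16 spoken bits
```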
| snowwrestler wrote:
| Unless you tuned the NN on the files you get back from
| YouTube, so that it learns to encode the data in a way that
| is always recoverable despite the artifacts.
| throwaway92394 wrote:
| Compression will limit the bandwidth of a given frame but
| you can work around it.
|
| Some forms of DRM are already essentially this. DRM that resists
| compression, and even a crappy camera recording from a theater,
| and that is essentially steganography (you can't visually tell
| it's there) exists.
|
| EDIT: "compression resistant watermark" is a good search
| phrase if anyone is curious
| umvi wrote:
| Turns out any site that allows users to submit and retrieve data
| can be abused in the same way:
|
| - FacebookDrive: "Store files as base64 facebook posts"
|
| - TwitterDrive: "Store files as base64 tweets"
|
| - SoundCloudDrive: "Store files as mp3 audio"
|
| - WikipediaDrive: "Store files in wikipedia article histories"
| jasonlotito wrote:
| My friends and I had a joke called NSABox. It would send data
| around using words that would attract the attention of the NSA,
| and you could submit a FOIA request to recover the data. I
| always found it amusing.
| havblue wrote:
| I've heard of the LOIC (Low Orbit Ion Cannon) DoS tool described as a
| shortcut to getting sent to jail. This sounds similar.
| mechanical_bear wrote:
| Big difference. LOIC actually impacts a target.
| mickeyp wrote:
| There's a feature in Emacs that does that (unsurprisingly.)
|
| It's called `M-x spook'. It inserts random gibberish that NSA
| and the Echelon project would've supposedly picked up back in
| the 90s.
| LukeShu wrote:
| spook.el was "introduced at or before Emacs version 18.52".
| And 18.52 was released in 1988. And spook.el in a comment
| says ;; Created: May 1987
|
| So the things that the NSA and ECHELON would have picked up
| on back in the 1980s, not the 1990s :)
| upupandup wrote:
| What a great time to write botnets
| itake wrote:
| Back in the day when @gmail was famous for their massive free
| storage for email, ppl wrote scripts to chunk large files and
| store them as email attachments.
| adzm wrote:
| People did this on AOL in the 90s as well!
| jprd wrote:
| Did you manage to get on the latest Mass Mail going out
| tonight?
| RcouF1uZ4gsC wrote:
| With AOL, in the early 90's you didn't even need to do
| that. You could just reformat and reuse the floppy disks
| they were always sending you for free storage.
| ihaveajob wrote:
| I know someone who published an academic paper on doing
| exactly this.
| IshKebab wrote:
| Doesn't sound very noteworthy tbh. It's obviously possible
| and the implementation is straightforward.
| 867-5309 wrote:
| sounds like 99% of academic papers
| IshKebab wrote:
| Most papers at least _sound_ like they 're notable!
| jraph wrote:
| The less jam you have, the more you spread it out.
|
| The opposite is also true. Brilliant ideas have led to
| papers that can read as obvious and terribly unremarkable.
| Grollicus wrote:
| I used this as a backup target for the longest time. Simply
| split the backup file into 10 MB chunks and send as mails to
| a gmail account. Encrypted so no privacy problems. Rock solid
| for years.
|
| And as it was just storing emails it was even using gmail for
| its intended purpose so no TOS problems..
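The chunk-and-reassemble part of the approach described above might look like the sketch below. Sending the mails is left out entirely; the chunk record layout, the checksum and the function names are assumptions for illustration, and the 10 MB chunk size from the comment is shrunk so the example runs instantly.

```python
import hashlib


def split(data, chunk_size):
    """Cut data into numbered chunks, each carrying a SHA-256 checksum."""
    total = (len(data) + chunk_size - 1) // chunk_size
    chunks = []
    for index in range(total):
        payload = data[index * chunk_size:(index + 1) * chunk_size]
        chunks.append({
            "index": index,
            "total": total,
            "sha256": hashlib.sha256(payload).hexdigest(),
            "payload": payload,
        })
    return chunks


def join(chunks):
    """Reassemble chunks that may arrive in any order; verify each checksum."""
    ordered = sorted(chunks, key=lambda c: c["index"])
    if not ordered or len(ordered) != ordered[0]["total"]:
        raise ValueError("missing chunks")
    for c in ordered:
        if hashlib.sha256(c["payload"]).hexdigest() != c["sha256"]:
            raise ValueError(f"chunk {c['index']} is corrupt")
    return b"".join(c["payload"] for c in ordered)


backup = bytes(range(256)) * 10
parts = split(backup, chunk_size=1000)   # a real setup would use ~10 MB
print(len(parts), join(list(reversed(parts))) == backup)
```

Encrypting the backup before splitting, as the comment describes, would be a separate step ahead of `split`.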
| shon wrote:
| Yup, did the exact same thing to back up all of the
| Wordpress installs on a free server I ran for friends.
| thrdbndndn wrote:
| This is pretty tame compared to some actual, practical ones
| such as https://github.com/apachecn/CDNDrive
|
| For people who don't read Chinese: it encodes data into ~10M
| blocks in PNG and then uploads (together with a metadata/index
| file as an entry point) to various Chinese social media sites
| that don't re-compress your images. I know people who have used it
| to store* TBs upon TBs of data on them already.
|
| *Of course, it would be foolish to think your data is even
| remotely safe "storing" them this way. But it's a very good
| solution for sharing large files.
| behnamoh wrote:
| also Telegram
| [deleted]
| WaxProlix wrote:
| I wrote one of these as a POC when at AWS to store data sharded
| across all the free namespaces (think Lambda names), with
| pointers to the next chunk of data.
|
| I like to think you could unify all of these into a FUSE
| filesystem and just mount your transparent multi-cloud remote
| FS as usual.
|
| It's inefficient, but free! So you can have as much space as
| you want. And it's potentially brittle, but free! So you can
| replicate/stripe the data across as many providers as you want.
| turtledove wrote:
| I was an eng manager on Lambda for a time, and we definitely
| knew people were doing this, and had plans to cut it out if
| it ever became a problem. :D
| WaxProlix wrote:
| Yeah, you'd need to find some sort of auto-balancing to
| detect this kind of bitrot from over-aggressive engineering
| managers & their ilk and rebalance the data across other
| sources. I think the multiple-shuffle-shard approach has
| been done before, maybe we could steal some algo from a
| RAID driver, or DynamoDB.
| willcipriano wrote:
| I made a tool that lets you store files anywhere you can store
| a URL: https://podje.li/
| metadat wrote:
| Is there an import URLs button? Otherwise, how does one
| reassemble the original?
| willcipriano wrote:
| Click them, it's really for things that fit into one or two
| urls like small text files. I've used it for config files
| that were getting formatted incorrectly over corporate
| email that ate it as an attachment.
| wging wrote:
| See also https://github.com/qntm/base2048. "Base2048 is a
| binary encoding optimised for transmitting data through
| Twitter."
| colinmhayes wrote:
| Still need around 30,000 more unicode characters for this to
| work.
| wging wrote:
| Sorry, I edited the post concurrently with your comment -
| it now points to Base2048, the link I meant to post, which
| actually should work - rather than
| https://github.com/qntm/base65536 (which I think you're
| commenting on).
| theblazehen wrote:
| > For transmitting data through Twitter, Base65536 is now
| considered obsolete; see Base2048.
|
| Source: https://github.com/qntm/base65536
| the_duke wrote:
| Github repos make for a pretty good key-value store.
|
| It even has a full CRUD API, no need for using libgit.
| mike00632 wrote:
| I wonder if access permissions would be easier to maintain
| using Facebook...
| dheera wrote:
| Until one day your base64 ciphertext just so happens to
| contain a curse word and you get banned for violating
| "community standards"
| anonymousiam wrote:
| Reminds me of this similar tool that exploited GMail the same
| way: https://www.computerworld.com/article/2547891/google-hack--
| u...
| saint_angels wrote:
| Reminds me of a guy who stored data in ping messages
| https://youtu.be/JcJSW7Rprio
| alanh wrote:
| What part of the video discusses this? :D So far it's about
| juggling chainsaws
|
| Edit: OK, I see where this is going. Lol
| bluedays wrote:
| I watch these things and I begin to realize I'll never be as
| intelligent as someone like this. It's good to know no matter
| how much you've grown there is always a bigger fish.
| qorrect wrote:
| I agree that there will always be smarter fish, but you can
| definitely be this smart it just takes the proper motivation
| ( or weird idea ) to wiggle its way into your brain.
| msoad wrote:
| Reminds me of the other post that used Facebook Messenger as
| transport layer to get free internet in places where internet is
| free if you use Facebook apps.
| powerset wrote:
| I wonder if something similar could be useful for transmitting
| data optically, like an animated QR code. Maybe a good way to
| transmit data over an air gap for the paranoid?
| _trampeltier wrote:
| This story from 2016 comes to my mind.
|
| https://www.bbc.com/future/article/20160225-the-quest-to-sol...
| kube-system wrote:
| I can't wait until malware uses this as C2
| Tijdreiziger wrote:
| Seems pretty fragile. Google taking down your channel would be
| enough to disarm your malware.
| blibble wrote:
| they worked around this years ago by generating the username
| (domain name) based on some property of the current time
|
| (plus using more than one tld)
| vmception wrote:
| Ipfs is decent enough or better with free pinning services
| productceo wrote:
| Imagine a free cloud storage, but you need to watch an ad every
| time you download a file.
| stingta wrote:
| Wasn't that basically megaupload and its ilk?
| rightbyte wrote:
| I read that you did not download shady files from the interwebs
| when that was a thing sane people actually did?
| rationalfaith wrote:
| [deleted]
| das_keyboard wrote:
| Wasn't there more or less recently on HN something like "Store
| Data for free in DNS-Records"? Reminds me of this.
| jtxt wrote:
| Seems like it may be a decent "harder drive".
| https://youtu.be/JcJSW7Rprio
| metadat wrote:
| Could youtube-dlp and YouTube Vanced now be hosted on.. YouTube?
|
| I wonder how long it'd take for Google to crack down on the
| system abuse.
|
| Is it really abuse if the videos are viewable / playable?
| Presumably the ToS either already forbids covert channel encoding
| or soon will.
| sevenf0ur wrote:
| Probably breaks TOS under video spam
| tenebrisalietum wrote:
| Add a music track, it is now a psychedelic art video.
| squarefoot wrote:
| A music track in which the music happens to be FSK data
| disguised as chiptune.
| throwaway92394 wrote:
| Just gotta add some good ol' steganography
| javajosh wrote:
| This brings up an interesting question: what is the upper
| bound of hidden data density using video steganography?
| E.g. how much extra data can you add before noticeable
| degradation? It's interesting because it requires both a
| detailed understanding of video encoding and also
| understanding of human perception of video.
| pbhjpbhj wrote:
| I'd expect you could store more data steganographically
| than the raw video data.
|
| You can probably do things like add frames that can't be
| decoded and so are skipped by a decoder; that effectively
| allows arbitrary added hidden data. That's maybe
| cheating.
|
| If you stipulate that you can't already have a copy of
| the unaltered file, and the data has to be extractable
| from a pixel copy of the rendered frames ... that becomes
| more interesting, I think.
| samatman wrote:
| I've seen drone metal videos where the video and audio
| could both be 90% steganography and I wouldn't know the
| difference.
| alpaca128 wrote:
| Good luck preserving it through YouTube's video
| compression. It's super lossy with small details, in bad
| cases the quality can visibly degrade to a point it looks
| more like a corrupted low-res video file for a few seconds
| (saw that once in a Tetris Effect gameplay video).
| throwaway92394 wrote:
| I mentioned it in another comment, but while that does
| lower the bandwidth of a single frame, it's not actually
| an issue. There are several DRM techniques that can survive
| a crappy camera recording in a theater.
|
| "compression resistant watermark" turns up some good
| resources for it. QR codes are another good example of
| noise tolerant data transmission (fun fact - having logos
| in a QR code isn't part of the spec, you're literally
| covering the QR code but the error-correction can handle
| it).
|
| The best way I can describe it is that humans can still
| read text in compressed videos. The worse the
| compression/noise the larger the text needs to be, but we
| can still read it.
| bliteben wrote:
| yeah wonder how long until the ban, also bans all of your
| descendants for 10 generations?
| robonerd wrote:
| If you put youtube-dlp on youtube as a video, make sure to use
| youtube-dlp to back it up.
| throwaway0a5e wrote:
| >Is it really abuse if the videos are viewable / playable?
| Presumably the ToS either already forbids covert channel
| encoding or soon will.
|
| If creators start encoding their source and material into their
| content Google would probably be fine with that because it
| gives them data but also gives them context for that data.
|
| Edit: I meant like "director's commentary" and "notes about
| production" type stuff like you used to see added to DVDs back
| in the day. Not "using youtube as my personal file storage".
| Why is this such an unpopular opinion?
| jklinger410 wrote:
| > If creators start encoding their source material into their
| files Google would probably be fine with that
|
| Not true at all, lol. Google has a paid file storage
| solution. YouTube is for streaming video and that's the
| activity they expect on that platform. I couldn't imagine any
| service designed for one format would "probably be fine" with
| users encoding other files inside of that format.
| pbhjpbhj wrote:
| I think the parent comment is limiting themselves to the
| embedding of metadata specific to the containing file. It
| would be like adding a single frame, but would potentially
| give useful information to Google. In those limited
| circumstances I think the parent is correct.
| baud147258 wrote:
| > If creators start encoding their source material into their
| files Google would probably be fine with that
|
| it'd depend, as I don't think people using YT to store files
| would watch a lot of ads
| throwaway0a5e wrote:
| If creators use it like the appendix in a book I can see
| people watching ads on their way to it.
| cush wrote:
| It's one of those problems that resolves itself.
|
| The process of creating and using the files is prohibitively
| unusable and so many better solutions exist that YT doesn't
| need to worry about it
| freestorage wrote:
| Years ago when Amazon had unlimited photo storage, you could
| "hide" gigabytes of data behind a 1px gif (literally
| concatenated together) so that it wouldn't count against your
| quota.
| xhrpost wrote:
| They still do if you pay for Prime. I was surprised to see that
| even RAW files (which are uncompressed and quite large) were
| uploaded and stored with no issues. Not the same as "hiding"
| data but might still be possible.
| karamanolev wrote:
| In the interest of technical correctness, RAW files are
| frequently compressed and even lossily compressed. For
| example, Sony's RAW compression was only lossy until very
| recent cameras.
|
| Given that there are the options for uncompressed, lossy
| compressed and lossless compressed, I'd say "RAW" refers to
| the stage of the data processing at which capture is done
| and doesn't imply anything about the type of compression.
|
| What is relevant is that the formats vary widely between
| manufacturers, camera lines and individual cameras, so unlike
| JPEG, it's really hard to create a storage service that
| compresses RAW files further after uploading in a meaningful
| way. So anything they do needs to losslessly compress the
| file.
| netsharc wrote:
| I guess you can store 24 bits of data as the R,G and B
| components of a pixel of an "image", and store it as a
| lossless image...
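| A minimal sketch of that idea, assuming the pixels are later written
| out by some lossless image format (not shown here). The function
| names are hypothetical; the only point is that three bytes map to
| one (R, G, B) pixel and back.

```python
def bytes_to_pixels(data):
    """Pack bytes into (R, G, B) tuples, three bytes (24 bits) per pixel."""
    # Pad with zero bytes so the length is a multiple of three.
    padded = data + b"\x00" * (-len(data) % 3)
    pixels = [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]
    # The original length is returned so the padding can be dropped later.
    return pixels, len(data)


def pixels_to_bytes(pixels, length):
    """Flatten (R, G, B) tuples back into the original byte string."""
    flat = bytes(channel for pixel in pixels for channel in pixel)
    return flat[:length]


pixels, length = bytes_to_pixels(b"hello")
restored = pixels_to_bytes(pixels, length)
```

| This only survives storage that keeps every pixel value exactly; any
| lossy recompression would corrupt the data.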
| flaque wrote:
| See also RedditFS: https://github.com/maxchehab/redditfs
| kebman wrote:
| Are there any examples? I'd love to see such a YouTube video...
| :p
| Jimmc414 wrote:
| Very cool. I wonder how difficult it would be to present a real
| watchable video to the viewer, albeit low quality, with the file
| embedded steganographically. I think a risk of this tech is
| that if it takes off, YT might easily adjust the algorithms to
| remove unwatchable videos. Perhaps leaving a watchable video
| could grant it more persistence than an obvious data stream.
| ragingglow wrote:
| Sure, but the more structure your video has to have, the harder
| it becomes to hide information steganographically within it. Your
| information density will become very low I think.
| 8K832d7tNmiQ wrote:
| I remember seeing this first discussed on 4chan's /g/ board as a
| joke about whether or not they could abuse Youtube's unlimited
| file size for uploads, then escalated into a proof of concept
| shown in the repo :)
| marginalia_nu wrote:
| This is a tangent. I must have been maybe 15-16 at the time, so
| somewhere around 20 years ago: One of the first pieces of
| software I remember building was a POP3 server that served
| files, that you could download using an email client where they
| would show up as attachments.
|
| Incredibly bizarre idea. I'm not sure who I thought would
| benefit from this. I guess I got swept up in RFC1939 and needed
| to build... something.
| babanin wrote:
| On my first job (in the beginning of the millennium) there
| was a limit on files you could download, something around
| 5Mb. If you wanted to download something bigger, you had to
| ask sysadmins to do that and wait... That was really
| annoying. So my colleague and I ended up writing a service
| that could download a file to local storage, chop it into
| multiple 5Mb attachments and send multiple emails to the
| requestor.
|
| After some time the limit on a single file was removed, but
| a daily limit of 100Mb was set up. The trick is that POP3
| traffic wasn't counted, so we continued to use our
| "service".
| hiq wrote:
| I couldn't download .exe files at some $CORPORATION. They
| had to be whitelisted or something, and the download just
| wouldn't work otherwise. But once you had the .exe you
| could run it just fine. You just had to ping some IT person
| to be able to retrieve your .exe.
|
| Of course it was still possible to browse the internet and
| visualize arbitrary text, so splitting the .exe into
| base64-encoded chunks and uploading them on GitHub from
| another computer was working perfectly fine... I briefly
| argued against these measures, given how unlikely they are
| to prevent any kind of threat, but they're probably still
| in place.
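| A rough sketch of the base64-chunk trick described above, using only
| Python's standard library. The function names and the chunk size are
| assumptions for illustration, not what was actually used.

```python
import base64


def to_text_chunks(data, chunk_chars=76):
    """Base64-encode a binary blob and cut the text into fixed-size chunks."""
    text = base64.b64encode(data).decode("ascii")
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]


def from_text_chunks(chunks):
    """Rejoin the text chunks in order and decode back to the original bytes."""
    return base64.b64decode("".join(chunks))


blob = bytes(range(256)) * 4          # stand-in for the contents of an .exe
chunks = to_text_chunks(blob, chunk_chars=100)
rebuilt = from_text_chunks(chunks)
```

| Because the chunks are cut from one continuous base64 string, any
| chunk size works as long as the pieces are rejoined in order.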
| behnamoh wrote:
| apparently e-mail is not very reliable for storing/keeping
| files. there have been cases where an old email with an
| attachment would not load correctly because the servers just
| erased the attachment file.
| marginalia_nu wrote:
| This was a custom email server though, there never were any
| emails, it just presented files as though they were so that
| a client would download them.
|
| Actually caused some problems for email clients, as they
| usually assumed emails were small. I got a few of them to
| crash with 200 Mb "attachments" (although this was in the
| early 00s, 200Mb was bigger than it is today).
| qorrect wrote:
| I'm still confused on how this worked, did you email some
| address and get a reply with the attachment ?
| mjochim wrote:
| Since GP says it was a POP3 server, I suppose you would
| set up an email account in your client with its inbox
| server pointing to that POP3 server. When the client
| requests the content of the inbox, the server responds
| with a list of "emails" that are really just files with
| some email header slapped on; so your email client's
| inbox window essentially becomes a file browser.
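| To illustrate "files with some email header slapped on", here is a
| small sketch using Python's standard email package. It is not the
| original server's code; the helper name, addresses and subject line
| are made up, and no POP3 handling is shown.

```python
import email
from email import policy
from email.message import EmailMessage


def file_as_email(filename, content):
    """Wrap raw file bytes in a minimal message with a single attachment."""
    msg = EmailMessage()
    msg["From"] = "files@example.invalid"      # placeholder address
    msg["To"] = "you@example.invalid"          # placeholder address
    msg["Subject"] = filename                  # shows up as the "file name" in the inbox
    msg.set_content("File listing entry.")
    msg.add_attachment(content, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg.as_bytes()


raw = file_as_email("notes.bin", b"\x00\x01binary payload\xff")

# What a mail client would do after fetching the message:
parsed = email.message_from_bytes(raw, policy=policy.default)
attachment = next(parsed.iter_attachments())
name, payload = attachment.get_filename(), attachment.get_content()
```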
| Gigachad wrote:
| Interestingly, if you take a look at your emails from a few
| years ago, most of the non attached images will fail to
| load now.
| Saint_Genet wrote:
| Makes me wonder how many video and image upload sites are now
| used as easily accessible numbers stations these days
| adolph wrote:
| Probably not many. The advantage of plain old-fashioned radio
| is that the station doesn't keep track of the receivers.
| Whoever watches a YouTube numbers station is tracked six ways
| to Sunday.
| INTPenis wrote:
| I like this. The last wave of Twitter users into the fediverse
| caused my AWS bill to go up 10 USD a month. Might have to start
| storing media files on youtube instead ;)
| jimmydeans wrote:
| I remember a project that was doing this with photo files and
| unlimited picture storage.
| sunlite99 wrote:
| How will you prevent youtube from re-encoding the video and data
| getting thrashed?
| tenebrisalietum wrote:
| Make the boxes bigger.
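| A toy sketch of why bigger boxes help, under the assumption that
| re-encoding behaves like bounded per-pixel noise (real video codecs
| are far messier). Each bit is drawn as a block of identical pixels
| and read back by averaging the block. All names are hypothetical.

```python
import random


def encode_bits(bits, block=8):
    """Draw each bit as a block x block square: 0 -> black (0), 1 -> white (255)."""
    row = []
    for bit in bits:
        row.extend([255 if bit else 0] * block)
    return [list(row) for _ in range(block)]


def add_noise(frame, amplitude, seed=0):
    """Perturb every pixel by at most +/- amplitude, clamped to 0..255."""
    rng = random.Random(seed)
    return [[min(255, max(0, p + rng.randint(-amplitude, amplitude)))
             for p in row] for row in frame]


def decode_bits(frame, block=8):
    """Average each block and threshold at mid-grey to recover the bit."""
    bits = []
    for i in range(len(frame[0]) // block):
        total = sum(row[x] for row in frame
                    for x in range(i * block, (i + 1) * block))
        mean = total / (len(frame) * block)
        bits.append(1 if mean >= 128 else 0)
    return bits


message = [1, 0, 1, 1, 0, 0, 1, 0]
noisy = add_noise(encode_bits(message), amplitude=100)
decoded = decode_bits(noisy)
```

| Averaging over more pixels per bit trades capacity for robustness:
| the larger the block, the more distortion a single bit can absorb.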
| take_it_not wrote:
| I'm thinking maybe we can divide files into pieces and turn each
| piece into a QR code then turn each QR code into a single frame?
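| A sketch of the "divide into pieces" half of that idea. Rendering
| each piece as a QR code and reading it back is left to a QR library
| and is not shown. The 8-byte header layout (piece index, total
| count) is an assumption made for this example.

```python
import struct

HEADER = struct.Struct(">II")  # (piece index, total pieces), big-endian


def split_into_pieces(data, payload_size):
    """Cut data into numbered pieces, one per frame."""
    total = max(1, -(-len(data) // payload_size))  # ceiling division
    return [HEADER.pack(i, total) + data[i * payload_size:(i + 1) * payload_size]
            for i in range(total)]


def join_pieces(pieces):
    """Reassemble pieces in index order; they may arrive in any order."""
    parsed, total = {}, None
    for piece in pieces:
        index, total = HEADER.unpack(piece[:HEADER.size])
        parsed[index] = piece[HEADER.size:]
    if total is None or len(parsed) != total:
        raise ValueError("missing pieces")
    return b"".join(parsed[i] for i in range(total))


original = bytes(range(256)) * 3
frames = split_into_pieces(original, payload_size=100)
restored = join_pieces(reversed(frames))
```

| Numbering the pieces means dropped or reordered frames can be
| detected instead of silently corrupting the file.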
| musicale wrote:
| It's all fun and games until your files start getting DMCA
| takedowns.
| abadaba wrote:
| Are there any services out there that combine all of these "Store
| files as XYZ" into some kind of raid config?
|
| Would be interesting if you could treat each service (Youtube,
| Docs, Reddit, Messenger, etc) as a "disk" and stripe your data
| across them.
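| As a rough illustration of striping across "disks", the sketch below
| splits a blob into equal chunks plus one XOR parity chunk (in the
| spirit of RAID 4), so any single lost chunk can be rebuilt. Uploading
| the chunks to actual services is out of scope and all names are
| hypothetical.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe(data, n_disks):
    """Split data into n_disks zero-padded chunks plus one parity chunk."""
    size = -(-len(data) // n_disks)  # ceiling division
    chunks = [data[i * size:(i + 1) * size].ljust(size, b"\x00")
              for i in range(n_disks)]
    parity = bytes(size)
    for chunk in chunks:
        parity = xor_bytes(parity, chunk)
    return chunks, parity, len(data)


def rebuild(chunks, parity):
    """Fill in the single chunk marked None using the parity chunk."""
    missing = chunks.index(None)
    recovered = parity
    for i, chunk in enumerate(chunks):
        if i != missing:
            recovered = xor_bytes(recovered, chunk)
    return chunks[:missing] + [recovered] + chunks[missing + 1:]


data = b"store files as anything"
chunks, parity, length = stripe(data, n_disks=3)
chunks[1] = None                      # pretend one service took the data down
restored = b"".join(rebuild(chunks, parity))[:length]
```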
| [deleted]
| [deleted]
| AdriaanvRossum wrote:
| How many kilobytes could be stored per minute of video?
| lb1lf wrote:
| -Back in the day when file sharing was new, I won two rounds of
| beer from my friends in university - the first after I tried what
| I dubbed hardcore backups (Tarred, gzipped and pgp'd an archive,
| slapped an avi header on it, renamed it
| britney_uncensored_sex_tape[XXX].avi or something similar, then
| shared it on WinMX assuming that as hard drive space was free and
| teenage boys were teenage boys, at least some of those who
| downloaded it would leave it to share even if the file claimed to
| be corrupt).
|
| It worked a charm.
|
| Second round? A year later, when the archive was still available
| from umpteen hosts.
|
| For all I know, it still languishes on who knows how many old
| hard drives...
| marginalia_nu wrote:
| Poor guys, still looking for the right codec to play the
| britney tape they downloaded 28 years ago.
| jjice wrote:
| That's a perfect college CS story. Beer and bastardized files -
| what a combo!
___________________________________________________________________
(page generated 2022-05-24 23:00 UTC)