[HN Gopher] Old Vidme embeds turn into porn after domain purchase
       ___________________________________________________________________
        
       Old Vidme embeds turn into porn after domain purchase
        
       Author : ibaikov
       Score  : 149 points
       Date   : 2021-07-23 17:27 UTC (5 hours ago)
        
 (HTM) web link (www.theverge.com)
 (TXT) w3m dump (www.theverge.com)
        
       | S-E-P wrote:
       | That's kind of hilarious but gross all at the same time
        
       | kazinator wrote:
       | > _Here's (yet another) argument against using third-party embeds
       | on your respectable website_
       | 
       | Well, it's an argument against using embeds without having any
       | way to validate their authenticity.
       | 
       | This is analogous to having a software distro (e.g. package
       | manager) which downloads upstream tarballs or git repos without
       | checking any hashes.
       | 
       | Is there a solution for this? Say you want to embed a video from
       | some third party sites; what tools are there for ensuring that
       | the embedding will somehow lapse if the video at that URL has
       | been replaced or altered?
       | 
       | Ideally, you'd be notified if that happened. While not showing
       | porn is good, videos not working is bad. You can't be
       | checking an entire site all the time for non-working external
       | content.
        
       | one_off_comment wrote:
       | The weird thing about this to me is, the company or person who
       | scooped up the domain... in order to get their plan working so
       | quickly, wouldn't they have had to set up a site perfectly
       | beforehand so the embeds would work as desired and then just sat
       | there waiting, hoping, ready to hit the button to scoop up the
       | domain at just the right time, praying nobody beat them to it?
       | 
       | Isn't that a lot of work for almost no gain?
        
         | dec0dedab0de wrote:
         | depending on how vidme stored their videos it could just be as
         | simple as returning any request for a video file with whatever
         | they were showing.
        
         | duskwuff wrote:
         | The embeds don't actually "work", per se. When the browser
         | tries to load the embed into an <iframe>, it gets redirected to
         | the home page of the porn site, and ends up displaying the
         | upper left corner of that page in the space where the video
         | embed was supposed to go. It all looks rather more accidental
         | than purposeful.
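
A redirect like the one described above is something a periodic check could
notice, even though nothing returns a 404. A minimal sketch, assuming a
hijacked embed either lands on a different host or collapses to the site
root; the function names and the User-Agent string are made up for
illustration, and the heuristic will miss a new owner who serves content at
the old paths.

```python
from urllib.parse import urlsplit
from urllib.request import Request, urlopen


def host_of(url: str) -> str:
    """Lower-cased hostname of url, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def looks_replaced(embed_url: str, final_url: str) -> bool:
    """Heuristic: True if the embed ended up somewhere it was not meant to."""
    # Case 1: the request was redirected to another host entirely.
    if host_of(embed_url) != host_of(final_url):
        return True
    # Case 2: a deep link was redirected to the bare home page.
    wanted = urlsplit(embed_url).path.rstrip("/")
    got = urlsplit(final_url).path.rstrip("/")
    return wanted != "" and got == ""


def final_url(url: str, timeout: float = 10.0) -> str:
    """Follow redirects and return the URL that finally answered (network)."""
    request = Request(url, headers={"User-Agent": "embed-check/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return response.geturl()
```

Run from cron over every embed URL on a site, `looks_replaced(url,
final_url(url))` would give the "be notified" part; HTTP errors raised by
`urlopen` are a separate signal that the embed is simply dead.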
        
       | sideproject wrote:
       | I use Domainy
       | 
       | https://www.domainy.io
       | 
       | to monitor interesting domains being expired and to also find
       | interesting domains that are available.
       | 
       | twitter.accountant? what about twitter.beauty?
       | 
       | Domains are not exactly available when they expire, but this
       | helps me to check if any of them become available.
        
       | vernie wrote:
       | Any discussion beyond "lol damn" is unnecessary.
        
       | sp332 wrote:
       | 4 years ago, Archive Team backed up "nearly all" of Vidme
       | (according to Jason Scott). It was uploaded to the Internet
       | Archive here https://archive.org/details/archiveteam_vidme so you
       | can fix your vid.me URLs by pointing them to the Archive.
       | 
       | (And you can help keep them there, too https://archive.org/donate
       | But honestly, using the Archive more keeps the bigger donors
       | involved, so don't feel guilty or anything. Just use the
       | Archive!)
        
         | Scoundreller wrote:
         | So uhhhh, anyone want to write the recursive replacement regex
         | that'll do that across an entire filesystem?
        
           | lallysingh wrote:
           | find | xargs awk?
        
             | toomuchtodo wrote:
             | Wikipedia has a bot that the Internet Archive collaborated
             | on [1], where rotten links are updated to point to the
             | Internet Archive and new links are queued for retrieval
             | [2]. There should probably be a similar effort for CMS
             | systems like WordPress and such. The code to do this is
             | semi trivial.
             | 
             | [1] https://meta.wikimedia.org/wiki/InternetArchiveBot
             | 
             | [2] https://en.wikipedia.org/wiki/Wikipedia:Link_rot
        
               | sp332 wrote:
               | But the links here haven't rotted, in that they will not
               | return a 404 or whatever. They will still load videos, it
               | will just be replaced with porn. So you need more code to
               | detect all vidme links as something to fall back to the
               | Archive for.
        
               | toomuchtodo wrote:
               | In a sense, all of the links for the domain in question
               | have rotted, and must be replaced. To your point, your
               | code could replace wholesale based on the domain.
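
For the "replace wholesale based on the domain" idea, a rough sketch. The
thread does not say how items in the archiveteam_vidme collection map back to
individual vid.me URLs, so this swaps in a different target: it prefixes each
vid.me URL with a Wayback Machine address using a partial timestamp. That
prefix, the file suffix list, and the function names are assumptions, and
nothing here checks that a capture actually exists.

```python
import re
from pathlib import Path

# Matches http(s)://vid.me/... and http(s)://www.vid.me/... up to the next
# whitespace, quote, or angle bracket. The lookbehind skips URLs that already
# sit inside another URL, so running the rewrite twice changes nothing.
VIDME_URL = re.compile(r"(?<!/)https?://(?:www\.)?vid\.me/[^\s\"'<>]*")

# Assumption: a partial timestamp ("2017") is resolved by the Wayback Machine
# to the nearest capture, if there is one.
WAYBACK_PREFIX = "https://web.archive.org/web/2017/"


def rewrite_vidme_links(text: str) -> str:
    """Prefix every vid.me URL in text with WAYBACK_PREFIX."""
    return VIDME_URL.sub(lambda m: WAYBACK_PREFIX + m.group(0), text)


def rewrite_tree(root: str, suffixes=(".html", ".htm", ".md")) -> int:
    """Rewrite matching files under root in place; return how many changed."""
    changed = 0
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        # Work on bytes so line endings and odd encodings survive untouched.
        original = path.read_bytes().decode("utf-8", errors="surrogateescape")
        updated = rewrite_vidme_links(original)
        if updated != original:
            path.write_bytes(updated.encode("utf-8", errors="surrogateescape"))
            changed += 1
    return changed
```

Something like `rewrite_tree("/var/www")` would be the whole-filesystem
version; trying it on a copy first seems wise, since it edits files in place.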
        
         | danso wrote:
         | The search results counter is 12,385...is that really all (or
         | "nearly all") of what Vid.me had for content? I mean I know it
         | failed at dethroning Youtube, but 12,000+ videos is barely
         | anything.
        
           | boomboomsubban wrote:
           | That's roughly one hour of new uploads on YouTube as of 2016.
        
             | mixmastamyk wrote:
             | Like everything, a great majority are junk and/or niche.
        
           | oxguy3 wrote:
           | It looks like each of those 12,385 items is a big 10.4GB
           | archive of thousands of videos.
        
           | tech234a wrote:
           | That is the number of megawarc items. Each one of these items
           | contains many videos.
        
         | Scoundreller wrote:
         | weird flex, but I needed to get over a threshold to 'earn' a
         | credit card promo for spending enough, and a donation here will
         | do it!
         | 
         | (It's too bad the tax deduction isn't eligible in Canada :( )
        
           | alexjplant wrote:
           | I have the opposite problem... I give money to OpenBSD but
           | can't claim it on my taxes since they're Canadian :-(.
        
             | darrylb42 wrote:
             | Are they a registered non-profit now? I donated when I used
             | OpenBSD but don't think I got a tax receipt, just my name
             | on one of the releases.
        
               | Scoundreller wrote:
               | Was hoping there was maybe opportunity for corresponding
               | donations that maximizes deductions.
               | 
               | Looks like openBSD isn't deductible either for Canadian
               | donoUrs. Theo's registered address has lots of cool wifi
               | links though. I count 4! (one is on the gutter)
               | 
               | https://www.google.com/maps/place/812+23+Ave+SE,+Calgary,
               | +AB...
        
         | ibaikov wrote:
         | Oh that's clever, I totally forgot about this. Archive is
         | amazing.
        
           | toomuchtodo wrote:
           | Not all public goods are flashy, but most are necessary.
        
         | Causality1 wrote:
         | I wish they'd been doing that when Google Video shut down. Lost
         | a lot of good content I didn't download.
        
           | tech234a wrote:
           | They did: https://wiki.archiveteam.org/index.php/Google_Video
        
           | sp332 wrote:
           | I helped download a few gigs of Google Video at the time, so
           | I know some of it is up there.
        
         | swiley wrote:
         | That really is an amazing organization.
        
       | anigbrowl wrote:
       | What's the win for the new owner, do they make money from linking
       | to the porn somehow? Sure it could also be just for lulz but
       | there are many equally or more lulzy possibilities whereas porn
       | often seems to be coupled with economic gain.
        
         | dragonwriter wrote:
         | > What's the win for the new owner, do they make money from
         | linking to the porn somehow?
         | 
         | They _are_ the porn (the new owner is a porn firm), so, yes. I
         | haven't seen the actual linked content, but I assume it's
         | something like free samples with directions telling people
         | where to get more; it's a move that gets porn ads placed for
         | free in a lot of places that would never choose to allow porn
         | ads.
        
           | anigbrowl wrote:
           | Thanks, I didn't find it clear from the article and had no
           | desire to go looking for content.
        
       | oakwhiz wrote:
       | Very amusing but I take issue with the article's claim that
       | "Here's (yet another) argument against using third-party embeds"
       | - this might be an oversimplified perspective when it's actually a
       | good argument to use subresource integrity (essentially
       | cryptographic pinning of third party embedded content). I am
       | unsure if this extends to every kind of resource that you could
       | have in a web page, (ideally it should) but I could imagine a JS
       | shim being used to cover some edge cases.
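
For reference, the value that goes into an `integrity` attribute is just the
hash algorithm name, a dash, and the base64 of the raw digest. A small sketch
of computing it (the function name is made up). As far as I know browsers
only honour `integrity` on `<script>` and `<link>` elements, so this would
not pin an `<iframe>` video embed as-is.

```python
import base64
import hashlib


def sri_value(content: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity value such as 'sha384-<base64 digest>'."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError("SRI uses sha256, sha384 or sha512")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"
```

The result is what you would paste into something like
`<script src="..." integrity="sha384-..." crossorigin="anonymous">`, after
hashing the exact bytes the third party serves.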
        
         | [deleted]
        
         | sp332 wrote:
         | Subresource integrity would prevent the porn from showing up,
         | but the videos would still be broken. Hosting the videos
         | yourself is the option that would keep things running.
        
           | [deleted]
        
         | jandrese wrote:
         | Video embeds tend to be especially fragile however since
         | they're often implemented by gluing together a relatively large
         | number of third party services. Any one of the services goes
         | down or changes the API and the whole thing breaks.
         | 
         | Ask any uMatrix user how much fun it is to get a video to play
         | on some websites and just how much external crap you have to
         | allow before you see the first frame.
        
           | S_A_P wrote:
           | That and Facebook container. I could not get slack to work
           | until I allowed slack access to Facebook because I had to re-
           | authenticate. Even though I don't use Facebook I have to
           | allow slack to access it at least to be able to authenticate
           | to a different authentication source.
        
             | jandrese wrote:
             | Don't get me started on comment section plugins that are
             | basically just Facebook threads on a webpage.
        
       | AkshitGarg wrote:
       | Genuinely curious, how does a company manage their domains after
       | they shut down their services to prevent this kind of stuff
        
         | cyberge99 wrote:
         | I would think the same way a person would posthumously: let it
         | expire and let natural order take over or delegate ownership
         | via legal/custodial means.
         | 
         | Sorry that doesn't answer your question more than "it depends
         | on the entity".
        
         | tyingq wrote:
         | You could pre-pay for X years, where X is long enough to ensure
         | you'll be dead when it expires.
         | 
         | Edit: Seems like there's a 10 year limit in many places? I
         | wonder if that's broad convention or an actual rule.
         | 
         | Okay, apparently an ICANN limit for .com domains:
         | 
         |  _" The expiration date of the domain registration is extended
         | by the number of years, up to a maximum of ten years, as
         | specified by the Registrar's requested Extend operation."_
         | 
         | https://www.icann.org/en/registry-agreements/com/com-registr...
         | 
         | Though, I can see that navy.us has an expiry in 2053, so it's
         | likely per registry.
        
           | tqkxzugoaupvwqr wrote:
           | I can at most pay for 10 years in advance. Is this a limit of
           | my registrar?
        
             | axaxs wrote:
             | Likely the registry, not registrar. When I worked at one 10
             | was the limit. Unsure if that was handed down by ICANN,
             | though.
        
             | tyingq wrote:
             | On mine, I can pay for up to 10 years, but I think I can
             | just pay twice and get 20.
             | 
             | Edit: It seems to vary. Some places cite an ICANN limit of
             | 10 years.
        
               | rootusrootus wrote:
               | That may vary by registrar. My .com domain is about 2
               | years out from renewal and the most it will let me add is
               | 8.
        
               | duskwuff wrote:
               | Most registries limit domain lifetimes to 10 years. You
               | can't renew a domain if doing so would put the lifetime
               | beyond 10 years -- the registry will refuse to perform
               | the operation.
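
The arithmetic behind "2 years out, can add at most 8", as a tiny sketch. It
assumes the cap is 10 years from today (per the .com text quoted upthread)
and that renewals are sold in whole years; other registries may differ.

```python
def max_renewal_years(years_remaining: float, cap_years: int = 10) -> int:
    """Whole years that can still be added without exceeding the cap."""
    if years_remaining < 0:
        raise ValueError("years_remaining cannot be negative")
    return max(0, int(cap_years - years_remaining))
```

So a domain with 2 years left gives `max_renewal_years(2) == 8`, and one
already at the cap gives 0, which is why keeping a name "10 years out" means
topping it up every year.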
        
               | tinus_hn wrote:
               | Clearly there are companies that offer extension as a
               | service, you pay 20 times as much and they promise to
               | extend your domain 10 times.
        
               | mytherin wrote:
               | That just moves the problem though, how likely is it that
               | _that_ company will be around in 90 years to perform the
               | last extension?
        
           | _jal wrote:
           | Network Solutions will sell you service for 100 years.
           | 
           | Edit: CSC, Markmonitor and similar shops will contract to
           | keep your name available for long periods, too, but that's a
           | bit different.
        
             | throwaway3neu94 wrote:
             | Or for the life of their company, whichever is shorter.
        
               | _jal wrote:
               | That's why I said 'sell' instead of 'provide'.
        
           | rootusrootus wrote:
           | Aren't most domains limited to 10 years max?
        
           | yupper32 wrote:
           | For me I feel like a 10 year pre-pay would be worse. I'll
           | forget about the fact that I need to pay for it again in
           | 2031.
           | 
           | Yearly at least keeps me on my toes a bit.
        
             | throwaway3neu94 wrote:
             | You still have to renew it yearly to keep the expiration 10
             | years away.
             | 
             | Also most registrars will send you an email when it's about
             | to expire. If it does get dropped, there's a grace period
             | to recover it.
        
       | jtvjan wrote:
       | I'd get it if it was Goatse, but this is just regular porn. If
       | they simply wanted to earn money from ads, serving something more
       | milquetoast would make more sense, because now there's a rush to
       | remove old embeds.
       | 
       | Why?!
        
         | vmception wrote:
         | Now everyone knows about it and vidme's ad space is more
         | valuable
         | 
         | Genius
        
         | bitwize wrote:
         | > serving something more milquetoast
         | 
         | Never gonna give you up...
        
       | TekMol wrote:
       | Domains are an area where blockchain technology could help a lot.
       | 
       | A domain could cost a fixed amount of X per month. So you could
       | pay 100 years upfront and be _sure_ to not lose it in that
       | timeframe.
       | 
       | To move a domain, the registrar _and_ the owner could have to sign
       | the move. So it would not be possible anymore to lose a domain
       | due to the registrar making a mistake.
        
         | mdoms wrote:
         | Can you explain in detail what problem blockchain solves here
         | and why it can't be solved without blockchain?
        
           | TekMol wrote:
           | The problem that someone else can move your domain.
           | 
           | The same problem it solves for finance: That someone else can
           | move your money.
        
             | crummy wrote:
             | Don't normal domains work the same - you pay a fee and you
             | get the domain for X years? In both situations the domains
             | expire and a porn site can nab them.
        
           | floren wrote:
           | Well, if it's solved via blockchain, there's an opportunity
           | for speculation among early adopters, which I think is the
           | major selling point for most blockchain "solutions"
        
         | jazzyjackson wrote:
         | "ethereum name service" is doing this and honestly aside from
         | gas fees it's not a bad price for permanent real estate (as
         | permanent as the ethereum virtual machine at least)
        
       | ignitionmonkey wrote:
       | We can't and shouldn't expect people to keep their old domains
       | forever. We need a way for pages to be signed and hyperlinks to
       | enforce authorship. When we link to stuff, we should have a way
       | to say whose stuff we're linking to. It's no different from
       | installing signed software and using trusted repositories.
       | 
       | This is one of the reasons I created a proof-of-concept web
       | extension that verifies links and pages using PGP. On a mismatch,
       | it flags the page and offers a web archive link instead.
       | 
       | https://webverify.jahed.dev/
       | 
       | It was pretty fun to make, but currently due to performance, Web
       | Extensions API doesn't provide the features to do this perfectly.
       | Firefox provides just about enough additional APIs to hack it
       | together.
        
         | ff7c11 wrote:
         | I had a personal project that I got bored with so I let the
         | domain expire. Then I got emails from former users saying that
         | the domain was now hosting malware. So yeah I would like it for
         | all the old links to the site to somehow know that the owner
         | has changed. Not sure what a reasonable solution would look
         | like though.
        
         | elihu wrote:
         | One idea that's been around for a while is to identify files by
         | their hash. That has pros and cons. The good side of that is
         | that the file is immutable; you can't accidentally link to
         | something else unless someone can manufacture a hash collision
         | somehow. The down side is that if the file is corrected in some
         | way, you don't get the fixes.
         | 
         | In a lot of the peer-to-peer distributed hash table designs,
         | all you need to retrieve a file is its hash.
         | 
         | https://en.wikipedia.org/wiki/Content_addressable_network
        
           | kazinator wrote:
           | Problem is, you have to download the entire video to check
           | the hash. That's not how video embedding works; the client
           | browser is just handed some link, and it obtains pieces of
           | the video, rendering it instantly.
           | 
           | Basically, little segments of the video have to have a
           | signature which is continuously validated. Or something like
           | that.
        
             | grapehut wrote:
             | That's not really a problem. You don't hash the entire
             | video, but do something resembling a merkle tree. i.e. look
             | at a torrent, they're identified by a hash but you can
             | download and verify a random chunk
        
               | kazinator wrote:
               | Right, merkle tree! OK, so the embedding site only stores
               | a single hash: the root one. This hashes the
               | remaining hashes. The first thing we fetch from the video
               | is those hashes and if their hash doesn't match, we
               | flag/ignore the video and refuse to play.
               | 
               | Multiple levels of the tree can be stored throughout the
               | video file. The first level after the root can be for
               | major sections, like 5 minute segments. The next levels
               | are then at the start of each 5 minute segment, giving
               | hashes for one second chunks.
               | 
               | If the root hash checks out, we get the 5-min hashes. If
               | they check out, we get the hashes for the first 5-min
               | block, and if those hashes check out, we start to play
               | the video, validating every second of it against a one
               | second hash from the 5 min block. Then we get the next
               | 5-min hash block and so on.
               | 
               | Kind of thing.
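
A simplified sketch of the scheme above: two levels instead of three (one
pinned root hash over a list of per-chunk hashes), which is a Merkle tree of
depth 2 rather than the 5-minute/1-second layout described. SHA-256, the
chunking, and all function names are assumptions for illustration; a real
player would need this wired into the container format.

```python
import hashlib
from typing import Iterable, Iterator, List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def chunk_hashes(data: bytes, chunk_size: int) -> List[bytes]:
    """Hash each fixed-size chunk of data (the last chunk may be shorter)."""
    return [sha256(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def root_hash(hashes: List[bytes]) -> bytes:
    """The single value the embedding site stores: a hash of the hash list."""
    return sha256(b"".join(hashes))


def verified_chunks(chunks: Iterable[bytes], hashes: List[bytes],
                    expected_root: bytes) -> Iterator[bytes]:
    """Yield chunks as they arrive; raise ValueError at the first mismatch."""
    # Step 1: the hash list itself must match the pinned root.
    if root_hash(hashes) != expected_root:
        raise ValueError("hash list does not match the pinned root hash")
    # Step 2: check every chunk before handing it to the player.
    seen = 0
    for index, chunk in enumerate(chunks):
        if index >= len(hashes) or sha256(chunk) != hashes[index]:
            raise ValueError(f"chunk {index} failed verification")
        seen += 1
        yield chunk
    # Step 3: a stream that stops early is also not the pinned video.
    if seen != len(hashes):
        raise ValueError("stream ended before all chunks arrived")
```

Because it is a generator, playback can start after the first chunk checks
out and stops at the exact chunk where the content diverges.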
        
             | skipants wrote:
             | I don't have expertise in video codecs or file formats, but
             | couldn't you hash the first N bytes of a stream? Stream
             | those N bytes to the client and if it matches start the
             | video, else stop the download and not start the video.
        
               | AgentME wrote:
               | Merkle trees basically accomplish this with separate
               | hashes over every N bytes, so that the content can be
               | verified continuously as it's downloaded.
        
               | jodrellblank wrote:
               | Presumably for SHA256 you only need to hash ~256 bits;
               | what's anyone gonna do, try all possible combinations to
               | find a collision?
        
               | kazinator wrote:
               | They will keep the first few seconds or minutes of the
               | original video, bit-exact, and then switch to porn. The
               | player needs to validate every section.
        
               | tomjakubowski wrote:
               | If you only hash/check the first N bytes of the video
               | stream, the remainder of the video could be anything.
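
To make that concrete, a tiny sketch of a prefix-only hash (the function name
and the byte strings are made up): two files that share their first N bytes
get the same prefix hash even though everything after differs.

```python
import hashlib


def prefix_hash(data: bytes, n: int) -> str:
    """SHA-256 over only the first n bytes of data."""
    return hashlib.sha256(data[:n]).hexdigest()


original = b"INTRO" * 100 + b"the video you meant to embed"
swapped = original[:500] + b"something else entirely"

# Same first 500 bytes, so the prefix check passes for both files...
same_prefix = prefix_hash(original, 500) == prefix_hash(swapped, 500)
# ...while a hash over the whole file tells them apart.
same_full = (hashlib.sha256(original).hexdigest()
             == hashlib.sha256(swapped).hexdigest())
```

Here `same_prefix` is True and `same_full` is False.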
        
               | giantrobot wrote:
               | This has a number of problems.
               | 
               | The most egregious is if I'm an attacker and I have the
               | file you request I can hash the appropriate portion you'd
               | use to verify it but fill the rest with junk or exploits.
               | You'd receive the file, it would emit the correct hash,
               | yet be not what you were expecting.
               | 
               | For video especially what you receive isn't necessarily
               | predictable by the client. With HLS or MPEG DASH
               | streaming the video you receive could be one of a number
               | of different encodings, e.g. lower or higher bitrates to
               | deal with changing network conditions. The actual
               | m3u8/mpd file you might receive could change arbitrarily
               | as the video provider adds or drops different encodings.
               | The hash of such a file today isn't guaranteed to match
               | the hash tomorrow for entirely banal non-malicious
               | reasons.
               | 
               | Fun fact: the UUHash algorithm used by the FastTrack
               | network (Kazaa, Morpheus, etc) only hashed the first bit
               | of a file. Hashing a large file took forever on hardware
               | of the day. Even hashing small files was non-trivial. The
               | RIAA through various fronts would insert spoofed files
               | where the first portion of the file was legitimate but
               | the content of the file would be junk or annoying sounds.
               | The files would be named like any other MP3 someone was
               | searching for and even have seemingly good ID3 tags.
        
             | nextlevelwizard wrote:
             | Assuming there is no malicious intent behind the embeds you
             | could have the hash in the header of the video.
        
               | kazinator wrote:
               | I think that in the context of the topic of this
               | submission "replacement by porn" is being regarded as
               | malicious intent.
        
           | ohyeshedid wrote:
           | Sounds like something subresource integrity[1] could be
           | expanded to include.
           | 
           | [1]: https://developer.mozilla.org/en-US/docs/Web/Security/Subres...
        
         | beambot wrote:
         | One solution is to have a unique URI per file that is
         | independent of DNS. Decentralized file storage (e.g. FileCoin /
         | IPFS) might serve this purpose...
        
           | ignitionmonkey wrote:
           | Definitely but I think we need something that will work with
           | the web we currently have while these bigger ideas are
           | fleshed out and adopted.
           | 
           | Also, while IPNS covers the issue of linking to dynamic
           | content, it's worth mentioning IPFS will have similar issues
           | with DNS as DNSLink and similar domain-driven solutions are
           | used to cover its usability issues (long, random URIs).
        
             | AgentME wrote:
             | IPFS works pretty well in the style of progressive-
             | enhancement with the existing web for static content
             | specifically. If you want to link to a resource that's
             | available through IPFS, then you make the link point to an
             | IPFS gateway that you trust and expect to stay online
             | (possibly your own on your own domain), like
             | https://example.com/ipfs/Qm_IPFS_HASH_HERE/. Anyone that
             | has a browser with direct IPFS support (either because
             | they're using an IPFS extension or they're using a browser
             | with built-in support, which might get more popular if IPFS
             | takes off) will have their browser recognize the URL format
             | and just fetch the file by hash directly from IPFS, and it
             | won't matter if example.com is still up and serving the
             | file. For everyone else, the link will work as long as
             | example.com is still up and acting as an IPFS gateway. If
             | example.com ever goes down, then users can make the link
             | work by installing an IPFS extension or manually replacing
             | the example.com domain in the link with the domain of any
             | still-active IPFS gateway, and any admin in control of the
             | page could fix the link similarly.
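
The "manually replace the domain" step is mechanical enough to script. A
minimal sketch, assuming path-style gateway links of the form
`https://<gateway>/ipfs/<hash>/...` as in the example above;
`gateway.example.org` is a placeholder, not a real gateway.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_gateway(url: str, new_host: str) -> str:
    """Point a path-style IPFS gateway URL at a different gateway host."""
    parts = urlsplit(url)
    if not parts.path.startswith("/ipfs/"):
        raise ValueError("not a path-style IPFS gateway URL")
    # Only the host changes; the /ipfs/<hash>/... part identifies the content.
    return urlunsplit(parts._replace(netloc=new_host))
```

For example, `swap_gateway("https://example.com/ipfs/Qm_IPFS_HASH_HERE/",
"gateway.example.org")` keeps the hash path and only changes the host.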
        
             | jeroenhd wrote:
             | I think long, random URIs are fine for embedded content,
             | actually. Most embeds are short, random URIs prepended with
             | a trendy domain name.
             | 
             | If you could "permalink" certain content for embeds, that'd
             | probably solve the issues, right?
        
         | alisonkisk wrote:
         | Also, Signed Exchanges for CDNs.
         | 
         | https://developers.google.com/web/updates/2018/11/signed-exc...
        
         | LinuxBender wrote:
         | Could SRI be used for this? [1]
         | 
         | [1] - https://developer.mozilla.org/en-US/docs/Web/Security/Subres...
        
           | ignitionmonkey wrote:
           | SRI/hashing works for static content. Though it's worth
           | mentioning it's a SUB-resource feature (images, scripts,
           | etc.). It doesn't work for hyperlinks to other pages. Even if
           | it did, it's a different use case.
           | 
           | Say I link to an article by Author A that has comments in it
           | (or even a footer, relative timestamp, sidebar, etc.).
           | Hashing won't work as the page is always changing. I want the
           | link to always go to Author A but I don't care if the content
           | changes. That's the sort of use case signed webpages and
           | hyperlinks with enforced authorship covers. It's less about
           | what's on the page and more about who created it.
        
             | LinuxBender wrote:
             | Good points. I would guess that for something to be
             | implemented, it would have to be easy for browsers and API
             | tools to check once per domain and cache the response and
             | should probably be something that already exists and has
             | been adopted. Maybe a page could have a meta tag or header
             | that contains a hash of the destination site's DANE
             | signature? Something like _" targetref:somedomain.tld
             | expectsig:39726a2fe2bb052cf00e6b95a8385f7"_ based on tools
             | like danecheck [1] or maybe DNSSEC but that is very poorly
             | adopted.
             | 
             | [1] - https://github.com/vdukhovni/danecheck
        
               | unilynx wrote:
               | Finding a way to embed the domain registration date might
               | be sufficient, that would cover most of the expiry
               | situation
               | 
               | There was a ietf or similar registry that used your
               | domain and registration date to carve out your namespace,
               | ie dns.2021.07.26.com.example would be your prefix.
               | Pretty robust. Can't remember what it was anymore
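
Just to illustrate the string format quoted above (not whatever registry it
came from, which isn't identified here): the prefix is a fixed label, the
registration date, then the domain labels reversed. The function name and the
`dns` default are taken from the example or made up.

```python
from datetime import date


def dated_namespace(domain: str, registered: date, prefix: str = "dns") -> str:
    """Build e.g. 'dns.2021.07.26.com.example' from example.com and a date."""
    labels = domain.lower().strip(".").split(".")
    return ".".join([prefix, registered.strftime("%Y.%m.%d"), *reversed(labels)])
```

A new owner who registers the same name later would get a different date and
therefore a different prefix, which is the property that matters for expiry.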
        
               | ignitionmonkey wrote:
               | The solution I was going for with WebVerify is more web-
               | centric rather than domain-driven, which I think is a
               | better fit for webpages. It can be enforced at the
               | hyperlink-level for shared domains (like GitHub Pages,
               | University web spaces) and works for static resources
               | without needing to configure external resources. The only
               | really complicated part is PGP but that can be solved
               | with better tooling (as seen with Keybase).
        
       | paulpauper wrote:
       | Or 3rd party JS libraries. Instead of https://code.jquery.com/,
       | your admin 'accidentally' added https://code.jqeury.com/ which
       | embeds a crypto miner
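
A typo like that is the kind of thing a build-time lint could flag. A rough
sketch using the standard library's `difflib`; the allowlist contents, the
0.85 cutoff, and the function name are assumptions, and it only catches hosts
that are near-misses of ones you already trust.

```python
import difflib
from typing import Optional, Sequence

# Hypothetical allowlist of script hosts this site intends to load from.
TRUSTED_SCRIPT_HOSTS = ("code.jquery.com",)


def lookalike_of(host: str,
                 trusted: Sequence[str] = TRUSTED_SCRIPT_HOSTS,
                 cutoff: float = 0.85) -> Optional[str]:
    """Return the trusted host that `host` closely resembles, else None."""
    host = host.lower()
    if host in trusted:
        return None  # exact match: nothing suspicious
    matches = difflib.get_close_matches(host, list(trusted), n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With this, `lookalike_of("code.jqeury.com")` points back at
`code.jquery.com`, while an unrelated host returns None. Pinning the script
with an integrity hash would catch the same mistake at load time.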
        
       | mindslight wrote:
       | There were non-porn videos on the original vid.me? TIL.
        
       | dredmorbius wrote:
       | The trend toward embeds rather than screenshots or direct copying
       | has long struck me as ill-conceived.
       | 
       | At least it's only pr0n. As a vector for malware / spyware
       | injection, this could be even more interesting.
       | 
       | Relevant xkcd, of course: https://xkcd.com/1698/
       | 
       | h/t Elda King @ Mastodon
       | https://weirder.earth/@eldaking/106626603001624730
        
       | edoceo wrote:
       | One of the services I've sold more than once was to handle the
       | "offlining" of a domain. Basically provide a 307/404/410 service
       | and make sure it works for a long time before the name gets
       | released. Basically to help clean up on the way out.
        
       | nanis wrote:
       | I guess I submitted too early :-)
       | https://news.ycombinator.com/item?id=27925605
        
       ___________________________________________________________________
       (page generated 2021-07-23 23:00 UTC)