[HN Gopher] Amazon Cloud Traffic Is Suffocating Fedora's Mirrors
       ___________________________________________________________________
        
       Amazon Cloud Traffic Is Suffocating Fedora's Mirrors
        
       Author : heywire
       Score  : 88 points
       Date   : 2024-05-29 20:12 UTC (2 hours ago)
        
 (HTM) web link (www.phoronix.com)
 (TXT) w3m dump (www.phoronix.com)
        
       | INTPenis wrote:
       | I think DigitalOcean has their own package mirror for their
       | image.
       | 
       | AWS is just being ignorant. If I were in charge of Fedora
       | infrastructure I'd block them and send them instructions on how
       | to set up a mirror.
        
         | kbolino wrote:
         | Though there are probably things AWS could do anyway, this
         | could well be caused by a large customer using a custom AMI,
         | and not because of anything Amazon did or didn't do.
        
         | hinkley wrote:
         | Escalating delays can help with this. Get it to be slow enough
         | that people notice.
         | 
         | XML schemas have had a similar history, tanking w3c.org
         | servers.
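         | 
         | A minimal sketch of the escalating-delay idea, assuming a
         | per-client request counter; the class name, quota and delay
         | values are made up for illustration and are not anything
         | Fedora or the W3C is known to run:

```python
from collections import defaultdict


class EscalatingDelay:
    """Hypothetical per-client throttle: free up to a quota, then each
    extra request waits twice as long as the previous one, up to a cap."""

    def __init__(self, free_requests=100, base_delay=0.5, max_delay=60.0):
        self.free_requests = free_requests
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.counts = defaultdict(int)

    def delay_for(self, client):
        """Record one request from `client` and return the delay in seconds."""
        self.counts[client] += 1
        excess = self.counts[client] - self.free_requests
        if excess <= 0:
            return 0.0
        # Cap the exponent so a very busy client cannot overflow the float.
        exponent = min(excess - 1, 32)
        return min(self.base_delay * (2 ** exponent), self.max_delay)


throttle = EscalatingDelay(free_requests=2, base_delay=1.0, max_delay=8.0)
delays = [throttle.delay_for("203.0.113.7") for _ in range(7)]
print(delays)
```

         | A real server would also reset the counters per time window
         | and key on something sturdier than a single address.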
        
           | recursive wrote:
           | They totally deserved it for making namespaces that "just
           | happen to be" URLs. XML is insane.
        
             | kevindamm wrote:
             | well, but they're URIs. See, the difference is right there.
             | An identifier not a location. Nobody should ever confuse
             | the two!
             | 
             | /s of course
        
             | IshKebab wrote:
             | Why even are they URLs? The only reasonable suggestion I
             | could find is that it was part of an abandoned or poorly
             | adopted idea to also host the schema at that URL.
        
               | TillE wrote:
               | Probably the same sort of thought process that led to the
               | convention of Java packages being named
               | com.example.whatever. It identifies the creator and gives
               | you some structure to create a unique identifier.
               | 
               | Lot of half-baked ideas floating around in the early
               | years of the commercial internet, but the Java thing held
               | up better.
        
               | agwa wrote:
               | It's a way to ensure global uniqueness. In the end
               | they're just compared byte-for-byte as strings.
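               | 
               | A small sketch of that, using Python's standard
               | xml.etree.ElementTree; the example.com namespace and
               | the helper name are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Two documents whose namespace URIs differ only in host-name case.
# As URLs they would point at the same host; as namespace names they differ.
DOC_A = '<r xmlns="http://example.com/ns"/>'
DOC_B = '<r xmlns="http://EXAMPLE.com/ns"/>'


def namespace_of(xml_text):
    """Return the namespace name of the root element as a plain string."""
    tag = ET.fromstring(xml_text).tag  # looks like '{namespace}localname'
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


# Parsing never fetches the URI; the namespace is only ever a string.
print(namespace_of(DOC_A))
print(namespace_of(DOC_A) == namespace_of(DOC_B))
```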
        
             | rodgerd wrote:
             | Ah I remember the good old days of strict XML parsers that
             | would fail if they didn't have Internet access to pull the
             | schema in.
        
             | agwa wrote:
             | Schemas != namespaces.
             | 
             | I'm sure some braindead software out there attempts to
             | retrieve namespace URIs, but it would surely be a drop in
             | the bucket compared to traffic for schemas/DTDs (which are
             | intended to be retrieved).
        
         | jcrawfordor wrote:
         | The fact that this is EPEL strongly suggests that it was set up
         | by an AWS user, not by Amazon themselves. EPEL is not used by
         | default in any common AWS AMIs. Perhaps it is an Amazon Linux
         | user who enabled EPEL via Amazon's package, but it's not
         | supported in the most recent version of AL so Amazon seems to
         | have addressed that issue anyway.
        
           | Havoc wrote:
           | >that it was set up by an AWS user,
           | 
           | A user with "five million additional systems" on AWS?
        
             | vesrah wrote:
             | A long time ago I knew a guy that uploaded a Counter-Strike
             | patch to his ISP personal hosting and ended up on the
             | official mirror list. Ended up taking down the ISP iirc.
        
             | Reason077 wrote:
             | > _"A user with "five million additional systems" on AWS?"_
             | 
             | Someone is going to be in for a big surprise when they get
             | their AWS bill this month and realise there's an infinite-
             | loop bug in their instance spawning script.
        
               | facialwipe wrote:
               | It's clearly either a large contract that would have been
               | negotiated before any instances were spun up, or Amazon
               | themselves.
        
             | stonogo wrote:
             | I don't think it's one user; I think it's a ton of them.
             | Want to use Let's Encrypt in your OpenShift-on-AWS
             | deployment? certbot's in EPEL, along with a lot of other
             | quality-of-life stuff for log-shipping, monitoring, etc.
        
             | briffle wrote:
             | That is some massive AI training!
        
         | blitzar wrote:
         | Nahh that seems needlessly cruel they should continue to serve
         | them at 100k speed.
        
         | ajross wrote:
         | Surely AWS knows how to set up a mirror. It's just a mistake,
         | they'll surely correct it. Also simply blogging about it (which
         | gets amplified by Phoronix, then by HN) is a better strategy
         | for getting their attention than blocking.
        
           | INTPenis wrote:
           | I worked in ops for 20+ years.
           | 
           | If someone blocks you it becomes an incident, a post mortem
           | and you learn your lesson.
           | 
           | If someone blogs about it, or e-mails you, it gets added to a
           | todo list and might get fixed in a few weeks by a
           | disinterested intern.
        
         | skywhopper wrote:
         | I doubt Amazon builds the Fedora images. So if they're pointed
         | to the wrong place, that's not AWS's fault.
        
       | xd1936 wrote:
       | I wish apt, dnf/rpm, flatpak, etc utilized a decentralized
       | distribution option, like IPFS or BEP46 Mutable Torrents. It
       | would be neat if the project leads seeded new package update
       | hashes, volunteers ran seedboxes instead of http mirrors, and
       | clients had (default-on?) seeding of package binaries in addition
       | to only downloading. It would be neat to see the open source
       | community contributing to support each other's experience.
        
         | spookie wrote:
         | At the end of the day, there's a trust issue. The way
         | distribution works nowadays mitigates a lot of that. Making
         | it decentralised would be a step backwards.
        
           | ciupicri wrote:
           | Aren't the RPMs signed?
        
           | BSDobelix wrote:
           | You know that packages are signed, right? That's why anyone
           | can be a LibreOffice or Arch Linux mirror...or Fedora?
        
             | aaomidi wrote:
             | And because of this, these mirrors are often non-HTTPS so
             | your ISP can actually intercept and provide the data to you
             | directly.
             | 
             | This is why torrents would actually work fine in this
             | model.
        
               | Cyphase wrote:
               | Where intercept usually means point DNS for the mirror
               | domains to the ISP's local mirror, and ISP could mean
               | your cloud provider.
        
               | BSDobelix wrote:
               | >ISP can actually intercept and provide the data to you
               | directly
               | 
               | What a nice gesture! However my mirror-servers are https
               | plus rsync (for some projects). But are there not some
               | ISPs that block torrent-traffic completely?
        
           | Cyphase wrote:
           | Packages can be and generally are signed and verified.
        
         | candiddevmike wrote:
         | This would make it so much easier for end users to cache/host a
         | mirror than the myriad of tools and hacks like squid.
         | 
         | I'm sure this will happen once the Linux community standardizes
         | on a package format /s.
        
         | j-bos wrote:
         | Why aren't packages based on torrents? Most distros come with a
         | torrent client.
        
           | utensil4778 wrote:
           | Honestly, default behavior should be to _at least_ share
           | package files on the local network. But sharing to the wider
           | internet should be fairly trivial in <current year>, we have
           | no shortage of technologies that accomplish this.
           | 
           | I wish more systems had this kind of feature. I have a fat
           | fiber connection, I'd be thrilled to pop up an unofficial
           | mirror for something like a Linux distro. I've tried
           | mirroring Linux ISO torrents, but it seems almost nobody ever
           | downloads from a torrent, so I end up never actually
           | uploading any of these images.
        
           | Joe_Cool wrote:
           | Torrents don't handle frequent updates very well. You'd have
           | tons of outdated torrents floating around. They work well for
           | archives and installation media, though, since those don't
           | change that often.
        
         | Joe_Cool wrote:
         | You could update Arch Linux with pacman over ipfs from around
         | 2015 until last year. https://github.com/ipfs/notes/issues/84
         | 
         | Then the mirror became too slow and couldn't handle the amount
         | of package data and was shut down:
         | https://github.com/RubenKelevra/pacman.store
         | 
         | It worked rather well and even automatically mirrored the
         | packages on my LAN. Maybe it'll be back some day.
        
       | renewiltord wrote:
       | Checks out. The normal stuff is mirrored but not EPEL
       | https://repost.aws/knowledge-center/ec2-enable-epel
        
       | steelframe wrote:
       | Something irks me about volunteers spending real money to support
       | all the OSS freeloading businesses. I'm talking about companies
       | with a market cap in the $Billions. Almost none of them can be
       | bothered to kick back even a modicum of financial support to the
       | authors of the software that runs their business, and to add
       | insult to injury, they in fact soak the members of the community who
       | distribute the binaries for their bandwidth.
        
         | BSDobelix wrote:
         | How about just supporting community distributions? ...and no,
         | Fedora is none of them.
         | 
         | Just stop supporting brands and corporations if you don't get
         | paid?
        
         | mardifoufs wrote:
         | Does Red Hat pay for Fedora's infra? Genuine question and I'm
         | not saying it's better either. It's just that it would be weird
         | for Fedora to operate on volunteer/donated infra when it's
         | quite important to Red Hat, considering it's the "upstream"
         | (not sure if that's the correct term here) of RHEL.
        
       | skissane wrote:
       | Suppose you have an Artifactory server that mirrors/caches a lot
       | of public stuff so one is (hopefully) a good citizen and doesn't
       | spam public mirrors with constant requests for the same thing.
       | 
       | But every tool has its own config to set to use the Artifactory.
       | One setting for the OS package manager (which is different for
       | different Linux distributions), another for PyPI, another for NPM
       | (or Yarn or whatever), another for Maven/Gradle, something else
       | for Go, then I need to download this Postgres extension and build
       | it from source - the list goes on. So almost inevitably something
       | gets missed and one ends up not being as good a citizen as one
       | ought, and then one day some random Jenkins job is failing
       | because some external dependency could not be downloaded.
       | 
       | I wish there was an easier way. Like some standard mechanism for
       | saying "for this URL use this proxy".
       | 
       | I guess one could just use a proxy server (http_proxy environment
       | variable) but with most things on HTTPS it needs to MITM the TLS
       | which then means you need that certificate installed in the build
       | process - which is another one of those "everything can do it but
       | everything does it differently" problems. And in any event, MITM
       | is a bad smell.
        
         | azornathogron wrote:
         | Proxy auto config (PAC) supports specifying different proxies
         | for different URLs. Unfortunately, a PAC file is just a file
         | that contains a JavaScript function to pick the proxy, so
         | they're crazily over-powered for the task, and support for them
         | isn't very broad. Browsers support them, but I guess most
         | command line tools wouldn't.
         | 
         | https://en.m.wikipedia.org/wiki/Proxy_auto-config
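         | 
         | Real PAC files are JavaScript, so this is only a rough Python
         | sketch of the same "pick a proxy per URL" logic. The return
         | strings follow the PAC convention ("PROXY host:port" or
         | "DIRECT"); the rule table and the cache.internal.example
         | host are hypothetical:

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical rules: host pattern -> proxy to use (first match wins).
PROXY_RULES = [
    ("*.fedoraproject.org", "PROXY cache.internal.example:3128"),
    ("pypi.org", "PROXY cache.internal.example:3128"),
    ("registry.npmjs.org", "PROXY cache.internal.example:3128"),
]


def find_proxy_for_url(url, rules=PROXY_RULES):
    """Return a PAC-style proxy string for `url`, or "DIRECT" if no rule matches."""
    host = urlsplit(url).hostname or ""  # hostname is already lowercased
    for pattern, proxy in rules:
        if fnmatchcase(host, pattern):
            return proxy
    return "DIRECT"


print(find_proxy_for_url("https://mirrors.fedoraproject.org/metalink?repo=epel-9"))
print(find_proxy_for_url("https://example.org/some/file"))
```

         | Each build tool would still need to be taught to call
         | something like this, which is the hard part.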
        
       ___________________________________________________________________
       (page generated 2024-05-29 23:01 UTC)