[HN Gopher] Playing Battleships over BGP (2018)
       ___________________________________________________________________
        
       Playing Battleships over BGP (2018)
        
       Author : tcard
       Score  : 117 points
       Date   : 2021-10-05 10:12 UTC (12 hours ago)
        
 (HTM) web link (blog.benjojo.co.uk)
 (TXT) w3m dump (blog.benjojo.co.uk)
        
       | dt3ft wrote:
       | So this is why Facebook went down, eh? ;)
        
       | efitz wrote:
       | This was very cool and also IMO very irresponsible.
        
       | benjojo12 wrote:
       | Author of the post here, Ask me almost anything I guess(?)
        
         | bmsleight_ wrote:
         | Considering events yesterday, how do you test non-live?
        
           | tg180 wrote:
           | Maybe dn42.eu?
           | 
           | > Experiment with routing technology
           | 
           | > Participating in dn42 is primarily useful for learning
           | routing technologies such as BGP, using a reasonably large
           | network (> 1500 AS, > 1700 prefixes).
           | 
           | > Since dn42 is very similar to the Internet, it can be used
           | as a hands-on testing ground for new ideas, or simply to
           | learn real networking stuff that you probably can't do on the
           | Internet (BGP multihoming, transit). The biggest advantage
           | when compared to the Internet: if you break something in the
           | network, you won't have any big network operator yelling
           | angrily at you.
        
           | benjojo12 wrote:
           | Who said I tested non-live?
           | 
           | The actual beta builds/sanity checks were done just with two
           | VMs peered with each other, but the live internet one was
           | done in one take (and never again, at least by me)
        
             | INTPenis wrote:
             | So how should these problems be mitigated? Have separate
             | infrastructure for critical services or staging BGP or
             | what?
        
               | midasuni wrote:
               | It seems that the main problem Facebook had in restoring
               | devices was the lack of a completely separate out-of-band
               | management network.
               | 
               | If my network (way smaller than FB, but budget way lower)
               | goes, I can get in via another ISP and WireGuard into the
               | OOB network which is completely separate to the inband
               | management.
               | 
               | Not every access switch is on OOB, but the core ones and
               | a few critical devices are.
        
             | benjojo12 wrote:
             | To add on, BGP has a very much "meme" status of being scary
             | and dangerous, and any touching will break youtube etc.
             | [Mostly perpetuated by infosec circles]
             | 
             | It's really not the 2000's anymore, BGP is mostly safe and
             | filtered. There are still improvements to be made (I've
             | even written on the blog about them), but one person's
             | immense fuck ups are far less likely to cause issues now
             | that IRR filters and prefix limits exist.
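
As a rough illustration of the two safeguards named above, the sketch below models a single BGP peer whose announcements are accepted only if they fall inside an allow-list built from IRR data, and whose session is dropped once it exceeds a prefix limit. The class name `FilteredSession`, its fields, and the example prefixes are hypothetical; real routers express this in their policy configuration rather than in Python.

```python
import ipaddress


class FilteredSession:
    """Toy model of one BGP peer with an IRR-derived filter and a prefix limit."""

    def __init__(self, irr_prefixes, max_prefixes):
        # Prefixes the peer has registered in an IRR (assumed to be given).
        self.allowed = [ipaddress.ip_network(p) for p in irr_prefixes]
        self.max_prefixes = max_prefixes
        self.accepted = set()
        self.up = True

    def _irr_permits(self, net):
        # Accept a prefix only if it is equal to, or inside, a registered one.
        return any(
            net.version == a.version and net.subnet_of(a) for a in self.allowed
        )

    def announce(self, prefix):
        if not self.up:
            return "session-down"
        net = ipaddress.ip_network(prefix)
        if not self._irr_permits(net):
            return "rejected"
        if net not in self.accepted and len(self.accepted) >= self.max_prefixes:
            # Prefix limit exceeded: drop the session and everything learned on it.
            self.up = False
            self.accepted.clear()
            return "session-down"
        self.accepted.add(net)
        return "accepted"


session = FilteredSession(["192.0.2.0/24", "198.51.100.0/24"], max_prefixes=2)
results = [
    session.announce("192.0.2.0/24"),       # registered prefix
    session.announce("203.0.113.0/24"),     # not registered by this peer
    session.announce("198.51.100.0/25"),    # more specific of a registered prefix
    session.announce("198.51.100.128/25"),  # third accepted prefix, over the limit
]
print(results)
```

The point of the sketch is that a mistaken or hostile announcement is either dropped at the filter or, if a peer suddenly leaks many routes, contained by tearing down that one session.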
        
               | JadeNB wrote:
               | > It's really not the 2000's anymore, BGP is mostly safe
               | and filtered. There are still improvements to be made
               | (I've even written on the blog about them), but one
               | person's immense fuck ups are far less likely to cause
               | issues now that IRR filters and prefix limits exist.
               | 
               | Any non-maliciously designed protocol probably _can_ be
               | used safely, but surely yesterday's events show that it
               | is still eminently possible to use BGP dangerously?
        
               | HideousKojima wrote:
               | What happened yesterday was (appears to be) Facebook
               | screwing up their own routing and DNS, not anyone else's.
               | They didn't take down routing for any IPs and domains
               | they didn't own. I can't imagine any other protocol
               | making a mistake like FB's impossible
        
               | benjojo12 wrote:
               | What part of yesterday was showing that it was possible
               | to use BGP dangerously?
               | 
               | If you are certain in this argument, then your master
               | electric switch is dangerous because you could switch off
               | the power to your house.
        
               | JadeNB wrote:
               | > If you are certain in this argument, then your master
               | electric switch is dangerous because you could switch off
               | the power to your house.
               | 
               | This seems like a response to an argument I haven't made
               | yet! (All else aside, if you prebut my argument, it
               | allows me not to make that argument.)
               | 
               | Sure, it's possible to do dangerous things with BGP; that
               | alone is not why I say it's possible to use it
               | dangerously. What _is_ dangerous is the fact that a small
               | and apparently innocent change can have such far-reaching
               | consequences--for example, I'll bet there was no serious
               | consideration at Facebook of not being able to open
               | electronic door locks in the case of an apparently
               | innocent BGP update.
               | 
               | I don't consider my master electric switch dangerous
               | because I could switch off the power to my house. I
               | _would_ consider it dangerous if, after switching off the
               | power to my house, I was ejected from my house, and could
               | no longer open the doors of my house to get in and switch
               | the power back on.
        
               | HideousKojima wrote:
               | >for example, I'll bet there was no serious consideration
               | at Facebook of not being able to open electronic door
               | locks in the case of an apparently innocent BGP update.
               | 
               | If that was actually the case, a lot of heads at FB
               | should roll over this. The logic is simple and obvious,
               | and if the sysadmins and network admins didn't think
               | about this line of thinking then they're overpaid:
               | 
               | 1) Our door control system is accessed via a public
               | IP/address, not via an internal/private address.
               | 
               | 2) Accessing our public IPs/addresses is dependent on BGP
               | and DNS not getting borked.
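
The two-step reasoning above can be sketched as a small dependency walk: given which systems depend on which, find everything that goes down when BGP does. All system names in `depends_on` are hypothetical and are not a description of Facebook's actual setup; the model also assumes a system fails as soon as any one of its dependencies fails.

```python
def affected_by(failed, depends_on):
    """Return every system that directly or transitively depends on a failed one."""
    down = set(failed)
    changed = True
    while changed:
        changed = False
        for system, deps in depends_on.items():
            # A system goes down if any of its dependencies is down.
            if system not in down and down.intersection(deps):
                down.add(system)
                changed = True
    return down - set(failed)


# Hypothetical dependency map, for illustration only.
depends_on = {
    "public_ip_reachability": ["bgp"],
    "name_resolution": ["dns", "public_ip_reachability"],
    "door_control": ["public_ip_reachability", "name_resolution"],
    "oob_console": [],  # out-of-band access with no in-band dependencies
}

lost = affected_by({"bgp"}, depends_on)
print(sorted(lost))
```

In this toy model the door control system is lost along with BGP, while the out-of-band console, which depends on nothing in-band, stays reachable.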
        
               | JadeNB wrote:
               | > If that was actually the case, a lot of heads at FB
               | should roll over this. The logic is simple and obvious,
               | and if the sysadmins and network admins didn't think
               | about this line of thinking then they're overpaid:
               | 
               | You say "if" as if it's a conditional, but surely the
               | fact that it happened proves that no-one considered it
               | (or, I suppose, that whoever did consider it didn't have
               | enough sway to stop it from happening). There are,
               | rightly, so many laws and regulations requiring that safe
               | egress in case of emergency not be prevented, and I can't
               | imagine anyone actually considering and tolerating even
               | the slightest risk of an _Internet issue_ preventing that
               | egress. Well, I guess I can imagine lots of people doing
               | lots of awful and harmful things, but I can't imagine
               | anyone doing it in such a way that it would be this easy
               | to get caught doing something blatantly illegal.
               | 
               | (Or were there safety measures in place that allowed
               | egress, just not entry? I don't know the specifics, since
               | my source is just the news stories that mention that the
               | door locks didn't work and people couldn't get in--but
               | maybe they could still get out?)
        
         | scratchadams wrote:
         | no question, just a selfish request for more blog posts please
        
         | kjrose wrote:
         | If you could replace BGP globally instantly with no problems.
         | What would you replace it with?
        
           | pyvpx wrote:
           | and keep IPv[4|6]?
        
             | moffkalast wrote:
             | IPv9 is where it's at.
        
           | benjojo12 wrote:
           | (Keeping in mind that replacing BGP is about as hard as
           | replacing SMTP, and thus might not be worth it)
           | 
           | Honestly, the issue that exists with BGP is not the protocol.
           | The issue is one of trust, and that is not a problem a
           | different protocol would instantly fix.
           | 
           | One issue with the internet as a whole is that seemingly
           | simple questions are actually hard. The one that is slowly
           | being fixed with RPKI is "Who actually owns this IP
           | address?" Knowing this, we can build better filters against
           | direct (origin AS != owner AS) hijacks.
           | 
           | However, the next question, which has no solution, is
           | "Who is allowed to carry this route/transit this data?" --
           | this is going to be unbelievably hard to solve with
           | certainty. There is a question of whether a PKI solution
           | (BGPSEC) could be deployed. However, you will also hit the
           | next issue.
           | 
           | The BGP table is massive: 1M+ routes that are stored on
           | machines with reasonably long lifetimes. It does not help
           | that, in terms of computing power, these machines are in
           | general very slow. A multi-Tbit/s router may only have a
           | 2014-era laptop CPU powering it. So computing anything 1M
           | times quickly is a massive ask, and when links go down it is
           | reasonable to expect fast recompute/reconvergence times.
           | 
           | Fixing BGP is not an easy issue. Anyone who tells you
           | otherwise is either fraudulent or does not understand the
           | sheer scale/scope of the issues attached to the protocol.
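
To make the "origin AS != owner AS" check concrete, here is a minimal sketch of RPKI route origin validation in the style of RFC 6811. The `ROA` tuple and `validate_origin` function are illustrative names, and the prefixes and AS numbers come from documentation ranges; a real validator works from cryptographically verified ROA data, which this sketch simply assumes as input.

```python
import ipaddress
from typing import NamedTuple


class ROA(NamedTuple):
    """One validated ROA payload: who may originate which prefix."""
    prefix: str
    max_length: int
    asn: int


def validate_origin(route_prefix, origin_asn, roas):
    """Classify a route as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue  # this ROA says nothing about the route
        covered = True
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by some ROA but matched by none: likely a hijack or a leak.
    return "invalid" if covered else "not-found"


roas = [ROA("192.0.2.0/24", 24, 64496)]
print(validate_origin("192.0.2.0/24", 64496, roas))     # owner announces its prefix
print(validate_origin("192.0.2.0/24", 64511, roas))     # different origin AS
print(validate_origin("192.0.2.0/25", 64496, roas))     # longer than max_length
print(validate_origin("198.51.100.0/24", 64496, roas))  # no ROA covers it
```

Note that this only checks the origin of a route. It says nothing about which networks are allowed to carry it, which is the harder question raised above.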
        
             | makeworld wrote:
             | Have you seen Yggdrasil? It provides an alternate routing
             | idea, among other things.
             | 
             | https://yggdrasil-network.github.io/
        
             | convolvatron wrote:
             | it is if you relax the constraint that the providers keep
             | the legacy allocations and can advertise whatever the hell
             | they want
             | 
             | Steve Deering had a really nice proposal on geographic
             | addressing that would make pki sufficiently performant by
             | using hierarchical assignments
        
       | throw0101a wrote:
       | > _For a protocol that was produced on two napkins in 1989_ [...]
       | 
       | I'm not sure I'd want to deal with a protocol that _can't_ be
       | explained on a napkin or two. UTF-8 was designed on a diner
       | placemat:
       | 
       | * https://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt
        
       ___________________________________________________________________
       (page generated 2021-10-05 23:01 UTC)