[HN Gopher] Why is IRC distributed across multiple servers?
       ___________________________________________________________________
        
       Why is IRC distributed across multiple servers?
        
       Author : rain1
       Score  : 120 points
       Date   : 2021-09-12 10:56 UTC (12 hours ago)
        
 (HTM) web link (gist.github.com)
 (TXT) w3m dump (gist.github.com)
        
       | throwthere wrote:
        | I don't know if the numbers are realistic here. First and most
        | importantly, messages are only sent to clients in the same
        | chatroom, not server-wide. Second, 10% of users are only very
        | rarely going to send messages at once; by rarely you can
        | probably substitute never. Third, these are simple, very small
        | text messages where seconds of lag don't really matter -- why
        | would it be hard to manage tens of thousands of concurrent
        | connections? WhatsApp crushed millions of connections on a
        | single server back in 2012 --
       | https://web.archive.org/web/20140501234954/https://blog.what...
        
       | prdonahue wrote:
       | So you can nick collide people, obviously.
        
       | unixhero wrote:
        | Engineering IRC networks is so much fun.
        
       | H8crilA wrote:
       | Just remember that "netsplits" exist in every distributed system,
       | be it a chat app or a database. It's just the CAP theorem. IRC
       | has chosen to sacrifice C (consistency).
       | 
        | The only thing that has changed in modern times is that P
        | (partitions) are extremely rare in modern high-octane cloud
        | infrastructures. Also, modern solutions often decide to sacrifice
        | A (availability) by returning an error saying "we're aware of
        | the problem and we're working on a solution". This is what
        | happened quite recently when Google authentication went down and
       | half of the internet went dark, while under the hood they had a
       | simple out-of-quota situation on one of the replicas of their
       | core authentication systems. The system was programmed to
       | sacrifice A (availability) and reject all authentication
       | requests.
        
         | moonchild wrote:
         | > IRC has chosen to sacrifice C (consistency)
         | 
         | Hm? Hasn't it sacrificed partition tolerance? A netsplit is a
         | partition.
        
           | mappu wrote:
           | In CAP, the P happens whether you like it or not, and you get
           | to choose between C-but-not-A or A-but-not-C.
           | 
           | IRC is an AP system. It stays up (+A) in a netsplit (+P) but
           | the resulting servers are not consistent.
        
           | Twisol wrote:
           | It tolerates partitions just fine; I've been through many
           | netsplits where folks just kept talking on our side of the
           | split until the network healed.
           | 
           | Partition tolerance doesn't mean partitions don't affect the
           | system, or that they can't happen. It just means the system
           | has to choose whether to become unavailable or inconsistent
           | (since it can't have both in the presence of a partition).
           | IRC chooses to remain available, at the cost of losing
           | messages for people on the wrong side of the split.
        
           | j56no wrote:
            | If it had sacrificed P, IRC would stop working in case of a
            | netsplit. Instead it keeps working in an inconsistent state.
        
             | wlonkly wrote:
             | No, "stop working" is Availability.
        
         | throwaway20371 wrote:
         | > "netsplits" exist in every distributed system, be it a chat
         | app or a database, it's just the CAP theorem
         | 
          | Well, let's not get carried away. Network partitions happen
          | everywhere, but not everything is about the CAP theorem. The
          | CAP theorem is a very specific model that a lot of apps (even
          | ACID databases) don't conform to. Comparing IRC to the CAP
          | theorem is
         | like comparing it to ACID and saying, "IRC decided to sacrifice
         | transaction integrity".
         | 
         | IRC didn't explicitly sacrifice the C in CAP, they designed a
         | simple server protocol. They could have added a bunch of
         | weirdness to hide splits from users, but it would have been
         | unnecessarily complicated and not contributed significantly to
         | the user experience.
        
           | H8crilA wrote:
           | I'm sorry but I don't think you realise how simple and
           | fundamental the CAP theorem is. It's almost a tautology. And
           | yes it applies fully.
           | 
           | The most basic case is if there's absolutely no method of
           | exchanging information from point A to point B. Then agents
           | at A and B will not be able to communicate. That's it. Any
            | system built to facilitate information exchange will either
           | have to deliver incomplete information (C) or will have to
           | refuse to operate (A).
           | 
            | Now then, as I said, nowadays it's extremely unlikely that
            | there's truly no connection between any two major Internet
            | hubs (though it can happen, hello BGP). Partitions still
            | happen in systems that rely not on any available method of
            | information transfer but on specific methods of information
            | transfer. The IRC example requires specific servers to be
            | up, not just functioning IP routing between the end clients.
            | If some server is not up then, at least temporarily, from
            | IRC's point of view there's no way to deliver information
            | from A to B. The Google auth outage example requires (among
            | most likely many other things) disk space availability on
            | specific servers for information exchange to happen.
        
             | TheDong wrote:
             | > I don't think you realise how simple and fundamental the
             | CAP theorem is
             | 
              | May I recommend reading "A Critique of the CAP Theorem -
             | Martin Kleppmann", available as a PDF here
             | https://arxiv.org/abs/1509.05393
             | 
             | As that paper points out, your definition of CAP theorem is
             | simplified and incomplete to the point of being wrong, as
             | many are.
             | 
              | As it also points out, the CAP theorem doesn't really account
             | for eventual consistency well.
             | 
             | I would argue that a chat protocol is a good place to
             | perform eventual consistency, and those tradeoffs work
             | well. During network partitions, have both sides of the
             | partition continue to accept messages. Have the client mark
             | messages with random unique IDs, and have each server mark
             | messages with a server timestamp. The well-defined merge
             | operation is now to sort by server-time and dedupe by
             | message ID, such that if a message is sent to two servers
             | it only displays once.
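              | 
              | Concretely, that merge could look something like this
              | (illustrative Go; the Message type and its fields are
              | made up for the example, not any real IRC or Matrix
              | schema):
              | 
              |   // merge joins two servers' logs after a split: order
              |   // by server timestamp, then dedupe by client ID.
              |   package main
              | 
              |   import (
              |       "fmt"
              |       "sort"
              |   )
              | 
              |   type Message struct {
              |       ID    string // random unique ID from the client
              |       Stamp int64  // server-assigned timestamp
              |       Text  string
              |   }
              | 
              |   func merge(a, b []Message) []Message {
              |       all := append(append([]Message{}, a...), b...)
              |       sort.Slice(all, func(i, j int) bool {
              |           return all[i].Stamp < all[j].Stamp
              |       })
              |       seen := map[string]bool{}
              |       out := make([]Message, 0, len(all))
              |       for _, m := range all {
              |           if seen[m.ID] {
              |               continue // seen on both; keep one copy
              |           }
              |           seen[m.ID] = true
              |           out = append(out, m)
              |       }
              |       return out
              |   }
              | 
              |   func main() {
              |       left := []Message{
              |           {ID: "a1", Stamp: 10, Text: "hi"},
              |       }
              |       right := []Message{
              |           {ID: "a1", Stamp: 11, Text: "hi"}, // dup
              |           {ID: "b2", Stamp: 12, Text: "yo"},
              |       }
              |       fmt.Println(merge(left, right))
              |   }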
             | 
             | This doesn't work for IRC traditionally, since messages do
             | not have unique IDs, and so no merge operation can
             | deduplicate them, and servers do not store messages during
             | netsplits (or at any time really), so they cannot be re-
             | sent.
             | 
             | However, a similar system exists for other chat systems.
              | Matrix is a federated system of multiple servers, and when
             | partitions occur, each server will still accept new
             | messages, and later those messages will be made available
             | to other servers and merged in at the appropriate time.
             | 
             | I think that CAP theorem's results are less interesting if
             | you consider application-level resolutions to network
              | issues (i.e. eventual consistency), and, as I believe the
              | paper also implies, trotting it out constantly when talking
              | about practical systems gets old fast.
        
               | H8crilA wrote:
               | If you can always merge reordered edits/messages then CAP
               | does not apply because you don't need C (as defined in
               | CAP), you may instead talk about partitions/connectivity
               | issues as if they were some anomalous sources of large
               | latencies in the system. You have your own, different
               | definition of C. There are some very very large scale
               | systems out there that work under the assumption that any
               | edits can arrive reordered, and it's OK for the
               | observable properties of the system.
               | 
               | Here's what's "inconsistent" in an eventually consistent
               | chat app: your typed responses might have been different
               | had you seen in time what the other party has to say. To
               | some degree the "computation" happens in your head. A
               | "fully consistent" / "fully synchronous" chat app would
               | sometimes refuse to send a message because the other
               | party might have said something in the meantime. Like
               | you'd expect from a fully-synchronous bank account
               | balance handling system that wants to keep >= 0 balance
               | at all times, rejecting overdraft transactions.
               | 
               | (And I agree that this is completely acceptable behavior
               | for a chat app; we as people are built to tolerate this
                | kind of a problem in async person-to-person
                | communication; just pointing out what C in CAP
                | exactly means; the "fully synchronous" chat app would be
               | just an occasional pain in the ass with little benefit)
        
             | throwaway20371 wrote:
             | > The most basic case is if there's absolutely no method of
             | exchanging information from point A to point B. Then agents
             | at A and B will not be able to communicate. That's it.
             | 
             | That's not it. The most basic case is if there's no
              | _linearizability_ between A and B. A and B can continue
              | communicating but fail the C in CAP if linearizability
              | fails. Hence we shouldn't compare everything to CAP.
        
       | 300bps wrote:
       | One thing I haven't seen covered is multiple servers =
       | redundancy.
       | 
       | If a server goes down, having a net split is a lot better than
       | having the entire network down.
        
       | k__ wrote:
       | So its users can get fun netsplits.
       | 
       | I remember we would all try to get on the same server in our
       | channel, but some less technical people would use a web client
       | that assigned different ones every time.
        
       | nathias wrote:
       | The side effect was also great for communities on .net servers
       | that didn't have services like user accounts and channels.
       | Channel ops were battle-won and people who had them were much
       | better at not sucking completely.
        
         | Sunspark wrote:
          | ChanOps have always been a problem. Anyone who becomes one is
          | a dictator for life. There is no recourse; the only option is
          | either to be on their good side or to go to a different
          | channel or network.
         | 
         | I like IRC as an open technology. I don't like the lack of
         | accountability from the gatekeepers.
         | 
         | It is the same problem on online forums like reddit. If the
         | mods do not look upon you with favour, you are banned, even if
         | the rules have not been broken.
        
           | nathias wrote:
            | Yes, chanops make channels into the property of individuals,
            | but without them they are the property of the community that
            | uses them.
        
       | jchw wrote:
       | Well, from my historical reading of it, initially, IRC was a
       | federated network of servers that were essentially one network,
       | the way email is one network: there was no shared administration
       | or anything. Anyone could run a server and jump into the network.
       | Due to abuse, servers began restricting who they peered with, and
       | it fractured into multiple networks.
       | 
       | So really, I suspect it was designed to be distributed and
       | federated, and it just became what it is by accident.
        
         | albertgoeswoof wrote:
         | What would be the abuse issues from open peering? How were we
         | able to solve them for email, but not IRC?
        
           | nickelpro wrote:
           | We didn't, email spam exists to this day. The solution has
           | been to ban entire swaths of domains and even IP ranges by
           | chucking all mail from them into spam folders
        
         | Ekaros wrote:
          | Many other services also used to be like this. Think of
          | Usenet, aka news. It is an effective model when you think of
          | the Internet as a network of networks, from when there was a
          | real difference between connecting to your local area
          | network, metropolitan area network, or even wide area network.
          | 
          | Actually, we have come quite far from those days, and
          | full-speed point-to-point links between most points are now
          | somewhat realistic.
        
         | unilynx wrote:
          | Unlike email, it was never open for anyone to attach their
          | server to a network. A server connection was way too powerful
          | for that. You needed an existing server admin to allow your
          | server to connect.
        
           | ghancock wrote:
           | I wasn't there but I have seen multiple histories say that
           | there were servers that accepted connections from anyone
           | (most famously eris.berkeley.edu but not only that one). For
           | example, https://about.psyc.eu/IRC
        
       | jimjams wrote:
       | In reality only guys in the same channel get sent the messages...
        | if messages are spread between even a few channels the actual
       | numbers are much more manageable for one server.
        
       | Ologn wrote:
       | > One of the problems of having multiple servers is that
       | netsplits can occur.
       | 
       | In the early/mid 1990s, the IRC servers in Australia would split
       | from the IRC servers in the US all of the time (sometimes Europe
       | would break from the US as well). The Internet connection between
       | the US and Australia was slower and flakier back then. It made
       | lots of sense for Australians to be on Australian IRC servers and
       | Americans to be on US IRC servers, and to all be talking together
       | when the link was working (the majority of the time) and to not
       | be when the link broke (fairly regularly). The CAP theorem says
       | something has to go in those cases, and the thing that went was
       | consistency between US and Australian (or European) messages sent
       | to a channel - the messages from the other side of the split
       | would be dropped during the split.
       | 
       | I don't remember many technical netsplits on Freenode or Libera
       | in recent years, so it is less of a thing now. IRC servers were
       | always federated, so there was the original split of Anet and
       | EFnet, and the Undernet split, then the EFnet/IRCnet split which
       | revolved around those US/Europe/Australia issues. More recently
       | there was the Freenode/Libera split.
       | 
       | IRC's model always worked for me.
        
       | throwaway20371 wrote:
       | Why are Linux distributions hosted on multiple mirror servers
       | that they don't own?
       | 
       | 1) money 2) availability 3) trust 4) security
       | 
       | 1) If you don't have a lot of money, you take the servers you can
        | get. Donated mirrors mean you don't have to pay the bandwidth or
       | hosting bills.
       | 
       | 2) If you have multiple servers, it's less likely that one server
       | going down will tank your project. When GitHub, AWS, or even
       | Level3 has an outage, Linux distros keep on chugging like nothing
       | happened. Traditional server maintenance is also easier when
       | everyone can just switch to a different server.
       | 
       | 3) Maintainers can use their PGP keys to create signed packages
       | and downloads. Their public keys are distributed on mirrors, as
       | well as embedded in the downloads they've signed. Once downloaded
       | by users, the distribution can verify its own integrity. But how
       | does the user know they started with the real maintainers' public
       | key? The public key is distributed on a hundred geographically-
       | distributed servers all owned by different people; the user can
       | check them all. So other than compromising a maintainer's key,
       | it's logistically impossible to compromise end-user security.
       | (this one is more Linux-specific than IRC-specific)
       | 
       | 4) If you only have one server & it gets compromised, it can be
       | hard to tell. By comparing its operating state to the other
       | servers, you can sometimes more quickly identify the compromise.
       | And if you do find a compromise, you can remove the compromised
       | server quickly, close the hole on the other servers, and start
       | regenerating keys. It's an eventuality every large project should
       | be prepared for, and IRC servers do get compromised. Linux
       | mirrors don't matter in this regard, but the build servers etc do
       | matter.
       | 
       | IRC comes from the same time and place, and has some (but not
       | all) of the same considerations.
        
       | LinuxBender wrote:
       | I can somewhat answer this. Apologies, this became a bit long
       | winded and I have barely touched on several historical, technical
       | and logistical reasons.
       | 
       | Part of the answer is historical and part of this _was_
       | technical. IRC has been around for a very long time. As such, the
       | earlier versions of the servers and daemons could not accept tens
        | of thousands of client connections ( _epoll vs. select_ ). The
       | connections between servers are multiplexed and not directly
       | related to the number of people connected to the server. There
       | was also a matter of latency. Servers in a region would keep the
       | messages local to that region, as only people in the same channel
       | get the messages and it was less common to have people in the
       | same channel all over the world. This also changed with time. If
       | there was a split, you lost other regions. This was not always
       | the case, so of course I am over-generalizing since there were
       | many different IRC networks designed by many different people.
       | Being long running services, I had seen a great deal of
       | hesitation to re-architect anything on the fly on at least some
       | of the networks, even after ePoll and modern hardware made it
       | possible to have tens of thousands of people on one server. Some
       | of the smaller IRC networks indeed consolidated into fewer or a
       | single server.
       | 
       | Another facet is logistics and ownership. Many of the bigger
       | networks are comprised of servers owned and managed by different
       | people and organizations. The servers are linked as a matter of
       | trust. That trust can be revoked. Most of the early IRC networks
       | were run by people doing this in their free time with their own
        | money and/or limited resources. In other cases, some
        | organizations prefer to have their own servers so that their own
        | people do not suffer splits in their local communication.
       | There are a myriad of other use-cases and reasons why some
       | organizations had their own servers. Sometimes there was a need
       | to give LocalOps special permissions that would not be permitted
        | network-wide. Despite the technical capability to have fewer
       | servers, some organizations are not going to give up their local
       | nodes.
       | 
       | One issue not mentioned is permission losses on splits. The issue
       | with splits and permission changes has more to do with the way
       | services are integrated into IRC, or more specifically, aren't.
       | Services are treated like bots with higher privilege and most if
       | not all of them were not written to be multi-master. Rather than
       | dealing with moving services around or pushing for read-only
       | daemons, they just lived with the possibility that there would be
       | splits and they would eventually resolve themselves. I personally
       | would have preferred to see a more common integration with
       | OpenLDAP. Some of the IRC daemons can use LDAP, but it is more of
       | an after-thought, or bolt on. This would have allowed splits to
       | occur without losing channel permissions and clients could be
       | configured to quickly attach to another server in another region
       | and that is just DNS management. This could have been further
        | improved by amending or replacing the IRC RFCs to allow SRV
       | records. This may have been done by now for all I know. I shut
       | down my last public server some time ago.
       | 
       | There is a lot more to this than I could sum up on HN. Anyway,
       | today you can fire up an IRCd of your choice on modern hardware
       | and accept tens of thousands if not hundreds of thousands of
       | people on a single server if you wish. It is technically
       | possible. I would still design the network to have multiple
       | servers, as you will eventually hit a bottleneck. If you really
        | want to do this, you will have to de-tune the anti-DDoS
        | countermeasures to allow the thundering herd to join your
        | standby server, or make code changes to permit the thundering
        | herd briefly on fail-over.
        
         | wayoutthere wrote:
          | As someone who ran IRC servers in the 90s: the technical
          | limitation was the number of file descriptors. I think Linux at
          | the time was limited to 1024, and the biggest server on our
         | network was a DEC Alpha with 4096. The entire network (DALnet
         | at the time) was in the 20-30k user range so we absolutely
         | needed multiple servers.
        
           | throwaway20371 wrote:
           | I'm pretty sure even back then you could edit the hard-coded
           | limit in the source code and recompile. I remember us doing
           | something like this as it was too expensive to just keep
           | buying servers and our apps were connection-happy.
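            | 
            | For what it's worth, on a modern Linux the per-process cap
            | is a soft rlimit you can raise at runtime rather than a
            | constant you recompile; a minimal Go sketch (Linux-only):
            | 
            |   // Raise this process's open-file limit at runtime.
            |   package main
            | 
            |   import (
            |       "fmt"
            |       "syscall"
            |   )
            | 
            |   func main() {
            |       var rl syscall.Rlimit
            |       if err := syscall.Getrlimit(
            |           syscall.RLIMIT_NOFILE, &rl); err != nil {
            |           panic(err)
            |       }
            |       // Bump the soft limit up to the hard limit so
            |       // the daemon can hold more client sockets.
            |       rl.Cur = rl.Max
            |       if err := syscall.Setrlimit(
            |           syscall.RLIMIT_NOFILE, &rl); err != nil {
            |           panic(err)
            |       }
            |       fmt.Println("fd limit now", rl.Cur, "max", rl.Max)
            |   }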
        
             | wayoutthere wrote:
             | 1024 was the max you could boost it to; 256 was the default
             | as I recall. Linux 1.x was pretty bootstrappy.
        
           | blibble wrote:
           | there was also no efficient io multiplexing
           | 
           | an ircd with a few thousand clients was cpu bound on
           | poll()/select()
           | 
           | /dev/poll and kqueue/epoll were game changing
        
             | jlokier wrote:
             | It was actually possible to delegate subsets of descriptors
             | to child processes doing the poll()/select(), making
             | polling have the same time complexity as /dev/poll and
             | kqueue/epoll, and avoid being CPU bound. Even better if you
             | delegated cold subsets, and kept a hot subset in the main
             | process.
             | 
             | But few knew the trick so it didn't catch on.
        
               | blibble wrote:
               | mind explaining how?
               | 
               | with poll()/select() I don't see how you can avoid
               | checking every FD at least once (poll's fd counter
               | aside), vs. epoll() only returning those in the desired
               | state
               | 
               | (and I don't think you could do tricks like epoll_wait()
               | on an epoll fd)
        
               | jlokier wrote:
               | Sure.
               | 
               | Fork some child processes, and keep an AF_UNIX
               | socketpair() open to them so you can pass them file
                | descriptors with SCM_RIGHTS.
               | 
               | Have the main process divide up the fds it is waiting on
               | into a "hot" subset and cold subsets of size at most N,
               | and for each cold subset pick a child process P. fds can
               | be moved between hot and cold at any time, and generally
               | you will move them to hot after they have woken and been
               | used, and move them to cold after a few consecutive poll-
               | cycles where they were not ready. Don't move fds to cold
               | subsets belonging to child processes that you don't want
               | to wake, though.
               | 
               | When the main process is ready to "poll everything", have
               | it iterate over each child process _that is not already
               | sleeping_ , and send a message over the socketpair(),
               | containing a list of fd_set additions and removals to
               | that child's wait-for subset, including the type of poll
               | (read, write, etc).
               | 
               | For each fd where the child doesn't have the real file
               | descriptor yet, pass that over the socketpair() as part
               | of the message. (If threads are usable instead of
               | processes, there's no need to send the file descriptor.
               | But on old systems, the system threads were often
               | implemented by userspace multiplexing with poll/select
               | anyway, so it wasn't a good idea to use threads with this
               | technique.)
               | 
               | As well as a list of changes, this message tells the
               | child process to run poll/select on its subset, and then
               | reply with the set of fds that are ready (and their
               | readiness type).
               | 
               | After issuing all the child process messages, the main
               | process does its own poll/select, to wait for hot fds and
               | replies from the child processes.
               | 
               | The reason this has different scaling properties, despite
               | the overheads, is that each child handles a limited size
               | subset, messages scale with the amount of change activity
               | not the size of sets, and ideally the "coldest" fds end
               | up gathered together in child processes that _continue to
               | sleep between a large number of main process polls_ , so
               | the number of active child processes and messages scales
               | with the amount of change activity as well.
               | 
               | Keep in mind, even active fds are removed from the wait-
               | for subset if they've recently reported they are ready
               | and the poll loop hasn't read/written them yet. So it has
               | similar algorithmic properties to epoll.
               | 
               | As a bonus in the case of select(), the fds in the child
               | processes have smaller values than in the main process.
               | So in addition to the number of fds polled per cycle
               | scaling with the amount of activity instead of the total
               | number of fds, the fd_set bitset size does not grow with
               | the total number of fds either. In the main process the
               | bitset size does grow, but it's possible to juggle fd
               | values with dup2() to overcome that.
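                | 
                | For anyone curious what the fd hand-off itself looks
                | like, here's a minimal, Linux-only Go sketch of the
                | SCM_RIGHTS pass over a socketpair(). In the real trick
                | the receiving end is a forked child process rather
                | than the same process, and /etc/hostname is just a
                | stand-in descriptor:
                | 
                |   package main
                | 
                |   import (
                |       "fmt"
                |       "os"
                |       "syscall"
                |   )
                | 
                |   func main() {
                |       // Socketpair between main process and poller.
                |       sp, err := syscall.Socketpair(syscall.AF_UNIX,
                |           syscall.SOCK_STREAM, 0)
                |       if err != nil {
                |           panic(err)
                |       }
                |       // Some descriptor to delegate to the cold set.
                |       f, err := os.Open("/etc/hostname")
                |       if err != nil {
                |           panic(err)
                |       }
                |       // Sender: attach the fd as ancillary data.
                |       oob := syscall.UnixRights(int(f.Fd()))
                |       err = syscall.Sendmsg(sp[0], []byte("x"),
                |           oob, nil, 0)
                |       if err != nil {
                |           panic(err)
                |       }
                |       // Receiver: kernel installs a new fd for the
                |       // same open file.
                |       buf := make([]byte, 1)
                |       oobBuf := make([]byte, 128)
                |       _, oobn, _, _, err := syscall.Recvmsg(
                |           sp[1], buf, oobBuf, 0)
                |       if err != nil {
                |           panic(err)
                |       }
                |       msgs, _ := syscall.ParseSocketControlMessage(
                |           oobBuf[:oobn])
                |       fds, _ := syscall.ParseUnixRights(&msgs[0])
                |       fmt.Println("delegated fd:", fds[0])
                |   }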
        
         | wpietri wrote:
         | People whose only experience is with modern hardware and
         | networks really have a hard time getting the first point. As
         | somebody who started coding around the time IRC was created,
         | hardware and networks are _amazingly good_ compared to what we
         | had at the time.
         | 
         | In the mid-90s, years after IRC was written, I set up a
         | distributed system for the financial traders I was working for.
          | Our between-cities links were 64 kb/s guaranteed and could
          | burst all the way up to 256 kb/s. And those links were not
          | super reliable. These were connecting systems with Pentium
          | processors running at ~90 MHz with ~8 MB RAM. They did very,
          | very little
         | compared with even the cheapest server slice you can get with
         | AWS.
         | 
         | This is one of those things, like George Washington never
         | knowing about dinosaurs, where it's just hard to comprehend how
         | people thought back in the olden days.
        
         | rain1 wrote:
         | thanks, really appreciate this comment.
        
           | LinuxBender wrote:
           | My pleasure. I am sure others could add a great deal more.
           | There is a very long history and there are many pieces of
           | history I am leaving out. A big part I left out is the
           | individual server rate limits vs. network link rate limits
           | and network topology and that is both a technical and
           | logistical issue.
        
             | spinax wrote:
             | One positive thing I'd add - as a user - under logistics is
             | high availability. Life is messy, servers go down planned
             | or unplanned for whatever reasons - IRC networks are in a
             | sense truly 'federated' in that the client will get a new
             | server on reconnect attempts much like webservers behind a
             | load balancer. You never have to worry about your 'home
             | instance' being unavailable, as they're all your home
             | instance. (I speak about the public networks like Libera or
             | OFTC)
        
         | Sophira wrote:
         | > _Services are treated like bots with higher privilege_
         | 
         | A slight correction: Services normally link as a _server_ to
         | the network, which is how they get the higher privilege that
         | they do (because only servers, not clients, get the ability to
         | kill users from the network, etc).
         | 
         | And to add to this for others who may be curious: typically
         | there is some special configuration on the IRC server side to
         | allow the link, and some additional configuration to disallow
         | clients from changing their nickname to names like "NickServ",
         | etc (but to still allow the names when a server on the network
         | broadcasts a user with that nick). Normal non-Services IRC
         | bots, on the other hand, connect as regular clients.
        
           | duskwuff wrote:
           | Services also need to perform actions which aren't possible
           | for ordinary users, like knowing when a user connects,
           | forcibly changing a user's nick, or changing a user's
           | permissions in a channel without being an operator in the
           | channel.
        
         | hrpnk wrote:
         | Ah, netsplits were so eventful. I still remember the split-wars
         | where groups would wait for a split to happen and gain operator
         | permissions only to take over a channel on the merge [1].
         | 
         | [1] https://en.wikipedia.org/wiki/IRC_takeover#Riding_the_split
        
           | sterlind wrote:
           | wouldn't ChanServ fix things once the split resolves?
        
             | hrpnk wrote:
             | I did not experience that, but you're right:
             | https://en.wikipedia.org/wiki/IRC_services#ChanServ
        
             | lnxg33k1 wrote:
             | Not all networks have services, for example that happened a
             | lot on IRCNet which doesn't(? maybe now has?)
        
             | techrat wrote:
             | ChanServ is a relatively modern function of IRC. For a good
             | while, still to this day on some networks, services did not
             | exist.
        
         | rdpintqogeogsaa wrote:
         | Lots of correct and insightful information here, but I'd like
         | to pick out one specific aspect here.
         | 
         | > _[...] clients could be configured to quickly attach to
         | another server in another region and that is just DNS
         | management. This could have been further improved by amending
         | or replacing the IRC RFC 's to allow SRV records. This may have
         | been done by now for all I know._
         | 
         | To set the stage: Larger IRC networks balance their global
         | servers. A DNS A query for irc.example.com will yield a list of
         | geographically local servers, possibly shuffled on each query
         | as well.
         | 
         | I know of at least one IRC network that refuses to send even
         | the list of all geographically local servers, only sending a
         | subset, as a measure to avoid trivial DDoS attacks if people
         | don't go around collecting the DNS records ahead of time. I'm
         | told that this actually works because the thread actors are not
         | the sophisticated kind.
         | 
         | Incidentally, I have also noted that some networks will shuffle
         | the order of A records for each query because the clients
         | cannot be trusted to select a random DNS response. Considering
          | something _this trivial_ already doesn't work, I dread to
         | imagine how much a DNS SRV implementation would go wrong,
         | considering it needs both sorting and a weighted random
         | sampling[1] to really work.
         | 
         | [1] https://datatracker.ietf.org/doc/html/rfc2782 page 3 _et
         | seq._
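          | 
          | For reference, here's roughly what the client side of an SRV
          | lookup would look like in Go; the _ircs._tcp record name for
          | irc.example.com is hypothetical, purely for illustration, and
          | Go's resolver already does the RFC 2782 priority sorting and
          | weighted shuffle:
          | 
          |   package main
          | 
          |   import (
          |       "fmt"
          |       "net"
          |   )
          | 
          |   func main() {
          |       // Hypothetical record: _ircs._tcp.irc.example.com
          |       _, srvs, err := net.LookupSRV("ircs", "tcp",
          |           "irc.example.com")
          |       if err != nil {
          |           fmt.Println("lookup failed:", err)
          |           return
          |       }
          |       // Walk the already-ordered list until a connect
          |       // succeeds (connecting omitted here).
          |       for _, s := range srvs {
          |           fmt.Printf("%s:%d prio=%d weight=%d\n",
          |               s.Target, s.Port, s.Priority, s.Weight)
          |       }
          |   }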
        
           | samsquire wrote:
            | On some operating systems getaddrinfo sorts the DNS response
            | by IPv6 distance, breaking load balancing!
           | 
           | https://access.redhat.com/solutions/22132
        
             | toast0 wrote:
             | This is not exclusive to IPv6, I've seen it on v4 as well.
             | If you've got short DNS TTLs and can return 2-4 records out
             | of a larger pool, that can help, but if your TTLs are
              | longer, you have to consider the handful of recursive DNS
             | servers that serve a large number of users... You want to
             | give them more records to balance that traffic better.
             | 
             | OTOH, current IRC usage numbers are pretty low, a beefy
             | single server should work, except for the disruption
             | potential of single servers. Latency can be a bit of an
             | issue too, depending on where your users are; not great if
             | users are in south asia and the only server is in east
             | coast US.
        
       | hcykb wrote:
       | Because when IRC was popular servers and routes went down often
       | and a single server couldn't handle all the users a network would
       | have. Neither of those are a concern anymore.
        
       | jbverschoor wrote:
        | Because many of the early protocols, including IP, were designed
       | with network failures in mind.
        
       | mvanbaak wrote:
       | > via round robin DNS (meaning that when people resolve the DNS
       | it gives them a random server from the set of 20 to connect to)
       | 
        | Most of the time, it's not simple round-robin but also geo-
        | based. This means clients will get the IP addresses of the
        | servers closest to them.
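        | 
        | As a rough sketch of the plain round-robin case in Go
        | (irc.example.net is a placeholder pooled hostname; the client
        | shuffles instead of trusting the resolver's record order):
        | 
        |   package main
        | 
        |   import (
        |       "fmt"
        |       "math/rand"
        |       "net"
        |   )
        | 
        |   func main() {
        |       // A network's pooled hostname resolves to many servers.
        |       addrs, err := net.LookupHost("irc.example.net")
        |       if err != nil || len(addrs) == 0 {
        |           fmt.Println("no servers:", err)
        |           return
        |       }
        |       // Pick one at random instead of always the first.
        |       server := addrs[rand.Intn(len(addrs))]
        |       hostport := net.JoinHostPort(server, "6697")
        |       fmt.Println("connecting to", hostport)
        |   }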
        
         | magila wrote:
          | My experience with Freenode/Libera Chat is that they either
         | don't implement geo DNS or don't do a very good job of it. I'm
         | on the US west coast and lookups to irc.libera.chat often
         | return servers in Europe.
         | 
          | Edit: Double-checking Libera Chat's website I see that they have
         | added regional hostnames so I guess that's their solution.
        
           | pvtmert wrote:
           | if they're using aws route53, your isp needs to support edns.
           | 
           | otherwise, your netblock might have been falsely advertised
           | in the dns provider's geoip database. (eg. maxmind)
        
             | pushrax wrote:
              | They're using Cloudflare. When I resolved them from the east
             | coast, I got a San Francisco server once and a server in
             | Budapest once. They have a server in Toronto, Ashburn,
             | Montreal, and other places that are closer.
             | 
             | I know geodns works here since I use it for some of my own
             | deployments.
        
         | melony wrote:
          | Does IRC predate distributed state machines? Why can't the
         | servers sync up the chat via Paxos or Raft?
        
           | manquer wrote:
            | Chat is not as complex as other distributed applications, so
            | you probably don't need Raft. Both Paxos and Raft are very
            | complex algorithms to implement.
            | 
            | A CRDT-based append-only implementation is probably more than
            | enough? Data is never modified, only added/removed, in
            | typical chat workflows.
            | 
            | Reading the Discord engineering blog over the years, it looks
            | like scaling the pub/sub for the consumers in large channels
            | is a lot harder than the DB/store itself being distributed.
        
           | giantrobot wrote:
           | Distributing state wasn't the goal on IRC, only relaying
           | messages. If you miss a message you miss a message. You can
           | use client-side tools (bots, bouncers, etc) to record state
           | but the protocol itself doesn't care.
        
           | duskwuff wrote:
           | Implementing Paxos would mean that stateful operations (like
           | connecting to the server, joining a channel, or changing
           | modes) become impossible on a server, or a group of servers,
           | that have lost quorum.
        
           | sterlind wrote:
           | hard to do Paxos over large geographical distances
           | efficiently, but... it's IRC, so..
           | 
           | I just assume it was from an earlier internet where
           | distributed systems weren't as well understood. I don't think
           | it necessarily predates Paxos but it definitely predates
           | Paxos being a household name.
        
           | X6S1x6Okd1st wrote:
            | paxos was first created in 1989, but not popularized for a long
           | while after:
           | https://en.m.wikipedia.org/wiki/Paxos_(computer_science)
           | 
           | irc 1988: https://en.m.wikipedia.org/wiki/Internet_Relay_Chat
           | 
           | Earliest reference for raft I can find is 2013.
        
       | rawoke083600 wrote:
        | Would be fun to revisit the old problems (like this one) with a
        | modern toolset. Say golang with channels (not the /join type of
        | channels) :p
        
       ___________________________________________________________________
       (page generated 2021-09-12 23:01 UTC)