[HN Gopher] How Discord Stores Billions of Messages (2017)
       ___________________________________________________________________
        
       How Discord Stores Billions of Messages (2017)
        
       Author : ibraheemdev
       Score  : 360 points
       Date   : 2021-08-24 17:43 UTC (5 hours ago)
        
 (HTM) web link (blog.discord.com)
 (TXT) w3m dump (blog.discord.com)
        
       | ChrisArchitect wrote:
       | plenty of discussion when this was news:
       | 
       | https://news.ycombinator.com/item?id=13439725
        
         | dang wrote:
         | Thanks! Macroexpanded:
         | 
         |  _How Discord Stores Billions of Messages Using Cassandra_ -
         | https://news.ycombinator.com/item?id=13439725 - Jan 2017 (155
         | comments)
        
       | jitans wrote:
       | I would have used CockroachDB, it has all the requirements listed
       | and you don't need to know in advance the queries you will
       | perform when deciding the database schema.
        
         | rnotaro wrote:
         | On the day that this blog post was written, CockroachDB was
         | only at beta-20170112 and didn't even have a production release
         | yet.
         | 
         | v1.0 was released on May 10, 2017 [1], so I doubt it was even
         | on their mind when they started working on the project.
         | 
         | [1] https://www.cockroachlabs.com/docs/releases/index.html
        
           | PeterCorless wrote:
           | Amazing. The whole industry has come quite a ways since 2017!
        
         | PeterCorless wrote:
         | This would not work at scale for a company like Discord, with
         | its volume of traffic. Cockroach, being consistency-oriented
         | would quickly become transaction-bound. You want a database
         | like a Cassandra or Scylla that is more
         | performance/availability oriented. Otherwise you are going to
         | see a lot of lag and latency in the Discord chat.
         | 
         | Cockroach is very, very good for a distributed SQL database.
         | But it's still performance-limited in its very nature.
         | 
         | More here on the difference between NoSQL/NewSQL performance,
         | using Scylla (a CQL-workalike) as a point of comparison:
         | 
         | https://www.scylladb.com/2021/01/21/cockroachdb-vs-scylla-be...
        
       | umvi wrote:
       | Discord is so good. I just can't imagine it can stay this good
       | forever. My fear is that eventually it will be bought out and
       | aggressively monetized.
        
         | alpb wrote:
         | If you're paying for Discord every month, it's actually fairly
         | expensive. A lot of the good features unlock once people start
         | boosting servers with Nitros and those aren't cheap either. So
         | I'd assume they aren't bleeding cash left and right on infra
         | costs. They might actually be breaking even on the infra costs at
         | least.
        
         | helen___keller wrote:
         | For years I've had a little bet going with friends about who
         | ends up buying them to subsidize all this. My money was on
         | amazon, because it could work so well with twitch + amazon
         | prime.
        
       | kerblang wrote:
       | I've used cassandra quite a bit and even I had to go back and
       | figure out what this primary key means:
       | ((channel_id, bucket), message_id)
       | 
       | The primary key consists of partition key + clustering columns,
       | so this says that channel_id & bucket are the partition key, and
       | message_id is the one and only clustering column (you can have
       | more).
       | 
       | They also cite the most common cassandra mistake, which is not
       | understanding that your partition key has to limit partition size
       | to less than 300MB, and no surprise: They had to craft the
       | "bucket" column as a function of message date-time because that's
       | usually the only way to prevent a partition from eventually
       | growing too large. Anyhow, this is incredibly important if you
       | don't want to suffer a catastrophic failure months/years after
       | you thought everything was good to go.
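       | 
       | A minimal sketch of that "bucket as a function of message
       | date-time" idea, in Python. The 10-day window, the use of the Unix
       | epoch, and the names bucket_for and partition_key are assumptions
       | made for illustration, not taken from Discord's code:

```python
from datetime import datetime, timezone
from typing import Tuple

# Assumption: a fixed 10-day window per bucket. The right size depends on
# message volume and on the partition-size budget you are trying to respect.
BUCKET_MS = 10 * 24 * 60 * 60 * 1000


def bucket_for(sent_at: datetime) -> int:
    """Map a message timestamp to an integer time bucket."""
    epoch_ms = int(sent_at.timestamp() * 1000)
    return epoch_ms // BUCKET_MS


def partition_key(channel_id: int, sent_at: datetime) -> Tuple[int, int]:
    """Build the (channel_id, bucket) pair that selects one partition."""
    return (channel_id, bucket_for(sent_at))


# Two messages two days apart land in the same partition; one sent two
# months later lands in a different one.
first = partition_key(42, datetime(2017, 1, 1, tzinfo=timezone.utc))
same_window = partition_key(42, datetime(2017, 1, 3, tzinfo=timezone.utc))
later = partition_key(42, datetime(2017, 3, 1, tzinfo=timezone.utc))
```

       | Messages sent within the same window share a partition, so one
       | channel's history is spread over many bounded partitions instead
       | of a single ever-growing one.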
       | 
       | They didn't mention this part: Oh, I have to include all
       | partition key columns in every query's "where" clause, so... I
       | have to run as many queries as are needed for the time period of
       | data I want to see, and stitch the results together... ugh...
       | Yeah it's a little messy.
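       | 
       | The stitching can be sketched roughly like this. It is a
       | hypothetical in-memory stand-in, not a real driver call: a dict
       | plays the role of the table, read_partition stands for one
       | single-partition query, and the bucket numbers are made up.

```python
from typing import Dict, List, Tuple

Message = Tuple[int, str]                     # (message_id, body)
Store = Dict[Tuple[int, int], List[Message]]  # (channel_id, bucket) -> rows


def read_partition(store: Store, channel_id: int, bucket: int,
                   limit: int) -> List[Message]:
    """Stand-in for one single-partition query, newest message first.

    Roughly: SELECT ... WHERE channel_id = ? AND bucket = ?
             ORDER BY message_id DESC LIMIT ?
    """
    rows = store.get((channel_id, bucket), [])
    return sorted(rows, reverse=True)[:limit]


def recent_messages(store: Store, channel_id: int, newest_bucket: int,
                    oldest_bucket: int, limit: int) -> List[Message]:
    """Walk buckets from newest to oldest, one query each, until `limit` rows."""
    out: List[Message] = []
    for bucket in range(newest_bucket, oldest_bucket - 1, -1):
        out.extend(read_partition(store, channel_id, bucket, limit - len(out)))
        if len(out) >= limit:
            break
    return out


store: Store = {
    (42, 1716): [(101, "a"), (102, "b")],
    (42, 1717): [(201, "c")],
    (42, 1719): [(301, "d")],  # bucket 1718 is empty for this channel
    (7, 1719): [(999, "other channel")],
}
page = recent_messages(store, 42, newest_bucket=1719, oldest_bucket=1716, limit=3)
```

       | Every bucket in the range costs one query even when it turns out
       | to be empty, which is the messy part for quiet channels.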
        
         | habibur wrote:
         | Reading the article, I was just now visiting the Cassandra site to
         | figure out what the catch is. Surely there should be a catch.
         | 
         | Well, here it is. The partitioning is manual, up to the SQL
         | level.
        
           | PretzelPirate wrote:
           | The bigger catch is that when your partition grows too big
           | and your nodes are hit by the OOMKiller, you have very few
           | options other than create a new table and replay data, or use
           | a cli tool to manually partition your data while the node is
           | offline.
           | 
           | Using Cassandra tends to mean pushing costs to your
           | developers instead of spending more money on storage
           | resources, and your devs will almost certainly spend a ton of
           | time fixing downed nodes.
           | 
           | Apple supplied some of the biggest contributors to Cassandra
           | who were optimizing things like how to read data in a
           | partition without fully reading the partition into memory to
           | avoid the terrible GC cost. They put in a ton of engineering
           | effort that probably could have been better spent elsewhere
           | if they'd used a different database.
        
             | bsaul wrote:
             | what would you recommend now instead ?
        
       | tester756 wrote:
       | Discord had like $300M invested and they created an unparalleled
       | piece of software that ate the whole market, damn.
       | 
       | One of the most impressive pieces of software that I've seen and
       | used after years of using ventrilo/mumble/teamspeak.
        
         | RHSeeger wrote:
         | Honestly, I find discord super frustrating. Can't have multiple
         | chats open at the same time, can't close the right rail, etc.
         | Its UX is subpar in almost every way that matters to me. I use
         | it because _everyone_ uses it, not because I want to.
        
           | jonshariat wrote:
           | Ditto, feels fast and has a lot of features but to me the UX
           | is very confusing.
        
             | agumonkey wrote:
             | everytime I open an irc client I feel lighter.. something
             | about the discord web client is so dreadful. the android
             | app focuses more on notifications and quick switches.. a
             | bit better somehow.
        
             | mvanaltvorst wrote:
             | Personally, I find that Discord feels extremely bloated
             | compared to almost any other program on my computer.
             | Probably has to do with the fact that Discord has still not
             | released a native ARM version of its client for the Apple
             | M1. Nearly a year after its release there really is no
             | excuse for any Electron application to stick to
             | x86_64+Rosetta 2.
             | 
             | Discord is literally the only x86 application that is still
             | installed on my MacBook Pro M1.
        
               | judge2020 wrote:
               | Discord recently downgraded from 64-bit to 32-bit on
               | Windows as well, I imagine there were issues with the
               | interconnects with other, native parts of Discord like
               | screen-sharing that were easily 'solved' by only
               | distributing 32 bit binaries.
        
           | herodoturtle wrote:
           | I second this.
           | 
           | Find the UI very confusing. Perhaps I'm just old; but damn,
           | my intuition in using discord's interface constantly lets me
           | down.
        
           | derefr wrote:
           | -
        
             | ibraheemdev wrote:
             | > There are alternate Discord clients, and -- unlike Slack
             | -- Discord doesn't try to actively prevent people from
             | writing alternate clients against their API.
             | 
             | Yes, they do.
             | 
             | https://github.com/Bios-Marcel/cordless:
             | 
             | > Hey, so I know this is somewhat of a bummer, but I got
             | banned because of ToS violation today. This seemed to be
             | connected to creating a new PM channel via the /users/@me
             | endpoint. As that's basically a confirmation for what we've
             | believed would never be enforced, I decided to not work on
             | the cordless project anymore. I'll be taking down cordless
             | in package managers in hope that no new users will install
             | it anymore without knowing the risks. I believe that if you
             | manage to build it yourself, you've probably read the
             | README and are aware of the risks. I'll keep the repository
             | up, but might archive it at some point. And yes, you'll
             | still be able to use existing binaries for as long as
             | discord doesn't introduce any more breaking changes.
             | However, be aware that the risk of getting a ban will only
             | get higher with time!
             | 
             | https://github.com/atlx/discord-term:
             | 
             | > Disclaimer: So-called "self-bots" are against Discord's
             | Terms of Service and therefore discouraged. I am not
             | responsible for any loss or restriction whatsoever caused
             | by using self-bots or this software. That being said,
             | there's no one stopping you from risking using an account,
             | so go head!
        
             | masklinn wrote:
             | > There are alternate Discord clients, and -- unlike Slack
             | -- Discord doesn't try to actively prevent people from
             | writing alternate clients against their API.
             | 
             | It's against TOS and people have copped bans for using
             | alternate clients.
        
             | sa1 wrote:
             | https://twitter.com/discord/status/1229357198918197248?lang
             | =...
        
           | Operyl wrote:
           | Going to be the dissenting voice here and say I find the UI
           | usable, and have no major gripes. The Member List can be
           | hidden by the user icon to the left of the search field, and
           | there's a convenient "Compact mode" to make things more dense
           | information wise.
        
           | Kaze404 wrote:
           | I can't think of many reasons Discord "ate the whole market"
           | besides smart marketing, honestly. It does audio rooms
           | incredibly well, but everything else (even their developer
           | support team) is just terrible.
        
             | agumonkey wrote:
             | I'm not even sure it was a lot of marketing, for the new
             | generation, it's just what they needed, chat+voice+screen
             | on a html5/css3 visual language.
             | 
             | I despise discord quite a lot btw :)
        
             | moenzuel wrote:
             | Discord ate the market because it is free and has good enough
             | voice quality. Before that everyone was paying for voice
             | server hosting or doing it themselves. Is that smart
             | marketing or just the standard operating procedure for
             | startups around that time?
        
               | judge2020 wrote:
               | Technically Skype existed but lost with the new sms-
               | looking ui that everyone hated and it was hardly suitable
               | for anything larger than a friend group. Then there were
               | long-standing issues of voice chats being P2P and thus
               | allowing users to find the IP of other users, enabling
               | DDOS attacks on routers.
        
               | jacurtis wrote:
               | Yeah I think people forget that Skype actually owned the
               | video game voiceserver market for a few years.
               | 
               | Ventrilo and Teamspeak and Mumble were all good. But you
               | had to assign someone in your friend-group to manage the
               | server. This meant paying for hosting to do it "the right
               | way", and in turn one friend either paid the hosting
               | themselves or you had to figure out how to split the
               | cost. Then if someone else joined the group you had to
               | split it with them, etc.. Some people would self-host
               | teamspeak or ventrilo at their houses so you could avoid
               | those costs, but now you are reliant on an unreliable
               | system of one friend hosting your voiceserver on their
               | desktop computer. This means that router mishaps could
               | send it offline, them turning off their computer could
               | send it offline, or if the teamspeak/vent daemon wasn't
               | running then your whole server is offline.
               | 
               | Skype solved a lot of those problems because it was
               | always online, no one had to manage a server, and it was
               | free. It sucked in just about every other way as a game
               | chat option, but the benefits of no-server-management,
               | always-available, and no-cost, made an objectively
               | inferior product dominate the world of game chat.
               | 
               | Discord simply took the features of
               | teamspeak/mumble/ventrilo and combined it with the
               | service benefits that skype offered. No more server cost
               | sharing and no more server administration. But you still
               | got the benefits of actual game chat servers like voice
               | lobbies (as opposed to initiating calls like skype).
               | 
               | I really don't think Marketing is what made Discord
               | successful. This is truly an example of someone who
               | solved a need. We needed a product like
               | teamspeak/ventrilo/mumble combined with a service like
               | skype. Discord was that creation. It truly solved a
               | problem for gamers. Gamers were not looking to cling to
               | skype, but they were all using it. Discord created a
               | product that fit into the market perfectly and the masses
               | ran to it because the need was so big, and Discord solved
               | the problem that gamers needed. The ease of setup also
               | helped. Sending a single share link that someone simply
               | clicked was all it took to join a server and start
               | talking. I think that ease of setup is also an incredibly
               | under-rated strength of Discord. In fact I would venture
               | to guess that most gamers joined their first Discord
               | server by clicking a discord share link that was sent to
               | them via Skype.
        
               | rubicon33 wrote:
               | I agree, but also worth mentioning that Discord really
               | did Chatrooms correctly.
               | 
               | The ability to easily create servers, invite users to
               | your server, and then make that server your homebase with
               | its own channels and emojis, is pretty novel and
               | perfectly fit into the gaming community which is
               | basically a loosely connected graph of friend groups.
        
               | jaquers wrote:
               | Also, RBAC which allows huge flexibility for users to run
               | their community in whatever fashion they want.
        
             | Saris wrote:
             | Everything else was/is much worse.
             | 
             | Even years later it's still the only platform I know of
             | that combines text chat rooms, voice chat rooms, and video
             | streaming into one place, all accessible from your 'server'
             | as they call it.
             | 
             | It also has clients for many platforms, including a web
             | client, all of which look and function the same.
             | 
             | Any alternative out there does one of those things decently
             | well, but either completely lacks or is utterly awful at
             | the other things.
        
               | tester756 wrote:
               | >combines text chat rooms, voice chat rooms, and video
               | streaming into one place
               | 
               | and unlike skype (and probably teams too) supports PUSH 2
               | TALK which gaming oriented voice chats had close to 2
               | decades ago and is even more useful now, during WFH.
        
             | ayngg wrote:
             | Because most people used ventrilo, teamspeak or mumble,
             | which required running or paying for a server, and had
             | limited chat/ social network functionality, and skype which
             | was just terrible. It came at the right time as gaming hit
             | its stride in the mainstream when people needed a place for
             | 'local' communities that was basically frictionless and
             | discord was there to capture that audience because there
             | was basically nothing else.
             | 
             | All the features afterwards are mostly just them throwing
             | stuff at the wall and seeing what sticks.
        
             | rcxdude wrote:
             | I don't think they did much marketing. They provided, for
             | free, a replacement for a chat room, forum, and voice chat,
             | which was in total easier to set up than any of the
             | previous options any gaming community had for those, with
             | similar levels of functionality.
        
             | bluecalm wrote:
             | People love it, marketing has nothing to do with it. I first
             | heard of Discord when I followed an open source programming
             | project (Leela Chess Zero) and it was obvious after a few
             | minutes why it's a fantastic fit. I moved my project there
             | shortly after as well and it's fantastic.
        
             | crocodiletears wrote:
             | Their moderation and community management tools are
             | fantastic.
        
               | Kaze404 wrote:
               | It's definitely improved over the years, but every
               | remotely populated server I'm in uses bots for basic
               | moderation features like ban words, proper bans/kicks
               | (for example, temporary bans), warns, etc. There's still
               | a long way to go in my opinion.
        
             | [deleted]
        
             | porb121 wrote:
             | uh, how about the fact that vent/mumble/Skype were all
             | awful?
        
               | mjevans wrote:
               | The only bad part of mumble calls for voice come from:
               | * Where the server is hosted / quality of server
               | 
               | * Poor client UI
               | 
               | The client UI issue is how easy it is to work-around bad
               | audio from other users. It's possible to do, the UI just
               | completely sucks.
               | 
               | User interface and end user fulfillment just aren't great
               | generally for OSS. I think it would take a commons
               | improvement project with either government grants
               | (infrastructure) paying for results AND/OR a university
               | spearheading the development project.
        
               | Kaze404 wrote:
               | I don't see how that's relevant? I don't even prefer
               | those over Discord, but I don't think it's enough of an
               | improvement to warrant the market share it has now.
        
               | mbesto wrote:
               | What do you prefer?
               | 
               | The fact that you can:
               | 
               | (1) Find a server with a simple URL (no pw needed, no
               | port or IP, etc.)
               | 
               | (2) Find your friends easily with a unique username and
               | use chat as a fallback
               | 
               | (3) Create an audio room (that scales!) that has great
               | audio quality (doesn't "drop" calls) and
               | 
               | (4) Client that auto-updates to provide more and more
               | features
               | 
               | All within a few minutes is a HUGE upgrade over Ventrilo.
               | 
               | This doesn't even mention the network effects, bot
               | integration, gif/photo capability, etc. It's superior in
               | almost every way possible.
        
               | oarsinsync wrote:
               | Mumble client is awful. Mumble audio quality and latency
               | is fantastic.
               | 
               | Mumble is also self hosted, and nobody wants to host
               | anything anymore when there's a free alternative that's
               | good enough and hosted by someone else.
               | 
               | Skype is just universally terrible.
               | 
               | EDIT: since I'm part of that tiny minority exception that
               | proves the rule: s/nobody wants/the vast majority do not
               | want/
        
             | hardwaregeek wrote:
             | Slack was/is pretty terrible too. Having every workspace
             | require a new user is the pinnacle of idiocy. So annoying
             | and even worse if you have different emails for different
             | workspaces.
        
           | masklinn wrote:
           | Samesies.
           | 
           | It's not _bad_ per se but there's plenty of crap in there.
           | 
           | The shortcuts situation is absolutely dreadful for one, I
           | don't understand how _gamers_ can cope with it:
           | 
           | * there are all of 5 actions you can bind to custom shortcuts
           | 
           | * discord defines dozens of built-in shortcuts you _can not_
           | rebind or disable, if any of those conflicts with something
           | you need you better hope the OS has a way of overriding it
           | 
           | Large chatrooms as well, the moderation tools seem rather
           | limited, maybe it's better for administrators but as a user
           | all you can do is block someone _and you still have to see
           | that they're posting comments_. It's incredibly frustrating.
           | 
           | Then the linking and jumping to old message works half the
           | time, maybe, search is absolute dogshit, and I've rarely seen
           | a less reliable @-autocompletion, half the time I have to
           | find old messages of the person I'm trying to ping before
           | discord remembers they exist and lets me actually @ them.
           | 
           | And I don't think support actually exists. You just post into
           | the black hole that is the support forum thing.
        
           | CoryAlexMartin wrote:
           | You may enjoy Ripcord if you're not happy with Discord's UI.
           | I've been using it for a few months, and it's made Discord
           | enjoyable to use.
           | 
           | I do have to open the official client whenever I do voice
           | calls though, because there's currently an issue that can
           | cause incoming audio to sound terrible. But for text chat,
           | it's great.
           | 
           | https://cancel.fm/ripcord/
        
             | Dylan16807 wrote:
             | It would help if discord would stop threatening to ban
             | people for using third party clients.
        
         | zitterbewegung wrote:
         | IMHO it is even better than Slack.
        
         | 5faulker wrote:
         | If that's 2017, imagine what it would be in 2021.
        
         | sascha_sl wrote:
         | Discord's smart move was emulating the concept of servers
         | (including all the teenage drama coming from having
         | administrators and moderators more interested in (ab)using
         | their power than community building) while making them
         | accessible to anyone without technical knowledge.
         | 
         | But it's important to remember that Discord is not that.
         | Discord holds all your data, in luxurious detail, with no
         | option to delete. They go as far as ignoring GDPR when people
         | ask for their messages to be deleted. "Deleting" your account
         | will not even anonymize your ID, it unsets your avatar, renames
         | you, kicks you from all guilds and disables logging in. That's
         | it. And if they ban you there is no place to move on to.
        
           | tester756 wrote:
           | and you aren't going to get DDoSed if you join server of not
           | the nicest people :)
        
             | sascha_sl wrote:
             | That used to be a huge issue, less with Client-Server model
             | software, but more the P2P mechanisms in Skype.
             | 
             | These days? Well, most server providers have some sort of
             | basic flood mitigations in place now, and even more
             | advanced protection has become affordable.
        
               | tester756 wrote:
               | >These days? Well, most server providers have some sort
               | of basic flood mitigations in place now, and even more
               | advanced protection has become affordable.
               | 
               | Hmm
               | 
               | I didn't mean your server being DDoSed, but you being
               | DDoSed (but probably that's what you meant with Skype P2P
               | example?)
        
               | sascha_sl wrote:
               | Well yeah. Though Vent, Teamspeak and Mumble never had
               | these issues (if you could trust the server admin).
               | 
               | Skype (at the time, no idea now) was a very shoddily
               | written piece of software. It was trivial to query the IP
               | of any online user, even if they were not on your contact
               | list or appearing offline.
               | 
               | You had to use a VPN or carefully conceal your Skype ID,
               | I did work with a somewhat popular live streamer back
               | then (so a VPN wasn't feasible), and their ID was a very
               | random string that was not to be shared under any
               | circumstances.
        
               | tester756 wrote:
               | >Though Vent, Teamspeak and Mumble never had these issues
               | (if you could trust the server admin).
               | 
               | In some games where people were often switching teams,
               | you couldn't.
        
         | dvt wrote:
         | > ventrilo/mumble/teamspeak
         | 
         | To be fair, Mumble is FOSS, and Ventrilo and Teamspeak have
         | literally not iterated since 2005. Discord is pretty mediocre
         | software (remember when they accidentally allowed iframe XSS
         | RCE attacks? A very amateurish mistake), but the incumbents
         | were an absolute dumpster fire.
        
           | Scaless wrote:
           | For Mumble in particular, the devs had their heads in the
           | clouds for so long that it is no surprise that it is no
           | longer relevant.
           | 
           | If you had a mic that had issues in any way (buzzing, volume,
           | balance), "The Wizard" and "AGC" were supposed to fix it for
           | you. Do not fret little one, for you do not need nor want to
           | manually fiddle with settings, The Wizard will make
           | everything right [1]!
           | 
           | The pivotal feature that was the reason so many people I know
           | stopped using it is the ability to change the volume of an
           | individual person [2]. It has been a requested feature since
           | the beginning of time, yet it took until 2016 to implement in
           | dev branch and didn't actually make it into a release version
           | until 2020! Too little, too late.
           | 
           | [1] https://web.archive.org/web/20200223143654/https://wiki.m
           | umb...
           | 
           | [2] https://github.com/mumble-voip/mumble/issues/1156
        
           | the_duke wrote:
           | > and Ventrilo and Teamspeak have literally not iterated
           | since 2005
           | 
           | True, but to be fair: the next iteration of Teamspeak will be
           | based on the Matrix protocol, which is quite a big iteration.
           | See https://news.ycombinator.com/item?id=25743874 .
        
         | tschellenbach wrote:
         | Better engineering than Slack, but not as good of a business
        
         | yelnatz wrote:
         | Well it solved a lot of pain points with the target market.
         | 
         | I remember my friends and I kept bickering who would pay for
         | this month's bill for the vent/mumble servers. That kept on for
         | years until I had enough and hosted my own in a droplet in
         | digital ocean. None of my friends knew how to do that since
         | they're not very technical.
         | 
         | Discord you just had to click a couple buttons and it's free.
        
           | m4rtink wrote:
           | It works nice and is free, but for how long?
           | 
           | How long can they keep paying for that bandwidth and message
           | data storage while keeping the thing essentially free?
        
             | humaniania wrote:
             | Nitro pulls $10/mo and has enough benefits that a lot of
             | people pay. Plus server boosts. I bet that they have good
             | cash flow.
        
               | keithnz wrote:
               | we use discord for work and to "boost" your server to a
               | level where you get reasonable streaming you need to pay
               | ~$60 and a higher upload limit, or ~$110 for the max.
               | Which is pretty good in that it applies for all users
               | 
               | It's a bit of an odd model for paying for businesses, but
               | works well in the gaming world where multiple people can
               | essentially help pay for a server (if you want the extra
               | toys)
        
               | ganoushoreilly wrote:
               | I pay for nitro, I don't use discord non stop but it's
               | great for a bunch of niche channels i'm on. I'm happy to
               | pay $10 to a platform that makes it easier for me to find
               | information and I know a bunch of my colleagues pay as
               | well. All in we still support individual projects as well
               | but truthfully it's the cost of a beer a month.
        
               | judge2020 wrote:
               | Our only source is this WSJ article (excerpt from qz[0]):
               | 
               | Discord declined to share how many Nitro subscribers it
               | has, but the Wall Street Journal reported that Discord
               | generated $130 million in revenue last year, up from $45
               | million in 2019. In the same time period, its monthly
               | user base doubled.
               | 
               | 0: https://qz.com/2034087/chat-app-discord-is-shedding-
               | its-game....
        
             | SaltyBackendGuy wrote:
             | > It works nice and is free, but for how long?
             | 
             | I wonder about this a lot. I wonder if they have some big
             | 'whales' that help sustain their business OR they're just
             | selling all of our data (is that enough to make money at
             | Discord's scale??).
        
               | bonesinger wrote:
               | IMO it's a good competitor to slack. They probably make
               | money from businesses too. They have lots of options for
               | permissions/roles and all kinds of API access to write
               | bots for.
        
               | grlass wrote:
               | I have some contacts who work for organisations that use it
               | instead of Slack for SMEs of 100-1000 employees.
               | 
               | Unsure what their billing looks like, but it works pretty
               | well for them apparently.
        
               | judge2020 wrote:
               | Unless businesses are paying for server boosts[0] (which
               | would only be useful for 1080p60 screen-share or a 50mb
               | upload limit), there's no way to use Discord for business
               | or pay extra for business use outside of creating a free
               | server like any other; there's no real reason to choose
               | Discord for business either, since it has no real
               | retention policy (other than storing messages forever,
               | for now), DLP is non-existent, there's no SSO/SAML, etc.
               | The only reason to use Discord for business is if you
               | really like Discord and/or other parts of your business
               | are on Discord, like if you run a video game.
               | 
               | 0: https://support.discord.com/hc/en-
               | us/articles/360028038352-S...
        
               | derefr wrote:
               | There are "businesses" that have communities, and want to
               | own/manage them. Discord works much better than Slack as
               | a platform for "official" managed open-membership
               | communities; it's seemingly a use-case the Discord staff
               | have put a lot of thought into.
               | 
               | Think: every content-creator or streamer.
               | 
               | But also: regular corporations that provide platform
               | services that people build their own stuff on top of,
               | such that people want to talk to _each-other_ about the
               | service rather than just talking to _the corporation_
               | about the service. (The sort of thing you used to stand
               | up a hosted forum for.)
        
               | bonesinger wrote:
               | Yep, I agree about limited industry but it does work in
               | that respect. I see it used a lot for content creators as
               | a way to organize and tier out their fans as well.
        
               | grlass wrote:
               | one of the founders and current CEO Jason Citron had a
               | previous org called OpenFeint, which:
               | 
               | > was party to a class action suit with allegations
               | including computer fraud, invasion of privacy, breach of
               | contract, bad faith and seven other statutory violations.
               | According to a news report "OpenFeint's business plan
               | included accessing and disclosing personal information
               | without authorization to mobile-device application
               | developers, advertising networks and web-analytic vendors
               | that market mobile applications" [1]
               | 
               | Of course that doesn't mean anything about the current
               | model of Discord, but good to be aware of.
               | 
               | [1] https://en.wikipedia.org/wiki/OpenFeint#History
        
               | ezconnect wrote:
               | It's a spy and data mining operation for some
               | intelligence group same reason why twitter and facebook
               | got traction.
        
           | caymanjim wrote:
           | It's free for the same reason everything is free these days.
           | VC funds anything that will attract a lot of users to mine
           | data from so they can sell the data. Discord didn't do
           | anything that was groundbreaking or even solve a problem that
           | had no solution; they just came along during a time when
           | investors are willing to fund a company operating at a loss
           | for a decade until FANG buys them.
           | 
           | Discord's a pretty good product, and they've got the
           | engineers and money to get better, but the only reason they
           | won is because of timing. Same for Slack; there were
           | identical products to Slack that tried for decades to gain
           | traction, but they weren't free, because that business model
           | didn't exist at the time.
        
             | EamonnMR wrote:
             | Slack isn't free, they sell you history; in effect, you
             | generate the data that they sell back to you.
        
               | mavhc wrote:
               | Holding your data hostage until you pay up, isn't that
               | ransomware?
        
               | rokhayakebe wrote:
               | Not if you voluntarily provided it in the first place.
        
           | nnncdrj wrote:
           | I don't get it, you can run mumble on any random Linux box in
           | your house, you don't need to pay to have it hosted
           | somewhere. Works fine running on any box on your desk.
           | 
           | Discord makes you the product. It's gratis in exchange for
           | letting them spy on you. If you don't know why that's bad...
        
             | tester756 wrote:
             | >you can run mumble on any random Linux box in your house,
             | you don't need to pay to have it hosted somewhere.
             | 
             | if you have public IP or use stuff like hamachi (at least
             | that's how we did it decade ago)
        
             | Forbo wrote:
             | I vouched for your comment. A lot of the rise of Discord
             | can be attributed to convenience, network effects, and
             | pretty features like animated reactions, but ultimately it
             | is still surveillance capitalism. Unfortunately, it appears
             | that the masses don't care about things like privacy, as
             | they're more than willing to sign up for these kinds of
             | services.
             | 
             | https://www.pcgamer.com/how-private-is-your-private-
             | discord-...
        
             | jacurtis wrote:
             | > you can run mumble on any random Linux box in your house
             | 
             | That seems easy to you. That would be easy for me too and
             | most likely 90% of the people on HackerNews.
             | 
             | But the average person doesn't have a "random Linux box" in
             | their house. Most people don't even know what Linux is.
             | Most people would be overwhelmed just looking for the
             | terminal emulator on their computer, before they even typed
             | a command into it.
             | 
             | Most people don't want to manage an always-on linux box for
             | a voice server. Most people don't want to manage port-
             | forwarding on their firewall/router. Most people don't have
             | static IPs at their house and wouldn't know how to set up
             | dynamic dns to solve the problem. Most people don't even
             | know what DNS is.
             | 
             | MOST PEOPLE just want a program they can launch when they
             | want to talk to their friends. That is why Discord has been
             | successful.
             | 
             | I'm not saying that's good. I am just saying that it's the
             | way the world is.
        
         | kccqzy wrote:
         | I've also noticed that in a lot of tech-related social circles
         | people are increasingly choosing Discord over Slack. That's a
         | trend I totally didn't expect: at least until a few years ago
         | it was clear that Discord was for gamers and Slack was for work
         | and everyone else. That changed quickly. Impressive indeed!
        
           | atom_arranger wrote:
           | Slack uses a language that is like markdown, but is not
           | markdown, which is annoying for engineers.
           | 
           | Slack does not support syntax highlighting of code blocks.
           | 
           | Discord uses proper markdown and supports syntax
           | highlighting.
           | 
           | These are two things that make me think Discord is better
           | specifically for engineers, aside from it just being
           | generally way better.
        
             | derefr wrote:
             | > Slack does not support syntax highlighting of code
             | blocks.
             | 
             | It does, but only if you make your code block into its own
             | post as a "text snippet." (I assume this is because Slack's
             | internal markup doesn't allow regions to have parameterized
             | metadata, but there _is_ parameterized metadata at the
             | chat-post-event level.)
             | 
             | You also get other benefits of doing this, e.g. being able
             | to collapse the snippet, download it, etc. Code pasted into
             | Slack should really always be pasted as a snippet. I just
             | wish it auto-detected you were trying to do that and
             | offered to make a snippet.
        
               | atom_arranger wrote:
               | Yeah, I have done that. It's really clunky. I've also
               | done it to be able to use headings and other things that
               | I'd prefer just worked in the main chat.
        
             | gsliepen wrote:
             | The most frustrating thing is that somewhere on their
             | website they mentioned that most people are not familiar
             | with the Markdown syntax, so they chose not to use it. But
             | instead they created their own syntax that even fewer people
             | are familiar with...
        
             | masklinn wrote:
             | > Discord uses proper markdown
             | 
             | Not even remotely.
             | 
             | Discord supports:
             | 
             | * fenced code blocks (but not indentation)
             | 
             | * quoting, _a single level_ (nested quotes don't work,
             | properly replying to other comments is painful)
             | 
             | * inline decorations (italics, bold, underline,
             | strikethrough, code)
             | 
             | * inline spoilers (an extension)
             | 
             | * disabling autolinking (another extension)
             | 
             | It doesn't support: headings, paragraphs, lists (ordered or
             | not), labelled links, tables, footnotes, images (you can
             | only use the image upload feature which puts a single image
             | below a comment).
             | 
             | It also has a limit of 2000 chars (4000 with nitro), which
             | can be rather low when posting code snippets.
        
             | MichaelZuo wrote:
             | That does seem bizarre. Was there some special reason for
             | them introducing almost-but-not-quite markdown as a
             | feature?
        
           | Nadya wrote:
           | At my work we use Discord to have virtual "desks" (really
           | just audio channels) so people can drop by and chat while you
           | are at your desk. If you're busy or don't wish to be
           | disturbed you can 'lock' your desk to prevent people from
           | joining it (it limits the room size to 1, aka just you).
           | 
           | It really has helped the social factor of moving nearly
           | everyone in the office to remote working. Every department
           | that has adopted the "virtual office" Discord setup loves it
           | over Slack and basically never uses Slack anymore. It's way
           | less awkward to call people, it's easier to not incidentally
           | disturb them when they're busy, during breaks/lunch you can
           | go to the "breakroom" and hang out and chat with everyone
           | else. And it was all very easy to set up and with the Discord
           | server template stuff we can even clone it for each
           | department with very minimal work (renaming channels to that
           | department's people).
        
             | victortroz wrote:
             | Slack implemented something similar called Huddles, but I
             | think it is for paid plans only. I personally think Slack
             | call quality in general is much worse than other services
             | or platforms like Discord or Meets/etc, so I don't know if
             | it'll really help reaching people and companies that are
             | using alternatives for voice.
        
               | didibus wrote:
               | Huddles uses Amazon's chime backend for audio, so it
               | should perform much better than the current "audio calls"
               | that slack had, though I haven't tried it yet.
        
               | moojd wrote:
               | My company was also using discord for virtual desks and
               | have moved to slack huddles. They work as well as discord
               | voice channels.
        
           | vxNsr wrote:
           | The free discord tier is better than the free slack tier.
           | That's honestly why 90% of ppl use discord over slack.
           | 
           | Also paid discord is 100x cheaper than paid slack, for non-
           | corporate entities. You can get top tier discord for like
           | $100/m while slack price goes up with each user. Not to
           | mention that discord allows users to easily assist in
           | upgrading your server while slack doesn't have that
           | functionality at all.
        
           | Xavdidtheshadow wrote:
           | I think which fits your community is really dependent on your
           | use case.
           | 
           | Discord has great tools around moderation and membership
           | tiers; it's designed for users you don't trust.
           | 
           | Slack is much more for a community where everyone knows each
           | other (or at least trusts each other a bit, like you'd trust
           | a coworker).
        
             | PeterCorless wrote:
             | Agreed. If you want to manage a lot of people who you don't
             | really "know," then Discord has all sorts of roles and
             | rules you can set up.
        
           | [deleted]
        
           | craftinator wrote:
           | > it was clear that Discord was for gamers and Slack was for
           | work
           | 
           | One of the best hammers I own is a screwdriver.
        
           | munchbunny wrote:
           | Discord hasn't gotten bloated yet the way Slack has, which
           | makes it much more pleasant when all you actually want to do
           | is chat and maybe hang out on voice and sometimes with a
           | screen broadcast.
           | 
           | Once you're in an enterprise space, Slack's features become
           | actually useful.
        
           | nonbirithm wrote:
           | They also pivoted their marketing message away from being
           | "for gamers" and towards anyone who wanted "a place to hang
           | out," like developer groups or high schoolers.
        
             | hi5eyes wrote:
             | Discord's pushing for school groups to use it, especially
             | with the .edu pop up on launch
             | 
             | Discord School Hubs page:
             | https://support.discord.com/hc/en-
             | us/articles/4406046651927-...
             | 
             | https://www.reddit.com/r/discordapp/comments/p37s7s/so_disc
             | o...
        
           | mciancia wrote:
           | Well, slack is extremely expensive IMO, so I'm not surprised
        
             | TillE wrote:
             | Yeah I always thought Slack is explicitly intended for Real
             | Work and not much else (pay per user!), so it's pretty
             | natural that Discord took over.
        
         | brightball wrote:
         | Elixir ftw
        
           | mrdoops wrote:
           | Want to build a real time chat app? Best build it on the
           | BEAM.
        
       | paulryanrogers wrote:
       | TLDR MongoDB then Cassandra
        
         | PeterCorless wrote:
         | ...then [post this blog] Scylla.
         | 
         | https://www.scylladb.com/press-release/discord-chooses-scyll...
        
         | typon wrote:
         | Did they make the move to ScyllaDB, as mentioned in their
         | "Future work" section?
        
           | misframer wrote:
           | Their jobs pages (e.g. [0]) mention "ScyllaDB/Cassandra".
           | 
           | [0] Senior Site Reliability Engineer:
           | https://discord.com/jobs/4004051002
        
             | mikesun wrote:
             | Also:
             | 
             | https://discord.com/jobs/5411664002
             | 
             | https://discord.com/jobs/5426301002
        
           | Sikul wrote:
           | We've moved quite a few datasets from Cassandra to Scylla,
           | but not messages. I think we're planning to make a blog post
           | about our experience with Scylla at some point.
        
             | PeterCorless wrote:
             | Definitely lemme know when you are going to do that!
             | (peter@scylladb.com here).
        
         | yeswecatan wrote:
         | Still a good read if you're curious about Cassandra
        
           | gregoriol wrote:
           | 2017 is pretty antique though now: the scaling and the
           | ecosystem change fast
        
             | yeswecatan wrote:
             | I admittedly know very little about this space. What are
             | some of the newer players?
        
               | biggestdummy wrote:
               | Scylla, for one. Which is mentioned as a possible "next
               | step" in the article.
        
               | PeterCorless wrote:
               | If you are not familiar, DB-engines.com keeps a listing
               | of who the major players are in the database world.
               | 
               | https://db-engines.com/en/ranking
               | 
               | MongoDB is ranked #5 on the list at present; Cassandra
               | comes in at #11. (And Scylla, which they moved most of
               | their workload to from Cassandra, is currently #88.)
               | 
               | DB-engines also have specific rankings for what are known
               | as 'NoSQL wide column stores' -- which is what Cassandra
               | and Scylla are classed as:
               | 
               | https://db-engines.com/en/ranking/wide+column+store
               | 
               | Note that MongoDB is a different class of NoSQL entirely.
               | It is a "document store" -- MongoDB is the most popular
               | document store.
               | 
               | https://db-engines.com/en/ranking/document+store
               | 
               | But what this means is that even though MongoDB,
               | Cassandra and Scylla are all "NoSQL", making this move for
               | Discord required significant data modeling and migration.
               | 
               | (Note that the difference between Cassandra and Scylla is
               | far narrower. Both use the same data model and Cassandra
               | Query Language (CQL).)
               | 
               | Hope that helps give you some orientation in the NoSQL
               | database field.
        
       | ryanianian wrote:
       | TFA states:
       | 
       | > we knew we were not going to use MongoDB sharding because it is
       | complicated to use and not known for stability
       | 
       | But then goes on to describe using Cassandra and overcoming
       | sharding and stability issues. I.e., changing the key, changing
       | TTL knobs, adding anti-entropy sweepers, and considering
       | switching to a different cassandra impl entirely.
       | 
       | Are these issues significantly _harder_ to solve in MongoDB than
       | Cassandra?
        
         | spmurrayzzz wrote:
         | It's hard to say because they're not explicit about this but,
         | despite being a decade-long Mongo apologist myself, I'd totally
         | believe that they liked the linear scale story for Cassandra
         | more from an infrastructure/config perspective.
         | 
         | Increasing top-end write throughput or replication in Cassandra
         | is just adding more nodes, where in Mongo it's not just adding
         | nodes, it's adding replica sets (which consist of 3 or more
         | nodes). So there's a few more layers of complexity to that
         | story. You need more replica sets to increase write throughput
         | and need more nodes in replica sets to increase replication.
         | 
         | I'm hand-waving some details here, but I've worked with both
         | platforms and can definitely understand the choice, at least
         | from a pure infra lens.
        
         | PeterCorless wrote:
         | At the time MongoDB's sharding story wasn't great. They've
         | gotten better since, but still have a primary-replica set model
         | that has a single point of failure/failover. Cassandra (and
         | Scylla) are leaderless, peer-to-peer clustering. Any node can
         | go offline and the cluster keeps humming. Cassandra shards per
         | node. Scylla goes beyond that and shards per core.
         | 
         | Cassandra and Scylla also use hinted handoffs so if a node is
         | unavailable temporarily (up to a few hours) you can store
         | "hints" for it when it comes back online. Handy for short admin
         | windows.
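The hinted-handoff idea described above can be illustrated with a toy, in-memory simulation. This is only a sketch under stated assumptions, not how Cassandra or Scylla implement it: the `Node` and `Coordinator` classes and their methods are hypothetical names invented for illustration, hints here live in memory with no expiry window, and real systems also handle consistency levels, timestamps, and on-disk hint storage.

```python
from collections import defaultdict


class Node:
    """Hypothetical toy replica: an in-memory key/value store that can be up or down."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    """Hypothetical coordinator: applies writes and keeps hints for down replicas."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = defaultdict(list)  # replica name -> [(key, value), ...]

    def write(self, key, value):
        for node in self.replicas:
            if node.up:
                node.data[key] = value
            else:
                # Replica unreachable: remember the write instead of failing it.
                self.hints[node.name].append((key, value))

    def replay_hints(self, node):
        """Deliver stored hints, in order, once the node is back online."""
        for key, value in self.hints.pop(node.name, []):
            node.data[key] = value


a, b, c = Node("a"), Node("b"), Node("c")
coord = Coordinator([a, b, c])

coord.write("msg:1", "hello")
b.up = False                   # short admin window on node b
coord.write("msg:2", "world")  # b misses this write; a hint is kept for it
b.up = True
coord.replay_hints(b)          # b catches up from the stored hints
```

After the replay, node `b` holds the same two keys as `a` and `c`, and the hint queue for `b` is empty. A real cluster bounds how long hints are kept, which is why the comment above limits this to short outages.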
        
           | trashcan wrote:
           | MongoDB has the equivalent of hinted handoffs. Changes are
           | streamed to secondary nodes via the oplog, and the secondary
           | just resumes where it was once it is back online. There is a
           | limit to how long it can be offline (based on the size of the
           | oplog), but that is the same limitation as hinted handoffs.
        
             | PeterCorless wrote:
             | Thanks! Good to know.
        
           | calmoo wrote:
           | A MongoDB shard isn't necessarily a single-point-of-failure
           | since a shard is usually deployed as a replica set. If a
           | shard's primary node goes down, a secondary node in the
           | replica set is elected as a primary and takes reads + writes.
           | Similar to what you mentioned for Scylla - a node can go
           | offline on a shard in a MongoDB cluster and it keeps humming.
        
       | jamesdwilson wrote:
       | With privacy concerns, companies should be shamed for storing
       | billions of messages.
        
         | colesantiago wrote:
         | > With privacy concerns, companies should be shamed for storing
         | billions of messages.
         | 
         | should we shame ycombinator for storing the messages, accounts
         | and comments on hacker news then?
         | 
         | I am still unable to delete my account here even though the
           | CCPA and the GDPR exist. But here we are.
        
           | mishafb wrote:
           | Your comments are a public thing you choose to publish to the
           | world, but many (most?) discord messages are not. You'd want
           | them to get deleted.
        
             | colesantiago wrote:
             | Not if one would want to choose to delete their account
             | afterwards with the expectation that all comments would be
             | deleted as well.
        
             | cdash wrote:
             | Sorry but no, I don't want my messages deleted. I use
             | search history all the time even to search for things I
             | personally said. I can get behind deleting messages if the
             | account is deleted though, but as some type of automatic
             | thing based on time in the past? No thanks.
        
               | jollybean wrote:
               | That you would choose to 'keep your messages' is a little
               | beside the point of those who would opt for more
               | privacy.
               | 
               | Nobody is making the argument you should be forced to
               | delete your messages.
               | 
               | In any normal world, messages that are not used would be
               | deleted as a matter of privacy. They're kept, because
               | they can be kept, and they can be monetized. That
               | monetization has zero benefit to the user, it's just an
               | artifact of our odd way of doing business where we
               | continue to externalize a lot of things. I think over the
               | next 10 years we might see a regulatory shift, which also
               | means costs more directly exposed, meaning Discord may
               | cost $1 month, i.e. the externalization 'costed in' like
               | carbon tax on fuels.
        
           | antoinealb wrote:
           | Honest question: could it be argued that HN does not store
           | any PII and therefore does not need to let you delete your
           | account?
        
         | losteric wrote:
         | That is a fair concern. However, as a customer - searching my
         | message history is a desirable feature. I would rather see
         | meaningful individual and corporate accountability for privacy
         | breaches. The threat of jail and/or 100MM's in fines should
         | motivate better data handling.
        
       | jgilias wrote:
       | Can anyone share experiences with using Discord as a
       | communications tool in a workplace? We're currently on Google
       | Chat because it comes with the package that we pay for anyway,
       | but it's pretty lame. So from time to time we consider jumping to
       | Slack. But then, why not Discord?
        
         | bhouser wrote:
         | Discord would be awesome for work ... I want to use it so bad.
         | But the terms of service are completely unpalatable. Discord
         | basically gets a perpetual license to anything you post.
        
           | bhouser wrote:
           | The reason for Discord over Slack, BTW is voice & video.
           | Otherwise Slack is working great.
        
             | didibus wrote:
             | Slack added voice now:
             | https://slack.com/help/articles/4402059015315-Start-a-
             | huddle...
        
               | YetAnotherNick wrote:
               | They had voice from beginning. It just sucked so bad that
               | they rebranded it to launch halfway decent option.
        
               | styren wrote:
               | "Huddles" pick up so much noise from the background I
               | stopped using it. Whenever I had to take a call without
               | headphones on it was basically unusable. I'm a single
               | data point however, your mileage may vary.
        
       | Quarrelsome wrote:
       | do they ever explain what their "anti-entropy" processes are for
       | and what they do?
        
       | josephd79 wrote:
       | oh great, here come the mongodb haters.
       | 
       | I do use discord for a few groups, too bad they will not allow
       | 3rd party clients because a discord terminal app similar to irssi
       | would be awesome
        
         | PeterCorless wrote:
         | MongoDB is great for developers. Very facile to get started.
         | However, it tends to fall over when it hits scale -- which
         | could be in total data set size (like, >TB scale), transaction
         | scale (>100k ops) or in low latencies (submillisecond to
         | single-digit millisecond).
         | 
         | In any of those domains, if you are trying to solve your
         | problem with MongoDB you are in for a world of hurt.
         | 
         | That's generally when people start looking at other options.
         | Whether an in-memory system for pure speed, or a horizontally
         | scalable system for raw size or throughput.
        
       | c7DJTLrn wrote:
       | They wouldn't need to store so many if they actually let people
       | delete their messages on account deletion. Instead, they ban many
       | people who attempt to do so via automated scripts.
        
         | anigbrowl wrote:
         | I don't think they delete anything really. I've retrieved stuff
         | from servers that were ostensibly deleted over a year ago.
        
           | c7DJTLrn wrote:
           | Well, data is their greatest asset I'd assume. They don't
           | want that precious user data going anywhere.
        
         | simonw wrote:
         | Deletion of data at scale is a really difficult technical
         | problem, unfortunately.
         | 
         | I'm not saying they shouldn't do that though - especially given
         | regulations like GDPR. Designing systems for deletion is
         | important! But it's also really hard, especially if you didn't
         | design for it from the start.
         | 
         | There's also no way the tiny fraction of users who want to
         | delete their data would make up a significant enough proportion
         | of the messages that it would impact their scaling strategy.
        
       | NotAnOtter wrote:
       | This is like a masterclass in how to answer system design
       | questions. Maybe a bit verbose. They cover requirements, how to
       | answer those requirements, relevant tech for the problem,
       | implementation, and techniques for maintenance
        
         | bcrosby95 wrote:
         | Are you saying that one person, without consulting other
         | engineers, or having to do any research, made this decision in
         | less than a day? Because I have some bad news for you.
         | 
         | The sentiment you express here is why interviewing is so shit
         | these days.
        
           | whymauri wrote:
           | Are you replying to the right comment?
        
             | imdsm wrote:
             | They must be. Totally out of context response.
        
           | NotAnOtter wrote:
           | No? But I am saying this is like an extremely articulate,
           | over the top, excessively detailed answer for a system design
           | question. I'm not saying this is what people should aim for,
           | just that it's a good example of the types of things you
           | should discuss during a system design interview. I mean
           | that's what it IS. It's a system design.
        
           | manigandham wrote:
           | 1) Yes, there are people that can do this.
           | 
           | 2) But this isn't an interview situation. This was "system
           | design questions" as in how to solve a problem as a company
           | using the whole team (4 backend engineers at that time).
        
         | [deleted]
        
       | vortico wrote:
       | I'm curious about the 2021 measure of total disk space that Discord
       | consumes. Servers that I'm in share images every few minutes,
       | which must add up pretty quick.
        
         | snak wrote:
         | The Cassandra cluster mentioned (12 nodes, 1TB each) only
         | handles text, as far as the article goes.
        
           | jhgg wrote:
           | We're well over 12 nodes in current year :P
        
             | PeterCorless wrote:
             | lol. :D
        
             | snak wrote:
             | Oh, just noticed the article is from 2017. Is there a newer
             | one, related/similar to this one?
        
         | latchkey wrote:
         | Unique images or just copied from elsewhere?
        
           | objektif wrote:
           | How does that matter? Do they keep track of all images on the
           | internet?
        
       | legerdemain wrote:
       | We took a big bet on Cassandra, and then on an opinionated
       | wrapper around Cassandra at $PASTJOB. The use case was a text
       | search engine for syslog-type stuff.
       | 
       | The product we built using Cassandra was widely known as our
       | buggiest and least maintainable, and it died a merciful death
       | after several years of being inflicted on customers.
       | 
       | We didn't have a good handle on the exact perf implications of
       | different values of read/write replication. Writing product code
       | to handle a range of eventual consistency scenarios is
       | challenging. The memory consumption and duration of compactions
       | and column/node repair jobs is hard to model and accommodate.
       | It's hard to tell what the cluster is doing at any given moment.
       | Our experience with support plans from Datastax was also pretty
       | dismal.
       | 
       | Maybe the situation has changed since 2016. In my experience with
       | several employers since then, it seems like every enterprise
       | architect fell in love with Cassandra around 2014-2015 and then
       | had a long, painful, protracted breakup.
        
         | trashcan wrote:
         | I've used Cassandra at two companies, and had the exact same
         | experience as you at the first company. At a much bigger
         | company that had some very, very highly paid Cassandra DBAs it
         | was actually a relatively smooth experience.
        
           | objektif wrote:
           | How much is very very high pay?
        
             | trashcan wrote:
             | I don't think it would be appropriate for me to say very
             | specifically, but I suspect about double what a software
             | engineer with the same amount of experience would earn.
        
         | PeterCorless wrote:
         | We just benchmarked Cassandra 4.0, which is brand-spanking new.
         | 
         | The good news: C* 4.0 is a far better performing database than
         | C* 3.11. The new GCs definitely get rid of the long tail
         | latency nightmares:
         | 
         | https://www.scylladb.com/2021/08/19/cassandra-4-0-vs-cassand...
         | 
         | However, we also compared it to Scylla's latest release, and
         | though C* 4 is _better_, you can still find other
         | CQL-compatible databases that outperform it. Especially around
         | compactions and topology changes:
         | 
         | https://www.scylladb.com/2021/08/24/apache-cassandra-4-0-vs-...
         | 
         | Just published these numbers today.
        
         | yukinon wrote:
         | If you had to do it again, what would you choose instead?
        
           | legerdemain wrote:
           | These days? Redshift or BigTable (I first wrote BigQuery;
           | thanks for correcting my think-o!). Back then? Maybe HBase.
        
             | chillacy wrote:
             | Do you mean BigTable? That's Google's "HBase" insofar as
             | HBase is based on the BT paper.
             | 
             | From what I recall from using it a few years ago, it's
             | pretty damn fast, very low latency. HBase had speedy p50s
             | as well but tended to get quite slow at p99 due to GC.
        
       | klaussilveira wrote:
       | Did they ever make it to ScyllaDB?
        
         | jhgg wrote:
         | Yea. Everything but messages is on Scylla.
        
           | res0nat0r wrote:
           | It's unfortunate Discord is still requiring relocation to
           | SFO, the product is amazing and it looks like some awesome
           | engineering behind the scenes that would be fun to work on!
        
             | Sikul wrote:
             | This isn't true anymore. We switched to allowing and
             | supporting permanent remote work (writing this from
             | Seattle).
        
               | res0nat0r wrote:
               | I applied the other month for a job that mentioned SFO or
               | remote, then halfway through the signup it stated that
               | they were allowing folks to work remote until COVID was
               | better, then wanted folks to be onsite, and then prompted
               | for a yes/no if I was willing to move to SFO at a later
               | date. Didn't get a chance to talk to anyone and expect it
               | was because of this, so is a bit disappointing.
        
               | Sikul wrote:
               | That's weird. Do you have a link to the job posting?
        
               | res0nat0r wrote:
               | I believe it was this one:
               | https://discord.com/jobs/4004051002
               | 
               | I see a couple of Storage Engineer job links below, I
               | should apply to one of those too... :)
        
               | Sikul wrote:
               | Ah yeah, just tested it and it says "IF APPLICABLE, WOULD
               | YOU BE WILLING TO RELOCATE TO DISCORD'S SF HQ? WHILE
               | DISCORD IS EMBRACING A HYBRID REMOTE APPROACH GOING
               | FORWARD, SOME ROLES WILL REMAIN HQ-BASED."
               | 
               | It is not applicable for that role. It is not applicable
               | for most (maybe all) engineering roles.
        
               | res0nat0r wrote:
               | Yep that is it!
        
           | AlexAndScripts wrote:
           | Why not messages?
        
             | jhgg wrote:
             | Upstream blockers in Scylla + it's our biggest dataset, and
             | is a quite large project.
        
               | swman wrote:
               | :wave:
               | 
               | Do you use Cassandra for all your access patterns or do
               | you use something else (Elasticsearch or something)
               | also?
               | 
               | I'm just curious as in my professional career I recently
               | switched to platform engineering from full stack
               | mobile/web software engineer.
               | 
               | Thanks!!!
        
         | mikesun wrote:
         | Yes. And we're hiring storage engineers!
         | https://discord.com/jobs/5411664002
         | https://discord.com/jobs/5426301002
        
       | dave_sullivan wrote:
       | Maybe I don't know enough about databases, but could you really
       | not have done this with postgres?
        
         | simonw wrote:
         | I'm very much in the "use PostgreSQL unless you have absolutely
         | proven to yourself that it won't work for your project" camp
         | but in this case it really does look like moving to Cassandra
         | was a good choice.
         | 
         | NoSQL scalable stores like Cassandra basically only work well
         | if you have a very strong model of the queries that you will
         | need to make.
         | 
         | In this case, that's exactly what they had: they knew what
         | their read/write patterns looked like and they knew that they
         | would be growing at hundreds of millions of rows per month, so
         | easy horizontal scalability was a hard requirement.
         | 
         | The biggest weakness of classical relational databases like
         | PostgreSQL comes when you have a super high volume of inserts
         | (as opposed to updates) which will continue to grow your
         | database over time, and you need to keep all of that data
         | accessible for real-time queries.
         | 
         | They might have been able to achieve something like this using
         | a PostgreSQL extension such as Citus, but it really does look
         | like what they are doing fits Cassandra's sweet spot.
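The point about needing "a very strong model of the queries" can be made concrete with a toy, in-memory sketch. It is not Discord's schema or Cassandra's API: the `MessageStore` class, the `(channel_id, bucket)` partition key, and the 10-day bucket width are all assumptions chosen for illustration. The one query the layout was designed for (recent messages in a channel) touches only a few partitions, while a query it was not designed for (all messages by an author) has to scan everything.

```python
from bisect import insort
from collections import defaultdict

# Assumed bucket width for this sketch: 10 days, in milliseconds.
BUCKET_MS = 10 * 24 * 60 * 60 * 1000


class MessageStore:
    """Toy model of a store partitioned by (channel_id, time bucket)."""

    def __init__(self):
        # (channel_id, bucket) -> rows kept sorted by timestamp
        self.partitions = defaultdict(list)
        # channel_id -> oldest bucket that holds data, so reads know where to stop
        self.oldest = {}

    def insert(self, channel_id, ts_ms, author, text):
        bucket = ts_ms // BUCKET_MS
        insort(self.partitions[(channel_id, bucket)], (ts_ms, author, text))
        self.oldest[channel_id] = min(self.oldest.get(channel_id, bucket), bucket)

    def recent(self, channel_id, now_ms, limit):
        """The planned query: newest messages in one channel.

        Walks that channel's buckets from newest to oldest and stops as
        soon as enough rows have been collected.
        """
        out = []
        bucket = now_ms // BUCKET_MS
        oldest = self.oldest.get(channel_id, bucket + 1)
        while bucket >= oldest and len(out) < limit:
            rows = self.partitions.get((channel_id, bucket), [])
            out.extend(reversed(rows))
            bucket -= 1
        return out[:limit]

    def by_author(self, author):
        """An unplanned query: nothing in the key helps, so scan every partition."""
        return [row for rows in self.partitions.values()
                for row in rows if row[1] == author]
```

In a real wide-column store the second query would usually mean a second table keyed by author, which is why the access patterns have to be known before the schema is chosen.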
        
         | chillacy wrote:
         | Afaik you could up until the data exceeds your capacity to fit
         | in one machine, at which point you have to figure out how to
         | split your data up in a way which lets you preserve all the
         | strengths of sql (strong consistency). At that point you run
         | into a lot of complexity with managing your shards.
        
       | Ansil849 wrote:
       | Nothing about any sort of encryption.
        
         | imdsm wrote:
         | That's an interesting point too. They talk about not being a
         | blob store, not wanting the serialisation cycle to hamper
         | performance, but it makes you wonder how exactly they're storing
         | the data. I'd guess it's not encrypted at all.
         | 
         | ETA: Going back to the original thread, the whole question of
         | encryption seems to be dodged and that usually means the answer
         | isn't the one people are looking for:
         | https://news.ycombinator.com/item?id=13440921
        
       | andrewstuart wrote:
       | I'd first reach for Postgres to do this. Anyone have any idea how
       | Postgres would stack up in a similar challenge?
        
         | chillacy wrote:
         | Hmm... have not tried this myself, but just brainstorming. You
         | could shard based on discord server or chat room, which would
         | give "read before write" consistency since writes can lock, but
         | then you'd have to manage shards to account for varying loads
         | like servers/rooms which grow rapidly, and deal with hot shards
         | which might outgrow the capacity of a single db server.
         | 
         | Given that they said their requirements were "linear
         | scalability, automatic failover, low maintenance, predictable
         | performance", I don't think I'd go that route.
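The hot-shard concern in the brainstorm above can be illustrated with a small sketch. It assumes a naive modulo router over integer channel IDs; `shard_for`, `shard_load`, and the write counts are hypothetical and not taken from the article or from any Postgres tooling.

```python
from collections import Counter


def shard_for(channel_id: int, num_shards: int) -> int:
    """Naive static routing: each channel lives on exactly one shard."""
    return channel_id % num_shards


def shard_load(writes_per_channel: dict, num_shards: int) -> Counter:
    """Sum the write volume that lands on each shard."""
    load = Counter()
    for channel_id, writes in writes_per_channel.items():
        load[shard_for(channel_id, num_shards)] += writes
    return load


# Hypothetical traffic: three quiet channels and one very busy one.
writes = {1: 10, 2: 10, 3: 10, 4: 1000}
load = shard_load(writes, num_shards=4)
hottest_shard, hottest_writes = load.most_common(1)[0]
share = hottest_writes / sum(load.values())
```

With these made-up numbers one shard takes almost all of the writes, and because a channel cannot be split across shards under this routing, adding shards does not help. Rebalancing a fast-growing channel then becomes an operational task, which is the maintenance burden the comment is pointing at.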
        
         | swman wrote:
         | At discord scale (global), I'd assume keeping things in sync
         | even with R/W + global replication needs would be quite
         | difficult.
        
       | mikesun wrote:
       | We're actively hiring for our storage infrastructure team (SF or
       | remote)! If this sounds interesting, check us out!
       | 
       | https://discord.com/jobs/5411664002
       | 
       | https://discord.com/jobs/5426301002
        
       | misframer wrote:
       | Looks like they migrated (at least partially) to Scylla: "Discord
       | Chooses Scylla as Its Core Storage Layer" (2020)
       | 
       | https://www.scylladb.com/press-release/discord-chooses-scyll...
        
         | PeterCorless wrote:
         | Yes. They started experimenting with Scylla earlier, and made
         | the switch in 2020. Here's more on their logic -- they like
         | "opinionated systems":
         | 
         | https://www.scylladb.com/2019/03/20/discord-on-the-joy-of-op...
        
       | secondcoming wrote:
       | They should check out Scylla if they want even faster queries and
       | - depending on their workload - fewer instances.
       | 
       | Edit: It seems they have moved to Scylla
        
         | kaladin-jasnah wrote:
         | They already did--
         | https://news.ycombinator.com/item?id=28293097, main article:
         | https://www.scylladb.com/press-release/discord-chooses-scyll...
        
       ___________________________________________________________________
       (page generated 2021-08-24 23:00 UTC)