Post AVOPQB465Pa5LnYYee by idanoo@mastodon.nz
 (DIR) Post #AVL0wLgt2FUPFB41Gi by idanoo@mastodon.nz
       2023-04-30T07:12:45Z
       
       0 likes, 0 repeats
       
        Next up on the scaling journey:
        - Configure a second OPNsense VM with CARP IPs (full failover with no downtime).
        - Invest in a 3rd server for network storage sometime this year.
       
 (DIR) Post #AVL0wMN4VPEXM0ni1Q by strypey@mastodon.nzoss.nz
       2023-05-05T07:37:14Z
       
       0 likes, 0 repeats
       
        @idanoo I'd like to pick your brains on...
        > the scaling journey
        ... if that's OK? @aurynn and @lightweight too :)
       
 (DIR) Post #AVL5wPD8RmFtvqzhr6 by idanoo@mastodon.nz
       2023-05-05T08:32:41Z
       
       0 likes, 0 repeats
       
       @strypey Sure
       
 (DIR) Post #AVNj7Xg9aXZTuQmwsa by strypey@mastodon.nzoss.nz
       2023-05-06T15:01:47Z
       
       0 likes, 0 repeats
       
        @idanoo @aurynn @lightweight Specifically, right now I'm interested in scaling Mastodon. Like, roughly where have you found the transition points to be, as user numbers / posting activity grow, where new tactics are called for to keep the server efficient and reliable?
        Also, have you tried any of the alternatives to Mastodon (eg Pleroma/Akkoma or Misskey/Calckey)? Better or worse performance-wise? Do they follow the same scaling curve?
       
 (DIR) Post #AVNjAUapWB33CuhmUK by strypey@mastodon.nzoss.nz
       2023-05-06T15:02:19Z
       
       0 likes, 0 repeats
       
        @idanoo Bonus question for extra points: have you tried out any of the single-user AP server packages, and how did they perform? @aurynn @lightweight
       
 (DIR) Post #AVONPmFQ9mi2TMXByK by idanoo@mastodon.nz
       2023-05-06T22:33:14Z
       
       0 likes, 0 repeats
       
        @strypey @aurynn @lightweight Unsure on comparing other services - haven't run any large enough.
        We're running PostgreSQL with 8c/8GB RAM, same with Redis (although these are shared across everything).
        You can likely get away with vertical scaling for quite a while; we only split horizontally so we could use separate hardware for the Sidekiq workers with an NFS backend. I think stock was fine for a few hundred users + small federation, like Aurynn mentioned. 1/2
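
        As a rough sketch, a split like this is usually wired up through Mastodon's .env.production on the web and Sidekiq hosts; the hostnames and pool size below are illustrative assumptions, not mastodon.nz's actual values:

        # point web + Sidekiq hosts at the shared Postgres and Redis boxes
        DB_HOST=postgres.internal
        DB_PORT=5432
        DB_POOL=25
        REDIS_HOST=redis.internal
        REDIS_PORT=6379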
       
 (DIR) Post #AVOOLGjRUQOFWXzUZc by aurynn@cloudisland.nz
       2023-05-06T20:56:19Z
       
       0 likes, 0 repeats
       
        @strypey my instance design is very different from the others, I reckon, but for me it’s not number of users but number of follow relationships that is the major load indicator. @idanoo @lightweight
       
 (DIR) Post #AVOOLHOv0DZDbBOcDo by strypey@mastodon.nzoss.nz
       2023-05-06T22:43:39Z
       
       0 likes, 0 repeats
       
        @aurynn
        > number of follow relationships that is the major load indicator
        How do you measure that? @idanoo @lightweight
       
 (DIR) Post #AVOPQB465Pa5LnYYee by idanoo@mastodon.nz
       2023-05-06T22:36:06Z
       
       0 likes, 0 repeats
       
        @strypey @aurynn @lightweight One of the first things we did was split the Sidekiq workers into separate processes with 25 threads each, all with a different order of queues so they all get some priority. Next was bumping the streaming + web processes to be able to use all the resources we had on our web instance (16c/24GB). This was around 2k users, or more like 500+ active at one time. At the same time we had to tweak nginx as we hit max_open_files + max_connections. 2/3
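
        For a sense of what that Sidekiq split looks like in practice, a minimal sketch follows; the queue order and counts are illustrative, not mastodon.nz's exact config, and the scheduler queue should only ever run in one process:

        # each process gets 25 threads and lists the queues in a different order,
        # so every queue sits at the front of the line somewhere
        DB_POOL=25 bundle exec sidekiq -c 25 -q default -q push -q ingress -q pull -q mailers
        DB_POOL=25 bundle exec sidekiq -c 25 -q push -q ingress -q default -q pull -q mailers
        DB_POOL=25 bundle exec sidekiq -c 25 -q pull -q mailers -q default -q push -q ingress
        # the scheduler queue runs in exactly one process
        DB_POOL=25 bundle exec sidekiq -c 25 -q scheduler -q default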
       
 (DIR) Post #AVOPQBrN8CzpocbuSW by idanoo@mastodon.nz
       2023-05-06T22:37:29Z
       
       0 likes, 0 repeats
       
        @strypey @aurynn @lightweight A lot of federation can definitely mean early scaling even with <100 users. As long as you've got the Sidekiq workers and DB resources to keep the queue at zero, you'll be good. 3/3
       
 (DIR) Post #AVOPQCZgTSRS23LIWm by aurynn@cloudisland.nz
       2023-05-06T22:40:25Z
       
       0 likes, 0 repeats
       
        @idanoo @strypey @lightweight My architecture is individual VM instances for each component, scaling each piece as needed. I have two webservers sitting behind a load balancer, two Sidekiq nodes, a Redis, an Elasticsearch, and a Postgres node.
        Cost isn't as much of an object for me since I'm a paid instance and hosting costs are more than covered by subscriptions.
        In November, the biggest issue for me was getting Sidekiq and Web into autoscaling so I could scale in response to load.
       
 (DIR) Post #AVOPQDBGDkV1uavJ68 by aurynn@cloudisland.nz
       2023-05-06T22:41:00Z
       
       0 likes, 0 repeats
       
        @idanoo @strypey @lightweight November kind of changed everything.
        I have 40 threads with normal distribution across my two Sidekiq nodes, and then a separate "general" instance that runs the mailer and scheduler with 5 threads.
       
 (DIR) Post #AVOPQDiaDr9dZwVv2O by strypey@mastodon.nzoss.nz
       2023-05-06T22:55:44Z
       
       0 likes, 0 repeats
       
        @aurynn
        > November kind of changed everything
        Understatement of the year! 🤣 @idanoo @lightweight
       
 (DIR) Post #AVOQHdeAYBYXLLv9yi by strypey@mastodon.nzoss.nz
       2023-05-06T23:05:25Z
       
       0 likes, 0 repeats
       
        @idanoo @aurynn @lightweight Thanks for all this detail, it's super helpful. Next question: what kind of hardware / hosting are you using? How many boxen are involved? What sort of specs? I'm trying to get my head around what sort of capacity we'll need to host our Mastodon farm.
       
 (DIR) Post #AVOS0tUAbos5S4O4e0 by aurynn@cloudisland.nz
       2023-05-06T23:06:04Z
       
       0 likes, 0 repeats
       
       @strypey @idanoo @lightweight just pay me to manage it 😉
       
 (DIR) Post #AVOS0u18dFF76JoP20 by strypey@mastodon.nzoss.nz
       2023-05-06T23:24:44Z
       
       0 likes, 0 repeats
       
        @aurynn
        > just pay me to manage it 😉
        🤣 But seriously, if we can get some bootstrap funding (got a couple of ideas I need to chase up), we might offer one (or more) of you some contracting work holding our hands while we're learning to find our arses with both of them 😁 @idanoo @lightweight
       
 (DIR) Post #AVP03PmctXbSdrOTDs by lightweight@mastodon.nzoss.nz
       2023-05-07T04:01:16Z
       
       0 likes, 0 repeats
       
        @strypey @aurynn @idanoo Interesting, all - sounds like we each have fairly different approaches. I know that my instances (I have 4 at the moment) are all smaller scale than yours. All mine are pure Docker Compose on commodity cloud VPSs (either on Catalyst Cloud - sponsored - or overseas, as it's about 1/20th the cost).
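
        A rough sketch of the day-to-day workflow for a stock Docker Compose deployment like that; the service names follow Mastodon's standard docker-compose.yml and this is illustrative, not lightweight's exact setup:

        # pull updated images, run migrations, then bring the stack up
        docker compose pull
        docker compose run --rm web bundle exec rails db:migrate
        docker compose up -d db redis web streaming sidekiq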
       
 (DIR) Post #AVP03QW0Aptouachwu by lightweight@mastodon.nzoss.nz
       2023-05-07T04:05:01Z
       
       0 likes, 0 repeats
       
        @strypey @aurynn @idanoo On the NZOSS instance, after boosting Sidekiqs in Nov to 5 x default (which sorted that issue), our biggest performance bottleneck has been backing up the PostgreSQL DB hourly & then gzipping it, which slows everything down. I'm planning to set up a secondary for PostgreSQL and do backups there instead. In the meantime, I've reduced backup frequency to daily.
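
        For reference, the dump-and-gzip pattern being described is roughly the following; the database name, user and backup path are assumptions for illustration:

        # dump the Mastodon database and compress it in one pipe
        pg_dump -U mastodon mastodon_production | gzip > /var/backups/mastodon-$(date +%F).sql.gz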
       
 (DIR) Post #AVP03R564LyKfR2jeS by idanoo@mastodon.nz
       2023-05-07T04:08:35Z
       
       0 likes, 0 repeats
       
        @lightweight @strypey @aurynn Yeah, I also had to cut back from hourly dumps. masto NZ is sitting at 42GB in Postgres at the moment and it was just too much to dump then upload. Instead we're shipping ZFS incrementals off-site hourly and then daily dumps to the cloud.
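
        The incremental shipping described here usually looks something like the sketch below; the pool/dataset names, snapshot labels and remote host are assumptions, not mastodon.nz's actual layout:

        # take an hourly snapshot of the dataset holding Postgres
        zfs snapshot tank/postgres@2023-05-07_0400
        # send only the changes since the previous snapshot to the off-site box
        zfs send -i tank/postgres@2023-05-07_0300 tank/postgres@2023-05-07_0400 \
          | ssh backup.example.net zfs receive -F backup/postgres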
       
 (DIR) Post #AVP03ReXwYKQRNd2uG by strypey@mastodon.nzoss.nz
       2023-05-07T05:46:11Z
       
       0 likes, 0 repeats
       
        @idanoo
        > Instead we're shipping zfs incrementals off site hourly and then daily dumps to the cloud
        The nuts and bolts of implementation are above my head for now, but saving diffs rather than full DB copies intuitively seems like a more efficient approach. @lightweight @aurynn
       
 (DIR) Post #AVT5XyydCkOx8yrVdw by strypey@mastodon.nzoss.nz
       2023-05-09T05:06:37Z
       
       0 likes, 0 repeats
       
        @aurynn
        > My architecture is individual VM instances for each component
        Do you mean full hardware virtualization, or do you mean a container for each component? @idanoo @lightweight
       
 (DIR) Post #AVT5bncVN4WX4zKgYi by idanoo@mastodon.nz
       2023-05-09T05:07:16Z
       
       0 likes, 0 repeats
       
        @strypey @aurynn @lightweight For reference, I'm running LXC containers on Proxmox: one for Postgres, one for Redis, one for Web, one for Sidekiq.
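
        On Proxmox that per-component split is typically one LXC container each, along the lines of this sketch; the VM IDs, template name and resource sizes are illustrative assumptions, not the poster's actual setup:

        pct create 101 local:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst \
          --hostname postgres --cores 8 --memory 8192 \
          --net0 name=eth0,bridge=vmbr0,ip=dhcp --rootfs local-lvm:32
        pct create 102 local:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst \
          --hostname redis --cores 8 --memory 8192 \
          --net0 name=eth0,bridge=vmbr0,ip=dhcp --rootfs local-lvm:8
        # repeated similarly for the web and sidekiq containers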
       
 (DIR) Post #AVT5eYzoCRlmamYD7g by strypey@mastodon.nzoss.nz
       2023-05-09T05:07:49Z
       
       0 likes, 0 repeats
       
        @aurynn
        > it’s not number of users but number of follow relationships that is the major load indicator
        Do you mean outward follows, inward follows, or do both create the same increase in load? @idanoo @lightweight
       
 (DIR) Post #AVT6IzObSscm3NQOgq by aurynn@cloudisland.nz
       2023-05-09T05:08:55Z
       
       0 likes, 0 repeats
       
        @strypey Being followed from outside servers is the greater load, since that correlates to more Sidekiq jobs to push data and more media traffic as other instances fetch images. @idanoo @lightweight
       
 (DIR) Post #AVT6IztnatZtc81JJY by strypey@mastodon.nzoss.nz
       2023-05-09T05:15:06Z
       
       0 likes, 0 repeats
       
        @aurynn
        > Being followed from outside servers is the greater load, since that correlates to more sidekiq jobs to push data
        I keep hearing about relays and how they can help with this kind of load. Is this something you've played around with? @idanoo @lightweight
       
 (DIR) Post #AVTcqsvWO2eifuhQR6 by aurynn@cloudisland.nz
       2023-05-09T05:06:53Z
       
       0 likes, 0 repeats
       
       @strypey full hardware virtualisation. @idanoo @lightweight
       
 (DIR) Post #AVTcqueDzK0a0qcdkm by strypey@mastodon.nzoss.nz
       2023-05-09T11:19:46Z
       
       0 likes, 0 repeats
       
        @aurynn
        > full hardware virtualisation
        I might be showing my ignorance here, but doesn't this put a lot of extra load on the server? Why not use a VM for each instance, with a container for each component? @idanoo @lightweight
       
 (DIR) Post #AVUxo1gPfPW19v4RKi by aurynn@cloudisland.nz
       2023-05-09T19:21:20Z
       
       0 likes, 0 repeats
       
        @strypey I may have misunderstood. I’m using Catalyst Cloud, so I’m running … 9? VMs for everything right now: 2x web, 2x sidekiq, a general services machine (monitoring, etc), 1 Redis, 1 Postgres, 1 Elasticsearch.
        The next generation of the system will be using Nomad as a container fabric for the Rails and maybe the Redis and Elastic parts of things. @idanoo @lightweight
       
 (DIR) Post #AVUxo2GvTeiqzA9bFI by strypey@mastodon.nzoss.nz
       2023-05-10T02:49:18Z
       
       0 likes, 0 repeats
       
        @aurynn
        > The next generation of the system will be using Nomad as a container fabric
        This one? https://www.nomadproject.io/
        What do you see as the advantages of this over other options like Kubernetes, Proxmox or Docker? @idanoo @lightweight
       
 (DIR) Post #AVUyUDbvi6JKJZ5nXM by djsumdog@djsumdog.com
       2023-05-10T02:56:57.656330Z
       
       0 likes, 0 repeats
       
        I recently tried to get Nomad working on my Pi cluster: https://battlepenguin.com/tech/rack-mount-cluster-of-raspberry-pis/
        I started making some Ansible scripts, but it was a fucking pain and I got so frustrated I switched to rsync and podman in some scripts for my deployments.
        Nomad might be simpler than k8s or DC/OS, but it’s still not easy.
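
        The rsync + podman pattern mentioned here boils down to something like the following; the host name, paths and image are hypothetical placeholders, not the poster's actual scripts:

        # copy the deployment files to the node, then (re)start the container
        rsync -avz ./deploy/ pi01:/opt/service/
        ssh pi01 "podman run -d --replace --name service \
          -p 8080:8080 -v /opt/service:/data \
          docker.io/library/nginx:alpine"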
       
 (DIR) Post #AVUzQGM5fF6TceMT0y by aurynn@cloudisland.nz
       2023-05-10T02:50:01Z
       
       0 likes, 0 repeats
       
        @strypey I'm a single-person shop; k8s requires a team of its own to run.
        Docker Swarm looks a bit unsupported.
        Not heard of Proxmox. @idanoo @lightweight
       
 (DIR) Post #AVUzQGz5KGINZabbnM by idanoo@mastodon.nz
       2023-05-10T02:53:02Z
       
       0 likes, 0 repeats
       
        @aurynn @strypey @lightweight It's really dependent on your current infra and how you plan to scale.
        Proxmox is basically a full hypervisor like ESXi with QEMU/LXC options. It's not meant so much for autoscaling and that kind of thing, since it's more on-prem than cloud.
       
 (DIR) Post #AVUzQI0tV1KwlUSZFo by strypey@mastodon.nzoss.nz
       2023-05-10T03:07:25Z
       
       0 likes, 0 repeats
       
        @idanoo
        > Proxmox is... more on-prem than cloud
        Good to know. That might be useful for a local job we've been offered. @aurynn @lightweight
       
 (DIR) Post #AaP3Oyv3OCkSblE40G by strypey@mastodon.nzoss.nz
       2023-10-03T19:44:09Z
       
       0 likes, 0 repeats
       
        @idanoo @aurynn I'm guessing one or both of you folks auto-prune your old posts? I came back to check out this very insightful thread from a few months back and all I can see are my posts 😆
        Have any of you written any blog pieces somewhere more permanent about your experiences scaling a Mastodon instance, or know of any good ones? @lightweight