Post AOKqlD1DoN143vkCps by zladuric@mastodon.technology
 (DIR) Post #AIqri1Ph02pO8GYDM8 by ashfurrow@mastodon.technology
       2022-04-26T12:49:07Z
       
       1 likes, 0 repeats
       
       Here are the DigitalOcean graphs for mastodon.technology over the past 7 days, and the past 24 hours. Yesterday required us to rebalance our resource utilization (before this week, we were over-provisioned and were preparing to decrease server resources to save costs 😅). I think we're in an okay spot. We don't have a lot of headroom left. Fingers crossed today. We may need to close new registrations, but we'll be upfront if we make that decision.
       
 (DIR) Post #AIqri2vHO5gkoWKxEG by ashfurrow@mastodon.technology
       2022-04-26T12:50:57Z
       
       0 likes, 0 repeats
       
       Along with myself, this instance is moderated by @bclindner and @fuzzface – I'm fortunate to have such a great team 🙌
       
 (DIR) Post #AIqri4Pnq5hNRTcqRc by ashfurrow@mastodon.technology
       2022-04-26T12:51:25Z
       
       0 likes, 0 repeats
       
       This instance is funded by a Patreon – if you'd like to contribute, head over to https://www.patreon.com/ashfurrow
       
 (DIR) Post #AIqri5oedBAhmqGCoq by ashfurrow@mastodon.technology
       2022-04-26T13:26:12Z
       
       0 likes, 0 repeats
       
       One year ago, this instance was running on Docker. This configuration had been officially supported, but discouraged, by Mastodon maintainers – and for good reason. It introduced so many problems and so much overhead that I'm really grateful that everyone peer-pressured me into migrating off it. Now it runs as standard Linux services. I'm positive this instance would be in tears right now, otherwise 😆
       
 (DIR) Post #AIr3PnUVDthRBqsR8q by ashfurrow@mastodon.technology
       2022-04-26T21:58:26Z
       
       0 likes, 0 repeats
       
       Sidekiq queues have continued to get worse. (But at a slower rate! Yes!!!) I’ve played with the config of everything as much as I’m comfortable – a sluggish instance is better than an unresponsive instance.
       
 (DIR) Post #AIr3PoAgh3RZIgc7tY by ashfurrow@mastodon.technology
       2022-04-26T22:02:49Z
       
       0 likes, 0 repeats
       
       43k jobs queued up. default has a latency of 22 minutes, pull is over two hours lol okay. I see a few jobs taking a long time (>10 minutes) but most finish instantly. I’m trying to figure out where the next bottleneck to scale up is. How do I get the CPU cores doing more? PgBouncer? Multiple Sidekiq services? It seems like I have CPU headroom, so should I get more aggressive with my Postgres config? Is there a quick win without prolonged downtime? These are the questions on my mind.
       
 (DIR) Post #AIr3Pogwl7FQujhtB2 by ashfurrow@mastodon.technology
       2022-04-26T22:07:02Z
       
       0 likes, 0 repeats
       
       Yesterday we processed a record-setting 1M Sidekiq jobs. Today is at 1.4M already! Sidekiq seems to be the bottleneck, but I can’t scale it up without running out of Postgres connections. That makes me think PgBouncer might be the necessary next step. It gives me more connections, so Sidekiq can do more at once, so my idling CPUs can get to work. Does that make sense? (Bonus: the Mastodon docs have a guide on how to do this.)
       
 (DIR) Post #AIr3PpIsU5aaoNSBIe by shadowfacts@social.shadowfacts.net
       2022-04-26T22:18:30.980902Z
       
       0 likes, 0 repeats
       
       @ashfurrow before going to pgbouncer, you could try increasing the size of the db connection pool. i'm not sure exactly the tradeoff between increasing the pool size and using pgbouncer, but the pool size should be easier to experiment with
       
 (DIR) Post #AIr66F59ELNipiNuOe by stux@mstdn.social
       2022-04-26T22:48:31Z
       
       0 likes, 0 repeats
       
       @ashfurrow @rodti Ahh found it! Take a look at this :blobcatgiggle: Lol, it was so simple to find.. :mortyderp: (brainfart for me!) https://github.com/mstdn/Mastodon/tree/main/dist
       
 (DIR) Post #AIr6BBSS8FqL2bdqCm by stux@mstdn.social
       2022-04-26T22:49:25Z
       
       0 likes, 0 repeats
       
       @ashfurrow @rodti Result: You can add as many workers for each job as you need yay! 💪
       
 (DIR) Post #AIr6hyyJZmIps9BUcS by ashfurrow@mastodon.technology
       2022-04-26T22:37:26Z
       
       1 likes, 0 repeats
       
       @shadowfacts I have already increased it 😬 Currently at 300.
       
 (DIR) Post #AIr7PYfyhj7H3jO0Aa by ashfurrow@mastodon.technology
       2022-04-26T23:03:03Z
       
       0 likes, 0 repeats
       
       @stux @rodti amazing! thank you!
       
 (DIR) Post #AIr83xOYDebTJ5UxHs by ashfurrow@mastodon.technology
       2022-04-26T23:08:24Z
       
       0 likes, 0 repeats
       
       @stux @rodti and the total number of workers needs to be less than the Postgres connection limit, right?
       
 (DIR) Post #AIr8yca2HS9ZG0tEPo by ashfurrow@mastodon.technology
       2022-04-26T23:20:42Z
       
       0 likes, 0 repeats
       
       @stux @rodti Okay! I split out new services for pull and default specifically, and left the existing service as-is for now. CPU usage jumped from 25% to about 50% whoa! And the queue size decreased from 26k to 14k pretty rapidly – default is empty again! Yes! Thank you!!! So there are a total of 400 threads in that screenshot, right? So your max_connections in Postgres must be *at least* that, right? I've broken things by exceeding this limit before 😬
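       The rule of thumb in the post above (total Sidekiq threads must fit within Postgres max_connections) can be checked with a few lines of arithmetic. This is only a sketch under stated assumptions: the service names, thread counts, and the allowance for other clients (web and streaming processes) are illustrative and not taken from either instance. The reserved value of 3 mirrors the PostgreSQL default for superuser_reserved_connections, and 100 is the PostgreSQL default for max_connections.

```python
from dataclasses import dataclass


@dataclass
class SidekiqService:
    name: str     # hypothetical service label, for illustration only
    threads: int  # Sidekiq concurrency for this service (the -c value)


def connections_needed(services, other_clients=0):
    """Worst case: every Sidekiq thread holds one Postgres connection at once."""
    return sum(s.threads for s in services) + other_clients


def fits(services, max_connections, other_clients=0, reserved=3):
    """True if worst-case demand stays within what Postgres hands to normal clients."""
    usable = max_connections - reserved
    return connections_needed(services, other_clients) <= usable


# Illustrative numbers only (assumptions, not the real configuration).
services = [
    SidekiqService("default", 25),
    SidekiqService("pull", 25),
    SidekiqService("everything-else", 25),
]
print(connections_needed(services, other_clients=20))
print(fits(services, max_connections=100, other_clients=20))
```

       If the check fails, the options discussed in this thread are to lower the thread counts, raise max_connections, or put a pooler such as PgBouncer in front of Postgres.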
       
 (DIR) Post #AIr96q2g74VHKFCnE8 by stux@mstdn.social
       2022-04-26T23:22:13Z
       
       0 likes, 0 repeats
       
       @ashfurrow @rodti Please don't copy my exact setup :blobcatgiggle: I think it's a little overkill for you! I don't wanna blow up your server :catblush: I think we have over 25K DB connections available, at least!
       
 (DIR) Post #AIr9FSKXdu5OOHfbCy by ashfurrow@mastodon.technology
       2022-04-26T23:23:48Z
       
       0 likes, 0 repeats
       
       @stux @rodti of course! I'm trying to understand the principles that went into your decisions 👍 I appreciate you sharing your config files as examples.
       
 (DIR) Post #AIr9OM52FXffc2f0fw by stux@mstdn.social
       2022-04-26T23:25:25Z
       
       0 likes, 0 repeats
       
       @ashfurrow @rodti Sure thing mate! :blobcathearts: Try to start with 25 jobs per "task"! And always make sure the scheduler is running :ablobwink: It's in the "default" sidekiq that comes with Masto itself 😄 If you have a lower number of workers per job you can always start/stop those when needed ^^
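       As a sketch of the advice above (one Sidekiq process per queue, starting at 25 threads, with the scheduler queue always covered), the snippet below builds the command line for each process and counts how many of them serve the scheduler. The -c (concurrency) and -q (queue) flags are standard Sidekiq CLI options. The service names and the plan itself are hypothetical, and only queues named in this thread are used.

```python
def sidekiq_command(queues, concurrency=25):
    """Build a Sidekiq command line that serves only the given queues."""
    flags = " ".join(f"-q {q}" for q in queues)
    return f"bundle exec sidekiq -c {concurrency} {flags}"


def scheduler_processes(plan):
    """Count how many planned processes serve the scheduler queue."""
    return sum(1 for queues in plan.values() if "scheduler" in queues)


# Hypothetical plan: service names are made up for illustration.
plan = {
    "sidekiq-default": ["default"],
    "sidekiq-pull": ["pull"],
    "sidekiq-scheduler": ["scheduler"],
}

for name, queues in plan.items():
    print(name, "->", sidekiq_command(queues))

# The scheduler queue must be served by some process in the plan.
assert scheduler_processes(plan) >= 1
```

       Each generated command would typically become the ExecStart line of its own systemd unit, so a single queue can be started or stopped without touching the others.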
       
 (DIR) Post #AIr9ejgiFDu9Sp5vou by ashfurrow@mastodon.technology
       2022-04-26T23:27:38Z
       
       1 likes, 1 repeats
       
       Big thanks to @rodti and @stux for their suggestion and help in splitting out Sidekiq into multiple systemd services 🙌 Our queues are empty and our CPUs are being better utilized 👍
       
 (DIR) Post #AIr9iEKYlw7dMxOnuy by stux@mstdn.social
       2022-04-26T23:28:58Z
       
       0 likes, 0 repeats
       
       @ashfurrow @rodti Always welcome! 🙇 If you need any further advice on some scaling things I got some stored somewhere on a shelf 🧠 :blobcatgiggle:
       
 (DIR) Post #AIr9lQMWUlLBYNqAT2 by ashfurrow@mastodon.technology
       2022-04-26T23:29:19Z
       
       0 likes, 0 repeats
       
       @stux @rodti :blobcatfingerguns:
       
 (DIR) Post #AIsAqA7qaIehkxl7JI by ashfurrow@mastodon.technology
       2022-04-27T11:15:59Z
       
       1 likes, 0 repeats
       
       The instance stayed up overnight with the new config changes (always a concern haha). Sidekiq queues are empty and have already chewed through an average day’s amount of jobs. I’m excited/terrified to see what happens today!
       
 (DIR) Post #AOKqlD1DoN143vkCps by zladuric@mastodon.technology
       2022-04-26T22:35:35Z
       
       0 likes, 0 repeats
       
       @ashfurrow curious, what infra do you have for the instance?
       
 (DIR) Post #AOKqlDVM0L7RZNqGno by ashfurrow@mastodon.technology
       2022-04-26T23:30:50Z
       
       1 likes, 0 repeats
       
       @zladuric The VM runs on DigitalOcean, 8 vCPUs and 16GB RAM. There's a 250GB external disk for the Postgres database. AWS S3 for bucket storage and offsite backup storage, bunny.net for CDN, and Mailgun for email.