Post AIr9OM52FXffc2f0fw by stux@mstdn.social
(DIR) Post #AIqri1Ph02pO8GYDM8 by ashfurrow@mastodon.technology
2022-04-26T12:49:07Z
1 likes, 0 repeats
Here are the DigitalOcean graphs for mastodon.technology over the past 7 days, and the past 24 hours. Yesterday required us to rebalance our resource utilization (before this week, we were over-provisioned and were preparing to decrease server resources to save costs). I think we're in an okay spot. We don't have a lot of headroom left. Fingers crossed today. We may need to close new registrations, but we'll be upfront if we make that decision.
(DIR) Post #AIqri2vHO5gkoWKxEG by ashfurrow@mastodon.technology
2022-04-26T12:50:57Z
0 likes, 0 repeats
Along with myself, this instance is moderated by @bclindner and @fuzzface - I'm fortunate to have such a great team.
(DIR) Post #AIqri4Pnq5hNRTcqRc by ashfurrow@mastodon.technology
2022-04-26T12:51:25Z
0 likes, 0 repeats
This instance is funded by a Patreon - if you'd like to contribute, head over to https://www.patreon.com/ashfurrow
(DIR) Post #AIqri5oedBAhmqGCoq by ashfurrow@mastodon.technology
2022-04-26T13:26:12Z
0 likes, 0 repeats
One year ago, this instance was running on Docker. This configuration had been officially supported, but discouraged, by Mastodon maintainers - and for good reason. It introduced so many problems and so much overhead that I'm really grateful that everyone peer-pressured me into migrating off it. Now it runs as standard Linux services. I'm positive this instance would be in tears right now, otherwise.
(DIR) Post #AIr3PnUVDthRBqsR8q by ashfurrow@mastodon.technology
2022-04-26T21:58:26Z
0 likes, 0 repeats
Sidekiq queues have continued to get worse. (But at a slower rate! Yes!!!) I've played with the config of everything as much as I'm comfortable - a sluggish instance is better than an unresponsive instance.
(DIR) Post #AIr3PoAgh3RZIgc7tY by ashfurrow@mastodon.technology
2022-04-26T22:02:49Z
0 likes, 0 repeats
43k jobs queued up. default has a latency of 22 minutes, pull is over two hours lol okay. I see a few jobs taking a long time (>10 minutes) but most finish instantly. I'm trying to figure out where the next bottleneck to scale up is. How do I get the CPU cores doing more? PgBouncer? Multiple Sidekiq services? It seems like I have CPU headroom, so should I get more aggressive with my Postgres config? Is there a quick win without prolonged downtime? These are the questions on my mind.
(DIR) Post #AIr3Pogwl7FQujhtB2 by ashfurrow@mastodon.technology
2022-04-26T22:07:02Z
0 likes, 0 repeats
Yesterday we processed a record-setting 1M Sidekiq jobs. Today is at 1.4M already! Sidekiq seems to be the bottleneck, but I can't scale it up without running out of Postgres connections. That makes me think PgBouncer might be the necessary next step. It gives me more connections, so Sidekiq can do more at once, so my idling CPUs can get to work. Does that make sense? (Bonus: the Mastodon docs have a guide on how to do this.)
(DIR) Post #AIr3PpIsU5aaoNSBIe by shadowfacts@social.shadowfacts.net
2022-04-26T22:18:30.980902Z
0 likes, 0 repeats
@ashfurrow before going to pgbouncer, you could try increasing the size of the db connection pool. i'm not sure exactly what the tradeoff is between increasing the pool size and using pgbouncer, but the pool size should be easier to experiment with
(DIR) Post #AIr66F59ELNipiNuOe by stux@mstdn.social
2022-04-26T22:48:31Z
0 likes, 0 repeats
@ashfurrow @rodti Ahh found it! Take a look at this :blobcatgiggle: Lol, it was so simple to find.. :mortyderp: (brainfart for me!) https://github.com/mstdn/Mastodon/tree/main/dist
(DIR) Post #AIr6BBSS8FqL2bdqCm by stux@mstdn.social
2022-04-26T22:49:25Z
0 likes, 0 repeats
@ashfurrow @rodti Result: You can add as many workers for each job as you need, yay!
(DIR) Post #AIr6hyyJZmIps9BUcS by ashfurrow@mastodon.technology
2022-04-26T22:37:26Z
1 likes, 0 repeats
@shadowfacts I have already increased it. Currently at 300.
(DIR) Post #AIr7PYfyhj7H3jO0Aa by ashfurrow@mastodon.technology
2022-04-26T23:03:03Z
0 likes, 0 repeats
@stux @rodti amazing! thank you!
(DIR) Post #AIr83xOYDebTJ5UxHs by ashfurrow@mastodon.technology
2022-04-26T23:08:24Z
0 likes, 0 repeats
@stux @rodti and the total number of workers needs to be less than the Postgres connection limit, right?
(DIR) Post #AIr8yca2HS9ZG0tEPo by ashfurrow@mastodon.technology
2022-04-26T23:20:42Z
0 likes, 0 repeats
@stux @rodti Okay! I split out new services for pull and default specifically, and left the existing service as-is for now. CPU usage jumped from 25% to about 50%, whoa! And the queue size decreased from 26k to 14k pretty rapidly - default is empty again! Yes! Thank you!!! So there are a total of 400 threads in that screenshot, right? So your max_connections in Postgres must be *at least* that, right? I've broken things by exceeding this limit before.
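The arithmetic behind that question can be sketched roughly as below. This is a minimal illustration, not anyone's actual config: the service names, thread counts, web pool and the `reserved` margin are all hypothetical, and it assumes every Sidekiq thread and every web thread may hold one Postgres connection at the same moment, with no PgBouncer in between.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str        # hypothetical systemd unit name
    processes: int   # number of processes started for this service
    threads: int     # threads per process (each may hold one DB connection)


def connections_needed(services):
    """Worst-case number of simultaneous Postgres connections."""
    return sum(s.processes * s.threads for s in services)


def fits(services, max_connections, reserved=10):
    """True if the worst case still leaves `reserved` connections free
    for admin sessions, backups and the like (an assumed margin)."""
    return connections_needed(services) + reserved <= max_connections


# Hypothetical layout: the numbers are for illustration only.
layout = [
    Service("sidekiq-default", processes=1, threads=25),
    Service("sidekiq-pull", processes=1, threads=25),
    Service("sidekiq-other", processes=1, threads=25),
    Service("web", processes=4, threads=5),
]

needed = connections_needed(layout)
print(f"worst case: {needed} connections")
print("fits in max_connections=100:", fits(layout, 100))
print("fits in max_connections=200:", fits(layout, 200))
```

Under these assumptions the total thread count across all services, plus a small margin, has to stay at or below Postgres's `max_connections`; a pooler such as PgBouncer changes that relationship, which is why it came up earlier in the thread.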
(DIR) Post #AIr96q2g74VHKFCnE8 by stux@mstdn.social
2022-04-26T23:22:13Z
0 likes, 0 repeats
@ashfurrow @rodti Please don't copy my exact setup :blobcatgiggle: I think it's a little overkill for you! I don't wanna blow up your server :catblush: I think we have over 25K DB connections available, at least!
(DIR) Post #AIr9FSKXdu5OOHfbCy by ashfurrow@mastodon.technology
2022-04-26T23:23:48Z
0 likes, 0 repeats
@stux @rodti of course! I'm trying to understand the principles that went into your decisions. I appreciate you sharing your config files as examples.
(DIR) Post #AIr9OM52FXffc2f0fw by stux@mstdn.social
2022-04-26T23:25:25Z
0 likes, 0 repeats
@ashfurrow @rodti Sure thing mate! :blobcathearts: Try to start with 25 jobs per "task"! And always make sure the scheduler is running :ablobwink: It's in the "default" sidekiq that comes with Masto itself. If you have a lower number of workers per job you can always start/stop those when needed ^^
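One way to picture that advice is a small script that writes out the Sidekiq command line for each service. It is a sketch under assumptions: the unit names (`sidekiq-default`, `sidekiq-main`, ...) are hypothetical, the queue names are assumed to match Mastodon's, and the check that the scheduler queue is covered by exactly one service follows Mastodon's scaling guidance rather than anything stated in the post. Only the standard `sidekiq` flags `-c` (thread count) and `-q` (queue) are used.

```python
import shlex

THREADS_PER_SERVICE = 25  # the starting point suggested in the thread


def sidekiq_command(queues, threads=THREADS_PER_SERVICE):
    """Build a Sidekiq command line: -c sets the thread count,
    each -q names a queue this process will work on."""
    args = ["bundle", "exec", "sidekiq", "-c", str(threads)]
    for queue in queues:
        args += ["-q", queue]
    return shlex.join(args)


def build_plan(dedicated, catch_all):
    """One service per dedicated queue, plus one catch-all service.
    Raises if the scheduler queue is not covered exactly once."""
    plan = {f"sidekiq-{q}": [q] for q in dedicated}
    plan["sidekiq-main"] = list(catch_all)
    covering = [name for name, qs in plan.items() if "scheduler" in qs]
    if len(covering) != 1:
        raise ValueError(
            f"scheduler must run in exactly one service, got {covering}"
        )
    return plan


# Hypothetical split: dedicated services for the two busy queues,
# with the original service left handling everything, scheduler included.
plan = build_plan(
    dedicated=["default", "pull"],
    catch_all=["default", "push", "mailers", "pull", "scheduler"],
)

for unit, queues in plan.items():
    print(f"{unit}: {sidekiq_command(queues)}")
```

Each printed line would be the `ExecStart` of its own systemd unit, so a dedicated service can be started or stopped on demand without touching the one that runs the scheduler.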
(DIR) Post #AIr9ejgiFDu9Sp5vou by ashfurrow@mastodon.technology
2022-04-26T23:27:38Z
1 likes, 1 repeats
Big thanks to @rodti and @stux for their suggestion and help in splitting out Sidekiq into multiple systemd services. Our queues are empty and our CPUs are being better utilized.
(DIR) Post #AIr9iEKYlw7dMxOnuy by stux@mstdn.social
2022-04-26T23:28:58Z
0 likes, 0 repeats
@ashfurrow @rodti Always welcome! If you need any further advice on some scaling things I got some stored somewhere on a shelf :blobcatgiggle:
(DIR) Post #AIr9lQMWUlLBYNqAT2 by ashfurrow@mastodon.technology
2022-04-26T23:29:19Z
0 likes, 0 repeats
@stux @rodti :blobcatfingerguns:
(DIR) Post #AIsAqA7qaIehkxl7JI by ashfurrow@mastodon.technology
2022-04-27T11:15:59Z
1 likes, 0 repeats
The instance stayed up overnight with the new config changes (always a concern haha). Sidekiq queues are empty and have already chewed through an average day's amount of jobs. I'm excited/terrified to see what happens today!
(DIR) Post #AOKqlD1DoN143vkCps by zladuric@mastodon.technology
2022-04-26T22:35:35Z
0 likes, 0 repeats
@ashfurrow curious, what infra do you have for the instance?
(DIR) Post #AOKqlDVM0L7RZNqGno by ashfurrow@mastodon.technology
2022-04-26T23:30:50Z
1 likes, 0 repeats
@zladuric The VM runs on DigitalOcean, 8 vCPUs and 16GB RAM. There's a 250GB external disk for the postgres database. AWS S3 for bucket storage and offsite backup storage, bunny.net for CDN, and Mailgun for email.