Post AiOAFrHHcDT2r75L4C by dean@uwu.social
 (DIR) Post #AiNuJjdgMLO73YOAPA by dean@uwu.social
       2024-05-29T13:24:36Z
       
       0 likes, 0 repeats
       
        We're struggling to figure out a good way to migrate Mastodon media storage from one server to another... Mastodon stores files in millions of directories by hash (kinda like aaa/bbb/ccc/ddd), which means doing something simple like an initial rsync and then taking Mastodon down for a quick resync is impossible. We initially were going to go this route, but after the initial file listing took more than 24h we cancelled it and gave up.

        So now we're looking at just copying the raw filesystem over, but if we want to do it without taking Mastodon down for the entire sync, we need to come up with a way of copying it and resyncing the changed blocks afterwards.

        One way could be to use overlayfs. Remount the old volume R/O, create a temporary upperdir, and create an overlay between them. Then, copy the R/O image to its new home, expand it or whatever, and apply the upperdir onto it. This way we only need to list the directories that actually had writes. Special care will need to be taken to ensure we delete any files that have overlayfs tombstones. IDK if anyone has ever done this before.

        Another way could be to use devmapper snapshots to create a new COW-backed volume, rsync the R/O underlying block device over, and then apply the COW to the new volume with snapshot-merge. We tried testing this out and caused devmapper to die horribly and spit out kernel bug log lines, so we had to reboot and e2fsck for 2 hours.

        At this point it might be better to just take everything down for as long as it takes. I'm extremely annoyed at Mastodon's file structure making it impossible to move without major downtime. Their solution just seems to be "use S3 lol". It would probably take 24 hours (8TB at 1Gbps is roughly 17 hours). We could shrink it first since we don't use all the space, but resize2fs will take a while as well.

        If anyone has any tips or ideas for doing it with minimal downtime, I'd like to hear them. Or if you're an uwu.social user and don't care about extended downtime, I'd also like to hear your thoughts too.
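
        A minimal sketch of the overlayfs idea above, in shell; the /srv/... paths, the device layout, and "newhost" are hypothetical, and the whiteout handling is only outlined, not a tested procedure:

            # remount the old media volume read-only, then stack a writable overlay on top
            mount -o remount,ro /srv/media
            mkdir -p /srv/upper /srv/work /srv/merged    # upperdir/workdir on local scratch storage
            mount -t overlay overlay \
                -o lowerdir=/srv/media,upperdir=/srv/upper,workdir=/srv/work \
                /srv/merged                              # point Mastodon at /srv/merged meanwhile
            # ...block-copy the frozen lower volume to the new server...
            # overlayfs records deletions as 0:0 character-device "whiteouts";
            # those must become deletions on the target rather than copies:
            find /srv/upper -type c -printf '%P\n' > whiteouts.txt
            # apply the (small) upperdir, leaving the whiteout entries out
            rsync -a --exclude-from=whiteouts.txt /srv/upper/ newhost:/srv/media/
            # then delete each path listed in whiteouts.txt on the target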
       
 (DIR) Post #AiNzSEbfbe6HBiJtL6 by Hanuwu@uwu.social
       2024-05-29T14:22:08Z
       
       0 likes, 0 repeats
       
        @dean I don't know if the performance would be better than rsync, but have you looked into git annex?
       
 (DIR) Post #AiOAFrHHcDT2r75L4C by dean@uwu.social
       2024-05-29T16:23:11Z
       
       0 likes, 0 repeats
       
       @Hanuwu I think it would have the same problems as rsync since it has to recursively list files in millions of directories :(
       
 (DIR) Post #AiOBsOy0uMBgIc0yWm by sulian@uwu.social
       2024-05-29T16:41:20Z
       
       0 likes, 0 repeats
       
       @dean can you turn off just media during the migration? 🤔
       
 (DIR) Post #AiODHe8fjhe8Y1bMbA by dean@uwu.social
       2024-05-29T16:57:07Z
       
       0 likes, 0 repeats
       
        @sulian I don't think you can turn off media without causing a lot of issues. It's (probably) easy to block for remote posts, but blocking for local posts would be difficult. I suppose the upload endpoints could be disabled.

        Also, any deleted posts from local or remote instances that have media need to have their corresponding media deleted which would fail... and then we'd have orphaned files in the media storage directory forever :(
       
 (DIR) Post #AiOYCBdaNGjbk3IbWC by Cushee_Foofee@uwu.social
       2024-05-29T20:51:24Z
       
       0 likes, 0 repeats
       
        @dean I don't understand some of those words, but I understand that computer stuff can be very insane at times.

        I guess figure out how long the downtime would be, and then make a new post (Idk if this is even a thing, just some post everyone can see and reply to) asking what times people DON'T want downtime, and plan around causing the least issue?

        Personally I just peruse here, so any downtime is fine, I just prefer a heads up if you are being intentional (maybe 1 week notice?).
       
 (DIR) Post #AiOb9zeKH4d9qRI6pk by riderkicker@uwu.social
       2024-05-29T21:24:39Z
       
       0 likes, 0 repeats
       
        @dean I'm for taking it down for as long as it takes. Beats rushing & finding issues for days/weeks later.

        I can find something else to do for a day or so.
       
 (DIR) Post #AiP1TdDA0Ragr0bqIS by koakuma@uwu.social
       2024-05-30T02:19:31Z
       
       0 likes, 0 repeats
       
       @dean Count me among the ones who are fine with a downtime :02lurk:
       
 (DIR) Post #AiPbggNB9weKm2omFE by k1tteh@uwu.social
       2024-05-30T09:05:15Z
       
       0 likes, 0 repeats
       
       @dean yeah fuck it, just take it down
       
 (DIR) Post #AiPfDqmrlAwsUZL9lo by kura@z0ne.social
       2024-05-30T09:44:51.669Z
       
       0 likes, 0 repeats
       
        @dean@uwu.social personally, I don't mind planned and communicated downtimes (PS: I still have @kura@uwu.social and @anker@uwu.social).

        Anyway. You could use overlayfs with an sshfs-mapped directory. That way, stuff is already on the target host. Not sure how well that works though.
       
 (DIR) Post #AiPfSHWr8Cw3U5jVZY by izaya@social.shadowkat.net
       2024-05-30T09:35:56.659707Z
       
       0 likes, 0 repeats
       
       @dean fast stream compression (zstd or ... lz?) would likely mean you could just ignore the empty space assuming your filesystem is on a volume with TRIM/DISCARD
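
        A sketch of that block-level copy with stream compression, assuming hypothetical device names, and assuming the filesystem's free space has been discarded first (and that the underlying storage zeroes discarded blocks):

            fstrim -v /srv/media    # discard free space so it compresses to almost nothing
            umount /srv/media       # or remount read-only, for a consistent image
            zstd -T0 -3 -c < /dev/vg0/media | ssh newhost 'zstd -dc > /dev/vg0/media'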
       
 (DIR) Post #AiPfiDrwVSr4lbCIbo by RavenLuni@furry.engineer
       2024-05-30T09:50:21Z
       
       0 likes, 0 repeats
       
        @dean Ask @crashdoom for advice - they've done a great job of keeping this place going through a whole load of such issues
       
 (DIR) Post #AiPfrwNeNpurQGf7bs by Jain@blob.cat
       2024-05-30T09:52:07.909956Z
       
       0 likes, 0 repeats
       
       @dean how about purging remote media first?
       
 (DIR) Post #AiPgKAEGUKECTvjKwS by Jain@blob.cat
       2024-05-30T09:57:13.855063Z
       
       0 likes, 0 repeats
       
        @dean i mean, i really dont get why mastodon saves media from remote instances... It doesnt really make sense and just blows up media storage... There is a maintenance job for that
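
        The maintenance job being referred to is presumably tootctl's remote-media cleanup; something like this, run from the Mastodon directory (the retention window here is made up):

            RAILS_ENV=production bin/tootctl media remove --days=7   # purge cached remote media older than 7 days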
       
 (DIR) Post #AiPgUeakSIQnXtFTEG by dean@uwu.social
       2024-05-30T09:59:08Z
       
       0 likes, 0 repeats
       
       @Jain yeah we purged media before we started, I don't think it speeds up the file listing though because the hash directories get left behind AFAIK :(
       
 (DIR) Post #AiPh0vY6wYHTPAP7QW by Jain@blob.cat
       2024-05-30T10:04:57.601237Z
       
       0 likes, 0 repeats
       
       @dean :blobcatgoogly: are you sure you did or isnt the job done yet? Im asking because i cant imagine that the 8TB is just media from your users... Well it could be but it seems a bit high for the amount of MAU you have on your server
       
 (DIR) Post #AiPh5ng1dk3SRAQhgu by dean@uwu.social
       2024-05-30T10:05:50Z
       
        1 like, 0 repeats
       
       @Jain the 8TB is the raw volume size if we were to copy the ext4 partition over instead of rsyncing each file over. Sorry if that wasn't clear 🙏
       
 (DIR) Post #AiPhebLlSk5sLFXziS by dean@uwu.social
       2024-05-30T10:12:07Z
       
        1 like, 0 repeats
       
       @izaya If we were gonna move the FS we'd definitely do it with compression. The other admin is gonna benchmark zstd today on the source and destination servers to see what settings we can use. We could also probably make it more efficient by running e4defrag online first (but it's kinda scary lol)
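
        For the benchmarking step, zstd's built-in benchmark mode can sweep a range of levels over a representative file (the filename here is hypothetical):

            zstd -b1 -e12 -T0 sample-media.bin   # benchmark compression levels 1 through 12, all cores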
       
 (DIR) Post #AiPiDysXyWMJRykdmK by Jain@blob.cat
       2024-05-30T10:18:31.431589Z
       
       0 likes, 0 repeats
       
        @dean a different way would be to use the database as your file list. Back up the DB, use the backup to copy files, take down the instance, compare the stopped DB with the backup, resync the difference and you are done... Altho this requires more work...
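
        A sketch of how that could look, assuming Mastodon's default Paperclip-style layout (the zero-padded attachment ID split into three-digit directory components) and hypothetical database and path names:

            # build a file list from the DB instead of walking millions of directories
            psql -At -d mastodon_production -c \
              "SELECT lpad(id::text, 18, '0'), file_file_name FROM media_attachments" |
            while IFS='|' read -r id name; do
              printf 'media_attachments/files/%s/%s/%s/%s/%s/%s/original/%s\n' \
                "${id:0:3}" "${id:3:3}" "${id:6:3}" "${id:9:3}" "${id:12:3}" "${id:15:3}" "$name"
            done > files.txt
            rsync -a --files-from=files.txt /srv/media/ newhost:/srv/media/
            # later: stop Mastodon, diff the live DB against the backup, resync only the new IDs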
       
 (DIR) Post #AiPiM2TgThYuvexqRU by dean@uwu.social
       2024-05-30T10:19:58Z
       
        1 like, 0 repeats
       
       @Jain that's a good idea! Might look into this and see how feasible it is for us
       
 (DIR) Post #AiQCbbK8CXQtyDpgSO by flandrescarlet@uwu.social
       2024-05-30T15:58:50Z
       
       0 likes, 0 repeats
       
       @dean do what you need to, if a long period of downtime means less issues and/or less stress for you two, definitely do it :izumithumbsup:
       
 (DIR) Post #AiQJKFzQxxOSYppQDQ by crashdoom@furry.engineer
       2024-05-30T15:18:26Z
       
       0 likes, 0 repeats
       
        @RavenLuni @dean If it helps, when we migrated from Cloudflare R2 to local storage, we scheduled downtime to initially rsync across only the important media. That ended up being media attachments, profile images, emojis, etc. that were all created by us or our users. Entirely ignore the cache directory, as it’s probably by far your largest amount of content.

        To avoid causing broken images, I’d then rsync the profile images and emojis from the cache folder.

        It’s not ideal, but it should minimize the downtime, especially when you’re transferring a large amount of content (we had about 5TB~).

        Also, doing clean-ups beforehand (if you can accommodate) should reduce the amount of content you need to transfer! Orphaned files seemed to be the worst for us and took up half a TB over time xD
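
        A sketch of that selective copy, assuming Mastodon's stock local-storage directory layout and hypothetical paths:

            # copy user-generated media, skipping the (huge) remote-media cache
            rsync -a --exclude='cache/' /srv/media/ newhost:/srv/media/
            # then pull just cached avatars/headers and emoji so remote profiles don't break
            rsync -aR /srv/media/./cache/accounts/ /srv/media/./cache/custom_emojis/ \
                  newhost:/srv/media/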
       
 (DIR) Post #AiQJKH2J4lHlo2BEKe by dean@uwu.social
       2024-05-30T17:14:14Z
       
       0 likes, 0 repeats
       
       @crashdoom @RavenLuni Thanks for the tips! Is there a good way to clean up orphaned files? Maybe some tootctl command or something?
       
 (DIR) Post #AiQNJn0797rATBBATQ by janisf@mstdn.social
       2024-05-30T17:58:58Z
       
       0 likes, 0 repeats
       
        @dean I'm really, really not an expert, but I wonder if you could migrate, like, the newer 20% of the data with a 3-hour downtime, and then spread out shorter downtimes to pump in chunks from newest to oldest.

        There's nothing quite like a timestamp for a clean place to cut a dataset.

        On the other hand, I've been known to complicate things.
       
 (DIR) Post #AiQZlcIniRDeqYxHlI by crashdoom@furry.engineer
       2024-05-30T20:18:23Z
       
       0 likes, 0 repeats
       
        @dean @RavenLuni Yeah, that’d be tootctl media remove-orphans (https://docs.joinmastodon.org/admin/tootctl/#media-remove-orphans). That should get a handful of files that were from deleted posts and didn’t get cleaned up.
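
        For reference, the usual invocation from a source install; the paths are assumptions, and the preview flag assumes remove-orphans takes the standard tootctl --dry-run option (see the docs linked above):

            cd /home/mastodon/live
            RAILS_ENV=production bin/tootctl media remove-orphans --dry-run   # preview what would be deleted
            RAILS_ENV=production bin/tootctl media remove-orphans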