Post ASgYFVvuQugbBhqdQf by ProgGrrl@bbq.snoot.com
 (DIR) More posts by ProgGrrl@bbq.snoot.com
 (DIR) Post #ASgVgw1dENGG4GuNKC by lauren@mastodon.laurenweinstein.org
       2023-02-14T22:54:14Z
       
       0 likes, 0 repeats
       
       So this *is* a problem with the current #Mastodon model. When I sent out my piece on generative AI this morning, I included a link to the actual blog post. Within seconds, that server was dragged down into the dirt due to all the Mastodon sites pulling that URL looking for metadata/images that aren't there -- blocking other services too, including email. Over 800 in a few seconds. I can handle that because it settled back down fairly quickly, but it's badly designed to create what is effectively a DDoS attack, and some sites would be seriously messed up by it.
       
 (DIR) Post #ASgVt0iNcZu85Sf4nw by pjfasano@nova.community
       2023-02-14T22:56:15Z
       
       0 likes, 0 repeats
       
       @lauren Why aren't those included as a payload with the original post? smh...
       
 (DIR) Post #ASgW1K3nZy2cP1dxBo by lauren@mastodon.laurenweinstein.org
       2023-02-14T22:57:57Z
       
       0 likes, 0 repeats
       
       Load average on that server, BTW, went over 50. I'm not sure how far over 50. That was bad enough.
       
 (DIR) Post #ASgW9cJ9cPjZJUmTU8 by thomasafine@social.linux.pizza
       2023-02-14T22:59:23Z
       
       0 likes, 0 repeats
       
       @lauren The originating server should retrieve the metadata (including thumbnail images) once and then distribute this metadata when it distributes the posting.
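       
       A minimal sketch of the fetch-once idea described above, assuming a hypothetical OriginServer class and payload shape (neither is Mastodon's actual code or wire format). It only illustrates that the linked site sees one request no matter how many instances receive the post.

```python
from dataclasses import dataclass


@dataclass
class OriginServer:
    """Hypothetical origin instance that fetches link metadata once per post."""
    fetch_count: int = 0

    def fetch_preview(self, url: str) -> dict:
        # Stand-in for a real HTTP fetch of the linked page.
        self.fetch_count += 1
        return {"url": url, "title": "Example title", "image": None}

    def publish(self, text: str, url: str, followers: list[str]) -> list[dict]:
        preview = self.fetch_preview(url)  # one fetch, regardless of fan-out
        # Every receiving instance gets the same payload with the preview attached.
        return [{"to": f, "text": text, "preview": preview} for f in followers]


origin = OriginServer()
followers = [f"instance{i}.example" for i in range(800)]
deliveries = origin.publish("New post", "https://example.org/post", followers)
```

       Here 800 deliveries cost the linked site a single fetch. Whether receivers should trust the attached preview is a separate question.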
       
 (DIR) Post #ASgWCfuwefGPNQup84 by lauren@mastodon.laurenweinstein.org
       2023-02-14T22:59:57Z
       
       0 likes, 0 repeats
       
       @thomasafine Apparently that's not how it works.
       
 (DIR) Post #ASgWQyArZcRU3NB2wq by kkeller@curling.social
       2023-02-14T23:02:31Z
       
       0 likes, 0 repeats
       
       @lauren are you able to see what URLs were being grabbed besides the target URL?  When I did this experiment with a test URL, a while back, I only saw a few hundred hits total in the access log, and the only hits were to the target.  I'm trying to figure out what other Mastodon servers are seeing in the return that causes them to get greedy.
       
 (DIR) Post #ASgWSijJxwB2jW85eS by thomasafine@social.linux.pizza
       2023-02-14T23:02:52Z
       
       0 likes, 0 repeats
       
       @lauren sure I'm just saying that's the obvious design change that would alleviate this problem, without creating a similar problem for Mastodon servers (because a push mechanism would scale differently, as well as naturally happening serially rather than in parallel).
       
 (DIR) Post #ASgWXAzUpn6tXbouBM by lauren@mastodon.laurenweinstein.org
       2023-02-14T23:03:41Z
       
       0 likes, 0 repeats
       
       @kkeller In theory yeah but the logs are massive and I have better ways to spend my time right now. There are various aspects of #Mastodon that clearly are not scaling well -- backups are one of them, for example.
       
 (DIR) Post #ASgWbdVyj15TAH8k88 by lauren@mastodon.laurenweinstein.org
       2023-02-14T23:04:27Z
       
       0 likes, 0 repeats
       
       @thomasafine I've seen this discussed before but there are quite a few design elements of #Mastodon that simply do not appear to be scaling well.
       
 (DIR) Post #ASgY2qUSgrIVBFy8A4 by bplein@bvp.me
       2023-02-14T23:20:32Z
       
       0 likes, 0 repeats
       
       @lauren Wow! As soon as you described the problem, I thought "yes, that sounds like that would happen... and that's not good" and then I had to go see it for myself. I posted a link to an old blog post on a WordPress site I control, and sure enough, within several seconds, 218 different hits from Mastodon servers took place.
       
 (DIR) Post #ASgYFVvuQugbBhqdQf by ProgGrrl@bbq.snoot.com
       2023-02-14T23:22:53Z
       
       0 likes, 0 repeats
       
       @lauren curious what team @Gargron thinks about this as system grows?
       
 (DIR) Post #ASgYjOkdhPgvbLpIci by tknarr@mstdn.social
       2023-02-14T23:28:13Z
       
       0 likes, 0 repeats
       
       @lauren I recognized the problem pretty quickly. Any site that tries to auto-get previews for links in a post triggers the same behavior. Should a site even be trying to generate a preview for a bare link? My reaction is that should be something the original poster handles when embedding a link (as opposed to just pasting a bare URL into a post), and the syndication protocol should be able to handle extra content associated with links.
       
 (DIR) Post #ASgYvuY3Iwhzh1uCjw by rafial@hackers.town
       2023-02-14T23:28:59Z
       
       0 likes, 2 repeats
       
       @ProgGrrl @lauren this issue has been known about for a long time: https://github.com/mastodon/mastodon/issues/4486 … and unfortunately, it wasn’t solved then. It’s going to be harder and harder to ignore as the network keeps on growing.
       
 (DIR) Post #ASgZEVbSH3WTpTMhcW by lauren@mastodon.laurenweinstein.org
       2023-02-14T23:33:46Z
       
       0 likes, 0 repeats
       
       @rafial @noam @ProgGrrl Yeah, well we see how well that worked out.
       
 (DIR) Post #ASgZZPo11Xd9vjD280 by noam@beige.party
       2023-02-14T23:31:20Z
       
       0 likes, 0 repeats
       
       @rafial @ProgGrrl @lauren There are solutions! Servers could pass link previews in-band from the post (means they're less trustworthy but link previews were never trustworthy to begin with)
       
 (DIR) Post #ASgZZQMOxh8VeNIUj2 by da5is@hachyderm.io
       2023-02-14T23:33:42Z
       
       0 likes, 0 repeats
       
       @noam @rafial @ProgGrrl @lauren what about a randomized delay to spread out the load (if we’re concerned about the “all at once” and not the “total volume”)?
       
 (DIR) Post #ASgZZQtMz7VXIcip72 by lauren@mastodon.laurenweinstein.org
       2023-02-14T23:37:39Z
       
       0 likes, 0 repeats
       
       @da5is @noam @rafial @ProgGrrl It would help for sure, but in terms of total effect (including bandwidth used at that original site) it doesn't change the fundamentals. Having all those sites individually trying to pull the same URL just can't scale. Period. I've been slashdotted lots of times, I know how bad this can get.
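       
       A small simulation of the point above, under stated assumptions: 800 instances, whole-second delays, and a 60-second jitter window are illustrative figures, and peak_and_total is a hypothetical helper. Randomized delay lowers the busiest-second count while the total number of fetches stays the same.

```python
import random
from collections import Counter


def peak_and_total(n_servers: int, window_seconds: int, seed: int = 0) -> tuple[int, int]:
    """Return (requests in the busiest second, total requests) when each server
    waits a uniformly random whole number of seconds in [0, window_seconds)."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    if window_seconds <= 1:
        arrivals = [0] * n_servers  # no jitter: everyone fetches at once
    else:
        arrivals = [rng.randrange(window_seconds) for _ in range(n_servers)]
    per_second = Counter(arrivals)
    return max(per_second.values()), sum(per_second.values())


no_jitter = peak_and_total(800, 1)
with_jitter = peak_and_total(800, 60)
```

       The peak drops with jitter, but the second element of both tuples is still 800: the linked site serves every request either way.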
       
 (DIR) Post #ASgZn97xpWEkG25Ts0 by da5is@hachyderm.io
       2023-02-14T23:40:06Z
       
       0 likes, 0 repeats
       
       @lauren @noam @rafial @ProgGrrl thanks for the explanation - perhaps a delay as a short term fix to buy time for something more scalable would be beneficial in the near term.
       
 (DIR) Post #ASgaVWREWRFLIRGDL6 by rafial@hackers.town
       2023-02-14T23:45:02Z
       
       0 likes, 0 repeats
       
       @da5is @lauren @noam @ProgGrrl the short term fix has already been done … https://github.com/mastodon/mastodon/pull/8015
       
 (DIR) Post #ASgaVWzyRH2H2BVxUO by lauren@mastodon.laurenweinstein.org
       2023-02-14T23:48:08Z
       
       0 likes, 0 repeats
       
       @rafial @da5is @noam @ProgGrrl Yeah? Didn't seem to help much if it's actually in there.
       
 (DIR) Post #ASgbBkEfc8lcFbW9E8 by rafial@hackers.town
       2023-02-14T23:55:48Z
       
       0 likes, 0 repeats
       
       @lauren @da5is @noam @ProgGrrl here is a more recent issue that cites the problem (it seems the delay is only 60 seconds) https://github.com/mastodon/mastodon/issues/21861
       
 (DIR) Post #ASgbsxbfmDywAjsqWG by wasootch@mastodon.sdf.org
       2023-02-15T00:03:34Z
       
       0 likes, 0 repeats
       
       @lauren Oh interesting. It would probably hurt my blog. I'll have to see if it is doing that. I do find Google is being cranky about my blog lately and I've been linking to Mastodon apps.
       
 (DIR) Post #ASgcduvVYiVtTEYI5I by lauren@mastodon.laurenweinstein.org
       2023-02-15T00:12:04Z
       
       0 likes, 0 repeats
       
       @rafial @da5is @noam @ProgGrrl Yeah. So not fixed at all.
       
 (DIR) Post #ASgdQZoOqqTaArQiMy by ralf@noc.social
       2023-02-15T00:20:19Z
       
       0 likes, 0 repeats
       
       @lauren .... sounds like your blog isn't behind any edge caching. WP is notorious for being exceptionally inefficient with MySQL queries (sometimes >100 per page load, depending on plugins). Cache on the edge; I could explain how but that would kinda kill my side business.
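       
       A rough illustration of why caching absorbs this kind of burst, assuming a hypothetical in-process TTLCache and CountingRenderer. This is a stand-in for the idea only, not how any particular edge cache or WordPress plugin works.

```python
import time


class TTLCache:
    """Tiny in-process cache keyed by path; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # path -> (expires_at, body)

    def get(self, path, render):
        now = self.clock()
        hit = self.store.get(path)
        if hit is not None and hit[0] > now:
            return hit[1]  # served from cache, no render
        body = render(path)  # the expensive part (e.g. many database queries)
        self.store[path] = (now + self.ttl, body)
        return body


class CountingRenderer:
    """Hypothetical expensive page render that counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, path):
        self.calls += 1
        return f"<html>page for {path}</html>"


renderer = CountingRenderer()
cache = TTLCache(ttl_seconds=60)
for _ in range(800):
    page = cache.get("/blog/post", renderer)
```

       All 800 requests are still received, but only the first one triggers a render within the 60-second window.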
       
 (DIR) Post #ASgdrjl5ygR6qGpr16 by ralf@noc.social
       2023-02-15T00:22:30Z
       
       0 likes, 0 repeats
       
       @thomasafine @lauren This would be the proper model.  But there are workarounds.  At worst, use Cloudflare free, which will keep even a puny server online through this with negligible impact.
       
 (DIR) Post #ASgdrkLxlbvWgc5ITw by lauren@mastodon.laurenweinstein.org
       2023-02-15T00:25:49Z
       
       0 likes, 0 repeats
       
       @ralf @thomasafine I wouldn't use Cloudflare if they were the last platform on the planet, so long as they continue to serve far right horrors.
       
 (DIR) Post #ASgyNtkCNxGmA51r4i by gaditb@icosahedron.website
       2023-02-15T04:15:37Z
       
       0 likes, 0 repeats
       
       @lauren Happened to @jwz recently, too. https://www.jwz.org/blog/2022/11/mastodon-stampede/ And I'm sure many others.
       
 (DIR) Post #ASgzyEqjUeE7zvyfEe by lauren@mastodon.laurenweinstein.org
       2023-02-15T04:33:29Z
       
       0 likes, 0 repeats
       
       @gaditb @jwz The technical term for this situation is "broken". Perhaps it wasn't "broken" at a smaller scale. But it sure is now.
       
 (DIR) Post #ASh0RxizjA7DZdX0Xw by TheWebTech@noc.social
       2023-02-15T03:29:39Z
       
       0 likes, 0 repeats
       
       @thomasafine@linux.pizza @lauren@laurenweinstein.org  This isn't done because it would introduce a security flaw in the whole Mastodon system if the Mastodon server did this vs each instance getting the accurate information from the source itself. Effectively you'd be making Mastodon vulnerable to cache poisoning. There's probably a better solution to the problem, but that solution ideally would be more like a lighter-weight "Hey just send me the meta data, not the whole page" system.
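       
       One possible reading of "just the metadata, not the whole page", sketched with Python's standard html.parser: collect Open Graph meta tags and stop consuming the response once </head> has been seen. HeadMetaParser, preview_from_chunks, and the sample chunks are hypothetical. This reduces bytes transferred per request, not the number of requests.

```python
from html.parser import HTMLParser


class HeadMetaParser(HTMLParser):
    """Collect Open Graph <meta> tags and note when </head> has been seen."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done or tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if prop.startswith("og:") and attributes.get("content") is not None:
            self.meta[prop] = attributes["content"]

    def handle_endtag(self, tag):
        if tag == "head":
            self.done = True


def preview_from_chunks(chunks):
    """Feed response chunks until the head is complete; return (meta, chunks read)."""
    parser = HeadMetaParser()
    consumed = 0
    for chunk in chunks:
        parser.feed(chunk)
        consumed += 1
        if parser.done:
            break  # a real client would stop reading the response here
    return parser.meta, consumed


chunks = [
    "<html><head>",
    '<meta property="og:title" content="Generative AI"></head>',
    "<body>" + "x" * 10000 + "</body></html>",
]
meta, consumed = preview_from_chunks(chunks)
```

       In this sample the large body chunk is never read, because the parser has everything it needs after the second chunk.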
       
 (DIR) Post #ASh0RyQF8Mi5jllXxQ by lauren@mastodon.laurenweinstein.org
       2023-02-15T04:38:46Z
       
       0 likes, 0 repeats
       
       @TheWebTech @thomasafine There are obvious relatively straightforward ways to add authentication data in a way that would avoid the scenario you outline. However, I doubt that the Mastodon Priesthood is really interested in solving these problems, because having the problems present helps to hold the scale down, which seems to be part of the plan -- "keep out as many of those people from Twitter as possible, unless they're *our* kind of people." That sure seems to be the philosophy from some quarters.
       
 (DIR) Post #ASh1Rxda9Tp7G0Yxn6 by gaditb@icosahedron.website
       2023-02-15T04:50:01Z
       
       0 likes, 0 repeats
       
       @lauren Yeah, like I don't... do webdev, nor do I do distributed systems stuff (but.. neither does he), but I really never really understood Gargron's argument against doing it differently then, and I can't say I understand it better now.
       
 (DIR) Post #ASh4VdVNFGqNq3ASh6 by jwz@mastodon.social
       2023-02-15T05:23:47Z
       
       0 likes, 1 repeats
       
       @lauren @gaditb Bug reported in 2017, ignored, denigrated and victim-blamed: https://github.com/mastodon/mastodon/issues/4486
       Bug reported again in 2020, ignored but less denigrated: https://github.com/mastodon/mastodon/issues/12738
       The fact that the Mastodon developer community's reaction to "here's a real-world scaling problem happening right now" is victim-blaming and 5 years of delay rather than "we'd better find a solution to that pretty quick" is not a great look.
       
 (DIR) Post #AShHydr0KGySirRjt2 by jpanzer@mastodon.social
       2023-02-15T07:55:12Z
       
       0 likes, 0 repeats
       
       @lauren imo the way to solve this is with an optional preview metadata sent by the poster’s instance; and, a mechanism to check that it’s accurate (not spammy, misleading, etc) by normal link spam control and user reports.  If an instance accumulates too many users with bad or misleading preview data, well, that’s why we have blocklists.  (Maybe there could just be a block on preview data for badly behaving instances.)
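       
       A sketch of the receiving side of that proposal, assuming a hypothetical post dictionary with "origin" and optional "preview" keys and a local per-instance preview blocklist. None of these names come from Mastodon itself.

```python
from typing import Optional


def choose_preview(post: dict, preview_blocked: set[str]) -> Optional[dict]:
    """Return the sender-supplied preview unless the sending instance is on the
    local preview blocklist. None means "show the bare link"."""
    origin = post["origin"]
    preview = post.get("preview")  # the preview is optional in this sketch
    if preview is None or origin in preview_blocked:
        return None
    return preview


blocked = {"spammy.example"}
good = {"origin": "friendly.example", "text": "hi",
        "preview": {"url": "https://example.org/a", "title": "A"}}
bad = {"origin": "spammy.example", "text": "hi",
       "preview": {"url": "https://example.org/a", "title": "Free money"}}
bare = {"origin": "friendly.example", "text": "no preview attached"}

shown_good = choose_preview(good, blocked)
shown_bad = choose_preview(bad, blocked)
shown_bare = choose_preview(bare, blocked)
```

       The post itself still arrives from a preview-blocked instance; only its preview data is dropped.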
       
 (DIR) Post #AShmjAnBBwG0pBlGOO by renchap@oisaur.com
       2023-02-15T13:39:42Z
       
       0 likes, 0 repeats
       
       @lauren we have this on our radar and are considering solutions. I wrote a document on the topic a few weeks ago: https://gist.github.com/renchap/3ae0df45b7b4534f98a8055d91d52186
       
 (DIR) Post #AShohBcnKDwOjg5HzU by pc88ingrate@techhub.social
       2023-02-15T14:01:50Z
       
       0 likes, 0 repeats
       
       @lauren Assume I'm an idiot. What's the difference between this and any other load balancing you have to do as a web administrator? Is there something specific about Mastodon that does not let you reschedule incoming network requests? Wouldn't very simple anti-DDoS measures be fine here?
       
 (DIR) Post #AShzDgsrdfRqUbjr6G by lauren@mastodon.laurenweinstein.org
       2023-02-15T15:59:47Z
       
       0 likes, 0 repeats
       
       @pc88ingrate Many Mastodon sites operate with very limited resources. It should not be their responsibility to provision for a fundamental flaw in the Mastodon design that has been recognized for years but not fixed.
       
 (DIR) Post #ASi17tPWLrjvVxNPNo by TheWebTech@noc.social
       2023-02-15T16:20:50Z
       
       0 likes, 0 repeats
       
       @lauren @thomasafine "keep out as many of those people from Twitter as possible, unless they're *our* kind of people." Maybe I've missed some interactions that have occurred but I haven't actually seen that behavior from maintainers of Mastodon. I've seen pushback on functionality requests that they had previously weighed on, but mostly even that pushback has been more of a "let's not recreate Twitter, let's not rush and instead see if we can do something that's better."
       
 (DIR) Post #ASiKmEIC1zdqzTtecq by TheWebTech@noc.social
       2023-02-15T20:00:29Z
       
       0 likes, 0 repeats
       
       @lauren @thomasafine On the point of "relatively straightforward ways to add authentication data". To me that's not straightforward, and didn't seem to be to others in the GitHub issue. If you have a new solution that you think you could implement, suggest it in the GitHub issue and consider a Pull Request. The only straightforward and sure-fire solution in there right now is removing the feature, but to many it's considered a table-stakes feature.
       
 (DIR) Post #ASiKt3AhmBzWWke9BI by lauren@mastodon.laurenweinstein.org
       2023-02-15T20:02:34Z
       
       0 likes, 0 repeats
       
       @TheWebTech @thomasafine No, I'm past the stage where I'm willing to devote my time to coding that will be ignored by the "powers that be" in control of a project. Been there, done that.
       
 (DIR) Post #ASiPTEsW80wF4neQ6K by alfajet@alfajet.masto.host
       2023-02-15T20:53:52Z
       
       0 likes, 0 repeats
       
       @lauren I see that more as an implementation issue rather than a problem with the model itself. I assume by the #Mastodon model, you refer to the distributed nature of the fediverse. Maybe the servers should cache less aggressively. However, I am sure that when a post is going viral on another popular social media, similar load issues will arise.
       
 (DIR) Post #ASiPihuoS3azTd6IK0 by lauren@mastodon.laurenweinstein.org
       2023-02-15T20:56:23Z
       
       0 likes, 0 repeats
       
       @alfajet I've had posts slashdotted quite a few times, so I know how that impacts my servers. But that's individuals who want to read the post. It's the only way for it to really happen. The Mastodon case is different because it's *unnecessary*.
       
 (DIR) Post #ASiQUIdpiqRizWqu3c by TheWebTech@noc.social
       2023-02-15T21:04:42Z
       
       0 likes, 0 repeats
       
       @lauren @thomasafine Well, it's best to suggest the idea you have in the GitHub issue before you code anything to avoid that. Regardless this isn't a situation of folks not wanting to do it. Right now in that thread, all of the options discussed seem to have serious drawbacks. So none of them are going to get implemented until someone finds a solution to the drawbacks. The "Powers that be" are all the developers interested in solving the issue, including you, and the project maintainers.
       
 (DIR) Post #ASiQWDQ86ZBzNUM8ie by alfajet@alfajet.masto.host
       2023-02-15T21:05:24Z
       
       0 likes, 0 repeats
       
       @lauren I don't deny your point, I've seen the effect also myself on a few occasions. I am just saying that this isn't, in my view, a model flaw. I reckon this could be addressed if needed (see other reply with p2p suggestion). Posting something on slashdot probably would have attracted its fair share of bots too ;-)
       
 (DIR) Post #ASiQl4NpzDbJh5MoVs by colin_mcmillen@piaille.fr
       2023-02-15T21:08:20Z
       
       0 likes, 0 repeats
       
       @lauren @ralf @thomasafine same for me. Varnish achieves the same protection from such non-offensive ddoses.
       
 (DIR) Post #ASiQmtFEN4rJj7D2Ho by lauren@mastodon.laurenweinstein.org
       2023-02-15T21:08:30Z
       
       0 likes, 0 repeats
       
       @alfajet Slashdot does attract some bots since there are syndication sites of course, but usually the impact of being slashdotted is spread out over an hour or more as people and bots get around to reading it. The Mastodon effect was at least 800 hitting within a minute, pushing the load average up to (somewhere) over 50 -- which on a two-CPU server had notable impact.
       
 (DIR) Post #ASiTLUYiLO2ZD0wwJk by alfajet@alfajet.masto.host
       2023-02-15T21:37:20Z
       
       0 likes, 0 repeats
       
       @lauren yup that's a bit of a peak! Federation certainly comes with its own challenges, but hopefully things like this can be resolved by a better implementation. As I am running my own instance, I also see the other side of the coin: I am getting a huge media db, just for me. Should my instance host other users, it probably wouldn't be much larger. There are definitely issues to be addressed, I don't deny it! :)
       
 (DIR) Post #ASiVEoYBojhj5oeENs by lauren@mastodon.laurenweinstein.org
       2023-02-15T21:58:32Z
       
       0 likes, 0 repeats
       
       @alfajet Same thing here, it's my own instance and it's just me. Backups are getting very lengthy.
       
 (DIR) Post #ASiWDR6raGNU3blzAf by pc88ingrate@techhub.social
       2023-02-15T22:09:28Z
       
       0 likes, 0 repeats
       
       @lauren That sucks! I hope you figure out your web hosting woes soon.
       
 (DIR) Post #ASiXCJkjYEiOu9S7HM by ralf@noc.social
       2023-02-15T22:20:09Z
       
       0 likes, 0 repeats
       
       @lauren @thomasafine hence "at worst"
       
 (DIR) Post #ASiZ7k8qD7KHCzfumu by tknarr@mstdn.social
       2023-02-15T22:30:22Z
       
       0 likes, 0 repeats
       
       @gaditb @lauren I can see the reasoning. The Mastodon protocol doesn't involve itself with how a particular client displays links, at the protocol level it's plain text. Should the protocol really change to accommodate an implementation-specific quirk of specific software? IMO it's desired-enough behavior that there should be some accommodation in there, although I'd go with a "Link" attachment type, but I can understand the reasoning.
       
 (DIR) Post #ASiZ7ktHQSTNX1P0Ai by lauren@mastodon.laurenweinstein.org
       2023-02-15T22:42:05Z
       
       0 likes, 0 repeats
       
       @tknarr @gaditb I'm impressed that there isn't even agreement on something as disruptive as causing DDoS attacks. Mastodon remains a toy that can't scale. I guess that's the idea. Musk loves this.
       
 (DIR) Post #ASic8KyDa9vTIEgHuS by tknarr@mstdn.social
       2023-02-15T23:15:48Z
       
       0 likes, 0 repeats
       
       @lauren @gaditb If browsers with link prefetch enabled kill a site when a link to it is posted, which component is failing to scale?
       1. The web server the link was posted on, for not creating a local cached copy of the pages that will be fetched?
       2. The browser, for fetching content ahead of time without users requesting it?
       3. The site linked to, for not being prepared for becoming suddenly incredibly popular?
       Mastodon itself, like Slashdot, scales very well. That's the problem.
       
 (DIR) Post #ASieKgFAVv6qeecgOu by lauren@mastodon.laurenweinstein.org
       2023-02-15T23:40:22Z
       
       0 likes, 0 repeats
       
       @ScriptFanix @thomasafine The entire topology is based on trust. Not difficult to assign trust levels to preview origins.
       
 (DIR) Post #ASucneOqa1qvK3M66q by moiety@queer.garden
       2023-02-21T18:19:45Z
       
       0 likes, 0 repeats
       
       @lauren is there anything site owners can do?
       
 (DIR) Post #ASvB8sbkLZOJuWgR4i by serrebi@dragonscave.space
       2023-02-22T00:44:34Z
       
       0 likes, 0 repeats
       
       @lauren I have the same problem with my Icecast mount now. Luckily they don't count as listeners in stats, but still...
       
 (DIR) Post #AT4PtD8BWPIefETwsi by lunar@mas.to
       2023-02-26T11:42:05Z
       
       0 likes, 0 repeats
       
       @lauren So you'll complain about a problem and then when offered a way to help fix the problem you obstinately refuse. Tell me you're an American without telling me. @TheWebTech @thomasafine