Post AxrwJgLtn2i28FV8MK by jdp23@neuromatch.social
(DIR) Post #AxpWadLHFS4GaxiGf2 by ricci@discuss.systems
2025-09-03T13:12:04Z
0 likes, 0 repeats
I have some ideas for another metric for distributed social networks:

# The B-Index

The B-Index intends to model the number of entities that have blocking-level control over an account's participation in a distributed social network.

It is expressed as $B_X = N$, where $X$ is the fraction of the network that the account is blocked from if $N$ entities block it from the portions they have control over.

For example, a network with $B_{50} = 20$ would mean that 20 entities making blocking decisions can block an account from half of the network.

## Uses

The B-Index can be used for multiple purposes:

* From an individual user perspective, it can be thought of as the ability of administrators, etc. to block that user's access to the network; users may prefer networks that have a high B-Index if they have concerns about being blocked from network access
* From a Trust and Safety perspective, it can be thought of as the amount of cooperation required to limit bad actors' access to the network: users may prefer networks with a low B-Index if they have concerns about being targeted
* From a resilience perspective, it can be thought of as the exposure of the network to the disappearance of infrastructure due to financial collapse, DoS attack, legal action, etc.; users may prefer networks with a high B-Index if they have concerns about the stability or sustainability of individual infrastructural elements

Details here, comments welcome: https://github.com/ricci/distributed-social-networks/blob/main/BIndex.md
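As a rough illustration of the definition in this post, here is a minimal Python sketch. It assumes each entity's control can be summarized as a single percentage of the network and that those percentages do not overlap; neither assumption comes from the post. The function name `b_index` and the sample shares are hypothetical.

```python
from typing import Optional, Sequence


def b_index(shares: Sequence[float], x: float) -> Optional[int]:
    """Return the smallest N such that the N largest entities together
    control at least x percent of the network, or None if even all
    entities combined fall short.

    shares: percent of the network each entity controls (assumed disjoint).
    x: target percent of the network, e.g. 50 for B_50.
    """
    covered = 0.0
    # Taking the largest shares first minimizes the number of entities needed.
    for n, share in enumerate(sorted(shares, reverse=True), start=1):
        covered += share
        if covered >= x:
            return n
    return None


# Hypothetical network: one large entity plus many small ones.
example_shares = [25.0, 10.0, 5.0] + [1.0] * 60

print(b_index(example_shares, 25))  # 1: the largest entity alone covers 25%
print(b_index(example_shares, 50))  # 13: the three largest plus ten small ones
```

With disjoint shares, picking the largest entities first gives the minimum count, which is why a simple sort is enough here. Overlapping control would need a different approach.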
(DIR) Post #AxpWfPHcwj1gRMHX4C by ricci@discuss.systems
2025-09-03T13:13:00Z
0 likes, 0 repeats
@bnewbold @jdp23 I would particularly like your thoughts on application of this to the ATProto world if you are so inclined
(DIR) Post #AxpXx92gAWhvEhJPAO by jawnsy@mastodon.social
2025-09-03T13:27:23Z
0 likes, 0 repeats
@ricci
✅ Has graduate degree
✅ Has LaTeX syntax burned into brain forever
(DIR) Post #Axpbm31oaVMvDPs71c by elplatt@greatjustice.net
2025-09-03T14:10:14Z
0 likes, 0 repeats
@ricci Interesting! This is kind of the converse of the "effective redundancy" we proposed in our attack tolerance paper. The challenge, I think, is preventing circumvention by large numbers of sockpuppet accounts (sybil attacks).
(DIR) Post #AxpcNMFG6RSzWux0bY by ricci@discuss.systems
2025-09-03T14:16:58Z
0 likes, 0 repeats
@elplatt You're referring to this paper? https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0214292

I'll give it a read, thanks!
(DIR) Post #AxpeiA5QheAtoJOvk8 by elplatt@greatjustice.net
2025-09-03T14:43:08Z
0 likes, 0 repeats
@ricci that's the one!
(DIR) Post #AxpwRctK9X87On0IKG by jdp23@neuromatch.social
2025-09-03T18:01:48Z
0 likes, 0 repeats
Interesting ... My quick reaction is that the different potential levels of blocking in AT Protocol are a complication, and so are the multiple paths of access. A couple of specific examples:

Bluesky PBC controls the PLC directory, so if they blocked a did:plc account at that level it would affect access to 100% of the network. But did:web exists as well, and the blocking there is done by DNS authorities or whoever controls the domain if multiple people are using it. If 99% of accounts currently use did:plc (I just pulled that out of a hat, and have no idea what the actual numbers are), that means B(1) = 99 ... but it also means that it's trivial for anybody blocked to create a new did:web account and evade the block. Then again, this is a situation where (if I understand correctly) there isn't "credible exit" -- you can't convert a did:plc to did:web, so you'll lose your data and network.

Similarly, looking at the Mississippi situation, where Bluesky blocked access at the app level: 99%+ of people currently use the app. Other apps don't block, and VPNs exist, so it's trivial to evade the block ... but a lot of people don't know this. So is that B(1) = 99, or B(1) = 0, or ...?

I also wonder if a single overall number for the whole ATmosphere obscures important differences between different subnetworks. If Bluesky PBC's Relay and AppView block a PDS, then that PDS's users are arguably essentially cut off from the "Bluesky network" (however that's defined). But their own internal connectivity might be a lot less affected -- if, for example, they've got their own custom AppView. Then again, they also potentially have their own internal choke points. So Blacksky's B-index (or Gander's, or Spark's) is different from Bluesky's, and I'm not sure about aggregating them.

@bnewbold
(DIR) Post #Axq6NOVGhiaT9tNAGW by ricci@discuss.systems
2025-09-03T19:53:08Z
0 likes, 0 repeats
@jdp23 @bnewbold Thanks very much for these thoughts! I really appreciate your help and patience filling in the gaps in my knowledge here. FWIW I'm playing around with these things hands-on to learn more: I set up my own PDS last night and hope to try my own relay, either with the reference implementation or blacksky's rsky, this week.

On PLC: The number of did:web handles for the network appears to be 159, according to another dataset I found that scrapes handles. I'm not confident in saying that's the complete set, but the share of PLC handles would appear, conservatively, to be north of 99.9% (the dataset I am using, which comes from the bluesky relay, contains 402k handles, so that's assuming that I'm seeing a representative sample of the full set of handles).

My conclusion here is that the B-Index for atproto is, depending on your viewpoint, very boring or very revealing. Unless the handle data I'm looking at is very incorrect, the B-Index for the *current* atmosphere is effectively $B_X = 1$ for all $X$ at least up to 99.9; whether or not you believe they *would*, through administrative decisions or company failure Bluesky PBC could cut off any or all users from 99.9% of the network. I'm sure some people find that totally unsurprising, while others are probably quite surprised.

On your third point, this is actually kind of what I'm trying to capture with B-Index. Yeah, it does seem like (maybe other than the PLC) someone who is using blacksky's appview, PDS, and relay is going to be just fine within the blacksky network - but they will be cut off from 99.999% or so of the atmosphere. The B-Index is not trying to model whether they care (some people will be perfectly fine with that scenario), but rather to model the centralization of power that gets one to that point; i.e. one company can decide to cut me off from 99.999% of the world, vs. if m.s blocks me (worst case on the fediverse) that cuts me off from 25% of the world.

On the Mississippi situation, my answer is kind of similar: how many companies (or instance admins, etc.) does it take to make a decision that affects X% of users? On the VPN thing, I think we can't model it, since we can't really model how accessible VPNs are. On the alternative appviews, we can, though, if we can get user counts for them; here the B-Index may be higher, because more entities have to make the decision to block.

So some other thoughts here:

* One of the interesting things (I hope) about B-Index is that you can use it for little 'what if' experiments, e.g. we could move on from the PLC, assume that it will not be used to revoke DIDs and that it will not go down, move on to the next 'largest' resource and go from there: I assume that's likely to be the bluesky relay or maybe the appview. But this is where I need to understand better what someone on, say, blacksky, who has been banned at the appview and/or relay level from bluesky can or cannot see. Maybe once I have my own relay to play with I'll have a better understanding.
* I think it would be interesting to create some version of this metric that looks not at blocking but credible exit, separately for social graph and user data (which as you point out is going to cover an exceedingly small number of Fediverse users today). Though it sounds like atproto might run into the same problem with PLC - but we can use the same technique of 'here's what it looks like today, here's what it would look like if the PLC were in the hands of a consortium, or truly decentralized, etc.'
* Labeling and feeds are also going to be interesting here, I'm not giving up on trying to look at those just yet :) This will probably be the case where, assuming I can get appropriate data, the *fediverse* is the one that's complicated to model, since each instance has its own blocklist
* As you point out there are DNS and other external dependencies, and I'm interested in those as well. I did some rough numbers for the fediverse in terms of networks that hosts are on a while ago; I'm sure this could be formalized: https://discuss.systems/@ricci/114396317436420669
* I think it would be interesting to extend the kind of thing noted in my previous point to legal jurisdictions as well.
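The 'what if' experiments described in this post could be sketched along the following lines. This is only a toy model under stated assumptions: each entity's control is represented as a set of account IDs it can cut off, sets may overlap (as a directory and a relay would), and the entity names, account counts, and the function `b_index_overlapping` are all hypothetical rather than measured atproto data.

```python
from itertools import combinations
from typing import Dict, Optional, Set


def b_index_overlapping(
    control: Dict[str, Set[int]], total: int, x: float
) -> Optional[int]:
    """Smallest number of entities whose combined (possibly overlapping)
    control covers at least x percent of `total` accounts.

    Brute force over all groups, so only suitable for a handful of entities.
    """
    names = list(control)
    for n in range(1, len(names) + 1):
        for group in combinations(names, n):
            cut_off = set().union(*(control[name] for name in group))
            if 100.0 * len(cut_off) / total >= x:
                return n
    return None


# Toy data: 1000 accounts numbered 0..999. All figures are made up.
accounts = 1000
control = {
    "directory": set(range(0, 999)),    # stands in for a PLC-like directory
    "big_relay": set(range(0, 990)),
    "small_relay": set(range(990, 1000)),
}

print(b_index_overlapping(control, accounts, 99))  # 1: the directory alone

# What-if: assume the directory never blocks and never goes down.
without_directory = {k: v for k, v in control.items() if k != "directory"}
print(b_index_overlapping(without_directory, accounts, 99))    # 1: big_relay
print(b_index_overlapping(without_directory, accounts, 99.5))  # 2: both relays
```

Dropping an entity from the dictionary and recomputing is the whole "what if" step. The exhaustive search is used because overlapping sets make the greedy largest-first shortcut unreliable.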
(DIR) Post #Axqhm6d3zgUzGYfJJ2 by okennedy@discuss.systems
2025-09-04T02:52:07Z
0 likes, 0 repeats
@ricci It might also be worth thinking about the distinction between different data flow modalities, as (at least in the Fedi) the B_x value is different for each.

Read modalities:
- Read: The ability of a user to access known public information on the network. (can I read a Toot that I have a link to)
- Link: The ability of a user to access information connected to known public information (can I access the Toot that a post I can access is replying to)
- Subscribe: The ability of a user to request that the network push updates to them (can I follow a user / subscribe to a thread)

Write modalities:
- Publish: The ability of a user to push content to users who subscribe (what fraction of the network can follow me)
- Broadcast: The ability of a user to make their content discoverable to others (what fraction of the network shows my Toots in the communal feed)

For example, I'm not sure the fediverse even has a "Link" B_50, as public posts are public. Sites might block specific IPs, but it's virtually impossible to stop someone from accessing public posts or posts linked to them. Contrast with X, where you need an account to see more than a direct link.

This is, I think, something that sets the Fediverse apart, as there are explicit technical barriers to site admins blocking off some of these flow classes. F'rex, I seem to recall at least one outrage-inducing effort a year or so ago where some corp tried (successfully?) to bypass Subscribe restrictions by using scrapers. I imagine the AI companies are doing the same thing right now.
(DIR) Post #Axrbi4xzpMncNw74MK by ricci@discuss.systems
2025-09-04T13:18:55Z
0 likes, 0 repeats
@okennedy This is quite an interesting idea, as I think it also sets a basis for measuring some of the things that make bluesky interesting too - there, the ability of third parties to create feeds, to increase visibility of posts, and labellers, to allow people to opt in to decreased visibility of posts, fits roughly into this framework as well.
(DIR) Post #AxrwJgLtn2i28FV8MK by jdp23@neuromatch.social
2025-09-04T17:09:50Z
0 likes, 0 repeats
Glad it was useful! It certainly is interesting work -- a lot to think about here ...

Agreed that did:web is very niche. I was discussing this with Rudy Fraser and he pointed out that there are shadow copies of the PLC directory (in fact rsky-relay currently uses its own copy), so if Bluesky PBC went away it could probably be rebuilt. So the current B-index has different implications in terms of the power of their administrative decisions and the impact of a financial collapse.

In terms of the what-ifs, I tend to look at things through a threat modeling lens. Sometimes threats can be prevented or mitigated proactively; other times, it makes more sense to wait and do the mitigation and recovery afterwards.

I had seen the fedi hosting analysis when you did it (although it didn't click until now that you were the same person). Instance blocklists are another place where there have been concerns in fedi that relate to the B-index. A couple of years ago there were concerns that The Bad Space would somehow get used as a blocklist for the entire fediverse. These particular concerns were somewhat unrealistic -- as I said at the time:

"The Bad Space includes mastodon.social on its default blocklist – and mastodon.social is run by Mastodon gGmbH, who also maintains the Mastodon code base. Mastodon's not going to adopt a default blocklist that blocks mastodon.social, and Mastodon is currently over 80% of the fediverse. So The Bad Space isn't going to get adopted as a default by the entire fediverse."

But the underlying issue is real. One specific threat is that Meta has talked about providing moderation tools for fedi (out of the goodness of their hearts of course, nothing to worry about here) and it wouldn't surprise me if they embed their own blocklist as part of it, and maybe even over time make that a requirement for federating with threads.net.

@ricci @bnewbold
(DIR) Post #Axt5bXwgsTZ219GOFE by ricci@discuss.systems
2025-09-05T06:28:32Z
0 likes, 0 repeats
@jdp23 @bnewbold I think it's *probably* fair to leave PLC off the threat list for B-Index, since, as you say, it can be backed up (though I think one would have to get control of the domain to actually use the backups in a disaster). I'll focus on the relays.

And yeah, the bad space and others like it are why I want to figure out how to include blocklists in the calculation for the fediverse; the hard part of course will be trying to get data on how widely they are used. I've been watching the bad space from the beginning and have my own concerns too. I *think* the tier0 lists from seirdy https://seirdy.one/posts/2023/05/02/fediverse-blocklists/ and oliphant https://codeberg.org/oliphant/blocklists are more widely used, but frankly that's just a hunch, I want data. :) And whether you think seirdy's list is good or not (I happen to think it looks good), it's one person maintaining the tier0, so that significantly centralizes things for the B-Index for the fediverse.

BTW I just added git forges to the main stats page...
(DIR) Post #Axu3ji7Svj5j8a1p9E by jdp23@neuromatch.social
2025-09-05T17:42:22Z
0 likes, 0 repeats
Maybe what the PLC question points to is that it's not an overall B-index, it's a B-index scoped to specific threats:

B(PLC, 99.9) = 1 and that's likely to remain the case for the next 6+ months

B(Relay, 99) = 1 but that's (maybe) in the process of changing

On blocklists ... it's a can of worms. On The Bad Space, I have thoughts at https://privacy.thenexus.today/the-bad-space/ ... as far as I know few if any instances use it as a blocklist, although when I set up an instance I did use their 60% list along with Seirdy's Tier0 as starting points. Seirdy's work is extraordinarily good, the detailed receipts are invaluable, and there's a lot of value complementing the automated aggregation with personal judgment. TBS's sources prioritize safety for marginalized people -- the race aspect gets a lot of focus, obviously, but it's often overlooked just how many of the TBS sources are trans- and queer-led.

My guess is that Seirdy's lists, Gardenfence, and Oliphant have the broadest usage, but I don't know how broad it is -- and I don't know how many instances automatically process updates. A complication here is that they're all aggregations of the blocklists of various source instances, and there's a lot of overlap on their sources; plus, some of the sources take cues from the others. Which makes sense: if an instance known for good moderation decides to block somebody, it's a good idea for everybody to look at it and make their own decisions, and if there are receipts everybody will make a similar decision. For that matter, no matter the source, a post to FediBlock with receipts will also lead to blocking by multiple instances.

I don't think any of the larger instances directly use any of the blocklists; some consult them as part of curating their blocklists, but don't treat them as ground truth.
Seirdy's FediNuke and Gardenfence both had ~140 entries last time I checked (early this year), with substantial but not complete overlap, and all of those are on Seirdy's Tier0 and Oliphant's Tier0 ... but according to CARIAD, there are only ~70 instances that are blocked by 80% of the instances that make their blocklists public. wtf. Seirdy has receipts on the blocklist page, why would anybody not block those instances? @jaz has links to CARIAD's stats at https://neuromatch.social/@jaz@mastodon.iftas.org/115146871066372145

@ricci
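To make the "blocked by 80% of the instances that make their blocklists public" figure concrete, here is a small Python sketch of one way such a consensus count could be computed. How CARIAD actually derives its statistic is not described in this thread, so the method, the function name `consensus_blocks`, and all domains below are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, List, Set


def consensus_blocks(
    blocklists: Dict[str, Set[str]], threshold_percent: int
) -> List[str]:
    """Domains blocked by at least `threshold_percent` of the instances
    whose public blocklists are given."""
    counts = Counter(
        domain for blocked in blocklists.values() for domain in blocked
    )
    n_instances = len(blocklists)
    # Integer comparison avoids floating-point edge cases at the threshold.
    return sorted(
        domain
        for domain, count in counts.items()
        if count * 100 >= threshold_percent * n_instances
    )


# Hypothetical public blocklists, keyed by the instance publishing them.
public_blocklists = {
    "a.example": {"bad1.example", "bad2.example", "bad3.example"},
    "b.example": {"bad1.example", "bad2.example"},
    "c.example": {"bad1.example", "bad3.example"},
    "d.example": {"bad1.example", "bad2.example"},
    "e.example": {"bad1.example", "bad2.example"},
}

print(consensus_blocks(public_blocklists, 80))
# ['bad1.example', 'bad2.example']
```

Lowering the threshold in a sketch like this shows how quickly the consensus set grows, which is one way to see how much the lists agree.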
(DIR) Post #AxyXyhw5fX14AYcAqG by ricci@discuss.systems
2025-09-07T21:40:06Z
0 likes, 0 repeats
@jaz @jdp23 Thanks very much, folks, for the link to this report, I did hear about it when it came out, but I had forgotten about it!
(DIR) Post #AxyYcPIAPDxTvgs3QO by ricci@discuss.systems
2025-09-07T21:47:17Z
0 likes, 0 repeats
@jdp23 Yeah, and of course correlated behavior between different instances is possible to model (e.g. if instance A puts something on its blocklist, what is the probability that B, C, and D will?) but harder (likely impossible) to put judgement on as to "how 'good' is this", especially since "good" is undefinable. As you say, some instances having a reputation for "good" moderation, and others taking their lead, is "good" in the sense that it gets some bad actors blocked quickly and widely, but is vulnerable to a small number of people making "bad" decisions.

I have spent a good chunk of today trying to see if I can get data about use of various labelers in bluesky, because I think this might show very interesting decentralization trends, but with no luck so far. You can get *follower* counts for the labelers but not *subscribers*, and it's the latter that I need.
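The "if instance A puts something on its blocklist, what is the probability that B will?" question could be estimated from blocklist snapshots roughly as follows. This is a sketch under a strong simplification: it only looks at overlap between current lists and ignores who blocked first, so it measures correlation, not who is following whose lead. The function name `p_also_blocks` and the sample data are hypothetical.

```python
from typing import Dict, Optional, Set


def p_also_blocks(
    blocklists: Dict[str, Set[str]], a: str, b: str
) -> Optional[float]:
    """Estimate P(b blocks a domain | a blocks it) as |A & B| / |A|.

    Returns None when instance `a` blocks nothing, since the conditional
    probability is undefined in that case.
    """
    blocked_by_a = blocklists[a]
    if not blocked_by_a:
        return None
    return len(blocked_by_a & blocklists[b]) / len(blocked_by_a)


# Hypothetical blocklists for three instances.
lists = {
    "A": {"w.example", "x.example", "y.example", "z.example"},
    "B": {"x.example", "y.example", "z.example"},
    "C": {"x.example"},
}

print(p_also_blocks(lists, "A", "B"))  # 0.75
print(p_also_blocks(lists, "A", "C"))  # 0.25
print(p_also_blocks(lists, "B", "A"))  # 1.0
```

The asymmetry between the A-to-B and B-to-A values is the interesting part: it hints at which lists are subsets of others, though timestamps would be needed to say anything about influence.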