[HN Gopher] Decentralized Syndication - The Missing Internet Pro...
___________________________________________________________________
Decentralized Syndication - The Missing Internet Protocol
Author : brisky
Score : 114 points
Date : 2025-01-10 11:57 UTC (2 days ago)
(HTM) web link (tautvilas.lt)
(TXT) w3m dump (tautvilas.lt)
| teddyh wrote:
| Is he reinventing USENET netnews?
| hinkley wrote:
| Spam started on Usenet. As did Internet censorship. You can't
| _just_ reinvent Usenet. Or we could all just use Usenet.
| stackghost wrote:
| >Or we could all just use Usenet.
|
| Usenet doesn't scale. The Eternal September taught us that.
|
| To bring Usenet back into the mainstream would require a
| major protocol upgrade, to say nothing of the seismic social
| shift.
| hinkley wrote:
| That's also my feeling. There's a space for something that
| has some of the same goals as Usenet while also learning
| from the past.
|
| I don't think it's a fruitful or useful comment to say
| something is "like Usenet" as a dismissal. So what if it
| is? It was useful as hell when it wasn't terrible.
| bb88 wrote:
| Yes and no. I think the primary issue is that, back when
| Usenet was popular, I could never just create a new newsgroup
| and get it to syndicate with other servers.
|
| The other issue is who's going to host it? I need a port
| somehow (CGNAT be damned!).
| fiatjaf wrote:
| Nostr is kind of what you're looking for.
| doomroot wrote:
| My thought as well.
|
| ps When is your SC podcast coming back?
| pfraze wrote:
| Atproto supports deletes and partial syncs
| convolvatron wrote:
| a lot of the use cases for this would have been covered by
| protocol designs suggested by Floyd, Jacobson and Zhang in
| https://www.icir.org/floyd/papers/adapt-web.pdf
|
| but it came right at a time when the industry had kind of just
| stopped listening to that whole group, and it was built on
| multicast, which was a dying horse.
|
| but if we had that facility as a widely implemented open
| standard, things would be much different and arguably much better
| today.
| rapnie wrote:
| > built on multicast, which was a dying horse.
|
| There's a fascinating research project Librecast [0], funded by
| the EU via NLnet, that may boost multicast right into modern
| tech stacks again.
|
| [0] https://www.librecast.net/about.html
| nunobrito wrote:
| What is it used for? I was looking at the documentation but I
| still don't understand the use case they are trying to
| solve.
|
| Isn't multicasting something already available with UDP or
| point-to-point connections without a single network involved?
| convolvatron wrote:
| by 'multicast' here one really means a facility that's
| provided by layer 3. So yes, we can build our own multicast
| overlays. But a generic facility has two big benefits. One
| is that the spanning distribution tree can be built with
| knowledge of the actual topology, and copies can be made in
| the backbone where they belong (copies in the overlay often
| mean that the data traverses the same link more than
| once).
|
| The other big one is access. If we all agree on multicast
| semantics and addressing, and it's built into everyone's
| operating system, then we can all use it as an equal-access
| facility to effectively publish to everyone, not just people
| who happen to be part of a particular club running a
| particular flavor of multicast.
| hkt wrote:
| https://en.wikipedia.org/wiki/Syndie was a decent attempt at this
| which is, I gather, still somewhat alive.
| remram wrote:
| I would love to have an RSS interface where I can republish
| articles to a number of my own feeds (selectively or
| automatically). Then I can follow some of my friends'
| republished feeds.
|
| I feel like the "one feed" approach of most social platforms is
| not here to benefit users but to encourage doom-scrolling with
| FOMO. It would be a lot harder for them to get so much of users'
| time and tolerance for ads if it were actually organized. But it
| seems to me that there might not be that much work needed to turn
| an RSS reader into a very productive social platform for sharing
| news and articles.
| fabrice_d wrote:
| That looks close to custom feeds in the ATProto / BlueSky
| world.
| James_K wrote:
| This interface already exists. It's called RSS. Simply make a
| feed titled "reposts" and add entries linking to other
| websites. I already have such a thing on my own website, with
| the precise hope that others will copy it.
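A minimal sketch of what such a "reposts" feed could look like in plain RSS 2.0 (the URLs and titles here are made up for illustration):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>reposts</title>
    <link>https://example.org/reposts.xml</link>
    <description>Articles I am re-sharing from elsewhere</description>
    <item>
      <title>Some article I liked</title>
      <!-- the link and guid point at the *other* site's article -->
      <link>https://other-site.example/post/123</link>
      <guid isPermaLink="true">https://other-site.example/post/123</guid>
      <pubDate>Fri, 10 Jan 2025 11:57:00 GMT</pubDate>
    </item>
  </channel>
</rss>
```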
| remram wrote:
| At some level yes, but I would like to be able to
| deduplicate when multiple people/feeds repost the same article,
| and it would need a lot more on the discovery side (so I can
| find friends-of-friends, more feeds from same friend I
| follow, etc). Like a web-of-trust type of construct which I
| see as necessary with the accelerating rise of bots on all
| platforms.
| James_K wrote:
| Deduping can be done on the reader end. As for a web of
| trust, you can put a friends list on your website.
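A reader-side dedupe pass like the one suggested here can be quite small. A sketch, assuming entries are dicts with a "link" key and that trailing-slash/case normalization is enough (real feeds may need a content hash or fuller URL canonicalization instead):

```python
# Reader-side de-duplication sketch: collapse entries that several
# followed feeds repost, keyed on a lightly normalized link.
def dedupe(entries):
    seen = set()
    unique = []
    for entry in entries:
        # Normalization here is an assumption; real-world URLs may
        # differ by query strings, http/https, tracking params, etc.
        key = entry["link"].rstrip("/").lower()
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique

feeds = [
    {"title": "Cool article", "link": "https://example.org/post/1"},
    {"title": "Cool article (repost)", "link": "https://example.org/post/1/"},
    {"title": "Other piece", "link": "https://example.org/post/2"},
]
```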
| edhelas wrote:
| XMPP XEP-0060 Pubsub is doing that :)
|
| I wrote a specific XEP for the social part
| https://xmpp.org/extensions/xep-0472.html
|
| And it's implemented in Movim https://movim.eu/
| wmf wrote:
| 1. Domain names: good.
|
| 2. Proof of work time IDs as timestamps: This doesn't work. It's
| trivial to backdate posts just by picking an earlier ID. (I don't
| care about this topic personally but people are concerned about
| backdating not forward-dating.)
|
| N. Decentralized instances should be able to host partial data:
| This is where I got lost. If everybody is hosting their own data,
| why is anything else needed?
| hinkley wrote:
| Time services can help with these sorts of things. They aren't
| notarizing the message. You don't trust the service to validate
| who wrote it or who sent it, you just trust that it saw these
| bytes at this time.
| catlifeonmars wrote:
| Something that maintains a mapping between a signature+domain
| and the earliest seen timestamp for that combination? I think
| at that point the time service becomes a viable aggregated
| index for readers to use when looking for updates. I think this
| also solves the problem for lowering the cost of
| participation... since the index would only store a small
| amount of data per-post, and since indexes can be composed by
| the reader, it could scale cost effectively.
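As a sketch of that mapping (class and method names are illustrative, not from any existing service): the index simply remembers the earliest timestamp it has seen for each signature+domain pair, so later sightings cannot rewind it:

```python
import time

# Earliest-seen index: maps (domain, signature) to the first time
# this service observed that combination.
class TimestampIndex:
    def __init__(self):
        self.first_seen = {}

    def observe(self, domain, signature, now=None):
        now = time.time() if now is None else now
        key = (domain, signature)
        # setdefault keeps the earliest timestamp; repeat sightings
        # return the original value and change nothing.
        return self.first_seen.setdefault(key, now)
```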
| hinkley wrote:
| I've only briefly worked with these but got a rundown from
| someone more broadly experienced with them. Essentially you
| treat trust as a checklist. I accept this message (and any
| subsequent transactions implied by its existence) if it
| comes from the right person, was emitted during the right
| time frame (whether I saw it during a separate time frame),
| and <insert other criteria here>. If I miss the message due
| to transmission errors or partitioning, I can still honor
| it later even though it now changes the consequences of
| some later message I can now determine to have arrived out
| of order.
| catlifeonmars wrote:
| I wonder if another way to think about this is as an
| authenticated vector clock. I think a merkle tree
| approach is probably too heavyweight, as it's not
| necessary to keep that information around. You kind of
| just need quorum (defined as appropriate to mitigate
| abuse) to update your local version of the vector clock,
| but unlike a merkle tree, you only need partial updates
| based on the subset of posters you care about and you
| basically only need to keep around the last few versions
| of each vector component.
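A rough sketch of that idea, with the quorum rule and all naming as my own assumptions rather than anything specified in the thread: a reader tracks one counter per poster it follows and only accepts an update once enough peers report the same new value.

```python
# Partial vector clock with a quorum rule to mitigate abuse:
# a poster's counter only advances once `quorum` distinct peers
# have vouched for the new version.
class PartialVectorClock:
    def __init__(self, quorum=2):
        self.quorum = quorum
        self.clock = {}    # poster -> accepted version
        self.pending = {}  # (poster, version) -> set of vouching peers

    def report(self, peer, poster, version):
        if version <= self.clock.get(poster, 0):
            return False  # stale or already accepted
        voters = self.pending.setdefault((poster, version), set())
        voters.add(peer)
        if len(voters) >= self.quorum:
            self.clock[poster] = version
            del self.pending[(poster, version)]
            return True  # update accepted
        return False     # still waiting for quorum
```

Only the components for posters you care about need to be tracked, which matches the "partial updates" point above.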
| hinkley wrote:
| > Merkle Tree
|
| I don't recall if any of the signatures sign across any
| other signatures. I think in some cases it's just a...
| Merkle List?
|
| Merkle trees get weird if they're done as signatures.
| With bitcoin everyone votes on the validity in a fairly
| narrow timeframe and the implausibility of spoofing a
| record and then spoofing the next n is what allows for
| the trust-in-the-blind to be practical (even if I don't
| agree that the theory is sound).
|
| For signatures, on data at rest, it gets complicated.
| Because at some point you're trusting a payload that has
| signatures from expired certificates. You end up
| having/needing transitive trust, because the two
| signatures you care about are valid, and they signed the
| payload while the signatures they cared about were still
| valid. So now you need to look at signature timestamps,
| Cert validity range, Cert _chain_ validity range, CRLs or
| OCSP, and you better make sure all your timestamps are in
| UTC...
|
| It's easier if the system has a maximum deliver-by date,
| and you just drop anything that shows up too out of band.
| I can use a piece of software that was created and signed
| over a year ago, but maybe I shouldn't accept year-old
| messages.
|
| I did a code signing system for avionics software, that
| supported chain of custody via countersigning (Merkle
| before anyone heard of bitcoin). The people who needed to
| understand it did but it broke some brains. I lost count
| of how many meetings we had where we had to explain the
| soundness of the transitivity. People were nervous, and
| frankly there weren't enough people watching the
| watchers. Sometimes you only catch your own mistakes when
| teaching.
| arccy wrote:
| that's too much tech for a trust problem it can't solve.
| just use a TimeStamp Authority like
| https://freetsa.org/index_en.php or
| https://knowledge.digicert.com/general-
| information/rfc3161-c...
| evbogue wrote:
| If the data is a signed hash, why does it need the domain name
| requirement? One can host self-authenticating content in many
| places.
|
| And one can host many signing keys at a single domain.
| catlifeonmars wrote:
| In the article, the main motivation for requiring a domain
| name, is to raise the barrier to entry above "free" to
| mitigate spamming/abuse.
| uzyn wrote:
| A one-time fixed cost will not deter spam; it only encourages
| more spamming to lower the average per-spam cost. Email
| spamming requires some system setup (a one-time fixed cost
| above $10/year), but that does not stop spam.
| catlifeonmars wrote:
| It's a one-time fixed cost per stream of messages, with
| some out-of-band mechanism to throttle posting per
| stream. I'm not sure I agree with the original article's
| choice of throttling mechanism (binding to the universal
| Bitcoin clock), but the concept still makes sense: in
| order to scale up production of spam, you still need to
| buy additional domains, otherwise you're limited to one
| post every n minutes, and domain registration is slow enough
| for block lists to keep up.
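The throttling concept can be sketched in a few lines; the interval and names here are illustrative, not the article's actual mechanism (which binds post times to the Bitcoin clock):

```python
# Per-domain rate limit: a domain may publish at most one post per
# interval, so scaling spam up requires buying more domains.
class DomainThrottle:
    def __init__(self, interval_s=600):
        self.interval_s = interval_s
        self.last_post = {}  # domain -> timestamp of last accepted post

    def allow(self, domain, now):
        last = self.last_post.get(domain)
        if last is not None and now - last < self.interval_s:
            return False  # too soon; rejected posts don't reset the timer
        self.last_post[domain] = now
        return True
```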
| wmf wrote:
| One person per domain is essentially proof of $10.
| hinkley wrote:
| There was a psychological study which found that community
| moderation tends to be self-healing if, and only if,
| punishing others for a perceived infraction comes at a cost
| to the punisher.
|
| I believe I have the timeline right that this study
| happened not too long before StackOverflow got the idea
| that getting upvoted gives you ten points and downvoting
| someone costs you two. As long as you're saying something
| useful occasionally instead of disagreeing with everyone
| else, your karma continues to rise.
| catlifeonmars wrote:
| While interesting, this seems entirely tangential to the
| conversation to me. I'm not seeing the connection between
| paying for the right to participate and punishment. What
| am I missing?
| hinkley wrote:
| If there's no system to refute the uniqueness of a
| handful of identities (sock puppets used for griefing)
| then the system won't scale. If anyone can issue a
| "takedown" for free, the system of moderation won't
| scale.
| macawfish wrote:
| Domain names are fine but they shouldn't be forced onto anyone.
| Nothing about DID or any other flexible and open decentralized
| naming/identity protocol will prevent anyone from using domain
| names if they want to.
| imglorp wrote:
| Recent events also taught us that proof of work is a serious
| problem for the biosphere when serious money is involved and
| everybody scales up. Instead, it seems proof of stake is closer
| to what is required.
| wmf wrote:
| Yeah, a verifiable delay function is probably better for
| timestamping.
| brisky wrote:
| Hi, author here. Regarding backdating: it is a valid concern. I
| did not mention it in the article, but in my proposed architecture
| users could post links of others (consider that a retweet). For
| links that have reposts, additional security checks could be
| implemented to verify the validity of the post time.
|
| Regarding hosting partial data: there should be an option to
| host just recent data for the past month or other time frames
| and not the full DB of URLs. This would improve decentralization,
| as each instance could have lower storage requirements while the
| total information remained present on the network.
| toomim wrote:
| I am working on something like this. If you are, too, please
| contact me! toomim@gmail.com.
| evbogue wrote:
| I'm working on something like this too! I emailed you.
| glenstein wrote:
| While everyone is waiting for Atproto to proto, ActivityPub is
| already here. This is giving me "Sumerians look on in confusion
| as god creates world" vibes.
|
| https://theonion.com/sumerians-look-on-in-confusion-as-god-c...
| echelon wrote:
| These are still too centralized. The protocol should look more
| like BitTorrent.
|
| - You don't need domain names for identity. Signatures are
| enough. An optional extension could contain emails and social
| handles in the payload if desired.
|
| - You don't need terabytes of storage. All content can be
| ephemeral. Nodes can have different retention policies, and
| third party archival services and client-side behavior can
| provide durable storage, bookmarking/favoriting, etc.
|
| - The protocols should be P2P-first rather than federated. This
| prevents centralization and rule by federated cabal. Users can
| choose their own filtering, clustering, and prioritization.
| RobotToaster wrote:
| Isn't this ipfs?
| immibis wrote:
| There's no known way to make this work well yet, but feel
| free to invent that. Until that happens, federated is mostly
| the best we have, because most people don't want to be
| responsible for their own servers.
|
| P.S. ActivityPub is a euphemism for Mastodon's protocol,
| which isn't just ActivityPub.
| viraptor wrote:
| > Nodes can have different retention policies, and third
| party archival services and client-side behavior can provide
| durable storage, bookmarking/favoriting, etc.
|
| That's completely achievable in AP. Most current servers use
| reasonable retention, extended for boosted posts.
| MichaelZuo wrote:
| Then it is a bit strange that it wasn't designed to be
| 'BitTorrent-like' from the beginning, as the parent
| suggests.
| Uptrenda wrote:
| >Everybody has to host their own content
|
| Yeah, this won't work. Like at all. This idea has been tried over
| and over on various decentralized apps and the problem is as
| nodes go offline and online links quickly break...
|
| No offense but this is a very half-assed post to gloss over what
| has been one of the basic problems in the space. It's a problem
| that inspired research in DHTs, various attempts at decentralized
| storage systems, and most recently -- we're getting some
| interesting hybrid approaches that seem like they will actually
| work.
|
| >Domain names should be decentralized IDs (DIDs)
|
| This is a hard problem by itself. All the decentralized name
| systems I've seen suck. People currently try to use DHTs. I'm not
| sure that a DHT can provide reliability, though, and since the name
| is the root of the entire system it needs to be 100% reliable. In
| my own peer-to-peer work I side-step this problem entirely by
| having a fixed list of root servers. You don't have to try to
| "decentralize" everything.
|
| >Proof of work time IDs can be used as timestamps
|
| Horribly inefficient for a social feed and orphans are going to
| screw you even more.
|
| I think you've not thought about this very hard.
| catlifeonmars wrote:
| > In my own peer-to-peer work I side-step this problem entirely
| by having a fixed list of root servers. You don't have to try
| "decentralize" everything.
|
| Not author, but that is what the global domain system is. There
| are a handful of root name servers that are baked into DNS
| resolvers.
| Uptrenda wrote:
| You're exactly right. It seems to work well enough for
| domains already so I kept the model.
| est wrote:
| Pity RSS is one-way. There's no standard way of commenting or
| doing interactions.
| uzyn wrote:
| Interaction/comment syndication would be very interesting. This
| is, I feel, what makes proprietary social media so addictive.
| sali0 wrote:
| It's an interesting point. I haven't even read the article
| yet, but I have been reading the comments. Maybe they were the
| star of the show all along.
| hinkley wrote:
| Someone on the NCSA Mosaic team had a similar idea, but after
| they left nobody remaining knew what to do with it or how it
| worked.
|
| It took me 20 years to decide maybe they were right. A bunch
| of Reddits more tightly associated with a set of websites and
| users than with a centralized ad platform would be fairly
| good - if you had browser support for handling the syndicated
| comments. You could have one for your friends or colleagues,
| one for watchdog groups to discuss their fact checking or
| responses to a new campaign by a troublesome company.
| Zak wrote:
| This comment describes ActivityPub.
| cyberax wrote:
| That is a really great list of requirements.
|
| One area that is overlooked is commercialization. I believe that
| a decentralized protocol needs to support some kind of paid
| subscription and/or micropayments.
|
| WebMonetization ( https://webmonetization.org/docs/ ) is a good
| start, but they're not tackling the actual payment infrastructure
| setup.
| openrisk wrote:
| It's not obvious to me that what is missing here is another
| technical protocol rather than more effective 'social protocols'.
| If you haven't noticed, the major issues of today are not the
| scaling of message passing per se but the moderation of content
| and violations of the boundary between public and private. These
| issues are socially defined and cannot be delegated to (possibly
| algorithmic) protocols.
|
| In other words, what is missing is rules, regulations and
| incentives that are adapted to the way people use the digital
| domain and that keep the decentralized exchange of digital
| information within a consensus "desired" envelope.
|
| Providing capabilities in code and network design is of course a
| great enabler, but drifting into technosolutionism of the bitcoin
| type is a dead end. Society is not a static user of technical
| protocols. If left without matching social protocols, any
| technical protocol will be exploited and fail.
|
| The example of abusive hyperscale social media should be a
| warning: they emerged as a behavior, they were not specified
| anywhere in the underlying web design. Facebook is just one
| website after all. Tim Berners-Lee probably did not anticipate
| that one endpoint would successfully fake being the entire
| universe.
|
| The deeper question is: do we want the shape of digital networks
| to reflect the observed concentration of real current social and
| economic networks, or do we want to use the leverage of this new
| technology to shape things in a different (hopefully better)
| direction?
|
| The mess we are in today is not so much failure of technology as
| it is digital illiteracy, from the casual user all the way to the
| most influential legal and political roles.
| pessimizer wrote:
| > If you haven't noticed, the major issues of today are not the
| scaling of message passing per se but the moderation of content
| and violations of the boundary between public and private.
|
| Are those the major issues of today? Those are the major issues
| for censors, not for communicators.
| openrisk wrote:
| Are spammers and scammers "communicators"? How about
| organized misinformation campaigns? In what kind of deeply
| sick ideological la-la-land is any kind of control of
| information flow "censorship"?
| neuroelectron wrote:
| I think it's pretty clear they don't want us to have such a
| protocol. Google's attack on RSS is probably the clearest example
| of this, but there's also several more foundational issues that
| prevent multicasts and similar mechanisms from being effective.
| nunobrito wrote:
| NOSTR has solved most of these topics in a simple way. Anyone can
| generate a private/public key pair without emails or passwords,
| and anyone can send messages that you can verify as truly
| belonging to the person holding that key.
|
| There are hundreds of servers run today by volunteers, and there
| is little cost of entry, since even cellphones can be used as
| servers (nodes) to keep your private notes or the notes of the
| people you follow.
|
| There is now a file sharing service called "Blossom" which is
| decentralized in the same simple manner. I don't think I've seen
| a way there to specify custom domains; for the moment people can
| only use the public key to host simple web pages without a
| server behind them.
|
| Many of the topics on your page match what has been implemented
| there; it might be a good match for you to improve it
| further.
| brisky wrote:
| Can NOSTR handle 100 million daily active users?
| nunobrito wrote:
| Your question rephrased: "Can EMAIL handle 100 million daily
| users?".
|
| The answer is yes.
|
| NOSTR is similar to email. Users depend on nostr/email
| providers, but not on any single one of them; what exists
| is a common agreement (the protocol). The overwhelming
| majority of those providers are free, and you can also run
| your own from a cellphone.
|
| Some providers might become commercial like gmail, still many
| others will still provide access for free. Email is doing
| just fine nowadays, NOSTR will do fine as well.
| Groxx wrote:
| This is all necessarily true of _any_ "protocol". It is
| absolutely not true that every protocol scales efficiently
| to 100 million active users all interacting though, so it
| is basically a meaningless claim.
|
| E.g. ActivityPub has exactly the same claims, and it's
| currently handling several million, essentially all
| interact _able_. Some parts are working fine, and some
| parts are DDoSing every link shared on any normally-
| connected instance.
| defanor wrote:
| AIUI, the "Decentralized" added to RSS here stands for:
|
| - Propagation (via asynchronous notifications). Making it more
| like NNTP. Though perhaps that is not very different functionally
| from feed (RSS and Atom) aggregators: those just rely on pulling
| more than on pushing.
|
| - A domain name per user. This can be problematic: you have to be
| a relatively tech-savvy person with a stable income and living in
| an accommodating enough country (no disconnection of financial
| systems, blocking of registrar websites, etc) to reliably
| maintain a personal domain name.
|
| - Mandatory signatures. I would prefer OpenPGP over a fixed
| algorithm though: otherwise it lacks cryptographic agility and
| reinvents parts of OpenPGP (including key distribution). And
| perhaps signatures should be made optional.
|
| - Bitcoin blockchain.
|
| I do not quite see how those help with decentralization, though
| propagation may help with discovery, which indeed tends to be
| problematic in decentralized and distributed systems. But that
| can be achieved with NNTP or aggregators. While the rest seems to
| hurt the "Simple" part of RSS.
| James_K wrote:
| A number of countries actually offer free domain names to
| citizens. I agree with the rest though. I don't see what this
| adds to RSS, which already has most of these things given it's
| served over HTTPS in most cases.
| James_K wrote:
| Perhaps this is a little naive of me, but I really don't
| understand what this does. Let's say you have a website with an
| RSS feed; it seems to have everything listed here. I suppose pages
| don't have signatures, but you could easily include a signature
| scheme in your website. In fact I think this is possible with
| existing technologies using a link element with MIME type
| "application/pkcs7-signature".
| blacklion wrote:
| It is funny how the link to a text with these words: "Everybody
| has to host their own content" points to medium.com, not to
| tautvilas.lt
| brisky wrote:
| https://news.ycombinator.com/item?id=42624529
| brisky wrote:
| The URL has now been changed to my own domain by the mods.
| jasode wrote:
| The blog mentions the "discovery problem" 7 times but this
| project's particular technology architecture for syndication
| doesn't seem to actually address that.
|
| The project's main differentiating factor seems to be _not
| propagating the actual content_ to the nodes but instead saving
| disk space by only distributing hashes of content.
|
| However, having a "p2p" decentralized network of hashes doesn't
| solve the "discovery" problem. The blog lists the following
| bullet points of metadata but that's not enough to facilitate
| "content discovery":
|
| _> However it could be possible to build a scalable and fast
| decentralized infrastructure if instances only kept references to
| hosted content.
|
| >Let's define what could be the absolute minimum structure of
| decentralized content unit:
|
| >- Reference to your content -- a URL
|
| >- User ID -- A way to identify who posted the content (domain
| name)
|
| >- Signature -- A way to verify that the user is the actual owner
|
| >- Content hash -- A way to identify if content was changed after
| publishing
|
| >- Post time -- A way to know when the post was submitted to the
| platform
|
| >It is not unreasonable to expect that all this information could
| fit into roughly 100 bytes._
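A quick way to sanity-check that size estimate is to pack the five quoted fields. This is only a sketch: the field layout, null separators, and the 64-byte Ed25519-sized signature slot are my assumptions, not the article's wire format.

```python
import hashlib
import struct
import time

# Pack the five-field content unit: URL, user ID (domain),
# signature, content hash, post time.
def make_record(url, domain, signature64, content):
    content_hash = hashlib.sha256(content).digest()   # 32 bytes
    post_time = struct.pack(">Q", int(time.time()))   # 8 bytes, big-endian
    return (url.encode() + b"\x00" +
            domain.encode() + b"\x00" +
            signature64 +                             # 64 bytes (Ed25519-sized)
            content_hash +
            post_time)

rec = make_record("https://example.org/p/1", "example.org",
                  b"\x00" * 64, b"hello world")
```

Note that the fixed-size fields alone (64-byte signature, 32-byte hash, 8-byte timestamp) already take 104 bytes, so the ~100-byte estimate only holds with a shorter signature scheme or a truncated hash; with a short URL the record above lands around 140 bytes.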
|
| Those minimal 5 fields of metadata (url+userid+sig+hash+time) are
| not enough to facilitate content discovery.
|
| Content discovery of _reducing the infinite internet down to a
| manageable subset_ requires a lot more metadata. That extra
| metadata requires _scanning the actual content_ instead of the
| hashes. This _extra metadata based on actual content_ (e.g.
| Google 's "search index", Twitter's tweets & hashtags, etc) -- is
| one of the factors that acts as unescapable gravity pulling users
| towards centralization.
|
| To the author, what algorithm did you have in mind for
| decentralized content discovery?
| brisky wrote:
| Thanks for the comment, these concerns are valid. At its core
| the protocol supports only basic discovery: you can see who is
| posting right now and the history of everyone who has ever
| posted. Rich-context discovery, where content can be found by
| specific tags and keywords, would be implemented by reader
| platforms that crawl the index.
| somat wrote:
| Ipfs has a pub/sub mechanism.
|
| As far as I can tell it is stuck in some sort of inefficient
| prototype stage, which is unfortunate because I think it is one
| of the neatest, most compelling parts of the whole project. It is
| very cool to be able to build protocols with no central server.
|
| Here is my prototype of a video streaming service built on it. I
| abandoned the idea mainly because I am a poor programmer and
| could never muster the enthusiasm to get it past the prototype
| stage, but the idea of a video streaming service that was
| actually serverless sounded cool at the time.
|
| http://nl1.outband.net/fossil/ipfs_stream/file?name=ipfs_str...
| brisky wrote:
| I think both RSDS and IPFS use the libp2p pub/sub mechanism
| dang wrote:
| Url changed from https://tautvilas.medium.com/decentralized-
| syndication-the-m..., which points to this.
| brisky wrote:
| Thank you
___________________________________________________________________
(page generated 2025-01-12 23:01 UTC)