[HN Gopher] How to self-host all of Bluesky except the AppView (...
       ___________________________________________________________________
        
       How to self-host all of Bluesky except the AppView (for now)
        
       Author : icy
       Score  : 104 points
       Date   : 2024-11-08 13:13 UTC (9 hours ago)
        
 (HTM) web link (alice.bsky.sh)
 (TXT) w3m dump (alice.bsky.sh)
        
       | __justplaying wrote:
       | author here, should you have questions!
        
         | theschmed wrote:
         | Thanks for making yourself available to answer questions!
         | Hopefully this is not a dumb question.
         | 
         | Is plc.directory a single point of failure for BlueSky users
         | who want to take advantage of the benefits of a did:plc? And if
         | so, is that a permanent thing or down the road will there be
         | multiple interoperating did:plc directories?
        
           | __justplaying wrote:
           | yes it's a SPOF. not sure about the second question, but i do
           | know there are plans to transfer its ownership to an
           | independent foundation
        
             | pfraze wrote:
             | Transferring to an independent org is what we're talking
             | about now, yes.
             | 
             | The backstory to PLC is that we picked up the DID standard
             | and looked for an existing registry-method that would
             | satisfy requirements [1]. None of them really did. We then
             | surveyed mechanisms for decentralized operation: DHTs, open
             | blockchains, permissioned blockchains, and federated
             | databases. Of them, the two blockchain variants seemed
             | perhaps promising, but still premature since (as of 2022)
             | there's cost variability due to load and in some cases
             | bad transaction latency (eg 10 minutes).
             | 
             | We decided the best decision was to create PLC, which
             | matches all of the requirements except for longterm meta
             | governance. The way we designed it was to make the registry
             | mechanics transferrable to a different protocol in the
             | future, so that if for instance we decided (say) a DHT was
             | suitable (it's not) we'd be able to use the same
             | identifiers but change resolution and mutations to a new
             | process. Then we started talking to other SMEs to get their
             | take.
             | 
             | Ultimately the solution that's gotten the most favorable
             | response has been setting up an ICANN-style independent
             | organization to operate it. This can be joined with a
             | couple of interesting systems, such as mirrors which tail a
             | certificate-transparency-style audit log, and which could
             | even serve as transaction witnesses to indicate when the
             | core registry might be rejecting updates ("write
             | censorship").
             | 
             | What can I say, some things take time and stakeholder-
             | building. Look up the history of DNS and Network Solutions
             | Inc for a bit of a wild ride that people have forgotten
             | about. One other thing I should point out is that the DID
             | spec enables multiple registry methods. Atproto currently
             | supports did:web, and if other methods show up which
             | satisfy the requirements then we are interested.
             | 
             | [1] Secure against manipulation by the registry operators,
             | longterm meta governance, highly available, reasonable
             | transaction latency, reliably low cost that's not dogged by
             | token speculation, low ecological impact.
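
A minimal sketch of what "multiple registry methods" means in practice, assuming the two methods named above: a did:plc identifier is looked up at plc.directory, while a did:web identifier is looked up on the domain it names. The function name `did_document_url` is hypothetical, and path-based did:web identifiers are deliberately left out.

```python
from urllib.parse import unquote

# The central registry discussed in this thread.
PLC_DIRECTORY = "https://plc.directory"


def did_document_url(did: str) -> str:
    """Return the HTTPS URL where the DID document would be fetched (sketch)."""
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[2]:
        raise ValueError(f"not a DID: {did!r}")
    method, identifier = parts[1], parts[2]
    if method == "plc":
        # did:plc is resolved by asking the directory for the full DID.
        return f"{PLC_DIRECTORY}/{did}"
    if method == "web":
        # did:web (hostname-only form) lives at a well-known path on the domain.
        if ":" in identifier:
            raise ValueError("path-based did:web is not handled in this sketch")
        return f"https://{unquote(identifier)}/.well-known/did.json"
    raise ValueError(f"unsupported DID method: {method}")
```

The identifier itself does not change if the lookup rule for its method changes, which is the property the PLC design relies on.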
        
               | jazzyjackson wrote:
               | Hey pfraze, forgive my ignorance but what role does DID
               | serve that DNS doesn't? My favorite part about bsky is
               | using TXT record to prove that I control my domain for
               | username purposes, what's the downside to just generating
               | a keypair, and using the fingerprint of the public key as
               | my identity? (Maybe with some affordance for key rotation
               | vis a vis KERI*) Not doubting youall weighed every
               | possibility, just wondering what I'm missing
               | 
               | *Key Event Receipt Infrastructure
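
For readers unfamiliar with the TXT-record trick mentioned above, here is a rough sketch. It assumes the atproto convention of a TXT record at `_atproto.<handle>` whose value is `did=<DID>`, and it treats conflicting records as a failed lookup. The helper names are hypothetical, and the DNS query itself is left out, so the functions only work on values you have already fetched.

```python
from typing import List, Optional


def atproto_txt_name(handle: str) -> str:
    """DNS name that would be queried for a handle's TXT record."""
    return "_atproto." + handle.lstrip("@").rstrip(".").lower()


def did_from_txt(values: List[str]) -> Optional[str]:
    """Pick the DID out of TXT record values, or None if absent or ambiguous."""
    prefix = "did="
    dids = set()
    for value in values:
        value = value.strip()
        # Ignore unrelated TXT records (SPF, site verification, and so on).
        if value.startswith(prefix) and value[len(prefix):].startswith("did:"):
            dids.add(value[len(prefix):])
    # Exactly one distinct DID is required; anything else counts as a failure.
    return dids.pop() if len(dids) == 1 else None
```

The handle is only a pointer: whoever controls the domain's DNS decides which DID it points at, and the DID is what the rest of the network keys on.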
        
               | steveklabnik wrote:
               | Not Paul, but DID is a stable ID over time, whereas dns
               | is not. This lets you change your handle without the
               | network losing track of who you are. I was
               | @steveklabnik.bsky.social before I was @steveklabnik.com,
               | and when I made the switch, all of my previous stuff was
               | still there.
               | 
               | This is a fun party trick in some sense, but also a real
               | meaningful feature in another. If I ever decide to move
               | from steveklabnik.com to steve.klabnik.com, a thing I
               | have been considering for a few years, my stuff on
               | @proto/Bluesky will be one of the only services that
               | doesn't have the issue that's kept me from pulling the
               | trigger: updating the entire world that that's where I am
               | now.
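
The indirection described above can be modelled in a few lines. This is a toy illustration, not how any Bluesky component stores data: the `Network` class and its method names are made up, and the DID is a placeholder. The point is only that content is keyed by the DID, so changing the handle does not orphan anything.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Network:
    handle_of: Dict[str, str] = field(default_factory=dict)       # DID -> current handle
    posts_by: Dict[str, List[str]] = field(default_factory=dict)  # DID -> posts

    def post(self, did: str, text: str) -> None:
        self.posts_by.setdefault(did, []).append(text)

    def change_handle(self, did: str, new_handle: str) -> None:
        # Only the pointer changes; posts stay attached to the DID.
        self.handle_of[did] = new_handle

    def timeline(self, handle: str) -> List[str]:
        for did, current in self.handle_of.items():
            if current == handle:
                return list(self.posts_by.get(did, []))
        return []


net = Network()
net.change_handle("did:plc:example", "steveklabnik.bsky.social")
net.post("did:plc:example", "hello")
net.change_handle("did:plc:example", "steveklabnik.com")
```

After the second `change_handle`, looking up the new handle still returns the earlier post, while the old handle no longer resolves to anything.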
        
               | pfraze wrote:
               | Yes! And if this were not the case then account
               | portability between PDS hosts would be really
               | challenging. Same logic as keeping your phone number when
               | you switch cell carriers
        
               | kiitos wrote:
               | DIDs are stable only in the context of a specific
               | 'verifiable data registry' as the spec puts it.
               | 
               | https://www.w3.org/TR/did-core/#dfn-verifiable-data-
               | registry
               | 
               | DIDs delegate trust and authority to a data registry, in
               | exactly the same way that DNS delegates trust and
               | authority to ~ICANN.
               | 
               | The system model is exactly the same. The difference is
               | only in the properties of the authoritative entity.
        
               | steveklabnik wrote:
               | That's a good point: I was speaking in a more social
               | manner. Because domains are human-readable, they tend to
               | be used for humans. Bluesky could have chosen to just use
               | domains, but I personally prefer that we have the
               | additional layer of indirection. Plus like, you have the
               | ability (at the low level, not really exposed in the UI
               | in any meaningful way) to be multiple people: I can
               | associate multiple domains with my DID.
               | 
               | That said, you're not wrong that a registry is a
               | registry.
        
               | Kye wrote:
               | >> _" What can I say, some things take time and
               | stakeholder-building."_
               | 
               | The ongoing WordPress fiasco is a good sign of what
               | happens when you set up an independent organization too
               | soon. You won't have the people or the commitments from
               | those people to maintain that independence, so the
               | independent thing ends up not being able to do anything
               | to protect the thing that was supposed to be independent
               | from the commercial interests looking to exploit it.
        
         | jervant wrote:
         | How are Direct Messages implemented in Bluesky if anyone can
         | access a firehose of all network activity?
        
           | __justplaying wrote:
           | DMs are currently 1:1 only and closed source. They are
           | working on/planning to build proper E2EE DMs that support
           | group chats.
        
         | moreati wrote:
         | What's in that 4.5 TB? e.g. message metadata? Message text?
         | Media?
         | 
         | What time window does it cover? A rolling N day window?
         | Everything since year dot?
         | 
         | Can it be pruned? e.g. only data of accounts followed or
         | messages interacted with
        
         | mintplant wrote:
         | What's the difference between social-app and the AppView?
        
           | pfraze wrote:
           | social-app is the client side, AppView is the backend api
           | surface
        
       | __justplaying wrote:
       | How do I ask the mods to swap out the link to the actual post
       | instead of my blog's front page?
       | 
       | (...also, the title, as the original has the caveat)
        
         | paulgb wrote:
         | @dang a better URL would be
         | https://alice.bsky.sh/post/3laega7icmi2q
         | 
         | (I can't tell if Dan has an alert set up on his handle or
         | whether he just sees everything, but hopefully that works :))
        
           | __justplaying wrote:
           | thanks!
        
           | yorwba wrote:
           | dang doesn't have an alert and he doesn't see everything.
           | https://news.ycombinator.com/item?id=41317232 The official
           | way to contact the mods is in the footer, i.e. email
           | hn@ycombinator.com
        
             | __justplaying wrote:
             | will email, thanks
        
             | paulgb wrote:
             | Ah thanks, good to know. I guess I've just been lucky with
             | it and developed a superstition that it works.
        
               | timerol wrote:
               | He is also extremely active here, so there's a good
               | chance he reads and responds to a random comment without
               | an email. But email is the approved (and fastest) way to
               | go about it
        
         | Jtsummers wrote:
         | It's likely the correct page was submitted. The correct page
         | includes a canonical link in the HTML:
         | 
         |     <link rel="canonical" href="https://alice.bsky.sh"/>
         | 
         | HN will replace submission links with the canonical link if
         | it's found.
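
A sketch of the behaviour described above, using only the Python standard library. HN's actual implementation is not shown in this thread, so the class and function names here are hypothetical and the logic is an assumption: take the first `<link rel="canonical">` with an `href`, and fall back to the submitted URL otherwise.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, in any case.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonical = attr["href"]


def canonical_url(html: str, submitted: str) -> str:
    """Return the page's canonical URL if it declares one, else the submitted URL."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical or submitted
```

Fed the tag quoted above, this would turn a submission of the post URL into the blog's front page, which matches what happened to this submission.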
        
           | __justplaying wrote:
           | oh. time to look at the code of my blog...
        
         | dang wrote:
         | Fixed now!
        
       | 98codes wrote:
       | This is all academic for me until Bluesky gets the functionality
       | to get an account back onto their main network, for DR if not
       | peace of mind that an "undo" is possible.
        
         | diggan wrote:
         | Totally understandable. Personally I don't use Bluesky for
         | anything vital, it's just data that the world wouldn't be
         | better/worse without anyways, so I'm gonna go and give it a try
         | even if there is no undo.
         | 
         | I love that people even have the choice, so much better than not
         | even being able to.
        
       | sureglymop wrote:
       | It's great that you wrote this up!
       | 
       | One thing I have found with many open source/selfhostable
       | projects is just how much running them yourself can vary. It can
       | go from a simple compose file with everything included to having
       | to dig for obscure services and piece together how they all form
       | the whole.
       | 
       | For example, I recently looked into self hosting Zotero. It is so
       | under documented and complex that there is almost no way one
       | could self host that (even for just one user) without that being
       | one's job. So one needs to make a distinction between something
       | being open source and being feasible to use/maintain.
       | 
       | In the end I gave up with Zotero. Even though it could have
       | replaced Obsidian Notes, Calibre and Syncthing all at once for
       | me.
        
         | __justplaying wrote:
         | Self-hosting/mirroring all these Bluesky components is
         | currently a mixed bag as well though honestly the only outlier
         | is the Relay, which is a beast. i currently have my copy of the
         | PLC, a Jetstream with 2 days of data and a clone of the app on
         | my laptop i play with sometimes and/or change things for an
         | elaborate shitpost of Bluesky Nitro
         | https://bsky.app/profile/alice.mosphere.at/post/3l7bpmmtiop2...
         | 
         | I don't self-host my PDS yet because there is no migration path
         | back yet (but there will be). Though maybe I'll just yolo one
         | day and do it anyways.
        
         | diggan wrote:
         | > For example, I recently looked into self hosting Zotero. It
         | is so under documented and complex that there is almost no way
         | one could self host that
         | 
         | I've come across this a lot too. But what I've found is that it
         | mostly applies to open source projects that offer a hosted paid
         | version, so it kind of makes sense they'll make the experience
         | slightly worse than it could be (consciously or
         | subconsciously), as it pushes people to their hosted solution.
         | I don't particularly like it though.
         | 
         | Doesn't seem to be the case for Zotero specifically, but your
         | comment reminded me that I've noticed this more often lately.
        
           | sbarre wrote:
           | Yeah I tend to use ease of install for community editions of
           | hosted paid open source projects as the leading indicator of
           | how seriously they invest in (and support) their
           | free/community version..
        
         | elashri wrote:
         | > For example, I recently looked into self hosting Zotero. It
         | is so under documented and complex that there is almost no way
         | one could self host that (even for just one user) without that
         | being ones job. So one needs to make a distinction between
         | something being open source and being feasible to use/maintain
         | 
         | Just for the benefit of anyone who wants to go down this
         | rabbit hole: you cannot self-host Zotero. In theory you can, but
         | in practice it is not feasible. If you find their free storage
         | limiting then store them on webdav (all clients support that).
         | 
         | The Zotero team explicitly said that they don't see this as a
         | priority [1], and with the release of Zotero 7 and the transition
         | it is not realistic to think they ever will.
         | 
         | [1]
         | https://github.com/zotero/dataserver/issues/105#issuecomment...
        
       | heavensteeth wrote:
       | This site is _extremely_ snappy. Good work.
        
         | __justplaying wrote:
         | Thanks! Its code is available at
         | https://github.com/aliceisjustplaying/whtwnd-blog, I intend to
         | turn this into the template as the posts are stored on my PDS,
         | on ATProto, using WhtWnd https://whtwnd.com/
         | 
         | (And all of this is a fork of my friend Samuel's blog,
         | https://mozzius.dev, see
         | https://github.com/mozzius/mozzius.dev)
        
       | ck2 wrote:
       | I found it interesting it's almost impossible, very difficult to
       | get real Bluesky stats
       | 
       | This site tries but has limits:
       | 
       | * https://bsky.jazco.dev/stats
       | 
       | They broke 14 million yesterday and it seems to be snowballing
       | now since the election:
       | 
       | * https://bsky.app/profile/jaz.bsky.social/post/3laetwhztdk2x
        
         | __justplaying wrote:
         | https://bskycharts.edavis.dev/ is a good starting point for a
         | number of charts
        
       | jazzyjackson wrote:
       | Is it feasible to run a bluesky instance "on prem" and "offline"
       | for instance as an airgapped corporate intranet ?
        
         | nisten wrote:
         | Great do I have to setup LDAP , oauth, and troubleshoot
         | corporate-style single-signon systems for the next 6 months
         | just to get a chat server running now....
        
         | elfprince13 wrote:
         | I think if you replaced the plc directory with a corporate
         | domain that would be pretty straightforward?
        
       | zzyzxd wrote:
       | Selfhosting is my hobby but I am also an SRE. I am hesitant to do
       | this because the instruction is "too easy" -- "Simply open your
       | firewall, download and run this installer.sh with sudo on your
       | server and that's it!"[1].
       | 
       | How do I secure the webserver and the data? Where is the data on
       | my disk? How to backup and restore? High availability?
       | 
       | There might be detailed documentation somewhere, or I can even
       | read the code. But these are the important things an open source
       | software should tell its users right off the bat.
       | 
       | 1: https://github.com/bluesky-social/pds/blob/main/README.md
        
         | dawnerd wrote:
         | I was about to do this as well but their installer sketched me
         | out. Why can't it just be some easy to follow docker
         | instructions? They use docker too but instructions to set it up
         | on your own is basically "read the installer script".
         | 
         | Meanwhile mastodon is incredibly easy to self host w/ relays.
        
           | benharri wrote:
           | i certainly wouldn't say mastodon is easy to self host
        
           | j45 wrote:
           | An installer script is often an early step, and much better
           | than nothing... as well as a step towards docker.
           | 
           | Here's the kicker, the install script could be called from a
           | Dockerfile pretty easily, no? Sure, there might be things to
           | sort out, but it doesn't seem unreasonable.
           | 
           | I agree having a docker image is super handy and can be quick
           | to try, as well as update, and put into a larger self-hosted
           | environment how you need.
        
         | freedomben wrote:
         | Same, exactly. I would so much rather be given a docker-compose
         | or k8s yaml along with some other tidbits like how to run
         | migrations and stuff, than get a bash script I can just run.
         | I've been doing this long enough to know that it's not initial
         | setup work that really matters, it's the upgrade and
         | backup/restore story that really matters. If your bash script
         | just pulls and runs a docker container or something then cool,
         | but if it's doing much more than that then that's a big red
         | flag to me.
        
           | diggan wrote:
           | Here:
           | 
           | - https://raw.githubusercontent.com/bluesky-
           | social/pds/main/in... has all the expected outside docker-
           | compose setup, you can read it through in like 5 minutes
           | 
           | - Heavy-duty part of the setup is running
           | https://raw.githubusercontent.com/bluesky-
           | social/pds/main/co... which you should be familiar with
           | 
           | I guess the shellscript is for people who want a one-line
           | install, which I wouldn't do myself either, but I guess some
           | people prefer.
        
             | zzyzxd wrote:
             | The script even installs docker with apt by itself (which,
             | I think, is the only reason they require Ubuntu as the OS
             | -- to not deal with any other package manager
             | variants)... I mean, why? Just let people install docker
             | however they like! If you don't even trust your users to
             | install a container runtime, who's your target audience
             | really?
             | 
             | It's also overcomplicated, like, it even tries to handle
             | race conditions between multiple apt processes! What kind of
             | environment do they expect users to have? As the project
             | becomes more popular, the script will need to handle more
             | edge cases. Let's see if it is still a 5-minute read one
             | year later.
             | 
             | > I guess the shellscript is for people who want a one-line
             | install, which I wouldn't do myself either, but I guess
             | some people prefer.
             | 
             | This is the problem in lots of open source projects --
             | providing a one-liner installer and bragging about how easy
             | the initial setup is, without an easy path for long term
             | maintenance. Give it some time, many happy users of the
             | one-liner will be unhappy when they encounter issues.
        
         | j45 wrote:
         | This message is for anyone who might find trying self-hosting
         | intimidating.
         | 
         | Like hosting an application in the cloud, you also will never
         | stop improving how you self-host.
         | 
         | If there are unanswered questions about a software package,
         | that often could be reflected in your self-hosting environment too.
         | 
         | Running this type of an installer is excellent to quickly
         | introduce yourself to any technology - to then start learning
         | about how you want to run it long term.
         | 
         | The questions expressed above are not new. How SREs solve them
         | today can also be different and more complicated than needed.
         | 
         | Easy answer - if they have an install script, it's getting run
         | inside a VM, or Docker which itself is a baseline backup and HA
         | automatically if needed.
         | 
         | If generally anything is run inside of a self-hosted hypervisor
         | like Proxmox, it can be setup to automatically backup, mirror,
         | HA as-is, while you figure out what you want. This includes
         | running docker inside a Proxmox VM, there is not a big
         | performance hit anymore for doing this for things that are
         | largely idle most of the time.
         | 
         | There is a big difference between SaaS, PaaS, and IaaS. It's
         | easier and easier to get the benefits from all three by being
         | willing to build up the foundation instead of pointing at the
         | gaps in each package for not filling it for you.
         | 
         | It's encouraging to see things becoming more possible :)
        
         | diggan wrote:
         | > How do I secure the webserver and the data? Where is the data
         | on my disk? How to backup and restore? High availability?
         | 
         | I feel like that's kind of out of scope for an article
         | describing the steps for application/protocol specific
         | infrastructure. You need to look for resources, guides and such
         | for general self-hosting instead, somewhere else.
         | 
         | For example, if you use TrueNAS SCALE/Unraid/Proxmox or
         | whatever for local self-hosting, you'd setup those things via
         | those platforms. If you use Kubernetes/Nomad/Incus/Containers,
         | you'd solve those things via that tooling.
        
       | nisten wrote:
       | Is the actual guide just this <400 word thing, or is it all those
       | 15 different links on the post, or only some of them....
       | 
       | Does that... bureaucracy of documentation not infuriate anyone
       | else or is it just me. I guess I'll try and reset my password to
       | bluesky website, assuming it's this .app one, but then it's
       | asking me to maybe select a provider ... of my password.
       | 
       | Does whoever made this user experience not have enough
       | emotional intelligence to realize how infuriating it is?
        
         | __justplaying wrote:
         | This was a quick and dirty post I put together primarily for
         | people who are already on Bluesky and have dev experience, and
         | peppered with appropriate links where you have actual guides
         | and/or documentation for each bit.
        
         | steveklabnik wrote:
         | > I guess I'll try and reset my password to bluesky website,
         | assuming it's this .app one, but then it's asking me to maybe
         | select a provider ... of my password.
         | 
         | It's asking what the host of your data is. If you're not
         | running your own server, then the default value of Bluesky
         | itself is the correct one.
        
       | mdaniel wrote:
       | Also, yesterday someone posted[1] https://frontpage.fyi/ which
       | seems like it's predominately Bluesky/ATprotocol news but since
       | both of those interest me, if this blog link interests you then
       | so might that link. It logs in with Bsky oauth2 federation
       | 
       | 1: https://news.ycombinator.com/item?id=42081210
        
       | elfprince13 wrote:
       | but I thought that Bluesky wasn't meaningfully distributed /s
        
       ___________________________________________________________________
       (page generated 2024-11-08 23:01 UTC)