[HN Gopher] How to self-host all of Bluesky except the AppView (...
___________________________________________________________________
How to self-host all of Bluesky except the AppView (for now)
Author : icy
Score : 104 points
Date : 2024-11-08 13:13 UTC (9 hours ago)
(HTM) web link (alice.bsky.sh)
(TXT) w3m dump (alice.bsky.sh)
| __justplaying wrote:
| author here, should you have questions!
| theschmed wrote:
| Thanks for making yourself available to answer questions!
| Hopefully this is not a dumb question.
|
| Is plc.directory a single point of failure for BlueSky users
| who want to take advantage of the benefits of a did:plc? And if
| so, is that a permanent thing or down the road will there be
| multiple interoperating did:plc directories?
| __justplaying wrote:
| yes it's a SPOF. not sure about the second question, but i do
| know there are plans to transfer its ownership to an
| independent foundation
| pfraze wrote:
| Transferring to an independent org is what we're talking
| about now, yes.
|
| The backstory to PLC is that we picked up the DID standard
| and looked for an existing registry-method that would
| satisfy requirements [1]. None of them really did. We then
| surveyed mechanisms for decentralized operation: DHTs, open
| blockchains, permissioned blockchains, and federated
| databases. Of them, the two blockchain variants seemed
| perhaps promising, but still premature since (as of 2022)
| there's cost variability due to load and in some cases
| bad transaction latency (e.g. 10 minutes).
|
| We decided the best decision was to create PLC, which
| matches all of the requirements except for longterm meta
| governance. The way we designed it was to make the registry
| mechanics transferrable to a different protocol in the
| future, so that if for instance we decided (say) a DHT was
| suitable (it's not) we'd be able to use the same
| identifiers but change resolution and mutations to a new
| process. Then we started talking to other SMEs to get their
| take.
|
| Ultimately the solution that's gotten the most favorable
| response has been setting up an ICANN-style independent
| organization to operate it. This can be joined with a
| couple of interesting systems, such as mirrors which tail a
| certificate-transparency-style audit log, and which could
| even serve as transaction witnesses to indicate when the
| core registry might be rejecting updates ("write
| censorship").
|
| What can I say, some things take time and stakeholder-
| building. Look up the history of DNS and Network Solutions
| Inc for a bit of a wild ride that people have forgotten
| about. One other thing I should point out is that the DID
| spec enables multiple registry methods. Atproto currently
| supports did:web, and if other methods show up which
| satisfy the requirements then we are interested.
|
| [1] Secure against manipulation by the registry operators,
| longterm meta governance, highly available, reasonable
| transaction latency, reliably low cost that's not dogged by
| token speculation, low ecological impact.
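
The mirror idea described above (tailing a certificate-transparency-style audit log) can be illustrated with a hash-chained, append-only log. This is a minimal sketch under stated assumptions, not the PLC implementation: the entry fields, the `did:example:` identifiers, and the function names are all hypothetical, and it only shows how a mirror could detect rewritten history, not how witnesses would detect rejected updates.

```python
import hashlib
import json


def entry_hash(entry: dict) -> str:
    """Hash an entry deterministically (sorted keys, compact separators)."""
    encoded = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def append(log: list, did: str, operation: str) -> None:
    """Append an operation that commits to the hash of the previous entry."""
    prev = entry_hash(log[-1]) if log else None
    log.append({"did": did, "op": operation, "prev": prev})


def verify_chain(log: list) -> bool:
    """Replay the log the way a mirror might, checking every back-pointer."""
    prev = None
    for entry in log:
        if entry["prev"] != prev:
            return False
        prev = entry_hash(entry)
    return True


log = []
append(log, "did:example:alice", "create")
append(log, "did:example:alice", "rotate-key")
print(verify_chain(log))  # True

# Rewriting an earlier entry breaks every later back-pointer.
log[0]["op"] = "tampered"
print(verify_chain(log))  # False
```

Because each entry commits to the one before it, a mirror that has already seen the log can notice if the registry later presents a different history.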
| jazzyjackson wrote:
| Hey pfraze, forgive my ignorance but what role does DID
| serve that DNS doesn't? My favorite part about bsky is
| using TXT record to prove that I control my domain for
| username purposes, what's the downside to just generating
| a keypair, and using the fingerprint of the public key as
| my identity? (Maybe with some affordance for key rotation
| vis a vis KERI*) Not doubting you all weighed every
| possibility, just wondering what I'm missing
|
| *Key Event Receipt Infrastructure
| steveklabnik wrote:
| Not Paul, but DID is a stable ID over time, whereas dns
| is not. This lets you change your handle without the
| network losing track of who you are. I was
| @steveklabnik.bsky.social before I was @steveklabnik.com,
| and when I made the switch, all of my previous stuff was
| still there.
|
| This is a fun party trick in some sense, but also a real
| meaningful feature in another. If I ever decide to move
| from steveklabnik.com to steve.klabnik.com, a thing I
| have been considering for a few years, my stuff on
| @proto/Bluesky will be one of the only services that
| doesn't have the issue that's kept me from pulling the
| trigger: updating the entire world that that's where I am
| now.
| pfraze wrote:
| Yes! And if this were not the case then account
| portability between PDS hosts would be really
| challenging. Same logic as keeping your phone number when
| you switch cell carriers
| kiitos wrote:
| DIDs are stable only in the context of a specific
| 'verifiable data registry' as the spec puts it.
|
| https://www.w3.org/TR/did-core/#dfn-verifiable-data-registry
|
| DIDs delegate trust and authority to a data registry, in
| exactly the same way that DNS delegates trust and
| authority to ~ICANN.
|
| The system model is exactly the same. The difference is
| only in the properties of the authoritative entity.
| steveklabnik wrote:
| That's a good point: I was speaking in a more social
| manner. Because domains are human-readable, they tend to
| be used for humans. Bluesky could have chosen to just use
| domains, but I personally prefer that we have the
| additional layer of indirection. Plus like, you have the
| ability (at the low level, not really exposed in the UI
| in any meaningful way) to be multiple people: I can
| associate multiple domains with my DID.
|
| That said, you're not wrong that a registry is a
| registry.
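
The "stable ID with a layer of indirection" point in this subthread can be shown with a toy model. It is only a sketch: `ToyRegistry`, `Account`, the `did:example:` identifier, and the host names are made up for illustration and do not reflect how plc.directory or any PDS stores data. The idea is that content is keyed by the DID, so changing the handle or the host does not orphan it.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Account:
    handle: str    # human-readable and mutable
    pds_host: str  # where the data currently lives, also mutable


@dataclass
class ToyRegistry:
    """Hypothetical in-memory stand-in for a DID registry."""
    accounts: Dict[str, Account] = field(default_factory=dict)

    def resolve_handle(self, handle: str) -> Optional[str]:
        """Return the DID currently associated with a handle, if any."""
        for did, account in self.accounts.items():
            if account.handle == handle:
                return did
        return None


registry = ToyRegistry()
did = "did:example:abc123"  # made-up identifier
registry.accounts[did] = Account("alice.bsky.social", "pds-a.example")

# Posts are keyed by DID, not by handle or host.
posts = {did: ["hello world"]}

# Change the handle, then move to a different host.
registry.accounts[did].handle = "alice.example.com"
registry.accounts[did].pds_host = "pds-b.example"

resolved = registry.resolve_handle("alice.example.com")
print(resolved)         # did:example:abc123
print(posts[resolved])  # ['hello world']
```

As kiitos notes, this only moves the trust question to whoever runs the registry; the sketch says nothing about who that should be.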
| Kye wrote:
| > "What can I say, some things take time and
| stakeholder-building."
|
| The ongoing WordPress fiasco is a good sign of what
| happens when you set up an independent organization too
| soon. You won't have the people or the commitments from
| those people to maintain that independence, so the
| independent thing ends up not being able to do anything
| to protect the thing that was supposed to be independent
| from the commercial interests looking to exploit it.
| jervant wrote:
| How are Direct Messages implemented in Bluesky if anyone can
| access a firehose of all network activity?
| __justplaying wrote:
| DMs are currently 1:1 only and closed source. They are
| working on/planning to build proper E2EE DMs that support
| group chats.
| moreati wrote:
| What's in that 4.5 TB? e.g. message metadata? Message text?
| Media?
|
| What time window does it cover? A rolling N day window?
| Everything since year dot?
|
| Can it be pruned? e.g. only data of accounts followed or
| messages interacted with
| mintplant wrote:
| What's the difference between social-app and the AppView?
| pfraze wrote:
| social-app is the client side, AppView is the backend api
| surface
| __justplaying wrote:
| How do I ask the mods to swap out the link to the actual post
| instead of my blog's front page?
|
| (...also, the title, as the original has the caveat)
| paulgb wrote:
| @dang a better URL would be
| https://alice.bsky.sh/post/3laega7icmi2q
|
| (I can't tell if Dan has an alert set up on his handle or
| whether he just sees everything, but hopefully that works :))
| __justplaying wrote:
| thanks!
| yorwba wrote:
| dang doesn't have an alert and he doesn't see everything.
| https://news.ycombinator.com/item?id=41317232 The official
| way to contact the mods is in the footer, i.e. email
| hn@ycombinator.com
| __justplaying wrote:
| will email, thanks
| paulgb wrote:
| Ah thanks, good to know. I guess I've just been lucky with
| it and developed a superstition that it works.
| timerol wrote:
| He is also extremely active here, so there's a good
| chance he reads and responds to a random comment without
| an email. But email is the approved (and fastest) way to
| go about it
| Jtsummers wrote:
| It's likely the correct page was submitted. The correct page
| includes a canonical link in the HTML: <link
| rel="canonical" href="https://alice.bsky.sh"/>
|
| HN will replace submission links with the canonical link if
| it's found.
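
To make the canonical-link behavior described above concrete, here is a small sketch of how a submission URL could be swapped for a page's canonical link. It is not HN's code (which this thread does not show); `CanonicalFinder` and `find_canonical` are hypothetical names, and the page snippet is the tag quoted in the comment.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = '<html><head><link rel="canonical" href="https://alice.bsky.sh"/></head></html>'
submitted = "https://alice.bsky.sh/post/3laega7icmi2q"

# Prefer the canonical link when the page declares one.
print(find_canonical(page) or submitted)  # https://alice.bsky.sh
```

With a canonical tag that points at the front page, every post on the blog collapses to the same URL, which matches what happened to this submission.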
| __justplaying wrote:
| oh. time to look at the code of my blog...
| dang wrote:
| Fixed now!
| 98codes wrote:
| This is all academic for me until Bluesky gets the functionality
| to get an account back onto their main network, for DR if not
| peace of mind that an "undo" is possible.
| diggan wrote:
| Totally understandable. Personally I don't use Bluesky for
| anything vital, it's just data that the world wouldn't be
| better/worse without anyways, so I'm gonna go and give it a try
| even if there is no undo.
|
| I love that people even have the choice, so much better than not
| even being able to.
| sureglymop wrote:
| It's great that you wrote this up!
|
| One thing I have found with many open source/selfhostable
| projects is just how much running them yourself can vary. It can
| go from a simple compose file with everything included to having
| to dig for obscure services and piece together how they all form
| the whole.
|
| For example, I recently looked into self hosting Zotero. It is so
| underdocumented and complex that there is almost no way one
| could self host that (even for just one user) without that being
| one's job. So one needs to make a distinction between something
| being open source and being feasible to use/maintain.
|
| In the end I gave up with Zotero. Even though it could have
| replaced Obsidian Notes, Calibre and Syncthing all at once for
| me.
| __justplaying wrote:
| Self-hosting/mirroring all these Bluesky components is
| currently a mixed bag as well though honestly the only outlier
| is the Relay, which is a beast. i currently have my copy of the
| PLC, a Jetstream with 2 days of data and a clone of the app on
| my laptop i play with sometimes and/or change things for an
| elaborate shitpost of Bluesky Nitro
| https://bsky.app/profile/alice.mosphere.at/post/3l7bpmmtiop2...
|
| I don't self-host my PDS yet because there is no migration path
| back yet (but there will be). Though maybe I'll just yolo one
| day and do it anyways.
| diggan wrote:
| > For example, I recently looked into self hosting Zotero. It
| is so underdocumented and complex that there is almost no way
| one could self host that
|
| I've come across this a lot too. But what I've found is that it
| mostly applies to open source projects that offer a hosted paid
| version, so it kind of makes sense they'll make the experience
| slightly worse than it could be (consciously or
| subconsciously), as it pushes people to their hosted solution.
| I don't particularly like it though.
|
| Doesn't seem to be the case for Zotero specifically, but your
| comment reminded me that I've noticed this more often lately.
| sbarre wrote:
| Yeah I tend to use ease of install for community editions of
| hosted paid open source projects as the leading indicator of
| how seriously they invest in (and support) their
| free/community version..
| elashri wrote:
| > For example, I recently looked into self hosting Zotero. It
| is so underdocumented and complex that there is almost no way
| one could self host that (even for just one user) without that
| being one's job. So one needs to make a distinction between
| something being open source and being feasible to use/maintain
|
| Just for the benefit of anyone who wants to go down this
| rabbit hole: you cannot self-host Zotero. In theory you can, but in
| practice it is not feasible. If you find their free storage
| limiting then store them on WebDAV (all clients support that).
|
| The Zotero team explicitly said that they don't see this as a
| priority [1], and with the release of Zotero 7 and transition it
| is not realistic to think they ever will.
|
| [1]
| https://github.com/zotero/dataserver/issues/105#issuecomment...
| heavensteeth wrote:
| This site is _extremely_ snappy. Good work.
| __justplaying wrote:
| Thanks! Its code is available at
| https://github.com/aliceisjustplaying/whtwnd-blog, I intend to
| turn this into the template as the posts are stored on my PDS,
| on ATProto, using WhtWnd https://whtwnd.com/
|
| (And all of this is a fork of my friend Samuel's blog,
| https://mozzius.dev, see
| https://github.com/mozzius/mozzius.dev)
| ck2 wrote:
| I found it interesting that it's almost impossible, or at least very
| difficult, to get real Bluesky stats
|
| This site tries but has limits:
|
| * https://bsky.jazco.dev/stats
|
| They broke 14 million yesterday and it seems to be snowballing
| now since the election:
|
| * https://bsky.app/profile/jaz.bsky.social/post/3laetwhztdk2x
| __justplaying wrote:
| https://bskycharts.edavis.dev/ is a good starting point for a
| number of charts
| jazzyjackson wrote:
| Is it feasible to run a bluesky instance "on prem" and "offline"
| for instance as an airgapped corporate intranet ?
| nisten wrote:
| Great, do I have to set up LDAP, OAuth, and troubleshoot
| corporate-style single-signon systems for the next 6 months
| just to get a chat server running now....
| elfprince13 wrote:
| I think if you replaced the plc directory with a corporate
| domain that would be pretty straightforward?
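
If the suggestion above is read as using did:web identifiers on an internal domain instead of did:plc, resolution becomes a plain HTTPS fetch from that domain. The sketch below maps a did:web identifier to the URL of its DID document following the did:web method's rules; the function name and the `intranet.corp.example` host are hypothetical, and whether a given atproto deployment accepts every form shown is not covered by this thread.

```python
from urllib.parse import unquote


def did_web_to_url(did: str) -> str:
    """Map a did:web identifier to the HTTPS URL of its DID document."""
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError(f"not a did:web identifier: {did!r}")
    parts = did[len(prefix):].split(":")
    if not parts[0]:
        raise ValueError("missing domain")
    host = unquote(parts[0])  # "%3A" encodes an optional port
    path = [unquote(p) for p in parts[1:]]
    if path:
        # Extra colon-separated segments become URL path segments.
        return f"https://{host}/{'/'.join(path)}/did.json"
    return f"https://{host}/.well-known/did.json"


print(did_web_to_url("did:web:intranet.corp.example"))
# https://intranet.corp.example/.well-known/did.json
```

In an air-gapped network the host only needs to be reachable internally, so no external directory is involved in this step.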
| zzyzxd wrote:
| Selfhosting is my hobby but I am also an SRE. I am hesitant to do
| this because the instruction is "too easy" -- "Simply open your
| firewall, download and run this installer.sh with sudo on your
| server and that's it!"[1].
|
| How do I secure the webserver and the data? Where is the data on
| my disk? How to backup and restore? High availability?
|
| There might be detailed documentation somewhere, or I can even
| read the code. But these are the important things an open source
| software should tell its users right off the bat.
|
| 1: https://github.com/bluesky-social/pds/blob/main/README.md
| dawnerd wrote:
| I was about to do this as well but their installer sketched me
| out. Why can't it just be some easy to follow docker
| instructions? They use docker too but instructions to set it up
| on your own are basically "read the installer script".
|
| Meanwhile mastodon is incredibly easy to self host w/ relays.
| benharri wrote:
| i certainly wouldn't say mastodon is easy to self host
| j45 wrote:
| An installer script is often an early step, and much better
| than nothing... as well as a step towards docker.
|
| Here's the kicker, the install script could be called from a
| Dockerfile pretty easily, no? Sure, there might be things to
| sort out, but it doesn't seem unreasonable.
|
| I agree having a docker image is super handy and can be quick
| to try, as well as update, and put into a larger self-hosted
| environment how you need.
| freedomben wrote:
| Same, exactly. I would so much rather be given a docker-compose
| or k8s yaml along with some other tidbits like how to run
| migrations and stuff, than get a bash script I can just run.
| I've been doing this long enough to know that it's not initial
| setup work that really matters, it's the upgrade and
| backup/restore story that really matters. If your bash script
| just pulls and runs a docker container or something then cool,
| but if it's doing much more than that then that's a big red
| flag to me.
| diggan wrote:
| Here:
|
| - https://raw.githubusercontent.com/bluesky-social/pds/main/in...
| has all the expected outside docker-compose setup, you can
| read it through in like 5 minutes
|
| - Heavy-duty part of the setup is running
| https://raw.githubusercontent.com/bluesky-social/pds/main/co...
| which you should be familiar with
|
| I guess the shellscript is for people who want a one-line
| install, which I wouldn't do myself either, but I guess some
| people prefer.
| zzyzxd wrote:
| The script even installs docker with apt by itself (which,
| I think, is the only reason they require Ubuntu as the OS
| -- to not have to deal with any other package manager
| variants)... I mean, why? Just let people install docker
| however they like! If you don't even trust your users to
| install a container runtime, who's your target audience
| really?
|
| It's also overcomplicated, like, it even tries to handle
| race conditions between multiple apt processes! What kind of
| environment do they expect the users to have? As the project
| becomes more popular, the script will need to handle more
| edge cases. Let's see if it is still a 5-minute read one
| year later.
|
| > I guess the shellscript is for people who want a one-line
| install, which I wouldn't do myself either, but I guess
| some people prefer.
|
| This is the problem in lots of open source projects --
| providing a one-liner installer and bragging about how easy
| the initial setup is, without an easy path for long term
| maintenance. Give it some time, many happy users of the
| one-liner will be unhappy when they encounter issues.
| j45 wrote:
| This message is for anyone who might find trying self-hosting
| intimidating.
|
| Like hosting an application in the cloud, you also will never
| stop improving how you self-host.
|
| If there are unanswered questions about a software package, that
| can often be reflected in your self-hosting environment too.
|
| Running this type of an installer is excellent to quickly
| introduce yourself to any technology - to then start learning
| about how you want to run it long term.
|
| The questions expressed above are not new. How SREs solve them
| today can also be different and more complicated than needed.
|
| Easy answer - if they have an install script, it's getting run
| inside a VM or Docker, which itself gives a baseline for backup
| and HA automatically if needed.
|
| If generally anything is run inside of a self-hosted hypervisor
| like Proxmox, it can be set up to automatically back up, mirror,
| and provide HA as-is, while you figure out what you want. This
| includes running docker inside a Proxmox VM; there is not a big
| performance hit anymore for doing this for things that are
| largely idle most of the time.
|
| There is a big difference between SaaS, PaaS, and IaaS. It's
| easier and easier to get the benefits from all three by being
| willing to build up the foundation instead of pointing at the
| gaps in each package for not filling it for you.
|
| It's encouraging to see things becoming more possible :)
| diggan wrote:
| > How do I secure the webserver and the data? Where is the data
| on my disk? How to backup and restore? High availability?
|
| I feel like that's kind of out of scope for an article
| describing the steps for application/protocol specific
| infrastructure. You need to look elsewhere for resources and
| guides on general self-hosting instead.
|
| For example, if you use TrueNAS SCALE/Unraid/Proxmox or
| whatever for local self-hosting, you'd set up those things via
| those platforms. If you use Kubernetes/Nomad/Incus/Containers,
| you'd solve those things via that tooling.
| nisten wrote:
| Is the actual guide just this <400 word thing, or is it all those
| 15 different links on the post, or only some of them....
|
| Does that... bureaucracy of documentation not infuriate anyone
| else or is it just me. I guess I'll try and reset my password to
| bluesky website, assuming it's this .app one, but then it's
| asking me to maybe select a provider ... of my password.
|
| Does whoever made this user experience not have enough
| emotional intelligence to realize how infuriating it is?
| __justplaying wrote:
| This was a quick and dirty post I put together primarily for
| people who are already on Bluesky and have dev experience, and
| peppered with appropriate links where you have actual guides
| and/or documentation for each bit.
| steveklabnik wrote:
| > I guess I'll try and reset my password to bluesky website,
| assuming it's this .app one, but then it's asking me to maybe
| select a provider ... of my password.
|
| It's asking what the host of your data is. If you're not
| running your own server, then the default value of Bluesky
| itself is the correct one.
| mdaniel wrote:
| Also, yesterday someone posted[1] https://frontpage.fyi/ which
| seems like it's predominately Bluesky/ATprotocol news but since
| both of those interest me, if this blog link interests you then
| so might that link. It logs in with Bsky oauth2 federation
|
| 1: https://news.ycombinator.com/item?id=42081210
| elfprince13 wrote:
| but I thought that Bluesky wasn't meaningfully distributed /s
___________________________________________________________________
(page generated 2024-11-08 23:01 UTC)