[HN Gopher] I Wrote an Activitypub Server in OCaml: Lessons Lear...
___________________________________________________________________
I Wrote an Activitypub Server in OCaml: Lessons Learnt, Weekends
Lost
Author : gopiandcode
Score : 92 points
Date : 2023-04-23 10:56 UTC (1 days ago)
(HTM) web link (gopiandcode.uk)
(TXT) w3m dump (gopiandcode.uk)
| throwaway290 wrote:
| There's also LitePub, though development seems stalled (?)
| yawaramin wrote:
| Was it developed at all? I'm not seeing any business logic in
| the repo: https://hacktivis.me/git/litepub.social/files.html
| riffic wrote:
| link for reference: https://litepub.social/
| SideburnsOfDoom wrote:
| My question is this: if I was to try to hack up an ActivityPub
| server in my platform of choice, how would I know how compliant
| it is? Is there any compliance test suite to verify this?
|
| "Try and load it up in a client app" seems suboptimal.
|
| The "load it up and see" attitude is part of what made parsing and
| rendering HTML so hairy, and compliance test suites helped.
| mariusor wrote:
| There was a suite of tests, that sadly fell to bitrot. One of
| the developers in the community created a parallel application
| that could test implementations, but then this too ended up
| unmaintained[1].
|
| [1] https://github.com/go-fed/testsuite
| nologic01 wrote:
| I found the post well written and informative. Though I am
| clueless about OCaml, it feels as though this would be useful for
| anybody working on a new server implementation in any language
| ecosystem, as it highlights what needs to be done and potential
| bottlenecks.
|
| As for the ActivityPub spec and the currently popular
| implementations, it doesn't take long exposure to the fediverse to
| realise there are some rough edges and historical accidents (e.g.
| Mastodon being the de facto interpretation of the standard).
| Imho, now that there is substantially more mindshare devoted to
| decentralized social, it would be opportune to revisit these
| things and, if needed, revise them before they get baked in.
| mikece wrote:
| I'm looking forward to a solid ActivityPub server written in Go or
| Rust that can run on modest hardware/small resource Docker hosts.
| knjllppppp wrote:
| I've had a go at doing it in Go and the ActivityPub spec is so
| loosely defined that it's just a real challenge if you intend
| to actually unmarshal the JSON you receive
|
| It's not completely impossible but you have to be okay with
| discarding a lot of unknown options or essentially reverse
| engineering the objects used by the servers you are federated
| with
|
| That's not to say it's impossible, I was able to crawl the
| network successfully, but it hints at the reason that Mastodon
| and Pleroma use dynamic languages
|
| I'd be very interested to see a flexible/complete AP
| implementation in any statically typed language
|
| Fwiw WriteFreely is implemented in Go with go-fed but --
| correct me if I'm wrong -- that library seemed more limited to
| me than what Pleroma and Mastodon support
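The unmarshalling problem described above comes largely from fields that may arrive as a bare IRI string, an embedded object, or a list of either. A minimal sketch of one way to cope, assuming you only care about a handful of fields and are happy to park everything else untouched. The function names (`as_list`, `ref_id`, `normalize`) and the choice of "known" fields are hypothetical, not taken from any library mentioned in this thread:

```python
import json
from typing import Any, Optional


def as_list(value: Any) -> list:
    """Wrap a single value in a list; treat a missing value as empty."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def ref_id(value: Any) -> Optional[str]:
    """Return the id of a reference given as a bare IRI or an embedded object."""
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        return value.get("id")
    return None


def normalize(raw: str) -> dict:
    """Decode an activity, keeping every field we do not model under 'extra'."""
    doc = json.loads(raw)
    known = {"id", "type", "actor", "to", "object"}  # illustrative subset only
    return {
        "id": doc.get("id"),
        "types": as_list(doc.get("type")),
        "actor": ref_id(doc.get("actor")),
        "to": [ref_id(v) for v in as_list(doc.get("to"))],
        "object": doc.get("object"),  # left as-is: string, dict, or list
        "extra": {k: v for k, v in doc.items() if k not in known},
    }
```

Keeping the unknown fields in `extra` rather than dropping them means the document can still be stored or forwarded later without losing what the sender put in it.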
| zimpenfish wrote:
| > I'd be very interested to see a flexible/complete AP
| implementation in any statically typed language
|
| Try Honk[1] or GotoSocial[2]?
|
| [1] https://humungus.tedunangst.com/r/honk
| [2] https://github.com/superseriousbusiness/gotosocial
| mariusor wrote:
| Neither is flexible, nor strives for completion. They are
| both implementations that try to map the ActivityPub
| vocabulary on an existing web-application domain.
|
| They are not ActivityPub servers, but web-apps that use the
| ActivityPub vocabulary to federate, which is what I meant
| in the grandparent post when I mentioned the classic
| mistake of ActivityPub implementers. :D
| mariusor wrote:
| I'm surprised you didn't find my library because I managed to
| create a statically typed vocabulary library for Go that maps
| the specification verbatim:
| https://pkg.go.dev/github.com/go-ap/activitypub#Object
|
| It wasn't easy indeed, and it locked me out of some options
| to support execution time vocabulary extensions, but hey, it
| works and it's relatively easy to use.
| knjllppppp wrote:
| I'm surprised I didn't find it, too! My google-fu must be
| getting rusty. Thanks for the link, I'll have to have a
| deep dive when I get the time :)
| mariusor wrote:
| I hope you get to it. The library itself contains more
| than just the vocabulary part and I would be glad of more
| eyeballs on the problem. :D
| yawaramin wrote:
| Here's the implementation described in OP:
| https://github.com/Gopiandcode/ocamlot
|
| OCaml is a statically-typed language. It falls somewhere
| between Go and Haskell on the spectrum of type 'strength'.
| SideburnsOfDoom wrote:
| > I'm looking forward to a solid ActivityPub server written in
| Go or Rust that can run on modest hardware/small resource
|
| The "Lightweight" GoLang ActivityPub server is GoToSocial
| https://github.com/superseriousbusiness/gotosocial
|
| The better-known lightweight servers are Pleroma and fork
| Akkoma, written in Elixir https://akkoma.dev/AkkomaGang/akkoma/
|
| Some of this info I got via:
| https://social.treehouse.systems/@ariadne/110226729543740723
| zimpenfish wrote:
| There's also Honk[1] which is written in Go but has slightly
| wacky source and doesn't support the Mastodon API (but does
| provide an inbuilt web UI.)
|
| [1] https://humungus.tedunangst.com/r/honk
| bgorman wrote:
| OCaml code compiles to native binaries, just like Go/Rust.
| yawaramin wrote:
| Why specifically those languages? Others can also target modest
| hardware/small resource Docker hosts.
| mariusor wrote:
| Well, there is one already as the reference implementation for
| a suite of libraries I wrote. You can find it at
| https://github.com/go-ap/fedbox. (Contributions welcome)
| zimpenfish wrote:
| Does it only support C2S as the API? Are there any clients
| which actually support C2S rather than the Mastodon API?
| mariusor wrote:
| It does support server to server, but currently it does not
| play well with Mastodon due to its limited support of HTTP
| Signatures algorithms. I didn't get bothered enough by this
| yet to actually fix it on my side.
|
| And there are a number of clients that work with this
| specific brand of client to server ActivityPub but I wrote
| all of them. The one that can be seen on the internet is a
| link aggregator similar to HN and (old) reddit, you can
| find a demo instance at https://brutalinks.tech.
| zimpenfish wrote:
| > It does support server to server
|
| Ah, sorry, I should have said "client API" rather than
| just API there.
| mariusor wrote:
| ActivityPub has a section which deals with how clients
| and servers should communicate with each other (called
| Client to Server - C2S - in the spec). So it's the same
| vocabulary and operations with slightly different side
| effects, but most servers don't implement it because it's
| not "specified enough". That's why developers generally
| just use the Mastodon API.
| jeroenhd wrote:
| I think there is (was?) an attempt to rewrite Mastodon into
| Rust but I haven't heard much about it.
|
| A single user Mastodon instance takes an unreasonable amount of
| resources. I don't know if it's just because of Ruby (Gitlab
| has the same problem, so it might just be) or because everyone
| is wasting money on expensive servers, but an RSS feed on
| steroids shouldn't take this much RAM.
| [deleted]
| WorldMaker wrote:
| Mastodon itself is designed for "flagship scale" (given lead
| developers run mastodon.social and mastodon.online, two of
| the biggest instances and the most "dogfooding" two
| instances) so it bundles an entire cluster of services:
| background processors (sidekiq), caches (redis, I think?),
| database server (postgres), optional Elasticsearch, and more. I
| don't know how much Ruby itself accounts for expensive
| overhead, but just running all of those other things on a
| single server vertically for a single user instance is a
| massive, expensive overhead. (It's clearly built for
| horizontal scale where your background services and caches
| and database servers may all be different clusters of
| VMs/servers over vertical stack efficiency when "scaled down"
| from the "natural" "mastodon.social scale" that Mastodon is
| most optimized for.)
|
| It's an interesting optimization problem reminder that
| scaling factors are different for different needs and not
| everything scales cleanly to every use case. A single user
| instance _should_ be able to use a much smaller vertical
| stack, but scaling down from a wide horizontal stack is not
| necessarily the best or cheapest place to start when building
| something like that.
|
| (There are some interesting projects I've seen to build
| single user instances with much less overhead, shorter
| vertical stacks. I'm curious to see where those efforts go.
| In my own usage of Mastodon my "single user" instance gets
| the benefits of the horizontal scaling Mastodon was built for
| because my hosting provider does a bunch of work to make sure
| that they take advantage of that economy of scale to host
| many small instances for cheaper than trying to run small
| instances in one-off VMs.)
| mxuribe wrote:
| There are several websites out there which hope to list many
| ActivityPub servers (and clients) in many (programming)
| languages, and other implementation aspects... Like, here's an
| oldie but goodie website:
| https://fediverse.party/en/miscellaneous/ ... There are other
| websites of course.
|
| Just select your desired lang. and review! Now, of course, it
| might be early days for some languages (e.g. for Rust,
| etc.)...But, one reason why some languages are used over
| others...is due to ease of deploying on VPCs and VPC-like hosts
| (...historically the land that php ruled ;-)
|
| Enjoy, and I hope you find what you're looking for!
| erwinh wrote:
| A bit off-topic but the post title will probably attract relevant
| people.
|
| What are the thoughts on OCaml on HN?
| WorldMaker wrote:
| I haven't used OCaml much directly, but F# is a common enough
| tool in my toolbelt at this point. My experience of F# is that
| overall it's a good language family. The access to .NET's
| standard library (the BCL) and easy interop with C# are the
| biggest reasons F# is the tool I more often reach for, as it
| already fits the ecosystem most of my other development is in,
| but I'd love to work more directly with OCaml should the need
| arise.
| cccbbbaaa wrote:
| It replaced Python for everything longer than a couple hundred
| lines for me. Fast language, fast compile times,
| clean(-ish) syntax, strong typing system, good ecosystem, and
| now multicore support? Yes please!
|
| I must be more nuanced, though: existing libraries in opam are
| generally very, very good (I really like cmdliner), but many
| things may be missing. There is no alternative to Django, for
| instance. No serious IDE, except emacs. The standard library
| was so lacking that there is at least an alternative. The
| situation improved, but there's still missing stuff compared to
| Python.
| mattpallissard wrote:
| > There is no alternative to Django, for instance.
|
| https://aantron.github.io/dream/, which is new and used by
| ocaml.org as well as OP
|
| > No serious IDE, except emacs
|
| and vim, and visual studio, and whatever else supports the
| LSP protocol via https://github.com/ocaml/ocaml-lsp
|
| > The standard library was so lacking that there is at least
| an alternative.
|
| While janestreet does have and publish their own stdlib, I
| personally try to stick to the stdlib whenever possible. Not
| to knock janestreet. I'm glad they're around and have
| contributed a bunch.
|
| But overall I agree with you. It's been my favorite language
| to write in for years now. You can't just reach for off-the-
| shelf libraries for every little thing. Although the ones
| that do exist tend to be written halfway decently.
| amelius wrote:
| Do you make GUIs in OCaml, and which libraries do or would
| you use?
|
| And how about scientific computing (SciPy), deep learning
| (PyTorch etc.), or computational geometry (Shapely etc.)?
| yw3410 wrote:
| GUIs are a PITA like in most languages.
|
| I think most people use something which binds to gtk (such
| as lablgtk) or Qt.
|
| For scientific computing there is Owl, but I haven't used
| it personally.
| amelius wrote:
| Hmm, I'll stick with Python for now.
|
| The ecosystem of libraries is just too good.
|
| Perhaps if OCaml made it very easy to interoperate with
| Python, I could give it a chance.
| yw3410 wrote:
| There is pyml for interop, but I've never used it.
| still_grokking wrote:
| I've heard good things about OCaml in general.
|
| But "no serious IDE, except emacs" is a non-starter imho, if
| it's true.
|
| They should really invest in this. Otherwise the language
| won't attract any professional developers in the large.
| yw3410 wrote:
| Vscode works since there is an LSP server.
| zem wrote:
| one of my favourite languages! not so much for its (excellent)
| technical qualities, but just as a matter of personal taste -
| it joined ruby and racket in a short list of languages that
| just feel nice to program in. (i suspect D would join that list
| too but despite being interested in it for a while i haven't
| yet had a compelling project to use it for.)
| dahwolf wrote:
| Saw some comments on the protocol being fluffy and typical
| implementations resource hungry. This is an interesting guy to
| follow:
|
| https://universeodon.com/@supernovae
|
| He's the admin of universeodon, a mastodon instance with 13K MAU.
| He recently shared that in a month's time, 3TB of text was
| transferred just in ActivityPub events. Images a multiple of it.
| I don't know what the bill is, but I was pretty shocked by the
| stats...for "just" 13K users.
|
| And the cruel thing is that it still doesn't work properly.
| Likes/boosts and replies do not properly synchronize.
| mariusor wrote:
| The author makes the basic mistake of most of the people
| implementing ActivityPub services: they want to map the logic of
| an existing type of web application and contort existing domain
| objects into encoding/decoding to an "impractically large number"
| of options. That happens because they want two things in one: a
| server and a client.
|
| The ActivityPub specification needs to be read with a goal
| similar to an email server in mind. It should do one thing:
| receive JSON-LD objects in inbox, process them according to the
| specification, and (maybe) store them on disk.
|
| The idea of "users", "friends", "posts", "feeds" etc, are
| concepts that belong to the clients on top of this server, not in
| the server itself.
|
| This separation between clients and server will also allow better
| interop/graceful degradation of object types that the
| client/server don't specifically understand.
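The "receive objects in the inbox, process them, maybe store them" view above can be sketched in a few lines. This is a rough illustration under heavy assumptions: it does no HTTP signature verification and applies none of the side effects the specification requires, and the function name `receive` and the one-file-per-document layout are hypothetical choices rather than anything the spec mandates:

```python
import hashlib
import json
from pathlib import Path


def receive(raw: bytes, inbox_dir: Path) -> Path:
    """Accept one posted document and store it on disk, without interpreting it."""
    doc = json.loads(raw)
    # Only the envelope is checked here: a JSON object that carries a "type".
    if not isinstance(doc, dict) or "type" not in doc:
        raise ValueError("not an ActivityStreams object")
    # Content-addressed file name, so a redelivered document maps to one file.
    digest = hashlib.sha256(raw).hexdigest()
    inbox_dir.mkdir(parents=True, exist_ok=True)
    path = inbox_dir / f"{digest}.json"
    path.write_bytes(raw)
    return path
```

Nothing in this sketch knows about "users" or "posts"; a client reading the stored files would supply those concepts, which is the separation the comment argues for.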
| JustSomeNobody wrote:
| Do you know of a small sample project that does this as an
| example?
| mariusor wrote:
| There are no "small sample" projects as far as I know. But if
| you look in my profile (or other comments in this thread) I
| did develop a server which only does ActivityPub, client to
| server and server to server.
| cratermoon wrote:
| OK, but for someone who wants to build a useful tool that does
| what the author wants, "interacting with the Fediverse", such
| as federating with Mastodon, how useful is doing that one
| thing?
| mariusor wrote:
| If you want to create one just for yourself, sure. If you
| want to create something for the rest of the world, probably
| not very much.
|
| I get the "scratch your own itch" mentality, but not if you
| kneecap all efforts that try to build on top of it. :D
| jeroenhd wrote:
| It depends on your goal. If your server is just a tool you
| use, you can ignore a lot of concepts. There is no local
| timeline, there are no users, all follows belong to a single
| user, etc.
|
| I can't find the link but a while back there was a post on
| the front page about how to get a findable, read only
| ActivityPub profile by just uploading some static JSON files.
| Not exactly a Twitter competitor, but you don't need much to
| start exchanging messages.
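For a sense of what "some static JSON files" could look like, here is a hedged sketch that writes an actor document and a WebFinger response to a directory you could upload to a static host. The domain, user name, and file layout are made up for illustration, and a profile that other servers will actually exchange signed messages with would also need a public key, which is omitted here:

```python
import json
from pathlib import Path


def write_static_profile(root: Path, domain: str, user: str) -> None:
    """Write a read-only actor profile plus its WebFinger lookup document."""
    actor_url = f"https://{domain}/actor.json"  # hypothetical layout
    actor = {
        "@context": "https://www.w3.org/ns/activitystreams",
        "id": actor_url,
        "type": "Person",
        "preferredUsername": user,
        # On a static host nothing listens at these URLs; they are placeholders.
        "inbox": f"https://{domain}/inbox",
        "outbox": f"https://{domain}/outbox",
    }
    webfinger = {
        "subject": f"acct:{user}@{domain}",
        "links": [
            {"rel": "self", "type": "application/activity+json", "href": actor_url}
        ],
    }
    (root / ".well-known").mkdir(parents=True, exist_ok=True)
    (root / "actor.json").write_text(json.dumps(actor, indent=2))
    (root / ".well-known" / "webfinger").write_text(json.dumps(webfinger, indent=2))
```

A real WebFinger endpoint is queried with a `resource` parameter, so a single static file only works for a single-user setup where every lookup can return the same answer.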
| mdasen wrote:
| I believe you're looking for this:
| https://blog.joinmastodon.org/2018/06/how-to-implement-a-
| bas...
| cratermoon wrote:
| I did that myself. It's quite a distance from passively
| accepting requests to interacting with the Fediverse.
| still_grokking wrote:
| This comment raised a whole bunch of red flags for me.
|
| First and foremost: Saying that something is like an email
| server translates for me into "this is an under- and over-
| specified swamp at the same time, full of quirks, and actually
| not implementable in any reasonable way". Because that's what
| email is. I almost can't think of a greater horror than writing
| an email server from scratch...
|
| I don't know enough about ActivityPub to judge whether it's
| really like email. I would strongly hope it isn't, as otherwise
| it would be a tech you should probably better never touch as a
| developer.
|
| The next thing is: If an ActivityPub server only receives and
| sends some opaque BLOBs what's the whole point of it?
|
| But when it's not about opaque BLOBs you need to map the
| structures in the spec to proper types in a statically typed
| language, as you can't manipulate them otherwise in any
| meaningful way. If it's not possible to do that because the
| spec is vague and/or there is no coherent data model behind it
| that would be just another reason to not touch this tech.
| Nobody needs the next underspecified, stringly-typed "email".
|
| I really hope I'm reading this wrong!
| WorldMaker wrote:
| > If an ActivityPub server only receives and sends some
| opaque BLOBs what's the whole point of it?
|
| There's still a difference between "try to black-box the
| incoming data as much as possible" and "treat the incoming
| data as opaque BLOBs and assume". The data is mostly JSON-LD
| which is a far cry from "binary large objects". It is always
| going to be "semi-transparent" as it will always be JSON.
| Whether or not you like the "-LD" extensions to JSON (they
| are heavy, they do have a lot of RDF baggage you may not
| desire), they give you a bunch of guaranteed "baseline
| schema" for the JSON objects that you can use for static
| typing that might be "good enough" for a lot of "meaningful
| manipulations" (such as following links to pick up related
| objects; LD => linking data) and that is all easily
| transparent.
|
| A lot of the schemas beyond "LD" in ActivityPub are
| client/application-specific beyond most of the JSON-LD basics
| and should be easy to treat as a black box unless doing
| client/application-specific tasks. That's not necessarily
| "stringly typed", it's kind of a classic "serialization
| onion": The server at best needs to know that it is JSON and
| it may have JSON-LD metadata for relevant related linked
| objects (and a few other metadata fields common to
| "introspection", similar to "headers"). The client can dig
| deeper and know it is not just "any" JSON object but a more
| specific schema for a given class of thing the client cares
| about.
| still_grokking wrote:
| To be honest, this sounds indeed quite like the mess that
| email is.
|
| If the server isn't just a "dumb 'BLOB' storage" it will
| need to handle application logic (sooner or later, as this
| is actually what servers are for)...
|
| But given that the application logic seems to be mostly
| unspecified, kind of wild west, where every client
| application can do whatever it thinks its users like, this
| will unavoidably end in all the problems you have with
| email, where the server needs to know about all the
| specific details, quirks, and idiosyncrasies of every
| client ever built.
|
| The whole concept reads like an implementation of
| "'Postel's Law' fallacy".
| mariusor wrote:
| It sounds like you made your mind up. I hope that you'll
| decide to stop wasting your time by contributing to this
| thread.
| still_grokking wrote:
| I'm just reflecting on what I've heard here so far.
|
| I haven't made up my mind, as for that I would need to
| study the _primary sources_ myself. Talk is cheap. Even
| here on HN.
|
| But I start to get a kind of picture. And it doesn't look
| pretty to be honest. That's kind of discouraging and sad.
|
| That's not my fault. I'm just trying to understand what
| people here are saying.
| mariusor wrote:
| Thank you for articulating this very well, I was getting a
| bit frustrated at OP's contrarianism. :)
| mariusor wrote:
| The email comparison helps people to understand the
| directional way ActivityPub works, I don't know enough about
| email (whichever of SMTP or IMAP/POP3/samd you consider that
| to be) to make a comparison at protocol level.
|
| > If [...]receives and sends some opaque BLOBs what's the
| whole point of it?
|
| There are some rules about how to have side effects for said
| blobs. Some of the blobs themselves have side effects. That's
| mostly what ActivityPub is: rules about how to distribute the
| blobs in the federated context, rules about what to do with the
| blobs when they reach your servers (when coming from other
| servers, or directly from clients).
|
| The vocabulary that ActivityPub is based upon is another
| whole specification, called ActivityStreams, which didn't
| originate in the W3C group. This vocabulary has three (*main)
| types of objects: Activities - which provide the backbone of
| ActivityPub (Like, Follow, Create, Update), Actors -
| basically different types of users (these are the entities
| that operate the activities), and Objects - whatever the
| Activities operate on.
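The three-way split described above can be illustrated with a small classifier keyed on the `type` field. The activity set below is a deliberately partial, hand-picked subset of the vocabulary, and the `classify` function itself is a hypothetical helper, so treat this as a sketch rather than a complete mapping:

```python
# Partial subset of activity types, chosen for illustration only.
ACTIVITY_TYPES = {
    "Create", "Update", "Delete", "Follow", "Like",
    "Announce", "Undo", "Accept", "Reject",
}
# The actor types named by the vocabulary.
ACTOR_TYPES = {"Application", "Group", "Organization", "Person", "Service"}


def classify(doc: dict) -> str:
    """Sort a decoded document into 'activity', 'actor', or plain 'object'."""
    types = doc.get("type", [])
    if isinstance(types, str):
        types = [types]  # "type" may be a single name or a list of names
    if any(t in ACTIVITY_TYPES for t in types):
        return "activity"
    if any(t in ACTOR_TYPES for t in types):
        return "actor"
    # Everything else (Note, Article, unknown extensions...) is what
    # activities operate on.
    return "object"
```

Falling through to "object" for anything unrecognised is a design choice: extension types then degrade to generic objects instead of being rejected.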
| MuffinFlavored wrote:
| > JSON-LD
|
| https://json-ld.org/ for anybody else not super familiar
| iudqnolq wrote:
| (My only knowledge of activitypub comes from reading this
| article.)
|
| To receive JSON-LD messages don't you need to send follow
| requests? And to do that don't you need to deal with the fact
| the spec is too complicated and most servers implement
| inconsistent parts of it?
| vidarh wrote:
| To receive JSON-LD messages, someone needs to send them to
| you. Sending follow requests is perhaps the easiest way to do
| that, but those follow requests do not need to be initiated
| by the same code that hosts the inbox.
|
| The point is there are several potentially independent layers
| and modules there: The message pump itself at least can be
| implemented separately from the decoding of individual
| message types, and separate from managing followers and
| following, the same way e.g. a mail server knows nothing
| about how to follow mailing lists, or decoding email messages
| past the header.
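The layering described above, a message pump that is separate from the decoding of individual message types, might be sketched like this. The `Pump` class and its method names are invented for the example; the only thing the pump reads is the envelope's `type`, much as a mail server reads headers and leaves the body alone:

```python
import json
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class Pump:
    """Routes incoming documents by type without decoding their contents."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Handler] = {}
        self.unhandled: List[dict] = []

    def on(self, type_name: str, handler: Handler) -> None:
        """Register a module that knows how to decode one message type."""
        self.handlers[type_name] = handler

    def deliver(self, raw: str) -> bool:
        """Pass a message to its handler; return False if nobody claimed it."""
        envelope = json.loads(raw)
        kind = envelope.get("type") if isinstance(envelope, dict) else None
        handler = self.handlers.get(kind) if isinstance(kind, str) else None
        if handler is None:
            # Unknown types are kept intact instead of being rejected.
            self.unhandled.append(envelope)
            return False
        handler(envelope)
        return True
```

A follower-management module would register itself with `pump.on("Follow", ...)` and could live in entirely separate code from whatever hosts the inbox.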
| still_grokking wrote:
| That sounds like a mess.
|
| Reading through the other comments here it seems that the
| spec is in fact a mess...
| [deleted]
___________________________________________________________________
(page generated 2023-04-24 23:00 UTC)