[HN Gopher] Message order in Matrix: right now, we are deliberat...
___________________________________________________________________
Message order in Matrix: right now, we are deliberately
inconsistent
Author : whereistimbo
Score : 92 points
Date : 2024-12-05 02:11 UTC (20 hours ago)
(HTM) web link (artificialworlds.net)
(TXT) w3m dump (artificialworlds.net)
| lmm wrote:
| In general we certainly want to be able to change things "in the
| past". When there is unpleasant spam in a groupchat, you want a
| moderator to be able to remove or at least hide it, in a way that
| means people scrolling up won't be exposed to it unless they
| explicitly want to. (You could argue for having the client deal
| with all of that, but I don't think there's much benefit).
|
| And if, as in the example at the end, clients on different
| homeservers will inevitably see different views, then I don't
| think always showing the same history to the same client, or
| clients on the same server, solves the "gaslighting" problem - if
| anything it could make it worse. Maybe clients should make it
| obvious when messages have been "retconned" into the scrollback,
| and maybe servers should have certain features to support that.
| But the idea of having a consistent linear timeline is one of
| those answers that's clear, simple, and wrong.
| danpalmer wrote:
| This is something that many chat apps get wrong and I'm not sure
| this article is moving in the right direction. The UX is fairly
| clear in my mind:
|
| 1. All up-to-date clients should be displaying the same message
| order.
| 2. A single client should not send messages in the wrong order.
|
| Yes a client may be out of date and therefore show something
| different, but once it becomes up to date it should be showing
| the same state even if that means amending history. Why? Because
| the humans reading it will be confused otherwise! An app getting
| more data is something we intuitively understand, but if my
| client shows something and yours shows something else, we will
| conclude different meanings from it.
|
| Additionally there are some clients that treat each message input
| by the user as a retriable thing in isolation, which is also
| clearly incorrect. If I send two messages and the first fails to
| go through, I almost certainly don't want to retry the second
| until the first has gone through, otherwise my client has
| literally sent out of order messages! I use Beeper for chat and
| this is one of the most frustrating things it does.
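|
| A minimal sketch of the "don't retry the second until the first has
| gone through" idea, assuming a per-room outbox. The class name
| OrderedOutbox and the send callback are hypothetical and not taken
| from Beeper or any Matrix client:

```python
from collections import deque
from typing import Callable, Deque, List


class OrderedOutbox:
    """Hypothetical per-room send queue: a later message never overtakes an earlier one."""

    def __init__(self, send: Callable[[str], bool]) -> None:
        # `send` is an assumed callback returning True on success, False on failure.
        self._send = send
        self._pending: Deque[str] = deque()

    def enqueue(self, text: str) -> None:
        self._pending.append(text)

    def pending(self) -> List[str]:
        return list(self._pending)

    def flush(self) -> List[str]:
        """Send from the head of the queue and stop at the first failure."""
        delivered: List[str] = []
        while self._pending:
            if not self._send(self._pending[0]):
                break  # the head and everything behind it stay queued, in order
            delivered.append(self._pending.popleft())
        return delivered
```

| Calling flush() again after a failure retries the oldest message
| first, so the wire order always matches the order the user typed.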
| lmm wrote:
| > Additionally there are some clients that treat each message
| input by the user as a retriable thing in isolation, which is
| also clearly incorrect. If I send two messages and the first
| fails to go through, I almost certainly don't want to retry the
| second until the first has gone through, otherwise my client
| has literally sent out of order messages!
|
| I don't think that's clearly incorrect. If you sent two
| messages you presumably want them to be two messages and they
| should be retried as such. If what you wanted to send was a
| single, multi-line message, surely you would have just done
| that?
| winwang wrote:
| I break up my messages, as do many people.
| adastra22 wrote:
| Do you do so with the expectation that they might arrive
| out of order, or one fails?
| Spivak wrote:
| Out of order no, failing and having to manually re-send
| which makes it out of order is acceptable.
| danpalmer wrote:
| Not at all. The separation of messages is part of
| communication, not me trying to game a network protocol.
| Maybe it's to emphasise a point, maybe it's to time a joke,
| maybe it's to send a photo and a text message separately.
| thaumasiotes wrote:
| > If what you wanted to send was a single, multi-line
| message, surely you would have just done that?
|
| No. danpalmer is correct; the break between messages is an
| integral part of the communication.
| hks0 wrote:
| Human communications are more nuanced than DB transactions.
| If I forget to mention something important I send a new
| message rather than editing the already sent one, to make
| sure I catch their attention. Edits can go unnoticed. Imagine
| this scenario:
|
| [12:00 / sent] Sell the house.
|
| [12:05 / failed] Please feed the baby.
|
| [12:06 / sent] Oh and the cat too.
|
| Now the receiver's going to sell my cat [example inspired by
| The Art of Multiprocessor Programming].
| johnny22 wrote:
| They sure are, but I hate when slack combines my messages
| when I wanted two separate messages on purpose. If i send
| two messages, it's because I did it on purpose.
| shawnz wrote:
| How far back should you be able to amend history? What if a
| malicious client adds messages to a conversation that happened
| in the past? Imagine for example I'm at work and notice a
| critical mistake that I missed, and so I retroactively add
| messages to the old conversation to make it look like I'm not
| liable, should that be permitted by the protocol?
| glandium wrote:
| Matrix already allows editing weeks-old messages.
| shawnz wrote:
| But there's a flag which indicates they've been edited and
| you can see the edit history, right? So that's not useful
| for this scenario.
| thomastay wrote:
| Obv it depends, but one way to "solve" this is to show an
| edit history, or at least the latest edit timestamp along
| with some visual indicator that the message was edited
| recently
| shawnz wrote:
| I'm not talking about edits, I'm talking about sending new
| messages which are backdated to appear as part of an older
| conversation
| thomastay wrote:
| icic, yeah that definitely shouldn't be allowed
| fastball wrote:
| You can amend displayed order for humans (what matters for
| 99% of usage), while still allowing anyone interested to see
| when the message actually arrived at the homeserver (making
| the suggested gambit impractical).
| shawnz wrote:
| Then instead imagine this: the user really is innocent and
| just happened to coincidentally send the message right
| after the start of a long period of poor connectivity (like
| a flight, or a road trip, etc). If you just allow it to go
| through after an arbitrary delay, with only a log of the
| received time for liability purposes, then the user
| wouldn't have any indication of this scenario occurring.
|
| Wouldn't it be better in that case to show an error so that
| they can make sure the situation is addressed
| appropriately?
| Spivak wrote:
| You have when the client claims it was sent (so where it
| goes in the displayed history) and can see when it was
| received. What else could you possibly do?
| shawnz wrote:
| For example, you could reject the message and show the
| user an error but only if there's a discrepancy of >X
| minutes. But how much discrepancy should be allowed? I
| don't know, I only mean to show why I think the solution
| isn't as simple as it appears
| lmm wrote:
| > you could reject the message and show the user an error
| but only if there's a discrepancy of >X minutes
|
| No you can't, not in a federated and decentralised system
| like this.
|
| The sender can wait for a read receipt from a given
| receiver user, if the receiver is willing to make those
| public. But if the message left client A and didn't
| arrive at client B, there's no objective fact of the
| matter about whether the message "was sent" or not.
| immibis wrote:
| Seems like a design deficiency of Matrix then. When IRC
| federation breaks, everyone can see it, except for the
| rare people who aren't in a shared channel with anyone on
| the other side of the break.
| lmm wrote:
| > Wouldn't it be better in that case to show an error so
| that they can make sure the situation is addressed
| appropriately?
|
| If you want a single centralised server then you can set
| things up that way. Presumably if you're using a setup
| with multiple servers, and took one of the servers on the
| flight/road trip, you wanted the people on the
| flight/road trip to be able to keep talking to each other
| over that server, even though that server is disconnected
| from the one in the office.
| dataflow wrote:
| > How far back should you be able to amend history? What if a
| malicious client adds messages to a conversation that
| happened in the past? Imagine for example I'm at work and
| notice a critical mistake that I missed, and so I
| retroactively add messages to the old conversation to make it
| look like I'm not liable, should that be permitted by the
| protocol?
|
| I believe that's impossible? At least if you design it
| correctly.
|
| For ordering/interleaving purposes, what matters isn't the
| time you claim to send the message, it's the time the message
| is received by the server. If you want, you can display the
| claimed send timestamp beside the message (and prominently
| highlight it if it is e.g. out of order, or with a long
| delay, etc.), but that is irrelevant to the ordering.
|
| The point here is that there should be a single consistent
| order on the server, and that's what all clients ought to be
| displaying. Any messages not yet acknowledged by the server
| should be displayed differently so that users are aware they
| haven't been seen yet, and any messages that arrive before
| those are sent would obviously get inserted above those.
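|
| A rough sketch of that display rule, assuming a single authoritative
| server that stamps each accepted message with a sequence number
| (which, as the replies point out, federated Matrix does not have).
| The Message fields and render() are hypothetical names:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    sender: str
    text: str
    claimed_ts: float                 # client's claimed send time: shown, never used for ordering
    server_seq: Optional[int] = None  # assigned by the (assumed) single server; None until acked


def render(messages: List[Message]) -> List[str]:
    """Acknowledged messages in server order, then unacknowledged ones marked as pending."""
    acked = sorted(
        (m for m in messages if m.server_seq is not None),
        key=lambda m: m.server_seq,
    )
    pending = [m for m in messages if m.server_seq is None]
    lines = [f"{m.sender}: {m.text}" for m in acked]
    lines += [f"{m.sender}: {m.text} (pending)" for m in pending]
    return lines
```

| A message that gets acknowledged later simply moves from the pending
| tail into its server-assigned slot; the claimed timestamp can still
| be shown beside it.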
| lmm wrote:
| > what matters isn't the time you claim to send the
| message, it's the time the message is received by the
| server
|
| There's no "the" server here. If you use the time the
| message is received by the server, you'll get different
| views on different servers, and you may see messages from
| months ago appearing as new, if connectivity breaks down
| and is later restored.
| dataflow wrote:
| > There's no "the" server here.
|
| Can't you assign every conversation to a single
| authoritative server for handling?
|
| Also, how large of a time skew are you imagining would
| exist between different _servers_? That stuff ought to be
| accurate to at least milliseconds if not micro...
| shawnz wrote:
| Aside from the concerns with decentralized servers that the
| other poster mentioned, this has the disadvantage that your
| messages are going to get constantly reordered to not match
| the intended flow of the conversation when you have poor
| connectivity, which is a bad user experience
| dataflow wrote:
| Wasn't the whole point here that the messages _wouldn't_
| get reordered? There would be one definite order that
| everyone would see. Again, if the message isn't
| timestamped by the server, it would need to appear
| visually differently, so that everyone knows about this.
| And nobody says the server has to accept messages with
| arbitrary delays either.
| shawnz wrote:
| My point is that some limited reordering maybe should be
| allowed, but not too much. That is to say, the problem
| isn't as simple as just doing it one way or another way.
| Every approach has some disadvantages.
| dataflow wrote:
| I know you were trying to reach that conclusion, but my
| point was that the design I suggested neither seemed to
| have the problem you suggested, nor is reordering a
| necessary outcome, from what I can tell.
| deredede wrote:
| > Yes a client may be out of date and therefore show something
| different, but once it becomes up to date it should be showing
| the same state even if that means amending history. Why?
| Because the humans reading it will be confused otherwise! An
| app getting more data is something we intuitively understand,
| but if my client shows something and yours shows something
| else, we will conclude different meanings from it.
|
| That's interesting because I have the complete opposite take
| and would hard disagree with this. I intuitively understand
| that if we both write messages at the same time, we will see
| them in different order. Snail mail has worked this way for
| centuries, and I very much prefer this to an app silently
| altering the content as time goes. It is confusing when it
| happens under my eyes (something moved at the top of the screen
| while I was reading the bottom, what was it?) and easily leads
| to missed messages especially in group conversations (my buddy
| sent a message with a poor connection at 11am, it is retried
| and sent at 2pm and appears before the lengthy discussion
| others had at noon).
| moring wrote:
| Snail mail has never claimed that a history of all messages,
| with that history having a current state, exists. If you send
| a paper letter, you don't have it yourself anymore. You might
| keep a copy, but that's a _copy_, not the letter you sent.
|
| Messenger apps claim that such a history exists by showing
| you, well, that history. In the same way, messengers claim
| that a message order exists, by showing you the messages in
| that order. If something exists, then it is independent of
| the viewer. So the assumption that the message order is the
| same for all viewers is founded in how two people look at
| physical objects.
| deredede wrote:
| Messenger apps don't claim that this history should be
| global and consistent. The order in which messages were
| sent and received by my device is a perfectly fine (and I'd
| say intuitive) history. It is the order people (and their
| records, if they have some) would have had in mind in the
| old time.
|
| I take a different conclusion from the way people look at
| physical objects: since your device (or even my other
| device) is a different physical object than my device, I'd
| be wholly unsurprised to find a different order there.
| lmm wrote:
| > Messenger apps don't claim that this history should be
| global and consistent.
|
| The fact that we're talking about multiple people looking
| at the same chat - the fact that we do conceptualise it
| as "the same chat" and "the history" - implies that we
| think of it as a single thing. And I think messenger apps
| generally nudge us that way - e.g. setting the name of
| the chat usually sets it for everyone.
|
| > It is the order people (and their records, if they have
| some) would have had in mind in the old time.
|
| I don't think it is. If I pull my correspondence with
| person X out of my drawer or file, the only dates I have
| to order them by are the dates written on the letters -
| which are the letters they and I (if I keep carbons of
| the ones I send) wrote them on, not the dates I received
| them. If they sent me a postcard while on holiday and
| then a letter after returning that arrived sooner, I'll
| read them in one order on receipt and in a different
| order when looking back. Likewise if I have a memo of a
| phone call with them, that may be from before I received
| a letter that is nevertheless dated earlier.
| deredede wrote:
| > I think messenger apps generally nudge us that way -
| e.g. setting the name of the chat usually sets it for
| everyone.
|
| That's a good point - maybe it's actually email that
| warped my mind.
|
| > I'll read them in one order on receipt and in a
| different order when looking back
|
| Also a good point, I was thinking more about business
| communication where the date the letter is received
| matters. Thinking back on it, I think the main difference
| is that the messenger apps might happily reorder message
| before (or while) I read them. And if only one order is
| to be available, the one of the most use to me for an
| instant messaging app is the one I received the messages
| in, but I get how for other use cases it would be
| different.
| kevincox wrote:
| > I intuitively understand that if we both write messages at
| the same time, we will see them in different order.
|
| I think you are thinking like a distributed systems designer.
| I would assume that if you asked 10 "random Americans" 9 of
| them would assume that someone managed to send their message
| first and would be surprised if their phone and their friend's
| phone showed them messages in different orders.
| RaftPeople wrote:
| I don't use these apps so maybe my solution wouldn't work, but
| after reading the article, it seems that having a visual
| indicator of messages that are new but in the past would be a
| reasonable solution.
|
| Especially if there were simple controls to flip into a mode
| that minimizes the ones already seen (collapsed and grey for
| example) while highlighting all of the ones inserted in the
| past.
|
| Or, if there are many messages already seen and few inserted
| into hist, show the inserted ones with a small sampling of the
| already seen (so the user can anchor to already familiar data
| in the timeline) along with "72 messages hidden that were
| previously seen" type of thing in between the inserteds to
| condense the view.
| thomastay wrote:
| Having dealt with this problem at work for several years now, I
| feel the pain of keeping different clients in sync - it's
| extremely difficult. Not sure if it's possible in Matrix, but
| consider having a message ID that increments by one on every
| message in a room. That lets the client know pretty quickly if
| there's a gap or a misordering.
|
| Not really getting this point though: the /sync API returns events
| in an order "according to the arrival time of the event on the
| homeserver". The spec for /messages says it returns events "in
| chronological order. (The exact definition of chronological is
| dependent on the server implementation.)".
|
| Why would those two return different results? When does the
| chronological order of two messages differ from the arrival time
| of the event on the homeserver?
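|
| To illustrate the incrementing-ID idea: if every message in a room
| carried a dense per-room counter (an assumption; the thread below
| explains why federation makes that hard), a client could spot
| missing ranges like this. find_gaps is a hypothetical helper:

```python
from typing import Iterable, List, Tuple


def find_gaps(seen_ids: Iterable[int]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) ranges of ids missing between the smallest and largest seen."""
    ids = sorted(set(seen_ids))
    gaps: List[Tuple[int, int]] = []
    for prev, cur in zip(ids, ids[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps
```

| For example, having seen ids 1, 2, 5, 6 and 9, the client would know
| to fetch 3-4 and 7-8 before trusting its view of the room.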
| AlotOfReading wrote:
| Non-monotonic clocks?
| wkrp wrote:
| /messages might be a legacy endpoint compared to a newer /sync.
| I know Matrix has been working hard on their sliding sync api.
| duskwuff wrote:
| What I think you're missing is that Matrix runs as a
| distributed system. There's no central authority to assign IDs
| to messages, and it's possible for a single group chat to run
| in a split-brain configuration if two homeservers lose
| connectivity to each other. When those homeservers reconnect,
| users connected to each one will see messages appear "in the
| past" which were sent by users on the other side of the split.
| adastra22 wrote:
| That makes the problem harder, but not impossible.
| lmm wrote:
| It makes an incrementing message ID impossible.
| adastra22 wrote:
| Only if you're not ok with eventual consistency and
| renumbering in the case of discovered conflicts or net
| splits.
| toast0 wrote:
| I'm pretty sure this is actually impossible in a
| distributed system with independent operation, and if it
| were possible, it would be terrible UI anyway.
|
| Problem one is if you want to order events chronologically,
| you need to precisely decide what the time of the event
| means. Probably not the time the client hit send, because
| you can only measure that on the client and client clocks
| are at best approximately accurate. You could consider the
| time the server received it, and assume your server times
| are accurate, but that's still problematic because even in
| a well functioning system, if a user sends message A to
| server.wdc around the same time as a user sends message B
| to server.lax, users connected to server.wdc will get A
| then B, and users connected to server.lax will get B then
| A, and this leads to problem two:
|
| Problem two is messages generally display in order of
| receipt. If you get a message that slots in earlier in the
| thread, you may need to scroll up to see it. In a busy
| thread, it's going to be hard to read all the messages
| because of the back and forth. If you send a message, it
| may need to be reordered too. If you go back to the thread
| later, new messages may be in different places. This is
| _more_ disorienting IMHO than different message orders for
| different viewers.
|
| Problem two gets even worse when you don't just have
| distance between servers, but also some network or other
| operational issues. If a server accepts a message, but is
| unable to forward it immediately, you probably want it to
| forward it whenever it can... if there's a significant
| delay, now the message is again going to be displayed in a
| place where it's difficult to see.
|
| You can kind of solve this by forcing messages to a group to
| go through a single queue which forces an ordering, but
| that makes accepting messages for a group a lot more
| difficult.
| j16sdiz wrote:
| > This is more disorienting IMHO than different message
| orders for different viewers.
|
| the parent post is arguing this is _less_ disorienting,
| and I agree.
| toast0 wrote:
| I feel like when an important message comes in out of
| sequence, but you had already sent a response to the chat
| with what was visible at the time, it will be very
| confusing when that gets reordered.
|
| Ex:
|
| A@T0: User X is abusing our service, we should send them
| a sternly written letter.
|
| B@T60: Yes, I'll do it right away.
|
| C@T2 (received later): No, we should just shadowban them.
|
| When B sent their message, their intent was clear to
| them. But when they review their message after C's
| message is received, if the display ordering is changed,
| the meaning of the communication has changed, and how can
| B show that sending the warning was reasonable when they
| clearly said they were going to shadowban the user.
| (Maybe this group should use something else with a
| guaranteed ordering to track abuse and response, but
| that's a different question)
|
| If C's message is displayed earlier than B's in some
| cases and not others, that makes for a confusing
| situation, but each person can look at their messages and
| easily see what they saw when they argue about a
| breakdown in communication in the aftermath.
| adastra22 wrote:
| There are relatively straightforward decentralized
| consensus algorithms for ordering events if we assume
| cooperation. If we assume malicious peers, then we're in
| the space of the byzantine generals problem, but there
| are solutions to that too.
|
| Now there's some property that you have to give up, for
| example an immutable ordering. You might think the
| message came in one order, then reconnect with the
| network and discover the order was flipped. So long as
| the UI can handle that update, there are consensus
| algorithms that will deliver a consistent view even in
| the edge cases.
|
| You don't need a single timestamping queue.
| aeonik wrote:
| As long as the speed of light remains constant for all
| observers, who cares if everyone agrees on simultaneity?
| Distributed systems don't need to know what _time_ it is,
| just what _happened_.
|
| Well known systems are implemented this way, and the UI
| is great, people barely even notice.
| thomastay wrote:
| yeah, having eventual consistency for messages across
| homeservers makes the work on the client harder. I guess they
| just have to accept that messages will "appear in the past"
| as you said.
|
| But at least for messages sent within the same homeserver, I
| would think that those two apis should return the same data
| fiddlerwoaroof wrote:
| I think you basically want a partial order for federated
| chat: messages should arrive after the messages that cause
| them but not necessarily after messages that didn't cause
| them. In the case of a network partition, this allows
| people on either side of the partition to continue
| communicating at the cost of non-determinism when the
| partition is resolved.
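|
| One way to picture that partial order is a buffer that holds an
| event back until everything it causally depends on has been shown.
| This is only a sketch: CausalBuffer and the explicit parents set are
| illustrative names, loosely analogous to how Matrix events reference
| earlier events:

```python
from typing import Iterable, List, Set, Tuple


class CausalBuffer:
    """Deliver an event only after all events it depends on have been delivered."""

    def __init__(self) -> None:
        self.delivered: List[str] = []
        self._seen: Set[str] = set()
        self._waiting: List[Tuple[str, Set[str]]] = []

    def receive(self, event_id: str, parents: Iterable[str]) -> None:
        self._waiting.append((event_id, set(parents)))
        self._drain()

    def _drain(self) -> None:
        # Keep sweeping until a full pass releases nothing new.
        progress = True
        while progress:
            progress = False
            for item in list(self._waiting):
                event_id, parents = item
                if parents <= self._seen:
                    self._waiting.remove(item)
                    self.delivered.append(event_id)
                    self._seen.add(event_id)
                    progress = True
```

| Events with no causal relationship are delivered in arrival order,
| which is exactly where two sides of a healed partition can end up
| with different interleavings.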
| duskwuff wrote:
| I'd maintain that an important property is for the system
| to be eventually consistent with regards to history. You
| don't want a transient network event to potentially
| result in two users permanently seeing messages in a
| different order.
| fiddlerwoaroof wrote:
| I don't think you can prevent that without centralizing
| on a single server
| lmm wrote:
| You can, but it results in the situation the article is
| complaining about.
|
| During a netsplit, people chatting on opposite sides of
| the netsplit continue to be able to chat (by design), but
| will (obviously) see a different history from each other.
| So when the netsplit heals, you have a dilemma: either
| you splice the history from the other side in, giving
| eventual consistency at the cost of changing the history
| that people have already read, or you keep permanently
| different histories on servers that were on one side or
| the other.
| mycall wrote:
| Split-brain scenarios can be resolved using an odd number of
| nodes (or voters) to achieve a majority consensus to agree on
| the state of the system, stopping the services on the
| minority side to prevent conflicting operations. Once
| communication is restored, the stopped nodes can rejoin the
| cluster and synchronize their data. Vector clocks are a great
| abstraction for ensuring correct ordering as well.
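|
| For reference, a small sketch of how two vector clocks are compared,
| assuming each clock is a dict mapping a node name to a counter. The
| function name compare is hypothetical. Note that vector clocks only
| give a partial order: concurrent events still need some separate
| tie-break rule.

```python
from typing import Dict


def compare(a: Dict[str, int], b: Dict[str, int]) -> str:
    """Classify clock a relative to clock b: 'before', 'after', 'equal' or 'concurrent'."""
    nodes = set(a) | set(b)
    # A missing entry counts as zero: that node has not been heard from yet.
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"
```

| Two messages sent on opposite sides of a split would typically come
| out as "concurrent", so the clocks alone cannot say which one to
| display first.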
| wolrah wrote:
| Perhaps I'm wrong about how Matrix works, but my
| understanding was that at least public rooms still had a
| "primary" homeserver, like for example I can connect to
| #debian:matrix.org from any number of federated servers but
| matrix.org is still where that room "lives".
|
| If that understanding is correct, then IMO the answer is
| simply that the canonical timeline is what that server says
| it is. Poorly connected users or those on other servers
| experiencing issues or delays with federation may temporarily
| see a different sequence of events but once everyone's had a
| chance to sync back up the state should generally be what the
| primary server for the room saw it as.
|
| Perhaps there should be some sort of flag for "this message
| has been reordered during a resync" that clients which
| initially had a different state due to whatever reason could
| store to make it clear what happened, and likewise if the
| central homeserver receives messages with a timestamp
| significantly off real time it could flag those messages as
| possibly having been received out of order while still
| displaying them in the order they were received.
| nine_k wrote:
| What's mysterious here? One ordering is dictated (arrival
| time), another left for the consideration of the server (likely
| allowing for stuff like pinned messages, etc, that break the
| strict ordering).
|
| If a Matrix server allows deleting messages (by the poster or
| by a moderator), then increasing IDs with _no gaps_ become
| impossible. If the server allows editing of existing messages,
| then a sequence with no gaps is not sufficient to reflect all
| changes. Ideally a server does not do either, but uses more
| messages to augment existing messages, or mark some as deleted;
| with that, a sequence with no gaps would suffice.
| throwaway14356 wrote:
| i had a hilarious argument with the significant other when my
| messages appeared as a very lame response to messages i didn't
| receive.
|
| i think the mental model should be what is most useful in court.
| if a netsplit occurs the state of the room doesn't exist anymore,
| conversation can continue but it should be a different room
| populated with working available clients. The main room can be
| restored and the missed convo can be a 3rd room
| Vanit wrote:
| I'm throwing some shade here, but this reeks of backend engineers
| not caring about UX.
| fsckboy wrote:
| this reeks of backend engineers not caring about UX designers
| who don't understand the problem while the UI designers who do
| understand are barred from attending meetings for bad behavior.
| I'm not throwing shade.
| shepherdjerred wrote:
| This sounds like a pretty good use case for a consensus algorithm
| like Paxos or Raft
| evilotto wrote:
| I think the most important property to preserve is causality;
| that is, if a user sends a message B after they have read
| (i.e., received) a message A, then B should come after A for
| everyone, because B depends on A. Basically use a Lamport
| clock.
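|
| A minimal Lamport clock sketch for the causality rule above. The
| class and method names are illustrative only, not from any Matrix
| implementation:

```python
class LamportClock:
    """Logical clock: a message sent after reading another always gets a larger timestamp."""

    def __init__(self) -> None:
        self.time = 0

    def send(self) -> int:
        """Tick before sending and return the timestamp to attach to the outgoing message."""
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        """Merge the timestamp of a message that was just read, then tick."""
        self.time = max(self.time, remote_time) + 1
        return self.time
```

| If user A sends with timestamp 1 and user B reads it before
| replying, B's reply is stamped 3, so sorting by timestamp puts the
| reply after the original for everyone. Unrelated messages can still
| tie, so a real system would also need a deterministic tie-break such
| as the sender ID.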
| purpleidea wrote:
| Those are CP which is impossible in a distributed messaging
| system where it has to obviously be AP. Otherwise you'd have to
| guarantee that everyone involved is always online (no
| partition) to make progress on sending messages!
|
| I think I have this right anyways. (CAP theorem for anyone
| curious.)
| timokoesters wrote:
| I'm the author of the spec issue this blog post is based on:
| https://github.com/matrix-org/matrix-spec/issues/852
|
| In my implementation for the Conduit Matrix server, the /sync
| order is used for everything. The timeline is just one list that
| grows on one end for incoming events and on the other end for
| backfilled events.
|
| I think it's important that the message order does not change,
| because that's very difficult to communicate to the user.
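|
| The "one list that grows on both ends" shape can be sketched with a
| double-ended queue. This is an illustration of the idea only, not
| Conduit's actual code, and the Timeline class and its method names
| are hypothetical:

```python
from collections import deque
from typing import Deque, List


class Timeline:
    """A room timeline that only ever grows at its two ends."""

    def __init__(self) -> None:
        self._events: Deque[str] = deque()

    def append_live(self, event_id: str) -> None:
        """A newly arrived event goes on the recent end."""
        self._events.append(event_id)

    def prepend_backfill(self, event_id: str) -> None:
        """An older event fetched via backfill goes on the far end."""
        self._events.appendleft(event_id)

    def snapshot(self) -> List[str]:
        return list(self._events)
```

| Because nothing is ever inserted in the middle, the relative order
| of events a user has already seen never changes.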
| Fizzadar wrote:
| Oh that's neat (TIL), am also working on a HS that also does
| this [1].
|
| Not only does it feel like the most correct (I don't think
| there is a perfect) behaviour for the user but also makes
| implementation much simpler. Synapse has a LOT of ordering foo
| and magic in the code I still don't fully understand and I've
| gone fairly deep into synapse at times for work.
|
| [1] https://github.com/Beeper/babbleserv
| Saris wrote:
| That's something that Telegram always seems to get right, I've
| never seen messages out of order in different clients, and if I
| do something like upload a video then immediately send more text
| messages before it's done, it will shove the video in between the
| messages where it should be when the upload is done.
|
| I know it's a much harder problem without a central server
| managing things. But consistency is very important for messages,
| out of order they could have a very different meaning and be very
| confusing.
___________________________________________________________________
(page generated 2024-12-05 23:00 UTC)