[HN Gopher] Falsehoods programmers believe about TCP
       ___________________________________________________________________
        
       Falsehoods programmers believe about TCP
        
       Author : todsacerdoti
       Score  : 140 points
       Date   : 2024-09-14 18:26 UTC (4 hours ago)
        
 (HTM) web link (lwn.net)
 (TXT) w3m dump (lwn.net)
        
       | saghm wrote:
       | > remember, all of the following statements are _false_ at least
       | some of the time, but for some of these, perhaps not very often
       | 
       | > 5. There is a such thing as a TCP packet
       | 
       | > 6. There is no such thing as a TCP packet
       | 
       | I don't understand this at all. Either the concept of a TCP
       | packet exists, or the concept does not exist. Even if it's not being
       | used in certain scenarios, I don't see how you can argue that
       | "there's no such thing" any of the time. This might just be me
       | misunderstanding whatever point they're trying to make, but I
       | don't remember ever having such philosophical confusion from
       | anything in any other "falsehoods programmers believe about..."
       | article before.
        
         | deathanatos wrote:
         | Yeah, I agree with you here.
         | 
         | I think the thing most related to that that I see people
         | thinking is that send(2) & recv(2) calls translate 1:1 with
         | packets sent/received. I.e., that they don't understand that the
         | interface TCP exposes to applications is a byte stream. Which
         | then results in things like thinking recv(2) will receive a
         | complete "message" for some definition of message in the
         | application protocol (i.e., the mistaken belief that
         | fragmentation won't happen).
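
A minimal sketch of the usual workaround for the byte-stream interface described above: the application adds its own framing. The 4-byte length prefix and the helper names (recv_exact, send_message, recv_message) are assumptions made for illustration, not something from the thread or from any particular protocol.

```python
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # The peer closed the connection before the full message arrived.
            raise ConnectionError("connection closed mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one message as a 4-byte big-endian length followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_message(sock: socket.socket) -> bytes:
    """Receive one complete message, however the stream was chunked in transit."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

With this in place the receiver gets whole application messages back regardless of how many recv(2) calls it takes to collect them.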
        
         | abnry wrote:
         | My only guess how this could make sense is if there is some
         | ambiguity in the definition of a TCP packet.
        
         | o11c wrote:
         | What it really means is: Packets have well-defined boundaries
         | between sufficiently-adjacent nodes. They are not guaranteed to
         | keep those boundaries end-to-end over arbitrary middleware.
        
         | marcosdumay wrote:
         | It's about your abstraction level and the kinds of problem you
         | are ignoring. It's true at the same time that you can't ignore
         | the problems of stream communication nor the problems of
         | package-based communication.
        
         | jpollock wrote:
         | If I had to guess, it would be an assumption that TCP was edge
         | to edge with no translation in the middle.
         | 
         | My guess is that this is talking about systems in the middle of
         | the network, changing (for example) their sizes by combining
         | and splitting packets to fit through various transits.
        
         | fracus wrote:
         | 5 and 6 are mutually exclusive. They don't make sense
         | logically. And most of the list was never explained at all.
        
           | ordu wrote:
           | _> They don't make sense logically_
           | 
           | In practice such situations can arise in one of two cases:
           | 
           | 1. some nonsense crept in
           | 
           | 2. logic is applied to a self-contradictory set of axioms and
           | definitions.
           | 
           | (1) is not very interesting, but (2) happens frequently
           | enough because people often do not try to formalize their
           | definitions and axioms. As a consequence they are using some
           | vague concepts and their statements are true in some cases
           | but not in others.
           | 
           | With all that said, I can propose a way this logical
           | nonsense could be _right_. (NB. I don't know if it applies
           | to TCP; I'm just thinking generally, as an example of all
           | those abstract words above.) The mistaken programmer's notion
           | of "existence" can be wrong. If we accept their definition of
           | existence, then TCP packets don't exist, but they exist in
           | some other sense.
        
             | astrobe_ wrote:
             | Yes. If one applies correctly the rules of logic on
             | inconsistent axioms, the conclusions will be inconsistent.
             | If one incorrectly applies logic to inconsistent axioms,
             | the conclusions may or may not be consistent. It happens
             | IRL sometimes; "being right for the wrong reasons". That
             | being said, I suspect the game of the author is to play
             | with leaky abstractions. TCP is a stream-oriented protocol,
             | but is implemented on top of frames etc.
        
           | alephnerd wrote:
           | The point is that a lot of stuff in Networking (and Computer
           | Engineering in general) is very context dependent, and that
           | you cannot be extremely opinionated about this stuff.
        
           | inopinatus wrote:
           | They are not mutually exclusive statements, because they
           | don't exist in isolation: they are both potentially true and
           | false depending on the context of discussion.
        
             | Feathercrown wrote:
             | But they assert whether or not something exists, as an
             | absolute statement. Maybe TCP packets don't exist in a
             | particular situation, but there is still such a thing as a
             | TCP packet in that case.
        
         | ooterness wrote:
         | Pedantically: TCP has segments, IP has packets, and Ethernet
         | has frames. They are one-to-one in simple cases, but not
         | always.
         | 
         | https://networkengineering.stackexchange.com/questions/50083...
         | 
         | In particular, fragmentation by intermediate routers means that
         | the sender and receiver may disagree about the frame and packet
         | boundaries. TCP is expected to make a "reliable" pipe-like
         | service out of whatever happens, and the application layer
         | doesn't have (shouldn't need?) visibility into that process.
        
           | LudwigNagasena wrote:
           | Falsehoods programmers believe: TCP/IP can be coherently
           | mapped to the OSI model.
        
             | Rauchg wrote:
             | Falsehoods programmers believe: the OSI model
        
               | fanf2 wrote:
               | Yep. https://dotat.at/@/2024-03-26-iso-osi-usw.html
        
               | devman0 wrote:
               | Falsehoods programmers believe: the OSI model is useless
        
               | akira2501 wrote:
               | It's useful. It's not controlling.
        
         | gwbas1c wrote:
         | > 6. There is _no such thing_ as a TCP packet
         | 
         | Because the software abstraction is a stream of bytes; and it's
         | up to the application to decide where the "packets" begin and
         | end.
         | 
         | For example, I might write to a TCP socket: 100 bytes, 50
         | bytes, and then 125 bytes.
         | 
         | BUT, the receiver could get: A single event with 275 bytes. Or
         | it could get an event with 75 bytes and then an event with 200
         | bytes. Or it could get 11 events of 25 bytes.
         | 
         | > 5. There is a such thing as a TCP packet
         | 
         | This one I struggle with. I think the author is talking about
         | connection set up, acking, and connection teardown.
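
The 100/50/125 example can be tried over loopback. This is only a sketch: the function name is made up for illustration, and on loopback the three writes will often arrive as one or two reads, so the only thing that can be relied on is the total byte count, not the individual read sizes.

```python
import socket


def three_writes_one_stream():
    """Write 100, 50 and 125 bytes; return the sizes the receiver actually read."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    sender = socket.create_connection(server.getsockname())
    receiver, _ = server.accept()

    for size in (100, 50, 125):
        sender.sendall(b"x" * size)
    sender.close()  # EOF lets the read loop below terminate

    read_sizes = []
    while True:
        chunk = receiver.recv(4096)
        if not chunk:
            break
        read_sizes.append(len(chunk))

    receiver.close()
    server.close()
    return read_sizes
```

The returned list may be [275], [100, 175], or some other split; write boundaries are not part of what TCP promises to the application.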
        
           | sixfiveotwo wrote:
           | If you have a look at the underlying network traffic, you'll
           | see IP packets carrying TCP data, i.e., the protocol field in
           | the IP packet header will be set to TCP; this could be
           | regarded as a TCP packet.
        
         | Terr_ wrote:
         | Perhaps it could be rescued by rephrasing them as "is always"
         | versus "is never"?
        
         | fanf2 wrote:
         | I guess it's talking about how the TCP data stream is segmented
         | into IP packets. From the IP point of view, there are packets;
         | from the application point of view there is a data stream; but
         | it's more complicated than that. Applications have some control
         | over when TCP's PSH flag is set, roughly speaking, at the end
         | of each write(); and that in turn affects segmentation because
         | small pushed writes cause small packets. But if the sender
         | can't send straight away then buffered data doesn't preserve
         | write() boundaries and will be sent with large packets.
        
       | richm44 wrote:
       | 1. A SYN will receive a SYN-ACK or a RST
       | 
       | 2. A host from my machine is the same as from your machine
       | 
       | 3. An IP from my machine is the same as from your machine
        
         | eptcyka wrote:
         | 1. A SYN may receive a SYN-ACK, RST or nothing at all.
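
A rough sketch of how the three outcomes look from application code. The function name probe_port and the one-second default timeout are assumptions for illustration; the "timeout" branch is what you would typically see when the SYN gets no answer at all, for example when a firewall silently drops it.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 1.0) -> str:
    """Classify what happened to our connection attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"  # SYN was answered with SYN-ACK
    except ConnectionRefusedError:
        return "refused"  # SYN was answered with RST
    except (socket.timeout, TimeoutError):
        return "timeout"  # no answer within the timeout
    except OSError as exc:
        return f"error: {exc}"  # e.g. network unreachable
```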
        
       | hinkley wrote:
       | I recall it blew my fiancee's mind that I could unplug her
       | ethernet cable, move it around an obstacle, plug it back in and
       | all her connections were still alive. It's designed to have bombs
       | dropped on it.
        
         | jancsika wrote:
         | What happens in that case? I'm going to speculate:
         | 
         | 1. Remote keeps sending stuff to your unplugged connection
         | 
         | 2. You plug your ethernet cable back in
         | 
         | 3. Your computer's TCP acknowledges the last sequence number it
         | received for each new sequence it receives from remote
         | 
         | 4. Remote sees duplicate ACKs for same sequence number,
         | interprets it as packet loss and resends the stuff
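
Steps 3 and 4 can be illustrated with a toy model. This is a deliberately simplified sketch: it numbers whole segments instead of bytes, ignores timers and windows, and the function names are made up. The threshold of three duplicate ACKs is the conventional fast-retransmit trigger.

```python
def cumulative_acks(arrivals):
    """Return the ACK (next expected segment number) sent after each arrival."""
    received = set()
    next_expected = 1
    acks = []
    for seg in arrivals:
        received.add(seg)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def fast_retransmit_target(acks, threshold=3):
    """Return the segment the sender would retransmit, or None."""
    last = None
    duplicates = 0
    for ack in acks:
        if ack == last:
            duplicates += 1
            if duplicates == threshold:
                return ack
        else:
            last, duplicates = ack, 0
    return None
```

If segment 3 is lost while 4, 5 and 6 arrive, the receiver keeps acknowledging "3 next", and the third duplicate prompts the sender to resend segment 3. If the whole burst is lost, there are no duplicate ACKs at all and only a retransmission timer can recover.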
        
           | IgorPartola wrote:
           | With timeouts. You can't unplug it for an hour and have this
           | happen. But a few seconds is exactly what this is designed
           | for. As another commenter pointed out, your OS could also try
           | to be "helpful".
        
           | toast0 wrote:
           | Yeah, that's pretty much it.
           | 
           | If packets were sent while you were disconnected, they'll be
           | gone, but if you're disconnected for only part of the burst,
           | duplicate ACKing will trigger retransmits.
           | 
           | If you were gone for the whole burst, you'll get put right by
           | timer based retransmits.
           | 
           | If you're gone for long enough, most peers will timeout on
           | unacknowledged data (although that's not in the TCP RFC), and
           | if there's no outstanding data, most peers eventually have
           | some sort of periodic ping and timeout (tcp keep-alives is a
           | reasonable fallback IMHO, if your application protocol
           | doesn't have something, although the default of IIRC 2 hours
           | feels long in today's world of lots of NATs and much shorter
           | timeouts).
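
For reference, a sketch of turning on TCP keep-alives with shorter timers from application code. SO_KEEPALIVE is widely available; the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options are platform-dependent (they exist on Linux), so they are guarded here. The function name and the specific values are assumptions, not recommendations.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Enable keep-alive probes; tune the timers where the platform allows it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds of idleness before the first probe is sent.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    # Seconds between probes once probing has started.
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    # Unanswered probes before the connection is considered dead.
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
```

With these example values a dead peer would be noticed after roughly idle + interval * count seconds instead of hours.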
        
         | toast0 wrote:
         | Depends on OS settings these days. Lots of OSes want to help
         | and detect link down and reset all your connections. Kind of a
         | pain when you just want to move a cable.
        
           | sgerenser wrote:
           | Like Chrome's oh so helpful ERR_NETWORK_CHANGED
        
             | Avamander wrote:
             | Like, I know?! Just reload the page already, I've told you
             | twice.
             | 
             | It's rather irritating.
        
             | plorkyeran wrote:
             | Also known as ERR_FUCK_YOU. Yes, I know that I'm connected
             | to a misbehaving wifi router. Just load the fucking page.
        
               | whaleofatw2022 wrote:
               | Loud
        
         | pjc50 wrote:
         | .. on Linux. If you do that on Windows the MAC will detect the
         | loss of link pulses, report the interface as down, and Windows
         | will "helpfully" reset all your TCP connections.
        
           | IshKebab wrote:
           | That seems like way more sensible behaviour.
        
             | Rygian wrote:
             | How so?
        
       | dtaht wrote:
       | It really is astounding to me how so many still do not understand
       | that tcp is not a function call, or behaviors like slow start and
       | congestion avoidance.
       | 
       | Recently a new rate limiter for TCP went by that was so terribly,
       | terribly broken, and I cannot help but imagine that most of the
       | containers of the world suffer from Bufferbloat in general.
        
         | dtaht wrote:
         | The rate limiter in question:
         | https://github.com/cilium/cilium/issues/29083
        
       | solatic wrote:
       | Related: you can get at most once delivery or at least once
       | delivery; you cannot get exactly once delivery. If I had a dollar
       | for every junior who thought that a lack of exactly once delivery
       | guarantees was a bug...
        
         | stanac wrote:
         | Yup, I try to explain it with shouting a message to someone in
         | a crowded room. You can yell at your boss "I fixed the bug",
         | they can confirm it or ignore you, which is delivery at most
         | once if you don't repeat the message. If you try to repeat the
         | message until they confirm it, it is at least once delivery.
         | 
         | edit: Point is in confirming that message is received. If you
         | don't receive the confirmation the message was delivered at
         | most once.
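
The crowded-room analogy can be written down as a small simulation. Everything here is hypothetical scaffolding: each attempt is scripted as a pair (request_arrives, ack_arrives) so the outcome is deterministic, and the class and function names are invented for the sketch.

```python
class Receiver:
    """Records every copy of a message that actually reaches it."""

    def __init__(self):
        self.deliveries = []

    def deliver(self, msg):
        self.deliveries.append(msg)


def send_at_most_once(msg, receiver, attempts):
    """Shout once and never repeat, whether or not a confirmation comes back."""
    request_arrives, _ack_arrives = attempts[0]
    if request_arrives:
        receiver.deliver(msg)


def send_at_least_once(msg, receiver, attempts):
    """Repeat until a confirmation is heard; returns True if one was."""
    for request_arrives, ack_arrives in attempts:
        if request_arrives:
            receiver.deliver(msg)
            if ack_arrives:
                return True
    return False
```

When the first confirmation is lost, the at-least-once sender repeats itself and the receiver hears the message twice; the at-most-once sender never repeats, so the receiver hears it zero times or once.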
        
         | bcoates wrote:
         | This is a popular saying that is basically wrong.
         | 
         | You have very limited guarantees around an arbitrarily bad
         | partition, but this is also a detectable condition. Lots of
         | defective systems exist, but in general non-defective systems
         | generally guarantee "exactly once delivery or detected failure"
        
         | lisper wrote:
         | If you can get at-least-once delivery, why can you not build
         | exactly-once on top of that?
        
           | to11mtm wrote:
           | Because 'exactly once' delivery is arguably a misnomer, you
           | usually _really_ want 'at least once delivery with acks and
           | idempotent processing on the other side'.
           | 
           | The difference is subtle but important in practice and
           | specification.
        
             | lisper wrote:
             | > you usually really want 'at least once delivery with acks
             | and idempotent processing on the other side'.
             | 
             | Why? I'm pretty sure I really want (the illusion of)
             | exactly-once delivery, and it seems to me that I can
             | implement that pretty easily given at-least-once delivery.
             | Why would I not want that?
             | 
             | > The difference is subtle but important
             | 
             | Why?
        
               | ZephyrBlu wrote:
               | > _I'm pretty sure I really want (the illusion of)
               | exactly-once delivery_
               | 
               | Do you know what idempotency is? This is exactly what he
               | described.
               | 
               | Idempotency is important to prevent unwanted behaviour
               | for duplicate actions. If you have "exactly-once", and
               | accidentally execute the action twice that could cause
               | problems.
        
           | shepherdjerred wrote:
           | You can get exactly once processing, but not exactly once
           | delivery.
           | 
           | https://bravenewgeek.com/you-cannot-have-exactly-once-delive...
        
       | KaiserPro wrote:
       | So TCP has slow start, and exponential fall off and shit.
       | 
       |  _but_ you can get round that in a lot of cases by just having a
       | load of TCP connections in parallel.
       | 
       | TCP is cheap and well optimised, especially if you are keeping a
       | bunch of connections open. (opening can be expensive)
       | 
       | so if you have a high latency connection, or a bit of packet
       | loss, and you want to reach line speed without having to figure
       | out cornercases with UDP, just open up 100-1k TCP connections and
       | multiplex them.
       | 
       | bish bash bosh, mostly line speed over a high latency line (mind
       | you this was in the days of 100m-500m cross atlantic internet,
       | you'll probably need more connections to saturate a 10gig line.)
        
         | nh2 wrote:
         | Such hack is often not necessary.
         | 
         | Set larger kernel TCP send and receive buffers and enable BBR
         | congestion control. Speed will usually be good also across high
         | latency links, and no multiplexing logic needed. Especially if
         | you control both sides of the connection.
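
A sketch of what this can look like per socket, with caveats. The buffer size to aim for is roughly the bandwidth-delay product. SO_SNDBUF and SO_RCVBUF requests are capped by kernel limits, and TCP_CONGESTION is a Linux option that only succeeds if the named algorithm is available to the process, so this is not a substitute for the system-wide settings mentioned above. The function names are assumptions.

```python
import socket


def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return bandwidth_bits_per_s * rtt_ms // 8000  # /8 bits per byte, /1000 ms per s


def tune_socket(sock, buf_bytes, congestion=b"bbr"):
    """Request larger buffers and, where possible, a congestion control algorithm."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buf_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buf_bytes)
    if hasattr(socket, "TCP_CONGESTION"):
        try:
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, congestion)
        except OSError:
            pass  # algorithm not loaded or not permitted; keep the default
```

For example, a 100 Mbit/s path with a 150 ms round trip needs about 1.9 MB in flight, which is far more than typical default buffer sizes.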
        
           | KaiserPro wrote:
           | > Set larger kernel TCP send and receive buffers and enable
           | BBR congestion control
           | 
           | I mean yeah, but that requires having access to the kernel
           | config. so for most people multiplexing TCP is a useful way
           | to maximise a link, without having to fiddle with stuff that
           | is a pain to deploy. (politically as well as logistically)
           | 
           | I deployed this "technique" before BBR was a thing. It worked
           | well enough for what I needed it to do (move large images
           | from London to California) It was pretty simple to engineer
           | as well (mainly because I didn't have to make a fancy custom
           | error detection/correction/rate limiting system over UDP )
        
       | koala_man wrote:
       | I find this "falsehoods programmers believe" format of making
       | pointed claims that you intentionally don't clarify to be
       | unhelpful and obnoxious
        
         | IAmNotACellist wrote:
         | The following list contains only falsehoods:
         | 
         | 1. You're wrong
         | 
         | 2. Okay, you're right
         | 
         | 3. Okay maybe you're right or wrong but certainly not both
        
           | jayd16 wrote:
           | Aha! You're null.
        
           | cubano wrote:
           | Well perhaps you're a photon and then you certainly are both.
        
         | efitz wrote:
         | You're doing it wrong.
         | 
         | "Falsehoods programmers believe..." articles are designed to
         | make you THINK about problematic assumptions. They are not like
         | the 10 commandments and they are not decrees of absolute truth.
        
           | Dylan16807 wrote:
           | Saying a thing and then saying the opposite, without
           | elaborating, is not good at making you THINK. This list is
           | doing it wrong.
        
             | Izkata wrote:
             | Yep, the original "names" one was mostly written so
             | negating each of the points gave you the exception you
             | needed to handle. Even the cases written with both were
             | done in a way it was obvious the negation didn't apply
             | universally, so both worked.
        
           | homebrewer wrote:
           | Now imagine how much time could have been saved globally if
           | one person spent half an hour writing a short description of
           | why each point is false instead of making hundreds (or
           | thousands) of people spend hours _thinking_ about and
           | researching every one of them. You're probably left with
           | more knowledge in the end if you're not spoon-fed by the
           | author, but how many of us need really deep knowledge of the
           | TCP inner workings?
        
           | ryandrake wrote:
           | I look at "Falsehoods programmers believe..." articles as a
           | good source of test cases. If I'm parsing a date (don't do
           | that), I'm going to look at "Falsehoods programmers believe
           | about dates" to help build out my list of unit tests for that
           | function. Same for names, street addresses and so on.
        
         | kranuck wrote:
         | Yeah I stopped with 5 and 6 and will never give the slightest
         | care what this person has to say ever again.
        
         | hnlmorg wrote:
         | I'm pretty sure originals that defined this format did have
         | examples and citations.
         | 
         | But I do agree that some of the later entries have felt a
         | little lazy.
        
           | 9dev wrote:
           | Right, I was about to comment that. One of the first ones I
           | remember was this one, about addresses[1]; or this one, about
           | names[2]. Both provide examples and information, which is the
           | only thing making the whole article useful.
           | 
           | [1]: https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/
           | 
           | [2]: https://shinesolutions.com/2018/01/08/falsehoods-programmers-believe-about-names-with-examples/
        
         | tenebrisalietum wrote:
         | Falsehoods falsehood-list makers believe.
         | 
         | 1. That said items are falsehoods in the first place.
         | 
         | 2. That said items are necessarily interesting or noteworthy.
         | 
         | 3. That a list is necessarily the best format to present said
         | items.
         | 
         | 4. That they may speak for the involved parties' beliefs.
        
         | IshKebab wrote:
         | I agree. This isn't even a good one of those lists. It's more
         | like "dubious pedantry to make me feel smart about my TCP
         | knowledge".
         | 
         | 1-4. Yes we know about the 2 generals problem. And yes we know
         | what "reliable" means in this context.
         | 
         | 5-6. This is just stupid.
         | 
         | 7. Obviously not true. Nobody thinks this.
         | 
         | 8-9. The reasons for and flaws of Nagle's algorithm are well
         | known.
         | 
         | 10. This isn't even true. Most of the time you _don't_ need to
         | care about it. That's the whole point of abstraction. You need
         | to care about it if you are doing extensive performance
         | optimisation, but usually you aren't.
         | 
         | 11. Again untrue. You _can_ think of TCP as a two way pipe.
         | Again that's the whole point of abstraction.
         | 
         | 12. Not sure exactly what they're trying to say here but again
         | it's very well known that TCP and UDP are pretty much the only
         | protocols that are likely to work on the internet.
         | 
         | 13. Ditto. We all know why so many protocols are "over HTTPS",
         | e.g. DoH.
         | 
         | 14. This isn't a technical point.
         | 
         | 15. Dunno what this is talking about but I'm guessing it's
         | along the lines of "a byte is 8 bits", i.e. it is actually true
         | in the modern world.
        
         | raggi wrote:
         | Yep, I work on low level networking software professionally and
         | this post is largely meaningless drivel and is probably
         | motivated by grandstanding.
         | 
         | It's like an engineer who says "how does a screen show black"
         | and then says "nope" to every response. It's maybe a way to
         | make people think, but beyond that the negativity and
         | grandstanding of it is ultimately a turn off for many receivers
         | which eventually either then has them bully others this way or
         | deters them from the field, depending on how it affects them.
         | There are far better teaching methods that work better for
         | everyone and teach faster and result in higher accuracy and
         | retention.
        
           | yazzku wrote:
           | The author probably doesn't understand the answers very well
           | themselves.
        
           | cubano wrote:
           | Uhhh....how _does_ the screen show black?
        
             | shermantanktop wrote:
             | Each of the pixels is actually a little shining eye which
             | watches your every move. When the pixel's eyelid closes,
             | that pixel turns black. That's why they call it putting a
             | display "to sleep."
        
         | numpad0 wrote:
         | I think the original "names" and subsequent "addresses" were
         | useful in that a conclusion (that programmers should embrace
         | defeatism and refrain from parsing or evaluating or even trying
         | to separate them into fields) can be drawn, and the lessons
         | learned were slightly more specific than often realized...
        
         | lovecg wrote:
         | I believe the article that started it all is
         | https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-...
         | - crucially every entry is self-explanatory, which is a point
         | that a lot of the subsequent "Falsehood..." list authors miss.
        
           | subarctic wrote:
           | Even that could benefit from including a counterexample in
           | almost every point.
        
         | gweinberg wrote:
         | This. If these lists contained things that programmers actually
         | believed and explained why they are false, they might actually
         | be useful. It's hard to imagine an unsupported assertion that
         | an ambiguous statement is "false" and yet its contradiction is
         | also "false".
        
         | adrianmonk wrote:
         | I don't think you're taking into account the context or
         | intended audience. It's a casual forum message posted in reply
         | to someone else's message.
         | 
         | They have not written a "falsehoods programmers believe"
         | article. They have proposed that one _ought to be_ written and
         | have given a starting point for what it might cover.
         | 
         | They offered their list to "get the ball rolling", confirming
         | that they don't see it as a finished product.
         | 
         | They sent it to other readers of the same forum, who might be
         | expected to have more knowledge of this topic, not to whoever
         | runs across it on the front page of HN.
        
       | poorman wrote:
       | > Explainer for 1-4:
       | https://en.wikipedia.org/wiki/Two_Generals%27_Problem. TL;DR: If
       | the connection breaks while an ACK is outstanding, the sender
       | will have no way of knowing whether the segment was received, and
       | this turns out to be an insoluble problem no matter how much
       | complexity you pile on top of it. You need something resembling
       | Paxos or Raft to get a guarantee like that
       | 
       | The hashgraph algorithm is pretty sweet too and doesn't have the
       | issue of a single write leader like Paxos and Raft. Basically
       | multi-writers / leaderless
       | 
       | https://www.swirlds.com/downloads/SWIRLDS-TR-2016-01.pdf
       | 
       | But to be fair, I'm not certain that CAP theorem and partition
       | tolerance really belong in a conversation about TCP anyway
        
       | kranuck wrote:
       | > 11. This is all low-level pedantry
       | 
       | Yeah pretty much.
       | 
       | maybe don't write contradictory unexplained nonsense.
        
       | dasyatidprime wrote:
       | To some of the critics here: did you or did you not notice the
       | "_Somebody ought to_ write one of those [...] Here, I'll even
       | _get the ball rolling_" framing? A polished such article this is
       | not claiming itself to be! I would go as far as saying the HN
       | submission title is misleading as a result.
        
       | hamilyon2 wrote:
       | I'll go out on a limb: inside datacenter on your own hardware,
       | you can safely ignore low-level pedantry and mostly ignore "weird
       | networks" and use TCP as two-way Unix pipe.
       | 
       | "Mostly" because you still care about bandwidth limits and packet
       | RPS limits and latency of course.
        
         | toast0 wrote:
         | I wouldn't, unless you've got a really solid understanding of
         | your datacenter network and it's 100% good all the time. Which
         | is unlikely, from my experience as a server person.
         | 
         | If you've got dirty optics between two switches, now you're
         | getting packet loss and TCP rears its head. Hopefully it's not
         | an issue now, but diagnosing microbursting[1] was lots of fun,
         | and really wigs TCP out. I've also run into 'fabric
         | congestion'. My true favorite though is when you've got 2x
         | aggregation on servers, and 4x aggregation for top of rack
         | switches to spine switches, so there's 8 paths in each
         | direction between two servers in adjacent racks, and only one
         | path (sometimes in only one direction) is only running at
         | 99.9%. That's a real PITA to track down unless you have
         | visibility into switching metrics.
         | 
         | [1] https://en.m.wikipedia.org/wiki/Micro-bursting_(networking)
        
           | macintux wrote:
           | I was always suspicious about self-hosted high availability
           | solutions (typically just diagrams, not yet implemented) that
           | included redundant switches.
           | 
           | Given how generally reliable switches are, I was inclined to
           | believe that a misconfiguration or flaky network cable on one
           | switch was more likely to cause a downtime (or significant
           | degradation) than an outright switch failure, so adding
           | another switch was doubling the chances of trouble and, as
           | you note, making it harder to troubleshoot.
        
             | toast0 wrote:
             | It kind of depends. You do get some weird stuff to debug,
             | and more connections = more likely that one of them is
             | broken.
             | 
             | Otoh, if you ever do any scheduled maintenance on your
             | switches (which is likely if they're doing anything fancy),
             | having properly setup redundancy means you can announce a
             | likely brief loss of redundancy, rather than a likely brief
             | full loss of connectivity. If you have the right knobs, you
             | can gracefully fail out the switch under maintenance and
             | everything goes smoothly. Of course, sometimes you reboot
             | the redundant switch and it confuses the other one and
             | servers lose connectivity anyway.
        
           | to11mtm wrote:
           | Agreed.
           | 
           | Having done Akka.NET Remote/Cluster setups in prod that
           | survived _multiple_ 'new to the org' categories of DC
           | Failures at their level of scale/capacity [0] there's a lot
           | to account for if you want to keep everything happy and
           | visible [1][2][3]
           | 
           | [0] - Cut fiber between DCs, Rack failures due to IO-ish type
           | issues, bad switches... at least 2 out of 3.
           | 
           | [1] - The upshot was we were able to survive all of the
           | scenarios in at worst a degraded state. Once or twice we
           | needed a restart.
           | 
           | [2] - We also had enough metrics going on that we could
           | detect DC/server outages about as quickly as whoever actually
           | was monitoring the failing subsystem.
           | 
           | [3] - But here's the funny rub. An APM tool was the Achilles
           | heel for both our Akka Links, as well as our SQLServer
           | connections. Once they installed an 'agent' we more
           | frequently had to do a 'full cycle' to clean things up after
           | an outage, or even an MSSQL Server reboot. After I left the
           | shop I got confirmation that yes, the APM module was the
           | problem.
        
       | grishka wrote:
       | This reminds me of a very particular problem that we tried to
       | solve when I worked at VKontakte. It was about instant messaging
       | and flaky mobile data connections.
       | 
       | The problem: you're on a subway train and you send a message as
       | it departs a station. The request does get to the server, but by
       | the time the response arrives, the train is already in the tunnel
       | and you don't have a signal any more. So the client thinks that
       | the message failed to send, but it was, in fact, sent
       | successfully. The client would retry when it's back online, and
       | would send another copy of that message.
       | 
       | The solution was to send a client-generated "random ID" with each
       | request. I much later learned that this is conventionally called
       | an "idempotency token". This worked, except there was now another
       | problem: you sometimes receive your own message over the long-
       | polling thing _before_ the response to the request that sent it.
       | You don't know for sure whether it's the message you just sent,
       | or something else sent by a different client on the same account,
       | because you don't know the ID of your message yet. This was
       | solved by me delaying the processing of outgoing messages on the
       | client side until all outstanding messages are fully sent and
       | their IDs are known.
       | 
       | Telegram solved this much more elegantly: when the client
       | reconnects to the server, the server sends it all the responses
       | that were not acknowledged during the previous connection.
       | MTProto has its own acknowledgement mechanism in addition to
       | TCP's.
       | 
       | So yeah, instant messaging seems trivial at the first glance, but
       | it turns out that TCP is a leaky enough abstraction that you need
       | to somehow plug those leaks at the application level.
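
The "random ID" approach described above can be sketched in a few lines. This is an illustration only, not VKontakte's or Telegram's actual implementation: the MessageServer class, the in-memory dictionary, and the scripted list of lost responses are all assumptions made to keep the example self-contained.

```python
import uuid


class MessageServer:
    """Stores messages and remembers which client-generated IDs it has seen."""

    def __init__(self):
        self.messages = []  # stored message texts, in order
        self.seen = {}      # random_id -> message id already assigned

    def send(self, random_id, text):
        if random_id in self.seen:
            # A retry of a request we already handled: no second copy.
            return self.seen[random_id]
        self.messages.append(text)
        message_id = len(self.messages)
        self.seen[random_id] = message_id
        return message_id


def send_with_retry(server, text, response_lost):
    """Retry until a response gets through, reusing one random ID throughout."""
    random_id = uuid.uuid4().hex  # generated once per logical message
    for lost in response_lost:
        message_id = server.send(random_id, text)
        if not lost:
            return message_id
    return None  # every response was lost; the caller may try again later
```

In the subway scenario the first response is lost, the client retries with the same ID, and the server hands back the original message ID instead of storing a duplicate.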
        
         | zmj wrote:
         | I had to deal with the second problem in a file synchronization
         | app. The solution was to propagate a "device id" through the
         | request and poll/push, so the originating device could ignore
         | changes that it originated.
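
A tiny sketch of the "device id" filter, assuming each change arrives as a dict with a device_id field; the field name and function name are hypothetical.

```python
def changes_to_apply(changes, my_device_id):
    """Drop changes that this device originated; keep everything else."""
    return [c for c in changes if c["device_id"] != my_device_id]
```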
        
       ___________________________________________________________________
       (page generated 2024-09-14 23:00 UTC)