[HN Gopher] What they don't teach you about sockets
___________________________________________________________________
What they don't teach you about sockets
Author : zdw
Score : 231 points
Date : 2022-07-25 15:15 UTC (1 day ago)
(HTM) web link (macoy.me)
(TXT) w3m dump (macoy.me)
| weeksie wrote:
| Ah man, sockets. Real sockets. Yesterday's kiosk thread jogged my
| memory about all that but almost all of the network communication
| on my old kiosk installations was socket based. It was a pleasant
| way to work.
| Nezghul wrote:
| I very often see "reconnect loops" in various codebases and I
| wonder whether they're really necessary. Wouldn't the same effect
| be achieved by, for example, increasing timeouts or some other
| connection parameter?
| pravus wrote:
| For TCP the state required to maintain the socket in the kernel
| is invalidated on error and needs to be reset. The only way to
| do this is to explicitly perform the connection setup again. An
| extended timeout only delays this process since the remote side
| will have invalidated its state as well.
|
| UDP packets require no connection but you still might see some
| sort of re-synchronization code to reset application state
| which could be called "reconnect".
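|
| A minimal sketch of such a reconnect loop in Go (the names, address
| and backoff values here are illustrative, not from any particular
| codebase):
|
|     package main
|
|     import (
|         "log"
|         "net"
|         "time"
|     )
|
|     // dialWithRetry keeps re-running connection setup, because once a
|     // TCP connection errors out the kernel state is gone and the only
|     // fix is a fresh connect().
|     func dialWithRetry(addr string) net.Conn {
|         backoff := time.Second
|         for {
|             conn, err := net.Dial("tcp", addr)
|             if err == nil {
|                 return conn
|             }
|             log.Printf("dial %s: %v; retrying in %v", addr, err, backoff)
|             time.Sleep(backoff)
|             if backoff < 30*time.Second {
|                 backoff *= 2 // exponential backoff, capped
|             }
|         }
|     }
|
|     func main() {
|         conn := dialWithRetry("example.com:9000")
|         defer conn.Close()
|     }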
| pgorczak wrote:
| They're a bit of a feature of the connection-oriented nature of
| TCP as the other reply mentions. If the server process crashes
| and restarts for example, the client will be told that its
| previous connection is not valid anymore. Basically TCP lets
| client and server assume that all bytes put into the socket
| after connect()/accept() will end up at the other side in that
| same order. Each time there is an error that violates that
| assumption, the connection needs to be explicitly "reset".
| dehrmann wrote:
| One of the big issues with TCP is that a lot of communication
| isn't well-suited to the stream model and is message-oriented, so
| yes, like the author says, you have to go implement ACKs. And then
| you want multiplexing, which streams also fail at. Before you know
| it, you've built a worse version of HTTP/2.
| jdthedisciple wrote:
| Naive question, but doesn't QUIC solve those problems?
| kevin_thibedeau wrote:
| SCTP solved them. We're just rarely allowed to use it because
| the internet is broken.
| lxgr wrote:
| Yes, it's a bummer that we have to encapsulate it in UDP,
| but when using it like that, it works quite well in my
| experience.
| cogman10 wrote:
| HTTP/2 solves these problems. HTTP/3 (QUIC) solves the head-of-
| line blocking problem of TCP. That is, a packet getting
| corrupted or lost causes TCP to hold back delivery of later
| packets until the missing one can be successfully
| retransmitted. So you may be multiplexing your messages, but
| you end up with a slowdown and a backlog of data that could
| have been successfully received and interpreted but for the
| lost packet.
| dcsommer wrote:
| QUIC is more or less the transport underneath HTTP/3 and
| solves the same problems, yes.
| iTokio wrote:
| And then you realize that sometimes it has become even worse
| due to TCP head-of-line blocking, so you move to UDP and build
| a worse version of HTTP/3.
| hinkley wrote:
| I think Head of Line blocking is a more concrete problem but
| it seems that it's often encountered when running away from a
| different one:
|
| If you have different conversations going on between a single
| client and the server, you can make several connections to do
| it. You'll pay a little bit more at the start of the
| conversation of course, so be frugal with this design, and
| think about ways to 'hide' the delay during bootstrapping.
| But know that with funky enough network conditions, it's
| possible for one connection to error out but the other to
| continue to work.
|
| The problem for me always comes back to information
| architecture, and the pervasive lack of it, or at least lack
| of investment in it. If you really have two separate
| conversations with the server, losing one channel shouldn't
| make the whole application freak out. But we all know people
| take the path of least resistance, and soon you have 2.5
| conversations on one channel and 1.5 on the other.
|
| The advice here is some of the same advice in more
| sophisticated treatises on RESTful APIs (it's all distributed
| computing, it's all the same problem set arranged in
| different orders). REST calls are generally supposed to be
| stateless. The client making what looks like a non sequitur
| call to the backend should Just Work. If you manage that,
| then having one channel inform the client of the existence of
| a resource and fetching that resource out of band isn't
| really coupling of the two channels. That subtlety is lost on
| some people who defend their actual coupling by pointing out
| that other people have done it too, when no, they really
| haven't. And if anything, multiplexing lets them get away
| with this bad behavior for much longer, allowing the bad
| patterns to become idiomatic.
| danesparza wrote:
| If you insist on passing messages, use a well-designed message
| queue service and don't reinvent the wheel (but just a little
| shittier).
| pclmulqdq wrote:
| A message queue service, or an RPC stack, adds a tremendous
| amount of overhead to a system. This is part of why computers
| are 200x faster than they were 20 years ago, but the
| performance feels the same.
|
| HTTP works reasonably well on TCP, but a lot of what we want
| to do is better suited to a reliable UDP protocol.
| Unfortunately, routers often balk at UDP packets, so TCP it
| is.
| gopalv wrote:
| > use a well-designed message queue service
|
| There is an FAQ about ZeroMQ written in the Monty Pythonesque
| fashion of "other than that, what does ZeroMQ do for us?"
|
| I can't quite find it in my bookmarks, but it went along the
| lines of "so it gives you sockets", "but also message batching"
| ... "but other than batching, what does it give us?".
|
| Also, the whole problem with using just TCP is that often it
| needs kernel level tuning - like you need to fix INET MSL on
| some BSD boxes to avoid port exhaustion or tweak DSACK when
| you have hanging connections in the network stack (like S3
| connections hanging when SACK/DSACK is on).
|
| A standard library is likely to have bugs too, but hopefully
| someone else has found them before you run into them.
| jchw wrote:
| > I will also ask: how often are you writing applications that
| want to accept either files or sockets?
|
| In Go, where there is an io.Reader/io.Writer abstraction in
| which blocking interactions are OK and you're absolutely
| intended to handle all of the errors, it's really no problem at
| all to use a socket where you'd use a file.
|
| Unfortunately, this only works because the abstraction handles it
| well. You can't really do custom things with file descriptors, so
| the amount of useful things you can do by treating sockets as
| files is quite limited (though it certainly exists.)
|
| (I was going to say "TLS for example" though come to think of it
| this isn't even strictly true under Linux;
| https://docs.kernel.org/networking/tls.html )
|
| Still, having TCP sockets and files at the same abstraction layer
| in general isn't all that bad; you should consider that writes to
| the filesystem can fail and do take time when blocking. When done
| right, this makes it much easier for apps to be transparent to the
| network and other media where that's possible.
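|
| A small Go sketch of what I mean (the path and host are
| placeholders): the same consumer code runs unchanged against a file
| or a socket, because both satisfy io.Reader and both report
| failures through the same error value.
|
|     package main
|
|     import (
|         "fmt"
|         "io"
|         "net"
|         "os"
|     )
|
|     // countBytes neither knows nor cares whether r is a file, a
|     // socket, or an in-memory buffer.
|     func countBytes(r io.Reader) (int64, error) {
|         return io.Copy(io.Discard, r)
|     }
|
|     func main() {
|         if f, err := os.Open("/etc/hosts"); err == nil {
|             n, _ := countBytes(f)
|             fmt.Println("file bytes:", n)
|             f.Close()
|         }
|         if c, err := net.Dial("tcp", "example.com:80"); err == nil {
|             fmt.Fprint(c, "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
|             n, _ := countBytes(c)
|             fmt.Println("socket bytes:", n)
|             c.Close()
|         }
|     }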
| jcranmer wrote:
| > (I was going to say "TLS for example" though come to think of
| it this isn't even strictly true under Linux;
| https://docs.kernel.org/networking/tls.html )
|
| TLS isn't that much harder to handle than regular TCP sockets--
| you're going to need a socket interface that lets you get
| reader and writer streams, and the extra roundtrips you need
| for negotiation are handled in the constructor for the TLS
| socket.
|
| It is more difficult if you want to support more advanced
| features of TLS, or especially if you want to support something
| like STARTTLS (negotiate TLS on an already-open socket). But
| this is already kind of true for sockets in general: the
| reader/writer abstraction breaks down relatively quickly if you
| need to do anything smarter than occasionally-flushed streams.
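|
| For instance, in Go's standard library the handshake round trips
| are hidden inside the constructor, and the STARTTLS-style upgrade
| on an already-open socket is just a different constructor (the host
| here is a placeholder):
|
|     package main
|
|     import (
|         "crypto/tls"
|         "fmt"
|         "net"
|     )
|
|     func main() {
|         // The handshake happens inside Dial; afterwards the
|         // connection is just another reader/writer stream.
|         conn, err := tls.Dial("tcp", "example.com:443", &tls.Config{})
|         if err != nil {
|             panic(err)
|         }
|         fmt.Fprint(conn, "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
|         conn.Close()
|
|         // STARTTLS-style: wrap a socket that is already open.
|         raw, err := net.Dial("tcp", "example.com:443")
|         if err != nil {
|             panic(err)
|         }
|         up := tls.Client(raw, &tls.Config{ServerName: "example.com"})
|         if err := up.Handshake(); err != nil {
|             panic(err)
|         }
|         up.Close()
|     }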
| jchw wrote:
| I think my original post was fairly unclear about what I meant.
|
| See, what I meant was this. File descriptors are an OS
| abstraction; the "backends" you get are defined in the
| kernel. You _can't really_ do custom behavior with FDs; for
| example, in Go, the Go TLS library can open a connection and
| return an io.Writer, and when you write, it will be
| symmetrically encrypted, transparently to you, as if the code
| spoke TLS. But when you're dealing with raw file descriptors
| and the read and write syscalls, there's no way to make
| 'custom' read and write handlers like you can with
| programming-language abstractions.
|
| (I do acknowledge that you could in fact do some of this with
| pipes, though I have seldom seen it used this way outside of
| shell programming. It's kinda missing the point, anyway,
| since pipes are just another type of fd. You can do pipes on
| top of Go's abstraction too, but it would be very cumbersome
| in comparison.)
|
| But as a kind of quirk, Linux actually _does_ support TLS
| sockets. It will not do the handshake, so you still have to
| do that in userspace. But if you use the Linux kernel's TLS
| socket support, it will in fact give you an FD that you can
| read from and write to directly with transparency, as if it
| was any other file or socket; you don't have to handle the
| symmetric cryptography bits or use a separate interface. I
| think this is rather neat, although I'm not sure how
| practically useful it is. (Presumably it would be useful for
| hardware offloading, at least.)
| jerf wrote:
| "Unfortunately, this only works because the abstraction handles
| it well. You can't really do custom things with file
| descriptors, so the amount of useful things you can do by
| treating sockets as files is quite limited"
|
| In general, most code has a clear initialization step where it
| sets up everything it wants to set up; then you can pass it to
| something that only expects an io.Reader/io.Writer and it can
| operate. I have a couple of places where one way of getting
| something does an HTTP request, another way opens an S3 file
| for writing, and yet another way opens a local disk file. Each
| of them has its own completely peculiar associated
| configuration and errors to deal with, but once I've got the
| stream cleanly I pass off an io.WriteCloser to the target
| "payload" function.
|
| If you're doing _super_ intense socket stuff you may need to
| grow the interface, or even just plain code against sockets
| directly. But most application-level stuff, even complicated
| stuff like "Maybe I'm submitting a form and maybe I'm writing
| to S3 and maybe I'm writing to disk and maybe I'm writing to a
| multiplexed socket and maybe I'm doing more than one of these
| things at once" can be cleanly broken into a "initialize the
| io.Reader/io.Writer, however complicated it may be" phase and a
| "use the io.Reader/io.Writer in another function that doesn't
| have to worry about the other details" phase. It is also highly
| advantageous to be able to pass a memory buffer to the latter
| function to test it without having to also try to figure out
| how to fake up a socket or a file or whatever.
|
| People don't write applications that accept either files or
| sockets because in most languages there is one impediment or
| another to the process; a system that _almost_ makes file-like
| objects share an interface but in practice not really,
| libraries that force you to pass them strings rather than file-
| like objects, etc. While it isn't attributable to Go _qua_ Go,
| by getting it right in the standard library early Go
| legitimately is _really_ good at this sort of thing, more
| because the standard library set the tone for the rest of the
| ecosystem than because of any unique language features. I hear
| Rust is good too, which I can easily believe. Every time I try
| to use Python to do this, I'm just saddened; it _ought_ to
| work, it _ought_ to be easy, but something always goes wrong.
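|
| A tiny sketch of that split, with made-up names: writePayload is
| the "use the io.Writer" phase, and in a test a bytes.Buffer stands
| in for the socket, file, or S3 object.
|
|     package main
|
|     import (
|         "bytes"
|         "fmt"
|         "io"
|         "os"
|     )
|
|     // writePayload knows nothing about where the bytes end up.
|     func writePayload(w io.Writer, records []string) error {
|         for _, r := range records {
|             if _, err := fmt.Fprintln(w, r); err != nil {
|                 return err
|             }
|         }
|         return nil
|     }
|
|     func main() {
|         records := []string{"a", "b", "c"}
|
|         // In a test: a memory buffer, no socket or file to fake up.
|         var buf bytes.Buffer
|         _ = writePayload(&buf, records)
|         fmt.Print(buf.String())
|
|         // In production: whatever io.WriteCloser the init phase
|         // built (an HTTP body, an S3 upload, a local file, ...).
|         f, err := os.CreateTemp("", "payload")
|         if err != nil {
|             return
|         }
|         defer f.Close()
|         _ = writePayload(f, records)
|     }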
| jchw wrote:
| The way I see it, Go did two things that made it work well:
|
| - Have a very simple, fairly well-defined interface for
| arbitrary read/write streams. This interface needs to have
| decent characteristics for performance, some kind of error
| handling, and a way to deal with blocking. Go's interface
| satisfies all three.
|
| - Have a good story for async in general. It's not really
| helpful to have an answer for how to deal with I/O blocking
| if the answer is really crappy and nobody actually wants to
| use it. A lot of older async I/O solutions felt very much in
| this camp.
|
| I think that Rust does a pretty decent job, though I'm a
| little bearish on their approach to async. (Not that I have
| any better ideas; to the contrary, I'm pretty convinced that
| Rust async is necessarily a mess and there's not much that
| can be done to make it significantly better.)
|
| But I think you can actually do a decent job of this even in
| traditional C++, in the right conditions. In fact, I used
| QIODevice much the way one would use io.Reader/io.Writer in
| Go. It was a little more cumbersome, but the theory is much
| the same. So I think it's not necessarily that Go did
| something especially well, I think the truth is actually
| sadder: I think most programming languages and their standard
| libraries just have a terrible story around I/O and
| asynchronicity. I do think that the state of the art has
| gotten better here and that future programming languages are
| unlikely to leave this problem unaddressed. So at least
| there's that.
|
| The truth is that input and output is unreliable, limited, and
| latent by its very nature. You can ignore that for disk
| because it's _relatively_ fast. But at the end of the day,
| the bytes you're writing to disk need to go through the
| user/kernel boundary, possibly a couple times, to the
| filesystem, most likely asynchronously out of the CPU to the
| I/O controller, to the disk which likely buffers it, and then
| finally from the disk's buffers to its platters or cells or
| what have you. That's a lot of stuff going on.
|
| I think it's fair to say that "input and output" in this
| context means "anything that goes out of the processor." For
| example, storing data in a register would certainly not be
| I/O. Memory/RAM is generally excluded from I/O, because it's
| treated as a critical extension of the CPU and sometimes
| packaged with it anyway; it's fair for your application (and
| operating system) to crash violently if someone unplugs a
| stick of RAM.
|
| But that reality is not extended almost anywhere else. USB
| flash drives can be yanked out of the port at any time, and
| that's just how it goes; all buffers are going to get dropped
| and the state of the flash drive is just whatever it was when
| it was yanked, roughly. USB flash drives are not a special
| case. Hell, you can obviously hotplug ordinary SSDs and HDDs,
| too, even if you wouldn't typically do so in a home computer.
|
| So is disk I/O seriously that different from network I/O?
| It's "unreliable" (relative to registers or RAM). It's "slow"
| (relative to registers or RAM). It has "latency" (relative to
| registers or RAM). The difference seems to be the degree of
| unreliability and the degree of latency, but still. Should
| you treat a `write` call to disk differently than a `write`
| call to the network? I argue not very much.
|
| I don't really know 100% why the situation is bad with
| Python, but I can only say that I don't really think it
| should've been. Of course, hindsight is 20:20. It's probably
| a lot more complicated than I think.
| deathanatos wrote:
| > _How you handle a file no longer existing vs. a socket
| disconnection are not likely to be very similar. I'm sure I'll
| get counter arguments to this,_
|
| That's my cue! I think the "everything is a file" idea is somewhat
| misunderstood. I might even rephrase it as "everything is a file
| descriptor" first, but then if you need to give a _name_ to it,
| or ACL that thing, that's what the file-system is for: all the
| objects share a common hierarchy, and different things that
| need names can be put in, say, the same directory. I.e., there
| is _one_ hierarchy for pipes, devices, normal files, etc.
|
| I'd argue that the stuff that "is" a file (named or anonymous) or
| a file descriptor is actually rather small, and most operations
| are going to require knowing what kind of file you have.
|
| E.g., in Linux, read()/write() behave differently when operating
| on an eventfd: read() requires a buffer of at least 8 B.
|
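| A Linux-only sketch of that quirk, using the golang.org/x/sys/unix
| package (the package choice is just an assumption for the example):
| the fd reads and writes, but only in 8-byte units.
|
|     package main
|
|     import (
|         "fmt"
|
|         "golang.org/x/sys/unix"
|     )
|
|     func main() {
|         fd, err := unix.Eventfd(0, 0)
|         if err != nil {
|             panic(err)
|         }
|         defer unix.Close(fd)
|
|         // The counter value is a native-endian uint64 (1 here, on a
|         // little-endian machine).
|         one := []byte{1, 0, 0, 0, 0, 0, 0, 0}
|         if _, err := unix.Write(fd, one); err != nil {
|             panic(err)
|         }
|
|         small := make([]byte, 4)
|         _, err = unix.Read(fd, small)
|         fmt.Println("4-byte read:", err) // fails: buffer too small
|
|         buf := make([]byte, 8)
|         n, err := unix.Read(fd, buf)
|         fmt.Println("8-byte read:", n, err, buf)
|     }
|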
| Heck, "close" might really be the only thing that makes sense,
| generically. (I'd also argue that a (fallible[1]) "link" call
| should be in that category, but alas, it isn't on Linux, and
| potentially unlink for similar reasons -- i.e., give and remove
| names from a file. I think this is a design mistake personally,
| but Linux is what it is. What I think is a bigger mistake is
| having separate/isolated namespaces for separate types of
| objects. POSIX, and hence Linux, makes this error in a few
| spots.)
|
| But if you're just writing a TCP server, yeah, that probably
| doesn't matter.
|
| > _and that you should write your applications to treat these the
| same._
|
| But I wouldn't argue that. A socket likely is a socket for some
| very specific purpose (whatever the app _is_, most likely), and
| that specific purpose is going to mean it gets treated as that.
|
| In OOP terms, just b/c something might _have_ a base class
| doesn't mean we always treat it as only an instance of the base
| class. Sometimes the stuff on the derived class is _very_
| important to the function of the program, and that's fine.
|
| [1] e.g., in the case of an anonymous normal file (an unlinked
| temp file, e.g., O_TMPFILE), attempting to link it on a different
| file-system would have to fail.
| quickthrower2 wrote:
| Regarding his original expectations of TCP: even if they were
| true, is there much difference between dropped data and data
| being delivered hours late? I imagine at the app level you would
| be suspicious of any message that was sent 12 hours ago but kept
| in a queue.
|
| I imagine if that scenario is OK, you would explicitly use a
| queue system.
| neilobremski wrote:
| This article may as well have been about how writing to file
| streams does not block until a read occurs. Maybe the
| documentation (on sockets?!) could be clearer, but at some point
| more words don't help with conceptual understanding.
| hinkley wrote:
| The first bit of real code I ever wrote was an SO_LINGER bug fix
| for a game that couldn't restart if users had disconnected due to
| loss of network.
|
| Then I had to explain it to several other people who had the same
| problems. Seems a lot of copy and paste went on among that
| community.
| andai wrote:
| I played around with making a little websocket layer for browser
| games a while back. I used a Windows tool called "clumsy" to
| simulate crappy network conditions and was surprised at how poor
| the results were with just WebSockets, despite all the overhead
| of being a protocol on top of a protocol. The result is that you
| need to build a protocol on top of the protocol on top of the
| protocol if you actually want your messages delivered...
| swagasaurus-rex wrote:
| I built a javascript data synchronization library specifically
| for games
|
| https://github.com/siriusastrebe/jsynchronous
|
| A core part of the library is the ability to re-send data
| that's been lost due to connection interruption. Absolutely
| crucial for ensuring data is properly synchronized.
| tuukkah wrote:
| That's because WebSockets are more or less just sockets for web
| apps. You'll want to use a protocol that deals with messages
| and their at-least-once delivery, such as MQTT (which can run on
| top of WebSockets if you need it).
| gwbas1c wrote:
| > If I send() a message, I have no guarantees that the other
| machine will recv() it if it is suddenly disconnected from the
| network. Again, this may be obvious to an experienced network
| programmer, but to an absolute beginner like me it was not.
|
| Uhh, that's 100% obvious. That's why it's not taught.
|
| Of course, I've also goofed on things that are obvious to other
| people. But, c'mon, TCP isn't magic.
| dragon96 wrote:
| It could be obvious, but no need to be condescending about it.
|
| There are many statements that fit into the category of
| "obvious once stated, but not obvious if you didn't consider
| the distinction to begin with".
| tptacek wrote:
| It's not 100% obvious. The mental model where send() blocks
| until recv() on the other side confirms it is coherent: the
| receiver sends ACKs with bumped ack numbers to acknowledge the
| bytes it's received, and could delay those ACKs until the
| application has taken the bytes out of the socket buffer. It
| doesn't work that way, of course, and shouldn't, but it could.
| hinkley wrote:
| If it were so obvious we wouldn't have so many concurrency bugs
| that appear time and again in new programs. If it's not network
| flush it's file system flush.
| jsmith45 wrote:
| > Uhh, that's 100% obvious. That's why it's not taught.
|
| Is it? You probably feel that way from knowing how TCP works.
| But it would be quite straightforward to make it true with a
| slightly modified version of TCP (that acknowledges all
| packets, rather than every second one) by having send() block
| until it receives back the ACK from the receiver. (Yeah this
| would kill transmission rates, but it would function!) And
| furthermore, while it would be a terrible idea, the ACK could
| even be delayed until the end application makes the recv() call
| for the packet.
|
| To somebody not familiar with the details (and why this would
| be a terrible tradeoff), something along those lines would be
| entirely plausible.
| [deleted]
| 10000truths wrote:
| Perhaps there should exist a flag for send() that would make it
| so that it doesn't return until all data in the send() call has
| been ACKed by the receiving side (with a user configurable
| timeout).
|
| Of course, it's still not bulletproof. The other side can receive
| the packets, stuff them in its receive buffer, send an ACK for
| those packets, and then fail before draining the receive buffer
| due to an OS crash or hardware failure. But computers and
| operating systems tend to be much more reliable than networks, so
| it would still provide a much stronger guarantee of delivery or
| failure.
| tenebrisalietum wrote:
| There's a difference between an ACK from the remote network
| stack (yeah, we got your packets, they're waiting in line) and
| an ACK from the application (yeah, app X processed your request
| composed of 1 or more packets).
|
| Compare with the classic OS optimization for spinning rust hard
| drives - write system calls will return immediately, but actual
| write requests to the hardware will occur sometime later. It's
| assumed most of the time your computer doesn't lose power, but
| that does happen sometimes, hence journaling.
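|
| The file-side analogue of an application-level ACK is an explicit
| fsync; in Go, for instance (the filename is illustrative):
|
|     package main
|
|     import (
|         "log"
|         "os"
|     )
|
|     func main() {
|         f, err := os.Create("journal.log")
|         if err != nil {
|             log.Fatal(err)
|         }
|         defer f.Close()
|
|         if _, err := f.Write([]byte("record\n")); err != nil {
|             log.Fatal(err)
|         }
|         // A successful Write only means the kernel has the bytes.
|         // Sync (fsync) blocks until the data has been pushed to
|         // stable storage - the part the write() return value never
|         // promised.
|         if err := f.Sync(); err != nil {
|             log.Fatal(err)
|         }
|     }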
| noselasd wrote:
| Much stronger but of limited use still, for the reasons you
| listed.
|
| You very rarely care that the remote TCP/IP stack has acked the
| message; you care that the message has been received by the
| program and processed. You're better off implementing your own
| acks in those cases, allowing you to report back any errors in
| that ack as well. Or you don't really care, and can just fire
| and forget those messages.
|
| Doing your own acks also allows you to implement a system where
| you can pipeline messages - waiting for a remote ack while
| allowing just 1 message in flight limits your throughput
| severely.
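|
| A sketch of what such an application-level ack can look like in Go
| (the "ACK <id>" wire format and the address are made up for the
| example; a real pipelined version would keep one reader and match
| ids asynchronously instead of waiting per message):
|
|     package main
|
|     import (
|         "bufio"
|         "fmt"
|         "net"
|         "time"
|     )
|
|     // sendWithAck returns only after the peer says it has processed
|     // the message, not merely after the local TCP stack accepted it.
|     func sendWithAck(conn net.Conn, r *bufio.Reader, id int, msg string) error {
|         if _, err := fmt.Fprintf(conn, "%d %s\n", id, msg); err != nil {
|             return err
|         }
|         conn.SetReadDeadline(time.Now().Add(5 * time.Second))
|         reply, err := r.ReadString('\n')
|         if err != nil {
|             return err
|         }
|         var ackID int
|         if _, err := fmt.Sscanf(reply, "ACK %d", &ackID); err != nil || ackID != id {
|             return fmt.Errorf("unexpected reply %q", reply)
|         }
|         return nil
|     }
|
|     func main() {
|         conn, err := net.Dial("tcp", "example.com:9000")
|         if err != nil {
|             panic(err)
|         }
|         defer conn.Close()
|         r := bufio.NewReader(conn)
|         if err := sendWithAck(conn, r, 1, "hello"); err != nil {
|             panic(err)
|         }
|     }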
| Uptrenda wrote:
| It seems like the OP is mostly talking about 'blocking' sockets.
| Such sockets return when they're ready or there's an error. So
| send returns when it has passed off its data to the network
| buffer (or, if the buffer is full, it will wait until it can pass
| off SOME data). You might think that sounds excellent - but from
| memory, send may not send all of the bytes you pass to it. So if
| you want to send out all of a given buffer with blocking sockets,
| you really need to write a loop that implements send_all, keeping
| a count of the number of bytes sent or quitting on error.
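|
| Roughly what that send_all loop looks like if you drop down to the
| raw write(2) level (sketched in Go here purely for illustration;
| Go's own net.Conn.Write already does this looping for you):
|
|     package main
|
|     import (
|         "fmt"
|         "syscall"
|     )
|
|     // sendAll keeps calling write(2) until the whole buffer has been
|     // handed to the kernel, since one blocking send/write may accept
|     // only part of it.
|     func sendAll(fd int, buf []byte) error {
|         for len(buf) > 0 {
|             n, err := syscall.Write(fd, buf)
|             if err == syscall.EINTR {
|                 continue // interrupted by a signal, nothing written
|             }
|             if err != nil {
|                 return err
|             }
|             buf = buf[n:] // only n bytes were accepted, retry the rest
|         }
|         return nil
|     }
|
|     func main() {
|         // Demo on a Unix socketpair so the example is self-contained.
|         fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
|         if err != nil {
|             panic(err)
|         }
|         defer syscall.Close(fds[0])
|         defer syscall.Close(fds[1])
|         if err := sendAll(fds[0], []byte("hello")); err != nil {
|             panic(err)
|         }
|         fmt.Println("sent")
|     }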
|
| Blocking sockets are kind of shitty, not gonna lie. The
| counterpart to send is recv. Say you send an HTTP request to a
| web server and you want to get a response. With a blocking socket
| it's quite possible that your application will simply wait
| forever. I am pretty sure the default 'timeout' for blocking
| sockets is 'None', so it just waits for success or failure. So a
| shitty web server can make your entire client hang.
|
| So how to solve this?
|
| Well, you might try setting a 'timeout' for blocking operations,
| but this can also screw you. Any thread that calls that
| blocking operation is going to hang for that entire time. Maybe
| that is fine for you -- should you design your program to be
| multi-threaded and hand off sockets so they can wait like that --
| and that is one such solution.
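|
| For what it's worth, in Go that per-operation timeout escape hatch
| looks something like this (the host is a placeholder):
|
|     package main
|
|     import (
|         "bufio"
|         "fmt"
|         "net"
|         "time"
|     )
|
|     func main() {
|         conn, err := net.DialTimeout("tcp", "example.com:80", 5*time.Second)
|         if err != nil {
|             panic(err)
|         }
|         defer conn.Close()
|
|         fmt.Fprint(conn, "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
|
|         // Without a deadline, a silent server means this read blocks
|         // forever.
|         conn.SetReadDeadline(time.Now().Add(10 * time.Second))
|         line, err := bufio.NewReader(conn).ReadString('\n')
|         fmt.Printf("first line: %q, err: %v\n", line, err)
|     }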
|
| Another solution -- and this is the one the OP uses -- is to use
| the 'select' call to check whether a socket operation will
| 'block.' I believe it works for 'reading' and 'writing.' But wait
| a minute. Now you've got to implement some kind of performant
| loop that periodically checks on your sockets. This may sound
| simple, but it's actually the subject of whole research projects
| to build the most performant loops possible. Now we're really
| talking about event loops here and how to build them.
|
| So how to solve this... today... for real-world uses?
|
| Most people are just going to want to use asynchronous I/O. If
| you've never worked with async code before: it's a way of doing
| event-based programming where you can suspend execution of a
| function if an event isn't ready. This allows other functions to
| 'run.' Note that this is a way to do 'concurrency' -- switching
| between multiple tasks. A good async library may or may not also
| be 'parallel' -- able to execute functions simultaneously
| (as on multiple cores).
|
| If we go back to the idea of the loop and using 'select' on our
| socket descriptors: this is really like a poor person's async
| event loop. It can easily be implemented in a single thread, on a
| single core. But again -- for modern applications -- you're going
| to want to stay away from using the raw socket functions and go
| for async I/O instead.
|
| One last caveat to mention:
|
| Network code needs to be FAST. Not all software that we write
| needs to run as fast as possible. That's just a fact, and indeed
| many warn against 'premature optimization.' I would say this
| advice doesn't hold up well for network code. It's simply not
| acceptable to write crappy algorithms that add tens of
| milliseconds or nanoseconds to packet delivery time if you can
| avoid it. It can actually add up to cost a lot of money and make
| certain applications impossible.
|
| The thing is, though -- profiling async code can be hard, and
| profiling network code even harder. A network is unreliable, and
| to measure the run-time of code you only care about how the code
| performs when it's successful. So you're going to want to find
| tools that let you throw away erroneous results and measure how
| long 'coroutines' actually run for.
|
| Async network code may use non-blocking sockets, select, and
| poll under the hood. But these frameworks are designed to be as
| efficient as possible. So if you have access to them, they're
| probably what you want to use!
| PeterWhittaker wrote:
| select(...)
|
| ?
|
| Why not EPOLLRDHUP ?
| larsonnn wrote:
| The title is confusing. I did learn it that way...
| TYPE_FASTER wrote:
| What really helped me understand and troubleshoot network
| communications at the socket level was the book by W. Richard
| Stevens[0]. I think this is because he starts with an
| introduction to TCP, and builds from there. Knowing TCP states
| and transitions is important to reading packet dumps, and in
| general debugging networking.
|
| Also, no matter what platform you're using, there is probably a
| socket implementation somewhere at the core. You're best off
| understanding how sockets work, then understanding how the
| platform you're working with uses sockets[1].
|
| Once I read the W. Richard Stevens book, I was able to read and
| understand RFCs for protocols like HTTP to know how things should
| work. Then you're better prepared to figure out if a behavior is
| due to your code, or an implementation of the protocol in your
| device, or the device on the other end of the network connection,
| or some intermediary device.
|
| [0] - http://www.kohala.com/start/unpv12e.html
|
| [1] - https://docs.microsoft.com/en-
| us/windows/win32/winsock/porti...
| waynesonfire wrote:
| Curious whether the book helps with understanding routing and
| firewalls?
| TYPE_FASTER wrote:
| Yes. To really understand routing and firewalls, I'd suggest
| implementing a firewall for yourself using Linux or OpenBSD.
| I call out OpenBSD because when I went through that process
| years ago, OpenBSD was lightweight from resource and
| complexity perspectives, so I could focus on network
| configuration. There might be better options with Linux these
| days, it's been a long time since I looked.
| tptacek wrote:
| Very yes on both, especially with firewalls.
| lanstin wrote:
| We had to debug an issue with ARP during a deploy of docker
| containers a year or two ago. A nearly complete understanding
| of the lower layers is not that much knowledge and quite useful
| at times.
| 5e92cb50239222b wrote:
| "Nearly complete"? No matter how deep your knowledge is,
| you're only scratching the surface. To pick a small example,
| there are something like 20 TCP congestion avoidance algorithms
| alone (many are available in the mainline Linux kernel, and
| they can be picked depending on the task and network at
| hand), and I believe it took something like a decade of
| research, trial and error to solve the bufferbloat problem.
|
| https://en.wikipedia.org/wiki/TCP_congestion_control
|
| https://lwn.net/Articles/616241/
| hn_go_brrrrr wrote:
| Only CUBIC and BBR seem to have any meaningful usage,
| however:
| https://www.comp.nus.edu.sg/~ayush/images/sigmetrics2020-gor...
| icedchai wrote:
| I tried out BBR on a system that had a ton of packet
| loss. It made a tremendous difference!
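|
| If anyone is curious, on Linux you can opt a single connection
| into BBR (assuming the bbr module is available) via the
| TCP_CONGESTION socket option; a sketch in Go using the
| golang.org/x/sys/unix package, host as a placeholder:
|
|     package main
|
|     import (
|         "context"
|         "net"
|         "syscall"
|
|         "golang.org/x/sys/unix"
|     )
|
|     func main() {
|         d := net.Dialer{
|             // Runs after socket() but before connect().
|             Control: func(network, address string, c syscall.RawConn) error {
|                 var serr error
|                 err := c.Control(func(fd uintptr) {
|                     serr = unix.SetsockoptString(int(fd),
|                         unix.IPPROTO_TCP, unix.TCP_CONGESTION, "bbr")
|                 })
|                 if err != nil {
|                     return err
|                 }
|                 return serr
|             },
|         }
|         conn, err := d.DialContext(context.Background(), "tcp", "example.com:80")
|         if err != nil {
|             panic(err)
|         }
|         conn.Close()
|     }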
| n_kr wrote:
| > No matter how deep your knowledge is, you're only
| scratching the surface.
|
| I understand this is just emphasis, but no, it's not magic,
| it's not innate ability, it's just software, man! If you have
| dug deep enough and understood it, that's it. The key phrase
| is IMO 'understood', but that's universal.
| sophacles wrote:
| I agree with your main point - a decent working knowledge
| down-stack really helps with network stuffs.
|
| "nearly complete" strikes me as a whole lot of hubris though
| - I've spent almost 20 years at a career that can be summed
| up as "weird stuff with packets on linux", and the only thing
| that I'm certain of is: every time I think I'm starting to
| get a handle on this networking thing, the universe shows me
| that my assessment of how much there is to know was off by at
| least an order of magnitude.
| macintux wrote:
| I once interviewed someone with "Internet guru" on his
| resume. I advised him there were only a handful of people
| I'd consider for that title.
| gwright wrote:
| I had the privilege of working for Rich as a junior developer
| and then later with him as a co-author. He was my mentor and
| friend. Seeing messages like this over 30 years later really
| makes my day and reminds me how much I miss him.
| Dowwie wrote:
| For those unfamiliar with C: if anyone can recommend sufficient
| reference material to understand the C examples presented in
| the book, that would be helpful.
| wruza wrote:
| The most confusing part is probably the "inheritance" of
| sockaddr structs, AFAIR (the rest should read as if it were
| JavaScript). If that's the issue, then rest assured that no
| magic happens there; the structure quirks matter much less than
| the values the structs are filled with.
| TYPE_FASTER wrote:
| Some languages, like Python[0], have a low-level socket
| library that maps pretty closely to the C API used in the
| book.
|
| [0] https://docs.python.org/3/library/socket.html
| throwaway1777 wrote:
| You're opening a can of worms, but Kernighan and Ritchie wrote
| the canonical book on C and you can't go wrong with that.
| quietbritishjim wrote:
| It's worth stressing how different the K&R book is from most
| books that introduce a programming language. Usually those
| sorts of books are fairly long and you end up with a decent
| but incomplete understanding.
|
| With K&R, it's quite short, it's an easy going read, and
| yet when you get to the end you've covered the _whole_ of C
| (even the standard library!).
| hinkley wrote:
| There were many authors in that era who had practically
| compulsory books. Stevens is one of the few to have two. His
| Unix Programming book was referred to as the Bible.
| sophacles wrote:
| I still refer back to APUE often and push my junior folks to
| read it.
| dimman wrote:
| Interesting read. I'm quite curious where all the initial
| misperceptions about sockets come from.
|
| I can highly recommend Beej's guide to network programming:
| https://beej.us/guide/bgnet/
|
| That together with Linux/BSD man pages should be everything
| needed, some great documentation there.
| pgorczak wrote:
| I definitely used to think TCP was more "high-level" than it
| actually is. Yes it does much more than UDP but still, its job
| is to get a sequence of bytes from A to B. You can tune it for
| higher throughput or more sensitive flow control but anything
| concerning message passing, request/response, ... is beyond the
| scope of TCP.
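|
| Which is why the first thing most people end up doing is layering
| their own framing on top of the stream. A minimal Go sketch of the
| usual length-prefix approach (demoed on an in-memory buffer; in
| real code the reader/writer would be a net.Conn):
|
|     package main
|
|     import (
|         "bytes"
|         "encoding/binary"
|         "fmt"
|         "io"
|     )
|
|     // writeMessage prefixes the payload with a 4-byte big-endian
|     // length.
|     func writeMessage(w io.Writer, msg []byte) error {
|         if err := binary.Write(w, binary.BigEndian, uint32(len(msg))); err != nil {
|             return err
|         }
|         _, err := w.Write(msg)
|         return err
|     }
|
|     func readMessage(r io.Reader) ([]byte, error) {
|         var n uint32
|         if err := binary.Read(r, binary.BigEndian, &n); err != nil {
|             return nil, err
|         }
|         msg := make([]byte, n)
|         // A plain Read may return a partial message; ReadFull loops.
|         _, err := io.ReadFull(r, msg)
|         return msg, err
|     }
|
|     func main() {
|         var buf bytes.Buffer
|         _ = writeMessage(&buf, []byte("hello"))
|         _ = writeMessage(&buf, []byte("world"))
|         for {
|             msg, err := readMessage(&buf)
|             if err != nil {
|                 break
|             }
|             fmt.Println(string(msg))
|         }
|     }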
| opportune wrote:
| If I could make a meta point, nothing about software is magic
| (except the fact it works at all). Computers do exactly as they
| are told and nothing more.
|
| If you're using some underlying technology you should really know
| how it works at some level so you can understand what assumptions
| you can make and which you can't. TCP doesn't mean there is never
| data loss.
| dylan604 wrote:
| Except when literal bugs short out your system, or actual
| cosmic rays bit-flip your data, or a plethora of other things
| cause your "correctly written" code to not function as
| expected.
|
| That's when the magicians come out and do their thing.
| gabereiser wrote:
| This was a good write-up but kind of short on details. It
| would have been really awesome with some code examples. I love
| socket programming. Love, like I love going to the dentist.
|
| Having a state machine to handle basic I/O between client and
| server kind of blows when it comes to plinko-machine switch
| statements over arbitrary state enums. Languages like Go or
| Python give you a way of communicating between client and server
| in a more direct way. Write/read input, do thing, write, read,
| do, repeat until you finish ops.
|
| Go is my favorite for this as I can spawn a goroutine to
| read/write to a socket. Rust is my second favorite for this but
| it's a bit trickier. Python's twisted framework has something
| like this with Protocols. I wish C++ had a standard socket
| implementation (std::network?)
|
| Anyway, this gave me a smile today so thanks.
| zwieback wrote:
| In the early 90s I was working on AppleTalk-PC interoperability
| SW, which included "ATP", the AppleTalk Transaction Protocol,
| which is kind of a guaranteed delivery protocol. It even had "at
| least once" and "exactly once" options to make sure state
| wouldn't get messed up in client-server applications:
| https://developer.apple.com/library/archive/documentation/ma...
|
| I was kind of sad to see that things like this fell by the
| wayside in favor of streaming protocols but of course it's way
| more efficient to not handle each transaction individually.
|
| Users implementing TCP++ on top of TCP is a real problem, I
| think.
| NegativeLatency wrote:
| Was there a driver or library for ATP or were you implementing
| that on PCs?
| zwieback wrote:
| No, we were the ones writing the PC drivers, for DOS, Windows
| and OS/2. Back then the lower layers were more or less
| standardized but there was no AppleTalk stack so that's what
| we developed as a product.
| bombela wrote:
| I just want to point out that "exactly-once" is not physically
| possible. The documentation claims "exactly-once", then proceeds
| to explain that after a while it will time out, which makes it
| "at-most-once", as expected.
|
| As far as I know, in communication you can only have at-least-
| once (1..N) or at-most-once (0..1).
| neonroku wrote:
| The following sentence in the article jumped out at me: "The
| difference is that you are not dealing with the unreliability of
| UDP like TCP is." This reads to me like TCP is built on top of
| UDP, which at one time I thought to be the case, but it's not.
| UDP and TCP are both transport layer, built on the internet
| layer, which is unreliable.
| enriquto wrote:
| > How you handle a file no longer existing vs. a socket
| disconnection are not likely to be very similar.
|
| Why not? It seems that fopen(3) and fread(3) provide the perfect
| abstraction for that. The semantics of remove(3) on a file that
| is open are very clear, and they represent exactly what you want
| to happen when a connection is lost.
|
| I never understood the need for "sockets" as a separate type of
| file. Why can't they just be exposed as regular files?
| [deleted]
| psyc wrote:
| I'm pleased to hear that anyone is still teaching anyone anything
| about sockets.
|
| The bit about Windows not treating sockets as files made me
| pause, since Windows does treat so many things as files. After
| thinking about it some, I suppose it's kernel32 that treats
| kernel32 things as files. Winsock has a separate history.
| amluto wrote:
| Windows has a long and unfortunate history of encouraging
| programs to extend system behavior by injecting libraries into
| other programs' address spaces. You can look up Winsock LSPs
| and Win32 hooks [0] for examples. This means that programs
| cannot rely on the public APIs actually interacting with the
| kernel in the way one would naively imagine -- the
| implementation may be partially replaced with a call into user
| code from another vendor in the same process. Eww!
|
| So, as I recall, a _normal_ socket is a kernel object, but a
| third party LSP-provided socket might not be. This also means
| that any API, e.g. select, that may interact with more than one
| socket at once has problems. [1]
|
| [0] https://docs.microsoft.com/en-
| us/windows/win32/api/winuser/n... -- see the remarks section.
|
| [1] https://docs.microsoft.com/en-
| us/windows/win32/api/Winsock2/... -- again, see the remarks.
| xeeeeeeeeeeenu wrote:
| It's complicated. Sockets are _mostly_ interchangeable with
| filehandles but there are many exceptions. For example,
| ReadFile() works with sockets, whereas DuplicateHandle()
| silently corrupts them.
|
| However, there's another problem: overlapped vs non-overlapped
| handles. socket() always creates overlapped sockets, while
| WSASocket() can create both types. Overlapped handles can't be
| read synchronously with standard APIs, which in turn means you
| can't read() a fd created from an overlapped handle.
|
| Naturally, in their infinite wisdom, Windows designers decided
| there's no need to provide an official API to introspect
| handles, so there's no documented way to tell them apart (there
| are unofficial ways, though). BTW, it's proof of poor
| communication between teams at Microsoft, because their C
| runtime (especially its fd emulation) would greatly benefit
| from such an API.
|
| It's frustrating. I'm sure that if Windows was an open-source
| project, that mess would be fixed immediately.
___________________________________________________________________
(page generated 2022-07-26 23:00 UTC)