[HN Gopher] Achieving 5M persistent connections with Project Loo...
___________________________________________________________________
Achieving 5M persistent connections with Project Loom virtual
threads
Author : genzer
Score : 271 points
Date : 2022-04-30 08:07 UTC (14 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| deepsun wrote:
| How does that compare to Kotlin suspend functions?
| jillesvangurp wrote:
| Loom will make a great backend for Kotlin's co-routines. Roman
| Elizarov (Kotlin language lead and the person behind Kotlin's
| co-routine framework) has already confirmed that will happen,
| and it makes a lot of sense.
|
| For those who don't understand this, Kotlin's co-routine
| framework is designed to be language neutral and already works
| on top of the major platforms that have Kotlin compilers
| (native, javascript, jvm, and soon wasm). So, it doesn't really
| compete with the "native" way of doing concurrent,
| asynchronous, or parallel computing on any of those platforms
| but simply abstracts the underlying functionality.
|
| It's actually a multi-platform library that implements all the
| platform-specific aspects in the platform-appropriate way. It's
| also very easy to adapt existing frameworks in this space via
| Kotlin extension functions, and the JVM implementation ships
| out of the box with such functions for the most common
| solutions on the JVM (Java's threads, futures, threadpools,
| etc., Spring Flux, RxJava, Vert.x, etc.). Loom will be just
| another solution in this long list.
|
| If you use Spring Boot with Kotlin for example, rather than
| dealing with Spring's Flux, you simply define your asynchronous
| resources as suspend functions. Spring does the rest.
|
| With Kotlin-js in a browser you can call Promise.toCoroutine()
| and async { ... }.asPromise(). That makes it really easy to
| write asynchronous event handling in a web application, for
| example, or to work with javascript APIs that expect promises
| from Kotlin. And if you use web-compose, fritz2, or even react
| with kotlin-js, anything asynchronous you'd likely be dealing
| with via some kind of co-routine and suspend functions.
|
| Once Loom ships, it will basically enable some nice low-level
| optimizations in the JVM implementation of co-routines, and
| there will likely be some new extension functions to adapt the
| various new Java APIs for this. Not a big deal, but it will
| probably be nice for situations with extremely large numbers
| of co-routines and IO. Not that it's particularly struggling
| there of course, but every little bit helps. It's not likely
| to require any code updates either. When the time comes,
| simply update your JVM and co-routine library and you should
| be good to go.
| richdougherty wrote:
| I made a comment about this above:
| https://news.ycombinator.com/item?id=31218826
|
| I won't repeat it all, but the main point is that having
| runtime support is much better than relying on compiler
| support, even if compiler support is pretty fantastic.
|
| Note that the two aren't mutually exclusive; you should still
| be able to use coroutines after Project Loom ships, and they
| still might make sense in many places.
| torginus wrote:
| While I can't answer the question directly, there is an
| article about C#'s async/await vs Go's goroutines which
| compares the two approaches, and while some of the findings
| are probably stack-specific, a lot of them are probably
| intrinsic to the approach:
|
| - Green threads scale somewhat better, but both scale
| ridiculously well, meaning probably you won't run into scaling
| issues.
|
| - async/await generators use way less memory than a dedicated
| green thread; this affects both memory consumption and startup
| time, since the process has to run around asking the OS for
| more memory
|
| - green threads are faster to execute
|
| Here's the link:
|
| https://alexyakunin.medium.com/go-vs-c-part-1-goroutines-vs-...
| Andrew_nenakhov wrote:
| Sounds like a job for Erlang.
| speed_spread wrote:
| Sounds like Erlang's out of a job.
| cheradenine_uk wrote:
| I think a lot of people are missing the point.
|
| Go look at the source code. Look at how simple it is - anyone
| who has created a thread with Java knows what's happening.
| With only minor tweaks, this means your pre-existing code can
| take advantage of this with, basically, no effort. And it
| retains all the debuggability of traditional Java threads
| (i.e. a stack trace that makes sense!)
|
| If you've spent any time at all dealing with the horrors of C#
| async/await (Why am I here? Oh, no idea) and its doubling of
| your APIs to support function colouring - or, you've fought
| with the complexities of reactive solutions in the Java space
| -- often, frankly, in the name of "scalability" that will
| never be practically required -- this is a big deal.
|
| You no longer have to worry about any of that.
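|
| A minimal sketch of that thread-per-connection style (my
| illustration, not the repo's actual code), assuming a
| Loom-enabled JDK where Thread.startVirtualThread is available:
|
|     import java.io.InputStream;
|     import java.io.OutputStream;
|     import java.net.ServerSocket;
|     import java.net.Socket;
|
|     public class EchoServer {
|         public static void main(String[] args) throws Exception {
|             try (ServerSocket server = new ServerSocket(9000)) {
|                 while (true) {
|                     Socket socket = server.accept();
|                     // one cheap virtual thread per connection;
|                     // plain blocking I/O throughout
|                     Thread.startVirtualThread(() -> echo(socket));
|                 }
|             }
|         }
|
|         static void echo(Socket socket) {
|             try (socket;
|                  InputStream in = socket.getInputStream();
|                  OutputStream out = socket.getOutputStream()) {
|                 byte[] buf = new byte[1024];
|                 int n;
|                 while ((n = in.read(buf)) != -1) {
|                     out.write(buf, 0, n); // echo back the bytes
|                 }
|             } catch (Exception e) {
|                 // peer went away; try-with-resources cleans up
|             }
|         }
|     }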
| pjmlp wrote:
| Or inserting the occasional Task.Run() calls, as means to
| avoiding changing the whole call stack up to Main().
| gavinray wrote:
| This hasn't been that much of a problem, IME
|
| If you decide somewhere deep in your program you want to use
| async operations, most languages allow you to keep the
| invoking function/closure synchronous and return some kind of
| Promise/Future-like value
| pjmlp wrote:
| Which is exactly the workaround with Task.Run(), being able
| to integrate a library written with async/await into
| codebases older than the feature, where no one is paying
| for a full rewrite.
| SemanticStrengh wrote:
| Except Kotlin coroutines already work, can be very easily
| integrated into existing Java codebases, and are much superior
| to Loom (structured concurrency, flow, etc.)
| richdougherty wrote:
| Kotlin coroutines are amazing. They're built on very clever
| tech that converts fairly normal source code into a state
| machine when compiled. This has huge benefits and allows the
| programmer to break their code up without the hassle of
| explicitly programming callbacks, etc.
|
| https://kotlinlang.org/spec/asynchronous-programming-with-
| co...
|
| However... an unavoidable fact is that converted code works
| differently from other code. The programmer needs to know the
| difference. Normal and converted code compose together
| differently. The Kotlin compiler and type system help keep
| track, but they can't paper over everything.
|
| Having lightweight thread and continuations support directly
| in the VM makes things very much simpler for programmers (and
| compiler writers!) since the VM can handle the details of
| suspending/resuming and code composes together effortlessly,
| even without compiler support, so it works across languages
| and codebases.
|
| I don't want to be critical about Kotlin. It's amazing what
| it achieves and I'm a big fan of this stuff. Here are some
| notes I wrote on something similar, Scala's experiments with
| compile-time delimited continuations:
| https://rd.nz/2009/02/delimited-continuations-in-
| scala_24.ht...
|
| I think this is a general principle about compiler features
| vs runtime features. Having things in the runtime makes life
| a lot easier for everyone, at the cost of runtime complexity,
| of course.
|
| Another one I'd like to see is native support for tail calls
| in Java. Kotlin, Scala, etc. have to do compile-time tricks to
| get basic tail call support, but it doesn't work well across
| functions.
|
| Scala and Kotlin both ask the programmer to add annotations
| where tail calls are needed, since the code gen so often
| fails.
|
| https://kotlinlang.org/docs/functions.html#tail-recursive-
| fu...
|
| https://www.scala-
| lang.org/api/3.x/scala/annotation/tailrec....
|
| https://rd.nz/2009/04/tail-calls-tailrec-and-
| trampolines.htm...
|
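| To illustrate the kind of compile-time/library trick involved,
| here is a minimal hand-rolled trampoline in Java (my sketch,
| not from any of the linked posts): each "recursive call"
| returns a thunk instead of growing the stack, and a driver
| loop runs the chain in constant stack space.
|
|     import java.util.function.Supplier;
|
|     interface TailCall<T> extends Supplier<TailCall<T>> {
|         default boolean isDone() { return false; }
|         default T result() { throw new IllegalStateException(); }
|
|         static <T> TailCall<T> done(T value) {
|             return new TailCall<T>() {
|                 public boolean isDone() { return true; }
|                 public T result() { return value; }
|                 public TailCall<T> get() {
|                     throw new IllegalStateException();
|                 }
|             };
|         }
|
|         default T run() { // iterate instead of recursing
|             TailCall<T> step = this;
|             while (!step.isDone()) step = step.get();
|             return step.result();
|         }
|     }
|
|     class Factorial {
|         static TailCall<Long> fact(long n, long acc) {
|             return n <= 1 ? TailCall.done(acc)
|                           : () -> fact(n - 1, n * acc);
|         }
|
|         public static void main(String[] args) {
|             // the value overflows long; the point is that a
|             // depth of 1M causes no StackOverflowError
|             System.out.println(fact(1_000_000, 1).run());
|         }
|     }
|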
| As a side note, I can see that tail calls are planned for
| Project Loom too, but I haven't heard if that's implemented
| yet. Does anyone know the status?
|
| "Project Loom is to intended to explore, incubate and deliver
| Java VM features and APIs built on top of them for the
| purpose of supporting easy-to-use, high-throughput
| lightweight concurrency and new programming models on the
| Java platform. This is accomplished by the addition of the
| following constructs:
|
| * Virtual threads
|
| * Delimited continuations
|
| * Tail-call elimination"
|
| https://wiki.openjdk.java.net/display/loom/Main
| SemanticStrengh wrote:
| Coroutines are _much less_ coloured than async/await
| programming though, since functions return resolved types
| directly instead of futures. But yes, there is the notion of
| coroutine scope, but I don't see how to suppress it without
| making it less expressive.
|
| Very few people know it but Oracle is developing an
| alternative to Loom, in parallel.
| https://github.com/oracle/graal/pull/4114
|
| BTW I expect Kotlin coroutines to leverage Loom eventually.
|
| As for the tailrec keyword, it is not a constraint but a
| feature, since it guarantees at the type level that the
| function cannot stack overflow. Few people know there is
| an alternative to tailrec that can make any function
| stack-overflow safe by leveraging the heap via
| continuations:
| https://kotlinlang.org/api/latest/jvm/stdlib/kotlin/-deep-
| re...
|
| As for Java, there is universal support for tail recursion
| at the bytecode level https://github.com/Sipkab/jvm-tail-
| recursion
| ohgodplsno wrote:
| > Coroutines are much less coloured than async await
| programming though since functions returns resolved types
| directly instead of futures
|
| Only because the compiler does its magic behind the
| scenes and transforms it into bytecode that takes a
| lambda with a continuation. Try calling a suspend
| function from Java or starting a job and surprise, it's
| continuations all the way down.
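|
| A sketch of what that looks like from the Java side,
| assuming a hypothetical Kotlin declaration `suspend fun
| greet(): String` in a file Greet.kt (so it compiles to a
| static method on a GreetKt class):
|
|     import kotlin.coroutines.Continuation;
|     import kotlin.coroutines.CoroutineContext;
|     import kotlin.coroutines.EmptyCoroutineContext;
|
|     public class CallSuspend {
|         public static void main(String[] args) {
|             // the suspend modifier becomes an extra
|             // Continuation parameter in the compiled signature
|             Object r = GreetKt.greet(new Continuation<String>() {
|                 public CoroutineContext getContext() {
|                     return EmptyCoroutineContext.INSTANCE;
|                 }
|                 public void resumeWith(Object value) {
|                     // runs when the coroutine resumes after
|                     // suspending; value is the result/failure
|                     System.out.println("resumed: " + value);
|                 }
|             });
|             // r is either the direct result or the
|             // COROUTINE_SUSPENDED marker
|         }
|     }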
| SemanticStrengh wrote:
| Yes, interfacing with Java is generally done via RxJava
| and Reactor. Interfacing is easy, but nobody wants to use
| RxJava and Reactor in the first place... I wonder whether
| Loom will enable easier interop and make the magic work
| from the Java side's POV.
| gavinray wrote:
| Thanks for posting that link to the Java tail recursion
| library, super handy + didn't know about it. You
| frequently need tail recursion when writing expression
| evaluators/visitors.
|
| I've been using an IntelliJ extension that can do magic
| by rewriting recursive functions to stateful stack-based
| code for performance, but it spits out very ugly code:
|
| https://github.com/andreisilviudragnea/remove-recursion-
| insp... > "This inspection detects
| methods containing recursive calls (not just tail
| recursive calls) and removes the recursion from the
| method body, while preserving the original semantics of
| the code. However, the resulting code becomes rather
| obfuscated if the control flow in the recursive method is
| complex."
|
| It was this guy's whole Bachelor thesis I guess:
|
| https://github.com/andreisilviudragnea/remove-recursion-
| insp...
| bullen wrote:
| Agreed it's simpler, but using NIO with one OS thread per core
| also has its benefits.
|
| The context switch (however small) will cause latency when
| this solution is at saturation.
|
| I think they should write four tests: fiber, NIO, and each
| with userspace networking (no kernel copying network memory),
| and compare them.
|
| Why Oracle is stalling on removing the kernel from Java
| networking is surprising to me; they already have a VM.
| blibble wrote:
| there's still a context switch with NIO, you're just doing it
| manually
| pron wrote:
| https://github.com/ebarlas/project-loom-comparison
| vlovich123 wrote:
| Shouldn't you be able to send authorization and
| authentication requests in parallel in the async and
| virtual threads cases?
| threeseed wrote:
| It is just an example so they could do anything.
|
| But in the real world it is common to need information
| from the authentication stage to use in the authorization
| stage. For example you may have a user login with an
| email address/password which you then pass to an LDAP
| server in order to get a userId. This userId is then used
| in a database to determine which objects/groups they have
| access to.
| the8472 wrote:
| net.netfilter.nf_conntrack_buckets = 1966050
| net.netfilter.nf_conntrack_max = 7864200
|
| or avoid conntrack entirely
| LinuxBender wrote:
| For completeness' sake I would add that one must also set
| options nf_conntrack expect_hashsize=X hashsize=X
|
| in /etc/modules.d/nf_conntrack.conf, X being 1/4 the size of
| conntrack_max
| metabrew wrote:
| API for the server example looks... actually good, wow. Nice job!
|
| Also tickled to see my erlang 1M comet blog post referenced. A
| lifetime ago now, pre-websockets.
| alberth wrote:
| Is this a test of just having 5M people knock on your door?
|
| Or is this a test where something actually happens (data
| exchanges) with each connection?
|
| I ask because those are two totally different workloads, and
| typically it's in the latter test that Erlang shines.
| bufferoverflow wrote:
| It's an echo server. The client sends the data, the server
| responds with the same data.
| newskfm wrote:
| sgtnoodle wrote:
| I'm not a java programmer. I tried clicking 3 layers deep of
| links, but still have no idea what virtual threads are in this
| context. Is it a userspace thread implementation?
|
| I've used explicit context switching syscalls to "mock out"
| embedded real-time OS task switching APIs. It's pretty fun and
| useful. The context switching itself may not be any faster
| than if the kernel does it, but the fact that it's synchronous
| to your program flow means that you don't have to spend any
| overhead synchronizing on mutexes, queues, etc. (You still
| have them, they just don't have to be thread safe.)
| grishka wrote:
| > Is it a userspace thread implementation?
|
| Yes.
| zinxq wrote:
| Loom sets out to give you a sane programming paradigm similar to
| what threads do (i.e. as opposed to programming asynchronous I/O
| in Java with some type of callback) without the overhead of
| Operating System threads.
|
| That's a very cool and noble pursuit. But the title of this
| article might as well have been "5M persistent connections
| with Linux" because that's where the magic 5M connections
| happen.
|
| I could also attempt 5M connections at the Java level using
| Netty and asynchronous IO - no threads or Loom. Again, it'd
| take more Linux configuration than anything else. With that
| configuration in place, though, you could also do it with C#
| async/await, javascript, and I'm sure Erlang and anything else
| that does asynchronous I/O, whether it's masked by something
| like Loom/async/await or not.
| simulate-me wrote:
| As the GP said, what's cool about this is how simple the code
| is. You might be able to achieve 5M connections in Java using
| an event loop based solution (eg Netty), but if the connection
| handlers need to do any async work, then they also need to be
| written using an event loop, which is not how most people write
| Java. Simply put, 5M connections was not possible using Java in
| the way most people write Java.
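|
| For contrast, a minimal sketch of the shape event-loop code
| takes (my illustration, heavily simplified -- a real server
| must also juggle partial writes, OP_WRITE interest,
| per-connection state, etc.):
|
|     import java.net.InetSocketAddress;
|     import java.nio.ByteBuffer;
|     import java.nio.channels.SelectionKey;
|     import java.nio.channels.Selector;
|     import java.nio.channels.ServerSocketChannel;
|     import java.nio.channels.SocketChannel;
|     import java.util.Iterator;
|
|     public class NioEcho {
|         public static void main(String[] args) throws Exception {
|             Selector selector = Selector.open();
|             ServerSocketChannel server =
|                 ServerSocketChannel.open();
|             server.bind(new InetSocketAddress(9000));
|             server.configureBlocking(false);
|             server.register(selector, SelectionKey.OP_ACCEPT);
|             ByteBuffer buf = ByteBuffer.allocate(1024);
|             while (true) {
|                 selector.select(); // wait for readiness events
|                 Iterator<SelectionKey> it =
|                     selector.selectedKeys().iterator();
|                 while (it.hasNext()) {
|                     SelectionKey key = it.next();
|                     it.remove();
|                     if (key.isAcceptable()) {
|                         SocketChannel ch = server.accept();
|                         ch.configureBlocking(false);
|                         ch.register(selector,
|                                     SelectionKey.OP_READ);
|                     } else if (key.isReadable()) {
|                         SocketChannel ch =
|                             (SocketChannel) key.channel();
|                         buf.clear();
|                         if (ch.read(buf) == -1) { // peer closed
|                             ch.close();
|                             continue;
|                         }
|                         buf.flip();
|                         ch.write(buf); // naive: may be partial
|                     }
|                 }
|             }
|         }
|     }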
| [deleted]
| pron wrote:
| It is true that the experiment exercises the OS, but that's
| only _part_ of the point. The other part is that it uses a
| simple, blocking, thread-per-request model with Java 1.0
| networking APIs. So this is "achieving 5M persistent
| connections with (essentially) 26-year-old code that's fully
| debuggable and observable by the platform." This stresses both
| the OS and the Java runtime.
|
| So while you could achieve 5M in other ways, those ways would
| not only be more complex, but also not really
| observable/debuggable by Java platform tools.
| cheradenine_uk wrote:
| This.
|
| Writing the sort of applications that I get involved with,
| it was frequently the case that, whilst 1 OS thread per Java
| thread was a theoretical scalability limitation, in practice
| we were never likely to hit it (and there was always 'get a
| bigger computer').
|
| But: the complexity mavens inside our company and the projects
| we rely upon get bitten by an obsessive need to chase
| 'scalability' /at all costs/. Which is fine, but the downside
| is that the negative consequences of coloured functions come
| into play. We end up suffering, having to deal with vert.x or
| kotlin or whatever flavour-of-the-month solution that is
| /inherently/ harder to reason about than a linear piece of
| code. If you're in a C# project, then you get a library that's
| async, and boom, game over.
|
| If Loom gets even within performance shouting distance of
| those other models, it ought to kill (for all but the
| edgiest of edge-cases) reactive programming in the Java space
| dead. You might be able to make a case - obviously depending
| on your use cases, which are not mine - that extracting, say,
| 50% more scalability is worth the downsides. If that number
| is, say, 5%, then for the vast majority of projects the
| answer is going to be 'no'.
|
| I say 'ought to', as I fear the adage that "developers love
| complexity the way moths love flames - and often with the
| same results". I see both engineers and projects (Hibernate
| and Keycloak, IIRC) that have a great deal of themselves
| invested in their Rx position, and I already sense that
| they're not going to give it up without a fight.
|
| So: the headline number is less important than "for virtually
| everyone you will no longer have to trade simplicity for
| scalability". I can't wait!
| amluto wrote:
| Threads (whether lightweight or heavyweight) can't fully
| replace reactive/proactive/async programming even ignoring
| performance and scalability. Sometimes network code simply
| needs to wait for more than one event as a matter of
| functionality. For example, a program might need to handle
| the availability of outgoing buffer space and _also_ handle
| the availability of incoming data. And it might also need
| to handle completion of a database query or incoming data
| on a separate connection. Sure, using extra threads might
| do it, but it's awkward.
| pron wrote:
| > Sure, using extra threads might do it, but it's
| awkward.
|
| It's simpler and nicer, actually -- and definitely offers
| better tooling and observability -- especially with
| structured concurrency: https://download.java.net/java/ea
| rly_access/loom/docs/api/jd...
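|
| A sketch of the structured-concurrency shape (API names as
| in the early-access builds, so subject to change; fetchUser
| and queryDatabase are hypothetical stand-ins for the "more
| than one event" case):
|
|     import jdk.incubator.concurrent.StructuredTaskScope;
|     import java.util.concurrent.Future;
|
|     String handle() throws Exception {
|         try (var scope =
|                 new StructuredTaskScope.ShutdownOnFailure()) {
|             Future<String> user =
|                 scope.fork(() -> fetchUser());
|             Future<String> data =
|                 scope.fork(() -> queryDatabase());
|             scope.join();          // wait for both forks
|             scope.throwIfFailed(); // propagate first failure
|             return user.resultNow() + data.resultNow();
|         } // leaving the scope means both forks are done
|     }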
| mike_hearn wrote:
| A couple of points to consider.
|
| 1. Demanding scalability for inappropriate projects and at
| any cost is something I've seen too, and on investigation
| it was usually related to former battle scars. A software
| system that stops scaling at the wrong time can be horrific
| for the business. Some of them never recover, the canonical
| example being MySpace, but I've heard of other examples
| that were less public. In finance entire multi-year IT
| projects by huge teams have failed and had to be scrapped
| because they didn't scale to even current business needs,
| let alone future needs. Emergency projects to make
| something "scale" because new customers have been on-
| boarded, or business requirements changed, are the sort of
| thing nobody wants to get caught up in. Over time these
| people graduate into senior management where they become
| architects who react to those bad experiences by insisting
| on making scalability a checkbox to tick.
|
| Of course there's also trying to make easy projects more
| challenging, resume-driven development etc too. It's not
| just that. But that's one way it can happen.
|
| 2. Rx type models aren't just about the cost of threads. An
| abstraction over a stream of events is useful in many
| contexts, for example, single-threaded GUIs.
| cheradenine_uk wrote:
| I think my point is more that you end up having to pay
| the costs (of Rx-style APIs) whether you need the
| scalability or not, because the libraries end up going
| down that route. This has sometimes felt that I'm being
| forced to do work in order to satisfy the fringe needs of
| some other project!
|
| And sure, if you are living in a single-threaded
| environment, your choices are somewhat limited. I,
| personally, dislike front-end programming for exactly
| that reason - things like RxJS feel hideously
| overcomplicated to me. My guess is that most, though not
| all, will much prefer the loom-style threading over
| async/await given free choice.
| lostcolony wrote:
| One additional - as noted, it's been 26 years since
| Java's founding. Project Loom has been around since at
| least 2018 and still has no release date. It'll be cool
| for Java projects whenever it comes out, but I
| just...have a hard time caring right now. I can't use it
| for old codebases currently, and new codebases I'm not
| using one request per Java thread anyway (tbh - when it's
| my choice I'm not choosing the JVM at all). The space has
| moved, and continues to move. This is in no way to say the
| JVM shouldn't be adopting the good ideas that come along
| the way - that is one of the benefits of being as
| conservative and glacial in adoption as it is - but I
| just...don't get excited about them, or find myself in any
| position in relation to the JVM (Java specifically, but
| the fundamentals affect other languages) other than "ugh,
| this again".
| chrisseaton wrote:
| > I'm not using one request per Java thread anyway
|
| The point is with Loom you can, and you can stop putting
| everything into a continuation and go back to straight-
| line code.
| lostcolony wrote:
| >> The point is with Loom you can
|
| The point I was making is that Loom isn't released,
| stable, production ready, supported, etc., and there's
| still no date when it's supposed to be, so what you can
| do with Loom in no way affects what I can do with a
| production codebase, either new or legacy. I'm not sure
| how you missed that from my post.
|
| I'm not defending reactive programming on the JVM. I'm
| also not defending threads as units of concurrency. I'm
| saying I can get the benefits of Project Loom -right
| now-, in production ready languages/libraries, outside of
| the JVM, and I can't reasonably pick Project Loom if I
| want something stable and supported by its creators.
| pron wrote:
| > and there's no still no date when it's supposed to be
|
| September 20 (in Preview)
|
| > I'm saying I can get the benefits of Project Loom
| -right now-, in production ready languages/libraries,
| outside of the JVM
|
| Only sort-of. The only languages offering something
| similar in terms of programming model are Erlang
| (/Elixir) and Go -- both inspired virtual threads. But
| Erlang doesn't offer similar performance, and Go doesn't
| offer similar observability. Neither offers the same
| popularity.
| lostcolony wrote:
| I'm not saying there aren't tradeoffs, just that if I
| need the benefits of virtual threads...I have other
| options. I'm all for this landing on the JVM, mainly so
| that non-Java languages there can take advantage of it
| rather than the hoops they currently have to jump through
| to offer a saner concurrency model, but that until it
| does...don't care. And last I saw this feature is
| proposed to land in preview in JDK19; not that it would,
| and...it's still preview. Meaning the soonest we can
| expect to see this safely available to production code is
| next year (preview in Java is a bit weird, admittedly.
| "This is not experimental but we can change any part of
| it or remove it for future versions depending how things
| go" was basically my take on it when I looked in the
| past).
|
| Meanwhile, as you say, Erlang/Elixir gives me this model
| with 35+ years of history behind it (and no
| libraries/frameworks in use trying to provide me a leaky
| abstraction of something 'better'), better observability
| than the JVM, a safer memory model for concurrent code, a
| better model for reliability, with the main issue being
| the CPU hit (less of a concern for IO bound workloads,
| which is where this kind of concurrency is generally
| impactful anyway). Go has reduced observability compared to
| Java, sure, but a number of other tradeoffs I personally
| prefer (not least of all because in most of the Java
| shops I was in, I was the one most familiar with
| profiling and debugging Java. The tools are there, the
| experience amongst the average Java developer isn't), and
| will also be releasing twice between now and next year.
|
| Again, I'm not saying virtual threads from Loom aren't
| cool (in fact, I said they were; the technical
| achievement of making it a drop-in replacement is itself
| incredible), or that it wouldn't be useful when it
| releases for those choosing Java, stuck with Java due to
| legacy reasons, or using a JVM language that is now able
| to migrate to take advantage of this to remove some of
| the impedance mismatch between their concurrency model(s)
| and Java's threading and the resulting caveats. Just that
| I don't care until it does (because I've been hearing
| about it for the past 4 years), it still doesn't put it
| on par with the models other languages have adopted
| (memory model matters to me quite a bit since I tend to
| care about correct behavior under load more than raw
| performance numbers; that said, of course, nothing is
| preventing people from adopting safer practices
| there...just like nothing has been in years previous.
| They just...haven't), nor do I care about the claims
| people make about it displacing X, Y, or Z. It probably
| will for new code! Whenever it gets fully supported in
| production. But there's still all that legacy code
| written over the past two decades using libraries and
| frameworks built to work around Java's initial 1:1
| threading model, and which simply due to calling
| conventions and architecture (i.e., reactive and etc)
| would have to be rewritten, which probably won't happen
| due to the reality of production projects, even if there
| were clear gains in doing so (which as the great-
| grandparent mentions, is not nearly so clearcut).
| namdnay wrote:
| And hopefully we can bury Reactor Core in the garden and
| never talk about it again
| Scarbutt wrote:
| What has the space moved to?
| pron wrote:
| > and still has no release date
|
| JEP 425 has been proposed to target JDK 19, out September
| 20. It will first be a "Preview" feature, which means
| supported but subject to change, and if all goes well
| would normally be out of Preview two releases, i.e. one
| year, after that.
|
| > I'm not using one request per Java thread anyway
|
| You don't have to, but note that _only_ the thread-per-
| request model offers you world-class
| observability/debuggability.
|
| > other than "ugh, this again".
|
| Ok, although in 2022, the Java platform is still among
| the most technologically advanced, state-of-the-art
| software platforms out there. It stands shoulder to
| shoulder with clang and V8 on compilation, and beats
| everything else on GC and low-overhead observability
| (yes, even eBPF).
| zinxq wrote:
| I think we're in agreement. Ignoring under the hood - Loom's
| programming paradigm (from the viewpoint of control flow) is
| the Threading programming paradigm. (Virtual)Thread-per-
| connection programming is easier and far more intuitive than
| asynchronous (i.e. callback-esque) programming.
|
| I still attest though - the 5M connections in this example
| are still a red herring.
|
| Can we get to 6M? Can we get to 10M? Is that a question for
| Loom or Java's asynchronous IO system? No - it's a question
| for the operating system.
|
| Loom and Java NIO can handle probably a billion connections
| as programmed. Java Threads cannot - although that too is a
| broken statement. "Linux Threads cannot" is the real
| statement. You can't have that many for resource reasons.
| Java Threads are just a thin abstraction on top of that.
|
| Linux out of the box can't do 5M connections (last I
| checked). It takes Linux tuning artistry to get it there.
|
| Don't get me wrong - I think Loom is cool. It's attempted to
| do the same thing as Async/Await tried - just better. But it
| is most definitely not the only way to achieve 5MM
| connections with Java or anything else. Possibly however,
| it's the most friendly and intuitive way to do it.
|
| *We typically vilify Java threads for the RAM they consume.
| Something like 1MB per thread or something (tunable). Loom
| must still use "some" RAM per connection, although surely far
| far less (and of course Linux must use some amount of kernel
| RAM per connection too).
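|
| A sketch of the difference, assuming a Loom EA build with
| the Thread.Builder API (doWork is a hypothetical task):
|
|     // platform thread: fixed stack reserved up front,
|     // roughly 1MB by default (-Xss), tunable per thread
|     Thread p = Thread.ofPlatform()
|             .stackSize(256 * 1024)
|             .start(() -> doWork());
|
|     // virtual thread: its stack is a resizable heap
|     // object, so memory grows only with actual call depth
|     Thread v = Thread.ofVirtual().start(() -> doWork());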
| pron wrote:
| > But it is most definitely not the only way to achieve 5MM
| connections with Java or anything else. Possibly however,
| it's the most friendly and intuitive way to do it.
|
| It is the only way to achieve that many connections with
| Java in a way that's debuggable and observable by the
| platform and its tools, regardless of its intuitiveness or
| friendliness to human programmers. It's important to
| understand that this is an objective technical difference,
| and one of the cornerstones of the project. Computations
| that are composed in the asynchronous style are invisible
| to the runtime. Your server could be overloaded with I/O,
| and yet your profile will show idle thread pools.
|
| Virtual threads don't just allow you to write something you
| could do anyway in some other way. They actually do work
| that has simply been impossible so far at that scale: they
| allow the runtime and its tools to understand how your
| program is composed and observe it at runtime in a
| meaningful and helpful way.
|
| One of the main reasons so many companies turn to Java for
| their most important server-side applications is that it
| offers unmatched observability into what the program is
| doing (at least among other languages/platforms with
| similar performance). But that ability was missing for
| high-scale concurrency. Virtual threads add it to the
| platform.
| mike_hearn wrote:
| I don't quite follow your argument.
|
| Saying "Linux cannot handle 5M connections with one thread
| per connection" isn't a reasonable statement because no
| operating system can do that, they can't even get close.
| The resource usage of a kernel thread is defined by pretty
| fundamental limits in operating system architecture,
| namely, that the kernel doesn't know anything about the
| software using the thread. Any general purpose kernel will
| be unable to provision userspace with that many threads
| without consuming infeasible quantities of RAM.
|
| The reason JVM virtual threads can do this is because the
| JVM has deep control and understanding of the stack and the
| heap (it compiled all the code). The reason Loom
| scalability gets worse if you call into native code is that
| then you're back to not controlling the stack.
|
| Getting to 10M is therefore very much a question for the
| JVM as well as the operating system. It'll be heavily
| affected by GC performance with huge heaps, which luckily
| modern G1 excels at, it'll be affected by the performance
| of the JVM's userspace schedulers (ForkJoinPool etc), it'll
| be affected by the JVM's internal book-keeping logic and
| many other things. It stresses every level of the stack.
| pron wrote:
| For more information about virtual threads see
| https://openjdk.java.net/jeps/425 (planned to preview in JDK 19,
| out this September).
|
| What's remarkable about this experiment is that it uses simple
| 26-year-old (Java 1.0) networking APIs.
| midislack wrote:
| I see a lot of these making the FP of HN. But it's very difficult
| to be impressed, or unimpressed because it's all about hardware.
| How much hardware is everybody throwing at all of this? 5M
| persistent connections on a Pi with mere GigE? Pretty frickin'
| amazing. 5M persistent connections on a Threadripper with 128
| cores and a dozen trunked 4 port 10GE NICs? Yaaaaawwwnnn snooze.
|
| We need a standardized computer for benchmarking these types of
| claims. I propose the RasPi 4 4GB model. Everybody can find one,
| all the hardware's soldered on so no cheating is really possible,
| etc. Then we can really shoot for efficiency.
| shadowpho wrote:
| Raspberry pi 4 performance changes wildly based on cooling.
| Bare die vs heatsink vs heatsink + fan will give you wildly
| different results.
| midislack wrote:
| Same is true with any computer these days. So let's go no
| heat sink, Pi 4 4GB anyway.
| KingOfCoders wrote:
| Something to learn for everybody, the article is mainly about
| Linux tuning.
| jeroenhd wrote:
| The Linux tuning part seems to have been inspired by these blog
| posts from 14 years ago:
| https://www.metabrew.com/article/a-million-user-comet-applic...
|
| It's almost a little disappointing that beefy modern servers
| only manage a 5x scale improvement, though that could be due
| to the differences in runtime behaviour between Erlang and
| the JVM.
| wiradikusuma wrote:
| The experiment is about a Java app, but the tweaks are at the
| O/S level. Does it mean any app (Java/not, Loom/not) can
| achieve the target given the correct tweaks?
|
| Also, why are these not defaults for the O/S? What are we
| compromising by setting those values?
| mike_hearn wrote:
| No, it doesn't. The reason the tweaks are at the OS level is
| because, apparently, Loom-enabled JVMs already scale up to that
| level without needing any tuning. But if you try that in C++
| you're going to die very quickly.
| pjmlp wrote:
| With C++ co-routines and a runtime like HPX, not really.
|
| However there are other reasons why a C++ application
| connected to the internet might indeed die faster than a Java
| one.
| gpderetta wrote:
| There have been userspace thread libraries for c++ for
| decades.
| yosefk wrote:
| Sure, I wrote some myself. The question is what libraries
| you can use on top of the userspace thread package that are
| aware of the userspace threads, rather than just using OS
| APIs and thus e.g. blocking the current OS thread.
| gpderetta wrote:
| There are .so interposition tricks that can be used for
| that.
|
| I think Pth used to do that for example.
| yosefk wrote:
| Could you elaborate?
| toast0 wrote:
| Both your operating system and your application environment
| need to be up to the task. I'd expect most operating systems
| to be up to the task, although they might need settings set.
| Some of the settings are things that are statically allocated
| in non-swappable memory, and you don't want to waste memory on
| being able to have 5M sockets open if you never go over 10k.
| Often you'll want to reduce socket buffers from defaults,
| which will reduce throughput per socket, but target throughput
| per socket is likely low or you wouldn't want to cram so many
| connections onto one machine. You may need to increase the
| size of the connection table and the hash used for it as well;
| again, it wastes non-swappable RAM to have it too big if you
| won't use it.
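|
| A sketch of the per-socket buffer tweak in Java (sizes are
| illustrative; note the receive buffer generally needs to be
| set on the listening socket, before accept, for the TCP
| window to be sized accordingly):
|
|     import java.net.InetSocketAddress;
|     import java.net.ServerSocket;
|     import java.net.Socket;
|     import java.net.StandardSocketOptions;
|
|     ServerSocket server = new ServerSocket();
|     // shrink kernel buffers from the defaults when
|     // per-connection throughput is expected to be low
|     server.setOption(StandardSocketOptions.SO_RCVBUF, 4 * 1024);
|     server.bind(new InetSocketAddress(9000));
|
|     Socket socket = server.accept();
|     socket.setOption(StandardSocketOptions.SO_SNDBUF, 4 * 1024);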
|
| For application level, it's going to depend on how you handle
| concurrency. This post is interesting, because it's a benchmark
| of a different way to do it in Java. You could probably do 5M
| connections in regular Java through some explicit event loop
| structure; but with the Loom preview, you can do it connection
| per Thread. You would be unlikely to do it with connection per
| Thread without Loom, since Linux threads are very unlikely to
| scale so high (but I'd be happy to read a report showing 5M
| Linux threads)
| jiggawatts wrote:
| There's always trade-offs. It would be very rare for any server
| to reach even 100K concurrent connections, let alone 5M.
| Optimising for that would be optimising for the 0.000001% case
| at the expense of the common case.
|
| Some back of the envelope maths:
| https://www.wolframalpha.com/input?i=100+Gbps+%2F+5+million
|
| If the server had a 100 Gbps Ethernet NIC, this would leave
| just 20 kbps for each TCP connection.
|
| I could imagine some IoT scenarios where this _might_ be a
| useful thing, but outside of that? I doubt there's anyone that
| wants 20 kbps throughput in this day and age...
|
| It's a good stress test however to squeeze out inefficiencies,
| super-linear scaling issues, etc...
| jeroenhd wrote:
| 20kbps should be sufficient for things like chat apps if you
| have the CPU power to actually process chat messages like
| that. Modern apps also require attachments and those will
| require more bandwidth, but for the core messaging
| infrastructure without backfilling a message history I think
| 20kbps should be sufficient. Chat apps are bursty, after all,
| leaving you with more than just the average connection speed
| in practice.
| henrydark wrote:
| I have a memory of some chat site, maybe Discord, sending
| attachments to a different server, thus trading the
| bandwidth problem for extra system complexity.
| jeroenhd wrote:
| That's how I'd solve the problem. The added complexity
| isn't even that high, give the application an endpoint to
| push an attachment into a distributed object store of
| your choice, submit a message with a reference to the
| object and persist it the moment the chat message was
| sent. This could be done with mere bytes for the message
| itself and some very dumb anycast-to-s3 services in
| different data centers.
|
| I'm sure I'm skipping over tons of complexity here (HTTP
| keepalives binding clients to a single attachment host
| for example) because I'm no chat app developer, but the
| theoretical complexity is still relatively low.
| Koffiepoeder wrote:
| Open, idle websockets can be a use case for a large amount of
| tcp connections with a small data footprint.
| jeffbee wrote:
| Also IMAP has this unfortunate property.
| wiseowise wrote:
| And how is that any different from Kotlin coroutines if you still
| need to call Thread.startVirtualThread?
| pjmlp wrote:
| Native VM support instead of an additional library faking it
| and filling .class files with needless boilerplate.
| ferdowsi wrote:
| Kotlin coroutines are colored and infect your whole codebase.
| Virtual threads do not.
| pron wrote:
| 1. These are actual threads from the Java runtime's
| perspective. You can step through them and profile them with
| existing debuggers and profilers. They maintain stacktraces and
| ThreadLocals just like platform threads.
|
| 2. There is no need for a split world of APIs, some designed
| for threads and others for coroutines (so-called "function
| colouring"). Existing APIs, third-party libraries, and programs
| -- even those dating back to Java 1.0 (just as this experiment
| does with Java 1.0's java.net.ServerSocket) -- just work on
| millions of virtual threads.
|
| Normally, you wouldn't even call Thread.startVirtualThread(),
| but just replace your platform-thread-pool-based
| ExecutorService with an ExecutorService that spawns a new
| virtual thread for each task
| (Executors.newVirtualThreadPerTaskExecutor()). For more
| details, see the JEP: https://openjdk.java.net/jeps/425
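|
| A sketch of that replacement (assuming a Loom build;
| handle() stands in for whatever the task does):
|
|     import java.util.concurrent.ExecutorService;
|     import java.util.concurrent.Executors;
|
|     // before: Executors.newFixedThreadPool(200)
|     try (ExecutorService exec =
|             Executors.newVirtualThreadPerTaskExecutor()) {
|         for (int i = 0; i < 1_000_000; i++) {
|             int id = i;
|             // a fresh virtual thread per submitted task
|             exec.submit(() -> handle(id));
|         }
|     } // close() waits for the submitted tasks to finish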
| imranhou wrote:
| It looks closer to goroutines, which to me begs the
| question - where are the channels that I could use to
| communicate between these virtual threads?
| sdfgdfgbsdfg wrote:
| In a library. Loom is more about adapting the JVM itself for
| continuations and virtual threads than adding to userspace.
| [deleted]
| adra wrote:
| Go's channels are simplistically a mutex in front of a queue.
| Java has many existing objects that can do the same; it's just
| that it's not the idiomatic best choice to do so. Since green
| threads should wake up from Object.notify(), any threads
| blocking on the monitor should wake/consume. I'm curious how a
| green-thread ConcurrentDequeue would stand up to Go's channels
| in scalability/performance.
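|
| A sketch of the channel-ish pattern with a plain
| BlockingQueue and virtual threads (assuming a Loom build;
| the bound makes put() block, like a buffered Go channel):
|
|     import java.util.concurrent.BlockingQueue;
|     import java.util.concurrent.LinkedBlockingQueue;
|
|     public class ChannelSketch {
|         public static void main(String[] args) throws Exception {
|             BlockingQueue<String> ch =
|                 new LinkedBlockingQueue<>(16);
|
|             Thread producer = Thread.startVirtualThread(() -> {
|                 try {
|                     for (int i = 0; i < 100; i++) {
|                         ch.put("msg-" + i); // parks when full
|                     }
|                     ch.put("done");
|                 } catch (InterruptedException e) {
|                     Thread.currentThread().interrupt();
|                 }
|             });
|
|             Thread consumer = Thread.startVirtualThread(() -> {
|                 try {
|                     String msg;
|                     while (!(msg = ch.take()).equals("done")) {
|                         System.out.println(msg); // parks if empty
|                     }
|                 } catch (InterruptedException e) {
|                     Thread.currentThread().interrupt();
|                 }
|             });
|
|             producer.join();
|             consumer.join();
|         }
|     }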
| Matthias247 wrote:
| You are right. But Go channels also come with the superpower
| of "select", which allows waiting for multiple objects to
| become ready and atomic execution of actions. I don't think
| this part can be retrofitted on top of simple BlockingQueues.
| sdfgdfgbsdfg wrote:
| pron talks about this on https://cr.openjdk.java.net/~rpres
| sler/loom/loom/sol1_part2....
| christophilus wrote:
| Loom looks like it's nicely solved the function coloring problem.
| This plus Graal makes me excited to pick up Clojure again.
| invalidname wrote:
| This is pretty fantastic!
|
| I'm very excited about the possibilities of Loom. Would love to
| have a more realistic sample with Spring Boot that would
| demonstrate the real world scale. I saw a few but nothing
| remotely as ambitious as that.
| isbvhodnvemrwvn wrote:
| Spring Boot overhead would likely make that infeasible.
| RhodesianHunter wrote:
| Spring Boot overhead is largely in startup time. It really
| doesn't have much overhead thereafter.
|
| It's largely a collection of the same libraries you would use
| anyway, glued together with a custom DI system.
| invalidname wrote:
| I'm not saying 5M. I just want to see to what scale it would
| get without threading issues. Spring Boot isn't THAT heavy.
| nelsonic wrote:
| Reminds me of https://phoenixframework.org/blog/the-road-
| to-2-million-webs... Would love to see this extended to more
| Languages/Frameworks.
| mike_hearn wrote:
| In theory once Graal adds support for it, any Graal/Truffle-
| compatible language can benefit.
|
| IMHO it's only JVM+Graal that can bring this to other
| languages. Loom relies very heavily on some fairly unique
| aspects of the Java ecosystem (Go has these things too though).
| One is that lots of important bits of code are implemented in
| pure Java, like the IO and SSL stacks. Most languages rely
| heavily on FFI to C libraries. That's especially true of
| dynamic scripting languages but is also true of things like
| Rust. The Java world has more of a culture of writing their own
| implementations of things.
|
| For the Loom approach to work you need:
|
| a. Very tight and difficult integration between the compiler,
| threading subsystem and garbage collector.
|
| b. The compiler/runtime to control all code being used. The
| moment you cross the FFI into code generated by another
| compiler (i.e. a native library) you have to pin the thread and
| the scalability degrades or is lost completely.
|
| But! Graal has a trick up its sleeve. It can JIT compile lots
| of languages, and those languages can call into each other
| without a classical FFI. Instead the compiler sees both call
| site and destination site, and can inline them together to
| optimize as one. Moreover those languages include binary
| languages like LLVM bitcode and WASM. In turn that means that
| e.g. Python calling into a C extension can still work, because
| the C extension will be compiled to LLVM bitcode and then the
| JVM will take over from there. So there's one compiler for the
| entire process, even when mixing code from multiple languages.
| That's what Loom needs.
|
| At least in theory. Perhaps pron will contradict me here
| because I have a feeling Loom also needs the invariant that
| there are no pointers into the stack. True for most languages
| but not once C gets involved. I don't know to what extent you
| could "fix" C programs at the compiler level to respect that
| invariant, even if you have LLVM bitcode. But at least the one-
| compiler aspect is not getting in the way.
| kaba0 wrote:
| With Truffle you have to map your language's semantics to
| java ones. I am unfortunately out of my depth on the details,
| but my guess would be that LLVM operates here with this in
| mind in a completely safe way (I guess pointers to the stack
| are not safe) so presumably it should work for these as well.
| mike_hearn wrote:
| Not exactly, no. That's the whole point of Truffle and why
| it's such a big leap forward. You do _not_ map your
| language's semantics to Java semantics. You can implement
| them on top of the JVM but bypassing Java bytecode. Your
| language doesn't even have to be garbage collected, and
| LLVM bitcode isn't (unless you use the enterprise version
| which adds support for automatically converting C/C++ to
| memory safe GCd code!).
|
| So - C code running on the JVM via Sulong keeps C/C++
| semantics. That probably means you can build pointers into
| the stack, and then I don't know what Loom would do. Right
| now they aren't integrated so I guess that's a research
| question.
| bkolobara wrote:
| With lunatic [0] we are trying to bring this to all languages
| that compile to WebAssembly. A few days ago I wrote about our
| journey of bringing it to Rust:
| https://lunatic.solutions/blog/writing-rust-the-elixir-way-1...
|
| [0]: https://github.com/lunatic-solutions/lunatic
| TYMorningCoffee wrote:
| I was only able to get to 840,000 open connections with my
| experiment. My machine only has 8GB of memory.
| https://josephmate.github.io/2022-04-14-max-connections/
|
| Is there any way for the TCP connections to share memory in
| kernel space? My experiment only uses two 8-byte buffers in
| userspace.
| toast0 wrote:
| Does Linux actually allocate buffers for each socket or does it
| just link to sk_buff's (which I understand are similar to
| FreeBSD's mbuf's) and then limit how much storage can be
| linked? FreeBSD has a limit on the total ram used for mbufs as
| well, not sure about Linux.
|
| Otoh, FreeBSD's maximum FD limit is set as a factor of total
| memory pages (edit: looked it up, it's in
| sys/kern/subr_param.c, the limit is one FD per four pages,
| unless you edit kernel source) and you've got 2M pages with 8GB
| ram, so you would be limited to 512k FDs total, and if you're
| running the client on the same machine as server, that's 256k
| connections. But 8G is not much for a server, and some phones
| have more than that... so it's not super limiting.
|
| When you're really not doing much with the connections,
| userland TCP, as suggested in a sibling, could help you
| squeeze in more connections, but if you're going to actually
| do work, you probably need more RAM.
|
| Btw, as a former WhatsApp server engineer, WhatsApp listens on
| three ports; 80, 443, and 5222. Not that that makes a
| significant difference in the content.
| mh- wrote:
| no*, and as you've discovered, the skbufs allocated by the
| kernel will often be the limiting factor for a highly
| concurrent socket server on linux.
|
| * I don't know if someone has created some experimental
| implementation somewhere. It would require a significant
| overhaul of the TCP implementation in the kernel.
|
| edit: check out this sibling thread about userland TCP. I think
| this is a more interesting/likely direction to explore in.
| https://news.ycombinator.com/item?id=31215569
| 10000truths wrote:
| A bit of a digression, but I'd love to see how much further one
| could go with a memory-optimized userland TCP stack, and storing
| the send and receive buffers on disk.
|
| A TCP connection state machine consists of a few variables to
| keep track of sequence numbers and congestion control parameters
| (no more than 100-200 bytes total), plus the space for
| send/receive buffers.
|
| A 4 TB SSD would fit ~125 million 16-KB buffer pairs, and 125
| million 256-byte structs would take up only 32 GB of memory. In
| theory, handling 100 million simultaneous connections on a single
| machine is totally doable. Of course, the per-connection
| throughput would be complete doodoo even with the best NICs, but
| it would still be a monumental yet achievable milestone.
| mike_hearn wrote:
| Presumably at 100M simultaneous connections the machine CPU
| would be saturated with setting up and closing them, without
| getting much actual work done. TCP connections seem too fragile
| to make it worth trying to keep them open for really long
| periods.
|
| It's interesting to think about though, I agree. What are the
| next scaling bottlenecks now that (for JVM-compatible
| languages) threading is nearly solved?
|
| There are some obvious ones. Others in the thread have pointed
| out network bandwidth. Some use cases don't need much bandwidth
| but do need intense routability of data between connections,
| like chat apps, and it seems ideal for those. Still, you're
| going to face other problems:
|
| 1. If that process is restarted for any reason that's a _lot_
| of clients that get disrupted. JVMs are quite good at hot-
| reloading code on the fly, so it's not inherently the case
| that this is problematic, because you could make restarts very
| rare. But it's still a problem.
|
| 2. Your CPU may be sufficient for the steady state but on
| restart the clients will all try to reconnect at once. Adding
| jitter doesn't really solve the issue, as users will still have
| to wait. Handling 5M connections is great unless it takes a
| long time to reach that level of connectivity and you are
| depending on it.
|
| 3. TCP is rarely used alone now, it usually comes with SSL.
| Doing SSL handshakes is more expensive than setting up a TCP
| connection (probably!). Do you need to use something like QUIC
| instead? Or can you offload that to the NIC making this a non-
| issue? I don't know. BTW the Java SSL stack is written in Java
| itself so it's fully Loom compatible.
| natdempk wrote:
| It depends on what you do, but I think GC/memory pressure can
| become an issue rather quickly with the default programming
| models Java leads you towards. I end up seeing this a lot in
| somewhat high throughput services/workers I own where
| fetching a lot of data to handle requests and discarding it
| afterwards leads to a lot of GC time. Curious if anyone has
| any sage advice on this front.
| toast0 wrote:
| You're totally spot on that connection establishment is much
| more challenging than steady state; with TLS or just TCP.
|
| I don't think QUIC helps with that at all. Afaik, QUIC is all
| userland, so you'd skip kernel processing, but that doesn't
| really make establishment cheaper. And TCP+TLS establishes
| the connection before doing crypto, so that saves effort on
| spoofing (otoh, it increases the round trips, so pick your
| tradeoffs).
|
| One nice thing about TCP though is it's trivial to determine
| if packets are establishing or connected; you can easily drop
| incoming SYNs when CPU is saturated to put back pressure on
| clients. That will work enough when crypto setup is the issue
| as well. Operating systems will essentially do this for you
| if you get behind on accepting on your listen sockets. (Edit)
| syncookies help somewhat if your system gets overwhelmed and
| can't keep state for all those half-established
| connections, although not without tradeoffs.
|
| In the before times, accelerator cards for TLS handshakes
| were common (or at least available), but I think current NIC
| acceleration is mainly the bulk ciphering which IMHO is more
| useful for sending files than sending small data that I'd
| expect in a large connection count machine. With file
| sending, having the CPU do bulk ciphers is a RAM bottleneck:
| the CPU needs to read the data, cipher it, and write to RAM
| then tell the NIC to send it; if the NIC can do the bulk
| cipher that's a read and write omitted. If it's chat data,
| the CPU probably was already processing it, so a few cycles
| with AES instructions to cipher it before sending it to send
| buffers is not very expensive.
| charcircuit wrote:
| I think you meant to say TLS. Not SSL.
| adra wrote:
| I'm pretty sure the exercise was to show the absolute
| extremes that could be achieved in a toy application, and
| possibly how easily one could achieve a level of IO-blocking
| scaling that has been harder than most other tasks in Java of
| late. More and more, heap allocations are cheaper, often with
| sub-milli collector locks, and CPU scaling has more to do with
| what you're doing than with the platform, but Java has
| enough tools to make your application fast.
|
| For extremely IO-wait-bound workloads though, there were
| always a LOT of hoops to jump through to make performance
| strong, since OS threads always have a notable stack memory
| footprint that just doesn't scale well when you could have
| thousands of OS threads waiting around just taking up RAM.
| toast0 wrote:
| It's easy to just get 4TB of ram if that's what you need; I
| haven't scoped out what you can shove into a cheap off the
| shelf server these days, but I'd guess around 16TB before you
| need to get fancy servers (Edit: maybe 8TB is more realistic
| after looking at SuperMicro's 'Ultra' servers). I think you'd
| need a very specialized application for 100M connections per
| server to make sense, but if you've got one, that sounds like a
| fun challenge; my email is in my profile.
|
| Moving 100M connections for maintenance will be a giant pain
| though. You would want to spend a good amount of time on a test
| suite so you can have confidence in the new deploys when you
| make them. Also, the client side of testing will probably be
| harder to scale than the server side... but you can do things
| like run 1000 test clients with 100k outgoing connections each
| to help with that.
| Nullabillity wrote:
| Loom is missing the point.
|
| Time has shown that bare threads are not a viable high-level API
| for managing concurrency. As it turns out, we humans don't think
| in terms of locks and condvars but "to do X, I first need to know
| Y". That maps perfectly onto futures(/promises). And once you
| have those, you don't need all the extra complexity and hacks
| that green threads (/"colourless async") bring in.
|
| I'd take a system that combined the API of futures with the
| performance of OS threads over the opposite combination, any day
| of the week. But as it turns out, we don't have to choose. We can
| have the performance of futures with the API of futures.
|
| Or we can waste person-years chasing mirages, I guess. I just
| hope I won't get stuck having to use the end product of this.
| IshKebab wrote:
| Threads have essentially the same API as futures - normally
| you have some sort of join handle, and you can join a set of
| threads (the equivalent of awaiting a set of futures).
|
| Threads don't require locks and condvars. You can use channels
| and scoped joins etc. if you want.
|
| Give me some async code and I'll show you an easier threaded
| version.
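|
| For instance (a toy sketch; the fetch* helpers are
| hypothetical stand-ins for real I/O):
|
|     import java.util.concurrent.CompletableFuture;
|
|     class AsyncVsThreaded {
|         record User(int id) {}
|         record Order(User user) {}
|
|         static CompletableFuture<User> fetchUserAsync(int id) {
|             return CompletableFuture
|                 .supplyAsync(() -> new User(id));
|         }
|         static CompletableFuture<Order> fetchOrderAsync(User u) {
|             return CompletableFuture
|                 .supplyAsync(() -> new Order(u));
|         }
|         static User fetchUser(int id) { return new User(id); }
|         static Order fetchOrder(User u) { return new Order(u); }
|
|         static void asyncStyle() {
|             // callbacks compose the steps
|             fetchUserAsync(42)
|                 .thenCompose(AsyncVsThreaded::fetchOrderAsync)
|                 .thenAccept(System.out::println);
|         }
|
|         static void threadedStyle() throws Exception {
|             // straight-line code; cheap on a virtual thread
|             Thread t = Thread.startVirtualThread(() ->
|                 System.out.println(fetchOrder(fetchUser(42))));
|             t.join();
|         }
|     }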
| bpicolo wrote:
| The goroutine model in Go is plenty conceptually simple for
| concurrency. Correct me if I'm wrong, but Loom seems similar
| in that sense?
|
| I don't find myself missing out on futures in Go.
| pron wrote:
| I think you're mixing specific synchronisation/communication
| mechanisms with the basic concept of a thread, which is simply
| the sequential composition of instructions _that is known and
| observable by the runtime_. If you like the future /promise
| API, that will work even better with threads, because then the
| sequence is a reified concept known to the runtime and all its
| tools. You'll be able to step through the sequence of
| operations with a debugger; the profiler will know to associate
| operations with their context. What API you choose to compose
| your operations, whether you prefer message passing with no
| shared state, shared state with locks, or a combination of the
| two -- that's all orthogonal to threads. All they are is a
| sequential unit of instructions that may run concurrently to
| other such units, _and is traceable and observable by the
| platform and its tools_.
| Nullabillity wrote:
| You can implement futures by just running each future as a
| thread, but it doesn't really give you much. It's a lot more
| complex to write a preemptive thread scheduler + delegating
| future scheduler than to just write a future scheduler in the
| first place.
|
| Especially when that future scheduler already exists and
| works, and the preemptive one is a multi-year research
| project away.
| pron wrote:
| It gives you a lot (aside from the ability to use existing
| libraries and APIs): observability and debuggability.
|
| Supporting tooling has been one of the most important
| aspects of this project, because even those who were
| willing to write asynchronous code, and even the few who
| actually enjoyed it, constantly complained -- and rightly
| so -- that they cannot easily observe, debug and profile
| such programs. When it comes to "serious" applications,
| observability is one of the most important aspects and
| requirements of a system.
|
| Instead of introducing a new kind of sequential code unit
| through all layers of tooling -- which would have been a
| huge project anyway -- we abstracted the existing thread
| concept.
| rvcdbn wrote:
| Maybe threads don't work for your thinking style, but your
| claim that this is generally true is baseless and pretty well
| refuted by languages like Go or Erlang that feature stackful
| threads/processes as a critical part of their best-in-class
| concurrency stories.
| Nullabillity wrote:
| Erlang sidesteps the problem by avoiding mutable shared
| state; in this context they're threads/processes in name
| only.
|
| Go is just yet another implementation of green threads that
| is slightly less broken than prior implementations, because
| it had the benefit of being implemented on day 1 (so the
| whole ecosystem is green thread-aware). It's certainly
| nowhere near "best-in-class".
| toast0 wrote:
| Shared mutable state is hard to work with, but Java threads
| and Java promises both give you access to it. In either
| case, you'd need discipline to avoid patterns which reduce
| concurrency.
|
| From the article, it seems that Loom (in preview) enables
| the threaded model for Java to scale. IMHO, this is great
| because you can write simple straightforward code in a
| threaded model. You can certainly write complex code in a
| threaded model too. Maybe there's an argument that promises
| can be simple and straightforward too, but my experience
| with them hasn't been very straightforward.
| chrisseaton wrote:
| > Erlang sidesteps the problem by avoiding mutable shared
| state
|
| Erlang is maximal shared mutable state!
|
| Processes are mutable state and they're shared between
| other processes.
| groestl wrote:
| If I look at a thread, I see futures all over the place.
| They're just implicit, and the OS takes care of
| concurrency/preemption. Sure, that means you need
| concurrency primitives if you access shared resources, but
| only in the trivial case can you get away without shared
| state in the promise/future scenario as well (i.e. glue code
| that ties together the hard stuff). The downside is your code
| gets convoluted and your stacktraces suck.
| torginus wrote:
| While impressive, I don't really see it as something practical -
| I think scaling across processes/VMs is a much more realistic
| approach.
| notorandit wrote:
| With a maximum of 64k TCP connections per single server IP,
| you need 77 different IPs on the server side. This is a fact.
| imperio59 wrote:
| Pretty sure you can bump that up in the kernel to hold more
| active connections per server than 64k...
| jauer wrote:
| How do you figure?
|
| Clients can connect to the server on the same server port, so
| connection limit is more like 64k*2 for every Client IP-Server
| IP pair.
| akvadrako wrote:
| Actually every client IP+port / server IP+port pair. Linux
| uses 60999 - 32768 for ephemeral ports so can support 28e3^2
| = 784 million connections per IP pair.
| mypalmike wrote:
| Except your service is almost certainly listening on one
| non-ephemeral port.
|
| But having "only" tens of thousands of connections per
| client is rarely a problem in practice, apart from some
| load testing scenarios (such as the experiment here, where
| they opened a number of ports so they could test a large
| number of connections with a single client machine).
| charcircuit wrote:
| 1 IP can correspond to multiple different clients.
| peq wrote:
| Isn't this limit per client ip, server ip, and server port?
| (https://stackoverflow.com/a/2332756/303637)
| alanfranz wrote:
| "You need 77 ips" to do what? May be a fact or not, depending
| on what you're doing.
|
| If you suppose just one open server port, you'll probably need
| 77 client ips to do this test to get unique socket pairs.
|
| But it's a client problem, not a server one.
| ivanr wrote:
| I imagine that's the limit per client IP address [for a single
| server port], no? The Linux kernel can use multiple pieces of
| information to track connections: client IP address, client
| port, server IP address, server port.
|
| Cloudflare has some interesting blog posts on this topic:
|
| - https://blog.cloudflare.com/how-we-built-spectrum/
|
| - https://blog.cloudflare.com/how-to-stop-running-out-of-
| ephem...
| NovemberWhiskey wrote:
| What?
|
| Having run production services that had over 250,000 sockets
| connecting to a single server port, I'm calling "nope" on that.
|
| Are you thinking of the ephemeral port limit? That's on the
| client side; not the server side. Each TCP socket pair is a
| four-tuple of [server IP, server port, client IP, client port];
| the uniqueness comes from the client IP/port part in the server
| case.
| jeroenhd wrote:
| You don't really need 77 IP addresses (the 64k limit for TCP
| is per client IP, per server IP, per server port), but even if
| you did, your average IPv6 server will have a few billion
| available. Every client can connect to a server IP of their
| own if you ignore the practical limits of the network
| acceleration and driver stack. If you're somehow dealing with
| this scale, I doubt you'll be stuck with pure legacy IP
| addressing.
|
| The real problem with such a setup is that you're not left
| with a whole lot of bandwidth per connection, even if you
| ignore things like packet loss and retransmits mucking up the
| connections. Most VPS servers have a 1gbps connection; with 5
| million clients that leaves 200 bits per second of concurrent
| bandwidth for TCP signaling and data to flow through. You'll
| need a ridiculous network card for a single server to deal
| with such a load, in the terabits per second range.
___________________________________________________________________
(page generated 2022-04-30 23:00 UTC)