[HN Gopher] Generics can make your Go code slower
___________________________________________________________________
Generics can make your Go code slower
Author : tanoku
Score : 327 points
Date : 2022-03-30 15:47 UTC (7 hours ago)
(HTM) web link (planetscale.com)
(TXT) w3m dump (planetscale.com)
| YesThatTom2 wrote:
| People who demanded generics don't care about performance.
|
| They care about making excuses about not using Go.
| throwoutway wrote:
| The first code-to-assembly highlighting example here is
| beautiful. Question to the authors-- is that custom just for this
| article?
|
| Is there an open source CSS library or something that does this?
| tanoku wrote:
| Hey, author here. Thanks for the kind words! This is a custom
| pipeline that I designed for the article. It's implemented as a
| Node.js library using SVG.js and it statically generates the
| interactive SVGs directly in the static site generator I was
| using (Eleventy) by calling out to the Go compiler and
| extracting assembly for any lines you mark as interesting. It
| turned out very handy for iterating, but it's not particularly
| reusable I'm afraid!
| BeeOnRope wrote:
| I came here to ask about the same thing. Very cool! I would
| be very interested even in a blog post just on how you did
| the SVG generation.
| msla wrote:
| I agree with the commenter you're replying to; I'd only add
| that Intel syntax is much more readable than AT&T.
| mcronce wrote:
| FWIW, I'm fairly sure this is the assembly syntax used by
| Go - the author may not have made a decision to use this
| vs. another.
| eatonphil wrote:
| Key tldr from me:
|
| > Ah well. Overall, this may have been a bit of a disappointment
| to those who expected to use Generics as a powerful option to
| optimize Go code, as it is done in other systems languages. We
| have learned (I hope!) a lot of interesting details about the way
| the Go compiler deals with Generics. Unfortunately, we have also
| learned that the implementation shipped in 1.18, more often than
| not, makes Generic code slower than whatever it was replacing.
| But as we've seen in several examples, it needn't be this way.
| Regardless of whether we consider Go as a "systems-oriented"
| language, it feels like runtime dictionaries was not the right
| technical implementation choice for a compiled language at all.
| Despite the low complexity of the Go compiler, it's clear and
| measurable that its generated code has been steadily getting
| better on every release since 1.0, with very few regressions, up
| until now.
|
| And remember:
|
| > DO NOT despair and/or weep profusely, as there is no technical
| limitation in the language design for Go Generics that prevents
| an (eventual) implementation that uses monomorphization more
| aggressively to inline or de-virtualize method calls.
| jatone wrote:
| I agree. I find this snippet interestingly incorrect.
|
| > with very few regressions, up until now.
|
| The idea that this is a regression is silly. You can't have a
| regression unless old code is slower as a result, which is
| clearly not the case. It's just a less-than-ideal outcome for
| generics, which will likely get resolved.
| nvarsj wrote:
| I'd argue that golang is inherently not a systems language, with
| its mandatory GC managed memory. I think it's a poor choice for
| anything performance or memory sensitive, especially a database.
| I know people would disagree (hence all the DBs written in golang
| these days, and Java before it), but I think C/C++/Rust/D are all
| superior for that kind of application.
|
| All of which is to say, I don't think it matters. Use the right
| tool for the job - if you care about generic overhead, golang is
| not the right thing to use in the first place.
| jksmith wrote:
| Sure, so can Modula-2 or Ada. Point is, degrees of separation
| in our biases, especially when it comes to C sugared languages.
| JamesBarney wrote:
| I've used a couple of DBs written in Go before and they were
| great. Loved using InfluxDB it was robust and performant.
| mhh__ wrote:
| Go is a system (no s) language IMO.
| DoctorOW wrote:
| I've said this before, and I'll say it again. People lump Go in
| with C/C++/Rust because they all (can) produce static binaries.
| I don't need to install Go's runtime like I install
| Java/NodeJS/Python runtimes. Honestly, I think it speaks so
| much to Go's accomplishments that it performs so well people
| intuitively categorize it with the systems languages rather
| than other managed languages.
| pjmlp wrote:
| Managed languages like Oberon, D, Modula-3, System C#...
| Thaxll wrote:
| There are very fast DBs written in Go, so this comment is
| irrelevant. What is the equivalent of
| https://github.com/VictoriaMetrics/VictoriaMetrics in another
| language?
| chakkepolja wrote:
| There's highly tuned Java software too, like Lucene; do you
| call Java a systems language?
|
| All in all I think the semantics debate is irrelevant. No one
| is going to use go for an OS only because someone on internet
| calls it a systems language.
| jpgvm wrote:
| Gorilla, which many of VM's ideas are based on, is in C++.
|
| Druid is Java and very fast but not like for like as it's an
| event database not a timeseries database. Pinot is in the
| same vein.
|
| Most of the very big and very fast databases you have used
| indirectly though web services like Netflix (Cassandra), etc
| are written in Java.
| samatman wrote:
| Large programs require memory management.
|
| Are you writing an application where Go's garbage collector
| will perform poorly relative to rolling your own memory
| management?
|
| Maybe; those applications exist. But maybe not, and it
| shouldn't be presumed.
|
| I'm more open to the argument from definition, which might be
| what you mean by 'inherently': there isn't an RFC we can point
| to for interpreting what a systems language _is_, and it could
| be useful to have a consensus that manual memory management is
| a necessary property of anything we call a systems language.
|
| No such consensus exists, and arguing that Go is a poor choice
| for $thing isn't a great way to establish whether it is or is
| not a systems language.
|
| Go users certainly seem to think so, and it's not a molehill I
| wish to die upon.
| zozbot234 wrote:
| Rust itself will most likely get some form of support for
| local, "pluggable" garbage collection in the near future. It's
| needed for managing general graphs with possible cycles, which
| might come up even in "systems programming" scenarios where
| performance is a focus - especially as the need for auto-
| managing large, complex systems increases.
| nu11ptr wrote:
| `Rc` and `Arc` have weak references which have worked just
| fine for me to break cycles in a graph. Not saying my use
| case is the most complex, but I haven't noticed this as a
| problem yet. YMMV
| zozbot234 wrote:
| But the whole point of weak references is that they don't
| get shared ownership (i.e. extend the lifetime) of their
| referenced values. That's not doing GC. They're fine when
| there's always some other (strong) reference that will
| ensure whatever lifetime is needed, but that's not the
| general case that GC is intended for.
| nu11ptr wrote:
| Weak pointers don't create a strong reference, correct,
| and that is exactly what is needed to break a cycle.
| Since it is a cycle, there is some other "owning" strong
| reference out there. Every use case I've seen generally
| has an obvious strong and weak reference (usually parent
| vs child). I'm sure there are trickier corner cases, but
| that is the typical case IMO.
|
| For everything else, Rust has no need for a tracing GC,
| as it has "compile-time" GC via static lifetime analysis,
| which is much better IMO and often avoids heap
| allocation altogether.
| jerf wrote:
| If your data is "mostly a tree but has the occasional back
| reference I can easily identify as a back reference", that
| works great and I've used it in other languages.
|
| But in the limit, as your data becomes maximally "graphy"
| and there isn't much you can do about it, this ceases to be
| a viable option. You have to be able to carve a "canonical
| tree" out of your graph for this to work and that's not
| always possible. (Practically so, I mean. Mathematically
| you can always define a tree, by simple fiat if nothing
| else, but that doesn't help if the only available
| definitions are not themselves practical.)
| nu11ptr wrote:
| Fair point - so far I've been lucky enough to avoid such
| use cases
| bborud wrote:
| Let's start at the beginning. What is a <<systems language>>
| and for what is it typically used?
| sophacles wrote:
| A systems language is a language used (typically) to write
| systems :P
|
| Jokes aside, this is kind of a fundamental problem with the
| term, and many terms around classifying programs. Also worth
| noting - "program" is a term that is a lot looser than people
| who typically live above the kernel tend to think.
| jeffffff wrote:
| I agree with you that garbage-collected languages are bad
| for systems programming, but it's not because garbage
| collection is inherently bad; it's because GC doesn't handle
| freeing resources other than memory. For better or worse,
| I've spent most of my professional career writing databases
| in Java, and I will not start a database product in Java or
| any other garbage-collected language again. Getting the
| error handling and resource cleanup right is way harder in
| Java or Go than in C++ or Rust, because RAII is the only
| sane way to do it.
| baq wrote:
| This was argued ad nauseam a decade ago, and it boils down
| to your definition of 'systems'. At Google scale, a system is
| a mesh of networked programs, not a kernel or low-level bit-
| banging tool.
| stingraycharles wrote:
| By that definition, Java is a systems language as well.
|
| I think Go makes a better trade-off than Java, but I struggle
| to come up with decent examples of projects one could write
| in Go and not in Java. Most of the "systems" problems that
| Java is unsuitable for, also apply to Go.
| coder543 wrote:
| > By that definition, Java is a systems language as well.
|
| I would fully agree that Java is a systems language.
|
| However, the definition of "systems language" has been a
| contentious issue for a very long time, and that debate
| seems unlikely to be resolved in this thread. I don't think
| the term itself is very useful, so it's probably better if
| everyone focuses on discussing things that actually matter
| to the applications they're trying to develop instead of
| arguing about nebulous classifications.
| phplovesong wrote:
| Go compiles to static binaries. Java needs the JVM. That
| already is a HUGE difference in "picking the right tool".
|
| Also, the JVM is way more heavy and resource intensive than
| a typical Go program. Go is great for cli tools, servers,
| and the usual "microservices" stuff, whatever it means to
| you.
| pjmlp wrote:
| Java can be AOT-compiled just like Go; the only
| difference is that until recently it wasn't a free-beer
| option to do so.
| cb321 wrote:
| >until recently
|
| Actually, for a long time (almost 20 years, I think), the
| gcc subsystem gcj let you do AOT compilation of Java to
| ELF binaries. [1] I think they had to be dynamically
| linked, but only to a few shared objects (unless you
| pulled in a lot via native dependencies, but that's kind
| of "on you").
|
| I don't recall any restrictions on how to use generated
| code, anymore than gcc-generated object code. So, I don't
| think the FSF/copyleft on the compiler itself nullifies a
| free beer classification. :-) gcj may not have done a
| great job of tracking Java language changes. So, there
| might be a "doesn't count as 'real Java'" semantic issue.
|
| [1] https://manpages.ubuntu.com/manpages/trusty/man1/gcj.1.html
| ptsneves wrote:
| The interesting thing I always heard in the Java world is
| that AOT was actually a drawback as the virtual machine
| allowed for just in time optimizations according to
| hotspots. Actually, if I remember correctly, the word
| HotSpot itself was even used as a tech trademark.
|
| I was always a bit skeptical, but given I was never much
| into Java I just assumed my skepticism was out of
| ignorance. Now, with what I know about profile-guided
| compilation, I can see it happening: a JIT language should
| have a performance advantage, especially if the optimal
| code paths change dynamically according to workload. Not
| even profile-guided compilation can easily handle that,
| unless I am ignorant of more than I thought.
| pjmlp wrote:
| Sun was ideologically against AOT, whereas all commercial
| vendors always had some form of either AOT or JIT caches.
|
| In fact, the JIT caches in OpenJDK come from Oracle/BEA's
| JRockit, while IBM's OpenJ9 AOT compilation is from the
| WebSphere Real Time JVM implementation.
| fluoridation wrote:
| I've heard the exact opposite. The supposed performance
| benefits of JIT compared to AOT (profile-guided
| optimization, run-time uarch-targeted optimization) never
| really materialized. There's been a lot of research since
| the late '90s into program transformation and it turned
| out that actually the most effective optimizations are
| architecture-independent and too expensive to be
| performed over and over again at startup or when the
| program is running. At the same time, deciding when it's
| worthwhile to reoptimize based on new profiling data
| turned out to be a much more difficult problem than
| expected.
|
| So the end result is that while both AOT (GCC, LLVM) and
| JIT (JVM, CLR) toolchains have been making gradual
| progress in performance, the JIT toolchains never caught
| up with the AOT ones as was expected in the '90s.
| pjmlp wrote:
| Good luck with inlining and devirtualization across DLLs
| with AOT.
|
| JIT caches with PGO get most of AOT benefits, that is why
| after the short stint with AOT on Android, Google decided
| to invest in JIT caches instead.
|
| The best toolchains can do both, so it is never a matter
| of either AOT or JIT.
|
| GCC and clang aren't investing in JIT features just for
| fun.
| fluoridation wrote:
| What's with the snappy tone?
|
| >Good luck with inlining and devirtualization across DLLs
| with AOT.
|
| An AOT compiler/linker is unable to inline calls across
| DLL boundaries because DLLs present a black-box
| interface. A JIT compiler would run into the exact same
| problem when presented with a DLL whose interface it
| doesn't understand or is incompatible with. If you really
| want a call inlined the solution is to link the caller
| and function statically (whether the native code
| generation happens at compile- or run-time), not to
| depend on unknown capabilities of the run-time.
|
| >The best toolchains can do both, so it is never a matter
| of either AOT or JIT.
|
| You're refuting a false dichotomy no one raised.
| cb321 wrote:
| Theory and practice can diverge, and it's easy to over-
| conclude from either with such complex systems. For
| example, I have seen gcc PGO make the very same training
| case used to collect the profile run _more slowly_. One
| might think that impossible, naively, but maybe it sounds
| more plausible if I put it differently: "steering the
| many code-generation heuristics with the profile failed
| in practice in that case". As with almost everything in
| computer systems, "it all depends...."
| pjmlp wrote:
| I seldom mention it, because gcj was abandoned in 2009
| when most contributors moved to the newly released
| OpenJDK; it was eventually removed from the GCC tree, and
| it never was as foolproof as the commercial versions.
| tptacek wrote:
| People do huge amounts of systems programming in Java,
| including in systems that are incredibly performance-
| sensitive.
| jpgvm wrote:
| Go is strictly less useful than Java because it has
| strictly less power. This is true for general-purpose
| programming (though somewhat remedied by the introduction
| of generics), and it's doubly true for "systems"
| applications:
|
| No access to raw threads. No ability to allocate/utilize
| off-heap memory (without CGo and nonsense, at least). Low
| throughput compared to Java's JIT (unsuitable for CPU-
| intensive tasks).
|
| The only thing I can think of in its favor is lower memory
| usage by default, but this is mostly just a JVM
| misconception: you can totally tune it for low memory usage
| (in a constrained env) or high memory efficiency, especially
| if using off-heap structures.
|
| On a stdlib level Java mostly wins, but Go has some
| highlights: it has an absolutely rock-solid and well-built
| HTTP and TLS/X.509/ASN.1 stack, for instance, and more
| batteries included vs. Java.
|
| Overall I think if the requirement is "goes fast" I will
| always choose Java.
|
| I may pick Go if the brief calls for something like a
| lightweight network proxy that should be I/O bound rather
| than CPU bound and everything I need is in stdlib and I
| don't need any fancy collections etc.
| jen20 wrote:
| > On a stdlib level Java mostly wins
|
| This isn't even true compared to other comparable
| platforms like .NET, let alone Go which has hands down
| the most useful and well constructed standard library in
| existence (yes, even better than Python).
| jpgvm wrote:
| Yeah I don't buy that.
|
| Especially not when things like this exist:
| https://pkg.go.dev/container
|
| And things like this don't: https://docs.oracle.com/en/ja
| va/javase/17/docs/api/java.base...
|
| As I mentioned Go does have a great HTTP and TLS stack
| but that doesn't do enough to put it on the same level.
| throwaway894345 wrote:
| I think you're mistaken on nearly every count. :)
|
| First of all, Go and Java exist at roughly the same
| performance tier. It will be less work to make Java beat
| Go for some applications and vice versa for other
| applications. Moreover, typical Go programs use quite a
| lot less memory than typical Java programs (i.e., there's
| more than one kind of performance).
|
| Secondly, Go can make syscalls directly, so it absolutely
| can use raw-threads and off-heap memory. These are
| virtually never useful for the "systems" domain (as
| defined above).
|
| Thirdly, I think Go's stdlib is better if only because it
| isn't riddled with inheritance. It also has a standard
| testing library that works with the standard tooling.
|
| Lastly, I think you're ignoring other pertinent factors
| like maintainability (does a new dev have to learn a new
| framework, style, conventions, etc to start
| contributing?), learning curve (how long does it take to
| onboard someone who is unfamiliar with the language? Are
| there plugins for their text editor or are they going to
| have to learn an IDE?), tooling (do you need a DSL just
| to define the dependencies? do you need a DSL just to
| spit out a static binary? do you need a CI pipeline to
| publish source code or documentation packages?), runtime
| (do you need a GC tuning wizard to calibrate your
| runtime? does it "just work" in all environments?), etc.
| jpgvm wrote:
| I disagree.
|
| Go is definitely not as fast as Java for throughput. It's
| gotten pretty good for latency sensitive workloads but
| it's simply left in the dust for straight throughput,
| especially if you are hammering the GC.
|
| Sure it can make syscalls directly but if you are going
| to talk about a maintainability nightmare I can't think
| of anything worse than trying to manipulate threads
| directly in Go. I had to do this in a previous Go project
| where thread pinning was important and even that sucked.
|
| That is just taste. Objectively collections and many
| other aspects of the Java stdlib completely destroy Go, I
| pointed out the good bits already.
|
| Again, taste. Java has a slightly steeper and longer
| learning curve but that is a cost you pay once and is
| amortized over all the code that engineer will contribute
| over their tenure.
|
| Using an IDE (especially if everyone is using the same
| one) is actually a productivity improvement, not an
| impairment but again - taste. Some people just don't like
| IDEs or don't like that you need to use a specific one to
| get the most out of a specific tech stack.
|
| Build systems in Java by and large fall into only 3
| camps, Maven, Gradle and a very small (but
| loud/dedicated) Bazel camp. Contrast that to Go which is
| almost always a huge pile of horrible Makefiles, CMake,
| Bazel or some other crazy homebrewed bash build system.
|
| You don't escape CI because you used Go, if you think you
| did then you are probably doing Go wrong.
|
| Java runtime trades simplicity for ability to be tuned,
| again taste. I personally prefer it.
|
| So no, I don't think I am mistaken. I think you just
| prefer Go over Java for subjective reasons. Which is
| completely OK but doesn't invalidate anything I said.
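For reference, the thread pinning mentioned above is done in Go with runtime.LockOSThread. A minimal sketch (the helper name is made up; this shows the mechanism, not the ergonomics being complained about):

```go
package main

import (
	"fmt"
	"runtime"
)

// runPinned executes f on a goroutine pinned to a single OS thread:
// until UnlockOSThread runs, the scheduler will not move this goroutine
// to another thread or run other goroutines on this one.
func runPinned(f func()) {
	done := make(chan struct{})
	go func() {
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		f() // thread-affine work, e.g. a C library with thread-local state
		close(done)
	}()
	<-done
}

func main() {
	runPinned(func() { fmt.Println("pinned to an OS thread") })
}
```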
| Thaxll wrote:
| > Build systems in Java by and large fall into only 3
| camps, Maven, Gradle and a very small (but
| loud/dedicated) Bazel camp. Contrast that to Go which is
| almost always a huge pile of horrible Makefiles, CMake,
| Bazel or some other crazy homebrewed bash build system.
|
| Well, Go does not need a 400+ page book to be
| understood, unlike Maven.
| throwaway894345 wrote:
| > Go is definitely not as fast as Java for throughput.
| It's gotten pretty good for latency sensitive workloads
| but it's simply left in the dust for straight throughput,
| especially if you are hammering the GC.
|
| Java is better for _GC throughput_, but your claim was
| about _compute throughput_ in general. Moreover, Go
| doesn't lean nearly as hard on GC as Java does in the
| first place (idiomatic value types, less boxing, etc.), so
| GC throughput doesn't imply overall throughput.
|
| > Sure it can make syscalls directly but if you are going
| to talk about a maintainability nightmare I can't think
| of anything worse than trying to manipulate threads
| directly in Go. I had to do this in a previous Go project
| where thread pinning was important and even that sucked.
|
| Thread pinning is a very rare requirement, typically you
| only need it when you're calling some poorly-written C
| library. If this is your requirement, then Go's solution
| will be less maintainable, but for everyone else the
| absence of the foot-gun is the more maintainable solution
| (i.e., as opposed to an ecosystem of intermingled OS
| threads and goroutines).
|
| > That is just taste. Objectively collections and many
| other aspects of the Java stdlib completely destroy Go, I
| pointed out the good bits already.
|
| Agreed that it's taste. Agreed that Java has more
| collections than Go, but I think it's a good thing that
| Go pushes people toward slices and hashmaps because those
| are the right tool for the job 90% of the time. I think
| there's some broader point here about how Java doesn't do
| a good job of encouraging people away from misfeatures
| (e.g., inheritance, raw threads, off-heap memory, etc).
|
| > Again, taste. Java has a slightly steeper and longer
| learning curve but that is a cost you pay once and is
| amortized over all the code that engineer will contribute
| over their tenure.
|
| Java has a _significantly_ steeper/longer curve--it's
| not only the language that you must learn, but also the
| stdlib, runtime, tools, etc., and these are typically
| considerably more complicated than Go's. Moreover, it's a
| cost an engineer pays once, but it's a cost an
| organization pays over and over (either because they have
| to train people in Java or narrow their hiring pool).
|
| > Build systems in Java by and large fall into only 3
| camps, Maven, Gradle and a very small (but
| loud/dedicated) Bazel camp. Contrast that to Go which is
| almost always a huge pile of horrible Makefiles, CMake,
| Bazel or some other crazy homebrewed bash build system.
|
| Go has one build system, `go build`. Some people will
| wrap those in Makefiles (typically very lightweight
| makefiles e.g., they just call `go build` with a few
| flags). A minuscule number of projects use Bazel--for all
| intents and purposes, Bazel is not part of the Go
| ecosystem. I haven't seen any "crazy homebrewed bash
| build system" either, I suspect this falls into the "for
| all intents and purposes not part of the Go ecosystem"
| category as well. I've been writing Go regularly since
| 2012.
|
| > You don't escape CI because you used Go, if you think
| you did then you are probably doing Go wrong.
|
| I claimed the CI burden is lighter for Go than Java, not
| that it goes away entirely.
|
| > Java runtime trades simplicity for ability to be tuned,
| again taste. I personally prefer it.
|
| I think it's difficult to accurately quantify, but I
| don't think it's a matter of taste. Specifically, I would
| wager that Go's defaults + knobs are less work than Java
| for something like 99% of applications.
|
| > So no, I don't think I am mistaken. I think you just
| prefer Go over Java for subjective reasons. Which is
| completely OK but doesn't invalidate anything I said.
|
| I agree that some questions are subjective, but I think
| on many objective questions you are mistaken (e.g.,
| performance, build tool ecosystem, etc).
| philosopher1234 wrote:
| I think your use of "subjective" is avoiding discussing
| things that are harder to prove but matter a great deal.
| philosopher1234 wrote:
| This is an argument from edge case capabilities that
| completely ignores maintenance costs + development time.
| Seems very naive to me.
| jpgvm wrote:
| It's not. If you are building a database or other
| "systems" software these are very relevant capabilities.
|
| Also development time of Java may be slightly longer in
| the early stages but I generally find refactoring of Java
| projects and shuffling of timelines etc is a ton easier
| than Go. So I think Java wins out over a longer period of
| time even if it starts off a bit slower.
|
| It's far from naive. I have written a shitton of Go code
| (also a shitton of Java if that wasn't already apparent).
| philosopher1234 wrote:
| You may not personally be naive, but i was talking about
| your analysis, not you.
|
| >Also development time of Java may be slightly longer in
| the early stages but I generally find refactoring of Java
| projects and shuffling of timelines etc is a ton easier
| than Go. So I think Java wins out over a longer period of
| time even if it starts off a bit slower.
|
| I think this topic is far too large to be answered in
| this brief sentence. I also think it deserves a higher
| allocation of your words than what you spared for Java's
| capabilities :)
|
| But yes, I see now that you are interested purely in
| performance in your argument and definition of systems
| software, in which case what you're saying may be true.
| BobbyJo wrote:
| Totally agree. If the argument is strictly more power is
| always better, then C++ would always win. Why doesn't it?
| Exactly what you reference, dev time and maintenance.
|
| Go was designed for simplicity. Of course it's not the
| fastest or most feature-rich. Its strong suit is that I
| can pop open any Go codebase and understand what's going
| on fairly quickly. I went from not knowing any Go to
| working with it effectively in a large codebase in a
| couple weeks. Not the case with Java. Not the case with
| most languages.
| jpgvm wrote:
| That wasn't the argument though, you are attacking a
| strawman. The argument was much more nuanced if you
| bothered to read it.
|
| Essentially it boils down to this. If I am writing
| -systems- software and I'm going to choose between Go or
| Java then the list of things I pointed out are the main
| differentiating features along with raw throughput which
| matters for things like databases which need to be able
| to do very fast index/bitmap/etc operations.
|
| Go is great for being simple and easy to get going.
| However that is completely worthless in systems software
| that requires years of background knowledge to
| meaningfully contribute to. The startup cost of learning
| a new codebase (or entirely new programming language)
| pales in comparison to the requisite background
| knowledge.
| BobbyJo wrote:
| > Go is strictly less useful than Java because it has
| strictly less power.
|
| Literally sentence one, so calling my argument straw-man
| is dishonest.
|
| > Essentially it boils down to this. If I am writing
| -systems- software and I'm going to choose between Go or
| Java then the list of things I pointed out are the main
| differentiating features along with raw throughput which
| matters for things like databases which need to be able
| to do very fast index/bitmap/etc operations.
|
| All true. In my experience though, the long tail of
| maintenance and bug fixes tend to result in decreasing
| performance over time, as well as a slowing of new
| feature support.
|
| All of that being said, these are all fairly pointless
| metrics when we can just look at the DBs being adopted
| and why people are adopting them. Plenty of projects use
| Go because of Go's strengths, so saying "that is
| completely worthless in systems software" is verifiably
| false. It's not worthless in any software, worth less
| maybe, but not worthless.
| throwaway894345 wrote:
| I don't think it's useful to frame "fitness for a given
| domain" as a binary, but yes, Java is often used
| successfully for this domain (although personally I think
| Go is an even better fit for a variety of reasons).
| geodel wrote:
| Go has user-defined value types, which Java does not have
| yet. That makes a huge difference in memory density for
| typical data structures, and it makes Go more suitable for
| low-overhead web services and CLI tools running in a few
| MBs, where Java needs at least a few hundred MBs.
| pionar wrote:
| > Go has user defined value types which Java does not
| yet.
|
| C# has this. A lot of people overlook C# in this area,
| probably because until recently, it was not cross-
| platform.
| socialdemocrat wrote:
| I would say Go is a systems programming language. A systems
| programming language is for creating services used by actual
| end user applications. That is pretty much what Go is being
| used for. Who is writing editors or drawing applications in Go?
| Nobody.
|
| Go does contain many of the things of interest to systems
| programmers such as pointers and the ability to specify memory
| layout of data structures. You can make your own secondary
| allocators. In short it gives you far more fine grained control
| over how memory is used than something like Java or Python.
|
| https://erik-engheim.medium.com/is-go-a-systems-programming-...
| xyproto wrote:
| The GC in Go is not mandatory.
| ptman wrote:
| In the language? In the implementation?
| xyproto wrote:
| Both. It can be paused or disabled at runtime.
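Concretely, the knob is runtime/debug.SetGCPercent, or the GOGC=off environment variable. A minimal sketch (the wrapper name is made up):

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// withGCDisabled runs f with automatic garbage collection turned off,
// then restores the previous GOGC percentage. Setting GOGC=off in the
// environment has the same effect without any code changes.
func withGCDisabled(f func()) {
	old := debug.SetGCPercent(-1) // -1 disables the automatic collector
	defer debug.SetGCPercent(old)
	f()
}

func main() {
	withGCDisabled(func() {
		// ... allocation-heavy critical section with no GC pauses ...
		// (runtime.GC() can still force a collection when convenient.)
		fmt.Println("automatic GC disabled")
	})
}
```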
| titzer wrote:
| You really haven't given any supporting information for your
| argument other than a vague feeling that GC is somehow bad. In
| fact you just pointed out many counterexamples to your own
| argument, so I'm not sure what to take away.
|
| I've seen this sentiment a lot, and I never see specifics. "GC
| is bad for a systems language" is a tribalist, firmly-held
| belief that is unsupported by hard data.
|
| On the other hand, huge, memory-intensive and garbage-collected
| systems have been deployed in vast numbers by thousands of
| different companies for decades, long before Go, within
| acceptable latency bounds. And shoddy, poorly performing
| systems have been written in C/C++ and failed spectacularly for
| all kinds of reasons.
| throwaway894345 wrote:
| Virtually invariably, "GC is bad" assumes (1) lots of garbage
| (2) long pause times. Go has idiomatic value types (so it
| generates much less garbage) and a low-latency garbage
| collector. People who argue against GC are almost always
| arguing against some Java GC circa 2005.
| MaulingMonkey wrote:
| This is the No True Scotsman argument. I mean, no true
| modern GC. And it's bullshit. Let's be topical and pick
| on Go, since that's the language in the title:
|
| https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i...
|
| 30% of CPU spent on GC, individual GC pauses already in the
| milliseconds, despite a tiny half-gig heap in 2019. For
| gamedev, a single millisecond in the wrong place can be
| enough to miss vsync and have unacceptable framerate
| stutter. In the right place, it's "merely" 10% of your
| entire frame budget, for VR-friendly, nausea-avoiding
| framerates near 100fps. Or perhaps 90% if "single digit
| milliseconds" might include 9ms.
|
| Meanwhile, the last professional project I worked on had
| 100ms pauses every 30 seconds because we were experimenting
| with duktape, which is still seeing active commits. Closer
| to a 32GB heap for that project, but most of that was
| textures. Explicit allocation would at least show where the
| problematic garbage/churn was in any profiler, but garbage
| collection meant a single opaque codepath for _all_ garbage
| deallocation... without even the benefit of explicit static
| types to narrow down the problem.
| titzer wrote:
| From your link (which I remember reading at the time):
|
| > So by simply reducing GC frequency, we saw close to a
| ~99% drop in mark assist work, which translated to a ~45%
| improvement in 99th percentile API latency at peak
| traffic.
|
| Did you look at the actual article? (Because it doesn't
| support your point). They added a 10GB memory ballast to
| keep the GC pacer from collecting too much. That is just
| a bad heuristic in the GC, and should have a tuning knob.
| I'd argue a tuning knob isn't so bad, compared to
| rewriting your entire application to manually malloc/free
| everything, which would likely result in oodles of bugs.
|
| Also:
|
| > And it's bullshit.
|
| Please, we can keep the temperature on the conversation
| down a bit by just keeping to facts and leaving out a few
| of these words.
| MaulingMonkey wrote:
| > Please, we can keep the temperature on the conversation
| down a bit by just keeping to facts and leaving out a few
| of these words.
|
| Sure. Let's avoid some of these words too:
|
| > unsupported, tribalist, firmly-held belief that is
| unsupported by hard data.
|
| Asking for examples is fine and great, but painting broad
| strokes of the dissenting camp before they have a chance
| to respond does nothing to help keep things cool.
| [deleted]
| MaulingMonkey wrote:
| > Did you look at the actual article? (Because it doesn't
| support your point).
|
| I did and it does for the point I intended to derive from
| said article:
|
| >> However, the GC pause times before and after the
| change were not significantly different. Furthermore, our
| pause times were on the order of single digit
| milliseconds, not the 100s of milliseconds improvement we
| saw at peak load.
|
| They were able to improve times via tuning. Individual GC
| pause times were still in the milliseconds. Totally
| acceptable for twitch's API servers (and in fact drowned
| out by the several hundred millisecond response times),
| but those numbers mean you'd want to avoid doing anything
| at all in a gamedev render thread that could potentially
| trigger a GC pause, because said GC pause will trigger a
| vsync miss.
|
| > I'd argue a tuning knob isn't so bad, compared to
| rewriting your entire application to manually malloc/free
| everything, which would likely result in oodles of bugs.
|
| Memory debuggers and RAII tools have ways to tackle this.
|
| I've also spent my fair share of time tackling oodles of
| bugs from object pooling, meant to work around performance
| pitfalls in GCed languages, made worse by the fact that
| said languages treated manual memory allocation as a
| second class citizen at best, providing inadequate
| tooling for tackling the problem vs languages that treat
| it as a first class option.
| titzer wrote:
| > but those numbers mean you'd want to avoid doing
| anything at all in a gamedev render thread that could
| potentially trigger a GC pause, because said GC pause
| will trigger a vsync miss.
|
| You might want to take a look at this:
|
| https://queue.acm.org/detail.cfm?id=2977741
| MaulingMonkey wrote:
| I have, it's a decent read - although somewhat
| incoherent. E.g. they tout the benefits of GCing when
| idle, then trash the idea of controlling GC:
|
| > Sin two: explicit garbage-collection invocation.
| JavaScript does not have a Java-style System.gc() API,
| but some developers would like to have that. Their
| motivation is proactively to invoke garbage collection
| during a non-time-critical phase in order to avoid it
| later when timing is critical. [...]
|
| So, no explicitly GCing when a _game knows_ it's idle.
| Gah. The worst part is these are entirely fine points...
| and somewhat coherent in the context of webapps and
| webpages. But then when one attempts to embed v8 - as one
| does - and suddenly you the developer are the one that
| might be attempting to time GCs correctly. At least then
| you have access to the appropriate native APIs:
|
| * https://v8docs.nodesource.com/node-7.10/d5/dda/classv8_
| 1_1_i... * https://v8docs.nodesource.com/node-7.10/d5/dda
| /classv8_1_1_i...
|
| A project I worked on had a few points where it had to
| explicitly call GC multiple times back to back.
| Intertwined references from C++ -> Squirrel[1] -> C++ ->
| Squirrel meant the first GC would finalize some C++
| objects, which would unroot some Squirrel objects, which
| would allow some more C++ objects to be finalized - but
| only one layer at a time per GC pass.
|
| Without the multiple explicit GC calls between unrooting
| one level and loading the next, the game had a tendency
| to "randomly"[2] ~double its typical memory budget
| (thanks to uncollected dead objects and the corresponding
| textures they were keeping alive), crashing OOM in the
| process - the kind of thing that would fail console
| certification processes and ruin marketing plans.
|
| [1]: http://squirrel-lang.org/
|
| [2]: quite sensitive to the timing of "natural"
| allocation-triggered GCs, and what objects might've
| created what reference cycles etc.
| titzer wrote:
| > So, no explicitly GCing when a game knows it's idle.
|
| I mean, that is literally what the idle time scheduler in
| Chrome does. It has a system-wide view of idleness, which
| includes all phases of rendering and whatever else
| concurrent work is going on.
|
| > Intertwined references from C++ -> Squirrel[1] -> C++
| -> Squirrel meant the first GC would finalize some C++
| objects, which would unroot some Squirrel objects, which
| would allow some more C++ objects fo be finalized - but
| only one layer at a time per GC pass.
|
| This is a nasty problem and it happens a lot interfacing
| two heaps, one GC'd and one not. The solution isn't less
| GC, it's more. That's why Chrome has GC of C++ (Oilpan)
| and is working towards a unified heap (this may already
| be done). You put the blame on the wrong component here.
| throwaway894345 wrote:
| I don't think you know what "no true scotsman" means--I'm
| not asserting that Go's GC is the "true GC" but that it
| is one permutation of "GC" and it defies the conventional
| criticisms levied at GC. As such, it's inadequate to
| refute GC in general on the basis of long pauses and lots
| of garbage, you must refute each GC (or at least each
| type/class of GC) individually. Also, you can see how
| cherry-picking pathological, worst-case examples doesn't
| inform us about the normative case, right?
| [deleted]
| MaulingMonkey wrote:
| >> And it's bullshit.
|
| > _cherry picks worst-case examples and represents them
| as normative_
|
| Neither of my examples are anywhere near worst-case. All
| texture data bypassed the GC entirely, for example,
| contributing to neither live object count nor GC
| pressure. I'm taking numbers from a modern GC with value
| types that _you yourself_ said should be fine, and pointed
| out that, hey, it's actually pretty not OK for anything that might
| touch the render loop in modern game development, even if
| it's not being used as the primary language GC.
| MaulingMonkey wrote:
| > I don't think you know what "no true scotsman" means--
| I'm not asserting that Go's GC is the "true GC"
|
| At no point in invoking
| https://en.wikipedia.org/wiki/No_true_Scotsman does one
| bother to define what a true Scotsman is, only what it is
| not by way of handwaving away any example of problems
| with a category by implying the category excludes them.
| It's exactly what you've done when you state "People who
| argue against GC are almost always arguing against" some
| ancient, nonmodern, unoptimized GC.
|
| Modern GCs have perf issues in some categories too.
|
| > As such, it's inadequate to refute GC in general on the
| basis of long pauses and lots of garbage, you must refute
| each GC (or at least each type/class of GC) individually.
|
| I do not intend to refute the value of GCs in general. I
| will happily use GCs _in some cases_.
|
| I intend to refute your overbroad generalization of the
| anti-GC camp, for which specific examples are sufficient.
|
| > Also, you can see how cherry-picking pathological,
| worst-case examples doesn't inform us about the normative
| case, right?
|
| My examples are neither pathological nor worst case. They
| need not be normative - but for what it's worth, they
| _do_ exemplify the normative case of my own experiences
| in game development across multiple projects with
| different teams at different studios, when language level
| GCs were used for general purposes, despite being
| bypassed for bulk data.
|
| It's also exactly what titzer was complaining was missing
| upthread:
|
| > I've seen this sentiment a lot, and I never see
| specifics. "GC is bad for systems language" is an
| unsupported, tribalist, firmly-held belief that is
| unsupported by hard data.
| Karrot_Kream wrote:
| Right. In my experience just taking a few steps (like pre-
| allocating buffers or arrays) decreases GC pressure enough
| that GC runs don't actually affect performance enough to
| matter (as long as you're looking at ~0.5-1 ms P99 response
| times). But there's always the strident group who says GCs
| are bad and never offer any circumstance where that could be
| true.
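A sketch of the kind of step being described: borrowing reusable buffers through sync.Pool so the hot path allocates almost nothing (illustrative names, not from any project mentioned here):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// A pool of reusable buffers: hot paths borrow one instead of allocating,
// so steady-state request handling produces very little garbage.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func handle(payload string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset() // reset before returning so the next borrower starts clean
		bufPool.Put(buf)
	}()
	buf.WriteString("processed: ")
	buf.WriteString(payload)
	return buf.String()
}

func main() {
	fmt.Println(handle("ping"))
}
```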
| titzer wrote:
| Indeed. What really kills is extremely high allocation
| rates and extremely high garbage production. I've seen
| internal numbers from $megacorp that show that trashy C++
| programs (high allocation + deallocation rates) look pretty
| much the same to CPUs as trashy Java programs, but are far
| worse in terms of memory fragmentation. Trashy C++ programs
| can end up spending 20+% of their execution time in
| malloc/free. That's a fleetwide number I've seen on
| clusters > 1M cores.
|
| I will admit that the programming _culture_ is different
| for many GC'd languages' communities, sometimes
| encouraging a very trashy programming style, which
| contributes to the perception that GC _itself_ is the
| problem, but based on my experience in designing and
| building GC'd systems, I _don't_ blame GC itself.
| ninkendo wrote:
| > I will admit that the programming culture is different
| for many GC'd languages' communities, sometimes
| encouraging a very trashy programming style, which
| contributes to the perception that GC itself is the
| problem
|
| For some languages (I'm looking at you, Java), there's
| not much of a way to program that _doesn't_ generate a
| bunch of garbage, because only primitives are treated as
| value types, and for Objects, heap allocations can only
| be avoided if escape analysis can prove the object
| doesn't outlast its stack frame (which isn't reliable in
| practice.) (Edit: or maybe it doesn't happen _at all_.
| Apparently escape analysis isn't used to put objects on
| the stack even if they are known to not escape:
| https://www.beyondjava.net/escape-analysis-java)
|
| I honestly can't imagine much of a way to program in Java
| that doesn't result in tremendous GC pressure. You could
| technically allocate big static buffers and use a
| completely different paradigm where every function is
| static and takes data at a defined offsets to said
| buffers, but... nobody really does this and the result
| wouldn't look anything like Java.
|
| Sometimes it's appropriate to blame the language.
| _ph_ wrote:
| It is indeed a huge problem of Java, that it often makes
| it difficult to avoid generating garbage. However, one
| still can reduce it a lot if trying hard. And be it by
| reimplementing selected parts of the standard libraries.
|
| But the job of avoiding garbage is much easier in Go :)
| titzer wrote:
| > I'm looking at you, Java
|
| > Sometimes it's appropriate to blame the language.
|
| Oh, I know, I was just being vague to be diplomatic. Java
| being generally trashy has been one of the major
| motivators for me to do Virgil. In Java, you can't even
| parse an integer without allocating memory.
| osigurdson wrote:
| GC is bad if the problem domain you are working on requires
| you to think about the behaviour of the GC all the time. GC
| behaviour can be subtle, may change from version to version
| and some behaviours may not even be clearly documented. If
| one has to think about it all the time, it is likely better
| just to use a tool where memory management is more explicit.
| However, I think for many examples of "systems software"
| (Kubernetes for example), GC is not an issue at all but for
| others it is an illogical first choice (though it often can
| be made to work).
| throwaway894345 wrote:
| Even in performance-critical software, you're not thinking
| about GC "all the time" but only in certain hot paths.
| Also, value types and some allocation semantics (which Go
| technically lacks, but the stack analyzer is intuitive and
| easily profiled so it has the same effect as semantics)
| make the cognitive GC burden much lower.
| emtel wrote:
| My argument against GC (and which applies similarly to JIT-
| based runtimes) is that the problems caused by GC pauses have
| non-local causes. If a piece of code ran slowly because of a
| GC pause, the cause of the pause is in some sense _the entire
| rest of the system_. You can't fix the problem with a
| localized change.
|
| Programs in un-managed languages can be slow too, and
| excessive use of malloc() is a frequent culprit. But the
| difference is that if I have a piece of code that is slow
| because it is calling malloc() too much, I can often (or at
| least some of the time) just remove the malloc() calls from
| that function. I don't have to boil the ocean and
| significantly reduce the rate at which my entire program
| allocates memory.
|
| I think another factor that gets ignored is how much you care
| about tail latency. I think GC is usually fine for servers
| and other situations where you are targeting a good P99 or
| P99.9 latency number. And indeed, this is where JVM, Go,
| node.js, and other GCed runtimes dominate.
|
| But, there are situations, like games, where a bad P99.9
| frame time means dropping a frame every 15 seconds (at
| 60fps). If you've got one frame skip every 10 seconds because
| of garbage collection pauses and you want to get to one frame
| skip every minute, that is _not_ an easy problem to fix.
|
| (Yes, I am aware that many commercial game engines have
| garbage collectors).
| bcrosby95 wrote:
| I don't want to try to bring up an exception that disproves
| your rule, but what about something like BEAM, where it has
| per-process (process = lightweight thread) heaps and GC.
| emtel wrote:
| I don't know anything about BEAM, but I don't think
| single-threading of any form really addresses the
| underlying problem. If you go to allocate something, and
| the system decides a GC is necessary in order to satisfy
| your allocation, then the GC has to run before your
| allocation returns.
| rbranson wrote:
| You can't share objects across threads (called
| "processes") in BEAM, so it's very different. The GC only
| ever needs to pause one call stack at a time to do a full
| GC cycle. Memory shared across processes is generally
| manually managed, typically more akin to a database than
| an object heap.
| fauigerzigerk wrote:
| _> I've seen this sentiment a lot, and I never see specifics.
| "GC is bad for systems language" is an unsupported,
| tribalist, firmly-held belief that is unsupported by hard
| data._
|
| I would argue it's not (very) hard data that we need in this
| case. My opinion is that the resource usage of infrastructure
| code should be as low as possible so that most resources are
| available to run applications.
|
| The economic viability of application development is very
| much determined by developer productivity. Many applications
| aren't even run that often if you think of in-house business
| software for instance. So application development is where we
| have to spend our resource budget.
|
| Systems/infrastructure code on the other hand is subject to
| very different economics. It runs all the time. The ratio of
| development time to runtime is incredibly small. We should
| optimise the heck out of infrastructure code to drive down
| resource usage whenever possible.
|
| GC has significant memory and CPU overhead. I don't want to
| spend double digit resource percentages on GC for software
| that could be written differently without being uneconomical.
| titzer wrote:
| > Systems/infrastructure code on the other hand is subject
| to very different economics. It runs all the time. The
| ratio of development time to runtime is incredibly small.
| We should optimise the heck out of infrastructure code to
| drive down resource usage whenever possible.
|
| I will assume by "infrastructure code" you mean things like
| kernels and network stacks.
|
| Unfortunately there are several intertwined issues here.
|
| First, we pay in completely different ways for writing this
| software in manually-managed languages. Security
| vulnerabilities. Bugs. Development time. Slow evolution. I
| don't agree with the tradeoffs we have made. This software
| is important and needs to be memory-safe. Maybe Rust will
| deliver, who knows. But we currently have a lot of latent
| memory management bugs here that have consistently clocked
| in at 2/3 to 3/4 of critical CVEs over several decades. That's
| a real problem. We aren't getting this right.
|
| Second, infrastructure code does not consume a lot of
| memory. Infrastructure code mostly manages memory and
| buffers. The actual heap footprint of the Linux kernel is
| pretty small; it mostly indexes and manages memory,
| buffers, devices, packets, etc. _That_ is where
| optimization should go; manage the most resources with the
| lowest overhead just in terms of data structures.
|
| > GC has significant memory and CPU overhead. I don't want
| to spend double digit resource percentages on GC for
| software that could be written differently without being
| uneconomical.
|
| Let's posit 20% CPU for all the things that a GC does. And
| let's posit 2X for enough heap room to keep the GC running
| concurrently well enough that it doesn't incur a lot of
| mutator pauses.
|
| If all that infrastructure is taking 10% of CPU and 10% of
| memory, we are talking adding 2% CPU and 10% memory.
|
| ABSOLUTE BARGAIN in my book!
|
| The funny thing is, that people made these same arguments
| back in the heyday of Moore's law when we were getting 2X
| CPU performance every 18 months. 2% CPU back then was a
| matter of weeks of Moore's law. Now? Maybe a couple of
| months. We consistently choose to spend the performance
| dividends of hardware performance improvements on...more
| performance? And nothing on safety or programmability? I
| seriously think we chose poorly here due to some
| significant confusion in priorities and real costs.
| fauigerzigerk wrote:
| _> I will assume by "infrastructure code" you mean things
| like kernels and network stacks._
|
| That, and things like database systems, libraries that
| are used in a lot of other software or language runtimes
| for higher level languages.
|
| _> The actual heap footprint of the Linux kernel is
| pretty small_
|
| And what would that footprint be if the kernel was
| written in Java or Go? What would the performance of all
| those device drivers be?
|
| You can of course write memory efficient code in GC
| languages by manually managing a bunch of buffers. But I
| have seen and written quite a bit of that sort of code.
| It's horribly unsafe and horribly unproductive to write.
| It's far worse than any C++ code I have ever seen. It's
| the only choice left when you have boxed yourself into a
| corner with a language that is unsuitable for the task.
|
| _> First, we pay in completely different ways for
| writing this software in manually-managed languages.
| Security vulnerabilities. Bugs. Development time_
|
| This is not a bad argument, but I think there has always
| been a very wide range of safety features in non-GC
| languages. C was never the only language choice. We had
| the Pascal family of languages. We had Ada. We got
| "modern" C++, and now we have Rust.
|
| If safety was ever good enough reason to use GC languages
| for systems/infrastructure, that time is now over.
| throwaway894345 wrote:
| > You can of course write memory efficient code in GC
| languages by manually managing a bunch of buffers. But I
| have seen and written quite a bit of that sort of code.
| It's horribly unsafe and horribly unproductive to write.
|
| Go uses buffers pretty idiomatically and they don't seem
| unsafe or unproductive. Maybe I'm not following your
| meaning?
|
| > If safety was ever good enough reason to use GC
| languages for systems/infrastructure, that time is now
| over.
|
| I don't know that I want GC for OS kernels and device
| drivers and so on, but typically people arguing against
| GC are assuming lots of garbage and long pause times;
| however, Go demonstrates that we can have low-latency GC
| and relatively easy control over how much garbage we
| generate and where that garbage is generated. It's also
| not hard to conceive of a language inspired by Go that is
| more aggressively optimized (for example, fully
| monomorphized generics, per this article or with a more
| sophisticated garbage collector).
|
| I think the more compelling reason to avoid a GC for
| kernel level code is that it implies that the lowest
| level code depends on a fairly complex piece of software,
| and that _feels wrong_ (but that 's also a weak criticism
| and I could probably be convinced otherwise).
| titzer wrote:
| > It's also not hard to conceive of a language inspired
| by Go that is more aggressively optimized (for example,
| fully monomorphized generics, per this article or with a
| more sophisticated garbage collector).
|
| Standard ML as implemented by MLton uses full
| monomorphization and a host of advanced functional
| optimizations. That code can be blazingly fast. MLton
| does take a long time to compile, though.
|
| I've been working on a systems language that is GC'd and
| also does full monomorphization--Virgil. I am fairly
| miserly with memory allocations in my style and the
| compiler manages to compile itself (50KLOC) at full
| optimization in 300ms and ~200MB of memory, not
| performing a single GC. My GC is really dumb and does a
| Cheney-style semispace copy, so it has horrible pauses.
| Even so, GC is invisible so the algorithm could be
| swapped out at any time.
|
| For an OS kernel, I think a GC would have to be pretty
| sophisticated (concurrent, highly-parallel, on-the-fly),
| but I think this is a problem I would love to be working
| on, rather than debugging the 19000th use-after-free bug.
|
| Go's GC is _very_ sophisticated, with very low pause
| times. It trades memory for those low pause times and can
| suffer from fragmentation because it doesn't compact
| memory. Concurrent copying is still a hard problem.
|
| Again, a problem I'd rather we had than the world melting
| down because we chose performance rather than security.
| fauigerzigerk wrote:
| My argument is about the economics of software
| development more than about any of the large number of
| interesting technical details we could debate for a very
| long time.
|
| There are higher level features that cause higher
| resource consumption. GC is clearly one such feature. No
| one denies that. So how do we decide where it makes more
| sense to use these features and where does it make less
| sense?
|
| What I'm saying is that we should let ourselves be guided
| by the ratio development_time / running_time. The smaller
| this ratio, the less sense it makes to use such "resource
| hogging" features and the more sense it makes to use
| every opportunity for optimisation.
|
| This is not only true for infrastructure/systems
| software. This is just one case where that ratio is very
| small. Another case would be application software that is
| used by a very large number of people, such as web
| browsers.
| titzer wrote:
| I understand your argument, it's been made for decades.
| Put in a lot of effort to save those resources. But it isn't
| about _effort_. We put in a lot of effort _and still got
| crap, even worse, security_. We put effort into the wrong
| things!
|
| We've majorly screwed up our priorities. Correctness
| should be so much higher up the priority list, probably
| #1, TBH. When it is a high priority, we _should_ be
| willing to sacrifice performance to actually get it. The
| correct question is not _if_ we should sacrifice
| performance, but _how much_. We didn't even get that
| right.
|
| But look, I know. Security doesn't sell systems, never
| has--benchmarks do. The competitive benchmarking
| marketplace is partly responsible. And there, there's
| been so much FUD on the subject that I feel we've all
| been hoodwinked and conned into putting performance at
| the top to all of our detriment. That was just dumb on
| our (collective) part.
|
| Let me put it another way. Go back to 1980. Suppose I
| offered you two choices. Choice A, you get a 1000x
| improvement in computer performance and memory capacity,
| but your system software is a major pain in the ass to
| write and full of security vulnerabilities, to the point
| where the world suffers hundreds of billions of dollars
| of lost GDP due to software vulnerabilities. Choice B,
| you get an 800x improvement in computer performance, a
| 500x improvement in memory capacity, and two thirds of
| that GDP loss just doesn't happen. Also, writing that
| software isn't nearly as much of a pain in the ass.
|
| Which did we choose? Yeah. That's where the disagreement
| lies.
| zimpenfish wrote:
| > mandatory GC managed memory
|
| It can be a right old bugger - I've been tweaking gron's memory
| usage down as a side project (e.g. 1M lines of my sample file
| into original gron uses 6G maxrss/33G peak, new tweaked uses
| 83M maxrss/80M peak) and there's a couple of pathological cases
| where the code seems to spend more time GCing than parsing,
| even with `runtime.GC()` forced every N lines. In C, I'd know
| where my memory was and what it was doing but even with things
| like pprof, I'm mostly in the dark with Go.
| randomdata wrote:
| _> I 'd argue that golang is inherently not a systems language_
|
| First you'd have to establish what "systems" means. That,
| you'll find, is all over the place. Some people see systems as
| low level components like the kernel, others the userland that
| allows the user to operate the computer (the set of Unix
| utilities, for example), you're suggesting databases and things
| like that.
|
| The middle one, the small command line utilities that allow you
| to perform focused functions, is a reasonably decent fit for
| Go. This is one of the places it has really found a niche.
|
| What's certain is that the Go team comes from a very different
| world to a lot of us. The definitions they use, across the
| board, are not congruent with what you'll often find elsewhere.
| Systems is one example that has drawn attention, but it doesn't
| end there. For example, what Go calls casting is the opposite
| of what some other languages call casting.
| slrz wrote:
| What is it that Go supposedly calls casting? The term (or its
| variations) does not show up in the language specification.
|
| People sometimes use it for type conversions but that's in
| line with usage elsewhere, no?
| jjtheblunt wrote:
| > with its mandatory GC managed memory
|
| is that factual, in the general case?
|
| it seems there exists a category of Go programs for which
| escape analysis entirely obviates heap allocations, in which
| case if there is any garbage collection it originates in the
| statically linked runtime.
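That category is measurable: testing.AllocsPerRun can confirm that a value whose address never escapes stays on the stack (a sketch; `go build -gcflags=-m` prints the compiler's escape-analysis decisions directly):

```go
package main

import (
	"fmt"
	"testing"
)

type point struct{ x, y int }

// sum only reads through its argument, so the parameter does not leak;
// noinline forces a real call so the result isn't trivially constant-folded.
//
//go:noinline
func sum(p *point) int { return p.x + p.y }

func main() {
	// Escape analysis proves &p never outlives the closure's frame, so p is
	// stack-allocated and this loop produces zero garbage for the GC.
	allocs := testing.AllocsPerRun(1000, func() {
		p := point{x: 3, y: 4}
		_ = sum(&p)
	})
	fmt.Println("heap allocations per call:", allocs)
}
```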
| IgorPartola wrote:
| Hold the phone. Where did the leap from "golang is not a
| systems language" to "poor choice for anything performance or
| memory sensitive" come from?
|
| That is a huge leap you are making there that I don't think is
| exactly justified.
| synergy20 wrote:
| I agree. golang is not really a systems programming language.
| It's more like java, a language for applications.
|
| It does have one niche: it includes most if not everything
| you need to run a network-based service (or micro-service), e.g.
| http, even https, dns... are baked in. You no longer need to
| install openssl on windows for example, in golang one binary
| will include all of those(with CGO disabled too).
|
| I do system programming in c and c++, maybe rust later when I
| have time to grasp that, there is no way for using Go there.
|
| For network related applications, Go thus far is my favorite,
| nothing beats it, one binary has it all, can't be easier to
| upgrade in the field too.
| svnpenn wrote:
| Typical gatekeeping. I like Go, because it lets me get stuff
| done. You could say the same about JavaScript, but I think Go
| is better because of the type system. C, C++ and Rust are
| faster in many cases, but man are they awful to work with.
|
| C and C++ don't really have package management to speak of; it's
| basically "figure it out yourself". I tried Rust a couple of
| times, but the Result/Option paradigm basically forces you into
| this deeply nested code style that I hate.
| dahfizz wrote:
| > C and C++ don't really have package management to speak of
|
| I hear this complaint often, but I consider it a feature of
| C. You end up with much less third party dependencies, and
| the libraries you do end up using have been battle tested for
| decades. I much prefer that to having to install hundreds of
| packages just to check if a number is even, like in JS.
| pjmlp wrote:
| Ah, that is why all major OSes end up having some form of
| POSIX support to keep those C applications going.
| dahfizz wrote:
| You need to have some OS API; what is wrong with POSIX?
|
| And what does POSIX have to do with package management?
| remexre wrote:
| my big personal nit is poor async support; e.g. async
| disk IO is recent in Linux, and AFAIK all the Unices
| implement POSIX aio as a threadpool anyway. not being
| able to wait on "either this mutex/semaphore has been
| signaled, or this IO operation has completed" is also
| occasionally very annoying...
| pjmlp wrote:
| POSIX is UNIX rebranded as C runtime for OSes that aren't
| UNIX.
| throwaway894345 wrote:
| I think you're contradicting yourself. You end up with
| fewer third party dependencies in C because C developers
| end up rewriting from scratch what they would otherwise
| import, and these rewrites have much _less_ battle-testing
| than popular libraries in other languages. Moreover, they
| also have more opportunity for failure since C is so much
| less safe than other languages. Even in those few libraries
| which have received "decades of battle-testing" we still
| see critical vulnerabilities emerge. Lastly, you're
| implying a dichotomy between C and JS in a thread about Go,
| which doesn't have the same dependency sprawl as JS.
| dcgudeman wrote:
| Hmm yes, why stop there? Why have functions? Just
| reimplement business logic all over your codebase. That way
| each block of code has everything you need to know. Sure,
| functions have been adopted by every other
| language/ecosystem and are universally known to be useful
| despite a few downsides but you could say the same about
| package management and that hasn't deterred you yet.
| dahfizz wrote:
| You're arguing against a straw man. I could just as
| easily say "why not make every line of code its own
| function?"
|
| C has libraries, and my comment made it clear that they
| are useful.
|
| Argue against my actual point:
|
| By not having a bespoke package manager, and instead
| relying on the system package manager, you end up with
| higher quality dependencies and with dramatically less
| bloat in C than other language ecosystems. It is all the
| benefit and none of the drawbacks of npm-like ecosystems.
| dcgudeman wrote:
| I don't agree with your assessment that libraries in C
| are higher quality. Additionally I have yet to see a
| system package manager that enables developers to install
| dependencies at a specific version solely for a project
| without a lot of headaches. All the venv stuff in Python
| is necessary because Python dependencies are installed
| system-wide. The idea that the C/C++ ecosystem is better off
| because it doesn't have its own package manager is a
| bizarre idea at best.
| steveklabnik wrote:
| > You end up with far fewer third-party dependencies,
|
| https://wiki.alopex.li/LetsBeRealAboutDependencies
| averagedev wrote:
| I've found Go to be much simpler than Rust, especially syntax
| wise. However, in Rust you can use the ? operator which
| propagates errors. In Go you have to check err != nil.
| svnpenn wrote:
| > Rust you can use the ? operator
|
| That doesn't work with all types:
|
| https://stackoverflow.com/a/65085003
| monocasa wrote:
| Only automatically printing something when returning it
| from main doesn't work with all types with the ?
| operator. And frankly 'handling errors by auto print and
| exit' is a bit of a code smell anyway, it's not much
| better than just .unwrap() on everything in main.
| jdmnd wrote:
| It works for any type that implements the
| `std::error::Error` trait, which is something you can
| easily implement for your own types. If you want your
| errors to be integers for some reason, you can wrap that
| type in a zero-sized "newtype" wrapper, and implement
| `Error` for that.
|
| The Stack Overflow answer you linked seems to be claiming
| that it's simply easier to return strings, but I wouldn't
| say this is a restriction imposed by the language.
| svnpenn wrote:
| > easily implement for your own types
|
| have you ever actually done that? I have; it's not easy.
| Please don't try to hand-wave away the negatives of the
| Rust type system.
| mcronce wrote:
| I do it frequently. It is indeed easy
| TheDong wrote:
| > have you ever actually done that? I have, its not easy.
|
| Yes. I do it frequently. "#[derive(Error, Debug)]":
| https://github.com/dtolnay/thiserror#example
|
| Much easier than implementing the error interface in go.
|
| Rust is powerful enough to allow macros to remove
| annoying boiler-plate, and so most people using rust will
| grab one of the error-handling crates that are de-facto
| standard and remove the minor pain you're talking about.
|
| In go, it's not really possible to do this because the
| language doesn't provide such macros (i.e. the old third-
| party github.com/pkg/errors wanted you to implement
| 'Cause', but couldn't provide sugar like 'thiserror'
| does for it because go is simply less powerful).
|
| I've found implementing errors in go to be much more
| error-prone and painful than in rust, and that's not to
| mention every function returning untyped errors, meaning
| I have no clue what callers should check for and handle
| new errors I add.
| svnpenn wrote:
| > Much easier than implementing the error interface in
| go.
|
| is this a joke? You have to import a third party package,
| just to implement an error interface? Here is a Go example,
| no imports:
|
|     type errorString string
|
|     func (e errorString) Error() string { return string(e) }
| TheDong wrote:
| It was not a joke.
|
| Let's look at a common example: you want to return two
| different types of errors and have the caller distinguish
| between them. Let me show it to you in rust and go.
|
| Rust:
|
|     #[derive(Error, Debug)]
|     pub enum MyErrors {
|         #[error("NotFound: {0}")]
|         NotFound(String),
|         #[error("Internal error")]
|         Internal(#[source] anyhow::Error),
|     }
|
| The equivalent go would be something like:
|
|     type NotFoundErr struct {
|         msg string
|     }
|
|     func (err NotFoundErr) Error() string {
|         return "NotFound: " + err.msg
|     }
|
|     func (err NotFoundErr) Is(target error) bool {
|         if target == nil {
|             return false
|         }
|         // All NotFoundErrs are considered the same, regardless of msg
|         _, ok := target.(NotFoundErr)
|         return ok
|     }
|
|     type InternalErr struct {
|         wrapped error
|     }
|
|     func (err InternalErr) Error() string {
|         return fmt.Sprintf("Internal error: %s", err.wrapped)
|     }
|
|     func (err InternalErr) Unwrap() error {
|         return err.wrapped
|     }
| svnpenn wrote:
| I don't think you realize how ridiculous this comment is.
| You're comparing 10 lines of Go with 200 of Rust:
|
| https://github.com/dtolnay/thiserror/blob/master/src/lib.rs
| pornel wrote:
| Nobody's saying you can't use Go or must use C/C++/Rust. If
| Go works for you, that's great.
|
| The issue is about positioning of Go as a language. It's
| confusing due to being (formerly) marketed as a "systems
| programming language" that is typically a domain of
| C/C++/Rust, but technically Go fits closer to capabilities of
| Java or TypeScript.
| pjmlp wrote:
| Is writing a compiler, linker, kernel emulation layer,
| TCP/IP stack or a GPU debugger, systems programming?
| ohYi55 wrote:
| git clone?
|
| I mean do we need bespoke package management tooling for
| everything now?
|
| Seems like an outdated systems admin meme that violates KISS,
| explodes dependency chains, risks security, etc. IT feels
| infected by sunk cost fallacy.
|
| It's electron state in machines. The less altogether the
| better.
| mcronce wrote:
| What, specifically, do you mean when you say Rust is "awful
| to work with"? With C and C++ I agree, but I've had a
| _drastically_ better development experience in Rust than Go.
| svnpenn wrote:
| You should probably read the rest of the comment...
| mcronce wrote:
| Are Result and Option really the only thing? Because
| nesting scopes based on Err/None is rarely the right
| choice, just like nesting a scope based on `if err ==
| nil` isn't typically something you want to do in Go, or
| `if errno == 0` in C:
|
| - You can panic trivially with `.unwrap()`
|
| - You can propagate the condition up trivially with `?`
|
| - `?` doesn't work with all types, but it does work with
| Option, and it does work with the vast majority of error
| types - making your custom error work with it is very
| easy (if you're whipping up an application or prototype
| and want your error handling very simple, `anyhow::Error`
| works great here)
|
| - You can convert None to an Err condition trivially
| with `.ok_or()?`
|
| - In cases where it makes sense, you can trivially use a
| default value with `.unwrap_or_default()`
|
| And all of these require a _lot_ less code than `if err
| != nil { return nil, err }`
|
| And all of these allow you to use the Ok/Some value
| directly in a function call, or in a method chain, while
| still enabling the compiler to force you to handle the
| Err/None case
|
| The common theme here being "trivial" :) Result/Option
| are a big piece of that better developer experience.
| reikonomusha wrote:
| I have a slightly contrary opinion. Systems software is a very
| large umbrella, and much under that umbrella is not encumbered
| by a garbage collector whatsoever. (To add insult to injury,
| the term's definition isn't even broadly agreed upon, similar
| to the definition of a "high-level language".) Yes, there are
| some systems applications where a GC can be a hindrance in
| practice, but these days I'm not even sure it's a majority of
| systems software.
|
| I think what's more important for the systems programmer is (1)
| the ability to inspect the low-level behavior of functions,
| like through their disassembly; (2) be reasonably confident how
| code will compile; and (3) have some dials and levers to
| control aspects of compiled code and memory usage. All of these
| things can and are present, not only in some garbage collected
| languages, but also garbage-collected languages with a dynamic
| type system!
|
| Yes, there are environments so spartan and so precision-
| oriented that even a language's built-in allocator cannot be
| used (e.g., malloc), in which case using a GC'd language is
| going to be an unwinnable fight for control. But if you only
| need to do precision management of memory that isn't
| pervasive in all of your allocation patterns, then using a
| language like C feels like throwing the baby out with the bath
| water. It's very rarely "all or nothing" in a modern, garbage-
| collected language.
| no_circuit wrote:
| The article is from a database company, so I'll assume that
| approximates the scope. My scope for the GC discussion would
| include other parts that could be considered similar
| software: cluster-control plane (Kubernetes), other
| databases, and possibly the first level of API services to
| implement a service like an internal users/profiles or auth
| endpoints.
|
| The tricky thing is GC works most of the time, but if you are
| working at scale you really can't predict user behavior, and
| so all of those GC-tuning parameters that were set six months
| ago no longer work properly. A good portion of production
| outages are likely related to cascading failures due to too
| long GC pauses, and a good portion of developer time is spent
| testing and tuning GC parameters. It is easier to remove
| and/or just not allow GC languages at these levels in the
| first place.
|
| On the other hand IMO GC-languages at the frontend level are
| OK since you'd just need to scale horizontally.
| initplus wrote:
| It's impossible to spend any time tuning Go's GC
| parameters, because Go intentionally provides almost none.
|
| Go's GC is optimized for latency, it doesn't see the same
| kind of 1% peak latency issues you get in languages with a
| long tail of high latency pauses.
|
| Also consider API design - Java API (both in standard &
| third party libs) tend to be on the verbose side and build
| complex structures out of many nested objects. Most Go
| applications will have less nesting depth so it's
| inherently an easier GC problem.
|
| System designs that rely on allocating a huge amount of
| memory to a single process exist in a weird space - big
| enough that perf is really important, but small enough that
| single-process is still a viable design. Building massive
| monoliths that allocate hundreds of GBs at peak load just
| doesn't seem "in vogue" anymore.
|
| If you are building a distributed system, keeping any
| individual process's peak allocation to a reasonable size
| is almost automatic.
| erik_seaberg wrote:
| You tune Go's GC by rewriting your code. It's like
| turning a knob but slower and riskier.
| coder543 wrote:
| You tune GC in Go by profiling allocations, CPU, and
| memory usage. Profiling shows you where the problems are,
| and Go has some surprisingly nice profiling tools built
| in.
|
| Unlike turning a knob, which has wide reaching and
| unpredictable effects that may cause problems to just
| move around from one part of your application to another,
| you can address the actual problems with near-surgical
| precision in Go. You can even add tests to the code to
| ensure that you're meeting the expected number of
| allocations along a certain code path if you need to
| guarantee against regressions... but the GC is so rarely
| the problem in Go compared to Java, it's just not
| something to worry about 99% of the time.
|
| If knobs had a "fix the problem" setting, they would
| already be set to that value. Instead, every value is a
| trade off, and since you have hundreds of knobs, you're
| playing an _impossible_ optimization game with hundreds
| of parameters to try to find the set of parameter values
| that make your _entire_ application perform the way you
| want it to. You might as well have a meta-tuner that just
| randomly turns the knobs to collect data on all the
| possible combinations of settings... and just hope that
| your next code change doesn't throw all that hard work
| out the window. Go gives you the tools to tune different
| parts of your code to behave in ways that are optimal for
| them.
|
| It's worth pointing out that languages like Rust and C++
| also require you to tune allocations and deallocations...
| this is not strictly a GC problem. In those languages,
| like in Go, you have to address the actual problems
| instead of spinning knobs and hoping the problem goes
| away.
|
| The one time I have actually run up against Go's GC when
| writing code that was trying to push the absolute limits
| of what could be done on a fleet of rather resource
| constrained cloud instances, I wished I was writing Rust
| for this particular problem... I definitely wasn't
| wishing I could be spinning Java's GC knobs. But, I was
| still able to optimize things to work in Go the way I
| needed them to even in that case, even if the level of
| control isn't as granular as Rust would have provided.
| exdsq wrote:
| I think I toggled with the GC for less than a week in my
| eight years experience including some systems stuff - maybe
| this is true at FANG scale but not for me!
| coder543 wrote:
| Go doesn't offer a bunch of GC tuning parameters. Really
| only one parameter, so your concerns about complex GC
| tuning here seem targeted at some other language like Java.
|
| This is a drawback in some cases, since one size never
| truly fits all, but it dramatically simplifies things for
| most applications, and the Go GC has been tuned for many
| years to work well in most places where Go is commonly
| used. The developers of Go continue to fix shortcomings
| that are identified.
|
| Go's GC prioritizes very short STWs and predictable
| latency, instead of total GC throughput, and Go makes GC
| throughput more manageable by stack allocating as much as
| it can to reduce GC pressure.
|
| Generally speaking, Go is also known for using very little
| memory compared to Java.
| no_circuit wrote:
| Yes, my comments were targeted to Java and Scala. Java
| has paid the bills for me for many years. I'd use Java
| for just about anything except for high load
| infrastructure systems. And if you're in, or want to be
| in, that situation, then why risk finding out two years
| later that a GC-enabled app is suboptimal?
|
| I'd guess you'd have no choice if in order to hire
| developers, you had to choose a language that the people
| found fun to use.
| astrange wrote:
| Is go's GC not copying/generational? I think "stack
| allocation" doesn't really make sense in a generational
| GC, as everything sort of gets stack allocated. Of
| course, compile-time lifetime hints might still be useful
| somehow.
| coder543 wrote:
| > Is go's GC not copying/generational?
|
| Nope, Go does not use a copying or generational GC. Go
| uses a concurrent mark and sweep GC.
|
| Even then, generational GCs are not as cheap as stack
| allocation.
| socialdemocrat wrote:
| Java _needs_ lots of GC tuning parameters because you
| have practically no way of tuning the way your memory is
| used and organized in Java code. In Go you can actually
| do that. You can decide how data structures are nested,
| you can take pointers to the inside of a a block of
| memory. You could make e.g. a secondary allocator,
| allocating objects from a contiguous block of memory.
|
| Java doesn't allow those things, and thus it must instead
| give you lots of levers to pull on to tune the GC.
|
| It is just a different strategy of achieving the same
| thing:
|
| https://itnext.io/go-does-not-need-a-java-style-gc-ac99b8d26...
| apalmer wrote:
| > A good portion of production outages are likely related
| to cascading failures due to too long GC pauses, and a good
| portion of developer time is spent testing and tuning GC
| parameters.
|
| Can't really accept that without some kind of quantitative
| evidence.
| no_circuit wrote:
| No worries. It is not meant to be quantitative. For a few
| years of my career that has been my experience. For this
| type of software, if I'm making the decision on what
| technology to use, it won't be any GC-based language. I'd
| rather not rely on promises that GC works great, or is
| very tunable.
|
| One could argue that I could just tune my services from
| time to time. But I'd just reduce the surface area for
| problems by not relying upon it at all -- both a
| technical and a business decision.
| EdwardDiego wrote:
| > A good portion of production outages are likely related
| to cascading failures due to too long GC pauses, and a good
| portion of developer time is spent testing and tuning GC
| parameters
|
| After 14 years in JVM dev in areas where latency and
| reliability are business critical, I disagree.
|
| Yes, excessive GC stop the world pauses can cause latency
| spikes, and excessive GC time is bad, and yes, when a new
| GC algorithm is released that you think might offer
| improvements, you test it thoroughly to determine if it's
| better or worse for your workload.
|
| But a "good portion" of outages and developer time?
|
| Nope. Most outages occur for the same old boring reasons -
| someone smashed the DB with an update that hits a
| pathological case and deadlocks processes using the same
| table, a DC caught fire, someone committed code with a very
| bad logical bug, someone considered a guru heard that gRPC
| was cool and used it without adequate code review and
| didn't understand that gRPC's load balancing defaults to
| pick first, etc. etc.
|
| The outages caused by GC were very very few.
|
| Outages caused by screw-ups or by a lack of understanding
| of the subtleties of a piece of tech are as common as they
| are in every other field of development.
|
| Then there's the question of what outages GCed languages
| _don't_ suffer.
|
| I've never had to debug corrupted memory, or how a use
| after free bug let people exfiltrate data.
| throwaway894345 wrote:
| > The tricky thing is GC works most of the time, but if you
| are working at scale you really can't predict user
| behavior, and so all of those GC-tuning parameters that
| were set six months ago no longer work properly. A good
| portion of production outages are likely related to
| cascading failures due to too long GC pauses, and a good
| portion of developer time is spent testing and tuning GC
| parameters. It is easier to remove and/or just not allow GC
| languages at these levels in the first place.
|
| Getting rid of the GC doesn't absolve you of the problem,
| it just means that rather than tuning GC parameters, you've
| encoded usage assumptions in thousands of places scattered
| throughout your code base.
| kubb wrote:
| Reading the title I'm worried, should I keep using reflection
| instead?
| jerf wrote:
| If the information in this article is make-or-break for your
| program, you probably shouldn't have chosen Go.
|
| In the grand space of all programming languages, Go is fast. In
| the space of compiled programming languages, it's on the slower
| end. If you're in a "counting CPU ops" situation it's not a
| good choice.
|
| There is an intermediate space in which one is optimizing a
| particular tight loop, certainly, I've been there, and this can
| be nice to know. But if it's beyond "nice to know", you have a
| problem.
|
| I don't know what you're doing with reflection but the odds are
| that it's wildly slower than anything in that article though,
| because of how it works. Reflection is basically like a
| dynamically-typed programming language runtime you can use as a
| library in Go, and does the same thing dynamically-typed
| languages (modulo JIT) do on their insides, which is
| essentially deal with everything through an extra layer of
| indirection. Not just a function call here or there...
| _everything_. Reading a field. Writing a field. Calling a
| function, etc. Everywhere you have runtime dynamic behavior,
| the need to check for a lot of things to be true, and
| everything operating through extra layers of pointers and table
| structs. Where the article is complaining about an extra CPU
| instruction here and an extra pointer indirection there, you've
| signed up for extra function calls and pointer indirections
| by the dozens. If you can convert reflection to generics it
| will almost certainly be a big win.
|
| (But if you cared about performance you were probably also
| better off with an interface that didn't fully express what you
| meant and some extra type switches.)
| shadowgovt wrote:
| This is good high-level advice as well as low-level advice.
|
| Go is positioned to be most useful as an alternative to Java,
| and to C++ where performance isn't the key factor (i.e.
| projects where C++ would be chosen because "Enh, it's a big
| desktop application and C++ is familiar to a lot of
| developers," not because the project actually calls for being
| able to break out into assembly language easily or where
| fine-tuning performance is more important than tool-provided
| platform portability).
| azth wrote:
| In practice, it's used as an alternative to python and ruby
| and nodejs. It can't fully do what Java or C# do.
| geodel wrote:
| I mean, of course. I have not seen IBM WebSphere Server
| 6.0.1 written in Go. Neither is there a full-fledged
| SOAP/WSDL engine in Go. So clearly Go is less capable.
| coder543 wrote:
| Well, that is simply not true at all.
|
| Go is a perfectly capable replacement for Java and C#.
| Many huge projects that would likely never be written in
| Python have been written in Go when they would have
| otherwise been written in Java or C# in years past:
| Kubernetes, Prometheus, HashiCorp Vault and Terraform,
| etcd, CoreDNS, TiDB, Loki, InfluxDB, NATS, Docker, Caddy,
| Gitea, Drone CI, Faktory, etc. The list goes on and on.
|
| What, exactly, are you saying that Go can't do that Java
| can?
|
| Go is _not_ a perfectly capable replacement for Rust, for
| example, because Rust offers extremely low level control
| over all resource usage, making it much easier to use for
| situations where you need every last ounce of
| performance, but neither C# nor Java offer the
| capabilities Rust offers either.
|
| I like C# just fine (Java... not so much), but your
| comment makes no sense. Certainly, I would rather use Go
| than most scripting languages; having static types and
| great performance makes a lot of tasks easier. But that
| doesn't mean Go is somehow less capable than Java or
| C#... it is a great alternative to both. If someone needs
| more than Go can provide, they're going to rewrite in
| Rust, C++, or C, not Java or C#.
| marwatk wrote:
| > What, exactly, are you saying that Go can't do that
| Java can?
|
| Runtime library addition (plugins) and dependency
| injection are two big ones. (We can argue the merit
| separately, but they're not possible in Go)
|
| I think if Java had easily distributable static binaries
| k8s would have stayed Java (it started out as Java).
| jerf wrote:
| Plugins are barely possible and utterly impractical, so
| no objection there.
|
| DI is totally possible, just about every system I build
| is nothing but dependency injection. What confuses people
| in the Java world is that you don't need a framework for
| it, you just _do it_. You could say the language simply
| supports a simple version of it natively.
|
| If you want something much more complicated like the Java
| way, there are some libraries that do it, but few people
| find them worthwhile. They are a lot of drama for what is
| in Go not that much additional functionality.
|
| This is one of the many places the interfaces not
| requiring declaration of conformance fundamentally
| changes Go vs. Java and leaves me still preferring Go
| even if Java picks up every other thing from Go. You
| don't need a big dependency injection framework; you just
| declare yourself an interface that matches what you use
| out of some 3rd-party library, then pass the value from
| the 3rd-party library in to your code naturally.
| Dependency injected. All other things you may want to do
| with that dependency, like swap in a testing
| implementation, you just do with Go code.
|
| (And I personally think that if Java's interfaces were
| satisfied like Go's, there would never have been a Go.)
| coder543 wrote:
| > Runtime library addition (plugins)
|
| https://pkg.go.dev/plugin
|
| Linux only, but it exists and it works... I just wouldn't
| recommend that particular pattern for almost anything.
|
| Either some kind of RPC or compile-time plugins would be
| better for almost all cases.
|
| - With RPC plugins (using whatever kind of RPC that you
| prefer), you get the benefit of process isolation in case
| that plugin crashes, the plugin can be written in any
| language, and other programs can easily reuse those
| plugins if they desire. The Language Server Protocol is a
| great example of an RPC plugin system, and it has had a
| huge impact throughout the developer world.
|
| - With compile-time plugins, you get even better
| performance due to the ahead-of-time optimizations that
| are possible. Go programs compile so quickly that it's
| not a big deal to compile variants with different plugins
| included... this is what Caddy did early on, and this
| plugin architecture still works out well for them last I
| checked.
|
| > dependency injection
|
| https://go.dev/blog/wire
|
| Java-style DI isn't very idiomatic for Go, and it's just
| a pattern (the absence of which would not prevent
| applications from being developed, the purpose of this
| discussion)... but there are several options for doing DI
| in Go, including this one from Google.
| randomdata wrote:
| _> Runtime library addition (plugins)_
|
| I don't see anything inherent to Go that would prevent it.
| gc even added rudimentary support some time ago,
| fostering the addition of the plugin package[1], but
| those doing the work ultimately determined that it wasn't
| useful enough to dedicate further effort towards
| improving it.
|
| There was a proposal to remove it, but it turns out that
| some people are using runtime library addition, and so it
| remains.
|
| [1] https://pkg.go.dev/plugin
| shadowgovt wrote:
| I believe Go supports DI via Wire
| (https://github.com/google/wire).
| lalaithion wrote:
| Very few people are actually answering your question, so I'll
| answer it: Generics are slower than concrete types, and are
| slower than simple interfaces. However, the article does not
| bother to compare generics with reflection, and my intuition
| says that generics will be faster than reflection.
| pdpi wrote:
| Definitely not. In the general case, you will make things
| simpler and faster by turning reflection-based code into
| generic code.
|
| What this article says is that a function that is generic on an
| interface introduces a tiny bit of reflection (as little as is
| necessary to figure out if a type conforms to an interface and
| get an itab out of it), and that tiny bit of reflection is
| quite expensive. This means two things.
|
| One, if you're not in a position where you're worried about
| what does or does not get devirtualized and inlined, this isn't
| a problem for you. If you're using reflection at all, this
| definitely doesn't apply to you.
|
| Two, reflection is crazy expensive, and the whole point of the
| article is that the introduction of that tiny bit of reflection
| can make function calls literally twice as slow. If you are in
| a position where you care about the performance of function
| calls, you're never really going to improve upon the situation
| by piling on even more reflection.
| LanceH wrote:
| If you're worried you should benchmark the differences on your
| requirements.
| AYBABTME wrote:
| Use generics if it makes your dev experience better. Profile if
| it's slow. Optimize the slow bits.
| morelisp wrote:
| If you're using reflection or storing a bare interface{}, you
| should probably instead try using generics.
|
| If you're using real interfaces, you should keep using
| interfaces.
|
| If you care about performance, you should not try to write
| Java-Streams / FP-like code in a language with no JIT and a
| non-generational non-compacting GC.
| linkdd wrote:
| Premature optimization is a bad thing.
|
| Just implement naively, then if you have performance issues
| identify the bottleneck.
| morelisp wrote:
| Ignorance of how your language works is a bad thing.
|
| Knowing where performance issues with certain techniques
| might arise is not premature optimization. Implement with an
| appropriate level of care, including performance concerns.
| Not every kind of poor performance appears as a clear spike
| in a call graph, and even fewer can be fixed without changing
| any external API.
| linkdd wrote:
| > Ignorance of how your language works is a bad thing.
|
| And I never said anything remotely close to contradict this
| statement.
|
| > Knowing where performance issues with certain techniques
| might arise is not premature optimization.
|
| It is:
|
| - Python: should I use a for loop, a list comprehension
| or the map function?
|
| - C++: should I use a std::list, std::vector, ...?
|
| - Go: should I use interface{} or generics?
|
| The difference between those options is subtle and
| completely unrelated to the problem you want to solve.
|
| > Implement with an appropriate level of care, including
| performance concerns.
|
| Step 1: solve your problem naively, aka: make it work
|
| Step 2: add tests, separate business logic from
| implementation details, aka: make it right
|
| Step 3: profile / benchmark to see where the chokepoints
| are and optimize them, aka: make it fast
|
| Chances are that if you have deeply nested loops, generics
| vs interface{} will be the last of your problems.
|
| To take the C++ example again, until you have implemented
| your algorithm, you don't know what kind of operations (and
| how often) you will do with your container. So you can't
| know whether std::list or std::vector fits best.
|
| In Go, until you have implemented your algorithm, you don't
| know how often you will have to use generics / reflection,
| so you can't know what will be the true impact on your
| code.
|
| The "I know X is almost always faster, so I'll use it
| instead of Y" approach will bite you more often than you
| can count.
|
| > Not every kind of poor performance appears as a clear
| spike in a call graph
|
| CPU usage, memory consumption, idling/waiting times, etc...
| Those are the kind of metrics you care about when
| benchmarking your code. No one said you only look at spike
| in a call graph.
|
| But still, to look for such information, you need to have
| at least a first implementation of your problem's solution.
| Doing this before is a waste of time and energy because 80%
| of the time, your assumptions are wrong.
|
| > and even fewer can be fixed without changing any external
| API.
|
| This is why you "make it work" and "make it right" before
| you "make it fast".
|
| This way you have a clear separation between your API and
| your implementation details.
| morelisp wrote:
| You're giving fine advice for well-scoped tasks with
| minimal design space (well, sort of - using std::list
| _ever_ is laughable - but if you had said unordered_map
| vs. map, sure, so I take the broad point). But some of
| us have been around the block a few times, and now need
| to make sure those spaces are delineated for others in a
| way that won't force them into a performance corner.
|
| > until you have implemented your algorithm, you don't
| know what kind of operations (and how often) you will do
| with your container.. until you have implemented your
| algorithm, you don't know how often you will have to use
| generics / reflection, so you can't know what will be the
| true impact on your code.
|
| I don't mean to brag, but I guess I'm a lot better at
| planning ahead than you. I don't usually have the whole
| program written in my head before I start, but I also
| can't remember any time I had to reach for a hammer as
| big as reflect and didn't expect to very early on, and
| most of the time I know what I intend to do to my data!
|
| > This is why you "make it work" and "make it right"
| before you "make it fast"... This way you have a clear
| separation between your API and your implementation
| details.
|
| This is not possible. APIs force performance constraints.
| Maybe wait until your API works before micro-optimizing
| it, but also maybe think about how many pointers you're
| going to have to chase and methods your users will need
| to implement in the first place because you probably
| don't get to "optimize" those later without breaking the
| API. You write about "the bottleneck", but there's not
| always a single bottleneck distinct from "the API".
| Sometimes there's a program that's slow because there's a
| singular part that takes 10 seconds and could take 1
| second. But sometimes it's slow because every different
| bit of it is taking 2ns where it could take 1ns.
|
| Consider the basic read-some-bytes API in Go vs. Python
| (translated into Go, so the difference is obvious):
| type GoReader interface { Read([]byte) (int, error) }
| type PyReader interface { Read(int) ([]byte, error) }
|
| You're never going to make an API like PyReader anywhere
| near as fast as GoReader, no matter how much optimization
| you do!
| linkdd wrote:
| > using std::list ever is laughable
|
| https://baptiste-wicht.com/posts/2012/11/cpp-benchmark-
| vecto...
|
| > some of us have been around the block a few times
| though, and now need to make sure those spaces are
| delineated for others in a way that won't force them into
| a performance corner.
|
| This, like the rest of your comment, is just
| patronizing and condescending.
|
| > I don't mean to brag, but I guess I'm a lot better at
| planning ahead than you
|
| See previous point...
|
| > I also can't remember any time I had to reach for a
| hammer as big as reflect and didn't expect to very early
| on
|
| This is not what I said at all. Let's say you know early
| on, before any code is written, you will need reflection.
| Can you tell me how many calls to the reflection API will
| happen beforehand? Is it `n`? `n*log(n)`? `n^2`? Will
| you use reflection at every corner, or just on the
| boundaries of your task? Once implemented, could it be
| refactored in a simpler way? You don't know until you've
| written the code.
|
| > most of the time I know what I intend to do to my data
|
| "what" is the spec, "how" is the code, and there is
| multiple answers to the "how", until you write them and
| benchmark them, you can't know for sure which one is the
| best, you can only have assumptions/hypothesis. Unless
| you're doing constantly exactly the same thing.
|
| > but also maybe think about how many pointers you're
| going to have to chase and methods your users will need
| to implement in the first place because you probably
| don't get to "optimize" those later without breaking the
| API.
|
| Basically, "write the spec before jumping into code".
| Which is the basis of "make it work, make it right, make
| it fast" because if you don't even know what problem
| you're solving, there is no way you can do anything
| relevant.
|
| > You write about "the bottleneck", but there's not
| always a single bottleneck distinct from "the API".
|
| I never implied there is a single bottleneck. But if you
| separate the implementation details from the high-level
| API, they sure are distinct. For example, you can solve
| the N+1 problem in a GraphQL API without changing its
| schema.
|
| If your implementation details leak into your API, it just
| means they're poorly separated.
|
| > You're never going to make an API like PyReader
| anywhere near as fast as GoReader, no matter how much
| optimization you do!
|
| Because Python is interpreted and Go is compiled. Under
| the hood, the OS uses `ssize_t read(int fd, void *dest,
| size_t count)`, and there is an upper limit to the
| `count` parameter (specific to the OS/kernel).
|
| Python's IO API knows this and allocates a buffer only
| once under the hood; it would be equivalent to having a
| PyReader implementation using a GoReader interface + a
| preallocated []byte slice.
|
| I can't tell you which one is faster without a benchmark
| because the difference is so subtle, so I won't.
| 8note wrote:
| The language is a tool for a job.
|
| If I'm using low torque, I don't need to know the yield
| strength of my wrench
| slackfan wrote:
| Meh. The people who screamed loudest about generics missing from
| Go aren't going to use the language now that it has them, and
| will find something new to complain about.
|
| The language will suffer now with additional developmental and
| other overhead.
|
| The world will continue turning.
| trey-jones wrote:
| This is a really long and informative article, but I would
| propose a change to the title here: "Generics can make your
| Go code slower" reads like the expected outcome, whereas the
| article's conclusion leans more towards "Generics don't
| always make your code slower", and it also enumerates some
| good ways to use generics as well as some anti-patterns.
| [deleted]
| SomeCallMeTim wrote:
| In C++, generics (templates) are zero-cost abstractions.
|
| So no, generics do _not_ de facto make code slower.
| BobbyJo wrote:
| I think his point was that they definitely won't make it
| faster (more abstraction means more indirection), so the
| expectation from most (myself included) would be that using
| them incurs a performance penalty, maybe not directly via
| their implementation, but via their use in broader terms.
| SomeCallMeTim wrote:
| Using templates in C++ _can_ make code faster, though.
| Because you can write the same routine with more
| abstraction and _less_ indirection.
|
| I've used C++ templates effectively as a code generator to
| layer multiple levels of abstractions into completely
| customized code throughout the abstraction.
| chakkepolja wrote:
| We don't know of a way to implement generic types without
| (vtable dispatch + boxing) cost AND without monomorphization
| cost. Some languages do the former, some the latter, some a
| combination of the two.
|
| Monomorphization:
| * code bloat
| * slow compiles
| * debug builds may be slow (esp. C++)
|
| Dynamic dispatch & boxing (usually both are needed):
| * not zero cost
|
| Pick your poison.
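| In Go terms, the two flavours look roughly like this
| (illustrative sketch; Go 1.18 actually sits in between, sharing
| instantiations via its "gcshape" scheme):

```go
package main

import "fmt"

// Boxed/dynamic flavour: every call goes through the interface's
// method table, and non-pointer values get boxed when stored in
// the interface.
type Adder interface{ Add(int) int }

type Counter int

func (c Counter) Add(n int) int { return int(c) + n }

func sumDynamic(a Adder, n int) int { return a.Add(n) } // vtable dispatch

// Monomorphized flavour: for each concrete T the compiler can
// emit (or, in Go's case, share) a specialized body where the
// call can be resolved statically.
func sumGeneric[T Adder](a T, n int) int { return a.Add(n) }

func main() {
	c := Counter(40)
	fmt.Println(sumDynamic(c, 2)) // 42, via dynamic dispatch
	fmt.Println(sumGeneric(c, 2)) // 42, via specialized code
}
```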
| SomeCallMeTim wrote:
| "Zero-cost" in that context refers to runtime performance.
| It _always_ refers to runtime performance.
|
| And code bloat, as I've said elsewhere, is vastly overblown
| as a problem. Another commenter pointed out that link-time
| optimization removes most of the bloat. The rest is
| customized code that's optimized per-instantiation.
|
| Slow compiles _are_ an issue with C++ templates. They're
| literally a Turing-complete code-generation language of
| their own, and they can perform complex calculations at
| compile time, so yes, they tend to make compiles take
| longer when you're using them extensively. But the point I
| was making was about runtime performance. That's why C++
| compilers often perform incremental compilation, which can
| limit the development time cost.
|
| Debug builds can simply be slow in C++ with or without
| templates. C++ templates really don't affect debug build
| runtime performance in any material fashion; writing the
| code out customized for each given type should have
| identical performance to the template-generated version of
| the code, unless there's some obscure corner case I'm not
| considering.
| monocasa wrote:
| There are no true zero cost abstractions under all
| situations. In the general case they make things faster, but
| I've personally made C++ code faster by un-templating code to
| relieve I$ pressure, and also to allow the compiler to make
| smarter optimizations when it has less code to deal with. The
| optimizer passes practically have a finite window they can
| look at because of the complexity class of a lot of optimizer
| algorithms.
| josefx wrote:
| C++ can suffer from negative performance from template bloat
| in two ways:
|
| Templated symbol names are gigantic. This can impact program
| link and load times significantly in addition to the inflated
| binary size.
|
| Duplication of identical code for every type, for example the
| methods of std::vector<int> and std::vector<unsigned int>
| should compile to the same instructions. There are linker
| flags that allow some deduplication, but those have their own
| drawbacks; another trick is to actively use void pointers for
| code parts that do not need to know the type, allowing them
| to be reused behind a type-safe template-based API.
| fbkr wrote:
| > There are linker flags that allow some deduplication but
| those have their own drawbacks
|
| As long as you use --icf=safe I don't see any drawback, and
| most of the time it results in almost identical reductions
| to --icf=all since not many real programs compare addresses
| of functions.
| josefx wrote:
| I think that requires separate function sections, which
| themselves may cause bloat and data duplication.
| gmfawcett wrote:
| That's only 99% of the story. :) Having too many
| specializations of a C++ template can lead to code bloat,
| which can degrade cache locality, which can degrade
| performance.
| pjmlp wrote:
| Depends if LTO is used.
| mcronce wrote:
| You're definitely right. While it's not a particularly
| common problem, it does exist; one thing I'd really like to
| see enter the compiler world is an optimization step to use
| vtable dispatch (or something akin to Rust's enum_dispatch,
| since all concrete types should be knowable at compile
| time) in these cases.
|
| I expect it would require a fair amount of tuning to become
| useful, but could be based on something analogous to the
| function inliner's cost model, along with number of calls
| per type. Could possibly be most useful as a PGO type step,
| where real-world call frequency with each concrete type is
| considered.
| nu11ptr wrote:
| enum dispatch in Rust is one of my favorite tricks. Most
| of the time you have a limited number of implementations,
| and enum dispatch is often more performant and even less
| limiting (than say trait objects)
| mcronce wrote:
| I'm a huge fan. It's very little work to use, as long as
| all variants can be known to the author, and as long as
| you aren't in a situation where uncommon variants
| drastically inflate the size of your common variants,
| it's a performance win, often a big one, compared to a
| boxed trait object.
|
| Even when you have to box a variant to avoid inflating
| the size of the whole enum, that's still an improvement
| over a `dyn Trait` - it involves half as much pointer
| chasing
|
| It'd be cool to see this added as a compiler optimization
| - even for cases where the author of an interface can't
| possibly know all variants (e.g. you have a `pub fn` that
| accepts a `&dyn MyTrait`), the compiler can still see every
| concrete type in the final binary.
| SomeCallMeTim wrote:
| In my experience, code bloat from templates is overblown.
|
| Inlining happens with or without template classes.
| gmfawcett wrote:
| That's fair. I guess if you need the functionality in
| your program, you need the functionality: the codegen
| approach doesn't matter that much. And like pjmlp said,
| LTO can make a difference too. Thanks for your thoughts,
| these kinds of exchanges make me smarter. :)
| asvitkine wrote:
| Zero cost in terms of runtime performance, but you pay binary
| size for it. It's a trade-off between the two...
| addcninblue wrote:
| Is it the expected outcome? I was under the initial impression
| that the author also noted:
|
| > Overall, this may have been a bit of a disappointment to
| those who expected to use Generics as a powerful option to
| optimize Go code, as it is done in other systems languages.
|
| where the implementation would smartly inline code and have
| performance no worse than doing so manually. I quite
| appreciated the call to attention that there's a nonobvious
| embedded footgun.
|
| (As a side note, this design choice is quite interesting, and I
| appreciate the author diving into their breakdown and thoughts
| on it!)
| Ensorceled wrote:
| Interestingly the original title and your proposed title imply,
| to me, the opposite of what I think they imply to you. This
| suggestion is really unclear.
| cube2222 wrote:
| Great article, just skimmed it, but will definitely dive deeper
| into it. I thought Go was doing full monomorphization.
|
| As another datapoint I can add that I tried to replace the
| interface{}-based btree that I use as the main workhorse for
| grouping in OctoSQL[0] with a generic one, and got around a 5%
| speedup in terms of records per second.
|
| That said, compiling with Go 1.18 vs Go 1.17 got me a 10-15%
| speedup by itself.
|
| [0]:https://github.com/cube2222/octosql
| morelisp wrote:
| > That said, compiling with Go 1.18 vs Go 1.17 got me a 10-15%
| speedup by itself.
|
| Where did you see this speedup? Other than `GOAMD64` there
| wasn't much in the release notes about compiler or stdlib
| performance improvements so I didn't rush to get 1.18-compiled
| binaries deployed, but maybe I should...
|
| (I do expect some nice speedups from using Cut and
| AvailableBuffer in a few places, but not without some
| rewrites.)
| cube2222 wrote:
| I've experienced that speedup on an ARM MacBook Pro. I've
| just checked on Linux AMD64 and there's no performance
| difference there.
| hencq wrote:
| It's probably because of the new register passing calling
| convention. From https://tip.golang.org/doc/go1.18
|
| > Go 1.17 implemented a new way of passing function
| arguments and results using registers instead of the stack
| on 64-bit x86 architecture on selected operating systems.
| Go 1.18 expands the supported platforms to include 64-bit
| ARM (GOARCH=arm64), big- and little-endian 64-bit PowerPC
| (GOARCH=ppc64, ppc64le), as well as 64-bit x86 architecture
| (GOARCH=amd64) on all operating systems. On 64-bit ARM and
| 64-bit PowerPC systems, benchmarking shows typical
| performance improvements of 10% or more.
| ptomato wrote:
| Yeah, 1.17 got register (instead of stack) calling
| convention on amd64; 1.18 expanded that to arm64, which
| should be responsible for most of that performance
| improvement.
| coder543 wrote:
| GOAMD64 _could_ be significant, so I 'm not sure why your
| comment seems to dismiss it?
|
| Also, as the article mentions, Go 1.18 can now inline
| functions that contain a "range" for loop, which previously
| was not allowed, and this would contribute performance
| improvements for some programs by itself. The new register-
| based calling convention was extended to ARM64, so if you're
| running Go on something like Graviton2 or an Apple Silicon
| laptop, you could expect to see a measurable improvement from
| that too. (edit: the person you replied to confirmed they're
| using Apple Silicon, so definitely a major factor.)
|
| The Go team is always working on performance improvements, so
| I'm sure there are others that made it into the release
| without being mentioned in the release notes.
| jatone wrote:
| what I expect to happen, now that golang has generics and
| reports like these are showing up, is that golang will explore
| monomorphizing generics and get hard numbers. they may also
| choose to take some of the compilation speed they've gained
| from linker optimizations and spend it on generics.
|
| I can't imagine monomorphizing being that big of a deal during
| compilation if the generation is defered and results are cached.
| whimsicalism wrote:
| I am unfamiliar with Go. This article discusses that they have
| decided to go for runtime lookup. Is there any reason why that
| implementation might make monomorphizing more difficult?
| jatone wrote:
| nope. it was an intentional trade off with respect to
| compilation speed. once generics have baked for a bit with
| real world usage said decision will almost certainly be
| revisited.
|
| edit: for example, one could envision the compiler generating
| the top n specializations per generic function based on usage
| and then using the current non-specialized version for
| the rest.
| jimmaswell wrote:
| > you create an exciting universe of optimizations that are
| essentially impossible when using boxed types
|
| Couldn't a JIT do this?
| fulafel wrote:
| > Inlining code is great. Monomorphization is a total win for
| systems programming languages: it is, essentially, the only form
| of polymorphism that has zero runtime overhead
|
| Blowing your icache can result in slowdowns. In many cases it's
| worth having smaller code even if it's a bit slower when
| microbenchmarked cache-hot, to avoid evicting other frequently
| used code from the cache in the real system.
| masklinn wrote:
| The essay is missing a "usually", but it's true that
| monomorphisation is a gain in the vast majority of situations
| because of the data locality and optimisation opportunities
| offered by all the calls being static. Though obviously that
| assumes a pretty heavy optimisation pipeline (so languages like
| C++ or Rust benefit a lot more than a language with a lighter
| AOT optimisation pipeline like Java).
|
| Much as with JITs (though probably with higher thresholds),
| issues occur for megamorphic callsites (when a generic function
| has a ton of instances), but that should be possible to dump
| for visibility, and there are common and pretty easy solutions
| for at least some cases e.g. trampolining through a small
| generic function (which will almost certainly be inlined) to
| one that's already monomorphic is pretty common when the
| generic bits are mostly a few conversions at the head of the
| function (this sort of trampolining is common in Rust, where
| "conversion" generics are often used for convenience purposes
| so e.g. a function will take an `T: AsRef<str>` so the caller
| doesn't have to extract an `&str` themselves).
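| Transposed to Go (an illustrative sketch with invented names),
| that trampolining idea looks like a thin generic wrapper that
| converts once at the head and forwards to a monomorphic body:

```go
package main

import "fmt"

// The monomorphic workhorse: one compiled body, no generic
// dictionaries involved.
func countVowels(s string) int {
	n := 0
	for _, r := range s {
		switch r {
		case 'a', 'e', 'i', 'o', 'u':
			n++
		}
	}
	return n
}

// Thin generic trampoline: accepts anything string-like,
// converts once, and forwards to the concrete implementation.
// It is small enough that the compiler should typically be able
// to inline it, so the generic layer costs little or nothing.
func CountVowels[T ~string](s T) int {
	return countVowels(string(s))
}

type Name string

func main() {
	fmt.Println(CountVowels("banana"))    // 3
	fmt.Println(CountVowels(Name("Eve"))) // 1 (uppercase 'E' not counted)
}
```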
| ki_ wrote:
| Generics are a generic solution, but they are absolutely
| necessary in my opinion.
| sedatk wrote:
| This is a great article yet with an unnecessarily sensationalist
| headline. Generics can be improved in performance over time, but
| a superstition like "generics are slow" (not the exact headline,
| but what it implies to reader) can remain stuck in our heads
| forever. I can see developers sticking to the dogma of "never
| use generics if you want fast code", resorting to terrible
| duplication, and introducing more bugs.
| zellyn wrote:
| (off-topic) Anyone else using Firefox know why the text starts
| out light gray and then flashes to unreadably dark gray after the
| page loads? (The header logo and text change from gray to blue
| too)
| wtetzner wrote:
| I'm using Firefox and don't see that issue. Maybe some kind of
| plugin you have installed?
| brundolf wrote:
| This is super interesting and well-written. Also, wow, that
| generated-assembly-viewer widget is slick.
| nunez wrote:
| yeah the formatting on this article was insanely good for a
| technical blog post. Good job, Planetscale marketing!
| socialdemocrat wrote:
| Really well written article. I liked that the author tried to
| keep a simple language around a fair amount of complex topics.
|
| Although the article paints the Go solution for generics somewhat
| negative, it actually made me more positive to the Go solution.
|
| I don't want generic code to be pushed everywhere in Go. I like
| Go to stay simple and it seems the choices the Go authors have
| made will discourage overuse of Generics. With interfaces you
| already avoid code duplication so why push generics? It is just a
| complication.
|
| Now you can keep generics to the areas were Go didn't use to work
| so great.
|
| Personally I quite like that Go is trying to find a niche
| somewhere between languages such as Python and C/C++. You get
| better performance than Python, but they are not seeking zero-
| overhead at any cost like C++ which dramatically increases
| complexity.
|
| Given the huge amount of projects implemented with Java, C#,
| Python, Node etc there must be more than enough cases where Go
| has perfectly good performance. In the more extreme cases I
| suspect C++ and Rust are the better options.
|
| Or if you do number crunching and more scientific stuff then
| Julia will actually outperform Go, despite being dynamically
| typed. Julia is a bit opposite of Go. Julia has generics
| (parameterized types) for performance rather than type safety.
|
| In Julia you can create functions taking interface types and
| still get inlining and max performance. Just throwing it out
| there, as many people seem to think that to achieve max
| performance you always need a complex statically typed language
| like C++/D/Rust. No you don't. There are also very high-speed
| dynamic languages (well, only Julia I guess at the moment;
| possibly LuaJIT and Terra).
| AtNightWeCode wrote:
| Is there any large project that has done an in-place replacement
| to use generics and has been benchmarked? I doubt that the
| change is even measurable in general.
| ctvo wrote:
| Bravo on the informative content and presentation. That component
| that shows the assembly next to syntax highlighted code? _chefs
| kiss_
| maxekman wrote:
| Similar to how the GC has become faster with each version, we
| can expect the generics implementation to get faster too. I
| wouldn't pay much attention to conclusions about performance
| from the initial release of the feature. The Go team is quite
| open about their approach.
| JaggerFoo wrote:
| throwaway894345 wrote:
| The article is like 9k words and it only mentions Rust twice in
| passing.
| zachruss92 wrote:
| For me Go has replaced Node as my preferred backend language. The
| reason is because of the power of static binaries, the confidence
| that the code I write today can still run ten years from now, and
| the performance.
|
| The difference in the code I'm working with is being able to
| handle 250 req/s in node versus 50,000 req/s in Go without me
| doing any performance optimizations.
|
| From my understanding Go was written with developer ergonomics
| first, with performance as a lower priority. Generics undoubtedly
| make it a lot easier to write and maintain complex code. That may
| come at a performance cost but for the work I do even if it cuts
| the req/s in half I can always throw more servers at the problem.
|
| Now if I was writing a database or something where performance is
| paramount I can understand where this can be a concern, it just
| isn't for me.
|
| I'd be very curious what orgs like CockroachDB and even K8s think
| about generics at the scale they're using them.
| RedShift1 wrote:
| 250 vs. 50,000 req/s seems like too big of a difference to me.
| Sure, Go is faster than Node, but Node is no slouch either; you
| might want to dig deeper into why you only got 250 req/s
| with Node.
| akvadrako wrote:
| That could mostly be due to multithreading. That comes free
| with go but requires a different model in node.
| KwisaksHaderach wrote:
| Doesn't sound unrealistic if you have a mix load of IO and
| raw processing.
| zelphirkalt wrote:
| Go was created with simplicity of feature set in mind, which
| does not automatically translate into developer ergonomics. It
| rather offers a least common denominator of language features,
| so that devs who previously only used other languages like Java
| can handle it. This way Google aimed at attracting those devs:
| they wouldn't have to learn much to make the switch.
|
| True developer ergonomics, as far as the programming language
| itself goes, stems from language features which make goals
| easy to accomplish in little code, in a readable way, using
| well-crafted concepts of the language. Having to go to great
| lengths because your language lacks a feature (like generics
| in Go for a long time) is not developer ergonomics.
|
| There is of course the aspect of tooling for a language, but
| that does not necessarily have to do with programming language
| design. The same goes for the standard library.
| vorpalhex wrote:
| > The difference in the code I'm working with is being able to
| handle 250 req/s in node versus 50,000 req/s in Go without me
| doing any performance optimizations.
|
| Your node code should be in the 2k reqs/s range trivially, with
| many frameworks comfortable offering 5k+.
|
| It is never going to be as fast as go, but it will handle most
| cases.
| throwaway894345 wrote:
| How can you make these claims without knowing how his
| application handles requests? Not everything is a trivial
| database read/write op.
| hu3 wrote:
| Related. The introduction of Generics in Go revived an issue
| about the ergonomics of typesafe Context in a Go HTTP framework
| called Gin: https://github.com/gin-gonic/gin/issues/1123
|
| If anyone can contribute, please do.
| kevwil wrote:
| Seems obvious; like, did someone expect all the extra abstraction
| would make Go faster?
| throwaway894345 wrote:
| The article articulates why it's reasonable to expect that
| generics _would_ make Go faster. From TFA:
|
| > Monomorphization is a total win for systems programming
| languages: it is, essentially, the only form of polymorphism
| that has zero runtime overhead, and often it has _negative_
| performance overhead. It makes generic code _faster_.
| __s wrote:
| What extra abstraction?
|
| I'd expect without monomorphization the code should perform the
| same as interface{} code, perhaps minus type cast error
| handling overhead. That's the model where generics are passing
| interface{} underneath, & exist only as a type check _(a la
| Java type erasure)_
| anonymoushn wrote:
| Yes? We used code generators to monomorphize our code in like
| 2015 and it was faster than using interfaces. Generics could
| reasonably produce the same code we did in 2015, but they
| don't.
| [deleted]
| morelisp wrote:
| > there's no incentive to convert a pure function that takes an
| interface to use Generics in 1.18.
|
| Good. I saw a lot of people suggesting in late 2021 that you
| could use generics as some kind of `#pragma force-
| devirtualization`, and that would be awful if it became common.
| mcronce wrote:
| Why would that be awful?
| ramesh31 wrote:
| Well sure. Not writing hand tuned assembly can make your code
| slower, too. Go's value as a language is how it fills the niche
| between Rust and Python, giving you low level things like manual
| memory control, while still making tradeoffs for performance and
| developer experience.
| mrweasel wrote:
| I might have worded it differently, but yeah, of course generics
| can make your code slower; what did people expect?
| zellyn wrote:
| I don't know about you, but when I imagine what compilers do
| with generic code, I typically imagine monomorphization,
| which (aside from increasing cache pressure a little), should
| generally not make things slower, but rather introduce
| possibilities for inlining that could make it faster.
| mrweasel wrote:
| Apparently I scrolled right past that bit of the article.
| I'm a little unsure how it's supposed to make the code
| faster, but maybe I'm making the wrong comparison. The
| alternative to generics is writing all the different
| functions by hand, in my mind at least. I don't fully
| understand how generics are supposed to be faster than
| a custom function for that datatype.
| wtetzner wrote:
| I think the reasoning is that for something that's
| commonly used for many different types, you won't go
| through the effort of re-implementing that function for
| each type (it may not even be feasible to do so). Which
| means you'll end up with some sort of indirection to make
| it generic.
| masklinn wrote:
| > I don't fully understand how generics are supposed to be
| faster than a custom function for that datatype.
|
| The point is that a monomorphized generic function should
| not be _slower_ than the custom function per datatype,
| but because Go's generics are not fully monomorphized
| they can be, and in fact can even be slower than a
| function taking an interface.
| mcronce wrote:
| The point is they shouldn't be _slower_ than a manually-
| copied implementation for that concrete type. They also
| should be _faster_ than vtable dynamic dispatch in the
| vast majority of cases. (I also fail to see a compelling
| reason that they couldn't have been implemented by
| passing the fat pointer directly, making the codegen the
| same as passing an interface, instead of having that
| business with the extra layer of indirection.)
|
| If there are specialization opportunities when hand-
| implementing the function for a given concrete type, I
| would indeed expect that to be faster than a
| monomorphized generic function.
| tsimionescu wrote:
| It depends what they are replacing. Typically, generics used
| to replace runtime polymorphism (using [T any] []T instead of
| []any) would be a speed boost in C#, C++, or Rust; and would
| have no impact on speed in Java.
| morelisp wrote:
| And it is also a speed boost in Go, assuming you don't call
| any methods. (Which, if you were really using [T any], you
| either weren't or you were dissembling about your
| acceptable types.)
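| A minimal illustration of that speed-boost case (hypothetical
| code, not from the article): the []any version boxes every
| element and type-asserts on the way back out, while the generic
| version works on an unboxed, contiguous []int:

```go
package main

import "fmt"

// Boxed version: storing ints in []any wraps each element in an
// interface value, and reading them back needs a runtime type
// assertion per element.
func sumBoxed(xs []any) int {
	total := 0
	for _, x := range xs {
		total += x.(int)
	}
	return total
}

// Generic version: []T instantiated as []int keeps the elements
// unboxed and contiguous, with nothing to assert at runtime.
func sumTyped[T int | int64](xs []T) T {
	var total T
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	fmt.Println(sumBoxed([]any{1, 2, 3})) // 6
	fmt.Println(sumTyped([]int{1, 2, 3})) // 6
}
```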
| whimsicalism wrote:
| > I might have worded it differently, but yeah, of course
| generics can make your code slower, what did people expect.
|
| ? In most languages, it is compile-time overhead, not
| runtime.
| masklinn wrote:
| I wouldn't say "most", it's very variable. Also not unlike
| Go I think C# uses a hybrid model, where all reference
| types get the same instances of the generic function.
| wtetzner wrote:
| From the article:
|
| > Monomorphization is a total win for systems programming
| languages: it is, essentially, the only form of polymorphism
| that has zero runtime overhead, and often it has negative
| performance overhead. It makes generic code faster.
|
| The point is that the way Go implements generics is in such a
| way that it can make your code slower, even though there is a
| well-known way that will not make your code slower (at the
| cost of compile times).
| ramesh31 wrote:
| >even though there is a well-known way that will not make
| your code slower (at the cost of compile times).
|
| That's the point though. The Golang team was surely aware
| of both approaches, and chose what they did as a conscious
| design decision to prefer faster compile times. People love
| Go because of the iteration speed compared to C++. And
| these little things start to add up if you don't have a
| clear product vision about what your language is meant for.
| marssaxman wrote:
| I would have expected generics to make the compiler take
| longer, not the compiled program.
| pphysch wrote:
| My first use of Go generics has been for a concurrent "ECS" game
| engine. In this case, the gains are pretty obvious. I think.
|
| I get to write one set of generic methods and data structures
| that operate over arbitrary "Component" structs, and I can
| allocate all my components of a particular type contiguously on
| the heap, then iterate over them with arbitrary, type-safe
| functions.
|
| I can't fathom that doing this via a Component interface would be
| even close to as fast, because it would destroy cache performance
| by introducing a bunch of interface tuples and pointer
| dereferencing for every single instance. Not to mention the type-
| unsafe code being yucky. Am I wrong?
|
| FWIW I was able to update 2,000,000 components per (1/60s) frame
| per thread in a simple Game of Life prototype, which I am quite
| happy with. But I never bothered to evaluate whether Interfaces
| would be as fast.
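|
| For what it's worth, here is a minimal sketch of the two layouts
| being compared (the names are hypothetical, not from this engine):
| a generic container keeps components of one concrete type
| contiguous in memory, while the interface version stores one
| (type, pointer) pair per element and chases a pointer on every
| access.

```go
package main

import "fmt"

// Generic container: all T values live contiguously in one slice,
// so iteration is cache-friendly and fully type-safe.
type Store[T any] struct {
	items []T
}

func (s *Store[T]) ForEach(f func(*T)) {
	for i := range s.items {
		f(&s.items[i])
	}
}

// Interface-based alternative: each element is an interface value,
// i.e. a (type, pointer) pair, so iterating dereferences a pointer
// per element and the components need not be contiguous.
type Component interface{ Update() }

type Position struct{ X, Y float64 }

func (p *Position) Update() { p.X++ }

func main() {
	s := Store[Position]{items: make([]Position, 3)}
	s.ForEach(func(p *Position) { p.Update() })
	fmt.Println(s.items[0].X) // 1

	boxed := []Component{&Position{}, &Position{}}
	for _, c := range boxed {
		c.Update()
	}
	fmt.Println(boxed[0].(*Position).X) // 1
}
```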
| nikki93 wrote:
| Sweet! I've been using it for the same. Example game project
| (did it for a game jam): https://github.com/nikki93/raylib-5k
| -- in this case the Go gets transpiled to C++ and runs as
| WebAssembly too. Readme includes a link to play the game in the
| browser. game.gx.go and behaviors.gx.go kind of show the ECS
| style.
|
| It's worked with 60fps performance on a naive N^2 collision
| algo over about 4200 entities -- but also I tend to use a
| broadphase for collisions in actual games (there's an "externs"
| system to call to other C/C++ and I use Chipmunk's broadphase).
| folago wrote:
| Sounds interesting, is it available somewhere?
| pphysch wrote:
| Still want to hit some milestones before releasing anything,
| so not quite
| siftrics wrote:
| Assuming your generic functions take _pointers_ to Components
| as input, full monomorphization does not occur, and you're
| taking a performance hit similar in magnitude to (if not
| empirically greater than) interface "dereferences".
|
| On this basis, I don't believe your generic implementation is
| as much faster than an interface implementation as you claim.
| pphysch wrote:
| You're right, here's what one of my hot loops looks like:
|
|     func (cc *ComponentContainer[T]) ForEach(f func(*Component[T])) {
|         for _, page := range cc.pool.pages {
|             for i := range page {
|                 if page[i].IsActive() {
|                     f(&page[i])
|                 }
|             }
|         }
|     }
|
| Still, the interface approach is a total nightmare from a
| readability + runtime error perspective so I won't be going
| back & will just hope for some performance freebies in 1.19
| or later :^)
| dmullis wrote:
| lokar wrote:
| It seems like the code size vs speed trade-off would be well
| managed by FDO.
| komuW wrote:
| Go does have some form of monomorphization implemented in Go 1.18;
| it is just behind a feature flag (a compiler flag).
|
| Look at the assembly difference between these two examples:
|
| 1. https://godbolt.org/z/7r84jd7Ya (without monomorphization)
|
| 2. https://godbolt.org/z/5Ecr133dz (with monomorphization)
|
| If you don't want to use godbolt, run the command `go tool
| compile '-d=unified=1' -p . -S main.go`
|
| I guess that the flag is not documented because the Go team has
| not committed to a particular implementation yet.
| masklinn wrote:
| FWIW you can have two different compilers (and outputs) for the
| same input: https://godbolt.org/z/bb1oG9TbP in the compiler
| pane just click "add new" and "clone compiler" (you can
| actually drag that button to immediately open the pane in the
| right layout instead of having your layout move to vertical
| thirds and needing to move the pane afterwards).
|
| Learned that watching one of Matt's cppcon talks (A+, would do
| again), as you can expect this is useful to compare different
| versions of a compiler, or different compilers entirely, or
| different optimisation settings.
|
| But wait, there's more! Using the top left Add dropdown, you
| can get _a diff view between compilation outputs_ :
| https://godbolt.org/z/s3WxhEsKE (I maximised it because a diff
| view getting only a third of the layout is a bit narrow).
| komuW wrote:
| thanks!
| klodolph wrote:
| I'm excited about generics that give you a tradeoff between
| monomorphization and "everything is a pointer". The "everything
| is a pointer" approach, as in Haskell, is incredibly inefficient
| wrt execution time and memory usage, while the "monomorphize
| everything" approach can explode your code size surprisingly
| fast.
|
| I wouldn't be surprised if we get some control over
| monomorphization down the line, but if Go started with the
| monomorphization approach, it would be impossible to back out of
| it because it would cause performance regressions. Starting with
| the shape stenciling approach means that introducing
| monomorphization later can give you a performance improvement.
|
| I'm not trying to predict whether we'll get monomorphization at
| some future point in Go, but I'm just saying that at least the
| door is open.
| tines wrote:
| Haskell does monomorphization as well. See
| https://reasonablypolymorphic.com/blog/specialization/
| SomeCallMeTim wrote:
| > "monomorphize everything" approach can explode your code size
| surprisingly fast.
|
| It can in the naive implementation. Early C++ was famous for
| code bloat and (apparently) hasn't shaken that outdated
| impression.
|
| In practice, monomorphization of templates hasn't been a
| serious issue in C++ for a long time. The compiler and linker
| technologies have advanced significantly.
| asdfasgasdgasdg wrote:
| > Early C++ was famous for code bloat and (apparently) hasn't
| shaken that outdated impression.
|
| It's not an outdated impression. C++ generics can and do
| interact very poorly with inlining and other language
| features to cause extremely large binary sizes, especially if
| you do anything complex inside them. They also harm
| compilation performance since each copy of the generic code
| needs to be optimized.
|
| Generics in C++ are reasonably efficient when there is
| relatively little code generated per generic, but when this
| is not true, they can be a problem.
| titzer wrote:
| > The compiler and linker technologies have advanced
| significantly.
|
| AFAICT the linker de-duplicates identical pieces of machine
| code. You still can get multi-megabyte object files for every
| source file. I used to work on V8. Almost every .o is 3+MB.
| Times hundreds, plus bigger ones, it's more than a gigabyte
| of object files for a single build. That's absurd. Not V8's
| fault--stupid C++ compilation and linking model.
| zozbot234 wrote:
| Yes, they seem to have shipped a MVP first, which is a sensible
| approach. Controlling the extent of monomorphization requires
| changes in how the code is written, so if they had offered that
| exclusively it would've been a pitfall to existing users. By
| boxing everything, they keep their MVP closer to the previously
| idiomatic interface{} pattern.
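|
| A small sketch of that continuity (with hypothetical helper
| names): the generic version replaces the old interface{} idiom at
| the call site and adds static type checking, while for pointer and
| interface arguments the 1.18 runtime representation (one shared
| "shape" instantiation plus a dictionary) stays close to what the
| interface{} version already did.

```go
package main

import "fmt"

// Pre-1.18 idiom: interface{} plus type assertions; mistakes
// surface only at runtime.
func firstIface(xs []interface{}) interface{} {
	return xs[0]
}

// Generic version: the compiler checks the element type statically.
// For pointer-shaped type arguments, Go 1.18 shares one compiled
// "shape" instantiation plus a hidden dictionary rather than
// stamping out a copy per type.
func first[T any](xs []T) T {
	return xs[0]
}

func main() {
	a := firstIface([]interface{}{"x", "y"})
	s, ok := a.(string) // runtime assertion required
	fmt.Println(s, ok)

	b := first([]string{"x", "y"}) // statically a string
	fmt.Println(b)
}
```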
| rowanG077 wrote:
| Why is a speed part of the Go language contract but footprint
| of the executable is not? I, for one, would be quite miffed if
| an update of the Go compiler would mean an application would no
| longer fit on my mcu. That is worse than the application
| running slower.
| gbear605 wrote:
| Ideally it could be a compiler flag. Even more ideally, you
| could tell your compiler what the max size you want is and
| then it would optimize for the best speed at a given
| executable footprint.
| masklinn wrote:
| > Why is a speed part of the Go language contract but
| footprint of the executable is not?
|
| Because footprint of the executable pretty literally never has
| been: Go has always had deficient DCE and generated huge
| executables.
| rowanG077 wrote:
| It also generates pretty slow executables. That doesn't
| invalidate the point.
| qznc wrote:
| A hybrid approach would be monomorphization for native types
| like int and pointers for records. C# is doing that if I
| remember correctly.
| Ar-Curunir wrote:
| IMO that's a bad trade-off for many performance-sensitive
| applications, since it means that you can't rely on newtypes
| and structs for correctness.
| dse1982 wrote:
| This is a very interesting article. I was, however, a bit
| confused by the lingo calling everything generics. As I
| understood it, the main point of the article quite precisely
| matched the distinction between generics and templates as I
| learned it. Therefore what surprised me most was the fact that
| Go monomorphizes generic code sometimes. That, however, makes
| sense given the way Go's module system works - i.e. imported
| modules are included in the compilation - but it doesn't fit my
| general understanding of generics.
___________________________________________________________________
(page generated 2022-03-30 23:00 UTC)