[HN Gopher] Too Much Go Misdirection
___________________________________________________________________
Too Much Go Misdirection
Author : todsacerdoti
Score : 133 points
Date : 2025-05-19 15:40 UTC (7 hours ago)
(HTM) web link (flak.tedunangst.com)
(TXT) w3m dump (flak.tedunangst.com)
| jchw wrote:
| The biggest issue here IMO is the interaction between two things:
|
| - "Upcasting" either to a concrete type or to an interface that
| implements a specific additional function; e.g. in this case
| Bytes() would probably be useful
|
| - Wrapper types, like bufio.Reader, that wrap an underlying type.
|
| In isolation, either practice works great and I think they're
| nice ideas. However, over and over, they're proving to work
| together poorly. A wrapper type can't easily forward the type it
| is wrapping for the sake of accessing upcasts, and even if it
| did, depending on the type of wrapper it might be bad to expose
| the underlying type, so it has to be done carefully.
|
| So instead this winds up needing to be handled basically for each
| type hierarchy that needs it, leading to awkward constructions
| like the Unwrap function for error types (which is very effective
| but weirder than it sounds, especially because there are two
| Unwraps) and the ResponseController for ResponseWriter wrappers.
|
| Seems like the language or standard library needs a way to
| express this situation so that a wrapper can choose to be opaque
| or transparent and there can be an idiomatic way of exposing
| this.
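The interaction described above can be sketched in a few lines of Go. This is only an illustration: `byteser` is a hypothetical optional interface (it is not part of the standard library), and the helper names are made up. It shows a type assertion succeeding on a `*bytes.Buffer` and failing once the same buffer sits behind a `bufio.Reader`.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// byteser is a hypothetical optional interface: readers that can hand out
// their unread contents as a slice without copying.
type byteser interface {
	Bytes() []byte
}

// hasBytes reports whether r exposes a Bytes method, using a type assertion.
func hasBytes(r io.Reader) bool {
	_, ok := r.(byteser)
	return ok
}

// directHasBytes checks a *bytes.Buffer passed as-is.
func directHasBytes() bool {
	return hasBytes(bytes.NewBufferString("hello"))
}

// wrappedHasBytes checks the same kind of buffer hidden behind a bufio.Reader.
// The wrapper has no Bytes method and no way to forward the assertion.
func wrappedHasBytes() bool {
	return hasBytes(bufio.NewReader(bytes.NewBufferString("hello")))
}

func main() {
	fmt.Println("direct: ", directHasBytes())
	fmt.Println("wrapped:", wrappedHasBytes())
}
```

The wrapper is correct on its own terms; it simply has no idiomatic way to say "also look at what I wrap".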
| movpasd wrote:
| I'm not sure I fully understand the issue as I don't know Go,
| but is this something that a language-level delegation feature
| could help with?
| hello_computer wrote:
| It is the same struggle you can find in any language with
| private/public props. The stream he wants to read from is
| actually just a buffer that has been wrapped as a stream, and
| he's having a hard time directly accessing the buffer through
| its wrapper. He could stream it into a new temporary buffer,
| but he's trying to avoid that since it's wasteful. I've had
| the same problem in C++.
| XorNot wrote:
| But the other side of this is that there's a contract
| violation going on: []byte can be mutated, but io.Reader
| cannot.
|
| When I pass io.Reader I don't expect anything underneath it
| to be changed except the position. When I pass []byte it
| might be mutated.
|
| So really solving this requires a whole new type -
| []constbyte or something (in general Go really needs
| stronger immutability guarantees - I've taken to putting
| Copy methods on all my structs just so I can get guaranteed
| independent copies, but I have to do it manually).
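A minimal sketch of the manual Copy-method pattern mentioned above. The `Config` type and its fields are hypothetical; the point is only that plain assignment copies a slice header (sharing the backing array), while a hand-written Copy duplicates the data.

```go
package main

import "fmt"

// Config is a hypothetical struct that holds a slice.
type Config struct {
	Name string
	Tags []string
}

// Copy returns an independent copy: the Tags slice gets its own backing array.
func (c Config) Copy() Config {
	out := c
	out.Tags = append([]string(nil), c.Tags...)
	return out
}

// afterAssign mutates a plain struct copy and returns what the original sees.
func afterAssign() string {
	a := Config{Name: "a", Tags: []string{"x"}}
	b := a // copies the slice header only; both share one backing array
	b.Tags[0] = "changed"
	return a.Tags[0]
}

// afterCopy mutates a Copy() result and returns what the original sees.
func afterCopy() string {
	a := Config{Name: "a", Tags: []string{"x"}}
	b := a.Copy()
	b.Tags[0] = "changed"
	return a.Tags[0]
}

func main() {
	fmt.Println("after assignment:", afterAssign())
	fmt.Println("after Copy():    ", afterCopy())
}
```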
| hello_computer wrote:
| there is also the specific vs general trade-off: the
| general (io.Reader) being more flexible, while providing
| fewer opportunities for optimization. vice versa with the
| specific--be it []byte, or even []constbyte. i think it
| is just an inherent struggle with all abstractions.
| msteffen wrote:
| > The bytes.Reader should really implement Peek. I'm pretty sure
| the reason it doesn't is because this is the only way of creating
| read only views of slices. And a naughty user could peek at the
| bytes and then modify them. Sigh. People hate const poisoning,
| but I hate this more.
|
| When I was at Google, a team adjacent to ours was onboarding a new
| client with performance demands that they could not realistically
| meet with anything resembling their current hardware footprint.
| Their service was a stateless Java service, so they elected to
| rewrite in C++. Now, Java has some overhead because of garbage
| collection and the JVM, and they hoped that this might move the
| needle, but what happened was they went from 300qps/core to 1200,
| with lower tail latency. Literally 3x improvement.
|
| Why? Probably a lot of reasons, but the general consensus was:
| Java has no const, so many of Google's internal libraries make
| defensive copies in many places, to guarantee immutability (which
| is valuable in a highly concurrent service, which everything
| there is). This generates a huge amount of garbage that, in
| theory, is short-lived, rarely escapes its GC generation, and can
| all be cleaned up after the request is finished. But their
| experience was that it's just much faster to not copy and delete
| things all over the place. Which you can often avoid by using
| const effectively. I came to believe that this was Java's biggest
| performance bottleneck, and when I saw that Go had GC with no
| const, I figured it would have the exact same problem.
| hinkley wrote:
| Java has a little const. Strings are immutable. You can make
| objects with no mutations, so you can make read only
| collections fairly easily which is usually where const becomes
| a problem.
|
| But then you have for instance Elixir, where all functions are
| pure, so mutating inputs to outputs takes a ton of copying, and
| any data structure that is not a DAG is a gigantic pain in the
| ass to modify. I lost count of how many tries it took me to
| implement Norvig's sudoku solver. I kept having to go back and
| redesign my data structures every time I added more of the
| logic.
|
| [edit to add]: DTOs exist in Java because some jackass used the
| term "Value Object" to include mutation despite large swaths of
| the CS world considering VOs to be intrinsically const. So then
| they had to make up a new term that meant Value Object without
| using the concept they already broke.
| throwawaymaths wrote:
| There are so many purity escape hatches in elixir!!
| hinkley wrote:
| The only ones I know about are the ones in functions and
| closures, where SSA basically just creates a new variable
| that hides the original (until the end of the block at
| which point you discover the reassignment didn't stick).
| What did you have in mind?
| throwawaymaths wrote:
| ets tables are the go-to for data structures (for example
| the :digraph module that ships with elixir is built on an
| ets table, presumably because A* needs a mutable
| datatype)
|
| Process dictionary is also an option.
|
| Can always use a genserver as a data store (but only do
| that if lifetime shenanigans make it make sense)
|
| Postgres is also an option.
| hnlmorg wrote:
| Are you able to explain the problem a little more because
| "const" does exist as a keyword, so I assume it's doing
| something different to what you're referring to with regards to
| C++ constants. Is Go not substituting constants like a macro?
| Or are we discussing something entirely different and I'm
| misunderstanding the context here?
| kentm wrote:
| I'm assuming they mean const function/method parameters.
| In C++, being able to mark inputs to functions as const
| guarantees that they aren't mutated, which often means you can
| just pass in the value by reference safely.
| jerf wrote:
| Java, Go, and C++ all have enough differences here that at
| this level of detail you shouldn't assume any other
| conversion will have exactly the same result that msteffen
| lays out. Java generally has more sophisticated compilation
| and a lot more engineering effort poured into it, but Go
| often ends up with less indirection in the first place and
| started life with what Java calls records [1] so they are
| more pervasive throughout the ecosystem. Which effect "wins"
| can be difficult to guess in advance without either a deep
| analysis of the code, or just trying it.
|
| What msteffen talks about is a general principle that you can
| expect even small differences between languages to sometimes
| have significant impact on code.
|
| I think this is also one of the reasons Rust libraries tend
| to come out so fast. They're very good at not copying things,
| but doing it safely without having to make "just in case"
| copies. It's hard to ever see a benchmark in which this
| particular effect makes Rust come out faster than any other
| language, because in the natural process of optimizing any
| particular benchmark for a non-Rust language, the benchmark
| will naturally not involve taking random copies "just in
| case", but this can have a significant impact on all that
| real code out in the real world not written for benchmarking.
| Casually written Rust can safely not make lots of copies,
| casually written code in almost anything else will probably
| have a lot more copying than the programmer realizes.
|
| [1]: https://blogs.oracle.com/javamagazine/post/records-come-
| to-j...
| hnlmorg wrote:
| Ahhh right that makes a lot more sense.
|
| Thanks for the explanation
| masklinn wrote:
| They mean const in the sense of readonly guarantees.
|
| In java types are generally shared and mutable so let's say
| you want a list input, you generally don't store it as is
| because the caller could modify it at any point, so if you
| accept a `List`, you defensively copy it into an inner type
| for safety, which has a cost (even more so if you also need
| to defensively copy the list contents).
|
| And likewise on output otherwise the caller could downcast
| and modify (in that specific case you could wrap it in an
| unmodifiableList, but not all types have an unmodifiable view
| available).
| cogman10 wrote:
| > so many of Google's internal libraries make defensive copies
| in many places,
|
| This, IMO, is a sign of poor design.
|
| What are you trying to protect? That the google library isn't
| modifying something or that the caller of the google library
| isn't concurrently modifying something?
|
| Or are you storing off the value for later use?
|
| In any case, it's acceptable in the Javadoc and API to specify
| "If you give me this, you cannot further modify it". This
| already happens and is expected in common JDK data-structures.
| For example, if you put an element into a HashSet and then
| change the hash, you won't be able to find it again in the
| HashSet. Nobody complains that's the case because it's a "Well
| duh, you shouldn't have done that". Similarly, if you mutate a
| map while accessing it you'll get a
| "ConcurrentModificationException" or even bad results. Again,
| completely expected behavior.
|
| If you are worried about your code doing the wrong thing with
| something, then one defense that is easy to deploy is wrapping
| that object with one that is unmodifiable. That's why the JDK
| has the likes of `Collections.unmodifiableSet`. That doesn't do
| a defensive copy and is just a quick wrapper on the incoming
| set.
|
| Defensive programming has its place. However, I think it gets
| over-deployed.
| 90s_dev wrote:
| > Now, why doesn't bytes.Reader implement Peek? It's just a byte
| slice, it's definitely possible to peek ahead without altering
| stream state. But it was overlooked, and instead this workaround
| is applied.
|
| When I first looked at Go, it seemed to have far too many layers
| of abstraction on top of one another. Which is so ironic,
| considering that's one of the main things it was trying to fix
| about Java. It ended up becoming the thing it fought against.
| treyd wrote:
| I would agree with you but not so much here specifically. It's
| much more true with how goroutines and channels work, in that
| they're too unstructured and don't compose well, which
| necessitates needing to make ad-hoc abstractions around them.
| swisniewski wrote:
| There's a much simpler way to do this:
|
| If you want your library to operate on bytes, then rather than
| taking in an io.Reader and trying to figure out how to get bytes
| out of it the most efficient way, why not just have the library
| take in []byte rather than io.Reader?
|
| If someone has a complex reader and needs to extract to a
| temporary buffer, they can do that. But if like in the author's
| case you already have []byte, then just pass that in rather than
| trying to wrap it.
|
| I think the issue here is that the author is adding more
| complexity to the interface than needed.
|
| If you need a []byte, take in a []byte. Your callers should be
| able to figure out how to get you that when they need to.
|
| With go, the answer is usually "just do the simple thing and you
| will have a good time".
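One way the suggestion above could look, as a sketch only: a hypothetical library whose core works on `[]byte`, plus a thin `io.Reader` entry point that buffers with `io.ReadAll`. The function names are invented for illustration and the "work" is just counting newlines.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// CountLinesBytes is the hypothetical core: it needs all the bytes anyway,
// so it takes a slice and never copies.
func CountLinesBytes(data []byte) int {
	return bytes.Count(data, []byte{'\n'})
}

// CountLines is the convenience entry point for callers that only have a
// stream. It pays for one full read into memory, then reuses the core.
func CountLines(r io.Reader) (int, error) {
	data, err := io.ReadAll(r)
	if err != nil {
		return 0, err
	}
	return CountLinesBytes(data), nil
}

// countViaReader exercises the reader path with an in-memory string.
func countViaReader(s string) int {
	n, err := CountLines(strings.NewReader(s))
	if err != nil {
		return -1
	}
	return n
}

// countViaBytes exercises the slice path directly.
func countViaBytes(s string) int {
	return CountLinesBytes([]byte(s))
}

func main() {
	fmt.Println(countViaBytes("a\nb\nc\n"), countViaReader("a\nb\nc\n"))
}
```

This keeps the copy decision with the caller, at the cost of a second function in the API.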
| TheDong wrote:
| The author is trying to integrate with the Go stdlib, which
| requires you produce images from an 'io.Reader'. See
| https://pkg.go.dev/image#RegisterFormat
|
| Isn't using the stdlib simpler than not for your callers?
|
| I also often hear gophers say to take inspiration from the go
| stdlib. The 'net/http' package's 'http.Request.Body' also has
| this same UX. Should there be `Body` and `BodyBytes` for the
| case when your http request wants to refer to a reader, vs
| wants to refer to bytes you already have?
| jchw wrote:
| The BodyBytes hypothetical isn't particularly convincing
| because you usually don't actually have the bytes before
| reading them, they're queued up on a socket.
|
| In most cases I'd argue it really is idiomatic Go to offer a
| []byte API if that can be done more efficiently. The Go
| stdlib does sometimes offer both a []byte and a Reader API, as
| it does for input to encoding/json, for example. Internally, I don't
| think it actually streams incrementally.
|
| That said I do see why this doesn't actually apply here. IMO
| the big problem here is that you can't just rip out the
| Bytes() method with an upcast and use that due to the wrapper
| in the way. If Go had some way to do transparent wrapper
| types, this would possibly not be an issue. Maybe it should
| have some way to do that.
| TheDong wrote:
| > The BodyBytes hypothetical isn't particularly convincing
| because you usually don't actually have the bytes before
| reading them, they're queued up on a socket.
|
| Ah, sorry, we were talking about two different
| 'http.Request.Body's. For some weird reason both the
| `http.Client.Do`'s request and `http.Server`'s request are
| the same type.
|
| You're right that you usually don't have the bytes for the
| server, but for the client, like a huge fraction of client
| requests are `http.NewRequestWithContext(context.TODO(),
| "POST", "api.foo.com", bytes.NewReader(jsonBytesForAPI))`.
| You clearly have the bytes in that case.
|
| Anyway, another example of the wisdom of the stdlib, you
| can save on structs by re-using one struct, and then having
| a bunch of comments like "For server requests, this field
| means X, for client requests, this is ignored or means Y".
| jchw wrote:
| Thinking about that more though, http.Client.Do is going
| to take that io.Reader and pipe it out to a socket. What
| would it do differently if you handed it a []byte? I
| suppose you could reduce some copying. Maybe worth it but
| I think Go already has other ways to avoid unnecessary
| copies when piping readers and writers together (e.g.
| using `WriterTo` instead of doing Read+Write.)
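The `WriterTo` shortcut mentioned above can be observed with a small experiment. This is a sketch with a hypothetical `tracked` type that counts which of its methods `io.Copy` calls; `io.Copy` is documented to use `src.WriteTo(dst)` when the source implements `io.WriterTo`.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// tracked is a hypothetical in-memory source that records which of its
// methods get called.
type tracked struct {
	data         []byte
	readCalls    int
	writeToCalls int
}

// Read is the generic path: it copies into the caller's buffer.
func (t *tracked) Read(p []byte) (int, error) {
	t.readCalls++
	if len(t.data) == 0 {
		return 0, io.EOF
	}
	n := copy(p, t.data)
	t.data = t.data[n:]
	return n, nil
}

// WriteTo is the shortcut: it hands the slice to the writer directly,
// with no intermediate buffer.
func (t *tracked) WriteTo(w io.Writer) (int64, error) {
	t.writeToCalls++
	n, err := w.Write(t.data)
	t.data = t.data[n:]
	return int64(n), err
}

// copyStats runs io.Copy and reports how the source was consumed.
func copyStats(s string) (reads, writeTos int, out string) {
	src := &tracked{data: []byte(s)}
	var dst bytes.Buffer
	if _, err := io.Copy(&dst, src); err != nil {
		return src.readCalls, src.writeToCalls, ""
	}
	return src.readCalls, src.writeToCalls, dst.String()
}

func main() {
	reads, writeTos, out := copyStats("payload")
	fmt.Printf("Read calls: %d, WriteTo calls: %d, copied: %q\n", reads, writeTos, out)
}
```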
| tptacek wrote:
| It is, but one of the virtues of the Go ecosystem is that
| it's also often very easy to fork the standard library;
| people do it with the whole TLS stack all the time.
|
| The tension Ted is raising at the end of the article ---
| either this is an illustration of how useful casting is, or a
| showcase of design slipups in the standard library --- well,
| :why-not-both:. Go is very careful about both the stability
| of its standard library and the coherency of its interfaces
| (no popen, popen2, subprocess). Something has to be traded
| off to get that; this is one of the things. OK!
| ronsor wrote:
| > people do it with the whole TLS stack all the time.
|
| It's the only way to add custom TLS extensions.
| throwaway894345 wrote:
| How does using the stdlib internally simplify things for
| callers? And what does that have to do with taking
| inspiration from the stdlib?
|
| On the second point, passing a []byte to something that
| really does not want a streaming interface is perfectly
| idiomatic per the stdlib.
|
| I don't think it complicates things for the caller if the
| author used a third party decoding function unless it
| produced a different type besides image.Image (and even then
| only a very minor inconvenience).
|
| I also don't think it's the fault of the stdlib that it
| doesn't provide high performance implementations of every
| function with every conceivable interface.
|
| I do think there's some reasonable critique to be made about
| the stdlib's use of reflection to detect unofficial
| interfaces, but it's also a perfectly pragmatic solution for
| maintaining compatibility while also not having the perfect
| future knowledge to build the best possible interface from
| day 0. :shrug:
| int_19h wrote:
| Because it _forces_ the reader to read data into a temporary
| buffer in its entirety. If the thing this function is trying to
| do doesn't actually require it to do its job, that introduces
| unnecessary overhead.
| mbrumlow wrote:
| What? Where else would it be?
|
| It's either in the socket(and likely not fully arrived) or
| ... in a buffer.
|
| Peek is not some magic; it is, well, a temporary buffer.
|
| Beyond that, I keep seeing people ask for a byte interface.
| Has anybody looked at the io.Reader interface ???
|
| type Reader interface { Read(p []byte) (n int, err error) }
|
| You can read as little or as much as you would like and you
| can do this at any stage of a chain of readers.
| nemothekid wrote:
| You are still doing a copy, and people want to avoid the
| needless memory copy.
|
| If you are decoding a 4 megabyte jpeg, and that jpeg
| already exists in memory, then copying that buffer by using
| the Reader interface is painful overhead.
| dgb23 wrote:
| Personally I rarely use or even implement interfaces except
| some other part needs them. My brain thinks in terms of plain
| data by default.
|
| I appreciate how they compose, for example when I call io.Copy
| and how things are handled for me. But when I structure my code
| that way, it's extra effort that doesn't come naturally at all.
| vjerancrnjak wrote:
| That's how leaky abstraction of many file std implementations
| starts.
|
| Reading into a byte buffer, pass in a buffer to read values,
| pass in a buffer to write values. Then OS does the same thing,
| has its own buffer that accepts your buffer, then the
| underlying storage volume has its own buffer.
|
| Buffers all the way down to inefficiency.
| woah wrote:
| Seems pretty crazy to force a bunch of data to be saved into
| memory all the time just for programming language aesthetic
| reasons
| silverwind wrote:
| A good API should just accept either, e.g. the union of []byte
| and io.Reader.
|
| Both have pros and cons and those should be for the user to
| decide.
| thayne wrote:
| Ah, but go doesn't have union types.
| Zambyte wrote:
| One option would be to accept an interface{} and then
| switch on the type.
| stouset wrote:
| It's frightening how quickly the answer in golang becomes
| "downcast to interface{} and force type problems to
| happen at runtime".
| throwaway894345 wrote:
| You don't need to downcast to interface, io.Reader is
| already an interface, and a type assertion on an
| interface ("if this io.Reader is just a byteslice and
| cursor, then use the byteslice") is strictly safer than
| an untagged union and equally safe with a tagged union.
|
| I wish Go had Rust-like enums as well, but they don't
| make anything safer in this case (except for the nil
| interface concern which isn't the point you're raising).
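A sketch of the type assertion described above, with a fast path and a generic fallback. The function name `contents` is hypothetical. It uses `*bytes.Buffer` because that type has a `Bytes` method; `*bytes.Reader` does not, which is the gap the article complains about.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// contents returns everything left in r. When r is a *bytes.Buffer it
// returns the buffer's own slice (no copy, and the result aliases the
// buffer's memory); otherwise it falls back to io.ReadAll.
func contents(r io.Reader) (data []byte, fast bool, err error) {
	if b, ok := r.(*bytes.Buffer); ok {
		return b.Bytes(), true, nil
	}
	data, err = io.ReadAll(r)
	return data, false, err
}

// fastPathForBuffer reports whether a *bytes.Buffer takes the fast path.
func fastPathForBuffer() bool {
	_, fast, _ := contents(bytes.NewBufferString("abc"))
	return fast
}

// fastPathForStringsReader reports whether a *strings.Reader does.
func fastPathForStringsReader() bool {
	_, fast, _ := contents(strings.NewReader("abc"))
	return fast
}

func main() {
	fmt.Println("bytes.Buffer fast path:  ", fastPathForBuffer())
	fmt.Println("strings.Reader fast path:", fastPathForStringsReader())
}
```

The assertion itself is safe at runtime; the open question in the thread is that wrappers silently push callers onto the slow path.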
| throwaway894345 wrote:
| An io.Reader is already an interface, so you can already
| switch on its type.
| Zambyte wrote:
| My comment is explaining how
|
| > A good API should just accept either, e.g. the union of
| []byte and io.Reader.
|
| could be done. Can you elaborate on how the fact that
| io.Reader is an interface lets you accept a []byte in the
| API? To my knowledge, the answer is: you can't. You have
| to wrap the []byte in an io.Reader, and you are at the
| exact problem described in the article.
| throwaway894345 wrote:
| Nah, a good API doesn't push the conditionals down. You don't
| need to pass a union to let the user decide, you just need to
| present an API for each (including a generic implementation
| that monomorphizes into multiple concrete implementations)
| https://matklad.github.io/2023/11/15/push-ifs-up-and-fors-
| do...
| liampulles wrote:
| The reason interface smuggling exists as a pattern in the Go
| standard library and others is because the Go team (and those who
| agree with its philosophy) take breaking API changes really
| seriously.
|
| It is no small feat that Go is still on major version 1.
| treyd wrote:
| Wouldn't you say that it's a design oversight that the
| interface system leads to tight constraints on what you can do
| without breaking APIs?
| liampulles wrote:
| I can't call it a design oversight no, because I'm not sure
| what reasonable alternatives were considered before Go v1 was
| released. I also don't have context of all the decision
| factors that went into Go's spec. To be honest, I'm not
| anywhere near an expert on programming language design - I'm
| probably the wrong person to ask.
|
| I am thankful that they haven't broken the spec to change
| that design, but maybe others don't care about that as much
| as I do.
| hkpack wrote:
| It seems that go library is ok with you paying the performance
| price when using io.Reader/io.Writer on memory structures.
|
| You can write clean idiomatic code, but it won't be the fastest.
| So for maximum results you should always do everything manually
| for your use case: i.e. don't use additional readers/writers and
| operate on []byte directly if that is what you are working with.
|
| I think it is mostly a good thing - you can quickly write simple
| but slower code and refactor everything later when needed.
| millipede wrote:
| Type inspection is the flaw of Go's interface system. Try to make
| a type that delegates to another object, and the type inspection
| breaks. It's especially noticeable with the net/http types, which
| would be great to intercept, but then breaks things like Flusher
| or Hijacker.
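A small sketch of the net/http case mentioned above, assuming Go 1.20 or later for `http.NewResponseController`. The `statusRecorder` wrapper is hypothetical. Embedding `http.ResponseWriter` promotes only that interface's methods, so the `http.Flusher` assertion fails on the wrapper; adding an `Unwrap` method lets `ResponseController` find the original writer.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusRecorder is a hypothetical logging wrapper around a ResponseWriter.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

// WriteHeader records the status code before delegating.
func (s *statusRecorder) WriteHeader(code int) {
	s.status = code
	s.ResponseWriter.WriteHeader(code)
}

// Unwrap exposes the wrapped writer so http.ResponseController can reach it.
func (s *statusRecorder) Unwrap() http.ResponseWriter { return s.ResponseWriter }

// canFlush is the classic type inspection for the optional Flusher interface.
func canFlush(w http.ResponseWriter) bool {
	_, ok := w.(http.Flusher)
	return ok
}

func directCanFlush() bool { return canFlush(httptest.NewRecorder()) }

func wrappedCanFlush() bool {
	return canFlush(&statusRecorder{ResponseWriter: httptest.NewRecorder()})
}

// controllerCanFlush goes through ResponseController, which follows Unwrap.
func controllerCanFlush() bool {
	w := &statusRecorder{ResponseWriter: httptest.NewRecorder()}
	return http.NewResponseController(w).Flush() == nil
}

func main() {
	fmt.Println("direct assertion:  ", directCanFlush())
	fmt.Println("wrapped assertion: ", wrappedCanFlush())
	fmt.Println("via controller:    ", controllerCanFlush())
}
```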
| 38 wrote:
| > What I would like is for my image decoding function to notice
| that the io.Reader it has been given is in fact a bytes.Reader so
| we can skip the copy.
|
| What a terrible idea. If you want bytes.reader, then use that in
| the function signature, or better yet just a byte slice. It
| should have been a red flag when your solution involves the
| unsafe package
| Groxx wrote:
| I think you kinda missed the point. The point is that trying to
| make a user-friendly API with the familiar, highly composable,
| and extremely common io.Reader that everything (including Go's
| stdlib) encourages you to use ends up putting you in this
| unfortunate design corner if you also care about performance.
|
| It's frustration about getting _close_ to a good API, but not
| having any reasonable way to close the final gap, forcing you
| to do stuff like you mentioned: have multiple near-
| identical APIs for performance, and needing your users to
| understand and use them correctly to get a good result.
| XorNot wrote:
| The benefit seems fictional though. When is the user going to
| have all the bytes in memory but only have an io.Reader?
| Probably never. If I have a reader it's because the bytes are
| coming from something which itself does not make that
| promise. Like the most common application would be an
| os.File.
|
| If I _do_ have all the bytes in memory, then I have a []byte
| array, know I have it, and can use the []byte interface you
| must've implemented internally to use this speed up.
| bobbylarrybobby wrote:
| I am not a gopher, so this may be a dumb question: when an
| io.Reader produces a buffer of its contents, does it not have the
| option of just returning the buffer it wraps if it does in fact
| wrap a buffer? Something like (pseudocode) `if self isa
| BufferedReader { self.takeBuffer() } else { let buffer =
| newBuffer(); self.fill(buffer); buffer }`.
| masklinn wrote:
| 1. Like read(2), io.Reader.Read _takes_ a slice parameter to
| which it writes, it doesn't return one. This is better for a
| "legitimate" read stream as the caller can amortise
| allocations.
|
| 2. And Read takes the slice by value, so the length, capacity,
| and buffer pointer are copied into the callee. This gives no
| way of "swapping buffers", even ignoring that the caller may
| have issues getting back a slice with a completely different
| size and capacity than they sent / expected.
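The second point above can be demonstrated with a deliberately broken reader. Both types here are hypothetical: `swapReader` tries to "hand over" its internal slice by reassigning the parameter, which only changes its local copy of the slice header, while `copyReader` does what the interface actually requires.

```go
package main

import (
	"fmt"
	"io"
)

// swapReader is intentionally wrong: it tries to swap buffers instead of
// copying, and so violates the io.Reader contract.
type swapReader struct{ internal []byte }

func (s *swapReader) Read(p []byte) (int, error) {
	p = s.internal // rebinds the callee's copy of the header; caller sees nothing
	return len(p), io.EOF
}

// copyReader follows the contract: it writes into the caller's memory.
type copyReader struct{ internal []byte }

func (c *copyReader) Read(p []byte) (int, error) {
	if len(c.internal) == 0 {
		return 0, io.EOF
	}
	n := copy(p, c.internal)
	c.internal = c.internal[n:]
	return n, nil
}

// afterSwap returns the caller's buffer after one Read on swapReader.
func afterSwap() string {
	buf := make([]byte, 5)
	r := &swapReader{internal: []byte("hello")}
	_, _ = r.Read(buf)
	return string(buf)
}

// afterCopy returns the caller's buffer after one Read on copyReader.
func afterCopy() string {
	buf := make([]byte, 5)
	r := &copyReader{internal: []byte("hello")}
	_, _ = r.Read(buf)
	return string(buf)
}

func main() {
	fmt.Printf("swap: %q\n", afterSwap())
	fmt.Printf("copy: %q\n", afterCopy())
}
```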
| nulld3v wrote:
| I am once again begging for this to be implemented:
| https://github.com/golang/go/issues/4146
| nottorp wrote:
| Interesting, I don't know Go (yet) but I was messing with it the
| other weekend.
|
| Can someone please summarize this []byte vs somethingReader thing
| for me? Assume I can program, just not familiar with Go.
|
| I was reading off sockets and it looked to me that the example
| code (i randomly ran into) had too many Reader something or
| other.
|
| Edit: Ok, I know what a streaming class does, they're available
| in many frameworks. I'm more interested in why you'd get forced
| to use them in the context of the Go standard library.
|
| Are they mandatory for sockets? Or for interacting with other
| common functions that I'd use to process the data out of my
| sockets?
|
| I just wanted to read up to a new line or a hard coded size limit
| from a socket... ;) Without getting accidentally quadratic in
| either cpu use or memory use...
| mholt wrote:
| []byte is a buffer. (io.)Reader is any type that can read
| bytes, like as part of a stream. There are `bytes.Reader` and
| `bufio.Reader` and `strings.Reader`, and many others, depending
| on what you're reading. These three types allow you to read
| []byte, any buffered input, and string, respectively, as
| streams.
|
| Streaming is more efficient for large pieces of data (files,
| etc; whatever you have), but the buffer is usually easier to
| work with and grants more flexibility.
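For the concrete question above (read up to a newline or a hard size limit), one possible approach is a `bufio.Reader` with a fixed buffer size and `ReadSlice`, which reports `bufio.ErrBufferFull` when the buffer fills before a delimiter is found. This is a sketch: `maxLine` is an arbitrary assumed limit, a `strings.Reader` stands in for the socket, and on a real connection you would keep one `bufio.Reader` per connection because it reads ahead.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// maxLine is a hypothetical limit. bufio enforces a minimum buffer size of
// 16 bytes, so smaller limits would need a different approach.
const maxLine = 64

// firstLine reads one newline-terminated line of at most maxLine bytes.
func firstLine(input string) (string, error) {
	br := bufio.NewReaderSize(strings.NewReader(input), maxLine)
	line, err := br.ReadSlice('\n')
	if err != nil {
		return "", err
	}
	// ReadSlice returns a view into the internal buffer that is only valid
	// until the next read, so convert (copy) before returning.
	return string(line), nil
}

// describe turns the outcome into a short label for printing.
func describe(input string) string {
	line, err := firstLine(input)
	switch {
	case err == nil:
		return "line=" + strings.TrimRight(line, "\n")
	case errors.Is(err, bufio.ErrBufferFull):
		return "too long"
	default:
		return "error: " + err.Error()
	}
}

// oversized builds an input with no newline within the limit.
func oversized() string { return strings.Repeat("x", 2*maxLine) }

func main() {
	fmt.Println(describe("hello\nworld\n"))
	fmt.Println(describe(oversized()))
	fmt.Println(describe("no newline"))
}
```

Memory stays bounded by the buffer size and each byte is scanned once, so nothing here grows quadratically.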
| ashishb wrote:
| Imagine you are reading files which can be of varying size (KB
| to GB). A byte array will do a contiguous allocation of the
| file size.
|
| A Reader can be much more thoughtful. And I say "can be"
| because someone can make Reader as inefficient as a byte array.
|
| Or they can read in chunks.
|
| For example, if you are trying to read exif data or reading up
| to first N bytes, Reader is a superior approach.
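The "first N bytes" case above can be sketched with `io.ReadFull`. The `countingReader` type is hypothetical and exists only to show how many bytes were actually pulled from the source; a `strings.Reader` stands in for a large file.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// countingReader is a hypothetical wrapper that counts bytes read through it.
type countingReader struct {
	r io.Reader
	n int
}

func (c *countingReader) Read(p []byte) (int, error) {
	n, err := c.r.Read(p)
	c.n += n
	return n, err
}

// header reads exactly headerLen bytes from a source of the given size and
// reports how many bytes were consumed from the source to do so.
func header(size, headerLen int) (consumed int, hdr string) {
	src := &countingReader{r: strings.NewReader(strings.Repeat("A", size))}
	buf := make([]byte, headerLen)
	if _, err := io.ReadFull(src, buf); err != nil {
		return src.n, ""
	}
	return src.n, string(buf)
}

func main() {
	consumed, hdr := header(1<<20, 4)
	fmt.Printf("header %q read after consuming %d of %d bytes\n", hdr, consumed, 1<<20)
}
```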
| masklinn wrote:
| []byte is a bytes buffer, a sequence of bytes. io.Reader is an
| abstraction around a read stream. You can adapt a []byte to a
| Reader by wrapping it in a bytes.Reader, that way if a function
| needs a reader and you have bytes, you can give them your
| bytes.
|
| The problem TFA has, is that bytes.Reader implies a copy: it's
| going to read data into a second internal slice. So when all
| their library needs is the bytes, they could use the bytes
| themselves, and avoid a potentially expensive copy.
|
| Obviously you could just have a second entry point which takes
| straight []byte instead of a reader, but in this case they're
| trying to conform to the standard library's image module[1]
| which does not expose a bytes interface and conditionally adds
| further indirection layers.
|
| [1]: https://pkg.go.dev/image
| konart wrote:
| Go's source code (comment actually) says it all tbh:
| https://github.com/golang/go/blob/master/src/io/io.go#L55
|
| (very much) tldr: anything that implements `Read(p []byte) (n
| int, err error)` - is a Reader.
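To make the tl;dr above concrete, here is a sketch of a user-defined type that becomes an `io.Reader` just by having the right `Read` method. The `upperReader` type is hypothetical; it upper-cases ASCII letters as they stream through and can be passed to anything that accepts a reader, such as `io.ReadAll`.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// upperReader is a hypothetical wrapper. It satisfies io.Reader implicitly:
// there is no "implements" declaration, only a matching method.
type upperReader struct{ src io.Reader }

// Read fills p from the wrapped reader, then upper-cases ASCII letters
// in place. Non-ASCII bytes pass through unchanged.
func (u upperReader) Read(p []byte) (int, error) {
	n, err := u.src.Read(p)
	for i := 0; i < n; i++ {
		if p[i] >= 'a' && p[i] <= 'z' {
			p[i] -= 'a' - 'A'
		}
	}
	return n, err
}

// shout drains an upperReader over the given string.
func shout(s string) (string, error) {
	b, err := io.ReadAll(upperReader{src: strings.NewReader(s)})
	return string(b), err
}

func main() {
	out, err := shout("hello, Go")
	fmt.Println(out, err)
}
```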
| binary132 wrote:
| Breaking virtual types is bad.
| eximius wrote:
| This really feels like trying to use Go for a purpose that it is
| inherently not designed for: absolute performance.
|
| Go is a fantastic case study in "good enough" practical
| engineering. And sometimes that means you can't wring out the
| absolute max performance and that's okay.
|
| It's frustrating, but it is fulfilling its goals. Its goals
| just aren't yours.
___________________________________________________________________
(page generated 2025-05-19 23:00 UTC)