[HN Gopher] Go Data Structures (2009)
___________________________________________________________________
Go Data Structures (2009)
Author : rednafi
Score : 121 points
Date : 2025-02-05 09:44 UTC (13 hours ago)
(HTM) web link (research.swtch.com)
(TXT) w3m dump (research.swtch.com)
| eu wrote:
| it's a good read, but i think it should focus more on some of the
| common mistakes that people make when slicing a slice.
| rednafi wrote:
| Slicing a slice is full of gotchas. I tend to forget all the
| rules and avoid it whenever I can.
| bborud wrote:
| It is a suprisingly hard thing to implement well. I have no
| idea how many times I have implemented slice-like things in C
| (back in the 1990-2000s when I mostly wrote C) and it was one
| of those things I never managed to be happy with.
| neonsunset wrote:
| It can be done but that requires a better, more expressive
| type system.
| rednafi wrote:
| An expressive type system also _often_ means slower build
| times. I dislike working with Rust for this exact reason.
|
| While most people highlight the difficulty of picking up
| the syntax, I find Rust to be an incredibly tedious
| language overall. Zig has a less expressive type system,
| but it compiles much faster (though not as fast as Go). I
| like what Zig and Odin folks are doing over there.
|
| I like the balance Go strikes between developer
| productivity and power, though I dearly miss union types
| in Go.
| 4ad wrote:
| An expressive type system absolutely, positively,
| unequivocally does not imply slower build times
| (especially with a Church-style type system). There are
| plenty of programming languages with advanced type
| systems which compile extremely quickly, even faster than
| Go, for example OCaml.
|
| Don't make the fallacy of conflating Rust's slow compile
| time with its "advanced" (not really, it's 80's tech)
| type system. Rust compilation is slow for unrelated
| reasons.
| taurknaut wrote:
| Old doesn't mean non-advanced. GraalVM is based on a
| paper (Futamura) from fifty years ago. Off the top of my
| head I can't think of many language features younger than
| the eighties--maybe green threading? That would be
| surprising but might fit. I suppose you could also say
| gradual typing. Haskell has many recent innovations, of
| course, but very few of those have seen much use
| elsewhere. Scala has its implicits, I guess, that's
| another one.
|
| Personally, I write java at my day job and the type
| system there makes me loooong for rust.
| pjmlp wrote:
| No need for Rust, when JVM has Haskell, Scala, Kotlin,
| Clojure, Common Lisp.
| taurknaut wrote:
| I prefer rust to all of them, but I also come from a very
| systemsy background. Plus it has the benefit of being
| much easier to embed inside or compose around basically
| any runtime you'd like than managed code, which is why I
| chose rust rather than basically any managed language.
|
| But, it's just a tool, and the tools I choose reflect the
| type of stuff I want to build. The JVM is extremely
| impressive in its own right. You're just not going to to
| find any one runtime or ecosystem that hits every niche.
| I'm happy to leave the language favoritism to the junior
| devs--for the vast majority of situations, what you're
| building dictates which language makes the most sense,
| not vice versa.
| neonsunset wrote:
| As a start, Go could separate container and slice types,
| the way C# did it with T[]/List<T>/other and
| Span<T>/Memory<T>. No lengthy build process required.
| rednafi wrote:
| Yeah, but the at the same time, I find C# code a sigil
| soup. Go makes a different tradeoff.
|
| I've been involved in a few successful large scale
| projects and never felt like the type system of Go is
| holding me back too much. Sure the error handling could
| be better, union type would make interface munging
| easier, but the current state after generics isn't too
| bad.
| neonsunset wrote:
| > Sigil soup
|
| Last time I checked, C# had clean and focused syntax for
| working with collection types. Could you provide an
| example?
| jerf wrote:
| I'm not deeply familiar with those C# types, but I think
| maybe it already does. An array, which includes the size
| of itself in its type so that a four-element array is not
| the same type as an eight-element array, is already in
| Go. Go's language affordances make it easy to just have
| slices without the underlying arrays, since they're
| generally not useful on their own, but you can take
| arrays and slice into them if you like.
| dist1ll wrote:
| It's possible to build languages that compile faster than
| Go, with a much more expressive type system.
|
| It's just that compile times and DevEx haven't been a
| priority for most projects.
| thegeekpirate wrote:
| You'd most likely be happy with Odin (https://odin-
| lang.org), which I find to be essentially a fixed Go with
| no GC.
| pjmlp wrote:
| As proven by other languages with similar type systems
| and faster compile times, Rust's case is a matter of
| tooling, not language features.
| _flux wrote:
| Not sure if that's really a proof, as it could be the
| exact combination of language features that makes up the
| slowness. For example traits, non-ordered definitions in
| compilation units and monomorphization probably don't
| help. GHC also isn't a speed demon for compilation.
|
| But sure, LLVM and interfacing with it is quite possibly
| a big contributor to it.
| pjmlp wrote:
| Haskell isn't the only language around with complex type
| systems.
|
| However, it is actually a good example regarding tooling,
| as the Haskell ecosystem has interpreters and REPL
| environments available, for quick development and
| prototyping, something that is yet to be common among
| Rustaceans.
| neonsunset wrote:
| Rust has Cranelift: https://cranelift.dev
| pjmlp wrote:
| Indeed, but its compile times aren't much better than
| LLVM, at least one year ago.
|
| Ideally we would be having the F# REPL/JIT, plus Native
| AOT for deployment, as comparable development workflow
| experience.
|
| Naturally F# was chosen as example, because that's your
| area. :)
|
| Not being negative per se, I also would like to have
| something like Haskell GHCi, or OCaml bytecode compiler,
| as options on rustup, so naturally something like this
| might eventually come.
| burnished wrote:
| Based on a first hand account I read (but cannot source
| offhand) Rust's slow compiles are because anytime there
| was a tradeoff involving compile time at the expense of
| something else they'd always choose that something else.
| Not cause they hated fast compilation, guess it just
| wasn't high on their priorities.
| pjmlp wrote:
| Something that even Mesa and Modula-2 already supported by
| 1980's, maybe one day C will eventually get proper slices.
| rendaw wrote:
| Appending a slice is also full of gotchas. Sometimes it
| modifies the slice in place, sometimes it reallocates and
| copies.
| pstuart wrote:
| Only really a gotcha if you pass a slice into a function
| and expect to see modifications in that slice after the
| function completes. It's helpful to remember that Go passes
| by value, not reference.
|
| That can be addressed by passing the slice as a pointer:
| https://go.dev/play/p/h9Cg8qL9kNL
| TheDong wrote:
| > Only really a gotcha if you pass a slice into a
| function and expect to see modifications in that slice
| after the function completes. It's helpful to remember
| that Go passes by value, not reference.
|
| Slices are passed partly by value (the length), partly by
| reference (the data). func takeSlice(s
| []int) { slices.Sort(s) }
|
| From your explanation, you would expect that to not
| mutate the slice passed in, but it does.
|
| This can have other quite confusing gotchas, like:
| func f(s []int) { _ = append(s, 1) }
| func main() { s := []int{1, 2, 3}
| f(s[0:2]) fmt.Printf("%v\n", s) }
|
| I'm sure the output makes perfect intuitive sense
| https://go.dev/play/p/79gOzSStTp4
| adonovan wrote:
| A slice operation s[i:] seems like it should be little more
| than an ADD instruction for a registerized slice where i is
| known to be in bounds, but a surprising little detail is that
| when i==cap(s) we really don't want to create an empty slice
| whose data pointer points one past the end of the original
| array, as this could keep some arbitrary object live. So the
| compiler generates a special branch-free sequence
| (NEG;SUB;ASR;AND) to compute the correct pointer increment,
| ((i - cap) >> 63) & (8 * i).
|
| https://go.dev/play/p/J2U4djvMVoY
| fmbb wrote:
| Can you link some article about these common mistakes? Sounds
| like good learnings can be had.
| WolfCop wrote:
| (2009)
| rednafi wrote:
| It's Go we're talking about. Other than 64-bit the dominant
| word size, nothing much has changed.
| 4ad wrote:
| The interface layout has changed since the article (although
| this specific article doesn't mention interfaces, a later
| article in the series does). Additionally Go now has
| generics.
|
| It's true that little has changed, but very little is
| changing in the data representation of any language, really.
| Even ones that are evolving rapidly.
| Cthulhu_ wrote:
| This is a truth that's easily overlooked; most languages
| are several levels beyond basic types to the point that
| people forget about the low level constructs involved. This
| is one reason why I like Go, it exposes and educates on
| fairly low-level mechanisms that are not unfamiliar to
| anyone who's studied computer science, but at the same time
| you don't have to worry too much about the lower level
| stuff like memory, pointers, zeroing, etc. I think it's a
| good tradeoff.
| mseepgood wrote:
| It's still not news.
| Cthulhu_ wrote:
| HN is not primarily a news site though, despite the name.
| zellyn wrote:
| The most important thing to know is that Go slices are _exactly_
| how you would implement a dynamically growable slice in C if I
| asked you to. No less, but importantly, no more: a pointer to a
| piece of memory, and a current length and capacity.
|
| The classic slice mistake is to forget that append doesn't
| necessarily return a completely new slice backed by a disjoint
| piece of memory, especially easy if you've been using a language
| where variables are immutable by default.
|
| The problem is that it will often accidentally work; if you over-
| run the backing store, you will get a new piece of memory:
| a := make([]int, 1, 1) // a is backed by a one-element-long piece
| of memory a[0] = 42 b := append(a, 17) // this
| must allocate a new piece of memory c := append(a, 37) //
| this must also allocate a new piece of memory // At
| this point a == [42], b == [42, 17], c == [42, 37]
|
| The problem is when the existing capacity is sufficient:
| a := make([]int, 1, 10) // a is backed by a ten-element-long
| piece of memory a[0] = 42 b := append(a, 17) //
| no need to allocate: just use the existing memory c :=
| append(a, 37) // no need to allocate: just (re)use the existing
| memory // At this point a == [42], b == [42, 37], c
| == [42, 37] // ^^^^^^^^^^^^^
| BoingBoomTschak wrote:
| That's not the "classic slice mistake", that's slices being one
| of the stupidest things Go ever decided upon. In C++, that'd be
| the equivalent of merging std::vector<T> and std::span<T,
| std::dynamic_extent> into a single structure for "reasons"
| ("simplicity"?).
|
| https://build-your-own.org/blog/20241125_go_slice_surprise/
| goes a bit further, but this is really the point where I
| decided I wouldn't be able to stand Go.
| zellyn wrote:
| In the period when Go was ready for production use and Rust
| wasn't, I couldn't wait for Rust to be ready, so that the
| response to everyone hating Go for not being a C++ equivalent
| could just be, "I think you want Rust."
|
| I think you want Rust.
| kccqzy wrote:
| I'd argue that the Go example is worse than the equivalent in
| C, because Go has a garbage collector and C does not. In C,
| precisely because of the lack of a garbage collector, people
| need to explicitly and carefully document who is responsible
| for deallocating the memory. This makes things clear who is
| owning that memory. And this by extension leads to an intuition
| on whether the append is acceptable on the original slice or
| not. If a C function returns such a slice and is documented
| that the caller should free it, you can call realloc and add to
| it; if the C function documents that the caller should not free
| it, you naturally wouldn't even try to append to it: you would
| copy it first.
|
| In Go, this garbage collector frees people from thinking about
| freeing memory. But it doesn't free people from thinking about
| owned vs borrowed memory. And that's where bugs can happen.
| kbolino wrote:
| This isn't really a practical problem.
|
| If you created the slice, you control it and you can modify
| its elements, append to it, etc.
|
| If you didn't create the slice, you don't control it. If you
| want to modify it or append to it, you should copy it first.
|
| This has reflected how I've seen Go used professionally by
| people experienced in the language. The language could offer
| more help (but probably never will), but it's not hardly a
| _regression_ from C. The real regression from C is the lack
| of const pointers.
| pjmlp wrote:
| And enums, type Colours (Red, Blue, Green)
| //.... c := Colours.Blue a := pred(c)
| b := succ(c) for x := range Colours {
| //.... }
|
| 1976's Pascal style is unthinkable to ever land in Go, but
| we have iota and const, lets not complain.
| kbolino wrote:
| Go's "enums" are worse than a lot of other languages, but
| they're not worse than C.
| pjmlp wrote:
| Indeed, then again maybe we should not design languages
| with features worse than C in the 21st century, having 60
| years of field experience.
| kbolino wrote:
| It's not worse. It's just _only slightly better_. The
| most damning thing you can say about Go is not that it
| doesn 't improve upon C, it's that it improves _only_ on
| C (and only in the ways cared about). The authors of Go
| really didn 't examine other languages very closely. So
| starting with the authors' goals (light on the page,
| bounds checked, concurrency with message passing, easy to
| pick up, fast to compile) and a fairly deep knowledge of
| C (but little else) you pretty much get Go (especially
| early Go).
| pjmlp wrote:
| Even that isn't really the case, as Go, as it started,
| was basically Inferno's Limbo in a new clothing, with
| some influence from Oberon-2 method syntax and the SYSTEM
| package as unsafe.
|
| Unfortunately afterwards they decided to follow Wirth's
| quest in Oberon-07 minimalism, instead of Active Oberon.
| 9rx wrote:
| Go enums are the same as in every single other language.
| After all, all an enum does is count. There isn't
| anything more you can do with it.
|
| You can introduce data structures and types that utilize
| enums, with some languages taking that idea further than
| others, but that's well beyond enums themselves.
| masklinn wrote:
| > Go enums are the same as in every single other
| language. After all, all an enum does is count. There
| isn't anything more you can do with it.
|
| You may want to meet a modern language one day. Or hell
| not even a modern language, even a shit one like Java.
|
| As it turns out there are languages where you can define
| an enum foo { bar, baz }
|
| and when you type a parameter as a `foo` you're not at
| risk of getting 69 or 4328.
| 9rx wrote:
| _> You may want to meet a modern language one day._
|
| Like what? CrabLang? Its enums are identical to Go,
| unsurprisingly. Above the enum rests a discriminated
| union that ends up hiding the details of the enum. That
| is where things begin to differ greatly. You have to do
| some real trickery to actually get at the underlying enum
| result in that language. But, when you do, you see that
| it is exactly the same.
|
| > and when you type a parameter as a `foo` you're not at
| risk of getting 69 or 4328.
|
| That's thanks to the type system, though, not enums.
| Enums are not a type. Enums are a number generator. Hence
| the name.
| tialaramex wrote:
| > That's thanks to the type system, though, not enums.
| Enums are not a type. Enums are a number generator. Hence
| the name.
|
| What's happened here is that you've mistaken "Things I
| believe" for "What everybody else believes" but you're
| talking to other people, not yourself, so, this makes you
| seem like a blithering idiot.
|
| The particular version of this trap you've fallen into is
| a variation of the "It's all just the machine integers"
| mental model from C. It's just a model and the problem
| arises when you mistake that model for reality.
|
| Now, technically this model isn't even correct for C's
| abstract machine, but it's close enough for many
| programmers and it tends to match how they think the
| hardware works, which is even more confusing for the
| hardware people who know how it _actually_ works but that
| 's another conversation.
|
| This model is completely useless for languages which
| don't have the same type system, and so it's no surprise
| that it immediately led you astray.
| 9rx wrote:
| _> but you 're talking to other people_
|
| No, I am clearly talking to a computer program. It is
| possible that the program is forwarding the discussion on
| to people. Perhaps that is what you are trying to allude
| to? The details of how the software works behind the
| scenes is beyond my concern. There is no intention to
| talk to other people, even if the software has somehow
| created that situation incidentally. If I wanted to talk
| to people, I would go out and talk to people, not type
| away at my computer into a box given to me by the
| software.
|
| _> The particular version of this trap you 've fallen
| into is a variation of the "It's all just the machine
| integers" mental model from C._
|
| As much as I enjoy your pet definition that you've
| arbitrarily made up on the spot here, the particular trap
| I have fallen into is the dictionary. It literally states
| what an enumeration is according to the prevailing usage.
| It does not describe it as a type, it describes it as the
| action of mentioning a number of things one by one. Which
| is exactly what an enum does.
|
| The previous comment is talking about type constraints.
| You can definitely constrain a type such that it is
| invalid to use it outside of the numbers generated by the
| enum, just as you can constrain a type to only accept
| certain strings. e.g. from Typescript: `type Email =
| "{string}@{string}"` This idea is not limited to enums.
|
| That's definitely a thing, and definitely a thing that
| could be used in conjunction with an enum, but is not an
| enum itself. Not as enum is normally used. Of course you
| can arbitrarily define it however you please. But, if you
| are accepting of each holding their own pet definition,
| your comment doesn't work. You can't have it both ways.
| noapologies wrote:
| > ... even a shit one like Java.
|
| I agree with your point about Go enums.
|
| But in defense of Java, modern Java is actually pretty
| pleasant.
|
| Virtual threads, records and sealed classes, pattern
| matching, state-of-the-art garbage collectors, a great
| standard library etc. (and obviously well-behaved enums).
|
| Not to mention the other languages you get for free with
| the JVM ecosystem.
|
| It might not be as expressive as Rust, but certainly
| Java/JVM > Go.
| pjmlp wrote:
| It is so funny how Go folks praise its standard library,
| when it is a fraction of Python, Java and .NET standard
| libraries.
| kbolino wrote:
| Things that were in the Go standard library from the
| first public release (2012):
|
| encoding/json vs Python: 2008, json
| package added in version 2.6 .NET: 2019,
| System.Text.Json came with .NET Core 3.0 JVM: still
| nothing
|
| net/http client vs Python: gave up, "just
| use requests" .NET: also 2012, HttpClient came with
| .NET Framework 4.5 JVM: 2018, HttpClient came with
| Java 11
|
| The time library definitely sucks though. There's really
| no excuse for it, joda-time came out a long time ago.
| pjmlp wrote:
| Go enums are stuck in the 1960's macro assemblers.
| 9rx wrote:
| Quite right. The entire possible space for enums to
| explore was exhausted in the age of 1960s assembler.
| There is only so much you can do with a number generator.
|
| Which, I guess, is why we have this desperate attempt to
| redefine what an enum is, having become bored with a term
| that has no invitation potential. But, unfortunately, we
| have not found shared consensus on what "enum" should
| become. Some think it should be a constraint, others
| think it should be a discriminated union, others think a
| type declaration, so on and so forth.
|
| All of which already have names, which makes the whole
| thing particularly bizarre. You'd think if someone really
| feels the need to make their mark on the world by coining
| "enum" under new usage in the popular lexicon they would
| at least pick a concept that isn't already otherwise
| named.
| pjmlp wrote:
| I think some people need to spend some time learning type
| systems theory.
| 9rx wrote:
| That is probably true, but has little to do with enums,
| which are concerned with mentioning things one-by-one
| (i.e. producing values).
|
| It is true that some type system features built upon
| enums. Like a previous commenter mentioned, Pascal offers
| a type that constrains allowable values of that type to
| be within the values generated by the enumerator.
| Likewise, I mentioned in another discussion that in
| CrabLang the enumerator value is used as the discriminant
| in its discriminated union types, which achieves a
| similar effect. I expect that confuses some people into
| thinking types and enums are the same thing, which may be
| what you are trying to get at, although doesn't really
| apply here. The difference is known to those reading this
| discussion.
|
| The biggest problem with this desperate attempt to find
| new meaning for "enum" is: What are we going to call what
| has traditionally been known as an enum? It does not seem
| to have another word to describe it.
| kccqzy wrote:
| The practical problem is about transferring the control.
| Without it, you end up doing way too much defensive
| copying.
| zellyn wrote:
| I happen to like the way Go did it, but now that generics and
| iterators have landed, it should be possible to create an
| always-make-a-copy vector type (although it'll need to use
| `get` and `set` like Java Lists, not [42] like
| slices/arrays).
|
| It's not like you could have appended twice to the same List
| in Java and expected to get two disjoint arrays, unless you
| copy first, and `slices.Clone()` makes that easy in Go now
| too.
| wavemode wrote:
| It's not clear to me why I would need a "dynamically growable
| slice". That's not a slice at that point, it's an vector aka
| ArrayList.
|
| If all I need is a slice of memory, I'd rather only have to
| pass around the two pieces of data (pointer and length) than
| all three.
| zellyn wrote:
| Yes. "dynamically growable slice" : "vector" :: "tomato" :
| "to-mah-to"
| tialaramex wrote:
| The growable array type is an owning container, whereas a
| slice (in C++ and C# span) is a reference type, it doesn't
| actually own anything.
|
| This type isn't either of those things, it's a hybrid, and
| Go tries to get away with that because it's a garbage
| collected language so people don't need to care as much
| about ownership.
| tialaramex wrote:
| I think of this as a failed experiment. Like coal powered
| cars or when we thought maybe transatlantic travel by airship
| was the Right Thing. It may be hard to know what'll happen
| without trying.
|
| The growable array type (what you call "vector") is a
| venerable data type, although it does still have parameters
| you might reasonable tweak e.g. what's the correct growth
| factor? Doubling is popular but e.g. Zig chooses new_size =
| (old_size * 1.5) + 8 and you'll see disagreement about what
| API should be offered and why - e.g. C++ std::vector has the
| wrong reservation API
|
| But this thing clearly isn't a mistake, it's intentionally a
| hybrid, and if it had been a huge success maybe everybody
| else would scramble to copy it.
| slashdev wrote:
| Indeed, this is how rust handles it.
|
| A mutable String or Vec has all three fields, but the
| borrowed slice or str has just pointer and length.
|
| Go chooses to be simpler by not having those kinds of
| separations, which also makes it a little less efficient
| under some circumstances.
| ustad wrote:
| Go's slice behavior reveals a concerning trade-off in language
| design where performance was prioritized over predictability.
| Consider this trap: a := make([]int, 1, 10)
| a[0] = 42 b := append(a, 17) c := append(a, 37)
| // Surprise! b and c both equal [42, 37]
|
| The fact that append()'s behavior changes based on a hidden
| property (capacity) violates what I consider a core principle
| of language design: fundamental operations should be
| predictable and easy to reason about without having to think
| about implementation details.
|
| While I understand the performance benefits of reusing memory,
| language fundamentals shouldn't require developers to
| constantly think about backing array capacity to avoid bugs.
| This is especially problematic because it "usually works" until
| it doesn't - the worst kind of bug.
|
| Modern languages have shown us better ways. Rust makes
| ownership explicit. Immutable-by-default languages prevent such
| surprises entirely. Even if Go wanted to maintain performance,
| they could have made this behavior more explicit - perhaps with
| separate "append_safe" and "append_reuse" functions.
|
| Programming language fundamentals should be like good UI design
| - they should do what you expect them to do. When core
| operations have surprising edge cases, it creates cognitive
| overhead that ripples through all code written in that
| language.
___________________________________________________________________
(page generated 2025-02-05 23:01 UTC)