[HN Gopher] a[low:high:max] in Golang - A Rare Slice Trick
       ___________________________________________________________________
        
       a[low:high:max] in Golang - A Rare Slice Trick
        
       Author : weird_user
       Score  : 84 points
       Date   : 2023-03-18 16:53 UTC (6 hours ago)
        
 (HTM) web link (build-your-own.org)
 (TXT) w3m dump (build-your-own.org)
        
       | [deleted]
        
       | tommodev wrote:
       | Question: what is the reason for the silent copy when append
       | exceeds the original slice cap?
       | 
       | It's a footgun avoided by reading the spec and (maybe)
       | remembering it in practice, but it feels like it would be safer
       | to throw a comp error and force the user to deal with it when a
       | user is trying to exceed the cap of the underlying array?
       | 
       | Alternative is defensively using len() and cap() for slice ops in
       | which case error-ing out feels more ergonomic.
        
       | nutate wrote:
       | My fave golang slice trick is the len of an empty slice is 0, but
       | the slice itself is == to nil, but the len of nil won't compile.
       | Can't understand that one.
       | 
       | https://go.dev/play/p/MslCkBphl7q?v=gotip
        
         | kjksf wrote:
         | Because nil is a special thing in Go.
         | 
         | nil is not a value, it's a predeclared identifier.
         | 
         | it represents zero value for pointers, interfaces, maps,
         | slices, channels and function types, representing an
         | uninitialized value
         | 
         | len(nil) feels like it should work if you think of nil as the
         | same as "value of empty array"
         | 
         | but what should be:                 var p *Struct       len(p)
         | ???
         | 
         | There's no "length of zero-valued pointer".
        
         | morelisp wrote:
         | Because the bare value `nil` has no type. Typing it, e.g.
         | `len([]int(nil))` works fine.
        
       | gus_massa wrote:
       | This paragraph is scary. It looks like a good idea for the Golang
       | version of the Underhanded C Contest.
       | 
       | > _This trick is useful for returning a slice from an immutable
       | array; if you accidentally append to the supposedly immutable
       | slice, a copy is forced and no data is overwritten because there
       | is no more capacity left._
        
         | gleenn wrote:
         | I don't know much Go but what does it mean to "accidentally"
         | append to an immutable array? Are there just no particular
         | guards for arrays and they are saying something which should be
         | treated as immutable?
        
           | chrsig wrote:
           | Here's a succinct demonstration:
           | https://go.dev/play/p/Li6_Rpe2R5L
           | 
           | Basically: a slice is a triple of {ptr to allocation, number
           | of elements used, size of allocation}
           | 
           | the slices will share the same underlying allocation. By
           | specifying the third parameter in the slice function, you set
           | the capacity equal to the number of elements in your slice.
           | This forces the next append to reallocate and copy the
           | contents into a new region.
           | 
           | Without bounding the capacity when creating a new slice, the
           | append operation _could possibly_ continue using the same
           | allocation, shared with the original slice. It _could
           | possibly_ resize and copy as well. It depends on how full the
           | slice is, and is an implementation detail subject to change
           | between versions.
        
             | ErikCorry wrote:
             | Here another illustration of how slices sometimes alias
             | each other, sometimes not. https://mobile.twitter.com/erikc
             | orry/status/1561635841495236...
        
               | thomaslkjeldsen wrote:
               | and yet another one (thanks to Erik):
               | https://github.com/tlk/go-
               | wat/blob/41cc21294a953dc7ad07a9b40...
        
           | tedunangst wrote:
           | There are no immutable arrays. If you have a slice of
           | capacity ten with four elements, you can write a fifth
           | element to it. If you reduce the capacity to four, then a new
           | slice must be allocated to store five elements.
           | 
           | The backing store for a slice is just a pointer. The rule is
           | you don't give people pointers that you don't want them to
           | write to.
        
             | andrewstuart2 wrote:
             | That's just not true. There are immutable arrays in length.
             | [4]int cannot be appended to. The backing of all slices are
             | array types and array doubling is used for appends that go
             | beyond the capacity.
             | 
             | https://go.dev/ref/spec#Array_types
        
               | tedunangst wrote:
               | Yeah, that was a mistake.
        
           | Groxx wrote:
           | Append mutates or copies-and-appends based on the capacity of
           | the slice it's given. And since slicing an array or a slice
           | gives you just a _view_ , not a copy, it can cause "spooky
           | action at a distance" and mutate things you didn't expect,
           | especially if it was a slice _with extra capacity_ of data
           | used somewhere else. Which is what this article is
           | describing.
           | 
           | Plus the knowledge of the change in length (to see the
           | appended data) is only visible with the returned slice (a new
           | view with the larger length), but the underlying data is
           | changed either way.
           | 
           | It's not _often_ a source of errors, but when it is it can be
           | extremely hard to diagnose.
        
             | DougBTX wrote:
             | So in other words, appending to a slice _may_ overwrite
             | data in the middle of a different slice?
        
               | Groxx wrote:
               | Yep. Depending on how the slice was _constructed_ , not
               | how it is used.
               | 
               | Slices (largely) behave like this in most languages, Go's
               | contribution is mostly that slices and append are
               | ubiquitous, so pretty much every Go coder is exposed to
               | it. Prior to generics, literally any alternative was so
               | much more work they essentially haven't been used. That
               | may change in the future now that we do have (very
               | simplistic) generics, but only time will tell.
        
               | iainmerrick wrote:
               | _Slices (largely) behave like this in most languages_
               | 
               | Is that really the case? I can't think of many languages
               | that let you construct a view into the middle of array,
               | pass that view as an argument to a function call, and
               | allow that function to add new values into your array via
               | the view.
               | 
               | In Python or JS, for example, the "slice" would just be a
               | brand new array, right?
        
               | Groxx wrote:
               | If the _slice syntax_ creates a copy, I would argue that
               | it 's simply syntactic sugar for array copying, not
               | actually producing a slice (i.e. there is no _slice type_
               | ). But yes, `somefunc(ary[1:])` in Python produces a
               | copy, not a reference to the underlying value. You could
               | build a more "true" slice class, but the builtin stuff
               | doesn't do that. JavaScript is similar.
               | 
               | Java however has `List<>.sublist` which largely behaves
               | like Go: https://docs.oracle.com/javase/8/docs/api/java/u
               | til/List.htm... . Sometimes these kinds of things are
               | also referred to as "views". Go calls them slices though,
               | and this is a Go article.
               | 
               | edit: C# apparently has "spans":
               | https://learn.microsoft.com/en-us/archive/msdn-
               | magazine/2018...
        
               | iainmerrick wrote:
               | It doesn't look like any of those allow you to
               | arbitrarily add and remove elements from the parent list
               | in the same way Go does (if I understand Go slices
               | correctly).
               | 
               | Java does let you make modifications, but only under very
               | tight restrictions:
               | 
               |  _The semantics of the list returned by this method
               | become undefined if the backing list (i.e., this list)
               | is_ structurally modified _in any way other than via the
               | returned list. (Structural modifications are those that
               | change the size of this list, or otherwise perturb it in
               | such a fashion that iterations in progress may yield
               | incorrect results.)_
               | 
               | I take that to mean the _parent_ list can't be modified
               | at the same time, nor can multiple views be modified.
               | 
               | So, I think Go's ability to do those structural
               | modifications on slices is rather unusual (as well as
               | error-prone and not obviously useful).
        
               | klodolph wrote:
               | Another example is typed arrays in JavaScript, which have
               | both .slice() and .subarray(). One creates a copy, and
               | one creates a new view into the same underlying memory.
        
               | iainmerrick wrote:
               | You can't change the length of a JS ArrayBuffer.
        
               | masklinn wrote:
               | > Slices (largely) behave like this in most languages
               | 
               | Only on assignment, not in appending.
               | 
               | > Go's contribution is mostly that slices and append are
               | ubiquitous
               | 
               | Go's contribution is the conflation of slices and
               | vectors, which in most languages are separate (or really
               | most languages only have the latter and don't provide
               | access to backing arrays, thus precluding this specific
               | confusion).
        
               | klodolph wrote:
               | In C++, until C++20's std::span, you would just use
               | std::vector for stuff you can modify and const
               | std::vector for stuff you can't.
               | 
               | I think the real problem is that Go's type system doesn't
               | catch the common error of keeping a reference to
               | something you don't own, or similar errors. C# catches
               | some of these errors by letting you return a
               | IReadOnlyList<T> or some other restricted type. Golang
               | has ways to narrow certain types, but the slice type is
               | primitive and has no narrowed read-only variation. C#
               | instead forces you to pay a higher runtime cost, because
               | you're paying a lot more for indirection with
               | IReadOnlyList<T>.
               | 
               | People fret about the difference between T[] and List<T>
               | in C#, and they do come with different performance
               | characteristics... Go's choice to use slices everywhere
               | does have a certain advantage that you're getting the
               | fast path everywhere, and you're spending less time
               | thinking about which one to use.
               | 
               | Not trying to say that Go is "right" here, it's just my
               | viewpoint that these language design decisions are
               | rational.
        
           | TwentyPosts wrote:
           | Go has no way to ensure "const correctness". If you pass
           | around a slice (which you'll be doing a lot, since it's
           | essentially Go's equivalent of a vector), everyone can modify
           | the slice however they please. It's just a fat pointer.
           | 
           | So yes, something which should be treated as immutable.
           | 
           | The way Go works here easily leads to bugs, especially when
           | concurrency is involved and slices get captured or used by
           | goroutines.
           | 
           | If you send a slice to a function, they essentially receive a
           | pointer pointing at the same block of memory as in your
           | calling function. They can modify it. If they append to the
           | slice, however, and surpass the capacity of the slice they
           | received, then their slice gets reallocated and doesn't point
           | at the same block of memory anymore.
           | 
           | In other words, if you receive a slice, append to it, and
           | then change the first value of the slice, then this might
           | modify a slice (or array) somewhere completely different,
           | depending on whether your append surpasses the allocated
           | capacity of the slice or not.
           | 
           | Welcome to Golang.
        
             | pstuart wrote:
             | On the bright side, if one tests with race detection
             | enabled these problems are _usually_ made apparent.
        
               | masklinn wrote:
               | This issue can trivially occur in sequential code.
        
             | eternalban wrote:
             | "one goroutine has access to the value" :O
             | 
             | If you stick to "values" Go works as originally advertised
             | (see below). With values iirc the consensus became that for
             | high performance the channel overhead was too much of a hit
             | and so back to locks and 'traditional' concurrency.
             | 
             | The concurrency issue in PLTs is not a logical puzzle. It's
             | a performance challenge revolving around copying stuff and
             | hand-offs. With multicores thrown in, it seems to really
             | require addressing the challenges once and for all (as
             | services) at the OS and possibly even hardware layer. IF we
             | have efficiencies at the hw & os around 'message passing',
             | any variation on CSP would address the concurrency issue,
             | "by design", as it says below in Go's "Effective Go"
             | documentation.
             | 
             | https://web.archive.org/web/20091111073232/http://golang.or
             | g...
             | 
             | https://web.archive.org/web/20091113154825/http://golang.or
             | g...
             | 
             |  _Share by communicating
             | 
             | Concurrent programming is a large topic and there is space
             | only for some Go-specific highlights here.
             | 
             | Concurrent programming in many environments is made
             | difficult by the subtleties required to implement correct
             | access to shared variables. Go encourages a different
             | approach in which shared values are passed around on
             | channels and, in fact, never actively shared by separate
             | threads of execution. Only one goroutine has access to the
             | value at any given time. Data races cannot occur, by
             | design. To encourage this way of thinking we have reduced
             | it to a slogan:
             | 
             | Do not communicate by sharing memory; instead, share memory
             | by communicating."_
        
             | Groxx wrote:
             | But also strings _are_ actually immutable sometimes (at the
             | very least when hardcoded), and they can be converted to a
             | `[]byte` slice that looks mutable, but panics when
             | modified. AFAIK no other slice-like data in Go behaves like
             | this.
             | 
             | Fun!
        
               | morelisp wrote:
               | This is misleading enough it's a lie; to do this you need
               | to use a function named `unsafe.Pointer` with a type
               | named `reflect.StringHeader`, not just `[]byte` and
               | `string`.
        
               | Groxx wrote:
               | It means there is special protection for string-data that
               | is not available to anything else, and that library-
               | authors get panic reports about immutable byte slices. To
               | that degree, it's something that the specialized type
               | exposes you to but the type system does not adequately
               | protect or signal.
               | 
               | But yes, AFAIK this requires use of `unsafe`. Which does
               | change things, and puts the blame for misuse squarely on
               | the immutable-byte-slice creators.
        
               | morelisp wrote:
               | > It means there is special protection for strings that
               | is not available to anything else
               | 
               | Lots of languages have no C/C++-style const objects yet
               | immutable strings. It's weird to pick on Go specifically
               | for this. (Since it's the JVM model, at this point it may
               | even be a majority of mainstream languages.)
               | 
               | If you mean specifically the panic when modifying a
               | static string, is this not just the default `mprotect` on
               | rodata? You can do that to any memory page you want. It's
               | a feature of the kernel's memory management, not the type
               | system or even the language runtime.
               | 
               | > library-authors get panic reports about immutable byte
               | slices.
               | 
               | I'm really skeptical this happens in any meaningful
               | amount. Go offers the feature to transform a byte slice
               | you know you won't want anymore into a string without a
               | memory allocation; or to pass a string to a function
               | which wants a []byte and _will not mutate it_ , usually
               | to wrap an optimized implementation working on both
               | strings and []byte. Modifying a string's contents is
               | always undefined behavior, even if it doesn't panic
               | immediately - the compiler will assume strings are
               | immutable and make "as if" judgements accordingly.
        
               | tedunangst wrote:
               | I don't think you can convert a string to []byte without
               | unsafe except by copying?
        
               | wyager wrote:
               | This is a pretty good vignette of how Go is designed.
               | "Users don't need this feature (immutability,
               | polymorphism, etc.), let's not include it. Ah crap, turns
               | out we actually need it for a core language feature. Is
               | there a lesson we can take away here? Nah, just make an
               | ad-hoc implementation of the functionality in this one
               | place."
        
               | Groxx wrote:
               | "X for me but not for thee" is very much the vibe Go
               | gives me, yeah.
               | 
               | I mean, I have completely replaced my adhoc Python use
               | with it and I'm much happier. But it's terrifying to use
               | to build large systems.
        
               | ikiris wrote:
               | Part of it is that "large systems" are almost all
               | combinations of small systems with proto boundaries, so
               | it's not much of an actual risk unless you're making
               | giant monolith code.
        
               | andrewflnr wrote:
               | "Proto" meaning protobuffers?
        
               | klodolph wrote:
               | Const is a hard language feature to design well. There
               | are a lot of pitfalls in the way const is used in C++ and
               | TypeScript (I'm including "readonly" in the discussion).
               | 
               | Rust gets it right, but Rust is relatively complicated.
               | 
               | The languages which are most similar to Go are Java and
               | C#, both of which also lack const types, or have a very
               | limited version of constness.
        
               | avgcorrection wrote:
               | For what it's worth it sounds no worse than Java: you
               | have immutable strings but you have no way to enforce
               | that you can't modify something that you have recieved,
               | which is why the underlying char array for a string has
               | to be defensively copied if the user asks for `asArray()`
               | or whatever it is.
               | 
               | Strings being immutable doesn't seem like a special case,
               | though. The underlying mutable array is just
               | encapsulated, which is something that you can implement
               | yourself. (But _reflection_ ... maybe you can break the
               | rules with reflection.)
        
           | Izkata wrote:
           | Sounds like they're just badly describing copy-on-write.
           | 
           | The _original_ slice uses no extra memory because as long as
           | you treat it as immutable it won 't actually make a copy. As
           | soon as you try to modify it, then the copy is made and the
           | extra memory used.
        
             | mook wrote:
             | Yes-ish; because go has no immutable slices, it's just
             | copy-on-realloc. Which is if course obvious if realize
             | that's what it's actually doing...
        
             | [deleted]
        
           | UncleEntity wrote:
           | I know even less go but it sounds like it prevents
           | accidentally writing past the end of the array by making a
           | copy and appending to that.
        
       | jjice wrote:
       | Tangential, but if OP is the owner of the site, could you talk a
       | little bit about your book writing process?
       | 
       | - How you write
       | 
       | - How you render the PDF
       | 
       | - How you develop your plans
       | 
       | That kind of thing. I find it super interesting to self-publish
       | software books and I've been slowly writing one for about a year
       | now. Really curious about this stuff in general and it looks like
       | you've got a solid process down.
        
       | sacnoradhq wrote:
       | 0. Only 2 of 3 values are necessary.
       | 
       | 1. Can 1 value be unspecified?
       | 
       | 2. Can 2 values be unspecified?
       | 
       | 3. Can 3 values be unspecified?
       | 
       | 4. Are conflicting values interpreted as intersection of ranges
       | rather than union?
       | 
       | Note 1-3. With 2 values, I believe x[:] is how to lift a sized
       | array into a more generic slice type.
        
         | masklinn wrote:
         | Your questions are really unclear.
         | 
         | In the 2-parameters form `a[low:high]`, both values can be left
         | out, defaulting to respectively 0 and len(a). In the
         | 3-parameter form, only the leading value can be left out,
         | defaulting to 0.
         | 
         | I've no idea what (4) is asking about.
        
         | klodolph wrote:
         | Regarding conflicting ranges--
         | 
         | In a[i:j:k], 0 <= i, i <= j, j <= k, and k <= cap(a). If not,
         | then the operation will panic.
        
       | [deleted]
        
       | operator-name wrote:
       | So this is the same as                   a[low:min(high, max)]
       | ?
        
       ___________________________________________________________________
       (page generated 2023-03-18 23:00 UTC)