[HN Gopher] GoSTL: Algorithm and datastructure library for Go si...
       ___________________________________________________________________
        
       GoSTL: Algorithm and datastructure library for Go similar to C++
       STL
        
       Author : gmgn
       Score  : 82 points
       Date   : 2022-01-30 13:25 UTC (9 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | politician wrote:
       | Someone (I) should fork this and upgrade it to use generics.
        
         | arendtio wrote:
         | https://go.dev/doc/tutorial/generics ;-)
        
         | synergy20 wrote:
         | Absolutely, looking for that.
         | 
         | STL in Go with generics will be super useful. Can't wait.
        
       | jupp0r wrote:
       | The design decision of providing thread safe operations on data
       | structures by default is a pretty poor one. In my experience this
       | is the wrong level of abstraction on which to perform mutual
       | exclusion of certain operations. Normally you want locks around a
       | transaction-like operation that might involve multiple operations
       | on a data structure (or multiple data structures). When just
       | using concurrency safe containers on their own, subtle
       | consistency bugs can be introduced and you end up writing a
       | second layer of locks around the higher level code anyways, now
       | paying the performance overhead of locking twice, once for your
       | logic and then for the container.
        
         | synergy20 wrote:
         | I agree, while 'lock data not code' sounds good(that's what
         | Rust does I think), container is a different story, it's by
         | design mutable and unfit to lock as a data structure, container
         | is not a pure data structure, it's a ADT that really works with
         | data structure and its methods(algorithms) hands in hands, hard
         | to split by any lock.
        
       | maxpert wrote:
       | Will wait for a Generics version. This is version is so not worth
       | it!
        
       | zerr wrote:
       | Off topic: is there still no alternative for putting random URLs
       | directly in a source code in order to use some libs?
        
         | maerF0x0 wrote:
         | what do you mean by "random urls"?
        
           | stackedinserter wrote:
           | Having `import "github.com/liyue201/gostl/ds/array"` in
           | codebase looks weird and unsafe. Like, who's liyue201 and
           | what exactly I'm importing?
           | 
           | Vendoring helps a bit, but it's still ugly.
           | 
           | Same problems exist in other languages, although "import
           | numpy as np" doesn't explicitly say that you're importing a
           | random head from someone's master.
        
             | mseepgood wrote:
             | > explicitly say that you're importing a random head from
             | someone's master
             | 
             | You don't import HEAD from master. Do you even understand
             | how go.mod, go.sum and the checksum database work?
        
             | lokar wrote:
             | You could setup your build to vendor. The import would be
             | the same.
        
             | jen20 wrote:
             | Frankly if it imparts that kind of concern, that is the
             | dependency system working properly.
             | 
             | 'require "leftpad"' may look safer, but is in fact not, and
             | unless you happen to know what is in the standard library
             | of Node.js off the top of your head, provides not so much
             | as a clue that the library is even third party.
        
             | qbasic_forever wrote:
             | On the contrary, I absolutely want to quickly and easily
             | know if my corporate codebase is depending on
             | github/<random user>.
             | 
             | Golang has a pretty solid standard library and process to
             | extend it. The 'x' packages are all reasonably vetted and
             | close to official libraries:
             | https://pkg.go.dev/golang.org/x You could easily make a
             | decree in a codebase that you won't depend on random github
             | code and only use stuff from trusted sources like standard
             | library, x, etc.
        
             | tikkabhuna wrote:
             | Great thing about the import notation is it points you in
             | the direction to look. Where's the numpy code base? Your
             | go.mod should specify the version of the library you're
             | using.
        
             | cuteboy19 wrote:
             | It's not much different from                   import
             | com.mysql.cj.jdbc
        
               | exdsq wrote:
               | I've never thought about that but yeah you're right,
               | that's cool. I'm not a fan of Go's imports either but
               | this makes it make a little more sense to me.
        
             | DelightOne wrote:
             | What namespacing do you envision? On the one side you don't
             | like short names that are installed locally and on the
             | other you seem to not like the full url noting where to
             | retrieve the library.
        
           | politician wrote:
           | They are confused about import notation.
        
         | bsdnoob wrote:
         | You can use replace directive in go mod then you'd be able to
         | load library from local path
        
         | evanmoran wrote:
         | This is how imports work in go. You make it look like a url,
         | but then can choose in the mod (the module) file if it is local
         | or to keep the url. This makes it quite clean because files
         | have the same import url even as relative paths change, so you
         | can move files all around and the import statements can stay
         | the same. It also has the added advantage that there is no
         | single module repository like in npm. So any url / GitHub repo
         | works so no module can permanently own a name.
        
           | tsimionescu wrote:
           | > This makes it quite clean because files have the same
           | import url even as relative paths change, so you can move
           | files all around and the import statements can stay the same.
           | 
           | That only works if no one is using your module as a
           | dependency.
           | 
           | > So any url / GitHub repo works so no module can permanently
           | own a name.
           | 
           | Only if go mod understands the source control mechanism, and
           | only if the HTTP server in front of that speaks the survival
           | go mod protocol. It's a horrible system that the go team
           | doesn't use internally.
        
         | mseepgood wrote:
         | It's not random. It's what you as a developer wanted as a
         | dependency, and it's cryptographically pinned to the exact
         | version you wanted.
        
       | jchw wrote:
       | I believe we're only one month away from a stable Go release with
       | generics. This might become a fair bit more interesting at that
       | point.
        
       | DontGiveTwoFlux wrote:
       | I hope that iterations can get some language level treatment via
       | a mechanism to hook into the 'range' operator on custom
       | containers.
        
         | throwaway894345 wrote:
         | I care less about hooking into range and more about the
         | performance. I would really like for iterators to be zero-cost
         | abstractions, but right now a SliceIter is waaaayyy slower than
         | looping over a slice (with range or otherwise). The Go compiler
         | isn't smart enough to realize that a loop over a SliceIter can
         | be reduced to a loop over a slice, perhaps because it can't
         | know for certain that the loop counter variable (e.g. the
         | `counter` field in `type SliceIter[T any] struct { slice []T;
         | counter int }`) is only being accessed by the loop itself (this
         | might be where Rust's ownership semantics would be useful in
         | Go?).
        
       | iamgopal wrote:
       | What are the go libraries such as this that can benefit from
       | genetics ?
        
       | JediPig wrote:
       | its not even close. its a wrapper around something the language
       | doesn't support. trying to make it support something like
       | iterators and ranges, plus its not generic is just asking for
       | code bloat and spaghetti. we all seen libraries like this before,
       | they fail cause the language doesn't have support for it.
       | 
       | Until it does, keep the source in line with current best
       | practices.
        
       | jerf wrote:
       | There's a lot of good work put into this.
       | 
       | If it is intended as a tool to help port C++ code out of C++ into
       | Go, it looks very useful.
       | 
       | If it is intended as a tool to help Go programmers, it has made a
       | common, but regrettably very serious mistake, that programmers
       | make when porting code: It has precisely copied the original API,
       | despite the fact the original API is quite unidiomatic in the new
       | language.
       | 
       | I don't want a package of all this stuff with the original _API_
       | of C++; I want the same _capabilities_ , but for them to be
       | idiomatic in the new language. It is quite common that the
       | resulting port will both be missing some things no longer
       | necessary in the new language, but may also require things that
       | weren't necessary in the host language.
       | 
       | As a nice little bite-sized example, consider some of the vector
       | methods. PushBack takes a single element. It really ought to take
       | a slice, probably via the "..." notation to make it a variable-
       | argument function. That's idiomatic in Go. It is also idiomatic
       | to offer that functionality and not worry _too_ much about the
       | fact it will be ever so slightly slower than not taking a slice.
       | Go is generally fast, but if you care _that_ much about speed to
       | the n 'th degree, you picked the wrong language. "Reserve" is in
       | the original C++ API because C++ is not a garbage collected
       | language. Go has the ability to specify sizes up front with
       | slices, and that's generally enough. ShrinkToFit is generally
       | again for a non-GC'd language... it's not useless in Go but
       | again, if you're really paranoid about that being available you
       | probably picked the wrong language.
       | 
       | Examples of this sort are pervasively shot throughout this API,
       | from what a scan of it shows me.
       | 
       | A non-trivial amount of the complexity of the STL interface is
       | related to manual memory management. In Go it results in APIs
       | where even if you can't quite point at what the problem is,
       | they're just generally overcomplicated in a GC language.
       | 
       | I expect something much simpler to implement, much simpler to
       | use, and probably also faster to execute could be written if one
       | wrote a list of the desirable capabilities down, and then ignored
       | the C++ API and implemented the capabilities directly in
       | idiomatic Go. Probably after crossing off a couple of things in
       | the capabilities related to C++ being manually-memory-managed,
       | and perhaps after adding some capabilities related to cheap
       | threads being available, e.g., a parallel map operation that
       | works across these data structures or something. Idiomatic, not
       | just blindly copied.
       | 
       | (I comment on this because I do see this a _lot_. Since these
       | sorts of libraries rarely take off it can be a fatal mistake made
       | on a lot of effort.)
        
         | turminal wrote:
         | > It has precisely copied the original API, despite the fact
         | the original API is quite unidiomatic in the new language.
         | 
         | STL isn't even idiomatic in C++ ;)
        
           | cyber_kinetist wrote:
           | I really wish someone takes the work of ditching the STL as a
           | whole and make an unofficial "version 2" of the same standard
           | library. I mean, IO sucks ass, std::string really sucks,
           | containers like unordered_map are really slow (because of the
           | pointer stability requirements), custom allocators are
           | cumbersome, the regex library is a mess, why are string_view
           | and span different things, yada yada. (And darn those long
           | compile times and horrendous debug-mode performance!)
           | 
           | Basically, something like a more updated version of EASTL
           | (https://github.com/electronicarts/EASTL) might be really
           | useful.
        
             | pjmlp wrote:
             | It was good enough for ATLAS experiments, but what does
             | CERN understand about HPC anyway.
        
               | scheme271 wrote:
               | CERN and HEP (high energy physics) in general care more
               | about HTC (high throughput computing). In general, event
               | processing is pleasantly parallel since each event can be
               | processed independently of others. A lot of the computing
               | is really similar to a world wide distributed map reduce
               | cluster. This significantly reduces a lot of the
               | complexity that comes up when working with a workflow
               | that might have thousands of threads interacting with
               | each other.
        
               | pjmlp wrote:
               | So much words and nothing about iostreams or STL.
               | 
               | I know how ATLAS HLT TDAQ works, you will find my name on
               | 2003 - 2004 papers.
        
             | einpoklum wrote:
             | EASTL still adheres to the same API as the proper standard
             | library, doesn't it?
             | 
             | Anyway, API-wise, you at least have ranges in C++20, which
             | affords you some syntactic convenience in the ability to
             | pass ranges rather than pairs of iterators. But I
             | definitely agree that both form-wise and impl-design-wise,
             | a change would be welcome.
        
             | Someone wrote:
             | > IO sucks ass, std::string really sucks, containers like
             | unordered_map are really slow (because of the pointer
             | stability requirements), custom allocators are cumbersome,
             | the regex library is a mess, why are string_view and span
             | different things
             | 
             | Nitpick: the STL (https://en.wikipedia.org/wiki/Standard_Te
             | mplate_Library#Cont...) contains neither of these.
             | 
             | > and make an unofficial "version 2" of the same standard
             | library.
             | 
             | Part of the C++ standard library in some sense _is_ version
             | 2 of the STL.
        
             | pjmlp wrote:
             | "This Videogame Programmer Used the STL and You Will Never
             | Guess What Happened Next" - CppCon 2019
             | 
             | https://www.youtube.com/watch?v=6hC9IxqdDDw
             | 
             | Ergo, not everyone agrees with that opinion.
        
             | _wldu wrote:
             | unordered_map is hash based. It's as fast as any other hash
             | backed data structure. Hash performance is well defined and
             | widely understood.
        
               | lpapez wrote:
               | There are dozens of hash map implementations for C++, and
               | while they all have O(1) asymptotic complexity, the
               | runtime performance can be vastly different depending on
               | the implementation choices. The design space of a hash
               | based data structure is immense, but std::unordered_map
               | is generally always avoided unless your main goal is
               | either ease-of-use or beginner-friendliness or you simply
               | don't care about performance. But if any of these is true
               | for your context, then C++ is probably not the right tool
               | for the job anyway.
        
               | pjmlp wrote:
               | In many cases C++ is the only tool for the job, unless
               | one wants to have fun doing their own toolchain on top of
               | the platform SDKs, or drop down to C, even worse.
        
         | tgv wrote:
         | This example                   func isEven(iter
         | iterator.ConstIterator) bool {             return
         | iter.Value().(int)%2 == 0         }         ...
         | algorithm.CountIf(a.Begin(), a.End(), isEven)
         | 
         | is certainly off-putting. It starts out with a "deque", which
         | is basically []any (in modern Go), needs Begin and End
         | pointers, and then even doesn't pass the actual value, but the
         | iterator (which can be modified in the call). I'd rather see
         | something more idiomatic like this
         | CountIf(len(a), func(i int) bool {return a[i] % 2 == 0 })
         | 
         | As you say, it might be helpful if you're porting something
         | that relies heavily on STL, but only if you don't understand
         | the code, and are willing to take the risk of subtle
         | differences between the C++ and Go implementations.
        
         | layer8 wrote:
         | > As a nice little bite-sized example, consider some of the
         | vector methods. PushBack takes a single element.
         | 
         | Also, push_back has always been an awkward name choice. Porting
         | would be a good opportunity to rename it to _append_.
        
         | tialaramex wrote:
         | The whole _point_ of the Standard Template Library was
         | generics. That 's the good idea behind the STL, and as I
         | understand it this doesn't support generics, so it's missing
         | this crucial concept.
         | 
         | But yes I generally agree with your point. The X language
         | version of a Y language library ought to work the way somebody
         | who has never used the Y language but is familiar with X
         | expects it to work.
         | 
         | That's across the board. Go and C++ agree that a "string" might
         | be just any bunch of bytes, who knows how they're encoded,
         | that's somebody else's problem so APIs which do care need to be
         | very explicit, maybe even mentioning it in method names. In
         | Python that's very strange, the strings are always Unicode and
         | so a function name_in_cp1252() is awkward to use, probably just
         | offer name() and cope with the encoding.
         | 
         | There are some subtle but important affordances which a non-
         | native user may not even consider adding because their
         | preferred language just omits the necessary feature. For
         | example Rust's From/ Into/ TryFrom/ TryInto cascade or C++ 20
         | spaceship operator are things native speakers of those
         | languages expect in a good library but aren't going to fall out
         | naturally from converting a Python API.
        
           | synergy20 wrote:
           | Indeed. Might be easier after go1.18 with generics is out to
           | create a STL for Go. STL could be so useful for Go if done
           | right.
        
       ___________________________________________________________________
       (page generated 2022-01-30 23:01 UTC)