[HN Gopher] GoSTL: Algorithm and datastructure library for Go si...
___________________________________________________________________
GoSTL: Algorithm and datastructure library for Go similar to C++
STL
Author : gmgn
Score : 82 points
Date : 2022-01-30 13:25 UTC (9 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| politician wrote:
| Someone (I) should fork this and upgrade it to use generics.
| arendtio wrote:
| https://go.dev/doc/tutorial/generics ;-)
| synergy20 wrote:
| Absolutely, looking for that.
|
| STL in Go with generics will be super useful. Can't wait.
| jupp0r wrote:
| The design decision of providing thread safe operations on data
| structures by default is a pretty poor one. In my experience this
| is the wrong level of abstraction on which to perform mutual
| exclusion of certain operations. Normally you want locks around a
| transaction-like operation that might involve multiple operations
| on a data structure (or multiple data structures). When just
| using concurrency safe containers on their own, subtle
| consistency bugs can be introduced and you end up writing a
| second layer of locks around the higher level code anyways, now
| paying the performance overhead of locking twice, once for your
| logic and then for the container.
| synergy20 wrote:
| I agree, while 'lock data not code' sounds good(that's what
| Rust does I think), container is a different story, it's by
| design mutable and unfit to lock as a data structure, container
| is not a pure data structure, it's a ADT that really works with
| data structure and its methods(algorithms) hands in hands, hard
| to split by any lock.
| maxpert wrote:
| Will wait for a Generics version. This is version is so not worth
| it!
| zerr wrote:
| Off topic: is there still no alternative for putting random URLs
| directly in a source code in order to use some libs?
| maerF0x0 wrote:
| what do you mean by "random urls"?
| stackedinserter wrote:
| Having `import "github.com/liyue201/gostl/ds/array"` in
| codebase looks weird and unsafe. Like, who's liyue201 and
| what exactly I'm importing?
|
| Vendoring helps a bit, but it's still ugly.
|
| Same problems exist in other languages, although "import
| numpy as np" doesn't explicitly say that you're importing a
| random head from someone's master.
| mseepgood wrote:
| > explicitly say that you're importing a random head from
| someone's master
|
| You don't import HEAD from master. Do you even understand
| how go.mod, go.sum and the checksum database work?
| lokar wrote:
| You could setup your build to vendor. The import would be
| the same.
| jen20 wrote:
| Frankly if it imparts that kind of concern, that is the
| dependency system working properly.
|
| 'require "leftpad"' may look safer, but is in fact not, and
| unless you happen to know what is in the standard library
| of Node.js off the top of your head, provides not so much
| as a clue that the library is even third party.
| qbasic_forever wrote:
| On the contrary, I absolutely want to quickly and easily
| know if my corporate codebase is depending on
| github/<random user>.
|
| Golang has a pretty solid standard library and process to
| extend it. The 'x' packages are all reasonably vetted and
| close to official libraries:
| https://pkg.go.dev/golang.org/x You could easily make a
| decree in a codebase that you won't depend on random github
| code and only use stuff from trusted sources like standard
| library, x, etc.
| tikkabhuna wrote:
| Great thing about the import notation is it points you in
| the direction to look. Where's the numpy code base? Your
| go.mod should specify the version of the library you're
| using.
| cuteboy19 wrote:
| It's not much different from import
| com.mysql.cj.jdbc
| exdsq wrote:
| I've never thought about that but yeah you're right,
| that's cool. I'm not a fan of Go's imports either but
| this makes it make a little more sense to me.
| DelightOne wrote:
| What namespacing do you envision? On the one side you don't
| like short names that are installed locally and on the
| other you seem to not like the full url noting where to
| retrieve the library.
| politician wrote:
| They are confused about import notation.
| bsdnoob wrote:
| You can use replace directive in go mod then you'd be able to
| load library from local path
| evanmoran wrote:
| This is how imports work in go. You make it look like a url,
| but then can choose in the mod (the module) file if it is local
| or to keep the url. This makes it quite clean because files
| have the same import url even as relative paths change, so you
| can move files all around and the import statements can stay
| the same. It also has the added advantage that there is no
| single module repository like in npm. So any url / GitHub repo
| works so no module can permanently own a name.
| tsimionescu wrote:
| > This makes it quite clean because files have the same
| import url even as relative paths change, so you can move
| files all around and the import statements can stay the same.
|
| That only works if no one is using your module as a
| dependency.
|
| > So any url / GitHub repo works so no module can permanently
| own a name.
|
| Only if go mod understands the source control mechanism, and
| only if the HTTP server in front of that speaks the survival
| go mod protocol. It's a horrible system that the go team
| doesn't use internally.
| mseepgood wrote:
| It's not random. It's what you as a developer wanted as a
| dependency, and it's cryptographically pinned to the exact
| version you wanted.
| jchw wrote:
| I believe we're only one month away from a stable Go release with
| generics. This might become a fair bit more interesting at that
| point.
| DontGiveTwoFlux wrote:
| I hope that iterations can get some language level treatment via
| a mechanism to hook into the 'range' operator on custom
| containers.
| throwaway894345 wrote:
| I care less about hooking into range and more about the
| performance. I would really like for iterators to be zero-cost
| abstractions, but right now a SliceIter is waaaayyy slower than
| looping over a slice (with range or otherwise). The Go compiler
| isn't smart enough to realize that a loop over a SliceIter can
| be reduced to a loop over a slice, perhaps because it can't
| know for certain that the loop counter variable (e.g. the
| `counter` field in `type SliceIter[T any] struct { slice []T;
| counter int }`) is only being accessed by the loop itself (this
| might be where Rust's ownership semantics would be useful in
| Go?).
| iamgopal wrote:
| What are the go libraries such as this that can benefit from
| genetics ?
| JediPig wrote:
| its not even close. its a wrapper around something the language
| doesn't support. trying to make it support something like
| iterators and ranges, plus its not generic is just asking for
| code bloat and spaghetti. we all seen libraries like this before,
| they fail cause the language doesn't have support for it.
|
| Until it does, keep the source in line with current best
| practices.
| jerf wrote:
| There's a lot of good work put into this.
|
| If it is intended as a tool to help port C++ code out of C++ into
| Go, it looks very useful.
|
| If it is intended as a tool to help Go programmers, it has made a
| common, but regrettably very serious mistake, that programmers
| make when porting code: It has precisely copied the original API,
| despite the fact the original API is quite unidiomatic in the new
| language.
|
| I don't want a package of all this stuff with the original _API_
| of C++; I want the same _capabilities_ , but for them to be
| idiomatic in the new language. It is quite common that the
| resulting port will both be missing some things no longer
| necessary in the new language, but may also require things that
| weren't necessary in the host language.
|
| As a nice little bite-sized example, consider some of the vector
| methods. PushBack takes a single element. It really ought to take
| a slice, probably via the "..." notation to make it a variable-
| argument function. That's idiomatic in Go. It is also idiomatic
| to offer that functionality and not worry _too_ much about the
| fact it will be ever so slightly slower than not taking a slice.
| Go is generally fast, but if you care _that_ much about speed to
| the n 'th degree, you picked the wrong language. "Reserve" is in
| the original C++ API because C++ is not a garbage collected
| language. Go has the ability to specify sizes up front with
| slices, and that's generally enough. ShrinkToFit is generally
| again for a non-GC'd language... it's not useless in Go but
| again, if you're really paranoid about that being available you
| probably picked the wrong language.
|
| Examples of this sort are pervasively shot throughout this API,
| from what a scan of it shows me.
|
| A non-trivial amount of the complexity of the STL interface is
| related to manual memory management. In Go it results in APIs
| where even if you can't quite point at what the problem is,
| they're just generally overcomplicated in a GC language.
|
| I expect something much simpler to implement, much simpler to
| use, and probably also faster to execute could be written if one
| wrote a list of the desirable capabilities down, and then ignored
| the C++ API and implemented the capabilities directly in
| idiomatic Go. Probably after crossing off a couple of things in
| the capabilities related to C++ being manually-memory-managed,
| and perhaps after adding some capabilities related to cheap
| threads being available, e.g., a parallel map operation that
| works across these data structures or something. Idiomatic, not
| just blindly copied.
|
| (I comment on this because I do see this a _lot_. Since these
| sorts of libraries rarely take off it can be a fatal mistake made
| on a lot of effort.)
| turminal wrote:
| > It has precisely copied the original API, despite the fact
| the original API is quite unidiomatic in the new language.
|
| STL isn't even idiomatic in C++ ;)
| cyber_kinetist wrote:
| I really wish someone takes the work of ditching the STL as a
| whole and make an unofficial "version 2" of the same standard
| library. I mean, IO sucks ass, std::string really sucks,
| containers like unordered_map are really slow (because of the
| pointer stability requirements), custom allocators are
| cumbersome, the regex library is a mess, why are string_view
| and span different things, yada yada. (And darn those long
| compile times and horrendous debug-mode performance!)
|
| Basically, something like a more updated version of EASTL
| (https://github.com/electronicarts/EASTL) might be really
| useful.
| pjmlp wrote:
| It was good enough for ATLAS experiments, but what does
| CERN understand about HPC anyway.
| scheme271 wrote:
| CERN and HEP (high energy physics) in general care more
| about HTC (high throughput computing). In general, event
| processing is pleasantly parallel since each event can be
| processed independently of others. A lot of the computing
| is really similar to a world wide distributed map reduce
| cluster. This significantly reduces a lot of the
| complexity that comes up when working with a workflow
| that might have thousands of threads interacting with
| each other.
| pjmlp wrote:
| So much words and nothing about iostreams or STL.
|
| I know how ATLAS HLT TDAQ works, you will find my name on
| 2003 - 2004 papers.
| einpoklum wrote:
| EASTL still adheres to the same API as the proper standard
| library, doesn't it?
|
| Anyway, API-wise, you at least have ranges in C++20, which
| affords you some syntactic convenience in the ability to
| pass ranges rather than pairs of iterators. But I
| definitely agree that both form-wise and impl-design-wise,
| a change would be welcome.
| Someone wrote:
| > IO sucks ass, std::string really sucks, containers like
| unordered_map are really slow (because of the pointer
| stability requirements), custom allocators are cumbersome,
| the regex library is a mess, why are string_view and span
| different things
|
| Nitpick: the STL (https://en.wikipedia.org/wiki/Standard_Te
| mplate_Library#Cont...) contains neither of these.
|
| > and make an unofficial "version 2" of the same standard
| library.
|
| Part of the C++ standard library in some sense _is_ version
| 2 of the STL.
| pjmlp wrote:
| "This Videogame Programmer Used the STL and You Will Never
| Guess What Happened Next" - CppCon 2019
|
| https://www.youtube.com/watch?v=6hC9IxqdDDw
|
| Ergo, not everyone agrees with that opinion.
| _wldu wrote:
| unordered_map is hash based. It's as fast as any other hash
| backed data structure. Hash performance is well defined and
| widely understood.
| lpapez wrote:
| There are dozens of hash map implementations for C++, and
| while they all have O(1) asymptotic complexity, the
| runtime performance can be vastly different depending on
| the implementation choices. The design space of a hash
| based data structure is immense, but std::unordered_map
| is generally always avoided unless your main goal is
| either ease-of-use or beginner-friendliness or you simply
| don't care about performance. But if any of these is true
| for your context, then C++ is probably not the right tool
| for the job anyway.
| pjmlp wrote:
| In many cases C++ is the only tool for the job, unless
| one wants to have fun doing their own toolchain on top of
| the platform SDKs, or drop down to C, even worse.
| tgv wrote:
| This example func isEven(iter
| iterator.ConstIterator) bool { return
| iter.Value().(int)%2 == 0 } ...
| algorithm.CountIf(a.Begin(), a.End(), isEven)
|
| is certainly off-putting. It starts out with a "deque", which
| is basically []any (in modern Go), needs Begin and End
| pointers, and then even doesn't pass the actual value, but the
| iterator (which can be modified in the call). I'd rather see
| something more idiomatic like this
| CountIf(len(a), func(i int) bool {return a[i] % 2 == 0 })
|
| As you say, it might be helpful if you're porting something
| that relies heavily on STL, but only if you don't understand
| the code, and are willing to take the risk of subtle
| differences between the C++ and Go implementations.
| layer8 wrote:
| > As a nice little bite-sized example, consider some of the
| vector methods. PushBack takes a single element.
|
| Also, push_back has always been an awkward name choice. Porting
| would be a good opportunity to rename it to _append_.
| tialaramex wrote:
| The whole _point_ of the Standard Template Library was
| generics. That 's the good idea behind the STL, and as I
| understand it this doesn't support generics, so it's missing
| this crucial concept.
|
| But yes I generally agree with your point. The X language
| version of a Y language library ought to work the way somebody
| who has never used the Y language but is familiar with X
| expects it to work.
|
| That's across the board. Go and C++ agree that a "string" might
| be just any bunch of bytes, who knows how they're encoded,
| that's somebody else's problem so APIs which do care need to be
| very explicit, maybe even mentioning it in method names. In
| Python that's very strange, the strings are always Unicode and
| so a function name_in_cp1252() is awkward to use, probably just
| offer name() and cope with the encoding.
|
| There are some subtle but important affordances which a non-
| native user may not even consider adding because their
| preferred language just omits the necessary feature. For
| example Rust's From/ Into/ TryFrom/ TryInto cascade or C++ 20
| spaceship operator are things native speakers of those
| languages expect in a good library but aren't going to fall out
| naturally from converting a Python API.
| synergy20 wrote:
| Indeed. Might be easier after go1.18 with generics is out to
| create a STL for Go. STL could be so useful for Go if done
| right.
___________________________________________________________________
(page generated 2022-01-30 23:01 UTC)