[HN Gopher] Faster: Fast persistent recoverable log and key-valu...
       ___________________________________________________________________
        
       Faster: Fast persistent recoverable log and key-value store
        
       Author : LAC-Tech
       Score  : 77 points
       Date   : 2024-02-25 01:27 UTC (21 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | withinboredom wrote:
       | FASTER is awesome! I just wish the C version was as feature
       | complete as the C# version. I know how to use FFI to call C
       | stuff, but I have no idea how to call C# stuff. It's the only
       | reason I haven't used it in real life.
        
         | neonsunset wrote:
         | You could, in theory, export certain methods with
         | [UnmanagedCallersOnly] and AOT* compile it - those become plain
         | C exports. Alternatively, you can host .NET runtime within C++
         | process and call arbitrary methods from the loaded assemblies.
         | Or you could not deal with all that and just use C# :) (which
         | comes with an advantage of not using something worse)
         | 
         | * - I don't know if it uses anything requiring JIT, but likely
         | not, or only in certain features.
        
           | withinboredom wrote:
           | I've specifically gone hunting for this documentation and
           | never found it. Thanks for the tip!
           | 
           | I love C#, but I've been in PHP/Go land for a while now. PHP
           | is an interesting language, I'd have never thought I'd like
           | it as much as I do.
           | 
           | It'd also be cool to see a Go implementation, but there's
           | also cgo.
        
             | neonsunset wrote:
             | :(
             | 
             | But in case you go the first route, the docs are here:
             | 
             | - https://learn.microsoft.com/en-us/dotnet/core/deploying/nati...
             |   +adjacent sections
             | 
             | - https://learn.microsoft.com/en-us/dotnet/api/system.runtime....
             | 
             | This pretty much comes down to writing glue exports the way
             | you would do in Rust and then 'dotnet publish'ing it as
             | .dll/.so (you could also produce .lib/.a for static linking
             | but it is trickier).
             | 
             | Overall, I feel like the ungodly amount of projects of this
             | kind (albeit of lower complexity) written in Go (which is a
             | weaker platform) could have benefitted from using C#
             | instead - it has zero-cost abstractions through struct
             | generics like Rust and allows expressing complex data
             | structures in a terser way.
        
               | withinboredom wrote:
               | I think a lot of what go has going for it is in static
               | compilation. Things just work. The C# community leans a
               | lot further from open source (paid libraries vs. free).
        
               | neonsunset wrote:
               | Go receives so much unjustified good will it is almost
               | unbelievable...
               | 
               | Either way, if you do look at current state of .NET
               | ecosystem, you may get surprised. But I guess, such is
               | the perception of the public that may have read a bit too
               | much into Go's promises (it would have abysmal
               | performance should FASTER have been written in Go) - C#,
               | seen as less popular weird Java, is now an underdog after
               | all (look at Github LOC statistics).
        
               | withinboredom wrote:
               | > Go receives so much unjustified good will it is almost
               | unbelievable...
               | 
               | Yes. Yes it does and it really is annoying. I can't tell
               | if it is the language or just people not deeply
               | understanding the abstractions it provides because it
               | doesn't use sane defaults (or the defaults are geared
               | towards high-throughput, google scale nonsense).
               | 
               | I help out a lot on FrankenPHP, and lately I've been
               | digging into a weird bug, deep in Go, that causes FIN_ACK
               | packets to get delayed by hundreds and hundreds of ms.
               | There are so many layers of abstractions (nearly 100) to
               | dig through. I know for a fact that in C# there are fewer
               | than 10 down to the CLR; then, once you can rule out the
               | language, you are just tracing IR. But no, I'm digging
               | through hand-written asm, hunting for a bug or at least
               | trying to figure out how to report it with enough
               | information that it doesn't sound like "my email can't be
               | sent more than 500 miles" (google that one btw).
        
               | apgwoz wrote:
               | > I've been digging into a weird bug, deep in Go, that
               | causes FIN_ACK packets to get delayed by hundreds and
               | hundreds of ms.
               | 
               | Just going to put it out there: have you considered the
               | problem might actually be
               | [Nagle's Algorithm](https://en.wikipedia.org/wiki/Nagle's_algorithm)?
               | This algorithm is the source of tens of thousands of hours of wasted
               | debugging time of "weird bugs related to latency of
               | networking." Even the "greats" have wasted time debugging
               | something that ended up being "I forgot about Nagle."
        
               | withinboredom wrote:
               | Yes. Go doesn't even offer the ability to use Nagle's
               | algorithm:
               | 
               | https://withinboredom.info/2022/12/29/golang-is-evil-on-shit...
               | 
               | Nagle's algorithm is dealing with connections. These are
               | FIN_ACK. Meaning the connection was told to close and we
               | need to ack the closed connection.
        
               | kjksf wrote:
               | Of course it does: just call TCPConn.SetNoDelay(false).
               | 
               | See https://github.com/golang/go/blob/master/src/net/tcpsock.go#...
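
For readers unfamiliar with the socket option behind this exchange: Go's TCPConn.SetNoDelay toggles TCP_NODELAY, and Go turns it on by default (so Nagle's algorithm is off unless you call SetNoDelay(false)). A minimal sketch of the same toggle in Python follows; the helper names set_nagle and nagle_enabled are made up for illustration, and this says nothing about whether Nagle is related to the FIN_ACK delay discussed above.

```python
import socket


def set_nagle(sock: socket.socket, enabled: bool) -> None:
    """Enable or disable Nagle's algorithm on a TCP socket.

    TCP_NODELAY=1 disables Nagle (small writes go out immediately);
    TCP_NODELAY=0 re-enables it (small writes may be coalesced).
    """
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 0 if enabled else 1)


def nagle_enabled(sock: socket.socket) -> bool:
    """Report whether Nagle's algorithm is currently active on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0


if __name__ == "__main__":
    # No connection is needed just to set and read back the option.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        set_nagle(s, False)  # roughly what Go does by default
        print("Nagle enabled:", nagle_enabled(s))
        set_nagle(s, True)   # roughly TCPConn.SetNoDelay(false)
        print("Nagle enabled:", nagle_enabled(s))
```

As the reply below points out, the hard part in a layered server is usually getting hold of the underlying socket at all, not the call itself.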
        
               | withinboredom wrote:
               | You have to get access to the TCP connection first ...
               | which, once you get more than a few layers above that, it
               | is impossible to get to.
        
               | jddj wrote:
               | We've arrived at a weird spot.
               | 
               | There's a lot of very good software written in Go now.
               | Much of that software could benefit (performance wise, at
               | least, leaving ergonomics aside due to subjectivity) from
               | being written in C# instead.
               | 
               | But does Go deserve some credit for being associated with
               | those projects, or was it marketing and the Google
               | effect?
               | 
               | .net AOT is still quite young, so maybe the tide could
               | turn somewhat.
        
               | ReflectedImage wrote:
               | I think language simplicity is a blind spot for most
               | developers. Programming languages shouldn't be judged
               | just based on their performance (otherwise assembly
               | language would be #1) but also how simple they are. The
               | simpler the better.
               | 
               | Go is simpler than C# and that's giving it the advantage
               | over C#.
        
               | neonsunset wrote:
               | That's what makes it worse. Go was designed, in a way, as
               | a toy language to solve the assault on the codebase
               | quality by all the fresh graduates Google was hiring. I'm
               | not joking - this is paraphrasing Rob Pike. Go is a
               | language that wastes my time.
        
               | withinboredom wrote:
               | There's also cgo. Say what you will, but an easy-to-use
               | way to reach into highly performant libraries and
               | existing code was smart from a language design
               | perspective. But yeah, other than that, I agree with you.
        
               | ReflectedImage wrote:
               | Software developers tend to be far more productive and
               | write less bug-filled code in "toy" languages. Programming
               | language complexity is usually just plain bad for
               | developers at any skill level.
        
               | neonsunset wrote:
               | More sophisticated design allows you to richly represent
               | a certain problem and offer an idiomatic way of solving it
               | rather than having you write an extra 200 LOC of boilerplate
               | like open-coded loops and if err != nils. Not even
               | mentioning a dozen DSLs people keep inventing in Go
               | ecosystem - something that is usually a sign of language
               | weakness, similar to Ruby.
        
               | ReflectedImage wrote:
               | Generally speaking, more sophisticated designs perform
               | worse both on time to implement and on code correctness.
               | 
               | That's just the way it usually shakes out.
               | 
               | DSLs are known in academic literature as 4th generation
               | languages (whereas Go/C# would only be 3rd generation
               | languages); they are a really good thing as long as you
               | aren't the person implementing them.
        
               | zer00eyz wrote:
               | There are good reasons to give Go good will:
               | 
               | Simplicity: Editor, GitHub, golang download. I have a
               | working dev env for a LOT of use cases. Sudo apt install
               | vim git wget tar. wget golang, untar, set env var. I have
               | a script to do it for me on Debian boxes. Python, ruby,
               | php are pretty close. I'm guessing C# is a bit more
               | complicated but not by much.
               | 
               | Dependency and library management. Go wins vs
               | python(venvs), ruby, node, php... go mod and how it deals
               | with pinning, pulling from GitHub. Again, I don't know what
               | C# has here but go feels both magical, and easy to
               | understand on this front.
               | 
               | go build / go run. The fact that it is this easy and fast to
               | get to a running binary is impressive. I had a badly
               | behaving container the other day and the residents of it
               | were not giving back helpful errors. One go program (sub
               | 100 lines) later I was getting usable error messages and
               | quickly worked through the network issues. There are
               | plenty of go apps that work like this! Mediamtx is great
               | (RPI cam server) just grab the binary blob and go... The
               | same thing in python is gonna be a lot more complicated.
               | 
               | Testing: A friend of mine and I recently started a
               | project together and his commentary after coming from
               | working on large ruby and node projects is "how is this
               | testing this fast". Go is eating its own dogfood here
               | with its concurrency model. I'm guessing that C# can run
               | the same way.
               | 
               | Golang's good will isn't because it is the best at
               | something. If we're comparing features, go is gonna come in
               | 2nd or 3rd or 4th every time. That's the thing, go is
               | consistently very good at feature "X". It's not the best
               | but it remains in the top 5. Speed, concurrency, compile
               | time, ease of setup, ease of deployment, portability,
               | scaling...
               | 
               | Golang is the Toyota of programming languages. Lovable in
               | its reliability.
        
               | withinboredom wrote:
               | Why do you bother commenting if you don't have any
               | experience in the other language?
               | 
               | With C#, you literally just `apt install` the cli and it
               | keeps up to date. I write go every weekend, and spent
               | years in C#. I still don't know how to properly install
               | go and I have to run it in docker containers. It's so
               | undocumented (or rather the documentation was lacking at
               | least a few years ago and I've not bothered checking
               | because my workflow works for me), that I got lost as a
               | new go dev.
               | 
               | As far as building and running ... C# is on par with Go.
               | There's really not much difference, at least from the
               | cli.
               | 
               | Editors ... I dunno. I pay for a visual studio license
               | and the IDE is simply magical (esp with resharper). I use
               | goland for go, and these two IDEs are barely comparable
               | in some respects.
               | 
               | The testing story in C# leaves something to be desired. I
               | would rather build a random project to test my code than
               | write tests in C#. It's a mess over there.
               | 
               | But yeah, if I were to start a project completely from
               | scratch, I'd choose PHP over either of them, so maybe I
               | don't know what I'm talking about.
               | 
               | I'll see myself out.
        
               | zer00eyz wrote:
               | > It's so undocumented (or rather the documentation
               | was lacking at least a few years ago and I've not
               | bothered checking because my workflow works for me)
               | 
               | Massive improvements from the days of early go. Between
               | ease of install (Mac, Linux, haven't looked at windows in
               | 2 ish years) and how it deals with packaging (go mod)
               | there has been a ton of progress here.
               | 
               | I have not used C# in at least a decade. My knowledge is
               | dated! (python, ruby, node, c, and rust are all much more
               | recent). It wasn't a bad language at the time but was very
               | MS centric. And installing it on linux is a bit more
               | complicated than "apt install" sadly.
               | 
               | As someone who wrote PHP for years (and it paid me well)
               | I would say that you should take a look at installing go
               | "from scratch" on your local system (
               | https://go.dev/doc/install ) and building a few throw
               | away tools!
        
               | ryanjshaw wrote:
               | If you haven't used C# in a decade you missed that .NET
               | Framework has been completely rewritten as open source
               | .NET. To install, it's literally just:
               | 
               | apt-get install dotnet-sdk-8.0
        
               | neonsunset wrote:
               | and then getting helloworld to run is literally
               | 
               | dotnet new console -o MyConsole && cd MyConsole && dotnet run
               | 
               | (dotnet run uses debug build by default, it's very
               | similar to cargo)
               | 
               | as for package management, it is
               | 
               | dotnet add package {name}
               | 
               | by far one of the best ones, and when you need to
               | manually edit .csproj, it's as easy as cargo.toml.
        
         | LAC-Tech wrote:
         | Which one have you used, the kv store or log?
         | 
         | The reason I submitted this is that I'm curious about people's
         | real-world experience.
        
           | withinboredom wrote:
           | I was looking to use both, actually. I discovered FASTER when
           | looking to port durable functions to php (it's called
           | durable-php if you want to google it, though the
           | implementation is nothing like it) and the netherite engine
           | uses faster.
           | 
           | It's perfect for my use-case, more so now than when I
           | originally researched it as a possibility. Back then, I
           | didn't even have threading solved for php. Now that's all a
           | solved problem (threads ftw) and I'm refactoring log storage
           | now to better support things like faster.
        
             | adamretter wrote:
             | Have you considered Meta's RocksDB as an option?
        
               | withinboredom wrote:
               | I wrote a time-traveling database (where you can query a
               | table/row as of a specific point in time and join it to
               | data at another point in time; we used this for AI
               | training to predict future behavior in users) completely
               | from scratch (that was the coolest work project ever,
               | btw) that was built on Hadoop/Hbase. I understand RocksDB
               | is fairly similar ... however, I want to stay as far away
               | from any of those kinds of APIs. I have scars from
               | dealing with hbase and writing query planners and
               | figuring out how to do performant joins in a white-room
               | type environment. No. Thank. You.
               | 
               | It was fun at the time, but I don't want to go near it
               | ever again.
        
               | apgwoz wrote:
               | [RocksDB](https://rocksdb.org/) isn't a distributed
               | storage system, fwiw. It's an embedded KV engine similar
               | to LevelDB, LMDB, or really sqlite (though that's full
               | SQL, not just KV)
        
               | withinboredom wrote:
               | Yes, it's based on the same paper as hbase, IIRC.
        
               | dgacmu wrote:
               | To be perhaps overly detailed: Hbase is an open source
               | approximation of bigtable. Bigtable _uses_ leveldb as its
               | per-shard local storage mechanism; Rocks is a
               | clone+extension of leveldb.
               | 
               | Bigtable and hbase are higher level and provide
               | functionality across shards and machines. Level and rocks
               | are building blocks that provide a log-structured merge
               | tree storage and retrieval mechanism.
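
To make "log-structured merge tree" concrete, here is a toy sketch of the idea: writes land in an in-memory table, full tables are flushed as immutable sorted runs, and reads check newest data first. The class TinyLSM and its methods are invented for illustration; this is not the API or on-disk format of LevelDB or RocksDB, and it omits the write-ahead log, deletes, and leveled compaction.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit: int = 2):
        self.memtable_limit = memtable_limit
        self.memtable = {}   # newest writes, unsorted
        self.runs = []       # list of (sorted_keys, values); newest run last

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Freeze the memtable into a sorted, immutable run.
        keys = sorted(self.memtable)
        self.runs.append((keys, [self.memtable[k] for k in keys]))
        self.memtable = {}

    def get(self, key):
        # Newest data wins: memtable first, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for keys, values in reversed(self.runs):
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None

    def compact(self):
        # Merge all runs into one, letting newer values overwrite older ones.
        merged = {}
        for keys, values in self.runs:  # oldest first
            merged.update(zip(keys, values))
        keys = sorted(merged)
        self.runs = [(keys, [merged[k] for k in keys])] if keys else []


if __name__ == "__main__":
    db = TinyLSM(memtable_limit=2)
    for k, v in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
        db.put(k, v)
    print(db.get("a"), len(db.runs))
    db.compact()
    print(db.get("a"), len(db.runs))
```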
        
               | withinboredom wrote:
               | > Bigtable _uses_ leveldb as its per-shard local storage
               | mechanism
               | 
               | Ah, that's probably what I'm conflating with it then.
               | 
               | Thanks for the information.
        
       | zokier wrote:
       | Is this actually used anywhere by MS? Or is it just random
       | research project (MSR?)
        
         | withinboredom wrote:
         | Durable Functions uses it in Netherite, which is how I
         | originally discovered this library.
        
         | LAC-Tech wrote:
         | I'm curious about that too; apparently it was created for
         | their SimpleStore research project.
         | 
         | https://www.microsoft.com/en-us/research/project/simplestore...
        
       | Retr0id wrote:
       | I think I know what they're getting at, but "can quickly saturate
       | disk bandwidth" doesn't exactly sound like a selling point, on
       | its own!
        
         | lomereiter wrote:
         | Reminds me of a Russian joke.
         | 
         | A secretary applies for a job; the interviewer asks her: "In
         | your CV you claim that you can type 1000 characters per minute
         | - for real?!" "Yes!", she replies, then adds in a low voice:
         | "but such nonsense comes out..."
        
         | ww520 wrote:
         | SSDs have very good bandwidth, in excess of 10 GB/s.
         | 
         | The bottleneck is often at the CPU. Even a hypothetical 10 GHz
         | CPU would have only 1 cycle to process each byte at that rate.
         | That's the scale we're at now.
         | 
         | https://www.tomshardware.com/features/ssd-benchmarks-hierarc...
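
The arithmetic behind the cycles-per-byte figure above, as a small sketch. The 10 GB/s bandwidth and the 10 GHz clock come from the comment; the 4 GHz and 3 GHz rows are assumed values added for comparison, and the model deliberately ignores multiple cores and SIMD.

```python
def cycles_per_byte(clock_hz: float, bandwidth_bytes_per_s: float) -> float:
    """CPU cycles a single core can spend on each byte at a given data rate."""
    return clock_hz / bandwidth_bytes_per_s


if __name__ == "__main__":
    ssd_bandwidth = 10e9  # 10 GB/s, the figure quoted in the comment
    for ghz in (10, 4, 3):
        budget = cycles_per_byte(ghz * 1e9, ssd_bandwidth)
        print(f"{ghz} GHz core: {budget:.2f} cycles per byte")
```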
        
         | klysm wrote:
         | How is that not a selling point? You want the disk to be the
         | bottleneck
        
           | cout wrote:
           | Not necessarily. For example, an uncompressed log will
           | saturate disk more easily than a compressed log but if
           | compression is fast enough the compressed log will write more
           | data in the same amount of time.
           | 
           | A more complex case: a column store might write in batches.
           | Later an insert in the middle might require the entire batch
           | to be read from disk and then rewritten. This makes queries
           | faster later on but at the cost of more disk io up front. In
           | this case disk bandwidth is also saturated but write
           | performance might be worse than an append-only log that does
           | not optimize at all for reads/queries.
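
A back-of-the-envelope model of the compression case above, as a hedged sketch: the useful (uncompressed) data rate is capped either by how fast the CPU can compress or by the disk bandwidth multiplied by the compression ratio. The function names and all numbers are illustrative assumptions, and zlib stands in for whatever compressor a real log would use.

```python
import zlib


def logical_throughput(disk_bw: float, ratio: float, compress_speed: float) -> float:
    """Uncompressed bytes/s that end up persisted.

    disk_bw:        bytes/s the disk can absorb
    ratio:          uncompressed size / compressed size (1.0 means no compression)
    compress_speed: uncompressed bytes/s the CPU can compress
    """
    return min(compress_speed, disk_bw * ratio)


def measured_ratio(payload: bytes, level: int = 1) -> float:
    """Compression ratio zlib achieves on a sample payload."""
    return len(payload) / len(zlib.compress(payload, level))


if __name__ == "__main__":
    sample = b"2024-02-25 INFO request served in 12ms\n" * 1000
    ratio = measured_ratio(sample)
    disk_bw = 1e9  # assumed 1 GB/s disk

    # The uncompressed log saturates the disk at exactly disk_bw.
    raw = logical_throughput(disk_bw, 1.0, float("inf"))
    # The compressed log may saturate the disk less easily yet persist more data.
    packed = logical_throughput(disk_bw, ratio, compress_speed=2e9)
    print(f"ratio={ratio:.1f} raw={raw:.2e} B/s compressed={packed:.2e} B/s")
```

In this model a saturated disk only tells you that the disk term is the binding one; it does not tell you the design is persisting the most useful data per second.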
        
       | avinassh wrote:
       | How does this compare with RocksDB?
       | 
       | Also, are there any performance benchmarks?
        
         | emills wrote:
         | There's a paper that does some benchmarking against other
         | options here:
         | https://www.microsoft.com/en-us/research/uploads/prod/2018/0...
        
         | welder wrote:
         | Looks like it's exponentially faster than RocksDB, but in most
         | use cases the bottleneck would be the network and available
         | sockets on the machine running FASTER. Unless you're doing high
         | throughput embedded data ingestion. Maybe Neuralink could use
         | this.
        
         | ww520 wrote:
         | RocksDB supports point queries and range queries. This only
         | supports point queries. Also, I'm not sure whether FASTER
         | supports transactions, as the paper didn't mention them.
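
The point-versus-range distinction can be sketched with two toy indexes: a hash map answers "give me key k" directly but has no key order to scan, while a sorted structure can also answer "give me every key between lo and hi". The classes below are invented for illustration and are not the API of FASTER or RocksDB.

```python
import bisect


class HashIndex:
    """Point lookups only: keys are unordered."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class SortedIndex:
    """Point lookups plus ordered range scans."""

    def __init__(self):
        self._keys = []   # kept sorted
        self._vals = {}

    def put(self, key, value):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def get(self, key):
        return self._vals.get(key)

    def range(self, lo, hi):
        """Return (key, value) pairs with lo <= key < hi, in key order."""
        start = bisect.bisect_left(self._keys, lo)
        end = bisect.bisect_left(self._keys, hi)
        return [(k, self._vals[k]) for k in self._keys[start:end]]


if __name__ == "__main__":
    idx = SortedIndex()
    for k in ("user:3", "user:1", "zeta:1", "user:2"):
        idx.put(k, k.upper())
    print(idx.get("user:2"))
    print(idx.range("user:1", "user:3"))
```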
        
       | CyberDildonics wrote:
       | I wish people would push back against using a generic adjective
       | as a name. Naming something _"Faster"_ is just trying to get
       | someone to remember by being intentionally confusing.
        
         | hacknews20 wrote:
         | What! Nonsense.
        
           | CyberDildonics wrote:
           | Well you do make a compelling argument.
        
       | bravura wrote:
       | Are there Python bindings? I couldn't find any.
        
       | DenisM wrote:
       | How does it compare to FoundationDB? Neither the paper nor the
       | GitHub page mentions it.
        
         | _bohm wrote:
         | It would be a bit of an apples-to-oranges comparison.
         | FoundationDB is a distributed KV database that supports range
         | queries. FASTER is an embedded KV store that only supports
         | point lookups. The use cases for each are rather different.
        
       | siliconc0w wrote:
       | I really like the idea of tiering a log device such that you go
       | from memory -> nvme -> object storage. You get some nice
       | properties like fast low latency commit times from NVMe, read
       | your own local writes (usually good enough), but have most of
       | your cold data on cheaper durable object storage with some
       | intelligence to pull down/warm up pages when you suspect you'll
       | need them.
       | 
       | It'd be nice if this had a more language-agnostic frontend like
       | gRPC or something.
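
A rough sketch of the memory -> NVMe -> object storage tiering described above, using plain dictionaries as stand-ins for each tier. Everything here (the TieredStore class, its method names, the LRU policy, the explicit offload step) is a hypothetical illustration rather than how FASTER or any particular system implements tiering.

```python
from collections import OrderedDict


class TieredStore:
    """Toy three-tier store: memory (LRU) -> local NVMe -> object storage."""

    def __init__(self, memory_capacity: int):
        self.memory_capacity = memory_capacity
        self.memory = OrderedDict()  # hot tier, least recently used first
        self.nvme = {}               # stand-in for fast local durable storage
        self.object_store = {}       # stand-in for cheap remote storage

    def _cache(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)
        while len(self.memory) > self.memory_capacity:
            self.memory.popitem(last=False)  # evict least recently used

    def write(self, key, value):
        self.nvme[key] = value   # the "commit" is the low-latency local write
        self._cache(key, value)  # so the writer can read its own write

    def offload_cold(self):
        """Move entries that are no longer hot from NVMe to object storage."""
        for key in [k for k in self.nvme if k not in self.memory]:
            self.object_store[key] = self.nvme.pop(key)

    def read(self, key):
        """Return (value, tier_that_served_it), warming upper tiers on a miss."""
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key], "memory"
        if key in self.nvme:
            value = self.nvme[key]
            self._cache(key, value)
            return value, "nvme"
        if key in self.object_store:
            value = self.object_store[key]
            self.nvme[key] = value   # pull the page back down
            self._cache(key, value)
            return value, "object"
        raise KeyError(key)


if __name__ == "__main__":
    store = TieredStore(memory_capacity=2)
    for k in ("a", "b", "c"):
        store.write(k, k * 3)
    store.offload_cold()
    for k in ("c", "a", "a"):
        print(k, store.read(k))
```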
        
       | dang wrote:
       | Related:
       | 
       |  _Faster A fast concurrent persistent key-value store and log, in
       | C# and C++_ - https://news.ycombinator.com/item?id=25741670 - Jan
       | 2021 (8 comments)
       | 
       |  _Faster - Fast key-value store from Microsoft Research_ -
       | https://news.ycombinator.com/item?id=17785002 - Aug 2018 (76
       | comments)
       | 
       |  _Faster - A key-value store for large state management_ -
       | https://news.ycombinator.com/item?id=17267403 - June 2018 (34
       | comments)
        
       ___________________________________________________________________
       (page generated 2024-02-25 23:01 UTC)