[HN Gopher] Netflix's Key-Value Data Abstraction Layer
       ___________________________________________________________________
        
       Netflix's Key-Value Data Abstraction Layer
        
       Author : gslin
       Score  : 69 points
       Date   : 2024-09-19 01:13 UTC (21 hours ago)
        
 (HTM) web link (netflixtechblog.com)
 (TXT) w3m dump (netflixtechblog.com)
        
       | snicker7 wrote:
       | This API is very similar to DynamoDB, which is basically a hash
       | table of B-trees.
       | 
       | My experience is that this architecture can lead to very chatty
       | applications if you have a rich data model (e.g. a graph).
        
         | jolynch wrote:
         | (post author)
         | 
         | It is indeed similar to DynamoDB as well as the original
         | Cassandra Thrift API! This is intentional since those are both
         | targeted backends and we need to be able to migrate customers
         | between Cassandra Thrift, Cassandra CQL and DynamoDB. One of
         | the most important things we use this abstraction for is
         | seamless migration [1] as use cases and offerings evolve.
         | Rather than think of KeyValue as the only database you ever
         | need, think of it like your language's Map interface, and
         | depending on the problem you are solving you need different
         | implementations of that interface (different backing
         | databases).
         | 
         | Graphs are indeed a challenge (and Relational is completely out
         | of scope), but the high-scale Netflix graph abstraction is
         | actually built atop KV just like a Graph library might be built
         | on top of a language's built-in Map type.
         | 
         | [1] https://www.youtube.com/watch?v=3bjnm1SXLlo
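
The "Map interface with different backing implementations" analogy above, and
the idea of a graph layered on KV, can be sketched roughly as follows. This is
only an illustration under stated assumptions: the names KeyValue,
InMemoryKeyValue, Graph, put, get and scan are hypothetical and are not
Netflix's actual API.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class KeyValue(Protocol):
    """Hypothetical two-level map: key -> sorted (item_key -> value)."""

    def put(self, key: str, item_key: bytes, value: bytes) -> None: ...
    def get(self, key: str, item_key: bytes) -> Optional[bytes]: ...
    def scan(self, key: str) -> List[Tuple[bytes, bytes]]: ...


class InMemoryKeyValue:
    """One possible backend. A class backed by Cassandra or DynamoDB
    could expose the same three methods (assumption, not shown here)."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[bytes, bytes]] = {}

    def put(self, key: str, item_key: bytes, value: bytes) -> None:
        self._data.setdefault(key, {})[item_key] = value

    def get(self, key: str, item_key: bytes) -> Optional[bytes]:
        return self._data.get(key, {}).get(item_key)

    def scan(self, key: str) -> List[Tuple[bytes, bytes]]:
        # Return the items of one key ordered by item key.
        return sorted(self._data.get(key, {}).items())


class Graph:
    """Toy adjacency-list graph layered on any KeyValue implementation."""

    def __init__(self, store: KeyValue) -> None:
        self._store = store

    def add_edge(self, src: str, dst: str, label: bytes = b"") -> None:
        # Each vertex is a key; each outgoing edge is one item under it.
        self._store.put(src, dst.encode(), label)

    def neighbors(self, src: str) -> List[str]:
        return [item_key.decode() for item_key, _ in self._store.scan(src)]


graph = Graph(InMemoryKeyValue())
graph.add_edge("alice", "carol")
graph.add_edge("alice", "bob")
print(graph.neighbors("alice"))  # ['bob', 'carol']
```

In this sketch every hop in a traversal costs one scan call, which is one way
the chattiness mentioned at the top of the thread can show up.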
        
       | jerf wrote:
       | For anyone looking for a TL;DR, I'd suggest starting at
       | https://netflixtechblog.com/introducing-netflixs-key-value-d... ,
       | which HN is truncating so you can't see it, but I've directly
       | linked to a later section in the post with a #. Up to that point
       | it's basically "a networked HashMap<String, SortedMap<Bytes,
       | Bytes>>". But the ability to return partial results based on a
       | timeout with a pagination token is somewhat unusual and the next
       | section called "Signaling" is at least worth a look.
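
A minimal sketch of the idea described above: a two-level sorted map whose
reads return a partial page plus a pagination token once a time budget runs
out. Everything here is an assumption made for illustration: PagedStore,
get_items and budget_seconds are hypothetical names, the store is in-process
rather than networked, and the token is simply the last item key returned
(a real service would likely use an opaque token).

```python
import itertools
import time
from bisect import bisect_right, insort
from typing import Callable, Dict, List, Optional, Tuple

Item = Tuple[bytes, bytes]


class PagedStore:
    """Stand-in for HashMap<String, SortedMap<Bytes, Bytes>>."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._rows: Dict[str, List[Item]] = {}
        self._clock = clock  # injectable so the demo is deterministic

    def put(self, key: str, item_key: bytes, value: bytes) -> None:
        row = self._rows.setdefault(key, [])
        # Overwrite semantics: drop any existing entry for this item key.
        row[:] = [item for item in row if item[0] != item_key]
        insort(row, (item_key, value))

    def get_items(
        self,
        key: str,
        page_token: Optional[bytes] = None,
        budget_seconds: float = 0.05,
    ) -> Tuple[List[Item], Optional[bytes]]:
        deadline = self._clock() + budget_seconds
        row = self._rows.get(key, [])
        item_keys = [item_key for item_key, _ in row]
        # Resume strictly after the last item key the caller has seen.
        start = 0 if page_token is None else bisect_right(item_keys, page_token)
        page: List[Item] = []
        for index in range(start, len(row)):
            # Always return at least one item so callers make progress.
            if page and self._clock() >= deadline:
                return page, page[-1][0]  # partial page plus resume token
            page.append(row[index])
        return page, None  # None means the scan is complete


# Fake clock that advances one "second" per call, standing in for slow reads.
ticks = itertools.count(1)
store = PagedStore(clock=lambda: float(next(ticks)))
for name in (b"a", b"b", b"c", b"d", b"e"):
    store.put("user:1", name, b"value")

pages = []
token = None
while True:
    page, token = store.get_items("user:1", token, budget_seconds=2.5)
    pages.append([item_key for item_key, _ in page])
    if token is None:
        break
print(pages)  # [[b'a', b'b', b'c'], [b'd', b'e']]
```

The caller keeps passing the token back until it comes back as None, so a
slow read degrades into more round trips instead of a timeout error.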
        
       | ericmcer wrote:
       | Can anyone explain why Netflix is considered to have such
       | high-tier engineering? Just from a super high-level view they store
       | and serve ~5000 videos saved at a few different qualities (4?), so
       | let's say a total of 20,000 videos. Those files only change when
       | specific privileged users update them.
       | 
       | Compare that with YouTube where ~5,000 videos are uploaded,
       | processed into different formats/qualities every minute, and can
       | be added by anyone with an email. It seems like Netflix has a
       | fairly trivial problem when compared with video sharing or
       | content sharing sites.
        
         | NBJack wrote:
         | Hype for the engineering culture? Helps attract the right
         | talent. It is a relatively small team that is...ah, _heavily
         | motivated_ to come up with good solutions around the clock. And
         | they maintain an excellent tech blog.
         | 
         | Don't get me wrong; serving the level of traffic they handle
         | isn't easy to scale or do cost-effectively around the globe.
         | They are also considered by some to be pioneers in chaos
         | engineering, and made headlines years ago by running a competition
         | to find the "best" suggestion algorithm.
        
         | jolynch wrote:
         | My experience has been that the talent density is the main
         | difference. Netflix tackles huge problems with a small number
         | of engineers. I think one angle of complexity you may be
         | missing is efficiency - both in engineering cost and
         | infrastructure cost.
         | 
         | Also YouTube has _excellent_ engineering (e.g. Vitess in the
         | data space), and they are building atop an excellent
         | infrastructure (e.g. Borg and the godly Google network). It's
         | worth noting though that the whole Netflix infrastructure team
         | is probably smaller than a small to medium satellite org at
         | Google.
        
         | loire280 wrote:
         | You're probably right, but Netflix does a good job building
         | their engineering brand by writing up and sharing their
         | technical work publicly.
        
         | ianbutler wrote:
         | Netflix still has to serve 20k videos to 300 million people.
         | That's about 750 million hours of streamed content. Serving
         | that content is challenging.
         | 
         | Then they have their ad network on top of it. Then they have
         | their analytics apparatus. Then they probably have a whole
         | suite of tools for content producers. Then they probably have a
         | bunch of janky tools for things that didn't exist as products
         | 15 years ago.
         | 
         | Seems reasonable to me if you put in a little more thought
         | about the problem and scale.
        
         | thecosmicfrog wrote:
         | As soon as a streaming service starts having availability
         | issues, it will garner a reputation very quickly and lose
         | customers just as quickly. Being able to serve N amount of
         | content _reliably_ and consistently (even if less than M
         | amount) is still a strong demonstration of good engineering
         | practice in my opinion.
         | 
         | On that point, I can't honestly recall a time I had Netflix
         | streaming issues that weren't because of a problem on my side.
         | Maybe I've just been lucky though, so ymmv.
        
       ___________________________________________________________________
       (page generated 2024-09-19 23:01 UTC)