[HN Gopher] We built a Modern Data Stack from scratch and reduce...
       ___________________________________________________________________
        
       We built a Modern Data Stack from scratch and reduced our bill by
       70%
        
       Author : jchandra
       Score  : 29 points
       Date   : 2025-03-09 18:35 UTC (3 hours ago)
        
 (HTM) web link (jchandra.com)
 (TXT) w3m dump (jchandra.com)
        
       | vivahir215 wrote:
       | Good read.
       | 
        | I do have a question about BigQuery. If you were experiencing
        | unpredictable query costs or customization issues, that sounds
        | like user error. There are ways to optimize, or to commit slots
        | to reduce the cost. Did you try that?
        
         | jchandra wrote:
          | As for BigQuery, while it's a great tool, we faced challenges
          | with high-volume, small queries where costs became
          | unpredictable, since it is priced per volume of data scanned.
          | Clustered tables and materialized views helped to some extent,
          | but they didn't fully mitigate the overhead for our specific
          | workloads. There are certainly ways to overcome and optimize
          | this, so I wouldn't exactly blame BigQuery or call it a
          | limitation.
          | 
          | It's always a trade-off, and we made the call that best fit
          | our scale, workloads, and long-term plans.
        
           | vivahir215 wrote:
           | Hmm, Okay.
           | 
            | I am not sure whether managing a Kafka Connect cluster is
            | too expensive in the long term. This solution might work for
            | you based on your needs, but I would suggest looking at
            | alternatives.
        
           | throwaway7783 wrote:
            | Did you consider the slot-based pricing model for BQ?
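For context on the slot question, here is the back-of-the-envelope break-even between on-demand and a slot commitment. Both prices are placeholders/assumptions, not quoted rates; real numbers depend on region, edition, and commitment term.

```python
# Break-even point: past this many TiB scanned per month, a flat
# slot commitment is cheaper than on-demand per-TiB billing.
ON_DEMAND_PER_TIB = 6.25        # USD/TiB scanned (assumed)
SLOT_COMMIT_PER_MONTH = 2000.0  # USD for a hypothetical slot commitment

break_even_tib = SLOT_COMMIT_PER_MONTH / ON_DEMAND_PER_TIB
print(f"Slots win past ~{break_even_tib:.0f} TiB scanned/month")
```

The trade-off: slots cap the bill and make it predictable, but you pay for the commitment whether or not you scan that much.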
        
       | cratermoon wrote:
       | AKA The Monty Hall Rewrite
       | https://alexsexton.com/blog/2014/11/the-monty-hall-rewrite
        
       | jchandra wrote:
        | We did have a discussion on self-managed vs. managed and the
        | TCOs associated with each.
        | 
        | 1. We have a multi-regional setup, so data sovereignty
        | requirements came up.
        | 
        | 2. Vendor lock-in: a few of the services were not available in
        | that geographic region.
        | 
        | 3. With managed services, you often pay for capacity you might
        | not always use. Our workloads were consistent and predictable,
        | so self-managed solutions helped us fine-tune our resources.
        | 
        | 4. One of the goals was to keep our storage and compute loosely
        | coupled while staying Iceberg-compatible for flexibility.
        | Whether it's Trino today or Snowflake/Databricks tomorrow, we
        | aren't locked in.
        
       | snake_doc wrote:
        | This just seems like an over-engineered solution trying to
        | guarantee job security. When the dataflows are this
        | straightforward, just replicate into the OLAP of your choice
        | and transform there.
        
       | throwaway7783 wrote:
       | .. how many engineers?
        
       | ripped_britches wrote:
       | So you saved just $20k per year? Not sure the context of your
       | company but I'm not sure if this turns out to be a net win given
       | the cost of engineering resources to produce this infra gain
        
       | SkyPuncher wrote:
       | I know it's easy to be critical, but I'm having trouble seeing
       | the ROI on this.
       | 
        | This is a $20k/year savings. Perhaps I'm not aware of the
        | pricing in the Indian market (where this startup is), but that
        | simply doesn't seem like a good use of time. There's a real
        | cost to doing these implementations, both in hard financial
        | dollars (salaries of the people doing the work) and in the
        | trade-offs of deprioritizing other work.
        
         | paxys wrote:
         | The biggest issue IMO is that engineers who work on projects
         | like these inevitably get bored and move on, and then the
         | company is stuck trying to add features, fix bugs and generally
         | untangle the mess, all taking away time and resources from
         | their actual product.
        
       | rockwotj wrote:
       | Why confluent instead of something like MSK, Redpanda or one of
       | the new leaderless, direct to S3 Kafka implementations?
        
       | 1a527dd5 wrote:
       | There is something here that doesn't sit right.
       | 
        | We use BQ and Metabase heavily at work. Our BQ analytics
        | pipeline is several hundred TBs. In the beginning we had data
        | (engineer|analyst|person) run amok and run up a BQ bill of
        | around 4,000 per month.
        | 
        | By far the biggest fixes were:
       | 
       | - partition key was optional -> fix: required
       | 
       | - bypass the BQ caching layer -> fix: make queries use
       | deterministic inputs [2]
       | 
        | It took a few weeks to go through each query using the metadata
        | tables [1], but it was worth it. In the end our BQ analysis
        | cost was down to something like 10 per day.
       | 
       | [1] https://cloud.google.com/bigquery/docs/information-schema-
       | jo...
       | 
       | [2] https://cloud.google.com/bigquery/docs/cached-
       | results#cache-...
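A sketch of the second fix: BigQuery's 24-hour result cache requires byte-identical query text, and queries containing non-deterministic functions such as CURRENT_TIMESTAMP() are never served from cache. Computing the date in the client and baking it in as a literal keeps repeated dashboard refreshes cache-eligible. Table and column names below are made up for illustration.

```python
# Make query text deterministic so repeated runs can hit the result
# cache instead of rescanning the table on every dashboard refresh.
from datetime import date

def daily_report_sql(day: date) -> str:
    """Bake the date in as a literal: every refresh for the same day
    produces identical SQL, which is a precondition for a cache hit."""
    return (
        "SELECT user_id, COUNT(*) AS events\n"
        "FROM `project.analytics.events`\n"
        f"WHERE event_date = DATE '{day.isoformat()}'\n"  # also a partition filter
        "GROUP BY user_id"
    )

# Instead of writing WHERE event_date = CURRENT_DATE() - 1 in the SQL
# (non-deterministic, never cached), compute "yesterday" in the client:
sql = daily_report_sql(date(2025, 3, 8))
```

The same literal also doubles as the partition filter, which pairs with the "partition key required" fix above.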
        
       ___________________________________________________________________
       (page generated 2025-03-09 22:00 UTC)