[HN Gopher] Show HN: Hydra (YC W22) - Serverless Analytics on Po...
       ___________________________________________________________________
        
       Show HN: Hydra (YC W22) - Serverless Analytics on Postgres
        
       Hi HN, Hydra cofounders (Joe and JD) here (https://www.hydra.so/)!
       We enable realtime analytics on Postgres without requiring an
       external analytics database.  Traditionally, this was unfeasible:
       Postgres is a rowstore database that's 1000X slower at analytical
       processing than a columnstore database.  (A quick refresher for
       anyone interested: A rowstore means table rows are stored
       sequentially, making it efficient at inserting / updating a record,
       but inefficient at filtering and aggregating data. At most
       businesses, analytical reporting scans large volumes of events,
       traces, time-series data. As the volume grows, the inefficiency of
       the rowstore compounds: i.e. it's not scalable for analytics. In
       contrast, a columnstore stores all the values of each column in
       sequence.)  For decades, businesses had to manage the differing
       strengths of the rowstore and the columnstore by maintaining two
       separate systems. This led to large gaps in functionality, syntax,
       and the background knowledge engineers need. For example, here are
       the gaps between Redshift (a popular columnstore) and Postgres
       (rowstore) features:
       (https://docs.aws.amazon.com/redshift/latest/dg/c_unsupported...).
       We think there's a better, simpler way: unify the rowstore and
       columnstore - keep the data in one place, stop the costs and hassle
       of managing an external analytics database. With Hydra, events,
       traces, time-series data, user sessions, clickstream, IOT
       telemetry, etc. are now accessible as a columnstore right alongside
       your standard rowstore tables.  Our solution: Hydra separates compute
       from storage to bring an analytics columnstore with serverless
       processing and automatic caching to your Postgres database.  The
       term "serverless" can be a bit confusing, because a server always
       exists, but it means compute is ephemeral and spun up and down
       automatically. The database automatically provisions and isolates
       dedicated compute resources for each query process. Serverless is
       different from managed compute, where the user explicitly chooses
       to allocate and scale CPU and memory continuously, and potentially
       overpay during idle time.  How is serverless useful? It's important
       that every analytics query has its own resources per process. The
       major hurdles with running analytics on Postgres are 1) rowstore
       performance and 2) resource contention. #2 is very often
       overlooked, but in practice, analytics queries tend to hog
       resources (RAM and CPU) from Postgres transactional work. So even a
       slightly expensive analytics query can slow down the entire
       database. That's why serverless is important: it guarantees that
       expensive queries are isolated and run on dedicated database
       resources per process.  Why is Hydra so fast at analytics?
       (https://tinyurl.com/hydraDBMS) 1) columnstore by default 2)
       metadata for efficient file-skipping and retrieval 3) parallel,
       vectorized execution 4) automatic caching  What's the killer
       feature? Hydra can quickly join columnstore tables with standard
       row tables within Postgres with direct SQL.  Example: "Segment
       events as a table." Instead of dumping Segment event data into an
       S3 bucket or external analytics database, use Hydra to store and
       join events (clicks, signups, purchases) with user profile data
       within Postgres. Know your users in realtime: questions like "what
       events predict churn?" or "which user will likely convert?" become
       immediately actionable.
       Thanks for reading! We would love to hear your feedback, and if
       you'd like to try Hydra now, we offer a $300 credit and 14 days
       free per account. We're excited to see how bringing the columnstore
       and rowstore side-by-side can help your project.
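
       The rowstore/columnstore refresher and the "metadata for efficient
       file-skipping" point above can be illustrated with a small sketch.
       This is plain Python, not Hydra's implementation: the table, the
       column names (ts, user_id, amount), the chunk size, and all
       function names are hypothetical. It lays the same records out by
       column, stores min/max metadata per chunk, and skips chunks that
       cannot match a filter.

```python
# Hypothetical event rows (rowstore layout): each record is stored together.
rows = [
    {"ts": 1, "user_id": 10, "amount": 5},
    {"ts": 2, "user_id": 11, "amount": 7},
    {"ts": 3, "user_id": 10, "amount": 1},
    {"ts": 4, "user_id": 12, "amount": 9},
    {"ts": 5, "user_id": 11, "amount": 3},
    {"ts": 6, "user_id": 10, "amount": 4},
]


def to_columns(records):
    """Columnstore layout: one contiguous list of values per column."""
    return {name: [r[name] for r in records] for name in records[0]}


def build_chunks(columns, chunk_size):
    """Split the columns into chunks and record min/max of `ts` per chunk."""
    n = len(columns["ts"])
    chunks = []
    for start in range(0, n, chunk_size):
        part = {k: v[start:start + chunk_size] for k, v in columns.items()}
        meta = {"min_ts": min(part["ts"]), "max_ts": max(part["ts"])}
        chunks.append((meta, part))
    return chunks


def sum_amount_since(chunks, min_ts):
    """Sum `amount` where ts >= min_ts.

    Chunks whose metadata proves they cannot match are skipped without
    reading their values. Returns (total, chunks_scanned).
    """
    total, scanned = 0, 0
    for meta, part in chunks:
        if meta["max_ts"] < min_ts:
            continue  # whole chunk is out of range
        scanned += 1
        # Only the two columns the query needs are touched.
        total += sum(a for t, a in zip(part["ts"], part["amount"]) if t >= min_ts)
    return total, scanned


columns = to_columns(rows)
chunks = build_chunks(columns, chunk_size=2)
total, scanned = sum_amount_since(chunks, min_ts=5)
print(total, scanned)
```

       With this data, only one of the three chunks is scanned for
       min_ts=5; the user_id column is never read at all.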
        
       Author : coatue
       Score  : 36 points
       Date   : 2025-05-09 15:24 UTC (7 hours ago)
        
 (HTM) web link (www.hydra.so)
 (TXT) w3m dump (www.hydra.so)
        
       | cultofmetatron wrote:
       | my team is currently looking into offloading some of our
       | analytics data into a columnar database next year. hydra and
       | clickhouse were the top ones on the list. would love a breakdown
       | of how the two compare.
        
         | coatue wrote:
         | [Joe, Hydra cofounder] Hey, that's really great - I love
         | hearing that. Hydra is a columnar database with an integrated
         | Postgres rowstore. Analytics isn't best served by columnar
         | storage alone: we've heard from users that their analytics
         | workloads would benefit from fast lookups on row tables too,
         | not just scans of large tables. Our goal for Hydra is to enable
         | realtime analytics on Postgres without requiring an external
         | analytics database. This makes it possible to join the rowstore
         | and columnstore data in Postgres with direct SQL. Other
         | analytics databases typically rely on ETL pipelines to move data
         | out of Postgres, which, depending on your scale, can become
         | expensive and introduce delay.
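         | 
         | A rough sketch of the row-plus-column join idea in plain Python
         | (not Hydra's implementation or SQL; the tables, field names, and
         | the purchases_by_plan helper are all hypothetical): events are
         | held as columns and scanned, while user profiles are held as
         | rows and looked up by key.

```python
from collections import Counter

# Hypothetical row-oriented table: user profiles keyed by primary key,
# which makes single-record lookups cheap.
users_by_id = {
    10: {"name": "ada", "plan": "pro"},
    11: {"name": "bob", "plan": "free"},
    12: {"name": "cy", "plan": "pro"},
}

# Hypothetical column-oriented table: one list of values per event column.
events = {
    "user_id": [10, 11, 10, 12, 11, 10],
    "event": ["click", "signup", "purchase", "click", "click", "purchase"],
}


def purchases_by_plan(events, users_by_id):
    """Count purchase events per user plan.

    Scans only the two event columns the query needs, then looks up each
    matching user's plan in the row table.
    """
    counts = Counter()
    for user_id, event in zip(events["user_id"], events["event"]):
        if event == "purchase":
            counts[users_by_id[user_id]["plan"]] += 1
    return dict(counts)


print(purchases_by_plan(events, users_by_id))
```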
        
           | cultofmetatron wrote:
           | from what you wrote above, it seems like a great value add
           | for greenfield projects.
           | 
           | we currently use aws aurora. how easy would it be to simply
           | sql dump and load into hydra and how well would it serve as a
           | drop in replacement?
        
             | coatue wrote:
             | Close to a drop-in replacement since Aurora bills itself as
             | Postgres. Any data you load into Hydra will automatically
             | be converted into the columnstore! We're happy to help out,
             | so feel free to DM me directly.
        
       | fourseventy wrote:
       | The homepage of this website does a bad job of explaining wtf
       | Hydra actually does. Is it a database? Some type of serverless
       | architecture? Ok analytics, but analytics about what, Postgres
       | performance? Does 'analytics' mean that it's for OLAP queries?
        
         | coatue wrote:
         | [Joe, Hydra cofounder] Hydra is a fast analytics db on
         | Postgres. It's a database with both a row and columnstore.
         | Analytics can mean reporting, metrics, customer-facing
         | dashboards. Sounds like we should spend some time making
         | analytics templates.
        
           | switchbak wrote:
           | I've run through the docs and it's really unclear how the
           | compute model works. "Serverless" is nice, but how exactly is
           | that managed?
        
       | pikdum wrote:
       | I feel like my ideal would be something more hybrid. It's pretty
       | rare that I have a table that I decide upfront should be
       | columnar. It's a lot more common that I want occasional
       | analytics-like queries on my regular tables to not take forever.
        
         | coatue wrote:
         | [Joe, Hydra cofounder] That's good feedback. It's easy to
         | change the default table type to rowstore "heap"
         | (https://docs.hydra.so/guides/analytics#switching-the-
         | default...).
         | 
         | We initially set the rowstore as default, but people wouldn't
         | create columnstore tables and were confused about why
         | performance wasn't improving. So we figured this was cleaner,
         | but you always have the option to switch the default table type
         | back.
        
       | mritchie712 wrote:
       | is this using pg_duck?
        
         | coatue wrote:
         | [Joe, Hydra cofounder] Hey there, yes - we codeveloped
         | pg_duckdb and it's what Hydra is built on top of!
        
       | switchbak wrote:
       | Ory Hydra is a relatively high-profile project with a name
       | collision, FYI.
        
         | VWWHFSfQ wrote:
         | there are a million open source products called hydra. I don't
         | think any of them can really claim it exclusively
        
       | thawab wrote:
       | Hello Joe, thanks a lot for Hydra and pg_duckdb. I wanted to
       | confirm that for self-hosting Hydra I have to generate a token
       | from your platform? What data is shared with Hydra in this case?
       | We need to double-check, as our data has sharing restrictions.
       | 
       | > Visit http://platform.hydra.so/token to fetch the access token
       | and paste it into the section above.
        
         | coatue wrote:
         | Hello thawab, yes! You can self-host Hydra with a token from
         | the platform. Sign up and visit that URL to get to the right
         | spot. We call it Bare Metal deployment; here's a 1-minute
         | setup guide (https://docs.hydra.so/guides/bare_metal)
        
           | thawab wrote:
           | thanks a lot, the other part of the question:
           | 
           | 1- what data is shared with hydra for this case?
           | 
           | 2- what's the pricing for the bare metal deployment?
        
             | coatue wrote:
             | Billing (usage) metrics are shared so we know what to
             | charge. We offer BYOC 'Bare Metal' deployments as part of the
             | Business plan. You can set it up now, but we offer volume
             | discounts, so you should talk to our team directly. Feel free
             | to DM me on X (@JoeSciarrino) or email founders@
        
       | thenaturalist wrote:
       | Hey there, congrats on publicly launching this after your work
       | over the past months!
       | 
       | Having followed the project for a while now, I really scratch my
       | head when looking at your pricing.
       | 
       | The entire innovation of the past decade in database land has
       | gone towards decoupling storage and compute, driving query
       | engines (like DuckDB) and file formats (like Iceberg).
       | 
       | Yet you force-bundle storage and compute in your pricing while
       | also selling a serverless product.
       | 
       | What's the reason behind that?
       | 
       | Why do it in the first place?
       | 
       | How does your pricing work?
       | 
       | Are the 40/500 compute hours I get included in the spend limit
       | per tier (i.e. max 160 additional hours in Starter etc.) or
       | completely separate?
       | 
       | Why are there member constraints on a database product?
       | 
       | How does that factor into cost/ map to SDL / reasonable team
       | setups of people operating analytics projects revolving around a
       | database like yours?
       | 
       | I have never seen such a limit with any other vendor and esp.
       | when you wanna get a hold in the market/ have people start using
       | Hydra for the specialized role it can provide, having a 2 person
       | limit for the minimum tier if I wanna PoC this would likely be a
       | show stopper tbh...
        
         | coatue wrote:
         | [Joe, Hydra cofounder] Hey there, I appreciate you taking the
         | time to write this up - helps a lot to hear what's confusing.
         | 
         | One of the downsides of serverless is that it can be difficult
         | to predict the overall monthly cost when the granularity of
         | billing (per invocation, memory usage, or execution time) is
         | complex. For developers this might be totally fine (even
         | preferred), but we think that a single, predictable price,
         | Hydra at $100 / month, is better for businesses to plan
         | around.
         | 
         | Usage caps per plan are purely soft limits so users don't
         | actually encounter them. Yes, we want people to upgrade to
         | higher plans. In the words of Maya Angelou "Be careful when a
         | naked person offers you a shirt" - meaning, we believe these
         | are the best prices we can offer today to build a sustainable
         | project on. That said, I appreciate your point about our # of
         | users limit. If we removed that limit would you try out Hydra?
        
       ___________________________________________________________________
       (page generated 2025-05-09 23:00 UTC)