[HN Gopher] OpenAI Acquires Rockset
___________________________________________________________________
OpenAI Acquires Rockset
Author : colesantiago
Score : 216 points
Date : 2024-06-21 15:04 UTC (7 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| softwaredoug wrote:
| Is this a death knell to many of the vector DBs pushing RAG
| solutions right now?
|
| (Or maybe it's validation of RAG and these companies should
| rejoice)
| nextworddev wrote:
| Your workload is my opportunity - OpenAI, probably
| posix_monad wrote:
| Nah, that was Postgres vector extensions
| PeterCorless wrote:
| "RAG" is more of a concept than a specification. So the
| Cambrian explosion of how to actually do it will continue
| unabated.
|
| Likewise, I don't think it's going to stem the tide of adding
| vector indexes and similarity search techniques to traditional
| databases.
|
| Instead, if anything, I think this is a validation that
| traditional databases aren't going anywhere -- OLAP or OLTP.
| Behind all the LLM models you're still going to need true,
| authoritative data in databases to avoid (or at least minimize)
| the hallucination problem.
|
| AI needs, if anything, even more programmatic ways to get at
| that data.
| codezero wrote:
| Really surprised to hear that they will be shutting down the SaaS
| business and all existing customers will need to offboard by the
| end of September.
|
| Quite a few of my customers build on top of Rockset and it won't
| be a smooth transition.
| ilrwbwrkhv wrote:
| Classic VC-funded SaaS.
|
| Why people would build actual businesses on top of these
| fly-by-night SaaS companies funded by VC money is beyond me.
| codezero wrote:
| It's not really classic. There are many different kinds of
| exits, this one is pretty uncommon in my experience,
| especially for a company that's been around as long as it has
| and with the customers they have.
| rockyroad wrote:
| https://docs.rockset.com/documentation/docs/faq
|
| Rockset is offboarding existing customers. Definitely sucks; we
| spent the last 3 months adopting it. We used it to replicate
| DynamoDB in near real time for ad hoc & reporting queries.
| The schemaless architecture was very easy to work with.
| noufalibrahim wrote:
| Ideally, success of customers should mean success of company
| but the incentives are wildly misaligned here. It seems
| perfectly "okay" from the company perspective since they are in
| it to make money but doing that by telling their customers to
| "leave because we got acquired and no longer care about what we
| sold you last week" is really harsh.
| Maxatar wrote:
| If you're a corporate customer, then you take that risk when
| you sign up for a month to month contract. Now usually a
| month to month works very well for you, but it is a risk you
| accept.
|
| A lot of corporate customers will seek longer term contracts,
| a year or even longer, so they can lock in a price and
| various service guarantees. Even in the case of this
| acquisition it's only customers on a month to month plan that
| have to migrate by September, customers on a long term
| contract will continue to have access and support for the
| duration of their contract.
| CapcomGo wrote:
| Sure but all of their contract customers still need to find
| a replacement
| jjmarr wrote:
| If you want a service guarantee for longer than 1 month
| don't sign a contract for 1 month at a time.
|
| If you want a guarantee of more time to transition you
| are expected to pay for the privilege.
| PeterCorless wrote:
| Yikes! 30 Sep cut off? That's not a lot of lead time, given the
| amount of work database systems require to data model,
| benchmark and migrate. My apologies if this seems inappropriate
| but given the urgency, my employer, StarTree, has a free tier
| if anyone needs to try out alternate solutions.
|
| https://startree.ai/saas-signup
|
| I understand Rockset-to-StarTree (Apache Pinot) is not a 1:1
| drop-in replacement. But hopefully it's a port in a storm.
|
| Whether you end up on StarTree or another suitable alternate, I
| hope everyone has as painless a migration as possible. Reminds
| me a bit of how FoundationDB customers found themselves without
| a home when Apple acquired them back in 2015[0].
|
| [0] https://news.ycombinator.com/item?id=9259986
| biggestdummy wrote:
| Hi PeterCorless! (We're friends IRL - it's Greg)
|
| While we're putting in plugs for open source alternatives,
| I'll recommend looking at StarRocks.
| https://www.starrocks.io/
|
| I share Peter's sentiment for wishing everyone an easy
| transition, whatever you choose.
| Merick wrote:
| Seconding the StarRocks project, best performance out there
| and the community is great. Tons of support.
| mythticquest006 wrote:
| Just adding to what Greg mentioned: if you want to learn
| more about StarRocks or have any questions, feel free to
| reach out to us in the StarRocks community on Slack:
| https://try.starrocks.com/join-starrocks-on-slack.
| lopkeny12ko wrote:
| Don't these M&As need to be cleared by regulators? Seems
| premature to tell your customers you're discontinuing your
| product before the acquisition has actually cleared (and, for
| that matter, passed OpenAI's due diligence).
| preetamjinka wrote:
| According to [0] the acquisition has already been completed.
|
| [0] https://rockset.com/blog/openai-acquires-rockset/
| danielmarkbruce wrote:
| Tiny acquisitions which don't affect anything don't need sign
| off from any regulator. They likely signed the deal and
| closed it in one go.
| lolinder wrote:
| How exactly does that work? Is there a formal threshold for
| when a regulator will want to take a look?
| danielmarkbruce wrote:
| Yeah, the FTC and DOJ publish guidelines with thresholds,
| and you basically just follow them.
|
| E.g. https://www.ftc.gov/enforcement/premerger-notification-progr...
|
| https://www.ftc.gov/enforcement/premerger-notification-progr...
|
| There are like 10k+ mergers and acquisitions done in the
| US each year (ballpark). It requires real analysis to
| figure if something should be blocked (practically none
| have any real effect on anything and shouldn't be) and
| there are only so many folks at the regulators who can do
| that analysis (and honestly... they aren't good at
| it...).
| fullspectrumdev wrote:
| Usually no.
|
| I've been around for a few M&A that horrendously fucked
| customers of the "acquired" company and the regulator doesn't
| care. Even if the acquirer is under regulatory observation.
| neilv wrote:
| > _[flagged] [dead] OpenAI Acquires Rockset (openai.com)_
|
| I vouched for this because it seems relevant, and I saw no reason
| in the comments to flag it.
| ram_rar wrote:
| > How long will service remain available?
|
| > Month-to-month customers without an active contract will have
| until Monday,
|
| > September 30th, 2024, 5 PM PDT to off-board.
|
| I'd love to hear from someone with expertise in vendor onboarding
| and business continuity risk: how do vendor contracts typically
| protect customers in situations like this?
|
| I'm sure customers will be super frustrated with a datastore vendor
| change, which would need nontrivial resources, from product
| development to system migration, in such a short span of time.
| JumpCrisscross wrote:
| > _how do vendor contracts typically protect customers in
| situations like this?_
|
| That's in the termination clause.
| borski wrote:
| There is typically a change of control clause too.
| hluska wrote:
| Vendor contracts typically have both termination and change of
| control clauses. In general, you can negotiate for more
| security, but you will pay heavily for it. The typical contract
| though contains very little in the way of customer protection.
|
| Technically, when companies choose a vendor, they should
| consider risks like a company suddenly being acquired. In
| practice, it's quite hard to assign an actual number to that
| column - it's almost always a risk but it's extremely hard to
| quantify. You'll often hear things like "every vendor could be
| acquired so that counts equally for all choices" when that risk
| gets discussed.
| pr337h4m wrote:
| Looks like we'll finally be able to search our past ChatGPT
| convos without having to Ctrl-F the data export
| ot wrote:
| Very unexpected acquisition. I don't think that Rockset is a
| suitable infrastructure for RAG, a purpose-built inverted index
| would be far more efficient (both in terms of compute and
| storage), so I'm not sure how much of the technology would
| actually be useful for them.
|
| I can think of two options
|
| - Pure acqui-hire: virtually all of Rockset engineering
| leadership is ex-Meta, and OpenAI has been hiring several senior
| infra engineers from Meta, so these are all people that have
| worked together previously.
|
| - OpenAI is building some product where customers can ingest
| large amounts of data, which could be managed by the Rockset
| infrastructure as source of truth, and then indexed by their RAG
| systems.
| hipadev23 wrote:
| OpenAI has billions of dollars and nothing but GPUs to spend it
| on. This isn't strategic per-se, it's just rollup. Good place
| to be in for any data-adjacent product company.
|
| Google and Amazon followed the same strategy for over a decade
| just buying anything that was possibly helpful.
| ot wrote:
| I would speculate that OpenAI is in a phase where speed of
| delivery is make-or-break, and any bloat would be a
| distraction. I bet they're extremely deliberate in their
| acquisitions.
| jshx wrote:
| When the rate of change increases (with different accelerating
| rates in different dimensions), what deliberation chimps with
| 3-inch brains do does not matter. Even the explanatory
| stories can't be manufactured fast enough to keep pace. Such
| a state is called The Anthill.
| tudorb wrote:
| Giuseppe! Long time no see. Rockset's architecture changed
| somewhat since we last talked-- not in fundamental ways, but in
| ways that would alleviate your concerns.
|
| If you want to talk (not secret) technical details, you know
| where to find me :)
|
| -Tudor.
| ot wrote:
| I guess I stand corrected then :)
|
| (Hi!)
|
| EDIT: I forgot to say, with the recent hires and the Rockset
| team, OpenAI is building quite the infra dream team :)
| chatmasta wrote:
| Does OpenAI use Rockset internally? I feel like I have some
| vague memory about that... in which case, the acquisition would
| make sense from a continuity of business perspective.
| mritchie712 wrote:
| they were using qdrant for RAG as of November 2023. Not sure
| if it's changed since then.
|
| https://x.com/simonw/status/1722011967886688696
| simonw wrote:
| RAG doesn't have to involve vector search.
|
| The (very thin) blog post said "Enhancing our retrieval
| infrastructure" - my guess is this is more about other forms of
| retrieval, like constructing and executing SQL queries and
| using the results to help answer questions.
| zurfer wrote:
| Last time I heard of Rockset was at the Snowflake Summit
| where they positioned themselves as a faster DWH.
|
| Looking at the landing page now it seems they almost pivoted
| into semi/unstructured data.
|
| To your point, I feel like nobody knows exactly how to do RAG
| really well (fast and accurate). I also doubt the Rockset
| team has it figured out but it seems like there is an
| opportunity to build a new kind of database/memory system and
| OpenAI believes the Rockset team can help.
| ethbr1 wrote:
| I think OpenAI also realized they're an AI major without a
| dance partner, when it comes to context.
|
| Google (Android, Gmail, Maps, G Office), Apple (iPhone,
| Mail, Maps, Productivity), Microsoft (Office365, Windows,
| XBox).
|
| In terms of moat and lock-in, that leaves OpenAI vulnerable
| to last mile customer hijacking.
| tirumaraiselvan wrote:
| > RAG doesn't have to involve vector search.
|
| This. Not sure why RAG triggers vector search for everyone.
| Retrieval Augmented Generation is as generic as it can get.
| clpmsf wrote:
| Most likely for the same reason that so many people seem to
| think they need a vector-specific database and a framework
| like langchain to build any type of GenAI-enabled
| application... the content marketing is working.
| netvarun wrote:
| Congrats to the team. IIRC their CTO was the creator of RocksDB.
| usrnm wrote:
| RocksDB is a fork of LevelDB created by Jeffrey Dean and Sanjay
| Ghemawat at Google.
| flakiness wrote:
| LevelDB was like their hobby project and was built mostly for
| Chrome's Indexed DB. RocksDB brought it to a much higher
| level with a lot of dedication.
| tylerhannan wrote:
| A database is core to your infrastructure...finding out your
| database is going away is a horrifying situation. Finding out
| that the time you have to migrate is a few months. Agh.
|
| As others will say, there are options. Rockset helpfully posts
| links to a bunch of comparisons on their website, and these
| alternatives include ClickHouse, Elasticsearch, Druid, etc.
| https://rockset.com/real-time-analytics-comparison/
|
| I'm inherently biased (as a member of the ClickHouse team). But
| do check ClickHouse out.
|
| You can always come hang out in our Slack (clickhouse.com/slack)
| and, of course, the combination of hosted ClickHouse
| (clickhouse.com/cloud) and the open-source
| (github.com/clickhouse) may add a bit of comfort when your vendor
| up and disappears via acquisition.
| lolinder wrote:
| To anyone else who may be confused like I was: Rockset will, in
| fact, "gradually transition current customers off Rockset". The
| OpenAI announcement linked above doesn't say so, but the
| Rockset announcement does:
|
| https://rockset.com/blog/openai-acquires-rockset/
|
| Month-to-month users have been given until September 30, which is a
| _very_ short amount of time for a major infrastructure
| transition. Enterprise users are given a vague "talk to your
| account manager" answer:
|
| https://docs.rockset.com/documentation/docs/faq
|
| In other words, the above isn't just FUD from a competitor,
| there legitimately are going to be a lot of frantic refugees in
| the coming months.
| fullspectrumdev wrote:
| I've no skin in this particular game that I know of but this
| migration period is really, really short.
| deanc wrote:
| Especially if you're in certain parts of Northern Europe
| where it's common to take the whole of July off work.
| fullspectrumdev wrote:
| Jesus having read the releases this comment should go up more
| given that I suspect a lot of shops will not have enough time
| to migrate.
| zX41ZdbW wrote:
| I have tested Rockset for competitive analysis.
|
| Good parts:
|
| It has a slick and nice-looking UI. Good documentation. Many data
| loading options (including S3).
|
| SQL support is good (Calcite?). Types are inferred on data
| loading. But you have to choose one "timestamp" column.
|
| Bad parts:
|
| First data load attempts failed (after 24 hours, it showed
| something like "too many retries").
|
| I've loaded around 500 million rows, and the storage limit ran
| out.
|
| Query performance did not shine. Storage size was very large (it
| seems they create many indices automatically).
|
| Considerations:
|
| The technology is not open-source. It is rocksdb + secondary
| indices + object storage + SQL engine.
| refset wrote:
| > The technology is not open-source
|
| Not entirely fair - see https://github.com/rockset/rocksdb-cloud
| (a fork of RocksDB with a separation of storage and
| compute, using S3 and Lambda-based compaction)
| xyst wrote:
| I wonder how much money OpenAI dangled in front of Rockset
| C-level execs and board to agree to acquisition. Seems company
| was founded in 2016 (8 years ago) [1]
|
| With investment from vulture capitalists to the tune of $117M.
| [2] I would assume they want a sizeable return on investment, so
| maybe a $250-350M cash deal?
|
| Doesn't seem like this would be a unicorn, but it's a payoff.
| Certainly will cover the losses from a few bad investments.
|
| [1] https://venturebeat.com/ai/openai-acquires-rockset-to-streng...
|
| [2]
| https://www.crunchbase.com/organization/rockset/company_fina...
| m3kw9 wrote:
| A lot of stock
| freedomben wrote:
| AI is a difficult space to be a customer in. All
| customers/investors/etc want you to add "AI" to your products,
| but for the majority of people that means using a vendor, and the
| churn in the space is shocking.
|
| It's this point where my gratitude for Llama and Meta is
| extremely high.
| bbor wrote:
| Can we all agree that OpenAI should be banned from any kinds of
| corporate acquisitions? Ditto for Microsoft/Google/Meta,
| obviously
| davedx wrote:
| No, why on earth?
| Merick wrote:
| I'll always remember Rockset for their ridiculous comparison
| page: https://rockset.com/real-time-analytics-comparison/
|
| Maybe they should rename it to their migration options page. Or
| maybe I'll just ask ChatGPT what the best alternative is...
|
| Still, pretty useful stuff, but it also feels like Rockset had
| been moving a little too slowly in recent years, but congrats to
| them on finding a new home.
| teej wrote:
| These pages are done for SEO. You get loads of inter-linked
| pages rich with keywords that match user searches exactly.
| riku_iki wrote:
| It was funny seeing their front page saying "World's fastest
| analytical and search database" has 90MB/s streaming ingest
| speed.
| idrathernot wrote:
| Not a good look for OpenAI. Shows a lack of confidence in their
| internal prospects to push the needle if they're already
| considering inorganic growth alternatives.
| wantsanagent wrote:
| OpenAI Eng Mgmt: "Hey, we really like this rockset thing we've
| been using, but we don't have the people to build it out as fast
| as you want."
|
| OpenAI Leadership: "Ok, buy Rockset and have them build anything
| you need."
|
| OpenAI Eng Mgmt: "... Ok. You want to run a db service?"
|
| OpenAI Leadership: "No. Dump all the existing customers. They
| build for us now."
| jstummbillig wrote:
| Yes, that's how acqui-hiring goes. Idk. Anything is noteworthy
| when OpenAI does it, I suppose?
| Jensson wrote:
| It is surprising every time you see a non-profit behave
| exactly like a for-profit. You'd think there would be some
| difference, but no, apparently there is basically none.
|
| At least I have never seen a non-profit acquihire before.
| codezero wrote:
| OpenAI isn't a nonprofit.
|
| https://openai.com/our-structure/
| asdsyd wrote:
| Just trying out my first comment on Hacker News in a move
| away from X. Please ignore this. Thanks
| xeromal wrote:
| Hello!
| zeroonetwothree wrote:
| Seems a bit of an odd fit for OpenAI. But I assume they had good
| reasons.
| fire_lake wrote:
| They couldn't get GPT5 to write a clone for them?
| alclol wrote:
| As others have mentioned, this acquisition leaves many Rockset
| customers in a tough spot with a short timeline to migrate. I'd
| like to bring attention to a potential alternative:
| RisingWave (https://risingwave.com/). RisingWave is an open-source
| streaming database designed for real-time analytics and data
| processing. Like Rockset, it offers PostgreSQL compatibility and
| impressive ability to handle both streaming and batch data.
|
| What sets RisingWave apart is its focus on stream processing
| while maintaining SQL compatibility. This could be particularly
| valuable for users leveraging Rockset's real-time capabilities.
| RisingWave offers several features that may appeal to Rockset
| users. It's built to scale in cloud environments and can ingest
| data from a large variety of sources. The database supports
| materialized views for efficient query processing and ensures
| data consistency with ACID transactions. For those concerned
| about vendor lock-in after this experience, RisingWave's open-
| source nature (Apache 2.0 license) provides an extra layer of
| assurance. There's also a managed cloud offering for those who
| prefer a hands-off approach.
|
| I encourage impacted Rockset users to explore RisingWave as part
| of their evaluation process. The project has a welcoming
| community (join at risingwave.com/slack) and extensive
| documentation to help with the transition. [Disclosure: I'm
| associated with RisingWave. Happy to answer any questions or
| provide more details about how it compares to Rockset for
| specific use cases.]
| bayouborne wrote:
| It's clear OpenAI badly wants to get to a place where the Support
| and R&D departments of big companies can dump every disjointed
| scrap of info they've been collecting for years, into a massive
| bucket, let OpenAI's servers cook it for a while and then like
| magic, let managers ask the Borgian result.. stuff. Why is this
| process failing? What's not relevant? What stuff that we've
| demoted in importance, isn't? etc etc etc
| threeseed wrote:
| This is exactly it here.
|
| It's been very common to see startups, many of whom have never
| set foot in an enterprise, push this idea that you can drop an
| LLM on top of company data and ask questions like it was
| ChatGPT. The reality is that most company data is a mess with
| little funding/will to fix it and so the results are unusable.
| So if OpenAI wants to be anything more than a chatbot
| they will need to start to tackle this problem.
|
| Amazing to watch their aspirations go from such lofty heights
| to being just another enterprise data infrastructure SaaS
| company.
|
| And should be a clear sign that the AI hype train has run out
| of steam.
___________________________________________________________________
(page generated 2024-06-21 23:01 UTC)