[HN Gopher] ArcticDB: Why a Hedge Fund Built Its Own Database
___________________________________________________________________
ArcticDB: Why a Hedge Fund Built Its Own Database
Author : todsacerdoti
Score : 41 points
Date : 2024-08-21 13:17 UTC (3 days ago)
(HTM) web link (www.infoq.com)
(TXT) w3m dump (www.infoq.com)
| dang wrote:
| Related:
|
| _ArcticDB: A high-performance, serverless Pandas DataFrame
| database_ - https://news.ycombinator.com/item?id=35198131 - March
| 2023 (1 comment)
|
| _Introducing ArcticDB: Powering data science at Man Group_ -
| https://news.ycombinator.com/item?id=35181870 - March 2023 (1
| comment)
|
| _Introducing ArcticDB: A Database for Observability_ -
| https://news.ycombinator.com/item?id=31260597 - May 2022 (31
| comments)
| Nelkins wrote:
| I don't think the last link is related. Different database.
| silisili wrote:
| Correct. They renamed it to FrostDB; here is the announcement -
|
| https://www.polarsignals.com/blog/posts/2022/06/16/arcticdb-...
| OutOfHere wrote:
| https://github.com/man-group/arcticDB
| stackskipton wrote:
| Read the presentation. Answer was what I expected. We had a unique
| problem, and because we make oil-drum amounts of cash, dipping a
| bucket in and taking that cash to solve the problem was an easy
| justification.
|
| These are really smart people solving problems they have but many
| companies don't have buckets of cash to hire really smart people
| to solve those problems.
|
| Also, the questions after the presentation pointed out that the
| data isn't always analyzed in their database, so it's more like a
| storage system than a database.
|
| >Participant 1: What's the optimization happening on the pandas
| DataFrames, which we obviously know are not very good at scaling
| up to billions of rows? How are you doing that? On the pandas
| DataFrames, what kind of optimizations are you running under the
| hood? Are you doing some Spark?
|
| >Munro: The general pattern we have internally and the users
| have, is that your returning pandas DataFrames are usable.
| They're fitting in memory. You're doing the querying, so it's
| like, limit your results to that. Then, once people have got
| their DataFrame back, they might choose another technology like
| Polars, DuckDB to do their analytics, depending on if they don't
| like pandas or they think it's too slow.
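
The pattern Munro describes (narrow the read so the returned DataFrame fits in memory, then analyze it with whatever tool you like) can be sketched roughly as follows. This is only an illustration in plain pandas: `read_window` and the dict-based `store` are hypothetical stand-ins for a storage-layer read, not ArcticDB's actual API, and the data is synthetic.

```python
import numpy as np
import pandas as pd


def read_window(store, symbol, start, end, columns):
    """Hypothetical stand-in for a storage-layer read.

    Returns only the requested time range and columns, so the caller
    never has to hold the full dataset as a DataFrame.
    """
    frame = store[symbol]
    # Label-based slicing on a sorted DatetimeIndex includes both endpoints.
    return frame.loc[pd.Timestamp(start):pd.Timestamp(end), columns]


# Synthetic one-minute data standing in for a much larger stored symbol.
index = pd.date_range("2024-01-01", periods=1000, freq="min")
store = {
    "prices": pd.DataFrame(
        {
            "bid": np.arange(1000.0),
            "ask": np.arange(1000.0) + 1.0,
            "volume": np.ones(1000),
        },
        index=index,
    )
}

# Ask only for the slice that is needed: ten minutes, two columns.
window = read_window(
    store, "prices", "2024-01-01 00:10", "2024-01-01 00:19", ["bid", "ask"]
)

# From here on it is an ordinary in-memory DataFrame; the analytics step
# could just as well be handed to Polars or DuckDB instead of pandas.
spread = (window["ask"] - window["bid"]).mean()
```

The point of the sketch is the division of labor: the read does the filtering, and the analysis runs on a small result.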
| datahack wrote:
| This comment is underrated comedy gold. You clearly have worked
| with big data.
| primitivesuave wrote:
| I skipped to the "why build a database" section and then
| skipped another two minutes of his tangential thoughts - seems
| like the answer is "because Moore's law"?
| tda wrote:
| I know there are tons of problems that are solved in Excel when
| they really shouldn't be. Instead of getting the expert business
| analyst to use a better tool (like pandas), money is spent to
| "fix" Excel.
|
| Apparently there is also a class of problems that outgrow pandas.
| And instead of the business side switching to more suitable
| tools, some really smart people are hired to build crutches
| around pandas.
|
| Oh well, they probably had fun doing it. Maybe they get to work
| on no-GIL Python next.
| bdjsiqoocwk wrote:
| Isn't it constrained to minute-level timestamps or something like
| that?
___________________________________________________________________
(page generated 2024-08-24 23:00 UTC)