[HN Gopher] Show HN: Spice.ai - materialize, accelerate, and que...
       ___________________________________________________________________
        
       Show HN: Spice.ai - materialize, accelerate, and query SQL data
       from any source
        
       Hi HN, we're Luke and Phillip, and we're building Spice.ai OSS - a
       lightweight, portable runtime, built in Rust and powered by Apache
       DataFusion, to locally materialize, accelerate, and query data
       tables sourced from any database, data warehouse, or data lake.
       
       Phillip and I first introduced Spice on Show HN in September 2021.
       Since then, we've been schooled and humbled in every way building
       100TB+ data and ML systems for the https://spice.ai cloud platform.
       Along with our customers, we struggled with getting fast,
       low-latency, high-concurrency SQL queries within a budget,
       accessing and combining data from many sources, trade-offs between
       OLTP/OLAP compute engines, and managing datasets as code.
       
       Today, we're re-launching Spice, completely rebuilt from the ground
       up, to directly solve several of the problems we had in accessing
       data quickly and cost-effectively providing it to applications,
       dashboards, and machine learning. Spice provides federated SQL
       query across databases (MySQL, PostgreSQL, etc.), data warehouses
       (Snowflake, BigQuery, etc.), and data lakes (S3, MinIO, Databricks,
       etc.) with the ability to materialize remote datasets locally using
       in-memory Arrow, DuckDB, SQLite, or PostgreSQL. Accelerated engines
       run in your infrastructure, giving you flexibility and control over
       price and performance.
       
       You can read the full announcement blog post at
       https://blog.spiceai.org/posts/2024/03/28/adding-spice-the-n....
       We'd appreciate it if you check Spice out, give us feedback, and if
       you'd like to contribute, we'd love to build with you. Thanks!
       
       GitHub: https://github.com/spiceai/spiceai
        
       Author : lukekim
       Score  : 98 points
       Date   : 2024-03-28 17:16 UTC (5 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | dvdsgl wrote:
       | Congrats on the launch! This is exciting. The video demo is
       | awesome: https://youtu.be/AZyrecVWnEs?si=j7JVKhhcUor1_y-f
        
         | lukekim wrote:
         | Thank you!
        
       | jjustin_lawson wrote:
       | Congrats on the launch, team!
        
       | alamb wrote:
       | So great to see another project built on DataFusion!
        
       | watsondoc wrote:
       | Wow, looks promising
        
       | leeholim wrote:
       | Congratulations on the launch!!
        
       | cedrone wrote:
       | Congrats Luke & Phillip - exciting day!
        
       | dwgray wrote:
       | This looks great - I've been meaning to dig into Rust - seems
       | like a solid choice for you.
        
       | mritchie712 wrote:
       | Very cool!
       | 
       | One thing to keep in mind:
       | 
       | DuckDB can directly query Parquet files (and many other file
       | types[1]), MySQL, Postgres[0], and SQLite. So if you're in need
       | of something like this, DuckDB on its own might work for your
       | use case.
       | 
       | 0 - https://duckdb.org/docs/extensions/postgres
       | 
       | 1 - https://twitter.com/thisritchie/status/1767922982046015840
        
         | lukekim wrote:
         | Yes, we're huge fans of DuckDB, Mark, Hannes and the team.
         | 
         | What we've found is that sometimes you want to materialize data
         | in an OLTP DB, so Spice gives you the choice to store some
         | datasets in DuckDB and some in something like SQLite/PostgreSQL
         | and join them together in a single SQL query, so you can get
         | the best of both worlds.
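       
       A minimal sketch of the idea in the reply above (datasets held in
       separate stores, joined in one SQL query). This is NOT Spice's API:
       it uses two in-memory SQLite databases linked with ATTACH as a
       stand-in for two differently accelerated datasets, and the table
       names, column names, and rows are hypothetical.

```python
import sqlite3


def build_connection() -> sqlite3.Connection:
    """Create two separate in-memory stores reachable from one connection."""
    conn = sqlite3.connect(":memory:")
    # ATTACH opens a second, independent in-memory database under the
    # schema name "analytics"; the first database is always "main".
    conn.execute("ATTACH DATABASE ':memory:' AS analytics")

    # Hypothetical dataset kept in the first store.
    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "ada"), (2, "grace")],
    )

    # Hypothetical dataset kept in the second store.
    conn.execute("CREATE TABLE analytics.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO analytics.orders VALUES (?, ?)",
        [(1, 10.0), (1, 5.5), (2, 7.0)],
    )
    return conn


def totals_by_customer(conn: sqlite3.Connection) -> list:
    """Join across both stores in a single SQL statement."""
    return conn.execute(
        """
        SELECT c.name, SUM(o.amount) AS total
        FROM main.customers AS c
        JOIN analytics.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    for name, total in totals_by_customer(build_connection()):
        print(name, total)
```

       In a real deployment the two stores would be different engines
       (for example DuckDB and PostgreSQL) behind a federating query
       layer; the sketch only shows the shape of the cross-store join.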
        
       | alex_hirner wrote:
       | Looks great! Is FlightSQL supported over the wire too, so one
       | could hook it up to Grafana? Any plans to support Iceberg?
        
         | jeadie wrote:
         | Yes! It can connect to FlightSQL-compatible servers (see
         | https://docs.spiceai.org/data-connectors/flightsql ) and it's
         | also a FlightSQL-compatible server.
        
           | lukekim wrote:
           | We also have a Grafana plugin that we'll continue to improve to
           | make it super easy to connect to Grafana. Spice also has a
           | metrics endpoint and an example Grafana dashboard for
           | monitoring itself:
           | https://github.com/spiceai/spiceai/blob/trunk/monitoring/gra...
        
         | jeadie wrote:
         | And yes, Iceberg is very high up on our list
        
       | lmeyerov wrote:
       | Any sense of comparison to Dremio, which helped steward the Arrow
       | ecosystem for doing this kind of thing?
       | 
       | (The idea is great fwiw, I've been following them one-off for
       | years, and we have to do elements of these things in how we build
       | louie.ai and Graphistry for the GPU equivalent. Real pain point!)
        
         | lukekim wrote:
         | Dremio is awesome. We've followed the Dremio journey from one
         | of Jacques' original talks a couple of years back. Dremio's
         | idea of caching tiers and reflections is powerful for
         | performance.
         | 
         | Spice takes it further and provides flexibility for
         | materialization, giving you full control over where that
         | materialization exists (same machine, same pod, same network,
         | same cluster, same region, etc.), which engine it uses (OLTP:
         | SQLite/PostgreSQL; OLAP: DuckDB/Arrow), and which storage tier
         | holds it (in-memory, attached NVMe, etc.), all configurable
         | down to the dataset level.
        
       | ignoramous wrote:
       | > _Today, we're re-launching Spice..._
       | 
       |     Obtaining blockchain and smart-contract data is hard ...
       |     Spice makes it easy.
       | 
       | http://web.archive.org/web/20220414105622/https://docs.spice...
       | 
       | A slight detour from the company's original vision
       | (https://archive.is/88IoQ)?
        
         | lukekim wrote:
         | Actually, we posted the original vision for AI-driven
         | applications in Sep 2021 at
         | https://blog.spiceai.org/posts/2021/09/07/introducing-spice....
         | and discussed needing a good source of data at
         | https://blog.spiceai.org/posts/2021/12/05/ai-needs-ai-ready-....
         | 
         | We believe blockchain data is one of the most interesting
         | time-series datasets to work with when developing an AI-driven
         | application platform, because it's continuous, well-structured,
         | has many applications, and is open to index. Regardless of
         | views on crypto, from a purely technical/data feed perspective,
         | it's quite useful for testing time-series systems.
        
         | spxneo wrote:
         | Wow, thanks for pointing this out.
         | 
         | I short-circuit whenever I see the B word, or anyone still
         | pushing this smart-contract nonsense that, in the 10+ years
         | this technology has existed, isn't being used in serious
         | real-world projects with legal repercussions.
        
       | martinmao wrote:
       | This looks awesome!
        
       | neeleshs wrote:
       | Congratulations. Is this similar to Trino/Starburst, Drill?
        
         | lukekim wrote:
         | Thank you!
         | 
         | Yes, in terms of federated queries, there are similarities, but
         | Spice is designed to be much smaller, faster, and more
         | lightweight (single binary, 140MB) so you can run it next to
         | your application as a sidecar, or eventually even in the
         | browser. Spice also gives you more options and flexibility for
         | materialization, so you can choose where and how to store
         | locally materialized data.
        
       ___________________________________________________________________
       (page generated 2024-03-28 23:00 UTC)