[HN Gopher] Inserting 130M SQLite Rows per Minute from a Scripti...
___________________________________________________________________
Inserting 130M SQLite Rows per Minute from a Scripting Language
Author : mpweiher
Score : 71 points
Date : 2021-07-24 20:11 UTC (2 hours ago)
(HTM) web link (blog.metaobject.com)
(TXT) w3m dump (blog.metaobject.com)
| macintux wrote:
| Discussion of the referenced article, _Inserting a billion rows
| in under a minute_ :
|
| https://news.ycombinator.com/item?id=27872575
| IfOnlyYouKnew wrote:
| Last time this was here, I ran this and got the following
| results:
|
|     /fast-sqlite3-inserts (master)> time make busy-rust
|     Sun Jul 18 17:04:59 UTC 2021 [RUST] busy.rs (100_000_000) iterations
|
|     real    0m9.816s
|     user    0m9.380s
|     sys     0m0.433s
|     ________________________________________________________
|     Executed in    9.92 secs    fish           external
|        usr time    9.43 secs    0.20 millis    9.43 secs
|        sys time    0.47 secs    1.07 millis    0.47 secs
|
|     fast-sqlite3-inserts (master)> time make busy-rust-thread
|     Sun Jul 18 17:04:48 UTC 2021 [RUST] threaded_busy.rs (100_000_000) iterations
|
|     real    0m2.104s
|     user    0m13.640s
|     sys     0m0.724s
|     ________________________________________________________
|     Executed in    2.33 secs    fish           external
|        usr time   13.68 secs    0.20 millis   13.68 secs
|        sys time    0.78 secs    1.18 millis    0.78 secs
|
| I'm probably doing something wrong. Or I'm getting the pace
| needed for the billion?
|
| This is on a M1 MacBook Air.
| marvel_boy wrote:
| > inserted 10M rows in 4.328 seconds
|
| Not bad really. Objective-S is impressive.
| taneq wrote:
| Surely this is bandwidth limited? Any language should be fine as
| long as it's not gratuitously awful.
| aasasd wrote:
| Personally I'm more interested in _indexing_ a boatload of data
| as fast as possible, with modest resource requirements. Had a
| couple of cases where I'd like to do that on a mid-tier laptop,
| with no particular success so far. I have to guess, but it seems
| that 'data scientists' either buy big fat boxes with tons of ram
| and cpu, or offload everything to big fat boxes in datacenters,
| or twiddle thumbs for quite a while. You'd think that by now
| writing indexes on all columns at the top sequential drive speed
| would be a solved problem from any 'Learn data science in two
| days' tutorial.
| bob1029 wrote:
| It is certainly feasible to saturate NVMe with just index
| writes in many niche implementations today. The trick is
| usually copious amounts of batching so that IO can do more per
| unit.
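
The batching idea above can be sketched with Python's standard `sqlite3` module. This is only an illustration under assumptions: the table name `readings`, its columns, the helper names, and the batch size of 10,000 are all hypothetical and not taken from the thread or the article. Each batch goes through a single `executemany` call inside a single transaction, so one commit covers many rows.

```python
import sqlite3
from itertools import islice


def batched(iterable, size):
    """Yield lists of up to `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_in_batches(conn, rows, batch_size=10_000):
    """Insert rows with one executemany call and one commit per batch."""
    total = 0
    for chunk in batched(rows, batch_size):
        with conn:  # commits once when the block exits without error
            conn.executemany(
                "INSERT INTO readings (sensor, value) VALUES (?, ?)", chunk
            )
        total += len(chunk)
    return total


# Hypothetical schema and data, used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor INTEGER, value REAL)")
rows = ((i % 100, i * 0.5) for i in range(50_000))
inserted = insert_in_batches(conn, rows)
print(inserted, "rows inserted")
```

Committing once per batch rather than once per row is the part that matters here; the best batch size depends on row width and storage, so it is worth measuring rather than assuming.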
| remram wrote:
| How does vanilla SQLite 'create index' fare, with correct
| pragmas e.g. cache_size?
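
One way to try this yourself is a small timing harness like the one below, again using Python's `sqlite3`. It is a sketch and does not answer the question with real numbers: the table `t`, the index name `idx_t_b`, the row count, and the cache size of roughly 256 MiB are assumptions chosen for illustration. A negative `cache_size` value is interpreted by SQLite as a size in KiB rather than a page count.

```python
import sqlite3
import time

# ":memory:" keeps the sketch self-contained; use an on-disk file
# for a measurement that reflects real storage behaviour.
conn = sqlite3.connect(":memory:")

# Negative value = size in KiB, so this asks for about 256 MiB of page cache.
conn.execute("PRAGMA cache_size = -262144")

# Hypothetical table with a column whose values arrive in scattered order.
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
with conn:
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        ((i, (i * 7919) % 100_003) for i in range(200_000)),
    )

# Time only the index build, after the data is already loaded.
start = time.perf_counter()
conn.execute("CREATE INDEX idx_t_b ON t (b)")
elapsed = time.perf_counter() - start

# Column 1 of each index_list row is the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('t')")]
print(f"index build took {elapsed:.3f}s; indexes on t: {index_names}")
```

Building the index after the bulk load, as done here, avoids updating the index on every insert; whether a larger cache helps noticeably depends on how the data size compares to available memory.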
| pupdogg wrote:
| Sorry, I don't mean to hijack the original post, but for
| performant inserts and indexing (which I assume is for
| analysis), I'd recommend using ClickHouse or QuestDB.
| wiredfool wrote:
| I'd second the rec for ClickHouse.
___________________________________________________________________
(page generated 2021-07-24 23:00 UTC)