[HN Gopher] Parquet-WASM: Rust-based WebAssembly bindings to rea...
___________________________________________________________________
Parquet-WASM: Rust-based WebAssembly bindings to read and write
Parquet data
Author : kylebarron
Score : 123 points
Date : 2024-04-22 15:10 UTC (7 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| jasonjmcghee wrote:
| Seeing as the popular alternative here would be DuckDB-WASM,
| which (last time I checked) is on the order of 50MB, this is
| comparatively super lightweight.
| leeoniya wrote:
| i think duckdb-wasm is closer to 6MB over wire, but ~36MB once
| decompressed. (see net panel when loading
| https://shell.duckdb.org/)
|
| the decompressed size should be okay since it's not the same as
| parsing and JITing 36MB of JS.
| leeoniya wrote:
| in my [albeit outdated] experience ArrowJS is quite a bit slower
| than using native JS types. i feel like crossing the WASM<>JS
| boundary is very expensive, especially for anything other than
| numbers/typed arrays.
|
| what are people's experiences with this?
| ingenieroariel wrote:
| I'll let Kyle chime in but I tested it a few months ago with
| millions of polygons on an M2 16GB of RAM laptop and it worked
| very well.
|
| There is a library by the same author called lonboard that
| provides the JS bits inside JupyterLab.
| https://github.com/developmentseed/lonboard
|
| <speculation>I think it is based on the Kepler.gl / Deck.gl
| data loaders that go straight to GPU from
| network.</speculation>
| kylebarron wrote:
| Arrow JS is just ArrayBuffers underneath. You do want to
| amortize some operations to avoid unnecessary conversions. I.e.
| Arrow JS stores strings as UTF-8, but native JS strings are
| UTF-16 I believe.
|
| Arrow is especially powerful across the WASM <--> JS boundary!
| In fact, I wrote a library to interpret Arrow from Wasm memory
| into JS without any copies [0]. (Motivating blog post [1])
|
| [0]: https://github.com/kylebarron/arrow-js-ffi
|
| [1]: https://observablehq.com/@kylebarron/zero-copy-apache-
| arrow-...
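
To make the point about UTF-8 versus UTF-16 concrete, here is a minimal sketch of why amortizing conversions matters. It does not use the Arrow JS API. The `Utf8Column` layout and the helper names `buildColumn` and `decodeAll` are hypothetical and only imitate how an Arrow string column keeps one UTF-8 byte buffer plus an offsets array. Each JS string has to be decoded out of those bytes, so it is usually cheaper to decode once and reuse the result than to decode on every access.

```typescript
// Hypothetical layout imitating an Arrow UTF-8 string column:
// bytes offsets[i]..offsets[i + 1] of `values` hold string i.
interface Utf8Column {
  offsets: Int32Array;
  values: Uint8Array;
}

// Pack JS strings into a single UTF-8 buffer plus offsets.
function buildColumn(strings: string[]): Utf8Column {
  const encoder = new TextEncoder();
  const chunks = strings.map((s) => encoder.encode(s));
  const offsets = new Int32Array(strings.length + 1);
  let total = 0;
  chunks.forEach((chunk, i) => {
    total += chunk.length;
    offsets[i + 1] = total;
  });
  const values = new Uint8Array(total);
  chunks.forEach((chunk, i) => values.set(chunk, offsets[i]));
  return { offsets, values };
}

// Decode every string once. `subarray` creates a view over the same
// memory (no byte copy); the decode step is where the conversion to a
// native JS string happens.
function decodeAll(col: Utf8Column): string[] {
  const decoder = new TextDecoder("utf-8");
  const out: string[] = new Array(col.offsets.length - 1);
  for (let i = 0; i < out.length; i++) {
    const bytes = col.values.subarray(col.offsets[i], col.offsets[i + 1]);
    out[i] = decoder.decode(bytes);
  }
  return out;
}
```

Numeric columns avoid this cost entirely, since a typed array view over the buffer can be read directly without any per-value conversion.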
| domoritz wrote:
| One of the ArrowJS committers here. We have fixed a few
| significant performance bottlenecks over the last few versions
| so try again. Also, I'm always curious to see specific use
| cases that are slow so we can make ArrowJS even better. Some
| limitations are fundamental and you may be better off
| converting to the corresponding JS types (which should be
| fast).
| rubenvanwyk wrote:
| Can this read and write Parquet files to S3-compatible storage?
| kylebarron wrote:
| It can read from HTTP urls, but you'd need to manage signing
| the URLs yourself. On the writing side, it currently writes to
| an ArrayBuffer, which then you could upload to a server or save
| on the user's machine.
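
As a rough sketch of the upload half of that answer: once the Parquet bytes are in an ArrayBuffer, they can be sent with an ordinary HTTP PUT. Nothing below is part of parquet-wasm. The `UploadInit` type and `parquetUploadInit` helper are hypothetical names, and the sketch assumes a pre-signed URL has already been obtained from somewhere else (for example, your own backend).

```typescript
// Hypothetical shape of the options handed to fetch(); declared locally
// so the sketch does not depend on DOM or Node type definitions.
interface UploadInit {
  method: "PUT";
  headers: Record<string, string>;
  body: Uint8Array;
}

// Wrap the written Parquet bytes in PUT options for a pre-signed URL.
function parquetUploadInit(buffer: ArrayBuffer): UploadInit {
  return {
    method: "PUT",
    // Generic binary content type; adjust to whatever your storage expects.
    headers: { "Content-Type": "application/octet-stream" },
    // A view over the same memory, not a copy of the bytes.
    body: new Uint8Array(buffer),
  };
}
```

With a pre-signed URL in hand, the call would look something like `await fetch(presignedUrl, parquetUploadInit(buffer))`, where `presignedUrl` is a placeholder for a URL your server signed.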
___________________________________________________________________
(page generated 2024-04-22 23:00 UTC)