[HN Gopher] Xlite: Query Excel and Open Document spreadsheets as...
___________________________________________________________________
Xlite: Query Excel and Open Document spreadsheets as SQLite virtual
tables
Author : thunderbong
Score : 201 points
Date : 2022-06-25 12:34 UTC (10 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| mwenge wrote:
| https://dirtylittlesql.com and https://io10.dev also support
| this.
| _qua wrote:
| Well this is the coolest thing I've seen all week.
| olah_1 wrote:
| I'm looking for a way to have a searchable/filterable CSV file
| of 30k rows as a web app.
|
| Some kind of fork of DirtyLittleSQL looks like it is the
| solution!
| bsilvereagle wrote:
| One-liner for doing this for CSVs from a few days ago:
| https://news.ycombinator.com/item?id=31824030
| damagednoob wrote:
| As another alternative, you can use in2csv[1] to convert excel
| documents to csv and pipe them into the other tools of csvkit.
|
| [1]:
| https://csvkit.readthedocs.io/en/latest/tutorial/1_getting_s...
| eatonphil wrote:
| This is a cool project! But if you just want a simple CLI, you
| can query Excel and ODS files with dsq [0]. It's the same end
| result without needing to carry plugins around and with support
| for many additional data formats. Plus a growing standard library
| of functions that don't come built into SQLite such as best-
| effort date parsing, URL parsing/extraction, statistical
| aggregation functions, math functions, string and regex helpers,
| hashing functions and so on [1].
|
| An annoying thing about this extension-based style of file
| support is needing to create a new table for every new file if
| the schema is different. This is a limitation [2] of sqlite
| unfortunately. dsq doesn't work this way so it doesn't have that
| limit.
|
| On the other hand, if you go this route you can more easily
| combine with other extensions. That's not really possible with
| dsq right now.
|
| [0] https://github.com/multiprocessio/dsq
|
| [1] https://github.com/multiprocessio/go-sqlite3-stdlib
|
| [2] https://sqlite.org/forum/forumpost/ec944414fa
| sitkack wrote:
| I think when Wasm gets a little bit more mature, it will enable
| the creation of more composable tools. One might be able to hot
| create modules? Kinda ironic that this is how it is structured
| given that arbitrary projections is what SQL does.
|
| Is there a way to hook lower in the SQLite stack, and make it
| think that your DS is something it already understands?
|
| What about a function that returns a view that it dynamically
| generates?
| eatonphil wrote:
| Sqlite does not allow table valued functions to return
| results with varying schemas. The schema must be defined once
| up front and then cannot change.
|
| That forum post is a request I made to them to allow dynamic
| columns but they don't seem interested so far.
| sitkack wrote:
| I get that as how it currently stands. But somewhere in the
| sqlite codebase, everything is a projection.
|
| Still thinking and reading through sqlite source ...
| kbd wrote:
| Do any of these tools take into account hidden Excel rows? It's a
| constant frustration.
| eatonphil wrote:
| Dunno! But if you send me a sample I can try it out for dsq.
| kbd wrote:
| Sweet. Here's my open issue with VisiData about this:
| https://github.com/saulpw/visidata/issues/1398
|
| There's a sample file at the end of that thread.
| no_identd wrote:
| Do you have a workaround workflow in the meantime? I could
| imagine a few non-Macro or Macro-light solutions to this.
| Use a pivot table for example, the results of which I
| believe get 'magiced' into their destination cells instead
| of getting hidden, I however don't recall whether they
| auto-inherit autofilters set on the source table and/or
| source cell range.
|
| How much organizational (not technical, we'll get to that)
| control/authority/input/stakeholder role do you have on the
| source document templates?
|
| Also, you might want to check out
| https://github.com/microsoft/advanced-formula-environment/
| kbd wrote:
| > Do you have a workaround workflow in the meantime?
|
| Yeah in the VisiData ticket I link to a post[1] with some
| code to handle hidden rows and columns in
| Pandas+openpyxl.
|
| [1] https://towardsdatascience.com/how-to-load-excel-
| files-with-...
| justsomeuser wrote:
| How do virtual tables compare with regular tables for
| performance?
|
| I would assume SQLite has some optimisations for native tables
| (rather than reading the data from another virtual table backed
| file)?
| lazypenguin wrote:
| It's implementation dependent. You implement an interface and
| it's up to the implementation to optimize the calls. The
| interface is also rather simplistic so there's some
| optimization that can't be done at the implementation level
| since SQLite didn't provide all the information at the time of
| the query.
| Fnoord wrote:
| rga (ripgrep all) is basically rg (ripgrep) for binaries.
| Linda703 wrote:
| pmeira wrote:
| FreeXL is another alternative: https://www.gaia-
| gis.it/fossil/freexl/index
| don-code wrote:
| In addition to just loading and querying data, spreadsheets are
| great as database mocks in integration tests. I long ago used
| XLSQL (note: there is a new XLSQL project unrelated to this one)
| to mock an Oracle database:
| https://sourceforge.net/projects/xlsql/.
___________________________________________________________________
(page generated 2022-06-25 23:00 UTC)