[HN Gopher] A fast SQLite PWA notebook for CSV files
       ___________________________________________________________________
        
       A fast SQLite PWA notebook for CSV files
        
       Author : mwenge
       Score  : 169 points
       Date   : 2021-12-30 10:14 UTC (12 hours ago)
        
 (HTM) web link (dirtylittlesql.com)
 (TXT) w3m dump (dirtylittlesql.com)
        
       | adolph wrote:
       | This is really wonderful! The discussion about lay people's
       | knowledge of sql reminded me that the Pandas API is often useful
       | for non-sql folk. Likewise there are some projects similar to
       | dirtylittlesql to bring Python data manipulation to the browser.
       | 
       | https://github.com/jtpio/jupyterlite
       | 
       | https://github.com/gzuidhof/starboard-notebook
        
       | mwenge wrote:
       | There are a lot of different solutions knocking around for
       | running queries on one or more CSV files. This is a web-based
       | notebook that prioritizes loading large files quickly so you can
       | get running queries on them as soon as possible. It supports most
       | text files, Excel files, JSON. It can also display your results
       | as graphs.
        
         | vmchale wrote:
         | I quite like Q for this though it doesn't have graphs.
         | 
         | Jd is also decent. More expressive languages for data.
        
           | cfeduke wrote:
           | Visidata[0] is another great tool in this vein I often forget
           | the name of when I really need it. Python, in the terminal,
           | kind of like vim on CSV with graphing capabilities.
           | 
           | 0. https://www.visidata.org/
        
             | leonim wrote:
             | Second this recommendation for the terminal. For my CLI
             | toolbox, VisiData is my favorite.
             | 
             | I find VisiData is great for quickly exploring and querying
             | data from the CLI. It can handle many types of files
             | (SQLite, CSV, TSV, Excel, JSON, YAML, etc). Visidata loads
             | all the data into memory, and so is very responsive when
             | exploring the data. It allows you to quickly do all sorts
             | of of adhoc queries interactively, without having to write
             | a valid SQL query.
             | 
             | I haven't used Q. When I first heard of it, I liked the
             | idea that Q allowed you to run random queries on CSV and
             | TSV files. However, it seemed like it would be slow if you
             | wanted to do follow up queries, since it had to repopulate
             | the in memory SQLite file for each query. Though it looks
             | like the latest version has a way to cache the generated
             | sqlite file. So that seems like it could help.
             | 
             | Also, if I have some CSV, TSV, JSONL data sqlite-utils is
             | useful for converting them to SQLite, and then exploring
             | with Visidata or SQL queries.
             | 
             | Q: https://github.com/harelba/q sqlite-utils:
             | https://github.com/simonw/sqlite-utils
        
               | chrisweekly wrote:
               | In this space I absolutely love https://lnav.org -- the
               | "mini-ETL powertool" featuring embedded SQLite and *nix-y
               | CLI for chaining / scripting.
               | 
               | Edit:
               | 
               | PS Unaffiliated, just a fan since 2016(?)
               | 
               | PPS Despite the name and original / primary use case,
               | lnav is useful for things beyond "just" logfiles(!)
        
       | stsourlidakis wrote:
       | This is really good! Would be great if there was a way to rename
       | the created tables!
        
       | cube00 wrote:
       | A PWA _and_ vanilla JavaScript; we need more of these. Fantastic
       | job!
        
         | oefrha wrote:
         | Not sure what's your definition of "vanilla JavaScript", but
         | https://dirtylittlesql.com/separators.js is clearly produced by
         | a bundler, bundling a bunch of npm packages like events,
         | buffer, etc. I would be pretty alarmed if someone wrote a
         | single 7659-line vanilla JavaScript file by hand in 2021.
         | 
         | And of course the SQLite part is wasm.
        
           | maga wrote:
           | Vanilla JS nowadays means not using a framework, in
           | yesteryears it meant not using jQuery, in either case, it
           | doesn't preclude from having dependencies.
        
           | DemocracyFTW wrote:
           | to be fair, in case the dependencies are pegged to specific
           | known-good versions and are bundled into one or a few files
           | then two concerns with dependencies go away, namely, insanely
           | deep and wide file system trees and the lingering danger of
           | any one of hundreds of software titles getting malware-ized.
           | One could even add that it's probably a good idea to prefer
           | tried-and-tested existing software over writing everything
           | from scratch.
        
         | lghh wrote:
         | Why does it matter that it's vanilla Javascript? I don't think
         | this would actually fit the general definition of vanilla
         | Javascript, but I'm more curious why it matters at all?
        
           | mikojan wrote:
           | Not OP, but in my opinion:
           | 
           | When I am reading through the source code of somebodies
           | personal project, it's absolutely amazing to:
           | 
           | - see how to do X in discernible, concise, modern JavaScript
           | 
           | - follow along significant architectural decisions
           | 
           | - watch a web API in action, not a specialized abstraction
           | that will necessarily hide some capabilities
           | 
           | That being said: This project doesn't offer much in that
           | regard. It's hard to read and makes little use of modern
           | JavaScript.
        
       | js4ever wrote:
       | Very cool, but I didn't found a way to configure the delimiter
       | (most of my files use ; as a delimiter)
        
         | mwenge wrote:
         | Forgot to add a semi-colon as a possible delimiter! Fixed it
         | there now.
        
           | lucasverra wrote:
           | hey there, i've left a github issue / feature request.
           | 
           | Column UI would be awesome
        
       | fxj wrote:
       | Love it! Could you also provide some options for decimal point
       | data? Comma instead of period.
       | 
       | It would be great if I could use it as a kernel in JupyterLite!
        
         | pletnes wrote:
         | There is a sqlite kernel for regular jupyter. Not sure about
         | jupyterlite though.
        
       | kapilvt wrote:
       | Also worth a look, built as a static web spa
       | 
       | https://github.com/lana-k/sqliteviz
        
       | monkeydust wrote:
       | What would be cool is integrating this with OpenAI Codex - the
       | natural language to SQL transformations were impressive when I
       | played around with them.
       | 
       | This could then be pitched to non-techie folk who want to simply
       | ask questions from their data on flat file.
        
         | hgarg wrote:
         | You mean like this? - https://leesaapp.com/
        
           | monkeydust wrote:
           | Yes, thanks for sharing.
        
         | eof wrote:
         | I'm skeptical of the utility of building for non techie people.
         | It takes literally tens of minutes to Learn SQL necessary to a
         | single csv file; and the overlap of people needing to query csv
         | files and capable of learning basic select syntax is probably
         | nearly perfectly overlapping.
         | 
         | I don't knock the research at all, it's very cool. But it has
         | to be basically perfect for someone even slightly experienced
         | to consider using, otherwise it will cause more frustration
         | then time saving.
        
           | Taurenking wrote:
        
           | Closi wrote:
           | > It takes literally tens of minutes to Learn SQL necessary
           | to a single csv file
           | 
           | It might take you, as a programmer or technically literate
           | person, tens of minutes to learn the SQL necessary to parse a
           | csv file.
           | 
           | But I have met a whole raft of people that would firstly take
           | much longer to learn enough SQL to get by, then would still
           | need help when basic syntax error messages came up, and then
           | would get frustrated before getting someone else to do basic
           | analysis for them.
        
             | eof wrote:
             | I have met those people to, but none of those people have
             | ever had a csv file to run queries on.
             | 
             | Analysis has a lot of meanings. Getting basic facts (there
             | are N rows that X) and selecting subsets of data are
             | realistic analysis non technical people might do on csv
             | files.
             | 
             | And I want to be clear that parsing a csv file with sqlite
             | or MySQL is much more complex than learning "SELECT .. FROM
             | .. WHERE" (with no joins!) when there is a tool like OPs
             | that magically does the "parsing".
        
               | wswope wrote:
               | > And I want to be clear that parsing a csv file with
               | sqlite or MySQL is much more complex than learning
               | "SELECT .. FROM .. WHERE" (with no joins!) when there is
               | a tool like OPs that magically does the "parsing".
               | 
               | Sqlite CSV import:
               | 
               | .mode csv
               | 
               | .import file.csv tablename
        
               | Closi wrote:
               | In reality, a basic query might be something like "Show
               | me how many units of product code 010201 were sold each
               | month this year", which actually isn't that easy to write
               | in SQL (but very easy to express in plain English).
        
       | pjmlp wrote:
       | Nice app, best of all no Electron needed.
        
         | tanin wrote:
         | Welp, I was gonna post my web0 desktop Electron app for working
         | with CSVs using SQL: https://superintendent.app
        
           | eatonphil wrote:
           | Here's another one I'm working on [0]. Run it as a desktop
           | Electron app or as a server/web app. There's also a
           | serverless/in-memory web app demo running here [1].
           | 
           | [0] https://github.com/multiprocessio/datastation
           | 
           | [1] https://app.datastation.multiprocess.io
        
       | Multrex wrote:
       | Is it available as Docker Image?
        
         | cube00 wrote:
         | Given it's a PWA couldn't you just install it and use it
         | offline?
        
       | mynameismon wrote:
       | Similar: Microsoft LogParser (https://www.microsoft.com/en-
       | in/download/details.aspx?id=246...)
        
       ___________________________________________________________________
       (page generated 2021-12-30 23:01 UTC)