[HN Gopher] Csvlens: Command line CSV file viewer. Like less but...
       ___________________________________________________________________
        
       Csvlens: Command line CSV file viewer. Like less but made for CSV
        
       Author : ingve
       Score  : 295 points
       Date   : 2024-01-06 09:07 UTC (13 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | jwiz wrote:
       | See also: https://www.visidata.org/
        
         | mistaken wrote:
         | visidata is awesome. It also supports a bunch of other file
         | formats such as excel sheets.
        
         | standing_user wrote:
         | Didn't know about it, thank you
        
         | rrr_oh_man wrote:
         | > Scatterplots in the terminal
         | 
         | Wow!
        
         | jellyfishbeaver wrote:
         | Strongly considering donating to the creator of Visidata, it's
         | one of the terminal tools I can't live without. I use it every
         | day.
        
           | bbkane wrote:
           | If it's that important and you can afford a few bucks a
           | month, I strongly urge you to stop considering your donation,
           | and (right now) follow the links on the GitHub to actually
           | start supporting the project.
           | 
           | It's really easy to wallow in the "considering" phase of a
           | decision indefinitely (or maybe that's just me :D)
        
           | letmeinhere wrote:
           | huh, something (maybe similarity to VisiCalc name) had me
           | dismissing it as commercial software
        
           | jwiz wrote:
           | Yep, I am doing the $5/mo Patreon level because I love it so
           | much.
        
         | zoogeny wrote:
         | I've been thinking about a terminal spreadsheet editor for
         | about a year now. I haven't deeply investigated this tool while
         | I write this comment but I'd love for something like this for
         | Terminal that supported `=A1+B2` kind of formulas.
        
           | runjake wrote:
           | Back in the day, we called this Lotus 123 and VisiCalc.
           | 
           | https://en.m.wikipedia.org/wiki/Lotus_1-2-3
        
             | zoogeny wrote:
             | Yes, exactly - imagine a modern version written in Rust,
             | with vi key-bindings, etc. Please someone be inspired by
             | this idea and build this. TUIs are making a comeback and
             | the spreadsheet is still one of the most useful and
             | effective tools in all information science.
        
             | PaulHoule wrote:
             | The spreadsheet data model lives a double life.
             | 
             | (1) On one hand you can use it for data that is really
             | tabular,
             | 
             | (2) But it also has an engine that can compute the
             | dependency relationships between cells and recalculate the
             | cells affected by a change. This is quite different from
             | conventional programming languages where you are required
             | to specify an order to put operations in. Of course this is a
             | problem for exploiting parallelism but I'd charge it is one
             | more bit of cognitive load that makes it harder for
             | beginners and non-professional programmers. (Professional
             | programmers are just used to it and only run into problems
             | in unusual cases where circularity is involved, but I think
             | it's one more thing that beginners struggle with.)
             | 
             | The worst problem is a lack of separation between code and
             | data. If you are doing an analysis you might put something
             | like
             | 
             |     =SUM(A1:A28)
             | 
             | in A30. Excel will change that to
             | 
             |     =SUM(A1:A29)
             | 
             | if you insert a row in there, which helps, but they lock
             | you into the mindset of "I'm making the December sales
             | report" as opposed to "I'm making the monthly sales
             | report". That is, the data and the analysis should be two
             | separate things: just as you can put the December data or
             | the January data into a Python script.
             | 
             | (Note some of the same problems still exist with
             | "notebooks" and "workspaces" where you wind up with a file
             | that has both code and data in it which can be problematic
             | to check into git, particularly when you are working for
             | people who would like to have a beautiful notebook with an
             | analysis in it to view in GitHub but will then struggle to
             | version control it. Many "data scientists" fail to rise
             | above the December sales report even though there's a clear
             | path to turn a Jupyter notebook into a Python script.)
             | 
             | ---
             | 
             | As much as that is a rant I think there's a huge untapped
             | market for things that are like spreadsheets but different.
             | That is, something that looks like Excel but is specialized
             | for editing tabular data (no formulas, CSV import 'just
             | works' all the time, ...) or something that has formulas
             | like Excel but not on a grid or not just on a grid. For the
             | latter there was this product
             | 
             | https://en.wikipedia.org/wiki/TK_Solver
             | 
             | which is still around. People were amazed with TK Solver
             | when it came out and I'm surprised to this day that it
             | hasn't had a lot of competition.
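
The code/data separation described above can be sketched in Python: the analysis is a function, and the month's data is only an argument to it. This is a minimal illustration, not anyone's actual report script; the column name `amount` and the sample figures are made up.

```python
import csv
import io


def column_total(csv_text: str, column: str) -> float:
    """Sum one numeric column of a CSV document, whatever month it holds."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(float(row[column]) for row in reader)


# Hypothetical data: the same analysis runs unchanged on either month.
december = "item,amount\nwidgets,120.50\ngadgets,79.50\n"
january = "item,amount\nwidgets,10\ngadgets,20\nsprockets,30\n"

print(column_total(december, "amount"))
print(column_total(january, "amount"))
```

Unlike `=SUM(A1:A28)`, nothing here encodes how many rows the data has, so the "monthly report" needs no edits when the row count changes.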
        
               | wlindley wrote:
               | We had that once, in an MSDOS-era program called Javelin
               | ( https://en.wikipedia.org/wiki/Javelin_Software ):
               | 
               | "Unlike models in a spreadsheet, Javelin models are built
               | on objects called variables, not on data in cells of a
               | report. For example, a time series, or any variable, is
               | an object in itself, not a collection of cells which
               | happen to appear in a row or column... Calculations are
               | performed on these objects, as opposed to a range of
               | cells, so adding two time series automatically aligns
               | them in calendar time, or in a user-defined time frame.
               | Data are independent of worksheets..."
               | 
               | That was forty years ago and as far as I know there
               | hasn't been a program that works the same way since.
               | Perhaps someone could reverse-engineer it (the jav.exe
               | file is only 56,448 bytes!)
        
               | bee_rider wrote:
               | Spreadsheets are kind of interesting tech. Because the
               | user is more-or-less directly building the data
               | dependency graph, I'd expect them to map to actual hardware
               | (massively parallel in so many dimensions) better than a
               | sequential procedural language like C.
               | 
               | But, they are not very fashionable.
               | 
               | It would be fun to have a pipeline to compile a
               | spreadsheet, that produces an executable. Potential
               | outputs: cpu, gpu, fpga. Haha.
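
The dependency-graph idea discussed in this subthread can be sketched in Python with the standard library's `graphlib`. This is only an illustration under stated assumptions: the four-cell sheet and its formulas are hypothetical, and the sketch re-evaluates every cell rather than only those affected by a change, which a real spreadsheet engine would avoid.

```python
from graphlib import TopologicalSorter

# Hypothetical sheet: each cell maps to (cells it reads, formula).
formulas = {
    "A1": ((), lambda v: 2),
    "B2": ((), lambda v: 3),
    "C1": (("A1", "B2"), lambda v: v["A1"] + v["B2"]),  # like =A1+B2
    "D1": (("C1",), lambda v: v["C1"] * 10),
}


def recalculate(formulas):
    """Evaluate every cell after the cells it depends on."""
    graph = {cell: set(deps) for cell, (deps, _) in formulas.items()}
    values = {}
    # static_order() yields each cell only after all of its dependencies.
    for cell in TopologicalSorter(graph).static_order():
        values[cell] = formulas[cell][1](values)
    return values


print(recalculate(formulas))
```

The user never states an evaluation order; it falls out of the graph. A circular reference makes `static_order()` raise `graphlib.CycleError`, which corresponds to the circularity case mentioned earlier in the thread.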
        
           | franek wrote:
           | While not built around CSV, two terminal spreadsheet tools I
           | have successfully used in the past are sc-im and the (neo)vim
           | plugin vim-table-mode:
           | 
           | https://github.com/andmarti1424/sc-im/
           | 
           | https://github.com/dhruvasagar/vim-table-mode
           | 
           | Back then I stopped using sc-im because it could not
           | import/export XLSX, if I remember correctly. Apparently it
           | can today!
           | 
           | vim-table-mode always felt a little fragile and I don't want
           | to be bound to vim anymore. That said, it still feels like a
           | small miracle to me to have functional spreadsheet formulas
           | inside markdown documents - calculation and typesetting all
           | in one place.
        
           | genericacct wrote:
           | I wrote an SC fork that does this, will send a link when I'm
           | home
        
         | somat wrote:
         | Can confirm, I consider visidata to be the gold standard for
         | data exploration.
         | 
         | I have largely given up on spreadsheets, replacing them with
         | relational databases. And on top of being a world class tool
         | for understanding the data for import, visidata makes for a
         | pretty great query pager.
        
         | retrochameleon wrote:
         | I've been using Visidata and love it for viewing, extracting
         | rows or columns to a different file, but I know it is capable
         | of much more that I haven't figured out yet
        
       | musha68k wrote:
       | Not on "computer" but some such in your pipeline should be good
       | enough more often than not?
       | 
       | awk -v OFS='\t' '{$1=$1; print}' | column -t
        
         | cowsandmilk wrote:
         | many csv data sets cannot be handled by awk due to csv
         | supporting things like commas inside fields, newlines inside
         | fields, etc. Your solution would also fall over for any tabs
         | within fields. Plenty of tools out there support RFC 4180 CSV
         | files as well as common csv variants. And these tools have been
         | around for a decade and have suites of cli commands that can be
         | piped together, just speaking "CSV" as a format rather than
         | pure text streams.
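
To make the failure mode above concrete, here is a small Python comparison between splitting on commas and using a real CSV parser. The sample record is made up; it has a comma inside one quoted field and a newline inside another.

```python
import csv
import io

# Made-up sample: one field holds a comma, another holds a newline.
data = 'id,title,notes\n1,"Bug, crash on start","line one\nline two"\n'

# Naive approach: treat every line as a record and every comma as a separator.
naive = [line.split(",") for line in data.splitlines()]

# CSV-aware approach: the parser tracks quoting across commas and newlines.
parsed = list(csv.reader(io.StringIO(data, newline="")))

print(len(naive), "records from naive splitting")
print(len(parsed), "records from csv.reader")
print(parsed[1])
```

The naive version sees three records and breaks the quoted fields apart, while `csv.reader` returns the header plus one three-field record.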
        
           | musha68k wrote:
           | Good points on using proper CSV tooling 100%; so that's why
           | the disclaimer "more often than not"
           | 
           | BTW I just realized that I was omitting the field separator
           | '-F,' actually.
           | 
           | For those pesky nested commas from top off simplistic text
           | hat still I'd go back to sed for fun and potential profit:
           | 
           | sed 's/","/"\t"/g; s/^"/"\t/; s/"$//; s/,,/, ,/g' | column -t
           | 
           | Not elegant nor hands-off and might be missing something
           | else; def needs the right data / volume and fiddling as we
           | know ymmv; can get unwieldy quickly etc etc
           | 
           | TLDR; use column -t when applicable
        
           | prudentpomelo wrote:
           | Awk now supports a `--csv` flag for processing csv's.
           | https://github.com/onetrueawk/awk/blob/master/README.md
        
             | musha68k wrote:
             | oh nice TIL
        
             | snidane wrote:
             | How does one output properly csv quoted rows? It seems the
             | csv flag works only for parsing inputs.
        
       | kkoncevicius wrote:
       | Does it allow specifying columns when filtering for rows?
       | 
       | For instance, the shown example filters for "Bug" but seems to
       | filter for rows that contain "Bug" in any of the columns. Can I
       | specify to filter for "Bug" only looking at a certain column?
       | 
       | At the same time - is filtering by numbers implemented, like
       | filtering for rows that have a numeric value above X in a certain
       | column?
       | 
       | If not these should be on TODO list - those are very common
       | operations for .csv type data.
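
Independent of what csvlens itself supports, the two operations asked about here (matching within one column, and a numeric threshold on one column) can be sketched in a few lines of Python. The column names `type` and `votes`, the sample rows, and the helper `filter_rows` are all hypothetical.

```python
import csv
import io

# Made-up sample; note that row 2 contains "Bug" only in its title.
sample = (
    "id,type,title,votes\n"
    "1,Bug,Crash on start,12\n"
    "2,Feature,Bug tracker export,40\n"
    "3,Bug,Typo in help,3\n"
)


def filter_rows(csv_text, column, predicate):
    """Keep rows whose value in `column` satisfies `predicate`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if predicate(row[column])]


bugs = filter_rows(sample, "type", lambda v: v == "Bug")
popular = filter_rows(sample, "votes", lambda v: float(v) > 10)

print([row["id"] for row in bugs])
print([row["id"] for row in popular])
```

Restricting the match to the `type` column excludes row 2, which an any-column search for "Bug" would have returned.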
        
         | dmd wrote:
         | You want https://www.visidata.org/
        
       | account-5 wrote:
       | I love these sorts of programs. Is there a list somewhere of data
       | programs like this for the command line? I'm sure there will be.
        
       | drog wrote:
       | One of the things that greatly improved my csv workflow is
       | duckdb. It's a small binary that allows querying csv with sql.
        
         | lyjackal wrote:
         | You can do the same with SQLite, which is usually already
         | installed in most environments
         | 
         |     sqlite> .import test.csv foo --csv
        
           | llimllib wrote:
           | A poor man's version of csvlens is something like:
           | 
           |     sqlite -column :memory: '.import --csv file.csv tmp' 'select * from tmp;' | bat
           | 
           | which imports the csv into sqlite and outputs it to bat, my
           | favorite pager - use `less` or whatever else you desire.
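
The same import-then-query idea can be sketched with Python's built-in `sqlite3` module instead of the SQLite shell. This is an illustrative sketch, not a replacement for `.import`: the helper `load_csv`, the table name, and the sample data are assumptions, and every column is stored as TEXT.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Create `table` from CSV text, one TEXT column per header field."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', body)


# Hypothetical data standing in for file.csv.
conn = sqlite3.connect(":memory:")
load_csv(conn, "tmp", "name,qty\napple,3\npear,5\n")

for row in conn.execute("SELECT * FROM tmp"):
    print(row)
```

Because the columns are TEXT, numeric comparisons need a cast, for example `WHERE CAST(qty AS INTEGER) > 4`.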
        
           | dbreunig wrote:
           | duckdb is a single file with no dependencies and it's _fast_.
           | Still blows my mind how quickly it can query GB sized gzipped
           | CSVs.
        
         | wenc wrote:
         | I use DuckDB for queries and Visidata for quick inspections.
         | 
         | Between those two, I can work with not only CSVs, but also JSON
         | and Parquet files (which are blazing fast -- CSVs are good for
         | human readability and editability, but they're horrendous for
         | queries).
         | 
         | CLI CSV tools pop up every now and then, but there's too many
         | of them and I feel that my use cases are sufficiently addressed
         | with only 2 tools.
        
         | dkga wrote:
         | Long live duckDB! Big fan here.
        
       | pradeepchhetri wrote:
       | I enjoy using clickhouse-local for parsing csv files. I generally
       | hit situations where I need custom delimiter and custom parsing
       | rules, I find it handles all of these edge cases very well.
       | Recently I found that if my csv files are compressed, I don't
       | even need to uncompress them, it auto-magically figures out the
       | compression format and processes it for me.
        
         | tambourine_man wrote:
         | Didn't know about them, seems interesting.
        
       | RhysU wrote:
       | ngrid is quite usable, stable, and sensible for streaming work:
       | https://github.com/twosigma/ngrid
        
         | ranger_danger wrote:
         | a TUI program without a screenshot might as well not exist...
         | nobody is going to use it if they can't see what it looks like.
        
           | RhysU wrote:
           | Heh, and to think I downloaded dozens of Slackware floppy
           | images over a 28.8 modem without any screenshots! What a fool
           | I was!
           | 
           | (I have used this utility weekly for almost 10 years. It
           | isn't pretty. It's effective.)
        
             | bbkane wrote:
             | Maybe they'd accept a PR to the README with a screenshot.
        
       | bishfish wrote:
       | This would be nice to have with lesspipe.sh viewer.
        
       | tinkertamper wrote:
       | Love this!
       | 
       | Call me lame, but are there any open-source projects that
       | accomplish this type of gui in typescript? I have an idea but it
       | uses something written in JavaScript.
        
         | chuckadams wrote:
         | You can use any javascript library in typescript, and TS is all
         | JS at runtime anyway. You can even add types without touching
         | the JS library, just write a .d.ts file and stick it anywhere
         | that TS looks for sources.
        
       | zeckalpha wrote:
       | I'd love something like this in csvkit, xsv, or qsv. The
       | refragmentation of CSV CLI tools is counter to the long term
       | trend.
        
       | wodenokoto wrote:
       | Is there an open source thing similar to this that has a windows
       | gui?
        
         | olavgg wrote:
         | Libreoffice is a great tool for opening csv files
        
         | pama wrote:
         | Emacs has an excellent csv mode.
        
       | trackofalljades wrote:
       | This is glorious, someone put it in homebrew!
        
       | edu_guitar wrote:
       | Nice! Once or twice I've used tad [1] as a GUI to view csv files,
       | but I usually use vi with nowrap or read the file in R. Now
       | csvlens will be my default for csv files.
       | 
       | [1]: https://www.tadviewer.com/
        
         | bee_rider wrote:
         | I like viewing most CSVs in vim. It can be a little annoying
         | when the columns don't line up well. Often I'll replace all the
         | commas with a tab, and then set the tab width to some very high
         | value to make it all line up.
         | 
         | Of course, there exist plenty of CSV files that don't follow
         | any particular standard, so the trick doesn't always work (if
         | there are tabs in the data, if there's some very wide field, or
         | if there's a comma in the data). Works good enough for the
         | files I have, though.
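
A variation on the trick above, sketched in Python: parse the file as CSV first and pad each column to its widest cell, so commas inside quoted fields do not throw off the alignment. The function name `align` and the sample data are made up for illustration.

```python
import csv
import io


def align(csv_text: str, gap: int = 2) -> str:
    """Render CSV text as space-padded columns of equal width."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    n = max(len(r) for r in rows)
    rows = [r + [""] * (n - len(r)) for r in rows]  # pad short rows
    widths = [max(len(r[i]) for r in rows) for i in range(n)]
    lines = []
    for r in rows:
        cells = [cell.ljust(w) for cell, w in zip(r, widths)]
        lines.append((" " * gap).join(cells).rstrip())
    return "\n".join(lines)


print(align('name,city\nAda,"London, UK"\nGrace,NYC\n'))
```

A very wide field still stretches its whole column, and a field containing a newline would still break the layout, so the caveats in the comment above mostly carry over.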
        
       | RicoElectrico wrote:
       | I tried some TUI tool that was pretty advanced at viewing tables
       | (sorting, filtering etc.), but I don't remember the name. It was
       | on a WSL instance on my work laptop which I nuked when they laid
       | me off.
       | 
       | I remember that it was tab-based. Any idea what that could have
       | been?
        
       | nickjj wrote:
       | One thing I often do with CSVs is sum up all or specific rows in
       | a column.
       | 
       | For example maybe you're doing end of year taxes and now you have
       | this large CSV export from your bank or payment provider with
       | multiple categories and you want to get the totals for certain
       | things.
       | 
       | In a GUI tool it's really easy to sort by a column and drag your
       | mouse to select what you want and see it summed in real time.
       | 
       | Oftentimes things aren't clean enough to have 100% confidence
       | that you can solve this with an automated script because maybe
       | something is spelled slightly different but it's really the same
       | thing. This feels like one of those things where spending a legit
       | 10-15 minutes once a year to do it manually is better than trying
       | to account for every known and unknown edge case you could think
       | of. The stakes are too high if you get it wrong since it's
       | related to taxes.
       | 
       | Has anyone found a really good standalone basic spreadsheet app
       | that "just works" which isn't Microsoft Excel that works on
       | Windows or Linux? I don't know why but Libre and Open Office both
       | struggle to parse columns out in certain types of CSVs and the
       | sorting behavior is typically a lot worse than Google's
       | spreadsheet app but I'd like to remove some dependence on using
       | Google.
        
         | AB1908 wrote:
         | I've never tried this myself but you could try importing the
         | CSV in sqlite and then run aggregate queries over it. Does that
         | sound useful?
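
A rough sketch of what that could look like, using Python's built-in `sqlite3`. The column names `category` and `amount` and the sample rows are assumptions; the `lower(trim(...))` normalization only catches case and whitespace differences, not the genuinely different spellings mentioned in the parent comment.

```python
import csv
import io
import sqlite3

# Made-up export; "category" and "amount" are assumed column names.
export = "category,amount\nOffice,10.00\noffice ,5.50\nTravel,20.00\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tx (category TEXT, amount REAL)")
rows = csv.DictReader(io.StringIO(export))
conn.executemany(
    "INSERT INTO tx VALUES (?, ?)",
    ((r["category"], float(r["amount"])) for r in rows),
)

# Group on a normalized category so "Office" and "office " are summed together.
totals = conn.execute(
    "SELECT lower(trim(category)) AS cat, SUM(amount) "
    "FROM tx GROUP BY cat ORDER BY cat"
).fetchall()
print(totals)
```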
        
           | mongol wrote:
           | Was about to suggest the same
        
           | nickjj wrote:
           | You can but it's back to depending on code to get the totals.
           | This is one spot where IMO being able to visualize the data
           | by seeing the rows and have immediate feedback on the sum is
           | useful. You can continue dragging or use CTRL + SHIFT
           | clicking to select more stuff as needed while letting your
           | brain decide what should be grouped together.
           | 
           | With a SQL / code approach you have to account for these
           | things without being able to see them and then adjust the
           | code afterwards to include your custom groupings. It ends up
           | taking more time. If the categories didn't change every year
           | it would for sure be worth it to code up a solution since
           | you'll know the edge cases by looking at the existing CSV but
           | it's a moving target because it could change next year.
        
         | wobblykiwi wrote:
         | I haven't dug too much into the tool, but could you use
         | something like Datasette for that?
        
         | hectormalot wrote:
         | If you want to use the CLI, Visidata [1] might be what you want.
         | It does have a bit of a learning curve. Beyond that, I've found
         | it quite handy to do quick data explorations. e.g. there are
         | shortcuts for histograms, filtering, x-y plots, etc.
         | 
         | [1]: https://www.visidata.org
        
         | lowbloodsugar wrote:
         | Jupyter?
        
         | snidane wrote:
         | There is sc-im, which is the closest you'll get to a full
         | spreadsheet app in terminal with vi controls.
         | 
         | https://github.com/andmarti1424/sc-im
        
         | e12e wrote:
         | Have you tried gnumeric?
        
           | nickjj wrote:
           | Nope, but I just tried. I went to their site and noticed
           | there's no binaries for Windows. The site is also served over
           | HTTP (not HTTPS) and the default experience for apt
           | installing it on Ubuntu 22.04 didn't work due to a bunch of
           | packages no longer existing:
           | 
           |     E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/evince-common_42.3-0ubuntu3_all.deb  404  Not Found [IP: 91.189.91.83 80]
           |     E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/p/poppler/libpoppler118_22.02.0-2ubuntu0.2_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
           |     E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/p/poppler/libpoppler-glib8_22.02.0-2ubuntu0.2_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
           |     E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/g/ghostscript/libgs9-common_9.55.0%7edfsg1-0ubuntu5.5_all.deb  404  Not Found [IP: 91.189.91.83 80]
           |     E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/g/ghostscript/libgs9_9.55.0%7edfsg1-0ubuntu5.5_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
           |     E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/libevdocument3-4_42.3-0ubuntu3_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
           |     E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/libevview3-3_42.3-0ubuntu3_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
           |     E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/evince_42.3-0ubuntu3_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
           |     E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/w/webkit2gtk/libjavascriptcoregtk-4.0-18_2.42.1-0ubuntu0.22.04.1_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
           |     E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/w/webkit2gtk/libwebkit2gtk-4.0-37_2.42.1-0ubuntu0.22.04.1_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
           | 
           | Going to go with nope on this one.
        
             | e12e wrote:
             | Gnumeric isn't in main on Debian/Ubuntu anymore?
             | 
             | Ed: looks like it is in universe on Ubuntu - you have
             | universe enabled?
             | 
             | https://packages.ubuntu.com/search?keywords=gnumeric
             | 
             | Ed: Did you apt update? Core pieces of evince missing
             | sounds very strange?
        
       | fao_ wrote:
       | See we wouldn't need this if everyone had standardized on TSV
       | instead
        
         | JohnKemeny wrote:
         | I don't understand if this is meant as a joke or not. Do you
         | mean that we then would have Tsvlens? Tabs don't automatically
         | make columns aligned, nor do they eliminate the problem of
         | quoting. I don't get it, but then again, I'm exceptionally dim-
         | witted.
        
           | bee_rider wrote:
           | Because tabs are flexible space, you can render them really
           | wide in most text editors, wide enough that all the columns
           | line up. But it is a hacky solution!
        
           | m2f2 wrote:
           | TSV are far superior to CSV as the separator is a TAB and not
           | commas, semicolons or other weird stuff depending on your
           | locale. Same for single and double quotes. Only thing to be
           | aware of are TSVs created by Excel with newlines embedded in
           | text fields, which are a huge PITA for almost all parsers.
        
       | leeoniya wrote:
       | difference from https://github.com/BurntSushi/xsv ?
        
       | adius wrote:
       | I found the perfect solution for me with
       | https://www.moderncsv.com. Starts fast, focused GUI, fully
       | featured, no bullshit!
        
         | minroot wrote:
         | Really cool, hope it was open source
        
       | BasilPH wrote:
       | That looks like a great CSV viewer.
       | 
       | I've enjoyed using csvkit[^0] in the past. The viewer isn't as
       | good as csvlens seems to be, but it comes with the ability to
       | grep, cut and pipe CSV data which has come in handy.
       | 
       | csvlens + csvkit might be a great combination.
       | 
       | [^0]: https://csvkit.readthedocs.io/en/latest/
        
       | yzzyx wrote:
       | I have been looking for a viewer exactly like this for so long!
       | Visidata is nice, but way more complex than what I was after.
       | This will fit perfectly into my workflow, thank you for sharing!
        
       ___________________________________________________________________
       (page generated 2024-01-06 23:00 UTC)