[HN Gopher] Csvlens: Command line CSV file viewer. Like less but...
___________________________________________________________________
Csvlens: Command line CSV file viewer. Like less but made for CSV
Author : ingve
Score : 295 points
Date : 2024-01-06 09:07 UTC (13 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| jwiz wrote:
| See also: https://www.visidata.org/
| mistaken wrote:
| visidata is awesome. It also supports a bunch of other file
| formats such as excel sheets.
| standing_user wrote:
| Didn't know about it, thank you
| rrr_oh_man wrote:
| > Scatterplots in the terminal
|
| Wow!
| jellyfishbeaver wrote:
| Strongly considering donating to the creator of Visidata, it's
| one of the terminal tools I can't live without. I use it every
| day.
| bbkane wrote:
| If it's that important and you can afford a few bucks a
| month, I strongly urge you stop considering your donation,
| and (right now) follow the links on the GitHub to actually
| start supporting the project.
|
| It's really easy to wallow in the "considering" phase of a
| decision indefinitely (or maybe that's just me :D)
| letmeinhere wrote:
| huh, something (maybe similarity to VisiCalc name) had me
| dismissing it as commercial software
| jwiz wrote:
| Yep, I am doing the $5/mo Patreon level because I love it so
| much.
| zoogeny wrote:
| I've been thinking about a terminal spreadsheet editor for
| about a year now. I haven't deeply investigated this tool while
| I write this comment but I'd love for something like this for
| Terminal that supported `=A1+B2` kind of formulas.
| runjake wrote:
| Back in the day, we called this Lotus 123 and VisiCalc.
|
| https://en.m.wikipedia.org/wiki/Lotus_1-2-3
| zoogeny wrote:
| Yes, exactly - imagine a modern version written in Rust,
| with vi key-bindings, etc. Please someone be inspired by
| this idea and build this. TUIs are making a comeback and
| the spreadsheet is still one of the most useful and
| effective tools in all information science.
| PaulHoule wrote:
| The spreadsheet data model lives a double life.
|
| (1) On one hand you can use it for data that is really
| tabular,
|
| (2) But it also has an engine that can compute the
| dependency relationships between cells and recalculate the
| cells affected by a change. This is quite different from
| conventional programming languages where you are required
| to specify an order to put operations in. Of course this is a
| problem for exploiting parallelism but I'd charge it is one
| more bit of cognitive load that makes it harder for
| beginners and non-professional programmers. (Professional
| programmers are just used to it and only run into problems
| in unusual cases where circularity is involved, but I think
| it's one more thing that beginners struggle with.)
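|
| As a rough sketch of that recalculation idea (not how Excel or any
| real spreadsheet engine is implemented; the cell names and formulas
| below are made up for illustration), Python's standard graphlib
| module can order cells so that each one is computed after the cells
| it depends on, regardless of the order they were written in:

```python
from graphlib import TopologicalSorter

# Hypothetical mini-sheet: each cell maps to (dependencies, formula).
# The cells are deliberately listed out of evaluation order.
cells = {
    "A4": (("A3",), lambda a3: a3 * 2),
    "A3": (("A1", "A2"), lambda a1, a2: a1 + a2),
    "A1": ((), lambda: 10),
    "A2": ((), lambda: 32),
}


def recalculate(cells):
    """Evaluate every cell after the cells it depends on."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(deps) for name, (deps, _) in cells.items()}
    values = {}
    for name in TopologicalSorter(graph).static_order():
        deps, formula = cells[name]
        values[name] = formula(*(values[d] for d in deps))
    return values


values = recalculate(cells)
print(values["A4"])  # 84
```

| This sketch recomputes the whole sheet; an engine that only touches
| the cells affected by a change would walk the graph from the edited
| cell instead. A circular reference makes static_order() raise
| graphlib.CycleError.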
|
| The worst problem is a lack of separation between code and
| data. If you are doing an analysis you might put something
| like =SUM(A1:A28)
|
| in A30. Excel will change that to
| =SUM(A1:A29)
|
| if you insert a row in there, which helps, but they lock
| you into the mindset of "I'm making the December sales
| report" as opposed to "I'm making the monthly sales
| report". That is, the data and the analysis should be two
| separate things: just as you can put the December data or
| the January data into a Python script.
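|
| To make the "monthly report" point concrete, here is a minimal
| sketch, assuming a CSV with a header row and a numeric column; the
| column name "amount" and the sample rows are invented for the
| example. The analysis lives in one function and the data is just
| whatever file gets passed in:

```python
import csv
import io


def column_total(csv_file, column):
    """Sum one named column of any CSV file-like object."""
    return sum(float(row[column]) for row in csv.DictReader(csv_file))


# Hypothetical data; in practice these would be separate files on disk,
# opened with open(path, newline="").
december = io.StringIO("item,amount\nwidgets,120.50\ngadgets,79.50\n")
january = io.StringIO("item,amount\nwidgets,99.00\n")

print(column_total(december, "amount"))
print(column_total(january, "amount"))
```

| The same function serves every month, which is the separation the
| SUM(A1:A28) formula does not give you.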
|
| (Note some of the same problems still exist with
| "notebooks" and "workspaces" where you wind up with a file
| that has both code and data in it which can be problematic
| to check into git, particularly when you are working for
| people who would like to have a beautiful notebook with an
| analysis in it to view in GitHub but will then struggle to
| version control it. Many "data scientists" fail to rise
| above the December sales report even though there's a clear
| path to turn a Jupyter notebook into a Python script.)
|
| ---
|
| As much as that is a rant I think there's a huge untapped
| market for things that are like spreadsheets but different.
| That is, something that looks like Excel but is specialized
| for editing tabular data (no formulas, CSV import 'just
| works' all the time, ...) or something that has formulas
| like Excel but not on a grid or not just on a grid. For the
| latter there was this product
|
| https://en.wikipedia.org/wiki/TK_Solver
|
| which is still around. People were amazed with TK Solver
| when it came out and I'm surprised to this day that it
| hasn't had a lot of competition.
| wlindley wrote:
| We had that once, in an MSDOS-era program called Javelin
| ( https://en.wikipedia.org/wiki/Javelin_Software ):
|
| "Unlike models in a spreadsheet, Javelin models are built
| on objects called variables, not on data in cells of a
| report. For example, a time series, or any variable, is
| an object in itself, not a collection of cells which
| happen to appear in a row or column... Calculations are
| performed on these objects, as opposed to a range of
| cells, so adding two time series automatically aligns
| them in calendar time, or in a user-defined time frame.
| Data are independent of worksheets..."
|
| That was forty years ago and as far as I know there
| hasn't been a program that works the same way since.
| Perhaps someone could reverse-engineer it (the jav.exe
| file is only 56,448 bytes!)
| bee_rider wrote:
| Spreadsheets are kind of interesting tech. Because the
| user is more-or-less directly building the data
| dependency graph, I'd expect them to map to actual hardware
| (massively parallel in so many dimensions) better than a
| sequential procedural language like C.
|
| But, they are not very fashionable.
|
| It would be fun to have a pipeline to compile a
| spreadsheet, that produces an executable. Potential
| outputs: cpu, gpu, fpga. Haha.
| franek wrote:
| While not built around CSV, two terminal spreadsheet tools I
| have successfully used in the past are sc-im and the (neo)vim
| plugin vim-table-mode:
|
| https://github.com/andmarti1424/sc-im/
|
| https://github.com/dhruvasagar/vim-table-mode
|
| Back then I stopped using sc-im because it could not
| import/export XLSX, if I remember correctly. Apparently it
| can today!
|
| vim-table-mode always felt a little fragile and I don't want
| to be bound to vim anymore. That said, it still feels like a
| small miracle to me to have functional spreadsheet formulas
| inside markdown documents - calculation and typesetting all
| in one place.
| genericacct wrote:
| I wrote an SC fork that does this; I will send a link when I'm
| home.
| somat wrote:
| Can confirm, I consider visidata to be the gold standard for
| data exploration.
|
| I have largely given up on spreadsheets, replacing them with
| relational databases. And on top of being a world class tool
| for understanding the data for import, visidata makes for a
| pretty great query pager.
| retrochameleon wrote:
| I've been using Visidata and love it for viewing, extracting
| rows or columns to a different file, but I know it is capable
| of much more that I haven't figured out yet
| musha68k wrote:
| Not on "computer" but some such in your pipeline should be good
| enough more often than not?
|
| awk -v OFS='\t' '{$1=$1; print}' | column -t
| cowsandmilk wrote:
| many csv data sets cannot be handled by awk due to csv
| supporting things like commas inside fields, newlines inside
| fields, etc. Your solution would also fall over for any tabs
| within fields. Plenty of tools out there support RFC 4180 CSV
| files as well as common csv variants. And these tools have been
| around for a decade and have suites of cli commands that can be
| piped together, just speaking "CSV" as a format rather than
| pure text streams.
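|
| A small illustration of the failure mode, as a sketch: the sample
| row below is invented, and Python's standard csv module stands in
| for "a tool that understands CSV quoting". Splitting on commas and
| line breaks tears the quoted fields apart, while a CSV parser keeps
| them whole:

```python
import csv
import io

# Hypothetical sample: one field holds a comma, another holds a newline.
sample = 'id,title,notes\n1,"Bug, crash on load","line one\nline two"\n'

# Naive approach: treat every line break as a record and every comma as a field.
naive = [line.split(",") for line in sample.splitlines()]

# CSV-aware approach: quoted commas and newlines stay inside their field.
parsed = list(csv.reader(io.StringIO(sample)))

print(len(naive), "naive records;", len(parsed), "parsed records")
print(parsed[1])
```

| The naive split reports three records and breaks the title in two;
| the parser reports the header plus one data row with three fields.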
| musha68k wrote:
| Good points on using proper CSV tooling 100%; so that's why
| the disclaimer "more often than not"
|
| BTW I just realized that I was omitting the field separator
| '-F,' actually.
|
| For those pesky nested commas from top off simplistic text
| hat still I'd go back to sed for fun and potential profit:
|
| sed 's/","/"\t"/g; s/^"/"\t/; s/"$//; s/,,/, ,/g' | column -t
|
| Not elegant nor hands-off and might be missing something
| else; def needs the right data / volume and fiddling as we
| know ymmv; can get unwieldy quickly etc etc
|
| TLDR; use column -t when applicable
| prudentpomelo wrote:
| Awk now supports a `--csv` flag for processing csv's.
| https://github.com/onetrueawk/awk/blob/master/README.md
| musha68k wrote:
| oh nice TIL
| snidane wrote:
| How does one output properly quoted CSV rows? It seems the
| csv flag works only for parsing inputs.
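|
| This does not answer the question for awk itself. As one possible
| alternative outside awk, Python's standard csv.writer quotes
| fields only when they need it and doubles embedded quote
| characters; the helper name to_csv and the sample rows are
| assumptions for the sketch:

```python
import csv
import io


def to_csv(rows):
    """Serialize rows to CSV text, quoting fields only where required."""
    buffer = io.StringIO()
    # lineterminator="\n" keeps the output Unix-style; the module default is "\r\n".
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["id", "note"], [1, 'said "hi", then left'], [2, "plain"]]
print(to_csv(rows), end="")
```

| The second row comes out as 1,"said ""hi"", then left" while the
| plain fields are left unquoted.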
| kkoncevicius wrote:
| Does it allow specifying columns when filtering for rows?
|
| For instance, the shown example filters for "Bug" but seems to
| filter for rows that contain "Bug" in any of the columns. Can I
| specify to filter for "Bug" only looking at a certain column?
|
| At the same time - is filtering by numbers implemented, like
| filtering for rows that have a numeric value above X in a certain
| column?
|
| If not these should be on TODO list - those are very common
| operations for .csv type data.
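|
| Independent of what csvlens supports, the two operations asked
| about can be sketched in a few lines of Python. The function name
| filter_rows and the sample columns ("type", "priority") are
| hypothetical:

```python
import csv
import io


def filter_rows(csv_file, column, predicate):
    """Yield rows (as dicts) whose value in `column` satisfies `predicate`."""
    for row in csv.DictReader(csv_file):
        if predicate(row[column]):
            yield row


# Hypothetical data for illustration.
data = "id,type,priority\n1,Bug,3\n2,Feature,5\n3,Bug,8\n"

# Text match restricted to a single column.
bugs = list(filter_rows(io.StringIO(data), "type", lambda v: v == "Bug"))

# Numeric comparison: CSV values are strings, so convert before comparing.
urgent = list(filter_rows(io.StringIO(data), "priority", lambda v: float(v) > 4))

print([row["id"] for row in bugs])
print([row["id"] for row in urgent])
```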
| dmd wrote:
| You want https://www.visidata.org/
| account-5 wrote:
| I love these sorts of programs. Is there a list somewhere of data
| programs like this for the command line? I'm sure there will be.
| drog wrote:
| One of the things that greatly improved my csv workflow is
| duckdb. It's a small binary that allows querying csv with sql.
| lyjackal wrote:
| You can do the same with SQLite, which is usually already
| installed in most environments:
|
| sqlite> .import test.csv foo --csv
| llimllib wrote:
| A poor man's version of csvlens is something like:
| sqlite -column :memory: '.import --csv file.csv tmp' 'select * from tmp;' | bat
|
| which imports the csv into sqlite and outputs it to bat, my
| favorite pager - use `less` or whatever else you desire.
| dbreunig wrote:
| duckdb is a single file with no dependencies and it's _fast_.
| Still blows my mind how quickly it can query GB sized gzipped
| CSVs.
| wenc wrote:
| I use DuckDB for queries and Visidata for quick inspections.
|
| Between those two, I can work with not only CSVs, but also JSON
| and Parquet files (which are blazing fast -- CSVs are good for
| human readability and editability, but they're horrendous for
| queries).
|
| CLI CSV tools pop up every now and then, but there's too many
| of them and I feel that my use cases are sufficiently addressed
| with only 2 tools.
| dkga wrote:
| Long live duckDB! Big fan here.
| pradeepchhetri wrote:
| I enjoy using clickhouse-local for parsing csv files. I generally
| hit situations where I need custom delimiter and custom parsing
| rules, I find it handles all of these edge cases very well.
| Recently I found that if my csv files are compressed, I don't
| even need to uncompress them, it auto-magically figures out the
| compression format and processes it for me.
| tambourine_man wrote:
| Didn't know about them, seems interesting.
| RhysU wrote:
| ngrid is quite usable, stable, and sensible for streaming work:
| https://github.com/twosigma/ngrid
| ranger_danger wrote:
| a TUI program without a screenshot might as well not exist...
| nobody is going to use it if they can't see what it looks like.
| RhysU wrote:
| Heh, and to think I downloaded dozens of Slackware floppy
| images over a 28.8 modem without any screenshots! What a fool
| I was!
|
| (I have used this utility weekly for almost 10 years. It
| isn't pretty. It's effective.)
| bbkane wrote:
| Maybe they'd accept a PR to the README with a screenshot.
| bishfish wrote:
| This would be nice to have with lesspipe.sh viewer.
| tinkertamper wrote:
| Love this!
|
| Call me lame, but are there any open-source projects that
| accomplish this type of gui in typescript? I have an idea but it
| uses something written in JavaScript.
| chuckadams wrote:
| You can use any javascript library in typescript, and TS is all
| JS at runtime anyway. You can even add types without touching
| the JS library, just write a .d.ts file and stick it anywhere
| that TS looks for sources.
| zeckalpha wrote:
| I'd love something like this in csvkit, xsv, or qsv. The
| refragmentation of CSV CLI tools is counter to the long term
| trend.
| wodenokoto wrote:
| Is there an open source thing similar to this that has a windows
| gui?
| olavgg wrote:
| Libreoffice is a great tool for opening csv files
| pama wrote:
| Emacs has an excellent csv mode.
| trackofalljades wrote:
| This is glorious, someone put it in homebrew!
| edu_guitar wrote:
| Nice! Once or twice I've used tad [1] as a GUI to view csv files, but
| I usually use vi with nowrap or read the file in R. Now csvlens
| will be my default for csv files.
|
| [1]:https://www.tadviewer.com/
| bee_rider wrote:
| I like viewing most CSVs in vim. It can be a little annoying
| when the columns don't line up well. Often I'll replace all the
| commas with a tab, and then set the tab width to some very high
| value to make it all line up.
|
| Of course, there exist plenty of CSV files that don't follow
| any particular standard, so the trick doesn't always work (if
| there are tabs in the data, if there's some very wide field, or
| if there's a comma in the data). Works good enough for the
| files I have, though.
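|
| For files where the tab trick breaks down, a CSV-aware alternative
| is to parse first and pad afterwards. This is only a sketch using
| Python's standard csv module; the function name align is made up,
| and it reads the whole file into memory, so it suits small files:

```python
import csv
import io


def align(csv_file, gap=2):
    """Return the CSV as text with every column padded to a common width."""
    rows = list(csv.reader(csv_file))
    if not rows:
        return ""
    n_cols = max(len(row) for row in rows)
    # Width of each column is the longest cell found in that position.
    widths = [
        max(len(row[i]) for row in rows if i < len(row))
        for i in range(n_cols)
    ]
    sep = " " * gap
    return "\n".join(
        sep.join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


print(align(io.StringIO('name,note\nalice,"likes commas, a lot"\nbob,ok\n')))
```

| Quoted commas no longer shift the columns, although a field that
| contains a newline will still spill across lines.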
| RicoElectrico wrote:
| I tried some TUI tool that was pretty advanced at viewing tables
| (sorting, filtering etc.), but I don't remember the name. It was
| on a WSL instance on my work laptop which I nuked when they laid
| me off.
|
| I remember that it was tab-based. Any idea what that could have
| been?
| nickjj wrote:
| One thing I often do with CSVs is sum up all or specific rows in
| a column.
|
| For example maybe you're doing end of year taxes and now you have
| this large CSV export from your bank or payment provider with
| multiple categories and you want to get the totals for certain
| things.
|
| In a GUI tool it's really easy to sort by a column and drag your
| mouse to select what you want and see it summed in real time.
|
| Oftentimes things aren't clean enough to have 100% confidence
| that you can solve this with an automated script because maybe
| something is spelled slightly different but it's really the same
| thing. This feels like one of those things where spending a legit
| 10-15 minutes once a year to do it manually is better than trying
| to account for every known and unknown edge case you could think
| of. The stakes are too high if you get it wrong since it's
| related to taxes.
|
| Has anyone found a really good standalone basic spreadsheet app
| that "just works" which isn't Microsoft Excel that works on
| Windows or Linux? I don't know why but Libre and Open Office both
| struggle to parse columns out in certain types of CSVs and the
| sorting behavior is typically a lot worse than Google's
| spreadsheet app but I'd like to remove some dependence on using
| Google.
| AB1908 wrote:
| I've never tried this myself but you could try importing the
| CSV in sqlite and then run aggregate queries over it. Does that
| sound useful?
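|
| A minimal sketch of that idea using Python's built-in sqlite3
| module instead of the sqlite command line; the table name "tx",
| the column names, and the sample rows are all assumptions for
| illustration:

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_file):
    """Create `table` from the CSV header and insert every data row as text."""
    reader = csv.reader(csv_file)
    header = next(reader)
    # Simplistic identifier quoting; assumes header names contain no double quotes.
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)


# Hypothetical bank export.
data = "category,amount\nfood,12.50\nrent,800\nfood,7.50\n"

conn = sqlite3.connect(":memory:")
load_csv(conn, "tx", io.StringIO(data))

# Values were stored as text, so cast before summing.
totals = conn.execute(
    "SELECT category, SUM(CAST(amount AS REAL)) "
    "FROM tx GROUP BY category ORDER BY category"
).fetchall()
print(totals)
```

| Categories that are spelled slightly differently would still land
| in separate groups, so this does not remove the need to eyeball
| the data first.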
| mongol wrote:
| Was about to suggest the same
| nickjj wrote:
| You can but it's back to depending on code to get the totals.
| This is one spot where IMO being able to visualize the data
| by seeing the rows and have immediate feedback on the sum is
| useful. You can continue dragging or use CTRL + SHIFT
| clicking to select more stuff as needed while letting your
| brain decide what should be grouped together.
|
| With a SQL / code approach you have to account for these
| things without being able to see them and then adjust the
| code afterwards to include your custom groupings. It ends up
| taking more time. If the categories didn't change every year
| it would for sure be worth it to code up a solution since
| you'll know the edge cases by looking at the existing CSV but
| it's a moving target because it could change next year.
| wobblykiwi wrote:
| I haven't dug too much into the tool, but could you use
| something like Datasette for that?
| hectormalot wrote:
| If you want to use the CLI, Visidata [1] might be what you want. It
| does have a bit of a learning curve. Beyond that, I've found it
| quite handy to do quick data explorations. e.g. there are
| shortcuts for histograms, filtering, x-y plots, etc.
|
| [1]: https://www.visidata.org
| lowbloodsugar wrote:
| Jupyter?
| snidane wrote:
| There is sc-im, which is the closest you'll get to a full
| spreadsheet app in terminal with vi controls.
|
| https://github.com/andmarti1424/sc-im
| e12e wrote:
| Have you tried gnumeric?
| nickjj wrote:
| Nope, but I just tried. I went to their site and noticed
| there's no binaries for Windows. The site is also served over
| HTTP (not HTTPS) and the default experience for apt
| installing it on Ubuntu 22.04 didn't work due to a bunch of
| packages no longer existing:
|
| E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/evince-common_42.3-0ubuntu3_all.deb 404 Not Found [IP: 91.189.91.83 80]
| E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/p/poppler/libpoppler118_22.02.0-2ubuntu0.2_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
| E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/p/poppler/libpoppler-glib8_22.02.0-2ubuntu0.2_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
| E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/g/ghostscript/libgs9-common_9.55.0%7edfsg1-0ubuntu5.5_all.deb 404 Not Found [IP: 91.189.91.83 80]
| E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/g/ghostscript/libgs9_9.55.0%7edfsg1-0ubuntu5.5_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
| E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/libevdocument3-4_42.3-0ubuntu3_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
| E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/libevview3-3_42.3-0ubuntu3_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
| E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/evince_42.3-0ubuntu3_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
| E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/w/webkit2gtk/libjavascriptcoregtk-4.0-18_2.42.1-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
| E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/w/webkit2gtk/libwebkit2gtk-4.0-37_2.42.1-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
|
| Going to go with nope on this one.
| e12e wrote:
| Gnumeric isn't in main on Debian/Ubuntu anymore?
|
| Ed: looks like it is in universe on Ubuntu - you have
| universe enabled?
|
| https://packages.ubuntu.com/search?keywords=gnumeric
|
| Ed: Did you apt update? Core pieces of evince missing
| sounds very strange?
| fao_ wrote:
| See we wouldn't need this if everyone had standardized on TSV
| instead
| JohnKemeny wrote:
| I don't understand if this is meant as a joke or not. Do you
| mean that we then would have Tsvlens? Tabs don't automatically
| make columns aligned, nor do they eliminate the problem of
| quoting. I don't get it, but then again, I'm exceptionally dim-
| witted.
| bee_rider wrote:
| Because tabs are flexible space, you can render them really
| wide in most text editors, wide enough that all the columns
| line up. But it is a hacky solution!
| m2f2 wrote:
| TSV are far superior to CSV as the separator is a TAB and not
| commas, semicolons or other weird stuff depending on your
| locale. Same for single and double quotes. The only thing to be
| aware of is TSV files created by Excel with newlines embedded in
| text fields, which are a huge PITA for almost all parsers.
| leeoniya wrote:
| difference from https://github.com/BurntSushi/xsv ?
| adius wrote:
| I found the perfect solution for me with
| https://www.moderncsv.com. Starts fast, focused GUI, fully
| featured, no bullshit!
| minroot wrote:
| Really cool, I wish it were open source
| BasilPH wrote:
| That looks like a great CSV viewer.
|
| I've enjoyed using csvkit[^0] in the past. The viewer isn't as
| good as csvlens seems to be, but it comes with the ability to
| grep, cut and pipe CSV data which has come in handy.
|
| csvlens + csvkit might be a great combination.
|
| [^0]: https://csvkit.readthedocs.io/en/latest/
| yzzyx wrote:
| I have been looking for a viewer exactly like this for so long!
| Visidata is nice, but way more complex than what I was after.
| This will fit perfectly into my workflow, thank you for sharing!
___________________________________________________________________
(page generated 2024-01-06 23:00 UTC)