[HN Gopher] Show HN: CSV GB+ by Data.olllo - Open and Process CS...
___________________________________________________________________
Show HN: CSV GB+ by Data.olllo - Open and Process CSVs Locally
I built CSV GB+ by Data.olllo, a local data tool that lets you
open, clean, and export gigabyte-sized CSVs (even billions of rows)
without writing code. Most spreadsheet apps choke on big files.
Coding in pandas or Polars works, but not everyone wants to write
scripts just to filter or merge CSVs. CSV GB+ gives you a fast,
point-and-click interface built on dual backends (memory-optimized
or disk-backed) so you can process huge datasets offline.

Key features:
- Handles massive CSVs with ease: merge, split, dedup, filter,
  batch export
- Smart engine switch: disk-based "V Core" or RAM-based "P Core"
- All processing is offline: no data upload or telemetry
- Supports CSV, XLSX, JSON, DBF, Parquet, and more
- Designed for data pros, students, and privacy-conscious users

Register for the 7-day free Pro trial; the Pro version removes row
limits and unlocks all features. I'm a solo dev building Data.olllo
as a serious alternative to heavy coding or bloated enterprise
tools.

Download for Windows:
https://apps.microsoft.com/detail/9PFR86LCQPGS
User Guide:
https://olllo.top/articles/article-0-Data.olllo-UserGuide

Would love feedback! I'm actively improving it based on real use
cases.
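
For the scripting route the post mentions, here is a minimal Polars
sketch of the same filter done two ways, fully in memory versus
streamed from disk. File and column names are hypothetical; this
only illustrates the two execution styles a memory-optimized vs.
disk-backed backend implies, not CSV GB+'s actual implementation.

    import polars as pl

    # In-memory route (akin to a RAM-backed engine): fine while the
    # whole file fits in memory.
    df = pl.read_csv("orders.csv")
    df.filter(pl.col("amount") > 100).write_csv("orders_filtered.csv")

    # Out-of-core route (akin to a disk-backed engine): scan lazily
    # and stream the filtered rows to disk without materializing the
    # full table.
    (
        pl.scan_csv("orders.csv")
        .filter(pl.col("amount") > 100)
        .sink_csv("orders_filtered.csv")
    )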
Author : olllo
Score : 40 points
Date : 2025-05-14 15:12 UTC (7 hours ago)
(HTM) web link (apps.microsoft.com)
(TXT) w3m dump (apps.microsoft.com)
| xnx wrote:
  | Is this better than Tad (https://www.tadviewer.com/), which
  | seems to do similar things for free?
| rad_gruchalski wrote:
| And on operating systems other than Windows...
| dangerlibrary wrote:
| It is 2025 and CSVs still dominate data interchange between
| organizations.
|
| https://graydon2.dreamwidth.org/193447.html
| esafak wrote:
  | Parquet is also popular.
| paddy_m wrote:
| Do you have a demo video?
|
| What are you using for processing (polars)?
|
| Marketing note: I'm sure you're proud of P Core/V Core, but that
| doesn't matter to your users, it's an implementation detail. At a
| maximum I'd write "intelligent execution that scales from small
| files to large files".
|
  | As an implementation note, I would make it simple to operate on
  | just the first 1000 (10k or 100k) rows so responses are super
  | quick, then, once the user is happy with the transform, make it
  | a single click to operate on the entire file with a time
  | estimate.
  |
  | Another feature I'd like in this vein: execute on a small
  | subset, then, if you find an error with a larger subset, try to
  | reduce it to a small, quick-to-reproduce version. Especially
  | for deduping.
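  |
  | A minimal Polars sketch of that preview-then-full-run flow
  | (file and column names are hypothetical, and this is only one
  | way the suggestion could be wired up):
  |
  |     import polars as pl
  |
  |     def transform(lf: pl.LazyFrame) -> pl.LazyFrame:
  |         # the transform being tuned interactively
  |         return (lf.filter(pl.col("status") == "active")
  |                   .unique(subset=["email"]))
  |
  |     lf = pl.scan_csv("big.csv")  # lazy: nothing is read yet
  |
  |     # fast preview on the first 1,000 rows only
  |     print(transform(lf.head(1_000)).collect())
  |
  |     # one "click" later: the same plan, streamed over the whole
  |     # file and written straight back to disk
  |     transform(lf).sink_csv("big_deduped.csv")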
| marcellus23 wrote:
| > Marketing note: I'm sure you're proud of P Core/V Core, but
| that doesn't matter to your users, it's an implementation
| detail. At a maximum I'd write "intelligent execution that
| scales from small files to large files".
|
| Speaking personally, "intelligent execution that scales from
| small files to large files" sounds like marketing buzz that
| could mean absolutely nothing. I like that it mentions
| specifically switching between RAM and disk-powered engines,
| because that suggests it's not just marketing speak, but was
| actually engineered. Maybe P vs V Core is not the best way to
| market it, but I think it's worth mentioning that design.
| TheTaytay wrote:
| Thank you for this. I find myself increasingly using CSVs (TSVs
| actually) as the data format of choice. I confess I wish this was
| written for Mac too, but I like the trend of (once again) moving
 | data processing down to our super computers on our desks...
| hilti wrote:
 | ... I'm trying to use our super computers in our pockets, like
 | an iPhone ;-) But I'm still struggling with how to present CSV
 | data effectively on a small screen, even though it's huge in
 | terms of pixels compared to computer screens from the 90s.
 |
 | It's interesting to research how capable applications like
 | Lotus 1-2-3 were even at low resolutions like 800x600 pixels
 | compared to today's standards.
| RyanHamilton wrote:
 | QStudio allows querying CSV on Mac via DuckDB:
 | https://www.timestored.com/qstudio/csv-file-viewer I've been
 | improving the Mac version a lot lately: key bindings, icon, an
 | app package to download. If you find any problems, please raise
 | a GitHub issue.
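 |
 | For reference, the kind of query DuckDB can run directly against
 | a CSV file looks like this (a minimal sketch using the duckdb
 | Python package; file and column names are hypothetical):
 |
 |     import duckdb
 |
 |     # DuckDB scans the CSV in place; no import step is needed.
 |     duckdb.sql("""
 |         SELECT country, count(*) AS n
 |         FROM 'big.csv'
 |         GROUP BY country
 |         ORDER BY n DESC
 |         LIMIT 10
 |     """).show()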
| hermitcrab wrote:
| If you are wrangling CSV/TSV files on Mac, it might be worth
| taking a look at Easy Data Transform.
| paddy_m wrote:
| Ok, if we are all tagging and promoting our own projects, check
| out mine.
|
| I created Buckaroo to provide a better table viewing experience
| inside of notebooks. I also built a low code UI and auto
 | cleaning to expedite the rote data cleaning tasks that take up
| a large portion of data analysis. Autocleaning is heuristically
| powered - no LLMs, so it's fast and your data stays local. You
| can apply different autocleaning strategies and visually
| inspect the results. When you are happy with the cleaning, you
| can copy and paste the python code as a reusable function.
|
 | All of this is open source, and it's extendable/customizable.
|
| Here's a video walking through autocleaning and how to extend
 | it: https://youtu.be/A-GKVsqTLMI
|
| Here's the repo: https://github.com/paddymul/buckaroo
___________________________________________________________________
(page generated 2025-05-14 23:01 UTC)