[HN Gopher] Show HN: CSV GB+ by Data.olllo - Open and Process CS...
___________________________________________________________________
Show HN: CSV GB+ by Data.olllo - Open and Process CSVs Locally
I built CSV GB+ by Data.olllo, a local data tool that lets you
open, clean, and export gigabyte-sized CSVs (even billions of rows)
without writing code. Most spreadsheet apps choke on big files.
Coding in pandas or Polars works, but not everyone wants to write
scripts just to filter or merge CSVs. CSV GB+ gives you a fast,
point-and-click interface built on dual backends (memory-optimized
or disk-backed) so you can process huge datasets offline.

Key features:
- Handles massive CSVs with ease: merge, split, dedup, filter,
  batch export
- Smart engine switch: disk-based "V Core" or RAM-based "P Core"
- All processing is offline: no data upload or telemetry
- Supports CSV, XLSX, JSON, DBF, Parquet, and more
- Designed for data pros, students, and privacy-conscious users

Register for the 7-day free Pro trial; the Pro version removes row
limits and unlocks all features. I'm a solo dev building Data.olllo
as a serious alternative to heavy coding or bloated enterprise
tools.

Download for Windows:
https://apps.microsoft.com/detail/9PFR86LCQPGS
User Guide:
https://olllo.top/articles/article-0-Data.olllo-UserGuide

Would love feedback! I'm actively improving it based on real use
cases.
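
For the scripting route the post mentions, here is a minimal Polars
sketch of the same filter done two ways, fully in memory versus
streamed from disk. File and column names are hypothetical; this
only illustrates the two execution styles a memory-optimized vs.
disk-backed backend implies, not CSV GB+'s actual implementation.

    import polars as pl

    # In-memory route (akin to a RAM-backed engine): fine while the
    # whole file fits in memory.
    df = pl.read_csv("orders.csv")
    df.filter(pl.col("amount") > 100).write_csv("orders_filtered.csv")

    # Out-of-core route (akin to a disk-backed engine): scan lazily
    # and stream the filtered rows to disk without materializing the
    # full table.
    (
        pl.scan_csv("orders.csv")
        .filter(pl.col("amount") > 100)
        .sink_csv("orders_filtered.csv")
    )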
Author : olllo
Score : 40 points
Date : 2025-05-14 15:12 UTC (7 hours ago)
(HTM) web link (apps.microsoft.com)
(TXT) w3m dump (apps.microsoft.com)
| xnx wrote:
  | Is this better than Tad (https://www.tadviewer.com/), which
  | seems to do similar things for free?
| rad_gruchalski wrote:
| And on operating systems other than Windows...
| dangerlibrary wrote:
| It is 2025 and CSVs still dominate data interchange between
| organizations.
|
| https://graydon2.dreamwidth.org/193447.html
| esafak wrote:
  | Parquet is also popular.
| paddy_m wrote:
| Do you have a demo video?
|
| What are you using for processing (polars)?
|
| Marketing note: I'm sure you're proud of P Core/V Core, but that
| doesn't matter to your users, it's an implementation detail. At a
| maximum I'd write "intelligent execution that scales from small
| files to large files".
|
  | As an implementation note, I would make it simple to operate on
  | just the first 1000 (10k or 100k) rows so responses are super
  | quick, then, once the user is happy with the transform, make it
  | a single click to operate on the entire file with a time
  | estimate.
  |
  | Another feature I'd like in this vein: execute on a small
  | subset, then, if you find an error with a larger subset, try to
  | reduce it to a small, quick-to-reproduce version. Especially
  | for deduping.
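  |
  | A minimal Polars sketch of that preview-then-full-run flow
  | (file and column names are hypothetical, and this is only one
  | way the suggestion could be wired up):
  |
  |     import polars as pl
  |
  |     def transform(lf: pl.LazyFrame) -> pl.LazyFrame:
  |         # the transform being tuned interactively
  |         return (lf.filter(pl.col("status") == "active")
  |                   .unique(subset=["email"]))
  |
  |     lf = pl.scan_csv("big.csv")  # lazy: nothing is read yet
  |
  |     # fast preview on the first 1,000 rows only
  |     print(transform(lf.head(1_000)).collect())
  |
  |     # one "click" later: the same plan, streamed over the whole
  |     # file and written straight back to disk
  |     transform(lf).sink_csv("big_deduped.csv")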
| marcellus23 wrote:
| > Marketing note: I'm sure you're proud of P Core/V Core, but
| that doesn't matter to your users, it's an implementation
| detail. At a maximum I'd write "intelligent execution that
| scales from small files to large files".
|
| Speaking personally, "intelligent execution that scales from
| small files to large files" sounds like marketing buzz that
| could mean absolutely nothing. I like that it mentions
| specifically switching between RAM and disk-powered engines,
| because that suggests it's not just marketing speak, but was
| actually engineered. Maybe P vs V Core is not the best way to
| market it, but I think it's worth mentioning that design.
| TheTaytay wrote:
| Thank you for this. I find myself increasingly using CSVs (TSVs
| actually) as the data format of choice. I confess I wish this was
| written for Mac too, but I like the trend of (once again) moving
 | data processing down to our super computers on our desks...
| hilti wrote:
 | ... I'm trying to use our super computers in our pockets, like
 | an iPhone ;-) But I'm still struggling with how to present CSV
 | data effectively on a small screen, even though it's huge in
 | terms of pixels compared to computer screens from the 90s.
 |
 | It's interesting to research how capable applications like
 | Lotus 1-2-3 were even at low resolutions like 800x600 pixels
 | compared to today's standards.
| RyanHamilton wrote:
 | QStudio allows querying CSV on Mac via DuckDB:
 | https://www.timestored.com/qstudio/csv-file-viewer I've been
 | improving the Mac version a lot lately: key bindings, icon, an
 | app package to download. If you find any problems, please raise
 | a GitHub issue.
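 |
 | For reference, the kind of query DuckDB can run directly against
 | a CSV file looks like this (a minimal sketch using the duckdb
 | Python package; file and column names are hypothetical):
 |
 |     import duckdb
 |
 |     # DuckDB scans the CSV in place; no import step is needed.
 |     duckdb.sql("""
 |         SELECT country, count(*) AS n
 |         FROM 'big.csv'
 |         GROUP BY country
 |         ORDER BY n DESC
 |         LIMIT 10
 |     """).show()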
| hermitcrab wrote:
| If you are wrangling CSV/TSV files on Mac, it might be worth
| taking a look at Easy Data Transform.
| paddy_m wrote:
| Ok, if we are all tagging and promoting our own projects, check
| out mine.
|
| I created Buckaroo to provide a better table viewing experience
| inside of notebooks. I also built a low code UI and auto
 | cleaning to expedite the rote data cleaning tasks that take up
| a large portion of data analysis. Autocleaning is heuristically
| powered - no LLMs, so it's fast and your data stays local. You
| can apply different autocleaning strategies and visually
| inspect the results. When you are happy with the cleaning, you
| can copy and paste the python code as a reusable function.
|
 | All of this is open source, and it's extendable/customizable.
|
| Here's a video walking through autocleaning and how to extend
 | it: https://youtu.be/A-GKVsqTLMI
|
| Here's the repo: https://github.com/paddymul/buckaroo
___________________________________________________________________
(page generated 2025-05-14 23:01 UTC)