[HN Gopher] Tabula - Extract tables from PDF files
       ___________________________________________________________________
        
       Tabula - Extract tables from PDF files
        
       Author : pabs3
       Score  : 47 points
       Date   : 2021-06-07 05:46 UTC (17 hours ago)
        
        
       | gitowiec wrote:
       | Camelot and its Excalibur are great too! We used it to convert
       | different bank statements.
        
       | petalmind wrote:
       | You may also find this useful: https://github.com/adworse/iguvium
        
       | tayloramurphy wrote:
       | I used Tabula quite a bit at the startup I used to work at (> 3
       | yrs ago). We were curating and organizing genetic testing
       | information, and much of the data was sent to us as PDFs.
       | 
       | It didn't work every time, but when it did, it was awesome!
        
       | geonic wrote:
       | It's a nice tool. Found it by chance a couple of days ago. It did
       | save me a lot of typing.
        
       | smt88 wrote:
       | We tried this and a few other tools, but we ended up with
       | PDF2XL[1] (which works on everything, not just tables).
       | 
       | It's pretty ugly and not cheap, but the data extraction is
       | absolutely _magical_.
       | 
       | I very rarely feel joy and excitement when using a tool,
       | especially a PDF-related tool, but it saved our dev team at least
       | 100 hours when we first used it. We have it as an automated part
       | of one of our client flows and they happily pay us way more than
       | they should.
       | 
       | 1. https://pdf2xl.com/
        
       ___________________________________________________________________
       (page generated 2021-06-07 23:00 UTC)