[HN Gopher] Fselect: Find files with SQL-like queries
       ___________________________________________________________________
        
       Fselect: Find files with SQL-like queries
        
       Author : ingve
       Score  : 175 points
       Date   : 2021-04-05 12:07 UTC (1 day ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | diroussel wrote:
       | This is very similar to fsql, which looks similarly useful.
       | 
       | https://github.com/kashav/fsql
        
       | bdcravens wrote:
       | ColdFusion/CFML has had this capability for a while: Most
       | iterables return a "query" data type, and queries can be queried
       | against using an in-memory SQL engine.
       | 
       | https://cfdocs.org/directorylist
       | 
       | https://cfdocs.org/queryexecute
        
       | dimator wrote:
       | I've found myself using this rather than coming up with the
       | equivalent `find` predicate.
        
       | cbm-vic-20 wrote:
       | The BeOS filesystem had something like this built-in, back in the
       | day.
       | 
       | https://arstechnica.com/information-technology/2018/07/the-b...
        
         | tjfontaine wrote:
         | My first thought was https://en.wikipedia.org/wiki/WinFS
        
       | TruthWillHurt wrote:
       | This is amazing. A lot of people forget the fact that the
       | filesystem is in a way a database, and a flexible one at that!
        
       | samatman wrote:
       | The perfect complement for this would be a reboot of the BeOS-
       | style filesystem.
       | 
       | For the unaware, BeFS combines the attributes of a hierarchical
       | filesystem and (some of) a relational database. It could do
       | interesting things like store contacts as a zero-contents file
       | which had key-value pairs for things like names and email
       | addresses.
       | 
       | There are some open-source reimplementations of it, thanks to the
       | Haiku project; I wonder if they've ever been integrated with the
       | Linux kernel and distros.
        
         | hinkley wrote:
         | Microsoft tried this in NT5.0 and ended up shelving it.
         | 
         | Trying again was a substantial reason for the delay of one of
         | the more recent versions. Vista perhaps?
        
       | tgv wrote:
       | This on macOS' Spotlight database...
        
       | AtlasBarfed wrote:
       | I know this is Rust and has its own query engine, and it is pretty
       | cool, but it makes me wonder how much could be done with:
       | 
       | 1) dump filesystem metadata (or some other data schema
       |    generation) into SQLite
       | 2) pass the query to SQLite
       | 3) pass the SQLite response to stdout
       | 
       | and it would still be faster than ad hoc query engines like
       | this.
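
The three steps above can be sketched in Python with the standard-library `sqlite3` module. This is only a minimal illustration of the idea, not how fselect works: the `files` table, its columns, and the helper names `index_tree` and `query` are assumptions made up for this example, and the index is a one-off snapshot that is not kept in sync with the filesystem.

```python
import os
import sqlite3


def index_tree(root, conn):
    """Step 1: walk `root` once and store basic file metadata in SQLite.

    The `files` table and its columns are hypothetical, chosen for this sketch.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, name TEXT, size INTEGER, mtime REAL)"
    )
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)  # do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            rows.append((full, name, st.st_size, st.st_mtime))
    conn.executemany("INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def query(conn, sql, params=()):
    """Steps 2 and 3: run the SQL and hand the rows back to the caller."""
    return conn.execute(sql, params).fetchall()


# Example use (not executed here):
#   conn = sqlite3.connect(":memory:")
#   index_tree(".", conn)
#   for path, size in query(
#       conn,
#       "SELECT path, size FROM files WHERE name LIKE ? ORDER BY size DESC",
#       ("%.py",),
#   ):
#       print(size, path)
```

Almost all of the wall-clock time in a sketch like this goes into the `os.walk` traversal in step 1; the SQL query itself runs against an already-built table.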
        
         | jhayward wrote:
         | In my experience the limiting factor in response time is the
         | traversal of the FS/OS structures in your step 1. It seems
         | unlikely that anything this program is doing would be any
         | slower than what you are describing.
        
           | majkinetor wrote:
           | Not necessarily.
           | 
           | On Windows, for example, there is the Everything search
           | engine, which scans the NTFS table and installs a filter
           | driver. It's instant on any disk size. If it kept its
           | database in SQLite, we would have exactly what AtlasBarfed
           | suggested.
        
             | jhayward wrote:
             | I am drawing a distinction between actually using an SQL
             | database as the dynamic attribute store of the file system,
             | vs "dumping the FS to SQLite", which implies an on-demand
             | traversal to me.
        
       | mshockwave wrote:
       | I'm surprised to see this in the `--help` menu:
       | 
       | ```
       | Japanese string:
       |     CONTAINS_JAPANESE    Used to check if string value contains Japanese symbols
       |     CONTAINS_KANA        Used to check if string value contains kana symbols
       |     CONTAINS_HIRAGANA    Used to check if string value contains hiragana symbols
       |     CONTAINS_KATAKANA    Used to check if string value contains katakana symbols
       |     CONTAINS_KANJI       Used to check if string value contains kanji symbols
       | ```
        
       | leephillips wrote:
       | I haven't tried it yet, but this looks potentially super useful.
        
       | Jeff_Brown wrote:
       | Grep integration would be awesome -- e.g. "find files with name
       | satisfying x and content satisfying y", or (even better) "files
       | in which grep query y is satisfied within n lines of a spot
       | satisfying grep query z").
        
         | TruthWillHurt wrote:
         | Isn't this what pipes are for?
        
         | tyingq wrote:
         | It does support regular expressions directly:
         | 
         |     ~= | =~ | regexp | rx    Used to check if the column value
         |                              matches the regex pattern
         |     !=~ | !~= | notrx        Used to check if the column value
         |                              doesn't match the regex pattern
        
         | rhizome wrote:
         |     find $PATH -regex $FILENAME_REGEX -exec grep $CONTENT_REGEX {} +
         |     grep -r [-E] $CONTENT_REGEX --include $FILEGLOB
         | 
         | Your bonus case:
         | 
         |     find $PATH -regex $FILENAME_REGEX \
         |         -exec sh -c "grep [-E] -$NUMBER_OF_CONTEXT_LINES $CONTENT_REGEX1 {} | grep -q [-E] $CONTENT_REGEX2" \; -print
         | 
         | The "[-E]" here is just to mention the command switch if you
         | want to use extended regex.
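
For comparison, the "bonus case" (one pattern matched within n lines of another, in files whose names match a third pattern) can also be sketched in Python using only the standard library. The function name `files_with_nearby_matches` and its parameters are hypothetical, invented for this illustration. Unlike `find -regex`, which matches against the whole path, this sketch matches the name pattern against the file name only, and it reads each candidate file fully into memory.

```python
import os
import re


def files_with_nearby_matches(root, name_re, first_re, second_re, window):
    """Yield paths under `root` whose file name matches `name_re` and that
    contain a line matching `second_re` within `window` lines of a line
    matching `first_re`."""
    name_pat = re.compile(name_re)
    first_pat = re.compile(first_re)
    second_pat = re.compile(second_re)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name_pat.search(name):
                continue
            path = os.path.join(dirpath, name)
            try:
                # Undecodable bytes are replaced rather than raising an error.
                with open(path, encoding="utf-8", errors="replace") as fh:
                    lines = fh.read().splitlines()
            except OSError:
                continue  # unreadable file; skip it
            first_hits = [i for i, line in enumerate(lines) if first_pat.search(line)]
            second_hits = [i for i, line in enumerate(lines) if second_pat.search(line)]
            # Report the file if any pair of hits is at most `window` lines apart.
            if any(abs(i - j) <= window for i in first_hits for j in second_hits):
                yield path
```

A call such as `files_with_nearby_matches(".", r"\.py$", "def ", "TODO", 3)` would list Python files in which a `TODO` appears within three lines of a `def`.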
        
       | freedomben wrote:
       | If you're skeptical, I was too.
       | 
       | My thoughts before reading the README.md: SQL is nice and all but
       | seems verbose and awkward compared to just using find.
       | 
       | My thoughts after reading the README.md: Wow, this tool can query
       | on file format-specific metadata such as ID3 tags in mp3 files
       | and even width/height in pictures. This is really neat. There is
       | a lot of possibility here.
        
       | crazygringo wrote:
       | I just want to say this is really cool.
       | 
       | Honestly I'd love to see SQL (or something like it) become a
       | first-class citizen in operating systems generally -- as a
       | standard way for manipulating files, logs, preference files, etc.
       | 
       | To be clear: _not_ turning things into databases (keep logs as
       | logs), but make everything interpretable to SQL.
       | 
       | It's bizarre to me that in just a few lines I can achieve magic
       | with a database... but I can't trivially do the same magic with
       | my filesystem.
       | 
       | Just like a shell is an integral part of an OS... shouldn't a
       | query language be too? I'd love it if, out-of-the-box, Linux
       | distributions came with a standardized SQL+JSON-over-the-
       | filesystem approach that functioned as a small-scale working
       | database for any tool that wanted to use it as such.
        
         | aratno wrote:
         | Have you used osquery? Seems like a good chunk of what you're
         | looking for!
         | 
         | https://www.osquery.io/schema/4.7.0/
        
           | crazygringo wrote:
           | It's halfway to what I'm talking about, thanks. Absolutely yes
           | to the idea of being able to query everything!
           | 
           | But for me the idea of _writing_ (CREATE, UPDATE, DELETE)
           | would be an integral part of it as well -- whether inserting
           | a property into a JSON file, adding a cronjob as column
           | parameters rather than a text line, or creating files.
        
             | neolog wrote:
             | The writing part is what NixOS does (with Nixlang rather
             | than SQL).
        
       | iwebdevfromhome wrote:
       | I have a lot of files in my home directory, especially in the
       | code folder where all my projects are (you can imagine all the
       | files that live in the Python virtualenvs and node_modules), but
       | even so the command finished in 45s. I think this is very
       | impressive!
        
       ___________________________________________________________________
       (page generated 2021-04-06 23:00 UTC)