[HN Gopher] Fselect: Find files with SQL-like queries
___________________________________________________________________
Fselect: Find files with SQL-like queries
Author : ingve
Score : 175 points
Date : 2021-04-05 12:07 UTC (1 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| diroussel wrote:
| This is very similar to fsql, which looks similarly useful.
|
| https://github.com/kashav/fsql
| bdcravens wrote:
| ColdFusion/CFML has had this capability for a while: Most
| iterables return a "query" data type, and queries can be queried
| against using an in-memory SQL engine.
|
| https://cfdocs.org/directorylist
|
| https://cfdocs.org/queryexecute
| dimator wrote:
| I've found myself using this rather than come up with the
| equivalent `find` predicate.
| cbm-vic-20 wrote:
| The BeOS filesystem had something like this built-in, back in the
| day.
|
| https://arstechnica.com/information-technology/2018/07/the-b...
| tjfontaine wrote:
| My first thought was https://en.wikipedia.org/wiki/WinFS
| TruthWillHurt wrote:
| This is amazing. A lot of people forget the fact that the
| filesystem is in a way a database, and a flexible one at that!
| samatman wrote:
| The perfect complement for this would be a reboot of the BeOS-
| style filesystem.
|
| For the unaware, BeFS combines the attributes of a hierarchical
| filesystem and (some of) a relational database. It could do
| interesting things like store contacts as a zero-contents file
| which had key-value pairs for things like names and email
| addresses.
|
| There are some open-source reimplementations of it, thanks to the
| Haiku project; I wonder if they've ever been integrated with the
| Linux kernel and distros.
| hinkley wrote:
| Microsoft tried this in NT5.0 and ended up shelving it.
|
| Trying again was a substantial reason for the delay of one of
| the more recent versions. Vista perhaps?
| tgv wrote:
| This on macOS' Spotlight database...
| AtlasBarfed wrote:
| I know this is rust and has its own query engine and it is pretty
| cool, but makes me wonder how much can be done with:
|
| 1) dump filesystem (or some other data schema generation)
|    metadata into sqlite
| 2) pass query to sqlite
| 3) pass sqlite response to stdout
|
| and it would still be faster than the adhoc query engines like
| this.
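The three steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how fselect works: the table name `files`, its columns, and the helper names `index_tree` and `run_query` are all assumptions made up for this example.

```python
import os
import sqlite3
import sys


def index_tree(root):
    """Step 1: walk `root` and load file metadata into an in-memory SQLite table.

    The `files` schema (path, name, size, mtime) is a hypothetical choice.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (path TEXT, name TEXT, size INTEGER, mtime REAL)"
    )
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            rows.append((path, name, st.st_size, st.st_mtime))
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)
    return conn


def run_query(conn, sql):
    """Step 2: hand the query to SQLite unchanged and collect the rows."""
    return conn.execute(sql).fetchall()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Step 3: print the response to stdout, one tab-separated row per line.
    root, sql = sys.argv[1], sys.argv[2]
    for row in run_query(index_tree(root), sql):
        print("\t".join(str(col) for col in row))
```

Saved under a hypothetical name such as `fsdump.py`, it could be run as `python fsdump.py . "SELECT path FROM files WHERE size > 1000000"`. Note that the whole tree is walked on every invocation, because the index lives only in memory.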
| jhayward wrote:
| In my experience the limiting factor in response time is the
| traversal of the FS/OS structures in your step 1. It seems
| unlikely that anything this program is doing would be any
| slower than what you are describing.
| majkinetor wrote:
| Not necessarily.
|
| On Windows, for example, there is the Everything search engine,
| which scans the NTFS table and installs a filter driver. It's
| instant on any disk size. If it were keeping its database in
| sqlite, we would have exactly what AtlasBarfed suggested.
| jhayward wrote:
| I am drawing a distinction between actually using an SQL
| database as the dynamic attribute store of the file system,
| vs "dumping the FS to SQLite", which implies an on-demand
| traversal to me.
| mshockwave wrote:
| I'm surprised to see this in the `--help` menu:
|
| ```
| Japanese string:
|     CONTAINS_JAPANESE    Used to check if string value contains Japanese symbols
|     CONTAINS_KANA        Used to check if string value contains kana symbols
|     CONTAINS_HIRAGANA    Used to check if string value contains hiragana symbols
|     CONTAINS_KATAKANA    Used to check if string value contains katakana symbols
|     CONTAINS_KANJI       Used to check if string value contains kanji symbols
| ```
| leephillips wrote:
| I haven't tried it yet, but this looks potentially super useful.
| Jeff_Brown wrote:
| Grep integration would be awesome -- e.g. "find files with name
| satisfying x and content satisfying y", or (even better) "files
| in which grep query y is satisfied within n lines of a spot
| satisfying grep query y".
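Both requests can be sketched in Python to make them concrete. This is a rough illustration, not a feature of fselect: the function names `find_files` and `matches_within` are hypothetical, and the proximity check assumes two distinct patterns (`pattern_a` and `pattern_b`), which is one reading of the comment above.

```python
import os
import re


def find_files(root, name_pattern, content_pattern):
    """Yield paths under `root` whose file name matches `name_pattern`
    and which contain at least one line matching `content_pattern`."""
    name_re = re.compile(name_pattern)
    content_re = re.compile(content_pattern)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name_re.search(name):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as fh:
                    if any(content_re.search(line) for line in fh):
                        yield path
            except OSError:
                continue  # unreadable file; skip it


def matches_within(lines, pattern_a, pattern_b, n):
    """Return True if some line matching `pattern_a` is at most `n` lines
    away from a line matching `pattern_b`."""
    a_re, b_re = re.compile(pattern_a), re.compile(pattern_b)
    a_hits = [i for i, line in enumerate(lines) if a_re.search(line)]
    b_hits = [i for i, line in enumerate(lines) if b_re.search(line)]
    return any(abs(i - j) <= n for i in a_hits for j in b_hits)
```

The proximity check compares every pair of hits, which is simple but quadratic in the number of matches; for large files a sliding window over the line numbers would be a better fit.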
| TruthWillHurt wrote:
| Isn't this what pipes are for?
| tyingq wrote:
| It does support regex expressions directly:
|
|     ~= | =~ | regexp | rx     Used to check if the column value matches the regex pattern
|     !=~ | !~= | notrx         Used to check if the column value doesn't match the regex pattern
| rhizome wrote:
| find $PATH -regex $FILENAME_REGEX -exec grep $CONTENT_REGEX {} +
| grep -r [-E] $CONTENT_REGEX --include $FILEGLOB
|
| Your bonus case:
|
| find $PATH -regex $FILENAME_REGEX \
|   -exec sh -c "grep [-E] -$NUMBER_OF_CONTEXT_LINES $CONTENT_REGEX1 \"\$1\" \
|     | grep -q [-E] $CONTENT_REGEX2 && echo \"\$1\"" sh {} \;
|
| The "[-E]" here is just to mention the command switch if you
| want to use extended regex.
| freedomben wrote:
| If you're skeptical, I was too.
|
| My thoughts before reading the README.md: SQL is nice and all but
| seems verbose and awkward compared to just using find.
|
| My thoughts after reading the README.md: Wow, this tool can query
| on file format-specific metadata such as ID3 tags in mp3 files
| and even width/height in pictures. This is really neat. There is
| a lot of possibility here.
| crazygringo wrote:
| I just want to say this is really cool.
|
| Honestly I'd love to see SQL (or something like it) become a
| first-class citizen in operating systems generally -- as a
| standard way for manipulating files, logs, preference files, etc.
|
| To be clear: _not_ turning things into databases (keep logs as
| logs), but make everything interpretable to SQL.
|
| It's bizarre to me that in just a few lines I can achieve magic
| with a database... but I can't trivially do the same magic with
| my filesystem.
|
| Just like a shell is an integral part of an OS... shouldn't a
| query language be too? I'd love it if, out-of-the-box, Linux
| distributions came with a standardized SQL+JSON-over-the-
| filesystem approach that functioned as a small-scale working
| database for any tool that wanted to use it as such.
| aratno wrote:
| Have you used osquery? Seems like a good chunk of what you're
| looking for!
|
| https://www.osquery.io/schema/4.7.0/
| crazygringo wrote:
| It's halfway what I'm talking about, thanks. Absolutely yes
| to the idea of being able to query everything!
|
| But for me the idea of _writing_ (CREATE, UPDATE, DELETE)
| would be an integral part of it as well -- whether inserting
| a property into a JSON file, adding a cronjob as column
| parameters rather than a text line, or creating files.
| neolog wrote:
| The writing part is what NixOS does (with Nixlang rather
| than SQL).
| iwebdevfromhome wrote:
| I have a lot of files in my home directory, especially in the code
| folder where all my projects live (you can imagine all the files in
| the Python virtualenvs and node_modules), but even so the command
| finished in 45s. I think this is very impressive!
___________________________________________________________________
(page generated 2021-04-06 23:00 UTC)