[HN Gopher] Show HN: Musoq - Query Anything with SQL Syntax (Git...
       ___________________________________________________________________
        
       Show HN: Musoq - Query Anything with SQL Syntax (Git, C#, CSV, Can
       DBC)
        
       Hey! For those of you who don't know Musoq: it's a small tool
       that lets you query things with SQL-like syntax, without any
       database. It can query all sorts of sources, from niche ones like
       CAN DBC files, through weird ones like C# code and interesting
       ones like Git repositories, to regular stuff like CSV, TSV and
       various others.

       I experiment quite a bit, so I'm also hybridizing the engine with
       LLMs and doing other weird stuff that is more or less practical
       :-) I also wanted to share some recent developments in this
       little project, as I hope they might be interesting to some of
       you.

       New Experimental Plugins:

       * _Git Plugin (Beta)_: I've been working on Git repository
         querying. I managed to test it on the EF Core repo (16k
         commits) and it seems to work okay.
       * _Roslyn Plugin (Beta)_: Added basic C# code analysis
         capabilities.

       For the very first time, I've extended CROSS APPLY to use
       computed results as arguments! The operator can now use values
       from the current row as inputs. Here's an example:

           SELECT
               f.DirectoryName,
               f.FileName
           FROM #os.directories('/some/path', false) d
           CROSS APPLY #os.files(d.FullName, true) f
           WHERE d.Name IN ('Folder1', 'Folder2')

       After another batch of fixes, I'm finally able to query multiple
       Git repositories AT ONCE!

           with ProjectsToAnalyze as (
               select
                   dir2.FullName as FullName
               from #os.directories('D:\repos', false) dir1
               cross apply #os.directories(dir1.FullName, false) dir2
               where
                   dir2.Name = '.git'
           )
           select
               c.Message,
               c.Author,
               c.CommittedWhen
           from ProjectsToAnalyze p
           cross apply #git.repository(p.FullName) r
           cross apply r.Commits c
           where c.AuthorEmail = 'my-email@email.ok'
           order by c.CommittedWhen desc

       Under the Hood:

        - Added a _Buckets_ feature for memory management (currently
          just testing it with the Roslyn plugin)
        - Moved to _.NET 8_
        - Added _CROSS / OUTER APPLY_ operators
        - Made some improvements to error messages and runtime behavior

       New piping features: I've been experimenting with piping
       capabilities:

       * _Image Analysis with LLMs_:

           ./Musoq.exe image encode "image.jpg" |
             ./Musoq.exe run query "select s.Shop, s.ProductName, s.Price from ..."

       * _Text Data Extraction_:

           Get-Content "ticket.txt" |
             ./Musoq.exe run query "select t.TicketNumber, t.CustomerName ... from #stdin.text('Ollama', 'llama3.1') t"

       * _Data Source Combination_:

           { docker image ls; ./Musoq.exe separator; docker container ls } |
             ./Musoq.exe run query "..."

       I'm working on comprehensive documentation. I especially
       encourage you to look at the "Practical Examples and
       Applications" and "Data Sources" sections, where you can see all
       the tables the tool currently provides.
       <https://puchaczov.github.io/Musoq/>

       Other Changes:

        - Made some improvements to OS and Archive data sources (OS can
          now query metadata like EXIF)
        - Added a few fields to CAN DBC plugin
        - Command outputs can now be used as inputs for queries

       I'm hoping to:

        - Improve stability and add more tests
        - Flesh out the documentation
        - Work on package distribution (Scoop, Ubuntu packages)
        - Share some examples of source code querying with Roslyn

       Ideas for later:

        - WHERE robust analysis and optimizations
        - DISTINCT operator implementation
        - PROTOBUF schema support
        - Performance improvements
        - Query parallelization
        - Recursive CTEs
        - Subqueries

       I'd really appreciate any thoughts or feedback!

       The documentation section where I write a short analysis of EF
       Core with git plugin:
       <https://puchaczov.github.io/Musoq/practical-examples-and-app...>
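
       For readers unfamiliar with APPLY, here is a minimal Python
       sketch of the general idea behind the CROSS APPLY change
       described in the post: the right-hand source is re-evaluated for
       every left-hand row, using values from that row as arguments.
       This is not Musoq's implementation. The names cross_apply,
       outer_apply, FAKE_TREE and files_in are hypothetical, the
       directory tree is an in-memory stand-in, and the OUTER APPLY
       behavior (keeping unmatched left rows with an empty right side)
       is assumed from the usual T-SQL semantics rather than taken from
       the post.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Tuple, TypeVar

L = TypeVar("L")
R = TypeVar("R")


def cross_apply(left: Iterable[L], fn: Callable[[L], Iterable[R]]) -> Iterator[Tuple[L, R]]:
    """Call fn once per left row; left rows for which fn yields nothing are dropped."""
    for row in left:
        for produced in fn(row):
            yield row, produced


def outer_apply(left: Iterable[L], fn: Callable[[L], Iterable[R]]) -> Iterator[Tuple[L, Optional[R]]]:
    """Like cross_apply, but keep left rows with no results (assumed T-SQL-style semantics)."""
    for row in left:
        matched = False
        for produced in fn(row):
            matched = True
            yield row, produced
        if not matched:
            yield row, None


# Hypothetical stand-in for a directory listing: directory name -> file names.
FAKE_TREE: Dict[str, List[str]] = {
    "Folder1": ["a.txt", "b.txt"],
    "Folder2": [],
    "Folder3": ["c.txt"],
}


def files_in(directory: str) -> List[str]:
    """Row-dependent source: its argument comes from the current left row."""
    return FAKE_TREE.get(directory, [])


# Rough analogue of: ... d CROSS APPLY files(d) f WHERE d.Name IN ('Folder1', 'Folder2')
wanted = [d for d in FAKE_TREE if d in ("Folder1", "Folder2")]
cross_rows = list(cross_apply(wanted, files_in))
outer_rows = list(outer_apply(wanted, files_in))
```

       In this sketch, Folder2 has no files, so it disappears from
       cross_rows but survives in outer_rows with None on the right.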
        
       Author : Puchaczov
       Score  : 11 points
       Date   : 2024-12-18 19:02 UTC (3 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       ___________________________________________________________________
       (page generated 2024-12-18 23:01 UTC)