[HN Gopher] Fq: Jq for Binary Formats
       ___________________________________________________________________
        
       Fq: Jq for Binary Formats
        
       Author : ingve
       Score  : 501 points
       Date   : 2023-06-03 09:13 UTC (13 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | NackerHughes wrote:
       | If the 'j' in `jq` stands for 'JSON', what does the 'f' stand
       | for?
        
         | zulu-inuoe wrote:
         | Good question. Maybe 'format' or just 'file'?
        
         | jensenbox wrote:
         | It stands for the sound you make when you cat a binary file.
        
       | qwefqgq3223 wrote:
       | This is what I like about powershell. It passes objects via
       | pipeline and if you need to query or filter something, you don't
       | need to learn millions of different tools (jq, xmlstarlet, etc.)
       | - just use programming language features for everything.
        
       | bombolo wrote:
       | What about GNU Poke? http://www.jemarch.net/poke
        
       | eftel wrote:
       | I'm currently trying to make sense of a binary file that is used
       | by a proprietary program to import data.
       | 
       | The file is generated on a server out of my control but I'm able
       | to see that some kind of key is being sent alongside the data.
       | (To encrypt it?)
       | 
       | How would one approach something like this? Where could I look
       | for freelancers who are able to help with this?
        
         | scrollaway wrote:
         | The most straightforward way will be to reverse engineer the
         | program that imports the data.
         | 
         | Look for reverse engineer freelancers. Many of them in the
         | video game space.
        
         | xmcqdpt2 wrote:
         | Do you have access to the program that reads the data? If so,
         | you can use a debugger to step through the parser for the file,
         | even if symbols are stripped [1]. You can breakpoint on
         | syscalls, such as when the file gets opened [2] and then step
         | through and look around memory for the decrypted version. If
         | you have an idea of what the file should contain you can
         | probably identify patterns this way.
         | 
         | I'm not an expert on this topic at all though.
         | 
         | [1] Of course you then have less information but it's still
         | possible to see the assembly while the file gets parsed. See
         | for example,
         | 
         | http://felix.abecassis.me/2012/08/gdb-debugging-stripped-bin...
         | 
         | [2] https://sourceware.org/gdb/onlinedocs/gdb/Set-Catchpoints.ht...
        
           | xvilka wrote:
           | For this kind of task, using low-level debugger tools is
           | probably better. Rizin[1][2]/Cutter[3][4] could help. We also
           | have a GSoC participant this year who works hard on improving
           | debuginfo and debugging support[5]. I personally also like
           | Binary Ninja, they recently made their debugger stable
           | enough[6].
           | 
           | [1] https://rizin.re/
           | 
           | [2] https://github.com/rizinorg/rizin
           | 
           | [3] https://cutter.re/
           | 
           | [4] https://github.com/rizinorg/cutter
           | 
           | [5] https://rizin.re/posts/gsoc-2023-announcement/
           | 
           | [6] https://binary.ninja/2023/05/03/3.4-finally-freed.html#debug...
        
       | sbussard wrote:
       | Oh nice! We got a public gqui now
        
         | jeffbee wrote:
         | Looks like not really. The proto support is pretty basic. It
         | can't print floats and doubles and doesn't parse groups or
         | packed fields. It doesn't use a descriptor database so it can
         | only print the field number, not its name, and it can only
         | differentiate nested messages if the user calls '|protobuf' on
         | what is otherwise considered a string.
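         | 
         | For illustration only (this is not fq's code): a minimal Python
         | sketch of the protobuf wire format, assuming just the four
         | non-group wire types. The function names are made up. It shows
         | why, without a descriptor, a decoder only sees field numbers and
         | cannot tell a nested message from a string: both arrive as wire
         | type 2.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def scan_fields(buf):
    """Yield (field_number, wire_type, raw_value) for each top-level field."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit (e.g. double)
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited: string, bytes, nested message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit (e.g. float)
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, wire_type, value


# Field 1 = 150 (varint), field 2 = "testing", field 3 = nested {1: 150}
sample_message = bytes.fromhex("089601" "120774657374696e67" "1a03089601")
fields = list(scan_fields(sample_message))
nested = list(scan_fields(fields[2][2]))
```

         | Fields 2 and 3 both come back as raw bytes; only running
         | scan_fields again on field 3's payload reveals the nested
         | message, which is the step the user has to request by hand.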
        
       | jxf wrote:
       | This looks beautiful. When there's more data I bet it's going to
       | be a great tool to hook up to LLMs. "Draw the first frame of this
       | video using fq."
        
       | fulafel wrote:
       | Previously (2021): https://news.ycombinator.com/item?id=29657094
        
         | dang wrote:
         | Thanks! Macroexpanded:
         | 
         |  _Fq: Jq for Binary Formats_ -
         | https://news.ycombinator.com/item?id=29657094 - Dec 2021 (81
         | comments)
        
       | majkinetor wrote:
       | Awesome.
       | 
       | It would be good if some form of externally pluggable binary
       | format specification is doable in the future. As far as I can
       | see, if the binary format is not supported OTB, you can't use
       | this tool.
        
         | notpushkin wrote:
         | Kaitai Struct might be a good choice for that:
         | https://kaitai.io/
        
           | theozaurus wrote:
           | I've found Kaitai struct to be an absolute joy to use on
           | things like EEPROM dumps from car computers.
        
             | leoh wrote:
             | Has anyone found anything incredible for arbitrary UTF-8
             | data?
        
           | evntdrvn wrote:
           | Thanks for the link, that's really cool!
        
         | Hackbraten wrote:
         | I suppose OTB means "off the bat?" Non-native speaker here, and
         | my web search turned up nothing.
        
           | orls wrote:
           | I assume in this case it's "Out (of) The Box"
        
         | nyberg wrote:
         | http://www.jemarch.net/poke might be interesting in that case.
        
           | wakeupcall wrote:
           | I second poke. It's an amazing tool for debugging in general.
           | 
           | It's relatively rare to look into standardized binary formats
           | (you'll likely look directly into a library at that point),
           | unless you're writing a writer/parser/decoder yourself and
           | need to double-check the output.
           | 
           | When developing with general binary data in mind, poke is
           | much more useful.
        
       | jensenbox wrote:
       | Did I miss something or is there no Ubuntu or Debian installer?
       | 
       | I certainly know how to download a file and add it to my path (or
       | put it in my personal bin directory) but sure would be nice to
       | have a super simple installer.
       | 
       | LOL - please do not make a snap or whatever the hell the "cool
       | kids" use. I certainly wouldn't want to advocate continued use of
       | that pattern for utility functions.
        
         | dima55 wrote:
         | It's already in stock Debian.
        
         | 2h wrote:
         | This has got to be one of the pettiest gripes I have heard in a while.
         | It's already built, you literally just extract and run:
         | 
         | https://github.com/wader/fq/releases/tag/v0.6.0
        
         | fullspectrumdev wrote:
         | You can always submit a PR that adds releases support for
         | Debian or RPM packages or whatever format you like.
         | 
         | I might consider doing that next week, just realised I have no
         | idea how a RPM is made, and I haven't thought about making a
         | Debian package in years.
        
           | piperswe wrote:
           | Since this is written in Go, it's almost trivial to use fpm
           | [1] to generate a variety of packages. Alternatively you can
           | use nfpm [2] if you don't want to have to deal with
           | installing Ruby & a gem.
           | 
           | [1]: https://github.com/jordansissel/fpm [2]:
           | https://github.com/goreleaser/nfpm
        
       | sillysaurusx wrote:
       | fqing finally. It always seemed strange that there wasn't any
       | central database of binary parsers that everyone could contribute
       | to. Nearly every file format is fully documented, but none of the
       | docs are programmatic.
       | 
       | I was trying to rip some sounds from a wii game called Rhythm
       | Heaven, and it's ridiculous how primitive the tech is. By that I
       | mean the programming community's tech. If you want to extract
       | some assets, you'd better be running Windows, and you'll need to
       | download some random exe from mediafire made by Jared, a 13yo
       | that coded the extractor in C in his spare time. This is only a
       | very slight exaggeration; Windows being a requirement isn't.
       | 
       | Hopefully projects like this will standardize all binary formats
       | once and for all.
       | 
       | Actually, this is a good opportunity to ask: how would one
       | contribute a binary parser to fq? If I wrote one for wii sound
       | files, can I just submit a PR or is there some other process?
       | 
       | EDIT: https://github.com/wader/fq/blob/master/doc/dev.md
       | documents the development side of things. I'm more interested in
       | the project itself -- if someone puts in the work to make a
       | decoder for an obscure binary format, will it get merged
       | (assuming it's high quality) or is this only for popular formats?
        
         | nonethewiser wrote:
         | > I was trying to rip some sounds from a wii game called Rhythm
         | Heaven, and it's ridiculous how primitive the tech is.
         | 
         | This seems like a pretty niche need with only some hobbyists
         | motivated enough to work on it. Is there a broader application
         | than your use case? Otherwise I think that's why the existing
         | software for this isn't great.
        
           | sillysaurusx wrote:
           | Yeah, I didn't mean to sound entitled. I only meant I was
           | excited for projects like fq to shake things up. When I was
           | writing a parser for the sound file I was thinking "hmm...
           | this really feels like duplicate work."
           | 
           | On the other hand, it's surprising to me that "grab sounds
           | from a wii game" is so niche! My gut felt like it would be
           | slightly more complicated than unzipping a tarball, but my
           | gut didn't expect it to be a programming challenge worthy of
           | a small competition.
        
             | foobarian wrote:
             | It's all fun and games until Nintendo sends in a cease and
             | desist notice
        
               | sillysaurusx wrote:
               | Seriously. Nintendo is an evil corporation. I was about
               | to write "close to," but they crossed the line when they
               | sent someone to prison and garnished his wages for the
               | rest of his life.
        
               | ninepoints wrote:
               | I really don't think protecting IP produced at company
               | expense is "evil." That's their prerogative, and people
               | knowingly violating agreements/ToS are playing with fire.
        
               | ozim wrote:
               | Not just someone but dude was not ripping or cracking
               | stuff for fun. He made business out of it and what's
               | worse he added ransomware to scam his "customers".
               | 
               | POS deserved all of it if not more.
        
               | huimang wrote:
               | The punishment was severely out of proportion to the
               | actual crime. He was definitely being made into an
               | example.
        
               | sillysaurusx wrote:
               | If he added ransomware, that's a bit different. Making a
               | business out of it isn't that bad (think about it in
               | terms of people going to prison for assault vs merely
               | making some money), but the ransomware would be.
               | 
               | Still, garnishing someone's wages the rest of his life
               | seems out of proportion. But I admit it's harder to
               | defend someone that made a livelihood out of holding
               | peoples' data hostage.
        
         | wongarsu wrote:
         | There is the 010 Editor, at heart a cross-platform scriptable
         | hex editor with a template language [1]. It has a central
         | template repository [2] as well as templates around the
         | internet (e.g. 3, 4).
         | 
         | But it being a paid tool means there are fewer template
         | contributions from 13 year olds, which if we are all honest
         | make up the majority of unpaid open source contributions - they
         | simply have more spare time.
         | 
         | 1: https://www.sweetscape.com/010editor/
         | 
         | 2: https://www.sweetscape.com/010editor/repository/templates/
         | 
         | 3: https://github.com/tge-was-taken/010-Editor-Templates/tree/m...
         | 
         | 4:
         | https://wiki.redmodding.org/cyberpunk-2077-modding/modding-k...
        
           | overengineer wrote:
           | What is wrong with contributions from 13 year olds?
        
             | xp84 wrote:
             | I don't think it was implied that there's anything bad, I
             | read it as the opposite.
        
             | oxygen_crisis wrote:
             | There aren't enough of them, because the tool costs money,
             | and 13-year-olds can't spend money on software as readily
             | as adults.
        
           | nness wrote:
           | 010editor's templating language looks very interesting (and
           | something I had been trying to accomplish with my own mixed-
           | bag of tools when reverse-engineering). I suppose as a
           | hobbyist, the price is a hard one to get over...
        
         | BeefWellington wrote:
         | Binwalk is my go-to for that kind of thing usually.
        
           | fullspectrumdev wrote:
           | Try unblob sometime, it's a more modern, maintained
           | alternative (not a fork). A company called OneKey that do
           | some firmware security stuff maintain it, and generally it's
           | pretty good.
        
         | stelonix wrote:
         | I used to be from the romhacking community back in the 2000s
         | and due to usage of Windows, open source/foss wasn't even known
         | to most people. The culture of Windows programmers is way more
         | focused on freeware/binaries.
         | 
         | Still waiting to this day for FuSoYa to release the source code
         | of Lunar Magic.
         | 
         | About a central database of binary parsers, I've been wanting
         | this for ages too. The closest I ever found was augeas, but
         | that's for configuration files.
        
           | xp84 wrote:
           | Yeah that's a mysterious one. Such an incredible achievement,
           | and it enabled so, so much creativity, and free (as in beer)
           | to all as far as I know. I hope FuSoYa does open source it
           | someday.
        
           | marlin wrote:
           | > About a central database of binary parsers, I've been
           | wanting this for ages too. The closest I ever found was
           | augeas, but that's for configuration files.
           | 
           | I'm working on something, that is an open template format for
           | binary file formats. It is usable today as a universal file
           | extractor, with some bugs and limitations.
           | 
           | Check it out at https://github.com/martinlindhe/feng
        
         | amenghra wrote:
         | I miss the ResEdit days of the classic Macintosh (i.e. MacOS 7-8-9).
         | You could see/steal/modify/hack the visual assets of most
         | binaries. You could remap keyboard shortcuts, modify menus,
         | etc.
        
           | mpweiher wrote:
           | Well, app wrappers make this even simpler: just poke inside
           | with a file browser.
        
         | BoppreH wrote:
         | > It always seemed strange that there wasn't any central
         | database of binary parsers that everyone could contribute to.
         | 
         | Fully agree on the need for such a database. The problem is
         | that it's been tried, but lacked traction. The latest one I've
         | seen is Kaitai format files[1], that can be used in visualizers
         | or to auto-generate parsers.
         | 
         | [1] https://formats.kaitai.io/
        
           | fullspectrumdev wrote:
           | Thanks for the reminder about Kaitai, I've been meaning to
           | look at it specifically for something but forgot what it was
           | called.
        
         | kazinator wrote:
         | > It always seemed strange that there wasn't any central
         | database of binary parsers that everyone could contribute to.
         | 
         | file utility with /etc/magic database
        
           | loeg wrote:
           | Yeah, that was my thought as well, although I don't know that
           | file/magic get into much structure beyond identifying the
           | file format.
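           | 
           | A rough Python sketch of that identification step, with a
           | hand-picked signature table standing in for the magic
           | database. The table and function name are assumptions for
           | illustration, not file's actual rule format:

```python
# A few well-known signatures found at offset 0.
# Real magic databases hold far more rules, including offsets and masks.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\x7fELF", "ELF executable"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x1f\x8b", "gzip data"),
    (b"%PDF-", "PDF document"),
]


def identify(data: bytes) -> str:
    """Return a format name based on the leading bytes, or 'unknown'."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


elf_guess = identify(b"\x7fELF\x02\x01\x01\x00")
text_guess = identify(b"hello world")
```

           | This only names the format. It says nothing about the
           | structure inside, which is the gap a parser database would
           | have to fill.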
        
         | panzi wrote:
         | That's also my experience. People releasing their tools on
         | obscure forums, usually without source code and no version
         | control. Almost entire communities that haven't heard of GitHub.
         | Though usually the tools work in wine... but are next to
         | useless. GUI only and very clumsy to use. So many clicks for
         | each single file to extract, no batch operations, no command
         | line interface. A big WTF. How can you do any modding with
         | those tools?
        
         | realjhol wrote:
         | Like Wireshark dissectors but more general
        
         | grumbel wrote:
         | There is QuickBMS[1], which covers quite a few game related
         | formats.
         | 
         | [1] http://aluigi.altervista.org/quickbms.htm
        
         | erichdongubler wrote:
         | It seems strange, because it's not reality! Forensic tools like
         | FTK and Autopsy have had a plug-in framework for these forever,
         | speaking as a former contributor to the former. There's also
         | Kaitai Struct.
         | 
         | I'm sure other communities have popped up that I haven't heard
         | of, too. There's lots of interest in unifying forensic parsing
         | under open work.
        
           | marlin wrote:
           | I'm working on something, that is an open template format for
           | binary file formats. It is usable today as a universal file
           | extractor, with some bugs and limitations.
           | 
           | Check it out at https://github.com/martinlindhe/feng
        
         | andoma wrote:
         | Being a long time personal friend with the author I can assure
         | you the more obscure the better :-) His interest in esoteric
         | things and solutions is "well documented" if you browse around
         | his github repos.
         | 
         | Some examples:
         | 
         | https://github.com/wader/jqjq https://github.com/wader/catgolf
        
           | sillysaurusx wrote:
           | Wonderful, thank you so much! Such a cool project.
        
           | Akronymus wrote:
           | I am RE-ing some binary file formats for games. Should those
           | be contributed as well?
        
         | jchw wrote:
         | Kaitai Struct is the better way to go for an ecosystem
         | solution, but the tooling could certainly use improvements. In
         | addition to 010 Editor, there's also KDE's Okteta. It does not
         | have a lot of good OSDs and the OSD format/scripting for
         | specifying formats is a little anemic (I'd like to help improve
         | it if I can find time...) but it's very serviceable and a
         | decent open source alternative to what 010 has. Shameless plug,
         | I made a decent Windows EXE/PE OSD for Okteta. (It's even got a
         | bit of support for NE16 executables, just for fun.)
         | 
         | This entire genre of tools has been a long time point of
         | interest for me. In addition to making a couple OSDs and
         | contributing some tiny improvements to Kaitai, I also have my
         | own binary schema library for Go, restruct, which, biased or
         | not, remains my favorite way to poke at arbitrary formats,
         | since it's really easy to sketch stuff out and read and write
         | to files quickly. It's basically Go's encoding/binary but with
         | struct tags for more advanced things.
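         | 
         | To give a feel for the declarative style, here is a loose
         | Python analogue using the standard struct module. It is not
         | restruct's API; the record layout and helper names are
         | hypothetical. A field table drives both reading and writing:

```python
import struct

# Hypothetical record layout: (field name, struct format code).
LAYOUT = [("magic", "4s"), ("version", "H"), ("flags", "H"), ("length", "I")]
# "<" selects little-endian with no padding between fields.
FORMAT = "<" + "".join(code for _, code in LAYOUT)


def unpack_record(data: bytes) -> dict:
    """Read the fields named in LAYOUT from the start of data."""
    values = struct.unpack_from(FORMAT, data)
    return dict(zip((name for name, _ in LAYOUT), values))


def pack_record(record: dict) -> bytes:
    """Serialize a dict back to bytes using the same layout."""
    return struct.pack(FORMAT, *(record[name] for name, _ in LAYOUT))


record = {"magic": b"DEMO", "version": 2, "flags": 0, "length": 16}
encoded = pack_record(record)
decoded = unpack_record(encoded)
```

         | Changing the sketch of a format then means editing one table
         | rather than two functions.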
        
           | marlin wrote:
           | > This entire genre of tools has been a long time point of
           | interest for me.
           | 
           | Same for me! It turned into a long journey and I am working
           | on a solution that I am very happy about.
           | 
           | Somewhere mid-journey I learned about kaitai struct and lost
           | a bit of steam seeing it was similar. But I think my offering
           | is superior in a more simple template format with less
           | programming required and a nice cli app.
           | 
           | I am yet to announce it publicly, but I've been meaning to for
           | so long already.
           | 
           | If you would like to check it out I would be happy!
           | 
           | You can use it to map / view content from a format there is a
           | template for. A lot of common formats are already included and
           | you can extend it using your own templates.
           | 
           | https://github.com/martinlindhe/feng
        
             | joshspankit wrote:
             | As someone who had a great time with Kaitai, may I suggest
             | that you write an interface so that fq can be used with any
             | format that Kaitai understands (and any that people add in
             | future)
        
       | natch wrote:
       | In my mind the 'f' here stands for 'f yeah'
        
       | _pmf_ wrote:
       | I'd also like to throw https://github.com/WerWolv/ImHex in the
       | mix here.
        
       | candiodari wrote:
       | LOVE the name.
        
         | nonethewiser wrote:
         | Reminds me of this emphatically best middle school
         | 
         | Fuqing #1 Middle School
         | 
         | https://g.co/kgs/HPtt5n
        
         | dotancohen wrote:
         | I didn't even think of that until you mentioned it!
        
         | pcthrowaway wrote:
         | No, FQ!!
        
         | jacknews wrote:
         | the name kind of sucks, why 'f' q? F for ... 'FU' ('teenage
         | snigger'). It should be 'bq', binary query, after 'jq' json
         | query.
         | 
         | Cool project none-the-less. The comment about 'programmatic
         | documentation' of binary formats is very interesting, maybe
         | some kind of 'binary description markup' could be part of this?
        
       | brabel wrote:
       | I've written a parser of Java class files (which works in any JVM
       | language as they all compile to the same bytecode format). It was
       | surprisingly easy! Maybe that could be useful to analyse class
       | files in fq??!
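       | 
       | As a small taste of why it is easy, here is a Python sketch (not
       | my parser, and the function name is made up) that reads only the
       | fixed header of a class file: the 0xCAFEBABE magic, the minor and
       | major version, and the constant pool count, all big-endian.

```python
import struct


def parse_class_header(data: bytes) -> dict:
    """Parse the first 10 bytes of a Java class file (big-endian fields)."""
    magic, minor, major, cp_count = struct.unpack_from(">IHHH", data, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return {
        "minor_version": minor,
        "major_version": major,
        "constant_pool_count": cp_count,
    }


# Hand-built header bytes: magic, minor 0, major 52 (Java 8), pool count 29.
class_bytes = bytes.fromhex("cafebabe" "0000" "0034" "001d")
header = parse_class_header(class_bytes)
```

       | Everything after the header (the constant pool, fields, methods,
       | attributes) is variable-length and needs a real parser.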
        
       | pmoriarty wrote:
       | Too bad that jq has such a shitty, convoluted syntax.
       | 
       | Could have definitely chosen one of the many alternatives to jq
       | that also worked with JSON but did so in a much clearer and more
       | elegant way.
        
         | hedora wrote:
         | Can you name an alternative to jq with better syntax?
        
           | dmoura wrote:
           | I prefer a SQL-like format. It's not as complete but it covers
           | most of the day-to-day use cases. Take a look at
           | https://github.com/dcmoura/spyql (I am the author). Congrats
           | on fq!
        
         | xp84 wrote:
         | Commenting to follow because I'm curious what alternatives you
         | mean. I thought a lot of people liked jq and I only just
         | finally got around to installing it, so if there's a much
         | better way I'd like to hear it.
        
       | usr1106 wrote:
       | FOSDEM presentation by the author earlier this year:
       | https://fosdem.org/2023/schedule/event/bintools_fq/
        
       | pabs3 wrote:
       | I love that this includes a section in the README about other
       | tools that are similar or related to fq. Every open source
       | project should list its competitors.
        
         | SkyPuncher wrote:
         | I don't even necessarily see listing alternatives as
         | competition. Often alternatives solve overlapping, but
         | different problems.
        
         | suavesito wrote:
         | Not only that, the README has a section called _Hopes_, the
         | second bullet point being
         | 
         | > Inspire people to create similar tools.
        
         | el_oni wrote:
         | I wouldn't even say they are necessarily competitors. More like a
         | flathead screwdriver to complement your Phillips in most of
         | those cases
        
         | tpoacher wrote:
         | Agreed. I try to do the same with my stuff. Honest, transparent
         | market research / placement improves your offering, rather than
         | diminishes it.
        
         | tough wrote:
         | Right? Not doing so is either disingenuous or amateurish
        
       ___________________________________________________________________
       (page generated 2023-06-03 23:00 UTC)