[HN Gopher] Fq: Jq for Binary Formats
___________________________________________________________________
Fq: Jq for Binary Formats
Author : ingve
Score : 501 points
Date : 2023-06-03 09:13 UTC (13 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| NackerHughes wrote:
| If the 'j' in `jq` stands for 'JSON', what does the 'f' stand
| for?
| zulu-inuoe wrote:
| Good question. Maybe 'format' or just 'file'?
| jensenbox wrote:
| It stands for the sound you make when you cat a binary file.
| qwefqgq3223 wrote:
| This is what I like about powershell. It passes objects via
| pipeline and if you need to query or filter something, you don't
| need to learn millions of different tools (jq, xmlstarlet, etc.)
| - just use programming language features for everything.
| bombolo wrote:
| What about GNU Poke? http://www.jemarch.net/poke
| eftel wrote:
| I'm currently trying to make sense of a binary file that is used
| by a proprietary program to import data.
|
| The file is generated on a server out of my control but I'm able
| to see that some kind of key is being sent alongside the data.
| (To encrypt it?)
|
| How would one approach something like this? Where could I look
| for freelancers who are able to help with this?
| scrollaway wrote:
| The most straightforward way will be to reverse engineer the
| program that imports the data.
|
| Look for reverse engineer freelancers. Many of them in the
| video game space.
| xmcqdpt2 wrote:
| Do you have access to the program that reads the data? If so,
| you can use a debugger to step through the parser for the file,
| even if symbols are stripped [1]. You can breakpoint on
| syscalls, such as when the file gets opened [2] and then step
| through and look around memory for the decrypted version. If
| you have an idea of what the file should contain you can
| probably identify patterns this way.
|
| I'm not an expert on this topic at all though.
|
| [1] Of course you then have less information but it's still
| possible to see the assembly while the file gets parsed. See
| for example,
|
| http://felix.abecassis.me/2012/08/gdb-debugging-stripped-bin...
|
| [2] https://sourceware.org/gdb/onlinedocs/gdb/Set-Catchpoints.ht...
| xvilka wrote:
| For this kind of task, using low-level debugger tools is
| probably better. Rizin[1][2]/Cutter[3][4] could help. We also
| have a GSoC participant this year who works hard on improving
| debuginfo and debugging support[5]. I personally also like
| Binary Ninja, they recently made their debugger stable
| enough[6].
|
| [1] https://rizin.re/
|
| [2] https://github.com/rizinorg/rizin
|
| [3] https://cutter.re/
|
| [4] https://github.com/rizinorg/cutter
|
| [5] https://rizin.re/posts/gsoc-2023-announcement/
|
| [6] https://binary.ninja/2023/05/03/3.4-finally-freed.html#debug...
| [deleted]
| sbussard wrote:
| Oh nice! We got a public gqui now
| jeffbee wrote:
| Looks like not really. The proto support is pretty basic. It
| can't print floats and doubles and doesn't parse groups or
| packed fields. It doesn't use a descriptor database so it can
| only print the field number, not its name, and it can only
| differentiate nested messages if the user calls '|protobuf' on
| what is otherwise considered a string.
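|
| To illustrate the point about field numbers: on the wire, each
| protobuf field is stored as a key holding only a field number and
| a wire type, followed by the value. Without a descriptor there are
| no names, and a length-delimited value could equally be a string,
| bytes, or a nested message. A minimal Python sketch of that
| layout, assuming only the four common wire types (groups are not
| handled); the function names are made up for illustration and
| this is not how fq implements it:

```python
from __future__ import annotations


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry payload
        if not byte & 0x80:               # high bit clear: last byte
            return result, pos
        shift += 7


def decode_fields(buf: bytes) -> list[tuple[int, int, object]]:
    """Return (field_number, wire_type, raw_value) for each top-level field."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit (could be a double, fixed64, ...)
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (string, bytes, or message)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit (could be a float, fixed32, ...)
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


# Field 1 holds the varint 150; field 2 holds the bytes "testing".
message = bytes.fromhex("089601") + bytes.fromhex("120774657374696e67")
for number, wire_type, value in decode_fields(message):
    print(number, wire_type, value)
```

| Nothing in the bytes says whether field 2 is text or a nested
| message, which is why a schema-less tool has to be told.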
| jxf wrote:
| This looks beautiful. When there's more data I bet it's going to
| be a great tool to hook up to LLMs. "Draw the first frame of this
| video using fq."
| fulafel wrote:
| Previously (2021): https://news.ycombinator.com/item?id=29657094
| dang wrote:
| Thanks! Macroexpanded:
|
| _Fq: Jq for Binary Formats_ -
| https://news.ycombinator.com/item?id=29657094 - Dec 2021 (81
| comments)
| majkinetor wrote:
| Awesome.
|
| It would be good if some form of externally pluggable binary
| format specification is doable in the future. As far as I can
| see, if the binary format is not supported OTB, you can't use
| this tool.
| notpushkin wrote:
| Kaitai Struct might be a good choice for that:
| https://kaitai.io/
| theozaurus wrote:
| I've found Kaitai struct to be an absolute joy to use on
| things like EEPROM dumps from car computers.
| leoh wrote:
| Has anyone found anything incredible for arbitrary UTF-8
| data?
| evntdrvn wrote:
| Thanks for the link, that's really cool!
| Hackbraten wrote:
| I suppose OTB means "off the bat?" Non-native speaker here, and
| my web search turned up nothing.
| orls wrote:
| I assume in this case it's "Out (of) The Box"
| nyberg wrote:
| http://www.jemarch.net/poke might be interesting in that case.
| wakeupcall wrote:
| I second poke. It's an amazing tool for debugging in general.
|
| It's relatively rare to look into standardized binary formats
| (you'll likely look directly into a library at that point),
| unless you're writing a writer/parser/decoder yourself and
| need to double-check the output.
|
| When developing with general binary data in mind, poke is
| much more useful.
| [deleted]
| jensenbox wrote:
| Did I miss something or is there no Ubuntu or Debian installer?
|
| I certainly know how to download a file and add it to my path (or
| put it in my personal bin directory) but sure would be nice to
| have a super simple installer.
|
| LOL - please do not make a snap or whatever the hell the "cool
| kids" use. I certainly wouldn't want to advocate continued use of
| that pattern for utility functions.
| dima55 wrote:
| It's already in stock Debian.
| 2h wrote:
| This has got to be one of the pettiest gripes I have heard in a while.
| It's already built, you literally just extract and run:
|
| https://github.com/wader/fq/releases/tag/v0.6.0
| fullspectrumdev wrote:
| You can always submit a PR that adds releases support for
| Debian or RPM packages or whatever format you like.
|
| I might consider doing that next week, just realised I have no
| idea how a RPM is made, and I haven't thought about making a
| Debian package in years.
| piperswe wrote:
| Since this is written in Go, it's almost trivial to use fpm
| [1] to generate a variety of packages. Alternatively you can
| use nfpm [2] if you don't want to have to deal with
| installing Ruby & a gem.
|
| [1]: https://github.com/jordansissel/fpm [2]:
| https://github.com/goreleaser/nfpm
| sillysaurusx wrote:
| fqing finally. It always seemed strange that there wasn't any
| central database of binary parsers that everyone could contribute
| to. Nearly every file format is fully documented, but none of the
| docs are programmatic.
|
| I was trying to rip some sounds from a wii game called Rhythm
| Heaven, and it's ridiculous how primitive the tech is. By that I
| mean the programming community's tech. If you want to extract
| some assets, you'd better be running Windows, and you'll need to
| download some random exe from mediafire made by Jared, a 13yo
| that coded the extractor in C in his spare time. This is only a
| very slight exaggeration; Windows being a requirement isn't.
|
| Hopefully projects like this will standardize all binary formats
| once and for all.
|
| Actually, this is a good opportunity to ask: how would one
| contribute a binary parser to fq? If I wrote one for wii sound
| files, can I just submit a PR or is there some other process?
|
| EDIT: https://github.com/wader/fq/blob/master/doc/dev.md
| documents the development side of things. I'm more interested in
| the project itself -- if someone puts in the work to make a
| decoder for an obscure binary format, will it get merged
| (assuming it's high quality) or is this only for popular formats?
| nonethewiser wrote:
| > I was trying to rip some sounds from a wii game called Rhythm
| Heaven, and it's ridiculous how primitive the tech is.
|
| This seems like a pretty niche need with only some hobbyists
| motivated enough to work on it. Is there a broader application
| than your use case? Otherwise I think that's why the existing
| software for this isn't great.
| sillysaurusx wrote:
| Yeah, I didn't mean to sound entitled. I only meant I was
| excited for projects like fq to shake things up. When I was
| writing a parser for the sound file I was thinking "hmm...
| this really feels like duplicate work."
|
| On the other hand, it's surprising to me that "grab sounds
| from a wii game" is so niche! My gut felt like it would be
| slightly more complicated than unzipping a tarball, but my
| gut didn't expect it to be a programming challenge worthy of
| a small competition.
| foobarian wrote:
| It's all fun and games until Nintendo sends in a cease and
| desist notice
| sillysaurusx wrote:
| Seriously. Nintendo is an evil corporation. I was about
| to write "close to," but they crossed the line when they
| sent someone to prison and garnished his wages for the
| rest of his life.
| ninepoints wrote:
| I really don't think protecting IP produced at company
| expense is "evil." That's their prerogative, and people
| knowingly violating agreements/ToS are playing with fire.
| ozim wrote:
| Not just someone but dude was not ripping or cracking
| stuff for fun. He made business out of it and what's
| worse he added ransomware to scam his "customers".
|
| POS deserved all of it if not more.
| huimang wrote:
| The punishment was severely out of proportion to the
| actual crime. He was definitely being made into an
| example.
| sillysaurusx wrote:
| If he added ransomware, that's a bit different. Making a
| business out of it isn't that bad (think about it in
| terms of people going to prison for assault vs merely
| making some money), but the ransomware would be.
|
| Still, garnishing someone's wages the rest of his life
| seems out of proportion. But I admit it's harder to
| defend someone that made a livelihood out of holding
| peoples' data hostage.
| wongarsu wrote:
| There is the 010 Editor, at heart a cross-platform scriptable
| hex editor with a template language [1]. It has a central
| template repository [2] as well as templates around the
| internet (e.g. 3, 4).
|
| But it being a paid tool means there are fewer template
| contributions from 13 year olds, which if we are all honest
| make up the majority of unpaid open source contributions - they
| simply have more spare time.
|
| 1: https://www.sweetscape.com/010editor/
|
| 2: https://www.sweetscape.com/010editor/repository/templates/
|
| 3: https://github.com/tge-was-taken/010-Editor-Templates/tree/m...
|
| 4: https://wiki.redmodding.org/cyberpunk-2077-modding/modding-k...
| overengineer wrote:
| What is wrong with contributions from 13 year olds?
| xp84 wrote:
| I don't think it was implied that there's anything bad, I
| read it as the opposite.
| oxygen_crisis wrote:
| There aren't enough of them, because the tool costs money,
| and 13-year-olds can't spend money on software as readily
| as adults.
| nness wrote:
| 010editor's templating language looks very interesting (and
| something I had been trying to accomplish with my own mixed-
| bag of tools when reverse-engineering). I suppose as a
| hobbyist, the price is a hard one to get over...
| BeefWellington wrote:
| Binwalk is my go-to for that kind of thing usually.
| fullspectrumdev wrote:
| Try unblob sometime, it's a more modern, maintained
| alternative (not a fork). A company called OneKey that do
| some firmware security stuff maintain it, and generally it's
| pretty good.
| stelonix wrote:
| I used to be from the romhacking community back in the 2000s
| and due to usage of Windows, open source/foss wasn't even known
| to most people. The culture of Windows programmers is way more
| focused on freeware/binaries.
|
| Still waiting to this day for FuSoYa to release the source code
| of Lunar Magic.
|
| About a central database of binary parsers, I've been wanting
| this for ages too. The closest I ever found was augeas, but
| that's for configuration files.
| xp84 wrote:
| Yeah that's a mysterious one. Such an incredible achievement,
| and it enabled so, so much creativity, and free (as in beer)
| to all as far as I know. I hope FuSoYa does open source it
| someday.
| marlin wrote:
| > About a central database of binary parsers, I've been
| wanting this for ages too. The closest I ever found was
| augeas, but that's for configuration files.
|
| I'm working on something that is an open template format for
| binary file formats. It is usable today as a universal file
| extractor, with some bugs and limitations.
|
| Check it out at https://github.com/martinlindhe/feng
| amenghra wrote:
| I miss the ResEdit days of the Macintosh (i.e. MacOS 7-8-9).
| You could see/steal/modify/hack the visual assets of most
| binaries. You could remap keyboard shortcuts, modify menus,
| etc.
| mpweiher wrote:
| Well, app wrappers make this even simpler: just poke inside
| with a file browser.
| BoppreH wrote:
| > It always seemed strange that there wasn't any central
| database of binary parsers that everyone could contribute to.
|
| Fully agree on the need for such a database. The problem is
| that it's been tried, but lacked traction. The latest one I've
| seen is Kaitai format files[1], that can be used in visualizers
| or to auto-generate parsers.
|
| [1] https://formats.kaitai.io/
| fullspectrumdev wrote:
| Thanks for the reminder about Kaitai, I've been meaning to
| look at it specifically for something but forgot what it was
| called.
| kazinator wrote:
| > It always seemed strange that there wasn't any central
| database of binary parsers that everyone could contribute to.
|
| file utility with /etc/magic database
| loeg wrote:
| Yeah, that was my thought as well, although I don't know that
| file/magic get into much structure beyond identifying the
| file format.
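|
| For a sense of what that identification step amounts to, here is
| a heavily simplified Python sketch: match a few well-known
| signatures at the start of the data. The table and function name
| are illustrative assumptions; the real magic database also tests
| at other offsets and has many more rule types.

```python
# A tiny, hand-picked signature table (illustrative, not the magic database).
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\x7fELF", "ELF executable"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x1f\x8b", "gzip compressed data"),
]


def identify(data: bytes) -> str:
    """Return a format name based only on the leading bytes."""
    for magic, name in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


print(identify(b"\x7fELF\x02\x01\x01"))
print(identify(b"plain text"))
```

| This names the format but says nothing about the fields inside
| it, which is the gap tools like fq and Kaitai aim at.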
| panzi wrote:
| That's also my experience. People releasing their tools on
| obscure forums, usually without source code and no version
| control. Almost entire communities that haven't heard of GitHub.
| Though usually the tools work in wine... but are next to
| useless. GUI only and very clumsy to use. So many clicks for
| each single file to extract, no batch operations, no command
| line interface. A big WTF. How can you do any modding with
| those tools?
| realjhol wrote:
| Like Wireshark dissectors but more general
| grumbel wrote:
| There is QuickBMS[1], which covers quite a few game related
| formats.
|
| [1] http://aluigi.altervista.org/quickbms.htm
| erichdongubler wrote:
| It seems strange, because it's not reality! Forensic tools like
| FTK and Autopsy have had a plug-in framework for these forever,
| speaking as a former contributor to the former. There's also
| Kaitai Struct.
|
| I'm sure other communities have popped up that I haven't heard
| of, too. There's lots of interest in unifying forensic parsing
| under open work.
| marlin wrote:
| I'm working on something that is an open template format for
| binary file formats. It is usable today as a universal file
| extractor, with some bugs and limitations.
|
| Check it out at https://github.com/martinlindhe/feng
| andoma wrote:
| Being a long time personal friend with the author I can assure
| you the more obscure the better :-) His interest in esoteric
| things and solutions is "well documented" if you browse around
| his github repos.
|
| Some examples:
|
| https://github.com/wader/jqjq https://github.com/wader/catgolf
| sillysaurusx wrote:
| Wonderful, thank you so much! Such a cool project.
| Akronymus wrote:
| I am RE-ing some binary file formats for games. Should those
| be contributed as well?
| jchw wrote:
| Kaitai Struct is the better way to go for an ecosystem
| solution, but the tooling could certainly use improvements. In
| addition to 010 Editor, there's also KDE's Okteta. It does not
| have a lot of good OSDs and the OSD format/scripting for
| specifying formats is a little anemic (I'd like to help improve
| it if I can find time...) but it's very serviceable and a
| decent open source alternative to what 010 has. Shameless plug,
| I made a decent Windows EXE/PE OSD for Okteta. (It's even got a
| bit of support for NE16 executables, just for fun.)
|
| This entire genre of tools has been a long time point of
| interest for me. In addition to making a couple OSDs and
| contributing some tiny improvements to Kaitai, I also have my
| own binary schema library for Go, restruct, which, biased or
| not, remains my favorite way to poke at arbitrary formats,
| since it's really easy to sketch stuff out and read and write
| to files quickly. It's basically Go's encoding/binary but with
| struct tags for more advanced things.
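|
| The "declare the layout next to the struct" idea can be sketched
| in Python as a rough analogue. This is not restruct's API or Go's
| encoding/binary: it pairs a dataclass with a `struct` format
| string, using the 14-byte BMP file header as the example, and the
| class and method names are made up for illustration.

```python
import struct
from dataclasses import dataclass
from typing import ClassVar


@dataclass
class BmpFileHeader:
    # Little-endian: 2-byte signature, u32 size, two u16 reserved, u32 offset.
    LAYOUT: ClassVar[struct.Struct] = struct.Struct("<2sIHHI")

    signature: bytes
    file_size: int
    reserved1: int
    reserved2: int
    pixel_offset: int

    @classmethod
    def unpack(cls, data: bytes) -> "BmpFileHeader":
        """Read the header fields from the start of data."""
        return cls(*cls.LAYOUT.unpack_from(data))

    def pack(self) -> bytes:
        """Serialize the fields back into the same 14-byte layout."""
        return self.LAYOUT.pack(
            self.signature,
            self.file_size,
            self.reserved1,
            self.reserved2,
            self.pixel_offset,
        )


raw = b"BM" + (70).to_bytes(4, "little") + bytes(4) + (54).to_bytes(4, "little")
header = BmpFileHeader.unpack(raw)
print(header)
print(header.pack() == raw)
```

| Reading and writing share one layout declaration, which is what
| makes this style quick for sketching out an unknown format.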
| marlin wrote:
| > This entire genre of tools has been a long time point of
| interest for me.
|
| Same for me! It turned into a long journey and I am working
| on a solution that I am very happy about.
|
| Somewhere mid-journey I learned about kaitai struct and lost
| a bit of steam seeing it was similar. But I think my offering
| is superior in a more simple template format with less
| programming required and a nice cli app.
|
| I have yet to announce it publicly, but I've been meaning to for
| so long already.
|
| If you would like to check it out I would be happy!
|
| You can use it to map / view content from a format there is a
| template for. A lot of common formats are already included and
| you can extend it using your own templates.
|
| https://github.com/martinlindhe/feng
| joshspankit wrote:
| As someone who had a great time with Kaitai, may I suggest
| that you write an interface so that fq can be used with any
| format that Kaitai understands (and any that people add in
| future)
| natch wrote:
| In my mind the 'f' here stands for 'f yeah'
| _pmf_ wrote:
| I'd also like to throw https://github.com/WerWolv/ImHex in the
| mix here.
| candiodari wrote:
| LOVE the name.
| nonethewiser wrote:
| Reminds me of this emphatically best middle school
|
| Fuqing #1 Middle School
|
| https://g.co/kgs/HPtt5n
| dotancohen wrote:
| I didn't even think of that until you mentioned it!
| pcthrowaway wrote:
| No, FQ!!
| jacknews wrote:
| the name kind of sucks, why 'f' q? F for ... 'FU' ('teenage
| snigger'). It should be 'bq', binary query, after 'jq' json
| query.
|
| Cool project none-the-less. The comment about 'programmatic
| documentation' of binary formats is very interesting, maybe
| some kind of 'binary description markup' could be part of this?
| brabel wrote:
| I've written a parser of Java class files (which works in any JVM
| language as they all compile to the same bytecode format). It was
| surprisingly easy! Maybe that could be useful to analyse class
| files in jq??!
| pmoriarty wrote:
| Too bad that jq has such a shitty, convoluted syntax.
|
| Could have definitely chosen one of the many alternatives to jq
| that also worked with JSON but did so in a much clearer and more
| elegant way.
| hedora wrote:
| Can you name an alternative to jq with better syntax?
| dmoura wrote:
| I prefer a SQL-like format. It's not as complete but it covers
| most of the day-to-day use cases. Take a look at
| https://github.com/dcmoura/spyql (I am the author). Congrats
| on fq!
| xp84 wrote:
| Commenting to follow because I'm curious what alternatives you
| mean. I thought a lot of people liked jq and I only just
| finally got around to installing it, so if there's a much
| better way I'd like to hear it.
| usr1106 wrote:
| FOSDEM presentation by the author earlier this year:
| https://fosdem.org/2023/schedule/event/bintools_fq/
| pabs3 wrote:
| I love that this includes a section in the README about other
| tools that are similar or related to fq. Every open source
| project should list its competitors.
| SkyPuncher wrote:
| I don't even necessarily see listing alternatives as
| competition. Often alternatives solve overlapping, but
| different problems.
| suavesito wrote:
| Not only that, the README has a section called _Hopes_, the
| second bullet point being
|
| > Inspire people to create similar tools.
| el_oni wrote:
| I wouldn't even say they are necessarily competitors. More like a
| flathead screwdriver to complement your Phillips in most of
| those cases
| tpoacher wrote:
| Agreed. I try to do the same with my stuff. Honest, transparent
| market research / placement improves your offering, rather than
| diminishes it.
| tough wrote:
| Right? Not doing so is either disingenuous or amateurish
___________________________________________________________________
(page generated 2023-06-03 23:00 UTC)