[HN Gopher] Show HN: Bin-graph: Visualize binary files
       ___________________________________________________________________
        
       Show HN: Bin-graph: Visualize binary files
        
       This program provides a simple way of visualizing the different
       regions of a binary file. Written in C, depends only on libpng.
       Currently (commit 1dd42e3) it is able to generate PNG images that
       represent various aspects of the binary:

       - Grayscale: Byte values, 00..FF.
       - Ascii: Printability of each byte.
       - Entropy: Of a "block", changed with --block-size.
       - Histogram: Bar graph of the byte frequencies.
       - Bigrams: Each point is determined by a pair of bytes.
       - Dotplot: Measure self-similarity. Image width/height is N^2.

       In the future, I plan on adding an SDL version that allows the
       user to view a section of the file interactively (sections are
       currently supported with --offset-start and --offset-end).

       More information on the README.
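
To make the "Dotplot" mode above more concrete, here is a minimal sketch of the usual dotplot construction: cell (x, y) is set when the byte at offset x equals the byte at offset y, so repeated regions show up as diagonal lines. This is an assumption about how such a mode is typically built, not code taken from bin-graph; `dotplot_build` is a hypothetical name. For N input bytes the sketch yields an N-by-N grid, that is, N^2 cells in total.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Build an n*n dotplot in row-major order: cell (x, y) is 1 when
 * data[x] == data[y], and 0 otherwise.
 *
 * Returns a malloc'd buffer of n*n bytes that the caller must free,
 * or NULL if n is 0, n*n would overflow, or allocation fails.
 *
 * Hypothetical helper for illustration; not bin-graph's actual code.
 */
uint8_t *dotplot_build(const uint8_t *data, size_t n)
{
    assert(data != NULL || n == 0);

    if (n == 0 || n > SIZE_MAX / n)
        return NULL;

    uint8_t *plot = malloc(n * n);
    if (plot == NULL)
        return NULL;

    for (size_t y = 0; y < n; y++)
        for (size_t x = 0; x < n; x++)
            plot[y * n + x] = (data[x] == data[y]);

    return plot;
}
```

Since memory grows quadratically with the input, a sketch like this is only practical on a small slice of a file, which is presumably where options such as --offset-start and --offset-end come in.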
        
       Author : 8dcc
       Score  : 41 points
       Date   : 2024-09-01 14:30 UTC (1 day ago)
        
        
       | Terr_ wrote:
       | From the perspective of marketing/spreading this thing--even to
       | Engineers--I think some pictures of example output would help
       | quite a lot.
        
       | atoav wrote:
       | So you made a visualization tool and don't show an example image
       | in the readme?
       | 
       | I am not sure if I should be shocked or impressed, but consider
       | the following question: Would you download a random program off
       | the internet because of the _promise_ of it creating a useful
       | visualization? You probably would if the visualization looked
       | useful to you. But as you can't see it without downloading... You
       | get my point.
        
         | floodle wrote:
         | The number of times I see this with projects on HN is crazy.
         | The _one_ thing I want to know is how the output looks.
        
           | mojomark wrote:
           | Maybe the OP fixed it after seeing these comments, but I see
           | output examples now in both the 'examples' folder as well as
           | the readme.
        
             | 8dcc wrote:
             | I added them after reading these replies, yes. I usually
             | don't add screenshots because I don't like adding images to
             | the git repository (people don't really want to clone
             | that), and I don't want to use an external CDN for
             | uploading my images.
             | 
             | In this project in particular, I understand that it is
             | important.
        
               | Terr_ wrote:
               | Not sure if there are any restrictions keeping you from
               | linking directly from the project readme, but perhaps a
               | separate repo could host the images: https://pages.github.com/
        
               | 8dcc wrote:
               | Some months ago, before a GitHub update, you could just
               | paste an image while editing a Markdown file and it would
               | be uploaded to their CDN. You could use that link even in
               | other files (e.g. Org files in my case). They changed
               | their upload system and this is not the case anymore.
        
       | ramon156 wrote:
       | Some examples would be nice. I'm not familiar with binary
       | visualization, so I'm curious why and when I would personally use
       | this.
        
         | sim7c00 wrote:
         | these days, when using the right vis techniques you can easily
         | train something like resnet to find malware. some is even
         | plainly obvious to the eye. some packers and encryption are easy
         | to spot, but a neural network can identify more subtle patterns.
         | for a reverse engineer or malware analyst, seeing which parts of
         | a file contain 'executable code' is also possible, for example
         | if shellcode is embedded in a different file type or such.
         | check the chris domas video linked on the github. it's pretty
         | epic :D
        
       | denysvitali wrote:
       | No screenshots or demo - how do I know if this is better than the
       | already awesome web-first binvis.io?
        
         | 8dcc wrote:
         | My project was inspired by Cortesi's blog posts, among other
         | things. I thought binary visualization could be useful for
         | reverse engineering and fun to program, so I built it. I wanted
         | to share it in case someone was interested in the code for
         | generating the image, since I made it as "extensible" as I
         | could, but I am sure there are a lot of improvements to be
         | made.
         | 
         | That website is more interactive, shows a hex dump of the
         | binary and doesn't require you to download/compile anything.
         | It's probably more practical for most users, but my project has
         | some other modes that might be helpful for recognizing patterns
         | in different file formats (look at the talks linked in the
         | README for more information on what I mean). Also, as far as I
         | know the source code for binvis.io is not public.
         | 
         | P.S. I added a link to binvis.io to the README as well.
        
       | 8dcc wrote:
       | I added screenshots to the README.
        
         | ramon156 wrote:
         | Kudos!
        
       | hoosieree wrote:
       | I love these kinds of tools! For part of my PhD research I made a
       | bunch of digraph heatmaps of (differently-obfuscated variations
       | of) stdlib binary files (raw byte sequences and asm mnemonics
       | shown side-by-side):
       | 
       | https://alexshroyer.com/misc/digraphs.mp4
       | 
       | There are often bright spots in these kinds of visuals that you
       | end up seeing over and over again (e.g. clusters of ASCII).
        
         | 8dcc wrote:
         | > There are often bright spots in these kinds of visuals that
         | > you end up seeing over and over again (e.g. clusters of ASCII).
         | 
         | Indeed, this is especially true in the "bigrams" mode, where
         | each point (X,Y) is set if the bytes X and Y (00..FF) appear in
         | that order in the input. If you look at the bigrams example in
         | the README, you can see that there is a bright zone where the
         | lowercase ASCII characters are, since that graph is plotting
         | the .rodata section of the binary (using the
         | bin-graph-section.sh script). These patterns appear with other
         | kinds of data, not just text (e.g. x86 instructions).
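
The bigram rule described above can be sketched roughly as follows. This is a minimal illustration, not bin-graph's actual code: it assumes "appear in that order" means two immediately adjacent bytes, and the names `bigram_mark` and `BIGRAM_DIM` are hypothetical. Each consecutive pair (X, Y) marks one cell of a 256x256 grid, with X as the column and Y as the row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BIGRAM_DIM 256 /* one row/column per possible byte value 00..FF */

/*
 * Clear the grid, then set grid[Y][X] to 1 for every position where
 * byte X is immediately followed by byte Y in `data`.
 *
 * Hypothetical helper for illustration; not bin-graph's actual code.
 */
void bigram_mark(const uint8_t *data, size_t n,
                 uint8_t grid[BIGRAM_DIM][BIGRAM_DIM])
{
    assert(data != NULL || n == 0);

    memset(grid, 0, (size_t)BIGRAM_DIM * BIGRAM_DIM);

    /* i + 1 < n also handles n == 0 and n == 1 (no pairs at all). */
    for (size_t i = 0; i + 1 < n; i++)
        grid[data[i + 1]][data[i]] = 1;
}
```

With text-heavy input, most marked cells fall in the square spanned by the lowercase ASCII range on both axes, which would match the bright zone mentioned for the .rodata example.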
        
       ___________________________________________________________________
       (page generated 2024-09-02 23:01 UTC)