[HN Gopher] Reversing LZ91 from Commander Keen
       ___________________________________________________________________
        
       Reversing LZ91 from Commander Keen
        
       Author : samrussellbg
       Score  : 58 points
       Date   : 2021-08-17 15:26 UTC (7 hours ago)
        
 (HTM) web link (www.lodsb.com)
 (TXT) w3m dump (www.lodsb.com)
        
       | albertzeyer wrote:
       | Note that Commander Keen itself was also reverse engineered. The
       | most active project is Commander Genius:
       | https://clonekeenplus.sourceforge.io/
       | https://github.com/gerstrong/Commander-Genius (You will find some
       | code by me in this. :))
        
       | pdw wrote:
       | The article doesn't make this explicit, but this game is just
       | using LZEXE. This was an executable file compressor that was very
       | widely used in the early 90s. An early project by Fabrice
       | Bellard!
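
       For readers who want to check a file themselves, here is a minimal
       sketch of spotting an LZEXE-packed executable. It assumes the
       commonly described layout in which the packer leaves a 4-byte tag
       ("LZ09" for version 0.90, "LZ91" for 0.91) at offset 0x1C of the
       MZ header. The function name and the table are made up for
       illustration; neither comes from the article.

```python
import struct
from typing import Optional

# Assumed tags: LZEXE 0.90 writes "LZ09", LZEXE 0.91 writes "LZ91".
LZEXE_SIGNATURES = {b"LZ09": "0.90", b"LZ91": "0.91"}


def detect_lzexe(header: bytes) -> Optional[str]:
    """Return the assumed LZEXE version string, or None if no tag is found."""
    if len(header) < 0x20:
        return None  # too short to hold the bytes we want to inspect
    (magic,) = struct.unpack_from("<H", header, 0)
    if magic != 0x5A4D:  # the bytes "MZ" read as a little-endian word
        return None
    # The tag is assumed to sit at byte offset 0x1C of the header.
    return LZEXE_SIGNATURES.get(header[0x1C:0x20])
```

       Usage would be something like detect_lzexe(open(path, "rb").read(0x20)),
       which only says that the tag is present, not that the payload is intact.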
        
         | TacticalCoder wrote:
          | Oh my goodness! I spent countless hours using LZEXE back in the
          | day and never made the link. I never realized it was made by
         | Fabrice Bellard!
         | 
         | From his homepage:
         | 
         | > "I wrote LZEXE in 1989 and 1990 when I was 17."
         | 
         | Incredible.
        
         | _kdave wrote:
         | Also tools like CUP386 did that for free, but anyway
         | interesting read.
        
       | indentit wrote:
       | Forgive me, as I've never disassembled anything myself(!), but
       | would it not be helpful to be able to disassemble an executable
       | into pseudo-code or something (I guess ideally something a bit
       | higher level but re-compilable) alongside the assembly language?
       | It seems to me that it could be much easier to understand what is
       | happening that way, no?
        
         | mips_avatar wrote:
         | Some delta compressors like Google Courgette actually do this.
        
         | AnIdiotOnTheNet wrote:
         | You'd think so, but it turns out that reversing compiled code
         | in an automated fashion doesn't usually produce very readable
         | results:
         | 
         | https://derevenets.com/examples.html
        
           | davikrr wrote:
           | Hex-Rays begs to differ.
        
           | mywittyname wrote:
           | Sometimes you get lucky and debug information is left in the
           | binary.
        
         | mschuster91 wrote:
          | It's not possible in most cases - unless you have the exact
          | same version of the compiler that was used _and_ you can figure
          | out the build settings that were used (especially
          | optimizations, but also stuff like include order), you can't
          | recreate the assembler code from pseudo / C code.
         | 
          | Modernizations are especially tricky. Modern compilers can do
          | all sorts of weird magic, sometimes combining two or more lines
          | of code into one instruction. Old school compilers don't
          | optimize much, which is part of why performance-critical parts
          | of game engines were written in Assembler for a long time.
         | 
         | Not to mention that some stuff you can do in Assembler has no
         | equivalent in higher-level code (e.g. dealing with raw stack
         | frames), and even Assembler to byte code is nowhere near 1:1
         | reversible.
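
       One concrete case of the "not 1:1" point: on x86 a register-to-
       register MOV has two valid encodings, so the bytes an assembler
       emits cannot always be recovered from the disassembly text. The
       sketch below is a toy decoder for just those two forms, assuming
       32-bit operand size; the function name is hypothetical.

```python
# Register numbers as used in the ModRM byte (32-bit operand size assumed).
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_reg_reg(code: bytes) -> str:
    """Decode a 2-byte register-to-register MOV (opcodes 0x89 and 0x8B only)."""
    if len(code) != 2:
        raise ValueError("expected exactly two bytes")
    opcode, modrm = code[0], code[1]
    if modrm >> 6 != 0b11:
        raise ValueError("only register-direct ModRM forms are handled")
    reg = (modrm >> 3) & 7
    rm = modrm & 7
    if opcode == 0x89:    # MOV r/m32, r32: the rm field is the destination
        dst, src = rm, reg
    elif opcode == 0x8B:  # MOV r32, r/m32: the reg field is the destination
        dst, src = reg, rm
    else:
        raise ValueError("not a MOV form handled by this sketch")
    return f"mov {REGS32[dst]}, {REGS32[src]}"
```

       Both decode_mov_reg_reg(bytes([0x89, 0xD8])) and
       decode_mov_reg_reg(bytes([0x8B, 0xC3])) come out as the same
       instruction text, so reassembling that text may not reproduce the
       original bytes.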
        
           | bugfix wrote:
           | You might not get the exact same code, but it is certainly
           | possible to generate C/pseudo-code from the binary.
           | 
           | IDA Pro and Ghidra can identify functions and generate the
           | equivalent C code. I know that this is not the original code,
           | but it does help a bit when you are trying to get an idea of
            | what a large function is doing.
        
             | kaoD wrote:
             | You're both right.
             | 
             | I've used Ghidra to reverse-engineer a game's serialization
             | format[0] and, even though the C-ish result was marginally
             | better than manually tracking registers across the
             | disassembly, it was far from understandable.
             | 
             | A great deal of the work was cleaning up the resulting C
             | into something that a human would've written instead of the
             | garbage ASM-with-C-syntax that Ghidra produced.
             | 
             | That is nowhere near what OP was suggesting (although
             | useful nonetheless).
             | 
             | [0] https://github.com/alvaro-cuesta/townsclipper
        
               | mschuster91 wrote:
               | I'm actually reverse-engineering a game myself...
               | interestingly, for me Ghidra produces very good results,
               | way better than IDA did ten years ago. On the other hand
               | I may be lucky simply because 1996 Borland C++ is a
               | pretty dumb, unoptimizing compiler and there is
               | absolutely no copy protection or whatever present in the
                | game, not even dead code elimination.
               | 
                | The only thing Ghidra lacks any knowledge of is how to
                | deal with the FS register that is used for SEH on
                | win32... it just marks it as in_FS_offset with no way to
                | tell it that it can replace FS:[0xXX] with appropriate
                | TIB access macros.
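
       As a stopgap for the FS issue, the listing can be post-processed
       as text. The sketch below rewrites FS:[offset] operands using a few
       well-known 32-bit Windows TIB/TEB field offsets. It is not a Ghidra
       feature or API: the function name and the "TIB->" notation are
       invented for illustration, and only a handful of offsets are
       covered.

```python
import re

# A few 32-bit Windows TIB/TEB offsets (illustrative, not exhaustive).
TIB_FIELDS_X86 = {
    0x00: "ExceptionList",            # head of the SEH handler chain
    0x04: "StackBase",
    0x08: "StackLimit",
    0x18: "Self",                     # linear address of the TIB itself
    0x30: "ProcessEnvironmentBlock",  # TEB field just past NT_TIB
}

_FS_ACCESS = re.compile(r"FS:\[(0x[0-9A-Fa-f]+|\d+)\]")


def annotate_fs_accesses(text: str) -> str:
    """Replace known FS:[offset] operands with a hypothetical TIB->Field name."""
    def repl(match: "re.Match[str]") -> str:
        raw = match.group(1)
        is_hex = raw.lower().startswith("0x")
        offset = int(raw, 16) if is_hex else int(raw, 10)
        name = TIB_FIELDS_X86.get(offset)
        # Unknown offsets are left exactly as they were.
        return f"TIB->{name}" if name else match.group(0)

    return _FS_ACCESS.sub(repl, text)
```

       This only renames operands in the text; it does not teach the
       decompiler anything about the structure behind them.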
        
       ___________________________________________________________________
       (page generated 2021-08-17 23:01 UTC)