[HN Gopher] Cicoparser: Full game reverse engineering [video]
___________________________________________________________________
Cicoparser: Full game reverse engineering [video]
Author : gabonator
Score : 48 points
Date : 2021-04-20 04:51 UTC (18 hours ago)
(HTM) web link (www.youtube.com)
(TXT) w3m dump (www.youtube.com)
| gitowiec wrote:
| This was great to watch. I wish it could work with Linux. First
| game I would like to recompile is Dune.
| wts42 wrote:
| Excellent pick. Mentat approved.
| gabonator wrote:
| It should work on Linux without any problem. Cicoparser was
| initially developed on Windows and does not use any libraries
| besides std, so you can build it with gcc. The host
| application is based on SDL2, so it should work practically
| anywhere without any extra work.
| Cloudef wrote:
| There's a similar set of tools by notaz[1] that was used to
| statically recompile StarCraft, Diablo, Diablo 2, and Jazz
| Jackrabbit for ARM Linux. You can read more about the
| recompilation here[2].
|
| 1: https://github.com/notaz/ia32rtools
| 2: https://www.giantpockets.com/starcraft-pandora-port-came/
| tibbydudeza wrote:
| It gave me flashbacks to using DOS with Norton Commander :).
| tralarpa wrote:
| I got very excited when I saw the description of the video
| "Conversion of game into C++ with cicoparser and IDA
| disassembler". I thought "neat, a new decompiler".
|
| But then I understood what CicoParser is doing: it translates
| machine instructions into C-statements, i.e. when your binary
| contains an instruction like "mov 123,sp", the output will be a C
| source file with a statement "memory16(_ds,123)=_sp;". On the
| github page, they say it is not a CPU emulator, but I would
| rather say it is a CPU emulator with AOT compilation of the
| binary.
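
A minimal sketch of what a statement like "memory16(_ds,123)=_sp;" could expand to. The names memory, _ds, _sp and memory16 follow the comment above; the Word16 proxy, the buffer size and translated_mov are assumptions made for illustration and are not taken from cicoparser's actual output. Offset wrap-around inside a segment is ignored.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the globals the generated code would use.
// One extra byte so a 16-bit access at the last address stays in bounds.
static std::vector<uint8_t> memory((1u << 20) + 1);
static uint16_t _ds = 0;
static uint16_t _sp = 0;

// Small proxy so memory16(seg, ofs) can appear on either side of '='.
struct Word16 {
    uint8_t* p;
    Word16& operator=(uint16_t v) {            // little-endian store
        p[0] = uint8_t(v & 0xFF);
        p[1] = uint8_t(v >> 8);
        return *this;
    }
    Word16& operator=(const Word16& o) { return *this = uint16_t(o); }
    operator uint16_t() const {                // little-endian load
        return uint16_t(p[0] | (p[1] << 8));
    }
};

// Real-mode address: segment * 16 + offset, wrapped to 1 MiB.
Word16 memory16(uint16_t seg, uint16_t ofs) {
    uint32_t linear = (uint32_t(seg) * 16u + ofs) & 0xFFFFFu;
    return Word16{&memory[linear]};
}

// The translated instruction: store SP at DS:123.
void translated_mov() {
    memory16(_ds, 123) = _sp;
}
```

With _ds = 0x1000 and _sp = 0xBEEF, calling translated_mov() stores the low byte at linear address 0x1007B and the high byte at 0x1007C.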
| gabonator wrote:
| If it were a CPU emulator, it would update all the flags every
| time it performed an ALU operation (I have seen this approach
| in one source-to-source compiler). Actually, there is not much
| you can do: if the instruction stores SP into DS:123, it is
| converted into a simple assignment,
| *((WORD*)&memory[ds*16+123]) = sp. All ALU operations are
| calculated directly using the target instruction set, and the
| flags register is updated only when necessary. The memory is
| not emulated either; the code accesses the memory buffer
| directly (in the video there are just extra range checks, and
| even the *16 operation can be optimized away by replacing
| ds/es with memory pointers). The only thing that is emulated
| is the EGA adapter.
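
To illustrate the difference described above, here is a hedged sketch contrasting an emulator-style subtraction that recomputes flags on every operation with a translator-style version that only materializes the condition actually consumed. All names (Flags, sub16_emulated, sub16_translated, cmp_jb_taken) are hypothetical and only a subset of the x86 flags is modeled.

```cpp
#include <cstdint>

// Subset of the x86 flags, enough for the illustration.
struct Flags {
    bool carry = false;
    bool zero = false;
    bool sign = false;
};

// Emulator style: every 16-bit subtraction recomputes the flags,
// whether or not any later instruction reads them.
uint16_t sub16_emulated(uint16_t a, uint16_t b, Flags& f) {
    uint16_t r = uint16_t(a - b);
    f.carry = a < b;               // borrow out of bit 15
    f.zero = (r == 0);
    f.sign = (r & 0x8000) != 0;
    return r;
}

// Translator style: a "sub ax, bx" whose flags are never read
// becomes plain arithmetic on the host CPU.
uint16_t sub16_translated(uint16_t a, uint16_t b) {
    return uint16_t(a - b);
}

// Translator style: "cmp ax, bx" followed by "jb target" only
// needs the carry condition, which is an unsigned comparison.
bool cmp_jb_taken(uint16_t ax, uint16_t bx) {
    return ax < bx;
}
```

The first function does three extra computations per operation; the other two leave that work out unless a following instruction needs it.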
| habibur wrote:
| Right. From a bird's-eye view it might look more like
| assembly. But look closer and you see it condenses a bunch
| of idiomatic assembly sequences into C code.
|
| And it will improve over time if the developers continue to
| give it effort.
| tralarpa wrote:
| Thanks for the explanation. Very nice project. I guess self-
| modifying code does not work with this technique, does it? (I
| don't know much about DOS games and how common self-modifying
| code was on PCs).
|
| Concerning access to video memory: I saw that you treat it
| "manually" in some cases. I am wondering whether you could
| avoid that by using virtual memory. You could mark the pages
| as invalid and when an instruction tries to access them, you
| catch it and replace the memory[...] access instruction by a
| call to memoryVideoGet. The JIT-Compiler of the Amiga
| emulator uses a similar technique for indirect accesses to
| hardware registers.
| gabonator wrote:
| Good point. In the set of games I was porting (10 in total,
| released up to 1991) I found only one that used this nasty
| technique, and it rewrote just a single byte of code
| (something like turning a nop instruction into a return). So
| only a very simple case so far. Of course, using cicoparser
| doesn't mean that you get working code without any manual
| work. You will always need to fix some issues by hand.
|
| Virtual memory does not solve anything in this case. Writing
| to EGA video RAM means that you want to display some pixels.
| But the write operation goes through some extra logic which
| decides what to do with the byte being written (extra
| rotation, masking...), and by reading the same address you
| are not guaranteed to get the same value back. EGA control
| registers handle this process and you simply need to emulate
| this behaviour somehow.
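
A heavily simplified sketch of why a read from EGA memory need not return what was just written. It models only four bit planes, a map mask, a bit mask, the latches and a read map select, roughly corresponding to write mode 0 and read mode 0 with rotation, set/reset and logical functions left out. The struct and member names are made up for this example and do not come from cicoparser.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

struct EgaSketch {
    static constexpr std::size_t kPlaneSize = 0x10000;
    std::array<std::vector<uint8_t>, 4> planes;  // four bit planes
    std::array<uint8_t, 4> latch{};              // filled on every read
    uint8_t mapMask = 0x0F;  // which planes a write reaches
    uint8_t bitMask = 0xFF;  // which bits come from the CPU byte
    uint8_t readMap = 0;     // which plane a read returns

    EgaSketch() {
        for (auto& p : planes) p.assign(kPlaneSize, uint8_t{0});
    }

    // One CPU write fans out to every enabled plane; bits outside
    // bitMask are taken from the latches instead of the CPU byte.
    void write(uint16_t ofs, uint8_t value) {
        for (int i = 0; i < 4; ++i) {
            if (!(mapMask & (1 << i))) continue;
            planes[i][ofs] =
                uint8_t((value & bitMask) | (latch[i] & ~bitMask));
        }
    }

    // One CPU read loads all four latches but returns one plane.
    uint8_t read(uint16_t ofs) {
        for (int i = 0; i < 4; ++i) latch[i] = planes[i][ofs];
        return latch[readMap & 3];
    }
};
```

For example, with mapMask = 0x01 a write of 0xFF lands only in plane 0, so a read with readMap = 1 at the same offset returns 0x00. This state lives in the adapter's registers, which is why a plain memory buffer is not enough.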
| albertzeyer wrote:
| But what exactly is the difference to a decompiler then?
| tralarpa wrote:
| Probably depends on your definition of a decompiler. For me,
| a decompiler reverses to some extent the operation of a
| compiler. Variables instead of registers, function call
| arguments instead of stack pushes, etc.
|
| Of course, you could also say that a decompiler is any tool
| that produces something from a binary that you can compile
| again. But in that case, I could claim that this here is also
| a decompiled program:
|
|   byte[] programbinary = { put binary of the program here };
|   runEmulator(programbinary);
| teawrecks wrote:
| A compiler has optimization steps. Rather than going
| straight from human readable C to binary, it compiles to an
| IR and then uses some heuristics to create binary that is
| more efficient for the machine to execute.
|
| I feel like you're effectively asking for an optimization
| step. Decompile to an IR, and then use some heuristics to
| get back to C that is more efficient for humans to read.
|
| And if a compiler without an optimizer is still a compiler,
| then a decompiler without an "optimizer" should still be
| called a decompiler.
| CodeArtisan wrote:
| A decompiler translates bytecode into a structured program.
|
| https://en.wikipedia.org/wiki/Structured_programming
___________________________________________________________________
(page generated 2021-04-20 23:01 UTC)