[HN Gopher] Reverse engineering programs with unknown instructio...
___________________________________________________________________
Reverse engineering programs with unknown instruction sets (2012)
[pdf]
Author : lauriewired
Score : 116 points
Date : 2023-01-27 11:45 UTC (11 hours ago)
(HTM) web link (www.recon.cx)
(TXT) w3m dump (www.recon.cx)
| olivierduval wrote:
| Amazing! Looks a lot like breaking a cipher, with the added
| specifics of processor knowledge!
| stuckkeys wrote:
| Is the site decompilation.info down? Cannot access it.
| crecker wrote:
| It seems so
| kijiki wrote:
| Also enjoyable, reverse engineering the Transmeta Crusoe's
| internal VLIW instruction set:
| https://www.realworldtech.com/crusoe-intro/
|
| I suspect the Anonymous author might have gotten a tip or two
| from a friendly Transmeta hardware or software engineer.
| egberts1 wrote:
| I once wrote a detector of 38 known machine languages.
|
| Akin to an expansion of the UNIX file command.
|
| It would list known machine code(s) encountered at least within
| 4 bytes (in probability order).
|
| Good times, good times.
|
| (oh, sadly, not open source, but proprietary; I still do wish I
| could release this gem.)
| unwind wrote:
| In what context was that used, if you can elaborate?
| egberts1 wrote:
| Like the UNIX file command, it lists out what the file
| content probably is/are.
|
| It can also break down the file in question by regions and
| group such data content into most probable types ... for each
| region.
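|
| A rough Python sketch of the region idea, not the proprietary tool
| described above: it only separates zero fill, text, and high-entropy
| data, and leaves everything else (including possible machine code)
| as "other". The 256-byte region size, the 0.9 and 7.0 thresholds,
| and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import List, Tuple


def byte_entropy(chunk: bytes) -> float:
    """Shannon entropy of the chunk in bits per byte (0.0 for an empty chunk)."""
    total = len(chunk)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(chunk).values())


def classify_chunk(chunk: bytes) -> str:
    """Assign one coarse label to a chunk; thresholds are arbitrary guesses."""
    if not any(chunk):
        return "zero-fill"
    # Printable ASCII plus tab, newline, and carriage return.
    printable = sum(1 for b in chunk if 32 <= b < 127 or b in (9, 10, 13))
    if printable / len(chunk) > 0.9:
        return "text"
    if byte_entropy(chunk) > 7.0:
        return "high-entropy"
    return "other"


def classify_regions(data: bytes, region_size: int = 256) -> List[Tuple[int, int, str]]:
    """Return (start, end, label) spans, merging adjacent regions with the same label."""
    regions: List[Tuple[int, int, str]] = []
    for start in range(0, len(data), region_size):
        chunk = data[start:start + region_size]
        label = classify_chunk(chunk)
        end = start + len(chunk)
        if regions and regions[-1][2] == label:
            # Extend the previous span instead of starting a new one.
            regions[-1] = (regions[-1][0], end, label)
        else:
            regions.append((start, end, label))
    return regions
```

| A real detector would replace the "other" bucket with per-architecture
| scoring, which this sketch does not attempt.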
|
| As to its final application, that is not in my contract/task
| description.
| unwind wrote:
| Okay, thanks.
|
| Yeah I'm very familiar with 'file', I just wondered in what
| context one needs the ability to identify 38 machine
| languages, i.e. why does an organization deal with files
| containing unknown machine code, and have the need to
| identify them?
|
| Sounds like maybe reverse engineering/security
| "research"-oriented work, perhaps.
| egberts1 wrote:
| I was basically leveraging my eidetic memory of opcodes
| and operands and their bitfields.
|
| It all got started with writing pure assembly for
| Motorola 6502 (for arcades) and PDP-11 then eventually
| ended with ARM/RISC/MIPS. Most esoteric one is the
| Transmeta VLIW (TMS3200-02).
|
| and someone asked for one (internally).
| tom_ wrote:
| Previously on HN, possibly not unrelated:
| https://news.ycombinator.com/item?id=25115916
| amelius wrote:
| But what if the CPU assumes the instruction stream is compressed?
| gus_massa wrote:
| In slide 9, they show the frequency of each 16-bit value.
| In compressed code, the frequency of each value should be
| almost equal.
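|
| A minimal Python sketch of that frequency check, assuming the data
| is read as little-endian 16-bit words (the byte order and the
| function name are assumptions, not taken from the slides). It
| reports the Shannon entropy of the word histogram, which approaches
| 16 bits per word as the frequencies become equal.

```python
import math
from collections import Counter


def word_entropy(data: bytes) -> float:
    """Shannon entropy, in bits per 16-bit word, of the words in data."""
    # Drop a trailing odd byte so that every word is complete.
    usable = len(data) - (len(data) % 2)
    words = [int.from_bytes(data[i:i + 2], "little") for i in range(0, usable, 2)]
    if not words:
        return 0.0
    total = len(words)
    counts = Counter(words)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())
```

| A value near the maximum hints at compressed or encrypted data,
| while uncompressed machine code typically scores lower because a
| few instruction words dominate. Small inputs cannot reach 16 bits,
| so the comparison only makes sense on reasonably large files.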
|
| 10 or 20 years ago, when reverse engineering any unknown file,
| it was good to assume it was not compressed, and you could get
| some insight by looking at it in a hex editor and hoping for the best.
| Now many are compressed, so a good first step is to change the
| extension to .zip and try WinRar (or look for a header if you
| are not lazy).
|
| I assume that with compressed code you can use the same
| strategy. Assume it's using a well-known compression
| algorithm, and cross your fingers.
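|
| The "look for a header" step can be sketched in Python as a check
| of the leading magic bytes against a few well-known formats. The
| table is deliberately short and the function name is hypothetical;
| firmware images often embed the compressed stream at some offset,
| which this sketch does not search for.

```python
from typing import Optional

# Leading magic bytes of a few common archive and compression formats.
MAGIC_SIGNATURES = [
    (b"PK\x03\x04", "zip"),
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd\x37\x7a\x58\x5a\x00", "xz"),
    (b"\x37\x7a\xbc\xaf\x27\x1c", "7z"),
    (b"Rar!\x1a\x07", "rar"),
]


def sniff_compression(data: bytes) -> Optional[str]:
    """Return the format whose magic bytes start the data, or None if none match."""
    for magic, name in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return name
    return None
```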
| anthk wrote:
| 7zip, unar, innoextract...
|
| And, of course, upx.
| msm_ wrote:
| Shout out to the CPUAdventure challenge from DragonCTF 2019, which
| was basically this. If you like the slides, you should find this
| writeup entertaining:
| https://www.robertxiao.ca/hacking/dsctf-2019-cpu-adventure-u...
| thrdbndndn wrote:
| Thanks, this is much easier to understand than slides (without a
| presenter).
| skissane wrote:
| I wonder what the mystery instruction set in the slides actually
| is? (Assuming it is a real instruction set and not just something
| made up to demo the idea.)
| gwern wrote:
| It's a reverse-engineering conference presentation by 2 Russian
| authors who highlight that they aren't providing any details
| about the context despite the obvious extreme relevance, and
| where their solution does not handle any obfuscation at all. So
| they are probably not decompiling APT malware running in nested
| VMs, but I'm going to guess reverse-engineering old highly-
| secret Russian military hardware where the only docs are high-
| level ones about the usage and repair, not what the chips are
| _doing_, and where the contractor wants to bugfix or develop
| new versions but needs to understand all the inner logic and
| what empirical ad hoc corrections it might be incorporating
| through the wisdom of long-dead Russian mega-brain engineers.
| tempodox wrote:
| Stuff like that is definitely fun. In the 1990s I bought a Sharp
| PC-E500S pocket computer and hacked the CPU's instruction set.
| With no internet and no documentation about the processor, I
| invented my own assembler syntax for the instructions. Assembler,
| disassembler, hex monitor, (written in Basic) are all still
| working to this day.
| lloydatkinson wrote:
| You should post that online. I'm sure people, including me,
| would love to read it.
| tempodox wrote:
| All my notes at the time were made with pencil on paper. Even
| if I could find them, I'm not sure they would still be
| readable. The Basic programs could only be copied by re-
| typing them manually on a contemporary computer. Presenting
| this pre-internet stuff on a website would just be too much
| work, sorry.
| hasmanean wrote:
| That just makes it a meta challenge...for some unknown
| engineer who wants to reverse-engineer an engineer's
| program that reverse-engineered a program with an unknown
| instruction set.
| codetrotter wrote:
| I understand and sympathise with that.
|
| If you do find the documents though, please consider just
| scanning them and uploading them to Internet Archive and
| posting the links to HN. That way someone else in the
| future can find it and decide if they want to do the manual
| re-typing etc themselves :)
| fallat wrote:
| Please _please_ write about the whole process :) I'd love to
| read it!
| intelVISA wrote:
| Lovely, you should document your stories; that sounds
| impressive!
___________________________________________________________________
(page generated 2023-01-27 23:01 UTC)