[HN Gopher] Building a Minimal Viable Armv7 Emulator from Scratch
___________________________________________________________________
Building a Minimal Viable Armv7 Emulator from Scratch
Author : xnacly
Score : 63 points
Date : 2025-11-21 13:30 UTC (9 hours ago)
(HTM) web link (xnacly.me)
(TXT) w3m dump (xnacly.me)
| anthk wrote:
| That's reminds me of the minimal MIPS1 interpreter written in
| Perl able to run old statically linked Linux binaries:
|
| https://github.com/pts/pts-mips-emulator
| thesnide wrote:
| that's an excellent approach to ancient binaries!
| dosinga wrote:
| At first sight I read Minimal Viable Army - would be a good read
| for an aspiring evil mastermind
| syntex wrote:
| I see that author decorating webiste for Christmas :)
| costco wrote:
| Nice article and especially so for including the parsing that
| most people just outsource. What's great about using an emulator
| is that you can also do fun things with the syscalls like
| implementing your own "virtual filesystem" instead of just
| translating directly to the x86_64 equivalent syscall:
| https://github.com/gamozolabs/fuzz_with_emus/blob/master/src...
| (not my code but basically something like this)
| bArray wrote:
| I had to disable JS, the site is unreadable otherwise.
| xnacly wrote:
| The snowflakes do vanish once you scroll down even a single
| pixel
| MisterTea wrote:
| They stop spawning but the flakes already on the screen
| continue to fall until they reach the bottom. It's cute but
| very annoying IMO and there should be a user control to
| instantly turn it off.
| xnacly wrote:
| Its winter, it takes a literal second to fade out
| MisterTea wrote:
| They do not fade out in a second and continue to fall for
| 10+ seconds until they fade at the bottom of the page. It
| is also not winter everywhere.
| pm215 wrote:
| For a learning project like this one, this would probably be
| overkill, but my personal suggestion for instruction decoding is
| that it really pays in the long term to use a data driven
| decoder. It's fairly easy to do a handcoded "if bits A..B are
| 0b1000 and..." decoder for the basic integer parts of the
| instruction set, but especially as you get into complexities like
| SIMD and if you need your decoder to be easy to modify to add new
| instructions later this gets very unwieldy.
|
| QEMU switched to a data driven representation with a python
| program to autogenerate the "check bit patterns and extract
| fields" code, and it's one of the better design overhauls we've
| done: we started using it mostly for new code but went back and
| converted some of the old handwritten decoders too. It's much
| easier to add a new instruction when you only need to add a line
| like USADA8 ---- 0111 1000 rd:4 ra:4
| rm:4 0001 rn:4
|
| and add a function trans_USADA8() that gets called with the field
| values, compared to trying to find the right place in a big
| existing set of handcoded switch and if statements to add the
| extra checks for the insn.
| xnacly wrote:
| I think rust macros could shine for this usecase, definitely on
| my TODO list
| ethin wrote:
| Did you or anyone else figure out how to work with the
| Aarch64/ARM32 open-source JSON schemas? I could never figure
| them out and I feel like if I wanted to work with them I'd have
| to manually write Pydantic models for them or something,
| because Quicktype chokes on them. Mainly because the schemas
| are recursive. If there's one thing I like about RISC-V, it's
| that their riscv-opcodes instruction dictionaries are trivial
| to work with (although it's tricky for me to auto-generate
| "this operand is signed, this one isn't" logic since the repo
| currently doesn't convey that information).
| pm215 wrote:
| I haven't looked at them, partly because I'm sceptical about
| how much use they are. In my experience the hard part of
| emulating new instructions is the semantics, the "what does
| this instruction actually do?" bit. The line in the decode
| file that specifies the 1s and 0s and fields is trivial and
| takes no time. Plus, the way the architecture splits things
| up often doesn't match the way that makes the emulation
| simpler: for instance "same operation, but comes in signed
| and unsigned flavours" is often encoded with a sign bit but
| appears in the architecture as two separate instructions, one
| with an S in the name and one with a U. It's often simpler to
| have one decode line which covers both and passes signedness
| as an argument, where something auto-generated from the
| architectual data would give you two distinct functions.
|
| For cases where everything you need is really in the
| datafiles (e.g. a simple disassembler) or where you're
| providing a user facing API that you want to have match the
| architecture documentation closely, the tradeoffs are
| different.
|
| Also for QEMU I tend to value "works the same regardless of
| target architecture" over "we can do a clever thing for this
| one case but all the others will be different".
| daeken wrote:
| So, it _is_ possible to use the machine consumable forms of
| the ISA, but realistically you 'll spend way longer fighting
| it than finding other strategies.
|
| Years ago, I ended up creating my own aarch64 definition,
| which I use to generate disassemblers, interpreters, and
| recompilers (dynamic and static) automatically: https://githu
| b.com/daeken/SharpRetro/blob/main/Aarch64Genera...
|
| It doesn't have perfect support, but it has served as an
| incredibly useful resource. I've since generalized it to work
| for other architectures, and that same repo has definitions
| for MIPS (specifically the PSX CPU), DMG, and the groundwork
| for an x86 core. The goal is to be able to define these once,
| then generate any future targets automatically.
| retrac wrote:
| Getting a bit off topic, but I feel like this task is something
| that ought to have special language support.
|
| It's a kind of serialization/deserialization, or what I think
| Python and some others call "pickling". Same task. Turn these
| raw bit patterns into typed values.
|
| Ada probably comes closest of the major languages to pulling it
| off. It has separation of the abstract/programmer's view of a
| data type and the implementation / low representation of that
| type.
|
| Specify a bunch of records like: for
| Instruction use record Condition at 0 range 31 ..
| 28; ImmFlag at 0 range 27 .. 27; Opcode
| at 0 range 24 .. 21; CondFlag at 0 range 20 .. 20;
| Rn at 0 range 19 .. 16; Rd at 0 range 15 .. 12;
| Operand at 0 range 11 .. 0; end record;
|
| Then aim a pointer at your instructions and read them as
| records/structs.
|
| It works particularly cleanly with a nice RISC encoding like
| ARM. I'm not actually sure if that would work in Ada. The use
| representation syntax might not be generic enough.
| duskwuff wrote:
| That sort of approach works for some very simple instruction
| encodings, but doesn't really handle:
|
| 1) instructions which "bend" the format, like ARM
| instructions such as STMIA or B which combine multiple fields
| to make a larger immediate value or mask.
|
| 2) recognizing instructions which use special values in
| fields (like ARM condition = 1111) to represent a special
| instruction.
|
| 3) instruction encodings with split fields, like the split
| immediate in RISC-V S-type instructions.
|
| 4) instruction encodings which have too many instruction-
| specific quirks to fit into any reasonable schema, like
| 68000.
| pm215 wrote:
| If you think Arm is a "nice RISC encoding" then I think
| you've mostly been looking at the older integer bits of it
| :-) As you get into FP and SIMD there are just a lot more
| useful operations that need to fit into the strictly limited
| encoding space, and new features that need to be tucked into
| previously unused corners of the space, and it all gets
| noticeably less regular (e.g. "these two bits encode the
| operand size which is 0b00/0b01/0b10 for 8/16/32 bits, but 64
| bit operands aren't supported and 0b11 means it's part of an
| entirely different set of instructions").
| general1465 wrote:
| Looks nice, but the massive performance hit will be from constant
| parsing of arguments of opcodes. If you are emulating relatively
| small binaries (i.e. embedded stuff with few megabytes of
| program), it is better to approach each instruction as an object,
| parse it once and then save it into RAM into some tree structure
| so you can quickly find given opcode by its binary representation
| and then see if it has been parsed or not yet.
| gorgoiler wrote:
| I built an emulator in Python with my students, aged 15 and 16.
| All it did was have a little chunk of memory, PC, some registers,
| arithmetic, and branching, but it was really fun _and a
| surprisingly tractable_ project to work on. Once you have a
| working "machine" with an assembler you can then start to
| automate function call and return which is of course a large part
| of what a compiler does for you.
|
| With hindsight I would have loved to do a proper compiler but
| that's undergrad level really. I really recommend it as a toy
| post-food-coma project for when you're stuck with the family
| either next week or at the end of December :)
___________________________________________________________________
(page generated 2025-11-21 23:01 UTC)