https://ratfactor.com/cards/asm-data-in-text

This is a card in Dave's Virtual Box of Cards.

i386 Assembly Language trick for storing data in .text

Created: 2023-11-08

Don't want a big old bloated multi-kilobyte executable with a data segment, but you do need to store some data?

I "discovered" this while working on Meow5. One of my biggest challenges has been storing strings in a position-independent way. I've considered a Global Offset Table (GOT), but it seems like way more machinery than I need. And though I went well out of my way to figure out how to make multi-segment ELF executables, I just couldn't bear to export a "huge" 2 KB+ padded file with the proper alignment.

So I was determined to store strings in a Forth-style manner - embedding them in the executable code and jumping over them.

Doing this was going to take a lot of painful trial-and-error to get the exact right opcodes compiled in place: push the address of the string on the stack, then a jmp instruction to hop over the string (gotta know the length first, then come back and write the instruction).

The thing that makes this extra hard is that 32-bit i386 doesn't have an instruction for getting the value of the instruction pointer (EIP) directly. (x86_64 later made this easy with RIP-relative addressing.)

But I realized something: there is an instruction that pushes the following address on the stack and then jumps to a location. It's call.

So I made a little stand-alone NASM assembly "Hello world" test as a proof of concept. I realized I can even get the length of the string by subtracting the return address (where the string starts) from the address of the label I call'd (which sits right after the string):

    global _start
    _start:
        ; call pushes the next address on the stack
        ; so we could "return" there
        call print

        ; this gets jumped over, but we've got the address!
        db `Hello World!\n`

    print:
        ; Print with Linux SYS_WRITE
        ; pop the address from the stack to ecx
        pop ecx          ; string address
        mov edx, print   ; label address
        sub edx, ecx     ; length of string!
        mov ebx, 1       ; STDOUT
        mov eax, 4       ; SYS_WRITE
        int 80h          ; kernel syscall

        ; Exit with Linux SYS_EXIT
        mov ebx, 0       ; return status
        mov eax, 1       ; SYS_EXIT
        int 80h          ; kernel syscall

Let's build it:

    $ nasm -f elf32 -o hello.o hello.asm
    $ ld -m elf_i386 -n hello.o -o hello

Without a data segment, this executable is truly tiny, just 532 bytes.

    $ ls -l hello
    -rwxr-xr-x 1 dave users 532 Nov 8 20:58 hello

And does it work?

    $ ./hello
    Hello World!

Of course. :-)

I'm sure this trick is well-known and has a name and everything, but it's new to me. I was really pleased with how nice and neat it turned out and it's definitely going straight into Meow5, which should allow me to finally finish that project.

This page was last generated 2023-11-08 21:19:22 -0500
All content (c) Copyright Dave Gauer
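A possible variation on the call trick described in this card, offered only as a sketch and not taken from Meow5: the call, the string, and the label after it can be wrapped in a NASM macro so that several strings can be embedded inline. The macro name print_inline and the file name hello2.asm are made up for illustration. This version also lets the assembler compute the length as a difference of two labels, so no absolute address (like `mov edx, print` in the listing above) ends up in the code. That may matter if the machine code is later copied to a different address.

```nasm
; hello2.asm (hypothetical file name) - sketch of the call trick as a macro
; Assumes 32-bit Linux and the same build commands as the original listing.

%macro print_inline 1           ; print_inline is a made-up name
        call    %%after         ; pushes the address of the string bytes
%%str:  db      %1              ; the string lives right here in .text
%%after:
        pop     ecx             ; ecx = address of the string
        mov     edx, %%after - %%str   ; length, computed at assembly time
        mov     ebx, 1          ; STDOUT
        mov     eax, 4          ; SYS_WRITE
        int     80h             ; kernel syscall
%endmacro

global _start
_start:
        print_inline `Hello `
        print_inline `World!\n`

        ; Exit with Linux SYS_EXIT
        mov     ebx, 0          ; return status
        mov     eax, 1          ; SYS_EXIT
        int     80h             ; kernel syscall
```

Built the same way (nasm -f elf32, then ld -m elf_i386 -n), this should print the same "Hello World!" line, written in two pieces. The %%str and %%after labels are macro-local, so each use of the macro gets its own pair.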