-----------------Epoch----------------- A 4am crack 2016-10-10 -------------------. updated 2020-06-24 |___________________ Name: Epoch Genre: arcade Year: 1981 Authors: Larry Miller Publisher: Sirius Software Media: single-sided 5.25-inch floppy OS: custom Previous cracks: several uncredited file cracks Similar cracks: #516 Outpost #315 Beer Run ~ Chapter 0 In Which Various Automated Tools Fail In Interesting Ways COPYA immediate disk read error Locksmith Fast Disk Backup unable to read any track EDD 4 bit copy (no sync, no count) hangs during boot Copy ][+ nibble editor track 0 has some 4-4 encoded data other tracks are unreadable Disk Fixer nope (can't read 4-4 encoded tracks) Why didn't COPYA work? not a 16-sector disk Why didn't Locksmith FDB work? ditto Why didn't my EDD copy work? I don't know. Could be a nibble check during boot. Could be that the data is loaded from half tracks. Could be both, or neither. Next steps: 1. Trace the boot 2. Capture the game in memory 3. Write it out to a standard disk with some kind of fastloader ~ Chapter 1 In Which We Find A Very Unfriendly "Do Not Disturb" Sign [S6,D1=original disk] [S5,D1=my work disk] ]PR#5 CAPTURING BOOT0 ...reboots slot 6... ...reboots slot 5... SAVING BOOT0 ]BLOAD BOOT0,A$800 ]CALL -151 *801L ; display hi-res graphics page ; (uninitialized) 0801- 8D 50 C0 STA $C050 0804- 8D 52 C0 STA $C052 0807- 8D 54 C0 STA $C054 080A- 8D 57 C0 STA $C057 ; get slot (x16) 080D- A6 2B LDX $2B ; a counter? or an address? 
080F- A9 04 LDA #$04 0811- 85 11 STA $11 0813- A0 00 LDY #$00 0815- 84 10 STY $10 ; look for custom prologue ("DD AD DA") 0817- BD 8C C0 LDA $C08C,X 081A- 10 FB BPL $0817 081C- C9 DD CMP #$DD 081E- D0 F7 BNE $0817 0820- BD 8C C0 LDA $C08C,X 0823- 10 FB BPL $0820 0825- C9 AD CMP #$AD 0827- D0 F3 BNE $081C 0829- BD 8C C0 LDA $C08C,X 082C- 10 FB BPL $0829 082E- C9 DA CMP #$DA 0830- D0 EA BNE $081C ; read 4-4 encoded data immediately ; (no address field, no sector numbers) 0832- BD 8C C0 LDA $C08C,X 0835- 10 FB BPL $0832 0837- 38 SEC 0838- 2A ROL 0839- 85 0E STA $0E 083B- BD 8C C0 LDA $C08C,X 083E- 10 FB BPL $083B 0840- 25 0E AND $0E ; ($10) is an address, initialized at ; $080F as $0400 (yes, the text page) 0842- 91 10 STA ($10),Y 0844- C8 INY 0845- D0 EB BNE $0832 0847- E6 11 INC $11 0849- A5 11 LDA $11 ; loop until we hit page 8 (i.e. we're ; filling $0400..$07FF) 084B- C9 08 CMP #$08 084D- D0 E3 BNE $0832 084F- BD 80 C0 LDA $C080,X ; clear $0900..$BFFF in main memory 0852- A9 09 LDA #$09 0854- 85 01 STA $01 0856- A9 00 LDA #$00 0858- 85 00 STA $00 085A- A8 TAY 085B- A2 B7 LDX #$B7 085D- 91 00 STA ($00),Y 085F- C8 INY 0860- D0 FB BNE $085D 0862- E6 01 INC $01 0864- CA DEX 0865- D0 F6 BNE $085D ; calculate a checksum of page 8 (this ; code right here) 0867- 8A TXA 0868- E8 INX 0869- F0 06 BEQ $0871 086B- 5D 00 08 EOR $0800,X 086E- 4C 68 08 JMP $0868 ; use the stack pointer (!) to keep a ; copy of that checksum 0871- AA TAX 0872- 9A TXS ; calculate another checksum of zero ; page 0873- A2 00 LDX #$00 0875- 8A TXA 0876- 55 00 EOR $00,X 0878- E8 INX 0879- D0 FB BNE $0876 ; get slot (x16) again 087B- A6 2B LDX $2B ; jump to the code we just read into ; the text page 087D- 4C 00 04 JMP $0400 Well that's lovely. I want to interrupt the boot at $087D, but if I do, it will modify the checksum that ends up in the stack pointer. It's also wiping main memory, including the place I usually put my boot trace callbacks (around $9700). So, a three-pronged attack: 1. 
Relocate the code to $0900. Most of it uses relative branching already, except for one JMP at $086E, which I can patch. The code will still run, but I'll be able to patch it without altering the checksum. 2. Disable the memory wipe at $095D. 3. Patch the code at $097D to jump to a routine under my control. ~ Chapter 2 In Which Nothing Happens, Inhospitably *9600 *97FF 97FF- 20 The initial checksum of boot0 is $20. *C500G ... ]CALL -151 *9600 *97FF 97FF- 25 The second checksum, which gets stashed in the stack pointer at $0471, is $25. ~ Chapter 4 Half A Track Is Better Than None Continuing the boot trace at $0472... *C500G ... ]BLOAD BOOT1 0400-07FF,A$2400 ]CALL -151 *2472L 2472- A0 03 LDY #$03 2474- 20 DC 04 JSR $04DC *24DCL ; advance drive head by one phase ; (a.k.a. a half track) 24DC- E6 0C INC $0C 24DE- A5 0C LDA $0C 24E0- 29 03 AND #$03 24E2- 0A ASL 24E3- 05 2B ORA $2B 24E5- AA TAX 24E6- BD 81 C0 LDA $C081,X 24E9- 20 F8 04 JSR $04F8 24EC- BD 80 C0 LDA $C080,X 24EF- 20 F8 04 JSR $04F8 ; loop a number of times (given in the ; Y register on entry) 24F2- 88 DEY 24F3- D0 E7 BNE $24DC 24F5- A6 2B LDX $2B 24F7- 60 RTS ; wait routine (called from $04E9 and ; $04EF) 24F8- A9 40 LDA #$40 24FA- 8D 50 C0 STA $C050 24FD- 4C A8 FC JMP $FCA8 We started on track 0 and advanced the drive head by 3 phases, so now we're on track 1.5. Continuing from $0477... 
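For anyone who would rather read the head-advance routine at $04DC in a high-level language, here is a minimal Python sketch of the same arithmetic. It assumes (my assumption, the listing does not show it) that zero page $0C starts at 0 with the head on track 0, and that we booted from slot 6 so $2B holds $60. The function name and return shape are made up for illustration.

```python
def step_head(phase, slot16, steps):
    """Model the loop at $04DC: advance the head by `steps` phases.

    Returns the new phase counter and the softswitches touched, in order.
    """
    touched = []
    for _ in range(steps):
        phase = (phase + 1) & 0xFF           # INC $0C
        x = ((phase & 0x03) << 1) | slot16   # AND #$03 / ASL / ORA $2B
        touched.append(0xC081 + x)           # LDA $C081,X (stepper phase on)
        touched.append(0xC080 + x)           # LDA $C080,X (stepper phase off)
    return phase, touched


# Assumed starting state: phase counter 0 on track 0, booting from slot 6.
phase, touched = step_head(0, 0x60, 3)
track = phase / 2                            # two phases per whole track
print(track, [hex(address) for address in touched])
```

Three phases at two phases per track is where the "track 1.5" figure comes from.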
; get target memory page from an array
; at $05D0
2477- A4 0E LDY $0E
2479- B9 D0 05 LDA $05D0,Y

; if page = 0, jump to next stage
; at $0500, otherwise continue at $0481
247C- D0 03 BNE $2481
247E- 4C 00 05 JMP $0500
2481- 20 90 04 JSR $0490

*2490L

; sector count (4-4 encoded tracks can
; only hold $0C pages worth of data)
2490- 85 05 STA $05
2492- 18 CLC
2493- A9 0C LDA #$0C
2495- 85 06 STA $06
2497- A0 00 LDY #$00
2499- 84 04 STY $04

; match custom prologue "D5 AA DA"
249B- BD 8C C0 LDA $C08C,X
249E- 10 FB BPL $249B
24A0- C9 D5 CMP #$D5
24A2- D0 F7 BNE $249B
24A4- BD 8C C0 LDA $C08C,X
24A7- 10 FB BPL $24A4
24A9- C9 AA CMP #$AA
24AB- D0 F3 BNE $24A0
24AD- BD 8C C0 LDA $C08C,X
24B0- 10 FB BPL $24AD
24B2- C9 DA CMP #$DA
24B4- D0 EA BNE $24A0

; now read 4-4 encoded data into ($04)
24B6- BD 8C C0 LDA $C08C,X
24B9- 10 FB BPL $24B6
24BB- 38 SEC
24BC- 2A ROL
24BD- 85 0F STA $0F
24BF- 8D 50 C0 STA $C050
24C2- BD 8C C0 LDA $C08C,X
24C5- 10 FB BPL $24C2
24C7- 25 0F AND $0F
24C9- 91 04 STA ($04),Y
24CB- C8 INY
24CC- D0 E8 BNE $24B6

; increment target page
24CE- E6 05 INC $05

; decrement count
24D0- C6 06 DEC $06

; Loop back to read more. Note: this
; goes directly to data read routine,
; not the prologue match routine. There
; is only one prologue per track.
24D2- D0 E2 BNE $24B6
24D4- 60 RTS

Continuing from $0484...

*2484L

; not shown, but the subroutine
; sets Y=2 and falls through to drive
; head advance routine, so this will
; skip ahead 2 phases = 1 whole track,
; so we're still on half tracks but now
; 2.5, 3.5, 4.5, &c.
2484- 20 D8 04 JSR $04D8

; show hi-res screen, increment index
; into page array, and jump back to
; read the next track
2487- 20 00 06 JSR $0600

*2600L

; set some addresses that are likely to
; be important later
2600- A9 00 LDA #$00
2602- 8D D1 6D STA $6DD1
2605- 8D D2 6D STA $6DD2
2608- 8D D3 6D STA $6DD3
260B- A9 60 LDA #$60
260D- 8D 7F 7F STA $7F7F
2610- A9 05 LDA #$05
2612- 85 00 STA $00
2614- A9 3B LDA #$3B
2616- 85 01 STA $01
2618- A9 C0 LDA #$C0
261A- 85 02 STA $02
261C- A9 84 LDA #$84
261E- 85 03 STA $03
2620- A9 95 LDA #$95
2622- 85 05 STA $05
2624- A9 14 LDA #$14
2626- 85 06 STA $06
2628- A0 15 LDY #$15
262A- B1 01 LDA ($01),Y
262C- A9 55 LDA #$55
262E- 85 FE STA $FE
2630- A9 FD LDA #$FD
2632- 85 FF STA $FF
2634- 60 RTS

If I know anything about anything, that will prove to be important later.

Continuing from $048A...

; increment page index
248A- E6 0E INC $0E

; and branch back (exits via $0500 when
; the target page = 0)
248C- 4C 77 04 JMP $0477

Here is the target page table (accessed at $0479):

*25D0.
25D0- 0C 18 24 30 3C 48 54 60
25D8- 6C 78 84 90 9C A8 B4 00

Each call to $0490 reads $0C sectors, so we're filling $0C00..$BFFF entirely. Once the page array is exhausted, $047E jumps to $0500 for the next boot stage.

To sum up:

- We're reading data from consecutive half tracks (1.5, 2.5, 3.5, &c.)
- Each track has $0C pages of data in a custom (non-sector-based) format
- We're filling $0C00..$BFFF in main memory
- Nothing in this read loop relies on the checksum we stashed in the stack pointer or the later checksum we pushed twice to the stack
- $047E exits via $0500

Let's capture it.

~

Chapter 5
In Which Things Have Been Made As Difficult As Possible For Us

*9600

A quick inspection of memory confirms that $0C00..$BFFF have changed, and the rest are untouched (except $0800 for boot0 and the text page for boot1, but I knew about those already).

*C500G
...
]BSAVE OBJ.0C00-7FFF,A$C00,L$7400
]BRUN TRACE2
...reboots slot 6...
*2000<8000.BFFFM
*C500G
...
]BSAVE OBJ.8000-BEFF,A$2000,L$3F00
]BSAVE OBJ.BF00-BFFF,A$5F00,L$100

That's it; that's the entire game code. Now back to the bootloader to see where the entry point is.

]BLOAD OBJ.0400-07FF,A$2400
]CALL -151
*2500L

; turn off drive motor
2500- BD 88 C0 LDA $C088,X

; checksum entire game code in memory
2503- A9 0C LDA #$0C
2505- 85 81 STA $81
2507- A9 00 LDA #$00
2509- 85 80 STA $80
250B- A8 TAY
250C- A2 B4 LDX #$B4
250E- 51 80 EOR ($80),Y
2510- C8 INY
2511- D0 FB BNE $250E
2513- E6 81 INC $81
2515- CA DEX
2516- D0 F6 BNE $250E
2518- A8 TAY

; if checksum fails, it's off to The
; Badlands with you!
2519- D0 25 BNE $2540

; transfer the stack pointer to X --
; remember this was set as the result
; of the checksum back at $0470
251B- BA TSX

Now X is #$25.

251C- A0 00 LDY #$00
251E- B1 FF LDA ($FF),Y
2520- 48 PHA

zp$FF was set to #$FD at $0632. The behavior of this addressing mode is strange, though. We're using zp$FF as the low byte of an address. But zero page always wraps around, so the high byte of the address is zp$00, not $0100. zp$00 was set to #$05 at $0612. So the address we're loading is $05FD (+Y, which is 0), in memory now at $25FD.

*25FD
25FD- 71

And that's what gets pushed to the stack: #$71. Continuing...

2521- C8 INY
2522- B1 FF LDA ($FF),Y
2524- 48 PHA

Same addressing mode, but now Y is 1, so we're getting the value of $05FE.

*25FE
25FE- 42

So we've pushed #$71/#$42 to the stack.

2525- 8A TXA
2526- 85 37 STA $37

Now zp$37 is #$25.

2528- 49 B6 EOR #$B6
252A- 99 FF 01 STA $01FF,Y

Now $0200 is #$25 XOR #$B6 = #$93.

; set up the rest of zero page in bulk
252D- A0 60 LDY #$60
252F- B9 00 07 LDA $0700,Y
2532- 99 00 00 STA $0000,Y
2535- C8 INY
2536- D0 F7 BNE $252F

; and "exit" via the address we just
; pushed to the stack
2538- 60 RTS

"RTS" pops the two bytes we pushed and adds 1, so the entry point of the game is $7143.

*BLOAD OBJ.0C00-7FFF,A$C00
*7143L
7143- 1A ???
7144- AC 43 09 LDY $0943
7147- 7A ???
7148- 5A ???
7149- A8 TAY
714A- BA TSX
714B- A8 TAY
714C- BA TSX
714D- 3A ???
714E- AE 84 01 LDX $0184
7151- 3A ???
7152- 7A ???
7153- 1A ???
7154- AD 99 99 LDA $9999
7157- 1A ???
7158- B8 CLV
7159- 4C 53 0F JMP $0F53
715C- 84 75 STY $75
715E- 3A ???
715F- B8 CLV

I've missed something.

~

Chapter 6
Undocumented Opcode Is Best Opcode

Actually, I haven't missed anything. All those opcodes that show up as "???" in the monitor listing are actually valid (if undocumented) 6502 opcodes. $1A, $3A, $5A, and $7A are all equivalent to $EA -- a NOP. (Somewhat surprisingly, these opcodes work even on my enhanced Apple //e with a 65c02 processor.) So this code does a bunch of seemingly random things to registers, then eventually jumps to $0F53.

*F53L

; well at least this looks like code!
; copy The Badlands to $0300
0F53- A2 40 LDX #$40
0F55- BD 40 05 LDA $0540,X
0F58- 9D 00 03 STA $0300,X
0F5B- CA DEX
0F5C- 10 F7 BPL $0F55

; switch to RAM
0F5E- AD 81 C0 LDA $C081
0F61- AD 81 C0 LDA $C081

; set high- and low-level reset vectors
0F64- A9 00 LDA #$00
0F66- 8D F2 03 STA $03F2
0F69- 8D FC FF STA $FFFC
0F6C- A9 03 LDA #$03
0F6E- 8D F3 03 STA $03F3
0F71- 8D FD FF STA $FFFD
0F74- 49 A5 EOR #$A5
0F76- 8D F4 03 STA $03F4

; back to ROM
0F79- AD 80 C0 LDA $C080

; and continue to the real entry point
0F7C- 4C 33 81 JMP $8133

And now I have enough information to run the game without the bootloader -- and see if I've *really* missed something.

; get the bootloader back in memory
*BLOAD OBJ.0400-07FF,A$2400

; this will end up on zero page (we'll
; move it later)
*B60<2760.27FFM

; load the game
*BLOAD OBJ.0C00-7FFF,A$C00
*BLOAD OBJ.8000-BEFF,A$8000

; load last page in lower memory so it
; doesn't overwrite Diversi-DOS (we'll
; move it later)
*BLOAD OBJ.BF00-BFFF,A$800

Now a short loader program that initializes zero page and jumps to the real entry point.
0B00- A9 25 LDA #$25
0B02- 85 37 STA $37
0B04- 49 B6 EOR #$B6
0B06- 8D 00 02 STA $0200
0B09- A0 60 LDY #$60
0B0B- B9 00 0B LDA $0B00,Y
0B0E- 99 00 00 STA $0000,Y
0B11- C8 INY
0B12- D0 F7 BNE $0B0B
0B14- 4C 33 81 JMP $8133

*BSAVE LOADER,A$B00,L$100

; disconnect DOS
*FE89G FE93G

; move the last page into place
*BF00<800.8FFM

; and run our custom loader
*B00G

...works, and it is glorious...

~

Chapter 7
In Which We Step, Ever So Gently, Into The 21st Century

I have all the game code. I know how to initialize it and call it. Now to write it all to disk. (We'll worry about reading it back in just a minute.)

[S6,D1=blank formatted disk]
[S5,D1=my work disk]

]PR#5
...
]CALL -151

; page count (decremented)
0300- A9 B5 LDA #$B5
0302- 85 FF STA $FF

; logical sector (incremented)
0304- A9 00 LDA #$00
0306- 85 FE STA $FE

; call RWTS to write sector
0308- A9 03 LDA #$03
030A- A0 88 LDY #$88
030C- 20 D9 03 JSR $03D9

; increment logical sector, wrap around
; from $0F to $00 and increment track
030F- E6 FE INC $FE
0311- A4 FE LDY $FE
0313- C0 10 CPY #$10
0315- D0 07 BNE $031E
0317- A0 00 LDY #$00
0319- 84 FE STY $FE
031B- EE 8C 03 INC $038C

; convert logical to physical sector
031E- B9 40 03 LDA $0340,Y
0321- 8D 8D 03 STA $038D

; increment page to write
0324- EE 91 03 INC $0391

; loop until done with all $B5 pages
0327- C6 FF DEC $FF
0329- D0 DD BNE $0308
032B- 60 RTS

*340.34F

; logical to physical sector mapping
0340- 00 07 0E 06 0D 05 0C 04
0348- 0B 03 0A 02 09 01 08 0F

*388.397

; RWTS parameter table, pre-initialized
; with slot 6, drive 1, track $01,
; sector $00, address $0A00, and RWTS
; write command ($02)
0388- 01 60 01 00 01 00 FB F7
0390- 00 0A 00 00 02 00 00 60

*BSAVE MAKE,A$300,L$98

; load everything off-by-$100 so we
; leave $BF00+ untouched (this is the
; only page in main memory used by
; Diversi-DOS 64K)
*BLOAD LOADER,A$A00
*BLOAD OBJ.0C00-7FFF,A$B00
*BLOAD OBJ.8000-BEFF,A$7F00
*BLOAD OBJ.BF00-BFFF,A$BE00

[S6,D1=blank disk]

*300G ; write game to disk

Now I have the
entire game on tracks $01-$0C of a standard 16-sector disk.

To read it back as quickly as possible, I'll use qkumba's "0boot" bootloader, newly updated to version 2.0 with support for partial tracks.

~

Chapter 8
0boot 2.0

0boot lives on track $00, just like me. Sector $00 (boot0) reuses the disk controller ROM routine to read sector $0E (boot1). Boot0 creates a few data tables, copies boot1 to zero page, modifies it to accommodate booting from any slot, and jumps to it.

Boot0 is loaded at $0800 by the disk controller ROM routine.

; tell the ROM to load only this sector
; (we'll do the rest manually)
0800- [01]

; The accumulator is $01 after loading
; sector $00, or $03 after loading
; sector $0E. We don't need to preserve
; the value, so we just shift the bits
; to determine whether this is the
; first or second time we've been here.
0801- 4A LSR

; second run -- we've loaded boot1, so
; skip to boot1 initialization routine
0802- D0 0E BNE $0812

; first run -- increment the physical
; sector to read (this will be the next
; sector under the drive head, so we'll
; waste as little time as possible
; waiting for the disk to spin)
0804- E6 3D INC $3D

; X holds the boot slot (x16) --
; munge it into $Cx format (e.g. $C6
; for slot 6, but we need to accommodate
; booting from any slot)
0806- 8A TXA
0807- 4A LSR
0808- 4A LSR
0809- 4A LSR
080A- 4A LSR
080B- 09 C0 ORA #$C0

; push address (-1) of the sector read
; routine in the disk controller ROM
080D- 48 PHA
080E- A9 5B LDA #$5B
0810- 48 PHA

; "return" via disk controller ROM,
; which reads boot1 into $0900 and
; exits via $0801
0811- 60 RTS

; Execution continues here (from $0802)
; after boot1 code has been loaded into
; $0900. This works around a bug in the
; CFFA 3000 firmware that doesn't
; guarantee that the Y register is
; always $00 at $0801, which is exactly
; the sort of bug that qkumba enjoys
; uncovering.
0812- A8 TAY

; munge the boot slot, e.g. $60 -> $EC
; (to be used later)
0813- 8A TXA
0814- 09 8C ORA #$8C

; Copy the boot1 code from $0901..$09FF
; to zero page. ($0900 holds the 0boot
; version number. This is version 1.
; $0000 is initialized later in boot1.)
0816- BE 00 09 LDX $0900,Y
0819- 96 00 STX $00,Y
081B- C8 INY
081C- D0 F8 BNE $0816

; There are a number of places in boot1
; that need to hit a slot-specific soft
; switch (read a nibble from disk, turn
; off the drive, &c). Rather than the
; usual form of "LDA $C08C,X", we will
; use "LDA $C0EC" and modify the $EC
; byte in advance, based on the boot
; slot. $00E4 is an array of all the
; places in the boot1 code that need
; this adjustment.
081E- C8 INY
081F- B6 E3 LDX $E3,Y
0821- 95 00 STA $00,X
0823- D0 F9 BNE $081E

; munge $EC -> $E0 (used later to
; advance the drive head to the next
; track)
0825- 29 F0 AND #$F0
0827- 85 CB STA $CB

; munge $E0 -> $E8 (used later to
; turn off the drive motor)
0829- 09 08 ORA #$08
082B- 85 D9 STA $D9

; push several addresses to the stack
; (more on this later)
082D- A2 06 LDX #$06
082F- B5 DD LDA $DD,X
0831- 48 PHA
0832- CA DEX
0833- D0 FA BNE $082F

; number of tracks to load (x2) (game-
; specific; this game uses $0C tracks)
0835- A0 18 LDY #$18

; push $0003 to the stack (more on this
; later)
0837- 8A TXA
0838- 48 PHA
0839- A9 03 LDA #$03
083B- 48 PHA
083C- 8A TXA

; unconditional branch over the next
; loop
083D- 18 CLC
083E- 90 07 BCC $0847

; loop starts here
0840- 8A TXA

; every other time through this loop,
; we will end up taking this branch
0841- 90 03 BCC $0846

; X is 0 going into this loop, and it
; never changes, so A is always 0 too.
; So this will push $0000 to the stack
; (to "return" to $0001, which reads a
; track into memory)
0843- 48 PHA
0844- 48 PHA

; There's a "SEC" hidden here (because
; it's opcode $38), but it's only
; executed if we take the branch at
; $0841, which lands at $0846, which is
; in the middle of this instruction.
; Otherwise we execute the compare,
; which clears the carry bit. So the
; carry flip-flops between set and
; clear, so the BCC at $0841 is only
; taken every other time.
0845- C9 38 CMP #$38

; Push $00B6 to the stack, to "return"
; to $00B7. This routine advances the
; drive head to the next half track.
0847- 48 PHA
0848- A9 B6 LDA #$B6
084A- 48 PHA

; loop until done
084B- 88 DEY
084C- D0 F2 BNE $0840

Because of the carry flip-flop, we will push $00B6 to the stack every time through the loop, but we will only push $0000 every other time. The loop runs for twice the number of tracks we want to read, so the stack ends up looking like this (remember all addresses are off-by-1 because of how the Apple II "returns" to stack addresses):

 --top--
 $00B6 (move drive 1/2 track)
 $00B6 (move drive another 1/2 track)
 $0000 (read track into memory)
 $00B6 \
 $00B6  } second group
 $0000 /
 $00B6 \
 $00B6  } third group
 $0000 /
  .
  . [repeated for each track]
  .
 $00B6 \
 $00B6  } final group
 $0000 /
 $0003 entry point to read the last few sectors from the final track (we can't read the entire track into memory because we'd end up overwriting $C000 ROM space and wreaking havoc with softswitches)
 $FE88 IN#0 (this and the following two addresses were pushed to the stack in the loop at $082F)
 $FE92 PR#0
 $00D7 turn off drive motor and jump to my game-specific custom loader at $0B00
 --bottom--

Boot1 reads the game into memory from tracks $01-$0C, but it isn't a loop. It's one routine that reads a track and another routine that advances the drive head. We're essentially unrolling the read loop on the stack, in advance, so that each routine gets called as many times as we need, when we need it. Like dancers in a chorus line, each routine executes then cedes the spotlight. Each seems unaware of the others, but in reality they've all been meticulously choreographed.

~

Chapter 9
6 + 2

Before I can explain the next chunk of code, I need to pause and explain a little bit of theory.
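One last look at that unrolled stack before the theory starts. This is a rough Python model of the boot0 push loop from $0837 through $084C, written only to show how the carry flip-flop shapes the call sequence. The function names are mine, and the three addresses pushed earlier by the loop at $082F are deliberately left out.

```python
def build_stack(iterations=0x18):
    """Model the pushes boot0 makes from $0837 through $084C."""
    pushed = [0x0003]              # $0837..$083B
    carry = False                  # CLC at $083D
    for i in range(iterations):
        if i == 0:
            pass                   # BCC $0847 enters the loop past the branch
        elif not carry:
            carry = True           # BCC $0846 lands on the hidden SEC
        else:
            pushed.append(0x0000)  # PHA / PHA at $0843
            carry = False          # CMP #$38 with A = 0 clears the carry
        pushed.append(0x00B6)      # $0847..$084A
    return pushed


def run_order(pushed):
    """Addresses in the order RTS visits them (last pushed first, plus 1)."""
    return [address + 1 for address in reversed(pushed)]


order = run_order(build_stack())
print([f"${address:04X}" for address in order[:6]])
```

In this model, every pass through the loop queues one half-track move, every other pass also queues a full-track read, and the last address visited is the partial-track entry point at $0004.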
As you probably know if you're the sort of person who reads this sort of thing, Apple II floppy disks do not contain the actual data that ends up being loaded into memory. Due to hardware limitations of the original Disk II drive, data on disk must be stored in an intermediate format called "nibbles." Bytes in memory are encoded into nibbles before writing to disk, and nibbles that you read from the disk must be decoded back into bytes. The round trip is lossless but requires some bit wrangling.

Decoding nibbles-on-disk into bytes-in-memory is a multi-step process. In "6-and-2 encoding" (used by DOS 3.3, ProDOS, and all ".dsk" image files), there are 64 possible values that you may find in the data field (in the range $96..$FF, but not all of those, because some of them have bit patterns that trip up the drive firmware). We'll call these "raw nibbles."

Step 1: read $156 raw nibbles from the data field. These values will range from $96 to $FF, but as mentioned earlier, not all values in that range will appear on disk.

Now we have $156 raw nibbles.

Step 2: decode each of the raw nibbles into a 6-bit byte between 0 and 63 (%00000000 and %00111111 in binary). $96 is the lowest valid raw nibble, so it gets decoded to 0. $97 is the next valid raw nibble, so it's decoded to 1. $98 and $99 are invalid, so we skip them, and $9A gets decoded to 2. And so on, up to $FF (the highest valid raw nibble), which gets decoded to 63.

Now we have $156 6-bit bytes.

Step 3: split up each of the first $56 6-bit bytes into pairs of bits. In other words, each 6-bit byte becomes three 2-bit bytes. These 2-bit bytes are merged with the next $100 6-bit bytes to create $100 8-bit bytes. Hence the name, "6-and-2" encoding.

The exact process of how the bits are split and merged is... complicated. The first $56 6-bit bytes get split up into 2-bit bytes, but those two bits get swapped (so %01 becomes %10 and vice-versa). The other $100 6-bit bytes each get multiplied by 4 (a.k.a.
bit-shifted two places left). This leaves a hole in the lower two bits, which is filled by one of the 2-bit bytes from the first group.

A diagram might help. "a" through "x" each represent one bit.

-------------
 1 decoded      3 decoded
 nibble in  +   nibbles in  =  3 bytes
 first $56      other $100

 00abcdef       00ghijkl
                00mnopqr
     |          00stuvwx
     |              |
   split         shifted
     &           left x2
  swapped           |
     |              |
     V              V

 000000fe   +   ghijkl00    = ghijklfe
 000000dc   +   mnopqr00    = mnopqrdc
 000000ba   +   stuvwx00    = stuvwxba
-------------

Tada! Four 6-bit bytes

 00abcdef
 00ghijkl
 00mnopqr
 00stuvwx

become three 8-bit bytes

 ghijklfe
 mnopqrdc
 stuvwxba

When DOS 3.3 reads a sector, it reads the first $56 raw nibbles, decodes them into 6-bit bytes, and stashes them in a temporary buffer (at $BC00). Then it reads the other $100 raw nibbles, decodes them into 6-bit bytes, and puts them in another temporary buffer (at $BB00). Only then does DOS 3.3 start combining the bits from each group to create the full 8-bit bytes that will end up in the target page in memory. This is why DOS 3.3 "misses" sectors when it's reading, because it's busy twiddling bits while the disk is still spinning.

~

Chapter 10
Back to 0boot

0boot also uses "6-and-2" encoding. The first $56 nibbles in the data field are still split into pairs of bits that need to be merged with nibbles that won't come until later. But instead of waiting for all $156 raw nibbles to be read from disk, it "interleaves" the nibble reads with the bit twiddling required to merge the first $56 6-bit bytes and the $100 that follow. By the time 0boot gets to the data field checksum, it has already stored all $100 8-bit bytes in their final resting place in memory. This means that 0boot can read all 16 sectors on a track in one revolution of the disk. That's crazy fast.

To make it possible to do all the bit twiddling we need to do and not miss nibbles as the disk spins(*), we do some of the work earlier. We multiply each of the 64 possible decoded values by 4 and store those values.
(Since this is accomplished by bit shifting and we're doing it before we start reading the disk, this is called the "pre-shift" table.) We also store all possible 2-bit values in a repeating pattern that will make it easy to look them up later. Then, as we're reading from disk (and timing is tight), we can simulate all the bit math we need to do with a series of table lookups. There is just enough time to convert each raw nibble into its final 8-bit byte before reading the next nibble. (*) The disk spins independently of the CPU, and we only have a limited time to read a nibble and do what we're going to do with it before WHOOPS HERE COMES ANOTHER ONE. So time is of the essence. Also, "As The Disk Spins" would make a great name for a retrocomputing-themed soap opera. I am going to continue making this joke until someone makes it happen, then I promise I will stop. The first table, at $0200..$02FF, is three columns wide and 64 rows deep. Astute readers will notice that 3 x 64 is not 256. Only three of the columns are used; the fourth (unused) column exists because multiplying by 3 is hard but multiplying by 4 is easy (in base 2 anyway). The three columns correspond to the three pairs of 2-bit values in those first $56 6-bit bytes. Since the values are only 2 bits wide, each column holds one of four different values (%00, %01, %10, or %11). The second table, at $0300..$0369, is the "pre-shift" table. This contains all the possible 6-bit bytes, in order, each multiplied by 4 (a.k.a. shifted to the left two places, so the 6 bits that started in columns 0-5 are now in columns 2-7, and columns 0 and 1 are zeroes). Like this: 00ghijkl --> ghijkl00 Astute readers will notice that there are only 64 possible 6-bit bytes, but this second table is larger than 64 bytes. To make lookups easier, the table has empty slots for each of the invalid raw nibbles. 
In other words, we don't do any math to decode raw nibbles into 6-bit bytes; we just look them up in this table (offset by $96, since that's the lowest valid raw nibble) and get the required bit shifting for free.

addr | raw | decoded 6-bit  | pre-shift
-----+-----+----------------+----------
$300 | $96 |  0 = %00000000 | %00000000
$301 | $97 |  1 = %00000001 | %00000100
$302 | $98 [invalid raw nibble]
$303 | $99 [invalid raw nibble]
$304 | $9A |  2 = %00000010 | %00001000
$305 | $9B |  3 = %00000011 | %00001100
$306 | $9C [invalid raw nibble]
$307 | $9D |  4 = %00000100 | %00010000
  .
  .
  .
$368 | $FE | 62 = %00111110 | %11111000
$369 | $FF | 63 = %00111111 | %11111100

Each value in this "pre-shift" table also serves as an index into the first table (with all the 2-bit bytes). This wasn't an accident; I mean, that sort of magic doesn't just happen. But the table of 2-bit bytes is arranged in such a way that we take one of the raw nibbles that needs to be decoded and split apart (from the first $56 raw nibbles in the data field), use that raw nibble as an index into the pre-shift table, then use that pre-shifted value as an index into the first table to get the 2-bit value we need. That's a neat trick.

; this loop creates the pre-shift table
; at $300
084E- A2 40 LDX #$40
0850- A4 58 LDY $58
0852- 98 TYA
0853- 0A ASL
0854- 24 58 BIT $58
0856- F0 12 BEQ $086A
0858- 05 58 ORA $58
085A- 49 FF EOR #$FF
085C- 29 7E AND #$7E
085E- B0 0A BCS $086A
0860- 4A LSR
0861- D0 FB BNE $085E
0863- CA DEX
0864- 8A TXA
0865- 0A ASL
0866- 0A ASL
0867- 99 EA 02 STA $02EA,Y
086A- C6 58 DEC $58
086C- D0 E2 BNE $0850

And this is the result (".." means the address is uninitialized and unused):

0300- 00 04 .. .. 08 0C .. 10
0308- 14 18 .. .. .. .. .. ..
0310- 1C 20 .. .. .. 24 28 2C
0318- 30 34 .. .. 38 3C 40 44
0320- 48 4C .. 50 54 58 5C 60
0328- 64 68 .. .. .. .. .. ..
0330- .. .. .. .. .. 6C .. 70
0338- 74 78 .. .. .. 7C .. ..
0340- 80 84 .. 88 8C 90 94 98
0348- 9C A0 .. .. .. .. .. A4
0350- A8 AC .. B0 B4 B8 BC C0
0358- C4 C8 .. .. CC D0 D4 D8
0360- DC E0 .. E4 E8 EC F0 F4
0368- F8 FC

; this loop creates the table of 2-bit
; values at $200, magically arranged to
; enable easy lookups later
086E- 46 BA LSR $BA
0870- 46 BA LSR $BA
0872- B5 EA LDA $EA,X
0874- 99 FF 01 STA $01FF,Y
0877- E6 AF INC $AF
0879- A5 AF LDA $AF
087B- 25 BA AND $BA
087D- D0 05 BNE $0884
087F- E8 INX
0880- 8A TXA
0881- 29 03 AND #$03
0883- AA TAX
0884- C8 INY
0885- C8 INY
0886- C8 INY
0887- C8 INY
0888- C0 04 CPY #$04
088A- B0 E6 BCS $0872
088C- C8 INY
088D- C0 04 CPY #$04
088F- 90 DD BCC $086E

And this is the result:

0200- 00 00 00 .. 00 00 02 ..
0208- 00 00 01 .. 00 00 03 ..
0210- 00 02 00 .. 00 02 02 ..
0218- 00 02 01 .. 00 02 03 ..
0220- 00 01 00 .. 00 01 02 ..
0228- 00 01 01 .. 00 01 03 ..
0230- 00 03 00 .. 00 03 02 ..
0238- 00 03 01 .. 00 03 03 ..
0240- 02 00 00 .. 02 00 02 ..
0248- 02 00 01 .. 02 00 03 ..
0250- 02 02 00 .. 02 02 02 ..
0258- 02 02 01 .. 02 02 03 ..
0260- 02 01 00 .. 02 01 02 ..
0268- 02 01 01 .. 02 01 03 ..
0270- 02 03 00 .. 02 03 02 ..
0278- 02 03 01 .. 02 03 03 ..
0280- 01 00 00 .. 01 00 02 ..
0288- 01 00 01 .. 01 00 03 ..
0290- 01 02 00 .. 01 02 02 ..
0298- 01 02 01 .. 01 02 03 ..
02A0- 01 01 00 .. 01 01 02 ..
02A8- 01 01 01 .. 01 01 03 ..
02B0- 01 03 00 .. 01 03 02 ..
02B8- 01 03 01 .. 01 03 03 ..
02C0- 03 00 00 .. 03 00 02 ..
02C8- 03 00 01 .. 03 00 03 ..
02D0- 03 02 00 .. 03 02 02 ..
02D8- 03 02 01 .. 03 02 03 ..
02E0- 03 01 00 .. 03 01 02 ..
02E8- 03 01 01 .. 03 01 03 ..
02F0- 03 03 00 .. 03 03 02 ..
02F8- 03 03 01 .. 03 03 03 ..

And now for something completely different.

The original disk briefly displayed an uninitialized hi-res graphics page (originally at $0801 -- literally the first thing it does on boot). So I want to do the same. It won't be absolutely first thing, but it'll be close.
0891- 2C 54 C0 BIT $C054
0894- 2C 52 C0 BIT $C052
0897- 2C 57 C0 BIT $C057
089A- 2C 50 C0 BIT $C050
089D- 60 RTS

[Note to future self: $0891..$08FF is available for game-specific init code, but it can't rely on or disturb zero page in any way. That rules out a lot of built-in ROM routines; be careful. If the game needs no initialization, you can zap this entire range and put an "RTS" at $0891.]

Everything else is already lined up on the stack. All that's left to do is "return" and let the stack guide us through the rest of the boot.

~

Chapter 11
0boot boot1

The rest of the boot runs from zero page. It's hard to show you exactly what boot1 will look like, because it relies heavily on self-modifying code. In a standard DOS 3.3 RWTS, the softswitch to read the data latch is "LDA $C08C,X", where X is the boot slot times 16 (to allow disks to boot from any slot). 0boot also supports booting from any slot, but instead of using an index, each fetch instruction is pre-set based on the boot slot. We only need to set this up once, because we're only going to read from the disk once. Not only does this free up the X register, it lets us juggle all the registers and put the raw nibble value in whichever one is convenient at the time. (We take full advantage of this freedom.) I've marked each pre-set softswitch with "o_O" to remind you that self-modifying code is awesome.

There are several other instances of addresses and constants that get modified while boot1 is running. I've marked these with "/!\" to remind you that self-modifying code is dangerous and you should not try this at home.

The first thing popped off the stack is the drive arm move routine at $00B7. It moves the drive exactly one phase (half a track).

00B7- E6 BA INC $BA

; This value was set at $00B7 (above).
; It's incremented monotonically, but
; it's ANDed with $03 later, so its
; exact value isn't relevant.
00B9- A0 3F LDY #$3F /!\

; short wait for PHASEON
00BB- A9 04 LDA #$04
00BD- 20 C3 00 JSR $00C3

; fall through
00C0- 88 DEY

; longer wait for PHASEOFF
00C1- 69 41 ADC #$41
00C3- 85 CE STA $CE

; calculate the proper stepper motor to
; access
00C5- 98 TYA
00C6- 29 03 AND #$03
00C8- 2A ROL
00C9- AA TAX

; This address was set at $0827,
; based on the boot slot.
00CA- BD D1 C0 LDA $C0D1,X /!\

; This value was set at $00C3 so that
; PHASEON and PHASEOFF have optimal
; wait times.
00CD- A9 D1 LDA #$D1 /!\

; wait exactly the right amount of time
; after accessing the proper stepper
; motor
00CF- 4C A8 FC JMP $FCA8

Since the drive arm routine only moves one phase, it was pushed to the stack twice before each track read. Our game is stored on whole tracks; this half-track trickery is only to save a few bytes of code in boot1.

The track read routine starts at $0001, because that let us save 1 byte in the boot0 code when we were pushing addresses to the stack. (We could just push $00 twice.)

; sectors-left-to-read-on-this-track
; counter (incremented to $00)
0001- A2 F0 LDX #$F0
0003- 2C A2 FB BIT $FBA2
0006- 86 00 STX $00

Pay no attention to the BIT instruction at $0003, which just happens to hide a whole other instruction at $0004. We will return to this later.

We initialize an array at $00DE that tracks which sectors we've read from the current track. Astute readers will notice that this part of zero page had real data in it -- some addresses that were pushed to the stack, and some other values that were used to create the 2-bit table at $0200. All true, but all those operations are now complete, and the space from $00DE..$00FF is now available for unrelated uses.

The array is in physical sector order, thus the RWTS assumes data is stored in physical sector order on each track. (This is why my MAKE program had to map to physical sector order when writing. This saves 18 bytes: 16 for the table and 2 for the lookup command!)
Values are the actual pages in memory
where that sector should go, and they
get zeroed once the sector is read (so
we don't waste time decoding the same
sector twice).

; starting address (game-specific;
; this one starts loading at $0B00)
0008-   A9 0B       LDA   #$0B     /!\
000A-   95 EE       STA   $EE,X
000C-   E6 09       INC   $09
000E-   E8          INX
000F-   D0 F7       BNE   $0008

0011-   20 D2 00    JSR   $00D2

; subroutine reads a nibble and
; stores it in the accumulator
00D2-   AD D1 C0    LDA   $C0D1    o_O
00D5-   10 FB       BPL   $00D2
00D7-   60          RTS

Continuing from $0014 (wow that sounds
weird, doesn't it?)...

; first nibble must be $D5
0014-   C9 D5       CMP   #$D5
0016-   D0 F9       BNE   $0011

; read second nibble, must be $AA
0018-   20 D2 00    JSR   $00D2
001B-   C9 AA       CMP   #$AA
001D-   D0 F5       BNE   $0014

; We actually need the Y register to be
; $AA for unrelated reasons later, so
; let's set that now. (We have time,
; and it saves 1 byte!)
001F-   A8          TAY

; read the third nibble
0020-   20 D2 00    JSR   $00D2

; is it $AD?
0023-   49 AD       EOR   #$AD

; Yes, which means this is the data
; prologue. Branch forward to start
; reading the data field.
0025-   F0 1F       BEQ   $0046

If that third nibble is not $AD, we
assume it's the end of the address
prologue. ($96 would be the third nibble
of a standard address prologue, but we
don't actually check.) We fall through
and start decoding the 4-4 encoded
values in the address field.

0027-   A0 02       LDY   #$02

The first time through this loop, we'll
read the disk volume number. The second
time, we'll read the track number. The
third time, we'll read the physical
sector number. We don't actually care
about the disk volume or the track
number, and once we get the sector
number, we don't verify the address
field checksum.
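
For readers who don't think in 6502, here
is a rough Python sketch of the 4-4
decoding that this loop performs. It is
an illustration, not part of 0boot. The
names encode_44, decode_44, and
read_address_field are hypothetical, and
it assumes the standard 4-and-4 layout
(odd bits in the first nibble, even bits
in the second) and that the carry is set
going into the ROL.

```python
def encode_44(value):
    """Split one byte into two 4-and-4 disk nibbles (assumed standard layout)."""
    odd = ((value >> 1) | 0xAA) & 0xFF   # bits 7,5,3,1 shifted down, gaps set to 1
    even = (value | 0xAA) & 0xFF         # bits 6,4,2,0 in place, gaps set to 1
    return odd, even


def decode_44(first, second):
    """Rebuild the byte the way a ROL / AND pair would."""
    # ROL with carry set (assumption): shift left, bring a 1 into bit 0,
    # and let the old bit 7 fall off into the carry.
    rolled = ((first << 1) | 1) & 0xFF
    return rolled & second


def read_address_field(nibbles):
    """Decode three 4-and-4 pairs (volume, track, sector); keep only the last."""
    value = 0
    for i in range(0, 6, 2):
        value = decode_44(nibbles[i], nibbles[i + 1])
    return value


# Made-up sample field: volume $FE, track $11, physical sector $0D.
field = [*encode_44(0xFE), *encode_44(0x11), *encode_44(0x0D)]
print(hex(read_address_field(field)))  # 0xd
```

Like the loop in boot1, the sketch throws
away the volume and track values and
never looks at the address field
checksum.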
0029-   20 D2 00    JSR   $00D2
002C-   2A          ROL
002D-   85 AF       STA   $AF
002F-   20 D2 00    JSR   $00D2
0032-   25 AF       AND   $AF
0034-   88          DEY
0035-   10 F2       BPL   $0029

; store the physical sector number
; (will re-use later)
0037-   85 AF       STA   $AF

; use physical sector number as an
; index into the sector address array
0039-   A8          TAY

; get the target page (where we want to
; store this sector in memory)
003A-   B6 DE       LDX   $DE,Y

; store the target page in several
; places throughout the following code
003C-   86 9E       STX   $9E
003E-   CA          DEX
003F-   86 6E       STX   $6E
0041-   86 86       STX   $86
0043-   E8          INX

; This is an unconditional branch,
; because the ROL at $002C will always
; set the carry. We're done processing
; the address field, so we need to loop
; back and wait for the data prologue.
0044-   B0 CB       BCS   $0011

; execution continues here (from $0025)
; after matching the data prologue
0046-   E0 00       CPX   #$00

; If X is still $00, it means we found
; a data prologue before we found an
; address prologue. In that case, we
; have to skip this sector, because we
; don't know which sector it is and we
; wouldn't know where to put it. Sad!
0048-   F0 C7       BEQ   $0011

Nibble loop #1 reads nibbles $00..$55,
looks up the corresponding offset in the
preshift table at $0300, and stores that
offset in the temporary buffer at $036A.

; initialize rolling checksum to $00
004A-   85 58       STA   $58
004C-   AE D1 C0    LDX   $C0D1    o_O
004F-   10 FB       BPL   $004C

; The nibble value is in the X register
; now. The lowest possible nibble value
; is $96 and the highest is $FF. To
; look up the offset in the table at
; $0300, we need to subtract $96 from
; $0300 and add X.
0051-   BD 6A 02    LDA   $026A,X

; Now the accumulator has the offset
; into the table of individual 2-bit
; combinations ($0200..$02FF). Store
; that offset in the temporary buffer
; at $036A, in the order we read the
; nibbles. But the Y register started
; counting at $AA, so we need to
; subtract $AA from $036A and add Y.
0054-   99 C0 02    STA   $02C0,Y

; The EOR value is set at $004A
; each time through loop #1.
0057-   49 7F       EOR   #$7F     /!\
0059-   C8          INY
005A-   D0 EE       BNE   $004A

Here endeth nibble loop #1.

Nibble loop #2 reads nibbles $56..$AB,
combines them with bits 0-1 of the
appropriate nibble from the first $56,
and stores them in bytes $00..$55 of the
target page in memory.

005C-   A0 AA       LDY   #$AA
005E-   AE D1 C0    LDX   $C0D1    o_O
0061-   10 FB       BPL   $005E
0063-   5D 6A 02    EOR   $026A,X
0066-   BE C0 02    LDX   $02C0,Y
0069-   5D 02 02    EOR   $0202,X

; This address was set at $003F
; based on the target page (minus 1
; so we can add Y from $AA..$FF).
006C-   99 56 D1    STA   $D156,Y  /!\
006F-   C8          INY
0070-   D0 EC       BNE   $005E

Here endeth nibble loop #2.

Nibble loop #3 reads nibbles $AC..$101,
combines them with bits 2-3 of the
appropriate nibble from the first $56,
and stores them in bytes $56..$AB of the
target page in memory.

0072-   29 FC       AND   #$FC
0074-   A0 AA       LDY   #$AA
0076-   AE D1 C0    LDX   $C0D1    o_O
0079-   10 FB       BPL   $0076
007B-   5D 6A 02    EOR   $026A,X
007E-   BE C0 02    LDX   $02C0,Y
0081-   5D 01 02    EOR   $0201,X

; This address was set at $0041
; based on the target page (minus 1
; so we can add Y from $AA..$FF).
0084-   99 AC D1    STA   $D1AC,Y  /!\
0087-   C8          INY
0088-   D0 EC       BNE   $0076

Here endeth nibble loop #3.

Nibble loop #4 reads nibbles
$102..$155, combines them with bits 4-5
of the appropriate nibble from the first
$56, and stores them in bytes $AC..$FF
of the target page in memory.

008A-   29 FC       AND   #$FC
008C-   A2 AC       LDX   #$AC
008E-   AC D1 C0    LDY   $C0D1    o_O
0091-   10 FB       BPL   $008E
0093-   59 6A 02    EOR   $026A,Y
0096-   BC BE 02    LDY   $02BE,X
0099-   59 00 02    EOR   $0200,Y

; This address was set at $003C
; based on the target page.
009C-   9D 00 D1    STA   $D100,X  /!\
009F-   E8          INX
00A0-   D0 EC       BNE   $008E

Here endeth nibble loop #4.

; Finally, get the last nibble,
; which is the checksum of all
; the previous nibbles.
00A2-   29 FC       AND   #$FC
00A4-   AC D1 C0    LDY   $C0D1    o_O
00A7-   10 FB       BPL   $00A4
00A9-   59 6A 02    EOR   $026A,Y

; if checksum fails, start over
00AC-   D0 96       BNE   $0044

; This was set to the physical
; sector number (at $0037), so
; this is an index into the
; 16-byte array at $00DE.
00AE-   A0 00       LDY   #$00     /!\

; store $00 at this index in the sector
; array to indicate that we've read
; this sector
00B0-   96 DE       STX   $DE,Y

; are we done yet?
00B2-   E6 00       INC   $00

; nope, loop back to read more sectors
00B4-   D0 8E       BNE   $0044

; And that's all she read.
00B6-   60          RTS

0boot's track read routine is done when
$0000 hits $00, which is astonishingly
beautiful. Like, "now I know God" level
of beauty.

And so it goes: we pop another address
off the stack, move the drive arm, read
another track. Eventually we get to the
$0003 address we pushed to the stack in
boot0. That "returns" to $0004, which
looks like this:

; game-specific number of sectors to
; load from final track (subtracted
; from $100, so 5 sectors -- they'll go
; into $BB00..$BFFF in memory)
0004-   A2 FB       LDX   #$FB
0006-   86 00       STX   $00

That was hidden in plain sight this
entire time, inside the BIT instruction
at $0003. I told you we'd return to it.

After reading 5 sectors from the final
track, we hit the "RTS" at $00B6 again,
burn through the machine initialization
routines we pushed to the stack (PR#0,
IN#0), then pop off one last address and
continue at $00D8:

; turn off drive motor
00D8-   AD D1 C0    LDA   $C0D1    /!\

; jump to game-specific loader
00DB-   4C 00 0B    JMP   $0B00

And that's all she wrote^H^H^H^H^Hread.

Quod erat liberandum.

                   ~

               Changelog

2020-06-24

- typo in the 6-and-2 encoding diagram
  [thanks Andrew R.]

2016-10-16

- typos (thanks qkumba)

2016-10-10

- initial release

---------------------------------------
A 4am crack                     No. 872
------------------EOF------------------