[HN Gopher] Donkey Kong Country 2 and Open Bus
___________________________________________________________________
Donkey Kong Country 2 and Open Bus
Author : colejohnson66
Score : 175 points
Date : 2025-06-30 15:01 UTC (7 hours ago)
(HTM) web link (jsgroth.dev)
(TXT) w3m dump (jsgroth.dev)
| mock-possum wrote:
| Love stuff like this, I feel like I'm only ever 60% following the
| assembly code, so the prose explanation alongside really helps -
| and it's fun to hear these 'bugs that nobody understood or
| possibly even noticed until now in a classic piece of software'
| stories!
| shadowgovt wrote:
| One of the things I love about this era of systems is that
| there were none of the modern checks that we consider table-
| stakes in nearly everything, including most embedded systems
| (necessary in anything that can be hooked up to a network, and
| still so cheap that it's included as a nice-to-have in
| completely isolated embedded architectures).
|
| Lots of reads and writes in the original NES just toggled
| voltages on a line somewhere, and then what happened, happened.
| You got the effect you wanted by toggling those voltages in a
| very controlled manner lock-stepped with the signal indicating
| the behavior of the CRT blanking intervals. Some animations in
| Super Mario Bros 3 involved toggling a RAM mux to select from
| multiple banks of sprite data so that when the graphics
| hardware went to pull sprites, it'd pull them from an entirely
| different chip with slight variations in their look. And since
| the TV timing mattered, they had to release different software
| for regions with NTSC and PAL TVs since those TVs operate with
| different refresh rates and refresh rate was the clock that
| drove the render logic.
|
| It was a wild time.
| deater wrote:
| I have to say as a 6502 assembly programmer I have wasted many
| hours of my life tracking down the same issue in my code
| (forgetting to put an # in front of an immediate value and thus
| accidentally doing a memory access instead). Often it's like this
| case too where things might accidentally work some of the time.
|
| Worse than the floating-bus in this example is when it depends on
| uninitialized RAM which is often consistent based on DRAM so the
| code will always work on your machine/emulator but won't on
| someone else's machine with different DRAM chips (invariably you
| catch this at a demoparty when it won't run on the party machine
| and you only have 15 minutes to fix it before your demo is about
| to be presented)
| anonymousiam wrote:
| Was there ever an architecture that used dynamic memory with a
| 6502 CPU? In my (limited?) experience, that platform always had
| static RAM.
| deater wrote:
| I think you'll find more systems used DRAM than SRAM.
|
| The Apple II was one of the first 6502 systems to use DRAM
| (in 1977) and Woz was incredibly clever in getting the
| refresh for free as a side effect of the video generation
| retrac wrote:
| Most of them. Static RAM was (and still is) more expensive
| since it needs more transistors and chip area per bit stored.
| It is, however, also much easier to interface since it
| doesn't need refresh circuitry. This is why you see it in the
| earliest designs, and also why you see it in so many hobbyist
| designs. It's also why you tend to see it in the video
| systems even if the rest of the machine uses DRAM. Dealing
| with DRAM refresh while reading out the whole memory chip
| sequentially (while also having a second port to read/write
| from the CPU!) starts making things very complicated.
|
| But still DRAM is what you would use for a "real" system.
| Wozniak's design for the Apple II used a clever hack where
| the system actually runs at 2 MHz with an effective CPU rate
| of 1 MHz. Any read from a DRAM row will refresh the entire
| row. Approximately every other cycle the video system steps
| incrementally through memory, refreshing as it goes.
| rzzzt wrote:
| Same with the VIC-II and the 6510 in the Commodore 64. The
| video chip is given the main character role for the bus,
| stopping the CPU from moving forward if it needs cycles for
| video generation or DRAM refresh.
| wk_end wrote:
| Well, the SNES - if that counts, it's a 65816 - uses DRAM.
| This is especially noteworthy because the DRAM refresh is
| actually visible on-screen on some units:
|
| https://www.retrorgb.com/snesverticalline.html
| adrian_b wrote:
| There must have been computers with 6502 and DRAM.
|
| For higher memory capacities, e.g. 32 kB, 48 kB or 64 kB,
| static RAM would have been too expensive and too big, even if
| 6502 did not have an integrated DRAM controller, like Zilog
| Z80.
|
| Using SRAM instead of DRAM meant using 4 times more IC
| packages, e.g., 32 packages instead of 8. The additional DRAM
| controller required by DRAM would have needed only 1 to 4
| additional IC packages. Frequently the display controller
| could be used to also ensure DRAM refresh.
| Braxton1980 wrote:
| Are you thinking of SDRAM (a type of DRAM)?
| anonymousiam wrote:
| I appreciate all of the responses. I did development on a
| KIM-1 and I owned a SYM-1. Both of these used static RAM. I
| expanded the RAM in my SYM-1 from 4K to 8K (with eight 2114
| static RAM chips). I never owned any other 6502 based
| computers.
| RiverCrochet wrote:
| 6502 was my first assembly language, and I always thought of
| instructions like "LDA #2" as "load A with the number 2" versus
| LDA 2 (load A with what's in memory location 2).
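|
| A minimal sketch of that distinction, written as a toy Python model
| rather than real 6502 code. The memory contents and the helper names
| lda_immediate and lda_memory are made up for illustration.
|
```python
# Toy model of two 6502 operand forms (not a real emulator).
memory = {0x0002: 0x7F}  # hypothetical contents of address $0002


def lda_immediate(operand):
    """LDA #n: the operand itself is loaded into A."""
    return operand & 0xFF


def lda_memory(operand, mem):
    """LDA n: the operand is an address; A gets the byte stored there."""
    return mem.get(operand, 0x00) & 0xFF


a_imm = lda_immediate(2)       # LDA #2
a_mem = lda_memory(2, memory)  # LDA 2
print(f"LDA #2 -> A = ${a_imm:02X}")
print(f"LDA 2  -> A = ${a_mem:02X}")
```
|
| The two forms differ by one character in the source, but the second
| result depends entirely on whatever happens to be stored at $0002.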
| bartread wrote:
| This is the kind of situation where feeding your code through
| an LLM can actually be helpful: they're really good at spotting
| the kind of errors/typos like this that have a profound impact
| but which our eyes tend to all too easily scan over/past.
| nancyminusone wrote:
| The last time I tried an LLM on assembly, it made up
| instructions that didn't exist.
| cdelsolar wrote:
| cool; nowadays LLMs are better
| iforgotpassword wrote:
| Today I used chatgpt for winapi stuff - it made up
| structs and enums regarding display config. So not too
| convinced it'll be any good with 6502 asm.
| recursive wrote:
| cool; but not better enough
| jihadjihad wrote:
| I know it's OT, but I have to say, for a 30-year-old video game,
| it's remarkable how well DK Country 2 holds up today. I've been
| playing it with an emulator and the graphics, sound, level
| design, and controls are all masterful. The kids can keep
| Fortnite, I'll take DKC and Chrono Trigger any day!
| christophilus wrote:
| Chrono Trigger holds up. That game is a masterpiece.
| aidenn0 wrote:
| Most Rare-developed games from that era were really well done.
| pipes wrote:
| Whenever I'm playing a game via emulation and I get stuck, I do
| end up wondering if it's a bug in the emulator. This particular
| issue, I would have assumed the game was designed this way and
| is just difficult.
|
| Not quite related, but I get a similar feeling if the game seems
| really tough: "is this due to emulation latency". I went down a
| rabbit hole on this one and built myself a MiSTer FPGA!
| bnjms wrote:
| Chronic Trigger had one like this. I recall there is a section
| where you catch a rat and have to input four simultaneous key
| inputs after catching the rat. But usb inputs only forwarded 3
| at a time so to get past this you'd mash all four and
| eventually you'd get them registered inside the very short
| timeframe. Took many tries and was very frustrating.
| Y_Y wrote:
| > _Chronic Trigger_
|
| Sometimes it does feel that way...
| aidenn0 wrote:
| I played a lot of bionic commando as a kid. When I loaded it up
| in an emulator in the early 2000's it was _way_ harder than I
| remembered. Then I realized there was an emulation bug where the
| enemies didn't disappear when you blew up the base, but Ladd
| still froze; that meant I needed roughly 2 extra life points
| when clearing a level. Just to see if I could, I did beat it
| that way once, but never again.
| pipes wrote:
| Why did I get downvoted for this?
| bigstrat2003 wrote:
| I only ever played DKC on ZSNES, and I had no idea that this
| was an emulator bug until reading the article. Like you said, I
| just assumed that it was the intended game design to time your
| launch from the barrel so that it was the correct angle. It
| blew my mind to learn that it was a bug!
| nicetryguy wrote:
| I don't always make 6502(ish) errors, but when i do, it's usually
| the memory address instead of the immediate! It's a very common
| and easy mistake to make, and i believe Chuck Peddle himself
| deeply regretted the (number symbol, pound sign, hashtag) #$1234
| syntax for immediate values. I made # appear bright red in my
| IDE, it helps, a bit... Even the ASM gods at Rare fell victim to
| the same issue!
| JoshTriplett wrote:
| I ran into a similar issue a long time ago, with the GNU
| assembler in "intel_syntax noprefix" mode. It has an issue
| where there's syntactic ambiguity that makes it possible to
| interpret a forward-referenced named constant _immediate_ as a
| reference to an unknown _symbol_ , if in an instruction that
| could accept either an immediate or a memory address. The net
| result is assembling the instruction to have a placeholder
| memory address (expected to be filled in by the relocated
| address of the symbol when linked) rather than the expected
| immediate. Painful to debug.
| anonymousiam wrote:
| I started reading this to understand Open Bus, which was
| capitalized in the title, so I assumed it was a proper name for
| some old bus protocol/standard that I'd never heard of.
|
| After reading, I realized that he just meant that the bus was
| "open" as in not connected to anything, because the address line
| decoders had no memory devices enabled at the specified address
| ($2000).
|
| It's pretty funny that the omission of the immediate mode (#)
| went unnoticed until the obsolete emulator didn't behave in the
| same way as the real hardware when reading <nothing> from memory.
|
| His solution of changing the instruction to use immediate
| addressing mode (instead of absolute) would have the consequence
| of faster execution time, because the code is no longer executing
| a read from memory. It's probably now faster by about 2us through
| that blob of code, but maybe this only matters on bare metal and
| not the emulator, which is probably not time-perfect anyway.
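|
| A rough back-of-the-envelope sketch of that saving. It assumes an
| NTSC console, that the load is 16-bit so the absolute form does two
| data reads the immediate form skips, and that each access costs 6, 8
| or 12 master clocks depending on the region. Which speed applies to
| $2000 is left open here; the names are hypothetical.
|
```python
MASTER_CLOCK_HZ = 21_477_272  # NTSC SNES master clock, about 21.477 MHz
EXTRA_READS = 2               # assumption: two data reads are skipped


def saved_microseconds(master_clocks_per_access, reads=EXTRA_READS):
    """Time saved by dropping `reads` bus accesses of the given length."""
    return reads * master_clocks_per_access / MASTER_CLOCK_HZ * 1e6


# Try each of the three access speeds the CPU can use.
savings = {n: saved_microseconds(n) for n in (6, 8, 12)}
for n, us in savings.items():
    print(f"{n:2d} master clocks per access: {us:.2f} us saved")
```
|
| Under these assumptions the saving lands between roughly 0.56 and
| 1.12 microseconds, the same order of magnitude as the estimate above.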
| wk_end wrote:
| > It's probably now faster by about 2us through that blob of
| code, but maybe this only matters on bare metal and not the
| emulator, which is probably not time-perfect anyway.
|
| (Some) SNES emulators really are basically time-perfect, at
| this point [0]. But 2us isn't going to make an appreciable
| difference in anything but exceptional cases.
|
| [0] https://arstechnica.com/gaming/2021/06/how-snes-emulators-
| go...
| BearOso wrote:
| There's actually some issues with clock drift, and
| speculation whether or not original units had an accurate
| crystal or varied significantly in timing. The only way to
| figure that out is to go back and ask the designers what the
| original spec was, and who knows if they remember. So they're
| not really time-perfect, because the clock speeds can vary as
| much as a half-percent.
| NobodyNada wrote:
| It's mostly the audio clock that is susceptible to drift.
| Everything except the audio subsystem is derived from a
| single master clock, so even if the master clock varies in
| frequency slightly, all the non-audio components will
| remain in sync with each other.
|
| That means the 2 clock cycles could theoretically make an
| observable difference if they cause the CPU to miss a frame
| deadline and cause the game to take a lag frame. But this
| is rather unlikely.
| BearOso wrote:
| The CPU has shown some variation, but yes, it's the APU
| that has a ceramic clock source that isn't even close to
| the same among units. Apparently those ceramic resonators
| have a pretty high variation, even when new.
|
| When byuu/near tried to find a middle-ground for the APU
| clock, the average turned out to be about 1025296
| (32040.5 * 32). Some people have tested units recently
| and gotten an even higher average. They speculate that
| aging is causing the frequency to increase, but I don't
| really know if this is the case or if there really was
| that much of a discrepancy originally.
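|
| The arithmetic behind that figure, as a small sketch. The 32,000 Hz
| nominal sample rate is an assumption added here for comparison; the
| 32040.5 average and the factor of 32 come from the comment above.
|
```python
NOMINAL_SAMPLE_RATE_HZ = 32_000     # assumption: nominal output sample rate
MEASURED_SAMPLE_RATE_HZ = 32_040.5  # average quoted above
CLOCKS_PER_SAMPLE = 32

nominal_clock = NOMINAL_SAMPLE_RATE_HZ * CLOCKS_PER_SAMPLE
measured_clock = MEASURED_SAMPLE_RATE_HZ * CLOCKS_PER_SAMPLE
drift_percent = (measured_clock - nominal_clock) / nominal_clock * 100

print(f"measured clock: {measured_clock:.0f} Hz")
print(f"offset from nominal: {drift_percent:.3f}%")
```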
|
| It does cause some significant compatibility issues, too,
| like with attraction mode desyncs and random freezes.
| shadowgovt wrote:
| In general, even SNES games are still doing frame-locking,
| right? i.e. if you save 2us you're just lengthening the
| amount of time the code is going to wait for a blanking
| signal by 2us.
| wk_end wrote:
| Yeah, exactly. It'd have to be really exceptional cases.
| For example, exactly one game (Air Strike Patrol) has timed
| writes to certain video registers to create a shadow
| effect, but 2us is so minor I don't think it'd appreciably
| affect even that. Or, like, the SNES has an asynchronous
| multiplier/divider that returns invalid results while the
| computation is on-going, so if you optimized some code you
| might end up reading back garbage.
|
| IIRC ZSNES actually had basically no timing; all
| instructions ran for effectively one cycle. ZSNES wasn't an
| accurate emulator, but it mostly worked for most games most
| of the time.
| shadowgovt wrote:
| Rare has a history of video games that work in testing and have
| bugs buried in them for years until some novel architecture
| surfaces them. Not to imply other companies _don't;_ just that
| Rare is an easy-to-reference name on the topic.
|
| Donkey Kong 64 has a memory leak that will kill the game after
| a (for that era) unlikely amount of contiguous time playing it
| (8-9 hours, if I understand correctly). That was not caught in
| development but is a trivial amount of time to rack up if
| someone is playing the game and saving progress via emulator
| save-state instead of the in-game save feature.
|
| (Note: there is some ambiguous history here. Some sources claim
| the game shipping with the Memory Pak was a last-ditch effort
| to hide the bug by pushing the crash window out to 13-20 hours
| instead of 8-9. I think recent research on the issue suggests
| that was coincidence and the game didn't ship with either Rare
| or Nintendo being aware of the bug).
| jordigh wrote:
| Donkey Kong 64 running for 11 hours just fine:
|
| https://www.youtube.com/watch?v=HWUg_iM7yIg
| helf wrote:
| I love this sort of content! My favorite things I find on HN :D
| NobodyNada wrote:
| Open bus quite literally means that the data bus lines are an
| open circuit -- the CPU has placed an unmapped or write-only
| address on the address bus, and no hardware on the bus has
| responded, so the bus lines aren't being driven and are just
| floating. Thus, nominally, this is a case of undefined behavior
| at the hardware level.
|
| In order to understand what _actually_ happens, we need to look a
| little closer at the physical structure of a data bus -- you have
| long conductors carrying the signals around the motherboard and
| to the cartridge, separated from the ground plane by a thin layer
| of insulating substrate. This looks a lot like a capacitor, and
| in fact this is described and modeled as "parasitic capacitance"
| by engineers who try to minimize it, since this effect limits the
| maximum speed of data transmission over the bus. But this effect
| means that, whenever the bus is not being driven, it tends to
| stay at whatever voltage it was last driving to -- just like a
| little DRAM cell, producing the "open-bus reads return the last
| value transferred across the bus" effect described in the
| article.
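|
| The "last value sticks around" effect can be sketched with a toy bus
| model. This is a simplification, not an emulator: the Bus class, the
| ROM address $8000 and the memory map are hypothetical, and the real
| electrical behavior is analog rather than a clean latch.
|
```python
class Bus:
    """Toy data bus that keeps the last driven byte when nothing responds."""

    def __init__(self, mapped):
        self.mapped = mapped    # address -> byte for devices that respond
        self.last_value = 0x00  # charge left on the bus lines

    def read(self, address):
        if address in self.mapped:
            self.last_value = self.mapped[address]
        # Unmapped address: nobody drives the bus, so the old value is seen.
        return self.last_value


# Hypothetical program bytes: LDA $2000 (opcode $AD, operand $00 $20).
rom = {0x8000: 0xAD, 0x8001: 0x00, 0x8002: 0x20}
bus = Bus(rom)

opcode = bus.read(0x8000)
target = bus.read(0x8001) | (bus.read(0x8002) << 8)
low = bus.read(target)       # unmapped: returns the last byte fetched
high = bus.read(target + 1)  # still unmapped, same stale byte
value = low | (high << 8)
print(f"LDA ${target:04X} read ${value:04X} from open bus")
```
|
| In this model the last byte fetched is the high byte of the operand,
| so a 16-bit read of an unmapped $2000 sees that byte twice.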
|
| It's not uncommon for games to accidentally rely on open-bus
| effects, like DKC2. On the NES, the serial port registers for
| connecting to a controller only drive the low-order bits and the
| high bits are open-bus; there are a few games that read the
| controller input with the instruction LDA $4016 and expect to see
| the value $40 or $41 (with the 4 sticking around because of open-
| bus).
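|
| A sketch of why such a read yields $40 or $41. It assumes the top
| three bits are undriven and models only bit 0 of the controller data;
| OPEN_BUS_MASK and read_controller_port are hypothetical names, and
| the open-bus byte is taken to be the high byte of the operand address.
|
```python
OPEN_BUS_MASK = 0xE0  # assumption: bits 7-5 are not driven by the port


def read_controller_port(address, button_bit):
    """Combine the driven controller bit with stale open-bus bits."""
    # Last byte on the bus before the read: high byte of the operand
    # address (e.g. $40 for LDA $4016) in this simplified model.
    open_bus = (address >> 8) & 0xFF
    driven = button_bit & 0x01  # only bit 0 is modelled here
    return (open_bus & OPEN_BUS_MASK) | driven


released = read_controller_port(0x4016, 0)
pressed = read_controller_port(0x4016, 1)
print(f"released: ${released:02X}, pressed: ${pressed:02X}")
```
|
| Code that masks off bit 0 is unaffected by the stale bits; code that
| compares the whole byte only works if those bits are reproduced.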
|
| There's also speedrun strategies that rely on open-bus behavior
| as part of memory-corruption or arbitrary-code-execution
| exploits, such as the Super Mario World credits warp, which sends
| the program counter on a trip through unmapped memory before
| eventually landing in RAM and executing a payload crafted by
| carefully manipulating enemy positions [1].
|
| But there's some exceptions to the usual predictable open bus
| behavior. Nonstandard cartridges could return a default value for
| unmapped memory, or include pull-up or pull-down resistors that
| impact the behavior of open bus. There's also an interesting
| interaction with DMA; the SNES supports a feature called HDMA
| which allows applications to schedule DMA transfers to transfer
| data from the CPU to the graphics hardware with precise timing in
| order to upload data or change settings mid-frame [2]. This DMA
| transfer temporarily pauses the CPU in order to use the bus to
| perform the transfer, which can change the behavior of an open-
| bus read if a DMA transfer happens to occur in the middle of an
| instruction (between reading the target address & performing the
| actual open-bus read).
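|
| The mid-instruction case can be sketched by letting a DMA read land
| between the operand fetch and the open-bus read. Everything here is a
| toy model with hypothetical addresses and data; real HDMA timing is
| far more involved.
|
```python
class Bus:
    """Toy data bus that keeps the last driven byte when nothing responds."""

    def __init__(self, mapped):
        self.mapped = mapped
        self.last_value = 0x00

    def read(self, address):
        if address in self.mapped:
            self.last_value = self.mapped[address]
        return self.last_value


def load_from_open_bus(bus, pc, dma_source=None):
    """Run a 3-byte absolute load whose target address is unmapped."""
    bus.read(pc)  # opcode fetch
    target = bus.read(pc + 1) | (bus.read(pc + 2) << 8)
    if dma_source is not None:
        bus.read(dma_source)  # DMA steals the bus before the data read
    return bus.read(target)


# Hypothetical bytes: a load from $0010, plus a DMA source byte at $9000.
mapped = {0x8000: 0xAD, 0x8001: 0x10, 0x8002: 0x00, 0x9000: 0x7E}

without_dma = load_from_open_bus(Bus(mapped), 0x8000)
with_dma = load_from_open_bus(Bus(mapped), 0x8000, dma_source=0x9000)
print(f"no DMA: ${without_dma:02X}, DMA mid-instruction: ${with_dma:02X}")
```
|
| Without the DMA the read sees the operand's final zero byte; with it,
| the read sees whatever the transfer last placed on the bus.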
|
| This very niche edge case has a significant impact on a Super
| Metroid speedrun exploit [3] which causes an out-of-bounds
| memcpy, which attempts to transfer a large block of data from
| open-bus to RAM. The open-bus read almost always returns zero
| (because the last byte of the relevant load instruction is zero),
| but when performed in certain rooms with HDMA-heavy graphical
| effects, there's a good chance that a DMA transfer will affect
| one of the reads, causing a non-zero byte to sneak in somewhere
| important and causing the exploit to crash instead of working
| normally. This has created a mild controversy in the community,
| where some routes and strategies are only reliable on emulators
| and nonstandard firmwares; a player using original hardware or a
| very accurate emulator has a high chance of experiencing a crash,
| whereas most emulators (including all of Nintendo's official re-
| releases of the game) do not emulate this niche edge case of a
| mid-instruction HDMA transfer changing the value of an open-bus
| read.
|
| Also, the current fastest TAS completion of Super Metroid [4]
| relies on this HDMA interaction. We found a crash that attempted
| to execute open bus, but wasn't normally controllable in a useful
| way; by manipulating enemies in the room to influence CPU timing,
| we were able to use HDMA to put useful instructions on the bus at
| the right timing, eventually getting the console to execute
| controller inputs as code and achieve full arbitrary code
| execution.
|
| [1]: https://youtu.be/vAHXK2wut_I
|
| [2]: https://youtu.be/K7gWmdgXPgk
|
| [3]: https://youtu.be/CnThmKhtfOs
|
| [4]: https://tasvideos.org/8214S
| russellbeattie wrote:
| > _... we need to look a little closer at the physical
| structure of a data bus_
|
| Once again, I have to give a shout out to Ben Eater, whose
| video series on making a breadboard computer with the 6502 is
| why I actually understand what the article is about and what
| you're referring to when describing the hardware issues.
| (Obviously, extrapolating from his basic bus example to a
| commercial machine.) I'd be pretty clueless otherwise.
|
| https://eater.net
| Dwedit wrote:
| I once encountered SNES Puyo Puyo doing PPU open bus. This was
| when I was working on the RunAhead feature for RetroArch, and was
| checking when savestates failed to match. CPU execution trace
| logs didn't match because a value read out of PPU Open Bus didn't
| match after loading state.
| jgalt212 wrote:
| DKC 1 with the SGI prerendered 3d graphics was cutting edge
| stuff. Vector Man on the Genesis did something similar to less
| acclaim.
___________________________________________________________________
(page generated 2025-06-30 23:00 UTC)