[HN Gopher] Donkey Kong Country 2 and Open Bus
       ___________________________________________________________________
        
       Donkey Kong Country 2 and Open Bus
        
       Author : colejohnson66
       Score  : 175 points
       Date   : 2025-06-30 15:01 UTC (7 hours ago)
        
 (HTM) web link (jsgroth.dev)
        
       | mock-possum wrote:
       | Love stuff like this, I feel like I'm only ever 60% following the
       | assembly code, so the prose explanation alongside really helps -
       | and it's fun to hear these 'bugs that nobody understood or
       | possibly even noticed until now in a classic piece of software'
       | stories!
        
         | shadowgovt wrote:
         | One of the things I love about this era of systems is that
         | there were none of the modern checks that we consider table-
         | stakes in nearly everything, including most embedded systems
         | (necessary in anything that can be hooked up to a network, and
         | still so cheap that it's included as a nice-to-have in
         | completely isolated embedded architectures).
         | 
         | Lots of reads and writes in the original NES just toggled
         | voltages on a line somewhere, and then what happened, happened.
         | You got the effect you wanted by toggling those voltages in a
         | very controlled manner lock-stepped with the signal indicating
         | the behavior of the CRT blanking intervals. Some animations in
         | Super Mario Bros 3 involved toggling a RAM mux to select from
         | multiple banks of sprite data so that when the graphics
         | hardware went to pull sprites, it'd pull them from an entirely
         | different chip with slight variations in their look. And since
         | the TV timing mattered, they had to release different software
         | for regions with NTSC and PAL TVs since those TVs operate with
         | different refresh rates and refresh rate was the clock that
         | drove the render logic.
         | 
         | It was a wild time.
        
       | deater wrote:
       | I have to say as a 6502 assembly programmer I have wasted many
       | hours of my life tracking down the same issue in my code
       | (forgetting to put an # in front of an immediate value and thus
       | accidentally doing a memory access instead). Often it's like this
       | case too where things might accidentally work some of the time.
       | 
       | Worse than the floating-bus in this example is when it depends on
       | uninitialized RAM which is often consistent based on DRAM so the
       | code will always work on your machine/emulator but won't on
       | someone else's machine with different DRAM chips (invariably you
       | catch this at a demoparty when it won't run on the party machine
       | and you only have 15 minutes to fix it before your demo is about
       | to be presented)
        
         | anonymousiam wrote:
         | Was there ever an architecture that used dynamic memory with a
         | 6502 CPU? In my (limited?) experience, that platform always had
         | static RAM.
        
           | deater wrote:
           | I think you'll find more systems used DRAM than SRAM.
           | 
           | The Apple II was one of the first 6502 systems to use DRAM
           | (in 1977) and Woz was incredibly clever in getting the
           | refresh for free as a side effect of the video generation
        
           | retrac wrote:
           | Most of them. Static RAM was (and still is) more expensive
           | since it needs more transistors and chip area per bit stored.
           | It is, however, also much easier to interface since it
           | doesn't need refresh circuitry. This is why you see it in the
           | earliest designs, and also why you see it in so many hobbyist
           | designs. It's also why you tend to see it in the video
           | systems even if the rest of the machine uses DRAM. Dealing
           | with DRAM refresh while reading out the whole memory chip
           | sequentially (while also having a second port to read/write
           | from the CPU!) starts making things very complicated.
           | 
           | But still DRAM is what you would use for a "real" system.
           | Wozniak's design for the Apple II used a clever hack where
           | the system actually runs at 2 MHz with an effective CPU rate
           | of 1 MHz. Any read from a DRAM row will refresh the entire
           | row. Approximately every other cycle the video system steps
           | incrementally through memory, refreshing as it goes.
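A rough sketch of the "refresh for free" idea described above, not the Apple II's actual circuit. The row count, the address-to-row mapping, and the retention limit are all made-up values for illustration. The only point is that a sequential video fetch on alternating cycles touches every row often enough, whatever the CPU does.

```python
ROWS = 128                 # hypothetical number of DRAM rows
MAX_AGE = 4 * ROWS         # hypothetical retention limit, in cycles

def row_of(address: int) -> int:
    # Assumed mapping: the low address bits select the row.
    return address % ROWS

def simulate(cycles: int) -> int:
    """Return the worst row age seen when the CPU never helps with refresh."""
    last_refresh = [0] * ROWS
    video_address = 0
    worst_age = 0
    for cycle in range(cycles):
        if cycle % 2 == 0:
            # Video half: a sequential fetch refreshes the row it touches.
            last_refresh[row_of(video_address)] = cycle
            video_address += 1
        else:
            # CPU half: worst case, the CPU keeps hitting the same row.
            last_refresh[row_of(0)] = cycle
        worst_age = max(worst_age, max(cycle - t for t in last_refresh))
    return worst_age

worst = simulate(1000)
```

In this toy model no row ever goes longer than `2 * ROWS - 1` cycles without a refresh, so no separate refresh logic is needed.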
        
             | rzzzt wrote:
             | Same with the VIC-II and the 6510 in the Commodore 64. The
             | video chip is given the main character role for the bus,
             | stopping the CPU from moving forward if it needs cycles for
             | video generation or DRAM refresh.
        
           | wk_end wrote:
           | Well, the SNES - if that counts, it's a 65816 - uses DRAM.
           | This is especially noteworthy because the DRAM refresh is
           | actually visible on-screen on some units:
           | 
           | https://www.retrorgb.com/snesverticalline.html
        
           | adrian_b wrote:
           | There must have been computers with 6502 and DRAM.
           | 
           | For higher memory capacities, e.g. 32 kB, 48 kB or 64 kB,
           | static RAM would have been too expensive and too big, even if
           | 6502 did not have an integrated DRAM controller, like Zilog
           | Z80.
           | 
           | Using SRAM instead of DRAM meant using 4 times more IC
           | packages, e.g, 32 packages instead of 8. The additional DRAM
           | controller required by DRAM would have needed only 1 to 4
           | additional IC packages. Frequently the display controller
           | could be used to also ensure DRAM refresh.
        
           | Braxton1980 wrote:
           | Are you thinking of SDRAM (a type of DRAM)?
        
             | anonymousiam wrote:
             | I appreciate all of the responses. I did development on a
             | KIM-1 and I owned a SYM-1. Both of these used static RAM. I
             | expanded the RAM in my SYM-1 from 4K to 8K (with eight 2114
             | static RAM chips). I never owned any other 6502 based
             | computers.
        
         | RiverCrochet wrote:
         | 6502 was my first assembly language, and I always thought of
         | instructions like "LDA #2" as "load A with the number 2" versus
         | LDA 2 (load A with what's in memory location 2).
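A toy illustration of the two readings, in Python rather than assembly. The memory contents and the function names are invented for the example, and this is not a real 6502 emulator.

```python
# Toy model of the two addressing modes.
memory = bytearray(65536)   # 64 KiB address space, all zeros
memory[2] = 0x7F            # arbitrary value stored at address $0002

def lda_immediate(operand: int) -> int:
    """LDA #operand: the operand itself is loaded into A."""
    return operand & 0xFF

def lda_absolute(address: int) -> int:
    """LDA address: A is loaded with the byte stored at that address."""
    return memory[address & 0xFFFF]

a_with_hash = lda_immediate(2)     # LDA #2 -> A = 2
a_without_hash = lda_absolute(2)   # LDA 2  -> A = memory[2] = 0x7F
```

Dropping the `#` still assembles fine. The result just depends on whatever happens to be at that address.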
        
         | bartread wrote:
         | This is the kind of situation where feeding your code through
         | an LLM can actually be helpful: they're really good at spotting
         | the kind of errors/typos like this that have a profound impact
         | but which our eyes tend to all too easily scan over/past.
        
           | nancyminusone wrote:
           | The last time I tried an LLM on assembly, it made up
           | instructions that didn't exist.
        
             | cdelsolar wrote:
             | cool; nowadays LLMs are better
        
               | iforgotpassword wrote:
               | Today I used chatgpt for winapi stuff - it made up
               | structs and enums regarding display config. So not too
               | convinced it'll be any good with 6502 asm.
        
               | recursive wrote:
               | cool; but not better enough
        
       | jihadjihad wrote:
       | I know it's OT, but I have to say, for a 30-year-old video game,
       | it's remarkable how well DK Country 2 holds up today. I've been
       | playing it with an emulator and the graphics, sound, level
       | design, and controls are all masterful. The kids can keep
       | Fortnite, I'll take DKC and Chrono Trigger any day!
        
         | christophilus wrote:
         | Chrono Trigger holds up. That game is a masterpiece.
        
         | aidenn0 wrote:
         | Most Rare-developed games from that era were really well done.
        
       | pipes wrote:
       | Whenever I'm playing a game via emulation and I get stuck, I do
       | end up wondering if it's a bug in the emulator. This particular
       | issue, I would have assumed the game was designed this way and
       | is just difficult.
       | 
       | Not quite related, but i get a similar feeling if the game seems
       | really tough: "is this due to emulation latency". I went down a
       | rabbit hole on this one and built myself a MiSTer FPGA!
        
         | bnjms wrote:
         | Chronic Trigger had one like this. I recall there is a section
         | where you catch a rat and have to input four simultaneous key
         | inputs after catching the rat. But usb inputs only forwarded 3
         | at a time so to get past this you'd mash all four and
         | eventually you'd get them registered inside the very short
         | timeframe. Took many tries and was very frustrating.
        
           | Y_Y wrote:
           | > _Chronic Trigger_
           | 
           | Sometimes it does feel that way...
        
         | aidenn0 wrote:
         | I played a lot of bionic commando as a kid. When I loaded it up
         | in an emulator in the early 2000's it was _way_ harder than I
         | remembered. Then I realized there was an emulation bug where the
         | enemies didn't disappear when you blew up the base, but Ladd
         | still froze; that meant I needed roughly 2 extra life points
         | when clearing a level. Just to see if I could, I did beat it
         | that way once, but never again.
        
         | pipes wrote:
         | Why did I get downvoted for this?
        
         | bigstrat2003 wrote:
         | I only ever played DKC on ZSNES, and I had no idea that this
         | was an emulator bug until reading the article. Like you said, I
         | just assumed that it was the intended game design to time your
         | launch from the barrel so that it was the correct angle. It
         | blew my mind to learn that it was a bug!
        
       | nicetryguy wrote:
       | I don't always make 6502(ish) errors, but when i do, it's usually
       | the memory address instead of the immediate! It's a very common
       | and easy mistake to make, and i believe Chuck Peddle himself
       | deeply regretted the (number symbol, pound sign, hashtag) #$1234
       | syntax for immediate values. I made # appear bright red in my
       | IDE, it helps, a bit... Even the ASM gods at Rare fell victim to
       | the same issue!
        
         | JoshTriplett wrote:
         | I ran into a similar issue a long time ago, with the GNU
         | assembler in "intel_syntax noprefix" mode. It has an issue
         | where there's syntactic ambiguity that makes it possible to
         | interpret a forward-referenced named constant _immediate_ as a
         | reference to an unknown _symbol_ , if in an instruction that
         | could accept either an immediate or a memory address. The net
         | result is assembling the instruction to have a placeholder
         | memory address (expected to be filled in by the relocated
         | address of the symbol when linked) rather than the expected
         | immediate. Painful to debug.
        
       | anonymousiam wrote:
       | I started reading this to understand Open Bus, which was
       | capitalized in the title, so I assumed it was a proper name for
       | some old bus protocol/standard that I'd never heard of.
       | 
       | After reading, I realized that he just meant that the bus was
       | "open" as in not connected to anything, because the address line
       | decoders had no memory devices enabled at the specified address
       | ($2000).
       | 
       | It's pretty funny that the omission of the immediate mode (#)
       | went unnoticed until the obsolete emulator didn't behave in the
       | same way as the real hardware when reading <nothing> from memory.
       | 
       | His solution of changing the instruction to use immediate
       | addressing mode (instead of absolute) would have the consequence
       | of faster execution time, because the code is no longer executing
       | a read from memory. It's probably now faster by about 2us through
       | that blob of code, but maybe this only matters on bare metal and
       | not the emulator, which is probably not time-perfect anyway.
        
         | wk_end wrote:
         | > It's probably now faster by about 2us through that blob of
         | code, but maybe this only matters on bare metal and not the
         | emulator, which is probably not time-perfect anyway.
         | 
         | (Some) SNES emulators really are basically time-perfect, at
         | this point [0]. But 2us isn't going to make an appreciable
         | difference in anything but exceptional cases.
         | 
         | [0] https://arstechnica.com/gaming/2021/06/how-snes-emulators-
         | go...
        
           | BearOso wrote:
           | There's actually some issues with clock drift, and
           | speculation whether or not original units had an accurate
           | crystal or varied significantly in timing. The only way to
           | figure that out is to go back and ask the designers what the
           | original spec was, and who knows if they remember. So they're
           | not really time-perfect, because the clock speeds can vary as
           | much as a half-percent.
        
             | NobodyNada wrote:
             | It's mostly the audio clock that is susceptible to drift.
             | Everything except the audio subsystem is derived from a
             | single master clock, so even if the master clock varies in
             | frequency slightly, all the non-audio components will
             | remain in sync with each other.
             | 
             | That means the 2 clock cycles could theoretically make an
             | observable difference if they cause the CPU to miss a frame
             | deadline and cause the game to take a lag frame. But this
             | is rather unlikely.
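For scale, here is a back-of-the-envelope conversion of those 2 cycles into microseconds. It assumes the commonly cited NTSC master clock of about 21.477 MHz and CPU cycles lasting 6, 8, or 12 master clocks depending on the region accessed. Neither figure comes from this thread, so treat both as assumptions.

```python
MASTER_CLOCK_HZ = 21_477_272              # assumed approximate NTSC master clock
MASTER_CLOCKS_PER_CPU_CYCLE = (6, 8, 12)  # assumed possible CPU cycle lengths

def saved_microseconds(cpu_cycles: int, master_per_cycle: int) -> float:
    """Convert a number of CPU cycles into microseconds."""
    return cpu_cycles * master_per_cycle / MASTER_CLOCK_HZ * 1e6

# Time saved by 2 CPU cycles for each possible cycle length.
savings = {m: saved_microseconds(2, m) for m in MASTER_CLOCKS_PER_CPU_CYCLE}
frame_us = 1e6 / 60                       # roughly one 60 Hz frame
```

Under those assumptions the saving lands somewhere between about 0.56 and 1.12 microseconds, against a frame of roughly 16,667 microseconds.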
        
               | BearOso wrote:
               | The CPU has shown some variation, but yes, it's the APU
               | that has a ceramic clock source that isn't even close to
               | the same among units. Apparently those ceramic resonators
               | have a pretty high variation, even when new.
               | 
               | When byuu/near tried to find a middle-ground for the APU
               | clock, the average turned out to be about 1025296
               | (32040.5 * 32). Some people have tested units recently
               | and gotten an even higher average. They speculate that
               | aging is causing the frequency to increase, but I don't
               | really know if this is the case or if there really was
               | that much of a discrepancy originally.
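A quick check of the arithmetic above. The 32,000 Hz nominal sample rate is an assumption added here for comparison, and the 32040.5 figure is the one quoted in the comment.

```python
NOMINAL_SAMPLE_RATE = 32_000   # assumed nominal rate in Hz, for comparison only
MEASURED_AVERAGE = 32_040.5    # average quoted in the comment above

apu_clock = MEASURED_AVERAGE * 32
deviation_percent = (MEASURED_AVERAGE / NOMINAL_SAMPLE_RATE - 1) * 100
```

That gives 1,025,296 Hz, roughly 0.13% above the assumed nominal rate.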
               | 
               | It does cause some significant compatibility issues, too,
               | like with attraction mode desyncs and random freezes.
        
           | shadowgovt wrote:
           | In general, even SNES games are still doing frame-locking,
           | right? i.e. if you save 2us you're just lengthening the
           | amount of time the code is going to wait for a blanking
           | signal by 2us.
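A minimal sketch of that frame-locking argument, assuming a 60 Hz frame and a game loop that does its work and then idles until the next blanking signal. The function name and the numbers are illustrative only.

```python
import math

FRAME_US = 1_000_000 / 60   # roughly one 60 Hz frame, in microseconds

def frame(work_us: float) -> tuple[float, int]:
    """Return (time spent waiting for the blanking signal, frames consumed)."""
    frames = max(1, math.ceil(work_us / FRAME_US))
    wait = frames * FRAME_US - work_us
    return wait, frames

wait_before, frames_before = frame(10_000)   # original code path
wait_after, frames_after = frame(9_998)      # same path, 2 us faster
```

The 2 us only turns into extra idle time. It would matter only if the work sat right at a frame boundary, where `frame(FRAME_US + 1)` takes two frames and `frame(FRAME_US - 1)` takes one.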
        
             | wk_end wrote:
             | Yeah, exactly. It'd have to be really exceptional cases.
             | For example, exactly one game (Air Strike Patrol) has timed
             | writes to certain video registers to create a shadow
             | effect, but 2us is so minor I don't think it'd appreciably
             | affect even that. Or, like, the SNES has an asynchronous
             | multiplier/divider that returns invalid results while the
             | computation is on-going, so if you optimized some code you
             | might end up reading back garbage.
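A toy model of that hazard. The latency value and the class are hypothetical, and `None` stands in for the invalid result the comment describes.

```python
from typing import Optional

class AsyncMultiplier:
    """Toy model: the product only becomes valid after LATENCY cycles."""

    LATENCY = 8  # hypothetical cycle count, chosen for illustration

    def __init__(self) -> None:
        self.product = 0
        self.cycles_left = 0

    def start(self, a: int, b: int) -> None:
        self.product = a * b
        self.cycles_left = self.LATENCY

    def tick(self, cycles: int) -> None:
        self.cycles_left = max(0, self.cycles_left - cycles)

    def read(self) -> Optional[int]:
        # None is a placeholder for an invalid, still-in-progress result.
        return self.product if self.cycles_left == 0 else None

unit = AsyncMultiplier()
unit.start(12, 34)
unit.tick(6)              # "optimized" code reads back too soon
too_early = unit.read()
unit.tick(2)              # the original, slower code waited long enough
on_time = unit.read()
```

Code that was only correct because it happened to be slow enough can break once it is sped up.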
             | 
             | IIRC ZSNES actually had basically no timing; all
             | instructions ran for effectively one cycle. ZSNES wasn't an
             | accurate emulator, but it mostly worked for most games most
             | of the time.
        
         | shadowgovt wrote:
         | Rare has a history of video games that work in testing and have
         | bugs buried in them for years until some novel architecture
         | surfaces them. Not to imply other companies _don't;_ just that
         | Rare is an easy-to-reference name on the topic.
         | 
         | Donkey Kong 64 has a memory leak that will kill the game after
         | a (for that era) unlikely amount of contiguous time playing it
         | (8-9 hours, if I understand correctly). That was not caught in
         | development but is a trivial amount of time to rack up if
         | someone is playing the game and saving progress via emulator
         | save-state instead of the in-game save feature.
         | 
         | (Note: there is some ambiguous history here. Some sources claim
         | the game shipping with the Memory Pak was a last-ditch effort
         | to hide the bug by pushing the crash window out to 13-20 hours
         | instead of 8-9. I think recent research on the issue suggests
         | that was coincidence and the game didn't ship with either Rare
         | or Nintendo being aware of the bug).
        
           | jordigh wrote:
           | Donkey Kong 64 running for 11 hours just fine:
           | 
           | https://www.youtube.com/watch?v=HWUg_iM7yIg
        
       | helf wrote:
       | I love this sort of content! My favorite things I find on HN :D
        
       | NobodyNada wrote:
       | Open bus quite literally means that the data bus lines are an
       | open circuit -- the CPU has placed an unmapped or write-only
       | address on the address bus, and no hardware on the bus has
       | responded, so the bus lines aren't being driven and are just
       | floating. Thus, nominally, this is a case of undefined behavior
       | at the hardware level.
       | 
       | In order to understand what _actually_ happens, we need to look a
       | little closer at the physical structure of a data bus -- you have
       | long conductors carrying the signals around the motherboard and
       | to the cartridge, separated from the ground plane by a thin layer
       | of insulating substrate. This looks a lot like a capacitor, and
       | in fact this is described and modeled as "parasitic capacitance"
       | by engineers who try to minimize it, since this effect limits the
       | maximum speed of data transmission over the bus. But this effect
       | means that, whenever the bus is not being driven, it tends to
       | stay at whatever voltage it was last driven to -- just like a
       | little DRAM cell, producing the "open-bus reads return the last
       | value transferred across the bus" effect described in the
       | article.
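The "last value transferred" behavior can be sketched with a toy bus model. This is a simplification with invented class and function names, 8-bit reads only, and it ignores everything electrical.

```python
class Bus:
    """Toy data bus that remembers the last value driven onto it."""

    def __init__(self) -> None:
        self.devices: dict[int, int] = {}  # address -> byte, mapped locations only
        self.last_value = 0                # what the floating lines still hold

    def read(self, address: int) -> int:
        if address in self.devices:
            self.last_value = self.devices[address]  # a device drives the bus
        # Unmapped: nothing drives the bus, so the old value is seen again.
        return self.last_value

def lda_absolute(bus: Bus, pc: int) -> int:
    """Fetch opcode, operand low byte, operand high byte, then read the target."""
    bus.read(pc)                 # opcode
    low = bus.read(pc + 1)       # operand low byte
    high = bus.read(pc + 2)      # operand high byte
    return bus.read((high << 8) | low)

bus = Bus()
bus.devices.update({0x8000: 0xAD, 0x8001: 0x00, 0x8002: 0x20})  # LDA $2000
value = lda_absolute(bus, 0x8000)  # $2000 is unmapped in this toy setup
```

Here the read from unmapped `$2000` returns `$20`, the high operand byte, because that was the last thing on the bus.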
       | 
       | It's not uncommon for games to accidentally rely on open-bus
       | effects, like DKC2. On the NES, the serial port registers for
       | connecting to a controller only drive the low-order bits and the
       | high bits are open-bus; there are a few games that read the
       | controller input with the instruction LDA $4016 and expect to see
       | the value $40 or $41 (with the 4 sticking around because of open-
       | bus).
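A small sketch of how that `$40`/`$41` value comes about. The mask of driven bits is an assumption made for the example, and only a single button bit is modeled.

```python
DRIVEN_MASK = 0x1F  # assumed: only the low five bits are driven by hardware

def read_4016(button_pressed: bool, open_bus: int = 0x40) -> int:
    """Combine the driven low bits with whatever the bus last held."""
    driven = 1 if button_pressed else 0
    floating = open_bus & ~DRIVEN_MASK & 0xFF   # undriven bits keep the old value
    return floating | (driven & DRIVEN_MASK)

released = read_4016(False)     # $40: high bits left over from the address byte
pressed = read_4016(True)       # $41
robust = read_4016(True) & 1    # masking the result avoids depending on open bus
```

The default `open_bus` of `$40` is the high byte of the address in `LDA $4016`, the last byte fetched before the read.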
       | 
       | There's also speedrun strategies that rely on open-bus behavior
       | as part of memory-corruption or arbitrary-code-execution
       | exploits, such as the Super Mario World credits warp, which sends
       | the program counter on a trip through unmapped memory before
       | eventually landing in RAM and executing a payload crafted by
       | carefully manipulating enemy positions [1].
       | 
       | But there's some exceptions to the usual predictable open bus
       | behavior. Nonstandard cartridges could return a default value for
       | unmapped memory, or include pull-up or pull-down resistors that
       | impact the behavior of open bus. There are also interesting
       | interactions with DMA; the SNES supports a feature called HDMA
       | which allows applications to schedule DMA transfers to transfer
       | data from the CPU to the graphics hardware with precise timing in
       | order to upload data or change settings mid-frame [2]. This DMA
       | transfer temporarily pauses the CPU in order to use the bus to
       | perform the transfer, which can change the behavior of an open-
       | bus read if a DMA transfer happens to occur in the middle of an
       | instruction (between reading the target address & performing the
       | actual open-bus read).
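A hedged sketch of that timing window. The function and its parameters are invented for illustration, and real DMA timing is far more involved than a single optional byte.

```python
from typing import Optional

def open_bus_read(operand_high: int, dma_byte: Optional[int] = None) -> int:
    """Value seen by an open-bus read at the end of an absolute-mode load."""
    last_on_bus = operand_high      # final operand byte fetched by the CPU
    if dma_byte is not None:
        last_on_bus = dma_byte      # a DMA transfer used the bus in between
    return last_on_bus              # unmapped read sees whatever was last driven

undisturbed = open_bus_read(0x00)                 # usual case: operand byte
disturbed = open_bus_read(0x00, dma_byte=0x5A)    # DMA slipped in mid-instruction
```

An emulator that never splits an instruction around a DMA transfer would always take the first path.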
       | 
       | This very niche edge case has a significant impact on a Super
       | Metroid speedrun exploit [3] which causes an out-of-bounds
       | memcpy, which attempts to transfer a large block of data from
       | open-bus to RAM. The open-bus read almost always returns zero
       | (because the last byte of the relevant load instruction is zero),
       | but when performed in certain rooms with HDMA-heavy graphical
       | effects, there's a good chance that a DMA transfer will affect
       | one of the reads, causing a non-zero byte to sneak in somewhere
       | important and causing the exploit to crash instead of working
       | normally. This has created a mild controversy in the community,
       | where some routes and strategies are only reliable on emulators
       | and nonstandard firmwares; a player using original hardware or a
       | very accurate emulator has a high chance of experiencing a crash,
       | whereas most emulators (including all of Nintendo's official re-
       | releases of the game) do not emulate this niche edge case of a
       | mid-instruction HDMA transfer changing the value of an open-bus
       | read.
       | 
       | Also, the current fastest TAS completion of Super Metroid [4]
       | relies on this HDMA interaction. We found a crash that attempted
       | to execute open bus, but wasn't normally controllable in a useful
       | way; by manipulating enemies in the room to influence CPU timing,
       | we were able to use HDMA to put useful instructions on the bus at
       | the right timing, eventually getting the console to execute
       | controller inputs as code and achieve full arbitrary code
       | execution.
       | 
       | [1]: https://youtu.be/vAHXK2wut_I
       | 
       | [2]: https://youtu.be/K7gWmdgXPgk
       | 
       | [3]: https://youtu.be/CnThmKhtfOs
       | 
       | [4]: https://tasvideos.org/8214S
        
         | russellbeattie wrote:
         | > _... we need to look a little closer at the physical
         | structure of a data bus_
         | 
         | Once again, I have to give a shout out to Ben Eater, whose
         | video series on making a breadboard computer with the 6502 is
         | why I actually understand what the article is about and what
         | you're referring to when describing the hardware issues.
         | (Obviously, extrapolating from his basic bus example to a
         | commercial machine.) I'd be pretty clueless otherwise.
         | 
         | https://eater.net
        
       | Dwedit wrote:
       | I once encountered SNES Puyo Puyo doing PPU open bus. This was
       | when I was working on the RunAhead feature for RetroArch, and was
       | checking when savestates failed to match. CPU execution trace
       | logs didn't match because a value read out of PPU Open Bus didn't
       | match after loading state.
        
       | jgalt212 wrote:
       | DKC 1 with the SGI prerendered 3d graphics was cutting edge
       | stuff. Vector Man on the Genesis did something similar to less
       | acclaim.
        
       ___________________________________________________________________
       (page generated 2025-06-30 23:00 UTC)