https://roadrunnerwmc.github.io/blog/2020/05/08/nsmb-rng.html Writings of RoadrunnerWMC [ ] About One Number Repeated Forever: RNG in NSMB May 8, 2020 Hello! It's been a while, hasn't it. Sorry for the delay. These posts sometimes take a long time to put together, since this blog isn't my highest priority and I try to add custom multimedia to each one. Nevertheless, I already have several future posts in the works, so you can look forward to more in the coming months. Anyway! This is a sort of "part 0" for an upcoming 2-part series about the Green Toad House minigame in New Super Mario Bros. The minigame involves randomness, so in researching it, I investigated NSMB's random number generator (RNG). My findings didn't turn out to be very relevant to the minigame, but are interesting in their own right. To keep this from getting far too long, I'm assuming you already know with what an RNG is, and about the concept of seeding one. If not, here are some good resources: pannenkoek2012 on YouTube (SM64), Retro Game Mechanics Explained on YouTube (SMW), Wikipedia. Investigating the function To start off, here's a manual decompilation of NSMB's random number generator, which I am by executive decision naming "rand_nsmb": uint32_t rand_nsmb(uint32_t *state) { uint64_t value = (uint64_t)(*state) * 1664525 + 1013904223; return *state = value + (value >> 32); } Interactive rand_nsmb Want to try out rand_nsmb? You can do that right here: uint32_t state = [0 ]; rand_nsmb(&state); An RNG function where the next output is calculated as the previous output times some constant, plus another constant, is called a linear congruential generator. The function above is almost an LCG. The only part that makes it not quite qualify is the + (value >> 32) at the end. Whenever you find a piece of unknown code with weird large constants, you can often learn a lot by Googling them. If you search for 1664525 and 1013904223, you'll find that they're very commonly used together in LCG functions^1, and were originally published in the book Numerical Recipes. That book names the LCG function based upon these two numbers "ranqd1" (for "random quick-and-dirty generator #1"). I'll be using that name for this LCG from here on. For comparison, here's an implementation of ranqd1 (not copied directly from the book) matching the formatting of rand_nsmb above: uint32_t ranqd1(uint32_t *state) { uint64_t value = (uint64_t)(*state) * 1664525 + 1013904223; return *state = value; } A note about seeding As far as I can tell, NSMB's procedure for seeding the RNG upon game boot is sound. It gathers entropy from a variety of sources, including time spent loading the game so far, current scanline number, GPU status, any buttons being held, and more. Then it takes a SHA-1 hash of it all, and XORs that up into a 32-bit integer. Given this, I think it's safe to assume for this analysis that seed values are uniformly distributed. Other than missing the + (value >> 32), it's identical to rand_nsmb. ranqd1 is known to have the very nice property of cycling through every 32-bit integer before looping^2 (a "full cycle"), and is generally (quoting Numerical Recipes) "about as good as any 32-bit linear congruential generator." So NSMB's RNG function is just ranqd1 - a decent LCG - with a small change. Nintendo adds back the bits that would otherwise be truncated away during the implicit cast to uint32_t when returning. On the surface, this feels like it should be a good thing - instead of losing that somewhat-random information, it's now getting recycled! But here's the issue: random number generation is one of those things where you should seriously not mess with tried-and-true formulas unless you really know what you're doing. And as we'll see soon, this change ends up bringing more issues than benefits. Cycles I mentioned earlier that ranqd1 cycles through every 32-bit integer before looping. With NSMB's modification, this property doesn't necessarily still hold, so I decided to do a fairly thorough analysis of rand_nsmb's cycle(s). I wrote a C program that iterated over every 32-bit integer, plugged each one into rand_nsmb, and followed the resulting trail of RNG values until it hit a duplicate. This scenario lends itself nicely to dynamic programming, so I had the program fill information about each seed (which cycle it ended up in, and how many steps it took to reach it) into a fairly gigantic 20 GB data table file. Even with the file stored on my laptop's internal SSD, the program took about two weeks to finish running. Once that was done, I was able to answer a lot of questions quickly by linearly reading through the output file with Python (which takes about half an hour^3) and calculating statistics I was interested in. Here's what I found. Let's look at the average case first. Given a random starting seed, rand_nsmb will repeat an output after 1,820,529 calls, on average. While that's a far cry from the 4,294,967,296 calls needed to get ranqd1 to repeat a number, it's still absolutely plenty for a video game like this. What I find more interesting is to inspect the worst cases. If you broke a mirror underneath a ladder prior to launching NSMB, how small of a cycle could you end up in? Well, here's a list of all 1653 of rand_nsmb's cycles, sorted by length, and a pie chart showing the amounts of states that feed into them: * 1 cycle of length 1,708,724 * 1 cycle of length 354,835 * 1 cycle of length 155,834 * 1 cycle of length 146,318 * 1 cycle of length 127,646 * 1 cycle of length 81,673 * 1 cycle of length 48,534 * 1 cycle of length 26,128 * 1 cycle of length 1371 * 1630 cycles of length 8 * 12 cycles of length 4 * 1 cycle of length 2 * 1 cycle of length 1 About 3/4 of all states drain into the largest cycle, which is 1.7 million states long. Following that are a bunch of progressively smaller cycles, with similarly shrinking slices of the pie chart. I don't know why there are so many cycles exactly 8 elements long, but that's not important. Look at the bottom of the list. There are twelve cycles 4 states long (henceforth "4-cycles"), a 2-cycle, and... a 1-cycle. This RNG function has a fixed point. Here it is: 1144735523. You can check it for yourself with the interactive rand_nsmb widget from earlier if you want - rand_nsmb (1144735523) returns 1144735523.^4 To put it mildly, a random number generator that might spit out the same number over and over is not ideal. The fixed point Impact on gameplay What would it be like to play NSMB with its RNG function stuck on a single number? While you'd only have to boot the game 1.9 billion times to have a 90% probability of hitting rand_nsmb's fixed point, it's a tiny bit faster to achieve it by hacking. You can use this NSMBe code patch to do it. I played through the whole game like this. Here are some highlights of what I found: Your browser doesn't support HTML5