[HN Gopher] RP2040 Doom
       ___________________________________________________________________
        
       RP2040 Doom
        
       Author : xkriva11
       Score  : 411 points
       Date   : 2022-03-14 13:59 UTC (1 day ago)
        
 (HTM) web link (kilograham.github.io)
 (TXT) w3m dump (kilograham.github.io)
        
       | mikewarot wrote:
       | Has anyone seen the schematic for this?
       | 
       | I've got some Raspberry Pi Picos, and would like to try it out.
        
       | ChuckMcM wrote:
       | Fun stuff, it always amazes me that people are surprised. Not
       | having lived through it is a part of that I'm sure.
       | 
       | The RP2040 is more powerful than an 80286, the CPU of the PC/AT,
       | which was hugely more powerful than the original IBM PC (on which
       | DOOM also ran). Put a keyboard, mouse, and a frame buffer on an
       | STM32F4 or F7 and you've got the computational and capability
       | equivalent of the PCs that powered the world in 1985. People did
       | accounting, CAD, spreadsheets, email, all sorts of things on them.
       | Amazing I know, but here we are.
        
         | _joel wrote:
         | Sure, but it's not about the speed of the hardware but what was
         | done to port it to the hardware. RAM and ROM would be larger
         | then. It has 256KB, I remember 286's having well over 1MB,
         | 386's even more!
         | 
         | I don't think PC/AT was sub $1 either :)
        
         | varajelle wrote:
         | The computation power was not so much the issue for this port.
         | The challenge was to make it work with much less RAM and
         | storage available.
        
           | pflanze wrote:
           | Yes, they are solving the storage and RAM challenges
           | partially by throwing CPU at it: not using native pointers,
           | switching between multiple struct sizes where the original
           | had one, compressed integer values, etc., also they
           | restructured drawing to happen in slices as the beam travels,
           | which must have its costs. Also I wonder whether original
           | Doom relied on some GPU hardware, whereas here everything
           | happens in software. RP2040 doom also has to emulate the
           | sound hardware, and handle lots of interrupts to initialize
           | the DMA for each individual video scanline.
           | 
           | OTOH they are actually overclocking the RP2040 to 270MHz.
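
A minimal sketch of the "not using native pointers" idea from the comment above, in Python for brevity rather than the port's C. It assumes a hypothetical pool of objects linked by 16-bit indices instead of 32-bit pointers; the names ThingPool, NIL and health are illustrative and do not come from RP2040 Doom.

```python
from array import array

NIL = 0xFFFF  # hypothetical sentinel index meaning "no next object"


class ThingPool:
    """Linked list whose links are 16-bit indices rather than pointers."""

    def __init__(self):
        self.health = array("h")  # signed 16-bit payload per object
        self.next = array("H")    # unsigned 16-bit link per object
        self.head = NIL

    def push(self, health):
        """Add an object at the front of the list and return its index."""
        index = len(self.health)
        if index >= NIL:
            raise OverflowError("pool is limited to 65535 objects")
        self.health.append(health)
        self.next.append(self.head)
        self.head = index
        return index

    def walk(self):
        """Yield payloads by following the 16-bit links from the head."""
        index = self.head
        while index != NIL:
            yield self.health[index]
            index = self.next[index]


pool = ThingPool()
for hp in (100, 60, 20):
    pool.push(hp)
walked = list(pool.walk())
link_bytes = pool.next.itemsize * len(pool.next)
```

Each link costs 2 bytes here instead of the 4 a native pointer takes on a 32-bit core, at the price of an extra index-to-address step on every access. That is the RAM-for-CPU trade the comment describes.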
        
             | mkl wrote:
             | The original Doom was all software. GPUs weren't a thing
             | for mainstream PCs at that point.
        
         | vbezhenar wrote:
         | I thought about building a toy PC with an RP2040 but I wasn't
         | able to solve the GPU problem. Driving a display seems a very
         | hard task without some dedicated hardware. And using serial
         | output is not fun.
        
           | II2II wrote:
           | There are a lot of options on this front. As far as I can
           | tell, most displays for embedded platforms are sold as
           | modules that you interface with via a serial or parallel bus.
           | There are libraries out there to handle the grunt work, if
           | you don't want to dig through data sheets yourself.
           | 
           | If you want something that doesn't use any dedicated
           | hardware, interfacing with analog displays (e.g.
           | NTSC/PAL/VGA) can be done with a handful of resistors on GPIO
           | pins. Conceptually, it is easier but actually dealing with
           | timing is a pain. Again, libraries that deal with the grunt
           | work are available.
        
           | ChuckMcM wrote:
           | So, FWIW, I've been playing around with this. I've got an
           | FPGA board that has an HDMI output[1]. I have a simple 1280 x
           | 720 frame buffer running on it (read the DRAM, display it on
           | the monitor). I'm building a carrier board to connect it to an
           | STM32F429 Nucleo-144 board using ST Micro's flexible
           | memory controller (FMC) peripheral. This will present the
           | frame buffer contents to the STM32 as memory.
           | 
           | Additionally, some "control registers" are being implemented
           | in the FPGA that can do certain actions. At a minimum they
           | are "clear to one color", "copy region", "scroll region", and
           | "copy glyph". The STM32 has the DMA2 peripheral that does a
           | lot of cool bitblt type functions but these can be nominally
           | slowed down by not synchronizing with the FPGA's schedule for
           | displaying things.
           | 
           | The STM32 is running micropython. The "plan", such as it is,
           | is to let the REPL run using the display as its terminal, and
           | a "graphics mode" to reserve parts of the screen for
           | graphics. The small goal is to re-create sort of the
           | VIC-20/C64/ZX Spectrum kind of "vibe" (interpreted language,
           | easy access to the graphics) and then build from there.
           | Clearly the basic frame buffer is like 20% of the FPGA so
           | there is lots of room to do other stuff in there.
           | 
           | [1] https://www.kickstarter.com/projects/1812459948/minispart
           | an6...
        
             | vbezhenar wrote:
             | Thanks, this sounds like an awesome and interesting topic
             | for learning in the future.
        
         | bombcar wrote:
         | Doom required a 386 (and really wanted a 486) iirc.
         | 
         | Wolfenstein worked on 286.
        
           | mobilio wrote:
           | And it required 4MB of RAM... since I had a machine with 2MB
           | I wasn't able to enjoy it.
           | 
           | But I found a DOS4GW command line option to "emulate" RAM
           | with a swap file on DOS. I set up about 4MB of virtual memory
           | and ran the game. It took 15 minutes to start the game and
           | show the main menu, another 5 minutes to navigate the menu
           | and 15 to start playing. The frame rate was something like
           | one frame PER minute.
        
         | klelatti wrote:
         | Isn't it a __lot__ more powerful than a 286? 32 bit dual core
         | at 133 MHz - roughly a Pentium say?
         | 
         | Plus some of the early 8 bit machines drove a display with
         | minimal extra hardware - eg the ZX80 / ZX81 although it was
         | very very slow as a result!
        
           | ChuckMcM wrote:
           | In terms of instructions per clock and I/O bandwidth it
           | compares more favorably to 16 bit architectures than 32 bit
           | ones even though the Cortex-M family is nominally 32 bits.
        
             | klelatti wrote:
             | I think memory bandwidth is key and I don't know how the
             | RP2040 stacks up but even 486 wasn't superscalar and maxed
             | out at 66MHz by comparison.
        
         | JohnBooty wrote:
         | Some good discussion here, but you've got Doom's original
         | system requirements wrong.
         | 
         | It required a 386 with 4MB of RAM. It would _not_ run on a 286,
         | much less the original 4.77MHz IBM PC.
         | 
         | source: https://www.mobygames.com/game/dos/doom/techinfo
         | 
         | IIRC, Doom was very playable, but not exactly smooth on my
         | cheap 386SX which was... 20MHz? But it ran like butter on the
         | 66MHz 486s in the school's computer lab.
        
       | pflanze wrote:
       | This is impressive.
       | 
       | I'm wondering about a few things:
       | 
       | - "I decided to leave the XIP cache to do its thing, and select a
       | few small areas of hot code or data to promote to RAM
       | manually"[1]: I understood this as you leaving the XIP cache
       | activated. But this seems at odds with "16K of flash XIP cache,
       | that we've talked about, but decided not to use."[also 1], which
       | I'm interpreting as "decided not to make use of the XIP cache
       | (i.e. turn it off)" (maybe I'm misreading).
       | 
       | - I thought ARM32 has 12(-14) usable registers (compared to 14-15
       | in x86-64), so why these mentions of "scarce Cortex-M0+
       | registers"? (Does FIQ mode reduce the number of usable
       | registers?)
       | 
       | - "not good on a Cortex M0+ where the overhead of a function call
       | is generally 30-40 cycles, with the corresponding loss of most of
       | your precious "in-register" state": are function calls
       | _disproportionally_ slower on Cortex M0+? (Certainly 30-40 cycles
       | seems high.) Why is that? (Registers r4-r11 are callee-saved[2],
       | thus not lost; mutable data might have to be re-read from memory,
       | though--just like on other architectures, but maybe CPU caches
       | are faster on those.)
       | 
       | - "These OR values can be stored in a lookup table indexed by
       | higher bits in the sample position, and thus the 8x space savings
       | can be realized without needing any branches in the code!"[3]:
       | Cortex-M0+ has a 2-stage pipeline[4], I'd hence expect the cost
       | of a jump to be just 1 additional cycle, for the re-processing of
       | the 1st stage for the next instruction (maybe I'm wrong), which
       | would be the same as a memory access. (Maybe multiple jumps can
       | be saved this way, though.) Did measurements show the lookup
       | table to be faster?
       | 
       | [1] https://kilograham.github.io/rp2040-doom/speed_and_ram.html
       | [2] https://en.wikipedia.org/wiki/Calling_convention#ARM_(A32)
       | [3] https://kilograham.github.io/rp2040-doom/sound.html
       | [4] https://en.wikipedia.org/wiki/Cortex-M0%2B#Cortex-M0+
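
To make the branch-versus-lookup-table question above concrete, here is a small sketch in Python. It is not the article's sound code: the operation (sign-extending a 4-bit value to 8 bits) and the names OR_TABLE, sign_extend_branch and sign_extend_table are assumptions chosen only to show how an OR value selected by a high bit can replace a conditional.

```python
def sign_extend_branch(nibble):
    """Sign-extend a 4-bit value to 8 bits using a conditional."""
    if nibble & 0x8:
        return nibble | 0xF0
    return nibble


# Hypothetical table of OR values, indexed by the top bit of the nibble.
OR_TABLE = (0x00, 0xF0)


def sign_extend_table(nibble):
    """Same result, but the OR value comes from a table: no branch."""
    return nibble | OR_TABLE[nibble >> 3]


# Both variants agree for every possible 4-bit input.
results_match = all(
    sign_extend_branch(n) == sign_extend_table(n) for n in range(16)
)
```

Python cannot show cycle counts, so this only demonstrates that the two forms are equivalent. Whether the table is actually faster on a Cortex-M0+ is the measurement question the comment asks.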
        
         | ranma42 wrote:
         | > - I thought ARM32 has 12(-14) usable registers (compared to
         | 14-15 in x86-64), so why these mentions of "scarce Cortex-M0+
         | registers"?
         | 
         | Cortex-M0+ is thumb-only (compressed 16bit instruction
         | encoding). "In Thumb state, the high registers, r8-r15, are not
         | part of the standard register set. The assembly language
         | programmer has limited access to them, but can use them for
         | fast temporary storage."
         | 
         | > (Does FIQ mode reduce the number of usable registers?)
         | 
         | There is no FIQ mode in Cortex-M. Instead you usually have the
         | nifty Nested Vectored Interrupt Controller (NVIC) and it is
         | designed so your interrupt handlers can be regular C functions,
         | with no special handling (no special interrupt return
         | instruction) needed.
        
           | ant6n wrote:
           | In Thumb-2 virtually all ARM32 instructions are available,
           | some as 32bit encodings. But even the 16bit encodings include
           | some instructions that work on hi registers.
        
         | moefh wrote:
         | > - "I decided to leave the XIP cache to do its thing, and
         | select a few small areas of hot code or data to promote to RAM
         | manually"[1]: I understood this as you leaving the XIP cache
         | activated. But this seems at odds with "16K of flash XIP cache,
         | that we've talked about, but decided not to use."[also 1],
         | which I'm interpreting as "decided not to make use of the XIP
         | cache (i.e. turn it off)" (maybe I'm misreading).
         | 
         | The way I'm reading it, you can disable the XIP cache and use
         | that 16KB of RAM for anything else you want. But the author
         | "decided not to use" it for something else, that is, the 16KB
         | are still being used as cache.
        
       | 1024core wrote:
       | So I just got my hands on a couple of Picos. I was so excited to
       | find a RPi board in stock, that I forgot to check if it has WiFi
       | or not. I would like to run some form of OpenSprinkler on that
       | (even if not OpenSprinkler, I can use cron jobs to control the
       | sprinkler relays by hand).
       | 
       | Any tips on how to get WiFi working on Pico?
        
         | jonp888 wrote:
         | I don't think you understood what you were buying. The Pico has
         | no relation to the rest of the RPi product line.
         | 
         | It is basically an Arduino on steroids. It cannot run cron
         | jobs. It has no network stack, kernel or operating system, and
         | cannot run any software not specifically written for it.
        
           | 1024core wrote:
           | :-D Yeah, I didn't read the description before buying it. On
           | the other hand, it was about $7 each, so not a huge loss.
           | 
           | I have heard that it's possible to run some form of Unix:
           | https://www.zdnet.com/article/now-you-can-run-unix-on-the-
           | ti...
           | 
           | Wanted to hear from my homies here for any ideas.
        
         | callmemclovin wrote:
         | I guess the RPi Zero W would suit your needs a lot better, and
         | it seems to be in stock in some shops. The Pico doesn't run
         | Linux so cron jobs aren't possible.
        
           | anticensor wrote:
           | Pico can run FUZIX, which incorporates parts of real Unix.
        
       | alexk307 wrote:
       | Well done!
        
       | IMSAI8080 wrote:
       | Wow. That's just amazing work.
        
       | RF_Savage wrote:
       | Oh boy, that is really impressive.
        
       | gchadwick wrote:
       | Awesome! A huge amount of work must have gone into this.
       | 
       | I did some playing around with VGA graphics from the Pico when it
       | first came out (wrote a simple library to produce SNES like
       | graphics, wrote it all up on my blog
       | https://gregchadwick.co.uk/blog/playing-with-the-pico-pt6/). It
       | felt like Doom should be doable but I figured you'd need an off
       | chip RAM expansion interfaced via the PIO. Clearly not.
       | 
       | The Pico really is a very fun board to play around with. Could be
       | a great target for a retro style mini console thing.
        
         | erosenbe0 wrote:
         | Fun VGA experiment -- thanks for writing it up. I've done VGA
         | with FPGAs but I like how Pico is way cheaper with an open tool
         | chain and great accessories.
        
         | jrockway wrote:
         | > Could be a great target for a retro style mini console thing.
         | 
         | I've been playing with this thing. It's quite neat:
         | https://shop.pimoroni.com/products/picosystem?variant=323695...
         | 
         | (Note that it's a dev board; if you just want to play AAA
         | games, not the thing to buy. If you want to program a game and
         | show it off, it's what you want.)
        
           | klelatti wrote:
           | Thanks for highlighting this - looks awesome.
        
             | kefabean wrote:
             | These are truly fab little devices - I bought a couple for
             | my kids to play simple (not too addictive) games and
             | hopefully get them into programming.
             | 
             | Question is, how would one go about compiling Doom for the
             | PicoSystem? That would be so cool...
        
           | jlundberg wrote:
           | The PicoSystem is really neat and has awesome build quality.
           | 
           | One thing I really like about it is the super quick boot
           | compared to all normal devices that need to boot an operating
           | system first.
           | 
           | Highly recommended.
        
             | codetiger wrote:
             | I built a retro style game console for myself and am now
             | working on building games for it. I never thought Doom was
             | possible without a lot of external hardware for RAM and
             | storage.
             | 
             | https://codetiger.github.io/blog/building-a-retro-style-
             | game...
        
             | officeplant wrote:
             | +1 for build quality. I love my PicoSystem even if my goal
             | of learning to program a game for it has fallen flat, ha.
        
         | [deleted]
        
       | Cthulhu_ wrote:
       | That's cool, it's Doom in a completely self-contained cartridge
       | size. Would it be possible to just hook up a cartridge like this
       | to a monitor (through e.g. usb-c) directly?
       | 
       | Also if the RP2040 is just $1, does this mean we should be able
       | to get e.g. Doom on cheap handheld single-game devices like the
       | old Game & Watch and similar machines? I remember spending hours
       | on these "racing games" or 12-in-1 Tetris LCD machines from the
       | toy shop. How much does a small (2-4") color OLED or backlit LCD
       | cost these days? Actually, what is the cheap handheld market
       | looking like these days? I had a boggle at the local toy shop's
       | website, VTech is still going for it but mainly with baby toys it
       | seems, and those Tetris handhelds are still the same from 20-30
       | years ago, they cost just EUR3,99 these days. I'm also seeing
       | some products from a company called Wonky Toys, and miniature
       | Atari arcade cabinets.
        
       | 0des wrote:
       | Music and sound too???
       | 
       | ...what am I doing with my life
        
       | WithinReason wrote:
       | This thing even has networked multiplayer! In about 256K of RAM
       | and 2MB flash! (It's the Raspberry Pi Pico board)
       | 
       | Carmack would be proud!
        
         | tenebrisalietum wrote:
         | In contrast with these limited platforms:
         | 
         | SNES Doom - 128KB RAM, I think a 2MByte ROM, CPU is 65816 at
         | 3MHz + SuperFX RISC CPU at 21MHz, which also had its own 64KB
         | of RAM.
         | 
         | PSX Doom - 2MB RAM + 1KB fast scratchpad, able to load from a
         | standard 650MB CD, CPU is a MIPS R3051 at 33MHz + the PSX
         | accelerated graphics, not used except to draw strips
         | 
         | So doing this in a device that has not too much more RAM than
         | the SNES and also has to livestream the VGA signal Atari 2600
         | style is exceedingly impressive. It's a dual CPU unit but
         | basically having to spend a core manually bit banging the VGA
         | signal like that is what fascinated me the most.
        
           | al2o3cr wrote:
           | FWIW, it's possible to bit-bang DVI at 640x480 on the 2040.
           | Takes about half of the available resources:
           | 
           | https://github.com/Wren6991/PicoDVI
        
             | ranma42 wrote:
             | That requires a hefty overclock though (252MHz instead of
             | 133MHz).
        
               | bonzini wrote:
               | DOOM overclocks it to 270MHz. :)
        
           | _Microft wrote:
           | I haven't checked but don't think that the second core was
           | bitbanging the VGA signal. The RP2040 has PIO (programmable
           | I/O) mini cores that can read directly from RAM (DMA) and
           | address the GPIO pins directly. They most likely used that to
           | their advantage.
           | 
           | Edit: yes, see
           | https://kilograham.github.io/rp2040-doom/rendering.html
        
         | TillE wrote:
         | Those are important limitations, but there's a lot of room for
         | solutions when you have a dual core CPU which is many times
         | faster than a 486.
        
           | WithinReason wrote:
           | Doom 1 itself has 4MB of RAM as a minimum requirement:
           | 
           | https://www.computerhope.com/games/games/doomx.htm#doom
        
             | vikingerik wrote:
             | Much of that is for art assets. Do it with fewer or lower-
             | resolution textures and sprites, and you could get away
             | with quite a bit less. The executable code can fit in well
             | under one megabyte. You could even procedurally-generate
             | the art, if you've got way more CPU core available than
             | storage.
             | 
             | The SNES ran Doom with two 64k RAM banks (albeit with
             | textures and data such as level geometry running directly
             | from ROM.)
        
               | WithinReason wrote:
               | The SNES used a dumbed-down version of Doom:
               | 
               | https://doom.fandom.com/wiki/Super_NES
               | 
               | While this port makes it a point to port everything
               | accurately, including multiplayer.
        
             | billyhoffman wrote:
             | The article talks about this: the RP2040 lets you access
             | flash as if it were very slow RAM (fronted by a 16K cache
             | of actual RAM). This lets the author do many things, like
             | directly accessing the levels and textures without loading
             | them into RAM, and in fact storing them compressed and
             | using the second core to decompress them on the fly.
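
A toy model of "flash fronted by a small cache", in Python, may help picture the comment above. It is only a sketch under stated assumptions: the direct-mapped layout, the 8-byte line size and the class name CachedFlash are illustrative choices, not a description of the RP2040's real XIP cache hardware.

```python
class CachedFlash:
    """Toy direct-mapped read cache in front of a slow byte store."""

    def __init__(self, flash, cache_bytes=16 * 1024, line_bytes=8):
        self.flash = flash                      # backing store (bytes)
        self.line_bytes = line_bytes            # assumed line size
        self.num_lines = cache_bytes // line_bytes
        self.tags = [None] * self.num_lines     # which flash line is cached
        self.lines = [b""] * self.num_lines     # cached line contents
        self.hits = 0
        self.misses = 0

    def read_byte(self, addr):
        """Return the byte at addr, filling a cache line on a miss."""
        line_no = addr // self.line_bytes
        slot = line_no % self.num_lines
        if self.tags[slot] != line_no:
            # Miss: fetch the whole line from the slow backing store.
            self.misses += 1
            start = line_no * self.line_bytes
            self.lines[slot] = self.flash[start:start + self.line_bytes]
            self.tags[slot] = line_no
        else:
            self.hits += 1
        return self.lines[slot][addr % self.line_bytes]


flash_image = bytes(range(256))
xip = CachedFlash(flash_image)
first_pass = [xip.read_byte(addr) for addr in range(16)]
```

Reading 16 consecutive bytes touches two 8-byte lines, so only the first byte of each line pays the slow fetch. Sequential reads of level or texture data benefit in the same way, while scattered reads would miss far more often.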
        
               | numpad0 wrote:
               | In layman's approximation, could this be exactly how
               | modern x86 CPU with multi-level caches work, or is it
               | completely different from such things?
        
               | erosenbe0 wrote:
               | Not the same but not totally different. MCU abstraction
               | is simple and more like vintage stuff. So it would be closer
               | to an 80s system that executed many routines from memory
               | mapped ROMs -- in addition to system RAM -- with an
               | instruction cache on CPU.
               | 
               | MCUs can have on chip RAM/ROM and off chip [quad] serial
               | RAM/ROM, and even parallel access RAM/ROM like FRAM.
               | Several ways to skin the cat. Or cut the pie, as it were.
        
       | ogurechny wrote:
       | > You can refer to Doom Wiki - WAD section to get a bit more
       | detail about the types of lumps mentioned below.
       | 
       | I just HAVE to nitpick on Fandom/Wikia search term squatting.
       | There is a real Doom Wiki at doomwiki.org.
        
       | anthk wrote:
       | I would love nethack/slashem on this, but I think NH 3.4.3 needs
       | 2MB of RAM at least.
        
       | AlotOfReading wrote:
       | What a coincidence, I was working on porting doom to the e-ink
       | badger2040 last week. Getting doom to fit into memory was fairly
       | straightforward, but they did a better job than me. I'm very
       | impressed they got the original WADs and networking going as
       | well. Great work!
        
         | whiskers wrote:
         | Hah! I never imagined DOOM as a use case when I was designing
         | the Badger 2040 - how foolish of me, in hindsight it's obvious!
         | 
         | How far did you get with it? Any video of the end result?
        
           | AlotOfReading wrote:
           | I got things drawing with low-complexity WADs, but had
           | issues/graphical snow after a few seconds in and needed more
           | polish to fit the original WAD in memory. I figured the video
           | would be best if it opened with the hangar level, so I
           | haven't made one yet. Might be worth rebasing off this
           | effort instead.
        
       | Narishma wrote:
       | > 320x200x60 VGA output (really 1280x1024x60).
       | 
       | The original game ran (if you had a fast enough PC) at 35 FPS on
       | a 70 Hz display.
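
The numbers in the comment above can be checked with a little arithmetic, sketched here in Python. The assumption that each source pixel is repeated a whole number of times is mine; the quoted line only says the output is "really 1280x1024x60".

```python
from fractions import Fraction

SRC_W, SRC_H = 320, 200      # Doom's logical resolution
OUT_W, OUT_H = 1280, 1024    # VGA mode quoted from the article

# Assumption: each source pixel is repeated a whole number of times.
x_scale = OUT_W // SRC_W              # horizontal repeats per pixel
y_scale = OUT_H // SRC_H              # scanlines per source row
used_lines = SRC_H * y_scale          # scanlines carrying image data
blank_lines = OUT_H - used_lines      # left over as border

GAME_FPS = 35  # frame rate of the original game, per the comment above
refreshes_per_frame_at_70hz = Fraction(70, GAME_FPS)
refreshes_per_frame_at_60hz = Fraction(60, GAME_FPS)
```

At 70 Hz each 35 FPS frame maps to exactly two refreshes. At 60 Hz the ratio is 12/7, which is not a whole number, so 35 FPS frames cannot be paced evenly on that display.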
        
       | aaroninsf wrote:
       | > RP2040 Doom supports up to four players in regular/deathmatch
       | multi-player over a two wire I2C connection.
       | 
       | I thought I had done something worthwhile with I2C.
       | 
       | I was wrong, and I am a bad person.
        
       | vardump wrote:
       | I'm speechless. That was... an impressive effort.
        
       | qwertox wrote:
       | > I2C networking for up to 4 players
       | 
       | How does one even get these ideas?
        
         | sitzkrieg wrote:
         | two free gpios left, clearly
        
           | dchichkov wrote:
           | It is better than this:
           | 
           | > For RP2040 Doom, whilst I thought I might need to build my
           | own single pin PIO networking with some sort of token
           | passing, it turned out I had 2 GPIO pins free that could be
           | configured for I2C, so I decided to just use that instead.
        
             | sitzkrieg wrote:
             | it is almost as though i was referring to that!
        
       | anthk wrote:
       | Also, check fastdoom:
       | 
       | https://www.youtube.com/watch?v=EZvI8wCVOPU
       | 
       | https://www.youtube.com/watch?v=Eh31az9epAo
        
       | _joel wrote:
       | This is a work of art, I've learnt a lot reading this, thank you.
        
       | chefandy wrote:
       | The Pi Pico reinvigorated my love of tinkering with electronics.
       | I can hack my way through C on an Arduino (and would probably
       | still use it for any serious deployment that I didn't expect to
       | turn into a big community effort) but for standing up quick proof
       | of concepts, embedded python is outstanding. Incredible for $4.
       | 
       | These newer compatible boards being released are awesome.
        
       ___________________________________________________________________
       (page generated 2022-03-15 23:02 UTC)