[HN Gopher] MiSTer, an open-source FPGA gaming project
___________________________________________________________________
MiSTer, an open-source FPGA gaming project
Author : tediousdemise
Score : 189 points
Date : 2021-04-11 17:59 UTC (5 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| fooblat wrote:
| I love my MiSTer build!
|
| The usb controllers made for the recent mini systems (NES, SNES,
| Genesis, etc) make great accessories for the mister. Add a couple
| of usb arcade sticks and you can really play almost any classic
| retro game as it was meant to be played.
|
| And then there are all the classic computer cores even including
| the PDP-1!
| vardump wrote:
| Yup, same! Can wholeheartedly recommend it for those who want
| something between emulation and real hardware.
|
| The Amiga core is fun, AGA, 2 MB chip, 384 MB fast. It supports
| hard disk images, so you can do a hard disk based Workbench
| installation and load games and demos practically instantly
| (and safely exit to Workbench) using WHDLoad.
|
| Arcade cores are fun as well. Just like in childhood, but less
| hungry for quarters. :) Recently played with arcade Gauntlet
| core a bit for example.
| xigency wrote:
| Interesting.
|
| I have a couple of FPGA boards on their way to me in the mail
| which I intend to use for some homebrew video game projects.
| Besides getting the development environment working, it can be
| tricky outputting video from an FPGA because of the precise
| timing involved. I will have to look through these resources to
| see if there are any good tricks to use here.
| phendrenad2 wrote:
| MiSTer is an amazing phenomenon. The MiSTer itself is just an
| Intel FPGA devkit, which many believe to be sold at a loss
| (because it's a training tool and not Intel's main source of FPGA
| revenue). The amazing thing is the aftermarket for addons. There
| are many possible combinations of addon boards that add RAM with
| deterministic latency, USB hubs, cooling fans, cases, retro
| controller ports, etc. All custom-made for this ecosystem.
| tinybear1 wrote:
| It is definitely being sold at a loss, the Cyclone V SOC being
| used costs more than the entire development board.[0] I wonder
| if Intel will ever take notice due to MiSTer's growing
| popularity and quit subsidizing the board.
|
| [0]
| https://www.digikey.com/en/products/detail/intel/5CSEBA6U23I...
|
| Edit: it was erroneous of me to state the board was being sold
| at a loss, rather I meant that the board was definitely being
| subsidized by companies such as Intel and their partners
| such as Panasonic. My mistake. I also wasn't meaning to convey
| that the consumer Digikey pricing was the same as the large
| volume manufacturers such as Terasic. Rather I meant to
| demonstrate and agree with the OP on the astounding situation
| that MiSTer currently exists in, owing to the lack of economic
| viability for someone to produce a low volume commercial FPGA
| emulation machine for a niche audience without any
| subsidization.
| tverbeure wrote:
| There is absolutely no way they're sold at a loss. Your
| DigiKey price of $245 proves this, because a factor of 10 is
| a good starting point as a ratio between volume and one-off
| DigiKey pricing of any type of complex silicon.
|
| A better way to approach this is as follows: what's the die
| size of an FPGA like this? What's the production cost of the
| die? Then check the historic gross margin percentage of FPGA
| companies. Xilinx is around 68%, and that includes high-end
| products which carry the highest markups, unlike this cookie
| cutter thing.
|
| That should give you a good ballpark number.
|
| DigiKey charges what they do because nobody else is willing
| to sell these things in low volume, and they have very high
| inventory costs.
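|
| The rules of thumb above can be turned into a quick back-of-the-
| envelope calculation. This is only a sketch: the factor of 10 and
| the 68% margin are the figures quoted in this thread, the $135 board
| price is the one cited elsewhere in the thread, and applying the
| margin to the estimated volume price is an assumption made here for
| illustration, not a known cost for this chip.

```python
# Back-of-the-envelope estimate using only figures quoted in this thread.
DIGIKEY_UNIT_PRICE = 245.0   # one-off DigiKey price cited above (USD)
VOLUME_RATIO = 10.0          # rule-of-thumb ratio suggested above
GROSS_MARGIN = 0.68          # Xilinx historic gross margin cited above
BOARD_PRICE = 135.0          # DE10-Nano price cited elsewhere in the thread


def volume_price(one_off_price: float, ratio: float) -> float:
    """Estimated per-chip price at volume."""
    return one_off_price / ratio


def production_cost(selling_price: float, gross_margin: float) -> float:
    """Cost implied by a gross margin: margin = (price - cost) / price."""
    return selling_price * (1.0 - gross_margin)


chip_price = volume_price(DIGIKEY_UNIT_PRICE, VOLUME_RATIO)
chip_cost = production_cost(chip_price, GROSS_MARGIN)  # assumption, see above
print(f"estimated volume price: ${chip_price:.2f}")
print(f"implied production cost: ${chip_cost:.2f}")
print(f"chip share of board price: {chip_price / BOARD_PRICE:.0%}")
```

| Under these assumptions the FPGA would account for well under a
| quarter of the board price, which is consistent with the claim that
| the board does not need to be sold at a loss.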
| andrewcchen wrote:
| Digikey pricing is not indicative of actual volume pricing,
| especially for FPGAs where they are often many times
| overpriced when buying from distributors. I doubt the board
| is sold at a loss, probably sold at a small profit, not
| that it's really significant for a low volume dev board.
| tverbeure wrote:
| The price of a DE10-Nano is $135 ($115 for academic use.)
|
| Anyone who thinks that Terasic sells these at a loss doesn't
| have a clue about volume pricing of FPGAs. And as a special
| Intel partner, there's little doubt that Terasic has access to
| this kind of pricing.
| Jorge1o1 wrote:
| I don't really know much about game emulation so I was curious
| about what differentiates this FPGA game project vs traditional
| CPU emulation.
|
| From their github page [1]:
|
| >Traditional emulators on CPUs execute code sequentially. This is
| a tricky method of emulation because real hardware has many chips
| and all of them work in parallel...This requires a lot of CPU
| power to emulate even an old and slow retro computer. Sometimes
| even a modern CPU working at 100 times the speed of the retro
| computer is not enough, so the emulator has to use approximation,
| skip emulation of some less important parts, or assume some
| standard work of the emulated system without extraordinary usage.
|
| > FPGA doesn't need high frequencies to emulate retro computers;
| it works at much lower frequencies than traditional emulators
| require. Since everything in FPGA works in parallel, it is no
| problem to handle any possible usage of the emulated system.
|
| [1] https://github.com/MiSTer-devel/Main_MiSTer/wiki/Why-FPGA
|
| (Edited for formatting)
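|
| To make the "sequential" point in the quote concrete, here is a
| minimal sketch of how a software emulator might step several chips
| in lockstep. The Chip class and the three chip names are
| hypothetical and stand in for real CPU, video, and audio models; the
| only thing the sketch shows is that the host does one step per chip
| per cycle, whereas hardware advances every chip on the same clock
| edge.

```python
from dataclasses import dataclass


@dataclass
class Chip:
    """A hypothetical chip that advances one step per master clock tick."""
    name: str
    ticks: int = 0

    def tick(self) -> None:
        self.ticks += 1


def run_lockstep(chips: list[Chip], master_cycles: int) -> int:
    """Step every chip once per master cycle, one after another.

    Returns the number of sequential host steps, which grows with both
    the cycle count and the number of chips being modelled.
    """
    host_steps = 0
    for _ in range(master_cycles):
        for chip in chips:
            chip.tick()
            host_steps += 1
    return host_steps


chips = [Chip("cpu"), Chip("video"), Chip("audio")]
steps = run_lockstep(chips, master_cycles=1000)
print(steps)  # 3 chips x 1000 cycles = 3000 sequential host steps
```

| Real emulators use cleverer schedulers than this, but the cost of
| keeping chips in sync at fine granularity is the reason accuracy
| tends to be expensive on a CPU.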
| TillE wrote:
| byuu wrote a good article about this, unfortunately it's no
| longer available, but basically it should be self-evident that
| there's nothing _inherently_ more accurate about hardware
| emulation.
|
| If you've actually decapped the original chips and duplicated
| them exactly in an FPGA, that's pretty cool. But otherwise it's
| just another approximation. The lower power requirements are
| nice, of course.
| zokier wrote:
| I think the big differentiator is that it is easier to get
| predictable latencies with FPGA where you control almost
| everything, compared to general-purpose PC which is not
| really that well optimized for hard real-time operation. So I
| believe "race the beam" style things are more easily
| accomplished with FPGAs, and also having tight audio-video
| sync. Although the PC emulation scene has been also doing
| some fairly incredible things too.
| valec wrote:
| you can find it here https://archive.is/fWosI
| emodendroket wrote:
| That's true, but I think it's also true that you could trim a
| bit more lag if you do it well.
| tyingq wrote:
| Quite a few of the FPGA soft cores related to 8 bit gaming
| are reverse engineered from either schematics, decapped
| chips, or both. Or they take pains to at least use the same
| number of cycles for each instruction, etc.
| mbalyuzi wrote:
| Actually decapping the original chips is very much a thing.
| See for example Chris Smith's work mapping out the innards of
| the ZX Spectrum ULA -
| http://www.zxdesign.info/book/insideULA.shtml .
| near wrote:
| It is, but these cores are almost exclusively not being
| done that way. Not yet at least. I hope that they will be,
| that would be really awesome. I paid $1200 last year for
| the SNES PPUs to be decapped for this purpose, but it's a
| truly enormous undertaking to map out those chips and then
| recreate it in Verilog. You're talking thousands of hours
| of work per chip. If anyone reading this is able to help
| with that effort, please do let me know, we could really
| use the help.
| tediousdemise wrote:
| By decapped, do you mean delidded?
|
| Theoretically it would be possible to automate this with
| a couple things:
|
| - USB electron microscope to image the transistor
| topology
|
| - CV lib to identify connections and generate
| corresponding Verilog code
| klodolph wrote:
| "Decapping" is a more intense version of delidding where
| you use chemical agents or something similarly extreme
| (laser, plasma, milling) to remove the package (ceramic,
| plastic).
|
| My understanding is that there are people who do it often
| enough that it is automated in the way you describe, but
| you still need someone with a lot of skill to spend
| serious time on it. Computer vision works wonders but
| there are errors which must be identified and fixed.
|
| A lot of the chips people care about can just be done
| optically, no electron microscope needed.
| tediousdemise wrote:
| Ah, that's a good distinction. I'd be pretty scared of
| damaging the hardware by doing that, but I'm sure there
| are some really experienced folks out there that would
| appreciate the hardware donation.
| FPGAhacker wrote:
| Not that this is necessarily helpful to you in the short
| term, but it strikes me as a good problem for machine
| learning (going from die pictures to transistor
| schematic.)
| bcrl wrote:
| That's exactly what's happening. There are loads of projects
| going on right now decapping old chips and reverse
| engineering them. From old CPUs like the 6502 to the Amiga
| Alice chip. It's just a matter of time before most of the
| retro systems are fully reverse engineered and documented.
| tediousdemise wrote:
| On FPGAs (depending on the hardware mapping), you get the
| benefit of lower latency. I consider this to be timing
| accuracy.
|
| Say you have two implementations of an LED controlled by a
| switch: one which uses an FPGA and one which uses a
| microcontroller. The uC implementation must continuously poll
| peripherals connected to its GPIO pins at a set frequency; it
| must check the state of the switch, and then change the state
| of the LED. The FPGA, on the other hand, _physically_ wires
| the switch to the LED; there is no lag when the state of the
| switch changes.
|
| The FPGA implementation can be scaled to connect however many
| additional lights and switches you want (limited by the size
| of the fabric), with zero overhead lag. This is the
| parallelization benefit of FPGAs that you may hear about. For
| the uC implementation, you must add additional switches and
| lights to the polling loop, which brings down performance in
| linear time, O(n). This is the drawback of sequential
| processing.
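|
| A toy model of the polling loop described above, as a sketch only.
| The figure of 4 cycles per check is made up for illustration, and
| the model ignores interrupts, which real microcontrollers can also
| use to react to pin changes.

```python
def worst_case_poll_latency(num_switches: int, cycles_per_check: int) -> int:
    """Worst-case cycles before a switch change reaches its LED.

    The loop visits switches one at a time, so a switch that changes
    just after being checked waits for a full pass over all of them.
    """
    return num_switches * cycles_per_check


def poll_once(switches: list[bool], leds: list[bool]) -> None:
    """One pass of the polling loop: copy each switch state to its LED."""
    for i, state in enumerate(switches):
        leds[i] = state


switches = [False, True, False, True]
leds = [False] * len(switches)
poll_once(switches, leds)
print(leds)

# Latency grows linearly with the number of switches in the loop.
for n in (1, 10, 100):
    print(n, worst_case_poll_latency(n, cycles_per_check=4))
```

| In the FPGA version each switch-to-LED path is its own wire, so
| there is no equivalent of this loop to grow.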
| klodolph wrote:
| Most game consoles don't do any of this, though. The
| gamepad is polled by software.
|
| On the NES and SNES, the buttons are connected to a shift
| register (e.g. 4021). The CPU triggers a latch and then
| reads out the shift register one bit at a time.
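|
| The latch-then-shift sequence can be sketched as a small model. This
| is not real NES code: the ShiftRegisterPad class is hypothetical,
| signal polarity is ignored, and returning 1 once the register is
| empty is a choice made for this model. The button order is the one
| used by the standard NES pad.

```python
NES_BUTTON_ORDER = ["A", "B", "Select", "Start", "Up", "Down", "Left", "Right"]


class ShiftRegisterPad:
    """Toy model of a parallel-in, serial-out gamepad (4021-style)."""

    def __init__(self) -> None:
        self.pressed: set[str] = set()
        self._bits: list[int] = []

    def latch(self) -> None:
        """Capture all button states at once (the parallel load)."""
        self._bits = [1 if b in self.pressed else 0 for b in NES_BUTTON_ORDER]

    def read_bit(self) -> int:
        """Shift out the next button bit; the order is fixed by the wiring."""
        # Model choice: reads past the eighth bit return 1.
        return self._bits.pop(0) if self._bits else 1


def read_pad(pad: ShiftRegisterPad) -> dict[str, bool]:
    """What the console's software does: latch once, then read 8 bits."""
    pad.latch()
    return {name: bool(pad.read_bit()) for name in NES_BUTTON_ORDER}


pad = ShiftRegisterPad()
pad.pressed = {"A", "Right"}
state = read_pad(pad)
print([name for name, down in state.items() if down])
```

| The point is that the game only sees the buttons when its own code
| asks for them, so input is sampled by software on the original
| hardware too.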
| mikepurvis wrote:
| This would be less about a user peripheral like the
| gamepad (which is obviously going to be read out exactly
| once per frame anyway) and more about getting subtle
| interactions between the CPU, memory, and specialized
| systems for graphics/audio correct. And not just correct
| after thousands of hours of work to smoke out the exact
| sources of specific title bugs, but correct essentially
| for free.
|
| See for example the tale of an absolutely wild mGBA
| investigation that was posted here a while ago:
|
| "What happens if an interrupt gets raised between
| prefetch and the data load? Will it start prefetching the
| interrupt vector before the invalid memory access? I
| quickly mocked this up in mGBA, turned on interrupts in
| the test ROM, and sure enough it broke out of the loop.
| So I tried the same test ROM on hardware and...it did not
| break out of the loop. So there goes that theory.
| Eventually I realized something. You saw that asterisk
| earlier I'm sure, so yes, there is one thing that can
| happen in between prefetch and the memory access, but
| only if the memory bus gets queried by something other
| than the CPU between the prefetch and invalid memory
| access."
|
| https://mgba.io/2020/01/25/infinite-loop-holy-grail/
| djmips wrote:
| This was given as an example, not for you to straw man
| about the gamepad.
| rtkwe wrote:
| There was a good article from Ars Technica a decade ago that
| pointed out why you need so much more power to get perfect
| emulation. To get exact emulation takes a lot of power because
| there are a few games which use odd tricks that are hard to
| document and precisely reimplement in software. FPGA emulation
| gets around that by more directly emulating the hardware.
|
| https://arstechnica.com/gaming/2011/08/accuracy-takes-power-...
| dang wrote:
| Related thread from 2018:
|
| _MiSTer: Run Amiga, SNES, NES and Genesis on an FPGA_ -
| https://news.ycombinator.com/item?id=18721594 - Dec 2018 (30
| comments)
| hyperpl wrote:
| I'd really like to see a portable/handheld leverage this
| technology for on-the-go gaming.
| craigjb wrote:
| I built the Gameslab around this concept, but haven't worked on
| it much lately.
|
| https://craigjb.com/2019/11/26/gameslab-overview/
| tediousdemise wrote:
| The Analogue Pocket[1] is exactly this (albeit proprietary).
| Out of the box it recreates GB, GBC, and GBA using the Altera
| Cyclone-V platform.
|
| [1] https://www.analogue.co/pocket
| drewblaisdell wrote:
| I wonder, why is there no DIY Analogue Pocket-style MiSTer
| project? Is the DE10-Nano too large or inefficient for this?
| jamespo wrote:
| The limited market is probably covered with Odroid Go /
| GPD XD / RG350M etc. Mister leverages an off the shelf FPGA
| board that would require a lot more work in a handheld
| form.
| tediousdemise wrote:
| I'd reckon it's the same reason that there isn't much of a
| custom laptop scene. The open ended nature of stuffing a
| screen, battery, and input peripherals into a chassis seems
| an order of magnitude more difficult than just making a
| headless box to plug into your TV.
|
| But with some effort, it would be awesome.
| jonny_eh wrote:
| Physical design is also a lot more important. Getting
| "feel" just right is very hard and expensive, especially
| when it comes to game controllers.
| duskwuff wrote:
| The DE10-Nano itself is a bit large for a handheld device,
| and hasn't been optimized for power consumption. (It's
| designed as a development board, not as a component of a
| finished product.) There's nothing stopping someone from
| using the Cyclone-V SoC in a handheld device, though.
| GekkePrutser wrote:
| This isn't really new, right? I've heard of this years ago.
|
| But it is an amazing project. Instead of emulating, they actually
| rebuilt the old custom ICs (which 8-bit computers were full of)
| in an FPGA. Really impressive.
| jonny_eh wrote:
| Old projects get reshared many times. It's always new to
| someone.
| tediousdemise wrote:
| Yeah, it really is an amazing application for FPGAs--preserving
| computing and gaming history. The list of cores available for
| MiSTer is simply staggering:
|
| > Computers - Classic
|
| * Acorn Archimedes * Acorn Atom * Alice MC10 * Altair 8800 *
| Amiga * Amstrad CPC 6128 * Amstrad PCW * ao486 (PC 486) *
| Apogee * Apple I * Apple II+ * Apple Macintosh Plus * Aquarius
| * Atari 800XL * Atari ST/STe * BBC Micro B,Master * BK0011M *
| Color Computer 2, Dragon 32 * Commodore 16, Plus/4 * Commodore
| 64, Ultimax * Commodore PET * Commodore VIC-20 * DEC PDP-1 *
| EDSAC * Galaksija * Jupiter Ace * Laser 310 * MSX * MultiComp *
| Orao * Oric 1 & Atmos * SAM Coupe * Sharp MZ Series * Sinclair
| QL * Specialist/MX * TI-99/4A * TRS-80 Model 1 * TSConf *
| Vector 06C * X68000 * ZX Spectrum * ZX Spectrum Next * ZX81
|
| > Consoles - Classic
|
| * Astrocade * Atari 2600 * Atari 5200 * Atari Lynx * AY-3-8500
| * ColecoVision, SG-1000 * Gameboy, Gameboy Color * Gameboy
| Advance * Genesis/Megadrive * SMS, Game Gear * MegaCD * NeoGeo
| * NES * Odyssey2 * SNES * TurboGrafx 16 / PC Engine * Vectrex
|
| > Other Systems
|
| * Arduboy * Chess * CHIP-8 * Epoch Galaxy II * Flappy Bird *
| Game of Life * TomyTronic Scramble
| timbit42 wrote:
| I'm still waiting for the KENBAK-1 core.
| tediousdemise wrote:
| Is there good documentation or ICDs out there that
| adequately describe the architecture? Looks like there's
| only 50 that were ever made, and only 14 believed to exist
| today.
| zokier wrote:
| http://kenbakkit.com/manuals.html
|
| Seems pretty well documented. Considering the simplicity
| of the computer, feels like it would be a relatively easy
| project to get to MiST
| pomian wrote:
| Interesting project would be to dig out some old cassettes
| from, let's say, commodore 64. Try to load them into a
| present day computer by patching wires/cables? - and see if
| they run in this system. I remember writing for example: a
| mining program, to calculate, overburden, volume and tonnage,
| at different slopes, different rock types, etc. The science
| behind the calculations is still valid, but we could likely
| improve load times, and calculating times.
| near wrote:
| It is indeed an amazing project, especially its open source
| nature. It provides some impressive power savings and latency
| reductions that are very hard to match with general purpose
| CPUs.
|
| But in most cases, it is emulation, as the lead developer will
| attest.
|
| https://github.com/MiSTer-devel/Main_MiSTer/wiki/Why-FPGA
|
| "From my point of view, if the FPGA code is based on the
| circuitry of real hardware (along with the usual tweaks for
| FPGA compatibility), then it should be called replication.
| Anything else is emulation, since it uses different kinds of
| approximation to meet the same objectives. Currently, it's hard
| to find a core that can truly be called a replica - most cores
| are based on more-or-less functional recreations rather than
| true circuit recreation. The most widely used CPU cores - the
| Z80 (T80) and MC68000 (TG68K) - are pure functional emulations,
| not replications. So it's okay to call FPGA cores emulators,
| unless they are proven to be replicas."
|
| But there's nothing wrong with emulation for preservation,
| until we get to a point where we can wide-scale clone these
| older chips down to the transistor level through analysis of
| delayered decap scans. And even then, emulation will be useful
| for artificial enhancements as well as for understanding how
| all those transistors actually worked at a higher level.
|
| It's also not a total solution: by taking many more transistors
| to programmatically simulate just one, it limits the maximum
| scale and frequency of what it can support. N64/PS1/Saturn has
| not yet been fully supported and is still theoretical, but
| likely, to be possible. Going beyond that is not possible at
| this time.
|
| Software emulation and FPGA devices should be seen as
| complementary approaches, rather than competitive. The
| developers of each often work together, and new knowledge is
| mutually beneficial.
| floatboth wrote:
| Well, yeah, it's not replication if it's not an exact
| hardware replica, but the word "emulation" has very
| "software" connotations. I guess let's call it.. recreation?
| (That word is even in the quote above!)
| someperson wrote:
| "FPGA re-implementation" may be a better term
| jamespo wrote:
| So it's not perfect but it's better than emulators...
| near wrote:
| In latency and power usage, yes. In compatibility and
| accuracy, no. Both are Turing complete, so there's nothing
| you can do with one that you can't do with the other.
|
| If you take the SNES core, my software emulator has 100%
| compatibility and no known bugs, and synchronizes all
| components at the raw clock cycle level. It also mitigates
| most of the latency concern through a technique known as
| run-ahead. But it does require more power to do this.
| stormbrew wrote:
| I'm really curious where you got "better" out of the quoted
| text. Because it's not there or implied, but people keep
| reading this into anything about fpga recreations of chips.
| There's nothing inherently better about doing emulation on
| an fpga or a cpu, other than basically the amount of
| electricity involved in doing it.
|
| But people keep presuming an improved accuracy that there's
| no basis for.
| emodendroket wrote:
| Probably the marketing copy for Super NT and similar
| products... harder to get people to part with hundreds of
| dollars if your pitch is "lower power draw and reduced
| input delay"
| cmrdporcupine wrote:
| Lower latency is definitely a thing. With FPGA it's
| possible to 'chase the beam' like the original hardware,
| and have much reduced input latency from devices, etc.
| With an emulator you're going to be fighting the OS and
| the frameworks you built on top of. Even if you go "bare
| metal" (like my friend's BMC64 project which runs a C64
| emulator like a unikernel on the RPi with no OS) you are
| still dealing with hardware built for usage patterns very
| different from the classic systems. You're always going
| to be one or more frames behind.
| near wrote:
| That is true. There are however techniques software
| emulators can use like run-ahead that can get you lower
| latency than even the original hardware on a PC:
| https://near.sh/articles/input/run-ahead
|
| The caveat is that it doesn't _always_ work, and it makes
| the power requirements even more unbalanced. Some might
| also see it as a form of cheating to go below the
| original game's latency. If you want to match the
| original game's latency precisely, FPGAs are the way to
| go right now for sure.
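|
| A toy sketch of the general run-ahead idea, not the implementation
| from the linked article. ToyGame and its two-frame internal input
| lag are invented for illustration: each real frame is emulated, the
| state is saved, a few speculative frames are run with the same
| input held, the last of those is shown, and the state is rolled
| back.

```python
class ToyGame:
    """Hypothetical game whose input takes LAG frames to reach the screen."""

    LAG = 2

    def __init__(self) -> None:
        self.x = 0
        self.pending = [0] * self.LAG  # inputs waiting to take effect

    def step(self, move: int) -> int:
        """Advance one frame and return the 'picture' (here, just x)."""
        self.pending.append(move)
        self.x += self.pending.pop(0)
        return self.x

    def save_state(self) -> tuple[int, list[int]]:
        return (self.x, list(self.pending))

    def load_state(self, state: tuple[int, list[int]]) -> None:
        self.x, pending = state
        self.pending = list(pending)


def frame_with_run_ahead(game: ToyGame, move: int, frames_ahead: int) -> int:
    """Emulate one real frame, then peek frames_ahead frames into the future."""
    shown = game.step(move)        # the real frame
    state = game.save_state()
    for _ in range(frames_ahead):
        shown = game.step(move)    # speculative frames, same input held
    game.load_state(state)         # roll back to the real frame
    return shown


def play(inputs: list[int], frames_ahead: int) -> list[int]:
    game = ToyGame()
    return [frame_with_run_ahead(game, m, frames_ahead) for m in inputs]


inputs = [1, 0, 0, 0]              # press "right" on the first frame only
plain = play(inputs, frames_ahead=0)
ahead = play(inputs, frames_ahead=2)
print(plain)
print(ahead)
```

| With run-ahead set to the game's internal lag, the press shows up on
| the frame it was made. The cost is visible in the loop: every
| displayed frame needs 1 + frames_ahead emulated frames, which is the
| extra power requirement mentioned above.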
| tediousdemise wrote:
| Run-ahead seems pretty cool, great technical write up.
| How would you compare this to the feature called frame-
| skipping that I often see implemented in software
| emulators?
| mschuster91 wrote:
| > It's also not a total solution: by taking many more
| transistors to programmatically simulate just one, it limits
| the maximum scale and frequency of what it can support.
| N64/PS1/Saturn has not yet been fully supported and is still
| theoretical, but likely, to be possible. Going beyond that is
| not possible at this time.
|
| The limiting factor here is the amount of stuff you can throw
| into a single FPGA, correct?
|
| So in theory, shouldn't it be possible to tie a bunch of
| FPGAs together, with two beefy ones being responsible for
| replicating CPU / GPU functionality, a couple smaller ones
| for sound and other "helper" processors, and some bog-
| standard ARM SoC to provide the bitstreams to the FPGAs and
| emulate storage (game cartridges, save cards) and input
| elements (mainly "modern" controllers)?
| near wrote:
| There's both a cost and a speed barrier to it. FPGAs are
| often used to design, simulate, and test modern circuits at
| sub-realtime speeds. No amount of FPGAs will get you a PS2
| emulator at playable speeds right now, let alone a
| PS3/Switch emulator. PCs can do that today by taking
| shortcuts such as dynamic recompilation and idle loop
| skipping.
| vardump wrote:
| Hmm... looking at the frequencies and gate counts, I
| think PS2 is well within realm of possibility to run on a
| not-so-cheap FPGA (or several). But PS3 generation
| consoles definitely not.
| duskwuff wrote:
| > The limiting factor here is the amount of stuff you can
| throw into a single FPGA, correct?
|
| And the speed that you can get your design to run at.
| Something like the Game Cube (PPC750 @ 485 MHz) would be
| difficult to implement in an FPGA, for example.
| GekkePrutser wrote:
| Ah ok I wasn't aware of this. I thought it was spot on.
|
| And yeah I hope we can easily order small batches of ICs (at
| big pitch of course) in a few years, in a similar way to how
| creating PCBs has become so simple now.
|
| I mean I remember how much of a PITA it was in the 80s.
| Drawing on overhead sheets. All the acids and other
| chemicals. Drilling. And now we get super-accurate 10x10cm
| boards dual-layer, drilled, soldermasked and silkscreened for
| a buck a pop with a minimum of 10. Wow. I really hope this
| trend continues down to the scale of ICs (or that FPGAs
| simply get better/easier).
|
| By the way, emulating a CPU is pretty easy and very accurate
| anyway. The big problem with accurate emulation is with some
| of the peripheral ICs which used hard to emulate stuff like
| analog sound generators.
___________________________________________________________________
(page generated 2021-04-11 23:00 UTC)