[HN Gopher] Detecting a PS2 Emulator: When 1*X does not equal X
       ___________________________________________________________________
        
       Detecting a PS2 Emulator: When 1*X does not equal X
        
       Author : fobes
       Score  : 153 points
       Date   : 2024-06-08 16:05 UTC (6 hours ago)
        
 (HTM) web link (fobes.dev)
 (TXT) w3m dump (fobes.dev)
        
       | TillE wrote:
       | Interesting example of the kind of thing it's probably not worth
       | caring about in software emulation. Emulating the bug would be
       | considerably slower.
       | 
       | Some day replicating the PS2 on an FPGA will be feasible, and
       | then figuring out how this worked will be a fun project for
       | someone.
        
         | saagarjha wrote:
         | Depends on whether the bug breaks games or not.
        
         | jonhohle wrote:
         | FPGA implementations are often implemented based on code or
         | documentation from software emulation projects. An FPGA version
         | of a PS2 has no guarantee of not implementing the same or
         | similar bug.
        
           | crote wrote:
           | The point here is that it is actually _viable_ to reimplement
           | such bugs without incurring significant performance
           | penalties.
           | 
           | A software emulator has to be able to execute a single PS2
           | instruction in the same amount of wall time as it'd take on
           | the original hardware. With a regular multiplication that's
           | fairly easy: x86 also has multiplication, so you can do a 1:1
           | translation and be fairly certain it's within your time
           | budget. With a _bugged_ multiplication you need to do a
           | regular x86 multiplication, and wrap that in a few dozen
           | other instructions to add the buggy behaviour to it. There's
           | a pretty decent chance it's simply too expensive!
           | 
           | When you're writing an FPGA emulator you are able to recreate
           | the buggy multiplication directly in hardware. There's no
           | additional wrapping needed, so (beyond figuring out intended
           | behaviour) it's not any more costly than emulating a non-
           | buggy multiplication. It's far easier to do a cycle-accurate
           | emulation because you have direct control over the
           | transistors!
        
             | gamepsys wrote:
             | > There's a pretty decent chance it's simply too expensive!
             | 
             | I doubt the 24-year-old PS2, with its 300 MHz RISC CPU and
             | 32 MB of RAM, has an instruction set too expensive to
             | replicate cycle-perfectly.
        
               | trealira wrote:
               | Higan is a cycle-perfect SNES emulator, and it's very
               | single-core CPU-intensive. This is what the FAQ [0] says:
               | 
               |  _Full-speed emulation for the Super Famicom base unit
               | requires an Intel Core 2 Duo (or AMD equivalent), full-
               | speed for games with the SuperFX chip requires an Intel
               | Ivy Bridge (or equivalent), full-speed for the wireframe
               | animations in Mega Man X2 requires an even faster
               | computer. Low-power CPUs like ARM chips, or Intel Atom
               | and Celeron CPUs generally aren't fast enough to emulate
               | the Super Famicom with higan, although other emulated
               | consoles may work._
               | 
               | Work can't be split across cores (according to the FAQ)
               | because that would compromise the accuracy of the timing.
               | 
               | It may be that the PS2 has similar problems while being
               | more powerful than the SNES.
               | 
               | [0]: https://higan.readthedocs.io/en/v104b/faq/
        
         | fobes wrote:
         | Fixing this bug would be part of fixing a bunch of other
         | floating point bugs, more specifically rounding and clamping.
         | 
         | Yes, software floating point would be slower, but the general
         | solution would probably follow the PS4's PS2 emulator, where
         | each game can have whitelisted sections of code that take the
         | software floating point path.
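A minimal sketch of what a software floating point path for the "rounding and clamping" part might look like. It assumes (not stated in the thread) that the target behaviour is single-precision multiplication that rounds toward zero and clamps overflow to the largest finite value instead of producing infinity. The helper names are hypothetical, inputs are assumed finite, and this does not reproduce the 1*X bug itself.

```python
import struct
from fractions import Fraction

MAX_F32_BITS = 0x7F7FFFFF  # bit pattern of the largest finite single-precision value


def f32_to_bits(x: float) -> int:
    """Round x to single precision (round-to-nearest) and return its raw bits."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


MAX_F32 = bits_to_f32(MAX_F32_BITS)


def mul_rtz_clamped(a: float, b: float) -> float:
    """Multiply two finite single-precision values, rounding toward zero and
    clamping overflow to +/-MAX_F32 (assumed behaviour, see lead-in)."""
    exact = Fraction(a) * Fraction(b)  # exact rational product, no rounding yet
    if exact == 0:
        return 0.0  # the sign of zero is not preserved in this sketch
    sign = -1.0 if exact < 0 else 1.0
    mag = abs(exact)
    if mag >= Fraction(MAX_F32):
        return sign * MAX_F32  # clamp instead of overflowing to infinity
    bits = f32_to_bits(float(mag))  # nearest single-precision candidate
    if Fraction(bits_to_f32(bits)) > mag:
        bits -= 1  # candidate was rounded up, so step down one ULP
    return sign * bits_to_f32(bits)
```

The exact product, the comparison and the bit-level step are the "few dozen other instructions" a software emulator pays per multiply, which is why confining this path to whitelisted code sections is attractive.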
        
         | b3orn wrote:
         | Why would you want to emulate an old crappy MIPS CPU using a
         | relatively expensive FPGA? The whole idea of emulating old
         | consoles is to be independent of the hardware so you can play
         | your old games on your computer or phone.
        
           | fragmede wrote:
           | For you, maybe. For others, having a thing that does what
           | this other thing did, with no regard to cost, is a fun
           | adventure in and of itself.
        
             | b3orn wrote:
             | In that case I'm sure current FPGAs are already capable of
             | emulating the MIPS CPU of a PS2.
        
               | aeyes wrote:
               | The CPU, yes, but you want to emulate every piece of
               | hardware in the console: the audio chips, the GPU, the
               | way video memory works, and so on.
        
             | Agingcoder wrote:
             | I second that. I wrote an NES emulator twenty years ago
             | because it was fun, and not for any practical purpose. I
             | had no idea what I was doing, but I remember being in awe
             | of the NES after reading the detailed hardware spec (found
             | on Zophar's Domain, doc by yoshi if memory serves me
             | well). I promptly decided to write an emulator in whatever
             | language I was learning at that time.
             | 
             | The result was terrible, but I had tremendous fun!
        
           | crote wrote:
           | Because the implementation on your computer or phone behaves
           | _slightly_ differently from actual hardware, sometimes to the
           | point of being unplayable. If you can't get your hands on a
           | genuine working console, a cycle-accurate FPGA
           | implementation is the next best thing.
        
           | 7thpower wrote:
           | Often the games I'm excited to play run like shit and have
           | timing issues that are difficult to measure but feel wrong
           | and diminish the nostalgia.
           | 
           | Free time is scarce, so I'll gladly pay a couple hundred to
           | not have to spend time fiddling with settings only to
           | ultimately capitulate.
        
       | qbane wrote:
       | That is why emulation, when targeting 100% accuracy, is a
       | craft in our industry. Not only do you need to know each and
       | every quirk the original hardware/software has, but you also
       | need to replicate it, however peculiar it is. Then consider
       | the potential performance impact, as if that were not
       | challenging enough.
        
         | jsheard wrote:
         | Emulators have to be pragmatic about accuracy. When emulating
         | more modern systems it's generally not feasible to target 100%
         | hardware accuracy _and_ usable performance, so they tend to
         | accept compromises which are technically deviations from the
         | real hardware but usually don't make any observable difference
         | in practice. Anything that uses a JIT recompiler is never going
         | to be perfectly cycle-accurate to the original hardware but it
         | usually doesn't matter unless the game code is deliberately
         | constructed to break emulators.
         | 
         | Dolphin had to reckon with that balance when a few commercial
         | Wii games included such anti-emulator code, which abused
         | details of the real Wii CPU's cache behavior. Technically they
         | _could_ have emulated the real CPU cache to make those games
         | work seamlessly, but the performance overhead (likely a 10x
         | slowdown) would make them unplayable, so they hacked around it
         | instead.
         | 
         | https://dolphin-emu.org/blog/2017/02/01/dolphin-progress-rep...
        
           | bobmcnamara wrote:
           | I once wrote something that would hard lock Cortex-A8 but not
           | the Cortex-A9 we shipped on. To my knowledge, nobody tracked
           | down why our app, once exfiltrated from our device, would
           | crash slightly older phones.
        
             | pm215 wrote:
             | Were you exploiting an A8 erratum, or detecting "this is an
             | A8" somehow and then making it barf in a less
             | processor-specific way?
        
             | AtlasBarfed wrote:
             | So they'll patch around it.
             | 
             | You're just making your software worthless in the long run
             | for some value probably less than 5 years, or creating a
             | fun problem for an emu hacker.
             | 
             | Most of the significant monetary losses to piracy don't
             | come from emulation but from the chippers/mods that bypass
             | cloned media copy protection.
             | 
             | Which emulator authors have a lot more control over in
             | bypassing
        
               | pm215 wrote:
               | If it hardlocked an A8 but not an A9, chances are very
               | high that an emulator would run it with no problem,
               | because nobody deliberately tries to emulate the kind of
               | CPU bug that lets an app hardlock the CPU. GP appears to
               | have been interested in deterring people from running
               | their code on non-authorised real hardware at the time,
               | not targeting emulator users.
        
               | mikestew wrote:
               | You assume an anti-piracy attempt when GP, from my
               | reading, made no such statement. More of a mystery, but
               | who cares because the problem hardware wasn't what they
               | shipped on.
        
               | cwillu wrote:
               | They used the word exfiltrate, it's not a stretch.
        
           | FMecha wrote:
           | >but it usually doesn't matter
           | 
           | It does matter to some speedrunners, though, who want to
           | ensure their speedrun tech is reproducible on both emulators
           | and real hardware.
        
             | jsheard wrote:
             | That's true, the small differences between a pragmatic
             | "accurate enough" emulator and real hardware can matter for
             | speedrunners. The difference between real hardware running
             | at 60fps and a principled cycle-accurate emulator running
             | at <0.1fps would matter more, though.
             | 
             | For the SNES and earlier it's feasible to have exceptional
             | accuracy and still usable performance, but for anything
             | modern it's just not happening. Imagine trying to write a
             | cycle-accurate emulator core for a modern CPU with
             | instruction re-ordering, branch prediction, prefetching,
             | asynchronous memory, etc., never mind making it go fast.
        
               | FMecha wrote:
               | >For the SNES and earlier
               | 
               | I think the cutline can be moved to the original
               | PlayStation now.
               | 
               | >but for anything modern it's just not happening.
               | 
               | Which arguably explains a cultural rift in arcade
               | emulation circles. MAME's philosophy is about
               | cycle-accuracy, which might work for arcade hardware up
               | to early 3D systems, whether bespoke (such as Namco's
               | System 22) or console-derived (Namco's System 1x series,
               | which all derive from the original PlayStation
               | hardware). For newer arcade titles, which are just
               | beefed-up period PCs, that kind of emulation
               | (philosophy) would not suffice for gameplay.
        
           | masfuerte wrote:
           | > Anything that uses a JIT recompiler is never going to be
           | perfectly cycle-accurate to the original hardware
           | 
           | beebjit [1] is a cycle-accurate JIT-based emulator for the
           | BBC Micro. It can be done.
           | 
           | [1]: https://github.com/scarybeasts/beebjit
        
           | mardifoufs wrote:
           | I wonder how mainframe emulators (that sometimes are used to
           | run legacy, very critical software on modern hardware) manage
           | to do it. Do they go for full complete emulation? As in,
           | implementing the entire hardware in software?
        
       | mattbee wrote:
       | The simplest trick for detecting old ARM emulation (ISTR it was
       | used in some Game Boy Advance copy protection): store a
       | booby-trap instruction at PC+4 (i.e. the very next one). A real
       | ARM has a pipeline that fetches PC+8 while decoding PC+4 and
       | executing at PC, so the newly-stored instruction should have no
       | effect. An emulator (which didn't emulate the hardware
       | pipeline) would execute it.
       | 
       | edit: described in more detail here, among other
       | emulation-busting measures from 2004:
       | https://mgba.io//2014/12/28/classic-nes/
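The pipeline trick above can be illustrated with a toy model. This is only a sketch under stated assumptions: the three-instruction "machine", its `store`/`mark`/`halt` opcodes and the `run` function are all hypothetical, addresses are instruction indices rather than bytes, and there are no branches (so no pipeline flush is modelled).

```python
from collections import deque

# Toy program: instruction 0 overwrites instruction 1 (the "PC+4" slot)
# with a booby trap, then execution falls through to instruction 1.
PROGRAM = [
    ("store", 1, ("mark", "emulator detected")),
    ("mark", "real hardware"),
    ("halt",),
]


def run(program, pipeline_depth):
    """Execute the toy program. A depth of 3 models fetch/decode/execute,
    a depth of 1 models a naive emulator that fetches at execution time."""
    mem = list(program)
    queue = deque()  # instructions already fetched into the pipeline
    fetch_pc = 0
    trace = []
    while True:
        # Fill the pipeline before executing, as the hardware would.
        while len(queue) < pipeline_depth and fetch_pc < len(mem):
            queue.append(mem[fetch_pc])
            fetch_pc += 1
        if not queue:
            break
        op = queue.popleft()
        if op[0] == "store":
            # The write changes memory but not what was already fetched.
            _, addr, new_instr = op
            mem[addr] = new_instr
        elif op[0] == "mark":
            trace.append(op[1])
        elif op[0] == "halt":
            break
    return trace
```

With `run(PROGRAM, 3)` the original instruction is already in the queue when the store happens, so the trap never runs; with `run(PROGRAM, 1)` the trap is fetched after the store and executes.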
        
         | ithkuil wrote:
         | Some pipelined CPUs have retained compatibility with self-
         | modifying code and detect when you overwrite an instruction
         | that is on the pipeline and flush it.
         | 
         | X86 has that machinery although I'm not sure if they dropped it
         | eventually on the 64-bit variant.
        
         | ekidd wrote:
         | The Texas Instruments TMS320C40 digital signal processor had
         | even weirder pipeline issues:
         | 
         | - Branch delay slots
         | (https://en.wikipedia.org/wiki/Delay_slot), where one or more
         | instruction(s) _after_ a branch would be executed before the
         | branch actually occurred.
         | 
         | - Load delay slots, where values stored into registers weren't
         | guaranteed to appear until some later instruction. I believe
         | the value in the register was undefined for several cycles?
         | 
         | Writing tightly-optimized assembly code for these chips was
         | pretty horrible, sort of like playing an unusually tasteless
         | Zachtronics clone.
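For readers unfamiliar with branch delay slots, here is a rough toy interpreter showing the effect. It is a sketch only: the `say`/`jump`/`halt` opcodes and the `run` function are hypothetical, it does not model the TI DSP (or any real ISA), and it ignores jumps placed inside a delay slot.

```python
# Toy program: the instruction after the jump sits in the delay slot.
PROGRAM = [
    ("say", "before"),
    ("jump", 4),
    ("say", "delay slot"),
    ("say", "skipped"),
    ("say", "target"),
    ("halt",),
]


def run(program, delay_slots):
    """Execute the toy program; a jump takes effect only after
    `delay_slots` further instructions have executed."""
    trace = []
    pc = 0
    pending_target = None  # destination of a jump that has not landed yet
    slots_left = 0
    while 0 <= pc < len(program):
        op = program[pc]
        pc += 1
        if op[0] == "jump":
            if delay_slots == 0:
                pc = op[1]  # conventional branch: takes effect immediately
            else:
                pending_target, slots_left = op[1], delay_slots
            continue
        if op[0] == "halt":
            break
        trace.append(op[1])
        if pending_target is not None:
            slots_left -= 1
            if slots_left == 0:
                pc = pending_target  # the delayed jump finally lands
                pending_target = None
    return trace
```

`run(PROGRAM, 1)` executes the delay-slot instruction before reaching the target, while `run(PROGRAM, 0)` skips straight to it, which is the difference the programmer (or assembler) has to schedule around.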
        
       ___________________________________________________________________
       (page generated 2024-06-08 23:00 UTC)