[HN Gopher] Determinism in League of Legends: Implementation (2017)
       ___________________________________________________________________
        
       Determinism in League of Legends: Implementation (2017)
        
       Author : highfrequency
       Score  : 112 points
       Date   : 2024-09-30 18:46 UTC (4 days ago)
        
 (HTM) web link (technology.riotgames.com)
 (TXT) w3m dump (technology.riotgames.com)
        
       | SeanAnderson wrote:
       | "After consulting with several other Rioters, our team settled on
       | developing a new PRNG implementation based on XOR-SHIFT."
       | 
       | I wonder why?
        
         | bob1029 wrote:
         | Likely performance. You can do xor and shift operations
         | essentially for free on a modern pipelined CPU.
         | 
         | Other techniques, such as linear congruential generators,
         | require multiplication and division.
        
         | Joker_vD wrote:
         | "It's simple, fast, and has a great distribution, which means
         | it's more than good enough for gameplay."
         | 
         | Its distribution is not flawless, but I would agree that it's
         | good enough for gameplay.
        
           | DarthBatman wrote:
           | This! We don't use it for mission-critical crypto or the
           | like. It's there to figure out whether a bot goes left or
           | right. Given the enormous complexity of the League game state
           | and input surface area, it likely won't be a concern unless
           | someone devises a way to exploit it for RNG manipulation
           | (this seems unlikely.)
           | 
           | The reason we didn't use crand was just code ergonomics.
           | C-style rand has global state (by default) and we wanted our
           | own interface to be explicit in the code so you knew when you
           | were using a gameplay-impacting random number.
        
         | underscoresunde wrote:
         | underscores
        
         | o11c wrote:
         | The xorshift family is one of the top two RNGs in the modern
         | day (the other being PCG). Unfortunately, "choosing the ideal
         | member of the family" is nontrivial if you're trying to eke out
         | every bit of performance.
         | 
         | The problem with the xorshift family is that it doesn't
         | implement the nice auto-scaling (and thus auto-testing) that
         | PCG has. Re-parameterizing it means inventing whole new numbers
         | from scratch and hoping you plug them in correctly. Xorshift
         | also has a rare-but-severe "stuck near zero" problem.
         | 
         | The problem with the PCG family is that its trusted way of
         | scaling (nobody likes the `_k` variants) relies on 128-bit
         | multiplication (a modern member of the family at least reduces
         | that to a mixed 128x64 multiply), which remains notably slow,
         | unlike xorshift which only uses small integers.
         | 
         | (But I must emphasize: _all_ RNGs from other families either
         | have worse variants of these problems, or give up on
         | performance entirely. These concerns should be used only to
         | select your preferred tradeoff between the two, there is no
         | excuse for _ever_ using any other RNG family unless you need
         | cryptographic security)
        
         | TacticalCoder wrote:
         | Besides the other answers, could also be that they wanted to
         | make sure they had their hands on a fully deterministic
         | implementation across different hardware/software stacks (like
         | Windows and OS X).
         | 
         | As a sidenote on a project the size of LoL, the time, size and
         | cost of implementing a PRNG based on a known one is a rounding
         | error.
        
       | nightowl_games wrote:
       | > League of Legends does not use a fixed-step server clock, but
       | rather a measured frame delta time which we throttle to a target
       | frame rate. This makes the server more resilient to system-wide
       | performance spikes, but this pattern definitely complicated
       | determinism.
       | 
       | This could use some elaboration. I don't even know how it's
       | possible to have determinism without a fixed-step. I must
       | misunderstand what this means. Every simulation tick in league of
       | legends has gotta be the same duration, right? ie: 1.0 / 60.0?
        
         | tylerhou wrote:
         | You can still achieve determinism by treating the measured
         | delta-time as an input, and resimulating the game with recorded
         | delta-time. This may not be feasible on public servers, but is
         | definitely feasible for professional play where determinism is
         | important to rewind games due to bugs which affect competitive
         | integrity.
        
           | maest wrote:
           | They have so many bugs that there's a procedure for rewinding
           | the game during comp plays? Did I misunderstand?
        
             | pohuing wrote:
             | This is not unusual. Bugs happen and when there's a lot of
             | money on the line you need a rollback system.
        
               | maest wrote:
               | Interesting. I don't think I've seen this in any other
               | competitive game, which is why I'm surprised. But maybe
               | I'm just lacking information
        
               | mackman wrote:
               | I implemented a debug replay system for an American
               | Football game when I worked at Sony. The AI hierarchy and
               | all animation and physics state was recorded and game
               | could be rewound to figure out why an NPC made a poor
               | decision. Only needed to track state for the duration of
               | a single play. This was on PS2 which I think had 32MB of
               | memory but the dev systems had an extra 96MB which we
               | could use for this sort of thing.
        
               | chowells wrote:
               | Hell. Replay review in American Football and other sports
               | _is_ a form of rewinding and fixing bugs in officiating.
               | 
               | So it's certainly done in real life when money's on the
               | line.
        
               | nightowl_games wrote:
               | Starcraft 2 lets you resume a game from a replay.
        
               | tylerhou wrote:
               | In StarCraft II you can resume from replay (and maybe in
               | previous games as well), but I don't know if this has
               | ever actually been used due to bugs in the game engine
               | during competitive play.
        
             | IanGabes wrote:
             | They do! Its called Chronobreak. They have used it many
             | times in professional matches successfully, but it doesn't
             | work in 100% of scenarios.
        
           | nightowl_games wrote:
           | Yeah you _could_ but I fail to see why you would ever do
           | this. It's also not feasible to synchronize the client with
           | the server in real time with this input.
        
             | tylerhou wrote:
             | You don't need to synchronize the server and the client.
             | The point of determinism is to be able to reproduce game
             | states deterministically, not to have people play the video
             | game deterministically. Again, this is to "undo" bugs in
             | pro play (Chronobreak).
        
         | DarthBatman wrote:
         | Thought I heard my ears ringing... Blast from the past here.
         | 
         | That's the beauty part of recording server state and playing it
         | back -- we included a recording of whatever accurate integer-
         | based OS time I could get on each platform. On Windows, for
         | example, I record the raw values of QPC ticks and the tick rate
         | into the recording. Then on playback, I play back the clocks
         | themselves as if the hardware has that tick rate and tick count
         | are from the recorded computer. So if frame 10 took 33.3ms and
         | frame 11 took 66.7ms, we'd play them back that way.
         | 
         | Note that playbacks are almost never in real-time. The reason a
         | typical server frame takes 33ms is that we might be processing
         | for 5ms and then yielding the thread (sleeping) for the
         | remaining time during real-time gameplay. On playback, we don't
         | yield the thread, so we play back faster than real-time when
         | doing something like a live Chronobreak (disaster recovery.)
         | 
         | Strictly speaking, you are right. Let's say we had a game with
         | no player inputs (like a bots-only game.) It's possible that
         | the outcomes might vary between two games with the same random
         | seeds if they run different numbers of frames over the same
         | time period. That said, we've actually tried this a few times,
         | and our game code is surprisingly resilient to this -- only in
         | a few instances have we found issues where subtle variations in
         | frame time could impact player outcomes (and those are
         | generally corrected when we find them -- this is an esports
         | game after all.)
         | 
         | A lot of this comes down to tradeoffs. Yes we might have those
         | subtle issues, but if we fixed our timestep, then we'd have the
         | other problem of the game potentially going into slow-motion
         | during (the admittedly rare) periods of long frame times. In
         | practice, we decided time dilation is more negative for player
         | performance than loss of interactive resolution; humans are
         | great predictors of motion with limited information, but
         | they're not so great when the rules of localized reality bend.
         | 
         | Edit: typo
        
           | nightowl_games wrote:
           | > It's possible that the outcomes might vary between two
           | games
           | 
           | So just to clarify: in League, the 'dt' passed into each
           | 'simulation step' is NOT constant? Isnt this kinda crazy? In
           | your later articles, floating point imprecision is talked
           | about. Couldnt this variance in dt create odd behaviour and
           | ultimate contribute to weird things like 'character movement
           | speed' being not _exactly_ the same between games? (like
           | really small, but still...)
           | 
           | And beyond that, how does the client and the server
           | synchronize with each other if the frame #s represent
           | different positions in time? My mind is blown right now...
           | 
           | Note: I've worked on many networked games, and have written
           | rollback/resimulate/replay code. I don't really understand
           | _why_ League wouldnt use a fixed time step. Whats the
           | advantage? In our games, the rendering of course uses the
           | real dt passed in from the OS, but the simulation step is
           | always fixed. This means in our games during a replay, your
           | computer could render the outcome differently if your frame
           | rate is different, but the raw simulation is always the same.
           | 
           | For context, to show I at least have some idea what I'm
           | talking about, I made this replay system (and the game has
           | rollback multiplayer):
           | 
           | https://gooberdash.winterpixel.io/?play=5b74f7c0-8591-40dc-b.
           | ..
           | 
           | I havent played a lot of League, but I always assumed it
           | would use deterministic lock step networking (like it's
           | predecessor, Warcraft 3)
        
             | DarthBatman wrote:
             | We're recording the integer clocks though, and those don't
             | change between runs. While game code converts things like
             | (QPC ticks over tick-rate) to floating point, we don't sum
             | those numbers directly. Instead, we internally store times
             | as the raw integers then convert them to floats on-demand
             | (typically when an Update function is asking for
             | "elapsedTime" or a timer is asking for a "timeSpan" (like
             | time since the start of the game.))
             | 
             | LoL and TFT don't use a synchronized-simulation (lockstep,
             | sometimes called peer-to-peer) networking model. LoL is
             | Client-Server, meaning the replication is explicit and not
             | based purely on playing back client inputs. This gives us
             | more control over things like network visibility, LODs, and
             | latency compensation at a feature-by-feature level at the
             | cost of increased complexity. Most of the games I've built
             | over the years use this model and the LoL team is super
             | comfortable with it.
             | 
             | The GameClients are not deterministic in the way that the
             | GameServer is, though they're naturally closer to that
             | ideal since the GameServer itself is typically predictable.
             | 
             | Don't get me wrong, there's a time and place for lockstep
             | replication, and LoL probably could have gone that way. I
             | wasn't there when that direction was picked, but I would
             | have likely made the same choice as my predecessors,
             | knowing what I do about our approach to competitive
             | integrity.
        
               | doctorpangloss wrote:
               | All this stuff predates ECS and a fully specified
               | definition of what a live service continent spanning MOBA
               | is. All the tradeoffs make sense to me. The real question
               | is, would it have been possible to define an engine
               | solution that looks more like Overwatch, in the absence
               | of a fully specified game? I feel like that is ECS's
               | greatest weakness.
        
               | meheleventyone wrote:
               | I feel like you're skipping a step to make this comment
               | make sense. Which is saying why using the Overwatch model
               | would be better and why the need to introduce an ECS?
        
         | mackman wrote:
         | The physics game sim step had to be hardcoded but you could
         | vary the graphics rendering independently. That did mean you
         | needed to use a different RNG for sim objects and graphics-only
         | particle effects.
         | 
         | I worked on the deterministic instant-replay system for a
         | racing sim on PS3. The most interesting thing was that on PS3
         | we had main processor cores (PPU) and helper cores (SPU) which
         | had their own instruction set. Our physics sim could distribute
         | work across both, but the thread assignment had to be
         | deterministic because PPU and SPU got marginally different
         | floating point results for the same instructuion and inputs.
         | That was a fun one to figure out!
        
           | hamish-b wrote:
           | Incredible insight, thank you! Could you elaborate on any
           | more of your experiences with PPU/SPU programming? Also, what
           | have you gone on to do since then?
        
       | TacticalCoder wrote:
       | LoL was inspired by DotA (Defense of the Ancient). DotA was a
       | Warcraft III mod.
       | 
       | Warcraft III came out in 2002 and was fully deterministic. Save
       | files for entire games with up to 8 players (IIRC?) and huge
       | armies were... Tiny. For they only saved the inputs.
       | 
       | It comes at a surprise to me not that LoL eventually became
       | deterministic but that it wasn't in the first place.
       | 
       | As a funny sidenote, around 1990 / 1991 I wrote a little 2D game
       | and there was a bug I simply couldn't figure out. It'd happen
       | very very rarely. And I realized, by myself, that if I rewrote
       | the game's little physics engine to be 100% deterministic I could
       | just record the inputs while playing and then replay it,
       | hopefully reproducing the crash. It worked. I eventually ended up
       | recording the bug and, sure enough, upon replaying the game would
       | crash: at that point I knew the bug was a goner. It was a
       | dangling pointer in a rare case where the hero had picked an
       | extra allowing to shot two shots at once on screen instead of one
       | and if the player would then finish the level with the first shot
       | gone out of the screen but the second shot still on screen.
       | 
       | I had never heard of anyone using a deterministic engine back in
       | 1990/1991. Fun memories.
        
         | jaharios wrote:
         | Post like this is way I read HN. Thanks from us who this missed
         | stuff.
        
       | sylware wrote:
       | I play dota2 because it has a well supported native elf/linux
       | client.
       | 
       | And that even if I live in a country where LOL is, by far, more
       | popular.
        
       ___________________________________________________________________
       (page generated 2024-10-04 23:02 UTC)