[HN Gopher] Game Timers: Issues and Solutions (2012)
___________________________________________________________________
Game Timers: Issues and Solutions (2012)
Author : certeoun
Score : 68 points
Date : 2021-08-17 09:32 UTC (13 hours ago)
(HTM) web link (www.fabiensanglard.net)
(TXT) w3m dump (www.fabiensanglard.net)
| certeoun wrote:
| > The joy of "discovering" this approach was short lived: This
| system is very well known in the game industry and is actually
| the keystone of games such as Starcraft II or id Software engines
| !
|
| Is it very well known in the game industry? I have trouble
| finding anything about it.
| meheleventyone wrote:
| Yeah, extremely well known. That doesn't mean everyone uses fixed
| time steps, or talks about it a lot when they do. But you'll find
| the concept crops up a lot in physics engines, deterministic
| networking/gameplay and in discussions of engine architecture,
| amongst other places.
| certeoun wrote:
| I am a little confused; isn't this a "fixed timestep"?:
| https://www.gafferongames.com/post/fix_your_timestep/
|
| Fabien calls it a "fixed timeslice". I find Fabien's solution
| much simpler. It is quite different from Fiedler's.
| meheleventyone wrote:
| Whether you call it a step or a slice doesn't matter; "tick" is
| another common term for the same concept. Fabien's solution is
| mentioned in Glenn's. Glenn just takes it a stage further
| afterward and allows the fractional time remaining to be
| accounted for when rendering.
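|
| For reference, a minimal sketch of that extra stage (an
| accumulator consumed in fixed slices, with the leftover fraction
| passed to the renderer), written in the style of the article's
| loop; the previous_state/current_state bookkeeping and the
| three-argument render() are illustrative, not Fabien's or
| Glenn's actual code:
|
|         std::uint32_t const dt = 16U;           // fixed slice in ms
|         std::uint32_t accumulator = 0U;
|         std::uint32_t previous_ticks = SDL_GetTicks();
|         while (running) {
|             std::uint32_t const now = SDL_GetTicks();
|             accumulator += now - previous_ticks;
|             previous_ticks = now;
|             process_input();
|             while (accumulator >= dt) {
|                 previous_state = current_state;  // remember the last state
|                 update(dt);                      // advances current_state
|                 accumulator -= dt;
|             }
|             // Fraction of a slice left over; blend the two states for display.
|             float const alpha = accumulator / static_cast<float>(dt);
|             render(previous_state, current_state, alpha);
|         }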
| certeoun wrote:
|         while (running) {
|             static std::uint32_t simulation_time{0U};
|             std::uint32_t const real_time = SDL_GetTicks();
|             process_input();
|             while (simulation_time < real_time) {
|                 simulation_time += 16U;
|                 update(16U);
|             }
|             render();
|         }
|
| As I understand it, the inner while loop is essentially playing
| "catch up" with the outer while. For simplicity's sake, let's
| assume that simulation_time is incremented by 1 in the inner
| while (simulation_time += 1U). Further, let us assume that
| real_time is 42 on the first iteration. Now, the inner while
| needs to execute 42 times until it fails the loop condition
| (42 < 42).
|
| On the second iteration, real_time is equal to 58 (42 + 16). The
| inner while now executes (42 < 58) 16 times.
|
| On the third iteration, real_time is equal to 59 (58 + 1). The
| inner while executes (58 < 59) 1 time.
|
| The real_time cannot stay the same, as it polls the
| number of ticks since SDL was initialized. So apparently,
| simulation_time is always smaller than real_time.
|
| (If for some reason, real_time isn't incremented in an
| iteration, then the inner loop will not get executed.
| However, I don't see a case where it can happen.)
|
| Now, inside the update function there is a position
| update akin to current_position += speed * 16U. Now with
| the above iterations:
|
| - The first iteration causes 42 update calls (so
| current_position is updated 42 times as well).
|
| - The second iteration causes 16 update calls.
|
| - The third calls update 1 time.
|
| So we are advancing the position of something variable
| times. (We are also executing the inner while in variable
| amounts.)
|
| Maybe I am missing something here, but why doesn't the
| non-constant number of updates cause jitter? I really
| don't understand why it works at all to be honest. I am
| trying to understand it fully.
| meheleventyone wrote:
| The simulated time is incremented by 16ms per inner loop
| iteration, i.e. roughly a 60Hz rate. You're correct that the
| simulated time catches up to the real time. You're just wrong
| about the number of sub-steps and how the timers proceed.
| Technically, Fabien's solution will likely run slightly ahead of
| real time because of the inner loop condition.
|
| You do get jitter from a changing number of sub-steps per
| variable step which is why you want to do things like the
| interpolation in Glenn's solution. This is also why you
| still want to do everything in your power as a game
| developer to reduce the variance in real step size.
| account42 wrote:
| > This is also why you still want to do everything in
| your power as a game developer to reduce the variance in
| real step size.
|
| Well, the main reason is that for either variable time slices or
| fixed time slices with interpolation, the time you want to
| simulate/interpolate for is the time of the current frame, but
| you don't have that until you have rendered it. So what you do
| is assume that it will take about as long as the previous frame,
| which is a bad prediction if the frame-time variance is too big.
| If you really want to eliminate the jitter, you'd need to
| simulate/interpolate for a time that you are guaranteed to be
| able to render the frame in, and then make sure that the frame
| is displayed when that time elapses and not before. This of
| course increases latency, so there is always a tradeoff.
|
| Another cause for jitter is that even measuring the time
| between two successive frames can be difficult with
| modern GPUs - the CPU time at the start of the game loop
| doesn't really cut it:
| https://medium.com/@alen.ladavac/the-elusive-frame-timing-16...
| certeoun wrote:
| > The simulated time is incremented by 16ms per inner loop
| iteration, i.e. roughly a 60Hz rate. You're correct that the
| simulated time catches up to the real time. You're just wrong
| about the number of sub-steps. Technically, Fabien's solution
| will likely run slightly ahead of real time because of the inner
| loop condition.
|
| Can't the inner while take 1 ms or 1 ns for an iteration? I
| don't see how the inner while's execution time is roughly 16 ms.
| Okay, if my number of sub-steps is wrong, then I am really
| missing something here. I'm just trying to understand what
| exactly is wrong with my thinking.
|
| It is supposed to be so simple, yet I have real trouble
| understanding it currently. I am basically stuck now.
| meheleventyone wrote:
| The inner loop can take as little time as it likes to actually
| do the update, but it is accounted for in simulated time by
| adding 16 to the timer, and that is reflected in the update by
| passing in 16 as the amount of time to update for.
|
| Real time is updated once at the start of the step, then
| simulated time catches up in 16ms increments. So you get as many
| update calls as there are 16ms increments to simulated time
| before it exceeds real time.
|
| This is all very simple but also pretty finicky to think
| through. I'd replicate the code in the language of your
| choice and step through it.
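|
| For what it's worth, here is a minimal, self-contained sketch of
| exactly that: the article's loop with a fake clock instead of
| SDL_GetTicks(), printing how many 16 ms sub-steps each frame
| runs (the frame times are made up):
|
|         #include <cstdint>
|         #include <cstdio>
|
|         int main() {
|             // Pretend real time reads 42, 58, 59, 75 and 100 ms on
|             // successive frames.
|             std::uint32_t const frame_times[] = {42U, 58U, 59U, 75U, 100U};
|             std::uint32_t simulation_time = 0U;
|             for (std::uint32_t real_time : frame_times) {
|                 int substeps = 0;
|                 while (simulation_time < real_time) {
|                     simulation_time += 16U;  // one fixed 16 ms update
|                     ++substeps;
|                 }
|                 std::printf("real=%u sim=%u substeps=%d\n",
|                             static_cast<unsigned>(real_time),
|                             static_cast<unsigned>(simulation_time),
|                             substeps);
|             }
|         }
|
| It prints real=42 sim=48 substeps=3, then real=58 sim=64
| substeps=1, real=59 sim=64 substeps=0, real=75 sim=80 substeps=1,
| real=100 sim=112 substeps=2: the simulated time always ends each
| frame slightly ahead of real time, and the number of sub-steps
| per frame varies with how much real time has passed.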
| [deleted]
| TacticalCoder wrote:
| Yup, it has been well known for a long time. Fun fact: I
| discovered that technique myself around 1991 (and still have the
| ASM + C source code for the DOS game using it). The first time I
| remember it being publicly described was on the GamaSutra
| website in a post-mortem for the first "Age of Empires" game.
| GamaSutra was/is a big deal among game devs, so the technique
| became "famous" when that post-mortem was published.
|
| I wrote about it in a thread here called _"Why bugs might feel
| 'impossible'"_ about two months ago, and someone commented that
| the game TerraNova (from 1996) had a fully deterministic engine.
|
| BTW TFA says StarCraft 2 (2007) used the technique, but Warcraft
| 3 (2003) was already using a deterministic engine. Save files
| for Warcraft 3 games (including multiplayer ones played over the
| Internet) consisted of only the player inputs and the "tick" at
| which they happened. This makes for tiny save files, even for
| very long games.
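|
| For illustration, a minimal sketch of that kind of save file on
| top of a deterministic fixed-tick simulation (GameState, apply()
| and the 16 ms slice are hypothetical stand-ins, not Warcraft 3's
| actual code):
|
|         #include <cstddef>
|         #include <cstdint>
|         #include <vector>
|
|         struct Input    { std::uint32_t player; std::uint32_t command; };
|         struct Recorded { std::uint32_t tick; Input input; };
|
|         // Because update() is deterministic and always advances by the
|         // same slice, replaying the recorded (tick, input) pairs
|         // reproduces the entire game.
|         void replay(GameState& state, std::vector<Recorded> const& log,
|                     std::uint32_t last_tick) {
|             std::size_t next = 0;
|             for (std::uint32_t tick = 0; tick <= last_tick; ++tick) {
|                 while (next < log.size() && log[next].tick == tick) {
|                     state.apply(log[next].input); // inject input at its tick
|                     ++next;
|                 }
|                 state.update(16U);                // one fixed 16 ms slice
|             }
|         }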
|
| So basically: several of us independently discovered that
| technique in the nineties. There may have been games in the
| eighties already using such a technique, but I don't know of any.
|
| https://news.ycombinator.com/item?id=27517074
| certeoun wrote:
| Was it this article?: https://www.gamasutra.com/view/news/115711/The_Game_Develope...
| TacticalCoder wrote:
| It was more than 20 years ago so I don't remember: quickly
| skimming through it I don't think so. I still have emails
| somewhere I exchanged back then with the dev who wrote the
| article so I may be able to find the date.
|
| But in any case: the technique ain't new.
| certeoun wrote:
| Now, I seriously question the notion of "the internet
| never forgets things". I wonder how much valuable
| knowledge got lost from that era (the 90s).
| setr wrote:
| There's a reason webarchive exists -- the internet forgets
| things constantly (other solutions to the problem are things
| like IPFS).
| adzm wrote:
| There was a ton of weird random stuff on geocities that
| never got saved.
| IncRnd wrote:
| There were huge amounts of knowledge and techniques that
| were never preserved. The Internet has forgotten many
| things about applied computer science.
| toast0 wrote:
| "the internet never forgets things" is more of a caution than a
| promise.
|
| Don't say stupid things that will look even worse in 10
| years or sometimes fairly mainstream things that will
| look terrible in 40 years, because the internet never
| forgets.
|
| But, you still need to archive stuff you want to
| reference later. Even if it's still out there, it might
| be much harder to find.
| fuball63 wrote:
| I learned about it reading the Quake 2 source code. In school or
| in tutorial articles, they always teach the first technique, I
| find. "Delta timing" is what I knew it as.
| hoseja wrote:
| I don't think many game industry techniques are very well
| described in the open.
| stayfrosty420 wrote:
| So is the length of this slice equivalent to the infamous "tick
| rate" that gamers often complain about?
| fabiensanglard wrote:
| Yes!
| IncRnd wrote:
| Don't worry, we'll get that solved in a jiffy.
| skizm wrote:
| Tick rate usually means how many times per second a server
| calculates the state of the world in an online game. So a client
| can have 300fps, but the server might only calculate 128 ticks
| per second. If your computer's FPS dips below the server's tick
| rate (or your internet is bad) you start missing information.
| The client usually calculates things as if it has perfect
| information, and then when a tick comes in and tells your
| computer it is wrong, the client corrects itself. So the lower
| the tick rate on the server, the more often this happens (since
| your computer has made hundreds of calculations since the last
| tick came in) and it seems slower / choppier.
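|
| For illustration, a minimal sketch of the "client corrects
| itself" part (client-side prediction with server
| reconciliation); the types and apply_input() are made-up
| stand-ins, not any particular engine's API:
|
|         #include <cstdint>
|         #include <deque>
|
|         struct Input      { std::uint32_t sequence; float move_x, move_y; };
|         struct State      { float x, y; };
|         struct ServerTick { State state; std::uint32_t last_processed; };
|
|         std::deque<Input> pending; // inputs sent but not yet acknowledged
|         State predicted;           // what the client is showing right now
|
|         State apply_input(State s, Input const& in); // same step as server
|
|         void on_server_tick(ServerTick const& tick) {
|             // Drop inputs the server has already processed...
|             while (!pending.empty() &&
|                    pending.front().sequence <= tick.last_processed)
|                 pending.pop_front();
|             // ...then rebase on the authoritative state and re-apply the
|             // rest, so the correction is usually invisible rather than a
|             // visible snap backwards.
|             predicted = tick.state;
|             for (Input const& in : pending)
|                 predicted = apply_input(predicted, in);
|         }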
| formerly_proven wrote:
| Some games solved the low-tickrate / low-fps problem by
| timestamping user input independently and interpolating to
| the simulation, i.e. you can still hit things even with just
| say 10 ticks per second.
| uCantCauseUCant wrote:
| It can be reduced, though, by training a model of the enemy
| player's behaviour to more accurately predict and extrapolate
| the other's behaviour (not just assume linear continued
| behaviour). Of course, this means things like disappearing
| shots, if the opponent didn't fire.
| bob1029 wrote:
| I feel like there are some ideas in Fintech that haven't quite
| leaked over into the gaming space yet.
|
| Has anyone investigated the idea of sacrificing an entire high
| priority thread to a timer loop that busy waits over a collection
| of pending timers? Most gaming PCs these days have more than
| enough cores to get away with the idea of a full-time high-
| precision timer thread.
|
| In my experiments with this idea, I have had no trouble getting
| jitter under 100 µs in managed languages (C#/Java). Millions of
| scheduled timers seem to be no problem, assuming mild
| optimizations are employed (i.e. ordering by next desired
| execution tick). With all of this magic turned on, I still have
| 31 other cores to play with. Task.Delay looks like a preschool
| art project by comparison to the precision I get out of my
| (arguably dumb-as-hell) DIY timer impl.
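|
| For what it's worth, a minimal sketch of that kind of dedicated
| busy-wait timer thread (the names are made up, and a real
| version needs a thread-safe way to enqueue timers from other
| threads):
|
|         #include <atomic>
|         #include <chrono>
|         #include <functional>
|         #include <queue>
|         #include <vector>
|
|         using Clock = std::chrono::steady_clock;
|
|         struct Timer {
|             Clock::time_point due;
|             std::function<void()> fire;
|             bool operator>(Timer const& other) const {
|                 return due > other.due;
|             }
|         };
|
|         // Timers ordered by next desired execution time.
|         std::priority_queue<Timer, std::vector<Timer>,
|                             std::greater<Timer>> timers;
|         std::atomic<bool> running{true};
|
|         void timer_thread() {  // pin to a core / raise priority if desired
|             while (running.load(std::memory_order_relaxed)) {
|                 auto const now = Clock::now();
|                 while (!timers.empty() && timers.top().due <= now) {
|                     Timer t = timers.top();
|                     timers.pop();
|                     t.fire();  // e.g. signal a frame draw
|                 }
|                 // No sleep: spinning is what buys the sub-100 µs jitter,
|                 // at the cost of burning one core.
|             }
|         }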
|
| I have used this approach to schedule client frame draws in a
| custom UI framework, and it is indistinguishable from butter on
| my computer and across reasonable networks. In my prototypes, the
| client input events are entirely decoupled from the frame draws
| by way of a separate ringbuffer/batching abstraction. There is
| actually no locking in my architecture. Draws back to the client
| use immutable snapshots of state, so current updates can continue
| unfettered. The state is only ever mutated from a single thread.
|
| Technically, I sacrifice ~2 threads in my architecture, as the
| ringbuffer also uses a busy wait technique.
|
| All of this said, if you are willing to burn more power, use more
| cores, and think outside of the box just a bit, you can do some
| pretty amazing shit.
|
| Consider these ideas at broader scale too - What if you could
| amortize the cost of that power-hungry timer thread for thousands
| of gamers instead of just 1? Are there maybe some additional
| benefits to moving 100% of the game state to the server and
| shipping x265 frames to the end users?
| jdmichal wrote:
| > Consider these ideas at broader scale too - What if you could
| amortize the cost of that power-hungry timer thread for
| thousands of gamers instead of just 1? Are there maybe some
| additional benefits to moving 100% of the game state to the
| server and shipping x265 frames to the end users?
|
| I don't know of any online game that doesn't track the official
| game state on the server. This is required for anti-cheat
| reasons, if nothing else. However, networking would typically
| be optimized to send small packets of information, and
| typically over UDP. If a packet is missed, the next packet will
| just be overwriting it anyway, so no big deal. Clients would
| basically just locally simulate game state and then reconcile
| that against the game state being received from the server.
|
| Also, rendering has to be done per-client anyway, since each
| one has a different camera state. So there's no economy of
| scale for doing that on the server rather than the client. In
| fact, there's probably anti-economy: Servers that aren't
| rendering don't need expensive video cards unless they're using
| them for physics.
|
| I have the understanding that more modern games do sometimes
| use TCP sockets. And obviously modern bandwidth has made
| streaming frames realistically possible. Hence the recent
| emergence of streaming game services.
| formerly_proven wrote:
| > I don't know of any online game that doesn't track the
| official game state on the server.
|
| A lot of titles are more or less serverless peer-to-peer.
| jdmichal wrote:
| "Serverless" as in one of the peers is nominated as the
| Source of Truth? Or truly decentralized state? Because the
| first option is just changing who owns the server.
| formerly_proven wrote:
| Both.
| flohofwoe wrote:
| It's much easier than that: you just need to use the display
| refresh rate as your time base.
|
| In the end, the only important event is when your new frame
| shows up on screen, and this is (a) very precise, and (b) out
| of your control anyway, at least for traditional fixed-refresh-
| rate displays.
|
| It doesn't really matter where exactly within a frame time slice
| your per-frame code runs, or how long it takes, as long as it
| fits comfortably within one display refresh interval; the basic
| timer resolution that matters is the display refresh rate.
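|
| For illustration, a minimal sketch of that idea, assuming vsync
| is enabled and the refresh rate is known (swap_buffers() stands
| in for whatever presents the frame):
|
|         double const refresh_hz = 60.0;     // ideally queried from the OS
|         double const dt = 1.0 / refresh_hz; // one step per display refresh
|         while (running) {
|             process_input();
|             update(dt);       // advance exactly one refresh interval
|             render();
|             swap_buffers();   // with vsync on, blocks until the next vblank
|         }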
|
| PS: it's more difficult if you also need to minimize input and
| networking latency; that's where it may make sense to use
| separate threads, I guess. But on desktop PCs (versus game
| consoles), so much also depends on the operating system playing
| ball...
|
| PPS: busy looping probably isn't a good idea when power
| consumption matters (e.g. on mobile)
| TheHideout wrote:
| Related: If anyone is interested, I've been working on an ECS
| based mini-game framework in Rust [0] that runs with this idea.
| It has a 1200 hz simulation clock, 60 hz rendering clock, and 150
| hz physics integrator (Euler). I've also posted some example
| mini-games that use it [1].
|
| [0] https://github.com/Syn-Nine/mgfw
|
| [1] https://github.com/Syn-Nine/rust-mini-games/tree/main/2d-gam...
| flohofwoe wrote:
| Please note that non-60Hz displays are surprisingly common (not
| just "gamer monitors" with 120, 240Hz etc..., but also odd
| refresh rates like 72, 75, 85, 90 and 100Hz)
|
| Ideally the 'render clock' would query the display refresh rate
| from the operating system, but this is harder than it sounds.
| The next best option is to measure the frame time and then round
| to the nearest "common" refresh rate (AFAIK that's the only way
| it works on the web, for instance).
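|
| A minimal sketch of that rounding step (the list of "common"
| rates is an assumption, and the measured frame time should be
| averaged over many frames first):
|
|         #include <array>
|         #include <cmath>
|
|         double snap_to_common_refresh(double measured_frame_seconds) {
|             constexpr std::array<double, 9> common_hz{
|                 60.0, 72.0, 75.0, 85.0, 90.0, 100.0, 120.0, 144.0, 240.0};
|             double const measured_hz = 1.0 / measured_frame_seconds;
|             double best = common_hz[0];
|             for (double hz : common_hz)
|                 if (std::abs(hz - measured_hz) < std::abs(best - measured_hz))
|                     best = hz;
|             return 1.0 / best;  // use this as the fixed frame duration
|         }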
| vardump wrote:
| Indeed, notably cheap 75 Hz displays have become surprisingly
| common lately.
| TheHideout wrote:
| Interesting. Thanks for the note.
| fabiensanglard wrote:
| Written during one of these blizzards like I have only seen in
| Toronto.
|
| This makes me want a French Vanilla from Tim Hortons.
| mLuby wrote:
| Regarding "missing" a collision due to the delta-time being too
| large, I thought the game industry standard is to "backtrack" (in
| math, not in ticks) to determine the correct collision point and
| go from there. Granted, that is a bit more complicated than just
| keeping the delta-t/tick consistently small, but it is more
| correct.
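|
| A minimal sketch of that "backtracking in math", for two circles
| moving linearly during one step (a deliberately simplified case;
| real engines also handle rotation, stacking, etc.): solve for
| the earliest time at which the distance between the centers
| equals the sum of the radii.
|
|         #include <cmath>
|
|         struct Vec2 { double x, y; };
|
|         // Returns the time of impact in [0, dt], or -1.0 if the circles
|         // don't touch during this step.
|         double time_of_impact(Vec2 p1, Vec2 v1, double r1,
|                               Vec2 p2, Vec2 v2, double r2, double dt) {
|             Vec2 const p{p2.x - p1.x, p2.y - p1.y};   // relative position
|             Vec2 const v{v2.x - v1.x, v2.y - v1.y};   // relative velocity
|             double const r = r1 + r2;
|             // |p + v*t| = r  =>  (v.v)t^2 + 2(p.v)t + (p.p - r^2) = 0
|             double const a = v.x * v.x + v.y * v.y;
|             double const b = 2.0 * (p.x * v.x + p.y * v.y);
|             double const c = p.x * p.x + p.y * p.y - r * r;
|             double const disc = b * b - 4.0 * a * c;
|             if (a == 0.0 || disc < 0.0) return -1.0;  // never get that close
|             double const t = (-b - std::sqrt(disc)) / (2.0 * a); // earliest root
|             return (t >= 0.0 && t <= dt) ? t : -1.0;
|         }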
| kevingadd wrote:
| Before you can backtrack, you first need to recognize that you
| missed a collision.
| mLuby wrote:
| Could be--it's been a while since I read about the technique.
|
| I thought a game engine doing this figured out the collisions
| _when the trajectories changed_, so then it's just a matter of
| rendering those known effects (rather than the more naive
| approach of checking each tick or frame whether anything has
| collided). Note that doing this at the moment of impulse can
| cause the game to slow down when many objects are moved at the
| same time, like an explosion. I think it works pretty well
| because most game objects are stationary, or even immovable;
| also, moving objects are rarely subject to new accelerations--
| they stay ballistic.
|
| But IANAGameDev, so do correct me if you know something about
| this technique.
|
|         t=0  cube1 p=5,0 v=0,0 m/s
|              cube2 p=1,0 v=2,0 m/s
|              // Game knows cube1 and 2 will collide at t=2
|         t=1  cube1 p=5,0 v=0,0 m/s
|              cube2 p=3,0 v=2,0 m/s
|              cube3 p=0,0 v=5,0 m/s
|              // Game knows cube1 and 2 AND 3 will collide at t=2
|         t=2  all cubes hit at 5,0
| plopz wrote:
| It's pretty difficult to do quickly and with numerical
| robustness, specifically when you have rotation. Here's a talk
| on continuous collision detection by the creator of Box2D, given
| while he was working for Blizzard.
|
| https://www.youtube.com/watch?v=7_nKOET6zwI
| mLuby wrote:
| Thanks, that was an interesting talk. Apparently what I
| was describing is "time of impact" as that talk describes
| here: https://youtu.be/7_nKOET6zwI?t=495
| formerly_proven wrote:
| A lot of game physics engines are still pretty bad at this,
| which is why being able to clip through walls is such a common
| issue. Even in games where the physics are more or less solid,
| there are usually many bugs where fixed animations allow
| clipping through stuff (e.g. cover or hurdle-jumping
| animations), because those simply don't interact with the
| physics engine at all - you only need to get to a start point
| that places the animation's end point out of bounds and be able
| to trigger the animation.
| masklinn wrote:
| I'm not a gamedev at all so I never really dug into it, but the
| issue I've always had there was how to handle input when
| executing multiple simulation steps within a single "execution
| frame".
| meheleventyone wrote:
| It depends on how input can be handled: you could re-poll
| between each sub-step, but in practice just sending the same
| input to every sub-step works. Usually you aren't doing many
| sub-steps per step anyway.
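|
| For example, on top of the loop from TFA, the simplest version
| just captures one input snapshot per outer frame (the Input type
| and the two-argument update() are illustrative):
|
|         Input const input = poll_input();   // once per rendered frame
|         while (simulation_time < real_time) {
|             simulation_time += 16U;
|             update(16U, input);             // same snapshot each sub-step
|         }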
| [deleted]
| flohofwoe wrote:
| The other problem with naively measuring the frame duration is
| that you'll get sub-millisecond jitter because modern operating
| systems are neither "hard realtime" nor "soft realtime", which
| in turn introduces micro-stutter because your game frames will
| be timed slightly differently than when your new frame shows up
| on screen (because of the fixed display refresh rate - unless of
| course a variable refresh rate is used, like G-Sync).
|
| This is also my main pet peeve on the web. You can't measure a
| precise frame duration (made much worse by Spectre/Meltdown
| mitigations), but you also can't query the display refresh rate.
|
| In the 80's we took perfectly smooth scrolling and animations for
| granted, because most 80's home computers and game consoles were
| proper hard-realtime systems. Counter-intuitively this is harder
| to achieve on modern PCs that are many thousand times faster.
| MisterTea wrote:
| > In the 80's we took perfectly smooth scrolling and animations
| for granted, because most 80's home computers and game consoles
| were proper hard-realtime systems.
|
| Proper hard realtime means the software is designed to meet
| stringent time deadlines. If a deadline is missed then the
| system has failed.
|
| Soft real time means you tolerate missing one or more deadlines
| if the system is designed to handle it.
|
| The 80's hardware only ran the game code so there was never any
| CPU contention. There was no kernel, scheduler, threads or
| processes. The programmers could wrap their heads round the
| simpler hardware and use all available tricks to optimize every
| clock tick to do useful work.
|
| Nowadays we have stupid cheap multicore GHz CPU's for a few
| dollars with GB of RAM so you brute force your way through
| everything on a general purpose OS like Linux.
| flohofwoe wrote:
| Yes, and also the video and audio chips were "cycle-
| synchronized" with the CPU (e.g. it was guaranteed that a
| fixed number of CPU cycles after the vsync interrupt you'd
| always end up at the exact same video raster position
| etc...).
|
| OTOH, making the hardware components "asynchronous" and the
| timings "unpredictable" is what enabled today's performance
| (e.g. by introducing caches and pipelines).
| devwastaken wrote:
| What's preventing PC's from having some realtime clock hardware
| component that could be used by software?
| flohofwoe wrote:
| This isn't the problem, those timers exist and are accessible
| (unless you're in a web browser). The problem is that your
| code is essentially running at an unpredictable point in time
| within the current frame. E.g. if you measure the time
| duration between the same line of code at the start of your
| per-frame function you'll get a tiny jitter and never exactly
| 16.66667 ms (assuming a 60Hz display). I'm not sure what's
| the exact reason, but I guess there are many things that can
| throw off this sort of time measurement in a modern operating
| system (for instance process/thread scheduling).
| krona wrote:
| Modern CPUs already have several counters for 'nominal' cycles
| (i.e. cycles at a constant frequency), e.g. the TSC instructions
| and Performance-Monitoring Counters.
|
| Converting nominal cycles to time can be unreliable in some
| cases, but it is not impossible.
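|
| For example, a rough sketch of one way to convert TSC ticks to
| time by calibrating against a known clock (x86-specific, and it
| assumes an invariant/constant-rate TSC):
|
|         #include <chrono>
|         #include <cstdint>
|         #include <thread>
|         #include <x86intrin.h>  // __rdtsc on GCC/Clang
|
|         double measure_tsc_hz() {
|             std::uint64_t const t0 = __rdtsc();
|             auto const c0 = std::chrono::steady_clock::now();
|             std::this_thread::sleep_for(std::chrono::milliseconds(100));
|             std::uint64_t const t1 = __rdtsc();
|             auto const c1 = std::chrono::steady_clock::now();
|             double const seconds =
|                 std::chrono::duration<double>(c1 - c0).count();
|             return static_cast<double>(t1 - t0) / seconds; // ticks/second
|         }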
| bsder wrote:
| > This is also my main pet peeve on the web. You can't measure
| a precise frame duration (made much worse by Spectre/Meltdown
| mitigations), but you also can't query the display refresh
| rate.
|
| Vulkan has extensions for measuring frame timings. I suspect
| DirectX 12 does too, given how similar it is to Vulkan.
|
| See:
| https://www.khronos.org/registry/vulkan/specs/1.2-extensions...
| Pulcinella wrote:
| I believe Unity has just recently (i.e. since October 2020)
| taken steps to help mitigate this microstutter, unless I am
| misunderstanding you.
|
| https://blog.unity.com/technology/fixing-time-deltatime-in-u...
| megameter wrote:
| The last time I implemented a timing loop I came up with an
| approach that would mitigate jitter and oscillation artifacts
| occurring due to OS overhead (though not solving the problem of
| the video refresh itself being misaligned). One of my major
| assumptions was that I could run the game at a higher frequency
| than the refresh rate and therefore rely on downsampling my
| output, versus the norm of many higher-end games where the base
| tick is interpolated up to the refresh rate.
|
| 1. Derive the ideal number of elapsed frames by dividing the
| elapsed time since start by the frame duration.
|
| 2. Run updates to "catch up" to ideal frames.
|
| 3. Observe "rubberbanding" artifact as game oscillates between
| too slow and too fast.
|
| 4. Add limits to how many catchup frames are allowed per render
| frame to stop the rubberbanding. Implement a notion of "dropped
| frames" instead, so that our ideal time accounting has a
| release valve when too much catchup is needed.
|
| 5. Now apply a low-pass filter to the amount of catchup frames
| over time, e.g. if you have a tick of 120hz but your monitor is
| at 60hz you are likely to see an occasional high frequency
| oscillation in how many tick frames are supplied per render
| frame, like [2,3,2,3,2,3]. Filtering can make this
| [2,2,2,3,3,3]. (It's been a while since I did this, so I don't
| recall my exact algorithm)
|
| The end result of this effort is that wall-clock time is obeyed
| over the long-run(minus dropped frames) but the supplied frames
| are also able to maintain the same pacing for longer periods,
| which makes the moment-to-moment experience very predictable,
| hence smooth.
|
| While I haven't tried it, I think the equivalent thing when
| interpolating would be to freeze a certain issued delta time
| pace for some number of frames.
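|
| For what it's worth, a minimal sketch of steps 1-4 above
| (clamped catch-up plus dropped-frame accounting); the 8 ms tick
| and the clamp of 4 are arbitrary, update() and render() are
| assumed as in TFA's loop, and the filtering from step 5 is
| omitted:
|
|         #include <cstdint>
|
|         constexpr std::uint64_t kTickMs     = 8; // ~120 Hz simulation tick
|         constexpr std::uint64_t kMaxCatchup = 4; // max ticks per render frame
|
|         std::uint64_t simulated_ticks = 0;
|         std::uint64_t dropped_ticks   = 0;
|
|         void frame(std::uint64_t elapsed_ms) {
|             // 1. Ideal number of elapsed ticks from wall-clock time.
|             std::uint64_t const ideal_ticks = elapsed_ms / kTickMs;
|             // 2. How far behind the simulation is.
|             std::uint64_t behind =
|                 ideal_ticks - (simulated_ticks + dropped_ticks);
|             // 4. Clamp the catch-up; record the excess as dropped frames.
|             if (behind > kMaxCatchup) {
|                 dropped_ticks += behind - kMaxCatchup;
|                 behind = kMaxCatchup;
|             }
|             for (std::uint64_t i = 0; i < behind; ++i) {
|                 update(kTickMs);   // fixed-size simulation tick
|                 ++simulated_ticks;
|             }
|             render();
|         }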
___________________________________________________________________
(page generated 2021-08-17 23:01 UTC)