[HN Gopher] Making Your Game Go Fast by Asking Windows Nicely
___________________________________________________________________
Making Your Game Go Fast by Asking Windows Nicely
Author : zdw
Score : 113 points
Date : 2022-01-16 18:21 UTC (1 day ago)
(HTM) web link (www.anthropicstudios.com)
(TXT) w3m dump (www.anthropicstudios.com)
| Const-me wrote:
| About switchable graphics: nVidia's APIs do work. The problem
| with them is that there's no API to switch to the faster GPU at
| runtime; they only have APIs to set up a profile for an
| application and ask for the faster GPU in that profile, and the
| changes are applied the next time the app launches.
|
| I had to do that a couple of times for Direct3D 11 or 12 apps
| with a frontend written in WPF, since Microsoft doesn't support
| exporting DWORD variables from .NET executables.
|
| Technical info there: https://stackoverflow.com/a/40915100
| masonremaley wrote:
| It's possible I'm misunderstanding the docs, but here's the
| line that led me to believe linking to one of their libraries
| alone would be enough (and led to my surprise when it didn't
| work):
|
| (https://docs.nvidia.com/gameworks/content/technologies/deskt..
| .)
|
| > For any application without an existing application profile,
| there is a set of libraries which, when statically linked to a
| given application executable, will direct the Optimus driver to
| render the application using High Performance Graphics. As of
| Release 302, the current list of libraries are vcamp110.dll,
| vcamp110d.dll, nvapi.dll, nvapi64.dll, opencl.dll, nvcuda.dll,
| and cudart*.*.
| Const-me wrote:
| Can it be that you linked to one of these libraries, but
| never called any function from that DLL, so your linker
| dropped the unused DLL dependency?
|
| However, I don't really like that method. The app will fail
| to launch on computers without nVidia drivers, complaining
| about the missing DLL. For languages like C++ or Rust, the
| exported DWORD variable is the best way to go. The only
| reason I bothered with custom installer actions was that this
| method wasn't available.
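|
| For reference, a minimal sketch of the exported-DWORD approach
| (MSVC assumed; these are the documented NVIDIA and AMD symbol
| names):
|
|     #include <windows.h>
|
|     // In the game .exe itself, not a DLL:
|     extern "C" {
|         // NVIDIA Optimus: request the discrete GPU
|         __declspec(dllexport) DWORD NvOptimusEnablement = 0x00000001;
|         // AMD switchable-graphics equivalent
|         __declspec(dllexport) int AmdPowerXpressRequestHighPerformance = 1;
|     }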
| masonremaley wrote:
| Hmm. I _think_ I tried calling into their API to rule that
| out--but it's been a while, so it's 100% possible I'm
| remembering incorrectly, which would explain why it didn't
| work!
| shawnz wrote:
| > This isn't often relevant for games, but, if you need to check
| how much things would have been scaled if you weren't DPI aware,
| you can call GetDpiForWindow and divide the result by 96.
|
| If you aren't scaling up text and UI elements based on the DPI
| then it doesn't really sound like your application is truly DPI
| aware to me. I don't see why that applies any differently to
| games versus any other kind of application.
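|
| (For reference, the query being discussed is roughly this
| one-liner, assuming a valid HWND on Windows 10 1607+:
|
|     // 96 DPI is the unscaled baseline, so this is the scale factor
|     float scale = GetDpiForWindow(hwnd) / 96.0f;
|
| which yields 1.0 at 100% scaling, 1.5 at 150%, and so on.)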
| ziml77 wrote:
| Games should either be aware of the user's preferred scaling or
| at least offer their own UI scaling option. But they should
| always register as DPI aware so they don't render the 3D scene
| at a lower resolution than what's selected.
| [deleted]
| jeroenhd wrote:
| Unless the game engine is doing its own scaling, this does
| sound like lying to the operating system to sidestep those
| pesky user-friendly features in exchange for more frames.
|
| I think Microsoft made it this hard to enable the DPI-aware
| setting exactly because it forces developers to think about
| things like DPI. If everyone follows this guide and ignores it,
| then I predict that in a few years this setting will be ignored
| as well and a new DPI-awareness API will be released.
| makomk wrote:
| I think it's reasonably common for games to scale their text
| and UI elements by the overall screen or window size, in which
| case opting out of clever OS DPI tricks is the right choice.
| Using actual DPI doesn't make much sense in general - the
| player could be sitting right in front of their laptop screen
| or feet away from a big TV, which obviously require very
| different font sizes in real-world units.
| shawnz wrote:
| But in those cases you'd expect the user to manually adjust
| their scaling settings, which wouldn't be respected if
| following the author's advice here.
| masonremaley wrote:
| Yup, you hit the nail on the head (author of the article
| here). I guess I could've clarified that, I didn't expect
| people to assume I was advocating against scaling your UIs to
| fit the user's screen! Many games scale to fit the window by
| default, and even offer additional controls on top of that.
| shawnz wrote:
| It's not as simple as just scaling the UI to the size of
| the screen though, because the UI elements should be
| _bigger_ at the same screen size if the scaling is higher.
| That's why, like you mention in the article, you'll be
| able to tell when the setting has been changed simply by
| looking at the scale of the UI: it will wrongly be too
| small once the setting is activated.
| masonremaley wrote:
| Yup! I'm aware of what DPI scale is for; I use it when I
| write game tools. I don't use it in game, though--that's
| an intentional tradeoff I'm making, and a pretty common
| one for games!
|
| If you want to see why, try mocking up a typical shooter
| HUD. Now try scaling up/down all the elements by 50% and
| see what happens. Feel free to play with the anchoring,
| etc. Chances are you're not gonna like what you see!
| Things get even more complicated when you consider that
| players with controllers often change their view distance
| when gaming and don't wanna reconfigure their display all
| the time.
|
| The typical solution is to fix the UI scale to the window
| size, and keep the text large enough that it's readable
| at a large range of DPIs and viewing distances. If you
| can't get 100% there that way you'll typically add an in-
| game UI scale option. (The key difference between that
| and the built in UI scaling in Windows being that it's
| specific to the game, so you'll set it to something
| milder than you'd set the Windows option, and it will
| only affect the game so you don't have to keep changing
| it back and forth.)
|
| [EDIT] I think I came up with a way to explain this that
| saves you the trouble of drawing it out yourself. The
| fundamental issue, view distance changes aside, is that
| games are balancing a third variable most apps don't have
| to: how much of the background--the actual game--is the
| UI occluding?
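|
| A minimal sketch of what I mean by fixing the UI scale to
| the window size (reference_height and ui_scale_option are
| hypothetical names):
|
|     // UI is authored against a reference resolution; the
|     // in-game option multiplies on top of the window scale.
|     float ComputeUiScale(float window_height, float ui_scale_option) {
|         const float reference_height = 1080.0f; // authoring res
|         return (window_height / reference_height) * ui_scale_option;
|     }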
| TimTheTinker wrote:
| > by linking with PowrProf.dll, and then calling this function
| from powersetting.h as follows
|
| > This function is part of User32.lib, and is defined in
| winuser.h which is included in Windows.h.
|
| This is one reason I think Windows is such a mess of an OS. (Look
| at the contents of C:\Windows and tell me it's not, if you can do
| so with a straight face!)
|
| To make what ought to be a system call you have to load some DLL,
| sys, or lib file at a random (but fixed) path and call a function
| on it.
|
| Combine that with COM and the registry, and I don't want to
| touch it with a ten-foot pole.
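|
| For reference, the incantation being quoted boils down to
| something like this (a sketch; GUID_MIN_POWER_SAVINGS is,
| confusingly, the documented GUID for the High performance
| scheme):
|
|     #include <windows.h>
|     #include <powersetting.h>
|     #pragma comment(lib, "PowrProf.lib")
|
|     void RequestHighPerformance(void) {
|         // switches the active power scheme system-wide
|         DWORD rc = PowerSetActiveScheme(NULL, &GUID_MIN_POWER_SAVINGS);
|         (void)rc; // ERROR_SUCCESS when the scheme was applied
|     }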
| Someone wrote:
| > To make what ought to be a system call you have to load some
| DLL, sys, or lib file at a random (but fixed) path and call a
| function on it.
|
| _"Ought to be a system call"_ is a matter of opinion. Among
| OSes, Linux is an outlier in that it keeps its system call
| interface stable.
|
| Many other OSes choose to provide a library with a stable
| interface through which system calls can (and, in some cases,
| must; see https://lwn.net/Articles/806776/, discussed in
| https://news.ycombinator.com/item?id=21859612) be called. That
| allows them to change the system call ABI, for example to
| retire calls that have been superseded by other ones.
|
| (ideally, IMO, that library should not be the C library. There
| should be two libraries, a "Kernel interface library" and a "C
| library". That's a different subject, though)
| bobbyi wrote:
| ASSERT(SetProcessDpiAwarenessContext(
|     DPI_AWARENESS_CONTEXT_PER_MONITOR_AWARE_V2));
|
| If ASSERT is a no-op in release mode, then the whole call gets
| compiled out and your setting only applies in debug builds
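|
| A sketch of the safer pattern: keep the side effect outside
| the assert, so a compiled-out ASSERT changes nothing.
|
|     BOOL ok = SetProcessDpiAwarenessContext(
|         DPI_AWARENESS_CONTEXT_PER_MONITOR_AWARE_V2);
|     ASSERT(ok); // the call above runs in every build mode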
| masonremaley wrote:
| It's not, in my codebase, but I'll edit that when I have the
| chance so nobody blindly copy-pastes it and ends up with
| something super broken.
| sdflhasjd wrote:
| PowerSetActiveScheme sets the system power plan; it's not
| something a game should be doing without telling the user first.
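|
| If a game does it anyway, the polite version at least saves and
| restores the user's scheme (a sketch; error handling elided):
|
|     #include <windows.h>
|     #include <powersetting.h>
|     #pragma comment(lib, "PowrProf.lib")
|
|     static GUID *g_previousScheme = NULL;
|
|     void PushHighPerformance(void) {
|         PowerGetActiveScheme(NULL, &g_previousScheme); // user's plan
|         PowerSetActiveScheme(NULL, &GUID_MIN_POWER_SAVINGS);
|     }
|
|     void PopHighPerformance(void) { // call on exit
|         PowerSetActiveScheme(NULL, g_previousScheme);
|         LocalFree(g_previousScheme);
|     }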
| usbqk wrote:
| Well said.
|
| https://devblogs.microsoft.com/oldnewthing/20081211-00/?p=19...
| CamperBob2 wrote:
| Translation: "We didn't provide the necessary API support, so
| now we're going to whine about ad-hoc brute force solutions
| that developers would never have had to resort to if we'd
| done our jobs."
|
| Why isn't there a function I can call that enforces full CPU
| power, but only while my application is running? I never
| _wanted_ to change global system-level settings, but if that's
| the only affordance provided by the Win32 API, then so be it.
| londons_explore wrote:
| Perhaps if the user has a slow old CPU we could also order
| them a new one on Amazon for use only while the game is
| running too...
| detaro wrote:
| Because you're generally not supposed to override the user's
| performance settings temporarily, either?
| CamperBob2 wrote:
| It would be nice if it were that simple. Unfortunately,
| power settings under Windows are incredibly (and
| unnecessarily) complex, and I doubt that one in twenty
| users even knows the options are available. Worse, the
| Windows power settings tend to revert magically to
| "energy saving" mode under various unspecified
| conditions. This phenomenon almost cost me an expensive
| session at an EMC test lab once, when data acquisition on
| the device under test repeatedly timed out due to CPU
| starvation.
|
| It's entirely reasonable for performance-critical
| applications (not just games!) to be able to request
| maximum available performance from the hardware without
| resorting to stupid tricks like the measures described in
| this story, launching threads that do nothing but run
| endless loops, and so forth.
|
| I do agree with those who point out that this should be a
| user-controlled option. On the application side, this
| could be as simple as a checkbox labeled "Enable maximum
| performance while running" or something similar. Ideally,
| the OS would then switch back to the system-level
| performance setting when the application terminates,
| rather than leaving it up to the application to do the
| right thing and restore it explicitly.
| johncolanduoni wrote:
| Sometimes those are the user's performance settings, but
| more often the user has no idea what these performance
| settings you speak of are and they just don't want to see
| your game stutter. It would be nice to be able to
| distinguish these cases; this user, for one, would love it
| if games could temporarily disable aggressive power saving
| while a game is running and put it back the rest of the
| time.
| protastus wrote:
| Alternative translation: "Our documentation is weak and our
| engineering teams aren't held accountable for it, so we're
| blaming third-party developers instead of doing our jobs".
| classichasclass wrote:
| Interestingly, Garage Band on my G5 kicks power management to
| highest performance without asking, though it turns it back
| down when it quits. Guess Apple didn't have a problem with it.
| masonremaley wrote:
| That's a good point--I'll look into whether Microsoft has any
| guidelines on this, and add a disclaimer to the article when I
| get a chance.
| discreditable wrote:
| I've had games do this and found it annoying since I like my PC
| to run in balanced mode. Not so much to save power but to let
| the machine idle when I'm not using it. Found I could work
| around it by deleting the power plans other than balanced.
|
| I've never played OP's game, so evidently a few other games
| are out there doing this.
| connordoner wrote:
| Yeah, this feels like really bad UX.
| ahelwer wrote:
| I used to work at a high-performance scientific computing
| company. In the mid-2000s they ran into a weird issue where
| performance would crater on customer PCs running Windows, unless
| that PC was currently running Windows Media Player. Something to
| do with process scheduling priority. Don't know whether this was
| a widely-disseminated old hand trick of the era or anything.
| bee_rider wrote:
| It is astonishing to me that someone would want to use Windows
| for something HPC related. I'm not generally a Windows hater
| (actually I am, but I see that there are legitimate business
| reasons to use it), but the HPC ecosystem seems much more
| Linux-friendly.
| aldebran wrote:
| The Windows team works closely with hardware makers to
| support new and upcoming specialized hardware. This enables
| hardware makers to focus on hardware bring-up and not worry
| about OS support.
|
| There are many technologies that, at least for a time, work
| first or work better on Windows. For example (not necessarily
| HPC related): SMR drive support.
| bee_rider wrote:
| I definitely agree that if I had to get some random device
| working, Windows is probably a good first OS to try. But
| since Linux has such a large supercomputer/cluster/cloud
| presence, the situation is sort of flipped for HPC. At
| least as far as I've seen -- most numerical codes seem to
| target the Unix-verse first, and the only weird drivers you
| need are the GPU drivers (actually I haven't tried much
| GPGPU out, but I believe the Linux NVIDIA GPGPU drivers
| aren't the same horrorshow that their desktop counterparts
| are).
| pjmlp wrote:
| Speaking from my time at CERN: while the cluster is fully
| UNIX-based, there is a big crowd of researchers running
| visualisation software and other research-related work on
| Windows, e.g. Matlab/Tableau/Excel, and nowadays I assume
| macOS as well (it was early days for it 20 years ago).
| bee_rider wrote:
| I was thinking more of the number crunching bits, rather
| than visualization, since the original issue was around
| performance. But I guess visualization can be
| computationally crunchy too.
| ahelwer wrote:
| It is, but there are a lot of applications that people like
| to use on Windows PCs (think CAD or data analysis stuff) that
| have computationally-intensive subroutines. In that company's
| case it was GPU-accelerated electromagnetic wave simulations,
| seismic imaging reconstruction, and CT scan reconstruction.
| The company developed these libraries and licensed them for
| use in larger CAD or data analysis software packages.
| Const-me wrote:
| Probably timeBeginPeriod WinAPI called by that media player:
| https://docs.microsoft.com/en-us/windows/win32/api/timeapi/n...
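|
| Reproducing that is a two-liner (a sketch; timeapi.h, linked
| against winmm.lib):
|
|     #include <windows.h>
|     #include <timeapi.h>
|     #pragma comment(lib, "winmm.lib")
|
|     void WithHighResTimer(void (*work)(void)) {
|         timeBeginPeriod(1); // 1 ms timer granularity, system-wide
|         work();             // latency-sensitive work happens here
|         timeEndPeriod(1);   // restore; calls must be paired
|     }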
| vardump wrote:
| Google Chrome had exactly that effect: at least in the past,
| running Google Chrome made some software function correctly.
| (Although perhaps there's also software that
| timeBeginPeriod(1) affects negatively.)
|
| Doesn't help when your testers run Google Chrome all the
| time...
| [deleted]
| shmerl wrote:
| Reminds me of this:
| https://www.extremetech.com/computing/294907-why-moving-the-...
| SamReidHughes wrote:
| You can also see performance improvements in processes that do
| I/O by having a low-priority process running that does nothing
| but run an infinite loop. This keeps the computer from
| switching to idle CPU states during the I/O. This was on
| Linux; there is probably an OS setting to accomplish the same
| thing, but the trick was pretty counter-intuitive.
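|
| On Windows, the equivalent hack might look like this (a sketch;
| the thread only runs when nothing else wants the CPU):
|
|     #include <windows.h>
|
|     static DWORD WINAPI SpinForever(LPVOID) {
|         for (;;) {} // burn cycles so the CPU never looks idle
|     }
|
|     void StartAntiIdleThread(void) {
|         HANDLE h = CreateThread(NULL, 0, SpinForever, NULL, 0, NULL);
|         if (h) {
|             SetThreadPriority(h, THREAD_PRIORITY_IDLE);
|             CloseHandle(h);
|         }
|     }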
| wruza wrote:
| _missing vblank due to power management_
|
| Ugh. A few years ago I built a gaming rig with an i5-8400 and a
| GTX 1080 (both chosen for known workloads). Some games ran
| fine, but some were jerky af, and the frametime monitor was
| zigzag-y all over the place. I thought that maybe the 8400 was
| not the best choice despite my research and bought an i7-8700,
| only to see the situation get much _worse_. After days of
| googling and discussions I found the issue: the mobo BIOS had
| the C1E state enabled. In short, it allows the CPU frequency
| and voltage to drop significantly when it's idling, but this
| technique isn't ready to operate 100+ times per second. After
| drawing a frame, the CPU basically did nothing for a period of
| time (<10ms), which was enough to drop to C1E, but it can't get
| out of it quickly for some reason. And of course the 8700 was
| much better at sucking at it, since it had more free time to
| fall asleep.
|
| I understand that power saving is useful in general, but man,
| when Direct3D sees every other frame skipped, maybe it's time to
| turn the damn thing off for a while. Idk how a regular consumer
| could deal with it. You basically spend a little fortune on a
| rig, which then stutters worse than an average IGP because of
| some stupid misconfiguration.
| stedolph wrote:
| I strongly agree with you. I have an i7 rather than an i5, but
| my experience has been quite similar.
| Const-me wrote:
| > but it can't get out of it quickly for some reason
|
| As overclockers are aware, to achieve higher frequencies while
| keeping the CPU stable, you need a higher CPU voltage. It works
| the other way too: lowering the frequency allows lowering the
| voltage, and that's what mostly delivers the power savings from
| these low-power states.
|
| These chips can't adjust voltage instantly because the wires
| inside them are rather thin, and there's non-trivial
| capacitance everywhere. This means CPUs can drop frequency
| instantly, then decrease the voltage over time. However, if
| they raised the frequency instantly without first raising the
| voltage, the chip would glitch.
|
| That's, AFAIK, the main reason why increasing the clock
| frequency takes time. The chip first raises the voltage, which
| takes time because of that capacitance, and only then raises
| the frequency, which is instant.
| Bancakes wrote:
| There are apps that disable C-states and core parking, and
| boost P-states.
|
| I use QuickCPU and max everything out. Yes, it sounds like a
| sham, but it works wonders.
|
| https://coderbag.com/product/quickcpu
| wruza wrote:
| I simply disabled C1E in the BIOS, because it's a desktop. But
| I still had to use the EmptyStandbyList technique afterwards,
| which helps with the rest of the issue (tested with and without
| for a few days; it really works).
| splittingTimes wrote:
| Is it possible to employ any of those API calls in Java? What
| would the equivalents look like there?
| howdydoo wrote:
| > As of April 5th 2017 with the release of Windows 10 Version
| 1703, SetProcessDpiAwarenessContext used above is the replacement
| for SetProcessDpiAwareness, which in turn was a replacement for
| SetProcessDPIAware. Love the clear naming scheme.
|
| This is the kind of thing I hate about "New Windows". Once upon a
| time MS used to strive for backward compatibility. These days
| every few years there's a new function you need to call. You
| can't get optimal behavior just by writing good code from the
| start. You need to do that, and also make the
| YesIKnowHowPixelsWork API call, and set
| <yesIAmCompetent>true</yesIAmCompetent> in your manifest to get
| what should be the default behavior. It's a mess.
| zamadatix wrote:
| This is precisely the "Old Windows" way of doing things:
| legacy APIs are still supported for that forever-backwards
| compatibility, and current APIs exist for the ways you
| probably want to do things in a new app.
| For reference, SetProcessDPIAware solidified over 15 years ago,
| and 15 years prior to that there wasn't even a taskbar. Of
| course it's going to be out of date from a UI API perspective,
| but that's what's needed if you also want to keep supporting
| 15-year-old apps well.
| andrewf wrote:
| Specific example from the good old days: EnableTraceEx2
| "supersedes the EnableTrace and EnableTraceEx functions."
| https://docs.microsoft.com/en-
| us/windows/win32/api/evntrace/...
|
| Func -> FuncEx -> FuncExN was a common pattern. (Which I like
| more than Func -> Funcness -> FuncnessContext, despite the
| lack of creativity!) Another one was tagging structures with
| their own length as the first member variable, so if a later
| SDK creates a newer version of the struct, the callee can
| tell the difference. eg https://docs.microsoft.com/en-
| us/windows/win32/seccrypto/cry...
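|
| The self-sizing struct pattern in miniature (MYAPI_PARAMS is a
| hypothetical example, not a real Win32 struct):
|
|     typedef struct MYAPI_PARAMS {
|         DWORD cbSize; // caller sets this to sizeof(MYAPI_PARAMS)
|         DWORD flags;
|         // fields appended by later SDKs grow the struct; the
|         // callee inspects cbSize to tell the versions apart
|     } MYAPI_PARAMS;
|
|     MYAPI_PARAMS p = { sizeof(p) }; // idiomatic initialization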
| ziml77 wrote:
| The reason it's so complex is _because_ of backwards
| compatibility. Non-DPI-aware applications from before DPI
| settings were a thing can't advertise that they're not DPI
| aware, so if an application doesn't announce which it is,
| Windows has to assume that it's not aware. A couple of years
| ago, Microsoft was able to make changes to the GDI libraries to
| automatically adjust the size of the elements they render, which
| makes a lot of things sharper. But images or anything on screen
| not rendered by GDI will not magically become sharp.
| rossy wrote:
| As someone who contributed to a (formerly) OpenGL-based video
| player[1], these issues with waiting for vblank and frame time
| variability on Windows are depressingly familiar. Dropping even
| one frame is unacceptable in a video player, but we seemed to
| drop them unavoidably. We fought a losing battle with frame
| timings in OpenGL for years, which eventually ended by just
| porting the renderer to Vulkan and Direct3D 11.
|
| One thing that we noticed was that wakeups after wglSwapBuffers
| were just more jittery than wakeups after D3D9/D3D11 Present()
| with the same software on the same system. In windowed mode, this
| could be mitigated by blocking on DwmFlush() instead of
| wglSwapBuffers (it seems like GLFW does this too, but only on
| Vista and 7).
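|
| In code, that mitigation is roughly the following (a sketch;
| assumes the swap interval is 0 so SwapBuffers itself doesn't
| block):
|
|     #include <windows.h>
|     #include <dwmapi.h>
|     #pragma comment(lib, "dwmapi.lib")
|
|     void PresentWindowed(HDC dc) {
|         SwapBuffers(dc); // queue the frame; returns quickly
|         DwmFlush();      // block until the compositor presents
|     }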
|
| The developer might also get some mileage from using ANGLE (a
| GLES 3.1 implementation on top of D3D11) or Microsoft's new
| GLon12.
|
| [1]: https://mpv.io/
| nottorp wrote:
| Two comments that kinda go against the flow:
|
| 1. Please add options to _conserve_ battery too. An FPS limiter
| would be good (see the sketch below). Messing with the system
| power management when the user doesn't want to be tethered to a
| wall plug is Not Nice(tm).
|
| 2. When you do UI scaling, especially if you're young with 20/20
| eyesight, please allow scaling beyond what you think is big
| enough.
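|
| A minimal sketch of the frame limiter suggested in point 1
| (call once per frame; target_fps is whatever the user picks):
|
|     #include <chrono>
|     #include <thread>
|
|     void LimitFrameRate(double target_fps) {
|         using clock = std::chrono::steady_clock;
|         static clock::time_point next = clock::now();
|         // sleep out the remainder of this frame's time budget
|         next += std::chrono::duration_cast<clock::duration>(
|             std::chrono::duration<double>(1.0 / target_fps));
|         std::this_thread::sleep_until(next);
|     }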
___________________________________________________________________
(page generated 2022-01-17 23:01 UTC)