[HN Gopher] Vgpu_unlock: Unlock vGPU functionality for consumer ...
___________________________________________________________________
Vgpu_unlock: Unlock vGPU functionality for consumer grade GPUs
Author : fragileone
Score : 273 points
Date : 2021-04-09 18:42 UTC (4 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| ternox99 wrote:
| I don't understand how or where I can download the NVIDIA
| GRID vGPU driver. Can anyone help me?
| ur-whale wrote:
| If - like me - you don't have a clue what vGPU is:
|
| https://www.nvidia.com/en-us/data-center/virtual-solutions/
|
| TL;DR: seems to be something useful for deploying GPUs in the
| cloud, but I may not have understood fully.
| lovedswain wrote:
| It instantiates multiple logical PCI adaptors for a single
| physical adaptor. The logical adaptors can then be mapped into
| VMs which can directly program a hardware-virtualized view of
| the graphics card. Intel has the same feature in their graphics
| and networking chips.
| ur-whale wrote:
| Thanks for the explanation, but that's more of a "this is how
| it works" than a "this is why it's useful".
|
| What would be the main use case?
| kjjjjjjjjjjjjjj wrote:
| 4 people sharing 1 CPU and 1 GPU on a machine running a
| hypervisor, with separate installations of Windows for
| gaming.
|
| Basically any workload that requires sharing a GPU between
| discrete VMs
| wmf wrote:
| The use case is allowing the host system and VM(s) to
| access the same GPU at the same time.
| ur-whale wrote:
| Yeah, I got that from the technical explanation.
|
| What's the _practical_ use case, as in, when would I need
| this?
|
| [EDIT]: To maybe ask a better way: will this practically
| help me train my DNN faster?
|
| Or if I'm a cloud vendor, will this allow me to deploy
| cheaper GPUs for my users?
|
| I guess I'm asking about the economic value of the hack.
| lovedswain wrote:
| Running certain ML models in VMs
|
| Running CUDA in VMs
|
| Running transcoders in VMs
|
| Running <anything that needs a GPU> in VMs
| ur-whale wrote:
| This is the exact same information you posted above.
|
| Please see my edit.
| [deleted]
| jandrese wrote:
| You have a Linux box but you want to play a game and it
| doesn't work properly under Proton, so you spin up a
| Windows VM to play it instead.
|
| The host still wants access to the GPU to do stuff like
| compositing windows and H.265 encode/decode.
| skykooler wrote:
| And outputting anything to the screen in general.
| Usually, your monitor(s) are plugged into the ports on
| the GPU.
| jowsie wrote:
| Same as any hypervisor/virtual machine setup. Sharing
| resources. You can build 1 big server with 1 big GPU and
| have multiple people doing multiple things on it at once,
| or one person using all the resources for a single
| intensive load.
| ur-whale wrote:
| Thanks, this is a concise answer.
|
| However, I was under the impression - at least on Linux -
| that I could run multiple workloads in parallel on the
| same GPU without having to resort to vGPU.
|
| I seem to be missing something.
| antattack wrote:
| If you are running Linux in a VM, vGPU will allow
| acceleration for OpenGL, WebGL, and Vulkan applications like
| games, CAD, CAM, and EDA, for example.
| hesk wrote:
| In addition to the answer by skykooler, virtual GPUs also
| allow you to set hard resource limits (e.g., amount of L2
| cache, number of streaming multiprocessors), so different
| workloads do not interfere with each other.
| cosmie wrote:
| This[1] may help.
|
| What you're saying is true, but it's generally done using
| either the API remoting or device emulation methods
| mentioned on that wiki page. In those cases, the VM does
| not see your actual GPU device, but an emulated device
| provided by the VM software. I'm running Windows within
| Parallels on a Mac, and here[2] is a screenshot showing
| the different devices each sees.
|
| In the general case, the multiplexing is all software-
| based. The guest VM talks to an emulated GPU, the
| virtualized device driver passes those calls to the
| hypervisor/host, which then generates equivalent calls to
| the GPU, and then everything goes back up the chain.
| ultimately using the GPU, the software-based indirection
| introduces a performance penalty and potential
| bottleneck. And you're also limited to the intersection
| of capabilities exposed by your virtualized GPU driver,
| hypervisor system, and the driver being used by that
| hypervisor (or host OS, for Type 2 hypervisors). The
| table under API remoting shows just how varied 3D
| acceleration support is across different hypervisors.
|
| As an alternative to that, you can use fixed passthrough
| to directly expose your physical GPU to the VM. This lets
| you tap into the full capabilities of the GPU (or other
| PCI device), and achieves near-native performance. The
| graphics calls you make in the VM now go directly to the
| GPU, cutting out the game of telephone that emulated devices
| play. Assuming, of course, your video card drivers aren't
| actively trying to block you from running within a VM[3].
|
| The problem is that when a device is assigned to a guest
| VM in this manner, that VM gets exclusive access to it.
| Even the host OS can't use it while it's assigned to the
| guest.
|
| This article is about the fourth option - mediated
| passthrough. The vGPU functionality enables the graphics
| card to expose itself as multiple logical interfaces. So
| every VM gets its own logical interface to the GPU and
| sends calls directly to the physical GPU, just as in
| normal passthrough mode, while the hardware handles the
| multiplexing instead of the host/hypervisor worrying
| about it. That gives you the best of both worlds.
|
| [1] https://en.wikipedia.org/wiki/GPU_virtualization
|
| [2] https://imgur.com/VMAGs5D
|
| [3] https://wiki.archlinux.org/index.php/PCI_passthrough_
| via_OVM...
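|
| To make the mediated option concrete: on Linux it is built
| on the kernel's mdev framework, and creating a vGPU instance
| amounts to writing a UUID into a sysfs "create" node. A
| minimal sketch in Python (the PCI address and profile name
| here are hypothetical placeholders):
|
|       import uuid
|
|       # mdev sysfs tree exposed by the physical GPU's driver
|       # (PCI address and vGPU profile name are placeholders)
|       base = "/sys/bus/pci/devices/0000:01:00.0/mdev_supported_types"
|       vgpu_type = "nvidia-18"
|
|       instance = str(uuid.uuid4())
|       # writing a UUID creates the mediated device, which can
|       # then be assigned to a VM like any other vfio device
|       with open(f"{base}/{vgpu_type}/create", "w") as f:
|           f.write(instance)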
| skykooler wrote:
| You can, but only directly under that OS. If you wanted
| to run, say, a Windows VM to run a game that doesn't work
| in Wine, you'd need some way to give a virtual GPU to the
| virtual machine. (As it is now, the only way you'd be
| able to do this is to have a separate GPU that's
| dedicated to the VM and pass that through entirely.)
| noodlesUK wrote:
| I wish Nvidia would open this up properly. The fact that Intel
| integrated GPUs can do GVT-g while I literally can't buy a laptop
| that will do vGPU passthrough with an Nvidia card for any amount
| of money is infuriating.
| my123 wrote:
| GVT-g is gone on 10th-gen CPUs (Ice Lake) and later. It's not
| supported on Intel dGPUs either.
| cercatrova wrote:
| For virtualized Windows from Linux, check out Looking Glass which
| I posted about previously
|
| https://news.ycombinator.com/item?id=22907306
| albertzeyer wrote:
| The Python script actually mostly uses Frida (https://frida.re/)
| scripting. I hadn't seen Frida before, but it looks very
| powerful. I've done some similar (but very basic) things with
| GDB/LLDB scripting before, and Frida seems to be made for
| exactly this kind of thing.
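|
| For a flavor of what Frida scripting looks like, here is a
| minimal sketch (the process name is a placeholder): attach to
| a process, hook libc's memcpy, and report each call back to
| Python - similar in spirit to how the unlock script watches
| reads of the magic/key values:
|
|       import frida
|
|       # attach to a running process by name (placeholder name)
|       session = frida.attach("target-process")
|       script = session.create_script("""
|           // intercept memcpy and report destination and size
|           Interceptor.attach(Module.getExportByName(null, 'memcpy'), {
|               onEnter(args) {
|                   send({ dst: args[0].toString(), n: args[2].toInt32() });
|               }
|           });
|       """)
|       script.on("message", lambda message, data: print(message))
|       script.load()
|       input()  # keep the process alive while the hooks run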
| madjam002 wrote:
| I built an "X gamers/workstations, 1 CPU" type build last year,
| and this has been the main problem: I have two GPUs, one of which
| is super old, and I have to choose which one I want to use when I
| boot up a VM.
|
| Will definitely be checking this out!
| WrtCdEvrydy wrote:
| This is a dumb question, but which hypervisor configuration is
| this targeted towards?
|
| There's a lot of detail at the link, which I appreciate, but
| maybe I missed it.
| sudosysgen wrote:
| Amazing! Simply amazing!
|
| This not only enables the use of GPGPU in VMs, but also enables
| the use of a single GPU to run virtualized Windows video games
| from Linux!
|
| This means that one of the major problems with Linux on the
| desktop for power users goes away, and it also means that we can
| now deploy Linux-only GPU tech such as HIP on any operating
| system that supports this trick!
| cercatrova wrote:
| For virtualized Windows from Linux, check out Looking Glass
| which I posted about previously
|
| https://news.ycombinator.com/item?id=22907306
| zucker42 wrote:
| That requires two GPUs.
| ur-whale wrote:
| > Amazing! Simply amazing!
|
| If it's such a cool feature, why does NVidia lock it away on
| non-Tesla H/W?
|
| [EDIT]: Funny, but the answers to this question actually
| provide way better answers to the other question I posted in
| this thread (as in: what is this for).
| sudosysgen wrote:
| Because otherwise, people would be able to use non-Tesla GPUs
| for cloud compute workloads, drastically reducing the cost of
| cloud GPU compute, and it would also enable the use of non-
| Tesla GPUs as local GPGPU clusters - additionally reducing
| workstation GPU sales due to more efficient resource use.
|
| GPUs are a duopoly due to intellectual property laws and high
| costs of entry (the only companies I know of that are willing
| to compete are Chinese, and only as a result of sanctions), so
| for NVidia this just allows for more profit.
| userbinator wrote:
| Interestingly, Intel is probably the most open with its
| GPUs, although it wasn't always that way; perhaps they
| realised they couldn't compete on performance alone.
| bayindirh wrote:
| I think AMD is on par with Intel, no?
| colechristensen wrote:
| Openness usually seems to be a feature of the runners up.
| moonbug wrote:
| trivial arithmetic will tell you it's not the cost of the
| hardware that makes AWS and Azure GPU instances expensive.
| sudosysgen wrote:
| Certainly - AWS, GCP, and Azure are priced well beyond
| simple hardware cost even for CPU instances; there are
| hosts that are 2-3x cheaper for most uses with equivalent
| hardware resources.
| semi-extrinsic wrote:
| Yeah, but now the comparison for many companies (e.g. R&D
| dept. is dabbling a bit in machine learning) becomes "buy
| one big box with 4x RTX 3090 for ~$10k and spin up VMs on
| that as needed", versus the cloud bill. Previously the
| cost of owning physical hardware with that capability
| would be a lot higher.
|
| This has the potential to challenge the cloud case for
| sporadic GPU use, since cloud vendors cannot buy RTX
| cards. But it would require that the tooling becomes
| simple to use and reliable.
| simcop2387 wrote:
| Entirely for market segmentation. The ones they allow it on
| are much more expensive. With this, someone could create a
| cloud game-streaming service using normal consumer cards and
| divide them up for a much cheaper experience than the $5k+
| cards that currently allow it. The recent change to permit
| virtualization at all (removing the code 43 block) allows
| some of that, but does not let you, say, take a 3090 and
| split it up for 4 customers, getting 3060-like performance
| for each of them at a fraction of the cost.
| lostmsu wrote:
| I am interested in the recent change you are referring to.
| Is there a good article on how to use it on Windows or at
| least Linux?
| sirn wrote:
| The OP is referring to GPU passthrough setup[1], which
| passes through a GPU from Linux host to Windows guest
| (e.g. for gaming). This is done by detaching the GPU from
| the host and passing it to the VM; thus most setups require
| two GPUs, since one needs to remain with the host (although
| single-GPU passthrough is also possible).
|
| Nvidia's driver used to detect that it was running in a VM
| and return error code 43, blocking the GPU from being used
| (for market segmentation between GeForce and Quadro). This
| was usually solved by either patching the VBIOS or hiding
| KVM from the guest, but it was painful and unreliable.
| Nvidia removed this limitation with the RTX 30 series.
|
| This vGPU feature unlock (TFA) allows the GPU to be
| virtualized without requiring it to first be detached from
| the host, vastly simplifying the setup and opening up the
| possibility of having multiple VMs running on a single GPU,
| each with its own dedicated vGPU.
|
| [1]: https://wiki.archlinux.org/index.php/PCI_passthrough
| _via_OVM...
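|
| For reference, the "hiding KVM" trick was done with QEMU CPU
| flags along these lines - a rough sketch launched from Python
| (the PCI address and memory size are placeholders; real
| setups usually go through libvirt):
|
|       import subprocess
|
|       subprocess.run([
|           "qemu-system-x86_64",
|           "-enable-kvm",
|           # mask the hypervisor from the guest driver
|           "-cpu", "host,kvm=off,hv_vendor_id=null",
|           # the passed-through GPU, bound to vfio-pci on the host
|           "-device", "vfio-pci,host=0000:01:00.0",
|           "-m", "8G",
|       ])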
| my123 wrote:
| The RTX A6000 is at USD 4650, with 48GB of VRAM and the
| full chip enabled (+ECC, vGPU, and pro drivers, of course).
|
| The RTX 3090, with 24GB of VRAM, is at USD 1499.
|
| Consumer dGPUs from other HW providers do not have
| virtualisation capabilities either.
| baybal2 wrote:
| Well, I believe Intel has it on iGPUs, just very well
| hidden.
| my123 wrote:
| https://news.ycombinator.com/item?id=26367726
|
| Not anymore.
| RicoElectrico wrote:
| Ngreedia - the way it's meant to be paid(tm)
| IncRnd wrote:
| An ever greater percentage of Nvidia's sales go to the
| data-center market, and consumers purchase a shrinking
| portion. They do not want to flatten their currently upward-
| trending data-center sales of high-end cards.
|
| _NVIDIA's stock price has doubled since March 2020, and
| most of these gains can be largely attributed to the
| outstanding growth of its data center segment. Data center
| revenue alone increased a whopping 80% year over year,
| bringing its revenue contribution to 37% of the total. Gaming
| still contributes 43% of the company's total revenues, but
| NVIDIA's rapid growth in data center sales fueled a 39% year-
| over-year increase in its companywide first-quarter revenues.
|
| The world's growing reliance on public and private cloud
| services requires ever-increasing processing power, so the
| market available for capture is staggering in its potential.
| Already, NVIDIA's data center A100 GPU has been mass adopted
| by major cloud service providers and system builders,
| including Alibaba (NYSE:BABA) Cloud, Amazon (NASDAQ:AMZN)
| AWS, Dell Technologies (NYSE:DELL), Google (NASDAQ:GOOGL)
| Cloud Platform, and Microsoft (NASDAQ: MSFT) Azure._
|
| https://www.fool.com/investing/2020/07/22/data-centers-
| hold-...
| matheusmoreira wrote:
| To make people pay more.
| Youden wrote:
| > This means that one of the major problems with Linux on the
| desktop for power users goes away, and it also means that we
| can now deploy Linux-only GPU tech such as HIP on any operating
| system that supports this trick!
|
| If you're brave enough, you can already do that with GPU
| passthrough. It's possible to detach the entire GPU from the
| host and transfer it to a guest and then get it back from the
| guest when the guest shuts down.
| [deleted]
| spijdar wrote:
| This could be way more practically useful than GPU
| passthrough. GPU passthrough demands at least two GPUs (an
| integrated one counts), requires at least two monitors (or
| two video inputs on one monitor), and in my experience has a
| tendency to do wonky things when the guest shuts off, since
| the firmware doesn't seem to like soft resets without the
| power being cycled. It also requires some CPU and PCIe
| controller features that aren't always present to run safely.
|
| This could allow a single GPU with a single video output to
| be used to run games in a Windows VM, without all the hoops
| that GPU passthrough entails. I'd definitely be excited for
| it!
| sudosysgen wrote:
| Certainly, but this requires BIOS/UEFI fiddling, and it also
| means you can't use both Windows and Linux at the same
| time, which is very important for me.
| airocker wrote:
| This is super! What would it take to abstract it, similar to
| CPU/memory, by specifying limits in cgroups? Limits could be
| things like GPU memory size or degree of parallelism.
| liuliu wrote:
| One thing I want to figure out (because I don't have a dedicated
| Windows gaming desktop), and the documentation on the internet
| seems sparse: my understanding is that if I want to use PCIe
| passthrough with a Windows VM, the GPU cannot be available to
| the host machine at all - or technically it can, but I need to
| do some scripting to make sure the NVIDIA driver doesn't own
| the device before starting the Windows VM, and re-enable it
| after shutdown?
|
| If I go with the vGPU solution, I don't need to turn the
| NVIDIA driver on and off for the device when running the
| Windows VM? (I won't use these GPUs on the host machine for
| display.)
| Youden wrote:
| > One thing I want to figure out (because I don't have a
| dedicated Windows gaming desktop), and the documentation on the
| internet seems sparse: my understanding is that if I want to
| use PCIe passthrough with a Windows VM, the GPU cannot be
| available to the host machine at all - or technically it can,
| but I need to do some scripting to make sure the NVIDIA driver
| doesn't own the device before starting the Windows VM, and
| re-enable it after shutdown?
|
| The latter statement is correct. The GPU can be attached to the
| host but it has to be detached from the host before the VM
| starts using it. You may also need to get a dump of the GPU ROM
| and configure your VM to load it at startup.
|
| Regarding the script, mine resembles [0]. You need to remove
| the NVIDIA drivers and then attach the card to VFIO. And then
| the opposite afterwards. You may also need to image your GPU
| ROM [1].
|
| [0]: https://techblog.jeppson.org/2019/10/primary-vga-
| passthrough...
|
| [1]: https://clayfreeman.github.io/gpu-passthrough/#imaging-
| the-g...
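|
| The core of such a script is just sysfs writes: unbind the
| device from the nvidia driver, then steer it to vfio-pci. A
| minimal sketch (the PCI address is a placeholder; run as root
| with the vfio-pci module loaded):
|
|       PCI_ADDR = "0000:01:00.0"
|
|       def sysfs_write(path, value):
|           with open(path, "w") as f:
|               f.write(value)
|
|       # release the device from whatever driver holds it (nvidia)
|       sysfs_write(f"/sys/bus/pci/devices/{PCI_ADDR}/driver/unbind",
|                   PCI_ADDR)
|       # force the next probe to match vfio-pci, then trigger it
|       sysfs_write(f"/sys/bus/pci/devices/{PCI_ADDR}/driver_override",
|                   "vfio-pci")
|       sysfs_write("/sys/bus/pci/drivers_probe", PCI_ADDR)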
| matheusmoreira wrote:
| Exactly. With GPU virtualization the driver is able to share
| the GPU resources with multiple systems such as the host
| operating system and guest virtual machine. Shame on nvidia for
| arbitrarily locking us out of this feature.
| [deleted]
| DCKing wrote:
| Dual booting is for chumps. If I could run a base Linux system
| and arbitrarily run fully hardware accelerated VMs of multiple
| Linux distros, BSDs and Windows, I'd be all over that. I could
| pretend here that I really _need_ the ability to quickly switch
| between OSes, that I'd like VM-based snapshots, or that I have
| big use cases to multiplex the hardware power in my desktop box
| like that. I really don't need it. I just want it.
|
| I really hope Intel sees this as an opportunity for their DG2
| graphics cards due out later this year.
|
| If anyone from Intel is reading this: if you guys want to carve
| out a niche for yourself, and have power users advocate for your
| hardware - this is it. Enable SR-IOV for your upcoming Xe DG2 GPU
| line just as you do for your Xe integrated graphics. Just observe
| the lengths that people go to for their Nvidia cards, injecting
| code into their proprietary drivers just to run this. You can
| make this a champion feature just by _not disabling_ something
| your hardware can already do. Add some driver support for it in
| the mix and you'll have an instant enthusiast fanbase for years
| to come.
| strstr wrote:
| Passthrough is workable right now. It's a pain to get set up,
| but it is workable.
|
| You don't need vGPU to get the job done. I've had two setups
| over time: one based on a jank old secondary GPU that is used
| by the VM host, another based on just using the jank integrated
| graphics on my chip.
|
| Even still, I dual boot because it just works. It always works,
| and boot times are crazy low for Windows these days. No
| fighting with drivers. No fighting with latency issues for non-
| passthrough devices. It all just works.
| DCKing wrote:
| Oh, I'm aware of passthrough. It's just a complete second-
| class citizen, because it isn't really virtualization; it's
| a hack. Virtualization is about multiplexing hardware.
| Passthrough is the opposite of multiplexing hardware: it's
| about yanking a peripheral from your host system and shoving
| it into one single guest VM. The fact that this yanking is
| poorly supported and has poor UX makes complete sense.
|
| I consider true peripheral multiplexing with true GPU
| virtualization to be the way of the future. It's true
| virtualization and doesn't even require you to sacrifice
| and/or babysit a single PCIe-connected GPU. Passthrough is
| just a temporary hacky workaround that people have to apply
| now because there's nothing better.
|
| In the best case scenario - with hardware SR-IOV support plus
| basic driver support for it, enabling GPU access in your VM
| with SR-IOV would be a simple checkbox in the virtualization
| software of the host. GPU passthrough can't ever get there in
| terms of usability.
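|
| For what it's worth, the generic SR-IOV knob already looks
| like that on Linux: when a device's driver supports it,
| spawning virtual functions is a one-line sysfs write (the PCI
| address and VF count below are placeholders):
|
|       # ask the physical function's driver to create 4 VFs;
|       # each VF shows up as its own PCI device for a VM
|       with open("/sys/bus/pci/devices/0000:03:00.0/sriov_numvfs",
|                 "w") as f:
|           f.write("4")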
| fock wrote:
| I have a Quadro card and at least for Windows guests I can
| easily move the card between running guests (Linux has some
| problems with yanking though). Still, virtualized GPUs
| would be nice.
| jagrsw wrote:
| It works with some cards, not with others. E.g., for Radeon Pro
| W5500 there's no known card reset method that works (no
| method from https://github.com/gnif/vendor-reset works), so I
| had to do an S3 suspend before running a VM, with _systemctl
| suspend_ or with _rtcwake -m mem -s 2_.
|
| Now I have additional RTX 2070 and it works ok.
| blibble wrote:
| Passthrough has become very easy to set up: just add your PCI
| card in virt-manager and away you go.
|
| That said, these days I just have a second PC with a load
| of cheap USB switches...
| m463 wrote:
| I've been running Proxmox. I haven't run Windows, but I have
| Ubuntu VMs with full hardware GPU passthrough. I've passed
| through Nvidia and Intel GPUs.
|
| I also have a macOS VM, but I didn't set up GPU passthrough for
| that. Tried it once, it hung, didn't try it again. I use remote
| desktop anyway.
|
| Here are some misc links:
|
| https://manjaro.site/how-to-enable-gpu-passthrough-on-proxmo...
|
| https://manjaro.site/tips-to-create-ubuntu-20-04-vm-on-proxm...
|
| https://pve.proxmox.com/wiki/Pci_passthrough
|
| https://blog.konpat.me/dev/2019/03/11/setting-up-lxc-for-int...
| easton wrote:
| Given that I use my desktop 90% of the time remotely these
| days, I'm going to set this up next time I'm home and move my
| Windows stuff into a VM. Then I can run Docker natively on the
| host and when Windows stops cooperating, just create a new VM
| (which I can't do remotely with it running on bare metal, at
| least without the risk of it not coming back up).
| schaefer wrote:
| There's a _lot_ of customer loyalty on the table waiting for the
| first GPU manufacturer to unlock this feature on consumer grade
| cards without forcing us to resort to hacks.
| [deleted]
| neatze wrote:
| To me this is a laughably naive question, but I'll ask it anyway.
|
| My understanding is that, per application, the CPU/GPU can make
| only a single draw call at a time, in a sequential manner (e.g.
| CPU->GPU->CPU->GPU).
|
| Could vGPUs be used for concurrent draw calls from multiple
| processes of a single application?
| milkey_mouse wrote:
| > My understanding is that, per application, the CPU/GPU can
| make only a single draw call at a time, in a sequential manner.
|
| The limitation you're probably thinking of is in the OpenGL
| drivers/API, not in the GPU driver itself. OpenGL has global
| (per-application) state that needs to be tracked, so outside of
| a few special cases like texture uploading, you have to issue
| OpenGL calls from only one thread. If applications use the
| lower-level Vulkan API, they can use a separate "command queue"
| for each thread. Both of those are graphics APIs, I'm less
| familiar with the compute-focused ones but I'm sure they can
| also process calls from multiple threads.
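|
| The usual workaround for that OpenGL constraint is to funnel
| all GL calls through one dedicated thread. A conceptual
| sketch (no real GL bindings; draw_frame and scene are
| placeholders):
|
|       import queue
|       import threading
|
|       gl_calls = queue.Queue()
|
|       def render_thread():
|           # the GL context would be created here, and it is
|           # only ever touched by this thread
|           while True:
|               fn, args = gl_calls.get()
|               fn(*args)
|
|       threading.Thread(target=render_thread, daemon=True).start()
|
|       # any other thread submits work instead of calling GL:
|       # gl_calls.put((draw_frame, (scene,)))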
| milkey_mouse wrote:
| And vGPUs are isolated from one another - that's the whole
| point - so using multiple in one application would be very
| difficult, as I don't think they can share data/memory in any
| way.
| neatze wrote:
| My primitive thoughts:
|
| Threaded Computation on CPU -> Single GPU Call -> Parallel
| Computation on GPU -> Threaded Computation on CPU ...
|
| I wonder if it can be used in such a way:
|
| Async Concurrent Computation on CPU -> Async Concurrent GPU
| Calls -> Parallel Time-Independent Computations on GPU ->
| Async Concurrent Computation on CPU
| [deleted]
| shmerl wrote:
| Is this for SR-IOV? It's too bad SR-IOV isn't supported in the
| Linux driver for regular desktop AMD GPUs, for example.
| Nullabillity wrote:
| Yes, this is essentially NVidia's equivalent of SR-IOV.
| jarym wrote:
| Hacking at its finest! Nice
| h2odragon wrote:
| > In order to make these checks pass, the hooks in
| vgpu_unlock_hooks.c will look for an ioremap call that maps
| physical address range that contain the magic and key values,
| recalculate the addresses of those values into the virtual
| address space of the kernel module, monitor memcpy operations
| reading at those addresses, and if such an operation occurs, keep
| a copy of the value until both are known, locate the lookup
| tables in the .rodata section of nv-kernel.o, find the signature
| and data blocks, validate the signature, decrypt the blocks, edit
| the PCI device ID in the decrypted data, reencrypt the blocks,
| regenerate the signature and insert the magic, blocks and
| signature into the table of vGPU capable magic values. And that's
| what they do.
|
| I'm very grateful _I_ wasn't required to figure that out.
| stingraycharles wrote:
| I love the conciseness of this explanation. In just a few
| sentences, I completely understand the solution, but at the
| same time also understand the black magic wizardry that was
| required to pull it off.
| jacquesm wrote:
| Not to mention the many hours or days of being stumped. This
| sort of victory typically doesn't happen overnight.
|
| What bugs me about companies like NV is that if they just
| sold their hardware and published the specs, they'd probably
| sell _more_ than with all this ridiculously locked-down
| nonsense; it is just a lot of work thrown at limiting your
| customers and protecting a broken business model.
| eli wrote:
| But they'd also sell fewer high-end models. I don't doubt
| that they've done the math.
| minimalist wrote:
| Related but different:
|
| - nvidia-patch [0] "This patch removes restriction on maximum
| number of simultaneous NVENC video encoding sessions imposed by
| Nvidia to consumer-grade GPUs."
|
| - About a week ago "NVIDIA Now Allows GeForce GPU Pass-Through
| For Windows VMs On Linux" [1]. Note: this applies only to the
| driver on Windows VM guests, not GNU/Linux guests.
|
| Hopefully the project in the OP will mean that GPU access is
| finally possible for GNU/Linux guests on Xen. Thank you for
| sharing, OP.
|
| [0]: https://github.com/keylase/nvidia-patch
|
| [1]:
| https://www.phoronix.com/scan.php?page=news_item&px=NVIDIA-G...
___________________________________________________________________
(page generated 2021-04-09 23:00 UTC)