[HN Gopher] Vgpu_unlock: Unlock vGPU functionality for consumer ...
       ___________________________________________________________________
        
       Vgpu_unlock: Unlock vGPU functionality for consumer grade GPUs
        
       Author : fragileone
       Score  : 440 points
       Date   : 2021-04-09 18:42 UTC (1 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | effie wrote:
       | Wow, does this mean NVIDIA consumer cards can now do multi-VM
        | multiplexing of a single GPU? On the AMD side, only special cards
        | like the FirePro S7150 can do this.
       | 
       | Does this work also on Xen? NVIDIA drivers were always non-
       | functional with Xen+consumer cards.
        
       | ternox99 wrote:
        | I don't understand how and where I can download the NVIDIA GRID
        | vGPU driver. Can anyone help me?
        
         | liuliu wrote:
         | Register for evaluation:
         | https://nvid.nvidia.com/dashboard/#/dashboard
        
           | ternox99 wrote:
           | Thanks :)
        
       | ur-whale wrote:
       | If - like me - you don't have a clue what vGPU is:
       | 
       | https://www.nvidia.com/en-us/data-center/virtual-solutions/
       | 
       | TL;DR: seems to be something useful for deploying GPUs in the
       | cloud, but I may not have understood fully.
        
         | lovedswain wrote:
         | It instantiates multiple logical PCI adaptors for a single
         | physical adaptor. The logical adaptors can then be mapped into
         | VMs which can directly program a hardware-virtualized view of
         | the graphics card. Intel has the same feature in their graphics
          | and networking chips.
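          | 
          | A rough sketch of what that looks like on Linux via the
          | mediated device (mdev) framework; the PCI address and the
          | "nvidia-63" type name here are placeholders, not something
          | specific to this project:
          | 
          |   # list the vGPU "types" the physical card exposes
          |   cd /sys/bus/pci/devices/0000:01:00.0/mdev_supported_types
          |   ls
          |   # create one logical device of a given type; the UUID is
          |   # the mediated device that can then be handed to a VM
          |   UUID=$(uuidgen)
          |   echo "$UUID" > nvidia-63/create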
        
           | ur-whale wrote:
           | Thanks for the explanation, but that's more of a "this is how
           | it works" than a "this is why it's useful".
           | 
           | What would be the main use case?
        
             | kjjjjjjjjjjjjjj wrote:
             | 4 people sharing 1 CPU and 1 GPU that is running a
             | hypervisor with separate installations of windows for
             | gaming
             | 
             | Basically any workload that requires sharing a GPU between
             | discrete VMs
        
             | wmf wrote:
             | The use case is allowing the host system and VM(s) to
             | access the same GPU at the same time.
        
               | ur-whale wrote:
               | Yeah, I got that from the technical explanation.
               | 
               | What's the _practical_ use case, as in, when would I need
               | this?
               | 
               | [EDIT]: To maybe ask a better way: will this practically
               | help me train my DNN faster?
               | 
               | Or if I'm a cloud vendor, will this allow me to deploy
               | cheaper GPU for my users?
               | 
               | I guess I'm asking about the economic value of the hack.
        
               | Sebb767 wrote:
               | > To maybe ask a better way: will this practically help
               | me train my DNN faster?
               | 
               | Probably not. It will only help you if you previously
               | needed to train it on a CPU because you were in a VM, but
               | this seems unlikely. It will not speed up your existing
               | GPU in any way compared to simply using it bare-metal
               | right now.
               | 
               | > Or if I'm a cloud vendor, will this allow me to deploy
               | cheaper GPU for my users?
               | 
                | Yes. This ports a feature from the XXXX$-range of GPUs to
                | the XXX$-range of GPUs. Since the performance of those is
                | comparable, you can save a lot of money this way. It also
                | lowers the entry cost to the market (i.e. now a hypervisor
                | host could be sub-1k$, if you go for cheap parts).
               | 
                | On the other hand, a business selling GPU time to
                | customers will probably not want to rely on a hack
                | (especially since there's a good chance it's violating
                | Nvidia's license), so unless you're building your own HW,
                | your bill will probably not drop. But if you're an ML
                | startup or a hobbyist, you can now cheap out on/actually
                | afford this kind of setup.
        
               | lovedswain wrote:
               | Running certain ML models in VMs
               | 
               | Running CUDA in VMs
               | 
               | Running transcoders in VMs
               | 
               | Running <anything that needs a GPU> in VMs
        
               | ur-whale wrote:
               | This is the exact same information you posted above.
               | 
               | Please see my edit.
        
               | [deleted]
        
               | jandrese wrote:
               | You have a Linux box but you want to play a game and it
               | doesn't work properly under Proton, so you spin up a
               | Windows VM to play it instead.
               | 
               | The host still wants access to the GPU to do stuff like
               | compositing windows and H.265 encode/decode.
        
               | skykooler wrote:
               | And outputting anything to the screen in general.
               | Usually, your monitor(s) are plugged into the ports on
               | the GPU.
        
             | jowsie wrote:
             | Same as any hypervisor/virtual machine setup. Sharing
             | resources. You can build 1 big server with 1 big GPU and
             | have multiple people doing multiple things on it at once,
             | or one person using all the resources for a single
             | intensive load.
        
               | ur-whale wrote:
               | Thanks, this is a concise answer.
               | 
               | However, I was under the impression - at least on Linux -
               | that I could run multiple workloads in parallel on the
               | same GPU without having to resort to vGPU.
               | 
               | I seem to be missing something.
        
               | antattack wrote:
               | If you are running Linux in a VM, vGPU will allow
                | acceleration for OpenGL, WebGL and Vulkan applications like
               | games, CAD, CAM, EDA, for example.
        
               | hesk wrote:
               | In addition to the answer by skykooler, virtual GPUs also
               | allow you to set hard resource limits (e.g., amount of L2
               | cache, number of streaming multiprocessors), so different
               | workloads do not interfere with each other.
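                | 
                | On NVIDIA's vGPU stack, for instance, each mdev "type"
                | encodes a fixed slice of the card. Something along these
                | lines (paths and type names are illustrative) shows what
                | each type reserves and how many instances are left:
                | 
                |   cd /sys/bus/pci/devices/0000:01:00.0/mdev_supported_types
                |   for t in nvidia-*; do
                |     echo "== $t =="
                |     cat "$t/name" "$t/description"
                |     cat "$t/available_instances"
                |   done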
        
               | cosmie wrote:
               | This[1] may help.
               | 
               | What you're saying is true, but it's generally using
               | either the API remoting or device emulation methods
               | mentioned on that wiki page. In those cases, the VM does
                | not see your actual GPU device, but an emulated device
               | provided by the VM software. I'm running Windows within
               | Parallels on a Mac, and here[2] is a screenshot showing
               | the different devices each sees.
               | 
                | In the general case, the multiplexing is all software
                | based. The guest VM talks to an emulated GPU, the
                | virtualized device driver passes those calls to the
                | hypervisor/host, which then generates equivalent calls to
                | the GPU, then back up the chain. So while you're still
               | ultimately using the GPU, the software-based indirection
               | introduces a performance penalty and potential
               | bottleneck. And you're also limited to the cross-section
               | of capabilities exposed by your virtualized GPU driver,
               | hypervisor system, and the driver being used by that
               | hypervisor (or host OS, for Type 2 hypervisors). The
               | table under API remoting shows just how varied 3D
               | acceleration support is across different hypervisors.
               | 
               | As an alternative to that, you can use fixed passthrough
               | to directly expose your physical GPU to the VM. This lets
               | you tap into the full capabilities of the GPU (or other
               | PCI device), and achieves near native performance. The
               | graphics calls you make in the VM now go directly to the
                | GPU, cutting out the game of telephone that emulated devices
               | play. Assuming, of course, your video card drivers aren't
               | actively trying to block you from running within a VM[3].
               | 
               | The problem is that when a device is assigned to a guest
               | VM in this manner, that VM gets exclusive access to it.
               | Even the host OS can't use it while its assigned to the
               | guest.
               | 
               | This article is about the fourth option - mediated
               | passthrough. The vGPU functionality enables the graphics
               | card to expose itself as multiple logical interfaces. So
               | every VM gets its own logical interface to the GPU and
                | sends calls directly to the physical GPU like it does in
               | normal passthrough mode, and the hardware handles the
               | multiplexing aspect instead of the host/hypervisor
               | worrying about it. Which gives you the best of both
               | worlds.
               | 
               | [1] https://en.wikipedia.org/wiki/GPU_virtualization
               | 
               | [2] https://imgur.com/VMAGs5D
               | 
               | [3] https://wiki.archlinux.org/index.php/PCI_passthrough_
               | via_OVM...
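                | 
                | For contrast, a rough sketch of the fixed-passthrough
                | route, where the whole card is rebound from its normal
                | driver to vfio-pci so a single VM can own it (the address
                | is an example; find yours with lspci -nn):
                | 
                |   ADDR=0000:01:00.0
                |   echo "$ADDR" > /sys/bus/pci/devices/$ADDR/driver/unbind
                |   echo vfio-pci > /sys/bus/pci/devices/$ADDR/driver_override
                |   echo "$ADDR" > /sys/bus/pci/drivers_probe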
        
               | skykooler wrote:
               | You can, but only directly under that OS. If you wanted
               | to run, say, a Windows VM to run a game that doesn't work
               | in Wine, you'd need some way to give a virtual GPU to the
               | virtual machine. (As it is now, the only way you'd be
               | able to do this is to have a separate GPU that's
               | dedicated to the VM and pass that through entirely.)
        
       | noodlesUK wrote:
        | I wish Nvidia would open this up properly. The fact that Intel
        | integrated GPUs can do GVT-g and I literally can't buy a laptop
        | which will do vGPU passthrough with an Nvidia card for any amount
       | of money is infuriating.
        
         | my123 wrote:
          | GVT-g is gone on 10th-gen (Ice Lake) iGPUs and later. It's not
          | supported on Intel dGPUs either.
        
           | cromka wrote:
           | Oh wow. Had no idea and I was explicitly planning my next
           | server to be Intel because of the GVT-g. Why did they abandon
           | it?
        
             | effie wrote:
             | More like did not prioritize it.
             | 
             | https://github.com/intel/gvt-linux/issues/126
        
             | [deleted]
        
       | pasnew wrote:
        | Great work! Is it limited by Nvidia's 90-day vGPU software
        | evaluation?
        
       | cercatrova wrote:
       | For virtualized Windows from Linux, check out Looking Glass which
       | I posted about previously
       | 
       | https://news.ycombinator.com/item?id=22907306
        
       | albertzeyer wrote:
       | The Python script actually mostly uses Frida (https://frida.re/)
       | scripting. I haven't seen Frida before, but this looks very
        | powerful. I did some similar (but very basic) things with
        | GDB/LLDB scripting before, but Frida seems to be built for
        | exactly this kind of thing.
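        | 
        | If you haven't used it, the bundled CLI gives a quick feel for
        | it. A minimal, hypothetical example (the process name and the
        | function pattern are placeholders, not taken from this project):
        | 
        |   pip install frida-tools
        |   # attach to a running process by PID and log calls matching
        |   # a glob pattern
        |   frida-trace -p "$(pidof nvidia-vgpu-mgr)" -i 'ioctl*'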
        
       | madjam002 wrote:
       | I built an "X gamers/workstations 1 CPU" type-build last year and
       | this has been the main problem, I have two GPUs, one of which is
       | super old and I have to choose which one I want to use when I
       | boot up a VM.
       | 
       | Will definitely be checking this out!
        
       | opan wrote:
       | Is this only for Nvidia GPUs? If so, why not put it in the title?
        
       | WrtCdEvrydy wrote:
        | This is a dumb question, but which hypervisor configuration is
        | this targeted towards?
       | 
       | There's a lot of detail on the link which I appreciate but maybe
       | I missed it.
        
         | effie wrote:
         | KVM, maybe also Xen.
        
       | solnyshok wrote:
       | what are those "some Geforce and Quadro GPUs that share the same
       | physical chip as the Tesla GPUs"?
        
         | broodbucket wrote:
         | it's in the source
         | 
         | https://github.com/DualCoder/vgpu_unlock/blob/0675b563acdae8...
        
           | solnyshok wrote:
            | Thanks! It is interesting: the GTX 1060 3GB is OK, but the RTX
            | 3070 isn't. And it is time to upgrade from my trusty 970.
        
       | effie wrote:
       | The number of technology layers one must understand and control
        | to make a GPU draw what a VM wants is frankly insane. Hypervisor
       | channels, GPU driver code, graphics server API, graphics toolkit
       | libraries - all these have several variants. It seems close to
       | impossible to get anything done in this space.
       | 
       | I just want to send drawing commands from a node over network to
       | another node with a GPU. Like "draw a black rectangle
       | 20,20,200,200 on main GPU on VM at 192.168.1.102".
       | 
       | How would I do that in the simplest possible way? Is there some
       | network graphics command protocol?
       | 
       | Like X11, but simpler and faster, just the raw drawing commands.
        
       | sudosysgen wrote:
       | Amazing! Simply amazing!
       | 
        | This not only enables the use of GPGPU in VMs, but also enables
        | the use of a single GPU to run virtualized Windows video games
        | from Linux!
       | 
       | This means that one of the major problems with Linux on the
       | desktop for power users goes away, and it also means that we can
        | now deploy Linux-only GPU tech such as HIP on any operating
       | system that supports this trick!
        
         | cercatrova wrote:
         | For virtualized Windows from Linux, check out Looking Glass
         | which I posted about previously
         | 
         | https://news.ycombinator.com/item?id=22907306
        
           | zucker42 wrote:
           | That requires two GPUs.
        
             | solnyshok wrote:
              | Looking Glass is agnostic to the hardware setup. It works
              | both with PCI passthrough and with Intel GVT-g (a single
              | GPU sliced into vGPUs).
        
         | ur-whale wrote:
         | > Amazing! Simply amazing!
         | 
          | If it's such a cool feature, why does NVidia lock it away from
          | non-Tesla H/W?
         | 
         | [EDIT]: Funny, but the answers to this question actually
         | provide way better answers to the other question I posted in
         | this thread (as in: what is this for).
        
           | sudosysgen wrote:
           | Because otherwise, people would be able to use non-Tesla GPUs
           | for cloud compute workloads, drastically reducing the cost of
           | cloud GPU compute, and it would also enable the use of non-
           | Tesla GPUs as local GPGPU clusters - additionally reducing
           | workstation GPU sales due to more efficient resource use.
           | 
           | GPUs are a duopoly due to intellectual property laws and high
           | costs of entry (the only companies I know of that are willing
            | to compete are Chinese, and only as a result of sanctions), so
           | for NVidia this just allows for more profit.
        
             | userbinator wrote:
             | Interestingly, Intel is probably the most open with its
             | GPUs, although it wasn't always that way; perhaps they
             | realised they couldn't compete on performance alone.
        
               | bayindirh wrote:
               | I think AMD is on par with Intel, no?
        
               | SXX wrote:
                | AMD does have great open source drivers, but their code
                | merges lag further behind compared to Intel. Also, at
                | least a while ago, their open documentation was quite
                | lacking for newer generations of GPUs.
        
               | colechristensen wrote:
               | Openness usually seems to be a feature of the runners up.
        
             | moonbug wrote:
             | trivial arithmetic will tell you it's not the cost of the
             | hardware that makes AWS and Azure GPU instances expensive.
        
               | sudosysgen wrote:
                | Certainly, and AWS, GCP and Azure are all priced well
                | beyond simple hardware cost even for CPUs - there are
                | hosts that are 2-3x cheaper for most uses with equivalent
                | hardware resources.
        
               | semi-extrinsic wrote:
               | Yeah, but now the comparison for many companies (e.g. R&D
               | dept. is dabbling a bit in machine learning) becomes "buy
               | one big box with 4x RTX 3090 for ~$10k and spin up VMs on
               | that as needed", versus the cloud bill. Previously the
               | cost of owning physical hardware with that capability
               | would be a lot higher.
               | 
               | This has the potential to challenge the cloud case for
               | sporadic GPU use, since cloud vendors cannot buy RTX
               | cards. But it would require that the tooling becomes
               | simple to use and reliable.
        
           | simcop2387 wrote:
           | Entirely for market segmentation. The ones they allow it on
           | are much more expensive. With this someone could create a
           | cloud game streaming service using normal consumer cards and
           | dividing them up for a much cheaper experience than the $5k+
           | cards that they currently allow it on. The recent change to
           | allow virtualization at all (removing the code 43 block) does
            | allow some of that, but does not allow you to, say, take a
            | 3090 and split it up for 4 customers and get 3060-like
            | performance for each of them at a fraction of the cost.
        
             | lostmsu wrote:
             | I am interested in the recent change you are referring to.
             | Is there a good article on how to use it on Windows or at
             | least Linux?
        
               | sirn wrote:
                | The OP is referring to a GPU passthrough setup[1], which
                | passes through a GPU from a Linux host to a Windows guest
                | (e.g. for gaming). This is done by detaching the GPU from
                | the host and passing it to the VM, thus most setups
                | require two GPUs since one needs to remain with the host
                | (although single-GPU passthrough is also possible).
               | 
                | The Nvidia driver used to detect that it was running in a
                | VM and return error code 43, blocking the card from being
                | used (for market segmentation between GeForce and
                | Quadro). This was usually solved by either patching the
                | VBIOS or hiding KVM from the guest, but it was painful
                | and unreliable. Nvidia removed this limitation with the
                | RTX 30 series.
               | 
                | This vGPU feature unlock (TFA) would allow the GPU to be
                | virtualized without requiring it to first be detached
                | from the host, vastly simplifying the setup and opening
                | up the possibility of having multiple VMs running on a
                | single GPU, each with its own dedicated vGPU.
               | 
               | [1]: https://wiki.archlinux.org/index.php/PCI_passthrough
               | _via_OVM...
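                | 
                | For reference, the pre-RTX-30 workaround looked roughly
                | like this on QEMU/KVM (a sketch only; the device address
                | and vendor string are placeholders):
                | 
                |   # hide the hypervisor and fake a Hyper-V vendor ID so
                |   # the GeForce driver doesn't throw code 43, then pass
                |   # the GPU through with vfio-pci
                |   qemu-system-x86_64 -enable-kvm \
                |     -cpu host,kvm=off,hv_vendor_id=0123456789ab \
                |     -device vfio-pci,host=01:00.0,x-vga=on \
                |     ...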
        
             | my123 wrote:
             | The RTX A6000 is at USD 4650, with 48GB of VRAM and the
             | full chip enabled (+ECC, vGPU, pro drivers of course)
             | 
              | The RTX 3090, with 24GB of VRAM, is at USD 1499.
             | 
              | Consumer dGPUs from other HW providers do not have
              | virtualisation capabilities either.
        
               | baybal2 wrote:
                | Well, I believe Intel has it on iGPUs, just very well
                | hidden.
        
               | my123 wrote:
               | https://news.ycombinator.com/item?id=26367726
               | 
               | Not anymore.
        
           | RicoElectrico wrote:
           | Ngreedia - the way it's meant to be paid(tm)
        
           | IncRnd wrote:
            | An ever greater percentage of Nvidia's sales goes to the
            | data-center market, while consumers purchase a shrinking
            | portion. They do not want to flatten their currently upward-
            | trending data-center sales of high-end cards.
           | 
            |  _NVIDIA's stock price has doubled since March 2020, and
           | most of these gains can be largely attributed to the
           | outstanding growth of its data center segment. Data center
           | revenue alone increased a whopping 80% year over year,
           | bringing its revenue contribution to 37% of the total. Gaming
           | still contributes 43% of the company's total revenues, but
           | NVIDIA's rapid growth in data center sales fueled a 39% year-
           | over-year increase in its companywide first-quarter revenues.
           | 
           | The world's growing reliance on public and private cloud
           | services requires ever-increasing processing power, so the
           | market available for capture is staggering in its potential.
           | Already, NVIDIA's data center A100 GPU has been mass adopted
           | by major cloud service providers and system builders,
           | including Alibaba (NYSE:BABA) Cloud, Amazon (NASDAQ:AMZN)
           | AWS, Dell Technologies (NYSE:DELL), Google (NASDAQ:GOOGL)
           | Cloud Platform, and Microsoft (NASDAQ: MSFT) Azure._
           | 
           | https://www.fool.com/investing/2020/07/22/data-centers-
           | hold-...
        
           | matheusmoreira wrote:
           | To make people pay more.
        
             | [deleted]
        
         | [deleted]
        
         | [deleted]
        
         | jhugo wrote:
         | I built a PC with two decent GPUs with the intention of doing
         | this (one GPU for windows in the VM, one for Linux running on
         | the host). It works _great_ performance-wise but any game with
         | anti-cheat will be very unhappy in a VM. I tried various
         | workarounds which work to varying degrees but ultimately it's a
         | huge pain.
        
         | bavell wrote:
         | While this is definitely welcome news, GPU VFIO passthrough has
         | been possible for awhile now. I've been playing games on my
         | windows VM + linux host for a few years at least. 95% native
         | performance without needing to dual boot has been a game-
         | changer (heh).
        
           | eutropia wrote:
           | What is your setup like?
           | 
           | I'm planning on switching to VFIO for my next rebuild, and
           | was curious as to how stable the setup was.
        
           | sneak wrote:
           | Could you share your configuration?
        
         | Youden wrote:
         | > This means that one of the major problems with Linux on the
         | desktop for power users goes away, and it also means that we
         | can now deploy Linux only GPU tech such as HIP on any operating
         | system that supports this trick!
         | 
         | If you're brave enough, you can already do that with GPU
         | passthrough. It's possible to detach the entire GPU from the
         | host and transfer it to a guest and then get it back from the
         | guest when the guest shuts down.
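          | 
          | With libvirt, for example, that dance is roughly this (the
          | domain name and device addresses are examples):
          | 
          |   # detach the GPU and its audio function from the host,
          |   # boot the guest, then give the card back afterwards
          |   virsh nodedev-detach pci_0000_01_00_0
          |   virsh nodedev-detach pci_0000_01_00_1
          |   virsh start win10-gaming
          |   # ... after the guest shuts down ...
          |   virsh nodedev-reattach pci_0000_01_00_1
          |   virsh nodedev-reattach pci_0000_01_00_0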
        
           | [deleted]
        
           | spijdar wrote:
           | This could be way more practically useful than GPU
           | passthrough. GPU passthrough demands at least two GPUs (an
           | integrated one counts), requires at least two monitors (or
           | two video inputs on one monitor), and in my experience has a
           | tendency to do wonky things when the guest shuts off, since
           | the firmware doesn't seem to like soft resets without the
            | power being cycled. It also requires some CPU and PCIe
            | controller features that aren't always present in order to
            | run safely.
           | 
           | This could allow a single GPU with a single video output to
           | be used to run games in a Windows VM, without all the hoops
           | that GPU passthrough entails. I'd definitely be excited for
           | it!
        
             | Youden wrote:
             | > GPU passthrough demands at least two GPUs
             | 
             | It doesn't. As I said, you can detach the GPU from the host
             | and pass it to the guest and back again. I elaborated a bit
             | more in another comment [0].
             | 
             | > This could be way more practically useful than GPU
             | passthrough.
             | 
             | I think that depends on the mechanics of how it works. How
             | exactly do you get the "monitor" of the vGPU?
             | 
             | [0]: https://news.ycombinator.com/item?id=26755390
        
             | zamadatix wrote:
             | It only requires 2 GPUs if you plan on using Linux GUI
             | applications as you game on Windows. Besides, any shared
             | single GPU solution is going to introduce performance
             | overhead and display latency, both of which are undesired
             | for gaming. Great for non-gaming things though - but
             | generally you don't need Windows for those anyways.
        
             | Kubuxu wrote:
             | > an integrated one counts
             | 
              | From experience, not always. If the dedicated GPU gets
              | selected as the BIOS GPU, then it might be impossible to
              | reset it properly for the redirect. I had this problem with
              | a 1070.
             | 
              | I have to say vGPU is an amazing feature, and this possibly
              | brings it to the "average" user (as average as a user doing
              | GPU passthrough can be).
        
           | sudosysgen wrote:
            | Certainly, but this requires BIOS/UEFI fiddling, and it also
            | means you can't use both Windows and Linux at the same time,
            | which is very important for me.
        
             | genewitch wrote:
              | I run a Gentoo host (with dual monitors) and a third
              | monitor on a separate GPU for Windows. I bought a laptop
              | with discrete and onboard GPUs, and discovered that the
              | Windows VM now lives in msrdp.exe on the laptop, rather
              | than being physically interacted with via keyboard and
              | mouse. I can still interact with the VM if there's some
              | game my laptop chokes on, but so far it's not worth the
              | hassle for the extra 10% framerate. It's amusing because my
              | laptop has a 120Hz display, so "extra 10% FPS" would be
              | nice _on the laptop_ but hey, we're not made of money over
              | here.
             | 
              | Oh, I got sidetracked. I have a kernel command line that
              | enables the IOMMU and "blacklists" the set of PCIe lanes
              | that the GPU sits on, so the kernel never sees it, even
              | when it's in use. The next thing I had to do was set up a
              | vfio-bind script that just tells qemu which GPU it's going
              | to use. Thirdly, and this is the unfortunate part, since I
              | forgot exactly what I did - there's some weirdness with
              | Windows in qemu with a passthrough GPU - you have to
              | registry hack some obscure stuff into the way Windows
              | handles the GPU memory.
             | 
              | If I am not mistaken, 95% of all of my issues were solved
             | by reading the ArchLinux documentation for qemu
             | host/guests. My system is ryzen 3600, 64GB of ram, 2x NVME
             | drives + one M.2 Sata drive, a gtx 1060 and a gtx 1070.
             | Gentoo gets 16GB of ram (unless i need more, i just shut
             | down windows or reset the guest memory) and the 1060.
             | Windows gets ~47GB of ram, and the 1070, a wifi card, and a
             | USB sound card. One of the things you quickly realize with
             | guests on machines like this is that consumer grade
             | motherboards and CPUs are garbage, there aren't enough PCIe
             | lanes to, say, passthrough a bunch of USB or SAS/SATA
             | ports, or a dedicated PCIe soundcard, or firewire. If you
             | have an idea that you'd really like to try this out as an
             | actual "desktop replacement" - especially for replacing
              | multiple desktops, I recommend going to at least a
             | threadripper, as those can expose like 4-6 times as many
             | PCIe lanes to the host OS, meaning the possibility of
             | multiple guests on multiple GPUs, or a single "redundant"
             | guest, with USB ports, SATA ports, and pcie
             | sound/firewire/whatever.
             | 
             | Why would anyone do this? dd if=/dev/sdb
             | of=/mnt/nfs/backups/windows-date.img . Q.E.D.
        
       | airocker wrote:
        | This is super! What would it take to abstract it similarly to
        | CPU/memory by specifying limits in cgroups? Limits could be
        | things like GPU memory size or the amount of parallelization?
        
       | liuliu wrote:
        | One thing I want to figure out (because I don't have a dedicated
        | Windows gaming desktop), and the documentation on the internet
        | seems sparse: it is my understanding that if I want to use PCIe
        | passthrough with a Windows VM, these GPUs cannot be available to
        | the host machine at all, or technically they can, but I need to
        | do some scripting to make sure the NVIDIA driver doesn't own
        | these PCIe lanes before opening the Windows VM and re-enable it
        | after shutdown?
       | 
        | If I go with the vGPU solution, I don't need to turn the NVIDIA
        | driver on/off for these PCIe lanes when running the Windows VM?
        | (I won't use these GPUs on the host machine for display.)
        
         | Youden wrote:
          | > One thing I want to figure out (because I don't have a
          | dedicated Windows gaming desktop), and the documentation on the
          | internet seems sparse: it is my understanding that if I want to
          | use PCIe passthrough with a Windows VM, these GPUs cannot be
          | available to the host machine at all, or technically they can,
          | but I need to do some scripting to make sure the NVIDIA driver
          | doesn't own these PCIe lanes before opening the Windows VM and
          | re-enable it after shutdown?
         | 
         | The latter statement is correct. The GPU can be attached to the
         | host but it has to be detached from the host before the VM
         | starts using it. You may also need to get a dump of the GPU ROM
         | and configure your VM to load it at start up.
         | 
         | Regarding the script, mine resembles [0]. You need to remove
         | the NVIDIA drivers and then attach the card to VFIO. And then
         | the opposite afterwards. You may also need to image your GPU
         | ROM [1]
         | 
         | [0]: https://techblog.jeppson.org/2019/10/primary-vga-
         | passthrough...
         | 
         | [1]: https://clayfreeman.github.io/gpu-passthrough/#imaging-
         | the-g...
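          | 
          | The ROM dump itself is just the sysfs rom attribute; roughly
          | (the address and output path are examples, and the card should
          | not be driving a display while you do this):
          | 
          |   cd /sys/bus/pci/devices/0000:01:00.0
          |   echo 1 > rom        # enable reads of the option ROM
          |   cat rom > /tmp/gpu-vbios.rom
          |   echo 0 > rom        # disable reads again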
        
         | matheusmoreira wrote:
         | Exactly. With GPU virtualization the driver is able to share
         | the GPU resources with multiple systems such as the host
         | operating system and guest virtual machine. Shame on nvidia for
         | arbitrarily locking us out of this feature.
        
           | liuliu wrote:
            | Got some time to try this now. It worked as expected, I have
            | vgpu_vfio. However, it doesn't perfectly fit my needs.
            | Particularly, my host system is "heavy": I need it to run
            | CUDA etc., while the VM is just there to run games. However,
            | it seems the 460.32.04 driver on the host doesn't have full
            | functionality, hence I cannot run CUDA on the host any more.
        
           | judge2020 wrote:
           | Is there info on this sort of usage? I'd love to use the host
           | for NVENC and a VM guest for traditional GPU stuff, but
           | haven't been able to find anything on doing that.
        
         | [deleted]
        
       | DCKing wrote:
       | Dual booting is for chumps. If I could run a base Linux system
       | and arbitrarily run fully hardware accelerated VMs of multiple
       | Linux distros, BSDs and Windows, I'd be all over that. I could
       | pretend here that I really _need_ the ability to quickly switch
        | between OSes, that I'd like VM-based snapshots, or that I have
       | big use cases to multiplex the hardware power in my desktop box
       | like that. I really don't need it. I just want it.
       | 
       | I really hope Intel sees this as an opportunity for their DG2
       | graphics cards due out later this year.
       | 
       | If anyone from Intel is reading this: if you guys want to carve
       | out a niche for yourself, and have power users advocate for your
       | hardware - this is it. Enable SR-IOV for your upcoming Xe DG2 GPU
       | line just as you do for your Xe integrated graphics. Just observe
       | the lengths that people go to for their Nvidia cards, injecting
       | code into their proprietary drivers just to run this. You can
       | make this a champion feature just by _not disabling_ something
       | your hardware can already do. Add some driver support for it in
        | the mix and you'll have an instant enthusiast fanbase for years
       | to come.
        
         | strstr wrote:
         | Passthrough is workable right now. It's a pain to get set up,
         | but it is workable.
         | 
          | You don't need vGPU to get the job done. I've had two setups
          | over time: one based on a jank old secondary GPU that is used
          | by the VM host, another based on just using the jank integrated
         | graphics on my chip.
         | 
         | Even still, I dual boot because it just works. It always works,
         | and boot times are crazy low for Windows these days. No
         | fighting with drivers. No fighting with latency issues for non-
         | passthrough devices. It all just works.
        
           | DCKing wrote:
           | Oh I'm aware of passthrough. It's just a complete second
           | class citizen because it isn't really virtualization, it's a
           | hack. Virtualization is about multiplexing hardware.
           | Passthrough is the opposite of multiplexing hardware: it's
           | about yanking a peripheral from your host system and shoving
           | it into one single guest VM. The fact that this yanking is
           | poorly supported and has poor UX makes complete sense.
           | 
           | I consider true peripheral multiplexing with true GPU
           | virtualization to be the way of the future. It's true
           | virtualization and doesn't even require you to sacrifice
           | and/or babysit a single PCIe connected GPU. Passthrough is
           | just a temporary hacky workaround that people have to apply
           | now because there's nothing better.
           | 
           | In the best case scenario - with hardware SR-IOV support plus
           | basic driver support for it, enabling GPU access in your VM
           | with SR-IOV would be a simple checkbox in the virtualization
           | software of the host. GPU passthrough can't ever get there in
           | terms of usability.
        
             | fock wrote:
             | I have a Quadro card and at least for Windows guests I can
             | easily move the card between running guests (Linux has some
             | problems with yanking though). Still, virtualized GPUs
             | would be nice.
        
             | strstr wrote:
              | I guess I don't really see the benefit of "true"
              | virtualization, other than its usability improvements. I
              | generally want to squeeze every bit of performance out of
              | the GPU if I care to share it with the guest at all (at
              | least on my home machine). I'd be using it for playing
              | games.
             | 
             | For the cloud, I could imagine wanting vGPUs so you can
             | shard the massive GPUs that are used there. But in cloud,
             | you would then have a single device be multi-tenant, which
             | is a bit spicy security wise. Passthrough has a very
             | straightforward security model.
        
               | KabirKwatra wrote:
                | SR-IOV would allow anyone with a single GPU to get into
                | virtualization. It would eliminate the single biggest
                | barrier to entry in the space.
        
           | jagrsw wrote:
            | It works with some cards, not with others. E.g. for the
            | Radeon Pro W5500 there's no known card reset method that
            | works (no method from https://github.com/gnif/vendor-reset
            | works), so I had to do an S3 suspend before running a VM,
            | with _systemctl suspend_ or with _rtcwake -m mem -s 2_.
           | 
            | Now I have an additional RTX 2070 and it works OK.
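            | 
            | i.e. something along these lines before each boot of the
            | guest (the domain name is an example):
            | 
            |   # a short S3 cycle resets the card, then the VM can grab it
            |   rtcwake -m mem -s 2
            |   virsh start win10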
        
           | blibble wrote:
           | passthrough has become very easy to set up, just add your pci
           | card in virt-manager and away you go
           | 
           | saying that, these days I just have a second pc with a load
           | of cheap USB switches...
        
         | m463 wrote:
         | I've been running proxmox. I haven't run windows, but I have
         | ubuntu vm's with full hardware gpu passthrough. I've passed
         | through nvidia and intel gpus.
         | 
         | I also have a macos vm, but I didn't set up gpu passthrough for
         | that. Tried it once, it hung, didn't try it again. I use remote
         | desktop anyway.
         | 
         | here are some misc links:
         | 
         | https://manjaro.site/how-to-enable-gpu-passthrough-on-proxmo...
         | 
         | https://manjaro.site/tips-to-create-ubuntu-20-04-vm-on-proxm...
         | 
         | https://pve.proxmox.com/wiki/Pci_passthrough
         | 
         | https://blog.konpat.me/dev/2019/03/11/setting-up-lxc-for-int...
        
           | Asooka wrote:
           | >macOS VM
           | 
           | What is the current licensing situation on this? Can I use it
           | legally to build software for Mac?
        
         | easton wrote:
         | Given that I use my desktop 90% of the time remotely these
         | days, I'm going to set this up next time I'm home and move my
         | Windows stuff into a VM. Then I can run Docker natively on the
         | host and when Windows stops cooperating, just create a new VM
         | (which I can't do remotely with it running on bare metal, at
         | least without the risk of it not coming back up).
        
       | williesleg wrote:
       | Why can't these assholes post a list of compatible gpu's, why do
       | I have to install this shit to see?
        
       | Sebb767 wrote:
        | If anyone needs a list of currently supported GPUs, you can find
       | it in the source code:
       | 
       | https://github.com/DualCoder/vgpu_unlock/blob/master/vgpu_un...
       | (the comments behind the device ids)
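        | 
        | To check whether your own card is on that list, compare its PCI
        | device ID (the part after "10de:") against the IDs in that file,
        | e.g.:
        | 
        |   lspci -nn | grep -Ei 'vga|3d controller'
        |   # -> ... NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81]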
        
       | kkielhofner wrote:
       | At the risk of piling on without value: "Amazing, simply
       | amazing".
       | 
       | I've been (more or less) accused of being an Nvidia fanboy on HN
       | previously but this is an area where I've always thought Nvidia
       | has their market segmentation wrong. Just wrong.
       | 
       | This is great work. (Period, as in "end of sentence, mic drop").
        
       | schaefer wrote:
       | There's a _lot_ of customer loyalty on the table waiting for the
       | first GPU manufacturer to unlock this feature on consumer grade
       | cards without forcing us to resort to hacks.
        
         | [deleted]
        
         | judge2020 wrote:
         | Not many people are foregoing a GPU given this limitation,
         | though, except for maybe miners (which will virtualize and
         | pirate/never activate Windows if they really need Windows).
        
           | throwaway2048 wrote:
           | that equilibrium only holds as long as your competitor
           | doesn't offer it.
        
       | neatze wrote:
        | To me this is a laughably naive question, but I'll ask it anyway.
       | 
        | My understanding is that, per application, the CPU/GPU can only
        | make a single draw call at a time, in a sequential manner (e.g.
        | CPU->GPU->CPU->GPU).
       | 
        | Could vGPUs be used for concurrent draw calls from multiple
        | processes of a single application?
        
         | milkey_mouse wrote:
         | > My understanding is that CPU/GPU per application can make
         | only single draw call in sequential manner.
         | 
         | The limitation you're probably thinking of is in the OpenGL
         | drivers/API, not in the GPU driver itself. OpenGL has global
         | (per-application) state that needs to be tracked, so outside of
         | a few special cases like texture uploading you have to only
         | issue OpenGL calls from one thread. If applications use the
         | lower-level Vulkan API, they can use a separate "command queue"
         | for each thread. Both of those are graphics APIs, I'm less
         | familiar with the compute-focused ones but I'm sure they can
         | also process calls from multiple threads.
        
           | milkey_mouse wrote:
            | And vGPUs are isolated from one another (that's the whole
            | point), so using multiple in one application would be very
            | difficult, as I don't think they can share data/memory in any
            | way.
        
           | neatze wrote:
           | My primitive thoughts:
           | 
           | Threaded Computation on CPU -> Single GPU Call -> Parallel
           | Computation on GPU -> Threaded Computation on CPU ...
           | 
            | I wonder if it can be used in such a way:
            | 
            | Async Concurrent Computation on CPU -> Async Concurrent GPU
            | Calls -> Parallel Time-Independent Computations on GPU ->
            | Async Concurrent Computation on CPU
        
         | [deleted]
        
       | shmerl wrote:
       | Is this for SR-IOV? It's too bad SR-IOV isn't supported on
       | regular desktop AMD GPUs for example in the Linux driver.
        
         | Nullabillity wrote:
         | Yes, this is basically NVidia's SR-IOV.
        
       | jarym wrote:
       | Hacking at its finest! Nice
        
       | h2odragon wrote:
       | > In order to make these checks pass the hooks in
        | vgpu_unlock_hooks.c will look for an ioremap call that maps the
       | physical address range that contain the magic and key values,
       | recalculate the addresses of those values into the virtual
       | address space of the kernel module, monitor memcpy operations
       | reading at those addresses, and if such an operation occurs, keep
       | a copy of the value until both are known, locate the lookup
       | tables in the .rodata section of nv-kernel.o, find the signature
        | and data blocks, validate the signature, decrypt the blocks, edit
       | the PCI device ID in the decrypted data, reencrypt the blocks,
       | regenerate the signature and insert the magic, blocks and
       | signature into the table of vGPU capable magic values. And that's
       | what they do.
       | 
          | I'm very grateful _I_ wasn't required to figure that out.
        
         | stingraycharles wrote:
         | I love the conciseness of this explanation. In just a few
         | sentences, I completely understand the solution, but at the
         | same time also understand the black magic wizardry that was
         | required to pull it off.
        
           | jacquesm wrote:
           | Not to mention the many hours or days of being stumped. This
           | sort of victory typically doesn't happen overnight.
           | 
           | What bugs me about companies like NV is that if they just
           | sold their hardware and published the specs they'd probably
           | sell _more_ than with all this ridiculously locked down
           | nonsense, it is just a lot of work thrown at limiting your
           | customers and protecting a broken business model.
        
             | TomVDB wrote:
             | Is a "broken business model" one that requires you to pay
              | extra for additional features?
             | 
             | If Nvidia enabled all their professional features on all
             | gaming SKUs, the only reason to buy a professional SKU
             | would be additional memory.
             | 
             | Today, they make almost $1B per year in the professional
             | non-datacenter business alone. There is no way they'd be
             | able to compensate that revenue with volume (and gross
             | margins would obviously tank as well, which makes Wall
             | Street very unhappy.)
             | 
             | That's obviously even more so in today's market conditions.
             | 
             | Do you feel it's justified that you have to pay $10K extra
             | for the self-driving feature on a Tesla? Or should they
             | also be forced to give away that feature for free? After
             | all, it's just a SW upgrade. (Don't mistake this for an
             | endorsement...)
        
               | matheusmoreira wrote:
               | > Is a "broken business model" one that requires you to
               | pay for extra additional features?
               | 
               | Yes. Who actually likes being segmented into markets? We
               | want to pay a fair price for products instead of being
               | exploited.
               | 
               | > If Nvidia enabled all their professional features on
               | all gaming SKUs, the only reason to buy a professional
               | SKU would be additional memory.
               | 
               | So what? A GPU is a GPU. It's all more or less the same
               | thing. They would not have to lock down hardware features
               | otherwise.
               | 
               | > Today, they make almost $1B per year in the
               | professional non-datacenter business alone. There is no
               | way they'd be able to compensate that revenue with volume
               | (and gross margins would obviously tank as well, which
               | makes Wall Street very unhappy.)
               | 
               | Who cares really. Pursuit of profit does not excuse bad
               | behavior. They _should_ lose money every time they do it.
        
               | TomVDB wrote:
                | Having to pay more for a product than what you're willing
               | to pay is exploitation.
        
               | matheusmoreira wrote:
               | More like selling me hardware with a built-in limiter
               | that doesn't go away unless I pay more and they flip the
               | "premium customer" bit.
        
               | TomVDB wrote:
               | Are you against buying a license to unlock advanced
               | software features as well, or do you have the same
               | irrational belief that only products that include a HW
               | component shouldn't be allowed to charge for advanced
               | features?
               | 
                | Would you prefer it if companies made 2 separate silicon
                | designs, one with virtualization support in HW
               | and one without, even if it would reduce their ability to
               | work on advancing the state of the art due to wasted
               | engineering resources?
               | 
               | Or would you prefer that all features are enabled all the
               | time, but with the consequence that prices are raised by,
               | say, 10% for everyone, even though 99% of customers don't
               | give a damn about these extra features?
        
               | matheusmoreira wrote:
               | > Are you against buying a license to unlock advanced
               | software features as well
               | 
               | I'm against "licenses" in general. If your software is
               | running on my computer, I make the rules. If it's running
               | on your server, you make the rules. It's very simple.
               | 
               | > do you have the same irrational belief that only
               | products that include a HW component shouldn't be allowed
               | to charge for advanced features?
               | 
               | When I buy a thing, I expect it to perform to its full
               | capacity. Nothing irrational about that.
               | 
               | > Would you prefer if companies made 2 separate pieces of
               | silicon designs, one with virtualization support in HW
               | and one without
               | 
                | Sure. At least then there would be a real limitation
                | rather than some made-up illusion.
               | 
               | > even if it would reduce their ability to work on
               | advancing the state of the art due to wasted engineering
               | resources?
               | 
               | The real waste of engineering resources is all this
               | software limiter crap. They shouldn't even be writing
               | drivers in the first place. They're a hardware company,
               | they should be making hardware and publishing
               | documentation. Instead they're locking out open source
               | developers, adding DRM to their cards and blocking
               | workloads they don't like.
               | 
               | > Or would you prefer that all features are enabled all
               | the time, but with the consequence that prices are raised
               | by, say, 10% for everyone, even though 99% of customers
               | don't give a damn about these extra features?
               | 
               | That is how things are supposed to work, yes.
        
               | TomVDB wrote:
               | Alright, you want the power to decide who's allowed to
               | make hardware and who's allowed to write software.
               | 
               | Have a nice day!
        
               | matheusmoreira wrote:
               | Not really? I don't really care how much money they burn
               | on useless stuff. You brought up misuse of engineering
               | resources so I pointed out the fact they didn't actually
               | have to write any software. All they have to do is
               | release documentation and the problem will take care of
               | itself.
        
               | [deleted]
        
               | webmaven wrote:
               | _> If Nvidia enabled all their professional features on
               | all gaming SKUs, the only reason to buy a professional
               | SKU would be additional memory._
               | 
               |  _> Today, they make almost $1B per year in the
               | professional non-datacenter business alone. There is no
               | way they'd be able to compensate that revenue with volume
               | (and gross margins would obviously tank as well, which
               | makes Wall Street very unhappy.)_
               | 
               | You're looking at it wrong. If Nvidia were to enable all
               | features on their hardware, they wouldn't be giving up
               | that additional revenue, they would instead have to
               | create differentiated hardware with and without certain
               | features.
               | 
               | Their costs would increase somewhat (as currently their
               | professional SKUs enjoy some economies of scale by virtue
               | of being lumped in with the higher-volume gaming SKUs),
               | but it would hardly be the catastrophe you're describing.
                | The pro market is large enough to enjoy its own
               | economies of scale, even if the hardware wasn't nearly
               | identical (which it still would be).
        
               | lawl wrote:
               | > Do you feel it's justified that you have to pay $10K
               | extra for the self-driving feature on a Tesla? Or should
               | they also be forced to give away that feature for free?
               | After all, it's just a SW upgrade.
               | 
                | I feel like I already paid for the hardware. If Tesla
                | says it's cheaper for them to stick the necessary
                | hardware into every car, I'm still paying for it if I buy
                | one without self-driving. Thus if I think the Tesla
                | software isn't worth 10k and I'd rather use openpilot, I
                | feel like I should have the right to do that.
               | 
               | But nvidia is also actively interfering with open source
               | drivers (nouveau) with signature checks etc.
        
               | xvilka wrote:
               | Yes, NVIDIA is actively malicious on that front.
        
               | TomVDB wrote:
               | (Leaving aside that Tesla also actively prevents you from
               | running your own software on its platform...)
               | 
               | The whole focus on hardware is just bizarre.
               | 
               | When you buy a piece of SW that has free features but
               | requires a license key to unlock advanced features,
               | everything is fine, but the moment HW is involved all of
               | that flies out of the window.
               | 
               | Extra features cost money to implement. Companies want to
               | be paid for it.
               | 
               | A company like Nvidia could decide to make 2 pieces of
                | silicon, one with professional features and one without.
               | Or they could disable it.
               | 
               | Obviously, you'd prefer the first option, even if it
               | absolutely makes no sense to do so. It'd be a waste of
               | engineering resources that could have been spent on
               | future products.
               | 
               | Deciding to disable a feature on a piece of silicon is no
               | different than changing a #define or adding "if (option)"
               | to disable an advanced feature.
               | 
               | By doing so, I have the option to not pay for an advanced
               | feature that I don't need.
               | 
                | I don't want the self-driving option in a Tesla, and
                | I'm very happy to have the option of not paying for it.
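                | 
                | To make the analogy concrete, here is a minimal sketch
                | (purely illustrative, not anyone's actual driver code)
                | of the two gating styles being compared: a compile-time
                | #define versus a runtime "if (option)" check.
                | 
                |     /* feature_gate.c -- illustrative sketch only.
                |      * Two ways to ship one code base (or one die):
                |      *   cc -DENABLE_VGPU feature_gate.c  -> "pro" build
                |      *   cc feature_gate.c                -> consumer build
                |      */
                |     #include <stdbool.h>
                |     #include <stdio.h>
                | 
                |     /* Runtime gate: same binary, switched by a flag. */
                |     static bool license_allows_vgpu = false;
                | 
                |     static void enable_vgpu(void)
                |     {
                |     #ifdef ENABLE_VGPU           /* the "#define" case */
                |         puts("vGPU enabled at build time");
                |     #else
                |         if (license_allows_vgpu) /* the "if (option)" case */
                |             puts("vGPU unlocked by license");
                |         else
                |             puts("vGPU not available on this SKU");
                |     #endif
                |     }
                | 
                |     int main(void)
                |     {
                |         enable_vgpu();
                |         return 0;
                |     }
                | 
                | Either way the capability exists in the shipped
                | artifact; the difference is only who is allowed to
                | flip the switch.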
        
               | Out_of_Characte wrote:
               | >When you buy a piece of SW that has free features but
               | requires a license key to unlock advanced features,
               | everything is fine, but the moment HW is involved all of
               | that flies out of the window.
               | 
                | This doesn't make sense at all. In your scenario you
                | always pay for what you get, and developing additional
                | features has a non-zero cost associated with it (unless
                | you download software that targets unskilled consumers,
                | like Chessbase's Fritz engine, which was essentially
                | Stockfish but at $100 instead of free).
               | 
               | >Extra features cost money to implement. Companies want
               | to be paid for it.
               | 
                | This doesn't make sense in your scenario either. You
                | already have the silicon with the 'advanced features' in
                | your hands. The reason they lock the feature is so that
                | you have to buy a more expensive card, with overpowered
                | hardware that you don't need, in order to use a feature
                | that all cards would have if it weren't disabled. The
                | only reasonable explanation at this point that doesn't
                | involve monopolistic practices to make more money
                | (nothing wrong with that) is that developing the feature
                | was so prohibitively expensive that it required
                | consumers to pay for much higher-margin cards in order
                | to offset the development costs. Which is what's
                | happening.
               | 
               | >A company like Nvidia could decide to make 2 pieces of
                | silicon, one with professional features and one without.
               | Or they could disable it.
               | 
                | That would cost a lot of money. All the more reason why
                | it might have been done to upsell more cards instead of
                | offering quantitative improvements at a different price.
        
               | zaksoup wrote:
               | I agree with you, but I also think the commenter you're
               | replying to agrees with you.
               | 
                | The issue is not that Tesla FSD should come with the
                | hardware; the issue is that if I buy the hardware, I
                | should have the right to do whatever I want with it, and
                | so we shouldn't leave aside that Tesla prevents us from
                | running our own software.
               | 
                | This is relevant to the NVidia situation since their
                | software doesn't add features; it limits things the chip
               | is already capable of. Just like Tesla won't let you run
               | Comma.AI or something similar on their hardware...
        
             | eli wrote:
              | But they'd also sell fewer high-end models. I don't doubt
             | that they've done the math.
        
             | PragmaticPulp wrote:
             | > they'd probably sell more than with all this ridiculously
             | locked down nonsense
             | 
             | It's currently impossible to find any nVidia GPU in stock
             | because the demand far outstrips the supply.
             | 
             | Market segmentation is only helping their profit margins,
              | not hurting them.
        
             | oatmeal_croc wrote:
             | I have a feeling that they've done the math and have
             | realized what makes them the most money.
        
             | matheusmoreira wrote:
             | Their business model is not broken. Not yet. With hardware
             | unlocking software like vgpu_unlock, we can break it.
        
               | throwaway2048 wrote:
               | Count on driver updates breaking this workaround
        
               | matheusmoreira wrote:
               | If they break it, the people who really need this feature
                | will simply not upgrade. Companies run decades-old
                | software all the time; this isn't going to be any
                | different. It's just like Nvidia's ridiculous blocking of
                | cryptocurrency mining workloads. Once the fix is out,
               | it's over.
               | 
               | Also I have no doubt people will find other ways to
               | unlock the hardware.
        
               | throwaway2048 wrote:
                | Things like video games often require driver updates to
                | function, and gaming is a major use case of this hack.
                | Not to mention that older Nvidia drivers do not support
                | newer Linux kernels.
        
               | webmaven wrote:
               | _> Count on driver updates breaking this workaround_
               | 
               | If the workaround results in enough money being left on
               | the table, this _might_ prompt 3rd party investment in
               | open source drivers in order to keep the workaround
                | available by eliminating the dependence on Nvidia's
               | proprietary drivers.
        
             | [deleted]
        
         | kbumsik wrote:
         | Also, I wonder what kinds of skills are required to figure this
         | out? I don't think just knowing Linux kernel internals would be
         | enough.
        
           | Sebb767 wrote:
           | The actual "trick" behind this is well known and has been
            | done for quite some time. One could actually solder on a
            | different device ID on the GTX 6xx series [0] or flash a
            | different VBIOS on the GTX 5xx ones. The real achievement
           | here is implementing this in software without touching the
           | GPU.
           | 
           | This is not to downplay the OP, of course - this is truly
           | great and I'm sure it was a lot of work. But the hardware
           | part is not new.
           | 
           | [0] https://web.archive.org/web/20200814064418/https://www.ee
           | vbl...
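            | 
            | For anyone unsure what "device ID" refers to here: the
            | driver decides which features to expose based on the
            | PCI vendor/device ID it reads from the card, and the
            | old hardware tricks simply made the card report a
            | professional ID. As a purely illustrative sketch (not
            | the OP's code; the 0000:01:00.0 slot address is an
            | assumption, substitute whatever lspci reports), this
            | prints the IDs a Linux host sees via sysfs:
            | 
            |     /* pci_id.c -- illustrative sketch: print the PCI
            |      * vendor and device ID the host sees for a GPU. */
            |     #include <stdio.h>
            | 
            |     int main(void)
            |     {
            |         /* Assumed slot; check yours with lspci. */
            |         const char *dev = "/sys/bus/pci/devices/0000:01:00.0";
            |         char path[128];
            |         unsigned int vendor = 0, device = 0;
            |         FILE *f;
            | 
            |         snprintf(path, sizeof path, "%s/vendor", dev);
            |         f = fopen(path, "r");
            |         if (f) { fscanf(f, "%x", &vendor); fclose(f); }
            | 
            |         snprintf(path, sizeof path, "%s/device", dev);
            |         f = fopen(path, "r");
            |         if (f) { fscanf(f, "%x", &device); fclose(f); }
            | 
            |         /* 0x10de is Nvidia's PCI vendor ID; the device
            |          * ID distinguishes consumer from pro SKUs. */
            |         printf("vendor 0x%04x device 0x%04x\n", vendor, device);
            |         return 0;
            |     }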
        
       | minimalist wrote:
       | Related but different:
       | 
       | - nvidia-patch [0] "This patch removes restriction on maximum
       | number of simultaneous NVENC video encoding sessions imposed by
       | Nvidia to consumer-grade GPUs."
       | 
       | - About a week ago "NVIDIA Now Allows GeForce GPU Pass-Through
        | For Windows VMs On Linux" [1]. Note: this is only for the driver
        | on Windows VM guests, not GNU/Linux guests.
       | 
        | Hopefully the project in the OP will mean that GPU access is
        | finally possible on GNU/Linux guests on Xen. Thank you for
        | sharing, OP.
       | 
       | [0]: https://github.com/keylase/nvidia-patch
       | 
       | [1]:
       | https://www.phoronix.com/scan.php?page=news_item&px=NVIDIA-G...
        
       | archi42 wrote:
        | This puts me in a tough spot: there are good reasons to go with
        | nVidia (once they release GPUs with a proper memory
        | configuration): DLSS, RTX, and now this. On the other hand, do
        | I really want to
       | give money to the company that locked this feature in the first
       | place? Difficult call, but at least the ridiculous prices mean I
       | can still think about it for a few more months.
        
         | SXX wrote:
          | I'm very much an AMD fanboy, and I find RTX quite useless
          | since it's only implemented in a few games and works well at
          | 2K only on really high-end cards.
          | 
          | Yet AMD is no different there. They also locked SR-IOV behind
          | premium datacenter hardware only, and they certainly want to
          | keep some features away from consumers.
        
           | chaz6 wrote:
           | Not particularly an nvidia fanboy, but RTX can also be used
           | to speed up rendering in Blender with the E-Cycles add-on.
           | 
           | https://blendermarket.com/products/e-cycles
        
       ___________________________________________________________________
       (page generated 2021-04-10 23:02 UTC)