[HN Gopher] Show HN: Attaching to a virtual GPU over TCP
___________________________________________________________________
Show HN: Attaching to a virtual GPU over TCP
We developed a tool to trick your computer into thinking it's
attached to a GPU which actually sits across a network. This allows
you to switch the number or type of GPUs you're using with a single
command.
Author : bmodel
Score : 310 points
Date : 2024-08-09 16:50 UTC (1 day ago)
(HTM) web link (www.thundercompute.com)
(TXT) w3m dump (www.thundercompute.com)
| talldayo wrote:
| > Access serverless GPUs through a simple CLI to run your
| existing code on the cloud while being billed precisely for usage
|
| Hmm... well I just watched you run nvidia-smi in a Mac terminal,
| which is a platform it's explicitly not supported on. My instant
| assumption is that your tool copies my code into a private server
| instance and communicates back and forth to run the commands.
|
| Does this platform expose eGPU capabilities if my host machine
| supports it? Can I run raster workloads or network it with my own
| CUDA hardware? The actual way your tool and service connects
| isn't very clear to me and I assume other developers will be
| confused too.
| bmodel wrote:
| Great questions! To clarify the demo, we were ssh'd into a
| linux machine with no GPU.
|
| Going into more details for how this works, we intercept
| communication between the CPU and the GPU so only GPU code and
| commands are sent across the network to a GPU that we are
| hosting. This way we are able to virtualize a remote GPU and
| make your computer think it's directly attached to that GPU.
|
| We are not copying your CPU code and running it on our
| machines. The CPU code runs entirely on your instance (meaning
| no files need to be copied over or packages installed on the
| GPU machine). One of the benefits of this approach is that you
| can easily scale to a more / less powerful GPU without needing
| to set up a new server.
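| As a rough illustration of that kind of userspace interception
| (not Thunder's actual code: the port, opcode, and wire format
| below are made up, and interposition only works when the app
| links the CUDA runtime dynamically), an LD_PRELOAD shim can
| override a runtime symbol, serialize the call, and ship it to a
| daemon sitting next to the real GPU:
|
|   /* shim.c -- illustrative sketch only.
|    * Build: gcc -shared -fPIC shim.c -o libshim.so
|    * Run:   GPU_HOST=<gpu-server-ip> LD_PRELOAD=./libshim.so ./cuda_app */
|   #include <arpa/inet.h>
|   #include <netinet/in.h>
|   #include <stddef.h>
|   #include <stdint.h>
|   #include <stdlib.h>
|   #include <sys/socket.h>
|   #include <unistd.h>
|
|   typedef int cudaError_t;        /* stand-in for the runtime's enum */
|
|   static int gpu_sock(void) {     /* one TCP connection to the GPU host */
|       static int fd = -1;
|       if (fd < 0) {
|           const char *host = getenv("GPU_HOST");
|           struct sockaddr_in a = {0};
|           a.sin_family = AF_INET;
|           a.sin_port = htons(9999);              /* hypothetical port */
|           inet_pton(AF_INET, host ? host : "127.0.0.1", &a.sin_addr);
|           fd = socket(AF_INET, SOCK_STREAM, 0);
|           connect(fd, (struct sockaddr *)&a, sizeof a);
|       }
|       return fd;
|   }
|
|   /* The app calls cudaMalloc(); no local GPU is touched. We send an
|    * opcode plus the size, and the daemon replies with a status code
|    * and a handle that is only meaningful on the remote GPU. */
|   cudaError_t cudaMalloc(void **devPtr, size_t size) {
|       int fd = gpu_sock();
|       uint32_t op = 1;                           /* hypothetical opcode */
|       uint64_t sz = size;
|       write(fd, &op, sizeof op);
|       write(fd, &sz, sizeof sz);
|
|       int32_t status;
|       uint64_t handle;
|       read(fd, &status, sizeof status);
|       read(fd, &handle, sizeof handle);
|       *devPtr = (void *)(uintptr_t)handle;       /* opaque remote pointer */
|       return (cudaError_t)status;
|   }
|
| A real remoting layer would do the same for every call that touches
| the device (kernel launches, memcpys, streams) and batch them to
| hide round trips.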
| billconan wrote:
| does this mean you have a customized/dummy kernel gpu driver?
|
| will that cause system instability, say, if the network
| suddenly dropped?
| bmodel wrote:
| We are not writing any kernel drivers; this runs entirely
| in userspace (this won't result in a CrowdStrike-level
| crash, haha).
|
| Given that, if the network suddenly dropped then only the
| process using the GPU would fail.
| ZeroCool2u wrote:
| How do you do that exactly? Are you using eBPF or
| something else?
|
| Also, for my ML workloads the most common bottleneck is
| GPU VRAM <-> RAM copies. Doesn't this dramatically
| increase latency? Or is it more like it increases latency
| on first data transfer, but as long as you dump
| everything into VRAM all at once at the beginning you're
| fine? I'd expect this wouldn't play super well with stuff
| like PyTorch data loaders, but would be curious to hear
| how you've fared when testing.
| bmodel wrote:
| We intercept api calls and use our own implementation to
| forward them to a remote machine. No eBPF (which I
| believe needs to run in the kernel).
|
| As for latency, we've done a lot of work to minimize that
| as much as possible. You can see the performance we get
| running inference on BERT from huggingface here:
| https://youtu.be/qsOBFQZtsFM?t=64. It's still slower than
| local (mainly for training workloads) but not by as much
| as you'd expect. We're aiming to reach near parity in the
| next few months!
| samstave wrote:
| When you release a self-host version, what would be
| really neat would be to see it across HFT focused NICs
| that have huge TCP buffers...
|
| https://www.arista.com/assets/data/pdf/HFT/HFTTradingNetw
| ork...
|
| Basically taking into account the large buffers and
| super-time-sensitive nature of HFT networking
| optimizations, I wonder if your TCP<-->GPU might benefit
| from both the HW and the learnings of HFT stylings?
| ZeroCool2u wrote:
| Got it. eBPF modules run as part of the kernel, but
| they're still user space programs.
|
| I would consider using a larger model for
| demonstrating inference performance, as I have 7B models
| deployed on CPU at work, but a GPU is still important for
| training BERT-size models.
| billconan wrote:
| is this a remote nvapi?
|
| this is awesome. can it do 3d rendering (vulkan/opengl)
| czbond wrote:
| I am not in this "space", but I second the "this is cool to
| see"; more stuff like this is needed on HN.
| cpeterson42 wrote:
| Appreciate the praise!
| bmodel wrote:
| Thank you!
|
| > is this a remote nvapi
|
| Essentially yes! Just to be clear, this covers the entire GPU,
| not just the NVAPI (i.e. all of CUDA). It functions as if you
| had the physical card directly plugged into the machine.
|
| Right now we don't support Vulkan or OpenGL since we're mostly
| focusing on AI workloads; however, we plan to support these in
| the future (especially if there is interest!)
| billconan wrote:
| sorry, I didn't mean nvapi, I meant rmapi.
|
| I bet you saw this https://github.com/mikex86/LibreCuda
|
| they implemented the cuda driver by calling into rmapi.
|
| My understanding is if there is a remote rmapi, other user
| mode drivers should work out of the box?
| doctorpangloss wrote:
| I don't get it. Why would I start an instance in ECS, to use your
| GPUs in ECS, when I could start an instance for the GPUs I want
| in ECS? Separately, why would I want half of Nitro, instead of
| real Nitro?
| billconan wrote:
| It's more transparent to your system. For example, if you have
| a GUI application that needs GPU acceleration on a thin client
| (Matlab, SolidWorks, Blender), you can run it without setting up
| ECS. You can develop without any GPU, but suddenly have one
| when you need to run a simulation. This will be way cheaper than
| AWS.
|
| I think essentially this is solving the same problem Ray
| (https://www.ray.io/) is solving, but in a more generic way.
|
| it potentially can have finer grained gpu sharing, like a half-
| gpu.
|
| I'm very excited about this.
| bmodel wrote:
| Exactly! Finer-grained sharing is one of the key things on
| our radar right now.
| goku-goku wrote:
| www.juicelabs.co does all this today, including the GPU
| sharing and fractionalization.
| ranger_danger wrote:
| the free community version has been discontinued, and
| also doesn't support a linux client with non-CUDA
| graphics, regardless of the server OS, which is a non-
| starter for me
| bmodel wrote:
| Great point, there are a few benefits:
|
| 1. If you're actively developing and need a GPU then you
| typically would be paying the entire time the instance is
| running. Using Thunder means you only pay for the GPU while
| actively using it. Essentially, if you are running CPU only
| code you would not be paying for any GPU time. The alternative
| is to manually turn the instance on and off, which can
| be annoying.
|
| 2. This allows you to easily scale the type and number of GPUs
| you're using. For example, say you want to do development on a
| cheap T4 instance and run a full DL training job on a set of 8
| A100s. Instead of needing to swap instances and set everything
| up again, you can just run a command and start running on the
| more powerful GPUs.
| doctorpangloss wrote:
| Okay, but your GPUs are in ECS. Don't I just want this
| feature from Amazon, not you, and natively via Nitro? Or even
| Google has TPU attachments.
|
| > 1. If you're actively developing and need a GPU [for
| fractional amounts of time]...
|
| Why would I need a GPU for a short amount of time during
| development? For testing?
|
| I don't get it - what would testing an H100 over a TCP
| connection tell me? It's like, yeah, I can do that, but it
| doesn't represent an environment I am going to use for real.
| Nobody runs applications against GPUs on buses virtualized over
| TCP connections, so what exactly would I be validating?
| bmodel wrote:
| I don't believe Nitro would allow you to access a GPU
| that's not directly connected to the CPU that the VM is
| running on. So swapping between GPU types or scaling to
| multiple GPUs is still a problem.
|
| From the developer perspective, you wouldn't know that the
| H100 is across a network. The experience will be as if your
| computer is directly attached to an H100. The benefit here
| is that if you're not actively using the H100 (such as when
| you're setting up the instance or after the training job
| completes) you are not paying for the H100.
| doctorpangloss wrote:
| Okay, a mock H100 object would also save me money. I
| could pretend a 3090 is an A100. "The experience would be
| that a 3090 is an A100." Apples to oranges comparison?
| It's using a GPU attached to the machine versus a GPU
| that crosses a VPC boundary. Do you see what I am saying?
|
| I would never run a training job on a GPU virtualized
| over TCP connection. I would never run a training job
| that requires 80GB of VRAM on a 24GB VRAM device.
|
| Whom is this for? Who needs to save kopecks on a single
| GPU who needs H100s?
| teaearlgraycold wrote:
| I develop GPU accelerated web apps in an EC2 instance with
| a remote VSCode session. A lot of the time I'm just doing
| web dev and don't need a GPU. I can save thousands per
| month by switching to this.
| amelius wrote:
| Sounds like you can save thousands by just buying a
| simple GPU card.
| teaearlgraycold wrote:
| Well, for the time being I'm really just burning AWS
| credits. But you're right! I do however like that my dev
| machine is the exact same instance type in the same AWS
| region as my production instances. If I built an
| equivalent machine it would have different performance
| characteristics. Oftentimes the AWS VMs have weird
| behavior that would otherwise catch me off guard
| when deploying to the cloud for the first time.
| steelbrain wrote:
| Ah, this is quite interesting! I had a use case where I needed
| GPU-over-IP but only for transcoding videos. I had a not-so-
| powerful AMD GPU in my homelab server that somehow kept crashing
| the kernel any time I tried to encode videos with it, and also an
| NVIDIA RTX 3080 in a gaming machine.
|
| So I wrote https://github.com/steelbrain/ffmpeg-over-ip and had
| the server running on the Windows machine and the client on the
| media server (could be Plex, Emby, Jellyfin, etc.), and it worked
| flawlessly.
| crishoj wrote:
| Interesting. Do you know if your tool supports conversions
| resulting in multiple files, such as HLS and its myriad of
| timeslice files?
| steelbrain wrote:
| Since it's sharing the underlying file system and just
| running ffmpeg remotely, it should support any variation of
| outputs
| bhaney wrote:
| This is more or less what I was hoping for when I saw the
| submission title. Was disappointed to see that the submission
| wasn't actually a useful generic tool but instead a paid cloud
| service. Of course the real content is in the comments.
|
| As an aside, are there any uses for GPU-over-network other than
| video encoding? The increased latency seems like it would
| prohibit anything machine learning related or graphics
| intensive.
| trws wrote:
| Some computation tasks can tolerate the latency if they're
| written with enough overlap and can keep enough of the data
| resident, but they usually need more performant networking
| than this. See older efforts like rcuda for remote cuda over
| infiniband as an example. It's not ideal, but sometimes worth
| it. Usually the win is in taking a multi-GPU app and giving
| it 16 or 32 of them rather than a single remote GPU though.
| tommsy64 wrote:
| There is a GPU-over-network software called Juice [1]. I've
| used it on AWS for running CPU-intensive workloads that also
| happen to need some GPU without needing to use a huge GPU
| instance. I was able to use a small GPU instance, which had
| just 4 CPU cores, and stream its GPU to one with 128 CPU
| cores.
|
| I found Juice to work decently for graphical applications too
| (e.g., games, CAD software). Latency was about what you'd
| expect for video encode + decode + network: 5-20ms on a LAN
| if I recall correctly.
|
| [1] - https://github.com/Juice-Labs/Juice-Labs
| Fnoord wrote:
| I mean, anything you use a GPU/TPU for could benefit.
|
| IPMI and such could use it. Like, for example, Proxmox could
| use it. Machine learning tasks (like Frigate) and hashcat
| could also use it. All in theory, of course. Many tasks use
| VNC right now, or SPICE. The ability to export your GPU in
| the Unix way over TCP/IP is powerful. Though Node.js would
| not be the way I'd want it to go.
| lostmsu wrote:
| How do you use it for video encoding/decoding? Won't the
| uncompressed video (input for encoding or output of decoding)
| be too large to transmit over the network practically?
| bhaney wrote:
| Well, the ffmpeg-over-ip tool in the GP does it by just not
| sending uncompressed video. It's more of an ffmpeg server
| where the server is implicitly expected to have access to a
| GPU that the client doesn't have, and only compressed video
| is being sent back and forth in the form of video streams
| that would normally be the input and output of ffmpeg. It's
| not a generic GPU server that tries to push a whole PCI bus
| over the network, which I personally think is a bit of a
| fool's errand and doomed to never be particularly useful to
| existing generic workloads. It would work if you very
| carefully redesign the workload to not take advantage of a
| GPU's typical high bandwidth and low latency, but if you
| have to do that then what's the point of trying to abstract
| over the device layer? Better to work at a higher level of
| abstraction where you can optimize for your particular
| application, rather than a lower level that you can't
| possibly implement well and then have to completely redo
| the higher levels anyway to work with it.
| lostmsu wrote:
| Ah, you mean transcoding scenarios. Like it can't encode
| my screen capture.
| johnisgood wrote:
| I am increasingly growing tired of these "cloud" services,
| paid or not. :/
| adwn wrote:
| Well, feel free to spend your own time on writing such a
| tool and releasing it as Open Source. That would be a
| really cool project! Until then, don't complain that others
| aren't willing to donate a significant amount of their work
| to the public.
| rowanG077 wrote:
| There is a vast gap between walled garden cloud service
| rent seeking and giving away software as open source. In
| the olden days you could buy software licenses to run it
| wherever you wanted.
| mhuffman wrote:
| I agree. Paying by the month for the rest of your life or
| they cut you off is not something I am a fan of. I feel
| sorry for people too young to remember that you could
| actually buy an app, get free bug updates and get a
| discount if they made some big changes on a new version
| that you might (or might not) want. But it was up to you
| when and where you ran it and it was yours forever. I have
| heard the arguments for why people enjoy this monthly
| subscription model, but my counter argument is that people
| did just fine before without them, so what is so different
| now? And I mean, in general, not that you need to use a GPU
| for 1 hour but don't want to buy one. I mean, for example,
| how Adobe products run on your computer but you rent them
| forever.
| toomuchtodo wrote:
| Have you done a Show HN yet? If not, please consider doing so!
|
| https://gist.github.com/tzmartin/88abb7ef63e41e27c2ec9a5ce5d...
|
| https://news.ycombinator.com/showhn.html
|
| https://news.ycombinator.com/item?id=22336638
| cpeterson42 wrote:
| Given the interest here we decided to open up T4 instances for
| free. Would love for y'all to try it and let us know your
| thoughts!
| dheera wrote:
| What is your A100 and H100 pricing?
| cpeterson42 wrote:
| We are super early stage and don't have A100s or H100s live
| yet. Exact pricing TBD but expect it to be low. If you want
| to use them today, reach out directly and we can set them up
| :)
| tptacek wrote:
| This is neat. Were you able to get MIG or vGPUs working with it?
| bmodel wrote:
| We haven't tested with MIG or vGPU, but I think it would work
| since it's essentially physically partitioning the GPU.
|
| One of our main goals for the near future is to allow GPU
| sharing. This would be better than MIG or vGPU since we'd allow
| users to use the entire GPU memory instead of restricting them
| to a fraction.
| tptacek wrote:
| We had a hell of a time dealing with the licensing issues and
| ultimately just gave up and give people whole GPUs.
|
| What are you doing to reset the GPU to clean state after a
| run? It's surprisingly complicated to do this securely (we're
| writing up a back-to-back sequence of audits we did with
| Atredis and Tetrel; should be publishing in a month or two).
| bmodel wrote:
| We kill the process to reset the GPU. Since we only store
| GPU state, that's the only cleanup we need to do.
| tptacek wrote:
| Hm. Ok. Well, this is all very cool! Congrats on
| shipping.
| azinman2 wrote:
| Won't the VRAM still contain old bits?
| kawsper wrote:
| Cool idea, nice product page!
|
| Does anyone know if this is possible with USB?
|
| I have a Davinci Resolve license USB dongle I'd like to avoid
| plugging into my laptop.
| kevmo314 wrote:
| You can do that with USB/IP: https://usbip.sourceforge.net/
| orsorna wrote:
| So what exactly is the pricing model? Do I need a quote? Because
| otherwise I don't see how to determine it without creating an
| account which is needlessly gatekeeping.
| bmodel wrote:
| We're still in our beta so it's entirely free for now (we can't
| promise a bug-free experience)! You have to make an account but
| it won't require payment details.
|
| Down the line we want to move to a pay-as-you-go model.
| Cieric wrote:
| This is interesting, but I'm more interested in self-hosting. I
| already have a lot of GPUs (some running, some not). Does this
| have a self-hosting option so I can use the GPUs I already have?
| cpeterson42 wrote:
| We don't support self hosting yet but the same technology
| should work well here. Many of the same benefits apply in a
| self-hosted setting, namely efficient workload scheduling, GPU-
| sharing, and ease-of-use. Definitely open to this possibility
| in the future!
| covi wrote:
| If you want to use your own GPUs or cloud accounts but with a
| great dev experience, see SkyPilot.
| ellis0n wrote:
| You can rent out your GPUs in the cloud with services like
| Akash Network and rent GPUs at thundercompute.com.. manager's
| path, almost like self-hosting :)
| cpeterson42 wrote:
| We created a discord for the latest updates, bug reports, feature
| suggestions, and memes. We will try to respond to any issues and
| suggestions as quickly as we can! Feel free to join here:
| https://discord.gg/nwuETS9jJK
| throwaway888abc wrote:
| Does it work for gaming on Windows? Or even Linux?
| cpeterson42 wrote:
| In theory yes. In practice, however, latency between the CPU
| and remote GPU makes this impractical
| boxerbk wrote:
| You could use a remote streaming protocol, like Parsec, for
| that. You'd need your own cloud account and to connect directly
| to a GPU-enabled cloud machine, but otherwise it would work to
| let you game.
| rubatuga wrote:
| What ML packages do you support? In the comments below it says
| you do not support Vulkan or OpenGL. Does this support AMD GPUs
| as well?
| bmodel wrote:
| We have tested this with PyTorch and Hugging Face and it is
| mostly stable (we know there are issues with PyCUDA and JAX).
| In theory this should work with any library; however, we're
| still actively developing this, so bugs will show up.
| the_reader wrote:
| Would be possible to mix it with Blender?
| bmodel wrote:
| At the moment our tech is Linux-only, so it would not work with
| Blender.
|
| Down the line, we could see this being used for batched render
| jobs (i.e. to replace a render farm).
| comex wrote:
| Blender can run on Linux...
| bmodel wrote:
| Oh nice, I didn't know that! In that case it might work;
| you could try running `tnr run ./blender` (replace the
| ./blender with how you'd launch blender from the CLI) to
| see what happens. We haven't tested it so I can't make
| promises about performance or stability :)
| chmod775 wrote:
| _Disclaimer: I only have a passing familiarity with
| Blender, so I might be wrong on some counts._
|
| I think you'd want to run the blender GUI locally and
| only call out to a headless rendering server ("render
| farm") that uses your service under the hood to get the
| actual render.
|
| This separation is already something blender supports,
| and you could for instance use Blender on Windows despite
| your render farm using Linux servers.
|
| Cloud rendering is adjacent to what you're offering, and
| it should be trivial for you to expand into that space by
| just figuring out the setup and preparing a guide for
| users wishing to do that with your service.
| teaearlgraycold wrote:
| This could be perfect for us. We need very limited bandwidth but
| have high compute needs.
| bmodel wrote:
| Awesome, we'd love to chat! You can reach us at
| founders@thundercompute.com or join the discord
| https://discord.gg/nwuETS9jJK!
| goku-goku wrote:
| Feel free to reach out www.juicelabs.co
| bkitano19 wrote:
| this is nuts
| cpeterson42 wrote:
| We think so too, big things coming :)
| goku-goku wrote:
| www.juicelabs.co
| dishsoap wrote:
| For anyone curious about how this actually works, it looks like a
| library is injected into your process to hook these functions [1]
| in order to forward them to the service.
|
| [1] https://pastebin.com/raw/kCYmXr5A
| almostgotcaught wrote:
| How did you figure out these were hooked? I'm assuming some
| flag that tells ld/ldd to report when a symbol is rebound?
| Also, I thought a symbol has to be a weak symbol to be rebound,
| and assuming nvidia doesn't expose weak symbols (why would
| they?), the implication is that their thing is basically
| LD_PRELOADed?
| yarpen_z wrote:
| Yes. While I don't know what they do internally, API remoting
| has been used for GPUs since at least rCUDA - that's over 10
| years ago.
|
| The LD_PRELOAD trick allows you to intercept and virtualize
| calls to the CUDA runtime.
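| A minimal interposer shows the trick (illustrative only; it
| assumes the application links libcudart dynamically): the shim
| defines cudaMalloc itself, resolves the real symbol with
| dlsym(RTLD_NEXT, ...), and can log, virtualize, or, as in rCUDA,
| forward the call over the network instead of calling through.
|
|   /* trace.c -- minimal LD_PRELOAD interposer, for illustration.
|    * Build: gcc -shared -fPIC trace.c -o libtrace.so -ldl
|    * Run:   LD_PRELOAD=./libtrace.so ./cuda_app */
|   #define _GNU_SOURCE
|   #include <dlfcn.h>
|   #include <stddef.h>
|   #include <stdio.h>
|
|   typedef int cudaError_t;          /* stand-in for the runtime's enum */
|
|   cudaError_t cudaMalloc(void **devPtr, size_t size) {
|       /* Resolve the next definition of the symbol (the real libcudart). */
|       static cudaError_t (*real)(void **, size_t) = NULL;
|       if (!real)
|           real = (cudaError_t (*)(void **, size_t))
|                      dlsym(RTLD_NEXT, "cudaMalloc");
|
|       cudaError_t rc = real(devPtr, size);       /* call through locally */
|       fprintf(stderr, "[hook] cudaMalloc(%zu bytes) -> status %d\n",
|               size, rc);
|       return rc;
|   }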
| the8472 wrote:
| Ah, I assumed/hoped they had some magic that would manage to
| forward a whole PCIe device.
| Zambyte wrote:
| Reminds me of Plan9 :)
| K0IN wrote:
| can you elaborate a bit on why? (noob here)
| Zambyte wrote:
| In Plan 9 everything is a file (for real this time). Remote
| file systems are accessible through the 9P protocol (still
| used in modern systems! I know it's used in QEMU and WSL).
| Every process has its own view of the filesystem called a
| namespace. The implication of these three features is that
| remote resources can be transparently accessed as local
| resources by applications.
| radarsat1 wrote:
| I'm confused, if this operates at the CPU/GPU boundary doesn't it
| create a massive I/O bottleneck for any dataset that doesn't fit
| into VRAM? I'm probably misunderstanding how it works but if it
| intercepts GPU i/o then it must stream your entire dataset on
| every epoch to a remote machine, which sounds wasteful, probably
| I'm not getting this right.
| bmodel wrote:
| That understanding of the system is correct. To make it
| practical we've implemented a bunch of optimizations to
| minimize I/O cost. You can see how it performs on inference
| with BERT here: https://youtu.be/qsOBFQZtsFM?t=69.
|
| The overheads are larger for training compared to inference,
| and we are implementing more optimizations to approach native
| performance.
| radarsat1 wrote:
| Aah ok thanks, that was my basic misunderstanding, my mind
| just jumped straight to my current training needs but for
| inference it makes a lot of sense. Thanks for the
| clarification.
| semitones wrote:
| > to approach native performance.
|
| The same way one "approaches the sun" when they take the
| stairs?
| the8472 wrote:
| I guess there's non-negligible optimization potential, e.g.
| by doing hash-based caching. If the same data gets uploaded
| twice they can have the blob already sitting somewhere
| closer to the machine.
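| A sketch of that idea (purely illustrative: a toy FNV-1a hash
| and an in-process table stand in for state kept next to the
| remote GPU): hash each buffer before sending, and skip the
| transfer when the server already holds a blob with that hash.
|
|   /* cache_demo.c -- illustrates hash-based upload skipping. */
|   #include <stdbool.h>
|   #include <stddef.h>
|   #include <stdint.h>
|   #include <stdio.h>
|   #include <string.h>
|
|   static uint64_t fnv1a(const void *buf, size_t len) { /* toy hash */
|       const unsigned char *p = buf;
|       uint64_t h = 1469598103934665603ULL;
|       for (size_t i = 0; i < len; i++) { h ^= p[i]; h *= 1099511628211ULL; }
|       return h;
|   }
|
|   #define SLOTS 1024
|   static uint64_t cache[SLOTS];           /* hashes the "server" holds */
|
|   static bool server_has(uint64_t k)   { return cache[k % SLOTS] == k; }
|   static void server_store(uint64_t k) { cache[k % SLOTS] = k; }
|
|   /* Returns how many bytes actually had to cross the network. */
|   static size_t upload(const void *buf, size_t len) {
|       uint64_t key = fnv1a(buf, len);
|       if (server_has(key))
|           return 0;                       /* blob already near the GPU */
|       server_store(key);                  /* pretend we streamed it once */
|       return len;
|   }
|
|   int main(void) {
|       static char weights[1 << 20];       /* stand-in for a model blob */
|       memset(weights, 42, sizeof weights);
|       printf("first upload:  %zu bytes\n", upload(weights, sizeof weights));
|       printf("second upload: %zu bytes\n", upload(weights, sizeof weights));
|       return 0;
|   }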
| ruined wrote:
| yes. my building has stairs and i find them useful because
| usually i don't need to go to the sun
| ranger_danger wrote:
| Is DirectX support possible any time soon? This would be
| _huge_ for Windows VMs on Linux...
| zozbot234 wrote:
| You could use the unofficial Windows drivers for virtio-gpu,
| which are specifically intended for VM use.
| ranger_danger wrote:
| Indeed, it's just that it's not very feature-complete or
| stable yet.
| winecamera wrote:
| I saw that in the tnr CLI, there are hints of an option to self-
| host a GPU. Is this going to be a released feature?
| cpeterson42 wrote:
| We don't support self-hosting yet but are considering adding it
| in the future. We're a small team working as hard as we can :)
|
| Curious where you see this in the CLI, may be an oversight on
| our part. If you can join the Discord and point us to this bug
| we would really appreciate it!
| test20240809 wrote:
| pocl (Portable Computing Language) [1] provides a remote backend
| [2] that allows for serialization and forwarding of OpenCL
| commands over a network.
|
| Another solution is qCUDA [3] which is more specialized towards
| CUDA.
|
| In addition to these solutions, various virtualization solutions
| today provide some sort of serialization mechanism for GPU
| commands, so they can be transferred to another host (or
| process). [4]
|
| One example is the QEMU-based Android Emulator. It is using
| special translator libraries and a "QEMU Pipe" to efficiently
| communicate GPU commands from the virtualized Android OS to the
| host OS [5].
|
| The new Cuttlefish Android emulator [6] uses Gallium3D for
| transport and the virglrenderer library [7].
|
| I'd expect that the current virtio-gpu implementation in QEMU [8]
| might make this job even easier, because it includes the
| Android's gfxstream [9] (formerly called "Vulkan Cereal") that
| should already support communication over network sockets out of
| the box.
|
| [1] https://github.com/pocl/pocl
|
| [2] https://portablecl.org/docs/html/remote.html
|
| [3] https://github.com/coldfunction/qCUDA
|
| [4] https://www.linaro.org/blog/a-closer-look-at-virtio-and-
| gpu-...
|
| [5]
| https://android.googlesource.com/platform/external/qemu/+/em...
|
| [6] https://source.android.com/docs/devices/cuttlefish/gpu
|
| [7]
| https://cs.android.com/android/platform/superproject/main/+/...
|
| [8] https://www.qemu.org/docs/master/system/devices/virtio-
| gpu.h...
|
| [9]
| https://android.googlesource.com/platform/hardware/google/gf...
| fpoling wrote:
| Zscaler uses a similar approach in their remote browser: WebGL
| in the local browser is exposed as a GPU to a Chromium instance
| in the cloud.
| mmsc wrote:
| What's it like to actually use this for any meaningful
| throughput? Can this be used for hash cracking? Every time I
| think about virtual GPUs over a network, I think about botnets.
| Specifically from
| https://www.hpcwire.com/2012/12/06/gpu_monster_shreds_passwo...
| "Gosney first had to convince Mosix co-creator Professor Amnon
| Barak that he was not going to "turn the world into a giant
| botnet.""
| cpeterson42 wrote:
| This is definitely an interesting thought experiment; however,
| in practice our system is closer to AWS than a botnet, as the
| GPUs are not distributed. This technology does lend itself to
| some interesting applications that we are exploring, such as
| creating very flexible clusters within data centers.
| m3kw9 wrote:
| So won't that make the network the prohibitive bottleneck? Your
| memory bandwidth is 1 Gbps max.
| teaearlgraycold wrote:
| Cloud hosts will offer 10Gb/s. Anyway, in my experience with
| training LoRAs and running DINOv2 inference you don't need much
| bandwidth. We are usually sitting at around 10-30MB/s per GPU.
| userbinator wrote:
| It's impressive that this is even possible, but I wonder what
| happens if the network connection goes down or is anything but
| 100% stable? In my experience drivers react badly to even a local
| GPU that isn't behaving.
| tamimio wrote:
| I'm more interested in using tools like hashcat; any benchmarks
| on these? The docs link returns an error.
| delijati wrote:
| Even a directly attached eGPU via Thunderbolt 4 became too slow
| for machine learning (i.e. training) after some time. As I now
| work fully remote I just have a beefy midi tower. Some context
| about eGPUs [1].
|
| But hey, I'm happy to be proven wrong ;)
|
| [1] https://news.ycombinator.com/item?id=38890182#38905888
| xyst wrote:
| Exciting. But I would definitely like to see a self-hosted option.
| ellis0n wrote:
| In 2008, I had a powerful server with a Xeon CPU, but the
| motherboard had no slots for a graphics card. I also had a
| computer with a powerful graphics card but a weak Core 2 Duo. I
| had the idea of passing the graphics card over the network using
| Linux drivers. This concept has now been realized in this
| project. Good job!
| somat wrote:
| What makes me sad is that the original sgi engineers who
| developed glx were very careful to use x11 mechanisms for the gpu
| transport, so it was fairly trivial to send the gl stream over
| the network to render on your graphics card. "run on the
| supercomputer down the hall, render on your workstation". More
| recent driver development has not shown such care and this is
| usually no longer possible.
|
| I am not sure how useful it was in reality (usually if you had a
| nice graphics card you also had a nice CPU), but I had fun playing
| around with it. There was something fascinating about getting
| accelerated graphics on a program running in the machine room. I
| was able to get glquake running like this once.
___________________________________________________________________
(page generated 2024-08-10 23:01 UTC)