[HN Gopher] PyTorch Library for Running LLM on Intel CPU and GPU
       ___________________________________________________________________
        
       PyTorch Library for Running LLM on Intel CPU and GPU
        
       Author : ebalit
       Score  : 265 points
       Date   : 2024-04-03 10:28 UTC (12 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | tomrod wrote:
       | Looking forward to reviewing!
        
       | Hugsun wrote:
       | I'd be interested in seeing benchmark data. The speed seemed
       | pretty good in those examples.
        
       | antonp wrote:
        | Hm, no major cloud provider offers Intel GPUs.
        
         | anentropic wrote:
         | Lots offer Intel CPUs though...
        
         | VHRanger wrote:
         | No, but for consumers they're a great offering.
         | 
          | 16GB of VRAM and performance around a 4060 Ti or so, but for
          | 65% of the price.
        
           | _joel wrote:
            | And 65% of the software support, or less, I'm inclined to
            | believe? Although having more players in the fold is
            | definitely a good thing.
        
             | VHRanger wrote:
             | Intel is historically really good at the software side,
             | though.
             | 
             | For all their hardware research hiccups in the last 10
             | years, they've been delivering on open source machine
             | learning libraries.
             | 
              | It's apparently been the same story with driver
              | improvements and gaming GPU features over the last year.
        
               | frognumber wrote:
               | I'm optimistic Intel will get the software right in due
               | course. Last I looked, it wasn't all there yet, but it
               | was on the right track.
               | 
                | Right now I have a nice Nvidia card, but if things stay
                | on track, it's very likely my next GPU will be Intel.
                | Open source, not to mention better value.
        
               | HarHarVeryFunny wrote:
                | But even if Intel has stable, optimized drivers and ML
                | support, it'd still need to be supported by PyTorch etc.
                | for most developers to want to use it. People want to
                | write at a high level, not at the CUDA level.
        
               | VHRanger wrote:
                | Intel is supported in PyTorch, though. It's supported
                | from their own branch, which is presumably a big
                | annoyance to install, but it does work.
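                | 
                | For reference, an untested sketch of what the extension
                | route typically looks like, assuming intel-extension-
                | for-pytorch (IPEX) is installed against a matching
                | PyTorch build:
                | 
                |     import torch
                |     # IPEX registers the "xpu" device with PyTorch.
                |     import intel_extension_for_pytorch as ipex
                | 
                |     # Toy model; any nn.Module works the same way.
                |     model = torch.nn.Sequential(
                |         torch.nn.Linear(256, 256),
                |         torch.nn.ReLU(),
                |         torch.nn.Linear(256, 10),
                |     ).to("xpu").eval()
                |     # Apply IPEX kernel/layout optimizations.
                |     model = ipex.optimize(model)
                | 
                |     x = torch.randn(8, 256, device="xpu")
                |     with torch.no_grad():
                |         y = model(x)  # runs on the Intel GPU
                |     print(y.shape)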
        
               | HarHarVeryFunny wrote:
                | I just tried googling for Intel's PyTorch support, and
                | it's clear as mud exactly what runs on the GPU and what
                | doesn't. I assume they'd be bragging about it if this
                | ran everything on their GPU the same as it would on
                | Nvidia, so I'm guessing it just accelerates some
                | operations.
        
         | belthesar wrote:
          | Intel GPUs got quite a bit of penetration in the SE Asian
          | market, and Intel is close to releasing a new generation. In
          | addition, Intel allows GPU virtualization without additional
          | license fees (unlike Nvidia with its GRID licenses), which
          | lets hosting operators carve up these cards. I have a feeling
          | we're going to see a lot more Intel offerings available.
        
       | DrNosferatu wrote:
       | Any performance benchmark against 'llamafile'[0] or others?
       | 
       | [0] - https://github.com/mozilla-Ocho/llamafile
        
         | VHRanger wrote:
          | You can already use Intel GPUs (both Arc and iGPUs) with
          | llama.cpp on a bunch of backends:
         | 
         | - SYCL [1]
         | 
         | - Vulkan
         | 
         | - OpenCL
         | 
          | I don't own the hardware, but I imagine SYCL is the most
          | performant for Arc, because it's the one Intel is pushing for
          | their datacenter stuff.
         | 
         | [1]:
         | https://www.intel.com/content/www/us/en/developer/articles/t...
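          | 
          | For what it's worth, a SYCL build is consumed like any other
          | llama.cpp backend. A rough, untested sketch via the llama-
          | cpp-python bindings (the GGUF path is a placeholder, and the
          | bindings have to be compiled with the SYCL backend enabled
          | for the offload to land on the Arc GPU):
          | 
          |     from llama_cpp import Llama
          | 
          |     llm = Llama(
          |         # Placeholder GGUF file; use any local quantized model.
          |         model_path="models/llama-2-7b.Q4_K_M.gguf",
          |         n_gpu_layers=-1,  # offload all layers to the GPU backend
          |         n_ctx=2048,
          |     )
          |     out = llm("Q: Why does VRAM matter for LLMs? A:",
          |               max_tokens=64)
          |     print(out["choices"][0]["text"])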
        
       | captaindiego wrote:
       | Are there any Intel GPUs with a lot of vRAM that someone could
       | recommend that would work with this?
        
         | goosedragons wrote:
         | For consumer stuff there's the Intel Arc A770 with 16GB VRAM.
         | More than that and you start moving into enterprise stuff.
        
           | ZeroCool2u wrote:
            | Which seems like their biggest mistake. If they just
            | released a card with more than 24GB of VRAM, people would
            | be clamoring for their cards, even if they were marginally
            | slower. It's the same reason 3090s are still in high demand
            | compared to 4090s.
        
         | Aromasin wrote:
         | There's the Max GPU (Ponte Vecchio), their datacentre offering,
         | with 128GB of HBM2e memory, 408 MB of L2 cache, and 64 MB of L1
         | cache. Then there's Gaudi, which has similar numbers but with
         | cores specific for AI workloads (as far as I know from the
         | marketing).
         | 
         | You can pick them up in prebuilds from Dell and Supermicro:
         | https://www.supermicro.com/en/accelerators/intel
         | 
         | Read more about them here: https://www.servethehome.com/intel-
         | shows-gpu-max-1550-perfor...
        
       | vegabook wrote:
        | The company that did 4-cores-forever has the opportunity to
        | redeem itself in its next consumer GPU release by disrupting
        | the "8-16GB VRAM forever" that AMD and Nvidia have been
        | imposing on us for a decade. It would be poetic to see 32-48GB
        | at a non-eye-watering price point.
       | 
       | Intel definitely seems to be doing all the right things on
       | software support.
        
         | sitkack wrote:
          | What is obvious to us is an industry standard to Product
          | Managers. When was the last time you saw an industry player
          | upset the status quo? Intel has not changed _that_ much.
        
         | zoobab wrote:
         | "It would be poetic to see 32-48GB at a non-eye-watering price
         | point."
         | 
          | I heard some ASRock motherboard BIOSes can set the VRAM up to
          | 64GB on a Ryzen 5.
          | 
          | Doing some investigation with different AMD hardware atm.
        
           | stefanka wrote:
            | That would be interesting information: which motherboard
            | works with which APU at 32 or more GB of VRAM. Can you post
            | your findings, please?
        
           | LoganDark wrote:
           | When has an APU ever been as fast as a GPU? How much cache
           | does it have, a few hundred megabytes? That can't possibly be
           | enough for matmul, no matter how much slow DDR4/5 is
           | technically addressable.
        
         | whalesalad wrote:
          | Still wondering why we can't have GPUs with SODIMM slots so
          | you can crank the VRAM.
        
           | amir_karbasi wrote:
            | I believe the issue is that graphics cards require really
            | fast memory. This requires close memory placement (that's
            | why the memory is so close to the core on the board).
            | Expandable memory will not be able to provide the required
            | bandwidth here.
        
             | frognumber wrote:
             | The universe used to have hierarchies. Fast memory close,
             | slow memory far. Registers. L1. L2. L3. RAM. Swap.
             | 
             | The same thing would make a lot of sense here. Super-fast
             | memory close, with overflow into classic DDR slots.
             | 
              | As a footnote, going parallel also helps. Eight sticks of
              | RAM at 1/8 the bandwidth each give the same total
              | bandwidth as one stick at the full bandwidth, if you
              | don't multiplex them onto the same traces.
        
               | riskable wrote:
               | It's not so simple... The way GPU architecture works is
               | that it _needs_ as-fast-as-possible access to its VRAM.
                | The concept of "overflow memory" for a GPU is your PC's
                | RAM. Adding a secondary memory controller and
                | equivalent DRAM to the card itself would only provide a
                | trivial improvement over "just using the PC RAM".
               | 
               | Point of fact: GPUs don't even use all the PCI Express
               | lanes they have available to them! Most GPUs (even top of
               | the line ones like Nvidia's 4090) only use about 8 lanes
               | of bandwidth. This is why some newer GPUs are being
               | offered with M.2 slots so you can add an SSD
               | (https://press.asus.com/news/asus-dual-geforce-
               | rtx-4060-ti-ss... ).
        
               | wongarsu wrote:
               | GPUs have memory hierarchies too. A 4090 has about 16MB
                | of L1 cache and 72MB of L2 cache, followed by the 24GB
                | of GDDR6X RAM, followed by host RAM that can be
                | accessed via PCIe.
               | 
               | The issue is that GPUs are massively parallel. A 4090 has
               | 128 streaming multiprocessors, each executing 128
               | "threads" or "lanes" in parallel. If each "thread" works
               | on a different part of memory that leaves you with 1kB of
               | L1 cache per thread, and 4.5kB of L2 cache each. For each
                | clock cycle you might be issuing thousands of requests
                | to your memory controller for cache misses and
                | prefetching.
               | That's why you want insanely fast RAM.
               | 
               | You can write CUDA code that directly accesses your host
               | memory as a layer beyond that, but usually you want to
               | transfer that data in bigger chunks. You probably could
               | make a card that adds DDR4 slots as an additional level
               | of hierarchy. It's the kind of weird stuff Intel might do
               | (the Phi had some interesting memory layout ideas).
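                | 
                | As a rough sketch of that "host RAM as the next tier"
                | idea (PyTorch/CUDA shown here, but the pattern isn't
                | vendor-specific): keep the bulk data pinned in host
                | memory and stream slices to the card asynchronously.
                | 
                |     import torch
                | 
                |     # Bulk data in pinned host RAM (the far, slow
                |     # tier); ~0.5 GB in this toy example.
                |     host_data = torch.randn(8, 4096, 4096).pin_memory()
                |     # Working buffer in VRAM (the near, fast tier).
                |     gpu_buf = torch.empty(4096, 4096, device="cuda")
                | 
                |     stream = torch.cuda.Stream()
                |     with torch.cuda.stream(stream):
                |         for i in range(host_data.shape[0]):
                |             # Async host-to-device copy over PCIe; real
                |             # code would double-buffer across two
                |             # streams to overlap copy and compute.
                |             gpu_buf.copy_(host_data[i], non_blocking=True)
                |             _ = gpu_buf.sum()  # stand-in for actual work
                |     torch.cuda.synchronize()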
        
           | chessgecko wrote:
           | You could, but the memory bandwidth wouldn't be amazing
           | unless you had a lot of sticks and it would end up getting
           | pretty expensive
        
           | justsomehnguy wrote:
            | Look at the motherboards with >2 memory channels. That
            | would require a lot of physical space, which is quite
            | restricted on a 50-year-old expansion card standard.
        
           | riskable wrote:
           | You can do this sort of thing but you can't use SODIMM slots
           | because that places the actual memory chips too far away from
           | the GPU. Instead what you need is something like BGA sockets
           | (https://www.nxp.com/design/design-center/development-
           | boards/... ) which are _stupidly expensive_ (e.g. $600 per
           | socket).
        
             | monocasa wrote:
              | You could probably use something like CAMM, which solved
              | a similar problem for LPDDR.
             | 
             | https://en.wikipedia.org/wiki/CAMM_(memory_module)
        
         | chessgecko wrote:
          | Going above 24GB is probably not going to be cheap until
          | GDDR7 is out, and even that will only push it to 36GB. The
          | fancier stacked GDDR6 stuff is probably pretty expensive, and
          | you can't just add more dies because of signal integrity
          | issues.
        
           | frognumber wrote:
           | Assuming you want to maintain full bandwidth.
           | 
           | Which I don't care too much about.
           | 
            | However, even 16->24GB is a big step, since a lot of the
            | models are developed for 3090/4090-class hardware. 36GB
            | would place it close to the class of the fancy 40GB data
            | center cards.
            | 
            | If Intel decided to push VRAM, it would definitely have a
            | market. Critically, a lot of folks would also be
            | incentivized to make their software compatible, since it
            | would be the cheapest way to run models.
        
             | 0cf8612b2e1e wrote:
              | At this point, I cannot run an entire class of models
              | without OOMing. I will take a performance hit if it lets
              | me run them at all.
             | 
             | I want a consumer card that can do some number of tokens
             | per second. I do not need a monster that can serve as the
             | basis for a startup.
        
               | hnfong wrote:
               | A maxed out Mac Studio probably fits your requirements as
               | stated.
        
               | 0cf8612b2e1e wrote:
               | If I were willing to drop $4k on that setup, I might as
               | well get the real NVidia offering.
               | 
               | The hobbyist market needs something priced well under $1k
               | to make it accessible.
        
             | rnewme wrote:
                | How come you don't care about full bandwidth?
        
               | Dalewyn wrote:
               | The thing about RAM speed (aka bandwidth) is that it
               | becomes irrelevant if you run out and have to page out to
               | slower tiers of storage.
        
         | riskable wrote:
         | No kidding... Intel is playing catch-up with Nvidia in the AI
         | space and a big reason for that is their offerings aren't
         | competitive. You can get an Intel Arc A770 with 16GB of VRAM
          | (which was released in October 2022) for about $300, or an
          | Nvidia 4060 Ti with 16GB of VRAM for ~$500, which is _twice_
          | as fast for AI workloads in reality (see:
         | https://cdn.mos.cms.futurecdn.net/FtXkrY6AD8YypMiHrZuy4K-120...
         | )
         | 
         | This is a huge problem because _in theory_ the Arc A770 is
          | faster! Its theoretical performance (TFLOPS) is _more_ than
          | twice that of an Nvidia 4060 (see:
         | https://cdn.mos.cms.futurecdn.net/Q7WgNxqfgyjCJ5kk8apUQE-120...
         | ). So why does it perform so poorly? Because everything AI-
         | related has been developed and optimized to run on Nvidia's
         | CUDA.
         | 
         | Mostly, this is a mindshare issue. If Intel offered a
         | workstation GPU (i.e. _not_ a ridiculously expensive
         | "enterprise" monster) that developers could use that had
         | something like 32GB or 64GB of VRAM it would sell! They'd sell
         | zillions of them! In fact, I'd wager that they'd be _so_
         | popular it 'd be hard for consumers to even get their hands on
         | one because it would sell out everywhere.
         | 
         | It doesn't even need to be the fastest card. It just needs to
         | offer more VRAM than the competition. Right now, if you want to
          | do things like training or video generation, the lack of VRAM
          | is a bigger bottleneck than the speed of the GPU. How does
          | Intel not see this!? They have the power to step up and take
          | over a huge section of the market, but instead they're just
          | copying
         | (poorly) what everyone else is doing.
        
           | Workaccount2 wrote:
            | Based on leaks, it looks like Intel somehow missed an easy
            | opportunity here. There is insane demand for high-VRAM
            | cards now, and it seems the next Intel cards will be 12GB.
           | 
           | Intel, screw everything else, just pack as much VRAM in those
           | as you can. Build it and they will come.
        
             | dheera wrote:
             | Exactly, I'd love to have 1TB of RAM that can be accessed
             | at 6000 MT/s.
        
               | talldayo wrote:
               | Optane is crying and punching the walls right now.
        
               | yjftsjthsd-h wrote:
               | Does optane have an advantage over RAM here?
        
               | watersb wrote:
                | Optane products were sold as DIMMs with single-DIMM
                | capacity as high as 512 GB, with an Intel memory
                | controller that could make it look like DRAM.
               | 
               | 512 GB.
               | 
               | It was slower than conventional DRAM.
               | 
               | But for AI models, Optane may have an advantage: it's
               | bit-addressable.
               | 
               | I'm not aware of any memory controllers that exposed that
               | single-bit granularity; Optane was fighting to create a
               | niche for itself, between DRAM and NAND Flash: pretending
               | to be both, when it was neither.
               | 
                | Bit-level operations, with computational units in the
                | same device as massive storage: that's an architecture
                | that has yet to be developed.
               | 
               | AI GPUs try to be such an architecture by plopping 16GB
               | of HBM next to a sea of little dot-product engines.
        
           | glitchc wrote:
           | I think the answer to that is fairly straightforward. Intel
           | isn't in the business of producing RAM. They would have to
           | buy and integrate a third-party product which is likely not
           | something their business side has ever contemplated as a
           | viable strategy.
        
             | monocasa wrote:
             | Their GPUs as sold already include RAM.
        
               | glitchc wrote:
               | Yes, but they don't fab their own RAM. It's a cost center
               | for them.
        
               | monocasa wrote:
               | If they can sell the board with more RAM for more than
               | their extra RAM costs, or can sell more GPUs total but
               | the RAM itself is priced essentially at cost, then it's
               | not a cost center.
        
               | RussianCow wrote:
               | That's not what a cost center is. There is an opportunity
               | for them to make more money by putting more RAM into
               | their GPUs and exposing themselves to a different market.
               | Whether they physically manufacture that RAM doesn't
               | matter in the slightest.
        
           | ponector wrote:
            | I don't agree. Who will buy it? A few enthusiasts who want
            | to run LLMs locally but cannot afford an M3 or a 4090?
           | 
           | It will be a niche product with poor sales.
        
             | talldayo wrote:
             | > Who will buy it?
             | 
             | Frustrated AMD customers willing to put their money where
             | their mouth is?
        
             | bt1a wrote:
              | I think there are more than a few enthusiasts who would
              | be very interested in buying one or more of these cards
              | (if they had 32+ GB of memory), but I don't have any data
              | to back that opinion up. It is not only those who can't
              | afford a 4090, though.
             | 
             | While the 4090 can run models that use less than 24GB of
             | memory at blistering speeds, models are going to continue
             | to scale up and 24GB is fairly limiting. Because LLM
             | inference can take advantage of splitting the layers among
             | multiple GPUs, high memory GPUs that aren't super expensive
             | are desirable.
             | 
             | To share a personal perspective, I have a desktop with a
             | 3090 and an M1 Max Studio with 64GB of memory. I use the M1
              | for local LLMs because I can use up to ~57GB of memory,
              | even though the output (in terms of tok/s) is much slower
              | than with models I can fit on the 3090.
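              | 
              | The layer splitting mentioned above is pretty accessible
              | these days. A hedged sketch with Hugging Face
              | transformers/accelerate (the checkpoint name is just an
              | example; device_map="auto" shards layers across whatever
              | GPUs are visible, e.g. two 16GB cards, and spills the
              | rest to CPU RAM):
              | 
              |     import torch
              |     from transformers import (AutoModelForCausalLM,
              |                               AutoTokenizer)
              | 
              |     name = "meta-llama/Llama-2-13b-hf"  # example checkpoint
              |     tok = AutoTokenizer.from_pretrained(name)
              |     model = AutoModelForCausalLM.from_pretrained(
              |         name,
              |         device_map="auto",          # shard across GPUs
              |         torch_dtype=torch.float16,  # ~2 bytes/weight
              |     )
              |     prompt = tok("Local LLMs need", return_tensors="pt")
              |     inputs = prompt.to(model.device)
              |     out = model.generate(**inputs, max_new_tokens=32)
              |     print(tok.decode(out[0]))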
        
               | Dalewyn wrote:
               | >models are going to continue to scale up and 24GB is
               | fairly limiting
               | 
               | >24GB is fairly limiting
               | 
               | Can I take a moment to suggest that maybe we're very
               | spoiled?
               | 
                | 24GB of VRAM is more than most people's system RAM, and
               | that is "fairly limiting"?
               | 
               | To think Bill once said 640KB would be enough.
        
               | hnfong wrote:
               | It doesn't matter whether anyone is "spoiled" or not.
               | 
               | The fact is large language models require a lot of VRAM,
               | and the more interesting ones need more than 24GB to run.
               | 
               | The people who are able to afford systems with more than
               | 24GB VRAM will go buy hardware that gives them that, and
               | when GPU vendors release products with insufficient VRAM
               | they limit their market.
               | 
               | I mean inequality is definitely increasing at a worrying
               | rate these days, but let's keep the discussion on
               | topic...
        
               | Dalewyn wrote:
               | I'm just fascinated that the response/demand to running
               | out of RAM is _" Just sell us more RAM, god damn!"_
               | instead of engineering a solution to make due with what
               | is practically (and realistically) available.
        
               | xoranth wrote:
               | People have engineered solutions to make what is
               | available practical (see all the various quantization
               | schemes that have come out).
               | 
               | It is just that there's a limit to how much you can
               | compress the models.
        
               | dekhn wrote:
               | I would say that increasing RAM to avoid engineering a
               | solution has long been a successful strategy.
               | 
                | I learned my RAM lesson when I bought my first real
                | Linux PC. It had 4MB of RAM, which was enough to run X,
                | bash,
               | xterm, and emacs. But once I ran all that and also wanted
               | to compile with g++, it would start swapping, which in
               | the days of slow hard drives, was death to productivity.
               | 
               | I spent $200 to double to 8MB, and then another $200 to
               | double to 16MB, and then finally, $200 to max out the RAM
               | on my machine-- 32MB! And once I did that everything
               | flew.
               | 
               | Rather than attempting to solve the problem by making
               | emacs (eight megs and constantly swapping) use less RAM,
                | or finding a way to hack without X, I deployed money to
                | max out my machine (which was practical, but not
               | realistically available to me unless I gave up other
               | things in life for the short term). Not only was I more
               | productive, I used that time to work on _other_
                | engineering problems, which helped build my career,
                | while also learning an important lesson about
                | swapping/paging.
               | 
               | People demand RAM and what was not practically available
               | is often available 2 years later as standard. Seems like
               | a great approach to me, especially if you don't have
               | enough smart engineers to work around problems like that
               | (see "How would you sort 4M integers in 2M of RAM?")
        
               | watersb wrote:
               | _> I spent $200 to double to 8MB, and then another $200
               | to double to 16MB, and then finally, $200 to max out the
               | RAM on my machine-- 32MB!_
               | 
                | Thank you. Now I feel a lot better for dropping $700 on
                | the 32MB of RAM when I built my first rig.
        
               | whiplash451 wrote:
               | By the same logic, we'd still be writing assembly code on
               | 640KB RAM machines in 2024.
        
               | michaelt wrote:
               | There has in fact been a great deal of careful
                | engineering to allow 70 billion parameter models to run
                | on _just_ 48GB of VRAM.
               | 
               | The people _training_ 70B parameter models from scratch
               | need ~600GB of VRAM to do it!
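                | 
                | Back-of-envelope arithmetic behind the 48GB figure (a
                | rough sketch; it counts weights only and ignores KV
                | cache and activations):
                | 
                |     def weights_gib(params_b, bits_per_weight):
                |         """GiB needed for the weights alone."""
                |         return params_b * 1e9 * bits_per_weight / 8 / 2**30
                | 
                |     for bits in (16, 8, 4):
                |         print(f"70B @ {bits}-bit: "
                |               f"~{weights_gib(70, bits):.0f} GiB")
                |     # -> ~130, ~65, ~33 GiB; add KV cache on top, which
                |     #    is why ~4-bit quantization is what squeezes a
                |     #    70B model into 48GB.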
        
               | nl wrote:
               | While saying "we want more efficiency" is great there is
               | a trade off between size and accuracy here.
               | 
               | It is possible that compressing and using all of human
               | knowledge takes a lot of memory and in some cases the
               | accuracy is more important than reducing memory usage.
               | 
                | For example, [1] shows how Gemma 2B using AVX512
                | instructions could solve problems it couldn't solve
                | using AVX2, because of rounding issues with the lower-
                | memory instructions. It's likely that most quantization
                | (and other memory-reduction) schemes have similar
                | problems.
               | 
                | As we develop more multi-modal models that can do
                | things like understand 3D video in better than real
                | time, it's likely memory requirements will _increase_,
                | not decrease.
               | 
               | [1] https://github.com/google/gemma.cpp/issues/23
        
             | loudmax wrote:
             | I tend to agree that it would be niche. The machine
             | learning enthusiast market is far smaller than the gamer
             | market.
             | 
             | But selling to machine learning enthusiasts is not a bad
             | place to be. A lot of these enthusiasts are going to go on
             | to work at places that are deploying enterprise AI at
             | scale. Right now, almost all of their experience is CUDA
             | and they're likely to recommend hardware they're familiar
             | with. By making consumer Intel GPUs attractive to ML
             | enthusiasts, Intel would make their enterprise GPUs much
             | more interesting for enterprise.
        
               | mysteria wrote:
               | The problem is that this now becomes a long term
               | investment, which doesn't work out when we have CEOs
               | chasing quarterly profits and all that. Meanwhile Nvidia
               | stuck with CUDA all those years back (while ensuring that
               | it worked well on both the consumer and enterprise line)
               | and now they reap the rewards.
        
               | Wytwwww wrote:
                | Current Intel and its leadership seem to be much more
                | focused on long-term goals/growth than before, or so
                | they claim.
        
               | antupis wrote:
                | It would be the same playbook that NVIDIA ran with
                | CUDA: where was the market in 2010, when it was
                | research labs and hobbyists doing vector calculations?
        
               | resource_waste wrote:
               | I need offline LLMs for work.
               | 
                | It doesn't need to be consumer grade, and it doesn't
                | need to be ultra high-end either.
               | 
                | It needs to be cheap enough for my department to
                | expense it via petty cash.
        
             | Aerroon wrote:
              | It's about mindshare. Random people using your product to
              | do AI means the tooling is going to improve, because
              | people will try to use it. But as it stands right now, if
              | you think there's any chance you'll want to use AI in the
              | next 5 years, then why would you buy anything other than
              | Nvidia?
             | 
             | It doesn't even matter if that's your primary goal or not.
        
             | alecco wrote:
              | AFAIK, unless you are a huge American corp with orders
              | above $100m, Nvidia will only sell you old and expensive
             | server cards like the crappy A40 PCIe 4.0 48GB GDDR6 at
             | $5,000. Good luck getting SXM H100s or GH200.
             | 
             | If Intel sells a stackable kit with a lot of RAM and a
              | reasonable interconnect, a lot of corporate customers
              | will buy. It doesn't even have to be that good, just
              | halfway
             | between PCIe 5.0 and NVLink.
             | 
             | But it seems they are still too stuck in their old ways. I
             | wouldn't count on them waking up. Nor AMD. It's sad.
        
               | ponector wrote:
                | Parent comment requested a non-enterprise, consumer-
                | grade GPU with tons of memory. I'm sure there is no
                | market for this.
               | 
               | However, server solutions could have some traction.
        
             | resource_waste wrote:
             | >M3
             | 
             | >4090
             | 
              | This is noob hardware. The A6000 is my choice.
              | 
              | Which really only further emphasizes your point.
             | 
             | >CPU based is a waste of everyone's time/effort
             | 
             | >GPU based is 100% limited by VRAM, and is what you are
             | realistically going to use.
        
             | jmward01 wrote:
              | Microsoft got where they are because they developed tools
              | that everyone used. They got the developers, and the
              | consumers followed. Intel (or AMD) could do the same
              | thing: get a big card with lots of RAM so that developers
              | get used to your ecosystem, and then sell the enterprise
              | GPUs to make the $$$. It is a clear path with a lot of
              | history behind it, and it blows my mind that Intel and
              | AMD aren't doing it.
        
         | belter wrote:
         | AMD making drivers of high quality? I would pay to see that :-)
        
         | haunter wrote:
          | First crypto, then AI. I wish GPUs were left alone for
          | gaming.
        
           | azinman2 wrote:
           | Didn't nvidia try to block this in software by slowing down
           | mining?
           | 
           | Seems like we just need consumer matrix math cards with
           | literally no video out, and then a different set of
           | requirements for those with a video out.
        
             | wongarsu wrote:
             | But Nvidia doesn't want to make consumer compute cards
             | because those might steal market share from the datacenter
             | compute cards they are selling at 5x markup.
        
           | talldayo wrote:
           | Are there actually gamers out there that are still struggling
           | to source GPUs? Even at the height of the mining craze, it
            | was still possible to backorder cards at MSRP if you were
            | patient.
           | 
           | The serious crypto and AI nuts are all using custom hardware.
           | Crypto moved onto ASICs for anything power-efficient, and
           | Nvidia's DGX systems aren't being cannibalized from the
           | gaming market.
        
           | baq wrote:
           | They were.
           | 
           | But then those pesky researchers and hackers figured out how
           | to use the matmul hardware for non-gaming.
        
         | UncleOxidant wrote:
         | > Intel definitely seems to be doing all the right things on
         | software support.
         | 
         | Can you elaborate on this? Intel's reputation for software
         | support hasn't been stellar, what's changed?
        
         | OkayPhysicist wrote:
         | The issue from the manufacturer's perspective is that they've
         | got two different customer bases with wildly different
         | willingness to pay, but not substantially different needs from
         | their product. If Nvidia and AMD didn't split the two markets
         | somehow, then there would be no cards available to the PC
         | market, since the AI companies with much deeper pockets would
         | buy up the lot. This is undesirable from the manufacturer's
         | perspective for a couple reasons, but I suspect a big one is
         | worries that the next AI winter would cause their entire
         | business to crater out, whereas the PC market is pretty
         | reliable for the foreseeable future.
         | 
         | Right now, the best discriminator they have is that PC users
         | are willing to put up with much smaller amounts of VRAM.
        
       | donnygreenberg wrote:
       | Would be nice if this came with scripts which could launch the
       | examples on compatible GPUs on cloud providers (rather than
       | trying to guess?). Would anyone else be interested in that?
       | Considering putting it together.
        
       ___________________________________________________________________
       (page generated 2024-04-03 23:01 UTC)