[HN Gopher] GPU Caching Compared Among AMD, Intel UHD, Apple M1
___________________________________________________________________
GPU Caching Compared Among AMD, Intel UHD, Apple M1
Author : jb1991
Score : 43 points
Date : 2023-01-16 17:58 UTC (5 hours ago)
(HTM) web link (chipsandcheese.com)
(TXT) w3m dump (chipsandcheese.com)
| hot_gril wrote:
| Nice, succinct 1-2 page article going into interesting technical
| details. As someone who's hardly touched graphics, GPUs have
| always been magic to me, especially integrated ones, so it's nice
| to read digestible explanations about them.
| lowbloodsugar wrote:
| >bandwidth is the same for AMD and Apple and much lower for
| Intel.
|
| Later
|
| >Intel: 700, AMD: 1400, Apple: 2100
|
| I wouldn't call 2x and 3x "similar".
|
| Also I don't see why the author thinks desktop chips with
| integrated graphics are meant to be paired with a discrete GPU.
| Surely the opposite is true. I got a faster CPU by not getting
| one with integrated graphics.
|
| Finally, isn't the fact that Apple has a fundamentally different
| rendering pipeline relevant?
| Dalewyn wrote:
| At least with regards to Intel CPUs, iGPU-less CPUs (the ones
| with -F suffixes) are otherwise identical to the standard ones
| with iGPUs. The main reason to buy them is the slightly lower
| price, which could make a difference if you're on a tight
| budget.
|
| On a tangential note, it's great having an iGPU even if you are
| almost never going to use it. If your discrete GPU borks, you
| have a fallback ready and waiting. If you do use it alongside a
| discrete GPU, you can offload certain lower priority tasks like
| video encoding/decoding to it.
| deagle50 wrote:
| An Intel Steam Deck 2 would be very interesting. I think they
| could make something very compelling in the space of a sustained
| 15W under gaming load.
| alanfranz wrote:
| > as modern dedicated GPUs, can theoretically do zero-copy
| transfers by mapping the appropriate memory on both the CPU and
| GPU.
|
| Is this true for dGPUs? How does this work?
| yaantc wrote:
| This is not specific to dGPU, it could apply to any PCIe
| device. Emphasis on "theoretically" too.
|
| On the device (the dGPU here), memory accesses that target part
| of the internal address space can be routed to the PCIe
| controller. The PCIe controller can then translate such a
| received memory access into a PCIe request (read or write) in
| the separate PCIe address space, with some address translation.
|
| This PCIe request goes to the PCIe host (the CPU in a dGPU
| scenario). Here too, the host PCIe controller can map the PCIe
| request, addressed in the PCIe address space, into the host
| address space, and from there it can reach host memory (usually
| after IOMMU filtering and address translation). For a read, the
| whole path is traversed again in reverse on the return trip to
| the device.
|
| So latency would be rather high, but it is technically possible.
| In most applications such transfers are offloaded to a DMA
| engine in the PCIe controller that copies between the PCIe and
| local address spaces, but a processing core can certainly do a
| direct access without DMA if all the address mappings are
| suitably configured.
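|
| As a concrete (if simplified) host-side illustration of the same
| mapping machinery, on Linux you can mmap a PCIe device's BAR via
| sysfs and let plain CPU loads/stores travel over PCIe. The device
| address and BAR size below are placeholders, and a real driver
| would do this with proper permissions and synchronization:
|
|   /* Sketch: map BAR0 of a (hypothetical) PCIe device into the
|    * CPU's address space and access it directly, no DMA. */
|   #include <fcntl.h>
|   #include <stdint.h>
|   #include <stdio.h>
|   #include <sys/mman.h>
|   #include <unistd.h>
|
|   int main(void) {
|       int fd = open("/sys/bus/pci/devices/0000:01:00.0/resource0",
|                     O_RDWR | O_SYNC);
|       if (fd < 0) { perror("open"); return 1; }
|
|       size_t bar_size = 16 * 1024 * 1024;  /* placeholder size */
|       volatile uint32_t *bar =
|           mmap(NULL, bar_size, PROT_READ | PROT_WRITE, MAP_SHARED,
|                fd, 0);
|       if (bar == MAP_FAILED) { perror("mmap"); return 1; }
|
|       bar[0] = 0x1;         /* CPU store becomes a PCIe write */
|       uint32_t v = bar[1];  /* CPU load becomes a PCIe read:
|                                high latency, as noted above */
|       printf("read back 0x%x\n", v);
|
|       munmap((void *)bar, bar_size);
|       close(fd);
|       return 0;
|   }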
| kevingadd wrote:
| In theory, for a long time you've been able to "persistently
| map" a GPU-side buffer that houses things like indexes, vertex
| data, or even textures, and then write directly* into GPU
| memory from the CPU without a staging buffer. This was referred
| to as 'AZDO' (Approaching Zero Driver Overhead) in the OpenGL
| space and eventually fed into the design of Vulkan and Direct3D
| 12 (see https://www.gdcvault.com/play/1020791/Approaching-Zero-
| Drive... if you're curious about all of this).
|
| I say in theory and used an asterisk because I think it's
| generally the case that the driver could lie and just maintain
| an illusion for you by flushing a staging buffer at the 'right
| time'. But in practice my understanding is that the memory
| writes will go straight over the PCIe bus to the GPU and into
| its memory, perhaps with a bit of write-caching/write-combining
| locally on the CPU. It would be wise to make sure you never
| _read_ from that mapped memory :)
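|
| A rough sketch of what that buffer-storage path looks like in GL
| 4.4 (the helper name and size are just illustrative, and a loader
| such as glad or GLEW plus a current context is assumed):
|
|   /* Sketch: persistently mapped vertex buffer (needs GL 4.4 or
|    * ARB_buffer_storage). */
|   static void *map_persistent_vertex_buffer(GLuint *out_buf,
|                                             GLsizeiptr size)
|   {
|       const GLbitfield flags = GL_MAP_WRITE_BIT
|                              | GL_MAP_PERSISTENT_BIT
|                              | GL_MAP_COHERENT_BIT;
|
|       glGenBuffers(1, out_buf);
|       glBindBuffer(GL_ARRAY_BUFFER, *out_buf);
|
|       /* Immutable storage; PERSISTENT lets the mapping stay
|        * valid while the GPU is using the buffer. */
|       glBufferStorage(GL_ARRAY_BUFFER, size, NULL, flags);
|
|       /* Writes through this pointer can go straight over PCIe
|        * into (write-combined) GPU memory, so treat it as
|        * write-only: never read back through it. */
|       return glMapBufferRange(GL_ARRAY_BUFFER, 0, size, flags);
|   }
|
| Each frame you write vertex data through the returned pointer and
| fence with glFenceSync/glClientWaitSync so you don't overwrite
| data the GPU is still reading.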
___________________________________________________________________
(page generated 2023-01-16 23:00 UTC)