[HN Gopher] Snapdragon 8 Gen 1's iGPU: Adreno Gets Big
       ___________________________________________________________________
        
       Snapdragon 8 Gen 1's iGPU: Adreno Gets Big
        
       Author : rbanffy
       Score  : 81 points
       Date   : 2024-03-07 12:00 UTC (11 hours ago)
        
 (HTM) web link (chipsandcheese.com)
 (TXT) w3m dump (chipsandcheese.com)
        
       | xlazom00 wrote:
        | Does anyone know if there is any phone processor with SVE/SVE2?
        | SVE/SVE2 should be present in all new ARMv9.0-A+ CPU cores.
        | 
        | I have only had a chance to test Qualcomm Snapdragon cores
        | (Samsung Galaxy S23).
        
         | fotad wrote:
          | The Snapdragon 888 was probably the first one to have SVE.
          | The Snapdragon 7/8 gen chips all run the Qualcomm(r) Type-1
          | Hypervisor, so what's your test - are you able to run any
          | Linux on it?
        
           | xlazom00 wrote:
            | I am using the default Android on a Samsung Galaxy S23 with
            | UserLAnd:
            | https://play.google.com/store/apps/details?id=tech.ula
            | Any advice on how to run it?
        
         | monocasa wrote:
         | Looking at the docs, it doesn't look like ARMv9-A strictly
         | requires FEAT_SVE to be implemented.
         | 
         | What that means in practice is anyone's guess.
        
         | my123 wrote:
         | Qualcomm disables SVE (masked at the hypervisor level) on all
          | their silicon. If you want SVE on a phone today, your options
         | are Tensor G3, Exynos 2200/2400 or MediaTek phones with ARMv9
         | CPUs.
         | 
          | Or if you have code execution at the hypervisor exception
          | level (including on an unfused phone), you can patch out that
          | limitation.
        
           | xlazom00 wrote:
            | So I have this device:
            | https://en.wikipedia.org/wiki/Windows_Dev_Kit_2023
            | 
            | And we can run Ubuntu on it:
            | https://github.com/armbian/build
            | 
            | There isn't any hypervisor running on it, and there is still
            | no SVE. Any advice?
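            | 
            | A minimal userspace check (my own sketch, assuming a normal
            | Linux/Android userland on the device): the kernel advertises
            | SVE/SVE2 through the ELF hwcaps, so a small C program should
            | show whether anything above the kernel is hiding the
            | feature. The bit values below are the ones defined in the
            | arm64 <asm/hwcap.h>.
            | 
            |     /* sve_check.c - query the arm64 hwcaps for SVE/SVE2. */
            |     #include <stdio.h>
            |     #include <sys/auxv.h>
            | 
            |     #ifndef HWCAP_SVE
            |     #define HWCAP_SVE   (1UL << 22)  /* <asm/hwcap.h> */
            |     #endif
            |     #ifndef HWCAP2_SVE2
            |     #define HWCAP2_SVE2 (1UL << 1)   /* <asm/hwcap.h> */
            |     #endif
            | 
            |     int main(void) {
            |         unsigned long hc  = getauxval(AT_HWCAP);
            |         unsigned long hc2 = getauxval(AT_HWCAP2);
            |         printf("SVE:  %d\n", !!(hc  & HWCAP_SVE));
            |         printf("SVE2: %d\n", !!(hc2 & HWCAP2_SVE2));
            |         return 0;
            |     }
            | 
            | Equivalently, "sve" should appear in the Features line of
            | /proc/cpuinfo when the kernel actually exposes it.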
        
       | rayiner wrote:
       | Snapdragon 800 -> Snapdragon 8 Gen N is terrible marketing. It
       | makes everything sound like a point release.
        
         | causi wrote:
          | My headcanon is that the Snapdragon 875 would've been a great
          | chip, unlike the infamously overheating 888.
        
           | nfriedly wrote:
           | I'm on a Snapdragon 870 (moto g100), and it's great.
           | Performance and battery life are both good enough that I
           | don't think about it very often, and I've never had any
           | overheating issues.
           | 
            | For a while, I think the 870 was the highest-end Snapdragon
            | that didn't have any overheating issues.
        
             | joecool1029 wrote:
              | The 5nm Samsung process was just bad; it needs active
              | cooling.
             | 
             | I've had the sm8250 (865 in my case) and it's a great chip
             | that's faster in real-world conditions than the 888 hand
             | warmer was. Now I have an 8 gen 2 device and it has no
             | overheating issues. The 8 gen 1+ doesn't appear to have any
             | either. I avoided the 8 gen 1 after having the bad
             | experience with the 888.
        
               | nfriedly wrote:
               | Aah, ok, I edited my comment to just say that it was the
               | best for a while, not still the best. Agree with you
               | about Samsung's 5nm process.
        
         | rbanffy wrote:
          | I believe their biggest branding failure is not putting out
          | Raspberry Pi-like small boards for people to experiment with.
          | Anyone with a slight interest in computers knows an M3 is a
          | beast and that an i3 is meh. Almost nobody outside Qualcomm
          | knows why an 8 Gen N is better than an 800 or what the
          | difference would be.
        
           | RuggedPineapple wrote:
            | They do in fact put those out. That said, the pricing is not
            | at all Pi-like: the Snapdragon 8 Gen 2 board goes for north
            | of a thousand dollars.
           | 
           | https://www.lantronix.com/products/snapdragon-8-gen-2-mobile.
           | ..
        
             | jsheard wrote:
              | That's only Pi-like in the sense of being an SBC. Unlike
              | the Pi, it's meant to be an evaluation/reference platform
              | for integrators who want to build their own board around
              | the SoC, not something you would buy to use for its own
              | sake.
        
               | rbanffy wrote:
                | And this is where they fail - the people who get their
                | boards have almost zero interest in upstreaming whatever
                | hardware enablement they do to make the boards sort of
                | work for their specific use cases.
               | 
               | I really don't think undercutting their evaluation board
               | business with affordable SBCs would hurt their bottom
               | line.
        
               | jsheard wrote:
                | The people who buy those eval boards probably aren't even
                | _allowed_ to upstream whatever they do with it; they'll
                | have signed an NDA with Qualcomm to get access to the
                | documentation.
        
               | GuB-42 wrote:
                | But maybe they could make it more Pi-like, that is,
                | price it so that it interests both integrators and
                | hobbyists, with a bootloader that makes it easy to
                | tinker with the OS (not necessarily the firmware).
        
               | rbanffy wrote:
               | I am sure that if the RPi Foundation can do it, Qualcomm
               | can as well.
        
               | jsheard wrote:
               | Broadcom is the counterpart to Qualcomm in that
               | comparison, and those two have similar attitudes towards
               | the hobbyist/enthusiast market - they don't care in the
                | slightest. It took an entity outside of Broadcom that
                | nonetheless had deep connections to it (Eben Upton was
                | there prior to starting RPi) to broker a compromise
                | where the Pi could happen, and even then Broadcom kept
                | most of the documentation tied up in NDAs and the bare
                | SoCs unavailable for sale to the general public.
               | 
               | The Raspberry Pi is an anomaly that's unlikely to be
               | replicated.
        
           | monocasa wrote:
            | There are economic reasons behind the cheap SoCs you see.
            | Invariably it's when an SoC was made in bulk expecting some
            | market that never materialized.
            | 
            | For instance, the original Pi SoC was very clearly intended
            | for some set-top box OEM that didn't pick it up for some
            | reason.
            | 
            | When that happens, after a while the chip manufacturer is
            | willing to sell them for way below cost just to get anything
            | back from inventory that, from their perspective, looks like
            | a complete loss.
            | 
            | So you get an industrious cottage industry that takes those
            | too-cheap-to-make-sense chips, cost-reduces the reference
            | design, and ships them with the just-barely-building SDK
            | from the chip manufacturer.
           | 
           | At the end of the day Qualcomm doesn't care about this market
           | because they are pretty good about keeping their customers on
           | the hook for their bulk orders. So they're focused on
           | supporting current bulk customers where a $1k dev board is
           | actually a really reasonable price.
        
           | jsight wrote:
           | That's a good point. Perhaps even something like the NUC
           | form-factor would work well.
        
           | MikusR wrote:
            | The SoC alone is 160+ dollars.
        
           | Dalewyn wrote:
           | Intel Core i3 is to Apple M3, i5 is to M3 Pro, i7 is to M3
           | Max, and i9 is to M3 Ultra.
           | 
           | If you think an i3 is "meh", you know nothing about
           | computers. For the vast majority of users including gamers,
           | an i3 is overkill.
        
         | bartekpacia wrote:
         | I remember having Snapdragon 800 in my LG G2. It was a beast at
         | the time!
        
           | rbanffy wrote:
           | Most people switch phones on a cadence that makes every new
           | one feel like a beast (my personal phone is a 2nd gen iPhone
           | SE and my new work phone is a 13, which feels like a beast
           | next to the other).
           | 
            | It's also incredibly rare to find someone who knows what
            | chip is in their phone. I have the vague notion my personal
            | phone has an A15 or something like that.
        
       | umanwizard wrote:
       | The title should say 8+, not 8 (@dang)
        
         | nfriedly wrote:
         | You're correct. But I believe they have an identical design -
         | the only difference is that the 8+ runs a little faster and
         | more efficiently, because it was manufactured by TSMC, whereas
         | the 8 was manufactured by Samsung.
        
           | clamchowder wrote:
           | The 8+ runs at slightly faster clocks and has a smaller die.
           | Somehow TSMC's 4 nm is better than Samsung's 4 nm.
           | 
           | But yeah everything I wrote there should be applicable to the
           | Snapdragon 8 Gen 1 as well, just without the 10% GPU clock
           | speed increase that Qualcomm says the 8+ gets.
        
       | cubefox wrote:
       | Any explanation why Qualcomm uses "tile based rendering" while
       | Nvidia and AMD don't?
        
         | charcircuit wrote:
          | Both Nvidia and AMD do use tile-based rendering.
        
           | bpye wrote:
           | And for what it's worth Apple and Imagination do also -
           | Imagination being where tiled rendering really started.
        
           | kllrnohj wrote:
           | Not quite, not in the same way mobile GPUs do. Also I don't
           | think AMD ever did the transition. They announced it (draw
           | stream binning), but don't seem to have ever shipped it?
        
         | kllrnohj wrote:
          | It's significantly more bandwidth-efficient to do tile-based
          | rendering; however, it has more performance cliffs and
          | requires more care from game developers to avoid hitting
          | issues. For mobile SoCs (PowerVR, Adreno, Mali, and whatever
          | Apple calls their PowerVR derivative) you can't just throw
          | GDDR at the problem and get gobs of bandwidth, and bandwidth
          | is also expensive in power, so the savings more than offset
          | the performance cliffs and developer complications.
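          | 
          | As a rough back-of-the-envelope illustration (all numbers
          | here are assumptions, not figures from the thread or the
          | article): an immediate-mode GPU round-trips overdraw and
          | blending traffic through DRAM, while a tiler resolves each
          | pixel on-chip and flushes it to DRAM once.
          | 
          |     /* Assumed: 1440p at 60 fps, 4 bytes/pixel, overdraw 3,
          |      * and a read+write round trip per shaded pixel in the
          |      * immediate-mode case. Illustrative only. */
          |     #include <stdio.h>
          | 
          |     int main(void) {
          |         double px = 2560.0 * 1440.0, fps = 60.0, bpp = 4.0;
          |         double overdraw = 3.0;
          |         double imr   = px * fps * bpp * overdraw * 2 / 1e9;
          |         double tiled = px * fps * bpp / 1e9;
          |         printf("immediate-mode: ~%.1f GB/s\n", imr);
          |         printf("tile-based:     ~%.1f GB/s\n", tiled);
          |         return 0;
          |     }
          | 
          | That is roughly 5 GB/s of framebuffer traffic versus under
          | 1 GB/s, before counting texture reads, which is the kind of
          | saving that matters on a phone's memory bus and power budget.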
        
         | MikusR wrote:
         | Nvidia has been using it for 10 years.
         | https://www.realworldtech.com/tile-based-rasterization-nvidi...
        
         | kbolino wrote:
         | If you can afford to render the entire screen, why would you
         | tile? Tiling complicates certain kinds of shaders, tiles have
         | to be stitched together to make the final image, and tiling
         | redraws (different parts of) the same triangles multiple times.
         | 
         | On a dedicated GPU with lots of memory bandwidth, there's
         | probably no benefit (and maybe even some penalty) to use tiling
         | with lower resolutions (e.g. 1080p). However, 4K rendering
         | might benefit from it and 8K rendering probably requires it.
        
           | kllrnohj wrote:
            | The output resolution doesn't matter. The reason you tile is
            | to improve locality and thus get more cache hits. For mobile
            | GPUs the cache is literally a dedicated tile buffer, but for
            | Nvidia (who also do this) the cache is just L2. By tiling
            | the geometry, they spend more time in L2 and hit DRAM less
            | often. This is a performance _and_ power win. The actual
            | resolution is irrelevant because the tiles are very small,
            | even on Nvidia GPUs; 16x16 and 32x32 are common tile sizes.
            | See https://www.techpowerup.com/231129/on-nvidias-tile-based-
            | ren... and https://www.youtube.com/watch?v=Nc6R1hwXhL8 where
            | this was reverse engineered before Nvidia actually talked
            | about it.
           | 
           | You might be confusing tiled rendering with various upscaling
           | technologies or variable rate shading?
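            | 
            | A toy sketch of the binning idea (my own illustration, not
            | how any real driver or GPU front-end is implemented): sort
            | screen-space triangles into 16x16-pixel bins up front, then
            | process one bin at a time so all the framebuffer work for
            | that bin stays in a small on-chip buffer.
            | 
            |     /* Toy tile binner for a 1920x1080 target, 16x16 bins.
            |      * Real hardware stores per-bin draw lists; here we
            |      * only count how many triangles touch each bin. */
            |     #include <stdio.h>
            | 
            |     #define TILE 16
            |     #define TX   ((1920 + TILE - 1) / TILE)
            |     #define TY   ((1080 + TILE - 1) / TILE)
            | 
            |     typedef struct { float x0, y0, x1, y1; } Box;
            | 
            |     static int bins[TY][TX];
            | 
            |     static void bin_triangle(Box b) {
            |         /* Walk every bin the bounding box overlaps. */
            |         for (int ty = (int)b.y0 / TILE;
            |              ty <= (int)b.y1 / TILE; ty++)
            |             for (int tx = (int)b.x0 / TILE;
            |                  tx <= (int)b.x1 / TILE; tx++)
            |                 bins[ty][tx]++;
            |     }
            | 
            |     int main(void) {
            |         Box tris[] = { {0, 0, 40, 40}, {100, 50, 130, 90} };
            |         for (int i = 0; i < 2; i++)
            |             bin_triangle(tris[i]);
            |         /* A tiler would now shade bin (0,0), flush it to
            |          * DRAM once, and move on to the next bin. */
            |         printf("bin (0,0): %d triangle(s)\n", bins[0][0]);
            |         return 0;
            |     }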
        
       | kllrnohj wrote:
        | For the 'CPU to GPU Copy Bandwidth' section, the more likely
        | reason the GPU -> CPU copy is slow is that there's no reason to
        | do it. Adreno is unified memory; you can just mmap that GPU
        | buffer on the CPU. This is done on Android via the "gralloc"
        | HAL, also called (A)HardwareBuffer.
        | 
        | CPU -> GPU is still valuable in that it's where texture
        | swizzling happens to optimize the data for non-linear access,
        | and vendors are all cagey about documenting these formats. But
        | I don't think there's a copy engine for it at all; I think it's
        | just CPU code. If you run a Perfetto trace you can see Adreno
        | actually using multiple threads for this, which is likely why
        | CPU -> GPU is so much faster than the reverse. But you almost
        | never need non-linear output, so since vendor-specific
        | swizzling isn't helpful there, you just don't bother and use
        | shared memory between the two.
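        | 
        | A rough sketch of that zero-copy path through the NDK's
        | AHardwareBuffer API (my own example, not from the article; the
        | helper name upload_zero_copy plus the size, format, and usage
        | flags are placeholders, and real code would honor the row
        | stride reported by AHardwareBuffer_describe rather than doing a
        | flat memcpy):
        | 
        |     /* Map a gralloc allocation into the CPU's address space
        |      * instead of issuing a GPU -> CPU copy. Link with
        |      * -lnativewindow on Android. */
        |     #include <android/hardware_buffer.h>
        |     #include <stdint.h>
        |     #include <string.h>
        | 
        |     int upload_zero_copy(const void *pixels, size_t bytes) {
        |         AHardwareBuffer_Desc desc = {
        |             .width  = 1024,
        |             .height = 1024,
        |             .layers = 1,
        |             .format = AHARDWAREBUFFER_FORMAT_R8G8B8A8_UNORM,
        |             .usage  = AHARDWAREBUFFER_USAGE_CPU_WRITE_OFTEN |
        |                       AHARDWAREBUFFER_USAGE_GPU_SAMPLED_IMAGE,
        |         };
        |         AHardwareBuffer *buf = NULL;
        |         if (AHardwareBuffer_allocate(&desc, &buf) != 0)
        |             return -1;
        | 
        |         /* Lock maps the shared DRAM pages into this process;
        |          * no copy engine is involved. */
        |         void *map = NULL;
        |         uint64_t use = AHARDWAREBUFFER_USAGE_CPU_WRITE_OFTEN;
        |         if (AHardwareBuffer_lock(buf, use, -1, NULL, &map)) {
        |             AHardwareBuffer_release(buf);
        |             return -1;
        |         }
        |         /* Caller must keep bytes within the allocation. */
        |         memcpy(map, pixels, bytes);
        |         AHardwareBuffer_unlock(buf, NULL);
        |         AHardwareBuffer_release(buf);
        |         return 0;
        |     }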
        
       | flakiness wrote:
       | Great article, as always!
       | 
        | The author not only looks into the spec sheet and the
        | presentation, but also digs into the OSS Mesa code and uses
        | OpenGL introspection to reverse-engineer (well, not by himself,
        | but...) the architecture. For me this is one of the most
        | detailed explanations of what a mobile GPU architecture looks
        | like.
        | 
        | The comparison to the older NVIDIA GPU is also very helpful
        | (there is roughly a six-year gap between this and the discrete
        | NVIDIA GTX 1050). Now I wonder how it compares to other mobile
        | GPUs like Apple's or ARM's.
        
       ___________________________________________________________________
       (page generated 2024-03-07 23:01 UTC)