       ___________________________________________________________________
        
       100Gbps RF Sample Offload for RFSoC Using GNU Radio and PYNQ
        
       Author : teleforce
       Score  : 42 points
       Date   : 2023-12-07 12:18 UTC (10 hours ago)
        
 (HTM) web link (strathprints.strath.ac.uk)
 (TXT) w3m dump (strathprints.strath.ac.uk)
        
       | gumballindie wrote:
       | > utilising a GPU acceleration
       | 
        | GPUs are incredibly versatile. As CPUs struggle to add more
        | cores, we should instead shift as much software as possible
        | onto the GPU and use the CPU as a coprocessor.
       | 
       | At least the GPU should be given direct access to storage.
       | 
       | Perhaps GPUs should come with 1-2 NVMe slots and upgradeable
       | VRAM.
        
         | 1oooqooq wrote:
          | in this case the GPU is just there to cheapen development /
          | enable prototyping. Nothing in the real world will use it
          | instead of an FPGA or custom silicon.
          | 
          | also, LOL at Nokia making bank on their overly finicky
          | 200Gbps SFP while these guys go "wires? meh"
        
           | KeplerBoy wrote:
           | "Nothing in the real world will use it instead of a fpga or
           | custom silicon."
           | 
           | That's a bold claim and pretty wrong.
        
         | MisterTea wrote:
          | GPUs suck at general loads. Let the CPU worry about the boring
         | general nonsense like data shuffling and just hand the GPU
         | massively parallel workloads like graphics, computations and so
         | on. Let it do what it does best and let DMA handle the rest.
         | Horses for courses.
        
           | gumballindie wrote:
            | I am aware of their limitations. But I am wondering if
            | there's a way to make GPU cores behave like a single unit,
            | or like larger units such as CPU cores - processing data as
            | a sequence of instructions when needed and in parallel when
            | not. Just thinking out loud. As I play around with GPUs for
            | various reasons, I am simply amazed at how versatile they
            | are.
           | 
            | One fun hack I wanted to build was a database function that
            | would let me update database records on the GPU. One could
            | implement the function as a kernel and have it update tens
            | of thousands of records in one go.
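            | 
            | A toy sketch of that kind of bulk update, assuming CuPy is
            | available; the record layout and the update rule here are
            | invented for illustration:
            | 
            |     import cupy as cp
            | 
            |     # Hypothetical table: 100k fixed-width records kept
            |     # in VRAM as a price column plus a flag column.
            |     prices = cp.arange(100_000, dtype=cp.float32)
            |     flags = cp.zeros(100_000, dtype=cp.int32)
            |     flags[::2] = 1   # mark every other record for update
            | 
            |     # One elementwise kernel touches every record in a
            |     # single launch; each GPU thread handles one row.
            |     update = cp.ElementwiseKernel(
            |         'float32 price, int32 flagged', 'float32 out',
            |         'out = flagged ? price * 1.1f : price',
            |         'bulk_update')
            |     prices = update(prices, flags)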
           | 
            | The bottleneck is reading data from disk, to CPU/RAM, to
            | the GPU.
            | 
            | If instead there were a way to read data directly from
            | disk, all records could be loaded into VRAM at once and
            | updated there.
           | 
            | I think for a good number of use cases databases are prime
            | candidates for parallel processing, with each record being
            | processed by a kernel.
            | 
            | For gaming, a significant bottleneck is moving texture and
            | mesh data between disk, CPU, and GPU.
           | 
            | Essentially, I think the main component of a PC could be
            | the GPU, with the CPU there to execute block operations -
            | if there is no way to "fuse" GPU cores as per my first
            | paragraph.
           | 
           | I just think there's a lot of performance left on the table
           | otherwise.
           | 
            | For ML inference at the edge, for instance, all you need is
            | a GPU with a network card and extremely fast storage - all
            | of which are available, but bottlenecked by the CPU.
        
         | xxpor wrote:
         | If you need to do parallel matrix math, yeah GPUs are great.
         | Turns out that describes a lot of scientific/graphics/signal
          | processing workloads. But the majority of computing is just
          | straightforward, linear moving of bits around.
        
         | juliangoldsmith wrote:
         | GPUs are great for things that are parallelizable, but a lot of
         | things aren't.
         | 
         | Technologies like GPUDirect are interesting in that vein.
         | Theoretically you can DMA data directly from an NVMe drive on
          | another machine, across an InfiniBand link, and into VRAM
          | without the data ever touching the CPU.
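          | 
          | A minimal local-disk flavour of that idea, assuming NVIDIA's
          | GPUDirect Storage stack with its kvikio Python bindings and
          | CuPy; the file path and buffer size are hypothetical:
          | 
          |     import cupy as cp
          |     import kvikio
          | 
          |     # 16M int16 samples, allocated in GPU memory up front.
          |     buf = cp.empty(1 << 24, dtype=cp.int16)
          | 
          |     # With GPUDirect Storage enabled, this read should go
          |     # NVMe -> VRAM without bouncing through host RAM.
          |     f = kvikio.CuFile("/data/capture.bin", "r")
          |     f.read(buf)
          |     f.close()
          | 
          |     # Samples are already on the GPU, ready for DSP.
          |     spectrum = cp.fft.fft(buf.astype(cp.complex64))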
        
           | londons_explore wrote:
           | More importantly, why wasn't that possible in 1998?
           | 
            | CPU-independent DMA engines have existed since 1995 or
            | earlier... DMA engines can typically be controlled by a
            | queue of work, and can read/write to anything that is
            | memory-mapped. The work units themselves can live in the
            | memory-mapped space of the GPU or network adapter, allowing
            | those devices to queue up work too. And the DMA transfer
            | can be triggered by any interrupt source (initially
            | intended for things like sound cards, so a tiny buffer
            | could be frequently refilled by the DMA engine without CPU
            | intervention).
           | 
           | Basically, hardware has supported this forever. It is
           | software that is lagging.
        
           | gumballindie wrote:
           | Every now and then there's a comment on HN that either ruins
           | my weekend or my budget. Yours does both. I have to get both
           | types of devices now to play around with.
        
           | KeplerBoy wrote:
           | It goes beyond that. You can DMA stuff directly from a NIC
           | into VRAM. That's probably why Nvidia acquired Mellanox a few
           | years ago.
        
         | topspin wrote:
         | > upgradeable VRAM
         | 
         | Consumer class GPUs don't have a lot of flexibility regarding
         | VRAM size or type. GPUs have highly optimized, integrated
         | memory controllers that are designed to use one or only a small
         | number of distinct VRAM capacities and types. Enterprise class
         | GPUs do have more options wrt VRAM, but it's still not a free-
         | for-all like typical desktop/server RAM configurations.
         | 
          | I suppose one can imagine this changing in the future, but
          | memory controller complexity and cost (die area, power
          | consumption, etc.) would increase greatly.
        
       | CobaltFire wrote:
       | So this uses the AMD/Zynq RFSoC 4x2 to pipe ~78Gbps of raw data
       | (at maximum) from an SDR front end to a machine running GNU
       | Radio.
       | 
       | That's absolutely nuts. The board is a bit over $2k, and they
       | don't go into how effective the RF front end on that board is,
       | but enabling that kind of bandwidth at that price is amazing.
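        | 
        | Back-of-the-envelope on where a figure like ~78 Gbps can come
        | from, assuming 16-bit I and 16-bit Q samples (the exact rates
        | used in the paper may differ):
        | 
        |     # Rough throughput arithmetic, not figures from the paper.
        |     sample_rate_sps = 2.45e9    # complex samples/s (assumed)
        |     bits_per_sample = 2 * 16    # 16-bit I + 16-bit Q
        | 
        |     throughput_gbps = sample_rate_sps * bits_per_sample / 1e9
        |     print(f"{throughput_gbps:.1f} Gbps")   # -> 78.4 Gbps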
        
         | nimish wrote:
         | That's the academic price. The commercial/regular price is
         | likely 5-10x more.
        
           | londons_explore wrote:
            | But expect another 10x discount if you buy in bulk and not
            | through a reseller...
        
           | peckforth43 wrote:
           | It's 2x.
        
         | kvemkon wrote:
          | Interfaces with 80 Gbit/s were announced in the 3rd quarter
          | of 2022 (rumored since summer 2021):
         | 
         | * DisplayPort 2.1
         | 
         | * USB4 version 2
         | 
         | * Thunderbolt 5
         | 
         | Amazing for desktops and notebooks.
         | 
          | Hopefully such boards will also soon make it into at least
          | the SDR hobby price range.
        
         | fidotron wrote:
         | This is also the kind of product where making a serious effort
         | to buy one will attract the attention of various agencies.
         | Looks like amazing fun to me.
        
       | nxa wrote:
       | > FFT of a full 2GHz bandwidth RF signal at 60 frames per second
       | 
        | Is this useful for anything other than visualization of the
        | spectrum in a waterfall diagram?
        
         | stagger87 wrote:
         | The processing that goes into the phosphor and waterfall
         | displays is usually overlapped FFTs, which is also the
          | processing for a lot of common channelization techniques. So
          | yes - probably not so much in the commercial space, but
          | certainly very useful. In fact, I'm aware of platforms that
          | don't even store or manipulate I/Q data, but rather the
          | overlapped FFT data, since it's usually a much more useful
          | starting point.
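          | 
          | A bare-bones version of that overlapped-FFT front end, in
          | NumPy for clarity; the FFT size, window, and overlap are
          | placeholders, not the paper's parameters:
          | 
          |     import numpy as np
          | 
          |     def overlapped_fft(iq, nfft=1024, overlap=0.5):
          |         """Windowed, overlapped FFTs of complex I/Q."""
          |         hop = int(nfft * (1 - overlap))
          |         win = np.hanning(nfft)
          |         n_frames = 1 + (len(iq) - nfft) // hop
          |         out = np.empty((n_frames, nfft), dtype=np.complex64)
          |         for k in range(n_frames):
          |             seg = iq[k * hop : k * hop + nfft]
          |             out[k] = np.fft.fftshift(np.fft.fft(seg * win))
          |         return out  # rows feed a waterfall or channelizer
          | 
          |     # e.g. frames = overlapped_fft(iq); psd = abs(frames)**2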
        
         | KeplerBoy wrote:
         | Oh, yes it is. It is a necessary step for all radar processing
         | in FMCW systems.
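          | 
          | For the FMCW case, the FFT is what turns each dechirped
          | sweep into a range profile; a toy example with made-up chirp
          | parameters:
          | 
          |     import numpy as np
          | 
          |     c = 3e8                  # speed of light, m/s
          |     bandwidth = 2e9          # swept bandwidth, Hz (assumed)
          |     n_samples = 4096         # beat samples per chirp
          | 
          |     # One dechirped sweep off the ADC (placeholder data).
          |     beat = np.random.randn(n_samples)
          |     window = np.hanning(n_samples)
          |     range_profile = np.abs(np.fft.rfft(beat * window))
          | 
          |     # With one chirp per FFT, each bin is one range cell.
          |     range_res = c / (2 * bandwidth)    # ~7.5 cm at 2 GHz
          |     ranges = np.arange(len(range_profile)) * range_res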
        
       | bobbob1921 wrote:
        | Am I reading this correctly in that they are essentially able
        | to capture (what's commercially called) a real-time bandwidth
        | "block" of around 350-450 MHz? I'm getting the term real-time
        | bandwidth (RTBW) from a device such as this, which can capture
        | up to 245 MHz of RTBW, although that requires 2x antenna
        | ports/inputs: https://aaronia.com/en/produkte/spectrum-
        | analyzer/spectran-v...
        
         | stagger87 wrote:
         | Yes, it's the same real-time bandwidth concept being discussed
          | here, except up to 2.5 GHz in the paper.
        
       | teleforce wrote:
        | This is the paper author's presentation at this year's GNU
        | Radio Conference [1].
       | 
       | [1] Ultra-wideband SDR architecture for AMD RFSoCs using PYNQ
       | based GNU Radio blocks:
       | 
       | https://events.gnuradio.org/event/21/contributions/418/
        
       ___________________________________________________________________
       (page generated 2023-12-07 23:01 UTC)