[HN Gopher] 100Gbps RF Sample Offload for RFSoC Using GNU Radio ...
___________________________________________________________________
100Gbps RF Sample Offload for RFSoC Using GNU Radio and PYNQ
Author : teleforce
Score : 42 points
Date : 2023-12-07 12:18 UTC (10 hours ago)
(HTM) web link (strathprints.strath.ac.uk)
(TXT) w3m dump (strathprints.strath.ac.uk)
| gumballindie wrote:
| > utilising a GPU acceleration
|
| GPUs are incredibly versatile. As CPUs struggle to add more cores,
| we should instead shift as much software as possible onto the
| GPU and use the CPU as a coprocessor.
|
| At least the GPU should be given direct access to storage.
|
| Perhaps GPUs should come with 1-2 NVMe slots and upgradeable
| VRAM.
| 1oooqooq wrote:
| in this case the GPU is just to cheapen development/enable
| prototyping. Nothing in the real world will use it instead of a
| fpga or custom silicon.
|
| also, LOL at nokia making bank on their overly finicky 200Gbps
| SFP and these guys "wires? meh"
| KeplerBoy wrote:
| "Nothing in the real world will use it instead of a fpga or
| custom silicon."
|
| That's a bold claim and pretty wrong.
| MisterTea wrote:
| GPUs suck at general loads. Let the CPU worry about the boring
| general nonsense like data shuffling and just hand the GPU
| massively parallel workloads like graphics, computations and so
| on. Let it do what it does best and let DMA handle the rest.
| Horses for courses.
| gumballindie wrote:
| I am aware of their limitations. But I am wondering if
| there's a way to make GPU cores behave like a single unit, or
| larger units like cpu cores, when needed - somehow to process
| data in a sequence of instructions, when needed and in
| parallel when not. Just thinking out loud. As I am playing
| around with GPUs for various reasons I am simply amazed at
| how versatile they are.
|
| One fun hack I wanted to make was a database function that
| would let me update database records on the GPU. One could
| implement the function as a kernel and have it update tens of
| thousands of records in one go.
|
| The bottleneck is reading data from disk, to cpu / ram, to
| gpu.
|
| If instead there was a way to read data directly from disk,
| all records could be loaded in VRAM at once and data updated.
|
| I think for a good number of use cases databases are prime
| candidates for parallel processing. Each record being
| processed by a kernel.
|
| For gaming a significant bottleneck is moving texture and
| mesh data between disks, cpus, and the gpu.
|
| Essentially, I think the main component of a PC could be the
| GPU while the CPU is there to execute block operations - if
| there is no way to "fuse" gpu cores as per my first
| paragraph.
|
| I just think there's a lot of performance left on the table
| otherwise.
|
| For ml inference at edge for instance all you need is a GPU
| with a network card and extremely fast storage. All of which
| are available but bottlenecked by the cpu.
| xxpor wrote:
| If you need to do parallel matrix math, yeah GPUs are great.
| Turns out that describes a lot of scientific/graphics/signal
| processing workloads. But the majority of computing is just
| straightforward, linear moving of bits around.
| juliangoldsmith wrote:
| GPUs are great for things that are parallelizable, but a lot of
| things aren't.
|
| Technologies like GPUDirect are interesting in that vein.
| Theoretically you can DMA data directly from an NVMe drive on
| another machine, across an Infiniband link, and into VRAM
| without the data ever touching the CPU.
| londons_explore wrote:
| More importantly, why wasn't that possible in 1998?
|
| CPU independent DMA engines have existed since 1995 or
| earlier... DMA engines can typically be controlled by a queue
| of work, and can read/write to anything memory mapped. The
| work units themselves can be in memory mapped space of the
| GPU or network adapter, allowing those devices to queue up
| work too. And the DMA transfer can be triggered by any
| interrupt source too (initially intended for things like
| sound cards to have a tiny buffer be frequently refilled by
| the DMA engine without CPU intervention).
|
| Basically, hardware has supported this forever. It is
| software that is lagging.
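|
| A toy model of the descriptor-queue idea described above, assuming
| a flat bytearray stands in for the memory-mapped address space.
| The names Descriptor and ToyDmaEngine are hypothetical and do not
| correspond to any real driver or hardware API:

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Descriptor:
    """One unit of DMA work: copy `length` bytes from `src` to `dst`."""
    src: int
    dst: int
    length: int


class ToyDmaEngine:
    """Hypothetical DMA engine that drains a work queue without 'CPU' copies."""

    def __init__(self, memory: bytearray):
        self.memory = memory   # stand-in for the memory-mapped address space
        self.queue = deque()   # pending descriptors, processed in FIFO order

    def submit(self, desc: Descriptor) -> None:
        # Any device able to write a descriptor could queue work here.
        self.queue.append(desc)

    def on_interrupt(self) -> bool:
        """Process one descriptor per trigger; return False if the queue is empty."""
        if not self.queue:
            return False
        d = self.queue.popleft()
        self.memory[d.dst:d.dst + d.length] = self.memory[d.src:d.src + d.length]
        return True


mem = bytearray(64)
mem[0:4] = b"\x01\x02\x03\x04"       # pretend this is a NIC receive buffer
engine = ToyDmaEngine(mem)
engine.submit(Descriptor(src=0, dst=32, length=4))  # pretend offset 32 is VRAM
engine.on_interrupt()                # the "interrupt" triggers the transfer
print(bytes(mem[32:36]))
```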
| gumballindie wrote:
| Every now and then there's a comment on HN that either ruins
| my weekend or my budget. Yours does both. I have to get both
| types of devices now to play around with.
| KeplerBoy wrote:
| It goes beyond that. You can DMA stuff directly from a NIC
| into VRAM. That's probably why Nvidia acquired Mellanox a few
| years ago.
| topspin wrote:
| > upgradeable VRAM
|
| Consumer class GPUs don't have a lot of flexibility regarding
| VRAM size or type. GPUs have highly optimized, integrated
| memory controllers that are designed to use one or only a small
| number of distinct VRAM capacities and types. Enterprise class
| GPUs do have more options wrt VRAM, but it's still not a free-
| for-all like typical desktop/server RAM configurations.
|
| I suppose one can imagine this changing in the future, but the
| memory controller complexity and costs (die area, power
| consumption, etc.) would need to be greatly increased.
| CobaltFire wrote:
| So this uses the AMD/Zynq RFSoC 4x2 to pipe ~78Gbps of raw data
| (at maximum) from an SDR front end to a machine running GNU
| Radio.
|
| That's absolutely nuts. The board is a bit over $2k, and they
| don't go into how effective the RF front end on that board is,
| but enabling that kind of bandwidth at that price is amazing.
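|
| A minimal sketch of the arithmetic behind a figure like ~78Gbps,
| assuming a complex (I/Q) stream with 16-bit components at
| 2.4576 GS/s. Both numbers are illustrative assumptions chosen to
| land near the quoted rate, not values confirmed in this thread:

```python
def iq_stream_gbps(complex_rate_sps: float, bits_per_component: int = 16) -> float:
    """Raw rate in Gbit/s of a complex sample stream, ignoring framing overhead."""
    bits_per_sample = 2 * bits_per_component  # one I plus one Q component
    return complex_rate_sps * bits_per_sample / 1e9


# Assumed example values (not taken from the paper).
rate = iq_stream_gbps(2.4576e9, bits_per_component=16)
print(f"{rate:.2f} Gbit/s")
```

| Any packet headers on the link would add to this raw figure.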
| nimish wrote:
| That's the academic price. The commercial/regular price is
| likely 5-10x more.
| londons_explore wrote:
| But expect a 10x discount again if you buy in bulk not
| through a reseller...
| peckforth43 wrote:
| It's 2x.
| kvemkon wrote:
| Interfaces with 80 Gbit/s announced 3rd quarter 2022 (rumored
| since summer 2021):
|
| * DisplayPort 2.1
|
| * USB4 version 2
|
| * Thunderbolt 5
|
| Amazing for desktops and notebooks.
|
| Hopefully such boards could also soon make it into at least the
| SDR hobby price range.
| fidotron wrote:
| This is also the kind of product where making a serious effort
| to buy one will attract the attention of various agencies.
| Looks like amazing fun to me.
| nxa wrote:
| > FFT of a full 2GHz bandwidth RF signal at 60 frames per second
|
| Is this useful for anything other than visualization of the
| spectrum in a waterfall diagram?
| stagger87 wrote:
| The processing that goes into the phosphor and waterfall
| displays is usually overlapped FFTs, which is also the
| processing for a lot of common channelization techniques. So
| yes, probably not so much in the commercial space, but
| certainly very useful. In fact I'm aware of platforms that
| don't even store/manipulate I/Q data, but the overlapped FFT
| data since it's usually much more useful as a starting point.
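|
| A rough sketch of overlapped FFT processing with NumPy, assuming
| 50% overlap and a Hann window. The function name, FFT size and
| test tone are made up for illustration:

```python
import numpy as np


def overlapped_fft(iq: np.ndarray, nfft: int = 1024, overlap: float = 0.5) -> np.ndarray:
    """Return windowed FFT frames (rows) taken with the given fractional overlap."""
    hop = int(nfft * (1 - overlap))      # samples to advance between frames
    if hop < 1 or len(iq) < nfft:
        raise ValueError("need 0 <= overlap < 1 and at least nfft samples")
    n_frames = 1 + (len(iq) - nfft) // hop
    window = np.hanning(nfft)
    frames = np.stack([iq[i * hop:i * hop + nfft] * window for i in range(n_frames)])
    # Shift so that negative frequencies come first, as in a spectrum display.
    return np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)


fs = 1.0e6                               # assumed sample rate in Hz
n = np.arange(8192)
tone = np.exp(2j * np.pi * 125e3 * n / fs)   # complex test tone at 125 kHz

spec = overlapped_fft(tone)
power_db = 10 * np.log10(np.abs(spec) ** 2 + 1e-12)   # one waterfall row per frame
freqs = np.fft.fftshift(np.fft.fftfreq(1024, d=1 / fs))
peak_hz = freqs[int(np.argmax(power_db.mean(axis=0)))]
print(spec.shape, peak_hz)
```

| Each row of power_db is one line of a waterfall; the same complex
| frames in spec could feed a channelizer instead of a display.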
| KeplerBoy wrote:
| Oh, yes it is. It is a necessary step for all radar processing
| in FMCW systems.
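|
| A small sketch of the FFT step in FMCW ranging: the beat tone's
| frequency maps to range via R = c * f_beat / (2 * B / T). All
| radar parameters below are invented for illustration:

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def range_from_beat(f_beat_hz: float, bandwidth_hz: float, chirp_s: float) -> float:
    """Convert a beat frequency to target range for a linear FMCW chirp."""
    slope = bandwidth_hz / chirp_s       # chirp slope in Hz per second
    return C * f_beat_hz / (2 * slope)


# Assumed parameters: 150 MHz sweep over 1 ms, beat signal sampled at 1 MS/s.
B, T, fs = 150e6, 1e-3, 1e6
r_true = 300.0                           # simulated target range in metres

t = np.arange(round(fs * T)) / fs
f_beat = 2 * r_true * (B / T) / C        # beat frequency the target produces
beat = np.cos(2 * np.pi * f_beat * t)    # idealised, noise-free beat signal

# Range FFT: the strongest bin gives the beat frequency estimate.
spectrum = np.abs(np.fft.rfft(beat * np.hanning(len(beat))))
freqs = np.fft.rfftfreq(len(beat), d=1 / fs)
f_est = freqs[int(np.argmax(spectrum))]
r_est = range_from_beat(f_est, B, T)
print(f"estimated range: {r_est:.1f} m")
```

| The estimate is quantised to the FFT bin spacing, which here
| corresponds to roughly c / (2 * B), about 1 m.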
| bobbob1921 wrote:
| Am I reading this correctly in that they essentially are able to
| capture (what's called commercially) a real-time bandwidth
| "block" of around 350-450MHz? I'm getting the term real-time
| bandwidth (RTBW) in relation to a device such as this which can
| capture up to 245 MHz of RTBW, although that requires 2X antenna
| ports / inputs: https://aaronia.com/en/produkte/spectrum-
| analyzer/spectran-v...
| stagger87 wrote:
| Yes, it's the same real-time bandwidth concept being discussed
| here, except up to 2.5GHz in the paper.
| teleforce wrote:
| This is the paper author's presentation at this year's GNU Radio
| conference [1].
|
| [1] Ultra-wideband SDR architecture for AMD RFSoCs using PYNQ
| based GNU Radio blocks:
|
| https://events.gnuradio.org/event/21/contributions/418/
___________________________________________________________________
(page generated 2023-12-07 23:01 UTC)