[HN Gopher] Computing Without Processors (2011)
___________________________________________________________________
Computing Without Processors (2011)
Author : hasheddan
Score : 34 points
Date : 2024-03-09 14:47 UTC (8 hours ago)
(HTM) web link (cacm.acm.org)
(TXT) w3m dump (cacm.acm.org)
| brap wrote:
| So it could be "Computing _With_ Processors". That could be the
| title too. Having said that, I think this is a terrific title.
| fsiefken wrote:
| The excellent article is ancient (2011), but the author, Satnam
| Singh, is alive, kicking and working for Groq.
|
| https://www.cst.cam.ac.uk/people/ss2072
|
| https://codesync.global/speaker/satnam-singh/
|
| https://research.google.com/intl/en/pubs/SatnamSingh.html
|
| https://www.linkedin.com/in/satnam6502/
| creer wrote:
| 2011 "ancient"? Author alive! Amazing world we live in - deep
| witchcraft probably involved here. Perhaps upload?
|
| Seriously though. There are plenty of useful readings that came
| out before the past 7 days. And digging them out and bringing
| them to our attention is useful.
| kouru225 wrote:
| Honestly, with how quick shit has been moving, even 2020 seems
| ancient.
| jacoblambda wrote:
| He also has a Twitter account: https://twitter.com/satnam6502
| reidacdc wrote:
| The article being from 2011 is perhaps why it can be as long as
| it is without mentioning "Coarse-grained reconfigurable arrays",
| or CGRAs, which, at least as of 2019 when I learned about them,
| seemed to occupy a good middle ground between conventional CPUs
| and FPGAs.
|
| The idea is that, instead of being a bunch of gates like an FPGA,
| the components of the CGRA are at the scale of an ALU, or maybe
| an on-silicon network switch, with a single CGRA having different
| parts that are optimized for e.g. numerics, IO, encryption,
| caching, etc., which you can knit together into the processor you
| need.
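|
| A toy sketch of that "knit together" idea, assuming a
| hypothetical CGRA whose units work on whole words and whose
| configuration is just a list of routes between them. The unit
| names, the config format and the run() helper are all made up
| for illustration; no real CGRA toolchain looks like this.

```python
# Hypothetical coarse-grained units: each one operates on whole words,
# unlike the bit-level lookup tables of an FPGA.
UNITS = {
    "mul": lambda a, b: a * b,
    "add": lambda a, b: a + b,
    "xor": lambda a, b: a ^ b,
}


def run(config, inputs):
    """Evaluate a configured dataflow graph.

    config is a list of (dest, unit, src_a, src_b) tuples in dataflow
    order. Each source names either an external input or the output of
    an earlier unit.
    """
    wires = dict(inputs)
    for dest, unit, src_a, src_b in config:
        wires[dest] = UNITS[unit](wires[src_a], wires[src_b])
    return wires


# "Knit together" a multiply-accumulate pipeline: out = a * b + c
mac = [
    ("t0", "mul", "a", "b"),
    ("out", "add", "t0", "c"),
]
```

| With this, run(mac, {"a": 3, "b": 4, "c": 5})["out"] is 17.
| Reconfiguring means swapping the route list, not the units.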
|
| That's maybe where this idea went?
|
| Here's a more recent link covering similar ground:
|
| https://semiengineering.com/specialization-vs-generalization...
| adrian_b wrote:
| Also about CGRA:
|
| https://efficient.computer/technology
|
| The claims about improved energy efficiency (due to the
| elimination of instruction fetching and decoding and of the
| register files) can be correct only when such a CGRA is used
| not as a general-purpose CPU, but as an accelerator for
| various iterative algorithms, i.e. when its dataflow compiler
| could be used as a replacement for something like CUDA.
|
| An FPGA would have the same energy efficiency advantage for
| algorithms without much numeric computation, but it is not
| competitive with a GPU or a CGRA for most numeric computations,
| except DSP, because it includes only small fixed-point
| multipliers and adders, which are not as efficient as big
| vector floating-point fused-multiply-add execution units.
| jacoblambda wrote:
| It's worth noting that what you are describing is basically an
| FPGA nowadays.
|
| FPGAs don't have "gates" as the basic building blocks.
|
| Instead you have "logic cells", each composed of a fixed-size
| LUT (lookup table, often with 4 or 6 inputs), one or two
| flip-flops, and a multiplexer that chooses between the stored
| value and the new LUT value. They also sometimes contain
| basic ALU components like adders or multipliers. Those logic
| cells are then usually grouped together to form logic blocks
| which might have some amount of local memory/cache available.
| These blocks are the smallest "discrete" component of an FPGA
| and are configured as a whole block, with configurations
| determined at synthesis time.
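|
| A minimal sketch of one such logic cell, assuming a single
| flip-flop and a 4-input LUT. The LogicCell class and its method
| names are my own illustration, not any vendor's primitive.

```python
class LogicCell:
    """Toy model of one logic cell: a LUT, one flip-flop, an output mux."""

    def __init__(self, lut_bits, registered):
        # lut_bits[i] is the LUT output for the input pattern whose
        # integer value is i (the configured truth table).
        self.lut_bits = list(lut_bits)
        # Mux select: True = flip-flop output, False = raw LUT output.
        self.registered = registered
        self.ff = 0  # flip-flop state, assumed to reset to 0

    def lut(self, inputs):
        # Pack the input bits into a table index; inputs[0] is the LSB.
        index = sum(bit << i for i, bit in enumerate(inputs))
        return self.lut_bits[index]

    def output(self, inputs):
        # The mux chooses between the stored value and the new LUT value.
        return self.ff if self.registered else self.lut(inputs)

    def clock(self, inputs):
        # On a clock edge the flip-flop captures the current LUT output.
        self.ff = self.lut(inputs)


# Configure a 4-input LUT as a parity (4-way XOR) function by filling
# in all 16 entries of its truth table.
parity_table = [bin(i).count("1") % 2 for i in range(16)]
comb = LogicCell(parity_table, registered=False)
reg = LogicCell(parity_table, registered=True)
```

| The combinational cell answers immediately; the registered one
| only shows the LUT result after clock() has been called.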
|
| On top of this you have memory blocks and other "hard IP" like
| DSP slices, etc., distributed around the IC for these logic
| blocks to take advantage of.
|
| And then finally you have larger hard IP that a given chip only
| has a few of. These include your PLLs (phase-locked loops) or
| other analog clock multiplier hardware (to allow you to run
| multiple clock domains on a single FPGA), your encryption and
| encoding/decoding accelerators, dedicated protocol hard IP
| (Ethernet, PCIe, etc.), and hardware that is directly attached
| to the IO (ADCs, DACs, pullup/pulldown resistor configuration,
| etc.). And increasingly nowadays also full-blown hard IP CPUs
| and GPUs that can interact directly with the FPGA.
| adrian_b wrote:
| No, the previous poster has not described an FPGA, but an
| FPGA-like device that contains much more complex fixed-function
| blocks than the small DSP multipliers that are available in
| currently existing FPGAs.
|
| With 18-bit integer multipliers or the like you cannot
| compete in energy efficiency with the arithmetic execution
| units of a GPU.
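|
| To make the cost concrete, here is a rough sketch of building a
| wide multiply out of narrow ones. It assumes a hypothetical
| unsigned 18x18 multiplier primitive (mul18) and plain schoolbook
| decomposition; real DSP slices differ in width and signedness,
| and this counts multiplies only, not the adders or routing.

```python
LIMB_BITS = 18  # assumed operand width of the hypothetical hard multiplier
MASK = (1 << LIMB_BITS) - 1


def mul18(a, b):
    """Stand-in for one small hard multiplier (hypothetical primitive)."""
    assert 0 <= a <= MASK and 0 <= b <= MASK
    return a * b


def split(x):
    """Split a non-negative integer into 18-bit limbs, least significant first."""
    limbs = []
    while True:
        limbs.append(x & MASK)
        x >>= LIMB_BITS
        if x == 0:
            return limbs


def wide_mul(a, b):
    """Schoolbook multiply; returns (product, number of mul18 calls used)."""
    total = 0
    count = 0
    for i, limb_a in enumerate(split(a)):
        for j, limb_b in enumerate(split(b)):
            # Each partial product is shifted into place and accumulated.
            total += mul18(limb_a, limb_b) << (LIMB_BITS * (i + j))
            count += 1
    return total, count
```

| Under these assumptions a 53-bit by 53-bit product (the size of
| a double-precision significand) splits into 3 limbs per operand
| and takes 9 small multiplies plus the additions to combine them.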
|
| The so-called CGRAs are an attempt to revive the idea of
| reconfigurable dataflow processors, with the hope of
| combining in the same device the advantages of FPGAs with
| those of GPUs.
___________________________________________________________________
(page generated 2024-03-09 23:01 UTC)