[HN Gopher] AMD Patent Reveals Hybrid CPU-FPGA Design That Could...
___________________________________________________________________
AMD Patent Reveals Hybrid CPU-FPGA Design That Could Be Enabled by
Xilinx Tech
Author : craigjb
Score : 137 points
Date : 2021-01-03 17:39 UTC (5 hours ago)
(HTM) web link (hothardware.com)
(TXT) w3m dump (hothardware.com)
| rwmj wrote:
| About *!$% time! I was hoping Intel would do something like this
| when they acquired Altera a few years back. Does anyone know why
| Intel acquired Altera?
| PedroBatista wrote:
| Almost the same reason someone buys a Peloton bike or a rusted
| old Porsche: because someone had a dream last night and has the
| money.
| d_tr wrote:
| AFAIK there exist some Xeon + FPGA chips. No clue about
| availability though...
| mhh__ wrote:
| Xilinx already have ARM cores in their FPGAs, so I wonder which
| way they'll go - I'd honestly prefer a Neoverse core to an x86.
| efferifick wrote:
| Not sure how realistic it would be, but I would like to see a
| RISC-V base core, and the FPGA implementing the extensions.
| Why? Because it would be cool! Also, I don't really see a use
| case except for debugging compilers supporting multiple RISC-V
| extensions and what not.
| craigjb wrote:
| Microchip has the product for you then! Well, the RISC-V part
| anyway. https://www.microsemi.com/product-directory/soc-
| fpgas/5498-p...
| jagger27 wrote:
| AMD already has full-on Arm products.
|
| https://www.amd.com/en/amd-opteron-a1100
| sbrorson wrote:
| You are right about the ARM cores, mostly. Xilinx Zynq devices
| have ARM Cortex-A cores built into them as "hard" cores. That is,
| the ARMs are instantiated directly in silicon, not as "soft" cores
| which take LUTs (gates) from the FPGA fabric. The Cortex-A is a
| full microprocessor (not a microcontroller), powerful enough to
| run Linux.
|
| The ARM connects to the FPGA fabric using a so-called AXI bus,
| which is a local bus defined by ARM. Xilinx supplies a bunch of
| "soft" cores which you can instantiate in the FPGA and
| integrate with the ARM. Of course, you can write your own logic
| for the FPGA too, as long as you can figure out how to
| interface to it using one of the AXI bus variants.
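|
| To make "write your own logic" concrete, here is a minimal sketch
| in the Python-based nMigen HDL (not Xilinx's own flow); the
| register layout is invented for illustration, and in a real Zynq
| design you would still put an AXI wrapper (e.g. Xilinx's AXI GPIO
| IP or a hand-written AXI-Lite slave) between this and the ARM:
|
|   from nmigen import Elaboratable, Module, Signal
|
|   class LedRegister(Elaboratable):
|       """Four LEDs driven from a CPU-writable control register."""
|       def __init__(self):
|           self.ctrl = Signal(4)  # value the CPU writes over the bus
|           self.we   = Signal()   # write strobe from the bus wrapper
|           self.led  = Signal(4)  # routed to FPGA pins
|
|       def elaborate(self, platform):
|           m = Module()
|           with m.If(self.we):
|               m.d.sync += self.led.eq(self.ctrl)  # latch on write
|           return m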
|
| Several vendors offer experimenters platforms which are
| affordable enough for hobbyists and folks making engineering
| prototypes. Examples are Avnet's ZedBoard and Digilent's Zybo
| board.
|
| The biggest problem with the Zynq ecosystem is that the Xilinx
| tools -- Vivado/SDK and whatever they renamed it to last year
| -- are steaming piles of smelly brown stuff. Vivado is buggy,
| poorly supported, has bad documentation, and the supplied
| examples typically don't work in the latest version of Vivado
| since they were written long ago and have been made obsolete
| via version skew. An absolute disgrace compared to what
| software engineers are used to. The SDK is basically Eclipse
| which has its own problems, but is not as bad as Vivado. Ask me
| how I know.
|
| I think AMD and Xilinx have a long way to go before they can
| satisfy the hype and speculation I see in all the posts here. I
| suppose one could shell out $20K for a seat of Synopsys if one
| wanted a decent set of dev tools, but that's not the direction
| most software engineers are going nowadays.
|
| Also, assuming Nvidia completes its acquisition of ARM, the
| whole Zynq ecosystem is imperiled, since it would leave the Zynq
| line dependent on ARM cores licensed from AMD's rival Nvidia.
| ohazi wrote:
| For decades, the FPGA vendors have had this fever dream of "an
| FPGA in every PC" -- either as an add-on card, or as part of the
| chipset on a motherboard -- that would enable a compiler or
| operating system to seamlessly accelerate arbitrary tasks on
| demand.
|
| In my opinion, the problem has always been their software: the
| FPGA vendor tools are slow, bloated monstrosities. The cores of
| these tools are written by the big three EDA vendors (Cadence,
| Synopsys, and Mentor Graphics) rather than the FPGA vendors
| themselves. The licenses include ridiculous, paranoid
| restrictions [1] and force the FPGA vendors to keep their
| bitstream formats and timing databases secret [2] in order to
| prevent competition from other tool vendors. Most FPGA vendors
| didn't see this as a problem, but even the ones that did didn't
| have much of a choice, because the tool market is a cartel.
|
| Thankfully, we now have an open source toolchain [3] with support
| for a growing number of FPGA architectures [4], and using it vs.
| the vendor tools is like using gcc or llvm vs. a '90s era, non-
| compliant C++ compiler. It even has a real IR that isn't Verilog,
| which has made it easier to design new HDLs [5].
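|
| As a concrete taste of [5]: nMigen designs are ordinary Python,
| and the toolchain can emit either Yosys' RTLIL or Verilog from
| them. A minimal free-running counter, using the API roughly as of
| nMigen 0.2 (the project has since been renamed Amaranth, so treat
| the exact import paths as approximate; emitting Verilog also
| requires Yosys on the PATH):
|
|   from nmigen import Elaboratable, Module, Signal
|   from nmigen.back import verilog
|
|   class Counter(Elaboratable):
|       def __init__(self, width=8):
|           self.count = Signal(width)
|
|       def elaborate(self, platform):
|           m = Module()
|           m.d.sync += self.count.eq(self.count + 1)  # +1 every clock
|           return m
|
|   if __name__ == "__main__":
|       top = Counter()
|       # Or use nmigen.back.rtlil to feed Yosys' IR directly.
|       print(verilog.convert(top, ports=[top.count]))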
|
| I don't see how a dynamic FPGA accelerator platform can be even
| remotely viable without this. It's the difference between a
| developer having to choose one of a few dozen pre-baked
| designs that lock up the entire FPGA (and needing to learn how to
| shovel data into it), vs. a compiler flag that can give you the
| option of unrolling any loop directly into any inactive region of
| FPGA fabric.
|
| It would be quite the cherry on top to see AMD build something
| interesting in this space. But unless they're willing to fully
| unencumber at least this one design, I think the effort is likely
| to fail. The open source guys are chomping at the bit to make
| this work, and have been making real progress lately. Meanwhile,
| the EDA vendors have been making promises, failing, and throwing
| tantrums for the last 20 years. It's time to write them off.
|
| [1]
| https://twitter.com/OlofKindgren/status/1052822081652617221?...
|
| [2] Imagine trying to write an assembler without being allowed to
| see the manual that tells you how instructions are encoded. It's
| like that, but the state-space is hundreds to thousands of bytes
| in multiple configurations rather than a few dozen bits.
|
| [3] https://github.com/YosysHQ/yosys
|
| [4] https://symbiflow.github.io/
|
| [5] https://github.com/m-labs/nmigen
| travis729 wrote:
| I would love to hack on FPGAs but always run into the issue of
| closed toolchains. The recent open source work is a breath of
| fresh air, but we need to see an FPGA vendor that embraces and
| sponsors this work.
| ohazi wrote:
| I think/hope it's an unstable equilibrium -- if either
| Altera/Intel or Xilinx/AMD give a nod to the open source
| tools, the others will follow.
|
| Lattice is seemingly at "wink wink, nudge nudge" levels of
| support -- their lawyers won't allow them to say anything
| because they're afraid of pissing off Synopsys, but they also
| know that they're currently the best supported platform, and
| don't seem interested in deliberately making things
| difficult.
| mhh__ wrote:
| SymbiFlow is still a long, long way from replacing vendor tools
| at scale, right?
|
| I'm really liking Clash and Bluespec (Bluespec is completely
| open source now) but I don't want to write any conventional
| languages.
| thrtythreeforty wrote:
| What does Bluespec compile to? All the way to a bitstream
| (surely not) or to Verilog or an intermediate language?
| mhh__ wrote:
| Firstly (for the uninitiated), Bluespec is both a Haskell
| DSL (Bluespec Classic) and a Verilog-like language
| (Bluespec SystemVerilog).
|
| It compiles to Verilog, but the stack is much more
| integrated than other similar compile-to-Verilog HDLs - the
| simulator is similar to Verilator and much easier to get
| started with.
|
| I'm kind of beginning to feel that Haskell isn't a good
| medium for HDL code - Verilog already encourages unreadable
| names like "mem_chk_sig_state", and Haskell code is almost
| unstructured to my eye (I like functional programming, but
| it seems hard to keep it readable because of the style it
| imposes - the flow is there, but the names are usually way
| too short for my taste).
| ohazi wrote:
| I'm pretty sure Bluespec and SpinalHDL compile to Verilog.
| Chisel uses its own IR (FIRRTL). I think Migen used to
| target Verilog, but now targets (one of?) the IR(s) that
| Yosys supports (RTLIL?).
| [deleted]
| Traster wrote:
| I hate to be that bucket of cold water, but there are _multiple_
| reasons FPGAs haven't been successful in package with CPUs.
| Firstly, the cost of embedding the FPGA: FPGAs are relatively
| large and power hungry (for what they can do), so if you're
| sticking one on a CPU die, you're seriously talking about trading
| that against other extremely useful logic. You really need to make
| a judgement at purchase time whether you want that dark piece of
| silicon instead of CPU cores for day-to-day use.
|
| Secondly, whilst they're reconfigurable, they're not
| reconfigurable on the time scale it takes to spawn a thread;
| it's more like the time it takes to compile a program (this
| is getting a little better over time). That makes it a difficult
| system design problem to make sure your FPGA is programmed with
| the right image to run the software program you want. If you're
| at that level of optimization, why not just design your system to
| use a PCI-E board? It'll give you more CPU and way more FPGA
| compute, and both will be cheaper because you get a stock CPU and
| stock FPGA, not some super custom FPGA-CPU hybrid chip.
|
| Thirdly, the programming model for FPGAs is fundamentally very
| different from CPUs: it's dataflow, and generally the FPGA is
| completely deterministic. We really don't have a good answer for
| writing FPGA logic to handle the sort of cache hierarchy and out-
| of-order execution that CPUs have. So you're not getting the same
| sort of advantage that you'd expect from that data locality. It's
| very difficult to write CPU/FPGA programs that run concurrently;
| almost all solutions today run in parallel - you package up your
| work, send it off to the FPGA and wait for it to finish.
|
| Finally, as others have said - the tools are bad. That's
| relatively solvable.
|
| For me, it boils down to this: if you have an application that
| you think would be good on the same package as a CPU, it's
| probably worth hardening it into an ASIC (see: error correction,
| Apple's AI stuff). If you have an application that isn't, then a
| PCI-E card is probably a better bet - you get more FPGA, more CPU
| and you're not trading the two off.
| imtringued wrote:
| It's easier to provide "custom instructions" and only
| accelerate CPU bottlenecks if you don't have PCIe as a massive
| bottleneck. If you are using an accelerator behind a bus you
| always have to make sure there is enough work for the
| accelerator to justify a data transfer. GPUs are built around
| the idea of batching a lot of work and running it in parallel.
| You can make an FPGA work like that but you are throwing away
| the low latency benefits of FPGAs.
| wtallis wrote:
| Even the best-case scenarios for integrating an FPGA onto the
| same die as CPU cores would still have the FPGA _separate_
| from the CPU cores. It's really not possible to make an
| open-ended high bandwidth low latency interface to a huge
| chunk of FPGA silicon part of the regular CPU core's tightly-
| optimized pipeline, without drastically slowing down that
| CPU. The sane way to use an FPGA is as a coprocessor, not
| grafted onto the processor core itself. Then, you're
| interacting with the FPGA through interfaces like memory-
| mapped IO whether it's on-die, on-package, or on an add-in
| card.
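|
| On Linux, that memory-mapped interaction can be as blunt as the
| classic /dev/mem poke below (a sketch only: the base address and
| register offsets are hypothetical, it needs root, and a real
| system would use UIO or a proper driver instead):
|
|   import mmap, os, struct
|
|   FPGA_BASE = 0x40000000          # hypothetical register block
|   fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
|   try:
|       mem = mmap.mmap(fd, mmap.PAGESIZE, mmap.MAP_SHARED,
|                       mmap.PROT_READ | mmap.PROT_WRITE,
|                       offset=FPGA_BASE)
|       mem[0:4] = struct.pack("<I", 1)            # write a control reg
|       (status,) = struct.unpack("<I", mem[4:8])  # read a status reg
|       print(hex(status))
|       mem.close()
|   finally:
|       os.close(fd)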
| Traster wrote:
| Yeah, worth mentioning highly optimized FPGA designs run at
| up to 600MHz (or to put it another way, 400MHz lower than
| what Intel advertised 4 years ago). So at a minimum, you're
| going to clock cross, have a >10 cycle pipeline at CPU
| speeds (variable clock), and clock cross back.
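|
| Back-of-envelope, with assumed numbers (4 GHz CPU, 2-cycle
| synchronizers each way, a 10-stage pipeline in the fabric):
|
|   cpu_hz, fpga_hz = 4e9, 600e6
|   cdc_each_way, fpga_pipeline = 2, 10
|   fpga_ns = (2 * cdc_each_way + fpga_pipeline) / fpga_hz * 1e9
|   print(f"{fpga_ns:.0f} ns round trip, "
|         f"~{fpga_ns * cpu_hz / 1e9:.0f} CPU cycles")
|   # roughly 23 ns, i.e. on the order of 90 CPU cycles before any
|   # useful work comes back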
| jacoblambda wrote:
| I definitely agree that a PCI-E card is preferable. Hell, even
| if you have it in the CPU, you probably want it sitting on the
| PCI-E bus anyway so it can do P2P DMA with other hardware.
|
| Also (not disagreeing, but I'm curious): last time I checked,
| FPGAs could pull off some level of partial reconfiguration in
| the millisecond and sub-millisecond ranges. I may be a bit off
| on these times, but I saw them in a research paper a few years
| back. What kind of speeds would be necessary for CPUs to
| actually be able to benefit from a small FPGA onboard (rather
| than on an expansion card), with all the context switching?
| user5994461 wrote:
| Yet another patent that should never have been granted.
|
| SoCs have been a thing for a long time, including CPU + FPGA on a
| single chip.
|
| Looking at the patent, the list of 20 claims is absurd. The title
| says it all "... PROGRAMMABLE INSTRUCTIONS IN COMPUTER SYSTEMS",
| they're trying to patent anything that can run or dispatch
| instructions.
| mhh__ wrote:
| Ironically the neural engine patent is literally the only
| public information on how it works I can find
| refulgentis wrote:
| >> the list of 20 claims is absurd.
|
| Claims are a union - each individual claim may sound simple,
| what matters is the combination.
|
| >> The title says it all "... PROGRAMMABLE INSTRUCTIONS IN
| COMPUTER SYSTEMS", they're trying to patent anything that can
| run or dispatch instructions.
|
| No. The title of a patent is not a patent.
| fvv wrote:
| Claims define the context and boundaries of the patent
| user5994461 wrote:
| Every claim is almost a patent on its own. Submit 20 claims
| that are progressively more specific, so if one claim is
| denied during the patent application or afterwards, the other
| claims can still stand.
|
| The typical strategy is to claim as many things as you can
| imagine - like inventing the CPU, anything that can evaluate an
| instruction, and instructions themselves - then remove any
| claim that the patent office refuses to grant.
| cptskippy wrote:
| That's how the industry works. You gather and hoard as many
| frivolous patents as you can in a cold war arms race. If a new
| company threatens your business, you search your portfolio for
| a patent they violated and sue them.
|
| Companies who grow to a certain size look to be acquired by
| larger firms with bigger war chests.
|
| Sometimes companies recognize patents are stifling progress and
| engage in cross licensing or pooling of patents. Sometimes they
| do it to gang up on a new rival.
| economusty wrote:
| Computronium
| d_tr wrote:
| The main reason I am interested in this acquisition is a (faint)
| hope that they open some specs up to help projects like
| SymbiFlow.
| Scene_Cast2 wrote:
| A killer tech for this would be a framework that automatically
| reprograms the FPGA and offloads the work if it makes sense. For
| example - running k-means? Have your FPGA automatically (with
| minimal dev effort) flash to be a Nearest Neighbor accelerator.
|
| The problem is finding a way to make that translation happen with
| minimal dev effort, as software is written rather differently
| from hardware.
| cashsterling wrote:
| I recommend checking out CacheQ: https://cacheq.com/
|
| They are working on almost exactly this. If I were an investor,
| or Intel or AMD, I would buy them and/or invest heavily.
| therealcamino wrote:
| Their web site is very sparse on what programming models the
| tool supports. Traditionally, the things you can easily
| accelerate automatically are algorithms you can write
| naturally in Fortran 77 (lots of arrays, no pointers), and
| that's one limit on the applicability of these automatic
| tools. (Other limits that other posters have pointed out are
| compilation+place+route runtime, and reconfiguration time.)
|
| They are claiming you can use malloc and make "extensive" use
| of pointers in C programs and still have them automatically
| compiled for the FPGA. That's where details are needed and
| they are mostly missing.
|
| I watched their 30-minute demo video. The speedups are
| impressive, and on the small example it's impressive that it
| does the partitioning automatically. However, the program
| contains only a single call to malloc, and all pointers are
| derived from that address, so it doesn't do much to convince
| us that the memory model and alias analysis give you more
| flexibility than the F77 model.
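|
| Loosely, and in Python purely for illustration (the tool itself
| consumes C), the gap between the two styles looks like this:
|
|   # The "F77 model": dense arrays, fixed strides, nothing aliases,
|   # so a tool can pipeline or unroll the loop mechanically.
|   def saxpy(a, x, y):
|       for i in range(len(x)):
|           y[i] = a * x[i] + y[i]
|       return y
|
|   # The kind of code the CacheQ claim implies it can handle:
|   # heap allocation and pointer chasing, where the tool must
|   # first prove what can alias what.
|   class Node:
|       def __init__(self, value, next=None):
|           self.value, self.next = value, next
|
|   def sum_list(head):
|       total, node = 0, head
|       while node is not None:   # data-dependent, unknown trip count
|           total += node.value
|           node = node.next
|       return total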
| d_tr wrote:
| You might want to check the "Warp Processing" project out:
| http://www.cs.ucr.edu/~vahid/warp/. It is probably exactly what
| you are thinking about. Transparent analysis of the instruction
| stream at runtime and synthesis and offloading of hot spots to
| the FPGA.
| Scene_Cast2 wrote:
| Huh, interesting. It seems that the work doesn't have to be
| explicitly parallel for this to work, which is a surprise.
| rch wrote:
| I recall reading papers about doing this by profiling Java apps
| a decade or so ago, but I would have to dig pretty deep in my
| HN comment history to find them.
|
| The approach seems conceptually similar to the optimizations
| available via the enterprise version of GraalVM.
| BryanBeshore wrote:
| Lisa Su is a fantastic CEO. Time will tell what the impact of
| AMD's acquisition of Xilinx will be (should it close), but this
| shows the strategy and execution behind Su and team.
|
| While a lot of acquisitions don't pan out, this seems great.
| parsimo2010 wrote:
| AMD purchasing Xilinx is a reaction to Intel purchasing Altera
| five years ago. Dr. Su might be a good CEO for other reasons,
| but this isn't something that illustrates brilliant strategy on
| her part.
| cptskippy wrote:
| The industry doesn't move overnight. AMD might have seen
| where Intel was going and didn't want to be caught off guard,
| or this might be their alternative to Apple's approach of dozens
| of coprocessors on a chip.
| BryanBeshore wrote:
| As I said, time will tell
| rusticpenn wrote:
| Intel has not produced anything worthwhile from that strategy
| yet, and I have seen no plans either. I use Altera for all my
| FPGA needs.
| ATsch wrote:
| A large reason for the deal with Altera was that Altera
| already used Intel for fabrication. I understand Intel's
| 10nm and 7nm failures have hurt them a lot in that regard,
| quite the opposite of the expected synergy. Unlike Xilinx
| for AMD, Altera didn't really have any other technologies
| Intel needed either; the biggest advantage was fabrication,
| and that fell through.
| sitkack wrote:
| This is AMD competing with Nvidia, not AMD competing with
| Intel.
| GeorgeTirebiter wrote:
| Xilinx had laid off a good chunk of its workforce right before
| the sale to AMD. Xilinx was having some financial troubles; when
| that happens, investors want out before a company craters. So
| selling itself was one possible solution.
| ATsch wrote:
| I think it's more a reaction to the decreasing importance of
| CPUs in the datacenter in favor of interconnect technology.
| FPGAs are one of the directions in which the "smart nic" or
| "DPU" tech has been moving, which is critical to the trend of
| datacenter disaggregation. Xilinx has a very strong offering
| in that regard.
| baybal2 wrote:
| It is not a trend at all if you look at market data.
|
| The vast majority of the hosting market still goes to bog-
| standard servers, not even blades.
|
| I'll wait for "clouds" to get to a significant double-digit
| market share first.
| ATsch wrote:
| If you look at market data, you can see that this market
| did not exist a few years ago and is now estimated to be
| worth billions, with major players releasing products in
| the space. Unless the dynamics pushing this forward
| change overnight, I think it's pretty safe to call it a
| trend.
| DCKing wrote:
| They're going to need good leadership to pull this off. AMD
| doesn't have a great track record when it comes to these
| integrations.
|
| AMD bought ATI while promising the same integration
| "synergies". GPU style compute was going to be completely woven
| into the CPU - "AMD Fusion". Sounds great - but they ended up
| with them being beaten to the CPU-with-integrated-GPU market by
| Intel by over a year (Intel Clarkdale launched January 2010,
| AMD Llano midway 2011). 14 years after the acquisition, AMD's
| iGPU integration is not much different compared to any other
| iGPU integration, their raw performance lead is shrinking
| compared to Intel and they're beaten by Apple. Radeon
| Technologies Group functionally operates independently within
| the company, and AMD won't use their more performant new RDNA
| architecture in iGPUs for two years after its launch for some
| reason - even their 2021 APUs still use their 2017 Vega
| architecture (fundamentally based on 2012 GCN technology). In
| the intervening years they've screwed up their processor
| architecture and market share by going all in on the
| terrible Bulldozer architecture that was designed around the
| broken promises of far-reaching GPU integration.
|
| Given all that, the ATI acquisition might still have been worth
| it - in hindsight AMD needed a competent GPU architecture one
| way or another - but the mismanagement of this acquisition
| nearly killed the company. I hope better leadership can do
| something here but I'm not really holding my breath.
| atq2119 wrote:
| Agreed. Now to be fair, the acquisition is also what helped
| the company survive because it got them the console business.
| So it's not like it was completely botched.
|
| They screwed up majorly with software, and they may have the
| same problem with an FPGA acquisition as well. AMD failed big
| time to capitalize on GPUs the way Nvidia did, and that's
| really almost entirely down to lack of good software
| solutions. There's ROCm now and it seems plausible that the
| gap is going to narrow further with AMD GPUs deployed to big
| HPC clusters, but a gap remains.
| m4rtink wrote:
| Aren't all the new desktop consoles and the generation before
| that based on AMD CPU and GPU fused together in a specific
| way ?
| wtallis wrote:
| The consoles use AMD SoCs that include CPU and GPU cores,
| but there's nothing special about how the CPU and GPU are
| connected. The only remotely unusual aspect there is that
| many of the console SoCs connect GDDR5/6 to the SoC's
| shared memory controller, while other consumer devices
| using similar chips (marketed by AMD as APUs) tend to use
| DDR4 or LPDDR.
| qwerty456127 wrote:
| I could never stop wondering why this is not the norm yet. Why
| doesn't every computer have an FPGA?
| mhh__ wrote:
| Probably power, getting the data onto the FPGA, and utilising
| FPGAs being unlike writing software.
|
| I definitely want one but any common task worth having on an
| FPGA is probably common enough to justify either a GPU or
| actual silicon.
|
| Intel and AMD both have the IP to do it, and iPhones do have a
| Lattice chip on them apparently
| rowanG077 wrote:
| Once partial reconfiguration works and the FPGA can access
| main memory directly I see a lot of use cases. Imagine
| applications reconfiguring the FPGA in the blink of an eye to
| optimize their own algorithms.
| rjsw wrote:
| There have been PCI FPGA boards available for a long time
| that can access main memory, I had them in my desktop
| machines nearly 20 years ago.
| rowanG077 wrote:
| Yes, through the PCI bus, not directly. You don't want to
| have that latency. You want a unified model, like Intel
| GPUs that can access main memory, or the FPGA being
| another endpoint in AMD's Infinity Fabric architecture.
| That exists as well in SoC FPGA boards, but not in the mid-
| or high-performance segments.
| rjsw wrote:
| Back when AMD released the first Opteron CPUs there was a
| vendor selling an FPGA that would plug into an Opteron
| socket along with the IP to implement HyperTransport in
| the FPGA.
| mhh__ wrote:
| Attacking a hypothetical poorly isolated on-chip FPGA
| seems like the mother of all exploits, thinking about it
| rowanG077 wrote:
| Why? To make an FPGA do what you want, you need to be able
| to reconfigure it, and to abuse that reconfiguration
| capability an attacker already needs remote code execution.
| And in that case you have already lost.
| mhh__ wrote:
| As in, the FPGA would have to be carefully segmented so
| the accelerator couldn't be used to access memory it
| shouldn't have access to.
|
| I don't think it would happen in a general purpose chip
| but I could see it happening in a smaller one like the
| exploits Christopher Domas demonstrated against some
| embedded x86 cores.
| rowanG077 wrote:
| Why though? Your Integrated Intel or AMD GPU can also
| access all of your memory. I don't see how an FPGA
| provides any additional attack vector. As I said you'd
| need code execution privileges anyway and once you have
| that your system is already owned.
| rjsw wrote:
| The boards that I have used could not reprogram the FPGA
| over the PCI bus.
| mhh__ wrote:
| I was thinking aloud about the memory rather than the
| actual FPGA bitstream
| imtringued wrote:
| Existing FPGA vendors made sure their products remained in a
| lucrative niche by maintaining full control over the
| development process for FPGA designs.
| amelius wrote:
| My guess: because FPGAs are slow compared to mainstream desktop
| CPUs and only make sense if you have massive paralelism. But
| then you'd need a massive FPGA which would be crazy expensive,
| plus you'd need a good way to handle throughput.
|
| I could be totally wrong, though.
| atq2119 wrote:
| That, plus programming FPGAs kind of sucks. The software tool
| chains are somewhere between 20 and 30 years behind the state
| of the art for software development.
|
| Also, FPGAs can't be reasonably context-switched. Flashing
| them takes a significant amount of time, so forget about
| time-multiplexing access to the FPGA among different
| applications.
| m4rtink wrote:
| I could imagine some sort of API-based queuing - say you
| have 2 "slots" you can program stuff onto, so if you play 8K
| video you can have one flashed as a video decoder while the
| other one speeds up your kernel compilation. If you then
| want to also use FPGA-accelerated denoising on some video
| you recently recorded, the OS will politely tell you to
| wait for one of the other apps using the available slots to
| terminate first.
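|
| A purely hypothetical sketch (no OS exposes anything like this
| today) of what that slot-based queuing might look like:
|
|   class FpgaSlotManager:
|       def __init__(self, num_slots=2):
|           self.slots = [None] * num_slots
|
|       def request(self, bitstream):
|           """Return a slot index, or None if the caller must wait."""
|           for i, owner in enumerate(self.slots):
|               if owner is None:
|                   self.slots[i] = bitstream  # flash partial bitstream
|                   return i
|           return None                        # all slots busy
|
|       def release(self, slot):
|           self.slots[slot] = None
|
|   mgr = FpgaSlotManager()
|   video = mgr.request("h265_decoder.bit")    # slot 0
|   build = mgr.request("compile_accel.bit")   # slot 1
|   late  = mgr.request("denoise.bit")         # None: wait politely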
| amelius wrote:
| Is there even any progress in OSes with respect to how
| they deal with tasks/processes on GPUs?
| atq2119 wrote:
| Progress relative to what?
|
| Since applications do all their rendering via the GPU
| these days, desktop multi-tasking requires reasonably
| time-sliced access to the GPU. GPUs have proper memory
| protection these days (GPU-side page tables for each
| process). That's big progress over 10 years ago.
| ineedasername wrote:
| Sounds like spending a few hours a month learning an HDL could be
| a good long-term career decision.
| deelowe wrote:
| Anyone who is considering this, make sure you learn digital
| circuits first.
| seabird wrote:
| You're going to need to commit a lot more time than that. HDLs
| and the surrounding concepts have key fundamental differences
| from software that a lot of developers have a hard time
| stomaching. That's why high-level synthesis is the FPGA
| industry's City of El Dorado; software developers would be able
| to create acceleration designs without having to build up a
| fairly large new skillset.
| imtringued wrote:
| I've never understood this argument. The change in mindset is
| extremely small. It's merely a matter of awareness. High
| level synthesis can work just fine if you don't go overboard
| with constructs that are hard to synthesize. There is no
| fundamental reason why a math equation in C should be harder
| to synthesize than the Verilog or VHDL equivalent.
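|
| For instance, the same multiply-accumulate written both ways
| (sketched in Python/nMigen rather than actual C and Verilog,
| with arbitrary bit widths):
|
|   from nmigen import Elaboratable, Module, Signal
|
|   def mac(a, x, b):          # "software" form an HLS tool consumes
|       return a * x + b
|
|   class Mac(Elaboratable):   # HDL form: one multiplier, one adder
|       def __init__(self, width=16):
|           self.a = Signal(width)
|           self.x = Signal(width)
|           self.b = Signal(2 * width)
|           self.y = Signal(2 * width + 1)
|
|       def elaborate(self, platform):
|           m = Module()
|           m.d.comb += self.y.eq(self.a * self.x + self.b)
|           return m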
| Nullabillity wrote:
| The dataflow dialect of VHDL instantly felt really natural to
| me, coming from FRP (among a bunch of other stuff).
|
| Of course, using it in industry is presumably pretty
| different from using it for a few school courses.
| efferifick wrote:
| While sibling comments mention that it is probably wiser to
| learn digital logic before HDL (and I agree with them), I think
| it is important to also consider that there is now High-Level
| Synthesis (HLS), where programming languages similar to C (e.g.,
| OpenCL) can compile to VHDL. HLS may lower the barrier for
| programmers to take advantage of FPGAs. However, whether the
| design can compile to fit the constraints of the available FPGA
| is another question that I do not know the answer to.
| nsajko wrote:
| I think the right way isn't "learn a HDL", it's "learn digital
| electronics design". Hardware description languages enable
| succinct hardware description, but it's still necessary to keep
| an image of the actual hardware in mind.
| ip26 wrote:
| HDL is really just ASCII schematics.
| GuB-42 wrote:
| Everyone seems to be talking about accelerated instructions but
| how about I/O?
|
| FPGAs are awesome at asynchronous I/O and low latency. We could
| implement network stacks, sound and video processing, etc... It
| can start a TLS handshake as soon as the electrical signal hits
| the ethernet port, while the CPU is not even aware of it
| happening. It can timestamp MIDI input down to the microsecond
| and replay with the same precision. It can process position data
| from a VR headset at the very last moment in the graphics
| pipeline. Maybe even do something like a software defined radio.
|
| Basically any simple but latency-critical operation. Of
| course, embedded/realtime systems are a prime target.
| slimsag wrote:
| A fair number of enterprise NICs in data centers do exactly
| this, e.g. Intel FPGA smart NICs.
|
| I don't know enough to know how this being on the CPU would
| affect performance in this scenario, but I'd love to learn
| more!
| leecb wrote:
| Everything described in the article sounds exactly like some of
| the Virtex*-FX products from more than 10 years ago.
|
| For instance, the Virtex-4 FX had either one or two 450MHz PowerPC
| cores embedded in it, where you could implement 8 of your own
| additional instructions in the FPGA. This is effectively now a
| CPU where you can extend the instruction set, and design your own
| instructions specific to your application. For example, you might
| make special instructions using the onboard logic to accelerate
| video compression, or math operations; I know of one application
| that was designed to do a 4x4 matrix multiply per cycle.
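|
| Back-of-envelope for that example, assuming the custom
| instruction retired one 4x4 multiply per 450MHz PowerPC clock:
|
|   macs_per_matmul = 4 * 4 * 4        # 64 multiply-accumulates
|   clock_hz = 450e6
|   print(macs_per_matmul * clock_hz / 1e9, "GMAC/s")  # 28.8
|   # versus roughly 0.45 GMAC/s for the scalar core doing one
|   # MAC per cycle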
|
| https://www.digikey.com/catalog/en/partgroup/virtex-4-fx-ser...
| https://www.xilinx.com/support/documentation/data_sheets/ds1...
| mhh__ wrote:
| What was the latency like to actually get data into your shiny
| new instruction e.g. do I get a 14 stage pipeline stall to
| actually use the instruction?
| rowanG077 wrote:
| That depends on how you designed your instruction.
| sitkack wrote:
| And your pipeline
| thrtythreeforty wrote:
| For those curious, Xtensa is a similar embeddable architecture
| (known especially for its use in the ESP32 microcontroller)
| that allows broad latitude to the designer to customize its
| instruction set with custom acceleration. The integration is
| very good: the compiler recognizes the new intrinsics, and the
| designer has control over how the instruction is pipelined into
| the main processor.
|
| Unfortunately it's very proprietary, and as far as I know there
| isn't an at-home version you can play with on FPGAs. But this
| kind of thing does exist if you can afford it - you don't have
| to roll your own RTL.
| nynx wrote:
| This is exciting! Would be cool if it could access some sort of
| gpio as well!
___________________________________________________________________
(page generated 2021-01-03 23:00 UTC)