[HN Gopher] Nyuzi - An Experimental Open-Source FPGA GPGPU Processor
___________________________________________________________________
Nyuzi - An Experimental Open-Source FPGA GPGPU Processor
Author : peter_d_sherman
Score : 128 points
Date : 2021-02-14 14:37 UTC (8 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| ourlordcaffeine wrote:
| Are there open source OpenCL-to-FPGA compilers?
|
| If you're playing with FPGAs, you might as well directly compile
| the kernel into a circuit, rather than building a GPU on an FPGA
| and running your kernel on that.
|
| Proprietary solutions like Altera's OpenCL compiler exist.
| antman wrote:
| Layman here! I see a lot of posts on that subject lately so I
| need to ask: Can someone design a RAM chip?
| pjc50 wrote:
| Possibly, but why would you want a less efficient RAM chip that
| costs more compared to something that's a commodity you can
| buy?
| ComputerGuru wrote:
| There's basically very little magic sauce in RAM chip
| _design_. The production process (which is rather independent
| of the commonly discussed "node size" production for CPU
| /GPU-related tech) is where the magic happens.
| detaro wrote:
| What do you mean specifically by "design a RAM chip"?
| (obviously RAM chips that you can buy are designed before they
| are made, so that's probably not what you are after?)
|
| FPGAs typically do contain dedicated RAM areas, because
| implementing it out of FPGA logic slices is terribly
| inefficient.
| tcherasaro wrote:
| FPGA designer here. Just wanted to point out that
| "efficiency" is highly context-sensitive in FPGA design.
| Everything is an area / speed / power trade-off. If you only
| need a RAM that is 8 bits wide and 64 words deep, then it
| might be very inefficient to waste a dedicated 18 kbit block
| RAM on it when it would fit into a handful of LUTs. This is
| why Xilinx, for one, provides pragmas such as RAM_STYLE to
| help guide synthesis:
|
| (* ram_style = "distributed" *)
| reg [data_size-1:0] myram [2**addr_size-1:0];
|
| block: Instructs the tool to infer RAMB type components.
|
| distributed: Instructs the tool to infer the LUT RAMs.
|
| registers: Instructs the tool to infer registers instead of
| RAMs.
|
| ultra: Instructs the tool to use the UltraScale+(TM) URAM
| primitives.
|
| See: https://www.xilinx.com/support/documentation/sw_manuals/
| xili...
|
| edit: formatting*
| sitkack wrote:
| Yes, designing RAM is a lower-level operation compared to
| designing logic via an HDL, and it needs to directly take the
| fab's process (chemistry, optics, mechanics) into account.
|
| https://openram.soe.ucsc.edu/
| bserge wrote:
| DRAM chips are fascinating.
|
| Instead of going with the much more expensive SRAM, someone
| decided that refreshing billions of capacitors every few
| dozen milliseconds while performing read and write operations
| is an acceptable way of _storing_ data (even if only while
| powered).
|
| I wonder what the managers who first heard the idea must've
| thought :D
|
| And it works so well! It's probably one of the most reliable
| components in a computer.
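To put numbers on the refresh burden, here is a small sketch using illustrative figures for a typical DDR device (roughly 64 ms retention, 8192 rows per bank — ballpark values, not from any specific datasheet):

```c
/* Illustrative DRAM refresh arithmetic. Every row must be refreshed
 * within its retention window, and the controller walks through all
 * rows in that time, interleaving refreshes with normal traffic. */

/* microseconds between successive row-refresh commands */
double refresh_interval_us(double retention_ms, int rows) {
    return retention_ms * 1000.0 / rows;
}

/* total refresh commands issued per second */
double refreshes_per_second(double retention_ms, int rows) {
    return rows / (retention_ms / 1000.0);
}
```

With those numbers, the controller slips in a row refresh roughly every 7.8 us — about 128,000 refresh commands a second — between ordinary reads and writes.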
| pjc50 wrote:
| Many of the early RAM systems were non-persistent (mercury
| delay lines, phosphor) and some were destructive-read (core
| memory).
|
| Appears to have been invented by Dennard of Dennard
| Scaling: https://www.thoughtco.com/who-invented-the-
| intel-1103-dram-c...
| kleiba wrote:
| I'm a total lay person here, but my understanding is that
| designing a new processor is very challenging these days because
| of the patent situation. That is, so much in hardware design is
| patented that you're bound to run into problems if you don't know
| what you're doing.
|
| Is this true, and is it of relevance here?
| 10000truths wrote:
| Yes, IP cores are very expensive to license, if they're even
| available for licensing at all. This is part of the appeal of
| RISC-V - an open-spec, royalty-free processor architecture that
| is free of charge for chip designers to implement.
| lkcl wrote:
| unfortunately, if you make modifications and you want them to
| be "upstreamed" (using libre/open project terminology as an
| analogy) you cannot do that without participating in the
| RISC-V Foundation. you can implement APPROVED (Authorized)
| parts of the RISC-V specification. you cannot arbitrarily go
| changing it and still call it "RISC-V"; that's a Trademark
| violation.
| admax88q wrote:
| RISC-V is not an IP core, just an instruction set
| architecture.
|
| Any implementation of it has the exact same patent minefield
| to navigate as any other ISA. Most of the patents cover
| implementation techniques, not the instruction set itself.
| jecel wrote:
| The RISC-V instruction set was carefully designed not to
| require the use of any currently valid patents to do an
| implementation. It is up to each processor designer to not
| violate any patents in their project.
| lkcl wrote:
| this is unfortunately not true (that the RISC-V ISA was
| designed not to require currently-valid patents). people
| may _believe_ that to be the case, but it's not. i've heard
| third-hand that IBM has absolutely tons of patents that
| RISC-V infringes. whether IBM decides to take action on that
| is another matter. they're a bit of a heavyweight, so there
| would have to be substantial harm to their business for the
| "800 lb gorilla" effect to kick in.
| astrange wrote:
| It's not actually possible to do this though, it's up to
| the other side's lawyers to decide if they're going to
| sue you, and the answer is yes if they can afford it. You
| don't have a jury on hand to evaluate every patent that
| ever exists.
|
| Besides that, engineers in large companies are told to
| explicitly not look up any patents so they won't be
| accused of willful infringement.
| vmception wrote:
| Yes, but there are a lot of profitable applications which
| don't need to be advertised. You run it in-house and make
| money on the output, e.g. ML farms or mining. You don't take
| preorders for the hardware at all and just have boutique
| custom units, and nobody knows the architecture, even if you
| offer some remote rental/SaaS tool.
| pkaye wrote:
| From a quick look, this processor seems to use a barrel
| processor architecture, so it's not an entirely new idea.
|
| https://en.wikipedia.org/wiki/Barrel_processor
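For readers unfamiliar with the term, the barrel-processor idea linked above can be sketched in a few lines of C (an illustrative simulation, not Nyuzi's actual pipeline): each cycle the core issues one instruction from the next ready hardware thread, rotating round-robin, so a thread stalled on memory never blocks the pipeline.

```c
#define NUM_THREADS 4

/* One hardware thread context. The three-cycle stall after every
 * issue is a stand-in for memory latency, chosen for illustration. */
typedef struct {
    int pc;           /* program counter for this hardware thread */
    int stall_cycles; /* cycles left before this thread is ready */
} thread_ctx;

/* Simulate 'cycles' clock ticks; returns how many instructions
 * actually issued (i.e. cycles that were not pipeline bubbles). */
int run_cycles(thread_ctx t[], int n_threads, int cycles) {
    int issued = 0, cur = 0;
    for (int c = 0; c < cycles; c++) {
        /* pick the next ready thread, round-robin from 'cur' */
        for (int i = 0; i < n_threads; i++) {
            int idx = (cur + i) % n_threads;
            if (t[idx].stall_cycles == 0) {
                t[idx].pc++;             /* "issue" one instruction */
                t[idx].stall_cycles = 3; /* it now waits on memory */
                issued++;
                cur = (idx + 1) % n_threads;
                break;
            }
        }
        /* one cycle passes for every stalled thread */
        for (int i = 0; i < n_threads; i++)
            if (t[i].stall_cycles > 0)
                t[i].stall_cycles--;
    }
    return issued;
}
```

With four hardware threads and a three-cycle stall, the pipeline issues every cycle; a single thread would leave two bubbles out of every three cycles.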
| ChuckNorris89 wrote:
| Not sure of how relevant it is here, but yes, GPUs
| architectures are bound by tons of patents so you can bet your
| a$$ that if you were to commercially launch your own GPU IP,
| you'll have Nvidia's and AMD's lawyers knocking on your door in
| under 10 seconds.
|
| IIRC most companies out there selling GPU IP are still paying
| royalties to AMD for their patents on shader architecture which
| they got from their acquisition of ATI which in turn came from
| their acquisition of ArtX which was founded by people who
| worked at the long defunct SGI (Silicon Graphics).
|
| The funny thing is, if you backtrack through all GPU
| innovations, most stem from former SGI employees.
|
| When 3Dfx went under, even though Nvidia's GPU tech was already
| superior to anything 3Dfx had, Nvidia immediately swept in and
| picked their carcass clean, mostly for their patents in this
| space, so they would have more ammo/leverage against
| competitors going forward.
|
| Regardless of how you feel about patents, with their pros and
| cons, hardware engineering is a capital-intensive business,
| and without patents to protect your expensive R&D, it
| wouldn't be a viable business.
| bserge wrote:
| Aren't patents supposed to expire?
|
| Isn't that the idea, you have a patent for 10-20 years, build
| your business (which AMD/nVidia did, very successfully) then
| everyone is free to use it, possibly leading to innovation?
|
| I'm poorly versed in this, so if anyone with more knowledge
| could share some thoughts, that would be appreciated.
| lkcl wrote:
| only if the patent holder does not create an "improvement"
| on the old one. then the older (referenced) patent is
| extended. Bosch have done this specifically so that they
| can hold on to the original CAN Bus patent.
| HideousKojima wrote:
| Correct, patents in the US expire after 20 years.
| JPLeRouzic wrote:
| And if I remember correctly (I wrote my last patent 10
| years ago), there are annual fees, and the patent lapses
| if they are not paid.
| arithmomachist wrote:
| >Nvidia immediately swept in and picked their carcass clean,
| mostly for their patents in this space, so they would have
| more ammo/leverage against competitors going forward.
|
| That's surely not a healthy situation either. Courts should
| never be a central part of competition among businesses.
| joshspankit wrote:
| To clarify what I think is the relevance, as well as to
| explore my own questions:
|
| If someone were to clean-room design their own GPU chip, how
| likely is it that Nvidia and AMD would come down on them
| anyway simply by virtue of the fact that they (presumably)
| have patents on everything that you could think of putting in
| that chip?
|
| In essence: do you now have to be an expert in what you're
| _not_ allowed to put in before you even start?
| raphlinus wrote:
| So here's what I would do if I were in this situation. I
| wouldn't build a graphics processing unit per se, but
| instead would build a highly parallel SIMD CPU organized in
| workgroups, and with workgroup-local shared memory. These
| cores could be relatively simple in some respects (they
| wouldn't need complex out-of-order superscalar pipelines or
| sophisticated branch prediction), but should have good
| simultaneous multithreading to hide latency effectively.
|
| Then, if you wanted to run a traditional rasterization
| pipeline, you'd do it basically in software, using
| approaches similar to cudaraster (which is BSD licensed!).
| The paper on that suggests that it would be on the order 2X
| slower than optimized GPU hardware for triangle-centric
| workloads, but that might be worth it. The good news is
| this story gets better the more the workload diverges from
| what traditional GPUs are tuned for - in particular, the
| more sophisticated the shaders get, the more performance
| depends on the ability to just evaluate the shader code
| efficiently.
|
| It would of course be very difficult to make a chip that is
| competitive with modern GPUs (the engineering involved is
| impressive by any standards), but I think a lot would be
| gained from such an effort.
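As an illustration of the software-rasterization approach described above (a toy sketch, not cudaraster's actual tile-based algorithm), the inner loop of a triangle fill is just three edge-function tests per pixel — exactly the kind of code a SIMD machine would evaluate with one pixel per lane:

```c
/* Signed area of parallelogram (a,b,p); the sign tells which side of
 * edge a->b the point p lies on. */
static int edge(int ax, int ay, int bx, int by, int px, int py) {
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax);
}

/* Fills pixels inside a counter-clockwise triangle in an 8-bit
 * framebuffer; returns the number of covered pixels. A real
 * implementation would evaluate the three edge functions
 * incrementally, tile by tile, across many SIMD lanes at once. */
int fill_triangle(unsigned char *fb, int w, int h,
                  int x0, int y0, int x1, int y1, int x2, int y2) {
    int covered = 0;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            if (edge(x0, y0, x1, y1, x, y) >= 0 &&
                edge(x1, y1, x2, y2, x, y) >= 0 &&
                edge(x2, y2, x0, y0, x, y) >= 0) {
                fb[y * w + x] = 0xFF; /* flat "shader": write white */
                covered++;
            }
        }
    return covered;
}
```

The more interesting the per-pixel "shader" body becomes, the more the workload resembles plain parallel code — which is the point being made above.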
|
| I should probably disclaim that this is _definitely_ not
| legal advice. Anyone who wants to actually play in the GPU
| space should plan on spending some quality time with a team
| of topnotch lawyers.
| jeffbush wrote:
| (project author here) That is pretty close to the
| approach this project has taken, although my motivation
| was not so much avoiding IP as exploring the line between
| hardware acceleration and software.
| lkcl wrote:
| allo jeff nice to see you're around :) thank you so much
| for the time you spend guiding me through nyuzi. also for
| explaining the value of the metric "pixels / clock" as a
| measure for iteratively being able to focus on the
| highest bang-per-buck areas to make incremental
| improvements, progressing from full-software to high-
| performance 3D.
|
| have you seen Tom Forsyth's fascinating and funny talk
| about how Larrabee turned into AVX512 after 15 years?
|
| https://player.vimeo.com/video/450406346
| https://news.ycombinator.com/item?id=15993848
| raphlinus wrote:
| Great to hear! I've poked around a little and see that,
| and in any case wish you success and that we can all
| learn from it.
| ChuckNorris89 wrote:
| To clarify further, Nvidia and AMD (and probably other
| players like ARM, Qualcomm, Imagination) own the
| patents on core shader tech, which are the building blocks
| of any modern GPU design.
|
| If you want to design a GPU IP that works around all their
| patents, you probably can, but unless you're a John Carmack
| x10, your resulting design would be horribly inefficient,
| not competitive enough to be worth the expensive silicon it
| would be etched on, and probably not compatible with any
| modern API like Vulkan or DirectX.
|
| But if you just want to build your own meme GPU for
| education/shits and giggles, that doesn't follow any
| patents or APIs, then you can and some people already did:
|
| https://www.youtube.com/watch?v=l7rce6IQDWs
| ericbarrett wrote:
| I am not in the graphics space but I am quite familiar with
| tech business practices.
|
| I think the chance you would be sued is near 100%. If you
| released and showed any market traction at all, you would
| immediately become a threat to the duopoly; they surely
| remember the rise of 3Dfx. Don't bother arguing the merits
| of the patents because it would be a business decision, not
| a technical one--this is the kind of thing that's decided
| at the C-level and then justified (or cautioned against) by
| the company's legal team, not the other way around. Patents
| are merely leverage to effect the defense of the business,
| and you can be sure they'll be used.
| joshspankit wrote:
| I agree with you (and definitely a conversation worth
| having) but for the sake of this thread let's pretend
| that legal action would only be taken when a patent was
| actually matched with what was put in the chip.
| lkcl wrote:
| if it were done, say, as a Libre/Open processor, say, with
| the backing of NLnet (a Charitable Foundation), where the
| "Bad PR ju-ju" for trying it on was simply not worth the
| effort
|
| if it were done, say, as a Libre/Open processor, say, with
| the backing of NLnet (a Charitable Foundation), where NLnet
| has access to over 450 Law Professors more than willing to
| protect "Libre/Open" projects from patent trolls by running
| crowd-funded patent-busting efforts
|
| if it were done as a Libre/Open Hybrid Processor, based on
| extending an ISA such as ooo, I dunno, maybe OpenPOWER,
| which has the backing of IBM with a patent portfolio
| spanning several decades, who would be very upset if tiny
| companies like NVidia or AMD tried it on against a
| Charitably-funded project.
|
| that would be a very interesting situation, wouldn't it? i
| wonder if there's a project around that's trying this as a
| strategy? hmmm, hey, you know what? there is! it's called
| http://libre-soc.org
| ericbarrett wrote:
| I learned GL in the 1990s on SGI systems. Shaders didn't
| exist, poly counts were in the 100s, and textures were a
| massive processing burden. The rendering pipeline of course
| was quite different. And yet so much is the same! Code
| organization, data types, all is quite familiar, whether it's
| OpenGL or DirectX or what not. The achievements of SGI
| engineers have literally benefited generations.
| lkcl wrote:
| Jeff's evaluation of GPLGPU is fascinating:
| https://jbush001.github.io/2016/07/24/gplgpu-
| walkthrough.htm...
|
| you are absolutely correct in that everything has moved on
| from "Fixed Function" of SGI, and how GPLGPU works (worked)
| - btw it's NOT GPL-licensed: Frank sadly made his own
| license, "GPL words but with non-commercial tacked onto the
| end" which ... er... isn't GPL... _sigh_ - but everything
| commercially has now moved on to Shader Engines.
|
| that basically means Vulkan.
|
| however you may be fascinated to know, from Jeff's
| evaluation, that there are still startling similarities in
| basic functionality between the not-GPL GPLGPU and modern
| designs targeted at Shader Engines.
| ComputerGuru wrote:
| I don't see how patents acquired from SGI could possibly
| still be protected and require licensing.
| peter_d_sherman wrote:
| Related:
|
| Ben Eater - Let's build a video card!
|
| https://eater.net/vga
|
| Embedded Thoughts Blog - Driving a VGA Monitor Using an FPGA
|
| https://embeddedthoughts.com/2016/07/29/driving-a-vga-monito...
|
| Ken Shirriff - Using an FPGA to generate raw VGA video: FizzBuzz
| with animation
|
| http://www.righto.com/2018/04/fizzbuzz-hard-way-generating-v...
|
| Clifford Wolf - SimpleVOut -- A Simple FPGA Core for Creating
| VGA/DVI/HDMI/OpenLDI Signals
|
| https://github.com/cliffordwolf/SimpleVOut
|
| PDS: Also, this looks interesting, from SimpleVOut:
|
| >"svo_vdma.v
|
| A _video DMA controller_. Has a read-only AXI4 master interface
| to access the video memory."
| fortran77 wrote:
| Yeah, but these people aren't doing GPGPU computation.
| phendrenad2 wrote:
| Or anything resembling even 2D graphics acceleration.
| FPGAhacker wrote:
| One of the things that interests me (of many) is the use of
| CMake.
|
| Does anyone have good references on extending CMake to support
| tools that don't produce executables per se, or otherwise work
| in non-traditional ways?
| code-scope wrote:
| Very cool project!
|
| Love GPGPU. I git cloned it and tried to understand the code
| better here:
| https://www.code-scope.com/s/s/u#c=sd&uh=0f2c2fa280a2&h=afe7a329&di=-1&i=38
|
| It looks like a 5-stage FP (FP32?) pipeline, with
| NUM_VECTOR_LANES=16 and NUM_REGISTERS=32.
|
| Are you writing your own kernel from scratch? If so, which CPU
| does it run on - some embedded CPU inside the FPGA?
|
| The mandelbrot.c code has the following:
|
| #define vector_mixi __builtin_nyuzi_vector_mixi
|
| How does that get translated into vector operations on the
| FPGA? Where is the code that implements the __builtin_*?
|
| Thanks a lot, a very interesting project.
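For what it's worth, a builtin named __builtin_nyuzi_vector_mixi most plausibly performs a per-lane select between two integer vectors under a mask. The following is a scalar C emulation of that presumed behavior — the lane count matches the 16 vector lanes mentioned above, but the exact bit ordering is an assumption; the Nyuzi compiler sources are authoritative:

```c
#define NUM_LANES 16

/* Scalar emulation of a presumed lane-select builtin: for each of the
 * 16 vector lanes, pick the lane from 'a' where the corresponding mask
 * bit is set, otherwise from 'b'. On the real hardware this would be a
 * single predicated vector instruction, not a loop. */
void vector_mixi(int out[NUM_LANES], unsigned mask,
                 const int a[NUM_LANES], const int b[NUM_LANES]) {
    for (int lane = 0; lane < NUM_LANES; lane++)
        out[lane] = ((mask >> lane) & 1) ? a[lane] : b[lane];
}
```

In a Mandelbrot kernel this kind of select is how diverged lanes (pixels that have already escaped) keep their old values while the rest continue iterating.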
| marcodiego wrote:
| There are people keeping OpenVGA alive[1]. With the failure of
| the Open Graphics Project[2], are there any known promising
| projects besides libregpu[3]?
|
| [1] https://github.com/elec-otago/openvga
|
| [2] https://en.wikipedia.org/wiki/Open_Graphics_Project
|
| [3] https://libre-soc.org/3d_gpu/
| phkahler wrote:
| >> are there any known promising projects besides libregpu?
|
| I think the most useful thing right now would be a high quality
| version of the "easy" parts of a GPU. Basic scan out, possibly
| overlays, color space conversion, buffer handling. This would
| allow ANY open processor projects to have frame buffer graphics
| and run LLVMpipe for basic rendering and desktop compositing.
| This may be slow, but it is required for every open GPU
| project, while a SoC can live without the actual GPU for some
| applications.
|
| IMHO, first things first.
| lkcl wrote:
| this is easy to chuck together in a few days, literally, from
| pre-existing components found on the internet.
|
| * litex (choose any one of the available cores)
|
| * richard herveille's excellent rgb_ttl / VGA HDL
| https://github.com/RoaLogic/vga_lcd
|
| * some sort of "sprite" graphics would do
| https://hackaday.com/2014/08/15/sprite-graphics-
| accelerator-...
|
| the real question is: would anyone bother to give you the
| money to make such a project, and the question before that
| is: can you tell a sufficiently compelling story to get
| customers - _real_ customers with money - to write you a
| Letter of Intent that you can show to investors?
|
| if the answer to either of those questions is "no" then, with
| many apologies for pointing this out, it's a waste of your
| time unless you happen to have some other reason for doing
| the work - basically one with zero expectation up-front of
| turning it into a successful commercial product.
|
| now, here's the thing: even if you were successful in that
| effort, it's so trivial (Richard Herveille's RGB/TTL HDL sits
| as a peripheral on the Wishbone Bus) that it's like... why
| are you doing this again?
|
| the _real_ effort _is_ the 3D part - Vulkan compliance,
| Texture Opcodes, Vulkan Image format conversion opcodes
| (YUV2RGB, 8888 to 1555 etc. etc.), SIN/COS/ATAN2, Dot
| Product, Cross Product, Vector Normalisation, Z-Buffers and
| so on.
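To make one of those conversion opcodes concrete, here is a sketch of a YUV2RGB conversion for a single pixel, using the full-range BT.601 coefficients (an illustration only — a real GPU would implement this in fixed point, one pixel per vector lane, and Vulkan defines several YUV variants):

```c
/* Clamp a float to the 0..255 range of an 8-bit channel, rounding. */
static unsigned char clamp8(float v) {
    if (v < 0.0f)   return 0;
    if (v > 255.0f) return 255;
    return (unsigned char)(v + 0.5f);
}

/* Full-range BT.601 YUV -> RGB for one pixel. U and V are stored
 * offset by 128 so that "no color" sits at mid-scale. */
void yuv_to_rgb(unsigned char y, unsigned char u, unsigned char v,
                unsigned char *r, unsigned char *g, unsigned char *b) {
    float fy = (float)y;
    float fu = (float)u - 128.0f;
    float fv = (float)v - 128.0f;
    *r = clamp8(fy + 1.402f * fv);
    *g = clamp8(fy - 0.344136f * fu - 0.714136f * fv);
    *b = clamp8(fy + 1.772f * fu);
}
```

Baking this arithmetic (and the 8888-to-1555 style repacks) into opcodes is exactly the kind of incremental hardware acceleration the project explores.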
| phkahler wrote:
| Seriously? VGA with DVI outputs? And a link to a Sprite
| engine?
|
| We need HDMI output, preferably 4K capable. I also
| mentioned colorspace conversion. Should have also said to
| "just throw in" video decoder for VP9 and AV1 if that's
| available. The point is that the likes of SiFive and other
| Risc-V SoC vendors should be making desktop chips, not just
| headless Linux boards or ones with proprietary GPUs.
|
| Like I said, the "easy" part should be done and available -
| not theoretically assemblable from various pieces.
|
| If this were readily available, I'd be able to buy it from
| someone today. There IS a market for it and that will be
| growing fast. Add a real GPU and things look even better.
| marcodiego wrote:
| Yeah. I also miss the small but firm steps approach.
___________________________________________________________________
(page generated 2021-02-14 23:00 UTC)