[HN Gopher] Optimization Techniques for GPU Programming [pdf]
___________________________________________________________________
Optimization Techniques for GPU Programming [pdf]
Author : ibobev
Score : 37 points
Date : 2023-08-09 20:25 UTC (2 hours ago)
(HTM) web link (dl.acm.org)
(TXT) w3m dump (dl.acm.org)
| flakiness wrote:
| For those who are interested: "Programming Massively Parallel
| Processors: A Hands-on Approach" is a great book for learning CUDA
| programming, and it talks mostly about performance because, after
| all, GPUs are about speed.
|
| Unlike normal programming books, it talks a lot about how GPUs
| work and how the introduced techniques fit in that picture. It's
| interesting even if you are just curious how an (NVIDIA) GPU works
| at the code level. Strongly recommended.
| mathisfun123 wrote:
| > it talks a lot about how GPUs work
|
| it's true - out of all of the "LEARN CUDA IN 24 HOURS" books,
| this is the best one. indeed this isn't one of those same books
| - this is a textbook - but at first glance it resembles them
| (at least the color scheme and the title led me astray when i
| first found it).
| gpuhacker wrote:
| I bought the first edition when it came out, and definitely it
| was a gold mine of information on the subject. I wonder though,
| is the fourth edition worth buying another copy? Nvidia has
| been advancing CUDA, in particular moving more towards C++ in
| the kernel language. But none of that was present when this
| book came out in 2007. Now more and more stuff is happening at
| thread block level with the cooperative groups C++ API and at warp
| level for tensor cores. It would be great if the authors
| revisited all the early chapters to modernize that content, but
| that's a lot of work so I don't usually count on authors making
| such an effort for later editions.
| w-m wrote:
| Does anybody have an idea on how to get into Metal programming
| (as in Apple Metal)? I'd love to mess around a little with this
| on iOS and macOS while learning about tile-based rendering, but I
| have trouble locating educational written material.
|
| There's a book (https://metalbyexample.com/the-book/), but the
| author has put up a note that it's quite out of date. It seems
| the most up-to-date information is available in the WWDC videos
| (regarding e.g. Metal 3), but I'd really prefer something
| written. And Apple's documentation reads more like reference
| material and is quite confusing when starting out.
| winwang wrote:
| (+1) I'm a newb to Metal myself, and I wanted to use Swift as
| the driving language (which was a main selling point).
| Unfortunately, almost all the material is in Objective-C.
| winwang wrote:
| If people like GPU programming, I wrote a blog post this week
| about GPU-accelerated hashmaps, semi-provocatively titled "Can we
| 10x Rust hashmap throughput?".
|
| HN post here: https://news.ycombinator.com/item?id=37036058
| eachro wrote:
| I've been looking into getting into GPU programming, starting
| with CS344 (https://developer.nvidia.com/udacity-cs344-intro-
| parallel-pr...) on Udacity. I'm curious to hear from some of the
| more seasoned GPU veterans out there: what other resources would
| be good to take a look at after finishing the videos and
| assignments?
| yzh wrote:
| I would recommend the course from Oxford
| (https://people.maths.ox.ac.uk/gilesm/cuda/). Also explore the
| tutorial section of CUTLASS (https://github.com/NVIDIA/cutlass/
| blob/main/media/docs/cute/...) if you want to learn more about
| high-performance GEMM. OpenAI Triton is another good resource
| if you want to write relatively performant CUDA kernels using
| Python for deep learning (https://openai.com/research/triton).
| gpuhacker wrote:
| If you want to go really in-depth I can recommend GTC on
| demand. It's Nvidia's streaming platform with videos from past
| GTC conferences. Tony Scudiero had a couple of videos on there
| called GPU Memory Bootcamp that are among the best advanced GPU
| programming learning material out there.
| pengaru wrote:
| https://shadertoy.com is a great way to explore shaders.
___________________________________________________________________
(page generated 2023-08-09 23:00 UTC)