[HN Gopher] Implementing a Tiny CPU Rasterizer
___________________________________________________________________
Implementing a Tiny CPU Rasterizer
Author : atan2
Score : 29 points
Date : 2024-11-01 15:16 UTC (7 hours ago)
(HTM) web link (lisyarus.github.io)
(TXT) w3m dump (lisyarus.github.io)
| Cieric wrote:
| I love looking at stuff like this, working in the GPU space has
| only ever renewed my ambitions to work on similar projects. The
| hardest thing I always ran into with the more optimized fill
| algorithms was working around the single pixel holes that appear
| when doing everything with integers.
|
| Small nitpick though, there seems to be an issue with the source
| code views, the file names and the first line of the code are on
| the same line with no spacing. looks like it might be a static
| site generation issue since there aren't any nodes to separate
| the name and the code in the raw html.
|
| Edit: turns out the issue I'm seeing is somehow due to Firefox,
| seems to work correctly in edge.
| smokel wrote:
| Wait, what? They rasterize a triangle by checking _for each
| pixel_ if it intersects with the triangle? Where are the days
| when you learned Bresenham [1] or fixed-point arithmetic [2] to
| determine the extents of a scan line, and then fill it with an
| ordinary for-loop?
|
| [1] https://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm
|
| [2] https://en.wikipedia.org/wiki/Fixed-point_arithmetic
| vardump wrote:
| Checking every pixel is generally faster on modern hardware due
| to SIMD! Of course you'll need to clip properly.
|
| There are some pathological cases, like a thin long almost
| diagonal triangles. But those (rare) cases can be handled too
| by some subdivision clipping.
| cv5005 wrote:
| The thing is that modern computer architecture strikes again -
| like the old linked list vs array thing.
|
| Doing a candidate loop over a bounding box with edge equations
| can be much faster than ye old scanline algorithm because it
| lends itself more easily to simd and parallel approches - you
| can divide things up into tiles and process multiple pixels at
| a time with wide instructions and schedule tiles on multiple
| threads.
| smokel wrote:
| For those not in the know, things only get faster with this
| approach when you apply some optimization. The article that
| we are discussing won't get there until part 12.
|
| For the time being one may consult Ryg's amazing blog series:
| https://fgiesen.wordpress.com/2013/02/10/optimizing-the-
| basi...
| ehaliewicz2 wrote:
| Those were the 90s. Modern rasterizers* all use barycentric
| coordinate based algorithms for a few reasons.
|
| Easier to implement with proper fill conventions and
| multisampling, and much easier to parallelize in hardware and
| software.
|
| * Hardware even back in the 90s used this type of approach :)
| thechao wrote:
| We use barycentric (both in high performance software
| rasterizers and in hardware) because attribute interpolation
| is significantly more costly than in-bounds checking (both
| due to the total number of interpolates, and the precision).
| The in-bounds check is going to be just a few instructions (a
| few fmas and some sign checking) for two dimensions (x, y);
| whereas attribute interpolation could be needed for 30-128
| 'dimensions'. It's easier to push the interpolation to the
| use-site in the fragment shader, and let the compiler do its
| magic.
___________________________________________________________________
(page generated 2024-11-01 23:00 UTC)