[HN Gopher] Faster, easier 2D vector rendering [video]
       ___________________________________________________________________
        
       Faster, easier 2D vector rendering [video]
        
       Slides:
       https://docs.google.com/presentation/d/1f_vKBJMaD68ifBO2j83l...
        
       Author : raphlinus
       Score  : 108 points
       Date   : 2025-06-10 13:17 UTC (9 hours ago)
        
 (HTM) web link (www.youtube.com)
 (TXT) w3m dump (www.youtube.com)
        
       | Fraterkes wrote:
        | Is there any specific connection between Rust and the
        | Netherlands? A friend of mine helped organize a big Rust
        | conference in Delft a while ago; I think Raph spoke there too.
       | 
        | Oh, and a question for Raph: did the new spline you invented
        | end up being integrated into any vector/font-creation tools? I
        | remember being really impressed when I first tried your demo.
        
         | raphlinus wrote:
         | Yes, I was born in Enkhuizen.
         | 
         | The newest spline work (hyperbezier) is still on the back
         | burner, as I'm refining it. This turns out to be quite
         | difficult, but I'm hopeful it will turn out better than the
         | previous prototype you saw.
        
       | littlestymaar wrote:
        | What I love about Raph Levien is how he does exactly the
        | opposite of what most people do in tech: given the current
        | financial incentives, the only thing most people can afford to
        | do is something "good enough" as fast as possible, so we end
        | up with lots of subpar, half-baked solutions that cannot be
        | fixed properly, because many people rely on the tool as it is
        | and fixing it in depth would break everyone's workflow. Until
        | the next solution appears and, given the same structural
        | constraints, ends up with the exact same shortcomings.
       | 
        | Instead, Raph has spent the past 9 years, I believe, trying to
        | build a _sound_ foundation for the problem of performant UI
        | rendering.
       | 
        | I don't know how it will go, or whether he's going to end up
        | shipping his grand vision at all eventually, but I really
        | appreciate the effort of "doing something well" in a world
        | that pretty much only rewards "doing something quickly".
        
         | pvg wrote:
          | There's a funny bit in that vein in Cringely's _Accidental
          | Empires_:
         | 
         |  _"The first volume of Knuth 's series (dedicated to the IBM
         | 650 computer, "in remembrance of many pleasant evenings") was
         | printed in the late 1960s using old-fashioned but beautiful
         | hot-type printing technology, complete with Linotype machines
         | and the sharp smell of molten lead. Volume 2, which appeared a
         | few years later, used photo-offset printing to save money for
         | the publisher (the publisher of this book, in fact). Knuth
         | didn't like the change from hot type to cold, from Lino to
         | photo, and so he took a few months off from his other work,
         | rolled up his sleeves, and set to work computerizing the
         | business of setting type and designing type fonts. Nine years
         | later, he was done."_
        
         | unconed wrote:
         | We already have plenty of techniques that are fast enough for
         | classic UI rendering. There is no conceivable bottleneck for
         | the kind of stuff that is on your screen right now. It's not a
         | matter of "doing something quickly" imo, that's an issue
         | specific to the games industry, and largely caused by the need
         | to make entirely custom, animated, textured UIs as a feature
         | for a single product.
         | 
         | What projects like Slug and Vello rather show is that GPU
         | coding remains so obtuse that you cannot tackle an isolated
         | subproblem like 2D vector rendering, and instead have to make
         | apple pie from scratch by first creating the universe. And then
          | the resulting solution is itself a whole beast that cannot
          | simply be hooked up to APIs and languages other than the
          | ones it was created for, unless that is specifically
          | something you also architect for. As the first slide shows,
          | v1 required modern GPUs, and the CPU side uses hand-
          | optimized SIMD routines.
         | 
         | 2D vector graphics is also just an awkward niche to optimize
         | for today. GPUs are optimized for 3D, where z-buffers are used
         | to draw things in an order-independent way. 2D graphics instead
         | must be layered and clipped in the right order, which is much
          | more difficult to 'embarrassingly' parallelize. Formats like
          | SVG can have an endless number of points per path; e.g. a
          | detailed polygon of the United States has to be processed as
          | one shape, and you can't blindly subdivide it. You also
          | can't rely on vanilla anti-aliasing, because complementary
          | edges wouldn't be fully opaque.
         | 
         | Even if you do go all the way, you'll still have just a 2D
         | rasterizer. Perhaps it can work under projective transform,
         | that's usually pretty easy, but will it be significantly more
         | powerful or extensible than something like Cairo is today? Or
         | will it just do that exact same feature set in a
         | technologically sexier way? e.g. Can it be adapted to rendering
          | of 3D globes and maps, or would that break everything? And
          | note that rasterizing fonts as just unhinted glyphs (i.e.
          | paths) is rarely what people want.
        
       | leetrout wrote:
       | Raph - I know enough to be very dangerous with GPUs so please
       | forgive my ignorance. Two questions:
       | 
       | 1. Do you have a favorite source for GPU terminology like draw
        | calls? I optimized for them on an Unreal Engine project but never
       | "grokked" what all the various GPU constructs are and how to
       | understand their purpose, behavior and constraints. (For this
       | reason I was behind the curve for most of your talk :D) Maybe
       | this is just my lack of understanding of what a common / modern
       | pipeline consists of?
       | 
       | 2. I replayed the video segment twice but it is still lost on me
       | how you know which side of the path in a tile is the filled side.
       | Is that easy to understand from the code if I go spelunking for
       | it? I am particularly interested in the details on how that is
       | known and how the merge itself is performed.
        
         | Nevermark wrote:
          | Given that the edges are represented by lines, the
          | representation needs to be robust to any angle, including
          | near-vertical and near-horizontal, and they are segments:
         | 
         | I expect the line segments are represented by their two end
         | points.
         | 
          | This makes it easy to encode which side is fill vs. alpha by
          | ordering the two points, so that as you move from the first
          | point to the second, fill is always on the right (or vice
          | versa).
         | 
          | Another benefit of ordering at both the point and segment
          | levels is that, from one segment to the next, a turn toward
          | the fill side vs. the alpha side can be used to inform
          | clipping, either convex or concave, reflecting both
          | segments.
         | 
          | No idea if any of this is what is actually happening here,
          | but it is one way to do it. The animation of segmentation
          | did show ordered segments: clockwise for the outside border
          | and counterclockwise for the cavity in the R. Fill to the
          | right.
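          | 
          | A minimal Rust sketch of that orientation encoding (purely
          | illustrative, assuming y-down raster coordinates and fill on
          | the right of the directed segment; not necessarily what
          | Vello does):
          | 
          |     #[derive(Clone, Copy)]
          |     struct Point { x: f32, y: f32 }
          | 
          |     /// Directed segment; by convention, fill lies to the
          |     /// right as you travel from p0 to p1.
          |     #[derive(Clone, Copy)]
          |     struct Segment { p0: Point, p1: Point }
          | 
          |     impl Segment {
          |         /// True if `q` is on the filled (right-hand) side.
          |         /// In y-down coordinates, a positive 2D cross
          |         /// product puts `q` to the right of p0 -> p1.
          |         fn is_fill_side(&self, q: Point) -> bool {
          |             let dx = self.p1.x - self.p0.x;
          |             let dy = self.p1.y - self.p0.y;
          |             let qx = q.x - self.p0.x;
          |             let qy = q.y - self.p0.y;
          |             dx * qy - dy * qx > 0.0
          |         }
          |     }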
        
           | leetrout wrote:
           | Makes sense! Thank you for the explainer and the call out to
           | the detail of which way they walk the segments.
        
         | raphlinus wrote:
         | 1. I like ryg's "A trip through the Graphics Pipeline" [1].
         | It's from 2011 but holds up pretty well, as the fundamentals
          | haven't changed. The main new topic, perhaps, is the rise of
          | tile-based deferred rendering, especially on mobile.
         | 
          | 2. I skipped over this in the interest of time. Nevermark
          | has the central insight, but the full story is more
          | interesting.
         | For each tile, detect whether the line segment crosses the top
         | edge of the tile, and if so, the direction. This gives you a
         | delta of -1, 0, or +1. Then do a prefix sum of these deltas on
         | the _sorted_ tiles. That gives you the winding number at the
         | top left corner of each tile, which in turn lets you compute
         | the sparse fills and also which side to fill within the tile.
         | 
         | [1]: https://fgiesen.wordpress.com/2011/07/09/a-trip-through-
         | the-...
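          | 
          | A simplified Rust sketch of the crossing-delta and
          | prefix-sum idea from point 2 (illustrative only, not the
          | actual Vello code; assumes y-down coordinates and tiles
          | already sorted within a row):
          | 
          |     /// +1 if the segment (y0 -> y1) crosses the horizontal
          |     /// line y = tile_top going down, -1 going up, else 0.
          |     fn top_edge_delta(y0: f32, y1: f32, tile_top: f32) -> i32 {
          |         let down = y0 < tile_top && y1 >= tile_top;
          |         let up = y1 < tile_top && y0 >= tile_top;
          |         (down as i32) - (up as i32)
          |     }
          | 
          |     /// Exclusive prefix sum of deltas across one row of
          |     /// sorted tiles: yields the winding number at each
          |     /// tile's top-left corner.
          |     fn windings(deltas: &[i32]) -> Vec<i32> {
          |         deltas
          |             .iter()
          |             .scan(0, |acc, d| {
          |                 let w = *acc;
          |                 *acc += *d;
          |                 Some(w)
          |             })
          |             .collect()
          |     }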
        
       | jasonthorsness wrote:
        | It always amazes me that the cost of computing the optimized
        | draw regions shown in the slides is worth the savings, even on
        | the CPU.
        
       | coffeeaddict1 wrote:
       | Great presentation. However, I have mixed feelings about Vello.
       | On one hand, it's awesome that someone is truly trying to push
       | the GPU to do 2D rendering work. But my impression is that the
       | project has hit some considerable limitations due to the lack of
       | "inter-workgroup" level cooperation in current rendering APIs and
       | difficulties with dynamic memory allocation on the GPU. I'm sure
       | the hybrid rendering scheme they have will be great, as the
       | people behind it are extremely capable, but is that really a
       | meaningful step beyond what Pathfinder achieved years ago? Also,
       | in terms of CPU rendering, Blend2D exists and it's blazing fast.
       | Can Vello's CPU backend really do better?
        
         | virtualritz wrote:
          | TL;DR: there are reasons other than speed why someone would
          | prefer Vello.
          | 
          | There are different applications for 2D rendering.
          | 
          | In our case we need the rendering to take place at f32/float
          | precision, i.e. RGBA colors need 32 bits per channel.
         | 
         | We also do not care if the renderer is realtime. The
         | application we have is vector rendering for movie production.
         | 
         | That's where the multiple backend approach of vello and
         | especially the vello-cpu crate become really interesting. We
         | will either add the f32 support ourselves or hope it will
         | become part of the vello roadmap at some stage.
         | 
         | Also, Blend2D is C++ (as is Skia, the best alternative, IMHO).
         | Adding a C++ toolchain requirement to any Rust project is
         | always a potential PITA.
         | 
          | For example, on the (Rust) software we work on, C++
          | toolchain breakage around a C++ image-processing lib that we
          | wrapped for Rust cost us two man-weeks over the last 11
          | months. That's a lot for a startup where two devs work on
          | the affected part.
         | 
          | Suffice it to say, there was zero Rust toolchain-related
          | work done or breakage happening in the same timeframe.
        
           | Asm2D wrote:
            | Blend2D has a C API and no dependencies - it doesn't even
            | need a C++ standard library - so it's generally not an
            | issue to build it and use it anywhere.
           | 
            | There is a different problem, though. While many people
            | working on Vello are paid full time, Blend2D lacks funding
            | and what you see today was developed independently. So the
            | development is super slow, and that's the reason Blend2D
            | will most likely never have the features other libraries
            | have.
        
       | s-mon wrote:
       | Great presentation and thanks for sharing the slides. Wondering,
       | can any of these methods be used for 3D too?
        
       | taneq wrote:
       | I was confused by the step where the tiles are generated by
       | tracing the outline and then sorted afterwards. It seems like
       | this could be faster to do earlier (possibly even largely
       | precomputed) using something analogous to oldschool scan
       | conversion or span buffers? I'm not super up to date on this
       | stuff so would love to know why this is faster.
        
       | boxerab wrote:
        | Excellent presentation - I like that the design bakes in
        | rendering efficiency from the get-go.
        
       | morio wrote:
        | You've got a long way to go. Writing a rasterizer from scratch
        | is a huge undertaking.
        | 
        | What's the internal color space? I assume it is linear sRGB.
        | It looks like you are going straight to RGBA FP32, which is
        | good. Think about how you will deal with denormals, as the CPU
        | handles them differently from the GPU. Rendering artifacts
        | galore once you do real-world testing.
       | 
        | And of course IsInf and NaN need to be handled everywhere.
        | Just checking for F::ZERO is not enough in many cases; you
        | will need epsilon values. In C++, doing if(value==0.0f){} or
        | if(value==1.0f){} is considered a code smell.
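        | 
        | For instance, a tolerance-based comparison (Rust here to match
        | Vello; the epsilon value is an arbitrary placeholder):
        | 
        |     // Compare against a tolerance instead of exact equality,
        |     // to absorb accumulated rounding error.
        |     const EPS: f32 = 1e-6;
        | 
        |     fn nearly_zero(v: f32) -> bool {
        |         v.abs() < EPS
        |     }
        | 
        |     fn nearly_one(v: f32) -> bool {
        |         (v - 1.0).abs() < EPS
        |     }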
       | 
        | Just browsing the source I see Porter-Duff blend modes.
        | Really, in 2025? Have fun dealing with alpha compositing
        | issues on this one. Also, most of the 'regular' blend modes
        | are not alpha-compositing safe; you need special handling of
        | alpha values in many cases if you do not want to get
        | artifacts. The W3C spec is completely underspecified in this
        | regard. I spent many months dealing with this myself.
       | 
        | If I were to redo a rasterizer from scratch I would push the
        | boundaries a little more. For instance, I would target full
        | FP32 dynamic-range support and a better internal color space,
        | maybe something like OKLab, to improve color blending and
        | compositing quality. And I would come up with innovative ways
        | to use the gained dynamic range.
        
         | pixelpoet wrote:
         | Isn't "linear sRGB" an oxymoron?
        
           | morio wrote:
            | Not really; it's the same color primaries, just without
            | the non-linear transfer function.
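            | 
            | For reference, the standard sRGB decoding step that
            | "linear sRGB" skips (Rust):
            | 
            |     /// Decode an sRGB-encoded value in [0, 1] to linear
            |     /// light. "Linear sRGB" keeps the sRGB primaries but
            |     /// stores values without this non-linear encoding.
            |     fn srgb_to_linear(c: f32) -> f32 {
            |         if c <= 0.04045 {
            |             c / 12.92
            |         } else {
            |             ((c + 0.055) / 1.055).powf(2.4)
            |         }
            |     }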
        
         | mfabbri77 wrote:
          | You didn't mention one of the biggest sources of 2D vector
          | graphics artifacts: mapping polygon coverage to the alpha
          | channel, which is what virtually all engines do. It is the
          | main reason why we at Mazatech are writing a new version of
          | our engine, AmanithVG, based on a simple idea: draw all the
          | paths (polygons) at once. Well, the idea is simple, the
          | implementation... not so much ;)
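          | 
          | To see why coverage-as-alpha conflates: two shapes sharing
          | an edge each cover 50% of a boundary pixel, and compositing
          | them independently leaves a seam (illustrative Rust):
          | 
          |     /// Porter-Duff "over" for coverage treated as alpha.
          |     fn over(src: f32, dst: f32) -> f32 {
          |         src + dst * (1.0 - src)
          |     }
          | 
          |     fn main() {
          |         // 0.5 over 0.5 = 0.75, not the fully opaque 1.0 a
          |         // single merged polygon would produce: a seam.
          |         assert_eq!(over(0.5, 0.5), 0.75);
          |     }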
        
         | raphlinus wrote:
         | It's device sRGB for the time being, but more color spaces are
         | planned.
         | 
         | You are correct that conflation artifacts are a problem and
         | that doing antialiasing in the right color space can improve
         | quality. Long story short, that's future research. There are
         | tradeoffs, one of which is that use of the system compositor is
         | curtailed. Another is that font rendering tends to be weak and
         | spindly compared with doing compositing in a device space.
        
           | morio wrote:
            | Yeah, there is an entire science to doing font rendering
            | properly. Perceptually, you should even take into account
            | whether you have white text on a black background or the
            | other way around, as this changes the perceived thickness
            | of the text. Slightly hinted SDFs kind of solve that issue
            | and look really good, but of course making that work on
            | CPUs is difficult.
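            | 
            | The thickness tweak is cheap with an SDF: bias the
            | distance before mapping it to coverage. A hypothetical
            | sketch in Rust (the bias value is made up):
            | 
            |     /// Map a signed distance in pixels (negative inside
            |     /// the glyph) to coverage, with a bias that fattens
            |     /// light-on-dark text slightly.
            |     fn sdf_coverage(dist: f32, light_on_dark: bool) -> f32 {
            |         let bias = if light_on_dark { 0.05 } else { 0.0 };
            |         (0.5 - dist + bias).clamp(0.0, 1.0)
            |     }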
        
       | krona wrote:
        | As someone interested in but unfamiliar with the state of the
        | art of GPU vector rasterization, I struggle to understand how
        | the method described here isn't a step back from the Slug
        | algorithm, or from the basic ~100 lines of GLSL of the vector-
        | texture approach of
        | https://wdobbie.com/post/gpu-text-rendering-with-vector-text...
        | from nearly a decade ago (albeit with some numerical precision
        | limitations).
       | 
        | Is the problem here that computing the vector texture in real
        | time is too expensive, and perhaps that font contours are too
        | much of a special case of a general-purpose vector rasterizer
        | to be useful? The Slug algorithm also implements 'banding',
        | which seems similar to the tiling described in the
        | presentation.
        
       | xixixao wrote:
       | Is his tattoo an Euler spiral? I made a little explainer for
       | those (feedback welcome)
       | 
       | https://xixixao.github.io/euler-spiral-explanation/
        
         | jobstijl wrote:
         | It is :) https://mastodon.online/@raph/110749467084183893
        
       | haniehz wrote:
       | Very nice.
        
       ___________________________________________________________________
       (page generated 2025-06-10 23:00 UTC)