[HN Gopher] Reverse Z in 3D graphics (and why it's so awesome)
___________________________________________________________________
Reverse Z in 3D graphics (and why it's so awesome)
Author : fanf2
Score : 75 points
Date : 2024-06-04 13:42 UTC (9 hours ago)
(HTM) web link (tomhultonharrop.com)
(TXT) w3m dump (tomhultonharrop.com)
| GenerocUsername wrote:
| Articles about graphics and rendering without a single image
| should be a misdemeanor. I know it's not explicitly about any
| particular graphic and instead about performance, but I still
| think it's missing that extra pizazz
| tonetheman wrote:
| This. You are completely correct. Just anything to show me why
| this is awesome instead of just the math!
| AlunAlun wrote:
| I disagree. Anyone with a basic understanding of how
| perspective projection matrices work will understand the
| article, especially with the code samples and equations. (Edit:
| and anybody without this understanding won't be interested in
| the article anyway).
|
| I thought it was great, and had a full-on "smack my forehead"
| moment as to why I had never realized something so simple and
| effective, after 20 years of programming 3D graphics!
| qingcharles wrote:
| LOL. Me too. 35 years here. There's always something new to
| learn.
|
| This, coupled with all the fascinating new ideas on how to
| optimize 3D math, has made me determined to go rewrite my old
| x86 code.
| Keyframe wrote:
| oh, my. Back in the day, when learning graphics from books,
| there were some with equations and matrices only. Imagine that!
| How did we learn at all in such a savage environment?!
| qarl wrote:
| TLDR - the floating point format has a great deal of precision
| near 0. Normally clip-space depth ranges from -1 to 1 (in the
| OpenGL convention), but if you change that to 0 to 1 and flip it
| so the far plane maps to 0, you can take advantage of that extra
| precision.
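|
| A minimal sketch of the two conventions in plain C++ (nothing
| from the article; the 0.1/1000 near/far planes are made up).
| Besides flipping the mapping, reverse Z also means clearing the
| depth buffer to 0 and using a greater-than depth test:
|
|   #include <cstdio>
|
|   // View-space distance d in [near, far] -> depth in [0, 1].
|   // Conventional: near -> 0, far -> 1.
|   float depth_standard(float d, float n, float f) {
|       return (f * (d - n)) / (d * (f - n));
|   }
|
|   // Reversed: near -> 1, far -> 0 (same shape, flipped).
|   float depth_reversed(float d, float n, float f) {
|       return (n * (f - d)) / (d * (f - n));
|   }
|
|   int main() {
|       const float n = 0.1f, f = 1000.0f; // hypothetical planes
|       const float ds[] = {0.1f, 1.0f, 100.0f, 1000.0f};
|       for (float d : ds)
|           std::printf("d=%7.1f standard=%.6f reversed=%.6f\n",
|                       d, depth_standard(d, n, f),
|                       depth_reversed(d, n, f));
|   }
|
| With these made-up planes, everything beyond about 1 unit
| already lands above 0.9 in the standard mapping, which is the
| bunching the article talks about.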
| karmakaze wrote:
| I couldn't quite connect all the dots as to why this is better.
| Something I do understand is that summing numbers close to zero
| before numbers of larger magnitude produces a more accurate sum,
| because floating-point addition isn't truly associative.
|
| My best guess is that this reverse-z convention keeps numbers at
| the same scale more often. I think what matters is the _same_
| scale rather than being near zero, because the relative
| _precision_ is the same at every scale (single precision float:
| 24 significand bits stored in 23, with an implied leading 1
| bit). If the article is trying to say that numbers near zero
| have more relative precision because of the denormalized
| representation of FP numbers, it should call that out
| explicitly. Also, the advantage of similar scales applies to
| addition/subtraction; there should be no advantage for
| multiplication/division AFAIK.
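|
| A quick sketch of the spacing (plain C++, nothing from the
| article): relative precision is indeed roughly constant, but the
| absolute gap between adjacent floats shrinks toward zero, which
| is why the low end of [0, 1] holds so many more distinct values:
|
|   #include <cmath>
|   #include <cstdio>
|
|   int main() {
|       // Gap between x and the next representable float below it.
|       const float xs[] = {1.0f, 0.5f, 0.1f, 0.01f, 0.001f};
|       for (float x : xs) {
|           float gap = x - std::nextafterf(x, 0.0f);
|           std::printf("x=%-7g gap=%.3g relative=%.3g\n",
|                       x, gap, gap / x);
|       }
|   }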
| nox101 wrote:
| https://developer.nvidia.com/content/depth-precision-visuali...
| nwellnhof wrote:
| This is the key paragraph: "The reason for this bunching up of
| values near 1.0 is down to the non linear perspective
| divide..."
| karmakaze wrote:
| That makes sense: many values end up close to 1.0, so
| transforming the range so they land near 0.0 instead provides
| additional precision.
| Ono-Sendai wrote:
| I use reverse-z when I can, works great. Totally solves all
| z-fighting problems.
| tylerneylon wrote:
| If you're wondering why there's no demo graphic, it's because
| this idea is meant to produce correct results (in a bug-robust
| way), not to produce anything visually different.
|
| One thing that can go wrong in 3d graphics is z-fighting, where a
| scene has rendering artifacts because two objects are touching or
| intersecting each other, and triangles that are close to each
| other look bad when the difference in their z values (distance
| from the camera) falls within floating point rounding error.
| Basically the rendering has errors due to floating point
| limitations.
|
| The post is pointing out that the float32 format represents many,
| many more values near 0 than near 1. Over 99% of all values
| represented in [0,1] are in [0,0.5]. And when you standardize the
| distances, it's common to map the farthest distance you want to 1
| and the nearest to 0. But this severely restricts your effective
| distances for non-close objects, enough that z-fighting becomes
| possible. If you switch the mapping so that z=0 represents the
| farthest distance, then you get many, many more effective
| distance values for most of your rendered view distance because
| of the way float32 represents [0,1].
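|
| You can check that 99% figure directly (a sketch assuming
| IEEE-754 binary32 and C++20's std::bit_cast): for non-negative
| floats the bit patterns are ordered like integers, so counting
| values in a range is just a subtraction of bit patterns.
|
|   #include <bit>
|   #include <cstdint>
|   #include <cstdio>
|
|   int main() {
|       std::uint32_t half = std::bit_cast<std::uint32_t>(0.5f);
|       std::uint32_t one  = std::bit_cast<std::uint32_t>(1.0f);
|       std::printf("floats in [0, 0.5]: %u\n", half + 1);
|       std::printf("floats in (0.5, 1]: %u\n", one - half);
|   }
|
| That prints roughly 1.06 billion values in the lower half versus
| 8.4 million (2^23) in the upper half, about a 126:1 ratio.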
| ajconway wrote:
| There is also a technique called logarithmic depth buffer
| (which should be self-explanatory):
| https://threejs.org/examples/?q=dept#webgl_camera_logarithmi...
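|
| One common formulation (a sketch, not necessarily what three.js
| does) spends depth resolution evenly across orders of magnitude
| by mapping a view-space distance d to log2(1 + d) / log2(1 +
| far); in a shader this typically gets written to gl_FragDepth:
|
|   #include <cmath>
|   #include <cstdio>
|
|   // Hypothetical logarithmic depth: 0 at the camera, 1 at far.
|   float depth_logarithmic(float d, float far) {
|       return std::log2(1.0f + d) / std::log2(1.0f + far);
|   }
|
|   int main() {
|       const float far = 1.0e6f; // e.g. a planetary-scale scene
|       const float ds[] = {0.1f, 1.0f, 100.0f, 10000.0f, 1.0e6f};
|       for (float d : ds)
|           std::printf("d=%10.1f log depth=%.4f\n",
|                       d, depth_logarithmic(d, far));
|   }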
| rendaw wrote:
| That's a pretty stunning visualization too
| indigoabstract wrote:
| I wasn't aware that a logarithmic depth buffer could be
| implemented in WebGL, since it lacks glClipControl(). It's
| cool that someone found a way to do it eventually (apparently
| by writing to gl_FragDepth).
| Const-me wrote:
| > apparently by writing to gl_FragDepth
|
| If they do that, this disables early Z rejection
| performance optimization implemented in most GPUs. For some
| scenes, the performance cost of that can be huge. When
| rendering opaque objects in front-to-back order, early Z
| rejection sometimes saves many millions of pixel shader
| calls per frame.
| xeonmc wrote:
| Not to mention, floating point numbers are already roughly
| logarithmically distributed. A logarithmic distribution matters
| most when values span widely differing orders of magnitude, so
| the float format's built-in piecewise-linear approximation of a
| logarithm is good enough for depth buffers.
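|
| You can see that piecewise-linear log directly in the bit
| pattern (a sketch, assuming IEEE-754 binary32): reinterpret the
| bits of a positive float as an integer, rescale, and you get a
| rough log2:
|
|   #include <bit>
|   #include <cmath>
|   #include <cstdint>
|   #include <cstdio>
|
|   int main() {
|       const float xs[] = {0.001f, 0.37f, 1.0f, 42.0f, 12345.0f};
|       for (float x : xs) {
|           // exponent+mantissa bits read as an int ~ scaled log2
|           float approx =
|               std::bit_cast<std::uint32_t>(x) / 8388608.0f - 127.0f;
|           std::printf("x=%9g approx=%9.4f log2=%9.4f\n",
|                       x, approx, std::log2(x));
|       }
|   }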
| Const-me wrote:
| Indeed, logarithmic depth is pretty much useless on
| modern hardware, but it wasn't always the case.
|
| On Windows, the support for DXGI_FORMAT_D32_FLOAT is
| required on feature level 10.0 and newer, but missing
| (not even optional) on feature level 9.3 and older.
| Before Windows Vista and Direct3D 10.0 GPUs, people used
| depth formats like D16_UNORM or D24_UNORM_S8_UINT, i.e.
| 16-24 bit integers. Logarithmic Z made a lot of sense
| with these integer depth formats.
| rowanG077 wrote:
| At least on my GPU this is extremely slow compared to
| disabling it.
| wk_end wrote:
| Thanks - this is much clearer than the original article, which
| somehow never actually defines "reverse Z" before going on and
| on about it.
|
| Why is it that most things you render are far away rather than
| close? I guess I'd expect the opposite, at least depending on
| what you're rendering. And I'd also assume you'd want better
| visual fidelity for closer things, so I find this a little
| counter-intuitive.
|
| Are there certain cases where you'd want "regular Z" rather
| than "reverse Z"? Or cases where you'd want to limit your Z to
| [0, 0.5] instead of [0, 1]?
| CamperBob2 wrote:
| _Why is it that most things you render are far away rather
| than close?_
|
| Because most things _are_ far away rather than close...?
| wk_end wrote:
| I guess... but what I'm getting at is that what you see on
| screen is going to be dominated by closer things. The world is
| very big, but right now most of what I see is, relatively
| speaking, close to me. I don't care much about the precise
| positions of stuff in China right now, but for the stuff in my
| apartment in Canada it's very important for my sense of reality
| that the positioning is quite exact - even though there's a lot
| more stuff in China.
| cornstalks wrote:
| The reason for this is in the article:
|
| > _The reason for this bunching up of values near 1.0 is
| down to the non linear perspective divide._
|
| The function in question is nonlinear. So using "normal"
| z, values in the range [0, 0.5] are going to be _very_,
| _very_ close to the camera. The vast majority of things
| aren't going to be that close to the camera. Most
| typical distances are going to be in the [0.5, 1] range.
|
| Hence, reversing that gives you more precision where it
| matters.
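|
| To put hypothetical numbers on that (a sketch; the 0.1/1000
| near/far planes are made up, not from the article), you can
| invert the conventional mapping and ask which view-space
| distances each slice of the depth range actually covers:
|
|   #include <cstdio>
|
|   // Inverse of the conventional [0, 1] depth mapping: which
|   // view-space distance lands at a given depth value?
|   float distance_for_depth(float z, float n, float f) {
|       return (f * n) / (f - z * (f - n));
|   }
|
|   int main() {
|       const float n = 0.1f, f = 1000.0f;
|       const float zs[] = {0.5f, 0.9f, 0.99f, 0.999f};
|       for (float z : zs)
|           std::printf("depth <= %-5g reaches %.3f units\n",
|                       z, distance_for_depth(z, n, f));
|   }
|
| With those values, the whole [0, 0.5] half of the depth range
| covers only the first 0.2 units in front of the camera, and
| even [0, 0.999] only reaches about 91 of the 1000 units.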
| xboxnolifes wrote:
| If you're standing outside holding a phone up to your
| face, you have 2 things close (hand/phone) and everything
| else (trees, buildings, cars, furniture, animals, etc)
| around you is not close.
| recursivecaveat wrote:
| You still need precision even for things that are far
| away. You don't necessarily care that the building your
| character is looking at a mile away is exactly in the
| right spot, but imprecision causing Z-fighting will be
| very noticeable as the windows constantly flicker back
| and forth in front of the facade.
| Jasper_ wrote:
| "Close" here means "millimeters away". The logarithmic nature
| of the depth buffer means that even your player character is
| considered far enough to be already at ~0.9 in a "regular Z"
| case (depending on where you put your near plane, of course).
| Go stare at a few depth buffers and you'll get a good taste
| of it.
| Jasper_ wrote:
| Here's a demo graphic for reversed Z in WebGPU; requires a
| capable browser.
|
| https://webgpu.github.io/webgpu-samples/?sample=reversedZ
| forrestthewoods wrote:
| > Over 99% of all values represented in [0,1] are in [0,0.5].
|
| Is that true if you exclude denormals?
| cornstalks wrote:
| Yes. Subnormals start below 1.0 * 2^(-126). In the range [1.0 *
| 2^(-126), 0.5] there are 125x more distinct float values than
| in the range [0.5, 1].
| edflsafoiewq wrote:
| Yes. The floats in [0,1) come in blocks: [1/2, 1), [1/4,
| 1/2), [1/8, 1/4), etc. There are 2^23 floats in each block
| and there are 127 blocks. The reason there are so many floats
| in [0, 0.5] is that only one block, [1/2, 1), lies outside
| that range. If you exclude the denormals (which occupy a
| special block that stretches down to 0, [0, 1/2^126)), you
| are still only excluding a single block.
| forrestthewoods wrote:
| Ahhh that makes sense. That's a much more clear
| explanation. Thanks!
| jstanley wrote:
| Then you could show some example scenes that render wrong the
| "standard" way but are fixed by the reverse z idea.
| kazinator wrote:
| The demo graphic you need is not of the rendered scene, but of
| what goes on behind the scenes. Like the diagrams in the old
| Foley and Van Dam text: a visual of the perspective-transformed
| scene, but from a different angle, where you can see the Z
| coordinates.
| jandrese wrote:
| I'm not a graphics expert, but reading the article and the long
| roundabout way they got back to "we're just working with the 23
| bit mantissa because we've clamped down our range" makes me
| wonder why you couldn't just do fixed point math with the full 32
| bit range instead?
| NBJack wrote:
| Modern GPUs aren't optimized for it. You can indeed do this in
| software, but this is the path hardware acceleration took.
| cmovq wrote:
| It's actually quite common to use a 24-bit fixed point format
| for the depth buffer, leaving 8 bits for the stencil buffer.
|
| GPUs do a lot of fixed point to float conversions for the
| different texture and vertex formats since memory bandwidth
| is more expensive than compute.
| Jasper_ wrote:
| Plenty of GPUs do have special (lossless) depth compression;
| that's what the 24-bit depth targets are about.
| Dylan16807 wrote:
| Okay but now all your values are bunched up next to the far
| plane, between 990 and 1000? Using this kind of mapping seems
| inherently flawed no matter what range you pick.
| gary_0 wrote:
| No, because depth values are in screen space (1/z), not world
| space (like 990-1000). Here are some graphs that make the
| benefits obvious:
| https://developer.nvidia.com/blog/visualizing-depth-precisio...
| kazinator wrote:
| The thing to understand is that the perspective transform in
| homogeneous coordinates doesn't just shrink the X and Y
| coordinates based on Z distance; it also shrinks the Z
| coordinate!
|
| The farther some objects are from the viewer, the closer together
| they are bunched in Z space.
|
| Basically, the points out at infinity where parallel lines appear
| to converge under perspective are translated to a finite Z
| distance. So all of infinity is squeezed into that range.
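|
| A small numeric sketch of that squeeze (plain C++; the 0.1/1000
| near/far planes are made up). This is the z row of a
| conventional 0-to-1 perspective matrix: clip.z = a*z + b and
| clip.w = z, so after the divide ndc.z = a + b/z, which stays
| finite even for z at infinity:
|
|   #include <cstdio>
|
|   int main() {
|       const float n = 0.1f, f = 1000.0f;
|       const float a = f / (f - n);
|       const float b = -f * n / (f - n);
|       const float zs[] = {0.1f, 1.0f, 100.0f, 1000.0f, 1.0e9f};
|       for (float z : zs)
|           std::printf("view z=%12.1f -> ndc z=%.6f\n",
|                       z, a + b / z);
|   }
|
| Because ndc.z is an affine function of 1/z, it interpolates
| linearly in screen space across a triangle, which is what lets a
| Z buffer get intersections right.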
|
| Why is Z also transformed? In order to keep the transformation
| line-preserving (projective, rather than just a 2D projection):
| straight lines have to stay straight under the perspective
| transformation.
|
| If you have an arbitrarily rotated straight line in a perspective
| view, and you just divide the X and Y coordinates along that line
| by Z, the line will look just fine in the perspective view,
| because the viewer does not see the Z coordinate. But in fact in
| the perspective-transformed space (which is 3 dimensional), it
| will not be a straight line.
|
| Now that is fine for wireframe or even certain polygon graphics,
| in which we can regard the perspective as a projection from 3D to
| 2D and basically drop the Z coordinate entirely after dividing X
| and Y by it.
|
| An example of where it breaks down is when you're using a
| Z-buffer for the visibility calculations.
|
| If you have two polygons that intersect, and you don't handle the
| Z coordinate in the perspective transform, they will not
| correctly intersect: the line at which they intersect, calculated
| by your Z buffer, will not be the right one, and that will be
| particularly obvious under movement, appearing as a weird
| artifact where that intersecting line shifts around.
|
| You won't see this when you have polygons that only join at their
| vertices to form surfaces. The 2D transform is good enough: the
| endpoints of a shared edge are correctly transformed from 3D into
| 2D taking into account perspective, and we just render a straight
| line between them in 2D and all is good. In the intersecting
| case, there are interpolated points involved which are not
| represented as vertices and have not been individually
| transformed. That interpolation then goes wrong.
|
| Another area where this is important, I think, is correctness of
| texture maps. If it is wrong, then a rotating polygon with a
| texture map will have creeping artifacts, where the pixels of the
| texture don't stick to the same spot on the polygon, except near
| the vertices.
___________________________________________________________________
(page generated 2024-06-04 23:01 UTC)