[HN Gopher] The genius of binary space partitioning in Doom
___________________________________________________________________
The genius of binary space partitioning in Doom
Author : pgayed
Score : 248 points
Date : 2022-11-21 14:39 UTC (8 hours ago)
(HTM) web link (twobithistory.org)
(TXT) w3m dump (twobithistory.org)
| pkilgore wrote:
| Also discussed / covered in
| https://news.ycombinator.com/item?id=21906051
| kazinator wrote:
| It seems that a challenge in BSP is going to be artifacts, when a
| surface, particularly a textured one, is chopped into pieces by
| several planes. If this is not done very carefully, the cuts
| might show, due to edge artifacts caused by the separate scan
| conversion of the fragments. Certain shading tricks like Phong
| may be difficult to apply in such a way that the result looks as
| if the original undivided polygon had been rendered.
| dmbaggett wrote:
| Amazingly brilliant work, especially given the CPU capabilities
| at the time. Carmack's use of BSP trees inspired my own work on
| the Crash Bandicoot renderer. I was also really intrigued by Seth
| Teller's Ph.D. thesis on Precomputed Visibility Sets though I
| knew that would never run on home console hardware.
|
| None of these techniques is relevant anymore given that all the
| hardware has Z buffers, obviating the need to explicitly order
| the polygons during the rendering process. But at the time (mid
| 90s) it was arguably the key problem 3D game developers needed to
| solve. (The other was camera control; for Crash Andy Gavin did
| that.)
|
| A key insight is that sorting polygons correctly is inherently
| O(N^2), not O(N lg N) as most would initially assume. This is
| because polygon overlap is not a transitive property (A in front
| of B and B in front of C does NOT imply A in front of C, due to
| cyclic overlap.) This means you can't use O(N lg N) sorting,
| which in turn means sorting 1000 polygons requires a million
| comparisons -- infeasible for hardware at the time.
|
| This is why many games from that era (3DO, PS1, etc) suffer from
| polygons that flicker back and forth, in front of and behind each
| other: most games used bucket sorting, which is O(N) but only
| approximate, and not stable frame to frame.
|
| The handful of games that did something more clever to enable
| correct polygon sorting (Doom, Crash and I'm sure a few others)
| looked much better as a result.
|
| Finally, just to screw with other developers, I generated a giant
| file of random data to fill up the Crash 1 CD and labeled it
| "bsptree.dat". I feel a bit guilty about that given that everyone
| has to download it when installing the game from the internet,
| even though it is completely useless to the game.
| PainfullyNormal wrote:
| > None of these techniques is relevant anymore given that all
| the hardware has Z buffers, obviating the need to explicitly
| order the polygons during the rendering process. But at the
| time (mid 90s) it was arguably the key problem 3D game
| developers needed to solve. (The other was camera control; for
| Crash Andy Gavin did that.)
|
| In your opinion, What is the key problem 3d game developers
| need to solve in 2022?
| chrisseaton wrote:
| > None of these techniques is relevant anymore given that all
| the hardware has Z buffers, obviating the need to explicitly
| order the polygons during the rendering process.
|
| You can't mean that all the polygons in a game world are now
| sent to the GPU, entirely relying on viewport culling and Z
| buffer to remove the ones out of view? I'm not an expert but
| I'm sure that's not true - doesn't Source and latest iD Tech
| use BSP right now for example?
| dmbaggett wrote:
| Occlusion culling is still relevant and of course BSP trees
| help with that as well. What I meant is there is no value in
| sorting polygons any longer. (As far as I know; the last time
| I worked on a game was 1997.)
| gary_0 wrote:
| It should be noted that virtually all renderers do _frustum
| culling_ , meaning anything in the game world that is
| guaranteed to be out the camera's viewing volume is not
| rendered. Frustum culling is quite simple. _Occlusion
| culling_ is usually done after frustum culling and is more
| complex, and the method tends to vary depending on the scene
| type and game engine, or is sometimes not done at all (modern
| GPUs are so powerful that smaller or simpler game areas
| render fast enough without occlusion culling).
| Jasper_ wrote:
| That's known as "occlusion culling", and it's still a bit of
| an open problem [0]; a lot of games do indeed just send all
| draw calls inside the frustum. Turns out a good Z-prepass is
| pretty free and helps a lot with things occlusion culling
| would normally help with. And deferred rendering helps even
| more with things like quad overdraw from disparate objects,
| as long as your material mask is mostly the smae.
|
| The Source Engine still uses has a pre-built BSP for surface
| visibility. Source 2 has replaced this with a world-aligned
| visibility octree [1].
|
| [0] The common solution these days is a low-resolution depth
| buffer rasterized on the _CPU_ with SIMD, because doing it on
| the GPU would have noticeable latency... you 've probably
| played a game where you turn into a twisty hallway and the
| walls/object in there only show up after a few frames...
| that's GPU occlusion culling at work.
|
| [1] https://developer.valvesoftware.com/wiki/Half-
| Life:_Alyx_Wor...
| teraflop wrote:
| It's a bit more nuanced than that. The parent commenter is
| correct that modern game engines don't _need_ to sort
| polygons to render them correctly (with some exceptions e.g.
| transparent objects), and doing so at a fine-grained level is
| generally not worth the CPU cost. Especially since the
| _bandwidth_ between the CPU and GPU can be a bottleneck, so
| if possible you want to only upload the level geometry once
| instead of having to rearrange it on every frame.
|
| Game engines can of course still do their own _coarse-
| grained_ occlusion culling, in order to reduce the amount of
| geometry that needs to be processed per frame. And there can
| still be a benefit to _approximately_ sorting objects: if you
| draw objects in approximate front-to-back order, then a
| shader can do "early depth testing" and skip the expensive
| shading computations for pixels that are occluded by already-
| drawn polygons.
| gary_0 wrote:
| Even for modern games that don't care about depth ordering,
| they still tend to render first-person weapons or third-
| person characters first, and the sky last, just because
| it's an easy way to skip a bunch of overdraw, so why not?
| hmry wrote:
| Source was released 18 years ago.
|
| The current Source 2 does not use BSP anymore.
| gary_0 wrote:
| I think BSP was kind of obsolete even when it was used in
| Source 1, but was left in because removing it would have
| required changing a lot of code for no real benefit. The
| blocky brush-based level geometry system from GoldSrc still
| remained, but they added a lot of higher-fidelity features
| on top of it. IIRC, the level editing pipeline code for
| things like PVS and baked lighting were still based on code
| from GoldSrc. Better, newer techniques existed but what
| Valve had wasn't broken, so why fix it?
| skocznymroczny wrote:
| From what I've seen in the modern game engines, the current
| state of the art seems to be executing a compute shader which
| does frustum and occlusion culling on GPU and dynamically
| generates an indirect argument buffer which gets drawn in
| several or a single indirect draw.
|
| This article contains a basic implementation of such idea in
| Vulkan -
| https://vkguide.dev/docs/gpudriven/gpu_driven_engines/
| CyberDildonics wrote:
| _I'm not an expert but I'm sure that's not true_
|
| If you're not an expert why are you _sure_ it 's not true.
| Most games render a lightweight pass to do all rasterization
| to a g-buffer, then do the expensive shading later. This
| separates visibility from shading. If fragments are already
| occluded by the z-buffer they can be skipped as soon as they
| can test their z value against the buffer.
| chrisseaton wrote:
| Do you have my comments bookmarked or something? You reply
| to me with something negative to say far more frequently
| than if you came across them randomly. Can you give it a
| rest if you can't manage to tone down the snide please?
|
| As other comments say, including the original comment
| author, game engines actually do still rely on their own
| space partitioning to reduce draw calls. Source 2 just does
| it quad rather than binary. Source is still used in
| actively developed games and is still BSP, so it's not true
| that the techniques are not relevant.
| CyberDildonics wrote:
| It's strange to get so upset about negativity when you
| replied to someone saying "I'm sure that's not true" even
| though you were talking to an expert who was making a
| general statement about the way rendering works.
| chrisseaton wrote:
| Just give replying to me a rest will you, if you don't
| like my comments.
| CyberDildonics wrote:
| The person you replied to made an informative comment and
| you seemed to want to poke a hole in a generalization
| that people could learn from, which a lot of other people
| have pointed out, so I don't know why you would get upset
| at me for explaining more about how rendering works.
| dicroce wrote:
| yeah this didn't seem right to me either
| zasdffaa wrote:
| > sorting polygons correctly is inherently O(N^2), [...]
| because polygon overlap is not a transitive property (A in
| front of B and B in front of C does NOT imply A in front of C,
| due to cyclic overlap.)
|
| Well ok but I don't get this:
|
| > This means you can't use O(N lg N) sorting, which in turn
| means sorting 1000 polygons requires a million comparisons --
| infeasible for hardware at the time
|
| ISTM you CAN'T sort it full stop because of cycles - so what
| kind of sort (never mind O(N^2), _any_ sort) can order cycles?
| They can 't.
| kvark wrote:
| > None of these techniques is relevant anymore given that all
| the hardware has Z buffers
|
| Not true if you consider transparent objects. Rendering with
| order-independent transparency is still a hot topic without a
| clearly winning solution on GPU.
|
| Web browsers have lots of semi-transparent rectangles, which
| can be transformed under "preserve3d" context. This is a
| classic case of effective BSP that is actual. (background:
| implementing BSP in WebRender a few years ago).
|
| https://github.com/servo/plane-split
| bodhiandphysics wrote:
| As ray tracing becomes more common, I think we'll see a big
| return of various BSP trees in games. BSP is still the crucial
| element in offline rendering.
| detritus wrote:
| > Finally, just to screw with other developers, I generated a
| giant file of random data to fill up the Crash 1 CD and labeled
| it "bsptree.dat". I feel a bit guilty about that given that
| everyone has to download it when installing the game from the
| internet, even though it is completely useless to the game.
|
| Wonderful! THIS is the kind of silly nitty gritty detail I want
| to hear more about - particularly the whys and wherefores of
| the decision, largely-unconsidered as it may well have been.
| Puts me in mind of the semi-affectionate rivalry between
| different demo/crack houses in the eighties and nineties,
| engaging in largely unseen conflict, all submarine-like :) And,
| if you're reading this - know that this thoroughly un-tech nerd
| has read all of your Crash Bandicoot breakdowns, no matter how
| arcane they might seem to me :)
| jetbalsa wrote:
| Great comment! That poor poor bsptree.dat will forever live on.
| I wonder if anyone really sat down and tried to bash their head
| against bsptree.dat
| YesBox wrote:
| I'm developing an isometric city builder game. I've figured out
| how to depth sort everything on the CPU efficiently, though
| there were some ugly workarounds I had to implement. Suffice to
| say, for anything outside of buildings, sorting on the screen.y
| position of the texture works perfectly (so long as each asset
| is one tile in size, or broken up into multiple tiles).
|
| But, I am starting to implement vehicles, first time adding
| assets that are multiple tiles in size. It would be really nice
| if I could create one asset for the vehicles, but the sorting
| on screen.y will not work for 3D rectangles, so I am breaking
| the up into multiple tiles...
|
| Do you think BSP trees will work with thousands of moving
| assets? i.e., I would have to recreate the BSP tree every
| frame.
| kqr wrote:
| How much do things move between two frames? Would you really
| have to recreate the full tree or just reorder a small number
| of leaves?
| fragmede wrote:
| there's probably some optimization that you can do, but you
| can't know that the second frame isn't the conclusion to a
| move from 10 frames ago that is what reveals a giant area.
| frame 0: open door frame 10: still can't see past
| edge of door frame 11: door opens enough to see big
| new area
|
| between 10 and 11 it turns out there's a huge difference,
| even though they're only two frames apart.
| YesBox wrote:
| Somewhere between one to $tile_size pixels. Tile size
| depends on the zoom level (I'm using SFML, had to add a
| separate sprite sheet for each zoom level cause SFML's
| built in zoom out method is awful).
|
| This is the first time of hearing BSP, and I read most of
| the OP's article to have a very basic understanding how it
| works.
|
| Since this is a tree, reordering N elements would be
| approach N^2 complexity, would it not? (edit: I assumed you
| would have to find each node from the root, which could
| very well be a bad premise).
| ant6n wrote:
| Assuming everything touches the floor, and floor is flat,
| sort based on bottom y coordinate of each object
| YesBox wrote:
| This works perfectly for isometric cubes, and is what I do
| currently.
|
| If you imagine two long rectangles, one in each of your two
| hands, and pretend they are two cars passing each other in
| an isometric world, you will see that pretty soon, one
| car's screen.y position will be below the other before it's
| clear of the vehicle which should actually be occluding
| part of it.
| pfedak wrote:
| Do you know how much changed for the sequels? From some reverse
| engineering, Crash Warped relies on depth bucketing for dynamic
| triangles, while level geometry is streamed from the disk based
| on the camera position, already at the appropriate LOD and
| sorted and bucketed. Is the BSP logic you're talking about
| real-time, or part of a similar pre-processing step?
| HellDunkel wrote:
| The beauty of the BSP tree is that it used recursion. When first
| approached this is a challenge to wrap your head around- very
| similar to the quicksort algorithm. Great read.
|
| These days octtrees more or less replaced BSP trees i guess. They
| are easier to handle and work better with more polygons.
| j7ake wrote:
| Brilliant synthesis of the problem and why it was difficult.
|
| My feeling is that Carmack is really good at speeding up
| important parts of video game performance to achieve
| qualitatively different gaming experience.
|
| Sometimes I wonder if Carmack was born 30 years later, when
| computing resources make such advances less needed, he would
| still be as successful? Perhaps he would go on to improve
| performance of other problems eg virtual reality and deep
| learning.....
| NayamAmarshe wrote:
| Doom is truly an amazing showcase of engineering achievements and
| going the extra mile for optimizations.
|
| Games these days just depend on raw hardware and optimization
| takes a back seat. Games like Plague Tale Requiem, while look
| fantastic, are equally bad at the engineering part. How is an RTX
| 3080 the minimum recommended spec for 1080@60fps, is beyond me.
| Sadly, people these days do not care about software optimization
| and those who do are bombarded with comments like "Get better
| hardware".
| HellDunkel wrote:
| The truely great games do. I am sure the developers of BOTW
| were very serious about optimization and you can actually feel
| that playing the game (on switch). It's a masterpiece just like
| doom.
| _whiteCaps_ wrote:
| Wow, this brings back memories...
|
| I took a 3D computer graphics course in university in the early
| 2000s, and the final project was to make a game. IIRC we could
| use as many open source libraries + assets as we wanted, but we
| had to write the renderer.
|
| Our professor was a big Buffy the Vampire Slayer fan, so we made
| a Buffy game. My team thought we'd have some easy points for
| that. When we presented our plan, he says "I know you know I'm a
| Buffy fan, so this better be good"... Ooops. Time to get to work.
|
| We used the Quake map format, which is BSP, and our renderer
| worked pretty well. We used Quake 2 character models, which
| worked pretty well on its own. I think it was because someone had
| made some vampire models for it.
|
| When we integrated the two, the performance was terrible. Like a
| slideshow. After a brief panic, we realized that the Z axis was
| inverted between the two modules we'd written, and the guy that
| put them together was inverting the model _every frame_. After
| inverting it once when the model was loaded, they worked pretty
| well together.
|
| We added a simple hitbox (a rectangular prism at the maximum
| points of the model), and had a fun little game after that!
| diskzero wrote:
| I implimented a BSP renderer in my somewhat unremarkable CDROM
| game Dark Fiber [1] in 1995 or so. I also remember seeing the
| algorithm in Foley and Van Dam, but it wasn't until a
| conversation with John at some conference/tradeshow that I went
| for doing the engine using BSP. It was a lot of fun figuring out
| how to optimize the algorithm and I also spent a lot of time
| using Mathematica figuring out the equations. Well, that was at
| least my excuse for buying a fairly expensive piece of software
| as a small gaming startup.
|
| https://archive.org/details/DarkFiberWindowsMac
| tasty_freeze wrote:
| BSP (binary space partitioning) was a well known algorithm, not
| something Carmack picked out of obscurity. It is well covered in
| every edition of Foley and van Dam's bible "Computer Graphics".
| The arcade game "I, Robot" from Atari (1983) used BSP to render
| the polygons back to front -- there was no z-buffer.
|
| That isn't to deny that Carmack was brilliant. But him using BSP
| isn't some masterstroke of genius in itself.
| throw_m239339 wrote:
| > The arcade game "I, Robot" from Atari (1983) used BSP to
| render the polygons back to front -- there was no z-buffer.
|
| wow 1983
|
| https://www.youtube.com/watch?v=EHkwdvfXHJc
| bitexploder wrote:
| Carmack even discusses this on the Lex Fridman podcast. Carmack
| basically had high school level math around that era and was a
| voracious reader of every scrap of game programming lore he
| could find. And outside of it. He is a very intellectually
| honest guy and it is refreshing to hear him talk about how he
| worked in those days. I came up similar to Carmack and he is my
| programming hero, so, I have a lot of admiration for how he
| managed to do things.
| gibolt wrote:
| Long video (5hrs+), but well worth a watch/listen. Can
| increase playback speed to save time
|
| https://youtu.be/I845O57ZSy4
| beebeepka wrote:
| I spent several game nights shooting guys in TF2 while
| listening to this interview. It is long. The channel has a
| few good interviews with biology guys as well
| bitexploder wrote:
| It was easily one of my favorite podcasts of the last few
| years and Lex usually drives me a little crazy.
| phkahler wrote:
| I, Robot used a lot of great tricks, but I'm gonna have to
| consult John Manfrida on the use of actual BSP trees.
|
| They did use normal vectors to determine visibility and may
| have done simple back to front within objects using that, but
| the playfield is a grid and mostly rendered back to front.
| lordfrito wrote:
| FYI I'm the author of the original I Robot emulator from
| 1998. This is the one that interprets the mathbox, which
| means I only know what the mathbox does at the highest 'API'
| level (can't 100% speak to what the mathbox microcode
| actually does)
|
| Paul is essentially correct. It's mostly back to front, with
| some assumptions. I can confirm quite a number of tricks were
| used, not entirely sure they qualify as a full BSP (although
| it does involve precompiled mesh optimizations) but then
| again I'm no BSP expert (just an I Robot expert).
|
| The rasterizer draws a list of polygons (full polys, not just
| triangles) which must be fully computed before sending to the
| rasterizer. This "display list" is drawn in order. There is
| no Z buffer, so everything draws on top of what was already
| there. So it's important that the display list have the
| polygons somewhat sorted. Also the polygons are drawn as is,
| which means rotation/projection is done before being written
| to the display list.
|
| The camera was fixed (can't yaw) as to simplify math. This
| means you are always looking straight ahead in Z axis, which
| allows for all sorts of optimizations. It also makes it
| trivial to sort objects back to front, simply by checking z
| coordinate.
|
| Next, the hardware only supports 16 moveable "3D mesh"
| objects that can be drawn per frame. With an object count
| this low it's trivial to sort back to front without consuming
| many CPU cycles. 6809 index instructions make this part a
| breeze.
|
| The game playfield is "built on the fly", essentially
| mirroring a 16x16 (16x24? 16x32?) playfield tile array in RAM
| (each byte encodes tile height), and building each cube face
| polygon on the fly.
|
| Because of the fixed camera, no real sorting or culling is
| necessary when creating the display list at the highest
| level. You simply draw from back of the screen (furthest
| playfield row) to front (closest row). This process never has
| to change, it's a linear traversal of playfield array memory,
| and it always works as long as the camera doesn't yaw. Just
| draw each row on top of the last one. Guaranteed to not cause
| problems.
|
| To draw a row (16 tiles, 15 visible) all of the
| front/top/side polygons are computed and written to the
| display list. Again, because Z axis is fixed it makes
| creating the polygon coordinates trivial (you always see
| front of cube, no culling needed). Once that's done you draw
| any 3D meshes that are in the row (objects sitting on the
| playfield cells). Then you move to the next row, and the
| next, and then you're done.
|
| So at this high level, no real culling is done... things
| essentially get drawn in order of depth, which always works
| as long as you don't yaw the camera.
|
| Now... where things get interesting is the 3D mesh objects
| themselves are stored in a sort of "pre-computed" way to
| optimize surface removal (not sure if this was done by hand,
| or by an automated/compile process). It's not so much a tree
| structure, as it's actually done through specialized
| instructions. I suppose the jumping/branching is the
| equivalent of tree traversal.
|
| Basically the meshes are held in ROM, and are really just
| specialized subroutines, written as a series of simple
| instructions for the Mathbox. To keep this writeup simple,
| treat the Mathbox as being able to execute two instructions:
| 1) write a polygon to the display list 2) "relative
| jump to instruction" if normal vector dot product points
| towards/away from camera
|
| A 3D mesh object might be coded in ROM like this
| Mathbox3DMeshObject: if normal vector dot product <
| 0, jump to X ; pointing away from camera write poly 1
| to display list ; forward facing polygons write
| poly 2 to display list ... Jump to Y
| X: write poly 10 to display list ; back facing
| polygons write poly 11 to display list ...
| Y: if normal vector dot product < 0, jump to Z ;
| sideway facing? write poly 20 to display list
| write poly 21 to display list ... Z:
| write poly 99 to display list ; always drawn ...
| END
|
| A mesh object could code any number of jumps, to cover side
| facing objects, corner cases, etc. Note I'm glossing over the
| fact that the mathbox would also multiply the coordinates by
| the current rotation matrix, project to X/Y, perform shading,
| etc. before emitting to the display list.
|
| As code it's not a real tree structure, although executing
| the code behaves a lot like traversing a tree. I guess this
| is sort of a BSP? Can anyone chime in here?
|
| It was pretty clever how it worked out. The mathbox executes
| the instructions, and creates draw instructions in order.
| Polygons get skipped by virtue of the fact you're
| periodically checking normal vectors, and skipping around to
| different instructions as needed.
|
| Edited: clarity, typos
| phkahler wrote:
| Sounds like the mesh objects are nearly equivalent to small
| BSP trees.
|
| Thanks John!
| lordfrito wrote:
| It definitely does work like a BSP tree.
|
| The issue to me is that a tree is a data structure, but
| the meshes in I, Robot are encoded as literal subroutines
| of machine instructions. There is no tree to traverse,
| the Mathbox simply executes the instructions as it
| encounters them.
|
| These aren't even "interpreted" (fake) instructions for a
| software VM (like Sweet16). They're literal instructions
| for the custom Mathbox microcode. The Mathbox executes
| the instructions, and outputs polygons to the display
| list as told.
|
| It's a lot like a compiler optimization that unrolls a
| loop inline... Can you even call it a loop once it's
| unrolled? In this case, it's as if a BSP was created,
| then turned into literal instructions for a custom
| machine. Many times the polygons were duplicated in the
| code (in-lining polys works faster than branching back).
| It blurs the line between code and data.
|
| I'm really blown away by the sophistication of the I,
| Robot Mathbox... The microcode was exceedingly complex,
| much more than just some fast matrix math (like
| Battlezone etc). It executed instructions to compute dot
| products, which would potentially result in X/Y/Z points
| being multiplied by a rotation matrix, projected, and
| written as polygons to the display list, for later
| rasterization by a different processor altogether.
|
| Anyhow it's fun to revisit this stuff :)
| chowells wrote:
| Sure, that's still a BSP tree. It's just represented as
| code instead of data. The duality of data and code is a
| long-established optimization path for static
| information.
| JohnBooty wrote:
| These aren't even "interpreted" (fake)
| instructions for a software VM (like Sweet16).
| They're literal instructions for the custom
| Mathbox microcode. The Mathbox executes the
| instructions, and outputs polygons to the
| display list as told.
|
| Wow, that's amazing, I love that. Thank you so much for
| explaining this.
| corysama wrote:
| 20 years after playing around with your emulator, I get to
| tell you that the "metafile dump to Window's clipboard"
| feature is just so nifty it still sticks out in my memory
| today.
| lordfrito wrote:
| Glad you appreciated it! Not many people knew it was
| there. I made a custom T shirt from some of the clipboard
| vector output.
|
| FYI I've ported the emulator to C#, it runs in even
| higher resolutions than before. [1]
|
| [1] https://github.com/manfreda-dot-org
| tasty_freeze wrote:
| To be fair, I didn't ever look at the code. I took the word
| of Dave Sherman, the guy who designed the hardware for it
| (though the "math box" was lifted from another game, probably
| Battlezone).
|
| Here is one more interesting thing about the display that
| Dave told me. DRAM was expensive, and at the time the I,
| Robot game was much more expensive than most of the arcade
| machines they put out. To save on DRAM, they used fake double
| horizontal resolution.
|
| This is a memory from a conversation 30 years ago, but there
| is 3b per pixel to specify color, and one bit to indicate if
| the edge should be pushed out half a pixel. That is, when
| rasterizing the polygons, they saved one fractional X bit in
| the pixel. In the interior of polygons it had no effect, but
| when generating video, if pixel at offset (x+0) and pixel at
| offset (x+1) had a different color index, the fraction bit
| from pixel (x+0) was used in the video timing generator to
| place where the color transition would happen, doubling the
| apparent horizontal resolution. When more than one edge
| crossed a pixel the information isn't that useful, but such
| pixels are a tiny fraction of all edge pixels, so overall it
| was a big win.
| lordfrito wrote:
| You know reading this caused some forgotten neuron in my
| brain to fire.
|
| When I was reverse engineering the thing, I seem to
| remember there being a bit related to color/shading that I
| couldn't figure out what it did (I could see that it was
| being used).
|
| The system had a 64 color palette. The software broke that
| up into 8 sub-palettes to give you for 8 main colors with 8
| shades each (0-7 first color, 8-15 second color, etc.)
|
| Polygons could call out colors directly using full palette
| index. Most polygons "faked" shading, they really just
| specified different shades of the same color for effect.
| For example, the dodecahedron / meteors in the space waves
| aren't real-time shaded, just colored to look that way.
|
| However the hardware could do shading, which cost an
| expensive dot product. So it was used sparingly. To do
| shading, the polygon used a 3 bit color "index" to call out
| one of the 8 sub-pallets. Then you add an offset of 0-7
| based on the dot product of the normal vector to apply
| shading (I used 8 x dot product to get you an offset from 0
| to 7.9999).
|
| I recall there being an extra bit there (4 bits instead of
| 3) but couldn't figure out what that last bit was for (it
| seemed like a "1/2 bit" to me). It looked like it was being
| used, but didn't make sense, so I stripped it off, and it
| didn't seem to affect anything so I forgot about it... till
| now. If I'm remembering correctly, this might literally be
| a "1/2 step" to add between shades? That can't be right can
| it?
|
| Another way I read what you wrote is that it could be
| considered an "anti-aliasing" bit to do some alpha blending
| at the edge of a polygon to fake higher resolution? Maybe
| that's a better way to read it?
|
| My memory is probably not 100% here... Would love to hear
| more.
| phkahler wrote:
| I had read about that fractional pixel online, and the bit
| does seen to exist in the vram (3 bits color, 3 bits
| intensity, one other bit).
|
| I don't think it ever got implemented all the way through.
| If it works as described, it should cause the right edges
| of polygons to appear with higher resolution than the left
| edges, particularly on black (blank) background. I've got
| an I, Robot in my living room and have looked for this
| visual difference and can not see it. Doodle City even
| provides the opportunity to move and rotate objects, so
| even with direct control of object and orientation I can't
| see it.
|
| Apparently there were other things that never got finished
| with the game too.
| toolslive wrote:
| Yes, If you were reading DrDobbs at the time, you would know
| that they actually tried every algorithm in the computer
| graphics bible (and they reported about it in that magazine).
| Some algorithms were in fact better than BSP, but were a bit
| instable performance wise (better avg performance, but worse
| worst case)
| Teslazar wrote:
| I think you're referring to Michael Abrash's Ramblings in
| Realtime articles which were written about Quake. When I was
| a teen I used to carry these around with me and read them
| over and over while I was implementing my own engine.
|
| Copies are available here: https://www.bluesnews.com/abrash/
| toolslive wrote:
| You're right: Quake. not Doom. There are 2 problems getting
| old: the first one is the memory that goes, ... the second
| one I forgot.
| winrid wrote:
| A lot of this is covered in his Graphics Programming Black
| Book too.
| johnvanommen wrote:
| I was thinking the same thing.
|
| The author references the Michael Abrash book, and said it
| was from "the late 90s."
|
| But IIRC, Abrash was writing about BSP in Doctor Dobbs
| Journal in the early 90s.
|
| IE: Abrash inspired Carmack, not the other way around.
|
| In 1992 I was really keen on learning to make arcade style
| games on the PC, and I studied those Abrash articles like
| crazy.
| pavon wrote:
| Yeah, when I was inexperienced what impressed me about Carmack
| was all the "cutting edge" techniques and tricks he used to
| push hardware to its limits. As I've became more experienced
| what impresses me is how iD was able to crank out so much
| software so quickly, have time to research these more advanced
| methods, quickly triage which were promising and which were
| not, and get them working and well optimized in production
| software.
|
| I can read research and incorporate it into my code, but it
| takes me a while, and I usually have multiple dead-ends where
| it isn't until after I've tried using a technique with real
| data that I realize it has weaknesses the researchers didn't
| point out and which can't easily be fixed. I can crank out
| quick code, but it isn't going to be advanced or optimized. I
| struggle finding the right balance of how to trade between
| these three things, and yet Carmack and iD were able to squeeze
| in all three at once.
| kudokatz wrote:
| > As I've became more experienced what impresses me is how iD
| was able to crank out so much software so quickly
|
| > usually have multiple dead-ends where it isn't until after
| I've tried using a technique with real data that I realize it
| has weaknesses the researchers didn't point out and which
| can't easily be fixed
|
| If you listen to John Romero himself telling the story of id
| and the fast iteration, it was "due to having, basically, 10
| years of intense game development experience prior":
|
| https://youtu.be/E2MIpi8pIvY?t=543
|
| ("it's also due to the first principle we had - no
| prototypes")
|
| A theme I tend to see is that a lot of folks with super
| output appear to others like icebergs - a LOT of effort that
| nobody can see, and a tiny peak at which everyone marvels.
| ww520 wrote:
| Carmack work long hours everyday, really long hours.
| HelloNurse wrote:
| Using a BSP tree to render front to back rather than back to
| front is also quite obvious: it's the same visit in the
| opposite order.
| 0xAFFFF wrote:
| I you guys are interested in how Doom was implemented, I cannot
| recommend enough Fabien Sanglard's book, Game Engine Black Book:
| Doom.
|
| https://fabiensanglard.net/gebbdoom/
| teaearlgraycold wrote:
| Also the first one on Wolfenstein
| phkahler wrote:
| I was studying computer graphics at the time. The book we used
| was "Computer Graphics Principles and Practice" I don't recall
| which edition. BSP trees are covered in the book, and like the
| article says, had been written about more than a decade prior.
|
| What Carmack had was the ability to read research papers and
| translate them into working code. I can do that too, but it does
| seem to be a less common skill in the software world.
| JohnBooty wrote:
| What Carmack had was the ability to read research
| papers and translate them into working code. I can do
| that too, but it does seem to be a less common skill
| in the software world.
|
| Well, that's the argument for Carmack being merely _very very
| very good_ and not a "genius."
|
| Here's the argument _for_ him:
|
| There were a lot of exceptionally talented people working in
| the games industry, and for four+ generations of PC hardware
| (the Keen, Wolf3D, Doom, and Quake eras) Carmack's engines were
| years ahead of everybody else's and had an absolutely massive
| impact on the industry.
|
| Of course, the term "genius" is nebulous and it can mean
| whatever we want it to mean.
|
| However, describing him as merely a guy who implemented other
| peoples' algorithms may be true in a literal sense but really
| misses the big picture IMO.
| bluedino wrote:
| > What Carmack had was the ability to read research papers
|
| And he had to read actual papers, too. I could see him making
| copies at the library. No Google or StackOverflow back then.
| jasonwatkinspdx wrote:
| Not quite when Doom came out, but within a few years you
| could find a surprising amount about graphics on grad
| students' homepages. I remember learning about PVS, Portal
| based rendering, and a number of similar things from the
| original paper author's homepage.
| bitwize wrote:
| Not just working code -- code that worked well in a real-time
| video game on 386/486 class hardware. That's the genius bit as
| far as Doom's code is concerned.
|
| Edit: clarify
| johnvanommen wrote:
| That was the thing that was so revolutionary about Abrash: he
| figured out how to create graphics that rivaled arcade
| hardware on the IBM PC.
|
| Most importantly: _without_ a GPU.
|
| At the time, all of the really "flashy" arcade games used
| GPUs that cost around $1500 in today's dollars. Abrash pulled
| off this stunt with no GPU.
|
| I'd argue that the entire reason the Xbox is called "Xbox" is
| because of "Mode X", invented by Abrash.
| bitwize wrote:
| The Xbox is called Xbox because Microsoft wanted to build a
| console around Windows APIs, internally calling it a
| "DirectX box".
|
| And DirectX has nothing to do with ModeX; it's a family of
| APIs: DirectDraw, Direct3D, DirectSound, etc.
| coliveira wrote:
| When Carmack did his work, the computers he had were 386/486
| class. So the distinction you're making doesn't make sense.
| Working code is code that works in the computer you have at
| the time you're writing the software.
| bombcar wrote:
| Doom was technically developed on NeXT workstations, but it
| had to perform well enough on a 386 (even if in a reduced
| window).
| royjacobs wrote:
| There is a distinction between 'working code' and 'working
| code that runs at interactive frame rates in a severely
| memory-constrained environment'.
| sebstefan wrote:
| > The really neat thing about a BSP tree, which Fuchs, Kedem, and
| Naylor stress several times, is that it only has to be
| constructed once. This is somewhat surprising, but the same BSP
| tree can be used to render a scene no matter where the camera
| viewpoint is. The BSP tree remains valid as long as the polygons
| in the scene don't move. This is why the BSP tree is so useful
| for real-time rendering--all the hard work that goes into
| constructing the tree can be done beforehand rather than during
| rendering.
|
| I can't visualize why it's true
| whatshisface wrote:
| Here's my explanation of BSP trees, written as much for my own
| benefit as anybody else's (maybe it will help you):
|
| To start off with, we have a basic geometric fact: if you have
| a surface that divides space into separate regions (this can be
| a closed surface like a sphere, or an infinite surface like a
| plane), a line of sight can't cross between the two partitions
| of space without passing through the surface. If the surface is
| convex, a line of sight can only cross it once. (That's one
| definition of convexity.)
|
| Now, suppose that you have a lot of objects, and none of them
| are on both sides of the boundary. You can ensure that
| condition by splitting any object that crosses the boundary in
| two. No matter where the line of sight begins, and no matter
| what direction it moves in, the boundary then offers a
| definite, albeit partial ordering of events: objects on the
| same side of the boundary may intersect, then the line will
| intersect the boundary, then the objects on the other side may
| intersect.
|
| If you find an intersection with an opaque object on the same
| side of the boundary, you never need to check any objects on
| the other side, because the order of events above. That is a
| big benefit when about half of the objects in front of you are
| on your side of the boundary: if you are right in front of the
| boundary, or if there's nothing on the other side of it, it
| doesn't help a lot.
|
| Checking all of the objects on one side of the boundary is the
| same kind of problem as checking all the objects, and that
| means you can speed those up too by having sub-boundaries for
| the sub-problems. That gives you the recursion and makes it a
| tree. If the reasoning based on boundaries continues until
| there is only one object in a sub-problem, you don't need to
| sort lists based on distance - meaning that the whole ordering
| problem can be solved just with partition trees.
|
| Choosing a good tree is a difficult problem because, unlike the
| fact that the ordering problem can always be solved, the amount
| that an individual boundary division helps depends on where you
| and the objects are in relation to it. Planes are generally
| used instead of other things like spheres because splitting a
| polygon along the edge of a sphere doesn't result in another
| polygon, while splitting it along a plane does.
| bombcar wrote:
| https://developer.valvesoftware.com/wiki/BSP may help - the key
| is you have divided it into _convex_ spaces.
| [deleted]
| kazinator wrote:
| Because the input to the BSP traversal is the view position.
|
| Every node in the tree is associated with an infinite plane
| that divides space in half. Stuff on one side of the plane is
| in the left subtree; stuff on the other side of the plane is in
| the other subtree.
|
| The viewer is on one side of the plane or the other (or maybe
| on it, oops).
|
| We know that stuff on the other side of the plane cannot
| occlude the view of stuff on the viewer's side of the plane. So
| for instance if we're doing back-to-front rendering, we would
| draw the material from the other side of the plane first, then
| stuff from this side. (Subject to the content being rotated
| into the correct view and clipped to the view frustum.)
|
| There is no polygon in the tree that intersects/straddles these
| dividing planes, which is the point. When the BSP is
| constructed, whenever these planes cut through some input
| polygon, it is cut into two pieces that get assigned to
| different subtrees. There are then situations in which those
| pieces will not get drawn at the same time though they belong
| to the same logical surface.
|
| The planes which subdivide space in half are independent of
| each other; they go every which way. So if we have nodes like
| a b c d e f g
|
| it is possibly the case that plane b cuts through the polygon
| sets f and g, and maybe some of them even straddle plane b.
| None of that is relevant to them because the b subset is on the
| opposite side of common ancestor a, the b plane subdivision is
| concerned only separating d from e. Binary space partitioning
| is not actually a mathematical partition (exhaustive division
| into non-overlapping parts) of the space. The set of poygons
| gets partititoned, of course; not the space itself.
| yojo wrote:
| For people not reading the whole thing (reasonable, it is long):
| the author comes to the conclusion that it was not so "genius"
| after all, given the prior art, and that he read it in a computer
| graphics book. He still gives Carmack props for doing his
| homework and getting a working implementation built in a novel
| game engine.
| jamestimmins wrote:
| One thing confuses me about this: does it require a separate tree
| per-axis?
| gary_0 wrote:
| It doesn't split by axis, it splits by planes that can be
| oriented however. Planes split a space into 2 half-spaces,
| hence the "binary" part. Only the older Wolfenstein-style
| rendering cared about the axis.
| jamestimmins wrote:
| That makes sense if I think about everything existing on a
| single axis (objects in front of each other on the z-axis).
|
| But if you then turn 90 degrees, then suddenly a bunch of
| those z-axis items are side by side, and order is based on
| the x-axis of my original perspective. So I don't understand
| how a binary structure accounts for that perspective shift.
|
| Edit: For clarification, my understanding is that there's a
| single tree built per-level in advance. If I'm mistaken and
| it's re-building the tree 30 times per second, the lack of
| axis concerns makes sense, but the article seemed to indicate
| that that's computationally prohibitive.
| gary_0 wrote:
| The tree is built once. The tree can be traversed from any
| point in the world because each node stores which plane it
| used to split, and it is easy to compute which side of the
| plane you are viewing.
|
| Have you tried reading
| https://en.wikipedia.org/wiki/Binary_space_partitioning ?
| It shows how a BSP is built and traversed for rendering.
| jamestimmins wrote:
| Interesting. Thanks for sharing. I think i need to dig
| into that article to try to understand this better.
| isolli wrote:
| I'm reminded of the render algorithm in Comanche, the 1992 video
| game [0]. I find it remarkably ingenious, and it certainly felt
| incredible at the time.
|
| [0] https://github.com/s-macke/VoxelSpace
| speed_spread wrote:
| I've played countless hours with the A-10 simulator which I
| believe used the same "voxel" technique.
| atan2 wrote:
| Really? I am not aware of such game. Do you remember the A-10
| flight sim you played that used voxelspace?
| samlittlewood wrote:
| Quite a few games (eg: Starfox) in the 90's would use BSP in a
| slightly different way - purely within each object. The tree
| would be generated offline (mechanically or by hand), and would
| often be some sort of branching bytecode (test this, draw that
| etc.)
|
| It was very useful to be able to use one plane test to discard a
| whole bunch of geometry, eg: a surface plus its detail, or
| everything visible through a window.
|
| This still left the problem of sorting between objects - mostly a
| depth sort was just fine. Special cases like weapons on
| hardpoints, or vehicles in hangars could be handled by having
| some sort of 'call submodel' for spaces within the parent model.
| Beyond that, just dial up the hostile firepower until players
| stop complaining about rendering artefacts.
| bigmattystyles wrote:
| I thought this was going to be about all the small hallways in
| Doom where you have to close one door before opening another -
| this was done so that the engine would never have to render more
| space than 1 room at a time. I don't know if that's true, just
| something I've often heard repeated.
| tboughen wrote:
| I played Doom, Doom 2 and Final Doom all the way through about
| 20 years ago - I can't recall a single instance of this kind of
| level design. Doors stayed open once opened. Maybe it was done
| with corners/sight lines instead? To be honest I remember some
| vast areas rendered, especially when accessing secrets.
| jonny_eh wrote:
| Doors definitely auto-closed behind you.
| sedeki wrote:
| What is a good learning path (books, videos, ???) to take in
| order to understand modern graphics programming? At the level of
| Vulkan or Metal, say.
|
| I have had a bird's eye exposure to shader-based OpenGL but don't
| feel I have a good intuitive understanding of how the GPU
| operates.
| newsclues wrote:
| Graphics Programming Black Book by Michael Abrash 1997
| bombcar wrote:
| I recommend this even though it's now 20+ years old, it's
| accessible enough that you can get a foothold.
|
| Alternatively, get the Black Books for Wolfenstein and Doom,
| and read them first.
| dceddia wrote:
| This tutorial is well regarded for learning the ins and outs of
| modern OpenGL including how the GPU pipeline works, which I
| think translates pretty well to Vulkan or Metal (at least the
| concepts anyway) https://learnopengl.com/
| coldpie wrote:
| Start with Foley & van Dam to lay the groundwork & terms of
| art, then grab whatever API tutorial you like and get started.
| You'll find plenty of further resources once you're more
| familiar with the field and start discovering your interests.
| Graphics programming is a seriously deep discipline and will
| take years of study to start doing real work in. But it's a
| super cool field with plenty of work on both the research &
| practical implementation sides.
| i_like_apis wrote:
| Include: Ericson, Real-Time Collision Detection
| https://www.goodreads.com/book/show/620505.Real_Time_Collisi...
| Waterluvian wrote:
| I think the "sit and read research papers" part is often omitted,
| to our own peril, from the romanticized image of the genius
| computer hacker of that era.
| ddeville wrote:
| Carmack himself describes it well in the recent interview with
| Lex Fridman https://youtu.be/I845O57ZSy4?t=8961
| random314 wrote:
| Yet another Carmack worship story. He is a great engineer, but
| let's not overhype this.
|
| It started with the legendary square root function. Which lost
| its legendary status once it turned out Carmack wasn't the
| author.
|
| Now it is about standard BSP technique from Computer graphics
| text books from that era.
|
| There is also hype about Carmack solving AGI by 2030, when he is
| yet to publish a single widely cited AI paper.
|
| EDIT:
|
| I am guilty of reading the first couple of paragraphs and
| assuming this was Carmack worship. My bad! Leaving the original
| comment, mostly as is.
| EamonnMR wrote:
| The story of tracing the magic number back is really fun and
| it's interesting to see how the trade of computer graphics
| programming was passed down from one company to another. I
| don't think it's legendary status was at all diminished when it
| turns out not to have been Carmack's invention.
|
| https://www.beyond3d.com/content/articles/8/
|
| https://web.archive.org/web/20120920120948/http://blog.quent...
| dmbaggett wrote:
| As someone working at the same time as him on similar problems,
| I honestly and sincerely disagree. At the time my reaction was
| "this is space cadet stuff". 486 hardware was really slow, and
| the idea of doing any kind of real computational geometry back
| then was incredibly far-fetched. Until he showed it wasn't.
| ajuc wrote:
| Carmack is a great programmer because he consistently produces
| solid, fast code, not because he has groundbreaking ideas.
|
| It's similar with Linus. He didn't invented monolithic kernels,
| UNIX, distributed version control. He "just" wrote good
| implementations.
|
| Turns out brilliant ideas are overrated in programming, what
| matters is execution and daily grind.
| FartyMcFarter wrote:
| > Turns out brilliant ideas are overrated in programming,
| what matters is execution and daily grind.
|
| You also have to be good at finding the brilliant ideas that
| are worth using.
| manv1 wrote:
| You need to understand the problem, understand what you're
| trying to do, and understand how this abstract solution
| maps to your problem.
|
| Then you have to implement it.
| smoldesu wrote:
| FWIW, the inverse square root function wasn't popularized by
| Carmack. Most people shared it because they were amused by the
| confused comments surrounding it, and how it demonstrated an
| extremely vertical slice of optimization.
|
| Treating anyone like a rockstar is a bad idea, but Carmacks and
| Wozniaks deserve the praise more than the Zuckerbergs and Jobs'
| out there.
| Ygg2 wrote:
| What do you mean not worship Jiu-Jitsu practicioner, Luddite
| Nemesis, Jace Hall asphyxiatior, and master of the Anti-Life
| equation - John Carmack?
| JustSomeNobody wrote:
| Yeah, I don't think we read the same article. This isn't a
| Carmack worship article.
| royjacobs wrote:
| Let's not underhype it, either: It seems you are
| underestimating the impact that Doom had on PC gaming when it
| came out. Using BSP trees to perform zero overdraw rendering
| definitely was not something that had been done before on
| consumer level hardware and it can definitely be argued that
| applying them this way for Doom was a stroke of genius.
| coliveira wrote:
| You're setting the bar for "genius" too low, which
| unfortunately is commonplace in computer programming. If you
| take something that already exists and apply it to your code
| I don't call it genius. Hi quality programming, yes, but
| genius no.
| royjacobs wrote:
| I didn't realise I was not allowed to set my own bars,
| though :)
| cwillu wrote:
| From the Carmack worship story: "So Carmack was not the first
| person to think of using BSP trees in a real-time graphics
| simulation. Of course, it's one thing to anticipate that BSP
| trees might be used this way and another thing to actually do
| it. But even in the implementation Carmack may have had more
| guidance than is commonly assumed."
| random314 wrote:
| I am guilty of reading the first couple of paragraphs and
| assuming this was Carmack worship. My bad!
| cwillu wrote:
| Honey for the poison.
| choeger wrote:
| What I don't understand is, how the tree can be used even after
| movement? How do you know which polygon is behind the other if
| camera position and angle are freely chosen?
___________________________________________________________________
(page generated 2022-11-21 23:00 UTC)