[HN Gopher] Level of Gaussians: Real-Time View Synthesis for Mil...
___________________________________________________________________
Level of Gaussians: Real-Time View Synthesis for Millions of Square
Meters
Author : corysama
Score : 171 points
Date : 2024-04-30 19:45 UTC (1 day ago)
(HTM) web link (zju3dv.github.io)
(TXT) w3m dump (zju3dv.github.io)
| corysama wrote:
| I got excited because the code was just released. But, apparently
| the paper is still not available? Sorry...
| blovescoffee wrote:
| Crazy good results, but without the paper (the link, at the time
| of writing, just goes back to the site) it's a bit difficult to
| check _how_ good. What data is required, and how long are the
| training runs / how many steps?
| logtempo wrote:
| Using 200 photos taken with a conventional camera, rendered at a
| refresh rate of 105 frames per second - the quality of video
| game images - the result gives the illusion of walking through
| the scene. Better still, if you zoom in you can see finer
| details, such as the spokes of a bicycle wheel, in excellent
| detail.
|
| It uses neural network techniques, but it's not strictly a
| neural network.
|
| It achieves the same results as Google's NeRF in 30 minutes of
| training, and NVIDIA's in 7 minutes. It can reach more than 100
| fps if you let it train longer.
|
| https://www.inria.fr/fr/3d-gaussian-splatting-vision-ordinat...
| was_a_dev wrote:
| The fact the code was released before the paper is wild.
| Typically the promise of open-sourced code never comes to
| fruition.
| speps wrote:
| Actual title is: Real-Time View Synthesis for Large Scenes with
| Millions of Square Meters
|
| Which makes more sense than: Real-Time View Synthesis for Square
| Meters
| corysama wrote:
| Title edited. Thanks. I couldn't fit the whole title. But,
| didn't think I cut out "Millions of"...
| Retr0id wrote:
| I hope the next-gen google earth looks something like this.
| londons_explore wrote:
| Please Google, implement this in google maps (especially on
| mobile).
|
| It's been _over a decade_ and we're still stuck with 2D maps and
| boxy untextured buildings.
| cubefox wrote:
| Google uses texture mapped polygons instead of 3D Gaussians, so
| this wouldn't work for Google Maps. But there actually is a
| collection of libraries which does the same thing for polygonal
| data: https://vcg.isti.cnr.it/nexus/
|
| One of the guys working on this is Federico Ponchio. His 2008
| PhD thesis, which provided the core insight for Unreal Engine's
| Nanite, is referenced at the bottom.
| londons_explore wrote:
| > Google uses texture mapped polygons instead of 3D
| Gaussians,
|
| Time to switch I'd say...
|
| Polygons are a poor fit, especially for trees and windows and
| stuff that needs to be semitransparent/fluffy.
|
| I suspect the gaussians will compress better, and give better
| visual quality for a given amount of data downloaded and GPU
| VRAM. (the current polygon model uses absolutely loads of
| both, leading to a poor experience for those without the
| highest end computers and fast internet connections).
| leodriesch wrote:
| I am really impressed by the Apple Maps implementation. I think
| it also uses textured polygons, but does so in a very good-
| looking way and at 120 fps on an iPhone, showing even a whole
| city in textured 3D.
| martinkallstrom wrote:
| Apple bought a Swedish startup called C3, and their tech became
| the 3D part of Apple Maps. That startup was a spin-off from Saab
| Aerospace, who had developed a vision system for terrain-
| following missiles. Saab ran a project with the municipal
| innovation agency in Linköping, and the result was that they
| decided it should be possible to find civilian use cases for
| this tech. C3 flew small Cessnas in grids across a few major
| cities (and also the Hoover Dam), and built a ton of code on top
| of the already extremely solid foundation from Saab.
| The timing was impeccable (now many years ago) and they
| managed to get Microsoft, Apple and Samsung into a bidding
| war which drove up the price. But it was worth it for Apple
| to have solid 3D in Apple Maps and the tech has stood the
| test of time.
| dxjacob wrote:
| I remember seeing a Nokia or Here demo around that time
| that looked like the same or similar tech. Do you know of
| anything published about it with technical details? Seems
| like enough time has passed that it would be more
| accessible. I would love to learn more about it.
| astrange wrote:
| The reason there isn't much investment here is that it's
| expensive to update the image data and the result isn't very
| useful.
|
| You barely ever need to look at 3D photogrammetry buildings for
| anything and there aren't many questions it answers outside of
| curiosity.
|
| I do wonder if they could integrate street view images into it
| better.
| londons_explore wrote:
| Even old image data is pretty useful. If they could make a 3d
| view that seamlessly integrated satellite, plane, and street
| level imagery into one product, it would be a much better UX
| than having to manually switch to street view mode.
| astrange wrote:
| Well, almost all of satellite view is actually plane
| images. Satellite images aren't good enough resolution for
| 3D as far as I know.
|
| The other problem is you can only update them in sunny
| weather. So SAR is a lot more useful because it can see
| through clouds.
| logtempo wrote:
| It could be a service for local uses: you select an area and
| ask Google to render it. Could even be a premium service hehe
| bufferoverflow wrote:
| Google Maps has 3D (in some areas). Click on Layers -> More ->
| Globe view.
|
| Looks like this: https://i.imgur.com/wcCJmbd.png
| londons_explore wrote:
| But that's desktop, not mobile.
|
| And when you zoom right into streets, storefronts and stuff
| are barely visible because they haven't properly integrated
| street level imagery.
| lend000 wrote:
| Looks even better than Microsoft flight simulator. Awesome!
| jiggawatts wrote:
| So this is just Level-of-Detail (LoD) implemented for Gaussian
| splats? Impressive results, but I would have figured this is an
| obvious next-step...
|
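| A minimal sketch of the general idea (the paper's actual scheme
| isn't published yet, so the structure, names, and thresholds
| here are my own guesses): group splats into spatial chunks,
| then pick a level per chunk from its projected screen-space
| size.
|
|     # hypothetical chunk-based LoD selection for splats
|     import math
|
|     def select_lod(chunk_radius_m, chunk_distance_m, fov_y_rad,
|                    screen_h_px, num_levels=6, finest_px=512.0):
|         # level 0 = finest; higher = coarser (fewer, fatter splats)
|         # approximate on-screen diameter of the chunk, in pixels
|         proj_px = (chunk_radius_m / chunk_distance_m) \
|                   * (screen_h_px / math.tan(fov_y_rad / 2))
|         # each coarser level targets roughly half the detail
|         level = int(math.log2(finest_px / max(proj_px, 1e-3)))
|         return min(num_levels - 1, max(0, level))
|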
| Also, is it bad that the first thing I thought of was that
| commanders in the Ukraine war could use this? E.g.: stitch
| together the video streams from thousands of drones to build up
| an up-to-date view of the battlefield?
| angusturner wrote:
| Can anyone familiar with 3d graphics speculate what would be
| required to implement this into a game engine?
|
| I'm guessing that adding physics, collision-detection etc. on top
| of this is non-trivial compared to using a mesh?
|
| But I feel like for stuff like tree foliage (where maybe you
| don't care about collisions?), this would be really awesome,
| given the limitations of polygons. And also just any background
| scenery, stuff out of the player's reach.
| corysama wrote:
| I worked in game engines for a long time. The main hurdle is
| just that it's new. There's a many-decade legacy pipeline of
| tools and techniques built around triangles. Splats are
| something new.
|
| The good news is that splats are really simple once they've
| been generated. Maybe simpler than triangles depending on how
| you look at it. It's just a matter of doing the work to set up
| new tools and pipelines.
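|
| For a sense of how simple the generated data is, this is roughly
| the per-splat record a typical 3D Gaussian Splatting pipeline
| writes out (a sketch of the common .ply layout, not necessarily
| this project's format):
|
|     from dataclasses import dataclass
|     import numpy as np
|
|     @dataclass
|     class Splat:
|         position: np.ndarray   # (3,) world-space center
|         rotation: np.ndarray   # (4,) unit quaternion
|         scale: np.ndarray      # (3,) per-axis radii
|         opacity: float         # alpha in [0, 1]
|         sh_coeffs: np.ndarray  # (16, 3) SH color coefficients
|
|     # a whole scene is just a big array of these, sorted and
|     # alpha-blended at render time: no topology, no UVs, no
|     # materials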
| modeless wrote:
| It's easy to render these in a game engine. I'm sure physics
| and collision detection are possible. The big, huge, gigantic
| issue is actually lighting.
|
| These scenes come with real world lighting baked in. This is
| great because it looks amazing, it's 100% correct, far better
| than the lighting computed by any game engine or even movie-
| quality offline ray tracer. This is a big part of why they look
| so good! But it's also a curse. Games need to be interactive.
| When things move, lighting changes. Even something as simple as
| opening a door can have a profound effect on lighting. Anything
| that moves changes the lighting on itself _and_ everything
| around it. Let alone moving actual lights around, changing the
| time of day, etc.
|
| There's absolutely no way to change the baked-in lighting in
| one of these captures in a high quality way. I've seen several
| papers that attempt it and the results all suck. It's not the
| fault of the researchers, it's a very hard problem. There are
| two main issues:
|
| One, in order to perfectly re-light a scene you first have to
| de-light it, that is, compute the lighting-independent BRDF of
| every surface. The capture itself doesn't even contain enough
| information to do this in an unambiguous way. You can't know
| for sure how a surface would react under different lighting
| conditions than were present in the pictures that made up the
| original scan. Maybe in theory you can guess well enough in
| most cases and extrapolate, and AI can likely help a lot here,
| but in practice we are far away from good quality so far.
|
| Two, given the BRDF of all surfaces and a set of new lights,
| you have to apply the new lighting to the scene. Real-time
| solutions for lighting are _very_ approximate and won't be
| anywhere near the quality of the lighting in the original scan.
| So you'll lose some of that photorealistic quality when you do
| this, even if your BRDFs are perfect (they won't be). It will
| end up looking like regular game graphics instead of the
| picture-perfect scans you want. If you try to blend the new
| lighting with the original lighting, the boundaries will
| probably be obvious. You're competing with perfection! Even
| offline rendering would struggle to match the quality of the
| baked-in lighting in these captures.
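|
| To make those two steps concrete, here is the idealized version
| under a purely Lambertian, shadow-free assumption (the function
| names are mine, and even baked_irradiance is something you would
| have to estimate; real captures break these assumptions, which
| is exactly why the problem is hard):
|
|     import numpy as np
|
|     def delight(observed_rgb, baked_irradiance):
|         # step one: recover an approximate diffuse albedo from
|         # the color that was captured with lighting baked in
|         return observed_rgb / np.maximum(baked_irradiance, 1e-6)
|
|     def relight(albedo, new_irradiance):
|         # step two: apply the new lighting to that albedo
|         return np.clip(albedo * new_irradiance, 0.0, 1.0)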
|
| To me the ultimate solution needs to involve AI. Analytically
| relighting everything perfectly is infeasible, but AI can
| likely do approximate lighting that looks more plausible in
| most cases, especially when trying to match captured natural
| lighting. I'm not sure exactly how it will work, but AI is
| already being used in rendering and its use will only increase.
| rallyforthesun wrote:
| Thanks for pointing out the challenges with Gaussian splatting.
| Are there any AI-based relighting methods out there? Some
| prompt-based editing like nerf2nerf or language-embedded NeRFs,
| maybe?
| esperent wrote:
| You've elucidated very clearly an issue that I've been
| thinking about since the very first time I saw gaussian
| splats. The best idea I've had (besides "AI magic") is
| something like pre-calculating at least two different
| lighting states, e.g. door open and door closed, or midday
| and evening, and then blending between them.
|
| Do you know if anyone has tried this? Or otherwise, what're
| the best current attempts at solving it?
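|
| For what it's worth, the simplest version of that blend (and
| this assumes the two captures were trained so that splat i in
| state A corresponds to splat i in state B, which is itself a
| hard constraint) would just interpolate each splat's color
| coefficients:
|
|     import numpy as np
|
|     def blend_lighting(sh_a, sh_b, t):
|         # t = 0 -> lighting state A, t = 1 -> lighting state B
|         return (1.0 - t) * sh_a + t * sh_b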
| gct wrote:
| Given they're indexing into a tree, animation will be a pain.
| Karliss wrote:
| Game physics often uses a separate mesh from the one used for
| rendering, or even a combination of primitive shapes, so it
| doesn't matter how the graphics part is rendered. There's no
| point wasting resources on details which don't affect gameplay,
| and having too much tiny collision geometry increases the chance
| of the player getting stuck or snagging on it.
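|
| In other words, the detailed splats and the physics proxy can
| just live side by side; a rough sketch of that split (the names
| are illustrative, not from any particular engine):
|
|     from dataclasses import dataclass, field
|
|     @dataclass
|     class BoxCollider:
|         center: tuple        # (x, y, z)
|         half_extents: tuple  # (hx, hy, hz)
|
|     @dataclass
|     class SceneObject:
|         # dense splats for rendering, coarse boxes for physics
|         splat_file: str
|         colliders: list = field(default_factory=list)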
| mrwyz wrote:
| Cool, but not touching this; no license and requires Inria's
| proprietary rasterizer.
|
| People should stop basing all of this new research on proprietary
| software, when we have open source implementations [1][2].
|
| [1] gsplat: https://github.com/nerfstudio-project/gsplat
| [2] opensplat: https://github.com/pierotofy/opensplat
| cubefox wrote:
| I'm surprised anything in 3D Gaussian splatting uses a
| rasterizer. I thought those were only used for polygonal data.
| VelesDude wrote:
| I mean technically rasterization means taking any vector data
| and plotting it in a 2D space... so I guess it is correct.
|
| But yes, I know what you are getting at. This would normally
| be done via a software/shader pipeline rather than a GPU's
| polygonal process.
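|
| The per-pixel core of such a splat rasterizer is small. A toy
| CPU sketch of the compositing step (real implementations like
| gsplat do this tiled, in CUDA), assuming the Gaussians have
| already been projected to 2D and depth-sorted:
|
|     import numpy as np
|
|     def composite_pixel(px, splats_2d):
|         # splats_2d: (mean2d, inv_cov2d, rgb, opacity) per splat,
|         # as numpy arrays, sorted front-to-back by depth
|         color = np.zeros(3)
|         transmittance = 1.0
|         for mean, inv_cov, rgb, opacity in splats_2d:
|             d = px - mean
|             # Gaussian falloff of this splat at the pixel
|             alpha = opacity * np.exp(-0.5 * d @ inv_cov @ d)
|             color += transmittance * alpha * rgb
|             transmittance *= 1.0 - alpha
|             if transmittance < 1e-4:
|                 break  # pixel is effectively opaque
|         return color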
| littlestymaar wrote:
| Gaussian splatting feels magical, and with 4D Gaussian splatting
| now being a thing, 3D movies that are actually 3D, and in which
| you can navigate, could be a reality in the coming years. (And I
| suspect the first use case will be porn, as usual.)
| datascienced wrote:
| Movies can become games as well.
| KaiserPro wrote:
| Sorry to be naive, but isn't this basically applying pointcloud
| decimation to achieve dynamic level of detail?
|
| Am I missing something or is there a new concept that doesn't
| exist in standard point cloud renderers?
___________________________________________________________________
(page generated 2024-05-01 23:02 UTC)