[HN Gopher] SMERF: Streamable Memory Efficient Radiance Fields
___________________________________________________________________
SMERF: Streamable Memory Efficient Radiance Fields
We built SMERF, a new way to explore NeRFs in real time in your
web browser. Try it out yourself! Over the last few months, my
collaborators and I have put together a new, real-time method that
makes NeRF models accessible from smartphones, laptops, and low-
power desktops, and we think we've done a pretty stellar job!
SMERF, as we like to call it, distills a large, high-quality NeRF
into a real-time, streaming-ready representation that's easily
deployed to devices as small as a smartphone via the web browser.
On top of that, our models look great! SMERF is more accurate than
any previous real-time method. On large
multi-room scenes, SMERF renders are nearly indistinguishable from
state-of-the-art offline models like Zip-NeRF and a solid leap
ahead of other approaches. The best part: you can try it out
yourself! Check out our project website for demos and more. If you
have any questions or feedback, don't hesitate to reach out by
email (smerf@google.com) or Twitter (@duck).
Author : duckworthd
Score : 258 points
Date : 2023-12-13 19:03 UTC (3 hours ago)
(HTM) web link (smerf-3d.github.io)
(TXT) w3m dump (smerf-3d.github.io)
| sim7c00 wrote:
| this looks really amazing. i have a relatively old smartphone
| (2019) and it's really surprisingly smooth and high fidelity.
| amazing job!
| duckworthd wrote:
| Thank you :). I'm glad to hear it! Which model are you using?
| sim7c00 wrote:
| samsung galaxy s10e
| guywithabowtie wrote:
| Any plans to release the models ?
| duckworthd wrote:
| The pretrained models are already available online! Check out
| the "demo" section of the website. Your browser is fetching the
| model when you run the demo.
| ilaksh wrote:
| Will the code be released, or an API endpoint? Otherwise it
| will be impossible for us to use it for anything.. since it's
| Google I assume it will just end up in a black hole like most
| of the research.. or five years later some AI researchers
| leave and finally create a startup.
| zeusk wrote:
| Are radiance fields related to Gaussian splatting?
| duckworthd wrote:
| Gaussian Splatting is heavily inspired by work on radiance field
| (or NeRF) models. They use much of the same technology!
| corysama wrote:
| Similar inputs, similar outputs, different representation.
| aappleby wrote:
| Very impressive demo.
| duckworthd wrote:
| Thank you!
| refulgentis wrote:
| This is __really__ stunning work, and a huge, huge deal that I'm
| seeing this in a web browser on my phone. Congratulations!
|
| When I look at the NYC scene in the highest quality on desktop,
| I'm surprised by how low-quality e.g. the stuff on the counter and
| shelves is. So then I load the lego model, and see that's _very_
| detailed, so it doesn't seem inherent to the method.
|
| Is it a consequence of input photo quality, or something else?
| duckworthd wrote:
| > This is __really__ stunning work
|
| Thank you :)
|
| > Is it a consequence of input photo quality, or something
| else?
|
| It's more a consequence of spatial resolution: the bigger the
| space, the more voxels you need to maintain a fixed resolution
| (e.g. 1 mm^3). At some point, we have to give up spatial
| resolution to represent larger scenes.
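|
| As a rough back-of-the-envelope illustration (numbers of my own,
| not figures from the paper), the voxel count grows with the cube
| of the scene size at a fixed voxel pitch:
|
|   // Hypothetical estimate: voxels needed to cover a cubic scene
|   // at a fixed voxel edge length.
|   function voxelCount(sceneSideMeters: number,
|                       voxelSideMeters: number): number {
|     const perAxis = Math.ceil(sceneSideMeters / voxelSideMeters);
|     return perAxis ** 3;
|   }
|
|   // A 2 m tabletop scene vs. a 20 m multi-room scene at 1 mm:
|   console.log(voxelCount(2, 0.001));   // 8e9 voxels
|   console.log(voxelCount(20, 0.001));  // 8e12 voxels, 1000x more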
|
| A second limitation is the teacher model we're distilling. Zip-
| NeRF (https://jonbarron.info/zipnerf/) is good, but it's not
| _perfect_. SMERF reconstruction quality is upper-bounded by its
| Zip-NeRF teacher.
| jacoblambda wrote:
| Is there a relatively easy way to apply these kinds of techniques
| (either NeRFs or gaussian splats) to larger environments even if
| it's lower precision? Like say small towns/a few blocks worth of
| env.
| duckworthd wrote:
| In principle, there's no reason you can't fit multiple city
| blocks at the same time with Instant NGP on a regular desktop.
| The challenge is in estimating the camera and lens parameters
| over such a large space. I expect such a reconstruction to be
| quite fuzzy given the low spatial resolution.
| ibrarmalik wrote:
| You're under the right paper for doing this. Instead of one big
| model, they have several smaller ones for regions in the scene.
| This way rendering is fast for large scenes.
|
| This is similar to Block-NeRF [0]; on their project page they
| show some videos of what you're asking about.
|
| As for an easy way of doing this, nothing out-of-the-box. You
| can keep an eye on nerfstudio [1], and if you feel brave you
| could implement this paper and make a PR!
|
| [0] https://waymo.com/intl/es/research/block-nerf/
|
| [1] https://github.com/nerfstudio-project/nerfstudio
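|
| The core idea is easy to sketch (an illustrative TypeScript
| sketch, not the actual SMERF implementation): the scene is split
| into a grid of submodels, and the viewer picks which one to
| render from based on where the camera currently is.
|
|   type Vec3 = [number, number, number];
|
|   // Map a camera position to the index of the submodel whose
|   // grid cell contains it (clamping at the scene boundary).
|   function submodelIndex(cameraPos: Vec3, sceneMin: Vec3,
|                          cellSize: number, dims: Vec3): number {
|     const cell = cameraPos.map((p, i) => {
|       const c = Math.floor((p - sceneMin[i]) / cellSize);
|       return Math.min(Math.max(c, 0), dims[i] - 1);
|     });
|     // Flatten (x, y, z) cell coordinates into a single index.
|     return cell[0] + dims[0] * (cell[1] + dims[1] * cell[2]);
|   }
|
|   // Example: a 4x1x4 grid of submodels with 5 m cells.
|   console.log(submodelIndex([7.5, 1.6, 12.0], [0, 0, 0], 5,
|                             [4, 1, 4]));  // -> 9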
| barrkel wrote:
| The mirror on the wall of the bathroom in the Berlin location
| looks through to the kitchen in the next room. I guess the depth
| gauging algorithm uses parallax, and mirrors confuse it, seeming
| like windows. The kitchen has a blob of blurriness as the rear of
| the mirror intrudes into the kitchen, but you can see through the
| blurriness to either room.
|
| The effect is a bit spooky. I felt like a ghost going through
| walls.
| nightpool wrote:
| The refrigerator in the NYC scene has a very slick specular
| lighting effect based on the angle you're viewing it from, and
| if you go "into" the fridge you can see it's actually
| generating a whole 3d scene with blurry grey and white colors
| that turn out to precisely mimic the effects of the light from
| the windows bouncing off the metal, and you can look "out" from
| the fridge into the rest of the room. Same as the full-length
| mirror in the bedroom in the same scene--there's a whole
| virtual "mirror room" that's been built out behind the mirror
| to give the illusion of depth as you look through it. Very cool
| and unique consequence of the technology
| pavlov wrote:
| Wow, thanks for the tip. Fridge reflection world is so cool.
| Feels like something David Lynch might dream up.
|
| A girl is eating her morning cereal. Suddenly she looks
| apprehensively at the fridge. Camera dollies towards the
| appliance and seamlessly penetrates the reflective surface,
| revealing a deep hidden space that exactly matches the
| reflection. At the dark end of the tunnel, something stirs...
| A wildly grinning man takes a step forward and screams.
| daemonologist wrote:
| Neat! Here are some screenshots of the same phenomenon with
| the TV in Berlin: https://imgur.com/a/3zAA5K8
| TaylorAlexander wrote:
| Oh wow yeah. It's interesting because when I look at the
| fridge my eye maps that to "this is a reflective surface",
| which makes sense because that's true in the source images,
| but then it's actually rendered as a cavity with appropriate
| features rendered in 3D space. What's a strange feeling is to
| enter the fridge and then turn around! I just watched
| Hbomberguy's Patreon-only video on the video game Myst, and
| in Myst the characters are trapped in books. If you choose
| the wrong path at the end of the game you get trapped in a
| book, and the view from inside the book looks very
| similar to the view from inside the NYC fridge!
| deltaburnt wrote:
| Mirror worlds are a pretty common effect you'll see in NeRFs.
| Otherwise you would need a significantly more complex view-
| dependent feature rendered onto a flat surface.
| chpatrick wrote:
| This happens with any 3D reconstruction. It's because any
| mirror is indistinguishable from a window into a mirrored
| room. The tricky thing is if there's actually something
| behind the mirror as well.
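|
| A quick sketch of why the two are geometrically interchangeable
| (a toy example of my own, not from the paper): reflecting a point
| across the mirror plane gives exactly the "virtual" point the
| reconstruction can place behind the glass instead.
|
|   type Vec3 = [number, number, number];
|
|   // Reflect p across a plane with unit normal n and offset d,
|   // where points on the plane satisfy dot(n, x) + d = 0.
|   function reflect(p: Vec3, n: Vec3, d: number): Vec3 {
|     const dist = p[0] * n[0] + p[1] * n[1] + p[2] * n[2] + d;
|     return [p[0] - 2 * dist * n[0],
|             p[1] - 2 * dist * n[1],
|             p[2] - 2 * dist * n[2]];
|   }
|
|   // A cup 0.3 m in front of a mirror at x = 0 yields a virtual
|   // cup 0.3 m "behind" it -- which is what the model rebuilds.
|   console.log(reflect([0.3, 1.2, 0.5], [1, 0, 0], 0));
|   // -> [-0.3, 1.2, 0.5]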
| Zetobal wrote:
| It has exactly the same drawbacks as photogrammetry with regard
| to highly reflective surfaces.
| rzzzt wrote:
| You can also get inside the bookcase for the ultimate Matthew
| McConaughey experience.
| promiseofbeans wrote:
| It runs impressively well on my 2yo s21fe. It was super
| impressive how it streamed in more images as I explored the
| space. The tv reflections in the Berlin demo were super
| impressive.
|
| My one note is that it took a really long time to load all the
| images - the scene wouldn't render until all ~40 initial images
| loaded. Would it be possible to start partially rendering as the
| images arrive, or do you need to wait for all of them before you
| can do the first big render?
| duckworthd wrote:
| Pardon our dust: "images" is a bad name for what's being
| loaded. Past versions of this approach (MERF) stored feature
| vectors in PNG images. We replace them with binary arrays.
| Unfortunately, all such arrays need to be loaded before the
| first frame can be rendered.
|
| You do however point out one weakness of SMERF: large payload
| sizes. If we can figure out how to compress them by 10x, it'll
| be a very different experience!
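|
| In viewer terms, "wait for everything, then render" looks roughly
| like the sketch below (assumed names, not the actual SMERF viewer
| code): fetch every binary array, and only hand the buffers to the
| renderer once all of them are resident.
|
|   async function loadSceneAssets(
|       urls: string[]): Promise<ArrayBuffer[]> {
|     return Promise.all(urls.map(async (url) => {
|       const response = await fetch(url);
|       if (!response.ok) {
|         throw new Error(`Failed to fetch ${url}`);
|       }
|       // Raw bytes: no PNG decode step, unlike earlier MERF viewers.
|       return response.arrayBuffer();
|     }));
|   }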
| VikingCoder wrote:
| Wow. Some questions:
|
| Take for instance the fulllivingroom demo. (I prefer fps mode.)
|
| 1) How many images are input?
|
| 2) How long does it take to compute these models?
|
| 3) How long does it take to prepare these models for this
| browser, with all levels, etc?
|
| 4) Have you tried this in VR yet?
| vyrotek wrote:
| Not exactly what you asked for. But I recently came across this
| VR example using Gaussian Splatting instead. Exciting times.
|
| https://twitter.com/gracia_vr/status/1731731549886787634
|
| https://www.gracia.ai
| duckworthd wrote:
| Glad you liked our work!
|
| 1) Around 100-150 if memory serves. This scene is part of the
| mip-NeRF 360 benchmark, which you can download from the
| corresponding project website:
| https://jonbarron.info/mipnerf360/
|
| 2) Between 12 and 48 hours, depending on the scene. We train on
| 8x V100s or 16x A100s.
|
| 3) The time for preparing assets is included in 2). I don't
| have a breakdown for you, but it's something like 50/50.
|
| 4) Nope! A keen hacker might be able to do this themselves by
| editing the JavaScript code. Open your browser's DevTools and
| have a look -- the code is all there!
| dougmwne wrote:
| Do you need position data to go along with the photos or just
| the photos?
|
| For VR, there's going to be some very weird depth data from
| those reflections, but maybe they would not be so bad when
| you are in the headset.
| durag wrote:
| Any plans to do this in VR? I would love to try this.
| duckworthd wrote:
| Not at the moment but an intrepid hacker could surely extend
| our JavaScript code and put something together.
| blovescoffee wrote:
| Since you're here @author :) Do you mind giving a quick rundown
| on how this competes with the quality of zip-nerf?
| duckworthd wrote:
| Check out our explainer video for answers to this question and
| more! https://www.youtube.com/watch?v=zhO8iUBpnCc
| heliophobicdude wrote:
| Great work!!
|
| Question for the authors, are there opportunities, where they
| exist, to not use optimization or tuning methods for
| reconstructing a model of a scene?
|
| We are refining efficient ways of rendering a view of a scene
| from these models but the scenes remain static. The scenes also
| take a while to reconstruct too.
|
| Can we still achieve the great look and details of RF and GS
| without paying for an expensive reconstruction per instance of
| the scene?
|
| Are there ways of greedily reconstructing a scene with
| traditional CG methods into these new representations now that
| they are fast to render?
|
| Please forgive any misconceptions that I may have in advance! We
| really appreciate the work y'all are advancing!
| duckworthd wrote:
| > Are there opportunities, where they exist, to not use
| optimization or tuning methods for reconstructing a model of a
| scene?
|
| If you know a way, let me know! Every system I'm aware of
| involves optimization in one way or another, from COLMAP to 3D
| Gaussian Splatting to Instant NGP and more. Optimization is a
| powerful workhorse that gives us a far wider range of models
| than a direct solver ever could.
|
| > Can we still achieve the great look and details of RF and GS
| without paying for an expensive reconstruction per instance of
| the scene?
|
| In the future I hope so. We don't have a convincing way to
| generate 3D scenes yet, but given the progress in 2D, I think
| it's only a matter of time.
|
| > Are there ways of greedily reconstructing a scene with
| traditional CG methods into these new representations now that
| they are fast to render?
|
| Not that I'm aware of! If there were, I think those works would
| be on the front page instead of SMERF.
| annoyingnoob wrote:
| There is a market here for Realtors to upload pictures and
| produce walk-throughs of homes for sale.
| esafak wrote:
| https://matterport.com/
| ibrarmalik wrote:
| The Luma folks made something similar:
| https://apps.apple.com/app/luma-flythroughs/id6450376609?l=e...
| SubiculumCode wrote:
| I'm not sure why this demo runs so horribly in Firefox but not
| other browsers... anyone else having this?
| daemonologist wrote:
| Runs pretty well (20-100 fps depending on the scene) for me on
| both Firefox 120.1.1 on Android 14 (Pixel 7; smartphone preset)
| and Firefox 120.0.1 on Fedora 39 (R7 5800, 64 GB memory, RX
| 6600 XT; 1440p; desktop preset).
| SubiculumCode wrote:
| It seems that for some reason, my Firefox is stuck in the
| software compositor. I am getting:
|
| WebRender initialization failed: Blocklisted; failure code
| RcANGLE(no compositor device for EGLDisplay)(Create)_FIRST
|
| D3D11_COMPOSITING runtime failed: Failed to acquire a D3D11
| device. Blocklisted; failure code
| FEATURE_FAILURE_D3D11_DEVICE2
|
| I'm running a 3060
| jerpint wrote:
| Just ran this on my phone through a browser, this is very
| impressive
| duckworthd wrote:
| Thank you :)
| catskul2 wrote:
| When might we see this in consumer VR? I'm surprised we don't
| already, but I suspected it was a computation constraint.
|
| Does this relieve the computation constraint enough to run on
| Quest 2/3?
|
| Is there something else that would prevent binocular use?
| doctoboggan wrote:
| I recently got a new quest and I am wondering the same thing.
| The fact that this is currently running in a browser (and can
| run on a mobile device) gives me hope that we will see
| something like this in VR sooner rather than later.
| duckworthd wrote:
| I can't predict the future, but I imagine soon: all of the
| tools are there. The reason we didn't develop for VR is
| actually simpler than you'd think: we just don't have the
| developer time! At the end of the day, only a handful of people
| actively wrote code for this project.
| nox100 wrote:
| memory efficient? It downloaded 500 MB!
| bongodongobob wrote:
| A. Storage isn't memory
|
| B. That's hardly anything in 2023.
| duckworthd wrote:
| Right-o. The web viewer is swapping assets in and out of
| memory as the user explores the scene. The network and disk
| requirements are high, but memory usage is low.
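|
| A minimal sketch of that swapping pattern (illustrative only,
| with invented names; not the actual viewer code): keep only the
| submodels near the camera resident and drop the rest.
|
|   // Hypothetical cache: submodel index -> decoded asset buffers.
|   const resident = new Map<number, ArrayBuffer[]>();
|
|   async function ensureResident(
|       needed: number[],
|       fetchSubmodel: (i: number) => Promise<ArrayBuffer[]>) {
|     // Evict submodels the camera has moved away from.
|     for (const key of [...resident.keys()]) {
|       if (!needed.includes(key)) resident.delete(key);
|     }
|     // Fetch any newly needed submodels from the network.
|     for (const i of needed) {
|       if (!resident.has(i)) resident.set(i, await fetchSubmodel(i));
|     }
|   }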
| monlockandkey wrote:
| Get this on a VR headset and you have a game changer literally.
| modeless wrote:
| How long until you can stitch Street View into a seamless
| streaming NeRF of every street in the world? I hope that's the
| goal you're working towards!
| duckworthd wrote:
| ;)
| modeless wrote:
| Haha, too bad the Earth VR team was disbanded because that
| would be the Holy Grail. If someone can get the budget to
| work on that I'd be tempted to come back to Google just to
| help get it done! It's what I always wanted when I was
| building the first Earth VR demo...
| deelowe wrote:
| I read another article talking about what Waymo was working on
| and this looks oddly similar... My understanding is that the
| goal is to use this to reconstruct 3D models from Street View
| images in real time.
| yarg wrote:
| What I'm seeing from all of these things is very accurate single
| navigable 3D images.
|
| What I haven't seen anything of is feature and object detection,
| blocking and extraction.
|
| Hopefully a more efficient and streamable codec necessitates the
| sort of structure that lends itself more easily to analysis.
| fngjdflmdflg wrote:
| > Google DeepMind, Google Research, Google Inc.
|
| What a variety of groups! How did this come about?
| tomatotomato31 wrote:
| I'm following this through Two Minute Papers and I'm looking
| forward to using it.
|
| My grandpa died 2 years ago, and in hindsight I took pictures so
| I could use them in something like your demo.
|
| Awesome thanks:)
| duckworthd wrote:
| It would be my dream to make capturing 3D memories as easy and
| natural as taking a 2D photo with your smartphone today.
| Someday!
| twelfthnight wrote:
| Hope this doesn't come across as snarky, but does Google pressure
| researchers to do PR in their papers? This really is cool, but
| there is a lot of self-promotion in this paper and very little
| discussion of limitations (and the discussion of them is
| bookended by qualifications why they really aren't limitations).
|
| It makes it harder for me to trust the paper if I feel like the
| paper is trying to persuade me of something rather than describe
| the complete findings.
| tomatotomato31 wrote:
| People are not allowed to be proud of their work anymore?
| yieldcrv wrote:
| I had read about a competing technology that was suggesting
| NeRFs were a dead end
|
| but perhaps that was biased?
| rzzzt wrote:
| What kind of modes does the viewer cycle through when I press the
| space key?
___________________________________________________________________
(page generated 2023-12-13 23:00 UTC)