[HN Gopher] URAvatar: Universal Relightable Gaussian Codec Avatars
       ___________________________________________________________________
        
       URAvatar: Universal Relightable Gaussian Codec Avatars
        
       Author : mentalgear
       Score  : 95 points
       Date   : 2024-11-07 07:20 UTC (15 hours ago)
        
 (HTM) web link (junxuan-li.github.io)
 (TXT) w3m dump (junxuan-li.github.io)
        
       | chpatrick wrote:
       | Wow that looks pretty much solved! Is there code?
        
         | mentalgear wrote:
          | Unfortunately not yet. Also, code alone, without the
          | training data and weights, might still require considerable
          | effort. I also wonder how diverse their training data is,
          | i.e. how well the solution will generalize.
        
           | vessenes wrote:
           | I'll note that they had pretty good diversity in the test
           | subjects shown - weight, gender, some racial diversity. I
           | thought it was above average compared to many AI papers that
           | aren't specifically focused on diversity as a training goal
           | or metric. I'm curious to try this. Something tells me this
           | is more likely to get bought and turned into a product or an
           | offering than to be open sourced, though.
        
       | mentalgear wrote:
        | With the computational efficiency of Gaussian splats, this
        | could be ground-breaking for photorealistic avatars, possibly
        | driven by LLMs and generative audio.
        
       | michaelt wrote:
       | Those demo videos look great! Does anyone know how this compares
       | to the state of the art in generating realistic, relightable
       | models of things more broadly? For example, for video game
       | assets?
       | 
       | I'm aware of traditional techniques like photogrammetry - which
       | is neat, but the lighting always looks a bit off to me.
        
         | zitterbewegung wrote:
          | I don't do video game programming, but from what I've heard
          | about engines, lighting is controlled by the game engine and
          | is one step in the pipeline that renders the game. Ray
          | tracing is one such technique: it simulates light rays
          | between the light source and the 3D model to determine how
          | the model is lit.
          | 
          | They are probably rendering with a simple lighting model
          | here, since in a game the lighting would be handled by a
          | separate algorithm.
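The ray tracing idea described above (simulated rays between a light
source and a model) can be sketched in a few lines. This is a minimal
illustrative example, not engine code: a single unit-length ray, a
single sphere, and Lambertian diffuse shading; all names are made up.

```python
import math

def ray_sphere(origin, direction, center, radius):
    """Nearest positive hit distance of a unit-length ray with a sphere,
    or None if the ray misses."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * c  # direction assumed normalized, so a == 1
    if disc < 0.0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0.0 else None

def shade(hit_point, center, light_pos):
    """Lambertian diffuse term: brightness falls off with the angle
    between the surface normal and the direction to the light."""
    n = [p - c for p, c in zip(hit_point, center)]        # sphere normal
    l = [s - p for s, p in zip(light_pos, hit_point)]     # toward light
    nn = math.sqrt(sum(x * x for x in n))
    ll = math.sqrt(sum(x * x for x in l))
    return max(0.0, sum(a * b for a, b in zip(n, l)) / (nn * ll))
```

A ray fired straight at a unit sphere from five units away hits at
distance 4, and a light directly behind the camera lights that point at
full intensity.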
        
       | dwallin wrote:
       | Given the complete lack of any actual details about performance I
       | would hazard a guess that this approach is likely barely
       | realtime, requiring top hardware, and/or delivering an
       | unimpressive fps. I would love to get more details though.
        
         | ladberg wrote:
          | Gaussian splats can pretty much be rendered in any
          | off-the-shelf 3D engine with reasonable performance, and the
          | focus of the paper is generating the splats, so there's no
          | real reason for them to mention runtime details.
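For context, the core of off-the-shelf splat rendering is depth-sorted
alpha compositing of the projected Gaussians. Below is a minimal
single-pixel sketch of that accumulation step; the structure and names
are illustrative only and are not taken from the paper.

```python
def composite(splats):
    """Front-to-back alpha compositing along one pixel ray.

    splats: list of (depth, color, alpha) tuples, in any order.
    Returns the final (scalar) pixel color.
    """
    color = 0.0
    transmittance = 1.0  # fraction of light still passing through
    for depth, c, a in sorted(splats, key=lambda s: s[0]):  # near to far
        color += transmittance * a * c
        transmittance *= 1.0 - a
        if transmittance < 1e-4:  # early termination: pixel is opaque
            break
    return color
```

With a transparent near splat (color 0, alpha 0.5) in front of a bright
far splat (color 1, alpha 0.5), the far splat contributes only through
the remaining 0.5 transmittance, giving 0.25.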
        
           | dwallin wrote:
            | Relightable Gaussian Codec Avatars are very, very far from
            | your off-the-shelf splatting tech. It's fair to say that
            | this paper is more about generating the avatars more
            | efficiently, but in the original paper from the codec
            | avatars team (https://arxiv.org/pdf/2312.03704) they
            | required an A100 to run at just above 60fps at 1024x1024.
            | 
            | Nothing here seems to have moved that needle.
        
       | jy14898 wrote:
       | Interesting that under the "URAvatar from Phone Scan" section,
       | the first example shows a lady with blush/flush, which only
       | appears in the center video when viewed straight on - the other
       | angles remove this
        
       | petesergeant wrote:
        | This is great work, although the longer you look at the
        | avatars, and the more examples on the page you examine, the
        | more the wow factor drops off. The first example is
        | exceptional, but by the time you get down to the "More from
        | Phone Scan" video and look at any individual avatar, you find
        | yourself deep in the uncanny valley very quickly.
        
         | brk wrote:
          | I noticed that too. It also doesn't always seem to know how
          | to map (or remove) certain features outside the facial
          | region, like the hair bun in the input image, when
          | generating the avatars.
        
       ___________________________________________________________________
       (page generated 2024-11-07 23:01 UTC)