[HN Gopher] Gaussian Head Avatar: Ultra High-Fidelity Head Avata...
       ___________________________________________________________________
        
       Gaussian Head Avatar: Ultra High-Fidelity Head Avatar via Dynamic
       Gaussians
        
       Author : phil9l
       Score  : 150 points
       Date   : 2023-12-08 09:38 UTC (13 hours ago)
        
 (HTM) web link (yuelangx.github.io)
 (TXT) w3m dump (yuelangx.github.io)
        
       | cubefox wrote:
       | "represented by controllable 3D Gaussians"
       | 
       | They just assume everyone knows what they mean by "3D
       | Gaussian".
        
         | drsopp wrote:
         | No, they assume their peers do.
        
           | vmfunction wrote:
           | It is academia ;-)
        
         | GaggiX wrote:
         | I don't think a research paper is meant to be understood by
         | everyone, and I imagine the authors don't have that expectation
         | either.
        
         | data-ottawa wrote:
         | I actually have the same gripe. Unfortunately, there is a long
         | history of academic fields naming things like this (there's an
         | entire Wikipedia page of things named after Gauss/"Gaussian":
         | https://en.wikipedia.org/wiki/List_of_things_named_after_Car...)
        
           | blovescoffee wrote:
           | Not really. There are many things named after Gauss but a
           | "Gaussian" is almost always meant to be a probability
           | distribution / density function that is very well understood
           | and defined (and common)
        
         | eigenvalue wrote:
         | Take a linear algebra course or read a textbook before trying
         | to read and understand cutting edge ML research!
        
           | cubefox wrote:
           | Rude!
        
         | ndriscoll wrote:
         | Seems like they just mean the vector version of a Gaussian
         | function: f(r) = exp(-r*r). Basically a "bell curve" except in
         | 3D so it's a ball that's dense at the center and dies off. Then
         | the optimizer might learn to produce an intensity, offset, and
         | width for each point in a cloud, so the A,B,C for A*f((r-B)/C)
         | at each point or something.
        
           | blovescoffee wrote:
           | This is in fact what the optimizer does. At least in the
           | original paper, the model learns to skew and rotate the
           | gaussians.
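
Editor's sketch of the two comments above, not code from the paper. The
first function is ndriscoll's A*f((r-B)/C) with f(r) = exp(-r*r); the
second adds the per-axis widths and rotation that blovescoffee mentions.
All function and parameter names are hypothetical, and the exp(-r*r)
convention follows the comment rather than any particular paper (a
probability density would normally carry an extra factor of 1/2 in the
exponent).

```python
import math


def isotropic_gaussian(point, center, width, intensity=1.0):
    """A * f((r - B) / C) with f(r) = exp(-r.r): A=intensity, B=center, C=width."""
    r2 = sum(((p - c) / width) ** 2 for p, c in zip(point, center))
    return intensity * math.exp(-r2)


def rotation_z(theta):
    """3x3 rotation matrix about the z axis (columns are the rotated axes)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def anisotropic_gaussian(point, center, rotation, scales, intensity=1.0):
    """Same falloff, but with a separate width per axis of a rotated frame."""
    d = [p - c for p, c in zip(point, center)]
    # Express the offset in the Gaussian's local frame: local = R^T d.
    local = [sum(rotation[i][j] * d[i] for i in range(3)) for j in range(3)]
    r2 = sum((l / s) ** 2 for l, s in zip(local, scales))
    return intensity * math.exp(-r2)


if __name__ == "__main__":
    # A blob stretched to width 2 along its local x axis, then turned 90
    # degrees about z so that the long axis points along world y.
    blob = dict(center=(0.0, 0.0, 0.0),
                rotation=rotation_z(math.pi / 2),
                scales=(2.0, 1.0, 1.0))
    print(anisotropic_gaussian((0.0, 2.0, 0.0), **blob))  # along the long axis
    print(anisotropic_gaussian((2.0, 0.0, 0.0), **blob))  # along a short axis
```

An optimizer in this picture would adjust the center, rotation, scales,
and intensity of each blob in the cloud; with equal scales the second
function reduces to the first.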
        
       | PTOB wrote:
       | And now:
       | - the 4-hr Work Week toolkit is complete.
       | - deep-fakes are now just regular fakes
        
         | cloudking wrote:
         | When can we send our gaussian heads to the Zoom meeting?
        
       | randall wrote:
       | I didn't expect Gaussian splats to be so good at approximating
       | geometry. It's cool when you see a new foundational approach to
       | something that's been done a certain way for decades.
        
         | eurekin wrote:
         | Someone mentioned previously that Gaussian splats are an ideal
         | extension of AI image generation into 3D, and they might have
         | turned out to be correct.
        
           | kridsdale1 wrote:
           | They're basically voxels after we abandon Cartesian
           | 3-dimensionality (which imposes cubes)
        
             | porphyra wrote:
             | More like point clouds since they are not constrained to a
             | grid.
             | 
             | NERF is more like voxels though.
        
               | blovescoffee wrote:
               | What do you mean NERF is like voxels? It is very much NOT
               | like voxels
        
         | blovescoffee wrote:
         | Most of the time they aren't that good at approximating
         | geometry. They are good at approximating the "appearance" of
         | geometry. However, many regularization techniques and priors
         | can be introduced to make the Gaussian splatting technique
         | better at geometry approximation.
        
       | kthejoker2 wrote:
       | Stills look great, fidelity is there ...
       | 
       | The actual rotating avatar videos still have extremely poor
       | approximations of human musculature especially at the eyes and
       | jaw (bc these are hollow surface meshes naturally)
       | 
       | Is there research to overlay these models on more representative
       | facial muscles?
        
       | heliophobicdude wrote:
       | This is excellent! A similar paper from Meta was published two
       | days ago on Avatars too
       | 
       | https://shunsukesaito.github.io/rgca/
       | 
       | Here's the discussion for it (empty as of right now):
       | https://news.ycombinator.com/item?id=38554537
        
         | lelag wrote:
         | There was also this project that was posted a couple days ago:
         | https://blog.metaphysic.ai/controllable-deepfakes-with-gauss...
         | 
         | Why they call a virtual avatar a deepfake beats me though....
        
           | peterleiser wrote:
           | If you can "wear" the face of someone else then that seems
           | like deepfake territory.
        
           | yjftsjthsd-h wrote:
           | > Why they call a virtual avatar a deepfake beats me
           | though....
           | 
           | What's the difference? In both cases you're simulating
           | somebody's face in a way that doesn't actually require having
           | the original to drive it.
        
         | ChuckMcM wrote:
         | It is, I fully expect at some point we'll get "in game" MMORPG
         | player characters that are emotively very very effective. The
         | immersion level will be pretty intense at that point,
         | especially if it's VR but even a 3rd person view type like WoW
         | using a web cam to process your facial expression?
         | 
         | Another interesting application would be "zoom" meetings where
         | everyone is shown around the table or in the audience and it
         | processes their emotive state in real time. That could help
         | speakers engage with the audience in a better way.
         | 
         | Of course the "bad" uses of this tech are pretty out there too,
         | from porn apps, to ripping people off by getting a "zoom" from
         | a relative.
        
           | godelski wrote:
           | > Another interesting application would be "zoom" meetings
           | where everyone is shown around the table or in the audience
           | and it processes their emotive state in real time. That could
           | help speakers engage with the audience in a better way.
           | 
           | See Permutation City[0]. Another application is actually
           | masking reactions selectively. There are some interesting
           | aspects that play out with respect to this, and some other
           | interesting aspects of tech that people will now view as near
           | Sci-Fi. The VR identity cloning is a common Sci-Fi plot, not
           | specifically a Permutation City thing. Great book, highly
           | recommend.
           | 
           | [0] https://en.wikipedia.org/wiki/Permutation_City
        
         | muglug wrote:
         | I think the tech in that paper was demoed for this podcast:
         | https://youtu.be/MVYrJJNdrEg
         | 
         | The big roadblock to commercialisation for the moment is the
         | original capture -- for the paper they used a 110-camera
         | capture rig under ideal lighting conditions.
         | 
         | In the above podcast Zuckerberg mentions that in the future
         | people will be able to use their phones to do that same
         | capture, but I don't think that tech is coming next year.
         | 
         | I wonder if there'll be an interim period where people who want
         | high-quality avatars will have to book an appointment.
        
       | croes wrote:
       | We will get a whole new level of fake news.
       | 
       | Online meetings for important things are now insecure because you
       | can't be sure the other people are who they claim to be.
        
         | nuz wrote:
         | People can lie, people can send emails that look like they're
         | from your boss. People now know deepfakes are a thing and have
         | an immunity from trusting suspicious online meetings where your
         | boss acts different than they usually do. Etc etc. It's not as
         | big of a threat as people want to make it out to be
        
           | DeIlliad wrote:
           | People routinely fall for lies and get phished from those
           | emails that look like they're from your boss. Every year
           | there are a handful of high profile tech companies that get
           | hacked because someone you would think should know better
           | falls for a phishing scam. I think this is a bigger threat
           | than people are making it out to be.
        
           | croes wrote:
           | We will get "recordings" where people will plot the great
           | reset.
           | 
           | We already got fake audio of news anchors apologizing for
           | years of lies.
           | 
           | We will get a lot more of that.
        
         | jayd16 wrote:
         | Security for a video call is from the user account not visual
         | verification.
        
           | croes wrote:
           | Should be but people can easily be fooled.
        
           | sbarre wrote:
           | People get their company accounts compromised all the time.
           | 
           | It's one thing to get a poorly-worded email from your CFO
           | asking for company bank info, but it's a whole other thing to
           | be asked over a Zoom video call by who you think is the right
           | person, but it's a fake gaussian splat avatar.
           | 
           | There's already precedent for scammers doing similar things
           | using voice deepfaking over phone calls. This could be a
           | whole new level of phishing.
        
       | Philpax wrote:
       | Excited for the next generation of Personas for the Vision Pro,
       | etc :-)
        
       | gigel82 wrote:
       | Why would they even include a "Code" link if it's just an empty
       | GitHub repo with a README?
        
         | blovescoffee wrote:
         | It will get updated. This is pretty common in research papers.
         | They're just getting the link attached because it's easier to
         | update the repo than the paper.
        
       | planckscnst wrote:
       | The rotating avatars have some uncanniness to them due to the eye
       | gaze not fixating on a target as it moves around. I think slowly
       | rotating the camera around the model would have done better.
       | Perhaps also some background elements so it's clear that the
       | avatar isn't moving, the camera is.
       | 
       | I'm hoping this technique can be used in video games because it's
       | significantly better than what we have now.
        
       | WhereIsTheTruth wrote:
       | How can we trust photos now? Fiction or reality, it's becoming
       | harder to differentiate
        
         | gsuuon wrote:
         | I think that point has come and gone. Not sure how society will
         | adapt - I really hope smart people are out there working on
         | this sort of stuff.
        
       ___________________________________________________________________
       (page generated 2023-12-08 23:00 UTC)