[HN Gopher] HyperNeRF
       ___________________________________________________________________
        
       HyperNeRF
        
       Author : montyanderson
       Score  : 276 points
       Date   : 2021-12-27 02:01 UTC (21 hours ago)
        
 (HTM) web link (hypernerf.github.io)
 (TXT) w3m dump (hypernerf.github.io)
        
       | themodelplumber wrote:
       | Wow. I only get the inner workings at a very basic, intuitive
       | level, but it's really cool to see the progress of this and
       | similar research. Congrats to the researchers.
       | 
       | It's awe-inspiring and even frightening at first, in the usual
       | ways, but IMO it has a lot of long-term promise in other ways.
       | 
       | Spitballing: I like that this kind of result, which clearly calls
       | into question the role or perception of physical identity, may
       | eventually inform (or even necessitate?) the deconstruction of
       | the physical "I" as a permission broker, and further open a many-
        | to-many interface between the dimensions that underlie what we
       | now think of as "self" and the true depth and variety within what
       | we now think of as "individual humans who are not me". That
       | opening process alone ought to be a huge jump for human
       | development.
       | 
       | Right now we're each held, and holding ourselves, way too
       | responsible for maintaining a singular subjective identity,
       | looking at the aggregate. Not only does this compromise our
       | outlook on others based on our subjective perception of the
       | identity match, but it also compromises our ability to reliably
       | consume and metabolize identity-construct-breaking information
       | and experiences. And many of those things, when consumed without
        | so many identity borders--so to speak--will end up being
        | incredibly useful for individuals and groups both.
       | 
        | Thanks for sharing, OP.
        
         | walleeee wrote:
         | how in particular do you think this sort of work might inform
         | interpretations of personal identity?
         | 
         | what uses do you anticipate for a more [fluid? porous? plural?]
         | self-regard?
        
           | themodelplumber wrote:
           | Hmm, that would go into specifics, which IMO are kind of
           | tenuous from the start since the point of a spitball is to be
           | open to unknowns.
           | 
           | So with that said, some ideas could be started around topics
           | like 1) massive identity theft causing a re-thinking of
           | identity 2) creativity and constraints around the moderation
           | of physical identity and 3) technical-presentational dynamics
           | surrounding physical presence and the moderation of identity
           | presentation in a physical presence context.
           | 
           | Any one of these is a great setup for the question: How do we
           | interpret personal identity?
           | 
           | And this--again just IMO--would be an amazing point at which
           | to say, "look, if the only word-tool we can use is 'identity'
           | to describe this crisis/opportunity, then maybe all we have
           | is a metaphorical hammer and we have all these endless
           | annoying nails--in the form of identity questions--to hammer
           | down. But if we had maybe some other word-tools to use
           | instead of 'identity', maybe this really would look more like
           | an opportunity to move humanity one more step up the
           | evolutionary ladder."
           | 
           | We already moderate our identity every day, either
           | consciously or unconsciously. It's been studied for thousands
           | of years. It's in books you've read, movies you've watched.
           | It's been done for fun, for comic relief, and also it's been
           | done to solve mind-shattering problems. But now we start to
           | really unwind this question of physical identity, the one
           | concrete thing we thought was so much more certain...! and
           | things get _really_ interesting. This is a different level,
           | where there's maybe not such a need to hide or hide from this
           | departure from "this one idea of who I am" which is really
           | just a mess of a complex of ideas.
           | 
           | > what uses do you anticipate for a more [fluid? porous?
           | plural?] self-regard?
           | 
           | For one: More, and healthier, exposure to alternatives. Your
           | identity is almost synonymous with your subjective past. To
           | that degree, you're screwed in a lot of ways. To give a
           | personal example, I was born into a cult. I was screwed from
           | birth, in that way.
           | 
           | One of the best tools I had in removing myself from that
           | environment was the concept of an "online identity" which
           | could be moderated, intentionally, into whatever it needed to
           | be to help me explore alternative perceptions of what it was
           | I was involved in. I could even try on a non-cult identity,
           | and write, online, from the perspective of someone who had
           | freed themselves. And then I could consider how that felt,
           | and reflect on what I learned. Did it kill me? No. Am I in
           | hell now? Nope. etc.
           | 
           | Consider the millions of various points of identity just like
           | that. Not just cults, no way! Am I Coke or Pepsi? eh, boring.
           | Am I...which race am I? Is that a tricky question in the
           | future? And from the outside, will I get better treatment
           | from medical professionals if I can moderate my physical
           | presentation at will? Wow so many random questions that can
           | be asked for learning's sake.
           | 
           | But again, to emphasize--I love and respect the unknown. I
           | don't have answers, only openness where I don't want to have
           | certainty anymore, because that c-word makes it a little too
           | hard to solve big problems, or a little too easy to avoid
           | them. Don't leave the cult man, you'll lose all your
           | certainty.
           | 
            | Hope that helps a little, or at least gives some examples.
        
           | kingcharles wrote:
            | I think the advent of DeepTomCruise [1] makes us rethink the
           | solidity of identity. A majority of those watching the videos
           | appear to believe it is the real Tom Cruise, and really,
           | there is no good way to tell any longer whether it is or not,
           | without reference to external information.
           | 
            | There is no reason now that Tom Cruise even needs to
            | exist as a real person, or ever act in a movie again.
           | Tom Cruise can just become an abstract concept, no longer a
           | living object. Perhaps it is Tom Cruise himself in these
           | videos. Perhaps the real Tom Cruise no longer exists. Perhaps
           | the whole thing is an elaborate art project. Our certainty of
           | its falsity is tied solely to whether we believe the story of
           | those who claim to have created the videos. Is it easier to
           | create fake videos of Tom Cruise or to create real videos of
           | Tom Cruise and a fake story?
           | 
           | What is a Tom Cruise?
           | 
           | [1] https://www.tiktok.com/@deeptomcruise
        
       | Namidairo wrote:
       | The team photos double as the demo, that's neat. (Mouse over to
       | see the depth colouring.)
       | 
        | I presume something along these lines will make its way to the
       | Pixel 6 camera software, given the origin of the research and the
       | onboard edgeTPU block.
        
       | ragmurugesan wrote:
       | This is cool
        
       | ragmurugesan wrote:
       | Thanks for sharing
        
       | fartcannon wrote:
       | Is there a *Nerf to Blender pipeline?
        
       | max_ wrote:
        | Unfortunately, there is a lot of technical jargon. I don't
        | understand much of it.
       | 
       | Could someone help me outline what's most interesting here? Maybe
       | applications?
        
         | ibrarmalik wrote:
         | If you have a limited number of images of the same scene, with
         | NeRF you can generate new images from different positions and
         | angles (novel view synthesis).
         | 
         | But this only works with rigid scenes: e.g. if you apply NeRF
         | to images of a person, they cannot move between the pictures.
         | 
          | This is what HyperNeRF tries to solve: if there are
          | pictures of a person, smiling in one but not in another,
          | this method 1. will not fail, and 2. appears to give
          | reasonable new views/images.
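To make the idea concrete, here is a minimal, untrained sketch of the NeRF recipe (random weights, purely illustrative, not the paper's architecture): a small MLP maps a 3D point and view direction to color and density, and a pixel is rendered by alpha-compositing samples of that field along a ray.

```python
import numpy as np

# Stand-in for NeRF's trained MLP: maps (3D point, view direction)
# to (r, g, b, density). Weights are random here, for shape only.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(6, 64)) * 0.1
W2 = rng.normal(size=(64, 4)) * 0.1

def field(xyz, viewdir):
    h = np.tanh(np.concatenate([xyz, viewdir]) @ W1)
    out = h @ W2
    rgb = 1.0 / (1.0 + np.exp(-out[:3]))   # colors squashed into [0, 1]
    sigma = np.log1p(np.exp(out[3]))       # softplus: non-negative density
    return rgb, sigma

def render_ray(origin, direction, n_samples=32, t_near=0.0, t_far=4.0):
    """Classic NeRF volume rendering: alpha-composite samples along the ray."""
    ts = np.linspace(t_near, t_far, n_samples)
    dt = ts[1] - ts[0]
    color = np.zeros(3)
    transmittance = 1.0
    for t in ts:
        rgb, sigma = field(origin + t * direction, direction)
        alpha = 1.0 - np.exp(-sigma * dt)  # opacity of this segment
        color += transmittance * alpha * rgb
        transmittance *= 1.0 - alpha       # light surviving past the segment
    return color

pixel = render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]))
```

Training then consists of adjusting the MLP weights so that rendered pixels match the input photographs.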
        
       | visarga wrote:
        | NeRFs were invented in 2020 and started an avalanche of
        | papers (637 citations so far).
       | 
       | > Original one: "Nerf, Representing scenes as neural radiance
       | fields for view synthesis"
       | 
       | > https://scholar.google.com/scholar?cites=9378169911033868166...
       | 
        | What I find most exciting about it is that a NeRF represents
        | a scene as a neural net, one net per scene (in the OP paper,
        | generalised to scene + deformations). Evaluating the net at
        | given coordinates yields the color there.
       | 
       | Up until now learning to replicate the input exactly was called
       | overfitting and considered a bug, not a feature, but they showed
       | a completely new way to wield neural nets.
       | 
       | An interesting detail is that they depend on Fourier encoding for
       | the input coordinates. A variant called SIREN uses `sin` as
       | activation function throughout the net.
       | 
       | Maybe neural nets will become the data compressors of tomorrow?
       | Shoot a picture, send a neural net around. Game assets could be
       | NeRFs.
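The two tricks mentioned above can be sketched in a few lines (sizes and shapes are illustrative, not the papers' actual configurations): NeRF's Fourier encoding lifts each input coordinate to sin/cos features at geometrically increasing frequencies, while SIREN instead puts the sinusoid inside the network as its activation function.

```python
import numpy as np

def fourier_encode(x, n_freqs=6):
    """NeRF-style positional encoding: map each coordinate to
    sin/cos features at geometrically increasing frequencies."""
    freqs = (2.0 ** np.arange(n_freqs)) * np.pi
    angles = np.outer(x, freqs).ravel()   # len(x) * n_freqs angles
    return np.concatenate([np.sin(angles), np.cos(angles)])

enc = fourier_encode(np.array([0.25, -0.5, 1.0]))  # 3 coords -> 36 features

# A SIREN-style layer simply swaps the usual ReLU/tanh for sin:
rng = np.random.default_rng(1)
W = rng.normal(size=(enc.size, 16))
hidden = np.sin(enc @ W)                  # sinusoidal activation
```

Both tricks address the same problem: a plain MLP on raw coordinates struggles to represent high-frequency detail.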
        
         | sbierwagen wrote:
         | >Maybe neural nets will become the data compressors of
         | tomorrow?
         | 
         | Long a topic of research, with many interesting ramifications:
         | https://ai.googleblog.com/2016/09/image-compression-with-neu...
         | In analogy to standard compression techniques, you can think of
         | the neural net as a very large "dictionary".
         | 
         | For text compression, there is of course the famous Hutter
         | Prize, launched in 2006:
         | https://en.wikipedia.org/wiki/Hutter_Prize ("Prediction is the
         | golden key that opens all locks". Compressing each byte of
         | wikipedia text is equivalent to _predicting_ it-- to compactly
         | represent its knowledge is to understand it.)
        
           | vintermann wrote:
           | 1. If the first byte is 0, insert the text of Wikipedia, 2.
           | If it isn't, ignore it, and all further bytes are interpreted
           | literally.
           | 
            | To avoid this sort of "joke" decompressor, the
            | compression competitions score entries on (size of
            | compressed data) + (size of decompressor), last I
            | checked. That means we won't get a winner based on GPT-3
            | anytime soon. 350+ GB of weights is a lot to overcome :)
           | 
           | Though of course, given enough data to compress, it might
           | well be that full-on neural language models are still worth
           | it.
        
             | [deleted]
        
             | visarga wrote:
             | > That means we won't get a winner based on GPT3 anytime
             | soon. 350+ GB of weights is a lot to overcome :)
             | 
              | You're right, and therein lies the crucial difference
              | between compression and language modelling. Models are
             | concerned with having a good representation for both past
             | and future distributions, while compressors only care about
             | the past. Models support many tasks while compressors are
             | just for input replication.
        
             | injidup wrote:
              | I don't think it is unreasonable for both sender and
              | receiver to have the 350 GB NN at each end. If, for
              | example, we are talking about video conferencing, then
              | only a small amount of data needs to be transmitted in
              | real time, and a high-fidelity image can be
              | reconstructed at each end.
        
         | orlp wrote:
         | > What I find most exciting about it is that a NeRF represents
         | images as neural nets, one neural net for each image (in the OP
         | paper generalised to image + deformations). By evaluating the
         | net at various pixel coordinates it gives the color.
         | 
         | I actually had a similar idea and used it to write a lossy
         | image compressor for my bachelor thesis in 2018:
         | https://theses.liacs.nl/pdf/2018-2019-PetersO.pdf
        
         | vintermann wrote:
         | > Up until now learning to replicate the input exactly was
         | called overfitting and considered a bug, not a feature, but
         | they showed a completely new way to wield neural nets.
         | 
          | Well, for that particular thing there was a predecessor of
          | sorts in Deep Image Prior from 2017.
         | 
         | https://arxiv.org/abs/1711.10925
         | 
         | That was all about overfitting a neural net on a single image,
         | which they used to get impressive inpainting, noise removal and
         | superresolution results without any training at all (though of
         | course it did not beat state of the art training-based
         | approaches, even then.)
         | 
         | I had a lot of fun playing around with it when it came. The
         | idea is dead simple, within reach to implement yourself with no
         | complex mathematical understanding.
        
       | productceo wrote:
       | Very interesting. Highly useful for metaverse avatars.
        
         | qwertox wrote:
         | Or YouTube video thumbnails, from the looks of it.
        
       | fxtentacle wrote:
        | In my opinion, everything NeRF-related gets a lot of
        | attention because it's highly graphical and thus easy to
        | present. But there are few practical applications, and it
        | tends to be super slow and to fail on more challenging
        | scenes where traditional 20-year-old methods like global
        | penalty block matching still work reasonably well.
       | 
        | And for this paper in particular, I fail to see how they
        | improve over other NeRF approaches with deformation terms,
        | like Nerfies or D-NeRF.
        
         | dougabug wrote:
         | Deformation fields would struggle to fundamentally change the
         | topological type, particularly where the transformation would
         | need to "tear" the manifold, such as turning a sphere into a
          | donut or dividing a cell into two child cells. HyperNeRF
          | exploits and extends level-set methods, which are rooted
          | in Morse theory.
         | 
         | https://en.wikipedia.org/wiki/Level-set_method
         | 
         | https://en.wikipedia.org/wiki/Morse_theory
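The level-set idea can be illustrated in one dimension (a toy example, not HyperNeRF's actual formulation): keep a single fixed function in a higher-dimensional space and vary a slicing coordinate w; the number of connected components of the sliced shape, i.e. its topology, changes without any tearing of the underlying function.

```python
import numpy as np

# Toy level-set demo: the "shape" is the sublevel set {x : f(x, w) <= 0}.
# Varying the extra coordinate w changes the shape's topology smoothly.
def f(x, w):
    return (x**2 - 1) ** 2 - w

def count_components(w, xs=np.linspace(-2.0, 2.0, 4001)):
    """Count connected pieces of the sublevel set on a dense 1D grid."""
    inside = f(xs, w) <= 0
    starts = inside & ~np.roll(inside, 1)  # False -> True transitions
    starts[0] = inside[0]                  # don't wrap around the grid edge
    return int(starts.sum())

two_pieces = count_components(0.25)  # small w: two separate intervals
one_piece = count_components(4.0)    # large w: the intervals have merged
```

This is the sense in which slicing a fixed higher-dimensional template can represent a donut and a sphere, or one cell and two, with a single continuous representation.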
        
       | sanxiyn wrote:
       | Since neural networks are universal, we can often solve problems
       | we don't know exactly how to solve with neural networks. NeRF is
       | a great example. But once solved, we should try to reverse
       | engineer the solution and optimize.
       | 
       | https://arxiv.org/abs/2112.05131 did such work and found you can
       | get NeRF quality without any neural network at all and as a
       | result 100x faster.
        
         | echelon wrote:
         | > But once solved, we should try to reverse engineer the
         | solution and optimize.
         | 
         | This is a great point.
         | 
         | Does anybody know of any papers that explore non-neural text to
         | speech / voice synthesis that achieve better than parametric
         | quality?
        
         | macawfish wrote:
         | Awesome! I'd love to see someone do this with light field
         | networks: https://www.vincentsitzmann.com/lfns/
         | 
         | By the way, in the plenoxel video they say "the key component
         | is the differentiable volumetric rendering, not the neural
         | network".
         | 
         | After watching the fractal community struggle to come up with
         | good distance estimators for years, this makes so much sense.
          | In the end, automatic differentiation turned out to be one
          | of the most solid methods for deriving distance estimators
          | for an arbitrary fractal formula.
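As a concrete version of that distance-estimator remark, here is a standard sketch for the Mandelbrot set: iterate z together with its derivative with respect to c (the hand-rolled equivalent of automatic differentiation) and apply the usual escape-time distance bound. Constants are illustrative:

```python
import math

def mandelbrot_de(c, max_iter=100, escape=1e6):
    """Distance estimate for the Mandelbrot set via derivative tracking."""
    z, dz = 0j, 0j
    for _ in range(max_iter):
        dz = 2 * z * dz + 1        # d/dc of the iteration z -> z**2 + c
        z = z * z + c
        if abs(z) > escape:
            # classic lower-bound estimate: 0.5 * |z| * ln|z| / |dz|
            return 0.5 * abs(z) * math.log(abs(z)) / abs(dz)
    return 0.0                     # never escaped: treat as inside the set

outside = mandelbrot_de(2 + 0j)    # positive: a safe step size toward the set
inside = mandelbrot_de(0j)         # the origin never escapes
```

Autodiff generalizes this: instead of deriving the dz update by hand for each new formula, the derivative comes for free.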
        
         | orbital-decay wrote:
         | NeRF triggered the development of a number of improved methods,
         | though. There's DONeRF [1] that is built upon NNs and is
         | currently faster than comparable solutions (the field evolves
         | fast so I may be wrong).
         | 
         | [1] https://github.com/facebookresearch/DONERF
        
           | macawfish wrote:
           | Yeah there are dozens of spinoffs inspired by NeRF, it's kind
           | of amazing
        
         | exikyut wrote:
         | Demo videos and overview: https://alexyu.net/plenoxels/
        
         | dougabug wrote:
         | Except the plenoxel representation is also 2-3 orders of
         | magnitude larger than an MLP-based NeRF. It's not very
         | surprising that a sparse voxel representation can capture a
          | plenoptic function. Representing volumetric video further
          | compounds the size disadvantages of voxel-based
          | techniques.
         | 
         | The reasons for using deep learning function approximators are
         | manifold, for instance in RL, state and state-action spaces
         | quickly become too large for tabular methods. Using grids or
         | tables also basically closes off the opportunities for
         | exploiting meta-learning and analysis-by-synthesis.
         | 
         | Plenoxels also rely on explicit specification of a known grid
         | structure, where the HyperNeRF method can learn latent
         | parametric manifolds and handle dynamic objects with changing
         | topologies.
        
           | 3dthrow wrote:
           | An MLP-based NeRF actually has a comparable number of
           | parameters to plenoxels (it's not 2-3 orders of magnitude
           | smaller). The original NeRF is 8 dense layers with 256
           | channels and then this one adds another network with 6 dense
           | layers with 64 channels, so roughly speaking 7x256x256 +
           | 5x64x64. And remember that the voxel grid is sparse, though
           | we don't get exact numbers here. We shouldn't be miserly with
           | our megabytes in 2021. What concerns me is how HyperNeRF
           | requires 64 hours of training time on 4 TPU v4s; if you want
           | to use this for communication or entertainment, it's light-
           | years away from interactive.
           | 
           | Extending plenoxels to support dynamic objects would be great
           | future work.
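For reference, the rough parameter count quoted above works out as follows (hidden-layer weight matrices only, ignoring input/output layers and biases, as the parent comment counts them):

```python
# Back-of-envelope check of the figures quoted above (float32 weights).
nerf_mlp = 7 * 256 * 256   # "7x256x256": hidden-to-hidden matrices of the 8-layer MLP
extra_mlp = 5 * 64 * 64    # "5x64x64": the added 6-layer, 64-channel network
total = nerf_mlp + extra_mlp
megabytes = total * 4 / 1e6
print(total, "params,", round(megabytes, 1), "MB")
```

Under half a million parameters, i.e. about 2 MB, which supports the point that the MLP is compact but not orders of magnitude smaller than a sparse voxel grid.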
        
       | bloopernova wrote:
        | Is this computationally "simple" (efficient?) enough that it could
       | be used to create better interactive graphics in things like
       | games?
       | 
       | Or am I understanding this incorrectly? Sorry, this feels very
       | much beyond me.
        
         | Geee wrote:
         | It seems that they're not real-time renderable. There could be
         | a method of converting these to polygons + materials for real-
         | time rendering.
        
         | visarga wrote:
         | I see two uses:
         | 
         | - as a storage format for 3d models - it could become a method
         | to package parametrised game assets, so you can easily
         | customise them.
         | 
         | - as a representation for 3d inputs in spatial reasoning tasks,
         | like protein folding and self driving
        
           | dclowd9901 wrote:
            | I was wondering about this. So it _is_ in the photogrammetry
           | space? How soon before I can use my iPhone to take 6 pictures
           | or something and get a perfect 3D model from it?
        
       | Stevvo wrote:
       | What a time to be alive!
        
         | daenz wrote:
         | For those of you who may not get this reference, it's from a
         | popular Youtube channel, Two Minute Papers, run by Dr Karoly
         | Zsolnai-Feher that recently featured OP's link
         | https://www.youtube.com/c/K%C3%A1rolyZsolnai/videos
         | 
         | The channel is excellent and I recommend subscribing to it if
         | you like this kind of stuff.
        
           | Tarks wrote:
           | Which is very likely referencing this :
           | https://youtu.be/qu32fBkiHFE
        
       | jcims wrote:
       | What's the difference between this and using photogrammetry to
       | build a 3D model/depthmap and painting it with the images?
        
         | zcw100 wrote:
          | Better inpainting, and the 3D object can move.
        
         | Geee wrote:
         | These contain all the light field information of the scene. See
         | the reflections, refractions etc.
        
       | twoodfin wrote:
       | In the right circumstances, the NFL would spend $1E7-$1E8 on this
       | or similar tech. It's wild to think about how much of what we see
       | on screens in a decade or so will be "computationally inferred".
        
         | themodelplumber wrote:
         | I'm sure there are also people who would love to
         | computationally infer the preferred ending to an NFL game of
         | their choice, too. Or change the ending to a movie of which
         | they can only tolerate the first hour.
         | 
         | It would enable some really cool ideation and modeling, maybe
         | even some of which could be used for psychology work, or sports
         | psychology in the case of the NFL (I'm reminded of those
         | "imagine yourself winning" tricks)
        
       ___________________________________________________________________
       (page generated 2021-12-27 23:02 UTC)