[HN Gopher] HyperNeRF
___________________________________________________________________
HyperNeRF
Author : montyanderson
Score : 276 points
Date : 2021-12-27 02:01 UTC (21 hours ago)
(HTM) web link (hypernerf.github.io)
(TXT) w3m dump (hypernerf.github.io)
| themodelplumber wrote:
| Wow. I only get the inner workings at a very basic, intuitive
| level, but it's really cool to see the progress of this and
| similar research. Congrats to the researchers.
|
| It's awe-inspiring and even frightening at first, in the usual
| ways, but IMO it has a lot of long-term promise in other ways.
|
| Spitballing: I like that this kind of result, which clearly calls
| into question the role or perception of physical identity, may
| eventually inform (or even necessitate?) the deconstruction of
| the physical "I" as a permission broker, and further open a many-
| to-many interface between the dimensions that underlie what we
| now think of as "self" and the true depth and variety within what
| we now think of as "individual humans who are not me". That
| opening process alone ought to be a huge jump for human
| development.
|
| Right now we're each held, and holding ourselves, way too
| responsible for maintaining a singular subjective identity,
| looking at the aggregate. Not only does this compromise our
| outlook on others based on our subjective perception of the
| identity match, but it also compromises our ability to reliably
| consume and metabolize identity-construct-breaking information
| and experiences. And many of those things, when consumed without
| so many identity borders--so to speak--will end up being
| incredibly useful for individuals and groups both.
|
| Thanks for sharing, OP.
| walleeee wrote:
| how in particular do you think this sort of work might inform
| interpretations of personal identity?
|
| what uses do you anticipate for a more [fluid? porous? plural?]
| self-regard?
| themodelplumber wrote:
| Hmm, that would go into specifics, which IMO are kind of
| tenuous from the start since the point of a spitball is to be
| open to unknowns.
|
| So with that said, some ideas could be started around topics
| like 1) massive identity theft causing a re-thinking of
| identity 2) creativity and constraints around the moderation
| of physical identity and 3) technical-presentational dynamics
| surrounding physical presence and the moderation of identity
| presentation in a physical presence context.
|
| Any one of these is a great setup for the question: How do we
| interpret personal identity?
|
| And this--again just IMO--would be an amazing point at which
| to say, "look, if the only word-tool we can use is 'identity'
| to describe this crisis/opportunity, then maybe all we have
| is a metaphorical hammer and we have all these endless
| annoying nails--in the form of identity questions--to hammer
| down. But if we had maybe some other word-tools to use
| instead of 'identity', maybe this really would look more like
| an opportunity to move humanity one more step up the
| evolutionary ladder."
|
| We already moderate our identity every day, either
| consciously or unconsciously. It's been studied for thousands
| of years. It's in books you've read, movies you've watched.
| It's been done for fun, for comic relief, and also it's been
| done to solve mind-shattering problems. But now we start to
| really unwind this question of physical identity, the one
| concrete thing we thought was so much more certain...! and
| things get _really_ interesting. This is a different level,
| where there's maybe not such a need to hide or hide from this
| departure from "this one idea of who I am" which is really
| just a mess of a complex of ideas.
|
| > what uses do you anticipate for a more [fluid? porous?
| plural?] self-regard?
|
| For one: More, and healthier, exposure to alternatives. Your
| identity is almost synonymous with your subjective past. To
| that degree, you're screwed in a lot of ways. To give a
| personal example, I was born into a cult. I was screwed from
| birth, in that way.
|
| One of the best tools I had in removing myself from that
| environment was the concept of an "online identity" which
| could be moderated, intentionally, into whatever it needed to
| be to help me explore alternative perceptions of what it was
| I was involved in. I could even try on a non-cult identity,
| and write, online, from the perspective of someone who had
| freed themselves. And then I could consider how that felt,
| and reflect on what I learned. Did it kill me? No. Am I in
| hell now? Nope. etc.
|
| Consider the millions of various points of identity just like
| that. Not just cults, no way! Am I Coke or Pepsi? eh, boring.
| Am I...which race am I? Is that a tricky question in the
| future? And from the outside, will I get better treatment
| from medical professionals if I can moderate my physical
| presentation at will? Wow so many random questions that can
| be asked for learning's sake.
|
| But again, to emphasize--I love and respect the unknown. I
| don't have answers, only openness where I don't want to have
| certainty anymore, because that c-word makes it a little too
| hard to solve big problems, or a little too easy to avoid
| them. Don't leave the cult man, you'll lose all your
| certainty.
|
| Hope that helps a little, or at least serves as an example.
| kingcharles wrote:
| I think the advent of DeepTomCruise [1] makes us rethink the
| solidity of identity. A majority of those watching the videos
| appear to believe it is the real Tom Cruise, and really,
| there is no good way to tell any longer whether it is or not,
| without reference to external information.
|
| There is no reason now that Tom Cruise even needs to exist as
| a real person, or needs to ever act in a movie ever again.
| Tom Cruise can just become an abstract concept, no longer a
| living object. Perhaps it is Tom Cruise himself in these
| videos. Perhaps the real Tom Cruise no longer exists. Perhaps
| the whole thing is an elaborate art project. Our certainty of
| its falsity is tied solely to whether we believe the story of
| those who claim to have created the videos. Is it easier to
| create fake videos of Tom Cruise or to create real videos of
| Tom Cruise and a fake story?
|
| What is a Tom Cruise?
|
| [1] https://www.tiktok.com/@deeptomcruise
| Namidairo wrote:
| The team photos double as the demo, that's neat. (Mouse over to
| see the depth colouring.)
|
| I presume something along these lines will make its way to the
| Pixel 6 camera software, given the origin of the research and the
| onboard edgeTPU block.
| ragmurugesan wrote:
| This is cool
| ragmurugesan wrote:
| Thanks for sharing
| fartcannon wrote:
| Is there a *Nerf to Blender pipeline?
| max_ wrote:
| Unfortunately, there is a lot of technical jargon here, and I
| don't understand much of it.
|
| Could someone help me outline what's most interesting here? Maybe
| applications?
| ibrarmalik wrote:
| If you have a limited number of images of the same scene, with
| NeRF you can generate new images from different positions and
| angles (novel view synthesis).
|
| But this only works with rigid scenes: e.g. if you apply NeRF
| to images of a person, they cannot move between the pictures.
|
| This is what HyperNeRF tries to solve. If there are pictures
| of a person, smiling in one but not in another, 1. this method
| will not fail, and 2. it looks like it will still give
| reasonable new views/images.
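| Roughly, the rendering machinery behind NeRF can be sketched in a
| few lines: sample points along each camera ray, ask the network
| for a density and a colour at each point, and alpha-composite
| them into a pixel. A toy sketch (the MLP outputs are faked as
| constants here):

```python
import math

def composite_ray(sigmas, colors, deltas):
    """Alpha-composite densities/colours sampled along one camera ray
    into a single pixel colour (the volume-rendering step in NeRF)."""
    rgb = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still reaching this sample
    for sigma, color, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this sample
        weight = transmittance * alpha
        for c in range(3):
            rgb[c] += weight * color[c]
        transmittance *= 1.0 - alpha
    return rgb

# Toy ray with 4 samples; the densities and colours stand in for the
# outputs of the (here hypothetical) trained MLP.
sigmas = [0.0, 0.5, 2.0, 0.1]
colors = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
deltas = [0.25] * 4
print(composite_ray(sigmas, colors, deltas))
```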
| visarga wrote:
| NeRFs were invented in 2020 and started an avalanche of papers
| (637 citations so far).
|
| > Original one: "Nerf, Representing scenes as neural radiance
| fields for view synthesis"
|
| > https://scholar.google.com/scholar?cites=9378169911033868166...
|
| What I find most exciting about it is that a NeRF represents a
| scene as a neural net, one net per scene (in the OP paper
| generalised to scene + deformations). Evaluating the net at a
| given coordinate yields the colour and density there.
|
| Up until now learning to replicate the input exactly was called
| overfitting and considered a bug, not a feature, but they showed
| a completely new way to wield neural nets.
|
| An interesting detail is that they depend on Fourier encoding for
| the input coordinates. A variant called SIREN uses `sin` as
| activation function throughout the net.
|
| Maybe neural nets will become the data compressors of tomorrow?
| Shoot a picture, send a neural net around. Game assets could be
| NeRFs.
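| The Fourier encoding is easy to sketch; the band count and
| frequency scaling below are illustrative rather than the paper's
| exact scheme:

```python
import math

def fourier_encode(x, num_bands=6):
    """NeRF-style positional encoding of one scalar coordinate:
    [sin(2^k * pi * x), cos(2^k * pi * x)] for k = 0..num_bands-1.
    Without this, a plain MLP struggles to fit high-frequency detail."""
    feats = []
    for k in range(num_bands):
        freq = (2.0 ** k) * math.pi
        feats.append(math.sin(freq * x))
        feats.append(math.cos(freq * x))
    return feats

# A 3-D point becomes a 3 * 2 * num_bands = 36 element feature vector
# that is fed to the MLP instead of the raw coordinates.
point = (0.1, -0.4, 0.7)
encoded = [f for coord in point for f in fourier_encode(coord)]
print(len(encoded))  # 36
```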
| sbierwagen wrote:
| >Maybe neural nets will become the data compressors of
| tomorrow?
|
| Long a topic of research, with many interesting ramifications:
| https://ai.googleblog.com/2016/09/image-compression-with-neu...
| In analogy to standard compression techniques, you can think of
| the neural net as a very large "dictionary".
|
| For text compression, there is of course the famous Hutter
| Prize, launched in 2006:
| https://en.wikipedia.org/wiki/Hutter_Prize ("Prediction is the
| golden key that opens all locks". Compressing each byte of
| Wikipedia text is equivalent to _predicting_ it -- to compactly
| represent its knowledge is to understand it.)
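| That equivalence is easy to make concrete: a model that assigns
| probability p to the next symbol can spend about -log2(p) bits on
| it via arithmetic coding. A toy sketch with made-up models:

```python
import math

def ideal_code_length_bits(text, predict):
    """Shannon bound: a model assigning probability p to the next
    symbol can code it in -log2(p) bits (e.g. via arithmetic coding),
    so better prediction means a smaller compressed size."""
    bits = 0.0
    for i, ch in enumerate(text):
        bits += -math.log2(predict(text[:i], ch))
    return bits

# Two made-up models over a 27-symbol alphabet (a-z plus space):
# uniform, and one that has "learned" that 'e' is common.
uniform = lambda ctx, ch: 1 / 27
skewed = lambda ctx, ch: 0.5 if ch == 'e' else 0.5 / 26

text = "the bee sees the tree"
print(ideal_code_length_bits(text, uniform))  # ~99.9 bits
print(ideal_code_length_bits(text, skewed))   # ~82 bits: predicts better
```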
| vintermann wrote:
| 1. If the first byte is 0, insert the text of Wikipedia, 2.
| If it isn't, ignore it, and all further bytes are interpreted
| literally.
|
| To avoid this sort of "joke" decompressor, the compression
| competitions score entries on (size of compressed data) + (size
| of decompressor), last time I checked. That means we won't get
| a winner based on GPT-3 anytime soon. 350+ GB of weights is a
| lot to overcome :)
|
| Though of course, given enough data to compress, it might
| well be that full-on neural language models are still worth
| it.
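| Illustrative numbers only, but they show why: under that scoring
| rule, even a much better compression ratio can't pay for 350 GB
| of weights on a ~1 GB corpus.

```python
# Hutter-Prize-style scoring with illustrative (not real) numbers:
# what counts is compressed size PLUS the size of the decompressor.
def score(compressed_bytes, decompressor_bytes):
    return compressed_bytes + decompressor_bytes

corpus = 10**9                                 # ~1 GB of Wikipedia text
classic = score(corpus * 0.115, 10**5)         # modest ratio, tiny program
gpt3ish = score(corpus * 0.05, 350 * 10**9)    # great ratio, 350 GB model
print(classic < gpt3ish)  # True: the weights dominate the score
```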
| [deleted]
| visarga wrote:
| > That means we won't get a winner based on GPT3 anytime
| soon. 350+ GB of weights is a lot to overcome :)
|
| You're right, and therein lies the crucial difference
| between compression and language modelling. Models are
| concerned with having a good representation for both past
| and future distributions, while compressors only care about
| the past. Models support many tasks while compressors are
| just for input replication.
| injidup wrote:
| I don't think it is unreasonable that both sender and
| receiver have the 350GB NN at each end. If for example we
| are talking video conferencing and compression of video
| data then only a small amount of data need to be
| transmitted real time and at each end a high fidelity image
| can be reconstructed.
| orlp wrote:
| > What I find most exciting about it is that a NeRF represents
| images as neural nets, one neural net for each image (in the OP
| paper generalised to image + deformations). By evaluating the
| net at various pixel coordinates it gives the color.
|
| I actually had a similar idea and used it to write a lossy
| image compressor for my bachelor thesis in 2018:
| https://theses.liacs.nl/pdf/2018-2019-PetersO.pdf
| vintermann wrote:
| > Up until now learning to replicate the input exactly was
| called overfitting and considered a bug, not a feature, but
| they showed a completely new way to wield neural nets.
|
| Well, for that particular thing there was a predecessor of sorts
| in Deep Image Priors from 2017.
|
| https://arxiv.org/abs/1711.10925
|
| That was all about overfitting a neural net on a single image,
| which they used to get impressive inpainting, noise removal and
| superresolution results without any training at all (though of
| course it did not beat state of the art training-based
| approaches, even then.)
|
| I had a lot of fun playing around with it when it came. The
| idea is dead simple, within reach to implement yourself with no
| complex mathematical understanding.
| productceo wrote:
| Very interesting. Highly useful for metaverse avatars.
| qwertox wrote:
| Or YouTube video thumbnails, from the looks of it.
| fxtentacle wrote:
| In my opinion, everything NeRF-related gets a lot of attention
| because it's highly graphical and thus easy to present. But
| there are few practical applications, and it tends to be super
| slow and not to work for more challenging scenes where
| traditional 20-year-old methods like global-penalty block
| matching still work reasonably well.
|
| And for this paper in particular, I fail to see how they improve
| over other NeRF approaches with deformation terms, like Nerfies
| or D-NeRF.
| dougabug wrote:
| Deformation fields would struggle to fundamentally change the
| topological type, particularly where the transformation would
| need to "tear" the manifold, such as turning a sphere into a
| donut or dividing a cell into two child cells. HyperNeRF
| exploits and extends level set methods, which are rooted in
| Morse theory.
|
| https://en.wikipedia.org/wiki/Level-set_method
|
| https://en.wikipedia.org/wiki/Morse_theory
| sanxiyn wrote:
| Since neural networks are universal, we can often solve problems
| we don't know exactly how to solve with neural networks. NeRF is
| a great example. But once solved, we should try to reverse
| engineer the solution and optimize.
|
| https://arxiv.org/abs/2112.05131 did such work and found you can
| get NeRF quality without any neural network at all and as a
| result 100x faster.
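| The core lookup there is just trilinear interpolation into a
| voxel grid; a toy sketch with a small dense grid (the real thing
| is sparse and stores spherical-harmonic coefficients):

```python
def trilinear(grid, x, y, z):
    """Value at a continuous point in a voxel grid, by trilinearly
    interpolating the 8 surrounding corners -- the kind of direct,
    differentiable lookup Plenoxels use in place of an MLP."""
    x0, y0, z0 = int(x), int(y), int(z)
    fx, fy, fz = x - x0, y - y0, z - z0
    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = ((fx if dx else 1 - fx) *
                          (fy if dy else 1 - fy) *
                          (fz if dz else 1 - fz))
                value += weight * grid[x0 + dx][y0 + dy][z0 + dz]
    return value

# Toy 2x2x2 grid: all zeros except one corner.
grid = [[[0, 0], [0, 0]], [[0, 0], [0, 1]]]
print(trilinear(grid, 0.5, 0.5, 0.5))  # 0.125
```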
| echelon wrote:
| > But once solved, we should try to reverse engineer the
| solution and optimize.
|
| This is a great point.
|
| Does anybody know of any papers that explore non-neural text to
| speech / voice synthesis that achieve better than parametric
| quality?
| macawfish wrote:
| Awesome! I'd love to see someone do this with light field
| networks: https://www.vincentsitzmann.com/lfns/
|
| By the way, in the plenoxel video they say "the key component
| is the differentiable volumetric rendering, not the neural
| network".
|
| After watching the fractal community struggle to come up with
| good distance estimators for years, this makes so much sense.
| In the end, automatic differentiation came out to be one of the
| most solid methods for coming up with distance estimators for
| an arbitrary fractal formula.
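| To make that concrete for the Mandelbrot map: propagate the
| derivative alongside the iterate via the chain rule (forward-mode
| differentiation, done by hand here rather than with an autodiff
| library) and read off the classic exterior distance estimate
| |z| * ln|z| / |dz/dc|. Iteration count and bailout are arbitrary:

```python
import math

def mandelbrot_de(c, max_iter=100, bailout=1e6):
    """Exterior distance estimate for the Mandelbrot set at point c,
    using the derivative of z -> z^2 + c with respect to c."""
    z, dz = 0 + 0j, 0 + 0j
    for _ in range(max_iter):
        dz = 2 * z * dz + 1       # chain rule: d/dc (z^2 + c)
        z = z * z + c
        if abs(z) > bailout:
            return abs(z) * math.log(abs(z)) / abs(dz)
    return 0.0                    # never escaped: (likely) inside the set

print(mandelbrot_de(-2.5 + 0j))   # a point well outside the set
print(mandelbrot_de(0 + 0j))      # 0.0: the origin is in the set
```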
| orbital-decay wrote:
| NeRF triggered the development of a number of improved methods,
| though. There's DONeRF [1] that is built upon NNs and is
| currently faster than comparable solutions (the field evolves
| fast so I may be wrong).
|
| [1] https://github.com/facebookresearch/DONERF
| macawfish wrote:
| Yeah there are dozens of spinoffs inspired by NeRF, it's kind
| of amazing
| exikyut wrote:
| Demo videos and overview: https://alexyu.net/plenoxels/
| dougabug wrote:
| Except the plenoxel representation is also 2-3 orders of
| magnitude larger than an MLP-based NeRF. It's not very
| surprising that a sparse voxel representation can capture a
| plenoptic function. Representing volumetric video further
| compounds the size disadvantages of voxel-based techniques.
|
| The reasons for using deep learning function approximators are
| manifold, for instance in RL, state and state-action spaces
| quickly become too large for tabular methods. Using grids or
| tables also basically closes off the opportunities for
| exploiting meta-learning and analysis-by-synthesis.
|
| Plenoxels also rely on explicit specification of a known grid
| structure, where the HyperNeRF method can learn latent
| parametric manifolds and handle dynamic objects with changing
| topologies.
| 3dthrow wrote:
| An MLP-based NeRF actually has a comparable number of
| parameters to plenoxels (it's not 2-3 orders of magnitude
| smaller). The original NeRF is 8 dense layers with 256
| channels and then this one adds another network with 6 dense
| layers with 64 channels, so roughly speaking 7x256x256 +
| 5x64x64. And remember that the voxel grid is sparse, though
| we don't get exact numbers here. We shouldn't be miserly with
| our megabytes in 2021. What concerns me is how HyperNeRF
| requires 64 hours of training time on 4 TPU v4s; if you want
| to use this for communication or entertainment, it's light-
| years away from interactive.
|
| Extending plenoxels to support dynamic objects would be great
| future work.
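| Back of the envelope, that weight count really is tiny by 2021
| standards:

```python
# Rough check of the parameter counts quoted above
# (biases and the small input/output layers ignored):
nerf_mlp = 7 * 256 * 256   # 8 dense layers of width 256 -> 7 weight matrices
extra_mlp = 5 * 64 * 64    # 6 dense layers of width 64 -> 5 weight matrices
params = nerf_mlp + extra_mlp
print(params)              # 479232 weights
print(params * 4 / 2**20)  # ~1.8 MiB at float32
```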
| bloopernova wrote:
| Is this computationally "simple" (efficient?) enough that it could
| be used to create better interactive graphics in things like
| games?
|
| Or am I understanding this incorrectly? Sorry, this feels very
| much beyond me.
| Geee wrote:
| It seems that they're not real-time renderable. There could be
| a method of converting these to polygons + materials for real-
| time rendering.
| visarga wrote:
| I see two uses:
|
| - as a storage format for 3d models - it could become a method
| to package parametrised game assets, so you can easily
| customise them.
|
| - as a representation for 3d inputs in spatial reasoning tasks,
| like protein folding and self driving
| dclowd9901 wrote:
| I was wondering about this. So it _is_ in the photogrammetry
| space? How soon before I can use my iPhone to take 6 pictures
| or something and get a perfect 3D model from it?
| Stevvo wrote:
| What a time to be alive!
| daenz wrote:
| For those of you who may not get this reference, it's from a
| popular YouTube channel, Two Minute Papers, run by Dr. Károly
| Zsolnai-Fehér, which recently featured OP's link:
| https://www.youtube.com/c/K%C3%A1rolyZsolnai/videos
|
| The channel is excellent and I recommend subscribing to it if
| you like this kind of stuff.
| Tarks wrote:
| Which is very likely referencing this:
| https://youtu.be/qu32fBkiHFE
| jcims wrote:
| What's the difference between this and using photogrammetry to
| build a 3D model/depthmap and painting it with the images?
| zcw100 wrote:
| Better inpainting, and the 3D object can move.
| Geee wrote:
| These contain all the light field information of the scene. See
| the reflections, refractions etc.
| twoodfin wrote:
| In the right circumstances, the NFL would spend $10M-$100M on this
| or similar tech. It's wild to think about how much of what we see
| on screens in a decade or so will be "computationally inferred".
| themodelplumber wrote:
| I'm sure there are also people who would love to
| computationally infer the preferred ending to an NFL game of
| their choice, too. Or change the ending to a movie of which
| they can only tolerate the first hour.
|
| It would enable some really cool ideation and modeling, maybe
| even some of which could be used for psychology work, or sports
| psychology in the case of the NFL (I'm reminded of those
| "imagine yourself winning" tricks)
| dlightlo wrote:
___________________________________________________________________
(page generated 2021-12-27 23:02 UTC)