[HN Gopher] Shape of Motion: 4D Reconstruction from a Single Video
___________________________________________________________________
Shape of Motion: 4D Reconstruction from a Single Video
Author : lnyan
Score : 108 points
Date : 2024-07-20 18:41 UTC (3 days ago)
(HTM) web link (shape-of-motion.github.io)
(TXT) w3m dump (shape-of-motion.github.io)
| smusamashah wrote:
| I was wondering how they were getting depth from a video where
| the camera is still.
|
| > we utilize a comprehensive set of data-driven priors, including
| monocular depth maps
|
| > Our method relies on off-the-shelf methods, e.g., mono-depth
| estimation, which can be incorrect.
| Daub wrote:
| Actually, of the examples they showed, all but one clip
| featured both camera and in-camera motion. Granted not a lot of
| the former, but according to my non-expert opinion, maybe
| enough to construct a disparity map.
| was_a_dev wrote:
| I imagine having stereo video would also help generate a
| depth map from disparity?
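The disparity idea raised in the two comments above can be sketched as follows. This is a minimal illustration that assumes a rectified stereo pair and a pinhole camera model; it is not the paper's method, which per the quoted text relies on monocular depth priors. The function name and all numbers are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Return depth in metres for a rectified stereo pair: Z = f * B / d.

    disparity_px    -- horizontal pixel shift of a point between the two views
    focal_length_px -- focal length expressed in pixels
    baseline_m      -- distance between the two camera centres in metres
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; negative is invalid.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values: a 1000 px focal length and a 10 cm baseline.
near = depth_from_disparity(50.0, 1000.0, 0.1)  # large shift, close point
far = depth_from_disparity(5.0, 1000.0, 0.1)    # small shift, distant point
print(near, far)
```

Because depth is inversely proportional to disparity, a small camera motion (a short baseline) gives small disparities and therefore noisy depth for distant points.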
| yieldcrv wrote:
| One thing I liked about Team Ico (a studio behind the Shadow of
| the Colossus, Ico, Last Guardian video games) was how the player
| can move the camera just a little bit during automated sequences
|
| Getting that kind of look around in a video scene would be really
| engaging. A bit different than VR or watching in The Sphere, with
| the engagement being that there are still things right out of
| view you have to pan the camera for
| Daub wrote:
| I agree. I think that this is similar to the appeal of old-
| school stereoscopy.
| lancesells wrote:
| Haven't played the other games but Ico was incredible. It gave
| me the same feeling as Another World which was maybe 10 years
| prior.
| latexr wrote:
| > Getting that kind of look around in a video scene would be
| really engaging.
|
| It might be interesting for one or two movies specifically
| built around the feature, but otherwise it would be a gimmick
| no one would care for. For games, sure, but movies are a
| different experience.
| yieldcrv wrote:
| maybe, but for the last half decade nearly my entire social
| circle cannot sit still for movies and just won't go out of
| their way to do it anymore. I'm big into cinema, but they are
| not. Treating this like the fidget-spinning surrogate that a
| large portion of the population relies upon could potentially
| make it a hit for some viewing experiences. It's a thesis I
| would pursue, for money, at least.
| latexr wrote:
| This would make it unbearable to watch a movie with anyone
| else, so it doesn't really solve the social issue. But even
| if you're watching it alone, it only really makes sense if
| the movie itself takes advantage of it in some interesting
| way, which starts to get into game territory. It wouldn't
| even work for the most part: how do you deal with cuts and
| changes of scenery? It makes no sense in the context of a
| movie; what you're looking for is a game and we can already
| do that.
|
| Maybe you could have it work as a documentary (good luck
| getting a bored social group to go for that) or a virtual
| tour, but we already have 3D interactions of those too.
|
| We've had tons of movie viewing experiments and ultimately
| always go back to the tried and true 2D screen, with the
| bolder ideas being relegated mostly to the domain of theme
| park gimmicks. Which are interesting in their own right,
| but don't survive on their own.
| yieldcrv wrote:
| yes, my main use case would be for fidgeters solo, just
| like those Team ICO games
|
| > how do you deal with cuts and changes of scenery?
|
| the same way the games did it. by doing nothing special
| at all and retaining the same functionality. it really
| depends on how this 4D reconstruction works before I
| could say it uniquely adversely affects the experience
|
| for the most part what's interesting to me is that the
| overhead costs seem low enough not to care about random
| things big studios did at great expense with no way to
| justify the market appeal. it's either a portfolio piece
| or 1,000 monthly users supporting my lifestyle
| indefinitely.
| moritonal wrote:
| Out of curiosity, what is the difference between 4D and 6DoF (six
| degrees of freedom)? Sounds a lot like the 6DoF work that Lytro did back
| in 2012, although this obviously is coming at the problem from
| the other direction, generating it rather than capturing it.
| moralestapia wrote:
| Move in 3D space + rotate in 3D space, I think.
|
| But w/ time should it be 7?
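One way to picture the counting in the comment above, as a rough sketch: a camera pose has three translation and three rotation parameters (the six degrees of freedom), while time is a separate playback parameter that selects which state of the scene is viewed. The class and field names below are hypothetical and are not taken from the paper or from Lytro.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Pose6DoF:
    # Three translational degrees of freedom.
    x: float
    y: float
    z: float
    # Three rotational degrees of freedom (radians).
    roll: float
    pitch: float
    yaw: float


@dataclass(frozen=True)
class ViewQuery:
    # Where the virtual camera is and how it is oriented.
    pose: Pose6DoF
    # Which moment of the dynamic scene to render (seconds).
    t: float


def degrees_of_freedom(pose):
    """Count the free parameters of a pose by counting its dataclass fields."""
    return len(fields(pose))


pose = Pose6DoF(0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
query = ViewQuery(pose=pose, t=1.5)
print(degrees_of_freedom(query.pose))
```

Under this reading, "6DoF" describes the viewer's freedom of movement and "4D" describes the scene (3D space plus time), so the two labels count different things.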
| deckar01 wrote:
| Lytro added 2 spatial dimensions of info to 2D image capture:
| the angles the light was traveling at when it entered the
| camera. They could simulate the image with different camera
| parameters, which was good for changing depth of field after
| the fact, but the occlusion information was limited by the
| diameter of the aperture. They tried to make depth maps, but
| that extra data was not a silver bullet. As far as I could
| tell, they were still fundamentally COLMAPing, they just had
| extra hints to guess with.
| ryandamm wrote:
| This is spot-on. Note that the aperture on the camera was
| quite large, I want to say on the order of 100mm? They
| sourced really exotic hardware for that cinema camera.
|
| They also had the "Immerge," which was a ~1m diameter,
| hexagonal array of 2D cameras. They got the 4D data from
| having a 2D (spatially distributed) array of 2D samples (each
| camera's views). It's undersampled, because they threw out
| most of the light, but using 3D as a prior for reconstructing
| missing rays is generally pretty effective.
|
| But I also understand a lot of what they demoed at first was
| smoke and mirrors, plus a lot of traditional 3D VFX
| workflows. Still impressive as hell at the time, it's just
| that the tech has progressed significantly since ~2018.
| PaulHoule wrote:
| I got a Lytro Illum off eBay at a reasonable price but it
| is a bit of a white elephant. I was hoping to shoot
| stereograms but I haven't been able to do it with the stock
| software (I just get two images that look the same with no
| clear disparity)
|
| I've seen open source software for plenoptic images which
| might be able to generate a point cloud but I've only
| gotten one good shot of the Lytro which was similar to a
| shot I took with this crazy lens
|
| https://7artisans.store/products/7artisans-50mm-f0-95-large
| -...
| littlestymaar wrote:
| The scene itself moves over time, hence the 4D. Vanilla
| Gaussian splatting already gives you 6 degrees of freedom since
| you have a full 3D scene.
| tizio13 wrote:
| This reminds me of the description of Disneys (future movies) in
| Cloud Atlas. The movie had a good visualization, this feels like
| that.
| mrmetanoia wrote:
| I liked Cloud Atlas, I should watch it again. It was weird and
| ambitious.
| InDubioProRubio wrote:
| Our children will be so weirded out by Blade Runner. Not by the
| zoom into the picture, but by the fact that the guy believes in
| hallucinated data.
| jajko wrote:
| Who says that the recording medium simply didn't have 1
| petapixel resolution? Or its analog analog to stick with the
| movie
| blt wrote:
| the first HyperNeRF cat video is quite interesting-looking and
| surreal!
| cooper_ganglia wrote:
| He was having a meowt of body experience.
| Geee wrote:
| For 3D VR videos, this would be useful for adjusting IPD for
| every person, rather than use the static IPD of the camera setup.
| Also, allowing just a little bit of head movement would really
| increase the immersiveness. I don't need to travel long distances
| inside the video. If the video is already filmed with static
| stereo setup, it would be even easier to reconstruct an accurate
| 4D video limited to short travel distances without glaring
| errors.
| thomastjeffery wrote:
| https://augmentedperception.github.io/deepviewvideo/
|
| We've been waiting 4 years. I just don't understand what is
| taking so long.
|
| Even at a low resolution, the difference is night and day. Even
| with a very small window, this is a leap forward for VR
| immersion. Why in the hell is no one using it?!?
| latexr wrote:
| The results are impressive, but what makes this 4D? Where's the
| extra dimension and how is it relevant to 3D human beings?
| thomastjeffery wrote:
| Time
| cooper_ganglia wrote:
| We are all 4-dimensional beings on this fine day.
| philipov wrote:
| We're 3+1 dimensional beings. Time doesn't have the same
| metric as spatial dimensions, so you can't add them
| together. You can't rotate a temporal object along the xt-
| plane, for example, nor can you speak about an object's
| length along the t-axis. The three spatial dimensions are
| interchangeable, but time is special, so calling it 4D is
| incorrect.
| tantalor wrote:
| The input videos already have that dimension, so that can't
| be the answer.
| thomastjeffery wrote:
| I agree that it _shouldn't_ be, but it is apparent that it
| (redundantly) is.
| latexr wrote:
| By that logic all videos would be (at least) 3D. But no one
| would take you seriously if you said that.
| echelon wrote:
| Videos are already 4D.
|
| There's the 2D frame and the time dimension. Then there's
| the structural information conveyed by motion, parallax,
| scene composition, camera movement, etc.
|
| That's why there's the 180 rule, amongst other things.
|
| Algorithms can take a video and turn it into a 4D volume.
| As can our brains.
| netruk44 wrote:
| The reconstruction is a 3-dimensional scene that has animation
| contained in it.
|
| You can move a virtual camera 3-dimensionally within the scene
| at any individual frame (x, y, z), and also move the scene
| through its animation to play the animation forwards and
| backwards (in other words, you move the camera through the
| 'time' axis).
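The description above can be illustrated with a toy example. This is a sketch only: the scene is a single moving point with linearly interpolated keyframes, and the camera is an unrotated pinhole looking down +z. It is not the paper's representation, and all names and values are hypothetical.

```python
def lerp(a, b, w):
    """Linearly interpolate between two 3D points a and b with weight w in [0, 1]."""
    return tuple(ai + (bi - ai) * w for ai, bi in zip(a, b))


def point_at(keyframes, t):
    """Position of the point at time t.

    keyframes is a time-sorted list of (time, (x, y, z)); t is clamped to its range.
    """
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    if t >= keyframes[-1][0]:
        return keyframes[-1][1]
    for (t0, p0), (t1, p1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            return lerp(p0, p1, (t - t0) / (t1 - t0))


def project(point, cam_pos, focal=1.0):
    """Project a 3D point into an unrotated pinhole camera located at cam_pos."""
    x, y, z = (p - c for p, c in zip(point, cam_pos))
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (focal * x / z, focal * y / z)


# A point that slides from x=0 to x=2 over one second, 4 units in front.
track = [(0.0, (0.0, 0.0, 4.0)), (1.0, (2.0, 0.0, 4.0))]

# Time and viewpoint are chosen independently of each other.
mid = point_at(track, 0.5)
view_a = project(mid, cam_pos=(0.0, 0.0, 0.0))
view_b = project(mid, cam_pos=(1.0, 0.0, 0.0))
print(mid, view_a, view_b)
```

The scrubber of an ordinary video only changes `t`; the reconstruction discussed here also lets `cam_pos` change while `t` is held fixed or played back.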
| latexr wrote:
| > in other words, you move the camera through the 'time' axis
|
| So, like the scrubber in _any_ video? Doesn't feel like that
| warrants the 4D moniker. Which is not to say you're not
| right, I think you are and that's what they mean, but that
| being the case it feels more buzzword than anything.
| radicality wrote:
| I think it means that, given a normal flat 2D video, you
| get back that video but as a 3D scene, meaning you can move
| and pan the camera around as the 3d video plays. And I
| guess they call it 4D since you had a flat 2d video + time
| dimension, so 3d video + time dimension = 4 dimensions.
| aaroninsf wrote:
| This work is about taking an input with 2 spatial dimensions,
| plus 1 time dimension,
|
| and synthesizing a (limited) model with 3 spatial dimensions,
| plus 1 time dimension.
|
| 3D over time is colloquially called "4D;" though we don't call
| video "3D" by analogy as the term binds strongly to its purely
| spatial use.
| aaroninsf wrote:
| Re: relevance, one of the prospective uses of work like this
| is in conversion of "flat" conventional video into "spatial"
| video, eg as popular on the Apple Vision Pro.
|
| I've been interested in the state of the art in that domain
| myself, having thousands of 2D videos I've shot which I would
| love to see "spatialized" well, someday.
| latexr wrote:
| > 3D over time is colloquially called "4D;"
|
| Colloquially, meaning "used in or characteristic of familiar
| and informal conversation"? 4D films have a definition, and
| that ain't it. 3D over time is colloquially still referred to as
| 3D, as evidenced by decades of 3D blockbusters.
|
| https://en.wikipedia.org/wiki/3D_film
|
| https://en.wikipedia.org/wiki/4D_film
| PaulHoule wrote:
| Whenever I play a video game ( _Monster Hunter World_ comes to me
| immediately) and see an establishing shot with moving camera
| (like the ones demoed on their web site) I think the game really
| wants to run in a VR headset where you can walk around and see
| different angles.
|
| (Funny there is a VR mod for _Monster Hunter Rise_ which makes me
| think just how fun _Monster Hunter VR_ would be)
___________________________________________________________________
(page generated 2024-07-23 23:09 UTC)