[HN Gopher] Stability.ai - Introducing Stable Video 3D
___________________________________________________________________
Stability.ai - Introducing Stable Video 3D
Author : ed
Score : 236 points
Date : 2024-03-18 20:06 UTC (2 hours ago)
(HTM) web link (stability.ai)
(TXT) w3m dump (stability.ai)
| Filligree wrote:
| If the animations shown are representative, then the mesh output
| may very well be good enough to use in a 3D printer.
|
| Looking forward to experimenting with this.
| neom wrote:
| I don't know much about 3D printing, but I'd be very interested
| in learning more about this idea if you'd be so kind as to
| expand on it. Could I have an AI spend all day auto-scanning
| what teens are doing on Instagram, auto-generating toys based
| on it, auto-generating advertisements for the toys, and
| auto-3D-printing on demand?
| SirSourdough wrote:
| Hypothetically, sure, assuming the parent comment is correct
| that these meshes are sufficient for modelling, and that you
| can find any teens who want a non-digital toy.
|
| I think a good hobbyist application for this would be
| something like modelling figurines for games, which is
| already a pretty popular 3D printing application. This would
| allow people with limited modelling skills to bring
| fantastical, unique characters to life "easily".
| Filligree wrote:
| Pretty much. We're already generating images of monsters
| and characters for a D&D campaign; being able to print
| those in 3D would be pretty amazing.
| CobrastanJorji wrote:
| I think their suggestion was more "I have a photo of a cool
| horse, and now I would like a 3D model of that same horse."
|
| Another way of looking at it: 3D artists often begin projects
| by taking reference images of their subject from multiple
| angles, then very manually turning those into a 3D model. That
| step could potentially be greatly sped up with an algorithm
| like this one. The artist could (hopefully) then focus on
| cleanup, rigging, etc., and have a quality asset in
| significantly less time.
| maicro wrote:
| OP is suggesting that this (AI model? I honestly am behind on
| the terminology) could replace one of the common steps of 3D
| printing - specifically, the step where you create a digital
| representation of the physical object you would want to end
| up with.
|
| There are other steps to 3D printing in general, though; a
| super rough outline:
|
| - Model generation
|
| - "Slicing" - processing the 3D model into instructions that
| the 3D printer can handle, as well as adding any support
| structures or other modifications to make it printable
|
| - Printing - the actual printing process
|
| - Post-processing - depending on the 3D printing technology
| used, the desired resulting product, and the specific
| model/slicing settings, this can range from "remove from the
| bed and use" to "carefully snip off support structures, cure
| in a UV chamber for X minutes, sand and fill, then paint"
|
| As I said before, this AI model specifically would cover 3D
| model generation. If you were to use a printing technology
| that doesn't require support structures, and handles color
| directly in the printing process (I think powder bed fusion
| is the only real option here?), the entire process should be
| fairly automatable - a human might be needed to remove the
| part from the printer, but there might not be much post-
| processing to do.
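|
| If you wanted to script the slicing step, here's a minimal
| sketch that shells out to PrusaSlicer's CLI (the flags and
| filenames are my assumptions from memory, not anything
| specific to this model's tooling - check your slicer's docs):
|
|     import subprocess
|
|     def slice_model(stl_path: str, gcode_path: str) -> None:
|         # Hand the mesh to the slicer and have it emit G-code
|         # that the printer can execute.
|         subprocess.run(
|             ["prusa-slicer", "--export-gcode", stl_path,
|              "--output", gcode_path],
|             check=True,  # raise if the slicer rejects the model
|         )
|
|     # Hypothetical filenames for a generated mesh:
|     slice_model("sv3d_mesh.stl", "sv3d_mesh.gcode")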
|
| The rest of your desired workflow is a bit more nebulous. I
| don't know how you would handle "scanning what teens are
| doing on Instagram", at least in a way that would let you
| generate toys from the information. Generating and posting
| the advertisement shouldn't be too hard - have a standardish
| template that you fill in with a render from the model and
| the description. Printing on demand is again possible, though
| you'll likely need a human to remove the part, check it for
| quality, and ship it. You could automate that too, but it
| would probably be more trouble than it's worth.
| neom wrote:
| Interesting. To be clear, I don't think this is a good idea,
| and it's kinda my nightmare post-capitalism hell. I just
| think it's interesting that this could be done now.
|
| On finding out what teens want, that part is somewhat easy-
| ish. I guess you'd need a couple of agents: one that scans
| teen blogs for stories and converts them to keywords, then
| another agent that feeds the keywords
| (#taylorswift #HaileyBieberChiaPudding #latestkdrama etc.)
| into Instagram. After a while your recommendations page will
| turn into a pretty accurate representation of what teens are
| into; then just have an agent look at those images and
| generate diffs of them. I doubt it would work for a bunch of
| reasons, but it's an interesting thought experiment!
| Thanks!
| jsheard wrote:
| With previous attempts at this problem, the shaded examples
| could be quite misleading: details that appeared to be
| geometric were actually just painted over the surface as part
| of the texture, so when you took that texture away you just had
| a melted-looking blob with nowhere near as much detail as you
| thought. I'd reserve judgement until we see some unshaded
| meshes.
|
| What they show in the demo: https://i.imgur.com/9bZNTcd.jpeg
|
| What comes out of the 3D printer:
| https://i.imgur.com/MZrzsfh.png
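|
| A quick way to sanity-check the raw geometry is to load the
| generated mesh and strip its texture before viewing. A minimal
| sketch using the trimesh library (the filename is
| hypothetical):
|
|     import trimesh
|
|     # Load whatever mesh the pipeline produced, flattening any
|     # scene graph into a single mesh.
|     mesh = trimesh.load("sv3d_output.glb", force="mesh")
|
|     # Replace the textured material with plain shading so only
|     # the actual geometry is visible.
|     mesh.visual = trimesh.visual.ColorVisuals(mesh)
|     mesh.show()  # interactive viewer; needs pyglet installed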
| SV_BubbleTime wrote:
| It's always been like this. None of these ever show the
| untextured model.
|
| When I see a demo where they're showing wireframes, I'll know
| it's good enough.
| jsheard wrote:
| Seems like a tougher nut to crack than image generation
| was, since there aren't a bajillion high-quality 3D models
| lying around on the internet to use as training data;
| everyone is trying to do 3D model generation as a second-
| order system using images as the training data again. The
| things that make 3D assets good - the tiny geometric details
| that are hard to infer without many input views of the same
| object, the quality of the mesh topology and UV mapping,
| rigging and skinning for animation, reducing materials down
| to PBR channels that can be fed into a renderer, and so on -
| aren't represented in the input training data, so the model
| is expected to make far more logical leaps than image
| generators do.
| refulgentis wrote:
| It almost seems easier, in that you have an arbitrary number
| of real-world objects to scan and the hardware is heavily
| commoditized (IIRC iPhones have this built in at high res
| now?)
| euazOn wrote:
| Therefore, what is the main use case of this model? Generating
| cheap 3D assets for video games?
| ionwake wrote:
| I'm sorry for the dumb, lazy question. But would the input
| require more than one image? Is there a demo URL to test this?
| I think it might just be time to buy a 3D printer.
|
| EDIT> Does "single image inputs" mean more than one image?
| kylebenzle wrote:
| Single image means one image.
| dartos wrote:
| Can confirm the word single means 1
| ionwake wrote:
| lol cmon guys don't be too hard on me it does say "inputs"
| stavros wrote:
| I do see how "single image inputs" can be conflated with
| "multiple inputs of a single image each time", as opposed
| to "video".
| ionwake wrote:
| TBH I always look at the worst-case scenario. I was
| worried it meant it needed 3 images, inputted as a single
| image at different steps of the process, so requiring
| different angles. I wasn't sure, but thought it best to
| check. I feel like it would have been clearer to have
| said something like "generates a 3D model from a single
| image" (not exact wording, but you catch my drift).
| Sorry, I am over-analysing, but all feedback is good, right?
| ganeshkrishnan wrote:
| Describe in single words only the good things that come
| into your mind about... your mother.
| simonw wrote:
| It's just a single image. It guesses the shape of the bits it
| can't see based on vast amounts of training data.
| ionwake wrote:
| Amazing! Thank you
| airstrike wrote:
| that demo animation is so clever and satisfying
| amelius wrote:
| But it doesn't look very realistic, tbh.
| dreadlordbone wrote:
| it doesn't break Euclidean space at least
| ddtaylor wrote:
| Does anyone know what hardware inference can run on, or what
| the memory requirements are?
| Mathnerd314 wrote:
| In the repo the model weights file is 9.37GB, whereas SDXL
| Turbo is 13.9GB, and I don't see any mention of huge context
| windows, so it probably just needs a decent graphics card.
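|
| Back-of-the-envelope, assuming that 9.37GB file is fp32
| weights (an assumption on my part):
|
|     # Rough sizing from the weights file alone; activations
|     # and intermediate buffers come on top of this.
|     params = 9.37e9 / 4   # ~2.3e9 parameters at 4 bytes each
|     fp16 = params * 2     # ~4.7 GB for the weights in fp16
|     print(f"~{params / 1e9:.1f}B params, "
|           f"~{fp16 / 1e9:.1f} GB in fp16")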
| kouteiheika wrote:
| It crashes with an out-of-memory error on my 24GB 4090, so at
| least when it comes to their sample script the answer is "a
| lot". Maybe it's just an inefficient implementation though.
| canadiantim wrote:
| I can't wait until we can use something like this for
| architectural design
| kouteiheika wrote:
| Just tried to run this using their sample script on my 4090
| (which has 24GB of VRAM). It ran for a little over 1 minute and
| crashed with an out-of-memory error. I tried both SV3D_u and
| SV3D_p models.
| ganeshkrishnan wrote:
| The 4090 is in a weird spot: high speed but low RAM.
| Theoretically everything in AI should run on it, but
| practically nothing does.
| LoganDark wrote:
| The 4090 has more VRAM than most computers have system RAM.
| Surprised this is considered "low RAM" in any way except
| relative to datacenter cards and top-spec ASi.
| jokethrowaway wrote:
| What can't you run? Unquantised large text models are the
| only thing I can't run.
|
| Stable Diffusion, Stable Video, text models, audio models - I
| haven't had issues with anything yet.
| GistNoesis wrote:
| I managed to get it working with a 4090. You need to adjust the
| parameter decoding_t of the sample function in
| simple_video_sample.py to a lower value (decoding_t = 5 works
| fine for me). I also needed to install imageio==2.19.3 and
| imageio-ffmpeg.
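|
| For anyone else hitting the OOM, the change looks roughly
| like this (a sketch - decoding_t is from the script as
| described above, but the import path and the other argument
| names are my assumptions from memory):
|
|     # Assumed module path within Stability's repo:
|     from scripts.sampling.simple_video_sample import sample
|
|     sample(
|         input_path="my_image.png",  # hypothetical input image
|         version="sv3d_u",
|         # Frames decoded per VAE batch; lower values trade
|         # some speed for lower peak VRAM.
|         decoding_t=5,
|     )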
| kouteiheika wrote:
| Ah, yep! You're right! It works now!
| bugbuddy wrote:
| "3D"
___________________________________________________________________
(page generated 2024-03-18 23:00 UTC)