[HN Gopher] Show HN: Generative Fill with AI and 3D
___________________________________________________________________
Show HN: Generative Fill with AI and 3D
Hey all,

You've probably seen projects that add objects to an image from a
style or text prompt, like InteriorAI (levelsio) and Adobe Firefly.
The prevalent issue with these diffusion-based inpainting approaches
is that they don't yet have great conditioning on lighting,
perspective, and structure: you'll often get incorrect or generic
shadows, warped-looking objects, and distorted backgrounds.

What is Fill 3D? Fill 3D is an exploration of doing generative fill
in 3D to render ultra-realistic results that harmonize with the
background image, using industry-standard path tracing, akin to
compositing in Hollywood movies.

How does it work?

1. Deproject: First, deproject an image to a 3D shell using both
geometric and photometric cues from the input image.

2. Place: Draw rectangles and describe what you want in them, akin
to Photoshop's Generative Fill feature.

3. Render: Use good ol' path tracing to render ultra-realistic
results.

Why Fill 3D?

+ The results are insanely realistic (see the video in the GitHub
repo, or on the website).

+ Fast enough: generations currently take 40-80 seconds. Diffusion
takes ~10 seconds, so we're slower, but for this level of realism
it's pretty good.

+ Potential applications: I'm thinking of virtual staging in real
estate media. What do you think?

+ There's API access! :D

+ Right now, you need an image of an empty room. I'll loosen this
restriction over time.

Check it out at https://fill3d.ai

Fill 3D is built on Function (https://fxn.ai). With Function, I can
run the Python functions that do the steps above on powerful GPUs
with only code (no Dockerfile, YAML, k8s, etc.) and invoke them from
just about anywhere. I'm the founder of fxn.

Tell me what you think!!

PS: This is my first Show HN, so please be nice :)
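
As a rough illustration of the three steps above, here is a minimal
Python sketch of the pipeline. Every name and signature below is
hypothetical and for illustration only; the actual steps run as
Python functions on GPUs via Function, and this is not Fill 3D's
real API.

    # Hypothetical sketch of the deproject -> place -> render pipeline.
    from dataclasses import dataclass

    @dataclass
    class Placement:
        rect: tuple[float, float, float, float]  # x, y, w, h in image space
        prompt: str                              # e.g. "a mid-century queen bed"

    def deproject(image_path: str) -> dict:
        """Step 1: estimate a 3D shell (geometry, camera, lights) from
        geometric and photometric cues in the image."""
        raise NotImplementedError  # stub

    def place(scene: dict, placement: Placement) -> None:
        """Step 2: pick a 3D asset matching the prompt and position it
        inside the user-drawn rectangle."""
        raise NotImplementedError  # stub

    def render(scene: dict) -> bytes:
        """Step 3: path-trace the scene so shadows and perspective
        match the original photo."""
        raise NotImplementedError  # stub

    def generative_fill_3d(image_path: str, placements: list[Placement]) -> bytes:
        scene = deproject(image_path)
        for p in placements:
            place(scene, p)
        return render(scene)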
Author : olokobayusuf
Score : 116 points
Date : 2023-09-28 20:41 UTC (2 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| ugh123 wrote:
| Can this be used to replace objects in a scene? In your demo
| example you place a bed, but what if I want to replace my bed
| with yours?
| xwdv wrote:
| I gasped. This is what will make it trivial to simply highlight a
| person's swimwear and tell the AI to remove it.
| ad-astra wrote:
| Eh, at that point you might as well just get tickets for a
| burlesque show
| tamimio wrote:
| That's... already happening
| k12sosse wrote:
| Have you never used stable diffusion?
|
| Today, as in right now, with fewer than 5 relatively not-
| horrible photographs you can create a realistic AI version of
| anyone, doing or wearing anything you'd like. Animation
| included. From your home computer.
|
| Or just inpaint the clothes away from any image.
| mcbuilder wrote:
| Growing up in the age of generative AI is at least as big a sea
| change as the age of social media, or the internet, etc.
| aantix wrote:
| Is there any way to remove objects from an initial image, so
| that it can then be used for staging?
| prashp wrote:
| Nice! Like your landing page.
|
| How well does it work on non-room images?
| olokobayusuf wrote:
| Depends on the image. Right now, the very first stage
| (deprojecting the image to 3D) makes assumptions about the
| image having the structure of a room: large empty floor plan;
| roughly polygonal geometry.
|
| For different kinds of images, it's a question of using other
| cues to build a 3D structure that's very close to the original
| image. And no, monocular depth estimation isn't enough (happy
| to nerd out about why) ;)
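
For background on the terminology, the textbook form of deprojection
is back-projecting pixels through a pinhole camera model, as in the
generic computer-vision sketch below. This is not Fill 3D's code,
and, per the comment above, a raw depth map alone doesn't recover
the clean geometry and lighting a path tracer needs.

    # Textbook pixel back-projection with a pinhole camera model.
    # Generic computer-vision math, not Fill 3D's implementation.
    import numpy as np

    def backproject(depth: np.ndarray, fx: float, fy: float,
                    cx: float, cy: float) -> np.ndarray:
        """Map each pixel (u, v) with depth d to a 3D point (x, y, z)."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel grids, shape (h, w)
        x = (u - cx) / fx * depth
        y = (v - cy) / fy * depth
        return np.stack([x, y, depth], axis=-1)  # (h, w, 3) point map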
| reichardt wrote:
| Amazing! The inserted objects are renders of textured 3D models
| and not generated by a diffusion model + ControlNet? Is there a
| fixed set of textured 3D models available or are they generated
| on the fly based on the prompt?
| olokobayusuf wrote:
| That's correct! Right now, we're using the BlenderKit catalog,
| but we can expand beyond it. When you type a prompt and search,
| though, that's actually doing a multi-modal search (so you can
| ask for a 'red painting' and it'll actually find a red
| painting), so it's insanely more accurate than a regular
| keyword search. AI everywhere!
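
Multi-modal search like this is commonly built on a joint image-text
embedding such as CLIP: embed a preview render of each catalog asset
once, embed the query text at search time, and rank by cosine
similarity. A sketch under that assumption follows; the catalog
entries are made up, and this is not necessarily how Fill 3D
implements it.

    # Sketch of prompt -> 3D-asset search via a joint image/text
    # embedding (CLIP). The catalog of preview renders is hypothetical.
    from PIL import Image
    from sentence_transformers import SentenceTransformer, util

    clip = SentenceTransformer("clip-ViT-B-32")  # encodes images and text

    catalog = {  # hypothetical: asset id -> preview render
        "bed_modern_01": "previews/bed_modern_01.png",
        "painting_red_02": "previews/painting_red_02.png",
    }
    asset_ids = list(catalog)
    previews = clip.encode([Image.open(p) for p in catalog.values()])

    def search(prompt: str, k: int = 5) -> list[str]:
        query = clip.encode(prompt)
        scores = util.cos_sim(query, previews)[0]  # cosine similarities
        top = scores.argsort(descending=True)[:k]
        return [asset_ids[int(i)] for i in top]

    print(search("red painting"))  # finds the red painting specifically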
| matsemann wrote:
| The example looks very good. Do you have more images to share? I
| think more examples would be nice to show off more of what it can
| handle. Different room types, interiors etc.
|
| Also, in that regard: I'm curious about what it can't handle. Any
| situations where it borks?
| olokobayusuf wrote:
| Excellent suggestion. Will find time tomorrow to add a
| `/gallery` page. Created an issue to track:
| https://github.com/fill3d/fill/issues/1 . Best first issue :D
| ralfhn wrote:
| > virtual staging in real estate media
|
| If you can make this work with exteriors, landscaping design is
| huge. Maybe start with something simple like desert landscaping
| (which is really just rocks, turf, pavers, maybe small palm
| trees).
| olokobayusuf wrote:
| Very curious to learn more, how can I reach you? Or, shoot me
| an email: yusuf@fill3d.ai
| artursapek wrote:
| Wow, nice. I hope you charge realtors a fat price for this
| billconan wrote:
| I tried the demo. It seems to be buggy, and it only allows you
| to choose existing items from a predefined DB.
| olokobayusuf wrote:
| What bugs did you encounter? And yes, because we're using
| actual 3D models, there's a fixed set of models (right now,
| just under 300). Because the priority is ultra-realism, the
| current state-of-the-art for 3D model diffusion won't cut it
| (see OpenAI Point-e https://github.com/openai/point-e).
| billconan wrote:
| so you only project the background into a 3d model, and the
| foreground is not generated, but 3d models?
|
| the bug I saw was after uploading a background image, on the
| right side, I only saw a generate and a reset button, nothing
| else. I clicked "generate", expecting it to ask me to input a
| prompt, but it started to render and the result was the same
| background I uploaded.
| bsenftner wrote:
| Between Fill 3D's architecture that 'path traces to render
| ultra-realistic results' and fxn.ai's transparent deployment
| capability... I gotta say this is super impressive work. I can
| use both in a current project, and will be investigating.
| ugh123 wrote:
| This kind of stuff is the future of filmmaking.
|
| Imagine adding "yourself" into a scene like this, moving around
| as you were in a video you just created of yourself. As in:
| film yourself walking around your bedroom with your phone, then
| use an app like this to add you and your movement (cropped from
| the video) to a different background scene.
|
| Goodbye, Hollywood elites!
| olokobayusuf wrote:
| I couldn't agree more! You should check out the amazing work
| from the folks at Luma Labs (https://lumalabs.ai/). They're a
| loose inspiration for this project.
| mentos wrote:
| My use case for this would be for decorating my apartment.
|
| I've got a big empty studio with a bed and couch I've already
| purchased, but I'm trying to figure out what to fill in for all
| the other gaps: coffee table, media console, TV or UST
| projector, bar or bookshelf or desk.
|
| Would be nice if there was a way to populate it with
| items/products that can be purchased and aren't purely
| conceptual.
| olokobayusuf wrote:
| Yup, this is actually a roadmap feature. Because we generate in
| 3D, users can bring their own 3D models and add them to the
| catalog. And if you add something like object capture from
| Apple (https://developer.apple.com/augmented-reality/object-
| capture...), you could literally scan your couch, upload it to
| Fill 3D, place, and generate.
|
| Exciting times ahead.
| pedalpete wrote:
| Now create a bunch of perspectives, and NeRF or Gaussian-splat
| that, and you've got a fully immersive 3D scene that is better
| than any rendering.
| olokobayusuf wrote:
| I like the way you think ;)
| jayd16 wrote:
| Why is it better than any rendering?
| olokobayusuf wrote:
| Cos it's immersive (and interactive). Check out this realtime
| demo of 3DGS in Unity by Aras P (co-founder of Unity):
| https://www.youtube.com/watch?v=0vS3yh908TU&ab_channel=ArasP...
| blovescoffee wrote:
| Could you speak more to the "deprojection" step? What is that?
| olokobayusuf wrote:
| Fill 3D takes a different approach from diffusion, in that it
| tries to build an actual 3D scene (kinda like a clone) of
| what's in the image you upload. In some sense, that's actually
| the most fundamental representation of what's in your image
| (or, said another way, your image is just a representation of
| that original scene).
|
| So it works by trying to estimate a 3D 'room' that matches your
| image. Everything from the geometry, to the light fixtures, to
| the windows. It's heavily inspired by how humans (weird to
| contrast 'human' vs. AI work) do image/video compositing.
|
| TL;DR: Image in, 3D scene out.
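
Put differently, the output of deprojection is a full scene
description rather than a depth map. A hypothetical sketch of what
that "image in, 3D scene out" contract could look like (illustrative
types only, not Fill 3D's real schema):

    # Hypothetical data contract for deprojection: image in, scene out.
    from dataclasses import dataclass, field

    @dataclass
    class Camera:
        fov_degrees: float                    # recovered from perspective cues
        position: tuple[float, float, float]
        rotation: tuple[float, float, float]

    @dataclass
    class Light:
        position: tuple[float, float, float]  # e.g. a window or fixture
        intensity: float                      # estimated from photometric cues

    @dataclass
    class RoomShell:
        floor_polygon: list[tuple[float, float]]  # roughly polygonal floor plan
        wall_height: float
        camera: Camera
        lights: list[Light] = field(default_factory=list)

    def deproject(image_path: str) -> RoomShell:
        """Estimate a 3D room shell matching the input photo (stub)."""
        raise NotImplementedError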
| tamimio wrote:
| I like it, but you should add some free tier to test it out.
___________________________________________________________________
(page generated 2023-09-28 23:00 UTC)