[HN Gopher] Meta 3D Gen
       ___________________________________________________________________
        
       Meta 3D Gen
        
       Author : meetpateltech
       Score  : 276 points
       Date   : 2024-07-02 15:19 UTC (7 hours ago)
        
 (HTM) web link (ai.meta.com)
 (TXT) w3m dump (ai.meta.com)
        
       | explaininjs wrote:
        | Looks fine, but you can tell the topology isn't good from the
        | conspicuous lack of wireframe renders.
        
         | nuz wrote:
          | Such a silly argument. Fixing topology is a nearly solved
          | problem in geometry processing. (Or just start with good
          | topology and 'paste' a texture onto it, as the techniques
          | developed here do.)
        
           | explaininjs wrote:
            | No... it's not. But if you know something I don't, the 5
            | primes will certainly be happy to pay you handsomely for
            | the implementation!
        
             | nuz wrote:
             | https://github.com/wjakob/instant-meshes
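                | 
                | For a rough sense of what automated remeshing looks
                | like, a minimal sketch with pymeshlab (my example, not
                | from that repo; filter name per current pymeshlab):
                | 
                |   import pymeshlab
                | 
                |   ms = pymeshlab.MeshSet()
                |   ms.load_new_mesh("generated.obj")
                |   # Rebuild the surface with roughly uniform triangles
                |   ms.meshing_isotropic_explicit_remeshing()
                |   ms.save_current_mesh("remeshed.obj")
                | 
                | That's fully automatic isotropic remeshing; Instant
                | Meshes adds field-aligned quad output on top of the
                | same idea.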
        
               | explaininjs wrote:
               | A piece of software that hasn't been touched in 5 years,
               | let alone adopted in any professional production
               | environment? Cool...
        
               | portaouflop wrote:
                | AFAICT it's used in professional applications, and
                | software does not need to be constantly updated,
                | especially if it's not for the web.
        
               | explaininjs wrote:
                | If the claim were that the problem was solved, sure,
                | it _might_ make sense that the package doesn't need to
                | be touched (in reality the field isn't as slow as you
                | presume, but I digress).
               | 
                | Instead, the claim is that it's "nearly(TM)" solved,
                | so the proof being an abandoned repo from half a
                | decade ago actually speaks volumes: it's solved except
                | for the hard part, and nobody knows how to solve the
                | hard part.
        
               | spookie wrote:
                | I love that tool, but it really doesn't fix bad
                | topology.
                | 
                | It gets you somewhere closer, but it's not a fix.
                | 
                | Moreover, depending on what you have at hand, the
                | resolution of your remeshing might destroy a LOT of
                | detail or be unable to accommodate thin sections.
                | 
                | Retopo isn't a solved problem. It only is for really
                | basic, convex meshes.
        
           | RicoElectrico wrote:
            | It's an essential skill for reading scientific papers to
            | notice _what isn't there_. It's as important as what _is_
            | there.
            | 
            | In my field, analog IC design, when we hit a wall we often
            | do a literature review with a colleague, and more often
            | than not the results are not relevant for commercial
            | application. Forget about Monte Carlo; sometimes there
            | aren't even full PVT corners.
        
             | jampekka wrote:
              | This is indeed a side effect of research papers being
              | read more outside academia (which is strictly a good
              | thing in itself).
              | 
              | In research one learns that most (almost all) papers
              | oversell their results, and a lot of stuff is hidden in
              | the "Limitations" section. This is a significant
              | problem, but not that big a problem within academia, as
              | everybody, at least within the field, knows to take the
              | results with a grain of salt. But those outside
              | academia, or outside the field, often don't take this
              | into account.
              | 
              | Academic papers should be read a bit like marketing
              | material or pitch decks.
        
           | zemo wrote:
            | It depends on what you're talking about and what your
            | criteria are. In gamedev, studios typically use a
            | retopology tool like topogun (https://www.topogun.com/)
            | to aid in the creation of efficient topologies, but it's
            | still a manual task, as different topologies have
            | different tradeoffs in terms of poly count, texture
            | detail, options for how the model deforms when animated,
            | etc. For example, you may know that you're working on a
            | model of a player character in a 3rd person game where
            | the camera is typically behind you, so you want to spend
            | more of your budget on the _back_ of the model than the
            | _front_, because the player is typically looking at their
            | character's back. If your criterion is "find the minimum
            | number of polygons", sure, it's solved. But that's just
            | one of many different goals, and not the goal typically
            | used in gamedev, which I assume to be a primary audience
            | of this research.
        
             | efilife wrote:
              | FYI, we use asterisks to put emphasis on text on HN.
        
           | TrevorJ wrote:
           | Hard disagree, as someone in the industry.
        
         | jsheard wrote:
         | Credit where it's due, unlike most of these papers they do at
         | least show some of their models sans textures on page 11, so
         | you can see how undefined the actual geometry is (e.g. none of
         | the characters have eyes until they are painted on).
        
           | SV_BubbleTime wrote:
            | Sans texture is not wireframe though. They have a
            | texture; it's just all white.
            | 
            | The wireframe is going to be unrecognizably bad.
            | 
            | Still a ways to go.
        
         | tobyjsullivan wrote:
         | They seem to admit as much in Table 1 which indicates this
         | model is not capable of "clean topology". Somewhat annoyingly,
         | they do not discuss topology anywhere else in the paper (at
         | least, I could not find the word "topology" via Ctrl+F).
        
         | dyauspitr wrote:
         | That doesn't matter for things like 3D printing and CNC
         | machining. Additionally, there are other mesh fixer AI tools.
         | This is going to be gold for me.
        
           | jsheard wrote:
            | However, if you 3DP/CNC these you'll only get the base
            | shape, without any of the fake details it painted over
            | the surface.
            | 
            | Expectation vs. reality: https://i.imgur.com/82R5DAc.png
        
             | dyauspitr wrote:
              | That's still not bad. I can use the normal and texture
              | maps to generate appropriate depth maps to put the
              | details in, then do some final Wacom touch-ups. Way
              | better than making the whole thing from scratch.
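              | 
              | For the normal-map-to-depth step, a minimal numpy
              | sketch (Frankot-Chellappa integration; my own
              | illustration, not part of any of these tools):
              | 
              |   import numpy as np
              | 
              |   def height_from_normals(n):
              |       # n: H x W x 3 unit normal map in [-1, 1]
              |       nz = np.clip(n[..., 2], 1e-3, None)
              |       p, q = -n[..., 0] / nz, -n[..., 1] / nz
              |       h, w = p.shape
              |       u = np.fft.fftfreq(w)[None, :]
              |       v = np.fft.fftfreq(h)[:, None]
              |       d = u ** 2 + v ** 2
              |       d[0, 0] = 1.0  # avoid dividing by zero at DC
              |       Z = (-1j * u * np.fft.fft2(p)
              |            - 1j * v * np.fft.fft2(q)) / (2 * np.pi * d)
              |       Z[0, 0] = 0.0  # height recovered up to an offset
              |       return np.real(np.fft.ifft2(Z))
              | 
              | The result is a relative height map you can use as a
              | displacement starting point before manual touch-ups.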
        
           | eropple wrote:
           | _> That doesn't matter for things like 3D printing and CNC
           | machining_
           | 
           | It absolutely does. But great, let's look forward to
           | Printables being ruined by off-model nonsense.
        
             | SV_BubbleTime wrote:
             | It matters so much more, GP is just being hopeful and soon
             | to be disappointed.
        
             | dyauspitr wrote:
                | Why does it matter? As long as there are no holes,
                | my Vectric software doesn't care.
        
               | TylerE wrote:
                | If your normals are flipped, your CNC cutter is
                | going to try to cut from inside up to the surface.
                | That's no bueno.
        
               | dyauspitr wrote:
               | Inverting the normals is pretty straightforward.
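                | 
                | A minimal sketch with trimesh (assuming the file
                | loads as a single Trimesh):
                | 
                |   import trimesh
                | 
                |   mesh = trimesh.load("generated.obj")
                |   mesh.invert()  # flip every face's winding/normals
                |   # Make winding consistent where the surface allows:
                |   trimesh.repair.fix_normals(mesh)
                |   print(mesh.is_winding_consistent,
                |         mesh.is_watertight)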
        
               | TylerE wrote:
                | If ALL of them are inverted, yes.
                | 
                | If the topology is a disaster... no.
                | 
                | If you're hand-massaging every poly, you're rather
                | defeating the purpose.
        
         | torginus wrote:
          | AFAIK there's no topology - it outputs signed distance
          | fields, not meshes.
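          | 
          | For context, a mesh is typically extracted from the SDF
          | afterwards with marching cubes. A minimal sketch with
          | scikit-image, using an analytic sphere as a stand-in for a
          | learned SDF:
          | 
          |   import numpy as np
          |   from skimage import measure
          | 
          |   xs = np.linspace(-1.5, 1.5, 64)
          |   x, y, z = np.meshgrid(xs, xs, xs, indexing="ij")
          |   sdf = np.sqrt(x**2 + y**2 + z**2) - 1.0  # unit sphere
          | 
          |   # Zero level set -> triangle mesh
          |   verts, faces, normals, _ = measure.marching_cubes(sdf,
          |                                                     0.0)
          | 
          | The triangulation follows the sampling grid, which is why
          | the topology ends up generic rather than artist-style.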
        
       | GaggiX wrote:
        | In the comparison between the models, only Rodin seems to
        | produce clean topology. Hopefully in the future we will see a
        | model with the strengths of both - ideally from Meta, as
        | Rodin is a commercial model.
        
         | cchance wrote:
          | Yeah, it would be cool if we had something open that
          | competed with Rodin, but just like ElevenLabs for voice, it
          | seems closed is gonna be ahead for a while.
        
       | kgraves wrote:
       | Can this be used for image to 3D generation? What is the SOTA in
       | this area these days?
        
         | tobyjsullivan wrote:
         | The paper suggests Rodin Gen-1 [0] is capable of image-to-shape
         | generation.
         | 
         | [0] https://hyperhuman.deemos.com/rodin
        
         | Fripplebubby wrote:
          | I think what they did here was: text prompt -> generate
          | multiple 2D views -> reconstruction network to go from
          | multiple 2D images to a 3D representation -> mesh
          | extraction from the 3D representation.
          | 
          | That's a long way of saying: no, I don't think this
          | introduces a component that specifically goes 2D -> 3D
          | from a single 2D image.
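          | 
          | As a structural sketch only (every name below is my guess
          | at the shape of the pipeline, not Meta's actual API):
          | 
          |   def generate_views(prompt):  # text -> N 2D images
          |       ...
          | 
          |   def reconstruct(views):  # 2D views -> 3D field, e.g. SDF
          |       ...
          | 
          |   def extract_mesh(field):  # 3D field -> textured mesh
          |       ...
          | 
          |   def text_to_3d(prompt):
          |       return extract_mesh(
          |           reconstruct(generate_views(prompt)))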
        
       | anditherobot wrote:
        | Can this potentially support:
        | 
        | - Image input to 3D model output
        | 
        | - A 3D model (format) as input
        | 
        | Question: what is the current state-of-the-art commercially
        | available product in that niche?
        
         | moffkalast wrote:
         | Meshroom, if you have enough images ;)
        
         | egnehots wrote:
          | This is a pipeline for text-to-3D.
          | 
          | But for the 3D generation stage it uses a model that is
          | more flexible:
          | 
          | https://assetgen.github.io/
          | 
          | It can be conditioned on text or image.
        
       | Simon_ORourke wrote:
       | Are those guys still banging on about that Metaverse? That's
       | taken a decided back seat to all the AI innovation in the past 18
       | months.
        
         | dvngnt_ wrote:
          | Zuck has said before that ML will help make the
          | "metaverse" more viable.
          | 
          | He still needs a moat with its own ecosystem, like the
          | iPhone.
        
         | yieldcrv wrote:
          | Meta has spent like $50bn on their Metaverse line item
          | since 2021 and hasn't stopped.
          | 
          | That probably means a bunch of H100s now for this Meta 3D
          | Gen thing, and other yet-unannounced things still
          | incubating in a womb of datasets.
        
       | localfirst wrote:
        | Can somebody please, please integrate SAM with 3D-primitive
        | RAGging? That would be the holy-grail solution for a 3D
        | modeler; the "blobs" generated by Luma and the like aren't
        | very useful.
        
       | rebuilder wrote:
        | I'm puzzled by the poor texture quality in these. The
        | colours are just bad - it looks like the textures are blown
        | out (the detail at the bright end clips into white) and much
        | too contrasty (the turkey does that transition from red to
        | white via a band of yellow). I wonder why that is - was the
        | training just done on the cheap?
        
         | firtoz wrote:
          | It seems to do very well compared to the alternatives, but
          | there's indeed a long way to go.
        
       | wkat4242 wrote:
        | I can't wait for this to become usable. I love VR, but the
        | content generation is just sooooo labour-intensive. Help with
        | creating 3D models would go a long way, and would be the #1
        | enabler for the metaverse IMO.
        
         | jsheard wrote:
          | VR is especially unforgiving of "fake" detailing; you need
          | as much detail as possible in the actual geometry to really
          | sell it. That's the opposite of how these models currently
          | work: they output goopy low-res geometry and approximate
          | most of the detailing with textures, which would
          | immediately register as fake with stereoscopic depth
          | perception.
        
           | SV_BubbleTime wrote:
           | Agreed.
           | 
            | Every time I see text-to-3D, it's ALWAYS textured. That
            | is the obvious give-away that it is still garbage.
            | 
            | Show me text-to-wireframe that looks good and I'll get
            | excited.
        
           | spookie wrote:
            | Yup. I'm doing a VR project, an urban environment, and I
            | haven't really found a good enough solution for 3D
            | reconstruction from images.
            | 
            | Yes, there are Gaussian splatting, NeRF, and derivatives,
            | but their outputs _really don't look good_. It's also
            | necessary to have the surface remeshed if you go that
            | route, and then you need to retexture it.
            | 
            | It's a crazy thing, though, being able to see everything
            | to scale and so close up :)
        
             | ibrarmalik wrote:
                | By output do you mean the extracted surface
                | geometry? Or are you directly rendering NeRFs in VR?
        
               | spookie wrote:
                | Given the scale it wouldn't be wise to render them
                | directly. There's also the issue of recording in real
                | life without changes happening while doing so.
                | 
                | I should have clarified, but yes, I was talking about
                | the extracted surface geometry.
        
             | bhewes wrote:
              | I find it much easier to remesh and deal with textures
              | from a crappy 3D reconstruction than working with 2D
              | images only. I also shoot HDRIs and photos for PBR. I
              | find sculpting tools super useful for VR, but yeah,
              | it's still an art even with all the AI help.
        
             | dclowd9901 wrote:
              | Not Meta VR, but one of my favorite things to do in
              | Gran Turismo 7 with my PSVR2 is just "sit" in the cars
              | and look around the cabins. The level of detail the
              | devs put in is on another level.
        
           | Liquix wrote:
           | does displacement mapping not hold up in VR?
        
             | lawlessone wrote:
              | I think displacement maps are often made by starting
              | with highly detailed models and baking some of the
              | smaller details down to normal, bump, reflection(?)
              | maps, etc.
        
           | TylerE wrote:
            | I'd liken it to the trend from 5-10 years ago for every
            | game to have randomly generated levels.
            | 
            | It doesn't feel like an expansive world - it's the same
            | few basic building blocks combined in every possible
            | way. It doesn't feel intentional or interesting.
        
           | outside415 wrote:
            | This is why I love Half-Life: Alyx. It packs so much
            | detail into the VR space, in a way no other game ever
            | has, which makes for a truly immersive experience.
        
         | samspenc wrote:
          | There are a few services that do this already, but they
          | are all somewhat lacking; hopefully Meta's paper / solution
          | brings some significant improvements to this space.
          | 
          | The existing ones:
          | 
          | - Meshy https://www.meshy.ai/ one of the first movers in
          | this space, though its quality isn't that great
          | 
          | - Rodin https://hyperhuman.deemos.com/rodin newer, but
          | folks are saying this is better
          | 
          | - Luma Labs has a 3D generator https://lumalabs.ai/genie
          | but it doesn't seem that popular
        
       | iamleppert wrote:
        | I tried the whole recent wave of text/image-to-3D-model
        | services, some touting $100MM+ valuations and tens of
        | millions raised, and found them all to produce unusable
        | garbage.
        
         | jampekka wrote:
         | The gap from demos/papers to reality is huge. ML has a bad
         | replication crisis.
        
           | SV_BubbleTime wrote:
            | > The gap from demos/papers to reality is huge.
            | 
            | SAI showed Stable Diffusion 3 pictures of women lying on
            | grass. If you haven't been following SD3...
            | 
            | https://arstechnica.com/information-technology/2024/06/ridic...
        
           | freeone3000 wrote:
           | This is not a "replication crisis". Running the paper gets
           | you the same results as the author; it's uniquely replicable.
           | The results not being useful in a product is not the same as
           | a fundamental failure in our use of the scientific process.
        
             | jampekka wrote:
             | That is reproducibility. Replicability means that the
             | results hold for replication outside the specific
             | circumstances of one study.
        
               | fngjdflmdflg wrote:
               | >Replicability means that the results hold for
               | replication outside the specific circumstances of one
               | study.
               | 
               | If by "hold for replication outside the specific
               | circumstances of one study" you mean "useful for real
               | world problems" as implied by your previous comment then
               | I don't think you are correct.
               | 
               | From a quick search it seems there are multiple
               | definitions of Reproducibility and Replicability with
               | some using the words interchangeably but the most
               | favorable one I found to what you are saying is this
               | definition:
               | 
               | >Replicability is obtaining consistent results across
               | studies aimed at answering the same scientific question,
               | each of which has obtained its own data.
               | 
               | >[...]
               | 
               | >In general, whenever new data are obtained that
               | constitute the results of a study aimed at answering the
               | same scientific question as another study, the degree of
               | consistency of the results from the two studies
               | constitutes their degree of replication.[0]
               | 
                | However, I think this holds true for a lot of the ML
                | research going on. The issue is not that the
                | solutions don't generalize; it's that the solution
                | itself is not useful for most real-world
                | applications. I don't see what replicability has to
                | do with it: you can train a given model on a
                | different but similar dataset and you will get the
                | same quality of non-useful results. I'm not sure
                | exactly what definition of replicability you are
                | using, though; if there is one I missed, please
                | point it out.
               | 
               | [0] https://www.ncbi.nlm.nih.gov/books/NBK547546/
        
         | architango wrote:
         | I have too, and you're quite right. Also the various 2D-to-3D
         | face generators are mostly awful. I've done a deep dive on that
         | and nearly all of them seem to only create slight perturbations
         | on some base model, regardless of the input.
        
         | dgellow wrote:
         | Haven't tried all, but yeah, pretty bad so far
        
         | dudus wrote:
          | SOTA text-to-image 5 years ago was complete garbage, and
          | most people thought the same of it. Look how good it is
          | now.
          | 
          | You have to look at this as stepping-stone research.
        
           | raincole wrote:
            | Did they get such high valuations 5 years ago? Genuine
            | question.
        
             | dinglestepup wrote:
             | No. With one partial exception being OpenAI that got $1B
             | investment ~5 years ago from MS before they launched DALL-E
             | v1 (and even before GPT-3).
        
             | gpm wrote:
              | I'm not sure I'd expect valuations to be at all
              | similar.
              | 
              | The potential target market is significantly different
              | in scale (I assume; I haven't tried to estimate
              | either). The potential competitors are... already in
              | existence. It seems more likely now that we'll succeed
              | at good 3D generative AI than it seemed, before we got
              | good 2D generative AI, that we would succeed at that...
        
         | ddtaylor wrote:
          | We tried them too. My wife is a 3D artist, but we needed a
          | lot of assets that frankly weren't that important. The plan
          | was to use the output as a starting point and improve it
          | manually as needed.
          | 
          | The problem is that the output you get is just baked
          | meshes. If the object connects together or has a few
          | pieces, you'll have to essentially undo some of that work.
          | Similar problems with textures, as the AI doesn't work the
          | way other artists normally do.
          | 
          | All of this is also on top of the output being basically
          | garbage. Input photos ultimately fail in ways that would
          | require so much work to fix that it invalidates the
          | concept. By the time you start to get something approaching
          | decent output, you've put in more work or money than just
          | having someone make it to begin with, while essentially
          | also losing all control over the art pipeline.
        
       | 999900000999 wrote:
        | Would love for an artist to provide some input, but I
        | imagine this could be really good if it generates models that
        | you can edit or start from later.
        | 
        | Or, just throw a PS1 filter on top and make some retro games.
        
         | doctorpangloss wrote:
         | > for an artist to provide some input
         | 
         | Sure, the results are excellent.
         | 
         | > Or, just throw a PS1 filter on top and make some retro games
         | 
         | There's so many creative ways to use these workflows. Consider
         | how much people achieved with NES graphics. The biggest
         | obstacles are tools and marketplaces.
        
           | testfrequency wrote:
            | I question whether you're actually a 3D artist. I'm an
            | artist (as is my partner), and we both agree this looks
            | better than most examples... but it still looks
            | incredibly lackluster and poorly colored, and texturally
            | it continues to have a weird uncanny smoothness that is
            | distracting/obviously generated.
            | 
            | I don't have time to leave a longer reply, and I still
            | need to read over their entire white paper later tonight,
            | but I'm surprised to see someone who claims to be an
            | artist be convinced that this is "incredible".
        
       | LarsDu88 wrote:
       | This is crazy impressive, and the fact they have the whole thing
       | running with a PBR texturing pipeline is really cool.
       | 
       | That being said, I wonder if the use of signed distance fields
       | (SDFs) results in bad topology.
       | 
        | I saw a recently released paper earlier this week that seems
        | to build "game-ready" topology --- stuff that might actually
        | be riggable for animation.
        | https://github.com/buaacyw/MeshAnything
        
         | jsheard wrote:
         | The obvious major caveat with MeshAnything is that it only
         | scales up to outputs with about 800 polygons, so even if their
         | claims about the quality of their topology hold up it's not
         | actually good for much as it stands. For reference a modern AAA
         | game character model can easily exceed 100,000 polygons, and
         | models made to be rendered offline can be an order of magnitude
         | bigger still.
        
           | LarsDu88 wrote:
            | I do some 3D modeling for my side project
            | (https://roguestargun.com), and I suspect those 800
            | polygons with good topology may be more useful to a lot
            | of 3D artists than blobby, fully textured, SDF-derived
            | models.
            | 
            | A low-poly model with good topology can be very easily
            | subdivided and its details extruded for higher
            | definition, a la Ian Hubert's famous vending machine
            | tutorial: https://www.youtube.com/watch?v=v_ikG-u_6r0
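            | 
            | As a toy illustration with trimesh (assuming a clean
            | low-poly cage to start from):
            | 
            |   import trimesh
            | 
            |   mesh = trimesh.load("lowpoly.obj")
            |   # Midpoint subdivision quadruples faces per pass,
            |   # and only looks good if the base topology is clean
            |   smooth = mesh.subdivide().subdivide()
            |   print(len(mesh.faces), "->", len(smooth.faces))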
           | 
           | And of course I'm sure those folks in Shanghai making the
           | Mesh Anything paper did not have access to the datasets or
           | compute power the Meta team had.
        
       | surfingdino wrote:
        | Not sure how adding gen AI is going to make VR any better. I
        | wanted to type "it's like throwing good money after bad", but
        | that's not quite right. Both are black holes where VC money
        | is turned into papers and demos.
        
         | Filligree wrote:
         | The ultimate end goal is a VR game with infinite detail. Sword
         | Art Online, however, remains fiction. Perhaps for the best.
        
       | vletal wrote:
        | Seems like simple-enough 3D-to-3D will be possible soon!
        | 
        | I'll use it to upscale all the meshes and textures in the
        | original Mafia and Unreal Tournament 8x, write a goodbye
        | letter to my family, and disappear.
        | 
        | I think the kids will understand when they grow up.
        
       | f0e4c2f7 wrote:
       | Is there a way to try this yet?
        
       | carbocation wrote:
        | For starters, I'd love to just see a rock-solid neural
        | network replacement for screened Poisson surface
        | reconstruction. (I have seen MeshAnything and I don't think
        | that's the end-game.)
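        | 
        | For reference, the classical baseline I mean, as exposed by
        | Open3D (a sketch; parameters are illustrative):
        | 
        |   import open3d as o3d
        | 
        |   pcd = o3d.io.read_point_cloud("scan.ply")
        |   pcd.estimate_normals()  # Poisson needs oriented normals
        |   mesh, dens = o3d.geometry.TriangleMesh \
        |       .create_from_point_cloud_poisson(pcd, depth=9)
        | 
        | A neural drop-in for that step, robust to noisy normals,
        | is what I'm hoping for.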
        
       | w_for_wumbo wrote:
        | I think this is another precursor step in recreating our
        | reality digitally. As long as you're able to react to the
        | person's state, with enough metrics you can recreate
        | environments and scenarios within a 'safe environment' for
        | people to push through and learn to cope with the scenarios
        | they don't feel safe to address in the 'real' world.
        | 
        | When the person then emerges from this virtual world, it'll
        | be like an egg hatching into a new birth, having learned the
        | lessons in their virtual cocoon.
        | 
        | If you don't like this idea, it's an interesting thought
        | experiment regardless, as we can't verify that we're not
        | already in a form of this.
        
       | polterguy1000 wrote:
       | Meta 3D Gen represents a significant step forward in the realm of
       | 3D content generation, particularly for VR applications. The
       | ability to generate detailed 3D models from text inputs could
       | drastically reduce the labor-intensive process of content
       | creation, making it more accessible and scalable. However, as
       | some commenters have pointed out, the current technology still
       | faces challenges, especially in producing high-quality, detailed
       | geometry that holds up under the scrutiny of VR's stereoscopic
       | depth perception. The integration of PBR texturing is a promising
       | feature, but the real test will be in how well these models can
       | be refined and utilized in practical applications. It's an
       | exciting development, but there's still a long way to go before
       | it can fully meet the needs of VR developers and artists.
        
         | xena wrote:
          | Generally these things are useless for 3D artists, because
          | the wireframes they produce are unusable.
        
         | guiomie wrote:
          | That would be great. I've learnt some Unity, building my
          | own little VR game, and I dread having to learn Blender or
          | any other tool to make more detailed shapes/models. I've
          | tried a few GenAI tools to create 3D models and the quality
          | is not usable.
        
       | mintone wrote:
       | I've been bullish[1] on this as a major aspect of generative AI
       | for a while now, so it's great to see this paper published.
       | 
        | 3D has an extremely steep learning curve once you try to do
        | anything non-trivial, especially in terms of asset creation
        | for VR etc., but my real interest is where this leads in
        | terms of real-world items. One of the major hurdles is that
        | in the real world we aren't as forgiving as we are in
        | VR/games. I'm not entirely surprised to see that most of the
        | outputs are "artistic" ones, but I'm really interested to see
        | where this ends up when we can give AI combined inputs from
        | text/photos/LIDAR etc. and have it make the model for a
        | physical item that can be 3D printed.
       | 
       | [1] https://www.technicalchops.com/articles/ai-inputs-and-
       | output...
        
       ___________________________________________________________________
       (page generated 2024-07-02 23:00 UTC)