# [ECCV 2024] VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models

Project page, Paper link, HF Demo

VFusion3D is a large, feed-forward 3D generative model trained with a small amount of 3D data and a large volume of synthetic multi-view data. It is the first work to explore scalable 3D generative/reconstruction models as a step towards a 3D foundation model.

**VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models**
Junlin Han, Filippos Kokkinos, Philip Torr
GenAI, Meta and TVG, University of Oxford
European Conference on Computer Vision (ECCV), 2024

## News

* [08.08.2024] The HF Demo is available; big thanks to Jade Choghari for making it possible.
* [25.07.2024] Released weights and inference code for VFusion3D.
## Results and Comparisons

### 3D Generation Results

[gif1] [gif2]

### User Study Results

[user]

## Setup

### Installation

```
git clone https://github.com/facebookresearch/vfusion3d
cd vfusion3d
```

### Environment

We provide a simple installation script that, by default, sets up a conda environment with Python 3.8.19, PyTorch 2.3, and CUDA 12.1. Similar package versions should also work.

```
source install.sh
```

## Quick Start

### Pretrained Models

* Model weights are available on Google Drive. Please download them and place them inside `./checkpoints/`.

### Prepare Images

* Sample inputs are provided under `assets/40_prompt_images`; these are the 40 MVDream-prompt-generated images used in the paper. Results for them are also provided under `results/40_prompt_images_provided`.

### Inference

* Run the inference script to generate 3D assets.
* Specify which form of output to generate by setting the flags `--export_video` and `--export_mesh`.
* Change `--source_path` and `--dump_path` if you want to run it on other image folders.

```
# Example usages
# Render a video
python -m lrm.inferrer --export_video --resume ./checkpoints/vfusion3dckpt

# Export mesh
python -m lrm.inferrer --export_mesh --resume ./checkpoints/vfusion3dckpt
```

### Local Gradio App

```
python gradio_app.py
```

## Hints

1. Running out of GPU memory?
   * Try reducing the `--render_size` parameter to 256 or even 128. Note that this will degrade performance.
2. Unsatisfactory results?
   * This inference code works best with front-view (or nearly front-view) input images. Side views are generally supported but may produce poorer results. If this is the issue, see below.
3. Customizing for different viewing-angle inputs:
   * Although the model supports input images from any viewing angle, you will need to modify `lrm/inferrer.py`, which can be a bit involved and is usually not recommended. Specifically, adjust `canonical_camera_extrinsics` within `_default_source_camera`. To find the `canonical_camera_extrinsics` for the desired input image, follow these steps:
1. Use a front-view image as the input to generate a video result.
2. Check `render_camera_extrinsics` inside `_default_render_cameras` along with the rendered video results.
3. Identify the rendered view that most closely matches the desired input image in viewing angle.
4. Replace the values of `canonical_camera_extrinsics` with the corresponding `render_camera_extrinsics`.
5. Run the inference code again with your desired input view.

## Acknowledgement

* The inference code of VFusion3D borrows heavily from OpenLRM.

## Citation

If you find this work useful, please cite us:

```
@article{han2024vfusion3d,
  title={VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models},
  author={Junlin Han and Filippos Kokkinos and Philip Torr},
  journal={European Conference on Computer Vision (ECCV)},
  year={2024}
}
```

## License

* The majority of VFusion3D is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: OpenLRM as a whole is licensed under the Apache License, Version 2.0, while certain components are covered by NVIDIA's proprietary license.
* The model weights of VFusion3D are also licensed under CC-BY-NC.
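The camera-customization steps in the Hints section above can be sketched in Python. This is a minimal illustration only: `pick_canonical_extrinsics`, the placeholder pose values, and the view count are hypothetical and not taken from `lrm/inferrer.py`, though the variable names mirror the ones referenced there.

```python
# Hypothetical sketch of steps 2-4 above: pick the render-camera pose whose
# viewing angle best matches the desired input image, and reuse it as the
# canonical source camera. Pose values below are illustrative placeholders.

def pick_canonical_extrinsics(render_camera_extrinsics, best_view_index):
    """Return the 3x4 camera-to-world pose of the rendered view that most
    closely matches the desired input image (step 3), so it can replace
    canonical_camera_extrinsics in _default_source_camera (step 4)."""
    return render_camera_extrinsics[best_view_index]

# Example: suppose _default_render_cameras produced 160 poses and, after
# inspecting the rendered video (step 2), view 40 looked closest in angle
# to the input image.
identity_pose = [[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 2.0]]  # placeholder 3x4 pose
render_camera_extrinsics = [identity_pose for _ in range(160)]
canonical_camera_extrinsics = pick_canonical_extrinsics(render_camera_extrinsics, 40)
```

With `canonical_camera_extrinsics` swapped in, rerun the inference command with the new input view (step 5).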