https://github.com/ashawkey/stable-dreamfusion
# Stable-Dreamfusion

A pytorch implementation of the text-to-3D model Dreamfusion, powered by the Stable Diffusion text-to-2D model.

The original paper's project page: DreamFusion: Text-to-3D using 2D Diffusion.

Examples generated from the text prompt *a high quality photo of a pineapple*, viewed in real time with the GUI: pineapple.mp4

Gallery | Update Logs

## Important Notice

This project is a work in progress and still differs from the paper in many ways; many features are not yet implemented.
The current generation quality does not yet match the results from the original paper, and many prompts still fail badly!

## Notable differences from the paper

* Since the Imagen model is not publicly available, we use Stable Diffusion in its place (implementation from diffusers). Unlike Imagen, Stable Diffusion is a latent diffusion model, which diffuses in a latent space instead of the original image space. Therefore, the loss also needs to propagate back through the VAE's encoder, which adds extra training time. Currently, 15000 training steps take about 5 hours on a V100.
* We use the multi-resolution grid encoder to implement the NeRF backbone (implementation from torch-ngp), which enables much faster rendering (~10 FPS at 800x800).
* We use the Adam optimizer with a larger initial learning rate.

## TODOs

* The normal evaluation & shading part.
* Better mesh (improve the surface quality).

## Install

```bash
git clone https://github.com/ashawkey/stable-dreamfusion.git
cd stable-dreamfusion
```

**Important**: to download the Stable Diffusion model checkpoint, you should create a file called `TOKEN` under this directory (i.e., `stable-dreamfusion/TOKEN`) and copy your Hugging Face access token into it.

### Install with pip

```bash
pip install -r requirements.txt

# (optional) install the tcnn backbone if using --tcnn
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch

# (optional) install CLIP guidance for the dreamfield setting
pip install git+https://github.com/openai/CLIP.git

# (optional) install nvdiffrast for exporting textured mesh
pip install git+https://github.com/NVlabs/nvdiffrast/
```

### Build extension (optional)

By default, we use `load` to build the extension at runtime.
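Here, `load` presumably refers to PyTorch's JIT extension loader (`torch.utils.cpp_extension.load`), which compiles the CUDA sources on first use and caches the built artifacts. If a stale cached build ever causes import errors after an update, clearing the cache forces a clean rebuild (the path below is the common Linux default; the `TORCH_EXTENSIONS_DIR` environment variable overrides it):

```shell
# Force a clean JIT rebuild by removing torch's extension cache.
# ~/.cache/torch_extensions is the usual default location on Linux;
# set TORCH_EXTENSIONS_DIR to relocate it.
rm -rf ~/.cache/torch_extensions
```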
We also provide `setup.py` to build each extension manually:

```bash
# install all extension modules
bash scripts/install_ext.sh

# if you want to install manually, here is an example:
# install to python path (you still need the raymarching/ folder, since this only installs the built extension)
pip install ./raymarching
```

## Tested environments

* Ubuntu 22 with torch 1.12 & CUDA 11.6 on a V100.

## Usage

The first run will take some time to compile the CUDA extensions.

```bash
### stable-dreamfusion setting

# train with a text prompt
# `-O` equals `--cuda_ray --fp16 --dir_text`
python main_nerf.py --text "a hamburger" --workspace trial -O

# test (exports a 360 video, and an obj mesh with a png texture)
python main_nerf.py --text "a hamburger" --workspace trial -O --test

# test with a GUI (free view control!)
python main_nerf.py --text "a hamburger" --workspace trial -O --test --gui

### dreamfields (CLIP) setting

python main_nerf.py --text "a hamburger" --workspace trial_clip -O --guidance clip
python main_nerf.py --text "a hamburger" --workspace trial_clip -O --test --gui --guidance clip
```

## Code organization & Advanced tips

This is a brief description of the most important implementation details. If you are interested in improving this repo, this might be a starting point. Any contribution would be greatly appreciated!

* The SDS loss is located at `./nerf/sd.py > StableDiffusion > train_step`:

```python
# 1. we need to interpolate the NeRF rendering to 512x512, to feed it to SD's VAE
pred_rgb_512 = F.interpolate(pred_rgb, (512, 512), mode='bilinear', align_corners=False)

# 2. image (512x512) --- VAE --> latents (64x64); this is where SD differs from Imagen
latents = self.encode_imgs(pred_rgb_512)

... # timestep sampling, noise adding and UNet noise predicting

# 3. the SDS loss: since the UNet part is ignored and we cannot simply autodiff,
# we manually set the gradient for latents
```
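Why set the gradient manually? Injecting a gradient `g` at an intermediate tensor via `backward(gradient=g)` is equivalent to back-propagating a surrogate scalar loss whose gradient with respect to that tensor is exactly `g`. A standalone sketch of that identity in plain torch (illustrative tensors only, not this repo's code):

```python
import torch

torch.manual_seed(0)
x0 = torch.randn(3)
g = torch.randn(3)  # plays the role of w * (noise_pred - noise)

# path 1: inject the gradient directly at the intermediate tensor,
# as train_step does for latents
xa = x0.clone().requires_grad_(True)
ya = xa.tanh()  # stand-in for "render -> encode to latents"
ya.backward(gradient=g)

# path 2: back-propagate a surrogate scalar loss with the same gradient
xb = x0.clone().requires_grad_(True)
yb = xb.tanh()
loss = (g.detach() * yb).sum()  # d(loss)/d(yb) == g
loss.backward()

print(torch.allclose(xa.grad, xb.grad))  # True
```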
```python
w = (1 - self.scheduler.alphas_cumprod[t]).to(self.device)
grad = w * (noise_pred - noise)
latents.backward(gradient=grad, retain_graph=True)
```

* Other regularizations are in `./nerf/utils.py > Trainer > train_step`.
  + Generation seems quite sensitive to regularizations on `weights_sum` (the alphas for each ray). The original opacity loss tends to make the NeRF disappear (zero density everywhere), so for now we replace it with an entropy loss that encourages each alpha to be either 0 or 1.
* NeRF rendering core function: `./nerf/renderer.py > NeRFRenderer > run_cuda`.
* Shading & normal evaluation: `./nerf/network*.py > NeRFNetwork > forward`. The current implementation harms training and is disabled.
  + Use `--albedo_iters 1000` to enable the random shading mode (albedo, lambertian, and textureless) after 1000 albedo-only steps.
  + Light direction: the current implementation uses a plane light source, instead of a point light source...
* View-dependent prompting: `./nerf/provider.py > get_view_direction`.
  + Use `--angle_overhead`, `--angle_front` to set the borders. How to better divide front/back/side regions?
* The network backbone (`./nerf/network*.py`) can be chosen with the `--backbone` option, but `tcnn` and `vanilla` are not well tested.
  + The occupancy-grid-based training acceleration (instant-ngp style) may harm the generation progress, since once a grid cell is marked as empty, rays won't pass through it later.
* Spatial density bias (gaussian density blob): `./nerf/network*.py > NeRFNetwork > gaussian`.

## Acknowledgement

* The amazing original work: DreamFusion: Text-to-3D using 2D Diffusion.

```bibtex
@article{poole2022dreamfusion,
  author  = {Poole, Ben and Jain, Ajay and Barron, Jonathan T. and Mildenhall, Ben},
  title   = {DreamFusion: Text-to-3D using 2D Diffusion},
  journal = {arXiv},
  year    = {2022},
}
```

* Huge thanks to the Stable Diffusion and the diffusers library.
```bibtex
@misc{rombach2021highresolution,
  title         = {High-Resolution Image Synthesis with Latent Diffusion Models},
  author        = {Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Bjorn Ommer},
  year          = {2021},
  eprint        = {2112.10752},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}

@misc{von-platen-etal-2022-diffusers,
  author       = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
  title        = {Diffusers: State-of-the-art diffusion models},
  year         = {2022},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/huggingface/diffusers}}
}
```

* The GUI is developed with DearPyGui.