DreamFusion: Text-to-3D using 2D Diffusion
Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract
- Recent breakthroughs in text-to-image synthesis have been driven by diffusion models
- 3D synthesis requires large-scale datasets and efficient architectures, which don't exist
- Text-to-3D synthesis is done using a pretrained 2D text-to-image diffusion model
- Loss based on probability density distillation enables use of 2D diffusion model as prior
- DeepDream-like procedure optimizes a randomly-initialized 3D model via gradient descent

Paper Content

Introduction
- Generative image models now support high-fidelity, diverse and controllable image synthesis
- Quality improvements come from large aligned image-text datasets and scalable generative model architectures
- Diffusion models are effective at learning high-quality image generators
- Applying diffusion models to other modalities requires large amounts of modality-specific training data
- This work develops techniques to transfer pretrained 2D image-text diffusion models to 3D object synthesis
- 3D generative models can be trained on explicit representations of structure
- GANs can learn controllable 3D generators from photographs of a single object category
- Neural Radiance Fields can be used for neural inverse rendering
- Many 3D generative approaches have found success incorporating NeRF-like models
- This work uses pretrained 2D image-text models for 3D synthesis
- Score Distillation Sampling (SDS) enables sampling via optimization in differentiable image parameterizations
- DreamFusion generates high-fidelity coherent 3D objects and scenes for user-provided text prompts

Diffusion models and score distillation sampling
- Diffusion models are generative models that learn to transform a sample from a noise distribution to a data distribution....
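
The idea of sampling by optimizing a differentiable image parameterization, rather than sampling pixels directly, can be illustrated with a toy sketch. This is not the paper's implementation: the "pretrained diffusion model" is replaced by the exact noise predictor for a Gaussian data distribution, the "3D model" is replaced by a hypothetical linear generator `x = W @ theta`, and the noise schedule, timestep range, and unit weighting are all assumptions chosen for illustration. Under those assumptions, repeatedly noising the rendered output, asking the denoiser for its noise estimate, and pushing the residual back through the generator should move the output toward the mode of the data distribution.

```python
import numpy as np


def toy_eps_hat(z_t, alpha, sigma, mu, s):
    """Exact noise prediction E[eps | z_t] when clean data follow N(mu, s^2 I).

    Stand-in for a frozen, pretrained denoiser (an assumption for this sketch).
    """
    return sigma * (z_t - alpha * mu) / (alpha**2 * s**2 + sigma**2)


def sds_optimize(mu, W, steps=5000, lr=0.01, s=0.5, seed=0):
    """Optimize theta so that x = W @ theta looks likely under the toy prior."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=W.shape[1])  # randomly initialized parameters
    for _ in range(steps):
        # Sample a noise level; the cosine schedule and range are assumptions.
        t = rng.uniform(0.02, 0.98)
        alpha, sigma = np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)

        x = W @ theta                      # "render" the current parameters
        eps = rng.normal(size=x.shape)     # fresh Gaussian noise
        z_t = alpha * x + sigma * eps      # noised rendering

        # Score-distillation-style update: the residual between predicted and
        # injected noise is pushed through the generator Jacobian (W.T here).
        # The denoiser is treated as fixed, and the weighting w(t) is set to 1.
        residual = toy_eps_hat(z_t, alpha, sigma, mu, s) - eps
        theta -= lr * (W.T @ residual)
    return theta


mu = np.array([2.0, -1.0])                 # mode of the toy data distribution
W = np.array([[1.0, 0.5], [0.0, 1.0]])     # hypothetical linear "renderer"
theta = sds_optimize(mu, W)
x_final = W @ theta
print("optimized output:", x_final, "target mode:", mu)
```

In this toy setting the expected update is proportional to `W.T @ (x - mu)`, so plain gradient descent drives the generated output toward `mu`, with some residual jitter from the sampled noise. A real pipeline would swap in a text-conditioned image diffusion model and a differentiable 3D renderer in place of the two stand-ins.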