Text-Guided 3D Synthesis with Latent Diffusion Models

Date

2023

Authors

Kovalenko, Danylo

Abstract

The emergence of diffusion models has greatly impacted the field of deep generative models, establishing them as a powerful family of models with state-of-the-art performance in applications such as text-to-image, image-to-image, and text-to-audio synthesis. In this work, we propose a solution for text-guided 3D synthesis using denoising diffusion probabilistic models while minimizing memory and computational requirements. Our goal is high-quality, high-fidelity 3D object generation, conditioned on a text prompt or a class label, in a matter of seconds. We combine a triplane parametrization of 3D space with a Latent Diffusion Model (LDM) to generate smooth and coherent geometry: the LDM is trained on a large-scale text-to-3D dataset and serves as a latent triplane texture generator. The triplane parametrization improves the efficiency of the spatial representation and reduces the computational cost of synthesis. We also give a theoretical justification that this parametrization of 3D space can encode not only the geometry of an object but also its color and reflectivity. Finally, an implicit neural renderer decodes geometry details from the triplane textures.
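
To make the decoding step concrete, below is a minimal sketch of an EG3D-style triplane lookup followed by an implicit MLP decoder, written in PyTorch. Everything here is an illustrative assumption rather than the thesis implementation: the class name TriplaneDecoder, the tensor shapes, and the 4-channel density-plus-RGB output are all hypothetical.

# Minimal sketch of triplane feature lookup + implicit decoding (PyTorch).
# All names and shapes are illustrative assumptions, not the thesis code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriplaneDecoder(nn.Module):
    """Maps 3D points to (density, RGB) via three axis-aligned feature planes."""

    def __init__(self, feat_dim: int = 32, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 1 density + 3 color channels
        )

    def forward(self, planes: torch.Tensor, pts: torch.Tensor) -> torch.Tensor:
        # planes: (3, C, H, W) xy/xz/yz textures, e.g. decoded from LDM latents.
        # pts:    (N, 3) query points in [-1, 1]^3.
        projections = (pts[:, [0, 1]], pts[:, [0, 2]], pts[:, [1, 2]])
        feats = []
        for plane, coords in zip(planes, projections):
            # grid_sample expects a (B, H_out, W_out, 2) sampling grid.
            grid = coords.view(1, -1, 1, 2)
            sampled = F.grid_sample(plane.unsqueeze(0), grid,
                                    mode="bilinear", align_corners=True)
            feats.append(sampled.squeeze(0).squeeze(-1).t())  # (N, C)
        # Per-plane features are concatenated here; summing is another option.
        return self.mlp(torch.cat(feats, dim=-1))  # (N, 4)

# Toy usage: random planes stand in for the LDM-generated triplane textures.
decoder = TriplaneDecoder()
planes = torch.randn(3, 32, 128, 128)
pts = torch.rand(1024, 3) * 2.0 - 1.0
density_rgb = decoder(planes, pts)  # (1024, 4), ready for volume rendering

In a full pipeline the three feature planes would come from decoding the LDM's latent sample, and the queried density and color would then drive volumetric rendering of the final object.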

Citation

Kovalenko, Danylo. Text-Guided 3D Synthesis with Latent Diffusion Models. Faculty of Applied Sciences, Department of Computer Sciences. Lviv, 2023. 38 p.
