3D Gaussian Splatting (3DGS) has demonstrated superior quality in modeling 3D objects and scenes. However, generating 3DGS remains challenging due to its discrete, unstructured,
and permutation-invariant nature. In this work, we present a simple yet effective method to overcome these challenges. We utilize spherical mapping to transform 3DGS into a
structured 2D representation, termed UVGS. UVGS can be viewed as a multi-channel image whose feature dimension concatenates Gaussian attributes such as position,
scale, color, opacity, and rotation. We further find that these heterogeneous features can be compressed into a lower-dimensional (e.g., 3-channel) shared feature space, termed Super UVGS, using
a carefully designed multi-branch network. The resulting Super UVGS can be treated as a typical RGB image. Remarkably, we discover that off-the-shelf VAEs trained with latent diffusion
models can directly generalize to this new representation without additional training.
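To make the spherical mapping concrete, below is a minimal sketch of how unordered Gaussians could be scattered onto a structured UV grid. It assumes a simple equirectangular parameterization and a 14-float attribute layout; the function name, normalization, and (ignored) collision handling are illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def gaussians_to_uvgs(positions, attributes, resolution=512):
    """Scatter unordered 3D Gaussians onto a structured UV grid.

    positions:  (N, 3) Gaussian centers.
    attributes: (N, C) concatenated per-Gaussian features, e.g.
                scale (3) + rotation quaternion (4) + opacity (1) + color (3).
    Returns a (resolution, resolution, 3 + C) multi-channel "UVGS image"
    whose channels stack position with the remaining attributes.
    """
    # Center the object and project each Gaussian onto the unit sphere.
    centered = positions - positions.mean(axis=0)
    x, y, z = centered[:, 0], centered[:, 1], centered[:, 2]
    r = np.linalg.norm(centered, axis=1) + 1e-8

    # Equirectangular angles: azimuth theta in [-pi, pi],
    # elevation phi in [-pi/2, pi/2].
    theta = np.arctan2(y, x)
    phi = np.arcsin(np.clip(z / r, -1.0, 1.0))

    # Quantize (theta, phi) to integer UV pixel indices.
    u = ((theta + np.pi) / (2 * np.pi) * (resolution - 1)).astype(int)
    v = ((phi + np.pi / 2) / np.pi * (resolution - 1)).astype(int)

    # Write each Gaussian's full feature vector into its UV texel.
    # (Collisions, i.e. two Gaussians landing on one texel, are ignored
    # here; the real method needs a strategy for resolving them.)
    C = attributes.shape[1]
    uvgs = np.zeros((resolution, resolution, 3 + C), dtype=np.float32)
    uvgs[v, u, :3] = positions
    uvgs[v, u, 3:] = attributes
    return uvgs
```

Reading the non-empty texels back out inverts the projection, and since each texel stores one Gaussian, capacity grows quadratically with the UV resolution, which is the scalability property noted below.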
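The multi-branch compression can likewise be sketched. The branch split, layer widths, and fusion layer below are illustrative assumptions; the paper specifies only that heterogeneous attributes are mapped into a shared low-dimensional (e.g., 3-channel) space by a multi-branch network.

```python
import torch
import torch.nn as nn

class MappingNetwork(nn.Module):
    """Hypothetical multi-branch mapper: UVGS (14 ch) -> Super UVGS (3 ch).

    One small convolutional branch per attribute group, fused into a
    shared 3-channel feature space; sizes are illustrative only.
    """
    def __init__(self, out_channels=3):
        super().__init__()
        def branch(in_ch):  # one encoder per attribute group
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, padding=1), nn.SiLU(),
                nn.Conv2d(32, 32, 3, padding=1), nn.SiLU(),
            )
        self.pos_branch = branch(3)   # xyz position
        self.geo_branch = branch(8)   # scale(3) + rotation(4) + opacity(1)
        self.col_branch = branch(3)   # RGB color
        self.fuse = nn.Conv2d(3 * 32, out_channels, 1)

    def forward(self, uvgs):          # uvgs: (B, 14, H, W)
        pos, geo, col = uvgs.split([3, 8, 3], dim=1)
        feats = torch.cat(
            [self.pos_branch(pos), self.geo_branch(geo), self.col_branch(col)],
            dim=1,
        )
        return self.fuse(feats)       # Super UVGS: (B, 3, H, W)
```

A mirrored inverse mapping network would expand the 3 shared channels back to the full attribute channels; training both ends with a reconstruction loss on UVGS would be one plausible setup.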
Our novel representation makes it effortless to leverage foundational 2D models, such as diffusion models, to directly model 3DGS.
Additionally, one can simply increase the 2D UV resolution to accommodate more Gaussians, making UVGS a scalable solution compared to typical 3D backbones.
This approach immediately unlocks various novel 3DGS generation applications by directly leveraging the mature capabilities of existing 2D generative models.
In our experiments, we demonstrate various diffusion-based unconditional generation, conditional generation, and inpainting applications of 3DGS, tasks that were previously non-trivial.
UVGS and Super UVGS can also be used to compress 3DGS assets by up to 99.5% using pretrained image autoencoders.
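As a back-of-envelope illustration of that claim, here is a sketch of pushing a Super UVGS image through a pretrained latent-diffusion VAE. The checkpoint choice and the 14-floats-per-Gaussian bookkeeping are illustrative assumptions, not the paper's exact setup.

```python
import torch
from diffusers import AutoencoderKL  # pretrained Stable Diffusion VAE

# Per the abstract, any VAE trained for latent diffusion should
# generalize to Super UVGS; this checkpoint is an illustrative choice.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").eval()

# A 3-channel Super UVGS "image" (batch, channels, H, W) in [-1, 1].
super_uvgs = torch.rand(1, 3, 512, 512) * 2 - 1

with torch.no_grad():
    latents = vae.encode(super_uvgs).latent_dist.mean  # (1, 4, 64, 64)
    recon = vae.decode(latents).sample                 # (1, 3, 512, 512)

# Rough storage comparison: raw 3DGS floats vs. VAE latent floats.
n_gaussians = 512 * 512          # one Gaussian per UV texel
raw_floats = n_gaussians * 14    # pos(3)+scale(3)+rot(4)+opacity(1)+rgb(3)
latent_floats = latents.numel()  # 4 * 64 * 64 = 16384
print(f"compression: {1 - latent_floats / raw_floats:.1%}")
# ~99.6% with these illustrative numbers, in the ballpark of the
# up-to-99.5% figure quoted above.
```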
The following figure shows a wide variety of high-quality unconditional generation results from our method. We train a diffusion model to sample Super UVGS images from random noise. The sampled Super UVGS images are then converted back to 3DGS objects using the inverse mapping network followed by inverse spherical projection, as sketched below.
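A minimal sketch of that sampling pipeline follows. The `diffusion` and `inverse_mapping_net` objects, and the `denoise_step` interface, are hypothetical stand-ins for the trained models; none of these names come from the paper's code.

```python
import torch

@torch.no_grad()
def sample_3dgs(diffusion, inverse_mapping_net, resolution=512, steps=50):
    """Sketch of the unconditional Super UVGS sampling pipeline."""
    # 1. Start from Gaussian noise in the 3-channel Super UVGS space.
    x = torch.randn(1, 3, resolution, resolution)

    # 2. Iteratively denoise to a clean Super UVGS image.
    for t in reversed(range(steps)):
        x = diffusion.denoise_step(x, t)  # assumed sampler interface

    # 3. Expand the 3 shared channels back to full UVGS attribute channels.
    uvgs = inverse_mapping_net(x)  # (1, 14, res, res): pos+scale+rot+...

    # 4. Inverse spherical projection: every non-empty texel becomes one
    #    3D Gaussian carrying the attributes stored in its channels.
    feats = uvgs.squeeze(0).permute(1, 2, 0).reshape(-1, uvgs.shape[1])
    occupied = feats.abs().sum(dim=1) > 0   # drop empty texels
    return feats[occupied]                  # (N, 14) Gaussian parameters
```

The text-conditioned variant shown next differs only in that the denoiser additionally receives a text embedding.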
The following are conditional generation results from our method. We train a text-conditioned diffusion model to sample Super UVGS images, which are converted to 3DGS objects in the same way: the inverse mapping network followed by inverse spherical projection.
Comparison of unconditional 3D asset generation on the cars category against SOTA methods. The figure shows that DiffTF produces low-quality, low-resolution cars lacking detail. While Get3D and GaussianCube achieve higher resolution, they suffer from 3D inconsistency, numerous artifacts, and a lack of rich 3D detail. In contrast, our method generates high-quality, high-resolution objects that are 3D-consistent, with sharp, well-defined edges.
We also compare the performance of our model against various SOTA methods for text-conditioned object synthesis. Our method generates high-quality 3D assets not only for simple objects but also for complicated objects with intricate geometry.
@article{rai2025uvgs,
title={UVGS: Reimagining Unstructured 3D Gaussian Splatting using UV Mapping},
author={Aashish Rai and Dilin Wang and Mihir Jain and Nikolaos Sarafianos and Arthur Chen and Srinath Sridhar and Aayush Prakash},
journal={arXiv preprint},
year={2025}
}