StyleGaussian: Instant 3D Style Transfer
with Gaussian Splatting
arXiv 2024

Kunhao Liu
Nanyang Technological University
Fangneng Zhan
Max Planck Institute for Informatics
Muyu Xu
Nanyang Technological University
Christian Theobalt
Max Planck Institute for Informatics
Ling Shao
UCAS-Terminus AI Lab, UCAS
Shijian Lu
Nanyang Technological University

Abstract

We introduce StyleGaussian, a novel 3D style transfer technique that allows instant transfer of any image's style to a 3D scene at 10 frames per second (fps). Leveraging 3D Gaussian Splatting (3DGS), StyleGaussian achieves style transfer without compromising real-time rendering or multi-view consistency. It performs instant style transfer in three steps: embedding, transfer, and decoding. First, 2D VGG scene features are embedded into reconstructed 3D Gaussians. Next, the embedded features are transformed according to a reference style image. Finally, the transformed features are decoded into stylized RGB. StyleGaussian has two novel designs. The first is an efficient feature rendering strategy that first renders low-dimensional features and then maps them into high-dimensional VGG features during embedding. This cuts memory consumption significantly and enables 3DGS to render the high-dimensional, memory-intensive features. The second is a K-nearest-neighbor-based 3D CNN that serves as the decoder for the stylized features and eliminates the 2D CNN operations that compromise strict multi-view consistency. Extensive experiments show that StyleGaussian achieves instant 3D stylization with superior stylization quality while preserving real-time rendering and strict multi-view consistency.
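The efficient feature rendering strategy can be sketched in a few lines. The snippet below (plain PyTorch, not the authors' implementation) uses a toy stand-in for the 3DGS rasterizer and assumes illustrative dimensions and a hypothetical lift module: it alpha-blends compact 32-D per-Gaussian features into a pixel feature map, then maps that map into the 256-D VGG space with a learned affine transform. In this sketch, lifting after blending matches blending already-lifted per-Gaussian features, because the map is affine and each pixel's weights sum to one, which is what lets the low-dimensional rendering pass stand in for rendering full VGG features.

# Hedged sketch of the low-dimensional feature rendering idea described above.
# Dimensions, module names, and the toy blending weights are assumptions.
import torch

N, D_low, D_vgg, P = 50_000, 32, 256, 4096     # Gaussians, feature dims, pixels (assumed)
low_feats = torch.randn(N, D_low)              # learnable low-dimensional per-Gaussian features
lift = torch.nn.Linear(D_low, D_vgg)           # learned affine map to the VGG feature space

# Stand-in for the 3DGS rasterizer: each pixel alpha-blends a few Gaussians.
idx = torch.randint(0, N, (P, 8))                      # Gaussians hit along each ray
weights = torch.rand(P, 8)
weights = weights / weights.sum(dim=1, keepdim=True)   # blending weights sum to one

rendered_low = (weights.unsqueeze(-1) * low_feats[idx]).sum(dim=1)   # (P, 32) feature map
rendered_vgg = lift(rendered_low)                                    # (P, 256) lifted features

# Lifting after blending matches blending already-lifted per-Gaussian features,
# since the map is affine and each pixel's weights sum to one.
check = (weights.unsqueeze(-1) * lift(low_feats)[idx]).sum(dim=1)
print(torch.allclose(rendered_vgg, check, atol=1e-4))                # True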

TL;DR:

StyleGaussian is a novel 3D style transfer pipeline that enables instant style transfer while preserving real-time rendering and strict multi-view consistency.


StyleGaussian


Overview. Given reconstructed 3D Gaussians, we first embed VGG features into the 3D Gaussians (e). Then, given a style image, we transform the embedded features, infusing them with the style information (t). Lastly, we decode the transformed features into RGB to produce the final stylized 3D Gaussians (d). We design an efficient feature rendering strategy in (e) that enables rendering high-dimensional VGG features while learning to embed them into the 3D Gaussians, and we develop a KNN-based 3D CNN as the decoder in (d).
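As a rough illustration of the KNN-based 3D CNN decoder (d), the sketch below runs one KNN "convolution" over Gaussian centers: it gathers each Gaussian's k nearest neighbors, applies a shared MLP, and max-pools over the neighborhood. The brute-force neighbor search, layer widths, and pooling choice are assumptions for illustration, not the exact decoder architecture.

# Illustrative KNN-based 3D convolution over Gaussian centers (assumed design).
import torch

class KNNConv(torch.nn.Module):
    """Gather each Gaussian's k nearest neighbors, apply a shared MLP,
    and max-pool over the neighborhood."""
    def __init__(self, in_dim, out_dim, k=8):
        super().__init__()
        self.k = k
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(in_dim, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, out_dim),
        )

    def forward(self, points, feats):
        dists = torch.cdist(points, points)               # (N, N) pairwise distances
        idx = dists.topk(self.k, largest=False).indices   # (N, k) neighbor indices
        neighbor_feats = feats[idx]                       # (N, k, in_dim)
        return self.mlp(neighbor_feats).amax(dim=1)       # (N, out_dim) after max-pooling

points = torch.rand(1000, 3)                           # Gaussian centers
feats = torch.randn(1000, 256)                         # stylized per-Gaussian features
rgb = torch.sigmoid(KNNConv(256, 3)(points, feats))    # decode to per-Gaussian RGB
print(rgb.shape)                                       # torch.Size([1000, 3])

Stacking a few such layers and ending in three output channels would turn stylized per-Gaussian features directly into RGB without any 2D convolution in image space, which is what preserves strict multi-view consistency.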


Comparisons


Comparison of StyleGaussian with two zero-shot radiance field style transfer methods: HyperNet and StyleRF. StyleGaussian demonstrates superior style transfer quality, with closer style alignment to the reference style images and better content preservation.


3D Stylization Demo

We provide demo videos that demonstrate the instant style transfer ability and stylization quality of StyleGaussian (no speed-up; all style transfer and style interpolation happen on the fly). The reference style image is displayed in the lower right corner. During style interpolation, a second style image is displayed in the lower left corner.

Ignatius
Caterpillar
Train
M60
Truck
Garden
Horse
Barn

Citation

Please consider citing us if you find this project helpful.
@article{liu2023stylegaussian,
  title={StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting},
  author={Liu, Kunhao and Zhan, Fangneng and Xu, Muyu and Theobalt, Christian and Shao, Ling and Lu, Shijian},
  journal={arXiv preprint arXiv:2403.07807},
  year={2024},
}

Acknowledgements

Our work is based on 3D Gaussian Splatting and StyleRF. We thank the authors for their great work and for open-sourcing their code.