Controllable 3D Generative Adversarial Face Model
via Disentangling Shape and Appearance

Fariborz Taherkhani1         Aashish Rai*1         Shaunak Srivastava*1         Quankai Gao*1         Xuanbai Chen1         Fernando de la Torre1         Steven Song2         Aayush Prakash2         Daeil Kim2
1Carnegie Mellon University          2Facebook/Meta
(* equal contribution)

WACV 2023

Abstract

3D face modeling has been an active area of research in computer vision and computer graphics, fueling applications ranging from facial expression transfer in virtual avatars to synthetic data generation. Existing 3D deep learning generative models (e.g., VAEs, GANs) can generate compact face representations (both shape and texture) that model non-linearities in the shape and appearance space (e.g., scatter effects, specularities, etc.). However, they lack the capability to control the generation of subtle expressions.

This paper proposes a new 3D face generative model that can decouple identity and expression and provides granular control over expressions. In particular, we propose pairing a supervised auto-encoder with a generative adversarial network to produce high-quality 3D faces, both in terms of appearance and shape. Experimental results on the generation of 3D faces learned with holistic expression labels, or Action Unit labels, show how we can decouple identity and expression, gaining fine control over expressions while preserving identity.





Uncurated shapes, textures, and rendered faces from our generator trained on the FaceScape dataset. (a) Shapes of different expressions belonging to the same identity. (b) Expression-specific generated textures and corresponding rendered faces. (c) Each row shows multi-view extrapolation of the expression intensity while preserving the identity. (d) Facial expression (smile) synthesis with monotonically increasing intensity.

Architecture

Overview of our 3D generative model. In the first step, we train a supervised auto-encoder (SAE) that projects shapes into two low-dimensional embedding subspaces, one dedicated to the identity factor and the other to the expression factor. In the second step, we use a conditional GAN (cGAN) to sample shape and texture from their respective domains. A renderer then produces photorealistic faces.
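
To make the two-subspace idea concrete, below is a minimal PyTorch sketch of an SAE with separate identity and expression heads. This is an illustrative assumption rather than the released implementation: the vertex count N_VERTS, the code dimensions D_ID / D_EXP, and all layer sizes are made up, and the supervised classifiers that tie each subspace to identity and expression labels are omitted.

# Minimal sketch (not the authors' code) of the two-subspace SAE idea:
# one head maps a face shape to an identity code, another to an expression
# code, and a shared decoder reconstructs the shape from both codes.
import torch
import torch.nn as nn

N_VERTS = 26317          # assumed number of mesh vertices
D_ID, D_EXP = 128, 64    # assumed identity / expression subspace sizes

class SupervisedAutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        in_dim = N_VERTS * 3
        self.backbone = nn.Sequential(nn.Linear(in_dim, 1024), nn.ReLU())
        # Two heads project into the identity and expression subspaces.
        self.to_id = nn.Linear(1024, D_ID)
        self.to_exp = nn.Linear(1024, D_EXP)
        # The decoder reconstructs the shape from the concatenated codes.
        self.decoder = nn.Sequential(nn.Linear(D_ID + D_EXP, 1024),
                                     nn.ReLU(), nn.Linear(1024, in_dim))

    def forward(self, shape):                  # shape: (B, N_VERTS * 3)
        h = self.backbone(shape)
        z_id, z_exp = self.to_id(h), self.to_exp(h)
        recon = self.decoder(torch.cat([z_id, z_exp], dim=-1))
        return recon, z_id, z_exp

# Training would combine a reconstruction loss with label losses on each
# subspace; only the reconstruction term is shown here, on dummy data.
sae = SupervisedAutoEncoder()
shapes = torch.randn(4, N_VERTS * 3)
recon, z_id, z_exp = sae(shapes)
loss = nn.functional.mse_loss(recon, shapes)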


Extrapolation along the expressions

Varying the intensity of expressions by extrapolation: faces show a smooth increase in expressiveness as we vary the intensity along the expression dimension.
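
One hypothetical way to produce such an intensity sweep, reusing the SAE sketch above: hold the identity code fixed and scale the expression code by a factor alpha before decoding, where alpha > 1 extrapolates beyond the observed intensity. The actual traversal used for the figure may differ.

# Expression-intensity extrapolation sketch (assumes the SAE sketch above).
import torch

with torch.no_grad():
    z_id0 = z_id[0:1]                          # fixed identity code
    z_exp0 = z_exp[0:1]                        # expression direction
    for alpha in [0.0, 0.5, 1.0, 1.5, 2.0]:    # alpha > 1.0 extrapolates
        z = torch.cat([z_id0, alpha * z_exp0], dim=-1)
        shape_alpha = sae.decoder(z)           # decoded shape at this intensity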



Interpolation between Identities

Smooth linear interpolation between identities.
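
Correspondingly, identity interpolation can be sketched (again with the hypothetical SAE above) by linearly blending two identity codes while the expression code stays fixed.

# Identity interpolation sketch (assumes the SAE sketch above).
import torch

with torch.no_grad():
    z_exp_fixed = z_exp[0:1]                   # shared expression code
    z_id_a, z_id_b = z_id[0:1], z_id[1:2]      # two source identities
    for t in torch.linspace(0, 1, steps=5):
        z_id_t = (1 - t) * z_id_a + t * z_id_b
        shape_t = sae.decoder(torch.cat([z_id_t, z_exp_fixed], dim=-1))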



BibTeX


  @InProceedings{Taherkhani_2023_WACV,
    author    = {Taherkhani, Fariborz and Rai, Aashish and Gao, Quankai and Srivastava, Shaunak and Chen, Xuanbai and de la Torre, Fernando and Song, Steven and Prakash, Aayush and Kim, Daeil},
    title     = {Controllable 3D Generative Adversarial Face Model via Disentangling Shape and Appearance},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {826-836}
}

License


    The code is available under the X11 License. Please read the license terms available at [Link]. A quick summary is available at [Link].