Aashish Rai

I'm a Computer Science Ph.D. student at Brown University, supervised by Prof. Srinath Sridhar at the Interactive 3D Vision & Learning Lab (IVL). I work on multimodal learning at the intersection of vision and sound. I'm broadly interested in making machines learn and in understanding the human brain's ability to integrate audio and visual information, which allows humans to interact efficiently with their environment.

I'm working as a Computer Vision Engineer II (CW) at Meta Reality Labs in Burlingame, California, during Summer 2024, hosted by Aayush Prakash.

Previously, I was a full-time Research Assistant at the Robotics Institute, Carnegie Mellon University, advised by Fernando De la Torre at the Human Sensing Lab. I worked on realistic 3D face generation by leveraging 2D generative models, in collaboration with Meta Reality Labs.

Before CMU, I did my undergrad in ECE at the National Institute of Technology (NIT) Surat, India, where I worked with Kishor Upla on problems in Deep Learning and Computer Vision. During my undergrad, I had the opportunity to collaborate with McGill University, the Norwegian Biometrics Lab, and the Indian Space Research Organization (ISRO) on Computer Vision research.

I'm always open to research collaborations in Vision, Multimodal Learning, and related fields. Feel free to drop me an email.

Email  /  Google Scholar  /  GitHub  /  LinkedIn  /  CV

UPDATES
[2024] Serving as a reviewer for CVPR-24, ECCV-24, and NeurIPS-24
[Oct 2023] Our paper [ Towards Realistic Generative 3D Face Models ] has been accepted to WACV 2024
[Aug 2023] Started Ph.D. in Computer Science at Brown University
[May 2023] Our work [ Towards Realistic Generative 3D Face Models ] has been featured on Synced and in many other places!
[Mar 2023] Serving as a reviewer for XRNeRF: Advances in NeRF for the Metaverse at CVPR 2023
[Jan 2023] Presented one paper on 3D Face Generation at WACV 2023 in Hawaii
[Sep 2022] Collaborated with Pavlos Protopapas's group at Harvard University
[Dec 2021] Presented one paper on Semantic Face Editing at 20th ICMLA 2021, Pasadena, CA
[Sep 2021] Joined CMU as a full-time Research Assistant
[Apr 2020] Received scholarship for Global Talent Internship Program, Ministry of Science and Technology, Taiwan
RESEARCH EXPERIENCE

I have worked on a variety of projects in computer vision. My past research includes 3D face reconstruction, dynamic 3D face generation, and semantic face editing using generative models. I have also worked on optimizing super-resolution networks for computational efficiency, recognition in unconstrained environments, satellite image analysis, and object and scene understanding.

Computer Vision Engineer II (CW)
Meta Reality Labs

Burlingame, CA, USA
(May/2024 - )
--> Working on 3D Object Editing.
Hosted by: Aayush Prakash

Research Assistant
Robotics Institute, Carnegie Mellon University

Pittsburgh, PA, USA
(Sept/2021 - June/2023)
--> Proposed and implemented a controllable 3D generative face model that produces high-quality 3D faces by leveraging existing 2D generative models.
--> Designed a novel 3D face generative model that decouples identity from expression and provides granular control over expressions.
Advisor: Fernando De la Torre

Research Intern
Shared Reality Lab, McGill University

Montreal, Canada
(May/2020 - Mar/2021)
--> Worked on improving Semantic Face Editing (controlling a specific face attribute while keeping the others unaltered) of StyleGAN2 outputs in terms of the perceived quality of facial features.
--> Modified the framework to make it equally usable for other complex attributes such as race and face shape.
Advisor: Jeremy Cooperstock

Undergraduate Researcher
Norwegian Biometrics Laboratory, NTNU Norway

(Dec/2019 - May/2020)
--> Designed a CNN-based, lightweight, progressive, residual-propagating asymmetrical architecture with three modules (LF feature extraction, HF feature extraction, and reconstruction) to generate 8x upscaled (super-resolution) images from 8x8, 16x16, and 24x24 LR images.
--> The model achieved a PSNR of 26.55 on CelebA and 26.26 on LFW, both benchmark datasets.
Advisors: Kishor Upla, Christoph Busch

Summer Research Intern
IIRS, Indian Space Research Organization (ISRO)

Dehradun, India
(May/2019 - Jul/2019)
--> Implemented computationally efficient algorithms for the pixel-wise classification of panchromatic (single-band) and multispectral (up to 15 bands) satellite images using ANNs and CNNs.
Advisor: Anil Kumar

Undergraduate Researcher
MLCV Lab, NIT Surat

Surat, India
(Jan/2019 - Nov/2019)
--> Designed an Automated Attendance System using Deep Learning to mark the attendance of an entire class simultaneously and to overcome the usual challenges of occlusion, orientation, and luminance.
Advisor: Kishor Upla




RESEARCH PAPERS

EgoSonics: Generating Synchronized Audio for Silent Egocentric Videos
Aashish Rai, Srinath Sridhar
arXiv preprint, 2024
project page | pdf

Towards Realistic Generative 3D Face Models
Aashish Rai, Hiresh Gupta, Ayush Pandey, Francisco Vicente Carrasco, Shingo Jason Takagi, Amaury Aubel, Daeil Kim, Aayush Prakash, Fernando de la Torre
Winter Conference on Applications of Computer Vision (WACV), 2024
project page | pdf

Controllable 3D Generative Adversarial Face Model via Disentangling Shape and Appearance
Fariborz Taherkhani, Aashish Rai, Quankai Gao, Shaunak Srivastava, Xuanbai Chen, Fernando de la Torre, Steven Song, Aayush Prakash, Daeil Kim
Winter Conference on Applications of Computer Vision (WACV), 2023
project page | pdf

Improved Attribute Manipulation in the Latent Space of StyleGAN for Semantic Face Editing
Aashish Rai, Clara Ducher, Jeremy Cooperstock
20th IEEE International Conference on Machine Learning and Applications, Pasadena, CA, USA, 2021
pdf | project page

ComSupResNet: A Compact Super-Resolution Network for Low-Resolution Face Images.
Aashish Rai, Vishal Chudasama, Kishor Upla, Kiran Raja, Raghavendra Ramachandra, Christoph Busch
8th International Workshop on Biometrics and Forensics (IWBF), Porto, Portugal, 2020
pdf | project page
(An extended version has been accepted to IEEE Transactions on Biometrics, Behavior, and Identity Science (T-BIOM))

An End-to-End Real-Time Face Identification and Attendance System using Convolutional Neural Networks.
Aashish Rai, R. Karnani, V. Chudasama, K. Upla
16th IEEE India Council International Conference (INDICON), Rajkot, India, 2019
pdf | project page


(website template modified from repo)