Aashish Rai
I'm a Computer Science Ph.D. student at Brown University, supervised by Prof. Srinath Sridhar at the
Interactive 3D Vision & Learning Lab (IVL). I work on multimodal learning at the intersection of vision and sound. I'm broadly interested in making
machines learn and in understanding how the human brain integrates audio and visual information to interact efficiently with the environment.
During Summer 2024, I'm working as a Computer Vision Engineer II (CW) at Meta Reality Labs in Burlingame, California, hosted by
Aayush Prakash.
Previously, I was a full-time Research Assistant at the Robotics Institute, Carnegie
Mellon University, advised by Fernando De la Torre at the Human Sensing Lab.
I worked on realistic 3D face generation by leveraging 2D generative models, in collaboration with Meta Reality Labs.
Before CMU, I did my undergrad in ECE at the National Institute of Technology (NIT) Surat, India,
where I worked with
Kishor Upla on problems in Deep Learning and Computer Vision. During my undergrad, I had the opportunity to collaborate with
McGill University, the Norwegian Biometrics Lab,
and the Indian Space Research Organization (ISRO) on Computer Vision research.
I'm always open to research collaborations in Vision, Multimodal Learning, and related fields. Feel free to drop me an email.
Email / Google Scholar / GitHub / LinkedIn / CV
|
|
RESEARCH EXPERIENCE
I have worked on a variety of projects in computer vision. My past research spans 3D face reconstruction,
dynamic 3D face generation, and semantic face editing using generative models. I have also worked on computationally efficient super-resolution networks,
recognition in unconstrained environments, satellite image analysis, and object and scene understanding.
|
|
Computer Vision Engineer II (CW) Meta Reality Labs
Burlingame, CA, USA
(May/2024 - )
--> Working on 3D Object Editing.
Hosted by: Aayush Prakash
|
|
Research Assistant Robotics Institute, Carnegie Mellon University
Pittsburgh, PA, USA
(Sept/2021 - June/2023)
--> Proposed and implemented a controllable 3D generative face model that produces high-quality 3D faces by leveraging existing 2D generative models.
--> Designed a novel 3D face generative model that decouples identity from expression and provides granular control over expressions.
Advisor: Fernando De la Torre
|
|
Research Intern Shared Reality Lab, McGill University
Montreal, Canada
(May/2020 - Mar/2021)
--> Worked on improving semantic face editing (controlling a specific face attribute while keeping the others unaltered) of StyleGAN2 outputs, in terms of the perceived quality of facial features.
--> Modified the framework to make it equally usable for other complex attributes such as race and face shape.
Advisor: Jeremy Cooperstock
|
|
Undergraduate Researcher Norwegian Biometrics Laboratory, NTNU Norway
(Dec/2019 - May/2020)
--> Designed a lightweight, CNN-based, asymmetric architecture with progressive residual propagation and three modules (LF feature extraction, HF feature extraction, and reconstruction) to generate 8x super-resolved images from 8x8, 16x16, and 24x24 low-resolution (LR) images.
--> The model gave appreciable results on the benchmark datasets CelebA (PSNR: 26.55) and LFW (PSNR: 26.26).
Advisors: Kishor Upla, Christoph Busch
|
|
Summer Research Intern IIRS, Indian Space Research Organization (ISRO)
Dehradun, India
(May/2019 - Jul/2019)
--> Implemented computationally efficient algorithms for the pixel-wise classification of panchromatic (single-band) and multispectral (up to 15 bands) satellite images using ANNs and CNNs.
Advisor: Anil Kumar
|
|
Undergraduate Researcher MLCV Lab, NIT Surat
Surat, India
(Jan/2019 - Nov/2019)
--> Designed an automated attendance system using deep learning to mark the attendance of an entire class simultaneously and overcome the usual challenges of occlusion, orientation, and luminance.
Advisor: Kishor Upla
|
|
PUBLICATIONS
EgoSonics: Generating Synchronized Audio for Silent Egocentric Videos
Aashish Rai, Srinath Sridhar
arXiv preprint, 2024
project page | pdf
|
|
Towards Realistic Generative 3D Face Models
Aashish Rai, Hiresh Gupta, Ayush Pandey, Francisco Vicente Carrasco, Shingo Jason Takagi, Amaury Aubel, Daeil Kim, Aayush Prakash, Fernando de la Torre
Winter Conference on Applications of Computer Vision (WACV), 2024
project page | pdf
|
|
Controllable 3D Generative Adversarial Face Model via Disentangling Shape and Appearance
Fariborz Taherkhani, Aashish Rai, Quankai Gao, Shaunak Srivastava, Xuanbai Chen, Fernando de la Torre, Steven Song, Aayush Prakash, Daeil Kim
Winter Conference on Applications of Computer Vision (WACV), 2023
project page | pdf
|
|
Improved Attribute Manipulation in the Latent Space of StyleGAN for Semantic Face Editing
Aashish Rai, Clara Ducher, Jeremy Cooperstock
20th IEEE International Conference on Machine Learning and Applications, Pasadena, CA, USA, 2021
pdf | project page
|
|
ComSupResNet: A Compact Super-Resolution Network for Low-Resolution Face Images
Aashish Rai, Vishal Chudasama, Kishor Upla, Kiran Raja, Raghavendra Ramachandra, Christoph Busch
8th International Workshop on Biometrics and Forensics (IWBF), Porto, Portugal, 2020
pdf | project page
(An extended version has been accepted to IEEE Transactions on Biometrics, Behavior, and Identity Science (T-BIOM).)
|
|
An End-to-End Real-Time Face Identification and Attendance System using Convolutional Neural Networks
Aashish Rai, R. Karnani, V. Chudasama, and K. Upla
16th IEEE India Council International Conference (INDICON), Rajkot, India, 2019
pdf | project page