SEMINAR

Human Sensing for AR/VR

Speaker

Fernando De la Torre

Affiliation
Carnegie Mellon University
Time
Wed, Apr 24 2024 - 07:00 am (GMT + 7)
About the Speaker

Fernando De la Torre received his Ph.D. in Electronic Engineering from La Salle School of Engineering at Ramon Llull University, Barcelona, in 2002. He has been a research faculty member in the Robotics Institute at Carnegie Mellon University since 2005. His research interests are in machine learning and computer vision, in particular applications to human health, augmented reality, virtual reality, generative models, and data-centric methods (methods that focus on the data rather than the model). He is the director of the Human Sensing Laboratory (www.humansensing.cs.cmu.edu). He has published over 225 peer-reviewed papers in computer vision and machine learning conferences and journals, and he has served as an Associate Editor of IEEE Transactions on Pattern Analysis and Machine Intelligence. In 2014, he founded FacioMetrics LLC to license technology for mobile human sensing; the company was acquired by Facebook/Meta in 2016.

Abstract

There have been three revolutions in computing: personal computers in the 1980s, the World Wide Web and cloud computing in the 1990s, and the iPhone in 2007. The fourth will be driven by augmented and virtual reality (AR/VR) technology. The ability to transfer human motion from AR/VR sensors to avatars will be critical for AR/VR social platforms, video games, new communication systems, and future workspaces. In the first part of this talk, I will describe several techniques for capturing subtle human behavior, with applications to AR/VR (e.g., facial expression transfer to photorealistic avatars) and medical monitoring/diagnosis (e.g., depression diagnosis from audio/video). In addition, I will show how we can estimate dense human correspondence from WiFi signals, which could pave the way for novel AR/VR interfaces (a toy sketch of this idea appears below).
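
To make the WiFi-based sensing idea concrete, here is a minimal, hypothetical sketch of a network that regresses dense body-surface (UV) correspondence maps from WiFi channel state information (CSI). All shapes, layer choices, and names are illustrative assumptions, not the architecture presented in the talk.

```python
# Toy sketch only: regress DensePose-style correspondence from WiFi CSI.
# Every shape and layer below is an assumption for illustration.
import torch
import torch.nn as nn

class CSIToDenseCorrespondence(nn.Module):
    def __init__(self, n_antennas=9, n_subcarriers=30, out_hw=(56, 56)):
        super().__init__()
        self.out_hw = out_hw
        # Encode one CSI frame (amplitude + phase over antenna pairs and
        # subcarriers) into a feature vector.
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(2 * n_antennas * n_subcarriers, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
        )
        # Decode to 3 channels per spatial cell: a foreground logit plus
        # two continuous UV body-surface coordinates.
        self.decoder = nn.Linear(512, 3 * out_hw[0] * out_hw[1])

    def forward(self, csi):
        # csi: (batch, 2, n_antennas, n_subcarriers) -- amplitude and phase.
        feats = self.encoder(csi)
        out = self.decoder(feats).view(-1, 3, *self.out_hw)
        fg_logit, uv = out[:, :1], torch.sigmoid(out[:, 1:])
        return fg_logit, uv

model = CSIToDenseCorrespondence()
dummy_csi = torch.randn(4, 2, 9, 30)   # a fake batch of CSI frames
fg_logit, uv = model(dummy_csi)
print(fg_logit.shape, uv.shape)        # (4, 1, 56, 56) (4, 2, 56, 56)
```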

All these techniques for human sensing rely on training deep learning models. In practice, however, strong metrics on a specific train/test dataset do not guarantee a reliable or fair ML model. This is partly because obtaining a balanced (i.e., uniformly sampled over all the important attributes), diverse, and perfectly labeled test dataset is typically expensive, time-consuming, and error-prone. In the second part of this presentation, I will introduce two methods aimed at enhancing the robustness and fairness of deep learning techniques. First, I will describe a technique for zero-shot model diagnosis, which assesses failures of deep learning models in an unsupervised manner, eliminating the need for test data. Second, I will discuss a method for rectifying biases in generative models using only a small number of sample images that exhibit the attributes of interest.
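
As a loose illustration of the "no test data" idea, the sketch below probes a classifier with synthesized counterfactual images and measures how often its prediction flips when a single, task-irrelevant attribute is edited. Here `generate_face` and `edit_attribute` are hypothetical placeholders for a generative model and an attribute-editing operator; this is not the speaker's actual zero-shot diagnosis method.

```python
# Toy sketch only: estimate a classifier's sensitivity to one attribute
# using synthetic counterfactuals instead of a labeled test set.
import torch

@torch.no_grad()
def attribute_flip_rate(classifier, generate_face, edit_attribute,
                        attribute="eyeglasses", n_samples=256):
    """Fraction of synthetic probes whose predicted label changes when a
    single attribute is edited; a rough, label-free failure signal."""
    flips = 0
    for _ in range(n_samples):
        x = generate_face()                    # synthetic probe image (C, H, W)
        x_edit = edit_attribute(x, attribute)  # same image, one attribute changed
        pred = classifier(x.unsqueeze(0)).argmax(dim=1)
        pred_edit = classifier(x_edit.unsqueeze(0)).argmax(dim=1)
        flips += int((pred != pred_edit).item())
    return flips / n_samples
```

A high flip rate on an attribute that should be irrelevant to the task points to a spurious dependency worth investigating, without collecting or labeling a test set.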
