SEMINAR

Video Understanding: from Representation Learning to Open-World, Long-term Reasoning

Speaker

Du Tran

Affiliation
Meta AI Research
Timeline
Fri, Jul 22 2022 - 02:00 pm (GMT + 7)
About Speaker

Du Tran is a staff research scientist at Meta AI Research. He graduated with a Ph.D. in computer science from Dartmouth College and an M.S. in computer science from the University of Illinois at Urbana-Champaign, receiving the Dartmouth Presidential Fellowship and the Vietnam Education Fellowship. His research interests are in computer vision, machine learning, and computer graphics, with specific interests in video understanding, representation learning, and multimodal modeling.

Abstract

Video understanding is one of the fundamental problems in computer vision, with applications including autonomous vehicles, robot learning, and visual perception. Although the field has seen a large body of work in the last few years, many challenging video understanding problems remain unsolved. In this talk, I will present some of our recent work in video understanding, including cross-modal self-supervised learning of video and audio representations and open-world instance segmentation. Finally, I will speculate on several potential future research directions in this area.
