Abstract: What if you had an AI assistant that seamlessly observes and understands your actions, and is available at any time to remind you where you misplaced your objects, to automatically transcribe "who" said "what" in a meeting, or to interactively guide and track your progress in the execution of complex activities, such as cooking a recipe or fixing a bike? In this talk, I will present our research towards this vision through the development of multimodal video-language systems that can assist humans in their daily lives by perceiving and predicting their behavior from wearable cameras. Our advances in this field include models that detect and describe not only what the user is doing but also how the activity is performed, in order to provide relevant guidance and improvement suggestions in language form. I will also discuss our contributions to the broad research community by creating and open-sourcing multiple foundational datasets to promote the next generation of multimodal perceptual AI.
Bio: Lorenzo Torresani is an AI researcher, author of the first methods for 3D reconstruction of non-rigid objects and influential architectures for image and video analysis (Classemes, C3D, TimeSformer). His current research interests are in video-language models. He received his Ph.D. in Computer Science from Stanford University in 2005. From 2009 to 2021, he was on the faculty of the Computer Science Department at Dartmouth College, where he received tenure in 2014 and was promoted to full professor in 2020. From 2018 to 2024, he held positions at Meta (formerly Facebook), most recently as a Research Director leading teams in multimodal research. Previously, he worked at Microsoft Research, Like.com, and Digital Persona. He is the recipient of multiple awards, including a CVPR Best Student Paper prize, a National Science Foundation CAREER Award, a Google Faculty Research Award, three Facebook Faculty Awards, and a Fulbright U.S. Scholar Award. He has over 100 peer-reviewed publications, an h-index of 64, and 11 patents.
Events are free and open to the public unless otherwise noted.