Dartmouth Events

Embodied Multimodal Intelligence with Foundation Models

2/24/2025
11:30 am – 12:30 pm
ECSC 009
Intended Audience(s): Public
Categories: Lectures & Seminars

Speaker: Oier Mees

Abstract: Despite considerable progress in robot learning, and contrary to the expectations of the general public, the vast majority of robots deployed in the real world today remain restricted to a narrow set of preprogrammed behaviors for specific tasks. As robots become ubiquitous across human-centered environments, the need for "generalist" robots grows: how can we scale robot learning systems to generalize and adapt, allowing them to perform a wide range of everyday tasks in unstructured environments based on arbitrary, multimodal instructions from users? In my work, I have focused on the challenging problems of relating human language to a robot's multimodal perceptions and actions by introducing techniques that leverage self-supervision from uncurated data and common-sense reasoning from foundation models, both from and for robotics.

Bio: I am a postdoc at UC Berkeley working with Prof. Sergey Levine. I received my PhD in Computer Science (summa cum laude) in 2023 from the University of Freiburg, supervised by Prof. Dr. Wolfram Burgard. My research focuses on robot learning, enabling robots to intelligently interact with both the physical world and humans, and to improve themselves over time. These days, I am particularly interested in how we can build self-improving embodied foundation models that generalize the way humans do. My research has been nominated for (and received) several Best Paper Awards, including at ICRA and RA-L. Previously, I also spent time at NVIDIA AI as an intern with Dieter Fox.

For more information, contact:
Susan Cable

Events are free and open to the public unless otherwise noted.