Spatio-Temporal AI for Long-term Human-centric Robot Autonomy
Lukas Schmid
Massachusetts Institute of Technology (hosted by Bernt Schiele)
10 Feb 2025, 10:00 am - 11:00 am
Saarbrücken building E1 5, room 029
CIS@MPG Colloquium
The ability to build an actionable representation of a robot's environment
is crucial for autonomy and a prerequisite for a wide variety of applications,
ranging from home, service, and consumer robots, to social, care, and medical
robotics, to industrial, agricultural, and disaster response applications.
Notably, much of the promise of autonomous robots depends on long-term
operation in domains shared with humans and other agents. These environments
are typically complex, semantically rich, and highly dynamic, with agents
frequently moving through and interacting with the scene.
This talk presents an autonomy pipeline that combines perception, prediction,
and planning to address these challenges. We first present methods to detect
and represent complex semantics, short-term motion, and long-term changes for
real-time robot perception in a unified framework called Khronos. We then show
how Dynamic Scene Graphs (DSGs) can represent semantic symbols in a task-driven
fashion and facilitate reasoning about the scene, such as predicting likely
future outcomes from the data the robot has already collected. Lastly, we show
how robots, as embodied agents, can leverage these actionable scene
representations and predictions to complete tasks such as actively gathering
data that improves their world models, perception, and action capabilities
fully autonomously over time. The presented methods are demonstrated on board
fully autonomous aerial, legged, and wheeled robots, run in real time on
mobile hardware, and are available as open-source software.
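
To make the ideas above concrete, the following is a minimal Python sketch of
a dynamic scene graph with spatio-temporal bookkeeping. It is not the Khronos
or DSG implementation; all class, field, and method names here are
hypothetical, chosen only to illustrate how semantic symbols, hierarchy, and
observation timestamps (separating short-term motion from long-term change)
might fit together in one data structure.

    # Illustrative sketch only: not the Khronos/DSG API.
    # All names below are hypothetical.
    from dataclasses import dataclass, field
    from enum import Enum


    class Layer(Enum):
        """Hypothetical abstraction layers, coarse to fine."""
        BUILDING = 0
        ROOM = 1
        OBJECT = 2
        AGENT = 3


    @dataclass
    class Node:
        node_id: int
        layer: Layer
        semantic_label: str                   # e.g. "kitchen", "chair", "person"
        position: tuple[float, float, float]  # 3D position estimate
        first_seen: float                     # timestamps let the robot separate
        last_seen: float                      # short-term motion from long-term change


    @dataclass
    class SceneGraph:
        nodes: dict[int, Node] = field(default_factory=dict)
        edges: set[tuple[int, int]] = field(default_factory=set)  # e.g. object-in-room

        def add_node(self, node: Node) -> None:
            self.nodes[node.node_id] = node

        def connect(self, parent_id: int, child_id: int) -> None:
            self.edges.add((parent_id, child_id))

        def changed_since(self, t: float) -> list[Node]:
            """Nodes observed after time t: candidates for re-observation
            when deciding where to actively gather new data."""
            return [n for n in self.nodes.values() if n.last_seen > t]


    # Usage: a room containing a chair that was later observed to have moved.
    g = SceneGraph()
    g.add_node(Node(0, Layer.ROOM, "kitchen", (0.0, 0.0, 0.0), 0.0, 0.0))
    g.add_node(Node(1, Layer.OBJECT, "chair", (1.2, 0.5, 0.0), 0.0, 42.0))
    g.connect(0, 1)
    print([n.semantic_label for n in g.changed_since(10.0)])  # ['chair']

In this toy version, a query like changed_since gives a planner a simple
hook for closing the perception-prediction-planning loop: regions whose
contents changed recently are natural targets for gathering fresh data.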