Moritz Schneider

Hi! I am a PhD student at the Neurorobotics Lab at the University of Freiburg, in collaboration with Bosch Corporate Research, focusing on pre-training approaches for reinforcement learning agents. I am advised by Prof. Joschka Boedecker. Before starting my PhD, I earned a Master of Science in Automation Engineering and a Bachelor of Science in Business Administration and Mechanical Engineering from RWTH Aachen University. During this time, I worked on disentangled representation learning and interpretability methods for reinforcement learning.

Research

My research aims to uncover the principles required to build general embodied AI, including vision-language-action policies. I focus on pre-training approaches for reinforcement learning, particularly methods that enable agents to learn effectively from diverse data sources before seeing any task-specific data. My work is guided by the philosophical foundations of action: what is an action, and how can its meaning shape an agent's continual learning process? Concretely, I explore efficient ways to pre-train general agents, for example with unsupervised objectives such as empowerment, or without action-labeled data by leveraging large-scale video datasets. The latter enables offline training based on latent actions, allowing agents and their world models to acquire a general notion of actions without tedious manual data collection. This general understanding of actions can then be fine-tuned for specific downstream tasks, enabling more efficient learning in complex environments.

I also incorporate pre-trained foundation models such as DINOv2 and CLIP into my approaches to leverage their rich visual and multimodal representations for reinforcement learning. Finally, I am interested in specialized architectures for model-based reinforcement learning, especially state space models and recurrent neural networks, since transformers, while powerful, often struggle with long temporal sequences in reinforcement learning settings.

Publications