cv | Moritz A. Zanger

Basics

Name	Moritz Akiya Zanger
Email	moritz.a.zanger@gmail.com
Url	https://anyboby.github.io/
Summary	Machine Learning Researcher with a focus on Deep Reinforcement Learning.

Work

2021.06 - present
Ph.D. in Computer Science and Artificial Intelligence

Sequential Decision Making, TU Delft, NL

Efficient Uncertainty Quantification in Deep Reinforcement Learning. Supervised by Prof. Matthijs T. J. Spaan and Dr. Wendelin Böhmer.
- Part of the EU Horizon project Epistemic AI
2021.05 - 2022.01
Research Assistant

Intelligent Systems, Karlsruhe Research Center of Information Technology, GER

Worked on natural language processing algorithms for requirements management with pretrained models (BERT).
2021.01 - 2021.05
Research Assistant

Cognitive Systems, Karlsruhe Research Center of Information Technology, GER

Developed gradient estimation techniques for trust-region methods in model-based RL.

Education

2017.10 - 2020.10

Karlsruhe, Germany
M.Sc. Mechanical Engineering

Karlsruhe Institute of Technology, Germany

Thesis Title: Model-Based Reinforcement Learning for Constrained Policy Optimization in Robot Locomotion. Supervised by Prof. J. Marius Zöllner.
- Graduated with distinction
- Majored in Robotics and Medical Engineering
2015.09 - 2016.09

Sendai, JP
Visiting Student

Tohoku University, JP

Robotics
2012.10 - 2017.10

Karlsruhe, Germany
B.Sc. Mechanical Engineering

Karlsruhe Institute of Technology, Germany
- Majored in Engineering Design
Korntal, Germany
Abitur

Gymnasium Korntal Muenchingen, Germany

Awards

2022

MLSS Scholarship

Machine Learning Summer School, Krakow, Poland
2018

Students@Bosch Fellow

Robert Bosch GmbH
2017

GfSE Student Award

German Association for Systems Engineering
2016

DAAD Annual Scholarship

German Academic Exchange Service

Publications

2026

Cholesky Ordered Projection Q-learning (COP-Q): Guiding Safety-First Exploitation-Exploration by Multi-Objective Uncertainty

Guopeng Li, Moritz A. Zanger, Matthijs T. J. Spaan and Julian F. P. Kooij

Under review.
2026

Universal Value-Function Uncertainties

Moritz A. Zanger, Max Weltevrede, Yaniv Oren, Pascal R. van der Vaart, Caroline Horsch, Wendelin Böhmer and Matthijs T. J. Spaan

International Conference on Learning Representations (ICLR).
2026

Contextual Similarity Distillation: Ensemble Uncertainties with a Single Model

Moritz A. Zanger, Pascal R. van der Vaart, Matthijs T. J. Spaan and Wendelin Böhmer

International Conference on Learning Representations (ICLR).
2025

How Ensembles of Distilled Policies Improve Generalisation in Reinforcement Learning

Max Weltevrede, Moritz A. Zanger, Matthijs T. J. Spaan and Wendelin Böhmer

Neural Information Processing Systems (NeurIPS).
2025

Value Improved Actor Critic Algorithms

Yaniv Oren, Moritz A. Zanger, Pascal R. van der Vaart, Matthijs T. J. Spaan and Wendelin Böhmer

Neural Information Processing Systems (NeurIPS).
2024

Diverse Projection Ensembles for Distributional Reinforcement Learning

Moritz A. Zanger, Wendelin Böhmer and Matthijs T. J. Spaan

International Conference on Learning Representations (ICLR).
2021

Safe Continuous Control with Constrained Model-Based Policy Optimization

Moritz A. Zanger, Karam Daaboul and J. Marius Zöllner

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

Skills

Python

C++

Java

MATLAB

SQL

Git

JAX

PyTorch

TensorFlow

Slurm

ROS

MuJoCo, Unity, CARLA

Languages

	German
	Native speaker

	English
	Fluent

	Japanese
	Fluent

	Dutch
	Intermediate

Interests

Woodworking

Kayak-fishing

Tennis

References

	Professor Matthijs T. J. Spaan
	Director Sequential Decision Making, TU Delft

	Dr. Wendelin Böhmer
	Assistant Professor Sequential Decision Making, TU Delft

	Dr. Frans A. Oliehoek
	Associate Professor Sequential Decision Making, TU Delft
