Olivier Bachem

I am a Director, Research Scientist at Google DeepMind and lead the team that built the Reinforcement Learning from Human Feedback (RLHF) technology used in Bard, PaLM 2 (via Vertex AI, PaLM API, Duet AI), Gemini, Gemma, and various other Google products.

Since joining Google Brain, I have worked on various problems in machine learning and artificial intelligence, including projects related to generative modeling, computer vision, reinforcement learning, and representation learning. In my PhD studies at ETH Zurich, I investigated coresets - small summaries of large data sets with theoretical guarantees - and other sampling methods for large-scale machine learning. During that time I also held a Google PhD Fellowship.

Selected publications

2023

Gemini: A Family of Highly Capable Multimodal Models

Gemini Team Google: (...), Olivier Bachem, (...)

Technical report

2022

A general class of surrogate functions for stable and efficient reinforcement learning

Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Mueller, Shivam Garg, Matthieu Geist, Marlos C. Machado, Pablo Samuel Castro, and Nicolas Le Roux

In International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.
Best Paper Award Honorable Mention

Concave Utility Reinforcement Learning: the Mean-Field Game Viewpoint

Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, and Olivier Pietquin

In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2022.
Best paper runner-up

2021

What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

Marcin Andrychowicz, Anton Raichuk, Piotr Stańczyk, Manu Orsini, Sertan Girgin, Raphael Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, and Olivier Bachem

In International Conference on Learning Representations (ICLR), 2021.
Oral presentation

2019

Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations

Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, and Olivier Bachem

In International Conference on Machine Learning (ICML), 2019.
Best Paper Award

2016

Fast and Provably Good Seedings for k-Means

Olivier Bachem, Mario Lucic, S. Hamed Hassani and Andreas Krause

In Neural Information Processing Systems (NIPS), 2016.
Oral presentation (video, slides, spotlight), implementation of algorithm available on GitHub.

All publications

2024

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Griffin, RLHF, and Gemma Teams, (...), Olivier Bachem, (...)

Technical report.

Gemma: Open Models Based on Gemini Research and Technology

Gemma Team, Google DeepMind: (...), Olivier Bachem, (...)

Technical report.

WARM: On the Benefits of Weight Averaged Reward Models

Alexandre Ramé, Nino Vieillard, Léonard Hussenot, Robert Dadashi, Geoffrey Cideron, Olivier Bachem, Johan Ferret

To appear in International Conference on Machine Learning, 2024.

MusicRL: Aligning Music Generation to Human Preferences

Geoffrey Cideron, Sertan Girgin, Mauro Verzetti, Damien Vincent, Matej Kastelic, Zalán Borsos, Brian McWilliams, Victor Ungureanu, Olivier Bachem, Olivier Pietquin, Matthieu Geist, Léonard Hussenot, Neil Zeghidour, Andrea Agostinelli

To appear in International Conference on Machine Learning, 2024.

Nash Learning from Human Feedback

Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot

To appear in International Conference on Machine Learning, 2024.

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes

Rishabh Agarwal, Nino Vieillard, Yongchao Zhou, Piotr Stanczyk, Sabela Ramos, Matthieu Geist, Olivier Bachem

In International Conference on Learning Representations (ICLR), 2024.

2023

Gemini: A Family of Highly Capable Multimodal Models

Gemini Team Google: (...), Olivier Bachem, (...)

Technical report

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor

In Association for Computational Linguistics (ACL), 2023.

2022

vec2text with Round-Trip Translations

Geoffrey Cideron, Sertan Girgin, Anton Raichuk, Olivier Pietquin, Olivier Bachem, Léonard Hussenot

Preprint

On the importance of data collection for training general goal-reaching policies

Alexis Jacq, Manu Orsini, Gabriel Dulac-Arnold, Olivier Pietquin, Matthieu Geist, Olivier Bachem

Preprint.

Decoding a Neural Retriever's Latent Space for Query Suggestion

Leonard Adolphs, Michelle Chen Huebscher, Christian Buck, Sertan Girgin, Olivier Bachem, Massimiliano Ciaramita, and Thomas Hofmann

In Empirical Methods in Natural Language Processing (EMNLP), 2022.

A general class of surrogate functions for stable and efficient reinforcement learning

Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Mueller, Shivam Garg, Matthieu Geist, Marlos C. Machado, Pablo Samuel Castro, and Nicolas Le Roux

In International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.
Best Paper Award Honorable Mention

The Role of Pretrained Representations for the OOD Generalization of RL Agents

Andrea Dittadi, Frederik Träuble, Manuel Wüthrich, Felix Widmaier, Peter Gehler, Ole Winther, Francesco Locatello, Olivier Bachem, Bernhard Schölkopf, and Stefan Bauer

In International Conference on Learning Representations (ICLR), 2022.

Offline Reinforcement Learning as Anti-Exploration

Shideh Rezaeifar, Robert Dadashi, Nino Vieillard, Léonard Hussenot, Olivier Bachem, Olivier Pietquin, and Matthieu Geist

In AAAI Conference on Artificial Intelligence (AAAI), 2022.

Concave Utility Reinforcement Learning: the Mean-Field Game Viewpoint

Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, and Olivier Pietquin

In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2022.
Best paper runner-up

2021

Braxlines: Fast and Interactive Toolkit for RL-driven Behavior Engineering beyond Reward Maximization

Shixiang Shane Gu, Manfred Diaz, Daniel C. Freeman, Hiroki Furuta, Seyed Kamyar Seyed Ghasemipour, Anton Raichuk, Byron David, Erik Frey, Erwin Coumans, and Olivier Bachem

Preprint, 2021.

Brax - A Differentiable Physics Engine for Large Scale Rigid Body Simulation

C. Daniel Freeman, Erik Frey, Anton Raichuk, Sertan Girgin, Igor Mordatch, and Olivier Bachem

In Neural Information Processing Systems (NeurIPS) Track on Datasets and Benchmarks, 2021.

What Matters for Adversarial Imitation Learning?

Manu Orsini, Anton Raichuk, Léonard Hussenot, Damien Vincent, Robert Dadashi, Sertan Girgin, Matthieu Geist, Olivier Bachem, Olivier Pietquin, and Marcin Andrychowicz

In Neural Information Processing Systems (NeurIPS), 2021.

Hyperparameter Selection for Imitation Learning

Léonard Hussenot, Marcin Andrychowicz, Damien Vincent, Robert Dadashi, Anton Raichuk, Lukasz Stafiniak, Sertan Girgin, Raphael Marinier, Nikola Momchev, Sabela Ramos, Manu Orsini, Olivier Bachem, Matthieu Geist, and Olivier Pietquin

In International Conference on Machine Learning (ICML), 2021.

Scaling Hierarchical Agglomerative Clustering to Billion-sized Datasets

Baris Sumengen, Anand Rajagopalan, Gui Citovsky, David Simcha, Olivier Bachem, Pradipta Mitra, Sam Blasiak, Mason Liang, and Sanjiv Kumar

Preprint, 2021.

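What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

Marcin Andrychowicz, Anton Raichuk, Piotr Stańczyk, Manu Orsini, Sertan Girgin, Raphael Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, and Olivier Bachem
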
In International Conference on Learning Representations (ICLR), 2021.
Oral presentation

2020

A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation

Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, and Olivier Bachem

In Journal of Machine Learning Research 21 (2020) 1-62.

Automatic Shortcut Removal for Self-Supervised Representation Learning

Matthias Minderer, Olivier Bachem, Neil Houlsby, and Michael Tschannen

In International Conference on Machine Learning (ICML), 2020.

Weakly-Supervised Disentanglement without Compromises

Francesco Locatello, Ben Poole, Gunnar Rätsch, Bernhard Schölkopf, Olivier Bachem, and Michael Tschannen

In International Conference on Machine Learning (ICML), 2020.

Evaluating Generative Models Using Divergence Frontiers

Josip Djolonga, Mario Lucic, Marco Cuturi, Olivier Bachem, Olivier Bousquet, and Sylvain Gelly

In International Conference on Artificial Intelligence and Statistics (AISTATS), 2020.

Disentangling Factors of Variation Using Few Labels

Francesco Locatello, Stefan Bauer, Gunnar Rätsch, Bernhard Schölkopf, and Olivier Bachem

In International Conference on Learning Representations (ICLR), 2020.

Google Research Football: A Novel Reinforcement Learning Environment

Karol Kurach*, Anton Raichuk*, Piotr Stanczyk*, Michal Zajac, Olivier Bachem, Lasse Espeholt, Carlos Riquelme, Damien Vincent, Marcin Michalski, Olivier Bousquet, and Sylvain Gelly

In AAAI Conference on Artificial Intelligence (AAAI), 2020.

A Commentary on the Unsupervised Learning of Disentangled Representations

Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, and Olivier Bachem

In AAAI Conference on Artificial Intelligence (AAAI), 2020.

2019

Are Disentangled Representations Helpful for Abstract Visual Reasoning?

Sjoerd van Steenkiste, Francesco Locatello, Jürgen Schmidhuber, and Olivier Bachem

In Neural Information Processing Systems (NeurIPS), 2019.

On the Fairness of Disentangled Representations

Francesco Locatello, Gabriele Abbati, Tom Rainforth, Stefan Bauer, Bernhard Schölkopf, and Olivier Bachem

In Neural Information Processing Systems (NeurIPS), 2019.

On the Transfer of Inductive Bias from Simulation to the Real World: a New Disentanglement Dataset

Muhammad Waleed Gondal, Manuel Wüthrich, Đorđe Miladinović, Francesco Locatello, Martin Breidt, Valentin Volchkov, Joel Akpo, Olivier Bachem, Bernhard Schölkopf, and Stefan Bauer

In Neural Information Processing Systems (NeurIPS), 2019.

The Visual Task Adaptation Benchmark

Xiaohua Zhai*, Joan Puigcerver*, Alexander Kolesnikov*, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly and Neil Houlsby

arXiv preprint

Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations

Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, and Olivier Bachem

In International Conference on Machine Learning (ICML), 2019.
Best Paper Award

High-Fidelity Image Generation With Fewer Labels

Mario Lucic*, Michael Tschannen*, Marvin Ritter*, Xiaohua Zhai, Olivier Bachem, and Sylvain Gelly

In International Conference on Machine Learning (ICML), 2019.

2018

Recent Advances in Autoencoder-Based Representation Learning

Michael Tschannen, Olivier Bachem, and Mario Lucic

arXiv preprint

Assessing Generative Models via Precision and Recall

Mehdi S. M. Sajjadi, Olivier Bachem, Mario Lucic, Olivier Bousquet, and Sylvain Gelly

In Neural Information Processing Systems (NeurIPS), 2018.

Scalable and Distributed Clustering via Lightweight Coresets

Olivier Bachem, Mario Lucic and Andreas Krause

In International Conference on Knowledge Discovery and Data Mining (KDD), 2018.

One-Shot Coresets: The Case of k-Clustering

Olivier Bachem, Mario Lucic and Silvio Lattanzi

In International Conference on Artificial Intelligence and Statistics (AISTATS), 2018.

2017

Uniform Deviation Bounds for Unbounded Loss Functions like k-Means

Olivier Bachem, Mario Lucic, S. Hamed Hassani and Andreas Krause

In International Conference on Machine Learning (ICML), 2017.
A video of the talk is available on Vimeo.

Distributed and Provably Good Seedings for k-Means in Constant Rounds

Olivier Bachem, Mario Lucic and Andreas Krause

In International Conference on Machine Learning (ICML), 2017.
A video of the talk is available on Vimeo.

Practical Coreset Constructions for Machine Learning

Olivier Bachem*, Mario Lucic* and Andreas Krause

arXiv preprint

2016

Fast and Provably Good Seedings for k-Means

Olivier Bachem, Mario Lucic, S. Hamed Hassani and Andreas Krause

In Neural Information Processing Systems (NIPS), 2016.
Oral presentation (video, slides, spotlight), implementation of algorithm available on GitHub.

Linear-Time Outlier Detection via Sensitivity

Mario Lucic, Olivier Bachem and Andreas Krause

In International Joint Conference on Artificial Intelligence (IJCAI), 2016.

Horizontally Scalable Submodular Maximization

Mario Lucic, Olivier Bachem, Morteza Zadimoghaddam and Andreas Krause

In International Conference on Machine Learning (ICML), 2016.

Strong Coresets for Hard and Soft Bregman Clustering with Applications to Exponential Family Mixtures

Olivier Bachem*, Mario Lucic* and Andreas Krause

In International Conference on Artificial Intelligence and Statistics (AISTATS), 2016.

Approximate K-Means++ in Sublinear Time

Olivier Bachem, Mario Lucic, S. Hamed Hassani and Andreas Krause

In AAAI Conference on Artificial Intelligence (AAAI), 2016.
Implementation of algorithm available on GitHub.