David Lindner

PhD Student
david.lindner@inf.ethz.ch
CAB E 65.1
+41 44 633 93 75
Twitter
LinkedIn
GitHub
External Website
My research aims to build safe, robust, and interpretable artificial intelligence (AI). Currently, I am primarily working on Reinforcement Learning from Human Feedback (RLHF) which I think is a key ingredient for building safe AI. My work in this context has two goals: first, making RLHF more sample efficient via active learning, and second, using constraint models in addition to reward models to specify tasks, particularly in contexts where safety is a critical concern.

Publications

2023
GoSafeOpt: Scalable safe exploration for global optimization of dynamical systems B. Sukhija, M. Turchetta, D. Lindner, A. Krause, S. Trimpe, D. Baumann In Artificial Intelligence Journal (AIJ), 2023 [bibtex] [abstract] [pdf]
Open problems and fundamental limitations of reinforcement learning from human feedback S. Casper, X. Davies, C. Shi, T. K. Gilbert, J. Scheurer, J. Rando, R. Freedman, T. Korbak, D. Lindner, P. Freire, T. Wang, S. Marks, C. R. Segerie, M. Carroll, A. Peng, P. Christoffersen, M. Damani, S. Slocum, U. Anwar, A. Siththaranjan, M. Nadeau, E. J. Michaud, J. Pfau, D. Krasheninnikov, X. Chen, L. Langosco, P. Hase, E. Biyik, A. Dragan, D. Krueger, D. Sadigh, D. Hadfield-Menell In Transactions on Machine Learning Research, 2023 [bibtex] [abstract] [pdf]
Learning Safety Constraints from Demonstrations with Unknown Rewards D. Lindner, X. Chen, S. Tschiatschek, K. Hofmann, A. Krause In The 27th International Conference on Artificial Intelligence and Statistics (AISTATS), 2023 [bibtex] [abstract] [pdf]
2022
Active Exploration for Inverse Reinforcement Learning D. Lindner, A. Krause, G. Ramponi In Proc. Neural Information Processing Systems (NeurIPS), 2022 [bibtex] [abstract] [pdf] [code] [video]
Interactively Learning Preference Constraints in Linear Bandits D. Lindner, S. Tschiatschek, K. Hofmann, A. Krause In Proc. International Conference on Machine Learning (ICML), 2022 [bibtex] [abstract] [pdf] [code] [video] [blog]
2021
Information Directed Reward Learning for Reinforcement Learning D. Lindner, M. Turchetta, S. Tschiatschek, K. Ciosek, A. Krause In Proc. Neural Information Processing Systems (NeurIPS), 2021 [bibtex] [abstract] [pdf] [code] [video] [blog]
Addressing the Long-term Impact of ML Decisions via Policy Regret D. Lindner, H. Heidari, A. Krause In Proc. International Joint Conference on Artificial Intelligence (IJCAI), 2021 [bibtex] [abstract] [pdf] [code] [video]
Learning What To Do by Simulating the Past D. Lindner, R. Shah, P. Abbeel, A. Dragan In Proc. International Conference on Learning Representations (ICLR), 2021 [bibtex] [abstract] [pdf] [code] [video] [blog]

David Lindner

Publications

2023

2022

2021