David Lindner

  • PhD Student
  • david.lindner@inf.ethz.ch
  • CAB E 65.1
  • +41 44 633 93 75
  • Twitter
  • LinkedIn
  • GitHub
  • External Website

My research aims to build safe, robust, and interpretable artificial intelligence (AI). Currently, I primarily work on Reinforcement Learning from Human Feedback (RLHF), which I consider a key ingredient for building safe AI. My work in this context has two goals: first, making RLHF more sample-efficient via active learning, and second, using constraint models in addition to reward models to specify tasks, particularly in contexts where safety is a critical concern.
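The reward-model component of RLHF is typically fit to pairwise human comparisons using a Bradley-Terry preference model. A minimal sketch of the per-comparison loss (the function name is illustrative, not taken from any of the papers below):

```python
import math

def bradley_terry_loss(r_preferred: float, r_other: float) -> float:
    """Negative log-likelihood that the human-preferred trajectory wins
    under the Bradley-Terry model:
    P(preferred > other) = exp(r_p) / (exp(r_p) + exp(r_o)).
    """
    return -math.log(1.0 / (1.0 + math.exp(r_other - r_preferred)))

# When the reward model already scores the preferred trajectory higher,
# the loss is small; when it ranks them the wrong way, the loss is large.
consistent = bradley_terry_loss(2.0, 0.0)
inconsistent = bradley_terry_loss(0.0, 2.0)
```

Minimizing this loss over a dataset of human comparisons yields the learned reward model that the policy is then optimized against.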

Publications

2023
  • Open problems and fundamental limitations of reinforcement learning from human feedback
  • , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
  • In Transactions on Machine Learning Research
2022
2021