by Johannes Kirschner, Andreas Krause
Abstract:
We consider Bayesian optimization in settings where observations can be adversarially biased, for example by an uncontrolled hidden confounder. Our first contribution is a reduction of the confounded setting to the dueling bandit model. Then we propose a novel approach for dueling bandits based on information-directed sampling (IDS). Thereby, we obtain the first efficient kernelized algorithm for dueling bandits that comes with cumulative regret guarantees. Our analysis further generalizes a previously proposed semi-parametric linear bandit model to non-linear reward functions, and uncovers interesting links to doubly-robust estimation.
Reference:
J. Kirschner, A. Krause. Bias-Robust Bayesian Optimization via Dueling Bandits. In Proc. International Conference on Machine Learning (ICML), 2021.
Bibtex Entry:
@inproceedings{kirschner21bias,
	Author = {Johannes Kirschner and Andreas Krause},
	Booktitle = {Proc. International Conference on Machine Learning (ICML)},
	Month = {July},
	Pdf = {http://proceedings.mlr.press/v139/kirschner21a/kirschner21a.pdf},
	Title = {Bias-Robust Bayesian Optimization via Dueling Bandits},
	Year = {2021}}