by Olivier Bachem, Mario Lucic, S. Hamed Hassani, Andreas Krause
Abstract:
Uniform deviation bounds limit the difference between a model's expected loss and its loss on an empirical sample *uniformly* for all models in a learning problem. In this paper, we provide a novel framework to obtain uniform deviation bounds for loss functions that are *unbounded*. As a result, we obtain competitive uniform deviation bounds for k-Means clustering under weak assumptions on the underlying distribution. If the fourth moment is bounded, we prove a rate of O(m^(-1/2)), improving on the previously known O(m^(-1/4)) rate. Furthermore, we show that the rate also depends on the kurtosis, the normalized fourth moment, which measures the "tailedness" of a distribution. We also provide improved rates under progressively stronger assumptions, namely bounded higher moments, subgaussianity, and bounded support of the underlying distribution.
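For concreteness, one standard way to formalize the guarantee stated above (the notation x_1, ..., x_m, P, and Q is illustrative and not taken from the paper itself): given m i.i.d. samples from a distribution P, a uniform deviation bound at rate O(m^(-1/2)) controls, simultaneously for every set Q of k centers,

$$
\sup_{Q \subset \mathbb{R}^d,\; |Q| = k} \left| \frac{1}{m} \sum_{i=1}^{m} \min_{q \in Q} \lVert x_i - q \rVert^2 \;-\; \mathbb{E}_{x \sim P}\!\left[ \min_{q \in Q} \lVert x - q \rVert^2 \right] \right| \;=\; O\!\left(m^{-1/2}\right),
$$

where the kurtosis mentioned in the abstract is the normalized fourth moment, $\mathrm{Kurt}[x] = \mathbb{E}[(x - \mu)^4] / \sigma^4$, for a univariate $x$ with mean $\mu$ and variance $\sigma^2$.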
Reference:
O. Bachem, M. Lucic, S. H. Hassani, and A. Krause. Uniform Deviation Bounds for k-Means Clustering. In Proc. International Conference on Machine Learning (ICML), 2017.
Bibtex Entry:
@inproceedings{bachem17uniform,
	author = {Olivier Bachem and Mario Lucic and S. Hamed Hassani and Andreas Krause},
	booktitle = {Proc. International Conference on Machine Learning (ICML)},
	month = {August},
	title = {Uniform Deviation Bounds for k-Means Clustering},
	year = {2017}}