by , , ,
How can we extract representative features from a dataset containing sensitive personal information, while providing individual-level privacy guarantees? Many data summarization applications are captured by the general framework of submodular maximization. As a consequence, a wide range of efficient approximation algorithms for submodular maximization have been developed. However, when such applications involve sensitive data about individuals, their privacy concerns are not automatically addressed by these algorithms. To remedy this problem, we propose a general and systematic study of differentially private submod- ular maximization. We present privacy-preserving algorithms for both monotone and non-monotone submodular maximization under cardinality, matroid, and p-extendible system constraints, with guarantees that are competitive with optimal solutions. Along the way, we analyze a new algorithm for non-monotone submodular maximization under a cardinality constraint, which is the first (even non- privately) to achieve a constant approximation ratio with a linear number of function evaluations. We additionally provide two concrete experiments to validate the efficacy of these algorithms. In the first experiment, we privately solve the facility location problem using a dataset of Uber pickup locations in Manhattan. In the second experiment, we perform private submodular maximization of a mutual information measure to select features relevant to classifying patients by diabetes status.
Differentially Private Submodular Maximization: Data Summarization in Disguise M. Mitrovic, M. Bun, A. Krause, A. KarbasiIn Proc. International Conference on Machine Learning (ICML), 2017
Bibtex Entry:
	author = {Marko Mitrovic and Mark Bun and Andreas Krause and Amin Karbasi},
	booktitle = {Proc. International Conference on Machine Learning (ICML)},
	month = {August},
	title = {Differentially Private Submodular Maximization: Data Summarization in Disguise},
	year = {2017}}