by , ,
Abstract:
Can we summarize multi-category data based on user preferences in a scalable manner? Many utility functions used for data summarization satisfy submodularity, a natural diminishing returns property. We cast personalized data summarization as an instance of a general submodular maximization problem subject to multiple constraints. We develop the first practical and FAst coNsTrained submOdular Maximization algorithm, Fantom, with strong theoretical guarantees. Fantom maximizes a submodular function (not necessarily monotone) subject to intersections of a p-system and l knapsacks constrains. It achieves a (1+eps)(p+1)(2p+2l+1)/p approximation guarantee with only O(nrp log(n)/eps) query complexity (r indicates the size of the largest feasible solution). We then show how we can use Fantom for personalized data summarization. In particular, a p-system can model different aspects of data, such as categories or time stamps, from which the users choose. In addition, knapsacks encode users' constraints including budget or time. In our set of experiments, we consider several concrete applications: movie recommendation over 11K movies, personalized image summarization with  10K images, and revenue maximization on the YouTube social networks with 5000 communities. We observe that Fantom constantly provides the highest utility against all the baselines.
Reference:
Fast Constrained Submodular Maximization: Personalized Data Summarization B. Mirzasoleiman, A. Badanidiyuru, A. KarbasiIn Proc. International Conference on Machine Learning (ICML), 2016
Bibtex Entry:
@inproceedings{mirzasoleiman16fast,
	Author = {Baharan Mirzasoleiman and Ashwinkumar Badanidiyuru and Amin Karbasi},
	Booktitle = {Proc. International Conference on Machine Learning (ICML)},
	Month = {June},
	Title = {Fast Constrained Submodular Maximization: Personalized Data Summarization},
	Year = 2016}