Intelligent Information Gathering and Submodular Function Optimization

Description

A key problem in AI is to develop intelligent systems and services that actively gather the most relevant information. In recent years, a fundamental problem structure has emerged as extremely useful for addressing this problem: submodularity, an intuitive diminishing-returns property stating that adding an element to a smaller set helps more than adding it to a larger set. Applications where this property is useful include active learning and experimental design, informative path planning, multi-agent coordination, structure learning, clustering, influence maximization, weblog ranking, and trading off utility and privacy, as well as game-theoretic applications. In contrast to most existing approaches, submodularity allows us to efficiently find provably (near-)optimal solutions. In this tutorial, we will introduce the concept of submodularity, discuss algorithms for optimizing submodular functions, and illustrate their usefulness in solving difficult AI problems, with a special focus on active information-gathering tasks. Since submodularity arises in so many different areas of AI, and since information gathering is central to many AI applications, we believe that this property is of both theoretical and practical interest to a large part of the AI community. This tutorial will not require prior knowledge beyond the basic concepts covered in an introductory AI class.
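
To make the diminishing-returns property concrete, here is the standard formal statement (the notation f, V, A, B, e is ours and is not part of the original abstract): a set function f defined on subsets of a finite ground set V is submodular if

    f(A ∪ {e}) − f(A)  ≥  f(B ∪ {e}) − f(B)    for all A ⊆ B ⊆ V and all e ∈ V \ B,

i.e., the marginal gain of any element never increases as the set it is added to grows. For instance, the area jointly covered by a set of sensors behaves this way: an additional sensor contributes more new coverage to a small deployment than to a large one.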

Materials

  • Tutorial slides [ppt, 20 MB]
  • MATLAB Toolbox for submodular function optimization [link].
  • Previous tutorial given at ICML 2008.

Outline

  1. Introduction
    • Motivating Applications
    • Why should the AI community care about submodularity?
  2. Submodular set functions
    • Definitions
    • Intuition: Why are submodular functions useful?
    • Examples of submodular functions (entropy [8] and mutual information [10, 14], influence in graphs [9], etc.) and examples of functions which are not submodular
    • Operations that preserve submodularity, and the relationship between submodularity and convexity [16]
  3. Minimizing submodular functions
    • Overview of known results for minimization [22, 7]
    • Queyranne’s algorithm for minimizing symmetric submodular functions [22], applications to factorizing distributions and clustering [20]
    • Learning structure of graphical models [19, 1]
  4. Maximizing submodular functions
    • The greedy algorithm [21, 32]
    • Lazy evaluations and scaling up to large data sets [28] (a code sketch follows this outline)
    • Applications of maximizing submodular functions
      • Informative path planning and multi-agent coordination [24]
      • Sensor placement and scheduling [39]
      • Information gathering in the presence of adversaries [13]
    • Online and stochastic optimization of submodular functions [5, 11]
  5. Conclusions
    • Current research directions / Open questions
    • Other pointers (submodularity in games / allocation problems [4, 35])
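
To make the maximization part of the outline concrete, below is a minimal Python sketch (not the MATLAB toolbox listed above) of the greedy algorithm [21] combined with lazy evaluations [28] for maximizing a monotone submodular function under a cardinality constraint. It assumes only a value oracle f that maps a set of elements to a real number; the names lazy_greedy, ground_set and k are illustrative.

    import heapq

    def lazy_greedy(f, ground_set, k):
        """Greedily maximize a monotone submodular value oracle f under the
        cardinality constraint |S| <= k, re-evaluating marginal gains lazily."""
        selected = set()
        current_value = f(selected)
        # Max-heap (via negated keys) of stale upper bounds on marginal gains,
        # initialized with the singleton gains f({e}) - f({}); the index i
        # only breaks ties so elements need not be comparable.
        heap = [(-(f({e}) - current_value), i, e) for i, e in enumerate(ground_set)]
        heapq.heapify(heap)
        while heap and len(selected) < k:
            neg_bound, i, e = heapq.heappop(heap)
            # Recompute the true marginal gain of e w.r.t. the current set.
            gain = f(selected | {e}) - current_value
            # By submodularity, gains only shrink as the set grows, so if the
            # refreshed gain still beats the best remaining (stale) bound,
            # e is the best element overall and can be added immediately.
            if not heap or gain >= -heap[0][0]:
                selected.add(e)
                current_value += gain
            else:
                heapq.heappush(heap, (-gain, i, e))
        return selected

By [21], the returned set achieves at least a (1-1/e) fraction of the optimal value; the lazy updates do not change the output, they only avoid recomputing marginal gains that provably cannot be the largest.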

Who should attend

The main objective of this tutorial is to introduce the concept of submodular function optimization and its emerging importance in solving complex AI problems. As a special focus, we illustrate the concept on a key AI task: the intelligent gathering of the most relevant information in a variety of complex real-world problems.
Since submodularity arises in so many different areas of AI, and since information gathering is central to many AI applications, we believe that this property is of both theoretical and practical interest to a large part of the AI community.

Presenters

Andreas Krause is an assistant professor of Computer Science at the California Institute of Technology. He received his Ph.D. from Carnegie Mellon University in 2008. He is a recipient of a Microsoft Research Graduate Fellowship, and his research on sensor placement and information acquisition has received awards at several major conferences (KDD '07, IPSN '06, ICML '05, and UAI '05) and from the ASCE Journal of Water Resources Planning and Management.

Carlos Guestrin is the Finmeccanica Assistant Professor in the Machine Learning and Computer Science Departments at Carnegie Mellon University. Previously, he was a senior researcher at Intel, and he received his PhD from Stanford University. Carlos' work has received awards at a number of major conferences and in a journal. He is also a recipient of the ONR Young Investigator Award, the NSF CAREER Award, the Alfred P. Sloan Fellowship, and the IBM Faculty Fellowship, and he was named one of the 2008 Brilliant 10 by Popular Science Magazine. Carlos is currently a member of the Information Sciences and Technology (ISAT) advisory group for DARPA.

References

The following references are used in the tutorial. A short high-level summary is given for each, without any claim of completeness.
  • [1] A. Chechetka and C. Guestrin. Efficient principled learning of thin junction trees. In Advances in Neural Information Processing Systems (NIPS), Vancouver, Canada, December 2007 [pdf]
    • Uses submodular function optimization to speed up structure learning.
  • [2] C. Chekuri and M. Pal. A recursive greedy algorithm for walks in directed graphs. In Annual Symposium on Foundations of Computer Science (FOCS), pages 245–253, 2005 [link]
    • Slightly superpolynomial approximation algorithm for submodular path planning (orienteering).
  • [3] A. Das and D. Kempe. Algorithms for subset selection in linear regression. In STOC, 2008. [pdf]
    • Variance reduction in Gaussian models is submodular under certain conditions on the covariance.
  • [4] U. Feige. On maximizing welfare when utility functions are subadditive. In STOC, 2006 [pdf]
    • Generalization of the submodular welfare problem to subadditive set functions.
  • [5] D. Golovin and M. Streeter. Online algorithms for maximizing submodular set functions. Technical Report CMU-CS-07-171, 2007. [pdf]
    • Online algorithm for maximizing submodular functions, with application to scheduling.
  • [6] S. C. H. Hoi, R. Jin, J. Zhu, and M. R. Lyu. Batch mode active learning and its application to medical image classification. In ICML, 2006. [link]
    • Defines a monotonic submodular criterion for batch-mode active learning of kernelized logistic regression.
  • [7] S. Iwata, L. Fleischer, and S. Fujishige. A combinatorial strongly polynomial algorithm for minimizing submodular functions. Journal of the ACM, 48(4):761–777, 2001.
    • Strongly polynomial algorithm for minimizing arbitrary submodular functions.
  • [8] A. K. Kelmans and B. N. Kimelfeld. Multiplicative submodularity of a matrix’s principal minor as a function of the set of its rows and some combinatorial applications. Discrete Mathematics, 44(1):113–116, 1980.
    • Proves that entropy is a submodular function.
  • [9] D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In KDD, 2003 [pdf]
    • Maximizing influence (viral marketing) in social networks is submodular.
  • [10] A. Krause and C. Guestrin. Near-optimal nonmyopic value of information in graphical models. In UAI, 2005 [pdf]
    • Information gain in Naive Bayes models is submodular. Also gives (1-1/e) hardness of approximation result for information gain in such models.
  • [11] A. Krause and C. Guestrin. Nonmyopic active learning of Gaussian processes: An exploration-exploitation approach. In ICML, 2007 [pdf]
    • Using submodularity to analyze sequential experimental design in Gaussian Processes with uncertain kernel parameters.
  • [12] A. Krause, C. Guestrin, A. Gupta, and J. Kleinberg. Near-optimal sensor placements: Maximizing information while minimizing communication cost. In Proceedings of the Fifth International Symposium on Information Processing in Sensor Networks (IPSN), 2006 [pdf]
    • Optimizing monotonic submodular functions subject to communication constraints.
  • [13] A. Krause, B. McMahan, C. Guestrin, and A. Gupta. Selecting observations against adversarial objectives. In NIPS, 2007 [pdf]
    • Maximizing the minimum over a set of monotonic submodular functions, with applications to robust experimental design.
  • [14] A. Krause, A. Singh, and C. Guestrin. Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. In JMLR, 2008.
    • Proves approximate monotonicity of mutual information for experimental design in Gaussian Processes.
  • [15] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective outbreak detection in networks. In KDD, 2007 [pdf]
    • Proves monotonic submodularity for outbreak detection in networks, with applications to sensor placement and selecting informative blogs.
  • [16] L. Lovasz. Submodular functions and convexity. Mathematical Programming - State of the Art, pages 235–257, 1983.
    • Discusses connections between submodular and convex functions. Introduces the Lovasz extension.
  • [17] A. Meliou, A. Krause, C. Guestrin, and J. Hellerstein. Nonmyopic informative path planning in spatio-temporal models. In AAAI, 2007.
    • Spatio-temporal submodular path planning.
  • [18] M. Narasimhan and J. Bilmes. A submodular-supermodular procedure with applications to discriminative structure learning. In Advances in Neural Information Processing Systems (NIPS) 19, 2006 [pdf]
    • Maximizing the difference of two submodular functions.
  • [19] M. Narasimhan and J. Bilmes. PAC-learning bounded tree-width graphical models. In Uncertainty in Artificial Intelligence, 2004 [pdf]
    • Uses symmetric submodular function minimization to find separators, which allows efficient structure learning.
  • [20] M. Narasimhan, N. Jojic, and J. Bilmes. Q-clustering. In NIPS, 2005 [pdf].
    • Shows how Queyranne's algorithm can be used for (near-)optimal clustering.
  • [21] G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations for maximizing submodular set functions. Mathematical Programming, 14:265–294, 1978.
    • (1-1/e) guarantee for the greedy algorithm for maximizing submodular functions.
  • [22] M. Queyranne. A combinatorial algorithm for minimizing symmetric submodular functions. In SODA, 1995.
    • Strongly polynomial algorithm for minimizing symmetric submodular functions.
  • [23] T. G. Robertazzi and S. C. Schwartz. An accelerated sequential algorithm for producing D-optimal designs. SIAM Journal on Scientific and Statistical Computing, 10(2):341–358, March 1989.
    • Uses lazy evaluations to construct D-optimal designs.
  • [24] A. Singh, A. Krause, C. Guestrin, W. Kaiser, and M. Batalin. Efficient planning of informative paths for multiple robots. In IJCAI, 2007 [pdf]
    • An efficient approximation algorithm for submodular path planning (orienteering), with extensions for multiple robots.
  • [25] M. Sviridenko. A note on maximizing a submodular set function subject to knapsack constraint. Operations Research Letters, 32:41–43, 2004 [link]
    • Shows that (1-1/e) guarantee can be achieved for budgeted maximization of submodular functions.
  • [26] L. Wolsey. An analysis of the greedy algorithm for the submodular set covering problem. Combinatorica, 2:385–393, 1982.
    • Proves guarantees for the greedy algorithm for submodular coverage.
  • [27] M. Goemans, and V. Ramakrishnan. Minimizing submodular functions over families of sets. Combinatorica, 15(4): 499-513, 1995 
    • Guarantees for constrained submodular minimization.
  • [28] M. Minoux. Accelerated greedy algorithms for maximizing submodular set functions. In Optimization Techniques, Lecture Notes in Control and Information Sciences, pages 234–243, 1978 [link]
    • Introduces the lazy greedy algorithm and online bounds for maximizing submodular functions.
  • [29] M. Groetschel, L. Lovasz, and A. Schrijver. The ellipsoid method and its consequences in combinatorial optimization. In Combinatorica, 1:169-197, 1981
    • Shows how the ellipsoid algorithm can minimize submodular functions in polynomial time.
  • [30] M. Queyranne. A Gomory-Hu tree for symmetric submodular functions, 1983
    • Proves the existence of Gomory-Hu trees for arbitrary symmetric submodular functions.
  • [31] L. Zhao, H. Nagamochi, and T. Ibaraki. Greedy splitting algorithms for approximating multiway partition problems, Mathematical Programming 102(1): 167-183, 2005 [link]
    • Analyzes greedy splitting for submodular clustering.
  • [32] M. Fisher, G. Nemhauser and L. Wolsey. An analysis of approximations for maximizing submodular set functions II, Math. Programming Study 8: 73–87, 1978
    • Guarantees for maximizing submodular functions over the intersection of p matroids.
  • [33] S. Fujishige, T. Hayashi, and S. Isotani. The Minimum-Norm-Point Algorithm Applied to Submodular Function Minimization and Linear Programming. Manuscript September 2006 [pdf]
    • Empirical study on the performance of the minimum norm algorithm.
  • [34] S. Fujishige. Submodular functions and optimization. Annals of Discrete Mathematics, Elsevier 2nd Edition 2005, 1st Edition 1991
    • Monograph on submodular functions.
  • [35] J. Vondrak. Optimal approximation for the Submodular Welfare Problem in the value oracle model, STOC 2008 [pdf]
    • Proves (1-1/e) guarantee for a continuous greedy algorithm for maximizing submodular functions subject to arbitrary matroid constraints.
  • [36] J. Edmonds. Submodular functions, matroids, and certain polyhedra. Proceedings of the Calgary International Conference on Combinatorial Structures and Their Applications (R. Guy, H. Hanani, N. Sauer and J. Schoenheim, eds., Gordon and Breach, New York, 1970), pp. 69–87; also in: Combinatorial Optimization—Eureka, You Shrink! (M. Juenger, G. Reinelt, and G. Rinaldi, eds., Lecture Notes in Computer Science 2570, Springer, Berlin, 2003), pp. 11–26.
    • Studies submodular functions, matroids, and the associated polyhedra, and proves duality results.
  • [37] A. Krause, C. Guestrin. A Note on the Budgeted Maximization of Submodular Functions. Technical Report CMU-CALD-05-103, 2005 [pdf]
    • Guarantees for submodular function maximization if function is perturbed by noise / only approximately submodular.
  • [38] M. Seeger. Greedy Forward Selection in the Informative Vector Machine. Manuscript 2004 [pdf]
    • Proves that the greedy algorithm used in the IVM optimizes a submodular function.
  • [39] A. Krause, R. Rajagopal, A. Gupta, and C. Guestrin. Simultaneous placement and scheduling of sensors. In IPSN, 2009.
    • Jointly optimizes sensor placement and scheduling via submodular function maximization.
Last updated: June 9, 2009