Seminar: Advanced Topics in Machine Learning | Learning & Adaptive Systems Group

Seminar – Advanced Topics in Machine Learning

In this seminar, recent papers from the pattern recognition and machine learning literature are presented and discussed. Possible topics cover statistical models in computer vision, graphical models and machine learning.

The seminar “Advanced Topics in Pattern Recognition” familiarizes students with recent developments in pattern recognition and machine learning. Original articles have to be presented and critically reviewed. The students will learn how to structure a scientific presentation in English that covers the key ideas of a scientific paper. An important goal of the seminar presentation is to summarize the essential ideas of the paper in sufficient depth while omitting details that are not essential for understanding the work. The presentation style will play an important role and should reach the level of professional scientific presentations.

The seminar will cover a number of recent papers that have emerged as important contributions to the pattern recognition and machine learning literature. The topics will vary from year to year, but they are centered on methodological issues in machine learning such as new learning algorithms, ensemble methods or new statistical models for machine learning applications. Frequently, papers are selected from computer vision or bioinformatics, two fields that rely more and more on machine learning methodology and statistical models. The papers will be presented in the first session of the seminar. VVZ information is available here.

Contact

Professors	Joachim M. Buhmann, Thomas Hofmann, Andreas Krause
Assistants	Hamed Hassani, Martin Jaggi, Stefan Bauer

Lectures

Tue	16-18	CAB H 52
Thu	16-18	CHN G 22

Tuesday Schedule

Date	Presenter	Topic
20 Oct	Melis	Distributed Stochastic Variance Reduced Gradient Methods (2015)
20 Oct	Jagerman	Communication Efficient Distributed Machine Learning with the Parameter Server (2014)
27 Oct	Pilgerstorfer	Adaptive subgradient methods for online learning and stochastic optimization (2011)
27 Oct	Mutny	Stochastic dual coordinate ascent methods for regularized loss (2013)
03 Nov	Karaivanov	Beyond Convexity: Stochastic Quasi-Convex Optimization (2015)
03 Nov	Wang	Non-convex Robust PCA (2014)
10 Nov	Holmer	A stochastic PCA and SVD algorithm with an exponential convergence rate (2015)
10 Nov	Herbst	Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods (2015)
17 Nov	Lianos	Training Highly Multiclass Classifiers
17 Nov	Helminger	User Conditional Hashtag Prediction for Images
17 Nov	Bahman	Sequence to Sequence Learning with Neural Networks
24 Nov	Spurr	Show and tell: A neural image caption generator
24 Nov	Marti	A Critical Review of Recurrent Neural Networks for Sequence Learning
01 Dec	Ciganovic	Neural variational inference and learning in belief networks
01 Dec	Ihnatov	Neural Turing Machines
15 Dec	Kan	Zero-shot learning by convex combination of semantic embeddings
15 Dec	Ghosh	Giraffe: Using Deep Reinforcement Learning to Play Chess

Thursday Schedule

Date	Presenter	Topic
15 Oct	Van der Goten	Adaptively Learning the Crowd Kernel (2011)
15 Oct	Vollprecht	Tuned Models of Peer Assessment in MOOCs (2013)
22 Oct	Calderara	Spectral Methods meet EM: A Provably Optimal Algorithm for Crowdsourcing (2014)
22 Oct	Hamas	Probabilistic Programming (ICSE 2014)
29 Oct	Greuter	A New Approach to Probabilistic Programming Inference (2014)
29 Oct	Porvaznik	Learning Probabilistic Programs (2014)
05 Nov	Nikolov	On the convexity of latent social network inference (2010)
05 Nov	Minhaz	Scalable Influence Estimation in Continuous-Time Diffusion Networks (2013)
12 Nov	Koleva	Uncovering the Temporal Dynamics of Diffusion Networks (2011)
12 Nov	Carion	A Tutorial on Bayesian Optimization of Expensive Cost Functions (2010)
19 Nov	Song	Practical Bayesian Optimization of Machine Learning Algorithms (2012)
19 Nov	Wu	Input Warping for Bayesian Optimization of Non-Stationary Functions (2014)
26 Nov	Ma	High Dimensional Bayesian Optimisation and Bandits via Additive Models (2015)
26 Nov	Nishant	Introduction to causal inference (2010)
03 Dec	Abdelmessih	Detecting the direction of causal time series (2009)
03 Dec	Wang Jingyi	Nonlinear causal discovery with additive noise models (2008)
17 Dec	Demitri	Probabilistic latent variable models for distinguishing between cause and effect (2010)
17 Dec	Raiskin	Towards a learning theory of cause-effect inference (2015)

Tuesday Topics and Papers

Convex and Non-Convex Optimization

Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. The Journal of Machine Learning Research, 12, 2121-2159.
Shalev-Shwartz, Shai, and Tong Zhang. Stochastic dual coordinate ascent methods for regularized loss. The Journal of Machine Learning Research 14.1 (2013): 567-599.
Defazio, Aaron, Francis Bach, and Simon Lacoste-Julien. SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. NIPS 2014.
Christopher De Sa, Christopher Re, Kunle Olukotun, Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems, ICML 2015
Shamir, Ohad. A stochastic PCA and SVD algorithm with an exponential convergence rate. ICML 2015.
Bhojanapalli, S., Kyrillidis, A., & Sanghavi, S. (2015). Dropping Convexity for Faster Semi-definite Optimization. arXiv preprint
Praneeth Netrapalli, U N Niranjan, Sujay Sanghavi (2014). Non-convex Robust PCA, NIPS 2014
Elad Hazan, Kfir Levy, Shai Shalev-Shwartz (2015). Beyond Convexity: Stochastic Quasi-Convex Optimization NIPS 2015

Deep Learning, Embeddings, Multiclass Classification

Denton, E., Weston, J., Paluri, M., Bourdev, L., & Fergus, R. (2015). User Conditional Hashtag Prediction for Images. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1731-1740). ACM.
Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2014). Show and tell: A neural image caption generator. arXiv preprint
Maya R. Gupta, Samy Bengio, Jason Weston; Training Highly Multiclass Classifiers. The Journal of Machine Learning Research 15(Apr):1461−1492, 2014.
Norouzi, M., Mikolov, T., Bengio, S., Singer, Y., Shlens, J., Frome, A., … & Dean, J. (2013). Zero-shot learning by convex combination of semantic embeddings. arXiv preprint
Lipton, Z. A Critical Review of Recurrent Neural Networks for Sequence Learning arXiv preprint
Majid Janzamin, Hanie Sedghi, Anima Anandkumar (2015). Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods arXiv preprint
Mnih, A., & Gregor, K. (2014). Neural variational inference and learning in belief networks. ICML
Lai, M. (2015). Giraffe: Using Deep Reinforcement Learning to Play Chess. MSc Thesis

Variational Inference

Hoffman, M. D., Blei, D. M., Wang, C., & Paisley, J. (2013). Stochastic variational inference. The Journal of Machine Learning Research, 14(1), 1303-1347.
Salimans, T. (2014). Markov chain Monte Carlo and variational inference: Bridging the gap. arXiv preprint.
Paisley, J., Blei, D., & Jordan, M. (2012). Variational Bayesian inference with stochastic search. arXiv preprint arXiv:1206.6430.
Mnih, A., & Gregor, K. (2014). Neural variational inference and learning in belief networks. ICML
Djolonga, J., & Krause, A. (2014). From MAP to marginals: Variational inference in Bayesian submodular models. In Advances in Neural Information Processing Systems (pp. 244-252).
Variational Message Passing. John Winn, Christopher M. Bishop. JMLR 2005

Distributed Optimization

Lee, J., Ma, T., & Lin, Q. (2015). Distributed Stochastic Variance Reduced Gradient Methods arXiv
H Mania, X Pan, D Papailiopoulos, B Recht, K Ramchandran, M I. Jordan (2015) Perturbed Iterate Analysis for Asynchronous Stochastic Optimization
Li, M., Andersen, D. G., Smola, A., & Yu, K. (2014). Communication Efficient Distributed Machine Learning with the Parameter Server. NIPS 2014
Lee, C.-P., & Roth, D. (2015). Distributed Box-Constrained Quadratic Optimization for Dual Linear SVM. ICML 2015.

Thursday Topics and Papers

Network Inference

M. Gomez-Rodriguez, D. Balduzzi, B. Schölkopf. Uncovering the Temporal Dynamics of Diffusion Networks. The 28th International Conference on Machine Learning (ICML), 2011.
Seth Myers and Jure Leskovec. On the convexity of latent social network inference. NIPS’10.
Scalable Influence Estimation in Continuous-Time Diffusion Networks. Nan Du, Le Song, Manuel Gomez Rodriguez, and Hongyuan Zha. NIPS 2013.

Learning meets optimization

G. Papandreou and A. Yuille, Perturb-and-MAP Random Fields: Using Discrete Optimization to Learn and Sample from Energy Models, ICCV 2011
M. J. Wainwright, T. Jaakkola and A. S. Willsky (2005). A new class of upper bounds on the log partition function. IEEE Trans. on Information Theory
Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets, Adarsh Prasad, Stefanie Jegelka, Dhruv Batra, NIPS 2014.
Active Learning as Non-Convex Optimization, Andrew Guillory, Erick Chastain, Jeff Bilmes, AISTATS 2009.

Bayesian optimization

A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning. Eric Brochu, Vlad M. Cora and Nando de Freitas. eprint arXiv:1012.2599
Input Warping for Bayesian Optimization of Non-Stationary Functions, Jasper Snoek, Kevin Swersky, Richard S. Zemel, Ryan P. Adams. ICML’14
Bayesian Optimization with Unknown Constraints, Michael A. Gelbart, Jasper Snoek, Ryan P. Adams. UAI’14
Practical Bayesian Optimization of Machine Learning Algorithms. Jasper Snoek, Hugo Larochelle, and Ryan P. Adams. NIPS, 2012.
Bayesian Active Learning for Posterior Estimation. Kirthevasan Kandasamy, Jeff Schneider, and Barnabas Poczos. IJCAI, 2015.
High Dimensional Bayesian Optimisation and Bandits via Additive Models, Kirthevasan Kandasamy, Jeff Schneider, Barnabas Poczos, ICML 2015

Probabilistic Programming

Andrew Gordon, Thomas Henzinger, Aditya Nori, and Sriram Rajamani. Probabilistic Programming. ICSE, 2014.
A New Approach to Probabilistic Programming Inference, Wood, F., van de Meent, J. W., & Mansinghka, V., AISTATS 2014
Learning Probabilistic Programs. Perov, Y., & Wood, F. arXiv preprint arXiv:1407.2646.

Learning and Economics / Game Theory

Adaptively Learning the Crowd Kernel. Omer Tamuz, Ce Liu, Serge Belongie, Ohad Shamir, Adam Tauman Kalai. ICML’11.
Tuned Models of Peer Assessment in MOOCs. C. Piech, J. Huang, Z. Chen, C. Do, A. Ng, D. Koller. EDM’13.
Spectral Methods meet EM: A Provably Optimal Algorithm for Crowdsourcing. Y. Zhang, X. Chen, D. Zhou, M. Jordan. NIPS’14.

Causality

Spirtes, Peter. “Introduction to causal inference.” The Journal of Machine Learning Research 11 (2010): 1643-1662 with additional examples from Guyon, Isabelle. “Practical feature selection: from correlation to causality.” NATO Science for Peace and Security 19 (2008): 27-43.
Patrik O. Hoyer et al. “Nonlinear causal discovery with additive noise models” in Advances in Neural Information Processing Systems 21 (NIPS 2008)
Stegle, Oliver, et al. “Probabilistic latent variable models for distinguishing between cause and effect.” Advances in Neural Information Processing Systems. 2010.
Peters, Janzing, Gretton and Schölkopf. Detecting the Direction of Causal Time Series. ICML 2009
Lopez-Paz, David, et al. “Towards a learning theory of cause-effect inference.” Proceedings of the 32nd International Conference on Machine Learning, JMLR: W&CP, Lille, France. 2015