26. ICML 2009: Montreal, Quebec, Canada
Andrea Pohoreckyj Danyluk, Léon Bottou, Michael L. Littman (Eds.): Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, Montreal, Quebec, Canada, June 14-18, 2009. ACM 2009, ACM International Conference Proceeding Series 382, ISBN 978-1-60558-516-1
Ryan Prescott Adams, Zoubin Ghahramani: Archipelago: nonparametric Bayesian semi-supervised learning. 1
Ryan Prescott Adams, Iain Murray, David J. C. MacKay: Tractable nonparametric Bayesian inference in Poisson processes with Gaussian process intensities. 2
David Andrzejewski, Xiaojin Zhu, Mark Craven: Incorporating domain knowledge into topic modeling via Dirichlet Forest priors. 4
Raphaël Bailly, François Denis, Liva Ralaivola: Grammatical inference as a principal component analysis problem. 5



Craig Boutilier, Kevin Regan, Paolo Viappiani: Online feature elicitation in interactive optimization. 10

Alberto Giovanni Busetto, Cheng Soon Ong, Joachim M. Buhmann: Optimized expected information gain for nonlinear dynamical systems. 13
Deng Cai, Xuanhui Wang, Xiaofei He: Probabilistic dyadic data analysis with local and global consistency. 14
Cassio Polpo de Campos, Zhi Zeng, Qiang Ji: Structure learning of Bayesian networks using constraints. 15
Nicolò Cesa-Bianchi, Claudio Gentile, Francesco Orabona: Robust bounds for classification via selective sampling. 16
Kamalika Chaudhuri, Sham M. Kakade, Karen Livescu, Karthik Sridharan: Multi-view clustering via canonical correlation analysis. 17
Jianhui Chen, Lei Tang, Jun Liu, Jieping Ye: A convex formulation for learning shared structures from multiple tasks. 18
Chih-Chieh Cheng, Fei Sha, Lawrence K. Saul: Matrix updates for perceptron training of continuous density hidden Markov models. 20
Weiwei Cheng, Jens C. Huhn, Eyke Hüllermeier: Decision tree and instance-based learning for label ranking. 21
Youngmin Cho, Lawrence K. Saul: Learning dictionaries of stable autoregressive models for audio scene analysis. 22
Myung Jin Choi, Venkat Chandrasekaran, Alan S. Willsky: Exploiting sparse Markov and covariance structure in multiresolution models. 23
Wenyuan Dai, Ou Jin, Gui-Rong Xue, Qiang Yang, Yong Yu: EigenTransfer: a unified framework for transfer learning. 25
Hal Daumé III: Unsupervised search-based structured prediction. 27
Marc Peter Deisenroth, Marco F. Huber, Uwe D. Hanebeck: Analytic moment-based Gaussian process filtering. 29
Meghana Deodhar, Gunjan Gupta, Joydeep Ghosh, Hyuk Cho, Inderjit S. Dhillon: A scalable framework for discovering coherent co-clusters in noisy data. 31
Carlos Diuk, Lihong Li, Bethany R. Leffler: The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning. 32
Chuong B. Do, Quoc V. Le, Chuan-Sheng Foo: Proximal regularization for online and batch learning. 33
Trinh Minh Tri Do, Thierry Artières: Large margin training for hidden Markov models with partially observed states. 34

Lixin Duan, Ivor W. Tsang, Dong Xu, Tat-Seng Chua: Domain adaptation from multiple sources via auxiliary classifiers. 37
Alireza Farhangfar, Russell Greiner, Csaba Szepesvári: Learning to segment from a few well-selected training images. 39
M. Julia Flores, José A. Gámez, Ana M. Martínez, Jose Miguel Puerta: GAODE and HAODE: two proposals based on AODE to deal with continuous variables. 40
Chuan-Sheng Foo, Chuong B. Do, Andrew Y. Ng: A majorization-minimization algorithm for (multiple) hyperparameter learning. 41
Rahul Garg, Rohit Khandekar: Gradient descent with sparsification: an iterative algorithm for sparse recovery with restricted isometry property. 43
Roman Garnett, Michael A. Osborne, Stephen J. Roberts: Sequential Bayesian prediction in the presence of changepoints. 44
Pascal Germain, Alexandre Lacasse, François Laviolette, Mario Marchand: PAC-Bayesian learning of linear classifiers. 45
Eduardo Rodrigues Gomes, Ryszard Kowalczyk: Dynamic analysis of multiagent Q-learning with ε-greedy exploration. 47


Verena Heidrich-Meisner, Christian Igel: Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search. 51
Thibault Helleputte, Pierre Dupont: Partially supervised feature selection with regularized linear models. 52
Tzu-Kuo Huang, Jeff G. Schneider: Learning linear dynamical systems without sequence information. 54
Laurent Jacob, Guillaume Obozinski, Jean-Philippe Vert: Group lasso with overlap and graph lasso. 55
Tony Jebara, Jun Wang, Shih-Fu Chang: Graph construction and b-matching for semi-supervised learning. 56
Nikolay Jetchev, Marc Toussaint: Trajectory prediction: learning to map situations to robot trajectories. 57
(paper withdrawn). ...
Jason K. Johnson, Vladimir Y. Chernyak, Michael Chertkov: Orbit-product representation and correction of Gaussian belief propagation. 60
Hetunandan Kamisetty, Christopher James Langmead: A Bayesian approach to protein model quality assessment. 61



J. Zico Kolter, Andrew Y. Ng: Regularization and feature selection in least-squares temporal difference learning. 66

Matthieu Kowalski, Marie Szafranski, Liva Ralaivola: Multiple indefinite kernel learning with mixed norm regularization. 69
Sanjiv Kumar, Mehryar Mohri, Ameet Talwalkar: On sampling-based approximate spectral decomposition. 70
Ondrej Kuzelka, Filip Zelezný: Block-wise construction of acyclic relational features with monotone irreducibility and relevancy properties. 72
Yanyan Lan, Tie-Yan Liu, Zhiming Ma, Hang Li: Generalization analysis of listwise learning-to-rank algorithms. 73


Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng: Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. 77
Bin Li, Qiang Yang, Xiangyang Xue: Transfer learning for collaborative filtering via a rating-matrix generative model. 78
Ping Li: ABC-boost: adaptive base class boost for multi-class classification. 79

Han Liu, Mark Palatucci, Jian Zhang: Blockwise coordinate descent procedures for the multi-task lasso, with applications to neural semantic basis discovery. 82
Yan Liu, Alexandru Niculescu-Mizil, Wojciech Gryc: Topic-link LDA: joint models of topic and author community. 84
Justin Ma, Lawrence K. Saul, Stefan Savage, Geoffrey M. Voelker: Identifying suspicious URLs: an application of large-scale online learning. 86
Julien Mairal, Francis Bach, Jean Ponce, Guillermo Sapiro: Online dictionary learning for sparse coding. 87
Takaki Makino: Proto-predictive representation of states with simple recurrent temporal-difference networks. 88
Benjamin M. Marlin, Kevin P. Murphy: Sparse Gaussian graphical models with unknown block structure. 89
André F. T. Martins, Noah A. Smith, Eric P. Xing: Polyhedral outer approximations with application to natural language parsing. 90
Frédéric de Mesmay, Arpad Rimmel, Yevgen Voronenko, Markus Püschel: Bandit-based optimization on graphs with application to library performance tuning. 92
Joris M. Mooij, Dominik Janzing, Jonas Peters, Bernhard Schölkopf: Regression by dependence minimization and its application to causal inference in additive noise models. 94
Gerhard Neumann, Wolfgang Maass, Jan Peters: Learning complex motions by sequencing simpler motion templates. 95
Hannes Nickisch, Matthias W. Seeger: Convex variational Bayesian inference for large scale generalized linear models. 96
Sebastian Nowozin, Stefanie Jegelka: Solution stability in linear programming relaxations: graph partitioning and unsupervised learning. 97

Jason Pazis, Michail G. Lagoudakis: Binary action search for learning continuous-action control policies. 100
Jonas Peters, Dominik Janzing, Arthur Gretton, Bernhard Schölkopf: Detecting the direction of causal time series. 101
Nils Plath, Marc Toussaint, Shinichi Nakajima: Multi-class image segmentation using conditional random fields and global classification. 103
Barnabás Póczos, Yasin Abbasi-Yadkori, Csaba Szepesvári, Russell Greiner, Nathan R. Sturtevant: Learning when to stop thinking and do something! 104
Duangmanee Putthividhya, Hagai Thomas Attias, Srikantan S. Nagarajan: Independent factor topic models. 105
Guo-Jun Qi, Jinhui Tang, Zheng-Jun Zha, Tat-Seng Chua, Hong-Jiang Zhang: An efficient sparse metric learning in high-dimensional space via l1-penalized log-determinant regularization. 106
Xian Qian, Xiaoqian Jiang, Qi Zhang, Xuanjing Huang, Lide Wu: Sparse higher order conditional random fields for improved sequence labeling. 107
Ariadna Quattoni, Xavier Carreras, Michael Collins, Trevor Darrell: An efficient projection for l1,infinity regularization. 108
Milos Radovanovic, Alexandros Nanopoulos, Mirjana Ivanovic: Nearest neighbors in high-dimensional data: the emergence and influence of hubs. 109
Rajat Raina, Anand Madhavan, Andrew Y. Ng: Large-scale deep unsupervised learning using graphics processors. 110
Sudhir Raman, Thomas J. Fuchs, Peter J. Wild, Edgar Dahl, Volker Roth: The Bayesian group-Lasso for analyzing contingency tables. 111
Vikas C. Raykar, Shipeng Yu, Linda H. Zhao, Anna K. Jerebko, Charles Florin, Gerardo Hermosillo Valadez, Luca Bogoni, Linda Moy: Supervised learning from multiple experts: whom to trust when everyone lies a bit. 112
Sushmita Roy, Terran Lane, Margaret Werner-Washburne: Learning structurally consistent undirected probabilistic graphical models. 114
Stefan Rüping: Ranking interesting subgroups. 115
Mikkel N. Schmidt: Function factorization using warped Gaussian processes. 116


Vikas Sindhwani, Prem Melville, Richard D. Lawrence: Uncertainty sampling and transductive experimental design for active dual supervision. 120
Le Song, Jonathan Huang, Alexander J. Smola, Kenji Fukumizu: Hilbert space embeddings of conditional distributions with applications to dynamical systems. 121
Andreas P. Streich, Mario Frank, David A. Basin, Joachim M. Buhmann: Multi-assignment clustering for Boolean data. 122
Liang Sun, Shuiwang Ji, Jieping Ye: A least squares formulation for a class of generalized eigenvalue problems in machine learning. 123
Ilya Sutskever: A simpler unified analysis of budget perceptrons. 124
Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora: Fast gradient-descent methods for temporal-difference learning with linear function approximation. 125
Istvan Szita, András Lörincz: Optimistic initialization and greediness lead to polynomial time learning in factored MDPs. 126

Graham W. Taylor, Geoffrey E. Hinton: Factored conditional restricted Boltzmann machines for modeling motion style. 129
Tijmen Tieleman, Geoffrey E. Hinton: Using fast weights to improve persistent contrastive divergence. 130
Robert E. Tillman: Structure learning with independent non-identically distributed data. 131
Marc Toussaint: Robot trajectory optimization using approximate inference. 132
Nicolas Usunier, David Buffoni, Patrick Gallinari: Ranking with ordered weighted pairwise classification. 133
Xuan Vinh Nguyen, Julien Epps, James Bailey: Information theoretic measures for clusterings comparison: is a correction for chance necessary? 135


Hanna M. Wallach, Iain Murray, Ruslan Salakhutdinov, David M. Mimno: Evaluation methods for topic models. 139
Kilian Q. Weinberger, Anirban Dasgupta, John Langford, Alexander J. Smola, Josh Attenberg: Feature hashing for large scale multitask learning. 140
Max Welling: Herding dynamical weights to learn. 141
Frank Wood, Cédric Archambeau, Jan Gasthaus, Lancelot James, Yee Whye Teh: A stochastic memoizer for sequence data. 142
Linli Xu, Martha White, Dale Schuurmans: Optimal reverse prediction: a unified perspective on supervised, unsupervised and semi-supervised learning. 143

Yi Sun, Daan Wierstra, Tom Schaul, Jürgen Schmidhuber: Stochastic search using the natural gradient. 146

Kai Yu, John D. Lafferty, Shenghuo Zhu, Yihong Gong: Large-scale collaborative prediction using a nonparametric random effects model. 149
Yisong Yue, Thorsten Joachims: Interactively optimizing information retrieval systems as a dueling bandits problem. 151
Peng Zang, Peng Zhou, David Minnen, Charles Lee Isbell Jr.: Discovering options from example trajectories. 153
De-Chuan Zhan, Ming Li, Yu-Feng Li, Zhi-Hua Zhou: Learning instance specific distances using metric propagation. 154
Kai Zhang, James T. Kwok, Bahram Parvin: Prototype vector machine for large scale semi-supervised learning. 155
Wei Zhang, Akshat Surve, Xiaoli Fern, Thomas G. Dietterich: Learning non-redundant codebooks for classifying complex objects. 156
Zhi-Hua Zhou, Yu-Yin Sun, Yu-Feng Li: Multi-instance learning by treating instances as non-I.I.D. samples. 157
Jun Zhu, Amr Ahmed, Eric P. Xing: MedLDA: maximum margin supervised topic models for regression and classification. 158
Jinfeng Zhuang, Ivor W. Tsang, Steven C. H. Hoi: SimpleNPKL: simple non-parametric kernel learning. 160
Corinna Cortes: Invited talk: Can learning kernels help performance? 161
Yoav Freund: Invited talk: Drifting games, boosting and online learning. 162
John Mark Agosta, Russell Almond, Dennis M. Buede, Marek J. Druzdzel, Judy Goldsmith, Silja Renooij: Workshop summary: Seventh annual workshop on Bayes applications. 163
Robert F. Murphy, Chun-Nan Hsu, Loris Nanni: Workshop summary: Automated interpretation and modelling of cell images. 164
Kai Yu, Ruslan Salakhutdinov, Yann LeCun, Geoffrey E. Hinton, Yoshua Bengio: Workshop summary: Workshop on learning feature hierarchies. 165
David Wingate, Carlos Diuk, Lihong Li, Matthew Taylor, Jordan Frank: Workshop summary: Results of the 2009 reinforcement learning competition. 166
Chris Drummond, Nathalie Japkowicz, William Klement, Sofus A. Macskassy: Workshop summary: The fourth workshop on evaluation methods for machine learning. 167
Jean-Yves Audibert, Peter Auer, Alessandro Lazaric, Rémi Munos, Daniil Ryabko, Csaba Szepesvári: Workshop summary: On-line learning with limited feedback. 168
Matthias Seeger, Suvrit Sra, John P. Cunningham: Workshop summary: Numerical mathematics in machine learning. 169
Özgür Simsek: Workshop summary: Abstraction in reinforcement learning. 170
Alina Beygelzimer, John Langford, Bianca Zadrozny: Tutorial summary: Reductions in machine learning. 172
Eyal Even-Dar, Vahab S. Mirrokni: Tutorial summary: Convergence of natural dynamics to equilibria. 173
Volker Tresp, Kai Yu: Tutorial summary: Learning with dependencies between several response variables. 174
Manfred K. Warmuth, S. V. N. Vishwanathan: Tutorial summary: Survey of boosting from an optimization perspective. 175
Yael Niv: Tutorial summary: The neuroscience of reinforcement learning. 176
Paul N. Bennett, Misha Bilenko, Kevyn Collins-Thompson: Tutorial summary: Machine learning in IR: recent successes and new opportunities. 177
Jure Leskovec: Tutorial summary: Large social and information networks: opportunities for ML. 179
Noah A. Smith: Tutorial summary: Structured prediction for natural language processing. 180