23. ICML 2006: Pittsburgh, Pennsylvania, USA
William W. Cohen, Andrew Moore (Eds.): Machine Learning, Proceedings of the Twenty-Third International Conference (ICML 2006), Pittsburgh, Pennsylvania, USA, June 25-29, 2006. ACM 2006, ACM International Conference Proceeding Series 148, ISBN 1-59593-383-2
Amit Agarwal, Elad Hazan, Satyen Kale, Robert E. Schapire: Algorithms for portfolio management based on the Newton method. 9-16
Shivani Agarwal: Ranking on graph data. 25-32
Andreas Argyriou, Raphael Hauser, Charles A. Micchelli, Massimiliano Pontil: A DC-programming algorithm for kernel selection. 41-48



Arindam Banerjee: On Bayesian bounds. 81-88
Onureena Banerjee, Laurent El Ghaoui, Alexandre d'Aspremont, Georges Natsoulis: Convex optimization techniques for fitting sparse Gaussian graphical models. 89-96
Ivona Bezáková, Adam Kalai, Rahul Santhanam: Graph model selection using maximum likelihood. 105-112
Edwin V. Bonilla, Christopher K. I. Williams, Felix V. Agakov, John Cavazos, John Thomson, Michael F. P. O'Boyle: Predictive search distributions. 121-128
Michael H. Bowling, Peter McCracken, Michael James, James Neufeld, Dana F. Wilkinson: Learning predictive state representations using non-blind policies. 129-136
Ulf Brefeld, Thomas Gärtner, Tobias Scheffer, Stefan Wrobel: Efficient co-regularised least squares regression. 137-144
Miguel Á. Carreira-Perpiñán: Fast nonparametric clustering with Gaussian blurring mean-shift. 153-160
Rich Caruana, Alexandru Niculescu-Mizil: An empirical comparison of supervised learning algorithms. 161-168
Nicolò Cesa-Bianchi, Claudio Gentile, Luca Zaniboni: Hierarchical classification: combining Bayes with SVM. 177-184
Olivier Chapelle, Mingmin Chi, Alexander Zien: A continuation method for semi-supervised SVMs. 185-192
Ronan Collobert, Fabian H. Sinz, Jason Weston, Léon Bottou: Trading convexity for scalability. 201-208
Vincent Conitzer, Nikesh Garera: Learning algorithms for online principal-agent problems (and selling goods online). 209-216
Bruno Castro da Silva, Eduardo W. Basso, Ana L. C. Bazzan, Paulo Martins Engel: Dealing with non-stationary environments using context detection. 217-224
Juan Dai, Shuicheng Yan, Xiaoou Tang, James T. Kwok: Locally adaptive classification piloted by uncertainty. 225-232

Dennis DeCoste: Collaborative prediction using ensembles of Maximum Margin Matrix Factorizations. 249-256
Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin: Learning the structure of Factored Markov Decision Processes in reinforcement learning problems. 257-264
François Denis, Christophe Nicolas Magnan, Liva Ralaivola: Efficient learning of Naive Bayes classifiers under class-conditional classification noise. 265-272
Chris H. Q. Ding, Ding Zhou, Xiaofeng He, Hongyuan Zha: R1-PCA: rotational invariant L1-norm principal component analysis for robust subspace factorization. 281-288
Charles Elkan: Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution. 289-296
Barbara E. Engelhardt, Michael I. Jordan, Steven E. Brenner: A graphical model for predicting protein molecular function. 297-304
Michael Fink, Shai Shalev-Shwartz, Yoram Singer, Shimon Ullman: Online multiclass learning by interclass hypothesis sharing. 313-320
Jochen Garcke: Regression with the optimised combination technique. 321-328
Yang Ge, Wenxin Jiang: A note on mixtures of experts for multiclass responses: approximation rate and Consistent Bayesian Inference. 329-335
Peter V. Gehler, Alex Holub, Max Welling: The rate adapting Poisson model for information retrieval and object recognition. 337-344
Pierre Geurts, Louis Wehenkel, Florence d'Alché-Buc: Kernelizing the output of tree-based methods. 345-352
Dilan Görür, Frank Jäkel, Carl Edward Rasmussen: A choice model with infinitely many latent features. 361-368
Alex Graves, Santiago Fernández, Faustino J. Gomez, Jürgen Schmidhuber: Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. 369-376
Derek Greene, Padraig Cunningham: Practical solutions to the problem of diagonal dominance in kernel document clustering. 377-384
Patrick Haffner: Fast transpose methods for kernel learning on sparse data. 385-392
Steve Hanneke: An analysis of graph cut size for transductive learning. 393-399
Tomer Hertz, Aharon Bar-Hillel, Daphna Weinshall: Learning a kernel function for classification with small training samples. 401-408
Michael P. Holmes, Charles Lee Isbell Jr.: Looping suffix tree-based inference of partially observable hidden state. 409-416
Steven C. H. Hoi, Rong Jin, Jianke Zhu, Michael R. Lyu: Batch mode active learning and its application to medical image classification. 417-424

Brendan Juba: Estimating relatedness via data compression. 441-448
Philipp W. Keller, Shie Mannor, Doina Precup: Automatic basis function construction for approximate dynamic programming and reinforcement learning. 449-456
Wolf Kienzle, Kumar Chellapilla: Personalized handwriting recognition via biased regularization. 457-464
Seung-Jean Kim, Alessandro Magnani, Stephen P. Boyd: Optimal kernel selection in Kernel Fisher discriminant analysis. 465-472
Seung-Jean Kim, Alessandro Magnani, Sikandar Samar, Stephen P. Boyd, Johan Lim: Pareto optimal linear classification. 473-480
Mike Klaas, Mark Briers, Nando de Freitas, Arnaud Doucet, Simon Maskell, Dustin Lang: Fast particle smoothing: if I had a million particles. 481-488
George Konidaris, Andrew G. Barto: Autonomous shaping: knowledge transfer in reinforcement learning. 489-496
Andreas Krause, Jure Leskovec, Carlos Guestrin: Data association for topic intensity tracking. 497-504
Neil D. Lawrence, Joaquin Quiñonero Candela: Local distance preservation in the GP-LVM through back constraints. 513-520
Chi-Hoon Lee, Russell Greiner, Shaojun Wang: Using query-specific variance estimates to combine Bayesian classifiers. 529-536


Hui Li, Xuejun Liao, Lawrence Carin: Region-based value iteration for partially observable Markov decision processes. 561-568
Ling Li: Multiclass boosting with repartitioning. 569-576
Wei Li, Andrew McCallum: Pachinko allocation: DAG-structured mixture models of topic correlations. 577-584
Bo Long, Zhongfei (Mark) Zhang, Xiaoyun Wu, Philip S. Yu: Spectral clustering for multi-type relational data. 585-592
Le Lu, René Vidal: Combined central and subspace clustering for computer vision applications. 593-600
Mauro Maggioni, Sridhar Mahadevan: Fast direct policy evaluation using multiscale analysis of Markov diffusion processes. 601-608
Julian John McAuley, Tibério S. Caetano, Alex J. Smola, Matthias O. Franz: Learning high-order MRF priors of color images. 617-624
Marina Meila: The uniqueness of a good optimum for K-means. 625-632
Roland Memisevic: Kernel information embeddings. 633-640

Mukund Narasimhan, Paul A. Viola, Michael Shilman: Online decoding of Markov models under latency constraints. 657-664
Yuriy Nevmyvaka, Yi Feng, Michael Kearns: Reinforcement learning for optimized trade execution. 673-680

Pascal Poupart, Nikos A. Vlassis, Jesse Hoey, Kevin Regan: An analytic solution to discrete Bayesian reinforcement learning. 697-704
Rajat Raina, Andrew Y. Ng, Daphne Koller: Constructing informative priors using transfer learning. 713-720

Pradeep D. Ravikumar, John D. Lafferty: Quadratic programming relaxations for metric labeling and Markov random field MAP estimation. 737-744
Jean-Michel Renders, Éric Gaussier, Cyril Goutte, François Pacull, Gabriela Csurka: Categorization in multiple category systems. 745-752
Lev Reyzin, Robert E. Schapire: How boosting the margin can also boost classifier complexity. 753-760
David A. Ross, Simon Osindero, Richard S. Zemel: Combining discriminative features to infer complex trajectories. 761-768
Matthew R. Rudary, Satinder P. Singh: Predictive linear-Gaussian models of controlled stochastic dynamical systems. 777-784
Sunita Sarawagi: Efficient inference on sequence segmentation models. 793-800
Victor S. Sheng, Charles X. Ling: Feature value acquisition in testing: a sequential batch test algorithm. 809-816


Vikas Sindhwani, S. Sathiya Keerthi, Olivier Chapelle: Deterministic annealing for semi-supervised kernel machines. 841-848
Le Song, Julien Epps: Classifying EEG for brain-computer interfaces: learning optimal filters for dynamical system features. 857-864
Nathan Srebro, Gregory Shakhnarovich, Sam T. Roweis: An investigation of computational and informational limits in Gaussian mixture clustering. 865-872
David H. Stern, Ralf Herbrich, Thore Graepel: Bayesian pattern ranking for move prediction in the game of Go. 873-880
Alexander L. Strehl, Lihong Li, Eric Wiewiora, John Langford, Michael L. Littman: PAC model-free reinforcement learning. 881-888
Alexander L. Strehl, Chris Mesterharm, Michael L. Littman, Haym Hirsh: Experience-efficient learning in associative bandit problems. 889-896
Masashi Sugiyama: Local Fisher discriminant analysis for supervised dimensionality reduction. 905-912

Choon Hui Teo, S. V. N. Vishwanathan: Fast and space efficient string kernels using suffix arrays. 929-936
Jo-Anne Ting, Aaron D'Souza, Stefan Schaal: Bayesian regression with input noise for high dimensional data. 937-944
Marc Toussaint, Amos J. Storkey: Probabilistic inference for solving discrete and continuous state Markov Decision Processes. 945-952
Sriharsha Veeramachaneni, Emanuele Olivetti, Paolo Avesani: Active sampling for detecting irrelevant features. 961-968
S. V. N. Vishwanathan, Nicol N. Schraudolph, Mark W. Schmidt, Kevin P. Murphy: Accelerated training of conditional random fields with stochastic gradient methods. 969-976
Hanna M. Wallach: Topic modeling: beyond bag-of-words. 977-984
Gang Wang, Dit-Yan Yeung, Frederick H. Lochovsky: Two-dimensional solution path for support vector regression. 993-1000
Manfred K. Warmuth, Jun Liao, Gunnar Rätsch: Totally corrective boosting algorithms that maximize the margin. 1001-1008
Jason Weston, Ronan Collobert, Fabian H. Sinz, Léon Bottou, Vladimir Vapnik: Inference with the Universum. 1009-1016
David Wingate, Satinder P. Singh: Kernel Predictive Linear Gaussian models for nonlinear stochastic dynamical systems. 1017-1024
Xiaopeng Xi, Eamonn J. Keogh, Christian R. Shelton, Li Wei, Chotirat Ann Ratanamahatana: Fast time series classification using numerosity reduction. 1033-1040
Lin Xiao, Jun Sun, Stephen P. Boyd: A duality view of spectral methods for dimensionality reduction. 1041-1048
Eric P. Xing, Kyung-Ah Sohn, Michael I. Jordan, Yee Whye Teh: Bayesian multi-population haplotype inference via a hierarchical Dirichlet process mixture. 1049-1056
Linli Xu, Dana F. Wilkinson, Finnegan Southey, Dale Schuurmans: Discriminative unsupervised learning of structured predictors. 1057-1064
Xin Yang, Haoying Fu, Hongyuan Zha, Jesse L. Barlow: Semi-supervised nonlinear dimensionality reduction. 1065-1072



Alice X. Zheng, Michael I. Jordan, Ben Liblit, Mayur Naik, Alex Aiken: Statistical debugging: simultaneous identification of multiple bugs. 1105-1112
Fei Zheng, Geoffrey I. Webb: Efficient lazy elimination for averaged one-dependence estimators. 1113-1120



