20th KDD 2014: New York City, USA
- Sofus A. Macskassy, Claudia Perlich, Jure Leskovec, Wei Wang, Rayid Ghani:
The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, New York, NY, USA - August 24 - 27, 2014. ACM 2014, ISBN 978-1-4503-2956-9
Keynote talks
- Oren Etzioni:
The battle for the future of data mining. 1
- Eric Horvitz:
Data, predictions, and decisions in support of people and society. 2
- Eric E. Schadt:
A data driven approach to diagnosing and treating disease. 3
- Sendhil Mullainathan:
Bugbears or legitimate threats?: (social) scientists' criticisms of machine learning? 4
Research session 1: location-based services
- Xuan Song, Quanshi Zhang, Yoshihide Sekimoto, Ryosuke Shibasaki:
Prediction of human emergency behavior and their mobility following large-scale disaster. 5-14
- Yuxiao Dong, Yang Yang, Jie Tang, Yang Yang, Nitesh V. Chawla:
Inferring user demographics and social strategies in mobile social networks. 15-24
- Yilun Wang, Yu Zheng, Yexiang Xue:
Travel time estimation of a path using sparse trajectories. 25-34
- Moshe Lichman, Padhraic Smyth:
Modeling human location data with mixtures of kernel densities. 35-44
- Meng Qu, Hengshu Zhu, Junming Liu, Guannan Liu, Hui Xiong:
A cost-effective recommender system for taxi drivers. 45-54
Research session 2: applications to healthcare and medicine I
- Yubin Park, Joydeep Ghosh:
LUDIA: an aggregate-constrained low-rank reconstruction algorithm to leverage publicly released health data. 55-64
- Subhabrata Mukherjee, Gerhard Weikum, Cristian Danescu-Niculescu-Mizil:
People on drugs: credibility of user statements in health communities. 65-74
- Marzyeh Ghassemi, Tristan Naumann, Finale Doshi-Velez, Nicole Brimmer, Rohit Joshi, Anna Rumshisky, Peter Szolovits:
Unfolding physiological state: mortality modelling in intensive care units. 75-84
- Xiang Wang, David A. Sontag, Fei Wang:
Unsupervised learning of disease progression models. 85-94
- Evangelos E. Papalexakis, Alona Fyshe, Nicholas D. Sidiropoulos, Partha Pratim Talukdar, Tom M. Mitchell, Christos Faloutsos:
Good-enough brain model: challenges, algorithms and discoveries in multi-subject experiments. 95-104
Research session 3: applications to healthcare and medicine II
- Yasuko Matsubara, Yasushi Sakurai, Willem G. van Panhuis, Christos Faloutsos:
FUNNEL: automatic mining of spatially coevolving epidemics. 105-114
- Joyce C. Ho, Joydeep Ghosh, Jimeng Sun:
Marble: high-throughput phenotyping from electronic health records via sparse nonnegative tensor factorization. 115-124
- Chih-Chun Chia, Zeeshan Syed:
Scalable noise mining in long-term electrocardiographic time-series to predict death following heart attacks. 125-134
- Jiayu Zhou, Fei Wang, Jianying Hu, Jieping Ye:
From micro to macro: data driven phenotyping by densification of longitudinal electronic medical records. 135-144
- Fei Wang, Ping Zhang, Buyue Qian, Xiang Wang, Ian Davidson:
Clinical risk prediction with multilinear sparse logistic regression. 145-154
- James C. Ross, Peter J. Castaldi, Michael H. Cho, Jennifer G. Dy:
Dual beta process priors for latent cluster discovery in chronic obstructive pulmonary disease. 155-162
Research session 4: recommender systems
- Quan Yuan, Gao Cong, Chin-Yew Lin:
COM: a generative model for group recommendation. 163-172
- Laurent Charlin, Richard S. Zemel, Hugo Larochelle:
Leveraging user libraries to bootstrap collaborative filtering. 173-182
- Yupeng Gu, Yizhou Sun, Ning Jiang, Bingyu Wang, Ting Chen:
Topic-factorized ideal point estimation model for legislative voting network. 183-192
- Qiming Diao, Minghui Qiu, Chao-Yuan Wu, Alexander J. Smola, Jing Jiang, Chong Wang:
Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS). 193-202
- Mahbub Hasan, Abhijith Kashyap, Vagelis Hristidis, Vassilis J. Tsotras:
User effort minimization through adaptive diversification. 203-212
Research session 5: clustering
- Xiao He, Jing Feng, Bettina Konte, Son T. Mai, Claudia Plant:
Relevant overlapping subspace clusters on categorical data. 213-222
- Murat Dundar, Halid Ziya Yerebakan, Bartek Rajwa:
Batch discovery of recurring rare classes toward identifying anomalous samples. 223-232
- Jianhua Yin, Jianyong Wang:
A dirichlet multinomial mixture model-based approach for short text clustering. 233-242
- Andreas Züfle, Tobias Emrich, Klaus Arthur Schmid, Nikos Mamoulis, Arthur Zimek, Matthias Renz:
Representative clustering of uncertain data. 243-252
- Stephan Günnemann, Ines Färber, Matthias Sebastian Rüdiger, Thomas Seidl:
SMVC: semi-supervised multi-view clustering in subspace projections. 253-262
Research session 6: supervised learning I
- Yashoteja Prabhu, Manik Varma:
FastXML: a fast, accurate and stable tree-classifier for extreme multi-label learning. 263-272
- Shaodan Zhai, Tian Xia, Shaojun Wang:
A multi-class boosting method with direct optimization. 273-282
- Yashu Liu, Jie Wang, Jieping Ye:
An efficient algorithm for weak hierarchical lasso. 283-292
- Doyen Sahoo, Steven C. H. Hoi, Bin Li:
Online multiple kernel regression. 293-302
- Sihong Xie, Jing Gao, Wei Fan, Deepak S. Turaga, Philip S. Yu:
Class-distribution regularized consensus maximization for alleviating overfitting in model combination. 303-312
Research session 7: supervised learning II
- Teng Zhang, Zhi-Hua Zhou:
Large margin distribution machine. 313-322
- Qi Qian, Juhua Hu, Rong Jin, Jian Pei, Shenghuo Zhu:
Distance metric learning using dropout: a structured regularization approach. 323-332
- Siong Thye Goh, Cynthia Rudin:
Box drawings for learning with imbalanced data. 333-342
- Cheng-Hao Tsai, Chieh-Yen Lin, Chih-Jen Lin:
Incremental and decremental training for linear classification. 343-352
- Junbo Zhang, Guangjian Tian, Yadong Mu, Wei Fan:
Supervised deep learning with auxiliary networks. 353-361
Research session 8: trend, anomaly and novelty detection
- Tahereh Babaie, Sanjay Chawla, Romesh G. Abeysuriya:
Sleep analytics and online selective anomaly detection. 362-371
- Qi Rose Yu, Xinran He, Yan Liu:
GLAD: group anomaly detection in social media analysis. 372-381
- Dehua Cheng, Mohammad Taha Bahadori, Yan Liu:
FBLG: a simple and effective approach for temporal dependence discovery from time series data. 382-391
- Josif Grabocka, Nicolas Schilling, Martin Wistuba, Lars Schmidt-Thieme:
Learning time-series shapelets. 392-401
- Mohamed F. Ghalwash, Vladan Radosavljevic, Zoran Obradovic:
Utilizing temporal patterns for estimating uncertainty in interpretable early decision making. 402-411
Research session 9: data streams
- Junming Shao, Zahra Ahmadi, Stefan Kramer:
Prototype-based learning on concept-drifting data streams. 412-421
- Yanwei Yu, Lei Cao, Elke A. Rundensteiner, Qin Wang:
Detecting moving object outliers in massive-scale trajectory streams. 422-431
- Charu C. Aggarwal:
The setwise stream classification problem. 432-441
- Daniel Ting:
Streamed approximate counting of distinct elements: beating optimal batch methods. 442-451
- Andrew S. Lan, Christoph Studer, Richard G. Baraniuk:
Time-varying learning and content analytics via sparse factor analysis. 452-461
Research session 10: active learning
- Dan Kushnir:
Active-transductive learning with label-adapted kernels. 462-471
- Deepak Vasisht, Andreas C. Damianou, Manik Varma, Ashish Kapoor:
Active learning for sparse bayesian multilabel classification. 472-481
- De Wang, Feiping Nie, Heng Huang:
Large-scale adaptive semi-supervised learning via unified inductive and transductive model. 482-491
- Akshay Gadde, Aamir Anis, Antonio Ortega:
Active semi-supervised learning using sampling theory for graph signals. 492-501
- Jialei Wang, Nathan Srebro, James Evans:
Active collaborative permutation learning. 502-511
Research session 11: feature selection
- Xuan Vinh Nguyen, Jeffrey Chan, Simone Romano, James Bailey:
Effective global approaches for mutual information based feature selection. 512-521
- Zhixiang Eddie Xu, Gao Huang, Kilian Q. Weinberger, Alice X. Zheng:
Gradient boosted feature selection. 522-531
- Shuo Xiang, Tao Yang, Jieping Ye:
Simultaneous feature and feature group selection through hard thresholding. 532-541
- Zheng Zhao, Jun Liu, James Cox:
Safe and efficient screening for sparse support vector machine. 542-551
- Sanjay Purushotham, Martin Renqiang Min, C.-C. Jay Kuo, Rachel Ostroff:
Factorized sparse learning models with interpretable high order feature interactions. 552-561
Research session 12: statistical techniques for big data
- Dehua Cheng, Yan Liu:
Parallel gibbs sampling for hierarchical dirichlet processes via gamma processes equivalence. 562-571
- Tamraparni Dasu, Ji Meng Loh, Divesh Srivastava:
Empirical glitch explanations. 572-581
- Hongxia Yang, Jingrui He:
Learning with dual heterogeneity: a nonparametric bayes model. 582-590
- Chien-Liang Liu, Tsung-Hsun Tsai, Chia-Hoang Lee:
Online chinese restaurant process. 591-600
- Xin Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Ni Lao, Kevin Murphy, Thomas Strohmann, Shaohua Sun, Wei Zhang:
Knowledge vault: a web-scale approach to probabilistic knowledge fusion. 601-610
Research session 13: scaling-up methods for big data
- Shusen Wang, Chao Zhang, Hui Qian, Zhihua Zhang:
Improving the modified nyström method using spectral shifting. 611-620
- Wenlin Chen, Yixin Chen, Kilian Q. Weinberger:
Fast flux discriminant for large-scale sparse nonlinear classification. 621-630
- Mingwang Tang, Feifei Li:
Scalable histograms on large probabilistic data. 631-640
- Flavio Chierichetti, Nilesh N. Dalvi, Ravi Kumar:
Correlation clustering in MapReduce. 641-650
- Christos Anagnostopoulos, Peter Triantafillou:
Scaling out big data missing value imputations: pythia vs. godzilla. 651-660
Research session 14: large-scale optimization and learning
- Mu Li, Tong Zhang, Yuqiang Chen, Alexander J. Smola:
Efficient mini-batch training for stochastic optimization. 661-670
- Ashwinkumar Badanidiyuru, Baharan Mirzasoleiman, Amin Karbasi, Andreas Krause:
Streaming submodular maximization: massive data summarization on the fly. 671-680
- Edith Cohen:
Distance queries from sampled data: accurate and efficient. 681-690
- Yi Li, Zhengyu Wang, David P. Woodruff:
Improved testing of low rank matrices. 691-700
- Bryan Perozzi, Rami Al-Rfou, Steven Skiena:
DeepWalk: online learning of social representations. 701-710
Research session 15: web mining
- Sunita Sarawagi, Soumen Chakrabarti:
Open-domain quantity queries on web tables: annotation, response, and consensus models. 711-720
- Bin Wu, Erheng Zhong, Ben Tan, Andrew Horner, Qiang Yang:
Crowdsourced time-sync video tagging using temporal and personalized topic modeling. 721-730
- Liangda Li, Hongbo Deng, Anlei Dong, Yi Chang, Hongyuan Zha:
Identifying and labeling search tasks via query-based hawkes processes. 731-740
- Oleksandr Polozov, Sumit Gulwani:
LaSEWeb: automating search strategies over semi-structured web data. 741-750
- Shangsong Liang, Zhaochun Ren, Maarten de Rijke:
Personalized search result diversification via structured learning. 751-760
Research session 16: transfer learning
- Pinghua Gong, Jiayu Zhou, Wei Fan, Jieping Ye:
Efficient multi-task feature learning with calibration. 761-770
- Tianyi Zhou, Dacheng Tao:
Multi-task copula by sparse graph regression. 771-780
- Mianwei Zhou, Kevin Chen-Chuan Chang:
Unifying learning to rank and domain adaptation: enabling cross-task document scoring. 781-790
- Ying Wei, Yangqiu Song, Yi Zhen, Bo Liu, Qiang Yang:
Scalable heterogeneous translated hashing. 791-800
- Chung-Yi Li, Shou-De Lin:
Matching users and items across domains to improve the recommendation quality. 801-810
Research session 17: recommendations and ratings
- Wei Lu, Stratis Ioannidis, Smriti Bhagat, Laks V. S. Lakshmanan:
Optimal recommendations under attraction, aversion, and social influence. 811-820
- Xiang Ren, Jialu Liu, Xiao Yu, Urvashi Khandelwal, Quanquan Gu, Lidan Wang, Jiawei Han:
ClusCite: effective citation recommendation by information network-based clustering. 821-830
- Defu Lian, Cong Zhao, Xing Xie, Guangzhong Sun, Enhong Chen, Yong Rui:
GeoMF: joint geographical modeling and matrix factorization for point-of-interest recommendation. 831-840
- Stephan Günnemann, Nikou Günnemann, Christos Faloutsos:
Detecting anomalies in dynamic rating data: a robust probabilistic model for rating evolution. 841-850
- Silei Xu, John Chi-Shing Lui:
Product selection problem: improve market share by learning consumer behavior. 851-860
Research session 18: topic modeling
- Yongxin Tong, Caleb Chen Cao, Lei Chen:
TCS: efficient topic discovery over crowd-oriented service data. 861-870
- Erich Schubert, Michael Weiler, Hans-Peter Kriegel:
SigniTrend: scalable detection of emerging topics in textual streams by hashed significance thresholds. 871-880
- Wray L. Buntine, Swapnil Mishra:
Experiments with non-parametric topic models. 881-890
- Aaron Q. Li, Amr Ahmed, Sujith Ravi, Alexander J. Smola:
Reducing the sampling complexity of topic models. 891-900
- Mikalai Tsytsarau, Themis Palpanas, Malú Castellanos:
Dynamics of news events and social media reaction. 901-910
Research session 19: security and privacy
- Qian Xiao, Rui Chen, Kian-Lee Tan:
Differentially private network data release via structural inference. 911-920
- Wentian Lu, Gerome Miklau:
Exponential random graph estimation under differential privacy. 921-930
- Jaewoo Lee, Christopher W. Clifton:
Top-k frequent itemsets via differentially private FP-trees. 931-940
- Meng Jiang, Peng Cui, Alex Beutel, Christos Faloutsos, Shiqiang Yang:
CatchSync: catching synchronized behavior in large directed graphs. 941-950
- Hengshu Zhu, Hui Xiong, Yong Ge, Enhong Chen:
Mobile app recommendations with security and privacy awareness. 951-960
Research session 20: dimensionality reduction
- Xiaomin Fang, Rong Pan:
Fast DTT: a near linear algorithm for decomposing a tensor into factor tensors. 967-976
- Feiping Nie, Xiaoqian Wang, Heng Huang:
Clustering and projected clustering with adaptive neighbors. 977-986
- Xilun Chen, K. Selçuk Candan:
LWI-SVD: low-rank, windowed, incremental singular value decompositions on time-evolving data sets. 987-996
- Dimitris S. Papailiopoulos, Anastasios Kyrillidis, Christos Boutsidis:
Provable deterministic leverage score sampling. 997-1006
- Tuan M. V. Le, Hady Wirawan Lauw:
Semantic visualization for spherical representation. 1007-1016
Research session 21: novel applications
- Rakesh Agrawal, Behzad Golshan, Evimaria Terzi:
Grouping students in educational settings. 1017-1026
- Jingbo Shang, Yu Zheng, Wenzhu Tong, Eric Chang, Yong Yu:
Inferring gas consumption and pollution emission of vehicles throughout a city. 1027-1036
- Karthik Raman, Thorsten Joachims:
Methods for ordinal peer grading. 1037-1046
- Yanjie Fu, Hui Xiong, Yong Ge, Zijun Yao, Yu Zheng, Zhi-Hua Zhou:
Exploiting geographic dependencies for real estate appraisal: a mutual perspective of ranking and clustering. 1047-1056
- Bo Zong, Yinghui Wu, Jie Song, Ambuj K. Singh, Hasan Çam, Jiawei Han, Xifeng Yan:
Towards scalable critical alert mining. 1057-1066
Research session 22: crowds and markets
- Caleb Chen Cao, Lei Chen, H. V. Jagadish:
From labor to trader: opinion elicitation via online crowds as a market. 1067-1076
- Weinan Zhang, Shuai Yuan, Jun Wang:
Optimal real-time bidding for display advertising. 1077-1086
- Ting Wang, Dashun Wang, Fei Wang:
Quantifying herding effects in crowd wisdom. 1087-1096
- Olivier Chapelle:
Modeling delayed feedback in display advertising. 1097-1105
- Meng Fang, Dacheng Tao:
Networked bandits with disjoint linear payoffs. 1106-1115
Research session 23: text mining
- Zhiyuan Chen, Bing Liu:
Mining topics in documents: standing on the shoulders of big data. 1116-1125
- Zhe Chen, Michael J. Cafarella:
Integrating spreadsheet data via accurate and low-effort extraction. 1126-1135
- Moritz Sudhof, Andrés Goméz Emilsson, Andrew L. Maas, Christopher Potts:
Sentiment expression conditioned by affective transitions and social forces. 1136-1145
- Furong Li, Mong-Li Lee, Wynne Hsu:
Entity profiling with varying source reliabilities. 1146-1155
- Anthony Fader, Luke Zettlemoyer, Oren Etzioni:
Open question answering over curated and extracted knowledge bases. 1156-1165
Research session 24: dynamic graph analysis
- Feng Chen, Daniel B. Neill:
Non-parametric scan statistics for event detection and forecasting in heterogeneous social media graphs. 1166-1175