20. CIKM 2011: Glasgow, United Kingdom
Craig Macdonald, Iadh Ounis, Ian Ruthven (Eds.): Proceedings of the 20th ACM Conference on Information and Knowledge Management, CIKM 2011, Glasgow, United Kingdom, October 24-28, 2011. ACM 2011 ISBN 978-1-4503-0717-8
Keynote addresses
David R. Karger: Creating user interfaces that entice people to manage better information. 1-2
Justin Zobel: Data, health, and algorithmics: computational challenges for biomedicine. 3-4
Maurizio Lenzerini: Ontology-based data management. 5-6
Retrieval models

Jae Hyun Park, W. Bruce Croft, David A. Smith: A quasi-synchronous dependence model for information retrieval. 17-26
Maryam Karimzadehgan, ChengXiang Zhai: Improving retrieval accuracy of difficult queries through generalizing negative document language models. 27-36
Steffen Metzger, Shady Elbassuoni, Katja Hose, Ralf Schenkel: S3K: seeking statement-supporting top-K witnesses. 37-46
Xitong Liu, Hui Fang, Conglei Yao, Min Wang: Finding relevant information of certain types from enterprise data. 47-56
Techniques for the Web
Yuchen Liu, Xiaochuan Ni, Jian-Tao Sun, Zheng Chen: Unsupervised transactional query classification based on webpage form understanding. 57-66
Roi Blanco, Berkant Barla Cambazoglu, Flavio Paiva Junqueira, Ivan Kelly, Vincent Leroy: Assigning documents to master sites in distributed search. 67-76
Xiao Bai, Berkant Barla Cambazoglu, Flavio Paiva Junqueira: Discovering URLs through user feedback. 77-86

Kai Hui, Ben He, Tiejian Luo, Bin Wang: Relevance weighting using within-document term statistics. 99-104
Exploiting query logs
Umut Ozertem, Emre Velipasaoglu, Larry Lai: Suggestion set utility maximization using session logs. 105-114
Minmin Chen, Jian-Tao Sun, Xiaochuan Ni, Yixin Chen: Improving context-aware query classification via adaptive self-training. 115-124
Ahmed Hassan, Yang Song, Li-wei He: A task level metric for measuring web search satisfaction and its application on improving relevance estimation. 125-134
Jianwei Cui, Hongyan Liu, Jun Yan, Lei Ji, Ruoming Jin, Jun He, Yingqin Gu, Zheng Chen, Xiaoyong Du: Multi-view random walk framework for search task discovery from click-through log. 135-140

Sparse data and difficult queries
Xing Yi, James Allan: Discovering missing click-through query language information for web search. 153-162

Bilyana Taneva, Mouna Kacimi, Gerhard Weikum: Finding images of difficult entities in the long tail. 189-194
Giorgos Giannopoulos, Ulf Brefeld, Theodore Dalamagas, Timos K. Sellis: Learning to rank user intent. 195-200
Type and structure
Jaime Arguello, Fernando Diaz, Jamie Callan: Learning to aggregate vertical results into web search results. 201-210



Dustin Lange, Felix Naumann: Frequency-aware similarity measures: why Arnold Schwarzenegger is always a duplicate. 243-248
Machine learning for information retrieval
Katja Hofmann, Shimon Whiteson, Maarten de Rijke: A probabilistic method for inferring preferences from clicks. 249-258
Martin Szummer, Emine Yilmaz: Semi-supervised learning to rank with preference regularization. 269-278
Hua Wang, Heng Huang, Chris H. Q. Ding: Simultaneous clustering of multi-type relational data via symmetric nonnegative matrix tri-factorization. 279-284
Guangxia Li, Kuiyu Chang, Steven C. H. Hoi, Wenting Liu, Ramesh Jain: Collaborative online learning of user generated content. 285-290
Karthik Raman, Thorsten Joachims, Pannaga Shivaswamy: Structured learning of two-level dynamic rankings. 291-296
Information retrieval implementation techniques

Marcus Fontoura, Maxim Gurevich, Vanja Josifovski, Sergei Vassilvitskii: Efficiently encoding term co-occurrences in inverted indexes. 307-316
Alexander A. Stepanov, Anil R. Gangolli, Daniel E. Rose, Ryan J. Ernst, Paramjit S. Oberoi: SIMD-based decoding of posting lists. 317-326
George Beskales, Marcus Fontoura, Maxim Gurevich, Sergei Vassilvitskii, Vanja Josifovski: Factorization-based lossless compression of inverted indices. 327-332
Roger B. Bradford: Implementation techniques for large-scale latent semantic indexing applications. 339-344
Language technology and information retrieval
Nico Schlaefer, Jennifer Chu-Carroll, Eric Nyberg, James Fan, Wlodek Zadrozny, David A. Ferrucci: Statistical source expansion for question answering. 345-354
Jeffrey Dalton, James Allan, David A. Smith: Passage retrieval for incorporating global evidence in sequence labeling. 355-364
Jose M. Chenlo, David E. Losada: Effective and efficient polarity estimation in blogs based on sentence-level evidence. 365-374
Dmitriy Bespalov, Bing Bai, Yanjun Qi, Ali Shokoufandeh: Sentiment classification based on supervised latent n-gram analysis. 375-382
Qiang Lu, Jack G. Conrad, Khalid Al-Kofahi, William Keenan: Legal document clustering with built-in topic segmentation. 383-392
Results in context

Kevyn Collins-Thompson, Paul N. Bennett, Ryen W. White, Sebastian de la Chica, David Sontag: Personalizing web search results by reading level. 403-412
Dimitrios Lymberopoulos, Peixiang Zhao, Arnd Christian König, Klaus Berberich, Jie Liu: Location-aware click prediction in mobile local search. 413-422
Maria Christoforaki, Jinru He, Constantinos Dimopoulos, Alexander Markowetz, Torsten Suel: Text vs. space: efficient geo-search query processing. 423-432
Algorithms
Georgia Koloniari, Nikos Ntarmos, Evaggelia Pitoura, Dimitris Souravlias: One is enough: distributed filtering for duplicate elimination. 433-442
Chen Chen, Guoren Wang, Huilin Liu, Junchang Xin, Ye Yuan: SISP: a new framework for searching the informative subgraph based on PSO. 453-462
Francisco Claude, Antonio Fariña, Miguel A. Martínez-Prieto, Gonzalo Navarro: Indexes for highly repetitive document collections. 463-468
Ismet Zeki Yalniz, Ethem F. Can, R. Manmatha: Partial duplicate detection for large book collections. 469-474
Image retrieval
Faidon Loumakis, Simone Stumpf, David Grayson: This image smells good: effects of image information scent in search engine results pages. 475-484
Songhua Xu, Hao Jiang, Francis Chi-Moon Lau: Retrieving and ranking unannotated images through collaboratively mining online search results. 485-494
George Teodoro, Eduardo Valle, Nathan Mariano, Ricardo da Silva Torres, Wagner Meira Jr.: Adaptive parallel approximate similarity search for responsive multimedia retrieval. 495-504
Min-Hee Jang, Sang-Wook Kim, Christos Faloutsos, Sunju Park: A linear-time approximation of the earth mover's distance. 505-514
Social media
Arlind Kopliku, Mohand Boughanem, Karen Pinel-Sauvagnat: Towards a framework for attribute retrieval. 515-524
Truls Amundsen Bjørklund, Michaela Götz, Johannes Gehrke, Nils Grimsmo: Workload-aware indexing for keyword search in social networks. 535-544
Giovanni Quattrone, Licia Capra, Pasquale De Meo, Emilio Ferrara, Domenico Ursino: Effective retrieval of resources in folksonomies using a new tag similarity measure. 545-550
Kyumin Lee, James Caverlee, Zhiyuan Cheng, Daniel Z. Sui: Content-driven detection of campaigns in social media. 551-556
Peng Li, Bin Wang, Wei Jin, Jian-Yun Nie, Zhiwei Shi, Ben He: Exploring categorization property of social annotations for information retrieval. 557-562
Personalization and advertising
Di Jiang, Kenneth Wai-Ting Leung, Wilfred Ng: Context-aware search personalization with concept preference. 563-572
David C. Anastasiu, Byron J. Gao, David Buttler: A framework for personalized and collaborative clustering of search results. 573-582
Lidong Bing, Wai Lam, Tak-Lam Wong: Using query log and social tagging to refine queries based on latent topics. 583-592
Sarah K. Tyler, Sandeep Pandey, Evgeniy Gabrilovich, Vanja Josifovski: Retrieval models for audience selection in display advertising. 593-598
Lei Wang, Mingjiang Ye, Yu Zou: A language model approach to capture commercial intent and information relevance for sponsored search. 599-604
Jian Tang, Ning Liu, Jun Yan, Yelong Shen, Shaodan Guo, Bin Gao, Shuicheng Yan, Ming Zhang: Learning to rank audience for behavioral targeting in display ads. 605-610
Evaluation and analysis
Ben Carterette, Evangelos Kanoulas, Emine Yilmaz: Simulating simple user behavior for system effectiveness evaluation. 611-620
Tetsuya Sakai, Makoto P. Kato, Young-In Song: Click the search button and be happy: evaluating direct and immediate information access. 621-630
Mehdi Hosseini, Ingemar J. Cox, Natasa Milic-Frayling, Trevor Sweeting, Vishwa Vinay: Prioritizing relevance judgments to improve the construction of IR test collections. 641-646
Jinyoung Kim, W. Bruce Croft, David A. Smith, Anton Bakalov: Evaluating an associative browsing model for personal information. 647-652
Classification and evaluation
Sathiya Keerthi Selvaraj, Bigyan Bhar, Sundararajan Sellamanickam, Shirish Krishnaj Shevade: Semi-supervised SVMs for classification with unknown class proportions and a small labeled dataset. 653-662
Sundararajan Sellamanickam, Priyanka Garg, Sathiya Keerthi Selvaraj: A pairwise ranking based approach to learning with positive and unlabeled examples. 663-672
Deguang Kong, Chris H. Q. Ding, Heng Huang: Robust nonnegative matrix factorization using L21-norm. 673-682
Ye Xu, Furao Shen, Wei Ping, Jinxi Zhao: TAKES: a fast method to select features in the kernel space. 683-692
Bhanukiran Vinzamuri, Kamalakar Karlapalem: Designing an ensemble classifier over subspace classifiers using iterative convergence routine. 693-698
Information filtering
Morgan Harvey, Mark James Carman, Ian Ruthven, Fabio Crestani: Bayesian latent variable models for collaborative item rating prediction. 699-708

Shinjae Yoo, Yiming Yang, Jaime G. Carbonell: Modeling personalized email prioritization: classification-based and regression-based approaches. 729-738
Rubi Boim, Tova Milo, Slava Novgorodov: Diversification and refinement in collaborative filtering recommender. 739-744
Topics and events
Shiva Prasad Kasiviswanathan, Prem Melville, Arindam Banerjee, Vikas Sindhwani: Emerging topic detection using dictionary learning. 745-754
Luciano Barbosa, Srinivas Bangalore: Focusing on novelty: a crawling strategy to build diverse language models. 755-764
Ou Jin, Nathan Nan Liu, Kai Zhao, Yong Yu, Qiang Yang: Transferring topical knowledge from auxiliary long texts for short text clustering. 775-784
Liang Tang, Tao Li, Chang-Shing Perng: LogSig: generating system events from raw textual logs. 785-794
Temporal, stream and spatial information
Ying-Ju Chen, Kun-Ta Chuang, Ming-Syan Chen: Coupling or decoupling for KNN search on road networks?: a hybrid framework on user query patterns. 795-804
Zhiyuan Cheng, James Caverlee, Krishna Yeswanth Kamath, Kyumin Lee: Toward traffic-driven location-based web search. 805-814
Di Yang, Zhenyu Guo, Elke A. Rundensteiner, Matthew O. Ward: CLUES: a unified framework supporting interactive exploration of density-based clusters in streams. 815-824
Xiangjun Dong, Zhigang Zheng, Longbing Cao, Yanchang Zhao, Chengqi Zhang, Jinjiu Li, Wei Wei, Yuming Ou: e-NSP: efficient negative sequential pattern mining based on identified positive patterns without database rescanning. 825-830
Text mining
Yafang Wang, Bin Yang, Lizhen Qu, Marc Spaniol, Gerhard Weikum: Harvesting facts from textual web sources by constrained label propagation. 837-846
Xiaofeng Yu, Irwin King, Michael R. Lyu: Towards a top-down and bottom-up bidirectional approach to joint information extraction. 847-856
Jie Liu, Jimeng Chen, Yi Zhang, Yalou Huang: Learning conditional random fields with latent sparse features for acronym expansion finding. 867-872
Dongwoo Kim, Alice H. Oh: Accounting for data dependencies within a hierarchical Dirichlet process mixture model. 873-878
Zhaochun Ren, Jun Ma, Shuaiqiang Wang, Yang Liu: Summarizing web forum threads based on a latent topic propagation process. 879-884
Privacy
Muzammil M. Baig, Jiuyong Li, Jixue Liu, Hua Wang: Cloning for privacy protection in multiple independent data publications. 885-894
Nikos Pelekis, Aris Gkoulalas-Divanis, Marios Vodas, Despina Kopanaki, Yannis Theodoridis: Privacy-aware querying over sensitive trajectory data. 895-904
Yuzhe Tang, Ting Wang, Ling Liu, Shicong Meng, Balaji Palanisamy: Privacy preserving indexing for eHealth information networks. 905-914
Jyh-Ren Shieh, Ching-Yung Lin, Ja-Ling Wu: Recommendation in the end-to-end encrypted domain. 915-924
Chih-Ming Hsu, Ming-Syan Chen: Privacy preservation by independent component analysis and variance control. 925-930
Unsupervised and semi-supervised learning
Haiqin Yang, Shenghuo Zhu, Irwin King, Michael R. Lyu: Can irrelevant data help semi-supervised learning, why and how? 937-946
Paramveer S. Dhillon, Sundararajan Sellamanickam, Sathiya Keerthi Selvaraj: Semi-supervised multi-task learning of structured prediction models for web information extraction. 957-966
Niwan Wattanakitrungroj, Chidchanok Lursinsap: Memory-less unsupervised clustering for data streaming by versatile ellipsoidal function. 967-972
Can Wang, Longbing Cao, Mingchun Wang, Jinjiu Li, Wei Wei, Yuming Ou: Coupled nominal similarity in unsupervised learning. 973-978
Huawen Liu, Xindong Wu, Shichao Zhang: Feature selection using hierarchical feature clustering. 979-984
Social networks and communities
Mehdi Kargar, Aijun An: Discovering top-k teams of experts with/without a leader in social networks. 985-994
Hongliang Fei, Ruoyi Jiang, Yuhao Yang, Bo Luo, Jun Huan: Content based social behavior prediction: a multi-task learning approach. 995-1000
Hui Li, Sourav S. Bhowmick, Aixin Sun: CASINO: towards conformity-aware social influence analysis in online social networks. 1007-1012
David Lo, Didi Surian, Kuan Zhang, Ee-Peng Lim: Mining direct antagonistic communities in explicit trust networks. 1013-1018
Xufei Wang, Huan Liu, Wei Fan: Connecting users with similar interests via tag network inference. 1019-1024
Barbara Poblete, Ruth Garcia, Marcelo Mendoza, Alejandro Jaimes: Do all birds tweet the same?: characterizing Twitter around the world. 1025-1030
Sentiments and other perspectives
Xiaolong Wang, Furu Wei, Xiaohua Liu, Ming Zhou, Ming Zhang: Topic sentiment analysis in Twitter: a graph-based hashtag sentiment classification approach. 1031-1040
Zheng Lin, Songbo Tan, Xueqi Cheng: Language-independent sentiment classification using three common words. 1041-1046
Sheng Gao, Haizhou Li: A cross-domain adaptation method for sentiment classification using probabilistic latent analysis. 1047-1052
Albert Weichselbraun, Stefan Gindl, Arno Scharl: Using games with a purpose and bootstrapping to create domain-specific sentiment lexicons. 1053-1060
Bas Heerschop, Frank Goossen, Alexander Hogenboom, Flavius Frasincar, Uzay Kaymak, Franciska de Jong: Polarity analysis of texts using discourse structure. 1061-1070
Maria Soledad Pera, Rani Qumsiyeh, Yiu-Kai Ng: A query-based multi-document sentiment summarizer. 1071-1076
Classification and clustering: large-scale statistical techniques
Emmanuel Müller, Ira Assent, Stephan Günnemann, Thomas Seidl: Scalable density-based subspace clustering. 1077-1086
Yi Xu, Zhongfei Zhang, Philip S. Yu, Bo Long: Pattern change discovery between high dimensional data sets. 1097-1106
Avani Shastri, Di Yang, Elke A. Rundensteiner, Matthew O. Ward: MTopS: scalable processing of continuous top-k multi-query workloads. 1107-1116
Link prediction

John E. Hopcroft, Tiancheng Lou, Jie Tang: Who will follow you back?: reciprocal relationship prediction. 1137-1146
Rong-Hua Li, Jeffrey Xu Yu, Jianquan Liu: Link prediction: the power of maximal entropy random walk. 1147-1156
Kai-Yang Chiang, Nagarajan Natarajan, Ambuj Tewari, Inderjit S. Dhillon: Exploiting longer cycles for link prediction in signed networks. 1157-1162
Dawei Yin, Liangjie Hong, Brian D. Davison: Structural link analysis and prediction in microblogs. 1163-1168
Sheng Gao, Ludovic Denoyer, Patrick Gallinari: Temporal link prediction by integrating content and structure information. 1169-1174
Link, graph and relation mining



Michael Davis, Weiru Liu, Paul Miller, George Redpath: Detecting anomalies in graphs with numeric labels. 1197-1202
Ching-man Au Yeung, Tomoharu Iwata: Extracting multi-dimensional relations: a generative model of groups of entities in a corpus. 1203-1208
Yann Jacob, Ludovic Denoyer, Patrick Gallinari: Classification and annotation in social corpora using multiple relations. 1215-1220
Science, the past, and the future
Efstathios Stamatatos: Plagiarism detection based on structural information. 1221-1230
Ching-man Au Yeung, Adam Jatowt: Studying how the past is remembered: towards computational history through large scale text mining. 1231-1240
Ya-nan Qian, Yunhua Hu, Jianling Cui, Qinghua Zheng, Zaiqing Nie: Combining machine learning and human judgment in author disambiguation. 1241-1246
Rui Yan, Jie Tang, Xiaobing Liu, Dongdong Shan, Xiaoming Li: Citation count prediction: learning to estimate future citations for literature. 1247-1252
Anja Bachmann, Rene Schult, Matthias Lange, Myra Spiliopoulou: Extracting cross references from life science databases for search result ranking. 1253-1258
Adam Jatowt, Ching-man Au Yeung: Extracting collective expectations about the future from large text collections. 1259-1264
Information extraction and entities
Lidong Bing, Wai Lam, Yuan Gu: Towards a unified solution: data record region detection and segmentation. 1265-1274
Claudio Schifanella, K. Selçuk Candan, Maria Luisa Sapino: Fast metadata-driven multiresolution tensor decomposition. 1275-1284
Falk Brauer, Robert Rieger, Adrian Mocan, Wojciech M. Barczynski: Enabling information extraction by inference of regular expressions from sample entities. 1285-1294
Jinhan Kim, Long Jiang, Seung-won Hwang, Young-In Song, Ming Zhou: Mining entity translations from comparable corpora: a holistic graph mapping approach. 1295-1304
Bin Zhao, Xiaoxin Yin, Eric P. Xing: Max margin learning on domain-independent web information extraction. 1305-1310
Queries, questions and tags mining
Zhicheng Dou, Sha Hu, Yulong Luo, Ruihua Song, Ji-Rong Wen: Finding dimensions for queries. 1311-1320
Li Cai, Guangyou Zhou, Kang Liu, Jun Zhao: Large-scale question classification in cQA by leveraging Wikipedia semantic knowledge. 1321-1330
Yang Song, Baojun Qiu, Umer Farooq: Hierarchical tag visualization and application for tag recommendations. 1331-1340
Xin Chen, Xiaohua Hu, Yuan An, Zunyan Xiong, Tingting He, E. K. Park: Perspective hierarchical Dirichlet process for user-tagged image modeling. 1341-1346
Marius Pasca: Asking what no one has asked before: using phrase similarities to generate synthetic web search queries. 1347-1352
Preparing, mining and evaluating with and for different views
Pradipto Das, Rohini K. Srihari, Yun Fu: Simultaneous joint and conditional modeling of documents tagged from two perspectives. 1353-1362
Stephan Günnemann, Ines Färber, Emmanuel Müller, Ira Assent, Thomas Seidl: External evaluation measures for subspace clustering. 1363-1372
Luca Maria Aiello, Debora Donato, Umut Ozertem, Filippo Menczer: Behavior-driven clustering of queries into topics. 1373-1382
Ullas Nambiar, Tanveer A. Faruquie, L. Venkata Subramaniam, Sumit Negi, Ganesh Ramakrishnan: Discovering customer intent in real-time for streamlining service desk conversations. 1383-1388
Xinquan Qu, Xinlei Chen: Sparse structured probabilistic projections for factorized latent spaces. 1389-1394
Information extraction and semantic techniques
Weiwei Cheng, Gjergji Kasneci, Thore Graepel, David H. Stern, Ralf Herbrich: Automated feature generation from structured knowledge. 1395-1404
Wei Wang, Romaric Besançon, Olivier Ferret, Brigitte Grau: Filtering and clustering relations for unsupervised information extraction in open domain. 1405-1414
Yunyao Li, Vivian Chu, Sebastian Blohm, Huaiyu Zhu, Howard Ho: Facilitating pattern discovery for relation extraction with semantic-signature-based clustering. 1415-1424
Gang Wu, Guilin Qi, Jianfeng Du: Finding all justifications of OWL entailments using TMS and MapReduce. 1425-1434
Data on the web

Andreas Brodt, Oliver Schiller, Bernhard Mitschang: Efficient resource attribute retrieval in RDF triple stores. 1445-1454
Fan Wang, Gagan Agrawal: Effective stratification for low selectivity queries on deep web data sources. 1455-1464
Lijun Chang, Jeffrey Xu Yu, Lu Qin, Yuanyuan Zhu, Haixun Wang: Finding information nebula over large networks. 1465-1474
Da Yan, Raymond Chi-Wing Wong, Wilfred Ng: Efficient methods for finding influential locations with adaptive grids. 1475-1484
Query processing and optimization

Quoc Trung Tran, Chee-Yong Chan, Guoping Wang: Evaluation of set-based queries with aggregation constraints. 1495-1504
Günter Ladwig, Thanh Tran: Index structures and top-k join algorithms for native keyword search databases. 1505-1514
Shenoda Guirguis, Mohamed A. Sharaf, Panos K. Chrysanthis, Alexandros Labrinidis: Optimized processing of multiple aggregate continuous queries. 1515-1524
Jesús Manuel Almendros-Jiménez, Josep Silva, Salvador Tamarit: XQuery optimization based on program slicing. 1525-1534
Semantic web and information retrieval
Zhixu Li, Laurianne Sitbon, Xiaofang Zhou: Learning-based relevance feedback for web-based relation completion. 1535-1540
Rafael S. Gonçalves, Bijan Parsia, Ulrike Sattler: Categorising logical differences between OWL ontologies. 1541-1546
Marina Drosou, Evaggelia Pitoura: ReDRIVE: result-driven database exploration through recommendations. 1547-1552
Tangjian Deng, Liang Zhao, Ling Feng, Wenwei Xue: Information re-finding by context: a brain memory inspired approach. 1553-1558
Roberto De Virgilio, Giorgio Orsi, Letizia Tanca, Riccardo Torlone: Semantic data markets: a flexible environment for knowledge management. 1559-1564
Yueguo Chen, Wei Wang, Xiaoyong Du, Xiaofang Zhou: Continuously monitoring the correlations of massive discrete streams. 1571-1576
Query answering and social search
Felipe da C. Hummel, Altigran Soares da Silva, Mirella M. Moro, Alberto H. F. Laender: Multiple keyword-based queries over XML streams. 1577-1582
Chunyang Ma, Yongluan Zhou, Lidan Shou, Dan Dai, Gang Chen: Matching query processing in high-dimensional space. 1589-1594
Kun Xu, Lei Zou, Jeffrey Xu Yu, Lei Chen, Yanghua Xiao, Dongyan Zhao: Answering label-constraint reachability in large graphs. 1595-1600
Silvia Rota, Sonia Bergamaschi, Francesco Guerra: The list Viterbi training algorithm and its application to keyword search over databases. 1601-1606
Cheng-Te Li, Man-Kwan Shan, Shou-De Lin: Context-based people search in labeled social networks. 1607-1612
Carlos R. Rivero, Inma Hernández, David Ruiz, Rafael Corchuelo: On benchmarking data translation systems for semantic-web ontologies. 1613-1618
Distributed data management and data integration
Chun Kit Chui, Ben Kao, Eric Lo, Reynold Cheng: I/O-efficient algorithms for answering pattern-based aggregate queries in a sequence OLAP system. 1619-1628

Siarhei Bykau, John Mylopoulos, Flavio Rizzolo, Yannis Velegrakis: Supporting queries spanning across phases of evolving artifacts using Steiner forests. 1649-1658
Robert Ikeda, Semih Salihoglu, Jennifer Widom: Provenance-based refresh in data-oriented workflows. 1659-1668
Keyword search and ranked queries
Veli Bicer, Thanh Tran, Radoslav Nedkov: Ranking support for keyword search on structured data using relevance models. 1669-1678
Dustin Lange, Felix Naumann: Efficient similarity search: arbitrary similarity measures, arbitrary composition. 1679-1688
Xueyao Liang, Min Xie, Laks V. S. Lakshmanan: Adding structure to top-k: from items to expansions. 1699-1708
Bo Zhao, Cindy Xide Lin, Bolin Ding, Jiawei Han: TEXplorer: keyword-based object search and exploration in multidimensional text databases. 1709-1718
Data cleaning and analysis

Lingli Li, Jianzhong Li, Hongzhi Wang, Hong Gao: Context-based entity description rule for entity resolution. 1725-1730
Xiang Lian, Yincheng Lin, Lei Chen: Cost-efficient repair in inconsistent probabilistic databases. 1731-1736
Mijung Kim, Kasim Selçuk Candan: Approximate tensor decomposition within a tensor-relational algebraic framework. 1737-1742
Roberto De Virgilio, Franco Milicchio: RFID data analysis using tensor calculus for supply chain management. 1743-1748
Vu Hung, Boualem Benatallah, Régis Saint-Paul: Spreadsheet-based complex data transformation. 1749-1754
Graph management and queries
Yuanyuan Zhu, Lu Qin, Jeffrey Xu Yu, Yiping Ke, Xuemin Lin: High efficiency and quality: large graphs matching. 1755-1764
Huiping Cao, K. Selçuk Candan, Maria Luisa Sapino: Skynets: searching for minimum trees in graphs with incomparable edge weights. 1775-1784
Konstantin Tretyakov, Abel Armas-Cervantes, Luciano García-Bañuelos, Jaak Vilo, Marlon Dumas: Fast fully dynamic landmark-based estimation of shortest path distances in very large graphs. 1785-1794
Social, search, and other behaviour
Sandeep Pandey, Mohamed Aly, Abraham Bagherjeiran, Andrew O. Hatch, Peter Ciccolo, Adwait Ratnaparkhi, Martin Zinkevich: Learning to target: what works for behavioral targeting. 1805-1814
Bastian Karweg, Christian Hütter, Klemens Böhm: Evolving social search based on bookmarks and status messages from social networks. 1825-1834
Victor Hu, Maria Stone, Jan Pedersen, Ryen W. White: Effects of search success on search engine re-use. 1841-1846
Applications in different areas
Rakesh Agrawal, Sreenivas Gollapudi, Anitha Kannan, Krishnaram Kenthapadi: Enriching textbooks with images. 1847-1856
Hassan H. Malik, Ian MacGillivray, Måns Olof-Ors, Siming Sun, Shailesh Saroha: Exploring the corporate ecosystem with a semi-supervised entity graph. 1857-1866
Jiyin He, Maarten de Rijke, Merlijn Sevenster, Rob C. van Ommering, Yuechen Qian: Generating links to background knowledge: a case study using narrative radiology reports. 1867-1876
David Martinez, Yue Li: Information extraction from pathology reports in a hospital setting. 1877-1882
Yingqin Gu, Jun Yan, Hongyan Liu, Jun He, Lei Ji, Ning Liu, Zheng Chen: Extract knowledge from semi-structured websites for search task simplification. 1883-1888
Debapriyo Majumdar, Rose Catherine, Shajith Ikbal, Karthik Visweswariah: Privacy protected knowledge management in services with emphasis on quality data. 1889-1894
Poster session: information retrieval
Wei Zheng, Hui Fang, Conglei Yao, Min Wang: Search result diversification for enterprise data. 1901-1904
Alessandro Bozzon, Marco Brambilla, Piero Fraternali, Marco Tagliasacchi: Diversification for multi-domain result sets. 1905-1908
Raynor Vliegendhart, Martha Larson, Christoph Kofler, Johan A. Pouwelse: A peer's-eye view: network term clouds in a peer-to-peer system. 1909-1912
Takehiro Yamamoto, Satoshi Nakamura, Katsumi Tanaka: RerankEverything: a reranking interface for exploring search results. 1913-1916
Luis Fernández-Luque, Randi Karlsen, Genevieve B. Melton: HealthTrust: trust-based retrieval of YouTube's diabetes channels. 1917-1920
Dan Shen, Jean-David Ruvini, Manas Somaiya, Neel Sundaresan: Item categorization in the e-commerce domain. 1921-1924
Walid Magdy, Gareth J. F. Jones: An efficient method for using machine translation technologies in cross-language patent search. 1925-1928
Ahmet Aker, Robert J. Gaizauskas: Understanding the types of information humans associate with geographic objects. 1929-1932
Rodrygo L. T. Santos, Craig Macdonald, Iadh Ounis: Effectiveness beyond the first crawl tier. 1937-1940
Gabriella Kazai, Jaap Kamps, Natasa Milic-Frayling: Worker types and personality traits in crowdsourcing relevance labels. 1941-1944
Shahzad Rajput, Virgiliu Pavlu, Peter B. Golbus, Javed A. Aslam: A nugget-based test collection construction paradigm. 1945-1948
Andrey Styskin, Fedor Romanenko, Fedor Vorobyev, Pavel Serdyukov: Recency ranking by diversification of result set. 1949-1952
Debasis Ganguly, Johannes Leveling, Walid Magdy, Gareth J. F. Jones: Patent query reduction using pseudo relevance feedback. 1953-1956
Chang Wang, Emine Yilmaz, Martin Szummer: Relevance feedback exploiting query-specific document manifolds. 1957-1960

Yunlong Ma, Hongfei Lin, Yuan Lin: Selecting related terms in query-logs using two-stage SimRank. 1969-1972
Giuseppe Amodeo, Giambattista Amati, Giorgio Gambosi: On relevance, time and query expansion. 1973-1976
Scott Sanner, Shengbo Guo, Thore Graepel, Sadegh Kharazmi, Sarvnaz Karimi: Diverse retrieval via greedy optimization of expected 1-call@k in a latent subtopic relevance model. 1977-1980

Shoaib Jameel, Wai Lam, Ching-man Au Yeung, Sheaujiun Chyan: An unsupervised ranking method based on a technical difficulty terrain. 1989-1992
Tamer Elsayed, Jimmy J. Lin, Donald Metzler: When close enough is good enough: approximate positional indexes for efficient ranked retrieval. 1993-1996
Sairam Gurajada, P. Sreenivasa Kumar: Index tuning for query-log based on-line index maintenance. 1997-2000
Dongdong Shan, Wayne Xin Zhao, Jing He, Rui Yan, Hongfei Yan, Xiaoming Li: Efficient phrase querying with flat position index. 2001-2004
Saeedeh Momtazi, Dietrich Klakow: Trained trigger language model for sentence retrieval in QA: bridging the vocabulary gap. 2005-2008
Danilo Croce, Alessandro Moschitti, Roberto Basili: Semantic convolution kernels over dependency trees: smoothed partial tree kernel. 2013-2016
Yang Lu, Jing He, Dongdong Shan, Hongfei Yan: Recommending citations with translation model. 2017-2020
Takehiro Yamamoto, Satoshi Nakamura, Katsumi Tanaka: Extracting adjective facets from community Q&A corpus. 2021-2024
Jun Wang, Xia Hu, Zhoujun Li, Wen-Han Chao, Biyun Hu: Learning to recommend questions based on public interest. 2029-2032
Huizhong Duan, Rui Li, ChengXiang Zhai: Automatic query reformulation with syntactic operators to alleviate search difficulty. 2037-2040
Baichuan Li, Irwin King, Michael R. Lyu: Question routing in community question answering: putting category in its place. 2041-2044
Aditya Kalyanpur, Siddharth Patwardhan, Branimir Boguraev, Adam Lally, Jennifer Chu-Carroll: Fact-based question decomposition for candidate answer re-ranking. 2045-2048
Emre Varol, Fazli Can, Cevdet Aykanat, Oguz Kaya: CoDet: sentence-based containment detection in news corpora. 2049-2052
Andrey Kustarev, Yury Ustinovsky, Yury Logachev, Evgeny Grechnikov, Ilya Segalovich, Pavel Serdyukov: Smoothing NDCG metrics using tied scores. 2053-2056
Mostafa Keikha, Jangwon Seo, W. Bruce Croft, Fabio Crestani: Predicting document effectiveness in pseudo relevance feedback. 2061-2064
Abhimanu Kumar, Matthew Lease, Jason Baldridge: Supervised language modeling for temporal resolution of texts. 2069-2072
Xiaohui Yan, Jiafeng Guo, Xueqi Cheng: Context-aware query recommendation by learning high-order relation in query logs. 2073-2076
Shuhui Wang, Qingming Huang, Shuqiang Jiang, Qi Tian: Efficient lp-norm multiple feature metric learning for image categorization. 2077-2080
Bahjat Safadi, Georges Quénot: Re-ranking by local re-scoring for video indexing and retrieval. 2081-2084
Muhammad Asiful Islam, Faisal Ahmed, Yevgen Borodin, I. V. Ramakrishnan: Tightly coupling visual and linguistic features for enriching audio-based web browsing experience. 2085-2088
Jungho Lee, Seungjae Lee, Yongseok Seo, Wonyoung Yoo: Robust video fingerprinting based on hierarchical symmetric difference feature. 2089-2092
Raoul Wessel, Sebastian Ochmann, Richard Vock, Ina Blümel, Reinhard Klein: Efficient retrieval of 3D building models using embeddings of attributed subgraphs. 2097-2100
Duck-Ho Bae, Se-Mi Hwang, Sang-Wook Kim, Christos Faloutsos: Constructing seminal paper genealogy. 2101-2104
Zongda Wu, Guandong Xu, Rong Pan, Yanchun Zhang, Zhiwen Hu, Jianfeng Lu: Leveraging Wikipedia concept and category information to enhance contextual advertising. 2105-2108
Timothy Jones, David Hawking, Paul Thomas, Ramesh Sankaranarayana: Relative effect of spam and irrelevant documents on user interaction with search engines. 2113-2116
Van Dang, Xiaobing Xue, W. Bruce Croft: Inferring query aspects from reformulations using clustering. 2117-2120
Sungchul Kim, Tao Qin, Hwanjo Yu, Tie-Yan Liu: Advertiser-centric approach to understand user click behavior in sponsored search. 2121-2124
Dyut Kumar Sil, Srinivasan H. Sengamedu, Chiranjib Bhattacharyya: Supervised matching of comments with news article segments. 2125-2128
Anlei Dong, Jiang Bian, Xiaofeng He, Srihari Reddy, Yi Chang: User action interpretation for personalized content optimization in recommender systems. 2129-2132
Maria Soledad Pera, Yiu-Kai Ng: A personalized recommendation system on scholarly publications. 2133-2136
Naoki Tani, Danushka Bollegala, Naiwala P. Chandrasiri, Keisuke Okamoto, Kazunari Nawa, Shuhei Iitsuka, Yutaka Matsuo: Collaborative exploratory search in real-world context. 2137-2140
Benno Stein, Tim Gollub, Dennis Hoppe: Beyond precision@10: clustering the long tail of web search results. 2141-2144
Poster session: knowledge management
Sang-Wook Kim, Ki-Nam Kim, Christos Faloutsos, Joon Ho Lee: Spectral analysis of a blogosphere. 2145-2148
Timothy Cribbin: Citation chain aggregation: an interaction model to support citation cycling. 2149-2152

Yifan Fu, Bin Li, Xingquan Zhu, Chengqi Zhang: Do they belong to the same class: active learning by querying pairwise label homogeneity. 2161-2164
Paolo Garza: Structured data classification by means of matrix factorization. 2165-2168
Zhenfeng Zhu, Xingquan Zhu, Yangdong Ye, Yue-Fei Guo, Xiangyang Xue: Transfer active learning. 2169-2172
Nenad Tomasev, Milos Radovanovic, Dunja Mladenic, Mirjana Ivanovic: A probabilistic approach to nearest-neighbor classification: naive hubness Bayesian kNN. 2173-2176
Yujing Wang, Xiaochuan Ni, Jian-Tao Sun, Yunhai Tong, Zheng Chen: Representing document as dependency graph for document clustering. 2177-2180
Michele Berlingerio, Michele Coscia, Fosca Giannotti: Finding redundant and complementary communities in multidimensional networks. 2181-2184
Bruno A. Pimentel, Anderson F. B. F. da Costa, Renata M. C. R. de Souza: A partitioning method for symbolic interval data based on kernelized metric. 2189-2192
Chaokun Wang, Wei Zheng, Zhang Liu, Yiyuan Bai, Jianmin Wang: Using random walks for multi-label classification. 2197-2200
Shin Ando: Latent feature encoding using dyadic and relational data. 2201-2204
Yong Liu, Shizhong Liao, Yuexian Hou: Learning kernels with upper bounds of leave-one-out error. 2205-2208
Guoqiong Liao, Jing Li, Lei Chen, Changxuan Wan: KLEAP: an efficient cleaning method to remove cross-reads in RFID streams. 2209-2212
Narayan Bhamidipati, Nagaraj Kota: A diversity measure leveraging domain specific auxiliary information. 2213-2216
Julia Kiseleva, Eugene Agichtein, Daniel Billsus: Mining query structure from click data: a case study of product queries. 2217-2220
Hengshu Zhu, Huanhuan Cao, Hui Xiong, Enhong Chen, Jilei Tian: Towards expert finding by leveraging relevant categories in authority ranking. 2221-2224
Qi Li, Sam Anzaroot, Wen-Pin Lin, Xiang Li, Heng Ji: Joint inference for cross-document information extraction. 2225-2228
Anish Das Sarma, Alpa Jain, Philip Bohannon: Building a generic debugger for information extraction pipelines. 2229-2232
Amara Tariq, Asim Karim: Fast supervised feature extraction by term discrimination information pooling. 2233-2236
Henning Wachsmuth, Benno Stein, Gregor Engels: Constructing efficient information extraction pipelines. 2237-2240
Krishna Yeswanth Kamath, James Caverlee: Discovering trending phrases on information streams. 2245-2248
Samaneh Moghaddam, Mohsen Jamali, Martin Ester: Review recommendation: personalized prediction of the quality of online reviews. 2249-2252
Fidel Cacheda, Victor Carneiro, Diego Fernández, Vreixo Formoso: Improving k-nearest neighbors algorithms: practical application of dataset analysis. 2253-2256
Ibrahim Uysal, W. Bruce Croft: User oriented tweet ranking: a filtering approach to microblogs. 2261-2264
Julie Séguéla, Gilbert Saporta: A semi-supervised hybrid system to enhance the recommendation of channels in terms of campaign ROI. 2265-2268
Dongsheng Li, Qin Lv, Li Shang, Ning Gu: YANA: an efficient privacy-preserving recommender system for online social communities. 2269-2272
Mirwaes Wahabzada, Kristian Kersting, Anja Pilz, Christian Bauckhage: More influence means less work: fast latent Dirichlet allocation by influence scheduling. 2273-2276
Mingqiang Xue, Panagiotis Karras, Chedy Raïssi, Hung Keng Pung: Utility-driven anonymization in data publishing. 2277-2280
Madhushri Banerjee, Sumit Chakravarty: Privacy preserving feature selection for distributed data using virtual dimension. 2281-2284
Hamid Turab Mirza, Ling Chen, Gencai Chen, Ibrar Hussain, Xufeng He: Switch detector: an activity spotting system for desktop. 2285-2288
Madhuchand Rushi Pillutla, Nisarg Raval, Piyush Bansal, Kannan Srinathan, C. V. Jawahar: LSH based outlier detection and its application in distributed setting. 2289-2292
Henning Weiler, Klaus Meyer-Wegener, Salvatore Mele: Authormagic: an approach to author disambiguation in large-scale digital libraries. 2293-2296
Chuan Shi, Philip S. Yu, Yanan Cai, Zhenyu Yan, Bin Wu: On selection of objective functions in multi-objective community detection. 2301-2304
Manos Papagelis, Francesco Bonchi, Aristides Gionis: Suggesting ghost edges for a smaller world. 2305-2308
Karl Gyllstrom, Marie-Francine Moens: Examining the "leftness" property of Wikipedia categories. 2309-2312
Maik Anderka, Benno Stein, Nedim Lipka: Detection of text quality flaws as a one-class classification problem. 2313-2316
Deepak P, Sutanu Chakraborti, Deepak Khemani: More or better: on trade-offs in compacting textual problem solution repositories. 2321-2324
Jing Guo, Peng Zhang, Jianlong Tan, Li Guo: Mining frequent patterns across multiple data streams. 2325-2328
Ermelinda Oro, Massimo Ruffolo: SILA: a spatial instance learning approach for deep webpages. 2329-2332
Jeffrey McGee, James Caverlee, Zhiyuan Cheng: A geographic study of tie strength in social media. 2333-2336
Changki Lee, Pum-Mo Ryu, Hyunki Kim: Named entity recognition using a modified Pegasos algorithm. 2337-2340
Tadashi Nomoto: WikiLabel: an encyclopedic approach to labeling documents en masse. 2341-2344
Mrinmaya Sachan, Danish Contractor, Tanveer A. Faruquie, L. Venkata Subramaniam: Probabilistic model for discovering topic based communities in social networks. 2349-2352
Poster session: databases
Sanghoon Lee, Jongwuk Lee, Seung-won Hwang: Scalable entity matching computation with materialization. 2353-2356
Jintian Deng, Fei Liu, Yun Peng, Byron Choi, Jianliang Xu: Predicting the optimal ad-hoc index for reachability queries on graph databases. 2357-2360
Andrew Peel, Anthony Wirth, Justin Zobel: Collection-based compression using discovered long matching strings. 2361-2364
Carlos Garcia-Alvarado, Carlos Ordonez: Integrating and querying web databases and documents. 2369-2372
Martin Krulis, Jakub Lokoc, Christian Beecks, Tomás Skopal, Thomas Seidl: Processing the signature quadratic form distance on many-core GPU architectures. 2373-2376
Johann Gamper, Michael H. Böhlen, Willi Cometti, Markus Innerebner: Defining isochrones in multimodal spatial networks. 2381-2384
Ioannis Konstantinou, Evangelos Angelou, Christina Boumpouka, Dimitrios Tsoumakos, Nectarios Koziris: On the elasticity of NoSQL databases over cloud management platforms. 2385-2388
Jun Li, Peng Zhang, Jianlong Tan, Ping Liu, Li Guo: Continuous data stream query in the cloud. 2389-2392
He Li, Kyoungsoo Bok, Jaesoo Yoo: A cluster based mobile peer to peer architecture in wireless ad hoc networks. 2393-2396
Lars Kolb, Andreas Thor, Erhard Rahm: Block-based load balancing for entity resolution with MapReduce. 2397-2400
Shen Gao, Jianliang Xu, Bingsheng He, Byron Choi, Haibo Hu: PCMLogging: reducing transaction logging overhead with PCM. 2401-2404
Hong Kyu Park, Won Suk Lee: A continuous query evaluation scheme for a detection-only query over data streams. 2405-2408
Junling Liu, Ge Yu, Huanliang Sun: Subject-oriented top-k hot region queries in spatial dataset. 2409-2412
Yonghun Park, Dongmin Seo, Kyoungsoo Bok, Jaesoo Yoo: k-Nearest neighbor query processing method based on distance relation pattern. 2413-2416
Sreenivas Gollapudi, Samuel Ieong, Alexandros Ntoulas, Stelios Paparizos: Efficient query rewrite for structured web queries. 2417-2420
Jiang Bian, Yi Chang: A taxonomy of local search: semi-supervised query classification driven by information needs. 2425-2428
Carlos Garcia-Alvarado, Zhibo Chen, Carlos Ordonez: ONTOCUBE: efficient ontology extraction using OLAP cubes. 2429-2432
Xiaojun Cheng, Guilin Qi: An algorithm for axiom pinpointing in EL+ and its incremental variant. 2433-2436
