14. CIKM 2005: Bremen, Germany
Otthein Herzog, Hans-Jörg Schek, Norbert Fuhr, Abdur Chowdhury, Wilfried Teiken (Eds.): Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management, Bremen, Germany, October 31 - November 5, 2005. ACM 2005 ISBN 1-59593-140-6
Invited Talks
Ben Shneiderman: Leonardo's laptop: human needs and the new computing technologies. 1
Yannis E. Ioannidis: Emerging data management systems: close-up and personal. 2
Thomas Hofmann: From bits and bytes to information and knowledge. 3
Paper session IR-1 (information retrieval): XML retrieval
Jaap Kamps, Maarten Marx, Maarten de Rijke, Börkur Sigurbjörnsson: Structured queries in XML retrieval. 4-11
Vojkan Mihajlovic, Henk Ernst Blok, Djoerd Hiemstra, Peter M. G. Apers: Score region algebra: building a transparent XML-R database. 12-19
Paavo Arvola, Marko Junkkari, Jaana Kekäläinen: Generalized contextualization method for XML information retrieval. 20-27
Paper session DB-1 (databases): networks and peer-to-peer
Klaus Haller, Heiko Schuldt, Can Türker: Decentralized coordination of transactional processes in peer-to-peer environments. 28-35
Gianluigi Greco, Francesco Scarcello: On the complexity of computing peer agreements for consistent query answering in peer-to-peer data integration systems. 36-43
Ioannis Aekaterinidis, Peter Triantafillou: Internet scale string attribute publish/subscribe data networks. 44-51
Paper session KM-1 (knowledge management): knowledge systems
Michael Guppenberger, Burkhard Freitag: Intelligent creation of notification events in information systems: concept, implementation and evaluation. 52-59
Kaidi Zhao, Bing Liu, Thomas M. Tirpak, Weimin Xiao: Opportunity map: a visualization framework for fast identification of actionable knowledge. 60-67
Jaewoo Kang, Tae Sik Han, Dongwon Lee, Prasenjit Mitra: Establishing value mappings using statistical models and user feedback. 68-75
Paper session IR-2 (information retrieval): question answering
Valentin Jijkoun, Maarten de Rijke: Retrieving answers from frequently asked questions pages on the web. 76-83
Jiwoon Jeon, W. Bruce Croft, Joon Ho Lee: Finding similar questions in large question and answer archives. 84-90
Fernando A. Das Neves, Edward A. Fox, Xiaoyan Yu: Connecting topics in document collections with stepping stones and pathways. 91-98
Paper session DB-2 (databases): security and privacy
Barbara Carminati, Elena Ferrari, Elisa Bertino: Securing XML data in third-party distribution systems. 99-106
Béatrice Finance, Saïda Medjdoub, Philippe Pucheral: The case for access control on XML relationships. 107-114
Naizhen Qi, Michiharu Kudo, Jussi Myllymaki, Hamid Pirahesh: A function-based access control model for XML databases. 115-122
Paper session KM-2 (knowledge management): index structures
Mihail Halachev, Nematollaah Shiri, Anand Thamildurai: Exact match search in sequence data using suffix trees. 123-130
Michail Vlachos, Zografoula Vagena, Philip S. Yu, Vassilis Athitsos: Rotation invariant indexing of shapes and line drawings. 131-138
Anand Meka, Ambuj K. Singh: DIST: a distributed spatio-temporal index structure for sensor networks. 139-146
Paper session IR-3 (information retrieval): web retrieval
Thanh Tin Tang, David Hawking, Nick Craswell, Kathleen Griffiths: Focused crawling for both topical relevance and quality of medical information. 147-154
Yinghua Zhou, Xing Xie, Chuang Wang, Yuchang Gong, Wei-Ying Ma: Hybrid index structures for location-based web search. 155-162
Xiaojun Wan, Jianfeng Gao, Mu Li, Binggong Ding: Person resolution in person search results: WebHawk. 163-170
Paper session DB-3 (databases): sensors and data streams
Bugra Gedik, Kun-Lung Wu, Philip S. Yu, Ling Liu: Adaptive load shedding for windowed stream joins. 171-178
Ming-Jyh Hsieh, Ming-Syan Chen, Philip S. Yu: Integrating DCT and DWT for approximating cube streams. 179-186
Alexandru Coman, Mario A. Nascimento, Jörg Sander: Exploiting redundancy in sensor networks for energy efficient processing of spatiotemporal region queries. 187-194
Paper session KM-3 (knowledge management): classification & clustering

Ratko Orlandic, Ying Lai, Wai Gen Yee: Clustering high-dimensional data using an efficient and effective data space reduction. 201-208
Federica Mandreoli, Riccardo Martoglia, Enrico Ronchetti: Versatile structural disambiguation for semantic-aware applications. 209-216
Poster Session
Timothy M. Sutherland, Bin Liu, Mariana Jbantova, Elke A. Rundensteiner: D-CAPE: distributed and self-tuned continuous query processing. 217-218
Qiankun Zhao, Sourav S. Bhowmick, Le Gruenwald: Mining conserved XML query paths for dynamic-conscious caching. 219-220
Yongluan Zhou, Ying Yan, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou: Optimizing continuous multijoin queries over distributed streams. 221-222
Takeharu Eda, Makoto Onizuka, Masashi Yamamuro: Processing XPath queries with XML summaries. 223-224
Changqing Li, Tok Wang Ling, Jiaheng Lu, Tian Yu: On reducing redundancy and improving efficiency of XML labeling schemes. 225-226
Fang Liu, Shuang Liu, Clement T. Yu, Weiyi Meng, Ophir Frieder, David A. Grossman: Database selection in intranet mediators for natural language queries. 229-230
Ramakrishna Varadarajan, Vagelis Hristidis: Structure-based query-specific document summarization. 231-232
Ken Q. Pu, Alberto O. Mendelzon: Typed functional query languages with equational specifications. 233-234
Maithili Narasimha, Gene Tsudik: DSAC: integrity for outsourced databases with signature aggregation and chaining. 235-236
Foto N. Afrati, Paraskevas V. Lekeas, Chen Li: Answering aggregation queries on hierarchical web sites using adaptive sampling. 237-238
Terrence Mason, Ramon Lawrence: INFER: a relational query language without the complexity of SQL. 241-242
Sandeep Gupta, Jinfeng Ni, Chinya V. Ravishankar: Efficient data dissemination using locale covers. 243-244
Zhenjie Zhang, Xinyu Guo, Hua Lu, Anthony K. H. Tung, Nan Wang: Discovering strong skyline points in high dimensional spaces. 247-248
Xiaohua Hu, Illhoi Yoo, Min Song, Yanqing Zhang, Il-Yeol Song: Mining undiscovered public knowledge from complementary and non-interactive biomedical literature through semantic pruning. 249-250
Sriram Mohan, Arijit Sengupta, Yuqing Wu: Access control for XML: a dynamic query rewriting approach. 251-252
Hong-Cheu Liu, John Zeleznikow: Relational computation for mining association rules from XML data. 253-254
Helena Ahonen-Myka: Mining all maximal frequent word sequences in a set of sentences. 255-256
Aron Culotta, Andrew McCallum: Joint deduplication of multiple record types in relational data. 257-258
Jie Lian, Lei Chen, Kshirasagar Naik, M. Tamer Özsu, Gordon B. Agnew: Localized routing trees for query processing in sensor networks. 259-260
Fang Xiong, Qiong Luo, Dyce Jing Zhao: Supporting ranked search in parallel search cluster networks. 263-264
Tadahiko Kumamoto, Katsumi Tanaka: Web opinion poll: extracting people's view by impression mining from the web. 265-266
Libo Chen, Peter Fankhauser, Ulrich Thiel, Thomas Kamps: Statistical relationship determination in automatic thesaurus construction. 267-268
Giridhar Kumaran, Rosie Jones, Omid Madani: Biasing web search results for topic familiarity. 271-272
Tao Tao, Xuanhui Wang, Qiaozhu Mei, ChengXiang Zhai: Accurate language model estimation with document expansion. 273-274
Mo Chen, Jian-Tao Sun, Hua-Jun Zeng, Kwok-Yan Lam: A practical system of keyphrase extraction for web pages. 277-278
Tak-Chung Fu, Fu-Lai Chung, Pui-ying Tang, Robert Wing Pong Luk, Chak-man Ng: Incremental stock time series data delivery and visualization. 279-280
Razvan Stefan Bot, Yi-fang Brook Wu, Xin Chen, Quanzhi Li: Generating better concept hierarchies using automatic document classification. 281-282
Yi-fang Brook Wu, Quanzhi Li, Razvan Stefan Bot, Xin Chen: Domain-specific keyphrase extraction. 283-284
Jyh-haw Yeh: An RSA-based time-bound hierarchical key assignment scheme for electronic article subscription. 285-286
Bruno Pôssas, Nivio Ziviani, Berthier A. Ribeiro-Neto, Wagner Meira Jr.: Maximal termsets as a query structuring mechanism. 287-288
Jing Jiang, ChengXiang Zhai: Accurately extracting coherent relevant passages using hidden Markov models. 289-290
Georgina Ramírez, Thijs Westerveld, Arjen P. de Vries: Structural features in content oriented XML retrieval. 291-292
Henrik Nottelmann, Umberto Straccia: Information retrieval and machine learning for probabilistic schema matching. 295-296
Massih-Reza Amini, Anastasios Tombros, Nicolas Usunier, Mounia Lalmas, Patrick Gallinari: Learning to summarise XML documents using content and structure. 297-298
Xiaojun Wan, Yuxin Peng: The earth mover's distance as a semantic measure for document similarity. 301-302
Xiangye Xiao, Qiong Luo, Dan Hong, Hongbo Fu: Slicing*-tree based web page transformation for small displays. 303-304
Ronan Cummins, Colm O'Riordan: An evaluation of evolved term-weighting schemes in information retrieval. 305-306
Jaap Kamps: Web-centric language models. 307-308
Chavdar Botev, Nadav Eiron, Marcus Fontoura, Ning Li, Eugene J. Shekita: Static score bucketing in inverted indexes. 311-312
Ying Feng, Divyakant Agrawal, Amr El Abbadi, Ambuj K. Singh: Scalable ranking for preference queries. 313-314
Xiaoyong Liu, W. Bruce Croft, Matthew B. Koll: Finding experts in community-based question-answering services. 315-316
Stefan Büttcher, Charles L. A. Clarke: Indexing time vs. query time: trade-offs in dynamic information retrieval systems. 317-318
Dmitri Roussinov, Weiguo Fan, Fernando A. Das Neves: Discretization based learning approach to information retrieval. 321-322
Dmitri Roussinov, Weiguo Fan, Fernando A. Das Neves: Semantic verification for fact seeking engines. 323-324
Pierre-Alain Laur, Richard Nock, Jean-Emile Symphor, Pascal Poncelet: On the estimation of frequent itemsets for data streams: theory and experiments. 327-328
Rohini K. Srihari, Sudarshan Lamkhede, Anmol Bhasin: Unapparent information revelation: a concept chain graph approach. 329-330

Ricardo da Silva Torres, Alexandre X. Falcão, Baoping Zhang, Weiguo Fan, Edward A. Fox, Marcos André Gonçalves, Pável Calado: A new framework to combine descriptors for content-based image retrieval. 335-336
Ganesh Ramakrishnan, Deepa Paranjpe, Byron Dom: A structure-sensitive framework for text categorization. 337-338
Hans-Peter Kriegel, Martin Pfeifle: Efficient and effective server-sided distributed clustering. 339-340
Riadh Ben Messaoud, Omar Boussaid, Sabine Rabaséda: Evaluation of a MCA-based approach to organize data cubes. 341-342
Francisco M. Couto, Mário J. Silva, Pedro Coutinho: Semantic similarity over the gene ontology: family correlation and selecting disjunctive ancestors. 343-344
Nan Liu, Christopher C. Yang: Extracting a website's content structure from its link structure. 345-346
Christoph Mangold, Holger Schwarz, Bernhard Mitschang: Improving intranet search-engines using context information from databases. 349-350
Yiqun Huang, Zhengding Lu, Heping Hu: A new permutation approach for distributed association rule mining. 351-352
Hui Xiong, Michael Steinbach, Vipin Kumar: Privacy leakage in multi-relational databases via pattern based semi-supervised learning. 355-356
Yingbo Miao, Vlado Keselj, Evangelos E. Milios: Document clustering using character N-grams: a comparative evaluation with term-based and word-based clustering. 357-358
Moshe Fresko, Binyamin Rosenfeld, Ronen Feldman: A hybrid approach to NER by MEMM and manual rules. 361-362
Martin Lorenz, Jan D. Gehrke, Hagen Langer, Ingo J. Timm, Joachim Hammer: Situation-aware risk management in autonomous agents. 363-364
Paper session IR-4 (information retrieval): machine learning
Carlo Curino, Yuanyuan Jia, Bruce Lambert, Patricia M. West, Clement T. Yu: Mining officially unrecognized side effects of drugs by combining web search and machine learning. 365-372
Paul-Alexandru Chirita, Jörg Diederich, Wolfgang Nejdl: MailRank: using ranking for spam detection. 373-380
Kai Simon, Georg Lausen: ViPER: augmenting automatic information extraction with visual perceptions. 381-388
Paper session DB-4 (databases): XML and query processing
Sara Cohen, Yaron Kanza, Benny Kimelfeld, Yehoshua Sagiv: Interconnection semantics for keyword search in XML. 389-396
K. Hima Prasad, P. Sreenivasa Kumar: Efficient indexing and querying of XML data using modified Prüfer sequences. 397-404
Prasan Roy, Mukesh K. Mohania, Bhuvan Bamba, Shree Raman: Towards automatic association of relevant unstructured content with structured query results. 405-412
Paper session KM-4 (knowledge management): information extraction
Eugene Agichtein, Silviu Cucerzan: Predicting accuracy of extracting information from unstructured text collections. 413-420
Qiankun Zhao, Sourav S. Bhowmick, Le Gruenwald: WAM-Miner: in the search of web access motifs from historical web log data. 421-428
Junmei Wang, Wynne Hsu, Mong-Li Lee: A framework for mining topological patterns in spatio-temporal databases. 429-436
Industry track session
Moninder Singh, Jayant Kalagnanam, Sudhir Verma, Amit J. Shah, Swaroop K. Chalasani: Automated cleansing for spend analytics. 437-445
Gilad Mishne, David Carmel, Ron Hoory, Alexey Roytman, Aya Soffer: Automatic analysis of call-center conversations. 453-459
Hang Li, Yunbo Cao, Jun Xu, Yunhua Hu, Shenjie Li, Dmitriy Meyerzon: A new approach to intranet search based on information extraction. 460-468
Paper session IR-5 (information retrieval): machine learning and collaborative filtering
Songbo Tan, Xueqi Cheng, Moustafa Ghanem, Bin Wang, Hongbo Xu: A novel refinement approach for text categorization. 469-476
Baoping Zhang, Yuxin Chen, Weiguo Fan, Edward A. Fox, Marcos André Gonçalves, Marco Cristo, Pável Calado: Intelligent GP fusion from multiple sources for text classification. 477-484
Paper session DB-5 (databases): updates and change detection

Changqing Li, Tok Wang Ling: QED: a novel quaternary encoding to completely avoid re-labeling in XML updates. 501-508
Erwin Leonardi, Sourav S. Bhowmick: Detecting changes on unordered XML documents using relational databases: a schema-conscious approach. 509-516
Paper session IR-6 (information retrieval): IR models 1
Donald Metzler, Yaniv Bernstein, W. Bruce Croft, Alistair Moffat, Justin Zobel: Similarity measures for tracking information flow. 517-524
Chenyi Xia, Wynne Hsu, Mong-Li Lee: ERkNN: efficient reverse k-nearest neighbors retrieval with local kNN-distance estimation. 533-540
Industry track session
Rasmus Kaae, Thanh-Duy Nguyen, Dennis Nørgaard, Albrecht Schmidt: Kalchas: a dynamic XML search engine. 541-548
Christian Herzog, Gianpiero Liuzzi, Mario Diwersy: SyynX solutions: practical knowledge management in a medical environment. 556-559
Stephen C. Gates, Wilfried Teiken, Keh-Shin F. Cheng: Taxonomies by the numbers: building high-performance taxonomies. 568-577
Paper session IR-7 (information retrieval): distributed retrieval
Yangbo Zhu, Shaozhi Ye, Xing Li: Distributed PageRank computation based on iterative aggregation-disaggregation methods. 578-585
Wolfgang Müller, Martin Eisenhardt, Andreas Henrich: Scalable summary based retrieval in P2P networks. 586-593
Paper session DB-6 (databases): algorithms
Hao He, Haixun Wang, Jun Yang, Philip S. Yu: Compact reachability labeling for graph-structured data. 594-601
Paper session DB-7 (databases): privacy and sharing
Jianping Fan, Hangzai Luo, Mohand-Said Hacid, Elisa Bertino: A novel approach for privacy-preserving video sharing. 609-616
Paper session IR-8 (information retrieval): sentiment and genre classification
Andrea Esuli, Fabrizio Sebastiani: Determining the semantic orientation of terms through gloss classification. 617-624
Casey Whitelaw, Navendu Garg, Shlomo Argamon: Using appraisal groups for sentiment analysis. 625-631
Elizabeth Sugar Boese, Adele E. Howe: Effects of web document evolution on genre classification. 632-639
Paper session DB-8 (databases): query optimisation
Georgia Koloniari, Yannis Petrakis, Evaggelia Pitoura, Thodoris Tsotsos: Query workload-aware overlay construction using histograms. 640-647
Doron Rotem, Kurt Stockinger, Kesheng Wu: Optimizing candidate check costs for bitmap indices. 648-655
Xiaohui Yu, Calisto Zuzarte, Kenneth C. Sevcik: Towards estimating the number of distinct value combinations for a set of attributes. 656-663
Paper session IR-9 (information retrieval): IR models 2

Fernando Diaz: Regularizing ad hoc retrieval scores. 672-679
Paper session IR-10 (information retrieval): query expansion
Jing Bai, Dawei Song, Peter Bruza, Jian-Yun Nie, Guihong Cao: Query expansion using term relationships in language models for information retrieval. 688-695
Bruno M. Fonseca, Paulo Braz Golgher, Bruno Pôssas, Berthier A. Ribeiro-Neto, Nivio Ziviani: Concept-based interactive query expansion. 696-703
Paper session DB-9 (databases): query processing 1
Dimitri Theodoratos, Theodore Dalamagas, Antonis Koufopoulos, Narain H. Gehani: Semantic querying of tree-structured data sources using partially specified tree patterns. 712-719
Neoklis Polyzotis: Selectivity-based partitioning: a divide-and-union paradigm for effective query optimization. 720-727
Cédric du Mouza, Philippe Rigaux, Michel Scholl: Efficient evaluation of parameterized pattern queries. 728-735
Paper session IR-11 (information retrieval): novelty detection
Paper session IR-12 (information retrieval): IR potpourri
Jean Martinet, Yves Chiaramella, Philippe Mulhem: A model for weighting image objects in home photographs. 760-767
Wisam Dakka, Panagiotis G. Ipeirotis, Kenneth R. Wood: Automatic construction of multifaceted browsing interfaces. 768-775
Nicholas Lester, Alistair Moffat, Justin Zobel: Fast on-line index construction by geometric partitioning. 776-783
Paper session DB-10 (databases): query processing 2
Marcus Fontoura, Vanja Josifovski, Eugene J. Shekita, Beverly Yang: Optimizing cursor movement in holistic twig joins. 784-791
Luca Grieco, Domenico Lembo, Riccardo Rosati, Marco Ruzzi: Consistent query answering under key and exclusion dependencies: algorithms and experiments. 792-799
Qingzhao Tan, Wang-Chien Lee, Baihua Zheng, Peng Liu, Dik Lun Lee: Balancing performance and confidentiality in air index. 800-807
Paper session IR-13 (information retrieval): context and personalization
Massimo Melucci: Context modeling and discovery using vector space bases. 808-815
Reiner Kraft, Farzin Maghoul, Chi-Chao Chang: Y!Q: contextual search at the point of inspiration. 816-823