16. CIKM 2007: Lisbon, Portugal
Mário J. Silva, Alberto H. F. Laender, Ricardo A. Baeza-Yates, Deborah L. McGuinness, Bjørn Olstad, Øystein Haug Olsen, André O. Falcão (Eds.): Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, CIKM 2007, Lisbon, Portugal, November 6-10, 2007. ACM 2007 ISBN 978-1-59593-803-9
XML query processing (DB)

Stefanos Souldatos, Xiaoying Wu, Dimitri Theodoratos, Theodore Dalamagas, Timos K. Sellis: Evaluation of partial path queries on XML data. 21-30
Guoliang Li, Jianhua Feng, Jianyong Wang, Lizhu Zhou: Effective keyword search for valuable LCAs over XML documents. 31-40
Semantic annotation (KM)

Nicola Fanizzi, Claudia d'Amato, Floriana Esposito: Randomized metric induction and evolutionary conceptual clustering for semantic knowledge bases. 51-60
Paul Doran, Valentina A. M. Tamma, Luigi Iannone: Ontology module extraction for ontology reuse: an ontology engineering perspective. 61-70
Natural language I (IR)
Dmitri Roussinov, Ozgur Turetken: Semantic verification in an online fact seeking environment. 71-78
Ouyang You, Sujian Li, Wenjie Li: Developing learning strategies for topic-based summarization. 79-86
Marius Pasca: Lightweight web-based fact repositories for textual question answering. 87-96
Enterprise information management (IND)
Ronny Lempel, Yosi Mass, Shila Ofek-Koifman, Dafna Sheinwald, Yael Petruschka, Ron Sivan: Just in time indexing for up to the second search. 97-106
Christian Drumm, Matthias Schmitt, Hong Hai Do, Erhard Rahm: Quickmig: automatic schema matching for data migration projects. 107-116
Youngja Park: Automatic call section segmentation for contact-center calls. 117-126
Classification and clustering I (KM)
Seyda Ertekin, Jian Huang, Léon Bottou, C. Lee Giles: Learning on the border: active learning in imbalanced data classification. 127-136
Tao Li, Sarabjot S. Anand: Diva: a variance-based clustering approach for multi-type relational data. 147-156
Web retrieval I (IR)
Marc Najork: Comparing the effectiveness of HITS and SALSA. 157-164
David Fernandes, Edleno Silva de Moura, Berthier A. Ribeiro-Neto, Altigran Soares da Silva, Marcos André Gonçalves: Computing block importance for searching on web sites. 165-174
Benjamin Piwowarski, Hugo Zaragoza: Predictive user click models based on click-through history. 175-182
Spatio-temporal databases and time series streams (DB)
Reasey Praing, Markus Schneider: Modeling historical and future movements of spatio-temporal objects in moving objects databases. 183-192
Tiancheng Zhang, Dejun Yue, Yu Gu, Ge Yu: Boolean representation based data-adaptive correlation analysis over time series streams. 203-212
Explanation, knowledge provenance and synthesis (KM)
Anthony Don, Elena Zheleva, Machon Gregory, Sureyya Tarkan, Loretta Auvil, Tanya Clement, Ben Shneiderman, Catherine Plaisant: Discovering interesting usage patterns in text collections: integrating text mining with visualization. 213-222
Jonathan Yu, James A. Thom, Audrey M. Tam: Ontology evaluation using Wikipedia categories for browsing. 223-232
Meiqun Hu, Ee-Peng Lim, Aixin Sun, Hady Wirawan Lauw, Ba-Quy Vuong: Measuring article quality in Wikipedia: models and evaluation. 243-252
IR modeling (IR)
Donald Metzler: Automatic feature selection in the Markov random field model for information retrieval. 253-262
Massimo Melucci, Ryen W. White: Utilizing a geometry of context for enhanced implicit feedback. 273-282
Record linkage and approximate matching (DB)

Luís Leitão, Pável Calado, Melanie Weis: Structure-based inference of XML similarity for fuzzy duplicate detection. 293-302
Carina F. Dorneles, Carlos A. Heuser, Viviane Moreira Orengo, Altigran Soares da Silva, Edleno Silva de Moura: A strategy for allowing meaningful and comparable scores in approximate matching. 303-312
Miscellaneous (IR)
Gordon V. Cormack, José María Gómez Hidalgo, Enrique Puertas Sanz: Spam filtering for short messages. 313-320
Aris Anagnostopoulos, Andrei Z. Broder, Evgeniy Gabrilovich, Vanja Josifovski, Lance Riedel: Just-in-time contextual advertising. 331-340
Query expansion (IR)

Guihong Cao, Jianfeng Gao, Jian-Yun Nie, Jing Bai: Extending query translation to cross-language query expansion with Markov chain models. 351-360
Rong Yan, Alexander G. Hauptmann: Query expansion using probabilistic local feedback with application to multimedia retrieval. 361-370
Query processing (DB)

Manuel Tamashiro, Alex Thomo, Srinivasan Venkatesh: Towards practically feasible answering of regular path queries in LAV data integration. 381-390
Tao-Young Fu, Wen-Chih Peng, Wang-Chien Lee: Optimizing parallel itineraries for kNN query processing in wireless sensor networks. 391-400
Classification and clustering II (KM)
Jing Jiang, ChengXiang Zhai: A two-stage approach to domain adaptation for statistical classifiers. 401-410
Mounir Bechchi, Guillaume Raschia, Noureddine Mouaddib: Merging distributed database summaries. 419-428
Semantic IR (IR)
Susan Price, Marianne Lykke Nielsen, Lois M. L. Delcambre, Peter Vedsted: Semantic components enhance retrieval of domain-specific documents. 429-438
Trong-Ton Pham, Nicolas Maillot, Joo-Hwee Lim, Jean-Pierre Chevallet: Latent semantic fusion model for image retrieval and annotation. 439-444
David N. Milne, Ian H. Witten, David M. Nichols: A knowledge-based search engine powered by Wikipedia. 445-454
OLAP and multi-dimensional databases (DB)


Todd Eavis, Alex Lopez: Rk-hist: an R-tree based histogram for multi-dimensional selectivity estimation. 475-484
Information extraction, conceptual clustering, and prioritization (KM)
Marius Pasca, Benjamin Van Durme, Nikesh Garera: The role of documents vs. queries in extracting class attributes from text. 485-494
Jean-Pierre Chevallet, Joo-Hwee Lim, Diem Thi Hoang Le: Domain knowledge conceptual inter-media indexing: application to multilingual multimedia medical reports. 495-504
Jennifer Chu-Carroll, John M. Prager: An experimental study of the impact of information extraction accuracy on semantic search performance. 505-514
Lida Li, Michael J. Muller, Werner Geyer, Casey Dugan, Beth Brownholtz, David R. Millen: Predicting individual priorities of shared activities using support vector machines. 515-524
Web retrieval II (IR)
Ahu Sieg, Bamshad Mobasher, Robin D. Burke: Web search personalization with ontological user profiles. 525-534
Qingzhao Tan, Prasenjit Mitra, C. Lee Giles: Designing clustering-based web crawling policies for search engine crawlers. 535-544
Laurence A. F. Park, Kotagiri Ramamohanarao: Mining web multi-resolution community-based popularity for information retrieval. 545-554
Changhu Wang, Feng Jing, Lei Zhang, Hong-Jiang Zhang: Learning query-biased web page summarization. 555-562
Graph based retrieval (IR)
Monique V. Vieira, Bruno M. Fonseca, Rodrigo Damazio, Paulo Braz Golgher, Davi de Castro Reis, Berthier A. Ribeiro-Neto: Efficient search ranking in social networks. 563-572
Norbert Martínez-Bazan, Victor Muntés-Mulero, Sergio Gómez-Villamor, Jordi Nin, Mario-A. Sánchez-Martínez, Josep-Lluis Larriba-Pey: Dex: high-performance exploration on large graphs for information retrieval. 573-582
Data exploration and discovery (KM)

Di Yang, Elke A. Rundensteiner, Matthew O. Ward: Nugget discovery in visual exploration environments by query consolidation. 603-612
Kevin J. Lang, Reid Andersen: Finding dense and isolated submarkets in a sponsored search spending graph. 613-622
IR evaluation (IR)
Mark D. Smucker, James Allan, Ben Carterette: A comparison of statistical significance tests for information retrieval evaluation. 623-632

Performance issues (DB)

Weixiong Rao, Lei Chen, Ada Wai-Chee Fu, Yingyi Bu: Optimal proactive caching in peer-to-peer network: analysis and application. 663-672
Sourav S. Bhowmick, Erwin Leonardi, Hongmei Sun: Efficient evaluation of high-selective XML twig patterns with parent child edges in tree-unaware RDBMS. 673-682
Information representation and integration (KM)
Marius Pasca: Weakly-supervised discovery of named entities using web search queries. 683-690
Vadim von Brzeski, Utku Irmak, Reiner Kraft: Leveraging context in user-centric entity detection systems. 691-700
John M. Prager, Sarah Luger, Jennifer Chu-Carroll: Type nanotheories: a framework for term comparison. 701-710
Natural language II (IR)
Wei Zhang, Shuang Liu, Clement T. Yu, Chaojing Sun, Fang Liu, Weiyi Meng: Recognition and classification of noun phrases in queries for effective retrieval. 711-720
Abhimanyu Lad, Yiming Yang: Generalizing from relevance feedback using named entity wildcards. 721-730
Desislava Petkova, W. Bruce Croft: Proximity-based document representation for named entity retrieval. 731-740
Indexing (IR)
Deng Cai, Xiaofei He, Wei Vivian Zhang, Jiawei Han: Regularized locality preserving indexing via spectral regression. 741-750
Ruijie Guo, Xueqi Cheng, Hongbo Xu, Bin Wang: Efficient on-line index maintenance for dynamic text collections by using dynamic balancing tree. 751-760
Stefan Büttcher, Charles L. A. Clarke: Index compression is good, especially for random access. 761-770
Mingjie Zhu, Shuming Shi, Mingjing Li, Ji-Rong Wen: Effective top-k computation in retrieving structured documents with term-proximity support. 771-780
Data mining (KM)
Natural language III (IR)
Poster session
Ismail Sengör Altingövde, Rifat Ozcan, Suleyman Cetintas, Hakan Yilmaz, Özgür Ulusoy: An automatic approach to construct domain-specific web portals. 849-852
Richard Bache, Mark Baillie, Fabio Crestani: Language models, probability of relevance and relevance likelihood. 853-856
Holger Bast, Debapriyo Majumdar, Ingmar Weber: Efficient interactive query expansion with complete search. 857-860
Stephan Bloehdorn, Alessandro Moschitti: Structure and semantics for expressive text kernels. 861-864
Karin Koogan Breitman, Simone Diniz Junqueira Barbosa, Marco A. Casanova, Antonio L. Furtado: Conceptual modeling by analogy and metaphor. 865-868
Ben Carterette, James Allan: Semiautomatic evaluation of retrieval systems using document similarities. 873-876
Blaz Fortuna, Eduarda Mendes Rodrigues, Natasa Milic-Frayling: Improving the classification of newsgroup messages through social network analysis. 877-880
Yupeng Fu, Rongjing Xiang, Yiqun Liu, Min Zhang, Shaoping Ma: A CDD-based formal model for expert finding. 881-884
Vahid Garakani, Sayyed Kamyar Izadi, Mostafa Haghjoo, Mohammad Harizi: Ntjfsatnot: a novel method for query with not-predicates on XML data. 885-888
Jade Goldstein, Gary M. Ciany, Jaime G. Carbonell: Genre identification and goal-focused summarization. 889-892
Lobna Hlaoua, Mohand Boughanem, Karen Pinel-Sauvagnat: Combination of evidences in relevance feedback for XML retrieval. 893-896
Michael E. Houle, Nizar Grira: A correlation-based model for unsupervised feature selection. 897-900
Meishan Hu, Aixin Sun, Ee-Peng Lim: Comments-oriented blog summarization by sentence extraction. 901-904
Fianny Ming-fei Jiang, Jian Pei, Ada Wai-Chee Fu: Ix-cubes: iceberg cubes for data warehousing and OLAP on XML data. 905-908
Rosie Jones, Ravi Kumar, Bo Pang, Andrew Tomkins: "I know what you did last summer": query logs and user privacy. 909-914
Pawel Jurczyk, Eugene Agichtein: Discovering authorities in question answer communities by using link analysis. 919-922
Feng Liu, Fengzhan Tian, QiLiang Zhu: Ensembling Bayesian network structure learning on limited data. 927-930
Jianming Lv, Xueqi Cheng: CTO: concept tree based semantic overlay for pure peer-to-peer information retrieval. 931-934
Galileo Namata, Brian Staats, Lise Getoor, Ben Shneiderman: A dual-view approach to interactive network visualization. 939-942
Jorge-Arnulfo Quiané-Ruiz, Philippe Lamarre, Sylvie Cazalens, Patrick Valduriez: Satisfaction balanced mediation. 947-950
Aravindan Raghuveer, Meera Jindal, Mohamed F. Mokbel, Biplob K. Debnath, David Hung-Chang Du: Towards efficient search on unstructured data: an intelligent-storage approach. 951-954
Guillem Rull, Carles Farré, Ernest Teniente, Toni Urpí: Computing explanations for unlively queries in databases. 955-958
Luís Sarmento, Valentin Jijkoun, Maarten de Rijke, Eugenio Oliveira: "More like these": growing entity classes from seeds. 959-962
Se Jung Shin, Won Suk Lee: An on-line interactive method for finding association rules data streams. 963-966
Shaoxu Song, Lei Chen: Probabilistic correlation-based similarity measure of unstructured records. 967-970
Xiaodan Song, Yun Chi, Koji Hino, Belle L. Tseng: Identifying opinion leaders in the blogosphere. 971-974
Yangqiu Song, Bin Zhang, Wen Jun Yin, Changshui Zhang, Jin Dong: Ranking with semi-supervised distance metric learning and its application to housing potential estimation. 975-978
Songbo Tan, Gaowei Wu, Huifeng Tang, Xueqi Cheng: A novel scheme for domain-transfer problem in the context of sentiment analysis. 979-982

Xuanhui Wang, Hui Fang, ChengXiang Zhai: Improve retrieval accuracy for difficult queries using negative feedback. 991-994
Youzheng Wu, Xinhui Hu, Hideki Kashioka: Mining redundancy in candidate-bearing snippets to improve web question answering. 999-1002
Shengliang Xu, Shenghua Bao, Yunbo Cao, Yong Yu: Using social annotations to improve language model for information retrieval. 1003-1006
Lei Yang, Lei Qi, Yan-Ping Zhao, Bin Gao, Tie-Yan Liu: Link analysis using time series of web graphs. 1011-1014
Hugo Zaragoza, Henning Rode, Peter Mika, Jordi Atserias, Massimiliano Ciaramita, Giuseppe Attardi: Ranking very many typed entities on Wikipedia. 1015-1018
Duo Zhang, Jie Tang, Juan-Zi Li, Kehong Wang: A constraint-based probabilistic framework for name disambiguation. 1019-1022
Xiaohua Zhou, Xiaohua Hu, Xiaodan Zhang, Xiajiong Shen: A segment-based hidden Markov model for real-setting pinyin-to-Chinese conversion. 1027-1030
Prabhakar Raghavan: Web search: from information retrieval to microeconomic modeling. 1-2
Fernando Pereira: Learning to join everything. 9-10



