ACM SIGMOD Conference 2011: Athens, Greece
Timos K. Sellis, Renée J. Miller, Anastasios Kementsietsidis, Yannis Velegrakis (Eds.): Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece, June 12-16, 2011. ACM 2011 ISBN 978-1-4503-0661-4
Databases on new hardware
Dongzhe Ma, Jianhua Feng, Guoliang Li: LazyFTL: a page-level flash translation layer optimized for NAND flash memory. 1-12
Yanfei Lv, Bin Cui, Bingsheng He, Xuexuan Chen: Operation-aware buffer management in flash-based systems. 13-24
Biplob K. Debnath, Sudipta Sengupta, Jin Li: SkimpyStash: RAM space skimpy key-value store on flash-based storage. 25-36
Spyros Blanas, Yinan Li, Jignesh M. Patel: Design and evaluation of main memory hash join algorithms for multi-core CPUs. 37-48
Query processing and optimization
Herodotos Herodotou, Nedyalko Borisov, Shivnath Babu: Query optimization techniques for partitioned tables. 49-60
Michael J. Franklin, Donald Kossmann, Tim Kraska, Sukriti Ramesh, Reynold Xin: CrowdDB: answering queries with crowdsourcing. 61-72
Henning Köhler, Jing Yang, Xiaofang Zhou: Efficient parallel skyline processing using hyperplane projections. 85-96
Schema mapping and data integration

Meihui Zhang, Marios Hadjieleftheriou, Beng Chin Ooi, Cecilia M. Procopiuc, Divesh Srivastava: Automatic discovery of attributes in relational databases. 109-120
Hazem Elmeleegy, Ahmed K. Elmagarmid, Jaewoo Lee: Leveraging query logs for schema mapping generation in U-MAP. 121-132
Bogdan Alexe, Balder ten Cate, Phokion G. Kolaitis, Wang Chiew Tan: Designing and refining schema mappings via data examples. 133-144
Data on the web
Songyun Duan, Anastasios Kementsietsidis, Kavitha Srinivas, Octavian Udrea: Apples and oranges: a comparison of RDF benchmarks and real RDF datasets. 145-156
Jeffrey Pound, Stelios Paparizos, Panayiotis Tsaparas: Facet discovery for structured web search: a query-log mining approach. 169-180
Meiyu Lu, Divyakant Agrawal, Bing Tian Dai, Anthony K. H. Tung: Schema-as-you-go: on probabilistic tagging and querying of wide tables. 181-192
Data privacy and security

Sumeet Bajaj, Radu Sion: TrustedDB: a trusted hardware based database with privacy and data confidentiality. 205-216
Bolin Ding, Marianne Winslett, Jiawei Han, Zhenhui Li: Differentially private data cubes: optimizing noise sources and consistency. 217-228
Xiaokui Xiao, Gabriel Bender, Michael Hay, Johannes Gehrke: iReduct: differential privacy with reduced relative errors. 229-240
Data consistency and parallel DB
Prasang Upadhyaya, YongChul Kwon, Magdalena Balazinska: A latency and fault-tolerance optimizer for online parallel query plans. 241-252
Emad Soroush, Magdalena Balazinska, Daniel L. Wang: ArrayStore: a storage manager for complex parallel array processing. 253-264
Tuan Cao, Marcos Antonio Vaz Salles, Benjamin Sowell, Yao Yue, Alan J. Demers, Johannes Gehrke, Walker M. White: Fast checkpoint recovery algorithms for frequently consistent applications. 265-276
Nedyalko Borisov, Shivnath Babu, NagaPramod Mandagere, Sandeep Uttamchandani: Warding off the dangers of data corruption with Amulet. 277-288
Service oriented computing, data management in the cloud
Herald Kllapi, Eva Sitaridi, Manolis M. Tsangaris, Yannis E. Ioannidis: Schedule optimization for data processing flows on the cloud. 289-300
Aaron J. Elmore, Sudipto Das, Divyakant Agrawal, Amr El Abbadi: Zephyr: live migration in shared nothing databases for elastic cloud platforms. 301-312
Carlo Curino, Evan P. C. Jones, Samuel Madden, Hari Balakrishnan: Workload-aware database monitoring and consolidation. 313-324
Verena Kantere, Debabrata Dash, Georgios Gratsias, Anastasia Ailamaki: Predicting cost amortization for query services. 325-336
Jennie Duggan, Ugur Çetintemel, Olga Papaemmanouil, Eli Upfal: Performance prediction for concurrent database workloads. 337-348
Spatial and temporal data management

Senjuti Basu Roy, Kaushik Chakrabarti: Location-aware type ahead search on spatial databases: semantics and efficiency. 361-372

Elio Damaggio, Alin Deutsch, Dayou Zhou: Querying contract databases based on temporal behavior. 397-408
Shortest paths and sequence data
Jun Gao, Jeffrey Xu Yu, Ruoming Jin, Jiashuai Zhou, Tengjiao Wang, Dongqing Yang: Neighborhood-privacy protected shortest distance computing in cloud. 409-420

Yinan Li, Allison Terrell, Jignesh M. Patel: WHAM: a high-throughput sequence alignment method. 445-456
Wook-Shin Han, Jinsoo Lee, Yang-Sae Moon, Seung-won Hwang, Hwanjo Yu: A new approach for processing ranked subsequence matching based on ranked union. 457-468
Data provenance, workflow and cleaning
Wenfei Fan, Jianzhong Li, Shuai Ma, Nan Tang, Wenyuan Yu: Interaction between record matching and data repairing. 469-480
Su Chen, Xin Luna Dong, Laks V. S. Lakshmanan, Divesh Srivastava: We challenge you to certify your updates. 481-492
Zhuowei Bao, Susan B. Davidson, Tova Milo: Labeling recursive workflow executions on-the-fly. 493-504
Alexandra Meliou, Wolfgang Gatterbauer, Suman Nath, Dan Suciu: Tracing data errors with view-conditioned causality. 505-516
Information extraction
Daisy Zhe Wang, Michael J. Franklin, Minos N. Garofalakis, Joseph M. Hellerstein, Michael L. Wick: Hybrid in-database inference for declarative information extraction. 517-528
Guoliang Li, Dong Deng, Jianhua Feng: Faerie: efficient filtering algorithms for approximate dictionary-based entity extraction. 529-540
Eli Cortez, Daniel Oliveira, Altigran Soares da Silva, Edleno Silva de Moura, Alberto H. F. Laender: Joint unsupervised structure discovery and information extraction. 541-552
Keyword search and ranked queries
Sonia Bergamaschi, Elton Domnori, Francesco Guerra, Raquel Trillo Lado, Yannis Velegrakis: Keyword search over relational databases: a metadata approach. 565-576
Yufei Tao, Stavros Papadopoulos, Cheng Sheng, Kostas Stefanidis: Nearest keyword search in XML documents. 589-600
Stream and complex event processing
Kyumars Sheykh Esmaili, Tahmineh Sanamrad, Peter M. Fischer, Nesime Tatbul: Changing flights in mid-air: a model for safely modifying continuous queries. 613-624
Mohammad Sadoghi, Hans-Arno Jacobsen: BE-tree: an index structure to efficiently match boolean expressions over high-dimensional discrete space. 637-648
Chun Chen, Feng Li, Beng Chin Ooi, Sai Wu: TI: an efficient indexing mechanism for real-time search on tweets. 649-660
Query processing
K. Tuncay Tekle, Yanhong A. Liu: More efficient Datalog queries: subsumptive tabling beats magic sets. 661-672
Nitin Gupta, Lucja Kot, Sudip Roy, Gabriel Bender, Johannes Gehrke, Christoph Koch: Entangled queries: enabling declarative data-driven coordination. 673-684

Data mining
Hwanjo Yu, Ilhwan Ko, Youngdae Kim, Seung-won Hwang, Wook-Shin Han: Exact indexing for support vector machines. 709-720
Venu Satuluri, Srinivasan Parthasarathy, Yiye Ruan: Local graph sparsification for scalable clustering. 721-732
Francesco Gullo, Carlotta Domeniconi, Andrea Tagarelli: Advancing data clustering via projective clustering ensembles. 733-744
Zengfeng Huang, Lu Wang, Ke Yi, Yunhao Liu: Sampling based algorithms for quantile computation in sensor networks. 745-756
Information retrieval

Nathan Bales, Alin Deutsch, Vasilis Vassalos: Score-consistent algebraic optimization of full-text search queries with GRAFT. 769-780
Mingyang Zhang, Nan Zhang, Gautam Das: Mining a search engine's corpus: efficient yet unbiased sampling and aggregate estimation. 793-804
Probabilistic and uncertain databases
Mohamed A. Soliman, Ihab F. Ilyas, Davide Martinenghi, Marco Tagliasacchi: Ranking with uncertain scoring functions: semantics and sensitivity measures. 805-816
Mohan Yang, Haixun Wang, Haiquan Chen, Wei-Shinn Ku: Querying uncertain data with aggregate constraints. 817-828
Bhargav Kanagal, Jian Li, Amol Deshpande: Sensitivity analysis and explanations for robust query evaluation in probabilistic databases. 841-852
OLAP and its applications
Peixiang Zhao, Xiaolei Li, Dong Xin, Jiawei Han: Graph cube: on warehousing and OLAP multidimensional networks. 853-864
Manos Athanassoulis, Shimin Chen, Anastasia Ailamaki, Phillip B. Gibbons, Radu Stoica: MaSM: efficient online updates in data warehouses. 865-876
Mo Liu, Elke A. Rundensteiner, Kara Greenfield, Chetan Gupta, Song Wang, Ismail Ari, Abhay Mehta: E-Cube: multi-dimensional event sequence analysis using hierarchical pattern query sharing. 889-900
Graph management
Arijit Khan, Nan Li, Xifeng Yan, Ziyu Guan, Supriyo Chakraborty, Shu Tao: Neighborhood based fast graph search in large networks. 901-912
Sebastiaan J. van Schaik, Oege de Moor: A memory efficient reachability data structure through bit vector compression. 913-924
Wenfei Fan, Jianzhong Li, Jizhou Luo, Zijing Tan, Xin Wang, Yinghui Wu: Incremental graph pattern matching. 925-936
Ziyu Guan, Jian Wu, Qing Zhang, Ambuj K. Singh, Xifeng Yan: Assessing and ranking structural correlations in graphs. 937-948
Scalable data analytics

Yuting Lin, Divyakant Agrawal, Chun Chen, Beng Chin Ooi, Sai Wu: Llama: leveraging columnar storage for scalable join processing in the MapReduce framework. 961-972
Boduo Li, Edward Mazur, Yanlei Diao, Andrew McGregor, Prashant J. Shenoy: A platform for scalable one-pass analytics using MapReduce. 985-996
Similarity search and queries
Jiaqi Zhai, Yin Lou, Johannes Gehrke: ATLAS: a probabilistic algorithm for high dimensional similarity search. 997-1008
Zi Huang, Heng Tao Shen, Jiajun Liu, Xiaofang Zhou: Effective data co-reduction for multimedia similarity search. 1021-1032
Jianbin Qin, Wei Wang, Yifei Lu, Chuan Xiao, Xuemin Lin: Efficient exact edit similarity query processing with the asymmetric signature scheme. 1033-1044
Keynote address
Anastasia Ailamaki: Managing scientific data: lessons, challenges, and opportunities. 1045-1046
James R. Hamilton: Internet scale storage. 1047-1048
Industrial papers
Malú Castellanos, Umeshwar Dayal, Meichun Hsu, Riddhiman Ghosh, Mohamed Dekhil, Yue Lu, Lei Zhang, Mark Schreiman: LCI: a social channel analysis platform for live customer intelligence. 1049-1058
Vladislav Shkapenyuk, Theodore Johnson, Divesh Srivastava: Bistro data feed management system. 1059-1070
Dhruba Borthakur, Jonathan Gray, Joydeep Sen Sarma, Kannan Muthukkaruppan, Nicolas Spiegelberg, Hairong Kuang, Karthik Ranganathan, Dmytro Molkov, Aravind Menon, Samuel Rash, Rodrigo Schmidt, Amitanand S. Aiyer: Apache Hadoop goes realtime at Facebook. 1071-1080
Christopher Olston, Greg Chiou, Laukik Chitnis, Francis Liu, Yiping Han, Mattias Larsson, Andreas Neumann, Vellanki B. N. Rao, Vijayanand Sankarasubramanian, Siddharth Seth, Chao Tian, Topher ZiCornell, Xiaodan Wang: Nova: continuous Pig/Hadoop workflows. 1081-1090
Yu Xu, Pekka Kostamaa, Yan Qi, Jian Wen, Kevin Keliang Zhao: A Hadoop based distributed loading approach to parallel data warehouses. 1091-1100
Adam Silberstein, Russell Sears, Wenchao Zhou, Brian F. Cooper: A batch of PNUTS: experiences connecting cloud batch and serving systems. 1101-1112
Jaeyoung Do, Donghui Zhang, Jignesh M. Patel, David J. DeWitt, Jeffrey F. Naughton, Alan Halverson: Turbocharging DBMS buffer pool using SSDs. 1113-1124
Rimma V. Nehme, Nicolas Bruno: Automated partitioning design in parallel database systems. 1137-1148
Krishna Kunchithapadam, Wei Zhang, Amit Ganesh, Niloy Mukherjee: Oracle database filesystem. 1149-1160
Fatma Özcan, David Hoa, Kevin S. Beyer, Andrey Balmin, Chuan Jie Liu, Yu Li: Emerging trends in the enterprise data analytics: connecting Hadoop and DB2 warehouse. 1161-1164
Kamil Bajda-Pawlikowski, Daniel J. Abadi, Avi Silberschatz, Erik Paulson: Efficient processing of data warehousing queries in a split execution environment. 1165-1176
Per-Åke Larson, Cipri Clinciu, Eric N. Hanson, Artem Oks, Susan L. Price, Srikumar Rangarajan, Aleksandras Surna, Qingqing Zhou: SQL Server column store indexes. 1177-1184
Richard Wesley, Matthew Eldridge, Pawel Terlecki: An analytic data engine for visualization in Tableau. 1185-1194
Tutorials


Michael Hay, Kun Liu, Gerome Miklau, Jian Pei, Evimaria Terzi: Privacy-aware data management in information networks. 1201-1204

Shan Shan Huang, Todd Jeffrey Green, Boon Thau Loo: Datalog and emerging applications: an interactive tutorial. 1213-1216
Demonstrations on systems and performance
Carlos Ordonez, Sasi K. Pitchaimalai: One-pass data mining algorithms in a DBMS with UDFs. 1217-1220
Christopher Olston, Benjamin Reed: Inspector gadget: a framework for custom monitoring and debugging of distributed dataflows. 1221-1224
Jorge-Arnulfo Quiané-Ruiz, Christoph Pinkel, Jörg Schad, Jens Dittrich: RAFT at work: speeding-up MapReduce applications under task and node failures. 1225-1228
Tuan Cao, Benjamin Sowell, Marcos Antonio Vaz Salles, Alan J. Demers, Johannes Gehrke: BRRL: a recovery library for main-memory applications in the cloud. 1233-1236
Ippokratis Pandis, Pinar Tözün, Miguel Branco, Dimitris Karampinas, Danica Porobic, Ryan Johnson, Anastasia Ailamaki: A data-oriented transaction execution engine and supporting tools. 1237-1240
Wook-Shin Han, Minh-Duc Pham, Jinsoo Lee, Romans Kasperovics, Jeffrey Xu Yu: iGraph in action: performance analysis of disk-based graph indexing techniques. 1241-1242
Badrish Chandramouli, Justin J. Levandoski, Ahmed Eldawy, Mohamed F. Mokbel: StreamRec: a real-time recommender system. 1243-1246
Demonstrations on ranking, the web, and social media
Julia Stoyanovich, Mayur Lodha, William Mee, Kenneth A. Ross: SkylineSearch: semantic ranking and result visualization for PubMed. 1247-1250
Gang Chen, Chen Liu, Meiyu Lu, Beng Chin Ooi, Shanshan Ying, Anthony K. H. Tung, Dongxiang Zhang, Meihui Zhang: A cross-service travel engine for trip planning. 1251-1254
Tim Weninger, Marina Danilevsky, Fabio Fumarola, Joshua M. Hailpern, Jiawei Han, Thomas J. Johnston, Surya Kallumadi, Hyungsul Kim, Zhijin Li, David McCloskey, Yizhou Sun, Nathan E. TeGrotenhuis, Chi Wang, Xiao Yu: WINACS: construction and analysis of web-based computer science information networks. 1255-1258
Adam Marcus, Michael S. Bernstein, Osama Badar, David R. Karger, Samuel Madden, Robert C. Miller: Tweets as data: demonstration of TweeQL and Twitinfo. 1259-1262
Xin Jin, Aditya Mone, Nan Zhang, Gautam Das: MOBIES: mobile-interface enhancement service for hidden web database. 1263-1266
Alessandro Bozzon, Daniele Braga, Marco Brambilla, Stefano Ceri, Francesco Corcoglioniti, Piero Fraternali, Salvatore Vadacca: Search computing: multi-domain search on ranked data. 1267-1270
Foteini Alvanaki, Sebastian Michel, Krithi Ramamritham, Gerhard Weikum: EnBlogue: emergent topic detection in web 2.0 streams. 1271-1274
Ilias N. Flaounas, Omar Ali, Marco Turchi, Tristan Snowsill, Florent Nicart, Tijl De Bie, Nello Cristianini: NOAM: news outlets analysis and monitoring system. 1275-1278
Demonstrations on data integration and probabilistic databases
Cornelia Hedeler, Khalid Belhajjame, Norman W. Paton, Alvaro A. A. Fernandes, Suzanne M. Embury, Lu Mao, Chenjuan Guo: Pay-as-you-go mapping selection in dataspaces. 1279-1282
Haridimos Kondylakis, Dimitris Plexousakis: Exelixis: evolving ontology-based data integration system. 1283-1286
Hazem Elmeleegy, Jaewoo Lee, El Kindi Rezig, Mourad Ouzzani, Ahmed K. Elmagarmid: U-MAP: a system for usage-based schema matching and mapping. 1287-1290
Laura Chiticariu, Vivian Chu, Sajib Dasgupta, Thilo W. Goetz, Howard Ho, Rajasekar Krishnamurthy, Alexander Lang, Yunyao Li, Bin Liu, Sriram Raghavan, Frederick Reiss, Shivakumar Vaithyanathan, Huaiyu Zhu: The SystemT IDE: an integrated development environment for information extraction rules. 1291-1294
Pierre Senellart, Asma Souihli: ProApproX: a lightweight approximation query processor over probabilistic trees. 1295-1298
Robert Fink, Andrew Hogue, Dan Olteanu, Swaroop Rath: SPROUT2: a squared query engine for uncertain web data. 1299-1302
Oliver Kennedy, Steve Lee, Charles Loboz, Slawek Smyl, Suman Nath: Fuzzy prophet: parameter exploration in uncertain enterprise scenarios. 1303-1306
Ekaterini Ioannou, Wolfgang Nejdl, Claudia Niederée, Yannis Velegrakis: LinkDB: a probabilistic linkage database system. 1307-1310
Demonstrations on user support and development environments
Panayiotis Neophytou, Panos K. Chrysanthis, Alexandros Labrinidis: CONFLuEnCE: CONtinuous workFLow ExeCution Engine. 1311-1314
Adam Marcus, Eugene Wu, David R. Karger, Samuel Madden, Robert C. Miller: Demonstration of Qurk: a query processor for human operators. 1315-1318
Bill Howe, Garrett Cole, Nodira Khoussainova, Leilani Battle: Automatic example queries for ad hoc databases. 1319-1322
Wenchao Zhou, Qiong Fei, Shengzhi Sun, Tao Tao, Andreas Haeberlen, Zachary G. Ives, Boon Thau Loo, Micah Sherr: NetTrails: a declarative platform for maintaining and querying provenance in distributed systems. 1323-1326
Changjiu Jin, Sourav S. Bhowmick, Xiaokui Xiao, Byron Choi, Shuigeng Zhou: GBLENDER: visual subgraph query formulation meets query processing. 1327-1330
Nitin Gupta, Lucja Kot, Gabriel Bender, Sudip Roy, Johannes Gehrke, Christoph Koch: Coordination through querying in the Youtopia system. 1331-1334
Peter Buneman, James Cheney, Sam Lindley, Heiko Müller: DBWiki: a structured wiki for curated data and collaborative data management. 1335-1338
Abhijith Kashyap, Michalis Petropoulos: Rapid development of web-based query interfaces for XML datasets with QURSED. 1339-1342



