dblp.uni-trier.de www.dagstuhl.de www.uni-trier.de

Accessing Data Cubes along Complex Dimensions.

Yuping Yang, Mukesh Singhal: Accessing Data Cubes along Complex Dimensions. DOLAP 1999: 73-78
@inproceedings{DBLP:conf/dolap/YangS99,
  author    = {Yuping Yang and
               Mukesh Singhal},
  title     = {Accessing Data Cubes along Complex Dimensions},
  booktitle = {DOLAP '99, ACM Second International Workshop on Data Warehousing
               and OLAP, November 6, 1999, Kansas City, Missouri, USA, Proceedings},
  publisher = {ACM},
  year      = {1999},
  pages     = {73-78},
  ee        = {db/conf/dolap/YangS99.html, http://doi.acm.org/10.1145/319757.319794},
  crossref  = {DBLP:conf/dolap/99},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

In a data warehouse, data cubes are accessed through their dimensions. If dimensions are numerical, because numerical data can be clustered or sorted, fast access methods such as binary search or B+ trees can be applied. However, complex attributes such as keyword sets of document contents are not easily sorted or clustered. Although it is highly desirable that documents can be searched through their sets of keywords.

Signature index is known for its ability to search along complex attributes. We propose a new indexing structure, dimensional signature index (DSI), for fast query processing in data cubes. DSI is particularly suitable for accessing data in data cubes through complex dimensions.

Through a mathematical analysis, we found that if one signature index (feature index) is built for each dimension of the data cube, if the size of all feature indices is equal to the size of a large signature index for the entire data cube as a flat file, and if a query execution involves all dimensions of a data cube, the search cost in all these feature indices is the same as the search cost in the large signature index for the entire data cube.

The significance of this discovery is that usually a query does not involve all dimensions of a data cube. By making one feature index for each dimension, only those feature indices involved in the query predicates need to be accessed. On average, this represents significant faster query executions than using a large signature file for the entire data cube.

The use of DSI scheme does not exclude the use of other fast signature index schemes. Each feature index in DSI can also use any of the previously proposed fast signature indices (S-trees, multi-leveled, frame-sliced, etc.) to achieve even faster access speed.

Copyright © 1999 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 2 Issue 4, CIKM, DOLAP, GIS, SIGFIDET, ..." and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

DOLAP '99, ACM Second International Workshop on Data Warehousing and OLAP, November 6, 1999, Kansas City, Missouri, USA, Proceedings. ACM 1999
Contents CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Online Edition

Citation Page

References

[1]
...
[2]
Walter W. Chang, Hans-Jörg Schek: A Signature Access Method for the Starburst Database System. VLDB 1989: 145-153 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[3]
Uwe Deppisch: S-Tree: A Dynamic Balanced Signature Index for Office Retrieval. SIGIR 1986: 77-87 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[4]
Christos Faloutsos, Stavros Christodoulakis: Description and Performance Analysis of Signature File Methods for Office Filing. ACM Trans. Inf. Syst. 5(3): 237-257(1987) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[5]
Christos Faloutsos: Signature-Based Text Retrieval Methods: A Survey. IEEE Data Eng. Bull. 13(1): 25-32(1990) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[6]
Li Fan, Pei Cao, Jussara M. Almeida, Andrei Z. Broder: Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol. SIGCOMM 1998: 254-265 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[7]
Himanshu Gupta, Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Index Selection for OLAP. ICDE 1997: 208-219 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[8]
Himanshu Gupta: Selection of Views to Materialize in a Data Warehouse. ICDT 1997: 98-112 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[9]
Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Implementing Data Cubes Efficiently. SIGMOD Conference 1996: 205-216 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[10]
Yoshiharu Ishikawa, Hiroyuki Kitagawa, Nobuo Ohbo: Evaluation of Signature Files as Set Access Facilities in OODBs. SIGMOD Conference 1993: 247-256 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[11]
Dik Lun Lee, Chun-Wu Roger Leng: Partitioned Signature Files: Design Issues and Performance Evaluation. ACM Trans. Inf. Syst. 7(2): 158-180(1989) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[12]
Dik Lun Lee, Young Man Kim, Gaurav Patel: Efficient Signature File Methods for Text Retrieval. IEEE Trans. Knowl. Data Eng. 7(3): 423-435(1995) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[13]
Zheng Lin, Christos Faloutsos: Frame-Sliced Signature Files. IEEE Trans. Knowl. Data Eng. 4(3): 281-289(1992) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[14]
John L. Pfaltz, William J. Berman, Edgar M. Cagley: Partial-Match Retrieval Using Indexed Descriptor Files. Commun. ACM 23(9): 522-528(1980) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[15]
Charles S. Roberts: Partial-Match Via the Method of Superimposed Codes. Proceedings of the IEEE 67(12): 1624-1642(1979) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[16]
Ron Sacks-Davis, Kotagiri Ramamohanarao: A two level superimposed coding scheme for partial match retrieval. Inf. Syst. 8(4): 273-289(1983) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[17]
Ron Sacks-Davis, Alan J. Kent, Kotagiri Ramamohanarao: Multikey Access Methods Based on Superimposed Coding Techniques. ACM Trans. Database Syst. 12(4): 655-696(1987) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[18]
...
[19]
...
[20]
...
[21]
Yuping Yang, Mukesh Singhal: A New Access Index for Fast Execution of Conjunctive Queries over Text Data. IDEAS 1999: 248-253 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Last update Thu May 24 04:16:29 2012 CET by the DBLP TeamThis material is Open Data Data released under the ODC-BY 1.0 license — See also our legal information page