This paper was presented at the ACM First International Workshop on Data Warehousing and OLAP (DOLAP'98), held in Bethesda, Maryland. It is the first paper I have seen that attempts to define a complete methodology for the logical design of data warehouses. It defines a Dimensional Fact Model (DFM) that conceptualizes data warehouse (DW) components in a consistent way and gives a clear visualization of the data model, covering the various levels of aggregate data as well as the base data. The requirements analysis and specification phase falls back on basic relational and ER modeling; however, the DFM, which drives the subsequent steps, is specific to DWs and is easy to conceptualize. The central element is a fact table such as "sales", with attributes like quantity sold, revenue, and number of customers. As in the star schema, dimensions are shown as branches from the fact table, and dimension attributes like store, date, and product are shown as circles on the graph, with aggregates connected to them via arcs. Non-dimension attributes are shown alongside the dimensional attributes, but away from the major arcs in the graph. In this manner entire hierarchies of dimensional attributes can be shown easily. Further refinements combine the scheme with a workload expressed as queries, leading to a proposed methodology that takes a dimensional scheme, a workload of queries, and update information, and produces a DW scheme that minimizes query response time within a disk-space constraint. This mix of logical and physical design has not yet been fully demonstrated, but it gives other researchers plenty of room to define and develop their own approaches to view materialization and table partitioning. An expanded version of the paper appears in the Journal of Computer Science and Information Management 2,3 (1998), and the paper URL is
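To make the structure described above concrete, here is a minimal Python sketch of a "sales" fact with measures, dimensions, and roll-up hierarchies. It is an illustration under stated assumptions, not the paper's notation or algorithm: the class names `Dimension` and `FactScheme`, the helper `roll_up`, the hierarchy levels (city, month, category), and the sample rows are all hypothetical. The "number of customers" measure is left out on the assumption that a customer count cannot simply be summed across every dimension.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Dimension:
    """A dimension and its hierarchy of attributes, finest level first."""
    name: str
    levels: list[str]  # hypothetical levels, e.g. ["store", "city"]


@dataclass
class FactScheme:
    """A fact (such as "sales") with its measures and dimensions."""
    name: str
    measures: list[str]
    dimensions: list[Dimension] = field(default_factory=list)


def roll_up(rows, group_by, measures):
    """Sum additive measures over base rows grouped by the given attributes."""
    totals = defaultdict(lambda: dict.fromkeys(measures, 0))
    for row in rows:
        key = tuple(row[attr] for attr in group_by)
        for m in measures:
            totals[key][m] += row[m]
    return dict(totals)


# Assumed scheme: three dimensions, each with one coarser level above it.
sales = FactScheme(
    name="sales",
    measures=["quantity", "revenue"],
    dimensions=[
        Dimension("store", ["store", "city"]),
        Dimension("date", ["date", "month"]),
        Dimension("product", ["product", "category"]),
    ],
)

# Made-up base data, already carrying the coarser hierarchy attributes.
rows = [
    {"store": "S1", "city": "Rome", "date": "1998-11-02", "month": "1998-11",
     "product": "P1", "category": "food", "quantity": 3, "revenue": 30},
    {"store": "S2", "city": "Rome", "date": "1998-11-03", "month": "1998-11",
     "product": "P2", "category": "food", "quantity": 2, "revenue": 20},
    {"store": "S3", "city": "Milan", "date": "1998-11-02", "month": "1998-11",
     "product": "P1", "category": "food", "quantity": 1, "revenue": 10},
]

# One aggregate level: roll store up to city and date up to month.
by_city_month = roll_up(rows, ["city", "month"], sales.measures)
```

Each choice of one level per dimension corresponds to one candidate aggregate, which is why a scheme with deep hierarchies yields many possible views to weigh against the disk constraint.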