Review

It is well known that a judicious choice of pre-computed or materialized views can greatly enhance the efficiency of query processing. The potential for using views in data warehouses is even higher than in a typical database application because of the interrelations between the (sequence of) queries posed by an analyst and the ubiquitous uses of aggregated information in On-Line Analytical Processing.

With the use of views to speed up query processing comes the cost of maintaining these views as updates occur to the relations from which the views are computed. Furthermore, due to changes in the query invocation pattern over time, the set of views needed will also vary over time. Thus, two problems need to be solved while using views: view selection and view maintenance. Numerous papers have been written on each of these topics. The contribution of this paper lies in its unified approach to solving these two problems. DynaMat monitors incoming queries and their data access patterns and dynamically selects the views to materialize while being cognizant of the space needed to keep the materialized views and the time necessary to update the views as the base relations change. Maintenance of the selected views can be done using a combination of recomputations and incremental updates.

To keep the maintenance problem separate from the view access and selection problems, DynaMat makes the simplifying assumption that analysts are willing to live with periods of data unavailability, the length of which is specified by the database administrator. Given this, all updates are accumulated between these periods and applied to the views during them. Clearly, the limited length of these periods poses one set of constraints on the view selection problem: the selected views must be updatable within the maintenance period.

When the views are not being updated, the warehouse is available for querying and the DynaMat design uses a directory structure to quickly locate views that may help in the efficient processing of an incoming query. The results of a query are also candidates for inclusion in the View Pool. They get included provided there is space in the pool and the time allocated for view maintenance will not be exceeded by admitting the new query results into the pool.
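The admission rule described above can be illustrated with a small sketch. It models only the two conditions mentioned in this review (available space and the maintenance time budget); the names `Fragment`, `ViewPool`, `try_admit`, and all numbers are hypothetical and are not taken from the paper, and the sketch does not attempt to model any replacement policy DynaMat may apply when the pool is full.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """A stored query result (hypothetical representation)."""
    name: str
    size: float          # space the stored result occupies
    update_cost: float   # estimated time to refresh it during maintenance


@dataclass
class ViewPool:
    space_limit: float     # total space available to the pool
    update_window: float   # length of the maintenance period
    fragments: list = field(default_factory=list)

    def used_space(self) -> float:
        return sum(f.size for f in self.fragments)

    def update_time(self) -> float:
        return sum(f.update_cost for f in self.fragments)

    def try_admit(self, frag: Fragment) -> bool:
        """Admit frag only if both the space and the time budgets still hold."""
        fits_space = self.used_space() + frag.size <= self.space_limit
        fits_time = self.update_time() + frag.update_cost <= self.update_window
        if fits_space and fits_time:
            self.fragments.append(frag)
            return True
        return False


pool = ViewPool(space_limit=100.0, update_window=10.0)
admitted_a = pool.try_admit(Fragment("sales_by_region", 60.0, 4.0))
admitted_b = pool.try_admit(Fragment("sales_by_month", 50.0, 3.0))  # would exceed the space limit
admitted_c = pool.try_admit(Fragment("sales_by_store", 30.0, 7.0))  # would exceed the update window
```

In this sketch the update time of the pool is simply the sum of per-fragment refresh estimates, which is an assumption made for brevity.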

DynaMat confines its attention to

Given these, when a new query is posed, DynaMat determines the best subset of previous MR query results (stored in the View Pool) to answer the query with, keeping in mind the cost model for accessing elements in the pool.

To update the elements of the View Pool efficiently, DynaMat creates an update plan to refresh as many fragments as possible. This implies that some of the cached views may be evicted because there is not enough time to update them.

The reported performance results indicate that even an optimal static view selection algorithm is outperformed by DynaMat, thereby demonstrating the need for dynamic view selection. To compare the performance of DynaMat against a static view selection approach, and also to evaluate the different heuristics for deciding which query result to include in the View Pool, the authors define a measure called the Detailed Cost Saving Ratio which, unlike previous measures, captures the savings when a stored view helps in answering a query (as opposed to being an exact match for a query).
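To make the idea behind such a measure concrete, here is a minimal sketch of one plausible reading: the ratio of total cost saved to total cost without the pool, where a partial hit saves the difference between the two costs. The exact definition is given in the paper; the function name, the input encoding, and the numbers below are assumptions made for illustration only.

```python
def detailed_cost_saving_ratio(queries):
    """queries: list of (base_cost, pooled_cost) pairs.

    base_cost   -- cost of answering the query without any stored view
    pooled_cost -- cost of answering it from the pool, or None on a miss
                   (an exact match is modelled here as pooled_cost == 0)
    """
    total_cost = sum(base for base, _ in queries)
    if total_cost == 0:
        return 0.0
    # A miss saves nothing; a hit saves the difference between the two costs.
    total_saving = sum(
        (base - pooled) if pooled is not None else 0.0
        for base, pooled in queries
    )
    return total_saving / total_cost


workload = [
    (100.0, 0.0),    # exact match: the full cost is saved
    (100.0, 40.0),   # a stored view helps: part of the cost is saved
    (100.0, None),   # miss: nothing is saved
]
ratio = detailed_cost_saving_ratio(workload)
```

A measure that counted only exact matches would credit this workload with one hit in three, whereas the ratio above also credits the partial saving on the second query.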

Given the number of assumptions made regarding the nature of queries and the views maintained, and given that DynaMat is based on periodic updates to the View Pool, one can expect follow-up work that relaxes these assumptions.

