Speeding up materialized view selection in data warehouses using a randomized algorithm

Minsoo Lee, Joachim Hammer

Research output: Contribution to journal › Article › peer-review

75 Scopus citations

Abstract

A data warehouse stores information that is collected from multiple, heterogeneous information sources for the purpose of complex querying and analysis. Information in the warehouse is typically stored in the form of materialized views, which represent pre-computed portions of frequently asked queries. One of the most important tasks when designing a warehouse is the selection of materialized views to be maintained in the warehouse. The goal is to select a set of views in such a way as to minimize the total query response time over all queries, given a limited amount of time for maintaining the views (maintenance-cost view selection problem). In this paper, we propose an efficient solution to the maintenance-cost view selection problem using a genetic algorithm for computing a near-optimal set of views. Specifically, we explore the maintenance-cost view selection problem in the context of OR view graphs. We show that our approach represents a dramatic improvement in time complexity over existing search-based approaches using heuristics. Our analysis shows that the algorithm consistently yields a solution that lies within 10% of the optimal query benefit while at the same time exhibiting only a linear increase in execution time. We have implemented a prototype version of our algorithm which is used to simulate the measurements used in the analysis of our approach.
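As an illustration of the general idea only, the following is a minimal sketch of a genetic algorithm for view selection under a maintenance-cost budget. It is not the authors' implementation. The cost model is a simplifying assumption: each query is answered by its cheapest available alternative (OR semantics), and maintenance cost is treated as a fixed per-view sum. The penalty-based fitness and all names (`query_cost`, `maintenance_cost`, `fitness`, `select_views`) are hypothetical.

```python
import random

def query_cost(selected, queries):
    """Frequency-weighted cost of answering all queries.

    Each query is (frequency, base_cost, {view_index: cost_via_view}).
    """
    total = 0.0
    for freq, base_cost, via_view in queries:
        # OR semantics: use the cheapest alternative that is available.
        options = [base_cost] + [c for v, c in via_view.items() if selected[v]]
        total += freq * min(options)
    return total

def maintenance_cost(selected, maint):
    """Assumed model: maintenance cost is a plain sum over selected views."""
    return sum(m for bit, m in zip(selected, maint) if bit)

def fitness(selected, queries, maint, budget):
    """Query benefit if within budget, otherwise a negative penalty."""
    excess = maintenance_cost(selected, maint) - budget
    if excess > 0:
        return -excess
    empty = [0] * len(maint)
    return query_cost(empty, queries) - query_cost(selected, queries)

def select_views(queries, maint, budget, pop_size=30, generations=60,
                 p_mut=0.05, seed=0):
    """Search for a good bit vector of views; needs at least two views."""
    rng = random.Random(seed)
    n = len(maint)

    def score(ind):
        return fitness(ind, queries, maint, budget)

    # Seed the population with the empty selection, which is always feasible.
    pop = [[0] * n]
    pop += [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = [pop[0][:]]  # elitism: carry the best individual over unchanged
        while len(nxt) < pop_size:
            # Binary tournament selection for each parent.
            a = max(rng.sample(pop, 2), key=score)
            b = max(rng.sample(pop, 2), key=score)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=score)
```

Because the empty selection is in the initial population and the best individual is always carried over, the returned selection never exceeds the budget. A hypothetical call would look like `select_views(queries, maint=[4, 3, 5, 2], budget=7)`, where `queries` is a list of `(frequency, base_cost, {view_index: cost})` tuples.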

Original language: English
Pages (from-to): 327-353
Number of pages: 27
Journal: International Journal of Cooperative Information Systems
Volume: 10
Issue number: 3
State: Published - Sep 2001

Keywords

  • Data warehouse
  • Genetic algorithm
  • View maintenance
  • View materialization
  • View selection
  • Warehouse configuration
