Researchers in machine learning and statistical analysis had developed
many data analysis techniques well before the data mining fever caught on.
In the second phase of data mining research, the focus was to ensure
that data mining algorithms are scalable, i.e., can deal with large volumes
of data. These algorithms assume that the data reside in files.
However, for data mining to be widely applicable, tools and techniques for
data mining must be well integrated with the data-warehousing infrastructure.
Therefore, it is important that implementation of data mining tools be studied
in the context of relational backends. This paper by Sarawagi, Thomas and
Agrawal is one of the first papers that exemplify such a study. The paper
compares several alternative implementations of association rule mining on
traditional as well as Object-Relational SQL engines. It considers a fairly
comprehensive suite of implementation alternatives that exploit SQL queries,
stored procedures/user-defined functions, and "extract and mine" strategies,
and studies the performance and ease of implementation of each alternative.
The experiments were performed on DB2 UDB Server 5.0 and quantify
relative trade-offs of the alternative implementations.
However, the paper considers association rule mining as the only knowledge discovery technique.
Similar studies of other data mining techniques will help us gain a broader
understanding of systems issues in integration of data mining with relational
database management systems.
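
To make the "SQL queries" alternative concrete, the following is a minimal sketch of how one support-counting step of association rule mining can be pushed into a relational engine. It is not taken from the paper: the table name `sales`, the function `frequent_pairs`, and the sample baskets are hypothetical, and SQLite (through Python's standard `sqlite3` module) stands in for the DB2 engine used in the experiments. The query self-joins a (tid, item) table to enumerate item pairs that occur in the same transaction and keeps those that meet a minimum support count.

```python
import sqlite3


def frequent_pairs(transactions, min_support):
    """Return {(item1, item2): support} for item pairs that appear together
    in at least `min_support` transactions, counted inside the SQL engine."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per (transaction id, item).
    conn.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
    rows = [
        (tid, item)
        for tid, items in enumerate(transactions)
        for item in set(items)  # drop duplicate items within a transaction
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

    # Self-join on tid pairs up items from the same transaction;
    # a.item < b.item keeps each unordered pair exactly once.
    query = """
        SELECT a.item, b.item, COUNT(*) AS support
        FROM sales AS a
        JOIN sales AS b ON a.tid = b.tid AND a.item < b.item
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
        ORDER BY a.item, b.item
    """
    result = {(i1, i2): s for i1, i2, s in conn.execute(query, (min_support,))}
    conn.close()
    return result


# Assumed sample data, for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]
print(frequent_pairs(baskets, 2))
```

An "extract and mine" strategy would instead read the rows out of the database and count pairs in application code; the sketch only illustrates the in-database end of that spectrum and says nothing about the relative performance the paper measures.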