We present a way of exploiting domain knowledge in the design and implementation of data mining algorithms, with special attention to frequent patterns discovery, within a deductive framework. In our framework, domain knowledge is represented by way of deductive rules, and data mining algorithms are specified by means of iterative user-defined aggregates and implemented by means of user-defined predicates. This choice allows us to exploit the full expressive power of deductive rules without loosing in performance. Iterative user-defined aggregates have a fixed scheme, in which user-defined predicates are to be added. This feature allows the modularization of data mining algorithms, thus providing a way to integrate the proper domain knowledge exploitation in the right point. As a case study, the paper presents how user-defined aggregates can be exploited to specify and implement a version of the a priori algorithm. Some performance analyzes and comparisons are discussed in order to show the effectiveness of the approach.
Specifying Mining Algorithms with Iterative User-Defined Aggregates
TURINI, FRANCO;
2004-01-01
Abstract
We present a way of exploiting domain knowledge in the design and implementation of data mining algorithms, with special attention to frequent patterns discovery, within a deductive framework. In our framework, domain knowledge is represented by way of deductive rules, and data mining algorithms are specified by means of iterative user-defined aggregates and implemented by means of user-defined predicates. This choice allows us to exploit the full expressive power of deductive rules without loosing in performance. Iterative user-defined aggregates have a fixed scheme, in which user-defined predicates are to be added. This feature allows the modularization of data mining algorithms, thus providing a way to integrate the proper domain knowledge exploitation in the right point. As a case study, the paper presents how user-defined aggregates can be exploited to specify and implement a version of the a priori algorithm. Some performance analyzes and comparisons are discussed in order to show the effectiveness of the approach.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.