Association Analysis with One Scan of Databases

Abstract

Mining frequent patterns with an FP-tree avoids costly candidate generation and repeatedly occurrence frequency checking against the support threshold. It therefore achieves better performance and efficiency than Apriori-like algorithms. However, the database still needs to be scanned twice to get the FP-tree. This can be very time-consuming when new data are added to an existing database because two scans may be needed for not only the new data but also the existing data. This paper presents a new data structure P-tree, Pattern Tree, and a new technique, which can get the P-tree through only one scan of the database and can obtain the corresponding FP-tree with a specified support threshold. Updating a P-tree with new data needs one scan of the new data only, and the existing data do not need to be re-scanned.

Open PDF

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2006
Accession Number: ADA447007

Entities

People

Hao Huang
Richard Relue
Xindong Wu

Organizations

Colorado School of Mines

Association Analysis with One Scan of Databases

Abstract

Document Details

Entities

People

Organizations

Tags

Communities of Interest

DTIC Thesaurus Topics

Fields of Study

Readers