Association Analysis with One Scan of Databases

Abstract

Mining frequent patterns with an FP-tree avoids costly candidate generation and repeated occurrence-frequency checking against the support threshold. It therefore performs better than Apriori-like algorithms. However, the database must still be scanned twice to build the FP-tree. This can be very time-consuming when new data are added to an existing database, because both the new data and the existing data may need two scans. This paper presents a new data structure, the P-tree (Pattern Tree), and a new technique that builds the P-tree in a single scan of the database and derives from it the FP-tree for a specified support threshold. Updating a P-tree with new data requires only one scan of the new data; the existing data do not need to be re-scanned.
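
The one-scan idea in the abstract can be illustrated with a minimal Python sketch. It is not the report's algorithm: the abstract does not describe how the P-tree is laid out, so the sketch assumes a plain prefix tree over items sorted in a fixed lexicographic order, with item frequencies counted during the same pass. The names `Node`, `PTree`, `add`, `paths`, and `to_fp_tree` are hypothetical. The FP-tree is derived by walking the stored paths, not by reading the database again.

```python
from collections import Counter


class Node:
    """One tree node (hypothetical layout, not taken from the report)."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0      # transactions passing through this node
        self.end = 0        # transactions ending exactly at this node
        self.children = {}


class PTree:
    """Prefix tree built in one pass; assumes lexicographic item order."""

    def __init__(self):
        self.root = Node()
        self.freq = Counter()   # item frequencies, gathered in the same pass

    def add(self, transaction):
        # One visit per transaction: update frequencies and the tree together.
        items = sorted(set(transaction))
        self.freq.update(items)
        node = self.root
        for item in items:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
        node.end += 1

    def paths(self):
        # Yield each distinct stored itemset with its multiplicity.
        stack = [(self.root, [])]
        while stack:
            node, prefix = stack.pop()
            if node.end:
                yield prefix, node.end
            for child in node.children.values():
                stack.append((child, prefix + [child.item]))

    def to_fp_tree(self, min_support):
        # Derive an FP-tree from the stored paths, without touching the database:
        # drop infrequent items, then order the rest by descending frequency
        # (ties broken by item name, an assumption of this sketch).
        keep = {i for i, c in self.freq.items() if c >= min_support}
        root = Node()
        for path, n in self.paths():
            frequent = sorted((i for i in path if i in keep),
                              key=lambda i: (-self.freq[i], i))
            node = root
            for item in frequent:
                node = node.children.setdefault(item, Node(item))
                node.count += n
        return root


# Usage: build once, derive an FP-tree, then add new data without a rescan.
tree = PTree()
for t in [["a", "b"], ["b", "c", "d"], ["a", "c", "d", "e"],
          ["a", "d", "e"], ["a", "b", "c"]]:
    tree.add(t)
fp = tree.to_fp_tree(min_support=3)

tree.add(["b", "e"])                      # only the new transaction is read
fp_updated = tree.to_fp_tree(min_support=3)
```

Because the tree keeps every transaction regardless of support, the same structure can serve any threshold chosen later, at the cost of storing infrequent items as well.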

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2006
Accession Number
ADA447007

Entities

People

  • Hao Huang
  • Richard Relue
  • Xindong Wu

Organizations

  • Colorado School of Mines

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Colorado
  • Computations
  • Computer Science
  • Computers
  • Construction
  • Data Mining
  • Databases
  • Demographic Cohorts
  • Digital Information
  • Frequency
  • Information Operations
  • Military Research
  • Network Science
  • Scanning

Fields of Study

  • Computer science
  • Engineering

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design
  • Database Systems and Applications
  • Medical Imaging