Efficient Parallel Algorithms for Mining Associations

Abstract

Mining hidden associations in large amounts of data has found widespread application in many practical domains, such as customer-oriented planning and marketing, telecommunication network monitoring, and the analysis of data from scientific experiments. The combinatorial complexity of the problem and the phenomenal growth in the sizes of available datasets motivate the need for efficient and scalable parallel algorithms. The design of such algorithms is challenging. This chapter presents an evolutionary and comparative review of many representative serial and parallel algorithms for discovering two kinds of associations. The first part of the chapter is devoted to non-sequential associations, which utilize the relationships between events that happen together. The second part is devoted to the more general and potentially more useful sequential associations, which utilize the temporal or sequential relationships between events. It is shown that many existing algorithms belong to a few categories, which are determined by their broader design strategies. Overall, the aim of the chapter is to provide a comprehensive account of the challenges and issues involved in effective parallel formulations of algorithms for discovering associations, and of how various existing algorithms try to handle them.

Document Details

Document Type
Technical Report
Publication Date
Jan 26, 2001
Accession Number
AD1020005

Entities

People

  • Eui-hong Han
  • George Karypis
  • Mahesh V. Joshi
  • Vipin Kumar

Organizations

  • University of Minnesota

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Clustering
  • Communication Systems
  • Computational Complexity
  • Computations
  • Computer Science
  • Counting Methods
  • Data Mining
  • Databases
  • Frequency
  • Hash Tables
  • Information Science
  • Knowledge Management
  • Network Science
  • Parallel Computing
  • Parallel Processing
  • Trees (Data Structures)

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Parallel and Distributed Computing
  • Theoretical Analysis