An Efficient Algorithm for Discovering Frequent Subgraphs

Abstract

Over the years, frequent itemset discovery algorithms have been used to find interesting patterns in various application areas. However, as data mining techniques are increasingly applied to non-traditional domains, existing frequent pattern discovery approaches cannot be used, because the transaction framework these algorithms assume cannot effectively model the datasets in these domains. An alternative way of modeling the objects in these datasets is to represent them as graphs. Within that model, the problem of finding frequent patterns becomes that of discovering subgraphs that occur frequently over the entire set of graphs. In this paper we present a computationally efficient algorithm, called FSG, for finding all frequent subgraphs in large graph databases. We experimentally evaluate the performance of FSG using a variety of real and synthetic datasets. Our results show that, despite the underlying complexity associated with frequent subgraph discovery, FSG is effective in finding all frequently occurring subgraphs in datasets containing over 100,000 graph transactions and scales linearly with respect to the size of the database.
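
To make the problem statement concrete, the following is a minimal sketch of support counting for the simplest possible pattern, a single labeled edge, over a set of graph transactions. It is an illustration of the problem definition only and is not the FSG algorithm; the function names, the graph encoding, the chemical-style labels, and the `min_support` parameter are all assumptions introduced for this example.

```python
from collections import Counter

# Assumed encoding of one graph transaction (hypothetical, for illustration):
#   (vertex_labels, edges), where vertex_labels maps vertex id -> label and
#   edges is a list of (u, v, edge_label) tuples for an undirected graph.

def edge_patterns(vertex_labels, edges):
    """Return the distinct single-edge patterns present in one graph."""
    patterns = set()
    for u, v, edge_label in edges:
        # Sort the endpoint labels so an undirected edge has one canonical form.
        a, b = sorted((vertex_labels[u], vertex_labels[v]))
        patterns.add((a, edge_label, b))
    return patterns

def frequent_edges(graphs, min_support):
    """Single-edge subgraphs that occur in at least min_support graphs."""
    counts = Counter()
    for vertex_labels, edges in graphs:
        # A pattern is counted at most once per graph transaction.
        counts.update(edge_patterns(vertex_labels, edges))
    return {p: c for p, c in counts.items() if c >= min_support}

# Three toy graph transactions with made-up labels.
graphs = [
    ({0: "C", 1: "O", 2: "H"}, [(0, 1, "double"), (0, 2, "single")]),
    ({0: "C", 1: "H", 2: "H"}, [(0, 1, "single"), (0, 2, "single")]),
    ({0: "O", 1: "C"}, [(0, 1, "double")]),
]
result = frequent_edges(graphs, min_support=2)
```

Here each pattern is counted per graph rather than per occurrence, which matches the notion of a subgraph occurring frequently over the set of graphs. Finding larger frequent subgraphs additionally requires candidate generation and subgraph isomorphism tests, which this sketch deliberately omits.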

Document Details

Document Type
Technical Report
Publication Date
Jun 25, 2002
Accession Number
ADA439497

Entities

People

  • George Karypis
  • Michihiro Kuramochi

Organizations

  • University of Minnesota

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Chemical Compounds
  • Computational Complexity
  • Computations
  • Computer Programming
  • Computer Programs
  • Computer Science
  • Computers
  • Data Mining
  • Data Sets
  • Databases
  • Demographic Cohorts
  • Elements
  • High Performance Computing
  • Information Science
  • Operating Systems
  • Random Variables

Fields of Study

  • Computer science

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design
  • Distributed Systems and Data Platform Development

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms