Unsupervised Link Discovery in Multi-relational Data via Rarity Analysis

Abstract

A significant portion of knowledge discovery and data mining research focuses on finding patterns of interest in data. Once a pattern is found, it can be used to recognize instances that satisfy it. The new area of link discovery requires a complementary approach, since patterns of interest might not yet be known or might have too few examples to be learnable. This paper presents an unsupervised link discovery method for finding unusual, interestingly linked entities in multi-relational data sets. Various notions of rarity are introduced to measure the "interestingness" of sets of paths and entities. These measures have been implemented and applied to a real-world bibliographic data set, where they give very promising results.
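
As a rough illustration of what a path-based rarity measurement over multi-relational data could look like, the Python sketch below profiles each entity by the relation-label sequences ("path types") of the paths that start at it, then scores an entity higher when few other entities share its path types. This is not the report's method: the toy triples, the function names (build_adjacency, path_types, rarity_scores), the inverse-relation suffix "^-1", and the scoring formula are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical toy bibliographic data: (subject, relation, object) triples.
TRIPLES = [
    ("alice", "authored", "paper1"),
    ("bob", "authored", "paper1"),
    ("bob", "authored", "paper2"),
    ("carol", "authored", "paper2"),
    ("paper2", "cites", "paper1"),
    ("dave", "authored", "paper3"),
    ("paper3", "published_in", "journalX"),
]


def build_adjacency(triples):
    """Map each entity to its outgoing (relation, neighbor) pairs.

    Every triple is also added in reverse with an assumed "^-1" suffix,
    so paths can be followed in both directions.
    """
    adj = {}
    for subj, rel, obj in triples:
        adj.setdefault(subj, []).append((rel, obj))
        adj.setdefault(obj, []).append((rel + "^-1", subj))
    return adj


def path_types(adj, start, max_len):
    """Count relation-label sequences over simple paths of length 1..max_len."""
    counts = Counter()
    stack = [(start, (), frozenset([start]))]
    while stack:
        node, labels, visited = stack.pop()
        if labels:
            counts[labels] += 1
        if len(labels) == max_len:
            continue
        for rel, nxt in adj.get(node, []):
            if nxt not in visited:  # keep paths simple (no repeated entity)
                stack.append((nxt, labels + (rel,), visited | {nxt}))
    return counts


def rarity_scores(triples, max_len=2):
    """Assumed score: mean, over an entity's path types, of the fraction
    of entities that do NOT exhibit that path type (higher = rarer)."""
    adj = build_adjacency(triples)
    profiles = {node: path_types(adj, node, max_len) for node in adj}
    holders = Counter()  # number of entities exhibiting each path type
    for profile in profiles.values():
        holders.update(profile.keys())
    total = len(profiles)
    scores = {}
    for node, profile in profiles.items():
        if not profile:
            scores[node] = 0.0
            continue
        scores[node] = sum(1 - holders[t] / total for t in profile) / len(profile)
    return scores


if __name__ == "__main__":
    for entity, score in sorted(rarity_scores(TRIPLES).items(), key=lambda kv: -kv[1]):
        print(f"{entity}: {score:.3f}")
```

Running the script prints the entities ranked from most to least unusual under this assumed score. No labeled examples are used, which is the sense in which such a ranking is unsupervised.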

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2003
Accession Number: ADA462272

Entities

People

  • Hans Chalupsky
  • Shou-De Lin

Organizations

  • University of Southern California

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Anomaly Detection
  • Change Detection
  • Computer Science
  • Correlation Techniques
  • Data Mining
  • Data Science
  • Data Sets
  • Databases
  • Detection
  • Electronic Mail
  • Information Science
  • Network Science
  • Probability
  • Social Networks
  • Supervised Machine Learning
  • Universities

Fields of Study

  • Computer science

Readers

  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms