Unsupervised Link Discovery in Multi-relational Data via Rarity Analysis
Abstract
A significant portion of knowledge discovery and data mining research focuses on finding patterns of interest in data. Once a pattern is found, it can be used to recognize satisfying instances. The new area of link discovery requires a complementary approach, since patterns of interest might not yet be known or might have too few examples to be learnable. This paper presents an unsupervised link discovery method aimed at discovering unusual, interestingly linked entities in multi-relational data sets. Various notions of rarity are introduced to measure the "interestingness" of sets of paths and entities. These measurements have been implemented and applied to a real-world bibliographic data set where they give very promising results.
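The abstract does not define its rarity measures, so the following is only a minimal sketch of one plausible reading, not the report's method. It assumes the data is a set of (head, relation, tail) triples, treats a "path type" as the sequence of relation labels along a simple path, and scores an entity by how few entities share its rarest path type from a chosen source. The toy triples and the names `build_graph`, `path_types`, and `rarity_ranking` are hypothetical and introduced here for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy bibliographic data as (head, relation, tail) triples.
TRIPLES = [
    ("alice", "authored", "paper1"),
    ("bob", "authored", "paper1"),
    ("bob", "authored", "paper2"),
    ("carol", "authored", "paper2"),
    ("carol", "authored", "paper3"),
    ("dave", "authored", "paper3"),
    ("paper2", "cites", "paper1"),
    ("paper3", "cites", "paper1"),
    ("paper3", "cites", "paper2"),
]


def build_graph(triples):
    """Build an adjacency list; each relation also gets an inverse edge."""
    graph = defaultdict(list)
    for head, rel, tail in triples:
        graph[head].append((rel, tail))
        graph[tail].append((rel + "^-1", head))
    return graph


def path_types(graph, source, max_len):
    """Map each reachable entity to a Counter of path types.

    A path type is the tuple of relation labels along a simple path
    (no repeated nodes) of length 1..max_len starting at `source`.
    """
    found = defaultdict(Counter)

    def walk(node, labels, visited):
        if labels:
            found[node][tuple(labels)] += 1
        if len(labels) == max_len:
            return
        for rel, nxt in graph.get(node, []):
            if nxt not in visited:
                walk(nxt, labels + [rel], visited | {nxt})

    walk(source, [], {source})
    return found


def rarity_ranking(graph, source, max_len=3):
    """Rank entities by the rarity of their rarest path type from `source`.

    Assumption: rarity of a path type = 1 / (number of entities it reaches).
    Ties are broken alphabetically by entity name.
    """
    found = path_types(graph, source, max_len)
    reach = Counter()
    for types in found.values():
        for path_type in types:
            reach[path_type] += 1
    scores = {
        entity: max(1.0 / reach[path_type] for path_type in types)
        for entity, types in found.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for entity, score in rarity_ranking(build_graph(TRIPLES), "alice", max_len=2):
        print(entity, score)
```

Because the score needs no labeled examples, this stays unsupervised in the sense the abstract describes. The choice of maximum path length and of the rarity formula are free parameters of the sketch.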
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2003
- Accession Number: ADA462272
Entities
People
- Hans Chalupsky
- Shou-De Lin
Organizations
- University of Southern California