Mining a Large-Scale Term-Concept Network from Wikipedia

Abstract

Social tagging and information retrieval are challenged by the fact that the same item or idea can be expressed by different terms or words. To counteract the problem of variable terminology, researchers have proposed concept-based information retrieval. To date, however, most concept spaces have been either manually produced taxonomies or special-purpose ontologies, too small for classifying arbitrary resources. To create a large set of concepts, and to facilitate term-to-concept mapping, we mine a network of concepts and terms from Wikipedia. Our algorithm produces a robust, extensible term-concept network for tagging and information retrieval, containing over 2,000,000 concepts with mappings to over 3,000,000 unique terms.

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2005
Accession Number: AD1106851

Entities

People

Andrew Gregorowicz
Mark A. Kramer

Organizations

MITRE Corporation
