Relation Extraction with Weak Supervision and Distributional Semantics

Abstract

Relation extraction aims to detect and categorize semantic relations between pairs of entities in unstructured text. It benefits a large number of applications, such as Web search and question answering. Traditional approaches to relation extraction rely either on learning from a large number of accurate human-labeled examples or on pattern matching with hand-crafted rules. These resources are laborious to obtain and can be applied only to a narrow set of target types of interest. This dissertation focuses on learning relations with little or no human supervision. First, we examine the approach that treats relation extraction as a supervised learning problem. We develop an algorithm that trains a model at approximately one third of the human-annotation cost while matching the performance of models trained with high-quality annotation. Second, we investigate distant supervision, a weakly supervised approach that automatically generates its own labeled training data. We develop a latent Bayesian framework for this purpose. Because its model better approximates the weak source of supervision, the framework outperforms state-of-the-art methods. Finally, we investigate the possibility of building all relational tables beforehand with an unsupervised relation extraction algorithm. We develop an effective and efficient algorithm that combines various semantic resources automatically mined from a corpus using distributional semantics. The algorithm is able to extract a very large set of relations from the Web at high precision.

Document Details

Document Type
Technical Report
Publication Date
May 01, 2013
Accession Number
AD1046751

Entities

People

  • Bonan Min

Organizations

  • New York University

Tags

Communities of Interest

  • Biomedical
  • C4I
  • Ground and Sea Platforms
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence Software
  • Bayesian Networks
  • Computational Complexity
  • Computer Languages
  • Computer Science
  • Feature Extraction
  • Information Science
  • Kernel Functions
  • Machine Learning
  • Models
  • Natural Language Processing
  • Network Science
  • Ontologies
  • Probability
  • Supervised Machine Learning
  • Supervision

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Neural Networks