Automatic Extraction of Semantic Classes from Syntactic Information in Online Resources

Abstract

This paper addresses the issue of word-sense ambiguity in extraction from machine-readable resources for the construction of large-scale knowledge sources. We describe two experiments: one that took word-sense distinctions into account, resulting in 97.9% accuracy for semantic classification of verbs based on (Levin, 1993); and one that ignored word-sense distinctions, resulting in 6.3% accuracy. These experiments served two purposes: (1) to validate the central thesis of (Levin, 1993), i.e., that verb semantics and syntactic behavior are predictably related; and (2) to demonstrate that a 15-fold improvement (from 6.3% to 97.9% accuracy) can be achieved in deriving semantic information from syntactic cues if we first divide the syntactic cues into distinct groupings that correlate with different word senses. Finally, we show that we can provide effective acquisition techniques for novel word senses using a combination of online sources.

Document Details

Document Type: Technical Report
Publication Date: Oct 15, 1998
Accession Number: AD1006915

Entities

People

  • Bonnie J. Dorr
  • Doug Jones

Organizations

  • University of Maryland

Tags

Communities of Interest

  • Ground and Sea Platforms

DTIC Thesaurus Topics

  • Accuracy
  • Acquisition
  • Algorithms
  • Ambiguity
  • Applied Computer Science
  • Automatic
  • Computational Linguistics
  • Computer Science
  • Databases
  • Dictionaries
  • Extraction
  • Language
  • Linguistics
  • Machine Translation
  • Natural Language Processing
  • Semantics
  • Translations

Fields of Study

  • Computer science
  • Linguistics

Readers

  • Computer Science
  • Distributed Systems and Data Platform Development
  • Systems Analysis and Design