Inferring Parts of Speech for Lexical Mappings via the Cyc KB

Abstract

We present an automatic approach to learning criteria for classifying the parts-of-speech used in lexical mappings. This will further automate our knowledge acquisition system for non-technical users. The criteria for the speech parts are based on the types of the denoted terms along with morphological and corpus-based clues. Associations among these and the parts-of-speech are learned using the lexical mappings contained in the Cyc knowledge base as training data. With over 30 speech parts to choose from, the classifier achieves good results (77.8% correct). Accurate results (93.0%) are achieved in the special case of the mass-count distinction for nouns. Comparable results are also obtained using OpenCyc (73.1% general and 88.4% mass-count).

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2004
Accession Number
ADA460959

Entities

People

  • Bjorn Aldag
  • Dave Schneider
  • Jon Curtis
  • Kathy Panton
  • Michael Witbrock
  • Nancy Salay
  • Stefano Bertolo
  • Tom O'Hara

Organizations

  • New Mexico State University

Tags

Communities of Interest

  • Autonomy
  • Human Systems

DTIC Thesaurus Topics

  • Accuracy
  • Classification
  • Computational Linguistics
  • Computational Science
  • Computer Science
  • Data Mining
  • Frequency
  • Information Retrieval
  • Information Science
  • Language
  • Linguistics
  • Machine Learning
  • Machine Translation
  • Money
  • Natural Language Processing
  • Natural Languages
  • Ontologies

Readers

  • Computational Linguistics
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Translation
  • AI & ML - Neural Networks