Lexical Semantic Techniques for Corpus Analysis

Abstract

In this paper we outline a research program for computational linguistics that makes extensive use of text corpora. We demonstrate how a semantic framework for lexical knowledge can suggest richer relationships among words in text than simple co-occurrence. The work suggests how linguistic phenomena such as metonymy and polysemy might be exploited for the semantic tagging of lexical items. Unlike purely statistical collocational analyses, the framework of a semantic theory allows the automatic construction of predictions about deeper semantic relationships among words appearing in collocational systems. We illustrate the approach with the acquisition of lexical information for several classes of nominals, and show how such techniques can fine-tune the lexical structures acquired from an initial seeding of a machine-readable dictionary. In addition to conventional lexical semantic relations, we show how information concerning lexical presuppositions and preference relations can also be acquired from corpora when they are analyzed with the appropriate semantic tools. Finally, we discuss the potential that corpus studies have for enriching the data set for theoretical linguistic research, as well as for helping to confirm or disconfirm linguistic hypotheses.

Document Details

Document Type: Technical Report
Publication Date: Jun 01, 1993
Accession Number: ADA580022

Entities

People

  • James Pustejovsky
  • Peter Anick
  • Sabine Bergler

Organizations

  • Brandeis University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Acquisition
  • Artificial Intelligence
  • Cognitive Science
  • Computational Linguistics
  • Computer Programming
  • Computer Science
  • Computers
  • Databases
  • Information Retrieval
  • Information Science
  • Language
  • Law
  • Linguistics
  • Natural Language Processing
  • Natural Languages
  • Operating Systems
  • Statistics

Fields of Study

  • Linguistics

Readers

  • Computational Linguistics