Automatic Grammar Induction and Parsing Free Text: A Transformation-Based Approach

Abstract

This paper describes a new technique for parsing free text: the system automatically learns a transformational grammar that accurately parses text into binary-branching syntactic trees with unlabelled nonterminals. The algorithm begins in a very naive state of knowledge about phrase structure. By repeatedly comparing the bracketing it produces in its current state with the correct bracketing provided in the training corpus, the system learns a set of simple structural transformations that reduce error. After describing the algorithm, we present results and compare them with other recent results in automatic grammar induction.
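
The learning loop summarized in the abstract can be illustrated with a minimal sketch. It is not the report's actual transformation set: the right-branching initial state, the single tag-triggered rotation rule, the matched-bracket score, and all function names are assumptions chosen only to show the greedy "compare to the correct bracketing, keep the transformation that most reduces error" idea.

```python
from itertools import product

def right_branching(tags):
    """Assumed naive initial state: bracket the sentence as a right-linear tree."""
    tree = tags[-1]
    for tag in reversed(tags[:-1]):
        tree = (tag, tree)
    return tree

def leaves(tree):
    """Return the tags at the leaves of a tree, left to right."""
    return [tree] if isinstance(tree, str) else leaves(tree[0]) + leaves(tree[1])

def spans(tree, start=0):
    """Return (set of (start, end) spans of internal nodes, end index)."""
    if isinstance(tree, str):
        return set(), start + 1
    left, mid = spans(tree[0], start)
    right, end = spans(tree[1], mid)
    return left | right | {(start, end)}, end

def apply_rule(tree, rule):
    """Hypothetical transformation: rotate (A, (B, C)) into ((A, B), C)
    wherever tag x ends A and tag y starts B. Applied in one bottom-up pass."""
    if isinstance(tree, str):
        return tree
    left, right = apply_rule(tree[0], rule), apply_rule(tree[1], rule)
    x, y = rule
    if (not isinstance(right, str)
            and leaves(left)[-1] == x and leaves(right[0])[0] == y):
        return ((left, right[0]), right[1])
    return (left, right)

def score(trees, gold):
    """Count brackets that agree with the training corpus."""
    return sum(len(spans(t)[0] & spans(g)[0]) for t, g in zip(trees, gold))

def learn(sentences, gold, max_rules=10):
    """Greedily pick the rule with the largest gain until none improves the score."""
    trees = [right_branching(s) for s in sentences]
    tags = sorted({t for s in sentences for t in s})
    learned = []
    for _ in range(max_rules):
        base = score(trees, gold)
        best_rule, best_gain = None, 0
        for rule in product(tags, repeat=2):
            gain = score([apply_rule(t, rule) for t in trees], gold) - base
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:
            break
        learned.append(best_rule)
        trees = [apply_rule(t, best_rule) for t in trees]
    return learned, trees

# Toy training corpus (invented for illustration): tag sequences and their bracketings.
sentences = [["DT", "NN", "VB"], ["DT", "NN", "VB", "DT", "NN"]]
gold = [(("DT", "NN"), "VB"), (("DT", "NN"), ("VB", ("DT", "NN")))]
learned, trees = learn(sentences, gold)
```

On this toy corpus the loop stops once no candidate rule increases the number of matched brackets. The ordered rule list is the learned grammar: parsing new text would mean building the naive tree and replaying the rules in order.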

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 1993
Accession Number: ADA458695

Entities

People

  • Eric Brill

Organizations

  • University of Pennsylvania

Tags

DTIC Thesaurus Topics

  • Accuracy
  • Algorithms
  • Automatic
  • Context Free Grammars
  • Environment
  • Errors
  • Grammars
  • Hidden Markov Models
  • Information Science
  • Language
  • Learning
  • Markov Models
  • Natural Languages
  • Probability
  • Test Sets
  • Training
  • Transformational Grammars

Fields of Study

  • Computer science

Readers

  • Artificial Intelligence