Automatic Grammar Induction and Parsing Free Text: A Transformation-Based Approach
Abstract
In this paper we describe a new technique for parsing free text: a transformational grammar is automatically learned that is capable of accurately parsing text into binary-branching syntactic trees with nonterminals unlabelled. The algorithm works by beginning in a very naive state of knowledge about phrase structure. By repeatedly comparing the results of bracketing in the current state to proper bracketing provided in the training corpus, the system learns a set of simple structural transformations that can be applied to reduce error. After describing the algorithm, we present results and compare these results to other recent results in automatic grammar induction.
Document Details
- Document Type
- Technical Report
- Publication Date
- Jan 01, 1993
- Accession Number
- ADA458695
Entities
People
- Eric Brill
Organizations
- University of Pennsylvania