Analysis of German Patent Literature

Abstract

We show how several components of the JET natural language analysis tool, originally developed at New York University for the analysis of English text, were adapted to German. These components, such as the part-of-speech tagger and the noun chunker, are explained in terms that should be understandable to a layperson. Issues that arise specifically with regard to the German language, in contrast, are outlined in a way that could be of interest to readers more experienced in natural language processing.

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 2012
Accession Number
AD1050235

Entities

People

  • Luciano Franceschina

Organizations

  • ETH Zurich
  • New York University

Tags

Communities of Interest

  • C4I

DTIC Thesaurus Topics

  • Accuracy
  • Computational Linguistics
  • Computer Science
  • Emission
  • Frequency
  • German Language
  • Grammars
  • Hidden Markov Models
  • Language
  • Linguistics
  • Markov Models
  • Models
  • Natural Language Computing
  • Natural Language Processing
  • Natural Languages
  • New York
  • Newspapers
  • Observation
  • Patents
  • Precision
  • Probability
  • Training

Readers

  • Computational Linguistics

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation