Patent Retrieval in Chemistry based on Semantically Tagged Named Entities

Abstract

This paper reports on the work conducted by Fraunhofer SCAI for the TREC Chemistry (TREC-CHEM) track 2009. The Fraunhofer SCAI team participated in two tasks: Technology Survey and Prior Art Search. The core of the framework is an index of 1.2 million chemical patents provided as a data set by TREC. For the Technology Survey task, three runs were submitted based on semantic dictionaries and noun phrases. For the Prior Art Search task, several fields containing normalized noun phrases as well as biomedical and chemical entities were added to the index. Altogether, 36 runs were submitted for this task, based on automatic querying with tokens, noun phrases, and entities combined with different search strategies.

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2009
Accession Number
ADA519347

Entities

People

  • Bernd Mueller
  • Christoph M. Friedrich
  • Harsha Gurulingappa
  • Heinz-Theodor Mevissen
  • Juliane Fluck
  • Martin Hofmann-Apitius
  • Roman Klinger

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Automatic
  • Chemical Compounds
  • Chemistry
  • Dictionaries
  • Diseases And Disorders
  • Frequency
  • Language
  • Markup Languages
  • Named Entity Recognition
  • Natural Languages
  • Organic Chemistry
  • Patent Office
  • Preprocessing
  • Recognition
  • Standards
  • Vascular Diseases

Readers

  • Computational Linguistics
  • Technical Research and Report Writing

Technology Areas

  • Biotechnology