BioCreAtIvE task 1A: gene mention finding evaluation

Abstract

The biological research literature is a major repository of knowledge. As the amount of literature increases, it will get harder to find the information of interest on a particular topic. There has been an increasing amount of work on text mining this literature, but comparing this work is hard because of a lack of standards for making comparisons. Results: We took part in running BioCreAtIvE (Critical Assessment for Information Extraction in Biology), an open common evaluation of systems on a number of biological text mining tasks. We report here on task 1A, which deals with finding mentions of genes and related entities in text. The task makes use of data and evaluation software provided by the (US) National Center for Biotechnology Information (NCBI). Fifteen teams took part in task 1A. Conclusion: A number of teams achieved scores over 80% F-measure (balanced precision and recall). This is good, but it still lags somewhat behind the best scores achieved in some other domains such as newswire, due in part to the complexity and length of gene names compared to person or organization names in newswire. Finding mentions is a basic task that can be used as a building block for other text mining tasks, but the teams that tried to use their task 1A systems to help on other BioCreAtIvE tasks report mixed results.
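
For readers unfamiliar with the metric, the balanced F-measure cited above is the harmonic mean of precision and recall. The following is a minimal sketch, not the NCBI evaluation software: the function name and the counts are hypothetical, and it assumes each predicted gene mention has already been judged correct or incorrect by whatever matching criterion the evaluation applies.

```python
def precision_recall_f(true_pos, false_pos, false_neg):
    """Return (precision, recall, balanced F-measure) from mention counts.

    true_pos:  predicted mentions judged correct
    false_pos: predicted mentions judged incorrect
    false_neg: gold-standard mentions the system missed
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    # Guard against division by zero when nothing was predicted or annotated.
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical counts, chosen only for illustration.
p, r, f = precision_recall_f(true_pos=80, false_pos=20, false_neg=20)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

With these illustrative counts, precision and recall are both 0.80, so the F-measure is also about 0.80, the level the abstract reports several teams exceeding.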

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 2004
Accession Number
AD1125134

Entities

People

  • Alexander Morgan
  • Alexander Yeh
  • Lynette Hirschman
  • Marc Colosimo

Organizations

  • MITRE Corporation

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Artificial Intelligence Software
  • Biotechnology
  • Computational Linguistics
  • Computational Science
  • Computer Languages
  • Data Mining
  • Data Sets
  • Drosophila
  • Hidden Markov Models
  • Information Science
  • Language
  • Linguistics
  • Machine Learning
  • Markov Models
  • Materials
  • Natural Language Processing
  • Precision
  • Supervised Machine Learning
  • Systems Approach
  • Test Sets
  • Text Mining
  • Training

Readers

  • Brain and Cognitive Science; Experimental Psychology; Cognitive Neuroscience
  • Computational Linguistics
  • Technical Research and Report Writing

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • Biotechnology