A Tool for Creating and Parallelizing Bioinformatics Pipelines

Abstract

Bioinformatics pipelines enable life scientists to analyze biological data effectively through automated multi-step processes constructed from individual programs and databases. The huge volume of data and the time-consuming computations require effectively parallelized pipelines to deliver results within a reasonable time. Considerable programming effort is needed both to integrate individual programs into a pipeline and to parallelize them. The objective of our Bioinformatics Pipeline Generation and Parallelization Toolkit (BioGent) is to reduce researchers' programming burden. A user only needs to create a pipeline definition file that describes the data processing sequence and the input/output files. A program named schedpipe in the BioGent toolkit takes the definition file and executes the designed procedure.
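
The idea of driving execution from a declared processing sequence and its input/output files can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not show BioGent's definition-file format or how schedpipe works internally, so the PIPELINE structure, the step and file names, and the schedule and run helpers below are hypothetical. The sketch groups steps into waves whose inputs are already available and runs each wave concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pipeline definition: each step lists its input and output files.
# This is NOT BioGent's actual definition-file format.
PIPELINE = [
    {"name": "split", "inputs": ["genome.fa"], "outputs": ["part1.fa", "part2.fa"]},
    {"name": "scan1", "inputs": ["part1.fa"], "outputs": ["hits1.txt"]},
    {"name": "scan2", "inputs": ["part2.fa"], "outputs": ["hits2.txt"]},
    {"name": "merge", "inputs": ["hits1.txt", "hits2.txt"], "outputs": ["report.txt"]},
]


def schedule(steps, available):
    """Group steps into waves; every step in a wave has all its inputs available."""
    available = set(available)
    pending = list(steps)
    waves = []
    while pending:
        ready = [s for s in pending if set(s["inputs"]) <= available]
        if not ready:
            raise ValueError("unsatisfiable inputs or cyclic dependency")
        waves.append([s["name"] for s in ready])
        for s in ready:
            available.update(s["outputs"])
        pending = [s for s in pending if s not in ready]
    return waves


def run(steps, available, action, max_workers=4):
    """Run the steps of each wave concurrently; waves run one after another."""
    by_name = {s["name"]: s for s in steps}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for wave in schedule(steps, available):
            # list() waits for the whole wave to finish before the next one starts.
            list(pool.map(action, (by_name[n] for n in wave)))


if __name__ == "__main__":
    # The action here only prints; a real tool would launch the step's program.
    run(PIPELINE, ["genome.fa"], lambda step: print("running", step["name"]))
```

In this sketch the two scan steps depend only on the outputs of the split step, so they fall into the same wave and can run in parallel, while the merge step waits for both.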

Document Details

Document Type
Technical Report
Publication Date
Jun 01, 2007
Accession Number
ADA480822

Entities

People

  • Chenggang Yu
  • Paul A. Wilson

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Application Programming Interface
  • Application Software
  • Biomedical Information Systems
  • Biomedical Research
  • Biotechnology
  • Computational Biology
  • Computational Science
  • Computer Programming
  • Computers
  • Data Processing
  • Database Management Systems
  • Databases
  • Department Of Defense
  • High Performance Computing
  • Information Systems
  • Nucleic Acids
  • Pipelines

Fields of Study

  • Computer science
  • Engineering

Readers

  • Distributed Systems and Data Platform Development
  • Parallel and Distributed Computing