Supercomputer Assembly and Annotation of Transcriptomes for Assessing Impacts of Army Stressors on Ecological Receptors

Abstract

High-throughput DNA sequencing technology was used to describe the protein coding regions of genomic DNA (the transcriptome) for both Western Fence Lizard (Sceloporus occidentalis, WFL) and Japanese Quail (Coturnix coturnix, JQ). A total of 928,759 and 559,819 transcriptomic sequences for WFL and JQ, respectively, were clustered and assembled. Assembled unigenes with lengths of at least 200 base pairs were annotated using the Basic Local Alignment Search Tool (BLAST) against five publicly available protein sequence databases on the DoD supercomputers Diamond (SGI Altix ICE) and Jade (Cray XT4). A total of 58,962 and 44,455 unigenes were identified for WFL and JQ, respectively. Annotation of unigenes via similarity search against known proteins in the NCBI NR.aa and RefSeq databases and the EMBL-EBI UniProt-SwissProt, UniRef90, and UniRef100 protein coding databases provided 44% and 33% unigene characterization for WFL and JQ, respectively. Sequences with significant similarity to known proteins were used to design custom ultra-high density gene expression microarrays, which are being used to develop innovative methods to proactively assess the impacts of Army activity on environmental quality on installations. Further, this effort has developed a cyber-infrastructure capability, with web-based tools and data visualization, that allows the ERDC Environmental Laboratory to rapidly develop genomic infrastructure and gene expression tools for any ecological receptors that become species of concern.
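
The length cutoff mentioned in the abstract (unigenes of at least 200 base pairs are passed on to annotation) can be illustrated with a minimal Python sketch. This is not the report's pipeline: the function names `parse_fasta` and `filter_by_length`, the FASTA input format, and the in-memory dictionary are all assumptions made for illustration; only the 200 bp threshold comes from the abstract.

```python
from typing import Dict, Iterable

# Threshold stated in the abstract; everything else here is hypothetical.
MIN_LENGTH_BP = 200


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA-formatted lines into a {sequence_id: sequence} mapping."""
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def filter_by_length(records: Dict[str, str],
                     min_length: int = MIN_LENGTH_BP) -> Dict[str, str]:
    """Keep only sequences whose length is at least min_length bases."""
    return {name: seq for name, seq in records.items() if len(seq) >= min_length}


if __name__ == "__main__":
    # Toy input: invented sequence names and synthetic sequences.
    example = [">unigene_a", "A" * 250, ">unigene_b", "C" * 120]
    kept = filter_by_length(parse_fasta(example))
    print(sorted(kept))
```

In a real workflow the retained sequences would then be submitted to a BLAST search against the protein databases; that step is omitted here because the abstract does not give the search parameters.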

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2010
Accession Number
ADA532322

Entities

People

  • C. Vulpe
  • D. Pham
  • E. J. Perkins
  • K. A. Gust
  • L. Scanlan
  • M. S. Wilbanks
  • N. D. Barker
  • Xi Chen

Organizations

  • Engineer Research and Development Center

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Army Corps Of Engineers
  • Biological Sciences
  • Birds
  • Climate Change
  • Computational Biology
  • Computer Programming
  • Computer Programs
  • Data Visualization
  • Databases
  • Ecology
  • Engineers
  • Gene Expression
  • Genetic Structures
  • High Density
  • Infrastructure
  • Sequence Analysis
  • Systems Biology

Fields of Study

  • Biology

Readers

  • Database Systems and Applications
  • Molecular Genetics
  • Wetland-Land-Environmental Management

Technology Areas

  • Cyber