A Million Cancer Genome Warehouse

Abstract

This white paper discusses the motivation for, and issues surrounding, the development of a repository and associated computational infrastructure, which we call the Million Cancer Genome Warehouse, to house and process a million genomes to help battle cancer. It is proposed as an example of an information commons and a computing system that will bring about precision medicine, coupling established clinical pathological indexes with state-of-the-art molecular profiling to create diagnostic, prognostic, and therapeutic strategies precisely tailored to each patient's individual requirements. The goal of the white paper is to stimulate discussion and help reach consensus on the need to construct a Million Cancer Genome Warehouse and on what its nature should be. To anticipate concerns, it provides thorough cost estimates and covers topics ranging from high-level health policy issues to low-level details of statistical analysis, data formats and structures, software design, and hardware construction and cost.

Document Details

Document Type
Technical Report
Publication Date
Nov 20, 2012
Accession Number
ADA570382

Entities

People

  • Anthony D. Joseph
  • Armando Fox
  • Benedict Paten
  • David A. Patterson
  • David Haussler
  • Ion Stoica
  • Mark Diekhans
  • Michael I. Jordan
  • Scott Shenker
  • Singer Ma
  • Taylor Sittler

Organizations

  • University of California, Berkeley

Tags

DTIC Thesaurus Topics

  • Breast Cancer
  • Computational Biology
  • Computational Science
  • Computer Programming
  • Computer Science
  • Data Analysis
  • Engineers
  • Genetics
  • Health Services
  • Information Processing
  • Information Science
  • Information Systems
  • Medical Personnel
  • Network Architecture
  • Network Science
  • Oncology
  • Software Development

Readers

  • Molecular and genetic basis of cancer.
  • Parallel and Distributed Computing.
  • Theoretical Analysis.