Architecture Knowledge for Evaluating Scalable Databases

Abstract

Designing massively scalable, highly available big data systems is an immense challenge for software architects. Big data applications require distributed systems design principles to create scalable solutions, and the selection and adoption of open source and commercial technologies that can provide the required quality attributes. In big data systems, the data management layer presents unique engineering problems, arising from the proliferation of new data models and distributed technologies for building scalable, available data stores. Architects must consequently compare candidate database technology features and select platforms that can satisfy application quality and cost requirements. In practice, the inevitable absence of up-to-date, reliable technology evaluation sources makes this comparison exercise a highly exploratory, unstructured task. To address these problems, we have created a detailed feature taxonomy that enables rigorous comparison and evaluation of distributed database platforms. The taxonomy captures the major architectural characteristics of distributed databases, including data model and query capabilities. In this paper we present the major elements of the feature taxonomy, and demonstrate its utility by populating the taxonomy for nine different database technologies. We also briefly describe QuABaseBD, a knowledge base that we have built to support the population and querying of database features by software architects. QuABaseBD links the taxonomy to general quality attribute scenarios and design tactics for big data systems. This creates a unique, dynamic knowledge resource for architects building big data systems.

Document Details

Document Type: Technical Report
Publication Date: Jan 16, 2015
Accession Number: ADA614251

Entities

People

  • Albert Nurgaliev
  • Ian Gorton
  • John Klein

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Engineered Resilient Systems
  • Human Systems

DTIC Thesaurus Topics

  • Abstracts
  • Big Data
  • Computer Science
  • Data Centers
  • Data Management
  • Data Storage Systems
  • Databases
  • Engineering
  • Knowledge Management
  • Language
  • Models
  • Ontologies
  • Relational Databases
  • Software Design
  • Software Development
  • Standards
  • Systems Engineering

Fields of Study

  • Computer science
  • Engineering

Readers

  • Database Systems and Applications
  • Distributed Systems and Data Platform Development
  • Software Engineering