Scale-Independent Relational Query Processing

Abstract

An increasingly common pattern is for newly-released web applications to succumb to a "Success Disaster". In this scenario, overloaded database machines and resultant high response times destroy a previously good user experience, just as a site is becoming popular. Unfortunately, the data independence provided by a traditional relational database system while useful for agile development, only exacerbates the problem by hiding potentially expensive queries under simple declarative expressions. The disconnect between expression complexity and runtime cost often leads developers to mistrust the suitability of relational database systems for their web applications in the long term. As a result, developers of these applications are increasingly abandoning relational systems in favor of imperative code written against distributed "NoSQL" key/value stores, losing the many benefits of data independence in the process. While some claim that scalability issues are inherent in the use of the relational model this thesis challenges that notion by extending standard data independence with the notion of scale independence. In contrast to traditional relational databases, a scale-independent system is capable of providing predictable response time for all of the queries in an application even as the amount of data grows by orders of magnitude. This predictability is achieved by compile-time enforcement of strict upper bounds on the number of operations that will be performed for all queries. Coupled with a service level objective (SLO) compliance prediction model and a scalable storage architecture, these upper bounds make it easy for developers to write success-tolerant applications that support an arbitrarily large number of users while still providing acceptable performance. Statically bounding the amount of work required to execute a query can be easy for some queries, such as those that perform a lookup of a single record by primary key.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Oct 04, 2013
Accession Number
ADA597352

Entities

People

  • Michael P. Armbrust

Organizations

  • University of California, Berkeley

Tags

Communities of Interest

  • Autonomy
  • Energy and Power Technologies
  • Weapons Technologies

DTIC Thesaurus Topics

  • Commerce
  • Computer Programming
  • Computer Science
  • Computers
  • Construction
  • Data Storage Systems
  • Database Management Systems
  • Databases
  • Internet
  • Relational Database Management Systems
  • Relational Databases
  • Social Media
  • Social Networking Services
  • Software Development
  • Standards
  • Trees (Data Structures)
  • Web Applications

Fields of Study

  • Computer science
  • Engineering

Readers

  • Database Systems and Applications
  • Distributed Systems and Data Platform Development
  • Systems Analysis and Design