Issues in "Big-Data" Database Systems
Abstract
Big Data is often characterized as high-volume, multi-form data that changes rapidly and comes from multiple sources. It is sometimes claimed that Big Data will not be manageable using conventional relational database technology, and it is true that alternative paradigms, such as NoSQL systems and search engines, have much to offer. However, relational concepts (although not necessarily current relational products) will still have an important role to play in building database systems that can support the performance, scalability, and integration demands of Big Data applications. To deal effectively with Big Data, we must consider many factors, including the number of data types, schema changes, data volume, query complexity, query frequency, update patterns, data contention and isolation, and system and database administration. Relational database technology has been very successful in dealing with these issues, albeit for a single, tabular data form. However, it has largely ignored the problem of integrating disparate and heterogeneous data sources, except in the most trivial ways. It is nonetheless the right starting point for research on Big Data systems. Significant changes may be needed to the data model, to the query language, and certainly to physical database design and query execution techniques; but to ignore relational technology is to ignore over forty years of relevant research on data processing.
Document Details
- Document Type: Technical Report
- Publication Date: Jun 01, 2014
- Accession Number: ADA607106
Entities
People
- Jack Orenstein
- Marius S. Vassiliou
Organizations
- Institute for Defense Analyses