Protecting Databases from Malicious Discovery through Automated Similarity Queries

Abstract

Companies, hospitals, and research laboratories in certain domains have developed extensive databases, such as clinical databases, as part of their research or daily activities. The entities that have developed these databases may wish to lease parts of the database or allow external users to use them. Because of the significant time and monetary investment in developing the databases, and the proprietary or private nature of the data itself, they may not want to sell or allow access to the entire database. However, we show that such databases are vulnerable to reverse engineering through commonly used similarity-based queries. We identify important security issues related to k-NN search and investigate its vulnerability to users who try to copy the database by sending automated queries. We analyze two models for similarity search, namely the reply model and the score model. The reply model responds with the k tuples that most closely match the query according to some metric, whereas the score model responds with only the similarity score, which better preserves privacy. For these models, we analyze possible attack methodologies and develop strategies for detecting potential attacks. We state the limits of protection provided by each query response model and provide techniques to guard the database against malicious discovery.
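The difference between the two response models can be illustrated with a minimal sketch. It is not taken from the report: the function names `reply_model` and `score_model`, the toy data, the Euclidean metric, and the reading of "score" as the distances of the k nearest tuples are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    # Assumed metric; the abstract only says "some metric".
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def reply_model(db, query, k):
    # Hypothetical reply model: return the k tuples closest to the query.
    return sorted(db, key=lambda t: euclidean(t, query))[:k]

def score_model(db, query, k):
    # Hypothetical score model: return only the k smallest distances,
    # never the tuples themselves.
    return sorted(euclidean(t, query) for t in db)[:k]

# Toy database of 2-D tuples (illustrative data only).
db = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (6.0, 8.0)]
query = (0.0, 0.0)

replies = reply_model(db, query, 2)  # exposes stored tuples directly
scores = score_model(db, query, 2)   # exposes only distances
```

In this sketch, each reply-model answer hands the caller actual stored tuples, while the score model reveals only how far the nearest tuples lie from the query, so a caller would need many more queries to locate any tuple.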

Document Details

Document Type: Technical Report
Publication Date: Feb 01, 2006
Accession Number: AD1001208

Entities

People

Ali S. Tosun
Fatih Altiparmak
Hakan Ferhatosmanoglu

Organizations

Ohio State University
