Artemis: Integrating Scientific Data on the Grid (Preprint)

Abstract

Grid technologies provide a robust infrastructure for distributed computing, and are widely used in large-scale scientific applications that generate terabytes (soon petabytes) of data. This data is described with metadata attributes about the data properties and provenance, and is organized in a variety of metadata catalogs distributed over the grid. In order to find a collection of data that share certain properties, these metadata catalogs need to be identified and queried on an individual basis. This paper introduces Artemis, a system developed to integrate distributed metadata catalogs on the grid. Artemis exploits several AI techniques including a query mediator, a query planning and execution system, ontologies and semantic web tools to model metadata attributes, and an intelligent user interface that guides users through these ontologies to formulate queries. We describe our experiences using Artemis with large metadata catalogs from two projects in the physics domain.

Document Details

Document Type: Technical Report
Publication Date: Jul 01, 2004
Accession Number: ADA464918

Entities

People

  • Ewa Deelman
  • Rattapoom Tuchinda
  • Snehal Thakkar
  • Yolanda Gil

Organizations

  • University of Southern California

Tags

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Birds
  • Computing System Architectures
  • Data Analysis
  • Data Integration
  • Data Sets
  • Databases
  • Information Science
  • Infrastructure
  • Intelligent Systems
  • Language
  • Network Architecture
  • Observatories
  • Ontologies
  • Storage
  • User Interface
  • Web Service

Fields of Study

  • Computer science

Readers

  • Database Systems and Applications
  • Distributed Systems and Data Platform Development