Techniques for Mapping Synthetic Aperture Radar Processing Algorithms to Multi-GPU Clusters
Abstract
This paper presents a design for parallel processing of synthetic aperture radar (SAR) data using multiple Graphics Processing Units (GPUs). Our approach supports real-time reconstruction of a two-dimensional image from a matrix of echo pulses and their response values. Key to runtime efficiency is a partitioning scheme that divides the output image into tiles and the input matrix into a collection of pulses associated with each tile. Each image tile and its associated pulse set are distributed to thread blocks across multiple GPUs, which support parallel computation with near-optimal I/O cost. The partial results are subsequently combined by a host CPU. Further efficiency is realized by the GPU's low-latency thread scheduling, which masks memory access latencies. Performance analysis quantifies runtime as a function of input/output parameters and number of GPUs. Experimental results were generated with 10 nVidia Tesla C2050 GPUs having maximum throughput of 972 Gflop/s. Our approach scales well for output (reconstructed) image sizes from 2,048 x 2,048 pixels to 8,192 x 8,192 pixels.
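The partitioning idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the report's actual tile dimensions, its rule for associating pulses with tiles, or how tiles are scheduled onto GPUs, so everything here is an assumption made for illustration: the names `make_tiles`, `assign_tiles`, and `covered_pixels` are hypothetical, the 512-pixel tile size is arbitrary, and the round-robin assignment is a stand-in for whatever distribution the authors use. The sketch covers only the bookkeeping for splitting a square output image into tiles and spreading them across devices; it does not model pulse sets, GPU kernels, or the host-side combination step.

```python
from typing import Dict, List, Tuple

# A tile is (row0, col0, height, width) in output-image pixel coordinates.
Tile = Tuple[int, int, int, int]


def make_tiles(image_size: int, tile_size: int) -> List[Tile]:
    """Split a square image_size x image_size image into tiles.

    Tiles on the bottom and right edges are clipped when image_size
    is not a multiple of tile_size.
    """
    tiles: List[Tile] = []
    for r in range(0, image_size, tile_size):
        for c in range(0, image_size, tile_size):
            h = min(tile_size, image_size - r)
            w = min(tile_size, image_size - c)
            tiles.append((r, c, h, w))
    return tiles


def assign_tiles(tiles: List[Tile], num_gpus: int) -> Dict[int, List[Tile]]:
    """Assign tiles to GPU indices round-robin (an assumed policy)."""
    assignment: Dict[int, List[Tile]] = {g: [] for g in range(num_gpus)}
    for i, tile in enumerate(tiles):
        assignment[i % num_gpus].append(tile)
    return assignment


def covered_pixels(assignment: Dict[int, List[Tile]]) -> int:
    """Total pixel count over all assigned tiles, as a coverage check."""
    return sum(h * w for tiles in assignment.values() for (_, _, h, w) in tiles)


if __name__ == "__main__":
    # Smallest image size and GPU count mentioned in the abstract;
    # the tile size of 512 is an illustrative assumption.
    tiles = make_tiles(2048, 512)
    assignment = assign_tiles(tiles, 10)
    for gpu, gpu_tiles in assignment.items():
        print(f"GPU {gpu}: {len(gpu_tiles)} tile(s)")
    print("pixels covered:", covered_pixels(assignment))
```

With these assumed parameters the tile count is not a multiple of the GPU count, so some devices receive one more tile than others. A real scheduler would likely need to balance load by the size of each tile's pulse set rather than by tile count alone.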
Document Details
- Document Type: Technical Report
- Publication Date: Dec 01, 2012
- Accession Number: ADA639785
Entities
People
- Eric Hayden
- Gunasekaran Seetharaman
- Mark Schmalz
- Sanjay Ranka
- Sartaj Sahni
- William Chapman
Organizations
- Air Force Research Laboratory