Monocular Camera Localization Using a Bag of Visual Words from Virtual World Data

Abstract

The Visual Localization (VL) problem asks how to determine, from a query image, the pose of the camera that captured it. Red Green Blue (RGB) cameras have become common and inexpensive, and they capture large amounts of data, which has made building accurate VL pipelines increasingly interesting and important. These pipelines are useful in everything from Simultaneous Localization and Mapping (SLAM) and Smoothing and Mapping (SAM) for robotics to applications in Augmented Reality (AR). The work detailed in this paper seeks to determine whether a Bag of Visual Words (BOVW) can adequately look past repetitious features in an indoor environment and localize the camera that captured an image. This pipeline is intended to be used as a truth system to verify results from other navigation techniques in development by the Autonomy and Navigation Technology (ANT) Center at the Air Force Institute of Technology (AFIT), in lieu of more expensive solutions such as motion capture systems or the Global Positioning System (GPS), which is unreliable in an indoor environment.

Document Details

Document Type: Technical Report
Publication Date: Mar 24, 2022
Accession Number: AD1166918

Entities

People

Joshua A. Rinaldi

Organizations

Air Force Institute of Technology
