Computer Vision and Image Understanding with Neural Volumetric Representations

Abstract

Correct geometric and photometric representations of the natural world are crucial for computer vision and image understanding. A long-term goal of computer vision has been to reconstruct the 3D visual world and to develop new methods for image understanding. This in turn enables autonomous agents to perform maritime surveillance, assist in home robotics, or create dense 3D models. Physics-based vision relies on particular representations of object geometry, such as depth maps, disparity maps, or meshes. Indeed, a rich body of work in multiview stereo and 3D computer vision has been based on these traditional geometric representations, which have also received considerable interest in computer graphics and other fields.

However, it has become clear in recent years that these geometric representations, while corresponding well to physical objects, have serious limitations. One challenge is handling occlusions and mis-calibration errors. Other representations, such as multi-plane images, allow for an alpha channel that enables smooth compositing. Nevertheless, many of the challenges of conventional geometric representations remain. Similar challenges hold within the area of view synthesis, or image-based rendering, in computer vision and graphics, where users seek to create immersive representations of scenes from a sparse set of images. This problem is of significant naval importance for visualizing realistic maritime or battlefield scenarios in an immersive environment.

We propose a radical departure from decades-old geometric and photometric representations in computer vision: neural volumetric representations. Such representations consider a continuous volume rather than a hard surface, avoiding many of the problems inherent in dealing with occlusions and mis-calibrations. Moreover, we can leverage recent advances in machine learning to represent the scattering properties at each location in the volume using a neural representation such as a multi-layer perceptron (MLP), which effectively serves as a trainable procedural function. This builds on our award-winning, widely cited and publicized work on Neural Radiance Fields (NeRFs), which proposes an implicit volumetric representation in which the volume density and color are represented by an MLP as a function of spatial location and angular direction. However, the basic NeRF method is very slow both to train and to evaluate, does not generalize to new scenes, does not enable relighting, cannot deal with very sparse user input, and does not deal with dynamic scenes; the question of the optimal representation also remains open.

This proposal seeks to address many of these challenges, developing a complete pipeline for physics-based vision using neural volumetric representations. We focus on the optimal representations, extending view synthesis to full light transport acquisition, development of sampling rates with sparse sampling, and analysis of dynamic scenes. This is of particular importance in a naval context, where maritime approaches must be able to consider volumetric media such as misty air or the ocean, and reconstruct dynamic volumetric scenes. We are proposing an inherently new representation that also impacts many areas beyond physics-based computer vision, such as reconstruction, segmentation, and scene parsing.

The PI is uniquely qualified for this proposal, with many of the foundational results on which this work builds developed with funding from prior ONR grants. The PI was named an IEEE and ACM Fellow and inducted into the SIGGRAPH Academy in the last few years. His earlier work has been recognized with the ACM SIGGRAPH Significant New Researcher Award in Computer Graphics, and the ONR PECASE and Young Investigator Awards in physics-based computer vision; the success of these and subsequent ONR grants also shows that he can work with ONR and appreciates basic research with future naval relevance.

Document Details

Document Type: DoD Grant Award
Publication Date: May 15, 2023
Source ID: N000142312526

Entities

People

Ravi Ramamoorthi

Organizations

Office of Naval Research
United States Navy
University of California, San Diego
