Developing Explainable and Trustworthy Visual Recognition through Attributes

Abstract

This research program introduces a computer vision framework that produces explainable, correctable, and extendable visual representations for image recognition. We force the model to use object attributes to predict a category label. Given an image, our framework produces a human-readable justification that is certified to explain the internal reasoning of the model. The core of our project is a visual dictionary of the objects, activities, and events in the visual world. We introduce a new method for constructing this visual dictionary on a large scale without requiring manual supervision, capitalizing on current large natural language models. Each thrust will share this representational interface, driving deep integration throughout the project. Instead of the bag of separate attributes used in prior approaches, we propose a method for capturing relations among attributes and between attributes and classes, as well as sub-classes. This should lead to improvements both in generalizing to new instances of training classes and to novel classes based on new combinations of attributes, as in few-shot learning. Finally, prior work has failed to demonstrate clear advantages to extracting attributes and representing objects with them. Here we propose an automated self-improvement loop for the model, where challenge datasets are constructed, model errors are analyzed, and new attributes are added to the model, thereby improving model performance.
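
The idea of predicting a category label only through attributes, and reading a justification off the same computation, can be illustrated with a toy sketch. This is not the project's method: the attribute names, the class weights, and the function `classify_with_justification` are all hypothetical, and a simple linear scoring rule stands in for whatever model the program actually uses.

```python
# Hypothetical attribute vocabulary and class weights; none of these
# names or numbers come from the award abstract.
ATTRIBUTES = ["has wings", "has fur", "has wheels", "has beak"]

CLASS_WEIGHTS = {
    "bird": [1.0, -0.5, -1.0, 1.0],
    "cat": [-1.0, 1.0, -1.0, -0.5],
    "car": [-0.5, -1.0, 1.5, -0.5],
}


def classify_with_justification(attribute_scores, top_k=2):
    """Predict a class from attribute scores alone and report the
    attributes that contributed most to the winning class."""
    # Per-class, per-attribute contribution: weight times attribute score.
    contributions = {
        label: [w * s for w, s in zip(weights, attribute_scores)]
        for label, weights in CLASS_WEIGHTS.items()
    }
    totals = {label: sum(c) for label, c in contributions.items()}
    predicted = max(totals, key=totals.get)

    # Rank attributes by how much they supported the predicted class.
    ranked = sorted(
        zip(ATTRIBUTES, contributions[predicted]),
        key=lambda pair: pair[1],
        reverse=True,
    )
    justification = [name for name, value in ranked[:top_k] if value > 0]
    return predicted, justification


# Attribute scores would come from an image model; here they are made up.
scores = [0.9, 0.1, 0.0, 0.8]
label, why = classify_with_justification(scores)
print(label, why)
```

Because the label depends on the image only through the attribute scores, the listed attributes are the actual inputs to the decision rather than a post-hoc explanation, which is the property the abstract emphasizes.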

Document Details

Document Type
DoD Grant Award
Publication Date
May 15, 2023
Source ID
N000142312436

Entities

People

  • Richard Zemel

Organizations

  • Office of Naval Research
  • Trustees of Columbia University in the City of New York
  • United States Navy

Tags

Fields of Study

  • Computer science

Readers

  • Database Systems and Applications
  • Distributed Systems and Data Platform Development
  • Neural Network Machine Learning

Technology Areas

  • AI & ML