Dataset Curation through Renders and Ontology Matching

Abstract

In this thesis we demonstrate the benefits of automated labeled dataset creation for fine-grained visual learning tasks. Specifically, we show that utilizing real-world, non-image information can significantly reduce the human effort needed for building large-scale datasets. Computer vision has seen great advances in recent years in a number of complex tasks, such as scene classification, object detection, and image segmentation. A key ingredient in such success stories is the use of large amounts of labeled data. In many cases, the limiting factor is the ability to create these training sets. Issues arise in three forms: 1) the act of labeling the data can be hard for human annotators, 2) in some cases it is hard to get a representative sample of the feature space, and 3) data for infrequent (yet potentially important) instances can be completely absent from the training set. Business storefront classification is an example of 1). The number of possible labels is large, and assigning all relevant labels to an image is a time-consuming task for annotators. Moreover, when the image contains a business from a country other than their own, annotators can get confused by the foreign language and produce erroneous labels. Annotators are also inconsistent in how they assign businesses to categories. Vehicle viewpoint estimation is an example of 2): the images themselves are hard to come by. Getting sample images of all viewpoints is hard due to bias in the way people photograph cars, and current datasets for this task lack data for many viewpoints. In addition, the labeling task is hard for the annotators. We address these issues by adding automation to the dataset creation process. Our approach is to utilize external information by matching the images to real-world concepts. In the case of businesses, when images are mapped to an ontology of geographical entities, we are able to extract multiple relevant labels per image.

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2015
Accession Number
ADA624287

Entities

People

  • Yair Movshovitz-Attias

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Autonomy
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Computational Science
  • Computer Languages
  • Computer Science
  • Computer Vision
  • Computers
  • Dimensionality Reduction
  • Information Processing
  • Information Science
  • Information Systems
  • Machine Learning
  • Network Science
  • Neural Networks
  • Ontologies
  • Supervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Computer Vision
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Neural Networks
  • Space