Delivering Sensory and Semantic Visual Information via Auditory Feedback on Mobile Technology

Abstract

This project seeks to create and assess new visual-assistive smartphone Apps that help fully blind end users interact with the visual environment. These Apps convey sensory information gathered by cameras and sensors (e.g. color, distance, heat) and semantic information from artificial intelligence (e.g. object identity, shape, size, location in the image). This information is conveyed through spoken verbal feedback (e.g. "chair, bottom left; TV, middle right") and/or musical audio (e.g. musical meows that play from a cat's location). Our research purpose is to produce new Apps that increase visual information accessibility, enhance daily functionality, and facilitate new interactions of interest to blind end users. In terms of scope, this 2-year project focuses on the initial development of novel technologies in the first year, with at-home beta-testing by fully blind subjects and further technology refinement in the second year. In year 1, we combined the 3D sensors of modern iPhones (e.g. LiDAR range-finding) with DeepLabV3 object segmentation to provide the required information, all running in real time, locally on the iPhone. By reporting objects and their distances, we give users a stable and intuitive understanding of the environment that remains consistent across variable lighting or minor changes in object features. Overall, by combining object identity, distance, and visual features, this approach goes beyond prior technologies in providing both sensory- and semantic-level information to the user, either separately or in a novel combined hybrid format. In year 2, we provided blind beta testers with our Apps and supporting technology to try at home, followed by a series of interviews and questionnaires to assess their experiences of the Apps, the Apps' suitability for daily tasks, and the testers' desires for future technologies. Based on our findings, we are currently revising our Apps to meet the testers' requests.
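As an illustration only, the combination of object segmentation and distance sensing described in the abstract could be turned into a spoken-style phrase roughly as follows. This is a minimal sketch and not the project's implementation: the `LABELS` table, the 3x3 grid of image regions, the use of the median distance per object, and the function names `region_name` and `describe` are all assumptions made for this example.

```python
from statistics import median

# Hypothetical class-id-to-name table; the report does not specify one.
LABELS = {1: "chair", 2: "TV"}

# Assumed 3x3 division of the image into named regions.
ROWS = ("top", "middle", "bottom")
COLS = ("left", "center", "right")


def region_name(row_frac, col_frac):
    """Map a normalized (row, col) position in [0, 1] to a grid-region name."""
    r = ROWS[min(int(row_frac * 3), 2)]
    c = COLS[min(int(col_frac * 3), 2)]
    return "center" if (r, c) == ("middle", "center") else f"{r} {c}"


def describe(mask, depth):
    """Return one phrase per labeled object found in a segmentation mask.

    mask  -- 2D list of integer class ids (0 = background)
    depth -- 2D list of distances in meters, same shape as mask
    """
    height, width = len(mask), len(mask[0])

    # Collect the pixel coordinates belonging to each known class.
    pixels = {}
    for y, row in enumerate(mask):
        for x, cls in enumerate(row):
            if cls in LABELS:
                pixels.setdefault(cls, []).append((y, x))

    phrases = []
    for cls, pts in sorted(pixels.items()):
        # Centroid of the object's pixels gives its location in the image.
        cy = sum(p[0] for p in pts) / len(pts)
        cx = sum(p[1] for p in pts) / len(pts)
        # Median distance over the object's pixels (an assumed summary choice).
        dist = median(depth[y][x] for y, x in pts)
        where = region_name((cy + 0.5) / height, (cx + 0.5) / width)
        phrases.append(f"{LABELS[cls]}, {where}, {dist:.1f} meters")
    return phrases
```

In a real App the mask would come from the segmentation model and the depth map from the range sensor, and the resulting phrases would be passed to a speech synthesizer; here both inputs are plain nested lists so the sketch stays self-contained.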

Document Details

Document Type
Technical Report
Publication Date
Oct 01, 2023
Accession Number
AD1226041

Entities

People

  • Giles Hamilton-Fletcher
  • Kevin C Chan

Organizations

  • New York University

Tags

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Computer Vision
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - DoD AI Strategy