Delivering Sensory and Semantic Visual Information via Auditory Feedback on Mobile Technology

Abstract

This project seeks to create and assess new visual-assistive smartphone Apps that help fully blind end users better interact with the visual environment. These Apps convey sensory information gathered by cameras and sensors (e.g. color, distance, heat) and semantic information from artificial intelligence (e.g. object identity, shape, size, location in the image). This information is conveyed through spoken verbal feedback (e.g. "chair, bottom left; TV, middle right") and/or musical audio (e.g. musical meows that play from a cat's location). Our research purpose is to produce new Apps that increase visual information accessibility, enhance daily functionality, and facilitate new interactions of interest to blind end users. In terms of scope, this 2-year project focuses on the initial development of novel technologies in the first year, with at-home beta-testing by fully blind subjects and further technology refinement in the second year. In year 1, we combined the modern iPhone's 3D sensors (e.g. LiDAR range-finding) with DeepLabV3 object segmentation to provide the required information, all running in real time, locally on the iPhone. By providing objects and their distances, we provide a stable and intuitive understanding of the environment that remains consistent across variable lighting or minor changes in object features. Overall, by effectively combining object identity, distance, and visual features, this approach goes beyond prior technologies in providing both sensory and semantic-level information to the user, either separately or in a novel combined hybrid format. In year 2, we provided blind beta testers with our Apps and supporting technology to try at home, followed by a series of interviews and questionnaires to assess their experiences of the Apps, the Apps' suitability for daily tasks, and their desires for future technologies. Based on our findings, we are currently revising our Apps to meet their requests.
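As an illustration of the spoken feedback format described in the abstract, the following is a minimal sketch of how detected objects might be turned into a phrase such as "chair, bottom left". It is not the report's implementation: the 3x3 grid, the nearest-first ordering, the function names `grid_position` and `describe`, and the input tuple layout (label, normalized center x, normalized center y, distance in meters) are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical region names for an assumed 3x3 grid over the camera image.
ROWS = ("top", "middle", "bottom")
COLS = ("left", "center", "right")


def grid_position(cx: float, cy: float) -> str:
    """Map a normalized image point (0..1, origin at top-left) to a region name."""
    # Clamp the indices so points on or beyond the image edge stay in the grid.
    col = max(0, min(int(cx * 3), 2))
    row = max(0, min(int(cy * 3), 2))
    if row == 1 and col == 1:
        return "center"
    return f"{ROWS[row]} {COLS[col]}"


def describe(detections: List[Tuple[str, float, float, float]]) -> str:
    """Build one spoken phrase from (label, cx, cy, distance_m) tuples.

    Objects are announced nearest first; this ordering is an assumption.
    """
    ordered = sorted(detections, key=lambda d: d[3])
    parts = [
        f"{label}, {grid_position(cx, cy)}, {dist:.1f} meters"
        for label, cx, cy, dist in ordered
    ]
    return "; ".join(parts)


if __name__ == "__main__":
    scene = [("TV", 0.9, 0.5, 3.0), ("chair", 0.1, 0.9, 1.2)]
    print(describe(scene))
```

In a real App the labels would come from the segmentation model and the distances from the depth sensor, and the resulting string would be passed to a text-to-speech engine; those stages are outside this sketch.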

Document Details

Document Type: Technical Report
Publication Date: Oct 01, 2023
Accession Number: AD1226041

Entities

People

Giles Hamilton-Fletcher
Kevin C Chan

Organizations

New York University
