Delivering Sensory and Semantic Visual Information via Auditory Feedback on Mobile Technology
Abstract
This project seeks to create and assess new visual-assistive smartphone Apps that help fully blind end users better interact with the visual environment. These Apps convey sensory information gathered by cameras and sensors (e.g. color, distance, heat) and semantic information from artificial intelligence (e.g. object identity, shape, size, location in the image). This information is conveyed through spoken verbal feedback (e.g. "chair, bottom left; TV, middle right") and/or musical audio (e.g. musical meows that play from a cat's location). Our research purpose is to produce new Apps that increase visual information accessibility, enhance daily functionality, and facilitate new interactions of interest to blind end users. In terms of scope, this 2-year project focuses on the initial development of novel technologies in the first year, with at-home beta-testing by fully blind subjects and further technology refinement in the second year. In year 1, we combined the modern iPhone's 3D sensors (e.g. LiDAR range-finding) with DeepLabV3 object segmentation to provide the required information, all running in real time, locally on the iPhone. By providing objects and their distances, we provide a stable and intuitive understanding of the environment that remains consistent across variable lighting or minor changes in object features. Overall, by effectively combining object identity, distance, and visual features, this approach goes beyond prior technologies in providing both sensory and semantic-level information to the user, either separately or in a novel combined hybrid format. In year 2, we provided blind beta testers with our Apps and supporting technology to try at home, followed by a series of interviews and questionnaires to assess their experiences of the Apps, the Apps' suitability for daily tasks, and their desires for future technologies. Based on our findings, we are currently revising our Apps to meet their requests.
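As an illustration only, the following minimal Python sketch shows one way a segmentation mask and a depth map could be turned into spoken-style phrases such as "chair, bottom left". It is not the project's implementation, which the abstract describes as running on iPhone. The label ids, the nine-region grid, the distance readout, and the function names are all assumptions made for this example.

```python
from statistics import median

# Hypothetical label ids; a real segmentation model defines its own.
LABELS = {1: "chair", 2: "TV"}

ROWS = ("top", "middle", "bottom")
COLS = ("left", "center", "right")


def region_name(row, col, height, width):
    """Map an image position to one of nine coarse regions (assumed 3x3 grid)."""
    r = ROWS[min(2, int(3 * row / height))]
    c = COLS[min(2, int(3 * col / width))]
    return "center" if (r, c) == ("middle", "center") else f"{r} {c}"


def describe(mask, depth):
    """Return one phrase per labelled object: name, region, median distance."""
    height, width = len(mask), len(mask[0])
    pixels = {}  # label id -> list of (row, col, distance in meters)
    for y in range(height):
        for x in range(width):
            label = mask[y][x]
            if label in LABELS:
                pixels.setdefault(label, []).append((y, x, depth[y][x]))
    phrases = []
    for label, pts in sorted(pixels.items()):
        # Centroid of the object's pixels, shifted to pixel centers.
        cy = sum(p[0] for p in pts) / len(pts) + 0.5
        cx = sum(p[1] for p in pts) / len(pts) + 0.5
        # Median distance is less sensitive to stray depth readings than the mean.
        dist = median(p[2] for p in pts)
        where = region_name(cy, cx, height, width)
        phrases.append(f"{LABELS[label]}, {where}, {dist:.1f} m")
    return phrases


# Toy 6x6 scene: a chair in the bottom-left corner, a TV at the middle right.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
]
distance_by_label = {0: 5.0, 1: 1.5, 2: 3.0}
depth = [[distance_by_label[v] for v in row] for row in mask]

print("; ".join(describe(mask, depth)))
# prints: chair, bottom left, 1.5 m; TV, middle right, 3.0 m
```

In this sketch the resulting phrases would be handed to a text-to-speech engine. Keying the description to object identity and distance rather than raw pixel values reflects the abstract's point that such output stays consistent under variable lighting.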
Document Details
- Document Type: Technical Report
- Publication Date: Oct 01, 2023
- Accession Number: AD1226041
Entities
People
- Giles Hamilton-Fletcher
- Kevin C Chan
Organizations
- New York University