Delivering Sensory and Semantic Visual Information via Auditory Feedback on Mobile Technology
Abstract
Research Idea/Rationale: Modern smartphones have recently advanced in their ability to recognize objects from the camera feed. They have also begun to support new sensors and can now produce rich, multilayered sounds and speech. In combination, these advances provide unprecedented opportunities to make various types of visual information accessible to the blind, with the smartphone acting as a substitute window into the world that names visual objects and plays sounds conveying their visual features. Using traditional vision-to-sound conversion approaches, visual dimensions in greyscale images such as brightness, height, and laterality can be turned into auditory loudness, pitch, and panning/time, respectively. This allows blind listeners to mentally reconstruct the original image in order to understand and interact with the visual world. This approach preserves a high level of visual detail in sound and provides the highest visual resolution to blind users, surpassing tactile approaches as well as retinal and cortical prosthetics. However, fully blind users have expressed concerns about the (a) initial practicality and (b) user experience of current visual assistive technologies in complex natural environments. To address these concerns, we plan to leverage advanced mobile technologies and incorporate practical sensors (e.g., depth, thermal, color) for vision-to-sound conversions, while making the auditory feedback more pleasant and easier to understand. We will use advanced computer-vision technologies to provide object descriptions in real time, further improving practicality and reducing the mental effort required to understand complex sound patterns. This will allow users to hear the names of objects and track their locations in the environment.
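The traditional vision-to-sound mapping described above (brightness to loudness, height to pitch, laterality to panning/time) can be illustrated with a minimal sketch. This is not the project's implementation: the function name `image_to_tones`, the one-second left-to-right scan, the 200 to 3200 Hz frequency range, and the logarithmic pitch scale are all assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tone:
    onset_s: float    # start time; left columns play first (laterality -> time)
    freq_hz: float    # pitch; pixels higher in the image sound higher
    amplitude: float  # loudness in 0..1; brighter pixels are louder
    pan: float        # stereo position, -1.0 (left) to +1.0 (right)


def image_to_tones(image, scan_s=1.0, f_low=200.0, f_high=3200.0):
    """Convert a greyscale image (rows of brightness values in 0..1)
    into a list of tones. All parameter defaults are illustrative
    assumptions, not values taken from the project."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    tones = []
    for c in range(cols):
        # Laterality: column index sets both onset time and stereo pan.
        onset = scan_s * c / cols
        pan = 0.0 if cols == 1 else -1.0 + 2.0 * c / (cols - 1)
        for r in range(rows):
            brightness = image[r][c]
            if brightness <= 0.0:
                continue  # black pixels stay silent
            # Height: row 0 is the top of the image, so it gets the highest pitch.
            height = 1.0 if rows == 1 else 1.0 - r / (rows - 1)
            freq = f_low * (f_high / f_low) ** height  # assumed log pitch scale
            tones.append(Tone(onset, freq, brightness, pan))
    return tones


if __name__ == "__main__":
    # A 2x2 image: bright pixel at top right, dim pixel at bottom left.
    for tone in image_to_tones([[0.0, 1.0], [0.5, 0.0]]):
        print(tone)
```

In this sketch a listener would hear a quiet, low tone on the left first, followed by a loud, high tone on the right, which is the kind of pattern a user would need to learn to reconstruct the image mentally.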
By combining these approaches, users can switch between hearing a recognized object’s name and location and hearing its shape, color, temperature, and movement through simple, pleasant, and intuitive sound patterns. We plan to develop these approaches into a smartphone App and recruit blind end-users to evaluate whether this mobile technology can meet users’ practicality and usability expectations.
Objective(s) and/or Hypothesis(es): This 2-year project seeks to (i) develop a prototype smartphone App that conveys sensory features (3D space, thermal, color, etc.) through musical sounds, and recognized objects’ names through spatialized verbal feedback, in real time; and (ii) conduct beta-testing with blind end-users to evaluate the App’s (a) practicality and (b) user experience, in order to further refine the mobile technology to meet these goals.
Specific Aims: In year 1, a prototype smartphone App will be developed to (1) identify a variety of everyday objects and track their position in 3D space; (2) report each object’s position and name via verbal feedback; and (3) play sounds conveying sensory features (e.g., shape, color, heat) of the identified objects. In year 2, 30 blind users will beta-test the App for 3 months each (10 at a time). They will evaluate a separate mode (musical, verbal, both) in each month in terms of its (a) practicality and (b) user experience. Beta-testers will provide feedback and take part in interviews/questionnaires. This will guide further technical refinements to the App, aiming to improve how naming and sensory information are conveyed to end-users. After 3 months, the next 10 beta-testers will examine the iteration of the App revised from prior feedback. We will identify common themes and experiences users have with the mobile technology, track the perceived usability of each App iteration, and explore what activities users believe the App may facilitate.
Feedback will help scientists and developers understand user perceptions of sensory technology in general, guide technical development of this App, and prepare it for use in daily life by blind or visually
Document Details
- Document Type: DoD Grant Award
- Publication Date: Dec 05, 2021
- Source ID: W81XWH2110615
Entities
People
- Kevin C Chan
Organizations
- Grossman School of Medicine
- United States Army