A3: Audio analytics, Artificial intelligence & Autonomous systems

Abstract

With recent innovations in technology, and particularly in artificial intelligence, autonomous systems are slowly becoming a reality that will impact all facets of life, ranging from entertainment, commercial and healthcare systems to defense capabilities. Unlike traditional systems whose actions are predetermined at inception, an autonomous system has to exhibit "intelligence". It should be able to handle large amounts of data incoming from its sensors, parse and interpret the incoming information based on its prior beliefs, and adapt to unknown situations or unexpected events, in order to ultimately make or inform complex decisions. With sound signals incoming from audio sensors, such systems are tasked with interpreting the physical world from sensory streams that represent complex soundscapes with nested information at multiple timescales (from milliseconds to days and years). Translating this nested multiscale information into data-driven analytics (bottom-up) as well as cognitive processes (memory and attentional selection) continues to challenge current systems. Recent advances in artificial intelligence, and particularly deep learning, offer an opportunity to inform our thinking about how this multiscale bottom-up/top-down interaction takes place, in order to better understand auditory cognition as well as translate that knowledge into better autonomous systems. In the present project, we aim to focus on multiscale analytics of complex soundscapes, considering the interplay between knowns (memory) and unknowns (salience) in the context of target selection (attention). We argue that a distributed scheme in which selectivity and invariance are simultaneously achieved is more in line with cortical-like processes and leads to truly "deep" inference that results in robust performance under novel listening conditions.
Specifically, we leverage recent advances in deep neural networks to develop a parallel architecture that boasts a number of design elements: the multiscale nature of sound is tackled in a distributed fashion; this distributed representation gives rise to a distributed memory network, also diffused along many granularities, that considers different time constants and contexts; and attentional feedback guides processing driven by both knowns and unknowns in the environment. The proposal makes explicit predictions about brain strategies for interpreting auditory scenes and ultimately for integrating audio analytics in autonomous systems. The computational framework develops artificial intelligence capabilities that will seamlessly integrate audio analytics with autonomous system platforms, allowing them to augment complex decision-making by humans. Such systems can in fact compensate for deficiencies in human perception, permitting a partnership of automated and biological systems that is superior to either alone. These systems have a direct impact on a broad array of cross-service military applications, including sensing systems for autonomous platforms operating in noisy and hostile environments, robust battlefield sensors, clutter-resistant passive sonar processors, and dynamically reconfigurable sensor webs. Over and above its engineering objective, this proposal also seeks to push the scientific understanding of auditory cognition, which will be instrumental in guiding new approaches to enhance human performance in challenging settings and in pushing forth new advances in the human-machine interface, as well as new visions for adaptive communication aids for the sensory-impaired.

Document Details

Document Type: DoD Grant Award
Publication Date: Jan 23, 2019
Source ID: N000141912014

Entities

People

Mounya Elhilali

Organizations

Johns Hopkins University
Office of Naval Research
United States Navy
