Visual Common Sense Reasoning for Multi-agent Activity Prediction and Recognition

Abstract

Objective: Develop computer vision algorithms and common sense reasoning for understanding human and vehicular activities and interactions in complex real-world scenes.

Short Work Statement: The PIs will develop computer vision algorithms for understanding human and vehicular activities from real-world surveillance videos. They will develop appropriate datasets composed of real-world activities and clip art for algorithm development and evaluation. They will investigate and develop Long Short-Term Memory (LSTM) networks for learning models of activities, Markov Logic Networks (MLNs) for common sense knowledge and reasoning, and methods for integrating these networks.

Technical Approach: This project is a collaboration of Larry Davis (Maryland), Silvio Savarese (Stanford), and Devi Parikh (Virginia Tech). The PIs propose to develop computer vision algorithms that employ large bodies of task-relevant common sense knowledge to analyze the movements of multiple, interacting agents and predict their future behaviors based on an integration of large-scale data analysis and common sense reasoning. The goal is not to recognize every movement and interaction of people and vehicles in complex scenes, but to identify anomalies that have a semantic basis, for example safety violations when monitoring a busy intersection. To achieve this goal, they propose to investigate a novel attentional mechanism for video analysis based on powerful, data-driven predictive models of human and vehicular activity. These predictive models will be based on deep models such as Long Short-Term Memory (LSTM) networks. When predictions diverge from observations, common sense reasoning will be focused on determining why. This common sense knowledge will be acquired using a clip art interface extended to the video domain and specialized to overhead scene analysis. The research will be developed and evaluated using video datasets that include extensive interactions of people and vehicles with one another. The PIs will also investigate and extend Markov Logic Networks (MLNs) for representing common sense knowledge and reasoning.

Merit/Relevance: This research addresses ONR's Information Dominance focus area, as well as the Autonomy and Unmanned Systems focus area. This work is expected to advance visual surveillance systems to recognize unusual multi-actor behaviors and events.

Overall merit: This research is expected to develop novel approaches toward building advanced visual surveillance systems for activity recognition by systematic incorporation of scene context and common sense knowledge and reasoning.
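
The attentional idea in the Technical Approach (reason further only where predictions and observations diverge) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract proposes LSTM-based predictors, whereas this sketch substitutes a simple constant-velocity extrapolation, and the names `predict_next`, `divergent_frames`, and the threshold value are hypothetical rather than taken from the project.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def predict_next(prev: Point, curr: Point) -> Point:
    """Stand-in predictor (assumption): constant-velocity extrapolation.

    The project proposes learned LSTM models; this simple rule is used
    here only so the example stays self-contained.
    """
    return (2 * curr[0] - prev[0], 2 * curr[1] - prev[1])


def divergent_frames(track: List[Point], threshold: float) -> List[int]:
    """Return indices of frames where the observed position is farther
    than `threshold` from the position predicted from the two prior frames."""
    flagged = []
    for t in range(2, len(track)):
        px, py = predict_next(track[t - 2], track[t - 1])
        ox, oy = track[t]
        error = ((ox - px) ** 2 + (oy - py) ** 2) ** 0.5
        if error > threshold:
            flagged.append(t)
    return flagged


# Hypothetical overhead track: an agent moves east, then turns sharply north.
track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (3.0, 2.0), (3.0, 4.0)]
print(divergent_frames(track, threshold=1.0))  # prints [4]
```

Only the frame at the sharp turn is flagged; in the scheme the abstract describes, such frames would be the ones handed to common sense reasoning to determine why the prediction failed.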

Document Details

Document Type: DoD Grant Award
Publication Date: Aug 12, 2016
Source ID: N000141612713

Entities

People

Larry Davis

Organizations

Office of Naval Research
United States Navy
University of Maryland
