Visual Common Sense Reasoning for Multi-agent Activity Prediction and Recognition
Abstract
Objective
Develop computer vision algorithms and common sense reasoning for understanding human and vehicular activities and interactions in complex real-world scenes.
Short Work Statement
The PIs will develop computer vision algorithms for understanding human and vehicular activities from real-world surveillance videos. They will develop appropriate datasets composed of real-world activities and clip art for algorithm development and evaluation. They will investigate and develop Long Short-Term Memory (LSTM) networks for learning models of activities, Markov Logic Networks (MLNs) for common sense knowledge and reasoning, and methods for integrating these networks.
Technical Approach
This project is a collaboration of Larry Davis (Maryland), Silvio Savarese (Stanford), and Devi Parikh (Virginia Tech). The PIs propose to develop computer vision algorithms that employ large bodies of task-relevant common sense knowledge to analyze the movements of multiple, interacting agents and predict their future behaviors based on an integration of large-scale data analysis and common sense reasoning. The goal is not to recognize every movement and interaction of people and vehicles in complex scenes, but to identify anomalies that have a semantic basis, for example safety violations when monitoring a busy intersection. To achieve this goal, they propose to investigate a novel attentional mechanism for video analysis based on powerful, data-driven predictive models of human and vehicular activity. These predictive models will be based on deep models such as Long Short-Term Memory (LSTM) networks. When predictions diverge from observations, common sense reasoning is focused on determining why. This common sense knowledge will be acquired using a clip art interface extended to the video domain and specialized to overhead scene analysis. The research will be developed and evaluated using video datasets that include extensive interactions of people and vehicles with one another. They will investigate and extend Markov Logic Networks (MLNs) for representing common sense knowledge and reasoning.
Merit/Relevance
This research addresses ONR's Information Dominance focus area, as well as its Autonomy and Unmanned Systems focus area. This work is expected to advance visual surveillance systems to recognize unusual multi-actor behaviors and events.
Overall Merit
This research is expected to develop novel approaches toward building advanced visual surveillance systems for activity recognition by systematic incorporation of scene context and common sense knowledge and reasoning.
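As a rough illustration of the attentional idea in the Technical Approach (attend only where predictions diverge from observations), the following minimal sketch flags the time steps of a track whose observed position departs from a predicted one. It is not the PIs' method: the constant-velocity predictor is a stand-in for a learned model such as an LSTM, and the names `predict_next`, `divergent_steps`, and the `threshold` parameter are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def predict_next(track: List[Point]) -> Point:
    """Constant-velocity stand-in for a learned predictor (assumption, not an LSTM)."""
    (x0, y0), (x1, y1) = track[-2], track[-1]
    return (2 * x1 - x0, 2 * y1 - y0)


def divergent_steps(track: List[Point], threshold: float) -> List[int]:
    """Return indices whose observed position is farther than `threshold` from the prediction."""
    flagged = []
    for t in range(2, len(track)):
        px, py = predict_next(track[:t])
        ox, oy = track[t]
        error = ((ox - px) ** 2 + (oy - py) ** 2) ** 0.5
        if error > threshold:
            flagged.append(t)  # a candidate for closer common sense reasoning
    return flagged


# A hypothetical agent moving east that turns sharply north at step 4.
track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (3.0, 2.0), (3.0, 4.0)]
print(divergent_steps(track, threshold=1.0))
```

Only the flagged steps would be handed to a more expensive reasoning stage, which is the point of using prediction error as an attention signal.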
Document Details
- Document Type
- DoD Grant Award
- Publication Date
- Aug 12, 2016
- Source ID
- N000141612713
Entities
People
- Larry Davis
Organizations
- Office of Naval Research
- United States Navy
- University of Maryland