Mining Spatiotemporal Knowledge from Language and Images
Abstract
Research Problem. Our main goal is to investigate algorithms and models for mining spatiotemporal knowledge from language and images, including (a) spatial information about whether someone is or is not located somewhere, and (b) past, current, and future temporal information indicating when and for how long. For example, a picture of a U-Haul truck suggests that somebody is moving to a new location and is therefore likely to be there for years, as opposed to just a few days or weeks. We will achieve our main goal by pursuing two objectives: (1) to investigate the spatiotemporal knowledge that humans understand from language and images, and (2) to design multimodal algorithms to mine that knowledge. The proposed work has several innovations. First, we go beyond named entity recognition and plain spatial relations. Second, we will investigate both language and visual cues to extract spatiotemporal knowledge. Third, we will define novel global inference methods that combine several modalities (language and images) with external knowledge.
Technical Approach. We will follow an empirical, data-driven approach. Regarding Objective 1, we will work with readily available collections of pictures and their accompanying texts. We will (a) involve nonexperts via crowdsourcing, because we are interested in spatiotemporal knowledge as intuitively understood by humans, and (b) iteratively refine the knowledge to be annotated. This refinement process will culminate in an annotation strategy that maximizes the amount of spatiotemporal knowledge accounted for while ensuring both productivity and quality. Regarding Objective 2, we will investigate machine learning strategies to mine spatiotemporal knowledge jointly from language and images. We will build upon existing deep neural networks for language and image processing, and define novel algorithms to reason jointly over information extracted from both modalities.
Beyond supervised frameworks, we will investigate unsupervised frameworks that incorporate event understanding (event structures and ordering, expected durations, etc.) in order to reveal spatiotemporal knowledge involving arbitrary locations and times. These algorithms will learn to attend to the most relevant language and visual cues using representation learning on graphs.
Anticipated Outcomes. The primary result of this project will be scientific publications detailing the fundamental research carried out and describing, among others, the following:
- Crowdsourcing strategies to investigate and analyze spatiotemporal knowledge.
- A thorough analysis of the spatiotemporal knowledge intuitively understood by humans.
- Algorithms to merge language and image processing into a unified inference framework.
- Algorithms and computational models to infer spatiotemporal knowledge. We will define (a) supervised frameworks to reproduce the crowdsourced spatiotemporal knowledge and (b) unsupervised frameworks to infer spatiotemporal knowledge regarding arbitrary locations.
Impact on NGA’s Capabilities. Successful completion of the proposed research will advance the areas of spatiotemporal analysis, geolocation, and data uncertainty. Specifically, we will investigate (a) how much and what kind of spatiotemporal knowledge humans intuitively understand from language alone, from images alone, and from the two combined (Objective 1), and (b) algorithms and models that couple language and image processing in order to automate the task (Objective 2). Additionally, the work described here will shed light on multimodal spatiotemporal analysis, that is, analysis that takes into account both language and images. Beyond plain spatial relations (x is at y), we will investigate complex spatiotemporal knowledge, including information about where someone is not located, and incorporate temporal anchors to specify when and for how long.
Document Details
- Document Type: DoD Grant Award
- Publication Date: Oct 06, 2020
- Source ID: HM04762010007
Entities
People
- Eduardo Blanco
Organizations
- National Geospatial-Intelligence Agency
- University of North Texas