Maintaining Consistency and Relevancy in Multi-Image Visual Storytelling

Abstract

This report proposes an approach to visual storytelling across multiple images that prioritizes two aspects of narrative generation: 1) the retention of narrative consistency between clauses in the generated story; and 2) the retention of relevancy between the generated story and the images from which it was derived. We take a structured approach to multi-image visual storytelling that centers on the middle image in a sequence of three. The middle image acts as the focal point, or climax, of the narrative; the plot points surrounding it are selected using events from the Atlas of Machine Commonsense (ATOMIC) corpus for if-then reasoning about daily activities, and the selected events are then grounded to the images. The result is an architecture that, given an author goal in the form of a prompt to guide the story, generates a short narrative that retains a narrative arc and does not deviate from the content of the images.
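
The middle-image-first structure described in the abstract can be illustrated with a minimal sketch. This is not the report's implementation: the names IfThenEvent, TOY_KB, overlap, pick, and tell_story are hypothetical, the knowledge base is a hand-written stand-in rather than real ATOMIC data, each image is assumed to be already reduced to a set of label words, and grounding is approximated by simple word overlap.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class IfThenEvent:
    event: str          # candidate climax event
    before: List[str]   # events that could plausibly precede it
    after: List[str]    # events that could plausibly follow it


# Toy stand-in for an if-then commonsense knowledge base (NOT real ATOMIC data).
TOY_KB = [
    IfThenEvent(
        event="a person blows out candles on a cake",
        before=["a person invites friends to a party",
                "a person goes hiking alone"],
        after=["a person opens presents with friends",
               "a person repairs a car"],
    ),
    IfThenEvent(
        event="a person scores a goal",
        before=["a person joins a team"],
        after=["a person celebrates with teammates"],
    ),
]


def overlap(text: str, labels: Set[str]) -> int:
    """Count distinct words in `text` that also appear in `labels`."""
    return len(set(text.lower().split()) & labels)


def pick(candidates: List[str], labels: Set[str]) -> str:
    """Return the candidate sharing the most words with the image labels."""
    return max(candidates, key=lambda c: overlap(c, labels))


def tell_story(image_labels: List[Set[str]], kb: List[IfThenEvent],
               prompt: str) -> List[str]:
    """Build a three-clause story centered on the middle image."""
    first, middle, last = image_labels
    # Choose the climax from the middle image, steered by the author's prompt.
    goal = middle | set(prompt.lower().split())
    climax = max(kb, key=lambda e: overlap(e.event, goal))
    # Choose the surrounding plot points from the climax's if-then relations,
    # keeping only those that best match the first and last images.
    return [pick(climax.before, first), climax.event, pick(climax.after, last)]
```

Called with label sets for three party photos and the prompt "birthday", the sketch selects the cake event as the climax and then picks the preceding and following events that best match the first and last images. Restricting the outer clauses to relations of the chosen climax is one simple way to keep the clauses consistent with each other, while the overlap score keeps them tied to image content.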

Document Details

Document Type: Technical Report
Publication Date: Nov 13, 2020
Accession Number: AD1115439

Entities

People

Aishwarya Sapkale
Stephanie M. Lukin

Organizations

United States Army Research Laboratory
University of Maryland, Baltimore County
