Maintaining Consistency and Relevancy in Multi-Image Visual Storytelling
Abstract
This report proposes an approach to visual storytelling across multiple images that prioritizes two aspects of narrative generation: 1) the retention of narrative consistency between clauses in the generated story; and 2) the retention of relevancy between the generated story and the images from which it was derived. We take a structured approach to multi-image visual storytelling that centers on the middle image in a sequence of three. The middle image acts as the focal point, or climax, of the narrative. The plot points surrounding it are selected using events from the Atlas of Machine Commonsense (ATOMIC) corpus for if-then reasoning about daily activities, and the selected events are then grounded to the images. The result is an architecture that, given an author goal in the form of a prompt to guide the story, generates a short narrative that retains a narrative arc and does not deviate from the content of the images.
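The selection-then-grounding idea described in the abstract can be illustrated with a minimal sketch. This is not the report's implementation: the toy knowledge base `TOY_KB`, the `Image` class with its `labels` field, the word-overlap score, and the function names are all assumptions made for illustration. A real system would query the ATOMIC corpus and use a visual grounding model, and the author-goal prompt is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Image:
    # Hypothetical stand-in for an image: a set of lowercase labels
    # (objects or actions) assumed to come from some detector.
    labels: set


# Toy stand-in for if-then commonsense knowledge (NOT the ATOMIC corpus):
# each climax event maps to candidate events before and after it.
TOY_KB = {
    "PersonX eats cake": {
        "before": ["PersonX buys cake", "PersonX goes for a run"],
        "after": ["PersonX washes dishes", "PersonX flies a kite"],
    }
}


def overlap(event, image):
    """Count how many words of the event also appear among the image labels."""
    words = set(event.lower().split())
    return len(words & image.labels)


def pick(candidates, image):
    """Choose the candidate event that best matches the image (assumed scoring)."""
    return max(candidates, key=lambda event: overlap(event, image))


def plan_story(climax_event, first, last, kb=TOY_KB):
    """Build a three-event plot: before, climax (middle image), after."""
    entry = kb[climax_event]
    return [
        pick(entry["before"], first),  # grounded to the first image
        climax_event,                  # anchored on the middle image
        pick(entry["after"], last),    # grounded to the last image
    ]


first_image = Image(labels={"cake", "shop", "buys"})
last_image = Image(labels={"dishes", "sink"})
print(plan_story("PersonX eats cake", first_image, last_image))
```

In this sketch the climax event is fixed first, and the surrounding events are chosen only from candidates causally linked to it. That ordering is what keeps the plot consistent, while the per-image scoring keeps each event tied to what the image shows.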
Document Details
- Document Type: Technical Report
- Publication Date: Nov 13, 2020
- Accession Number: AD1115439
Entities
People
- Aishwarya Sapkale
- Stephanie M. Lukin
Organizations
- United States Army Research Laboratory
- University of Maryland, Baltimore County