Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text

Abstract

This paper investigates how linguistic knowledge mined from large text corpora can aid the generation of natural language descriptions of videos. Specifically, we integrate both a neural language model and distributional semantics trained on large text corpora into a recent LSTM-based architecture for video description. We evaluate our approach on a collection of YouTube videos as well as two large movie description datasets, showing significant improvements in grammaticality while modestly improving descriptive quality.

Document Details

Document Type: Technical Report
Publication Date: Nov 29, 2016
Accession Number: AD1049685

Entities

People

Kate Saenko
Lisa Anne Hendricks
Raymond Mooney
Subhashini Venugopalan

Organizations

National Science Foundation
