A Survey of Knowledge-enhanced Text Generation

Abstract

The goal of text-to-text generation is to enable machines to express themselves like humans in applications such as conversation, summarization, and translation. It is one of the most important yet challenging tasks in natural language processing (NLP). Various neural encoder-decoder models have been proposed to achieve this goal by learning to map input text to output text. However, the input text alone often provides too little knowledge to generate the desired output, so the performance of text generation is still far from satisfactory in many real-world scenarios. To address this issue, researchers have considered incorporating (i) internal knowledge embedded in the input text and (ii) external knowledge from outside sources, such as knowledge bases and knowledge graphs, into the text generation system. This research topic is known as knowledge-enhanced text generation. In this survey, we present a comprehensive review of the research on this topic over the past five years. The main content includes two parts: (i) general methods and architectures for integrating knowledge into text generation; and (ii) specific techniques and applications according to different forms of knowledge data. This survey is intended for a broad audience of researchers and practitioners in academia and industry.

Document Details

Document Type
Pub Defense Publication
Publication Date
Jan 31, 2022
Source ID
10.1145/3512467

Entities

People

  • Chenguang Zhu
  • Heng Ji
  • Meng Jiang
  • Qingyun Wang
  • Wenhao Yu
  • Zaitang Li
  • Zhiting Hu

Organizations

  • Microsoft
  • National Science Foundation
  • The Chinese University of Hong Kong
  • University of California, San Diego
  • University of Illinois Urbana–Champaign
  • University of Notre Dame

Tags

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Distributed Systems and Data Platform Development
  • Neurodegenerative Parkinson's Disease and Rickettsial Disease handbook, including the data level of dopamine, BC, neurons, and PD

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Neural Networks