Bag-of-Words Algorithms Can Supplement Transformer Sequence Classification and Improve Model Interpretability

Abstract

Although transformer models perform extremely well on many natural language tasks, they may struggle with the computing and memory requirements of long sequences, and they often require significant computing power to train. Such models are also difficult to interpret. We describe a simple method of improving performance on text sequence classification by concatenating the hidden state of a BERT-based transformer model with the features of a dictionary-based bag-of-words model. The resulting hybrid models outperform the transformer models by varying margins, while adding trivial compute requirements and improving model interpretability.
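
The concatenation step described in the abstract can be illustrated with a minimal sketch. This is not the report's implementation: the lexicon, its category names, the tokenizer, and the three-element stand-in for a pooled BERT hidden state are all hypothetical, chosen only to show how dictionary-based features are appended to a transformer vector before a classifier sees them.

```python
import re
from collections import Counter

# Hypothetical dictionary mapping category names to words (not from the report).
LEXICON = {
    "health": {"virus", "vaccine", "covid"},
    "energy": {"grid", "power", "fuel"},
}


def bow_features(text, lexicon=LEXICON):
    """Return, per category, the fraction of tokens found in that category."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # avoid division by zero on empty text
    return [sum(counts[w] for w in words) / total for words in lexicon.values()]


def hybrid_features(hidden_state, text):
    """Concatenate a transformer hidden state with bag-of-words features."""
    return list(hidden_state) + bow_features(text)


# Stand-in for a pooled BERT hidden state; a real one is much longer
# (768 dimensions for BERT-base) and would come from the model itself.
hidden = [0.12, -0.40, 0.88]
features = hybrid_features(hidden, "The vaccine slowed the virus.")
```

In this sketch the last two entries of `features` correspond to the named categories `health` and `energy`, so a classifier weight on either one can be read directly as the influence of that word list. The classifier head trained on the concatenated vector is omitted here.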

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2022
Accession Number
AD1156753

Entities

People

  • Christian Johnson
  • William M. Marcellino

Organizations

  • RAND Corporation

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Artificial Intelligence Software
  • Computational Linguistics
  • Computational Science
  • Computing System Architectures
  • Covid-19
  • Dimensionality Reduction
  • Failure Mode And Effect Analysis
  • Information Systems
  • Language
  • Machine Learning
  • National Security
  • Natural Language Processing
  • Natural Languages
  • Network Architecture
  • Neural Networks
  • Social Media
  • Unified Combatant Commands

Fields of Study

  • Computer science
  • Engineering

Readers

  • Computational Linguistics
  • Computational Modeling and Simulation
  • Neural Network Machine Learning