Bag-of-Words Algorithms Can Supplement Transformer Sequence Classification and Improve Model Interpretability
Abstract
Although transformer models perform extremely well on many natural language tasks, their compute and memory requirements can become a bottleneck on long sequences, and they often require significant computing power to train. Such models also lack interpretability. We describe a simple method for improving performance on text sequence classification: concatenating the hidden state of a BERT-based transformer model with the output of a dictionary-based bag-of-words model. The resulting hybrid models outperform the transformer models alone by varying margins, while adding only trivial compute requirements and improving model interpretability.
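The concatenation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The dictionary terms, the three-element stand-in for a hidden state (a real BERT hidden state is typically much larger), the weights, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical dictionary; the report's actual term list is not given here.
VOCAB = ["refund", "delay", "thanks", "broken"]

def bow_features(text, vocab=VOCAB):
    """Count how often each dictionary term occurs in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocab]

def hybrid_features(hidden_state, text, vocab=VOCAB):
    """Concatenate a transformer hidden-state vector with bag-of-words counts."""
    return list(hidden_state) + bow_features(text, vocab)

def linear_logit(features, weights, bias=0.0):
    """Score the hybrid feature vector with a single linear layer."""
    return sum(f * w for f, w in zip(features, weights)) + bias

# Stand-in for a pooled hidden state produced by a BERT-based model (assumed values).
hidden_state = [0.12, -0.40, 0.33]
text = "Refund please the item arrived broken broken"

features = hybrid_features(hidden_state, text)

# Illustrative weights: the first three act on the hidden state, the last four on the dictionary terms.
weights = [0.5, 0.1, -0.2, 1.5, 0.8, -1.0, 0.7]
logit = linear_logit(features, weights)

# Each bag-of-words weight maps to a named term, which is the source of the added interpretability.
term_weights = dict(zip(VOCAB, weights[len(hidden_state):]))
```

In this sketch the bag-of-words part adds only one feature per dictionary term, so the extra cost is small next to the transformer, and each term's learned weight can be inspected directly.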
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2022
- Accession Number: AD1156753
Entities
People
- Christian Johnson
- William M. Marcellino
Organizations
- RAND Corporation