Applications of Natural Language Processing to Predict Components of Naval Aviation Readiness
Abstract
DARTE (Digital Aviation Readiness Technology Engine) is a family of artificial intelligence and machine learning models used to predict Naval aviation readiness. With the goal of improving DARTE predictions, this document explores ways to integrate text data into the modeling using natural language processing (NLP). To provide context, basic information on NLP and common methods for document classification are introduced. Based on the needs of DARTE, the Doc2Vec algorithm was selected because it produces dense, constant-length numerical representations of documents. Background information on the foundations of Doc2Vec and its key hyperparameters is discussed. A variety of models of differing complexity were implemented and tested. As the number of features added to the model increased, the relative importance of the Doc2Vec features decreased. Further work found connections between the Doc2Vec features and current features in the DARTE models. Many features that are known to be important components of DARTE's predictions can be accurately classified using Doc2Vec, but there is significant dispersion in how this information is encoded in the text data. Current efforts are focused on understanding this dispersion. Future efforts may also explore different NLP techniques to identify unique or unknown features useful for DARTE.
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2022
- Accession Number: AD1156795
Entities
People
- Andrew B. Sabater
- Benjamin A. Michlin
- Dean Lee
- Gary R. Williams
- Jamal Rorie
- Josh Duclos
Organizations
- Naval Information Warfare Center Pacific