Generalized Robust Feature Selection

Abstract

Feature selection may be summarized as identifying the features salient to a given response. Understanding which features affect the response makes it possible to collect only consequential data in the future; hence, a feature selection algorithm may save effort spent collecting data, storage resources, and computational resources for making predictions. We propose a generalized approach to select the salient features of data sets. Our approach may also be applied to unsupervised data sets to understand which data streams provide unique information. We contend that our approach identifies salient features robust to the subsequent predictive model applied. The proposed algorithm considers all provided variables, squared variables, and two-way interactions as an extended data set. The algorithm implements a forward selection approach, based on correlation with the response, while fitting deep neural networks to the selected variables. These deep neural networks maintain an adaptive architecture which mirrors a full factorial design. The networks assess numeric and categorical values for both features and responses. Implementing this approach in ensemble with Recursive Feature Elimination, we establish a new Pareto frontier, consisting solely of this technique, for the Wisconsin Breast Cancer problem instance. This Pareto frontier highlights our ensemble approach as the best-performing method in both feature reduction and predictive accuracy.
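The extended data set and the correlation-based ordering described in the abstract can be illustrated with a minimal sketch. This is not the report's implementation: the function names (`extend_features`, `rank_by_correlation`, `forward_select`), the use of Pearson correlation, and the fixed cutoff `k` are assumptions made for illustration, and the deep neural network fitting step is omitted entirely.

```python
from itertools import combinations

import numpy as np


def extend_features(X, names):
    """Stack original, squared, and two-way interaction columns (hypothetical helper)."""
    n_features = X.shape[1]
    cols = [X[:, j] for j in range(n_features)]
    labels = list(names)
    for j in range(n_features):
        cols.append(X[:, j] ** 2)
        labels.append(f"{names[j]}^2")
    for i, j in combinations(range(n_features), 2):
        cols.append(X[:, i] * X[:, j])
        labels.append(f"{names[i]}*{names[j]}")
    return np.column_stack(cols), labels


def rank_by_correlation(X_ext, y):
    """Order columns by descending absolute Pearson correlation with y."""
    Xc = X_ext - X_ext.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # constant columns are assigned r = 0
    r = (Xc.T @ yc) / denom
    return np.argsort(-np.abs(r)), r


def forward_select(X_ext, y, k):
    """Take the first k candidates in correlation order.

    Assumption: a fixed k stands in for the model-based stopping rule;
    the report fits neural networks to the selected variables instead.
    """
    order, _ = rank_by_correlation(X_ext, y)
    return [int(i) for i in order[:k]]
```

For three input features the extended data set has nine columns (three originals, three squares, three interactions). If the response were exactly the product of the first two features, that interaction column would be the first one selected.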

Document Details

Document Type
Technical Report
Publication Date
Mar 24, 2022
Accession Number
AD1172360

Entities

People

  • Bradford L Lott

Organizations

  • Air Force Institute of Technology

Tags

Communities of Interest

  • Autonomy
  • Energy and Power Technologies
  • Materials and Manufacturing Processes
  • Sensors

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Breast Cancer
  • Classification
  • Computational Complexity
  • Computing System Architectures
  • Data Sets
  • Deep Learning
  • Dimensionality Reduction
  • Electrical Engineering
  • Engineering
  • Feature Selection
  • Genetic Algorithms
  • Information Science
  • Machine Learning
  • Neural Networks
  • Predictive Modeling

Fields of Study

  • Computer science

Readers

  • Neural Network Machine Learning
  • Operations Research
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Neural Networks