Using the Random Nearest Neighbor Data Mining Method to Extract Maximum Information Content from Weather Forecasts from Multiple Predictors of Weather and One Predictand (Low-Level Turbulence)

Abstract

A new data mining methodology is developed to find relationships between Air Force Weather Agency (AFWA) WRF 15-km atmospheric model forecast data and low-level turbulence. Archives of historical model forecast predictors at model gridpoints and verifying pilot reports (PIREPS) of turbulence have been collected. The new data mining method, Random Nearest Neighbor (RNN), will be shown to be capable of extracting nearly the maximum possible amount of information from a multiple-predictor, single-predictand dataset. In this report, the RNN methodology is used to achieve nearly the best possible turbulence forecast from a domain consisting of predictors at model gridpoints and corresponding verification from PIREPS. Two experiments will demonstrate that RNN almost completely accomplishes the goal of accurately re-creating non-linear relationships among combinations of predictors with varying combinations of values. In the first experiment, with real data, it will be seen that RNN accurately linearizes a predictor to the predictand. The second experiment uses a synthetic dataset, and it will be seen that RNN accurately re-creates that dataset. RNN is then applied to the real dataset. After the effectiveness of the RNN methodology is demonstrated, it will be seen that low-level turbulence has limited forecastability with the turbulence dataset used in this study. The goals of this technical report are three-fold: 1) to introduce RNN as a data mining methodology; 2) to demonstrate its effectiveness in extracting potentially complex non-linear multiple-predictor vs. predictand relationships; and 3) to discuss the implications for forecasting turbulence. Other facets of data mining and statistical forecasting, such as predictor selection techniques, are acknowledged but not explored in this report. An effort is made to explain clearly, to non-experts in statistics, how RNN works.

Document Details

Document Type
Technical Report
Publication Date
Oct 30, 2014
Accession Number
ADA613335

Entities

People

  • David L. Keller

Organizations

  • 557th Weather Wing

Tags

Communities of Interest

  • Human Systems
  • Materials and Manufacturing Processes
  • Space

DTIC Thesaurus Topics

  • Air Force
  • Algorithms
  • Boundary Layer
  • Computer Programs
  • Curve Fitting
  • Data Mining
  • Information Science
  • Information Systems
  • Lapse Rate
  • Layers
  • Meteorology
  • Neural Networks
  • Reliability
  • Statistics
  • Turbulence
  • Weather Forecasting
  • Wind Shear

Readers

  • Atmospheric Science/Meteorology
  • Materials Science
  • Regression Analysis

Technology Areas

  • AI & ML