A Hypothesis-Testing Approach to Discriminant Analysis with Mixed Categorical and Continuous Variables When Data are Missing.

Abstract

This report considers discriminant analysis with mixed discrete (categorical) and continuous variables when data are missing at random. We use a hypothesis-testing approach based on the generalized likelihood ratio, as proposed by Baek et al., and use bootstrapping to determine critical values so that the Type I error rate is controlled. We present three algorithms, each assuming a different model for the data. The INDICATOR algorithm replaces categorical variables with indicator variables and treats these as if they were continuous. The FULL algorithm assumes a multinomial distribution for the discrete part and, as the conditional distribution of the continuous part given the discrete part, a multivariate normal distribution whose mean and covariance matrix both depend on the discrete part. The COMMON algorithm makes the same assumptions except that only the means depend on the discrete part; that is, a common covariance matrix is assumed across all multinomial cells. The performance of these algorithms is compared through a simulation study. While the INDICATOR algorithm seems to have the highest power, it also tends to display a higher Type I error rate than desired. The FULL and COMMON algorithms have very similar power, but the COMMON algorithm appears to control the Type I error rate most effectively and is least susceptible to problems that occur when some multinomial cells are sparsely represented.
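
The bootstrap step mentioned in the abstract can be illustrated with a minimal sketch. This is not the report's procedure: it uses a single continuous variable with no missing values, and the statistic (`toy_statistic`, a squared standardized distance) is a hypothetical stand-in for the generalized likelihood ratio. The function names, the resampling scheme, and the defaults for `alpha` and `n_boot` are all assumptions made for illustration. The idea shown is only that the critical value is taken from the resampled distribution of the statistic under the null hypothesis that the new observation belongs to the training population.

```python
import random
import statistics


def toy_statistic(train, x):
    """Hypothetical stand-in for the test statistic (NOT the report's GLR):
    squared standardized distance of x from the training sample."""
    mu = statistics.fmean(train)
    sd = statistics.stdev(train)
    return ((x - mu) / sd) ** 2


def bootstrap_critical_value(train, alpha=0.05, n_boot=2000, seed=0):
    """Estimate the (1 - alpha) quantile of the statistic under the null
    hypothesis by resampling the training data with replacement."""
    rng = random.Random(seed)
    n = len(train)
    stats = []
    for _ in range(n_boot):
        # n resampled points act as training data; one more acts as a
        # "new" observation that truly belongs to the same population.
        resample = rng.choices(train, k=n + 1)
        boot_train, boot_x = resample[:n], resample[n]
        if len(set(boot_train)) < 2:
            continue  # degenerate resample: standard deviation undefined
        stats.append(toy_statistic(boot_train, boot_x))
    stats.sort()
    idx = min(len(stats) - 1, int((1 - alpha) * len(stats)))
    return stats[idx]


def rejects_membership(train, x, alpha=0.05):
    """Reject 'x belongs to the training population' when the statistic
    exceeds the bootstrap critical value."""
    return toy_statistic(train, x) > bootstrap_critical_value(train, alpha)
```

For example, with `train` drawn from a standard normal sample, `rejects_membership(train, 10.0)` would be expected to reject, while a value near the sample mean would not. Calibrating the threshold by resampling rather than by a reference distribution is what lets the nominal Type I error rate be targeted when the exact null distribution of the statistic is unknown.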

Document Details

Document Type: Technical Report
Publication Date: Jul 01, 1994
Accession Number: ADA293714

Entities

People

G. D. McCartor
H. L. Gray
J. W. Miller
W. A. Woodward

Organizations

Southern Methodist University
