Hybrid SIS and Markov Chain Monte Carlo Sampling Methodology for Goodness-of-Fit Tests on Contingency Tables
Abstract
Logistic regression is one of the most popular means of modeling contingency table data because of its ease of use. Simple asymptotic inference (such as a χ² approximation) for goodness-of-fit tests, however, may not be valid for sparse datasets with cell counts below 5. In these cases, exact conditional inference is often attempted via a sampler, such as Markov Chain Monte Carlo (MCMC) or Sequential Importance Sampling (SIS). This paper proposes a hybrid sampling scheme that combines MCMC and SIS to sample sparse, multidimensional contingency tables satisfying fixed marginals when MCMC alone does not guarantee an exhaustive sampling of the conditional state space. To investigate its suitability, the proposed hybrid scheme is applied to an observational dataset from Alzheimer's researcher J. A. Mortimer measuring the cognitive states of nuns over a 15-year period beginning in 1991. In this application, we find that the p-values estimated by the hybrid MCMC and SIS sampler are remarkably similar to the p-values from the χ² asymptotic approximation, even for sparse contingency tables.
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 2018
- Accession Number: AD1065499
Entities
People
- Patrick M. Saluke
Organizations
- Naval Postgraduate School