Outliers Matter in Survival Analysis

Abstract

Generally, but not always, the most influential observations (cases) possess the largest Cox-Snell residuals. A case that has a variable far out in the factor space may be more influential than a case with a large residual. Either kind of outlier can affect inference. Plots of the log survival curve of the residuals, and of the corresponding variance-stabilized transformation, generally do not indicate the importance (influence) of large residuals on estimated parameters. Regression parameter estimates from the Cox proportional hazards model are just as sensitive to influential cases as are those from fully parametric models. Influential observations often suggest other modelling deficiencies, such as a poorly specified factor, nonproportional hazards, or an omitted variable. The analyses indicate how broadly based the conclusions are. We found no evidence to support automatic exclusion of outlying observations. Outlier diagnostics based on linearization (the log-likelihood and its derivatives) work very well. The measure of importance tends to be compressed at its upper end, so the impact of very influential observations tends to be understated.
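
The distinction the abstract draws between a large Cox-Snell residual and a large influence on the estimates can be illustrated with a minimal sketch. This is not the report's method or data: it assumes a two-group exponential model (a fully parametric case, chosen because its maximum likelihood fit has a closed form), uses invented survival times with no censoring, and measures influence by exact case deletion rather than by the linearized diagnostics the report evaluates. All function and variable names are hypothetical.

```python
import math

def fit_exponential(times, events, groups):
    """Closed-form MLE of the exponential hazard rate in each group:
    number of events divided by total follow-up time."""
    rates = {}
    for g in set(groups):
        d = sum(e for e, gg in zip(events, groups) if gg == g)
        total = sum(t for t, gg in zip(times, groups) if gg == g)
        rates[g] = d / total
    return rates

def cox_snell_residuals(times, groups, rates):
    """Cox-Snell residual = estimated cumulative hazard at the observed time.
    For an exponential model this is simply rate * time."""
    return [rates[g] * t for t, g in zip(times, groups)]

def log_hazard_ratio(rates):
    """Log hazard ratio of group 1 relative to group 0."""
    return math.log(rates[1] / rates[0])

def case_deletion_influence(times, events, groups):
    """Change in the log hazard ratio when each case is dropped and the
    model is refitted (exact deletion, not a linear approximation)."""
    full = log_hazard_ratio(fit_exponential(times, events, groups))
    changes = []
    for i in range(len(times)):
        t = times[:i] + times[i + 1:]
        e = events[:i] + events[i + 1:]
        g = groups[:i] + groups[i + 1:]
        changes.append(full - log_hazard_ratio(fit_exponential(t, e, g)))
    return changes

# Invented data (an assumption for illustration): one very long survivor in group 0.
times = [2.0, 3.0, 4.0, 5.0, 30.0, 1.0, 2.0, 2.0, 3.0, 4.0]
events = [1] * 10            # every time is an observed event, no censoring
groups = [0] * 5 + [1] * 5

rates = fit_exponential(times, events, groups)
residuals = cox_snell_residuals(times, groups, rates)
influence = case_deletion_influence(times, events, groups)

for i, (r, c) in enumerate(zip(residuals, influence)):
    print(f"case {i}: Cox-Snell residual = {r:.3f}, change in log HR = {c:+.3f}")
```

In this toy data set the case with the largest residual is also the one whose removal moves the estimate most, which matches the "generally" in the abstract; the "but not always" is the reason to compute both quantities rather than read influence off the residuals.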

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 1982
Accession Number
ADA119473

Entities

People

  • Daryl Pregibon
  • Gaineford J. Hall
  • William H. Rogers

Organizations

  • RAND Corporation

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Biometrics
  • Blood Cells
  • Cancer
  • Cell Count
  • Cells
  • Coefficients
  • Covariance
  • Data Analysis
  • Data Science
  • Databases
  • Information Science
  • Leukocytes
  • Lung Cancer
  • Maximum Likelihood Estimation
  • Neoplasms
  • Statistical Analysis
  • Statistics

Fields of Study

  • Mathematics

Readers

  • Regression Analysis
  • Statistical Inference
  • Theoretical Analysis

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • Space