Metalearners for Estimating Heterogeneous Treatment Effects Using Machine Learning

Abstract

There is growing interest in estimating and analyzing heterogeneous treatment effects in experimental and observational studies. We describe a number of metaalgorithms that can take advantage of any supervised learning or regression method in machine learning and statistics to estimate the conditional average treatment effect (CATE) function. Metaalgorithms build on base algorithms, such as random forests (RFs), Bayesian additive regression trees (BARTs), or neural networks, to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a metaalgorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other and can exploit structural properties of the CATE function. For example, if the CATE function is linear and the response functions in treatment and control are Lipschitz-continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In extensive simulation studies, the X-learner performs favorably, although none of the metalearners is uniformly the best. In two persuasion field experiments from political science, we demonstrate how our X-learner can be used to target treatment regimes and to shed light on underlying mechanisms. A software package is provided that implements our methods.
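
The abstract names the X-learner without listing its steps, so the following is only a minimal sketch of the commonly described three-stage recipe: fit one response model per group, impute individual treatment effects with the other group's model, then fit and blend two CATE models. It is not the report's software package. The names fit_line and x_learner are hypothetical, the one-dimensional least-squares fit is a stand-in for a base learner such as RF or BART, and the constant weight g (the treated share) is an assumed simplification of a propensity-score weight.

```python
from typing import Callable, Sequence

Model = Callable[[float], float]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Hypothetical base learner: least-squares fit of y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def x_learner(xs, ys, ws, fit=fit_line) -> Model:
    """Sketch of an X-learner for one covariate; ws[i] is 1 if treated, else 0."""
    x0 = [x for x, w in zip(xs, ws) if w == 0]
    y0 = [y for y, w in zip(ys, ws) if w == 0]
    x1 = [x for x, w in zip(xs, ws) if w == 1]
    y1 = [y for y, w in zip(ys, ws) if w == 1]

    # Stage 1: one response model per treatment group.
    mu0 = fit(x0, y0)
    mu1 = fit(x1, y1)

    # Stage 2: impute unit-level effects using the other group's model,
    # then regress the imputed effects on the covariate.
    d1 = [y - mu0(x) for x, y in zip(x1, y1)]
    d0 = [mu1(x) - y for x, y in zip(x0, y0)]
    tau1 = fit(x1, d1)
    tau0 = fit(x0, d0)

    # Stage 3: blend the two estimates. Assumption: a constant weight equal
    # to the treated share, standing in for an estimated propensity score.
    g = len(x1) / len(xs)
    return lambda x: g * tau0(x) + (1.0 - g) * tau1(x)


# Synthetic, noise-free data: control response 1 + 2x, true CATE 0.5 + x.
# Only two of the ten units are treated, mimicking unbalanced groups.
xs = [float(i) for i in range(10)]
ws = [1 if i % 5 == 0 else 0 for i in range(10)]
ys = [1 + 2 * x + w * (0.5 + x) for x, w in zip(xs, ws)]

tau_hat = x_learner(xs, ys, ws)
print(tau_hat(2.0))
```

Because the synthetic data are noise-free and exactly linear, the estimate at x = 2 should match the true effect 0.5 + 2 up to floating-point error. With real data, any regression method could be passed as fit, which is the sense in which the method is a metaalgorithm.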

Document Details

Document Type
Technical Report
Publication Date
Feb 15, 2019
Accession Number
AD1104798

Entities

People

  • Bin Yu
  • Jasjeet S. Sekhon
  • Peter J. Bickel
  • Soren R. Kunzel

Organizations

  • University of California, Berkeley

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Algorithms
  • California
  • Computer Science
  • Data Analysis
  • Families (Human)
  • Information Science
  • Machine Learning
  • Military Research
  • Political Science
  • Probability
  • Random Variables
  • Simulations
  • Statistics
  • Structural Properties
  • Supervised Machine Learning
  • Surveys
  • United States

Fields of Study

  • Computer science

Readers

  • Neural Network Machine Learning.
  • Regression Analysis.
  • Solar Photovoltaics and Thermoelectric Devices.

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks