Extending and Applying Causal Models
Abstract
Formal techniques for causal modeling and reasoning were developed relatively recently, but have already found application in a wide range of fields. Their promise of discovering meaningful relationships from data, assessing system failures, determining fairness, and building robust and secure autonomous software agents is only starting to be realized. This project aims to bring that promise closer to reality by extending causal models so that they can describe a much larger class of phenomena, and by showing how causality can be applied to a variety of domains, including security and fairness. Pearl introduced structural-equations models (SEMs) to model causality, and they have proved quite effective. In an SEM, equations describe the values of variables in terms of other variables; these equations let us describe the effect of interventions on variables, which is critical in reasoning about causality. SEMs assume that a situation is described by finitely many variables, each of which has a finite set of possible values. This means that SEMs cannot be used to describe large classes of real-world systems, including dynamical systems and cyber-physical systems. To capture dynamical systems, this project will consider a more flexible class of models called generalized structural-equations models (GSEMs). In SEMs, we can compute the outcomes of interventions using the equations; in GSEMs, we instead have a function that explicitly describes the outcomes of interventions, without committing to a specific mechanism (such as equations) for producing the outcomes. It is easy to show that GSEMs generalize SEMs. Moreover, the way that GSEMs are defined makes it easy to lift definitions of causal notions from SEMs to GSEMs.
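The contrast between computing outcomes from equations and specifying outcomes directly can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than part of the project: the match/lightning/fire variables, the helper names (`solve_sem`, `sem_as_outcome_function`, `cooling_outcome`), and the cooling constants are all hypothetical. The first part solves a toy SEM under an intervention by overriding the intervened variables and evaluating the remaining equations. The second part wraps that solver as a map from interventions to outcomes, which is one way to see why an equation-based model is a special case of an outcome-function model. The third part is an outcome function with no finite-variable equations behind it: its outcome is a trajectory over continuous time.

```python
import math
from typing import Callable, Dict, Mapping

Assignment = Dict[str, int]

# Hypothetical toy SEM: one equation per endogenous variable.
# Each equation reads the values computed so far (exogenous plus earlier variables).
EQUATIONS: Dict[str, Callable[[Mapping[str, int]], int]] = {
    "match": lambda v: v["U_match"],
    "lightning": lambda v: v["U_lightning"],
    "fire": lambda v: int(bool(v["match"] or v["lightning"])),
}
# Evaluation order; valid because this toy model is acyclic.
ORDER = ["match", "lightning", "fire"]


def solve_sem(context: Mapping[str, int], intervention: Mapping[str, int]) -> Assignment:
    """Compute the outcome of an intervention from the equations.

    Intervened variables take the imposed value; every other variable
    is computed from its equation.
    """
    values: Assignment = dict(context)
    for name in ORDER:
        if name in intervention:
            values[name] = intervention[name]
        else:
            values[name] = EQUATIONS[name](values)
    return values


def sem_as_outcome_function(
    context: Mapping[str, int],
) -> Callable[[Mapping[str, int]], Assignment]:
    """View the SEM as a direct map from interventions to outcomes."""
    return lambda intervention: solve_sem(context, intervention)


def cooling_outcome(intervention: Mapping[str, float]) -> Callable[[float], float]:
    """Outcome function given directly, with no finite set of equations.

    The outcome is a temperature trajectory over continuous time.
    The start temperature and rate are made-up constants.
    """
    ambient = intervention.get("ambient", 20.0)
    start, rate = 90.0, 0.1
    return lambda t: ambient + (start - ambient) * math.exp(-rate * t)


if __name__ == "__main__":
    outcomes = sem_as_outcome_function({"U_match": 1, "U_lightning": 0})
    print(outcomes({}))            # no intervention
    print(outcomes({"match": 0}))  # intervene to prevent the match from being lit
    print(cooling_outcome({"ambient": 5.0})(10.0))
```

The design point is that callers of `outcomes` or `cooling_outcome` only ever ask what happens under a given intervention, so definitions written against that interface do not need to know whether equations exist underneath.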
Importantly, many standard formalisms for representing causality in infinitary settings can be captured with GSEMs, including dynamical systems, hybrid automata (a popular formalism for describing mixed discrete-continuous systems, such as firmware controlling a spinning rotor on an unmanned aircraft system), and the rule-based models commonly used in molecular biology and organic chemistry.
The second extension to SEMs that will be considered involves adding logical constraints. Suppose that we want to combine a number of datasets collected by different researchers. The researchers are interested in much the same phenomena, but may well describe the world in slightly different ways. As a trivial example, they may use different units: one may use kilograms and another may use grams. For a more interesting example, one cholesterol study may talk only about total cholesterol, because researchers had not yet realized the relevance of the distinction between LDL (low-density lipoprotein cholesterol) and HDL (high-density lipoprotein cholesterol), while another talks about LDL and HDL. The combined dataset will involve all these variables. There are clearly logical constraints between weight in kilograms and weight in grams, and between LDL level, HDL level, and total cholesterol. Such constraints cannot be handled by SEMs. This project will investigate an extension of SEMs that can handle them.
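To make the idea of a logical constraint between variables concrete, here is a small Python sketch. It does not represent the formalism the project will develop; it only shows why such constraints sit awkwardly with independent interventions. The variable names, the tolerance, and the helpers `violated` and `intervention_allowed` are hypothetical. The cholesterol constraint is a deliberately simplified assumption (LDL plus HDL cannot exceed total cholesterol), chosen for illustration only.

```python
from typing import Callable, Dict, List, Mapping, Tuple

Record = Dict[str, float]
Constraint = Tuple[str, Callable[[Mapping[str, float]], bool]]

TOLERANCE = 1e-6  # slack for floating-point comparison

# Hypothetical constraints linking variables that came from different datasets.
CONSTRAINTS: List[Constraint] = [
    (
        "weight_g == 1000 * weight_kg",
        lambda r: abs(r["weight_g"] - 1000.0 * r["weight_kg"]) <= TOLERANCE,
    ),
    (
        # Simplified assumption for illustration, not a clinical formula.
        "ldl + hdl <= total_cholesterol",
        lambda r: r["ldl"] + r["hdl"] <= r["total_cholesterol"] + TOLERANCE,
    ),
]


def violated(record: Mapping[str, float]) -> List[str]:
    """Return the names of all constraints that the record breaks."""
    return [name for name, holds in CONSTRAINTS if not holds(record)]


def intervention_allowed(record: Mapping[str, float],
                         intervention: Mapping[str, float]) -> bool:
    """Check whether imposing the intervention keeps every constraint satisfied."""
    merged: Record = {**record, **intervention}
    return not violated(merged)


if __name__ == "__main__":
    combined = {
        "weight_kg": 70.0,
        "weight_g": 70000.0,
        "ldl": 100.0,
        "hdl": 50.0,
        "total_cholesterol": 180.0,
    }
    print(violated(combined))
    # Setting grams alone contradicts the kilogram value.
    print(intervention_allowed(combined, {"weight_g": 80000.0}))
    # Setting both consistently is fine.
    print(intervention_allowed(combined, {"weight_g": 80000.0, "weight_kg": 80.0}))
```

In an ordinary SEM, an intervention may set any one variable to any value while leaving the others to their equations. The check above shows a case where that freedom does not make sense: weight in grams cannot be changed while weight in kilograms is held fixed.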
Document Details
- Document Type: DoD Grant Award
- Publication Date: Jun 30, 2022
- Source ID: W911NF2210061
Entities
People
- Joseph Halpern
Organizations
- Army Contracting Command
- Cornell University
- United States Army