Causal Concepts In Large Language Models
Abstract
To use language models to solve real problems, we must adapt them to the task of interest (e.g., by prompting or finetuning). This proposal addresses the closely related problems of designing and evaluating such task-specific adaptations of large language models. A core challenge is formalizing these problems in a manner that allows for their systematic study. The main idea of the proposal is that, in some situations, we can achieve the required formalization in terms of causal concepts defined by the real-world process that generated the text data used to train the language model. We propose using this causal concept idea to design evaluations that measure adaptation-induced changes in the representation subspaces associated with concepts. We also propose using it to understand the conditions under which adaptation methods should succeed (or fail), drawing on ideas from causal identification and causal representation learning.
Document Details
- Document Type: DoD Grant Award
- Publication Date: Jun 29, 2023
- Source ID: N000142312591
Entities
People
- Victor Veitch
Organizations
- Office of Naval Research
- United States Navy
- University of Chicago