Retrieval-Embedded Large Language Models
Abstract
Large generative language models (LLMs), such as ChatGPT and LLaMA, face challenges in tasks that require knowledge grounding, processing of long input sequences, and interpretable explanations. To address these issues, recent research has explored retrieval-augmented language models, which can retrieve and use stored content. While existing models have demonstrated success, our recent study reveals that the limitations of many of these models stem from retrieval failures. Motivated by these findings, we propose a novel approach: retrieval-embedded LLMs, in which a single language model handles both retrieval and language generation. This model employs differentiable search indexes, representing documents as semantic tokens (DocIDs) and using generative language models to produce probable DocIDs for each input. This approach offers advantages in end-to-end training, integration into LLM workflows, and reduced memory requirements. Our recent preliminary study addresses key design considerations of existing differentiable search indexes, leading to significant performance improvements on standard retrieval datasets. Building on these findings and through foundational algorithmic and modeling contributions, this project will develop retrieval-embedded LLMs that first generate the document IDs relevant to a given text input and then generate the output text. The project has the potential to yield retrieval-embedded LLMs that generate grounded and transparent text, in line with current demands for transparency, credibility, and safety in language models.
Approved for Public Release
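As a rough illustration of the DocID generation step described in the abstract, the sketch below decodes DocIDs token by token while restricting each step to continuations that exist in a prefix tree of valid DocIDs. It is a minimal sketch under stated assumptions, not the project's method: the names `build_trie`, `constrained_beam_search`, and `toy_score` are hypothetical, the DocIDs are invented, and the word-overlap scorer stands in for the per-token log-probabilities that a trained generative language model would supply. It also assumes no DocID is a prefix of another.

```python
from typing import Callable, List, Sequence, Tuple

DocID = Tuple[str, ...]


def build_trie(doc_ids: Sequence[DocID]) -> dict:
    """Build a nested-dict prefix tree over the valid DocID token sequences."""
    root: dict = {}
    for doc_id in doc_ids:
        node = root
        for token in doc_id:
            node = node.setdefault(token, {})
    return root


def constrained_beam_search(
    query: str,
    trie: dict,
    score_token: Callable[[str, DocID, str], float],
    beam_size: int = 2,
) -> List[Tuple[DocID, float]]:
    """Return up to beam_size complete DocIDs, highest cumulative score first.

    Only tokens present in the trie are expanded, so every returned
    sequence is a valid DocID (assumes no DocID is a prefix of another).
    """
    beams = [((), 0.0, trie)]
    finished: List[Tuple[DocID, float]] = []
    while beams:
        candidates = []
        for prefix, score, node in beams:
            if not node:  # empty node: the prefix is a complete DocID
                finished.append((prefix, score))
                continue
            for token, child in node.items():
                step = score_token(query, prefix, token)
                candidates.append((prefix + (token,), score + step, child))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    finished.sort(key=lambda f: f[1], reverse=True)
    return finished[:beam_size]


def toy_score(query: str, prefix: DocID, token: str) -> float:
    """Hypothetical stand-in for a model's token log-probability:
    1.0 if the DocID token appears as a word in the query, else 0.0."""
    return 1.0 if token in query.lower().split() else 0.0


# Invented DocIDs for illustration only.
doc_ids = [("ships", "sonar"), ("ships", "radar"), ("llm", "retrieval")]
ranked = constrained_beam_search(
    "retrieval for llm text", build_trie(doc_ids), toy_score
)
```

Here `ranked` holds the highest-scoring valid DocIDs for the query. In a retrieval-embedded LLM as described above, the same model would then condition on the retrieved documents to generate the output text; that second step is not shown.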
Document Details
- Document Type: DoD Grant Award
- Publication Date: Nov 09, 2024
- Source ID: N000142412612
Entities
People
- Hamed Zamani
Organizations
- Office of Naval Research
- United States Navy
- University of Massachusetts