YIP Optimizing Collaborative and Trustworthy Interaction between Humans and AI Agents

Abstract

Approved for Public Release.

Recent advances in generative AI have revolutionized the way humans interact with machines in both cyber and physical domains. The ability to distill such interaction data into actionable decisions faster and more richly than one's adversaries is therefore increasingly important. Furthermore, our workforce is increasingly data-driven, which makes it essential to consider ways humans and AI agents can collaborate to achieve the best of both worlds. Collaborative and trustworthy interaction between humans and AI agents is of paramount importance to technological advancement and human-machine interaction. Effective human-AI collaboration can substantially improve the efficiency, accuracy, and reliability of tasks. However, the current collaborative decision-making process between humans and AI faces a number of challenges, including but not limited to hallucination, trust, and task allocation between AI and humans. A comprehensive understanding of these critical components is essential for developing trustworthy machine learning frameworks that optimize the combined intelligence of human and AI agents.

This project aims to build trustworthy and collaborative interaction paradigms between humans and AI agents by developing robust machine learning algorithms to mitigate AI hallucination in language generation, identifying ways to increase trust in AI via automatic red-teaming and verifiable models, and designing novel expertise assessment and optimization techniques to support collaboration between humans and AI agents.

Technical Approach: (1) We propose to design consistency-aware and structure-aware hallucination detection metrics and to further mitigate hallucination with a novel flow-of-thought step-by-step reasoning framework. With this new grounding, large language models like ChatGPT will be able to produce faithful outputs, especially when dealing with high-stakes situations. (2) We aim to investigate the social foundations of AI assistance by quantifying corner cases and risks and by automatic red-teaming, in order to enhance trust in collective decision-making between humans and AI agents via verifiable models. (3) We will develop innovative optimization algorithms for teaming humans and AI agents so that they can work together effectively on a given task, so that cost and quality can be optimized, and so that human and AI expertise can be used in complementary ways.

Expected Outcomes: We will develop new algorithms, tools, and analysis techniques for reducing hallucination in model output, improving trust in human-AI interaction, and optimizing how humans and AI can work together effectively. We expect this project to lead to research publications, open-source software, and benchmarks.

Impact on DoD Capabilities: Our project on developing truthful machine learning algorithms and investigating the social implications of AI has practical implications for the naval domain. Ensuring the accuracy and truthfulness of generated information is essential to avoid false positives or hallucinations that could impact naval security. Our work can help enhance trust in AI systems and promote the responsible and ethical use of AI technologies within naval operations. Understanding these implications can also help policymakers develop responsible guidelines for the use of AI in naval operations. By dynamically allocating resources based on our proposed work on expertise assessment of humans and AI agents, naval forces can optimize their utilization, leading to improved situational awareness, faster response times, and enhanced operational efficiency.

Document Details

Document Type: DoD Grant Award
Publication Date: Nov 08, 2024
Source ID: N000142412532

Entities

People

Diyi Yang

Organizations

Office of Naval Research
Stanford University
United States Navy
