AI/ML Scaffolding
Abstract
CDAO’s AI/ML Scaffolding efforts include:

* AI Test & Evaluation and Assurance Case Best Practices: Tailorable T&E frameworks that balance structure with flexibility to provide guidance on critical AI testing challenges, including:
  - AI T&E frameworks to support development of AI T&E policy and guidance, and T&E strategies and plans for systems with AI
  - Example Assurance Cases
  - DoD LLM Benchmarks

  Frameworks will be publicly releasable and made available to customers through CDAO and DoD publication means. Assurance Case examples and DoD LLM Benchmarks will be released at the classification level of the systems and data used to generate them.
* Alpha-1: Provides enterprise services for DoD AI/ML development to give a baseline to DoD AI/ML projects. Services provided by Alpha-1 will be employed at the discretion of the partnering projects. CDAO will fund and provide some capabilities, which partnering projects can elect to augment with additional funds or capabilities.
* Digital Ecosystem Reference Architecture Model (DREAM): An effort to continuously deliver AI/ML to mission spaces.
  - DREAM exists to address major impediments to the adoption of AI/ML across the department, including DoD’s lack of relevant technical capability and its need for continuous capability updates and awareness of technical innovation.
  - DREAM provides strategic technical awareness of relevant technologies that support the adoption of AI/ML across the DoD.
  - DREAM provides a reference architecture model for continuous delivery of assured AI/ML to mission spaces.
* Enterprise Platforms & Capabilities (EPC) AI/ML Operations: Implements the core Advana principles to bring Data and Analytics to the DoD. It provides:
  - Scalable compute and access to open source and industry-leading packages, models, and programming languages
  - Ability to build, deploy, test, maintain, and monitor machine learning models built on top of live enterprise data
  - Access to foundational models and a catalog of production/available models
  - Utilization of various development environments, tools, and platforms
  - Assessment of Large Language Models (LLMs) in accordance with CDAO guidance
* Joint AI Test Infrastructure Capability (JATIC): At the direction of the DSD and recommendation of the NSCAI report, CDAO established the JATIC to enable enterprise-scale rapid development, testing, and deployment of AI capabilities across warfighting domains. The JATIC will also migrate the DoD towards Joint All Domain Test & Evaluation. This funding will support JATIC foundational programs such as: Adversarial AI T&E, AI/ML model card standards, Scalable AI Test Harness, CDAO data repository, data service marketplace, and ontologies.
* Perceptor: On behalf of the DoD enterprise, Perceptor will pursue the following needs:
  - Accelerate the operationalization of AI systems
  - Increase analytic and AI reuse
  - Improve AI performance in production environments and reduce risk of AI degradation over time
  - Integrate AI insight in existing tools and interfaces
  - Operationalize AI inference on data streams and at the edge / ingest
* Continuous Operational Behavior, Robustness, and Resilience Assessment (COBRRA): Provides tools for continuous monitoring of AI models after deployment to ensure continuous validation and verification of performance characteristics under changing environmental and operational conditions.
* Responsible AI (RAI): RAI is critical to decision makers, warfighters, industry partners, and public trust in the technologies that the Department develops and deploys. The CDAO RAI effort:
  - Operationalizes the DoD AI Ethical Principles
  - Oversees the Deputy Secretary of Defense’s RAI Strategy and Implementation Pathway (S&IP)
  - Coordinates with offices throughout the DoD and federal government
  - Integrates RAI to build capacity of best practices and tools for AI
* Smart Sensor: Smart Sensor is an AI and autonomy software effort enabling Group 5 UAS to conduct surveillance and reconnaissance missions in C2- and GPS-denied environments. This effort will enable autonomous, persistent, airborne surveillance and reconnaissance in contested environments via AI-enabled sensor fusion.
* Test & Evaluation Infrastructure Gap Studies: CDAO's Test & Evaluation Infrastructure Gap Studies provide an evidence-based assessment of both the current state and desired state of DoD’s T&E ecosystem necessary for accomplishing robust and efficient testing of AIESs. This product series covers T&E broadly, and future work will cover resource gaps preventing sufficient evaluations of Adversarial AI. These studies provide tailored, tangible recommendations for infrastructure investments.
Document Details
- Document Type: Accomplishment
- Publication Date: Oct 01, 2025
- Source ID: c38a70fd63028c80790fcc9a68f2fbe0