A Distributed Laboratory for Automated Document Generation Using Large-Scale Computational Methods
Abstract
Under the ONR-funded BAA project entitled Generating Documents that are Consistent with a Knowledge Base, George Mason University and Dartmouth College are jointly developing a system to generate fake documents that are consistent with background knowledge bases. We are developing a suite of algorithms to automatically extract knowledge bases from real documents and synthesize realistic fake documents to deter potential cyber-adversaries. Because technical documents are multimodal, this requires an end-to-end solution spanning many different fields, such as natural language processing, computer vision, logic, and optimization. In addition, corporations have huge numbers of documents. To support these large-scale computational methods, our system requires substantial computing resources in terms of CPUs, memory, and GPUs. We have used existing resources at George Mason and Dartmouth; however, our experiments have been severely limited: a single neural-network-based method such as a Generative Adversarial Network may take weeks to run, causing the research to proceed very slowly, especially as multiple runs are needed to explore different parameters. We propose the development of a distributed laboratory (at GMU and Dartmouth) for large-scale document generation consistent with a knowledge base. The proposal addresses only the equipment cost of the machines and racks that will constitute our distributed computing system. The laboratory will enable us to design and test various methods to achieve this goal. Additionally, it will offer our students the opportunity to gain valuable hands-on experience.
Document Details
- Document Type: DoD Grant Award
- Publication Date: May 08, 2020
- Source ID: N000142012407
Entities
People
- Sushil Jajodia
Organizations
- George Mason University
- Office of Naval Research
- United States Navy