Information Systems: Computing Sciences: Knowledge Systems: Intel-Miner: Semantic Machine for Robust Interpretation of Noisy Intelligence

Abstract

The long-term goal of this research is to develop an AI-based text analytics platform that enables real-world users to conduct semantic analysis of multiple intelligence reports in a simple and interactive fashion, without worrying about the underlying details of NLP. The PI aims to address this challenge by developing a conversational-AI agent that consists of two primary components: 1) Semantic Machine: a general-purpose and intuitive semantic machine that would enable real users to quickly formulate and explore alternative hypotheses about events, situations, and trends of interest from multiple intelligence reports; and 2) Dialog System: a dialog system that will verbally engage with users to understand their information need, to communicate the relevant intelligence hypotheses formulated by the semantic machine, and finally to revise or alter a specific hypothesis in an interactive fashion. The scope of this project is limited to the first component of the conversational-AI agent, i.e., research and development of the Semantic Machine, which we call Intel-Miner. The primary goal of Intel-Miner is to provide a special language, similar to set algebra, that can be used to write infinitely many different text analysis programs, each solving a different text analysis task. The beauty of Intel-Miner lies in its simplicity: the PI will implement three basic set-like operators in this framework (TextIntersect, TextUnion, and TextDifference), which are the natural text analogues of the set operators they are named after. Given two documents, TextIntersect is essentially the "extract commonalities" functionality (reveal common information present in both documents), whereas TextDifference corresponds to the "extract differences" functionality (reveal unique information present in the first document). Notably, TextUnion can be viewed as a multi-document summarization operation with some additional constraints.
Once implemented, Intel-Miner will provide a single unified framework that can support an infinite number of different applications by combining the three basic operators. As in a general programming language, frequently used sequences of operators in Intel-Miner can also be treated as a "compound operator" that can be made available to users through a library.
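
The set-algebra analogy can be illustrated with a minimal Python sketch. It is not the Intel-Miner implementation, which the abstract does not specify. It assumes a document can be reduced to a list of normalized sentences and that two sentences carry the same information only when they match exactly, a crude stand-in for the semantic matching the project proposes. All function names (to_units, text_intersect, text_union, text_difference, text_symmetric_difference) are hypothetical and chosen for illustration only.

```python
import re


def to_units(text):
    """Split raw text into normalized sentences (an assumed unit of information)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip().lower().rstrip(".!?") for p in parts if p.strip()]


def text_intersect(a, b):
    """Units present in both documents, in the order they appear in a."""
    in_b = set(b)
    return [u for u in a if u in in_b]


def text_difference(a, b):
    """Units present in a but not in b."""
    in_b = set(b)
    return [u for u in a if u not in in_b]


def text_union(a, b):
    """All units from both documents with duplicates removed.

    The abstract describes TextUnion as a constrained multi-document
    summarization; plain de-duplication is only a placeholder for that.
    """
    seen, merged = set(), []
    for u in a + b:
        if u not in seen:
            seen.add(u)
            merged.append(u)
    return merged


def text_symmetric_difference(a, b):
    """A hypothetical compound operator built only from the basic ones."""
    return text_union(text_difference(a, b), text_difference(b, a))


report_a = to_units("A convoy left the base at dawn. Roads near the river are flooded.")
report_b = to_units("Roads near the river are flooded. A bridge collapsed overnight.")

common = text_intersect(report_a, report_b)
only_in_a = text_difference(report_a, report_b)
unique_to_either = text_symmetric_difference(report_a, report_b)
```

Because each operator in this sketch takes and returns the same kind of value (a list of units), operators can be chained freely. That closure property is what lets a frequently used sequence, such as the symmetric difference above, be packaged as a compound operator in a library.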

Document Details

Document Type
DoD Grant Award
Publication Date
Oct 12, 2022
Source ID
W911NF2210280

Entities

People

  • Shubhra Karmaker

Organizations

  • Army Contracting Command
  • Auburn University
  • United States Army

Tags

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Distributed Systems and Data Platform Development