DEVELOPING ROBUST FRAMEWORK FOR PRACTICAL DATA MINING

Abstract

In real-world applications, the same data can be measured or recorded in different forms. For example, the price of a vehicle can be recorded on an integer scale as x = 100000 or on a base-10 logarithmic scale as x' = 5, and fuel efficiency can be measured in km/L as x = 9.0 or in L/100 km as x' = 11.11. Such differences can arise for various reasons, such as device or sensor settings, user or domain requirements, and data compression to save space. They can be common in applications such as cyber-physical systems, computer networks and IoT. When data are supplied for data mining, information on how they were recorded is often unavailable and only the data values (numbers) are provided. The given form of the data may not be appropriate for the task at hand, and many existing algorithms may perform poorly on it. Data can be misleading because the same data represented differently may give a different impression. In this project, we aim to (i) investigate the impact of changes in how data are recorded on the performance of data mining algorithms, and (ii) develop a robust data mining framework to learn meaningful patterns from data that can be misleading. We plan to work at two levels: (i) the algorithmic level, to develop new algorithms that are robust to such changes; and (ii) the pre-processing level, to develop a new robust data pre-processing technique so that existing algorithms can be used as they are. Our approach is to exploit the ordering (ranks) of data, which is either preserved or reversed when data are measured differently. In addition to the direct outcomes of a robust data mining framework and a couple of publications, we hope this research will have a broader impact and lead to further research, because this issue can arise anywhere in this era of big data, where different features of data objects are captured or recorded by different sensors.
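The rank-based idea in the abstract can be illustrated with a small Python sketch. This is not the project's method, only a demonstration of the underlying observation: the helper name `ranks` and all sample values are assumptions made up for illustration, and the values are chosen to be distinct so that ties do not arise. The sketch shows that a base-10 logarithm leaves the ranks of prices unchanged, while converting km/L to L/100 km exactly reverses the ranks.

```python
import math


def ranks(values):
    """Return the 1-based rank of each value (1 = smallest).

    Assumes all values are distinct (hypothetical helper for illustration).
    """
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


# Hypothetical vehicle prices, on an integer scale and a log10 scale.
prices = [100000, 25000, 40000, 8000]
log_prices = [math.log10(p) for p in prices]

# Hypothetical fuel efficiencies, in km/L and in L/100 km.
km_per_l = [9.0, 15.0, 12.5, 20.0]
l_per_100km = [100.0 / v for v in km_per_l]

n = len(km_per_l)
price_ranks = ranks(prices)
log_price_ranks = ranks(log_prices)          # same as price_ranks
km_ranks = ranks(km_per_l)
l_ranks = ranks(l_per_100km)                 # km_ranks in reverse order
reversed_km_ranks = [n + 1 - r for r in km_ranks]

print(price_ranks, log_price_ranks)
print(l_ranks, reversed_km_ranks)
```

Because the logarithm is monotonically increasing and the reciprocal conversion is monotonically decreasing on positive values, the raw numbers change substantially while the ordering carries the same information in both representations.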

Document Details

Document Type
DoD Grant Award
Publication Date
Aug 11, 2021
Source ID
FA23862014005

Entities

People

  • Sunil Aryal

Organizations

  • Air Force Office of Scientific Research
  • Deakin University
  • United States Air Force

Tags

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Educational Psychology
  • Speech Processing/Speech Recognition

Technology Areas

  • 5G
  • 5G - Internet of Things
  • AI & ML
  • Cyber
  • Space