PREDICTION IN EVOLVING DATA STREAMS USING AN ADAPTIVE SYSTEM
Abstract
The main research goal is to develop the theory and algorithms for a predictive system for evolving data streams. Consider, for example, intrusion detection systems, which monitor streams of network traffic to predict potential cyberattacks. Cybercriminals may change their attack behaviour when designing new attacks. A new attack needs to be identified as early as possible, and the predictive model has to be updated to learn its behaviour. Such a change of behaviour can be identified as concept drift in a data stream. Current data stream approaches, including our own work, require prediction models either to be updated with the arrival of every new data instance [1-4] or to be rebuilt when a change within the stream is explicitly signalled [5-15]. In the case of intrusion detection, an attack that has not been seen recently may reoccur in the future, so the ability to recognise recurrences of previous concepts is crucial. Existing research [16-25] exploits model recurrence but does not match the learning paradigm to the nature of the stream. These approaches instead keep a pool of frequently used models and forget models that are used infrequently. A drawback of these delayed forget-and-rebuild techniques is that concepts that occur rarely but are both critical and difficult to learn will be forgotten. Building and maintaining adaptive prediction systems in real time is currently prohibitive because of their computational complexity. There are two open challenges. The first is to deal efficiently and effectively with the evolving and volatile nature of data streams. The second is to find the best way to capture and reuse suitable recurring models while preventing unnecessary forgetting in the learning system. Using the theory and algorithms that we develop, we will be able to provide both theoretical and empirical guarantees on the performance of the system.
The outcomes and deliverables of this research are twofold: (1) an open-source predictive system for evolving data streams, and (2) research publications arising from this project.
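To make the idea of reusing recurring models concrete, the following is a minimal illustrative sketch and not the system proposed in this project. It assumes a toy binary stream, a simple windowed error-rate check standing in for a real concept drift detector, and a repository that never evicts stored models. All class names, parameters, and thresholds (`CountModel`, `RecurringConceptLearner`, `window`, `drift_error`, `reuse_accuracy`) are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict, deque


class CountModel:
    """Toy classifier: predicts the most frequent label seen for each input."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def predict(self, x):
        seen = self.counts.get(x)
        return seen.most_common(1)[0][0] if seen else 0

    def learn(self, x, y):
        self.counts[x][y] += 1


class RecurringConceptLearner:
    """Hypothetical learner that keeps every model and reuses one on recurrence."""

    def __init__(self, window=10, drift_error=0.6, reuse_accuracy=0.8):
        self.window = window                  # number of recent instances inspected
        self.drift_error = drift_error        # windowed error rate that signals drift
        self.reuse_accuracy = reuse_accuracy  # accuracy a stored model needs to be reused
        self.repository = [CountModel()]      # assumption: models are never forgotten
        self.active = self.repository[0]
        self.recent = deque(maxlen=window)    # (x, y, was_error) triples

    @staticmethod
    def _accuracy(model, sample):
        return sum(model.predict(x) == y for x, y in sample) / len(sample)

    def _switch_model(self):
        # The misclassified instances are taken as a sample of the new concept.
        mistakes = [(x, y) for x, y, err in self.recent if err]
        candidates = [m for m in self.repository if m is not self.active]
        best = max(candidates, key=lambda m: self._accuracy(m, mistakes), default=None)
        if best is not None and self._accuracy(best, mistakes) >= self.reuse_accuracy:
            self.active = best                # recurring concept: reuse the stored model
        else:
            self.active = CountModel()        # unseen concept: start a new model
            self.repository.append(self.active)
        for x, y in mistakes:
            self.active.learn(x, y)
        self.recent.clear()

    def process(self, x, y):
        """Predict, then either update the active model or switch models on drift."""
        prediction = self.active.predict(x)
        self.recent.append((x, y, prediction != y))
        if len(self.recent) == self.window:
            error_rate = sum(err for _, _, err in self.recent) / self.window
            if error_rate >= self.drift_error:
                self._switch_model()
                return prediction
        self.active.learn(x, y)
        return prediction


# Toy stream: concept A (y = x), then concept B (y = 1 - x), then A again.
learner = RecurringConceptLearner()
errors = 0
for i in range(150):
    x = i % 2
    y = 1 - x if 50 <= i < 100 else x
    errors += learner.process(x, y) != y

print(len(learner.repository), "models stored;", errors, "errors")
```

In this toy run the learner creates a second model when concept B first appears and, when concept A returns, switches back to the stored first model instead of learning it again. A realistic system would replace the error-rate check with a statistically grounded drift detector and would need a principled policy for bounding the repository without discarding rare but critical concepts.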
Document Details
- Document Type: DoD Grant Award
- Publication Date: Apr 25, 2019
- Source ID: N629091912042
Entities
People
- Yun Sing Koh
Organizations
- Office of Naval Research
- United States Navy
- University of Auckland