Dynamic Scheduling for Web Monitoring Crawler
Abstract
Web monitoring systems report changes on target web pages by revisiting them frequently. Because they operate under significant constraints, such as limited network and computing resources, it is necessary to minimize the number of revisits while keeping detection delay low and coverage high. Various statistical scheduling methods have been proposed to address this problem. However, they are static and cannot easily cope with real-world events. This paper proposes a new scheduling method that handles unpredictable events. An MCRDR (Multiple Classification Ripple-Down Rules) document classification knowledge base is reused to detect events and to initiate a prompt web monitoring process regardless of the static monitoring schedule. The experiment demonstrates that the proposed approach improves monitoring efficiency significantly.
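The idea of letting a detected event override a static revisit schedule can be illustrated with a minimal sketch. This is not the report's implementation: the class `MonitorScheduler`, the function `detect_events`, and the keyword rules are hypothetical, and plain keyword matching stands in for the MCRDR knowledge base, which the abstract does not describe in detail.

```python
import heapq


def detect_events(text, rules):
    """Return the URLs to revisit promptly for a fetched document.

    `rules` maps a keyword to the URLs it should trigger. This keyword
    match is only a stand-in for a real classification knowledge base.
    """
    lowered = text.lower()
    hits = set()
    for keyword, urls in rules.items():
        if keyword in lowered:
            hits.update(urls)
    return hits


class MonitorScheduler:
    """Static revisit schedule with event-triggered overrides."""

    def __init__(self, intervals):
        # intervals: {url: seconds between routine revisits} (assumed input)
        self.intervals = dict(intervals)
        self.due = {url: 0.0 for url in self.intervals}
        self.heap = [(0.0, url) for url in self.intervals]
        heapq.heapify(self.heap)

    def _schedule(self, url, when):
        self.due[url] = when
        heapq.heappush(self.heap, (when, url))

    def next_visit(self):
        """Pop the next (time, url) and re-queue it at its static interval."""
        while self.heap:
            when, url = heapq.heappop(self.heap)
            if self.due.get(url) != when:
                continue  # stale entry left behind by an earlier override
            self._schedule(url, when + self.intervals[url])
            return when, url
        return None

    def trigger(self, url, now):
        """Bring a page's next visit forward to `now` when an event is detected."""
        if now < self.due[url]:
            self._schedule(url, now)
```

In use, each fetched document would be passed to `detect_events`, and every returned URL would be passed to `trigger` with the current time, so that pages related to the event are revisited ahead of their routine slot.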
Document Details
- Document Type: Technical Report
- Publication Date: Feb 27, 2009
- Accession Number: ADA494589
Entities
People
- Byeong Ho Kang
- Hiroshi Motoda
- John Salerno
- Paul Compton
Organizations
- University of Tasmania