Dynamic Scheduling for Web Monitoring Crawler

Abstract

Web monitoring systems report changes to target web pages by revisiting them frequently. Because they operate under significant constraints on network and computing resources, they must minimize revisits while keeping detection delay low and coverage high. Various statistical scheduling methods have been proposed to address this problem. However, they are static and cannot easily cope with real-world events. This paper proposes a new scheduling method that handles unpredictable events. An MCRDR (Multiple Classification Ripple-Down Rules) document classification knowledge base was reused to detect events and to initiate a prompt web monitoring process regardless of the static monitoring schedule. The experiment demonstrates that the proposed approach improves monitoring efficiency significantly.
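The idea in the abstract, a static revisit schedule that a detected event can override, can be illustrated with a minimal sketch. This is not the report's implementation: the class `MonitorScheduler`, the function `detect_events`, the keyword rules, and the example URLs are all hypothetical, and a simple keyword match stands in for the MCRDR knowledge base.

```python
import heapq


class MonitorScheduler:
    """Toy scheduler: fixed revisit intervals plus event-driven early revisits."""

    def __init__(self, intervals):
        # intervals: {url: seconds between scheduled revisits} (assumed values)
        self.intervals = dict(intervals)
        self.due = {url: 0.0 for url in self.intervals}  # next due time per URL
        self.heap = [(0.0, url) for url in sorted(self.intervals)]
        heapq.heapify(self.heap)

    def _push(self, url, when):
        self.due[url] = when
        heapq.heappush(self.heap, (when, url))

    def trigger(self, url, now):
        """Event detected: bring the URL's next visit forward to `now`."""
        if now < self.due[url]:
            self._push(url, now)

    def pop_due(self, now):
        """Return URLs due at `now` and reschedule each by its static interval."""
        ready = []
        while self.heap and self.heap[0][0] <= now:
            when, url = heapq.heappop(self.heap)
            if when != self.due[url]:
                continue  # stale entry superseded by a trigger or reschedule
            ready.append(url)
            self._push(url, now + self.intervals[url])
        return ready


def detect_events(document_text, rules):
    """Keyword stand-in for a rule-based classifier (NOT MCRDR itself).

    rules: list of (keyword, url) pairs; returns the URLs to revisit promptly.
    """
    text = document_text.lower()
    return [url for keyword, url in rules if keyword in text]


# Usage: a news item mentioning an event pulls a slow-schedule page forward.
sched = MonitorScheduler({
    "https://example.org/news": 60,       # revisited every minute
    "https://example.org/agency": 3600,   # revisited hourly
})
sched.pop_due(0)  # initial visit of both pages
rules = [("earthquake", "https://example.org/agency")]
for url in detect_events("Earthquake reported offshore", rules):
    sched.trigger(url, now=10)
early = sched.pop_due(10)  # the agency page is due again, well before its hourly slot
```

The priority queue keeps the static schedule cheap to maintain, while `trigger` only ever moves a visit earlier, so an event never delays a page beyond its scheduled revisit.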

Document Details

Document Type
Technical Report
Publication Date
Feb 27, 2009
Accession Number
ADA494589

Entities

People

  • Byeong Ho Kang
  • Hiroshi Motoda
  • John Salerno
  • Paul Compton

Organizations

  • University of Tasmania

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Air Force Research Laboratories
  • Classification
  • Databases
  • Detectors
  • Electronic Mail
  • Engineering
  • Governments
  • Information Science
  • Information Systems
  • Knowledge Management
  • Local Governments
  • National Governments
  • Scheduling (Production)
  • Simulations
  • Students
  • United States
  • Websites

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Operations Research
  • Sensor Fusion and Tracking Systems