Web-Based Programming for Real-Time News Acquisition

Abstract

This report describes a Web 2.0 application developed at the U.S. Army Research Laboratory in support of its Real-Time News Analysis (RTNA) project. The application uses the Google, Inc. AJAX Search application programming interface (API) to acquire data and then formats the resulting data for analysis. News stories on a specified topic (e.g., terrorist bombing) are gathered from public sources by a function in a JavaScript node of an Extensible HyperText Markup Language (XHTML) document. The content of selected elements is then extracted, or scraped, from the XHTML. The graphical user interface allows the user to choose up to 10 words and/or phrases and permits explicit exclusion of certain semantics. At present, the data sources are those determined by Google News and those the user specifies in a Google Web service. A Google gadget for Maps has been added for geographic visualization of location; additional searchers for Google Video, Blog, and Book have been tested and can easily be added to the search controller. The application also allows for integration of asynchronous JavaScript and XML (AJAX) technology, including Java servlets for requesting data and JavaServer Pages for the responses.
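
The two steps the abstract names, composing a query from a limited set of words and phrases with exclusions, and scraping the content of selected elements from XHTML, can be illustrated with a minimal sketch. It is not the report's code, which is written in JavaScript against the Google AJAX Search API. The function names, the quoting and leading-minus query syntax, the choice to apply the 10-item limit to included terms only, and the "story" and "ad" class names in the sample markup are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
MAX_TERMS = 10  # limit on words and/or phrases stated in the abstract


def build_query(include, exclude=()):
    """Join included words/phrases and excluded terms into one query string.

    Assumption: phrases are wrapped in double quotes and exclusions are
    marked with a leading minus sign, as in common search-engine syntax.
    """
    if not include:
        raise ValueError("at least one word or phrase is required")
    if len(include) > MAX_TERMS:
        raise ValueError(f"at most {MAX_TERMS} words or phrases are allowed")

    def quote(term):
        term = term.strip()
        return f'"{term}"' if " " in term else term

    parts = [quote(t) for t in include]
    parts += ["-" + quote(t) for t in exclude]
    return " ".join(parts)


def scrape_elements(xhtml_text, tag, css_class):
    """Return the text content of every <tag> element carrying css_class.

    The input must be well-formed XHTML that declares the XHTML namespace
    on its root element.
    """
    root = ET.fromstring(xhtml_text)
    wanted = f"{{{XHTML_NS}}}{tag}"
    results = []
    for element in root.iter(wanted):
        if css_class in element.get("class", "").split():
            # itertext() also collects text held by nested child elements.
            results.append("".join(element.itertext()).strip())
    return results


# Hypothetical result markup; the class names are invented for this sketch.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="story"><a href="http://example.com/1">First <b>headline</b></a></div>
    <div class="ad">Sponsored</div>
    <div class="story lead">Second headline</div>
  </body>
</html>"""

if __name__ == "__main__":
    print(build_query(["terrorist bombing", "Baghdad"], exclude=["movie"]))
    print(scrape_elements(SAMPLE, "div", "story"))
```

Because XHTML is well-formed XML, a strict XML parser is enough for the extraction step in this sketch; ordinary HTML that is not well formed would need a more tolerant parser.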

Document Details

Document Type: Technical Report
Publication Date: Sep 01, 2007
Accession Number: ADA474080

Entities

People

  • Andrew M. Neiderer
  • John Richardson

Organizations

  • United States Army Research Laboratory

Tags

DTIC Thesaurus Topics

  • Acquisition
  • Application Programming Interface
  • Computer Program Documentation
  • Computer Programming
  • Graphical User Interface
  • HTML
  • JavaScript Programming Language
  • Language
  • Linguistics
  • Markup Languages
  • Military Research
  • Two Dimensional
  • User Interface
  • Web Browsers
  • XML

Fields of Study

  • Computer science

Readers

  • Database Systems and Applications
  • International Journalism and Media Studies