Automated Extraction and Characterisation of Social Network Data from Unstructured Sources -- An Ontology-Based Approach
Abstract
Automated extraction of social network-related data is one objective of the applied research project on Social Network Analysis (SNA) in a Counter-Insurgency context (SNAC) at DRDC Valcartier. Since the vast majority of the information resides in unstructured text documents, the project's prototype must be able to extract social network-related data directly from them. To this end, the prototype leverages and refines existing services provided by the Intelligence Science & Technology Integration Platform (ISTIP) at DRDC Valcartier. These services rely on ontologies to perform document annotation and to semantically characterize the data to be persisted. Given a list of a priori known instances of entities such as people, organizations, and events, the system constructs the social web that ties these entities together. To do so, a data source crawling service continuously scans existing databases and feeds new documents to the system. Then, using natural language processing services, the system scans these incoming documents, extracts information about entities as well as their relations and their respective attributes, and stores this information in a graph database. The system also provides basic and semantic filtering services as well as conversion to many formats.
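As a rough illustration of the pipeline the abstract describes (known entities in, documents scanned, relations stored as a graph), the following is a minimal sketch in Python. It is not the ISTIP services or the SNAC prototype: the entity names, the sample documents, and the functions `split_sentences`, `extract_relations`, and `build_graph` are all hypothetical, and simple same-sentence co-occurrence stands in for the ontology-based natural language processing, with a plain dictionary standing in for the graph database.

```python
import itertools
import re
from collections import defaultdict

# Hypothetical seed list of a priori known entities (name -> type).
KNOWN_ENTITIES = {
    "Alice Smith": "Person",
    "Bob Jones": "Person",
    "Acme Group": "Organization",
}


def split_sentences(text):
    """Naive sentence splitter; a real NLP service would be far more robust."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_relations(text, entities):
    """Yield pairs of known entities mentioned in the same sentence.

    Assumption: co-occurrence is used here as a stand-in for real
    relation extraction.
    """
    for sentence in split_sentences(text):
        found = sorted(name for name in entities if name in sentence)
        yield from itertools.combinations(found, 2)


def build_graph(documents, entities):
    """Accumulate co-occurrence counts as weighted, undirected edges."""
    edges = defaultdict(int)
    for doc in documents:
        for pair in extract_relations(doc, entities):
            edges[pair] += 1
    return dict(edges)


# Made-up documents, standing in for the output of a crawling service.
docs = [
    "Alice Smith met Bob Jones in the market. Bob Jones left early.",
    "Acme Group hired Alice Smith. Alice Smith later called Bob Jones.",
]
graph = build_graph(docs, KNOWN_ENTITIES)
for (source, target), weight in sorted(graph.items()):
    print(f"{source} -- {target} (weight {weight})")
```

In this sketch each edge weight counts the sentences in which two known entities appear together; an ontology-driven system would instead attach typed relations and attributes to each edge.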
Document Details
- Document Type: Technical Report
- Publication Date: Jun 01, 2013
- Accession Number: ADA587890
Entities
People
- Etienne Martineau
- Regine Lecocq
Organizations
- DRDC Valcartier