Integrated Data Driven Solutions (I2DS) Project in the Active Social Engineering Defense (ASED) Program
Abstract
The Purdue team's proposal covers only TA1, which focuses on using machine learning models to detect social engineering messages. The Purdue team joined teams led by Berkeley and CMU to form the LASER team. The Purdue team developed techniques for training classification models for social engineering emails and participated in the dry run and the evaluations. Three models were developed. Two of them analyze the subject line and the body text: a TF-IDF (term frequency-inverse document frequency) model uses standard term frequency information, and a motive model extracts motive features from the text to identify the message author's intent (e.g., get information, access social network). The third is a knowledge and graph model that extracts relation features from the sender and receiver information. An ensemble model, composed of a logistic regression model and a neural network model, aggregates the output of the three models to make a prediction. The team extensively explored different models, training techniques, and their impact on accuracy.
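The TF-IDF features and the ensemble step described in the abstract can be illustrated with a minimal Python sketch. It is not the team's implementation. The function names `tfidf` and `ensemble_score`, the whitespace tokenization, the unsmoothed idf formula, and the example weights and bias are all assumptions made for illustration, and only a logistic-regression-style combiner is shown (the neural network part of the ensemble is omitted).

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * idf.

    Assumption: whitespace tokenization and idf = log(N / df), unsmoothed.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def ensemble_score(base_scores, weights, bias):
    """Combine base-model scores with a logistic function.

    base_scores: one score per base model (hypothetically the TF-IDF,
    motive, and graph models). weights and bias are hypothetical values
    that would normally be learned from labeled data.
    """
    z = bias + sum(w * s for w, s in zip(weights, base_scores))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    emails = [
        "verify your account now",
        "meeting notes attached",
        "your account notes",
    ]
    for vector in tfidf(emails):
        print(vector)
    # Hypothetical scores from three base models for a single email.
    print(ensemble_score([0.9, 0.8, 0.7], weights=[1.0, 1.0, 1.0], bias=-1.0))
```

In this sketch, terms that appear in fewer documents receive a larger weight, and the combiner maps the weighted sum of base-model scores to a value between 0 and 1 that can be thresholded to produce a prediction.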
Document Details
- Document Type: Technical Report
- Publication Date: Aug 15, 2020
- Accession Number: AD1137122
Entities
People
- Dan Goldwasser
- Jennifer Neville
- Ninghui Li
Organizations
- Purdue University