Analyzing Feature Relevance for Social Media Traffic Classification with Machine Learning
Abstract
Prior research on the classification of Social Media (SM) traffic demonstrated the feasibility of classifying SM network traffic with traditional Machine Learning (ML) techniques on a packet-by-packet basis. This paper builds on those results and evaluates the feature analysis methods that were explored during the ML experiments. Previously, an exhaustive search evaluated nearly all possible combinations of input features to find the best subset for implementing Support Vector Machine (SVM) models. Exhaustive feature searches tend to be computationally expensive and can be prohibitive for large feature sets. In one case, up to sixteen features were explored, which required 65,535 combinatorial executions (2^16 - 1, one for each non-empty subset of the features). While the project had access to High Performance Computing (HPC) resources, such resources may not be available to every project. Applying enhanced feature analysis and feature selection techniques could yield an optimum subset of features for developing ML models, reducing computation time and avoiding overfitting. This ML project provides a unique opportunity to evaluate such feature reduction techniques and compare their results to those of the exhaustive search. For classification problems, the underlying goal of a machine learning model is to find the information within the input data that produces the best prediction. The selected approach for investigating the results of the original SM ML project is to analyze the input data features to identify unique characteristics that may have contributed to the ML models' success (Sections A and B of the investigation results). Sections C and D then explore automated feature selection techniques on the larger feature set.
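The cost of the exhaustive search described in the abstract can be illustrated with a minimal Python sketch. It is not the report's implementation: the feature names, the `score_fn` callback, and the function names are hypothetical placeholders, and in the report each evaluation would be an SVM training run rather than a simple function call. The sketch only shows why sixteen features lead to 65,535 evaluations.

```python
from itertools import combinations


def count_nonempty_subsets(n_features):
    """Number of non-empty feature subsets: 2^n - 1."""
    return 2 ** n_features - 1


def iter_feature_subsets(features):
    """Yield every non-empty subset of `features`, smallest subsets first."""
    for k in range(1, len(features) + 1):
        yield from combinations(features, k)


def exhaustive_search(features, score_fn):
    """Score every non-empty subset and return the best (subset, score).

    `score_fn` is a hypothetical stand-in for training and validating
    a model on the given subset; higher scores are treated as better.
    """
    best_subset, best_score = None, float("-inf")
    for subset in iter_feature_subsets(features):
        score = score_fn(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Placeholder feature names; the real packet features are not listed here.
feature_names = [f"f{i}" for i in range(16)]
n_runs = sum(1 for _ in iter_feature_subsets(feature_names))
print(n_runs, count_nonempty_subsets(16))
```

Because the count doubles with each added feature, the same loop over 20 features would already need more than a million evaluations, which is the motivation for the feature selection techniques the report goes on to examine.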
Document Details
- Document Type: Technical Report
- Publication Date: Aug 13, 2020
- Accession Number: AD1110423
Entities
People
- Bela Erdelyi
- Johnson John
- Metin Ahiskali