Multiple Additive Regression Trees: A Methodology for Predictive Data Mining for Fraud Detection
Abstract
The Defense Finance and Accounting Service (DFAS) Operation Mongoose (Internal Review - Seaside) is using new and innovative techniques for fraud detection. Its primary techniques are the data mining tools of classification trees and neural networks, as well as methods for pooling the results of multiple model fits. In this thesis, a new data mining methodology, Multiple Additive Regression Trees (MART), is applied to the problem of detecting potentially fraudulent and suspect transactions (those with conditions needing improvement, or CNIs). MART is an automated method for pooling a "forest" of hundreds of classification trees. This study shows how MART can be applied to fraud data. In particular, it shows how MART identifies classes of important variables, and that MART is as effective with raw input variables as it is with the categorical variables currently constructed individually by DFAS. MART is also used to explore the effects of the substantial amount of missing data in the historical fraud database. In general, MART is as accurate as existing methods, requires much less effort to implement (saving many man-days), handles missing values in a sensible and transparent way, and provides additional features such as identifying the more important variables.
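The idea of pooling many small trees into one additive model can be illustrated with a minimal sketch. This is not the thesis's implementation or DFAS's tooling: it is a simplified gradient-boosting loop over one-split trees ("stumps") for a two-class problem, using a plain shrunken gradient step rather than MART's per-leaf update, and it does not handle missing values. The function names, the two input variables, and the toy transactions are all invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) for the
    one-split regression tree with the lowest squared error on residuals."""
    mean = sum(residuals) / len(residuals)
    best = (float("inf"), 0, float("inf"), mean, mean)  # constant fallback
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate split halfway between neighbours
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]


def fit_boosted_stumps(X, y, n_trees=100, learning_rate=0.5):
    """Fit an additive model: a constant plus a sum of shrunken stumps.
    Assumes y contains both classes (0 and 1)."""
    p0 = sum(y) / len(y)
    f0 = math.log(p0 / (1.0 - p0))  # initial log-odds
    scores = [f0] * len(y)
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of the log-loss: observed label minus fitted probability.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, t, lv, rv = fit_stump(X, residuals)
        stumps.append((j, t, lv, rv))
        scores = [s + learning_rate * (lv if row[j] <= t else rv)
                  for row, s in zip(X, scores)]
    return f0, stumps, learning_rate


def predict_proba(model, row):
    """Probability of the positive (suspect) class for one transaction."""
    f0, stumps, lr = model
    score = f0 + sum(lr * (lv if row[j] <= t else rv) for j, t, lv, rv in stumps)
    return sigmoid(score)


# Invented toy data: [payment amount, number of prior flags]; 1 = suspect.
X = [[120.0, 0], [80.0, 0], [150.0, 1], [95.0, 0],
     [900.0, 3], [1100.0, 2], [870.0, 4], [1300.0, 3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_boosted_stumps(X, y)
print(predict_proba(model, [1000.0, 3]), predict_proba(model, [100.0, 0]))
```

Each stump is fitted to what the current model still gets wrong, so the final score is a sum of many weak trees rather than the output of a single large one. Counting how often each input variable is chosen for a split is one simple way such a model can rank variables by importance.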
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 2002
- Accession Number: ADA407108
Entities
People
- Antonio J. F. Da Silva Monteiro
Organizations
- Naval Postgraduate School