Misleading Information Detection Through Probabilistic Decision Tree Classifiers
Abstract
This report describes our work on detecting misleading information, which has focused on detecting creative accounting practices through the analysis of SEC filing data. We adopted a decision-tree approach to detecting the red-flag conditions associated with such practices. Decision trees provide a natural way to express expert knowledge of red-flag evidence, and the resulting classifications can be explained in human terms. Focusing on the diagnostic analysis of numeric data from balance sheets and income statements, we tested and evaluated creative accounting detection rules suggested by Mulford & Comiskey to demonstrate the validity of our approach. We developed a data mining application with a graphical user interface (GUI) to support the semi-automatic construction and induction of decision trees that classify SEC filings as "positive" (red-flag) or "negative" instances of accounting fraud. The application permits varying degrees of user involvement and automatic supervised learning of decision rules from training sets: (1) decision rules may be hand-crafted and used verbatim by the system; (2) top-level rules may be specified by the user, leaving the system to generate the remaining rules; (3) existing rules may be refined, via auto-adjustment of split-points (numeric thresholds), to achieve user-specified sensitivity and selectivity values. A cross-validation mechanism was implemented to evaluate the effectiveness of auto-generated decision trees.
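The split-point refinement described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation. It assumes a single hand-crafted rule of the form "flag the filing if a financial ratio exceeds a threshold", it reads "selectivity" as the true-negative rate (specificity), and it uses invented toy data. The function names `rates` and `adjust_split_point`, and the ratio values themselves, are hypothetical.

```python
from typing import List, Optional, Tuple


def rates(values: List[float], labels: List[bool], threshold: float) -> Tuple[float, float]:
    """Return (sensitivity, selectivity) for the rule: flag if value > threshold.

    Assumption: "selectivity" is taken to mean the true-negative rate.
    """
    pairs = list(zip(values, labels))
    tp = sum(1 for v, y in pairs if y and v > threshold)
    fn = sum(1 for v, y in pairs if y and v <= threshold)
    tn = sum(1 for v, y in pairs if not y and v <= threshold)
    fp = sum(1 for v, y in pairs if not y and v > threshold)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    selectivity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, selectivity


def adjust_split_point(
    values: List[float], labels: List[bool], min_sensitivity: float
) -> Optional[Tuple[float, float, float]]:
    """Among candidate thresholds that meet the sensitivity target,
    return (threshold, sensitivity, selectivity) with the best selectivity,
    or None if no candidate meets the target."""
    xs = sorted(set(values))
    # Candidates: one point below all data, then midpoints between neighbours.
    candidates = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    best = None
    for t in candidates:
        sens, sel = rates(values, labels, t)
        if sens >= min_sensitivity and (best is None or sel > best[2]):
            best = (t, sens, sel)
    return best


# Hypothetical toy data: one ratio per filing, True marks a red-flag filing.
ratios = [0.8, 0.9, 1.0, 1.1, 1.3, 1.5, 1.8, 2.0]
red_flag = [False, False, False, True, False, True, True, True]

result = adjust_split_point(ratios, red_flag, min_sensitivity=0.75)
print(result)
```

On this toy data the search settles on a threshold near 1.4, where three of the four red-flag filings are caught (sensitivity 0.75) and no negative filing is flagged (selectivity 1.0). A real system would tune one such split-point per node of the tree and would evaluate the result on held-out data, as the cross-validation mechanism mentioned above suggests.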
Document Details
- Document Type: Technical Report
- Publication Date: Sep 26, 2002
- Accession Number: ADA406823
Entities
People
- Gordon Dakin
- Sankar Virdhagriswaran