Serial Averaging in the Construction and Validation of Performance Tests
Abstract
The advent of the microcomputer has led to a renaissance in performance testing, that is, tests which sample what a person can do (remember, track, aim, detect, recognize, etc.) rather than what he or she knows. Psychometric theory, however, is based on knowledge tests. The unit of analysis is an item and the order of administering the items is arbitrary. In performance testing the unit of analysis is a trial and order of administration is not only nonarbitrary but often the only thing that distinguishes one trial from another. In a knowledge test it is not unreasonable to suppose that mean performance and interitem correlations are independent of order of administration. In a performance test it is. Typically, performance improves with practice and intertrial correlations tend toward a definite pattern as a function of order. The consequences of these differences for theory are drastic. In performance testing, both reliability and temporal stability frequently encounter optima as a test is lengthened. Hence, low reliability or stability may not be corrigible by increasing test length. Further, scoring all trials administered (the usual practice) may not yield the best obtainable predictive validity. Scoring only a subset of consecutive trials (early, middle, or late) frequently yields appreciably higher predictive validities than the conventional practice.
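The closing comparison in the abstract, scoring a subset of consecutive trials versus scoring all trials, can be illustrated with a minimal sketch. It is not the report's procedure. The function names (`pearson`, `window_validities`), the data layout (one row of trial scores per person), and the use of the mean over a window as the score are all assumptions made here for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence has zero variance; this is a
    convention chosen for this sketch, since r is undefined there.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def window_validities(scores, criterion):
    """Correlate each consecutive-trial window score with a criterion.

    scores:    list of rows, one per person, each a list of trial scores
               in order of administration (hypothetical layout).
    criterion: one external criterion value per person.

    Returns a dict mapping (start, end) to the correlation between the
    mean of trials start..end-1 and the criterion.
    """
    n_trials = len(scores[0])
    result = {}
    for start in range(n_trials):
        for end in range(start + 1, n_trials + 1):
            width = end - start
            window_means = [sum(row[start:end]) / width for row in scores]
            result[(start, end)] = pearson(window_means, criterion)
    return result


# Made-up data: 4 people, 3 trials. Trial 0 is unrelated to the
# criterion; trials 1 and 2 track it exactly.
scores = [
    [5, 1, 1],
    [3, 2, 2],
    [4, 3, 3],
    [1, 4, 4],
]
criterion = [1, 2, 3, 4]

validities = window_validities(scores, criterion)
all_trials = validities[(0, 3)]   # conventional practice: score everything
late_trials = validities[(1, 3)]  # score only the last two trials
```

With this made-up data the late window correlates more strongly with the criterion than the all-trials score does. On real data, a window chosen this way is selected on the same sample used to evaluate it, so its validity would need to be checked on an independent sample.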
Document Details
- Document Type: Technical Report
- Publication Date: Jul 09, 1991
- Accession Number: ADA240313
Entities
People
- Marshall B. Jones
Organizations
- Penn State Milton S. Hershey Medical Center