Wednesday, December 15, 2010
Education has a number of standard units. One is basic to assigning probabilities to events like right marks on a test: the standard deviation.
Random error produces the normal distribution (the normal or bell curve). Given a large enough sample, the distribution appears every time. Random error gives each individual an honest and fair chance within the distribution.
The point on each side of the normal curve where the bend changes from up to down, or down to up (the inflection point), lies one standard deviation (SD) from the mean. Some 95% of a sample is expected to fall within +/- 2 SD of the mean. When observed results do not fall within +/- 2 SD (the 5% level of significance), we know to look for a cause other than chance.
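The "some 95%" figure can be checked directly. The following is a minimal sketch, assuming an exactly normal distribution; the function name fraction_within is made up for this illustration.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within +/- k SD of the mean."""
    # For a normal variable, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/- {k} SD: {fraction_within(k):.4f}")
```

For +/- 2 SD this gives a little over 95%, which leaves roughly 5% (about one time in 20) in the two tails combined.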
Raw test scores are standardized, turned into Z scores, by subtracting the mean from each score and dividing the result by the SD. Two class distributions can be equated by matching the Z scores or by shifting and stretching one of the distributions to fit the other one. The idea is that students who have similar Z scores should have similar grades. The conversion can be made from Test A to Test B or from Test B to Test A.
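As a rough illustration of standardizing and of "shifting and stretching" one distribution to fit another, here is a small sketch. The two score lists are invented for the example, the function names are hypothetical, and the population SD is assumed (a sample SD would work the same way).

```python
from statistics import mean, pstdev

def z_scores(scores):
    """Standardize raw scores: subtract the mean, divide by the SD."""
    m = mean(scores)
    sd = pstdev(scores)  # population SD, an assumption for this sketch
    return [(x - m) / sd for x in scores]

def equate(score, from_scores, to_scores):
    """Place a score from one test on the scale of another by matching Z scores."""
    z = (score - mean(from_scores)) / pstdev(from_scores)
    # Shift (mean) and stretch (SD) onto the target distribution.
    return mean(to_scores) + z * pstdev(to_scores)

# Hypothetical raw scores for two classes.
test_a = [10, 12, 14, 16, 18]
test_b = [50, 55, 60, 65, 70]

print(z_scores(test_a))
print(equate(18, test_a, test_b))  # Test A score expressed on the Test B scale
print(equate(70, test_b, test_a))  # and back from Test B to Test A
```

The same function handles both directions; only the order of the two score lists changes.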
Z scores permit adjusting two sets of raw test scores along one dimension. The Rasch model makes adjustments in two dimensions at the same time, raw scores and item difficulty. The Rasch model uses a t-statistic to detect unacceptable fit.
The t Outfit Zstd on the Winsteps bubble chart is a standardized indicator of how well student and item performances fit the Rasch model's requirements.
Positive values reflect underfit to the Rasch model, or unfinished on PUP Table 3a. Negative values reflect overfit to the Rasch model, or highly discriminating on PUP Table 3a. A student or item performance does not fit if it is more than two t-statistic units away from the perfect model. Even so, a performance such as that of Item 21, whose t Outfit Zstd of 2.3 exceeds two t-statistic units, may still be due to chance one time out of 20 (the 5% level of significance).
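The two-unit rule can be sketched as a simple check. This is only an illustration of the rule as described above, not Winsteps or PUP output: the function name is hypothetical, and apart from Item 21 at 2.3, the item values are invented.

```python
def classify_fit(zstd, cutoff=2.0):
    """Label a t Outfit Zstd value using the +/- 2 rule described above."""
    if zstd > cutoff:
        return "underfit"    # positive: unfinished on PUP Table 3a
    if zstd < -cutoff:
        return "overfit"     # negative: highly discriminating on PUP Table 3a
    return "acceptable"

# Item 21 comes from the text; the other two values are made up.
outfit_zstd = {"Item 21": 2.3, "Item 5": -2.4, "Item 9": 0.6}

for item, value in outfit_zstd.items():
    print(item, value, classify_fit(value))
```

A flag from this rule is a prompt to look for a cause, since about one flagged value in 20 is expected from chance alone.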