The tally charts in Post 33 show non-iterative PROX and
Winsteps producing almost identical results with classroom Nursing1 data (21
students and 22 items). Post 34, with Cantrell (made-up) data (34 students and
14 items), shows very different results between non-iterative PROX and Winsteps.
This post explores how the different results were produced. Winsteps was
stopped after each iteration, and the person (Table 13.1) and item (Table 17.1)
printouts were examined.
Winsteps iterative PROX creates iteration one by subtracting
the mean item measure from each item logit (shifting the distribution so the items
center on the person ability zero measure location). Expansion factors are applied to person
abilities and item difficulties, and the item mean is again adjusted to zero measure for
the next iteration (see Post 35). In contrast, Winsteps JMLE adjusts
person ability and item difficulty simultaneously by filling each cell with
the expected score probability based on both the person ability and the item difficulty.
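To make the PROX mechanics concrete, here is a minimal sketch in Python, assuming the textbook expansion factor sqrt(1 + variance/2.89) from Wright and Stone; the function and variable names are mine, and Winsteps' internal implementation may differ in detail.

```python
import numpy as np

def iterative_prox(data, max_iter=10, tol=0.01):
    """Iterative PROX on a persons-by-items matrix of 0/1 marks.

    Textbook form: each cycle re-expands the raw log-odds of persons
    and items by sqrt(1 + var(other side)/2.89), then re-centers item
    difficulties at the zero measure.  2.89 is 1.7 squared, the usual
    logistic-to-normal scaling constant.  Assumes no zero or perfect
    scores (their log-odds would be infinite).
    """
    n_persons, n_items = data.shape
    person_odds = np.log(data.sum(axis=1) / (n_items - data.sum(axis=1)))
    item_odds = np.log((n_persons - data.sum(axis=0)) / data.sum(axis=0))

    persons = person_odds.copy()
    items = item_odds - item_odds.mean()
    for _ in range(max_iter):
        new_persons = np.sqrt(1 + items.var() / 2.89) * person_odds
        new_items = np.sqrt(1 + persons.var() / 2.89) * item_odds
        new_items -= new_items.mean()      # item mean back to zero measure
        change = max(abs(new_persons - persons).max(),
                     abs(new_items - items).max())
        persons, items = new_persons, new_items
        if change < tol:                   # expansion no longer worth another pass
            break
    return persons, items
```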
The relative location of the third-lowest student
ability measure and the closely related item difficulty measures changes from
one iteration to the next on the PROX chart (left). The same is true on the
JMLE chart (right). This change in relative location of student ability and
item difficulty is, in part, the most noticeable effect of placing two sets of
data on the same graph when each starts from a different location and expands
at a different rate. The rate of expansion decreases with each iteration until
it is too low to justify further iterations. Convergence is then declared: the
point at which the person ability and item difficulty measures stop shifting
on the logit measure scale.
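The cell-filling JMLE step and the convergence test can be sketched the same way. This is a bare Newton-Raphson version under the standard Rasch probability; Winsteps applies refinements (and its LCONV and RCONV settings govern the actual stopping rule) that are only gestured at here.

```python
import numpy as np

def jmle_step(data, persons, items):
    """One JMLE cycle: fill every person-item cell with its Rasch
    expected score, then move each measure by (observed - expected)
    divided by the model variance -- a Newton-Raphson step.
    """
    p = 1 / (1 + np.exp(-(persons[:, None] - items[None, :])))
    info = p * (1 - p)                     # model variance per cell

    new_persons = persons + (data.sum(axis=1) - p.sum(axis=1)) / info.sum(axis=1)
    new_items = items - (data.sum(axis=0) - p.sum(axis=0)) / info.sum(axis=0)
    new_items -= new_items.mean()          # hold the zero-measure origin

    # Largest logit change this cycle; iteration stops when it falls
    # below a preset limit (the role Winsteps' LCONV/RCONV criteria play).
    max_change = max(abs(new_persons - persons).max(),
                     abs(new_items - items).max())
    return new_persons, new_items, max_change
```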
The JMLE chart shows JMLE starting from the last Cantrell PROX
iteration. After about two iterations, the locations for person ability and
item difficulty resemble those from non-iterative PROX. But JMLE continues for
another dozen iterations, to iteration 14, before stopping. By then the distribution
has been expanded an additional logit in each direction.

Clearly, PUP non-iterative PROX and Winsteps JMLE are not in
agreement with the Cantrell data. The two methods are in almost perfect agreement
with the Nursing1 data after just two PROX and two JMLE Winsteps iterations.
My view on this is that poor, inadequate data can
produce poor results. The Cantrell charts show wavy lines at the greatest
distances from the zero logit location. This hunting, or hysteresis, effect
indicates the data reduction method is making large changes that may lead to
inaccurate results. The JMLE portion of Winsteps is a more finely tuned method
than the iterative PROX portion.
Four methods for estimating measures have now been explored:
graphic, non-iterative PROX, iterative PROX, and JMLE. Each of these inventions
is increasingly sensitive in producing convergence. Since the first three
have been fully discussed (and have been found to have no need for any pixy
dust), I am willing to trust that JMLE does not require any either. In general,
the location for person ability will yield a higher expected score than the raw
test score from which it is derived. The further the raw
score is above 50%, the greater the difference between raw score and expected
score. The same goes for scores below 50%: the lower the raw test score, the
further the expected score falls below it.
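One way to see this stretching away from 50% is to run a few raw percentages through the log-odds transform that underlies all four methods; the numbers below are just the logit function, not Winsteps output.

```python
import math

# Log-odds of a raw proportion: ln(p / (1 - p)).  Equal 10-point steps
# in raw score buy ever larger steps in logits away from 50%.
for pct in (0.50, 0.60, 0.70, 0.80, 0.90):
    print(f"{pct:.0%} -> {math.log(pct / (1 - pct)):+.2f} logits")
```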
I am still puzzled by two observations: students correctly answering
the same number of questions of differing difficulty land at the same ability
location, and items answered correctly by the same number of students of
differing ability land at the same difficulty location. This does not, in my
opinion, square with ability-independent and item-independent qualities, or
with the idea that correctly marking one difficult question is worth marking
two easier questions.
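This behavior follows from the raw score acting as the sufficient statistic in the Rasch model: with item difficulties fixed, the person estimate solves an equation that counts correct answers but never asks which items were correct. A small sketch, with made-up difficulties:

```python
import numpy as np

def ability_for(raw_score, difficulties, tol=1e-6):
    """Solve sum of expected scores = raw score for the person logit b.

    Note the inputs: the count correct and the item difficulties.  The
    pattern of which items were marked correctly never enters.
    """
    b = 0.0
    while True:
        p = 1 / (1 + np.exp(-(b - difficulties)))
        step = (raw_score - p.sum()) / (p * (1 - p)).sum()
        b += step
        if abs(step) < tol:
            return b

d = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # hypothetical difficulties

# A student who got the three easiest items right and one who got the
# three hardest right both score 3, so both land at the same measure.
print(ability_for(3, d))
```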
The Rasch model requires student ability and item difficulty
to be located on one fairly linear scale. It adds properties related to latent
student ability and latent item difficulty. I see nothing in the four examined
estimation methods that, by itself, confers these powers or properties on
marks on answer sheets. The elusive properties of the Rasch model may rest more
on use and operator skill than on the methods for estimating student ability
and item difficulty measures.