Until this point I had not found a way to directly relate the inner workings (not the outputs) of
the partial credit Rasch model and the non-iterative PROX methods for estimating
measures. I have now found an intermediary method for estimating measures that bridges
this gap: the Rasch rating scale method within Winsteps.
Both the Rasch
rating scale method and non-iterative PROX method produce the same measures for
students with the same scores and for items with the same difficulties. Winsteps
transposed values (columns become rows and rows become columns) also group
students and items by common scores and difficulties. You can visualize this by printing out the normal and transposed ability-difficulty tallies (person-item bar charts) and then flipping the transposed bar chart left to right and then tipping it end for end.
These logit values can
be restored to normal by multiplying by -1 to flip the log ratio scale
(which is centered on zero) end for end, and then adding the measure mean (1.78 logits in this Nursing1 example) to shift the values back into their original locations.
Restoring again matches student and item measures. A transposed student ability value of -0.34 is restored to +2.12 (0.34 + 1.78). A transposed item difficulty value of 3.30 is restored to -1.52 (-3.30 + 1.78).
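The flip-and-shift restoration can be written as one line of arithmetic. This is a minimal sketch using the Nursing1 measure mean of 1.78 logits quoted above; the function name is illustrative, not Winsteps syntax:

```python
MEASURE_MEAN = 1.78  # logits, from the Nursing1 example

def restore(transposed_value, mean=MEASURE_MEAN):
    """Flip the transposed logit value end for end, then shift by the mean."""
    return -transposed_value + mean

# Student ability: 0.34 + 1.78
print(round(restore(-0.34), 2))  # 2.12
# Item difficulty: -3.30 + 1.78
print(round(restore(3.30), 2))   # -1.52
```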
Rounding from
1/10 logit to 1/4 logit (Winsteps)
produces a better-looking bar chart, but it introduces a noticeable distortion. This
becomes apparent when restoring transposed Rasch rating scale values. The
original and restored values are the same when manipulating the numbers (flip
the log ratio scale and add the measure mean).
The charts, however, can easily be seen to differ after flipping and tipping by placing normal and transposed charts on one sheet. This distortion is an artifact of rounding numbers in two directions on a logit scale.
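A small sketch shows how the artifact arises. Rounding to the nearest 1/4 logit before flipping the scale, versus after, can land a value in different chart bins, because the quarter-logit grid is anchored at zero while the flip-and-shift moves values off that grid. The starting value here is hypothetical; the measure mean is the 1.78 logits from the Nursing1 example:

```python
MEASURE_MEAN = 1.78  # logits, Nursing1 example

def round_to(value, step):
    """Round a value to the nearest multiple of step (e.g. 1/4 logit)."""
    return round(value / step) * step

x = 0.41  # a hypothetical transposed logit value

# Restore first, then round for the bar chart:
a = round_to(-x + MEASURE_MEAN, 0.25)
# Round for the transposed chart first, then restore:
b = -round_to(x, 0.25) + MEASURE_MEAN

print(a, b)  # the two chart positions differ: 1.25 vs 1.28
```

Rounding in one direction only (always after restoring) would keep the normal and transposed charts in agreement.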
With this in
mind, the Rasch rating scale method adds a neat feature. Instead of just
counting right marks (the traditional forced-choice, DUMB test), a test can be
designed to let students report what they actually know and trust to be of
value: what they understand and find useful at all levels of thinking. This is
the same as Knowledge and Judgment Scoring. The Fall8850a data ranked responses 0, 1, and 2 for wrong (guessing, poor
judgment in reporting what is known and trusted), omit (good judgment in accurately
reporting what is known and trusted), and right (reporting what is known and
trusted).
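The 0/1/2 ranking is easy to sketch. This is a minimal illustration of the scoring rule described for the Fall8850a data; the function name, answer key, and student marks are all hypothetical, not part of that data set:

```python
def score_response(marked, correct):
    """Return 0 for wrong, 1 for omit (good judgment), 2 for right."""
    if marked is None:  # omitted: accurate report of what is known and trusted
        return 1
    return 2 if marked == correct else 0

# One hypothetical student's marks against a 4-item key:
key = ["A", "C", "B", "D"]
marks = ["A", None, "D", "D"]  # right, omit, wrong, right

scores = [score_response(m, k) for m, k in zip(marks, key)]
print(scores, sum(scores))  # [2, 1, 0, 2] 5
```

Under this rule an omit outranks a wrong guess, which is the point: the score rewards judgment about what is known, not just lucky marks.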