44
Student ability and item difficulty logit locations remain
relatively stable during convergence when using data that are a good fit to the
requirements of the perfect Rasch IRT model. The data in the Fall8850a.txt file
require that the average logit item difficulty value be moved one logit, from -0.98
to 0, during convergence. The standard deviations for student ability and item
difficulty, 0.51 and 1.0, are also quite different.
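The one-logit move of the item difficulty mean can be pictured as a change of origin. The following is a minimal sketch of that single step, not of the Winsteps estimation procedure: the function name and the starting logits are hypothetical, chosen only so that the item mean starts at -0.98. Taken in isolation, a pure shift like this leaves every student-to-item distance unchanged.

```python
from statistics import mean


def center_item_difficulties(item_logits, student_logits):
    """Shift both sets of logits so the item difficulty mean lands at 0.

    Hypothetical helper for illustration only; it is not Winsteps code.
    """
    shift = mean(item_logits)
    centered_items = [d - shift for d in item_logits]
    centered_students = [b - shift for b in student_logits]
    return centered_items, centered_students


# Made-up starting values whose item mean is -0.98 (not taken from Fall8850a.txt).
items = [-1.98, -0.98, 0.02]
students = [-0.5, 0.0, 0.5]

centered_items, centered_students = center_item_difficulties(items, students)

print(round(mean(items), 2))           # item mean before the shift
print(round(mean(centered_items), 2))  # item mean after the shift
```

Because both lists move by the same amount, any change in relative location seen during convergence has to come from something other than the re-centering itself.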
The relative locations of individual student abilities and
item difficulties vary during the process of convergence on the logit scale,
both from the two factors above and from the culling of data that “do not look right”.
Individual changes in the relative location of student ability and item difficulty
can be viewed by re-plotting the bubble chart data shown in the previous post.
The rating scale analysis groups all students with the same score and all items with the same difficulty. The end result is a set of nearly parallel lines connecting the starting and ending convergence location of a student or an item. (Closely spaced locations have been omitted for clarity.)
Culling outliers resulted in the loss of values among the
less able students and the more difficult items.
This increased the student ability mean and decreased the item difficulty mean. Culling increased the spread of both distributions toward higher values.
The partial credit analysis groups all students with the
same score but treats item difficulty individually. (More locations have been
omitted for clarity.) Four of the plotted starting item difficulty locations
land at more than one ending convergence location. Culling partial credit
outliers had the same effects as culling rating scale outliers (above) with respect
to where the culling occurred, the migration of means, and the direction of
distribution spread. (More item difficulty locations were omitted for clarity.)
The item difficulty mean migrated to the zero logit, 50% normal, location in all four analyses: full rating scale, culled rating scale, full partial credit, and culled partial credit. Winsteps performed as advertised for psychometricians.
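One way to read the link between zero logits and 50% is through the dichotomous Rasch model, where a student whose ability equals an item's difficulty has an even chance of a right mark. The sketch below assumes that reading; the function name is hypothetical and this is the textbook formula, not output from Winsteps.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a right mark under the dichotomous Rasch model.

    Both arguments are in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Ability equal to difficulty: a zero-logit difference gives an even chance.
print(rasch_probability(0.0, 0.0))

# An ability one logit above the item difficulty raises the chance above 50%.
print(round(rasch_probability(1.0, 0.0), 3))
```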
The individual relative locations for student ability and
item difficulty differ in all four analyses. Two items that survived my
culling and omitting have the same starting location but very different ending
locations: Item 13 and Item 41. Both are well within the -2 to +2 logit
range on the Winsteps bubble charts (item response theory – IRT data).
PUP lists them as
the two most difficult items on the test (classical test theory – CTT data).
PUP lists Item 13 as unfinished, with 15 out of 50 students marking, of whom
only 5 marked correctly. There is a serious problem here in instruction,
learning, and/or the item itself. Item 41 was ranked as negatively discriminating
(four of the more able students in the class marked incorrectly). Only 5
students marked Item 41 and none were correct. The class was well aware that it
did not know how to deal with this item. Both items were labeled as guessing.
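The counts quoted for Item 13 and Item 41 can be turned into simple rates. This is a rough sketch, not PUP's own calculation: the function and key names are hypothetical, and it assumes the class size of 50 quoted for Item 13 also applies to Item 41.

```python
def item_summary(class_size, marked, correct):
    """Summarize one item from raw counts (illustrative, not PUP output)."""
    return {
        # Share of the class willing to mark the item at all.
        "marked_rate": marked / class_size,
        # Share of those marks that were right; None if nobody marked.
        "correct_of_marked": correct / marked if marked else None,
        # Share of the whole class that marked the item right.
        "correct_of_class": correct / class_size,
    }


# Counts as quoted in the post; class size of 50 is assumed for Item 41.
item_13 = item_summary(class_size=50, marked=15, correct=5)
item_41 = item_summary(class_size=50, marked=5, correct=0)

for name, summary in (("Item 13", item_13), ("Item 41", item_41)):
    print(name, summary)
```

Separating the share who marked from the share who marked right keeps the two pieces of information apart: how many students felt able to answer, and how well those students did.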
IRT and CTT present two different views of student and item
performance. The classroom-friendly CTT charts produced by PUP require no interpretation for students
and teachers to use directly in class and when advising.