Thursday, April 7, 2011

PCRM - Item Discrimination

                                                              Chapter 23

Item discrimination identifies the items that separate a group of students who know and can do from a group that cannot. The Rasch model identifies the estimated measure at which students with an ability that matches the item difficulty will make a right mark 50% of the time. Item discrimination is not a part of the partial credit Rasch model (PCRM); however, Winsteps and PUP both print out the point biserial r (pbr), which estimates item discrimination.
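The two ideas in the paragraph above can be sketched in a few lines of Python. This is only an illustration, not the Winsteps or PUP implementation: the function names are made up for this sketch, the Rasch function uses the simple right/wrong (dichotomous) form rather than the full partial credit model, and the pbr here correlates an item with the total score including that item, which may differ from what either program prints.

```python
import math
from statistics import mean, pstdev


def rasch_p_right(ability, difficulty):
    """Dichotomous Rasch model: chance of a right mark, both values in logits."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def point_biserial(item_marks, total_scores):
    """Point biserial r between one item (1 = right, 0 = not right) and total scores.

    Assumption: the total score includes the item itself.
    """
    right = [s for m, s in zip(item_marks, total_scores) if m == 1]
    other = [s for m, s in zip(item_marks, total_scores) if m == 0]
    sd = pstdev(total_scores)  # population standard deviation of total scores
    if not right or not other or sd == 0:
        return float("nan")  # no variation, so no discrimination estimate
    p = len(right) / len(item_marks)  # proportion of students marking right
    return (mean(right) - mean(other)) / sd * math.sqrt(p * (1 - p))


# Hypothetical data: six students, one item, and their total test scores.
marks = [1, 1, 1, 0, 0, 0]
totals = [9, 8, 7, 4, 3, 2]
print(rasch_p_right(1.2, 1.2))       # ability matches difficulty
print(point_biserial(marks, totals))  # high value: item separates the two groups
```

When ability equals difficulty, the exponent is zero and the probability is one half, which is the 50% point described above. An item marked right mainly by high scoring students yields a large positive pbr.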

About 10 discriminating items are needed in a classroom test to produce a good range of scores with which to set grades. The two halves of the biology fall 88 test (Part 1&2 and Part 3&4) show 11 and 16 discriminating items, respectively, in PUP Table 7. All 11 discriminating items in Part 1&2 are found among the 16 in Part 3&4 (average pbr of 0.29 and 0.33, and average alpha of 0.62 and 0.77, respectively). A test composed of 50 items with this discrimination ability is expected to have an average alpha of 0.92. This puts it in the reliability range of standardized tests. A practical standardized test uses fewer items with more discrimination ability.
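The post does not say how the 0.92 projection was obtained. One common way to project reliability for a longer test made of similar items is the Spearman-Brown prophecy formula, sketched here as an assumption rather than as the method actually used. The length factors below are hypothetical, because the number of items in each half of the test is not given.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor
    with items of similar quality (Spearman-Brown prophecy formula)."""
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


# Hypothetical length factors applied to the Part 3&4 alpha of 0.77.
for factor in (2, 3, 4):
    print(factor, round(spearman_brown(0.77, factor), 2))
```

The pattern matters more than the exact numbers: adding items of the same quality raises reliability, with smaller gains for each further increase in length.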

Dropping down from averages of groups of items and students to individual items and students restricts the validity of PUP Table 3a printouts to descriptive statistics for each test. (The Rasch model printouts from Ministep for individual estimated person and item measures are valid predictions as well as descriptions.) What needs to be re-taught and what can students do to correct their errors?

A teacher can mix students who marked discriminating items correctly with a set of students who did not know, or marked wrong, to sort out their errors. This is in contrast to an unfinished item. Here is a problem in instruction, learning, and/or assessment. Here the teacher must take the lead. These are the only items I reviewed in a class that promoted the use of higher levels of thinking by way of Knowledge and Judgment Scoring.

End-of-course standardized tests scored at the lowest levels of thinking (only counting right marks) have only one valid use: ranking. There is no way for current students to benefit from the testing. New designs for 2011 will use “through-course” assessment. Even in low-level-of-thinking environments, there is time for meaningful corrections at the individual student and classroom levels. One plan (August 2010) spaces parts of the test evenly through the course; the other spaces parts over the last 12 weeks of the course.

Neither plan replaces the good teaching practice of periodic assessment in such detail that students cannot fall so far behind that they cannot catch up with the class. Self-correcting students find the student counseling matrixes helpful. Most of these biology students were functioning at or above an 80% right high-quality score by the end of the semester.

