Computer Adaptive Testing

So you know what the GMAT is all about, but you’re unsure exactly how answering all of those questions results in a final score that could make or break your chances for admission into business school. In this and following entries we will break down for you the system behind your score and how the test is administered to obtain that score.

Item Response Theory

Item Response Theory (IRT) is the system used by computer-adaptive tests such as the GMAT CAT to determine which question is the “best” next question, based on the ability level the test-taker has demonstrated so far. It is a statistical model that relates the probability that a test-taker answers a problem correctly to characteristics of the problem and to the test-taker’s true ability. It was first introduced in 1968.

The IRT model states that the probability of a correct response to item i for test-taker X is a function of ai, bi, and ci and test-taker X’s true ability. A person’s estimated true score is denoted as theta (θ). True score is the score a test-taker would receive on a perfectly reliable test. Since every test unavoidably contains some error, true scores are a theoretical concept; in an actual testing program, we will never know an individual’s true score. However, we can compute an estimate of a test-taker’s true score, and we can estimate the amount of error in that estimate.

$$P(u_i = 1 \mid \theta_X, a_i, b_i, c_i) = c_i + \frac{1 - c_i}{1 + \exp\left[-1.7\, a_i (\theta_X - b_i)\right]}$$

The model typically involves three parameters:

ai defines the ability of the item to discriminate between individual test-takers,

bi is the difficulty of the item, and

ci is the probability that the test-taker would get the question right solely by guessing.
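To make the formula concrete, here is a minimal Python sketch of the three-parameter model as it is written above. The function name p_correct and the item values used in the example are illustrative assumptions, not actual GMAT parameters.

```python
import math

def p_correct(theta, a, b, c):
    """Probability of a correct response under the three-parameter model.

    theta: test-taker ability
    a:     discrimination of the item
    b:     difficulty of the item
    c:     probability of answering correctly by guessing alone
    """
    # 1.7 is the scaling constant that appears in the formula above.
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

# Hypothetical item: a = 1.0, b = 0.0, c = 0.25.
# A test-taker whose ability equals the item's difficulty (theta = b):
print(p_correct(0.0, 1.0, 0.0, 0.25))
```

When theta equals bi, the exponential term is exp(0) = 1, so the probability is ci + (1 - ci) / 2, halfway between the guessing level and 1.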

On the GMAT, this model is used to determine your final score, i.e., where you stand on the ability scale, or what your theta is. For example, in the graph below, the horizontal axis is the ability scale, ranging from very low (-3.0) to very high (+3.0). When ability follows the normal curve, about 68% of the test-takers will have ability between -1.0 and +1.0; about 95% will be between -2.0 and +2.0. The vertical axis is the probability of responding correctly to this item.
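As a rough illustration of how a position on the ability scale can be estimated from a set of answers, the sketch below picks the theta that makes the observed right/wrong pattern most likely under the three-parameter model. This is a generic maximum-likelihood grid search offered under stated assumptions; it is not the GMAT's actual scoring procedure, and the item parameters, the response pattern, and the function names are all hypothetical.

```python
import math

def p_correct(theta, a, b, c):
    # Three-parameter model: guessing floor c plus a logistic rise.
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

def log_likelihood(theta, items, responses):
    # Sum of log-probabilities of each observed response (1 = right, 0 = wrong).
    total = 0.0
    for (a, b, c), u in zip(items, responses):
        p = p_correct(theta, a, b, c)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def estimate_theta(items, responses, lo=-3.0, hi=3.0, steps=601):
    # Evaluate the likelihood on an evenly spaced grid over the ability scale
    # and return the grid point where it is largest.
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

# Hypothetical items as (a, b, c) triples, ordered from easy to hard.
items = [(1.0, -1.0, 0.25), (1.2, 0.0, 0.25), (0.8, 0.5, 0.25),
         (1.0, 1.0, 0.25), (1.5, 2.0, 0.25)]
responses = [1, 1, 1, 0, 0]  # right on the three easier items, wrong on the two hardest

print(estimate_theta(items, responses))
```

A test-taker who answers every item correctly (or every item incorrectly) pushes this estimate to the edge of the grid, which is one reason an estimate based on only a few items carries a large amount of error.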

 

The ai parameter defines the slope of the curve at its inflection point. The curve would be flatter with a lower value of ai and steeper with a higher value. Thus ai denotes how well the item is able to discriminate between test-takers of slightly different ability (within a narrow effective range).

The bi parameter defines the location of the curve’s inflection point along the theta scale. Lower values of bi shift the curve to the left; higher values shift it to the right. The value of bi does not affect the shape of the curve.
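The roles of ai and bi can be checked with a short derivation, assuming the three-parameter logistic form of the model given earlier. Evaluating the curve and its derivative at theta = bi gives the height and slope at the inflection point:

```latex
\begin{align*}
P_i(\theta) &= c_i + \frac{1 - c_i}{1 + e^{-1.7\, a_i (\theta - b_i)}} \\
P_i(b_i) &= c_i + \frac{1 - c_i}{2} = \frac{1 + c_i}{2} \\
\left.\frac{dP_i}{d\theta}\right|_{\theta = b_i} &= \frac{1.7\, a_i\, (1 - c_i)}{4}
\end{align*}
```

The slope at the inflection point is proportional to ai, while bi appears only through the difference (theta - bi), which is why changing bi slides the curve left or right without changing its shape.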

The lower asymptote is at ci = .25. (An asymptote is a line that a curve approaches more and more closely without ever reaching it.) This is the probability of a correct response for test-takers with very little ability (e.g. θ = -2.0 or -2.6). The curve has an upper asymptote at 1.0; high-ability test-takers are very likely to respond correctly.
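The two asymptotes can be seen numerically with a small sketch. The guessing value ci = .25 comes from the text; the values ai = 1.0 and bi = 0.0 are assumptions chosen only for illustration, since the parameters of the plotted item are not stated.

```python
import math

def p_correct(theta, a, b, c):
    # Three-parameter model: guessing floor c plus a logistic rise toward 1.
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

A, B = 1.0, 0.0   # assumed discrimination and difficulty (hypothetical)
C = 0.25          # lower asymptote from the text

for theta in (-2.6, -2.0, 0.0, 2.0, 2.6):
    p = p_correct(theta, A, B, C)
    # Show how far the curve sits above the floor and below the ceiling.
    print(f"theta = {theta:+.1f}  P = {p:.4f}  "
          f"above floor = {p - C:.4f}  below ceiling = {1.0 - p:.4f}")
```

At the low end the probability sits just above .25 and at the high end just below 1.0, but it never reaches either value.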

We will continue with our analysis on the GMAT CAT scoring system tomorrow.

Posted on November 11, 2007 by Manhattan Review

This entry was posted in Admissions, GMAT.