vignettes/blog/2022-04-05-simulating-data-from-the-interaction-model.Rmd
2022-04-05-simulating-data-from-the-interaction-model.Rmd
In a recent post, Jesse has given a somewhat involved example of simulating Rasch / PCM data with dexter. In spite of the rather complex design involving planned missingness and adaptivity, the models at the heart of it are well-known. In this post, we show how to simulate from Haberman’s interaction model (Haberman (2007)), which is very interesting, quite useful in practice, and to our knowledge only available in dexter. And, as an Easter promotion, we give you two methods for the price of one: one based on rejection sampling (dexter’s ‘official’ version), and one based on sampling without rejection. But we start with a short discussion of this fascinating IRT model.
While not even mentioned at the career award session dedicated to Shelby Haberman at NCME 2019, his interaction model (IM) must be one of the brightest, practically most salient ideas in psychometrics since 2000. It plays a strategic role in dexter, and we have posted several times about it and will continue to do so in the future. The IM can be represented in several ways, each of them highlighting different aspects of interest and applicability:
In other posts, we have already commented on the usefulness of these properties in various practical situations. When devising a strategy to simulate from the model, the third aspect seems the most promising.
The interaction model can be expressed as a generalization of the Rasch model where item difficulty is a linear function of the test-score: where for any possible sum score .
As an illustration, let us simulate data from an interaction model and use these to estimate a Rasch model separately for persons with the same test score (provided there are enough of them):
The dots are the estimated Rasch item parameters for each score-group, the green line represents the model from which we simulated, and the gray curve is obtained when we fit an interaction model to the simulated data. Two things are worth noting. First, we need very large samples to make such plots. Second, when we do have large samples, we often observe a linear relationship between item difficulty and test score. For some reason, the interaction model seems to describe educational data quite well.
Anyway. We have simulated data from the interaction model but the original question was how to do exactly that. Simulating from the Rasch model is fairly trivial. Given a vector of item difficulties, delta, and a sample from the ability distribution, theta, the R function is a one-liner:
simRasch = function(theta, delta)
{
as.integer(rlogis(length(delta), 0, 1) <= (theta-delta))
}
To simulate responses from the interaction model, we can apply this at each distinct test score with score-specific item difficulties rather than simply . But this is not all, as the responses conditional on a given test-score are not independent. The two algorithms presented here differ in how they tackle this dependency: one samples item parameters directly while the other one relies on a simple rejection algorithm.
As mentioned above, the interaction model preserves the item
difficulties, the item-total correlations, and the score distribution,
so we enter the simulation with three inputs: delta
,
sigma
, and scoretab
. We sample from the
discrete score distribution, scoretab
, and, since the
interaction model maps sum scores to ability estimates, we attach those
from a look-up table easily produced with dexter. (For
those who prefer to sample from the ability distribution, we will show
in the Appendix how to derive the score distribution from that.)
We then go over all simulated ‘examinees’ and we simulate responses
from the Rasch model, with theta
and item difficulties both
matching the person’s score, until the simulated sum score equals the
true one. For zero and perfect scores, we simply take vectors of 0 or 1.
In R:
sim_IM = function(delta, sigma, scoretab)
{
nI = length(delta)
nP = sum(scoretab)
y = rep(0, nI)
x = matrix(0, nP, nI)
# nP samples from the score distribution:
s = sample(0:nI, nP, replace=TRUE, prob=scoretab)
# attach the MLE estimates of theta:
abl = IM_mle(delta,sigma)
for (p in 1:nP)
{
# compute item difficulties at score s[p]:
delta_s = delta - s[p]*sigma
repeat{
y = simRasch(theta=abl[s[p]+1], delta=delta_s)
if (sum(y)==s[p]) break
}
x[p,] = y
}
return(x)
}
The ability that matches a person’s score is the maximum likelihood estimate (MLE); i.e., the ability for which the expected score is equal to the observed one. In fact, any theta would do, eventually, but then we would have to wait much longer until the sum of the simulated response pattern matches the target sum score. For the aficionados, our makeMLE function is provided in the Appendix.
The dependence between responses conditional on the sum score can be approached by simulating them sequentially and adjusting the probability after each draw, much as we would take samples from an urn without replacement. An algorithm can be constructed based on the observation that at any sum score , the probability that item is answered correctly can be computed as where , is the vector of item parameters without the th entry, and the elementary symmetric function of of order . Note the use of an R-like notation to express dropping the -th element.
This may be a bit subtle, so we illustrate what happens with a small example. Suppose that we have a test of three items. If the test score is 0 or 3, there is nothing to do because there is only one possible response pattern in each case. Suppose , and let us start by simulating the response to the third item. This is a Bernoulli experiment with the probability of success given by with the ESF written out. If the experiment fails, then we are done because, with , the responses to the first two items must be correct. If it succeeds, we are left with a test of two items and a sum score of 1. To simulate the response to item 2, we perform another Bernoulli experiment with a probability of success Regardless of the outcome, the answer to the first item is now also determined because the sum score must be 2. The case of is handled in a similar fashion, and don’t forget to replace with because we are simulating from the interaction model, remember?
The computational burden with Algorithm B is in the calculation of the elementary symmetric functions. What is minimally required is a by matrix whose columns contain the elementary functions for . Since we need for all possible test-scores , we make a three way array : test-scores by number of test-scores by number of items. The entries of are:
where the first two indices and run from zero to , and the third index runs from to . The elementary symmetric functions are calculated using the sum-algorithm. Here is the necessary code in R:
elsymRM <-function(b)
{
n=length(b)
g=matrix(0,n+1)
g[1]=1; g[2]=b[1]
if (n>1)
{
for (j in 2:n)
{
for (s in (j+1):2) g[s]=g[s]+g[s-1]*b[j]
}
}
return(g)
}
makeG <-function(b_s)
{
nI=ncol(b_s)
G = array(0,c(nI+1,nI+1,nI))
for (s in 0:nI)
{
for (i in nI:1)
{
G[s+1,1:(i+1),i] = elsymRM(b_s[s+1,1:i])
}
}
return(G)
}
It is now straightforward to implement the simulation algorithm itself:
sim_IMB <-function(delta, sigma, scoretab)
{
nI = length(delta)
nP = sum(scoretab)
x = matrix(0, nP, nI)
r = sample(0:nI, nP, replace=TRUE, prob=scoretab)
b_r = matrix(0, nI+1, nI)
for (s in 0:nI) b_r[s+1,] = exp(-(delta - s*sigma))
# so far, as in Algorithm A, but now compute the
# matrix of elementary symmetric functions rather
# than the MLE of ability at each sum score
G = makeG(b_r)
for (p in 1:nP)
{
s = r[p]
for (i in nI:1)
{
if (s == 0) break
if (s == i)
{
x[p, 1:i] = 1
break
}
pi_s = (b_r[r[p]+1, i] * G[r[p]+1, s, i-1])/G[r[p]+1, s+1, i]
if (runif(1) <= pi_s)
{
x[p, i] = 1
s = s - 1
}
}
}
return(x)
}
Note that both and are pre-calculated outside of the loop over persons. Note further that we break as soon as we know all remaining responses.
Both algorithms are fast enough for practical purposes even though they are in plain R, not your fastest language. Additional speed can be gained by rewriting the critical parts in FORTRAN or C. Algorithm A turns out to be faster, which is worth mentioning because the idea of rejection sampling is inherently associated with some waste.
Note that the plots do not show times for more than a hundred items. Algorithm B depends on computing the elementary symmetric functions, the gentlest monster you have ever encountered, but still a monster in your backyard. We have used the sum-algorithm, which is perfectly reliable for tests of reasonable length, say up to 100 items. In the interaction model the item difficulties can become large quite rapidly such that, for longer tests, it may be compromised by numeric overflow. For example, with items and equal to :
delta = runif(150,-0.5,1.5)
sigma = runif(150,-0.5,0.5)
s = 145
b_r = exp(-(delta-s*sigma))
esf = elsymRM(b_r)
max(esf)
## [1] Inf
The ESF are indispensable when dealing with test data and psychometric models, they are ubiquitous in dexter, but programming them for production purposes requires extra care. Our second author has developed a special FORTRAN library for dealing with very large numbers, somewhat similar to R package Brobdingnag (Hankin (2007)) but better and, as it happens much too often with this author, unpublished. We will revisit the problem in the future.
Dexter has, since the writing of this blog, acquired a function
r_score_IM()
to simulate item scores from the interaction
model. This is a slightly faster implementation of algorithm A, which
also works for polytomous items.
In both algorithms, we sampled “persons” from a discrete distribution
of sum scores. This may seem unusual and, more importantly: where do we
get that? Readers may be more accustomed to sampling ability values from
a continuous distribution, with the normal distribution as a
particularly popular choice. Although there is no compelling reason to
believe that abilities must be normally distributed, it is not
implausible, and everyone knows the rnorm
function in R
(R Development Core Team (2005)).
To get samples from a discrete distribution of sum scores starting with samples from a continuous distribution and a set of item parameters, we may consider that, since we can sample from and subsequently sample from .
What is ? We find it if we write Thus, Conveniently, we have already computed the elementary symmetric functions and saved them in the matrix, , so:
How to find the MLE of ability under a Rasch model is explained in
our blog entry about the Warm estimator. Here we use the
uniroot()
function to find the root.
# MLE ability estimate for dichotomous interaction model
IM_mle = function(delta, sigma)
{
scores = 1:(length(delta)-1)
Es_x = function(s)
{
delta = delta - s*sigma
function(theta)
{
Pi = 1/(exp(delta - theta)+1)
sum(Pi) - s
}
}
c(-Inf,
sapply(scores, function(s) uniroot(Es_x(s), interval = c(-10,20),extendInt='yes')$root),
Inf)
}