Knowledge space theory (KST) is a mathematical theory, developed by Doignon and Falmagne (1985, 1999) and Falmagne and Doignon (2011), for the assessment of individual knowledge. The aim of a KST assessment is to find, with maximum accuracy and efficiency, the set of problems that a student masters in a specific knowledge domain (the student’s knowledge state). In real contexts, some of the student’s answers could be affected by noise, like careless errors due to time pressure or turmoil, or correct answers resulting from guessing. Because of the noise, the student’s knowledge state is not directly observable, and it has to be inferred from the student’s responses to the problems. To provide realistic predictions of students’ responses, probabilistic models have to be considered. The first and most widely used probabilistic model developed in KST is the so-called basic local independence model (BLIM; Falmagne & Doignon, 1988a, 1988b).
Knowledge about the properties of this model has grown over the years. For example, methods for estimating its parameters (Heller & Wickelmaier, 2013; Schrepp, 2005; Stefanutti & Robusto, 2009) and for testing its identifiability (Spoto, Stefanutti, & Vidotto, 2012; Stefanutti, Heller, Anselmi, & Robusto, 2012) are available. Furthermore, extensions of the BLIM have been developed and applied to real data, such as the Gain-Loss Model (Robusto, Stefanutti, & Anselmi, 2010; de Chiusole, Anselmi, Stefanutti, & Robusto, 2013), a model for assessing learning processes, and a probabilistic model for skill dependence (de Chiusole & Stefanutti, 2013).
The focus of this article is on one of the properties of the BLIM, that is, the parameter invariance assumption. In de Chiusole, Stefanutti, Anselmi, and Robusto (2013) it is shown that, even when the invariance assumption is violated by the data, the goodness of fit of the BLIM might be acceptable. For this reason, having a method that reveals invariance violations becomes essential. The method proposed by the authors consists in comparing the BLIM with other models, called bipartition models (BPMs), in which the invariance assumption is explicitly violated. If the comparison favors a BPM, then the parameter invariance assumption is violated, meaning that the BLIM is not adequate for those data.
In the same article, another method for detecting this type of violation was considered. The method was inspired by the IRT approach (Andersen, 1973; Glas & Verhelst, 1995), and consists in partitioning the observed data set into two or more independent groups, fitting the BLIM in each of the groups, and applying some suitable statistical test to evaluate the difference between the parameter estimates of the groups. In the simplest case, two groups are formed by separating all the subjects with score levels below the median from those with score levels above the median. If the test is significant, then the conclusion would be that the parameter invariance assumption is violated.
Although this method seems to be the most natural way to detect violations, it has been formally proven that, with the BLIM, it does not work properly. The main concern is that the error parameter estimates in the two groups will differ significantly from one another even when the invariance assumption is, indeed, respected. For this reason, in the sequel we refer to this method as the naïve test of invariance.
A procedure similar to the one described above was developed by de la Torre and Lee (2010) in the area of cognitive diagnostic models, for detecting violations of the parameter invariance of the DINA (deterministic inputs, noisy AND-gate) model (Junker & Sijtsma, 2001). Instead of forming two ‘pure’ groups below and above the median, they produced two data sets that were mixtures of the two pure groups. The first group collected about 60% of the respondents below the median and about 40% of those above it; the second group was constructed by reversing these proportions. By applying that procedure to a data set on fraction subtraction, de la Torre and Lee concluded that the parameter invariance of the DINA model may not hold in real data. It is worth noting that, at the level of performance, the DINA model is equivalent to the BLIM (Heller, Stefanutti, Anselmi, & Robusto, 2014).
The aim of this article is to generalize the theoretical results concerning the inadequacy of the naïve test to any choice of the proportion p used to form the two groups. After presenting the BLIM, naïve tests of invariance are presented, along with theoretical results showing that the general version of the test also suffers from the same problems. The theoretical results are illustrated through a simulation study and an empirical application.
The BLIM and Parameter Invariance
The aim of a KST assessment is to uncover the knowledge state that characterizes a student, on the basis of her responses to a given set Q of problems. The collection of responses is called the response pattern, and it is represented by the subset R ⊆ Q of all problems that received a correct response. For a KST assessment, a deterministic model on the problems q ∈ Q, called a knowledge structure, is required, along with a probabilistic model such as the BLIM. A knowledge structure is defined as a pair (Q, K), in which Q is a collection of problems and K is a collection of subsets of Q, called knowledge states. The BLIM is defined as a quadruple (Q, K, π, r), in which:
a. (Q, K) is a knowledge structure on a finite set Q;

b. π is a probability distribution on K;

c. r is the response function: for every R ⊆ Q and every K ∈ K, it is defined by

$$r(R, K) = \prod_{q \in K \setminus R} \beta_q \prod_{q \in K \cap R} (1 - \beta_q) \prod_{q \in R \setminus K} \eta_q \prod_{q \in Q \setminus (K \cup R)} (1 - \eta_q) \qquad (1)$$

where $\beta_q, \eta_q \in (0, 1]$ are two item-specific parameters, respectively called the careless error probability and the lucky guess probability of item $q$.
The probability of sampling a student whose response pattern is R ⊆ Q is
$$P(R) = \sum_{K \in \mathbf{K}} r(R, K)\, \pi_K \qquad (2)$$

where $\pi_K$ denotes the probability of knowledge state $K$.
It has to be pointed out that the parameters $\beta_q$ and $\eta_q$ are attached to the items and do not vary with the knowledge states of the students; in other words, they are invariant across individuals. We will refer to this property as the invariance assumption.
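To make the model concrete, the following minimal Python sketch evaluates Equations (1) and (2) on a small example. The three-item structure, the uniform distribution on the states, and the parameter values are illustrative assumptions, not taken from the article.

```python
# Minimal sketch of the BLIM (Equations 1 and 2) on an illustrative example.
from itertools import chain, combinations

Q = frozenset({0, 1, 2})                                      # items
states = [frozenset(), frozenset({0}), frozenset({0, 1}), Q]  # knowledge states K
pi = {k: 1.0 / len(states) for k in states}                   # distribution on K
beta = {0: 0.10, 1: 0.15, 2: 0.05}                            # careless error probabilities
eta = {0: 0.20, 1: 0.10, 2: 0.25}                             # lucky guess probabilities

def r(R, k):
    """Equation (1): probability of response pattern R given knowledge state k."""
    p = 1.0
    for q in Q:
        if q in k:                                  # mastered item: fail only by careless error
            p *= (1.0 - beta[q]) if q in R else beta[q]
        else:                                       # non-mastered item: solve only by lucky guess
            p *= eta[q] if q in R else (1.0 - eta[q])
    return p

def p_pattern(R):
    """Equation (2): marginal probability of observing response pattern R."""
    return sum(r(R, k) * pi[k] for k in states)

def all_patterns(items):
    """All 2^|Q| response patterns, as frozensets of correctly answered items."""
    items = sorted(items)
    return [frozenset(s) for s in
            chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))]

print(sum(p_pattern(R) for R in all_patterns(Q)))   # the marginal probabilities sum to ~1.0
```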
Before getting into the question of how this assumption can be tested, it is worth providing some basic notation. The set $\mathbf{K}_q = \{K \in \mathbf{K} : q \in K\}$ collects all knowledge states containing a given item $q \in Q$, and $\bar{\mathbf{K}}_q = \{K \in \mathbf{K} : q \notin K\}$ is its complement in $\mathbf{K}$. Similarly, for $\mathbf{R} = 2^Q$, let $\mathbf{R}_q = \{R \subseteq Q : q \in R\}$ be the set of all response patterns containing $q$, and $\bar{\mathbf{R}}_q = \{R \subseteq Q : q \notin R\}$ be its complement in $\mathbf{R}$. Finally, for any $\mathbf{F} \subseteq \mathbf{R}$ and any $\mathbf{J} \subseteq \mathbf{K}$, let

$$P(\mathbf{F}, \mathbf{J}) = \sum_{R \in \mathbf{F}} \sum_{K \in \mathbf{J}} r(R, K)\, \pi_K$$

be the joint probability of $\mathbf{F}$ and $\mathbf{J}$.
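As a small illustration of this notation, the sketch below defines the index sets and the joint probability $P(\mathbf{F}, \mathbf{J})$ as generic helpers. The function names are hypothetical; r and pi stand for any BLIM response function and state distribution, such as those of the previous listing.

```python
# Sketch of the sets K_q and R_q and of the joint probability P(F, J).
def joint_probability(F, J, r, pi):
    """P(F, J): sum of r(R, K) * pi[K] over all patterns R in F and states K in J."""
    return sum(r(R, K) * pi[K] for R in F for K in J)

def states_with(q, states):         # K_q: knowledge states containing item q
    return [K for K in states if q in K]

def states_without(q, states):      # complement of K_q in K
    return [K for K in states if q not in K]

def patterns_with(q, patterns):     # R_q: response patterns containing item q
    return [R for R in patterns if q in R]

def patterns_without(q, patterns):  # complement of R_q in R
    return [R for R in patterns if q not in R]
```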
Naïve Tests of Invariance: Restricted Case
A way to assess violations of the parameter invariance assumption of the BLIM would be to partition the whole data set into two independent groups (say, Group 1 and Group 2), fit the BLIM in each of them, and apply some suitable statistical test of the difference between the parameter estimates in the two groups. If the test is significant, then the conclusion would be that parameter invariance is violated by the data.
As a criterion to form the two groups, consider the one that consists of choosing a certain quantile c > 0 (e.g., the median) of the sample distribution of the size of the response patterns. Patterns whose size is less than or equal to c are assigned to Group 1, and those whose size is greater than c are assigned to Group 2. With these two groups, a test of invariance would not work properly, since parameter estimates would be biased in both groups, even when the invariance assumption is, indeed, respected. Because of this bias, the statistical test would lead to a rejection of the parameter invariance assumption too often.
To see this, consider some cutoff $c \in \{0, 1, \ldots, |Q| - 1\}$, and let $\mathbf{R}^{\downarrow} = \{R \in \mathbf{R} : |R| \le c\}$ be the collection of all response patterns whose size is less than or equal to $c$, and $\mathbf{R}^{\uparrow} = \{R \in \mathbf{R} : |R| > c\}$ be the collection of all response patterns whose size is greater than $c$. Then, according to the BLIM, the conditional probability that, in a randomly sampled response pattern $R$, an item $q$ is failed by careless error, given that the size of $R$ is below the cutoff, is

$$\beta_q^{\downarrow} = \frac{P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q)}{P(\mathbf{R}^{\downarrow}, \mathbf{K}_q)} \qquad (3)$$

whereas the conditional probability that, in a randomly sampled response pattern $R$, item $q$ is solved by lucky guess, given that the size of $R$ is below the cutoff, is

$$\eta_q^{\downarrow} = \frac{P(\mathbf{R}_q \cap \mathbf{R}^{\downarrow}, \bar{\mathbf{K}}_q)}{P(\mathbf{R}^{\downarrow}, \bar{\mathbf{K}}_q)} \qquad (4)$$

Similar equations are obtained for the $\beta_q^{\uparrow}$ and $\eta_q^{\uparrow}$ parameters, by replacing $\mathbf{R}^{\downarrow}$ with $\mathbf{R}^{\uparrow}$ in Equations 3 and 4.

In de Chiusole, Stefanutti, Anselmi, and Robusto (2013), it is shown that, for any choice of the cutoff $c$, any item $q \in Q$, and $\beta_q, \eta_q \in (0, 1)$, the following inequalities hold true: $\beta_q^{\uparrow} < \beta_q < \beta_q^{\downarrow}$ and $\eta_q^{\downarrow} < \eta_q < \eta_q^{\uparrow}$.
These two inequalities show that careless errors are more likely when one samples below the cutoff, whereas lucky guesses are more likely when one samples above the cutoff. Thus, when the parameters of the BLIM are estimated from only one part of the data set (below or above the cutoff), one obtains biased parameter estimates.
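The following self-contained sketch illustrates Equations (3) and (4) and the two inequalities numerically. The four-item chain structure, the cutoff, and the parameter values are assumptions chosen only to display the qualitative pattern.

```python
# Numerical illustration of Equations (3)-(4) on a toy BLIM (all values are assumptions).
from itertools import chain, combinations

Q = frozenset({0, 1, 2, 3})
states = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2}), Q]
pi = {k: 1.0 / len(states) for k in states}
beta = {q: 0.10 for q in Q}
eta = {q: 0.15 for q in Q}

def r(R, k):                                         # Equation (1)
    p = 1.0
    for q in Q:
        if q in k:
            p *= (1.0 - beta[q]) if q in R else beta[q]
        else:
            p *= eta[q] if q in R else (1.0 - eta[q])
    return p

def P(F, J):                                         # joint probability of F and J
    return sum(r(R, k) * pi[k] for R in F for k in J)

patterns = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(Q), n) for n in range(len(Q) + 1))]

c = 2                                                # cutoff on the number of correct responses
R_down = [R for R in patterns if len(R) <= c]
R_up = [R for R in patterns if len(R) > c]

q = 1
K_q = [k for k in states if q in k]
K_not_q = [k for k in states if q not in k]

beta_down = P([R for R in R_down if q not in R], K_q) / P(R_down, K_q)      # Equation (3)
beta_up = P([R for R in R_up if q not in R], K_q) / P(R_up, K_q)
eta_down = P([R for R in R_down if q in R], K_not_q) / P(R_down, K_not_q)   # Equation (4)
eta_up = P([R for R in R_up if q in R], K_not_q) / P(R_up, K_not_q)

print(beta_up < beta[q] < beta_down)    # expected True: careless errors inflated below the cutoff
print(eta_down < eta[q] < eta_up)       # expected True: lucky guesses inflated above the cutoff
```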
Naïve Tests of Invariance: General Case
It might be argued that the rule of forming the two groups by assigning all patterns below a certain cutoff to one group, and all the remaining ones to the other group, is too strong. Perhaps there exist weaker rules that reduce or even remove the bias.
The following, more general, rule is considered here: given some proportion $p^{\downarrow}$, with $0 \le p^{\downarrow} \le 1$, a sufficiently large number $n^{\downarrow}$ of response patterns are randomly sampled with replacement from $\mathbf{R}^{\downarrow}$, and each of them is assigned to Group $\mathbf{G}_1$ with probability $p^{\downarrow}$, and to Group $\mathbf{G}_2$ with probability $1 - p^{\downarrow}$. Analogously, given a proportion $p^{\uparrow}$, with $0 \le p^{\uparrow} \le 1$, $n^{\uparrow}$ response patterns are randomly sampled with replacement from $\mathbf{R}^{\uparrow}$, and each of them is assigned to $\mathbf{G}_1$ with probability $p^{\uparrow}$, and to $\mathbf{G}_2$ with probability $1 - p^{\uparrow}$.
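A minimal sketch of this group-forming rule is the following; the function name, the sample sizes, and the data are illustrative assumptions.

```python
# Sketch of the general rule: patterns resampled from below the cutoff go to G1 with
# probability p_down, patterns resampled from above it go to G1 with probability p_up;
# everything else goes to G2.
import random

def form_groups(patterns, c, p_down, p_up, n_down, n_up, seed=0):
    rng = random.Random(seed)
    below = [R for R in patterns if len(R) <= c]     # observed patterns in R_down
    above = [R for R in patterns if len(R) > c]      # observed patterns in R_up
    g1, g2 = [], []
    for _ in range(n_down):                          # sample with replacement from below
        R = rng.choice(below)
        (g1 if rng.random() < p_down else g2).append(R)
    for _ in range(n_up):                            # sample with replacement from above
        R = rng.choice(above)
        (g1 if rng.random() < p_up else g2).append(R)
    return g1, g2

# Example with p_up = 1 - p_down, the special case considered in Proposition 2 below.
data = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2})]
G1, G2 = form_groups(data, c=1, p_down=0.7, p_up=0.3, n_down=500, n_up=500)
print(len(G1), len(G2))
```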
As the following proposition shows, even the more general rule suffers from the same problem as the naïve test of invariance.
Proposition 1
Let $\beta_q^{(1)}$ be the probability that, in a randomly sampled response pattern, an item $q$ is failed by careless error, given that the pattern belongs to Group $\mathbf{G}_1$, and let $\eta_q^{(1)}$ be the probability that a lucky guess occurs for $q$, given that the pattern belongs to $\mathbf{G}_1$. Then $\beta_q^{(1)} \le \beta_q$ if and only if

$$\frac{p^{\downarrow}}{p^{\uparrow}} \le \frac{\beta_q P(\mathbf{R}^{\uparrow}, \mathbf{K}_q) - P(\bar{\mathbf{R}}_q^{\uparrow}, \mathbf{K}_q)}{P(\bar{\mathbf{R}}_q^{\downarrow}, \mathbf{K}_q) - \beta_q P(\mathbf{R}^{\downarrow}, \mathbf{K}_q)} \qquad (5)$$

where $\bar{\mathbf{R}}_q^{\downarrow} = \bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}$ and $\bar{\mathbf{R}}_q^{\uparrow} = \bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow}$. Moreover, $\eta_q^{(1)} \le \eta_q$ if and only if

$$\frac{p^{\downarrow}}{p^{\uparrow}} \ge \frac{\eta_q P(\mathbf{R}^{\uparrow}, \bar{\mathbf{K}}_q) - P(\mathbf{R}_q^{\uparrow}, \bar{\mathbf{K}}_q)}{P(\mathbf{R}_q^{\downarrow}, \bar{\mathbf{K}}_q) - \eta_q P(\mathbf{R}^{\downarrow}, \bar{\mathbf{K}}_q)} \qquad (6)$$

where $\mathbf{R}_q^{\downarrow} = \mathbf{R}_q \cap \mathbf{R}^{\downarrow}$ and $\mathbf{R}_q^{\uparrow} = \mathbf{R}_q \cap \mathbf{R}^{\uparrow}$.
What Proposition 1 essentially says is that the method consisting in partitioning the sample into two subgroups by using arbitrary proportions $p^{\uparrow}$ and $p^{\downarrow}$ will lead to biased estimates of the $\beta_q$ and $\eta_q$ parameters, even when the invariance assumption is indeed respected by the data. Depending on the ratio $p^{\downarrow}/p^{\uparrow}$ that one chooses, the $\beta_q$ and $\eta_q$ probabilities might be either over- or underestimated in both groups $\mathbf{G}_1$ and $\mathbf{G}_2$. For this reason, it is recommended not to use methods like the one described in this section for testing parameter invariance of the BLIM.
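The direction of the bias can be checked numerically. The sketch below, again on a purely illustrative structure with assumed parameter values, computes the conditional careless error probability in $\mathbf{G}_1$ directly from its definition and reports whether $\beta_q$ is under- or overestimated for a few choices of $p^{\downarrow}$ and $p^{\uparrow}$.

```python
# Numerical check of Proposition 1 on a toy BLIM (structure and parameters are assumptions).
from itertools import chain, combinations

Q = frozenset({0, 1, 2, 3})
states = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2}), Q]
pi = {k: 1.0 / len(states) for k in states}
beta = {q: 0.10 for q in Q}
eta = {q: 0.15 for q in Q}
patterns = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(Q), n) for n in range(len(Q) + 1))]

def r(R, k):                                     # Equation (1)
    p = 1.0
    for q in Q:
        if q in k:
            p *= (1.0 - beta[q]) if q in R else beta[q]
        else:
            p *= eta[q] if q in R else (1.0 - eta[q])
    return p

def beta_group1(q, c, p_down, p_up):
    """P(q failed by careless error | pattern assigned to G1) under the BLIM."""
    num = den = 0.0
    for R in patterns:
        w = p_down if len(R) <= c else p_up      # probability that R is assigned to G1
        for k in states:
            if q in k:                           # careless errors require q to be mastered
                den += w * r(R, k) * pi[k]
                if q not in R:
                    num += w * r(R, k) * pi[k]
    return num / den

q, c = 1, 2
for p_down, p_up in [(0.2, 0.8), (0.5, 0.5), (0.8, 0.2)]:
    b1 = beta_group1(q, c, p_down, p_up)
    label = ("about unbiased" if abs(b1 - beta[q]) < 1e-9 else
             "underestimated" if b1 < beta[q] else "overestimated")
    print(p_down, p_up, round(b1, 4), label)
```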
Proof of Proposition 1
Suppose that the probabilities of the response patterns in a population are given by Equation (2). The probability that, in a randomly sampled response pattern, an item $q$ is failed by careless error, given that the pattern belongs to group $\mathbf{G}_1$, is

$$\beta_q^{(1)} = \frac{P(\bar{\mathbf{R}}_q \cap \mathbf{G}_1, \mathbf{K}_q)}{P(\mathbf{G}_1, \mathbf{K}_q)} \qquad (7)$$

The numerator of the right hand side of Equation (7) can be written as

$$P(\bar{\mathbf{R}}_q \cap \mathbf{G}_1, \mathbf{K}_q) = P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow} \cap \mathbf{G}_1, \mathbf{K}_q) + P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow} \cap \mathbf{G}_1, \mathbf{K}_q),$$

and, by applying the concatenation rule of conditional probabilities,

$$P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow} \cap \mathbf{G}_1, \mathbf{K}_q) = P(\mathbf{G}_1 \mid \bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q)\, P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q).$$

Moreover, given $\mathbf{R}^{\downarrow}$, $\mathbf{G}_1$ is independent of both $\bar{\mathbf{R}}_q$ and $\mathbf{K}_q$, hence

$$P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow} \cap \mathbf{G}_1, \mathbf{K}_q) = P(\mathbf{G}_1 \mid \mathbf{R}^{\downarrow})\, P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q) = p^{\downarrow} P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q).$$

Similarly, we have

$$P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow} \cap \mathbf{G}_1, \mathbf{K}_q) = P(\mathbf{G}_1 \mid \mathbf{R}^{\uparrow})\, P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow}, \mathbf{K}_q) = p^{\uparrow} P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow}, \mathbf{K}_q).$$

Therefore, the numerator of (7) becomes

$$P(\bar{\mathbf{R}}_q \cap \mathbf{G}_1, \mathbf{K}_q) = p^{\downarrow} P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q) + p^{\uparrow} P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow}, \mathbf{K}_q).$$

On the other hand, the denominator of (7) can be written as

$$P(\mathbf{G}_1, \mathbf{K}_q) = P(\mathbf{R}^{\downarrow} \cap \mathbf{G}_1, \mathbf{K}_q) + P(\mathbf{R}^{\uparrow} \cap \mathbf{G}_1, \mathbf{K}_q) = p^{\downarrow} P(\mathbf{R}^{\downarrow}, \mathbf{K}_q) + p^{\uparrow} P(\mathbf{R}^{\uparrow}, \mathbf{K}_q).$$

Thus, with the notation $\bar{\mathbf{R}}_q^{\downarrow} = \bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}$ and $\bar{\mathbf{R}}_q^{\uparrow} = \bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow}$, Equation (7) can be rewritten as

$$\beta_q^{(1)} = \frac{p^{\downarrow} P(\bar{\mathbf{R}}_q^{\downarrow}, \mathbf{K}_q) + p^{\uparrow} P(\bar{\mathbf{R}}_q^{\uparrow}, \mathbf{K}_q)}{p^{\downarrow} P(\mathbf{R}^{\downarrow}, \mathbf{K}_q) + p^{\uparrow} P(\mathbf{R}^{\uparrow}, \mathbf{K}_q)} \qquad (8)$$
By substituting $\beta_q^{(1)}$ with the right hand side of this last equation in the inequality $\beta_q^{(1)} \le \beta_q$, after some algebra we obtain

$$p^{\downarrow}\left[P(\bar{\mathbf{R}}_q^{\downarrow}, \mathbf{K}_q) - \beta_q P(\mathbf{R}^{\downarrow}, \mathbf{K}_q)\right] \le p^{\uparrow}\left[\beta_q P(\mathbf{R}^{\uparrow}, \mathbf{K}_q) - P(\bar{\mathbf{R}}_q^{\uparrow}, \mathbf{K}_q)\right] \qquad (9)$$

We know from Proposition 1 in de Chiusole, Stefanutti, Anselmi, and Robusto (2013) that $\beta_q > \beta_q^{\uparrow}$. Since, by definition, $\beta_q^{\uparrow} = P(\bar{\mathbf{R}}_q^{\uparrow}, \mathbf{K}_q) / P(\mathbf{R}^{\uparrow}, \mathbf{K}_q)$, we have that $\beta_q P(\mathbf{R}^{\uparrow}, \mathbf{K}_q) - P(\bar{\mathbf{R}}_q^{\uparrow}, \mathbf{K}_q) > 0$. Thus, dividing both terms of the inequality in (9) by $p^{\uparrow}$ and, then, by $P(\bar{\mathbf{R}}_q^{\downarrow}, \mathbf{K}_q) - \beta_q P(\mathbf{R}^{\downarrow}, \mathbf{K}_q)$, which is positive because $\beta_q < \beta_q^{\downarrow}$, one obtains the inequality in (5).
The proof concerning the condition for $\eta_q^{(1)} \le \eta_q$ follows an identical line of reasoning, provided that $\bar{\mathbf{R}}_q$ is replaced by $\mathbf{R}_q$ and $\mathbf{K}_q$ by $\bar{\mathbf{K}}_q$. Note that, in this case, the quantity $P(\mathbf{R}_q^{\downarrow}, \bar{\mathbf{K}}_q) - \eta_q P(\mathbf{R}^{\downarrow}, \bar{\mathbf{K}}_q)$ is negative (since $\eta_q^{\downarrow} < \eta_q$), so that the direction of the inequality is reversed in the last step, which yields (6).
Proposition 1 holds true for arbitrary values of the two proportions $p^{\downarrow}$ and $p^{\uparrow}$. A special case arises when the choice of $p^{\downarrow}$ and $p^{\uparrow}$ is such that $p^{\uparrow} = 1 - p^{\downarrow}$.
Proposition 2
If $p^{\uparrow} = 1 - p^{\downarrow}$, then:

1. $\beta_q^{(1)} \le \beta_q$ if and only if $\beta_q^{(2)} \ge \beta_q$;

2. $\eta_q^{(1)} \le \eta_q$ if and only if $\eta_q^{(2)} \ge \eta_q$.
Proof. (1) Let $f(\beta_q)$ represent the right hand term of the inequality in (5). Then we have $\beta_q^{(1)} \le \beta_q$ iff $p^{\downarrow}/(1 - p^{\downarrow}) \le f(\beta_q)$, iff $(1 - p^{\downarrow})/p^{\downarrow} \ge f(\beta_q)$, iff $p^{\uparrow}/(1 - p^{\uparrow}) \ge f(\beta_q)$, iff $\beta_q^{(2)} \ge \beta_q$. An analogous development, applied to the inequality in (6), leads to condition (2).
As Equation (8) shows, the value of $\beta_q^{(1)}$ is a function of: (i) the cutoff $c$ used to partition the data set; (ii) the values of $p^{\uparrow}$ and $p^{\downarrow}$; (iii) the BLIM’s parameter values (i.e., the $\beta_q$, $\eta_q$, and $\pi_K$ probabilities). To illustrate Propositions 1 and 2, Figure 1 shows how $\beta_q^{(1)}$ varies as a function of the true parameter $\beta_q$ and the proportion $p^{\downarrow}$. In the figure, the x-axis represents the parameter $\beta_q$, and each of the curves corresponds to a different choice of $p^{\downarrow}$. The remaining parameters of the BLIM were fixed to constant values, and the restriction $p^{\uparrow} = 1 - p^{\downarrow}$ was used.
Figure 1. The probability $\beta_q^{(1)}$ of a careless error on item $q$ in group $\mathbf{G}_1$ (y-axis) as a function of $\beta_q$ (x-axis). Each curve corresponds to a different choice of the proportion $p^{\downarrow}$.
It can be seen from the figure that, when $p^{\downarrow}$ is less than a certain value (.5 in this particular example), the $\beta_q$ parameter is underestimated in group $\mathbf{G}_1$ (and thus overestimated in group $\mathbf{G}_2$), and the size of the bias increases as $p^{\downarrow}$ approaches zero. On the other hand, when $p^{\downarrow}$ is greater than .5, the $\beta_q$ parameter is overestimated in group $\mathbf{G}_1$ (and thus underestimated in group $\mathbf{G}_2$), and the size of the bias increases as $p^{\downarrow}$ approaches one. A similar example could be provided for the $\eta_q$ parameter; in that case, the behavior is reversed.
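The qualitative pattern of Figure 1, together with the opposite behavior of the lucky guess rate, can be reproduced with the following self-contained sketch; the toy structure and parameter values are assumptions and differ from those used for the figure. It sweeps $p^{\downarrow}$ from .1 to .9 with $p^{\uparrow} = 1 - p^{\downarrow}$ and prints the theoretical careless error and lucky guess rates in $\mathbf{G}_1$.

```python
# Sketch reproducing the qualitative pattern of Figure 1 on a toy BLIM (values are assumptions).
from itertools import chain, combinations

Q = frozenset({0, 1, 2, 3})
states = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2}), Q]
pi = {k: 1.0 / len(states) for k in states}
beta = {q: 0.10 for q in Q}
eta = {q: 0.15 for q in Q}
patterns = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(Q), n) for n in range(len(Q) + 1))]

def r(R, k):                                      # Equation (1)
    p = 1.0
    for q in Q:
        if q in k:
            p *= (1.0 - beta[q]) if q in R else beta[q]
        else:
            p *= eta[q] if q in R else (1.0 - eta[q])
    return p

def error_rates_group1(q, c, p_down, p_up):
    """Conditional careless error and lucky guess rates for item q within G1."""
    bn = bd = en = ed = 0.0
    for R in patterns:
        w = p_down if len(R) <= c else p_up       # probability that R is assigned to G1
        for k in states:
            m = w * r(R, k) * pi[k]
            if q in k:                            # careless error component
                bd += m
                if q not in R:
                    bn += m
            else:                                 # lucky guess component
                ed += m
                if q in R:
                    en += m
    return bn / bd, en / ed

q, c = 1, 2
for i in range(1, 10):                            # p_down = .1, .2, ..., .9 and p_up = 1 - p_down
    p_down = i / 10.0
    b1, e1 = error_rates_group1(q, c, p_down, 1.0 - p_down)
    print(f"p_down={p_down:.1f}  beta1={b1:.4f} (true {beta[q]})  eta1={e1:.4f} (true {eta[q]})")
```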
A Simulation Study
The theoretical results obtained in the previous section are illustrated by means of a simulation study in which the two proportions $p^{\downarrow}$ and $p^{\uparrow}$ are varied systematically. The aim of the simulations is to show that estimating the BLIM’s parameters in each of the two groups $\mathbf{G}_1$ and $\mathbf{G}_2$ leads to rejecting the error parameter invariance assumption of the BLIM even when it is respected by the data, irrespective of the values of the proportions $p^{\downarrow}$ and $p^{\uparrow}$. In particular, it is expected that, on average, the maximum likelihood estimates of the $\beta_q$ and $\eta_q$ parameters in group $\mathbf{G}_i$ (with $i \in \{1, 2\}$) approach the theoretical values $\beta_q^{(i)} \ne \beta_q$ and $\eta_q^{(i)} \ne \eta_q$.
For all the simulations, a set of MATLAB functions was developed; these functions are available on request from the first author.
Simulation Design
Nine simulation conditions were considered, in which the following variables were held fixed: a random knowledge structure, composed of 16 items and 400 knowledge states; the true error parameter values, chosen at random from a uniform distribution on the interval (0, .25); and the cutoff used for creating the two groups, which, for this knowledge structure, was the median (8). What varied among the 9 conditions was the proportion $p^{\downarrow}$ used to form groups $\mathbf{G}_1$ and $\mathbf{G}_2$. The values of $p^{\downarrow}$ were taken from the open interval (0, 1) at equally spaced steps of .10. In all simulations, the constraint $p^{\uparrow} = 1 - p^{\downarrow}$ held true.
In each of the 9 conditions, 100 samples of 1,000 response patterns were generated. For each sample, the two groups were then formed by choosing, with replacement, a proportion $p^{\downarrow}$ of the patterns below the median and a proportion $1 - p^{\downarrow}$ of the patterns above the median. In this way, for each of the 100 replications, 9 pairs $\{\mathbf{G}_1, \mathbf{G}_2\}$, each composed of 1,000 response patterns, were obtained. The BLIM was then fitted to both groups, in each of the 9 pairs, and the means of the parameter estimates were compared to those computed by applying Equation (8) for $\beta_q^{(1)}$ and the corresponding equation for $\eta_q^{(1)}$.
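A sketch of the data-generating and group-forming steps of one replication is given below. It uses a small illustrative structure instead of the 16-item structure of the study, reads the group formation as resampling fixed proportions around the median, and omits the maximum likelihood fitting of the BLIM, for which the MATLAB functions mentioned above were used.

```python
# Sketch of one simulation replication: generate patterns from a BLIM, then build G1 and G2.
# Structure, parameter values and sample size are illustrative assumptions.
import random
from statistics import median

rng = random.Random(1)
Q = [0, 1, 2, 3]
states = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2}), frozenset(Q)]
pi = [1.0 / len(states)] * len(states)
beta = {q: rng.uniform(0.0, 0.25) for q in Q}        # true careless error rates
eta = {q: rng.uniform(0.0, 0.25) for q in Q}         # true lucky guess rates

def simulate_pattern():
    """Draw a knowledge state from pi, then a response pattern as in Equation (1)."""
    k = rng.choices(states, weights=pi)[0]
    R = set()
    for q in Q:
        correct = (rng.random() >= beta[q]) if q in k else (rng.random() < eta[q])
        if correct:
            R.add(q)
    return frozenset(R)

def form_mixture_groups(data, p_down, n=1000):
    """G1: n*p_down patterns resampled from below the median and n*(1 - p_down) from
    above it; G2 is built with the proportions reversed."""
    c = median(len(R) for R in data)                 # cutoff: median pattern size
    below = [R for R in data if len(R) <= c]
    above = [R for R in data if len(R) > c]
    n_low = round(n * p_down)
    g1 = [rng.choice(below) for _ in range(n_low)] + \
         [rng.choice(above) for _ in range(n - n_low)]
    g2 = [rng.choice(above) for _ in range(n_low)] + \
         [rng.choice(below) for _ in range(n - n_low)]
    return g1, g2

sample = [simulate_pattern() for _ in range(1000)]
G1, G2 = form_mixture_groups(sample, p_down=0.3)
# The BLIM would now be fitted separately to G1 and G2 (e.g., by EM) and the mean
# parameter estimates compared with the theoretical values from Equation (8).
print(len(G1), len(G2))
```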
Results
The size and the direction of the bias in the 9 simulation conditions were examined. Figure 2 shows the results for the $\beta_q$ error parameter, obtained for both $\mathbf{G}_1$ and $\mathbf{G}_2$. In the figure, the true value of $\beta_q$ is along the x-axis, whereas the theoretical values and the mean estimates of $\beta_q$ are along the y-axis. The straight line is for reference and indicates $x = y$, and there is one diagram for each of the 9 values of $p$. In each diagram, circles represent $\beta_q^{(1)}$, whereas triangles represent the theoretical value of $\beta_q^{(2)}$. Finally, the × symbols and the dots represent the $\bar{\beta}_q^{(1)}$ and $\bar{\beta}_q^{(2)}$ mean estimates, respectively.
Figure 2. Comparison between the $\beta_q^{(i)}$ and $\bar{\beta}_q^{(i)}$ parameters in $\mathbf{G}_1$ and $\mathbf{G}_2$. The true value of $\beta_q$ is along the x-axis, whereas the theoretical values and the mean estimates of $\beta_q$ are along the y-axis. The straight line is for reference and indicates $x = y$, and there is one diagram for each of the 9 values of $p$. In each diagram, circles represent $\beta_q^{(1)}$, whereas triangles represent the theoretical value of $\beta_q^{(2)}$. Finally, the × symbols and the dots represent the $\bar{\beta}_q^{(1)}$ and $\bar{\beta}_q^{(2)}$ mean estimates, respectively.
From the figure, it can be seen that: (1) the $\beta_q^{(i)}$ and $\bar{\beta}_q^{(i)}$ parameters are in agreement for both groups, in all 9 conditions; (2) when $p = .50$, no bias is observed; (3) going from $p = .50$ to $p = .10$, the $\beta_q$ parameter is overestimated in $\mathbf{G}_2$ and underestimated in $\mathbf{G}_1$, whereas going from $p = .50$ to $p = .90$, the $\beta_q$ parameter is underestimated in $\mathbf{G}_2$ and overestimated in $\mathbf{G}_1$. These results are in line with the predictions made by Proposition 2.
The results obtained for the $\eta_q$ error parameters are very similar to those obtained for the $\beta_q$ parameters. The difference is that, going from $p = .50$ to $p = .10$, the $\eta_q$ parameter is underestimated in $\mathbf{G}_2$ and overestimated in $\mathbf{G}_1$, whereas going from $p = .50$ to $p = .90$, the $\eta_q$ parameter is overestimated in $\mathbf{G}_2$ and underestimated in $\mathbf{G}_1$.
Empirical Application
The results discussed in the previous sections are illustrated by an application to real data. The design was the same as that used in the simulation study: 9 conditions were considered, in which the proportion $p^{\downarrow}$ used to create the two groups $\mathbf{G}_1$ and $\mathbf{G}_2$, respectively below and above the cutoff $c$, varied in the open interval (0, 1) at equally spaced steps of .10. For illustrative purposes, the data set provided by de Chiusole, Stefanutti, Anselmi, and Robusto (2013) was used, in which 18 problems of elementary probability theory (with a knowledge structure of 69 states) were administered to 209 Italian university students. The median of the cardinality of the knowledge states was used as the cutoff ($c = 9$) to form the two groups. Subsequently, in each of the 9 conditions, the BLIM was fitted to the data of each of the two groups, and the means of the $\beta_q$ and $\eta_q$ estimates were computed across the items and compared to one another.
Results
The results are shown in Table 1. Concerning the careless error parameters, it can be seen that in each of the conditions from 1 to 5, in which $.1 \le p^{\downarrow} < .5$, the inequality $\bar{\beta}_{\mathbf{G}_1} < \bar{\beta}_{\mathbf{G}_2}$ is respected, with the only exception of condition 4 (where, however, the two means are very close to one another), whereas in all conditions from 6 to 9, in which $.5 \le p^{\downarrow} \le .9$, the inequality $\bar{\beta}_{\mathbf{G}_1} > \bar{\beta}_{\mathbf{G}_2}$ is respected. Concerning the lucky guess parameters, it can be seen that in all conditions from 1 to 5 the inequality $\bar{\eta}_{\mathbf{G}_1} > \bar{\eta}_{\mathbf{G}_2}$ holds true, whereas in all conditions from 6 to 9 the inequality $\bar{\eta}_{\mathbf{G}_1} < \bar{\eta}_{\mathbf{G}_2}$ holds. All these results are in line with Proposition 2.
Table 1. Comparison among the means of the item parameter estimates of the BLIM in the 9 conditions of the study. In the table, $p$ is the proportion used to create the two groups below and above the cutoff; $\bar{\beta}_{\mathbf{G}_1}$ and $\bar{\beta}_{\mathbf{G}_2}$ are the means of the careless error estimates in the groups below and above the cutoff, respectively; $\bar{\eta}_{\mathbf{G}_1}$ and $\bar{\eta}_{\mathbf{G}_2}$ are the means of the lucky guess estimates in the groups below and above the cutoff, respectively.
Discussion

The BLIM’s parameter invariance assumption says that the probability of a careless error or a lucky guess on an item does not depend on the student’s knowledge state. In de Chiusole, Stefanutti, Anselmi, and Robusto (2013), two methods for testing this assumption were presented and discussed. The former consists in comparing the BLIM with other models, called bipartition models (BPMs), in which the invariance assumption is explicitly violated; if the comparison favors a BPM, then the conclusion is that invariance is violated. The latter, inspired by the IRT literature, consists in partitioning the observed data set into two groups (one containing all patterns below a certain cutoff, and one containing all patterns above it), fitting the model in each of them, and applying some statistical test to evaluate the difference between the parameter estimates of the two groups. If the test is statistically significant, then parameter invariance is violated. This second method, called the restricted naïve test, does not work properly, because it leads to biased parameter estimates in both groups. Indeed, this bias is a direct effect of the manipulations introduced to partition the data into the two groups, and says nothing about possible departures of the data from the parameter invariance assumption.
In the present work, the analysis was extended to a more general method for constructing the two groups. The groups are formed by choosing a proportion $p^{\uparrow}$ of the patterns above a certain cutoff $c$, and a proportion $p^{\downarrow}$ of the patterns below $c$. Theoretical results, simulations, and an empirical application showed that the general method also suffers from the same problems as the restricted naïve test. Again, the manipulations of the data that one implements for setting up the two groups lead to biased parameter estimates that have nothing to do with violations of the invariance assumption. Given these observations, the only method currently available for testing the BLIM’s invariance is the comparison with bipartition models.