Knowledge space theory (KST) is a mathematical theory, developed by Doignon and Falmagne (1985, 1999) and Falmagne and Doignon (2011), for the assessment of individual knowledge. The aim of a KST assessment is to find, with maximum accuracy and efficiency, the set of problems that a student masters in a specific knowledge domain (the student’s knowledge state). In real contexts, some of the student’s answers could be affected by noise, like careless errors due to time pressure or turmoil, or correct answers resulting from guessing. Because of the noise, the student’s knowledge state is not directly observable, and it has to be inferred from the student’s responses to the problems. To provide realistic predictions of students’ responses, probabilistic models have to be considered. The first and most widely used probabilistic model developed in KST is the so-called basic local independence model (BLIM; Falmagne & Doignon, 1988a, 1988b).
Knowledge about the properties of this model has grown over the years. For example, methods for estimating its parameters (Heller & Wickelmaier, 2013; Schrepp, 2005; Stefanutti & Robusto, 2009) and for testing its identifiability (Spoto, Stefanutti, & Vidotto, 2012; Stefanutti, Heller, Anselmi, & Robusto, 2012) are available. Furthermore, extensions of the BLIM have been developed and applied to real data, such as the Gain-Loss Model (Robusto, Stefanutti, & Anselmi, 2010; de Chiusole, Anselmi, Stefanutti, & Robusto, 2013), a model for assessing learning processes, and a probabilistic model for skill dependence (de Chiusole & Stefanutti, 2013).
The focus of this article is on one of the properties of the BLIM, that is, the parameter invariance assumption. In de Chiusole, Stefanutti, Anselmi, and Robusto (2013) it is shown that, even when the invariance assumption is violated by the data, the goodness of fit of the BLIM might be acceptable. For this reason, having a method that reveals invariance violations becomes essential. The method proposed by the authors consists in comparing the BLIM with other models, called bipartition models (BPMs), in which the invariance assumption is explicitly violated. If the comparison favors a BPM, then the parameter invariance assumption is violated, meaning that the BLIM is not adequate for those data.
In the same article, another method for detecting this type of violation was considered. The method was inspired by the IRT approach (Andersen, 1973; Glas & Verhelst, 1995), and consists in partitioning the observed data set into two or more independent groups, fitting the BLIM in each of the groups, and applying some suitable statistical test to evaluate the difference between the parameter estimates of the groups. In the simplest case, two groups are formed by separating all the subjects with score levels below the median from those with score levels above the median. If the test is significant, then the conclusion would be that the parameter invariance assumption is violated.
Although this method seems to be the most natural way to detect violations, it has been formally proven that, with the BLIM, it does not work properly. The main concern is that the error parameter estimates in the two groups will differ significantly from one another even when the invariance assumption is, indeed, respected. For this reason, in the sequel we refer to this method as the naïve test of invariance.
A procedure similar to the one described above was developed by de la Torre and Lee (2010) in the area of cognitive diagnostic models, for detecting violations of the parameter invariance of the DINA (deterministic inputs, noisy AND-gate) model (Junker & Sijtsma, 2001). Instead of forming two ‘pure’ groups below and above the median, they produced two data sets that were mixtures of the two pure groups. The first group collected about 60% of the respondents below the median and about 40% of those above it; the second group was constructed by reversing these proportions. By applying that procedure to a data set on fraction subtraction, de la Torre and Lee concluded that the parameter invariance of the DINA model may not hold in real data. It is worth noting that, at the level of performance, the DINA model is equivalent to the BLIM (Heller, Stefanutti, Anselmi, & Robusto, 2014).
The aim of this article is to generalize the theoretical results concerning the inadequacy of the naïve test to any choice of the proportion p used to form the two groups. After presenting the BLIM, naïve tests of invariance are presented, along with theoretical results showing that the general version of the test also suffers from the same problems. The theoretical results are illustrated through a simulation study and an empirical application.
The BLIM and Parameter Invariance
The aim of a KST assessment is to uncover the knowledge state that characterizes a student, on the basis of her responses to a given set Q of problems. The collection of responses is called the response pattern, and it is represented by the subset R ⊆ Q of all problems that received a correct response. For a KST assessment, a deterministic model on the problems q ∈ Q, called a knowledge structure, is required, along with a probabilistic model such as the BLIM. A knowledge structure is defined as a pair (Q, K), in which Q is a collection of problems and K is a collection of subsets of Q, called knowledge states. The BLIM is defined as a quadruple (Q, K, π, r), in which:
a. (Q, K) is a knowledge structure on a finite set Q;

b. π is a probability distribution on K;

c. r is the response function: for every R ⊆ Q and every K ∈ K, it is defined by

$$r(R, K) = \prod_{q \in K \setminus R} \beta_q \prod_{q \in K \cap R} (1 - \beta_q) \prod_{q \in R \setminus K} \eta_q \prod_{q \in Q \setminus (K \cup R)} (1 - \eta_q) \qquad (1)$$

where $\beta_q, \eta_q \in (0, 1]$ are two item-specific parameters, respectively called the careless error probability and the lucky guess probability of item $q$.
The probability of sampling a student whose response pattern is R ⊆ Q is
$$P(R) = \sum_{K \in \mathbf{K}} r(R, K)\, \pi_K \qquad (2)$$

where $\pi_K$ denotes the probability of knowledge state $K$.
It has to be pointed out that the parameters $\beta_q$ and $\eta_q$ are attached to the items and do not vary with the knowledge states of the students; in other words, they are invariant across individuals. We will refer to this property as the invariance assumption.
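To make the model concrete, the following minimal Python sketch evaluates Equations (1) and (2) on a small example. The three-item structure, the uniform distribution on the states, and the parameter values are illustrative assumptions, not taken from the article.

```python
# Minimal sketch of the BLIM (Equations 1 and 2) on an illustrative example.
from itertools import chain, combinations

Q = frozenset({0, 1, 2})                                      # items
states = [frozenset(), frozenset({0}), frozenset({0, 1}), Q]  # knowledge states K
pi = {k: 1.0 / len(states) for k in states}                   # distribution on K
beta = {0: 0.10, 1: 0.15, 2: 0.05}                            # careless error probabilities
eta = {0: 0.20, 1: 0.10, 2: 0.25}                             # lucky guess probabilities

def r(R, k):
    """Equation (1): probability of response pattern R given knowledge state k."""
    p = 1.0
    for q in Q:
        if q in k:                                  # mastered item: fail only by careless error
            p *= (1.0 - beta[q]) if q in R else beta[q]
        else:                                       # non-mastered item: solve only by lucky guess
            p *= eta[q] if q in R else (1.0 - eta[q])
    return p

def p_pattern(R):
    """Equation (2): marginal probability of observing response pattern R."""
    return sum(r(R, k) * pi[k] for k in states)

def all_patterns(items):
    """All 2^|Q| response patterns, as frozensets of correctly answered items."""
    items = sorted(items)
    return [frozenset(s) for s in
            chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))]

print(sum(p_pattern(R) for R in all_patterns(Q)))   # the marginal probabilities sum to ~1.0
```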
Before getting into the question of how this assumption can be tested, it is worth providing some basic notation. The set $\mathbf{K}_q = \{K \in \mathbf{K} : q \in K\}$ collects all knowledge states containing a given item $q \in Q$, and $\bar{\mathbf{K}}_q = \{K \in \mathbf{K} : q \notin K\}$ is its complement in $\mathbf{K}$. Similarly, for $\mathbf{R} = 2^Q$, let $\mathbf{R}_q = \{R \subseteq Q : q \in R\}$ be the set of all response patterns containing $q$, and $\bar{\mathbf{R}}_q = \{R \subseteq Q : q \notin R\}$ be its complement in $\mathbf{R}$. Finally, for any $\mathbf{F} \subseteq \mathbf{R}$ and any $\mathbf{J} \subseteq \mathbf{K}$, let

$$P(\mathbf{F}, \mathbf{J}) = \sum_{R \in \mathbf{F}} \sum_{K \in \mathbf{J}} r(R, K)\, \pi_K$$

be the joint probability of $\mathbf{F}$ and $\mathbf{J}$.
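As a small illustration of this notation, the sketch below defines the index sets and the joint probability $P(\mathbf{F}, \mathbf{J})$ as generic helpers. The function names are hypothetical; r and pi stand for any BLIM response function and state distribution, such as those of the previous listing.

```python
# Sketch of the sets K_q and R_q and of the joint probability P(F, J).
def joint_probability(F, J, r, pi):
    """P(F, J): sum of r(R, K) * pi[K] over all patterns R in F and states K in J."""
    return sum(r(R, K) * pi[K] for R in F for K in J)

def states_with(q, states):         # K_q: knowledge states containing item q
    return [K for K in states if q in K]

def states_without(q, states):      # complement of K_q in K
    return [K for K in states if q not in K]

def patterns_with(q, patterns):     # R_q: response patterns containing item q
    return [R for R in patterns if q in R]

def patterns_without(q, patterns):  # complement of R_q in R
    return [R for R in patterns if q not in R]
```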
Naïve Tests of Invariance: Restricted Case
A way to assess violations of the parameter invariance assumption of the BLIM would be to partition the whole data set into two independent groups (say, Group 1 and Group 2), fit the BLIM in each of them, and apply some suitable statistical test of the difference between the parameter estimates in the two groups. If the test is significant, then the conclusion would be that parameter invariance is violated by the data.
As a criterion to form the two groups, consider the one that consists of choosing a certain quantile c > 0 (e.g., the median) of the sample distribution of the size of the response patterns. Patterns whose size is less than or equal to c are assigned to Group 1, and those whose size is greater than c are assigned to Group 2. With these two groups, a test of invariance would not work properly, since parameter estimates would be biased in both groups, even when the invariance assumption is, indeed, respected. Because of this bias, the statistical test would lead to a rejection of the parameter invariance assumption too often.
To see this, consider some cutoff $c \in \{0, 1, \ldots, |Q| - 1\}$, and let $\mathbf{R}^{\downarrow} = \{R \in \mathbf{R} : |R| \le c\}$ be the collection of all response patterns whose size is less than or equal to $c$, and $\mathbf{R}^{\uparrow} = \{R \in \mathbf{R} : |R| > c\}$ be the collection of all response patterns whose size is greater than $c$. Then, according to the BLIM, the conditional probability that, in a randomly sampled response pattern $R$, an item $q$ is failed by careless error, given that the size of $R$ is below the cutoff, is

$$\beta_q^{\downarrow} = \frac{P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q)}{P(\mathbf{R}^{\downarrow}, \mathbf{K}_q)} \qquad (3)$$

whereas the conditional probability that, in a randomly sampled response pattern $R$, item $q$ is solved by lucky guess, given that the size of $R$ is below the cutoff, is

$$\eta_q^{\downarrow} = \frac{P(\mathbf{R}_q \cap \mathbf{R}^{\downarrow}, \bar{\mathbf{K}}_q)}{P(\mathbf{R}^{\downarrow}, \bar{\mathbf{K}}_q)} \qquad (4)$$

Similar equations are obtained for the $\beta_q^{\uparrow}$ and $\eta_q^{\uparrow}$ parameters, by replacing $\mathbf{R}^{\downarrow}$ with $\mathbf{R}^{\uparrow}$ in Equations 3 and 4.

In de Chiusole, Stefanutti, Anselmi, and Robusto (2013), it is shown that, for any choice of the cutoff $c$, any item $q \in Q$, and $\beta_q, \eta_q \in (0, 1)$, the following inequalities hold true: $\beta_q^{\uparrow} < \beta_q < \beta_q^{\downarrow}$ and $\eta_q^{\downarrow} < \eta_q < \eta_q^{\uparrow}$.
These two inequalities show that careless errors are more likely when one samples below the cutoff, whereas lucky guesses are more likely when one samples above the cutoff. Thus, when the parameters of the BLIM are estimated from only one part of the data set (below or above the cutoff), one obtains biased parameter estimates.
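The following self-contained sketch illustrates Equations (3) and (4) and the two inequalities numerically. The four-item chain structure, the cutoff, and the parameter values are assumptions chosen only to display the qualitative pattern.

```python
# Numerical illustration of Equations (3)-(4) on a toy BLIM (all values are assumptions).
from itertools import chain, combinations

Q = frozenset({0, 1, 2, 3})
states = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2}), Q]
pi = {k: 1.0 / len(states) for k in states}
beta = {q: 0.10 for q in Q}
eta = {q: 0.15 for q in Q}

def r(R, k):                                         # Equation (1)
    p = 1.0
    for q in Q:
        if q in k:
            p *= (1.0 - beta[q]) if q in R else beta[q]
        else:
            p *= eta[q] if q in R else (1.0 - eta[q])
    return p

def P(F, J):                                         # joint probability of F and J
    return sum(r(R, k) * pi[k] for R in F for k in J)

patterns = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(Q), n) for n in range(len(Q) + 1))]

c = 2                                                # cutoff on the number of correct responses
R_down = [R for R in patterns if len(R) <= c]
R_up = [R for R in patterns if len(R) > c]

q = 1
K_q = [k for k in states if q in k]
K_not_q = [k for k in states if q not in k]

beta_down = P([R for R in R_down if q not in R], K_q) / P(R_down, K_q)      # Equation (3)
beta_up = P([R for R in R_up if q not in R], K_q) / P(R_up, K_q)
eta_down = P([R for R in R_down if q in R], K_not_q) / P(R_down, K_not_q)   # Equation (4)
eta_up = P([R for R in R_up if q in R], K_not_q) / P(R_up, K_not_q)

print(beta_up < beta[q] < beta_down)    # expected True: careless errors inflated below the cutoff
print(eta_down < eta[q] < eta_up)       # expected True: lucky guesses inflated above the cutoff
```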
Naïve Tests of Invariance: General Case
It might be argued that the rule of forming the two groups by assigning all patterns below a certain cutoff to one group, and all the remaining ones to the other group, is too strong. Perhaps there exist weaker rules that reduce or even remove the bias.
The following, more general, rule is considered here: given some proportion $p^{\downarrow}$, with $0 \le p^{\downarrow} \le 1$, a sufficiently large number $n^{\downarrow}$ of response patterns are randomly sampled with replacement from $\mathbf{R}^{\downarrow}$, and each of them is assigned to Group $\mathbf{G}_1$ with probability $p^{\downarrow}$, and to Group $\mathbf{G}_2$ with probability $1 - p^{\downarrow}$. Analogously, given a proportion $p^{\uparrow}$, with $0 \le p^{\uparrow} \le 1$, $n^{\uparrow}$ response patterns are randomly sampled with replacement from $\mathbf{R}^{\uparrow}$, and each of them is assigned to $\mathbf{G}_1$ with probability $p^{\uparrow}$, and to $\mathbf{G}_2$ with probability $1 - p^{\uparrow}$.
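A minimal sketch of this group-forming rule is the following; the function name, the sample sizes, and the data are illustrative assumptions.

```python
# Sketch of the general rule: patterns resampled from below the cutoff go to G1 with
# probability p_down, patterns resampled from above it go to G1 with probability p_up;
# everything else goes to G2.
import random

def form_groups(patterns, c, p_down, p_up, n_down, n_up, seed=0):
    rng = random.Random(seed)
    below = [R for R in patterns if len(R) <= c]     # observed patterns in R_down
    above = [R for R in patterns if len(R) > c]      # observed patterns in R_up
    g1, g2 = [], []
    for _ in range(n_down):                          # sample with replacement from below
        R = rng.choice(below)
        (g1 if rng.random() < p_down else g2).append(R)
    for _ in range(n_up):                            # sample with replacement from above
        R = rng.choice(above)
        (g1 if rng.random() < p_up else g2).append(R)
    return g1, g2

# Example with p_up = 1 - p_down, the special case considered in Proposition 2 below.
data = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2})]
G1, G2 = form_groups(data, c=1, p_down=0.7, p_up=0.3, n_down=500, n_up=500)
print(len(G1), len(G2))
```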
As the following proposition shows, even the more general rule suffers from the same problem as the naïve test of invariance.
Proposition 1
Let $\beta_q^{(1)}$ be the probability that, in a randomly sampled response pattern, an item $q$ is failed by careless error, given that the pattern belongs to Group $\mathbf{G}_1$, and let $\eta_q^{(1)}$ be the probability that a lucky guess occurs for $q$, given that the pattern belongs to $\mathbf{G}_1$. Then $\beta_q^{(1)} \le \beta_q$ if and only if

$$\frac{p^{\downarrow}}{p^{\uparrow}} \le \frac{\beta_q P(\mathbf{R}^{\uparrow}, \mathbf{K}_q) - P(\bar{\mathbf{R}}_q^{\uparrow}, \mathbf{K}_q)}{P(\bar{\mathbf{R}}_q^{\downarrow}, \mathbf{K}_q) - \beta_q P(\mathbf{R}^{\downarrow}, \mathbf{K}_q)} \qquad (5)$$

where $\bar{\mathbf{R}}_q^{\downarrow} = \bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}$ and $\bar{\mathbf{R}}_q^{\uparrow} = \bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow}$. Moreover, $\eta_q^{(1)} \le \eta_q$ if and only if

$$\frac{p^{\downarrow}}{p^{\uparrow}} \ge \frac{\eta_q P(\mathbf{R}^{\uparrow}, \bar{\mathbf{K}}_q) - P(\mathbf{R}_q^{\uparrow}, \bar{\mathbf{K}}_q)}{P(\mathbf{R}_q^{\downarrow}, \bar{\mathbf{K}}_q) - \eta_q P(\mathbf{R}^{\downarrow}, \bar{\mathbf{K}}_q)} \qquad (6)$$

where $\mathbf{R}_q^{\downarrow} = \mathbf{R}_q \cap \mathbf{R}^{\downarrow}$ and $\mathbf{R}_q^{\uparrow} = \mathbf{R}_q \cap \mathbf{R}^{\uparrow}$.
What Proposition 1 essentially says is that the method consisting in partitioning the sample into two subgroups by using arbitrary proportions $p^{\uparrow}$ and $p^{\downarrow}$ will lead to biased estimates of the $\beta_q$ and $\eta_q$ parameters, even when the invariance assumption is indeed respected by the data. Depending on the ratio $p^{\downarrow}/p^{\uparrow}$ that one chooses, the $\beta_q$ and $\eta_q$ probabilities might be either over- or underestimated in both groups $\mathbf{G}_1$ and $\mathbf{G}_2$. For this reason, it is recommended not to use methods like the one described in this section for testing parameter invariance of the BLIM.
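The direction of the bias can be checked numerically. The sketch below, again on a purely illustrative structure with assumed parameter values, computes the conditional careless error probability in $\mathbf{G}_1$ directly from its definition and reports whether $\beta_q$ is under- or overestimated for a few choices of $p^{\downarrow}$ and $p^{\uparrow}$.

```python
# Numerical check of Proposition 1 on a toy BLIM (structure and parameters are assumptions).
from itertools import chain, combinations

Q = frozenset({0, 1, 2, 3})
states = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2}), Q]
pi = {k: 1.0 / len(states) for k in states}
beta = {q: 0.10 for q in Q}
eta = {q: 0.15 for q in Q}
patterns = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(Q), n) for n in range(len(Q) + 1))]

def r(R, k):                                     # Equation (1)
    p = 1.0
    for q in Q:
        if q in k:
            p *= (1.0 - beta[q]) if q in R else beta[q]
        else:
            p *= eta[q] if q in R else (1.0 - eta[q])
    return p

def beta_group1(q, c, p_down, p_up):
    """P(q failed by careless error | pattern assigned to G1) under the BLIM."""
    num = den = 0.0
    for R in patterns:
        w = p_down if len(R) <= c else p_up      # probability that R is assigned to G1
        for k in states:
            if q in k:                           # careless errors require q to be mastered
                den += w * r(R, k) * pi[k]
                if q not in R:
                    num += w * r(R, k) * pi[k]
    return num / den

q, c = 1, 2
for p_down, p_up in [(0.2, 0.8), (0.5, 0.5), (0.8, 0.2)]:
    b1 = beta_group1(q, c, p_down, p_up)
    label = ("about unbiased" if abs(b1 - beta[q]) < 1e-9 else
             "underestimated" if b1 < beta[q] else "overestimated")
    print(p_down, p_up, round(b1, 4), label)
```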
Proof of Proposition 1
Suppose that the probabilities of the response patterns in a population are given by Equation (2). The probability that, in a randomly sampled response pattern, an item $q$ is failed by careless error, given that the pattern belongs to group $\mathbf{G}_1$, is

$$\beta_q^{(1)} = \frac{P(\bar{\mathbf{R}}_q \cap \mathbf{G}_1, \mathbf{K}_q)}{P(\mathbf{G}_1, \mathbf{K}_q)} \qquad (7)$$

The numerator of the right hand side of Equation (7) can be written as

$$P(\bar{\mathbf{R}}_q \cap \mathbf{G}_1, \mathbf{K}_q) = P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow} \cap \mathbf{G}_1, \mathbf{K}_q) + P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow} \cap \mathbf{G}_1, \mathbf{K}_q),$$

and, by applying the concatenation rule of conditional probabilities,

$$P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow} \cap \mathbf{G}_1, \mathbf{K}_q) = P(\mathbf{G}_1 \mid \bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q)\, P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q).$$

Moreover, given $\mathbf{R}^{\downarrow}$, $\mathbf{G}_1$ is independent of both $\bar{\mathbf{R}}_q$ and $\mathbf{K}_q$, hence

$$P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow} \cap \mathbf{G}_1, \mathbf{K}_q) = P(\mathbf{G}_1 \mid \mathbf{R}^{\downarrow})\, P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q) = p^{\downarrow} P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q).$$

Similarly, we have

$$P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow} \cap \mathbf{G}_1, \mathbf{K}_q) = P(\mathbf{G}_1 \mid \mathbf{R}^{\uparrow})\, P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow}, \mathbf{K}_q) = p^{\uparrow} P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow}, \mathbf{K}_q).$$

Therefore, the numerator of (7) becomes

$$P(\bar{\mathbf{R}}_q \cap \mathbf{G}_1, \mathbf{K}_q) = p^{\downarrow} P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}, \mathbf{K}_q) + p^{\uparrow} P(\bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow}, \mathbf{K}_q).$$

On the other hand, the denominator of (7) can be written as

$$P(\mathbf{G}_1, \mathbf{K}_q) = P(\mathbf{R}^{\downarrow} \cap \mathbf{G}_1, \mathbf{K}_q) + P(\mathbf{R}^{\uparrow} \cap \mathbf{G}_1, \mathbf{K}_q) = p^{\downarrow} P(\mathbf{R}^{\downarrow}, \mathbf{K}_q) + p^{\uparrow} P(\mathbf{R}^{\uparrow}, \mathbf{K}_q).$$

Thus, with the notation $\bar{\mathbf{R}}_q^{\downarrow} = \bar{\mathbf{R}}_q \cap \mathbf{R}^{\downarrow}$ and $\bar{\mathbf{R}}_q^{\uparrow} = \bar{\mathbf{R}}_q \cap \mathbf{R}^{\uparrow}$, Equation (7) can be rewritten as

$$\beta_q^{(1)} = \frac{p^{\downarrow} P(\bar{\mathbf{R}}_q^{\downarrow}, \mathbf{K}_q) + p^{\uparrow} P(\bar{\mathbf{R}}_q^{\uparrow}, \mathbf{K}_q)}{p^{\downarrow} P(\mathbf{R}^{\downarrow}, \mathbf{K}_q) + p^{\uparrow} P(\mathbf{R}^{\uparrow}, \mathbf{K}_q)} \qquad (8)$$
By substituting $\beta_q^{(1)}$ with the right hand side of this last equation in the inequality $\beta_q^{(1)} \le \beta_q$, after some algebra we obtain

$$p^{\downarrow}\left[P(\bar{\mathbf{R}}_q^{\downarrow}, \mathbf{K}_q) - \beta_q P(\mathbf{R}^{\downarrow}, \mathbf{K}_q)\right] \le p^{\uparrow}\left[\beta_q P(\mathbf{R}^{\uparrow}, \mathbf{K}_q) - P(\bar{\mathbf{R}}_q^{\uparrow}, \mathbf{K}_q)\right] \qquad (9)$$

We know from Proposition 1 in de Chiusole, Stefanutti, Anselmi, and Robusto (2013) that $\beta_q > \beta_q^{\uparrow}$. Since, by definition, $\beta_q^{\uparrow} = P(\bar{\mathbf{R}}_q^{\uparrow}, \mathbf{K}_q) / P(\mathbf{R}^{\uparrow}, \mathbf{K}_q)$, we have that $\beta_q P(\mathbf{R}^{\uparrow}, \mathbf{K}_q) - P(\bar{\mathbf{R}}_q^{\uparrow}, \mathbf{K}_q) > 0$. Thus, dividing both terms of the inequality in (9) by $p^{\uparrow}$ and, then, by $P(\bar{\mathbf{R}}_q^{\downarrow}, \mathbf{K}_q) - \beta_q P(\mathbf{R}^{\downarrow}, \mathbf{K}_q)$, which is positive because $\beta_q < \beta_q^{\downarrow}$, one obtains the inequality in (5).
The proof concerning the condition for $\eta_q^{(1)} \le \eta_q$ follows an identical line of reasoning, provided that $\bar{\mathbf{R}}_q$ is replaced by $\mathbf{R}_q$ and $\mathbf{K}_q$ by $\bar{\mathbf{K}}_q$. Note that, in this case, the quantity $P(\mathbf{R}_q^{\downarrow}, \bar{\mathbf{K}}_q) - \eta_q P(\mathbf{R}^{\downarrow}, \bar{\mathbf{K}}_q)$ is negative (since $\eta_q^{\downarrow} < \eta_q$), so that the direction of the inequality is reversed in the last step, which yields (6).
Proposition 1 holds true for arbitrary values of the two proportions $p^{\downarrow}$ and $p^{\uparrow}$. A special case arises when the choice of $p^{\downarrow}$ and $p^{\uparrow}$ is such that $p^{\uparrow} = 1 - p^{\downarrow}$.
Proposition 2
If $p^{\uparrow} = 1 - p^{\downarrow}$, then:

1. $\beta_q^{(1)} \le \beta_q$ if and only if $\beta_q^{(2)} \ge \beta_q$;

2. $\eta_q^{(1)} \le \eta_q$ if and only if $\eta_q^{(2)} \ge \eta_q$.
Proof. (1) Let $f(\beta_q)$ represent the right hand term of the inequality in (5). Then we have $\beta_q^{(1)} \le \beta_q$ iff $p^{\downarrow}/(1 - p^{\downarrow}) \le f(\beta_q)$, iff $(1 - p^{\downarrow})/p^{\downarrow} \ge f(\beta_q)$, iff $p^{\uparrow}/(1 - p^{\uparrow}) \ge f(\beta_q)$, iff $\beta_q^{(2)} \ge \beta_q$. An analogous development, applied to the inequality in (6), leads to condition (2).
As Equation (8) shows, the value of $\beta_q^{(1)}$ is a function of: (i) the cutoff $c$ used to partition the data set; (ii) the values of $p^{\uparrow}$ and $p^{\downarrow}$; (iii) the BLIM’s parameter values (i.e., the $\beta_q$, $\eta_q$, and $\pi_K$ probabilities). To illustrate Propositions 1 and 2, Figure 1 shows how $\beta_q^{(1)}$ varies as a function of the true parameter $\beta_q$ and the proportion $p^{\downarrow}$. In the figure, the x-axis represents the parameter $\beta_q$, and each of the curves corresponds to a different choice of $p^{\downarrow}$. The remaining parameters of the BLIM were fixed to constant values, and the restriction $p^{\uparrow} = 1 - p^{\downarrow}$ was used.
Figure 1. The probability $\beta_q^{(1)}$ of a careless error on item $q$ in group $\mathbf{G}_1$ (y-axis) as a function of $\beta_q$ (x-axis). Each curve corresponds to a different choice of the proportion $p^{\downarrow}$.
It can be seen from the figure that, when $p^{\downarrow}$ is less than a certain value (.5 in this particular example), the $\beta_q$ parameter is underestimated in group $\mathbf{G}_1$ (and thus overestimated in group $\mathbf{G}_2$), and the size of the bias increases as $p^{\downarrow}$ approaches zero. On the other hand, when $p^{\downarrow}$ is greater than .5, the $\beta_q$ parameter is overestimated in group $\mathbf{G}_1$ (and thus underestimated in group $\mathbf{G}_2$), and the size of the bias increases as $p^{\downarrow}$ approaches one. A similar example could be provided for the $\eta_q$ parameter; in that case, the behavior is reversed.
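The qualitative pattern of Figure 1, together with the opposite behavior of the lucky guess rate, can be reproduced with the following self-contained sketch; the toy structure and parameter values are assumptions and differ from those used for the figure. It sweeps $p^{\downarrow}$ from .1 to .9 with $p^{\uparrow} = 1 - p^{\downarrow}$ and prints the theoretical careless error and lucky guess rates in $\mathbf{G}_1$.

```python
# Sketch reproducing the qualitative pattern of Figure 1 on a toy BLIM (values are assumptions).
from itertools import chain, combinations

Q = frozenset({0, 1, 2, 3})
states = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2}), Q]
pi = {k: 1.0 / len(states) for k in states}
beta = {q: 0.10 for q in Q}
eta = {q: 0.15 for q in Q}
patterns = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(Q), n) for n in range(len(Q) + 1))]

def r(R, k):                                      # Equation (1)
    p = 1.0
    for q in Q:
        if q in k:
            p *= (1.0 - beta[q]) if q in R else beta[q]
        else:
            p *= eta[q] if q in R else (1.0 - eta[q])
    return p

def error_rates_group1(q, c, p_down, p_up):
    """Conditional careless error and lucky guess rates for item q within G1."""
    bn = bd = en = ed = 0.0
    for R in patterns:
        w = p_down if len(R) <= c else p_up       # probability that R is assigned to G1
        for k in states:
            m = w * r(R, k) * pi[k]
            if q in k:                            # careless error component
                bd += m
                if q not in R:
                    bn += m
            else:                                 # lucky guess component
                ed += m
                if q in R:
                    en += m
    return bn / bd, en / ed

q, c = 1, 2
for i in range(1, 10):                            # p_down = .1, .2, ..., .9 and p_up = 1 - p_down
    p_down = i / 10.0
    b1, e1 = error_rates_group1(q, c, p_down, 1.0 - p_down)
    print(f"p_down={p_down:.1f}  beta1={b1:.4f} (true {beta[q]})  eta1={e1:.4f} (true {eta[q]})")
```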
A Simulation Study
The theoretical results obtained in the previous section are illustrated by means of a simulation study in which the two proportions $p^{\downarrow}$ and $p^{\uparrow}$ are varied systematically. The aim of the simulations is to show that estimating the BLIM’s parameters in each of the two groups $\mathbf{G}_1$ and $\mathbf{G}_2$ leads to rejecting the error parameter invariance assumption of the BLIM even when it is respected by the data, irrespective of the values of the proportions $p^{\downarrow}$ and $p^{\uparrow}$. In particular, it is expected that, on average, the maximum likelihood estimates of the $\beta_q$ and $\eta_q$ parameters in group $\mathbf{G}_i$ (with $i \in \{1, 2\}$) approach the theoretical values $\beta_q^{(i)} \ne \beta_q$ and $\eta_q^{(i)} \ne \eta_q$.
For all the simulations, a set of MATLAB functions was developed; these functions are available on request from the first author.
Simulation Design
Nine simulation conditions were considered, in which the following variables were held fixed: a random knowledge structure, composed of 16 items and 400 knowledge states; the true error parameter values, chosen at random from a uniform distribution on the interval (0, .25); and the cutoff used for creating the two groups, which, for this knowledge structure, was the median (8). What varied among the 9 conditions was the proportion $p^{\downarrow}$ used to form groups $\mathbf{G}_1$ and $\mathbf{G}_2$. The values of $p^{\downarrow}$ were taken from the open interval (0, 1) at equally spaced steps of .10. In all simulations, the constraint $p^{\uparrow} = 1 - p^{\downarrow}$ held true.
In each of the 9 conditions, 100 samples of 1,000 response patterns were generated. For each sample, the two groups were then formed by choosing, with replacement, a proportion $p^{\downarrow}$ of the patterns below the median and a proportion $1 - p^{\downarrow}$ of the patterns above the median. In this way, for each of the 100 replications, 9 pairs $\{\mathbf{G}_1, \mathbf{G}_2\}$, each composed of 1,000 response patterns, were obtained. The BLIM was then fitted to both groups, in each of the 9 pairs, and the means of the parameter estimates were compared to those computed by applying Equation (8) for $\beta_q^{(1)}$ and the corresponding equation for $\eta_q^{(1)}$.
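A sketch of the data-generating and group-forming steps of one replication is given below. It uses a small illustrative structure instead of the 16-item structure of the study, reads the group formation as resampling fixed proportions around the median, and omits the maximum likelihood fitting of the BLIM, for which the MATLAB functions mentioned above were used.

```python
# Sketch of one simulation replication: generate patterns from a BLIM, then build G1 and G2.
# Structure, parameter values and sample size are illustrative assumptions.
import random
from statistics import median

rng = random.Random(1)
Q = [0, 1, 2, 3]
states = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2}), frozenset(Q)]
pi = [1.0 / len(states)] * len(states)
beta = {q: rng.uniform(0.0, 0.25) for q in Q}        # true careless error rates
eta = {q: rng.uniform(0.0, 0.25) for q in Q}         # true lucky guess rates

def simulate_pattern():
    """Draw a knowledge state from pi, then a response pattern as in Equation (1)."""
    k = rng.choices(states, weights=pi)[0]
    R = set()
    for q in Q:
        correct = (rng.random() >= beta[q]) if q in k else (rng.random() < eta[q])
        if correct:
            R.add(q)
    return frozenset(R)

def form_mixture_groups(data, p_down, n=1000):
    """G1: n*p_down patterns resampled from below the median and n*(1 - p_down) from
    above it; G2 is built with the proportions reversed."""
    c = median(len(R) for R in data)                 # cutoff: median pattern size
    below = [R for R in data if len(R) <= c]
    above = [R for R in data if len(R) > c]
    n_low = round(n * p_down)
    g1 = [rng.choice(below) for _ in range(n_low)] + \
         [rng.choice(above) for _ in range(n - n_low)]
    g2 = [rng.choice(above) for _ in range(n_low)] + \
         [rng.choice(below) for _ in range(n - n_low)]
    return g1, g2

sample = [simulate_pattern() for _ in range(1000)]
G1, G2 = form_mixture_groups(sample, p_down=0.3)
# The BLIM would now be fitted separately to G1 and G2 (e.g., by EM) and the mean
# parameter estimates compared with the theoretical values from Equation (8).
print(len(G1), len(G2))
```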
Results
The size and the direction of the bias in the 9 simulation conditions were examined. Figure 2 shows the results for the $\beta_q$ error parameter, obtained for both $\mathbf{G}_1$ and $\mathbf{G}_2$. In the figure, the true value of $\beta_q$ is along the x-axis, whereas the theoretical values and the mean estimates of $\beta_q$ are along the y-axis. The straight line is for reference and indicates $x = y$, and there is one diagram for each of the 9 values of $p$. In each diagram, circles represent $\beta_q^{(1)}$, whereas triangles represent the theoretical value of $\beta_q^{(2)}$. Finally, the × symbols and the dots represent the $\bar{\beta}_q^{(1)}$ and $\bar{\beta}_q^{(2)}$ mean estimates, respectively.
Figure 2. Comparison between the $\beta_q^{(i)}$ and $\bar{\beta}_q^{(i)}$ parameters in $\mathbf{G}_1$ and $\mathbf{G}_2$. The true value of $\beta_q$ is along the x-axis, whereas the theoretical values and the mean estimates of $\beta_q$ are along the y-axis. The straight line is for reference and indicates $x = y$, and there is one diagram for each of the 9 values of $p$. In each diagram, circles represent $\beta_q^{(1)}$, whereas triangles represent the theoretical value of $\beta_q^{(2)}$. Finally, the × symbols and the dots represent the $\bar{\beta}_q^{(1)}$ and $\bar{\beta}_q^{(2)}$ mean estimates, respectively.
From the figure, it can be seen that: (1) the $\beta_q^{(i)}$ and $\bar{\beta}_q^{(i)}$ parameters are in agreement for both groups, in all 9 conditions; (2) when $p = .50$, no bias is observed; (3) going from $p = .50$ to $p = .10$, the $\beta_q$ parameter is overestimated in $\mathbf{G}_2$ and underestimated in $\mathbf{G}_1$, whereas going from $p = .50$ to $p = .90$, the $\beta_q$ parameter is underestimated in $\mathbf{G}_2$ and overestimated in $\mathbf{G}_1$. These results are in line with the predictions made by Proposition 2.
The results obtained for the $\eta_q$ error parameters are very similar to those obtained for the $\beta_q$ parameters. The difference is that, going from $p = .50$ to $p = .10$, the $\eta_q$ parameter is underestimated in $\mathbf{G}_2$ and overestimated in $\mathbf{G}_1$, whereas going from $p = .50$ to $p = .90$, the $\eta_q$ parameter is overestimated in $\mathbf{G}_2$ and underestimated in $\mathbf{G}_1$.
Empirical Application
The results discussed in the previous sections are illustrated by an application to real data. The design was the same as that used in the simulation study: 9 conditions were considered, in which the proportion $p^{\downarrow}$ used to create the two groups $\mathbf{G}_1$ and $\mathbf{G}_2$, respectively below and above the cutoff $c$, varied in the open interval (0, 1) at equally spaced steps of .10. For illustrative purposes, the data set provided by de Chiusole, Stefanutti, Anselmi, and Robusto (2013) was used, in which 18 problems of elementary probability theory (with a knowledge structure of 69 states) were administered to 209 Italian university students. The median of the cardinality of the knowledge states was used as the cutoff ($c = 9$) to form the two groups. Subsequently, in each of the 9 conditions, the BLIM was fitted to the data of each of the two groups, and the means of the $\beta_q$ and $\eta_q$ estimates were computed across the items and compared to one another.
Results
The results are shown in Table 1. Concerning the careless error parameters, it can be seen that in each of the conditions from 1 to 5, in which $.1 \le p^{\downarrow} < .5$, the inequality $\bar{\beta}_{\mathbf{G}_1} < \bar{\beta}_{\mathbf{G}_2}$ is respected, with the only exception of condition 4 (where, however, the two means are very close to one another), whereas in all conditions from 6 to 9, in which $.5 \le p^{\downarrow} \le .9$, the inequality $\bar{\beta}_{\mathbf{G}_1} > \bar{\beta}_{\mathbf{G}_2}$ is respected. Concerning the lucky guess parameters, it can be seen that in all conditions from 1 to 5 the inequality $\bar{\eta}_{\mathbf{G}_1} > \bar{\eta}_{\mathbf{G}_2}$ holds true, whereas in all conditions from 6 to 9 the inequality $\bar{\eta}_{\mathbf{G}_1} < \bar{\eta}_{\mathbf{G}_2}$ holds. All these results are in line with Proposition 2.
Table 1. Comparison among the means of the item parameter estimates of the BLIM in the 9 conditions of the study. In the table, $p$ is the proportion used to create the two groups below and above the cutoff; $\bar{\beta}_{\mathbf{G}_1}$ and $\bar{\beta}_{\mathbf{G}_2}$ are the means of the careless error estimates in the groups below and above the cutoff, respectively; $\bar{\eta}_{\mathbf{G}_1}$ and $\bar{\eta}_{\mathbf{G}_2}$ are the means of the lucky guess estimates in the groups below and above the cutoff, respectively.
Discussion

The BLIM’s parameter invariance assumption says that the probability of a careless error or a lucky guess on an item does not depend on the student’s knowledge state. In de Chiusole, Stefanutti, Anselmi, and Robusto (2013), two methods for testing this assumption were presented and discussed. The former consists in comparing the BLIM with other models, called bipartition models (BPMs), in which the invariance assumption is explicitly violated; if the comparison favors a BPM, then the conclusion is that invariance is violated. The latter, inspired by the IRT literature, consists in partitioning the observed data set into two groups (one containing all patterns below a certain cutoff, and one containing all patterns above it), fitting the model in each of them, and applying some statistical test to evaluate the difference between the parameter estimates of the two groups. If the test is statistically significant, then parameter invariance is violated. This second method, called the restricted naïve test, does not work properly, because it leads to biased parameter estimates in both groups. Indeed, this bias is a direct effect of the manipulations introduced to partition the data into the two groups, and says nothing about possible departures of the data from the parameter invariance assumption.
In the present work, the analysis was extended to a more general method for constructing the two groups. The groups are formed by choosing a proportion $p^{\uparrow}$ of the patterns above a certain cutoff $c$, and a proportion $p^{\downarrow}$ of the patterns below $c$. Theoretical results, simulations, and an empirical application showed that the general method also suffers from the same problems as the restricted naïve test. Again, the manipulations of the data that one implements for setting up the two groups lead to biased parameter estimates that have nothing to do with violations of the invariance assumption. Given these observations, the only method currently available for testing the BLIM’s invariance is the comparison with bipartition models.