1. Introduction: The Main Argument in Brief
The aim of this article is to challenge the view that Kochen-Specker (KS) arguments provide an algebraic proof for quantum contextuality if noncontextuality is interpreted as the robustness of a system’s response to a measurement against other simultaneous measurements.
To begin, it is worth distinguishing KS arguments from KS theorems. KS theorems are simply mathematical theorems in the form of a coloring problem, while KS arguments are physical arguments devised to prove that quantum mechanics (QM) is contextual. The KS theorems start from a family of self-adjoint operators arranged on a hypergraph (a generalization of a graph where an edge can connect any number of vertices) such that the subsets of mutually commuting operators define the hyperedges (nonempty subsets of vertices) of the hypergraph (see, e.g., Abramsky and Brandenburger 2011; Cabello, Severini, and Winter 2014; Acín et al. 2015). Two examples of such a hypergraph are the Greenberger, Horne, and Zeilinger (GHZ) graph (fig. 1a) and the Peres-Mermin graph (fig. 1b). Here each hyperedge is depicted by an unbroken line connecting four collinear vertices on the GHZ graph and three collinear vertices on the Peres-Mermin graph.

Figure 1. a, GHZ graph; b, Peres-Mermin graph.
Next, one introduces value assignments on the graph, that is, functions assigning to each vertex one of the eigenvalues of the operator represented by the vertex in every quantum state. Since the operators are typically projections or contractions, the assignments generally yield the numbers 0, +1, and −1. The value assignments are, however, constrained by the so-called functional composition principle (FUNC; see Redhead 1989, 121; Held 2018, sec. 4) requiring that if the operators on a given hyperedge stand in a certain functional relation to one another, then the values assigned to the operators should also stand in the same functional relation in every quantum state.¹ In the case of the GHZ graph, for example, the product of the operators on every hyperedge is the unit operator $\mathbf{1}$, except for the horizontal hyperedge, where the product is $-\mathbf{1}$. In the case of the Peres-Mermin graph the product of the operators on every hyperedge is $\mathbf{1}$, except for the third vertical hyperedge, where it is $-\mathbf{1}$. Since the eigenvalues of each operator on both graphs are ±1, FUNC allows for only such value assignments for which the product of the assigned numbers on every hyperedge equals the product of the operators (i.e., +1 or −1) on that hyperedge. It is easy to show that there is no such value assignment on the two graphs in figure 1. More generally, KS theorems provide complex hypergraphs of operators such that there is no value assignment on the graph respecting FUNC. Some KS theorems work only in specific quantum states; others, across all states. Thus, one can differentiate state-dependent and state-independent (algebraic) KS theorems.
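The nonexistence claim for the GHZ graph can be checked by brute force. The following is a minimal sketch, not the author's own procedure. It assumes the usual pentagram incidence structure for the GHZ graph (five hyperedges of four vertices each, every vertex lying on exactly two hyperedges), so that the ten vertices can be labeled by pairs of hyperedges. The names `LINES`, `VERTICES`, and `REQUIRED` are hypothetical, and line 0 stands in for the horizontal hyperedge.

```python
from itertools import combinations, product

# ASSUMPTION: pentagram incidence structure. Five hyperedges ("lines");
# each vertex lies on exactly two lines, so a vertex is an unordered pair.
LINES = range(5)
VERTICES = list(combinations(LINES, 2))  # ten vertices

# Required product of values on each line: -1 on line 0 (standing in for
# the horizontal hyperedge), +1 on the other four lines.
REQUIRED = {0: -1, 1: +1, 2: +1, 3: +1, 4: +1}


def respects_func(values):
    """Check that the product of the values on every line equals REQUIRED."""
    for line in LINES:
        prod = 1
        for vertex in VERTICES:
            if line in vertex:
                prod *= values[vertex]
        if prod != REQUIRED[line]:
            return False
    return True


def count_valid_assignments():
    """Enumerate all 2**10 assignments of +1/-1 and count those respecting FUNC."""
    count = 0
    for signs in product((+1, -1), repeat=len(VERTICES)):
        if respects_func(dict(zip(VERTICES, signs))):
            count += 1
    return count


print(count_valid_assignments())  # prints 0
```

The result is a parity effect: every vertex enters exactly two line products, so the product of all five line products is always +1, whereas the required products multiply to −1.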
To proceed from a KS theorem to a KS argument, one needs to provide a physical interpretation for the KS graph. To this end, one first assumes that QM admits an ontological (hidden variable) model. In other words, one assumes that the quantum states are simply distributions of underlying (dispersion-free) ontic states. Next, one associates the operators with observables and measurements. Measurements are “lists of instructions to be implemented in the laboratory” (Spekkens 2005, 2), and observables are physical magnitudes that characterize a given quantum system. In a value-definite (deterministic) ontological model, each observable has a well-defined value in every ontic state. Each observable is also associated with a measurement (procedure) such that the outcome of the measurement reveals (faithfully) the value of the observable. Furthermore, each observable 𝒜 and the corresponding measurement a are represented by a self-adjoint operator such that the values of the observable and the outcomes of the measurement are just the eigenvalues of the operator. The exact nature of these associations will be examined below. Finally, one interprets the quantum probability of an operator’s spectral projection associated with a given eigenvalue as the probability of the corresponding observable having the value associated with that eigenvalue and also as the conditional probability of the outcome associated with that value provided the corresponding measurement is performed.
On this interpretation each value assignment on a KS graph represents a possible distribution of values in a given ontic state that the observables associated with the operators on the graph can take and that the corresponding measurements reveal. The constraint FUNC is justified as follows. Mutually commuting operators on a hyperedge have common eigenstates. If one prepares the system in one of these eigenstates, then the functional relationship between the operators will be realized as the functional relationship between the outcomes of the corresponding measurements and also between the values of the associated observables. Note that to justify FUNC in an eigenstate, the measurements need not be comeasurable (simultaneously measurable). But what justifies FUNC in a general quantum state? Here one can come up with three answers.
First, one can say that any ontic state featuring in the support of a general quantum state must also show up in the support of at least one eigenstate.² This answer, however, is not very appealing. After all, why should every quantum state be composed of the same ontic states as the eigenstates are?
Second, one can say that the mutually commuting operators of the graph represent simultaneous measurements $\{a_i\}$, and on performing these joint measurements one can directly observe the functional relationship in question between the joint measurement outcomes and hence (assuming faithful measurement) between the values of the observables. Note that simultaneous measurements are understood here in the very physical sense, namely, as measurements that can jointly be performed at the same time on the same system. Also note that, although simultaneous measurements get represented in QM by commuting operators, the converse is not true: from the mathematical fact that certain measurements are represented by commuting operators, it does not follow that these measurements can be simultaneously performed. We come back to this important point below.
Third, one can refer to the mathematical fact that for every set of mutually commuting operators $\{\hat{A}_i\}$ sitting on a hyperedge there is an operator $\hat{B}$ and functions $\{f_i\}$ such that $\hat{A}_i = f_i(\hat{B})$. Thus, one can say that there is only one single observable $\mathcal{B}$ with a corresponding measurement $b$, and the set $\{\hat{A}_i\}$ of mutually commuting operators simply represents the different functions $\{f_i(\mathcal{B})\}$ of this very observable. Consequently, FUNC holds trivially: it simply expresses the functional relationship among the different functions of the outcomes of $b$. Note that in this case the measurements $\{f_i(b)\}$ associated with $\{\hat{A}_i\}$ can be called “simultaneously measurable” only metaphorically since one performs only one single measurement, namely, $b$, and applies the functions to the outcome.
Now we show that these latter two justifications of FUNC lead to two different realizations of a KS graph. To reduce metaphysics and to get closer to the experimental testability, we eliminate the concept of observable from the discussion and adopt an operational approach relying purely on operators and measurements. We call an association of the operators of a KS graph with measurements a realization of the graph. A realization is unique if each operator on the graph is associated with only one measurement and nonunique if some operators are associated with more than one measurement. A measurement associated with an operator is said to be realizing the operator. Now, in the third way of justifying FUNC above, a set of operators sitting on a hyperedge is realized by one single measurement $b$ since the functions $f_i$ applied to the measurement $b$ are represented by the operators $f_i(\hat{B})$. Call a realization hyperedge based if there is at least one hyperedge on the graph that is realized by (different functions of) one single measurement.
In a unique realization of the Peres-Mermin graph, for example, one has nine different measurements associated with the nine vertices (operators) of the graph. In a (maximally) hyperedge-based realization of the same graph, one has only six measurements associated with the six hyperedges (three rows and three columns) of the graph. Can this latter realization be unique? No, it cannot, as the following simple lemma shows:
Lemma. A hyperedge-based realization in which all sets of mutually commuting operators represent simultaneous measurements cannot be unique.
Proof. Let $\hat{A}$ be an operator sitting at the intersection of two hyperedges such that all operators (among them $\hat{A}$) on the one hyperedge are realized by a measurement $b$. Suppose a contrario that $\hat{A}$ is realized only by $b$. Now, since mutually commuting operators represent simultaneous measurements, the measurements realizing the operators on the other hyperedge must be comeasurable with at least one measurement realizing $\hat{A}$. But there is only one measurement realizing $\hat{A}$, namely, $b$. Therefore, the measurements realizing the operators on the other hyperedge are comeasurable with $b$. But then all operators on the two hyperedges either represent functions of $b$ or measurements that are comeasurable with $b$. Assuming that simultaneous measurements get represented by commuting operators, this means that all operators on both hyperedges commute. Contradiction. Consequently, $\hat{A}$ cannot be realized only by $b$. QED
That is, a realization of a KS graph where all sets of mutually commuting operators are realized by simultaneous measurements but some such sets are realized by one single measurement cannot be unique. In other words, only the above second justification of FUNC can lead to a unique realization; the third justification always leads to a nonunique realization.
To avoid the no-go result of the KS argument, unique and nonunique realizations follow different strategies. On a unique realization one blocks the argument by assuming that at least one measurement (associated with an operator sitting at the intersection of two hyperedges) can have different outcomes in an ontic state depending on whether it is simultaneously performed with measurements represented by operators on one or the other hyperedge. On a nonunique realization, however, the argument can also be blocked by assuming that different measurements represented by the same operator (at the intersection of two hyperedges) can have different outcomes in a given ontic state.
These two strategies for avoiding the no-go result represent two different interpretations of (non)contextuality. On the first interpretation, noncontextuality is the independence of the outcome of a measurement in every ontic state from other measurements it is simultaneously measured with. On the second interpretation, noncontextuality is a perfect correlation in every ontic state between the outcomes of two different measurements represented by the same operator.³ Note that the two interpretations are different and logically independent.
Historically, the first interpretation of noncontextuality goes back to Bell, and the second interpretation, to van Fraassen. (For a historical survey of the notion of contextuality, see Hofer-Szabó [2021c].) Bell interprets noncontextuality as the “measurement of an observable must yield the same value independently of what other measurements may be made simultaneously” (1966/2004, 9). Van Fraassen’s contextuality, however, is based on the insight that “two observables [a and b] are statistically equivalent if they have the same probability distribution. … In that case they are represented in physics by the same Hermitean operator. … But that does not mean that [$a = b$]” (1979, 158). In other words, two observables can be represented by the same self-adjoint operator without being the same. But then, one is not forced to assign the same value to them. Redhead (1989, 135) calls this fact ontological contextuality.
Many authors working in the operational approach (e.g., Spekkens 2005; Hermens 2011; Leifer 2014) follow this second interpretation. Spekkens, for example, writes: “A noncontextual ontological model of an operational theory is one wherein if two experimental procedures are operationally equivalent [i.e., they are represented by the same self-adjoint operator], then they have equivalent representations in the ontological model” (2005, 1). There are also experiments devised to test noncontextuality in this second sense (Mazurek et al. 2016). The general idea behind this understanding of noncontextuality, once again, is that if two measurements, even if they are not simultaneous, are represented by the same self-adjoint operator (which, as van Fraassen rightly says, empirically just means that the outcome statistics of the two measurements are the same), then it is rational to assume that in every ontic state the outcomes (or more generally, the probability distributions of the outcomes) of the two measurements are also the same.
I do not doubt that this is a reasonable requirement on an ontological model.⁴ I think, however, that this requirement is more closely related to the special way in which QM is representing the conditional probabilities and much less to the very concept of contextuality. If outcomes of different measurements (defined via different “lists of laboratory instructions”) are represented by the same projection, as happens in QM, then there might indeed seem to be a need for the “context” to dismantle what was put together by the representation. But this contextuality is simply the consequence of a special representation that does not discriminate mathematically between that which is different physically, namely, the outcomes of different measurements. Had this difference been respected by the representation, ontological contextuality would not arise.
If one relies, however, on the everyday usage of the term, then “context” refers simply to the circumstances in which a certain event, observation, or measurement occurs. These circumstances are not constitutive in the definition of the very event or measurement but can significantly influence the occurrence of the event or the result of the measurement. The important aspect of these circumstances, however, is that they are simultaneously present with the event or measurement. A possible context for a measurement in physics is another measurement that is performed simultaneously with the one in question. (A nonsimultaneous measurement cannot provide such a context since it lives in another possible world.) In this sense noncontextuality refers to a kind of robustness of the definite response to a measurement on a given system, with respect to simultaneous measurements that are also performed on the system. I will refer to this kind of noncontextuality as simultaneous noncontextuality. If we understand noncontextuality in this way, we just arrive at the above first interpretation of noncontextuality.
I have no objection against using noncontextuality in the second sense as Spekkens and many others use it. However, in this article I will use noncontextuality exclusively in the first sense (i.e., as simultaneous noncontextuality) and refer to the second one as Spekkens’s condition. My aim is to explore whether the KS arguments can prove that QM is contextual in the first sense. The challenge is then to construct (1) a unique realization for a KS graph, that is, to associate each operator of the graph with a different measurement such that (2) mutually commuting operators represent simultaneous measurements. We stress that points 1 and 2 are both important. Mutually commuting operators must represent simultaneous measurements, otherwise FUNC, on which the whole KS theorem is based, will not be physically justified. And the realization must be unique since nonunique realizations realizing certain operators by more than one measurement need to invoke noncontextuality in the second sense, that is, Spekkens’s condition. By abandoning Spekkens’s condition (i.e., by allowing the system to respond differently to different measurements represented by the same operator) one can always block the KS argument. In short, simultaneous measurability and unique realization are both sine qua non in proving quantum contextuality.⁵
I will proceed as follows. First, I introduce the framework of operational theories (sec. 2) and ontological (hidden variable) models (sec. 3) and define (simultaneous) noncontextuality (sec. 4). Then, I accommodate QM in this framework (sec. 5); pick a simple example, the Peres-Mermin square (sec. 6); clarify what operational theories would realize it (sec. 7); and show that the standard spin measurement realization does not do the job (sec. 8). Next, I categorize KS arguments into three types (sec. 9), investigate the GHZ argument as an argument of type II (sec. 10), and show that arguments of type III can be effective only if they switch to nonunique realization (sec. 11) and if they assume Spekkens’s condition (sec. 12). Using a simple toy model, I compare Spekkens’s condition and noncontextuality (sec. 13). Finally, I contrast the KS arguments with the Bell-type arguments (sec. 14).
2. Operational Theories
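An operational theory is a physical theory specifying the probability of the outcomes of some measurements performed on a physical system prepared previously in certain states. Let $s \in S$ be the possible states or preparations of the system under investigation. Let $a, b, \dots \in M_b$ be the basic measurements that can be performed on the system yielding the outcomes $A_i$, $B_j$, … ($i \in I$, $j \in J$, …), respectively. Suppose that the measurements are repeatable and we perform them many times and obtain stable long-run relative frequencies for the outcomes in each state:

$$\frac{\#(A_i \wedge a \wedge s)}{\#(a \wedge s)}, \quad \frac{\#(B_j \wedge b \wedge s)}{\#(b \wedge s)}, \quad \dots$$

Here $\#(\cdot)$ stands for the number of runs of the given type. These relative frequencies allow us to introduce the conditional probabilities of obtaining certain outcomes given that the system has been prepared in certain states and the appropriate measurements have been performed:

$$p(A_i \mid a \wedge s), \quad p(B_j \mid b \wedge s), \quad \dots$$

We call a state $s$ an eigenstate of the measurement $a$ if

$$p(A_i \mid a \wedge s) \in \{0, 1\} \quad \text{for all } i \in I.$$

If two measurements, say $a$ and $b$, can be jointly or simultaneously performed, then the joint frequencies

$$\frac{\#(A_i \wedge B_j \wedge a \wedge b \wedge s)}{\#(a \wedge b \wedge s)}$$

are also well defined, which allows us to introduce the joint conditional probabilities:

$$p(A_i \wedge B_j \mid a \wedge b \wedge s).$$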
Jointly or simultaneously performable measurements are also called comeasurable.
Whether two measurements are comeasurable is a physical question. One can measure the width and the length of a table at the same time. But one cannot jointly check—using Arthur Fine’s example—whether a given piece of wood is combustible and whether it can float on water. The two measurements cannot be simultaneously performed; you cannot burn the piece of wood while in water. Similarly, you are not going to burn the piece of wood along with throwing it in water—unless you want to test whether the ash floats.
Let $M$ denote the set of all measurements (basic and joint) physically performable on a system, and let the variables $x$, $y$ range over the measurements in $M$. The outcomes of $x$ and $y$ are denoted by $X_k$ and $Y_l$ ($k \in K$, $l \in L$), respectively, and the set of outcomes of all measurements is denoted by $O$. Similarly, let the variable $r$ range over the preparations $s, s', \dots \in S$ of the system. An operational theory is then given by a set of conditional probabilities of the outcomes for the various basic and joint measurements in the various preparations:

$$p(X_k \mid x \wedge r), \qquad x \in M,\ r \in S, \tag{2}$$

which add up to 1 if we sum up for $k$.
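Measurements that are not jointly measurable are not to be conflated with disturbing measurements. Consider the following example. In the army one performs two tests: a shooting test ($a$) and tightrope walking ($b$). The two tests are jointly measurable; soldiers can well walk on a thin rope and shoot at the same time. However, their performance in shooting is heavily influenced by whether they are walking on a rope while shooting. Thus, two simultaneous measurements $a$ and $b$ are called nondisturbing if

$$p(A_i \mid a \wedge b \wedge s) = p(A_i \mid a \wedge s), \tag{3}$$

$$p(B_j \mid a \wedge b \wedge s) = p(B_j \mid b \wedge s), \tag{4}$$

where the left-hand sides are the marginals of the joint conditional probabilities, for example, $p(A_i \mid a \wedge b \wedge s) = \sum_j p(A_i \wedge B_j \mid a \wedge b \wedge s)$.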
For space-like separated measurements, no disturbance is equivalent to no signaling.
A nondisturbing operational theory can be characterized in the following compact way. First, note that there is a natural partial ordering on the measurements of an operational theory that expresses “how joint” the measurements are: $a \wedge b$ is “more joint” than $a$ or $b$. Call the set of basic measurements $\{a, b, \dots\}$ the basis of a measurement $x$ if $x = a \wedge b \wedge \dots$. Now, for two measurements $x, y \in M$ let $x \le y$ if the basis of $x$ is contained in or equal to the basis of $y$. Using this partial ordering, an operational theory is nondisturbing if

$$p(X_k \mid x \wedge r) = p(X_k \mid y \wedge r) \quad \text{for all } x \le y \text{ and } r \in S. \tag{5}$$

Denote by $M_m$ the set of maximally joint measurements, that is, the set of measurements $x$ for which there is no other measurement $y$ such that $x \le y$. For a nondisturbing operational theory it is enough to specify the conditional probabilities (2) for all $x \in M_m$; all other conditional probabilities will then be set by (5).
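The difference between comeasurable and nondisturbing measurements can be illustrated with the army example. The sketch below uses invented numbers (all probabilities and the names `SHOOT_ALONE`, `ROPE_ALONE`, `JOINT_DISTURBING`, and `JOINT_NONDISTURBING` are hypothetical). It checks whether the marginals of a joint distribution coincide with the distributions obtained when each test is performed alone, which is what no-disturbance requires.

```python
from fractions import Fraction as F


def marginals(p_joint):
    """Marginalize a joint distribution {(A, B): prob} onto A and onto B."""
    p_first, p_second = {}, {}
    for (first, second), prob in p_joint.items():
        p_first[first] = p_first.get(first, 0) + prob
        p_second[second] = p_second.get(second, 0) + prob
    return p_first, p_second


def is_nondisturbing(p_a_alone, p_b_alone, p_joint):
    """True if both marginals of the joint measurement equal the solo statistics."""
    marg_a, marg_b = marginals(p_joint)
    return marg_a == p_a_alone and marg_b == p_b_alone


# Hypothetical statistics of each test performed alone.
SHOOT_ALONE = {"hit": F(9, 10), "miss": F(1, 10)}
ROPE_ALONE = {"stay": F(4, 5), "fall": F(1, 5)}

# Hypothetical joint statistics: shooting suffers on the rope (disturbing).
JOINT_DISTURBING = {
    ("hit", "stay"): F(2, 5), ("hit", "fall"): F(1, 10),
    ("miss", "stay"): F(3, 10), ("miss", "fall"): F(1, 5),
}

# Hypothetical joint statistics whose marginals match the solo statistics.
JOINT_NONDISTURBING = {
    ("hit", "stay"): F(18, 25), ("hit", "fall"): F(9, 50),
    ("miss", "stay"): F(2, 25), ("miss", "fall"): F(1, 50),
}

print(is_nondisturbing(SHOOT_ALONE, ROPE_ALONE, JOINT_DISTURBING))     # False
print(is_nondisturbing(SHOOT_ALONE, ROPE_ALONE, JOINT_NONDISTURBING))  # True
```

In both cases the two tests are comeasurable, since a joint distribution exists; only the second joint distribution is nondisturbing.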
3. Ontological Models
The role of an ontological model (hidden variable model; cf. Spekkens 2005) is to account for the conditional probabilities of an operational theory in terms of underlying realistic entities called ontic states (hidden variables, elements of reality, beables). An ontological model defines the preparations of the system in terms of distributions over the ontic states and specifies the response of the system to the different measurements in the different ontic states in terms of the so-called response functions. The ontological model is successful if the conditional probabilities of the operational theory can be recovered in terms of these distributions and response functions.
Mathematically, the provision of an ontological model starts with the specification of the set Λ of ontic states and a variable λ running over Λ. To make things simple we assume that Λ is countable.⁶ Next, we associate with each preparation a probability distribution over the ontic states,

$$p(\lambda \mid r), \tag{6}$$

and to each measurement and ontic state a set of response functions, that is, a set of conditional probabilities,

$$p(X_k \mid x \wedge \lambda), \tag{7}$$

again with the obvious normalization.
One can also impose two natural screening-off conditions expressing the independence of the preparations, measurements, and ontic states. The first screening-off condition, called no conspiracy, requires that the probability distributions do not depend causally, and hence probabilistically, on the measurements performed on the system:

$$p(\lambda \mid x \wedge r) = p(\lambda \mid r). \tag{8}$$

The second screening-off condition, called λ sufficiency, requires that the response functions do not depend on the preparations in which the ontic states are featuring:

$$p(X_k \mid x \wedge \lambda \wedge r) = p(X_k \mid x \wedge \lambda). \tag{9}$$

By means of (8) and (9) and using the theorem of total probability, one obtains

$$p(X_k \mid x \wedge r) = \sum_{\lambda \in \Lambda} p(X_k \mid x \wedge \lambda)\, p(\lambda \mid r). \tag{10}$$
That is, one recovers the operational theory from the ontological model in terms of the probability distributions and response functions.
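A toy calculation may make the recovery step concrete. The sketch below is purely illustrative: the two ontic states, the two preparations, the single measurement `"x"`, and all numbers are hypothetical. It computes each operational probability as the sum over ontic states of the response function weighted by the preparation's distribution.

```python
from fractions import Fraction as F

# Hypothetical distributions over two ontic states for two preparations.
DISTRIBUTIONS = {
    "s1": {"lambda1": F(1, 4), "lambda2": F(3, 4)},
    "s2": {"lambda1": F(1), "lambda2": F(0)},
}

# Hypothetical response functions for one measurement "x" with outcomes +1/-1.
# lambda1 is value definite for x; lambda2 is probabilistic.
RESPONSES = {
    ("x", "lambda1"): {+1: F(1), -1: F(0)},
    ("x", "lambda2"): {+1: F(1, 3), -1: F(2, 3)},
}


def operational_probability(outcome, measurement, preparation):
    """Total probability: sum over ontic states of response times weight."""
    return sum(
        RESPONSES[(measurement, lam)][outcome] * weight
        for lam, weight in DISTRIBUTIONS[preparation].items()
    )


for prep in DISTRIBUTIONS:
    print(prep, [operational_probability(out, "x", prep) for out in (+1, -1)])
```

With these numbers, preparation `s1` yields each outcome with probability 1/2, while `s2` yields +1 with certainty, so `s2` behaves as an eigenstate of `x`.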
An ontic state λ is called value definite with respect to a measurement $x$ if

$$p(X_k \mid x \wedge \lambda) \in \{0, 1\} \quad \text{for all } k; \tag{11}$$

otherwise, it is called probabilistic. Note that one and the same λ can be value definite for one measurement and probabilistic for another. An ontological model is called value definite if (11) holds for all $\lambda \in \Lambda$ and $x \in M_m$; otherwise, it is called probabilistic.
4. Noncontextuality
Ontological models, both value definite and probabilistic, trivially exist for an operational theory if no further constraints are put on them. But now we require that the ontological model is noncontextual.
An ontological model is (simultaneous) noncontextual if every ontic state determines the probability of the outcomes of every measurement independently of what other measurements are simultaneously performed; otherwise, it is contextual.
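(Simultaneous) noncontextuality can be formally expressed as follows:

$$p(X_k \mid x \wedge \lambda) = p(X_k \mid y \wedge \lambda) \quad \text{for all } x \le y \text{ and } \lambda \in \Lambda. \tag{12}$$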
In other words, each ontic state uniquely determines the probability of all outcomes of a given measurement irrespective of what other measurements are comeasured. A specific consequence of (12) is that the conditional probabilities of all basic measurements will be fixed irrespective of what other measurements they are comeasured with.
Observe that noncontextuality (12) is almost the same requirement as no-disturbance (5), except that the latter is required for the preparations while the former is required for the ontic states.⁷ Consequently, noncontextuality provides a neat explanation for why an operational theory is nondisturbing: if an ontological model for an operational theory satisfies noncontextuality (12) (and also no-conspiracy [8] and λ-sufficiency [9]), then the operational theory will satisfy no-disturbance (5). Hence, the assumption of noncontextuality is a kind of inference to the best explanation for the nondisturbing character of an operational theory.
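The explanatory step can be spelled out in a few lines. The following is a sketch in notation assumed here: $p(X_k \mid x \wedge \lambda)$ for the response functions, $p(\lambda \mid r)$ for the distributions over ontic states, and $x \le y$ for the case in which $y$ comeasures $x$ with further measurements. No conspiracy and λ sufficiency enter only through the total-probability decomposition.

$$
\begin{aligned}
p(X_k \mid x \wedge r) &= \sum_{\lambda \in \Lambda} p(X_k \mid x \wedge \lambda)\, p(\lambda \mid r) \\
&= \sum_{\lambda \in \Lambda} p(X_k \mid y \wedge \lambda)\, p(\lambda \mid r) \\
&= p(X_k \mid y \wedge r).
\end{aligned}
$$

The first and third equalities are the total-probability decomposition applied to $x$ and to $y$, respectively; the second is noncontextuality applied state by state. The converse does not follow: equality of the weighted sums does not force equality of the summands.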
Some notes are in order here.
i) Noncontextuality (12) is a generalization of Shimony’s (1986) parameter independence for situations when the simultaneous measurements are not necessarily space-like separated.
ii) If a value-definite ontological model is noncontextual, then (11) will hold for all $x \in M$ (and not just for $x \in M_m$).
iii) Noncontextuality of an ontological model does not generally imply factorization:

$$p(X_k \wedge Y_l \mid x \wedge y \wedge \lambda) = p(X_k \mid x \wedge \lambda)\, p(Y_l \mid y \wedge \lambda). \tag{13}$$

But it does if the ontological model is value definite.
iv) Noncontextuality as defined in (12) resembles the concept of noncontextuality of Simon, Brukner, and Zeilinger (2001) but differs from that of Spekkens (2005) and other operationalists. Below I refer to this latter concept as “Spekkens’s condition.”
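Note iii can be illustrated with a toy response function. The sketch below is an illustration under invented numbers, not taken from the article: a single ontic state responds to the joint measurement of two hypothetical measurements with perfectly correlated outcomes, while the responses to each measurement alone equal the joint marginals. The model is therefore noncontextual but does not factorize.

```python
from fractions import Fraction as F

OUTCOMES = (+1, -1)

# Hypothetical joint response in one ontic state: outcomes perfectly correlated.
JOINT_RESPONSE = {
    (+1, +1): F(1, 2), (+1, -1): F(0),
    (-1, +1): F(0), (-1, -1): F(1, 2),
}

# Hypothetical responses when each measurement is performed alone.
RESPONSE_X_ALONE = {+1: F(1, 2), -1: F(1, 2)}
RESPONSE_Y_ALONE = {+1: F(1, 2), -1: F(1, 2)}


def is_noncontextual():
    """Marginals of the joint response equal the responses of x and y alone."""
    for out in OUTCOMES:
        marg_x = sum(JOINT_RESPONSE[(out, other)] for other in OUTCOMES)
        marg_y = sum(JOINT_RESPONSE[(other, out)] for other in OUTCOMES)
        if marg_x != RESPONSE_X_ALONE[out] or marg_y != RESPONSE_Y_ALONE[out]:
            return False
    return True


def factorizes():
    """Joint response equals the product of the single-measurement responses."""
    return all(
        JOINT_RESPONSE[(j, k)] == RESPONSE_X_ALONE[j] * RESPONSE_Y_ALONE[k]
        for j in OUTCOMES for k in OUTCOMES
    )


print(is_noncontextual(), factorizes())  # True False
```

If all responses are restricted to 0 and 1, the marginals fix the joint response uniquely, which is why factorization follows in the value-definite case.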
5. Quantum Mechanical Representation
On the minimal interpretation QM is an operational theory that provides conditional probabilities for the outcomes of different measurements in different states. Thus, the empirical content of QM could be expressed simply by listing the various conditional probabilities. However, in the standard formalism these conditional probabilities get represented in a linear algebraic fashion. The physical system is associated with a Hilbert space; each state $r$ is represented by a density operator $\hat{\rho}_r$; each measurement $x \in M$, by a self-adjoint operator $\hat{X}$; and the outcome $X_k$ of $x$, by the orthogonal spectral projection $\hat{P}^{X}_{k}$ of $\hat{X}$ with eigenvalue $X_k$. The representation is connected to experience by the Born rule:

$$p(X_k \mid x \wedge r) = \mathrm{Tr}\big(\hat{\rho}_r \hat{P}^{X}_{k}\big), \tag{14}$$

where Tr is the trace function.
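Now, if $a$ and $b$ are comeasurable, then $a \wedge b$ gets represented in QM by commuting operators $\hat{A}$ and $\hat{B}$. But if $\hat{A}$ and $\hat{B}$ are commuting, then $a$ and $b$ will turn out to be nondisturbing:

$$p(A_i \mid a \wedge b \wedge s) = \sum_j \mathrm{Tr}\big(\hat{\rho}_s \hat{P}^{A}_{i} \hat{P}^{B}_{j}\big) = \mathrm{Tr}\big(\hat{\rho}_s \hat{P}^{A}_{i}\big) = p(A_i \mid a \wedge s),$$

and similarly for $B_j$. Thus, the quantum mechanical representation of joint measurements implies that QM cannot represent comeasurable but disturbing measurements. In other words, only nondisturbing operational theories can have a quantum mechanical representation.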
Because it is an operational theory, one can search for an ontological model for QM. The KS arguments are intended to rule out such an ontological model if it is both value definite and noncontextual.⁸ In the following sections I pick a special KS theorem, the Peres-Mermin square (Peres 1990; Mermin 1993), and investigate whether it can be given a unique realization, that is, an operational theory composed of nine simultaneous measurements that does not admit a value-definite, noncontextual ontological model.
6. An Example: The Peres-Mermin Square
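Consider the following 3 × 3 matrix of self-adjoint operators:

$$\begin{pmatrix}
\sigma_z \otimes \mathbf{1} & \mathbf{1} \otimes \sigma_z & \sigma_z \otimes \sigma_z \\
\mathbf{1} \otimes \sigma_x & \sigma_x \otimes \mathbf{1} & \sigma_x \otimes \sigma_x \\
\sigma_z \otimes \sigma_x & \sigma_x \otimes \sigma_z & \sigma_y \otimes \sigma_y
\end{pmatrix}$$

where $\sigma_x$, $\sigma_y$, and $\sigma_z$ are the Pauli operators and $\mathbf{1}$ is the unit operator on the two-dimensional complex Hilbert space. The operators in the matrix are arranged in such a way that two operators are commuting if and only if they are in the same row or in the same column. Each operator in the matrix has two eigenvalues, ±1. Denote the spectral projections of the operators $\sigma_z \otimes \mathbf{1}$, $\mathbf{1} \otimes \sigma_z$, $\sigma_z \otimes \sigma_z$, … associated with the eigenvalues ±1 by $\hat{P}^{\sigma_z \otimes \mathbf{1}}_{\pm}$, $\hat{P}^{\mathbf{1} \otimes \sigma_z}_{\pm}$, $\hat{P}^{\sigma_z \otimes \sigma_z}_{\pm}$, …, respectively. Let the variables $\hat{X}$, $\hat{Y}$, and $\hat{Z}$ range over the operators of the Peres-Mermin square. Denote the spectral projections of $\hat{X}$, $\hat{Y}$, and $\hat{Z}$ by $\hat{P}^{X}_{j}$, $\hat{P}^{Y}_{k}$, and $\hat{P}^{Z}_{l}$ ($j, k, l = \pm 1$), respectively. The set of states $S$ is represented by the set of density operators on the four-dimensional complex Hilbert space $\mathbb{C}^2 \otimes \mathbb{C}^2$ (which also include the common eigenstates for each subset of mutually commuting operators).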
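The algebraic properties of the square can be verified numerically. The sketch below is self-contained and uses plain nested lists rather than a linear algebra library. It assumes one standard arrangement of the Peres-Mermin square (an assumption; the arrangement intended in the text may differ by a relabeling of the Pauli operators), and the helper names `kron`, `matmul`, and `scalar_identity` are hypothetical. It checks that two operators commute exactly when they share a row or a column and computes the product of the operators on each row and column.

```python
from itertools import combinations

# Pauli operators and the unit operator on the two-dimensional Hilbert space.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def kron(a, b):
    """Tensor (Kronecker) product of two square matrices given as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scalar_identity(c, n=4):
    """The operator c times the unit operator on an n-dimensional space."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]


# ASSUMPTION: one standard arrangement of the Peres-Mermin square.
SQUARE = [
    [kron(SZ, I2), kron(I2, SZ), kron(SZ, SZ)],
    [kron(I2, SX), kron(SX, I2), kron(SX, SX)],
    [kron(SZ, SX), kron(SX, SZ), kron(SY, SY)],
]
ROWS = [SQUARE[r] for r in range(3)]
COLUMNS = [[SQUARE[r][c] for r in range(3)] for c in range(3)]


def commute(a, b):
    return matmul(a, b) == matmul(b, a)


def commutation_matches_layout():
    """True if two operators commute exactly when they share a row or a column."""
    cells = [(r, c) for r in range(3) for c in range(3)]
    for (r1, c1), (r2, c2) in combinations(cells, 2):
        aligned = r1 == r2 or c1 == c2
        if commute(SQUARE[r1][c1], SQUARE[r2][c2]) != aligned:
            return False
    return True


def line_product(ops):
    """Product of the three operators on a row or a column."""
    return matmul(matmul(ops[0], ops[1]), ops[2])


print(commutation_matches_layout())
print([line_product(row) == scalar_identity(1) for row in ROWS])
print([line_product(col) == scalar_identity(c)
       for col, c in zip(COLUMNS, (1, 1, -1))])
```

All entries involved are 0, ±1, or ±i, so the comparisons are exact and no numerical tolerance is needed.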
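The quantum probabilities for the spectral projections of the three vertical and three horizontal commuting triples of operators are given by the trace formula

$$\mathrm{Tr}\big(\hat{\rho}_s \hat{P}^{X}_{j} \hat{P}^{Y}_{k} \hat{P}^{Z}_{l}\big). \tag{15}$$

Now, it turns out that these quantum probabilities are nonzero only for certain combinations of spectral projections for a given commuting triple (irrespective of the quantum state). More specifically, for the third vertical triple $\{\sigma_z \otimes \sigma_z, \sigma_x \otimes \sigma_x, \sigma_y \otimes \sigma_y\}$ the quantum probabilities are nonzero only for those combinations of projections for which the product of the associated eigenvalues is −1. For the other five triples this product must be +1. That is,

$$\mathrm{Tr}\big(\hat{\rho}_s \hat{P}^{X}_{j} \hat{P}^{Y}_{k} \hat{P}^{Z}_{l}\big) \neq 0 \quad \text{only if} \quad jkl = \begin{cases} -1 & \text{for the third vertical triple,} \\ +1 & \text{for the other five triples.} \end{cases} \tag{16}$$

Note that these admissible combinations of eigenvalues are also associated with the four common eigenstates of the triplet in question.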
Now, these admissible combinations of eigenvalues provide a constraint on the value assignments, that is, on the functions sending each of the nine operators of the Peres-Mermin square to one of its eigenvalues, that is, to ±1. The constraint is that the product of the numbers in each row and column should be +1, except for the third column where it should be −1. It is easy to see that no such value assignment exists.
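The nonexistence of such a value assignment can be confirmed by exhaustive search over all $2^9 = 512$ candidate assignments. The following is a minimal sketch; the function names and the row-by-row encoding of the square as a tuple of nine values are illustrative choices.

```python
from itertools import product

# Required products for the three rows of the square.
ROW_TARGETS = (+1, +1, +1)


def satisfies_constraints(grid, column_targets):
    """grid is a tuple of nine values (+1 or -1), read row by row."""
    rows = [grid[3 * r: 3 * r + 3] for r in range(3)]
    columns = [grid[c::3] for c in range(3)]
    targets = ROW_TARGETS + tuple(column_targets)
    for line, target in zip(rows + columns, targets):
        if line[0] * line[1] * line[2] != target:
            return False
    return True


def count_value_assignments(column_targets):
    """Count the assignments meeting all six row and column constraints."""
    return sum(
        1 for grid in product((+1, -1), repeat=9)
        if satisfies_constraints(grid, column_targets)
    )


# Peres-Mermin constraints: third column must multiply to -1.
print(count_value_assignments((+1, +1, -1)))  # prints 0
# Contrast case: if every product had to be +1, assignments would exist.
print(count_value_assignments((+1, +1, +1)))  # prints 16
```

The contrast case shows that the obstruction lies in the single −1: the product of the three row products and the product of the three column products both equal the product of all nine values, so they cannot differ in sign.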
But does this no-go result prove that QM does not admit a noncontextual value-definite ontological model? Not until the Peres-Mermin square is given a unique physical realization.
7. An Operational Theory Realizing the Peres-Mermin Square
Consider an operational theory with nine basic measurements:

$$\begin{matrix}
a_1 & a_2 & a_3 \\
a_4 & a_5 & a_6 \\
a_7 & a_8 & a_9
\end{matrix}$$
The 3 × 3 matrix in which the measurements are arranged is to express now comeasurability relations: measurements are simultaneously measurable if and only if they are in the same row or in the same column.
Each measurement can have two outcomes, ±1. Let the variables $x$, $y$, and $z$ range over the basic measurements $M_b$. Denote the outcomes of $x$, $y$, and $z$ by $X_j$, $Y_k$, and $Z_l$ ($j, k, l = \pm 1$), respectively. Let the conditional probability of the six different maximally joint measurements be

$$p(X_j \wedge Y_k \wedge Z_l \mid x \wedge y \wedge z \wedge s), \tag{17}$$

where $x$, $y$, and $z$ are the three measurements of a row or a column. Suppose furthermore that the conditional probabilities of all other nonmaximally joint measurements can be obtained from (17) by marginalization. Thus, (17) characterizes a nondisturbing operational theory.
Now, suppose that the operational theory (17) is a physical realization of the Peres-Mermin square in the sense that the quantum probabilities (15) in the Peres-Mermin square represent just the conditional probabilities (17) via the Born rule (14). That is,

$$\mathrm{Tr}\big(\hat{\rho}_s \hat{P}^{X}_{j} \hat{P}^{Y}_{k} \hat{P}^{Z}_{l}\big) = p(X_j \wedge Y_k \wedge Z_l \mid x \wedge y \wedge z \wedge s). \tag{18}$$
Note that (18) is well defined since the operators on the left-hand side are mutually commuting if and only if the represented measurements on the right-hand side are comeasurable. Also note that the operational theory (17) is a unique realization of the Peres-Mermin square, since every operator is associated with a different measurement. As we saw in the introduction, only unique realizations can decide on the status of noncontextuality in QM. (In sec. 11 we will see what nonunique realizations can do.)
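From (16) and (18) it follows that the support of the probability distributions over the outcomes, that is, the set of possible outcomes for each maximally joint measurement and each preparation $s \in S$, is as follows:

$$p(X_j \wedge Y_k \wedge Z_l \mid x \wedge y \wedge z \wedge s) \neq 0 \quad \text{only if} \quad jkl = \begin{cases} -1 & \text{for the third column,} \\ +1 & \text{for the other rows and columns.} \end{cases} \tag{19}$$

That is, the conditional probability is nonzero only for such joint outcomes that contain an odd number of +1s and an even number of −1s in each row and column, except for the last column where the number of +1s is even and the number of −1s is odd.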
Does the operational theory (17) have a noncontextual value-definite ontological model? Assume (contrary to fact) that there is such a model with response functions:Footnote 9

Being noncontextual and value definite, the response functions are factorizing:

for all . Thus, the ontological model can be characterized by the extremal conditional probabilities:

However, the support (19) of the operational theory restricts the possible extremal conditional probabilities. Namely, for any three simultaneous measurements x, y, and z in Mb and , one requires that

Otherwise, there could be some ontic states that, if prepared (i.e., for some preparation), would render at least one conditional probability in (17) nonzero outside the support (19).
However, it is easy to see that there is no such set of conditional probabilities (22) that satisfies (23). This is due to the impossibility of filling in a 3 × 3 matrix with ±1s such that the product of the numbers in each row and column is +1, except for the last column where it is −1: the product of the three row products and the product of the three column products both equal the product of all nine entries, yet the former would have to be +1 and the latter −1. Consequently, the operational theory (17) does not have a noncontextual value-definite ontological model.
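The impossibility can also be confirmed by exhaustive search over all 2^9 fillings of the matrix. The sketch below assumes a row-major encoding of the nine values; the names `edge_products` and `TARGET` are illustrative.

```python
from itertools import product

def edge_products(grid):
    """grid is a tuple of nine values in row-major order; return the
    three row products followed by the three column products."""
    rows = [grid[3 * r] * grid[3 * r + 1] * grid[3 * r + 2] for r in range(3)]
    cols = [grid[c] * grid[c + 3] * grid[c + 6] for c in range(3)]
    return rows + cols

# Required products: +1 everywhere except the last column.
TARGET = [+1, +1, +1, +1, +1, -1]

solutions = [g for g in product((+1, -1), repeat=9)
             if edge_products(g) == TARGET]
print(len(solutions))  # prints 0
```

The empty result is what the parity of the row and column products already guarantees; the search merely makes the claim checkable.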
Let me briefly reflect on the question of experimental testability of the above operational theory. Suppose that in a real experiment the support equation (19) cannot be sharply validated but only up to a fraction ε of all runs. How small should ε be so that a noncontextual value-definite ontological model for the operational theory can still be ruled out?
Suppose a contrario that the ontological model is noncontextual, and it conforms to the measurement statistics as much as possible; that is, for all λ only one of the six constraints (23) is violated. (For example, some λ assigns +1 to all nine measurements, thus violating the constraint of the third column but respecting the other five, etc.) There are six different triply joint measurements (of the three rows and three columns); hence, modulo some conspiracy, there is a 1/6 probability for any λ that a certain joint measurement will pick just that triple for which (23) is violated. Since each such measurement will contribute to the violation of (19), (19) will be violated in one-sixth of all runs. Consequently, if in a real experiment ε is smaller than 1/6, then the experiment will rule out a noncontextual value-definite ontological model for the operational theory.
This argument is a special case of a general argument provided by Simon et al. (2001) and Larsson (2002) in defense of the KS arguments against the so-called finite precision loophole argument of Meyer (1999) and Clifton and Kent (2000). As Barrett and Kent (2004, sec. 4.3) nicely point out, the finite precision loophole is effective only if noncontextuality is defined in terms of operators on a Hilbert space and not operationally in terms of measurements; in short, only if KS arguments are understood as KS theorems. Thus, the finite precision loophole arguments do not nullify the KS arguments based on the above operational theory.
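The one-sixth threshold used in the testability estimate above can be checked numerically. The sketch below assumes, as in that estimate, that the six joint measurements are chosen with equal probability and independently of the ontic state; the names `violations`, `min_violated`, and `epsilon_bound` are illustrative.

```python
from itertools import product
from fractions import Fraction

# Required products of the three rows and three columns.
TARGET = [+1, +1, +1, +1, +1, -1]

def violations(grid):
    """Count how many of the six product constraints a value
    assignment (nine values, row-major) fails to satisfy."""
    rows = [grid[3 * r] * grid[3 * r + 1] * grid[3 * r + 2] for r in range(3)]
    cols = [grid[c] * grid[c + 3] * grid[c + 6] for c in range(3)]
    return sum(p != t for p, t in zip(rows + cols, TARGET))

# Best case for the noncontextual model: as few violations as possible.
min_violated = min(violations(g) for g in product((+1, -1), repeat=9))

# Assumption: uniform, state-independent choice among six joint measurements.
epsilon_bound = Fraction(min_violated, 6)
print(min_violated, epsilon_bound)  # prints: 1 1/6
```

Every value assignment violates at least one constraint, so under the stated assumption at least one-sixth of the runs fall outside the support.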
8. Do Spin Measurements Realize the Peres-Mermin Square?
The only question that remains is thus whether there exists an operational theory physically realizing the Peres-Mermin square. The first idea that comes to mind is the standard spin measurements. Suppose that the operator $\sigma_i \otimes \sigma_j$ ($i, j = 1, 2, 3$) represents the following measurement: first we perform two spin measurements by two Stern-Gerlach magnets on a pair of spin-(1/2) particles, the left magnet pointing in the $i$th and the right magnet in the $j$th of three mutually perpendicular directions, and second we check whether the outcomes of the measurements on the opposite wings are the same (+1) or not (−1). Furthermore, let $\sigma_i \otimes \mathbf{1}$ ($i = 1, 2, 3$) and $\mathbf{1} \otimes \sigma_j$ ($j = 1, 2, 3$) represent that we perform the spin measurement only on the left and right particle, respectively. With a separate symbol for each of these composite and singular spin measurements, the measurements realizing uniquely the Peres-Mermin square read as follows:

Unfortunately, only four of the six commuting subsets of operators represent simultaneous measurements: the first two rows and the first two columns. Measurements in the third row and in the third column are not comeasurable. For example, the measurements c, f, and i in the third column, that is, the spin measurements in the three mutually perpendicular directions, cannot be simultaneously performed: one cannot turn the Stern-Gerlach magnets in all three directions at the same time. Consequently, although the left-hand side of (18) exists, the right-hand side is ill defined for the third column and also for the third row. The quantum probabilities

cannot be interpreted as conditional probabilities

and hence their support is not defined either. So one does not have the constraint


for the ontic states in the third column and third row and hence cannot arrive at the contradiction outlined above. The whole argument collapses. In short, the standard spin measurement does not realize the Peres-Mermin square in the form of an operational theory (17) and consequently does not provide a physical realization for a quantum mechanical scenario for which a noncontextual value-definite ontological model could be ruled out.
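The failure of comeasurability in the third row and third column can be illustrated with a small sketch. It assumes the standard textbook arrangement of the Peres-Mermin square and a hypothetical encoding in which each measurement is a pair (left, right) giving the direction each Stern-Gerlach magnet must point in, or `None` if that wing is not measured. A hyperedge counts as comeasurable here if a single setting per magnet serves all three of its measurements.

```python
# Assumed arrangement (standard Peres-Mermin square, not copied from
# the article): entries are (left direction, right direction).
SQUARE = [
    [(1, None), (None, 1), (1, 1)],
    [(None, 2), (2, None), (2, 2)],
    [(1, 2),    (2, 1),    (3, 3)],
]

def comeasurable(measurements):
    """True if one setting of each magnet suffices for all measurements."""
    for wing in (0, 1):
        needed = {m[wing] for m in measurements if m[wing] is not None}
        if len(needed) > 1:
            return False
    return True

columns = [[SQUARE[r][c] for r in range(3)] for c in range(3)]
report = {f"row {r + 1}": comeasurable(SQUARE[r]) for r in range(3)}
report.update({f"column {c + 1}": comeasurable(columns[c]) for c in range(3)})
print(report)
```

Under this encoding only the first two rows and the first two columns come out comeasurable, in agreement with the count of four out of six given in the text.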
Obviously, the standard realization of the above operators in terms of spin measurements is not the only possible physical realization. One may well come up with another unique realization on which the measurements are comeasurable if and only if the representing operators are commuting. However, I know of no such realization. And the burden of proof is on those who claim that the above arrangement of operators excludes a noncontextual value-definite ontological model for QM. An uninterpreted formalism cannot prove anything about the outer world.Footnote 10
Perhaps it is worth reflecting for a moment on the relation of commutativity and comeasurability (see Park and Margenau 1968). Comeasurability is used in two different meanings in quantum physics. First, two measurements are called comeasurable (compatible, simultaneously measurable) if, performing them one after another, the first measurement does not alter the outcome statistics of the second one. Obviously, this usage of the term “simultaneous” is metaphoric and has no bearing on the KS arguments.
The other meaning is the one we use throughout this article: two measurements are comeasurable if they can physically be performed at the same time on the same system. Note, however, that this notion of comeasurability and the notion of commutativity are not synonymous. From the simple fact that two measurements are represented by commuting operators it does not follow that the measurements are simultaneously performable. Comeasurability is a physical question that cannot simply be read off from the representation of the measurements. Simultaneous measurements get represented in QM by commuting operators. But the converse is not true. Not all commuting operators represent simultaneous measurements. Consider the following three pairs of commuting operators:

where ,
and
,
are spin-1 and spin-(1/2) operators, respectively. Each pair features in one or another renowned KS argument: the first pair in the original Kochen and Specker (1967) argument; the second in Peres’s (1990) and Mermin’s (1993) version and also in Cabello’s (1997) version; and the third in Greenberger, Horne, and Zeilinger’s (1989) version of the argument. However, none of them can be interpreted as operators representing simultaneous spin measurements on pairs or triples of spin-1 or spin-(1/2) particles. But in the absence of a unique realization of a KS graph where commuting operators represent simultaneous measurements, the no-go results do not prove that QM does not admit a noncontextual value-definite ontological model.
How then do the above KS arguments work?
9. Three Types of Kochen-Specker Arguments
To see the problem more clearly, it is worth introducing the following categorization. Suppose we are given a unique realization, that is, a KS graph and an associated operational theory realizing the operators on the graph in a one-to-one manner. Now, one can cast the KS arguments into three types according to the number of subsets of mutually commuting operators (operators on a hyperedge) that do not represent simultaneous measurements in the associated operational theory.
Arguments of type I: All commuting subsets represent simultaneous measurements.
Arguments of type II: All but one commuting subset represents simultaneous measurements.
Arguments of type III: More than one commuting subset does not represent simultaneous measurements.
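The categorization above depends only on how many hyperedges fail to represent simultaneous measurements. A minimal sketch, with a hypothetical function name and hypothetical edge labels:

```python
def ks_argument_type(hyperedges, comeasurable):
    """Classify a uniquely realized KS argument.

    hyperedges   -- labels of all commuting subsets of the KS graph
    comeasurable -- labels of those subsets that represent
                    simultaneous measurements in the realization
    """
    failing = [h for h in hyperedges if h not in comeasurable]
    if len(failing) == 0:
        return "I"
    if len(failing) == 1:
        return "II"
    return "III"

# Peres-Mermin square with the standard spin realization:
# third row and third column are not comeasurable.
pm_edges = ["row1", "row2", "row3", "col1", "col2", "col3"]
print(ks_argument_type(pm_edges, {"row1", "row2", "col1", "col2"}))  # III

# GHZ pentagram: only the horizontal edge fails.
ghz_edges = ["horizontal", "e1", "e2", "e3", "e4"]
print(ks_argument_type(ghz_edges, {"e1", "e2", "e3", "e4"}))  # II
```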
There is a huge difference in the efficacy of the three types of arguments.
It is only KS arguments of type I that provide a state-independent (algebraic) proof for quantum contextuality, since for these arguments FUNC can be physically justified by the probability distribution of the joint outcomes of simultaneous measurements. Unfortunately, I am not aware of any argument of type I. In other words, I am not aware of any unique realization of any KS graph where all commuting subsets of operators would represent simultaneous measurements. Consequently, I am also not aware of any state-independent argument proving quantum contextuality.
KS arguments of type II do exist, but they provide only a state-dependent proof for quantum contextuality. An example is the GHZ argument. I return to this argument in the next section.
Finally, KS arguments of type III abound. The Peres-Mermin square with the standard spin realization is one example: the number of commuting subsets not representing simultaneous measurements is two, namely, the three operators in the third row and the three operators in the third column. Another example of arguments of type III is the original KS graph with 117 vertices with the standard spin realization. Here none of the commuting subsets represents simultaneous measurements since the spin measurements for three orthogonal directions cannot be simultaneously performed. In section 11, I argue that arguments of type III are inconclusive in proving quantum contextuality. To get a contradiction, they need to flip to a nonunique (hyperedge-based) realization and invoke Spekkens’s condition. However, by abandoning Spekkens’s condition the contradiction can be avoided.
10. Kochen-Specker Arguments of Type II
Let us first consider the KS arguments of type II. A prototype of such arguments is the GHZ argument. The GHZ graph (pentagram) reads as follows:

On the standard spin realization of the GHZ graph, all but one subset of the mutually commuting operators can be interpreted as representing simultaneous measurements. Measurements represented by commuting operators on four of the five edges of the GHZ pentagram are comeasurable since they are performed on three space-like separated subsystems. But the measurements represented by the operators on the fifth, horizontal edge are not comeasurable.
How, then, does the KS argument work in the GHZ case? The trick to circumvent the problem of non-comeasurability is to prepare the system in one of the common eigenstates of the measurements on the horizontal edge.Footnote 11 The outcome for each measurement on the horizontal edge will then be fixed even if the measurements are not comeasurable. The product of the possible outcomes of the four different measurements will turn out to be −1 in each common eigenstate. Now, the measurements on the other four lines of the GHZ pentagram are comeasurable, and the product of their possible joint outcomes in all states (among them in the above common eigenstates) will be +1. This means that each ontic state in the support of these common eigenstates needs to assign ±1 to the individual measurements such that the product of these numbers is +1 in each line, except in the horizontal line where it is −1. Such a value assignment, however, is impossible, which rules out a noncontextual value-definite ontological model for the GHZ scenario.
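The two product relations and the impossibility of the value assignment can be verified with a few lines of Pauli-string arithmetic. The sketch assumes the standard labelling of the GHZ pentagram by three-qubit Pauli strings (for example, "XYY" for the product of X on the first particle and Y on the other two); this labelling and all names in the code are assumptions made for illustration.

```python
from itertools import product

# Single-qubit Pauli products: (left, right) -> (phase, result).
TABLE = {("X", "Y"): (1j, "Z"), ("Y", "Z"): (1j, "X"), ("Z", "X"): (1j, "Y"),
         ("Y", "X"): (-1j, "Z"), ("Z", "Y"): (-1j, "X"), ("X", "Z"): (-1j, "Y")}

def mul1(a, b):
    """Product of two single-qubit Paulis given as letters."""
    if a == "I":
        return 1, b
    if b == "I":
        return 1, a
    if a == b:
        return 1, "I"
    return TABLE[(a, b)]

def edge_product(strings):
    """Multiply Pauli strings in order; return (phase, string)."""
    phase, current = 1, "I" * len(strings[0])
    for s in strings:
        letters = []
        for a, b in zip(current, s):
            ph, c = mul1(a, b)
            phase *= ph
            letters.append(c)
        current = "".join(letters)
    return phase, current

# Assumed labelling of the pentagram's five hyperedges.
HORIZONTAL = ["XXX", "XYY", "YXY", "YYX"]
OTHER_EDGES = [["XXX", "XII", "IXI", "IIX"],
               ["XYY", "XII", "IYI", "IIY"],
               ["YXY", "YII", "IXI", "IIY"],
               ["YYX", "YII", "IYI", "IIX"]]

horizontal_is_minus_one = edge_product(HORIZONTAL) == (-1, "III")
others_are_plus_one = all(edge_product(e) == (1, "III") for e in OTHER_EDGES)

# Search all 2^10 assignments of +1/-1 to the ten vertices.
VERTICES = sorted({v for e in [HORIZONTAL] + OTHER_EDGES for v in e})

def respects_products(values):
    val = dict(zip(VERTICES, values))
    def prod(edge):
        p = 1
        for v in edge:
            p *= val[v]
        return p
    return prod(HORIZONTAL) == -1 and all(prod(e) == +1 for e in OTHER_EDGES)

n_assignments = sum(respects_products(v)
                    for v in product((+1, -1), repeat=len(VERTICES)))
print(horizontal_is_minus_one, others_are_plus_one, n_assignments)  # True True 0
```

Since every vertex lies on exactly two hyperedges, the product of all five edge products is +1 for any assignment, whereas the required products multiply to −1; the search returns no assignment for this reason.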
More generally, KS arguments of type II, where all but one set of commuting operators represent simultaneous measurements, are all state-dependent arguments. One needs to prepare the system in one of the common eigenstates of the non-comeasurable measurements to “compensate” for the failure of comeasurability of these measurements. By doing so one obtains the same constraint on the response functions (necessary for deriving the contradiction) as one would obtain if the measurements were comeasurable. But note that these arguments of type II cannot be transformed into a state-independent argument. They work only if the system is prepared in one of the common eigenstates of the operators representing non-comeasurable measurements.
11. Kochen-Specker Arguments of Type III
Finally, let us turn to the KS arguments of type III, that is, to arguments where there is more than one commuting subset not representing simultaneous measurements. Here the strategy outlined in the previous section does not work. Even if one prepares the system in a common eigenstate of a set of operators representing non-comeasurable measurements, there remains at least one other set of non-comeasurable measurements for which the joint outcomes are not known. This blocks the KS argument since the constraint on the ontic state coming from this very set of measurements will be missing.
One might however raise the question: Why not simply replace a commuting subset not representing simultaneous measurements by one single measurement and apply certain functions on the result? Then the comeasurability problem would be solved.
Well, it is indeed a mathematical fact that for any finite set of mutually commuting operators $\{\hat{a}_i\}$ there exists an operator $\hat{b}$ and a set of functions $\{f_i\}$ such that $\hat{a}_i = f_i(\hat{b})$ (Halmos 1958). Note, however, that from this mathematical fact it does not follow that there also is a physical measurement b represented by the operator $\hat{b}$. The existence of such a measurement is a physical question that does not automatically follow from the existence of the operator $\hat{b}$.
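A minimal illustration of the mathematical fact, with operators chosen here purely for convenience (they are not taken from the KS graphs discussed in the article): two commuting spin-(1/2) operators are written as functions of a single operator with nondegenerate spectrum.

```latex
\hat{a}_1 = \sigma_z \otimes \mathbf{1}, \qquad
\hat{a}_2 = \mathbf{1} \otimes \sigma_z, \qquad
\hat{b} = 2\hat{a}_1 + \hat{a}_2 .

% On the common eigenbasis, the eigenvalue pairs (a_1, a_2) =
% (+1,+1), (+1,-1), (-1,+1), (-1,-1) give the eigenvalues
% 3, 1, -1, -3 of \hat{b}, all distinct.

f_1(3) = f_1(1) = +1, \qquad f_1(-1) = f_1(-3) = -1,
f_2(3) = f_2(-1) = +1, \qquad f_2(1) = f_2(-3) = -1,

\hat{a}_1 = f_1(\hat{b}), \qquad \hat{a}_2 = f_2(\hat{b}).
```

Whether any laboratory procedure is represented by this particular combined operator is exactly the physical question that the mathematics leaves open.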
But now suppose that in a KS argument of type III we replace every subset of non-comeasurable measurements {ai} realizing $\{\hat{a}_i\}$ by one single measurement b such that the functions {fi(b)} also realize $\{\hat{a}_i\}$. Will it turn the argument of type III into an argument of type I?
No, it will not. Replacing non-comeasurable measurements by functions of one single measurement renders the realization hyperedge based. But then we face the following problem: to test noncontextuality, we need to provide a unique realization of the KS graph and guarantee that all subsets of mutually commuting operators represent simultaneous measurements. However, as the lemma in the introduction shows, such a realization cannot be hyperedge based. So we need to give up the uniqueness of the realization; that is, we need to associate at least one operator with more than one measurement. These measurements will be physically different but will be represented by the same operator. Operationally this means that they have the same distribution of outcomes in every quantum state. To get the no-go result, however, one needs to assume more, namely, that they have the same distribution of outcomes in every ontic state, or in other words, they have the same set of response functions. This assumption, however, is an extra assumption, different from noncontextuality. By abandoning it the KS argument can be blocked (cf. Hofer-Szabó 2021a, 2021b).
To sum up, KS arguments of type III do not prove quantum contextuality since FUNC cannot be physically justified for at least one set of mutually commuting operators in the argument. Replacing non-comeasurable measurements by functions of one single measurement does not solve the problem since either we stick to a unique realization but then some hyperedges will not represent simultaneous measurements or we switch to a nonunique realization but then we need to use an extra assumption in the argument. We turn to this assumption in the next section.
12. Spekkens’s Condition
Spekkens (2005) introduced a constraint on ontological models and called it measurement noncontextuality (see also Liang, Spekkens, and Wiseman 2011; Leifer 2014; Krishna, Spekkens, and Wolfe 2017). He took it to be a generalization of the quantum mechanical noncontextuality for operational theories. I share Spekkens’s view that his requirement plays an important role in the KS arguments but, as explained in the introduction, I contest that it expresses noncontextuality.Footnote 12 Hence, I will refer to Spekkens’s noncontextuality simply as Spekkens’s condition:
If the probability of an outcome of a measurement is the same as the probability of an outcome of another measurement in every preparation, then the probability of the outcomes for the two measurements should also be the same in all ontic states.
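Formally, if for some measurements $x$, $y$ and outcomes $X_k$, $Y_l$

$$p(X_k \mid x \wedge s) = p(Y_l \mid y \wedge s) \quad \text{for every preparation } s, \tag{26}$$

then

$$p(X_k \mid x \wedge \lambda) = p(Y_l \mid y \wedge \lambda) \quad \text{for every ontic state } \lambda. \tag{27}$$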
Now, Spekkens’s condition gives rise to a line of counterfactual reasoning. If we measure x in a certain run of the experiment and obtain the outcome Xk, then, if the ontological model is value definite with respect to x and y, we can conclude on the basis of Spekkens’s condition that had we measured y, we would have obtained Yl. But note that Spekkens’s condition is not an assumption about possible worlds but a restriction on the ontological models for an operational theory.
Spekkens’s condition, similarly to noncontextuality (12), is also a kind of inference to the best explanation: if (27) and also no-conspiracy (8) and λ-sufficiency (9) hold for an ontological model, then we obtain a neat explanation why (26) holds. The explanandum in the case of noncontextuality is no disturbance; in the case of Spekkens’s condition, it is the statistical match between outcomes of different measurements.
Note that Spekkens’s condition (26) and (27) is logically independent from contextuality (12). Spekkens’s condition does not rely on simultaneous measurability, while contextuality does. If there are no simultaneous measurements in an operational theory, then each ontological model will be noncontextual since (12) is fulfilled vacuously. Still, the model can violate Spekkens’s condition (26) and (27) if there are measurements yielding certain outcomes with the same probability in every state and differing in their response functions. Conversely, if premise (26) is not satisfied in an operational theory, then Spekkens’s condition is fulfilled vacuously. But if the theory is disturbing, the ontological model can still be contextual. In a nondisturbing operational theory, however, (26) holds for all x and y such that . Consequently, if Spekkens’s condition holds, noncontextuality will also hold. In short, in a nondisturbing operational theory (like QM) Spekkens’s condition implies noncontextuality.
It is instructive to see what an ontological model that violates Spekkens’s condition looks like. If (26) holds in an operational theory but (27) does not, then the distributions of ontic states representing the preparations cannot be arbitrary. Thus, the violation of Spekkens’s condition puts a constraint on the possible distributions of ontic states: one cannot pick arbitrarily from ontic states when preparing the system. Preparations must be composed from the underlying ontic states according to a certain pattern that is sensitive to how the ontic states respond to certain measurements. But note that it is not an a priori truth that any probability distribution of ontic states represents a physically possible preparation. There may well be many physical reasons that restrict the possible preparations of a system, and Spekkens’s condition is only one among those.
As we saw in the previous section, Spekkens’s condition plays a crucial role in nonunique KS arguments. In these arguments certain operators of the KS graph will be realized by two different measurements. The two different measurements, however—being represented by the same operator—will have the same outcome statistics. But this is exactly the antecedent (26) of Spekkens’s condition. The role of Spekkens’s condition is to ensure the consequent (27), that is, to ensure that the response functions of the two different measurements are perfectly correlated. By this assumption the no-go result can be derived. Thus, nonunique KS arguments heavily rely on Spekkens’s condition.Footnote 13
13. A Simple Toy Model
Before concluding, it is worth reflecting once more on the difference between noncontextuality and Spekkens’s condition (the first and second interpretations of noncontextuality, as we called them in the introduction) and illustrating this difference with a simple toy model. Suppose we fill a box with balls and perform two sorts of basic measurements: we pull a ball from the box and check its color or its size. The possible outcomes for the color measurement are black and white; for the size measurement the outcomes are big and small. Repeating the measurement many times we get long-run relative frequencies for the various measurement outcomes. The two measurements are comeasurable; hence, the probability distribution over the joint outcomes can be determined. Suppose furthermore that our operational theory is (1) nondisturbing, and (2) it satisfies the antecedent of Spekkens’s condition: for every preparation, that is, for every filling of the box with balls, the probability of pulling a black ball using color measurement is the same as the probability of pulling a big ball using size measurement.
We would like to construct an ontological model for our operational theory. The model is noncontextual if, given an ontic state, the probability of all four measurement outcomes is independent of whether we produce it by a basic or a joint measurement. The model satisfies Spekkens’s condition if, given an ontic state, the probability of the outcome black/white using color measurement is the same as the probability of the outcome big/small using size measurement.
An ontological model that is both noncontextual and also satisfies Spekkens’s condition is as follows: there are just two types of balls in the box: one type is black and big, and the other type is white and small. Upon measuring the color of the first type of ball, we invariably get the outcome black independently of whether we comeasure the size (and similarly for the other outcomes). This model neatly explains the above two probabilistic facts, 1 and 2, of the operational theory.
But there are ontological models in which one of the two requirements is violated. An example of a model satisfying noncontextuality but not Spekkens’s condition is as follows: there are now four types of balls in the box: black and big, black and small, white and big, and white and small. However (for some physical reason), we can prepare the box only in such a way that there are exactly as many black and small balls in the box as there are white and big balls. Consequently, although Spekkens’s condition is violated, we get black balls using color measurement as often as we get big balls using size measurement.
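The four-ball model can be spelled out numerically. In the sketch below the particular ball counts are arbitrary illustrative numbers, chosen only to respect the restriction that black-and-small and white-and-big balls are equally numerous; the function names are likewise hypothetical.

```python
from fractions import Fraction

# Deterministic response functions of an ontic state (color, size).
def prob_black(ball):
    return 1 if ball[0] == "black" else 0

def prob_big(ball):
    return 1 if ball[1] == "big" else 0

def operational_probs(box):
    """box maps ball types to counts; return the operational
    probabilities (p(black | color), p(big | size))."""
    total = sum(box.values())
    p_black = Fraction(sum(n * prob_black(b) for b, n in box.items()), total)
    p_big = Fraction(sum(n * prob_big(b) for b, n in box.items()), total)
    return p_black, p_big

# An allowed preparation: as many black-small as white-big balls.
box = {("black", "big"): 5, ("black", "small"): 3,
       ("white", "big"): 3, ("white", "small"): 7}

p_black, p_big = operational_probs(box)
print(p_black, p_big)  # prints: 4/9 4/9

# Spekkens's condition would demand equal responses in EVERY ontic state.
spekkens_holds = all(prob_black(b) == prob_big(b) for b in box)
print(spekkens_holds)  # prints: False
```

The operational probabilities coincide for every preparation obeying the restriction, while the black-and-small and white-and-big balls respond differently to the two measurements.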
For an ontological model violating noncontextuality but not Spekkens’s condition we need to change our nondisturbing operational theory into a disturbing one.Footnote 14 Thus, suppose that there are again two types of balls in the box: black and big, and white and small. Performing a basic measurement (color, size), these ontic states invariably provide the corresponding outcomes. However, for joint measurements (color and size) the outcomes flip: for the ontic state black and big, for example, the outcome for the joint measurement will be white and small. The model is contextual but satisfies Spekkens’s condition: the probability of getting a black ball using color measurement is the same as the probability of getting a big ball using size measurement in each preparation—both equal to the relative frequency of black and big balls in that preparation.
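The flipping model can be sketched in the same spirit. The encoding of outcomes and the helper names are assumptions made for illustration.

```python
FLIP = {"black": "white", "white": "black", "big": "small", "small": "big"}

def basic_color(ball):
    """Outcome of the color measurement performed alone."""
    return ball[0]

def basic_size(ball):
    """Outcome of the size measurement performed alone."""
    return ball[1]

def joint(ball):
    """Outcome of the joint color-and-size measurement: both flip."""
    return (FLIP[ball[0]], FLIP[ball[1]])

balls = [("black", "big"), ("white", "small")]

# Contextual: the color outcome depends on whether size is comeasured.
contextual = any(basic_color(b) != joint(b)[0] for b in balls)

# Spekkens's condition: in each ontic state, "black" under the color
# measurement occurs exactly when "big" occurs under the size measurement.
spekkens_holds = all((basic_color(b) == "black") == (basic_size(b) == "big")
                     for b in balls)

print(contextual, spekkens_holds)  # prints: True True
```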
As the toy models attest, noncontextuality and Spekkens’s condition are different and logically independent assumptions.
14. Conclusion
I have argued that a KS argument can rule out a noncontextual value-definite ontological model for QM in a state-independent way only if the KS graph on which the argument is based is (1) given a unique realization such that (2) mutually commuting operators represent simultaneous measurements. If one abandons 1, then, since some operators will be realized by multiple measurements, one needs to assume Spekkens’s condition. By giving up Spekkens’s condition, however, the no-go result can be blocked. If one abandons 2, the constraint FUNC on the value assignments cannot be physically justified. All in all, if noncontextuality is interpreted as the robustness of a system’s response to a measurement against other simultaneous measurements, then KS arguments cannot provide an algebraic proof of quantum contextuality.
It is important to note that the main thrust of this negative claim was not to challenge the view that QM does not admit a noncontextual value-definite ontological model. It does not. State-dependent arguments (like the GHZ argument) provide a perfect proof to this effect. The aim of the article was to challenge the view that KS arguments can prove this fact in a purely algebraic way based exclusively on measurements and not states (and in this sense the KS arguments would be stronger than the state-dependent Bell-type arguments).
But how do we know whether commuting operators represent simultaneous measurements? Well, the formalism of QM does not give us a definite answer. One cannot avoid going back to see what kind of measurements the operators are representing. A special way to ensure comeasurability (in a somewhat extended meaning) is to perform the measurements on two or more subsystems of a physical system. These subsystems are typically space-like separated parts of a bigger system. In the case of space-like separated measurements noncontextuality (12) amounts to a locality requirement, called parameter independence: measurements performed on a subsystem cannot influence the response functions of another measurement on a space-like separated other subsystem.
Noncontextuality as parameter independence plays a crucial role in the Bell-type arguments. In these arguments simultaneous measurability is guaranteed by space-like separation. KS arguments, however, are designed not specifically against locality but against noncontextuality in general. Therefore, it would be interesting to see whether there exist such KS arguments in which simultaneous measurability is not guaranteed by space-like separation. Obviously, the most baffling form of contextuality is nonlocality. But it would be instructive to see whether there are other “softer” versions of contextuality with no appeal to locality. To uncover such a contextuality, one should find a family of simultaneous measurements that are performed on the same system (and not on space-like separated subsystems) and formulate a KS argument based on these measurements. The comeasurability of these measurements should then be justified by explicitly identifying experimental procedures that can be performed on the same system at the same time, like measuring the length and width of a table. Such comeasurability would then not appeal to locality but be justified by the detailed physical description of the measurement processes. Can we come up with a KS argument where comeasurability is grounded in such a way? Does there exist a “genuine” KS argument with no appeal to locality? I do not know the answer.
A similarly open question concerns the lack of KS arguments of type I, where all sets of commuting operators represent simultaneous measurements (whether realized by space-like separation or not). Why are there no arguments providing a state-independent proof for quantum contextuality? Is there a theoretical reason for their nonexistence, or are they simply not found because they are not looked for hard enough (partly due to the negligence of the difference between commutativity and comeasurability)? Again, I have no answer.