1. Introduction
The questions studied in this paper are motivated by several negative dependence properties which are present in combinatorial probability, stochastic processes, statistical mechanics, reliability, and statistics. We focus our study on the theory of point processes because it is a natural tool in many of these fields. For each of these fields, it seems desirable to get a better understanding of what it means for a collection of random variables to be repelling or mutually negatively dependent. It is known that it is not possible to copy the theory of positively dependent random variables.
Negative association was introduced by Joag-Dev and Proschan [Reference Joag-Dev and Proschan11]. Negative association has a distinct advantage over the other types of negative dependence, namely, nondecreasing functions of disjoint sets of negatively associated random variables are also negatively associated. This closure property does not hold for other types of negative dependence.
Pemantle [Reference Pemantle25] confined his study of negative dependence to binary-valued random variables. The list of examples that motivated him to develop techniques for proving that measures have negative dependence properties, such as negative association, includes uniform random spanning trees, simple exclusion processes, random cluster models, and the occupation status of competing urns.
In Borcea et al. [Reference Borcea, Brändén and Liggett6] several conjectures related to negative dependence made by Liggett [Reference Liggett20], Pemantle [Reference Pemantle25], and Wagner [Reference Wagner30] were solved; Lyons’ main results [Reference Lyons21] on negative association for determinantal probability measures induced by positive contractions were also extended. The authors used several new classes of negatively dependent measures for zero–one valued vectors related to the theory of polynomials and to determinantal measures (for example, strongly Rayleigh measures related to the notion of proper position for multivariate stable polynomials).
For point processes, a negative association result is known in a fairly general setting for determinantal point processes on locally compact complete separable metric spaces generated by locally trace-class positive contractions on the natural $L_2$-space (see, e.g. [Reference Lyons22, Theorem 3.7]). A broad list of interesting examples of determinantal point processes can be found in [Reference Soshnikov28]. Negative dependence for finite point processes via determinantal and/or strongly Rayleigh measures has interesting applications in various applied fields, such as machine learning, computer vision, computational biology, natural language processing, combinatorial bandit learning, neural network compression, and matrix approximations (see, e.g. [Reference Anari, Gharan and Rezaei2], [Reference Kulesza and Taskar13], [Reference Li, Jegelka and Sra17], [Reference Li, Sra, Jegelka and Lee18], and the references therein).
Another approach to the study of dependence has been used in finance models. Positive and negative dependences for a random vector may be seen as some stochastic ordering relations of this vector with some vector with independent coordinates. Such stochastic orderings are called dependence orderings (see [Reference Joe12] or [Reference Müller and Stoyan23]). Related results in the theory of point processes and stochastic geometry, where directionally convex ordering is used to express more clustering in point patterns, have been obtained in [Reference Błaszczyszyn and Yogeshwaran4] and [Reference Błaszczyszyn and Yogeshwaran5].
Apart from the negative association property of determinantal point processes, not much is known about the negative association property of other point processes. We show the negative association property for mixed sampled point processes under an ultra log-concave (ULC) assumption on the distribution of the number of points in these point processes. In order to obtain the negative association property in this general class of point processes, we use some results from the theory of strongly Rayleigh measures on the unit cube (see Theorems 3.2 and 3.3). Consequences of the negative association property of point processes in the theory of dependence orderings of point processes are described in a separate section (see Proposition 4.1). We stress that in order to obtain comparisons in terms of dependence orderings, it is enough to use a weaker property than negative association, which we denote by wNA.
2. Negative association and related definitions
We recall the definition and basic properties of negative association.
Definition 2.1. A random vector X = (X 1, …, Xn) is negatively associated (NA) if, for every subset A ⊆ {1, …, n},
$$ {\rm cov}\big(f(X_i, i\in A),\, g(X_j, j\in A^c)\big) \le 0 $$
whenever f and g are real nondecreasing Borel functions for which the covariance exists.
We also use negative association to refer to the set of random variables {X 1, …, Xn}, or to the underlying distribution of X.
Negative association possesses the following properties (see [Reference Joag-Dev and Proschan11]).
(i) A pair (X, Y) of random variables is NA if and only if
$$ \mathbb P(X\le x,Y\le y) \le \mathbb P(X\le x) \mathbb P(Y\le y), $$
i.e. (X, Y) is negatively quadrant dependent (NQD).
(ii) For disjoint subsets A 1, …, Am of {1, …, n}, and nondecreasing positive Borel functions f 1, …, fm, X is NA implies that
$$ \mathbb E \prod ^{m}_{i=1}f_{i}(\textbf{{X}}_{A_{i}}) \le \prod ^{m}_{i=1} \mathbb E f_{i}(\textbf{{X}}_{A_{i}}), $$
where $\textbf{X}_{A_i} = (X_j, j \in A_i)$.
(iii) Any (at least two-element) subset of NA random variables is NA.
(iv) If X has independent components then it is NA.
(v) Increasing (nondecreasing) real functions defined on disjoint subsets of a set of NA random variables are NA.
(vi) If X is NA and Y is NA, and X is independent of Y, then (X, Y) is NA.
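As an illustration only (it is not part of the original argument), Definition 2.1 can be checked by brute force for a small {0, 1}-valued vector. The sketch below assumes that the vector has finite support, in which case it suffices to test indicators of upper sets, because every nondecreasing function on a finite partially ordered set is a constant plus a nonnegative combination of such indicators and the covariance is bilinear. The names `upper_sets` and `is_na` are hypothetical helpers introduced here.

```python
from itertools import combinations, product

def upper_sets(dim):
    """All upper sets (increasing events) of {0,1}^dim, as frozensets of tuples."""
    points = list(product((0, 1), repeat=dim))
    found = []
    for mask in range(1 << len(points)):
        s = {p for k, p in enumerate(points) if (mask >> k) & 1}
        # s is an upper set if it is closed under coordinatewise increase
        if all(q in s for p in s for q in points
               if all(a <= b for a, b in zip(p, q))):
            found.append(frozenset(s))
    return found

def is_na(pmf, n, tol=1e-12):
    """Brute-force check of negative association for a {0,1}^n-valued vector.

    pmf maps tuples in {0,1}^n to probabilities (assumed to sum to 1).
    """
    for r in range(1, n):
        for A in combinations(range(n), r):
            Ac = [j for j in range(n) if j not in A]
            for U in upper_sets(len(A)):
                for V in upper_sets(len(Ac)):
                    pu = pv = puv = 0.0
                    for x, p in pmf.items():
                        in_u = tuple(x[i] for i in A) in U
                        in_v = tuple(x[j] for j in Ac) in V
                        pu += p * in_u
                        pv += p * in_v
                        puv += p * (in_u and in_v)
                    if puv - pu * pv > tol:  # positive covariance found
                        return False
    return True

# One multinomial trial (exactly one coordinate equals 1) versus a comonotone pair.
one_trial = {(1, 0, 0): 0.2, (0, 1, 0): 0.3, (0, 0, 1): 0.5}
comonotone = {(0, 0): 0.5, (1, 1): 0.5}
print(is_na(one_trial, 3), is_na(comonotone, 2))
```

The enumeration grows very quickly with the dimension, so this is only meant for sanity checks on toy examples.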
We shall utilize a slightly broader class than NA in our formulations on dependence orderings. We define this new class of distributions as an analogue of the weak association in sequence (WAS) class introduced in [Reference Rüschendorf27]. We say that a random vector X (or its distribution) is weakly negatively associated (wNA) if
$$ {\rm cov}\big(1_{\{X_i>t\}},\, f(X_{i+1},\ldots,X_n)\big) \le 0 \tag{2.1} $$
for all real nondecreasing functions f and t ∈ ℝ, i = 1, …, n − 1.
This condition is equivalent to $[(X_{i+1},\ldots,X_n) \mid X_i>t] <_{\rm st} (X_{i+1},\ldots,X_n)$ for all t ∈ ℝ and i = 1, …, n − 1, where ‘<st’ denotes the usual strong stochastic ordering of random vectors. For the definitions of stochastic orderings, see [Reference Szekli29, Chapter 2]. A number of positively or negatively dependent systems of random variables are considered in [Reference Bulinski and Shashkin7] and [Reference Müller and Stoyan23].
3. Negative association for mixed sampled point processes
We first introduce some basic point process notation; see, e.g. [Reference Last and Penrose15]. Let $(\Omega, \mathcal F, \mathbb{P})$ be a probability space, and let $\mathbb{X}$ be a complete separable metric space equipped with the Borel σ-field $\mathcal{X}$. Denote by $\textbf{N}$ the space of all measures μ on $(\mathbb{X}, {\mathcal X})$ such that $\mu(B) \in \mathbb N_0 := \mathbb N \cup \{0\}$ for all bounded $B \in {\mathcal X}$. An example is the Dirac measure $\delta_x$ for a point $x \in \mathbb{X}$, given by $\delta_x(B) := 1_B(x)$. A more general example is a finite sum of Dirac measures. A point process η is a measurable mapping from $(\Omega, \mathcal F, \mathbb{P})$ to $(\textbf{N}, \mathcal N)$, where ${\mathcal N}$ is the smallest σ-field on $\textbf{N}$ such that μ ↦ μ(B) is measurable for all $B \in {\mathcal X}$.
We define the negative association property of point processes as follows.
Definition 3.1. A point process η is NA if, for each collection of disjoint bounded sets $B_1,\ldots, B_n\in {\mathcal X}$, the vector $(\eta(B_1),\ldots,\eta(B_n))$ is NA as defined for random vectors.
For a Borel set $A \subseteq \mathbb{X}$, let ${\mathcal N}_{A}$ denote the σ-field on $\textbf{N}$ generated by the functions μ ↦ μ(B) for Borel B ⊆ A. The natural (inclusion) partial order on $\textbf{N}$ allows us to define increasing functions $f\colon \textbf{N} \to \mathbb R$. We say that a point process η has negative associations if $\mathbb E(f(\eta)g(\eta))\le \mathbb E(f(\eta))\,\mathbb E(g(\eta))$ for every pair f, g of real bounded increasing functions that are measurable with respect to complementary subsets $A$, $A^c$ of $\mathbb{X}$, meaning that a function is measurable with respect to A if it is measurable with respect to ${\mathcal N}_{A}$. Clearly, if η has negative associations then η is NA. In the case of a locally compact space $\mathbb{X}$ the converse was shown in [Reference Lyons22, Lemma 3.6, Theorem 3.9]; see also [Reference Poinas, Delyon and Lavancier26, Theorem A.1] for the case $\mathbb{X}= \mathbb R^d$. A different proof which works for general random measures on general Polish spaces is given in [Reference Last, Szekli and Yogeshwaran16].
Let us recall Theorem 3.7 of [Reference Lyons22]. Let λ be a Radon measure on a locally compact complete separable metric space $\mathbb{X}$. Let K be a locally trace-class positive contraction on
$L_2(\mathbb{X}, \lambda)$. By ηK we denote the determinantal point process generated by K; for details, see [Reference Lyons22, Section 3.2].
Theorem 3.1. The determinantal point process ηK defined above has negative associations.
Apart from determinantal point processes not much is known about the negative association property of point processes. Therefore we concentrate our efforts on characterizing the negative association property for an elementary but very useful class of finite point processes with an independent and identically distributed (i.i.d.) location of points. More precisely, our main focus in this paper is on the class of so-called mixed sampled point processes on $(\mathbb{X},{\mathcal X})$, defined by
$$ \eta := \sum_{i=1}^{\tau} \delta_{X_i}, \tag{3.1} $$
where $(X_i)_{i\ge 1}$ is i.i.d. with distribution F, and τ is an ℕ0-valued random variable, independent of $(X_i)_{i\ge 1}$. For such a process, given any finite partition $A_1,\ldots,A_k\in {\mathcal X}$ of $\mathbb{X}$, conditionally on τ = n, the joint distribution of the number of points is given by
$$ \mathbb P(\eta(A_1)=n_1,\ldots,\eta(A_k)=n_k \mid \tau=n) = \frac{n!}{n_1!\cdots n_k!}\,F(A_1)^{n_1}\cdots F(A_k)^{n_k} $$
for $n_1+\cdots+n_k=n$, and, unconditionally,
$$ \mathbb P(\eta(A_1)=n_1,\ldots,\eta(A_k)=n_k) = \mathbb P(\tau=n)\,\frac{n!}{n_1!\cdots n_k!}\,F(A_1)^{n_1}\cdots F(A_k)^{n_k}, \qquad n:=n_1+\cdots+n_k. $$
The joint probability generating function is therefore given for $z_1,\ldots,z_k \in [0,1]$ by
$$ \mathbb E\big(z_1^{\eta(A_1)}\cdots z_k^{\eta(A_k)}\big) = P_\tau\Big(\sum_{i=1}^k F(A_i)z_i\Big), $$
where $P _\tau(z)=\mathbb E (z^\tau),\,z\in [0,1]$.
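The generating function identity for a mixed sampled point process can be verified numerically on a toy example. The following sketch is illustrative only: it assumes that τ has finite support, with `tau_pmf[n]` standing for ℙ(τ = n), and that `q[i]` stands for F(A_i) for a finite partition; the function names are hypothetical.

```python
from itertools import product
from math import factorial, prod

def joint_pmf(tau_pmf, q):
    """Joint pmf of (eta(A_1), ..., eta(A_k)): a tau-mixture of multinomials."""
    k = len(q)
    pmf = {}
    for n, pn in enumerate(tau_pmf):
        for counts in product(range(n + 1), repeat=k):
            if sum(counts) != n:
                continue
            coef = factorial(n)
            for c in counts:
                coef //= factorial(c)  # multinomial coefficient, exact at each step
            pmf[counts] = pn * coef * prod(qi ** c for qi, c in zip(q, counts))
    return pmf

def joint_pgf(pmf, z):
    """E(z_1^{eta(A_1)} ... z_k^{eta(A_k)}) computed from the joint pmf."""
    return sum(p * prod(zi ** c for zi, c in zip(z, counts))
               for counts, p in pmf.items())

def pgf_tau(tau_pmf, s):
    """P_tau(s) = E(s^tau)."""
    return sum(pn * s ** n for n, pn in enumerate(tau_pmf))

tau_pmf = [0.1, 0.2, 0.3, 0.4]   # assumed law of tau on {0, 1, 2, 3}
q = [0.5, 0.3, 0.2]              # assumed cell probabilities F(A_i)
z = [0.9, 0.4, 0.7]
pmf = joint_pmf(tau_pmf, q)
lhs = joint_pgf(pmf, z)
rhs = pgf_tau(tau_pmf, sum(qi * zi for qi, zi in zip(q, z)))
print(abs(lhs - rhs) < 1e-12)
```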
First, we consider mixed point processes defined by (3.1) for which random variables τ are of the form
$$ \tau := \sum_{i=1}^n U_i, $$
where n ∈ ℕ and $U_1,\ldots,U_n$ are independent Bernoulli variables with possibly different success probabilities. We denote by $\mathcal Q_n$ the class of random variables which are sums of n independent Bernoulli variables. Moreover, we let
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqn3.gif?pub-status=live)
denote the class of all random variables with values in {0, 1, …} appearing as limits in distribution of variables from $\mathcal Q_n$, $n\ge 1$, i.e. the weak closure of $\bigcup_{n=1}^\infty\mathcal Q_n$. The main results of this paper are contained in Theorems 3.2 and 3.3.
Theorem 3.2. Suppose that η is a mixed sampled point process on $(\mathbb{X},{\mathcal X})$, defined by (3.1), for which $\tau\in {\mathcal Q}$. Then η is NA.
Proof. Let $B_1,\ldots,B_m \in {\mathcal X}$ be a partition of $\mathbb X$, and let $q_i := F(B_i)$, $i = 1,\ldots,m$. Define
$$ \boldsymbol Z_i := \big(1_{B_1}(X_i),\ldots,1_{B_m}(X_i)\big) \tag{3.3} $$
to be the vector generated by the ith sample $X_i\in \mathbb X$, $i\ge 1$. Note that each $\boldsymbol Z_i$ has a multinomial distribution with success parameters $q_1,\ldots,q_m$ and number of trials equal to 1, and as such is NA. Moreover, the $\boldsymbol Z_i$, i ≥ 1, are independent. Let $\boldsymbol U=(U_1,\ldots,U_n)$ be a vector of zero–one valued, independent random variables which is independent of $\boldsymbol Z_i$, i ≥ 1. The vector composed as $(\boldsymbol U,\boldsymbol Z_1,\ldots,\boldsymbol Z_n)$ is NA because of properties (iv) and (vi) of negative association.
Now, using property (v), we find that the vector $(U_1\boldsymbol Z_1,\ldots,U_n\boldsymbol Z_n)$ is NA as a monotone transformation (multiplication) of disjoint coordinates of $(\boldsymbol U,\boldsymbol Z_1,\ldots,\boldsymbol Z_n)$. Again, using property (v), this time for $(U_1\boldsymbol Z_1,\ldots,U_n\boldsymbol Z_n)$, and using appropriate addition, we deduce that the vector $\sum_{i=1}^n U_i\boldsymbol Z_i$ is NA. It is clear that $\sum_{i=1}^n U_i\boldsymbol Z_i$ has the same distribution as $\sum_{i=1}^{\tau} \boldsymbol Z_i$, where $\tau :=\sum_{i=1}^n U_i$. Defining η by (3.1) we hence see that $(\eta(B_1),\ldots,\eta(B_m))$ is NA. This completes the proof for $\tau \in {\mathcal Q}_n$ for arbitrary n ∈ ℕ.
For $\tau \in \mathcal Q$, there exists a sequence $\tau_k\mathop \to \limits^{\rm{D}} \tau$, $k\to \infty$, with $\tau_k\in \bigcup_{n=1}^\infty\mathcal Q_n$, and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU8.gif?pub-status=live)
for f, g supported by disjoint coordinates, which are nondecreasing and bounded. Letting k → ∞ gives
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU9.gif?pub-status=live)
Since each nondecreasing function can be monotonically approximated by nondecreasing and bounded functions, we obtain the negative association property of η.
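A small exact computation may help to see the construction used in the proof at work. For disjoint cells, a direct calculation gives cov(η(B_1), η(B_2)) = q_1 q_2 (Var τ − 𝔼τ), which equals −q_1 q_2 Σ p_i² when τ is a sum of independent Bernoulli(p_i) variables. The sketch below (illustrative only; `count_covariance` is a hypothetical name) enumerates all values of U and of the cells hit by the samples and compares the result with this closed form.

```python
from itertools import product

def count_covariance(p, q):
    """Exact cov(eta(B_1), eta(B_2)) for tau = U_1 + ... + U_n, U_i ~ Bernoulli(p[i]).

    q[j] plays the role of F(B_j); the i-th sample falls in cell j with probability q[j],
    and eta(B_j) is the number of retained samples (U_i = 1) that fall in cell j.
    """
    n, m = len(p), len(q)
    e1 = e2 = e12 = 0.0
    for u in product((0, 1), repeat=n):             # values of U
        for cells in product(range(m), repeat=n):   # cell hit by each sample
            prob = 1.0
            for ui, pi, c in zip(u, p, cells):
                prob *= (pi if ui else 1 - pi) * q[c]
            n1 = sum(ui for ui, c in zip(u, cells) if c == 0)
            n2 = sum(ui for ui, c in zip(u, cells) if c == 1)
            e1 += prob * n1
            e2 += prob * n2
            e12 += prob * n1 * n2
    return e12 - e1 * e2

p = [0.2, 0.5, 0.9]      # assumed success probabilities of U_1, U_2, U_3
q = [0.3, 0.3, 0.4]      # assumed cell probabilities
cov = count_covariance(p, q)
closed_form = -q[0] * q[1] * sum(pi ** 2 for pi in p)
print(cov, closed_form)
```

Negative pairwise covariance is of course only a necessary consequence of negative association, not a proof of it.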
The class $\mathcal Q$ can be completely characterized; see, e.g. [Reference Aleman, Beliaev and Hedenmalm1].
Lemma 3.1. We have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU10.gif?pub-status=live)
where τ 1 and τ 2 are independent, τ 1 has a Poisson distribution, and $\tau_2\mathop = \limits^{\rm{D}}\sum _{i=1}^\infty U_i$ for independent zero–one valued variables Ui with ℙ(Ui =1) ≥ 0, i ≥ 1, and such that
$\sum_{i=1}^\infty\mathbb P(U_i=1) <\infty$.
It is interesting to note that hypergeometric random variables belong to the class $\mathcal Q$; see, e.g. [Reference Hui and Park10].
We say that a real sequence $(a_i)_{i=0}^n$ has no internal zeros if the indices of its nonzero terms form a discrete interval. Following [Reference Pemantle25] we shall use the following class of sequences and distributions.
Definition 3.2. A finite sequence $(a_i)_{i=0}^n$ of nonnegative real numbers with $a_i \ne 0$ for 1 ≤ i ≤ n − 1 (no internal zeros) is ultra log-concave (ULC(n)) if
$$ \bigg(\frac{a_i}{\binom{n}{i}}\bigg)^2 \ge \frac{a_{i-1}}{\binom{n}{i-1}}\,\frac{a_{i+1}}{\binom{n}{i+1}}, \qquad i=1,\ldots,n-1. $$
Define
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU12.gif?pub-status=live)
to be the class of random variables whose probability functions have the above property. It is known that if a nonnegative sequence $(b_i)_{i=0}^m$ is ULC(m) and a nonnegative sequence $(a_i)_{i=0}^n$ is ULC(n), then the convolution of these sequences is ULC(m + n) (see [Reference Liggett19, Theorem 2]). Let
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqn5.gif?pub-status=live)
Sums of independent variables from the class $\mathcal S$ are in $\mathcal S$. We shall see below that $\mathcal Q\subseteq \mathcal S$. Utilizing the class $\mathcal S$, Theorem 3.2 can be generalized with the use of elementary symmetric functions.
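The ULC(n) condition and its closure under convolution are easy to test numerically. The following is a minimal sketch under the assumption that a sequence is given as a Python list (a_0, …, a_n); `is_ulc`, `convolve`, and `binomial_pmf` are hypothetical helper names, and the tolerance only absorbs floating-point rounding.

```python
from math import comb

def is_ulc(a, tol=1e-12):
    """Check that a = (a_0, ..., a_n) is nonnegative, has no internal zeros,
    and that a_i / C(n, i) is log-concave, i.e. that a is ULC(n)."""
    n = len(a) - 1
    if any(x < 0 for x in a):
        return False
    support = [i for i, x in enumerate(a) if x > 0]
    if support and support != list(range(support[0], support[-1] + 1)):
        return False  # internal zero
    b = [x / comb(n, i) for i, x in enumerate(a)]
    return all(b[i] ** 2 + tol >= b[i - 1] * b[i + 1] for i in range(1, n))

def convolve(a, b):
    """Convolution of two finite sequences (law of the sum of independent variables)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def binomial_pmf(n, p):
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

print(is_ulc(binomial_pmf(3, 0.4)))                                  # binomial law: equality case
print(is_ulc([1 / 3, 1 / 3, 1 / 3]))                                 # uniform on {0, 1, 2}
print(is_ulc(convolve(binomial_pmf(2, 0.3), binomial_pmf(3, 0.6))))  # ULC(2) * ULC(3)
```

The uniform law on {0, 1, 2} is log-concave but not ultra log-concave, which shows that ULC(n) is a strictly stronger requirement.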
Theorem 3.3. Suppose that η is a mixed sampled point process on $(\mathbb{X}, {\mathcal X})$, defined by (3.1), for which
$\tau\in \mathcal S$. Then η is NA.
Proof. Assume first that $\tau\in \mathcal S_n$. Let $B_1,\ldots,B_m \in {\mathcal X}$ be a partition of $\mathbb X$, and let $q_i := F(B_i)$, $i = 1,\ldots,m$. Let $\boldsymbol Z_i$, i ≥ 1, be defined by (3.3). Note that each $\boldsymbol Z_i$ has a multinomial distribution with success parameters $q_1,\ldots,q_m$ and number of trials equal to 1, and as such is NA. Moreover, the $\boldsymbol Z_i$, i ≥ 1, are independent. For fixed n ∈ ℕ, we now define a vector $\boldsymbol U = (U_1,\ldots,U_n)$ with {0, 1}-valued coordinates, independent of $\boldsymbol Z_i$, i ≥ 1. It is enough to define the distribution of U. Given the generating function $P_\tau(z) = \mathbb E(z^\tau)$ of τ, we define the distribution of U by providing its multidimensional generating function. It is obtained by substituting in $P_\tau(z)$, for each k = 0, …, n,
$$ z^k := \binom{n}{k}^{-1} e_k(z_{1}, \dots, z_{n}), $$
where $e_k(z_1,\ldots,z_n)$ is the kth elementary symmetric polynomial. Note that, by the definition of the elementary symmetric polynomials, for each k, the function $\binom{n}{k}^{-1} e_k(z_{1}, \dots, z_{n} )$ of the variables $z_1,\ldots,z_n$ is the multivariate generating function of a vector of n {0, 1}-valued variables which contains exactly k values 1, where each possible selection of the k coordinates at which the values 1 occur has the same probability $\binom{n}{k}^{-1}$. The distribution of U defined in this way is the mixture, with coefficients $a_k := \mathbb P(\tau = k)$, of the distributions corresponding to $\binom{n}{k}^{-1} e_k(z_{1}, \dots, z_{n} ),\,k=0,\ldots, n$. Since each function $e_k$ is symmetric in the variables $z_1,\ldots,z_n$, the same is true for the generating function of $\boldsymbol U = (U_1,\ldots,U_n)$; therefore, $(U_1,\ldots,U_n)$ are exchangeable. Moreover, $\sum_{i=1}^nU_i\mathop = \limits^{\rm{D}} \tau$, since, by setting $z_1 = \cdots = z_n := z$, we obtain $P_\tau(z)$. The sequence $(\mathbb P(\tau = i))_{i = 0}^n$ is called the rank sequence of the vector $\boldsymbol U = (U_1,\ldots,U_n)$ (see [Reference Borcea, Brändén and Liggett6, Definition 2.8]).
From our assumption, the rank sequence for $\boldsymbol U =(U_1,\ldots,U_n)$ is ULC(n) and from Theorem 2.7 of [Reference Pemantle25], we deduce that U is NA. Now the vector composed as $(\boldsymbol U,\boldsymbol Z_1,\ldots,\boldsymbol Z_n)$ is NA because of property (vi). Using property (v), we find that the vector $(U_1\boldsymbol Z_1,\ldots,U_n\boldsymbol Z_n)$ is NA as a monotone transformation (multiplication) of disjoint coordinates of $(\boldsymbol U,\boldsymbol Z_1,\ldots,\boldsymbol Z_n)$. Again, using property (v), this time for $(U_1\boldsymbol Z_1,\ldots,U_n\boldsymbol Z_n)$, and using addition, we deduce that the vector $\sum_{i=1}^n U_i\boldsymbol Z_i$ is NA. It is clear that $\sum_{i=1}^n U_i\boldsymbol Z_i$ has the same distribution as $\sum_{i=1}^{\tau} \boldsymbol Z_i$, which in turn has the same distribution as $(\eta(B_1),\ldots,\eta(B_m))$. This completes the proof for $\tau \in \mathcal S_n$ for arbitrary n ∈ ℕ. For $\tau \in \mathcal S$, we apply a limiting argument analogous to that used in the proof of Theorem 3.2.
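The exchangeable vector U used in the proof can be written down explicitly for small n: each configuration u with k ones receives probability ℙ(τ = k) divided by the number of such configurations. The sketch below is illustrative only; the function names and the two example laws of τ are assumptions made here. It recovers the rank sequence and computes cov(U_1, U_2), which is negative for the ULC(3) example and positive for an example with internal zeros.

```python
from itertools import product
from math import comb

def exchangeable_pmf(tau_pmf):
    """pmf of U in {0,1}^n with P(U = u) = P(tau = |u|) / C(n, |u|)."""
    n = len(tau_pmf) - 1
    return {u: tau_pmf[sum(u)] / comb(n, sum(u))
            for u in product((0, 1), repeat=n)}

def rank_sequence(pmf, n):
    """Law of U_1 + ... + U_n."""
    r = [0.0] * (n + 1)
    for u, p in pmf.items():
        r[sum(u)] += p
    return r

def pair_covariance(pmf):
    """cov(U_1, U_2) under the given pmf."""
    e1 = sum(p * u[0] for u, p in pmf.items())
    e2 = sum(p * u[1] for u, p in pmf.items())
    e12 = sum(p * u[0] * u[1] for u, p in pmf.items())
    return e12 - e1 * e2

ulc_tau = [0.1, 0.4, 0.4, 0.1]       # assumed law of tau; it is ULC(3)
non_ulc_tau = [0.5, 0.0, 0.0, 0.5]   # internal zeros, hence not ULC(3)
u_pmf = exchangeable_pmf(ulc_tau)
print(rank_sequence(u_pmf, 3))
print(pair_covariance(u_pmf), pair_covariance(exchangeable_pmf(non_ulc_tau)))
```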
The following lemma may be regarded as known since it is an immediate consequence of the classical Newton inequalities (see, e.g. [Reference Niculescu24] for a new look at Newton’s inequalities). We formulate it in the setting of the classes of random variables introduced in this paper.
Lemma 3.2. For the classes of random variables defined in (3.2) and (3.4), we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU14.gif?pub-status=live)
Proof. Suppose that $\tau\in\mathcal Q_n$. Then, for its generating function,
$$ P_\tau(z) = \prod_{k=1}^n (1-p_k+p_k z). $$
For ak := (1 − pk)/pk, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU16.gif?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU17.gif?pub-status=live)
i.e. the coefficients $c_k$, k = 0, …, n, are given by the corresponding elementary symmetric polynomials in the variables $a_i$, i = 1, …, n. It is known from the classical Newton inequalities that the sequence $(c_i)_{i=0}^n$ is ULC(n). Since $\mathbb P(\tau = k) = p_1\cdots p_n c_{n-k}$, and reversing a sequence preserves the ULC(n) property because $\binom{n}{k}=\binom{n}{n-k}$, we conclude that the sequence $(\mathbb P(\tau = k))_{k = 0}^n$ is also ULC(n). Therefore, $\tau\in\mathcal S_n$, which immediately implies the inclusion $\mathcal Q\subset \mathcal S$.
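The two ingredients of this proof, the representation of ℙ(τ = k) through elementary symmetric polynomials and Newton's inequalities, can be checked numerically. The sketch below is illustrative and assumes 0 < p_k ≤ 1 so that a_k = (1 − p_k)/p_k is well defined; all function names are hypothetical.

```python
from math import comb, prod

def elementary_symmetric(a):
    """Return [e_0(a), ..., e_n(a)], read off from the product of (1 + a_k x)."""
    e = [1.0]
    for ak in a:
        e = [(e[j] if j < len(e) else 0.0) + (ak * e[j - 1] if j >= 1 else 0.0)
             for j in range(len(e) + 1)]
    return e

def poisson_binomial_pmf(p):
    """Law of a sum of independent Bernoulli(p[k]) variables, by direct convolution."""
    pmf = [1.0]
    for pk in p:
        pmf = [(pmf[j] if j < len(pmf) else 0.0) * (1 - pk)
               + (pmf[j - 1] * pk if j >= 1 else 0.0)
               for j in range(len(pmf) + 1)]
    return pmf

def satisfies_newton(c, tol=1e-12):
    """Newton's inequalities for c_k / C(n, k), k = 1, ..., n - 1."""
    n = len(c) - 1
    d = [ck / comb(n, k) for k, ck in enumerate(c)]
    return all(d[k] ** 2 + tol >= d[k - 1] * d[k + 1] for k in range(1, n))

p = [0.2, 0.5, 0.9]                       # assumed success probabilities
a = [(1 - pk) / pk for pk in p]
c = elementary_symmetric(a)
n = len(p)
via_symmetric = [prod(p) * c[n - k] for k in range(n + 1)]   # P(tau = k) = p_1...p_n c_{n-k}
direct = poisson_binomial_pmf(p)
print(satisfies_newton(c), satisfies_newton(direct))
```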
We note that the arguments utilized in Theorem 3.3 can be used for random vectors with arbitrary positive values.
Proposition 3.1. Assume that $\boldsymbol Z_i=(Z_i^1,\ldots,Z_i^m),\,i\ge 1$ is a sequence of i.i.d. random vectors with components in ℝ+ such that, for each i ≥ 1,
$$ \sum^m_{j=1} 1_{\{Z_i^j>0\}}\le 1, $$
that is, at most one of the components can be positive. Then, for $\tau \in \mathcal S$, which is independent of Zi, i ≥ 1, the vector
$ \boldsymbol W\colon =\sum^\tau_{i=1}\boldsymbol Z_i$ is NA.
Proof. We use basically the same argument as used in the proof of Theorem 3.3. Let U = (U 1, …, Un), independent of Zi, i ≥ 1, be the vector of {0, 1}-valued random variables obtained by its generating function as follows. In Pτ (z), substitute $z^k\colon =\binom{n}{k}^{-1} e_k(z_{1}, \dots, z_{n} ), \,k=1,\ldots,n$, where ek(z 1, …, zn) are the elementary symmetric polynomials. This substitution defines a generating function of variables z 1, …, zn. It is then immediate that
$\sum_{i=1}^nU_i\mathop = \limits^{\rm{D}}\tau$, that is, the sequence
$(\mathbb P(\tau=i))_{i=0}^n$ is the rank sequence for (symmetric) U = (U 1, …, Un). From our assumption, the rank sequence for U = (U 1, …, Un) is ULC(n) and, from Theorem 2.7 of [Reference Pemantle25], we deduce that U is NA. Now the vector composed as (U, Z 1, …, Zn) is NA because of the following lemma and property (vi).
Lemma 3.3. Assume that $\boldsymbol Z = (Z^1,\ldots,Z^m)$ is a random vector with components in ℝ+. Assume that $\sum^m_{j=1} 1_{\{Z^j>0\}}\le 1$, that is, at most one of the components can be positive. Then Z is NA.
Proof of Lemma 3.3. In order to show that ${\rm cov}(f(Z^1,\ldots,Z^k), g(Z^{k+1},\ldots,Z^m)) \le 0$ for nondecreasing f and g, it suffices to assume that f(0) = g(0) = 0. Otherwise, we can consider f − f(0) and g − g(0). Because at most one of the coordinates can be nonzero, we obtain $\mathbb E(f(Z^1,\ldots,Z^k)g(Z^{k+1},\ldots,Z^m)) = 0$, while the product of the expectations is nonnegative since f ≥ 0 and g ≥ 0.
Now, using property (v), we deduce that the vector (U 1Z1, …, UnZn) is NA because it is a monotone transformation (multiplication) of disjoint independent coordinates of (U, Z 1, …, Zn). Again, using property (v), this time for (U 1Z1, …, UnZn), and using addition, we deduce that the vector $\sum_{i=1}^n U_i\boldsymbol Z_i$ is NA. It is clear that
$\sum_{i=1}^n U_i\boldsymbol Z_i$ has the same distribution as
$\sum_{i=1}^{\tau} \boldsymbol Z_i$.
The above proposition can be used to study random measures other than point processes. We shall pursue this topic elsewhere.
4. Dependence orderings for point processes
An extensive study of dependence orderings for multivariate point processes on ℝ is contained in [Reference Kulik and Szekli14]. Related results in the theory of point processes and stochastic geometry, where the directionally convex ordering is used to express more clustering in point patterns, are obtained by Błaszczyszyn and Yogeshwaran [Reference Błaszczyszyn and Yogeshwaran5]; see also the references therein. We shall use the negative association property of point processes to obtain comparisons of dependence in point processes. More precisely, we recall some basic facts on dependence orderings of vectors and their relation to negative association which can be directly utilized for point processes.
4.1. Dependence orderings and negative correlations for vectors
For a function $f\colon \mathbb R^n \to \mathbb R$, define the difference operator $\Delta^\epsilon_i$, $\epsilon > 0$, $1\le i\le n$, by
$$ \Delta^\epsilon_i f(\boldsymbol x) := f(\boldsymbol x+\epsilon \boldsymbol e_i)-f(\boldsymbol x), $$
where ei is the ith unit vector. Then f is called supermodular if, for all 1 ≤ i < j ≤ n and ∊, δ > 0,
$$ \Delta^\epsilon_i\Delta^\delta_j f(\boldsymbol x)\ge 0 $$
for all $\boldsymbol x \in \mathbb R^n$, and directionally convex if this inequality holds for all 1 ≤ i ≤ j ≤ n. Let $\mathcal F^{\rm sm}$ and $\mathcal F^{\rm dcx}$ denote the classes of supermodular and directionally convex functions. Then, of course, $\mathcal F^{\rm dcx}\subseteq \mathcal F^{\rm sm}$. Typical examples from the $\mathcal F^{\rm dcx}$ class of functions are
$f(\boldsymbol x)=\psi (\sum_{i=1}^nx_i)$ for ψ convex, or $f(\boldsymbol x)=\max_{1\le k\le n}\sum_{i=1}^k x_i$, but there are many other useful functions in this class; see, for example, [Reference Błaszczyszyn and Yogeshwaran3].
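The difference-operator definitions lend themselves to a quick numerical check. The sketch below evaluates the iterated differences on a small grid for a convex function of the sum, for the maximum of partial sums, and for a function that is supermodular but not convex in its first coordinate. It is illustrative only: a grid test of this kind gives a necessary condition and can refute, but never prove, directional convexity, and the names `mixed_difference` and `looks_dcx` as well as the grid and step values are assumptions made here.

```python
from itertools import product

def mixed_difference(f, x, i, j, eps, delta):
    """Iterated difference Delta_i^eps Delta_j^delta f(x) for f defined on tuples."""
    def shift(y, k, h):
        return tuple(v + h if idx == k else v for idx, v in enumerate(y))
    xi = shift(x, i, eps)
    return f(shift(xi, j, delta)) - f(xi) - f(shift(x, j, delta)) + f(x)

def looks_dcx(f, dim, grid=(-1.0, 0.0, 0.5, 2.0), steps=(0.5, 1.0), tol=1e-12):
    """Grid check of Delta_i Delta_j f >= 0 for all i <= j (necessary, not sufficient)."""
    return all(mixed_difference(f, x, i, j, e, d) >= -tol
               for x in product(grid, repeat=dim)
               for i in range(dim) for j in range(i, dim)
               for e in steps for d in steps)

def square_of_sum(x):            # psi(sum x_i) with psi(s) = s^2 convex
    return sum(x) ** 2

def max_partial_sum(x):          # max over k of x_1 + ... + x_k
    best, running = None, 0.0
    for v in x:
        running += v
        best = running if best is None else max(best, running)
    return best

def not_dcx(x):                  # supermodular, but concave in x_1
    return x[0] * x[1] - x[0] ** 2

print(looks_dcx(square_of_sum, 3), looks_dcx(max_partial_sum, 3), looks_dcx(not_dcx, 2))
```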
The corresponding stochastic orderings are defined by X <sm Y if $\mathbb E f(\boldsymbol X)\le \mathbb E f(\boldsymbol Y)$ for all $f\in \mathcal F^{\rm sm}$, and analogously for X <dcx Y. For twice differentiable functions f, we obtain $f \in \mathcal F^{\rm sm}$ if and only if $\partial^2 f/\partial x_i\partial x_j \ge 0$ for i < j, and $f \in \mathcal F^{\rm dcx}$ if and only if this inequality holds for i ≤ j (see [Reference Müller and Stoyan23, Theorems 3.9.3 and 3.12.2]). While comparison of X and Y with respect to ‘<sm’ implies (and is restricted to the case of) identical marginals $X_i\mathop = \limits^{\rm{D}} Y_i$, the comparison with respect to the smaller class $\mathcal F^{\rm dcx}$ implies convexly increasing marginals $X_i <_{\rm cx} Y_i$ (which means by definition that $\mathbb E \psi(X_i)\le \mathbb E\psi (Y_i)$ for all ψ : ℝ → ℝ convex). Both of these orderings belong to the class of so-called dependence orderings (see, e.g. [Reference Joe12]) which is defined by a list of suitable properties, among them the property that ${\rm cov}(X_i, X_j) \le {\rm cov}(Y_i, Y_j)$.
In [Reference Rüschendorf27] another dependence ordering X <wcs Y (weakly conditional increasing in sequence order) was introduced by the condition
$$ {\rm cov}\big(1_{\{X_i>t\}},\, f(X_{i+1},\ldots,X_n)\big) \le {\rm cov}\big(1_{\{Y_i>t\}},\, f(Y_{i+1},\ldots,Y_n)\big) $$
for all monotonically nondecreasing f, and all t ∈ ℝ, 1 ≤ i ≤ n − 1.
All dependence orderings can be used to define some classes of distributions with negative (or positive) covariances when applied to vectors with independent components. More precisely, let X* denote a vector with independent components, and such that $X_i^*\mathop = \limits^{\rm{D}} X_i$. Using this approach, definition (2.1) of wNA is equivalent to the relation X <wcs X*. It is clear that wNA is further equivalent to
$$ [(X_{i+1},\ldots,X_n) \mid X_i>t] <_{\rm st} (X_{i+1},\ldots,X_n) $$
for all i = 1, …, n − 1, t > 0, where $[(X_{i+1},\ldots,X_n) \mid X_i>t]$ denotes a random vector which has the distribution of $(X_{i+1},\ldots,X_n)$ conditioned on the event $\{X_i>t\}$, and ‘<st’ is the usual (strong) stochastic order. Directly from the definition of the negative association property we see that the weakly negative association property is weaker than the negative association property. For X being wNA, Theorem 4.1 implies that, for example (see also [Reference Christofides and Vaggelatou8] for the negative association case), $\sum_{i=1}^n X_i <_{\rm cx}\sum_{i=1}^n X^*_i$ and $\max_{1\le k\le n}\sum_{i=1}^k X_i<_{\rm icx}\max_{1\le k\le n}\sum_{i=1}^k X^*_i$, where ‘<icx’ is defined similarly to ‘<cx’ but with the use of nondecreasing convex functions. Taking other supermodular functions, it is possible to get maximal inequalities for wNA vectors as in [Reference Christofides and Vaggelatou8].
The following theorem from [Reference Rüschendorf27] connects the above-defined orderings.
Theorem 4.1. Let X and Y be n-dimensional random vectors.
(1) If $X_i\mathop = \limits^{\rm{D}} Y_i, \, 1\le i\le n$, then X <wcs Y implies that X <sm Y.
(2) If Xi < cx Yi, 1 ≤ i ≤ n, then X < wcs Y implies that X < dcx Y.
For the ‘<dcx’ ordering, we get the following corollary from Theorem 4.1. This corollary will be used later for point processes with the negative association property.
Corollary 4.1. Suppose that X is wNA and Y* has independent coordinates with $X_i<_{\rm cx}Y_i^*$. Then X <dcx Y*.
4.2. Negative association and dependence orderings for point processes
Using Theorem 4.1, we are able to compare the covariance structure of some point processes. To be more precise, we need a couple of definitions.
Definition 4.1. Two point processes η 1 and η 2 on $\mathbb{X}$ are ordered in the directionally convex order (dcx) (weakly conditional increasing sequence order (wcs))
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU23.gif?pub-status=live)
as defined for random vectors, for all bounded Borel sets B 1, …, Bn, n ≥ 1.
Similarly to Definition 3.1 we define a weaker version of negative association for point processes.
Definition 4.2. A point process η is wNA if, for each collection of disjoint bounded sets $B_1,\ldots, B_n\in {\mathcal X}$, the vector $(\eta(B_1),\ldots,\eta(B_n))$ is wNA, as defined for random vectors.
We now propose a direct consequence of the above definitions and Theorem 4.1. It will be used in the next section for determinantal and mixed sampled point processes.
Proposition 4.1. Suppose that $\eta_1$ is a wNA point process on $(\mathbb{X}, {\mathcal X})$ with a locally finite intensity measure. Let $\eta_2$ be a Poisson process with intensity measure $\mathbb E(\eta_1)$. Assume that $\eta_1(B) <_{\rm cx} \eta_2(B)$ for all bounded Borel sets B. Then $\eta_1 <_{\rm dcx} \eta_2$.
For illustration, we now show that the ordering ‘<dcx’ can be used to obtain comparisons of moment measures and void probabilities for point processes on ℝd. These results can be modified with similar arguments to hold on general state spaces.
The following consequence of the ‘<dcx’ ordering for point processes is known from [Reference Błaszczyszyn and Yogeshwaran3] and [Reference Błaszczyszyn and Yogeshwaran5]. Given a point process η on $\mathbb X$ and n ∈ ℕ, we let
$\mathbb E(\eta^n)$ denote the nth moment measure of η, that is, the measure
$B\mapsto \mathbb E(\eta^n(B))$ on (ℝd)n equipped with the Borel σ-field.
Lemma 4.1. Let $\eta_1$ and $\eta_2$ be two point processes on $\mathbb R^d$. If $\eta_1 <_{\rm dcx} \eta_2$ then the following statements hold.
1. (Moment measures) If the measure $\mathbb E \eta_2^n$ is σ-finite then $\mathbb E (\eta^n_1(B))\le \mathbb E (\eta^n_2(B))$ for all bounded Borel sets $B \subset (\mathbb R^d)^n$.
2. (Void probabilities) ℙ(η 1(B) = 0) ≤ ℙ(η 2(B) = 0) for all bounded Borel sets B.
Using the ‘<wcs’ criterion for ‘<dcx’ from Theorem 4.1, we obtain the following result.
Corollary 4.2. Let η 1 and η 2 be two point processes on ℝd. If, for all bounded Borel sets B, η 1(B) <cx η 2(B) and η 1 <wcs η 2, then the comparisons of moment measures and void probabilities from the above lemma hold.
An interesting case for such comparisons is when η 2 is a Poisson point process.
Proposition 4.2. Suppose that η is a point process on ℝd which is wNA and has a locally finite intensity measure. Then the following statements hold.
1. (Moment measures) $\mathbb E (\eta (B_1)\cdots\eta (B_n))\le \mathbb E (\eta (B_1)) \cdots\mathbb E(\eta (B_n))$ for all disjoint, bounded Borel sets B 1, …, Bn, which, for simple point processes η, implies that
$$ \mathbb E\Big(\exp\Big(\int_{\mathbb R^d}h(\boldsymbol x)\,\eta({\rm d}\boldsymbol x)\Big)\Big)\le \exp\Big(\int_{\mathbb R^d}({\rm e}^{h(\boldsymbol x)}-1)\,\mathbb E\eta({\rm d}\boldsymbol x)\Big) $$
for all measurable h ≥ 0.
2. (Void probabilities)
$\mathbb P(\eta (B)=0)\le \exp(-\mathbb E\eta(B))$ for all bounded Borel sets B, which, for simple point processes η, implies that
\begin{equation*} \label{void} \mathbb E\Big(\exp\Big(-\int_{\mathbb R^d}h(\boldsymbol x)\,\eta({\rm d}\boldsymbol x)\Big)\Big)\le \exp\Big(\int_{\mathbb R^d}({\rm e}^{-h(\boldsymbol x)}-1)\,\mathbb E\eta({\rm d}\boldsymbol x)\Big) \end{equation*}
for all measurable h ≥ 0.
Proof. Let B 1, …, Bn be disjoint bounded Borel sets. The vector X := (η(B 1), …, η(Bn)) is wNA, by our assumptions. Let X* denote a corresponding independent version. By the definition of wNA, we have X <wcs X*, so that Theorem 4.1 shows that X <sm X*. From the definition of ‘<sm’, $\mathbb E (\eta (B_1)\cdots\eta (B_n))\le \mathbb E (\eta (B_1))\cdots\mathbb E(\eta (B_n))$. Now, from Proposition 1 of [Reference Błaszczyszyn and Yogeshwaran5], this implies the second assertion of the first part.
To prove the second part of the proposition, let B and B ′ be disjoint bounded Borel sets. By assumption, (η(B), η(B ′)) is wNA. Directly from the definition of wNA we conclude that ℙ(η(B) = 0, η(B ′) = 0) ≤ ℙ(η(B) = 0)ℙ(η(B ′) = 0). Now, from Proposition 3.1 of [Reference Błaszczyszyn and Yogeshwaran4], $\mathbb P(\eta (B)=0)\le \exp(-\mathbb E\eta(B))$ for all bounded Borel sets B. Moreover, from Proposition 2 of [Reference Błaszczyszyn and Yogeshwaran5], this inequality is equivalent to the second inequality asserted in the second part.
4.3. ‘<dcx’ comparisons for determinantal and mixed sampled point processes
In the following corollary we shall assume that $\mathbb {X}$ is locally compact and that λ is a Radon measure on $\mathbb {X}$. Let K be a locally trace-class positive contraction on $L_2(\mathbb {X}, \lambda)$, and let ηK be the determinantal point process generated by K. From Proposition 4.1 we obtain the following corollary.
Corollary 4.3. Suppose that ηK is the determinantal point process described above. Then
$$ \eta_K <_{\rm dcx} \eta_2, $$
where η 2 denotes a Poisson point process with intensity measure $\mathbb E\eta_K$.
Proof. Fix a bounded Borel set B. From Theorem 3.1 we know that ηK is NA, so in order to get the conclusion of this corollary it is enough (see Proposition 4.1) to show that ηK(B) <cx η 2(B). From [9, Proposition 9] we know that ηK(B) is distributed as a sum of independent Bernoulli random variables. From the definition of the log-concave ordering, ‘<lc’ in [Reference Whitt31], it follows that ηK(B) <lc η 2(B), which in turn implies that ηK(B) <cx η 2(B); see Theorem 1 of [Reference Whitt31].
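The convex-order step in this proof can be checked numerically in a small case. The sketch below is an illustration only, not part of the argument: the success probabilities in `probs` are hypothetical, and the helper names are ours. It uses the standard fact that, for integer-valued random variables with equal means, the convex order is equivalent to the ordering of the stop-loss transforms $\mathbb E(X-c)_+$ at integer thresholds c. The Poisson distribution is truncated at a large index, which only lowers the Poisson side of the comparison.

```python
import math

def poisson_binomial_pmf(probs):
    """PMF of a sum of independent Bernoulli(p_i) variables, by convolution."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)      # the new variable equals 0
            nxt[k + 1] += mass * p          # the new variable equals 1
        pmf = nxt
    return pmf

def poisson_pmf(mean, kmax):
    """Poisson PMF on {0, ..., kmax}; the tail beyond kmax is dropped."""
    pmf = [math.exp(-mean)]
    for k in range(1, kmax + 1):
        pmf.append(pmf[-1] * mean / k)
    return pmf

def stop_loss(pmf, c):
    """E (X - c)_+ for X with the given PMF on {0, 1, ...}."""
    return sum(mass * max(k - c, 0) for k, mass in enumerate(pmf))

# Hypothetical Bernoulli success probabilities (an assumption for illustration).
probs = [0.9, 0.5, 0.3, 0.1, 0.05]
mean = sum(probs)

bern = poisson_binomial_pmf(probs)   # plays the role of eta_K(B)
pois = poisson_pmf(mean, 60)         # plays the role of eta_2(B)

# Stop-loss comparison at integer thresholds (small tolerance for rounding).
stop_loss_ok = all(
    stop_loss(bern, c) <= stop_loss(pois, c) + 1e-12 for c in range(0, 8)
)
# The void probability comparison is visible at k = 0.
void_ok = bern[0] <= pois[0]
print(stop_loss_ok, void_ok)
```

Both checks are expected to succeed for any choice of success probabilities in [0, 1], since each Bernoulli variable is dominated in the convex order by a Poisson variable with the same mean and the convex order is preserved under sums of independent variables.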
The above corollary for the case of jointly observable sets and $\mathbb{X} = \mathbb R^d$ was proved in [Reference Błaszczyszyn and Yogeshwaran4, Proposition 5.3] using a different argument.
For mixed sampled point processes on general spaces, we obtain the following comparison result.
Proposition 4.3. Suppose that η 1 is a mixed sampled point process on $(\mathbb{X},{\mathcal X})$ defined by (3.1) for which $\tau\in \mathcal S$. Then
$$ \eta_1 <_{\rm dcx} \eta_2, $$
where η 2 denotes a Poisson point process on $\mathbb{X}$ with intensity measure $\mathbb E\eta_1$.
Proof. From Theorem 3.3 we know that η 1 is NA, so, by Proposition 4.1, it suffices to show that η 1(B) <cx η 2(B). From the definition of mixed sampled point processes we know that η 1(B) is distributed as a random sum $\sum_{i=1}^\tau U_i$, where (Ui, i ≥ 1) is an i.i.d. sequence of Bernoulli, i.e. {0, 1}-valued, variables with success probability F(B). Suppose first that $\tau \in \mathcal S_n$. From the definition of the log-concave ordering ‘<lc’ in [Reference Whitt31], it follows that τ <lc κ, where κ has a Poisson distribution with mean $\mathbb E(\tau)$. Therefore, τ <cx κ; see Theorem 1 of [Reference Whitt31]. It follows that $\sum_{i=1}^{\tau}U_i <_{\rm cx} \sum_{i=1}^{\kappa}U_i$, where κ is now assumed to be independent of (Ui, i ≥ 1) (see, e.g. [Reference Kulik and Szekli14, Corollary 4.5]). From this we get η 1(B) <cx η 2(B), since $\mathbb E\eta_1(B)=\mathbb E(\tau)\mathbb E(U_i)$ and $\sum_{i=1}^{\kappa}U_i$ has a Poisson distribution. For arbitrary $\tau\in \mathcal S$, we apply weak approximation by random variables from $\mathcal S_n,\, n\ge 1$.
From Proposition 4.2, we obtain the following corollary.
Corollary 4.4. Suppose that η is a mixed sampled point process on ℝd, defined by (3.1), for which $\tau\in \mathcal S$. Then the following statements hold.
1. (Moment measures) $\mathbb E (\eta (B_1)\cdots\eta (B_n))\le \mathbb E (\eta (B_1)) \cdots\mathbb E(\eta (B_n))$ for all disjoint, bounded Borel sets B 1, …, Bn, n ≥ 1.
2. (Void probabilities) $\mathbb P(\eta (B)=0)\le \exp(-\mathbb E\eta(B))$ for all bounded Borel sets B.
We illustrate this with an example of a direct approach to the comparison of void probabilities.
Example 4.1 (Comparison of binomial mixed sampled p.p. with Poisson p.p. on ℝd.) Let η be a mixed sampled point process on ℝd with τ being a binomially distributed random variable. We shall compare the void probabilities of this process with those of a Poisson point process with the same intensity measure. In general, for a simple point process η on ℝd, to test whether $\mathbb P(\eta (B)=0)\le \exp(-\mathbb E(\eta(B)))$, it is enough to check (see [Reference Błaszczyszyn and Yogeshwaran5, Proposition 3.1]) that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU28.gif?pub-status=live)
for disjoint B, B ′. For arbitrary, measurable disjoint sets B, B ′, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU29.gif?pub-status=live)
Since τ has a binomial distribution with, say, parameters n (number of trials) and p ∈ (0, 1) (success probability), we have $P_\tau(1-s)=(p(1-s)+(1-p))^n$. It is easy to see by differentiation that $\phi(s) := -\log P_\tau(1-s)$ is then an increasing and convex function such that ϕ(0) = 0. It is known that such a function is superadditive; therefore, ϕ(s + t) ≥ ϕ(s) + ϕ(t), and then $P_\tau(1-(F(B)+F(B')))\le P_\tau(1-F(B))P_\tau(1-F(B'))$. In this case, for disjoint B, B ′ we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU30.gif?pub-status=live)
Therefore, for this process, we obtain $\mathbb P(\eta(B)=0)\le \exp(-\mathbb E(\eta(B)))$.
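The conclusion of Example 4.1 can also be verified numerically. The following sketch is illustrative only: the values of `n` and `p` and the grid of values standing in for F(B) are hypothetical, and the function names are ours. It uses the random-sum representation of η(B), which gives $\mathbb P(\eta(B)=0)=P_\tau(1-F(B))=(1-pF(B))^n$ and $\mathbb E\eta(B)=npF(B)$, and it checks both the void probability bound and the superadditivity of ϕ on a grid.

```python
import math

def void_probability(n, p, f_b):
    """P(eta(B) = 0) = P_tau(1 - F(B)) for tau ~ Binomial(n, p)."""
    return (p * (1.0 - f_b) + (1.0 - p)) ** n

def poisson_void_probability(n, p, f_b):
    """Void probability of a Poisson process with the same mean n * p * F(B)."""
    return math.exp(-n * p * f_b)

def phi(n, p, s):
    """phi(s) = -log P_tau(1 - s), the function shown to be superadditive."""
    return -math.log(void_probability(n, p, s))

# Hypothetical parameters, chosen only for illustration.
n, p = 10, 0.4
grid = [i / 20 for i in range(21)]   # candidate values of F(B) in [0, 1]

# Void probability comparison with the Poisson process.
void_ok = all(
    void_probability(n, p, f) <= poisson_void_probability(n, p, f) + 1e-15
    for f in grid
)

# Superadditivity phi(s + t) >= phi(s) + phi(t) whenever s + t <= 1.
superadditive_ok = all(
    phi(n, p, s + t) >= phi(n, p, s) + phi(n, p, t) - 1e-12
    for s in grid
    for t in grid
    if s + t <= 1.0
)
print(void_ok, superadditive_ok)
```

The first check is the elementary inequality $(1-x)^n\le {\rm e}^{-nx}$ with x = pF(B); the second reflects that, in the binomial case, ϕ(s) = −n log(1 − ps).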
5. Some applications
Let η be a point process on a complete, separable metric space $\mathbb{X}$. Using the Chebyshev inequality, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU31.gif?pub-status=live)
for all bounded Borel sets B and ∊ > 0.
Similarly, using the Chernoff bound,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU32.gif?pub-status=live)
for any t, a > 0, and the upper bounds can be replaced by the values taken from a process larger in ‘<dcx’ than η, directly using the definition of ‘<dcx’ recalled in Section 4.1. If η is a determinantal or an NA mixed sampled point process, the corresponding larger process is a Poisson process, which can be used to obtain upper bounds and concentration inequalities using Corollary 4.3 and Proposition 4.3, respectively.
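As a concrete instance of this replacement, the Chernoff bound can be evaluated with the Poisson moment generating function. The sketch below is an illustration under stated assumptions, not a result from the paper: it takes η(B) to be binomially distributed (as for a binomial mixed sampled process), the parameters `n` and `q` are hypothetical, and the function names are ours. Writing μ for $\mathbb E\eta(B)$, convexity of x ↦ exp(tx) gives $\mathbb E\exp(t\eta(B))\le \exp(\mu({\rm e}^t-1))$, and minimising $\exp(-ta+\mu({\rm e}^t-1))$ over t > 0 at t = log(a/μ), for a > μ, yields the bound exp(a − μ − a log(a/μ)).

```python
import math

def poisson_chernoff_bound(mu, a):
    """Optimised Chernoff bound exp(-t a) * exp(mu (e^t - 1)) at t = log(a / mu).

    For a <= mu the optimisation over t > 0 gives nothing better than 1.
    """
    if a <= mu:
        return 1.0
    return math.exp(a - mu - a * math.log(a / mu))

def binomial_tail(n, q, a):
    """Exact P(X >= a) for X ~ Binomial(n, q)."""
    return sum(
        math.comb(n, k) * q ** k * (1.0 - q) ** (n - k)
        for k in range(math.ceil(a), n + 1)
    )

# Hypothetical parameters: eta(B) ~ Binomial(n, q), so mu = n * q.
n, q = 20, 0.25
mu = n * q

# The exact tail should sit below the Poisson-based Chernoff bound.
tail_ok = all(
    binomial_tail(n, q, a) <= poisson_chernoff_bound(mu, a)
    for a in range(6, n + 1)
)
print(tail_ok)
```

The point of the comparison is that the bound depends on η only through the mean μ, so it applies uniformly to every process dominated by the Poisson process in the sense described above.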
Similarly, for all bounded Borel sets B and ∊, t > 0, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU33.gif?pub-status=live)
Using Corollary 2 of [Reference Christofides and Vaggelatou8], we can obtain Kolmogorov-type inequalities from the negative association property of η.
Corollary 5.1. Suppose that η is a mixed sampled point process on $\mathbb{X}$, for which $\tau\in \mathcal S$. Then, for any increasing sequence bk, k ≥ 1, of positive numbers, any collection of disjoint bounded Borel sets $B_1,\ldots, B_n\in {\mathcal X}$, and ∊ > 0,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190711235546464-0282:S002190021900010X:S002190021900010X_eqnU34.gif?pub-status=live)
Acknowledgements
This work was in part supported by the German Research Foundation (DFG) under grant LA965/9-2, awarded as part of the DFG-Forschergruppe FOR 1548 ‘Geometry and Physics of Spatial Random Systems’, and by the National Science Centre, Poland, under grant NCN 2015/19/B/ST1/01152.