1. Introduction
Perfect sampling, that is, the generation of unbiased samples from a target distribution (also referred to as perfect simulation or exact sampling), is an important and exciting area of research in stochastic simulation. In this paper, we introduce and investigate a novel methodology for generating perfect samples of finite Gibbs hard-sphere models, which are an important family of Gibbs point processes. Roughly, a Gibbs hard-sphere model can be described as a set of spheres whose centers constitute a Poisson point process (PPP) on a bounded Euclidean space, conditioned on no two spheres overlapping. The proposed methodology combines importance sampling (IS) and acceptance–rejection (AR) techniques to achieve substantial performance improvement in certain important regimes of interest. In statistical physics, there is a large body of work related to Gibbs hard-sphere models; see, e.g., [Reference Adams1, Reference Amorós and Ravi2, Reference Baranau and Tallarek6, Reference Juneja and Mandjes24, Reference Lieb and Mattis28, Reference Mayer and Mayer30, Reference Pathria35, Reference Salsburg, Zwanzig and Kirkwood37]. These models are also important in modeling adsorption of latexes or proteins on solid surfaces (see [Reference Senger, Voegel and Schaaf38, Reference Tarjus, Schaaf and Talbot40] and references therein). In the analysis of wireless communication networks, it is common to use Gibbs hard-sphere models to model base-stations in a cellular network, because normally no two base-stations are to be placed closer than a certain distance from each other [Reference Haenggi18, Reference Stoyan, Kendall and Mecke39]. Our results can be used to assess the stationary behavior of code-division multiple access wireless networks.
Literature review: The existing literature offers several perfect sampling methods for Gibbs hard-sphere models. Among these, the dominated coupling-from-the-past (CFTP) methods are most prominent; these are based on the seminal paper by Propp and Wilson [Reference Propp and Wilson36] (see [Reference Huber22, Reference Kendall25–Reference Kendall and Møller27]). Another well-known perfect sampling method for the Gibbs hard-sphere models is the backward–forward algorithm by Ferrari et al. [Reference Ferrari, Fernández and Garcia12]; see also [Reference Garcia16, Reference Huber23]. For some of the applications of perfect sampling for these models, refer to [Reference Berthelsen and Møller7, Reference Berthelsen and Møller8, Reference Møller, Pettitt, Reeves and Berthelsen34]. For other related literature on perfect sampling for spatial point processes, refer to [Reference Häggström, van Lieshout and Møller19, Reference Møller, Huber and Wolpert32].
As mentioned in [Reference Garcia16], all the existing methods are, in some sense, complementary to each other. They take advantage of the important property that the distribution of a Gibbs hard-sphere model can be realized as an invariant measure of a spatial birth-and-death process, called the interaction process. For example, the main ingredient of the dominated CFTP method is to construct a birth-and-death process backward in time, starting from its steady state at time zero, such that it dominates the interaction process. Thinning of the dominating process is then used to construct coupled upper- and lower-bound processes forward in time, and the coalescence of these two bounding processes guarantees a perfect sample from the target measure, which is the invariant measure of the interaction process. The backward–forward algorithm is based on the construction of the clan of ancestors, which uses thinning of a dominating process and extends the applicability to infinite-volume measures. A crucial drawback of the naive AR and the dominated CFTP methods is that they are guaranteed to be efficient only if the intensity of the Gibbs hard-sphere model is close to the intensity of the reference PPP; see [Reference Huber23] for details. In addition, most of the dominated CFTP methods suffer from the so-called impatient-user bias (a bias that is induced when a user aborts long runs of the algorithm); see [Reference Fill13, Reference Fill, Machida, Murdoch and Rosenthal14, Reference Thönnes41].
Our contributions: AR methods are free of the impatient-user bias and involve neither thinning nor coupling (which are crucial for the other methods). Despite being an obvious alternative to the existing methods, to the best of our knowledge, the use of AR methods is still largely unexplored in the context of Gibbs point processes (except for brief discussions, e.g. in [Reference Foss, Juneja, Mandjes and Moka15, Reference Huber23]). AR methods for Gibbs hard-sphere models are amenable to further algorithmic enhancements that may substantially decrease the expected running time of the algorithm. The methodology proposed here provides one such enhancement. To highlight the significance of the proposed methodology, we compare its running time complexity with that of both the naive AR and the dominated CFTP methods. This effectiveness analysis is based on our large deviations analysis of the non-overlapping probability. A brief summary of our results is as follows:
• Our first key contribution is that we conduct a large deviations analysis of the probability of spheres not overlapping with each other when their centers constitute a homogeneous PPP. More specifically, we consider a homogeneous marked PPP on
$[0,1]^d$ with intensity
$\lambda$, where the points are the centers of spheres with independently and identically distributed (i.i.d.) radii as marks, which are independent of the centers and identical in distribution to
$R/\lambda^\eta$ for a positive bounded random variable R and a constant
${\eta > 0}$. We establish large deviations of the probability that spheres do not overlap with each other, as
${\lambda \nearrow \infty}$. This analysis is useful in the study of the asymptotic behavior of the expected running time complexities of the proposed and the existing perfect sampling methods for the Gibbs hard-sphere models. This analysis may also be of independent interest.
• Our second key contribution is that we propose a novel IS-based AR algorithm for generating perfect samples of the Gibbs hard-sphere model obtained by considering the homogeneous marked PPP conditioned on no overlap of the spheres. This is achieved by partitioning the underlying configuration space and arriving at an appropriate change of measure on each partition. The applicability of the proposed algorithm is illustrated in two scenarios. In the first scenario, all the spheres are assumed to be of a fixed size (i.e., R is a fixed positive constant). We develop a grid-based IS technique under which spheres are generated sequentially so that the chance of spheres overlapping is small, and the corresponding likelihood ratio has a better deterministic upper bound that improves the acceptance probability in each iteration of the algorithm. In the second scenario, we consider the general case where spheres have i.i.d. radii. In this scenario, we divide the underlying configuration space into two sets. On one set, the sum of the volumes of spheres is bounded from below, and on the other set, the volume sum takes small values so that the set consists of only rare configurations. For the first set, we develop a grid-based IS method that is similar to the one described above, and for the second set, we use an exponential twisting on the sphere volume distribution. In both the scenarios, the new method provably substantially improves the performance of the algorithm compared to the naive AR method.
• We analytically and numerically compare the performance of the proposed IS-based AR method with that of some of the dominated CFTP methods. The numerical results support our analytical conclusions that the proposed method is substantially more efficient than the existing methods over the high-density regime, where
$\eta d \leq 1$ and
$\lambda$ is large.
Organization: Section 2 provides a definition of the hard-sphere model. Section 3 presents the large deviations of the non-overlapping probability. In Section 4, we first review a naive AR method and analyze its expected running time complexity, then propose and analyze the IS-based AR method. Section 5 gives a review of the well-known dominated CFTP methods for the hard-sphere models. Section 6 illustrates the efficiency of the proposed methodology using numerical experiments. Section 7 provides a brief conclusion. All proofs are presented in Appendix A.
2. Preliminaries
First we introduce some notation. $X \sim F$ denotes that the distribution of a random object X is F.
${\mathsf{Poi}}(\lambda)$ and
${\mathsf{Bern}}(p)$ denote, respectively, Poisson distribution with mean
$\lambda > 0$ and Bernoulli distribution with success probability p. The uniform distribution on [0,1] is denoted by
${\mathsf{Unif}}(0,1)$. For an event A, the indicator function I(A) is equal to 1 if A occurs, and to 0 otherwise. A measure
$\mu_1$ is absolutely continuous with respect to a measure
$\mu_2$ on a measurable set A if
$\mu_1(B\cap A) = 0$ for any measurable B such that
$\mu_2(B\cap A) = 0$. For any probability measure
$\mu$,
${\mathbb{P}}_{\mu}(A)$ denotes the probability of an event A under
$\mu$, and
${\mathbb{E}}_{\mu}[\cdot]$ denotes the associated expectation. We drop the subscript
$\mu$ when it is not relevant. For any non-negative real-valued functions f and g, we write
$f(x) = O(g(x))$ if
$\limsup_{x\rightarrow \infty} f(x)/g(x) \leq c$ for some constant
$c > 0$,
$f(x) = \varOmega(g(x))$ if
$g(x) = O(f(x))$, and
$f(x) = o(g(x))$ if
$\limsup_{x\rightarrow \infty} f(x)/g(x) = 0 $. We write
$f(x) = \Theta(g(x))$ if both
$f(x) = O(g(x))$ and
$f(x) = \varOmega(g(x))$. For any real number x, the largest integer n such that
$n \leq x$ is denoted by
$\lfloor x\rfloor$, and the smallest integer n such that
$n \geq x$ is denoted by
$\lceil x\rceil$. The set of all the non-negative integers is denoted by
$\mathbb{N}_0$.
A random finite subset $\textbf{X} = \{X_1, \dots, X_N\}$ of an observation window
$W \subset {\mathbb{R}}^d$ is called a Poisson point process (PPP) with a finite intensity measure
$\nu$ on W if
$N \sim {\mathsf{Poi}}(\nu(W))$ and for every
$n \in \mathbb{N}_0$, conditioned on
$N = n$, the points
$X_1, \dots, X_n$ are i.i.d. with distribution
$\nu({\textrm{d}} x)/ \nu(W)$. A PPP on
$[0,1]^d$ is called a
$\lambda$-homogeneous PPP with intensity
$\lambda > 0$ if the intensity measure
$\nu({\textrm{d}} x) = \lambda\, {\textrm{d}} x$, where
${\textrm{d}} x$ is Lebesgue measure on $[0,1]^d$. To each point
$X_i$ of the
$\lambda$-homogeneous PPP on
$[0,1]^d$ we associate a mark, which is a non-negative number interpreted as the radius of a sphere centered at
$X_i$. In particular, a
$\lambda$-homogeneous marked PPP on
$[0,1]^d$ is a PPP on
$W = [0,1]^d \times [0, \infty)$ with the intensity measure
${\nu({\textrm{d}} x \times {\textrm{d}} r) = \lambda {\textrm{d}} x \times F({\textrm{d}} r)}$, where F is the distribution of each radius. That is, the centers constitute a
$\lambda$-homogeneous PPP on
$[0,1]^d$ which is independent of the radii, and the radii are i.i.d. with distribution F. A realization of the marked PPP with n points is denoted by
$\textbf{x} = \{(y_1, r_1), \dots, (y_n, r_n)\}$, where
$r_i \geq 0$ is the radius of the sphere centered at
$y_i \in [0,1]^d$. Define
$\mathscr{G} = \cup_{n \in \mathbb{N}_0}\mathscr{G}_n$, where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU1.png?pub-status=live)
Now we define a Gibbs hard-sphere model. Suppose that $\mu^0$ is the distribution of a
$\lambda$-homogeneous marked PPP as defined above, with F being the distribution of
$R/\lambda^\eta$ for a constant
$\eta > 0$ and a non-negative random variable R. Let
$\mathscr{A} \subset \mathscr{G}$ be the set of all configurations with no two spheres overlapping with each other. Then the distribution
$\mu$ of the Gibbs hard-sphere model is absolutely continuous with respect to
$\mu^0$, with the Radon–Nikodym derivative given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn1.png?pub-status=live)
where the normalizing constant ${\mathcal{P}(\lambda)}$ is the non-overlapping probability given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn2.png?pub-status=live)
We refer to the Gibbs hard-sphere model as a torus-hard-sphere model if the boundary of the underlying space $[0,1]^d$ is periodic; that is, a sphere S(x, r) centered at
$x \in [0,1]^d$ with radius r is defined by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU2.png?pub-status=live)
where $\|\cdot \|$ is the d-dimensional Euclidean norm and
$\mathsf{mod}$ denotes the modulo operation [Reference Boute9]. If the boundary is not periodic, we refer to the model as a Euclidean-hard-sphere model.
From now on, the phrase ‘hard-sphere model’ refers to either of these two models, and we assume that R is bounded from above by a constant $\overline{r} > 0$. In particular, if R is a constant, we take
$\overline{r} = R$. Furthermore, we assume that
$2\overline{r}/\lambda^\eta < 1$ to avoid certain trivial difficulties such as the possibility of a sphere on the torus overlapping with itself.
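As a concrete illustration of the periodic boundary just described, the torus metric and the resulting overlap check can be sketched in a few lines; this is our own minimal sketch (function names are not from the paper):

```python
import math

def torus_dist(x, y):
    """Distance between x and y on the unit torus [0,1]^d: in each coordinate,
    the shorter of the direct gap and the wrap-around gap is used."""
    return math.sqrt(sum(min(abs(a - b), 1.0 - abs(a - b)) ** 2
                         for a, b in zip(x, y)))

def spheres_overlap(c1, r1, c2, r2, torus=True):
    """Two spheres overlap iff the distance between their centers is below
    the sum of their radii (torus or Euclidean boundary)."""
    if torus:
        dist = torus_dist(c1, c2)
    else:
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))
    return dist < r1 + r2
```

For example, two spheres near opposite corners of $[0,1]^2$ may overlap on the torus while being far apart in the Euclidean model.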
3. Large deviations results
In this section, we obtain large deviations results for the non-overlapping probability ${\mathcal{P}(\lambda)}$. We use these results in analyzing the running time complexity of both the naive and the IS-based AR methods. Hereafter,
$\gamma = {\pi^{d/2}}/{\Gamma(d/2 + 1)}$, where
$\Gamma(\cdot)$ is the gamma function. Note that the volume of a sphere with radius r is given by
$\gamma r^d$. Define
$ m_1 \,{:\!=}\, {\mathbb E}[(R + \widehat R)^{d}]$, where
$\widehat R$ is independent of R and identical to R in distribution, and let
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn3.png?pub-status=live)
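As a quick numeric sanity check of the volume constant $\gamma$ introduced above (the function name is ours): for $d = 2$ it gives $\pi$, and for $d = 3$ it gives $4\pi/3$.

```python
import math

def unit_ball_volume(d):
    """gamma = pi^(d/2) / Gamma(d/2 + 1); a sphere of radius r in dimension d
    then has volume gamma * r^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)
```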
Theorem 3.1. The non-overlapping probability ${\mathcal{P}(\lambda)}$ satisfies
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU3.png?pub-status=live)
When $\eta d = 1$, the limit
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU4.png?pub-status=live)
exists and $-1 \leq \delta < 0$. Furthermore,
$\delta \nearrow 0$ if
$\gamma m_1 \searrow 0$, and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU5.png?pub-status=live)
if $R \equiv \overline{r}$ and
$\gamma' \overline{r}^d > 1$. In addition, for the torus-hard-sphere model,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU6.png?pub-status=live)
An important and fundamental characteristic of a Gibbs point process is its intensity; see, for example, [Reference Mase29] and references therein, as well as [Reference Baddeley and Nair5]. Roughly speaking, the intensity of a Gibbs point process is the expected number of points of the process per unit volume. There is an interesting connection between the regimes considered in Theorem 3.1 and the asymptotic intensity of the torus-hard-sphere model. To see this, assume that each sphere has a fixed radius $\overline{r}/\lambda^\eta$. Since the underlying space is
$[0,1]^d$, the intensity
$\rho(\lambda)$ of the model is exactly equal to the expected total number of points in a realization of the model. Equivalently, we may consider the fraction of the volume
$\mathsf{VF}(\lambda)$ occupied by the spheres, given by
$\mathsf{VF}(\lambda) = \rho(\lambda) \gamma \overline{r}^d \lambda^{- \eta d}$. For the torus-hard-sphere model, the volume fraction
$\mathsf{VF}(\lambda)$ is bounded from above by
$\rho^{\max} \gamma$, where
$\rho^{\max}$ is the closest packing density, defined by
$\rho^{\max} = \lim_{n \rightarrow \infty} N_n/(n+1)^d,$ with
$N_n$ being the maximal number of mutually disjoint unit-radius spheres which are included in the hypercube
$[\!-(n+1/2), (n+1/2)]^d$; see [Reference Mase29]. Proposition 3.1 describes the asymptotic behavior of
$\mathsf{VF}(\lambda)$ as
$\lambda \to \infty$ for different values of
$\eta d$. In particular, the regime with
$\eta d > 1$ is a low-density regime, while the regime with
$\eta d < 1$ is a high-density regime. In the high-density regime, the intensity of the hard-sphere model is much smaller than the intensity
$\lambda$ of the reference PPP.
Proposition 3.1. For the torus-hard-sphere model with a fixed radius $R = \overline{r}$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU7.png?pub-status=live)
4. Acceptance–rejection-based algorithms
In Section 4.1, we present a naive AR algorithm for generating perfect samples of the hard-sphere model and analyze its expected running time complexity. We then proceed to present and analyze our IS-based AR algorithm, where the key idea is to partition the configuration space $\mathscr{G}$ so that a well-chosen IS technique can be implemented on each partition. One such IS for the hard-sphere model is the reference IS presented in Section 4.3, where spheres are generated sequentially so that, whenever possible, the center of each sphere is selected uniformly over the region of
$[0,1]^d$ that guarantees no overlap with the existing spheres. However, generating samples from this IS measure can be computationally challenging when
$d \geq 2$. The grid-based IS introduced in Sections 4.4 and 4.5 overcomes this difficulty by imitating the reference IS; interestingly, it is more efficient than the reference IS.
In every algorithm presented in this paper, the running time complexity is calculated under the assumption that checking overlap of a newly generated sphere with an existing sphere is done in a sequential manner. That is, if there are n existing spheres, the expected running time complexity of the overlap check is proportional to n. However, if enough computing resources are available, the overlap check can be done in parallel so that its running time complexity is a constant. We omit the discussion of this parallel overlap check because it is easy to modify the results to accommodate the parallel case, and this does not change the key conclusions of the paper.
4.1. Naive acceptance–rejection algorithm
Algorithm 4.1 is a naive AR algorithm for generating perfect samples of the Gibbs hard-sphere model. The basic idea of the algorithm is standard [Reference Devroye11], and its correctness is straightforward and hence omitted.
Algorithm 4.1 Naive AR Method
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_tab1.png?pub-status=live)
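The structure of the naive AR loop can be sketched as follows; this is a simplified illustration of our own (fixed radius, Euclidean boundary, Poisson sampling by inversion, suitable only for moderate $\lambda$), not the paper's exact pseudocode:

```python
import math
import random

def naive_ar_hard_spheres(lam, radius, d=2, rng=random):
    """Naive AR: repeatedly draw a Poisson(lam) number of uniform centers on
    [0,1]^d and accept the configuration only if no two spheres of the given
    (common) radius overlap."""
    while True:
        # Sample N ~ Poisson(lam) by CDF inversion (fine for moderate lam).
        n, p, u = 0, math.exp(-lam), rng.random()
        cdf = p
        while u > cdf:
            n += 1
            p *= lam / n
            cdf += p
        centers = [tuple(rng.random() for _ in range(d)) for _ in range(n)]
        # Sequential overlap check: abandon the configuration at the first overlap.
        ok = True
        for i in range(n):
            for j in range(i):
                dist = math.sqrt(sum((a - b) ** 2
                                     for a, b in zip(centers[i], centers[j])))
                if dist < 2 * radius:
                    ok = False
                    break
            if not ok:
                break
        if ok:
            return centers
```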
Let ${\mathcal{T}}_{\mathsf{NAR}}$ be the expected running time complexity of Algorithm 4.1, where the running time complexity denotes the number of elementary operations performed by the algorithm; every elementary operation takes at most a fixed amount of time. Note that the acceptance probability of each iteration is
${\mathcal{P}(\lambda)}$. Thus the expected total number of iterations of the algorithm is
$1/{\mathcal{P}(\lambda)}$. Suppose
$C_{\mathsf{itr}}(\lambda)$ is the expected running time complexity of an iteration. Then
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn4.png?pub-status=live)
We now establish bounds on ${\mathcal{T}}_{\mathsf{NAR}}$, and then determine its asymptotic behavior as
$\lambda \nearrow \infty$ using Theorem 3.1. In each iteration of Algorithm 4.1, spheres are generated in a sequential order until we see an overlap or a configuration with N non-overlapping spheres. The key to proving Proposition 4.1 is to establish that the expected number of spheres generated per iteration is
$\Theta\left(\lambda^{\min\{\eta d, 2 \}}\right)$.
Proposition 4.1. The expected running time complexity $C_{\mathsf{itr}}(\lambda)$ of an iteration of the naive AR algorithm, Algorithm 4.1, satisfies
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn5.png?pub-status=live)
Furthermore, the expected total running time ${\mathcal{T}_{\mathsf{NAR}}}$ satisfies
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU8.png?pub-status=live)
Remark 4.1. From (4.2) and Theorem 3.1, we see that for large values of $\lambda$ and for
$\eta d < 2$,
${\mathcal{T}_{\mathsf{NAR}}}$ is mainly governed by
${\mathcal{P}(\lambda)}$, which can be very small for large
$\lambda$. This suggests that any rejection-based perfect sampling algorithm with a significant improvement in the acceptance probability will have a significantly improved running time complexity.
4.2. Importance-sampling-based acceptance–rejection algorithm
A sequence of tuples
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU9.png?pub-status=live)
with some $K < \infty$ is called a stable IS sequence if for each
${n \in \mathbb{N}_0}$,
${\left(D_{n,k} \right)_{k = 1}^{K}}$ is a partition of
$\mathscr{G}_n$, and
$(\mu_{n,k})_{k = 1}^{K}$ is a sequence of probability measures such that
$\mu^0$ is absolutely continuous with respect to
$\mu_{n,k}$ on
$D_{n,k} \cap \mathscr{A}$, and the corresponding likelihood ratio
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU10.png?pub-status=live)
satisfies
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU11.png?pub-status=live)
for $k = 1, \dots, K$. Under the stability condition, for every measurable subset
${\mathscr{B} \subseteq \mathscr{G}}$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn6.png?pub-status=live)
where ${\widetilde \sigma(n) \,{:\!=}\, \sum_{k=1}^{K} \sigma_{n,k}}$,
${U \sim \mathsf{Unif}(0,1)}$, and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU12.png?pub-status=live)
Let M be a non-negative integer-valued random variable with the probability mass function (PMF) defined by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn7.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU13.png?pub-status=live)
The PMF (4.4) is well defined because ${\mathbb{E}}\left[\widetilde \sigma(N)\right]$ is finite under the stability condition. Now consider Algorithm 4.2.
Algorithm 4.2 IS-based AR method
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_tab2.png?pub-status=live)
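Schematically, an iteration of Algorithm 4.2 with $K = 1$ can be read as follows. All helper names (`sigma`, `sample_mu`, `likelihood`, `is_admissible`) are our hypothetical stand-ins for $\sigma_n$, $\mu_n$, the likelihood ratio, and membership in $\mathscr{A}$, and the Poisson support is truncated at `n_max` for simplicity; this is a sketch of the scheme's shape, not the paper's exact pseudocode:

```python
import math
import random

def is_ar_sample(lam, sigma, sample_mu, likelihood, is_admissible,
                 n_max=200, rng=random):
    """Schematic IS-based AR iteration with K = 1 (the partition index k is
    suppressed): draw M from the Poisson(lam) PMF tilted by sigma(n), draw a
    configuration from the IS measure, and accept it with probability
    L(x) * I(x admissible) / sigma(M), which is <= 1 under stability."""
    # PMF of M: Poisson(lam) weights tilted by sigma(n), truncated at n_max.
    weights, log_p = [], -lam
    for n in range(n_max + 1):
        if n > 0:
            log_p += math.log(lam) - math.log(n)
        weights.append(math.exp(log_p) * sigma(n))
    total = sum(weights)
    while True:
        # Draw M from the tilted PMF by inversion.
        u, acc = rng.random() * total, 0.0
        for m, w in enumerate(weights):
            acc += w
            if u <= acc:
                break
        x = sample_mu(m, rng)  # configuration of m spheres under the IS measure
        if is_admissible(x) and rng.random() <= likelihood(x) / sigma(m):
            return x
```

With $\sigma \equiv 1$, a unit likelihood ratio, and all configurations admissible, the scheme reduces to drawing Poisson-many i.i.d. points, which is a convenient smoke test.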
Proposition 4.2. Algorithm 4.2 generates a perfect sample of the Gibbs hard-sphere model. Furthermore, let $N \sim {\mathsf{Poi}}(\lambda)$. Then the probability of accepting the configuration generated in an iteration of Algorithm 4.2 is given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn8.png?pub-status=live)
We omit the proof of Proposition 4.2: correctness follows easily from (4.3), and (4.5) follows from the observation that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU15.png?pub-status=live)
Note that the expected number of iterations of Algorithm 4.2 is $1/P_{\mathsf{acc}}(\lambda)$. Corollary 4.1 is a simple but important consequence of Proposition 4.2.
Corollary 4.1. For all stable IS sequences
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU16.png?pub-status=live)
with the same ${\mathbb{E}}[\widetilde \sigma(N)] = \sum_{k =1}^{K}{\mathbb{E}}\left[\sigma_{N,k}\right]$, the expected number of iterations of Algorithm 4.2 is the same.
Suppose that $\widetilde C_{\mathsf{itr}}(\lambda)$ is the expected running time complexity of an iteration of Algorithm 4.2. Then the expected total running time of the algorithm is given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn9.png?pub-status=live)
where ${N \sim {\mathsf{Poi}}(\lambda)}$. Recall that the acceptance probability of the naive AR method is
${\mathcal{P}(\lambda)}$. It is therefore natural to seek a stable IS sequence
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU17.png?pub-status=live)
so that ${\widetilde C_{\mathsf{itr}}(\lambda){\mathbb{E}}[\widetilde \sigma(N)]}$ is smaller than
${C_{\mathsf{itr}}(\lambda)}$. In Subsections 4.4 and 4.5, we present applications of Algorithm 4.2 where
$\mathcal{T}_{\mathsf{ISAR}}$ is indeed much smaller than
${\mathcal{T}_{\mathsf{NAR}}}$.
Remark 4.2. (Extension of IS-based AR to general Gibbs point processes.) Suppose that $\mu$ is the distribution of a Gibbs point process that is absolutely continuous with respect to
$\mu^0$, with the corresponding Radon–Nikodym derivative given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU18.png?pub-status=live)
where the constant $\beta \in {\mathbb{R}}$ is known as inverse temperature, V is called the non-negative potential function, and
${Z = {\mathbb{E}}_{\mu^0} \left[\exp\left( - \beta\, V(\textbf{X})\right) \right]}$ is the normalizing constant. If the stability condition holds true when
$I(\textbf{x} _n\in \mathscr{A})$ is replaced by
$\exp\left( - \beta\, V(\textbf{x}_n)\right)$, then Algorithm 4.2 can generate perfect samples from
$\mu$ if in line 4.2 of the algorithm,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU19.png?pub-status=live)
To see that the hard-sphere model is a special case of such a Gibbs point process, take $\beta > 0$ and assume that
${V(\textbf{x}) = 0}$ if
$\textbf{x}$ is a non-overlapping configuration of spheres; otherwise,
${V(\textbf{x}) = \infty}$.
4.3. Reference importance sampling
We now introduce an IS measure, called reference IS and denoted by $\widetilde \mu_n$ for each n, such that
$\left\{ \left(\mathscr{G}_n, \widetilde \mu_n, \sigma_n\right)\right\}_{n \in \mathbb{N}_0}$ is a stable IS sequence (with
$K = 1$) that can be used in Algorithm 4.2 for generating perfect samples of the hard-sphere model for an appropriate choice of the sequence
$\left\{\sigma_n \,{:}\, n \in \mathbb{N}_0\right\}$. Under
$\widetilde \mu_n$, we first generate an i.i.d. sequence
$R_1, \dots, R_n$ identical in distribution to R, and then generate n spheres sequentially as follows. Generate the center of the first sphere uniformly distributed on
$[0,1]^d$. Suppose that
$i-1$ spheres are already generated. For the ith sphere generation, a subset
$\mathcal{B}_i \subseteq [0,1]^d$ is called the blocking region if
$\mathcal{B}_i$ is the largest set such that if the center
$Y_i$ of the ith sphere fell in this region (that is,
$Y_i \in \mathcal{B}_i $), the ith sphere would overlap with one of the existing
$i-1$ spheres. The center of the ith sphere is generated with uniform distribution over the non-blocking region
$[0,1]^d\setminus \mathcal{B}_i$. If, for some sphere
$i \leq n$, the entire space is blocked (that is,
$\mathcal{B}_i = [0,1]^d$), we select the centers of spheres
$ i, \dots, n$ arbitrarily. Figure 1 illustrates this for
$d =2$ and
$n = 1,2$. In conclusion,
$\widetilde \mu_n$ is the distribution of an output of Algorithm 4.3.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_fig1.png?pub-status=live)
Figure 1. Illustration of the reference IS method for a Euclidean-hard-sphere model on $[0,1]^2$ with spheres of fixed radius
$\overline{r}/\lambda^\eta$. In (a) (respectively, (b)), the grey region represents the blocking area when generating the second circle (respectively, when generating the third circle).
Algorithm 4.3 Reference IS
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_tab3.png?pub-status=live)
Observe that $\mu^0$ is absolutely continuous with respect to
$\widetilde \mu_n$ on
$\mathscr{G}_n \cap \mathscr{A}$, and the associated likelihood ratio satisfies
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn10.png?pub-status=live)
for all $\textbf{x}_n \in \mathscr{G}_n \cap \mathscr{A}$ and
$n \in \mathbb{N}_0$, where
$B_{i}$ is the volume of
$\mathcal{B}_i$ and
$ \widetilde L_0 = 1$. Note that
$\widetilde L_n(\textbf{x}_n) = 0$ if and only if
$\textbf{x}_n \notin \mathscr{A}$, because for any
$\textbf{x}_n \notin \mathscr{A}$, there exists
$i \leq n$ such that
$B_i = 1$.
Observe that the blocking volume added by the ith sphere is at least $\gamma' \left(R_i/\lambda^{\eta}\right)^d$ when it does not overlap with any of the existing spheres. This is because, for the torus-hard-sphere model, the entire volume within an accepted sphere is added to the blocking volume, and for the Euclidean-hard-sphere model, at least
$1/2^d$ of an accepted sphere is added to the blocking volume. Thus,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn11.png?pub-status=live)
for every configuration $\textbf{x}_{i-1} \in \mathscr{G}_{i-1} \cap \mathscr{A}$. In particular, if all the spheres are of the same size with a fixed radius
$\overline{r}$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn12.png?pub-status=live)
for all $n \in \mathbb{N}_0$ and
$\textbf{x}_n \in \mathscr{G}_n $, where
$x^+ = \max(0,x)$ and
$\delta_{0} = 1$. Then the stability condition is satisfied with
$K = 1$,
$D_{n,1} = \mathscr{G}_n $,
$\mu_{n,1} = \widetilde \mu_n$, and
$\sigma_{n,1} = \delta_n$ for
$n \in \mathbb{N}_0$. Thus, Algorithm 4.2 generates perfect samples of the fixed-radius hard-sphere model, and from Proposition 4.2, the corresponding acceptance probability is
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU20.png?pub-status=live)
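A numeric reading of the deterministic bound $\delta_n$ can be given as follows; the product form below, $\delta_n = \prod_{i=1}^{n}\bigl(1-(i-1)\,\gamma'\,\overline{r}^{\,d}/\lambda^{\eta d}\bigr)^+$ with $\delta_0 = 1$, is our reconstruction from the per-sphere blocking-volume bound stated above and should be treated as an assumption:

```python
def delta_bound(n, gamma_prime, rbar, lam, eta, d):
    """delta_n = prod_{i=1}^{n} max(0, 1 - (i-1) * gamma' * rbar^d / lam^(eta*d));
    our reading of the deterministic bound on the likelihood ratio (delta_0 = 1)."""
    cell = gamma_prime * rbar ** d / lam ** (eta * d)
    prod = 1.0
    for i in range(1, n + 1):
        prod *= max(0.0, 1.0 - (i - 1) * cell)
    return prod
```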
Remark 4.3. In dimension $d = 1$, spheres become line segments and thus it is easy to generate samples from the IS measure
$\widetilde \mu_n$. However, for
$d \geq 2$, generating samples under the reference IS is difficult, because every time a new sphere is generated, we need to know the volume of the blocking region created by the existing spheres, and then we need to generate a point uniformly on the non-blocking region; see line 11 in Algorithm 4.3. One possible way to implement the reference IS is by combining a well-known method called power tessellation and a simple rejection method in two steps: (i) using the power tessellation, we can compute the blocking volumes exactly (see e.g. [Reference Aurenhammer4, Reference Møller and Helisová33]); (ii) we then use a simple AR method, repeatedly generating a point independently and uniformly on
$[0,1]^d$ until it falls within the non-blocking region.
Unfortunately, implementing the power tessellation method is computationally prohibitive. Besides, even if we have an efficient implementation of the power tessellation method, the above simple rejection step can be expensive when the non-blocking region is small. Fortunately, we can overcome both these difficulties by using a simple grid on $[0,1]^d$. From (4.6), it is evident that if there are two IS methods with the same
${\mathbb{E}}[\widetilde \sigma_N]$, it is computationally preferable to use the method that has smaller per-iteration expected running time,
$\widetilde C_{\mathsf{itr}}(\lambda)$. In Subsection 4.4, we introduce a hypercubic-grid-based IS method that continues to generate perfect samples while the blocking regions are closely approximated by grid cells. With a careful choice of the cell-edge length, we make sure that the inequality (4.9) holds for the grid IS as well (and thus
${\mathbb{E}}[\widetilde \sigma_N]$ is the same as that of the reference IS). As a consequence of Corollary 4.1, the expected number of iterations of Algorithm 4.2 is the same as that of the reference IS method. However, the grid method is easy to implement and has a much smaller expected iteration cost
$\widetilde C_{\mathsf{itr}}(\lambda)$ than the reference IS. The choice of the hypercubic grid is just an option that simplifies the implementation; however, the method can be implemented using other kinds of grids. In the two-dimensional case, for example, it is possible to use a hexagonal grid to implement the IS method.
4.4. Grid-based importance sampling for fixed-radius case
Consider the hard-sphere model with a fixed radius $\overline{r}/\lambda^\eta$. The process of generating n spheres under the following grid-based IS measure
$\widehat \mu_n$ starts with a partitioning of the underlying space
$[0,1]^d$ into a hypercubic grid with a cell-edge length
$ \varepsilon > 0$ such that
$1/\varepsilon$ is an integer. The centers of the spheres are generated sequentially, as follows. Suppose that
$i-1$ spheres with centers
$Y_1, \dots, Y_{i-1}$ have already been generated. At the time of generation of the ith sphere, a cell C in the grid is labeled as fully-blocked if the cell is completely inside a sphere with radius
$2\overline{r}/\lambda^\eta$ centered at an existing point, that is,
$C \subseteq S(Y_j, 2\overline{r}/\lambda^\eta)$ for some
$j \leq i-1$; otherwise, the cell is labeled as non-fully-blocked. A non-fully-blocked cell C is called partially-blocked if
$C \cap S(Y_j, 2\overline{r}/\lambda^\eta) \neq \varnothing$ for some
$j \leq i-1$; otherwise, it is called non-blocked. The center
$Y_i$ of the ith sphere is selected uniformly over the non-fully-blocked cells, because selecting
$Y_i$ over a fully-blocked cell will certainly result in the ith sphere overlapping with an existing sphere. We then check for overlap only if
$Y_i$ is generated over a partially-blocked cell, because overlap is not possible if
$Y_i$ is generated over a non-blocked cell. If either there is an overlap or all the cells are fully-blocked by the existing spheres, the centers
$Y_i, \dots, Y_n$ of the remaining spheres are selected arbitrarily (such a selection results in an overlapping configuration). Otherwise, for the generation of the next sphere,
$i+1$, we repeat the same procedure by relabeling the non-fully-blocked cells, considering spheres
$1, \dots, i$ as the existing spheres. Note that at the beginning of each iteration all the cells are labeled as non-blocked. Also note that since all the spheres have the same radius, for the relabeling of the cells we only need to focus on the cells that might interact with the last sphere generated. See Figure 2 for an illustration of this sequential procedure.
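The cell-labeling step of this sequential procedure can be sketched as follows. The helper `label_cells` is hypothetical (not from the paper), and it uses a conservative geometric test based on the cell's circumscribed ball rather than exact cell–sphere geometry; under the choice $\sqrt{d}\,\varepsilon \leq \overline{r}/\lambda^\eta$ discussed below, this conservative test still fully blocks every cell that meets an existing sphere.

```python
import itertools
import math

def label_cells(centers, r, eps, d=2):
    """Label each cell of the hypercubic grid on [0,1]^d with edge length
    eps (1/eps an integer) as 'full', 'partial', or 'free', relative to the
    blocking spheres S(y, 2r) around the existing centers y.  Conservative
    sketch: a cell is declared 'full' when its circumscribed ball lies
    inside some S(y, 2r)."""
    m = round(1 / eps)                       # number of cells per axis
    half_diag = math.sqrt(d) * eps / 2       # radius of a cell's circumscribed ball
    labels = {}
    for idx in itertools.product(range(m), repeat=d):
        cell_center = tuple((k + 0.5) * eps for k in idx)
        lab = 'free'
        for y in centers:
            dist = math.dist(cell_center, y)
            if dist + half_diag <= 2 * r:    # whole cell inside S(y, 2r): fully-blocked
                lab = 'full'
                break
            if dist - half_diag <= 2 * r:    # cell may intersect S(y, 2r): partially-blocked
                lab = 'partial'
        labels[idx] = lab
    return labels
```

The ith center is then drawn uniformly over the cells not labeled `'full'`, and an explicit overlap check is performed only when it lands in a `'partial'` cell.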
Suppose that $\widehat \mu_n$ is the probability measure under which n spheres are generated by the above procedure. Then
$\mu^0$ is absolutely continuous with respect to
$\widehat \mu_n $ on
$ {\mathscr{G}}_n \cap \mathscr{A}$ and the corresponding likelihood ratio is
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU21.png?pub-status=live)
where $\widehat B_i$ is the volume of the fully-blocked cells for the ith sphere generation; that is,
$\widehat B_i$ is equal to the product of the number of fully-blocked cells and
$\varepsilon^d$. To apply Algorithm 4.2 for the fixed-radius hard-sphere model, take
$K = 1$, and for each
${n \in \mathbb{N}_0}$, take
${D_{n,1} = {\mathscr{G}}_n}$,
${\mu_{n,1} = \widehat \mu_n}$, and
${\sigma_{n,1} =\delta_n}$. Thus,
${\widetilde \sigma(n) = \delta_{n}}$ and
${L_{n,1}(\textbf{x}_n) = \widehat L_n(\textbf{x}_n)}$ for all
${\textbf{x}_n \in {\mathscr{G}}_n \cap \mathscr{A}}$.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_fig2.png?pub-status=live)
Figure 2. A realization with five circles on the unit square $[0,1]^2$ generated using the grid-based IS method for a Euclidean-hard-sphere model with a fixed radius (smaller circles). The grid size is
$50\times 50$ and the radius is
$0.1$. The bigger circle around each point is the actual region blocked by the circle. For the sixth circle generation, grey cells are fully-blocked, hatched cells are partially-blocked, and white cells are non-blocked.
Selection of the cell-edge length $\boldsymbol{\varepsilon}$: Observe that the longest diagonal length of a cell is
$\sqrt{d}\, \varepsilon$. Since we focus only on the non-overlapping configurations, in the implementation, we generate a sphere only if all the existing spheres are non-overlapping. Suppose that the cell-edge length
$\varepsilon$ is selected so that
$\displaystyle \sqrt{d}\, \varepsilon \leq \overline{r}/\lambda^\eta.$ Then for the ith sphere generation, every cell that has non-empty intersection with
$S(Y_j, \overline{r}/\lambda^\eta)$, for any
${j =1, \dots, i-1}$, has to be fully-blocked, because such a cell is a subset of
$S\left(Y_j, 2\overline{r}/\lambda^\eta\right)$. Thus, the non-overlapping condition of the existing spheres implies that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU22.png?pub-status=live)
is a subset of the union of the fully-blocked cells, and hence
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU23.png?pub-status=live)
Thus, for $n \geq 1$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn13.png?pub-status=live)
This upper bound is the same as the one we obtained in the case of the reference IS; see the inequality (4.9). Since the acceptance probability ${P_{\mathsf{acc}}(\lambda)= {\mathcal{P}(\lambda)}/{\mathbb{E}}[\delta_N]}$ is the same for both the grid IS and the reference IS methods, we need to choose the cell-edge length
$\varepsilon \leq \overline{r}/\lambda^\eta$ so that the expected per-iteration running time
$\widetilde C_{\mathsf{itr}}(\lambda)$ is minimized. It is easy to see that the larger the value of
$\varepsilon$, the smaller
$\widetilde C_{\mathsf{itr}}(\lambda)$, for the following reasons:
1. Labeling of the cells is faster if they are bigger in size.
2. Increasing the cell size increases the chances of overlap of the new sphere with the existing spheres, and hence on average each iteration generates fewer spheres.
In conclusion, we choose $\varepsilon = 1/ \lfloor \lambda^\eta/\overline{r}\rfloor$ for the implementation of the grid IS method.
To reduce the per-iteration complexity of the algorithm, we make some changes to Steps 4 and 5 in Algorithm 4.2. Observe that a realization $\textbf{X}_n$ generated under
$\widehat \mu_n$ is accepted only if
${\textbf{X}_n \in \mathscr{A}}$ and
${J = 1}$, where
${J \sim {\mathsf{Bern}}\left(\widehat L_n(\textbf{X}_n)/\delta_n\right)}$. In the implementation, we generate an i.i.d. sequence
$U_1, \dots, U_n \sim {\mathsf{Unif}}(0,1)$ independent of everything else so far generated, and take
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU24.png?pub-status=live)
for $i \leq n$. Since J and the product
$\prod_{i = 1}^n J_i$ are Bernoulli random variables with the same success probability
$\widehat L_n(\textbf{X}_n)/\delta_n$, to reduce the per-iteration cost, we generate the ith sphere only if
$J_i = 1$ and the existing spheres do not overlap with each other.
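This early-termination trick can be sketched as follows; the factorization assumes, as in (4.10), that $\delta_n$ factors into per-step bounds dominating the per-step likelihood factors $1 - \widehat B_i$ (function and argument names are illustrative, not from the paper).

```python
import random

def sequential_accept(blocked_vols, per_step_bounds, rng=random):
    """Factorized Bernoulli acceptance: instead of drawing a single
    J ~ Bern(L_n/delta_n) after generating all n spheres, draw
    J_i ~ Bern((1 - Bhat_i) / bound_i) as each sphere is generated and
    abandon the iteration at the first failure.  `blocked_vols` holds the
    fully-blocked volumes Bhat_i and `per_step_bounds` the factors whose
    product is delta_n."""
    for b_i, d_i in zip(blocked_vols, per_step_bounds):
        p = (1 - b_i) / d_i          # success probability of J_i
        assert 0 <= p <= 1, "each bound must dominate its factor"
        if rng.random() >= p:        # J_i = 0: stop generating spheres early
            return False
    return True                      # all J_i = 1: same law as Bern(L_n/delta_n)
```

Since the $J_i$ are independent, the product $\prod_i J_i$ has the same Bernoulli law as the single draw, but a failure at step $i$ saves the cost of generating spheres $i+1, \dots, n$.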
Algorithm 4.4 implements the grid-based IS for a given n with the above-mentioned enhancements. Algorithm 4.2 is restated as Algorithm 4.5.
Algorithm 4.4 Grid-based IS for fixed radius
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_tab4.png?pub-status=live)
Algorithm 4.5 Perfect sampling for hard-sphere model using grid-based IS
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_tab5.png?pub-status=live)
Remark 4.4. (The PMF of M.) Note that, for the current setup, the PMF of M, given by (4.4), becomes
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU25.png?pub-status=live)
$m \in \mathbb{N}_0$, where the normalizing constants are
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU26.png?pub-status=live)
The support of the PMF is finite because $\delta_m = 0$ for all
$m \geq \lambda^{\eta d}/(\gamma' \overline{r}^d) + 1$. To increase the performance of the algorithm, we can further truncate the support of the PMF. Using the maximum packing density, we can obtain an integer
$m_{\max}$ such that
$\textbf{X} \notin \mathscr{A}$ for all
$m \geq m_{\max}$ and configurations
$\textbf{X}$ with
$|\textbf{X}| = m$. In that case, we can take
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU27.png?pub-status=live)
with
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU28.png?pub-status=live)
For example, refer to [Reference Mase29] for finding maximum packing densities for $d = 2$ and
$d = 3$.
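Assuming, as (4.4) and the normalizer ${\mathbb{E}}[\delta_N]$ suggest, that M is drawn with probability proportional to $\delta_m\, {\mathbb{P}}(N = m)$ for $N \sim {\mathsf{Poi}}(\lambda)$, the truncated PMF can be tabulated as in the following sketch (`delta` is any callable returning the bound $\delta_m$; the truncation point `m_max` would come from a maximum-packing argument).

```python
import math

def pmf_of_M(lam, delta, m_max):
    """Tabulate the PMF of M on {0, ..., m_max}: unnormalized weights
    w_m = delta_m * P(Poi(lam) = m), computed in log space for stability,
    then normalized.  Sketch of the truncated version of (4.4)."""
    weights = [delta(m) * math.exp(m * math.log(lam) - lam - math.lgamma(m + 1))
               for m in range(m_max + 1)]
    z = sum(weights)                 # normalizing constant
    return [w / z for w in weights]
```

With $\delta_m \equiv 1$ this reduces to a truncated Poisson PMF, which gives a quick sanity check on the implementation.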
We now focus on the expected running time analysis of Algorithm 4.5. By Proposition 4.2, the acceptance probability $P_{\mathsf{acc}}(\lambda)$ of Algorithm 4.5 is
${\mathcal{P}(\lambda)}/{\mathbb{E}}[\widetilde \sigma(N)] = {\mathcal{P}(\lambda)}/{\mathbb{E}}[\delta_N].$ A proof of Proposition 4.3 is given in Section A.4.
Proposition 4.3. For the fixed-radius hard-sphere model, there exists a constant $c > 0$ such that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn14.png?pub-status=live)
where $N \sim {\mathsf{Poi}}(\lambda)$. Furthermore,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU29.png?pub-status=live)
The following result is a trivial consequence of Propositions 4.1 and 4.3.
Corollary 4.2. For the fixed-radius hard-sphere model, if $\eta d \geq 2$, both
$\mathcal{T}_{\mathsf{ISAR}}$ and
${\mathcal{T}_{\mathsf{NAR}}}$ are of the same order, and if
$0 < \eta d < 2$, there exists a constant
$c > 0$ such that
$\displaystyle\mathcal{T}_{\mathsf{ISAR}} \leq c\, {\mathbb{E}}\left[\delta_N\right] {\mathcal{T}_{\mathsf{NAR}}}.$
Remark 4.5. (Better choice of $\delta_n$ for the Euclidean-hard-sphere model.) If the spheres are Euclidean, further improvements in the choice of
$\delta_n$ can be obtained by accounting for boundary effects. For instance, for
$d=2$, the four corners of
$[0,1]^2$ are covered by at most four circles, each of which contributes a blocking area of at least
$\gamma'\overline{r}^2/\lambda^{2 \eta} = \pi\overline{r}^2/4\lambda^{2 \eta}$, while each of the remaining circles contributes a blocking area of at least
$2\gamma'\overline{r}^2/\lambda^{2 \eta} = \pi\overline{r}^2/2\lambda^{2 \eta}$. Let
$b_0 = 0$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU30.png?pub-status=live)
for $1\leq i \leq 5,$ and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU31.png?pub-status=live)
for $i \geq 6$. Then, for this particular scenario, a better choice of
$\delta_n$ in (4.9) (as well as in (4.10)) is
$\delta_n = \prod_{i=1}^{n} \left( 1 - b_i \right)^+\!,\, n \in \mathbb{N}_0$.
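With the per-step lower bounds $b_i$ of Remark 4.5 in hand (however they are computed), the improved $\delta_n$ can be accumulated recursively; this sketch takes the sequence $b$ as given rather than reproducing the $d=2$ boundary constants.

```python
def delta_from_blocking(b):
    """Given per-step lower bounds b_1, b_2, ... on the blocked volume
    seen by the i-th sphere, return [delta_0, delta_1, ...] with
    delta_n = prod_{i=1}^n (1 - b_i)^+ and delta_0 = 1."""
    deltas = [1.0]                                   # delta_0 = 1
    for b_i in b:
        deltas.append(deltas[-1] * max(0.0, 1.0 - b_i))
    return deltas
```

Note that once some $b_i \geq 1$, every subsequent $\delta_n$ is zero, which is what truncates the support of the PMF of M.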
4.5. Random-radii case
We now consider another application of Algorithm 4.2, this time to the hard-sphere model in which the radii of the spheres are i.i.d. marks of the underlying marked PPP. For the fixed-radius case presented in Subsection 4.4, the proposed IS method ensured a uniform bound $\delta_n$ on the likelihood ratio over
${\mathscr{G}}_n$ for every
$n \in \mathbb{N}_0$, as shown in (4.10). Such upper bounds are possible for a random-radii hard-sphere model if the radii are bounded below by a positive constant. Furthermore, a similar analysis can be established when the spheres are replaced with i.i.d. convex shapes such that each shape occupies a minimum positive volume. However, when the radii are not bounded from below almost surely, the associated blocking volumes can be arbitrarily small. We address this issue by partitioning
${\mathscr{G}}_n$ into two sets
$D_{n,1}$ and
$ D_{n,2}$ for each n so that the IS on
$D_{n,1}$ is a grid-based IS method that is similar to Algorithm 4.4, and the IS on
$D_{n,2}$ is obtained by exponentially twisting the distribution of
$R^d$ to put high probability mass on configurations with lower-volume spheres.
We first introduce the exponential twisting of the distribution, say G, of $R^d$. Recall that R is assumed to be a bounded non-negative random variable. Without loss of generality further assume that
${\alpha \,{:\!=}\, {\mathbb{E}}[R^d] > 0}$. Thus the logarithmic moment generating function of
$R^d$, defined by
${\Lambda(\theta) \,{:\!=}\, \log\left({\mathbb{E}}\left[\exp({\theta R^d}) \right] \right)}$, is finite for every
$\theta \in {\mathbb{R}}$. Furthermore, the derivative
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU32.png?pub-status=live)
is finite and positive for all $\theta \in {\mathbb{R}}$, and in particular,
${\Lambda'(0) = \alpha}$. In fact, using the results in Chapter 2 of [Reference Dembo and Zeitouni10], it can be seen that
$\Lambda(\theta)$ is strictly convex. As a consequence,
$\Lambda'(\theta)$ is strictly increasing and hence
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU33.png?pub-status=live)
Let $\widehat \theta$ be such that
${\Lambda'(\widehat \theta) = \varrho}$ for some
${\varrho \in (\alpha_{\min}, \alpha)}$. Therefore,
${\widehat \theta < 0}$. Now consider the distribution
$\widetilde G$ obtained by exponentially twisting G by the amount
$\widehat \theta$, that is,
${{\textrm{d}} \widetilde G(t)} = \exp\left( \widehat \theta t - \Lambda(\widehat \theta)\right){\textrm{d}} G(t).$ Fix a constant
$a \in (0,1)$, and for each integer
$n \geq 1$, define
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU34.png?pub-status=live)
We later see that $a = 1/2$ is a good choice for increasing performance of the algorithm. Let
$\Lambda^*(\cdot)$ be the Legendre–Fenchel transform of
$\Lambda$, that is,
${\Lambda^*(t) = \sup_{\theta \in {\mathbb{R}}} \{\theta t - \Lambda(\theta)\}}$. This corresponds to the large deviations rate function associated with the empirical average of i.i.d. samples from G. From the definition of
$\widehat \theta$ and the fact that
$\Lambda(\theta)$ is strictly convex,
${\Lambda^*(\varrho) = \widehat \theta \varrho - \Lambda(\widehat \theta) >0}$. Since
$\widehat \theta < 0$, for all
${(t_1, t_2, \dots, t_{\lceil n a \rceil}) \in H_n}$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU35.png?pub-status=live)
and thus,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn15.png?pub-status=live)
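For a concrete (discrete) distribution G of $R^d$, the tilt parameter $\widehat\theta$ with $\Lambda'(\widehat\theta) = \varrho$ can be found by bisection, since $\Lambda'$ is strictly increasing. The following is an illustrative sketch, not the paper's construction; it assumes $\varrho$ lies strictly between the smallest support point and $\alpha = {\mathbb{E}}[R^d]$.

```python
import math

def solve_tilt(values, probs, rho, lo=-50.0, hi=0.0, tol=1e-10):
    """Bisection for theta_hat with Lambda'(theta_hat) = rho, where
    Lambda'(theta) = E[V e^{theta V}] / E[e^{theta V}] for the discrete
    law of V = R^d given by (values, probs).  rho < E[V] forces
    theta_hat < 0."""
    def lam_prime(theta):
        num = sum(p * v * math.exp(theta * v) for v, p in zip(values, probs))
        den = sum(p * math.exp(theta * v) for v, p in zip(values, probs))
        return num / den
    assert rho < sum(p * v for v, p in zip(values, probs)), "need rho < E[R^d]"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lam_prime(mid) < rho:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def twist_probs(values, probs, theta):
    """Exponentially twisted pmf: p_i -> p_i e^{theta v_i} / E[e^{theta V}]."""
    w = [p * math.exp(theta * v) for v, p in zip(values, probs)]
    z = sum(w)
    return [x / z for x in w]
```

By construction, the mean of the twisted law equals $\Lambda'(\widehat\theta) = \varrho$, so sampling $R^d_1, \dots, R^d_{\lceil na \rceil}$ from it concentrates mass on configurations with lower-volume spheres.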
Recall the definition of the distribution $\mu$ of the hard-sphere model given by (2.1). To apply Algorithm 4.2, select
$K = 2$ and define
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU36.png?pub-status=live)
and $D_{n,2} = {\mathscr{G}}_n\setminus D_{n,1},$ for each n, where
$H^{\mathsf{c}}_n$ is the complement of
$H_n$ within
$[0, \overline{r}]^{\lceil n a \rceil}$. To apply Algorithm 4.5, we are now left with identifying the IS measures
$\mu_{n,1}$ and
$\mu_{n,2}$ and the corresponding bounds
$\sigma_{n,1}$ and
$\sigma_{n,2}$ for each
${n \in \mathbb{N}_0}$.
The measure $\mu_{n,1}$ on
$ D_{n,1}$ is again a grid-based IS method similar to the grid method introduced for the fixed-radius case in Subsection 4.4. First, i.i.d. copies
$\{R_1, \dots, R_n\}$ of R are generated. Then we construct a new grid and label each cell every time a new sphere is generated, as follows. For the generation of the ith sphere, with radius
$R_i/\lambda^\eta$, we take the cell-edge length
${\varepsilon = 1/ \lceil \lambda^\eta/R_i\rceil}$. A cell C in the grid is labeled as fully-blocked if
${C \subseteq S(X_j, (R_j + R_i)/\lambda^\eta)}$ for an existing sphere
${j \leq i-1}$ with the center
$X_j$ and the radius
$R_j/\lambda^\eta$; otherwise, the cell is labeled as non-fully-blocked. A non-fully-blocked cell C is called partially-blocked if
${C \cap S(X_j, (R_j + R_i)/\lambda^\eta) \neq \varnothing}$ for some
${j \leq i-1}$; otherwise, it is called non-blocked. Then the next center
$X_i$ is generated uniformly over the non-fully-blocked cells. Just as in the case of fixed radius,
$X_1$ is generated uniformly over
$[0,1]^d$, and we check the possibility of the overlap of the ith sphere with an existing sphere only if
$X_i$ falls in a partially-blocked cell.
The measure $\mu^0$ is absolutely continuous with respect to
$\mu_{n,1}$ on
${D_{n,1} \cap \mathscr{A}}$, and the associated likelihood ratio
$L_{n,1}$ is given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU37.png?pub-status=live)
where $\widehat B_i$ is the volume of all the fully-blocked cells for the generation of the ith sphere. By (4.8) and the fact that the cell-edge length is
$1/ \lceil \lambda^\eta/R_i\rceil$, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU38.png?pub-status=live)
on ${D_{n,1} \cap \mathscr{A}}$ for all
${i \geq \lceil n a \rceil + 1}$, because
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU39.png?pub-status=live)
over the set $H^{\mathsf{c}}_n.$ Consequently,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU40.png?pub-status=live)
The measure $\mu_{n,2}$ is induced by the following procedure. Generate i.i.d. samples
$R^d_1, \dots, R^d_{\lceil n a \rceil}$ from
$\widetilde G$, and independently of this, generate i.i.d. samples
$R^d_{\lceil n a \rceil + 1}, \dots, R^d_n$ from G. For
$i =1, \dots, n$, the radius of the ith sphere is
$R_i/\lambda^\eta$, and the center is generated uniformly over the non-blocking region created by the existing
$i-1$ spheres. Since
$R^d_1, \dots, R^d_{\lceil n a \rceil}$ are sampled from
$\widetilde G$, by (4.12),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU41.png?pub-status=live)
In summary,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU42.png?pub-status=live)
is a stable IS sequence, and hence Algorithm 4.2 generates perfect samples from $\mu$. However, to reduce the per-iteration complexity (as in the fixed-radius case), we make some modifications to the algorithm. Algorithm 4.6 is similar to Algorithm 4.4, and Algorithm 4.2 is restated as Algorithm 4.7.
Algorithm 4.6 Grid-based IS for random-radii case
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_tab6.png?pub-status=live)
We now focus on the running time complexity of Algorithm 4.7. Notice that $\widetilde \sigma(n) = \sigma_{n,1} + \sigma_{n,2}$ for each
$n \in \mathbb{N}_0$. By Proposition 4.2,
$P_{\mathsf{acc}}(\lambda) = {\mathcal{P}(\lambda)}/{\mathbb{E}}\left[\widetilde \sigma(N)\right]$ with
$N \sim {\mathsf{Poi}}(\lambda)$. Observe that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU44.png?pub-status=live)
The proof of Proposition 4.3 can be extended to the current scenario to show that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU45.png?pub-status=live)
It is now clear that a good choice for a is $1/2$, because it maximizes
$a(1 - a)$. Furthermore, using the moment generating function of Poisson random variables, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU46.png?pub-status=live)
Recall that ${\mathcal{T}_{\mathsf{ISAR}} \leq {\mathbb{E}}\left[\widetilde \sigma(N)\right] \widetilde C_{\mathsf{itr}}(\lambda)/{\mathcal{P}(\lambda)}}$. The per-iteration complexity
$\widetilde C_{\mathsf{itr}}(\lambda)$ is mainly determined by the relabeling of cells in the new grid for each sphere generation. The grid size for the ith sphere generation is of order
$\lambda^{\eta d}/R_i^d$, and the total number of spheres generated in each iteration is at most of order
$\lambda^{\min\{\eta d, 1\}}$. Therefore, for
$\eta d \leq 2$, we can show that
$\widetilde C_{\mathsf{itr}}(\lambda)$ is of order
$\lambda^{\min\{\eta d, 1\}}{\mathbb{E}}[1/R^d]$.
Algorithm 4.7 Perfect sampling for hard-sphere model with random radii
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_tab7.png?pub-status=live)
Remark 4.6. If $\varrho$ is selected to equal
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU48.png?pub-status=live)
for each $n = 1, 2, \dots$, then
${\mathbb{E}}\left[\widetilde \sigma(N)\right]$ is minimized. Note that
$\sigma_{n,1}$ decreases and
$\sigma_{n,2}$ increases as functions of
$\varrho$. The above decompositions were chosen to illustrate ideas simply. More complex decompositions are easily created for further performance improvement. For instance, we could have defined
$H_n$ above as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU49.png?pub-status=live)
and then arrived at appropriate $\{\varrho_m\}_{m \leq n}$ and appropriate changes of measures for configurations in
$H_n$ and
$H_n^{\mathsf{c}}$. While this should lead to substantial performance improvement, it also significantly complicates the analysis.
5. Dominated coupling-from-the-past methods
In this section, we review some of the well-known dominated CFTP algorithms for the hard-sphere models. We refer to [Reference Kendall and Møller27] for a general description of dominated CFTP for Gibbs point processes (this method was first proposed for area-interaction processes by Kendall [Reference Kendall25]).
Let $\textbf{D} = \{\textbf{D}(t)\,{:}\, t \in {\mathbb{R}}\}$ be the so-called dominating birth-and-death process on
$[0,1]^d$, with births arriving as a Poisson process with rate
$\lambda$, where each birth is a uniformly and independently generated marked point on
$[0,1]^d$ denoting the center of a sphere, with the mark being its radius. Each birth is alive for an independent random time exponentially distributed with mean one. It is well known that the steady-state distribution of
$\textbf{D}$ is
$\mu^0$. Furthermore, it is easy to generate the dominating process
$\textbf{D}$ both forward and backward in time so that
${\textbf{D}(t) \sim \mu^0}$ for all t. To see this, let
${\dots < t_{-2} < t_{-1} < 0 < t_1 < t_2 < \dots}$ be the event instants of the process
$\textbf{D}$, where an event can be either a birth or a death. Assume that with each birth there is an additional mean-one exponentially distributed independent mark to determine its lifetime. Since the births are arriving as a Poisson process, the interarrival times are exponential with mean
$1/\lambda$. Generate
${\textbf{D}(0) \sim \mu^0}$, determine the next event instant
$t_1$, and take
${\textbf{D}(t) = \textbf{D}(0)}$ for
${0\leq t < t_1}$. If the next event is a birth, generate a new independent (marked) point; otherwise, remove the existing point with the smallest remaining lifetime. Continue the same procedure starting with
$\textbf{D}(t_1)$ to generate the process over
$[t_1, t_2)$, and so on. For generating the dominating process
$\textbf{D}$ backward in time, observe that
$\textbf{D}$ is time-reversible, and hence we can generate
$\{\textbf{D}(t)\,{:}\, -T \leq t \leq 0\}$ for any finite
${T > 0}$ just by generating an independent copy
$\{\widetilde{\textbf{D}}(t)\,:\, 0 \leq t \leq T\}$ of the dominating process
$\left\{\textbf{D}(t)\,:\, 0 \leq t \leq T\right\}$ and taking
${\textbf{D}(-t) = \widetilde{\textbf{D}}(t)}$ for
${0 \leq t \leq T}$.
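A forward simulation of the dominating process over a finite horizon can be sketched as follows (radii/marks are omitted, and `init` pairs each point of $\textbf{D}(0)$ with its residual Exp(1) lifetime; the function is a sketch, not the paper's pseudocode). By reversibility, a backward path on $[-T, 0]$ is obtained by running an independent copy forward on $[0, T]$ and reflecting time.

```python
import random

def simulate_dominating(lam, horizon, init, rng=random):
    """Birth-and-death process D on [0,1]^2: births arrive at rate lam as
    uniform points, each with an independent Exp(1) lifetime; a death
    removes the point with the smallest remaining lifetime.  Returns the
    list of (time, 'birth' or 'death') events in (0, horizon]."""
    t, alive, events = 0.0, list(init), []
    while True:
        t_birth = rng.expovariate(lam)                        # time to next birth
        t_death = min((life for _, life in alive), default=float('inf'))
        dt = min(t_birth, t_death)
        if t + dt > horizon:
            return events
        t += dt
        if t_birth <= t_death:                                # birth event
            alive = [(p, life - dt) for p, life in alive]
            alive.append(((rng.random(), rng.random()), rng.expovariate(1.0)))
            events.append((t, 'birth'))
        else:                                                 # death of the shortest-lived point
            j = min(range(len(alive)), key=lambda k: alive[k][1])
            alive = [(p, life - dt) for k, (p, life) in enumerate(alive) if k != j]
            events.append((t, 'death'))
```

The memorylessness of the exponential lifetimes is what allows the residual lifetimes to be aged by `dt` at each event without changing the law of the process.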
Since the distribution $\mu$ of the hard-sphere model is absolutely continuous with respect to
$\mu^0$, using coupling, it is possible to construct a spatial birth-and-death process
${\textbf{Z} = \{\textbf{Z}(t) \,{:}\, t \in {\mathbb{R}}\}}$, called the interaction process, such that
${\textbf{Z}(t) \subseteq \textbf{D}(t)}$ and
${\textbf{Z}(t) \sim \mu}$ for all
${t \in {\mathbb{R}}}$; see [Reference Kendall and Møller27]. Each iteration of any dominated CFTP method essentially involves the following two steps:
1. Fix
$n > 0$ and construct
${\{\textbf{D}(t)\,{:}\, t_{-n} \leq t \leq 0\}}$ backward in time starting at time zero with
${\textbf{D}(0) \sim \mu^0}$.
2. Then, as detailed in Sections 5.1–5.3, use thinning on the dominating process
$\{\textbf{D}(t)\,{:}\, t_{-n} \leq t \leq 0\}$ to obtain an upper bounding process
${\{\textbf{U}_n(t)\,{:}\,t \geq t_{-n} \}}$ with
${\textbf{U}_n(t_{-n}) = \textbf{D}(t_{-n})}$ and a lower bounding process
${\{\textbf{L}_n(t)\,{:}\,t \geq t_{-n} \}}$ with
${\textbf{L}_n(t_{-n}) = \varnothing}$ forward in time, such that for
${t \geq t_{-n}}$,
${\textbf{L}_n(t)\subseteq \textbf{Z}(t) \subseteq \textbf{U}_n(t) \subseteq \textbf{D}(t)}$ and
${\textbf{L}_m(t)\subseteq \textbf{L}_n(t) \subseteq \textbf{U}_n(t) \subseteq \textbf{U}_m(t)}$ for
$m \leq n$.
If $\textbf{U}_n$ and
$\textbf{L}_n$ coalesce at time 0, that is,
${\textbf{U}_n(0) = \textbf{L}_n(0)}$, then
$\textbf{U}_n(0)$ is a perfect sample from the target distribution
$\mu$. If there is no coalescence, then increase n, extend the dominating process
${\{\textbf{D}(t)\,{:}\, t_{-n} \leq t \leq 0\}}$ further backward to time
$t_{-n}$, and repeat the same procedure. It is well known that a good strategy for increasing n is to double it after every iteration. The criteria for thinning depend on the coupling used for constructing
$\textbf{Z}$. However, the dominating process
$\textbf{D}$ depends only on
$\lambda$. In summary, a dominated CFTP algorithm is described by Algorithm 5.1.
Algorithm 5.1 Dominated CFTP
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_tab8.png?pub-status=live)
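The outer loop of Algorithm 5.1 can be sketched generically as follows; `extend_backward` and `run_bounds` are placeholders for the method-specific constructions of Sections 5.1–5.3. Extending the *same* dominating path backward (rather than resampling it) is what makes the returned sample exact.

```python
def dominated_cftp(extend_backward, run_bounds, n0=1):
    """Generic dominated-CFTP loop: extend the dominating path backward
    to event t_{-n}, run the coupled upper/lower bounding processes
    forward from t_{-n} to time 0, and return U_n(0) on coalescence;
    otherwise double n and retry.  `extend_backward(n)` returns the
    dominating path back to t_{-n} (reusing earlier randomness);
    `run_bounds(path)` returns the pair (U_n(0), L_n(0))."""
    n = n0
    while True:
        path = extend_backward(n)
        upper, lower = run_bounds(path)
        if upper == lower:            # coalescence at time 0: perfect sample
            return upper
        n *= 2                        # doubling strategy for the next iteration
```

The sandwiching property $\textbf{L}_n(0) \subseteq \textbf{Z}(0) \subseteq \textbf{U}_n(0)$ guarantees that, on coalescence, the common state equals $\textbf{Z}(0) \sim \mu$.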
Consider the backward coalescence time ${N^* = \min\left\{n \in \mathbb{N}_0\,{:}\, \textbf{L}_n(0) = \textbf{U}_n(0) \right\}}$. The average running time complexity of Algorithm 5.1 depends on the number of operations performed up to
$N^*$, which in turn depends on the construction of the interaction process and the bounding processes. At each iteration, the length of the dominating process
$\textbf{D}$ is on average doubled backwards in time. Hence, on average the running time complexity doubles at each iteration. From the definition of
$N^*$, the length of the last iteration is
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU50.png?pub-status=live)
Let ${N^f = \min\left\{n \in \mathbb{N}_0\,{:}\, \textbf{L}_0(t_n) = \textbf{U}_0(t_n) \right\}}$ be the forward coalescence time. Because of the reversibility of the dominating process, it can be shown that
$N^*$ and
$N^f$ are identical in distribution [Reference Asmussen and Glynn3], and hence the expected computational effort for constructing the dominating, upper bounding, and lower bounding processes up to the forward coalescence time
$N^f$, starting from time 0, is a lower bound on the expected running time of the algorithm.
Below we consider three dominated CFTP methods applicable to the hard-sphere models.
5.1. Method 1
This method is based on [Reference Kendall and Møller27]. Note that the Papangelou conditional intensity of the hard-sphere model is given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn16.png?pub-status=live)
with the convention that $0/0 = 0$. The interaction process
$\textbf{Z} = \{\textbf{Z}(t): t \in (-\infty, \infty)\}$ is constructed as follows. Suppose x is a birth to
$\textbf{D}$ that sees
$\textbf{Z}$ in a state
$\textbf{x} \in {\mathscr{G}}$. Then x is added to
$\textbf{Z}$ if and only if
$\ell(\textbf{x}, x) = 1$. Every death in
$\textbf{D}$ reflects in
$\textbf{Z}$; that is, if there is a death of a point y in
$\textbf{D}$, then y is removed from the process
$\textbf{Z}$ as well if it is present. It can be shown that
$\mu$ is the unique invariant probability measure of
$\textbf{Z}$; see, e.g., [Reference Garcia16] or [Reference Ferrari, Fernández and Garcia12].
For each $n \geq 1$, the bounding processes are constructed as follows. As mentioned earlier, take
${\textbf{L}_n(t_{-n}) = \varnothing}$ and
${\textbf{U}_n(t_{-n}) = \textbf{D}(t_{-n})}$. Suppose that
${\textbf{x}^l = \textbf{L}_n(t_{i})}$ and
${\textbf{x}^u = \textbf{U}_n(t_{i})}$ for
${-n \leq i < 0}$; then assign
$\textbf{L}_n(t) = \textbf{x}^l$ and
$\textbf{U}_n(t) = \textbf{x}^u$ for
$t_{i} < t < t_{i+1}$. If the event at time
$t_{i+1}$ is a birth x in the dominating process
$\textbf{D}$, set
$\textbf{L}_n(t_{i+1}) = \textbf{x}^l\cup\{ x\}$ if
$ \ell(\textbf{x}^u,x) = 1$; otherwise, it remains unchanged, that is,
$\textbf{L}_n(t_{i+1}) = \textbf{x}^l$. Similarly, set
$\textbf{U}_n(t_{i+1}) = \textbf{x}^u \cup \{ x\}$ if
$\ell(\textbf{x}^l,x) = 1$; otherwise, set
$\textbf{U}_n(t_{i+1}) = \textbf{x}^u$. Every death in the dominating process reflects in both the lower and upper bounding processes. Note that a birth is accepted by the lower bounding process if the resulting state of the upper bounding process is in
$\mathscr{A}$. Similarly, a birth in
$\textbf{D}$ is accepted in the upper bounding process if the resulting state of the lower bounding process is in
$\mathscr{A}$.
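For concreteness, a single transition of the Method 1 bounding processes can be sketched as follows. The representation of a sphere as a (center, radius) tuple and the helper functions are illustrative assumptions for two dimensions; this is a sketch, not the implementation linked in Section 6.

```python
import math

def overlaps(s, t):
    """Two spheres (x, y, r) overlap if their centers are closer than the radius sum."""
    (cx, cy, r1), (dx, dy, r2) = s, t
    return math.hypot(cx - dx, cy - dy) < r1 + r2

def feasible_with(config, x):
    """True if adding sphere x keeps the configuration hard-core (state stays in A)."""
    return all(not overlaps(x, s) for s in config)

def method1_step(lower, upper, event, x):
    """One transition of the Method 1 bounding processes.

    A birth x is added to the lower process iff the *upper* process plus x
    is still hard-core, and to the upper process iff the *lower* process
    plus x is still hard-core.  Deaths are mirrored in both processes.
    """
    if event == "birth":
        add_to_lower = feasible_with(upper, x)   # U_n(t) with x in A
        add_to_upper = feasible_with(lower, x)   # L_n(t) with x in A
        if add_to_lower:
            lower = lower | {x}
        if add_to_upper:
            upper = upper | {x}
    else:  # death of sphere x, reflected in both processes
        lower = lower - {x}
        upper = upper - {x}
    return lower, upper
```

Both feasibility tests are evaluated against the states before the event, so the cross-dependence between the two processes described above is preserved.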
Theorem 5.1. The expected running time complexity ${\mathcal{T}_{\mathsf{DC1}}}$ of the above dominated CFTP algorithm satisfies
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn17.png?pub-status=live)
for some constant $c > 0$. In particular,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU51.png?pub-status=live)
As highlighted by the numerical results in Section 6, the lower bound (5.2) is loose, because it is established by considering the running time complexity only up to the time at which the lower bounding process receives its first arrival. This can be much smaller than the running time complexity up to the coalescence of the upper and lower bounding processes.
5.2. Method 2
This method is an improved version of Method 1, again based on [Reference Kendall and Møller27]. Observe that at any given time $t \in {\mathbb{R}}$, the interaction process
$\textbf{Z}(t)$ can have only non-overlapping spheres. This suggests a better way of constructing the bounding processes, which we now describe. For each
$n \geq 1$, just as in Method 1, start with
${\textbf{L}_n(t_{-n}) = \varnothing}$ and
${\textbf{U}_n(t_{-n}) = \textbf{D}(t_{-n})}$ to guarantee that
${\textbf{L}_n(t_{-n}) \subseteq \textbf{Z}(t_{-n}) \subseteq \textbf{U}_n(t_{-n})}$. Suppose that the event at
$t_{i+1}$ is an arrival of sphere x. Irrespective of whether
$\textbf{U}_n(t_i) \in \mathscr{A}$ or not, if x does not overlap with any sphere in the upper bounding process
$\textbf{U}_n(t_i)$, then it cannot overlap with any sphere in
$\textbf{Z}(t_i)$, and hence it is accepted to
$\textbf{Z}$. Thus, we add x to both the bounding processes. (Observe that in Method 1, such an x is added to both the bounding processes only if
$\textbf{U}_n(t_i) \in \mathscr{A}$, because of the Papangelou conditional intensity (5.1).) If x overlaps with any sphere in the lower bounding process
$\textbf{L}_n(t_i)$, then it must overlap with a sphere in
$\textbf{Z}(t_i)$ as well, and hence it is not added to any of the bounding processes
$\textbf{L}$ and
$\textbf{U}$. If x does not overlap with any sphere in
$\textbf{L}_n(t_i)$, but does overlap with a sphere in
$\textbf{U}_n(t_i)$, its presence in the process
$\textbf{Z}$ cannot be ruled out, and hence we keep it in the upper bounding process, but not in the lower bounding process. Finally, every death in
$\textbf{D}$ is reflected in both the bounding processes
$\textbf{L}_n$ and
$\textbf{U}_n$. Under this construction, the lower bounding process accepts births more often, and hence the upper bounding process accepts births less often, compared with the construction in Section 5.1. As a result, the running time of Method 2 is shorter than that of Method 1.
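The improved birth rule of Method 2 can be sketched as below; the sphere representation and overlap test are illustrative assumptions, as in the Method 1 sketch, and deaths are handled exactly as before.

```python
import math

def overlaps(s, t):
    """Spheres (x, y, r) overlap when their centers are closer than the radius sum."""
    return math.hypot(s[0] - t[0], s[1] - t[1]) < s[2] + t[2]

def overlaps_any(config, x):
    return any(overlaps(x, s) for s in config)

def method2_birth(lower, upper, x):
    """Method 2 birth rule: decide where the arriving sphere x goes.

    - x avoids every sphere in the upper process  -> surely in Z: add to both;
    - x overlaps some sphere in the lower process -> surely not in Z: reject;
    - otherwise x's membership in Z is undecided  -> keep it in upper only.
    """
    if not overlaps_any(upper, x):
        return lower | {x}, upper | {x}
    if overlaps_any(lower, x):
        return lower, upper
    return lower, upper | {x}
```

The first branch is exactly the improvement over Method 1: the birth is accepted by both processes regardless of whether the upper state is hard-core.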
5.3. Method 3
A different approach for dominated CFTP for repulsive pairwise interaction processes has been proposed by Huber [Reference Huber22]. Here, we discuss the main ingredients of the method for hard-sphere models; refer to [Reference Huber22, Reference Huber23] for more details. In this method, the interaction process $\textbf{Z}$ is different from the
$\textbf{Z}$ in Sections 5.1–5.2; it is known as the spatial birth–death swap process, and its invariant distribution is again the distribution
$\mu$ of the hard-sphere model. In addition to births and deaths of spheres, this process also allows swap moves; here a swap move is an event where an existing sphere is replaced by an arrival if it is the only sphere that overlaps with the arrival. The lower and upper bounding processes are constructed as follows. As usual, let
${\textbf{U}_n(t_{-n}) =\textbf{D}(t_{-n})}$ and
${\textbf{L}_n(t_{-n}) = \varnothing}$. For any
${0 < k < n}$, if
$t_{-k}$ is the instant of a death in the dominating process
$\textbf{D}(t)$, then the death is reflected in both the upper and lower bound processes. Now suppose that
$x \in \textbf{D}(t_{-k})$ is born at
$t_{-k}$.
Case 1: If no sphere in $\textbf{U}_n(t_{-k})$ overlaps with x, then the arrival sphere x is added to both
$\textbf{U}_n(t_{-k})$ and
$\textbf{L}_n(t_{-k})$. If only one sphere y in
$\textbf{U}_n(t_{-k})$ overlaps with x, then y is removed from
$\textbf{U}_n(t_{-k})$ (and from
$\textbf{L}_n(t_{-k})$ if it is present), and x is added to both
$\textbf{U}_n(t_{-k})$ and
$\textbf{L}_n(t_{-k})$.
Case 2: There are at least two spheres in $\textbf{L}_n(t_{-k})$ overlapping with x. Then x is rejected by both
$\textbf{U}_n(t_{-k})$ and
$\textbf{L}_n(t_{-k})$.
Case 3: At most one sphere in $\textbf{L}_n(t_{-k})$ and at least two spheres in
$\textbf{U}_n(t_{-k})$ overlap with x. Then x is added to
$\textbf{U}_n(t_{-k})$ (but not to
$\textbf{L}_n(t_{-k})$). If
$y \in \textbf{L}_n(t_{-k})$ is the one that overlaps with x, then y is removed from
$\textbf{L}_n(t_{-k})$.
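The three cases above can be collected into one birth/swap update, sketched as follows. The overlap test and sphere representation are illustrative assumptions; refer to [Reference Huber22, Reference Huber23] for the actual construction.

```python
import math

def overlapping(config, x):
    """Set of spheres (x, y, r) in config that overlap the arrival x."""
    return {s for s in config
            if math.hypot(s[0] - x[0], s[1] - x[1]) < s[2] + x[2]}

def method3_birth(lower, upper, x):
    """Birth/swap update of the Method 3 bounding processes at one arrival x."""
    over_u = overlapping(upper, x)
    over_l = overlapping(lower, x)
    if not over_u:                     # Case 1: free birth, added to both
        return lower | {x}, upper | {x}
    if len(over_u) == 1:               # Case 1 (swap): the unique blocker y
        (y,) = over_u                  # is replaced by x in both processes
        return (lower - {y}) | {x}, (upper - {y}) | {x}
    if len(over_l) >= 2:               # Case 2: rejected by both processes
        return lower, upper
    # Case 3: at least two blockers in upper, at most one in lower:
    # x joins the upper process only, and its unique lower blocker
    # (if any) is removed from the lower process.
    return lower - over_l, upper | {x}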
6. Simulations
We compare the performance of all the methods discussed in this paper using numerical experiments, and illustrate the effectiveness of the proposed IS-based AR method in certain regimes where the other methods fail to work. For this, we consider the torus-hard-sphere model with a fixed radius $\overline{r}/\lambda^\eta$ on the two-dimensional square
$[0,1]^2$. Thus,
$\eta d = 2\eta$. In the first two experiments, by fixing values of
$\eta$ and
$\overline{r}$, we estimate the complexity of each algorithm as a function of the intensity
$\lambda$ of the reference PPP by computing a sample average of the number of spheres (or, in this case, circles) generated per generation of a perfect sample of the hard-sphere model. Instead of estimating the expected running time complexities, we take this approach to keep the discussion independent of the underlying data structures and programming language used in the implementation of the algorithms. In addition, we estimate the non-overlapping probability
${\mathcal{P}(\lambda)}$ using the conditional Monte Carlo rare event estimation for Gilbert graphs proposed by [Reference Hirsch, Moka, Taimre and Kroese20]. The Gilbert graph under consideration is a random graph where the nodes constitute a
$\lambda$-homogeneous PPP on
$[0,1]^2$ and there is an edge between two points if they are within a distance of
$2\overline{r}/\lambda^\eta$. Therefore,
${\mathcal{P}(\lambda)}$ is the probability that there are no edges in the graph. The code for all the methods discussed in this paper is available at https://github.com/saratmoka/PerfectSampling_HardSpheres.
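For moderate densities, the non-overlapping probability can also be estimated by naive Monte Carlo over the torus model; the sketch below is such a baseline, not the conditional Monte Carlo method of [Reference Hirsch, Moka, Taimre and Kroese20] used in the experiments, and it is only feasible when ${\mathcal{P}(\lambda)}$ is not too small. All names and parameters are illustrative.

```python
import math
import random

def torus_dist(p, q):
    """Shortest distance between two points on the unit torus [0,1]^2."""
    return math.hypot(*(min(abs(a - b), 1 - abs(a - b)) for a, b in zip(p, q)))

def sample_poisson(lam, rng):
    """Poisson(lam) sample: count rate-1 exponential interarrivals in [0, lam]."""
    n, t = 0, rng.expovariate(1.0)
    while t < lam:
        n += 1
        t += rng.expovariate(1.0)
    return n

def naive_nonoverlap_prob(lam, eta, rbar, n_samples=2000, seed=1):
    """Naive MC estimate of P(lam): probability that spheres of radius
    rbar/lam^eta centered at a lam-homogeneous PPP on the torus are disjoint."""
    rng = random.Random(seed)
    d_min = 2 * rbar / lam ** eta      # centers must be at least this far apart
    hits = 0
    for _ in range(n_samples):
        n = sample_poisson(lam, rng)
        pts = [(rng.random(), rng.random()) for _ in range(n)]
        hits += all(torus_dist(pts[i], pts[j]) >= d_min
                    for i in range(n) for j in range(i + 1, n))
    return hits / n_samples
```

Equivalently, this checks that the associated Gilbert graph has no edges, since an edge corresponds to a pair of centers within distance $2\overline{r}/\lambda^\eta$.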
For the implementation of the proposed IS-based AR method, the grid is constructed using the cell-edge length $\epsilon = 1/\lfloor \lambda^\eta/\overline{r}\rfloor$; see Section 4.4 for more details on the selection of the cell-edge length. The complexities of the algorithms are estimated using 1000 samples. In the simulation results presented below,
$\widehat S_{\mathsf{NAR}}$ and
$\widehat S_{\mathsf{ISAR}}$ denote the sample means of complexities of the naive AR and the IS-based AR algorithms. Likewise,
$\widehat S_{\mathsf{DCM1}}$,
$\widehat S_{\mathsf{DCM2}}$, and
$\widehat S_{\mathsf{DCM3}}$ denote the corresponding estimates for the three dominated CFTP methods (Methods 1, 2, and 3) presented in Section 5, respectively.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_fig3.png?pub-status=live)
Figure 3. Log of the expected number of points generated per perfect sample of the hard-sphere model, as a function of $\lambda$, in the regime where
$\eta = 0.5$,
$d = 2$, and
$\overline{r} = 1$.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_fig4.png?pub-status=live)
Figure 4. A plot of $\log {\mathcal{P}(\lambda)}$ versus
$\lambda$ in the regime where
$\eta = 0.5$,
$d = 2$, and
$\overline{r} = 1$, where the non-overlapping probability
${\mathcal{P}(\lambda)}$ is estimated using the conditional Monte Carlo method proposed in [Reference Hirsch, Moka, Taimre and Kroese20]. The plot shows that, in the high-density regime, the configurations with hard spheres can be extremely rare under the measure
$\mu^0$.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_fig5.png?pub-status=live)
Figure 5. The intensity of the hard-sphere model against $\lambda$ in the regime where
$\eta = 0.5$,
$d = 2$, and
$\overline{r} = 1$. Owing to the extreme rarity of the hard-sphere configurations under
$\mu^{0}$, as shown in Figure 4, the intensity of the hard-sphere model is much smaller than the intensity
$\lambda$ of the PPP.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_fig6.png?pub-status=live)
Figure 6. Log of the expected number of points generated per perfect sample of the hard-sphere model, as a function of $\lambda$, in the regime where
$\eta = 0.75$,
$d = 2$, and
$\overline{r} = 0.5$.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_fig7.png?pub-status=live)
Figure 7. A plot of $\log {\mathcal{P}(\lambda)}$ versus
$\lambda$ in the regime where
$\eta = 0.75$,
$d = 2$, and
$\overline{r} = 0.5$, where the non-overlapping probability
${\mathcal{P}(\lambda)}$ is estimated using the conditional Monte Carlo method proposed in [Reference Hirsch, Moka, Taimre and Kroese20]. Here we notice that the hard-sphere configurations are relatively less rare compared to the scenarios in Experiment 1.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_fig8.png?pub-status=live)
Figure 8. The intensity of the hard-sphere model against $\lambda$ in the regime where
$\eta = 0.75$,
$d = 2$, and
$\overline{r} = 0.5$. Unlike in Experiment 1, the intensity of the hard-sphere model is relatively close to the intensity
$\lambda$ of the PPP.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_fig9.png?pub-status=live)
Figure 9. Comparison between the running times of the proposed IS-based AR method and rHardcore() for generating 1000 samples.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_fig10.png?pub-status=live)
Figure 10. A plot of $\log {\mathcal{P}(\lambda)}$ versus
$\overline{r}$ in the regime where
$\eta = 0.5$,
$d = 2$, and
$\lambda = 50$, where the non-overlapping probability
${\mathcal{P}(\lambda)}$ is again estimated using the conditional Monte Carlo method.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_fig11.png?pub-status=live)
Figure 11. Intensity of the hard-sphere model against $\overline{r}$ in the regime where
$\eta = 0.5$,
$d = 2$, and
$\lambda = 50$.
A standard program used for generating perfect samples of the hard-sphere model using dominated CFTP is rHardcore(), which is part of the R package spatstat, available at https://spatstat.org. Experiment 3 provides a perspective on the performance of the proposed method in relation to rHardcore() by comparing their expected running times as a function of $\overline{r}$ for a fixed
$\lambda$. We note that rHardcore() does not support the torus-hard-sphere model. However, when ‘expand=TRUE’ is selected, it reduces the boundary effects by generating a perfect sample on a larger window, and then clipping the result to the original window
$[0,1]^2$.
Experiment 1: In this experiment, we consider the high-density regime. Figure 3 compares the performance of all the algorithms for $\eta = 0.5$ (that is,
$2\eta = 1$) and
$\overline{r} = 1$ (this is identical to the regime where the underlying space is
$[0, \sqrt{\lambda}]^2$ and the radius of each sphere is
$\overline{r}$). This experiment suggests that the proposed IS-based AR method can perform significantly better than all the other methods. To show the rarity of the samples of the hard-sphere configurations under
$\mu^0$, we plot
$\log {\mathcal{P}(\lambda)}$ in Figure 4 and the expected intensity of the hard-sphere model in Figure 5.
The significance of the proposed IS method in the high-density regime is more evident when $\eta = 0.25$ (that is,
$2\eta = 0.5$) and
$\overline{r} = 1$. In this case, for values of
$\lambda$ greater than 50, almost every time, the dominated CFTP algorithms terminated without producing any output. In particular, the rHardcore function terminated by producing the error ‘memory exhausted (limit reached?)’. On the other hand, the IS-based method generated 1000 samples in 0.13, 0.21, 68.94, 508.60, and 4315.19 seconds, respectively, for
$\lambda$ values of 50, 100, 200, 300, and 400.
Experiment 2: In this experiment, we consider the low-density regime. Figure 6 compares the performances of all the methods for $2\eta = 1.5$ and
$\overline{r} = 0.5$ to illustrate the case where
$1 < \eta d < 2$. As we can see, for large values of
$\lambda$, the dominated CFTP methods 2 and 3 perform better than the other methods, including the proposed method. Figure 7 is a plot of
$\log {\mathcal{P}(\lambda)}$ against
$\lambda$, while Figure 8 is a plot of the intensity of the hard-sphere model against
$\lambda$.
Experiment 3: Figure 9 compares the running times of the proposed IS-based AR method and rHardcore() for generating 1000 samples. The same computer is used to run both programs. Here, we vary $\overline{r}$ while fixing
$\lambda = 50$ and
$2\eta = 1$. Observe that for large values of
$\overline{r}$, the density is higher, and the proposed method performs far better than the dominated CFTP method. As we expect in this regime, as the radius
$\overline r$ increases, the intensity of the hard-sphere model decreases (Figure 11) while the rarity of the hard-sphere configurations under
$\mu^0$ increases (Figure 10).
7. Conclusion
In this paper we considered the problem of perfect sampling for Gibbs hard-sphere models on $[0,1]^d$. We discussed the performance of the naive AR method and introduced IS-based enhancements to it. We also compared these methods to some of the popular CFTP-based techniques prevalent in the existing literature. For performance analysis and comparison (of expected running time complexity), we developed an asymptotic regime where the intensity
$\lambda$ of the reference PPP increased to infinity, while the (random) volume of each sphere, of the order of
$\lambda^{- \eta d}$, decreased to zero, for different regimes of
$\eta d > 0$. One main conclusion is that while the dominated CFTP methods perform better for
$1< \eta d < 2$ for large
$\lambda$, our IS-based methods provide a significant improvement for
$\eta d \leq 1$. En route, we established large deviations results for the probability that spheres do not overlap with each other when their centers constitute a PPP. We also conducted numerical experiments to validate our asymptotic results.
The proposed IS-based AR methods rely on cleverly partitioning the underlying configuration space and arriving at an appropriate change of measure on each partition. While we have shown how this may be effectively conducted for hard-sphere models, further research is needed to develop effective implementations for perfect sampling from a broader class of Gibbs point processes.
Acknowledgements
We thank the referees and the associate editor for their comments and suggestions, which substantially improved the presentation of this paper. S. M. acknowledges the support of the Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), under Grant No. CE140100049. S. J. acknowledges the support of the Department of Atomic Energy, Government of India, under Project No. RTI4001. M. M.’s research is partly funded by the NWO Gravitation project NETWORKS, Grant No. 024.002.003.
Appendix A. Proofs
The following lemmas are useful for proving Theorem 3.1, Proposition 4.1, and Proposition 4.3. Lemma A.1 is a standard Chernoff bound for Poisson random variables, and Lemma A.2 is Hoeffding’s inequality for U-statistics [Reference Hoeffding21].
Lemma A.1. (Chernoff bound for Poisson) Let $N \sim {\mathsf{Poi}}(\lambda)$. Then, for any
$0 < \epsilon < 1$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU52.png?pub-status=live)
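A standard form of the Poisson Chernoff bound, which may differ in constants from the displayed equation, is ${\mathbb{P}}(N \geq (1+\epsilon)\lambda) \leq \exp({-}\epsilon^2\lambda/3)$ for $0 < \epsilon < 1$; the proof of Lemma A.4 below uses exactly a bound of this shape. A quick numerical sanity check of this assumed form:

```python
import math

def poisson_upper_tail(lam, k):
    """P(N >= k) for N ~ Poi(lam), with the pmf accumulated in log-space."""
    total = 0.0
    log_pmf = k * math.log(lam) - lam - math.lgamma(k + 1)  # log pmf at k
    j = k
    while True:
        term = math.exp(log_pmf)
        total += term
        if term < 1e-16 and j > lam:   # safely past the mode, terms negligible
            break
        j += 1
        log_pmf += math.log(lam) - math.log(j)  # pmf(j) = pmf(j-1) * lam / j
    return total

lam, eps = 100.0, 0.3
tail = poisson_upper_tail(lam, math.ceil((1 + eps) * lam))  # P(N >= 130)
bound = math.exp(-eps ** 2 * lam / 3)                       # assumed Chernoff form
```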
Lemma A.2. (Hoeffding, 1963) Suppose that $\xi_1, \xi_2, \dots, \xi_n$ are i.i.d. random variables and
$g:{\mathbb{R}}^k \rightarrow [0,1]$ is a measurable function. Set
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU53.png?pub-status=live)
for a positive integer $k \leq n$ (this is known as a U-statistic of order k). Then, for any
$\epsilon > 0$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU54.png?pub-status=live)
The same estimate holds for ${\mathbb{P}} \Big(Z_n \leq \frac{n}{k} \big({\mathbb{E}}[g(\xi_1,\xi_2,\dots,\xi_k)] - \epsilon\big)\Big)$.
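An order-2 U-statistic of the kind used in the proofs below can be computed directly; in this sketch $Z_n$ is taken to be $(n/k)$ times the average of g over all k-subsets, a normalization consistent with the lemma's statement but assumed here rather than read off the displayed equation.

```python
import itertools
import random
import statistics

def u_statistic(xs, g, k):
    """Average of g over all k-subsets of the sample (a U-statistic of order k)."""
    return statistics.fmean(g(*c) for c in itertools.combinations(xs, k))

rng = random.Random(0)
n, k = 200, 2
xs = [rng.random() for _ in range(n)]
g = lambda x, y: 1.0 if abs(x - y) < 0.5 else 0.0  # E[g] = 3/4 for U(0,1) pairs

u = u_statistic(xs, g, k)   # concentrates around E[g] = 0.75 as n grows
z = (n / k) * u             # the n/k scaling appearing in the lemma
```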
A.1. Proof of Theorem 3.1
Recall that $\lambda > 0$,
$\eta > 0$, and
$\frac{R_1}{\lambda^\eta}, \dots, \frac{R_n}{\lambda^\eta}$ are the radii of n spheres whose respective centers
$Y_1, \dots, Y_n$ are independently and uniformly generated on the d-dimensional unit cube
$[0,1]^d$, where
$R_1, \dots, R_n$ are i.i.d. positive random variables bounded from above by a constant
$\overline{r}$ and are independent of
$Y_1, \dots, Y_n$. Define
$m_i \,{:\!=}\, {\mathbb{E}}\left[(R_1 + R_2)^{id}\right]$, for all
$i = 1, 2, \dots$. Let
${\mathcal{P}_n(\lambda)}$ be the probability that these n spheres do not overlap with each other. Since the number of spheres in a
$\lambda$-homogeneous marked PPP on
$[0,1]^d$ is a Poisson random variable with mean
$\lambda$, the non-overlapping probability is
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn18.png?pub-status=live)
where $N \sim {\mathsf{Poi}}(\lambda)$.
In proving the theorem, we use the following lemmas, which exploit the reference IS measure $\widetilde \mu_n$ introduced in Section 4.2. By (4.7) and the definition of
${\mathcal{P}_n(\lambda)}$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn19.png?pub-status=live)
The following bound holds trivially:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn20.png?pub-status=live)
where the sum is taken to be zero when $i = 1$. Let
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU55.png?pub-status=live)
We have the following upper and lower bounds on ${\mathcal{P}_n(\lambda)}$.
Lemma A.3. For the above set-up,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn21.png?pub-status=live)
Furthermore, for any $\epsilon > 0$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn22.png?pub-status=live)
for any n and $\lambda$ such that
$\theta_{n, \lambda} < 1$, where
$N_{n, \lambda}$ is a function of
$n, \lambda$, and
$\overline{r}$, defined by (A.8) in the proof below, such that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn23.png?pub-status=live)
In particular, for the torus-hard-sphere model,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn24.png?pub-status=live)
Proof. Lower bound: To prove (A.4), notice that, by (A.2),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU56.png?pub-status=live)
using the Taylor expansion $\log(1 - x) = - \sum_{j =1}^\infty x^j/j$ for
$0 \leq x \leq 1$. By Jensen’s inequality and (A.3),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU57.png?pub-status=live)
Again by Jensen’s inequality,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU58.png?pub-status=live)
and thus
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU59.png?pub-status=live)
We establish (A.4) using
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU60.png?pub-status=live)
Upper bound: Let $R_{(1)}, R_{(2)}, \dots, R_{(n)}$ be the order statistics of
$R_1, R_2, \dots, R_n$. Since the non-overlapping probability
${\mathcal{P}_n(\lambda)}$ is independent of the order in which the spheres are generated, without loss of generality assume that the ith sphere has radius
$R_{(i)}$. For each
$1 \leq j \leq i -1$, let
$\widetilde{B}_i(j)$ be the volume of the blocked region for the ith sphere generation when the
$(j+1)$th,
$(j+2)$th, …,
$(i -1)$th spheres are ignored, where
$\widetilde{B}_i(0) = 0$. We can think of
$\widetilde{B}_i(j) - \widetilde{B}_i({j-1})$ as the blocking volume contributed by the jth sphere for the ith sphere. Under the new measure
$\widetilde \mu_n$, the blocking volume seen by the ith sphere is
$\displaystyle B_{i} = \sum_{j = 1}^{i-1} \left(\widetilde{B}_i(j) - \widetilde{B}_i({j-1})\right).$ Consider the sets
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU61.png?pub-status=live)
for $ i \leq n$, and take
$\kern5pt\overline{\kern-5pt\mathscr N}^{\kern4pt (i)} \,{:\!=}\, \{1,2,\dots, i -1\} \setminus{\mathscr N}^{(i)}$. Using the inequality
$1 - x \leq e^{-x}$ and (A.2),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU62.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU63.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU64.png?pub-status=live)
and the last inequality holds from the assumption that each $R_i \leq \overline{r}$. Since
$R_{(i)}$ is non-decreasing with i, from the definition of
${\mathscr N}^{(i)}$, it is easy to see that
$|\kern5pt\overline{\kern-5pt\mathscr N}^{\kern4pt (i)}|$ is non-decreasing with i. Therefore,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU65.png?pub-status=live)
We now show that $|\kern5pt\overline{\kern-5pt\mathscr N}^{\kern4pt (n)}|$ is stochastically bounded by a binomial random variable, and as a consequence, the conditional expectation
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU66.png?pub-status=live)
is uniformly bounded from above by a constant, which is a function of $n, \lambda$, and
$\overline{r}$. Let
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU67.png?pub-status=live)
Clearly, $q_{j}$ is increasing with j, and therefore
$q_j \leq q_{n -1}$ for all
$j \leq n-1$. This implies that
$|\kern5pt\overline{\kern-5pt\mathscr N}^{\kern4pt (n)}|$ is stochastically bounded by a binomial random variable with parameters n and
$q_{n-1}$, and thus
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU68.png?pub-status=live)
Because of the boundary effect, $q_{n-1}$ is not the same for the torus model (where boundary spheres wrap around to the opposite boundaries) and the Euclidean model. Observe that, for the Euclidean model,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU69.png?pub-status=live)
if either
(1) the center of the
$(n -1)$th sphere is within a distance of
$ (R_{(j)} + R_{(n-1)} + 2R_{(n)})/\lambda^\eta$ from the center of the jth sphere for some
$j \leq n-2$, or
(2) the center of the
$(n-1)$th sphere is within a distance of
$(R_{(n-1)} + R_{(n)})/\lambda^\eta$ from the boundary of the unit cube.
Note that the boundary event (2) is irrelevant for the torus-hard-sphere model. The probability of the event (1) is maximized by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU70.png?pub-status=live)
while that of the event (2) is maximized by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU71.png?pub-status=live)
Since the $R_{i}$ are bounded from above by
$\overline{r}$ and
$B_{n-1} \leq \theta_{n, \lambda}$ (from (A.3)), we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU72.png?pub-status=live)
for any n and $\lambda$ such that
$\theta_{n,\lambda} < 1$, and for some constant c. Let
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn25.png?pub-status=live)
then
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU73.png?pub-status=live)
Using the definition of $Z_n$, for any
$\epsilon > 0$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU74.png?pub-status=live)
By Lemma A.2 (with $k = 2$),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU75.png?pub-status=live)
and thus (A.5) is established.
It remains to prove (A.6) and (A.7) under the assumption that $\eta d > 1$. For this case, we have that
$\lim_{\lambda \nearrow \infty}\theta_{\lambda, \lambda} = 0$ and hence
$\lim_{\lambda \nearrow \infty}\bar q_{\lambda, \lambda} = 0$. Since
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU76.png?pub-status=live)
using the Taylor expansion of the exponential function,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU77.png?pub-status=live)
Thus (A.6) holds. In particular, for the torus model with $\eta d > 3/2$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU78.png?pub-status=live)
goes to 0 as $\lambda \nearrow \infty$, and hence (A.7) holds.
Lemma A.4. Suppose that $1 < \eta d \leq 2$. Then, for any
$0 < a < 0.5$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn26.png?pub-status=live)
Furthermore, let $\bar{\lambda} = \lceil\lambda(1 - \frac{1}{\lambda^a})\rceil$ for some constant a such that
$0 < a < \frac{\eta d - 1}{2}$. Then, for any
$\epsilon > 0$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn27.png?pub-status=live)
where $N_{\lambda, \lambda}$ satisfies (A.6) and (A.7). In particular, (A.9) holds with
$\epsilon = 0$ if
$\eta d > 5/3$ and
$2 - \eta d < a < \frac{\eta d - 1}{2}$.
Proof. Lower bound: Fix a such that $0 < a < 0.5$. Since
${\mathcal{P}_n(\lambda)}$ is a decreasing function of n for any fixed
$\lambda$, by Lemma A.3, we can say that for all
$n < \lambda\left(1 + \frac{1}{\lambda^a}\right)$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU79.png?pub-status=live)
and from (A.1) and the Chernoff bound for the Poisson variable N (see Lemma A.1),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU80.png?pub-status=live)
Now (A.9) follows easily because $\exp\left( - \frac{1}{3}\lambda^{1 - 2a}\right) = o(1)$ as a function of
$\lambda$.
Upper bound: From (A.1),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn28.png?pub-status=live)
where the last inequality is due to the fact that ${\mathcal{P}_n(\lambda)}$ is a decreasing function of n for fixed
$\lambda$. We now analyze
$\mathcal{P}_{\bar{\lambda}}\left(\lambda\right)$ and
${\mathbb{P}}(N < \bar{\lambda})$ separately.
By Lemma A.3, for any $\epsilon > 0$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU81.png?pub-status=live)
where we used the fact that $\bar{\lambda} \leq \lambda$. We rewrite the above expression as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU82.png?pub-status=live)
Note that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU83.png?pub-status=live)
and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU84.png?pub-status=live)
Since $\eta d > 1$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn29.png?pub-status=live)
and since
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU85.png?pub-status=live)
we can say that the first term $\mathcal{P}_{\bar{\lambda}}\left(\lambda\right)$ in (A.11) satisfies the following inequality:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU86.png?pub-status=live)
By Lemma A.1,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn30.png?pub-status=live)
Since $2a < 1$ (because
$\eta d \leq 2$), we have
$1 - 2a > 2 - \eta d,$ and hence, using (A.13) and the fact that
$N_{\lambda, \lambda} \geq 1$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn31.png?pub-status=live)
Therefore (A.10) follows from (A.11) and (A.14).
In particular, if $\eta d > 5/3$, we choose a such that
$2 - \eta d < a < \frac{\eta d - 1}{2}$. Let
$\epsilon = 1/\lambda^a$. Then (A.12) and (A.14) hold. We complete the proof using the fact that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU87.png?pub-status=live)
Proof of Theorem 3.1. The following upper and lower bounds together complete the proof.
Lower bounds: Consider the inequality (A.9).
Case: $\underline{\boldsymbol{\eta d > 2}}$. Since
$R \leq \overline{r}$, we have
$m_j \leq (2\overline{r})^{jd}$. Thus, for
$\eta d > 2$, all the terms in the exponent of the right-hand side of (A.9) go to zero asymptotically. In other words, for any
$0 < a < 0.5$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU88.png?pub-status=live)
This means that $ \lim_{\lambda \rightarrow \infty} {\mathcal{P}(\lambda)} = 1$ for
$\eta d > 2.$
Case: $\underline{\boldsymbol{3/2 < \eta d \leq 2}}$. Using (A.9),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU89.png?pub-status=live)
is bounded from below by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU90.png?pub-status=live)
By fixing $a > 2 - \eta d$, we see that the right-hand side of the above expression goes to 1 as
$\lambda \nearrow \infty$. Thus,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU91.png?pub-status=live)
Case: $\underline{\boldsymbol{1 < \eta d \leq 3/2}}$. Applying
$\log$ to both sides of (A.9), we have for any
$0 < a < 0.5$ that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU92.png?pub-status=live)
and we see that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU93.png?pub-status=live)
Thus
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU94.png?pub-status=live)
for $\eta d > 1$.
Case: $\underline{\boldsymbol{0 < \eta d \leq 1}}$. Configurations with one sphere or no sphere are always accepted; that is,
$\mathcal{P}_1(\lambda) = \mathcal{P}_0(\lambda) = 1$. The probability of generating no sphere is
$e^{- \lambda}$. Consequently,
${\mathcal{P}(\lambda)} > e^{-\lambda}$, and for any
$\eta d > 0$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn32.png?pub-status=live)
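The trivial bound ${\mathcal{P}(\lambda)} > e^{-\lambda}$ can be checked by direct simulation. The sketch below is not part of the proof; the deterministic radius law $R \equiv \overline{r}$ with $\overline{r} = 0.25$, $d = 2$, and a fixed $\eta$ are illustrative assumptions. It estimates the non-overlapping probability by generating a Poisson number of spheres and testing the acceptance condition:

```python
import math
import random

def poisson_sample(lam, rng):
    """Sample N ~ Poi(lam) by inversion (adequate for moderate lam)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p < threshold:
            return k
        k += 1

def non_overlap_prob(lam, eta, rbar=0.25, d=2, trials=500, seed=0):
    """Monte Carlo estimate of P(lam): centers of N ~ Poi(lam) spheres
    are uniform on [0,1]^d, all radii equal rbar/lam**eta, and a
    configuration is accepted iff no two spheres overlap."""
    rng = random.Random(seed)
    min_dist2 = (2 * rbar / lam**eta)**2   # squared minimum center distance
    accepted = 0
    for _ in range(trials):
        n = poisson_sample(lam, rng)
        pts = [[rng.random() for _ in range(d)] for _ in range(n)]
        ok = all(
            sum((pts[i][k] - pts[j][k])**2 for k in range(d)) >= min_dist2
            for i in range(n) for j in range(i + 1, n)
        )
        accepted += ok
    return accepted / trials
```

For instance, with $\lambda = 3$ and $\eta = 1$ the estimate comfortably exceeds $e^{-3} \approx 0.05$, consistent with the trivial bound.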
In particular, assume that $\eta d = 1$. For this case, first we show that the limit
$\delta \,{:\!=}\, \lim_{\lambda \rightarrow \infty}\left[\frac{1}{\lambda}\log {\mathcal{P}(\lambda)} \right]$ exists. To prove this, partition the cube
$[0,1]^d$ into a cubic grid of cell-edge length
$x^{1/d} \in (0,1)$. Ignore the cells at the boundary that have edge length strictly smaller than
$x^{1/d}$. The total intensity of the underlying PPP over a cell is then
$\lambda x$.
When $\eta d = 1$, the radius of each sphere is equal in distribution to
$R/\lambda^{1/d}$. Observe that the non-overlapping probability of the spheres restricted to a cell is
$\mathcal{P}(\lambda x)$ (see the definition of the non-overlapping probability). Since the total number of cells is at least
$1/x$, the non-overlapping probability
${\mathcal{P}(\lambda)} \leq \left(\mathcal{P}(\lambda x)\right)^{\frac{1}{x}}\!,$ and thus
$\frac{1}{\lambda} \log {\mathcal{P}(\lambda)} \leq \frac{1}{\lambda x}\log \mathcal{P}(\lambda x).$ We can increase
$\lambda$ and decrease the cell-edge length
$x^{1/d}$ so that
$y \,{:\!=}\,\lambda x$ is fixed. Then the following inequality holds:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU95.png?pub-status=live)
Now the existence of the required limit follows by taking the limit over y:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU96.png?pub-status=live)
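In log form, the grid inequality above is a subadditivity statement, so the limit exists by the same mechanism as Fekete's subadditive lemma: for a subadditive sequence, $\lim_n a_n/n = \inf_n a_n/n$. A toy numerical illustration (the sequence below is an arbitrary subadditive example, not $\log \mathcal{P}$):

```python
import math

def fekete_ratios(a, nmax=10000):
    """For a subadditive sequence a(m+n) <= a(m) + a(n), Fekete's lemma
    gives lim a(n)/n = inf_n a(n)/n; return the last ratio and the inf."""
    ratios = [a(n) / n for n in range(1, nmax + 1)]
    return ratios[-1], min(ratios)

# a(n) = n + log(n+1) is subadditive because (m+1)(n+1) >= m+n+1,
# and a(n)/n = 1 + log(n+1)/n decreases to its infimum 1.
last, inf_ratio = fekete_ratios(lambda n: n + math.log(n + 1))
```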
To show that $\delta \nearrow 0$ as
$\gamma m_1 \searrow 0$, assume that
$\gamma m_1 < \epsilon$ for a constant
$\epsilon \in (0,1)$. By (A.2) and (A.3),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU97.png?pub-status=live)
Consider the following partial order on ${\mathbb{R}}_+^n$: for any
$y,\, y' \in {\mathbb{R}}_+^n$, we say that
$y \preceq y'$ if
$y_i \leq y_i'$ for all
$i=1,\dots,n$. A function
$f\,{:}\, {\mathbb{R}}_+^n \rightarrow {\mathbb{R}}$ is called increasing (or decreasing) if
$f(y) \leq f(y')$ (or
$f(y) \geq f(y')$) for all
$y, \, y' \in {\mathbb{R}}_+^n$ such that
$y \preceq y'$. If f and g are both increasing or both decreasing, then Theorem 2.4 of [Reference Grimmett17] (the Fortuin–Kasteleyn–Ginibre (FKG) inequality) can be trivially extended to show that
${\mathbb{E}}[f(Y)g(Y)] \geq {\mathbb{E}}[f(Y)] {\mathbb{E}}[g(Y)]$. Clearly the following function
$f_i$ is a decreasing function on
${\mathbb{R}}_+^n$:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU98.png?pub-status=live)
Therefore,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU99.png?pub-status=live)
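The monotone-association step used here (a Harris/FKG-type inequality for independent coordinates) can be sanity-checked by simulation; the two decreasing functions below are toy choices, not the $f_i$ of the proof:

```python
import random

def association_check(n=5, samples=100000, seed=1):
    """Estimate E[f(Y)g(Y)] and E[f(Y)]E[g(Y)] for Y with i.i.d.
    Uniform(0,1) coordinates and two coordinatewise-decreasing
    functions f, g; FKG/Harris predicts the first >= the second."""
    rng = random.Random(seed)
    f = lambda y: float(max(y) < 0.9)   # decreasing indicator
    g = lambda y: -sum(y)               # decreasing linear functional
    sum_fg = sum_f = sum_g = 0.0
    for _ in range(samples):
        y = [rng.random() for _ in range(n)]
        fv, gv = f(y), g(y)
        sum_fg += fv * gv
        sum_f += fv
        sum_g += gv
    return sum_fg / samples, (sum_f / samples) * (sum_g / samples)
```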
Using the convexity of the function $x^+$ and Jensen’s inequality, for each i,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU100.png?pub-status=live)
and thus
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU101.png?pub-status=live)
With $\underline \lambda = \lfloor \lambda + \lambda^{0.75}\rfloor$ and
$N \sim {\mathsf{Poi}}(\lambda)$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU102.png?pub-status=live)
Applying $\log$ to both sides of the above inequality and dividing by
$\lambda$, we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU103.png?pub-status=live)
From the definition of $\underline \lambda$ and Lemma A.1, the second term,
$\frac{1}{\lambda} \log {\mathbb{P}}\left(N \leq \underline{\lambda} \right)$, goes to zero as
$\lambda \nearrow \infty$. We now focus on the first term,
$\frac{1}{\lambda} \log \mathcal{P}_{\underline \lambda}(\lambda)$. Since
$\gamma m_1 < \epsilon < 1$, for all
$i \leq \underline \lambda$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU104.png?pub-status=live)
for large values of $\lambda$. Thus, we can write using Bernoulli’s inequality that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU105.png?pub-status=live)
for large values of $\lambda$. Therefore, by combining the trivial bound (A.15) and the above conclusions,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU106.png?pub-status=live)
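Bernoulli's inequality invoked above, $(1-x)^n \geq 1 - nx$ for $x \in [0,1]$ and $n \geq 1$, is elementary; a quick numerical confirmation:

```python
def bernoulli_pair(x, n):
    """Return ((1-x)^n, 1 - n*x); Bernoulli's inequality says the first
    component is at least the second for x in [0, 1] and integer n >= 1."""
    return (1 - x)**n, 1 - n * x

pairs = [bernoulli_pair(x, n) for x in (0.0, 0.01, 0.5, 1.0) for n in (1, 2, 10)]
```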
Upper bounds: The case $\eta d > 2$ is already settled, since the lower-bound analysis shows that ${\mathcal{P}(\lambda)} \rightarrow 1$ there. So it remains to prove the theorem for
$0 < \eta d \leq 2$. We first prove the required upper bounds for the case
$1 < \eta d \leq 2$. If
$0 < a < 0.5$ and
$\bar{\lambda} = \lceil\lambda(1 - \frac{1}{\lambda^a})\rceil$, then from Lemma A.4, for any
$\epsilon > 0$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn33.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn34.png?pub-status=live)
Case: $\underline{\boldsymbol{\eta d > 1}}$. Applying
$\log$ to both sides of (A.16) and then dividing by
$\lambda^{2 -\eta d}$, we see that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU107.png?pub-status=live)
As a consequence of Lemma A.3,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU108.png?pub-status=live)
Now take $\epsilon \searrow 0$.
In particular, consider the torus-hard-sphere model with $5/3 < \eta d \leq 2$. We can fix a such that
$2 - \eta d < a < \frac{\eta d - 1}{2}$. From Lemma A.4, (A.16) holds with
$\epsilon = 0$. Therefore,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU109.png?pub-status=live)
and hence
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU110.png?pub-status=live)
from Lemma A.3.
Case: $\underline{\boldsymbol{0 < \eta d < 1}}$. Let
$\underline{\lambda} = \lfloor \lambda^{\frac{1+\eta d}{2}}\rfloor$ and
$N \sim {\mathsf{Poi}}(\lambda)$. From the definition,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn35.png?pub-status=live)
For any $\epsilon > 0$, let
$H_n(\epsilon) \,{:\!=}\, \left\{ \frac{1}{n} \sum_{i=1}^n R_i^d > \epsilon \right\}$. From (4.8),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU111.png?pub-status=live)
where the second inequality holds because
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU112.png?pub-status=live)
on $H_n\left( \frac{\lambda^{\eta d}}{\gamma' n}\right)$. Since
$\eta d < (\eta d +1)/2 < 1$, we see that
$\frac{\lambda^{\eta d}}{\gamma' \underline{\lambda}} \searrow 0$ as
$\lambda \nearrow \infty$, and thus for every
$\epsilon > 0$ there exists
$\lambda_\epsilon$ such that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU113.png?pub-status=live)
for all $\lambda > \lambda_\epsilon$ and
$n > \underline{\lambda}$.
Suppose there is a constant $c > 0$ such that
$R \geq c$. Then for all sufficiently small values of
$\epsilon$,
${\mathbb{P}}\left(H^c_n\left(\epsilon\right) \right) = 0$ for all
$n > \underline{\lambda}$. Thus for large values of
$\lambda$,
${\mathcal{P}(\lambda)} \leq {\mathbb{P}} \left( N \leq \underline{\lambda}\right) \leq e^{-\lambda} \lambda^{\underline{\lambda}}$, and from the definition of
$\underline{\lambda}$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU114.png?pub-status=live)
So we can assume that ${\mathbb{P}}(R < \epsilon)> 0$ for every
$\epsilon> 0$. Recall that
${\mathbb{P}}(R > 0) = 1$ and
$\Lambda(\theta)$ is the logarithmic moment generating function of
$R^d$. As a consequence of the positivity of R, we see that
$\Lambda(\theta) \searrow -\infty$ as
$\theta \searrow -\infty$. Let
$\Lambda^*(x) = \sup_{\theta \in {\mathbb{R}}} \left\{\theta x - \Lambda(\theta)\right\}$. As a consequence of the assumption that
${\mathbb{P}}(R < \epsilon)> 0$ for every
$\epsilon> 0$, we can show that
$\Lambda^*(x) \nearrow \infty$ as
$x \searrow 0$. From Theorem 2.2.3 of [Reference Dembo and Zeitouni10],
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU115.png?pub-status=live)
for all $n > \underline{\lambda}$ and
$\epsilon < {\mathbb{E}}[R_1^d]$, where the last inequality holds because
$\Lambda^*(x)$ is non-decreasing over
$0 < x \leq {\mathbb{E}}[R_1^d]$. By (A.18),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU116.png?pub-status=live)
for all $\lambda \geq \lambda_\epsilon$. To conclude that
$\limsup_{\lambda \rightarrow \infty} \frac{1}{\lambda} \log {\mathcal{P}(\lambda)} \leq -1$, we observe from the definition of the Poisson distribution and
$\underline{\lambda}$ that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU117.png?pub-status=live)
where we used the fact that $\lambda^{n-1}/(n-1)! < \lambda^{n}/n!$ for all
$n < \lambda$. Hence,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU118.png?pub-status=live)
From the definition of $\underline{\lambda}$, we see that
$\frac{\underline{\lambda}}{\lambda} \log\lambda \searrow 0$ as
$\lambda \nearrow \infty$. As a consequence, as
$\lambda \nearrow \infty$,
$\exp\left(- \lambda\left(\Lambda^*(\epsilon) - \frac{\underline{\lambda}}{\lambda} \log\lambda\right)\right)$ goes to zero. Therefore,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU119.png?pub-status=live)
We have the required result by taking $\epsilon \searrow 0$.
Case: $\underline{\boldsymbol{\eta d = 1}}$. It remains to show that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU120.png?pub-status=live)
if $R \equiv \overline{r}$ and
$\gamma' \overline{r}^d > 1$. Since, from (4.8),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU121.png?pub-status=live)
for all $n > \frac{\lambda}{\gamma' \overline{r}^d},$ we have
${\mathcal{P}(\lambda)} \leq {\mathbb{P}} \left( N \leq \frac{\lambda}{\gamma' \overline{r}^d}\right).$ Now the proof is complete by Lemma A.1.
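Lemma A.1, used repeatedly above, is a Chernoff bound for the Poisson distribution. The sketch below checks the standard lower-tail form ${\mathbb{P}}(N \leq (1-\epsilon)\lambda) \leq e^{-\lambda\epsilon^2/2}$ against the exact tail (the precise statement of Lemma A.1 may differ; this form is assumed for illustration):

```python
import math

def poisson_lower_tail(lam, m):
    """Exact P(N <= m) for N ~ Poi(lam), with each term P(N = k)
    accumulated via its logarithm for numerical stability."""
    total, log_term = 0.0, -lam            # log P(N = 0) = -lam
    for k in range(int(m) + 1):
        if k > 0:
            log_term += math.log(lam) - math.log(k)
        total += math.exp(log_term)
    return total

def chernoff_lower_tail(lam, eps):
    """Chernoff-type bound: P(N <= (1-eps)*lam) <= exp(-lam*eps^2/2)."""
    return math.exp(-lam * eps**2 / 2)

lam, eps = 200.0, 0.3
exact = poisson_lower_tail(lam, (1 - eps) * lam)
bound = chernoff_lower_tail(lam, eps)
```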
A.2. Proof of Proposition 3.1
From [Reference Mase29], the intensity of the torus-hard-sphere model is given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU122.png?pub-status=live)
where $\mathcal{P}_{n}(\lambda)$ is the non-overlapping probability of n uniformly and independently generated spheres with radius
$\overline{r}/\lambda^\eta$.
Case: $\underline{\boldsymbol{\eta d > 1}}$. In this regime, we show that
$\rho(\lambda)$ is of the order of
$\gamma \overline{r}^d \lambda^{1 - \eta d}$. Using inequalities (4.8) and (A.3),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU123.png?pub-status=live)
Therefore,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU124.png?pub-status=live)
and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU125.png?pub-status=live)
Consequently,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU126.png?pub-status=live)
and thus
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU127.png?pub-status=live)
Case: $\underline{\boldsymbol{\eta d \leq 1}}$. We now show that
$\lim_{\lambda \nearrow \infty} \mathsf{VF}(\lambda) \leq \rho^{\max} \gamma$ with equality if and only if
$\eta d < 1$. Towards this end, we first consider another torus-hard-sphere model on
$[0, \lambda^\eta/\overline{r}]^d$ with unit-radius spheres and absolutely continuous with respect to a
$\kappa$-homogeneous PPP for some intensity
$\kappa > 0$. Let
$\rho(\kappa, \lambda)$ be the intensity of this new hard-sphere model. We can easily see that when
$\kappa = \overline{r}^d \lambda^{1- \eta d}$, the fraction of the volume occupied by the spheres in the new hard-sphere model is also
$ \mathsf{VF}(\lambda)$.
The proofs of Propositions 1 and 2 in [Reference Mase29] can easily be modified to show that $\rho(\kappa, \lambda)$ is strictly increasing in
$\kappa$ for any fixed
$\lambda > 0$, and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU128.png?pub-status=live)
where $\rho^{\max}$ is the closest packing density. On the other hand, by fixing
$\kappa$, we can further show, using [Reference Mase29], that the limit
$\lim_{\lambda \to \infty} \rho(\kappa, \lambda)$ exists and is equal to the intensity of the stationary hard-sphere model on
${\mathbb{R}}^d$ with unit-radius spheres and the reference PPP being
$\kappa$-homogeneous. (In fact, the difference between
$\rho(\kappa, \lambda)$ and the limit
$\lim_{\lambda \to \infty} \rho(\kappa, \lambda)$ is known to be insignificant for large values of
$\lambda$; see, for example, [Reference Baddeley and Nair5].)
From the above discussion, when $\eta d < 1$, for sufficiently small
$\epsilon > 0$, there exist constants
$\kappa_\epsilon$ and
$\lambda_\epsilon$ such that
$\rho(\kappa, \lambda) > \rho^{\max} - \epsilon$ for all
$\lambda > \lambda_\epsilon$ and
$\kappa > \kappa_\epsilon$. If we take
$\kappa = \overline{r}^d\,\lambda^{1- \eta d}$, since
$\eta d< 1$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU129.png?pub-status=live)
which is the maximum packing intensity.
Finally, if $\eta d = 1$ and
$\kappa = \overline{r}^d$, the limit
$\lim_{\lambda \to \infty} \rho(\overline{r}^d, \lambda)$ is strictly less than
$\rho^{\max}$. Hence, we have
$\lim_{\lambda \to \infty} \mathsf{VF}(\lambda) < \rho^{\max} \gamma$.
A.3. Proof of Proposition 4.1
Let $N^\prime$ be the number of spheres generated sequentially (independently and identically distributed) before the first overlap occurs. Let
$N \sim {\mathsf{Poi}}(\lambda)$ independently of
$N^\prime$. Then from the construction of Algorithm 4.1,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn36.png?pub-status=live)
for some positive constants c and $c^\prime$. In the above expression,
$\log(\lambda)$ appears because the cost to generate a sample of N is at most of order
$\log(\lambda)$ (see e.g. [Reference Devroye11]). Observe that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn37.png?pub-status=live)
where the last equality follows from the fact that ${\mathbb{P}}(N' > n) = {\mathcal{P}_n(\lambda)}$.
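The random variable $N^\prime$ can be simulated directly by sequential placement; a simplified sketch (the deterministic radius $R \equiv \overline{r}$ with $\overline{r} = 0.25$, $d = 2$, and $\eta = 1$ are illustrative assumptions) that returns the number of spheres placed before the first overlap:

```python
import random

def spheres_before_overlap(lam, eta, rbar=0.25, d=2, rng=None):
    """Place spheres of radius rbar/lam**eta one by one, with centers
    uniform on [0,1]^d, and return how many were placed before the
    first overlap; the first sphere never overlaps, so the result >= 1."""
    rng = rng or random.Random()
    min_dist2 = (2 * rbar / lam**eta)**2   # squared minimum center distance
    pts = []
    while True:
        p = [rng.random() for _ in range(d)]
        if any(sum((p[k] - q[k])**2 for k in range(d)) < min_dist2 for q in pts):
            return len(pts)
        pts.append(p)

rng = random.Random(7)
draws = [spheres_before_overlap(50.0, 1.0, rng=rng) for _ in range(200)]
mean_Nprime = sum(draws) / len(draws)
```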
Upper bound: For $\eta d \geq 2$, since
${\mathcal{P}_n(\lambda)} \leq 1$, we can bound (A.20) from above by a constant times
${\mathbb{E}}[N^{2}]$, which is further bounded from above by a constant times
$\lambda^2$. Therefore we just need to consider the case
$\eta d < 2$. From (A.2),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU130.png?pub-status=live)
As a consequence of (4.8),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU131.png?pub-status=live)
Let $\alpha = {\mathbb{E}}[R_1^d]$. Since
$\overline{r}$ is an upper bound on the
$R_i$, by Hoeffding’s inequality (Lemma A.2) on the sequence
$\{R^d_1/\overline{r}^d, \dots, R^d_n/\overline{r}^d\}$ with
$\epsilon = \alpha/(2\overline{r}^d)$,
$k = 2$, and
$g(x,y) = x$, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU132.png?pub-status=live)
Let $a = \sqrt{\frac{2\lambda^{\eta d}}{\gamma' \alpha}}$. Then from the above expression,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn38.png?pub-status=live)
Select $\lambda$ large enough so that
$b > 0$. Then with
$p = 1 - \exp(\!-\alpha^2/(4\overline{r}^{2d}))$, the second term on the right side of (A.21) is
$1/p$ times
${\mathbb{E}}\left[Z\right]$ for a geometric random variable Z with success probability p and support
$\{1,2,3,\dots\}$. Since
${\mathbb{E}}\left[Z\right] = 1/p$, the term
$\sum_{n=1}^\infty n\exp\left(- n \alpha^2/(4\overline{r}^{2d})\right)$ is bounded from above by a constant.
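The geometric-variable argument above rests on the identity $\sum_{n\geq 1} n q^n = q/(1-q)^2$ for $0 < q < 1$, which with $p = 1-q$ is $(q/p)\,{\mathbb{E}}[Z]$ for a geometric Z with ${\mathbb{E}}[Z] = 1/p$; a direct numerical check:

```python
def geometric_series_check(p, nmax=10000):
    """With q = 1 - p, compare the partial sum of sum_{n>=1} n*q^n
    against its closed form q/p**2, i.e. (q/p)*E[Z] for a geometric
    random variable Z on {1,2,...} with success probability p."""
    q = 1.0 - p
    partial = sum(n * q**n for n in range(1, nmax + 1))
    return partial, q / p**2

partial, closed = geometric_series_check(0.3)
```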
On the other hand, since
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU133.png?pub-status=live)
for any non-negative integer n, we can write that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU134.png?pub-status=live)
where Z is a Gaussian random variable with mean 0 and variance $a^2$. Since
${\mathbb{E}}[|Z|] = a \sqrt{2/\pi}$, using the definition of a, the first term in (A.21) is bounded from above by a constant times
$\lambda^{\eta d}$. Thus, the required upper bound on (A.19) is established as a consequence of (A.20) and (A.21).
Lower bound: Let $\epsilon' = \min(1, \eta d/2)$. Then from (A.20),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU135.png?pub-status=live)
for a constant $c > 0$. From (A.4),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU136.png?pub-status=live)
where $m_j = {\mathbb{E}}\left[(R_1 + R_2)^{j d}\right]$. Note that
$m_j \leq (2\overline{r})^{jd}$ since
$R \leq \overline{r}$. Therefore,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU137.png?pub-status=live)
Using the Taylor expansion of $\log(1 - x)$ for
$0< x <1$, along with the fact that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU138.png?pub-status=live)
for sufficiently large values of $\lambda$, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU139.png?pub-status=live)
From the definition of $\epsilon'$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU140.png?pub-status=live)
In addition, from Lemma A.1, ${\lim_{\lambda \rightarrow \infty}{\mathbb{P}}\left(N > \lambda^{\epsilon'}/2\right) = 1}$. Therefore, there exists a constant c such that
${C_{\mathsf{itr}}(\lambda) \geq c\, \lambda^{2\epsilon'}}$. The proof of Proposition 4.1 is then complete by Theorem 3.1 and (4.1).
A.4. Proof of Proposition 4.3
First note that the sphere volume is at most a constant times the cell volume for all $\lambda$. Thus, after generating a sphere, the complexity of relabeling cells around the center of the new sphere plus the complexity of the overlap check is a constant. For
$\eta d > 1$, the number of spheres generated in an iteration of Algorithm 4.5 is stochastically dominated by a Poisson random variable with rate
$\lambda$. Therefore, there exists a constant c such that
$\widetilde C_{\mathsf{itr}}(\lambda) \leq c\, \lambda$. On the other hand, if
$0 < \eta d \leq 1$, the expected number of spheres generated per iteration is of order
$\lambda^{\eta d}$, because the expected volume of each sphere is of order
$1/\lambda^{\eta d}$. It is clear that there exists a constant
$c > 0 $ such that
$\widetilde C_{\mathsf{itr}}(\lambda) \leq c\, \lambda^{\eta d } $. Thus, by (4.6) and Proposition 4.1,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU141.png?pub-status=live)
Thus, (4.11) holds, because $\widetilde \sigma(n) = \delta_n$ for each
$n \in \mathbb{N}_0$. Furthermore, from the definition of
$\widetilde \sigma(\cdot)$ and N,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU142.png?pub-status=live)
By the Chernoff bound (Lemma A.1), for any $0 < \epsilon < 1$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU143.png?pub-status=live)
If $\eta d > 1$, then the second term on the right-hand side of the above expression decreases faster than the first term, and thus the claim holds true. For
$\eta d = 1$, take
$\epsilon = 1/2$; then we have the required result with
$b = \min\left(1/8, \gamma \overline{r}^d/4\right)$. Furthermore, if
$0 < \eta d < 1$ then the first term decreases faster than the second one, and hence the proof is completed by taking
$b = 1/2$.
A.5. Proof of Theorem 5.1
To derive the lower bound on ${\mathcal{T}_{\mathsf{DC1}}}$, we view the entire dominating process
$\textbf{D}$ as a Poisson Boolean model on a higher-dimensional space and use an extension of the FKG inequality (see Theorem 2.2 of [Reference Meester and Roy31]). Let
$s_0 = 0$, and let
$s_i$ be the instant of the ith arrival in the dominating process after time zero. Let
$C(\textbf{x}, \textbf{x}^u, \textbf{x}^l)$ be the running time complexity of updating the dominating, upper bound, and lower bound processes at the instant of an arrival when their respective states are
$\textbf{x}, \textbf{x}^u$, and
$\textbf{x}^l$.
Since $U_0(0) = D(0)$ and
$L_0(0) = \varnothing$, on
$\bigcap_{j=0}^i \{D(s_j) \notin \mathscr{A}\}$, for all
$t \leq s_i,$
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqn39.png?pub-status=live)
Thus, $L_0(t) \neq U_0(t)$ for all
$t \leq s_i$ on
$\bigcap_{j=0}^i \{D(s_j) \notin \mathscr{A}\}$. Now take
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU144.png?pub-status=live)
From the above conclusion, it is clear that $N^f \geq \tau$. Then,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU145.png?pub-status=live)
where $I\left(\cap_{j=0}^{-1} \left\{D(s_j) \notin \mathscr{A}\right\} \right) = 1$ and the last equality follows from (A.22).
Suppose that $\mathscr{D}$ is the state space of the entire process
$\left\{ D(t)\,{:}\,t \in {\mathbb{R}} \right\}$. Then we can define a simple partial order on
$\mathscr{D}$ as follows. For any
$\omega, \omega' \in \mathscr{D}$, we say
$\omega \preceq \omega'$ if and only if every sphere present in
$\omega$ is also present in
$\omega'$; that is, either
$\omega' = \omega$ or
$\omega'$ is obtained by adding spheres to
$\omega$. We then define the following notion of increasing functions: a real-valued function on
$\mathscr{D}$ is increasing if
$f(\omega) \leq f(\omega')$ for all
$\omega, \omega' \in \mathscr{D}$ such that
$\omega \preceq \omega'$.
At each arrival, if there are n points in the upper bounding process, the cost of deciding whether to accept the new point is at least the cost of checking the overlap condition in the upper bounding process, and that cost is of order n. Therefore, $C\Big(D(s_i), U_0(s_i), \varnothing\Big) \geq c |U_0(s_i)|$ for some constant c. Since
$|U_0(s_i)|$ is a non-decreasing function on
$\mathscr{D}$ as per the partial order stated above, by the FKG inequality (Theorem 2.2 in [Reference Meester and Roy31]),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU146.png?pub-status=live)
is bounded from below by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU147.png?pub-status=live)
Thus,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20211007193057581-0654:S000186782100001X:S000186782100001X_eqnU148.png?pub-status=live)
for some constant $c > 0$. Then (5.2) follows from (4.1) and (4.2). The proof is completed using Theorem 3.1.