1. Introduction
In this paper we introduce a discrete-time Markov process for the positioning and growth of a number of particles/objects, referred to as stars. It can be thought of as a generalization of a Pólya-type urn scheme [Reference Eggenberger and Pólya10] due to Hoppe [Reference Hoppe16, Reference Hoppe17], where each star corresponds to a colour and the number of balls of the same colour is the mass of a star. Our model is obtained by adding a spatial location to all the stars of Hoppe’s urn scheme. In more detail, at each consecutive time point a new unit mass appears within a given region of space. This unit mass is either absorbed by one of the existing stars, or it defines a new star that later on starts to grow. We refer to each such unit mass as a candidate star, and devise a two-step procedure for its allocation into the population. In the first step candidates will most likely arrive at mass dense parts of the region, and in the second step they are more easily absorbed by nearby stars of high mass. We may interpret this as a competition, whereby different stars try to recruit new mass in its neighbourhood, but occasionally some candidates manage to avoid absorption and define a new star.
Our main task is to investigate how the spatial distribution, mass distribution, and number of objects evolve over time. In the spirit of [Reference Joyce and Tavaré20] and [Reference Tavaré29], this is achieved by embedding the original process into a continuous-time Markov process. This implies that new stars (accepted candidates) of the continuous-time model arrive according to a marked Poisson process with locations as marks, and existing stars grow as independent Yule processes with location-dependent intensities. From this, it follows that the number of stars increases at a logarithmic rate, since fewer of them will avoid absorption over time, and that the largest objects will be found in a region where the Yule processes grow at a maximal rate. We prove that the masses of the stars within this region have an asymptotic Poisson–Dirichlet distribution ([Reference Arratia, Barbour and Tavaré3], [Reference Kingman23]), whereas the distribution of the size of a randomly chosen star is very thick tailed, with a logarithmic decay rate.
The paper is organized as follows. In Section 2 we introduce the discrete-time model and formulate the main results. Then we define the continuous-time model in Section 3, give a number of examples in Section 4, and present proofs in Section 5. Finally, in Section 6 we discuss possible extensions of our model, and in particular their relevance for and connections to different types of urn schemes, population genetics, image segmentation, epidemic spread, and random graphs with preferential attachment.
2. Discrete-time model
Consider a set Ω ⊂ ℝq, in which points of unit mass appear at positions Yk, at discrete time points k ∈ ℤ+. Each such unit mass either gives rise to a new star at Yk or is attracted to some previously formed star in Ω, whose mass then increases by 1. Let 1 = K 1 < K 2 < K 3 < · · · refer to the time points when new stars are formed at positions Xi = YKi for i = 1, 2, … . The mass distribution over Ω at time k = 0, 1, … is conveniently summarized by the marked point process [Reference Jacobsen19]
where δx is a point mass at x, and Nki, the mark attached to Xi, is the mass of the ith formed star at time k. We introduce the sequence
for the mass size distribution at time k. We have Nki = 0 for i > Ik, where Ik is the number of stars that have been formed up to and including time k. Since I 0 = 0, we start with N 0i ≡ 0 at time 0. Then, since I 1 = 1, and Ik ≥ Ik −1 for k ≥ 2, the number of nonzero masses is a nondecreasing function of time. Moreover, since the total mass increases by 1 at each time point, it follows that
In order to define a probabilistic rule according to which new stars are formed, or existing stars increase their mass, we assume that the attractional force between two masses nx and ny at positions x, y ∈ Ω is nxnyL(x, y), where L(x, y) ≥ 0 is a kernel function that typically (but not necessarily) is a nonincreasing function of some metric d(x, y) on ℝq, for instance, the Euclidean distance between x and y. Also, let ei = (0, …, 0, 1, 0, …) refer to a sequence with 1 in position i and 0s elsewhere. Then
for i = 1, 2, … specify whether a new candidate star at y and time k − 1 is either absorbed by existing stars with probabilities proportional to their attractional forces (Ik = Ik −1), or turned into a new star at y (Ik = Ik −1 + 1, Yk = XI k) with a probability proportional to a candidate appearance rate γ . We also need to specify where new candidate stars appear in Ω. To this end, let π be a probability distribution over Ω, and let $\cB(\Omega)$ refer to the Borel sigma algebra of Ω. The proposal distribution
is a tilted version of π, such that candidates are proposed more often in densely populated regions of Ω. On the other hand, we know from (2) that candidate stars are more easily formed in sparse regions of Ω. It turns out that the amount of tilting in (3) is just strong enough to balance the latter effect, so that the positions of all formed stars have the same distribution, as specified in the following theorem.
Theorem 1. If the proposal distribution for the location of candidate stars satisfies (3), and if candidates are absorbed by existing stars or form new ones according to (2), then X 1, X 2, … are independent and identically distributed (i.i.d.) random variables with Xi ~ π.
The asymptotic mass distribution as k → ∞ depends crucially on the function λ:Ω → [0, ∞), where
is the growth intensity or growth rate of a star located at x. Let
be the maximal growth intensity, and let
be the set of points in Ω for which stars grow at this maximal rate. We will assume that newly formed stars grow at the maximal rate with a positive probability, i.e.
The important parameter of the asymptotic mass size distribution is
the ratio between the appearance rate of new candidate stars within Ω0, and the growth intensity of stars within Ω0. This is detailed in the following theorem.
Theorem 2. Suppose that (6) and the conditions of Theorem 1 hold. Then
as k → ∞, where ‘$\Lto$’ refers to joint weak convergence for all components i = 1, 2, …, Ji = |{l; 1 ≤ l ≤ i, Xl ∈ Ω0}| is the number of stars among the first i that have a maximal growth rate, and
where the Zl ~ B(1, θ) are i.i.d. and beta distributed random variables with density function fZ l(u) = θ(1 − u)θ−1 for 0 < u < 1.
It follows from Theorem 2 that the ordered relative asymptotic mass sizes U (1) ≥ U (2) ≥ · · · have a Poisson–Dirichlet distribution with parameters 0 and θ; see, for instance, [Reference Kingman23], [Reference Pitman and Yor27], and the references therein.
In order to formulate how often new stars are formed, the parameter
is important. It is related to (7), but the numerator of (8) refers to the appearance rate of all candidate stars, rather than those from the maximal region only. The following result reveals that the number of stars grows at a logarithmic rate.
Theorem 3. Suppose that the conditions of Theorem 1 hold. The number of stars among the first k candidates then satisfies
asymptotically as k → ∞, where ψ is as defined in (8) and op(1) is a remainder term that converges to 0 in probability.
It is also of interest to analyse the size
of a randomly chosen star at time k, so that
We know from Theorem 3 that new stars arrive more seldom over time. As a consequence of this, there has been plenty of time for a typical star at time k to grow when k is large. The following result specifies how this affects the size distribution.
Theorem 4. Let N 1:k be the size (9) of a randomly chosen star at time k. Then
as k → ∞, where V ~ U(0, 1) is a uniformly distributed random variable that is independent of X ~ π, whereas λ is defined in (5). Equivalently, if Vλ(X)/λ ~ F, it follows that
if x → ∞ and k → ∞ jointly in such a way that log(x)/ log(k) → c for some constant 0 ≤ c < 1.
For a fixed time point k, we may use Theorem 4 in order to approximate the tail probability of N 1:k as
for 0 ≤ x ≤ k. This implies that the distribution of N 1:k is very thick tailed. It decays to 0 at a logarithmic rate, until the upper limit k of its support is reached.
3. Continuous-time model
In order to prove Theorems 1–4, we will embed the discrete time model of Section 2 into a continuous-time Markov model with time index t > 0. The mass size distribution at time t is denoted as
with Ni(t) the mass of the ith formed star, and $N(t)=\sum_{i=1}^\infty N_i(t)$ the total mass of all stars. We will assume that
is right continuous and piecewise constant, with jumps occurring at an increasing sequence 0 = τ 0 < τ 1 < ··· < τk < ··· of time points. The mass configuration over Ω at time t is represented as a marked point process
and we let
be the time point when the ith star is formed, so that Ni(t) = 0 for 0 < t < Ti, and Ni(Ti) = 1. It is assumed that {Ti} is a Poisson process with intensity γ, and that the positions Xi ∈ Ω of new stars form an i.i.d. sequence of random variables with distribution π. Then we postulate that the Ni(t) are independent Yule processes with different starting points Ti and growth rates λ(Xi), conditionally on the arrival times and positions of all stars. By this we mean that
as h → 0 for all t ≥ Ti. Whenever Ni(·) increases by one unit, it is due to attraction of a candidate star at some position Y ∈ Ω with distribution
for all $A\in\cB(\Om)$.
Alternatively, we may specify M directly in terms of the conditional intensity by which it grows [Reference Rafler28]. It either acquires new mass due to the arrival of new stars or the growth of existing stars, according to
as h → 0. Processes of this kind were referred to as Pólya sum processes in [Reference Nehring and Zessin25] for the special case when λ(·) is constant.
In order to study the growth rate of Ni(t), we first make use of the well-known property of Yule processes that
has a geometric distribution (starting at 1) for all t ≥ Ti; see, e.g. [Reference Kendall21] or [4, p. 109]. Moreover, if λ(Xi) > 0, it follows that {Ni(t) | Ti, Xi; t ≥ Ti} is a continuous-time, supercritical branching process with Malthusian parameter λ(Xi) and expected value eλ(Xi)(t−Ti). We therefore have almost-sure convergence
of the rescaled mass of star i as t → ∞ (see, e.g. [13, Theorem 6.3]), towards a limiting random variable εi ~ Exp(1) whose distribution is exponential with rate 1.
The following result is the key to the proofs of Theorems 1–4.
Proposition 1. Consider the continuous-time marked point process (13), where new stars appear as a marked Poisson process with intensity γ and i.i.d. locations (marks) Xi ~ π. Moreover, assume that all stars give rise to independent Yule processes (15) that grow due to attraction of candidate masses, whose random locations are distributed as in (16). The embedded discrete-time Markov process Mk = M(τk) then has the same distribution as for the model of Section 2. In particular, (2)–(3) hold.
4. Examples
In this section we describe a number of examples that aim to illustrate the model of Section 2. They differ in the way the mass attractional force kernel L(x, y) is defined.
Example 1. (Homogeneous model.) Suppose that L(x, y) ≡ 1, so that the attractional force between masses is not influenced by their positions. Then (2) reduces to a Hoppe urn scheme or Chinese restaurant process
The total mass N(t) at time t for the continuous-time model is a Yule process with immigration rate γ and growth rate 1 ([Reference Joyce and Tavaré20]).
Equation (4) implies that λ(x) = 1 for all x ∈ Ω, so that λ = 1 and Ω0 = Ω. It then follows immediately from (7) and (8) that
From Theorem 4 we also deduce that F ~ U(0, 1) has a uniform distribution in (11), the formula for the asymptotic mass size distribution of a randomly chosen star.
Example 2. (Household model.) Let Ω = Ω1 ∪ · · · ∪ ΩH be a partition of Ω into H disjoint regions that represent H different households. Assume that
where 0 ≤ m < 1 is a number that quantifies the amount of interaction between households. It follows that λ(·) is piecewise constant over Ω1, …, ΩH, with
for x ∈ Ωh . Also, let
refer to the number of maximally sized households. It then follows from (7), (8), and (19) that
and
From the definition of F in Theorem 4 and (19), we then deduce that
is a finite mixture of uniform distributions with right-hand end points
Example 3. (Neighbourhood model.) Let d(x, y) refer to the Euclidean distance in ℝq, and define the kernel
so that a star at x attracts other masses within a radius r, by a force that is the same everywhere within this neighbourhood. Assume that Ω is a convex and bounded subset of Rq such that B(x, r′) ⊂ Ω for some x ∈ Ω and r′> r. If π is the uniform distribution on Ω then the growth rate for a star at x is
where | · | refers to the Lebesgue measure. From this, it follows that the maximal growth rate
is attained for all stars within the interior
of Ω, i.e. those points x Ω in whose distance to the complement of Ω is at least r. It follows from (7) and (8) that
respectively. The formula for F in Theorem 4 is more complicated; a mixture
of uniform distributions, whose right-hand end points are v(x) = |B(x, r) ∩ Ω |/|B(0, r)| for points x outside the interior set Ω0.
Example 4. (Gravitational law model.) Assume that Ω is a p-dimensional and bounded subset of ℝq for some 0 < p ≤ q, and that π is the uniform distribution on Ω. The kernel
represents a gravitational force that decreases by distance between masses according to a power law with exponent 0 < β < p, for some metric d(x, y). If Ω = B(0, 1) is the unit ball of Rq then p = q. If d(x, y) refers to Euclidean distance between x and y, it can be seen that Ω0 = {0} consists of one single point, the origin. Since π(Ω0) = 0, it follows that (6) is violated.
On the other hand, if Ω is the unit sphere in Rq then p = q − 1. It is then appropriate to choose d(x, y) as the length of the geodesic along Ω that connects x and y. Moreover, by symmetry we note that
independently of x, so that Ω0 = Ω. It therefore follows from (7) and (8) that
whereas F ~ U(0, 1) in Theorem 4.
5. Proofs
Proof of Proposition 1. Let M(t) be the continuous-time, dynamic model (13) for the mass distribution over Ω of all stars. By definition,
is the rate at which a unit mass star at x increases its mass by one unit, due to attraction of a candidate from $A\in\cB(\Om)$. Consider a time interval τk −1 < t ≤ τk. Then the rate at which a star i ≤ Ik −1 increases its mass by one unit, due to attraction of a candidate from A, is Ni(t)λ(Xi, A), whereas a new star appears in A at rate γ π(A). Let
be the embedded discrete-time process for the star mass distribution. We need to show that(20) satisfies (2)–(3). But this follows easily from the fact that the transition distributions of the embedded Markov chain are proportional to the jump intensities of the corresponding continuous-time Markov process. First, since M(·) is constant on [τk −1, τk), the probability distribution of the next candidate Yk is
in agreement with (3). Second, given that a candidate at Yk = y has appeared at time t = τk, it is allocated to an existing star or forms a new star with probabilities
that simplify to (2), since π(y) cancels out in the numerator and denominator.
Proof of Theorem 1. The result follows immediately from Proposition 1, since, by defi-nition, new stars of the continuous-time model in Section 3 are located at i.i.d. positions Xi ~ π.
Proof of Theorem 2. In view of Proposition 1, it suffices to analyse the continuous-time model of Section 3, and prove that
jointly for i = 1, 2, … as t → ∞, with Uj and Ji as defined in Theorem 2. To this end, we first write
When t ≥ Ti we put εi(t) ≡ 1 when λ(Xi) = 0, whereas ε i(t) is a rescaling of the geometric random variable in (17) with expected value 1 when λ(Xi) > 0. For definiteness, we also put ε i(t) = 0 when t < Ti. It follows from (18) that $\cE_i(t)\xrightarrow{\text{\upshape a.s.}}\cE_i\sim\Exp(1)$ as t → ∞ for any star i such that λ(Xi) > 0. Since (22) are independent processes for all i, we can analyse the mass growth within Ω0 and Ω1 =Ω \ Ω0 separately. To this end, let
be the total mass of stars within Ω0 and Ω1, respectively, at time t. We will establish that the mass fraction within the maximal region Ω0 tends to 1 asymptotically as t → ∞, i.e.
The upper part of (21) will then follow from
In order to prove the lower part of (21), we also need to derive the asymptotic mass distribution within the maximal region, i.e.
jointly for j = 1, 2, …, where Hj = min{i; Ji = j} is the order number of the jth star from Ω0 among all stars in Ω.
We thus need to show that (24)–(25) hold; let us start with (25). Since λ(Xi) = λ for all Xi ∈ Ω0, it follows from (22), (23), and the definition of Uj(t) in (25) that
as t → ∞, where $\cE_{H_{j}}(t)\xrightarrow{\text{\upshape a.s.}}E_j,E_j \~ EXP\(1\)$ are i.i.d. random variables,
and
Since $\{T_{H_j}\}_{j=1}^\infty$ is obtained by thinning a Poisson process {Ti} with intensity γ, with a thinning probability π(0), it follows that $\{\la T_{H_j}\}_{j=1}^\infty$ is a Poisson process with intensity γ π(Ω0)/λ = θ. Together with (26)–(27), it can be shown that this implies that $\smash{E_j^\prime \sim \Ga(\theta,1)}$ has a gamma distribution, and hence that Zj ~ B(1, θ). It can also be verified that the Zj are independent, thereby proving the joint convergence in (25). See [Reference Tavaré29] and [9, p. 28] for details.
It remains to prove (24). To this end, we will compare the asymptotic sizes of N(t; 0) and N(t; 1) as t → ∞. We first compute an asymptotic approximation
of the expected value of N(t; 0) as t → ∞, where in the third step we used the notation f (t) ~ g(t) for f (t)/g(t) → 1, and in the fourth step we evaluated the moment generating function of λTH j ~ Γ (j, θ), the jth arrival time of a Poisson process with intensity θ. Then, by a similar argument as in (26)–(28), we find that
as t → ∞, where W has expected value E(W) = [θ/(1 + θ)](1 + θ)/θ = 1. In order to study N(t; 1) we split Ω1 = Ω11 ∪ Ω12 into two subregions, where the growth rate of the points in Ω11 is close to the maximal growth rate λ, whereas the mass of stars in Ω12 grow at a slower rate. More specifically, let (ε > 0 be a small number, and choose η = η(ε) > 0 so that π(Ω11) ≤ επ(Ω0), where
Let $H_j^\prime$ and $H_j^{\prime\prime}$ refer to the order numbers of the jth star from Ω11 and Ω12 respectively, among all stars in Ω. Since the growth rate of stars is at most λ in Ω11, and at most λ − η in Ω12, we may repeat the argument in (29), and find that
where in the second last step we used the fact that $\smash{\{\la T_{H_j^\prime}\}}$ and $\smash{\{\la T_{H_j^{\prime\prime}}\}}$ are Poisson processes with intensities γ π(Ω11)/λ and γ π(Ω12)/λ, respectively, and in the last step the facts that π(Ω11) ≤ επ(Ω0) and π(Ω12) ≤ 1. Since ε > 0 was arbitrarily chosen, we conclude that
Putting things together, we find that
as t → ∞, where in the last step of (32) we used (31) and Markov’s inequality to deduce that $\smash{N(t;\,1)/(\theta \re^{\la t})\cvgpr 0}$, then we used equation (30) to conclude that $\smash{\theta \re^{\la t}/N(t;\,0) \Lto W^{-1}}$, and finally we invoked Slutsky’s lemma to verify that the product of these two terms converges to 0 in probability as t → ∞. But, since (24) follows from (32), the proof of the theorem is complete.
Proof of Theorem 3. Since Ki = min{k; Ik ≥ i}, it suffices to prove that
as i → ∞. Again, we will utilize the embedding of the discrete-time model of Section 2 into the continuous-time model of Section 3. It follows from (1), (12), (14), and (24) that
as i → ∞. An expression for N(Ti; 0) appears in the denominator of (26). By a similar calculation, we can rewrite this expression asymptotically as
when i → ∞, where E 1 ~ Exp(1) and Z 1 ~ B(1, θ) were introduced in (26) and (27). In the fifth step of (35) we used the fact that λTi ~ Γ(i, γ /λ) is the ith event of a Poisson process with intensity γ /λ, and, hence, by the definition of ψ in (8), λ(Ti − TH 1) = [1 + op(1)]i/ψ as I → ∞. Since (34) and (35) imply (33), the theorem is proved.
Proof of Theorem 4. Let I(t) = |{i; Ti ≤ t}| refer to the number of new stars of the continuous-time model of Section 3 that have arrived up to time t, without getting absorbed by other stars. Also, let N(1:t) = NI (1 : t ) denote the size of a randomly chosen star among those that exist at time t, with
We recall from the paragraph below (14) that new unabsorbed stars arrive according to a marked Poisson process (T 1, X 1), (T 2, X 2), … with intensity γ and independent marks Xi ~ π. It therefore follows that the number of stars at time t is Poisson distributed (I(t) ~ Po(γt)), and, moreover, a randomly chosen star among these I(t) stars satisfies
where V ~ U(0, 1) and X ~ π are independent. From (22) and (36) we deduce that
as t → ∞. On the other hand, it follows from (9) and (12) that
and by similar reasoning and notation to (35), we find that
as k → ∞. Putting things together, we note that (37)–(38) imply that
in accordance with (10). Since (11) is simply a restatement of (10) in terms of distribution functions, the theorem is proved.
6. Discussion
In this paper we introduced a spatio-temporal and discrete-time marked point process for the appearance and growth of objects (for instance stars), with a mutual attraction that is proportional to their masses and dependent on their locations. Asymptotic results were obtained for the location of these objects, their mass distribution, and the speed at which new objects arrive. The key step in the proofs of these results was to embed the discrete-time model into a continuous-time marked point process, where existing objects grow as independent Yule processes, at different rates, whereas new objects arrive according to a marked Poisson process, with locations as marks.
A number of extensions are possible. First, we assumed that with positive probability, new objects will grow at the maximal rate. This condition is violated, for instance, by the gravitational law model of Example 4, when the maximal growth rate occurs at one single point within a unit ball. For such models, the asymptotic mass distribution will be more complicated than in Theorem 2.
Second, it is possible to add a number of parameters to the model. One may for instance consider a more general type of attractional force $n_x^\alpha M^\alpha L(x,y)$ between an object at x with mass nx, and a candidate with random mass M at y, for some α ≥ 0 and attraction kernel L. It is also possible that an object at x with mass nx loses mass at rate $\mu n_x^{\alpha^\prime}$ for some constants μ > 0 and α′ ≥ 0. With these extensions it is of interest to investigate whether α, L, α′, μ, and the distribution of M can be chosen in such a way that a nondegenerate limit distribution of masses is obtained.
Third, we assumed that the dynamics of the model is solely due to new candidate objects, which are either absorbed by existing objects or become new objects themselves. A key assumption was that new candidates appear more often in high-density regions, in such a way that the discrete-time model could be embedded into a continuous-time model in a simple way. It is of interest to consider other proposal distributions for new candidates, for instance, their locations are independent and identically distributed with some distribution π. It is challenging, though, to analyse these types of models, since the particles of the continuous-time process will no longer grow independently.
Fourth, it is of interest to allow for pairs of existing objects to coalesce. As in the previous paragraph, this makes it harder to find the asymptotic mass distribution, since particles of the continuous-time marked point process will no longer grow independently. However, measure-valued Markov processes have been used to model particle aggregation; see [Reference Aldous2], [Reference Evans and Pitman11], and the references therein. It is possible that the results of these papers can be used in our setting as well.
Fifth, as mentioned in Section 1, the discrete-time model of Section 2 can be viewed as a Hoppe-type urn scheme with a spatial component. The original nonspatial urn model [Reference Hoppe16] corresponds to the homogeneous model of Example 1. It contains balls of various colours, and a new ball is added to the urn at each time point. An object corresponds to balls of the same colour, and its mass is the number of these balls. Such urns are frequently used for infinite allele models of population genetics ([Reference Durrett9], [Reference Ewens12], [Reference Kimura and Crow22]), with colours representing genetic variants (alleles), and balls with new colours representing mutations. In particular, the time-reversed version of the urn model is a coalescent, i.e. an ancestral tree for a sample of genes, where each mutation gives rise to a separate tree ([Reference Kingman24], [Reference Watterson30]). This was used by [Reference Donnelly and Tavaré8] in order to find the age-ordered asymptotic distribution of the frequencies of different alleles for a large population. It would be of interest to investigate whether the model of Section 2 can be extended to find the age-ordered asymptotic allele frequency distribution for populations with a spatial structure. For instance, the households of Example 2 could represent subpopulations, where some fractions of objects are allowed to migrate between these subpopulations at each time step. The resulting marked point process is then the time reversal of a certain type of structured coalescent with migration between subpopulations and mutations ([Reference Herbots, Donnelly and Tavaré15], [Reference Notohara26]).
Sixth, another generalization of the discrete-time model of Section 2 is to have a time-dependent parameter γk that controls how often new stars are formed. In Section 2 we assumed γk ≡ γ > 0. However, if γk = 0 for all k ≥ K, from time K and onwards we get a multivariate Pólya-type urn scheme [Reference Blackwell and Macqueen7] with a spatial component. It corresponds to a model where new candidate stars are always absorbed by some of the present stars, so that no new stars are formed. In particular, a more general version Example 2 can be defined where households and colours represent pixels and greyscales of an image. This has been used for image segmentation, where new candidates appear at one pixel at a time [Reference Banerjee, Burlina and Alajiji5] in order to update the reconstructed image for the purpose of image segmentation. The difference from our model is that candidates are always absorbed into their own household (pixel), but the allocation rule of updating its colour is defined by a Pólya urn that contains the balls within the pixel where the candidate arrives and the balls of all its neighbours. A similar model has been used by [Reference Hayhoe, Alajiji and Gharesifard14] for the spread of an epidemic along a network of nodes (households). It has two colours that represent infectious and healthy units in terms of bacteria and white blood cells for instance.
Seventh, our framework has connections to preferential attachment (PA) models of random graphs. Such graphs are generated dynamically when new nodes (objects/stars) and edges are added one at a time, and the mass of each node corresponds to its number of edges. The major difference from our model is firstly that all candidate nodes are accepted, and secondly that the total mass increases by two rather than one at each time point, whenever an edge is generated between a new node and an existing one. Since all candidate nodes are accepted, the number of nodes increases linearly with time rather than logarithmically, as in Theorem 3. Moreover, the degree distribution for the nodes of a PA model typically has a power law distribution—much less heavy tailed compared with Theorem 4. In order to illustrate this, consider the original PA model without any spatial component, which corresponds to Example 1. For this model, each new node connects to an existing node with probabilities proportional to their number of edges, regardless of spatial position. It has been shown that ℙ(N 1: k = x) is of order x −3 for the number of edges N 1: k of a randomly chosen node at time k [Reference Barabási and Albert6]. Spatial preferential attachment (SPA) models add a spatial component to the PA models, when new nodes appear randomly in space in such a way that they more easily connect to existing nodes in their neighbourhood with many edges. These SPA models are more flexible, since they allow for a power law degree distribution with a whole range of exponents [Reference Aiello, Bonato, Cooper, Janssen and Pralat1] and their local clustering shows more resemblance to that of real networks [Reference Jacob and Mörters18]. Our model is similar to a two-step SPA model, where nodes with many edges tend to influence the location of new nodes at first, so that these new nodes are more likely to arrive in an edge-rich region. The second step is analogous to the SPA model, whereby the new nodes tend to connect to the nodes in their neighbourhood with many edges. It is of interest to study the degree distributions and type of local clustering of such two-step SPA models.
Acknowledgements
The author wishes to thank Pieter Trapman for suggesting this problem, and for valuable discussions and advice. The author also thanks Tom Britton, Mia Deijfen, Robert Fitzner, and Mathias Millberg-Lindholm for their advice, and an anonymous referee for helpful comments.