Subgraphs in preferential attachment models

Alessandro Garavaglia; Clara Stegehuis

doi:10.1017/apr.2019.36

Subgraphs in preferential attachment models

Part of: Graph theory Limit theorems

Published online by Cambridge University Press: 03 September 2019

Alessandro Garavaglia and

Clara Stegehuis

Show author details

Alessandro Garavaglia*: Affiliation:
Eindhoven University of Technology
Clara Stegehuis*: Affiliation:
Eindhoven University of Technology
*: *Postal address: Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands.
*Postal address: Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands.

Article contents

Abstract
Introduction
Definitions
Main results
The probability of a subgraph being present
Proof of Theorem
Proof of Theorem
Conditional concentration: proof of Proposition
Discussion
References

Rights & Permissions

Abstract

We consider subgraph counts in general preferential attachment models with power-law degree exponent $\tau > 2$. For all subgraphs H, we find the scaling of the expected number of subgraphs as a power of the number of vertices. We prove our results on the expected number of subgraphs by defining an optimization problem that finds the optimal subgraph structure in terms of the indices of the vertices that together span it and by using the representation of the preferential attachment model as a Pólya urn model.

Keywords

Preferential attachment subgraph triangle clustering

MSC classification

Primary: 05C80: Random graphs

Secondary: 60F05: Central limit and other weak theorems

Type: Original Article
Information: Advances in Applied Probability , Volume 51 , Issue 3 , September 2019 , pp. 898 - 926

DOI: https://doi.org/10.1017/apr.2019.36 [Opens in a new window]
Copyright: © Applied Probability Trust 2019

1. Introduction

The degree distribution of many real-world networks can be approximated by a power-law distribution [Reference Faloutsos, Faloutsos and Faloutsos12],[Reference Vázquez, Pastor-Satorras and Vespignani30], where, for most networks, the degree exponent $\tau$ was found to be between 2 and 3, so that the degree distribution has infinite variance. Another important property of networks are subgraph counts, also referred to as motif counts. In many real-world networks, several subgraphs were found to appear more frequently than other subgraphs [Reference Milo19]. Which type of subgraph appears most frequently varies for different networks, and the most frequently occurring subgraphs are believed to be correlated with the function of the network [Reference Milo19], [Reference Milo20], [Reference Wuchty, Oltvai and Barabási32]. The triangle is the most studied subgraph, allowing for the computation of the clustering coefficient of the network, which expresses the fraction of connected neighbors of a vertex.

To investigate which subgraphs occur more frequently than expected in a given network, the subgraph count in a given network is usually compared to the subgraph count in a random graph null model [Reference Gao and Lafferty13], [Reference Maugis, Priebe, Olhede and Wolfe18], [Reference Milo20], [Reference Onnela, Saramäki, Kertész and Kaski21]. Several random graph models could potentially serve as null models. In practice, the null model is frequently obtained by randomly switching edges while preserving the degrees. This model, however, is not mathematically tractable for $\tau<3$ , so that it requires simulations to estimate the subgraph count in such networks [Reference Marcus and Shavitt17], [Reference Wernicke and Rasche31].

Several other null models for simple, scale-free networks exist, such as the configuration model [Reference Bollobás5], the rank-1 inhomogeneous random graph [Reference Boguñá and Pastor-Satorras4], [Reference Chung and Lu9], the preferential attachment model, [Reference Barabási and Albert2] or hyperbolic random graphs [Reference Krioukov16]. When the degree exponent satisfies $\tau<3$ , the configuration model results in a network with many multiple edges and self-loops [Reference van der Hofstad28, Chapter 7], so that it is not a null model for simple networks anymore. A possible solution is to merge all multiple edges of the configuration model, and consider the erased configuration model instead [Reference Britton, Deijfen and Martin-Löf8]. This model is mathematically tractable, and subgraph counts for this model were derived in [Reference van der Hofstad, van Leeuwaarden and Stegehuis29].

In this paper, we analyze subgraph counts for a different random graph null model, the preferential attachment model. The preferential attachment model was first introduced by Albert and Barabási [Reference Albert and Barabási1], [Reference Barabási and Albert2]. In their original work, they described a growing random graph model where a new vertex appears at each time step. The new vertex connects with a fixed number of existing vertices chosen with probability proportional to the degrees. This original Albert–Barabási model has been generalized over the last years, generating the broad class of random graphs called preferential attachment models (PAMs).

The original Albert–Barabási model is known to produce a power-law degree distribution with $\tau=3$ [Reference Barabási and Albert2]. Often, a modification is considered, where edges are attached to vertices with probability proportional to the degree plus a constant $\delta$ . The constant $\delta$ allows us to obtain different values for the power-law exponent $\tau$ . For $\delta=0$ , we retrieve the original Albert–Barabási model.

The PAM can also be extended to the case where new vertices come into the graph with a number $m\geq1$ of edges, each attached to an existing vertex (or also to the new vertex itself) [Reference Berger, Borgs, Chayes and Saberi3], [Reference Jordan15], [Reference van der Hofstad28]. In the present paper we focus on the case where m is fixed, and our results hold for any value of $\delta>-m$ . Taking $\delta\in({-}m,0)$ results in $\tau\in(2,3)$ , as observed in many real-world networks. An important difference between the preferential attachment model and most other random graph null models is that edges can be interpreted as directed. Thus, it allows us to study directed subgraphs. This is a major advantage of the PAM over other random graph null models, since most real-world network subgraphs in, for example, biological networks are directed as well [Reference Milo20], [Reference Shen-Orr, Milo, Mangan and Alon25].

1.1. Literature on subgraphs in PAMs

We now briefly summarize existing results on specific subgraph counts in PAMs. The triangle is the most studied subgraph, allowing us to investigate clustering in the PAM. Bollobás and Riordan [Reference Bollobás and Riordan6] proved that, for any integer-valued function T(t), there exists a PAM with T(t) triangles, where t denotes the number of vertices in the PAM. They further showed that the clustering coefficient in the Albert–Barabási model is of order ${(\log t)^2}/t$ , while the expected number of triangles is of order ${(\log t)^3}$ and, more generally, the expected number of cycles of length l scales as ${(\log t)^l}$ .

Eggmann and Noble [Reference Eggemann and Noble11] considered $\delta>0$ , so that $\tau>3$ and investigated the number of subgraphs for $m=1$ (so subtrees), and, for $m\geq 2,$ they studied the number of triangles and the clustering coefficient. They observed that the expected number of triangles is of order $\log t,$ while the clustering coefficient is of order $\log t/t$ , which is different than the results in [Reference Bollobás and Riordan6]. Our result on general subgraphs for any value of $\delta$ in Theorem 1 explains this difference (in particular, we refer the reader to (3)).

In a series of papers [Reference Prokhorenkova22], [Reference Prokhorenkova and Krot23], [Reference Prokhorenkova, Ryabchenko, Samosvat, Bonato, Mitzenmacher and Praat24] Prokhorenkova et al. proved results on the clustering coefficient and the number of triangles for a broad class of PAMs, assuming general properties on the attachment probabilities. These attachment probabilities are in a form that increases the probability of creating a triangle. In this setting the number of triangles is of order t, while the clustering coefficients behave differently depending on the exact attachment probabilities.

1.2. Our contribution

For every directed subgraph, we obtain the scaling of the expected number of such subgraphs in the PAM, generalizing the above results on triangles, cycles, and subtrees. Furthermore, we identify the most likely degrees of vertices participating in such subgraphs, which shows that subgraphs in the PAM are typically formed between vertices with degrees of a specific order of magnitude. The order of magnitude of these degrees can be found using an optimization problem. For general subgraphs, our results provide the scaling of the expected number of such subgraphs in the network size t. For the triangle subgraph, we obtain precise asymptotic results on the subgraph count, which allows us to study clustering in the PAM.

We use the interpretation of the PAM as a Pólya urn graph. This interpretation allows to view the edges as being present independently, so that we are able to obtain the probability that a subgraph H is present on a specific set of vertices.

1.3. Organization of the paper

We first define the specific PAM we study and explain its interpretation as a Pólya urn graph in Section 2. After that, we present our result on the scaling of the number of subgraphs in the PAM and the exact asymptotics on the number of triangles in Section 3. Section 4 then provides an important ingredient for the proof of the scaling of the expected number of subgraphs: a lemma that describes the probability that a specific subgraph is present on a subset of vertices. After that, we prove our main results in Section 5 and Section 6. Finally, in Section 8 we give conclusions and a discussion of our results.

2. Definitions

In this section, we define the specific PAM we study. Then, we give the definition of a Pólya urn graph in Section 2.2, and we specify how we count subgraphs in Section 2.3.

2.1. PAM

As mentioned in Section 1, different versions of PAMs exist. Here we define the specific PAM we consider, which is a modification of [Reference Berger, Borgs, Chayes and Saberi3, Model 3].

Definition 1

(Sequential PAM.) Fix $m\in\mathbb{N}\geq 2$ , $\delta>-m$ . Then ${(\mathrm{PA}_t(m,\delta))_{t\in\mathbb{N}}}$ is a sequence of random graphs defined as follows.

For $t=1$ , $\mathrm{PA}_1(m,\delta)$ consists of a single vertex with no edges.
For $t=2$ , $\mathrm{PA}_2(m,\delta)$ consists of two vertices with m edges between them.
For $t\geq 3$ , $\mathrm{PA}_t(m,\delta)$ is constructed recursively as follows. Conditioning on the graph at time $t-1$ , we add a vertex t to the graph, with m new edges. Edges start from vertex t and, for $j=1,\ldots,m$ , they are attached sequentially to vertices $E_{t,1},\ldots,E_{t,m}$ chosen with probability
(1) \begin{equation} \mathbb{P}(E_{t,j} = i \mid \mathrm{PA}_{t-1,j-1}(m,\delta)) = \begin{cases} \frac{D_i(t-1)+\delta}{2m(t-2)+ (t-1)\delta} & \!\mbox{if}\ j=1, \\[9pt] \frac{D_i(t-1,j-1)+\delta}{2m(t-2)+(j-1)+ (t-1)\delta} & \!\mbox{if}\ j=2,\ldots, m. \end{cases} \end{equation}

In (1) $\mathrm{PA}_{t-1,j-1}$ denotes the graph of size t after the first $j-1$ edges of vertex t have been attached. Furthermore, $D_i(t-1)$ denotes the degree of i in $\mathrm{PA}_{t-1}(m,\delta)$ , while $D_i(t-1,j-1)$ denotes the degree of vertex i in $\mathrm{PA}_{t-1,j-1}$ . We assume that $\mathrm{PA}_{t-1,0} = \mathrm{PA}_{t-1}$ .

To keep the notation light, we write $\mathrm{PA}_t$ instead of $\mathrm{PA}_t(m,\delta)$ throughout the rest of the paper. The first term in the denominator of (1) describes the total degree of the first $t-1$ vertices in $\mathrm{PA}_{t-1,j-1}$ when $t-1$ vertices are present and $j-1$ edges have been attached. The term ${(t-1)\delta}$ in the denominator comes from the fact that there are $t-1$ vertices to which an edge can attach. Note that we do not allow for self-loops, but we do allow for multiple edges.

The PAM of Definition 1 is a graph with no self-loops, but possibly with multiple edges. In this setting, new edges are added sequentially one by one, with intermediate updates of degrees in the attachment probabilities as in (1). Several other possible definitions of PAMs exist. For example, we can allow edges to be attached to their original vertex, thus generating self-loops, or we can force the edges to be attached to distinct vertices, in order to prevent the formation of multiple edges (for example, [Reference Berger, Borgs, Chayes and Saberi3, Model 2]). We can also attach the m edges independently of each other, without intermediate degree update, as in [Reference Berger, Borgs, Chayes and Saberi3, Model 1].

All these models have in common the fact that edges choose the vertex to be attached to according to probabilities of the from (1), where in the numerator the degree of the vertex, k, appears as an argument of an affine function $f(k) = k+\delta$ . For this reason, these models are called affine PAMs. The parameter $\delta$ determines the initial attractiveness of vertices when they have degree 0, or, equivalently, the prominence of the effect of high degrees in the attachment probabilities. When $\delta\rightarrow\infty$ , this creates a PAM where edges are attached uniformly (i.e. the degrees are no more relevant). For a list of affine PAMs, with differences and similarities explained, we refer the reader to [Reference Garavaglia14, Chapter 4, Section 3] and the references therein.

Affine PAMs, for fixed $m\in\mathbb{N}$ and $\delta>-m$ , are known to generate graphs where the asymptotic degree sequence is close to a power law [Reference van der Hofstad27, Lemma 4.7], where the degree exponent $\tau$ satisfies $ \tau=3+{\delta}/{m}. $

2.2. PAMs as Pólya urn graphs

As mentioned, our results are based on the Pólya urn interpretation of PAMs. We now explain this interpretation in more detail. An urn scheme consists of an urn, with blue balls and red balls. At every time step, we draw a ball from the urn and we replace it by two balls of the same color. We start with $B_0 = b_0$ blue balls and $R_0 = r_0$ red balls. We consider two weight functions,

$$W_b(k) = a_b + k \quad\mbox{and}\quad W_r(k) = a_r+k.$$

Conditionally on the number of blue balls $B_n$ and red balls $R_n$ , at time $n+1$ the probability of drawing a blue ball is equal to

$$\frac{W_b(B_n)}{W_b(B_n)+W_r(R_n)}.$$

The evolution of the number of balls ${((B_n,R_n))_{n\in\mathbb{N}}}$ obeys [Reference van der Hofstad27, Theorem 4.2]

$${\mathbb{P}(B_n = B_0 + k) = \mathbb{E}[\mathbb{P}( \mathrm{Bin}(n,\psi)=k\mid \psi)],}$$

where $\psi$ has a beta distribution with parameters $B_0+a_b$ and $R_0+a_r$ . In other words, the number of blue balls (equivalently, of red balls) is given by a binomial distribution with a random probability of success $\psi$ (equivalently, $1-\psi$ ). Sometimes we call the random variable $\psi$ the intensity or strength of the blue balls in the urn. We can also see the urn process as two different urns, one containing only blue balls and the other only red balls, and we choose a urn proportionally to the number of balls in the urns. In this case, the result is the same, but we can say that $\psi$ is the strength of the blue balls urn and $1-\psi$ is the strength of the red balls urn.

The sequential model $\mathrm{PA}_t$ can be interpreted as an experiment with t urns, where the number of balls in each urn represents the degree of a vertex in the graph. First, we introduce a random graph model.

Definition 2

(Pólya urn graph.) Fix $m\geq 1$ and $\delta>-m$ . Let $t\in\mathbb{N}$ be the size of the graph. Let $\psi_1 = 1$ , and consider $\psi_2,\ldots,\psi_t$ independent random variables, where

\begin{equation*}\psi_k {\stackrel{\text{D}}{=}} \mathrm{Beta}(m+\delta, m(2k-3)+(k-1)\delta).\end{equation*}

Define

(2)

\begin{equation} \varphi_j = \psi_j\prod_{i=j+1}^t (1-\psi_i), \qquad S_k = \sum_{j=1}^k \varphi_j, \qquad I_k = [S_{k-1},S_k). \label{eqn2} \end{equation}

Conditioning on $\psi_1,\ldots,\psi_t$ , let $\smash{\{U_{k,j}\}_{k=2,\ldots,t}^{j=1,\ldots,m}}$ be independent random variables, with $U_{k,j}$ uniformly distributed on $[0,S_{k-1}]$ . Then, the corresponding Pólya urn graph $\mathrm{PU}_t$ is the graph of size t where, for $u{<}v$ , the number of edges between u and v is equal to the number of variables $U_{v,j}$ in $I_u$ for $j=1,\ldots,m$ (multiple edges are allowed).

The two sequences of graphs ${(\mathrm{PA}_t)_{t\in\mathbb{N}}}$ and ${(\mathrm{PU}_t)_{t\in\mathbb{N}}}$ have the same distribution [Reference Berger, Borgs, Chayes and Saberi3, Theorem 2.1], [Reference van der Hofstad27, Chapter 4]. The beta distributions in Definition 2 come from the Pólya urn interpretation of the sequential model, using urns with affine weight functions.

Throughout this paper, we will use the fact that we can interpret the PAM as a Pólya urn graph to prove as well as to interpret our results on the number of subgraphs in PAMs.

2.3. Labeled subgraphs

As mentioned before, the PAM in Definition 1 is a multigraph, i.e. any pair of vertices may be connected by m different edges. One could erase multiple edges in order to obtain a simple graph, similarly to [Reference Britton, Deijfen and Martin-Löf8] for the configuration model. In the PAM in Definition 1 there are at most m edges between any pair of vertices, so that the effect of erasing multiple edges is small, unlike in the configuration model. We do not erase edges, so that we may count a subgraph on the same set of vertices multiple times. Not erasing edges has the advantage that we do not modify the law of the graph; therefore, we can directly use known results on PAM.

More precisely, to count the number of subgraphs, we analyze labeled subgraphs, i.e. subgraphs where the edges are specified. In Figure 1 we give the example of two labeled triangles on three vertices u, v, w, one consisting of edges $\{ j_{v,1},j_{w,1},j_{w,3}\}$ and the other of edges $\{ j_{v,1},j_{w,2},j_{w,3}\}$ . As it turns out, the probability of two labeled subgraphs being defined by the same vertices and different edges is independent of the choice of the edges. For a more precise explanation, we refer the reader to Section 4.1.

Figure 1: Two labeled triangles.

Notation. We use ‘ ${\stackrel{{\mathbb{P}}}{\longrightarrow}} $ ’ for convergence in probability. We say that a sequence of events ${(\mathcal{E}_n)_{n\geq 1}}$ happens with high probability (w.h.p.) if $\lim_{n\to\infty}{\mathbb{P}}({\mathcal{E}_n})=1$ . Furthermore, we write $f(n)=o(g(n))$ if $\lim_{n\to\infty}f(n)/g(n)=0$ , and $f(n)=O(g(n))$ if $|f(n)|/g(n)$ is uniformly bounded, where ${(g(n))_{n\geq 1}}$ is nonnegative. We say that $X_n=O_{\scriptscriptstyle{{\mathbb{P}}}}(g(n))$ for a sequence of random variables ${(X_n)_{n\geq 1}}$ if $|X_n|/g(n)$ is a tight sequence of random variables, and $X_n=o_{\scriptscriptstyle{{\mathbb{P}}}}(g(n))$ if $\smash{X_n/g(n){\stackrel{{\mathbb{P}}}{\longrightarrow}} 0}$ . We further use the notation $[k]=\{1,2,\ldots,k\}$ .

3. Main results

In this section, we present our results on the number of directed subgraphs in the PAM. We first define subgraphs in more detail. Let $H=(V_H,E_H)$ be a connected, directed graph. Let $\pi\colon V_H\mapsto 1,\ldots,|V_H|$ be a one-to-one mapping of the vertices of H to $1,\ldots,|V_H|$ . In the PAM, vertices arrive one by one. We let $\pi$ correspond to the order in which the vertices in H have appeared in the PAM, that is, $\pi(i)<\pi(j)$ if vertex i was created before vertex j. Thus, the pair ${(H,\pi)}$ is a directed graph, together with a prescription of the order in which the vertices of H have arrived. We call the pair ${(H,\pi)}$ an ordered subgraph.

In the PAM, it is only possible for an older vertex to connect to a newer vertex but not the other way around. This puts constraints on the types of subgraphs that can be formed. We call the ordered subgraphs that can be formed in the PAM attainable. The following definition describes all attainable subgraphs.

Definition 3

(Attainable subgraphs.) Let ${(H,\pi)}$ be an ordered subgraph where $A_\pi(H)$ denotes the adjacency matrix of H, such that the rows and columns of the adjacency matrix of H are permuted by $\pi$ . We say that ${(H,\pi)}$ is attainable if $A_\pi(H)$ defines a directed acyclic graph, where all out-degrees are less than or equal to m.

3.1. Scaling of the number of subgraphs

We now investigate how many of these attainable subgraphs are typically present in the PAM. We introduce the optimization problem

(3)

\begin{align} B(H,\pi) & = \max_{s=0,1,\dots,k} -s+\sum_{i=s+1}^k \bigg[\frac{\tau-2}{\tau-1}(d^{\scriptscriptstyle(\mathrm{in})}_H(\pi^{-1}(i))-d^{\scriptscriptstyle(\mathrm{out})}_H(\pi^{-1}(i))) -d^{\scriptscriptstyle(\mathrm{in})}_H(\pi^{-1}(i))\bigg]\nonumber \\ & \,{:\!=}\, \max_{s=0,1,\dots,k}-s+\sum_{i=s+1}^k\beta(\pi^{-1}(i)), \label{eqn3}\end{align}

where $\smash{d^{\scriptscriptstyle(\mathrm{out})}_H(i)}$ and $\smash{d^{\scriptscriptstyle(\mathrm{in})}_H(i)}$ respectively denote the in-degree and out-degree of vertex i in the subgraph H. Let $N_t(H,\pi)$ denote the number of times the connected graph H with ordering $\pi$ occurs as a subgraph of a PAM of size t. The following theorem studies the scaling of the expected number of directed subgraphs in the PAM, and relates it to the optimization problem (3).

Theorem 1

Let H be a directed subgraph on k vertices with ordering $\pi$ such that ${(H,\pi)}$ is attainable and there are r different optimizers to (3). Then, there exist ${0{<}C_1\leq C_2{<}\infty}$ such that

(4)

\begin{equation} C_1\leq\liminf_{t\to\infty}\frac{{\mathbb{E}}[{N_t(H,\pi)]}}{t^{k+B(H,\pi)} \log^{r-1}(t)}\leq \limsup_{t\to\infty}\frac{{\mathbb{E}}[{N_t(H,\pi)}]} {t^{k+B(H,\pi)}\log^{r-1}(t)}\leq C_2. \label{eqn4} \end{equation}

Theorem 1 gives the asymptotic scaling of the number of subgraphs where the order in which the vertices appeared in the PAM is known. The total number of copies of H for any ordering, $N_t(H)$ , can then easily be obtained from Theorem 1.

Corollary 1

Let H be a directed subgraph on k vertices with $\Pi \ne \emptyset $ the set of orderings ${\pi}$ such that ${(H,\pi)}$ is attainable. Let

(5)

\begin{equation} B(H)=\max_{\pi\in\Pi}B(H,\pi), \end{equation}

and let $r^*$ be the largest number of different optimizers to (3) among all $\pi\in\Pi$ that maximize (5). Then, there exist $0{<}C_1\leq C_2{<}\infty$ such that

\begin{equation*}C_1\leq \ \liminf_{t\to\infty}\frac{{\mathbb{E}}[{N_t(H)}]}{t^{k+B(H)}\log^{r^*- 1}(t)}\leq \limsup_{t\to\infty}\frac{{\mathbb{E}}[{N_t(H)}]}{t^{k+B(H)}\log^{r^*-1}(t)}\leq C_2.\end{equation*}

Note that, from Corollary 1, it is also possible to obtain the undirected number of subgraphs in a PAM, by summing the number of all possible directed subgraphs that create some undirected subgraph when the directions of the edges are removed.

3.1.1. Interpretation of the optimization problem

The optimization problem in (3) has an intuitive explanation. Assume that $\pi$ is the identity mapping, so that vertex 1 is the oldest vertex of H, vertex 2 the second oldest, and so on. We show in Section 4.2 that the probability that an attainable subgraph is present on vertices with indices $u_1{<}u_2{<}\cdots<u_k$ scales as

\begin{equation*} \prod_{i\in[k]}u_i^{\beta(i)}, \end{equation*}

with $\beta(i)$ as in (3). Thus, if, for all $i, u_i\propto t^{\alpha_i}$ for some $\alpha_i$ , then the probability that the subgraph is present scales as $\smash{t^{\sum_{i\in[k]}\alpha_i\beta(i)}}$ . The number of vertices with index proportional to $t^{\alpha_i}$ scales as $t^{\alpha_i}$ . Therefore, heuristically, the number of times subgraph H occurs on vertices with indices proportional to ${(t^{\alpha_i})_{i\in[k]}}$ such that $\alpha_1\leq \alpha_2 \leq \cdots\leq \alpha_k$ scales as

\begin{equation*} t^{\sum_{i\in[k]}(\beta(i)+1)\alpha_i}. \end{equation*}

Because the exponent is linear in $\alpha_i$ , the exponent is maximized for $\alpha_i\in\{0,1\}$ for all i. Because of the extra constraint $\alpha_1\leq\alpha_2\leq\cdots\leq \alpha_k$ which arises from the ordering of the vertices in the PAM, the maximal value of the exponent is $k+B(H)$ . This suggests that the number of subgraphs scales as $\smash{t^{k+B(H)}}$ . Thus, the optimization problem B(H) finds the most likely configuration of a subgraph in terms of the indices of the vertices involved. If the optimum is unique, the number of subgraphs is maximized by subgraphs occurring on one set of very specific vertex indices. For example, when the maximum contribution is $\alpha_i=0$ , this means that vertices with constant index, the oldest vertices of the PAM, are most likely to be a member of subgraph H at position i. When $\alpha_i=1$ is the optimal contribution, vertices with index proportional to t, the newest vertices, are most likely to be a member of subgraph H at position i. When the optimum is not unique, several maximizers contribute equally to the number of subgraphs, which introduces the extra logarithmic factors in (4).

3.1.2. Most likely degrees

As mentioned above, the optimization problem (3) finds the most likely orders of magnitude of the indices of the vertices. When the optimum is unique, the optimum is attained by some vertices of constant index, and some vertices with index proportional to t. The vertices of constant index have degrees proportional to $\smash{t^{1/(\tau-1)}}$ with high probability [Reference van der Hofstad28], whereas the vertices with index proportional to t have degrees proportional to a constant. When the optimum is not unique, the indices of the vertices may have any range, so that the degrees of these vertices in the optimal subgraph structures have degrees ranging between 1 and $\smash{t^{1/(\tau-1)}}$ . Thus, the optimization problem (3) also finds the optimal subgraph structure in terms of its degrees. The most likely degrees of all directed connected subgraphs on three and four vertices resulting from Corollary 1 and the asymptotic number of such subgraphs for $2<\tau<3$ are visualized in Figures 2 and 3. For some subgraphs, the optimum of (3) is attained by the same s and therefore the same most likely degrees for all $2<\tau<3$ , while, for other subgraphs, the optimum may change with $\tau$ .

Figure 2: Order of magnitude of $N_t(H)$ for all attainable connected directed graphs on three vertices and for $2<\tau<3$ . The vertex color indicates the optimal vertex degree.

Figure 3: Order of magnitude of $N_t(H)$ for all attainable connected directed graphs on four vertices and for $2<\tau<3$ . The vertex color indicates the optimal vertex degree.

One such example is the complete graph of size four. For the directed complete graph, there is only one attainable ordering satisfying Definition 3, so we take the vertices of H to be labeled with this ordering. For $\tau<\smash{\tfrac52}$ , the optimizer of (3) is given by $s=3$ with optimal value $-3-3{(\tau-2)}/{(\tau-1)}$ , whereas, for $\tau>\smash{\tfrac52},$ it is given by $s=4$ and optimal value -4. Thus, for $\tau<\smash{\tfrac52},$ a complete graph of size four typically contains three hub vertices of degree proportional to $\smash{t^{1/(\tau-1)}}$ and one vertex of constant degree, and the number of such subgraphs scales as $\smash{t^{1-(\tau-2)/(\tau-1)}},$ whereas, for $\tau>\smash{\tfrac52},$ the optimal structure contains four hub vertices instead and the number of such subgraphs scales as a constant.

3.2. Fluctuations of the number of subgraphs

In Theorem 1 we investigate the expected number of subgraphs, which explains the average number of subgraphs over many PAM realizations. Another interesting question is what the distribution of the number of subgraphs in a PAM realization behaves like. In this paper, we mainly focus on the expected value of the number of subgraphs, but here we argue that the limiting distribution of the rescaled number of subgraphs may be quite different for different subgraphs.

As shown in Definition 2, by viewing the PAM as a Pólya urn graph we can associate a sequence of random independent random variables ${(\psi_v)_{v\in[t]}}$ to the vertices of the PAM, where $\psi_v$ has a beta distribution with parameters depending on m, $\delta,$ and v. Once we condition on $\psi_1,\ldots,\psi_t$ , the edge statuses of the graph are independent of each other. Furthermore, the degree of a vertex v depends on the index v and $\psi_v$ . The higher $\psi_v$ , the higher $D_v(t)$ . Thus, we can interpret $\psi_v$ as a hidden weight associated to the vertex v.

Using this representation of the PAM, we can view the PAM as a random graph model with two sources of randomness: the randomness of the $\psi$ -variables, and then the randomness of the independent edge statuses determined by the $\psi$ -variables. Therefore, we can define two levels of concentration for the number of ordered subgraphs $N_t(H,\pi)$ . Denote by $\mathbb{E}_{\boldsymbol{\psi}_t}[N_t(H,\pi)] \,{:\!=}\, \mathbb{E}[N_t(H,\pi)\mid \psi_1,\ldots,\psi_t]$ . Furthermore, let $N_{t,\psi}(H,\pi)$ denote the number of ordered subgraphs conditionally on $\psi$ . Then, the ordered subgraph ${(H,\pi)}$ can be in the following three classes of subgraphs.

Concentrated: $N_{t,\psi}(H,\pi)$ is concentrated around its conditional expectation ${\mathbb{E}_{\boldsymbol{\psi}_t}[N_t(H, \pi)]}$ , i.e. as $t\rightarrow\infty$ ,
(6) \begin{equation} \frac{N_{t,\psi}(H,\pi)}{\mathbb{E}_{\boldsymbol{\psi}_t}[N_t(H,\pi)]} \stackrel{\mathbb{P}}{\longrightarrow}1, \label{eqn6} \end{equation}
and, as $t\rightarrow\infty$ ,
\begin{equation*}\frac{N_t(H,\pi)}{\mathbb{E}[N_t(H,\pi)]} {\stackrel{{\mathbb{P}}}{\longrightarrow}} 1.\end{equation*}
Only conditionally concentrated: condition (6) holds, and, as $t\to\infty$ ,
(7) \begin{equation}\frac{N_t(H,\pi)}{\mathbb{E}[N_t(H,\pi)]} {\stackrel{d}{\longrightarrow}} X \end{equation}
for some random variable X.
Nonconcentrated: condition (6) does not hold.

For example, it is easy to see that the number of subgraphs as shown in Figure 2(d) satisfies $\smash{N(H)/t{\stackrel{{\mathbb{P}}}{\longrightarrow}} m(m-1)/2}$ , so that it is a subgraph that belongs to the class of concentrated subgraphs. Below we argue that the triangle belongs to the class of only conditionally concentrated subgraphs. We now give a criterion for the conditional convergence of (6) in the following proposition.

Proposition 1

(Criterion for conditional convergence) Consider a subgraph ${(H,\pi)}$ such that $\mathbb{E}[N_t(H,\pi)]\rightarrow\infty$ as $t\rightarrow\infty$ . Denote by $\hat{\mathcal{H}}$ the set of all possible subgraphs composed by two distinct copies of ${(H,\pi)}$ with at least one edge in common. Then, as $t\rightarrow\infty$ ,

(8)

\begin{equation} \sum_{\skew3\hat{H}\in\hat{\mathcal{H}}}\mathbb{E}[N_t(\skew3\hat{H})] = o(\mathbb{E}[N_t(H,\pi)]^2) \quad\Longrightarrow\quad \frac{N_{t,\psi}(H,\pi)}{\mathbb{E}_{\boldsymbol{\psi}_t}[N_t(H,\pi)]}{\stackrel{{\mathbb{P}}}{\longrightarrow}} 1. \label{eqn8} \end{equation}

Proposition 1 gives a simple criterion for conditional convergence for a subgraph ${(H,\pi)}$ , and it is proved in Section 7. The condition in (8) is simple to evaluate in practice. We denote the subgraphs consisting of two overlapping copies of ${(H,\pi)}$ sharing at least one edge by $\skew3\hat{H}_1,\dots,\skew3\hat{H}_r$ . To identify the order of magnitude of $\mathbb{E}[\skew3\hat{H}_i]$ , we apply Corollary 1 to $\skew3\hat{H}_i$ or, in other words, we apply Theorem 1 to all possible orderings $\hat{\pi}$ of $\skew3\hat{H}_i$ . Once we have all orders of magnitude of ${(\skew3\hat{H}_i,\hat{\pi})}$ for all orderings $\hat{\pi}$ , and for all $\skew3\hat{H}_i$ , it is immediate to see if the hypothesis of Proposition 1 is satisfied.

There are subgraphs where the condition in Proposition 1 does not hold. For example, merging two copies of the subgraph of Figure 3(q) as in Figure 4 violates the condition in Proposition 1. We show in Section 7 that this subgraph is in the class of nonconcentrated subgraphs with probability close to 1.

Figure 4: The order of magnitude of this subgraph containing two merged copies of the subgraph of Figure 3(q) is $t^{{4}/{(\tau-1)}}$ , so that the condition in Proposition 1 is not satisfied for the subgraph in Figure 3(q).

3.3. Exact constants: triangles

Theorem 1 allows us to identify the order of magnitude of the expected number of subgraphs in the PAM. In particular, for a subgraph H with ordering $\pi$ , it assures the existence of two constants $0{<}C_1\leq C_2{<}\infty$ as in (4). A more detailed analysis is necessary to prove a stronger result than Theorem 1 of the type

\begin{equation*} \lim_{t\rightarrow\infty} \frac{{\mathbb{E}}[{N_t(H,\pi)}]}{t^{k+B(H,\pi)}\log^{r-1}(t)}= C \end{equation*}

for some constant $0{<}C{<}\infty$ . In other words, given an ordered subgraph ${(H,\pi)}$ , we want to identify the constant $C{>}0$ such that

(9)

\begin{equation} {\mathbb{E}}[{N_t(H,\pi)}] = Ct^{k+B(H,\pi)}\log^{r-1}(t)(1+o(1)). \label{eqn9} \end{equation}

We prove (9) for triangles to show the difficulties in the evaluation of the precise constant C for general subgraphs. The following theorem provides the detailed scaling of the expected number of triangles.

Theorem 2

(Phase transition for the number of triangles.) Let $m\geq 2$ and $\delta>-m$ be parameters for ${(\mathrm{PA}_t)_{t\geq1}}$ . Denote the number of labeled triangles in $\mathrm{PA}_t$ by $\triangle_t$ . Then, as $t\rightarrow\infty$ ,

1. if $\tau>3$ then
\begin{equation*} \mathbb{E}[\triangle_t] = \frac{m^2(m-1)(m+\delta)(m+\delta+1)} {\delta^2(2m+\delta)}\log (t)(1+o(1)); \end{equation*}
2. if $\tau=3$ then
\begin{equation*} \mathbb{E}[\triangle_t] = \frac{m(m-1)(m+1)}{48}\log^3 (t)(1+o(1)); \end{equation*}
3. if $\tau\in(2,3)$ then
\begin{equation*} \mathbb{E}[\triangle_t]=\frac{m^2(m-1)(m+\delta)(m+\delta+1)} {\delta^2(2m+\delta)}t^{(3-\tau)/(\tau-1)}\log (t)(1+o(1)). \end{equation*}

Theorem 2 in the case $\delta=0$ coincides with [Reference Bollobás and Riordan6, Theorem 14]. For $\delta>0,$ we retrieve the result in [Reference Eggemann and Noble11, Proposition 4.3], noting that the additive constant $\beta$ in the attachment probabilities in the Móri model considered in [Reference Eggemann and Noble11] coincides with (1) for $\beta = \delta/m$ .

The proof of Theorem 2 in Section 6 shows that to identify the constant in (9), we need to evaluate the precise expectations involving the attachment probabilities of edges. The equivalent formulation of the PAM given in Definition 2 below simplifies the calculations, but it is still necessary to evaluate rather complicated expectations involving products of several terms as in (15). For a more detailed discussion, we refer the reader to Remark 1.

3.3.1. The distribution of the number of triangles

Theorem 2 shows the behavior of the expected number of triangles. The distribution of the number of triangles across various PAM realizations is another object of interest. We prove the following result for the number of triangles $\triangle_t$ .

Corollary 2

(Concentration of triangles, conditionally on $\psi$ .) For $\tau\in(2,3)$ , the number of triangles $\triangle_t$ satisfies condition (6).

Corollary 2 is a direct consequence of Proposition 1, and the atlas of the order of magnitudes of all possible realizations of the subgraphs consisting of two triangles sharing one or two edges is presented in Figure 5. In Figure 6 we show a density approximation of the number of triangles obtained by simulations. These figures suggest that the rescaled number of triangles converges to a random limit, since the width of the density plots does not decrease in t. Thus, while the number of triangles concentrates conditionally, it does not seem to converge to a constant when taking the random $\psi$ -variables into account. This would put the triangle subgraph in the class of only conditionally concentrated subgraphs. Proving this and identifying the limiting random variable of the number of triangles is an interesting open question.

Figure 5: Order of magnitude of $N_t(H)$ for all merged triangles on four vertices and for $2<\tau<3$ . The vertex color indicates the optimal vertex degree as in Figure 3.

Figure 6: Density approximation of the number of triangles in $10^4$ realizations of the preferential attachment model with $\tau=2.5$ and various values of t.

4. The probability of a subgraph being present

In this section, we prove the main ingredient for the proof of Theorem 1, the probability of a subgraph being present on a given set of vertices. The most difficult part of evaluating the probability of a subgraph H being present in $\mathrm{PA}_t$ is that the PAM is constructed recursively.

We consider triangles as an example. We write the event of a labeled triangle being present by $\smash{\{u\stackrel{j_1}{\leftarrow} v,\ u\stackrel{j_2}{\leftarrow}w, \ v\stackrel{j_3}{\leftarrow} w\}}$ , where $\smash{\{u\stackrel{j}{\leftarrow}v\}}$ denotes the event that the jth edge of vertex v is attached to vertex u. In this way we express precisely which edges are used in the triangle construction.

To evaluate the probability of the event $\smash{\{u\stackrel{j_1}{\leftarrow} v,\ u\stackrel{j_2}{\leftarrow}w, \ v\stackrel{j_3}{\leftarrow} w\}}$ , we can only use (1), which gives the conditional probabilities of attaching edges. Note that $j_2$ and $j_3$ are attached to the same vertex w. Assuming that $j_2{<}j_3$ , we write

$$\mathbb{P}(u\stackrel{j_1}{\leftarrow} v,\ u\stackrel{j_2}{\leftarrow}w, \ v\stackrel{j_3}{\leftarrow} w) = \mathbb{E}[\mathbb{P}(u\stackrel{j_1}{\leftarrow} v,\ u\stackrel{j_2}{\leftarrow}w, \ v\stackrel{j_3}{\leftarrow} w \mid \mathrm{PA}_{t-1, j_3-1}) ],$$

where we condition on $\mathrm{PA}_{t-1, j_3-1}$ , because the last edge we have to insert in the graph to create the triangle is given by the event $\smash{\{v\stackrel{j_3}{\leftarrow} w \}}$ . Using (1), we write

(10)

\begin{align} &\mathbb{E}[\mathbb{P}(u\stackrel{j_1}{\leftarrow} v,\ u\stackrel{j_2}{\leftarrow}w, \ v\stackrel{j_3}{\leftarrow} w \mid \mathrm{PA}_{t-1, j_3-1}) ]\nonumber \\ &\qquad= \mathbb{E}\bigg[{\bf 1}\{u\stackrel{j_1}{\leftarrow} v,\ u\stackrel{j_2}{\leftarrow}w\}\frac{D_v(w-1,j_3-1)+\delta}{2m(w-2) +(j_3-1)+ (w-1)\delta}\bigg]. \label{eqn10}\end{align}

In (10), the indicator function $\smash{{\bf 1}\{u\stackrel{j_1}{\leftarrow} v,\ u\stackrel{j_2}{\leftarrow}w\}}$ and $D_v(w-1,j_3-1)$ are not independent, therefore evaluating the expectation on the right-hand side of (10) is not easy. A possible solution for the evaluation of the expectation in (10) is to rescale $D_v(w-1,j_3-1)$ with an appropriate constant to obtain a martingale, and then recursively use the conditional expectation. For a detailed explanation of this, we refer the reader to [Reference Bollobás, Riordan, Spencer and Tusnády7], [Reference Szymaski26], and [Reference van der Hofstad28, Section 8.3]. This method is hardly tractable due to the complexity of the constants appearing (see Remark 1 for a more detailed explanation).

We use a different approach to evaluate of the expectation in (10) using the interpretation of the PAM as a Pólya urn graph, focusing mainly on the age (indices) of the vertices, and not on precise constants. We give lower and upper bounds for the probability of having a finite number of edges present in the graph, as formulated in the following lemma.

Lemma 1

(Probability of a finite set of labeled edges.) Fix $\ell\in\mathbb{N}$ . For vertices $\boldsymbol{u}_\ell = (u_1,\ldots,u_\ell)\in[t]^\ell$ and $\boldsymbol{v}_\ell = (v_1,\ldots,v_\ell)\in[t]^\ell,$ and edge labels $\boldsymbol{j}_\ell = (\;j_1,\ldots,j_\ell)\in [m]^\ell$ , we denote the set of $\ell$ distinct labeled edges from $u_i$ to $v_i$ connected by the $j_i$ th edge for $i=1,\dots,\ell$ by $M_\ell(\boldsymbol{u}_\ell,\boldsymbol{v}_\ell,\boldsymbol{j}_\ell)$ . Assume that the subgraph defined by set $M_\ell(\boldsymbol{u}_\ell,\boldsymbol{v}_\ell,\boldsymbol{j}_\ell)$ is attainable in the sense of Definition 3. Define $\chi=(m+\delta)/(2m+\delta)$ . Then the following assertions hold.

1. There exist two constants $c_1(m,\delta, \ell),c_2(m,\delta, \ell)>0$ such that
(11) \begin{equation} c_1(m,\delta, \ell)\prod_{l=1}^{\ell}u_l^{\chi-1}v_l^{-\chi} \leq \ \mathbb{P}(M_\ell(\boldsymbol{u}_\ell,\boldsymbol{v}_\ell,\boldsymbol{j}_\ell) \subseteq E(\mathrm{PA}_t)) \ \leq c_2(m,\delta, \ell)\prod_{l=1}^{\ell}u_l^{\chi-1}v_l^{-\chi}. \label{eqn11} \end{equation}
2. Define the set
(12) \begin{equation} J(\boldsymbol{u}_\ell, \boldsymbol{v}_\ell) = \{\boldsymbol{j}_\ell\in[m]^\ell \colon M_\ell(\boldsymbol{u}_\ell,\boldsymbol{v}_\ell,\boldsymbol{j}_\ell) \subseteq E(\mathrm{PA}_t)\}. \label{eqn12} \end{equation}
Then there exist two constants $\hat{c}_1(m,\delta,\ell), \hat{c}_2(m,\delta,\ell)>0$ such that
(13) \begin{equation} \hat{c}_1(m,\delta,\ell)\prod_{l=1}^{\ell}u_l^{\chi-1}v_l^{-\chi} \leq \mathbb{E}[|J(\boldsymbol{u}_\ell, \boldsymbol{v}_\ell)|] \leq \hat{c}_2(m,\delta,\ell)\prod_{l=1}^{\ell}u_l^{\chi-1}v_l^{-\chi} . \label{eqn13} \end{equation}

Formula (11) in the above lemma bounds the probability that a subgraph is present on vertices $\boldsymbol{{{u}}}_\ell$ and $\boldsymbol{{{v}}}_\ell$ such that the $j_i$ th edge from $u_i$ connects to $v_i$ . Note that (11) is independent of the precise edge labels ${(j_1,\ldots,j_\ell)}$ . To be able to count all subgraphs, and not only subgraphs where the edge labels have been specified, (13) bounds the expected number of times a specific subgraph is present on vertices $\boldsymbol{{{u}}}_\ell$ and $\boldsymbol{{{v}}}_\ell$ . This number is given exactly by the elements in set $J(\boldsymbol{u}_\ell,\boldsymbol{v}_\ell)$ as in (12). Note that the expectation in (13) may be larger than 1, due to the fact that the PAM is a multigraph.

Lemma 1 gives a bound on the probability of the presence of $\ell\in\mathbb{N}$ distinct edges in the graph as a function of the indices ${(u_1,v_1),\ldots,(u_\ell,v_\ell)}$ of the endpoints of the $\ell$ edges. Due to the properties of the PAM, the index of a vertex is an indicator of its degree, due to the old-get-richer effect. Lemma 1 is a stronger result than [Reference Dommers, van der Hofstad and Hooghiemstra10, Corollary 2.3], which gives an upper bound of the form in (11) only for self-avoiding paths.

The proof of Lemma 1 is based on the interpretation of the PAM in Definition 1 as an urn experiment as proposed in [Reference Berger, Borgs, Chayes and Saberi3]. We now introduce urn schemes and state the preliminary results we need for the proof of Lemma 1, which is given in Section 4.2.

4.1. Preliminary properties of Pólya urn graphs

The formulation in Definition 2 in terms of urn experiments allows us to investigate the presence of subgraphs in an easier way than with the formulation given in Definition 1, since the dependent random variables in (10) are replaced by the product of independent random variables. We now state two lemmas that are the main ingredients for proving Lemma 1.

Lemma 2

(Attachment probabilities.) Consider $\mathrm{PU}_t$ as in Definition 2. Then,

1. for $k\in[t]$ ,
(14) \begin{equation} S_k = \prod_{h=k+1}^t (1-\psi_h); \label{eqn14} \end{equation}
2. conditioning on $\psi_1,\ldots,\psi_t$ , the probability that the jth edge of k is attached to v is equal to
(15) \begin{equation} \mathbb{P}(U_{k,j} \in I_v\mid \psi_1,\ldots,\psi_t) = \psi_v\frac{S_v}{S_{h-1}} = \psi_{v}\prod_{h=v+1}^{k-1}(1-\psi_h). \label{eqn15} \end{equation}

The proof of Lemma 2 follows from Definition 2, and the fact that ${(S_k)_{k\in[t]}}$ as in (2) can be written as in (14) (see the proof of [Reference Berger, Borgs, Chayes and Saberi3, Theorem 2.1]).

Before proving Lemma 1, we state a second result on the concentration of the positions $\{S_k\}_{k\in[t]}$ in the urn graph ${(\mathrm{PU}_t)_{t\in\mathbb{N}}}$ . In particular, it shows that these positions concentrate around deterministic values.

Lemma 3

(Position concentration in $\mathrm{PU}_t$ .) Consider a Pólya urn graph as in Definition 2. Let $\chi = (m+\delta)/(2m+\delta)$ . Then, for every $\omega,\varepsilon>0,$ there exists $N_0 = N_0(\omega,\varepsilon)\in\mathbb{N}$ such that, for every ${t\geq N_0}$ ,

\begin{equation*}\mathbb{P}\bigg(\bigcap_{ i=N_0}^t\bigg\{ \bigg|S_i- \bigg(\frac{i}{t}\bigg)^\chi\bigg|\leq \omega \bigg(\frac{i}{t}\bigg)^\chi\bigg\}\bigg)\geq 1-\varepsilon \end{equation*}

and, for large enough t,

\begin{equation*}\mathbb{P}\bigg(\max_{i\in[t]}\bigg| S_i-\bigg(\frac{i}{t}\bigg)^\chi\bigg|\geq \omega\bigg)\leq \varepsilon. \end{equation*}

As a consequence, as $t\rightarrow\infty$ ,

(16)

\begin{equation} \max_{i\in[t]}\bigg| S_i-\bigg(\frac{i}{t}\bigg)^\chi\bigg| {\stackrel{p}{\longrightarrow}} 0. \label{eqn16} \end{equation}

The proof of Lemma 3 is given in [Reference Berger, Borgs, Chayes and Saberi3, Lemma 3.1].

4.2. Proof of Lemma 1

We now prove Lemma 1, starting with the proof of (11). Fix $\boldsymbol{u}_\ell,\boldsymbol{v}_\ell,\boldsymbol{j}_\ell$ . In the proof, we denote $M_\ell(\boldsymbol{u}_\ell,\boldsymbol{v}_\ell,\boldsymbol{j}_\ell)$ simply by $M_\ell$ to keep the notation light. We use the fact that the Pólya urn graphs $\mathrm{PU}_t$ and $\mathrm{PA}_t$ have the same distribution, and evaluate $\mathbb{P}(M_\ell \subseteq E(\mathrm{PU}_t))$ . We consider $\ell$ distinct labeled edges, so we can use (15) to write

(17)

\begin{equation} \mathbb{P}(M_\ell \subseteq E(\mathrm{PU}_t)\mid \psi_1,\ldots,\psi_t) = \prod_{l=1}^\ell \psi_{u_l}\frac{S_{u_l}}{S_{v_l-1}}. \label{eqn17} \end{equation}

Now fix $\varepsilon>0$ . Define $\mathcal{E}_\varepsilon \,{:\!=}\, \{\max_{i\in[t]}| S_i-({i}/{t})^\chi|\leq\varepsilon\}$ . By (16), and the fact that the product of the random variables in (17) is bounded by 1,

(18)

\begin{equation} \mathbb{E}\bigg[\prod_{l=1}^\ell \psi_{u_l}\frac{S_{u_l}}{S_{v_l-1}}\bigg] = \mathbb{E}\bigg[{\bf 1}_{\mathcal{E}_\varepsilon}\prod_{l=1}^\ell \psi_{u_l}\frac{S_{u_l}}{S_{v_l-1}}\bigg] + o(1). \label{eqn18} \end{equation}

On the event $\mathcal{E}_\varepsilon$ , we have, for every $l\in[\ell]$ ,

(19)

\begin{equation} (1-\varepsilon)\bigg(\frac{u_l}{v_l}\bigg)^\chi \leq \frac{S_{u_l}}{S_{v_l-1}}\leq (1+\varepsilon) \bigg(\frac{u_l}{v_l}\bigg)^\chi, \label{eqn19} \end{equation}

where in (19) we have replaced $v_l-1$ with $v_l$ with a negligible error. Note that, since $v_l$ is always the source of the edge, this implies that $v_l\geq 2$ ; therefore, this is allowed. Using (19) in (18), we obtain

(20)

\begin{align} (1-\varepsilon)^\ell \prod_{l=1}^\ell \bigg(\frac{u_l}{v_l}\bigg)^\chi\mathbb{E}\bigg[{\bf 1}_{\mathcal{E}_\varepsilon} \prod_{l=1}^\ell \psi_{u_l}\bigg] &\leq \mathbb{P}(M_\ell \subseteq E(\mathrm{PU}_t)) \nonumber \\ &\leq (1+\varepsilon)^\ell \prod_{l=1}^\ell \bigg(\frac{u_l} {v_l}\bigg)^\chi \mathbb{E}\bigg[{\bf 1}_{\mathcal{E}_\varepsilon}\prod_{l=1}^\ell \psi_{u_l}\bigg]. \label{eqn20}\end{align}

Even though $\psi_1,\ldots,\psi_t$ depend on $\mathcal{E}_\varepsilon$ , it is easy to show that we can ignore ${\bf 1}_{\mathcal{E}_\varepsilon}$ in (20) and obtain a similar bound. In fact, since the random variables $\psi_{u_1},\ldots, \psi_{u_\ell}$ are bounded by 1, we can write

(21)

\begin{equation} \bigg| \mathbb{E}\bigg[\prod_{l=1}^\ell \psi_{u_l}\bigg]-\mathbb{E}\bigg[{\bf 1}_{\mathcal{E}_\varepsilon}\prod_{l=1}^\ell \psi_{u_l}\bigg]\bigg| = \mathbb{E}\bigg[{\bf 1}_{\mathcal{E}_\varepsilon^c} \prod_{l=1}^\ell \psi_{u_l}\bigg]\leq 1-\mathbb{P}(\mathcal{E}_\varepsilon^c) = o(1), \label{eqn21} \end{equation}

where o(1) is intended as $t\rightarrow\infty$ because of (16). Note that the bound in (21) depends on $\varepsilon$ through the event $\mathcal{E}_\varepsilon^c$ , but not on the choice of $M_\ell$ . As a consequence, for some constants $c_1,c_2$ and large enough t, from (20) and (21),

(22)

\begin{equation} (1-\varepsilon)^\ell \prod_{l=1}^\ell \bigg(\frac{u_l}{v_l} \bigg)^\chi\mathbb{E}\bigg[\prod_{l=1}^\ell \psi_{u_l}\bigg]\leq \mathbb{P}(M_\ell \subseteq E(\mathrm{PU}_t))\leq (1+\varepsilon)^\ell \prod_{l=1}^\ell \bigg(\frac{u_l}{v_l}\bigg)^\chi\mathbb{E} \bigg[\prod_{l=1}^\ell \psi_{u_l}\bigg]. \label{eqn22} \end{equation}

What remains is to evaluate the expectation in (22). We assumed $\ell$ distinct edges, which does not imply that the vertices $u_1,v_1,\ldots,u_\ell,v_\ell$ are distinct. The expectation in (22) depends only on the receiving vertices of the $\ell$ edges, namely $u_1,\ldots,u_\ell$ .

Let $\bar{u}_1,\ldots, \bar{u}_k$ denote the $k\leq \ell$ distinct elements that appear among $ u_1,\ldots,u_\ell$ . For $h\in[k]$ , the vertex $\bar{u}_h$ appears in the product inside the expectation in (22) with multiplicity $\smash{d^{\scriptscriptstyle(\mathrm{in})}_h}$ , which is the degree of vertex $\bar{u}_k$ in the subgraph defined by $M_\ell$ . As a consequence, we can write

(23)

\begin{equation} \mathbb{E}\bigg[\prod_{l=1}^\ell \psi_{u_l}\bigg] = \mathbb{E}\bigg[\prod_{h=1}^k \psi_{\bar{u}_h}^{d^{\scriptscriptstyle(\mathrm{in})}_h}\bigg] = \prod_{h=1}^k\mathbb{E} \bigg[\psi_{\bar{u}_h}^{d^{\scriptscriptstyle(\mathrm{in})}_h}\bigg], \label{eqn23} \end{equation}

where in (23) we have used the fact that $\psi_1,\ldots,\psi_t$ are all independent. Note that $\mathbb{E}[\psi_1^d] = 1$ for all $d\geq 0$ , since $\psi_1\equiv 1$ . Therefore, if $\bar{u}_h=1$ for some $h\in[k]$ , $\smash{\mathbb{E}[\psi_{\bar{u}_h}^d]} = 1$ and the terms depending on the first vertex contribute to the expectation in (23) by a constant.

For the terms where $\bar{u}_h\geq2$ , recall that, if $X(\alpha,\beta)$ is a beta random variable then, for any integer $d\in\mathbb{N}$ ,

\begin{equation*} \mathbb{E}[X(\alpha,\beta)^d] = \frac{\alpha(\alpha+1)\cdots(\alpha+d-1)} {(\alpha+\beta)(\alpha+\beta+1)\cdots(\alpha+\beta+d-1)}. \end{equation*}

Since $\psi_{\bar{u}_h}$ is beta distributed with parameters $m+\delta$ and $2(\bar{u}_h-3)+(\bar{u}_h-1)\delta$ ,

\begin{align*}\mathbb{E}[\psi_{\bar{u}_h}^{d^{\scriptscriptstyle(\mathrm{in})}_h}] & = \frac{(m+\delta)\cdots(m+ \delta+d^{\scriptscriptstyle(\mathrm{in})}_h-1)} {[m(2\bar{u}_h-2)+\bar{u}_h\delta] \cdots[m(2\bar{u}_h-2)+\bar{u}_h\delta +d^{\scriptscriptstyle(\mathrm{in})}_h-1]}\nonumber \\ & = \bar{u}_h^{-d^{\scriptscriptstyle(\mathrm{in})}_h}\frac{(m+\delta)\cdots(m+\delta+d^{\scriptscriptstyle(\mathrm{in})}_h-1)} {[2m+\delta-(2m)/\bar{u}_h]\cdots[2m+\delta+(d^{\scriptscriptstyle(\mathrm{in})}_h-1-2m)/\bar{u}_h]}. \end{align*}

Note that if $\bar{u}_h\geq 2$ , uniformly in t and the precise choice of the $\ell$ edges,

$$(m+\delta)^{-\ell}\leq \bigg(\bigg[2m+\delta-\frac{2m}{\bar{u}}\bigg]\cdots\bigg[2m+\delta +\frac{d^{\scriptscriptstyle(\mathrm{in})}_h-1-2m}{\bar{u}}\bigg]\bigg)^{-1}\leq (2m+\delta+\ell)^{-\ell}.$$

As a consequence, we can find two constants $c_1(m,\delta, \ell), c_2(m,\delta, \ell)$ such that

(24)

\begin{equation} c_1(m,\delta, \ell)\prod_{h=1}^k\bar{u}_h^{-d^{\scriptscriptstyle(\mathrm{in})}_h}\leq \prod_{h=1}^k\mathbb{E}[\psi_{\bar{u}_h}^{d^{\scriptscriptstyle(\mathrm{in})}_h}]\leq c_2(m,\delta, \ell)\prod_{h=1}^k\bar{u}_h^{-d^{\scriptscriptstyle(\mathrm{in})}_h}. \label{eqn24} \end{equation}

We now use (24) in (22) to obtain

(25)

\begin{align} c_1(m,\delta, \ell)(1-\varepsilon)^\ell \prod_{l=1}^\ell \bigg(\frac{u_l}{v_l}\bigg)^\chi \prod_{h=1}^k\bar{u}_h^{-d^{\scriptscriptstyle(\mathrm{in})}_h} & \leq \mathbb{P}(M_\ell\subseteq E(\mathrm{PU}_t))\nonumber \\ & \leq c_2(m,\delta, \ell)(1+\varepsilon)^\ell \prod_{l=1}^\ell \bigg(\frac{u_l}{v_l}\bigg)^\chi \prod_{h=1}^k\bar{u}_h^{-d^{\scriptscriptstyle(\mathrm{in})}_h}. \label{eqn25}\end{align}

In (25) we can just rename the constants $c_1(m,\delta, \ell) = c_1(m,\delta, \ell)(1-\varepsilon)^\ell$ and $c_2(m,\delta, \ell) = c_2(m,\delta, \ell)(1+\varepsilon)^\ell$ . Since $\smash{d^{\scriptscriptstyle(\mathrm{in})}_h}$ is the multiplicity of vertex $\bar{u}_h$ as the receiving vertex, we can write

$$\prod_{h=1}^k\bar{u}_h^{-d^{\scriptscriptstyle(\mathrm{in})}_h} = \prod_{l=1}^\ell u_l^{-1}.$$

Combining this with (25) completes the proof of (11).

The proof of (13) follows immediately from (11) and the definition of the set $J(\boldsymbol{u}_\ell, \boldsymbol{v}_\ell)$ in (12). In fact, we can write

\begin{equation*} \mathbb{E}[|J(\boldsymbol{u}_\ell, \boldsymbol{v}_\ell)|] = \sum_{\boldsymbol{j}_\ell\in[m]^\ell} \mathbb{P}(M_\ell(\boldsymbol{u}_\ell,\boldsymbol{v}_\ell,\boldsymbol{j}_\ell)\subseteq E(\mathrm{PA}_t)). \end{equation*}

Recall that $\mathbb{P}(M_\ell(\boldsymbol{u}_\ell,\boldsymbol{v}_\ell,\boldsymbol{j}_\ell)\subseteq E(\mathrm{PA}_t))$ is independent of the labels $\boldsymbol{j}_\ell$ . For a fixed set of source and target vertices $\boldsymbol{u}_\ell$ and $\boldsymbol{v}_\ell$ , there is only a finite combination of labels $\boldsymbol{j}_\ell$ such that the subgraph defined by $M_\ell(\boldsymbol{u}_\ell,\boldsymbol{v}_\ell,\boldsymbol{j}_\ell)$ is attainable in the sense of Definition 3. In fact, the number of such labels $\boldsymbol{j}_\ell$ is larger than 1 (since the corresponding subgraph is attainable), and less than $m^\ell$ (the total number of elements of $[m]^\ell$ ). As a consequence, taking $\hat{c}_1 = c_1$ and $\hat{c}_2 = c_2m^\ell$ proves (13).

5. Proof of Theorem 1

To prove Theorem 1, we write the expected number of subgraphs as multiple integrals. Without loss of generality, we assume throughout this section that $\pi$ is the identity permutation, so that the vertices of H are labeled as $1,\dots,k$ , and therefore drop the dependence of the quantities on $\pi$ . We first prove a lemma that states that two integrals that will be important in proving Theorem 1 are finite.

Lemma 4

Let H be a subgraph such that the optimum of (3) is attained by $s_1,\dots,s_r$ . Then

\begin{align*} A_1(H)&\,{:\!=}\,\int_{1}^{\infty}u_1^{\beta(1)}\int_{u_1}^{\infty}u_2^{\beta(2)} \cdots \int_{u_{s-1}}^{\infty}u_{s_1}^{\beta(s_1)}{\mathrm{d}} u_{s_1}\cdots {\rm d} u_1<\infty, \\ A_2(H)&\,{:\!=}\,\int_{0}^{1}u_{k}^{\beta({k})}\int_0^{u_{k}}u_{k-1}^{\beta({k-1})} \cdots \int_0^{u_{s_r+2}}u_{s_r+1}^{\beta(s_r+1)}{\mathrm{d}} u_{s_r+1}\cdots {\rm d} u_k<\infty. \end{align*}

Proof. Since the first integral iteratively integrates $u_{s_1+1-z}$ to the power $z-1+\smash{\sum_{i=s_1-z}^{s_1}\beta(i)}$ from $u_{s_1-z}$ to $\infty$ for $z=1,\dots,s_1$ , these integrals are finite as long as $z-1+\smash{\sum_{i=s_1-z}^{s_1}\beta(i)}<-1$ , or similarly

(26)

\begin{equation} z+\sum_{i=s_1-z}^{s_1}\beta(i)<0 \label{eqn26} \end{equation}

for all $z\in[s_1]$ . Suppose that (26) does not hold for some $z^*\in[s_1]$ . Then, the difference between the contribution to (3) for $\tilde{s}=s_1-z^*$ and $s_1$ is

\begin{equation*} -(s_1-z^*)+\sum_{i=s_1-z^*}^{k}\beta(i)+s_1-\sum_{i=s_1}^k \beta(i)= z^*+\sum_{i=s_1-z^*}^{s_1}\beta(i)\geq 0, \end{equation*}

which would imply that $s_1-z^*$ is also an optimizer of (3), which is in contradiction with $s_1$ being the smallest optimum. Thus, (26) holds for all $r\in[s]$ and $A_1(H)<\infty$ .

Similarly to the analysis in the first integral, the second integral iteratively integrates $u_{z}$ to the power $z-1-s_r+\smash{\sum_{i=s_r+1}^{z}\beta(i)}$ from 0 to $u_{z+1}$ for $z\in\{s_r+1,\dots,k\}$ . Thus, the integral is finite as long as

\begin{equation*} z-s_r+\sum_{i=s_r+1}^{z}\beta(i)>0 \end{equation*}

for all $z\in\{s_r+1,\dots,k\}$ . Suppose that this does not hold for some $z^*\in\{s_r+1,\dots,k\}$ . Set $\tilde{s}=z^*>s_r$ . Then, the difference between the contribution to (3) for $\tilde{s}=z^*$ and $s_r$ is

\begin{equation*} -z^*+s_r-\sum_{i=s_r+1}^{z^*}\beta(i)\geq 0, \end{equation*}

which is a contradiction with $s_r$ being the largest optimizer. Therefore, $A_2(H)<\infty$ .

We now use this lemma to prove Theorem 1.

Proof of Theorem 1. Again, we assume that $\pi$ is the identity mapping, so that we may drop all dependencies on $\pi$ . Suppose that the optimal solution to (3) is attained by $s_1,s_2,\dots,s_r$ for some $r\geq1$ . Let the $\ell$ edges of H be denoted by ${(u_l,v_l)}$ for $l\in[\ell]$ . Let $N_t(H,i_1,\ldots,i_k)$ denote the number of times subgraph H is present on vertices $i_1,\dots,i_k$ . We then use Lemma 1, which proves that, for some $0<C<\infty$ ,

(27)

\begin{align} {\mathbb{E}}[{N_t(H)}] &= \sum_{i_1<\cdots<i_k\in[t]}{\mathbb{E}}[{N_t(H,i_1,\dots,i_k)}]\nonumber \\[-3pt] & \leq C\sum_{i_1<\cdots <i_k\in[t]}\prod_{l=1}^\ell i_{u_l}^{\chi-1}i_{v_l}^{-\chi} \nonumber \\[-3pt] &= C\sum_{i_1<\cdots<i_k\in[t]} \prod_{q=1}^ki_q^{\beta(q)}. \label{eqn27}\end{align}

We then bound the sums by integrals as

(28)

\begin{align} {\mathbb{E}}[{N_t(H)}]& \leq \tilde{C} \int_{1}^{t}u_1^{\beta(1)}\cdots \int_{u_{k-1}}^{t}u_k^{\beta(k)}{\mathrm{d}} u_k\cdots {\rm d} u_1\nonumber \\ & \leq \tilde{C} \int_{1}^{\infty}u_1^{\beta(1)}\cdots \int_{u_s-1}^{\infty}u_{s_1}^{\beta(s_1)}{\mathrm{d}} u_{s_1}\cdots {\rm d} u_{1}\nonumber \\ & \times \int_{1}^{t}u_{s_1+1}^{\beta(s_1+1)}\int_{u_{s_1+1}}^{\infty}u_{s_1+ 2}^{\beta(s_1+2)}\cdots \int_{u_{s_2-1}}^{t}u_{s_2}^{\beta(s_2)}{\mathrm{d}} u_{s_2}\cdots {\rm d} u_{s_1+1} \nonumber \\ & \times \int_{1}^{t}u_{s_2+1}^{\beta(s_2+1)}\int_{u_{s_2+1}}^{\infty}u_{s_2 +2}^{\beta(s_2+2)}\cdots \int_{u_{s_3-1}}^{t}u_{s_3}^{\beta(s_3)}{\mathrm{d}} u_{s_3}\cdots {\rm d} u_{s_2+1}\times \cdots\nonumber \\ & \times \int_{1}^{t}u_{s_{r-1}+1}^{\beta(s_{r-1}+1)}\int_{u_{s_{r-1}+1}}^{\infty} u_{s_{r-1}+2}^{\beta(s_{r-1}+2)}\cdots \int_{u_{s_r-1}}^{t}u_{s_r}^{\beta(s_r)}{\mathrm{d}} u_{s_r}\cdots {\rm d} u_{s_{r-1}+1} \nonumber \\ & \times \int_{0}^{t}u_{s_r+1}^{\beta(s_r+1)}\int_{s_r+1}^{t}u_{s_r+2}^{\beta(s_r +2)}\cdots \int_{u_{k-1}}^{t}u_k^{\beta(k)}{\mathrm{d}} u_k\cdots {\rm d} u_{s_r+1} \label{eqn28}\end{align}

for some $0<\tilde{C}{<}\infty$ . The first set of integrals is finite by Lemma 4 and independent of t. For the last set of integrals, we obtain

(29)

\begin{align} &\int_{0}^{t}u_{s_r+1}^{\beta({s_r+1})}\int_{u_{s_r+1}}^{t}u_{s_r+ 2}^{\beta({s+2})}\cdots \int_{u_{k-1}}^{t}u_k^{\beta(k)}{\mathrm{d}} u_k\cdots {\rm d} u_{s_r+1}\nonumber \\ &\qquad= t^{k-s_r+\sum_{i=s_r+1}^{k}\beta(i)}\int_{0}^{1}w_{s_r+1}^{\beta({s_r +1})}\int_{w_{s_r+1}}^{1}w_{s_r+2}^{\beta({s_r+2})}\cdots \int_{w_{k-1}}^{1}w_k^{\beta(k)}{\mathrm{d}} w_k\cdots {\rm d} w_{s_r+1}\nonumber \\ &\qquad = Kt^{k+B(H)} \label{eqn29}\end{align}

for some $0{<}K{<}\infty$ , where we have used the change of variables $w=u/t$ and Lemma 4. For $r=1$ , this completes the proof, because then the middle integrals in (28) are empty. We now investigate the behavior of the middle sets of integrals for $r>1$ . Because the optimum to (3) is attained for $s_1$ as well as $s_2$ ,

(30)

\begin{equation} -s_1+\sum_{i=s_1+1}^{k}\beta(i)+s_2-\sum_{i=s_2+1}^{k}\beta(i)=s_2- s_1+\sum_{i=s_1+1}^{s_2}\beta(i)=0. \label{eqn30} \end{equation}

Therefore, when $s_2=s_1+1$ , the second set of integrals in (28) equals

\begin{equation*} \int_{1}^{t}u_{s_1}^{-1}{\mathrm{d}} u_{s_1}=\log (t). \end{equation*}

Now suppose that $s_1{<}s_2+1$ . Then, any $\tilde{s}\in[s_1+1,s_2-1]$ is a nonoptimal solution to (3), and, therefore,

\begin{equation*} -s_2+\sum_{i=s_2+1}^{k}\beta(i)+\tilde{s}-\sum_{i=\tilde{s}+1}^{k} \beta(i)=\tilde{s}-s_2-\sum_{i=\tilde{s}+1}^{s_2}\beta(i)>0, \end{equation*}

(31)

\begin{equation} \sum_{i=\tilde{s}+1}^{s_2}\beta(i) <s_2-\tilde{s}. \label{eqn31} \end{equation}

This implies that

(32)

\begin{align} &\int_{1}^{t}u_{s_1+1}^{\beta(s_1+1)}\int_{u_{s_1+1}}^{\infty}u_{s_1+ 2}^{\beta(s_1+2)}\cdots \int_{u_{s_2-1}}^{t}u_{s_2}^{\beta(s_2)}{\mathrm{d}} u_{s_2}\cdots {\rm d} u_{s_1+1}\nonumber \\ &\qquad= \tilde{K}\int_{1}^{t}u_{s_1+1}^{\sum_{i=s_1+1}^{s_2} \beta(i)+s_2-s_1-1}{\rm d} u_{s_1+1}\nonumber \\ &\qquad= \tilde{K}\int_{1}^{t}u_{s_1+1}^{-1}{\mathrm{d}} u_{s_1+1}\nonumber \\ &\qquad= \tilde{K}\log (t) \label{eqn32}\end{align}

for some $0{<}C{<}\infty$ . A similar reasoning holds for the other integrals, so that combining (28), (29), and (32) yields

\begin{equation*} \limsup_{t\to\infty}\frac{{\mathbb{E}}[{N_t(H)}]}{t^{k+B(H)}\log^{r-1}(t)}\leq C_2 \end{equation*}

for some $0{<}C_2{<}\infty$ .

We now proceed to prove a lower bound on the expected number of subgraphs. Again, by Lemma 1 and lower bounding the sums by integrals as in (27), we obtain, for some $0{<}C{<}\infty$ ,

\begin{equation*} {\mathbb{E}}[{N_t(H)}]\geq C \int_{1}^{t}u_1^{\beta(1)}\cdots \int_{u_{k-1}}^{t} u_k^{\beta(k)}{\mathrm{d}} u_k\cdots {\rm d} u_1. \end{equation*}

Fix $\varepsilon>0$ . We investigate the contribution where vertices $1,\dots,s_1$ have index in $[1,1/\varepsilon]$ , vertices $s_1+1,\dots,s_2$ have index in $[1/\varepsilon,\varepsilon t^{1/r}]$ , vertices $s_2+1,\dots,s_3$ have index in $[t^{1/r},\varepsilon t^{2/r}],$ and so on, and vertices $s_r+1,\dots, s_k$ have index in $[\varepsilon t,t]$ . Thus, we bound

(33)

\begin{align} {\mathbb{E}}[{N_t(H)}] & \geq C\int_{1}^{1/\varepsilon}u_1^{\beta(1)}\int_{u_1}^{1/ \varepsilon}u_2^{\beta(2)}\cdots\int_{u_{s_1-1}}^{1/ \varepsilon}u_{s_1}^{\beta(s)}{\mathrm{d}} u_{s_1}\cdots{\rm d} u_1\nonumber \\ & \times \int_{1/\varepsilon}^{\varepsilon t^{1/r}}u_{s_1+1}^{\beta(s_1+1)}\int_{u_{s_1+1}}^{u_{s_1+1}/\varepsilon} u_{s_1+2}^{\beta(s_1+2)}\cdots \int_{u_{s_2-1}}^{u_{s_2-1}/\varepsilon}u_{s_2}^{\beta(s_2)}{\mathrm{d}} u_{s_2}\cdots{\rm d} u_{s_1+1}\nonumber \\ & \times \int_{t^{1/r}}^{\varepsilon t^{2/r}}u_{s_2+1}^{\beta(s_2+1)}\int_{u_{s_2+1}}^{u_{s_2+1}/ \varepsilon}u_{s_2+2}^{\beta(s_2+2)}\cdots \int_{u_{s_3-1}}^{u_{s_3-1}/\varepsilon}u_{s_3}^{\beta(s_3)}{\mathrm{d}} u_{s_3}\cdots{\rm d} u_{s_2+1}\times\cdots \nonumber \\ & \times \int_{t^{(r-2)/r}}^{\varepsilon t^{(r-1)/r}}u_{s_{r-1}+1}^{\beta(s_{r-1}+1)} \int_{u_{s_{r-1}+1}}^{u_{s_{r-1}+1}/\varepsilon}u_{s_{r-1}+2}^{\beta(s_{r -1}+2)}\cdots \int_{u_{s_r-1}}^{u_{s_r-1}/ \varepsilon}\!u_{s_r}^{\beta(s_r)}{\mathrm{d}} u_{s_r}\cdots{\rm d} u_{s_{r-1}+1}\nonumber \\ & \times \int_{\varepsilon t}^{t}u_{s_r+1}^{\beta(s_r+1)}\int_{u_{s_r+1}}^{t}u_{s_r+2}^{\beta(s_r +2)}\cdots \int_{u_{k-1}}^{t}u_k^{\beta(k)}{\mathrm{d}} u_k\cdots{\rm d} u_{s_r+1}. \label{eqn33}\end{align}

The first set of integrals equals $A_1(H)$ plus terms that vanishes as $\varepsilon$ becomes small by Lemma 4. For the last set of integrals, we use the change of variables $w=u/t$ to obtain

(34)

\begin{align} & \int_{\varepsilon t}^{t}u_{s_r+1}^{\beta({s_r+1})}\int_{u_{s_r+1}}^{t} u_{s_r+2}^{\beta({s_r+2})}\cdots \int_{u_{k-1}}^{t}u_k^{\beta(k)}{\mathrm{d}} u_k\cdots {\rm d} u_{s_r+1}\nonumber \\ &\qquad = t^{k-s_r+\sum_{i=s_r+1}^{k}\beta(i)}\int_{\varepsilon }^{1}w_{s_r+1}^{\beta({s_r+1})}\int_{w_{s_r+1}}^{1}w_{s_r+2}^{\beta({s_r +2})}\cdots \int_{w_{k-1}}^{1}w_k^{\beta(k)}{\mathrm{d}} w_k\cdots {\rm d} w_{s_r+1}\nonumber \\ &\qquad = t^{k+B(H)}(A_2(H)-h_1(\varepsilon)) \label{eqn34}\end{align}

for some function $h_1(\varepsilon)$ . By Lemma 4, $h_1(\varepsilon)$ satisfies $\lim_{\varepsilon\to 0}h_1(\varepsilon)= 0$ . Again, if $r=1$ , the middle sets of integrals in (33) are empty, so we are done.

We now investigate the second set of integrals in (33) for $r>1$ . Using the substitutions $w_{s_1+1}=u_{s_1+1}$ and $w_i=u_i/u_{i-1}$ for $i>s_1+1$ , we obtain

(35)

\begin{align} &\int_{1/\varepsilon}^{\varepsilon t^{1/r}}u_{s_1+1}^{\beta(s_1+1)} \int_{u_{s_1+1}}^{u_{s_1+1}/\varepsilon}u_{s_1+2}^{\beta(s_1+2)}\cdots \int_{u_{s_2-1}}^{u_{s_2-1}/\varepsilon}u_{s_2}^{\beta(s_2)}{\mathrm{d}} u_{s_2}\cdots{\rm d} u_{s_1+1}\nonumber \\ &\qquad = \int_{1/\varepsilon}^{\varepsilon t^{1/r}}w_{s_1+1}^{s_2-s_1-1+\sum_{i=s_1+1}^{s_2}\beta(i)}{\rm d} w_{s_1+1}\int_{1}^{1/\varepsilon}w_{s_1+2}^{s_2-s_1-2+ \sum_{i=s_1+2}^{s_2}\beta(i)}{\rm d} w_{s_2+1}\times\cdots \nonumber \\ &\qquad \times \int_{1}^{1/\varepsilon}w_{s_2}^{\beta(s_2)}{\rm d} w_{s_2}. \label{eqn35}\end{align}

The first integral equals, by (30),

\begin{equation*} \int_{1/\varepsilon}^{\varepsilon t^{1/r}}w_{s_1+1}^{-1}{\mathrm{d}} w_{s_1+1} = \frac{1}{r}\log(t)+\log(\varepsilon^2). \end{equation*}

The integrand in all other integrals in (35) equals $\smash{w_i^{\gamma_i}}$ for some $\gamma_i<-1$ by (31). Therefore, these integrals equal a constant plus a function of $\varepsilon$ that vanishes as $\varepsilon$ becomes small so that

(36)

for some $0<K<\infty$ and some $h_2(\varepsilon)$ such that $\lim_{\varepsilon\to 0}h_2(\varepsilon)=0$ . The other integrals in (33) can be estimated similarly.

Combining (33), (34), and (36) we obtain

\begin{equation*} \liminf_{t\to\infty}\frac{{\mathbb{E}}[{N_t(H)}]}{t^{k+B(H)}\log^{r-1}(t)} \geq C_1+h(\varepsilon) \end{equation*}

for some constant $0{<}C_1{<}\infty$ and some function $h(\varepsilon)$ such that $\lim_{\varepsilon\to 0}h(\varepsilon)=0$ . Taking the limit for $\varepsilon\to 0$ then proves the theorem.

6. Proof of Theorem 2

Fix $m\geq 2$ and $\delta>-m$ . The first step of the proof consists of showing that

(37)

\begin{align} \mathbb{E}[\triangle_t] &= \frac{\tau-2}{\tau-1}\frac{m^2(m-1) (m+\delta)(m+\delta+1)} {(2m+\delta)^2} \nonumber \\ & \times \sum_{u=1}^{t-2}\bigg[\bigg(u-\frac{2m}{2m+\delta}\bigg) \bigg(u-\frac{2m-1}{2m+\delta}\bigg)\bigg]^{-1}\nonumber \\ & \times\frac{\Gamma(u+2-(2m)/(2m+\delta)}{\Gamma(u+2-(3m+\delta) /(2m+\delta))} \frac{\Gamma(u+2-(2m-1)/(2m+\delta))}{\Gamma(u+2- (3m+\delta-1)/(2m+\delta))}\nonumber \\ & \times \sum_{v=u+1}^{t-1}\bigg(v-\frac{3m+\delta-1}{2m+\delta}\bigg)^{-1}\nonumber \\ & \times \sum_{w=v+1}^{t}\frac{\Gamma(w-(3m+\delta)/(2m+\delta))} {\Gamma(w-(2m)/(2m+\delta))} \frac{\Gamma(w-(3m+\delta-1)/(2m+\delta))} {\Gamma(w-(2m-1)/(2m+\delta))}. \label{eqn37}\end{align}

We can write

(38)

\begin{equation} \triangle_t \,{:\!=}\, \sum_{u=1}^{t-2}\sum_{v=u+1}^{t-1}\sum_{w=v+1}^t \sum_{j_1\in[m]}\sum_{j_2,j_3\in[m]}{\bf 1}\{u\stackrel{j_1}{\leftarrow} v,\ u\stackrel{j_2}{\leftarrow}w, \ v\stackrel{j_3}{\leftarrow} w\}. \label{eqn38} \end{equation}

Since there are $m^2(m-1)$ possible choices for the edges $j_1,j_2,j_3$ ,

(39)

\begin{equation} \mathbb{E}[\triangle_t] = m^2(m-1)\sum_{u=1}^{t-2}\sum_{v=u+1}^{t-1}\sum_{w=v+1}^t \mathbb{E}\bigg[\psi_u\frac{S_u}{S_{v-1}}\psi_u\frac{S_u}{S_{w-1}} \psi_v\frac{S_v}{S_{w-1}}\bigg]. \label{eqn39} \end{equation}

Recalling (15), we can write every term in the sum in (39) as

(40)

\begin{equation} \mathbb{E}\bigg[\bigg(\psi_u\prod_{h=u+1}^{v-1}(1-\psi_h)\bigg) \bigg(\psi_u\prod_{k=u+1}^{w-1}(1-\psi_k)\bigg)\bigg(\psi_v \prod_{l=v+1}^{w-1}(1-\psi_l)\bigg)\bigg]. \label{eqn40} \end{equation}

Since the random variables $\psi_1,\ldots, \psi_t$ are independent, we can factorize the expectation to obtain

(41)

\begin{equation} \mathbb{E}[\psi_u^2]\mathbb{E}[\psi_v(1-\psi_v)] \prod_{k=u+1,\,k\neq v}^{w-1}\mathbb{E}[(1-\psi_k)^2] =\mathbb{E}[\psi_u^2]\frac{\mathbb{E}[ \psi_v(1-\psi_v)]}{\mathbb{E}[(1-\psi_v)^2]} \prod_{k=u+1}^{w-1}\mathbb{E}[(1-\psi_k)^2] . \label{eqn41} \end{equation}

Recall that, for a beta random variable $X(\alpha,\beta)$ , we have

(42)

\begin{equation} \begin{gathered} \mathbb{E}[X] = \frac{\alpha}{\alpha+\beta},\qquad \mathbb{E}[X(1-X)] = \frac{\alpha\beta}{(\alpha+\beta)(\alpha+\beta+1)}, \\ \mathbb{E}[X^2] = \frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}, \end{gathered} \label{eqn42} \end{equation}

and $1-X(\alpha,\beta)$ is distributed as $X(\beta,\alpha)$ . Using (42), we can rewrite (41) in terms of the parameters of $\psi_1,\ldots, \psi_t$ . Since $\psi_k$ has parameters $\alpha\,{=}\, m\,{+}\,\delta$ and $\beta\,{=}\, \beta_k\,{=}\, m(2k\,{-}\,3)\,{+}\,(k\,{-}\,1)\delta$ , the first term in (41) can be written as

(43)

\begin{align} &\mathbb{E}[\psi_u^2] \frac{(m+\delta)(m+\delta+1)}{(m(2u-2)+u\delta) (m(2u-2)+u\delta+1)}\nonumber \\ &\qquad = \frac{(m+\delta)(m+\delta+1)}{(2m+\delta)^2}\bigg[\bigg(u-\frac{2m} {2m}+\delta\bigg)\bigg(u-\frac{2m-1}{2m+\delta}\bigg)\bigg]^{-1}. \label{eqn43}\end{align}

For the second term, we have

(44)

\begin{equation} \frac{\mathbb{E}[\psi_v(1-\psi_v)]}{\mathbb{E}[(1-\psi_v)^2]} =\frac{m+\delta}{m(2v-3)+(v-1)\delta} = \frac{\tau-2}{\tau-1} \bigg(v-\frac{3m+\delta-1}{2m+\delta}\bigg)^{-1}. \label{eqn44} \end{equation}

The last product in (41), for $k=u+1,\ldots,w-1,$ results in

(45)

\begin{align} \mathbb{E}[(1-\psi_k)^2] & =\frac{(m(2k-3)+(k-1)\delta) (m(2k-3)+(k-1)\delta+1)}{(m(2k-2)+k\delta) (m(2k-2)+k\delta+1)}\nonumber \\ & = \frac{k-(3m+\delta)/(2m+\delta)}{k-2m/(2m+\delta)}\frac{k-(3m+ \delta-1)/(2m+\delta)}{k-(2m-1)/(2m+\delta)}. \label{eqn45}\end{align}

Using the recursive property $\Gamma(a+1) = a\Gamma(a)$ of the gamma function,

(46)

\begin{align} \prod_{k=u+1}^{w-1}\mathbb{E}[(1-\psi_k)^2] &= \frac{\Gamma(u+2-(2m)/(2m+\delta)}{\Gamma(u+2-(3m+\delta)/(2m+\delta))} \frac{\Gamma(u+2-(2m-1)/(2m+\delta))}{\Gamma(u+2-(3m+\delta-1)/ (2m+\delta))}\nonumber \\ & \times \!\frac{\Gamma(w-(3m+\delta)/(2m+\!\delta))} {\Gamma(w-(2m)/(2m+\delta))} \frac{\Gamma(w-(3m+\delta\!-1)/(2m+\!\delta))} {\Gamma(w-(2m-1)/(2m+\delta))}. \label{eqn46}\end{align}

Equation (39) follows by combining (41), (43), (44), (45), and (46).

The last step of the proof is to evaluate the sum in (39), and combining the result with the multiplicative constant in front in (39). By Stirling’s formula,

\begin{equation*} \frac{\Gamma(x+a)}{\Gamma(x+b)} = x^{a-b}\bigg(1+O\bigg(\frac{1}{x}\bigg) \bigg). \end{equation*}

As a consequence, recalling that $\chi = (m+\delta)/(2m+\delta)$ , the sum in (39) can be written as

(47)

\begin{equation} \sum_{u=1}^{t-2}u^{2\chi-2}\bigg(1+O\bigg(\frac{1}{u}\bigg)\bigg) \sum_{v=u+1}^{t-1}v^{-1}\bigg(1+O\bigg(\frac{1}{v}\bigg)\bigg) \sum_{w=v+1}^t w^{-2\chi}\bigg(1+O\bigg(\frac{1}{w}\bigg)\bigg). \label{eqn47} \end{equation}

We can approximate the sum in (47) with the corresponding integral using the Euler–Maclaurin formula, thus obtaining

(48)

\begin{equation} \int_1^t u^{2\chi-2}{\mathrm{d}} u\int_u^t v^{-1}{\mathrm{d}} v\int_v^t w^{-2\chi}{\mathrm{d}} w. \label{eqn48} \end{equation}

As $t\rightarrow\infty$ , the order of magnitude of the integral in (48) is predicted by Theorem 1. If we evaluate the integral then we find that the coefficient of the dominant term in (48) is ${(2m+\delta)^2/\delta^2}$ for $\tau>2, \tau\neq 3$ , and $\smash{\tfrac16}$ for $\tau=3$ .

Putting together these coefficients with the constant in front of the sum in (37) completes the proof of Theorem 2.

Remark 1

(Constant for general subgraphs) In the proof of Theorem 2, the hardest step is to prove (38), i.e. to find the expectation of the indicator functions in (37). This is the reason why, for a general ordered subgraph ${(H,\pi)}$ on k vertices, it is hard to find the explicit constant as in (9). In fact, as we have done to move from (39) to (40), it is necessary to identify precisely, for every $v\in[t]$ , how many times the terms $\psi_v$ and ${(1-\psi_v)}$ appear in the product inside the expectations in (39). This makes the evaluation of such terms complicated.

Typically, as shown in (41), (43), (44), (45), and (46), the product of the constants obtained by evaluating the probability of an ordered subgraph ${(H,\pi)}$ being present can be written as ratios of gamma functions. The same constants can be found using the martingale approach as in [Reference Bollobás, Riordan, Spencer and Tusnády7], [Reference Szymaski26], and [Reference van der Hofstad28, Section 8.3], even though in this case constants are obtained through a recursive use of the conditional expectation.

We remark that our method and the martingale method are equivalent. We focused on the Pólya urn interpretation of the graph since it highlights the dependence of the presence of edges on the age of vertices, which is directly related to the order of magnitude of degrees.

7. Conditional concentration: proof of Proposition 1

In the previous sections, we have considered the order of magnitude of the expectation of the number of occurrences of ordered subgraphs in the PAM. In other words, for an ordered subgraph ${(H,\psi),}$ we are able to identify the order of magnitude f(t) of the expected number of occurrences $N_t(H,\pi)$ , so that $$\mathbb{E}[{N_t}(H,\pi )] = {\rm{ }}o{\rm{(}}f(t))$$ . We now show how these orders of magnitude of the expected number of subgraphs determines the conditional convergence given in (6).

7.1. Bound with overlapping subgraphs

The Pólya urn graph in Definition 2 consists of a function of uniform random variables $\smash{(U_{v,j})_{v\in[t]}^{\;j\in[m]}}$ and an independent sequence of beta random variables $\smash{(\psi_v)_{v\in[t]}}$ . We can interpret the sequence $\smash{(\psi_v)_{v\in[t]}}$ as a sequence of intensities associated to the vertices, where a higher intensity corresponds to a higher probability of receiving a connection. The sequence $\smash{(U_{v,j})_{v\in[t]}^{\;j\in[m]}}$ determines the attachment of edges. In particular, conditionally on the sequence $\smash{(\psi_v)_{v\in[t]}}$ , every edge is present independently (but with different probabilities).

For $t\in\mathbb{N}$ , let $\mathbb{P}_{\boldsymbol{\psi}_t}(\cdot) = \mathbb{P}(\cdot \mid \psi_1,\ldots,\psi_t)$ , and similarly $\mathbb{E}_{\boldsymbol{\psi}_t}[\cdot] = \mathbb{E}[\cdot \mid \psi_1,\ldots,\psi_t]$ . Furthermore, let $N_{t,\psi}(H,\pi)$ denote the number of times subgraph ${(H,\pi)}$ appears conditionally on the $\psi$ -variables. We now apply a conditional second moment method to $N_{t,\psi}(H,\pi)$ . We use the notation introduced in Section 4, so that every possible realization of H in the PAM corresponds to a finite set of edges $M_\ell(\boldsymbol{u}_\ell,\boldsymbol{v}_\ell,\boldsymbol{j}_\ell)$ , where $\ell$ is the number of edges in H such that $\smash{v_h\stackrel{j_h}{\rightarrow}u_h}$ , i.e. $u_h$ is the receiving vertex, and $j_h$ is the label of the edge. For simplicity, we denote the set $M_\ell(\boldsymbol{u}_\ell,\boldsymbol{v}_\ell,\boldsymbol{j}_\ell)$ by M. For ease of notation, we assume that $\pi$ is the identity map and drop the dependence on $\pi$ . We prove the following results.

Lemma 5

(Bound on the conditional variance.) Consider subgraph H. Then, $\mathbb{P}$ -almost surely,

\begin{equation*} \textrm{var}_{\boldsymbol{\psi}_t}(N_t(H)) \leq \mathbb{E}_{\boldsymbol{\psi}_t}[N_t(H)]+ \sum_{\skew3\hat{H}\in\hat{\mathcal{H}}}\mathbb{E}_{\boldsymbol{\psi}_t}[N_t(\skew3\hat{H})], \end{equation*}

where $\smash{\hat{\mathcal{H}}}$ denotes the set of all possible attainable subgraphs $\smash{\skew3\hat{H}}$ that are obtained by merging two copies of H such that they share at least one edge.

Lemma 5 gives a bound on the conditional variance in terms of the conditional probabilities of observing two overlapping subgraphs H at the same time. Note that we require these copies to overlap at least one edge, which is different than requiring that they are disjoint (they can share one or more vertices but no edges).

Proof of Lemma 5. We prove the bound in Lemma 5 by evaluating the conditional second moment of $N_t(H)$ as

\begin{align*} \mathbb{E}_{\boldsymbol{\psi}_t}[N_t(H)^2] & = \mathbb{E}_{\boldsymbol{\psi}_t}\bigg[\sum_{M,M'}{\bf 1}_{\{M\subseteq E(\mathrm{PA}_t)\}}{\bf 1}_{\{M'\subseteq E(\mathrm{PA}_t)\}}\bigg] \\ & = \sum_{M,M'}\mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t),\ M'\subseteq E(\mathrm{PA}_t)) , \end{align*}

where M and M′ are two sets of edges corresponding to two possible realizations of the subgraph H. Note that M and M′ are not necessarily distinct. We then have to evaluate the conditional probability of having both the sets M and M′ simultaneously present in the graph. As a consequence, the conditional variance in Lemma 5 can be written as

(49)

\begin{equation} \sum_{M\neq M'}\mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t),\ M'\subseteq E(\mathrm{PA}_t)) -\mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t))\mathbb{P}_{\boldsymbol{\psi}_t}(M'\subseteq E(\mathrm{PA}_t)). \label{eqn49} \end{equation}

We define

$$\matrix{ {{\cal M}{\mkern 1mu} : = {\mkern 1mu} \{ (M,M'):{\rm{there exists }}(u,v,j){\rm{suchthat}}(u,v,j) \in M,\;(u,v,j) \in M',} \cr {M \ne M',(M \cup M')\;{\rm{defines an attainable subgraph}}\} .} \cr } $$

We then consider two different cases, i.e. whether (M,M′) is in $\mathcal{M}$ or not. If ${(M,M')\not \in \mathcal{M}}$ then one of the three following situations occurs.

$M\cup M'$ defines a subgraph that is not attainable (for instance, M and M′ require that the same edge is attached to different vertices).
$M\cup M'$ defines a subgraph that is attainable, M and M′ are disjoint sets of labeled edges (they are allowed to share vertices).
M and M′ define the same attainable subgraph (so $M=M'$ ; thus labels of edges coincide).

When $M=M'$ , we have

\begin{equation*} \mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t),\ M'\subseteq E(\mathrm{PA}_t)) = \mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t)), \end{equation*}

so that the corresponding contribution in the sum in (49) is

\begin{equation*} \mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t))-\mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t))^2\leq \mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t)), \end{equation*}

and the sum over M gives the term $\mathbb{E}_{\boldsymbol{\psi}_t}[N_t(H)]$ in the statement of Lemma 5. When $M\,{\neq}\, M'$ and $M\cup M'$ is attainable and their sets of edges are disjoint, it follows directly from the independence of ${(U_{v,j})_{v\in[t]}^{\; j\in[m]}}$ and ${(\psi_v)_{v\in[t]}}$ that

\begin{equation*} \mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t),\ M'\subseteq E(\mathrm{PA}_t)) = \mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t))\mathbb{P}_{\boldsymbol{\psi}_t}(M'\subseteq E(\mathrm{PA}_t)). \end{equation*}

Thus, in this situation the corresponding contribution is 0. When (M,M?) is not attainable, the corresponding contribution is negative. When ${(M,M')\in \mathcal{M},}$ we bound the corresponding terms in (49) by $\mathbb{P}_{\boldsymbol{\psi}_t}\Big(M\subseteq E(\mathrm{PA}_t),\ M'\subseteq E(\mathrm{PA}_t)\Big)$ , thus obtaining

\begin{equation*} \textrm{var}_{\boldsymbol{\psi}_t}(N_t(H)) \leq \mathbb{E}_{\boldsymbol{\psi}_t} [N_t(H)]+\sum_{(M,M')\in \mathcal{M}} \mathbb{P}_{\boldsymbol{\psi}_t}(M\cup M' \subseteq E(\mathrm{PA}_t)). \end{equation*}

We then rewrite this as

\begin{equation*} \textrm{var}_{\boldsymbol{\psi}_t}(N_t(H,\pi)) \leq \mathbb{E}_{\boldsymbol{\psi}_t}[N_t(H)] + \sum_{\skew3\hat{H}\in\hat{\mathcal{H}}}\mathbb{E}_{\boldsymbol{\psi}_t}[N_t(\skew3\hat{H})], \end{equation*}

which proves the lemma.

7.2. Criterion for conditional convergence

We now prove Proposition 1 using Lemma 5 and Lemma 7.

Proof of Proposition 1. It is sufficient to show that, for every fixed $\varepsilon>0$ ,

\begin{equation*} \mathbb{P}(|N_{t,\psi}(H,\pi)-\mathbb{E}_t[N_t(H,\pi)]>\varepsilon \mathbb{E}[N_t(H,\pi)] ) = o(1). \end{equation*}

We now apply Lemma 5, which yields

\begin{align*} &\mathbb{P}(|N_{t,\psi}(H,\pi)-\mathbb{E}_{\boldsymbol{\psi}_t}[N_t(H,\pi)] >\varepsilon \mathbb{E}[N_t(H,\pi)] ) \\ &\qquad \leq \frac{1}{\varepsilon^2\mathbb{E}[N_t(H,\pi)]^2} \mathbb{E}[\textrm{var}_{\boldsymbol{\psi}_t}(N_t(H,\pi))] \\ &\qquad \leq \frac{\mathbb{E}[\mathbb{E}_{\boldsymbol{\psi}_t}[N_t(H,\pi)] + \sum_{\skew3\hat{H}\in\hat{\mathcal{H}}}\mathbb{E}_{\boldsymbol{\psi}_t}[N_t(\skew3\hat{H})] ]}{\varepsilon^2\mathbb{E}[N_t(H,\pi)]^2} \\ &\qquad = \frac{\mathbb{E}[N_t(H,\pi)]+ \mathbb{E}[N_t(\skew3\hat{H})]}{\varepsilon^2 \mathbb{E}[N_t(H,\pi)]^2} \\ &\qquad = o(1).\tag*{\qedhere} \end{align*}

To show how Proposition 1 may be applied to investigate the concentration properties of subgraphs, we consider triangles. Theorem 2 identifies the expected number of triangles, and by Theorem 1 we can show that $\smash{\mathbb{E}[\triangle_t^2] = \Theta(\mathbb{E}[\triangle_t]^2)}$ , so we are not able to apply the second moment method to $\triangle_t$ . Figure 6 suggests that $\triangle_t/\mathbb{E}[\triangle_t]$ converges to a limit that is not deterministic, i.e. in (7) the limiting X is a random variable. However, we can prove that $\triangle_t$ is conditionally concentrated, as stated in Corollary 2. The proof of Corollary 2 follows directly from Proposition 1, the fact that $\mathbb{E}[\triangle_t] = \smash{\Theta(t^{(3-\tau)/(\tau-1)}\log(t))}$ as given by Theorem 2, and Figure 5, which contains the information on the subgraphs consisting of two triangles sharing one or two edges.

7.3. Nonconcentrated subgraphs

We now show that, for most $\psi$ -sequences, the other direction in Proposition 1 also holds. That is, if there exists a subgraph composed of two merged copies of H such that the condition in Proposition 1 does not hold then, for most $\psi$ -sequences, H is not conditionally concentrated.

Proposition 2

Consider a subgraph ${(H,\pi)}$ such that ${\mathbb{E}}[{N_t(H,\pi)}]\to\infty$ as $t\to\infty$ . Suppose that there exists a subgraph $\skew3\hat{H}$ , composed of two distinct copies of ${(H,\pi)}$ with at least one edge in common such that $\mathbb{E}[{{N}_{t}}(3\hat{H})]/\mathbb{E}[{{N}_{t}}(H,\pi)]\rightarrow 0$ as $t\to\infty$ . Then, for any $\varepsilon>0$ , there exists $\eta>0$ such that

\begin{equation*} {\mathbb{P}}\bigg({\frac{\textrm{var}_{\psi_t}({N_t(H,\pi)})}{{\mathbb{E}}[{N_t(H,\pi)}]^2} >\eta}\bigg)\geq 1- \varepsilon. \end{equation*}

To prove Proposition 2, we need a preliminary result on the maximum value of $\psi$ , the maximal intensity in the Pólya urn graph.

Lemma 6

For every $\varepsilon>0,$ there exists $K = K(\varepsilon)\in\mathbb{N}$ such that

\begin{equation*} \mathbb{P}\bigg(\bigcap_{k\geq K}\{\psi_k\leq \frac{(\log k)^2}{(2m+ \delta)k}\}\bigg)\geq 1-\varepsilon. \end{equation*}

Lemma 6 is a part of a more general coupling result between ${(\psi_k)_{k\in\mathbb{N}}}$ and a sequence of independent and identically distributed gamma random variables. We refer the reader to [Reference Berger, Borgs, Chayes and Saberi3, Lemma 3.2] and [Reference van der Hofstad27, Lemma 4.10] for more detail. We now state the lemma we need to prove Proposition 2.

Lemma 7

(Maximal intensity) For every $\varepsilon>0,$ there exists $\omega=\omega(\varepsilon) \in(0,1)$ such that, for every $t\in\mathbb{N}$ ,

\begin{equation*} \mathbb{P}\Big(\max_{i\in 2,\ldots, t}\psi_i<\omega\Big)\geq 1-\varepsilon. \end{equation*}

Proof. Fix $\varepsilon>0$ , and consider $K(\varepsilon/2)$ as given by Lemma 6. For every $\omega\in(0,1),$ we can write

(50)

\begin{equation} \mathbb{P}\Big(\max_{i\in2,\ldots,t}\psi_i<\omega\Big) = \mathbb{P}\Big(\max_{i\in2,\ldots,K}\psi_i<\omega\Big) \mathbb{P}\Big(\max_{i\in[t]\setminus[K]}\psi_i<\omega\Big), \label{eqn50} \end{equation}

where we used the independence of $\psi_2,\ldots,\psi_t$ . If $t>K,$ the second term on the right-hand side of (50) is well defined; otherwise, we have only the first term. Define

\begin{equation*} \omega_1 = \begin{cases} \frac{(\log K)^2}{(2m+\delta)K} & \mbox{if}\ t>K, \\ 0 & \mbox{if}\ t\leq K. \end{cases} \end{equation*}

Note that, since the function $k\mapsto {(\log k)^2}/{(2m+\delta)k}$ is decreasing, it follows that

(51)

\begin{equation} \mathbb{P}\Big(\max_{i\in[t]\setminus[K]}\psi_i<\omega_1\Big)\geq 1-\tfrac12{\varepsilon}. \label{eqn51} \end{equation}

Define the random variable $X_K = \max_{i\in 2,\ldots,K}\psi_i$ , and denote its distribution function by $F_K$ and the inverse of its distribution function by $F^{-1}_K$ . Consider $\omega_2 = F^{-1}_{K}(1-\varepsilon/2)$ , which implies that

(52)

\begin{equation} \mathbb{P}\Big(\max_{i\in[K]}\psi_i<\omega_2\Big)= 1-\tfrac12{\varepsilon}. \label{eqn52} \end{equation}

Consider then $\omega = \max\{\omega_1,\omega_2\}$ . Using (51) and (52) with $\omega$ in (50), it follows that

\begin{equation*} \mathbb{P}\Big(\max_{i\in 2,\ldots, K}\psi_i<\omega\Big) \mathbb{P}\Big(\max_{i\in[t]\setminus[K]}\psi_i<\omega\Big)\geq \big(1-\tfrac12{\varepsilon}\big)^2\geq 1-\varepsilon, \end{equation*}

which completes the proof.

Proof of Proposition 2. We use the expression of the conditional variance of (49). We first study the term in the conditional variance corresponding to $\smash{\skew3\hat{H}}$ . Let $\smash{\tilde{\mathcal{M}}}$ denote the set of labeled edges M,M’ that together form subgraph $\smash{\skew3\hat{H}}$ . Let the edges that M and M’ share be denoted by $M_s$ . Furthermore, let $\smash{\tilde{\mathcal{M}}_1}$ denote the set of labeled edges M,M’ that together form subgraph $\smash{\skew3\hat{H}}$ that do not use vertex 1. We can then write this term as

\begin{align*} &\sum_{M,M'\in \tilde{\mathcal{M}}}{\mathbb{P}}_{\psi_t}({M\cup M'\subseteq E(\mathrm{PA}_t)}) (1-{\mathbb{P}}_{\psi_t}({M_s\subseteq E(\mathrm{PA}_t)}))\nonumber \\ &\qquad \geq \sum_{M,M'\in \tilde{\mathcal{M}}_1}{\mathbb{P}}_{\psi_t}({M\cup M'\subseteq E(\mathrm{PA}_t)})(1-\psi_{\max})\nonumber \\ &\qquad = (1-\psi_{\max}){\mathbb{E}}_{\psi_t}[{N_t(\skew3\hat{H})}], \end{align*}

where the inequality uses (15), and $\psi_{\max}=\max_{i\in 2,\ldots, t}\psi_i$ . Note that here we excluded vertex 1 from the number of subgraphs with negligible error. By Lemma 7, there exists $\omega$ such that, with probability at least $1-\varepsilon$ , $\psi_{\max}<\omega<1$ .

By the assumption on $\smash{\skew3\hat{H}}$ , ${\mathbb{E}}[{N_t(\skew3\hat{H})}]\geq \tilde{C}{\mathbb{E}}[{N_t(H,\pi)}]^2$ for some $\tilde{C}>0$ . We use the fact that $\smash{{\mathbb{E}}_{\psi_t}[{N_t(\skew3\hat{H})}]}= \smash{O_\mathbb{P}({\mathbb{E}}[{N_t(\skew3\hat{H})}])}$ . Thus, for sufficiently large t, we can bound the contribution from subgraph $\smash{\skew3\hat{H}}$ to the conditional variance from below with probability at least $1-\varepsilon$ by

\begin{equation*} \sum_{M,M'\in \tilde{\mathcal{M}}}{\mathbb{P}}_{\psi_t}({M\cup M'\subseteq E(\mathrm{PA}_t)}) (1-{\mathbb{P}}_{\psi_t}({M_s\subseteq E(\mathrm{PA}_t)}))\geq C {\mathbb{E}}_{\psi_t}[{N_t(H,\pi)}]^2 \end{equation*}

for some $C>0$ .

Note that the only terms that have a negative contribution to (49) are the terms where $M\cup M'$ is a nonattainable subgraph. In that situation, $\mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t),\ M'\subseteq E(\mathrm{PA}_t)) =0$ . Furthermore, the sum over $\mathbb{P}_{\boldsymbol{\psi}_t}(M\subseteq E(\mathrm{PA}_t))\mathbb{P}_{\boldsymbol{\psi}_t}(M'\subseteq E(\mathrm{PA}_t) \leq {\mathbb{E}}_{\psi_t}[{N_t(H,\pi)}]^2/n^2$ , since the two subgraphs share at least two vertices. Therefore, the negative terms in the conditional variance scale as most as ${\mathbb{E}}_{\psi_t}[{N_t(H,\pi)}]/n^2$ . We therefore find that, with probability at least $1-\varepsilon$ ,

\begin{equation*} \textrm{var}_{\psi_t}({N_t(H,\pi))}\geq \eta {\mathbb{E}}_{\psi_t}[{N_t(H,\pi)}]^2 \end{equation*}

for some $\eta>0$ , which proves the proposition.

8. Discussion

In this paper, we investigated the expected number of times a graph H appears as a subgraph of a PAM for any degree exponent $\tau$ . We found the scaling of the expected number of such subgraphs in terms of the graph size t and the degree exponent $\tau$ by defining an optimization problem that finds the optimal structure of the subgraph in terms of the ages of the vertices that form subgraph H and by using the interpretation of the PAM as a Pólya urn graph.

We derived the asymptotic scaling of the number of subgraphs. For the triangle subgraph, we obtained more precise asymptotics. It would be interesting to obtain precise asymptotics of the expected number of other types of subgraphs as well. In particular, this is necessary to compute the variance of the number of subgraphs, which may allow us to derive laws of large numbers for the number of subgraphs. We showed that different subgraphs may have significantly different concentration properties. Therefore, identifying the distribution of the number of rescaled subgraphs for any type of subgraph remains a challenging open problem.

Another interesting extension would be to investigate whether our result still holds for other types of PAMs, for example, models that allow for self-loops, or models that include extra triangles.

We further proved results for the number of subgraphs of fixed size k, while the graph size tends to infinity. It would also be interesting to let the subgraph size grow with the graph size, for example, by counting the number of cycles of a certain length that grows in the graph size.

Finally, we investigated the number of times H appears as a subgraph of a PAM. It is also possible to count the number of times H appears as an induced subgraph instead, forbidding edges that are not present in H to be present in the larger graph. It would be interesting to see whether the optimal subgraph structure is different from the optimal induced subgraph structure.

Acknowledgements

We thank Remco van der Hofstad for reading the manuscript. This work was supported in part by the Netherlands Organisation for Scientific Research (NWO) through the Gravitation Networks grant 024.002.003 and NWO TOP grant 613.001.451.

References

Albert, R. and Barabási, A.-L. (2002). Statistical mechanics of complex networks. Rev. Modern Phys. 74, 47–97.CrossRef Google Scholar

Barabási, A.-L. and Albert, R. (1999). Emergence of scaling in random networks. Science 286, 509–512.CrossRef Google Scholar PubMed

Berger, N., Borgs, C., Chayes, J. T. and Saberi, A. (2014). Asymptotic behavior and distributional limits of preferential attachment graphs. Ann. Prob. 42, 1–40.CrossRef Google Scholar

Boguñá, M. and Pastor-Satorras, R. (2003). Class of correlated random networks with hidden variables. Phys. Rev. E 68, 036112, 13pp.CrossRef Google Scholar PubMed

Bollobás, B. (1980). A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. Europ. J. Combinatorics 1, 311–316.CrossRef Google Scholar

Bollobás, B. and Riordan, O. M. (2002). Mathematical Results on Scale-Free Random Graphs. Wiley–WCH, Weinheim.CrossRef Google Scholar

Bollobás, B., Riordan, O., Spencer, J. and Tusnády, G. (2001). The degree sequence of a scale-free random graph process. Random Structures Algorithms 18, 279–290.CrossRef Google Scholar

Britton, T., Deijfen, M. and Martin-Löf, A. (2006). Generating simple random graphs with prescribed degree distribution. J. Statist. Phys. 124, 1377–1397.CrossRef Google Scholar

Chung, F. and Lu, L. (2002). The average distances in random graphs with given expected degrees. Proc. Nat. Acad. Sci. USA 99, 15879–15882.CrossRef Google Scholar PubMed

Dommers, S., van der Hofstad, R. and Hooghiemstra, G. (2010). Diameters in preferential attachment models. J. Statist. Phys. 139, 72–107.CrossRef Google Scholar

Eggemann, N. and Noble, S. D. (2011). The clustering coefficient of a scale-free random graph. Discrete Appl. Math. 159, 953–965.CrossRef Google Scholar

Faloutsos, M., Faloutsos, P. and Faloutsos, C. (1999). On power-law relationships of the internet topology. Comput. Commun. Rev. 29, 251–262.CrossRef Google Scholar

Gao, C. and Lafferty, J. (2017). Testing network structure using relations between small subgraph probabilities. Preprint. Available at https://arxiv.org/abs/1704.06742v1.Google Scholar

Garavaglia, A. (2019). Preferential attachment models for dynamic networks. Doctoral Thesis, Eindhoven University of Techonolgy.Google Scholar

Jordan, J. (2006). The degree sequences and spectra of scale-free random graphs. Random Structures Algorithms 29, 226–242.CrossRef Google Scholar

Krioukov, D. (2010). Hyperbolic geometry of complex networks. Phys. Rev. E 82, 036106, 18pp.CrossRef Google Scholar PubMed

Marcus, D. and Shavitt, Y. (2012). RAGE – a rapid graphlet enumerator for large networks. Comput. Networks 56, 810–819.CrossRef Google Scholar

Maugis, P.-A. G., Priebe, C. E., Olhede, S. C. and Wolfe, P. J. (2019). Statistical inference for network samples using subgraph counts. Preprint. Available at https://arxiv.org/abs/1701.00505v3.Google Scholar

Milo, R. (2002). Network motifs: simple building blocks of complex networks. Science 298, 824–827.CrossRef Google Scholar PubMed

Milo, R. (2004). Superfamilies of evolved and designed networks. Science 303, 1538–1542.CrossRef Google Scholar PubMed

Onnela, J.-P., Saramäki, J., Kertész, J. and Kaski, K. (2005). Intensity and coherence of motifs in weighted complex networks. Phys. Rev. E 71, 065103, 4pp.CrossRef Google Scholar PubMed

Prokhorenkova, L. O. (2017). General results on preferential attachment and clustering coefficient. Optimization Lett. 11, 279–298.CrossRef Google Scholar

Prokhorenkova, L. A. and Krot, A. V (2016). Local clustering coefficients in preferential attachment models. Dokl. Math. 94, 623–626.CrossRef Google Scholar

Prokhorenkova, L., Ryabchenko, A. and Samosvat, E. (2013). Generalized preferential attachment: tunable power-law degree distribution and clustering coefficient. In Algorithms and Models for the Web Graph, eds Bonato, A., Mitzenmacher, M. and Praat, P., Springer, Cham, pp. 185–202.Google Scholar

Shen-Orr, S. S., Milo, R., Mangan, S. and Alon, U. (2002). Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genet. 31, 64–68.CrossRef Google Scholar PubMed

Szymaski, J. (2005). Concentration of vertex degrees in a scale-free random graph process. Random Structures Algorithms 26, 224–236.CrossRef Google Scholar

van der Hofstad, R. (2019+). Random Graphs and Complex Networks, Vol. 2. In preparation.Google Scholar

van der Hofstad, R. (2017). Random Graphs and Complex Networks, Vol. 1. Cambridge University Press.CrossRef Google Scholar

van der Hofstad, R., van Leeuwaarden, J. S. H. and Stegehuis, C. (2017). Optimal subgraph structures in scale-free configuration models. Preprint. Available at https://arxiv.org/abs/1709.03466v1.Google Scholar

Vázquez, A., Pastor-Satorras, R. and Vespignani, A. (2002). Large-scale topological and dynamical properties of the Internet. Phys. Rev. E 65, 066130, 12pp.CrossRef Google Scholar PubMed

Wernicke, S. and Rasche, F. (2006). FANMOD: a tool for fast network motif detection. Bioinformatics 22, 1152–1153.CrossRef Google Scholar PubMed

Wuchty, S., Oltvai, Z. N. and Barabási, A.-L. (2003). Evolutionary conservation of motif constituents in the yeast protein interaction network. Nature Genet. 35, 176–179.CrossRef Google Scholar PubMed

Figure 1: Two labeled triangles.

Figure 2: Order of magnitude of $N_t(H)$ for all attainable connected directed graphs on three vertices and for $2<\tau<3$. The vertex color indicates the optimal vertex degree.

Figure 3: Order of magnitude of $N_t(H)$ for all attainable connected directed graphs on four vertices and for $2<\tau<3$. The vertex color indicates the optimal vertex degree.

Figure 4: The order of magnitude of this subgraph containing two merged copies of the subgraph of Figure 3(q) is $t^{{4}/{(\tau-1)}}$, so that the condition in Proposition 1 is not satisfied for the subgraph in Figure 3(q).

Figure 5: Order of magnitude of $N_t(H)$ for all merged triangles on four vertices and for $2<\tau<3$. The vertex color indicates the optimal vertex degree as in Figure 3.

Figure 6: Density approximation of the number of triangles in $10^4$ realizations of the preferential attachment model with $\tau=2.5$ and various values of t.

Article contents

Subgraphs in preferential attachment models

Abstract

Keywords

MSC classification

1. Introduction

1.1. Literature on subgraphs in PAMs

1.2. Our contribution

1.3. Organization of the paper

2. Definitions

2.1. PAM

Definition 1

2.2. PAMs as Pólya urn graphs

Definition 2

2.3. Labeled subgraphs

3. Main results

Definition 3

3.1. Scaling of the number of subgraphs

Theorem 1

Corollary 1

3.1.1. Interpretation of the optimization problem

3.1.2. Most likely degrees

3.2. Fluctuations of the number of subgraphs

Proposition 1

3.3. Exact constants: triangles

Theorem 2

3.3.1. The distribution of the number of triangles

Corollary 2

4. The probability of a subgraph being present

Lemma 1

4.1. Preliminary properties of Pólya urn graphs

Lemma 2

Lemma 3

4.2. Proof of Lemma 1

5. Proof of Theorem 1

Lemma 4

6. Proof of Theorem 2

Remark 1

7. Conditional concentration: proof of Proposition 1

7.1. Bound with overlapping subgraphs

Lemma 5

7.2. Criterion for conditional convergence

7.3. Nonconcentrated subgraphs

Proposition 2

Lemma 6

Lemma 7

8. Discussion

Acknowledgements

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests