Measuring reciprocity in a directed preferential attachment network

Tiandong Wang; Sidney I. Resnick

doi:10.1017/apr.2021.52

Measuring reciprocity in a directed preferential attachment network

Part of: Limit theorems Graph theory Stochastic processes

Published online by Cambridge University Press: 15 June 2022

Tiandong Wang

and

Sidney I. Resnick

Show author details

Tiandong Wang*: Affiliation:
Texas A&M University
Sidney I. Resnick*: Affiliation:
Cornell University
*: *Postal address: Department of Statistics, Texas A&M University, College Station, TX 77843, U.S.A. Email address: twang@stat.tamu.edu
**Postal address: School of Operations Research and Information Engineering, Cornell University, Ithaca, NY 14853, U.S.A. Email address: sir1@cornell.edu

Article contents

Abstract
Introduction
Growth of the degree sequences $\boldsymbol{D}^{\textrm{in}}_{\boldsymbol{v}}(n)$, $\boldsymbol{D}^{\textrm{out}}_{\boldsymbol{v}}(n)$
Reciprocity in the preferential attachment network
Discussion
Funding information
Competing interests
References

Rights & Permissions

Abstract

Empirical studies (e.g. Jiang et al. (2015) and Mislove et al. (2007)) show that online social networks have not only in- and out-degree distributions with Pareto-like tails, but also a high proportion of reciprocal edges. A classical directed preferential attachment (PA) model generates in- and out-degree distributions with power-law tails, but the theoretical properties of the reciprocity feature in this model have not yet been studied. We derive asymptotic results on the number of reciprocal edges between two fixed nodes, as well as the proportion of reciprocal edges in the entire PA network. We see that with certain choices of parameters, the proportion of reciprocal edges in a directed PA network is close to 0, which differs from the empirical observation. This points out one potential problem of fitting a classical PA model to a given network dataset with high reciprocity, and indicates that alternative models need to be considered.

Keywords

Reciprocity preferential attachment in- and out-degrees

MSC classification

Primary: 60F15: Strong theorems 60G46: Martingales and classical analysis

Secondary: 05C82: Small world graphs, complex networks

Type: Original Article
Information: Advances in Applied Probability , Volume 54 , Issue 3 , September 2022 , pp. 718 - 742

DOI: https://doi.org/10.1017/apr.2021.52 [Opens in a new window]
Copyright: © The Author(s), 2022. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

In social network analysis, reciprocal edges characterize communication between two users. For instance, on Facebook, one user leaves messages on another user’s wall page, and a response from the target user then creates a reciprocal edge. Reciprocity, which is classically defined as the proportion of reciprocal edges (cf. [Reference Newman, Forrest and Balthrop12, Reference Wasserman and Faust21]), is one important network metric to measure interactions among individual users. The directed network constructed from Facebook wall posts [Reference Viswanath, Mislove, Cha and Gummadi17] is one example of social networks with a large proportion of reciprocal edges. The study on eight different types of networks in [Reference Jiang, Zhang and Towsley7] shows that online social networks (e.g. [Reference Cha, Mislove and Gummadi3, Reference Java, Song, Finin and Tseng6, Reference Magno9, Reference Mislove10, Reference Viswanath, Mislove, Cha and Gummadi17]) tend to have a higher proportion of reciprocal edges, compared to other types of networks such as biological networks, communication networks, software call graphs, and P2P networks.

Another widely observed feature of directed social networks is the scale-free property, where both in- and out-degree distributions have Pareto-like tails. The directed preferential attachment (PA) network model is appealing (cf. [Reference Bollobás, Borgs, Chayes and Riordan2, Reference Krapivsky and Redner8]), since theoretically the directed PA mechanism generates a network with the scale-free property because nodes with large degrees are likely to attract more edges than those with small degrees (cf. [Reference Resnick and Samorodnitsky14, Reference Samorodnitsky15, Reference Wan, Wang, Davis and Resnick18, Reference Wang and Resnick20]). However, the asymptotic behavior of the proportion of reciprocal edges in a directed PA model has not yet been explored in the literature. In this paper, we derive asymptotic results about (1) the number of reciprocal edges between two fixed nodes; (2) the proportion of reciprocal edges in a directed PA network, provided that the network has a large number of edges; and (3) the first time a reciprocal edge appears between two distinct fixed nodes.

Our theoretical results suggest that for certain choices of model parameters, especially when the proportion of edges added between two existing nodes is small, the proportion of reciprocal edges in the entire graph is close to 0, even though the total number of reciprocal edges between two fixed nodes may be of order $O(n^a)$ , $a\in (0,1)$ . Such behavior flags potential problems for fitting a directed PA model in practice. When fitting a directed PA model to a real network with high reciprocity using existing methods developed in [Reference Wan, Wang, Davis and Resnick18, Reference Wan, Wang, Davis and Resnick19], there is no guarantee that the calibrated model will also have a high proportion of reciprocal edges. Such a discrepancy indicates a poor fit of the PA model, since the fitted model fails to capture the important feature of high reciprocity. In these cases, variants of the directed PA model need to be considered. For instance, [Reference Cheng, Romero, Meeder and Kleinberg4] provides several different ways to predict the reciprocal edges between two given nodes, and one may incorporate those features to construct a refined network model that is both scale-free and of high reciprocity.

The rest of the paper is organized as follows. Section 1.1 provides the notation and definitions necessary to specify a growing sequence of graphs that evolve according to PA. The definitions use model parameters that control the growth of power law sequences. In Section 2, we give the power law growth of the in- and out-degree sequences. This leads to the main results of the paper, in Section 3:

1. For fixed nodes i, j, the number of reciprocal edges between i and j evolves as a power law.
2. For specified subsets of the parameters, the proportion of reciprocal edges in the graph goes to 0, showing that many real data sets with significant reciprocity do not follow this model.
3. For fixed nodes i, j we provide information about the first time a reciprocal pair of edges forms between i and j.

Section 4 contains a short discussion of theoretical results and future research directions, and the appendix gives some lemmas and proofs.

1.1. Model setup

Here is the specification of the classical directed PA model. Initialize the model with the graph G(0), which consists of one node, labeled node 1, and a self-loop. After n steps in the construction, $G(n)=(V(n), E(n))$ is the graph with node set V(n) with $V(0) = \{1\}$ and $|V(0)| = {1}$ and set of directed edges E(n) such that an ordered pair $(i,j)\in E(n)$ , $i,j\in V(n)$ , represents a directed edge $i\mapsto j$ . When $n=0$ , we have $E(0) = \{(1,1)\}$ . For later use, self-loops are not counted for reciprocity.

Let $\bigl(D^{\text{in}}_v(n), D^{\text{out}}_v(n)\bigr)$ be the in- and out-degrees of node $v \in V(n)$ with the convention that if $v \notin V(n)$ , $D^{\text{in}}_v(n) = 0 =D^{\text{out}}_v(n)$ . From G(n) to $G(n+1)$ , one of three scenarios happens:

1. With probability $\alpha$ , we add a new edge (v, w), where $w\in V(n)$ , and $v\notin V(n)$ is a new node. The existing node w is chosen with probability
\[\frac{D^{\text{in}}_w(n)+\delta_{\text{in}}}{\sum_{w\in V(n)} \big(D^{\text{in}}_w(n)+\delta_{\text{in}}\big)}= \frac{D^{\text{in}}_w(n)+\delta_{\text{in}}}{n+1+\delta_{\text{in}} |V(n)|}.\]
2. With probability $\beta$ , a new edge (v, w) is added between two existing nodes $v,w\in V(n)$ , where the starting node v and the ending node w are chosen independently with probability
\begin{align*}&\frac{D^{\text{in}}_w(n)+\delta_{\text{in}}}{\sum_{w\in V(n)} \big(D^{\text{in}}_w(n)+\delta_{\text{in}}\big)}\frac{D^{\text{out}}_v(n)+\delta_{\text{out}}}{\sum_{v\in V(n)} \big(D^{\text{out}}_v(n)+\delta_{\text{out}}\big)} \\[4pt] &\qquad=\frac{D^{\text{in}}_w(n)+\delta_{\text{in}}}{n+1+\delta_{\text{in}} |V(n)|}\frac{D^{\text{out}}_v(n)+\delta_{\text{out}}}{n+1+\delta_{\text{out}} |V(n)|}.\end{align*}
For brevity of notation, we set
(1.1) \begin{equation}A_{wv}(n) \,:\!=\, \frac{D^{\text{in}}_w(n)+\delta_{\text{in}}}{n+1+\delta_{\text{in}} |V(n)|}\frac{D^{\text{out}}_v(n)+\delta_{\text{out}}}{n+1+\delta_{\text{out}}|V(n)|};\end{equation}
then the attachment probability in the $\beta$ -scheme is $\beta A_{wv}(n)$ .
3. With probability $\gamma$ , we add a new edge (w, v), where $w\in V(n)$ , and $v\notin V(n)$ is a new node. The existing node w is chosen with probability
\[\frac{D^{\text{out}}_w(n)+\delta_{\text{out}}}{\sum_{w\in V(n)} \big(D^{\text{out}}_w(n)+\delta_{\text{out}}\big)}=\frac{D^{\text{out}}_w(n)+\delta_{\text{out}}}{n+1+\delta_{\text{out}} |V(n)|}.\]

We assume $\alpha+\beta+\gamma=1$ , $\beta\in [0,1)$ , and $\delta_{\text{in}},\delta_{\text{out}}>0$ . Owing to the $\alpha$ - and $\gamma$ -schemes, $|V(n)|-1$ follows a binomial distribution with size n and success probability $\alpha+\gamma=1-\beta$ , so that $|V(n)|\stackrel{\text{a.s.}}{\longrightarrow}\infty$ as $n\to\infty$ . For $v\ge 1$ , we define $S_v$ to be the time when node v is created, i.e.

(1.2)

\begin{equation}S_v \,:\!=\, \inf\big\{n\ge 0\,:\, |V(n)| = v\big\}.\end{equation}

Since we use the convention that $D^{\text{in}}_v(n) = 0$ and $D^{\text{out}}_v(n) = 0$ if $S_v>n$ , we have by (1.1) that $A_{vw}(n)\in [0,1]$ for all n.

Let $\mathcal{F}_n$ be the $\sigma$ -field generated by observing the network up to the creation of the nth new edge. Suppose $\tau$ is a stopping time with respect to the filtration $\big(\mathcal{F}_n\big)_{n\ge 0}$ ; then

\[\mathcal{F}_{\tau} = \{F\,:\, F\cap\{\tau=n\}\in \mathcal{F}_n\}.\]

By (1.2), we see that $S_v$ is a stopping time with respect to $\big(\mathcal{F}_n\big)_{n\ge 0}$ . For $n\ge k\ge 0$ , we have

\[\{S_v+k=n\} = \{S_v=n-k\}\in \mathcal{F}_{n-k}\subset \mathcal{F}_n,\]

so $S_v+k$ , $k\ge 0$ , is a stopping time with respect to $\big(\mathcal{F}_n\big)_{n\ge 0}$ . Note that for $v>n-k$ , $\{S_v=n-k\}=\emptyset\in \mathcal{F}_{n-k}\subset \mathcal{F}_n$ . In what follows, we write $\mathbb{E}^{\mathcal{F}_n}(\cdot) \,:\!=\, \mathbb{E}\big({\cdot}|\mathcal{F}_n\big)$ .

2. Growth of the degree sequences $\boldsymbol{D}^{\textrm{in}}_{\boldsymbol{v}}(n)$ , $\boldsymbol{D}^{\textrm{out}}_{\boldsymbol{v}}(n)$

The parameters controlling the behavior of the model are $\alpha,\beta,\gamma,\delta_{\text{in}}, \delta_{\text{out}}$ , and we now define two new parameters which will serve as power-law exponents:

(2.1)

\begin{equation}c_1 = \frac{\alpha+\beta}{1+\delta_{\text{in}}(1-\beta)}, \qquad c_2 = \frac{\beta+\gamma}{1+\delta_{\text{out}}(1-\beta)}.\end{equation}

From (2.1), $0<c_1<\alpha+\beta $ , $0<c_2<\beta+\gamma$ , and $0<c_1+c_2 < 1+\beta$ . In fact, we can reparametrize the PA model using $(\alpha,\beta,\gamma, c_1, c_2)$ , which leads to an estimation method for model parameters alternative to unavailable maximum likelihood techniques; see [Reference Wan, Wang, Davis and Resnick19] for details.

The following proposition summarizes the power-law growth of $\big(D^{\text{in}}_v(n), D^{\text{out}}_v(n)\big)$ , which is controlled by the parameters $c_1,c_2$ . From these growth rates, we will derive the limiting behavior of the number of reciprocal edges between two fixed nodes in Section 3.1.

Proposition 2.1. For $v\ge 1$ , there are random variables $\xi^{{in}}_v$ , $\xi^{{out}}_v$ satisfying $\mathbb{P}\big(\xi^{{in}}_v\in (0,\infty)\big)=1=\mathbb{P}\big(\xi^{{out}}_v\in (0,\infty)\big)$ , such that

\begin{align*}\frac{D^{{in}}_v(n)}{n^{c_1}}\stackrel{{a.s.}}{\longrightarrow} \xi^{{in}}_v,\qquad\frac{D^{{out}}_v(n)}{n^{c_2}}\stackrel{{a.s.}}{\longrightarrow} \xi^{{out}}_v.\end{align*}

The proof of Proposition 2.1 is given in Appendix A.3, after the statement and proof of two lemmas.

3. Reciprocity in the preferential attachment network

In this section, we focus on the asymptotic behavior of the number of reciprocal edges between two fixed, distinct nodes i and j as well as that of the proportion of reciprocal edges in the entire graph. We also consider the first time a reciprocal edge forms between two distinct nodes.

To assess goodness of fit of the directed PA model to a particular dataset, it is useful to evaluate statistics to see whether the empirical values match those from the fitted model. The statistic we focus on here is reciprocity. If there is a significant discrepancy between the reciprocity measure for the fitted theoretical PA model and that of the actual network data, then we conclude that variants of the classical PA model should be considered.

Given a directed graph $G=(V,E)$ , let $L_{(i,j)}= L_{(i,j)}(G)$ be the number of directed edges (i, j) in the graph G, for $i\neq j$ . Then define the reciprocity coefficient, $R=R(G)$ , as

(3.1)

\begin{equation}R = \frac{2}{|E|}\sum_{i,j\in V:i<j} \min\left\{L_{(i,j)}, L_{(\,j,i)}\right\}.\end{equation}

Note that a node pair can be counted more than once but self-loops are never counted. For example, consider the graph given in Figure 1, where there are $|E|=6$ edges and node set $V=\{1,2,3\}$ . We distinguish multiple edges between two nodes by different colors, and if a pair of reciprocal edges are observed, we label the pair with the same color. The graph in Figure 1 contains a pair of blue edges and a pair of red edges, thus giving $R_6 = 4/6 = 0.667$ . In R, we can easily compute the reciprocity coefficient by applying the dyad_census() function in the igraph package to the graph object, and its mut value outputs the total number of unordered node pairs $\{i,j\}$ with reciprocal connections (i, j) and (j, i), allowing multiplicity.

Figure 1. A graphical illustration of reciprocal edges with edge set $E=\{(1,2),(1,2),(1,2),$ $(2,1),(2,1), (2,3)\}$ and $V=\{1,2,3\}$ . If a pair of reciprocal edges are observed, we label them with the same color. Here we have $R_6 = 4/6 = 0.667$ .

Consider a sequence of graphs $\left\{G(n) = \bigl(V(n), E(n)\bigr),n\geq 1\right\}$ constructed following the PA rule with parameters $\big(\alpha,\beta,\gamma,\delta_{\text{in}},\delta_{\text{out}}\big)$ as outlined in Section 1.1. For two nodes $i\neq j\in V(n)$ , write $L_{(i,j)}(n) = L_{(i,j)}(G(n))$ , and write the reciprocity coefficient for the PA network G(n) as $R_n^{\text{pa}}\,:\!=\, R(G(n))$ . We emphasize that the definition in (3.1) excludes self-loops when counting reciprocal edges. In Sections 3.1 and 3.2, we study the asymptotic behavior of $L_{(i,j)}(n)$ for fixed $i\neq j$ , and $R_n^{\text{pa}}$ in a PA network G(n), respectively.

3.1. Reciprocal edges between two fixed nodes

In a directed PA network, the total number of reciprocal edges between two fixed nodes i, j is equal to

\[L_{i\leftrightarrow j}(n)\,:\!=\,2\min\left\{L_{(i,j)}(n), L_{(\,j,i)}(n)\right\}.\]

We first study the limiting behavior of the number of edges between two fixed nodes $i\neq j$ , $L_{(i,j)}(n)$ , when n is large. The asymptotics of $L_{i\leftrightarrow j}(n)$ then follow from a continuous mapping argument, and this leads in Section 3.2 to a study of the asymptotic behavior of $R_n^{\text{PA}}$ . The asymptotic behavior of $L_{(i,j)}(n)$ also assists in the study, in Section 3.3, of the behavior of the first time a reciprocal pair is formed between i and j.

3.1.1. Convergence of $L_{(i,j)}(n)$

Theorem 3.1 gives the main asymptotic results, but we start by presenting a lemma on $A_{ij}(n)$ which is useful for the proof of Theorem 3.1.

Lemma 3.1. Recall the definition of $A_{ij}(n)$ in (1.1) and the notation in Proposition 2.1. For fixed $1\le i<j$ ,

(3.2)

\begin{equation}\frac{A_{ij}(n)}{n^{c_1+c_2-2}}\longrightarrow \frac{1}{\big(1+\delta_{{in}}(1-\beta)\big)\big(1+\delta_{{out}}(1-\beta)\big)}\xi_j^{{in}} \xi_i^{{out}},\quad {almost\ surely\ (a.s.)\ and\ in\ L_1}.\end{equation}

This further gives the following:

1. When $c_1+c_2>1$ , there exist constants $C_{ij}>0$ such that
(3.3) \begin{align}\frac{c_1+c_2-1}{n^{c_1+c_2-1}}\sum_{k=S_j}^{S_j+n} A_{ij}(k) &\longrightarrow C_{ij}\xi_j^{{in}} \xi_i^{{out}}, \qquad {a.s.\ and\ in\ L_1}.\end{align}
2. When $c_1+c_2=1$ , there exist constants $C^{\prime}_{ij}>0$ such that
(3.4) \begin{align}\frac{1}{\log n}\sum_{k=S_j}^{S_j+n} A_{ij}(k) &\longrightarrow C^{\prime}_{ij} \xi_j^{{in}} \xi_i^{{out}},\qquad {a.s.\ and\ in\ L_1}. \end{align}
3. When $c_1+c_2<1$ , there exist constants $C^{\prime\prime}_{ij}>0$ such that
(3.5) \begin{align}\frac{1}{n^{c_1+c_2-1}}\sum_{k=n}^{\infty} A_{ij}(k) &\longrightarrow C^{\prime\prime}_{ij} \xi_j^{{in}} \xi_i^{{out}}, \qquad {a.s.\ and\ in\ L_1},\end{align}
which further implies $\sum_{k\ge 1} \mathbb{E}(A_{ij}(k)) <\infty$ and $\sum_{k\ge 1} A_{ij}(k) <\infty$ a.s.

In addition, we have similar convergence results for $A_{ji}(n)$ by replacing $A_{ij}(n)$ , $\xi^{{in}}_j$ , and $\xi^{{out}}_i$ with $A_{ji}(n)$ , $\xi^{{in}}_i$ , and $\xi^{{out}}_j$ , respectively.

Proof. By Lemma 2.1 as well as the fact that $|V(n)|/n\stackrel{\text{a.s.}}{\longrightarrow} 1-\beta$ , we have

\begin{align*}\frac{A_{ij}(n)}{n^{c_1+c_2-2}}&= \frac{\big(D^{\text{out}}_i(n)+\delta_{\text{out}}\big)\big(D^{\text{in}}_j(n)+\delta_{\text{in}}\big)}{n^{c_1+c_2}}\frac{n^{2}}{\big(n+1+\delta_{\text{in}} |V(n)|\big)\big(n+1+\delta_{\text{out}} |V(n)|\big)}\\[4pt] &\stackrel{\text{a.s.}}{\longrightarrow} \frac{1}{\big(1+\delta_{\text{in}}(1-\beta)\big)\big(1+\delta_{\text{out}}(1-\beta)\big)}\xi_j^{\text{in}} \xi_i^{\text{out}}.\end{align*}

Note that once we show

(3.6)

\begin{align}\sup_{n\ge 1}\mathbb{E}\left[\left(\frac{A_{ij}(n)}{n^{c_1+c_2-2}} \right)^2\right]<\infty,\end{align}

we have by [Reference Durrett5, Theorem 4.6.2] that $\big\{A_{ij}(n)/n^{c_1+c_2-2}\,:\, n\ge 1\big\}$ is uniformly integrable, which gives the $L_1$ -convergence in (3.2).

To prove (3.6), we now use the Cauchy–Schwarz inequality to obtain

\begin{align*}\mathbb{E}\left[\left(\frac{A_{ij}(n)}{n^{c_1+c_2-2}}\right)^2\right]&\le \mathbb{E}\left[\frac{\big(D^{\text{out}}_i(n)+\delta_{\text{out}}\big)^2\big(D^{\text{in}}_j(n)+\delta_{\text{in}}\big)^2}{n^{2\big(c_1+c_2\big)}}\right]\\[5pt] &\le \frac{1}{n^{2\big(c_1+c_2\big)}}\left(\mathbb{E}\left[\big(D^{\text{out}}_i(n)+\delta_{\text{out}}\big)^4\right]\mathbb{E}\left[\big(D^{\text{in}}_i(n)+\delta_{\text{in}}\big)^4\right]\right)^{1/2}.\end{align*}

Then by Lemma A.2 we have

\begin{align*}&\sup_{n\ge 1}\mathbb{E}\left[\left(\frac{A_{ij}(n)}{n^{c_1+c_2-2}} \right)^2\right] \le \sup_{n\ge 1}\frac{\sqrt{\mathbb{E}\left[\big(D^{\text{out}}_i(n)+\delta_{\text{out}}\big)^4\right]}}{n^{2c_2}}\times \sup_{n\ge 1}\frac{\sqrt{\mathbb{E}\left[\big(D^{\text{in}}_j(n)+\delta_{\text{in}}\big)^4\right]}}{n^{2c_1}}<\infty.\end{align*}

With (3.2) established, the results in (3.3)–(3.5) follow directly from Karamata’s theorem (cf. [Reference Resnick13, Theorem 2.1]).

The results for $A_{ji}(n)$ follow from similar reasoning.

We now give the asymptotic behavior of $L_{(i,j)}(n)$ in a directed PA model.

Theorem 3.1. Consider two fixed nodes $1\le i< j$ . We have the following:

(i) If $c_1+c_2> 1$ , then there exists some random variable $\xi_{ij}$ , satisfying $\mathbb{P}\big(\xi_{ij}\in (0,\infty)\big) = 1 $ , such that as $n\to\infty$ ,
(3.7) \begin{equation} \frac{L_{(i,j)}(n)}{n^{c_1+c_2-1}}\stackrel{{a.s.}}{\longrightarrow} \xi_{ij}. \end{equation}
So for large n, the number of (i, j) edges is of order $n^{c_1+c_2-1}$ .
(ii) If $c_1+c_2= 1$ , then there exists some random variable $\zeta_{ij}$ , satisfying $\mathbb{P}\big(\zeta_{ij}\in (0,\infty)\big) = 1 $ , such that
(3.8) \begin{equation} \frac{L_{(i,j)}(n)}{\log n}\stackrel{{a.s.}}{\longrightarrow} \zeta_{ij}. \end{equation}
So for large n, the number of (i, j) edges is of order $\log n$ .
(iii) If $c_1+c_2<1$ , then for any $i<j$ , there is a last time for an (i, j) edge to form. Furthermore, as $n\to\infty$ ,
(3.9) \begin{equation} L_{(i,j)}(n)\uparrow L_{(i,j)}(\infty)<\infty, \qquad {a.s.}, \end{equation}
and $L_{(i,j)}(n)-L_{(i,j)}(\infty) = 0$ a.s. for n large.

In addition, we obtain similar convergence results for $L_{(\,j,i)}(n)$ by replacing $L_{(i,j)}(n)$ , $L_{(i,j)}(\infty)$ , $\xi_{ij}$ , and $\zeta_{ij}$ with $L_{(\,j,i)}(n)$ , $L_{(\,j,i)}(\infty)$ , $\xi_{ji}$ , and $\zeta_{ji}$ , respectively.

Proof. Set

\[\Delta_k(i,j) = \boldsymbol{1}_{\big\{E(k)= E(k-1)\cup \{(i,j)\}\big\}};\]

i.e. $\Delta_k(i,j) = 1$ if a directed edge (i, j) is created from $G(k-1)$ to G(k). For $1\le i<j$ , notice that

(3.10)

\begin{equation}L_{(i,j)}\big(S_j+n\big) = \sum_{k=1}^{S_j+n} \Delta_k(i,j)=\sum_{k=0}^{n} \Delta_{S_j+k}(i,j).\end{equation}

For $n\ge 0$ ,

(3.11)

\begin{align}\mathbb{E}^{\mathcal{F}_{S_j+n}}&\left(\Delta_{S_j+n+1}(i,j)\right)= \beta A_{ji}\big(S_j+n\big),\end{align}

and

(3.12)

\begin{align}\mathbb{E}^{\mathcal{F}_{S_j-1}}&\big(\Delta_{S_j}(i,j)\big)= \gamma\frac{D^{\text{out}}_i\big(S_j-1\big)+\delta_{\text{out}}}{S_j+\delta_{\text{out}} (\,j-1)}.\end{align}

When $c_1+c_2\ge 1$ , (3.3) and (3.4) suggest that $\sum_{k=0}^{\infty} A_{ji}\big(S_j+k\big) = \infty$ a.s.; then we apply [Reference Durrett5, Theorem 4.5.5] to get

(3.13)

\begin{equation}\frac{L_{(i,j)}\big(S_j+n\big) }{\gamma\frac{D^{\text{out}}_i\big(S_j-1\big)+\delta_{\text{out}}}{S_j+\delta_{\text{out}} (\,j-1)}+\sum_{k=0}^{n-1} \beta A_{ji}\big(S_j+k\big) }\stackrel{\text{a.s.}}{\longrightarrow} 1.\end{equation}

Also, by a similar argument as in (3.3), we have that when $c_1+c_2>1$ , there exists some constant $\widetilde{C}_{ij}>0$ such that

(3.14)

\begin{align}&\frac{1}{n^{c_1+c_2-1}}\sum_{k=0}^{n-1} \beta A_{ji}\big(S_j+k\big)\stackrel{\text{a.s.}}{\longrightarrow} \beta \widetilde{C}_{ij} \xi^{\text{out}}_i\xi^{\text{in}}_j .\end{align}

Then combining (3.13) with (3.14) gives

(3.15)

\begin{equation}\frac{L_{(i,j)}\big(S_j+n\big) }{n^{c_1+c_2-1}}\stackrel{\text{a.s.}}{\longrightarrow} \beta \widetilde{C}_j\xi^{\text{out}}_i\xi^{\text{in}}_j.\end{equation}

Analogous reasoning is also applicable to the case $c_1+c_2=1$ , where the scaling function $n^{c_1+c_2-1}$ is replaced with $\log n$ , according to (3.4).

Next, consider the case $c_1+c_2<1$ . By the corollary in [Reference Neveu11, Chapter IV.6, p. 151], we have a.s.

\begin{align*}\Bigg\{\sum_{k\ge S_j}\Delta_{k+1}(i,j)<\infty\Bigg\}&= \Bigg\{\sum_{k\ge S_j}\mathbb{E}^{\mathcal{F}_k}\big(\Delta_{k+1}(i,j)\big)<\infty\Bigg\}\\ &= \Bigg\{\sum_{k\ge S_j}\beta A_{ji}(k)<\infty\Bigg\}.\end{align*}

Using a similar argument as in Lemma 3.1(3), we have $\sum_{k\ge S_j} A_{ji}(k)<\infty$ a.s., thus giving a.s.

\[\sum_{k\ge S_j}\Delta_{k+1}(i,j)<\infty. \]

Hence, with probability 1, there is a finite number of (i, j) edges that can be formed, and there exists a last time for an (i, j) edge to form.

Note that by the definition of $L_{i\leftrightarrow j}(n)$ , applying continuous mapping gives the asymptotic results for $L_{i\leftrightarrow j}(n)$ , which also depends on the value of $c_1+c_2$ :

(1) If $c_1+c_2> 1$ , then there exists some random variable $\overline{\xi}_{ij}$ , satisfying $\mathbb{P}\big(\overline{\xi}_{ij}\in (0,\infty)\big) = 1 $ , such that as $n\to\infty$ ,
\[\frac{L_{i\leftrightarrow j}(n)}{n^{c_1+c_2-1}}\stackrel{\text{a.s.}}{\longrightarrow} \overline{\xi}_{ij}.\]
So for large n, the number of reciprocal edges between i and j is of order $n^{c_1+c_2-1}$ .
(2) If $c_1+c_2= 1$ , then there exists some random variable $\overline{\zeta}_{ij}$ , satisfying $\mathbb{P}\big(\overline{\zeta}_{ij}\in (0,\infty)\big) = 1 $ , such that
\[\frac{L_{i\leftrightarrow j}(n)}{\log n}\stackrel{\text{a.s.}}{\longrightarrow} \overline{\zeta}_{ij}.\]
So for large n, the number of reciprocal edges between i and j is of order $\log n$ .
(3) If $c_1+c_2<1$ , then a.s.
\[L_{i\leftrightarrow j}(n)\uparrow L_{i\leftrightarrow j}(\infty)<\infty.\]

Now consider a special case with $\gamma=0$ and $\delta_{\text{in}} = \delta_{\text{out}} = \delta>0$ ; then

\[c_1+c_2<1 \qquad\Leftrightarrow\qquad \alpha> \frac{1}{1+\delta}.\]

From Theorem 3.1, we see that when the probability of generating a new node in a PA network at each step is too high, the node set grows strongly and it is difficult to form reciprocal edges. Then the number of reciprocal edges between two nodes is finite a.s. If $\beta = 0$ , then for $\delta_{\text{in}}, \delta_{\text{out}}>0$ , we always have $c_1+c_2<1$ , indicating that the number of edges between two fixed nodes is finite a.s. when no edge is added between two existing nodes.

3.2. Reciprocity in the entire graph

Here we consider the proportion of reciprocal edges in the entire PA network, $R_n^{\text{pa}} $ , and the next theorem specifies the asymptotic behavior of $R_n$ for $0< c_1+c_2< 5/3$ .

Theorem 3.2. Suppose $R_n^{{pa}}$ is as defined in (3.1). Then for $0< c_1+c_2< 5/3$ , we have

\[R_n^{{pa}} \stackrel{p}{\longrightarrow} 0,\qquad n\to\infty.\]

When $c_1+c_2<5/3$ , the reciprocity coefficient $R_n^{\text{pa}} $ is likely to be small, provided that the number of edges in the PA network is large. In particular, when $c_1+c_2\le 1$ , i.e. in the second and third cases in Theorem 3.1, we have $R_n^{\text{pa}}\stackrel{p}{\longrightarrow} 0$ .

Proof. It suffices to show that $\mathbb{E}\big(R_n^{\text{pa}} \big)\to 0$ for $0< c_1+c_2< 5/3$ , as $n\to \infty$ . Recall that

\[\Delta_k(i,j) = \boldsymbol{1}_{\big\{E(k)= E(k-1)\cup \{(i,j)\}\big\}},\]

so that $\Delta_k(i,j)\boldsymbol{1}_{\{(\,j,i)\in E(k-1)\}}$ indicates the event that from $G(k-1)$ to G(k), an edge (i, j) is created when (j, i) already exists in $G(k-1)$ . By the definition of $R_n^{\text{pa}} $ , we have

\begin{align*}R_n^{\text{pa}} &\le \frac{2}{n+1}\sum_{j=1}^n \sum_{i<j} \sum_{k= S_j}^n\Delta_k(i,j)\boldsymbol{1}_{\{(\,j,i)\in E(k-1)\}}\\ &\quad+\frac{2}{n+1}\sum_{j=1}^n \sum_{i<j} \sum_{k= S_j}^n\Delta_k(i,j)\boldsymbol{1}_{\{(i,j)\in E(k-1)\}}\\ &= \frac{2}{n+1}\sum_{j=1}^n \sum_{i<j} \sum_{k= S_j+1}^n\Delta_k(i,j)\boldsymbol{1}_{\{(\,j,i)\in E(k-1)\}}\\ &\quad+\frac{2}{n+1}\sum_{j=1}^n \sum_{i<j} \sum_{k= S_j+1}^n\Delta_k(i,j)\boldsymbol{1}_{\{(i,j)\in E(k-1)\}}\\ &=\!:\, Q_1(n) + Q_2(n).\end{align*}

For $Q_1(n)$ , we have

(3.16)

\begin{align}\mathbb{E}(Q_1(n)) &=\frac{2}{n+1}\sum_{j=1}^n \sum_{i<j}\mathbb{E}\left(\sum_{k= S_j+1}^n \Delta_k(i,j)\boldsymbol{1}_{\{(\,j,i)\in E(k-1)\}}\right)\nonumber\\&= \frac{2}{n+1}\sum_{j=1}^n \sum_{i<j}\mathbb{E}\left(\sum_{k= S_j+1}^n \mathbb{E}^{\mathcal{F}_{k-1}}\left(\Delta_k(i,j)\right)\boldsymbol{1}_{\{(\,j,i)\in E(k-1)\}}\right)\nonumber\\&= \frac{2}{n+1}\sum_{j=1}^n \sum_{i<j}\mathbb{E}\left(\sum_{k= S_j+1}^n \beta A_{ji}(k-1)\boldsymbol{1}_{\{(\,j,i)\in E(k-1)\}}\right).\end{align}

Since $S_j\ge j-1$ for $j\ge 2$ , (3.16) implies

(3.17)

\begin{align}\mathbb{E}(Q_1(n)) &\le \frac{2}{n+1}\sum_{j=2}^n \sum_{i=1}^{j-1}\sum_{k= j}^n \mathbb{E}\left(A_{ji}(k-1)\boldsymbol{1}_{\big\{(\,j,i)\in E(k-1), k\ge S_j+1\big\}}\right).\end{align}

By the Cauchy–Schwarz inequality, we have

(3.18)

\begin{align}\mathbb{E}&\left(A_{ji}(k-1)\boldsymbol{1}_{\big\{(\,j,i)\in E(k-1), k\ge S_j+1\big\}}\right)\nonumber\\ &\le \left[\mathbb{E}\left(A^2_{ji}(k-1)\right)\right]^{1/2} \left[\mathbb{P}\big((\,j,i)\in E(k-1), k\ge S_j+1\big)\right]^{1/2}\nonumber\\ &\le \left[\mathbb{E}\left(A^2_{ji}(k-1)\right)\right]^{1/2}\left[\sum_{l=j}^{k-1}\mathbb{P}\big(E(l)= E(l-1)\cup \{(\,j,i)\}, l\ge S_j+1\big)\right]^{1/2}\nonumber\\ &\le \left[\mathbb{E}\left(A^2_{ji}(k-1)\right)\right]^{1/2}\left[\sum_{l=j}^{k-1}\mathbb{E}\left(A_{ij}(l)\right)+\alpha\mathbb{E}\left(\frac{D^{\text{in}}_i\big(S_j-1\big)+\delta_{\text{in}}}{S_j+\delta_{\text{in}} (\,j-1)}\right)\right]^{1/2}.\end{align}

When $c_1+c_2>1$ , Lemma A.2 and (A.11) together imply that there exist constants $M_1$ , $M_2$ , $M_3>0$ such that for $i<j\le k\le n$ ,

\[\mathbb{E}\left(A^2_{ji}(k-1)\right) \le M_1\frac{k^{2\big(c_1+c_2-2\big)}}{i^{2c_2}j^{2c_1}},\quad \mathbb{E}\left(\frac{D^{\text{in}}_i\big(S_j-1\big)+\delta_{\text{in}}}{S_j+\delta_{\text{in}} (\,j-1)}\right)\le M_3\frac{j^{c_1-1}}{i^{c_1}},\]

and that for $j\le l\le k-1$ ,

\[ \mathbb{E}\left(A_{ij}(l)\right) \le M_2\frac{l^{c_1+c_2-2}}{i^{c_1}j^{c_2}}.\]

Therefore, when $c_1+c_2>1$ , we have

\begin{align*}\mathbb{E}&\left(A_{ji}(k-1)\boldsymbol{1}_{\big\{(\,j,i)\in E(k-1), k\ge S_j+1\big\}}\right)\\&\le \sqrt{M_1}\frac{k^{c_1+c_2-2}}{i^{c_2}j^{c_1}} \left(\frac{{M_2}}{c_1+c_2-1}\frac{k^{c_1+c_2-1}}{i^{c_1}j^{c_2}}+\alpha M_3 j^{c_1-1}/i^{c_1}\right)^{1/2},\end{align*}

and since $k^{c_1+c_2-1}/\big(i^{c_1}j^{c_2}\big)\ge j^{c_1-1}/i^{c_1}$ for $k\ge j$ , there exists some constant $M>0$ such that

(3.19)

\begin{align}\mathbb{E}&\bigg(A_{ji}(k-1)\boldsymbol{1}_{\big\{(\,j,i)\in E(k-1), k\ge S_j+1\big\}}\bigg)\le M \frac{k^{\frac{3}{2}(c_1+c_2)-\frac{5}{2}}}{i^{c_1/2+c_2}j^{c_1+c_2/2}}.\end{align}

Then (3.17) leads to

\begin{align*}\mathbb{E}(Q_1(n)) &\le \frac{2M}{n+1}\sum_{j=2}^n \sum_{i=1}^{j-1}\sum_{k= j}^n \frac{k^{\frac{3}{2}(c_1+c_2)-\frac{5}{2}}}{i^{c_1/2+c_2}j^{c_1+c_2/2}}\\&\le \frac{2M}{3/2(c_1+c_2-1)}n^{\frac{3}{2}(c_1+c_2)-\frac{5}{2}}\sum_{j=2}^n \sum_{i=1}^{j-1} i^{-(c_1/2+c_2)} j^{-(c_1+c_2/2)}.\end{align*}

If $c_1/2+c_2>1$ , $c_1+c_2/2>1$ , and $c_1+c_2<5/3$ , then

\begin{align*}\mathbb{E}(Q_1(n)) &\le \frac{2M}{3/2(c_1+c_2-1)}n^{\frac{3}{2}(c_1+c_2)-\frac{5}{2}}\sum_{i=1}^{\infty} i^{-(c_1/2+c_2)} \sum_{j=2}^\infty j^{-\left(c_1+c_2/2\right)} \to 0.\end{align*}

If $c_1/2+c_2< 1$ , $c_1+c_2/2< 1$ , then

\begin{align*}\mathbb{E}(Q_1(n)) &\le \frac{2M}{3/2(c_1+c_2-1)} n^{\frac{3}{2}(c_1+c_2)-\frac{5}{2}}\frac{n^{2-3/2(c_1+c_2)}}{(1-c_1/2-c_2)(1-c_1-c_2/2)}\\&=\frac{2M n^{-1/2}}{3/2(c_1+c_2-1)(1-c_1/2-c_2)(1-c_1-c_2/2)} \to 0.\end{align*}

If $c_1/2+c_2<1$ , $c_1+c_2/2>1$ , and $c_1+c_2<5/3$ , then

\begin{align*}\mathbb{E}(Q_1(n)) &\le \frac{2M}{3/2(c_1+c_2-1)} n^{\frac{3}{2}(c_1+c_2)-\frac{5}{2}}\frac{1}{1-(c_1/2+c_2)}\sum_{j=2}^n j^{1-(c_1/2+c_2)}j^{-1}\\&\le \frac{2M n^{c_1+c_2/2-3/2}}{3/2(c_1+c_2-1)(1-(c_1/2+c_2))^2}\to 0,\end{align*}

as $c_1+c_2/2<1+1/2=3/2$ . Similarly, $\mathbb{E}(Q_1(n)) \to 0$ , when $c_1/2+c_2>1$ , $c_1+c_2/2<1$ , and $c_1+c_2<5/3$ . The proof machinery also applies to the case where either $c_1/2+c_2=1$ or $c_1+c_2/2=1$ , and $c_1+c_2<5/3$ , which gives the conclusion that for $1<c_1+c_2<5/3$ , $\mathbb{E}(Q_1(n)) \to 0$ . Following the same reasoning, we have $\mathbb{E}(Q_2(n)) \to 0$ , for $1<c_1+c_2<5/3$ , thus implying $R_n^{\text{pa}} \stackrel{p}{\longrightarrow} 0$ for $1<c_1+c_2<5/3$ .

When $c_1+c_2=1$ , we revise the bound in (3.19) to get the following: for some constant $\widetilde{M}>0$ ,

\begin{align*}\mathbb{E}&\bigg(A_{ji}(k-1)\boldsymbol{1}_{\big\{(\,j,i)\in E(k-1), \, k\ge S_j+1\big\}}\bigg)\le \widetilde{M} \frac{k^{-1}(\log k)^{1/2}}{i^{c_1/2+c_2}j^{c_1+c_2/2}}.\end{align*}

Then we have

\begin{align*}\mathbb{E}(Q_1(n)) &\le \frac{2\widetilde{M}}{n}(\log n)^{3/2}\sum_{j=2}^n\sum_{i=1}^{j-1} i^{-(c_1/2+c_2)}j^{-(c_1+c_2/2)}\le {4\widetilde{M}}\frac{(\log n)^{3/2}}{n^{1/2}}\to 0.\end{align*}

Meanwhile, for some constant $\widetilde{M}^{\prime}>0$ , we have

\begin{align*}\mathbb{E}(Q_2(n)) &\le \frac{2\widetilde{M}^{\prime}}{n}(\log n)^{3/2}\sum_{j=2}^n\sum_{i=1}^{j-1} i^{-(c_2/2+c_1)}j^{-(c_2+c_1/2)}\le {4\widetilde{M}^{\prime}}\frac{(\log n)^{3/2}}{n^{1/2}}\to 0.\end{align*}

Hence, we have $R_n^{\text{pa}} \stackrel{p}{\longrightarrow}0$ when $c_1+c_2=1$ .

When $c_1+c_2<1$ , the bound in (3.18) implies that there exists some constant $\bar M>0$ such that

(3.20)

\begin{align}\mathbb{E}\bigg(A_{ji}(k-1)\boldsymbol{1}_{\big\{(\,j,i)\in E(k-1), k\ge S_j+1\big\}}\bigg)&\le \bar M \frac{k^{c_1+c_2-2}}{i^{c_2}j^{c_1}}\left(\frac{j^{c_1+c_2-1}}{i^{c_1}j^{c_2}}\right)^{1/2}\nonumber\\[3pt] &= \bar M \frac{k^{c_1+c_2-2}}{i^{c_1/2+c_2}j^{c_1/2+1/2}},\end{align}

which gives

\begin{align*}\mathbb{E}(Q_1(n))&\le \frac{4\bar M}{(1-c_1-c_2)(1-c_1/2-c_2)}n^{-1/2}\to 0.\end{align*}

Similar reasoning also gives $\mathbb{E}(Q_2(n))\to 0$ when $c_1+c_2<1$ , thus giving $\mathbb{E}\big(R_n^{\text{pa}} \big)\to 0$ and completing the proof of the theorem.

Remark 3.1. (i) From the definition of $c_1$ and $c_2$ in (2.1), if $\beta\le 2/3$ , then $c_1+c_2<1+\beta\le 5/3$ . Theorem 3.2 suggests that if the proportion of edges added between two existing nodes is less than $2/3$ and n is sufficiently large, the corresponding PA network will have $R_n^{\text{pa}} $ close to 0.

(ii) Note also that

\[\boldsymbol{1}_{\big\{N_0^{i\leftrightarrow j}\le n\big\}}\le \sum_{k=1}^n \Delta_k(i,j)\boldsymbol{1}_{\{(\,j,i)\in E(k-1)\}}+\sum_{k=1}^n \Delta_k(\,j,i)\boldsymbol{1}_{\{(i,j)\in E(k-1)\}}.\]

Hence, when $c_1+2c_2<1$ , applying the bound in (3.20) gives the following: for fixed i, there exists some constant $\tilde{M}>0$ such that

\[\lim_{n\to\infty}\mathbb{E}\left(\sum_{j\in V(n)}\boldsymbol{1}_{\big\{N_0^{i\leftrightarrow j}\le n\big\}}\right)\le \tilde{M}i^{-(c_1/2+c_2)}\sum_{j=1}^\infty j^{c_1/2+c_2-3/2}<\infty.\]

This indicates that when $c_1+2c_2<1$ , a fixed node i can form a reciprocal pair of edges only with finitely many nodes.

3.2.1. Simulation for $c_1+c_2\ge 5/3$

Theorem 3.2 does not explain the asymptotic behavior of $R_n^{\text{pa}} $ for $c_1+c_2\in [5/3, 1+\beta)$ , provided that $\beta\in (2/3,1)$ . For comparison, we choose three sets of parameters,

\begin{align*}\boldsymbol{\theta}_1&= (\alpha,\beta,\gamma,\delta_{\text{in}},\delta_{\text{out}}) = (0.1,0.8,0.1,2,1),\\\boldsymbol{\theta}_2& = (0.1,0.8,0.1,.4,.4),\qquad\text{and}\qquad\boldsymbol{\theta}_3 = (0.05,0.9,0.05,1,1),\end{align*}

such that the values of $c_1+c_2$ are equal to $1.393<5/3$ , $1.667=5/3$ , and $1.727>5/3$ , respectively. For each $\boldsymbol{\theta}_i$ , $i=1,2,3$ , we simulate 1000 replications of the directed PA network with $10^5$ edges, and compute the value of $R_n^{\text{pa}} $ for each replication.

The numerical results are summarized as box plots in Figure 2. For each box plot, we use a dark red dot to mark the corresponding averaged empirical $R_n^{\text{pa}} $ . Under $\boldsymbol{\theta}_1$ , all 1000 empirical $R_n^{\text{pa}} $ values are close to 0, with a maximum of 0.090 and a minimum of 0.021. The empirical $R_n^{\text{pa}} $ values under $\boldsymbol{\theta}_2$ and $\boldsymbol{\theta}_3$ are more variable, but both have higher mean than in the $\boldsymbol{\theta}_1$ case. This simulation experiment confirms that the asymptotic behavior of $R_n^{\text{pa}} $ in a directed PA model depends on the value of $c_1+c_2$ . Meanwhile, when $c_1+c_2\ge 5/3$ , the value of $R_n^{\text{pa}} $ may not necessarily concentrate around a specific value, but may vary over a certain range.

Figure 2. Box plots of $R_n^{\text{pa}} $ for directed PA models simulated using $\boldsymbol{\theta}_i$ , $i=1,2,3$ . The red dots represent the averaged empirical $R_n^{\text{pa}} $ for each $\boldsymbol{\theta}_i$ , $i=1,2,3$ .

3.3. The first time when a reciprocal pair forms

Theorem 3.1 gives results about the first time when a reciprocal pair forms between nodes $i\neq j$ . The $c_1+c_2\ge 1$ scenario is discussed in Corollary 3.1, while the $c_1+c_2<1$ case is analyzed in Proposition 3.1.

Corollary 3.1. Let $N_0^{i\leftrightarrow j}$ be the first time when a reciprocal pair of edges $i\leftrightarrow j$ is formed between nodes $i\neq j$ ; i.e.

(3.21)

\begin{equation}N_0^{i\leftrightarrow j}\,:\!=\, \inf\big\{n\ge 0\,:\, (i,j)\in E(n),\ and\ (\,j,i)\in E(n)\big\},\end{equation}

with the convention that $\inf\emptyset = \infty$ . If $1\le i<j$ are fixed, and $c_1,c_2$ are as given in (2.1) and satisfy $c_1+c_2\ge1$ , then $N_0^{i\leftrightarrow j}<\infty$ a.s.

Proof. We will show that for $c_1+c_2\ge 1$ , $\lim_{n\to\infty}\mathbb{P}\Big(N_0^{i\leftrightarrow j}{>} S_j+n\Big)=\mathbb{P}\Big(N_0^{i\leftrightarrow j}=\infty\Big)=0$ . Note that when $c_1+c_2\ge 1$ , (3.15) indicates that $L_{(i,j)}\big(S_j+n\big)\stackrel{\text{a.s.}}{\longrightarrow} \infty$ , and similarly, $L_{(\,j,i)}\big(S_j+n\big)\stackrel{\text{a.s.}}{\longrightarrow} \infty$ . Therefore,

\begin{align*}\mathbb{P}\Big(N_0^{i\leftrightarrow j}> S_j+n\Big) &\le \mathbb{P}\big((i,j)\notin E\big(S_j+n\big)\big)+\mathbb{P}\big((\,j,i)\notin E\big(S_j+n\big)\big)\\ &= \mathbb{P}\big(L_{(i,j)}\big(S_j+n\big)=0\big)+\mathbb{P}\big(L_{(\,j,i)}\big(S_j+n\big)=0\big)\to 0,\end{align*}

as $n\to\infty$ .

When $c_1+c_2<1$ , if we have $\mathbb{E}\big(L_{(i,j)}(\infty)\big)<1$ , then

\begin{align*}\mathbb{P}\big(L_{(i,j)}(\infty) <1\big)>0,\end{align*}

which implies $\mathbb{P}\big(L_{(i,j)}(\infty) =0\big)>0$ , and $\mathbb{P}\Big(N_0^{i\leftrightarrow j} =\infty\Big)>0$ . Note that

\[\mathbb{E}\big(L_{(i,j)}(\infty)\big) = \gamma\mathbb{E}\left(\frac{D^{\text{out}}_i\big(S_j-1\big)+\delta_{\text{out}}}{S_j+\delta_{\text{out}} (\,j-1)}\right)+\beta \sum_{k=0}^\infty \mathbb{E}\left(A_{ji}\big(S_j+k\big)\right),\]

and by (A.11), there exist some constants ${K}_1, {K}_2>0$ such that for $n\ge i\ge 1$ ,

\begin{align*}\mathbb{E}\Big(\big(D^{\text{in}}_i(n)\big)^2\Big)\le K_1(n/i)^{2c_1},\qquad\mathbb{E}\Big(\big(D^{\text{out}}_i(n)\big)^2\Big)\le K_2(n/i)^{2c_2}.\end{align*}

Then applying the Cauchy–Schwarz inequality gives that for $n\ge j>i\ge 1$ ,

\begin{align*}\mathbb{E}\bigl(A_{ji}(n)\bigr)&\le \left[\mathbb{E}\left(\left(\frac{D^{\text{in}}_i(n)+\delta_{\text{in}}}{n+1+\delta_{\text{in}} |V(n)|}\right)^2\right)\right]^{1/2}\left[\mathbb{E}\left(\left(\frac{D^{\text{out}}_j(n)+\delta_{\text{out}}}{n+1+\delta_{\text{out}} |V(n)|}\right)^2\right)\right]^{1/2}\\[3pt] &\le \frac{1}{n^2}\left[\mathbb{E}\bigg(\Big(D^{\text{in}}_i(n)+\delta_{\text{in}}\Big)^2\bigg)\mathbb{E}\bigg(\Big(D^{\text{out}}_j(n)+\delta_{\text{out}}\Big)^2\bigg)\right]^{1/2}\\[3pt] &\le (K_1K_2)^{1/2}\, \frac{n^{c_1+c_2-2}}{i^{c_1}j^{c_2}} \,=\!:\, K^{\prime}\frac{n^{c_1+c_2-2}}{i^{c_1}j^{c_2}}.\end{align*}

Therefore, we have

\begin{align*}\sum_{k=0}^\infty \mathbb{E}\left(A_{ji}\big(S_j+k\big)\right)&\le {K^{\prime}}\sum_{k=j}^\infty \frac{k^{c_1+c_2-2}}{i^{c_1}j^{c_2}}\le \frac{K^{\prime}}{1-c_1-c_2}i^{-c_1}j^{c_1-1},\end{align*}

which gives

(3.22)

\begin{align}\mathbb{E}\big(L_{(i,j)}(\infty)\big) \le \gamma + \frac{\beta K^{\prime}}{1-c_1-c_2}i^{-c_1}j^{c_1-1}.\end{align}

Since $c_1-1<0$ , for j sufficiently large we have $\mathbb{E}\big(L_{(i,j)}(\infty)\big)<1$ , thus giving $\mathbb{P}\Big(N_0^{i\leftrightarrow j} =\infty\Big)$ $>0$ . In other words, when $c_1+c_2<1$ , it is possible to have zero pairs of reciprocal edges between two fixed nodes i, j, if j is created at a late stage of network evolution.

Equation (3.22) also suggests that for arbitrarily chosen i, j, $\mathbb{E}\big(L_{(i,j)}(\infty)\big)<1$ if $\beta$ is small enough. Hence, if few edges are created between existing nodes, it is possible for two fixed nodes never to form a reciprocal pair of edges. In particular, when $\beta = 0$ , there is no reciprocal pair of edges, i.e. $\mathbb{P}\Big(N_0^{i\leftrightarrow j} =\infty\Big)=1$ , for all fixed i, j.

The following proposition assumes $c_1+c_2<1$ and gives the asymptotic behavior of $\sup_{j\ge n\epsilon} \mathbb{P}\Big(N_0^{i\leftrightarrow j}\le n\Big)$ .

Proposition 3.1. If $c_1+c_2<1$ , then for fixed $i\ge 1$ , $\epsilon>0$ , as $n\to\infty$ ,

\[\sup_{j\ge n\epsilon} \mathbb{P}\Big(N_0^{i\leftrightarrow j}\le n\Big) \to0.\]

Proof. Applying the union bound to $\mathbb{P}\Big(N_0^{i\leftrightarrow j}\le S_{j}+n\Big)$ gives

\begin{align*}\mathbb{P}\Big(N_0^{i\leftrightarrow j}\le S_{j}+ n\Big)&\le \sum_{k=1}^n \mathbb{P}\bigl(E\big(S_j+k\big)= E(S_j+k-1)\cup \{(i,j)\}\bigr)\\ &\quad + \sum_{k=1}^n\mathbb{P}\bigl(E\big(S_j+k\big)= E(S_j+k-1)\cup \{(\,j,i)\}\bigr)\\ &= \mathbb{E}\left(\sum_{k=0}^{n-1} \beta A_{ji}(S_{j}+k)\right) + \mathbb{E}\left(\sum_{k=0}^{n-1} \beta A_{ij}(S_{j}+k)\right)\\ &\le \mathbb{E}\left(\sum_{k=j}^{\infty} \beta A_{ji}(k)\right) + \mathbb{E}\left(\sum_{k=j}^{\infty} \beta A_{ij}(k)\right).\end{align*}

By (3.5), we see that for $\epsilon>0$ ,

\[\sup_{j\ge n\epsilon}\mathbb{E}\left(\sum_{k=j}^{\infty} A_{ji}(k)\right) \to 0,\qquad \sup_{j\ge n\epsilon}\mathbb{E}\left(\sum_{k=j}^{\infty} A_{ij}(k)\right) \to 0,\]

as $n\to\infty$ , which gives

\[\sup_{j\ge n\epsilon}\mathbb{P}\Big(N_0^{i\leftrightarrow j}\le n\Big) \le \sup_{j\ge n\epsilon}\mathbb{P}\Big(N_0^{i\leftrightarrow j}\le S_j+n\Big) \to 0.\]

4. Discussion

Suppose that we are given a scale-free network with a large proportion of reciprocal edges, e.g. Facebook wall posts [Reference Viswanath, Mislove, Cha and Gummadi17], Twitter [Reference Java, Song, Finin and Tseng6], Google $+$ [Reference Magno9], or Flickr [Reference Cha, Mislove and Gummadi3, Reference Mislove10]. In fitting a directed PA model to such a dataset using inference methods developed in [Reference Wan, Wang, Davis and Resnick18, Reference Wan, Wang, Davis and Resnick19], there is no guarantee that the calibrated model also has a large $R^{\text{pa}}_n$ . In fact, estimated values of $\hat{c}_1$ and $\hat{c}_2$ do not necessarily satisfy $\hat{c}_1+\hat{c}_2\ge 5/3$ . If we have $\hat{c}_1+\hat{c}_2<5/3$ in the calibrated model, then by Theorem 3.2, the corresponding $R_n^{\text{pa}}$ is close to 0, which differs from the feature of high reciprocity in the given dataset. This flags modeling error and suggests the consideration of alternative models or variants of the classical PA network. For instance, once a directed edge (i, j) is created following the PA rule, we may add a reciprocal edge (j, i) with probability $\rho\in (0,1)$ . The study in [Reference Cheng, Romero, Meeder and Kleinberg4] also provides other features that can be employed to predict reciprocal edges; we will defer the analysis of these variants of directed PA models to future research.

Appendix A. Lemmas and proofs needed in Section 2

In this section, we state and prove two lemmas needed for the proof of Proposition 2.1, which is also given.

A.1. Statement and proof of Lemma A.1

Lemma A.1. For some integer $p\ge 1$ , $k\ge 0$ , and $v\ge 1$ , we have the following:

(i) For $p=1$ ,
(A.1) \begin{align}\mathbb{E}^{\mathcal{F}_{S_v+k}}&\Big(D^{{in}}_v(S_v+k+1)+\delta_{{in}}\Big)\nonumber\\ &=\Big(D^{{in}}_v(S_v+k)+\delta_{{in}}\Big)\left(1+\frac{\alpha+\beta}{S_v+k+1+\delta_{{in}}|V(S_v+k)|}\right).\end{align}
(ii) For an integer $p\ge 2$ ,
(A.2) \begin{align}&\mathbb{E}^{\mathcal{F}_{S_v+k}}\left(\left(D^{{in}}_v(S_v+k+1)+\delta_{{in}}\right)^p\right)\nonumber\\ &=\Big(D^{{in}}_v(S_v+k)+\delta_{{in}}\Big)^p\left(1+p\frac{\alpha+\beta}{S_v+k+1+\delta_{{in}}|V(S_v+k)|}\right) \nonumber\\ &\qquad+ \frac{\alpha+\beta}{S_v+k+1+\delta_{{in}}|V(S_v+k)|}\sum_{r=2}^p\left(\begin{array}{c} p\\ r \end{array}\right) \Big(D^{{in}}_v(S_v+k)+\delta_{{in}}\Big)^{p-r+1}.\end{align}

Proof. Note that the right-hand sides of (A.1) and (A.2) are both $\mathcal{F}_{S_v+k}$ -measurable. We prove the results for $p\ge 2$ , and the case $p=1$ follows by a similar argument. Let $F\in \mathcal{F}_{S_v+k}$ ; then

\begin{align}&\int_F \mathbb{E}^{\mathcal{F}_{S_v+k}}\left(\left(D^{\text{in}}_v(S_v+k+1)+\delta_{\text{in}}\right)^p\right)\text{d} \mathbb{P}\nonumber\\ & \quad = \int_F \left(D^{\text{in}}_v(S_v+k+1)+\delta_{\text{in}}\right)^p\text{d} \mathbb{P}\nonumber\\ & \quad = \sum_{l\ge k}\int_{F\cap \{S_v+k=l\}} \left(D^{\text{in}}_v(l+1)+\delta_{\text{in}}\right)^p\text{d} \mathbb{P},\nonumber\end{align}

and since $D^{\text{in}}_v(l+1) = D^{\text{in}}_v(l) + \boldsymbol{1}_{\{\text{Node} \,v\, \text{is chosen at step}\, l+1}\} \,=\!:\, D^{\text{in}}_v(l) + \Delta_v(l+1)$ , this equals

(A.3)

\begin{align} & =\sum_{l\ge k}\int_{F\cap \{S_v+k=l\}} \left(D^{\text{in}}_v(l)+\delta_{\text{in}}+ \Delta_v(l+1)\right)^p\text{d} \mathbb{P}\nonumber\\ & = \sum_{l\ge k}\int_{F\cap \{S_v+k=l\}} \left(\left(D^{\text{in}}_v(l)+\delta_{\text{in}}\right)^p+ \sum_{r=1}^p \left(\begin{array}{c} p\\ 2 \end{array}\right) \Big(D^{\text{in}}_v(l)+\delta_{\text{in}}\Big)^{p-r}\Delta_v(l+1)\right)\text{d} \mathbb{P}.\end{align}

Since $F\cap \{S_v+k=l\}\in \mathcal{F}_l$ , the quantity in (A.3) is equal to

\begin{align*}&\sum_{l\ge k}\int_{F\cap \{S_v+k=l\}} \Big(D^{\text{in}}_v(l)+\delta_{\text{in}}\Big)^p\text{d}\mathbb{P}\\ &\qquad+ \sum_{l\ge k}\int_{F\cap \{S_v+k=l\}} p\Big(D^{\text{in}}_v(l)+\delta_{\text{in}}\Big)^{p-1}\mathbb{E}^{\mathcal{F}_l}\big(\Delta_v(l+1)\big)\text{d} \mathbb{P}\\ &\qquad + \sum_{r=2}^p \sum_{l\ge k}\int_{F\cap \{S_v+k=l\}} \left(\begin{array}{c} p\\ r \end{array}\right) \Big(D^{\text{in}}_v(l)+\delta_{\text{in}}\Big)^{p-r}\mathbb{E}^{\mathcal{F}_l}\big(\Delta_v(l+1)\big)\text{d} \mathbb{P}.\end{align*}

Note also that $\mathbb{E}^{\mathcal{F}_l}\big(\Delta_v(l+1)\big)=(\alpha+\beta)\Big(D^{\text{in}}_v(l)+\delta_{\text{in}}\Big)/\big(l+1+\delta_{\text{in}} |V(l)|\big)$ , so we have

\begin{align*}&\int_F \mathbb{E}^{\mathcal{F}_{S_v+k}}\left(\left(D^{\text{in}}_v(S_v+k+1)+\delta_{\text{in}}\right)^p\right)\text{d} \mathbb{P}\nonumber\\ &\quad = \int_F\Big(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\Big)^p\left(1+p\frac{\alpha+\beta}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)\text{d}\mathbb{P}\\ &\qquad+\int_F\frac{\alpha+\beta}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\sum_{r=2}^p\left(\begin{array}{c} p\\ r \end{array}\right) \Big(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\Big)^{p-r+1}\text{d} \mathbb{P}.\end{align*}

A.2. Statement and proof of Lemma A.2

Next, we study properties of $\mathbb{E}\left[\big(D^{\text{in}}_v(S_v+k)\big)^p\right]$ and $\mathbb{E}\left[\big(D^{\text{out}}_v(S_v+k)\big)^p\right]$ , $k\ge 1$ , which are needed for deriving the theorems in Section 3 as well as for the proof of Proposition 2.1.

Lemma A.2. For $v\in V(n)$ and $p\ge 1$ , we have

\begin{align*}\sup_{k\ge 1}&\frac{\mathbb{E}\left[\big(D^{{in}}_v(k)\big)^p\right]}{k^{c_1p}}<\infty,\qquad\sup_{k\ge 1}\frac{\mathbb{E}\left[\big(D^{{out}}_v(k)\big)^p\right]}{k^{c_2p}}<\infty.\end{align*}

Proof. For $p=1$ , we see from (A.1) that

\begin{align*}\mathbb{E}&\left(D^{\text{in}}_v(S_v+k+1)+\delta_{\text{in}}\right) \nonumber\\ &=\mathbb{E}\left(\Big(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\Big)\left(1+\frac{\alpha+\beta}{S_v+k+1+\delta_{\text{in}} |V(S_v+k)|}\right)\right);\nonumber \end{align*}

since $S_v\ge 0$ and $|V(S_v+k)|\ge |V(k)|$ , this is

\begin{align*} &\le \mathbb{E}\left(\Big(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\Big)\left(1+\frac{\alpha+\beta}{k+1+\delta_{\text{in}} |V(k)|}\right)\right),\nonumber\\ &\le \mathbb{E}\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\left(1+\frac{c_1}{k}\right)\nonumber\\ &\quad+ \mathbb{E}\left(\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\frac{(\alpha+\beta)\delta_{\text{in}}\Bigl||V(k)|-(1-\beta)k\Bigr|}{(k+\delta_{\text{in}} |V(k)|)\big(1+\delta_{\text{in}} (1-\beta)\big)k}\right),\end{align*}

and as $\alpha+\beta\le 1$ , this is

(A.4)

\begin{align} &\le \mathbb{E}\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\left(1+\frac{c_1}{k}\right)\nonumber\\&\quad+ \mathbb{E}\left(\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\frac{\delta_{\text{in}}\Bigl||V(k)|-(1-\beta)k\Bigr|}{(k+\delta_{\text{in}} |V(k)|)\big(1+\delta_{\text{in}} (1-\beta)\big)k}\right)\nonumber\\&\,=\!:\, H^{(1)}_v(k) + H^{(2)}_v(k).\end{align}

Since $\big|V(k)\big|-1$ follows a binomial distribution with size $k\ge 1$ and success probability $1-\beta$ , by applying the Chernoff bound we obtain

(A.5)

\begin{equation}\mathbb{P}\left(\Bigl|\big|V(k)\big| - (1-\beta)k\Bigr|\ge 1+\sqrt{12(1-\beta) k\log k}\right)\le \frac{2}{k^4},\end{equation}

and rewrite the term $H^{(2)}_v(k)$ in (A.4) as

\begin{align}&H^{(2)}_v(k)\nonumber\\ &= \mathbb{E}\left(\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\frac{\delta_{\text{in}}\Bigl||V(k)|-(1-\beta)k\Bigr|}{(k+\delta_{\text{in}} |V(k)|)\big(1+\delta_{\text{in}} (1-\beta)\big)k}\right.\nonumber\\ &\left.\qquad\qquad\times\boldsymbol{1}_{\left\{\Bigl||V(k)|-(1-\beta)k\Bigr|\le 1+\sqrt{12(1-\beta) k\log k}\right\}}\right)\nonumber\\ &+ \mathbb{E}\left(\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\frac{\delta_{\text{in}}\Bigl||V(k)|-(1-\beta)k\Bigr|}{\big(k+\delta_{\text{in}} |V(k)|\big)\big(1+\delta_{\text{in}} (1-\beta)\big)k}\right.\nonumber\\ &\left.\qquad\qquad\times\boldsymbol{1}_{\left\{\Bigl||V(k)|-(1-\beta)k\Bigr|> 1+\sqrt{12(1-\beta) k\log k}\right\}}\right).\nonumber\end{align}

Since $D^{\text{in}}_v(S_v+k)\le k+1$ and $\Bigl||V(k)|-(1-\beta)k\Bigr|\le k+1$ , the foregoing term is bounded by

(A.6)

\begin{align}& \mathbb{E}\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\frac{\delta_{\text{in}}\!\left(1+\sqrt{12(1-\beta) k\log k}\right)}{k^2}+\frac{\delta_{\text{in}} \big(k+1+\delta_{\text{in}}\big)(k+1)}{\big(1+\delta_{\text{in}} (1-\beta)\big)k^2}\frac{2}{k^4}\nonumber\\ &\le \mathbb{E}\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\frac{\delta_{\text{in}}\!\left(1+\sqrt{12k\log k}\right)}{k^2}+\frac{2\delta_{\text{in}}\big(k+1+\delta_{\text{in}}\big)(k+1)}{k^6}.\end{align}

Combining the bound in (A.6) with (A.4) gives

\begin{align*}\mathbb{E}\left(D^{\text{in}}_v(S_v+k+1)+\delta_{\text{in}}\right) & \le\mathbb{E}\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\left(1+\frac{c_1}{k}\right)\\ &\quad+ \mathbb{E}\left(\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\frac{\delta_{\text{in}}\Bigl||V(k)|-(1-\beta)k\Bigr|}{(k+\delta_{\text{in}} |V(k)|)\big(1+\delta_{\text{in}} (1-\beta)\big)k}\right)\\ &\le \mathbb{E}\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\left(1+\frac{c_1}{k} + \frac{\delta_{\text{in}}\left(1+\sqrt{12 k\log k}\right)}{k^2}\right)\\ &\quad+ \frac{2\delta_{\text{in}}\big(k+1+\delta_{\text{in}}\big)(k+1)}{k^6}.\end{align*}

Recursively applying the inequality above k times, we have

(A.7)

\begin{align} \mathbb{E}&\left(D^{\text{in}}_v(S_v+k+1)+\delta_{\text{in}}\right)\nonumber\\ &\le\mathbb{E}\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)\left(1+\frac{c_1}{k} + \frac{\delta_{\text{in}}\left(1+\sqrt{12 k\log k}\right)}{k^2}\right)\nonumber\\ &\qquad+ \frac{2\delta_{\text{in}}\big(k+1+\delta_{\text{in}}\big)(k+1)}{k^6}\nonumber\\ &\le\ldots\le\mathbb{E}\big(D^{\text{in}}_v(S_v)+\delta_{\text{in}}\big)\prod_{l=1}^k \left(1+\frac{c_1}{l}+\frac{\delta_{\text{in}}\left(1+\sqrt{12 l\log l}\right)}{l^2}\right)\nonumber\\ &+ 2\delta_{\text{in}}\sum_{l=1}^k \frac{\big(l+1+\delta_{\text{in}}\big)(l+1)}{l^6}\prod_{s=l+1}^k \left(1+\frac{c_1}{s}+\frac{\delta_{\text{in}}\left(1+\sqrt{12 s\log s}\right)}{s^2}\right).\end{align}

Here, $\mathbb{E}\big(D^{\text{in}}_v(S_v)+\delta_{\text{in}}\big)=\alpha\delta_{\text{in}} +\gamma(1+\delta_{\text{in}})$ , depending on whether the $\alpha$ - or the $\gamma$ -scenario occurs. Note that there exists a constant $M>0$ such that

(A.8)

\begin{align}\prod_{l=1}^k \left(1+\frac{c_1}{l}+\frac{\delta_{\text{in}}\left(1+\sqrt{12 l\log l}\right)}{l^2}\right)&\le \exp\left\{\sum_{l=1}^k\left(\frac{c_1}{l}+\frac{\delta_{\text{in}}\left(1+\sqrt{12 l\log l}\right)}{l^2}\right)\right\}\nonumber\\&\le M k^{c_1},\end{align}

and it follows from (A.7) that

\[\sup_{k\ge 1}\frac{\mathbb{E}\left[D^{\text{in}}_v(S_v+k)\right]}{k^{c_1}}\le \sup_{k\ge 1}\frac{\mathbb{E}\left[D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right]}{k^{c_1}}<\infty.\]

For $p\ge 2$ , suppose

\[\sup_{k\ge 1}\frac{\mathbb{E}\left[\Big(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\Big)^r\right]}{k^{c_1 r}}\le A_r <\infty\]

holds for some constants, $A_r$ , $r=1,\ldots, p-1$ . Let $A_0 = \max\{A_r\,:\, r=1,\ldots, p-1\}$ ; then by (A.2), we have

(A.9)

\begin{align}\mathbb{E}&\left(\Big(D^{\text{in}}_v(S_v+k+1)+\delta_{\text{in}}\Big)^p\right)\nonumber\\ &\le \mathbb{E}\left(\Big(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\Big)^p\left(1+p\frac{\alpha+\beta}{S_v+k+\delta_{\text{in}} |V(S_v+k)|}\right)\right)\nonumber\\ &\qquad+ \sum_{r=1}^{p-1} (\alpha+\beta)\left(\begin{array}{c} p\\ 2 \end{array}\right) A_r k^{c_1 r-1}\nonumber\\ &\le \mathbb{E}\left(\Big(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\Big)^p\left(1+p\frac{\alpha+\beta}{S_v+k+\delta_{\text{in}} |V(S_v+k)|}\right)\right)\nonumber\\ &\qquad + \frac{1}{2}(\alpha+\beta)p(p-1)^2\, A_0 k^{c_1(p-1)-1}\nonumber\\ &\,=\!:\, C^{(1)}_v(k)+C^{(2)}_v(k).\end{align}

We rewrite the $C^{(1)}_v(k)$ term in (A.9) to get

\begin{align*}C^{(1)}_v(k) &= \left(1+\frac{c_1p}{k}\right) \mathbb{E}\left[\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\right]\\[2pt] &\quad +\mathbb{E}\left[\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\left(\frac{p(\alpha+\beta)}{S_v+k+\delta_{\text{in}} |V(S_v+k)|}-\frac{c_1p}{k}\right)\right]\\[4pt] &\le \left(1+\frac{c_1p}{k}\right) \mathbb{E}\left[\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\right]\\[2pt] &\quad+\mathbb{E}\left[\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\frac{p\delta_{\text{in}}\Bigl||V(k)|-(1-\beta)k\Bigr|}{(k+\delta_{\text{in}}|V(k)|)(1+\delta_{\text{in}}(1-\beta))k}\right]\\[4pt] &\le \left(1+\frac{c_1p}{k}\right) \mathbb{E}\left[\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\right]\\[2pt] &\quad+\mathbb{E}\left[\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\frac{p\delta_{\text{in}}\Bigl||V(k)|-(1-\beta)k\Bigr|}{k^2}\right].\end{align*}

Similarly to the Chernoff bound in (A.5), we have, for $p\ge 2$ ,

(A.10)

\begin{equation}\mathbb{P}\left(\Bigl||V(k)\big|-(1-\beta)k\Bigr|\ge 1+\sqrt{6p(1-\beta)k\log k}\right)\le \frac{2}{k^{2p}}.\end{equation}

Therefore, analogously to the calculation in (A.6), we have

\begin{align*}&\mathbb{E}\left[\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\frac{p\delta_{\text{in}}\Bigl||V(k)|-(1-\beta)k\Bigr|}{k^2}\right]\\[3pt] &=\mathbb{E}\left[\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\frac{p\delta_{\text{in}}\Bigl||V(k)|-(1-\beta)k\Bigr|}{k^2}\boldsymbol{1}_{\left\{\Bigl||V(k)|-(1-\beta)k\Bigr|\le 1+\sqrt{6p(1-\beta)k\log k}\right\}}\right]\\[3pt] &\, + \mathbb{E}\left[\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\frac{p\delta_{\text{in}}\Bigl||V(k)|-(1-\beta)k\Bigr|}{k^2}\boldsymbol{1}_{\left\{\Bigl||V(k)|-(1-\beta)k\Bigr|>1+\sqrt{6p(1-\beta)k\log k}\right\}}\right], \end{align*}

and since $\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\le \big(k+1+\delta_{\text{in}}\big)^p$ , this is

\begin{align*} &\le \mathbb{E}\left[\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\right]\frac{p\delta_{\text{in}} \left(1+\sqrt{6p\,k\log k}\right)}{k^2}+ \frac{p\delta_{\text{in}} \big(k+1+\delta_{\text{in}}\big)^p(k+1)}{k^2}\frac{2}{k^{2p}}.\end{align*}

Hence,

\begin{align*}C^{(1)}_v(k)&\le \mathbb{E}\left[\left(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\right)^p\right]\left(1+\frac{c_1p}{k}+\frac{p\delta_{\text{in}} \left(1+\sqrt{6p\,k\log k}\right)}{k^2}\right) \\ &\quad+\frac{2p\delta_{\text{in}} \big(k+1+\delta_{\text{in}}\big)^p(k+1)}{k^{2p+2}}.\end{align*}

Note also that

\[\big(k+1+\delta_{\text{in}}\big)^p(k+1){k^{-2p-2}}\le 2(2+\delta_{\text{in}})^p k^{-p-1}\le 2\big(2+\delta_{\text{in}}\big)^p k^{c_1(p-1)-1}\]

for all $k\ge 1$ , $p\ge 2$ . We conclude from (A.9) that

\begin{align*}\mathbb{E}&\left(\big(D^{\text{in}}_v(S_v+k+1)+\delta_{\text{in}}\big)^p\right)\\ &\le \mathbb{E}\left(\big(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\big)^p\right) \left(1+\frac{c_1 p}{k}+\frac{p\delta_{\text{in}}\left(1+\sqrt{6p k\log k}\right)}{k^2}\right) \\ &\qquad+ \left(A_0(\alpha+\beta)p(p-1)^2/2+4p\delta_{\text{in}}\big(2+\delta_{\text{in}}\big)^p\right) k^{c_1(p-1)-1}.\end{align*}

Following the recursive step as for the $p=1$ case gives

\[\sup_{k\ge 1}\frac{\mathbb{E}\left[\big(D^{\text{in}}_v(S_v+k)\big)^p\right]}{k^{c_1 p}}\le \sup_{k\ge 1}\frac{\mathbb{E}\left[\big(D^{\text{in}}_v(S_v+k)+\delta_{\text{in}}\big)^p\right]}{k^{c_1 p}}<\infty.\]

Note that for $p\ge 1$ ,

\begin{align*}\mathbb{E}\left(\left(\frac{D^{\text{in}}_v(n)}{n^{c_1}}\right)^p\right)&= \mathbb{E}\left(\left(\frac{D^{\text{in}}_v(n)}{n^{c_1}}\right)^p\boldsymbol{1}_{\{S_v\le n-1\}}\right)+\mathbb{E}\left(\left(\frac{D^{\text{in}}_v(n)}{n^{c_1}}\right)^p\boldsymbol{1}_{\{S_v\ge n\}}\right)\\&\le \mathbb{E}\left(\left(\frac{D^{\text{in}}_v(S_v+n)}{n^{c_1}}\right)^p\right)+\left(\frac{1}{n^{c_1}}\right)^p,\end{align*}

since $D^{\text{in}}_v(n)$ is monotone in n. Then we have, for $v\ge 1$ ,

\[\sup_{n\ge 1}\mathbb{E}\left(\left(\frac{D^{\text{in}}_v(n)}{n^{c_1}}\right)^p\right)<\infty.\]

Applying a similar argument to the out-degrees completes the proof.

Remark A.1. In the proof of Lemma A.2, if we revise the inequality $|V(S_v+k)|\ge |V(k)|$ to $|V(S_v+k)|\ge |V(v+k-1)|$ , then we have for $p\ge 1$

(A.11)

\begin{align}\sup_{n\ge 1}\sup_{v\ge 1}\mathbb{E}\left(\left(\frac{D^{\text{in}}_v(n)}{(n/v)^{c_1}}\right)^p\right)<\infty,&\qquad\sup_{n\ge 1}\sup_{v\ge 1}\mathbb{E}\left(\left(\frac{D^{\text{out}}_v(n)}{(n/v)^{c_2}}\right)^p\right)<\infty.\end{align}

These results are used in the proof of Theorem 3.2, in Section 3.2.

A.3. Proof of Proposition 2.1

We prove only the results for $D^{\text{in}}_v(n)$ ; those for $D^{\text{out}}_v(n)$ follow from a similar argument. First, by Lemma A.1, we see that for $n\ge 1$ ,

(A.12)

\begin{equation}\frac{D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}}{\prod_{k=0}^{n-1}\left(1+\frac{\alpha+\beta}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)}\,=\!:\,\frac{D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}}{X_n^{(v)}}\end{equation}

is a nonnegative $\left(\mathcal{F}_{S_v+n}\right)_{n\ge 0}$ -martingale, which by the martingale convergence theorem converges to some limit $L_v$ a.s. as $n\to\infty$ . It remains to analyze the denominator and to verify that $\mathbb{P}(L_v\in (0,\infty))=1$ . We do this by applying some proof machinery similar to that of [Reference Van der Hofstad16, Lemma 8.17].

By Markov’s inequality, we see that for $\epsilon>0$ , and $\max\{-1,-\delta_{\text{in}}\}<m<0$ ,

(A.13)

\begin{align}\mathbb{P}(L_v\le \epsilon)&= \limsup_{n\to\infty}\mathbb{P}\left(\frac{D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}}{X^{(v)}_n}\le \epsilon\right)\nonumber\\&\le \epsilon^{|m|}\limsup_{n\to\infty} \mathbb{E}\left[\left(\frac{D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}}{X^{(v)}_n}\right)^m\right]\nonumber\\&\le \epsilon^{|m|}\limsup_{n\to\infty} \mathbb{E}\left[\frac{\left(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}\right)^m}{\prod_{k=0}^{n-1}\left(1+\frac{(\alpha+\beta)m}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)}\right].\end{align}

By (A.13), it suffices to show

\[\limsup_{n\to\infty} \mathbb{E}\left[\frac{\left(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}\right)^m}{\prod_{k=0}^{n-1}\left(1+\frac{(\alpha+\beta)m}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)}\right]<\infty.\]

Similarly to [Reference Van der Hofstad16, Equation (8.7.23)], there exists some constant $C_m$ such that

\begin{align*}\limsup_{n\to\infty}\,& \mathbb{E}\left[\frac{\left(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}\right)^m}{\prod_{k=0}^{n-1}\left(1+\frac{(\alpha+\beta)m}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)}\right]\\ &\le C_m \limsup_{n\to\infty}\,\mathbb{E}\left[\frac{\Gamma\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}+m\big)/\Gamma\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}\big)}{\prod_{k=0}^{n-1}\left(1+\frac{(\alpha+\beta)m}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)}\right].\end{align*}

Hence, once we show

(A.14)

\begin{equation}\limsup_{n\to\infty}\,\mathbb{E}\left[\frac{\Gamma\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}+m\big)/\Gamma\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}\big)}{\prod_{k=0}^{n-1}\left(1+\frac{(\alpha+\beta)m}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)}\right]<\infty,\end{equation}

the inequality in (A.13) implies $\mathbb{P}(L_v\in (0,\infty))=1$ .

We prove (A.14) by showing that

\[M^{(m)}_n \,:\!=\, \frac{\Gamma(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}+m)/\Gamma(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}})}{\prod_{k=0}^{n-1}\left(1+\frac{(\alpha+\beta)m}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)}\]

is an $\left(\mathcal{F}_{S_v+n}\right)_{n\ge 0}$ -martingale. Note that

\begin{align*}&\mathbb{E}^{\mathcal{F}_{S_v+n}}\left(\frac{\Gamma\big(D^{\text{in}}_v(S_v+n+1)+\delta_{\text{in}}+m\big)}{\Gamma\big(D^{\text{in}}_v(S_v+n+1)+\delta_{\text{in}}\big)}\right)\\ &\quad = \frac{\Gamma\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}+m\big)}{\Gamma\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}\big)}\left(1-\frac{(\alpha+\beta)\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}\big)}{S_v+n+1+\delta_{\text{in}}|V(S_v+n)|}\right)\\ &\qquad + \frac{(\alpha+\beta)\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}\big)}{S_v+n+1+\delta_{\text{in}}|V(S_v+n)|}\frac{\Gamma\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}+m\big)}{\Gamma\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}\big)}\frac{D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}+m}{D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}}\\ &\quad = \frac{\Gamma\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}+m\big)}{\Gamma\big(D^{\text{in}}_v(S_v+n)+\delta_{\text{in}}\big)}\left(1+\frac{(\alpha+\beta)m}{S_v+n+1+\delta_{\text{in}}|V(S_v+n)|}\right),\end{align*}

which confirms that $M^{(m)}_n$ is an $\left(\mathcal{F}_{S_v+n}\right)_{n\ge 0}$ -martingale. Also,

\begin{align*}\mathbb{E}\big(M^{(m)}_n\big) &=\mathbb{E}\big(M^{(m)}_0\big)\\ &= \mathbb{E}\left(\frac{\Gamma\big(D^{\text{in}}_v(S_v)+\delta_{\text{in}}+m\big)/\Gamma\big(D^{\text{in}}_v(S_v)+\delta_{\text{in}}\big)}{1+\frac{(\alpha+\beta)m}{S_v+1+\delta_{\text{in}} v}}\right).\end{align*}

Since $m<0$ and $S_v\ge v-1\ge 0$ , we have

\[\left(1+\frac{(\alpha+\beta)m}{S_v+1+\delta_{\text{in}} v}\right)^{-1}\le \left(1+\frac{(\alpha+\beta)m}{(1+\delta_{\text{in}})v}\right)^{-1}.\]

This further implies

\begin{align*}\mathbb{E}\big(M^{(m)}_n\big) &\le \left(1+\frac{(\alpha+\beta)m}{(1+\delta_{\text{in}})v}\right)^{-1} \mathbb{E}\left({\Gamma\big(D^{\text{in}}_v(S_v)+\delta_{\text{in}}+m\big)/\Gamma\big(D^{\text{in}}_v(S_v)+\delta_{\text{in}}\big)}\right)\\ &= \left(1+\frac{(\alpha+\beta)m}{(1+\delta_{\text{in}})v}\right)^{-1} \left(\alpha \frac{\Gamma(\delta_{\text{in}} +m)}{\Gamma(\delta_{\text{in}})} + \gamma \frac{\Gamma(\delta_{\text{in}}+1 +m)}{\Gamma(1+\delta_{\text{in}})}\right)<\infty,\end{align*}

thus completing the proof of (A.14).

Next, we consider the convergence of $X^{(v)}_n$ by noting that

\begin{align*}\log& X^{(v)}_n\\&= \sum_{k=0}^{n-1}\left[\log\left(1+\frac{\alpha+\beta}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)-\frac{\alpha+\beta}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right]\\&\quad+\sum_{k=0}^{n-1} \left(\frac{\alpha+\beta}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}-\frac{c_1}{S_v+k+1}\right)\\&\quad+\left(\sum_{k=0}^{n-1} \frac{c_1}{S_v+k+1}-c_1\log\frac{S_v+n}{S_v+1}\right)+c_1\log\frac{S_v+n}{S_v+1}\\&\,=\!:\, \text{I}_v(n) + \text{II}_v(n) + \text{III}_v(n)+c_1\log\frac{S_v+n}{S_v+1}.\end{align*}

Since $\log(1+x)\le x$ , for all $x\ge 0$ , we have $\text{I}_v(n+1)-\text{I}_v(n)\le 0$ for all n; i.e. $\text{I}_v(n)$ is decreasing in n. Note also that $\left|\log(1+x)-x \right|\le x^2/2$ for all $x\ge 0$ , so we have

\begin{align*}\mathbb{E}&\left|\sum_{k=0}^{\infty}\left(\log\left(1+\frac{\alpha+\beta}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)-\frac{\alpha+\beta}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)\right|\\&\le \sum_{k=0}^{\infty}\mathbb{E}\left|\log\left(1+\frac{\alpha+\beta}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)-\frac{\alpha+\beta}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right|\\&\le \frac{(\alpha+\beta)^2}{2}\sum_{k=0}^{\infty}\mathbb{E}\left(\frac{1}{S_v+k+1+\delta_{\text{in}}|V(S_v+k)|}\right)^2\le \sum_{k=1}^{\infty}\frac{1}{k^2}<\infty,\end{align*}

which implies $\text{I}_v(\infty)<\infty$ a.s., and $\text{I}_v(n)\stackrel{\text{a.s.}}{\longrightarrow} \text{I}_v(\infty)$ as $n\to\infty$ .

By [Reference Athreya and Ney1, Theorem 3.9.4], we see that there exists a finite random variable Z such that

\[ \sum_{k=1}^\infty\left(\frac{\alpha+\beta}{k+\delta_{\text{in}} |V(k-1)|}-\frac{c_1}{k}\right)\stackrel{\text{a.s.}}{\longrightarrow} Z; \]

then

\[\text{II}_v(n)\stackrel{\text{a.s.}}{\longrightarrow} Z-\sum_{k=1}^{S_v}\left(\frac{\alpha+\beta}{k+\delta_{\text{in}} |V(k-1)|}-\frac{c_1}{k}\right) \,=\!:\, \text{II}_v(\infty).\]

Since $\sum_{k=1}^n 1/k-\log n\to \tilde{c}$ , where $\tilde{c}$ is Euler’s constant, for $v=1$ we have

\[ \text{III}_1(n) \stackrel{\text{a.s.}}{\longrightarrow} c_1\tilde{c}\,=\!:\, \text{III}_1(\infty), \]

and for $v\ge 2$ ,

\[ \text{III}_v(n)\stackrel{\text{a.s.}}{\longrightarrow} c_1\left(\tilde{c}+\log(S_v+1)-\sum_{k=1}^{S_v}\frac{1}{k}\right)\,=\!:\, \text{III}_v(\infty). \]

Hence, as $n\to\infty$ ,

(A.15)

\begin{equation}\frac{X^{(v)}_n}{\left((S_v+n)/(S_v+1)\right)^{c_1}}\stackrel{\text{a.s.}}{\longrightarrow} \exp\left\{\text{I}_v(\infty)+\text{II}_v(\infty)+\text{III}_v(\infty)\right\}.\end{equation}

Combining (A.15) with the convergence of the martingale in (A.12), we have

\[ \frac{D^{\text{in}}_v(S_v+n)}{(S_v+n)^{c_1}}\stackrel{\text{a.s.}}{\longrightarrow} \frac{L_v}{(S_v+1)^{c_1}}\exp\left\{-\bigl(\text{I}_v(\infty)+\text{II}_v(\infty)+\text{III}_v(\infty)\bigr)\right\}\,=\!:\, \xi^{\text{in}}_v; \]

then

\[\lim_{n\to\infty}\frac{D^{\text{in}}_v(n)}{n^{c_1}}=\lim_{n\to\infty}\frac{D^{\text{in}}_v(S_v+n)}{(S_v+n)^{c_1}}\stackrel{\text{a.s.}}{=} \xi^{\text{in}}_v.\]

Since $\mathbb{P}(L_v\in (0,\infty))=1$ and $S_v+1\ge v\ge 1$ , we also have $\mathbb{P}(\xi^{\text{in}}_v\in (0,\infty))=1$ .

Acknowledgements

We thank the two anonymous referees for their valuable comments on the manuscript.

Funding information

There are no funding bodies to thank in relation to this creation of this article.

Competing interests

There were no competing interests to declare which arose during the preparation or publication process for this article.

References

Athreya, K. and Ney, P. (2004). Branching Processes. Springer, New York.Google Scholar

Bollobás, B., Borgs, C., Chayes, J. and Riordan, O. (2003). Directed scale-free graphs. In Proceedings of the Fourteenth Annual ACM–SIAM Symposium on Discrete Algorithms (Baltimore, 2003), Association for Computing Machinery, New York, pp. 132–139.Google Scholar

Cha, M., Mislove, A. and Gummadi, K. (2009). A measurement-driven analysis of information propagation in the Flickr social network. In Proceedings of the 18th International Conference on World Wide Web (WWW ’09), Association for Computing Machinery, New York, pp. 721–730.10.1145/1526709.1526806CrossRef Google Scholar

Cheng, J., Romero, D., Meeder, B. and Kleinberg, J. (2011). Predicting reciprocity in social networks. In 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third International Conference on Social Computing, Institute of Electrical and Electronics Engineers, Piscataway, NJ, pp. 49–56.CrossRef Google Scholar

Durrett, R. (2019). Probability: Theory and Examples, 5th edn. Cambridge University Press.CrossRef Google Scholar

Java, A., Song, X., Finin, T. and Tseng, B. (2007). Why we Twitter: understanding microblogging usage and communities. In WebKDD/SNA-KDD ’07: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, Association for Computing Machinery, New York, pp. 56–65.10.1145/1348549.1348556CrossRef Google Scholar

Jiang, B., Zhang, Z. and Towsley, D. (2015). Reciprocity in social networks with capacity constraints. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’15), Association for Computing Machinery, New York, pp. 457–466.CrossRef Google Scholar

Krapivsky, P. and Redner, S. (2001). Organization of growing random networks. Phys. Rev. E 63, 066123.10.1103/PhysRevE.63.066123CrossRef Google Scholar PubMed

Magno, G. et al. (2012). New kid on the block: exploring the Google

$+$ social graph. In Proceedings of the 2012 Internet Measurement Conference (IMC ’12), Association for Computing Machinery, New York, pp. 159–170.10.1145/2398776.2398794CrossRef Google Scholar

Mislove, A. et al. (2007). Measurement and analysis of online social networks. In Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement (IMC ’07), Association for Computing Machinery, New York, pp. 29–42.10.1145/1298306.1298311CrossRef Google Scholar

Neveu, J. (1965). Mathematical Foundations of the Calculus of Probability. Holden-Day, San Francisco.Google Scholar

Newman, M., Forrest, S. and Balthrop, J. (2002). Email networks and the spread of computer viruses. Phys. Rev. E 66, 035101.10.1103/PhysRevE.66.035101CrossRef Google Scholar PubMed

Resnick, S. (2007). Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, New York.Google Scholar

Resnick, S. and Samorodnitsky, G. (2015). Tauberian theory for multivariate regularly varying distributions with application to preferential attachment networks. Extremes 18, 349–367.CrossRef Google Scholar

Samorodnitsky, G. et al. (2016). Nonstandard regular variation of in-degree and out-degree in the preferential attachment model. J. Appl. Prob. 53, 146–161.CrossRef Google Scholar

Van der Hofstad, R. (2017). Random Graphs and Complex Networks, Vol. 1. Cambridge University Press.Google Scholar

Viswanath, B., Mislove, A., Cha, M. and Gummadi, K. (2009). On the evolution of user interaction in Facebook. In Proceedings of the 2nd ACM SIGCOMM Workshop on Social Networks (WOSN ’09), Association for Computing Machinery, New York, pp. 37–42.CrossRef Google Scholar

Wan, P., Wang, T., Davis, R., and Resnick, S. (2017). Fitting the linear preferential attachment model. Electron. J. Statist. 11, 3738–3780.10.1214/17-EJS1327CrossRef Google Scholar

Wan, P., Wang, T., Davis, R. and Resnick, S. (2020). Are extreme value estimation methods useful for network data? Extremes 23, 171–195.CrossRef Google Scholar

Wang, T. and Resnick, S. (2020). Degree growth rates and index estimation in a directed preferential attachment model. Stoch. Process. Appl. 130, 878–906.CrossRef Google Scholar

Wasserman, S. and Faust, K. (1994). Social Network Analysis: Methods and Applications. Cambridge University Press.10.1017/CBO9780511815478CrossRef Google Scholar

Figure 1. A graphical illustration of reciprocal edges with edge set $E=\{(1,2),(1,2),(1,2),$$(2,1),(2,1), (2,3)\}$ and $V=\{1,2,3\}$. If a pair of reciprocal edges are observed, we label them with the same color. Here we have $R_6 = 4/6 = 0.667$.

Figure 2. Box plots of $R_n^{\text{pa}} $ for directed PA models simulated using $\boldsymbol{\theta}_i$, $i=1,2,3$. The red dots represent the averaged empirical $R_n^{\text{pa}} $ for each $\boldsymbol{\theta}_i$, $i=1,2,3$.

Article contents

Measuring reciprocity in a directed preferential attachment network

Abstract

Keywords

MSC classification

1. Introduction

1.1. Model setup

2. Growth of the degree sequences $\boldsymbol{D}^{\textrm{in}}_{\boldsymbol{v}}(n)$ , $\boldsymbol{D}^{\textrm{out}}_{\boldsymbol{v}}(n)$

3. Reciprocity in the preferential attachment network

3.1. Reciprocal edges between two fixed nodes

3.1.1. Convergence of $L_{(i,j)}(n)$

3.2. Reciprocity in the entire graph

3.2.1. Simulation for $c_1+c_2\ge 5/3$

3.3. The first time when a reciprocal pair forms

4. Discussion

Appendix A. Lemmas and proofs needed in Section 2

A.1. Statement and proof of Lemma A.1

A.2. Statement and proof of Lemma A.2

A.3. Proof of Proposition 2.1

Acknowledgements

Funding information

Competing interests

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests