# «Abstract. We show that the asynchronous push-pull protocol spreads rumors in preferential attachment graphs (as deﬁned by Barab´si and a √ ...»

Asynchronous Rumor Spreading

in Preferential Attachment Graphs

Benjamin Doerr1, Mahmoud Fouz2, and Tobias Friedrich1,2

Max-Planck-Institut f¨r Informatik, Saarbr¨cken, Germany

u u

Universit¨t des Saarlandes, Saarbr¨cken, Germany

a u

Abstract. We show that the asynchronous push-pull protocol spreads

rumors in preferential attachment graphs (as deﬁned by Barab´si and

a

√

Albert) in time O( log n) to all but a lower order fraction of the nodes with high probability. This is signiﬁcantly faster than what synchronized protocols can achieve; an obvious lower bound for these is the average distance, which is known to be Θ(log n/ log log n).

1 Introduction Online social networks like Facebook and Twitter are changing the way people communicate, organize and act collectively. They are starting to take the lead over traditional news media in their ability to spread news at a remarkable speed. One striking example was the ﬁrst picture of US Airways Flight 1549’s crash landing on the Hudson River, which became known to a broad audience through Twitter even before TV channels started to report on the accident.

The theoretical model most widely used for social networks is the so-called preferential attachment (PA) model, which was introduced in a seminal paper by Barab´si and Albert [1]. It builds on the paradigm a that new vertices attach to already present vertices with a probability proportional to their degree. Several papersprove that this model indeed enjoys many properties observed in social networks and many other real world networks, e.g., a power law distribution of the vertex degrees, a small diameter and a small average degree [2, 4]. The precise deﬁnition of the PA model can be found in Section 2. Note that later extended deﬁnitions for PA graphs were given (with the preference not anymore proportional to the degree); in this paper, we shall always refer to the original one.

In this paper, we revisit the rumor spreading problem in PA graphs, i.e., the spread of one piece of information in a graph. The classical rumor spreading process is modeled on a discrete time line. A simple protocol assumes that in each time step (or round ) every node that knows the rumor forwards it to a randomly chosen neighbor. This is known as the push strategy. For many network topologies, this strategy is a very eﬃcient way to spread a rumor. Let n denote the number of vertices of a graph.

Then the push model with high probability (i.e., with probability 1−o(1)) sends the rumor to all vertices in time Θ(log n), if the graph is a complete graph [19], a hypercube [15], an Erd˝s-R´nyi random graph Gn,p with oe p ≥ (1 + ε) log(n)/n [15], or a random regular graph [17]. In contrast to this, Chierichetti, Lattanzi, and Panconesi [7] showed that the push model with non-vanishing probability needs Ω(nα ) rounds on PA graphs for some α 0.

Opposite to the push strategy is the pull strategy: each vertex in each round contacts a random neighbor and learns the rumor if its contact knows the rumor. There is a symmetry between the two models [6, 10], hence these results also hold for the pull model.

Karp, Schindelhauer, Shenker, and V¨cking [22] pointed out that for o complete graphs, the pull strategy is inferior to the push strategy until roughly n/2 vertices are informed, and then the pull strategy becomes more eﬀective. This motivates to combine both approaches. In this socalled push-pull strategy each vertex contacts another vertex chosen uniformly at random among its neighbors. It pushes the rumor in case it has the rumor, and pulls the rumor in case the neighbor has the rumor. For complete graphs and many Erd˝s-R´nyi random graphs, this protocol also oe needs Θ(log n) rounds, though with better implicit constants [9, 13, 22].

Its main advantage here is that it allows to deﬁne protolls using fewer messages. Chierichetti et al. [6] relate the broadcast time of the pushpull strategy to the conductance of graphs; graphs with conductance Φ have a broadcast time of O log2 (Φ−1 ) Φ−1 log n with high probability.

Giakkoupis [20] recently improved this bound to O(Φ−1 log n) which is tight.

For preferential attachment graphs, however, the push-pull strategy is much better than push or pull alone. Chierichetti et al. [7] showed that with this strategy, O(log2 n) rounds suﬃce with high probability.

Recently, we showed that in fact the push-pull strategy succeeds to inform all nodes in Θ(log n) rounds [11]. Surprisingly, if the push-pull strategy is slightly modiﬁed to prevent that a node contacts the same neighbor twice in a row, then with high probability already Θ(log n/ log log n) rounds suﬃce [11], which is the diameter of the PA graph.

All these results assume a synchronized model, in which all nodes take action simultaneously at discrete time steps. In many applications and certainly in real-world social networks, this assumption is not very plausible. One can also argue (see, e.g., [5]) that time-synchronization contradicts the idea of a self-organized broadcasting protocol. Boyd et al.

[5] therefore proposed an asynchronous time model with a continuous time line. Each node has its own clock that ticks at the times of a rate 1 Poisson process independent from the clocks of other nodes. The protocol now speciﬁes for every node what to do when its own clock ticks.

The rumor spreading problem in the asynchronous time model has so far received less attention. The push-pull protocol in this model, however, turns out to be closely related to Richardson’s model for the spread of a disease and to ﬁrst-passage percolation. In this sense, for the hypercube, Fill and Pemantle [16] and Bollob´s and Thomason [3] showed that the a asynchronous push-pull protocol spreads a rumor to all nodes in time Θ(log n). Similarly, for the complete graph, Janson [21] showed a bound of Θ(log n). Note that these bounds match the same asymptotics as in the synchronized case. We also suspect that the same bounds hold in case all but o(n) nodes are to be informed.

Fountoulakis, Panagiotou, and Sauerwald [18] have recently studied the push-pull protocol in the asynchronous time model for random graphs in the Ching-Lu model [8] with a given expected degree distribution that follows a power law with exponent in (2, 3). For these graphs, they show a constant runtime to inform n − o(n) nodes. Note that these graphs are quite diﬀerent from our PA graphs, e.g., their average diameter is known to be Θ(log log n) (see [8]), whereas for PA graphs the average diameter is also Θ(log n/ log log n) (see [12]).

Our results: We study the push-pull protocol in the asynchronous time model on PA graphs and prove that it spreads a rumor in time √ O( log n) to n − o(n) nodes in the PA model with high probability. The protocol thus beats the average distance of Θ(log n/ log log n), which is a natural lower bound for the synchronized protocol achieving this aim. To inform all nodes, however, our protocol is shown to need Θ(log n) time.

This is mainly due to few nodes that require Ω(log n) time to contact or be contacted by a neighbor for the ﬁrst time.

These results show that the asynchronous push-pull protocol behaves quite diﬀerently than the synchronized one, despite the fact that each node still contacts one neighbor per time unit on average. The discrepancy between informing all nodes and almost all nodes reﬂects an often observed ‘long tail’ behavior in real world networks. Such eﬀects are less visible in the synchronized case [11].

2 Precise Model and Preliminaries

This deﬁnition implies that when ei is inserted, the vertex vi is chosen n with probability proportional to its degree (except for vi = n). Since many real-world social networks are conjectured to evolve using similar principles, the PA model can serve as a model for social networks. Another property observed in many real-world networks has been formally proven for preferential attachment graphs, namely that the degree distribution follows a power-law [4].

For m = 1 the graph is disconnected with high probability; so we focus on the case m ≥ 2. Under this assumption, Bollob´s and Riora dan [2] showed that the diameter is only Θ(log(n)/ log log n) with high probability.

With a slight abuse of notation we write (u, v) ∈ E or (v, u) ∈ E both to denote {u, v} ∈ E. The deﬁnition of Gn can lead to multiple edges m and self-loops, though they typically make up only a vanishing fraction of the edges.

We examine the following broadcasting protocol.

Deﬁnition 2 (Asynchronous push-pull strategy). Each node has a clock that ticks at the times of a rate 1 Poisson process. Whenever the clock of a vertex u ticks, it chooses uniformly at random a neighbor v.

If u knows the rumor, it sends the rumor to v (“push”). If v knows the rumor, it sends the rumor to u (“pull”).

We say that an edge (u, v) ﬁres, whenever the clock of node u ticks and u contacts v. We call the time span between two ticks of a clock a round.

The length of a round is exponentially distributed with mean 1.Since the exponential distribution is memoryless, the length of a round is independent over time. The following elementary lemma shows that also the time when a node contacts a speciﬁc neighbor is exponentially distributed.

** Lemma 1. Let u be a node of degree d that is connected to a node v.**

Let T denote the time span until u contacts v. Then, P[T x] = e−x/d.

3 Statement of Results Theorem 1. With probability 1 − o(1), the asynchronous push-pull protocol broadcasts a rumor from any node of Gn to (i) all but o(n) nodes √ m in time O( log n), (ii) and to all nodes in time Θ(log n).

The proofs of the upper bounds in Theorem 1 consist of three main steps. In Section 4.3, we analyze the time needed until the rumor reaches a so-called useful node. Roughly speaking, a node is useful if its degree is at least polylogarithmic (see Section 4.2 for details). We prove that a useful node is reached in time O((log log n)2 ) with probability 1 − o(1) and in time O(log n) with probability 1 − o(n−2 ). The later bound is used for the case when all nodes are to be informed.

The core of the proof (see Section 4.4) consists of showing that once √ a useful node u has been informed, within O( log n) time the rumor is propagated to node 1. To this aim, we show that there is a short path from u to 1 such that every second node has degree exactly m that is √ traversed in time O( log n). To prove such a fast traversal we exploit edges that ﬁre fast. In particular, we use the fact that the minimum of k i.i.d. exponential random variables with mean 1 is also exponentially distributed with mean 1/k.

The result then follows from the following symmetry property.

** Lemma 2. Assume that if the rumor starts in node u, it reaches node v in time t with probability p.**

This implies the reverse statement: if the rumor is initiated by v, then it reaches u in time t with probability p.

more accessible. We ﬁrst describe the model for m = 1 and then generalize it to arbitrary m.

Let (xi, yi ) for i ∈ [n] := {1, 2,..., n} be n independently and uniformly chosen pairs from [0, 1] × [0, 1]. With probability one, all these numbers are distinct. By reordering each pair if necessary, we assume that xi yi for every i ∈ [n]. Suppose that after relabeling, y1 y2 · · · yn.

We set W0 := 0 and Wi := yi for i ∈ [n]. The graph Gn is now de- 1 ﬁned by having an edge (i, j) if and only if Wj−1 xi Wj. Deﬁne wj := Wj − Wj−1.

Similarly, for Gn, we sample mn pairs (xi,j, yi,j ) independently and m uniformly from [0, 1] × [0, 1] with xi,j yi,j for i ∈ [n] and j ∈ [m].

We relabel the variables such that yi,j is increasing in lexicographic order: y1,1 y1,2 · · · y1,m y2,1 · · · yn,1 · · · yn,m. We set W0 := 0 and Wi := yi,m for i ∈ [n]. The graph is now deﬁned by having an edge (i, j) for each k ∈ [m] such that Wj−1 xi,k Wj.

As before, deﬁne wj = Wj − Wj−1. We write i,k for the node j such that Wj−1 xi,k Wj. Note that given y1,1,..., yn,m, the random variables x1,1,..., xn,m are independent with xik being chosen uniformly from [0, yi,k ]. We instead assume that if y1,1,..., yn,m are given, then each xi,k is chosen independently and uniformly from [0, Wi ]. By this slight modication, we can work with the values of the Wi ’s and ignore the values of the yi,j ’s. This modiﬁcation only increases the probability of a loop at i.

It is straightforward to check that each step of our proof remains valid if the probability of a loop is not increased. Thus, the validity of our proof is not aﬀected.

We give a few properties of the alternative model, that are useful in the analysis. Let s = 2a be the smallest power of 2 larger than log10 n, and let 2b be the largest power of 2 smaller than 2n/3. Let It = [2t + 1, 2t+1 ].

Note that the event E5 is slightly adjusted for our purposes. In the original paper, the authors show that for i ≥ n/ log5 n; we have wi n−4/5. It is easy to check that (essentially) the same proof holds for the above version.

Instead of working directly with the alternative model where the Wi ’s are random variables, we use the following typical social network model where the Wi ’s are ﬁxed numbers that satisfy the properties E1,..., E5.

Since by Lemma 3, these properties hold with high probability, all results proven for a typical social network model carry over to Gn with m high probability. Let 0 W1 · · · Wn 1 be distinct real numbers and let wi = Wi − Wi−1. Assume that W1,..., Wn satisfy the properties E1,..., E5. A typical social network Gm (W1,..., Wn ) is obtained by connecting each node i with the nodes i,1,..., i,m, where each i,k is a node chosen randomly with P[ i,k = j] = wj /Wi for all j ≤ i.

**We always assume to have a typical social network G :=**

Gm (W1,..., Wn ).

**4.2 Useful nodes**

We use the notion of a useful node that was introduced by Bollob´s anda Riordan [2]. A node i is useful if wi ≥ log2 (n)/n. Note that we are slightly relaxing the original deﬁnition in [2] where the authors also assumed that i ≤ n/ log5 (n). We have by E5 that i n/2 for all useful nodes.