Advanced network characteristics Flashcards
Node centralities
What are node centralities? Why are they important?
They are quantities according to which it makes sense to rank nodes in a network.
- they help to pinpoint the most important, most influential players in the system
- such nodes can be imagined to be in the centre of the network, hence the name centrality
Node centralities
What is closeness centrality?
It’s the inverse of the average shortest path length for a given node.
C(c)(i) = 1/⟨ℓ(i)⟩
- a node “closer” = higher C(c) value
- nodes “closer” to the rest of the network are intuitively central
Node centralities
What is betweenness centrality?
The betweenness of a node or link is equal to the number of shortest paths (between all possible pairs of nodes) passing through the given node or link.
b(i) = ∑(s≠i,v≠i) σ(sv)(i)/σ(sv)
- if multiple shortest paths are possible between a given pair of nodes, they are given equal weights adding up to one
Node centralities
What’s eigenvector centrality?
The eigenvalue problem of the adjacency matrix where we take the largest eigenvalue and treat the corresponding eigenvector as the value of a centrality measure associated to the corresponding nodes.
- each component of the eigenvector corresponds to a node
Eigenvector centrality v(i) is the i-th component of the leading eigenvector v of A, fulfilling A ⋅ v = λv.
Node centralities
What is PageRank?
Let us assume an iterative process, where everybody is distributing its current PageRank r among its neighbours evenly, with damping.
r(i)(t + 1) = p(d)/N + [1–p(d)] ∑(j∈M(i)) r(j)(t)/k(j)
- M(i): set of sites that link to site i
- k(j): the amount of outgoing links from site j
- p(d): damping factor, we click any random with probability p(d)
- N: number of sites
Calculating PageRank:
- initally we distribute r(i) evenly among the sites: r(i) = 1/N
- iterate according to the rule above, and r(i) will converge soon
It originates from the importance of sites on Google. A site is important if other important sites link to it (recursive definition).
Node centralities
What is the steady state distribution of the PageRank?
We can rewrite the recursive formula for PageRank with the adjacency matrix:
r = p(d)/N⋅Id + [1 – p(d)]Ur,
where U = [K^(–1)A]^T, K(ij) = 1/k(i).
This is similar to the eigenvector problem of an eigenvector centrality, so PageRank is a variation of eigenvector centrality.
steady state: r(i)(t + 1) = r(i)(t) = r(i)
k(i) is the # of outgoing links from site i.
Degree distribution
What is degree distribution?
It’s the probability distribution of the node degrees. The degree distribution of a network, p(k) is equal to the probability that a randomly chosen node has a degree k.
- for a finite network: p(k) = N(k)/N
- calculating it: count how many nodes have degree k for every k and divide by the total # of nodes
N(k) is the # of nodes w/ degree k
Degree distribution
What is the degree distribution of the Erdős-Rényi model?
Erdős-Rényi graph where N nodes are linked independently w/ probability p:
- a given node can choose from N–1 possible neighbours —» binomial distribution
- we neglect the –1 in the binomial
- approximation in the large N limit —» Poisson distribution
p(k) ≃ [⟨k⟩^k/k!]exp(−⟨k⟩), ⟨k⟩ = Np
Scale-free networks
What’s a scale-free network?
A network is called scale-free if the tail of its degree distribution decays as a power-law:
p(k) ∼ k^(−γ).
- γ: node degree (decay) exponent
- spotting such a network: inhomogenous degree distribution (e.g.: airline network in the US)
- majority of the nodes have below average degree, few nodes have extremely large degree
- the average degree cannot be observed as a usual average degree
- quite common behaviour in real networks (universal property of real networks)
Scale-free networks
How does the scale-free distribution fundamentally differ from the Poisson one?
There are hubs in the scale-free distribution.
- ⟨k⟩ «_space;k
- there’s no “typical” degree
- the degree distribution is very inhomogeneous and skewed
Scale-free networks
When are functions or probability distributions scaling?
Functions:
A function F(x) is scaling if changing the argument is equivalent to multiplying F(x) with a constant.
F(a ⋅ x) = g(a) ⋅ F(x)
- e.g.: power-laws
Probability distributions:
A probability distribution is scaling if the ρ(x) density function behaves as a power-law.
ρ(x) ∼ x^(−α)
- e.g.: wealth(/Pareto) distribution, word frequency distribution (Zipf’s law)
Scale-free networks
Why are scale-free networks called scale-free?
- power-law p(k) → scaling distribution
- no “typical degree” → no typical scale for the degrees
Scale-free networks
How to normalize a scale-free distribution?
We use the continuum formalism: we treat the degree k, as a continuous variable.
- assumption for the domain: [k(min), ∞], where k(min) > 0
- integrating p(k) = Ck^(−γ) over the domain gives p(k) = (γ − 1) k^(−γ)/k(min)(1−γ)
The domain is needed because 0^(−γ) = ∞ diverges.
Scale-free networks
How can we calculate the variance, the standard deviation and the 2nd moment for scale-free networks?
The variance is the “width” of p(k) around the average, Var(k) = ⟨k^2⟩ − ⟨k⟩^2, where ⟨k^2⟩ is the 2nd moment.
- when calculating the 2nd moment we have to substitute the normalized p(k) into the integral
- for 3 > γ the 2nd moment diverges
- most measured γ are smaller then 3
- ⟨k^2⟩ and therefore the varience diverges in the N → ∞ limit
- σ(k) ≡ √Var(k), so the standard deviation diverges as well
The standard devination is diverging so ⟨k⟩, the average, is not a meaningful quantity due to the large fluctuations.
Scale-free networks
Do scale-free networks have the small-world property?
Yes, they’re small world for γ ≥ 3 and ultra small world below 3.
Degree correlation
What are the major classes of degree correlated networks?
Assortative network: small degree nodes tend to connect to other small degree nodes, hubs tend to link to each other.
Neutral network: nodes connect to each other at random, node degrees don’t affect the connection probability.
Disassortative network: hubs avoid linking to each other, instead they connect to small degree nodes.
Degree correlations
How can we describe assortativity mathematically?
P(k′ ∣ k) denotes the conditional probability for finding a node with degree k′ at one end of a link, given the node at the other end has degree k. This encodes all info about whether the network is assortative or disassortative.
P(k′ ∣ k) = P(link between k′ and k)/P(link on k) = E(k′k)/∑(k′) E(k′k)
- # of links between nodes of degree k′ and k, links between nodes with the same degree count twice
E(k′k) counts the number of links between nodes of degree k′ and k
Degree correlations
What’s the full statistical description of assortativity? ( = What quantities describe it?)
E(k′,k): number of links between nodes of degree k′ and k
e(k′,k) = Ek′,k/2L: the probability for finding a node with degree k′ at one end and a node with degree k at the other end of a randomly selected link
- the e matrix is difficult to prepare and also to evaluate though
q(k) = ∑(k′) e(k′,k): prob. of degree k at one end of a link
For a neutral network we expect e(k′k) = q(k′) ⋅ q(k), deviations from this value signify the presence of degree correlations.
q(k) can be expressed with the help of p(k): q(k) ∼ kN(k). In a normalized manner: q(k) = kN(k)/∑(k′) k′N(k′) = kp(k)N/∑(k′) k′p(k′)N = kp(k)/⟨k⟩, where p(k) = N(k)/N.
Degree correlations
What is the average nearest neighbours degree (ANND)?
For a single node: k(ANND)(i) ≡ k(nn,i) ≡ ⟨k(j)⟩(j linked to i) = (1/k(i)) ∑(j linked to i) k(j).
For the whole network: it classifies the nodes based on their degree
k(nn)(k) ≡ ⟨k(nn,i)⟩(k(i)=k) = ∑(k′) k′P(k′ ∣ k) = ∑(k′) k′e(k′k)/∑(k′) e(k′k) = ∑(k′) k′e(k′k)/q(k)
- assortative network: increasing k(ANND)(k)
- neutral network: constant k(ANND)(k)
- disassortative network: decreasing k(ANND)(k)
- Calculation: for neutral networks k(nn)(k) is independent of k, actually.
- ANND is simpler (k parameters) to evaluate compared to e(k’k) (k(k–1) parameters).
Degree correlation
What is the Pearson correlation? Why is it useful?
We can use it to measure the relatedness of nodes with it.
r = Cov(e)(k′, k)/[σ(e)(k′)σ(e)(k)] = ∑(k′,k) k′k[e(k′k) − q(k′) q(k)]/σ²(e)(k)
- –1 ≤ r ≤ 1: assortative if r > 0, neutral if r = 0, disassortative if r < 0