Week 8: influence manipulation Flashcards

1
Q

How does simple contagion spread in a small-world network, lattice, random, and scale-free network?

A

Small-world & random: each weak tie allows information to spread to an untouched region of the network. Spreads very fast
Lattice/clustered: spreads consistently but not as fast as small-world or random
Scale-free: existence of hubs means the hubs will quickly spread the information

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

How does complex contagion spread in a small-world network, lattice, random and scale-free network?

A

Clustered: spreads consistently
Small-world: rewiring a few edges slows down the spread because the weak ties slow the spreading. Rewiring many edges can stop the spread
Random: spreads slower than clustered because people need reinforcement from multiple neighbours
Scale-free: when the behaviour is risk the hubs will wait and see until they see other communities backing up the behaviour

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

How does a disease spread in two networks of the same size with different average path lengths?

A

A) has shorter average path lengths - the peak will come earlier, lots of people infected at the same time
B) has longer average path lengths - takes longer to spread and results in a flatter infection curve, peak is lower

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are 3 regulations to deal with COVID-19 and what are they trying to achieve?

A
  1. Remove weak/long ties between distant communities - interact with only those who are geographically proximate
  2. Keep ties with redundancy but remove narrow bridges - friends only meet when they gave many friends in common
  3. Build social bubbles through repeated contact - no gathering of more than 3 people
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is the influence maximisation problem (IMP)?

A

There exists a specific set of nodes which if activated would cause the spread of information to the whole network and if immunised would prevent the diffusion of a large scale epidemic

  • IMP: to find a set S of a given size k that maximises f(S)
  • f(S) is a function
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is structural IMP?

A

The influence function f(S) is fully determined by the network topology - when you are only interested in the connectivity/robustness of a network
E.g. f(S) could be the size of the largest component after the removal of S

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is functional IMP?

A

The influence function f(S) involves both network topology and contagion mechanism

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What centrality heuristics can be used to approximate the exact IMP solution and what are each of them useful for?

A

Degree heuristic: the number of people that can be reached within one path length
Closeness centrality: reflects how efficient a node exchanges information with all other nodes - good for reaching all the people in the shortest time
Betweenness centrality: reflects a nodes power in controlling flow - can’t reach many people within one path length but can potentially lead to a larger outbreak
Eigenvector, k-core: based on node position, importance of neighbours

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are centrality heuristics?

A

Rank all nodes according to their degrees or other centrality measures and pick up the k top-ranking nodes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What are the advantages and disadvantages of using the degree heuristics for IMP?

A

Advantages:
-focus on immediate effect
-easy to identify the highest degree nodes for online and offline networks
-least demanding on network information
Disadvantages:
-not ideal to maximise the effect of the whole network
-sometimes inefficient since nodes of the highest centrality may be highly clustered

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What are the advantages and disadvantages of closeness, betweenness and eigenvector heuristics for IMP?

A

Advantages:
-reflect structure of the whole network
-good for project aiming for the whole population and believing that signal will not decay significantly even for longest path
Disadvantages:
-require complete information of the network
-computationally expensive
-not easy to interpret and extent the results from sample network

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What are 3 algorithms inspired by centrality heuristics?

A

High degree adaptive (HDA), top-k spreaders, local rank

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the HDA method for IMP?

A

First choose the node of the largest degree, then recalculate the degree of nodes after every step of node removal

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is the top-k spreaders method for IMP?

A

Network divided into communities, all communities ranked in decreasing order according to size, first spreader is selected from the largest community, then next spreader selected from second largest community having no edges incident to previous communities

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the local rank method for IMP?

A

Consider the number of neighbours within 4 path lengths

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What are disadvantages of HDA, top-k spreaders and local rank?

A

Requires complete network information

17
Q

What is the one-hop method for IMP?

A

Based on the friendship paradox - individuals are likely to have fewer than their friends do on average, so you should seed the neighbour of randomly selected nodes. Has shown to work better than high degree in some studies

18
Q

In ER and SF networks, which algorithms require the least nodes removed to disconnect the network?

A
  1. HDA
  2. High-degree and page rank
  3. Closeness centrality and eigenvector centrality
19
Q

What is the greedy algorithm?

A

Add nodes one by one to the target set, ensuring that each addition brings the largest increase of influence to the previous set

20
Q

What are the steps of the greedy algorithm?

A
  • set the seed set as S
  • start with an empty set of S={}
  • at each time step scan all nodes to find the one v that maximises f(S union v) and then update S by including v
  • after k time steps, you get the target set S containing k influential nodes
21
Q

What are the advantages and disadvantages of the greedy algorithm?

A

It reduces the computational time but considers the incremental spread of the k nodes individually rather than combined

22
Q

What is monotonicity in a function?

A

The value of the function increases as more items are added to the set

23
Q

What is submodularity in a function?

A

A function f is sub modular if the marginal gain from adding an element to a set S is no less than the marginal gain from adding the same element to a superset of S (diminishing return)

24
Q

What happens with the greedy algorithm when the function is monotonic and submodular?

A

The greedy algorithm can provide an approximation guarantee

25
Q

For which models are the objective functions on the expected number of activated nodes monotonic and submodular?

A

Independent cascade and linear threshold models

26
Q

How well does the greedy algorithm approximate the optimum S*?

A

Within a factor of around 60% or more

27
Q

What is the cost-effective lazy forward (CELF) algorithm?

A

From submodularity, at each time step finding the node for maximum marginal gain, the marginal gain of adding a particular node to the set decreases at each time step. Therefore do not need to re-evaluate all nodes. Only calculate for node with highest marginal gain at previous time step, if it is larger than the others were then don’t need to bother recalculating the others.
CELF reduces calculating steps and running time