Week 8: influence manipulation Flashcards by Annie Clarnette

How does simple contagion spread in a small-world network, lattice, random, and scale-free network?

Small-world & random: each weak tie allows information to spread to an untouched region of the network. Spreads very fast
Lattice/clustered: spreads consistently but not as fast as small-world or random
Scale-free: existence of hubs means the hubs will quickly spread the information

How well did you know this?

Not at all

Perfectly

How does complex contagion spread in a small-world network, lattice, random and scale-free network?

Clustered: spreads consistently
Small-world: rewiring a few edges slows down the spread because the weak ties slow the spreading. Rewiring many edges can stop the spread
Random: spreads slower than clustered because people need reinforcement from multiple neighbours
Scale-free: when the behaviour is risk the hubs will wait and see until they see other communities backing up the behaviour

How well did you know this?

Not at all

Perfectly

How does a disease spread in two networks of the same size with different average path lengths?

A) has shorter average path lengths - the peak will come earlier, lots of people infected at the same time
B) has longer average path lengths - takes longer to spread and results in a flatter infection curve, peak is lower

How well did you know this?

Not at all

Perfectly

What are 3 regulations to deal with COVID-19 and what are they trying to achieve?

Remove weak/long ties between distant communities - interact with only those who are geographically proximate
Keep ties with redundancy but remove narrow bridges - friends only meet when they gave many friends in common
Build social bubbles through repeated contact - no gathering of more than 3 people

How well did you know this?

Not at all

Perfectly

What is the influence maximisation problem (IMP)?

There exists a specific set of nodes which if activated would cause the spread of information to the whole network and if immunised would prevent the diffusion of a large scale epidemic

IMP: to find a set S of a given size k that maximises f(S)
f(S) is a function

How well did you know this?

Not at all

Perfectly

What is structural IMP?

The influence function f(S) is fully determined by the network topology - when you are only interested in the connectivity/robustness of a network
E.g. f(S) could be the size of the largest component after the removal of S

How well did you know this?

Not at all

Perfectly

What is functional IMP?

The influence function f(S) involves both network topology and contagion mechanism

How well did you know this?

Not at all

Perfectly

What centrality heuristics can be used to approximate the exact IMP solution and what are each of them useful for?

Degree heuristic: the number of people that can be reached within one path length
Closeness centrality: reflects how efficient a node exchanges information with all other nodes - good for reaching all the people in the shortest time
Betweenness centrality: reflects a nodes power in controlling flow - can’t reach many people within one path length but can potentially lead to a larger outbreak
Eigenvector, k-core: based on node position, importance of neighbours

How well did you know this?

Not at all

Perfectly

What are centrality heuristics?

Rank all nodes according to their degrees or other centrality measures and pick up the k top-ranking nodes

How well did you know this?

Not at all

Perfectly

What are the advantages and disadvantages of using the degree heuristics for IMP?

Advantages:
-focus on immediate effect
-easy to identify the highest degree nodes for online and offline networks
-least demanding on network information
Disadvantages:
-not ideal to maximise the effect of the whole network
-sometimes inefficient since nodes of the highest centrality may be highly clustered

How well did you know this?

Not at all

Perfectly

What are the advantages and disadvantages of closeness, betweenness and eigenvector heuristics for IMP?

Advantages:
-reflect structure of the whole network
-good for project aiming for the whole population and believing that signal will not decay significantly even for longest path
Disadvantages:
-require complete information of the network
-computationally expensive
-not easy to interpret and extent the results from sample network

How well did you know this?

Not at all

Perfectly

What are 3 algorithms inspired by centrality heuristics?

High degree adaptive (HDA), top-k spreaders, local rank

How well did you know this?

Not at all

Perfectly

What is the HDA method for IMP?

First choose the node of the largest degree, then recalculate the degree of nodes after every step of node removal

How well did you know this?

Not at all

Perfectly

What is the top-k spreaders method for IMP?

Network divided into communities, all communities ranked in decreasing order according to size, first spreader is selected from the largest community, then next spreader selected from second largest community having no edges incident to previous communities

How well did you know this?

Not at all

Perfectly

What is the local rank method for IMP?

Consider the number of neighbours within 4 path lengths

How well did you know this?

Not at all

Perfectly

What are disadvantages of HDA, top-k spreaders and local rank?

Study These Flashcards

Requires complete network information

What is the one-hop method for IMP?

Study These Flashcards

Based on the friendship paradox - individuals are likely to have fewer than their friends do on average, so you should seed the neighbour of randomly selected nodes. Has shown to work better than high degree in some studies

In ER and SF networks, which algorithms require the least nodes removed to disconnect the network?

Study These Flashcards

HDA
High-degree and page rank
Closeness centrality and eigenvector centrality

What is the greedy algorithm?

Study These Flashcards

Add nodes one by one to the target set, ensuring that each addition brings the largest increase of influence to the previous set

What are the steps of the greedy algorithm?

Study These Flashcards

set the seed set as S
start with an empty set of S={}
at each time step scan all nodes to find the one v that maximises f(S union v) and then update S by including v
after k time steps, you get the target set S containing k influential nodes

What are the advantages and disadvantages of the greedy algorithm?

Study These Flashcards

It reduces the computational time but considers the incremental spread of the k nodes individually rather than combined

What is monotonicity in a function?

Study These Flashcards

The value of the function increases as more items are added to the set

What is submodularity in a function?

Study These Flashcards

A function f is sub modular if the marginal gain from adding an element to a set S is no less than the marginal gain from adding the same element to a superset of S (diminishing return)

What happens with the greedy algorithm when the function is monotonic and submodular?

Study These Flashcards

The greedy algorithm can provide an approximation guarantee

For which models are the objective functions on the expected number of activated nodes monotonic and submodular?

Independent cascade and linear threshold models

How well does the greedy algorithm approximate the optimum S*?

Within a factor of around 60% or more

What is the cost-effective lazy forward (CELF) algorithm?

From submodularity, at each time step finding the node for maximum marginal gain, the marginal gain of adding a particular node to the set decreases at each time step. Therefore do not need to re-evaluate all nodes. Only calculate for node with highest marginal gain at previous time step, if it is larger than the others were then don’t need to bother recalculating the others. CELF reduces calculating steps and running time

Week 8: influence manipulation Flashcards

(27 cards)