Network Analysis Flashcards

1
Q

Data Tools and Techniques

A
  • Basic Data Manipulation and Analysis
  • Data Mining
  • Machine Learning
  • Data Visualization
  • Data Collection and Preparation

Over a specific type of data

2
Q

Performing well-defined computations or asking well-defined questions (“queries”)

A

Basic Data Manipulation and Analysis

3
Q

Looking for patterns in data

A

Data Mining

4
Q

Using data to make inferences or predictions

A

Machine Learning

5
Q

Graphical depiction of data

A

Data Visualization

6
Q

A _ is a collection of nodes (or vertices) and edges (or links) that represent relationships or connections between entities.

A

Network (graph)

Note: networkx identifiers use underscores (_), for example with_labels and node_size; a hyphen (-) inside such a name is a transcription artifact.

Load graph from CSV file with no header

import networkx as nx

with open('Friends.csv', 'rb') as f:  # read_edgelist expects a binary-mode file (or a filename)
    G = nx.read_edgelist(f, delimiter=',', nodetype=str)
print(G)

Graph with 10 nodes and 15 edges

Display graph

nx.draw(G, with_labels=True, node_size=1500, node_color='c')

Displays graph

~
If Directed Graph

with open('Follows.csv', 'rb') as f:
    D = nx.read_edgelist(f, delimiter=',', nodetype=str, create_using=nx.DiGraph())
print(D)

nx.draw(D, with_labels=True, node_size=1500, arrows=True, node_color='c')

~>

DiGraph with 10 nodes and 18 edges
Displays graph
7
Q

The _ represent entities in a network.

A

Nodes (or vertices)

Iterating through nodes of the graph

for n in G:
    print(n)

~>

Aaron

Chris

Emma

...
8
Q

The _ represent connections or relationships between nodes.

A

Edges (or links)

Friends lists
for n in G:
    print(n, 'is friends with:')
    friends = G.neighbors(n)  # friends is an iterator
    for f in friends:
        print(' ', f)

~>

Aaron is friends with:

Chris

Emma
Chris is friends with:

Aaron

Drew

...

Friends lists v2
for n in G:
    print(n, 'is friends with:', list(G.neighbors(n)))

~>

Aaron is friends with: ['Chris', 'Emma']

Chris is friends with: ['Aaron', 'Drew']

...
9
Q

The _ is the number of edges connected to a node.

A

Degree

Degree

numfriends = G.degree  # degree view: iterates as (node, degree) pairs
print(numfriends)
print("")

for n in numfriends:
    print(n[0], 'has', n[1], 'friends')

print("")

~
Or treat the degree view like a dictionary
for n in G:
    print(n, 'has', numfriends[n], 'friends')

~>

Aaron has 4 friends

Chris has 5 friends

...
10
Q

Edges can be _ or _.

A

Directed, Undirected

Undirected edges imply a mutual connection (e.g., friendships), while directed edges indicate a one-way relationship (e.g., following someone on social media).
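
A minimal sketch of the difference, using plain Python dictionaries rather than networkx. The helper `build_adjacency` and the names in `edges` are hypothetical, chosen only for illustration: an undirected edge is stored in both directions, a directed edge in one.

```python
def build_adjacency(edges, directed=False):
    """Return {node: set of neighbors} for a list of (source, target) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())  # make sure v exists even with no outgoing edges
        if not directed:
            adj[v].add(u)  # mutual connection: store the reverse direction too
    return adj

edges = [("Ann", "Bob"), ("Bob", "Cy")]  # hypothetical example data

friends = build_adjacency(edges)                 # undirected, like friendships
follows = build_adjacency(edges, directed=True)  # directed, like follows

print(sorted(friends["Bob"]))
print(sorted(follows["Bob"]))
```

In the undirected version Bob is linked to both Ann and Cy; in the directed version Bob only points at Cy.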

11
Q

It is the study of the structure and behavior of networks, focusing on the relationships between nodes (entities) and edges (connections).

A

Network Analysis

It helps to understand patterns, connectivity, and dynamics within complex systems, such as social networks, communication systems, or transportation grids, using mathematical and computational techniques.

12
Q

Network examples

A
  • Flight Routes
  • Disease Transmission
  • Food Chain
  • Criminal Networks
  • Science Citations
  • Retweets
  • Facebook Friends

Other Examples
  • Electricity grid + other civil infrastructure
  • The brain + other biological structures
  • Organizations and organizational behavior
  • Spread of memes, other social phenomena
  • And many, many more…

13
Q

In network analysis, the _ of a graph measures how many edges are present in the graph compared to the maximum possible number of edges. It is a ratio that reflects the level of connectivity between nodes.

A

Density

A density of 1 means every possible edge is present; a density near 0 means the graph is sparse.

Density of graph

numnodes = G.number_of_nodes()
numedges = G.number_of_edges()

# G is undirected, so each pair of nodes counts once: n(n-1)/2
possedges = numnodes * (numnodes - 1) // 2

print('Number of nodes:', numnodes)
print('Number of edges:', numedges)
print('Possible edges:', possedges)
print('Density (edges divided by possible edges):', numedges/possedges)

~>

Number of nodes: 10

Number of edges: 15

Possible edges: 45

Density (edges divided by possible edges): 0.3333333333333333

Using density function

print('Using density function:', nx.density(G))

~>

Using density function: 0.3333333333333333
14
Q

What is the formula for graph density?

A

Density = number of edges / number of possible edges

[Directed] Possible edges = n(n−1)
[Undirected] Possible edges = n(n−1) / 2

where n is the number of nodes
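
The formula can be checked by hand with a short sketch. The function name `density` and its parameters are hypothetical; the node and edge counts are the ones reported for the Friends graph (10 nodes, 15 edges) and the Follows graph (10 nodes, 18 edges) in this deck.

```python
def density(num_nodes, num_edges, directed=False):
    """Edges present divided by edges possible (no self-loops, no parallel edges)."""
    possible = num_nodes * (num_nodes - 1)  # ordered pairs: the directed case
    if not directed:
        possible //= 2  # each unordered pair counts once
    return num_edges / possible if possible else 0.0

undirected_density = density(10, 15)               # 15 / 45
directed_density = density(10, 18, directed=True)  # 18 / 90

print(undirected_density)
print(directed_density)
```

With these counts the undirected value is 1/3 and the directed value is 0.2.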

15
Q

The _ in a graph is the minimum number of edges required to travel between two nodes in the network.

A

Shortest path

Shortest path (or shortest distance) between given pair of nodes

“Six degrees of separation”
(Four in Facebook)

Overall average shortest distance

print('Average shortest distance:', nx.average_shortest_path_length(G))

~>

Average shortest distance: 2.022222222222222
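
For intuition, the shortest distance between one pair of nodes in an unweighted graph can be found with a breadth-first search. This is a sketch in plain Python, not the networkx implementation; `shortest_distance` and the small graph `adj` are hypothetical.

```python
from collections import deque

def shortest_distance(adj, source, target):
    """Fewest edges from source to target, or None if target is unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nb in adj[node]:
            if nb not in dist:  # first visit is along a shortest route
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return None

# Hypothetical graph: a chain A - B - C - D, plus an isolated node E
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}, "E": set()}

print(shortest_distance(adj, "A", "D"))
print(shortest_distance(adj, "A", "E"))
```

Breadth-first search visits nodes in order of distance, so the first time the target is reached its distance is already minimal.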
16
Q

The _ of a graph is the maximum shortest distance between any pair of nodes in the graph.

A

Diameter

Maximum shortest distance in graph

Diameter

print('Diameter:', nx.diameter(G))

~>

Diameter: 4
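
A sketch of what the diameter means, computed from scratch in plain Python under the assumption that the graph is connected. The function names and the example graph are hypothetical.

```python
from collections import deque

def distances_from(adj, source):
    """Shortest distance (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def diameter(adj):
    """Largest shortest distance over all pairs; assumes a connected graph."""
    return max(max(distances_from(adj, s).values()) for s in adj)

# Hypothetical graph: a chain A - B - C - D
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}

chain_diameter = diameter(adj)
print(chain_diameter)
```

The two ends of the chain are the farthest-apart pair, so their distance is the diameter.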
17
Q

The _ in a graph are sets of fully connected nodes, where every node is connected to every other node in the set.

A

Cliques

Sets of fully-connected nodes

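Maximal cliques
cliques = nx.find_cliques(G)  # cliques is an iterator
for c in cliques:
    print(c)

Modify code to only print cliques with more than 2 nodes
cliques = nx.find_cliques(G)  # new iterator: the loop above exhausted the first one
for c in cliques:
    if len(c) > 2:
        print(c)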

~>

['Josh', 'Mike', 'Jess']

['Josh', 'Aaron', 'Jess']

['Josh', 'Aaron', 'Emma']

['Chris', 'Sarah', 'Drew']
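
To make the definition concrete, here is a small check in plain Python that a given set of nodes is fully connected. The helper `is_clique` and the example graph are hypothetical; this only tests a candidate set, it does not search for maximal cliques the way nx.find_cliques does.

```python
from itertools import combinations

def is_clique(adj, nodes):
    """True if every pair of the given nodes is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))

# Hypothetical graph: triangle A-B-C with D attached only to C
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

print(is_clique(adj, ["A", "B", "C"]))
print(is_clique(adj, ["B", "C", "D"]))
```

The second set fails because B and D are not directly connected.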
18
Q

The _ in a graph measures how close a node is to all other nodes, based on the average shortest distance from the node to all others.

A

Closeness centrality

Average shortest distance to all other nodes

(inverted so higher is “better”)

Closeness centrality - average shortest distance to other nodes, inverted and normalized to a 0-1 scale (higher is closer)

cc = nx.closeness_centrality(G)
print(cc)
print("")

sorted_keys = sorted(cc, key=cc.get, reverse=True)
print(sorted_keys)

for k in sorted_keys:
    print(k, 'has closeness centrality', cc[k])

~>

{'Aaron': 0.6, 'Chris': 0.6923076923076923, ...}

['Chris', 'Aaron', ...]
Chris has closeness centrality 0.6923076923076923

Aaron has closeness centrality 0.6

...
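
The "inverted average distance" can be sketched by hand as (n − 1) divided by the sum of shortest distances from the node, assuming a connected graph. The function `closeness` and the three-node graph are hypothetical, and this is an illustration rather than the networkx source.

```python
from collections import deque

def closeness(adj, node):
    """(n - 1) / sum of shortest distances from node; assumes a connected graph."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for nb in adj[current]:
            if nb not in dist:
                dist[nb] = dist[current] + 1
                queue.append(nb)
    total = sum(dist.values())
    return (len(adj) - 1) / total if total else 0.0

# Hypothetical graph: a chain A - B - C
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}

scores = {n: closeness(adj, n) for n in adj}
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, scores[name])
```

The middle node B is one step from everyone, so it gets the highest score.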
19
Q

The _ in a graph indicates the number of shortest paths that pass through a particular node, showing how crucial it is to network connectivity.

A

Betweenness centrality

Number of shortest paths the node lies on

Betweenness centrality - number of shortest paths it’s on, normalized on 0-1 scale

bc = nx.betweenness_centrality(G)
sorted_keys = sorted(bc, key=bc.get, reverse=True)
for k in sorted_keys:
    print(k, 'has betweenness centrality', bc[k])

~>

Chris has betweenness centrality 0.5555555555555556

Aaron has betweenness centrality 0.1759259259259259

...
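
A brute-force sketch of the idea for small undirected graphs: for every pair of other nodes, add the fraction of their shortest paths that pass through the node, then divide by the number of such pairs. The function names and the star-shaped example are hypothetical; this is for intuition only and is not how networkx computes the score internally.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Return (dist, sigma): distances and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                sigma[nb] = 0
                queue.append(nb)
            if dist[nb] == dist[node] + 1:  # node is a predecessor of nb
                sigma[nb] += sigma[node]
    return dist, sigma

def betweenness(adj):
    """Normalized betweenness for an undirected graph given as {node: neighbors}."""
    info = {s: bfs_counts(adj, s) for s in adj}
    scores = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # s and t are not connected
        for v in adj:
            if v == s or v == t or v not in dist_s:
                continue
            dist_v, sigma_v = info[v]
            # v lies on a shortest s-t path when the two legs add up exactly
            if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                scores[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    n = len(adj)
    pairs = (n - 1) * (n - 2) / 2  # pairs of nodes that exclude v
    return {v: (score / pairs if pairs else 0.0) for v, score in scores.items()}

# Hypothetical graph: a star with hub H and leaves A, B, C
adj = {"H": {"A", "B", "C"}, "A": {"H"}, "B": {"H"}, "C": {"H"}}

bc_scores = betweenness(adj)
print(bc_scores)
```

Every shortest path between two leaves runs through the hub, so the hub scores 1 and the leaves score 0.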
20
Q

In a directed graph, _ is the number of edges directed toward a node (followers).

A

In-degree

How many “followers”

Number of followers
followers = D.in_degree
print('Number of followers: ', followers)

~>

Number of followers:  [('Aaron', 3), ('Chris', 4), ...]
21
Q

In a directed graph, _ is the number of edges directed from a node (following).

A

Out-degree

How many “following”

Number of follows
follows = D.out_degree
print('Number of follows: ', follows)

~>

Number of follows:  [('Aaron', 3), ('Chris', 2), ...]

~
Can treat the degree views like dictionaries
for n in D:
    print(n, 'follows', follows[n], 'people and has', followers[n], 'followers')

~>

Aaron follows 3 people and has 3 followers

Chris follows 2 people and has 4 followers
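
In-degree and out-degree can also be counted directly from a list of directed edges. A minimal sketch in plain Python; the edge list and names are hypothetical example data, with each pair read as (follower, followed).

```python
from collections import Counter

# Hypothetical directed edges: (who follows, who is followed)
follow_edges = [("Ann", "Bob"), ("Cy", "Bob"), ("Bob", "Ann")]

out_degree = Counter(src for src, _ in follow_edges)  # how many each node follows
in_degree = Counter(dst for _, dst in follow_edges)   # how many followers each has

for person in ["Ann", "Bob", "Cy"]:
    print(person, "follows", out_degree[person], "and has", in_degree[person], "followers")
```

A Counter returns 0 for missing keys, so nodes with no followers need no special handling.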
22
Q

The _ of a directed graph measures how often links between nodes are bidirectional.

A

Reciprocity

How often links are bidirectional

Reciprocity - people that follow each other
for n in D:
    print(f"{n} follows {list(D.neighbors(n))}")

~>

Aaron follows ['Chris', 'Emma', 'Josh']

Chris follows ['Aaron', 'Drew']

...

~
Alternative reciprocity
cycles = nx.simple_cycles(D)
for c in cycles:
    if len(c) == 2:
        print(c[0], 'and', c[1], 'follow each other')

~>

Mike and Jess follow each other

Chris and Aaron follow each other
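
Mutual pairs can also be found without enumerating cycles, by checking whether the reverse of each edge exists. A sketch in plain Python, assuming no self-loops; `mutual_pairs`, `reciprocity`, and the edge list are hypothetical.

```python
def mutual_pairs(edges):
    """Sorted list of unordered pairs (u, v) where both u->v and v->u exist."""
    edge_set = set(edges)
    return sorted({tuple(sorted((u, v))) for u, v in edge_set if (v, u) in edge_set})

def reciprocity(edges):
    """Fraction of directed edges whose reverse edge is also present."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    reciprocated = sum(1 for u, v in edge_set if (v, u) in edge_set)
    return reciprocated / len(edge_set)

# Hypothetical follow edges
edges = [("Ann", "Bob"), ("Bob", "Ann"), ("Bob", "Cy"), ("Cy", "Dee")]

pairs = mutual_pairs(edges)
ratio = reciprocity(edges)
print(pairs)
print(ratio)
```

Here two of the four edges (Ann to Bob and Bob to Ann) are reciprocated, giving a ratio of 0.5.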
23
Q

The _ in a directed graph occur when a sequence of directed edges leads back to the starting node.

A

Cycles

Cycles
cycles = nx.simple_cycles(D)
for c in cycles:
    print(c)

~>

['Mike', 'Chris', 'Drew']

['Mike', 'Jess', 'Aaron', 'Chris', 'Drew']

['Mike', 'Jess', 'Aaron', 'Emma', 'Sarah', 'Chris', 'Drew']

['Mike', 'Jess']

['Chris', 'Aaron']

['Chris', 'Aaron', 'Emma', 'Sarah']

['Sarah', 'Emma']
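
When only a yes/no answer is needed, a depth-first search can detect whether a directed graph contains any cycle. This is a sketch in plain Python, not what nx.simple_cycles does (that function lists every cycle); `has_cycle` and the example graphs are hypothetical, and every node must appear as a key.

```python
def has_cycle(adj):
    """True if the directed graph {node: successors} contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in adj}

    def visit(node):
        color[node] = GRAY
        for nb in adj[node]:
            if color[nb] == GRAY:  # edge back into the current path
                return True
            if color[nb] == WHITE and visit(nb):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in adj)

loop = {"A": ["B"], "B": ["C"], "C": ["A"]}   # A -> B -> C -> A
chain = {"A": ["B"], "B": ["C"], "C": []}     # A -> B -> C

print(has_cycle(loop))
print(has_cycle(chain))
```

The recursion is fine for small graphs; a very deep graph would need an explicit stack.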
24
Q

In network analysis, _ attempts to forecast which new edges will form in the network in the future, often used for friend or follower recommendations.

A

Link prediction

Predict future edges added to the graph

Friends (or Follows) recommendations

Dolphin friend recommendation

for n1 in G:
    for n2 in G:
        if n1 != n2 and not G.has_edge(n1, n2):
            common = set(G.neighbors(n1)) & set(G.neighbors(n2))
            if len(common) >= 4:
                print(f"Dolphins {n1} and {n2} have common friends with {sorted(common)}")

~>

Dolphins 6 and 7 have common friends with ['10', '14', '57', '58']

Dolphins 10 and 55 have common friends with ['14', '42', '58', '7']

...
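
The same common-neighbors idea, sketched in plain Python so that each unordered pair is reported once and the suggestions are ranked by how many friends the pair shares. The function `recommend`, its `min_common` parameter, and the small graph are hypothetical.

```python
from itertools import combinations

def recommend(adj, min_common=2):
    """Unconnected pairs sharing at least min_common neighbors, strongest first."""
    suggestions = []
    for u, v in combinations(sorted(adj), 2):  # each unordered pair once
        if v in adj[u]:
            continue  # already connected
        common = adj[u] & adj[v]
        if len(common) >= min_common:
            suggestions.append((u, v, sorted(common)))
    suggestions.sort(key=lambda s: len(s[2]), reverse=True)
    return suggestions

# Hypothetical undirected friendship graph
adj = {
    "A": {"C", "D", "E"},
    "B": {"C", "D"},
    "C": {"A", "B"},
    "D": {"A", "B"},
    "E": {"A"},
}

suggested = recommend(adj)
for u, v, common in suggested:
    print(f"{u} and {v} share {common}")
```

Raising `min_common` makes the recommendations stricter, as the threshold of 4 does in the dolphin example.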
25
Q

In network analysis, _ identifies groups of nodes that are more densely connected to each other than to the rest of the network.

A

Community detection

Sets of interlinked/similar nodes

26
Q

In network analysis, _ describe the spread or propagation of information through a network.

A

Cascades

Information propagation

27
Q

Which Python package is commonly used for network analysis?

A

networkx

import networkx as nx

The networkx package is widely used for creating, manipulating, and analyzing graphs and networks in Python.