Graph Neural Networks (GNN) Flashcards

1
Q

GNN pooling

A
  • if node information is missing, pool edge information onto the nodes (and vice versa)
  • aggregation is usually a sum()
  • a global pooling layer over all nodes is analogous to global average pooling in a CNN (see the sketch below)
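
A minimal NumPy sketch of both ideas, pooling edge features onto their incident nodes and then global-pooling over nodes; the toy graph, feature sizes, and arrays are illustrative assumptions:

```python
import numpy as np

# Toy graph: 4 nodes, 3 edges given as (source, target) pairs (illustrative assumption)
edges = np.array([(0, 1), (1, 2), (2, 3)])
edge_feats = np.random.randn(3, 8)   # one 8-dim feature vector per edge
node_feats = np.zeros((4, 8))        # pretend node features are missing

# Pool edge info onto nodes: each node sums the features of its incident edges
for (u, v), e in zip(edges, edge_feats):
    node_feats[u] += e
    node_feats[v] += e

# Global pooling: summing over all nodes gives one graph-level vector
graph_embedding = node_feats.sum(axis=0)
print(graph_embedding.shape)  # (8,)
```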
2
Q

message passing between parts of graphs

A

pool neighboring information so that nodes influence each other's updated embeddings (see the sketch below)

  1. gather all neighboring “messages”
  2. aggregate the messages via an aggregation function (e.g. sum)
  3. pass the aggregated message through an update function to produce the new embedding
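
A minimal sketch of one message-passing step with sum aggregation and a linear update; the adjacency list, weight matrix, and tanh nonlinearity are illustrative assumptions:

```python
import numpy as np

num_nodes, dim = 4, 8
h = np.random.randn(num_nodes, dim)                 # current node embeddings
W = np.random.randn(dim, dim)                       # update weights (assumed linear update)
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # toy adjacency list (assumed)

h_new = np.zeros_like(h)
for v in range(num_nodes):
    msgs = [h[u] for u in neighbors[v]]   # 1. gather neighbor messages
    agg = np.sum(msgs, axis=0)            # 2. aggregate the messages (sum)
    h_new[v] = np.tanh((h[v] + agg) @ W)  # 3. update function produces the new embedding
```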
3
Q

XAI GNN Methods

A
  1. Gradient: use gradients as an approximation of feature importance (sketched below)
  2. Perturbation: use the variation of the output with respect to perturbations of the input (which attributes need to be kept to reconstruct the graph)
  3. Surrogate: train a surrogate model on neighboring areas of the input
  4. Decomposition: decompose the prediction into several scores representing importance
  5. Generation: learn to generate graphs that achieve optimal prediction scores
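
A minimal sketch of the gradient-based idea (method 1) using a stand-in differentiable model; the mean-pool-plus-linear scorer is an illustrative assumption, not a real GNN explainer:

```python
import torch

# Stand-in model: mean-pool node features, then a linear scorer (illustrative assumption)
x = torch.randn(5, 8, requires_grad=True)   # 5 nodes, 8 features each
w = torch.randn(8, 1)
score = (x.mean(dim=0) @ w).sum()           # graph-level prediction

score.backward()
saliency = x.grad.abs()                     # |d score / d input| ~ feature importance
print(saliency.sum(dim=1))                  # rough per-node importance
```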
4
Q

Perturbation GNN method

A

use the variation of the output with respect to perturbations of the input (which attributes need to be kept to reconstruct the graph)

5
Q

Surrogate GNN method

A

train a surrogate model on neighboring areas of the input

6
Q

Decomposition GNN method

A

decompose the prediction into several scores representing importance

7
Q

ordering GNN update

A

updates can be done in a weave fashion: node→node (linear), edge→edge (linear), edge→node (edge layer), node→edge (node layer)

8
Q

global representations

A
  • nodes far apart cannot exchange information (k layers propagate information at most k steps)
  • overcome this with a master node: a global context vector connected to all other nodes and edges
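
A minimal sketch of adding a master node by extending the adjacency matrix so one extra node connects to every other node; the toy graph is an illustrative assumption:

```python
import numpy as np

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])          # toy 3-node adjacency (assumed)

n = A.shape[0]
A_aug = np.zeros((n + 1, n + 1), dtype=A.dtype)
A_aug[:n, :n] = A
A_aug[n, :n] = 1                   # master node connects to every node
A_aug[:n, n] = 1                   # every node connects back to the master node

# After augmentation, any two nodes are at most 2 hops apart via the master node.
print(A_aug)
```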
9
Q

multigraphs

A

multi-edge graphs where a pair of nodes can share multiple types of edges

10
Q

nested graph

A

a hypernode graph, where each node itself represents a graph

  • useful for representing hierarchical information
  • example: nodes represent molecules and edges represent interactions if there is a way of transforming one molecule into another
11
Q

hypergraph

A

a graph where an edge can connect to multiple nodes

  • example: a hyper-edge that connects all of the nodes in a community
12
Q

sampling/batching GNN

A
  • poses a problem because context matters when sub-selecting nodes
  • method 1: to preserve structure, randomly sample a uniform number of nodes as the node-set (sketched below)
    • then add their neighboring nodes of distance k, including the connecting edges
    • each neighborhood is treated as a graph, and the GNN is trained on batches of these subgraphs
    • the loss is masked so only the original node-set is considered
  • method 2: randomly sample a single node, expand its neighborhood to distance k, then pick another node within the expanded set and repeat
    • terminate once a certain number of nodes, edges, or subgraphs has been constructed
    • constant-size neighborhoods can be built by picking the initial node-set and then subsampling a constant number of nodes (e.g. via random walks or the Metropolis algorithm)
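
A minimal sketch of method 1, sampling a seed node-set and building its 1-hop neighborhood subgraph; the toy adjacency list and sample sizes are illustrative assumptions:

```python
import random

# Toy adjacency list (assumed)
adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1, 4], 4: [3]}

def sample_subgraph(adj, num_seed=2, seed=0):
    rng = random.Random(seed)
    node_set = set(rng.sample(sorted(adj), num_seed))   # seed nodes (loss is masked to these)
    neighborhood = set(node_set)
    for v in node_set:                                   # add 1-hop neighbors
        neighborhood.update(adj[v])
    # keep only edges whose both endpoints fall inside the sampled neighborhood
    edges = [(u, v) for u in neighborhood for v in adj[u] if v in neighborhood]
    return node_set, neighborhood, edges

node_set, nodes, edges = sample_subgraph(adj)
print(node_set, nodes, edges)
```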
13
Q

4 ways of sampling a graph

A
14
Q

inductive bias in graphs

A
  • identify patterns in the data and add modeling components that take advantage of these attributes
  • a GNN should preserve explicit relationships (the adjacency matrix)
  • a GNN should also preserve graph symmetries (permutation invariance)
  • graph structure is a great fit when interactions between entities are important
  • a GNN should operate on sets → the order of operations over nodes and edges should not matter
15
Q

aggregation operation

A
  • we want similar inputs to produce similar aggregated outputs, and vice versa
    • the operation must take a variable number of inputs and produce an output of fixed size
  • common choices: sum, mean, max, variance (compared below)
    • mean() is good when you need a normalized view
    • max() is good when you want to highlight salient features
    • sum() is a good balance of the two and is used most often in practice
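
A minimal NumPy comparison of the three aggregators over a variable-size set of neighbor messages; the message values are an illustrative assumption:

```python
import numpy as np

msgs = np.array([[1.0, 0.0],
                 [3.0, 2.0],
                 [2.0, 4.0]])      # 3 neighbor messages, 2 features each (assumed)

agg_sum  = msgs.sum(axis=0)        # [6., 6.]  balances magnitude and neighborhood size
agg_mean = msgs.mean(axis=0)       # [2., 2.]  normalized view, insensitive to neighborhood size
agg_max  = msgs.max(axis=0)        # [3., 4.]  highlights the most salient feature values
```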
16
Q

bag of subgraphs

A
17
Q

graphs dual

A
  • edge and node predictions often reduce to the same problem: “an edge prediction task on a graph G can be phrased as a node-level prediction on G's dual”
  • to obtain G's dual, convert nodes to edges (and edges to nodes)
  • example: to solve edge classification on G, we can do graph convolutions on G's dual
18
Q

significance of matrix multiplication on a graph

A
  • applying the adjacency matrix multiple times propagates information
  • example: in A^2, entry (A^2)_ij = ∑_k a_ik a_kj counts the walks of length 2 between nodes i and j, so information reaches 2-hop neighbors (sketched below)
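
A minimal sketch showing that powers of the adjacency matrix reach further hops; the path graph is an illustrative assumption:

```python
import numpy as np

# Path graph 0 - 1 - 2 - 3 (assumed)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

print((A @ A)[0])        # A^2: node 0 now "sees" node 2 (two-hop neighbor)
print((A @ A @ A)[0])    # A^3: node 0 now "sees" node 3 (three-hop neighbor)
```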
19
Q

graph attention network

A
  • uses a weighted sum, in a permutation-invariant fashion, to communicate information between graph attributes
  • method 1: a scalar scoring function assigns weights to node pairs (how relevant a neighboring node is to the center node)
  • method 2: the score is an inner product; nodes are transformed into query and key vectors via a linear map before scoring (sketched below)
  • the scoring weights can be used as a measure of the importance of an edge with respect to a task
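
A minimal NumPy sketch of method 2: project node embeddings to queries and keys, score with an inner product, softmax-normalize, and take the weighted sum; all matrices here are illustrative assumptions:

```python
import numpy as np

dim = 4
h = np.random.randn(3, dim)                      # center node 0 plus its two neighbors (assumed)
Wq, Wk = np.random.randn(dim, dim), np.random.randn(dim, dim)

q = h[0] @ Wq                                    # query vector for the center node
k = h @ Wk                                       # key vectors for all nodes (incl. self)
scores = k @ q                                   # inner-product scores per neighbor
alpha = np.exp(scores) / np.exp(scores).sum()    # softmax attention weights

h0_new = alpha @ h                               # permutation-invariant weighted sum
```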
20
Q

relationship b/w transformer and GNN

A
  • a transformer models a set of elements (e.g. tokens) as nodes in a fully connected graph
  • the attention mechanism assigns edge embeddings to each node pair, which are used to compute attention weights
  • the difference: a GNN assumes a sparse connectivity pattern
21
Q

GNN XAI

A
  • explanations vary from graph to graph
  • GNNExplainer extracts the most relevant subgraph for a prediction
  • attribution methods assign ranked importance values to parts of the graph
22
Q

generative graph modeling

A
  • method 1: generate new graphs by sampling from a learned distribution, or complete a graph given a starting point
  • example: novel molecular graphs with desirable attributes
  • example solution: GraphVAE learns connectivity by treating the adjacency matrix like an image
  • method 2: build the graph sequentially, starting with a graph and repeatedly applying additions, subtractions, … of nodes and edges
23
Q

node-order equivariance

A
  • algorithms should not depend on the ordering of the nodes
  • graphs have no inherent ordering amongst their nodes (in contrast to, e.g., images with pixel coordinates)
24
Q

problems that can be defined over graphs

A
25
Q

node representation learning

A

map individual nodes to fixed-size real-valued vectors

  • generally, node representations are computed in an iterative process
  • each iteration corresponds to a layer in a standard ANN
  • notation:
    1. graph G
    2. nodes V
    3. edges E
    4. representation of node v at iteration k: h_v^(k)
    5. feature vector x_v for node v

26
Q

graph Laplacian

A
  • the Laplacian is a square (n × n) matrix
    • L = D − A (degree matrix minus adjacency matrix)
    • shows up in random walks, spectral clustering, and many other places
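
A minimal NumPy sketch computing L = D − A for a toy graph; the adjacency matrix is an illustrative assumption:

```python
import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])            # toy undirected adjacency (assumed)

D = np.diag(A.sum(axis=1))           # degree matrix: D_vv = sum_u A_vu
L = D - A                            # graph Laplacian

print(L)                             # each row of L sums to zero
```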
27
Q

degree of a node

A
  • the degree of a node v is the number of edges incident at v
    • D_vv = ∑_u A_vu (row sums of the adjacency matrix, placed on the diagonal of D)
28
Q

polynomial of the Laplacian

A
29
Q

polynomial feature vector convolution

A
30
Q

Graph Convolutional Network (GCN)

A
31
Q

Graph Attention Networks (GAT)

A
32
Q

Graph Isomorphic Network (GIN)

A
33
Q

Global Propagation via Embeddings

A
  • graph-level embeddings constructed by pooling tend to ignore the underlying topology
  • spectral convolutions do not
34
Q

spectral embedding

A
35
Q

GNNs in Practice

A
  • the update equations can be written as sparse matrix-vector products (see the sketch below)
  • this allows efficient vectorized implementations of GNNs on GPUs
  • standard regularization such as dropout applies, but graph-specific methods such as DropEdge can give an additional boost for many GNNs
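
A minimal sketch of one GNN layer written as a sparse matrix product using scipy; the toy graph, weights, and tanh nonlinearity are illustrative assumptions:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy adjacency stored in sparse CSR form (assumed)
A = csr_matrix(np.array([[0, 1, 0],
                         [1, 0, 1],
                         [0, 1, 0]], dtype=float))

H = np.random.randn(3, 8)    # node embeddings
W = np.random.randn(8, 8)    # layer weights

H_new = np.tanh(A @ H @ W)   # one layer: sparse neighbor aggregation + linear update
```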
36
Q

GNN Pooling Methods

A
  1. SortPool: Sort the vertices of the graph to get a fixed-size representation, and then apply any standard neural network architecture.
  2. DiffPool: Learn to cluster vertices, build a coarser graph over clusters instead of nodes, then apply a GNN over the coarser graph. Repeat until only one cluster is left.
  3. SAGPool: Apply a GNN to learn node scores, then keep only the nodes with the top scores, throwing away the rest. Repeat until only one node is left.
37
Q

Node Classification

A
  • per-node cross-entropy: L(y_v, ŷ_v) = −∑_c y_vc log(ŷ_vc)
  • graph-level loss, averaged over labeled nodes: L_G = (1/|Labeled(G)|) ∑_{v ∈ Labeled(G)} L(y_v, ŷ_v)
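
A minimal NumPy sketch of this loss for a toy set of labeled nodes; the softmax predictions and one-hot labels are illustrative assumptions:

```python
import numpy as np

# Softmax predictions for 3 labeled nodes over 2 classes, and one-hot labels (assumed)
y_hat = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4]])
y = np.array([[1, 0],
              [0, 1],
              [1, 0]])

per_node = -(y * np.log(y_hat)).sum(axis=1)   # L(y_v, y_hat_v) for each labeled node
loss_G = per_node.mean()                      # average over |Labeled(G)| labeled nodes
print(loss_G)
```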
38
Q

Link Prediction

A
39
Q

Node Clustering

A
  • method 1: train the GNN to predict local and global graph properties
  • method 2: random-walk objectives like node2vec and DeepWalk: L = −∑_u ∑_{v ∈ N_R(u)} log( exp(z_u·z_v) / ∑_{n ∈ V} exp(z_u·z_n) ), where N_R(u) is the set of nodes co-visited with u on random walks
  • the softmax denominator can be approximated with Noise Contrastive Estimation for large graphs
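
A minimal NumPy sketch of the random-walk objective above for a single node u and its co-visited neighbors; the embeddings and neighbor set are illustrative assumptions:

```python
import numpy as np

Z = np.random.randn(5, 8)          # learned node embeddings z_v (assumed)
u = 0
walk_neighbors = [1, 3]            # nodes co-visited with u on random walks, N_R(u) (assumed)

scores = Z @ Z[u]                                  # z_u . z_n for every node n in V
log_softmax = scores - np.log(np.exp(scores).sum())

loss_u = -sum(log_softmax[v] for v in walk_neighbors)
print(loss_u)                      # contribution of node u to the total loss L
```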