Neo4j data science algorithms2 Flashcards
Graph Attention Networks (GATs)
A type of Graph Neural Network that uses attention mechanisms to weigh the importance of neighboring nodes when aggregating information. For example, GATs can prioritize influential neighbors when predicting a node’s label.
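The attention step described above can be sketched in plain Python. This is a minimal single-head illustration, assuming the node features have already been multiplied by the layer's weight matrix; the function names, the attention vector `a`, and the toy features are hypothetical and chosen only for the example.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_coefficients(h, neighbors, a, i):
    """Softmax-normalized attention of node i over its neighbors.

    h: dict node -> feature vector (assumed already linearly transformed)
    a: attention vector applied to the concatenation [h_i || h_j]
    """
    scores = {}
    for j in neighbors[i]:
        concat = h[i] + h[j]  # list concatenation
        scores[j] = leaky_relu(sum(w * x for w, x in zip(a, concat)))
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}

def aggregate(h, alpha):
    """Attention-weighted sum of neighbor features."""
    dim = len(next(iter(h.values())))
    return [sum(alpha[j] * h[j][d] for j in alpha) for d in range(dim)]

# Toy data (hypothetical): node 0 attends over neighbors 1 and 2.
h = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [2.0, 0.0]}
neighbors = {0: [1, 2]}
a = [0.0, 0.0, 1.0, 0.0]  # scores a neighbor by its first feature
alpha = attention_coefficients(h, neighbors, a, 0)
new_h0 = aggregate(h, alpha)
```

Here neighbor 2 receives the larger weight because its score is higher. Practical GAT layers usually also include the node itself among its neighbors and use several attention heads.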
Label Propagation Algorithm (LPA)
An iterative community detection algorithm in which every node starts with its own label and repeatedly adopts the label most common among its neighbors, until labels stop changing. For example, LPA can identify clusters of similar users in a social network.
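A minimal sketch of the propagation loop, assuming an undirected graph stored as an adjacency dict. The deterministic tie-breaking rule (take the largest tied label) is an assumption made to keep the example reproducible; implementations commonly break ties at random.

```python
from collections import Counter

def label_propagation(adj, max_iters=20):
    """Asynchronous label propagation on an undirected graph."""
    labels = {v: v for v in adj}  # every node starts in its own community
    for _ in range(max_iters):
        changed = False
        for v in sorted(adj):
            if not adj[v]:
                continue
            counts = Counter(labels[w] for w in adj[v])
            top = max(counts.values())
            # Assumption: ties are broken by taking the largest label.
            new_label = max(l for l, c in counts.items() if c == top)
            if new_label != labels[v]:
                labels[v] = new_label
                changed = True
        if not changed:
            break
    return labels

# Two triangles joined by the bridge edge 2-3.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
communities = label_propagation(graph)
```

On this toy graph each triangle ends up sharing one label, and the two triangles receive different labels.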
Approximate Nearest Neighbors (ANN)
Techniques for finding points that are close to a query point, such as a node embedding, at much lower computational cost than an exact search, in exchange for a small loss of accuracy. For example, ANN algorithms can be used in recommendation systems to quickly find similar items.
Eigenvector Centrality
A measure of node importance in a graph based on the principle that connections to high-scoring nodes contribute more to a node’s score. For example, in a web graph a page with high eigenvector centrality is one linked to other important pages; PageRank is a well-known variant of this idea.
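The scores can be approximated by power iteration. The sketch below assumes an undirected, connected graph given as an adjacency dict. It adds each node's own score at every step, which iterates with A + I instead of A; this keeps the same eigenvectors while avoiding oscillation on bipartite graphs.

```python
import math

def eigenvector_centrality(adj, iterations=100):
    """Power iteration for the dominant eigenvector of the adjacency matrix."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Own score plus neighbor scores: one multiplication by (A + I).
        nxt = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(s * s for s in nxt.values()))
        x = {v: s / norm for v, s in nxt.items()}
    return x

# A triangle 0-1-2 with a pendant node 3 attached to node 0.
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
scores = eigenvector_centrality(graph)
```

Node 0 scores highest because it is connected to every other node, and the pendant node 3 scores lowest.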
Graph Kernel
A method for measuring the similarity between graphs by comparing their substructures, enabling machine learning tasks like classification. For example, graph kernels can be used to classify molecular structures in bioinformatics.
Spectral Clustering
A technique that embeds nodes using the eigenvectors associated with the smallest eigenvalues of a graph’s Laplacian matrix and then clusters the embedded points (for example, with k-means), effective for detecting complex community structures. For example, spectral clustering can identify functional modules in biological networks.
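For a two-way split, one common simplification is to cut the graph by the sign of the Fiedler vector, the Laplacian eigenvector with the second-smallest eigenvalue. The sketch below assumes a connected, undirected graph and approximates that vector by power iteration on cI - L while projecting out the constant vector. It is an illustration, not a general k-way spectral clustering implementation.

```python
import math

def fiedler_vector(adj, iterations=500):
    """Approximate the Fiedler vector of the graph Laplacian L = D - A."""
    nodes = sorted(adj)
    n = len(nodes)
    c = 2 * max(len(adj[v]) for v in nodes)  # upper bound on L's eigenvalues
    x = {v: float(i) for i, v in enumerate(nodes)}
    for _ in range(iterations):
        mean = sum(x.values()) / n
        x = {v: x[v] - mean for v in nodes}  # remove the constant component
        # y = (cI - L) x, where (L x)_v = deg(v) * x_v - sum of neighbor values
        y = {v: c * x[v] - (len(adj[v]) * x[v] - sum(x[w] for w in adj[v]))
             for v in nodes}
        norm = math.sqrt(sum(val * val for val in y.values()))
        x = {v: y[v] / norm for v in nodes}
    return x

# Two triangles joined by the bridge edge 2-3.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
fiedler = fiedler_vector(graph)
side = {v: fiedler[v] > 0 for v in graph}  # two clusters by sign
```

On this toy graph the sign split separates the two triangles.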
Graph Convolutional Networks (GCNs)
A type of neural network that applies convolutional operations to graph data, capturing local neighborhood information. For example, GCNs are used for node classification tasks in citation networks.
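One widely used form of the graph convolution is the propagation rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), where D is the degree matrix of A + I. The sketch below implements a single such layer with plain lists; the toy weights are hypothetical and nothing is trained.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def gcn_layer(adj, H, W):
    """One GCN layer: normalize A + I symmetrically, aggregate, apply ReLU."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]  # add self-loops
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, H), W)
    return [[max(0.0, v) for v in row] for row in out]

# Two connected nodes with one feature each and an identity weight matrix.
adj = [[0, 1], [1, 0]]
H = [[1.0], [3.0]]
W = [[1.0]]
H_next = gcn_layer(adj, H, W)
```

With these inputs each node's new feature is the average of its own value and its neighbor's value.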
HITS Algorithm (Hyperlink-Induced Topic Search)
An algorithm that assigns every node two scores, a hub score and an authority score, which reinforce each other: good hubs point to good authorities, and good authorities are pointed to by good hubs. For example, in a web graph, hubs are pages that link to many authorities, and authorities are pages linked to by many hubs.
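The mutual reinforcement can be sketched as an alternating update over a directed edge list. The node names and the fixed iteration count below are illustrative assumptions.

```python
import math

def normalize(scores):
    norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
    return {n: s / norm for n, s in scores.items()}

def hits(edges, iterations=50):
    """Return (hub, authority) scores for a directed graph given as (u, v) edges."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of pages linking in.
        auth = {n: 0.0 for n in nodes}
        for u, v in edges:
            auth[v] += hub[u]
        auth = normalize(auth)
        # Hub score: sum of authority scores of pages linked to.
        hub = {n: 0.0 for n in nodes}
        for u, v in edges:
            hub[u] += auth[v]
        hub = normalize(hub)
    return hub, auth

# Hypothetical web graph: h1 and h2 link out, a1 and a2 are linked to.
edges = [("h1", "a1"), ("h1", "a2"), ("h2", "a1")]
hub, auth = hits(edges)
```

Here `a1` gets the highest authority score because both hubs link to it, and `h1` gets the highest hub score because it links to both authorities.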
GraphSAGE (Graph Sample and Aggregate)
An inductive framework that generates node embeddings by sampling and aggregating features from a node’s local neighborhood. For example, GraphSAGE can be used for dynamic graphs where new nodes are frequently added.
DeepWalk
A technique that learns latent representations of nodes in a graph by performing random walks and treating them as sentences for training a skip-gram model. For example, DeepWalk can generate embeddings for nodes in social networks.
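The walk-generation half of DeepWalk can be sketched as follows. The code only produces the "sentences" of node ids; training the skip-gram model on them is left out. The parameter names and the seed are assumptions for the example.

```python
import random

def random_walks(adj, num_walks, walk_length, seed=0):
    """Generate uniform random walks, num_walks starting from every node."""
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                nbrs = adj[walk[-1]]
                if not nbrs:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
walks = random_walks(graph, num_walks=2, walk_length=5)
```

Each walk would then be passed to a skip-gram model as if it were a sentence whose words are node ids.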
Transductive Learning
A learning setting in which the nodes to be labeled are already present, unlabeled, in the graph at training time, so the model leverages both labeled and unlabeled data and is not expected to generalize to unseen nodes (in contrast to inductive learning). For example, transductive learning can be applied to semi-supervised classification in graphs.
Node2Vec
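An algorithm that generates node embeddings by running biased random walks and training a skip-gram model on them. A return parameter p and an in-out parameter q steer the walks between breadth-first and depth-first exploration. For example, Node2Vec can be used to create embeddings that preserve both local and global graph structures.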
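A minimal sketch of one biased (second-order) walk, assuming an unweighted, undirected graph. The bias uses a return parameter `p` and an in-out parameter `q`: stepping back to the previous node is weighted 1/p, moving to a node that is also a neighbor of the previous node is weighted 1, and moving further away is weighted 1/q. The embedding step (skip-gram training) is omitted.

```python
import random

def biased_walk(adj, start, walk_length, p, q, rng):
    """One node2vec-style walk of at most walk_length nodes."""
    walk = [start]
    while len(walk) < walk_length:
        cur = walk[-1]
        nbrs = adj[cur]
        if not nbrs:
            break
        if len(walk) == 1:
            walk.append(rng.choice(nbrs))  # first step is unbiased
            continue
        prev = walk[-2]
        weights = []
        for x in nbrs:
            if x == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif x in adj[prev]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move outward
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
rng = random.Random(42)
walk = biased_walk(graph, start=0, walk_length=8, p=4.0, q=0.5, rng=rng)
```

A large `p` discourages backtracking and a small `q` pushes the walk outward, which makes it behave more like a depth-first exploration.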
Community Detection with Louvain Method
An algorithm for identifying communities in large networks by maximizing modularity, a measure of the strength of division of a network into modules. For example, the Louvain method can detect meaningful groups in social or biological networks.
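The quantity Louvain maximizes can be computed directly. The sketch below evaluates modularity for a given partition of an unweighted, undirected graph as the sum over communities of L_c / m - (d_c / 2m)^2, where L_c is the number of edges inside community c, d_c is the total degree of its nodes, and m is the number of edges. It is only the objective function, not the Louvain optimization itself.

```python
def modularity(adj, community):
    """Modularity of a partition; community maps node -> community id."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    internal = {}  # twice the number of intra-community edges
    degree = {}    # total degree per community
    for v, nbrs in adj.items():
        c = community[v]
        degree[c] = degree.get(c, 0) + len(nbrs)
        for w in nbrs:
            if community[w] == c:
                internal[c] = internal.get(c, 0) + 1
    return sum(internal.get(c, 0) / (2 * m) - (degree[c] / (2 * m)) ** 2
               for c in degree)

# Two triangles joined by the bridge edge 2-3.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
by_triangle = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {v: "all" for v in graph}
q_split = modularity(graph, by_triangle)
q_single = modularity(graph, one_group)
```

Splitting the graph into its two triangles gives a positive modularity of 5/14, while placing every node in a single community gives 0.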
Chebyshev Spectral Filters
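Filters used in spectral graph convolutional networks that approximate a spectral graph convolution with a truncated Chebyshev polynomial of the rescaled graph Laplacian, which avoids computing an eigendecomposition and keeps each filter localized to a K-hop neighborhood. For example, these filters can be used to improve the scalability of GCNs for large graphs.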
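The efficiency comes from the Chebyshev recurrence T_0(x) = 1, T_1(x) = x, T_k(x) = 2x T_(k-1)(x) - T_(k-2)(x), which lets the filter be applied with repeated matrix-vector products only. The sketch below assumes the Laplacian has already been rescaled so its eigenvalues lie in [-1, 1]; the coefficients `theta` would normally be learned and are hypothetical here.

```python
def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def chebyshev_filter(L_scaled, x, theta):
    """Compute sum_k theta[k] * T_k(L_scaled) x using the recurrence."""
    t_prev = list(x)                       # T_0(L) x = x
    out = [theta[0] * v for v in t_prev]
    if len(theta) == 1:
        return out
    t_curr = matvec(L_scaled, x)           # T_1(L) x = L x
    out = [o + theta[1] * v for o, v in zip(out, t_curr)]
    for k in range(2, len(theta)):
        # T_k(L) x = 2 L T_(k-1)(L) x - T_(k-2)(L) x
        t_next = [2 * a - b for a, b in zip(matvec(L_scaled, t_curr), t_prev)]
        out = [o + theta[k] * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out

# Rescaled normalized Laplacian of a two-node graph with a single edge.
L_scaled = [[0.0, -1.0], [-1.0, 0.0]]
signal = [1.0, 0.0]
filtered = chebyshev_filter(L_scaled, signal, theta=[1.0, 1.0, 1.0])
```

With K coefficients the filter only mixes information from nodes at most K - 1 hops apart.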
Edge Betweenness Centrality
A measure of an edge’s importance based on the number of shortest paths passing through it, used to identify critical links in a network. For example, edges with high betweenness centrality can indicate bottlenecks in communication networks.
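Edge betweenness can be computed with a Brandes-style accumulation: run a breadth-first search from every node, count shortest paths, then push path dependencies back along the search tree. The sketch below assumes an unweighted, undirected graph and returns unnormalized counts.

```python
from collections import deque

def edge_betweenness(adj):
    """Number of shortest paths through each edge, keyed by frozenset({u, v})."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0.0)  # number of shortest paths from s
        sigma[s] = 1.0
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):  # farthest nodes first
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += share
                delta[v] += share
    # Every unordered node pair was counted from both endpoints.
    return {e: b / 2 for e, b in bet.items()}

# Two triangles joined by the bridge edge 2-3.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
scores = edge_betweenness(graph)
bridge = max(scores, key=scores.get)
```

The bridge carries all nine shortest paths between the two triangles, so it has the highest score and is the bottleneck of this toy network.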