Neo4j Data Science Algorithms 1 Flashcards
Graph Projection
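The process of deriving a new graph from an existing one by selecting or transforming certain types of nodes and relationships. In Neo4j Graph Data Science, the term also refers to the in-memory graph that is loaded from the database so that algorithms can run on it. For example, a bipartite graph of customers and products can be projected into a unipartite graph of customers by linking two customers whenever they bought the same product.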
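A minimal sketch of a bipartite-to-unipartite projection in plain Python, assuming edges are given as (person, item) pairs. The function name `project_bipartite` and the sample data are hypothetical and are not part of any Neo4j API.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(edges):
    """Link two persons whenever they share at least one item."""
    # Group persons by the item they are connected to.
    members = defaultdict(set)
    for person, item in edges:
        members[item].add(person)

    # Every pair of persons attached to the same item becomes one edge.
    projected = set()
    for people in members.values():
        for a, b in combinations(sorted(people), 2):
            projected.add((a, b))
    return projected


# Hypothetical sample data: who bought what.
edges = [("alice", "book"), ("bob", "book"), ("bob", "lamp"), ("carol", "lamp")]
projection = project_bipartite(edges)
print(sorted(projection))
```

The result contains only person nodes; the item nodes survive only implicitly, as the reason two persons are linked.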
Graph Embeddings
Techniques used to represent nodes, edges, or entire graphs as vectors in a continuous vector space, facilitating machine learning applications. For example, embeddings can be used to predict missing relationships in a graph.
Weakly Connected Components
Maximal sets of nodes in which any two nodes are connected by a path when edge directions are ignored. For example, in a directed graph with edges from A to B and from C to B, the nodes A, B, and C form one weakly connected component even though no directed path leads from A to C.
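One way to sketch this idea is a breadth-first search over an undirected view of the directed edges. The function name and the sample graph below are illustrative assumptions, not a library API.

```python
from collections import defaultdict, deque


def weakly_connected_components(nodes, directed_edges):
    """Return a list of node sets, one per weakly connected component."""
    # Ignore direction by storing every edge both ways.
    neighbors = defaultdict(set)
    for src, dst in directed_edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components


# Hypothetical sample graph: a -> b <- c, and d -> e.
wcc = weakly_connected_components(
    ["a", "b", "c", "d", "e"], [("a", "b"), ("c", "b"), ("d", "e")]
)
print(wcc)
```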
Strongly Connected Components
Maximal sets of nodes in which every node is reachable from every other node by following edge directions. For example, nodes A, B, and C linked in a directed cycle form a strongly connected component, because a directed path exists in both directions between any two of them.
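A small sketch of one standard technique for this, Kosaraju's two-pass algorithm, written recursively for readability (so it is only suitable for small graphs). The function name and the sample graph are assumptions made for illustration.

```python
from collections import defaultdict


def strongly_connected_components(nodes, directed_edges):
    """Kosaraju's algorithm: return a list of node sets, one per component."""
    forward = defaultdict(list)
    backward = defaultdict(list)
    for src, dst in directed_edges:
        forward[src].append(dst)
        backward[dst].append(src)

    # Pass 1: depth-first search on the original graph, recording finish order.
    visited = set()
    finish_order = []

    def visit(node):
        visited.add(node)
        for nxt in forward[node]:
            if nxt not in visited:
                visit(nxt)
        finish_order.append(node)

    for node in nodes:
        if node not in visited:
            visit(node)

    # Pass 2: search the reversed graph in decreasing finish order.
    assigned = set()
    components = []

    def collect(node, component):
        assigned.add(node)
        component.add(node)
        for prev in backward[node]:
            if prev not in assigned:
                collect(prev, component)

    for node in reversed(finish_order):
        if node not in assigned:
            component = set()
            collect(node, component)
            components.append(component)
    return components


# Hypothetical sample graph: a -> b -> c -> a is a cycle, and c -> d leaves it.
scc = strongly_connected_components(
    ["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
)
print(scc)
```

Node d ends up alone because it can be reached from the cycle but cannot reach back into it.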
K-Nearest Neighbors (KNN)
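An algorithm that finds, for each node, the k most similar nodes according to a similarity or distance metric; a node can then be classified by the majority class of those neighbors. In Neo4j Graph Data Science, KNN compares node properties and creates relationships between each node and its k nearest neighbors. For example, in a graph of customer purchases, KNN can be used to predict a customer’s interest in a new product based on similar customers’ behaviors.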
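As a rough sketch of the customer example, the code below ranks labeled customers by cosine similarity to a query customer and takes a majority vote among the k most similar. The feature vectors, labels, and function names are all hypothetical, and cosine similarity is just one possible metric.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict(query, labeled, k):
    """Majority label among the k labeled vectors most similar to query."""
    ranked = sorted(labeled, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical purchase counts per product category, with a known label.
customers = [
    ([5, 0, 1], "interested"),
    ([4, 1, 0], "interested"),
    ([0, 5, 4], "not interested"),
    ([1, 4, 5], "not interested"),
]
prediction = knn_predict([5, 1, 0], customers, k=3)
print(prediction)
```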
Graph Neural Networks (GNNs)
A class of neural networks designed to operate on graph-structured data, learning representations that capture the structure and properties of graphs. For example, GNNs can be used for node classification or link prediction tasks.
Node
A fundamental unit of a graph representing entities or objects, such as a person, place, or thing. For example, in a social network graph, each user is represented as a node.
Edge
A connection between two nodes in a graph, representing a relationship or interaction between them. For example, an edge could represent a friendship between two users in a social network graph.
Pathfinding
Algorithms used to find the shortest or lowest-cost path between nodes in a graph. For example, Dijkstra’s algorithm finds the shortest path between two nodes when all edge weights are non-negative.
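A compact sketch of Dijkstra’s algorithm using a binary heap, assuming the graph is a dictionary that maps each node to its outgoing neighbors and non-negative edge weights. The function name and the sample road network are made up for illustration.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (total cost, path) from source to target, or (None, []) if unreachable."""
    distances = {source: 0}
    previous = {}
    heap = [(0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            break
        if dist > distances.get(node, float("inf")):
            continue  # stale heap entry; a shorter route was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))

    if target not in distances:
        return None, []

    # Walk the predecessor links backwards to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return distances[target], path[::-1]


# Hypothetical weighted directed graph.
roads = {"A": {"B": 4, "C": 1}, "C": {"B": 2, "D": 7}, "B": {"D": 3}, "D": {}}
cost, path = dijkstra(roads, "A", "D")
print(cost, path)
```

Here the direct edge from A to B costs 4, but the detour through C costs only 3, so the cheapest route to D goes through C and then B.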
Centrality
Measures used to identify the most important or influential nodes in a graph. For example, Betweenness Centrality scores a node by how often it lies on the shortest paths between pairs of other nodes, so nodes that bridge otherwise separate parts of the graph score highest.
Community Detection
Algorithms used to identify groups of nodes that are more densely connected to each other than to the rest of the graph. For example, the Louvain method is used to detect communities within a network.
Clustering Coefficient
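A measure of how strongly nodes in a graph tend to cluster together. The local clustering coefficient of a node is the fraction of pairs of its neighbors that are themselves connected. For example, a coefficient of 1 means that all of a node’s neighbors are connected to each other, as in a tightly knit group of friends.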
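The local clustering coefficient of a single node can be sketched as the share of its neighbor pairs that are directly connected. The code assumes an undirected graph stored as a dictionary of neighbor sets; the function name and the sample friendships are hypothetical.

```python
from itertools import combinations


def local_clustering(adjacency, node):
    """Fraction of pairs of node's neighbors that are connected to each other."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors means there are no pairs to check
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
    return 2 * links / (k * (k - 1))


# Hypothetical undirected friendship graph: a, b, c form a triangle; d only knows a.
friends = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
for person in sorted(friends):
    print(person, local_clustering(friends, person))
```

Averaging the local values over all nodes gives one common graph-level clustering coefficient.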
PageRank
An algorithm that ranks nodes by importance, originally developed for ranking web pages. A node’s score depends on how many nodes link to it and on how important those linking nodes are. For example, a page linked from a few highly ranked pages can outrank a page linked from many obscure ones.
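A textbook-style sketch of PageRank by power iteration, assuming a damping factor of 0.85 and a graph given as a dictionary in which every node appears as a key. Scores here are normalized to sum to 1, which may differ from how a particular library scales its output. The function name and the sample graph are illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a dict of PageRank scores that sum to 1."""
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives a baseline share from random jumps.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in links.items():
            if targets:
                # Split this node's rank evenly among the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node without outgoing links spreads its rank over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical link graph: a, b, and d all link to c; c links back to a.
web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this sample, a has only one incoming link but still outranks b and d, because that link comes from the highly ranked c.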
Node Similarity
Algorithms that measure the similarity between nodes based on their properties or relationships. For example, the Jaccard similarity coefficient of two nodes is the number of neighbors they share divided by the total number of distinct neighbors they have.
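The Jaccard coefficient can be sketched in a few lines, treating each node’s neighbors as a set. The function name and the sample purchase data are hypothetical.

```python
def jaccard(neighbors_a, neighbors_b):
    """Size of the intersection divided by size of the union of two neighbor sets."""
    union = neighbors_a | neighbors_b
    if not union:
        return 0.0  # two nodes without neighbors share nothing to compare
    return len(neighbors_a & neighbors_b) / len(union)


# Hypothetical sample data: products each customer is connected to.
purchases = {
    "alice": {"book", "lamp", "pen"},
    "bob": {"book", "pen", "mug"},
}
similarity = jaccard(purchases["alice"], purchases["bob"])
print(similarity)
```

The two customers share 2 of 4 distinct products, so the score is 0.5; identical neighbor sets score 1 and disjoint sets score 0.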