Mining Structure from Graphs Flashcards by Phil Pieper

Name similarity measures for clustering graphs.

Geodesic distances
Structural Similarity

What are geodesic distances?

Distances along curved spaces.
* May be approximated by adding many short straight segments, using the Euclidean distance for each of these.

How do you compute the geodesic distance in a graph?
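
A minimal sketch, assuming the usual graph-theoretic definition: the geodesic distance between two vertices is the number of edges on a shortest path between them. For an unweighted graph this can be computed with breadth-first search. The function name `geodesic_distances` and the example graph are illustrative assumptions, not part of the deck.

```python
from collections import deque

def geodesic_distances(adj, source):
    """Return the hop-count distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:          # first visit gives the shortest distance
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

# Hypothetical example: path a - b - c - d with an extra edge b - d.
example_graph = {
    "a": ["b"],
    "b": ["a", "c", "d"],
    "c": ["b", "d"],
    "d": ["b", "c"],
}
distances_from_a = geodesic_distances(example_graph, "a")
```

Vertices that cannot be reached from the source are simply absent from the returned dictionary.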

What is the eccentricity of a vertex v?

What is a peripheral vertex?

What is the radius of a graph?

What is the diameter of a graph?
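
The four distance notions on these cards can be sketched together, assuming the standard definitions for a connected, unweighted graph: the eccentricity of v is the largest geodesic distance from v to any other vertex, the radius is the smallest eccentricity, the diameter is the largest eccentricity, and a peripheral vertex is one whose eccentricity equals the diameter. All function names and the example graph are illustrative.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop-count distances from source (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def eccentricity(adj, v):
    # Largest geodesic distance from v; assumes the graph is connected.
    return max(bfs_distances(adj, v).values())

def radius(adj):
    return min(eccentricity(adj, v) for v in adj)

def diameter(adj):
    return max(eccentricity(adj, v) for v in adj)

def peripheral_vertices(adj):
    d = diameter(adj)
    return [v for v in adj if eccentricity(adj, v) == d]

# Hypothetical example: the path 1 - 2 - 3 - 4 - 5.
path_graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
```

On the path graph, the middle vertex has the smallest eccentricity and the two end vertices are peripheral.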

Describe graph clustering via sparsest cut.

Partitioning a graph into subgraphs.

How can you decide on a cut in graph clustering via sparsest cut? What is a good method for this?

Measure the size of a cut by the number of edges it crosses.
Min-cut (not a good partition)
Better: sparsity

What is sparsity?

A criterion for choosing the cut in graph clustering.

size of cut = number of edges in the cut set

When is a cut sparsest?

A cut is sparsest if its sparsity is not greater than that of any other cut
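
A small sketch of why sparsity can be a better criterion than the raw cut size, assuming the common definition sparsity = cut size / min(|S|, |T|) for a cut (S, T). The helper names and the example graph (two triangles joined by two edges, plus one pendant vertex) are hypothetical.

```python
def cut_size(edges, part_s):
    """Number of edges with exactly one endpoint in part_s."""
    s = set(part_s)
    return sum(1 for u, v in edges if (u in s) != (v in s))

def sparsity(edges, vertices, part_s):
    """Cut size divided by the size of the smaller side of the cut."""
    s = set(part_s)
    t = set(vertices) - s
    return cut_size(edges, s) / min(len(s), len(t))

# Hypothetical graph: triangles {1,2,3} and {4,5,6}, two bridging edges,
# and a pendant vertex 7 attached to vertex 1.
vertices = [1, 2, 3, 4, 5, 6, 7]
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6),
         (3, 4), (2, 5), (1, 7)]

pendant_cut = sparsity(edges, vertices, {7})            # min-cut, size 1
balanced_cut = sparsity(edges, vertices, {1, 2, 3, 7})  # size 2, balanced sides
```

Here the min-cut only splits off the pendant vertex, while the cut with the lower sparsity separates the two triangles.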

What is modularity in graph clustering?

The modularity of a clustering assesses the quality of the clustering

How do you compute the modularity?
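
A minimal sketch, assuming the widely used definition Q = Σ_i (l_i / |E| − (d_i / (2|E|))²), where l_i is the number of edges inside cluster i and d_i is the sum of the degrees of its vertices. The function name and the example graph are illustrative assumptions.

```python
def modularity(edges, clusters):
    """Modularity of a clustering of an undirected, unweighted graph."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    q = 0.0
    for cluster in clusters:
        members = set(cluster)
        # l_i: edges with both endpoints inside the cluster
        internal = sum(1 for u, v in edges if u in members and v in members)
        # d_i: total degree of the cluster's vertices
        degree_sum = sum(degree.get(v, 0) for v in members)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q

# Hypothetical graph: two triangles joined by a single edge.
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
q_two_clusters = modularity(edges, [{1, 2, 3}, {4, 5, 6}])
q_one_cluster = modularity(edges, [{1, 2, 3, 4, 5, 6}])
```

Splitting the graph into its two triangles scores higher than putting every vertex into a single cluster, which scores zero under this definition.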

Challenges of Finding Good Cuts

What are the challenges for graph clustering?

Which approaches exist for graph clustering?

Methods specifically designed for clustering graphs
-> Search the graph to find well-connected components as clusters
Using generic clustering methods for high-dimensional data
-> Extract a similarity/affinity matrix

How do you determine the structural similarity?
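
A minimal sketch, assuming the definition used by SCAN: the neighborhood Γ(v) contains v itself plus its adjacent vertices, and σ(v, w) = |Γ(v) ∩ Γ(w)| / sqrt(|Γ(v)| · |Γ(w)|). The function names and the example graph are hypothetical.

```python
import math

def closed_neighborhood(adj, v):
    """Gamma(v): the neighbors of v together with v itself."""
    return set(adj[v]) | {v}

def structural_similarity(adj, v, w):
    """Shared neighborhood size, normalized by the geometric mean of sizes."""
    gv = closed_neighborhood(adj, v)
    gw = closed_neighborhood(adj, w)
    return len(gv & gw) / math.sqrt(len(gv) * len(gw))

# Hypothetical graph: triangle a, b, c with an extra vertex d attached to c.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
sim_ab = structural_similarity(graph, "a", "b")
sim_cd = structural_similarity(graph, "c", "d")
```

Vertices a and b have identical neighborhoods, so their similarity is 1, whereas c and d share only part of their neighborhoods.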

SCAN: Structural Clustering Algorithm for Networks

Which kinds of individuals are distinguished?

Individuals in a tight social group, or clique
Individuals who are hubs
Individuals who are outliers

SCAN: Structural Clustering Algorithm for Networks

Describe individuals in a tight social group, or clique.

Individuals in a tight social group, or clique, know many of the same people, regardless of the size of the group

SCAN: Structural Clustering Algorithm for Networks

Describe individuals who are hubs.

Individuals who are hubs know many people in different groups but belong to no single group. E.g., politicians bridge multiple groups

SCAN: Structural Clustering Algorithm for Networks

Describe individuals who are outliers.

Individuals who are outliers reside at the margins of society. E.g., hermits know few people and belong to no group

What does a structural similarity value of 1 mean?

The neighborhoods of the two vertices coincide completely.

How is an epsilon-neighborhood defined?

How is a core vertex (CORE) defined?
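
A sketch of both notions, assuming the SCAN definitions: the ε-neighborhood N_ε(v) contains every vertex w in Γ(v) with σ(v, w) ≥ ε, and v is a core if |N_ε(v)| ≥ μ. Function names, the example graph, and the parameter values ε = 0.8 and μ = 3 are illustrative assumptions.

```python
import math

def closed_neighborhood(adj, v):
    return set(adj[v]) | {v}

def structural_similarity(adj, v, w):
    gv = closed_neighborhood(adj, v)
    gw = closed_neighborhood(adj, w)
    return len(gv & gw) / math.sqrt(len(gv) * len(gw))

def eps_neighborhood(adj, v, eps):
    """Vertices in Gamma(v) whose structural similarity to v is at least eps."""
    return {w for w in closed_neighborhood(adj, v)
            if structural_similarity(adj, v, w) >= eps}

def is_core(adj, v, eps, mu):
    """v is a core if its eps-neighborhood has at least mu members."""
    return len(eps_neighborhood(adj, v, eps)) >= mu

# Hypothetical graph: triangle a, b, c with an extra vertex d attached to c.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
```

With these example values the triangle vertices are cores, while d has only itself in its ε-neighborhood and is not. Raising ε shrinks the ε-neighborhoods, which matches the cards on how ε and μ affect the clusters.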

As epsilon increases, how do the cluster sizes change?

They get smaller.

As mu increases, how do the clusters change?

They have a larger minimum cluster size, hence more outliers (vertices that end up in no cluster).

How is `direct structure reachable` defined?

What is a transitive closure?

The transitive closure of a relation R on a set X is the smallest transitive relation that contains R.
* In graph theory: it can be understood as a data structure that answers reachability questions in a graph. It shows whether one vertex can be reached from another in one or more steps.
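
The definition above can be illustrated with a short sketch that computes the transitive closure of a relation given as a set of pairs, using Warshall's algorithm. The function name and the example relation are illustrative assumptions.

```python
def transitive_closure(relation):
    """Smallest transitive relation containing `relation` (a set of pairs)."""
    closure = set(relation)
    nodes = {x for pair in closure for x in pair}
    for k in nodes:                      # allow k as an intermediate step
        for i in nodes:
            if (i, k) in closure:
                for j in nodes:
                    if (k, j) in closure:
                        closure.add((i, j))
    return closure

# Hypothetical directed chain a -> b -> c -> d.
edges = {("a", "b"), ("b", "c"), ("c", "d")}
reachable = transitive_closure(edges)
```

A pair (x, y) is in the result exactly when y can be reached from x in one or more steps.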

How is `structure reachable` defined?

How is `structure connected` defined?

How is a `structure-connected cluster C` defined?

What is Chinese Whispers?

Describe the Chinese Whispers algorithm.

## Footnote selection by majority vote
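
A rough sketch of the majority-vote update, assuming the commonly described form of Chinese Whispers: every vertex starts in its own class, vertices are visited repeatedly in random order, and each vertex adopts the class with the highest total edge weight among its neighbors, with ties broken at random. The function name, the fixed iteration count, and the `seed` parameter are assumptions of this sketch, not part of the deck.

```python
import random
from collections import defaultdict

def chinese_whispers(adj, iterations=50, seed=0):
    """adj maps each vertex to a dict {neighbor: edge weight}."""
    rng = random.Random(seed)
    label = {v: v for v in adj}          # every vertex starts in its own class
    nodes = list(adj)
    for _ in range(iterations):
        rng.shuffle(nodes)               # visit vertices in random order
        for v in nodes:
            votes = defaultdict(float)
            for w, weight in adj[v].items():
                votes[label[w]] += weight
            if not votes:                # isolated vertex keeps its class
                continue
            best = max(votes.values())
            winners = sorted(c for c, s in votes.items() if s == best)
            label[v] = rng.choice(winners)   # ties are broken at random
    return label

# Hypothetical graph: two disconnected triangles with unit weights.
graph = {
    "a": {"b": 1.0, "c": 1.0}, "b": {"a": 1.0, "c": 1.0}, "c": {"a": 1.0, "b": 1.0},
    "x": {"y": 1.0, "z": 1.0}, "y": {"x": 1.0, "z": 1.0}, "z": {"x": 1.0, "y": 1.0},
}
labels = chinese_whispers(graph)
```

Because every vertex is revisited in each pass and ties are random, the procedure is neither single-pass nor deterministic unless the random seed is fixed.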

Define the energy function for Chinese Whispers. What happens when the class of a vertex is updated?

What does the global minimum of the energy function lead to in Chinese Whispers?

All vertices end up in the same class.

How can you prevent Chinese Whispers from assigning all vertices to the same class?

With negative weights.

What is spectral clustering?

What is an affinity matrix? What properties does it have?

Describe the idea of spectral clustering in terms of affinity matrices.

How do you compute the affinity matrix for given data points?

How do you normalize an affinity matrix?

How do you compute the matrix D used to normalize an affinity matrix?
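
A sketch of the first three steps, assuming the Ng-Jordan-Weiss formulation of spectral clustering: A_ij = exp(−‖s_i − s_j‖² / (2σ²)) with A_ii = 0, D is the diagonal matrix whose entry D_ii is the sum of row i of A, and L = D^(−1/2) A D^(−1/2). The function names and the example points are hypothetical, and plain Python lists are used instead of a matrix library.

```python
import math

def affinity_matrix(points, sigma):
    """Gaussian affinities between all pairs of points, zero on the diagonal."""
    n = len(points)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq_dist = sum((p - q) ** 2 for p, q in zip(points[i], points[j]))
                a[i][j] = math.exp(-sq_dist / (2 * sigma ** 2))
    return a

def degree_diagonal(a):
    """Diagonal entries of D: the row sums of the affinity matrix."""
    return [sum(row) for row in a]

def normalize_affinity(a):
    """L = D^(-1/2) A D^(-1/2), computed entry by entry."""
    d = degree_diagonal(a)
    n = len(a)
    return [[a[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical data: two nearby points and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
A = affinity_matrix(points, sigma=1.0)
L = normalize_affinity(A)
```

The affinity matrix is symmetric with non-negative entries. A larger sigma makes distant points look more similar, which is why it acts as a neighborhood parameter.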

# Spectral Clustering Algorithm

What do you do next with the matrix L, the normalized affinity matrix?

# Spectral Clustering Algorithm

After determining the matrix X with the k eigenvectors as columns, what do you have to do next?

# Spectral Clustering Algorithm

How do you proceed after renormalizing?

You apply, for example, k-means to form clusters.

# Spectral Clustering: Choice of σ

What does a higher sigma lead to?

Data points that are farther apart are still assigned to the same class.

# Spectral Clustering

Is the Davies-Bouldin index a good measure for these kinds of clusters?

No! It compares the distances of points to their cluster mean with the distances between means (in this example, the means are all the same!)

# Spectral Clustering: Choice of k

Which heuristic is used to find the optimal k?

What does the following mean?

How can a semantic network be represented for knowledge representation and programming languages?

By a set of triplets.

What is the matching rule for a query Q that matches a database D?

How do classification and information retrieval work in semantic networks?

By query matching.
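
A rough sketch of query matching over triplets. It assumes one plausible matching rule, namely that a query matches a database if its variables can be bound so that every query triplet occurs in the database. The `?` variable convention, the function name `match`, and the example facts are all hypothetical.

```python
def match(query, database, binding=None):
    """Yield variable bindings under which every query triplet is in database."""
    binding = binding or {}
    if not query:
        yield dict(binding)
        return
    first, rest = query[0], query[1:]
    for fact in database:
        candidate = dict(binding)
        ok = True
        for q, f in zip(first, fact):
            if q.startswith("?"):
                # A variable must be bound consistently across triplets.
                if candidate.setdefault(q, f) != f:
                    ok = False
                    break
            elif q != f:
                ok = False
                break
        if ok:
            yield from match(rest, database, candidate)

# Hypothetical semantic network stored as (subject, relation, object) triplets.
database = [
    ("Tweety", "INSTANCE", "Canary"),
    ("Canary", "ISA", "Bird"),
    ("Bird", "can", "fly"),
]
query = [("?x", "INSTANCE", "?c"), ("?c", "ISA", "Bird")]
answers = list(match(query, database))
```

Retrieval then amounts to reading off the bindings, and classification to asking which concept a given individual can be bound to.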

What are probabilistic graphical models?

Modelling of observations and their relationships

Describe Bayes' theorem.
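
Bayes' theorem states P(H | E) = P(E | H) · P(H) / P(E). A small numeric sketch follows, in which P(E) is expanded with the law of total probability. The function name, parameter names, and the numbers are made up for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) from P(H), P(E | H) and P(E | not H)."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: a rare cause (1%), a test that detects it in 90%
# of cases and fires falsely in 5% of cases.
p_cause_given_symptom = posterior(prior=0.01, likelihood=0.9,
                                  false_positive_rate=0.05)
```

With these example numbers the posterior is 2/13, roughly 0.15, so the cause remains unlikely even after the symptom is observed.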

What is a Bayesian belief network?

Describe “conditional independence” in a Bayesian belief network.

Describe “explaining away” in a Bayesian belief network.

What do Bayesian networks solve with respect to computational effort?

Name the types of reasoning in Bayesian networks.

* Diagnostic (from symptoms to causes)
* Predictive (reasoning from new information to new beliefs)
* Inter-causal (explaining away)
* Conditional independence

How are Bayesian networks constructed?

How do you train Bayesian networks?

What does the maximum likelihood principle state?

It favors Bayesian networks that maximize the probability of observing the given data set.

What is the Markovian assumption?

Each variable becomes independent of its non-effects once its direct causes are known
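
Under this assumption the joint distribution factorizes into one conditional probability per variable given its parents, P(x_1, ..., x_n) = Π_i P(x_i | parents(x_i)). The following sketch evaluates that product for a tiny network. The structure (rain and sprinkler both influence wet grass) and all probabilities are hypothetical.

```python
import itertools

# Hypothetical network: rain -> wet <- sprinkler, all variables boolean.
parents = {"rain": (), "sprinkler": (), "wet": ("rain", "sprinkler")}

# cpt[var][parent values] = P(var is True | parent values); made-up numbers.
cpt = {
    "rain": {(): 0.2},
    "sprinkler": {(): 0.1},
    "wet": {(True, True): 0.99, (True, False): 0.9,
            (False, True): 0.8, (False, False): 0.0},
}

def joint_probability(assignment):
    """Product of P(var | parents) over all variables."""
    p = 1.0
    for var, pa in parents.items():
        p_true = cpt[var][tuple(assignment[x] for x in pa)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p

p_example = joint_probability({"rain": True, "sprinkler": False, "wet": True})

# Sanity check: the factorized joint sums to 1 over all 8 assignments.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in itertools.product([True, False], repeat=len(parents))
)
```

In this example the network stores 6 numbers, whereas a full joint table over three boolean variables needs 7 independent entries. The gap grows quickly as more variables with few parents are added.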

The Chinese Whispers algorithm visits every node exactly once. Is that correct?

No.

Which of the following statements about distance measures in graphs are correct?
1. The diameter of a graph is the maximum eccentricity of all vertices in the graph.
2. The cut set contains all the sets which are the result of a cut.
3. The eccentricity of a vertex v is the largest geodesic distance between v and any other vertex in the graph.
4. The radius of a graph is the largest distance between any pair of vertices in the graph.

1 and 3

Which of the following statements about spectral clustering are correct?
1. Spectral clustering uses k-means in eigenvector space to cluster the data.
2. The parameter 𝜎 is a neighborhood parameter for some given data point.
3. Only convex clusters can be detected.
4. The number of clusters is determined by the algorithm.

1 and 2

Which of the following statements about semantic networks are correct?
1. Edges in a semantic network describe concepts and individuals.
2. Domain-specific rules have to be defined by axiomatization.
3. ISA and INSTANCE are well-defined in all contexts.
4. Relations between relations can be expressed with semantic networks.

2 and 3

Which of the statements about the SCAN algorithm are correct?
1. Smaller 𝜖 leads to larger clusters.
2. Some vertices may not be assigned to any cluster.
3. The number of clusters has to be chosen at the initialization stage.
4. A node is part of its own neighborhood.

1, 2 and 4

Which of the statements about the Chinese Whispers algorithm are correct?
1. The algorithm has no optimization parameters.
2. The algorithm is deterministic.
3. The algorithm has linear runtime complexity.
4. The algorithm converges to a global minimum.

1 and 3

Mining Structure from Graphs Flashcards

(69 cards)