Topological Data Analysis Flashcards
What is the definition of a topological space and what are the conditions for a collection of subsets to be a topology on a set?
A topological space is a set endowed with a structure, called a topology, which allows defining continuous deformation of subspaces, and, more generally, all kinds of continuity. A topology on a set is a collection of subsets, called open sets, that satisfies the following three conditions: 1) The empty set and the whole set are in the collection. 2) Any union of sets in the collection is also in the collection. 3) Any finite intersection of sets in the collection is also in the collection.
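For a finite set, the three conditions above can be checked mechanically. The following is a minimal sketch; the function name is_topology is an illustrative choice, not a standard API, and the pairwise check is only sufficient because the collection is finite.

```python
from itertools import combinations

def is_topology(X, collection):
    """Check the three topology axioms for a collection of subsets of a finite set X."""
    X = frozenset(X)
    opens = {frozenset(s) for s in collection}
    # Every member of the collection must be a subset of X.
    if any(not s <= X for s in opens):
        return False
    # 1) The empty set and the whole set must be in the collection.
    if frozenset() not in opens or X not in opens:
        return False
    # 2) and 3) For a finite collection, closure under pairwise unions and
    # pairwise intersections implies closure under all unions and all
    # finite intersections.
    for a, b in combinations(opens, 2):
        if (a | b) not in opens or (a & b) not in opens:
            return False
    return True

X = {1, 2, 3}
print(is_topology(X, [set(), {1}, {1, 2}, X]))  # True
print(is_topology(X, [set(), {1}, {2}, X]))     # False: the union {1, 2} is missing
```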
What is the difference between homotopy and homeomorphism and how do they relate to the concept of topological equivalence?
Homotopy and homeomorphism both lead to a notion of equivalence between topological spaces. A homeomorphism is a continuous bijection between two topological spaces whose inverse is also continuous. Two spaces with a homeomorphism between them are called homeomorphic, and from a topological viewpoint they are essentially identical. A homotopy is a continuous deformation of one continuous map into another. Two spaces X and Y are homotopy equivalent if there are continuous maps f: X → Y and g: Y → X such that g ∘ f is homotopic to the identity on X and f ∘ g is homotopic to the identity on Y. Homotopy equivalence is the coarser relation: every homeomorphism is a homotopy equivalence, but not conversely. For example, a solid disk is homotopy equivalent to a point but not homeomorphic to it.
What is a simplicial complex and what are the two ways of representing it (abstract and geometric)?
A simplicial complex is a collection of points, line segments, triangles, and their n-dimensional counterparts (called simplices). It must be closed under taking faces: every face of a simplex in the complex is also in the complex. It can be represented in two ways: 1) Geometrically, where each simplex is a true geometric object (a set of points in space) and any two simplices intersect, if at all, in a common face. 2) Abstractly, where each simplex is just a finite set of vertices, every nonempty subset of a simplex is again a simplex, and no geometric realization is specified.
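The closure condition on an abstract simplicial complex is easy to test directly. This is a small sketch in which simplices are given as tuples of vertex labels; the function name is an illustrative choice.

```python
from itertools import combinations

def is_simplicial_complex(simplices):
    """Return True if every nonempty proper face of each simplex is also present."""
    complex_ = {frozenset(s) for s in simplices}
    for simplex in complex_:
        for k in range(1, len(simplex)):
            for face in combinations(simplex, k):
                if frozenset(face) not in complex_:
                    return False
    return True

filled_triangle = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
missing_edge = [(0,), (1,), (2,), (0, 1), (0, 2), (0, 1, 2)]
print(is_simplicial_complex(filled_triangle))  # True
print(is_simplicial_complex(missing_edge))     # False: the face (1, 2) is absent
```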
What is a filtration and how does it help us study the evolution of topological features over different scales?
A filtration is a nested sequence of simplicial complexes, each one contained in the next, ending in a final complex. Each simplex has a filtration value, the point in the sequence at which it first appears. Filtrations allow us to study the ‘birth’ and ‘death’ of topological features as simplices are added in order of their filtration values: a feature is born when it first appears and dies when it is filled in or merges with an older feature. This is the basis of persistent homology, a method in topological data analysis that can quantify the multi-scale shape of a data set.
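One way to store a filtration is as a map from each simplex to its filtration value. The sketch below assumes that representation (the names and the example values are illustrative) and checks the nesting property: a simplex may not appear before any of its faces.

```python
from itertools import combinations

def is_valid_filtration(values):
    """values maps each simplex (a frozenset of vertices) to its filtration value."""
    for simplex, t in values.items():
        for k in range(1, len(simplex)):
            for face in combinations(simplex, k):
                face = frozenset(face)
                # Each face must exist and appear no later than the simplex.
                if face not in values or values[face] > t:
                    return False
    return True

def sublevel_complex(values, t):
    """The complex at scale t: all simplices with filtration value at most t."""
    return {s for s, v in values.items() if v <= t}

values = {
    frozenset("a"): 0.0, frozenset("b"): 0.0, frozenset("c"): 0.0,
    frozenset("ab"): 1.0, frozenset("bc"): 2.0, frozenset("ac"): 3.0,
    frozenset("abc"): 4.0,
}
print(is_valid_filtration(values))                               # True
print(len(sublevel_complex(values, 2.0)))                        # 5
print(sublevel_complex(values, 1.0) <= sublevel_complex(values, 2.0))  # True: nested
```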
What is persistent homology and how can we visualize it using barcodes and persistence diagrams?
Persistent homology is a method for computing topological features of a space at different spatial resolutions. More persistent features are detected over a wide range of spatial scales and are deemed more likely to be features of the underlying space rather than artifacts of sampling, noise, or particular choice of parameters. Barcodes and persistence diagrams are two ways of visualizing these persistent topological features. A barcode is a collection of intervals on the real line, each corresponding to a topological feature, while a persistence diagram is a scatter plot on the plane, where the x-coordinate of a point corresponds to the birth time of a feature, and the y-coordinate to its death time.
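As a small illustration, the 0-dimensional part of persistent homology (connected components) can be computed with a union-find structure. The sketch below assumes points on the real line and a filtration in which an edge appears once the scale reaches the distance between its endpoints; real libraries also handle higher dimensions, which this does not.

```python
import math
from itertools import combinations

def h0_barcode(points):
    """0-dimensional persistence bars (birth, death) for points on the real line."""
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))

    def find(i):
        # Find the component representative, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the filtration in order of increasing length.
    edges = sorted((abs(points[i] - points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    bars = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            bars.append((0.0, dist))  # one component dies when two merge
    bars.append((0.0, math.inf))      # the last component never dies
    return bars

print(h0_barcode([0.0, 1.0, 3.0]))  # [(0.0, 1.0), (0.0, 2.0), (0.0, inf)]
```

Read as intervals, the pairs form the barcode; read as (x, y) coordinates, the same pairs are the points of the persistence diagram.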
What are the most common distance metrics for comparing persistence diagrams and how are they defined?
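The most common distances for comparing persistence diagrams are the bottleneck distance and the Wasserstein distance. Both are defined over bijections (matchings) between the two diagrams, where each diagram is augmented with the points of the diagonal so that an unmatched feature can be paired with the diagonal. The bottleneck distance is the infimum, over all such bijections, of the largest distance between matched points. The p-Wasserstein distance is the infimum, over all such bijections, of the p-th root of the sum of the p-th powers of the distances between matched points. Because it sums over all matched pairs instead of keeping only the worst one, the Wasserstein distance is more sensitive to many small differences; the bottleneck distance is its limit as p → ∞.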
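For very small diagrams, both distances can be computed by brute force over all matchings. The sketch below makes several assumptions that vary between references: it uses the L∞ distance between matched points, charges half the lifetime for matching a point to the diagonal, and handles only finite (birth, death) pairs. It is exponential in diagram size, so it serves as an illustration only.

```python
from itertools import permutations

def _cost(p, q):
    """L-infinity matching cost; None stands for 'some point on the diagonal'."""
    if p is None and q is None:
        return 0.0
    if p is None:
        p, q = q, p
    if q is None:
        return (p[1] - p[0]) / 2.0  # distance from p to the diagonal
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def diagram_distances(dgm1, dgm2, p=2):
    """Return (bottleneck, p-Wasserstein) between two small finite diagrams."""
    # Pad each side with diagonal slots so every point can go unmatched.
    left = list(dgm1) + [None] * len(dgm2)
    right = list(dgm2) + [None] * len(dgm1)
    bottleneck = wasserstein = float("inf")
    for perm in permutations(range(len(right))):
        costs = [_cost(left[i], right[j]) for i, j in enumerate(perm)]
        bottleneck = min(bottleneck, max(costs, default=0.0))
        wasserstein = min(wasserstein, sum(c ** p for c in costs) ** (1.0 / p))
    return bottleneck, wasserstein

print(diagram_distances([(0, 4)], [(0, 3)]))
print(diagram_distances([(0, 4), (1, 1.5)], [(0, 4)]))
```

In the second call the short-lived point (1, 1.5) is matched to the diagonal, so it contributes only half its lifetime to either distance.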
What is a Vietoris–Rips complex and how can we construct it from a metric space?
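The Vietoris–Rips complex is a simplicial complex that can be built from any metric space and a chosen scale parameter ε: it contains a simplex for every finite set of points whose diameter is less than ε, that is, whose points are pairwise less than ε apart (some authors use "at most ε" instead). It encodes topological information about the metric space at scale ε and is commonly used in topological data analysis; letting ε grow yields a filtration.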
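The construction can be sketched directly from the definition. This version assumes points in Euclidean space, uses the non-strict convention (pairwise distance at most eps; change <= to < for the strict one), and caps the simplex dimension with an illustrative max_dim parameter because the number of simplices grows quickly.

```python
import math
from itertools import combinations

def vietoris_rips(points, eps, max_dim=2):
    """Simplices (as index tuples) of the Vietoris-Rips complex up to max_dim."""
    n = len(points)

    def close(i, j):
        return math.dist(points[i], points[j]) <= eps

    simplices = [(i,) for i in range(n)]
    for k in range(2, max_dim + 2):  # k vertices span a (k - 1)-simplex
        for combo in combinations(range(n), k):
            # A simplex is included when all its vertices are pairwise close.
            if all(close(i, j) for i, j in combinations(combo, 2)):
                simplices.append(combo)
    return simplices

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(len(vietoris_rips(square, 1.0)))  # 8: four vertices and four sides
print(len(vietoris_rips(square, 1.5)))  # 14: diagonals and four triangles appear
```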
What is a group and what are the properties of a group operation?
A group is a set equipped with an operation * that combines any two of its elements to form a third, in such a way that four conditions called the group axioms are satisfied: 1) Closure: for all a, b in the group, the ‘product’ a * b is also in the group. 2) Associativity: for all a, b and c in the group, (a * b) * c = a * (b * c). 3) Identity element: there is an element e in the group such that e * a = a * e = a for every element a in the group. 4) Inverse element: for each a in the group, there exists an element b in the group, commonly denoted a⁻¹ (or 1/a in multiplicative notation), such that a * b = b * a = e.
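For a finite set, the four axioms can be verified exhaustively. The following is a minimal sketch; is_group is an illustrative name, and the cubic-time associativity check is only practical for small sets.

```python
from itertools import product

def is_group(elements, op):
    """Exhaustively check the four group axioms on a finite set."""
    elements = list(elements)
    elem_set = set(elements)
    # 1) Closure
    if any(op(a, b) not in elem_set for a, b in product(elements, repeat=2)):
        return False
    # 2) Associativity
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # 3) Identity element
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # 4) Inverse element
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

print(is_group(range(5), lambda a, b: (a + b) % 5))     # True
print(is_group(range(5), lambda a, b: (a * b) % 5))     # False: 0 has no inverse
print(is_group(range(1, 5), lambda a, b: (a * b) % 5))  # True
```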
What is a field and how does it differ from a group?
A field is a set on which addition, subtraction, multiplication, and division by nonzero elements are defined and behave as the corresponding operations on rational and real numbers do. A field is thus a fundamental algebraic structure, widely used in algebra, number theory and many other areas of mathematics. The main difference between a field and a group is that a field has two operations (addition and multiplication) that must satisfy the field axioms, while a group has one operation that must satisfy the group axioms. The two are related: the elements of a field form a commutative group under addition, the nonzero elements form a commutative group under multiplication, and multiplication distributes over addition.
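The integers modulo n give a concrete test case. They always form a commutative ring, so they are a field exactly when every nonzero element has a multiplicative inverse, which happens when n is prime. The helper names below are illustrative, and the three-argument pow with exponent -1 assumes Python 3.8 or later.

```python
def is_field_mod(n):
    """True if the integers mod n form a field (every nonzero element is invertible)."""
    if n < 2:
        return False
    return all(any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n))

def divide_mod(a, b, p):
    """Compute a / b in the integers mod p, for b invertible mod p."""
    return (a * pow(b, -1, p)) % p

print(is_field_mod(5))      # True
print(is_field_mod(6))      # False: 2 has no inverse mod 6
print(divide_mod(3, 4, 5))  # 2, since 2 * 4 = 8 = 3 (mod 5)
```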
What is an invariant and what are some examples of topological invariants?
An invariant is a property of a mathematical object (or a class of mathematical objects) that remains unchanged under a given class of transformations. A topological invariant is one that is preserved by homeomorphisms. Examples of topological invariants include the number of connected components, the number of holes in different dimensions (as captured by the Betti numbers), the Euler characteristic, and the fundamental group.
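The Euler characteristic is the easiest of these to compute: for a simplicial complex it is the alternating sum of the number of simplices in each dimension. The short sketch below (function name illustrative) shows two different triangulations of a circle giving the same value.

```python
def euler_characteristic(simplices):
    """Alternating sum over simplices: +1 for even dimension, -1 for odd."""
    unique = {frozenset(s) for s in simplices}
    return sum((-1) ** (len(s) - 1) for s in unique)

hollow_triangle = [(0,), (1,), (2,), (0, 1), (1, 2), (0, 2)]
hollow_square = [(0,), (1,), (2,), (3,), (0, 1), (1, 2), (2, 3), (0, 3)]
filled_triangle = hollow_triangle + [(0, 1, 2)]

print(euler_characteristic(hollow_triangle))  # 0
print(euler_characteristic(hollow_square))    # 0, both are circles
print(euler_characteristic(filled_triangle))  # 1, a disk
```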
What is a Sybil attack and how does it exploit trust relationships in distributed networks?
A Sybil attack is an attack in which a single adversary subverts a reputation system by forging many identities in a peer-to-peer network. Because such systems tend to treat each identity as a distinct, independent participant, the forged identities give the attacker disproportionate influence. The attack is named after the subject of the book Sybil, a case study of a woman diagnosed with dissociative identity disorder. The name was suggested in or before 2002 by Brian Zill at Microsoft Research, and the term has since been widely used in the computer science community.
What are the advantages and disadvantages of using a global vs. a local view for Sybil detection in online social networks?
A global view for Sybil detection has the advantage of being able to detect Sybil nodes that are far away from the honest nodes in the network. However, it requires knowledge of the entire network, which may not be feasible in large-scale or dynamic networks. A local view, on the other hand, only requires knowledge of a small portion of the network around a particular node. This makes it more scalable and adaptable to changes in the network. However, it may not be able to detect Sybil nodes that are far away from the node under consideration.
What is an ego network and how can we use it to obtain a local view of a network?
An ego network is a subgraph of a network that is centered on a single node, known as the ego. The ego network includes the ego, the nodes to which it is directly connected (called alters), and all the links among those nodes. Ego networks provide a local view of a network, which can be used to study the properties of individual nodes and their immediate neighborhoods.
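Extracting an ego network from an adjacency structure takes only a few lines. This sketch assumes an undirected graph stored as a dict from each node to its set of neighbours; the example graph is made up for illustration.

```python
def ego_network(adj, ego):
    """Return (nodes, edges) of the ego network: the ego, its alters, and all links among them."""
    nodes = {ego} | set(adj[ego])
    edges = {frozenset((u, v)) for u in nodes for v in adj[u] if v in nodes}
    return nodes, edges

adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}
nodes, edges = ego_network(adj, "a")
print(sorted(nodes))                       # ['a', 'b', 'c']
print(sorted(sorted(e) for e in edges))    # [['a', 'b'], ['a', 'c'], ['b', 'c']]
```

Note that the link between the alters b and c is kept, while d and e fall outside the local view.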
How can we use topological data analysis to detect Sybil attacks in ego networks?
Topological data analysis can be used to detect Sybil attacks in ego networks by identifying topological features that are characteristic of Sybil nodes. For example, Sybil nodes may form tightly-knit communities that are loosely connected to the rest of the network. These communities can be detected as topological ‘holes’ in the network. Additionally, the persistence of these holes across different scales can be used to distinguish between real communities and Sybil communities.
What are the steps of the topological pipeline for Sybil detection and what are the inputs and outputs of each step?
The topological pipeline for Sybil detection consists of four steps, each taking the previous step's output as its input: 1) Construct a simplicial complex from the network data (input: network data; output: simplicial complex). 2) Compute the persistent homology of the simplicial complex (input: simplicial complex; output: persistence diagram). 3) Extract features from the persistence diagram (input: persistence diagram; output: feature vector). 4) Classify the nodes based on these features (input: feature vector; output: classification of nodes).
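The four steps can be wired together as plain functions. The sketch below is a toy under strong assumptions, not a detection method from the source: the edge weights are hypothetical dissimilarities, only 0-dimensional persistence of a graph is computed, the features are an arbitrary choice, and the threshold rule is a stand-in for a trained classifier.

```python
def build_filtration(nodes, weighted_edges):
    """Step 1: network data -> filtered complex (vertices at 0, edges at their weight)."""
    vertices = [(0.0, (v,)) for v in sorted(nodes)]
    edges = sorted((float(w), (u, v)) for u, v, w in weighted_edges)
    return vertices + edges

def h0_persistence(filtration):
    """Step 2: filtered complex -> 0-dimensional persistence diagram (finite bars only)."""
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    diagram = []
    for value, simplex in filtration:
        if len(simplex) == 1:
            parent[simplex[0]] = simplex[0]  # a new component is born
        else:
            ru, rv = find(simplex[0]), find(simplex[1])
            if ru != rv:
                parent[ru] = rv
                diagram.append((0.0, value))  # a component dies at this scale
    return diagram

def extract_features(diagram):
    """Step 3: persistence diagram -> feature vector (illustrative features)."""
    lifetimes = [death - birth for birth, death in diagram]
    return {
        "n_bars": len(lifetimes),
        "max_persistence": max(lifetimes, default=0.0),
        "total_persistence": sum(lifetimes),
    }

def classify(features, threshold=0.5):
    """Step 4: feature vector -> label. Hypothetical placeholder rule."""
    return "suspicious" if features["max_persistence"] > threshold else "unremarkable"

# Made-up ego network: a tight group (s1, s2) attached to the rest by one weak link.
nodes = {"ego", "a", "b", "s1", "s2"}
weighted_edges = [("ego", "a", 0.1), ("ego", "b", 0.2), ("a", "b", 0.15),
                  ("s1", "s2", 0.05), ("ego", "s1", 0.9)]

features = extract_features(h0_persistence(build_filtration(nodes, weighted_edges)))
print(features)
print(classify(features))
```

In this example the long-lived bar comes from the group that joins the rest of the network only at weight 0.9, which is what pushes max_persistence over the placeholder threshold.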