334 - 442: Multidimensional Data Flashcards
What is a Quadtree?
A Quadtree is a tree-based data structure used for indexing two-dimensional spatial data by recursively dividing the space into quadrants.
Which structure is used to index multi-dimensional spatial data in Quadtrees?
a) Grids
b) B+ Trees
c) K-D Trees
d) Quadrants
d) Quadrants
How does a Quadtree handle non-uniform spatial data distribution?
By recursively subdividing space into quadrants, adapting to the density of data in each region.
What is the tradeoff associated with the Curse of Dimensionality?
It becomes increasingly difficult to differentiate points in high-dimensional space due to diminishing contrast in distances.
What is a Quadtree used for in multi-dimensional data?
A Quadtree is used to partition a 2D space into quadrants recursively, helping to organize data efficiently for spatial queries like range searches.
What is the primary difference between a Quadtree and a PR (Point-Region) Quadtree?
In a standard Quadtree, partitioning depends on the inserted data points, whereas in a PR Quadtree, the space is partitioned into equal-sized regions, independent of data.
In a PR Quadtree, where are the data points stored?
Inner nodes
Leaf nodes
Root node
In all nodes
Leaf nodes
What is the Minimum Bounding Rectangle (MBR) in an R-tree?
An MBR is the smallest rectangle that encloses a group of spatial objects, used for indexing and searching in an R-tree.
What is the primary drawback of overlapping MBRs in R-trees?
Overlapping MBRs can increase the number of nodes that must be visited during a query, reducing efficiency.
Define MINDIST in the context of R-trees.
MINDIST is the minimum possible distance between a point and any point within an MBR, used for approximating the nearest neighbor search.
Which algorithm is primarily used for choosing the correct leaf node to insert data in an R-tree?
Breadth-First Search
ChooseLeaf
SplitNode
AdjustTree
ChooseLeaf
How does the MINMAXDIST help during a nearest neighbor query in an R-tree?
MINMAXDIST provides an upper bound for the distance to the nearest neighbor, allowing the algorithm to prune subtrees that cannot contain closer points.
What is a major advantage of the R* Tree over the standard R Tree?
The R* Tree improves node splitting by considering both area increase and the degree of overlap, leading to better query performance.
What is the condition for splitting a node in an R-tree?
When it has only one entry.
When it exceeds its capacity (M).
When its MBR does not overlap.
When its parent node is full.
When it exceeds its capacity (M).
What is the difference between a kd-tree and a Quadtree?
A kd-tree partitions space alternately along different dimensions, while a Quadtree partitions space uniformly into quadrants.
Why is the original Guttman R-tree split algorithm considered inefficient for large M?
The algorithm has 2^𝑀−1 possible splits to evaluate, making it computationally expensive for large node capacities.
What are the three pruning strategies used in nearest neighbor (NN) search in R-trees?
Downward pruning: Ignore MBRs with MINDIST larger than the MINMAXDIST of another MBR.
Ignore objects with distance to P larger than MINMAXDIST of an MBR.
Upward pruning: Ignore MBRs with MINDIST larger than the distance to a known object.
In a VP Tree, how are the partitions created during the indexing process?
Select a pivot object.
Compute the median of all distances to the pivot.
Partition the data into two sets: points within the median distance and points outside.
How does a Generalized Hyperplane Tree (GH Tree) partition objects?
Pick two pivot objects.
Partition data into two sets: closer to the first pivot or the second pivot.
What are the components of an M-Tree node?
Pivot object
Radius of the enclosing ball
Distance to the parent node
Pointer to a subtree
What are the three properties a distance function must satisfy in metric space indexing?
Symmetry
Non-negativity
Triangle inequality
How does ball partitioning differ from other metric space indexing methods?
It partitions data based on whether points lie inside or outside a ball defined by a pivot.
What does the pivot in a VP Tree represent?
A) The smallest object.
B) A randomly selected object.
C) The object at the median distance.
D) The furthest object.
C) The object at the median distance.
What are the two methods to prioritize MBRs in NN search?
A) According to height and depth
B) According to MINMAXDIST and MINDIST
C) By alphabetical order
D) Using the number of contained objects
B) According to MINMAXDIST and MINDIST
In downward pruning, when can an MBR be ignored?
A) When MINDIST is less than MINMAXDIST of another MBR.
B) When MINDIST is greater than MINMAXDIST of another MBR.
C) When the query point lies within the MBR.
D) When the MBR is at the root level.
B) When MINDIST is greater than MINMAXDIST of another MBR.