Chapter 28 Flashcards

1
Q

What is purpose of join

A

Used to retrieve data from multiple tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What are 3 fast join algorithms

A

1- sort-merge
2- hash-based
3- index-based

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is nested-loop join

A

Nested-loop join works like a nested loop

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is the time-complexity of the nested loop

A

O(RS)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Does time complexity should be independent of the order of tables. i.e. O(RS) is same as O(SR)

A

Yes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are 3 Variants of Nested-Loop Join

A
  1. Temporary index nested-loop join
  2. Index nested-loop join
  3. Naive nested-loop join
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is Naive nested-loop join

A

There are many variants of the traditional nested-loop join. The simplest case is when an entire table is scanned; this is called a naive nested-loop join.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is Index nested-loop join

A

If there is an index, and that index is exploited, then it is called an index nested-loop join.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is Temporary index nested-loop join

A

If the index is built as part of the query plan and subsequently dropped, it is called as a temporary index nested-loop join.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is sort-merge join

A

A “sort merge” join is performed by sorting the two data sets to be joined according to the join keys and then merging them together. The merge is very cheap, but the sort can be prohibitively expensive.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Does Sort-Merge Join very fast

A

Yes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is hash join

A

A hash join is performed by hashing one data set into memory based on join columns and reading the other one and probing the hash table for matches. The hash join is very low cost when the hash table can be held entirely in memory. It is good for VLDB.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are 2 phases of hash join

A

hashed (build) phase and probed phase

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is partitioning fan-out

A

In hash join, initially, the two tables are entirely consumed and partitioned (using a hash function on the hash keys) into multiple partitions. The number of such partitions is sometimes called the partitioning fan-out.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is Hash-Based Join: Partition Skew problem and its solution

A

Partition skew can become a problem in hash-join. In the first step of hash join, records are hashed into the main memory into their corresponding bucket. This being done based on the hash function used. However, an attribute being hashed may not be uniformly distributed within the relation, and some buckets may then contain more records than other buckets. When this difference becomes large, the corresponding bucket may no longer fit in the main memory. As a consequence, hash-based join degrades into performance of a nested-loop join. The only possible solution is to make available other hash functions to be chosen by the optimizer; that better distribute the input.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is Hash-Based Join: Intrinsic Skew

A

Intrinsic skew occurs when attributes are not distributed uniformly. it is also called attribute value skew.