Chapter 28 Flashcards

Question 1

Q

What is the purpose of a join?

Answer

A

A join is used to retrieve related data from multiple tables.

Question 2

Q

What are three fast join algorithms?

Answer

A

1- sort-merge
2- hash-based
3- index-based

Question 3

Q

What is a nested-loop join?

Answer

A

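A nested-loop join works like a pair of nested loops: for each row of the outer table, it scans the inner table and outputs every pair of rows that satisfies the join condition.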
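
As a rough illustration of the nested-loop idea, here is a minimal Python sketch. It assumes tables are plain lists of dictionaries joined on one shared column; the function name `nested_loop_join` and the sample tables are hypothetical, not taken from the chapter.

```python
def nested_loop_join(outer, inner, key):
    """Equi-join two lists of dict rows on the column named `key`."""
    result = []
    for r in outer:          # one pass over the outer table
        for s in inner:      # full scan of the inner table for each outer row
            if r[key] == s[key]:
                result.append({**r, **s})
    return result


# Hypothetical sample data.
employees = [{"dept_id": 1, "name": "Ann"}, {"dept_id": 2, "name": "Bob"}]
departments = [{"dept_id": 1, "dept": "Sales"}, {"dept_id": 3, "dept": "HR"}]

joined = nested_loop_join(employees, departments, "dept_id")
```

In this sketch every outer row is compared with every inner row, so the number of comparisons is the product of the two table sizes.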

Question 4

Q

What is the time complexity of the nested-loop join?

Question 5

Q

Should the time complexity be independent of the order of the tables, i.e., is O(RS) the same as O(SR)?

Question 6

Q

What are the three variants of the nested-loop join?

Answer

A

Temporary index nested-loop join
Index nested-loop join
Naive nested-loop join

Question 7

Q

What is a naive nested-loop join?

Answer

A

There are many variants of the traditional nested-loop join. The simplest case is when an entire table is scanned; this is called a naive nested-loop join.

Question 8

Q

What is an index nested-loop join?

Answer

A

If there is an index, and that index is exploited, then it is called an index nested-loop join.
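
One way to picture an index nested-loop join is the Python sketch below. It assumes the index is a sorted list of `(key value, row position)` pairs searched with `bisect`; that layout, the function name, and the sample tables are all hypothetical choices made for illustration, not how any particular database stores an index.

```python
from bisect import bisect_left


def index_nested_loop_join(outer, inner, inner_index, key):
    """Join `outer` with `inner`, using a sorted index instead of scanning `inner`."""
    result = []
    for r in outer:
        # (value, -1) sorts before every real (value, position) pair,
        # so this finds the first index entry with a matching key value.
        pos = bisect_left(inner_index, (r[key], -1))
        while pos < len(inner_index) and inner_index[pos][0] == r[key]:
            result.append({**r, **inner[inner_index[pos][1]]})
            pos += 1
    return result


# Hypothetical sample data.
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
orders = [
    {"cust_id": 2, "item": "pen"},
    {"cust_id": 1, "item": "ink"},
    {"cust_id": 2, "item": "pad"},
]

# The index on orders.cust_id is assumed to exist before the join runs.
orders_index = sorted((row["cust_id"], i) for i, row in enumerate(orders))

joined = index_nested_loop_join(customers, orders, orders_index, "cust_id")
```

Each outer row now costs one index lookup plus one step per matching row, rather than a full scan of the inner table.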

Question 9

Q

What is a temporary index nested-loop join?

Answer

A

If the index is built as part of the query plan and subsequently dropped, it is called a temporary index nested-loop join.

Question 10

Q

What is a sort-merge join?

Answer

A

A “sort merge” join is performed by sorting the two data sets to be joined according to the join keys and then merging them together. The merge is very cheap, but the sort can be prohibitively expensive.
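
The sort-then-merge steps can be sketched in Python as follows. This is a minimal in-memory illustration, assuming rows are dictionaries and the join key values are comparable; the function name and sample data are hypothetical.

```python
def sort_merge_join(left, right, key):
    """Equi-join two lists of dict rows by sorting both and merging."""
    # Sort phase: this is the expensive part.
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])

    # Merge phase: a single coordinated pass over both sorted inputs.
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows that share this key value.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row in the matching run with that right run.
            while i < len(left) and left[i][key] == lk:
                for s in right[j:j_end]:
                    result.append({**left[i], **s})
                i += 1
            j = j_end
    return result


# Hypothetical sample data with duplicate keys on both sides.
left_rows = [{"k": 3, "a": "x"}, {"k": 1, "a": "y"}, {"k": 3, "a": "z"}]
right_rows = [{"k": 3, "b": "p"}, {"k": 2, "b": "q"}, {"k": 3, "b": "r"}]

merged = sort_merge_join(left_rows, right_rows, "k")
```

If both inputs already arrive sorted on the join key, the two `sorted` calls could be skipped and only the cheap merge pass remains.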

Question 11

Q

Is the sort-merge join very fast?

Question 12

Q

What is a hash join?

Answer

A

A hash join is performed by hashing one data set into memory on the join columns, then reading the other data set and probing the hash table for matches. The hash join is very low cost when the hash table can be held entirely in memory. It is a good fit for very large databases (VLDB).

Question 13

Q

What are the two phases of a hash join?

Answer

A

The build (hashing) phase and the probe phase.
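
The two phases can be sketched in Python as below. This is a minimal in-memory illustration that assumes the whole hash table fits in memory; the function name and sample tables are hypothetical.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Equi-join two lists of dict rows with an in-memory hash table."""
    # Build phase: hash one input (ideally the smaller one) on the join key.
    table = defaultdict(list)
    for r in build_rows:
        table[r[key]].append(r)

    # Probe phase: read the other input once and look up each key.
    result = []
    for s in probe_rows:
        for r in table.get(s[key], []):
            result.append({**r, **s})
    return result


# Hypothetical sample data.
departments = [{"dept_id": 1, "dept": "Sales"}, {"dept_id": 2, "dept": "IT"}]
employees = [
    {"dept_id": 2, "name": "Bob"},
    {"dept_id": 1, "name": "Ann"},
    {"dept_id": 4, "name": "Cy"},
]

joined = hash_join(departments, employees, "dept_id")
```

Each input is read only once here, which is why the approach is cheap as long as the build side fits in memory.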

Question 14

Q

What is the partitioning fan-out?

Answer

A

In a hash join, the two tables are first consumed in their entirety and split into multiple partitions by applying a hash function to the join keys. The number of such partitions is sometimes called the partitioning fan-out.
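
A small Python sketch of the partitioning step, under the assumption that Python's built-in `hash` taken modulo the fan-out serves as the partitioning function; the function name `partition` and the sample tables are hypothetical.

```python
def partition(rows, key, fan_out):
    """Split rows into `fan_out` partitions by hashing the join key."""
    partitions = [[] for _ in range(fan_out)]
    for row in rows:
        partitions[hash(row[key]) % fan_out].append(row)
    return partitions


FAN_OUT = 4  # number of partitions, i.e. the partitioning fan-out

# Hypothetical sample tables with integer join keys.
table_r = [{"k": k, "r": f"r{k}"} for k in range(10)]
table_s = [{"k": k, "s": f"s{k}"} for k in range(0, 10, 2)]

parts_r = partition(table_r, "k", FAN_OUT)
parts_s = partition(table_s, "k", FAN_OUT)
```

Because both tables go through the same function, rows with equal keys land in partitions with the same number, so partition i of one table only needs to be joined with partition i of the other.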

Question 15

Q

What is the partition skew problem in a hash-based join, and what is its solution?

Answer

A

Partition skew can become a problem in a hash join. In the first step of a hash join, records are hashed into their corresponding buckets in main memory, according to the hash function in use. However, the attribute being hashed may not be uniformly distributed within the relation, so some buckets may receive many more records than others. When this difference becomes large, the overloaded bucket may no longer fit in main memory, and the hash-based join then degrades to the performance of a nested-loop join. The only possible solution is to make other hash functions available for the optimizer to choose from, so that it can pick one that distributes the input better.

Question 16

Q

What is intrinsic skew in a hash-based join?

Answer

A

Intrinsic skew occurs when attribute values are not distributed uniformly. It is also called attribute value skew.
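
To make the effect of a non-uniform attribute concrete, the Python sketch below counts how many values fall into each partition. It assumes integer key values and uses Python's built-in `hash` modulo the number of partitions; the function name and the two sample distributions are hypothetical.

```python
from collections import Counter


def partition_sizes(values, fan_out):
    """Count how many values hash into each of `fan_out` partitions."""
    sizes = Counter(hash(v) % fan_out for v in values)
    return [sizes.get(p, 0) for p in range(fan_out)]


# Uniform attribute: 1000 distinct key values.
uniform = list(range(1000))

# Skewed attribute: one value (7) accounts for 900 of the 1000 rows.
skewed = [7] * 900 + list(range(100))

uniform_sizes = partition_sizes(uniform, 4)
skewed_sizes = partition_sizes(skewed, 4)
```

In this sketch the uniform input splits evenly, while the skewed input puts most rows into one partition. Since equal values always hash to the same partition, a single very frequent value stays together whichever hash function is used.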

(16 cards)