Designing Data-Intensive Applications Book Flashcards

1
Q

Are B-Trees faster at reads or at writes?

A

Reads.

2
Q

Are LSM-Trees faster at reads or at writes?

A

Writes.

3
Q

Why are reads typically slower on LSM-Trees?

A

A read has to check several different data structures: the in-memory table and SSTables at different stages of compaction.
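
The read path can be sketched as follows. This is a minimal illustration, not any particular engine's code: the names `lsm_get`, `memtable`, and `sstables` are hypothetical, and plain dicts stand in for on-disk segments. It returns the value together with the number of structures it had to check.

```python
from typing import Dict, List, Optional, Tuple


def lsm_get(
    key: str,
    memtable: Dict[str, str],
    sstables: List[Dict[str, str]],
) -> Tuple[Optional[str], int]:
    """Return (value, probes). sstables must be ordered newest first."""
    probes = 1  # the memtable is always checked first
    if key in memtable:
        return memtable[key], probes
    for segment in sstables:
        probes += 1  # each further segment is one more structure to check
        if key in segment:
            return segment[key], probes
    return None, probes  # a missing key costs a probe of every structure


memtable = {"cat": "3"}
sstables = [{"dog": "7"}, {"cat": "1", "emu": "9"}]  # newest segment first

print(lsm_get("cat", memtable, sstables))    # found in the memtable
print(lsm_get("emu", memtable, sstables))    # found only in the oldest segment
print(lsm_get("zebra", memtable, sstables))  # absent: every structure is probed
```

A lookup for a missing key is the worst case, which is why real engines commonly add Bloom filters to skip segments that cannot contain the key.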

4
Q

Throughput

A

Operational capacity: the amount of work a system can process per unit of time (for example, writes per second).

5
Q

Empirically

A

Experimentally: based on observation and measurement rather than theory.

6
Q

B-Trees: Write Path

A

B-Trees write every piece of data at least twice: once to the write-ahead log (WAL) and once to the tree page itself.
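
A toy sketch of the double write, assuming a hypothetical `TinyPageStore` class in which Python lists and dicts stand in for the log file and the tree pages. Each `put` is counted as two disk writes, and `recover` shows why the log is written first: the pages can be rebuilt by replaying it.

```python
class TinyPageStore:
    """Toy model: every put is logged to the WAL before the page is modified."""

    def __init__(self):
        self.wal = []         # append-only log of (page_id, key, value)
        self.pages = {}       # page_id -> {key: value}
        self.disk_writes = 0  # counts simulated disk writes

    def put(self, page_id, key, value):
        # First write: append the change to the write-ahead log.
        self.wal.append((page_id, key, value))
        self.disk_writes += 1
        # Second write: overwrite the tree page in place.
        self.pages.setdefault(page_id, {})[key] = value
        self.disk_writes += 1

    def recover(self):
        """Rebuild all pages by replaying the log, as after a crash."""
        pages = {}
        for page_id, key, value in self.wal:
            pages.setdefault(page_id, {})[key] = value
        return pages


store = TinyPageStore()
store.put(0, "a", 1)
store.put(0, "b", 2)
print(store.disk_writes)  # two logical writes cost four simulated disk writes
```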

7
Q

LSM-Trees: Write Path

A

LSM-Trees rewrite data multiple times due to compaction and merging of SSTables.
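
A minimal sketch of what one compaction step does, assuming segments are small in-memory lists of sorted `(key, value)` pairs; the function name `compact` is hypothetical. Every surviving record is written again into the merged segment, which is where the repeated rewriting comes from.

```python
def compact(segments_newest_first):
    """Merge segments into one sorted segment, keeping the newest value per key."""
    merged = {}
    # Walk from oldest to newest so that newer values overwrite older ones.
    for segment in reversed(segments_newest_first):
        for key, value in segment:
            merged[key] = value
    return sorted(merged.items())


newer = [("a", "2"), ("c", "5")]
older = [("a", "1"), ("b", "3")]
print(compact([newer, older]))  # [('a', '2'), ('b', '3'), ('c', '5')]
```

Real engines merge segment files in a streaming, mergesort-like pass instead of loading them into a dict; the dict here only keeps the sketch short.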

8
Q

Write Amplification in LSM-Trees

A

Write amplification in LSM-Trees means one write to the database results in multiple writes to disk over its lifetime.

9
Q

Performance Bottleneck in Write-Heavy Applications

A

The rate at which the database can write to disk can be the bottleneck.

10
Q

Write Amplification Impact

A

The more times a storage engine writes to disk for each logical write, the fewer writes per second it can handle within the available disk bandwidth.
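
The relationship can be worked through with simple arithmetic. The function names and the figures below (100 MB/s of disk bandwidth, 1,000-byte records) are illustrative assumptions, not measurements.

```python
def write_amplification(bytes_written_to_disk, bytes_written_by_app):
    """Ratio of physical bytes written to logical bytes written."""
    return bytes_written_to_disk / bytes_written_by_app


def max_logical_writes_per_sec(disk_bandwidth_bytes_per_sec, record_size_bytes, amplification):
    """Upper bound on logical writes per second when disk bandwidth is the limit."""
    return disk_bandwidth_bytes_per_sec / (record_size_bytes * amplification)


# Assumed figures: 100 MB/s of disk bandwidth and 1,000-byte records.
print(max_logical_writes_per_sec(100_000_000, 1000, 1))   # 100000.0
print(max_logical_writes_per_sec(100_000_000, 1000, 10))  # 10000.0
```

With the same disk, a tenfold write amplification cuts the sustainable write rate by a factor of ten.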

11
Q

LSM-Trees Write Throughput Compared with B-Trees

A

LSM-Trees generally sustain higher write throughput than B-Trees.

12
Q

Sequential Writes vs. Random-Access Writes

A

The sequential writes used by LSM-Trees are faster than the random-access writes required by B-Trees, especially on magnetic hard drives.

13
Q

Reason for Higher Throughput

A

LSM-Trees have lower write amplification in some cases and write compact SSTable files sequentially.

14
Q

Batching Writes

A

LSM-Trees accumulate many writes in memory and then flush them to disk in one go, reducing constant disk access.
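
A small sketch of the batching idea, assuming a hypothetical `MemtableWriter` that flushes once the in-memory table holds `flush_threshold` keys. A list of sorted segments stands in for SSTable files on disk, and the crash-recovery log that real engines also keep is left out.

```python
class MemtableWriter:
    """Buffers writes in memory and flushes them as one sorted segment."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.memtable = {}
        self.flushed_segments = []  # stands in for SSTable files on disk

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.memtable:
            # One sequential write of a whole sorted batch.
            self.flushed_segments.append(sorted(self.memtable.items()))
            self.memtable = {}


writer = MemtableWriter(flush_threshold=3)
for i in range(7):
    writer.put(f"k{i}", i)

print(len(writer.flushed_segments))  # 2 segments written for 7 puts
print(writer.memtable)               # {'k6': 6} is still buffered in memory
```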

15
Q

Minimizing Disk Access

A

LSM-Trees organize data in larger batches, reducing the need to access the disk.

16
Q

LSM-Trees Disk Space

A

LSM-Trees often produce smaller files on disk due to better compression.

17
Q

B-Trees Disk Space

A

B-Trees leave some disk space unused due to fragmentation.

18
Q

Read Speed and Compaction at High Write Throughput

A

At high write throughput, compaction may not keep up with incoming writes, leading to more unmerged segments and slower reads.

19
Q

B-Trees Transactional Semantics

A

B-Trees have an advantage because each key exists in exactly one place in the index, so locks on ranges of keys can be attached directly to the tree.

20
Q

Log-Structured Storage Engines

A

Log-structured storage engines may have multiple copies of the same key in different segments, complicating transaction isolation and lock management.

21
Q

B-Trees Popularity

A

B-Trees are deeply integrated into many databases.

22
Q

LSM-Trees Popularity

A

LSM-Trees are gaining popularity in new data stores due to their write performance benefits.

23
Q

Databases using B-Trees

A

MySQL (InnoDB), PostgreSQL, SQLite

24
Q

Databases using LSM-Trees

A

Apache Cassandra, RocksDB, LevelDB

25
Q

What is a Secondary Index?

A

An index on columns other than the primary key, enabling efficient joins and searches on non-primary-key fields.
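
A minimal sketch, assuming a toy table held in a dict keyed by primary key; `rows` and `build_secondary_index` are hypothetical names. Unlike a primary key, a secondary-index value need not be unique, so each entry maps to a list of matching primary keys.

```python
from collections import defaultdict

rows = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Lin", "city": "Paris"},
    3: {"name": "Sam", "city": "London"},
}


def build_secondary_index(rows, column):
    """Map each value of `column` to the primary keys of the rows containing it."""
    index = defaultdict(list)
    for primary_key, row in rows.items():
        index[row[column]].append(primary_key)
    return index


city_index = build_secondary_index(rows, "city")
print(city_index["London"])  # [1, 3]
```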

26
Q

What does “Storing Values within the Index” mean?

A

An index can store either the actual row values or references to rows stored elsewhere, in a heap file.

27
Q

What is a Clustered Index?

A

An index that stores the actual row data within the index itself, minimizing lookup steps for read-heavy workloads.
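
The difference in lookup steps can be sketched with plain Python containers. A list stands in for the heap file and dicts stand in for the indexes; all names here are hypothetical.

```python
heap_file = [
    {"id": 7, "name": "Ada"},
    {"id": 3, "name": "Lin"},
]

# Nonclustered index: key -> position of the row in the heap file.
nonclustered_index = {row["id"]: pos for pos, row in enumerate(heap_file)}

# Clustered index: key -> the row itself.
clustered_index = {row["id"]: row for row in heap_file}


def lookup_via_heap(key):
    position = nonclustered_index[key]  # step 1: find the reference
    return heap_file[position]          # step 2: follow it to the heap file


def lookup_clustered(key):
    return clustered_index[key]         # single step: the row is in the index


print(lookup_via_heap(3))
print(lookup_clustered(3))
```

Both lookups return the same row; the clustered version simply skips the extra hop.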

28
Q

What is a Covering Index?

A

An index that includes some of the table’s columns within the index itself, so that some queries can be answered from the index alone without accessing the table.

29
Q

What are Multi-Column Indexes?

A

Indexes that combine several columns into one index key, enabling efficient queries that filter on several columns at once.
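
A sketch of a concatenated index, assuming a sorted list of `(last_name, first_name, row_id)` tuples as a stand-in for the index; the data and function name are made up for illustration. Because entries are sorted by the first column and then the second, a search on a prefix of the key is a binary search followed by a short scan.

```python
import bisect

# Concatenated key: (last_name, first_name), followed by the row id.
index = sorted([
    ("Smith", "Ann", 1),
    ("Smith", "Bob", 2),
    ("Jones", "Cy", 3),
    ("Smith", "Zed", 4),
])


def find_by_last_name(index, last_name):
    """Return row ids for everyone with the given last name."""
    # A one-element tuple sorts before every longer tuple sharing that prefix.
    start = bisect.bisect_left(index, (last_name,))
    row_ids = []
    for entry in index[start:]:
        if entry[0] != last_name:
            break
        row_ids.append(entry[2])
    return row_ids


print(find_by_last_name(index, "Smith"))  # [1, 2, 4]
```

The same index is of no help for finding everyone with a given first name, since the entries are not sorted by that column.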

30
Q

What are Multi-Dimensional Indexes used for?

A

They support querying several columns at once, which is especially useful for geospatial data such as latitude and longitude ranges.
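
One possible technique, which this card itself does not name, is to map a two-dimensional point to a single number with a space-filling curve and then index that number in an ordinary one-dimensional index. The sketch below uses a Z-order (Morton) code as an assumed example; `interleave_bits` is a hypothetical helper that works on non-negative integer grid coordinates.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Z-order code: bits of x go to even positions, bits of y to odd positions."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


print(interleave_bits(1, 1))  # 3
print(interleave_bits(3, 0))  # 5
print(interleave_bits(0, 3))  # 10
print(interleave_bits(2, 2))  # 12
```

Points that are close in both dimensions tend to receive nearby codes, so a range scan over the codes narrows both dimensions at once. Specialized spatial indexes such as R-trees are the more common choice in practice.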

31
Q

What are Full-Text Search and Fuzzy Indexes?

A

Indexes that support searching for similar keys rather than exact matches, handling typos and synonyms.
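
Typo tolerance is often expressed as a maximum edit distance between the search term and an indexed word. The sketch below is a plain Levenshtein distance with a linear scan over a tiny vocabulary; the function names are hypothetical, and real search engines use much more efficient structures than a scan.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def fuzzy_lookup(term, vocabulary, max_distance=1):
    """Return the words within max_distance edits of term."""
    return [word for word in vocabulary if edit_distance(term, word) <= max_distance]


print(edit_distance("kitten", "sitting"))                          # 3
print(fuzzy_lookup("databse", ["database", "dataset", "datum"]))   # ['database']
```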