Partitioning, Indexes, Proxies Flashcards

1
Q

Data Partitioning

A

The process of dividing a large database into smaller, more manageable parts called partitions (or shards).

Data is partitioned based on criteria such as data range, data size, or data type.

Each partition is assigned to a separate processing node, which can operate on its subset of the data independently of the others.

2
Q

Why do we use data partitioning?

A
  • Improves the performance and scalability of large-scale data processing applications
  • Balances the workload across multiple servers
3
Q

Horizontal Partitioning

A

Horizontal partitioning is also known as sharding.

It splits a table by rows: every partition has the same columns but holds a different subset of the rows.
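
A minimal sketch of row-based sharding, assuming a made-up `users` table and a simple "key modulo shard count" rule. The function name `shard_rows` and the sample data are illustrative only; real systems often shard by key range or by a hash of the key instead.

```python
from collections import defaultdict

def shard_rows(rows, key, num_shards):
    """Assign each row to a shard using the key modulo the shard count."""
    shards = defaultdict(list)
    for row in rows:
        shards[row[key] % num_shards].append(row)
    return dict(shards)

# Hypothetical table: every row has the same columns.
users = [
    {"user_id": 1, "name": "Ada"},
    {"user_id": 2, "name": "Grace"},
    {"user_id": 3, "name": "Linus"},
    {"user_id": 4, "name": "Barbara"},
]

shards = shard_rows(users, "user_id", 2)
# Each shard holds whole rows; only the set of rows differs per shard.
```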

4
Q

Vertical Partitioning

A

Vertical partitioning splits a database table into multiple partitions wherein each partition is a set of columns

This can reduce the amount of data scanned per query, because columns that are rarely needed are kept out of the partition that is read most often.
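
A minimal sketch of a column-based split, assuming a made-up `products` table. The helper `split_columns` and the group names `hot` and `cold` are illustrative assumptions. Each partition repeats the primary key so the original row can be rebuilt.

```python
def split_columns(rows, key, column_groups):
    """Split each row into one smaller row per column group, keeping the key in each."""
    partitions = {name: [] for name in column_groups}
    for row in rows:
        for name, columns in column_groups.items():
            part = {key: row[key]}
            part.update({column: row[column] for column in columns})
            partitions[name].append(part)
    return partitions

# Hypothetical table with one bulky, rarely read column.
products = [
    {"id": 1, "name": "Lamp", "price": 20, "description": "A long marketing text"},
    {"id": 2, "name": "Desk", "price": 150, "description": "Another long text"},
]

parts = split_columns(
    products,
    "id",
    {"hot": ["name", "price"], "cold": ["description"]},
)
# Queries that only need name and price read parts["hot"] and never touch the descriptions.
```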

5
Q

Hybrid Partitioning

A

Combines vertical partitioning and sharding

6
Q

Partition Criteria

A

The rules used to decide which partition each piece of data belongs to, for example a key range, a hash of the key, or a list of values.

7
Q

Consistent Hashing

A

A hashing scheme used in distributed systems

Maps both keys (or requesters) and servers onto a virtual ring known as a hash ring.

This keeps key placement largely independent of the number of servers, which minimizes key relocation when the system is scaled, for example when more servers are added.
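
A minimal sketch of a hash ring, assuming MD5 as the hash function and one ring position per server. The class `HashRing` and the server names are illustrative; production implementations usually place several virtual nodes per server to even out the load.

```python
import bisect
import hashlib

def ring_position(value):
    """Map a string to a position on the ring (any stable hash would do)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers=()):
        self._positions = []  # sorted ring positions
        self._owners = {}     # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        position = ring_position(server)
        bisect.insort(self._positions, position)
        self._owners[position] = server

    def lookup(self, key):
        """Return the first server clockwise from the key (ring must not be empty)."""
        index = bisect.bisect_right(self._positions, ring_position(key))
        return self._owners[self._positions[index % len(self._positions)]]

ring = HashRing(["server-a", "server-b", "server-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {key: ring.lookup(key) for key in keys}

ring.add("server-d")
after = {key: ring.lookup(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
# Only keys that now belong to server-d change owner; all other keys stay put.
```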

8
Q

Common problems with data partitioning

A

JOINS
Joins that span partitions on different machines are slow, because the data has to be gathered over the network before it can be combined.

REFERENTIAL INTEGRITY
Foreign key constraints are hard to enforce across partitions. For example, a row can be deleted on one partition while rows that reference it remain on another.

9
Q

Database Indexing

A

Indexes make it faster to search a table for the row or rows that we want

An index is a data structure that can be thought of as a table of contents: it points to where the matching rows are stored.

Indexes make reads faster but writes slower, because every index must be updated whenever rows are inserted, updated, or deleted.
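
A minimal sketch of the read/write trade-off, assuming an in-memory table with a dictionary acting as the index on a hypothetical `email` column. The class `Table` is illustrative; real databases typically use B-trees or similar structures rather than a plain dictionary.

```python
from collections import defaultdict

class Table:
    def __init__(self):
        self.rows = []
        self.email_index = defaultdict(list)  # email -> positions of matching rows

    def insert(self, row):
        # Extra work on every write: the index must be kept in sync.
        self.rows.append(row)
        self.email_index[row["email"]].append(len(self.rows) - 1)

    def find_scan(self, email):
        """Without an index: inspect every row."""
        return [row for row in self.rows if row["email"] == email]

    def find_indexed(self, email):
        """With the index: jump straight to the matching rows."""
        return [self.rows[i] for i in self.email_index.get(email, [])]

table = Table()
table.insert({"id": 1, "email": "ada@example.com"})
table.insert({"id": 2, "email": "grace@example.com"})
table.insert({"id": 3, "email": "ada@example.com"})

matches = table.find_indexed("ada@example.com")
```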

10
Q

Proxy Server

A

Unless stated otherwise, "proxy" means a forward proxy.

A proxy server is an intermediate piece of software or hardware that sits between the client and the server and relays traffic between them.

A forward proxy makes requests on behalf of the client, which hides the client's identity from the server.

Proxies are used to cache data, filter requests, log requests, or transform requests.

11
Q

Collapsed Forwarding

A

When a proxy combines concurrent requests for the same data into a single request, so the data is read from the origin (for example, from disk) only once and the result is returned to every requester.
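
A minimal sketch of collapsed forwarding, assuming a thread-based proxy and a slow `fetch` callable that stands in for the disk or origin read. The class `CollapsingProxy` and the key `"report"` are illustrative assumptions, not the API of any particular proxy.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor

class CollapsingProxy:
    def __init__(self, fetch):
        self._fetch = fetch
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> Future shared by all callers waiting on that key

    def get(self, key):
        with self._lock:
            future = self._in_flight.get(key)
            leader = future is None
            if leader:
                # First caller for this key: it will perform the real fetch.
                future = Future()
                self._in_flight[key] = future
        if leader:
            try:
                future.set_result(self._fetch(key))
            except Exception as exc:
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._in_flight[key]
        # Followers block here until the leader's fetch completes.
        return future.result()

calls = []

def slow_fetch(key):
    calls.append(key)   # record every real read
    time.sleep(0.5)     # simulate a slow disk or origin
    return f"data for {key}"

proxy = CollapsingProxy(slow_fetch)
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(proxy.get, ["report"] * 5))
# Five concurrent requests arrive while one read is in flight, so they share its result.
```

Requests that arrive after the in-flight read has finished trigger a new read; adding a cache on top would avoid that as well.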

12
Q

Reverse Proxy

A

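A reverse proxy sits in front of one or more servers and handles requests on their behalf, which hides the servers from the client.

Can be used for caching, load balancing, or routing requests to the appropriate resources.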
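
A minimal sketch of a reverse proxy that combines caching with round-robin load balancing, assuming backends are plain callables. The class `ReverseProxy`, the helper `make_backend`, and the names `app-1` and `app-2` are illustrative only.

```python
import itertools

class ReverseProxy:
    def __init__(self, backends):
        self._backends = itertools.cycle(backends)  # round-robin over the servers
        self._cache = {}                            # path -> cached response

    def handle(self, path):
        if path in self._cache:
            return self._cache[path]       # served without contacting any backend
        backend = next(self._backends)     # pick the next server in rotation
        response = backend(path)
        self._cache[path] = response
        return response

def make_backend(name):
    """Build a fake server that reports which backend answered."""
    return lambda path: f"{name} served {path}"

proxy = ReverseProxy([make_backend("app-1"), make_backend("app-2")])
responses = [proxy.handle("/a"), proxy.handle("/b"), proxy.handle("/a")]
# The client only ever talks to the proxy; the third response comes from the cache.
```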
