Database Domain Flashcards

1
Q

I: DATABASE

A

is an organized collection of data.

Databases are classified by the way they store this data.

SQL vs. NoSQL

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

I: DATA MODELING

A

is the process of diagramming data flows. When creating a new or alternate database structure, the designer starts with a diagram of how data will flow into and out of the database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

I: INDEX

A

is the way to get an unordered table into an order that will maximize the query’s efficiency while searching.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

I: STATEFUL DB

A

remembers client data (state) from one request to the next.

Stateful servers do store session state.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

I: SHARDING

A

is a type of database horizontal partitioning that separates very large databases the into smaller, faster, more easily managed parts called data shards.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

I: DB SCALING

A

SQL databases, they are vertically scalable which means that you can increase the load on a single server by increasing components like RAM, SSD, or CPU.

NoSQL databases are horizontally scalable which means that they can handle increased traffic simply by adding more servers to the database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

I: DB HIGH AVAILABILITY

A

Amazon RDS Multi-AZ deployments.

When you provision a Multi-AZ DB instance, Amazon RDS automatically creates a primary DB instance and synchronously replicates the data to a standby instance in a different Availability Zone (AZ).

Amazon Aurora increases availability by replicating your data six ways across three Availability Zones. This means that your DB cluster can tolerate a failure of an Availability Zone without any loss of data and only a brief interruption of service.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

I: DB MIRRORING

A

is a primarily software solution for increasing database availability. Mirroring is implemented on a per-database basis and works only with databases that use the full recovery model.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

I: TABLE SCANS

A

is the reading of every row in a table and is caused by queries that don’t properly use indexes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

I: OLAP

A

OLAP is an online system that reports to multidimensional analytical queries like financial reporting, forecasting, etc.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

I: IN-MEMORY DB

A

In-memory databases are designed to attain minimal response time by eliminating the need to access disks.

Elasticache for Redis/Memcached – real-time bidding, gaming leaderboards, caching

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

I: DATABASE PERFORMANCE

A

Query Tuning

Improving Indexing

Input/Output Tuning

Scaling

Caching

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

I: DATABASE TROUBLESHOOTING

A

Gathering the Facts – you need to know what kind of problem is happening.

Test in Different environments and Machines – you might find only one environment is affected, a set of environments or all environments.

Review the SQL Server Error Log – SQL Server creates a log file called “ERRORLOG”. A new ERRORLOG file is created every time SQL Server starts up.

Review the Event Log – use the Event Viewer to look at the different event log records. The event log contains both informational warnings and error events.

Review the Default Trace – this trace captures all configuration changes to an instance.

Review the Change Log – a change log is some centralized location that identifies all changes the have been introduced.

Develop a testing plan – identify how the application is connecting to SQL Server and the T-SQL code that is being executed.

Backup Database – this backup will provide you a recovery point should you want to start tweaking SQL Server as part of your diagnostics troubleshooting steps.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

I: CACHING

A

Caching allows you to efficiently reuse previously retrieved or computed data.

Trading off capacity for speed, a cache typically stores a subset of data transiently, in contrast to databases whose data is usually complete and durable.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

I: RAID FOR DB

A

Use RAID 10 for high IO databases, when possible, but keep in mind that 50% of the total disk space is used for mirroring.

RAID 5 / RAID 6 have slower write performance, but fast read performance, and are suitable for read-only databases.

Logs should be written to separate RAID arrays from data files.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

I: SQL (Relational DB)

A

Relational Database (Aurora, MySql, Oracle, SQL Server, MariaDB):

is a collection of data items with pre-defined relationships between them.

These items are organized as a set of tables with columns and rows.

Tables are used to hold information about the objects to be represented in the database.

Each column in a table holds a certain kind of data and a field stores the actual value of an attribute.

17
Q

I: READ REPLICAS

A

allow data to be available for reading across any number of servers, called “slaves”. One server remains the “master” and accepts any incoming write requests, along with read requests. This technique is common for relational databases.

18
Q

I: DATABASE PORTS

A

Aurora/MySQL/MariaDB = 3306
PostgeSQL = 5432
Oracle = 1521
SQL Server = 1433

19
Q

I: NORMALIZATION OF DATA

A

the process of eliminating unstructured data and redundancy in order to ensure logical data storage.

20
Q

I: DENORMALIZE TABLE

A

Denormalization is a database optimization technique in which we add redundant data to one or more tables.

Can help with avoiding costly joins in a relational DB.

21
Q

I: CAP THEOREM

A

that a distributed system can deliver only two of three desired characteristics: consistency, availability, and partition tolerance (the ‘C,’ ‘A’ and ‘P’ in CAP).

Consistency – means that all clients see the same data at the same time, no matter which node they connect to.

Availability – means that that any client making a request for data gets a response, even if one or more nodes are down.

A partition tolerance – is a communications break within a distributed system—a lost or temporarily delayed connection between two nodes.

22
Q

I: DATA CHARACTERISTICS

A

Accuracy

Completeness

Consistency

Timeliness

Validity

Uniqueness

23
Q

I: SAGA PATTERN

A

is a way to manage data consistency across microservices in distributed transaction scenarios.

24
Q

I: STATELESS

A

A Stateless server keeps no state information. Stateless file servers do not store any session state.

25
Q

I: DB CLUSTERING

A

is a technology that provides high availability for the entire SQL Server instance. This means that it’s possible to have to servers with identical structure (OS, applications, services, service packs, patches…etc) and with shared storage (the data itself).

26
Q

I: OLTP

A

is a system that manages transaction-oriented applications on the internet for example, ATM.

27
Q

I: KEY-VALUE DB

A

is a type of non-relational database that uses a simple key-value method to store data. A key-value database stores data as a collection of key-value pairs in which a key serves as a unique identifier.

DynamoDB/Cassandra – session state, shopping cart

28
Q

I: NOSQL (Non-Relational)

A

is particularly useful for storing unstructured data, which is growing far more rapidly than structured data and does not fit the relational schemas of RDBMS. Common types of unstructured data include: user and session data; chat, messaging, and log data; time series data such as IoT and device data; and large objects such as video and images.

DynamoDB
Cassandra
MongoDB

Document databases.
Key-value stores.
Column-oriented databases.
Graph databases.