Database Domain Flashcards by KeepingIt Anonymous

I: DATABASE

is an organized collection of data.

Databases are classified by the way they store this data.

SQL vs. NoSQL

How well did you know this?

Not at all

Perfectly

I: DATA MODELING

is the process of diagramming data flows. When creating a new or alternate database structure, the designer starts with a diagram of how data will flow into and out of the database.

How well did you know this?

Not at all

Perfectly

I: INDEX

is the way to get an unordered table into an order that will maximize the query’s efficiency while searching.

How well did you know this?

Not at all

Perfectly

I: STATEFUL DB

remembers client data (state) from one request to the next.

Stateful servers do store session state.

How well did you know this?

Not at all

Perfectly

I: SHARDING

is a type of database horizontal partitioning that separates very large databases the into smaller, faster, more easily managed parts called data shards.

How well did you know this?

Not at all

Perfectly

I: DB SCALING

SQL databases, they are vertically scalable which means that you can increase the load on a single server by increasing components like RAM, SSD, or CPU.

NoSQL databases are horizontally scalable which means that they can handle increased traffic simply by adding more servers to the database.

How well did you know this?

Not at all

Perfectly

I: DB HIGH AVAILABILITY

Amazon RDS Multi-AZ deployments.

When you provision a Multi-AZ DB instance, Amazon RDS automatically creates a primary DB instance and synchronously replicates the data to a standby instance in a different Availability Zone (AZ).

Amazon Aurora increases availability by replicating your data six ways across three Availability Zones. This means that your DB cluster can tolerate a failure of an Availability Zone without any loss of data and only a brief interruption of service.

How well did you know this?

Not at all

Perfectly

I: DB MIRRORING

is a primarily software solution for increasing database availability. Mirroring is implemented on a per-database basis and works only with databases that use the full recovery model.

How well did you know this?

Not at all

Perfectly

I: TABLE SCANS

is the reading of every row in a table and is caused by queries that don’t properly use indexes.

How well did you know this?

Not at all

Perfectly

I: OLAP

OLAP is an online system that reports to multidimensional analytical queries like financial reporting, forecasting, etc.

How well did you know this?

Not at all

Perfectly

I: IN-MEMORY DB

In-memory databases are designed to attain minimal response time by eliminating the need to access disks.

Elasticache for Redis/Memcached – real-time bidding, gaming leaderboards, caching

How well did you know this?

Not at all

Perfectly

I: DATABASE PERFORMANCE

Query Tuning

Improving Indexing

Input/Output Tuning

Scaling

Caching

How well did you know this?

Not at all

Perfectly

I: DATABASE TROUBLESHOOTING

Gathering the Facts – you need to know what kind of problem is happening.

Test in Different environments and Machines – you might find only one environment is affected, a set of environments or all environments.

Review the SQL Server Error Log – SQL Server creates a log file called “ERRORLOG”. A new ERRORLOG file is created every time SQL Server starts up.

Review the Event Log – use the Event Viewer to look at the different event log records. The event log contains both informational warnings and error events.

Review the Default Trace – this trace captures all configuration changes to an instance.

Review the Change Log – a change log is some centralized location that identifies all changes the have been introduced.

Develop a testing plan – identify how the application is connecting to SQL Server and the T-SQL code that is being executed.

Backup Database – this backup will provide you a recovery point should you want to start tweaking SQL Server as part of your diagnostics troubleshooting steps.

How well did you know this?

Not at all

Perfectly

I: CACHING

Caching allows you to efficiently reuse previously retrieved or computed data.

Trading off capacity for speed, a cache typically stores a subset of data transiently, in contrast to databases whose data is usually complete and durable.

How well did you know this?

Not at all

Perfectly

I: RAID FOR DB

Use RAID 10 for high IO databases, when possible, but keep in mind that 50% of the total disk space is used for mirroring.

RAID 5 / RAID 6 have slower write performance, but fast read performance, and are suitable for read-only databases.

Logs should be written to separate RAID arrays from data files.

How well did you know this?

Not at all

Perfectly

I: SQL (Relational DB)

Study These Flashcards

Relational Database (Aurora, MySql, Oracle, SQL Server, MariaDB):

is a collection of data items with pre-defined relationships between them.

These items are organized as a set of tables with columns and rows.

Tables are used to hold information about the objects to be represented in the database.

Each column in a table holds a certain kind of data and a field stores the actual value of an attribute.

I: READ REPLICAS

Study These Flashcards

allow data to be available for reading across any number of servers, called “slaves”. One server remains the “master” and accepts any incoming write requests, along with read requests. This technique is common for relational databases.

I: DATABASE PORTS

Study These Flashcards

Aurora/MySQL/MariaDB = 3306
PostgeSQL = 5432
Oracle = 1521
SQL Server = 1433

I: NORMALIZATION OF DATA

Study These Flashcards

the process of eliminating unstructured data and redundancy in order to ensure logical data storage.

I: DENORMALIZE TABLE

Study These Flashcards

Denormalization is a database optimization technique in which we add redundant data to one or more tables.

Can help with avoiding costly joins in a relational DB.

I: CAP THEOREM

Study These Flashcards

that a distributed system can deliver only two of three desired characteristics: consistency, availability, and partition tolerance (the ‘C,’ ‘A’ and ‘P’ in CAP).

Consistency – means that all clients see the same data at the same time, no matter which node they connect to.

Availability – means that that any client making a request for data gets a response, even if one or more nodes are down.

A partition tolerance – is a communications break within a distributed system—a lost or temporarily delayed connection between two nodes.

I: DATA CHARACTERISTICS

Study These Flashcards

Accuracy

Completeness

Consistency

Timeliness

Validity

Uniqueness

I: SAGA PATTERN

Study These Flashcards

is a way to manage data consistency across microservices in distributed transaction scenarios.

I: STATELESS

Study These Flashcards

A Stateless server keeps no state information. Stateless file servers do not store any session state.

I: DB CLUSTERING

is a technology that provides high availability for the entire SQL Server instance. This means that it's possible to have to servers with identical structure (OS, applications, services, service packs, patches...etc) and with shared storage (the data itself).

I: OLTP

is a system that manages transaction-oriented applications on the internet for example, ATM.

I: KEY-VALUE DB

is a type of non-relational database that uses a simple key-value method to store data. A key-value database stores data as a collection of key-value pairs in which a key serves as a unique identifier. DynamoDB/Cassandra – session state, shopping cart

I: NOSQL (Non-Relational)

is particularly useful for storing unstructured data, which is growing far more rapidly than structured data and does not fit the relational schemas of RDBMS. Common types of unstructured data include: user and session data; chat, messaging, and log data; time series data such as IoT and device data; and large objects such as video and images. DynamoDB Cassandra MongoDB Document databases. Key-value stores. Column-oriented databases. Graph databases.

Database Domain Flashcards

(28 cards)