Database Domain Flashcards
I: DATABASE
is an organized collection of data.
Databases are classified by the way they store this data.
SQL vs. NoSQL
I: DATA MODELING
is the process of diagramming data flows. When creating a new or alternate database structure, the designer starts with a diagram of how data will flow into and out of the database.
I: INDEX
is the way to get an unordered table into an order that will maximize the query’s efficiency while searching.
I: STATEFUL DB
remembers client data (state) from one request to the next.
Stateful servers do store session state.
I: SHARDING
is a type of database horizontal partitioning that separates very large databases the into smaller, faster, more easily managed parts called data shards.
I: DB SCALING
SQL databases, they are vertically scalable which means that you can increase the load on a single server by increasing components like RAM, SSD, or CPU.
NoSQL databases are horizontally scalable which means that they can handle increased traffic simply by adding more servers to the database.
I: DB HIGH AVAILABILITY
Amazon RDS Multi-AZ deployments.
When you provision a Multi-AZ DB instance, Amazon RDS automatically creates a primary DB instance and synchronously replicates the data to a standby instance in a different Availability Zone (AZ).
Amazon Aurora increases availability by replicating your data six ways across three Availability Zones. This means that your DB cluster can tolerate a failure of an Availability Zone without any loss of data and only a brief interruption of service.
I: DB MIRRORING
is a primarily software solution for increasing database availability. Mirroring is implemented on a per-database basis and works only with databases that use the full recovery model.
I: TABLE SCANS
is the reading of every row in a table and is caused by queries that don’t properly use indexes.
I: OLAP
OLAP is an online system that reports to multidimensional analytical queries like financial reporting, forecasting, etc.
I: IN-MEMORY DB
In-memory databases are designed to attain minimal response time by eliminating the need to access disks.
Elasticache for Redis/Memcached – real-time bidding, gaming leaderboards, caching
I: DATABASE PERFORMANCE
Query Tuning
Improving Indexing
Input/Output Tuning
Scaling
Caching
I: DATABASE TROUBLESHOOTING
Gathering the Facts – you need to know what kind of problem is happening.
Test in Different environments and Machines – you might find only one environment is affected, a set of environments or all environments.
Review the SQL Server Error Log – SQL Server creates a log file called “ERRORLOG”. A new ERRORLOG file is created every time SQL Server starts up.
Review the Event Log – use the Event Viewer to look at the different event log records. The event log contains both informational warnings and error events.
Review the Default Trace – this trace captures all configuration changes to an instance.
Review the Change Log – a change log is some centralized location that identifies all changes the have been introduced.
Develop a testing plan – identify how the application is connecting to SQL Server and the T-SQL code that is being executed.
Backup Database – this backup will provide you a recovery point should you want to start tweaking SQL Server as part of your diagnostics troubleshooting steps.
I: CACHING
Caching allows you to efficiently reuse previously retrieved or computed data.
Trading off capacity for speed, a cache typically stores a subset of data transiently, in contrast to databases whose data is usually complete and durable.
I: RAID FOR DB
Use RAID 10 for high IO databases, when possible, but keep in mind that 50% of the total disk space is used for mirroring.
RAID 5 / RAID 6 have slower write performance, but fast read performance, and are suitable for read-only databases.
Logs should be written to separate RAID arrays from data files.