Database Architecture Flashcards

1
Q

Blank describes the components of a computer system and the relationships between components

A

Architecture

2
Q

MySQL components are organized in four blank

A

Layers

3
Q

Name the four layers of MySQL components

A

Tools

The query processor

The storage engine

The file system

4
Q

Blank interact directly with database users and administrators, and send queries to the query processor.

A

Tools

5
Q

The blank manages connections from multiple users and compiles queries into low-level instructions for the storage engine.

A

Query processor

6
Q

The blank, also called a storage manager, executes instructions, manages indexes, and interacts with the file system. Some storage engines support database transactions, described elsewhere in this material.

A

Storage engine

7
Q

The blank accesses data on storage media. The file system contains both system and user data, such as log files, tables, and indexes.

A

File system

8
Q

The Enterprise Edition includes MySQL Server and components for high-end commercial installations, such as what two components?

A

Monitor
Audit

9
Q

Blank collects and displays information on CPU, memory, and index utilization, as well as queries and results. Database administrators use Enterprise Monitor to manage and tune large databases with many users.

A

Monitor

10
Q

Blank keeps track of all database changes. For each change, Audit tracks the time of change and who made the change. Audit supports government and business audit requirements for sensitive databases such as financial, medical, and defense.

A

Audit

11
Q

The blank layer includes Connectors and APIs, Workbench, and utility programs.

A

Tools

12
Q

Blank and blank are groups of application programming interfaces, linking applications to the query processor layer.

A

Connectors and APIs

13
Q

Blank are newer and developed by Oracle, which sponsors MySQL.

A

Connectors

14
Q

Blank are older and, with the exception of the C API, developed by other organizations.

A

APIs

15
Q

Most programmers use blank but system programmers may write specialized utilities in C with the C API.

A

Connectors

16
Q

Blank is a desktop application to manage and use databases. Blank is designed for both database administrators and users.

A

Workbench

17
Q

Blank programs include approximately 30 tools, grouped in five categories.

A

Utility

18
Q

Name the five categories of utility programs

A

installation, client, administrative, developer, and miscellaneous tools.

19
Q

Upgrade existing databases to a new MySQL release
Backup databases
Import data to databases
Inspect log files
Administer database servers

Are all things that the blank programs do.

A

utility

20
Q

The blank is a particularly important utility program, commonly used by both database administrators and users. The blank displays the mysql> prompt and processes individual SQL queries interactively.

A

Command-Line Client

21
Q

The query processor layer has two main functions. Name them.

A

manage connections and compile queries.

22
Q

A blank is a link between tools and the query processor.

A

connection

23
Q

Each connection specifies a blank, blank, blank, and blank. The connection manager creates connections and manages communications between tools and the query parser.

A

database name, server address, logon name, and password

24
Q

Blank generates a query execution plan.

A

Query compilation

25
Q

An blank is a detailed, low-level sequence of steps that specify exactly how to process a query.

A

execution plan

26
Q

The blank generates an execution plan in two steps.

A

query processor

27
Q

The query processor generates an execution plan in two steps. Name them.

A

The query parser checks each query for syntax errors and converts valid queries to an internal representation.

The query optimizer reads the internal representation, generates alternative execution plans, estimates execution times, and selects the fastest plan. Estimates are based on heuristics and statistics about data, like the number of rows in each table and the number of values in each column. These statistics are maintained in the data dictionary, described below.
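
The optimizer step can be illustrated with a toy cost comparison. This is a minimal sketch, not MySQL's optimizer: the plan names, the cost formulas, and the statistics are all hypothetical, chosen only to show "generate alternatives, estimate, pick the cheapest".

```python
import math

# Hypothetical statistics of the kind a data dictionary might hold.
stats = {"rows": 1_000_000, "distinct_values": 50_000}

def estimate_full_scan(stats):
    # Assumed cost model: a full table scan touches every row.
    return stats["rows"]

def estimate_index_lookup(stats):
    # Assumed cost model: a tree descent plus the expected matching rows.
    matches = stats["rows"] / stats["distinct_values"]
    return math.log2(stats["rows"]) + matches

def choose_plan(stats):
    # Generate alternative plans, estimate each one, and select the cheapest.
    plans = {
        "full_scan": estimate_full_scan(stats),
        "index_lookup": estimate_index_lookup(stats),
    }
    return min(plans, key=plans.get)

best = choose_plan(stats)
```

With these made-up numbers the index lookup is far cheaper than the scan. On a tiny table with a single distinct value, the same function would pick the full scan instead.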

28
Q

For optimal performance, the query processor layer has a blank that stores reusable information in main memory.

A

cache manager

29
Q

If data used in repeated queries does not change, the cache manager may also blank query results.

A

save

30
Q

The storage engine layer has two main functions. Name them.

A

transaction management and data access.

31
Q

Blank includes the concurrency system, recovery system, and lock manager. These components ensure all transactions are atomic, consistent, isolated, and durable.

A

Transaction management

32
Q

The data access component communicates with the file system and translates table, column, and index reads into blank.

A

block addresses

33
Q

To reduce data access time, the blank retains data blocks from the file system for possible reuse.

A

buffer manager

34
Q

The data blocks are retained in an area of main memory called the blank.

A

buffer

35
Q

The buffer manager is similar to the blank of the query processor layer.

A

cache manager

36
Q

The buffer manager has a fixed amount of blank.

A

memory

37
Q

As the database processes queries and reads blocks, an blank determines which blocks to retain and which to discard.

A

algorithm

38
Q

The InnoDB buffer manager uses a least recently used algorithm, or blank, which tracks the time each block was last accessed and, when space is needed, discards ‘stale’ blocks. Discarded blocks that contain updated data are first saved on disk.

A

LRU algorithm
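
The retain-or-discard behavior described in this card can be sketched as a small buffer with LRU eviction. This is an illustrative sketch of plain LRU, not InnoDB's actual implementation. The class name `BufferManager`, the `disk` dictionary standing in for the file system, and the `dirty` flag are assumptions made for the example.

```python
from collections import OrderedDict

class BufferManager:
    """Toy buffer with a fixed capacity and least-recently-used eviction."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk              # dict: block_id -> data (stands in for the file system)
        self.blocks = OrderedDict()   # block_id -> (data, dirty), oldest first

    def read(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # mark as most recently used
        else:
            self._make_room()
            self.blocks[block_id] = (self.disk[block_id], False)
        return self.blocks[block_id][0]

    def write(self, block_id, data):
        if block_id not in self.blocks:
            self._make_room()
        self.blocks[block_id] = (data, True)    # updated in memory only
        self.blocks.move_to_end(block_id)

    def _make_room(self):
        if len(self.blocks) >= self.capacity:
            victim, (data, dirty) = self.blocks.popitem(last=False)  # stalest block
            if dirty:
                self.disk[victim] = data        # save updated data before discarding

disk = {1: "a", 2: "b", 3: "c"}
buf = BufferManager(capacity=2, disk=disk)
buf.write(1, "A")   # block 1 is updated in the buffer
buf.read(2)
buf.read(3)         # buffer is full, so block 1 is evicted and saved to disk
```

After the last read, block 1 has been written back to `disk` with its updated contents, and the buffer holds blocks 2 and 3.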

39
Q

The database administrator can assign a different blank to each table in a database. InnoDB is the default and most commonly used storage engine.

A

storage engine

40
Q

The blank layer consists of data stored on storage media and organized in files.

A

file system

41
Q

The file system contains three types of data for each database. Name them

A

user data, log files, and a data dictionary.

42
Q

Blank data includes tables and indexes.

A

User

43
Q

Blank files contain a detailed, sequential record of each change applied to a database.

A

Log

44
Q

The recovery system uses blank to restore data in the event of a transaction, system, or storage media failure.

A

log files

45
Q

A blank, also known as a data dictionary, is a directory of tables, columns, keys, indexes, and other objects in a relational database.

A

catalog

46
Q

All relational databases contain a blank. Query processors and storage managers use this information when queries are processed and executed.

A

catalog

47
Q

MySQL uses the term blank instead of catalog

A

‘data dictionary’

48
Q

Data dictionary tables cannot be accessed directly with SELECT, INSERT, UPDATE, and DELETE queries. However, the table contents can be accessed indirectly. The blank query is compiled as a SELECT query against dictionary tables.

A

SHOW
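
The idea that catalog contents are themselves stored as table data can be tried out with SQLite, which ships with Python. This is only an analogy: unlike MySQL's dictionary tables, SQLite's catalog table `sqlite_master` can be queried directly, and the table names below are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE department (code TEXT, title TEXT)")

# SQLite records every table in its catalog table, sqlite_master.
# Listing tables is therefore an ordinary SELECT against the catalog.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

Here `tables` holds the two table names, read from the catalog rather than from any user table.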

49
Q

Multiple computers linked by a network are often grouped in layers, called blank, and arranged in a hierarchy.

A

tiers

50
Q

Prior to 1990, most software ran in a blank, consisting of a personal or corporate computer connected directly to monitors. Although computers often communicated with each other, the dependencies between applications running on different computers were limited.

A

single-tier architecture

51
Q

Since 1990, complex corporate and government applications have increasingly been implemented in a blank.

A

multi-tier architecture

52
Q

In a multi-tier architecture, the top tier consists of computers interacting directly with blank

A

end-users

53
Q

In a multi-tier architecture, the bottom tier consists of blank managing resources like databases and email.

A

servers

54
Q

In a multi-tier architecture, one or more blank execute a variety of functions, such as user authorization, business logic, and communication with other computers.

A

middle tiers

55
Q

Typically, blank run on a middle tier and implement business logic.

A

application programs

56
Q

Since user interaction and data are managed in the top and bottom tiers, applications are easier to write and maintain in a blank.

A

multi-tier architecture

57
Q

Blank is a multi-tier architecture consisting of web browsers and web servers communicating over the internet

A

Web architecture

58
Q

In a web architecture, blank on the top tier, manage user interaction.

A

web browsers

59
Q

In a web architecture, blank, on a middle tier, generate web pages for display on web browsers and transmit user requests to services running on lower tiers.

A

Web servers

60
Q

In a web architecture, blank run application software, process user requests, and communicate with databases and other services.

A

Application servers

61
Q

In a web architecture, blank, such as database and authentication, comprise the bottom tier.

A

Services

62
Q

The term blank refers to either a software or hardware layer

A

tier

63
Q

Prior to 2000, most commercial software was blank, or installed and run on customer computers.

A

on-premise

64
Q

Since 2000, blank have increasingly replaced on-premise software. With blank, a vendor such as Amazon, Microsoft, or Google implements computer services on lower tiers of a web architecture. For a fee, blank are made available over the internet to customers.

A

cloud services

65
Q

Cloud services fall into three broad categories. Name them.

A

Infrastructure-as-a-service, or IaaS
Platform-as-a-service, or PaaS
Software-as-a-service, or SaaS

66
Q

Blank provides computer processing, memory, and storage media, as if the customer were renting a computer. Ex: Elastic Compute Cloud, or EC2, from Amazon Web Services.

A

Infrastructure-as-a-service, or IaaS,

67
Q

Blank provides tools and services, such as databases, application development tools, and messaging services. Ex: Azure is Microsoft’s cloud services environment, offering the SQL Database service.

A

Platform-as-a-service, or PaaS

68
Q

Blank provides complete applications, usually through web browsers on customer machines. Ex: Salesforce offers sales management software, and Google offers document processing applications like Docs, Sheets, and Slides.

A

Software-as-a-service, or SaaS

69
Q

Usually blank are offered on virtual machines.

A

cloud services

70
Q

A blank is a software layer that emulates a complete, independent computing environment.

A

virtual machine, or VM

71
Q

Multiple virtual machines can run on blank, enabling cloud providers to support many customers on the same machine.

A

one computer

72
Q

A cloud database is a database offered as a blank cloud service. Most databases are now available either on-premise or as a cloud service, but cloud database use is growing rapidly.

A

PaaS

73
Q

A blank is a statement or proposition, from which another statement is inferred.

A

premise

74
Q

Premises refers to buildings and land occupied by a business. Thus, blank is technically correct in the context of cloud software. However, on-premise is easier to say and therefore commonly used.

A

on-premises

75
Q

Cloud databases have a number of compelling benefits. Name them (5)

A

Administration.
Security.
Reliability.
Elasticity.
Capital cost.

76
Q

Installing, managing, upgrading, and backing up database systems is time-consuming and complex. With cloud databases, consumers delegate blank to cloud providers.

A

administrative activities

77
Q

Cloud providers are large companies with extensive resources. Cloud providers can invest heavily in blank, providing better security than most cloud customers can.

A

security professionals and infrastructure

78
Q

Cloud providers provide blank computing systems with little or no down-time.

A

redundant

79
Q

Many organizations struggle with daily, monthly, or seasonal fluctuations in processing workload. By averaging fluctuations over many customers, cloud providers provide blank on demand.

A

flexible database resources

80
Q

Cloud providers absorb all initial, or capital, costs of computers and facilities. Blank is recovered by cloud service fees.

A

Capital cost

81
Q

Cloud databases raise blank. Companies entrust data to cloud providers, which may store data on servers located in countries with different privacy regulations.

A

data privacy questions

82
Q

In the United States, data privacy is governed by blank in specific areas, such as medical and financial. As a result, a European company may avoid servers located in the United States.

A

limited regulations

83
Q

Data privacy is a concern primarily for blank, such as financial and medical applications. For organizations that do not manage blank, cloud databases offer convincing benefits and have been widely adopted.

A

sensitive data

84
Q

A blank consists of multiple processors managed by a single operating system instance.

A

parallel computer

85
Q

Parallel computers achieve faster processing speeds by processing multiple instructions blank.

A

concurrently

86
Q

Parallel computers fall into three categories. Name them.

A

A shared memory computer
A shared storage computer
A shared nothing computer

87
Q

In a blank, processors share the same memory and storage media.

A

shared memory computer

88
Q

In a blank, processors share storage media only. Each processor has private memory.

A

shared storage computer

89
Q

In a blank, processors share neither memory nor storage media.

A

shared nothing computer

90
Q

Blank memory is optimal for parallel processing against a common data set in a single memory space.

A

Shared

91
Q

Blank and blank scale to more processors, since processors do not contend for the same memory.

A

Shared storage and shared nothing

92
Q

Multiple computers can communicate via a blank or blank network

A

local or wide area

93
Q

A blank consists of cables extending over a small area, typically within one facility.

A

local area network

94
Q

Local area networks usually use the blank protocol.

A

Ethernet communication

95
Q

A blank spans multiple facilities in different geographic locations, separated by many miles.

A

wide area network

96
Q

Wide area networks may communicate via cables, satellite, or telephone lines, often using blank protocols.

A

internet communication

97
Q

A blank is one of a group of computers connected by either a local or wide area network.

A

node

98
Q

A blank is a group of nodes connected by a local area network, managed by separate operating system instances, and coordinated by specialized cluster management software.

A

cluster

99
Q

A cluster is similar to a blank. Both can execute program instructions in parallel on multiple processors. Both can share storage or share nothing. Computers in a cluster cannot share memory, however, since local area networks are too slow to support memory access.

A

parallel computer

100
Q

Blank can often be decomposed into parts that run concurrently and execute faster on parallel computers or clusters.

A

Queries
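
Decomposition can be pictured with an aggregate over partitioned data: each part is summed separately, then the partial results are combined. This is a minimal sketch in which threads stand in for processors or cluster nodes, and the partition contents are made up. In CPython, threads illustrate the structure but do not speed up CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table data, already split into three partitions.
partitions = [[3, 5, 8], [1, 9], [4, 4, 6]]

def partial_sum(rows):
    # The sub-query each worker runs against its own partition.
    return sum(rows)

with ThreadPoolExecutor(max_workers=3) as pool:
    partial_results = list(pool.map(partial_sum, partitions))

# Combine the partial results into the answer for the original query.
total = sum(partial_results)
```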

101
Q

Parallel and distributed databases exploit blank for faster query execution

A

multiple processors

102
Q

A blank runs on a parallel computer or cluster.

A

parallel database

103
Q

A blank runs on multiple computers connected by a wide area network.

A

distributed database

104
Q

Both parallel and distributed databases present a blank view of data to database users and programmers. The physical location of data on storage media is visible to database administrators only.

A

unified

105
Q

In a parallel database, data location has blank on query processing since local area networks are relatively fast and reliable.

A

limited impact

106
Q

In a distributed database, data location is blank since wide area networks are relatively slow and unreliable. Wide area networks create technical challenges with distributed transactions.

A

significant

107
Q

Despite technical challenges, blank offer compelling benefits for databases with users in many locations

A

distributed databases

108
Q

A blank updates data on multiple nodes of a distributed database.

A

distributed transaction

109
Q

In a distributed transaction, either blank or blank must be successfully updated.

A

all nodes or no nodes

110
Q

Databases commonly implement distributed transactions with a technique called blank.

A

two-phase commit

111
Q

The two-phase commit has four steps. Name them.

A

In phase 1, a central transaction coordinator notifies all participating nodes of the required updates.

Participating nodes receive the notification, store the update in a local log, and send a confirmation message to the transaction coordinator. Participating nodes do not yet commit the update to the database.

Phase 2 begins when the transaction coordinator receives confirmation from all participating nodes. The transaction coordinator now instructs all nodes to commit.

Participating nodes receive the commit message, commit the update to the database, and notify the transaction coordinator of success.

112
Q

The two-phase commit must account for what two failure scenarios?

A

In step 2, if the transaction coordinator does not receive confirmation from all nodes within a fixed time period, the transaction coordinator instructs participating nodes to roll back the update.

In step 4, if a node becomes unavailable and fails to notify the transaction coordinator of success, the transaction coordinator resends the commit message until the node responds.
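
The four steps of two-phase commit and the first failure scenario can be sketched in a few lines. This is a simplified, single-process illustration under stated assumptions: the `Node` class and its methods are hypothetical, a missing confirmation is modeled by a `reachable` flag instead of a real timeout, and the resend loop of the second failure scenario is omitted.

```python
class Node:
    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.log = []      # updates stored locally but not yet committed
        self.data = {}     # committed data

    def prepare(self, update):
        # Step 2: store the update in a local log and confirm, without committing.
        if not self.reachable:
            return False   # models a confirmation that never arrives
        self.log.append(update)
        return True

    def commit(self):
        # Step 4: commit the logged update and report success.
        for key, value in self.log:
            self.data[key] = value
        self.log.clear()
        return True

    def rollback(self):
        self.log.clear()

def two_phase_commit(nodes, update):
    # Phase 1 (steps 1 and 2): notify every node and collect confirmations.
    confirmations = [node.prepare(update) for node in nodes]
    if not all(confirmations):
        # First failure scenario: a confirmation is missing, so roll back everywhere.
        for node in nodes:
            node.rollback()
        return "rolled back"
    # Phase 2 (steps 3 and 4): every node confirmed, so instruct all to commit.
    for node in nodes:
        node.commit()
    return "committed"

a, b = Node("a"), Node("b")
result_ok = two_phase_commit([a, b], ("balance", 100))

c, d = Node("c"), Node("d", reachable=False)
result_fail = two_phase_commit([c, d], ("balance", 100))
```

In the first run both nodes end up with the update. In the second, node `d` never confirms, so node `c` discards its logged update and no node changes.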

113
Q

The two-phase commit ensures updates are applied to either all nodes or no nodes. In the first failure scenario, the transaction blank, and no updates are applied. In the second failure scenario, the transaction blank, and all updates are applied.

A

rolls back
commits

114
Q

Two-phase commit and two-phase locking are different procedures. Two-phase commit governs blank and blank at the end of distributed transactions only. Two-phase locking governs blank and blank of locks during either local or distributed transactions.

A

commit and rollback
acquisition and release

115
Q

A blank updates data on a single node of a distributed database.

A

local transaction

116
Q

Distributed transactions are relatively blank, as multiple nodes must respond before the transaction commits. As a faster alternative, multiple nodes can be updated blank with local transactions

A

slow
independently

117
Q

In a local transaction, the blank notifies participating nodes of required updates.

Nodes blank and confirm with the transaction coordinator.

If a node is unavailable, the transaction coordinator blank until confirmation is received.

A

transaction coordinator
commit immediately
repeats the update message
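
The sequence in this card can be sketched as follows. It is a toy, in-process model: the `Replica` class, the `available` flag, and the `send_updates` helper are hypothetical names, and "repeating the update message" is modeled by calling the helper again with the nodes that have not yet confirmed.

```python
class Replica:
    def __init__(self, available=True):
        self.available = available
        self.data = {}

    def apply(self, key, value):
        # A local transaction: commit immediately and confirm.
        if not self.available:
            return False          # no confirmation is sent
        self.data[key] = value
        return True

def send_updates(replicas, key, value, pending=None):
    # Send the update and return the replicas that have not confirmed yet.
    targets = replicas if pending is None else pending
    return [r for r in targets if not r.apply(key, value)]

r1, r2 = Replica(), Replica(available=False)
pending = send_updates([r1, r2], "clicks", 7)
inconsistent = r1.data != r2.data     # r1 has the update, r2 does not

r2.available = True                   # the node comes back
pending = send_updates([r1, r2], "clicks", 7, pending)   # repeat the message
```

Between the two calls the replicas disagree. After the repeated message is confirmed, both hold the same data.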

118
Q

Local transactions create blank, as nodes are updated at different times.

A

temporary inconsistency

119
Q

The choice of local or distributed transactions depends on blank and blank requirements

A

performance and consistency

120
Q

In many financial databases, all nodes must be consistent at all times. Blank are necessary.

A

Distributed transactions

121
Q

In databases that log website activity, temporary inconsistency may be acceptable. Blank might be used to process updates quickly and support a high volume of web clicks.

A

Local transactions

122
Q

Updates in a distributed transaction are blank, since the updates occur at the same time from the perspective of the database user.

A

synchronous

123
Q

Updates in separate local transactions are blank.

A

asynchronous

124
Q

Databases that use local rather than distributed transactions are called blank.

A

eventually consistent

125
Q

A blank database conforms to all rules at all times.

A

consistent

126
Q

In a distributed database, a blank may govern data on multiple nodes.

A

rule

127
Q

In an blank database, ‘live’ nodes must respond to queries at all times.

A

available

128
Q

A ‘dead’ node may be blank, but ‘live’ nodes must respond regardless of the state of other nodes.

A

unresponsive

129
Q

A blank forms when a network error prevents nodes from communicating.

A

network partition

130
Q

A blank occasionally experiences network partitions since nodes are connected by wide area networks that occasionally fail.

A

distributed database

131
Q

A blank database continues to function when a network partition occurs.

A

partition-tolerant

132
Q

The blank states that a distributed database cannot simultaneously be Consistent, Available, and Partition-tolerant.

A

CAP theorem

133
Q

A distributed database can guarantee blank, but blank, of the CAP properties.

A

any two
not all three

134
Q

As a practical matter, most distributed databases must always function and are therefore partition-tolerant. Consequently, most distributed databases guarantee either blank or blank, but not both.

A

consistency or availability

135
Q

The blank between consistency and availability is relative, not absolute.

A

tradeoff

136
Q

Since wide area networks are relatively slow, the time to propagate an update from one node to another is significant. If a query accesses updated data before all nodes are updated, the database must either return blank or blank.

A

inconsistent data or not respond immediately

137
Q

Rather than choose between consistency and availability, a database must choose blank to provide a consistent response.

A

how long to wait

138
Q

Blank commonly means the percentage of time a database is responsive to users and programs. In the context of the CAP theorem, however, blank is the response of individual nodes rather than the entire database system.

A

Availability

139
Q

In the context of networks, a blank is a subset of nodes. In the context of data storage, a blank is a subset of table data.

A

partition

140
Q

A blank is a copy of an entire database, a table, or a subset of table data.

A

replica

141
Q

A blank maintains two or more replicas on separate storage devices

A

replicated database

142
Q

Data can be replicated in any database with multiple storage devices, such as blank and blank.

A

parallel and distributed databases

143
Q

Replicated databases have several major advantages. Name three.

A

High availability.
Fast concurrent reads.
Local reads.

144
Q

If one storage device fails, the database blank to a replica on another storage device.

A

routes queries

145
Q

In general, if a database maintains N replicas, the database can survive simultaneous failure of blank

A

N-1 storage devices.
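
Routing around failed devices can be sketched as trying replicas in turn. This is an illustrative sketch only: replicas are modeled as dictionaries, a failed storage device is modeled as `None`, and the function name is hypothetical.

```python
def read_from_replicas(replicas, key):
    # Route the query to the first replica whose storage device still works.
    for replica in replicas:
        if replica is not None:
            return replica[key]
    raise RuntimeError("all replicas have failed")

# Three replicas (N = 3), two of which have failed (N - 1 = 2).
replicas = [None, None, {"id": 1}]
value = read_from_replicas(replicas, "id")
```

The read still succeeds with one surviving replica. Only when every replica has failed does the function raise an error.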

146
Q

Concurrent queries can read separate replicas without blank with each other. One large query can be blank into smaller queries that read separate replicas in parallel.

A

interfering
decomposed

147
Q

In a distributed database, reads can be executed blank, eliminating network delays and outages.

A

locally

148
Q

Replicated databases have one major disadvantage. Name it

A

Slow or inconsistent updates

149
Q

Updates must be applied to all replicas on blank. If all replicas on different nodes are updated with a blank, the update is relatively slow. If replicas on different nodes are updated with blank, updates are relatively fast but replicas are temporarily inconsistent.

A

multiple storage devices
distributed transaction
local transactions

150
Q

Blank simplifies some database administration activities but makes others more complex.

A

Replication

151
Q

One replica can be blank while transactions execute against other replicas.

A

backed up

152
Q

Blank can be restricted to one replica, accessible only to trusted database users. Updates are propagated to blank, accessible to a broader user group.

A

Updates
read-only replicas

153
Q

In a replicated database, database administrators must determine how to blank updates across replicas.

A

propagate

154
Q

Blank is commonly used in parallel and distributed databases, particularly when reads are frequent, updates are infrequent, and temporary inconsistency is acceptable.

A

Replication

155
Q

Updating replicated data in a database running on a single node is blank.

A

straightforward

156
Q

Some storage devices, called blank, manage replicas internally, without database intervention. Alternatively, the database can update all replicas within a single local transaction. Either way, synchronizing replicas does not require special database capabilities.

A

storage arrays

157
Q

Updating all replicas in a blank guarantees consistency but is relatively slow and fails when any replica is unavailable.

A

distributed transaction

158
Q

The blank technique designates one node as primary. All updates are first applied to the primary node in local transactions. Secondary nodes are updated after the primary node commits, with independent local transactions. If the primary node fails, the database automatically designates a new primary node to ensure continued availability.

A

primary/secondary
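
The technique can be sketched with a small in-memory model. The `Cluster` class and its methods are hypothetical, and the rule for choosing a new primary (the first remaining node) is an assumption made for the example, not how any particular database elects one.

```python
class Cluster:
    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self.primary = node_names[0]
        self.backlog = []   # updates committed on the primary, not yet on secondaries

    def update(self, key, value):
        # All updates are applied to the primary first, in a local transaction.
        self.nodes[self.primary][key] = value
        self.backlog.append((key, value))

    def propagate(self):
        # Secondaries are updated afterwards, each with its own local transaction.
        for key, value in self.backlog:
            for name, data in self.nodes.items():
                if name != self.primary:
                    data[key] = value
        self.backlog.clear()

    def fail_primary(self):
        # Assumed failover rule: promote the first remaining node.
        del self.nodes[self.primary]
        self.primary = next(iter(self.nodes))

cluster = Cluster(["n1", "n2", "n3"])
cluster.update("x", 1)
stale = cluster.nodes["n2"].get("x")   # secondary has not seen the update yet
cluster.propagate()
cluster.fail_primary()
```

Before `propagate()` runs, a read from a secondary returns stale data, which is the temporary inconsistency the technique accepts. After the primary fails, `n2` takes over with the propagated update.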

159
Q

The blank technique applies updates to any node in a group. Prior to committing, a node broadcasts transaction information to other nodes, which look for conflicts with concurrent transactions. If any node detects a conflict, an algorithm determines which transaction commits and which rolls back. This algorithm may be simple, such as the transaction that commits first wins, or complex. If a network partition occurs and nodes cannot communicate, processing is temporarily suspended.

A

group replication
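
The simple "first to commit wins" rule mentioned in this card can be sketched as follows. This is an illustration only: it assumes every transaction in the list is concurrent with the others and that a conflict means overlapping write sets. The function name `certify` and the write-set representation are assumptions, not part of the source.

```python
def certify(transactions):
    """Decide which concurrent transactions commit.

    `transactions` is a list of (txn_id, write_set) pairs in the order
    the commit requests were broadcast.
    """
    committed, rolled_back = [], []
    claimed = set()  # items written by transactions that already committed
    for txn_id, write_set in transactions:
        if claimed & write_set:
            rolled_back.append(txn_id)   # conflicts with an earlier commit
        else:
            claimed |= write_set
            committed.append(txn_id)     # first to commit wins
    return committed, rolled_back


result = certify([
    ("T1", {"a"}),
    ("T2", {"a", "b"}),
    ("T3", {"c"}),
])
```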

160
Q

Blank update technique is eventually consistent, not partition-tolerant.

A

Group replication

161
Q

Blank update technique is always consistent, not partition-tolerant

A

Distributed transaction

162
Q

Blank update technique is eventually consistent, partition-tolerant.

A

Primary/Secondary

163
Q

A blank is a directory of information describing database objects such as tables, columns, keys, and indexes

A

Catalog

164
Q

Blank is necessary to process queries and access data. Each node in a distributed database can process queries and therefore requires access to the catalog.

A

Catalog information

165
Q

In a distributed database, the catalog can be structured in what two ways

A

In a central catalog
In a replicated catalog

166
Q

In a blank, the entire catalog resides on a single node. Storing the catalog on a single node is relatively easy to manage. However, query processing at remote nodes must access the catalog via a wide area network, which may be slow or unreliable. Furthermore, query processing at all nodes interacts with the blank, which may become a bottleneck.

A

central catalog

167
Q

In a blank, a copy of the catalog resides on each node. Most queries are fast and reliable since all catalog data is available locally. However, statements that update the catalog, such as CREATE, ALTER, and DROP, must update all replicas. Updating replicas generates increased network traffic and, if executed in a distributed transaction, fails when any replica is unavailable.

A

replicated catalog

168
Q

Since catalog updates are infrequent compared to other database queries, many distributed databases use a blank.

A

replicated catalog

169
Q

To improve performance of catalog updates, many databases use a variation of the blank

A

primary/secondary technique

170
Q

Updates are first applied to the replica on the blank containing the affected data object and then propagated to other replicas.

A

node

171
Q

When catalog replicas are updated with blank, some replicas are momentarily out of date. If a query cannot be processed due to an out-of-date replica, the database might display an error and advise the user to resubmit the query. This rarely occurs, however, since catalog updates are infrequent and the delay between replica updates is short.

A

local transactions

172
Q

Organizations use blank to conduct daily business functions.

A

operational data

173
Q

Organizations use blank to understand, manage, and plan the business.

A

analytic data

174
Q

Analytic data is sometimes called blank or blank.

A

reporting data or decision support data

175
Q

Operational and analytic data differ in what four ways

A

Volatility.
Detail.
Scope.
History.

176
Q

Blank changes in real time as business functions are executed. Blank is updated at fixed intervals, often daily or weekly, so that reports and summaries always refer to a known time.

A

Operational data
Analytic data

177
Q

Most blank is detailed, reflecting individual transactions. Blank is often summarized by time period, business unit, geography, and other business dimensions.

A

operational data
Analytic data

178
Q

Most blank are designed for a specific business function. Consequently, operational databases supporting different business functions are often incompatible. Blank combine data from many business functions in an integrated, enterprise-wide view of data, with standard formats, data types, and keys across all tables.

A

operational databases
Analytic databases

179
Q

Many blank are concerned primarily with current data. Blank often track trends over time and therefore usually contain current and historic data. Ex: Operational data may include active employees only. Analytic data may include past employees and illustrate changes in total employment by month.

A

operational databases
Analytic databases

180
Q

Blank and blank are often maintained in separate databases with different designs.

A

operational and analytic data

181
Q

Operational data is blank and blank

A

Volatile and detailed

182
Q

Analytic data is blank, blank, blank, and blank

A

Detailed, Summary, Enterprise-wide, and Historical

183
Q

In a blank database, a database stores each line item of an order, including item name, product code, cost, and quantity.

A

Detailed

184
Q

In a blank database, the database records the current project assignments for each employee. Updates are made right away when assignments change. The database does not retain prior assignments.

A

Volatile

185
Q

In a blank database, each year, employees receive raises for performance and increased cost of living. The database retains salary data for every year employees have worked at the company.

A

historic

186
Q

In a blank database, the database stores the total count of employees working at each corporate facility.

A

summary

187
Q

In a blank database, a report lists employee names alongside employee office numbers. Employee names are extracted from a human resources database. Employee office assignments are stored in a separate facilities database.

A

Enterprise-wide

188
Q

Storing operational and analytic data in the same database creates what three problems:

A

Database design.
Interference.
Reference time.

189
Q

Since blank is volatile, operational databases are typically optimized for updates, with most tables in third normal form. Third normal form minimizes redundancy but generates many tables and is not optimal for blank. Analytic queries often combine columns from many third normal form tables, resulting in complex joins that are difficult to write and slow to run.

A

operational data
analytic queries

190
Q

Blank often summarize large volumes of data. When executed against an operational database, analytic queries compete with blank, degrade query response time, and interfere with business operations.

A

Analytic queries
operational queries

191
Q

Blank usually reference a specific point in time, such as sales totals as of midnight on the last day of the month. Since operational data is volatile, results depend on the blank a query is submitted. Analytic queries against operational databases thus have an uncertain reference time and may be misleading.

A

Analytic queries
precise time

192
Q

A blank is a separate database optimized for analytics rather than operations.

A

data warehouse

193
Q

A data warehouse consists of data extracted from blank and restructured to support analytic queries.

A

operational databases

194
Q

Data is usually extracted periodically, at a fixed time, so that data in the warehouse has a known blank. Data is extracted during times of low database use to minimize impact on operational queries.

A

reference time

195
Q

Data warehouses integrate data from blank for use by the entire organization.

A

multiple business functions

196
Q

A blank is a data warehouse designed for a specific business area, such as sales, human resources, or product development.

A

data mart

197
Q

Since data marts have a blank than a data warehouse, data marts are easier to build and maintain. A data mart can be derived directly from operational databases or indirectly from a data warehouse.

A

smaller scope

198
Q

Data warehouses are refreshed periodically with a five-step process. Name the steps

A

Extract data from operational databases
Cleanse data
Integrate data into a uniform structure.
Restructure data
Load data to the data warehouse.
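
The five steps above can be sketched as a small Python pipeline. This is an illustration under stated assumptions: the two source systems, their column names, and the in-memory `warehouse` dictionary are all hypothetical stand-ins for real databases.

```python
# Hypothetical operational rows from two systems with different formats.
sales_system = [
    {"cust": "  alice ", "amount": "120.50", "day": "2024-01-31"},
    {"cust": "BOB", "amount": "80", "day": "2024-01-31"},
]
web_system = [
    {"customer_name": "Alice", "total": 30.0, "date": "2024-01-31"},
]


def extract():
    # 1. Extract: copy source rows into a staging area.
    return {
        "sales": [dict(r) for r in sales_system],
        "web": [dict(r) for r in web_system],
    }


def cleanse(staging):
    # 2. Cleanse: fix formatting and convert to standard types.
    for row in staging["sales"]:
        row["cust"] = row["cust"].strip().title()
        row["amount"] = float(row["amount"])
    return staging


def integrate(staging):
    # 3. Integrate: map both sources to one uniform structure.
    rows = [{"customer": r["cust"], "amount": r["amount"], "date": r["day"]}
            for r in staging["sales"]]
    rows += [{"customer": r["customer_name"], "amount": r["total"],
              "date": r["date"]} for r in staging["web"]]
    return rows


def restructure(rows):
    # 4. Restructure: summarize by customer and date for analytics.
    totals = {}
    for r in rows:
        key = (r["customer"], r["date"])
        totals[key] = totals.get(key, 0.0) + r["amount"]
    return totals


warehouse = {}


def load(totals):
    # 5. Load: write the restructured data to the warehouse.
    warehouse.update(totals)


load(restructure(integrate(cleanse(extract()))))
```

Because `extract` copies the rows, cleansing happens in the staging area and the operational data is left untouched.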

199
Q

Extract data from operational databases into a temporary database, called a blank. Since the data warehouse already contains data from the prior period, only data that has changed since the prior period is extracted.

A

‘staging area’

200
Q

The five-step process is commonly referred to as the blank, or ETL, process.

A

extract-transform-load

201
Q

Since the ETL process is time-consuming and difficult to automate, many organizations use special software products, called blank, to minimize programming.

A

ETL tools

202
Q

Blank from Informatica is a high-end ETL product intended to manage large extracts for complex organizations.

A

PowerCenter

203
Q

Blank from Microsoft is designed for SQL Server data warehouses.

A

SQL Server Integration Services

204
Q

Blank supports many data sources but is optimized to load Oracle database products.

A

Oracle Data Integrator

205
Q

Reading data from operational systems and writing the data to a temporary database is called blank.

A

extraction

206
Q

Correcting errors in operational data and converting to a standard format is called blank.

A

‘cleansing’

207
Q

Data in different operational systems often have incompatible or missing keys. Creating uniform primary and foreign keys is one example of blank.

A

data integration

208
Q

Converting data from a design optimized for operations to a design optimized for analytics is called blank.

A

restructuring

209
Q

Data warehouses are periodically blank with cleansed, integrated, and restructured data from the temporary database.

A

loaded

210
Q

To simplify analytic queries, data warehouses commonly use a blank.

A

dimensional design

211
Q

A dimensional design, also called a blank, consists of fact and dimension tables:

A

star schema

212
Q

A blank contains numeric data used to measure business performance, such as sales revenue or number of employees.

A

fact table

213
Q

Each row in a fact table consists of numeric fact columns and foreign keys that reference blank.

A

dimension tables

214
Q

A blank contains textual data that describes the fact data, such as product line, organizational unit, and geographical region.

A

dimension table

215
Q

The blank of a fact table is the composite of all foreign keys referencing dimension tables.

A

primary key

216
Q

The primary key of a blank is a small, meaningless integer

A

dimension table
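
A small star schema illustrating the last few cards, sketched with Python's built-in `sqlite3` module. The table and column names (`ProductDim`, `RegionDim`, `SalesFact`) and the sample data are hypothetical; a real warehouse would have more dimensions and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductDim (
    ProductID   INTEGER PRIMARY KEY,   -- small, meaningless integer key
    ProductLine TEXT                   -- textual data describing the facts
);
CREATE TABLE RegionDim (
    RegionID INTEGER PRIMARY KEY,
    Region   TEXT
);
CREATE TABLE SalesFact (
    ProductID INTEGER REFERENCES ProductDim(ProductID),
    RegionID  INTEGER REFERENCES RegionDim(RegionID),
    Revenue   REAL,                     -- numeric fact column
    PRIMARY KEY (ProductID, RegionID)   -- composite of all foreign keys
);
INSERT INTO ProductDim VALUES (1, 'Laptops'), (2, 'Phones');
INSERT INTO RegionDim  VALUES (1, 'East'), (2, 'West');
INSERT INTO SalesFact  VALUES (1, 1, 500.0), (1, 2, 300.0), (2, 1, 200.0);
""")

# An analytic query: summarize the fact by one dimension column.
revenue_by_line = conn.execute("""
    SELECT p.ProductLine, SUM(f.Revenue)
    FROM SalesFact f
    JOIN ProductDim p ON p.ProductID = f.ProductID
    GROUP BY p.ProductLine
    ORDER BY p.ProductLine
""").fetchall()
```

The query needs only one join per dimension, which is the simplification a dimensional design is meant to provide.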

217
Q

Most data warehouses have many blank.

A

fact tables

218
Q

A blank is a sequence of columns in which each column has a one-many relationship to the next column.

A

dimension hierarchy

219
Q

A dimension table usually contains one or more column blank.

A

hierarchies

220
Q

In some cases, several columns are at the same blank of a hierarchy

A

level

221
Q

Blank usually summarize data at one level of one hierarchy from each dimension.

A

Analytic queries

222
Q

For fast execution, frequently used summary data may be computed in advance and stored in a blank.

A

data warehouse

223
Q

Since data warehouses track historical data, dimensional designs usually have blank and blank dimension tables.

A

date and time

224
Q

Each row of the blank dimension table corresponds to a day

A

date

225
Q

Each row of the blank dimension table corresponds to a minute of the day.

A

time

226
Q

The time dimension contains blank rows (24 hours × 60 minutes per hour).

A

1,440
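
The row count can be checked by generating the time dimension directly. In this sketch the column names `TimeID`, `Hour`, and `Minute` are assumptions chosen for illustration.

```python
# One row per minute of the day: 24 hours x 60 minutes per hour.
time_dimension = [
    {"TimeID": hour * 60 + minute, "Hour": hour, "Minute": minute}
    for hour in range(24)
    for minute in range(60)
]
row_count = len(time_dimension)
```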

227
Q

Fact tables contain blank referencing date, time, or both dimensions, to establish the time of a fact.

A

foreign keys

228
Q

The blank and blank dimensions provide an elegant way to track historical data.

A

date and time

229
Q

Foreign keys StartDateID and EndDateID are added to the fact table and indicate the blank of each row. Current rows have an end date in the distant future, such as December 31, 2999.

A

effective dates

230
Q

If a fact changes to a new value on date X, the end date of the current row is set to X and a new row is inserted with what values?

A

The fact column is the new value.
StartDateID refers to date X.
EndDateID refers to December 31, 2999.
Other columns are identical to the prior row.

231
Q

Adding start and end foreign keys to the fact table is called blank. Historical data can be tracked with other designs, but type 2 design is simple, effective, and commonly used.

A

type 2 design for slowly changing dimensions
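
The update rule from the previous card can be sketched in Python. To keep the example short, it stores dates as ISO strings in `StartDate` and `EndDate` instead of foreign keys into a date dimension; that simplification, the `Salary` fact, and the function name `change_fact` are assumptions.

```python
FAR_FUTURE = "2999-12-31"  # end date that marks the current row


def change_fact(fact_rows, employee_id, new_salary, date_x):
    """Close the current row for employee_id and append a new current row."""
    for row in fact_rows:
        if row["EmployeeID"] == employee_id and row["EndDate"] == FAR_FUTURE:
            row["EndDate"] = date_x  # the old value stops applying on date X
            new_row = dict(row, Salary=new_salary,
                           StartDate=date_x, EndDate=FAR_FUTURE)
            fact_rows.append(new_row)  # other columns match the prior row
            return new_row
    raise KeyError(f"no current row for employee {employee_id}")


salary_facts = [
    {"EmployeeID": 7, "Salary": 50000,
     "StartDate": "2020-01-01", "EndDate": FAR_FUTURE},
]
change_fact(salary_facts, 7, 55000, "2024-03-01")
```

After the call, the table holds both the historical row and the current row, so queries can select the value in effect on any date.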

232
Q

A straightforward approach for considering the value of a BI program suggests at least what four dimensions of value?

A

Financial value
Productivity value
Trust value
Risk value

233
Q

A blank is the primary source of information that feeds the analytical processing within an organization. There are a number of different analytic applications that are driven by business needs, yet most, if not all of these applications are driven by the data that has been migrated into a data warehouse.

A

data warehouse

234
Q

A data warehouse is a centralized repository of blank.

A

information

235
Q

A data warehouse is organized around the relevant blank important to the organization.

A

subject areas

236
Q

A data warehouse provides a platform for different blank (both human and automated) to submit queries about enterprise information.

A

consumers

237
Q

A data warehouse is used for analysis and not for blank. The data in a data warehouse is nonvolatile.

A

transaction processing

238
Q

A data warehouse is the target location for blank from multiple sources, both internal and external to an enterprise.

A

integrating data

239
Q

A blank is a subject-oriented data repository, similar in structure to the enterprise data warehouse, but it holds the data necessary for the decision support and BI needs of a specific department or group within the organization.

A

data mart

240
Q

Blank is different from the typical operational or transaction processing systems. There are many proposed definitions of OLAP, most of which describe what OLAP is used for. The most frequently used terms are "multidimensional" and "slice-and-dice."

A

Online analytical processing

241
Q

Online analytical processing tools provide a means for presenting data sourced from a data warehouse or data mart in a way that allows the data consumer to view blank across multiple dimensions. In addition, these metrics are summarized in a way that allows the data consumer blank (which means to expose greater detail) on any particular aspect of the set of facts.

A

comparative metrics
to drill down

242
Q

The data to be analyzed in an OLAP environment are arranged in a way that enables visibility along any of the dimensions. Usually this is described as a blank, although the organization is intended to allow the analyst to fix some set of dimensions and then see aggregates associated with the other dimensional hierarchies

A

cube

243
Q

The value of an OLAP tool is derived from the ability to quickly analyze the data from blank, and so OLAP tools are designed to pre-calculate the aggregations and store them directly in the OLAP databases.

A

multiple points of view
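
One way to picture pre-calculated aggregations is to total a measure for every combination of dimensions ahead of time. The sketch below does that with the standard library; the `build_cube` function, the sample facts, and the dictionary layout are hypothetical and much simpler than a real OLAP engine.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(facts, dimensions, measure):
    """Pre-compute totals of `measure` for every subset of `dimensions`."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(float)
            for fact in facts:
                totals[tuple(fact[d] for d in dims)] += fact[measure]
            cube[dims] = dict(totals)
    return cube


facts = [
    {"product": "Laptops", "region": "East", "revenue": 500.0},
    {"product": "Laptops", "region": "West", "revenue": 300.0},
    {"product": "Phones", "region": "East", "revenue": 200.0},
]
cube = build_cube(facts, ["product", "region"], "revenue")

# Each lookup is now a dictionary access rather than a scan of the facts.
east_total = cube[("region",)][("East",)]
```

Moving from `cube[("region",)]` to `cube[("product", "region")]` exposes greater detail, which corresponds to drilling down.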

244
Q

Data integration is not just limited to extracting data sets from internal sources and loading them into a data warehouse, but focuses on effectively facilitating the blank. Data integration goes beyond ETL, data replication, and change data capture, although these remain key components of the integration fabric.

A

delivery of information to the right places within the appropriate time

245
Q

A basic concept for populating a data warehouse is that data sets from multiple sources are collected and then added to a data repository from which blank can source their input data.

A

analytical applications

246
Q

What are the seven general steps of an ETL process?

A

Get the data from the source location.

Map the data from its original form into a data model that is suitable for manipulation at the staging area.

Validate and clean the data.

Apply any transformations to the data that are required before the data sets are loaded into the repository.

Map the data from its staging area model to its loading model.

Move the data set to the repository.

Load the data into the warehouse.

247
Q

The first part of the ETL process is to assemble the infrastructure needed for aggregating the raw data sets and for the application of the transformation and the subsequent preparation of the data to be forwarded to the data warehouse. This is typically a combination of a hardware platform and appropriate management software that we refer to as the blank.

A

staging area

248
Q

What data is to be extracted essentially relies on what the BI clients expect to see ultimately factored into their analytical applications, and will have been identified as a result of the blank.

A

data requirements analysis process

249
Q

How data should be extracted may depend on the scale of the project, the number (and disparity) of data sources, and how far into the implementation the developers are. Extraction can be as simple as a blank or the use of blank that connect to different originating sources, yet can be complex enough to require blank written in a proprietary programming language.

A

collection of simple SQL queries
adapters
specially designed programs

250
Q

Blank can be used as a way to capture the metadata of a data set

A

Data profiling

251
Q

Blank includes parsing strings representing integer and numeric values and transforming them into the proper representational form for the target machine, and converting physical value representations from one platform to another (EBCDIC to ASCII being the best example).

A

Data type conversion

252
Q

In blank, rules uncovered through the profiling process are applied, along with directed actions that correct data known to be incorrect where the corrections can be automated. This component also covers data-duplicate analysis and elimination and merge/purge.

A

Data cleansing

253
Q

Blank includes exploiting the discovery of table and foreign keys for representing linkage between different tables, along with the generation of alternate (i.e., artificial) keys that are independent of any systemic business rules, mapping keys from one system to another, archiving data domains and codes that are mapped into those data domains, and maintaining the metadata (including full descriptions of code values and master key-lookup tables).

A

Integration

254
Q

In blank, the foreign key relationships exposed through profiling or documented through interaction with subject matter experts are used to check that referential integrity constraints are not violated. This component highlights any nonunique (supposed) key fields and any detected orphan foreign keys.

A

Referential integrity checking
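
A minimal sketch of such a check in Python, assuming parent and child rows are lists of dictionaries. The function name `check_integrity` and the sample customer and order rows are hypothetical.

```python
from collections import Counter


def check_integrity(parent_rows, child_rows, key, foreign_key):
    """Return (nonunique parent keys, child rows with orphan foreign keys)."""
    counts = Counter(row[key] for row in parent_rows)
    # A supposed key that appears more than once is not unique.
    nonunique = sorted(k for k, n in counts.items() if n > 1)
    # An orphan foreign key references no existing parent row.
    orphans = [row for row in child_rows if row[foreign_key] not in counts]
    return nonunique, orphans


customers = [{"id": 1}, {"id": 2}, {"id": 2}]
orders = [{"order": 10, "cust": 1}, {"order": 11, "cust": 3}]
nonunique_keys, orphan_rows = check_integrity(customers, orders, "id", "cust")
```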

255
Q

Blank are transformations based on business rules, new calculations, string manipulations, and the like that are applied during the transformation stage as data moves from source to target. For example, a new "revenue" field might be constructed and populated as a function of "unit price" and "quantity sold."

A

Derivations
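
The revenue example from this card, sketched in Python. The field names `unit_price`, `quantity_sold`, and `revenue` are assumptions about how the staged rows might be labeled.

```python
def add_revenue(rows):
    """Return copies of the rows with a derived `revenue` field."""
    return [
        dict(row, revenue=row["unit_price"] * row["quantity_sold"])
        for row in rows
    ]


staged = [{"item": "widget", "unit_price": 4, "quantity_sold": 25}]
derived = add_revenue(staged)
```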

256
Q

Blank and blank: Frequently, data that is in normalized form when it comes from the source system needs to be broken out into a denormalized form when dimensions are created in repository data tables. Conversely, data sourced from join extractions may be denormalized and may need to be renormalized before it is forwarded to the warehouse.

A

Denormalization and renormalization.

257
Q

Blank of any information that is used for populating summaries or cube dimensions can be performed at the staging area.

A

Aggregation

258
Q

Blank: As a matter of reference for integrity checking, it is always useful to calculate some auditing information, such as row counts, table counts, column counts, and other tests, to make sure that what you have is what you wanted. In addition, some data augmentation can be done to attach provenance information, including source, time and date of extraction, and time and date of transformation.

A

Audit information.

259
Q

Blank: Because nulls can appear in different forms, ranging from system nulls to explicit strings representing different kinds of nulls, it is useful to have a step that transforms the different nulls from disparate systems.

A

Null conversion
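
A minimal sketch of this step in Python. The set of strings treated as nulls is an assumption; each source system would need its own list.

```python
# Hypothetical strings that different source systems use to mean "no value".
NULL_MARKERS = {"", "null", "n/a", "na", "none", "-"}


def convert_null(value):
    """Map system nulls and null-like strings to a single null (None)."""
    if value is None:
        return None
    if isinstance(value, str) and value.strip().lower() in NULL_MARKERS:
        return None
    return value


cleaned = [convert_null(v) for v in ["Smith", "N/A", "  ", None, 0, "NULL"]]
```

Note that the numeric value 0 is kept, since zero is a legitimate value rather than a missing one.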

260
Q
A
261
Q

The blank component of ETL is centered on moving the transformed data into the data warehouse.

A

Loading

262
Q

The loading component of ETL has what two critical issues

A

Target dependencies
Refresh volume and frequency

263
Q

The blank compares pairs of records taken from different data sets to determine if they represent the same entity and are therefore candidates for merging

A

Merge/purge operation
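
The pairwise comparison can be sketched with the standard library's `difflib.SequenceMatcher`. Matching on a normalized name alone and the 0.85 threshold are assumptions made for illustration; real merge/purge tools compare many fields with tuned rules.

```python
from difflib import SequenceMatcher
from itertools import product


def normalize(record):
    # Lowercase the name and collapse repeated whitespace before comparing.
    return " ".join(record["name"].lower().split())


def merge_candidates(set_a, set_b, threshold=0.85):
    """Return id pairs whose names are similar enough to consider merging."""
    candidates = []
    for a, b in product(set_a, set_b):  # every pair across the two data sets
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score >= threshold:
            candidates.append((a["id"], b["id"]))
    return candidates


crm = [{"id": "A1", "name": "John  Smith"}, {"id": "A2", "name": "Mary Jones"}]
billing = [{"id": "B1", "name": "john smith"}, {"id": "B2", "name": "Peter Brown"}]
pairs = merge_candidates(crm, billing)
```

Every record in one set is compared with every record in the other, so the work grows quickly as more data sets are integrated.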

264
Q

The more data sets that are being blank, the greater the amount of work that needs to be done for the integration to complete

A

Integrated

265
Q

Blank refers to the process of discovering patterns that lead to actionable knowledge from large data sets through one or more traditional data mining techniques, such as market basket analysis and clustering

A

Knowledge discovery

266
Q

The process of mining data can be described as a blank.

A

virtuous cycle.

267
Q

Blank is the task of taking a large collection of entities and dividing that collection into smaller groups of entities that exhibit some similarity

A

Clustering

268
Q

Blank is the process of organizing data into predefined classes

A

Classification

269
Q

Blank is a process of assigning some continuously valued numeric value to an object.

A

Estimation

270
Q

The subtle difference between blank and the previous two tasks is that blank is the attempt to classify objects according to some expected future behavior.

A

Prediction

271
Q

Blank is a process of evaluating relationships or associations between data elements that demonstrate some kind of affinity between objects.

A

Affinity grouping

272
Q

The last of the tasks is blank, which is the process of trying to characterize what has been discovered or trying to explain the results of the data mining process.

A

Description

273
Q

An blank is a database that stores data in main memory, instead of or in addition to storage media

A

in-memory database

274
Q

Main memory is much faster than storage media, such as flash memory and disk drives. Consequently, in-memory databases are appropriate for blank, which require fast execution of lengthy queries.

A

Analytic applications

275
Q

In-memory databases are also appropriate for applications that rapidly insert blank, such as data collection from internet devices.

A

High volume of data

276
Q

In-memory databases can now store blank of data, which is adequate for many databases.

A

Terabytes

277
Q

Main memory is volatile and lost when power fails or the database process crashes, so in-memory data is periodically backed up on blank. In-memory databases may also record insert, update, and delete operations in a log file on blank, which can be used to reconstruct databases in the event of a crash.

A

Storage media
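
The logging idea in this card can be sketched as an in-memory table that appends each operation to a file and replays the file after a crash. The class name `InMemoryTable`, the JSON-lines log format, and the temporary file location are assumptions, not how any particular product works.

```python
import json
import os
import tempfile


class InMemoryTable:
    """Rows live in a dict; every change is also appended to a log file."""

    def __init__(self, log_path):
        self.rows = {}
        self.log_path = log_path

    def _log(self, op, key, value=None):
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"op": op, "key": key, "value": value}) + "\n")

    def put(self, key, value):
        self._log("put", key, value)  # record the change on storage media
        self.rows[key] = value        # then apply it in main memory

    def delete(self, key):
        self._log("delete", key)
        self.rows.pop(key, None)

    @classmethod
    def recover(cls, log_path):
        # Rebuild the in-memory rows by replaying the log in order.
        table = cls(log_path)
        with open(log_path, encoding="utf-8") as f:
            for line in f:
                entry = json.loads(line)
                if entry["op"] == "put":
                    table.rows[entry["key"]] = entry["value"]
                else:
                    table.rows.pop(entry["key"], None)
        return table


log_path = os.path.join(tempfile.mkdtemp(), "table.log")
table = InMemoryTable(log_path)
table.put("sensor-1", 20.5)
table.put("sensor-2", 19.0)
table.delete("sensor-1")

# Simulate a crash: discard the table and reconstruct it from the log.
recovered = InMemoryTable.recover(log_path)
```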

278
Q

Blank is an extension to SQL Server supporting in-memory tables. In-memory tables offer the same transaction and recovery options as storage media tables.

A

SQL Server In-Memory OLTP

279
Q

Blank creates in-memory copies of tables. The table source data remains on storage media, grouped by rows in blocks. In-memory copies are physically organized by column, rather than by row. The memory’s columnar organization is optimal for analytic queries, which often summarize large volumes of data from one or two columns.

A

Oracle Database In-Memory
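
The difference between row and column organization can be illustrated with plain Python lists. This only shows the layout idea and does not reflect Oracle's actual storage format; the sample rows are hypothetical.

```python
# Row organization: each record keeps all of its columns together.
rows = [
    {"id": 1, "region": "East", "revenue": 500.0},
    {"id": 2, "region": "West", "revenue": 300.0},
    {"id": 3, "region": "East", "revenue": 200.0},
]

# Column organization: one list per column, built from the row data.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# A summary over one column reads a single list and skips the other columns.
total_revenue = sum(columns["revenue"])
```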

280
Q

MySQL assigns a specific storage engine to individual tables. Both the blank and blank storage engines support in-memory tables. MEMORY does not support transactions or recovery in the event of a failure, and consequently is appropriate for temporary tables only. NDB Cluster supports transactions, recovery, and distributed data, and is recommended for persistent data.

A

MEMORY and MySQL NDB Cluster

281
Q

Blank either run under different database systems or have incompatible schema.

A

Heterogeneous databases

282
Q

Databases with blank might have inconsistent primary and foreign keys, similar tables with different designs, or similar columns with different names and data types

A

incompatible schema

283
Q

The coordinating software layer is called blank since the software lies between application programs and database software.

A

Middleware

284
Q

A blank is a directory of participating database objects, such as tables, columns, and indexes.

A

Global catalog

285
Q

A blank decomposes a federated query into queries for each participating database.

A

Global query processor

286
Q

A blank converts the decomposed queries to the appropriate syntax for each participating database.

A

Database wrapper

287
Q

Some products support blank or SQL/MED, an extension of the SQL standard for federated databases. SQL/MED adds constructs such as nicknames and user mappings to SQL.

A

SQL/Management of External Data

288
Q

A blank is a federated database name for a participating database object, such as a table or column.

A

Nickname

289
Q

A blank associates a federated database user with a participating database user.

A

User mapping

290
Q

Give the syntax for nickname and user mapping

A

CREATE NICKNAME Employee
FOR DB2SERVER.HRdatabase.Emp2table;

CREATE USER MAPPING FOR SamSnead
SERVER DB2SERVER
OPTIONS (REMOTE_AUTHID 'sam.snead@gmail.com', REMOTE_PASSWORD 'X!8sflHn');

291
Q

Although a federated database does not provide a seamless view of data, a federated database is relatively easy to build and often the only practical way to blank from existing, incompatible databases.

A

Combine data

292
Q

A blank is an analytic database of raw, unprocessed data copied from multiple data sources.

A

Data lake

293
Q

Data lakes share some characteristics of blank and some characteristics of blank

A

data warehouses and federated databases

294
Q

Like a blank, a data lake is a separate database designed for analytic queries and consisting of data extracted from multiple source systems.

A

Data warehouse

295
Q

Like a blank, data in a data lake is not cleansed, integrated, or restructured. Data is stored in the original format and structure. Depending on the data source, data may be loaded continuously rather than at fixed intervals.

A

Federated database

296
Q

Data lakes often contain blank of data, such as sensor data or website clicks. Data lakes also contain blank, such as images, video, and text documents, which consume megabytes or gigabytes per data item. As a result, data lakes usually require a large amount of storage and utilize inexpensive, but relatively slow, storage media.

A

Large volumes
Unstructured data

297
Q

Data lakes are more suitable for blank, who are trained to work with complex, unstructured data, than for business analysts.

A

Data scientists