Chapter 7: Databases and Data Warehouses Flashcards
Data can be maintained in one of two ways: the _________, which has no mechanism for tagging, retrieving, and manipulating data, and the _________, which does have such a mechanism.
traditional files approach
database approach
_________ wastes storage space (and consequently money) and is inefficient.
Data redundancy
Inaccuracies affect _______ - the characteristic that the data represents what it is supposed to represent and that it is complete and correct.
data integrity
An ________ is any object about which an organization chooses to collect data.
entity
The smallest piece of data is a _________.
character
A ______ is one piece of information about an entity, such as the last name or first name of a student, or the student’s address.
field
The fields related to the same entity make up a _______.
record
A collection of related records, such as all the records of a college’s students, is called a _______.
file
The program used to build databases, populate them with data, and manipulate the data is called a ________.
database management system (DBMS)
Data is accessed in a database by sending messages called _____, which request data from specific records and/or fields and direct the computer to display the results.
queries
The _________ consists of tables. Its roots are in relational algebra, but you do not have to know relational algebra to build and use them.
relational database model
A ___ is a field whose values identify records either for display or for processing.
key
A _____ combines data from two or more tables.
join table
A _____ is the field by which records in a table are uniquely identified. If your query specified that you wanted the record whose CustomerID value is 36003, the system would retrieve the record of the person you wanted, even if there are more records of people with the same name.
primary key
A _____ is a combination of two or more fields that together serves as a primary key, because it is impractical to use a single field as a primary key.
composite key
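A minimal sketch of primary and composite keys, using Python's built-in sqlite3 module. The table names, column names, and sample rows (customer, enrollment, "Pat Lee") are assumptions made up for illustration; only the CustomerID value 36003 comes from the card above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Primary key: customer_id uniquely identifies each record,
# even when two customers share the same name.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(36003, "Pat Lee"), (36004, "Pat Lee")],
)
row = conn.execute(
    "SELECT customer_id, name FROM customer WHERE customer_id = ?", (36003,)
).fetchone()

# Composite key: neither student_id nor course_id is unique on its own,
# but the pair (student_id, course_id) is.
conn.execute(
    "CREATE TABLE enrollment ("
    " student_id INTEGER, course_id TEXT, grade TEXT,"
    " PRIMARY KEY (student_id, course_id))"
)
conn.execute("INSERT INTO enrollment VALUES (1, 'DB101', 'A')")
conn.execute("INSERT INTO enrollment VALUES (1, 'NET200', 'B')")  # same student, other course: allowed

try:
    conn.execute("INSERT INTO enrollment VALUES (1, 'DB101', 'C')")  # same pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(row, duplicate_rejected)
```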
A ____ is created when a group of employees belongs to only one department. All would have the same department number as a foreign key in their records, and none will have more than one department key.
one-to-many relationship
A ______ can be maintained, for instance, for professors and students in a college database as a professor might have many students, and a student might have many professors.
many-to-many relationship
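The two relationship types above can be sketched with sqlite3. All table names, column names, and sample rows here are hypothetical. The many-to-many case is modeled with a linking table (called teaches in this sketch), which is one common way to represent it in a relational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
-- One-to-many: each employee row holds exactly one dept_no foreign key.
CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT,
    dept_no  INTEGER REFERENCES department(dept_no)
);
-- Many-to-many: a linking table pairs professors with students.
CREATE TABLE professor (prof_id INTEGER PRIMARY KEY, prof_name TEXT);
CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT);
CREATE TABLE teaches (
    prof_id    INTEGER REFERENCES professor(prof_id),
    student_id INTEGER REFERENCES student(student_id),
    PRIMARY KEY (prof_id, student_id)
);
INSERT INTO department VALUES (10, 'Sales'), (20, 'IT');
INSERT INTO employee VALUES (1, 'Ana', 10), (2, 'Ben', 10), (3, 'Cy', 20);
INSERT INTO professor VALUES (1, 'Prof. Kim'), (2, 'Prof. Roy');
INSERT INTO student VALUES (1, 'Dee'), (2, 'Eli');
INSERT INTO teaches VALUES (1, 1), (1, 2), (2, 1);
""")

# One department, many employees.
sales_staff = [r[0] for r in conn.execute(
    "SELECT emp_name FROM employee WHERE dept_no = 10 ORDER BY emp_id")]

# One professor has many students, and one student has many professors.
kim_students = [r[0] for r in conn.execute(
    "SELECT student_name FROM teaches JOIN student USING (student_id) "
    "WHERE prof_id = 1 ORDER BY student_id")]
dee_professors = [r[0] for r in conn.execute(
    "SELECT prof_name FROM teaches JOIN professor USING (prof_id) "
    "WHERE student_id = 1 ORDER BY prof_id")]

print(sales_staff, kim_students, dee_professors)
```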
The ______ uses the object-oriented approach to maintaining records.
object-oriented database model
The combined storage of both data and the procedures that manipulate them is referred to as ______.
encapsulation
The ability in object-oriented structures to create a new object automatically by replicating all or some of the characteristics of a previously developed object (called the parent object) is called _______.
inheritance
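Encapsulation and inheritance can be illustrated with plain Python classes. This is only a sketch of the two object-oriented ideas, not of an object-oriented database product; the class names, attributes, and figures are invented for the example.

```python
class Employee:
    """Encapsulation: the data (name, salary) and the procedures
    that manipulate it live together in one object."""

    def __init__(self, name, salary):
        self.name = name
        self._salary = salary  # leading underscore: internal by convention

    def give_raise(self, percent):
        self._salary = round(self._salary * (1 + percent / 100), 2)

    def annual_salary(self):
        return self._salary


class Manager(Employee):
    """Inheritance: Manager reuses everything defined on the parent
    object (Employee) and adds only what is new."""

    def __init__(self, name, salary, reports):
        super().__init__(name, salary)
        self.reports = list(reports)


boss = Manager("Ana", 80000, ["Ben", "Cy"])
boss.give_raise(5)  # method inherited from Employee
print(boss.name, boss.annual_salary(), boss.reports)
```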
A _______ creates a temporary table that is a subset of the original table or tables. It allows you to create a report containing records that satisfy a condition, create a list with only some fields about an entity, or produce a report from a join table, which combines relevant data from two or more tables. If so desired, the user can save the newly created table.
relational operation
The three most important relational operations are ____, ____, and ____.
select
project
join
____ is the selection of records that meet certain conditions. For example, a human resources manager might need a report showing the entire record of every employee whose salary exceeds $60,000.
Select
______ is the selection of certain columns from a table, such as the salaries of all the employees.
Project
In the relational model, the joining of data from multiple tables is called a ____.
join
________ has become the query language of choice for many developers of relational DBMSs. It is an international standard and is provided with most relational database management programs.
Structured Query Language (SQL)
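The three relational operations can be expressed as SQL queries. The sketch below runs them through Python's sqlite3 module; the employee and department tables and their rows are hypothetical, while the $60,000 salary condition follows the select card above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, dept_no INTEGER
);
INSERT INTO department VALUES (10, 'Sales'), (20, 'IT');
INSERT INTO employee VALUES
    (1, 'Ana', 72000, 10), (2, 'Ben', 55000, 10), (3, 'Cy', 61000, 20);
""")

# Select: entire records (rows) that meet a condition.
selected = conn.execute(
    "SELECT * FROM employee WHERE salary > 60000 ORDER BY emp_id").fetchall()

# Project: only certain columns of every record.
projected = conn.execute(
    "SELECT name, salary FROM employee ORDER BY emp_id").fetchall()

# Join: combine data from two tables through the shared dept_no field.
joined = conn.execute(
    "SELECT e.name, d.dept_name FROM employee AS e "
    "JOIN department AS d ON e.dept_no = d.dept_no ORDER BY e.emp_id").fetchall()

print(selected)
print(projected)
print(joined)
```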
The _____ describes the structure of the database being designed: the names and types of fields in each record type and the general relationships among different sets of records or files.
schema
The description of each table structure and types of fields become part of a _____, which is a repository of information about the data and their organization.
data dictionary
The information describing each field is called _____. It includes, among other things, the source of the data (with contact information) and population rules: what is inserted or updated, and how often.
metadata (data about the data)
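As a rough illustration of a schema and the metadata kept about it, the sketch below defines one table in SQLite and then reads its structure back. Treating SQLite's sqlite_master catalog and PRAGMA table_info as a small data dictionary is an assumption of this example, and the student table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: the names and types of the fields in each record type.
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"
    " last_name TEXT NOT NULL,"
    " gpa REAL)"
)

# SQLite keeps a catalog of the objects it stores.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per field:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (name, col_type, bool(pk))
    for _cid, name, col_type, _notnull, _default, pk
    in conn.execute("PRAGMA table_info(student)")
]

print(tables)
print(columns)
```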
Analyzing an organization’s data and identifying the relationships among the data is called _____.
data modeling
Effective data modeling and design of each database involves the creation of a conceptual blueprint of the database. Such a blueprint is called an _______.
entity relationship diagram (ERD)
An ERD is a graphical representation of all entity relationships and is often consulted to determine a problem with a query or to implement changes.
Transaction data can be used for important management decisions, such as researching market trends or tracking down fraud. Organizing and storing data for such purposes is called _______.
data warehousing
A _______ is a large, typically relational, database that supports management decision making.
data warehouse
Organizations often set up their data warehouse as a collection of _____, smaller collections of data that focus on a particular subject or department.
data marts
Three phases are involved in transferring data from a transactional database to a data warehouse: ______, ______, and ______.
extraction
transformation
loading
In the ____ phase, the builders create the files from transactional databases and save them on the server that holds the data warehouse.
extraction
In the _____ phase, specialists “cleanse” the data and modify it into a form that allows insertion into the data warehouse. For instance, they will check if the data contains any spelling errors and fix them, and also make sure that all data is consistent.
transformation
In the _____ phase, the specialists transfer the transformed files to the data warehouse. They then compare the data in the data warehouse with the original data to confirm completeness.
loading
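The three phases can be sketched end to end with two in-memory SQLite databases standing in for the transactional database and the data warehouse. Everything here is a simplifying assumption: the sales and sales_fact tables, the sample rows, and the cleansing rule (trimming spaces and uppercasing state codes). A real pipeline would extract to files and apply far more checks.

```python
import sqlite3

# Hypothetical transactional database with inconsistent state codes.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, state TEXT, amount REAL)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, " ny", 10.0), (2, "NY ", 5.5), (3, "ca", 7.25)],
)

# Extraction: pull the records out of the transactional database.
extracted = source.execute(
    "SELECT sale_id, state, amount FROM sales ORDER BY sale_id").fetchall()

# Transformation: cleanse the data so that all values are consistent.
transformed = [
    (sale_id, state.strip().upper(), amount)
    for sale_id, state, amount in extracted
]

# Loading: insert the transformed records into the warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_fact (sale_id INTEGER PRIMARY KEY, state TEXT, amount REAL)")
warehouse.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", transformed)
warehouse.commit()

# Compare the warehouse with the original data to confirm completeness.
source_count = source.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
loaded_count = warehouse.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
totals_by_state = warehouse.execute(
    "SELECT state, SUM(amount) FROM sales_fact GROUP BY state ORDER BY state"
).fetchall()

print(source_count == loaded_count, totals_by_state)
```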
_____ is a magnification or expansion of the amount, types, and level of detail of data that is collected and stored. It is data specifically collected about and from individuals.
Big Data
The collection and storage of ever-more detailed quantities of data.
______ is a data mining method that uses a combination of natural language processing, computational linguistics, and text analytics to identify and extract subjective information in source materials.
Sentiment analysis