Databases and Information Management Flashcards
is a subset of a data warehouse in which a summarized or highly focused portion of the organization’s
data is placed in a separate database for a specific population of users.
Data mart
where the same attribute may have different values
Data Inconsistency
Events are linked over time.
Sequences
occurs when different groups in an organization independently
collect the same piece of data and store it independently of each other.
Data Redundancy
A group of related files makes up a
Database
deals with the policies and
processes for managing the availability, usability, integrity, and security of the
data employed in an enterprise, with special emphasis on promoting privacy,
security, data quality, and compliance with government regulations.
Data governance
These tools are
able to extract key elements from large unstructured data sets, discover
patterns and relationships, and summarize the information. Businesses might
turn to text mining to analyze transcripts of calls to customer service centers to
identify major service and repair issues.
Text mining tools
acts as an interface
between application programs and the physical data files.
DBMS
supports multidimensional data analysis, enabling users to
view the same data in different ways using multiple dimensions.
OLAP - Online analytical processing
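A minimal sketch of the idea behind multidimensional analysis, assuming a small made-up sales table. The rows, the `total_by` helper, and the dimension names are hypothetical and only illustrate viewing the same data along different dimensions; a real OLAP tool does far more.

```python
from collections import defaultdict

# Hypothetical sales facts; each row carries three dimensions and one measure.
sales = [
    {"product": "Nuts", "region": "East", "quarter": "Q1", "units": 50},
    {"product": "Nuts", "region": "West", "quarter": "Q1", "units": 60},
    {"product": "Bolts", "region": "East", "quarter": "Q1", "units": 70},
    {"product": "Bolts", "region": "East", "quarter": "Q2", "units": 80},
]

def total_by(rows, *dimensions):
    """Sum units for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["units"]
    return dict(totals)

# The same data viewed two different ways.
by_product = total_by(sales, "product")
by_region_quarter = total_by(sales, "region", "quarter")
print(by_product)
print(by_region_quarter)
```

Changing the arguments to `total_by` changes the view without changing the underlying data.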
Rows are commonly referred to as records, or in very technical terms, as
Tuples
is software that permits an
organization to centralize data, manage them efficiently, and provide access
to the stored data by application programs.
Database management system
is a person, place, thing, or event on
which we store and maintain information
Entity
capability to specify the structure of the content
of the database. It would be used to create database tables and to define the
characteristics of the fields in each table
Data definition
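As a rough illustration of data definition, the sketch below uses Python's built-in sqlite3 module to create a table and then read back the characteristics of its fields. The table follows the SUPPLIER example used in these cards; the columns other than the supplier number are assumptions made for the example.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Data definition: create a table and declare each field's characteristics.
conn.execute(
    """
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,
        supplier_name   TEXT NOT NULL,
        supplier_city   TEXT
    )
    """
)

# Read the structure back; each row of table_info describes one field
# (column 1 is the field name, column 2 is its declared type).
fields = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(supplier)")]
print(fields)
```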
presents data as they would be
perceived by end users or business specialists.
Logical view
represents a single character, which can be a letter, a number, or another symbol.
Byte
Types of information obtainable from data mining are
Associations, sequences, classifications, clusters, and forecasts.
stores the data and procedures that act on those data as objects that can be automatically retrieved and shared
Object-oriented DBMS
A group of related fields, such as the student’s name, the course taken,
the date, and the grade, comprises a
Record
represents data as two-dimensional tables (called relations). Tables may be referred to as files. Each table contains data on an
entity and its attributes.
Relational DBMS
is a database that stores current and historical data of
potential interest to decision makers throughout the company.
consolidates and standardizes
information from different operational databases so that the information can be
used across the enterprise for management analysis and decision making
Data warehouse
Problems with traditional file environment
- Data Redundancy and Inconsistency
- Program-Data Dependence
- Lack of Flexibility
- Poor Security
- Lack of data sharing and availability
Data Hierarchy
bit
byte
field
record
file
database
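The levels of the hierarchy can be sketched with ordinary Python values. This is only an illustration: the student data is made up, and ASCII encoding is assumed so that one character occupies exactly one byte.

```python
# Byte: one character stored as one byte (ASCII assumed for this sketch).
char = "A"
byte = char.encode("ascii")
# Bit: the eight binary digits inside that byte.
bits = format(byte[0], "08b")

# Field: a group of characters forming a word or number.
field = "Smith"

# Record: a group of related fields.
record = {"name": field, "course": "IS 101", "date": "2024-05-01", "grade": "A"}

# File: a group of records of the same type.
course_file = [
    record,
    {"name": "Lee", "course": "IS 101", "date": "2024-05-01", "grade": "B"},
]

# Database: a group of related files.
database = {"course": course_file, "student": []}

print(bits, len(byte), len(course_file), sorted(database))
```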
The process of creating small, stable, yet flexible and
adaptive data structures from complex groups of data
Normalization
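A small sketch of what normalization achieves, assuming a flat list of part rows that repeat supplier details. All names and values here are hypothetical. The flat rows are split into a supplier table and a part table so that each supplier's name is stored only once.

```python
# Flat, un-normalized rows: the supplier name repeats for every part.
flat_rows = [
    {"part_number": 137, "part_name": "Door latch", "supplier_number": 8259, "supplier_name": "Acme"},
    {"part_number": 145, "part_name": "Side mirror", "supplier_number": 8259, "supplier_name": "Acme"},
    {"part_number": 150, "part_name": "Door seal", "supplier_number": 8444, "supplier_name": "Bolt Co"},
]

suppliers = {}  # keyed by supplier_number, one entry per supplier
parts = []      # each part keeps only a reference to its supplier

for row in flat_rows:
    suppliers[row["supplier_number"]] = {"supplier_name": row["supplier_name"]}
    parts.append(
        {
            "part_number": row["part_number"],
            "part_name": row["part_name"],
            "supplier_number": row["supplier_number"],
        }
    )

print(suppliers)
print(parts)
```

After the split, renaming a supplier means changing one entry instead of every part row that mentions it.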
In a client/server environment, the DBMS resides on a
dedicated computer called a
Database server
has a rudimentary data dictionary capability that displays information about the size,
format, and other characteristics of each field in a database.
Microsoft Access
Three basic operations in a relational DBMS
Select, join and project
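The three operations can be sketched on plain lists of dictionaries. This is an illustration under assumed data, not how a DBMS implements them; the `select`, `join`, and `project` helpers and the sample rows are hypothetical.

```python
parts = [
    {"part_number": 137, "part_name": "Door latch", "supplier_number": 8259},
    {"part_number": 150, "part_name": "Door seal", "supplier_number": 8444},
    {"part_number": 152, "part_name": "Compressor", "supplier_number": 8259},
]
suppliers = [
    {"supplier_number": 8259, "supplier_name": "Acme"},
    {"supplier_number": 8444, "supplier_name": "Bolt Co"},
]

def select(table, predicate):
    """Select: keep only the rows that meet a condition."""
    return [row for row in table if predicate(row)]

def join(left, right, key):
    """Join: combine rows from two tables that share a key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

def project(table, columns):
    """Project: keep only the named columns."""
    return [{c: row[c] for c in columns} for row in table]

wanted = select(parts, lambda r: r["part_number"] in (137, 150))
joined = join(wanted, suppliers, "supplier_number")
result = project(joined, ["part_number", "part_name", "supplier_name"])
print(result)
```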
use data mining techniques, historical data, and assumptions about future conditions to predict outcomes of events, such as the probability a customer will respond to an offer or purchase a specific product
Predictive analytics
The design group
establishes the physical database, the logical relations among elements, and the
access rules and security procedures.
Database administration
The most prominent data manipulation language today is
Structured Query Language (SQL)
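A short sketch of SQL used as a data manipulation language, run through Python's built-in sqlite3 module. The `part` table, its columns, and the prices are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part (part_number INTEGER PRIMARY KEY, part_name TEXT, unit_price REAL)"
)

# INSERT adds records.
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [(137, "Door latch", 22.0), (145, "Side mirror", 12.0), (150, "Door seal", 6.0)],
)

# UPDATE changes existing records.
conn.execute("UPDATE part SET unit_price = 7.0 WHERE part_number = 150")

# SELECT retrieves the records that match a condition.
cheap = conn.execute(
    "SELECT part_name FROM part WHERE unit_price < 20 ORDER BY part_number"
).fetchall()
print(cheap)
```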
A grouping of characters into a word, a group of words, or a complete number
Field
uses predictions in
a different way. It uses a series of existing values to forecast what other values
will be
Forecasting
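One very simple way to forecast from a series of existing values is a moving average. The sketch below uses that technique only as a stand-in; data mining tools use more sophisticated methods, and the function name and sales figures are hypothetical.

```python
def moving_average_forecast(values, window=3):
    """Forecast the next value as the mean of the last `window` values."""
    if len(values) < window:
        raise ValueError("need at least `window` existing values")
    recent = values[-window:]
    return sum(recent) / window

monthly_sales = [100, 110, 120, 130]
next_month = moving_average_forecast(monthly_sales)
print(next_month)
```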
Rules that ensure that relationships between coupled tables remain consistent.
Referential integrity
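A minimal sketch of the rule in action, using Python's built-in sqlite3 module. The table and column names are assumptions for the example. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

conn.execute(
    "CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT)"
)
conn.execute(
    "CREATE TABLE part ("
    " part_number INTEGER PRIMARY KEY,"
    " supplier_number INTEGER NOT NULL REFERENCES supplier(supplier_number))"
)

conn.execute("INSERT INTO supplier VALUES (8259, 'Acme')")
conn.execute("INSERT INTO part VALUES (137, 8259)")  # supplier 8259 exists: accepted

try:
    # Supplier 9999 does not exist, so this row would break the relationship.
    conn.execute("INSERT INTO part VALUES (150, 9999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```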
It is more discovery driven. It provides insights into corporate data that cannot be obtained with OLAP by finding hidden patterns and relationships in large databases and inferring rules from them to predict future behavior.
Data Mining
is the presence of duplicate data in multiple data files so that the same data are stored in more than one place or location.
Data Redundancy
The discovery and analysis of useful patterns and information from the World
Wide Web
Web mining
The field for Supplier_Number in the SUPPLIER table uniquely identifies
each record so that the record can be retrieved, updated, or sorted. It is called a
Key Field
is the process of extracting
knowledge from the content of Web pages, which may include text, image, audio,
and video data.
Web content mining
a group of records of the same
type
File
specifies the organization’s rules for sharing,
disseminating, acquiring, standardizing, classifying, and inventorying information.
lays out specific procedures and accountabilities,
identifying which users and organizational units can share information, where
information can be distributed, and who is responsible for updating and
maintaining the information.
Information policy
is an automated or
manual file that stores definitions of data elements and their characteristics.
Data dictionary
is responsible for the specific
policies and procedures through which data can be managed as an organizational
resource.
Data administration
consists of activities for
detecting and correcting data in a database that are incorrect, incomplete,
improperly formatted, or redundant.
not only corrects errors
but also enforces consistency among different sets of data that originated in
separate information systems.
Data cleansing or data scrubbing
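A rough sketch of the kinds of fixes data cleansing covers: improperly formatted values are normalized, incomplete records are set aside, and redundant records are dropped. The customer rows and the `cleanse` helper are hypothetical, and real cleansing tools apply many more rules.

```python
raw = [
    {"customer_id": "001", "email": " Ann@Example.com "},
    {"customer_id": "1", "email": "ann@example.com"},   # same customer, redundant
    {"customer_id": "002", "email": ""},                # incomplete
    {"customer_id": "003", "email": "bob@example.com"},
]

def cleanse(rows):
    """Return (clean, rejected) after fixing format and removing duplicates."""
    seen = set()
    clean, rejected = [], []
    for row in rows:
        fixed = {
            "customer_id": int(row["customer_id"]),   # consistent ID format
            "email": row["email"].strip().lower(),    # consistent email format
        }
        if not fixed["email"]:
            rejected.append(fixed)                    # incomplete: needs follow-up
        elif fixed["customer_id"] not in seen:
            seen.add(fixed["customer_id"])
            clean.append(fixed)                       # first copy of this customer
    return clean, rejected

clean, rejected = cleanse(raw)
print(clean)
print(rejected)
```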
is a collection of data
organized to serve many applications efficiently by centralizing the data and
controlling redundant data.
Database
It refers to the coupling of data stored in files and the
specific programs required to update and maintain those files such that changes
in programs require changes to the data.
Program-data dependence
shows how data are actually organized and structured on physical storage
media.
Physical View
recognizes patterns that describe the group to which an item
belongs by examining existing items that have been classified and by
inferring a set of rules.
Classification
DBMS includes tools for accessing and manipulating information in databases.
Most DBMS have a specialized language called a
data manipulation language
works in a manner similar to classification when no groups have yet
been defined.
Clustering
represents the smallest unit of data a computer can handle.
Bit
Each characteristic or quality
describing a particular entity
Attribute
is a structured survey of the accuracy and level of
completeness of the data in an information system. It can be
performed by surveying entire data files, surveying samples from data files, or
surveying end users for their perceptions of data quality.
Data quality audit
This key field is the unique identifier for all the information in any row of the table
Primary key
are occurrences linked to a single event.
Associations
Databases in the cloud
Cloud computing
providers offer database management services, but these services typically
have less functionality than their on-premises counterparts.