week 7: Databases Flashcards
data governance
an approach to managing information across an entire organization
Uses Master Data Management
what does data governance control?
Master Data (semi-permanent or core data)
Transaction Data (business activities)
what do databases minimize?
Data redundancy
Data isolation
Data inconsistency
Data redundancy
The same data are stored in many places
Data isolation
Applications cannot access data associated with other applications
Data inconsistency
Various copies of the data do not agree
what do Database Management Systems (DBMS) maximize?
Data security
Data integrity
Data independence
Data security
Because data is stored in one place, databases need extremely high security measures to deter mistakes and attacks
Data integrity
Data must meet certain constraints, such as no alphabetic characters in a Social Insurance Number field
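Example (a minimal sketch, not a definitive implementation): the integrity constraint above could be checked in Python roughly as follows. The function name `is_valid_sin_format` and the decision to allow hyphens and spaces as separators are assumptions made for illustration. It checks the format only (nine digits, no letters), not whether the number was actually issued.

```python
def is_valid_sin_format(value: str) -> bool:
    """Return True if value looks like a nine-digit SIN (format check only)."""
    # Assumption: separators such as "123-456-789" or "123 456 789" are allowed.
    digits = value.replace("-", "").replace(" ", "")
    # Integrity constraint: ASCII digits only, and exactly nine of them.
    return digits.isascii() and digits.isdigit() and len(digits) == 9


print(is_valid_sin_format("123-456-789"))  # digits only
print(is_valid_sin_format("12A-456-789"))  # contains a letter
```

A DBMS would normally enforce a rule like this itself, so that every application writing to the field is held to the same constraint.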
Data independence
Applications and data are not linked to each other so that all applications are able to access the same data
bit
(binary digit)
represents the smallest unit of data a computer can process
(e.g. 1 or 0)
byte
represents a single character
often composed of eight bits (e.g. 01101010 which represents a lower case j)
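Example (a short sketch, assuming ASCII encoding): the byte above can be converted to its character and back in Python. The variable names are illustrative only.

```python
bits = "01101010"

# Interpret the eight bits as a base-2 number, then look up the character.
code_point = int(bits, 2)
char = chr(code_point)
print(code_point, char)  # 106 j

# Going the other way: character -> number -> eight-bit string.
back = format(ord("j"), "08b")
print(back)  # 01101010
```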
field
A logical grouping of related characters
record
A logical grouping of related fields
describes database entities
File (or table)
A logical grouping of related records
database
A logical grouping of related files
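Example (a rough sketch of the hierarchy above, field -> record -> file/table -> database, using plain Python objects rather than a real DBMS): the class `StudentRecord`, its fields, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    # Each attribute is a field: a logical grouping of related characters.
    student_id: int
    name: str
    major: str


# A record is a grouping of related fields describing one entity.
ana = StudentRecord(student_id=1, name="Ana", major="Finance")
ben = StudentRecord(student_id=2, name="Ben", major="Marketing")

# A file (or table) is a grouping of related records.
students_table = [ana, ben]

# A database is a grouping of related files (tables).
database = {"students": students_table}

print(database["students"][0].name)
```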
what do Database management systems (DBMS) do?
create and manage a database
database entities
a person, place, thing, or event about which an organization maintains information
database instance
one specific, unique representation of the entity
database attribute
a characteristic or quality of a particular entity
database primary key
a field that uniquely identifies a record
database secondary keys
other identifying fields that typically do not identify the record with complete accuracy
database foreign key fields
used to uniquely identify a row of another table that is linked to the current table
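Example (a minimal sketch using Python's built-in `sqlite3` module): the three kinds of keys above can be seen in a small in-memory database. The table names, column names, and sample rows are hypothetical. Note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been run on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,   -- primary key: unique per record
        last_name  TEXT NOT NULL,         -- secondary key: identifying, but not unique
        dept_id    INTEGER NOT NULL REFERENCES department(dept_id)  -- foreign key
    )
""")

conn.execute("INSERT INTO department VALUES (10, 'Accounting')")
conn.execute("INSERT INTO student VALUES (1, 'Lee', 10)")
conn.execute("INSERT INTO student VALUES (2, 'Lee', 10)")

# The secondary key matches more than one record.
same_name = conn.execute(
    "SELECT COUNT(*) FROM student WHERE last_name = 'Lee'"
).fetchone()[0]

# A foreign key value with no matching department row is rejected.
try:
    conn.execute("INSERT INTO student VALUES (3, 'Kim', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(same_name, fk_rejected)
```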
does every record have a primary key?
yes
what do records describe?
entities
BIG DATA
data so large and complex it cannot be managed by traditional systems
by how much is data increasing each year?
50%
computer data volume
computer-generated from many sources
computer data velocity
flows rapidly into, out of, and within the organization
computer data variety
in addition to numbers and text, it includes images, sound, web-based content and others
dirty data
inaccurate
incomplete
incorrect
duplicate
erroneous (e.g. incorrect spelling)
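Example (a small sketch, assuming records are held as Python dictionaries): two of the dirty-data problems above, incomplete and duplicate records, could be flagged like this. The field names and sample values are made up for illustration, and real data cleaning would need many more checks.

```python
records = [
    {"id": 1, "email": "ana@example.com"},
    {"id": 2, "email": ""},                 # incomplete: missing email
    {"id": 1, "email": "ana@example.com"},  # duplicate of the first record
]

seen = set()
clean, dirty = [], []
for rec in records:
    key = (rec["id"], rec["email"])
    if not rec["email"] or key in seen:
        # Incomplete or already seen: set aside for review.
        dirty.append(rec)
    else:
        seen.add(key)
        clean.append(rec)

print(len(clean), len(dirty))  # 1 2
```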
issues with big data
dirty data
untrusted sources
Big data changes over time, so data quality issues can arise
big data benefits
Making data available
Enabling organizations to conduct experiments
Microsegmenting customers
Creating new business models
Being able to analyze more data
Microsegmenting customers
dividing customers into smaller groups to provide tailored services
DATA WAREHOUSES AND DATA MARTS: CHARACTERISTICS
Organized by business dimension or subject
Use online analytical processing (OLAP)
Integrated
Time Variant
Nonvolatile
Multidimensional
DATA WAREHOUSES AND DATA MARTS include:
Source systems
Data-integration technology and processes that prepare the data for use
Different architectures for storing data in an organization’s data warehouse or data marts
Different tools and applications for the variety of users
Metadata, data quality, and governance processes that ensure the warehouse or mart meets its purposes
data warehouses
repositories of historical data
organized by subject to support decision makers in the organization
data marts
low-cost, scaled-down version of a data warehouse
designed for the end-user needs in a strategic business unit (SBU) or an individual department.