Chapter 6: Data and Business Intelligence Flashcards
Information Granularity
Refers to the extent of detail within the information
Detail (fine), summary, aggregate (coarse)
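The three granularity levels can be illustrated with a small sketch. The sales records and field names below are hypothetical, chosen only to show the same information at detail, summary, and aggregate levels.

```python
from collections import defaultdict

# Hypothetical detail-level (fine) records: one row per sale.
sales = [
    {"region": "East", "product": "A", "amount": 120.0},
    {"region": "East", "product": "B", "amount": 80.0},
    {"region": "West", "product": "A", "amount": 200.0},
]

# Summary level: one total per region.
by_region = defaultdict(float)
for sale in sales:
    by_region[sale["region"]] += sale["amount"]

# Aggregate (coarse) level: a single figure for the whole data set.
total = sum(sale["amount"] for sale in sales)

print(dict(by_region))  # {'East': 200.0, 'West': 200.0}
print(total)            # 400.0
```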
Information levels
Individual, department, enterprise
Information formats
Document, presentation, spreadsheet, database
Information Quality
- decisions are only as good as the quality of info used to make them
- you never want to find yourself using technology to help you make a bad decision faster
Characteristics of high quality information
Accurate, complete, consistent, unique, timely
Information Timeliness
An aspect of info that depends on the situation
Real time information
Immediate, up to date information
Real time system
Provides real time information in response to requests
Examples of low quality information
- missing info (no first name)
- incomplete info (no street)
- inaccurate info (invalid email)
- probable duplicate info (similar names, same address, number)
- potential wrong info
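A minimal sketch of how problems like these might be flagged automatically. The customer records, field names, and the deliberately loose email pattern are all assumptions for illustration; real validation rules would be stricter.

```python
import re
from collections import Counter

# Hypothetical customer records; the field names are assumptions.
customers = [
    {"first": "", "last": "Lee", "street": "12 Oak St", "email": "lee@example.com"},
    {"first": "Ana", "last": "Diaz", "street": "", "email": "ana@example.com"},
    {"first": "Sam", "last": "Roy", "street": "9 Elm Ave", "email": "sam.example.com"},
    {"first": "Samuel", "last": "Roy", "street": "9 Elm Ave", "email": "sroy@example.com"},
]

# Loose pattern: something@something.something, with no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def find_issues(records):
    """Return one list of issue labels per record, in input order."""
    # The same last name at the same street suggests a probable duplicate.
    address_counts = Counter(
        (r["last"].lower(), r["street"].lower()) for r in records if r["street"]
    )
    results = []
    for r in records:
        issues = []
        if not r["first"]:
            issues.append("missing first name")
        if not r["street"]:
            issues.append("incomplete address")
        if not EMAIL_PATTERN.match(r["email"]):
            issues.append("invalid email")
        key = (r["last"].lower(), r["street"].lower())
        if r["street"] and address_counts[key] > 1:
            issues.append("probable duplicate")
        results.append(issues)
    return results


issues = find_issues(customers)
for record, found in zip(customers, issues):
    print(record["last"], found)
```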
Primary sources of low quality information
- customers intentionally enter inaccurate info to protect privacy
- different entry standards and formats
- operators enter abbreviated or erroneous info by accident to save time
- third party and external info contains inconsistencies, inaccuracies, and errors
Costs of using low quality information
- inability to track customers
- difficulty identifying valuable customers
- inability to identify selling opportunities
- marketing to nonexistent customers
- difficulty tracking revenue
- inability to build strong customer relationships
Benefits of good information
- significantly improve chances of making a good decision
- good decisions can directly impact an organization's bottom line
Database
- where information is stored
- maintains information about various types of objects, events, people, and places
Database management system (DBMS)
Allows users to create, read, update, and delete data in a relational database
Data element
The smallest or basic unit of information
Data model
Logical data structures that detail the relationships among data elements using graphics or pictures
Metadata
Provides details about data
Data dictionary
Compiles all of the metadata about the data elements in the data model
Entity
A person, place, thing, transaction, or event about which information is stored
- rows in a table contain entities
- primary keys and foreign keys
Attribute (field, column)
The data elements associated with an entity
Record
A collection of related data elements
Primary key
A field that uniquely identifies a given entity in a table
Foreign key
A primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
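The primary key and foreign key definitions above can be sketched with Python's built-in sqlite3 module. The table names, column names, and rows are hypothetical; the point is that `orders.customer_id` repeats the primary key of `customer`, which is what lets the two tables be joined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,   -- primary key: identifies one customer
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,   -- primary key: identifies one order
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),  -- foreign key
    total       REAL NOT NULL
);
INSERT INTO customer VALUES (1, 'Ana Diaz'), (2, 'Sam Roy');
INSERT INTO orders VALUES (100, 1, 25.0), (101, 1, 40.0), (102, 2, 15.0);
""")

# The foreign key provides the logical relationship used by the join.
rows = conn.execute("""
    SELECT c.name, COUNT(o.order_id), SUM(o.total)
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

print(rows)  # [('Ana Diaz', 2, 65.0), ('Sam Roy', 1, 15.0)]
```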
Database advantages
- increased flexibility
- increased scalability and performance
- reduced information redundancy
- increased information integrity (quality)
- increased information security
Increased flexibility
- handle changes quickly and easily
- provide users with different views
- have only one physical view
- have multiple logical views
Physical view
Deals with the physical storage of information on a storage device
Logical view
Focuses on how individual users logically access information to meet their own particular business needs
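One way to picture one physical view with multiple logical views is a single stored table exposed through several SQL views. This is a sketch using sqlite3; the `employee` table and both view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One stored table (the closest thing here to the physical view).
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name TEXT, department TEXT, salary REAL
);
INSERT INTO employee VALUES
    (1, 'Ana', 'Sales', 52000), (2, 'Sam', 'Sales', 48000), (3, 'Kim', 'IT', 61000);

-- Logical view for a staff directory: no salary information.
CREATE VIEW directory AS
    SELECT name, department FROM employee;

-- Logical view for finance: salary totals per department.
CREATE VIEW payroll_by_department AS
    SELECT department, SUM(salary) AS total_salary
    FROM employee GROUP BY department;
""")

directory = conn.execute(
    "SELECT name, department FROM directory ORDER BY name"
).fetchall()
payroll = conn.execute(
    "SELECT department, total_salary FROM payroll_by_department ORDER BY department"
).fetchall()

print(directory)
print(payroll)
```

Both views read the same stored rows, so a change to the table shows up in each view without copying any data.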
Increased scalability and performance
- a database must scale to meet increased demand, while maintaining acceptable performance levels
Scalability
Refers to how well a system can adapt to increased demands
Performance
Measures how quickly a system performs a certain process or transaction
Data redundancy
The duplication of data or storing the same information in multiple places
Information integrity
Measures the quality of information
Integrity constraint
Rules that help ensure the quality of information
- relational integrity constraint
- business critical integrity constraint
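The two kinds of constraint can be sketched in sqlite3. Here a foreign key stands in for a relational integrity constraint (an order must point at an existing customer), and a `CHECK` clause stands in for a business critical one. The rule that quantity must be positive is an assumed business rule, and the tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),  -- relational rule
    quantity    INTEGER NOT NULL CHECK (quantity > 0)               -- assumed business rule
);
INSERT INTO customer VALUES (1, 'Ana Diaz');
""")


def try_insert(order_id, customer_id, quantity):
    """Attempt an insert and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders VALUES (?, ?, ?)",
                (order_id, customer_id, quantity),
            )
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


valid = try_insert(1, 1, 3)          # existing customer, positive quantity
no_customer = try_insert(2, 99, 1)   # customer 99 does not exist
bad_quantity = try_insert(3, 1, 0)   # violates the CHECK rule
print(valid, no_customer, bad_quantity, sep="\n")
```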
Information cleansing or scrubbing
A process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
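A minimal cleansing sketch, assuming records with hypothetical `name` and `email` fields. It fixes inconsistent formatting, discards records that are incomplete or clearly incorrect, and treats a repeated email as a duplicate (that matching rule is an assumption, not a general standard).

```python
def cleanse(records):
    """Fix what can be fixed, discard what cannot, and drop duplicates."""
    seen_emails = set()
    clean = []
    for record in records:
        # Fix: collapse stray whitespace and normalize capitalization.
        name = " ".join(record.get("name", "").split()).title()
        email = record.get("email", "").strip().lower()
        # Discard: incomplete (no name) or incorrect (no "@" in the email).
        if not name or "@" not in email:
            continue
        # Discard: duplicate of a record already kept.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        clean.append({"name": name, "email": email})
    return clean


raw = [
    {"name": "  ana   diaz ", "email": "Ana@Example.com"},
    {"name": "Ana Diaz", "email": "ana@example.com "},
    {"name": "", "email": "nobody@example.com"},
    {"name": "Sam Roy", "email": "sam.example.com"},
]
cleaned = cleanse(raw)
print(cleaned)  # [{'name': 'Ana Diaz', 'email': 'ana@example.com'}]
```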
Increased information security
Password
Access level
Access control
Password
Provides authentication of the user
Access level
Determines who has access to the different types of information
Access control
Determines types of user access, such as read-only access
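The three security features can be sketched together. Everything here is hypothetical: the user, the access levels, and the permission table are made up, and PBKDF2 from the standard library is just one reasonable way to avoid storing passwords in plain text.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """Derive a hash so the plain-text password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


salt = os.urandom(16)
# Each user has a stored password hash and an access level.
users = {
    "ana": {"salt": salt, "hash": hash_password("correct horse", salt), "level": "analyst"},
}

# Access level -> information type -> permitted types of access.
permissions = {
    "analyst": {"sales": {"read"}},  # read-only access to sales
    "manager": {"sales": {"read", "update"}, "payroll": {"read"}},
}


def authenticate(username, password):
    """Password check: does the supplied password match the stored hash?"""
    user = users.get(username)
    if user is None:
        return False
    return hmac.compare_digest(user["hash"], hash_password(password, user["salt"]))


def is_allowed(username, info_type, operation):
    """Access check: may this user's level perform this operation here?"""
    level = users[username]["level"]
    return operation in permissions.get(level, {}).get(info_type, set())


print(authenticate("ana", "correct horse"))  # True
print(is_allowed("ana", "sales", "read"))    # True
print(is_allowed("ana", "sales", "update"))  # False
```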
Data driven websites
An interactive website kept constantly updated and relevant to the needs of its customers using a database
Data driven website advantages
- easy to manage content
- easy to store large amounts of data
- easy to eliminate human errors
Transactional information
Encompasses all of the information contained within a single business process or unit of work, and its primary purpose is to support the performing of daily operational tasks
- airline ticket, packing slip, sales receipt
Analytical information
Encompasses all organizational information, and its primary purpose is to support the performing of managerial analysis tasks
- trends, future growth, sales projection, product statistics
Benefits of data warehousing
Extend the transformation of data into information
- provides the ability to support decision making without disrupting day to day operations
Data warehouse
A logical collection of information gathered from many different operational databases that supports business analysis activities and decision making tasks
- primary purpose is to aggregate information throughout an organization into a single repository for decision making purposes
Extraction, transformation, and loading (ETL)
A process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse
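The three ETL steps can be sketched as follows. The two source systems, their field names, and their date and currency conventions are hypothetical; an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Extract: hypothetical rows pulled from two sources with different conventions.
store_rows = [{"sku": "a-1", "sold_on": "2024-01-05", "amount_usd": "19.99"}]
web_rows = [{"SKU": "A-1", "date": "01/06/2024", "cents": 2500}]


# Transform: map each source onto one common set of definitions
# (upper-case SKU, ISO date, amount in dollars, channel label).
def transform_store(row):
    return (row["sku"].upper(), row["sold_on"], float(row["amount_usd"]), "store")


def transform_web(row):
    month, day, year = row["date"].split("/")
    return (row["SKU"].upper(), f"{year}-{month}-{day}", row["cents"] / 100, "web")


# Load: write the unified rows into the warehouse table.
def load(rows):
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE sales (sku TEXT, sold_on TEXT, amount REAL, channel TEXT)"
    )
    warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    warehouse.commit()
    return warehouse


rows = [transform_store(r) for r in store_rows] + [transform_web(r) for r in web_rows]
warehouse = load(rows)
loaded = warehouse.execute(
    "SELECT sku, sold_on, amount, channel FROM sales ORDER BY sold_on"
).fetchall()
print(loaded)
```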
Data mart
Contains a subset of data warehouse information
Multidimensional analysis
Relational database tables are two-dimensional (rows and columns)
- data warehouses are multidimensional, containing layers of columns and rows
Dimension
A particular attribute of information
Cube
Common term for the representation of multidimensional information
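A cube can be pictured as a mapping from one value per dimension to a measure. The sketch below uses hypothetical product, region, and quarter dimensions; `roll_up` and `slice_cube` are illustrative helper names, not functions from any data warehouse product.

```python
from collections import defaultdict

# Hypothetical facts: (product, region, quarter, units sold).
facts = [
    ("Widget", "East", "Q1", 10),
    ("Widget", "West", "Q1", 7),
    ("Widget", "East", "Q2", 12),
    ("Gadget", "East", "Q1", 5),
]
DIMENSIONS = ("product", "region", "quarter")

# The cube: one cell per combination of dimension values.
cube = defaultdict(int)
for product, region, quarter, units in facts:
    cube[(product, region, quarter)] += units


def roll_up(cube, keep):
    """Sum the cube over every dimension not named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for cell, units in cube.items():
        totals[tuple(cell[i] for i in positions)] += units
    return dict(totals)


def slice_cube(cube, dimension, value):
    """Keep only the cells where one dimension has the given value."""
    position = DIMENSIONS.index(dimension)
    return {cell: units for cell, units in cube.items() if cell[position] == value}


by_product = roll_up(cube, ["product"])
by_region_quarter = roll_up(cube, ["region", "quarter"])
q1_only = slice_cube(cube, "quarter", "Q1")
print(by_product)  # {('Widget',): 29, ('Gadget',): 5}
```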
Data mining
The process of analyzing data to extract information not offered by the raw data alone
Data mining tools
Classification
Estimation
Affinity grouping
Clustering
Structured data
Data already in a database or spreadsheet
Unstructured data
Data that does not exist in a fixed location and can include text documents, PDFs, voice messages, emails
Text mining
Analyzes unstructured data to find trends and patterns in words and sentences
Web mining
Analyzes unstructured data associated with websites to identify consumer behavior and website navigation
Common forms of data mining analysis
Cluster analysis
Association detection
Statistical analysis
Cluster analysis
A technique used to divide an information set into mutually exclusive groups such that the members of each group are as close as possible to one another and the different groups are as far apart as possible
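One common way to perform cluster analysis is k-means, used here as an illustration; the card does not name a specific algorithm. This one-dimensional sketch uses hypothetical customer spending figures and hand-picked starting centers.

```python
def k_means_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move each center
    to the mean of its group; repeat a fixed number of times."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            groups[nearest].append(value)
        # A center with no assigned values stays where it is.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


annual_spend = [20, 25, 30, 200, 210, 220]  # hypothetical customer spending
centers, groups = k_means_1d(annual_spend, centers=[0, 100])
print(centers)  # [25.0, 210.0]
print(groups)   # [[20, 25, 30], [200, 210, 220]]
```

Each value lands in exactly one group, which matches the "mutually exclusive" part of the definition.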
Association detection
Reveals the relationship between variables along with the nature and frequency of the relationships
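A small market-basket sketch of association detection. The baskets are hypothetical, and support and confidence are two standard measures of how frequent and how strong a relationship between items is.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
]

# How often each item, and each pair of items, appears.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)


def support(a, b):
    """Share of all baskets that contain both items (frequency)."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)


def confidence(a, b):
    """Share of baskets containing `a` that also contain `b` (strength)."""
    return pair_counts[tuple(sorted((a, b)))] / item_counts[a]


print(support("bread", "butter"))    # 0.5
print(confidence("milk", "butter"))  # 1.0
```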
Statistical analysis
Performs such functions as information correlations, distributions, calculations, and variance analysis
- forecast
- time series info
Forecast
Predictions made on the basis of time series information
Time series information
Time stamped information collected at a particular frequency
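A simple moving average is one of the most basic ways to forecast from time series information. The monthly figures below are hypothetical, and the three-period window is an arbitrary choice for the sketch.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(series) < window:
        raise ValueError("need at least `window` observations")
    recent = series[-window:]
    return sum(recent) / window


# Hypothetical time series: one sales figure per month, in order.
monthly_sales = [100, 110, 120, 130, 140, 150]
forecast = moving_average_forecast(monthly_sales)
print(forecast)  # 140.0
```

Because it averages recent values, this method lags behind a steady upward trend like the one above; it is shown only to make the link between time series and forecasts concrete.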
Business benefits of high quality information
- information is everywhere
- employees must be able to obtain and analyze different levels, formats, and granularities of info to make decisions
- successfully collecting and analyzing info provides insight on performance