Chapter 4 - Predictive Analytics I: Data Mining Flashcards
What is Data Mining?
A term used to describe discovering knowledge from large amounts of data.
How are companies dealing with data as it relates to understanding their customers?
They are analyzing the vast amounts of data that they collect. Data mining helps them manage mission-critical tasks with a high level of accuracy and timeliness.
What are some reasons businesses have turned to Data Mining? (7)
- More intense competition at the global scale
- Untapped value hidden in large data sources
- Consolidation and integration of database records
- Consolidation of databases into a single location
- Exponential increase in data processing & storage technologies
- Significant reduction in cost of hardware and software
- Movement toward demassification of business practices
What is Genomic Data?
It combines genetics with statistical data analysis and computer science.
What are Four Example Uses of Data Mining?
- Detect and reduce fraudulent activities
- Identify customer buying patterns and reclaim profitable customers
- Identify trading rules from historical data
- Increase profitability using market-basket analysis
What are the Seven (1-3) Characteristics and Objectives of Data Mining?
- Data are cleansed and consolidated into a data warehouse
- The data mining environment is usually a client/server architecture or a Web-based information systems (IS) architecture.
- Sophisticated new tools help to extract information buried in corporate files or archival public records. Data miners also explore the usefulness of soft data.
What are the Seven (4-7) Characteristics and Objectives of Data Mining?
- The miner is often the end user who obtains answers quickly
- Striking it rich involves finding unexpected results and requires users to think creatively throughout the process
- Data mining tools are readily combined with spreadsheets and other software development tools.
- Due to the large amounts of data, it is sometimes necessary to use parallel processing for data mining.
What are Six (6) Other Names associated with Data Mining?
- Knowledge Extraction
- Pattern Analysis
- Data Archaeology
- Information Harvesting
- Pattern Searching
- Data Dredging
What are the Four (4) Major Types of Patterns Data Mining Seeks to Identify?
Associations - Find commonly co-occurring groupings of things.
Predictions - Tell the nature of future occurrences of certain events based on what has happened in the past.
Clusters - Identify natural groupings of things based on their known characteristics.
Sequential Relationships - Discover time-ordered events.
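Example (association patterns): a minimal Python sketch of how co-occurring groupings might be counted in market-basket data. The transactions and the helper names `pair_support` and `confidence` are hypothetical, invented for illustration only. Support and confidence are the usual measures for this kind of pattern, but this is not a full association-rule algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each item pair."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

def confidence(baskets, antecedent, consequent):
    """Estimate how often `consequent` appears in baskets that contain `antecedent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

support = pair_support(transactions)
top_pair = max(support, key=support.get)  # the most frequently co-occurring pair
bread_to_milk = confidence(transactions, "bread", "milk")
```

With this toy data, bread and milk co-occur most often, which is the kind of grouping an association analysis is meant to surface.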
What is the Main Difference between Data Mining and Statistics?
Statistics starts with a well-defined proposition and hypothesis, whereas data mining starts with a loosely defined discovery statement.
What are the Fourteen (1-5) Industry Focuses where Data Mining can be Applied?
- Customer Relationship Management (CRM)
- Banking
- Retailing and Logistics
- Manufacturing and Production
- Brokerage and Securities Trading
What are the Fourteen (6-10) Industry Focuses where Data Mining can be Applied?
- Insurance
- Computer Hardware and Software
- Government and Defense
- Travel Industry (Airlines, Hotels, Rental Car Companies, etc.)
- Healthcare
What are the Fourteen (11-14) Industry Focuses where Data Mining can be Applied?
- Medicine
- Entertainment Industry
- Homeland Security and Law Enforcement
- Sports
What does CRISP-DM Stand For?
Cross-Industry Standard Process for Data Mining
What are the Six (1-3) Steps associated with CRISP-DM?
- Business Understanding - The key element of any data mining study is to know what the study is for.
- Data Understanding - Identify the relevant data from many available databases.
- Data Preparation - Take data and prepare it for analysis by data mining methods.
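Example (data preparation): a minimal Python sketch of what preparing data for analysis can involve. It assumes two common tasks, filling in missing values and rescaling numeric fields. CRISP-DM itself does not prescribe these particular techniques, and the records and the helper names `impute_mean` and `min_max_scale` are hypothetical.

```python
from statistics import mean

# Hypothetical raw customer records; None marks a missing value.
raw_records = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 45, "income": None},
    {"age": 29, "income": 48000},
]

def impute_mean(records, field):
    """Replace missing values in `field` with the mean of the observed values."""
    observed = [r[field] for r in records if r[field] is not None]
    fill = mean(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]

def min_max_scale(records, field):
    """Rescale `field` to the range [0, 1]."""
    values = [r[field] for r in records]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    return [{**r, field: (r[field] - lo) / span} for r in records]

# Apply both steps to every numeric field; the raw records are left unchanged.
prepared = raw_records
for field in ("age", "income"):
    prepared = impute_mean(prepared, field)
    prepared = min_max_scale(prepared, field)
```

Mean imputation and min-max scaling are only one possible choice. Other studies may drop incomplete records or standardize values instead.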