Exam 3 Flashcards
Rate of Data Growth
Doubles every 6 months
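As a quick check on what "doubles every 6 months" implies, the arithmetic can be sketched in Python. The function name `growth_factor` is illustrative, and the 6-month default simply restates the claim on this card.

```python
def growth_factor(months: float, doubling_period_months: float = 6.0) -> float:
    """Return how many times larger the data volume is after `months`."""
    # Each doubling period multiplies the volume by 2.
    return 2 ** (months / doubling_period_months)

print(growth_factor(12))  # 1 year  = 2 doublings
print(growth_factor(60))  # 5 years = 10 doublings
```

At this rate, one year means 4 times the data and five years means roughly a thousand times the data.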
Information Abundance
The world has changed and jobs have changed: there is so much information that workers need to "geek up"
Business Intelligence
Firms that are basing decisions on hunches aren’t managing, they are gambling.
Having good data gives the business the power to make an informed decision.
Analytics
The extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions
Data
Raw facts and figures
On their own, they tell you nothing
Information
Data presented in a context so it can ‘answer a question’ or ‘support decision making’
Knowledge
Insight derived from experience and expertise
what humans bring to the table
Structured Data
Organized
Predefined characteristics (a “schema”)
Unstructured Data
Not Organized – No Schema
Text – email, Facebook pages, news stories, etc.
Binary – Images, audio, video
Table
An organized collection of data
Records
Rows in a database table
Fields
Columns in a database table
Relational database
Multiple tables that are related
Uses a Key Field (unique identifier) to link tables together
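A minimal sketch of two related tables linked by a key field, using Python's built-in `sqlite3` module. The table names, column names, and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,  -- key field (unique identifier)
        name        TEXT
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        item        TEXT
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 'milk'), (11, 1, 'bread'), (12, 2, 'eggs');
""")

# The shared customer_id field links each order record to its customer record.
rows = conn.execute("""
    SELECT c.name, o.item
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```

Each row is a record and each column is a field; the join works only because `customer_id` uniquely identifies one customer.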
What is a “transaction”?
What are its two key characteristics?
Any business exchange
- Standardized schema
- Occurs repeatedly
Point of Sale system
Retail sales transactions - a cash register
Tracks transactions when item is scanned at checkout and sold to a customer
How do loyalty cards generate valuable data?
The company is paying you for data about you that you otherwise would not give them
(lets the company see who is buying which items, instead of an anonymous cash sale)
ERP
Enterprise Resource Planning
Paychecks, invoices, and payments each become a business transaction and a source of data
SCM
Supply Chain Management
Each order for finished goods and each order for raw materials is a transaction and a source of data
Sources of customer-provided data
Customer surveys
Product registration cards
Contests
Data Aggregator
Firms that trawl the Internet and other sources for data, then package that data up for resale
Business operations – examples
Healthcare Industry – patient data (pharmaceutical research)
Michigan – tags cows at birth
Transportation – engines on Boeing aircraft (new Airbus aircraft have over 100k sensors gathering data)
Switzerland – put sensors on 9k trains and 5k km of tracks
Top CIOs say that data growth is the #1 challenge today. What two problems arise from that challenge?
Handling explosive growth with constrained budgets
Exploiting all that data
What is an SSD? How does it address the problems with data growth?
Solid State Drive
Uses flash memory (faster)
Lower power consumption (less heat)
RAID
Prices dropping
What is Automated Data Tiering?
Matches storage performance to access frequency (placement decisions are made automatically)
Top Tier: Currently working data
Mid Tier: Recently used data
Bottom Tier: Historical data
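The three tiers above can be sketched as a simple placement rule. This is only an illustration: it uses time since last access as a stand-in for access frequency, and the 1-day and 30-day cutoffs and the name `choose_tier` are assumptions, not values from the course material.

```python
from datetime import datetime, timedelta

def choose_tier(last_access: datetime, now: datetime) -> str:
    """Pick a storage tier from how recently the data was used."""
    age = now - last_access
    if age <= timedelta(days=1):    # hypothetical cutoff: currently working data
        return "top"
    if age <= timedelta(days=30):   # hypothetical cutoff: recently used data
        return "mid"
    return "bottom"                 # everything older: historical data

now = datetime(2024, 1, 31)
for last_access in (datetime(2024, 1, 31), datetime(2024, 1, 10), datetime(2023, 1, 1)):
    print(last_access.date(), "->", choose_tier(last_access, now))
```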
What is DeDupe?
Deduplication: software that identifies duplicate data and eliminates the extra copies, taming the growth of unstructured data
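One common way to find duplicates is to compare content hashes. The sketch below assumes that approach (the card does not say how any particular product works) and uses the hypothetical helper name `dedupe`.

```python
import hashlib

def dedupe(blobs: list[bytes]) -> tuple[dict[str, bytes], list[str]]:
    """Store each distinct blob once and return a reference per input blob."""
    store: dict[str, bytes] = {}  # content hash -> the single stored copy
    refs: list[str] = []          # one hash per original item, in order
    for blob in blobs:
        digest = hashlib.sha256(blob).hexdigest()
        store.setdefault(digest, blob)  # keep only the first copy seen
        refs.append(digest)
    return store, refs

files = [b"quarterly report", b"cat photo", b"quarterly report"]
store, refs = dedupe(files)
print(len(files), "items stored as", len(store), "unique copies")
```

Every original item can still be recovered through its reference, but identical content takes up space only once.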
Data Silos
Isolated data stores: no sharing or communication between systems is possible
Can be caused by data trapped in obsolete legacy systems or incompatible systems
Causes us to miss opportunities to see correlations, patterns and trends
Operational data
Data that is continually generated in the day-to-day operations of a business. When an order is entered, operational data is created and is used immediately by systems that pick inventory from the warehouse, print labels, and arrange for shipping.
Things like customer, inventory, and purchase data fall into this category. This type of data is pretty straightforward and will generally look the same for most organizations
How does the analysis of operational data compete with customers?
What can a company do about this problem?
Analysis puts extra load on the operational system, which slows it down, so customers and sales can be lost
Add separate data repositories for analysis
What is a Data Warehouse? What are its characteristics?
Collection of databases that supports decision making
Many sources
Fast queries and exploration
Best way to let your managers do analytics without harming the performance of your operational system
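A toy sketch of the idea: data from several operational sources is copied into a separate repository, and analytic queries run there instead of on the live systems. The source names (`pos_system`, `web_orders`), the `sales` table, and the figures are all invented for illustration.

```python
import sqlite3

def make_source(rows):
    """Build a stand-in operational database holding a few sales rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (store TEXT, amount REAL)")
    db.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return db

pos_system = make_source([("north", 20.0), ("north", 5.0)])
web_orders = make_source([("online", 12.5)])

# The warehouse is a separate database fed from many sources.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (source TEXT, store TEXT, amount REAL)")
for name, db in [("pos", pos_system), ("web", web_orders)]:
    copied = db.execute("SELECT store, amount FROM sales").fetchall()
    warehouse.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(name, store, amount) for store, amount in copied],
    )

# Managers query the warehouse, so the operational systems carry no extra load.
totals = warehouse.execute(
    "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY store"
).fetchall()
print(totals)
```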
How is a Data Mart different from a Data Warehouse?
Similar, but the scale is different.
Instead of covering the whole enterprise, a data mart addresses a specific problem and a specific business unit