Chapter 17 - Exam 3 Flashcards
Business Intelligence (BI)
- set of techniques and tools
- transforms raw data into meaningful information
- used for business analysis purposes
Once BI is used to transform data, what uses it?
DSS - Decision Support Systems
Describe DSS when there was no Business Intelligence yet
- DSS used by banks to see if start-up businesses would work
- 1950s
- internet speed was too slow, making data hard to collect (so BI was born)
Big Data
social media data, phone records data, internet records data, etc.
4V’s of Big Data?
Volume - amount of data collected
Variety - the different sources and types of data
Veracity - quality of data
Velocity - speed at which data is collected
First step in collecting data?
Take inventory (figure out where data is)
- CRM (Customer Relationship Management) systems
- ERP (Enterprise Resource Planning) systems
- social media
Second Step in collecting data?
Figure out how much to collect
- “What questions do I need answered?”
Third Step in Collecting Data?
Figure out where to store it
- Data warehouse
- Datamarts
Data Warehouses vs. Datamarts
Data Warehouses
- used by large companies
- can store very large volumes of data (terabytes to petabytes)
- store data in centralized location
Datamarts
- used by departments of large companies
- more focused and specific data
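The relationship between the two can be sketched as follows: a datamart holds a focused, department-specific subset of what the central warehouse stores. This is only an illustration using Python's built-in sqlite3 module. The table names, the sample rows, and the functions build_warehouse and build_datamart are assumptions made for the example, not part of the course material.

```python
import sqlite3


def build_warehouse():
    """Central store holding data for every department (hypothetical sample rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
        [
            ("sales", "revenue", 1000.0),
            ("sales", "orders", 42.0),
            ("hr", "headcount", 17.0),
            ("finance", "expenses", 650.0),
        ],
    )
    return conn


def build_datamart(conn, department):
    """Copy only one department's rows into its own smaller table."""
    if not department.isalnum():
        raise ValueError("department must be alphanumeric")
    table = f"{department}_mart"
    conn.execute(f"CREATE TABLE {table} (metric TEXT, value REAL)")
    conn.execute(
        f"INSERT INTO {table} "
        "SELECT metric, value FROM warehouse_facts WHERE department = ?",
        (department,),
    )
    return conn.execute(
        f"SELECT metric, value FROM {table} ORDER BY metric"
    ).fetchall()
```

Calling build_datamart(build_warehouse(), "sales") returns only the sales rows, while the warehouse table still holds data for all departments.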
Explain the parts of ETL
Extract
- pull data from CRMs and ERPs
Transform
- “normalize data”
- put data into fields/records of relational database
Load
- put data in Data Warehouses or Datamarts
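The three ETL steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline. The sample CRM and ERP rows, the table names, and the function names are all hypothetical, and an in-memory SQLite database stands in for the data warehouse or datamart.

```python
import sqlite3

# Hypothetical raw records; a real pipeline would query the CRM and ERP systems.
CRM_ROWS = [
    {"id": "1", "name": "  Ada Lovelace ", "email": "ADA@Example.com"},
    {"id": "2", "name": "Alan Turing", "email": "alan@example.com "},
]
ERP_ROWS = [
    {"customer_id": "1", "order_total": "120.50"},
    {"customer_id": "2", "order_total": "80"},
    {"customer_id": "1", "order_total": "19.50"},
]


def extract():
    """Extract: pull the raw rows from each source system."""
    return list(CRM_ROWS), list(ERP_ROWS)


def transform(crm_rows, erp_rows):
    """Transform: clean the values and shape them into typed fields/records."""
    customers = [
        (int(r["id"]), r["name"].strip(), r["email"].strip().lower())
        for r in crm_rows
    ]
    orders = [(int(r["customer_id"]), float(r["order_total"])) for r in erp_rows]
    return customers, orders


def load(conn, customers, orders):
    """Load: write the cleaned records into relational tables."""
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", customers)
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    conn.commit()


def run_etl():
    """Run extract, transform, and load in order; return the loaded database."""
    conn = sqlite3.connect(":memory:")  # stand-in for a warehouse or datamart
    customers, orders = transform(*extract())
    load(conn, customers, orders)
    return conn


def total_per_customer(conn):
    """Example analysis query against the loaded data."""
    return conn.execute(
        "SELECT c.name, SUM(o.total) FROM customers c "
        "JOIN orders o ON o.customer_id = c.id "
        "GROUP BY c.id ORDER BY c.id"
    ).fetchall()
```

After run_etl() the data is clean and queryable, but every record was copied and reshaped before any question could be asked. That copying is where the time and storage costs of ETL come from.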
Disadvantage of ETL?
Too slow and takes up too much storage
Who developed Hadoop?
Doug Cutting and Mike Cafarella
How did Hadoop get its name?
Doug Cutting’s son’s yellow toy elephant
Apache Hadoop?
- Open-source software framework
- written in Java
MapReduce
“processing arm” or search engine of Hadoop
- finds data where it resides, queries it, and processes the query there
- does NOT move the data to a central location or to the user's computer
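The idea can be sketched with a small word count, the usual teaching example for MapReduce. This is plain Python that only imitates the pattern; it does not use Hadoop's actual Java API. The NODES dictionary and all function names are hypothetical. The point to notice is that the map step runs separately on each node's own data, and only the small (word, count) pairs are combined afterward.

```python
from collections import defaultdict

# Hypothetical cluster: each node keeps its own block of the data.
NODES = {
    "node-1": ["big data big", "hadoop"],
    "node-2": ["data hadoop data"],
}


def map_phase(lines):
    """Map: runs where the data lives and emits a (word, 1) pair per word."""
    return [(word, 1) for line in lines for word in line.split()]


def shuffle(pairs_per_node):
    """Shuffle: group the emitted values by key across all nodes."""
    grouped = defaultdict(list)
    for pairs in pairs_per_node:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(nodes):
    """Run map on each node's local data, then shuffle and reduce."""
    mapped = [map_phase(lines) for lines in nodes.values()]
    return reduce_phase(shuffle(mapped))
```

Calling word_count(NODES) gives the count of each word across both nodes without first copying the raw lines into one central place.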