Data Mining Flashcards

1
Q

What is the definition of data mining?

A

The process of discovering new, non-trivial, potentially useful patterns in large data sets.

2
Q

What is the goal of data mining?

A

The discovery of new patterns and relationships in data.

Transforming data into knowledge.

3
Q

Why do you need data visualization?

A

It's difficult to see patterns in just a table of numbers.

4
Q

What effect do storage capacity trends have on data?

A

Decreasing storage costs mean the amount of data available is massive. A business needs to find the relationships within this mountain of data in order to improve.

5
Q

What are some sources of data?

A

Internal systems, external systems, social networking and user-generated data, and transactional data from company operations.

6
Q

What does a data warehouse do?

A

Aggregates data from the organization's other databases into one place for reporting and analysis.

7
Q

What are some problems with operational data?

A

Some data isn't suitable for sophisticated data mining; values are missing or inconsistent across different records; data is too coarse (broad) or too fine (detailed); there is too much data!

8
Q

What is the curse of dimensionality and how do you solve it?

A

It is the problem of being overwhelmed by data that carries too many different attributes (too many columns or dimensions, not just too many rows). Address it by paying attention to only a few things at a time: restrict the data displayed in an operational data dashboard or a scorecard.

9
Q

What are business intelligence systems and what do they do?

A

They use data created by other systems to provide reporting and analysis for decision making. Pulling data from across the business is key.

10
Q

What is the value of BI systems?

A

They help to analyze the data, look for patterns, use those patterns to make business decisions, share this information with business partners, manage inventory, and design marketing and advertising strategies.

11
Q

What are the 4 different types of BI systems and what are their defining characteristics?

A

1) Reporting systems (organize data so it can be viewed)
2) Data mining systems (use statistics to find patterns and relationships)
3) Knowledge management systems (forums to share knowledge, e.g. Piazza)
4) Expert systems (turn human knowledge into if/then rules to make recommendations)

12
Q

Details on reporting systems

A

Integrate data from multiple sources; sort, group, sum, average, and compare it; format it into reports. GIVE THE RIGHT INFO TO THE RIGHT USER AT THE RIGHT TIME.

13
Q

What are data marts used for?

A

Each one addresses a single business problem or function, using a focused subset of the warehouse data.

14
Q

What are data cubes used for?

A

They aggregate and summarize data along multiple dimensions (location, time, product) to make querying faster, especially for drill-down queries.
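
A minimal sketch of the idea, assuming made-up sales rows and a hypothetical `build_cube` helper (neither comes from the course material). Totals are pre-computed for every combination of dimensions, with "*" standing for "all values", so a drill-down becomes a simple lookup:

```python
from collections import defaultdict

# Hypothetical sales facts: (location, month, product, amount)
sales = [
    ("Austin", "2024-01", "diapers", 120.0),
    ("Austin", "2024-01", "beer", 80.0),
    ("Austin", "2024-02", "diapers", 100.0),
    ("Dallas", "2024-01", "beer", 60.0),
]


def build_cube(rows):
    """Pre-aggregate totals for every combination of the three dimensions."""
    cube = defaultdict(float)
    for location, month, product, amount in rows:
        values = (location, month, product)
        # Each bitmask chooses which dimensions stay fixed; the rest roll up to "*".
        for mask in range(2 ** len(values)):
            key = tuple(v if mask & (1 << i) else "*" for i, v in enumerate(values))
            cube[key] += amount
    return cube


cube = build_cube(sales)
grand_total = cube[("*", "*", "*")]            # all locations, months, products
austin = cube[("Austin", "*", "*")]            # drill down by location
austin_jan = cube[("Austin", "2024-01", "*")]  # drill down further by month
print(grand_total, austin, austin_jan)
```

The trade-off is storage for speed: the cube holds many more cells than the raw table, but no query has to rescan the rows.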

15
Q

What does ETL stand for and mean?

A

Stands for Extract, Transform, Load. This is the process that reporting systems run every night.

16
Q

Flow of data in a reporting system

A

(Legacy, operational, transactional, application, and web services systems) -> ETL -> (data warehouse, cube, mart)
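
The flow above can be sketched in a few lines. This is only an illustration: the two source lists, the field names, and the `extract`/`transform`/`load` functions are hypothetical stand-ins for real source systems and a real warehouse.

```python
# Hypothetical rows from two source systems with inconsistent formatting.
pos_rows = [{"sku": "A1", "amount": "19.99", "store": "austin "}]
web_rows = [{"sku": "a1", "amount": "5.00", "store": "WEB"}]


def extract():
    """Extract: pull raw rows from each source system."""
    return pos_rows + web_rows


def transform(rows):
    """Transform: normalize keys and types so rows from different systems match."""
    return [
        {
            "sku": r["sku"].upper(),
            "amount": float(r["amount"]),
            "store": r["store"].strip().lower(),
        }
        for r in rows
    ]


def load(rows, warehouse):
    """Load: append the cleaned rows to the warehouse table (a list here)."""
    warehouse.extend(rows)
    return warehouse


warehouse = load(transform(extract()), [])
print(warehouse)
```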

17
Q

Details on data mining systems

A

Process data using statistical techniques (regression and decision tree analysis), and look for patterns and relationships to predict outcomes (market basket analysis [what people buy together], predicting donations so as to hit the target audience).
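
A rough sketch of market basket analysis, assuming four invented baskets. It counts how often pairs of items are bought together and reports two standard measures: support (share of all baskets containing the pair) and confidence (share of baskets with the first item that also contain the second). The helper names are illustrative only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each set is one customer's basket.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"milk", "bread"},
    {"diapers", "milk"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so that each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))


def support(a, b):
    """Fraction of all baskets that contain both items."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return pair_counts[tuple(sorted((antecedent, consequent)))] / item_counts[antecedent]


print(support("beer", "diapers"))
print(confidence("beer", "diapers"))
print(confidence("diapers", "beer"))
```

Confidence is not symmetric: here every beer basket contains diapers, but not every diapers basket contains beer.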

18
Q

Details on knowledge management systems

A

Create value from intellectual capital, collect and share human knowledge, foster innovation, and increase organizational responsiveness (Piazza-like).

19
Q

Details on expert systems

A

Encapsulate expert knowledge as if/then rules and pass it on to new employees, improving decision making by non-experts. Examples: the Longevity Game and WebMD. These are interactive tools: you put information in and get a response back.
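
A toy illustration of the if/then idea, assuming an invented returns-desk policy. The rules, thresholds, and the `recommend` function are hypothetical; a real expert system would hold many more rules gathered from actual experts.

```python
# Each rule pairs a condition on the facts with the advice an expert would give.
# Rules are checked in order, and the first one that matches wins.
RULES = [
    (lambda f: not f["has_receipt"], "Offer store credit only."),
    (lambda f: f["days_since_purchase"] <= 30, "Approve a full refund."),
    (lambda f: True, "Escalate to a manager."),  # fallback when nothing else applies
]


def recommend(facts):
    """Return the advice of the first rule whose condition holds for these facts."""
    for condition, advice in RULES:
        if condition(facts):
            return advice


print(recommend({"has_receipt": True, "days_since_purchase": 10}))
print(recommend({"has_receipt": False, "days_since_purchase": 10}))
print(recommend({"has_receipt": True, "days_since_purchase": 90}))
```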

20
Q

What are some examples of pattern finding and data mining?

A

People with bad credit scores have more wrecks; on Thursday nights people buy a lot of diapers and beer.

21
Q

What is RFM analysis?

A

Recency + Frequency + Money (spent per visit). It's a technique used to evaluate how valuable a customer is. The program divides customers into 5 groups on each of the three measures and ranks them 1 -> 5, with 1 being the most recent / most frequent / biggest spender.
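
A small sketch of the scoring step, assuming five invented customers and a hypothetical `quintile_scores` helper. Each measure is ranked separately and split into fifths, with 1 as the best fifth:

```python
# Hypothetical customers: (days since last visit, number of visits, average spend per visit)
customers = {
    "ann": (3, 40, 90.0),
    "bob": (200, 35, 80.0),
    "cat": (30, 10, 20.0),
    "dan": (90, 5, 15.0),
    "eve": (10, 20, 50.0),
}


def quintile_scores(values, best_is_low):
    """Map each customer to a score from 1 to 5, where 1 marks the best fifth."""
    ordered = sorted(values, key=values.get, reverse=not best_is_low)
    n = len(ordered)
    return {cust: (rank * 5) // n + 1 for rank, cust in enumerate(ordered)}


# Fewer days since the last visit is better; more visits and more spend are better.
recency = quintile_scores({c: v[0] for c, v in customers.items()}, best_is_low=True)
frequency = quintile_scores({c: v[1] for c, v in customers.items()}, best_is_low=False)
money = quintile_scores({c: v[2] for c, v in customers.items()}, best_is_low=False)

rfm = {c: (recency[c], frequency[c], money[c]) for c in customers}
print(rfm)
```

With only five customers, each one lands in its own group; with more customers, each score covers roughly a fifth of them.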

22
Q

Who do you want to target with your marketing when looking at RFM analysis?

A

Look for someone with good (low) Frequency and Money scores but a bad (high) Recency score. You want to get that consistent big spender back in the store.
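
As a sketch, that selection is a simple filter over (R, F, M) scores. The scores below are invented, and the cutoffs (R of 4 or worse, F and M of 2 or better) are assumptions chosen for illustration, not fixed rules.

```python
# Hypothetical (Recency, Frequency, Money) scores, where 1 is best and 5 is worst.
scores = {
    "ann": (1, 1, 1),  # recent, frequent, big spender: already engaged
    "bob": (5, 2, 2),  # frequent big spender who has not visited lately
    "cat": (3, 4, 4),
    "dan": (4, 5, 5),  # lapsed, but never a valuable customer
}

# Win-back targets: bad recency combined with good frequency and money.
win_back = [
    customer
    for customer, (r, f, m) in scores.items()
    if r >= 4 and f <= 2 and m <= 2
]
print(win_back)
```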

23
Q

What are a loss leader, cross-selling, and upselling?

A

A loss leader is a product you are willing to take a loss on because you want to sell other goods to that customer. Cross-selling is suggesting that because you bought this, you would also like that. Upselling is moving customers to a more expensive version.

24
Q

What is the tricky part about pricing?

A

Low pricing might signal low confidence in the product, but don't go too crazy with high prices either.

25
Q

True or false: Walmart used predictive technology to learn what people buy when hurricanes are coming, and it empowers employees to do what they think is necessary.

A

True.