Data Mining Flashcards

1
Q

What is data mining?

A

a

2
Q

How are ML and DM different?

A

a

3
Q

DM applications

A
4
Q

What is the decision tree algorithm?

A
5
Q

What is the support vector machine (SVM) algorithm?

A
6
Q

How do you know if a point is a support vector?

A
7
Q

What is the k-nearest neighbours algorithm?

A
8
Q

What is the naive Bayes algorithm?

A
9
Q

What is the k-means clustering algorithm?

A
10
Q

What is the hierarchical clustering algorithm?

A
11
Q

What is the expectation-maximization (EM) algorithm?

A
12
Q

What is the Apriori algorithm?

A
13
Q

Classification vs. clustering

A
14
Q

What is an ensemble learning algorithm?

A
15
Q

Regression tree use cases

A
  • A tree that estimates the probability of kyphosis after spinal surgery, given the age of the patient and the vertebra at which surgery was started
  • Predicting whether a person is fit, given information such as age, eating habits, and physical activity
  • Predicting whether golf will be played on a given day: the input variables are Outlook, Temperature, Humidity, and Wind, and the outcome variable is whether golf was played on the day. The job is to build a predictive model that takes in these 4 parameters and predicts the outcome
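One way to picture how a tree would pick its first question in the golf example: score each input variable by how cleanly it separates the outcome. The sketch below is only an illustration under stated assumptions. The six rows are hypothetical toy data (not from any real dataset), only two of the four variables are included, and because the outcome is yes/no it uses Gini impurity, a common classification criterion, rather than a regression criterion.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def best_split(rows, features, target):
    """Return (feature, weighted_gini) for the categorical feature whose
    groups have the lowest size-weighted Gini impurity."""
    best_feature, best_score = None, float("inf")
    for feature in features:
        # Group the target labels by the value this feature takes.
        groups = {}
        for row in rows:
            groups.setdefault(row[feature], []).append(row[target])
        score = sum(len(g) / len(rows) * gini(g) for g in groups.values())
        if score < best_score:
            best_feature, best_score = feature, score
    return best_feature, best_score

# Hypothetical toy rows, invented purely for illustration.
rows = [
    {"Outlook": "Sunny", "Wind": "Weak", "Play": "No"},
    {"Outlook": "Sunny", "Wind": "Strong", "Play": "No"},
    {"Outlook": "Overcast", "Wind": "Weak", "Play": "Yes"},
    {"Outlook": "Overcast", "Wind": "Strong", "Play": "Yes"},
    {"Outlook": "Rain", "Wind": "Weak", "Play": "Yes"},
    {"Outlook": "Rain", "Wind": "Strong", "Play": "Yes"},
]

feature, score = best_split(rows, ["Outlook", "Wind"], "Play")
print(feature, score)
```

In this toy data every Outlook group is pure, so Outlook is chosen as the first split; a full tree would repeat the same scoring inside each resulting group.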
16
Q

Support vector machine classification

A

All of these use labeled data

  • A company wants to automate its loan eligibility process (in real time) based on customer details provided in an online application form. These details include Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History, and others
  • Classifying an image as cancer or not cancer
  • Classifying an image as face or not face
17
Q

KNN use cases

A

Uses labels

  • Classify a startup as high return, moderate return, or low return
  • Classify patients as quick recovery, medium recovery, or slow recovery
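The patient-recovery use case can be sketched in a few lines: a new patient gets the majority label of the k closest labeled patients. This is a minimal illustration, not a production classifier. The two features (age, physiotherapy sessions per week), the nine labeled patients, and the function name `knn_predict` are all hypothetical, and the features are left unscaled for simplicity.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train is a list of (features, label) pairs.
    Returns the majority label among the k training points closest to query."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical labeled data: (age, physiotherapy sessions per week) -> recovery speed
patients = [
    ((25, 5), "quick"), ((30, 4), "quick"), ((28, 5), "quick"),
    ((50, 3), "medium"), ((55, 2), "medium"), ((48, 3), "medium"),
    ((70, 1), "slow"), ((75, 0), "slow"), ((68, 1), "slow"),
]

prediction = knn_predict(patients, (52, 3), k=3)
print(prediction)
```

Because KNN relies on raw distances, features measured on very different scales would normally be standardized first; that step is omitted here.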
18
Q
A
19
Q

Use cases of hierarchical clustering

A
  • Charting evolution through phylogenetic trees
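A tree-like grouping of this kind can be built bottom-up by repeatedly merging the two closest clusters. The sketch below assumes single linkage (cluster distance = distance between the closest pair of members) and uses five hypothetical 2-D points standing in for species traits; real phylogenetic methods use other distance measures and linkage rules.

```python
import math

def single_linkage(points, num_clusters):
    """Agglomerative clustering with single linkage.
    Returns (clusters, merges): clusters holds lists of point indices,
    merges records each merge as (cluster_a, cluster_b, distance)."""
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > num_clusters:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, merges

# Hypothetical trait vectors, invented for illustration.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.3, 5.0), (9.0, 0.5)]
clusters, merges = single_linkage(points, num_clusters=2)
print(clusters)
```

The recorded merge order is what a dendrogram draws: early merges join the most similar items, later merges join increasingly different groups.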
20
Q

Linear regression formula

A
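As a reference sketch for this card, the standard simple (one-predictor) linear regression model and its least-squares estimates are written out below. The symbols are the conventional textbook ones and are an assumption about what the card intended: $y$ is the response, $x$ the predictor, $\beta_0$ the intercept, $\beta_1$ the slope, $\varepsilon$ the error term, and $\bar{x}$, $\bar{y}$ the sample means.

```latex
y = \beta_0 + \beta_1 x + \varepsilon

\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
```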
21
Q
A
22
Q
A