data mining Flashcards

1
Q

what is data mining?

A

extraction of useful information or patterns from data in large databases

2
Q

big data must account for:

A
  • volume
  • variety
  • velocity
  • veracity
3
Q

PAC (probably approximately correct) learning idea

A

any hypothesis h that is consistent with a sufficiently large number of training examples is unlikely to be seriously wrong (it is probably approximately correct)
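
How many examples count as "sufficiently large" can be made concrete under some assumptions. The sketch below uses the standard bound for a finite hypothesis space H and a learner that outputs a consistent hypothesis: m >= (1/epsilon) * (ln|H| + ln(1/delta)). The function name and the example numbers are illustrative and are not part of the card.

```python
import math


def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Number of examples m satisfying m >= (1/epsilon) * (ln|H| + ln(1/delta)).

    Assumes a finite hypothesis space of size hypothesis_count and a learner
    that returns a hypothesis consistent with all training examples.
    epsilon is the tolerated error; delta is the tolerated failure probability.
    """
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not 0 < epsilon < 1 or not 0 < delta < 1:
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / epsilon)


# Illustrative numbers: 1024 hypotheses, error at most 10%, confidence 95%.
examples_needed = pac_sample_bound(1024, epsilon=0.1, delta=0.05)
```

With these example numbers, about 100 consistent training examples suffice; tightening epsilon or delta raises the requirement.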

4
Q

what are the 9 steps of KDD (knowledge discovery in databases)

A
  1. Developing an understanding of the application domain (task topic, prior knowledge)
  2. Creating a target dataset on which the discovery will be performed (selection)
  3. Cleaning and pre-processing the data (can take 60% of the effort)
  4. Data reduction and projection: finding useful features to represent the data (transformation)
  5. Matching the goals of the process to a data mining method (regression, classification, etc.)
  6. Exploratory analysis and model/hypothesis selection: choosing the data mining algorithm(s)
  7. Data mining: searching for patterns of interest
  8. Interpreting the mined patterns
  9. Acting on or using the discovered knowledge
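
A minimal sketch of how a few of these steps chain together, assuming a toy in-memory dataset. The records, function names, age buckets, and support threshold are all hypothetical and chosen only for illustration; the understanding, method-matching, and acting steps are not something code can show.

```python
from collections import Counter

# Hypothetical raw records: (customer_id, age, purchased_item).
# None marks a missing value.
RAW = [
    (1, 34, "milk"),
    (2, None, "bread"),
    (3, 51, "milk"),
    (4, 29, None),
    (5, 42, "milk"),
    (6, 38, "bread"),
]


def select(records):
    """Step 2 (selection): keep only the fields the analysis targets."""
    return [(age, item) for _, age, item in records]


def clean(records):
    """Step 3 (cleaning): drop rows that contain a missing value."""
    return [row for row in records if None not in row]


def transform(records):
    """Step 4 (reduction/projection): replace age with a coarse age group."""
    return [("under_40" if age < 40 else "40_plus", item) for age, item in records]


def mine(records, min_support=2):
    """Step 7 (data mining): keep (age_group, item) pairs seen at least min_support times."""
    counts = Counter(records)
    return {pair: n for pair, n in counts.items() if n >= min_support}


def interpret(patterns):
    """Step 8 (interpretation): turn the mined patterns into readable statements."""
    return [
        f"{group} customers bought {item} {n} times"
        for (group, item), n in sorted(patterns.items())
    ]


patterns = mine(transform(clean(select(RAW))))
report = interpret(patterns)
```

On this toy data the only pair that reaches the support threshold is ("40_plus", "milk"), so the report contains a single statement.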