Data Mining and Visualisation Flashcards

1
Q

What are the three areas of big data application?

A

Scientific
Medical
Commercial

2
Q

State the stages in the basic scientific process

A
  1. Observe data about the world
  2. Notice patterns in data
  3. Devise a hypothesis which explains data
  4. Run an experiment on unseen data
  5. Refine or reject hypothesis
3
Q

What are the stages in the knowledge discovery pipeline?

A
Acquisition
Cleaning
Selection
Processing
Data Mining
Visualisation
Interpretation/Knowledge
4
Q

What is a risk of a deep neural net?

A

Overfitting: the model is so flexible that it can fit any training data, noise included, and then predicts poorly on unseen data
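
The risk can be illustrated without a neural net. The sketch below is an analogy, not a deep network: a high-degree interpolating polynomial stands in for the overly flexible model, and a straight line for a simpler one. The data, the noise pattern and all function names are assumptions made for the illustration.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate the unique degree-(n-1) polynomial through all n training points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a prediction function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: the true relationship is y = 2x, observed with alternating noise.
train_x = [float(i) for i in range(8)]
train_y = [2 * x + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(train_x)]
test_x = [i + 0.5 for i in range(7)]  # unseen points between the training points
test_y = [2 * x for x in test_x]      # noise-free truth at the unseen points

flexible = lambda x: lagrange_predict(train_x, train_y, x)
simple = fit_line(train_x, train_y)

flexible_train = mse(flexible, train_x, train_y)
flexible_test = mse(flexible, test_x, test_y)
simple_train = mse(simple, train_x, train_y)
simple_test = mse(simple, test_x, test_y)

print(f"flexible: train={flexible_train:.3f} test={flexible_test:.3f}")
print(f"simple:   train={simple_train:.3f} test={simple_test:.3f}")
```

The flexible model reproduces the training data exactly, noise and all, yet does worse than the straight line on the unseen points.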

5
Q

Describe the steps in a k-Means algorithm

A

Pick k points at random as initial means
Assign each point to the nearest mean
Replace each mean with the actual mean of the points assigned to it
Repeat the assignment and update steps until no assignment changes
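
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a production implementation: the function name, the seed and max_iter parameters, the use of squared Euclidean distance and the handling of empty clusters are all choices made here.

```python
import random


def k_means(points, k, seed=0, max_iter=100):
    """Cluster a list of numeric tuples into k groups; returns (means, assignment)."""
    rng = random.Random(seed)
    # Pick k of the points at random as the initial means.
    means = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Assign each point to its nearest mean (squared Euclidean distance).
        new_assignment = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, means[j])))
            for p in points
        ]
        # Stop once nothing changes between two passes.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Replace each mean with the actual mean of the points assigned to it.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # assumption: keep the old mean if a cluster ends up empty
                means[j] = tuple(sum(coords) / len(members) for coords in zip(*members))
    return means, assignment


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
means, assignment = k_means(points, k=2)
print(means, assignment)
```

Which cluster gets label 0 depends on the random initial means, so only the grouping is meaningful, not the label values.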

6
Q

Describe what the k-Means clustering algorithm does

A

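Discovers groups of similar items in the data. The data is partitioned into k clusters, each represented by its mean, and each point belongs to the cluster with the nearest mean. A clustering is evaluated by the total distance (usually squared) from each point to its nearest mean; the lower, the better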
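
The evaluation criterion can be written as a short function. This is a sketch: the function name and the squared flag are assumptions. k-means is usually stated in terms of squared distance, so that is the default here, while squared=False gives the plain total distance.

```python
from math import dist


def total_distance(points, means, squared=True):
    """Sum over all points of the (squared) distance to the nearest mean."""
    total = 0.0
    for p in points:
        d = min(dist(p, m) for m in means)
        total += d * d if squared else d
    return total


# Hypothetical data: two candidate sets of means for the same three points.
points = [(0, 0), (0, 4), (10, 0)]
good = total_distance(points, [(0, 2), (10, 0)])
bad = total_distance(points, [(0, 0), (0, 4)])
print(good, bad)
```

The first set of means gives the lower total, so it is the better clustering of these points.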

7
Q

What is statistics used to do?

A

Extract patterns from data

8
Q

Describe the p value

A

How likely it is that a result at least this unusual could have occurred by chance alone, assuming there is no real effect (the null hypothesis)
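
A small worked example may make this concrete. It assumes a hypothetical experiment: a coin lands heads 9 times in 10 flips, and the question is how likely a result at least that unusual is if the coin is fair. The function name and the one-sided choice are assumptions.

```python
from math import comb


def binomial_p_value(heads, flips):
    """One-sided p-value: P(at least `heads` heads in `flips` flips of a fair coin)."""
    return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2 ** flips


p = binomial_p_value(9, 10)  # (C(10,9) + C(10,10)) / 2**10 = 11/1024
print(p)
```

A p value this small (about 0.011) says that a fair coin would rarely produce 9 or more heads in 10 flips.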

9
Q

What is the p value used to do?

A

Assess the significance of a result

10
Q

Describe statistical power

A

The probability that your test detects an effect if it is real

11
Q

What does the statistical power depend on?

A

Size of the effect and the sample size. It also depends on the significance level chosen and the variability of the data
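
The dependence on effect size and sample size can be seen in a simple case. The sketch below assumes a one-sided, one-sample z-test with known standard deviation, with the effect size measured in standard-deviation units; the function name and the default alpha of 0.05 are assumptions.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided one-sample z-test with n observations."""
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha)  # rejection threshold under no effect
    # Under a real effect, the test statistic is shifted by effect_size * sqrt(n).
    return 1 - standard_normal.cdf(z_crit - effect_size * n ** 0.5)


for effect in (0.2, 0.5):
    for n in (10, 40):
        print(f"effect={effect} n={n} power={z_test_power(effect, n):.2f}")
```

Power rises with either a larger effect or a larger sample; with no effect at all it equals alpha, the false-positive rate.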

12
Q

What should graphical displays do?

A
  • Show the data
  • Induce the viewer to think about the substance
  • Avoid distorting what the data has to say
  • Present many numbers in a small space
  • Make large data sets coherent
  • Reveal the data at several levels of detail
13
Q

What is graphical excellence?

A

A well-designed presentation of interesting data: complex ideas communicated with clarity, precision and efficiency
