Catchall Flashcards

1
Q

Outlier

A

Values of an attribute that are infrequent, or that lie a long way from the attribute's average value or from the range of “typical” values.
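
One common way to make "a long way from the range of typical values" concrete is the interquartile-range (IQR) rule. The sketch below is only an illustration: the function name `flag_outliers`, the multiplier `k=1.5`, and the sample `ages` list are assumptions, not part of the card.

```python
import statistics

def flag_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is a widely used convention, assumed here for illustration.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical attribute values: one entry sits far from the rest.
ages = [21, 22, 22, 23, 24, 24, 25, 26, 97]
flagged = flag_outliers(ages)
print(flagged)
```

A flagged value is only a candidate outlier; whether it is an error or a genuine observation still has to be judged.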

2
Q

Should you remove outliers?

A
  1. If you recognise outliers as being due to a particular error or due to unwanted samples in the dataset, it will generally be safe to remove them.
  2. If you find outliers that you believe to be from errors or unwanted samples but do not know why they are there, you may remove them, but you should record exactly what you have removed. You should also consider investigating why they occur.
  3. If you find outliers in data that you are analysing without much prior knowledge of what to expect, you should certainly not remove them without further investigation. If you discard outliers without proper consideration, you may be biasing your data or ignoring something that is scientifically interesting.
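
The advice to record exactly what has been removed can be sketched as below. This is a minimal illustration, not a prescribed procedure: `split_outliers`, the `-999.0` sentinel, and the sample `readings` are all hypothetical.

```python
def split_outliers(records, is_outlier, reason):
    """Separate records into kept values and a log of removed ones.

    Nothing is silently discarded: every removed record is logged with
    its original position and the stated reason for removal.
    """
    kept, removed_log = [], []
    for index, record in enumerate(records):
        if is_outlier(record):
            removed_log.append({"index": index, "record": record, "reason": reason})
        else:
            kept.append(record)
    return kept, removed_log

# Hypothetical sensor readings where -999.0 is assumed to mark a failed read.
readings = [12.1, 11.8, -999.0, 12.4, -999.0]
kept, removed_log = split_outliers(
    readings,
    lambda r: r == -999.0,
    "sentinel value for a failed sensor read (assumed)",
)
print(kept)
print(removed_log)
```

Keeping the log alongside the cleaned data makes the removal reversible and lets someone else check the decision later.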
3
Q

4 Possible ways to identify bias in data

A
  • Checking a wide range of different properties of a dataset can sometimes reveal that it has a different structure or distribution to what one would expect. This can be a sign of bias.
  • Comparing different datasets with comparable sample sets can show that they do not actually deal with the same distribution of samples.
  • Checking the distribution of an attribute against independent data about that attribute may show that the distribution is not typical.
  • Testing whether the attributes that turn out to be significant for the properties of the data we are interested in are the attributes we would expect to be significant.
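
Checking an attribute's distribution against independent data can be sketched as a simple comparison of category proportions. The example below uses total variation distance as one possible measure; the function names, the `sample` data, and the `reference` proportions are hypothetical and chosen only for illustration.

```python
from collections import Counter

def proportions(values):
    """Return the share of each category in a list of values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}

def total_variation_distance(observed, reference):
    """Half the sum of absolute differences between two distributions.

    0.0 means identical proportions; 1.0 means no overlap at all.
    """
    categories = set(observed) | set(reference)
    return 0.5 * sum(
        abs(observed.get(c, 0.0) - reference.get(c, 0.0)) for c in categories
    )

# Hypothetical dataset attribute and hypothetical independent figures.
sample = ["urban"] * 80 + ["rural"] * 20
reference = {"urban": 0.55, "rural": 0.45}

distance = total_variation_distance(proportions(sample), reference)
print(round(distance, 3))
```

A large distance does not prove bias on its own, but it is a signal that the sample may not be typical and deserves a closer look.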
4
Q

3 Possible definitions of analysis

A
  • The process of separating something into its constituent elements.
  • Resolution of anything complex into simple elements.
  • Detailed examination of the elements or structure of something.
5
Q

Analysis vs Synthesis

A

Analysis is a kind of enquiry that breaks down its subject of examination to find its basic elements and structure.

Synthesis compares and brings together different subjects and phenomena to reveal connections and correspondences between them.

6
Q

Data Mining

A

Data mining is a kind of investigation where we try to use systematic means - in the form of algorithms - to automatically explore patterns and connections that exist within diverse data values and data sources, in order to identify relationships of which we had no prior awareness.
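
As a toy illustration of exploring data automatically for relationships, the sketch below scans every pair of columns and reports those with a strong linear correlation. Real data mining uses far richer algorithms; the column names, the values, and the `threshold=0.8` cutoff are assumptions made for this example.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def strongest_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every column pair with |r| >= threshold."""
    found = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            found.append((name_a, name_b, r))
    # Strongest relationships first.
    return sorted(found, key=lambda item: -abs(item[2]))

# Hypothetical columns; no relationship is assumed in advance.
columns = {
    "temperature": [10, 15, 20, 25, 30],
    "ice_cream_sales": [20, 31, 39, 52, 60],
    "day_of_month": [3, 27, 11, 19, 8],
}
pairs = strongest_pairs(columns)
for name_a, name_b, r in pairs:
    print(name_a, name_b, round(r, 3))
```

The scan only surfaces candidate relationships; a reported correlation still needs interpretation before it counts as a finding.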
