Week 2 Flashcards

1
Q

Define population in terms of data? Example?

A

The complete set of objects or individuals of interest, e.g. all undergraduate students

2
Q

Define sample in terms of data? Example?

A

A subgroup of a given population, e.g. undergraduate students in a specific module

3
Q

What makes a good sample?

A

No cherry picking: members are selected without bias, so the sample represents the population
Modifications

4
Q

Define variables in terms of data? Example?

A

Characteristics or properties that can take on more than one value and can change.
Examples: numbers (age, height) or categories (colours). Qualitative variables are descriptive; quantitative variables are measurable numbers.

5
Q

Two types of variables in experiments?

A

Independent variables - predictors
Dependent variables - outcomes

6
Q

What does the independent variable show?

A

The variable that is changed or manipulated. It is controlled by the experimenter to see its effect on the observed outcome.

7
Q

What does the dependent variable show?

A

The observed result (outcome), which may depend on the IV

8
Q

What is the control variable?

A

Variables that are kept constant so that they do not influence the relationship between the IV and the DV

9
Q

Four representative data types in statistics?

A

Nominal/Categorical
Ordinal
Interval
Ratio

10
Q

Which data types are qualitative and which are quantitative?

A

Nominal and ordinal - qualitative
Interval and ratio - quantitative

11
Q

What do levels of a variable mean? Example?

A

The number of distinct conditions (values) that an independent variable takes, e.g. a VO2 max test using bikes, treadmills and rowing machines has one IV (machine type) with 3 levels

12
Q

Define frequency in data?

A

How often a value appears
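
As a rough illustration of counting frequencies, here is a minimal Python sketch. The blood-type list is made-up example data, not taken from the course material.

```python
from collections import Counter

# Hypothetical nominal data: blood types of ten participants.
blood_types = ["A", "O", "B", "O", "A", "O", "AB", "O", "A", "B"]

# Counter maps each distinct value to how often it appears.
frequency = Counter(blood_types)

# Print values from most to least frequent.
for value, count in frequency.most_common():
    print(value, count)
```

The same counts, drawn as bars, are what a bar chart or histogram displays.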

13
Q

What is a histogram in data?

A

A visualisation of how data are distributed: values are grouped into bins and the height of each bar shows the frequency in that bin

14
Q

What is the mode?

A

The most frequently occurring value (not the largest value). It can be used for all variable types, but mainly for nominal and ordinal data

15
Q

What is the median?

A

The middle value, dividing the ordered data into two equal halves. It needs values that can be ordered, so it excludes nominal data

16
Q

What is the mean? With what variables is it used?

A

The arithmetic average of the data set (sum of the values divided by the number of values)
Can only be defined for interval and ratio variables, as they are numerical
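
To compare the three measures of central tendency side by side, here is a small Python sketch using the standard `statistics` module. The `scores` and `colours` lists are invented for illustration.

```python
import statistics

# Hypothetical ratio data (e.g. number of training sessions per month).
scores = [2, 3, 3, 5, 7, 8, 14]

mode = statistics.mode(scores)      # most frequent value
median = statistics.median(scores)  # middle value of the ordered data
mean = statistics.mean(scores)      # sum of values / number of values

print(mode, median, mean)

# The mode also works for nominal data, where median and mean do not apply.
colours = ["red", "blue", "red", "green"]
print(statistics.mode(colours))
```

Note how the single large value (14) pulls the mean above the median, while the mode is unaffected.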

17
Q

Define spread in data? Example?

A

How widely the values of a group are distributed, e.g. with stacks of coins: 60 coins in 3 stacks of 20 have a narrower spread than 60 coins in 6 stacks of 10

18
Q

Types of spread and what they mean?

A

Quantile - cut-off points that divide the ordered data into equal-sized sections
Quartile - quantiles that divide the data into 4 sections
Percentile - quantiles that divide the data into 100 sections
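
A possible way to compute these cut-off points in Python is `statistics.quantiles` (available from Python 3.8). This is only a sketch: the data are made up, and the function's default "exclusive" interpolation method is an assumption here, since other software may use slightly different conventions and give slightly different cut points.

```python
import statistics

# Hypothetical data: the integers 1 to 11.
data = list(range(1, 12))

# n=4 returns the 3 cut points that split the data into 4 sections.
quartiles = statistics.quantiles(data, n=4)

# n=100 returns the 99 cut points that split the data into 100 sections.
percentiles = statistics.quantiles(data, n=100)

print(quartiles)
print(len(percentiles))
```

The second quartile and the 50th percentile are both the median.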

19
Q

Define standard deviation?

A

The typical distance of the data points from the mean (the square root of the variance); it shows how spread out the data are around the centre
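
The definition can be checked by hand with a short Python sketch. The data are invented, and the population form (dividing by n) is assumed; the sample form divides by n - 1 instead and is what `statistics.stdev` computes.

```python
import math
import statistics

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)

# Variance: average squared distance from the mean (population form).
variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation: square root of the variance, in the data's own units.
sd = math.sqrt(variance)

print(mean, variance, sd)
```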

20
Q

Define skewness?

A

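Measures the degree of asymmetry of a distribution. Positive skew: the peak sits to the left with a long tail to the right. Negative skew: the peak sits to the right with a long tail to the left. 0 means the distribution is symmetric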

21
Q

Define Kurtosis?

A

Measures the sharpness of the peak (and the heaviness of the tails) of a distribution

22
Q

What are the 2nd, 3rd and 4th moments?

A

2nd = variance
3rd = Skew
4th = Kurtosis

23
Q

Define outliers? How do they occur?

A

Extreme values compared to the rest of the data; they usually arise from inaccuracies with participants, measurement or processing

24
Q

How to determine outliers?

A
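  • Based on z-score (distance from the mean divided by the SD): a value is an outlier if it lies more than 3 SD from the mean
  • Interquartile range (IQR) = width between the 1st and 3rd quartiles. A value is an outlier if it lies more than 1.5 × IQR above the 3rd quartile or more than 1.5 × IQR below the 1st quartile.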
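
Both rules can be sketched in a few lines of Python. The readings are made up, the population SD is assumed for the z-score, and the quartiles use the default method of `statistics.quantiles` (Python 3.8+); other conventions can move the fences slightly.

```python
import statistics

# Hypothetical data: 20 readings near 50 plus one extreme value.
values = [48, 49, 50, 51, 52] * 4 + [100]

mean = statistics.mean(values)
sd = statistics.pstdev(values)  # population SD (an assumption of this sketch)

# z-score rule: flag values more than 3 SD away from the mean.
z_outliers = [x for x in values if abs((x - mean) / sd) > 3]

# IQR rule: flag values beyond 1.5 x IQR outside the 1st/3rd quartiles.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
iqr_outliers = [x for x in values if x < lower_fence or x > upper_fence]

print(z_outliers, iqr_outliers)
```

With very small data sets the z-score rule can fail to flag anything, because a single extreme value also inflates the SD.
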
25
Q

How to determine outliers?

A

Based on z-score (distance from the mean divided by the SD): a value is an outlier if it lies more than 3 SD from the mean

26
Q

What is a box plot? What does it show?

A

Summarises quartile-based statistics and displays the distribution of a data set.

Location of the quartiles: the box usually covers the middle 50% of the data (1st to 3rd quartile)
Range/spread of the data
Outliers, detected using the quartiles.
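
The numbers behind a box plot can be computed without drawing anything. The sketch below assumes the common 1.5 × IQR whisker rule and the default quartile method of `statistics.quantiles` (Python 3.8+); `box_plot_summary` is a hypothetical helper name and the data are invented.

```python
import statistics

def box_plot_summary(data):
    """Return the quartiles, whisker ends and outliers a box plot would show."""
    ordered = sorted(data)
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Values inside the fences; the whiskers end at the extremes of these.
    inside = [x for x in ordered if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whiskers": (inside[0], inside[-1]),
        "outliers": [x for x in ordered if x < lower_fence or x > upper_fence],
    }

summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 30])
print(summary)
```

The box spans `q1` to `q3`, the line inside it is the median, and points listed under `outliers` are drawn individually beyond the whiskers.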

27
Q

Nominal Data? Example?

A

Categorical data with no order or ranking

E.g. blood type, gender, ethnicity.

28
Q

Ordinal Data? Example?

A

Categorical data with clear, ordered ranking but intervals/difference between values are not necessarily equal

E.g. customer satisfaction (satisfied, very satisfied, etc.), education levels.

29
Q

Interval Data? Example?

A

Numeric data with equal intervals/differences between values, but no true zero point (0 still has meaning, it is not an empty value)

E.g. IQ, temperature in °C.

30
Q

Ratio Data? Example?

A

Numeric data with both equal intervals/differences and a true zero point (0 means nothing, an empty value), allowing for ratios

E.g. height, weight, income, age.

31
Q

What is the formula for the 2nd/3rd/4th moment? How to make it dimensionless?

A

Sum of (distance from mean)^2/3/4 over all data points, divided by the number of data points = variance/skew/kurtosis

To make it dimensionless, divide the answer by SD^2/3/4 respectively (SD^3 for skew, SD^4 for kurtosis).
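
The moment formula can be written as one small Python function. This is a sketch under assumptions: `central_moment` is a hypothetical helper name, the data are invented, and the population form (dividing by n) is used throughout.

```python
def central_moment(data, k):
    """Average of (x - mean)^k over all data points."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = central_moment(data, 2)  # 2nd moment
sd = variance ** 0.5

# Dividing by SD^3 and SD^4 removes the units (dimensionless values).
skewness = central_moment(data, 3) / sd ** 3
kurtosis = central_moment(data, 4) / sd ** 4

print(variance, skewness, kurtosis)
```

The positive skewness here reflects the longer tail towards the larger values; for reference, a normal distribution has a kurtosis of 3 on this scale.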