3] Descriptive Statistics 2 Flashcards

1
Q

What is a measure of variation

A

It is a way to describe the distribution or dispersion of data, showing how far data points are from one another

2
Q

Why do we use the measure of variation

A

Looking only at the average can oversimplify the data, so we must also consider how much the scores vary around the mean

3
Q

What is a model

A

It is a simple representation of a complex thing

4
Q

What are the three common measures of variation

A

1: The range
2: The interquartile range
3: The standard deviation

5
Q

What are the two concepts/measures that the standard deviation is built on

A

1: Sum of squares
2: The variance

6
Q

What is the range

A

It is the simplest descriptive measure of variation for a numerical variable
It is the difference between the largest and smallest values, which measures the total spread of the data
E.g.: largest = 19, smallest = 5, so 19 - 5 = 14
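A minimal Python sketch of the same calculation. The list of scores is hypothetical; only its smallest (5) and largest (19) values come from the example above.

```python
# Hypothetical scores: only the minimum (5) and maximum (19) match the card.
scores = [5, 8, 12, 15, 19]

# Range = largest value minus smallest value.
data_range = max(scores) - min(scores)

print(data_range)  # 14
```

Every value between the minimum and maximum is ignored, which is why the range says nothing about how the scores are distributed.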

7
Q

The range: Advantages and disadvantages

A

Advantages
1: Very simple measure
Disadvantages
1: Doesn’t take all scores into account
2: Can oversimplify the data
3: Doesn’t take into account how values are distributed
4: Greatly affected by outliers

8
Q

What is the interquartile range

A

It’s common to find extremely high and low scores in a dataset, and the interquartile range is designed to set them aside
To find it, you exclude the lowest 25% of scores and the highest 25% of scores, which removes the influence of most outliers
It’s the distance between Q1 and Q3 (Q3 - Q1)
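As a rough illustration, the sketch below computes the interquartile range with Python's standard `statistics.quantiles`. The scores are hypothetical and include one deliberate outlier. Note that quartile conventions differ between textbooks and software, so the exact Q1 and Q3 values may not match a hand calculation.

```python
import statistics

# Hypothetical scores with one extreme value (100).
scores = [1, 2, 3, 4, 5, 6, 7, 100]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4)

iqr = q3 - q1                            # spread of the middle 50% of scores
data_range = max(scores) - min(scores)   # spread of all scores, outlier included

print("IQR:", iqr)
print("Range:", data_range)
```

The outlier inflates the range but barely moves the interquartile range, because it falls in the top 25% that is set aside.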

9
Q

The interquartile range: Advantages and disadvantages

A

Advantages
1: It is derived from the middle 50% of a distribution
2: Less likely to be influenced by extreme scores
3: Provides a better and more stable measure of variability than the range
Disadvantages
1: Only regards the middle 50% of scores and disregards the rest
2: Crude measure of variability

10
Q

What is standard deviation

A

It is a measure of how dispersed the data are in relation to the mean

11
Q

How do we see how far each number (results in a class test) is from the mean

A

We subtract the mean of the results from each score, then add up the differences
E.g.: scores 3, 5, 6, 7, 9; mean = 30 / 5 = 6
3 - 6 = -3
5 - 6 = -1
6 - 6 = 0
7 - 6 = 1
9 - 6 = 3
Sum = 0
The differences always add up to 0, because the negative and positive values cancel out
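The same subtraction can be sketched in Python using the five scores from the example. The variable names are illustrative only.

```python
scores = [3, 5, 6, 7, 9]

# Mean = total of the scores divided by how many there are.
mean = sum(scores) / len(scores)

# Deviation of each score from the mean (score minus mean).
deviations = [x - mean for x in scores]

print(mean)             # 6.0
print(deviations)       # [-3.0, -1.0, 0.0, 1.0, 3.0]
print(sum(deviations))  # 0.0
```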

12
Q

What is the sum of squares

A

The differences between each score and the mean always add up to 0, so we square each difference before adding them up. The total is the sum of squares
E.g.: scores 3, 5, 6, 7, 9; mean = 6
3 - 6 = -3, and (-3)^2 = 9
5 - 6 = -1, and (-1)^2 = 1
6 - 6 = 0, and 0^2 = 0
7 - 6 = 1, and 1^2 = 1
9 - 6 = 3, and 3^2 = 9
Sum of squares = 9 + 1 + 0 + 1 + 9 = 20
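A short Python sketch of the squaring step, again using the scores from the example. Squaring makes every difference non-negative, so they no longer cancel.

```python
scores = [3, 5, 6, 7, 9]
mean = sum(scores) / len(scores)  # 6.0

# Square each deviation so negatives and positives cannot cancel out.
squared_deviations = [(x - mean) ** 2 for x in scores]

# The sum of squares is the total of the squared deviations.
sum_of_squares = sum(squared_deviations)

print(squared_deviations)  # [9.0, 1.0, 0.0, 1.0, 9.0]
print(sum_of_squares)      # 20.0
```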

13
Q

What is the average sum of squares

A

We divide the sum of squares by the number of scores (N)
E.g.: SoS = 20 and N = 5
20 / 5 = 4
It shows that, on average, the squared distance of each data point from the mean is 4 (in squared units)
This is called the variance

14
Q

What is the formula for variance

A

$\text{Variance} = \frac{\sum (x - \bar{x})^2}{N}$
$x$ (each data point)
$\bar{x}$ (the mean)
$N$ (number of data points)
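The formula can be written as a small Python function. This is a sketch: `population_variance` is a hypothetical helper name, and the scores are the ones used in the earlier cards. The standard library's `statistics.pvariance` also divides by N, so it is used here as a cross-check.

```python
import statistics

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

scores = [3, 5, 6, 7, 9]

variance = population_variance(scores)
print(variance)                      # 4.0

# Cross-check against the standard library (also divides by N).
print(statistics.pvariance(scores))
```

Note that `statistics.variance` (without the `p`) divides by N - 1 instead, so it gives a different value from the formula on this card.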

15
Q

What is the problem with variance

A

It’s expressed in squared units rather than the units of the mean, which makes it hard to interpret
E.g.: Mean = 6 units and Variance = 4 squared units

16
Q

How do we fix the issue with Variance

A

We use the same formula as for the variance but take the square root of the entire thing
The resulting value is called the standard deviation, and it is always non-negative
E.g.: (square root) 4 = 2
This represents, roughly, the typical distance of each score from the mean, in the same units as the scores
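As a final sketch, the square-root step in Python, using the same scores as the earlier cards. `statistics.pstdev` is the standard library's population standard deviation (dividing by N) and is shown as a cross-check.

```python
import math
import statistics

scores = [3, 5, 6, 7, 9]
mean = sum(scores) / len(scores)

# Variance: average squared deviation from the mean (divide by N).
variance = sum((x - mean) ** 2 for x in scores) / len(scores)

# Standard deviation: square root of the variance, back in the original units.
std_dev = math.sqrt(variance)

print(variance)  # 4.0
print(std_dev)   # 2.0

# Cross-check against the standard library.
print(statistics.pstdev(scores))
```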