Descriptive Statistics Flashcards

1
Q

What are descriptive statistics?

A

Statistics that summarise and describe the data you have, without generalising beyond it

2
Q

What is the population?

A

Entire group of people you are interested in

3
Q

What is a sample?

A

Subset of the population

Its size is usually represented with n (the sample size)

4
Q

What is categorical data?

A

Usually nominal, sometimes ordinal

Two or more categories; nominal categories have no inherent order, ordinal categories do

5
Q

What are examples of categorical data?

A

Hair colour

Marital status

6
Q

What is discrete data?

A

Usually ordinal, ratio or interval variables

Takes only fixed, separate values (such as whole numbers) with a logical order

7
Q

What are examples of discrete data?

A

Shoe size

Score out of 10

8
Q

What is continuous data?

A

Usually ratio or interval variables

Can take any value within a range, including fractional values

9
Q

What are examples of continuous data?

A

Reaction times

10
Q

How can categorical data be presented in a frequency distribution?

A

As its raw frequency or as a percentage frequency

11
Q

How can discrete data be presented in a frequency distribution?

A

As raw frequency or percentage

As cumulative frequency or percentage

If there are many distinct values, use frequency ranges instead (values grouped in a meaningful way)
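
The four presentations above (raw, percentage, cumulative frequency, cumulative percentage) can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the scores are made up for the example and do not come from the cards.

```python
from collections import Counter
from itertools import accumulate


def frequency_table(scores):
    """Return rows of (value, raw, percent, cumulative raw, cumulative percent)."""
    counts = Counter(scores)
    values = sorted(counts)             # discrete values in their logical order
    total = len(scores)
    raw = [counts[v] for v in values]
    cumulative = list(accumulate(raw))  # running total of the raw frequencies
    return [
        (v, f, 100 * f / total, c, 100 * c / total)
        for v, f, c in zip(values, raw, cumulative)
    ]


# Hypothetical scores out of 10
scores = [3, 5, 5, 7, 7, 7, 8, 8, 9, 10]
table = frequency_table(scores)
for row in table:
    print(row)
```

The last row's cumulative percentage is always 100, which is a quick sanity check on any table built this way.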

12
Q

What are measures of central tendency?

A

Single numbers that condense an entire frequency distribution

Describe the typical or central value of the data

13
Q

What are three types of measures of central tendency?

A

Mode

Median

Mean

14
Q

What is the mode?

A

Score occurring most often in dataset

Sometimes takes more than one value (bimodal and multimodal distributions)
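
As a small illustration of a dataset with more than one mode, Python's standard `statistics.multimode` returns every most-frequent value. The hair-colour list is invented for the example.

```python
from statistics import multimode

# Hypothetical nominal data: two colours tie for most frequent
hair_colours = ["brown", "black", "brown", "blonde", "black", "red"]

modes = multimode(hair_colours)  # all values sharing the highest count
print(modes)

# A dataset with a single mode gives a one-element list
single = multimode([1, 2, 2, 3])
print(single)
```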

15
Q

What data is the mode used for?

A

Nominal data

16
Q

What is the median?

A

Middle score in the dataset once the scores are sorted

Middle value if the number of scores is odd, or mean of the middle two values if even

17
Q

How do you work out the median for datasets with an odd number of values?

A

Sort the values and take the middle one

18
Q

How do you work out the median for datasets with an even number of values?

A

(sum of the middle two values) / 2, after sorting the data
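
A minimal sketch of the median covering both cases, an odd and an even number of values. The function is written out by hand for illustration; Python's `statistics.median` does the same job.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the middle two."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: a single middle value exists
        return ordered[mid]
    # Even number of values: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # odd-sized dataset
print(median([4, 1, 3, 2]))  # even-sized dataset
```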

19
Q

What are the pros of the median?

A

Insensitive to outliers

Often gives real, meaningful data value

20
Q

What data is the median used for?

A

Ordinal data

Skewed interval/ratio data

21
Q

What are the cons of the median?

A

Ignores a lot of data

Tedious to calculate by hand for large datasets (the data must be sorted first)

Can’t use with nominal data

22
Q

What is the mean?

A

Sum of data points divided by number of data points

23
Q

What are the pros of the mean?

A

Uses all of the data

Most effective for normally distributed datasets

24
Q

What are the cons of the mean?

A

Sensitive to outliers

Values not always meaningful

Only meaningful for ratio and interval data
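
The contrast between the mean's sensitivity to outliers and the median's insensitivity can be seen with a small, invented set of reaction times. The numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean, median

# Hypothetical reaction times in milliseconds
reaction_times = [310, 295, 305, 300, 290]
with_outlier = reaction_times + [1500]  # one very slow trial

print(mean(reaction_times), median(reaction_times))
print(mean(with_outlier), median(with_outlier))
```

With the outlier added, the mean moves from 300 to 500 while the median only moves from 300 to 302.5.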

25
Q

What measure of spread is used for the mode?

26
Q

What measure of spread is used for the median?

A

“Distance based” measures

Range, IQR

27
Q

What measures of spread are used for the mean?

A

“Centre-based” measures

Variance, standard deviation

28
Q

What is the interquartile range?

A

Similar to range but ignores most extreme values

Range of scores within middle 50% of scores

UQ - LQ

29
Q

What is the lower quartile?

A

Median of lower half of data

30
Q

What is the upper quartile?

A

Median of upper half of data
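
The quartile definitions above (median of the lower half, median of the upper half) can be sketched as follows. One assumption is made that the cards do not state: when the number of values is odd, the overall median is left out of both halves. Other conventions exist, and library routines such as `statistics.quantiles` may return slightly different values.

```python
from statistics import median


def quartiles(values):
    """Return (LQ, UQ) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values to form two halves")
    half = n // 2
    lower = ordered[:half]
    # Assumption: for odd n, the middle value belongs to neither half
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(upper)


def iqr(values):
    """Interquartile range: UQ - LQ."""
    lq, uq = quartiles(values)
    return uq - lq


print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))
print(iqr([1, 2, 3, 4, 5, 6, 7]))
```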

31
Q

What are the pros of the IQR?

A

Insensitive to outliers

Often gives real, meaningful data value

Useful for ordinal data and skewed interval/ratio data

32
Q

What are the cons of the IQR?

A

Ignores a lot of data

Tedious to calculate by hand for large datasets (the data must be sorted first)

Can’t use with nominal data

33
Q

What is the deviance?

A

Mean subtracted from each score (score - mean)

Can be “0” when a score is equal to the mean

How far a score is from the mean

34
Q

What is the sum of squared errors (SS)?

A

Each deviance is squared, then all the squared deviances are summed

More data points = bigger SS

35
Q

What is the variance?

A

“Average” of the squared errors: SS divided by the number of data points (or by n - 1 when estimating from a sample)
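
The chain from deviance to sum of squares to variance can be worked through in a short sketch. The function names and scores are invented for illustration, and the `sample` flag (dividing by n - 1 rather than n) is an assumption about which "average" is wanted.

```python
def deviances(scores):
    """Distance of each score from the mean (score - mean)."""
    m = sum(scores) / len(scores)
    return [x - m for x in scores]


def sum_of_squares(scores):
    """SS: square each deviance, then add them all up."""
    return sum(d ** 2 for d in deviances(scores))


def variance(scores, sample=True):
    """SS divided by n - 1 (sample estimate) or by n (whole population)."""
    n = len(scores)
    return sum_of_squares(scores) / (n - 1 if sample else n)


# Hypothetical scores with a mean of 5
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(deviances(scores))          # raw deviances sum to zero
print(sum_of_squares(scores))
print(variance(scores, sample=False))
```

The raw deviances always sum to zero, which is why they are squared before being added.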

36
Q

What are the pros of the variance?

A

Uses all data

Forms basis of several other tests/statistics

37
Q

What are the cons of the variance?

A

Most informative for roughly normally distributed data

Sensitive to outliers

Units not sensible (expressed in squared units of the original measurement)

38
Q

What is the standard deviation?

A

Measure of spread expressed in the same units as the DV (dependent variable)

Square root of the variance

Can be calculated for a whole population, or estimated for the population from a sample

The sample version divides by n - 1 rather than n, which gives an unbiased estimate of the population variance when only a sample of data is available
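
The population and sample versions of the standard deviation can be compared using Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n - 1. The scores are hypothetical.

```python
import math
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical scores treated first as a whole population, then as a sample
scores = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = pstdev(scores)  # divides SS by n
sample_sd = stdev(scores)       # divides SS by n - 1

# Each standard deviation is the square root of the matching variance
assert math.isclose(population_sd, math.sqrt(pvariance(scores)))
assert math.isclose(sample_sd, math.sqrt(variance(scores)))

print(population_sd, sample_sd)
```

The sample estimate is always slightly larger than the population figure for the same numbers, because it divides by a smaller denominator.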