Descriptive Statistics Flashcards

1
Q

what are discrete variables

A

variables that can only take certain separate values (often whole numbers), such as number of objects or shoe size

2
Q

what are continuous variables

A

can take any fractional value within their range, e.g. amount of time, distance

3
Q

why do we use frequency ranges

A

it's often not sensible to calculate frequencies on the basis of each possible score, as there may be a lot of possibilities

can condense data while still retaining a lot of information
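
A minimal sketch of grouping scores into frequency ranges. The scores and the interval width of 10 are made up for illustration.

```python
from collections import Counter

# Hypothetical exam scores (invented for this example)
scores = [12, 47, 55, 58, 61, 63, 64, 70, 72, 88]
width = 10  # assumed width of each frequency range

# Map each score to the lower bound of its range, then count per range
counts = Counter((s // width) * width for s in scores)

# Build a readable table such as "60-69": 3
freq_table = {f"{lo}-{lo + width - 1}": counts[lo] for lo in sorted(counts)}
print(freq_table)
```

Ten separate scores condense into six ranges, and the table still shows where most scores fall.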

4
Q

what are the three common measures of central tendency

A

mean mode median
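
A quick illustration of all three measures using Python's standard `statistics` module. The data values are made up.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # invented scores

mean = statistics.mean(data)      # sum / count = 30 / 6
median = statistics.median(data)  # middle of the sorted data: (3 + 5) / 2
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)
```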

5
Q

pros of using the mode

A

can be used for categorical (nominal) data, where the mean and median cannot

always gives a real data value

6
Q

cons of using the mode

A

sometimes gives multiple values (bimodal distributions)

varies depending on group size
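
A small sketch of the mode's main pro and con, using the standard `statistics` module (`multimode` needs Python 3.8 or later). Both lists are invented examples.

```python
import statistics

# Pro: works on categorical data and returns a value that is in the data
colours = ["red", "blue", "red", "green"]
favourite = statistics.mode(colours)

# Con: a bimodal set of scores has more than one mode
scores = [1, 1, 2, 3, 3]
modes = statistics.multimode(scores)

print(favourite, modes)
```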

7
Q

when do we use mode most

A

usually just for nominal data

8
Q

pros of the median

A

insensitive to outlying data so not skewed

often gives a real data value
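
To see the insensitivity to outliers, compare the median and the mean before and after adding one extreme score. The numbers are made up for illustration.

```python
import statistics

incomes = [20, 22, 25, 27, 30]   # invented values, in thousands
with_outlier = incomes + [500]   # same data plus one extreme score

median_before = statistics.median(incomes)
median_after = statistics.median(with_outlier)

mean_before = statistics.mean(incomes)
mean_after = statistics.mean(with_outlier)

print(median_before, median_after)  # the median barely moves
print(mean_before, mean_after)      # the mean is dragged upwards
```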

9
Q

cons of the median

A

ignores a lot of the data, as it doesn't take the size of extreme values into account

difficult to calculate for lots of data without a computer

10
Q

when do we commonly use median

A

for ordinal data and sometimes for skewed interval or ratio data

11
Q

1 pro of the mean

A

uses all the data

12
Q

cons of the mean

A

very sensitive to anomalous results

doesn't always give a meaningful value (2.4 children??)

only meaningful for ratio and interval data
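
A tiny example of the "2.4 children" problem. The family sizes are invented so that the mean comes out as a value no family could actually have.

```python
import statistics

children = [0, 1, 2, 3, 6]  # made-up number of children per family

mean_children = statistics.mean(children)  # 12 / 5
print(mean_children)  # a fractional number of children
```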

13
Q

name 4 measures of data spread

A

range
interquartile range
variance
standard deviation

14
Q

con of the range

A

very sensitive to outliers - the highest and lowest score are not likely to represent the majority of data
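
A short sketch of how one outlier changes the range. The scores are made up.

```python
scores = [48, 50, 51, 52, 53]   # invented, tightly clustered scores
with_outlier = scores + [95]    # one unusually high score added

data_range = max(scores) - min(scores)
range_with_outlier = max(with_outlier) - min(with_outlier)

print(data_range, range_with_outlier)
```

A single score moves the range from 5 to 47, even though most of the data has not changed.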

15
Q

how do you find the interquartile range and semi IQR

A

a quartile is the score at or below which a given quarter of the population falls

Q1 is the score that cuts off the lowest 25% of the data, Q2 is the median, Q3 is the score that cuts off the lowest 75%

IQR is Q3-Q1

semi IQR is (Q3-Q1)/2
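
A minimal sketch of the calculation using `statistics.quantiles` (Python 3.8 or later). The data are made up, and note that textbooks and software use slightly different conventions for locating quartiles, so other tools may give slightly different Q1 and Q3 for the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]  # invented scores

# n=4 splits the data into quarters and returns the three cut points
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1
semi_iqr = iqr / 2

print(q1, q2, q3, iqr, semi_iqr)
```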

16
Q

pros of IQR

A

less sensitive to outliers

few assumptions about data

17
Q

cons of IQR

A

hard to calculate by hand for large datasets

doesn't use all the data

18
Q

describe variance

A

it's a complicated sum I don't need to remember

essentially: as the spread of a set of scores increases so does the variance
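
For reference, the sum is less complicated than it sounds: the variance is the average squared distance of each score from the mean. A sketch with made-up data, assuming the population version (divide by n); the sample version divides by n - 1 instead.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # invented scores

mean = sum(data) / len(data)

# Squared distance of every score from the mean
squared_devs = [(x - mean) ** 2 for x in data]

pop_variance = sum(squared_devs) / len(data)           # divide by n
sample_variance = sum(squared_devs) / (len(data) - 1)  # divide by n - 1

print(pop_variance, sample_variance)
```

The results match `statistics.pvariance(data)` and `statistics.variance(data)` respectively.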

19
Q

pros of the variance

A

uses all data

forms the basis for several other tests

20
Q

cons of variance

A

more sensitive to outliers

most meaningful when the data are roughly normally distributed

is in squared units, so it isn't on the same scale as the original scores

21
Q

what is standard deviation

A

square root of the variance

same unit as variable being measured
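
A short sketch linking the two, with invented reaction times in seconds and assuming the population versions of both measures.

```python
import math
import statistics

times_s = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up reaction times, in seconds

variance = statistics.pvariance(times_s)  # in seconds squared
sd = math.sqrt(variance)                  # back in seconds

print(variance, sd)
```

The square root gives the same result as calling `statistics.pstdev(times_s)` directly.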