descriptive stats Flashcards

1
Q

What 3 factors should you consider when designing a study

A

Type of data
Whether you are looking for a difference or a relationship
Number of groups or variables

2
Q

what are the 2 types of data

A

measurement and categorical

3
Q

what is measurement data

A

quantitative data: the result of measuring something, e.g. a score, weight or reaction time

4
Q

what is categorical data

A

qualitative (frequency) data: counts of how many observations fall into each category

5
Q

what are the 4 types of scales

A

nominal
ordinal
interval
ratio

6
Q

what is a nominal scale and when is it used

A

used for categorical data; the numbers or names are only labels for categories, with no order implied

7
Q

why shouldn’t you calculate summary descriptions for categorical data

A

the values are only labels, so arithmetic summaries such as the mean give nonsensical results

8
Q

define ordinal scales and what they’re used for

A

used to rank-order objects along a continuum

they give no information about the size of the differences between scale points

9
Q

give an example of a study using ordinal scales

A

Holmes and Rahe (1967): the Social Readjustment Rating Scale, which ranks life events by how stressful they are

10
Q

define interval scales and what they’re used for

A

used when equal intervals between scale points represent equal differences in the thing being measured
ratios are not meaningful because the 0 point on the scale is arbitrary

11
Q

define arbitrary

A

not based on any system or reason

12
Q

define ratio scales and what they’re used for

A

have a true zero point, so ratios between values are meaningful

true zero corresponds to absence of thing being measured

13
Q

what are the aims of descriptive statistics

A

to characterise a numerical dataset representatively
to condense a lot of information meaningfully
to minimise the error involved in the condensing process

14
Q

what are inferential statistics

A

aim to infer characteristics of a whole population from a sample, making probable rather than certain assertions
use sample statistics to estimate population parameters
rely on theoretical sampling distributions built from innumerable random samples
use p-values and confidence intervals

15
Q

what are the main categories of descriptive statistics

A

measures of central tendency and measures of dispersion

16
Q

what are the 3 measures of central tendency

A

mean
median
mode

17
Q

what are the measures of dispersion

A

range
IQR
variance
standard deviation

18
Q

what is the mean; give the equation

A

average score; calculated as sum of scores / number of scores

Σ x / N
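
A minimal sketch of Σ x / N in Python. The scores are made up for illustration.

```python
# Hypothetical scores, used only to illustrate the formula.
scores = [4, 8, 6, 5, 7]

def mean(xs):
    """Sum of the scores divided by the number of scores (Σ x / N)."""
    return sum(xs) / len(xs)

print(mean(scores))  # 6.0
```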

19
Q

When is the mean most useful and why

A

For normal/symmetric distributions, where the mean is the most efficient measure of central tendency and the least subject to sampling fluctuation

20
Q

what are the disadvantages of using the mean

A

greatly influenced by extreme scores

can misrepresent the typical score when the distribution is skewed
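
To illustrate the pull of an extreme score, here is a small sketch using Python's standard `statistics` module. The data are hypothetical; one extreme value is appended to an otherwise tight set of scores.

```python
from statistics import mean, median

# Hypothetical scores; the last value added below is a deliberate outlier.
typical = [20, 22, 25, 27, 30]
with_outlier = typical + [500]

print(mean(typical), median(typical))            # mean and median are close
print(mean(with_outlier), median(with_outlier))  # mean jumps, median barely moves
```

With the outlier included, the mean rises from 24.8 to 104 while the median only shifts from 25 to 26.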

21
Q

how can you tell if the mean is an appropriate measure to use on a dataset

A

plot a histogram: if the distribution is roughly symmetrical, the mean is appropriate

22
Q

what type of distribution is unsuitable for the mean

A

skewed distributions

23
Q

what is the median

A

the central value when all scores are arranged in order

24
Q

why and when is the median useful

A

less sensitive to extreme scores; gives a more accurate representation of the data
better measure than mean for highly skewed distributions.

25
Q

what is median formula

A

median position = (N + 1) / 2 in the ordered scores; if this falls between two scores, average them
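
The position formula gives a rank in the ordered scores, not the median itself. A hedged sketch, with a hypothetical helper name and made-up data:

```python
import math

def median_by_position(xs):
    """Find the median via the (N + 1) / 2 position in the ordered scores."""
    ordered = sorted(xs)
    position = (len(ordered) + 1) / 2          # 1-based rank, may end in .5
    lower = ordered[math.floor(position) - 1]  # score at or just below the rank
    upper = ordered[math.ceil(position) - 1]   # score at or just above the rank
    return (lower + upper) / 2                 # equal to the score itself when N is odd

print(median_by_position([7, 1, 3, 9, 5]))      # 5.0 (position 3 of 5)
print(median_by_position([7, 1, 3, 9, 5, 11]))  # 6.0 (position 3.5 of 6)
```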

26
Q

define mode

A

most common score

27
Q

what happens if you have 2 adjacent modes

A

take their average: (mode 1 + mode 2) / 2

28
Q

what happens if you have 2 nonadjacent modes

A

bimodal distribution

29
Q

what are the 2 defining features of measures of central tendency

A

they indicate typical values and are summarised by a single number

30
Q

what are the suitable summary descriptions for categorical data

A

frequencies
percentages
mode
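
These three summaries can be computed without doing arithmetic on the category labels themselves. A minimal sketch with hypothetical eye-colour data:

```python
from collections import Counter

# Hypothetical categorical observations.
eye_colours = ["brown", "blue", "brown", "green", "brown", "blue"]

frequencies = Counter(eye_colours)  # count per category
total = len(eye_colours)
percentages = {colour: 100 * count / total for colour, count in frequencies.items()}
mode = frequencies.most_common(1)[0][0]  # most frequent category

print(dict(frequencies))  # {'brown': 3, 'blue': 2, 'green': 1}
print(percentages["brown"])  # 50.0
print(mode)  # brown
```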

31
Q

what are measures of variability

A

they describe the degree to which values vary, i.e. how spread out the scores are

32
Q

what is the range and how is it computed

A

measure of distance from lowest to highest score; max value - min value

33
Q

what are the disadvantages of using range

A

distorted by extreme values/outliers

unstable across different samples
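
A short sketch of max - min, showing how a single outlier inflates the range. The helper name and scores are hypothetical.

```python
def value_range(xs):
    """Range = highest score minus lowest score."""
    return max(xs) - min(xs)

scores = [12, 15, 14, 13, 16]         # hypothetical scores
scores_with_outlier = scores + [40]   # same data plus one extreme value

print(value_range(scores))               # 4
print(value_range(scores_with_outlier))  # 28
```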

34
Q

what is the real advantage of range

A

straightforward to calculate and easy to interpret

35
Q

what is the IQR, what does it use and how is it calculated

A

the IQR is the distance covering the middle 1/2 of the scores: Q3 - Q1
it uses percentiles: the 75th percentile (Q3) and the 25th percentile (Q1)
the semi-interquartile range is one half of that difference, (Q3 - Q1)/2, i.e. 1/2 the distance needed to cover 1/2 the scores
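
A minimal sketch of both quantities using `statistics.quantiles` from the Python standard library (3.8+). Its default "exclusive" method places quartiles by rank in N + 1, which is one of several quartile conventions; other software may give slightly different values. The data are hypothetical.

```python
from statistics import quantiles

# Hypothetical scores.
scores = [2, 4, 4, 5, 6, 7, 8]

q1, q2, q3 = quantiles(scores, n=4)  # 25th, 50th and 75th percentiles
iqr = q3 - q1                        # interquartile range
semi_iqr = iqr / 2                   # semi-interquartile range

print(q1, q3)    # 4.0 7.0
print(iqr)       # 3.0
print(semi_iqr)  # 1.5
```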

36
Q

what is the difference between the semi-interquartile range in a symmetric vs skewed distribution

A

In a symmetric distribution, an interval stretching from one semi-interquartile range below the median to one semi-interquartile above the median will contain 1/2 of the scores. This will not be true for a skewed distribution, however.

37
Q

what are the advantages of IQR and what kind of distribution is it useful in

A

little affected by extreme scores; good measure of spread for skewed distributions.

38
Q

how do you calculate the individual percentiles (e.g. the quartiles) used for the IQR

A

convert the percentile to a proportion, e.g. 50th percentile = 0.50
multiply by (N + 1) to get the rank: 0.50 * (N + 1) = rank X
order the scores, then read off the score at rank X (Q1 uses 0.25, Q3 uses 0.75)
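
The rank method can be sketched as below. The function name and data are hypothetical, and the linear interpolation for ranks that fall between two scores is an added assumption; the card only covers ranks that land on a score.

```python
import math

def value_at_percentile(xs, percentile):
    """Score at a given percentile using rank = (percentile / 100) * (N + 1)."""
    ordered = sorted(xs)
    n = len(ordered)
    rank = (percentile / 100) * (n + 1)
    rank = min(max(rank, 1), n)  # keep the rank inside the dataset
    lower = math.floor(rank)
    fraction = rank - lower
    if fraction == 0:
        return ordered[lower - 1]  # rank lands exactly on a score
    # Assumption: interpolate linearly between the two neighbouring scores.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

scores = [2, 4, 4, 5, 6, 7, 8]  # hypothetical scores, N = 7
print(value_at_percentile(scores, 25))  # 4 (rank 2)
print(value_at_percentile(scores, 50))  # 5 (rank 4)
print(value_at_percentile(scores, 75))  # 7 (rank 6)
```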

39
Q

what is the disadvantage for IQR in normal distributions

A

more subject to sampling fluctuation in normal distributions than the standard deviation and therefore not often used for data that are approximately normally distributed.

40
Q

define variance

A

measure of how much scores vary in terms of distance from mean
average of each score’s squared deviation from mean score

41
Q

what is variance formula

A

σ² = Σ (x - mean)² / N

42
Q

how does variance formula change when computing for sample vs population

A

sample: divide by N - 1, giving s² = Σ (x - mean)² / (N - 1)

population: divide by N
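
A hedged sketch of the two denominators, checked against `statistics.pvariance` and `statistics.variance` from the standard library. The helper name and data are hypothetical.

```python
import math
from statistics import pvariance, variance

def variance_from_scores(xs, sample=False):
    """Average squared deviation from the mean; divides by N - 1 for a sample."""
    n = len(xs)
    mean = sum(xs) / n
    sum_squared_deviations = sum((x - mean) ** 2 for x in xs)
    return sum_squared_deviations / (n - 1 if sample else n)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5

population_var = variance_from_scores(scores)          # divides by N
sample_var = variance_from_scores(scores, sample=True)  # divides by N - 1

print(population_var)  # 4.0
print(math.isclose(population_var, pvariance(scores)))  # True
print(math.isclose(sample_var, variance(scores)))       # True
```

The sample version is always slightly larger, because dividing by N - 1 corrects for the tendency of a sample to underestimate the population variance.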

43
Q

when do you use sample variance formula

A

when you have data from a sample and want to generalise to the wider population, i.e. to estimate the population variance

44
Q

what is standard deviation

A

square root of variance

45
Q

what does a bigger SD value mean

A

values more spread out

46
Q

what is the equation for sample and population SD

A

Population σ = √σ²

Sample s = √s²
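
As a quick check that each SD is the square root of the matching variance, here is a sketch using the standard `statistics` module with hypothetical data:

```python
import math
from statistics import pstdev, pvariance, stdev, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

population_sd = pstdev(scores)  # σ, based on dividing by N
sample_sd = stdev(scores)       # s, based on dividing by N - 1

print(math.isclose(population_sd, math.sqrt(pvariance(scores))))  # True
print(math.isclose(sample_sd, math.sqrt(variance(scores))))       # True
```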

47
Q

what can you do if you know the SD and mean in normal distribution

A

possible to compute the percentile rank associated with any given score

48
Q

in a normal distribution, how many of the scores are within 1 SD of the mean

A

about 68%

49
Q

in a normal distribution, how many of the scores are within 2 SDs of the mean

A

about 95%
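
Both the percentile rank of a score and the 1 SD and 2 SD proportions can be checked with `statistics.NormalDist` (Python 3.8+). The mean of 100 and SD of 15 below are assumed values chosen only for illustration.

```python
from statistics import NormalDist

# Assumed normal distribution: mean 100, SD 15.
dist = NormalDist(mu=100, sigma=15)

percentile_rank = dist.cdf(115) * 100       # % of scores at or below 115
within_1_sd = dist.cdf(115) - dist.cdf(85)  # proportion within 1 SD of the mean
within_2_sd = dist.cdf(130) - dist.cdf(70)  # proportion within 2 SDs of the mean

print(round(percentile_rank, 1))  # 84.1
print(round(within_1_sd, 3))      # 0.683
print(round(within_2_sd, 3))      # 0.954
```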

50
Q

why is SD useful

A

used in many inferential stats tests

51
Q

what is a disadvantage of the SD and how can this be overcome

A

not a good measure of spread in highly-skewed distributions

supplement it with the IQR.