Descriptive Statistics (B) Flashcards

1
Q

what is the difference between bar graphs and histograms?

A
  • bar graphs are good when data is in categories
  • histograms deal with continuous data

2
Q

what is continuous data?

A
  • can be measured
  • can take any value within a range (infinitely many possible values)
  • e.g. temperature
  • opposite of discrete data
3
Q

what is a histogram?

A

displays the frequency density of X-values occurring in each class interval of a frequency table

as a series of rectangular bars

4
Q

what is/how do you work out frequency density?

A
  • measures how many observations there are per unit of X
  • frequency density = frequency / class width
  • the width of a bar is proportional to the width of the class interval, the height of the bar is the frequency density, and the area of the bar is proportional to the frequency
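A minimal sketch of the calculation, assuming the usual definition of frequency density as frequency divided by class width. The class intervals and frequencies are made-up illustration data, and `frequency_density` is a hypothetical helper name.

```python
# Hypothetical grouped data: (lower bound, upper bound, frequency).
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 16), (40, 80, 8)]


def frequency_density(lower, upper, frequency):
    """Frequency per unit of X: frequency divided by class width."""
    return frequency / (upper - lower)


densities = [frequency_density(lo, hi, f) for lo, hi, f in classes]

for (lo, hi, f), d in zip(classes, densities):
    # Bar area (width x height) recovers the original frequency.
    area = (hi - lo) * d
    print(f"{lo}-{hi}: frequency={f}, density={d:.2f}, bar area={area:.0f}")
```

Note that the widest class (40-80) has the lowest bar even though its frequency is not the smallest, which is why histograms with unequal class widths plot density rather than raw frequency.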
5
Q

what do you normally use standard deviation for?

A
  • in comparative terms, e.g. to say one data set is more variable than another; on its own it is hard to judge whether a standard deviation is high or low
  • to make claims about the proportion of data values we expect to find within a certain number of standard deviations of the mean
6
Q

what is the empirical rule?

A

also known as the 68, 95, 99.7 rule

about 68% of the data fall within one standard deviation of the mean

about 95% fall within two standard deviations of the mean

about 99.7% fall within three standard deviations of the mean

applies to a bell-shaped (normal) frequency distribution
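The three percentages can be checked against a standard normal distribution. This is a small sketch using `NormalDist` from Python's standard library; `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")
```

The printed shares come out at roughly 68%, 95% and 99.7%, matching the name of the rule.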

7
Q

when can you use the empirical rule?

A

only when you have a normal distribution

8
Q

what is the coefficient of variation?

A

larger means normally come with larger standard deviations, which makes raw standard deviations hard to compare

it is the ratio of the standard deviation to the mean

measure of relative variability

it is independent of units of measurement

the higher the coefficient of variation, the greater the level of dispersion around the mean

9
Q

when is the coefficient of variation useful?

A
  • when you want to compare results from two surveys that use two different measures, e.g. tests with different scoring systems
  • same variable over time
  • international comparisons
10
Q

how do you work out the coefficient of variation?

A

standard deviation / mean
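A short sketch of the formula applied to two tests marked on different scales. The scores are made-up illustration data, `coefficient_of_variation` is a hypothetical helper name, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from statistics import mean, stdev

# Hypothetical scores from two tests with different scoring systems.
test_a = [52, 60, 65, 70, 78]        # marked out of 100
test_b = [5.0, 6.5, 7.0, 7.5, 9.0]   # marked out of 10


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unit free)."""
    return stdev(values) / mean(values)


cv_a = coefficient_of_variation(test_a)
cv_b = coefficient_of_variation(test_b)

print(f"test A: SD={stdev(test_a):.2f}, CV={cv_a:.3f}")
print(f"test B: SD={stdev(test_b):.2f}, CV={cv_b:.3f}")
```

Test A has the larger standard deviation simply because its scale is larger, but test B has the larger coefficient of variation, so its scores are more spread out relative to their mean.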

11
Q

what is skewness?

A

measures the degree of asymmetry of a distribution

degree of distortion away from the symmetrical bell curve

a histogram is a useful graphical display for spotting skew, as it plots the frequency of values against a numerical scale

12
Q

which way does positive skew go?

A

right: the long tail extends to the right, with most of the data bunched on the left

13
Q

which way does a negative skew go?

A

left: the long tail extends to the left, with most of the data bunched on the right

14
Q

what is the relative position of mean and median when distribution is symmetrical?

A

mean = median

15
Q

what is the relative position of mean and median when data is positively skewed?

A

mean is greater than the median

16
Q

what is the relative position of mean and median when data is negatively skewed?

A

mean is less than the median
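The mean/median relationships in the last three cards can be illustrated with a few tiny data sets. The numbers are made up for illustration only.

```python
from statistics import mean, median

# Hypothetical data sets chosen to show each shape.
symmetric = [2, 4, 6, 8, 10]
positive_skew = [2, 3, 4, 5, 30]      # one large value drags the mean up
negative_skew = [1, 26, 27, 28, 29]   # one small value drags the mean down

for name, data in [
    ("symmetric", symmetric),
    ("positive skew", positive_skew),
    ("negative skew", negative_skew),
]:
    print(f"{name}: mean={mean(data):.1f}, median={median(data)}")
```

The median barely reacts to the extreme value in each skewed set, while the mean is pulled towards it.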

17
Q

what are the ways in which you measure skewness?

A

1) Pearson's coefficient of skewness (PCS)

2) Bowley's coefficient of skewness (BCS)

18
Q

what is PCS?

A

Pearson's coefficient of skewness

used to work out whether the skewness is positive or negative

19
Q

how do you work out PCS?

A

PCS = 3 × (mean − median) / SD
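A minimal sketch of the PCS formula. The data are made up, `pearson_skewness` is a hypothetical helper name, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from statistics import mean, median, stdev


def pearson_skewness(values):
    """PCS = 3 * (mean - median) / standard deviation."""
    return 3 * (mean(values) - median(values)) / stdev(values)


# Hypothetical incomes (in thousands) with one very high earner.
incomes = [18, 20, 22, 25, 27, 30, 75]

pcs = pearson_skewness(incomes)
print(f"mean={mean(incomes)}, median={median(incomes)}, PCS={pcs:.2f}")
```

The mean sits above the median here, so the coefficient is positive, indicating positive skew.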

20
Q

what are the results of PCS?

A

ranges from -3 to +3
0 = symmetrical (e.g. a normal distribution)

values outside -1 to +1 can be regarded as highly skewed

21
Q

when can you not use PCS?

A

when you don't know the values in the formula (mean, median, SD)

when you only know the quartiles, or if you have extreme outliers

22
Q

what is BCS?

A

more robust than PCS
use when you only know the quartiles or have outliers

focuses on middle 50% of data

23
Q

what is the equation for BCS?

A

BCS = [(Q3 − median) − (median − Q1)] / (Q3 − Q1)
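A small sketch of the BCS formula, taking the quartiles as given so that no particular quartile method is assumed. `bowley_skewness` is a hypothetical helper name and the quartile values are made up.

```python
def bowley_skewness(q1, median, q3):
    """BCS = [(Q3 - median) - (median - Q1)] / (Q3 - Q1)."""
    return ((q3 - median) - (median - q1)) / (q3 - q1)


# Upper quartile further from the median than the lower quartile.
print(bowley_skewness(q1=20, median=25, q3=40))   # positive skew
# Quartiles equally spaced around the median.
print(bowley_skewness(q1=10, median=20, q3=30))   # symmetrical
# Lower quartile further from the median than the upper quartile.
print(bowley_skewness(q1=10, median=25, q3=30))   # negative skew
```

Only the middle 50% of the data enters the calculation, so extreme values in the tails cannot change the result.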

24
Q

what are the results for BCS?

A

if the upper quartile is further from the median than the lower quartile, the skew is positive

> 0 = positive skew
< 0 = negative skew

25
Q

what is a scatter plot?

A

displays the relationship between two continuous variables

26
Q

why are scatter plots useful?

A

useful in the early stage of analysis when exploring data and determining whether a linear regression analysis is appropriate

may show outliers

27
Q

what are the measures of association?

A
  • scatter plot
  • correlation coefficient

28
Q

what do measures of association show?

A

analysis of relationships between two variables

form the next logical step beyond descriptive data analysis

29
Q

what are the measures of correlation?

A
  • Pearson's correlation coefficient
  • Spearman's rank correlation

30
Q

what does Pearson's correlation show?

A

measure of how closely the points on a scatter plot lie to a straight line (range -1 to +1)

association between two numerical variables

if the two variables move in the same direction = positive correlation

if one moves up and the other moves down = negative correlation

useful for comparative purposes

unit free

31
Q

what is the calculation for Pearson's correlation coefficient?

A

r = covariance(X, Y) / [SD(X) × SD(Y)]
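A minimal sketch of the calculation using only the standard library. The data are made up, `covariance` and `pearson_r` are hypothetical helper names, and sample statistics (dividing by n - 1) are assumed throughout.

```python
from statistics import mean, stdev


def covariance(xs, ys):
    """Sample covariance: sum of products of deviations over n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


def pearson_r(xs, ys):
    """Covariance scaled by both standard deviations (unit free)."""
    return covariance(xs, ys) / (stdev(xs) * stdev(ys))


# Hypothetical data: hours of revision and exam marks.
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 70]

r = pearson_r(hours, marks)
print(f"r = {r:.3f}")
```

Dividing by both standard deviations is what removes the units and keeps the result between -1 and +1.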

32
Q

what is covariance?

A

shows how two variables X and Y change with each other (move together)

33
Q

how do you work out covariance?

A

Σ (X − mean of X)(Y − mean of Y) / (n − 1)
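The formula can be worked through step by step. This sketch uses made-up data in which Y tends to fall as X rises; the variable names are illustrative only.

```python
# Hypothetical paired observations.
xs = [2, 4, 6, 8]
ys = [10, 9, 5, 4]

n = len(xs)
x_bar = sum(xs) / n   # mean of X
y_bar = sum(ys) / n   # mean of Y

# Product of the two deviations for each pair of observations.
products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]

# Sum the products and divide by n - 1.
cov_xy = sum(products) / (n - 1)

print("products of deviations:", products)
print(f"covariance = {cov_xy:.2f}")
```

Every product is negative because each above-average X is paired with a below-average Y, so the covariance is negative: the two variables move in opposite directions.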

34
Q

how can you interpret Pearson's correlation?

A

theoretical / textbook perspective

(-) 0.7 = strong
(-) 0.5 = moderate
(-) = weak

Cohen's effect sizes

0.1 = small
0.3 = medium
>0.5 = large
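Cohen's thresholds can be turned into a simple lookup. This is only a sketch: `cohen_effect_size` is a hypothetical helper name, treating each threshold as an inclusive lower bound is an assumption, and the "negligible" label for values below 0.1 is not part of the card.

```python
def cohen_effect_size(r):
    """Label a correlation using Cohen's thresholds (sign is ignored)."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    # Assumed label for values below the smallest threshold.
    return "negligible"


for r in (-0.62, 0.35, 0.12, 0.05):
    print(f"r = {r:+.2f}: {cohen_effect_size(r)}")
```

The sign of r only gives the direction of the relationship, so the label depends on the absolute value.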