Topic 3 (statistics) Flashcards

Question 1

Q

measures of central tendency: advantages and disadvantages

Answer

A

mode:
pros: can be used with qualitative data; not affected by outliers, errors or omissions; always an observed data value
cons: doesn't use all the data; not representative if its frequency is low or if other values have a similar frequency

median:
pros: not affected by outliers; not significantly affected by errors or omissions
cons: doesn't make use of all the data

mean:
pros: uses all values in the set; in a large data set an outlier has little impact
cons: in a small data set outliers have a big impact
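
A minimal sketch of the points above, using Python's standard `statistics` module. The data values are made up for illustration: adding one outlier to a small set leaves the mode unchanged, nudges the median, and drags the mean a long way.

```python
from statistics import mean, median, mode

# Made-up illustrative data: a small set, then the same set with one outlier.
small = [2, 3, 3, 4, 5]
small_with_outlier = small + [50]

def summarise(values):
    """Return (mode, median, mean) for a list of numbers."""
    return mode(values), median(values), mean(values)

print(summarise(small))               # mode and median are both 3, mean is 3.4
print(summarise(small_with_outlier))  # mode still 3, median 3.5, mean above 11
```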

Question 2

Q

measures of dispersion/spread: advantages and disadvantages

Answer

A

range:
pros: reflects the full data set
cons: distorted by outliers

IQR:
pros: not distorted by outliers
cons: doesn't reflect the full data set (only the middle half is used, the rest is disregarded)

standard deviation:
pros: uses all the data; in a large data set with few outliers their impact is negligible
cons: in a small data set outliers have a big impact
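
A rough sketch comparing the three measures on made-up data, with and without one outlier. It assumes the population standard deviation (`pstdev`) and the "inclusive" quartile method of Python's `statistics.quantiles`; exam boards may define quartiles slightly differently, so hand-calculated values can differ.

```python
from statistics import pstdev, quantiles

def spread(values):
    """Return (range, IQR, population standard deviation)."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return max(values) - min(values), q3 - q1, pstdev(values)

# Made-up illustrative data.
base = [10, 12, 13, 15, 16, 18, 20]
with_outlier = base + [60]

print(spread(base))
print(spread(with_outlier))
```

With these values the range and standard deviation jump sharply once the outlier is added, while the IQR changes only slightly.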

Question 3

Q

outliers

Answer

A

a value that lies significantly outside the other values of a variable

due to:

errors in measuring/recording data
natural variation

what to do:

clean the data (remove the value) if it is incorrect
include the value if it is a genuine result of natural variation

Question 4

Q

ways outliers are defined

Answer

A

anything bigger than Q3 + k(Q3 - Q1)
anything smaller than Q1 - k(Q3 - Q1)
k is a constant, typically 1.5

OR

anything more than a given number of standard deviations from the mean
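
Both definitions can be sketched in a few lines. The function names are hypothetical, the data are made up, `n_sd=2` is just an assumed choice for "a given number of standard deviations", and the quartiles use the "inclusive" method of `statistics.quantiles`, which may not match every textbook convention.

```python
from statistics import mean, pstdev, quantiles

def iqr_outliers(values, k=1.5):
    """Values above Q3 + k(Q3 - Q1) or below Q1 - k(Q3 - Q1)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def sd_outliers(values, n_sd=2):
    """Values more than n_sd standard deviations from the mean."""
    centre = mean(values)
    sd = pstdev(values)  # population standard deviation (an assumption)
    return [v for v in values if abs(v - centre) > n_sd * sd]

data = [10, 12, 13, 15, 16, 18, 20, 60]  # made-up values
print(iqr_outliers(data))
print(sd_outliers(data))
```

For this data set both rules flag only the value 60.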

Question 5

Q

advantages of a stem and leaf diagram

Answer

A

the data values stay visible, so clusters and outliers are easy to spot
convenient for finding the median, mode and range
two data sets can be compared easily with a back-to-back diagram

Question 6

Q

cumulative frequency graphs

Answer

A

cumulative frequency (cf) on the y-axis
variable on the x-axis

plot each cumulative frequency at the upper bound of its class
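
A small sketch of how the points to plot could be built from a grouped frequency table. The classes and frequencies are hypothetical.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower bound, upper bound, frequency).
classes = [(0, 10, 4), (10, 20, 9), (20, 30, 5), (30, 40, 2)]

def cumulative_points(classes):
    """Return (upper bound, cumulative frequency) pairs to plot."""
    upper_bounds = [upper for _, upper, _ in classes]
    running_totals = accumulate(freq for _, _, freq in classes)
    return list(zip(upper_bounds, running_totals))

print(cumulative_points(classes))
# [(10, 4), (20, 13), (30, 18), (40, 20)]
```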

Question 7

Q

Frequency density

Answer

A

Frequency divided by class width
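
As a quick worked sketch with hypothetical classes of unequal width: a class from 10 to 15 with frequency 9 has frequency density 9 / 5 = 1.8, and multiplying the density back by the class width recovers the frequency.

```python
def frequency_density(lower, upper, frequency):
    """Frequency divided by class width."""
    return frequency / (upper - lower)

# Hypothetical grouped data: (lower bound, upper bound, frequency).
classes = [(0, 10, 4), (10, 15, 9), (15, 30, 6)]

for lower, upper, freq in classes:
    density = frequency_density(lower, upper, freq)
    # density * width gives the frequency back (up to rounding).
    print(f"{lower}-{upper}: density {density:.2f}, "
          f"density x width = {density * (upper - lower):.1f}")
```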

Question 8

Q

Histograms

Answer

A

Continuous data
No gaps between bars
Height = frequency density
Area of bars proportional to frequency
Bars drawn between the class bounds

Question 9

Q

no skew / symmetrical distribution

Answer

A

Q2 - Q1 = Q3 - Q2
mean = median = mode

use median and IQR when the data are skewed

Question 10

Q

positive skew

Answer

A

more data to the left (lower values), longer tail to the right
Q2 - Q1 < Q3 - Q2
mean > median > mode

Question 11

Q

negative skew

Answer

A

more data to the right (higher values), longer tail to the left
Q2 - Q1 > Q3 - Q2
mean < median < mode
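
The quartile rules for no skew, positive skew and negative skew can be sketched as one comparison. The function name and the example quartiles are hypothetical; with real data an exact equality is rare, so "symmetrical" here means the two gaps are exactly equal.

```python
def skew_from_quartiles(q1, q2, q3):
    """Classify skew by comparing Q2 - Q1 with Q3 - Q2."""
    lower_gap = q2 - q1
    upper_gap = q3 - q2
    if lower_gap < upper_gap:
        return "positive skew"
    if lower_gap > upper_gap:
        return "negative skew"
    return "symmetrical"

print(skew_from_quartiles(10, 14, 22))  # positive skew
print(skew_from_quartiles(10, 18, 22))  # negative skew
print(skew_from_quartiles(10, 16, 22))  # symmetrical
```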

Question 12

Q

3(mean - median) divided by standard deviation

Answer

A

positive for positive skew
negative for negative skew
0 for a symmetrical distribution
larger magnitude = stronger skew
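
A minimal sketch of the calculation, assuming the population standard deviation (`pstdev`); a course may use the sample version instead. The data sets are made up to show each sign.

```python
from statistics import mean, median, pstdev

def skew_coefficient(values):
    """3(mean - median) / standard deviation."""
    # pstdev is 0 when every value is identical, which would divide by zero.
    return 3 * (mean(values) - median(values)) / pstdev(values)

right_tail = [1, 2, 2, 3, 3, 3, 4, 10]   # long tail of high values
left_tail = [1, 7, 8, 8, 8, 9, 9, 10]    # long tail of low values
symmetric = [1, 2, 3, 4, 5]

print(skew_coefficient(right_tail))  # positive
print(skew_coefficient(left_tail))   # negative
print(skew_coefficient(symmetric))   # zero
```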

Question 13

Q

comparing data:

Answer

A

comment on measure of location (usually mean or median)
comment on measure of spread
make comparison in context

compare median and IQR when there are extreme values or the data are skewed (these measures are not affected by extreme values)
compare mean and standard deviation when the data are fairly symmetrical

(13 cards)