Lecture 4 REVISED Flashcards

Question 1

Q

continuous variable

Answer

A

can take on any value in an interval

e.g., worker’s hourly income can take on any value between 0 and infinity

Question 2

Q

discrete variable

Answer

A

can only take on set, distinct values within an interval

e.g., the number of people who chose blue as their favourite colour can only be a whole number

Question 3

Q

what levels of measurement are required for continuous variables?

Answer

A

interval or ratio

Question 4

Q

rectangle in a histogram is called a…

Question 5

Q

how does a discrete/continuous data distribution look on a graph?

Answer

A

discrete: bars
continuous: curve

Question 6

Q

frequency distribution is…

Answer

A

a tabular summary of a dataset showing the frequency of items in each class
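
A minimal sketch of building a frequency table, assuming each distinct answer is its own class. The `responses` list is made-up illustration data, not from the lecture.

```python
from collections import Counter

# Hypothetical survey answers (illustration only)
responses = ["blue", "red", "blue", "green", "blue", "red"]

# Counter tallies how many times each class appears
frequency = Counter(responses)

# Print the table, most frequent class first
for colour, count in frequency.most_common():
    print(f"{colour:<6} {count}")
```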

Question 7

Q

symmetric, skewness, kurtosis in frequency distributions?

Answer

A

symmetric: the two halves of the distribution are mirror images of each other

skewness: level of asymmetry, in which an elongated tail extends to one side of the distribution

kurtosis: degree of peakedness/steepness in a distribution

Question 8

Q

when a distribution is perfectly symmetrical, what is the relationship between the mean and median?

Answer

A

mean and median are the same values

when a distribution is skewed, this isn’t the case

Question 9

Q

why does the median tend to be more representative than the mean?

Answer

A

because if a distribution isn’t symmetrical, an outlier pulls the mean/average towards the tail, while the median is barely affected by extreme values
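
A small sketch of this effect using Python's `statistics` module. The income figures are made up for illustration.

```python
import statistics

# Hypothetical hourly incomes (illustration only)
incomes = [20, 22, 25, 27, 30]
with_outlier = incomes + [500]   # add one extreme value

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)

mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before)  # mean and median are close together
print(mean_after, median_after)    # mean jumps, median barely moves
```

One outlier moves the mean from about 25 to over 100, while the median only shifts from 25 to 26.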

Question 10

Q

where is the mode in a frequency distribution?

Question 11

Q

what formula is used to find the position of the median value?

Answer

A

(n+1) / 2
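
A sketch of how the position formula can be used to find the median. The function names are hypothetical. When n is even the position falls halfway between two values, and this sketch averages them.

```python
import math

def median_position(n):
    """1-based position of the median in n sorted values."""
    return (n + 1) / 2

def median(values):
    ordered = sorted(values)
    pos = median_position(len(ordered))
    # For odd n, floor and ceil give the same position.
    # For even n, they give the two middle positions.
    lower = ordered[math.floor(pos) - 1]
    upper = ordered[math.ceil(pos) - 1]
    return (lower + upper) / 2

print(median_position(5))         # position 3.0 in five sorted values
print(median([3, 1, 2, 5, 4]))    # the 3rd sorted value
print(median([1, 2, 3, 4]))       # halfway between the 2nd and 3rd values
```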

Question 12

Q

what is the formula to calculate standard deviation?

Answer

A

subtract the mean from each value
square all the deviations and add them together
divide this sum by (n-1) for a sample
take the square root of the result
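
The steps above can be sketched as follows. The function name and the example values are assumptions for illustration.

```python
import math

def sample_std(values):
    n = len(values)
    mean = sum(values) / n
    # Step 1: subtract the mean from each value
    deviations = [x - mean for x in values]
    # Step 2: square the deviations and add them together
    sum_of_squares = sum(d ** 2 for d in deviations)
    # Step 3: divide by (n - 1)
    variance = sum_of_squares / (n - 1)
    # Step 4: take the square root
    return math.sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up values
print(sample_std(data))
```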

Question 13

Q

what does standard deviation tell us about the dataset?

Answer

A

how far the values typically are from the mean

small standard deviation = low amount of variability, values are close to the mean

high standard deviation = high variability, values are far from the mean

Question 14

Q

variance relationship with standard deviation?

Answer

A

standard deviation is the square root of the variance

Question 15

Q

density curve

Answer

A

an idealised description of a data distribution

describes the overall pattern of a distribution

Question 16

Q

disadvantage of variance for practical applications?

Answer

A

its units are the square of the variable's units, so they differ from the units of the variable

which is why standard deviation is more commonly reported as a measure of dispersion

Question 17

Q

if the dataset is a sample/population, how is the standard deviation denoted and calculated?

Answer

A

sample: denoted s, calculated by dividing the sum of squared deviations by n-1, then taking the square root

population: denoted sigma (σ), calculated by dividing the sum of squared deviations by n, then taking the square root
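
A short sketch comparing the two versions with Python's `statistics` module, where `stdev` divides by n-1 and `pstdev` divides by n. The data values are made up.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up values

s = statistics.stdev(data)        # sample: divides by n - 1
sigma = statistics.pstdev(data)   # population: divides by n

# Variance is the square of the standard deviation
sample_variance = statistics.variance(data)

print(s, sigma, sample_variance)
```

For the same data, s comes out slightly larger than sigma because it divides by a smaller number.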

Question 18

Q

mean absolute deviation (MAD)

Answer

A

measures the average absolute distance/deviation of values in a dataset from the mean

Question 19

Q

how is MAD calculated in a sample/population?

Answer

A

divide the sum of the absolute deviations from the mean by the number of data points
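
A minimal sketch of the calculation. The absolute value of each deviation is taken before summing, because the raw deviations from the mean always cancel out to zero. The function name and data are assumptions for illustration.

```python
def mean_absolute_deviation(values):
    n = len(values)
    mean = sum(values) / n
    # Sum the absolute deviations, then divide by the number of data points
    return sum(abs(x - mean) for x in values) / n

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up values, mean is 5
raw_sum = sum(x - 5 for x in data)  # raw deviations cancel out
print(raw_sum)
print(mean_absolute_deviation(data))
```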

Question 20

Q

what does MAD indicate?

Answer

A

how spread out data is

Question 21

Q

percentile

Answer

A

describes the percentage of data values that fall at or below another data value

Question 22

Q

how to calculate percentiles?

Answer

A

(p/100)n

the percentile in question divided by 100, multiplied by the number of values in the dataset; this gives the position of the percentile in the ordered data
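
A sketch of turning the (p/100)n position into a value. The card gives only the formula, so the rounding rule here is an assumption (one common textbook convention): if the position is not a whole number, round up; if it is whole, average the value at that position with the next one. Other conventions exist and give slightly different answers.

```python
def percentile(values, p):
    """p-th percentile for a whole-number p with 0 < p < 100."""
    if not 0 < p < 100:
        raise ValueError("p must be between 0 and 100 (exclusive)")
    ordered = sorted(values)
    n = len(ordered)
    # (p/100)n split into whole part and remainder, avoiding float rounding
    whole, remainder = divmod(p * n, 100)
    if remainder != 0:
        # Position is not whole: round up (1-based whole + 1 is index whole)
        return ordered[whole]
    if whole == 0:
        return ordered[0]
    # Position is whole: average this value and the next (assumed convention)
    return (ordered[whole - 1] + ordered[whole]) / 2

data = [1, 2, 3, 4, 5, 6, 7, 8]   # made-up values
print(percentile(data, 25))   # position 2 is whole: average 2nd and 3rd values
print(percentile(data, 90))   # position 7.2 rounds up to the 8th value
```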

Question 23

Q

quartiles

Answer

A

specific percentiles dividing the data into four parts

first (lower) quartile corresponds to the 25th percentile (Q1)

second quartile (median) corresponds to the 50th percentile (Q2)

third (upper) quartile corresponds to the 75th percentile (Q3)

fourth quartile corresponds to the maximum

Question 24

Q

interquartile range

Answer

A

the difference between the third and first quartile

Q3 - Q1

the range for the middle 50% of the data

overcomes the range's sensitivity to extreme data values
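
A sketch of Q3 - Q1 using `statistics.quantiles` (Python 3.8+). Its default interpolation method is an assumption here and may give slightly different quartiles from a hand calculation. The data values are made up.

```python
import statistics

def iqr(values):
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

data = [1, 2, 3, 4, 5, 6, 7, 8]            # made-up values
with_outlier = [1, 2, 3, 4, 5, 6, 7, 80]   # largest value replaced by an outlier

range_plain = max(data) - min(data)
range_outlier = max(with_outlier) - min(with_outlier)

print(range_plain, range_outlier)        # the range changes a lot
print(iqr(data), iqr(with_outlier))      # the IQR stays the same
```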

(24 cards)