Chapter 4: Numerical Descriptive Techniques Flashcards

Question 1

Q

Measures of central location

Answer

A

Arithmetic mean (mean, average)
Median
Mode

Question 2

Q

Arithmetic mean

Answer

A

Aka mean or average

Sum of all observations / total number of observations

Essentially same calculation for sample and population

(Average function in Excel)

Usually first selection of central location but can be sensitive to extreme outlier values

Only meaningful for interval data

Question 3

Q

Median

Answer

A

The observation that falls in the middle of a list of observations placed in order

If even number of observations then median determined by averaging two middle observations

Same calculation for sample and population

Median function in Excel

Often a better measure of center than the mean if there are a small number of extreme outlier observations

50% of observations are above and 50% below

Useful for interval AND ordinal data
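
A short sketch of how an outlier affects the mean but not the median, using Python's standard `statistics` module. The data values are made up for illustration and are not from the chapter.

```python
from statistics import mean, median

# Hypothetical data: six typical values plus one extreme outlier.
values = [2, 3, 3, 4, 5, 6, 100]

outlier_mean = mean(values)      # pulled upward by the single outlier
outlier_median = median(values)  # middle observation of the sorted list

print("mean:", outlier_mean)
print("median:", outlier_median)

# With an even number of observations, the median is the average
# of the two middle observations (here 3 and 5).
even_values = [1, 3, 5, 7]
even_median = median(even_values)
print("median (even n):", even_median)
```

Here the median stays near the bulk of the data, while the mean is dragged far above every observation except the outlier.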

Question 4

Q

Mode

Answer

A

The observation (or observations) that occur with the greatest frequency

Sample and population calculated the same way

For larger samples and populations modal class may make more sense than a single mode value

Not great for small samples, potentially not unique

Mode function in Excel
- If there are multiple modes, Excel returns the smallest one without indicating the alternatives

Can be used for any type of data (interval, ordinal, nominal)
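
A minimal sketch of finding every mode rather than a single one, assuming Python 3.8 or later for `statistics.multimode`. The data sets are hypothetical.

```python
from statistics import multimode

# Hypothetical interval data with two values tied for most frequent.
scores = [3, 1, 2, 3, 2, 5]
score_modes = multimode(scores)  # all values sharing the highest frequency
print(sorted(score_modes))

# The mode also works for nominal data, where a mean or median is undefined.
colors = ["red", "blue", "red", "green"]
color_modes = multimode(colors)
print(color_modes)
```

Returning a list makes a non-unique mode visible instead of silently reporting only one value.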

Question 5

Q

Using Excel to calculate multiple statistics

Answer

A

Data
Data analysis
Descriptive statistics
Select input range
Summary statistics

Question 6

Q

Measures of variability

Answer

A

Range
Variance
Standard deviation
Coefficient of variation

Question 7

Q

Range

Answer

A

= largest observation - smallest observation

No information about observations in between

Question 8

Q

Variance

Answer

A

Average of the squared deviations from the mean

Calculate the mean
Find the difference (deviation) of each observation from the mean
Square each deviation and sum them together
Divide that sum by 1 less than the number of observations (n - 1) for a sample; this corrects for using the sample mean in place of the unknown population mean. For a population, divide by the number of observations
Result is the variance, s^2

Excel: use VAR function

Mostly useful for comparing multiple sets of data
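
The steps above can be sketched in Python for a sample. The five observations are illustrative values chosen for easy arithmetic; `statistics.variance` is used only as a cross-check, since it also divides by n - 1.

```python
from statistics import variance

# Illustrative sample (hypothetical values).
obs = [8, 4, 9, 11, 3]
n = len(obs)

# Step 1: calculate the mean.
mean_obs = sum(obs) / n

# Step 2: deviation of each observation from the mean.
deviations = [x - mean_obs for x in obs]

# Step 3: square each deviation and sum them.
sum_squared_deviations = sum(d ** 2 for d in deviations)

# Step 4: divide by n - 1 to get the sample variance.
s2 = sum_squared_deviations / (n - 1)

print("sample variance:", s2)
print("library cross-check:", variance(obs))
```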

Question 9

Q

Shortcut method for variance

Answer

A

s^2 = (1/(n - 1)) x (sum of the squared observations - (sum of all observations)^2 / number of observations)
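
A quick check, on hypothetical data, that the shortcut gives the same result as the definition (sum of squared deviations divided by n - 1). Variable names are illustrative.

```python
# Illustrative sample (hypothetical values).
obs = [8, 4, 9, 11, 3]
n = len(obs)

# Shortcut: only the sum of squares and the square of the sum are needed.
sum_of_squares = sum(x ** 2 for x in obs)
square_of_sum = sum(obs) ** 2
s2_shortcut = (sum_of_squares - square_of_sum / n) / (n - 1)

# Definition: squared deviations from the mean, divided by n - 1.
mean_obs = sum(obs) / n
s2_definition = sum((x - mean_obs) ** 2 for x in obs) / (n - 1)

print("shortcut:", s2_shortcut)
print("definition:", s2_definition)
```

The shortcut avoids computing each deviation, which is convenient by hand, though with very large values it can lose precision in floating-point arithmetic.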

Question 10

Q

Standard deviation

Answer

A

Roughly the typical distance of an observation from the mean

Square root of the variance

Measure of consistency

Question 11

Q

Empirical rule for interpreting standard deviation

Answer

A

If histogram of observations is bell shaped (symmetrical and unimodal) then:

approx 68% of all observations fall within one standard deviation of the mean
approx 95% of all observations fall within two standard deviations of the mean
approx 99.7% of all observations fall within three standard deviations of the mean
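
The three percentages can be reproduced from an exactly normal (bell shaped) distribution. This sketch assumes Python 3.8 or later for `statistics.NormalDist`; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

Real data are only approximately bell shaped, so the rule gives approximate shares rather than exact ones.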

Question 12

Q

Chebysheff’s theorem

Answer

A

The proportion of observations in any sample or population that lie within k standard deviations of the mean is at least:

1 - (1/k^2) for k > 1

Provides the lower bound of proportions in an interval

Can be used when the empirical rule does not apply (non bell shaped histograms)
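
A small sketch of the bound for a few values of k. The function name `chebysheff_lower_bound` is hypothetical.

```python
def chebysheff_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the theorem applies only for k > 1")
    return 1 - 1 / k ** 2

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebysheff_lower_bound(k):.1%}")
```

For k = 2 the bound is 75%, noticeably lower than the roughly 95% the empirical rule gives for bell shaped data, because the theorem makes no assumption about the shape of the histogram.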


Question 13

Q

Coefficient of variation

Answer

A

The standard deviation of the observations divided by the mean

Indicates whether the standard deviation is large or small relative to the size of the observations
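
To illustrate, the sketch below compares two made-up samples that have the same standard deviation but very different means. The function name is illustrative, and the sample standard deviation (n - 1 divisor) is assumed.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Sample standard deviation divided by the mean."""
    return stdev(data) / mean(data)

# Hypothetical samples with identical spread but different scale.
small_scale = [8, 10, 12]
large_scale = [98, 100, 102]

cv_small = coefficient_of_variation(small_scale)
cv_large = coefficient_of_variation(large_scale)

print("CV, small scale:", cv_small)
print("CV, large scale:", cv_large)
```

The same standard deviation is a large share of a mean near 10 and a small share of a mean near 100, which is what the coefficient makes visible.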

Question 14

Q

Measures of relative standing

Answer

A

Provide information about the position of particular values relative to the entire data set.

Percentile
Quartiles
(Quintiles, deciles)
Interquartile range

Question 15

Q

Percentile

Answer

A

The Pth percentile is the value for which P% of the observations are less than the value and (100 - P)% are greater than the value

Use to describe a single set of interval or ordinal data to communicate relative standing

Question 16

Q

Quartiles

Answer

A

Describe the 25th, 50th, and 75th percentiles

25th percentile - first/lower quartile, Q1
50th percentile - second quartile, Q2 (median)
75th percentile - third/upper quartile, Q3

Use to describe a single set of interval or ordinal data to communicate relative standing

Excel: use the Descriptive Statistics dialog box
Set Kth Largest to the integer closest to n/4
Do the same for Kth Smallest
These approximate the third and first quartiles

Gives some idea of histogram shape
Skewed vs symmetric

Question 17

Q

Location of a percentile

Answer

A

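Location of the Pth percentile: L_P = (n + 1) * P/100

n = number of observations

Gives the position of the percentile in the ordered list of observations; a fractional location means the percentile lies that fraction of the way between the two surrounding observations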
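
A sketch of the location formula with linear interpolation between the surrounding observations. The function names and the data are illustrative assumptions, not part of the chapter.

```python
def percentile_location(n, p):
    """Location of the Pth percentile in an ordered list of n observations."""
    return (n + 1) * p / 100

def percentile(data, p):
    """Pth percentile, interpolating between the two surrounding observations."""
    ordered = sorted(data)
    n = len(ordered)
    location = percentile_location(n, p)
    lower_position = int(location)        # 1-based position of the observation below
    fraction = location - lower_position  # how far to move toward the next observation
    # Clamp locations that fall outside the ordered list.
    if lower_position < 1:
        return ordered[0]
    if lower_position >= n:
        return ordered[-1]
    below = ordered[lower_position - 1]
    above = ordered[lower_position]
    return below + fraction * (above - below)

# Hypothetical data: ten observations.
data = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]
print("location of 25th percentile:", percentile_location(len(data), 25))
print("25th percentile:", percentile(data, 25))
print("75th percentile:", percentile(data, 75))
```

With ten observations the 25th percentile sits at location 2.75, so it lies three quarters of the way from the second to the third ordered observation.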

Question 18

Q

Interquartile range

Answer

A

= Q3 - Q1

Measures the spread of the middle 50% of observations

Large values = observations far apart = high variability

Use to describe a single set of interval or ordinal data to communicate variability
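
A minimal sketch of the interquartile range using `statistics.quantiles` (Python 3.8 or later). Its default "exclusive" method places quartiles at location (n + 1) * P/100, though other tools, Excel included, may use slightly different conventions. The data are hypothetical.

```python
from statistics import quantiles

# Hypothetical data: ten observations.
data = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

print("Q1:", q1, "Q2 (median):", q2, "Q3:", q3)
print("interquartile range:", iqr)
```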

Question 19

Q

Measures of linear relationship

Answer

A

Covariance
Coefficient of correlation
Coefficient of determination

Question 20

Q

Covariance

Answer

A

Sample covariance of variables x and y = sum over all observations of (x - mean of x) * (y - mean of y), divided by (n - 1)

Covariance is positive number = variables move in the same direction

Negative number: variables move in opposite directions

Large number: strong relationship
Small number: less strong relationship
- hard to judge without additional data
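
A sketch of the sample covariance calculation on three hypothetical (x, y) pairs. The function name is illustrative.

```python
def sample_covariance(xs, ys):
    """Sum of products of paired deviations from the means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Hypothetical paired observations.
x = [2, 6, 7]
y = [13, 20, 27]

cov_xy = sample_covariance(x, y)
print("covariance:", cov_xy)  # positive: x and y tend to move together

# Reversing y makes large x pair with small y, so the sign flips.
cov_reversed = sample_covariance(x, list(reversed(y)))
print("covariance with y reversed:", cov_reversed)
```

The sign is easy to read, but the magnitude depends on the units of x and y, which is why its size is hard to judge on its own.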

Question 21

Q

Coefficient of correlation

Answer

A

The covariance divided by the product of the standard deviations of the variables

Always lies between -1 and +1

+1 = perfect positive relationship
-1 = perfect negative relationship
0 = no linear relationship

Must always judge in relation to other variables
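
A sketch of the coefficient of correlation computed as covariance over the product of sample standard deviations. The data and function names are illustrative assumptions.

```python
from statistics import stdev

def sample_covariance(xs, ys):
    """Sum of products of paired deviations from the means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return sample_covariance(xs, ys) / (stdev(xs) * stdev(ys))

# Hypothetical paired observations.
x = [2, 6, 7]
y = [13, 20, 27]
r = correlation(x, y)
print("r:", round(r, 4))

# A perfectly linear increasing relationship gives r = +1 (up to rounding).
r_perfect = correlation([1, 2, 3], [10, 20, 30])
print("r for a perfect line:", round(r_perfect, 4))
```

Dividing by the standard deviations removes the units, which is what confines the result to the range from -1 to +1.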

Question 22

Q

Coefficient of determination

Answer

A

Square of the coefficient of correlation

Measures the proportion of the variation in the dependent variable that is explained by the variation in the independent variable

1= 100%
0= no relationship

Excel: trendline, more options, display R-squared value on chart
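
To see why the squared correlation reads as "variation explained", the sketch below fits a least squares line to hypothetical data and compares r^2 with 1 - (unexplained variation / total variation). The least squares line is an assumption here, standing in for a linear trendline; all names are illustrative.

```python
# Hypothetical paired observations: x independent, y dependent.
x = [2, 6, 7]
y = [13, 20, 27]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # cross deviations
sxx = sum((a - mean_x) ** 2 for a in x)  # variation in x
syy = sum((b - mean_y) ** 2 for b in y)  # total variation in y

# Square of the coefficient of correlation.
r_squared = sxy ** 2 / (sxx * syy)

# Least squares line y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * a for a in x]

# Variation in y left unexplained by the line.
unexplained = sum((b - p) ** 2 for b, p in zip(y, predicted))
explained_share = 1 - unexplained / syy

print("r squared:", round(r_squared, 4))
print("share of variation explained:", round(explained_share, 4))
```

Both numbers agree, so on these data about 89% of the variation in y is accounted for by the line.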

(22 cards)