Central Tendency, Variability, & Z-scores Flashcards by Hannah Williams

What data classes are best for mean?

interval, ratio

What data classes are best for mode?

nominal, ordinal, interval, ratio

What data classes are best for median?

ordinal, interval, ratio

Under what conditions might a median be a better measure of central tendency than the mean?

when the data are ordinal (the mean does not apply)
when interval/ratio data contain extreme values

It should seem clear how the mean and the median are measures of the central tendency of the data, since the mean is a familiar average and the median is the middle. However, explain why the mode is also considered a measure of central tendency.

Most data sets peak in the middle (bell shape). The mode is the value with the highest frequency, so it is usually somewhere in the middle.

For any data set, what is Σ(X − X̄)?

Σ(X − X̄) may be written as:
ΣX − ΣX̄ = ΣX − nX̄ = ΣX − n(ΣX)/n = ΣX − ΣX = 0.
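
A quick numeric check of this identity, as a minimal Python sketch. The data set is borrowed from another card in this deck, and the helper name `sum_of_deviations` is made up for illustration. Exact fractions are used so that rounding cannot hide a nonzero result.

```python
from fractions import Fraction

def sum_of_deviations(values):
    """Return the exact sum of (x - mean) over all values."""
    mean = Fraction(sum(values), len(values))  # exact mean, no float rounding
    return sum(x - mean for x in values)

data = [6, 12, 9, 7, 8, 4, 3, 12, 15]
total = sum_of_deviations(data)
print(total)  # the deviations above and below the mean cancel
```

Any other list of numbers can be substituted for `data`; the deviations always cancel.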

The following data represent a sample of the time to complete a certain task in minutes and seconds (mm:ss).

6:30, 11:15, 6:22, 11:32, 8:12, 5:02, 9:17, 6:51, 8:44, 7:45, 9:37, 7:28, 4:29, 7:42

compute the mean:
compute the std. dev.:

Since the values are given in minutes and seconds, they first need to be converted either to decimal minutes (e.g., 6:30 = 6 + 30/60 = 6 + 0.5000 = 6.5000 min) or to seconds (e.g., 6:30 = 6*60 + 30 = 360 + 30 = 390 s) so that they can be easily added.

mean: 7:55
std. dev.: 2:04
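
The conversion route through seconds can be sketched in Python as follows. The helper names `to_seconds` and `to_mmss` are illustrative, and the sample (n − 1) standard deviation is assumed because the card calls the data a sample.

```python
from statistics import mean, stdev

def to_seconds(mmss):
    """Convert an 'm:ss' string to a whole number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def to_mmss(total_seconds):
    """Round to the nearest second and format as 'm:ss'."""
    rounded = round(total_seconds)
    return f"{rounded // 60}:{rounded % 60:02d}"

times = ["6:30", "11:15", "6:22", "11:32", "8:12", "5:02", "9:17",
         "6:51", "8:44", "7:45", "9:37", "7:28", "4:29", "7:42"]
seconds = [to_seconds(t) for t in times]

mean_time = to_mmss(mean(seconds))   # sample mean, back in m:ss
sd_time = to_mmss(stdev(seconds))    # sample standard deviation (n - 1)
print(mean_time, sd_time)
```

Working in whole seconds keeps the arithmetic in integers until the final division, which avoids rounding each value to decimal minutes first.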

For a certain set of data, the mean and standard deviation are computed.

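How does X̄ (data treated as sample) compare to μ (data treated as a population)?

How does s (data treated as sample) compare to σ (data treated as a population)?

X̄ is the sample mean and μ is the population mean; they are calculated the same way, so they are equal for the same data.

The sample standard deviation s divides the sum of squared deviations by N − 1, while the population standard deviation σ divides by N, so s is slightly larger than σ for the same data.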

Given the following sample data set:

        6, 12, 9, 7, 8, 4, 3, 12, 15

Compute the mean.
What is the median?
What is the mode?
Compute the variance.
Compute the standard deviation.

mean: 8.44
median: 8
mode: 12
variance: 15.78
standard deviation: 3.97
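
These summary values can be reproduced with Python's standard `statistics` module. This is a sketch that assumes the sample (n − 1) formulas, since the card describes the data as a sample; the `summary` dictionary is only an illustrative container.

```python
from statistics import mean, median, mode, variance, stdev

data = [6, 12, 9, 7, 8, 4, 3, 12, 15]

summary = {
    "mean": round(mean(data), 2),
    "median": median(data),
    "mode": mode(data),
    "variance": round(variance(data), 2),   # sample variance (n - 1)
    "std dev": round(stdev(data), 2),       # sample standard deviation
}
for name, value in summary.items():
    print(f"{name}: {value}")
```

The unrounded variance is 15.777..., so it rounds to 15.78 at two decimal places.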

For the following sample data set:
X frequency
52 5
54 8
57 2

Compute the mean.
Compute the variance.
Compute the standard deviation.

mean: 53.73
variance: 2.638
standard deviation: 1.62
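
For a frequency table, each value is weighted by its frequency. A minimal Python sketch, assuming the sample (n − 1) variance and using a plain dictionary (an illustrative choice) to hold the table:

```python
from math import sqrt

# value -> frequency, copied from the table on the card
freq = {52: 5, 54: 8, 57: 2}

n = sum(freq.values())                                   # total number of observations
mean = sum(x * f for x, f in freq.items()) / n           # weighted mean
ss = sum(f * (x - mean) ** 2 for x, f in freq.items())   # weighted sum of squared deviations
var = ss / (n - 1)                                       # sample variance
sd = sqrt(var)                                           # sample standard deviation

print(round(mean, 2), round(var, 3), round(sd, 2))
```

This is equivalent to expanding the table into 15 individual values and applying the usual formulas.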

The following sample data of the number of communications are taken from logs of communication with Distance Education students:

5, 9, 5, 23, 27, 55, 34, 7, 30, 15, 22, 60, 14, 52, 297, 8, 51, 15, 51, 35, 15, 39, 137, 43, 38, 14, 93, 7

Compute the mean.
Compute the standard deviation.
Draw a boxplot with the minimum, Q1, Q2, Q3, and maximum.
Which is a better representation of the central tendency: mean or median? Explain.

mean: 42.89
std. dev.: 57.68
Minimum: 5
Q1: 14
Q2: 28.5
Q3: 51
Maximum: 297

The median is the better representation; the extreme values (137 and 297) pull the mean well above the bulk of the data.

If the two largest values in the sample data set of the previous problem were omitted,

Compute the mean.
Compute the standard deviation.
Draw a boxplot with the minimum, Q1, Q2, Q3, and maximum.
Which is a better representation of the central tendency: mean or median? Explain.

mean: 29.50
std. dev.: 21.68
minimum: 5
Q1: 14
Q2: 25
Q3: 43
Maximum: 93

Mean may now be a better measure because extreme outliers have been removed.
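
The five-number summaries for the communications data, with and without the two largest values, can be sketched in Python. The cards do not say which quartile convention was used; the "median of each half" method below is an assumption that happens to reproduce the quartiles listed on both cards. The function name `five_number_summary` is illustrative.

```python
from statistics import mean, median

def five_number_summary(values):
    """Return (min, Q1, Q2, Q3, max) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

logs = [5, 9, 5, 23, 27, 55, 34, 7, 30, 15, 22, 60, 14, 52, 297, 8,
        51, 15, 51, 35, 15, 39, 137, 43, 38, 14, 93, 7]
trimmed = sorted(logs)[:-2]          # drop the two largest values

full_summary = five_number_summary(logs)
trimmed_summary = five_number_summary(trimmed)
print(full_summary, round(mean(logs), 2))
print(trimmed_summary, round(mean(trimmed), 2))
```

Removing the two largest values drops the mean from about 42.89 to 29.50, while the median only moves from 28.5 to 25, which illustrates how much more sensitive the mean is to extreme values.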

Consider the following data set:
21, 34, 18, 26, 30, 35, 24, 29, 25

If this is a population, compute the mean.
If this is a sample, compute the mean.
If this is a population, compute the standard deviation.
If this is a sample, compute the standard deviation.

μ = 26.9
X̄ = 26.9
σ = 5.34
s = 5.67
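
The population and sample calculations differ only in the divisor used for the standard deviation. A short Python sketch using the standard `statistics` module, where `pstdev` divides by N and `stdev` divides by N − 1:

```python
from statistics import mean, pstdev, stdev

data = [21, 34, 18, 26, 30, 35, 24, 29, 25]

mu = mean(data)        # population mean and sample mean use the same formula
sigma = pstdev(data)   # population standard deviation (divide by N)
s = stdev(data)        # sample standard deviation (divide by N - 1)

print(round(mu, 1), round(sigma, 2), round(s, 2))
```

Because N − 1 is smaller than N, `s` always comes out a little larger than `sigma` for the same data.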

If we had a set of ordinal values (not interval/ratio), could you create a boxplot?

Technically yes, because quartiles depend only on the position in the ordered data set. Thus, one could determine the positions in the ordered set for Q1, Q2 (median), and Q3 and the first and last position for the minimum and maximum. However, without interval/ratio data, visualizing this with a boxplot would not make sense.

For example, imagine you ask 9 people what size drink they ordered, small, medium, or large. The ordered data might be: small, small, small, medium, medium, large, large, large, large. Q1 is position 2.5 (small), Q2 is position 5 (medium), and Q3 is position 7.5 (large), minimum is position 1 (small) and maximum is position 9 (large).

Typically we consider quantitative data that is symmetric about the mean. If we have a data set that has a few extreme high values, then

a. How is it skewed?
b. Would you use a mean or median? Why?

It is positively skewed (right-skewed)

You would use median since it is less sensitive to extreme values.

For the MCAT, µ = 500 and σ = 10. What is the probability of an individual getting a score greater than 502.5?

z = 0.25
p = 0.4013
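
The same calculation can be sketched with `statistics.NormalDist` from the Python standard library, assuming the scores are normally distributed as the card implies. The variable names are illustrative.

```python
from statistics import NormalDist

mcat = NormalDist(mu=500, sigma=10)

z = (502.5 - mcat.mean) / mcat.stdev   # raw score converted to a z-score
p_above = 1 - mcat.cdf(502.5)          # area in the upper tail beyond 502.5

print(z, round(p_above, 4))
```

The upper-tail area is 1 minus the cumulative probability, which is what a z-table lookup for z = 0.25 gives.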

For the MCAT, µ = 500 and σ = 10.

What is the minimum score you would have to obtain to be in the top 5%?

What is the minimum score you would have to obtain to be in the top 2.5%?

Top 5% (95th percentile):
500 + (1.64 × 10) = 516.4

Top 2.5% (97.5th percentile):
500 + (1.96 × 10) = 519.6
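
Going from a percentile back to a raw score is the inverse lookup. A minimal Python sketch using `NormalDist.inv_cdf`, again assuming normally distributed scores; the names `cutoff_top_5` and `cutoff_top_2_5` are made up for illustration.

```python
from statistics import NormalDist

mcat = NormalDist(mu=500, sigma=10)

cutoff_top_5 = mcat.inv_cdf(0.95)      # score at the 95th percentile
cutoff_top_2_5 = mcat.inv_cdf(0.975)   # score at the 97.5th percentile

print(round(cutoff_top_5, 1), round(cutoff_top_2_5, 1))
```

The exact z for the 95th percentile is about 1.645, so tables that round it to 1.64 or 1.65 give cutoffs that differ slightly in the first decimal place.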

Correlational method

looking for relationships between variables (correlation or regression)

Experimental method

manipulating one variable to determine if this causes changes in another variable

independent variable

what we control/manipulate

dependent variable

what we measure (is influenced)

confounding/extraneous variables

other things that affect the dependent variable besides the independent variable

random assignment

every participant has an equal chance of ending up in each group (the bigger the sample, the better)

helps decrease extraneous variables

experimental vs control groups

experimental: at least 2 different groups
control: group with no treatment (placebo)

Placebo

any treatment that has no active properties

hypothetical constructs

an explanatory variable that is not directly observable; we must find ways to operationalize these

operational definition

how do we assign a number?

population

all the people we want to apply results to (we control/decide)

sample

we find a subset of the pop. that is representative of the whole pop. (random)

random sample

random group from population

descriptive statistics

summarizes data

inferential statistics

trying to infer back to population (generalize)

parameter vs. statistics

statistics for sample; parameter for population

sampling error

the difference between a statistic from a sample and its parameter

discrete vs. continuous variable

discrete: categories w/ nothing in between
continuous: infinite values between any two categories

quantitative vs. categorical data

quantitative: directly measuring something (continuous data)
categorical data: counts of things (discrete variables)

scales of measurements

nominal: no inherent order among the categories (weakest)
ordinal: categories are ordered but not evenly spaced (can use median)
interval: equal intervals, but no true zero
ratio: equal intervals and a true zero

frequency distributions

- real lower limit
- real upper limit
- midpoint

visualizing data

- histogram: a frequency distribution turned into a graph. We can see the shape of the distribution and the spread of the data.
- line graph: good for looking at change over time
- scatterplot: tells us about relationships between variables. Shows pos. & neg. relationships. Strength of the relationship is based on how linear the points are.
- boxplot: the box represents the middle 50% of the data.

shapes of distributions

symmetrical
- unimodal: bell-shaped (normal dist.)
- bimodal: 2 clear peaks (one can be higher)
- rectangular: equal freq. for all values
asymmetrical
- pos. skewed: skewed to right (not normal, but unimodal)
- neg. skewed: skewed to left (not normal, but unimodal)

central tendency

mean: avg. of all numbers
median: middle number in list
mode: most freq. number

variability: range, interquartile range, variance, standard dev.

range: x(max) − x(min)
interquartile range: IQR = Q3 − Q1, where the quartile positions in the ordered data are Q1 = .25 × (# of values in data set), Q2 = .50 × (# of values in data set), Q3 = .75 × (# of values in data set)
variance: avg. squared dev. of each number from the mean
std. dev.: square root of the variance (takes away the squared unit)

z-scores

raw score to z-score: z = (x − μ)/σ
z-score to raw score: x = μ + zσ

standardize a dist.

shape: the distribution of z-scores has the same shape as the original distribution of raw scores
mean: a z-score distribution always has a mean of zero, so any score above the mean is positive and any score below it is negative
standard deviation: a z-score distribution always has a standard deviation of 1, so the numerical value of a z-score is exactly the number of standard deviations the score lies from the mean
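
Standardizing can be sketched in Python as below. The scores are borrowed from another card in this deck and treated as a population, and the helper names `to_z` and `to_raw` are illustrative.

```python
from statistics import fmean, pstdev

def to_z(x, mu, sigma):
    """Raw score to z-score."""
    return (x - mu) / sigma

def to_raw(z, mu, sigma):
    """z-score back to raw score."""
    return mu + z * sigma

scores = [21, 34, 18, 26, 30, 35, 24, 29, 25]
mu = fmean(scores)        # population mean
sigma = pstdev(scores)    # population standard deviation

z_scores = [to_z(x, mu, sigma) for x in scores]
print(round(fmean(z_scores), 6), round(pstdev(z_scores), 6))
```

Up to floating-point rounding, the z-scores have a mean of 0 and a standard deviation of 1, and `to_raw` recovers each original score.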

normal distribution

empirical rule: the following approximately holds
- 68% of obs. fall between μ − σ and μ + σ
- 95% of obs. fall between μ − 2σ and μ + 2σ
- 99.7% of obs. fall between μ − 3σ and μ + 3σ
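
The percentages in the empirical rule can be checked against the standard normal distribution. A minimal Python sketch using `statistics.NormalDist`; the `coverage` dictionary is an illustrative name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} sd: {p:.4f}")
```

The exact values are close to, but not exactly, 68%, 95%, and 99.7%, which is why the rule is stated as an approximation.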

un-biased stat

a statistic whose long-run average is equal to the parameter it estimates
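
One way to see what "long-run average" means is to average a statistic over every possible sample. The Python sketch below uses a tiny made-up population and all samples of size 2 drawn with replacement (both are assumptions for illustration). It compares the sample variance with divisor n − 1 against the version with divisor n, using exact fractions.

```python
from fractions import Fraction
from itertools import product

def variance(values, offset):
    """Sum of squared deviations divided by (n - offset), computed exactly."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / (n - offset)

population = [1, 2, 3]                        # hypothetical population
sigma_squared = variance(population, 0)       # population variance (divide by N)

samples = list(product(population, repeat=2)) # every sample of size 2, with replacement
avg_n_minus_1 = sum(variance(s, 1) for s in samples) / len(samples)
avg_n = sum(variance(s, 0) for s in samples) / len(samples)

print(sigma_squared, avg_n_minus_1, avg_n)
```

Averaged over all nine samples, the n − 1 version equals the population variance exactly, while the n version comes out too small. That is the sense in which the sample variance with n − 1 is unbiased.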
