rest of UNIT ONE vocab Flashcards
What is a contingency table?
Shows how counts are distributed across 2 categorical variables, like gender and music preference. AKA 2-way table
How can you tell if variables in a contingency table are independent?
If the conditional distributions of one variable are the same for every category of the other variable. Then it doesn’t DEPEND, so INDEPENDENT
When drawing a graph or chart, what do you have to remember to do?
LABEL AXES, make a KEY (if needed), AND GIVE IT A NAME!!! “Figure 1: Age and Food Preference”
marginal distribution
The overall distribution of a single variable in a contingency table (the totals out in the margins)
conditional distribution?
The distribution of one variable within the table, along only one row or one column (that is, for just one category of the other variable). NOT IN THE MARGINS
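A minimal Python sketch of marginal vs. conditional distributions. The counts are made up for illustration (they are not from these cards), and the variable names are hypothetical:

```python
# Hypothetical 2-way table (made-up counts): rows = gender, columns = music preference.
table = {
    "boys":  {"pop": 20, "rock": 30},
    "girls": {"pop": 30, "rock": 20},
}
columns = ["pop", "rock"]

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of gender: each row total divided by the grand total.
marginal_gender = {g: sum(row.values()) / grand_total for g, row in table.items()}

# Marginal distribution of music preference: each column total divided by the grand total.
marginal_music = {c: sum(table[g][c] for g in table) / grand_total for c in columns}

# Conditional distribution of music preference GIVEN gender:
# each cell divided by its own row total (stay inside one row, not the margins).
conditional_given_gender = {
    g: {c: n / sum(row.values()) for c, n in row.items()}
    for g, row in table.items()
}

print("marginal (gender):", marginal_gender)
print("marginal (music):", marginal_music)
print("conditional (music | gender):", conditional_given_gender)
```

The marginals use the totals in the margins; the conditionals use only the cells in one row.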
Association and Independence: How are they related?
Variables are either independent or associated
Give an example of independent variables
If 80% prefer cheese and only 20% prefer pepperoni IN EACH GRADE AT BHS, then they all have the same preference, so grade doesn’t matter. We say “school year and pizza choice are independent” (not dependent)
Give a quick example of associated variables
A higher percentage of boys play video games than girls so we say “gender and video game playing are associated” or “gender and video game playing are not independent”
Gender and Video Game playing are ___________ because _______
associated (not independent) because a higher percentage of males play video games. (think.. It depends on gender)
Year in school (F,S,J,S) and Pizza Preference (pepperoni or cheese) are __________ because _______________
independent (not associated) because they all have the same preferences. It doesn’t depend on grade: 80% of each group likes cheese better.
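One way to check the pizza and video game examples is to compare the conditional distributions row by row. This is only a sketch: the counts are made up to match the percentages on these cards, the function names are hypothetical, and the tolerance `tol` is an assumption (real samples rarely match exactly).

```python
def row_percentages(table):
    """Return each row's conditional distribution (proportions that sum to 1)."""
    return {
        group: {cat: n / sum(row.values()) for cat, n in row.items()}
        for group, row in table.items()
    }

def looks_independent(table, tol=0.01):
    """True if every row has (nearly) the same conditional distribution."""
    dists = list(row_percentages(table).values())
    first = dists[0]
    return all(abs(d[cat] - first[cat]) <= tol for d in dists for cat in first)

# Made-up counts: 80% cheese in EVERY grade, even though class sizes differ.
pizza = {
    "freshman":  {"cheese": 160, "pepperoni": 40},
    "sophomore": {"cheese": 120, "pepperoni": 30},
    "junior":    {"cheese": 80,  "pepperoni": 20},
    "senior":    {"cheese": 200, "pepperoni": 50},
}

# Made-up counts: a higher percentage of boys play video games.
video_games = {
    "boys":  {"plays": 70, "does not play": 30},
    "girls": {"plays": 40, "does not play": 60},
}

print(looks_independent(pizza))        # True  -> independent
print(looks_independent(video_games))  # False -> associated
```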
What do you call things that are not independent?
associated
mean/SD/median/IQR? How do I know which ones to use?
When unimodal and symmetric: mean and SD. Skewed or outliers? Median and IQR. BIMODAL? Talk about the MODES
How do you describe distributions (histograms)?
Shape-Center-Spread and STRANGE (outliers and gaps). Some say GSOCS. Where’s yo GSOCS?
If asked to compare distributions, what should you write about?
Compare Shapes, Centers, Spreads, and Stranges.. The GSOCS
What does GSOCS stand for?
Gaps Shape Outliers Center Spread
If a distribution is skewed right, what will be greater, the mean or median? WHY?
Mean. The mean moves further to the right to keep balance. (the mean chases the tail)
If a distribution is skewed left, what will be greater, the mean or median? WHY?
Median. The mean moves left to keep balance. (the mean chases the tail)
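A quick Python check of "the mean chases the tail", using two small made-up data sets (one with a long right tail, one with a long left tail):

```python
import statistics

# Made-up data: one far-out value creates the tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 20]       # tail to the right
left_skewed = [1, 17, 17, 18, 18, 18, 19, 19, 20]  # tail to the left

for name, data in [("skewed right", right_skewed), ("skewed left", left_skewed)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{name}: mean = {mean:.2f}, median = {median}")
```

In the right-skewed set the mean lands above the median; in the left-skewed set it lands below.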
Give a simple example showing that adding a constant doesn’t change the spread, but changes the center. (this always happens)
Data set: 1,2,3,4,5 Spread(range): 5-1=4, Center: 3
add three and get new data set: 4,5,6,7,8 spread: still 4 Center: 6 (center went up by 3, spread stayed the same). The IQR and SD will stay the same, but the median and mean each go up by 3
Give a simple example showing that multiplying by a constant changes both the spread and the center. (this always happens)
Data set: 1,2,3,4,5 Spread(range): 5-1=4, Center: 3
multiply by three and get new data set: 3,6,9,12,15 spread: 12 Center: 9 (both center and spread were multiplied by three). The IQR and SD are multiplied by 3, and so is every value, including Q1, the median, etc.
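Both examples can be checked in Python with the same data set 1,2,3,4,5. This is a sketch: the `summary` helper is hypothetical, and `statistics.quantiles` uses one particular quartile convention, so Q1 and Q3 may differ slightly from a textbook or calculator method (the shift and scale behavior is the same either way).

```python
import statistics

def summary(data):
    """Center and spread statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles (default 'exclusive' method)
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "sd": statistics.stdev(data),  # sample standard deviation
    }

original = [1, 2, 3, 4, 5]
shifted = [x + 3 for x in original]  # add a constant: 4, 5, 6, 7, 8
scaled = [x * 3 for x in original]   # multiply by a constant: 3, 6, 9, 12, 15

for name, data in [("original", original), ("add 3", shifted), ("times 3", scaled)]:
    print(name, data, summary(data))
```

Adding 3 moves the mean and median up by 3 and leaves range, IQR, and SD alone; multiplying by 3 multiplies all five statistics by 3.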
How do you describe center?
Talk about the mean (balance), median (splits area in half), mode (peaks? if bimodal, talk about both modes) or simply say: “centered around ____”
How do you describe shape?
unimodal, bimodal, multimodal, uniform AND symmetric, skewed
Spread description?
range, IQR, standard deviation, variance, or simply say: “From here to about here”
If the distribution is unimodal and symmetric, what would you use for center and spread statistics?
Mean (center) and Standard Deviation (spread)
If the distribution is skewed (or outliers/not symmetric) what would you use for center and spread statistics?
Median (center) and IQR (spread)
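One way to see why median and IQR are preferred when there are outliers: swap the largest value for an extreme one and compare. The data below is made up, and `statistics.quantiles` is just one quartile convention, so treat this as a sketch rather than the only way to compute IQR.

```python
import statistics

def iqr(data):
    """Interquartile range using statistics.quantiles (default 'exclusive' method)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return q3 - q1

# Made-up data; the second list swaps the largest value for an outlier.
clean = [10, 11, 12, 12, 13, 13, 14, 15, 16]
with_outlier = [10, 11, 12, 12, 13, 13, 14, 15, 100]

for name, data in [("clean", clean), ("with outlier", with_outlier)]:
    print(
        f"{name}: mean = {statistics.mean(data):.2f}, sd = {statistics.stdev(data):.2f}, "
        f"median = {statistics.median(data)}, IQR = {iqr(data)}"
    )
```

The mean and SD jump because of the single outlier, while the median and IQR stay where they were.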
If the distribution is bimodal or multimodal, what would you use for center and spread statistics?
Talk about each mode (center) and maybe use the range or IQR. You could also say “one group is from __ to __ and the other from about __ to __”
what happens if you ADD a constant to each value in a data set?
It is SHIFTED only. This affects all of the data values and measures of center (mean, median) and position (quartiles, deciles, etc.). IT DOES NOT CHANGE THE SPREAD! (IQR, St Dev, Range all stay the SAME).