DECK 2: UNIT 1 part A Flashcards
When drawing a graph or chart, what do you have to remember to do?
LABEL THE AXES, make a KEY (if needed), AND GIVE IT A NAME!!! Ex: “Figure 1: Age and Food Preference”
When are box plots used most often?
When comparing a bunch of different sets of data.
What is the IQR?
Interquartile range: a measure of spread. IQR = Q3 - Q1, the distance from Q1 to Q3. The regular range is Hi - Lo; the IQR is the inner range, the range of the middle 50% of the data.
How can you match boxplots to histograms?
USE THE FISH TANK METHOD!
What is a CUMULATIVE FREQUENCY GRAPH?
An OGIVE. It shows the added up totals as you go left to right.
What do OGIVES look like?
They all start at the bottom left (0%) and rise to the top right (100%). They never go back down.
What is a “percentile?”
It tells you the percent of data BELOW a certain value
How do you find a certain percentile on an OGIVE?
Start at the % on the Y axis. Travel horizontally to the right until you hit the line, then go straight down to the X axis. The data value you land on is that percentile.
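The across-then-down move can be sketched in Python. This is only an illustration, assuming the ogive is stored as a list of X values and a list of cumulative percents joined by straight lines. The function name `value_at_percentile` and the test-score numbers are made up for the example.

```python
def value_at_percentile(edges, cum_pct, pct):
    """Read a percentile off an ogive drawn as straight line segments.

    edges   -- X values where the ogive is plotted (hypothetical bin edges)
    cum_pct -- cumulative percent at each edge, rising from 0 to 100
    pct     -- the percent to start from on the Y axis
    """
    for i in range(1, len(edges)):
        lo, hi = cum_pct[i - 1], cum_pct[i]
        if lo <= pct <= hi and hi > lo:
            # Travel right until we hit this segment, then drop to the X axis.
            frac = (pct - lo) / (hi - lo)
            return edges[i - 1] + frac * (edges[i] - edges[i - 1])
    raise ValueError("percent must be between 0 and 100")


# Made-up ogive: test scores in bins of width 10.
edges = [50, 60, 70, 80, 90, 100]
cum_pct = [0, 10, 30, 70, 90, 100]

median_estimate = value_at_percentile(edges, cum_pct, 50)
q3_estimate = value_at_percentile(edges, cum_pct, 75)
print(median_estimate, q3_estimate)  # 75.0 82.5
```

Reading off a real graph by eye gives only an estimate; the straight-line interpolation here is one reasonable way to mimic that.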
How can you turn OGIVES into histograms?
RECTANGLE DROP! (bin drop)
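One way to read the rectangle drop, assuming it means that each bar's height is how much the ogive rises across that bin: subtract each cumulative percent from the next one. The function name and the numbers below are made up for illustration.

```python
def ogive_to_histogram(cum_pct):
    """Return the percent of data in each bin, given cumulative percents
    at the bin edges (first entry is the 0% starting point)."""
    return [cum_pct[i] - cum_pct[i - 1] for i in range(1, len(cum_pct))]


# Made-up ogive values at the edges 50, 60, 70, 80, 90, 100.
cum_pct = [0, 10, 30, 70, 90, 100]

bin_heights = ogive_to_histogram(cum_pct)
print(bin_heights)  # [10, 20, 40, 20, 10]
```

Steep parts of the ogive become tall bars and flat parts become short bars (or gaps).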
Where are the “outlier fences”?
1.5 × IQR above Q3 and 1.5 × IQR below Q1. Values beyond the fences get flagged as possible outliers. Just a rule of thumb.
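A quick Python sketch of the fence arithmetic. It assumes Q1 and Q3 are the medians of the lower and upper halves (with the middle value left out of both halves when the count is odd), which is one common convention; calculators and software may use slightly different quartile rules. The data set is made up and needs at least two values.

```python
from statistics import median


def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data."""
    s = sorted(data)
    half = len(s) // 2          # middle value is skipped when the count is odd
    return median(s[:half]), median(s[-half:])


def outlier_fences(data):
    """Lower and upper fences: 1.5 * IQR below Q1 and above Q3."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


data = [2, 4, 5, 6, 7, 8, 9, 11, 30]   # made-up values
low, high = outlier_fences(data)
outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)  # -3.75 18.25 [30]
```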
What is the five number summary?
min, Q1, Q2 (median), Q3, and max
How do you find Q1 and Q3?
Q1 is the median of the bottom half and Q3 is the median of the upper half (they are the 25th and 75th percentiles)
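The median-of-halves idea can be turned into a small five number summary function. This is a sketch under the same convention as the card (split the sorted data in half, take the median of each half); the function name and data are made up, and other quartile rules can give slightly different answers.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    s = sorted(data)
    half = len(s) // 2          # for an odd count the middle value is in neither half
    q1 = median(s[:half])       # median of the bottom half
    q3 = median(s[-half:])      # median of the upper half
    return s[0], q1, median(s), q3, s[-1]


scores = [12, 1, 8, 3, 10, 4, 7, 6]   # made-up values, unsorted on purpose
summary = five_number_summary(scores)
print(summary)  # (1, 3.5, 6.5, 9.0, 12)
```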
What percentile is Q3?
75th
How do you describe distributions (histograms)?
Shape, Center, Spread, and STRANGE (outliers and gaps). Some say GSOCS. Where’s yo GSOCS?
How can you describe spread?
range, IQR, standard deviation, variance, or simply say: “from here to about here”
How can you describe shape?
TWO THINGS: modes and symmetry.
unimodal, bimodal, multimodal AND uniform, symmetric, skewed
How do you describe CENTER for bimodal or multimodal?
talk about the modes (the lumps, the clusters)
How do you describe CENTER for skewed distributions or distributions with outliers?
use the MEDIAN
How do you describe CENTER for unimodal and symmetric distributions?
use the MEAN
How do you describe SPREAD for unimodal and symmetric distributions?
use the standard deviation
How do you describe SPREAD for skewed distributions (or distributions with outliers)?
Use the IQR
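Why median and IQR for skewed data or outliers? A small made-up example in Python shows it: swapping one value for an outlier drags the mean and standard deviation a lot, while the median and IQR stay put. The helper `iqr` uses the median-of-halves convention, which is an assumption here.

```python
from statistics import mean, median, stdev


def iqr(data):
    """Q3 - Q1, with quartiles taken as medians of the lower and upper halves."""
    s = sorted(data)
    half = len(s) // 2
    return median(s[-half:]) - median(s[:half])


symmetric = [4, 5, 5, 6, 6, 6, 7, 7, 8]      # made-up, roughly symmetric
with_outlier = symmetric[:-1] + [80]         # same data, but the 8 becomes 80

for name, d in [("symmetric", symmetric), ("with outlier", with_outlier)]:
    print(name, "mean:", mean(d), "SD:", round(stdev(d), 2),
          "median:", median(d), "IQR:", iqr(d))
```

The mean jumps from 6 to 14 and the standard deviation grows sharply, but the median stays at 6 and the IQR stays at 2.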
How do you describe SPREAD for bimodal or multimodal?
talk about the outer edges of the clusters “from here to here” or use the IQR.
If asked to compare distributions, what should you write about?
A sentence comparing the SHAPES. A sentence comparing the CENTERS. A sentence comparing the SPREADS. and a sentence comparing the STRANGE STUFF. (GSOCS)
What does GSOCS stand for?
Gaps, Shape, Outliers, Center, Spread (put on your GSOCS when comparing distributions). Be sure to talk about each one clearly (make a list).
How can you describe the center of a distribution?
OPTIONS: give the mean (balance), median (splits area in half), mode (peaks, if bimodal talk about both modes) or say “centered around ____”
How can you tell if variables in a contingency table are independent?
If the distribution of one variable is the same across every category of the other variable, then it doesn’t DEPEND… so INDEPENDENT. Ex: 30% of freshmen and 30% of seniors like cabbage.
What do you call things that are not independent?
Associated, or not independent. We generally don’t say DEPENDENT (unless talking about the y variable on a scatterplot).
Give an example of independent variables
If 80% prefer cheese and only 20% prefer pepperoni IN EACH GRADE AT BHS…then they all have the same preference, so grade doesn’t matter. We say “school year and pizza choice are independent”
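The "same percents in each grade" check can be sketched in Python. Everything here is illustrative: the counts are made up, the function names are hypothetical, and the 5-point tolerance is an arbitrary choice for deciding when percents are "about the same".

```python
def row_percents(table):
    """Convert each row of counts to percents of that row's total."""
    result = {}
    for row_label, counts in table.items():
        total = sum(counts.values())
        result[row_label] = {k: 100 * v / total for k, v in counts.items()}
    return result


def looks_independent(table, tolerance=5.0):
    """True if, for every column, the row percents differ by at most
    `tolerance` percentage points (an arbitrary cutoff for this sketch)."""
    rows = list(row_percents(table).values())
    categories = rows[0].keys()
    return all(
        max(r[c] for r in rows) - min(r[c] for r in rows) <= tolerance
        for c in categories
    )


# Made-up counts: both grades are 80% cheese, 20% pepperoni.
pizza = {
    "Freshman": {"cheese": 80, "pepperoni": 20},
    "Senior": {"cheese": 160, "pepperoni": 40},
}
# Made-up counts: 70% of males play vs 40% of females.
games = {
    "Male": {"plays": 70, "does not play": 30},
    "Female": {"plays": 40, "does not play": 60},
}

print(looks_independent(pizza))  # True
print(looks_independent(games))  # False
```

Note that the seniors have twice as many students as the freshmen; what matters is the percents within each row, not the raw counts.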
What is a marginal distribution?
The distribution in the margins (the totals along the outside edge of the table). It is the overall distribution of a single variable in a contingency table, ignoring the other variable.
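A small Python sketch of pulling the margins out of a two-way table, assuming the table is stored as a dictionary of rows. The counts and the function name are made up for illustration.

```python
def marginal_distributions(table):
    """Return (row margin %, column margin %) for a table of counts."""
    row_totals = {r: sum(cols.values()) for r, cols in table.items()}
    col_totals = {}
    for cols in table.values():
        for c, n in cols.items():
            col_totals[c] = col_totals.get(c, 0) + n
    grand = sum(row_totals.values())
    # Each margin is that variable's totals divided by the grand total.
    row_pct = {r: 100 * n / grand for r, n in row_totals.items()}
    col_pct = {c: 100 * n / grand for c, n in col_totals.items()}
    return row_pct, col_pct


# Made-up counts for gender vs. video game playing.
table = {
    "Male": {"plays": 70, "does not play": 30},
    "Female": {"plays": 40, "does not play": 60},
}

gender_margin, games_margin = marginal_distributions(table)
print(gender_margin)  # {'Male': 50.0, 'Female': 50.0}
print(games_margin)   # {'plays': 55.0, 'does not play': 45.0}
```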
Gender and Video Game playing are ___________ because _______
associated (or not independent) because a higher percentage of males play video games. (Think: it depends on gender.)
Year in school (F,S,J,S) and Pizza Preference (pepperoni or cheese) are __________ because _______________
independent because all grades have similar preference distributions.
Ex (in every grade): 40% cheese, 30% pepperoni, 20% veggie, 10% other
What is a contingency table?
A table that shows how the data are distributed across 2 categorical variables, like gender and music preference. AKA a two-way table.
Association and Independence. How are they related?
Variables are either independent or associated. Meaning: if the distribution of one changes depending on the other, then we say there is an association. If not, then they are independent.
When there is a relationship between two variables, we say that they are
associated (or not independent)
When there is no relationship between two variables, we say they are
independent (or not associated)
independent is the same as __________
not associated
associated is the same as __________
not independent
not associated is the same as being ____________
independent
Give a quick example of associated variables
A higher percentage of boys play video games than girls so we say “gender and video game playing are associated” or “gender and video game playing are not independent”
What is a conditional distribution?
A distribution with a condition (within the table), along only one row or one column… NOT IN THE MARGINS. You are given a condition. Then read along that row or column.
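Reading along one row or one column can be sketched like this in Python. The table layout (a dictionary of rows), the counts, and the function names `given_row` and `given_column` are all made up for illustration.

```python
def given_row(table, row):
    """Conditional distribution along one row: percents of that row's total."""
    counts = table[row]
    total = sum(counts.values())
    return {c: 100 * n / total for c, n in counts.items()}


def given_column(table, column):
    """Conditional distribution along one column: percents of that column's total."""
    counts = {r: cols[column] for r, cols in table.items()}
    total = sum(counts.values())
    return {r: 100 * n / total for r, n in counts.items()}


# Made-up counts for gender vs. video game playing.
table = {
    "Male": {"plays": 70, "does not play": 30},
    "Female": {"plays": 40, "does not play": 60},
}

# Condition: "given that the person is male".
print(given_row(table, "Male"))        # {'plays': 70.0, 'does not play': 30.0}
# Condition: "given that the person plays video games".
print(given_column(table, "plays"))    # Male about 63.6%, Female about 36.4%
```

The two conditions answer different questions: 70% of males play, but about 64% of players are male.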
not independent is the same as
associated
What percent of the data is above Q3?
25%
What percent of the data is between Q1 and Q3?
The middle 50%. The width of that middle chunk is the IQR.
What is Q2 also known as?
the median
What percent of the data is below Q2?
50%