Highlighted memory Flashcards
Five-number summary
Min, Q1, Med, Q3, Max
Describing or comparing distributions
For quantitative data, discuss:
shape, center, spread, outliers
Outlier
An observation is an outlier if it is smaller than Q1 - 1.5 × IQR or larger than Q3 + 1.5 × IQR, where IQR = Q3 - Q1
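The 1.5 × IQR rule can be sketched as below. The quartiles are assumed to follow the median-of-halves convention, and the function name and data are hypothetical.

```python
from statistics import median

def outliers_1_5_iqr(data):
    """Return the values flagged as outliers by the 1.5 x IQR rule.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data (the overall median is excluded when n is odd).
    """
    xs = sorted(data)
    n = len(xs)
    q1 = median(xs[: n // 2])
    q3 = median(xs[n // 2 + n % 2 :])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in xs if x < low_fence or x > high_fence]

# Made-up data: Q1 = 5.5, Q3 = 13.5, IQR = 8, fences at -6.5 and 25.5
scores = [2, 4, 5, 7, 8, 10, 12, 15, 30]
print(outliers_1_5_iqr(scores))  # [30]
```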
Choosing measures of center and spread
The mean and standard deviation are used to compare roughly symmetric distributions. The median and IQR are used to compare distributions where at least one is skewed or has outliers, because the median and IQR are resistant to outliers and skewness.
Percentile
The percentile of a value is the percent of observations in the distribution that are less than that value
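A small sketch of the percentile of a value, using the "percent of observations below" definition from this card. Some textbooks count "at or below" instead, so this strictly-less-than choice is an assumption. The function name and data are hypothetical.

```python
def percentile_of(value, data):
    """Percent of observations in data that are strictly less than value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

# Made-up test scores: 7 of the 10 scores are below 84
scores = [55, 60, 62, 70, 71, 75, 80, 84, 90, 95]
print(percentile_of(84, scores))  # 70.0
```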
z-score
How many standard deviations a value x lies above or below the mean: z = (x - mean) / (standard deviation)
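As a quick worked example, a z-score is the distance from the mean divided by the standard deviation. The function name and numbers are made up for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# A score of 86 in a distribution with mean 80 and standard deviation 4
print(z_score(86, 80, 4))  # 1.5  (1.5 standard deviations above the mean)
print(z_score(74, 80, 4))  # -1.5 (1.5 standard deviations below the mean)
```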
Density curve
A density curve always:
remains on or above the horizontal axis
has total area 1 underneath it
Describing scatterplots
Discuss:
direction, form, strength, outliers
Residual plot
When the residual plot has no obvious pattern, a linear model is appropriate for the data
When the residual plot has an obvious pattern (such as a curve), a linear model is NOT appropriate for the data
s
Standard deviation of the residuals: the typical size of the prediction errors (residuals) when using the regression line. Interpret in context.
r^2
Coefficient of determination: the percent of the variation in y that is accounted for by the least-squares regression line relating x to y. Interpret in context.
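To make the residuals, s, and r^2 concrete, here is a minimal pure-Python sketch that fits a least-squares line and computes all three. It assumes the usual formulas (s = sqrt(SSE / (n - 2)) and r^2 = 1 - SSE / SST); the function name and data are hypothetical.

```python
from math import sqrt

def regression_summary(x, y):
    """Fit y-hat = a + b*x by least squares; return (b, a, residuals, s, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # residual = actual y - predicted y
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)        # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)    # total variation in y
    s = sqrt(sse / (n - 2))                     # typical prediction error
    r_squared = 1 - sse / sst                   # fraction of variation explained
    return slope, intercept, residuals, s, r_squared

# Made-up data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, residuals, s, r_squared = regression_summary(x, y)
print(f"y-hat = {intercept:.2f} + {slope:.2f}x, s = {s:.3f}, r^2 = {r_squared:.2f}")
```

Plotting the residuals against x and looking for a leftover pattern is the residual-plot check described in the card above.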
4 basic principles of experimental design
Comparison: Use a design that compares two or more treatments
Random Assignment: Use chance to assign experimental units to treatments. This helps create roughly equivalent groups before treatments are imposed.
Control: Keep as many other variables as possible the same for all groups. Control helps avoid confounding and reduces the variation in responses, making it easier to decide whether a treatment is effective.
Replication: Impose each treatment on enough experimental units so that the effects of the treatments can be distinguished from chance differences between the groups
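Random assignment can be sketched as shuffling the experimental units and dealing them out to the treatments in turn. The function name, unit labels, and treatment names below are hypothetical, and the seed is only there to make the illustration repeatable.

```python
import random

def randomly_assign(units, treatments, seed=None):
    """Shuffle the units by chance, then deal them out to treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# 20 hypothetical subjects, two hypothetical treatments
groups = randomly_assign(range(1, 21), ["new drug", "placebo"], seed=1)
for treatment, members in groups.items():
    print(treatment, sorted(members))
```

Dealing units out in turn keeps the group sizes as equal as possible, which also supports the replication principle.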
Statistically significant
When an observed difference in responses between the groups in an experiment is too large to be explained by chance variation in the random assignment
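One way to judge whether a difference is "too large to be explained by chance" is to simulate the random assignment many times. The sketch below is a simple randomization test on the difference in group means; it is one possible approach, not the only one. The function name and response values are made up.

```python
import random

def randomization_p_value(group_a, group_b, trials=10_000, seed=0):
    """Estimate how often re-randomizing the units gives a difference in
    means at least as extreme as the observed one (two-sided)."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # redo the random assignment
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1
    return extreme / trials

# Made-up responses for two treatment groups
treatment = [20, 22, 24, 25, 27]
control = [10, 11, 12, 13, 15]
print(randomization_p_value(treatment, control))
```

A small returned proportion means that chance variation in the random assignment alone rarely produces a difference this large, which is what "statistically significant" describes.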
Scope of inference
We can make inferences about the population if the individuals taking part in a study were randomly selected from that population
We can make inferences about cause and effect if a well-designed experiment that randomly assigns experimental units to treatments is used
Law of large numbers / probability
The law of large numbers says that the proportion of times a particular outcome occurs in many repetitions of a chance process will approach a single number, its probability
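A short simulation illustrates the idea: as the number of repetitions grows, the observed proportion of an outcome tends to settle near its probability. The function name is hypothetical, and the fixed seed is only there to make the run repeatable.

```python
import random

def observed_proportion(p, repetitions, seed=0):
    """Simulate a chance process with success probability p and return
    the proportion of successes after the given number of repetitions."""
    rng = random.Random(seed)
    successes = sum(1 for _ in range(repetitions) if rng.random() < p)
    return successes / repetitions

# Fair-coin example: the proportion of heads should drift toward 0.5
for n in (10, 100, 1_000, 100_000):
    print(n, observed_proportion(0.5, n))
```

Short runs can land far from 0.5; only the long-run proportion is expected to be close.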