Summer stats Flashcards
Statistics
The science and art of collecting, analyzing, and drawing conclusions from data.
Individuals
An object described in a set of data. Can be a person, animal, or thing.
Variable
An attribute that can take different values for different individuals
Categorical Variable
Assigns labels that place each individual into a particular group, called a category
Quantitative Variable
Takes number values that are quantities: counts or measurements
Discrete Variable
A quantitative variable that takes a fixed set of possible values with gaps between them
Continuous Variable
A quantitative variable that can take any value in an interval on the number line
Frequency Table
Shows the number or count of individuals having each value.
Relative Frequency Table
Shows the proportion or percent of individuals having each value.
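As a rough illustration of the two table cards above, the sketch below tallies a small made-up list of categories. The data and variable names are assumptions for the example, not part of the flashcards.

```python
from collections import Counter

# Hypothetical categorical data: favorite color of 8 individuals.
colors = ["red", "blue", "red", "green", "blue", "red", "blue", "red"]

# Frequency table: count of individuals in each category.
freq = Counter(colors)

# Relative frequency table: proportion of individuals in each category.
n = len(colors)
rel_freq = {category: count / n for category, count in freq.items()}

for category in freq:
    print(category, freq[category], rel_freq[category])
```

The relative frequencies add up to 1 (or 100%), which is a quick check that the table was built correctly.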
Bar graph
Shows each category as a bar. The heights of the bars show the category frequencies or relative frequencies. There is space between the bars.
Pie Chart
Shows each category as a slice of the pie. The areas of the slices are proportional to the category frequencies or relative frequencies.
Association
There is an association between two variables if knowing the value of one variable helps us predict the value of the other. If knowing the value of one variable does not help us predict the value of the other, then there is no association between the variables.
Symmetric Distribution
A distribution is roughly symmetric if the right side of the graph (containing the half of the observations with the largest values) is approximately a mirror image of the left side.
Skewed Distribution
A distribution is skewed to the left if the left side of the graph is much longer than the right side. In other words, a distribution is skewed left if there is a tail stretching out on the left side. A distribution is skewed to the right if the right side of the graph is much longer than the left side.
Mean
The mean of a distribution of quantitative data is the average of the individual data values. To find it, add all the values and divide by the total number of data values.
Statistic
A number that describes some characteristic of a sample
Parameter
A number that describes some characteristic of a population
Resistant
A statistical measure is ____ if it isn’t sensitive to extreme values (outliers)
Median
The midpoint of a distribution: the number such that about half the observations are smaller and about half are larger. To find it, arrange the data values from smallest to largest. If the number of data values is odd, the ____ is the middle value. If the number of data values is even, the ____ is the average of the two middle values. The ____ represents the 2nd quartile and is at 50%.
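To see the mean, median, and "resistant" cards together, here is a minimal sketch using Python's standard `statistics` module. The two data sets are made up for illustration. They differ only in one extreme value.

```python
from statistics import mean, median

# Hypothetical data set, then the same set with one extreme value.
data = [2, 4, 4, 5, 7]
with_outlier = [2, 4, 4, 5, 70]

print(mean(data), median(data))                  # mean 4.4, median 4
print(mean(with_outlier), median(with_outlier))  # mean 17, median 4
```

The outlier pulls the mean from 4.4 up to 17, while the median stays at 4. This suggests the median is resistant and the mean is not.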
Range
The ____ of a distribution is the distance between the minimum value and the maximum value. Max - Min = ____
Standard Deviation
Measures the typical distance of the values in a distribution from the mean.
Variance
The standard deviation squared
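The range, standard deviation, and variance cards can be checked on a small made-up data set. This sketch assumes the sample versions (dividing by n - 1), which is what `statistics.stdev` and `statistics.variance` compute. The flashcards do not say which version is intended.

```python
from statistics import stdev, variance

data = [2, 4, 4, 5, 7]  # hypothetical values

data_range = max(data) - min(data)  # Max - Min
s = stdev(data)                     # sample standard deviation (n - 1 divisor)
var = variance(data)                # sample variance

print(data_range, round(s, 3), round(var, 3))
```

Squaring `s` gives back `var` (up to rounding), matching the variance card.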
Quartiles
The quartiles of a distribution divide an ordered data set into four groups having roughly the same number of values. To find the ____, arrange the data values from smallest to largest and find the median.
First Quartile
The median of the data values that are to the left of the median in an ordered list (25%)
Third Quartile
The median of the data values that are to the right of the median in an ordered list (75%)
Interquartile Range (IQR)
The distance between the first and third quartiles of a distribution (Q3 - Q1)
Five Number Summary
The five-number summary of a distribution of quantitative data consists of the minimum, the first quartile, the median, the third quartile, and the maximum
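The quartile, IQR, and five-number summary cards describe a procedure that can be sketched in a few lines. The function name and data below are hypothetical. The quartiles are found as medians of the lower and upper halves, leaving out the overall median when the count is odd, as the cards describe. Other software may use slightly different quartile rules.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]          # values to the left of the median
    upper = data[half + n % 2:]  # values to the right of the median
    return min(data), median(lower), median(data), median(upper), max(data)

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # hypothetical values
minimum, q1, med, q3, maximum = five_number_summary(data)
iqr = q3 - q1  # interquartile range

print(minimum, q1, med, q3, maximum, iqr)
```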
Percentile
The pth percentile of a distribution is the value with p% of observations less than or equal to it
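Read the other way around, the percentile definition above gives a way to find the percentile of a particular value: count the observations at or below it. The helper name and the scores are made up for this sketch.

```python
def percentile_of(value, data):
    """Percent of observations in data that are less than or equal to value."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

scores = [62, 68, 71, 75, 75, 80, 84, 88, 91, 97]  # hypothetical test scores
print(percentile_of(84, scores))  # 7 of 10 scores are <= 84, so 70.0
```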
Standardized Score (z-score)
For an individual value in a distribution, tells us how many standard deviations from the mean the value falls, and in what direction
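A z-score is computed as (value - mean) / standard deviation. A minimal sketch, with made-up numbers and a hypothetical function name:

```python
def z_score(value, mean, sd):
    """How many standard deviations the value falls from the mean."""
    return (value - mean) / sd

# Hypothetical distribution with mean 75 and standard deviation 5.
print(z_score(85, 75, 5))  # 2.0  (two standard deviations above the mean)
print(z_score(70, 75, 5))  # -1.0 (one standard deviation below the mean)
```

The sign gives the direction: positive means above the mean, negative means below.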
Response Variable
Measures an outcome of a study. Graphed on the y-axis
Explanatory Variable
May help predict or explain changes in a response variable. Graphed on the x-axis
y-intercept of the least-squares regression line (LSRL)
the PREDICTED Y-value when x=0
Slope of a least-squares regression line
the PREDICTED change in y for every 1 unit increase in x
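To make the slope and y-intercept cards concrete, the sketch below fits a least-squares line to a small made-up data set. It uses the standard least-squares formulas, which are not stated in the flashcards themselves. The variable and function names are hypothetical.

```python
from statistics import mean

# Hypothetical data: x is the explanatory variable, y is the response variable.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)

# Standard least-squares formulas for slope and intercept.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

def predict(x_value):
    """Predicted y for a given x on the fitted line."""
    return intercept + slope * x_value

print(round(slope, 3), round(intercept, 3))
```

Here `predict(0)` returns the y-intercept, and `predict(x + 1) - predict(x)` equals the slope for any x, matching the two cards.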
population
In a statistical study, the entire group of individuals we want information about
sample
A subset of individuals in the population from which we collect data
bias
The design of a statistical study shows bias if it is very likely to underestimate or very likely to overestimate the value you want to know.
Random Sampling
Involves using a chance process to determine which members of a population are included in the sample.
Simple Random Sample
Chosen so that every group of n individuals in the population has an equal chance to be selected as the sample.
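One way to draw a simple random sample on a computer is `random.sample` from Python's standard library, which picks n distinct items so that every group of n is equally likely. The population below is made up for the sketch.

```python
import random

# Hypothetical population of 30 individuals.
population = [f"student_{i}" for i in range(1, 31)]

random.seed(0)  # fixed seed only so the example is repeatable
srs = random.sample(population, 5)  # simple random sample of size n = 5

print(srs)
```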
Observational Study
observes individuals and measures variables of interest but does not attempt to influence the responses
experiment
Deliberately imposes treatments (conditions) on individuals to measure their responses
Statistically significant
When observed results of a study are too unusual to be explained by chance alone
Independent events
A and B are ____ if knowing whether or not one event has occurred does not change the probability that the other event will happen
probability
The probability of any outcome of a random process is a number between 0 and 1 that describes the proportion of times that outcome would occur in a very long series of trials.
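The "very long series of trials" idea can be illustrated with a simulation. This sketch assumes a fair coin, modeled as a random number below 0.5 counting as heads. The number of trials and the seed are arbitrary choices.

```python
import random

random.seed(1)    # fixed seed only so the example is repeatable
trials = 100_000  # a long series of trials

# Count how many simulated flips come up heads.
heads = sum(1 for _ in range(trials) if random.random() < 0.5)
proportion = heads / trials

print(proportion)
```

With this many trials the printed proportion should land close to 0.5, the probability of heads for a fair coin.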
Bimodal
A graph of quantitative data with two clear peaks
1.5 IQR Outlier Rule
An observation is an outlier if it is less than Q1 - 1.5(IQR) or greater than Q3 + 1.5(IQR)
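The 1.5 IQR rule above can be sketched as a short function. The function name and data are hypothetical. The quartiles are taken as medians of the lower and upper halves of the ordered data, leaving out the overall median when the count is odd. Other quartile conventions can give slightly different fences.

```python
from statistics import median

def outliers_by_iqr(values):
    """Return the values flagged as outliers by the 1.5 x IQR rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])          # median of the lower half
    q3 = median(data[half + n % 2:])  # median of the upper half
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

# Q1 = 3.5, Q3 = 11, IQR = 7.5, so the fences are -7.75 and 22.25.
print(outliers_by_iqr([1, 3, 4, 6, 7, 8, 10, 12, 40]))  # [40]
```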
Measure of Center
Mean and median
Positive association
Two variables have a positive association when the values of one variable tend to increase as the values of the other variable increase
Measure of Variability
Range, IQR, and standard deviation
mode
value in the distribution having the greatest frequency
outlier
Individual value that falls outside the overall pattern of a distribution