Summer Work Flashcards
What is Statistics?
The study of variability
What is variability?
Differences and how things differ
What are the 2 branches of AP Stats?
Inferential and descriptive
What are descriptive stats?
Describing and summarizing what you see in the data you collected
What are inferential stats?
Using information from a sample to make an inference about an entire population
Compare descriptive and inferential stats
Descriptive explains the data you have; inferential uses that data to say something about the entire population.
What is data?
Collected information
What is a population?
The group you’re interested in studying
What is a sample?
A subset of a population, often taken to make inferences about the population
Compare population to sample
A population is the whole; a sample is a subset of the whole.
Compare data to statistics
Data are the individual little things we collect. A statistic is a numerical summary (such as the average) of the data in a sample.
Compare data to parameters
Data are each little bit of information collected. A parameter is a numerical summary (such as the average) of the data from every member of a population.
What is a parameter?
A numerical summary of a population
What is a statistic?
A numerical summary of a sample
We are curious about the average wait time at a Dunkin’ Donuts drive-through in your neighborhood. You randomly sample cars one afternoon and find the average wait time is 3.2 minutes. What is the population parameter? What is the statistic? What is the parameter of interest? What is the data?
The parameter is the true average wait time at that Dunkin’ Donuts. This is the number you don’t have and will likely never know. The statistic is 3.2 minutes; it is the average of the data you collected. The parameter of interest is the same thing as the population parameter: in this case, the true average wait time of all the cars. The data are the wait times of each individual car. You take that data and find the average. That average is called a statistic, and you use it to make an inference about the true parameter.
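To make the data-to-statistic step concrete, here is a minimal Python sketch. The card only gives the 3.2-minute average, so the individual wait times below are invented for illustration, and `wait_times` is a hypothetical name.

```python
from statistics import mean

# Hypothetical wait times in minutes for 5 sampled cars (invented for illustration).
wait_times = [2.5, 4.0, 3.1, 2.9, 3.5]

# Each value is a datum; the average of the sample is a statistic.
sample_mean = mean(wait_times)

# The parameter (true average for ALL cars) stays unknown; the statistic estimates it.
print(f"statistic (sample mean): {sample_mean:.1f} minutes")
```

The code never computes the parameter, because it only has the sample to work with.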
Compare data-statistic-parameter using categorical example
Data are individual measures like meal preference. Statistics and parameters are summaries of the data. A statistic would be 42% of sample preferred tacos. A parameter would be 42% of population prefer tacos.
Compare data-statistic-parameter using quantitative example
Data are individual measures that can be averaged. Statistics and parameters are summaries of the data, like ‘an average breath-holding time of 45.2.’ Statistics summarize samples, while parameters summarize populations.
What is a census?
Getting information from every member of the population
Does a census make sense?
It is OK for small populations but usually impractical (too costly or slow) for larger populations.
What is the difference between a parameter and a statistic?
Both are a single number summarizing a larger group of numbers, but a parameter comes from a population and a statistic comes from a sample.
If I take a random sample of 20 hamburgers from Five Guys and count the number of pickles on each, and one of them has nine pickles, then the nine from that burger would be called what?
A datum, or a data value
If I take a random sample of 20 hamburgers from Five Guys and count the number of pickles on each, and the average number of pickles is 9.5, then 9.5 is considered a what?
A statistic
If I take a random sample of 20 hamburgers from Five Guys and count the number of pickles on each because I want to know the true average number of pickles on a burger at Five Guys, then the true average number of pickles is considered a what?
A parameter
What is the difference between a sample and a census?
With a sample, you get information from a small part of the population. In a census, you get info from the entire population. You can get a parameter from a census, but only a statistic from a sample.
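A rough Python sketch of the census-versus-sample idea. It assumes we magically know the whole population, which is rarely true in practice. The pickle counts are randomly generated stand-ins, not real data, and the names `population`, `parameter`, and `statistic` are just labels for this example.

```python
import random
from statistics import mean

# Hypothetical population: pickle counts on every burger sold in a day (generated, not real).
rng = random.Random(0)  # fixed seed so the run is repeatable
population = [rng.randint(0, 12) for _ in range(500)]

# Census: use every member of the population, which gives the parameter.
parameter = mean(population)

# Sample: use a random subset of 20 burgers, which gives a statistic.
sample = rng.sample(population, 20)
statistic = mean(sample)

print(f"parameter (census of {len(population)}): {parameter:.2f}")
print(f"statistic (sample of {len(sample)}): {statistic:.2f}")
```

The statistic usually lands near the parameter but rarely matches it exactly, which is why inference is needed.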
Use the following words in one sentence: population, parameter, census, sample, data, statistics, inference, parameter of interest.
I was curious about a population parameter, but a census was too costly, so I decided to choose a sample, collect some data, calculate a statistic, and use that statistic to make an inference about the population parameter, a.k.a. the parameter of interest.
If you are tasting a soup, then the flavor of each individual thing in the spoon is the _____, the entire spoonful is a _____, the flavor of all that stuff together is like the _____, and you use that to _____ about the flavor of the entire pot of soup, which would be the _____.
Data, sample, statistic, make an inference, parameter
What are random variables?
If you randomly choose people from a list, then their hair color, height, weight and any other data collected from them can be considered random variables.
What is the difference between quantitative and categorical variables?
Quantitative variables are numerical measures, like height and IQ. Categorical variables are categories, like eye color and music preference.
What is the difference between quantitative and categorical data?
The data are the actual gathered measurements. If the variable is eye color, then the data would look like “blue, brown, brown, brown, green, blue.” Data from categorical variables are usually words; often it is simply “yes, yes, yes, no, yes.” If the variable is weight, then the data would be quantitative, like “125, 150, 220, 178.” Data from quantitative variables are numbers.
What is the difference between discrete and continuous variables?
Discrete variables can be counted, like the number of cars sold; their values are generally integers or whole numbers. Continuous variables are measured and can take any value in a range, like the weight of a mouse (4.3 ounces, etc.).
What is a quantitative variable?
Quantitative variables are numeric, like height, age, number of cars sold, SAT score, etc.
What is a categorical variable?
Categorical (or qualitative) variables sort individuals into categories, like hair color, favorite music, gender, or a yes/no answer.
What do we sometimes call a categorical variable?
Qualitative
What is quantitative data?
The actual numbers gathered from each subject
What is categorical data?
The actual individual category from a subject, like blue or female
What is a random sample?
A sample chosen by chance, such as by rolling a die, drawing names from a hat, or using some other truly random method.
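A minimal Python sketch of the names-from-a-hat idea. The roster is made up for illustration, and `random.sample` from the standard library stands in for the hat.

```python
import random

# Hypothetical roster of 10 students (names invented for illustration).
roster = ["Ana", "Ben", "Cal", "Dee", "Eli", "Fay", "Gus", "Hal", "Ivy", "Jo"]

# random.sample draws without replacement, like pulling names from a hat.
chosen = random.sample(roster, k=3)
print(chosen)
```

Each run can pick a different set of three names.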
What is frequency?
How often something comes up
What is the difference between data and datum?
Datum is singular, data is plural
What is a frequency distribution?
A table, or a chart, that shows how often certain values or categories occur in a data set
What is meant by relative frequency?
The proportion (or percent) of the time something comes up
How do you find relative frequency?
Divide the frequency by the total
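Here is a short Python sketch of frequency and relative frequency. It uses the six eye colors listed on the categorical-data card earlier in this deck; the variable names are just for illustration.

```python
from collections import Counter

# Eye-color data from the categorical-data card earlier in the deck.
eye_colors = ["blue", "brown", "brown", "brown", "green", "blue"]

frequency = Counter(eye_colors)  # how often each category comes up
total = len(eye_colors)

# Relative frequency = frequency / total.
relative_frequency = {color: count / total for color, count in frequency.items()}

for color, count in frequency.items():
    print(f"{color}: frequency={count}, relative frequency={relative_frequency[color]:.0%}")
```

The relative frequencies always add up to 1 (100%).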
What is meant by cumulative frequency?
A running total: adding up the frequencies category by category as you go.
Make a guess as to what relative cumulative frequency is
It is the running total of the relative frequencies (the added-up percentages).
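A quick Python sketch of cumulative and relative cumulative frequency. The pickle counts are hypothetical numbers chosen for easy arithmetic, and `itertools.accumulate` does the running total.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical pickle counts on 10 burgers (invented for illustration).
pickles = [2, 3, 3, 4, 4, 4, 5, 5, 6, 6]

counts = Counter(pickles)
values = sorted(counts)                          # distinct values in order
freqs = [counts[v] for v in values]              # frequency of each value
cum_freqs = list(accumulate(freqs))              # running total of frequencies
rel_cum = [c / len(pickles) for c in cum_freqs]  # running total as a proportion

for v, f, c, r in zip(values, freqs, cum_freqs, rel_cum):
    print(f"{v} pickles: freq={f}, cumulative={c}, relative cumulative={r:.0%}")
```

The last cumulative frequency equals the total count, and the last relative cumulative frequency is 100%.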
What is the difference between a bar chart and a histogram?
Bar charts are for categorical data, and histograms are for quantitative data.
What is the mean?
The balancing point of the histogram
What is the difference between a population mean and a sample mean?
The population mean is the mean of a population, so it is a parameter. The sample mean is the mean of a sample, so it is a statistic.
What symbols do we use for population mean and sample mean?
μ (mu) for the population mean (a parameter); x̄ (x-bar) for the sample mean (a statistic)
How can you think about the mean and median to remember the difference when looking at a histogram?
The mean is the balancing point of the histogram; the median splits the area of the histogram in half.
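One way to see "balancing point" and "splits in half" with numbers, as a small Python sketch. The data set is a small illustrative one chosen so the arithmetic is easy.

```python
from statistics import mean, median

# Small illustrative data set (chosen for easy arithmetic).
data = [1, 2, 2, 5, 5, 8, 8, 9]

m = mean(data)
# Balancing point: deviations below and above the mean cancel out.
deviations = [x - m for x in data]
print("mean:", m, "sum of deviations:", sum(deviations))

med = median(data)
# Median: as many values fall below it as above it.
below = sum(x < med for x in data)
above = sum(x > med for x in data)
print("median:", med, "values below:", below, "values above:", above)
```

The deviations from the mean sum to zero, and the median has the same number of values on each side.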
What is the median?
The middle number; it splits the area in half.
What is the mode?
The most common number, or the peaks of a histogram.
When do we most often use mode?
With categorical values.
Why don’t we always use the mean, we have been calculating it all of our lives?
It is impacted by skewness and outliers. It is not resistant (resilient).
When we say the average teenager are we talking about mean, median or mode?
It depends. If we are talking about height, it might be the mean; if we are talking about parental income, we’d probably use the median; if we are talking about music preference, we’d probably use the mode to talk about the average teenager.
What is a clear example where the mean would change but the median wouldn’t? This would show resistance (resilience).
Take the set of numbers 1, 2, 2, 5, 5, 8, 8, 9, then add another data point of 9000. The first set has a mean of 5 and a median of 5. The second has a median of 5 and a mean of about 1,004.
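The numbers on this card can be checked with a few lines of Python. This is just a quick sketch using the standard `statistics` module.

```python
from statistics import mean, median

original = [1, 2, 2, 5, 5, 8, 8, 9]
with_outlier = original + [9000]  # add the single extreme value

print("original:     mean =", mean(original), " median =", median(original))
print("with outlier: mean =", round(mean(with_outlier), 1), " median =", median(with_outlier))
```

The median stays at 5 while the mean jumps into the thousands.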
How are mean, median and mode positioned in a skewed left histogram?
Typically in this order from left to right: mean, median, mode
How are mean, median and mode positioned in a skewed right histogram?
Typically in the opposite order: mode, median, mean
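A small Python sketch of the usual ordering. Both data sets are hypothetical, built so the skew is obvious. The ordering is a rule of thumb that holds for typical skewed data, not a guarantee for every data set.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: most values are small, a few are large.
skewed_right = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror it to get a left-skewed version.
skewed_left = [16 - x for x in skewed_right]

for name, data in [("skewed right", skewed_right), ("skewed left", skewed_left)]:
    print(f"{name}: mode={mode(data)}, median={median(data)}, mean={mean(data)}")
```

In the right-skewed set the mean is pulled toward the long right tail, and in the mirrored set it is pulled toward the left tail.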
Who chases the tail of a histogram: mean, median or mode?
The mean chases the tail a.k.a. the outliers