Class Created Cards Flashcards
What is the difference between one-tailed and two-tailed tests in hypothesis testing?
- one-sided tests measure how far the actual data falls from what the null hypothesis predicts in one chosen direction only: above OR below, BUT NOT BOTH
- two-sided tests measure how far the actual data falls from what the null hypothesis predicts in either direction, above or below
Is Block 1 the best stats block?
Obviously
What does the symbol “r” represent?
It represents the correlation coefficient, r = cov(x,y)/(σx*σy), and generally tells you how close a set of points is to being fit by a line: r = 1 means the points lie exactly on a line with positive slope, r = 0 means there is no linear relationship, and r = -1 means the points lie exactly on a line with negative slope.
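A minimal sketch of the formula on this card, using only the Python standard library. The function name pearson_r and the small data sets are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r = cov(x, y) / (sd_x * sd_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and standard deviations, all dividing by n.
    # Dividing all three by n - 1 instead would give the same r,
    # because the factor cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])    # points on a line with positive slope
r_down = pearson_r(xs, [10, 8, 6, 4, 2])  # points on a line with negative slope
r_flat = pearson_r(xs, [3, 1, 4, 1, 3])   # made-up data with no linear trend
```

In this sketch r_up comes out at about 1, r_down at about -1, and r_flat at about 0, matching the three cases on the card.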
Does population size impact bias?
No. Bias is only affected by the method of sampling
What does the sigma (σ) symbol mean in statistics?
Standard deviation
Does rejecting the null hypothesis mean that the null hypothesis is false?
No. Rejecting the null hypothesis means the data would be unlikely if the null hypothesis were true, so we favor the alternative hypothesis (that the effect exists in the population). This does not prove that the null hypothesis is a false statement.
What is a good way to think about what the “mean” means?
The mean can be thought of as the fulcrum of a scale: it is the balance point of the data, where the deviations above and below it cancel out. It is not necessarily the most common value, but it is the single value that minimizes the average squared distance to the data points.
What is a z-score?
The number of standard deviations a certain value lies away from the mean: z = (value - mean) / standard deviation.
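A small sketch of the z-score calculation. The exam-score numbers (mean 70, standard deviation 10) are hypothetical and only there to make the arithmetic visible.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies from `mean`."""
    return (value - mean) / sd

# Hypothetical exam scores with mean 70 and standard deviation 10.
z_high = z_score(85, 70, 10)  # 1.5 standard deviations above the mean
z_low = z_score(60, 70, 10)   # 1 standard deviation below the mean
```

A positive z-score means the value is above the mean, and a negative one means it is below.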
What is a good way to think about what the “median” means?
The median is the number that is in the center when the data are sorted in numerical order. Example - 1 4 6 8 9 - median would be 6. With an even number of values, the median is the average of the two middle values.
Why do we use hypothesis testing?
We use a hypothesis test to account for the uncertainty caused by sampling variation
What is a good way to think about what the “mode” means?
The mode is the number in the data that appears most often.
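The mean, median, and mode cards can be checked side by side with Python's built-in statistics module. The list called data is invented for illustration. The second list is the median example from the card above.

```python
import statistics

data = [1, 4, 4, 6, 8, 9, 10]  # made-up data set

mean_value = statistics.mean(data)      # balance point of the data
median_value = statistics.median(data)  # middle value once sorted
mode_value = statistics.mode(data)      # value that appears most often

# The mean is the balance point: deviations above and below it cancel out.
total_deviation = sum(x - mean_value for x in data)

# The median example from the card: 1 4 6 8 9.
card_median = statistics.median([1, 4, 6, 8, 9])
```

Here the mean and the median are both 6 while the mode is 4, a reminder that the three measures do not have to agree.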
How are Greek and Roman letters used in statistics?
To represent concepts/words
-Greek letters tend to represent population parameters (such as µ and σ) and Roman letters tend to represent sample statistics (such as X̄ and s)
Does sample size impact bias?
No. Bias comes from the method of sampling, not from how big or small the sample is. A larger sample reduces sampling variability, but a biased method stays biased at any sample size.
What is the difference between a population and a sample?
A population is the entire group that you are trying to draw a conclusion about while the sample is the group within the population that you collect data from.
Are there patterns in randomness? Explain.
In the short run, no: randomness is characterized by the lack of predictability in the sequence of individual events, and any streaks or clusters that show up in the output of a truly random process are coincidental. In the long run, yes: over many repetitions the overall distribution of outcomes settles into a predictable pattern.
What does p represent in statistics?
p is short for the p-value, which represents the probability of obtaining results at least as extreme as the observed result, assuming the null hypothesis is true.
What is a population?
A population is the pool from which a sample is drawn to study and interpret.
What does the mu (µ) symbol mean in statistics?
The population mean
What is the symbol in statistics for the mean of a sample distribution?
X̄
What are the Four Pillars of Inference?
These are four types of conclusions to take away from data: significance, estimation, generalization, and causation.
What happens to the variation of your sampling distribution as you increase the number of trials from which you are collecting the statistic of interest when you are modeling the null hypothesis for some situation using TinkerPlots?
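The variation of the sampling distribution decreases when each statistic is computed from a larger sample. Simply running more trials (collecting more simulated statistics) makes the distribution smoother and more stable, but its spread stays roughly the same.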
A student says, “I wanted to see if the results were due to random chance.” How might a student go about doing this? What is the reasoning behind that approach— why does it work?
A good way to do this would be a Null Hypothesis Test through a Monte Carlo simulation. The first step would be to construct a sampler that corresponds to the probability model suggested in the null hypothesis. Note that the null hypothesis assumes a no-effect model.
Next, you would want to run your sampler at least 500 times and graph the resulting distribution. From this graph, you can get an idea of how likely various results are to be obtained from random chance.
Finally, we compare our observed result to the simulated distribution. If the p-value is less than our alpha value, then we reject the null hypothesis and say it’s unlikely that this result was due to random chance. Thank you for attending my TED talk.
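The three steps above can be sketched in Python instead of TinkerPlots. Everything specific here is an assumption made for illustration: the scenario (16 heads observed in 20 coin flips), the no-effect model (a fair coin), the 1000 trials, and alpha = 0.05.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

N_FLIPS = 20         # size of each simulated sample
OBSERVED_HEADS = 16  # hypothetical observed result
N_TRIALS = 1000      # number of simulated samples (at least 500)
ALPHA = 0.05         # assumed significance level

def simulate_heads(n_flips):
    """Step 1: the sampler for the no-effect model, here a fair coin."""
    return sum(random.random() < 0.5 for _ in range(n_flips))

# Step 2: run the sampler many times to see what chance alone produces.
simulated = [simulate_heads(N_FLIPS) for _ in range(N_TRIALS)]

# Step 3: one-sided p-value = share of simulated results that are
# at least as extreme as the observed result.
p_value = sum(h >= OBSERVED_HEADS for h in simulated) / N_TRIALS
reject_null = p_value < ALPHA

print(f"p-value is about {p_value:.3f}; reject null: {reject_null}")
```

This works because the simulated distribution shows what results look like when only chance is operating. If the observed result sits far out in the tail of that distribution, chance alone is a poor explanation for it.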
What is a p-value?
A p-value is the probability of having a result be at least as extreme as the observed result if the null hypothesis is true.
What is a two-tailed hypothesis test?
A hypothesis test where you check if the observed result is significantly higher or lower than the simulated results (both sides)
How can you establish statistical significance?
To establish statistical significance, a common approach is hypothesis testing. It involves stating a null hypothesis (no effect or difference) and an alternative hypothesis (an effect or difference exists), then using a test statistic (such as a t-value) to find the p-value: the probability of data at least as extreme as the sample, given the null hypothesis. The p-value is compared to the significance level (often 0.05) to decide whether to reject the null hypothesis. If the p-value is smaller, the result is considered statistically significant and the null hypothesis is rejected; otherwise the result is not statistically significant and the null hypothesis is not rejected.
What is the term for the standard deviation of a sampling distribution?
The standard deviation of a sampling distribution is usually called the standard error and written SE.
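As a sketch, here is one common way to estimate the standard error when the statistic is a sample mean: SE = s / sqrt(n), where s is the sample standard deviation and n is the sample size. The choice of statistic and the data are assumptions for illustration. Other statistics have different standard error formulas.

```python
import statistics
from math import sqrt

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

s = statistics.stdev(sample)     # sample standard deviation (divides by n - 1)
se_mean = s / sqrt(len(sample))  # estimated standard error of the sample mean
```

The sqrt(n) in the denominator is why means of larger samples vary less from sample to sample.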
How do you determine an appropriate alpha level?
Alpha = 1 - confidence level. For example, if you want a 90% confidence level, alpha = 1 - 0.9 = 0.1, or 10%. In a one-tailed test the whole alpha sits in one tail; in a two-tailed test alpha is split, with alpha/2 (here 0.05) in each tail.
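The arithmetic on this card, as a short sketch. The function name alpha_from_confidence is made up, and rounding is only there to hide floating-point noise.

```python
def alpha_from_confidence(confidence_level):
    """Significance level alpha = 1 - confidence level."""
    return round(1 - confidence_level, 10)  # round away floating-point noise

alpha = alpha_from_confidence(0.90)  # 0.1 for a 90% confidence level
per_tail = alpha / 2                 # share of alpha in each tail of a two-tailed test
```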
Does population size impact sampling variability?
Generally, no. Sampling variability depends mainly on the sample size, not the population size, as long as the population is much larger than the sample. Larger samples are more likely to produce means that are closer to the actual mean of the whole population.
What is a sample?
A sample is a set of data that is taken from a population
What is the difference between a simulation and a theoretical approach to statistical hypothesis testing?
Simulation and theoretical are two methods of hypothesis testing. Simulation builds the distribution expected under the null hypothesis by repeatedly generating random data, while the theoretical approach uses mathematical formulas to calculate test statistics and p-values. Simulation is useful when the data-generating process is complex, while the theoretical approach is suitable when the underlying assumptions of the model are met.
What is a Hypothesis Test?
A procedure for testing a claim against data. One compares the observed result to what the null hypothesis predicts, and then makes inferences about whether the data are consistent with the null hypothesis.
What are one-sided and two-sided hypothesis tests?
One-sided - A hypothesis test where you check for statistical significance in one direction
Two-sided - A hypothesis test where you check for statistical significance in both directions
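A sketch of how the two kinds of test differ when the p-value is read off a simulated distribution. The simulated results, the observed value of 14, and the center of 10 are all made up. Measuring "both directions" as distance from the center is one common convention, not the only one.

```python
def one_sided_p(simulated, observed):
    """Share of simulated results at least as high as the observed one."""
    return sum(x >= observed for x in simulated) / len(simulated)

def two_sided_p(simulated, observed, center):
    """Share of simulated results at least as far from the center, either way."""
    distance = abs(observed - center)
    return sum(abs(x - center) >= distance for x in simulated) / len(simulated)

simulated = [6, 8, 9, 10, 10, 10, 10, 11, 12, 14]  # hypothetical simulated results
observed = 14                                      # hypothetical observed result

p_one = one_sided_p(simulated, observed)      # counts only results of 14 or more
p_two = two_sided_p(simulated, observed, 10)  # also counts results of 6 or less
```

With these numbers the two-sided p-value (0.2) is double the one-sided p-value (0.1), because the extreme result in the lower tail is counted as well.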