Test 2 Flashcards
What is probability?
Likelihood that one particular outcome will (or will not) occur relative to the other possible outcomes.
p=1 means?
Absolute certainty (100%)
p=0 means?
Complete impossibility (0%)
p>0 means?
Reflects a possible outcome: unlikely/improbable not impossible.
What is the addition rule?
- The or rule
- Add the possibilities
- Sum of all outcomes: p=1
What is the multiplication rule?
- The and rule
- Multiply the possibilities
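A minimal sketch of both rules, assuming a fair six-sided die (the die and the variable names are illustrative, not from the cards). The addition rule is applied here to mutually exclusive outcomes and the multiplication rule to independent events.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so each face has probability 1/6.
p_face = Fraction(1, 6)

# Addition ("or") rule for mutually exclusive outcomes: rolling a 1 OR a 2.
p_one_or_two = p_face + p_face

# Multiplication ("and") rule for independent events: a 6 AND then another 6.
p_two_sixes = p_face * p_face

# The six faces cover every possible outcome, so their probabilities sum to 1.
p_any_face = sum([p_face] * 6)

print(p_one_or_two, p_two_sixes, p_any_face)  # 1/3 1/36 1
```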
What is the normal distribution?
- Mean = median = mode
- Symmetric/zero skew
- Mesokurtic
- Asymptotic tails
What are Z scores?
- Number of standard deviations that a particular score is away from the mean of its distribution.
Z = (X-Xbar)/SD
How do you calculate the raw score?
X = Xbar + (Z)(SD)
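The two formulas can be sketched as a pair of small functions. The function names and the example distribution (mean 100, SD 15) are assumptions chosen for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean: Z = (X - Xbar) / SD."""
    return (x - mean) / sd


def raw_score(z, mean, sd):
    """Invert the Z formula: X = Xbar + (Z)(SD)."""
    return mean + z * sd


# Hypothetical distribution: mean 100, SD 15.
z = z_score(130, 100, 15)
x = raw_score(z, 100, 15)
print(z, x)  # 2.0 130.0
```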
What does converting to Z scores allow?
Allows you to compare scores that come from different distributions.
How do you calculate what percentage/area is above a certain score?
- Calculate the Z score
- Find the proportion that matches the Z score in the Z table
- Subtract this value from 0.5 or 50% (if the score is below the mean, add it to 0.5 or 50% instead)
How do you calculate what percentage/area is below a certain score?
- Calculate the Z score
- Find the proportion that matches the Z score in the Z table
- Add this value to 0.5 or 50% (if the score is below the mean, subtract it from 0.5 or 50% instead)
How do you calculate what percentage/area is between two scores?
- Calculate the Z score of each
- Find the proportion that matches the Z scores in the Z table
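- Add both of these values (scores on opposite sides of the mean); if both scores are on the same side of the mean, subtract the smaller value from the larger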
How do you calculate what percentage/area is outside (above and below) two scores?
- Calculate the Z score of each
- Find the proportion that matches the Z scores in the Z table
- Add both of these values
- This will be the value between so then subtract it from 1 or 100% (evenly split on both sides of the curve)
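As a cross-check on the table method, here is a sketch using Python's standard-library `statistics.NormalDist`. Its `cdf` gives the cumulative area below a score, not the mean-to-Z area of a printed table, so no adding to or subtracting from 0.5 is needed. The distribution (mean 100, SD 15) and the scores are assumptions.

```python
from statistics import NormalDist

# Hypothetical distribution: mean 100, SD 15 (not from the cards).
dist = NormalDist(mu=100, sigma=15)

below_115 = dist.cdf(115)                # area below a score (Z = +1)
above_115 = 1 - dist.cdf(115)            # area above a score
between = dist.cdf(115) - dist.cdf(85)   # area between two scores (Z = -1 to +1)
outside = 1 - between                    # area outside the two scores

print(round(below_115, 4), round(above_115, 4))  # 0.8413 0.1587
print(round(between, 4), round(outside, 4))      # 0.6827 0.3173
```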
How do you calculate what score is within a certain percentage?
- Find the proportion in the Z table and its corresponding Z score
- Use the raw score formula
How do you calculate what score is within the middle 50%?
- Find the proportion in the Z table and its corresponding Z score
- Use the raw score formula twice (one for positive Z and one for negative)
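A sketch of the middle-50% case, with `NormalDist().inv_cdf` standing in for the Z table lookup. The mean of 100 and SD of 15 are assumed values.

```python
from statistics import NormalDist

mean, sd = 100, 15  # hypothetical distribution

# The middle 50% leaves 25% in each tail, so the upper cutoff has 75% below it.
z_upper = NormalDist().inv_cdf(0.75)
z_lower = -z_upper  # symmetric curve: same distance below the mean

# Raw score formula, used once for each Z: X = Xbar + (Z)(SD)
lower = mean + z_lower * sd
upper = mean + z_upper * sd

print(round(z_upper, 2), round(lower, 2), round(upper, 2))
```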
What is a population?
Entire group of interest.
What is a sample?
Subgroup being studied.
Why limit research to samples when you are ultimately interested in complete population?
- Population potentially massive
- Inefficient to study everyone
- Population changes over time
What is the challenge to limiting research to samples?
Main difficulty is that any sample will differ from the population due to random factors. (sampling error)
What do inferential stats do?
Account for chance.
What is sampling error?
Difference between a sample statistic and a population parameter due to random factors and/or sampling.
What is random sampling?
A technique where all units in population have equal and non-zero chance of being included in the sample:
- Equal probability of inclusion
- Selection of units independent
- Any/all combinations possible
What is sampling distribution of means?
- One way to estimate sampling error is by calculating this - inefficient and only modeled theoretically.
- It is calculated from multiple random samples drawn from same population:
Mean of means - population mean
Standard deviation - sampling error
What are the three simple rules that allow us to determine the basic characteristic of sampling distribution of the means?
- Distribution mean = population mean
- Standard deviation less than population
- Distribution approximately normal
What is distribution mean = population mean?
Mean of sampling distribution of the means calculated from means of an infinite number of smaller samples from the population.
- Distributed around population mean
- Reduces impact of extreme scores
What is standard deviation less than population?
Distribution is made up from the means of infinite samples -> Extreme scores become less likely.
- Averaging cancels extreme scores
- Results in more regular distribution
What is distribution approximately normal?
Regardless of distribution of original scores, the sampling distribution of the mean tends to be normally distributed.
- Extreme scores wash out
- Closer to normal as sample size increases
What is the central limit theorem?
If repeated random samples of size n are taken from a population with mean u and standard deviation o, then the sampling distribution of the mean will have:
- Mean equal to population mean (u)
- Standard error equal to o/sqrt(n) (oXbar = o/sqrt(n))
- A shape that approaches normal as n increases
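A small simulation sketch of the first two points of the theorem. Everything here is assumed for illustration: a uniform (non-normal) population on 0 to 1, samples of size 25, 2000 repeated samples, and a fixed seed so the run is repeatable.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(0)        # seeded so the sketch is repeatable
n, n_samples = 25, 2000       # assumed sample size and number of samples

# Draw repeated random samples and keep only each sample's mean.
sample_means = [
    mean([rng.random() for _ in range(n)])
    for _ in range(n_samples)
]

pop_mean = 0.5                 # mean of a uniform distribution on [0, 1]
pop_sd = math.sqrt(1 / 12)     # its standard deviation

mean_of_means = mean(sample_means)        # expected to sit near pop_mean
sd_of_means = stdev(sample_means)         # expected to sit near o / sqrt(n)
predicted_se = pop_sd / math.sqrt(n)

print(mean_of_means, sd_of_means, predicted_se)
```

A histogram of `sample_means` would also look roughly bell-shaped even though the population is flat, which is the third point.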
What is the formula for the standard error of a population?
oXbar = o / sqrt(n)
What is the formula for the standard error of a sample?
sXbar = s / sqrt(n)
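Both standard error formulas have the same shape, so one hypothetical helper covers them. The SD of 15 and the sample sizes are assumed values.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n.

    Pass the population SD (o) or the sample SD (s), whichever is available.
    """
    return sd / math.sqrt(n)


# Quadrupling the sample size halves the standard error.
se_25 = standard_error(15, 25)
se_100 = standard_error(15, 100)
print(se_25, se_100)  # 3.0 1.5
```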
What does standard deviation do?
Standard deviation estimates the typical distance of a score from the sample/population mean.
What does the standard error do?
Standard error estimates the typical distance of a sample mean from the population mean.
What are confidence intervals?
- Confidence intervals estimate the range of possible means that is likely to include the population mean.
- Usually intervals of 95% or 99% confidence
How do we calculate confidence intervals when population standard deviation(o) is known?
- Calculate standard error
- Use Z score of +- 1.96 for 95% and +- 2.58 for 99%
- CI = Xbar +- (z)(oXbar)
- Write answer as # - #
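The steps above can be sketched as follows. The function name and the numbers (sample mean 100, o = 15, n = 25) are assumptions.

```python
import math


def z_confidence_interval(sample_mean, pop_sd, n, z=1.96):
    """CI = Xbar +- (z)(oXbar); z = 1.96 for 95%, 2.58 for 99%."""
    se = pop_sd / math.sqrt(n)   # standard error
    margin = z * se
    return sample_mean - margin, sample_mean + margin


low, high = z_confidence_interval(100, 15, 25)             # 95%
low99, high99 = z_confidence_interval(100, 15, 25, 2.58)   # 99%
print(round(low, 2), round(high, 2))      # 94.12 105.88
print(round(low99, 2), round(high99, 2))  # 92.26 107.74
```

The 99% interval is wider than the 95% interval: more confidence costs precision.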
How do we calculate confidence intervals when only the sample standard deviation (s) is known?
- Calculate standard error
- Calculate n-1 for degrees of freedom (df)
- Use level of significance of 0.05 for 95% and 0.01 for 99%
- Use df and level of significance to find t statistic in t distribution table
- CI = Xbar +- (t)(sXbar)
- Write answer as # - #
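A sketch of the same steps in code. The standard library has no t distribution, so the t statistic is passed in after being looked up in a t table. The value 2.064 is the usual two-tailed table entry for df = 24 at the 0.05 level, and the other numbers are assumptions.

```python
import math


def t_confidence_interval(sample_mean, sample_sd, n, t_critical):
    """CI = Xbar +- (t)(sXbar), with t taken from a t table for df = n - 1."""
    se = sample_sd / math.sqrt(n)   # standard error from the sample SD
    margin = t_critical * se
    return sample_mean - margin, sample_mean + margin


n = 25
df = n - 1   # 24 degrees of freedom
low, high = t_confidence_interval(100, 15, n, t_critical=2.064)
print(df, round(low, 3), round(high, 3))  # 24 93.808 106.192
```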
What is the interpretation of the confidence intervals?
Confidence interval gives a level of confidence (95% or 99%) that the population mean falls within the interval - NOT certainty that sample mean equals population mean.
- Not exact estimate
- Might be incorrect
- Probably correct
What is the formula for the standard error of the proportion (sp)?
sp = sqrt of ( (P(1-P)) / n )
How do we calculate confidence intervals with the standard error of proportion?
- Calculate standard error
- Use Z score of +- 1.96 for 95% and +- 2.58 for 99%
- CI = P +- (z)(sp)
- Write answer as # - #
What is the margin of error?
margin of error = (z)(sp)
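A short sketch tying together the standard error of the proportion, the margin of error, and the interval. The proportion of 0.5 and the sample size of 100 are assumed values.

```python
import math


def proportion_standard_error(p, n):
    """sp = sqrt( P(1 - P) / n )"""
    return math.sqrt(p * (1 - p) / n)


p, n = 0.5, 100                      # hypothetical sample proportion and size
sp = proportion_standard_error(p, n)
margin_of_error = 1.96 * sp          # (z)(sp) at 95% confidence
ci = (p - margin_of_error, p + margin_of_error)

print(round(sp, 3), round(margin_of_error, 3))  # 0.05 0.098
print(round(ci[0], 3), round(ci[1], 3))         # 0.402 0.598
```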
What are descriptive statistics?
Present, organize and summarize larger sets of numbers using fewer numbers.
- Central tendency and variability
What are inferential statistics?
Analyses that compare groups or draw conclusions about populations from samples.
- Z and t test
What is hypothesis testing?
- Method of statistical inference comparing sampled data to other sampled data, theoretically modeled data, or population parameters.
- Method proposes alternate hypotheses describing predicted relationships between variables of study.
What is the null hypothesis (H0)?
- Default position is that there is no relationship between variables / association among groups.
- Assume groups are equivalent
- There is no difference between …
What is the alternate hypothesis (H1)?
- Describes the predicted relationship between variables that you are currently investigating.
- Assumes that some difference exists
- There is a difference between …
What is the level of significance (a)?
- Researchers determine what represents a real difference between means in advance.
- Typically a = 0.05 or 0.01 (convention)
- Translates to there being less than 5% or 1% probability of the result occurring by chance.
What is the critical value?
Value that the calculated test statistic (result) must meet or exceed to consider the difference real.
- Depends on level of significance
- Depends on the distribution used (Z or t)
What is the formula for the Z test of the test statistic?
Z = (Xbar - u) / oXbar
What critical values are used for the Z test?
For a = 0.05, it is +- 1.96
For a = 0.01, it is +- 2.58
What do we do when test statistic (Z or t) exceeds critical value?
Test statistic (Z or t) meets or exceeds critical value: reject null hypothesis. Conclude that difference is probably real at (0.05 or 0.01) significance.
What do we do when test statistic (Z or t) does not exceed critical value?
Test statistic (Z or t) does not exceed critical value: retain null hypothesis. Conclude that there is not enough evidence of a difference at (0.05 or 0.01) significance.
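The Z test and the decision rule can be sketched together. The function name and the numbers (sample mean 106, u = 100, o = 15, n = 25) are assumptions; the comparison uses the absolute value of Z because the critical values here are two-tailed.

```python
import math


def z_test(sample_mean, pop_mean, pop_sd, n, critical=1.96):
    """Return (Z, reject_null) for a two-tailed single sample Z test."""
    se = pop_sd / math.sqrt(n)              # oXbar = o / sqrt(n)
    z = (sample_mean - pop_mean) / se       # Z = (Xbar - u) / oXbar
    return z, abs(z) >= critical            # meet or exceed: reject H0


z, reject_05 = z_test(106, 100, 15, 25)                  # a = 0.05
_, reject_01 = z_test(106, 100, 15, 25, critical=2.58)   # a = 0.01
print(z, reject_05, reject_01)  # 2.0 True False
```

The same result is significant at 0.05 but not at 0.01, which is why the level of significance is fixed in advance.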
What is the formula for the t test of the test statistic?
t = (Xbar - u) / sXbar
What critical values are used for the t test?
Use df (n-1) and a = 0.05 or a = 0.01 to find the critical value in the t table.
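A sketch of the single sample t test. The scores and the population mean of 5 are made up; 2.776 is the usual two-tailed t table entry for df = 4 at a = 0.05, entered by hand because the standard library has no t distribution.

```python
import math
from statistics import mean, stdev


def one_sample_t(scores, pop_mean):
    """Return (t, df) where t = (Xbar - u) / sXbar and df = n - 1."""
    n = len(scores)
    se = stdev(scores) / math.sqrt(n)   # sXbar = s / sqrt(n); stdev uses n - 1
    t = (mean(scores) - pop_mean) / se
    return t, n - 1


scores = [4, 6, 8, 10, 12]            # hypothetical sample, mean 8
t, df = one_sample_t(scores, pop_mean=5)

t_critical = 2.776                    # from a t table: df = 4, a = 0.05, two-tailed
reject = abs(t) >= t_critical
print(round(t, 4), df, reject)        # 2.1213 4 False
```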
What are directional hypotheses?
- All our previous analyses comparing means test to see if means differ in either direction - two-tailed test
- We can also perform analyses to specifically test if one mean is higher or lower - one-tailed test
Hypothesis not only predicts a difference, but also predicts the specific direction of the difference
What is a two-tailed test?
- Predicts the results will differ - but does not suggest the direction
- Equal sensitivity at both ends
What is a one-tailed test?
- Predicts the results will differ - that they will be higher/lower
- Greater sensitivity at one end
- Test becomes more likely to discover a difference
What critical values does a two-tailed Z test use?
Z 0.05 = +- 1.96
Z 0.01 = +- 2.58
What critical values does a one-tailed Z test use?
Z 0.05 = 1.65
Z 0.01 = 2.33
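These table values can be reproduced with the standard library. `z_critical` is a hypothetical helper: a two-tailed test splits a across both tails, while a one-tailed test puts all of a in one tail. The exact one-tailed value at 0.05 is about 1.645, which printed tables often show as 1.65.

```python
from statistics import NormalDist


def z_critical(alpha, tails=2):
    """Critical Z leaving alpha / tails of the area in the upper tail."""
    return NormalDist().inv_cdf(1 - alpha / tails)


two_05 = z_critical(0.05, tails=2)   # about 1.96
two_01 = z_critical(0.01, tails=2)   # about 2.58
one_05 = z_critical(0.05, tails=1)   # about 1.645
one_01 = z_critical(0.01, tails=1)   # about 2.33

print(round(two_05, 3), round(two_01, 3), round(one_05, 3), round(one_01, 3))
```

The one-tailed cutoffs are smaller, which is the "greater sensitivity at one end": a result in the predicted direction needs a less extreme Z to be significant.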
How to calculate t test with two/one-tailed?
They have separate tables.
What are decision errors?
- Situations where right procedure leads to wrong conclusions regarding rejecting or retaining null hypothesis.
- Drawing conclusions about population using information drawn from samples involves risk.
What is a Type I Error (a)?
Rejecting null hypothesis when it is actually true - significant result when no difference exists (false positive).
- Probability: a = p(0.05 or 0.01)
- Smaller a (significance level) reduces risk
What is a Type II Error (B)?
Retaining the null hypothesis when alternate hypothesis is true - non-significant result when difference exists.
- Probability: B(multiple factors)
- Smaller a (significance level) increases risk
What does reducing the significance level (a) do to each error?
Reducing the significance level (a) decreases risk of Type I Error while increasing risk of Type II Error
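A simulation sketch of the Type I side of this trade-off, under assumed values throughout: a normal population with mean 100 and SD 15, samples of 25, 2000 trials, and a fixed seed. The null hypothesis is true by construction, so every rejection is a false positive.

```python
import math
import random
from statistics import mean

rng = random.Random(1)                       # seeded so the sketch is repeatable
pop_mean, pop_sd, n, trials = 100, 15, 25, 2000
se = pop_sd / math.sqrt(n)

# Every sample really comes from the population, so H0 is true in each trial.
z_values = []
for _ in range(trials):
    sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
    z_values.append((mean(sample) - pop_mean) / se)

# Share of trials that wrongly reject H0 at each two-tailed critical value.
rate_05 = sum(abs(z) >= 1.96 for z in z_values) / trials
rate_01 = sum(abs(z) >= 2.58 for z in z_values) / trials

print(rate_05, rate_01)
```

The two rates should land near 0.05 and 0.01: the stricter cutoff produces fewer false positives, at the cost of missing more real differences.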
How do you control Type I Error?
Type I Error is decreased by arbitrarily lowering the significance level (a).
- Increases Type II Error
- Violation of convention
How do you control Type II Error?
Arbitrarily increasing the significance level (a) is not feasible - other steps are used to decrease Type II Error
- Increase sample size
- Increase treatment effects
- Decrease experimental error
What are single sample tests?
- These tests allow us to determine if a sample is likely to have been drawn from a population.
- Calculate test statistic from difference of means: compare it to the critical value (Z/t)
What are two sample tests?
- Used to examine whether two sample means differ from each other, rather than whether a sample mean differs from the population mean.
- Related and Independent samples t tests
What are related samples t tests?
- Compare means of related samples: samples that were not chosen independently from each other (scores dependent on each other)
- Usually same group of people that are tested twice (before/after, two halves of a pair…)
- Both samples always of equal sizes
How do we calculate the degrees of freedom for the related samples t tests?
n-1
What are the formulas for the related samples t tests (never used in class before)?
- Difference of scores (d): simple difference between pairs of scores
- Mean of the differences: Dbar = sum of d / n
- Standard deviation of the differences: sd = sqrt of (sum of (d-Dbar)^2 / (n-1) )
- Standard error of the mean difference: sDbar = sd / sqrt(n)
- t test: t = Dbar / sDbar
What are the interpretations to the related samples t tests?
- Reject H0 if |t| >= critical value
- Retain H0 if |t| < critical value
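A sketch of the related samples formulas, step by step. The before/after scores are made up; 2.776 is the usual two-tailed t table entry for df = 4 at a = 0.05.

```python
import math
from statistics import mean, stdev


def related_samples_t(first, second):
    """Return (t, df) for paired scores: t = Dbar / sDbar, df = n - 1."""
    d = [b - a for a, b in zip(first, second)]   # difference for each pair
    n = len(d)
    d_bar = mean(d)                  # mean of the differences
    s_d = stdev(d)                   # SD of the differences (n - 1 denominator)
    se = s_d / math.sqrt(n)          # standard error of the mean difference
    return d_bar / se, n - 1


before = [10, 12, 9, 11, 13]         # hypothetical scores, same five people
after = [12, 15, 10, 14, 14]
t, df = related_samples_t(before, after)

reject = abs(t) >= 2.776             # t table: df = 4, a = 0.05, two-tailed
print(round(t, 4), df, reject)       # 4.4721 4 True
```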
What are independent samples t tests?
- Compare means of independent samples: samples that were chosen independently from each other.
- Any comparison between samples where selection of each sample occurs independently of the other.
- Samples of random composition and not matched in any way
- Can be same or different sizes
How do we calculate the degrees of freedom for the independent samples t tests?
n1 + n2 - 2
What are the formulas for the independent samples t tests (never used in class before)?
- Means and standard deviations: calculate mean and SD of each sample separately
- Estimate population variance: s^2 = [ s1^2 (n1-1) + s2^2 (n2-1) ] / (n1 + n2 -2)
- Standard error of the difference of means: s(Xbar1-Xbar2) = sqrt of [ s^2 ( (n1+n2) / (n1n2) ) ]
- t test: t = (Xbar1 - Xbar2) / s(Xbar1-Xbar2)
What are the interpretations to the independent samples t tests?
- Reject H0 if |t| >= critical value
- Retain H0 if |t| < critical value
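A sketch of the independent samples formulas using the pooled variance estimate. The two groups are made up; 2.306 is the usual two-tailed t table entry for df = 8 at a = 0.05.

```python
import math
from statistics import mean, variance


def independent_samples_t(sample1, sample2):
    """Return (t, df) using the pooled variance estimate; df = n1 + n2 - 2."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Estimated population variance: each sample variance weighted by its n - 1.
    pooled = (variance(sample1) * (n1 - 1) + variance(sample2) * (n2 - 1)) / df
    # Standard error of the difference of means.
    se = math.sqrt(pooled * (n1 + n2) / (n1 * n2))
    t = (mean(sample1) - mean(sample2)) / se
    return t, df


group1 = [4, 6, 8, 10, 12]           # hypothetical scores, mean 8
group2 = [1, 3, 5, 7, 9]             # hypothetical scores, mean 5
t, df = independent_samples_t(group1, group2)

reject = abs(t) >= 2.306             # t table: df = 8, a = 0.05, two-tailed
print(t, df, reject)                 # 1.5 8 False
```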