STATS FINALS Flashcards
Determines whether there is a statistically significant difference between the means of two unrelated groups
Ex: comparing cancer patients and pregnant women in a population
Independent T-TEST
The mean of a single group is compared with the given mean.
Ex: Determining whether sales have increased or decreased relative to a given average sales figure
One sample T-TEST
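A minimal sketch of how the one-sample t statistic on this card could be computed with Python's standard library. The function name `one_sample_t` and the sales figures are made up for illustration; the p-value would still be read from a t table using the returned degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, given_mean):
    """Return (t, df) for a one-sample t-test against a given mean."""
    n = len(sample)
    sample_mean = statistics.fmean(sample)
    sample_sd = statistics.stdev(sample)  # sample SD (n - 1 denominator)
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - given_mean) / standard_error, n - 1

# Hypothetical monthly sales compared with a given average of 100 units
sales = [102, 98, 110, 105, 95, 108, 101, 99]
t_stat, df = one_sample_t(sales, 100)
print(f"t = {t_stat:.3f}, df = {df}")
```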
Analyzes the difference between the means of more than two groups.
____ determines how one factor impacts a response variable, whereas
____ compares samples across two factors. ANOVA determines the impact of one or more factors by comparing the means of different samples.
One way ANOVA
Two way ANOVA
Provides regression analysis and analysis of variance for multiple dependent variables by one or more factor variables or covariates. It examines the statistical difference on two or more continuous dependent variables across an independent grouping variable.
MANOVA
Statistical test that determines whether two population means are different, provided the variances are known and the sample size is large
Z test
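A short sketch of the two-sample z statistic described on this card, assuming the population variances are known. The function name `two_sample_z` and all the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Return (z, two-sided p) for two means with known population variances."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical summary data: two large samples with known variances
z, p = two_sample_z(mean1=12.0, mean2=10.0, var1=4.0, var2=4.0, n1=50, n2=50)
print(f"z = {z:.2f}, p = {p:.4f}")
```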
It tests the difference between two related measurements taken from the same subjects (pre- and post-test)
Ex: measuring the performance scores of trainees before and after completing a training program
Paired T TEST
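A minimal sketch of the paired t statistic, which is a one-sample t-test on the difference scores. The function name `paired_t` and the trainee scores are invented for illustration.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a paired t-test on after-minus-before differences."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1

# Hypothetical trainee scores before and after a training program
before = [60, 72, 58, 65, 70, 68]
after = [66, 75, 63, 64, 78, 74]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```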
ASSUMPTIONS OF T-TEST FOR ____
- The dependent variable must be continuous (interval or ratio)
- The observations are independent of one another
- The dependent variable should be approximately normally distributed
- The dependent variable should not contain any outliers (a population mean must be available for comparison)
T TEST for single sample
ASSUMPTIONS OF T TEST FOR ___
- Dependent variable is continuous
- Independent variable must consist of two categorical, related groups
- Observations are independent
- Normal distribution of the difference scores
- No significant outliers in the difference scores
T TEST FOR dependent samples
ASSUMPTIONS OF ____
- The dependent variable should be measured at the ordinal or continuous level
- The independent variable should consist of two categorical “related groups” or “matched pairs”
- The distribution of the difference scores is not normal
Wilcoxon signed rank test
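A rough sketch of how the Wilcoxon signed-rank sums could be computed by hand: rank the absolute nonzero differences, then add up the ranks of the positive and negative differences separately. The helper names `midrank` and `wilcoxon_signed_rank` and the scores are hypothetical, and dropping zero differences is one common convention rather than the only one.

```python
def midrank(value, pool):
    """Rank of value within pool; tied values share the average rank."""
    less = sum(v < value for v in pool)
    equal = sum(v == value for v in pool)
    return less + (equal + 1) / 2

def wilcoxon_signed_rank(before, after):
    """Return (W+, W-): rank sums of positive and negative differences."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop zeros
    abs_diffs = [abs(d) for d in diffs]
    ranks = [midrank(a, abs_diffs) for a in abs_diffs]
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

# Hypothetical paired scores
before = [10, 12, 9, 14, 11]
after = [12, 11, 13, 14, 16]
w_plus, w_minus = wilcoxon_signed_rank(before, after)
print(w_plus, w_minus)
```

The smaller of the two sums is usually the value compared against a critical-value table.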
ASSUMPTIONS OF T TEST FOR ___
- Dependent variable is continuous
- Independent variable should have two categorical, independent groups
- Observations are independent
- Normal distribution within each of the two groups
- No outliers in either group
- There has to be homogeneity of variance (if violated, use ____)
Independent samples
Welch’s t-test
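A sketch contrasting the pooled-variance (Student) t statistic with Welch's version, which does not assume equal variances and uses the Welch-Satterthwaite degrees of freedom. The function names and the data are made up for illustration.

```python
import math
import statistics

def pooled_t(x, y):
    """Student's t for two independent samples, assuming equal variances."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.fmean(x) - statistics.fmean(y)) / se, n1 + n2 - 2

def welch_t(x, y):
    """Welch's t for two independent samples with unequal variances."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1
    b = statistics.variance(y) / n2
    t = (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Hypothetical scores for two independent groups with different spreads
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(pooled_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

With equal group sizes the two t values coincide; only the degrees of freedom differ.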
ASSUMPTIONS OF ____
- The dependent variable should be measured at the ordinal or continuous level
- The independent variable should consist of two categorical, independent groups
- There must be independence of observations
- The distribution is not normal
Mann Whitney U test
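A small sketch of the Mann-Whitney U statistic computed by direct pair counting: each pair where a value from the first group beats one from the second counts 1, and ties count 0.5. The function name `mann_whitney_u` and the data are hypothetical.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples."""
    u_x = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in x
        for b in y
    )
    u_y = len(x) * len(y) - u_x  # the two U values always sum to n1 * n2
    return u_x, u_y

# Hypothetical weight loss (kg) under two diets
diet_a = [3.1, 4.5, 2.8, 5.0]
diet_b = [1.9, 2.5, 3.0, 2.2]
u_a, u_b = mann_whitney_u(diet_a, diet_b)
print(u_a, u_b)
```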
ASSUMPTIONS OF ____
- Dependent variable is continuous
- Independent variable consists of two or more categorical, independent groups
- Observations are independent
- No significant outliers
- Normal distribution is observed among all groups, or the residuals of the dependent variable are normally distributed
- There is homogeneity of variance
One way ANOVA
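A minimal sketch of the one-way ANOVA F ratio, built from the between-group and within-group sums of squares. The function name `one_way_anova` and the group values are illustrative only; the p-value would come from an F table with the returned degrees of freedom.

```python
import statistics

def one_way_anova(*groups):
    """Return (F, df_between, df_within) for two or more independent groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = statistics.fmean([v for g in groups for v in g])
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = sum(
        (v - statistics.fmean(g)) ** 2 for g in groups for v in g
    )
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical scores for three groups
f_ratio, df_b, df_w = one_way_anova([1, 2, 3], [2, 3, 4], [3, 4, 5])
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```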
Additional hypothesis tests that are done after an ANOVA to determine exactly which mean differences are significant and which are not
Post hoc comparison
Done after rejecting the null hypothesis when there are more than two treatments
Post hoc comparison
Most used for groups with equal sample sizes and homogeneity of variances
Tukey’s honestly significant difference (HSD) test
An option if n is unequal
Tukey-Kramer test
It is usually applied when sample sizes between two groups are unequal but there is homogeneity of variance
Scheffé test
A nonparametric approach to comparing combinations of groups or treatments that does not assume equal variances or equal sample sizes
Games-Howell test
Used if the nonparametric counterpart of ANOVA is used
Bonferroni procedure
If the homogeneity of variance is violated, use ____ and ____ as post hoc comparison if needed
Welch’s ANOVA
Games-Howell test
If the normality assumption is violated or there are significant outliers, use ____ and ____ as post hoc comparison if needed
Kruskal-Wallis H test
Dunn’s test or DSCF pairwise comparison
(One way anova on ranks)
A nonparametric test used to determine if there are statistically significant differences between two or more groups of an independent variable on a continuous or ordinal dependent variable
Kruskal-Wallis H test
ASSUMPTIONS OF ___
- The dependent variable should be measured at the ordinal or continuous level
- The independent variable should consist of two or more categorical, independent groups
- You should have independence of observations
- The data are not normally distributed
Kruskal-Wallis H test
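A sketch of the Kruskal-Wallis H statistic: pool all observations, rank them, and compare the rank sums of the groups. The helper names `midrank` and `kruskal_wallis_h` and the ratings are hypothetical, and this version leaves out the correction for ties that statistical software usually applies.

```python
def midrank(value, pool):
    """Rank of value within pool; tied values share the average rank."""
    less = sum(v < value for v in pool)
    equal = sum(v == value for v in pool)
    return less + (equal + 1) / 2

def kruskal_wallis_h(*groups):
    """Return (H, df) without the tie correction."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    rank_term = sum(
        sum(midrank(v, pooled) for v in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    return h, len(groups) - 1

# Hypothetical ordinal ratings for three independent groups
h_stat, df = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(f"H = {h_stat:.2f}, df = {df}")
```

H is compared against a chi-square distribution with k - 1 degrees of freedom.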
There is more than one dependent variable
MANOVA
There is more than one dependent variable and at least one covariate to control for
MANCOVA
ASSUMPTIONS OF ____
- Both variables are continuous
- Variables should be paired
- Observations are independent
- Relationship must be linear
- Bivariate normal distribution is present
- No univariate or multivariate outliers
- Homoscedasticity is present
Pearson’s r
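A minimal sketch of Pearson's r computed from deviations about the means. The function name `pearson_r` and the exercise and HDL values are invented for illustration.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient for paired continuous data."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical weekly exercise hours and HDL cholesterol levels
hours = [1, 2, 3, 4, 5, 6]
hdl = [42, 45, 47, 52, 55, 58]
print(f"r = {pearson_r(hours, hdl):.3f}")
```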
Non parametric equivalent of a correlation coefficient for rank ordered scores
Used to measure the relationship between X and Y when both variables are measured on ordinal scales or when the continuous variables do not meet the assumptions of Pearson’s r
Spearman’s RHO
ASSUMPTIONS OF ___
- The variables should be at least ordinal
- Variables should be paired
- There is a monotonic relationship
Spearman’s RHO
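A sketch showing that Spearman's rho is Pearson's r applied to the ranks, which is why it only needs a monotonic relationship. The helper names and the data are hypothetical.

```python
import math
import statistics

def midrank(value, pool):
    """Rank of value within pool; tied values share the average rank."""
    less = sum(v < value for v in pool)
    equal = sum(v == value for v in pool)
    return less + (equal + 1) / 2

def pearson_r(x, y):
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    ranks_x = [midrank(v, x) for v in x]
    ranks_y = [midrank(v, y) for v in y]
    return pearson_r(ranks_x, ranks_y)

# Hypothetical data: y rises with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(spearman_rho(x, y), pearson_r(x, y))
```

Here rho reaches 1 because the ordering is perfectly preserved, while Pearson's r stays below 1 because the relationship is curved.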
A nonparametric measure of the strength and direction of the association that exists between two variables measured on at least an ordinal scale
Kendall’s tau-B
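A rough sketch of Kendall's tau-b computed by counting concordant and discordant pairs, with the tie adjustment in the denominator. The function name `kendall_tau_b` and the ratings are made up for illustration.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b for paired ordinal data, adjusting for ties."""
    concordant = discordant = untied_x = untied_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx != 0:
            untied_x += 1  # pair is not tied on x
        if dy != 0:
            untied_y += 1  # pair is not tied on y
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    return (concordant - discordant) / math.sqrt(untied_x * untied_y)

# Hypothetical satisfaction and recommendation ratings (1 to 10)
satisfaction = [7, 8, 5, 9, 6, 8]
recommend = [6, 9, 4, 9, 7, 8]
print(f"tau-b = {kendall_tau_b(satisfaction, recommend):.3f}")
```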
A psychologist is interested in the effects of a new cognitive behavioral therapy program on reducing anxiety levels. Participants undergo a standardized anxiety test before and after completing an 8-week CBT course. The psychologist wants to determine if there is a statistically significant change in anxiety levels after therapy.
What type of test should be used?
T test for dependent samples
A basketball coach wants to determine if a new training program has effectively improved the free throw performance of his team. He records the number of successful free throws out of 20 attempts for each of his 15 players before and after the training program
What type of test should be used?
Wilcoxon signed rank test
A researcher is studying the impact of two different teaching methods on student performance. One group of students is taught using a traditional lecture based approach while another group is taught using a more interactive, problem solving approach. The researcher wants to compare the final exam scores of the two groups to determine if there is a significant difference in performance
What type of test should be used?
T test for independent samples
A nutritionist is conducting a study to compare the effectiveness of two different diets, diet A and diet B, on weight loss. The nutritionist has two independent groups of participants, each following one of the diets for a period of 3 months. The weight loss results are not normally distributed and the sample sizes are small.
What type of test should be used?
Mann Whitney U test
A sports scientist wants to determine if there is a difference in the average recovery time after injury between three different rehabilitation programs. The programs are designed for athletes who have suffered knee injuries. Each program has a different combination of physical therapy, diet and rest
H0: There is no difference in the average recovery time across the programs
HA: At least one program leads to a different average recovery time
One way ANOVA
A botanist is conducting research to understand the effects of sunlight exposure and watering frequency on plant growth. She plants 40 seeds and allows them to grow for two months under various conditions of sunlight exposure and watering frequency. After two months, she measures the height of each plant to determine the impact of these two factors.
Two independent variables (sunlight exposure and watering frequency) that could affect the dependent variable (plant growth)
Two way ANOVA
A food scientist is testing the shelf life of a new energy bar under three different packaging materials: plastic, aluminum, and biodegradable material. The scientist conducts an experiment in which the energy bars are stored in each type of packaging and then measures the time until spoilage.
After conducting ANOVA and finding a significant difference, the scientist would use this test to identify which pairs of packaging materials have significantly different effects on the shelf life of the energy bars.
Post hoc comparison
A food company wants to test the taste preference for a new beverage flavor among different age groups. They have four age groups: children, teenagers, adults and seniors. Each group is given the new beverage and asked to rate its taste on a scale of 1 to 10.
After conducting ANOVA and finding a significant difference, the food company would perform this test to identify which pairs of groups have significantly different taste ratings. This test controls for Type I error.
Tukey’s honestly significant difference (HSD) test
A pharmaceutical company has developed three different formulations of a new allergy medication. To determine which formulation is most effective, they conduct a clinical trial with patients who have seasonal allergies. Due to various constraints, the number of participants in each group ends up being unequal.
It is used when the group sample sizes are unequal; it adjusts the comparisons to account for this difference
Tukey-Kramer test
An agricultural scientist is comparing the yield of four different strains of wheat to determine which one produces the highest yield. The scientist plants each strain in a separate field under the same environmental conditions and measures the yield after the harvest season.
The agricultural scientist would perform this test to identify which pairs of wheat strains have significantly different yields.
This test is particularly useful when the researcher wants to make all possible simple and complex comparisons (contrasts) among the means
Scheffé test
A researcher is investigating the impact of different sleep patterns on cognitive performance. Participants are divided into four groups based on their sleeping habits: less than 6 hours, 6-7 hours, 7-8 hours and more than 8 hours of sleep per night. After a month of monitoring, each participant undergoes a series of cognitive tests to assess their performance.
Question: After finding a significant difference in cognitive test scores among the four sleep groups using ANOVA, which specific sleep groups differ from each other in terms of cognitive performance?
This test is particularly useful when the group variances are unequal and the sample sizes may differ, as it provides a robust comparison without assuming equal variances
Games-Howell test
A medical researcher is comparing the effectiveness of four different drugs for treating high blood pressure. The researcher conducts a clinical trial with patients randomly assigned to one of the four drugs. After a period of treatment, the blood pressure levels of the patients are measured to assess the effectiveness of each drug.
Question: After finding a significant difference in blood pressure reduction among the four drugs using ANOVA, which specific drugs differ from each other in terms of their effectiveness?
Used to adjust significance levels for multiple comparisons, thereby controlling the family-wise error rate and reducing the risk of Type I error.
Bonferroni procedure
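A minimal sketch of the Bonferroni adjustment for the four-drug scenario: with four groups there are six pairwise comparisons, so each one is tested at alpha divided by six. The drug labels and the p-values are hypothetical placeholders, not results from the card.

```python
from itertools import combinations

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni procedure."""
    return alpha / n_comparisons

drugs = ["A", "B", "C", "D"]  # placeholder labels for the four drugs
pairs = list(combinations(drugs, 2))  # all 6 pairwise comparisons
adjusted = bonferroni_alpha(0.05, len(pairs))

# Hypothetical unadjusted p-values, one per pair (made up for illustration)
p_values = dict(zip(pairs, [0.001, 0.020, 0.004, 0.300, 0.012, 0.0005]))
significant = [pair for pair, p in p_values.items() if p < adjusted]

print(f"adjusted alpha = {adjusted:.4f}")
print(significant)
```

Note that a pair with p = 0.020 would pass at the usual 0.05 level but not at the adjusted level.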
A researcher is studying the effect of different types of music on concentration levels. Participants are divided into four groups, each exposed to a different genre of music (classical, jazz, rock, and silence) while completing a series of tasks that require concentration. The concentration levels are measured based on the number of tasks completed correctly.
Question: Is there a significant difference in concentration levels among participants exposed to different genres of music?
The test will determine if there is a significant difference in the median concentration level across the four groups.
Kruskal-Wallis H test
A psychologist is interested in studying the effects of different treatments on two different outcomes: anxiety levels and self-esteem scores among adults with social phobia. The psychologist decides to compare three treatments: cognitive behavioral therapy (CBT), medication, and a control group (no treatment).
Question: Do the different treatments have a statistically significant effect on both anxiety levels and self-esteem scores among adults with social phobia?
This test will help determine if there is a significant difference in the combination of these two psychological outcomes across the three treatment groups
MANOVA
A team of educational researchers is studying the impact of a new teaching strategy on student performance. They are particularly interested in how the strategy affects both math and science scores. To account for differences in students’ reading abilities, which could influence their performance, reading scores are included as a covariate.
Question: Does the new teaching strategy have a statistically significant effect on students’ math and science scores after controlling for their reading ability?
To assess the statistical difference on the two dependent variables (math and science scores) by the independent variable (teaching strategy) while controlling for the covariate (reading score)
MANCOVA
A health researcher is studying the relationship between the number of hours spent exercising per week and the level of HDL cholesterol (the “good” cholesterol) in adults. The researcher collects data from a sample of adults who report their weekly exercise hours and have their HDL cholesterol levels measured.
Question: Is there a significant linear correlation between the number of hours spent exercising per week and the HDL cholesterol levels in adults?
• to determine if there is a positive, negative, or no correlation between the two variables. A positive r value would indicate that as exercise hours increase, HDL cholesterol levels also increase, while a negative r value would suggest the opposite.
Pearson’s r
A sociologist is conducting a study to explore the relationship between social class and happiness. Participants are asked to rank their perceived social class on a scale from 1 to 10, with 1 being the lowest and 10 being the highest. They are also asked to rank their level of happiness on a similar scale.
Question: Is there a significant monotonic relationship between individuals’ perceived social class and their level of happiness?
• to determine if there is a positive, negative, or no monotonic relationship between the two ranked variables. A positive rho value would indicate that as perceived social class increases, happiness levels also tend to increase, while a negative rho value would suggest the opposite.
Spearman’s RHO
A market researcher is interested in understanding the relationship between customer satisfaction ratings and the likelihood of customers recommending a company’s services to others. The researcher surveys a group of customers, asking them to rank their satisfaction with the company’s service and their likelihood of recommending the company on a scale from 1 to 10.
Question: Is there a significant association between customer satisfaction rankings and the likelihood of recommending the company’s services as indicated by the customers?
• if tau-b is significantly different from zero, it would suggest that there is a monotonic relationship between customer satisfaction and the likelihood of recommending the company. A positive tau-b would indicate that higher satisfaction is associated with a higher likelihood of recommendation
Kendall’s tau-B
A large electronics company claims that their new smartphone battery lasts on average 24 hours on a single charge. A consumer advocacy group is skeptical of this claim and decides to test it. They randomly select 50 smartphones and run them until the batteries die. The group knows the population standard deviation for battery life is 2.5 hours based on manufacturer data.
Question: Is there a significant difference between the claimed average battery life of 24 hours and the observed average battery life from the consumer advocacy group’s sample?
Z test
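A short sketch of the one-sample z statistic for the battery scenario. The claimed mean (24 hours), the population standard deviation (2.5 hours) and the sample size (50) come from the card; the sample mean of 23.2 hours is a hypothetical value, since the card does not give one.

```python
import math
from statistics import NormalDist

def one_sample_z(sample_mean, claimed_mean, population_sd, n):
    """Return (z, two-sided p) when the population SD is known."""
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - claimed_mean) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# sample_mean=23.2 is an assumed value for illustration only
z, p = one_sample_z(sample_mean=23.2, claimed_mean=24, population_sd=2.5, n=50)
print(f"z = {z:.3f}, p = {p:.4f}")
```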
ASSUMPTIONS:
Multivariate Normality: The dependent variables should be normally distributed.
Equal Covariance Matrices: The groups should have similar variance-covariance matrices.
Linearity: There should be linear relationships between dependent variables and covariates.
Example: An educational study looks at how different teaching methods affect math and science scores, controlling for reading ability
MANCOVA
ASSUMPTIONS:
Multivariate Normality: The combination of dependent variables should be normally distributed.
Equal Covariance Matrices: The groups should have similar variance-covariance matrices.
Independent Observations: No participant belongs to more than one group.
Example: A psychologist examines how different therapies affect both stress and sleep quality. This test can determine if there’s a significant effect on both outcomes.
MANOVA
ASSUMPTIONS:
Monotonic Relationship: The relationship between the two variables should consistently increase or decrease but doesn’t need to be linear.
Ordinal Data: The variables should be rankable.
Example: A survey asks people to rank their happiness and income levels. These tests can assess if higher income ranks correlate with higher happiness ranks.
Spearman’s RHO and Kendall’s tau
ASSUMPTIONS:
Linear Relationship: The relationship between the two variables should be a straight line.
Normal Distribution: Both variables should be normally distributed.
Homoscedasticity: The spread of one variable should be consistent at all levels of the other variable.
Example: A researcher studies the relationship between study time and exam scores among students, assuming that more study time consistently relates to higher scores.
Pearson’s r
ASSUMPTIONS:
Independent Samples: Different groups must have different individuals.
Ordinal or Continuous Data: The data should be rankable or measurable.
Example: A company wants to compare the job satisfaction levels of employees in different departments using a survey, and the Kruskal-Wallis H test is used because the data are not normally distributed.
Kruskal-Wallis H test
ASSUMPTIONS:
Paired Samples: The two sets of data are related (e.g., measurements taken before and after an event on the same subjects).
Ordinal or Continuous Data: The data should be rankable or measurable.
Example: Patients’ pain levels are recorded before and after taking a new medication, and the test compares the two related sets of scores.
Wilcoxon signed rank test
ASSUMPTIONS:
Independent Samples: The two groups being compared must not overlap.
Ordinal or Continuous Data: The data should be rankable or measurable.
Example: Two different sales teams use different strategies, and their sales numbers (which are not normally distributed) are compared using the ___
Mann Whitney U test
ASSUMPTIONS:
Normal Distribution: The groups should come from normally distributed populations.
Equal Variances: All groups compared should have similar variances.
Independent Samples: No individual is in more than one group.
Example: A farmer uses three types of fertilizer and wants to see if there is a difference in crop yield. This test can determine whether the fertilizers lead to different mean yields.
ANOVA
ASSUMPTIONS:
Population Variance Unknown: We don’t know the population’s variance and estimate it from the sample.
Normal Distribution: The data should be roughly normally distributed, especially for small samples.
Equal Variances: For comparing two groups, their variances should be similar.
Sample Independence: Each observation is collected without influence from the others.
Example: A teacher wants to know if a new teaching method affects test scores differently than the old method. Since the population variance is unknown, a ___ is appropriate.
T-TEST
ASSUMPTIONS:
Population Variance Known: We assume the variance of the entire population is known.
Normal Distribution: The data should come from a normally distributed population, which is less of a concern for large samples due to the Central Limit Theorem.
Sample Independence: Each data point should not be influenced by any other data point.
Example: A factory knows the standard deviation of the weight of boxes it ships. To check whether a new machine is filling boxes to the correct average weight, a ___ is appropriate.
Z test