Stats 1 Flashcards
Compare and contrast the applications, assumptions, and methodologies of one-way ANOVA and two-way ANOVA in the context of experimental research.
A one-way ANOVA is done to compare the means of two or more groups when you have one independent variable (IV). A two-way ANOVA is done when you have two IVs. The assumptions for a one-way ANOVA are homoscedasticity, independence of observations, normality, and that the data should be on an interval or ratio scale. The assumptions for a two-way ANOVA are the same; for repeated-measures designs, the additional assumption of sphericity applies.
Normality means that the data should be normally distributed. Independence of observations means that the observations should not be correlated with each other.
When you run an ANOVA you check the assumptions before the test, and do post-hoc tests after the test is done. You do post-hoc tests to see which groups differed, since an ANOVA only tells you whether the means differ at all.
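A one-way ANOVA like the one described above can be run in a few lines. A minimal sketch with SciPy, using invented data for three groups (the group values are made up for illustration):

```python
# Sketch: one-way ANOVA across three groups with SciPy (invented data).
from scipy import stats

group_a = [4.1, 3.8, 4.5, 4.0, 4.2]
group_b = [3.2, 3.0, 3.5, 3.1, 3.3]
group_c = [4.0, 4.3, 3.9, 4.1, 4.4]

# H0: all group means are equal. The F-test only says *whether* the
# means differ; post-hoc tests (e.g. Tukey HSD) say *which* pairs differ.
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

A significant F here would be followed by a post-hoc test to find out which of the three groups differ from each other.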
A friend from the university who has never taken a course in statistics is tasked with doing significance testing for an assignment and asks you to explain the p-value and provide some guidance on factors that are important to consider when interpreting the results. What information does the p-value provide? What are the drawbacks?
The p-value is often described as the probability of the observed effect being due to chance (a more precise definition is given below). When the statistical test shows an effect, the p-value will be less than the alpha level. In psychological research the alpha level is normally set at 0.05. The alpha level, often denoted as α, is the predetermined threshold used to determine statistical significance in hypothesis testing.
It represents the probability of making a Type I error, which is the error of rejecting a true null hypothesis. Researchers choose the alpha level based on factors such as the desired balance between Type I and Type II errors and the conventions in their field of study.
The p-value is the probability of observing the data, or more extreme data, under the assumption that the null hypothesis is true.
If the p-value is less than or equal to the alpha level (p ≤ α), the result is considered statistically significant, and the null hypothesis is rejected in favor of the alternative hypothesis.
When the p-value is lower than the alpha level, we call the test result statistically significant, which means there is evidence of an effect. There are two hypotheses in statistical testing: H0 (the null hypothesis) and H1 (the alternative hypothesis). H0 assumes that there is no real effect, and H1 assumes there is an effect. One drawback is that the alpha level is decided by the researcher. Sometimes the researcher lowers the alpha level to help prevent Type I errors, but that in turn increases the risk of Type II errors. A Type I error is what's called a false positive: we reject the null hypothesis (H0) even though there is no real effect (we should not have rejected H0). A Type II error is what's called a false negative: we fail to reject the null hypothesis even though we should have, since there is a real effect.
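The decision rule above (reject H0 when p ≤ α) can be sketched with an independent-samples t-test. The two groups below are invented for illustration:

```python
# Sketch: computing a p-value and applying the alpha threshold (invented data).
from scipy import stats

control = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2]
treatment = [5.9, 6.1, 5.7, 6.0, 5.8, 6.2]

alpha = 0.05
t_stat, p_value = stats.ttest_ind(control, treatment)

# p-value: probability of data this extreme (or more extreme) if H0 is true.
if p_value <= alpha:
    print(f"p = {p_value:.4f} <= {alpha}: reject H0")
else:
    print(f"p = {p_value:.4f} > {alpha}: fail to reject H0")
```

Note that a significant result only means the data are unlikely under H0, not that the effect is certain to be real: with α = 0.05, a Type I error still occurs in 5% of cases where H0 is true.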
Simple quasi-experimental designs can often be improved by adding design elements. Can you give examples of this? Can quasi-experimental designs be refined to the extent that one can draw causal conclusions?
A quasi-experimental design usually looks something like this: NR O1 X O2. NR stands for non-randomization, which means the group was not randomly assigned. O1 stands for the first observation (often called the pre-assessment), X stands for the intervention/treatment, and O2 stands for the second observation (often called the post-assessment). One design element that can be added to a simple quasi-experiment to improve it is a non-treatment control group, which usually looks like this: NR O1 O2. The group is still not randomized, but adding a second group shows the researcher what may have happened if the test condition had not received the intervention/treatment. Another design element that can improve this type of experiment is additional pre-assessments in both conditions. For the test condition it could look something like this: NR O1 O2 X O3. More pre-assessments provide the researcher with a more stable baseline for the participants and the group as a whole, which helps the researcher make claims about the results more confidently. I would say that quasi-experimental designs can't be refined all the way to the point where the researcher can draw causal conclusions, but this design is definitely closer to that than naturalistic observation, case studies, correlational studies, and cross-sectional studies. If the design were refined enough, it would simply become an experimental design, for example if we also randomized the groups.
The issue of internal validity is frequently discussed within psychological research. What is meant by this type of validity, what are the most important threats to internal validity (name at least three) and how can the researcher best deal with these threats?
Internal validity is the degree to which a study can support the conclusion it set out to test: that the IV, and not something else, caused the change in the DV. If a study tested the effect of cognitive behavioral therapy (the IV) on depression (the DV), internal validity is how confidently we can attribute any improvement to the therapy itself. Some threats to internal validity are maturation, history, and selection bias.
Maturation refers to natural processes in humans, where we age and go through normal developmental processes that change us (for example puberty). This is especially important to control for, and/or discuss in the study, for longitudinal designs and studies with children as participants. To control for maturation, the researcher could do multiple pre-assessments in both the test condition and the control group. This way the researcher gets a more stable baseline for every participant, and for the group as a whole.
History refers to the time period in which the study is done: big world events and changes in society. It's important to discuss what impact current events could have on your study. For example, studies on depression or anxiety done in close proximity to the 9/11 terror attacks could show effects driven by that big event rather than by what the study aimed to test. It's not possible to control for history, but the researcher can discuss it in the study and bring attention to which confounders could have affected the study and how.
Selection bias occurs when the test condition and the control condition are not balanced to begin with. For example, if a study testing a new medicine for depression put only depressed people in the test condition and only healthy people in the control condition, you could not state anything of value, since we don't know what would have happened to the depressed people without the medicine, or to the healthy people with it. Even if the depressed people got less depressed after the study, you still could not conclude that the medicine should be distributed to people with depression, since you don't know that it was the medicine that made them feel better. With depressed people in the control group as well, you might have observed an improvement in their well-being too. To control for this, the researcher can randomly allocate participants to the conditions, so that people with depression and healthy people end up in roughly equal numbers in both groups.
The causality problem is a main concern in empirical research. Discuss the meaning of causality (cause-effect), with your philosopher’s hat on.
Hume:
Cause precedes effect in time
Cause and effect are closely related in time and space
Every time we observe the cause we should observe the effect
Mill:
All three of Hume's criteria, plus: there should be no other plausible explanation for the effect than the cause
The causality problem concerns whether you can reliably make correct statements about the cause and effect of any event. Psychological research tries to do this by handling research in a way similar to Hume's and Mill's criteria. Empirical research often controls for equality between conditions, such as equal groups, equal environments, and equal test experiences. In this way the research tries to make sure that there is no other plausible explanation for the effect than the cause (if there is one). There are often repeated measures, which help confirm that we observe the effect every time we observe the cause. In empirical research there is typically a dependent variable (DV) and an independent variable (IV). We manipulate the IV and observe the possible changes in the DV. The DV is the presumed effect, and the IV is the presumed cause, so if we change the IV we should also see a difference in the DV. This type of research has been refined over the years and is currently the closest we can get to making statements about cause and effect in empirical research. However, it will probably never be possible to state the cause and effect of any event with 100% certainty, because we cannot possibly control for everything, even if we try to. There could always be some confounding variable we failed to control for. We can, for example, seldom test entire populations; most of the time a smaller sample is tested to represent the population.
Assumptions
Common to all of them: normality
Regression Analysis:
Linearity: The relationship between the independent and dependent variables is linear.
Independence of Errors: The errors (residuals) are independent of each other.
Homoscedasticity: The variance of the errors is constant across all levels of the independent variables.
Normality of Errors: The errors are normally distributed
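The regression assumptions listed above can be checked on the residuals after fitting. A minimal sketch with NumPy and SciPy, using simulated data (the true slope of 2.0 and all other numbers are invented for illustration):

```python
# Sketch: fit a simple linear regression and check residual normality
# (Shapiro-Wilk). Data are simulated with a known linear relationship.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, size=x.size)  # true slope = 2.0

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

# Normality of errors: Shapiro-Wilk on the residuals (H0: normal).
# Homoscedasticity would be checked by plotting residuals against x.
w, p_norm = stats.shapiro(residuals)
print(f"slope = {slope:.2f}, Shapiro-Wilk p = {p_norm:.3f}")
```

A residuals-versus-fitted plot would complete the picture for linearity and homoscedasticity; a funnel shape in that plot suggests the error variance is not constant.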
T-test and one way ANOVA:
Normality: The data within each group or sample come from a population that follows a normal distribution.
Homogeneity of Variance
Independence
Interval or Ratio Data
Mediator and moderator
Mediator:
The IV must cause the mediator.
Ex: Stress is IV, which causes cortisol (mediator), which may affect depression (DV)
Moderator:
A variable that influences the strength or direction of the relationship between two other variables.
If the IV is studying time and the DV is exam performance, then the moderator could be the type of studying method
Type 1 and type 2 errors
what they are:
Type I error: rejecting the null hypothesis even though it is true (false positive). Type II error: failing to reject the null hypothesis even though it is false (false negative).
How to reduce: lowering the alpha level reduces the risk of Type I errors but increases the risk of Type II errors; increasing the sample size (and thereby power) reduces the risk of Type II errors.
Eigenvalue
An eigenvalue is a number telling you how much variance the data has in the direction of the corresponding eigenvector.
A large number indicates the direction in which the data has the most variance.
how to calculate?
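One common way to calculate them in this context is from the covariance matrix of the data. A sketch with NumPy, using simulated 2-D data deliberately stretched along one axis (the scale factors 3.0 and 0.5 are invented for illustration):

```python
# Sketch: eigenvalues of a covariance matrix measure variance along
# each principal direction (simulated data, stretched along axis 0).
import numpy as np

rng = np.random.default_rng(2)
# Most variance along the first axis (sd 3.0), little along the second (sd 0.5).
data = rng.normal(0, 1, size=(500, 2)) * np.array([3.0, 0.5])

cov = np.cov(data, rowvar=False)                 # 2x2 covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # ascending order

# Expect roughly 9 (= 3^2) for the stretched direction, 0.25 for the other.
print(sorted(eigenvalues, reverse=True))
```

The largest eigenvalue corresponds to the first principal component: the direction in which the data varies the most.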
Effect sizes in a meta-analysis are heterogeneous
The effects vary more than what could be expected from random variation
Threats to internal validity
Selection bias: Groups are not equal to start with (sex, age); e.g. don't put all sick people in the treatment group. Remedy: randomization.
Maturation: Natural change processes occurring over time regardless of the treatment. Remedy: control group.
History: Events that occur during the study period, other than the treatment/intervention, that may affect the results (9/11, WWIII). Remedy: control group.
Regression towards the mean: Extreme (high/low) values tend to move towards the mean with repeated measurement. Remedy: control group.
Instrumentation: The precision of the instrument/measurement tool may alter over time. Remedy: calibrate instruments.
Testing: Fatigue effects (getting tired of it), habituation (getting used to it), learning effects (getting better over time).
Attrition (dropouts): A special case of selection bias that occurs after the treatment has started, creating an imbalance in your comparison groups, which may therefore become "uneven". Randomization does not guard against this threat. Remedy: analyze and report.
efficacy study
effectiveness study
Efficacy study: evaluates how well an intervention works in a controlled environment / under ideal conditions.
Effectiveness study: assesses how well an intervention works in the real world, under everyday conditions.
Used to see the performance of the intervention in diverse populations
Power
“Power” is the probability that a statistical test will correctly reject a false null hypothesis.
It reflects the ability of a study or experiment to detect a true effect when one exists.
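Power can be estimated by simulation: repeat the experiment many times with a known true effect and count how often the test detects it at α = 0.05. A sketch with NumPy/SciPy; the effect size (Cohen's d = 0.8) and group size (n = 30) are invented for illustration:

```python
# Sketch: estimating power by simulation. Power = the fraction of
# repeated experiments in which a true effect is detected at alpha.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
alpha, n, effect = 0.05, 30, 0.8   # effect in SD units (Cohen's d)

hits = 0
trials = 2000
for _ in range(trials):
    a = rng.normal(0, 1, n)        # control group, no effect
    b = rng.normal(effect, 1, n)   # treatment group, true effect = 0.8 SD
    _, p = stats.ttest_ind(a, b)
    hits += p <= alpha             # count correct rejections of H0

print(f"estimated power ≈ {hits / trials:.2f}")
```

With these (invented) settings the estimated power comes out around the conventional 0.80 target; a smaller true effect or smaller groups would lower it, which is why power analyses are run before collecting data.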
Maturation
Internal validity
Maturation refers to natural processes in humans, where we age and go through normal developmental processes that change us (for example puberty). This is especially important to control for, and/or discuss in the study, for longitudinal designs and studies with children as participants. To control for maturation, the researcher could do multiple pre-assessments in both the test condition and the control group. This way the researcher gets a more stable baseline for every participant, and for the group as a whole.
History
Internal validity
History refers to the time period in which the study is done: big world events and changes in society. It's important to discuss what impact current events could have on your study. For example, studies on depression or anxiety done in close proximity to the 9/11 terror attacks could show effects driven by that big event rather than by what the study aimed to test. It's not possible to control for history, but the researcher can discuss it in the study and bring attention to which confounders could have affected the study and how.