RIP final Flashcards

1
Q

How to proceed with answering the question: Is there a difference between the mean resting heart rate of men and women?

A

The first step is calculating the difference between the two means. We must transform this difference into a relative distance (the t-statistic), which allows us to compare the difference to a standardized distribution (the t-distribution). We calculate the test statistic using the formula for t. Once we have the value of t, we use the p-value to measure how extreme the difference is.

2
Q

What is the formula for the t-statistic?

A

observed difference/standard error for the difference in the two means

(M1 - M2) / SE(M1 - M2)
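A minimal sketch of the formula above, assuming the standard error is built from a pooled standard deviation (the equal-variance form of the independent-samples t-test; the deck itself does not spell out the SE formula). The function name `t_statistic` and the heart-rate numbers are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(group1, group2):
    """Return (difference, standard error, t) for two independent samples."""
    n1, n2 = len(group1), len(group2)
    diff = mean(group1) - mean(group2)  # observed difference M1 - M2
    # Pooled variance: average of the sample variances, weighted by degrees of freedom
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)  # SE(M1 - M2)
    return diff, se, diff / se


# Hypothetical resting heart rates (beats per minute), invented for this example
men = [62, 70, 66, 68, 64]
women = [72, 70, 76, 74, 68]
diff, se, t = t_statistic(men, women)
print(diff, round(se, 2), round(t, 2))
```

With these made-up numbers the difference is -6 beats per minute and the standard error is 2, so t is -3: the observed difference lies three standard errors below zero.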

3
Q

Once we have the value of t, what do we use to measure how extreme the difference is?

A

p-value

4
Q

conditions of causality

A
  1. covariance
  2. temporal precedence
  3. internal validity
5
Q

internal validity

A

Alternative explanations for the relationship should be ruled out

6
Q

randomized experiment

A

A research design where:
▪by randomization, groups can be assumed to be similar
▪one variable is manipulated (varied) by the researcher
▪the researcher measures the effect of this manipulation on another variable (the outcome)

7
Q

confounding variable

A

A second variable that happens to vary systematically along with the intended independent variable. This variable is therefore an alternative explanation for the results.

8
Q

internal validity

A

asks if groups were comparable at the beginning of the experiment, with respect to the dependent variable and other variables (observed and unobserved). If, for some reason, the groups turn out not to be comparable at the start of the experiment, we speak of a selection effect.

9
Q

selection effect

A

Crucial question: how were the groups created? To reduce selection effects, groups must be formed using random assignment. If, for some reason, the groups turn out not to be comparable at the start of the experiment, we speak of a selection effect.

10
Q

goal of random assignment

A

making sure that: the mean and variance in scores, on all variables, measured and unmeasured, are similar for both groups at the onset of the study

11
Q

randomization issues

A

contamination

12
Q

contamination in randomization

A

▪Participants in the experimental group communicate with participants in the control group
▪Participants do not adhere to the treatment
▪Influence from researcher(s)

13
Q

PICO

A

The identifier of an experimental research question

Population
Intervention
Comparison
Outcome

14
Q

what do researchers use when comparing mean scores of two independent groups?

A

independent sample t test

15
Q

standard error for difference in means

A

contains the group sizes (n1 and n2) and the spread in scores in both groups (SD1 and SD2)

16
Q

With the t-test we consider the relative difference between the groups, using:

A

*The mean difference: M1 – M2
*The spread in scores in both groups: SD1 and SD2
*The group sizes: n1 and n2

17
Q

the idea behind the test statistic t

A

When a lot of samples are drawn from a population in which H0 is true, the difference between the sample means will often be near zero. So, t will often be near zero, too. Values of t that are far from zero will be found less often.

18
Q

what is the standard error of t dependent on?

A

*Group sizes (n1 and n2)
*Variation in scores in both groups (SD1 and SD2)

19
Q

as standard deviation increases, standard error

A

also increases

20
Q

as n increases, standard error

A

decreases

21
Q

overall the test statistic is dependent on

A
  • relative difference in means
  • standard deviation pooled (weighted average of sd in sample 1 and sd in sample 2)
  • and sample size per group
22
Q

a larger difference in means means what for the t value

A

larger t

23
Q

more variation in scores means what for the t value

A

smaller t

24
Q

larger samples means what for the t value

A

larger t

25
Q

randomization

A
  • key of true experiment
  • observed and unobserved factors are equally likely in both groups
  • transparent, reproducible
  • allows causal claims
26
Q

between subject design

A

When participants are divided into different groups and each group receives a different treatment. The data are then compared between groups.

27
Q

within subject design

A

When all participants receive all different treatments (one after the other, possibly randomized in order). We first compare the data within each person

28
Q

how does a pretest-posttest design compare to posttest

A

can serve as a randomization check, correction for differences, and can track changes. in just a posttest design, we would not know if/how the groups differed at the beginning.

29
Q

disadvantage of the pretest-posttest design

A

learning effect

30
Q

solomon four group design and advantages/disadvantages

A

combines both pretest-posttest and posttest-only groups. can solve unequal groups at the beginning and check for a learning effect. however, it can be highly costly.

31
Q

repeated measures design

A

where the same participants are measured multiple times under different conditions or at different time points. This allows researchers to examine changes within individuals, reducing variability and the need for a large sample size.

32
Q

counterbalanced measures design

A

A research design used to control for order effects in repeated measures studies. Participants experience all conditions, but the order of conditions is varied across participants to prevent biases from practice, fatigue, or carryover effects.

33
Q

quasi-experiment

A

Research designs that evaluate the effect of an intervention or treatment without random assignment. Instead, groups are naturally formed or pre-existing, making them useful in real-world settings where randomization isn’t feasible.

34
Q

interrupted time series design

A

A quasi-experimental design that measures an outcome variable repeatedly over time, both before and after an intervention or event (the “interruption”). It evaluates changes in trends or levels caused by the intervention, making it useful for analyzing the effects of policies, treatments, or external events.

35
Q

field experiment

A

An experiment conducted in a natural setting, or with a close simulation of the conditions under which the process under study occurs

36
Q

threats to internal validity

A

design confounds
selection effect

37
Q

design confounds

A

A second variable that happens to vary SYSTEMATICALLY along with the intended independent variable
▪This variable is therefore an alternative explanation for the results

38
Q

threats to internal validity in experimental design

A

▪Design confounds
▪Selection effect
▪Contamination
▪Learning effect
▪Maturation
▪History
▪Regression to the mean
▪Attrition
▪Testing
▪Instrumentation

39
Q

threats to internal validity in all research

A

▪Observer bias
▪Demand characteristics
▪Placebo effect

40
Q

Observer bias

A

When the researcher has certain expectations and is influenced by them in assessing the participants or interpreting the results

41
Q

Demand characteristics

A

When the participants realize what the study is for and therefore start to behave differently (in the expected direction)

42
Q

Placebo effect

A

When participants make progress because they believe they are receiving an effective treatment

43
Q

Maturation

A

Is it the manipulation or the development (aging, maturing) that caused the differences?

Observed differences between the pre- and post-measurement could arise from natural developments of the participants, when participants’ characteristics change as part of a natural process.

44
Q

History threats

A

Is it the manipulation or external events causing the differences?

Not only natural changes of participants are a source of influence, but external events as well - events that are not necessarily related to the study.

45
Q

Regression threats

A

Is it the manipulation or the natural “shifting” that caused the differences?

Regression to the mean can occur when the participants show extreme values (on average) at the start of the experiment. At a later time, values are expected to have shifted towards the ‘normal’, less extreme, mean value.

46
Q

Attrition threats

A

Is it the manipulation or the drop-out of a group of participants that caused the differences?

When participants drop out during a study, the outcome can be affected by this. This is primarily a problem when the people that quit the study are different from the people that do not.

47
Q

Instrumentation threats

A

Is it the manipulation or the new instrument that caused the differences?

When the instrument measuring the dependent variable changes during the experiment, the results are affected.

48
Q

What are possible explanations if no effect is found after an experiment?

A

weak manipulations

power problem (there is an effect, but too few participants to detect it)

no effect (there really is no difference in the population)

49
Q

how is the null hypothesis protected in NHST?

A

by making the chance of making a type one error small (the significance level)

50
Q

chance of the complement of a type 2 error

A

power: 1 - β (where β is the chance of making a type 2 error)

51
Q

power

A

chance of correctly rejecting H0. measures the chance that an existing difference in the population will be found by the sample data and the statistical test

52
Q

what happens to power when alpha increases?

A

power also increases. By increasing alpha (the threshold for rejecting the null hypothesis), it becomes easier to detect a true effect, which increases the likelihood of rejecting the null hypothesis correctly, thereby increasing power. However, the chance of making a type 1 error also increases. Researchers need to find a balance between a small value of alpha and high power.

53
Q

factors power is influenced by

A

The sample size

The size of the difference in the population

The level of significance

The spread (or variability) in the measured scores

The choice of the statistical technique
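A rough sketch of how the first four factors move power, using a normal approximation for a two-sided comparison of two equally sized groups. The function `approx_power` and all numbers are illustrative assumptions rather than course material; dedicated power software would typically use the t-distribution instead.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(diff, sd, n_per_group, alpha=0.05):
    """Approximate power to detect a true mean difference `diff` (normal approximation)."""
    z = NormalDist()
    se = sd * sqrt(2 / n_per_group)    # standard error of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = abs(diff) / se             # true difference expressed in standard errors
    # Chance that the test statistic lands beyond either critical value
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


baseline = approx_power(diff=5, sd=10, n_per_group=30)
print(round(baseline, 2))
print(approx_power(5, 10, 60) > baseline)              # larger sample: more power
print(approx_power(8, 10, 30) > baseline)              # larger true difference: more power
print(approx_power(5, 10, 30, alpha=0.10) > baseline)  # larger alpha: more power
print(approx_power(5, 5, 30) > baseline)               # less spread: more power
```

When the true difference is zero, the same expression returns alpha, which is the chance of a type 1 error.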

54
Q

type two error

A

A type II error is that the null hypothesis is not rejected, while the null hypothesis is not true.

55
Q

when spread in scores decreases, what happens to power?

A

power increases

56
Q

4 principles which are the basis of integrity in research

A

Reliability, honesty, respect, accountability

57
Q

major violations of scientific integrity

A

fabrication - making up data, deliberate

plagiarism - copying other people’s work, deliberate

data falsification - not reporting certain findings, adjusting data, misinterpreting it, all deliberately

58
Q

publication bias

A

absence of non-significant effects leads to bias towards large effects

59
Q

Causes of questionable research practices (QRP)

A

Scientific journals would like to publish interesting/innovative results, which attracts more readers AND researchers need to publish enough to make a career

60
Q

p-hacking

A

things like:
*Removing outliers to make a difference significant
*Add a few more participants to make results significant
*Run a different analysis than planned

61
Q

HARKing

A

hypothesising after results are known: in hindsight, formulating hypotheses and pretending that they were the main focus of the research all along

62
Q

Solutions to questionable research practices

A

post-publication peer review

retraction

pre-registration of aims and intended methods and expectations

replication as a standard part of research

63
Q

Cohen’s D

A

Used to describe the size of a difference

Measure of relevance; expresses the difference between two means in the number of standard deviations

(M2-M1)/SDpooled

64
Q

SD pooled

A

Weighted average of SD1 and SD2
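A small sketch of Cohen's d as given on the card, (M2-M1)/SDpooled. The weighting of the pooled SD by degrees of freedom (n - 1) is an assumption, since the deck only says "weighted average"; the function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(group1, group2):
    """Pooled SD: the two sample variances averaged with weights n - 1, then square-rooted."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return sqrt(pooled_var)


def cohens_d(group1, group2):
    """Difference between the two means, expressed in pooled standard deviations."""
    return (mean(group2) - mean(group1)) / pooled_sd(group1, group2)


# Invented scores for illustration
control = [4, 6, 8]
treatment = [7, 9, 11]
print(cohens_d(control, treatment))
```

Here both groups have a variance of 4, so the pooled SD is 2 and the 3-point difference in means gives d = 1.5.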

65
Q

confidence interval

A

another way to describe the size of the difference between the two groups. a range of plausible values based on sample data

66
Q

width of confidence interval depends on

A
  • Sample size (smaller standard error –> narrower interval)
  • Spread/variation in scores in population (means greater spread of scores in sample, so more uncertainty –> wider interval)
  • Chosen confidence level (95% confidence level widely used - more certainty, wider interval)
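The three factors above can be illustrated with a simplified interval for a difference in means. This sketch assumes two equally sized groups with a common SD and uses normal critical values instead of t critical values; `ci_for_difference` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ci_for_difference(diff, sd, n_per_group, level=0.95):
    """Approximate confidence interval for a difference in means (normal approximation)."""
    se = sd * sqrt(2 / n_per_group)                # standard error of the difference
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value for the chosen level
    return diff - z * se, diff + z * se


def width(interval):
    low, high = interval
    return high - low


base = width(ci_for_difference(5, 10, 30))
print(width(ci_for_difference(5, 10, 120)) < base)            # larger sample: narrower
print(width(ci_for_difference(5, 20, 30)) > base)             # more spread: wider
print(width(ci_for_difference(5, 10, 30, level=0.99)) > base) # higher confidence level: wider
```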
67
Q

Four parts to evaluate statistical validity

A
  1. significance is determined based on test statistic t and the p-value
  2. relevance is assessed using a measure of effect size, such as cohen’s d
  3. accuracy is assessed using a confidence interval
  4. suitability of statistical test is assessed by checking the assumptions
68
Q

how is effect size measured for regression analysis

A

R squared

69
Q

how is effect size measured for chi squared

A

Cramer’s V

70
Q

Three claims

A

Frequency claim

Association claim (correlation and regression studies)

Causal claim (best made in context of randomized experiments)

71
Q

Four validities

A

Construct

Internal

External

Statistical

72
Q

Assessing statistical validity

A

Significance (determined by the p-value)

Relevance (assessed using effect size)

Accuracy (assessed using confidence interval)

73
Q

How is suitability of a statistical test assessed

A
  1. check assumptions
  2. check if hypotheses match expectations
  3. check if results match hypotheses
74
Q

assumptions of t test

A
  1. random sample
  2. dependent variable is of interval or ratio measurement level
  3. two groups are independent
  4. scores in both groups are normally distributed
  5. scores in both groups have equal spread

Violating these assumptions leads to lower statistical validity

75
Q

How can we check Assumption 1 of t-test

A
  • read methods section of article; how did researchers select participants?

if sample is not random:
- be cautious interpreting results, because a random sample ensures independence of observations

76
Q

How can we check Assumption 2 of t-test

A
  • methods section
  • how are constructs operationally defined? is it plausible we can interpret in interval/ratio level?
  • if you have enough levels for ordinal, people won’t bother (e.g. aggression)
77
Q

What if DV of a t-test is not interval or ratio (or ordinal)? eg answers to yes/no questions

A

Solution: use a statistical test for categorical variables (the chi-squared test of homogeneity)

78
Q

Chi-squared test of homogeneity

Q example: is the distribution of answers of people with treatment the same as the distribution of answers of people without?

A
  • two independent samples (like t-test)
  • DV is categorical (unlike t-test)
  • Used to determine if the distribution of a categorical variable is the same in two groups, can be used with more than 2 groups
79
Q

Chi-squared test of homogeneity hypotheses

A

H0: distribution of answers in control is equal to distribution of answers in treatment

H1: distribution of answers in control is different from the distribution of answers in treatment
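A minimal sketch of the test statistic behind these hypotheses: observed counts are compared with the counts expected if both groups shared one distribution. The function name and the counts are made up; turning the statistic into a p-value needs the chi-squared distribution with the returned degrees of freedom, which is left out here.

```python
def chi_squared_homogeneity(table):
    """table: one row per group, each row a list of counts per answer category."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if H0 (equal distributions) is true
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical yes/no answer counts: [yes, no]
control = [30, 20]
treatment = [20, 30]
print(chi_squared_homogeneity([control, treatment]))
```

With these counts every expected cell is 25, each cell contributes 1, and the statistic is 4.0 with 1 degree of freedom. Identical distributions in both groups would give a statistic of 0.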

80
Q

How can we check Assumption 3 of t-test

A
  • Read Methods Section of an article
  • Are the participants randomly assigned to two separate groups?
  • Is there a link between the measurements in the two groups?
81
Q

What if two groups are not independent (assumption 3 of t-test)

A

Solution: conduct a t-test for dependent samples

82
Q

How can we check Assumption 4 of t-test

A

Independent sample t-test: two histograms, 1 of scores in control group and 1 of scores in experimental group

Paired sample t test: make 1 histogram of difference scores

83
Q

How can we check Assumption 5 of t-test

A

Can use side-by-side box plot and observe the spread of the arms
- graphical checking is preferred

Can also use one of the formal tests for equal variances (a significant result indicates unequal variances)
- Levene’s test
- Brown-Forsythe test
- F-max test

84
Q

What to do if the equal variance assumption (assumption 5) is not satisfied?

A

Use an alternative called Welch’s test

The t-test we use under the assumption of equal variances has more power, so that option is preferred

85
Q

what do we do to compare the distribution of a categorical variable between two (or more) groups

A

Use the chi-squared test of homogeneity to test if the distributions are homogeneous

86
Q

the steps to measuring a theoretical concept

A

theoretical concept –> conceptual definition –> operational definition –> variable

87
Q

correlation is used for

A

measuring strength and direction of linear relationship

88
Q

regression is used for

A

describing the linear relationship with an equation and making predictions using this equation when only data on the independent variable is available

89
Q

Least squares regression is a technique used for

A

finding the equation of the line best fitting to the data

90
Q

Residuals

A

Residuals are the difference between the observed value of Y and the predicted value of Y (= point on the line). When a line fits the data well, the residuals will tend to be small. the equation with the smallest sum of squared residuals is the winner!

91
Q

Root Mean Squared Error, or Standard Error of the Estimate, in linear regression

A

the standard deviation of the residuals.

roughly, the average error we make when using the regression equation to make predictions

92
Q

coefficient of determination is

A

R squared

93
Q

What does R squared tell us in a regression model

A

how much of the variation in Y can be explained by the linear relationship with X. percentage variance explained.
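The regression cards above (least squares, residuals, RMSE, R squared) can be sketched together for one predictor. Dividing by n - 2 in the RMSE is an assumption (one common convention for the standard error of the estimate); the function name and data are invented for illustration.

```python
from math import sqrt
from statistics import mean


def least_squares(x, y):
    """Fit y = intercept + slope * x and return fit summaries."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual = observed Y minus predicted Y (the point on the line)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)  # what least squares minimizes
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot          # proportion of variation in Y explained
    rmse = sqrt(ss_res / (len(x) - 2))       # roughly the average prediction error
    return slope, intercept, residuals, rmse, r_squared


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, residuals, rmse, r_squared = least_squares(x, y)
print(round(slope, 2), round(intercept, 2), round(rmse, 2), round(r_squared, 2))
```

For this toy data the fitted line is Y = 2.2 + 0.6 X, and the linear relationship with X explains 60% of the variation in Y.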

94
Q

What are the two tests we can use to find out if the linear relationship is a significant relationship in the regression model

A

option 1: test for the slope
- we can test if the slope is significantly different from 0, using the t-test

option 2: test for explained variance
- to test if the model explains a significant proportion of the variation, we can test to see if the proportion of the variation that is explained by the model, is significantly greater than 0, using the F-test.

95
Q

beta (standardized) coefficient

A

measures the change in Y (in standard deviations) for a one-SD increase in X

96
Q

the assumptions of least squares regression

A
  1. linear relationship (check using scatter plot)
  2. interval or ratio measurement level
  3. no outliers (check using residual plot)
  4. residuals are normally distributed
  5. homoscedasticity (spread around regression line is independent of the value of X)
97
Q

adding more independent variables to a regression model always…

A
  • explains more of the variation in the DV (so higher R squared)
  • reduces the average prediction error (so SE will decrease as accuracy increases)
98
Q

what to be careful of when removing variables from an MLR model

A

do it one at a time; never remove multiple variables at once based on the t-test!

99
Q

Principle of p value in null hypothesis significance testing

A

Given the null hypothesis is true, what is the chance of observing the data we observed (or more extreme data)?

100
Q

Principle of Bayesian testing

A

Given the data we observed, what is the chance the null hypothesis is true?

101
Q

what does the bayes factor measure

A

How much more does the observed data support the null hypothesis as compared to the alternative hypothesis

relative support for null hypothesis, as measured by

support in data for H0/support in data for H1

102
Q

What does a Bayes factor of 5 mean

A

the support in the data for H0 is 5 times greater than the support for H1

103
Q

what does BF01 measure

A

support in the data for H0/support in the data for H1

104
Q

what does BF10 measure

A

support in the data for H1/support in the data for H0

105
Q

How do we interpret a BF01 of 0.4

A

the support in the data for H0 is 0.4 times as great as for H1

but this doesn’t really make sense.

so in this case we flip the Bayes factor so that

BF10 = 1/0.4 = 2.5, so the support in the data for the alternative hypothesis is 2.5 times greater than for the null hypothesis
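The flip described above is just a reciprocal. A tiny sketch, with hypothetical helper names, that always reports the Bayes factor in the direction that is at least 1:

```python
def flip_bayes_factor(bf):
    """Convert BF01 to BF10 (or back): the two are reciprocals of each other."""
    return 1 / bf


def describe_bf01(bf01):
    """Phrase a BF01 value in whichever direction gives a factor of at least 1."""
    if bf01 >= 1:
        return f"support for H0 is {bf01:g} times the support for H1"
    return f"support for H1 is {flip_bayes_factor(bf01):g} times the support for H0"


print(describe_bf01(0.4))  # support for H1 is 2.5 times the support for H0
print(describe_bf01(5))    # support for H0 is 5 times the support for H1
```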

106
Q

confidence interval in NHST

A

Interval estimate to give the reader an idea of the size of the effect

107
Q

interpretation of credible interval in Bayesian testing

A

Given the evidence in the data, the mean score of condition A has a 95% chance of falling between x and y

108
Q

Results of the reproducibility project

A

In almost all original studies the null hypothesis was rejected (they had a p-value smaller than .05)

but only a third of the replication studies were able to reject the null

effect sizes were only half as large in the replications compared to original studies

109
Q

mission of open science research

A

increase the openness, integrity, and reproducibility of scientific research

*everyone should have access to this scientific knowledge
*everyone should be able to use it for the benefit of science/society

110
Q

in open science, researchers are…

A

*working digitally
*collecting enormous amounts of data
*able to easily share data online

111
Q

advantages of open science

A

increases citations
increases visibility of academic research
increases reusability of academic research results

112
Q

disadvantages of open science

A

the range of high-quality, fully open access journals is still limited

the number of available reliable journals and articles varies per discipline

Quality and reliability of open access journals

113
Q

FAIR principles for how data should be stored

A

Findable
Accessible
Interoperable
Reusable

114
Q

Following FAIR guidelines leads to

A

*a greater efficiency of the research process, because new research questions do not always require the collection of new data when suitable data are already available

*better reproducibility and greater reliability of research

115
Q

A good data management plan leads to

A

FAIRness of data

116
Q

adv and disad direct replication

A

adv: easy to compare

disad: problems with internal validity in the original research will still be present

117
Q

adv and disad conceptual replication

A

adv:
- ability to improve design
- increase internal validity

disadvantage:
- not as easy to compare

118
Q

adv and disad replication plus extension

A

adv: Possibility to examine additional research question

disad: Not as easy to compare