Research Design and Statistics Flashcards

1
Q

What is the sequence of the scientific method?

A
  1. Form a hypothesis
  2. Operationally define the hypothesis: what will be measured to show results?
  3. Collect & analyze data
  4. Disseminate results
2
Q

Independent Variable: Define

A

The variable that is manipulated by researchers
The variable that is thought to impact the dependent variable

3
Q

Dependent Variable: define

A

The outcome variable
What is hypothesized to change as a result of the IV

4
Q

Predictor & Criterion Variables: Define

A

**Predictor:** Essentially the same as the IV, but it can’t be manipulated
E.g. gender, age
Criterion: essentially the dependent variable

This is for correlational research

5
Q

Can a variable have levels in a study?

A

Yes, especially the independent variable
E.g. Male & Female could be levels of the predictor variable
No treatment/Med Only/Combined treatment could be levels of the IV for treatment group

6
Q

Factorial Designs

A

These have multiple IVs

E.g. 1 IV is treatment; 2nd IV is type of schizophrenia

If every level of one IV is combined with every level of the other, so you can look at their effects on each other, it becomes a factorial design

7
Q

What gives a study Internal Validity?

A

If you can determine a causal relationship between the IV and DV

No/limited effects of extraneous variables

8
Q

Internal Validity in Multiple Group Studies: what impacts it?

A

The groups must be comparable to control for extraneous/confounding factors

9
Q

Internal Validity: History

A

What is it? Any external event that affects scores on the dependent variable

Example: learning environment between groups is different, w/ one being superior

10
Q

Internal Validity: Maturation

A

What is it? an internal change that occurs in subjects while the experiment is in progress

Example: time may lead to intellectual development, or fatigue, boredom, hunger may impact it

11
Q

Internal Validity: Testing

A

What is it? practice effects

Example: take an EPPP sample test, attend a course, and then retake the exam to see if the course helped. Any improvement may just come from knowing what to expect on the test

12
Q

Internal Validity: Instrumentation

A

What is it? changes in DV scores that are due to the measuring instrument changing

Example: raters may gain more experience over time. This is why we need highly reliable measuring instruments

13
Q

Internal Validity: What is Statistical Regression?

A

What is it? extreme scores tend to fall closer to the mean upon re-testing

Example: if you test people rated as severely depressed, they are likely to report as less depressed at retest simply because their scores were extreme, regardless of any IV

14
Q

Internal Validity: Selection

A

What is it? Pre-existing subject factors that account for scores on DV

Example: Classroom A students may simply just be smarter than Classroom B students, so regardless of different interventions they will score better

15
Q

Internal Validity: Differential Attrition

A

What is it? Dropout is inevitable, so if you have 2 different groups and the type of people who drop out differs between them, it can affect internal validity

Example: studying a new SSRI, some people may experience a worsening of depression/SI while on it, and they drop out. Because they dropped out, the med may appear to have been more helpful than it truly is

16
Q

Internal Validity: Experimenter Bias

A

What is it? The researcher’s preconceived bias impacts how they interact with subjects, which impacts the subjects’ scores

AKA: experimenter expectancy effect, Rosenthal effect, Pygmalion effect

Example: experimenter unintentionally communicates expectations to subject

Prevention: double-blind technique

17
Q

Protecting Internal Validity: Random Assignment

A

Each person has equal chance of ending up in a particular group
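
One way to picture random assignment is a shuffle followed by dealing subjects out to groups. This is a minimal sketch only: the function name `randomly_assign`, the group labels and the 20 subject IDs are made up for illustration.

```python
import random

def randomly_assign(subjects, groups=("treatment", "control"), seed=None):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)      # seed is only here to make the demo repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)          # every ordering is equally likely
    assignment = {g: [] for g in groups}
    for i, subject in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(subject)
    return assignment

# Hypothetical subject IDs 1..20
assigned = randomly_assign(range(1, 21), seed=42)
print({name: sorted(members) for name, members in assigned.items()})
```

Because the order is shuffled before dealing, each subject has the same chance of landing in either group, and the groups end up equal in size.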

18
Q

Protecting Internal Validity: Matching

A

What is it? ID subjects who are matched on an expected confounding variable, and then randomly assign them to treatment/control group

Ensures that both groups have equal proportion of the confounding variable

19
Q

Protecting Internal Validity: Blocking

A

What is it? make the confounding variable another IV to determine to what extent it may be impacting the DV

Allows you to separate the effects of a variable and see interactions

20
Q

Protecting Internal Validity: Holding the Extraneous Variable Constant

A

What is it? only use subjects who are the same on the extraneous variable

Problem: results not generalizable to other groups

21
Q

Protecting Internal Validity: Analysis of Covariance

A

What is it? a stat strategy that adjusts DV scores so that subjects are equalized in terms of status on extraneous variables

Pitfall: only effective for extraneous variables that have been identified by the researchers

22
Q

External Validity: Define

A

The degree to which results of a study can be generalized to other settings, times, people

23
Q

Threats to External Validity: Interaction between Selection & Treatment

A

What is it? effects of a treatment don’t generalize to other target populations

Example: may work with college students, but not non-college students

24
Q

Threats to External Validity: Interaction between History & Treatment

A

What is it? effects of treatment don’t generalize beyond setting and/or time period the experiment was done in

25
Q

Threats to External Validity: Interaction between Testing & Treatment

A

What is it? pre-tests may sensitize the subjects to the purpose of the research study

AKA: Pretest sensitization

Example: pre-test before a film designed to reduce racism. The group who took the pretest may be primed and more motivated to pay attention to the film, as opposed to those who watch the film without a pretest

26
Q

Threats to External Validity: Demand Characteristics

A

What are they? cues in the research setting that may tip subjects off to the hypothesis

People pleasers may act in ways to confirm the hypothesis, while others may act to disprove it

27
Q

Threats to External Validity: Hawthorne Effect

A

What is it? research subjects may behave differently simply because they are participating in research

28
Q

Threats to External Validity: Order Effects

AKA Carryover effects & Multiple Treatment Interference

A

What is it? DV is impacted by other aspects of the study

Example: subjects get three treatments, always in the same order. Last treatment may show the best results, but there’s no way of knowing if it’s just from that treatment, or from impacts of the previous two

29
Q

Stratified Random Sampling

Protecting External Validity

A

Take a random sample from subgroups of a population

Example: random sample of different age groups

30
Q

Cluster Sample

Protecting External Validity

A

The unit of sampling is a naturally occurring group of individuals
Example: residents of a city

31
Q

Naturalistic Research

Protecting External Validity

A

Behaviour is observed and recorded in its natural setting

Reduces many external validity concerns, but has no internal validity

32
Q

What is analogue research?

Protecting External Validity

A

Results of lab studies are used to draw conclusions about real-world phenomena

E.g. Milgram’s obedience studies

33
Q

Single and Double-Blind Research

Protecting External Validity

A

Single Blind: subjects don’t know what group they are in
Double Blind: neither subjects nor researchers know which group subjects are in

Reduce demand characteristics, researcher bias and Hawthorne effect

34
Q

Counterbalancing

Protecting External Validity

A

Controls for order effects by ensuring treatments are received in different orders

Latin Square Design: order the administration of treatments so that each appears only once in each position
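
A Latin square can be sketched by rotating the treatment list one step per row. This is a minimal sketch of the simple cyclic construction; the function name `latin_square` and the treatment labels A, B, C are hypothetical.

```python
def latin_square(treatments):
    """Cyclic Latin square: row i is the treatment list rotated left by i."""
    n = len(treatments)
    return [[treatments[(row + pos) % n] for pos in range(n)] for row in range(n)]

# Each row is the order one subject (or subgroup) receives the treatments in
orders = latin_square(["A", "B", "C"])
for order in orders:
    print(" -> ".join(order))
# A -> B -> C
# B -> C -> A
# C -> A -> B
```

Each treatment shows up exactly once in the first, second and third position, so no treatment is always last.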

35
Q

True Experimental Research

A

Subjects randomly assigned to groups
Groups receive different levels of manipulated variable

Greatest for internal validity

36
Q

Quasi Experimental Research

A

When to use? when random assignment is not possible

Example: studying a learning program that is being introduced to all grade 1 classes

Next best for internal validity

37
Q

Correlational Research

What is it used for?
Does it have Internal Validity?

A

Internal Validity? correlational research has none
Use for? Prediction, esp. for variables that can’t be manipulated

38
Q

Developmental Research: 3 types

A

**Goal:** Assessing variables over time
Longitudinal: same people studied over long time
* Pitfall: underestimate changes, because it’s often those who drop out that have the most significant changes
Cross-Sectional: different groups of subjects, divided by age, are assessed at same time
* Pitfall: cohort effects lead to overestimation of differences (e.g. may not account for an aid a different generation had, which is responsible for helping memory)
Cross-Sequential: combines the two. Samples of diff groups are assessed more than once

39
Q

Time-Series Design

What is it?
What are the benefits?

A

Take multiple measurements over time (e.g. multiple pretest/posttest) to assess effects of IV

Benefits: controls for threats to internal validity. You can add a control group to help with history effects

Example: smoking reduction program in school. degree of post test results can indicate if it was a confounding factor or a result of the program

40
Q

Single Subjects Design

A

Can be one subject, or multiple that are treated as one group

Used for: behaviour modification research

Dependent variable measured multiple times during phases of the study (phase 1-no treatment/phase 2-treatment)

41
Q

Single Subject Design: AB Design

A

Single baseline and single treatment phase

Phase 1: collect data on frequency of behaviour before treatment
Phase 2: give treatment, collect data on if it reduced behaviour

42
Q

Single Subject Design: Reversal (Withdrawal)

A

Benefits: controls for extraneous factors, which AB does not

What does it do? give treatment, withdraw treatment and reassess, and then (in ABAB) provide treatment again. If the behaviour returns toward baseline once treatment is withdrawn, the effect was likely due to treatment

Types:
ABA: baseline -> treatment -> withdraw
ABAB: baseline -> treatment -> withdraw -> treatment

43
Q

Multiple Baseline Design

When to use?
Types of baselines to use

A

Used when: reversal not possible for ethical reasons

It doesn’t involve withdrawal of treatment

Treatment applied sequentially

Multiple Baseline Across Behaviours: start with one behaviour, then use same treatment for another
Multiple Baseline Across Settings: home, school
Multiple Baseline Across Subjects: try treatment on another subject

44
Q

Qualitative Research: Surveys

Types
Risks/Benefits

A

Cons: many threats to validity
Pros: can try to ensure random sample
Types: personal interviews, telephone surveys, mail surveys

45
Q

Qualitative Research: Case Studies

A

Con: lack internal and external validity
Pro: thorough on one person

Useful as pilot studies that can ID variables to be studied in a more systematic manner

46
Q

Qualitative Research: Protocol Analysis

A

What is it? research involving the collection and analysis of verbatim reports

Example: subject thinks aloud while doing something, which is then analyzed to look for themes/concepts evident as the subject performed the task

47
Q

Scales of Measurement: Nominal Data

A

Unordered categories, none of which are higher than the others

E.g. male/female

48
Q

Scales of Measurement: Ordinal Data

A

Provides info about the ordering of categories, but not specifics

E.g. agree, strongly agree, neutral, etc.

49
Q

Scales of Measurement: Interval Data

A

Numbers are scaled at equal distances, but the scale has no absolute zero point

e.g. IQ scores, temperature in °C or °F

Multiplication and division aren’t meaningful (no ratios), but addition and subtraction are

50
Q

Scales of Measurement: Ratio Data

A

Identical to interval, but they have an absolute zero

E.g. dollar amounts, time, distance, height, weight, frequency of behaviours per hr

51
Q

What does a Frequency Distribution provide? How are they displayed?

A

A summary of a set of data

tables, bar graphs, histograms

52
Q

Normal Distribution

A

Symmetrical, half scores above mean and half below

Most scores are close to mean

53
Q

Skewed Distributions

A

May happen with ceiling/floor effects

Negatively Skewed: has a tail on the left. Indicates easy test

Positively Skewed: has a tail on the right. Indicates difficult test

54
Q

Measures of Central Tendency: the mean

A

Arithmetic average
Add all values and divide by n
Con: sensitive to extreme values

55
Q

Measures of Central Tendency: the median

A

What is it? The middle value of data when ordered from lowest to highest (Md)

Odd groups: literally the middle number
Even groups: mean of the two middle numbers

Pros: not as affected by extreme scores, so good for skewed distributions

56
Q

Measures of Central Tendency: the mode

A

What is it? the most frequent value in a set of numbers

May have multiple modes (bimodal/multimodal)

57
Q

Relationship between the Mean, Median & Mode

A

Normal Distribution: all equal

Positively Skewed Distribution: mean higher than median, median higher than mode
Negatively Skewed Distribution: mean is less than median, mode is more than median
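
The ordering for a positively skewed distribution can be checked on a small data set. This is a quick sketch using Python's standard `statistics` module; the scores are invented, with a tail on the right.

```python
from statistics import mean, median, mode

# Made-up scores: most are low, a few high scores form a right-hand tail
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

m, md, mo = mean(scores), median(scores), mode(scores)
print(m, md, mo)  # 4.5 3.0 2
```

The high scores in the tail pull the mean up the most, the median less, and leave the mode alone, which gives mean > median > mode.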

58
Q

Measures of Variability: the range

A

What is it? the difference between the highest and lowest scores

Cons: impacted by extremes, so doesn’t give accurate representation of the distribution

59
Q

Measures of Variability: The Variance

A

What is it? The average of the squared differences of each observation from the mean

For me: Get the mean. How far is each score from the mean? Square that distance, and then add them all up. Take an average of the sum. This is variance.

What to know?
1. measure of variability of distribution
2. many stat tests use it in formulas
3. It’s equal to the square of the SD
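
The steps above can be sketched on a small set of scores. This is an illustration only: the scores are invented, and the sum is divided by N, as the card describes (the sample estimate used in many tests divides by N - 1 instead).

```python
from math import sqrt

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data

mean = sum(scores) / len(scores)                      # step 1: get the mean
squared_devs = [(x - mean) ** 2 for x in scores]      # step 2: square each distance from the mean
variance = sum(squared_devs) / len(scores)            # step 3: average the squared distances
sd = sqrt(variance)                                   # SD is the square root of the variance

print(mean, variance, sd)  # 5.0 4.0 2.0
```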

60
Q

Measures of Variability: the standard deviation

A

What is it? the expected deviation from the mean of a score chosen at random

Higher SD = more scores are likely to deviate from the mean

61
Q

Transformed Scores: z-scores

A

What are they? raw scores stated in standard deviation terms. Measures how many SDs a raw score is from the mean

Calculate by: subtract the mean from the raw score, and divide by the SD

Pro: can compare across different measures and tests
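
The calculation can be sketched in a couple of lines. The function name `z_score` and the example numbers (an IQ-style scale with mean 100 and SD 15, and a test with mean 50 and SD 10) are assumptions for illustration.

```python
def z_score(raw, mean, sd):
    """How many SDs the raw score sits above (+) or below (-) the mean."""
    return (raw - mean) / sd

print(z_score(115, 100, 15))  # 1.0  -> one SD above the mean
print(z_score(40, 50, 10))    # -1.0 -> one SD below the mean
```

Both scores are one SD from their own mean, which is what makes them comparable across different tests.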

62
Q

Transformed Scores: t-scores

A

What are they? Standardized scores with a mean of 50 and an SD of 10
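
Given a z-score, the conversion to this scale can be sketched as 50 + 10z. That is the usual conversion for a scale with a mean of 50 and an SD of 10; the function name `t_score` is made up.

```python
def t_score(z):
    """Rescale a z-score so the mean becomes 50 and the SD becomes 10."""
    return 50 + 10 * z

print(t_score(1.0))   # 60.0 -> one SD above the mean
print(t_score(-1.5))  # 35.0 -> one and a half SDs below the mean
```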

63
Q

Percentile Ranks: what shape is their distribution and what does it mean?

A

Shape: It is flat/rectangular.

Means: within a given number of percentile ranks, there will always be the same number of scores

64
Q

Standard Deviation Curve: 4 things to know

A
  1. In a normal distribution, 68% of scores fall between -1.0Z and +1.0Z
  2. In a ND, 95% of scores fall between z scores -2.0 and +2.0
  3. In a ND, z-score +1.0 is a percentile rank of 84 (top 16%). -1.0 z-score is a PR of 16 (bottom 16%)
  4. In an ND, z-score of +2.0 is 98th PR (top 2%). z-score -2.0 is PR of 2 (bottom 2%)
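
These four figures can be checked against the standard normal curve. A small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8+); the variable names are made up.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)   # z-scores: mean 0, SD 1

within_1 = std_normal.cdf(1) - std_normal.cdf(-1)   # area between z = -1.0 and +1.0
within_2 = std_normal.cdf(2) - std_normal.cdf(-2)   # area between z = -2.0 and +2.0
pr_plus_1 = std_normal.cdf(1) * 100                 # percentile rank of z = +1.0
pr_plus_2 = std_normal.cdf(2) * 100                 # percentile rank of z = +2.0

print(f"{within_1:.1%} {within_2:.1%}")      # 68.3% 95.4%
print(f"{pr_plus_1:.0f} {pr_plus_2:.0f}")    # 84 98
```

The flashcard values (68%, 95%, 84, 98) are these figures rounded.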
65
Q

Where are Percentile Rank Scores Clustered?

A

Most raw scores fall near the mean (e.g. PR 50-84 covers z = 0 to +1.0)
Fewer fall at the extreme end (e.g. PR 84-98 covers z = +1.0 to +2.0)

66
Q

What is the point of Inferential Statistics?

A

To allow us to make inferences about the population based on a sample

67
Q

What is Sampling Error?

Inferential Statistics

A

The inevitable discrepancy between a sample statistic and the corresponding population value

68
Q

What is the Standard Error of the Mean?

A

The extent to which a sample mean can be expected to deviate from its corresponding population mean

69
Q

What is the relationship between Standard Error of the Mean and Sample Size?

A

As sample size increases, the standard error decreases

INVERSE relationship
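
The inverse relationship can be seen with the usual formula, standard error = SD / √N. The formula is the standard one rather than something stated on the card, and the SD of 15 and the sample sizes are invented.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the mean for a sample of size n."""
    return sd / sqrt(n)

for n in (25, 100, 400):
    print(n, standard_error(15, n))
# 25 3.0
# 100 1.5
# 400 0.75
```

Quadrupling the sample size halves the standard error.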

70
Q

Null VS Alternative Hypothesis

Inferential Statistics

A

Null: no difference between means of sampled populations. IV has no effect on DV.
Alternative: IV does have an effect on DV

71
Q

4 possible outcomes of testing a null hypothesis

A
  1. Retain null, no difference exists in population (correctly retained)
  2. False null rejected, differences do exist in population (correctly rejected)
  3. Null rejected, no differences exist (incorrectly rejected)
  4. False null retained, differences do exist (incorrectly retained)
71
Q

One-tailed VS Two-tailed Hypotheses

Inferential Statistics

A

One-tailed: we hypothesize a particular direction. E.g. we anticipate one mean to be significantly higher than the other mean

Two-tailed: hypothesize a difference in means, but not in what direction.

72
Q

Type I Error

A

Null hypothesis is rejected but it is true

You think you have something but you really don’t

73
Q

Alpha Level and Type I Error

Inferential Statistics

A

Set by the researcher in advance, and it is the probability of making a Type I Error

Commonly set at .05 or .01

74
Q

Type II (Beta) Error

Inferential Statistics

A

Fail to reject null hypothesis, but it is false

Thinking you don’t have something when you really do

75
Q

Type II Error and Power

Inferential Statistics

A

Power: the probability of NOT making a Type II error
* 1-beta
* Sensitivity of a statistical test to detect an existing difference

76
Q

What affects Power?

Inferential Statistics

A
  1. Sample size: larger sample = higher power
  2. Alpha: higher alpha level = higher power
  3. One-tailed tests are more powerful
  4. Magnitude of Population Difference: more difference between population means = more likely to detect them. Can impact this by increasing the difference between levels of the IV
77
Q

Parametric Tests: what are they used for and what are their assumptions?

Inferential Statistics

A

Used for: interval and ratio data
Assumptions:
1. Normal distribution of DV. Robust
2. Homogeneity of Variance: variance of groups is equal. Robust
3. Independence of Observations: scores within same sample or group shouldn’t be correlated (if they are, it means the scores could be impacted by a group factor) Not robust

78
Q

Nonparametric Tests

Used for?
How are they similar/different than parametric?
Name 2 types

Inferential Statistics

A

Used for: DV measured on ordinal or nominal scale

Differences from Parametric:
* don’t assume normal distribution
* Less powerful

Similarity to Parametric:
* assume data come from unbiased sample

Types:
* chi-square
* Mann-Whitney U

79
Q

How to decide to reject the null hypothesis?

Inferential Statistics

A

The obtained stat value is compared to a critical value in table
1. depends on the pre-set alpha level
2. degrees of freedom for the test

80
Q

t-test: what is it used for? what does it mean?

A

**Used for:** To test hypotheses about 2 different means
It cannot be used for more than 2 means

Means: t-ratio, if significant, indicates that the means are different

81
Q

One-sample t-test: when to use

Inferential Statistics

A

When a study involves only one sample

Compare one mean to a known population mean

Rarely used

degrees of freedom: N - 1

82
Q

T-test for independent samples

Degrees of Freedom?

When to use?

Inferential Statistics

A

Used for: compare 2 means from unrelated samples (e.g. treatment & control group; test scores of students from different schools; avg height of men & women)
Degrees of Freedom: N - 2
Assumptions:
-Homogeneity of variances
-Data in each group ~normally distributed
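
A minimal sketch of how the t-ratio and its degrees of freedom come out for two unrelated groups. It assumes the standard pooled-variance formula, which the card does not spell out, and uses invented scores; the function name `independent_t` is hypothetical.

```python
from math import sqrt

def independent_t(group_a, group_b):
    """t-ratio for two unrelated samples, assuming equal variances (pooled)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = sum(group_a) / n_a, sum(group_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in group_a)   # squared deviations, group A
    ss_b = sum((x - mean_b) ** 2 for x in group_b)   # squared deviations, group B
    df = n_a + n_b - 2                               # N - 2
    pooled_var = (ss_a + ss_b) / df
    se_diff = sqrt(pooled_var * (1 / n_a + 1 / n_b)) # standard error of the difference
    return (mean_a - mean_b) / se_diff, df

# Made-up scores for a treatment and a control group
t, df = independent_t([4, 5, 6, 7, 8], [1, 2, 3, 4, 5])
print(round(t, 2), df)  # 3.0 8
```

The obtained t would then be compared with the critical value for df = 8 at the chosen alpha level.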

83
Q

Paired Samples T-Test

When to use?
Degrees of Freedom?

Inferential Statistics

A

Used for: samples that are related to each other somehow (e.g. matched sample, pretest-posttest)

Degrees of Freedom: N - 1 (N is the number of pairs of scores)

84
Q

One-Way Analysis of Variance (ANOVA): when to use

Inferential Statistics

A
  • In a study w/ one independent variable where the means of more than two groups are compared
  • What is the probability that these means are from the same population?
  • F Ratio: if significant, the null is rejected
  • It doesn’t tell you which means are different, so must do post-hoc tests
85
Q

ANOVA: what does the F ratio represent?

Inferential Statistics

A

It represents a comparison between 2 estimates of variance
1. Between-group variance
2. Within-group variance

86
Q

ANOVA: F ratio and significant relationship

How does it affect the 2 types of variance?

Inferential Statistics

A

If the null hypothesis is true, the 2 estimates of variance should be about the same

If null hypothesis false, the between-group variance should be HIGHER than within-group

Differences between group means should be large enough to not be accounted for by error

87
Q

What is the ANOVA fraction?

Inferential Statistics

A

variance between groups/variance within groups

If top is BIG and bottom SMALL, that means it’s significant

88
Q

ANOVA: sum of squares

Inferential Statistics

A

What does it do? measure of the variability of a set of data
In ANOVA Summary Table:
1. between-group sum of squares
2. within-group sum of squares (error/residual)
3. Total Sum of Squares

Used to calculate the F-Ratio

89
Q

ANOVA: degrees of freedom

Inferential Statistics

A

Two Types
1. df between (k - 1)
2. df within (N - k)
* k = number of groups
* N = total number of observations

90
Q

ANOVA: mean square

Inferential Statistics

A

What is it? the stat measure to estimate between and within-group variance
Mean Square Between: sum of squares between / df between
Mean Square Within: sum of squares within / df within

91
Q

ANOVA: how to get the F ratio

What two other values are used?

A

Equation:
mean square between / mean square within
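
The path from sums of squares to the F ratio can be sketched end to end. This is a worked illustration on invented scores for three groups; the function name `one_way_anova` and the dictionary keys are made up.

```python
def one_way_anova(groups):
    """Build the pieces of a one-way ANOVA summary table by hand."""
    all_scores = [x for g in groups for x in g]
    n_total, k = len(all_scores), len(groups)
    grand_mean = sum(all_scores) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Sums of squares: group means around the grand mean, scores around their group mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between   # mean square between
    ms_within = ss_within / df_within      # mean square within
    return {"df_between": df_between, "df_within": df_within,
            "ms_between": ms_between, "ms_within": ms_within,
            "F": ms_between / ms_within}

# Made-up scores: three groups of three subjects
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result["df_between"], result["df_within"], round(result["F"], 2))  # 2 6 13.0
```

Here the between-group estimate (13) is far larger than the within-group estimate (1), which is the pattern expected when the null hypothesis is false.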

92
Q

ANOVA: what do post-hoc tests do?

Inferential Statistics

A

They make pairwise comparisons or complex comparisons between means

Pair wise: compare means of novel treatment group to mean of typical treatment group

Complex: compare combined mean of novel treatment and typical treatment with the control group mean

93
Q

ANOVA: risks of doing multiple post-hoc comparisons

A

Increases risk of Type I error

94
Q

ANOVA

When to use certain post-hoc tests

A
  1. Scheffe test is most conservative (most protection against Type I error, but increases chances of Type II)
  2. If only doing pairwise comparisons, Tukey is the best one to choose
95
Q

One-way ANOVA for repeated measures

A

Used when all subjects receive all levels of the IV (e.g. group receives novel treatment and typical treatment)

96
Q

ANCOVA: when to use it?

A

When you need to adjust dependent variable scores to control for effects of extraneous variables

97
Q

Factorial ANOVA: when to use?

A

When study has more than one IV and you want to look at the effects of each IV separately (main effect) but also together (interactions)

It helps you see the bigger picture, as the reality is that multiple factors play into dependent variables

98
Q

Factorial ANOVA: main effect

A

the effect of one independent variable by itself

99
Q

Factorial ANOVA: interaction

A

effects of an independent variable at the different levels of the other independent variables

E.g. one-sided versus two-sided communication have different effectiveness based on a person’s intelligence

In graphs, can be seen as intersecting lines (e.g. in an X pattern)

100
Q

What are some variations of the Factorial ANOVA?

A

Mixed ANOVA: more than one independent variable, with at least 1 between-subjects IV and at least 1 repeated-measures (within-subjects) IV

101
Q

Multivariate Analysis of Variance (MANOVA): when to use

A

When: study involves 2+ dependent variables and 1+ independent variable. You want to look at the effect of each IV separately but also together

Why use it over multiple one-way ANOVA or factorial ANOVA? Reduces the likelihood of Type I error

102
Q

Chi-Square Test: when to use?

Nonparametric

A

Use for: categorical data (nominal)
e.g. survey results

Means: if significant, the obtained frequencies in a set of categories differ significantly from those expected under the null hypothesis

103
Q

How to calculate df in chi-square test?

Single sample & Multiple sample

A

Single sample chi-square: C - 1
Multiple sample chi-square: (C - 1)(R - 1)
C = no. of columns (categories)
R = no. of rows

103
Q

3 considerations in using Chi-Square Test

A
  1. observations can’t be related to one another, so it can’t be used in before-after studies
  2. each observation classified into only one category/cell (e.g. you can only belong to one political party)
  3. Percentages of observations w/i categories can’t be compared. Frequency data is required.
104
Q

How to calculate expected frequencies in chi-square tests?

A

Single Sample:
divide no. of subjects by the number of cells
Multiple Sample:
for each cell, multiply its row total by its column total and divide by the total no. of subjects
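
Expected frequencies for both cases can be sketched as below. The multiple-sample rule used here is the standard (row total × column total) ÷ N; the function names and the observed counts are invented for illustration.

```python
def expected_single(observed):
    """Single sample: total N spread evenly over the cells."""
    n = sum(observed)
    return [n / len(observed)] * len(observed)

def expected_multiple(table):
    """Multiple sample: (row total * column total) / N for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells (flat lists)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Single sample: 60 made-up subjects choosing among 3 categories, df = C - 1 = 2
obs_single = [30, 20, 10]
exp_single = expected_single(obs_single)
chi_single = chi_square(obs_single, exp_single)

# Multiple sample: made-up 2 x 2 table of frequencies, df = (C - 1)(R - 1) = 1
obs_table = [[20, 30], [30, 20]]
exp_table = expected_multiple(obs_table)
flatten = lambda t: [cell for row in t for cell in row]
chi_table = chi_square(flatten(obs_table), flatten(exp_table))

print(exp_single, chi_single)  # [20.0, 20.0, 20.0] 10.0
print(exp_table, chi_table)    # [[25.0, 25.0], [25.0, 25.0]] 4.0
```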

105
Q

Mann-Whitney U

Non-parametric tests

A

Used for:
* rank ordered ordinal data w/ two independent groups
* Assumptions of independent t-test not met

Used when:
1. when data from a research study are rank-ordered
2. 2 independent groups you want to compare
3. Assumptions of parametric tests are not met
4. Ordinal data (ranked but differences between ranks aren’t consistent)

106
Q

Wilcoxon Matched-Pairs Test

Nonparametric test

A

Used for:
* Comparing two related groups (repeated measures) using rank-ordered data
* Assumptions for paired t-test not met
* Is there a consistent difference between the two sets of paired data?

Used when:
1. assumptions of parametric are not met

107
Q

Kruskal-Wallis Test

Nonparametric test

A
  • Comparing 3+ groups
  • Data not normally distributed; assumptions for ANOVA not met
  • Ordinal data
  • Question: Is there a significant difference in the ranks of the data between the groups?
108
Q

Pearson r Correlation Coefficient

A
  • Used when calculating the relationship between two variables measured on an interval or ratio scale
  • Calculated based on z-scores, but don’t need to know specifics
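
For the curious, the z-score link can be sketched as follows: r is the average product of paired z-scores. This is an illustration with invented data; the function names are made up, and SDs are computed by dividing by N.

```python
from math import sqrt

def z_scores(values):
    """Convert raw scores to z-scores (SD computed by dividing by N)."""
    m = sum(values) / len(values)
    sd = sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / sd for v in values]

def pearson_r(x, y):
    """Pearson r as the mean of the products of paired z-scores."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

# Made-up paired scores on two interval-scale measures
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3), round(r ** 2, 2))  # 0.775 0.6
```

Squaring r gives the proportion of shared variance, here about 60%.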
109
Q

What affects the Pearson r?

3 things (LHR)

A
  1. Linearity: assumes linear relationship between two variables, so can’t be used for curvilinear relationships
  2. Homoscedasticity: refers to an equal distribution of scores throughout the scattergram. Heteroscedasticity is when they are not equally dispersed. It lowers the r
  3. Range of Scores: wider range makes for more accurate correlation
110
Q

What is the coefficient of determination?

Regression

A
  • The squared correlation coefficient
  • Indicates the percentage of variability in one measure (the criterion/DV) that is accounted for by the variability in another measure (the predictor/IV)
  • E.g. .70 correlation between IQ and grades = 49% of the variation in grades is explained by IQ (get this by squaring the correlation coefficient)
111
Q

Point-Biserial and Biserial Coefficients

Correlation Coefficients

A

Point-Biserial:
* Look @ relationship between a continuous variable and dichotomous variable

Biserial:
* Look @ relationship between one continuous variable and an artificially dichotomized variable (a continuous variable that has been divided up)

112
Q

Phi and Tetrachoric Coefficients

Correlation

A

Phi Coefficient:
* 2 naturally dichotomous (binary) variables
* No assumption about distribution

Tetrachoric Coefficient:
* Two artificially dichotomized variables
* Assumes the variables are continuous and that they follow a normal distribution

113
Q

Contingency

Correlation coefficients

A
  • Correlation between two nominally scaled variables (unordered variables, each having more than two categories)
  • Describes how two categorical variables are related
  • Uses contingency tables
  • Things like Phi, Chi-square measure the strength of the associations between variables
114
Q

Spearman’s Rho

Correlation coefficients

A
  • Correlation measure between two variables w/ an ordinal scale (ranked data)
  • Relationship doesn’t have to be linear, only monotonic; data may have outliers
  • If data was linear and continuous w/o ranks, pearson r would be used
  • E.g. same students ranked on two different tests, Rho could be used to correlate them
115
Q

Eta

Correlation coefficients

A
  • Strength of relationship between categorical and continuous variable
  • This measures NON-LINEAR relationships
  • Eta (η): the correlation ratio, indexing how strongly the continuous variable is related to the categorical variable
  • Eta squared (η²): expresses the PROPORTION of variance in the continuous variable that can be explained by the categorical variable
  • Often used with ANOVA
  • Ranges from 0 to 1
116
Q

What is the purpose of a regression?

A
  • An equation that is used to estimate the value of one variable based on the value of another
  • It finds the line of best fit, which is used to predict the dependent variable
  • E.g. can the EPPP score predict my performance ratings as a psychologist?
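
The line of best fit can be sketched with the least squares formulas for one predictor. This is an illustration only: the data and function names are invented, and the slope and intercept formulas are the standard ones rather than anything given on the card.

```python
def fit_line(x, y):
    """Least squares slope and intercept for predicting y (criterion) from x (predictor)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x_value, slope, intercept):
    """Estimated criterion score for a given predictor score."""
    return intercept + slope * x_value

# Made-up predictor and criterion scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = fit_line(x, y)
print(round(slope, 2), round(intercept, 2), round(predict(6, slope, intercept), 2))  # 0.6 2.2 5.8
```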
117
Q

What variables are in a regression?

A
  1. Predictor/Independent Variable
  2. Criterion/dependent Variable

Don’t need to know the equation

118
Q

Assumptions of Regression

A
  1. Linear Relationship
    This is often depicted with the line of best fit (determined using least squares criterion)
  2. Normality of Residuals: error scores are normally distributed w/ a mean of 0
  3. Independence of errors
  4. Homoscedasticity: variance of residuals is constant across levels of IVs
  5. No perfect Multicollinearity: IVs not highly correlated w/ each other
  6. Exogeneity: IV’s not correlated w/ error term
  7. Correct Model Specification: includes relevant variables, excludes unnecessary ones
119
Q

How to Substitute Regression for ANOVA?

A

Code each subject's status on the IV using numbers (e.g. dummy codes such as 0 = control, 1 = treatment), which are then entered into the regression equation to predict the DV
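
A minimal sketch of the idea, assuming a two-group IV coded 0/1: with that coding the regression intercept equals the mean of the group coded 0 and the slope equals the difference between group means, which is the comparison an ANOVA would test. The scores are hypothetical.

```python
def least_squares_line(x, y):
    """Slope and intercept of the least-squares line predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x

# Hypothetical DV scores for two groups.
control = [4, 5, 6]
treatment = [7, 8, 9]

# Dummy-code group membership: 0 = control, 1 = treatment.
codes = [0] * len(control) + [1] * len(treatment)
scores = control + treatment

slope, intercept = least_squares_line(codes, scores)
print(intercept)  # mean of the control group
print(slope)      # treatment mean minus control mean
```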

120
Q

What is Multiple Correlation Coefficient?

A
  • A measure of how well multiple IV’s predict the DV in a multiple regression
  • Higher values = stronger relationship between combo of predictor variables and the criterion variable
  • R ranges from 0 to 1
  • R2 (coefficient of determination) tells you the PROPORTION of variance in DV that is explained by IVs
121
Q

What is multiple regression?

A

The scores on more than one predictor are used to estimate scores on a criterion

122
Q

4 Things to understand about multiple correlation/multiple regression?

A
  1. Multiple correlation coefficient is highest when predictor variables have high correlations with criterion but low with each other (multicollinearity = they correlate)
  2. Multiple correlation coefficient is never lower than the highest simple correlation between an individual predictor and the criterion
  3. Multiple R can never be negative
  4. Can be squared (coefficient of multiple determination)
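
The first two points can be checked numerically with the two-predictor formula for multiple R, which needs only the three pairwise correlations. The correlation values below are hypothetical.

```python
from math import sqrt

def multiple_r(r_y1, r_y2, r_12):
    """Multiple R for two predictors, from the three pairwise correlations.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return sqrt(r_squared)

# Both predictors correlate .50 with the criterion in each scenario.
r_uncorrelated = multiple_r(0.5, 0.5, 0.0)  # predictors unrelated to each other
r_correlated = multiple_r(0.5, 0.5, 0.5)    # predictors overlap (collinear)

print(round(r_uncorrelated, 2))  # about 0.71
print(round(r_correlated, 2))    # about 0.58
```

In both scenarios R stays at or above .50, the highest simple correlation, and it is larger when the predictors do not overlap.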
123
Q

What is multicollinearity?

Multiple Regression

A

When predictors have high correlations with one another in a multiple regression

124
Q

What is the Coefficient of Multiple Determination?

Multiple Regression

A
  • The multiple correlation squared
  • Indicates the proportion of variance in the criterion variable accounted for by combo of predictor variables
  • Ranges from 0 to 1, w/ higher scores meaning the IVs provide a good fit to the data
125
Q

Stepwise Multiple Regression

Forward & Backward Multiple Regressions

A

When to use? If you have a large number of potential predictors, but want to use a small subset of them for the final equation

Forward Stepwise MR: start with one predictor and add others to the equation one at a time. Check predictive power after each one.
MOST COMMON ONE

Backward Stepwise Regression: start with all potential predictors, remove them one at a time and check for predictive power.

126
Q

Canonical Correlation

A
  • This is used when there are multiple criterion and multiple predictor variables, and you want to understand their overall relationship
  • Looks @ two SETS of variables
  • Creates LINEAR COMBINATIONS of the variables that are maximally correlated w/ one another
127
Q

Discriminant Function Analysis

A
  • Creates discriminant functions (linear combos of the IV’s) that best distinguish between groups
  • Used when: to classify cases into groups, or find which variables best differentiate between groups
  • Wilks' Lambda test looks at how well a DFA separates the groups
  • Eigenvalues: amount of variance in DV explained by DF
  • Canonical Correlation: tells you how well IV’s explain group diffs

How is it different from multiple regression?
It predicts criterion GROUP rather than criterion SCORE
E.g. high achievement or low achievement group, rather than specific scores

128
Q

What is Differential Validity?

Discriminant Function Analysis

A

A characteristic in which the predictors involved in classifying people into criterion groups should have a different correlation with each criterion variable

E.g. if you are trying to predict which major a uni student will excel in, the predictor variables should all be different to differentiate between possible groups (english, science, etc)

129
Q

Logistic Regression

What is it used for?
What do scores mean?
How is it different from DFA?

A

Used for: make predictions about which criterion group a person belongs to

How is it different from Discriminant Function Analysis?
* doesn’t rely on the same assumptions
* predictors can be nominal (categorical) or continuous

Use when:
* with dichotomous dependent variables (e.g. responder/non-responder to therapy)

Scores:
0-1
0.80 = 80% chance of being a responder
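
A rough sketch of where the 0-1 score comes from: logistic regression predicts log-odds, and the logistic function converts them into a probability. The intercept and slope here are invented for illustration, not fitted to any data.

```python
from math import exp, log

def predicted_probability(intercept, slope, x):
    """Convert a linear predictor (log-odds) into a probability between 0 and 1."""
    log_odds = intercept + slope * x
    return 1 / (1 + exp(-log_odds))

# Hypothetical coefficients chosen so the odds of responding are 4 to 1 at x = 1.
p_responder = predicted_probability(intercept=0.0, slope=log(4), x=1)
print(round(p_responder, 2))  # 0.8, i.e. an 80% chance of being a responder
```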

130
Q

What are the assumptions of the Discriminant Function Analysis?

A
  1. Normality
  2. Homogeneity of Variance-Covariance Matrices
  3. Independence: observations independent of each other
  4. Linearity: relationships between IV’s linear
131
Q

Multiple Cutoff

Correlation & Regression

A

Cut offs are used for each predictor, and missing the cut off of even one predictor eliminates you

E.g. job selection in which you need ALL eligibility criteria

Compared to multiple regression, in which high scores on one predictor can compensate for lower scores on another (e.g. GRE scores)
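
The contrast can be sketched with one hypothetical applicant who is strong on one predictor and weak on another. The predictor names, cutoffs, and weights are all assumptions made for this example.

```python
def passes_multiple_cutoff(scores, cutoffs):
    """Non-compensatory rule: every predictor must meet its own cutoff."""
    return all(scores[name] >= cutoffs[name] for name in cutoffs)

def compensatory_score(scores, weights):
    """Regression-style rule: a weighted sum, so strengths offset weaknesses."""
    return sum(weights[name] * scores[name] for name in weights)

# Hypothetical applicant: very high on one predictor, low on the other.
applicant = {"interview": 9, "typing": 3}

cutoffs = {"interview": 5, "typing": 5}
weights = {"interview": 1.0, "typing": 1.0}
overall_cutoff = 10

passes_cutoff_rule = passes_multiple_cutoff(applicant, cutoffs)
passes_compensatory_rule = compensatory_score(applicant, weights) >= overall_cutoff

print(passes_cutoff_rule)        # False: the typing score misses its cutoff
print(passes_compensatory_rule)  # True: the interview score compensates
```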

132
Q

Partial Correlation

A

If a relationship between two variables is obtained, but you suspect that the relationship may be due to another variable, you can ‘partial out’ its effect

E.g. partial out hot weather in the correlation between ice cream sales and boat accidents
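
A minimal sketch of a first-order partial correlation, using made-up correlations in which the ice cream/boat accident relationship is entirely due to hot weather.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with the effect of z partialed out."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical correlations:
r_icecream_accidents = 0.48  # x with y
r_icecream_heat = 0.80       # x with z
r_accidents_heat = 0.60      # y with z

r_partial = partial_correlation(
    r_icecream_accidents, r_icecream_heat, r_accidents_heat
)
print(round(r_partial, 3))  # essentially zero once heat is removed
```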

133
Q

What is a Suppressor Variable?

Partial Correlation

A

This is a spurious/extraneous variable that reduces the correlation rather than inflating it

E.g. reading skill affecting scores on a selection test for a job that doesn’t require reading skill

134
Q

Structural Equation Modeling

A

What is it? A general term for techniques that are based on correlations between multiple variables

Assumptions: linear relationship between variables

Used for: testing causal models based on multiple variables

135
Q

Steps for using Structural Equation Modeling to Test Causal Models based on Multiple Variables

A
  1. Specify a causal model involving many variables: IQ -> education -> empathy -> parenting -> children's IQ
  2. Conduct Stat Analysis: correlation between all pairs of variables
  3. Interpret Results of Analysis: show if the data are consistent with the model
136
Q

Path Analysis

Structural Equation Modeling

Correlation & Regression

A

Verify causal models that propose one-way causal flows between variables

Can be used only with observed variables (what you measure)

137
Q

LISREL

Structural Equation Modeling

Correlation & Regression

A

can be used with one-way and two-way causal relationships

E.g. prediction that self esteem increases work success, which in turn leads to more self esteem

Uses observed and latent (inferred) variables

138
Q

Trend Analysis

Correlation & Regression

A

Used when:
* both variables are quantitative (interval/ratio)
* interested in the trend of change rather than magnitude

Break points: point where the scores for subjects change direction in a predictable way

What does it tell you? what trends are significant

139
Q

Theoretical Sampling Distribution

A
  1. Population: whole set of something the research is interested in
  2. Sample Distribution: set of scores obtained from a sample of a population
  3. Sampling Distribution: multiple samples taken from population, and the means of those multiple samples are used to create a frequency distribution
    *samples must be same size
    *each population member must have same probability of being selected
140
Q

What is Sampling with Replacement?

Theoretical Sampling Distribution

A

When you pick a sample from a population, record the mean of that sample, and then return the sample to the population before you select your next sample

The ones you just put back have the same probability of ending up in the next sample as do all the rest

141
Q

Assumptions of Central Limit Theorem

A
  1. As sample size increases, the shape of the sampling distribution of means approaches normality. True even if the population distribution of scores isn’t normal
  2. The mean of the sampling distribution is equal to the mean of the population
142
Q

2 Assumptions of Sampling Distribution

A
  1. Sampling distribution has less variability than population distribution
  2. SD of sampling distribution (the standard error of the mean) is equal to population SD divided by square root of the size of the samples from which means were obtained
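
These properties can be checked with a small simulation. This is only a sketch: the skewed population, the sample size of 25, and the 2,000 repetitions are arbitrary choices, and the results are approximate.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical, clearly non-normal (right-skewed) population of 10,000 scores.
population = [random.expovariate(1.0) for _ in range(10_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

sample_size = 25
num_samples = 2_000

# Draw same-size samples with replacement and record each sample mean.
sample_means = [
    statistics.mean(random.choices(population, k=sample_size))
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)   # should sit near pop_mean
sd_of_means = statistics.pstdev(sample_means)   # should sit near expected_se
expected_se = pop_sd / sqrt(sample_size)

print(round(pop_mean, 3), round(mean_of_means, 3))
print(round(expected_se, 3), round(sd_of_means, 3))
```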
143
Q

What is test robustness?

Related to parametric test assumptions
Central Limit Theorem

A

When the rate of Type I errors is not increased by violations of the assumptions of parametric stat tests

Central Limit Theorem is why parametric tests are robust to violations of the normality assumption, provided that the sample size is large enough to bring normality to the sampling distribution

Homogeneity of variance assumption: parametric tests are robust so long as there is an equal no. of subjects in each experimental group

144
Q

Time-Series Analysis

A

You don’t need to have independence of observation to use this test (as opposed to t-tests)

Autocorrelation: correlation between observations in the same series separated by a given lag (e.g. lag 1 = each observation and the one right before it)
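
A minimal sketch of a lag-k autocorrelation, using one common estimator (lagged cross-products around the overall mean, divided by the total sum of squares). The series is hypothetical.

```python
def autocorrelation(series, lag):
    """Correlation of a series with itself shifted by `lag` observations."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numer / denom

# Hypothetical repeated measurements on one subject over time.
series = [1, 2, 3, 4, 5]

lag_1 = autocorrelation(series, 1)
print(lag_1)
```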

145
Q

Bayes’ Theorem

A

This is a formula used to get a special type of conditional probability

E.g. probability that an 85yo has Alzheimer’s given that they came up positive on a diagnostic test?
-Basically, what is the probability that they have it and not a false positive?
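
The diagnostic-test example can be sketched as below. The prevalence, sensitivity, and specificity values are invented for illustration and are not real figures for Alzheimer’s or any actual test.

```python
def posterior_probability(prevalence, sensitivity, specificity):
    """P(condition | positive test) via Bayes' theorem."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# Hypothetical numbers: 10% of the group has the condition, and the test is
# 90% sensitive and 90% specific.
p_condition_given_positive = posterior_probability(
    prevalence=0.10, sensitivity=0.90, specificity=0.90
)
print(round(p_condition_given_positive, 2))  # 0.5
```

With these made-up numbers, half of the positive results are false positives even though the test is right 90% of the time, because the condition is uncommon.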

146
Q

Meta-Analysis

What is it?
What measure does it use?

A

Multiple studies analyzed at once, each study becomes a separate subject

Effect Size: indicates magnitude of IV effect
calculated for each DV, then summed and divided by # of effects
It’s the difference between the means of the control group and treatment group, divided by SD of control group
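
A small sketch of the calculation described above. Dividing by the control group's SD is the version often called Glass's delta. The three studies and their numbers are hypothetical.

```python
def effect_size(treatment_mean, control_mean, control_sd):
    """Mean difference in control-group SD units (often called Glass's delta)."""
    return (treatment_mean - control_mean) / control_sd

# Hypothetical studies: (treatment mean, control mean, control SD).
studies = [
    (55, 50, 10),
    (58, 50, 10),
    (26, 24, 10),
]

effect_sizes = [effect_size(*study) for study in studies]

# Each study counts as one "subject": sum the effects, divide by their number.
mean_effect_size = sum(effect_sizes) / len(effect_sizes)
print(effect_sizes, round(mean_effect_size, 2))
```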