quiz 3 Flashcards

1
Q

what is correlation

A
  • Each individual is measured on two variables (X, Y)
  • We are interested in exploring the relationship between scores on X and scores on Y

2
Q

what is a bivariate scatterplot

A

plot of each individual's (X, Y) scores as a point on two axes; at this stage we do not yet know if there is a correlation

3
Q

explain 0 correlation, positive and negative

A
  • If there is no relationship between X and Y, the correlation is 0
  • If higher scores on X are associated with higher scores on Y, the correlation is positive
  • If higher scores on X are associated with lower scores on Y, the correlation is negative
4
Q

what is Pearson Correlation

A
  • Statistic that allows us to express the relationship between X and Y (r)
  • May take on values ONLY between -1 and +1
  • If there is no correlation between X and Y, then r = 0
  • If there is a positive correlation between X and Y, then r will be between 0 and +1
  • If there is a negative correlation between X and Y, then r will be between -1 and 0.
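
A minimal sketch of how r can be computed by hand, using only the Python standard library. The data values and variable names (`hours`, `score_up`, `score_down`) are made up for illustration and are not from the course material.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4, 5]             # hypothetical X scores
score_up = [52, 55, 61, 64, 70]     # Y tends to rise as X rises
score_down = [70, 64, 61, 55, 52]   # Y tends to fall as X rises

r_pos = pearson_r(hours, score_up)    # lands between 0 and +1
r_neg = pearson_r(hours, score_down)  # lands between -1 and 0
print(r_pos, r_neg)
```

The sign of each result matches the direction of the relationship, and neither value can fall outside the -1 to +1 range.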
5
Q

what are the two parts of a Pearson Correlation and what do they tell us

A
  • The sign (+/-) tells us whether the correlation is positive or negative
  • The magnitude (absolute value) tells us the strength of the relationship
6
Q

the closer a magnitude is to 1 means what?

A

the stronger the relationship.

7
Q

explain Perfect Positive Correlation

A
  • r = +1.00: every data point falls exactly on an upward-sloping straight line
  • In practice this essentially never happens with real data
  • Perfect Negative Correlation: r = -1.00
8
Q

what is significance testing looking at

A
  • We want to make inferences to the whole population based on a sample selected from the population.
  • Sampling error will always be involved.
  • We might find a positive correlation in our sample, but how do we know that the variables are actually correlated in the population?
  • How likely is it that I will make an error by claiming that the two variables are correlated in the population?
9
Q

what does significance testing use

A

p-value

10
Q

what does p-value tell us

A
  • Tells us how likely it is to get a sample correlation at least as strong as ours if there were actually NO correlation between the two variables in the population
  • p = .04 means that, if the two variables were uncorrelated in the population, a sample correlation this strong would turn up only about 4% of the time
  • Convention: p ≤ .05 is considered “statistically significant”
  • Loosely: sampling error alone would produce a result like this 5% of the time or less, so we accept the small risk of being wrong when we claim a correlation
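
One way to see where a p-value can come from is a permutation test: shuffle the Y scores many times so that any real X-Y pairing is destroyed, and count how often chance alone gives a correlation at least as strong as the observed one. This is a sketch of that idea, not necessarily the test that statistics software runs for Pearson r. The data, the function name `permutation_p_value`, and the choices of 5000 shuffles and seed 0 are all assumptions for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def permutation_p_value(xs, ys, n_shuffles=5000, seed=0):
    """Share of random re-pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)          # break any real X-Y pairing
        if abs(pearson_r(xs, shuffled)) >= observed:
            extreme += 1
    return extreme / n_shuffles

# Hypothetical sample of 10 individuals with a strong positive relationship
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

p = permutation_p_value(x, y)
print(p <= 0.05)  # True here: chance alone rarely produces an r this strong
```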
11
Q

what are three important considerations for correlations

A
  • Shape of the relationship
  • Homoscedasticity
  • Restriction of range
12
Q

what does the shape of the relationship mean

A
  • Pearson r applies only if the relationship between the variables is presumed to be linear.
  • Whatever the connection is, we are assuming that it follows a straight line: Y changes at a steady rate as X changes
  • Curvilinear relationships cannot be described by Pearson r
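
A small sketch of why Pearson r fails for a curvilinear relationship. Here Y is completely determined by X (Y = X squared), yet r comes out at zero because the relationship is U-shaped rather than a straight line. The data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]   # perfect U-shaped (curvilinear) relationship

r = pearson_r(xs, ys)
print(r)  # zero: the falling left half cancels the rising right half
```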

13
Q

explain homoscedasticity and heteroscedasticity

A

Homoscedasticity = all data points fall within a (more or less) elliptical/oval shape; the range of values on Y is the same for each value of X

Heteroscedasticity = shape of data points deviates from an ellipse (e.g., fan shaped); the range of values on Y is NOT the same for each value of X
-Pearson r is not appropriate here because it can come out near 0 or understate the relationship even though a relationship exists (the same is true for curvilinear relationships)

14
Q

visual difference between homo and heteroscedasticity

A

see ppt slides

15
Q

explain restricted range

A
  • Common reason why population correlation coefficients can be underestimated by sample r’s
  • If your sample covers only a restricted range of scores, you may conclude there is no relationship when there actually might be one
  • The effect of a restricted range is to reduce the magnitude of the calculated r.
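
The effect of a restricted range can be sketched with simulated data. The code below builds a positive relationship into 2000 made-up individuals, then recomputes r using only those with high X scores. The sample size, the cutoff of 1.0, and the random seed are arbitrary assumptions chosen for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

rng = random.Random(1)                     # fixed seed so the run is repeatable
x = [rng.gauss(0, 1) for _ in range(2000)]
y = [xi + rng.gauss(0, 1) for xi in x]     # Y = X plus noise: a real relationship

r_full = pearson_r(x, y)                   # r across the full range of X

# Restricted range: keep only individuals who scored high on X
kept = [(xi, yi) for xi, yi in zip(x, y) if xi > 1.0]
r_restricted = pearson_r([pair[0] for pair in kept],
                         [pair[1] for pair in kept])

print(r_restricted < r_full)  # True: the restricted sample understates r
```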
16
Q

what two variables does Pearson r measure

A

variable 1: interval or ratio

variable 2: interval or ratio

17
Q

what two variables does Spearman rho measure

A

variable 1: ordinal (ranks)

Variable 2: ordinal (ranks)
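
As a rough sketch, Spearman rho can be obtained by converting each variable to ranks and then computing Pearson r on the ranks. The helper `ranks` below assumes there are no tied scores, and the two judges' scores are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def ranks(values):
    """Rank each value from 1 (smallest) upward; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman rho as Pearson r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

judge_a = [86, 72, 95, 60, 78]   # hypothetical scores from judge A
judge_b = [80, 70, 99, 65, 90]   # hypothetical scores from judge B

rho = spearman_rho(judge_a, judge_b)
print(round(rho, 2))  # 0.9
```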

18
Q

what two variables does Phi measure

A

variable 1: true dichotomy (nominal, two categories)

Variable 2: true dichotomy (nominal, two categories)
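
A minimal sketch of the phi coefficient computed from the four cell counts of a 2x2 table. The counts and the function name `phi_coefficient` are assumptions for illustration.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table of counts laid out as:

                 Y = 1   Y = 0
        X = 1      a       b
        X = 0      c       d
    """
    numerator = a * d - b * c
    # Product of the two row totals and the two column totals
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Hypothetical counts: 30 and 30 on the diagonal, 10 and 10 off it
phi = phi_coefficient(30, 10, 10, 30)
print(phi)  # 0.5
```

Phi is the same value Pearson r would give if both dichotomies were coded as 0 and 1.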

19
Q

what two variables does tetrachoric measure

A

variable 1: artificial dichotomy (ex. pass or fail on a math test)
Variable 2: artificial dichotomy

20
Q

what two variables does contingency coefficient measure

A

variable 1: nominal, two or more categories

Variable 2: nominal, two or more categories

21
Q

what two variables does point biserial measure

A

variable 1: true dichotomy (nominal)

Variable 2: interval or ratio

22
Q

what two variables does biserial measure

A

variable 1: artificial dichotomy

Variable 2: interval or ratio

23
Q

what two variables does eta (curvilinear) measure

A

variable 1: interval or ratio

Variable 2: interval or ratio

24
Q

what is artificial dichotomy

A

Artificial dichotomy is when a variable is not dichotomous but you are making it dichotomous

25
Q

what is the coefficient of determination

A
  • Obtained by squaring the correlation coefficient (r²)
  • Interpreted as the percentage of variance in one variable that is predictable from (explained by or shared with) the other variable

For example, if the correlation (r) between IQ and reading test scores is .7

  • The Coefficient of Determination is .7² = .49 = 49%
  • This means that 49% of the variance in reading test scores is predictable from (or explained by) IQ scores
  • And 51% of the variance in reading test scores is due to other factors
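
The arithmetic in the IQ example can be checked in a few lines. The variable names are chosen for illustration only.

```python
r = 0.7                                  # correlation between IQ and reading scores
r_squared = r ** 2                       # coefficient of determination

explained_pct = round(r_squared * 100)   # variance predictable from IQ
unexplained_pct = 100 - explained_pct    # variance due to other factors

print(explained_pct, unexplained_pct)    # 49 51
```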
26
Q

why does correlation not imply causality

A

A strong correlation (positive or negative) between X and Y could mean any one of three things:

  1. X causes Y
  2. Y causes X
  3. A third, unmeasured variable influences both X and Y
27
Q

what is factor analysis

A
  • Expanding to the possibility that we have multiple measurements on each individual
  • For example, each individual has five test scores
  • We can take each pair of tests and calculate a correlation coefficient on that pair
28
Q

what is a correlation matrix

A

table showing the correlation of each variable with all the other variables

  • Entries in diagonal are +1.00 (perfect positive correlation)
  • Section above is a mirror image of the section below
29
Q

how to determine the number of unique entries in a correlation matrix involving n variables

A

n(n-1) / 2
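
A quick check of the formula, assuming five tests with placeholder names. It counts every distinct pair of variables directly and compares that count with n(n-1)/2.

```python
from itertools import combinations

def unique_correlations(n):
    """Number of distinct variable pairs among n variables."""
    return n * (n - 1) // 2

tests = ["T1", "T2", "T3", "T4", "T5"]   # placeholder names for five tests
pairs = list(combinations(tests, 2))     # every distinct pair, order ignored

print(len(pairs), unique_correlations(len(tests)))  # 10 10
```

Each pair corresponds to one entry below the diagonal of the correlation matrix.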

30
Q

what is the purpose of factor analysis

A
  • Goal = simplification
  • Discovery of underlying dimensions or constructs that can account for the pattern of correlations among our variables
  • Reduces the number of variables we have to work with
31
Q

what are the 6 steps in factor analysis

A

  • Step 1. Deciding on the number of factors
  • Step 2. Extracting the factors
      - Perform the factor analysis by directing the software to extract the number of factors identified in Step 1
  • Step 3. Examining the factor loadings
  • Step 4. Performing a rotation
  • Step 5. Examining the rotated factor loadings
  • Step 6. Interpreting and naming the factors
32
Q

what is the maximum number of factors that can be extracted

A

number of variables

33
Q

what is an Eigenvalue

A
  • amount of variance associated with each factor
34
Q

what is a scree plot

A
  • Scree Plot is a graphical representation of the eigenvalues (vertical axis) vs. the number of factors (horizontal axis)
  • Locate the place where there is a large drop
  • Number of factors to extract is at the top of the drop
35
Q

what is a factor loading

A

-correlation of each of the original variables with each factor

36
Q

explain factor rotation

A
  • Plot the factor loadings on a graph
  • Rotate the axes until they pass through the greatest number of data points
  • Recalculate the factor loadings
37
Q

what is Thurstone’s Criteria

A
  1. Eliminate negative factor loadings
  2. Each variable has a high loading on only one factor

38
Q

what is a rotated factor matrix

A
  • Shows the correlation between each original variable and the new, rotated factors
  • Used in Step 6 to interpret and name the factors
39
Q

explain unidimensional vs. multidimensional

A

1) Unidimensional test =
- all items load on a single factor
- test is measuring a single construct
- Beck Depression Inventory might look like a unidimensional test but it is not

2) Multidimensional test =
- items group into two or more separate factors
- test is measuring more than one construct
- most tests are typically multidimensional

40
Q

what are orthogonal factors

A

extracted factors are not correlated with one another (orthogonal)
-On a plot, the factor axes form a right angle (90°) with each other

41
Q

explain oblique factors

A

permit our factors to themselves be correlated with one another

42
Q

what can you do if the factors are oblique

A

-If we extract many factors and the factors are oblique (correlated), we can repeat the process and factor analyze the factors themselves (Second-Order Factors)

43
Q

define factor

A

a mathematical concept that is utilized to determine if things have a relationship to one another
-Refers to the underlying relationship between things; you cannot know it until you have done a factor analysis