Correlation Flashcards

1
Q

The correlation coefficient describes the relationship between what kind of variables?

A

Ratio/Interval

2
Q

If X and Y have a correlation of 0, and an X score has a z score of 1.0,
then what is the best guess of the corresponding Z score of Y?

A

0, i.e., the mean of Y

3
Q

If X and Y have a correlation of 1, and an X score has a z score of 0.6,
then what is the best guess of the corresponding Z score of Y?

A

0.6

4
Q

If X is the maximum possible R^2 and Y is the minimum possible R^2, then what is X + Y?

A

1

5
Q

If Y can be predicted perfectly by X, what is the correlation of X and Y?

A

Can’t determine. It depends on the type of prediction: with a linear prediction the correlation is 1 or -1, but with a quadratic prediction the correlation could be 0.
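A minimal sketch of why the type of prediction matters. The helper `pearson_r` and the X scores are illustrative assumptions, not part of the original card; the X values are deliberately symmetric around 0.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations around the means."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]             # hypothetical X scores, symmetric around 0
linear = [3 * x + 1 for x in xs]   # Y predicted perfectly by a line
quadratic = [x ** 2 for x in xs]   # Y predicted perfectly by a curve

r_linear = pearson_r(xs, linear)        # perfect linear relationship
r_quadratic = pearson_r(xs, quadratic)  # perfect but non-linear relationship
print(r_linear, r_quadratic)
```

Both Y columns are perfectly predictable from X, yet only the linear one yields a correlation of 1; the quadratic one yields 0 for these symmetric X values.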

6
Q

In a set of 100 husbands and wives, the wife’s IQ is always exactly 3 points higher than the husband’s IQ. What is the correlation between the husband IQ and Wife IQ?

A

1, because the wife’s IQ is a perfect linear function of the husband’s IQ (husband + 3) with a positive slope.

7
Q

If the Z score of a husband’s age and the Z score of the wife’s age always add up to 0, then what is the correlation between the ages?

A

-1, because we can always predict one age from the other, but their Z scores always have opposite signs.
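A short check of this answer, assuming the Z scores are computed with population standard deviations (so the correlation is the mean product of paired Z scores) and writing N for the number of couples, a symbol introduced here for illustration:

```latex
z_Y = -z_X
\quad\Rightarrow\quad
r = \frac{1}{N}\sum_{i=1}^{N} z_{X_i}\, z_{Y_i}
  = -\frac{1}{N}\sum_{i=1}^{N} z_{X_i}^{2}
  = -1
```

The last step uses the fact that the mean of squared Z scores is 1.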

8
Q

Standard Error of the Estimate Formula

A

√(SS_residual / (N - 2))

9
Q

The BEST regression line equation minimizes the error between the actual values and those predicted by the regression line. Error is defined as…?

A

Sum of squared differences between actual and predicted values

10
Q

How much error is there in the least squares regression line with 2 pairs of XY values?

A

None. We can always draw a line between two points that contains both points (hence no error between predicted and actual values).
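A small sketch of this point, fitting a least squares line to two made-up points. The function name and the data are assumptions chosen for illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

xs = [1.0, 3.0]   # hypothetical X values (must differ from each other)
ys = [2.0, 8.0]   # hypothetical Y values

slope, intercept = least_squares_line(xs, ys)
# Sum of squared differences between actual and predicted values
sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
print(slope, intercept, sse)
```

The fitted line passes through both points, so the sum of squared errors is 0.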

11
Q

The Goodness of fit test.

A

This test compares an actual distribution of frequencies to a theoretical probability distribution. For example, we could compare an actual distribution of a coin flip to a theoretical distribution (i.e. 50/50) to determine if there was evidence for an unfair coin.

12
Q

The Test for Independence

A

This test looks at two categorical variables and determines whether a category in one variable tends to be associated with a category in the other variable. For example, we could look at whether men and women answer a question such as ‘Have you been arrested?’ differently. The test would determine if men are more likely to give a certain response (e.g. yes).

13
Q

The McNemar test for change

A

This test compares the distribution of a categorical variable before and after an event. For example, this test could be used to see if support for a government policy (yes/no) is affected by an external event (such as a war or an economic downturn).

14
Q

Goodness of Fit test; Calculate the deviations from the expected frequency

A

Calculate χ^2 = Σ (Obs_i - Exp_i)^2 / Exp_i
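A minimal sketch of this calculation for a coin-flip example. The counts are made up for illustration and the function name is an assumption.

```python
def chi_square_gof(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [60, 40]   # hypothetical heads/tails counts from 100 flips
expected = [50, 50]   # fair-coin expectation for 100 flips

chi2 = chi_square_gof(observed, expected)  # 100/50 + 100/50
print(chi2)
```

The resulting value would then be compared with the critical chi-squared value for (number of categories - 1) degrees of freedom.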

15
Q
If I want to see if being married has an effect on voting for a married candidate in an election, what kind of statistical test would I use?
A

Chi-squared test for independence

16
Q

I want to see if getting married affects support for sexual predator laws. I collect my data and find that getting married produces X switches from ‘Yes’ to ‘No’ and X switches from ‘No’ to ‘Yes’. What are the results of the hypothesis test?

A

Since the numbers of switches are the same, the expected frequency for each category is the total number of switches divided by 2: (2X)/2 = X.

The expected and observed frequencies are therefore the same, so each category’s χ^2 contribution is 0.

The sum of all χ^2 values is 0, so the null hypothesis is not rejected (observed χ^2 = 0).
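The reasoning above can be sketched as a small calculation. This follows the card’s description (expected count = half of all switches, no continuity correction); the function name and the counts are assumptions, and at least one switch must have occurred.

```python
def mcnemar_chi2(yes_to_no, no_to_yes):
    """Chi-squared for change: each switch direction is expected to get half the switches."""
    expected = (yes_to_no + no_to_yes) / 2
    return sum((obs - expected) ** 2 / expected
               for obs in (yes_to_no, no_to_yes))

equal_switches = mcnemar_chi2(12, 12)    # same count in both directions
unequal_switches = mcnemar_chi2(15, 5)   # more Yes-to-No than No-to-Yes
print(equal_switches, unequal_switches)
```

With equal switches in both directions the statistic is 0, so the null hypothesis cannot be rejected; unequal switches give a positive value.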

17
Q

Total error for regression

A

Total Error = Σ (Y_predicted - Y_actual)^2

18
Q
Consider the following information for a power calculation (normal score distribution with μ1 known, σ estimated from sample):
μ1 = 100, μ2 = 102, σX = 16, N = 16, Tails = 2, α = .05.
Calculate the power of this experiment.
A

Can’t solve, because σ is estimated from the sample when it has to be known.

Power problems cannot be solved with an estimated σ; we need the population standard deviation.

19
Q

When do we use a one-tailed confidence interval?

A

We don’t. Confidence intervals are always two-tailed.

20
Q

When comparing scores of two groups, dependent t tests are preferred over independent t tests because they usually have____?
A) a lower standard error
B) more variability
C) a lower value in the denominator of the equation which calculates tobs
D) three of the other answers
E) two of the other answers

A

E. A lower standard error (A) and a lower value in the denominator of the equation which calculates t_obs (C).

21
Q

Consider the hypothesis test that examined how phone design (flat, flip, fold, or telescope) affects phone usage. Subjects participated in all conditions. Surprisingly, each subject in the experiment used each phone for exactly the same amount of time, but different subjects used the phones for different amounts of time. For example, subject 1 used flat, flip, fold and telescope for 23 minutes each. Subject 2 used flat, flip, fold and telescope for 28 minutes each. What is the SS-Residual?

A) Two of the other answers
B) Cannot determine
C) Same as the SSW if we analyzed the data using a between subjects ANOVA
D) Same as the SSB if we analyzed the data using a between subjects ANOVA
E) None of the other answers

A

E. The SS-Residual is 0; all of the variability is between subjects (SS-Between Subjects).

22
Q

Consider a non-normal distribution of scores. If X is the standard deviation of the score distribution, and N is the sample size of means that produce a sample mean distribution, then what is the variance of the sample mean distribution?

A

D) X^2/N

23
Q
A confidence interval is 50
A

Calculate the half-width of the interval. Here it is 50: μ is 100, and each bound of the interval is 50 away from it.

Using 100 times more subjects makes the interval narrower, because the standard error shrinks by a factor of sqrt(100) = 10.

So we divide 50 by 10 and get 5.

100 - 5 = 95
100 + 5 = 105
The new interval runs from 95 to 105.
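The arithmetic in this answer can be sketched as follows. It assumes, as the card does, that the interval’s half-width scales with 1/sqrt(N) and that nothing else changes; the function and parameter names are illustrative.

```python
import math

def shrink_interval(center, half_width, n_factor):
    """Rescale a confidence interval when the sample size is multiplied by n_factor."""
    new_half_width = half_width / math.sqrt(n_factor)
    return center - new_half_width, center + new_half_width

# Values taken from the card: centered at 100, half-width 50, 100 times more subjects
new_low, new_high = shrink_interval(center=100, half_width=50, n_factor=100)
print(new_low, new_high)
```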

24
Q
If the same set of data is analyzed with a Between-Subjects ANOVA and a Within-Subjects ANOVA, but there is no variability between subjects, then a between-subjects ANOVA is more likely to show a significant result. Why is this?
A

B) The between-subjects ANOVA has a lower MS term in the denominator of the F ratio than the within-subjects ANOVA.

Explanation: With no variability between subjects, the between-subjects ANOVA will have:

a lower critical value
a lower MS term in the denominator of the F ratio (the SS-W is the same as the SS-R, but it is divided by the larger df-W, which produces a lower MS term).

25
Q

If y=x^2, then we can perfectly predict y from x. Given that this is true, the correlation between y and x for y=x^2 is about…

A

0

26
Q

How much error is there in the least squares regression line with 2 pairs of XY values?

A

None. Because we’re drawing one line through two points, there is no error.

27
Q
Consider a normally distributed population of scores with a mean (μ) = 100 and a standard deviation (σX) = 3.
What is the Z-score of a score of 112?
A

4, since (112 - 100) / 3 = 4. Because it’s a population of scores, the standard deviation is the population’s.

28
Q
A confidence interval is 60
A

Explanation: With a sample size that is 10000 times larger, the confidence interval will be 100 (sqrt(10000)) times narrower.

Original confidence interval was centered at 100 with 40 as the width (60

29
Q

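What is the standard error for the dependent t test?

A

S_D̄ = S_D / √N (the standard deviation of the difference scores divided by the square root of the number of pairs)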

30
Q
Which of the following studies should use an independent t test to analyze the data?

1) Asking each person in an experiment to rate the taste of vanilla and chocolate ice cream
2) Comparing two groups, with sample sizes X and Y. X - Y = 1.

A

Just 2

31
Q

Degrees of freedom total

A

df_total = df_1 + df_2

32
Q

If the variances of both groups are the same

A

then the pooled variance equals that common variance.

33
Q

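Confidence interval changes with the number of subjects

A

If the number of subjects is multiplied by some factor, divide the width by the square root of that factor (e.g. 100 times more subjects: divide by sqrt(100) = 10).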

90<101 is the new width

34
Q

Sample scores are 8, 9, 10.

What is the sample variance (S^2)?

A

1
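A quick check of this answer, assuming the usual sample variance with N - 1 in the denominator. The variable names are illustrative.

```python
scores = [8, 9, 10]

mean = sum(scores) / len(scores)
# Sum of squared deviations from the mean: (-1)^2 + 0^2 + 1^2
ss = sum((x - mean) ** 2 for x in scores)
sample_variance = ss / (len(scores) - 1)   # divide by N - 1
print(sample_variance)
```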

35
Q
The F ratio of my one-way ANOVA is 1.05, what is the probability that the null hypothesis is rejected?

A

Cannot Determine

36
Q
When comparing scores of two groups, dependent t tests are preferred over independent t tests because they almost always have _____________________________

A

B) a lower standard error

C) less variability

D) a lower value in the denominator of the equation which calculates t_obs

37
Q
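Consider the hypothesis test that compares the IQ scores of people who take a smart pill and people who do not take a smart pill. The two groups are matched by IQ. Here are the pairs of scores.

          P1    P2    Mean
Pill      109   109   109
No Pill   99    97    98

Assume that the scores are normally distributed (α = .05, two tailed).

What are the results of the hypothesis test?

A

t_obs = -11: the difference scores (No Pill minus Pill) are -10 and -12, so the mean difference is -11 and its standard error is 1. With df = 1 the two-tailed critical t is 12.706, so the null hypothesis is not rejected.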
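A sketch of how the -11 can be reached with a dependent (matched pairs) t test, taking differences as No Pill minus Pill. The variable names are illustrative.

```python
import math
import statistics

pill = [109, 109]     # scores for pairs P1 and P2
no_pill = [99, 97]

# Difference score for each matched pair (No Pill minus Pill)
diffs = [n - p for p, n in zip(pill, no_pill)]

mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)                 # sample SD, N - 1 denominator
standard_error = sd_diff / math.sqrt(len(diffs))
t_obs = mean_diff / standard_error
df = len(diffs) - 1
print(t_obs, df)
```

With only one degree of freedom, the observed t would be compared with the two-tailed critical value of 12.706 from a standard t table.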

38
Q
Why are single sample Z-tests not used very often?

A) None of the other answers.
B) That’s not true, single sample Z tests are commonly used.
C) Because the population standard deviation and mean are usually known.
D) Two of the other answers
E) Because the sample standard deviation and mean are usually known.

A

A) None of the other answers. Single sample Z-tests are rare because the population standard deviation is usually not known.

39
Q

Consider the hypothesis test that examined how phone design (flat, flip, fold, or telescope) affects phone usage. Subjects participated in all conditions. Surprisingly, all subjects in the experiment used all phones for exactly 17 minutes. What is the MS-Residual?

A) 0
B) Cannot determine without knowing MSSubject
C) Cannot determine without knowing MSBetween Occasions
D) Cannot determine without knowing MSWithin
E) Two of the other answers

A

A) 0

40
Q
If I wanted to examine the effects of gender, race and age on depression, what kind of ANOVA would I use?

A) Two of the other answers
B) One-way Between subjects ANOVA with three levels
C) One-way Within subjects ANOVA with three levels
D) None of the other answers
E) Two-way Between subjects ANOVA

A

D) None of the other answers. Three factors (gender, race and age) call for a three-way between-subjects ANOVA, so E is incorrect.

41
Q
A within-subjects ANOVA is generally preferred over a between subjects ANOVA because…
A) The critical value for a within subjects ANOVA is usually lower than the critical value for a Between subjects ANOVA
B) Two of the other answers
C) People like to participate in all conditions
D) The MSBO is usually lower than the MSB term
E) The MSR is usually lower than the MSW term
A

E

42
Q
If Y=X^2, then we can perfectly predict Y from X. All X values are negative. Given that this is true, the correlation between X and Y for Y=X^2 is Z. What is Z? (Draw a graph to help you solve this problem).
A) 0
B) none of the other answers
C) Between -1 and 0
D) 1
E) -1
A

C

43
Q

If the sum of squared differences (actual - predicted) is 100 and the number of
pairs is 7 then what is the standard error of the estimate?

A

4.47
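A quick check of this value using the standard error of the estimate, sqrt(SS_residual / (N - 2)). The function name is an assumption.

```python
import math

def standard_error_of_estimate(ss_residual, n_pairs):
    """sqrt(SS_residual / (N - 2)), where N is the number of XY pairs."""
    return math.sqrt(ss_residual / (n_pairs - 2))

see = standard_error_of_estimate(ss_residual=100, n_pairs=7)  # sqrt(100 / 5)
print(round(see, 2))
```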

44
Q

If the power of an experiment is 0.6 (2 tails, α=.05) then what will the new power be if everything remains the same but the N increases to 2N? (mean distributions are not normal)

A

Can’t determine, because the sample mean distributions are not normal.

45
Q
A sample mean distribution is normal, even though the sample size used to create the distribution is only 4. How is this possible?
A) It’s not possible
B) Two of the other answers
C) We used more than 30 samples
D) The sample mean distribution is based on a score distribution which is asymptotic, asymmetric and unimodal
E) The sample means came from a score distribution that was normal
A

E) The sample means came from a score distribution that was normal

46
Q
A distribution is asymmetric, and the mean of the scores is 25 and the standard deviation is 5. The highest score is 30. What is the proportion of scores above 25? (Hint: Draw a picture of an asymmetric distribution)
A

Cannot determine. Since the distribution is asymmetric, the mean does not necessarily split the scores in half.

47
Q
If the variance (σ_X^2) of the score distribution is 900 and the standard error (σ_X̄) of the sample mean distribution is 10, then what is the sample size?
A) 90
B) 3
C) 30
D) Cannot determine
E) 9
A

E) 9
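A brief derivation of this answer from the standard error formula, writing the standard error of the sample mean distribution as σ with an X-bar subscript and N for the sample size:

```latex
\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{N}}
\quad\Rightarrow\quad
N = \frac{\sigma_X^{2}}{\sigma_{\bar{X}}^{2}}
  = \frac{900}{10^{2}}
  = 9
```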

48
Q
There is a new scale called the ‘hello index’ which measures how many times people say hello in a day. What measures of central tendency can be applied to the ‘hello index’?
A) Mode and Median
B) Mode
C) Mode and Mean
D) Median and Mean
E) None of the other answers
A

E. Since the scale is ratio, all three (mode, median and mean) can be applied.

49
Q

Consider a non-normal distribution of scores. If X is the standard deviation of the score distribution, and N is the sample size of means that produce a sample mean distribution, then what is the variance of the sample mean distribution?

A) (X/N)^2
B) X/(√N)
C) √X/N
D) X^2/N
E) Cannot determine
A

D

50
Q
Consider the hypothesis test that compares the IQ scores of people who take a smart pill and people who do not take a smart pill. Here are the scores. The scores in the two groups are not related.

With Pill   100   97    103   Mean = 100
Without     100   106         Mean = 103

Assume that the scores are normally distributed (α=.05,two tailed)

What is the pooled variance?

A) 18
B) none of the other answers
C) 12
D) 6
E) 9
A

C) 12
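A sketch of the pooled variance calculation behind this answer: add the two groups’ sums of squares and divide by the combined degrees of freedom. The helper name is an assumption.

```python
def sum_of_squares(scores):
    """Sum of squared deviations from the group mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

with_pill = [100, 97, 103]
without_pill = [100, 106]

ss_total = sum_of_squares(with_pill) + sum_of_squares(without_pill)  # 18 + 18
df_total = (len(with_pill) - 1) + (len(without_pill) - 1)            # 2 + 1
pooled_variance = ss_total / df_total
print(pooled_variance)
```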

51
Q
If Y=X^2, then we can perfectly predict Y from X. All X values are positive. Given that this is true, the correlation between X and Y for Y=X^2 is Z. What is Z? (Draw a graph to help you solve this problem).
A) 1
B) -1
C) 0
D) Between -1 and 0
E) none of the other answers
A

E
Explanation: If we only have X values greater than 0, then we have the entire right side of the Y=X^2 graph. Any set of these values always produces a positive correlation: the X and Y values are all positive, and larger X values always produce larger Y values. The points will not fall on a straight line because the function is a curve, so the correlation is not 1. Draw a graph and you will see this.
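A small numeric illustration of this explanation. The X values and the helper `pearson_r` are assumptions chosen for the sketch.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations around the means."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

positive_xs = [1, 2, 3, 4, 5]         # hypothetical X values, all positive
negative_xs = [-x for x in positive_xs]

r_positive = pearson_r(positive_xs, [x ** 2 for x in positive_xs])
r_negative = pearson_r(negative_xs, [x ** 2 for x in negative_xs])
print(r_positive, r_negative)
```

For these values the correlation is high but below 1 when all X values are positive (roughly 0.98), and it takes the same size with a negative sign when all X values are negative.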