Lecture 3 Flashcards

Correlation and Simple Regression

1
Q

Which of the following statements best describes the role of correlation in predicting outcomes?
A. Correlation guarantees accurate predictions without any error.
B. Correlation measures the strength of the connection between two variables, helping make predictions with absolute certainty.
C. Correlation assesses the relationship between variables, allowing for predictions with some degree of error.
D. Correlation is irrelevant when it comes to making predictions.

A

Correlation assesses the relationship between variables, allowing for predictions with some degree of error.

Correlation describes the strength and direction of the relationship between variables, which makes prediction possible, but those predictions always carry some degree of error.

2
Q

If the correlation coefficient between study hours and exam scores is -0.75, what does this value indicate?
A. There is a perfect positive relationship between study hours and exam scores.
B. There is a strong negative relationship between study hours and exam scores.
C. There is no relationship between study hours and exam scores.
D. The correlation coefficient is too low to draw any conclusions.

A

There is a strong negative relationship between study hours and exam scores.

A correlation coefficient of -0.75 indicates a strong negative relationship, meaning as study hours increase, exam scores tend to decrease.

3
Q

In a research study examining the correlation between “X” and “Y,” the correlation coefficient was found to be “epsilon.” If “epsilon” is a Greek letter representing a very small positive value, what can be confidently concluded about the relationship between “X” and “Y”?

A. There is a negligible relationship between “X” and “Y.”
B. “X” and “Y” have a strong positive relationship.
C. The correlation coefficient is undefined.
D. “X” and “Y” have a perfect positive relationship.

A

There is a negligible relationship between “X” and “Y.”

The use of the Greek letter “epsilon” to represent a very small positive value implies a correlation coefficient close to zero.
A correlation coefficient close to zero indicates a weak or negligible relationship between the variables “X” and “Y.”

4
Q

In a research study comparing two groups, Group A and Group B, the variance of Group A is higher than that of Group B. What might this higher variance suggest regarding the study’s outcomes?

A. The study is likely to find a significant difference between Group A and Group B.
B. The study may face challenges in detecting a significant difference between Group A and Group B.
C. The standard deviation is irrelevant in this context.
D. The p-value will be unaffected by the variance difference.

A

The study may face challenges in detecting a significant difference between Group A and Group B.

Higher variance makes it harder to detect a significant difference between groups. For the same difference in means, more spread in the scores means a larger standard error, which can obscure the underlying effect and make robust conclusions harder to draw.

5
Q

A research team is conducting a study on reaction times in different age groups. The variance of reaction times in the elderly group is found to be significantly higher than that in the young adult group. How might this impact the study’s conclusions?

A. It will make it easier to detect significant differences between age groups.
B. The study’s conclusions may be influenced by the variability in reaction times.
C. Variance is irrelevant when studying age groups.
D. The study’s p-value will be lower due to the higher variance.

A

The study’s conclusions may be influenced by the variability in reaction times.

Higher variance in the elderly group makes differences between the age groups harder to interpret. When scores are more spread out, a real difference is harder to identify, and any conclusion about it is less certain.

6
Q

If the standard deviation of a dataset is 0, what can be concluded about the variability of the data?

A. The data is perfectly normally distributed.
B. There is no variability in the data; all values are identical.
C. The variance of the data is also 0.
D. The p-value for any analysis will be significant.

A

There is no variability in the data; all values are identical.

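A standard deviation of 0 indicates that all values are identical; there is no variability. Because the standard deviation is the square root of the variance, the variance is also 0 (so option C is true as well), but option B is the statement that describes the variability of the data.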

7
Q

A researcher is analyzing the results of an experiment and observes a high variance in the data. What potential challenges might the researcher face in interpreting the study’s outcomes?

A. It will be easier to draw robust conclusions with a high variance.
B. The researcher may struggle to identify significant effects or differences.
C. Standard deviation has a greater impact on interpretation than variance.
D. The p-value is independent of the variance.

A

The researcher may struggle to identify significant effects or differences.

High variance can indeed make it challenging to identify significant effects or differences.
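
A minimal sketch of why this happens, using Welch's t statistic on made-up scores. The data and the helper name `welch_t` are assumptions for illustration only. Both comparisons have the same 2-point mean difference, but the high-variance comparison produces a much smaller t value.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic: mean difference divided by its standard error."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se

# Hypothetical scores: the group means differ by 2 points in both comparisons.
low_var_a = [9, 10, 10, 10, 11]
low_var_b = [11, 12, 12, 12, 13]
high_var_a = [2, 6, 10, 14, 18]
high_var_b = [4, 8, 12, 16, 20]

t_low = welch_t(low_var_a, low_var_b)     # small spread, large |t|
t_high = welch_t(high_var_a, high_var_b)  # large spread, small |t|

print(f"low variance:  t = {t_low:.2f}")
print(f"high variance: t = {t_high:.2f}")
```

The smaller the absolute t value, the less likely the difference is to reach significance at a given sample size.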

8
Q

If a research study finds a significant difference between treatment and control groups, but the variance in the treatment group is much larger, what consideration should be given to the study’s results?

A. The results are likely invalid due to the large variance in the treatment group.
B. A larger variance in the treatment group enhances the credibility of the findings.
C. The p-value will be unaffected by the variance difference.
D. The study’s conclusions should be cautious, considering the impact of variance on the interpretation.

A

The study’s conclusions should be cautious, considering the impact of variance on the interpretation.

A larger variance in the treatment group should make one cautious about the study’s conclusions.

9
Q

A researcher conducts a study predicting customer satisfaction with three predictor variables: product quality (X1), price (X2), and brand loyalty (X3). The model’s coefficients indicate that product quality has a significant negative effect, price has a significant positive effect, and brand loyalty has a non-significant effect on customer satisfaction. What inference can be drawn regarding the contribution of these predictor variables to explaining variance in customer satisfaction?

A. Product quality and price together explain a substantial portion of the variance in customer satisfaction.
B. Brand loyalty has no impact on customer satisfaction.
C. The sign of the coefficients indicates the direction but not the magnitude of the impact on variance.
D. The non-significant effect of brand loyalty means it contributes the most to explaining variance in customer satisfaction.

A

Product quality and price together explain a substantial portion of the variance in customer satisfaction.

Whether a coefficient is significant tells us whether that predictor makes a reliable contribution to explaining variance in the outcome.
Product quality and price are significant contributors to explaining variance, whereas brand loyalty is not.

10
Q

Consider two variables, X and Y. The covariance between them is calculated to be 40. What does this value tell you about the relationship between X and Y?

A. X and Y have a perfect positive linear relationship.
B. X and Y have a perfect negative linear relationship.
C. X and Y have a positive relationship, but its strength is not clear.
D. X and Y have no relationship.

A

X and Y have a positive relationship, but its strength is not clear.

The positive sign of the covariance shows that X and Y tend to vary in the same direction. Its size (40) depends on the units of measurement, so it says nothing about the strength of the relationship until it is standardized into a correlation coefficient.
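
The way covariance depends on measurement units can be sketched in Python. The data and the helper names `covariance` and `correlation` are assumptions made up for illustration. Rescaling one variable changes the covariance but leaves the correlation untouched, which is why a raw covariance such as 40 cannot be read as "strong" or "weak".

```python
from statistics import mean

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson's r: covariance divided by the product of the standard deviations."""
    sd_x = covariance(xs, xs) ** 0.5
    sd_y = covariance(ys, ys) ** 0.5
    return covariance(xs, ys) / (sd_x * sd_y)

# Hypothetical data: heights in metres and weights in kilograms.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [52.0, 58.0, 69.0, 75.0, 86.0]
heights_cm = [h * 100 for h in heights_m]  # same heights, different unit

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)
r_m = correlation(heights_m, weights_kg)
r_cm = correlation(heights_cm, weights_kg)

print(f"covariance (m):  {cov_m:.3f}")
print(f"covariance (cm): {cov_cm:.3f}")   # 100 times larger
print(f"correlation (m):  {r_m:.3f}")
print(f"correlation (cm): {r_cm:.3f}")    # unchanged
```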

11
Q

If the covariance between two variables is zero, what can be concluded about their relationship?

A. There is no relationship between the variables.
B. The variables have a perfect positive linear relationship.
C. The variables have a perfect negative linear relationship.
D. The relationship is non-linear.

A

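There is no relationship between the variables.

More precisely, a covariance of zero indicates no linear relationship between the variables. A non-linear relationship could still exist, but zero covariance alone does not show that one does.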
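
As a small illustration of the "no linear relationship" wording, here is a sketch with made-up data in which Y is completely determined by X and yet the covariance is zero. The helper name `covariance` and the data are assumptions for illustration.

```python
from statistics import mean

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical data: y = x squared, a perfect but non-linear relationship.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

cov_xy = covariance(xs, ys)
print(cov_xy)  # the positive and negative cross-products cancel out
```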

12
Q

For a given dataset, the correlation coefficient (r) between variables X and Y is calculated to be 0.95. What can be inferred about the relationship between X and Y?

A. X and Y have a perfect positive linear relationship.
B. X and Y have a strong positive linear relationship.
C. X and Y have a perfect negative linear relationship.
D. The correlation coefficient is too high to be meaningful.

A

X and Y have a strong positive linear relationship.

13
Q

If the correlation coefficient between two variables is -0.20, what does this indicate about their relationship?

A. There is a weak positive linear relationship.
B. There is a strong negative linear relationship.
C. There is a weak negative linear relationship.
D. The variables are not related.

A

There is a weak negative linear relationship.

14
Q

In a regression analysis, the coefficient of determination is calculated to be 0.75. What does this value signify?

A. 75% of the variability in the dependent variable is explained by the independent variables.
B. The model does not explain any variability.
C. 75% of the variability is unexplained by the model.
D. The model is a perfect fit.

A

75% of the variability in the dependent variable is explained by the independent variables.

15
Q

If r-squared is 0, what does it indicate about the regression model?

A. The model does not explain any variability in the dependent variable.
B. The model is a perfect fit.
C. The model explains all the variability in the dependent variable.
D. The relationship is nonlinear.

A

The model does not explain any variability in the dependent variable.

16
Q

What does the equation SST = SSM + SSR represent in the context of regression analysis?
A. The balance between total and model variations.
B. The sum of squares of total variations.
C. The relationship between residuals and the model.

A

The sum of squares of total variations.

It represents the decomposition of the total sum of squares into the sum of squares explained by the model and the sum of squares left unexplained (residuals).
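
The decomposition can be checked numerically. The sketch below fits a simple least-squares line to made-up data and computes the three sums of squares. The data and the helper name `fit_line` are assumptions for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept and slope for a simple regression of y on x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
y_bar = mean(ys)

sst = sum((y - y_bar) ** 2 for y in ys)                  # total variability
ssm = sum((p - y_bar) ** 2 for p in predicted)           # explained by the model
ssr = sum((y - p) ** 2 for y, p in zip(ys, predicted))   # left in the residuals
r_squared = ssm / sst

print(f"SST = {sst:.2f}, SSM = {ssm:.2f}, SSR = {ssr:.2f}")
print(f"R squared = {r_squared:.2f}")
```

Dividing SSM by SST gives the coefficient of determination, so the same three quantities also answer the R² questions earlier in the deck.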

17
Q

If SSM (Model Sum of Squares) is equal to SSR (Residual Sum of Squares), what can be inferred about the regression model?
A. The model perfectly explains all variations.
B. The model doesn’t explain any variations.
C. The model explains some, but not all, of the variations.

A

The model explains some, but not all, of the variations.

Because SST = SSM + SSR, equal values of SSM and SSR mean that each is half of SST. R² = SSM / SST is therefore 0.5: the model explains half of the variability in the dependent variable and leaves the other half unexplained. A perfect fit would instead require SSR = 0, and a model that explains nothing would have SSM = 0.

18
Q

In the context of SST (Total Sum of Squares), what does “total” refer to?
A. The overall sum of squared residuals.
B. The combined sum of squared model and residual variations.
C. The total number of data points in the analysis.

A

The combined sum of squared model and residual variations.

SST represents the total variability in the dependent variable, including both the variability explained by the model (SSM) and the variability left unexplained (SSR).

19
Q

Which of the following statements about Pearson’s correlation coefficient is not true?
A. It can only be used with continuous variables
B. It can be used as an effect size measure
C. It varies between –1 and +1
D. A correlation coefficient of zero indicates there is no relationship between the variables

A

It can only be used with continuous variables

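Pearson’s correlation coefficient can also be used when one of the variables is binary (the point-biserial correlation), so it is not restricted to continuous variables.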
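
A minimal sketch of Pearson's r computed with a binary variable coded 0 and 1. The data and the helper name `pearson_r` are made up for illustration.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson's r from sums of cross-products and sums of squares."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: group membership (0 or 1) and a continuous score.
group = [0, 0, 0, 0, 1, 1, 1, 1]
score = [3, 4, 5, 4, 6, 7, 8, 7]

r = pearson_r(group, score)
print(f"r = {r:.3f}")  # positive: group 1 scores higher on average
```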

20
Q

A psychologist was interested in whether the amount of news people watch (minutes per day) predicts how depressed they are (from 0 = not depressed to 7 = very depressed). What does the standardized beta (-.224) tell us in the output?

A. As news exposure decreases by 0.224 standard deviations, depression increases by 1 standard deviation
B. As news exposure increases by 1 minute, depression decreases by 0.224 units
C. As news exposure decreases by 0.224 minutes, depression increases by 1 unit
D. As news exposure increases by 1 standard deviation, depression decreases by 0.224 of a standard deviation

A

As news exposure increases by 1 standard deviation, depression decreases by 0.224 of a standard deviation

The standardized beta expresses the relationship between predictor and outcome in standard deviation units, so it does not depend on the original units of measurement.
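
As a sketch of how a standardized beta relates to the unstandardized slope, the example below rescales the slope by the two standard deviations. The data are made up and will not reproduce the -.224 from the card. In a regression with a single predictor, the standardized beta equals Pearson's r, which the sketch also checks.

```python
from statistics import mean, stdev

# Hypothetical data: minutes of news per day and depression scores (0 to 7).
news_minutes = [10, 20, 30, 40, 50, 60]
depression = [5, 3, 4, 2, 3, 1]

mx, my = mean(news_minutes), mean(depression)
sxy = sum((x - mx) * (y - my) for x, y in zip(news_minutes, depression))
sxx = sum((x - mx) ** 2 for x in news_minutes)
syy = sum((y - my) ** 2 for y in depression)

b = sxy / sxx                                        # change in depression per extra minute
beta = b * stdev(news_minutes) / stdev(depression)   # change in SDs per 1 SD of news
r = sxy / (sxx * syy) ** 0.5                         # Pearson's r, for comparison

print(f"b = {b:.4f} units per minute")
print(f"standardized beta = {beta:.3f}")
print(f"Pearson r = {r:.3f}")
```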

21
Q

A psychologist was interested in whether the amount of news people watch predicts how depressed they are. In this table, what does the value 4.404 represent (F)?

A. The ratio of how much the prediction of depression has improved by fitting the model, compared to how much variability there is in depression scores
B. The ratio of how much error there is in the model, compared to how much variability there is in depression scores
C. The proportion of variance in depression explained by news exposure
D. The ratio of how much the prediction of depression has improved by fitting the model, compared to how much error still remains

A

The ratio of how much the prediction of depression has improved by fitting the model, compared to how much error still remains
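
The ratio can be written out as a short sketch. The sums of squares below are hypothetical and will not reproduce the 4.404 from the card's output, and the function name `f_ratio` is an assumption for illustration.

```python
def f_ratio(ssm, ssr, n, k):
    """F = mean squares for the model / mean squares for the residuals."""
    ms_model = ssm / k                 # k = number of predictors
    ms_residual = ssr / (n - k - 1)    # n = number of cases
    return ms_model / ms_residual

# Hypothetical values: one predictor, five cases.
f_value = f_ratio(ssm=3.6, ssr=2.4, n=5, k=1)
print(f"F = {f_value:.2f}")
```

A larger F means the improvement from fitting the model is large relative to the error that remains.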

22
Q

The coefficient of determination:
A. Is the square root of the variance
B. Is a measure of the amount of variability in one variable that is shared by the other variable
C. Is the square root of the correlation coefficient
D. Indicates whether the correlation coefficient is significant

A

Is a measure of the amount of variability in one variable that is shared by the other variable

The proportion of the variation in the outcome variable (Y) that is predictable from the predictor variable (X).
A measure of how much variability in one variable can be “explained by another”.
R² shows how well terms (data points) fit a model curve or line.