10a | Regression / Classification Flashcards
(POLL)
To determine the strength of an association between two numerical variables which are normally distributed we use a …
* t-test
* Wilcoxon test
* Pearson correlation test
* Kendall tau test
* Spearman correlation test
- Pearson correlation test
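As a minimal sketch of what the Pearson test computes, the Python snippet below derives r and the matching t statistic by hand. The data values and the helper names `pearson_r` and `t_statistic` are made up for illustration. In practice a statistics package would also return the p-value from a t distribution with n - 2 degrees of freedom.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """Test statistic for H0 'no linear association' (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical measurements, invented for this example.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
t = t_statistic(r, len(x))
print(f"r = {r:.3f}, t = {t:.2f}")
```

A large absolute t value relative to the t distribution is what produces the small p-value reported by the test.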
(POLL)
An R-squared value of 0.08 for a Pearson correlation is, according to Cohen’s rule of thumb, a …
* negligible association
* weak association
* medium association
* strong association
Detlef:
* medium association
0.08 is the squared value so r is either 0.28 or -0.28
ChatGPT:
An R-squared value of 0.08 for a Pearson correlation corresponds to a correlation coefficient of |r| = √0.08 ≈ 0.283.
According to Cohen’s rule of thumb, an r² value of
- 0.09 is considered a medium effect,
- 0.01 is considered a small effect.
Since 0.08 is close to 0.09 but does not quite reach it, it would be interpreted as a weak association, bordering on medium.
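The arithmetic behind both answers can be checked with a short Python sketch. It assumes Cohen's conventional cutoffs for |r| (0.1, 0.3, 0.5) and maps them onto the wording of the poll options. The function name `cohen_label` is hypothetical.

```python
import math

def cohen_label(r):
    """Label |r| with Cohen's conventional cutoffs (0.1 / 0.3 / 0.5)."""
    size = abs(r)
    if size >= 0.5:
        return "strong"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "weak"
    return "negligible"

r = math.sqrt(0.08)  # an R-squared of 0.08 gives |r| of about 0.283
label = cohen_label(r)
print(f"|r| = {r:.3f} -> {label}")
```

With strict cutoffs, 0.283 falls just below the 0.3 boundary. Rounding r to 0.3 gives the "medium" reading instead, which is why the two answers above differ.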
(POLL)
Mutual Information
* covers only linear associations
* covers linear and non-linear associations
* has a value range of -1 to 1
* has a value range of 0 to Inf
* does not provide a p-value
* does provide a p-value
- covers linear and non-linear associations
- has a value range of 0 to Inf
- does not provide a p-value
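To illustrate the non-linear case, here is a small Python sketch with a made-up data set where y = x². The Pearson correlation is zero, while the mutual information is clearly positive. The estimator below is the plain plug-in formula for discrete values, and the names `mutual_information` and `pearson_r` are chosen for this example only.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) for two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        p_indep = (count_x[x] / n) * (count_y[y] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up example: a perfect but non-linear (quadratic) relationship.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]

r = pearson_r(x, y)
mi = mutual_information(x, y)
print(f"Pearson r = {r:.2f}, mutual information = {mi:.2f} bits")
```

The function returns a single non-negative number and no p-value. A significance statement would need an extra step such as a permutation test.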
(POLL)
Which of the following methods can be used to predict qualitative data?
* simple linear regression
* multiple regression
* logistic regression
* regression trees
* classification trees
- logistic regression
- classification trees
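As a rough sketch of how logistic regression predicts a qualitative outcome, the Python code below fits a one-variable model with plain gradient descent and turns the predicted probability into a class label. The data, the labels "no"/"yes", the learning rate, and the function names are all assumptions made for this example. Statistical software normally fits the model with a more refined optimizer.

```python
import math

def fit_logistic(xs, ys, learning_rate=0.1, steps=2000):
    """Fit P(y = 1) = sigmoid(w * (x - mean_x) + b) by gradient descent."""
    n = len(xs)
    mean_x = sum(xs) / n  # centering keeps the optimization well behaved
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * (x - mean_x) + b)))
            grad_w += (p - y) * (x - mean_x)
            grad_b += p - y
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b, mean_x

def predict_class(model, x, labels=("no", "yes")):
    """Return the qualitative label whose predicted probability is higher."""
    w, b, mean_x = model
    p = 1.0 / (1.0 + math.exp(-(w * (x - mean_x) + b)))
    return labels[1] if p >= 0.5 else labels[0]

# Made-up training data: numeric predictor, binary outcome coded as 0/1.
x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 0, 1, 1, 1]

model = fit_logistic(x, y)
print([predict_class(model, v) for v in x])
```

The output of the model is a probability, and the class label (a factor level) comes from thresholding that probability.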
(POLL)
The correlation between A and B is 0.9 (p < 0.001). What might be possible reasons?
* A is directly influencing B
* B is directly influencing A
* A is influencing B via a factor C
* B is influencing A via a factor C
* A and B are controlled by a factor C in the same way
* The correlation is just spurious, by accident
All are possible!
The last one is unlikely to occur by chance, but remember that the Pearson correlation has problems with outliers!
(POLL)
Which of the following association measures are sensitive to outliers?
* Kendall-Tau-Correlation
* Mutual Information
* Pearson-Correlation
* Spearman-Correlation
- Pearson-Correlation
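The following Python sketch shows the effect on made-up data: ten points with a perfect negative trend plus one extreme outlier. The helper names `pearson_r`, `ranks`, and `spearman_rho` are chosen for this example, and Spearman is computed as the Pearson correlation of the ranks.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Made-up data: a perfect negative trend ...
x = list(range(1, 11))
y = list(range(10, 0, -1))
# ... plus a single extreme point far away from the rest.
x_out = x + [100]
y_out = y + [100]

print(f"clean:   Pearson {pearson_r(x, y):.2f}, Spearman {spearman_rho(x, y):.2f}")
print(f"outlier: Pearson {pearson_r(x_out, y_out):.2f}, "
      f"Spearman {spearman_rho(x_out, y_out):.2f}")
```

In this example one point is enough to flip the Pearson coefficient from a perfect negative value to a strongly positive one. The rank-based coefficient is weakened by the extra point but keeps its negative sign.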
(QUIZ 5)
If a correlation plot looks like a diagonal line going from low to high, what value would you expect?
* 1.0
* 0.8
* 0.4
* 0
* -0.4
* -0.8
* -1.0
1.0
(QUIZ 5)
If a correlation plot looks like a narrow ellipsoid cloud going from low to high, what value would you expect?
* 1.0
* 0.8
* 0.4
* 0
* -0.4
* -0.8
* -1.0
0.8
(QUIZ 5)
If a correlation plot looks like a plump ellipsoid cloud going from low to high, what value would you expect?
* 1.0
* 0.8
* 0.4
* 0
* -0.4
* -0.8
* -1.0
0.4
(QUIZ 5)
If a correlation plot looks like a circular cloud, what value would you expect?
* 1.0
* 0.8
* 0.4
* 0
* -0.4
* -0.8
* -1.0
0
(QUIZ 5)
If a correlation plot looks like a plump ellipsoid cloud going from high to low, what value would you expect?
* 1.0
* 0.8
* 0.4
* 0
* -0.4
* -0.8
* -1.0
-0.4
(QUIZ 5)
If a correlation plot looks like a narrow ellipsoid cloud going from high to low, what value would you expect?
* 1.0
* 0.8
* 0.4
* 0
* -0.4
* -0.8
* -1.0
-0.8
(QUIZ 5)
If a correlation plot looks like a diagonal line going from high to low, what value would you expect?
* 1.0
* 0.8
* 0.4
* 0
* -0.4
* -0.8
* -1.0
-1.0
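The link between cloud shape and correlation value in the cards above can be explored with simulated data. This is a sketch under assumptions: the points are drawn from standard normal distributions with a fixed seed, y is built as rho * x plus scaled noise, and the names `simulate_cloud` and `pearson_r` are made up for the example.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_cloud(rho, n=5000, seed=0):
    """Draw n points whose population correlation is rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        xs.append(x)
        # rho = +/-1 removes the noise (a line); rho = 0 leaves only noise (a circle).
        ys.append(rho * x + math.sqrt(1 - rho * rho) * noise)
    return xs, ys

targets = [1.0, 0.8, 0.4, 0.0, -0.4, -0.8, -1.0]
estimates = {rho: pearson_r(*simulate_cloud(rho)) for rho in targets}
for rho in targets:
    print(f"target {rho:+.1f} -> estimated r {estimates[rho]:+.2f}")
```

Plotting the simulated points for each target value reproduces the sequence line, narrow ellipse, plump ellipse, and circle.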
(QUIZ 5)
The R-squared value for Pearson and Spearman correlation is also known as the _________. It summarizes the ______ variance between the two variables.
Coefficient of determination, shared
(QUIZ 5)
If the Pearson correlation r between variables A and B is about 0.5 and we know that A influences B, it means that ____ of the variance in B is explained by A.
25%
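The 25% is simply 0.5² = 0.25. As a hedged check of why squaring r gives the explained variance, the Python sketch below fits a simple least-squares line to made-up data and compares r² with 1 - SS_res / SS_tot. The function names are chosen for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def explained_variance(xs, ys):
    """Share of the variance in ys explained by a least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up values for A (predictor) and B (response).
a = [1, 2, 3, 4, 5, 6]
b = [2, 1, 4, 3, 6, 5]

r = pearson_r(a, b)
share = explained_variance(a, b)
print(f"r = {r:.3f}, r squared = {r * r:.3f}, explained variance = {share:.3f}")
print(f"for r = 0.5 the explained share is {0.5 ** 2:.0%}")
```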
(QUIZ 5)
We can also use Pearson and ______ correlation if we have nominal variables with only _____ levels.
For ordinal variables, where the values have _________, we can also use correlation measures.
By using certain correlation thresholds between individual ______ of variables, we can create correlation ______.
Spearman, 2, a certain order, pairs, networks
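A correlation network can be sketched in a few lines of Python: compute the correlation for every pair of variables and keep an edge when the absolute value reaches a threshold. The variables A to D, their values, the threshold of 0.8, and the function name `correlation_network` are all made up for this illustration.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_network(data, threshold):
    """Return edges (var1, var2, r) for all pairs with |r| >= threshold."""
    edges = []
    for first, second in combinations(sorted(data), 2):
        r = pearson_r(data[first], data[second])
        if abs(r) >= threshold:
            edges.append((first, second, r))
    return edges

# Invented example data: A, B and C move together, D does not.
data = {
    "A": [150, 160, 170, 180, 190],
    "B": [52, 61, 69, 80, 91],
    "C": [36, 38, 41, 43, 46],
    "D": [3, 1, 4, 1, 5],
}

edges = correlation_network(data, threshold=0.8)
for first, second, r in edges:
    print(f"{first} -- {second} (r = {r:.2f})")
```

Raising the threshold makes the network sparser, and lowering it adds weaker links. The choice of threshold is a judgment call.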
(QUIZ 5)
Factor (qualitative) or number (quantitative)?
Simple linear regression
number
(QUIZ 5)
Factor (qualitative) or number (quantitative)?
Multiple linear regression
number
(QUIZ 5)
Factor (qualitative) or number (quantitative)?
Logistic regression
factor
(QUIZ 5)
Factor (qualitative) or number (quantitative)?
Classification tree
factor
(QUIZ 5)
Factor (qualitative) or number (quantitative)?
Regression tree
number
(QUIZ 5)
Decision trees:
In contrast to linear regression techniques, decision trees ______ need a specific data distribution. The input data of the independent variables can be ______ types.
Do not, both numerical and categorical
(QUIZ 5)
Decision trees:
The dependent variable is the variable we _____. If the dependent variable is of type factor we have a ______ tree.
Want to predict, classification
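To make the idea of a classification tree concrete, here is a minimal Python sketch of a single split (a "stump") on one numeric independent variable, chosen by Gini impurity. Real tree implementations repeat this search recursively over all variables. The data and the function names `gini`, `best_split`, and `predict` are assumptions for this example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(xs, labels):
    """Find the threshold on xs with the lowest weighted Gini impurity."""
    pairs = sorted(zip(xs, labels))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for x, label in pairs if x <= threshold]
        right = [label for x, label in pairs if x > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or impurity < best[1]:
            best = (threshold, impurity)
    return best

def predict(xs, labels, threshold, new_x):
    """Predict the majority class on the side of the split where new_x falls."""
    side = [label for x, label in zip(xs, labels)
            if (x <= threshold) == (new_x <= threshold)]
    return Counter(side).most_common(1)[0][0]

# Made-up data: numeric independent variable, factor as dependent variable.
x = [1, 2, 3, 10, 11, 12]
species = ["a", "a", "a", "b", "b", "b"]

threshold, impurity = best_split(x, species)
print(f"split at x <= {threshold}, weighted Gini = {impurity}")
print(predict(x, species, threshold, 4), predict(x, species, threshold, 9))
```

Because the dependent variable here is a factor, the leaves predict a class. With a numeric dependent variable the same search would minimize variance and the leaves would predict a mean, giving a regression tree.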
(QUIZ 5)
The variables shown in the decision tree are usually the ______ variables. If we have many independent variables, using a ______ is a good alternative to the decision tree.
Independent, random forest