Statistics VIII - Exam Questions II Flashcards

1
Q

What’s the linear regression model?
What assumptions are made about the errors or residuals?
What are the consequences if those assumptions are not true and what can one do about it?

A

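A model of the linear effect of x on y: y = β₀ + β₁x + ε.
The errors are assumed to be independent and normally distributed, with mean zero and constant variance.
If these assumptions are violated (e.g. a histogram of the residuals is not normally distributed), standard errors and p-values become unreliable; possible remedies are transforming the variables or using a robust or non-parametric method.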

2
Q

What’s the difference between multiple and multivariate regression?

A

Multiple regression takes several independent variables and only one dependent variable.
Multivariate regression takes several dependent variables (with one or more independent variables).

3
Q

How is the coefficient of determination calculated and how can it be interpreted?

A

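For simple linear regression (one predictor), the coefficient of determination is the squared correlation coefficient: R² = r². In general, R² = 1 - SS_res / SS_tot (residual sum of squares divided by total sum of squares).
R² indicates the proportion of the variance of y that is explained by the linear relation to x.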
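
A minimal Python sketch of the two ways to compute R² for a single predictor. The helper names (simple_linear_fit, r_squared, pearson_r) and the values of x and y are illustrative assumptions, not part of the original card.

```python
def mean(values):
    return sum(values) / len(values)


def simple_linear_fit(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(x, y):
    """R^2 = 1 - SS_res / SS_tot for the fitted line."""
    intercept, slope = simple_linear_fit(x, y)
    my = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation coefficient r."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


# Made-up example data (assumption for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

print(r_squared(x, y), pearson_r(x, y) ** 2)  # the two values agree
```

With more than one predictor, only the 1 - SS_res / SS_tot form applies directly.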

4
Q

What’s a partial coefficient of regression?

A

y = β₀ + β₁z₁ + β₂z₂ + β₃z₃ + ε
βᵢ … partial coefficient of regression
Interpretation: β₂ is the expected change in y per unit change in z₂ when z₁ and z₃ are held constant.
In general, the partial coefficient of regression is NOT equal to the bivariate coefficient of regression (the two coincide only if the predictors are uncorrelated).
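
A small Python sketch of why the partial and the bivariate coefficient differ. It is simplified to two predictors (the card uses three), and it obtains the partial coefficient by regressing residuals on residuals, which gives the same value as the coefficient from the multiple regression. The data are constructed as y = 1 + 2·z1 + 3·z2 with correlated predictors; all names and numbers are assumptions for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """Least-squares slope of y regressed on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def residuals(predictor, response):
    """Residuals of response after a simple regression on predictor."""
    b = slope(predictor, response)
    a = mean(response) - b * mean(predictor)
    return [r - (a + b * p) for p, r in zip(predictor, response)]


# Constructed data (assumption): z1 and z2 are correlated, no noise.
z1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
z2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1 + 2 * a + 3 * b for a, b in zip(z1, z2)]

# Bivariate coefficient: y on z2 alone, ignoring z1.
bivariate_b2 = slope(z2, y)

# Partial coefficient: effect of z2 with z1 held constant,
# obtained by removing the linear effect of z1 from both y and z2.
partial_b2 = slope(residuals(z1, z2), residuals(z1, y))

print(bivariate_b2, partial_b2)
```

Here the partial coefficient recovers the value 3 used to build the data, while the bivariate coefficient is larger because z2 also carries part of the effect of z1.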

5
Q

What are the differences between a factor analysis and a PCA (in their calculation and interpretation)?

A

When an investigator has a set of hypotheses that form the conceptual basis for her/his factor analysis, the investigator performs a confirmatory, or hypothesis-testing, factor analysis. In contrast, when there are no guiding hypotheses and the question is simply what the underlying factors are, the investigator conducts an exploratory factor analysis.
The factors in factor analysis are conceptualized as “real world” entities such as depression, anxiety, and disturbed thought. This is in contrast to principal components analysis (PCA), where the components are simply geometrical abstractions that may not map easily onto real-world phenomena.

6
Q

You designed a questionnaire for a study. After a test run you notice that questions 1 and 3 are answered in exactly the same way by all participants. Why should you consider rethinking your questions?

A

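There will be no variance on these questions, so they cannot distinguish between participants or contribute to any correlation. (If instead each participant merely gives the same answer to question 1 as to question 3, the two questions are perfectly correlated and one of them is redundant.)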

7
Q

You use three variables to measure fish in two ponds and you are interested in the differences between the two populations. The uncorrected p-values for the group differences (e.g. t-test) are:
body weight: 0.04
body length: 0.01
length of tail fin: 0.005
If you want to accept a collective alpha error of 5% after a Bonferroni correction, for which of the three variables do you assume a significant group difference?

A

0.05 (5%) / 3 (number of tests) = 0.0167
-> Only body length (0.01) and length of tail fin (0.005) are below the corrected threshold and remain significant; body weight (0.04) does not.
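
The same calculation as a short Python sketch. The function name bonferroni_significant is a hypothetical helper; the p-values are the ones given in the question.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each test as significant at the Bonferroni-corrected level."""
    threshold = alpha / len(p_values)  # 0.05 / 3 for three tests
    return {name: p <= threshold for name, p in p_values.items()}


p_values = {
    "body weight": 0.04,
    "body length": 0.01,
    "length of tail fin": 0.005,
}

result = bonferroni_significant(p_values)
print(result)
```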

8
Q

Plants and fertilizer: height of plant (in cm) = Y and amount of fertilizer (in g) = X.
Interpret the following result (describe equation in words): Y = 10.3 + 1.7 X
What should you control for in this experiment?

A

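Positive influence of X on Y.
Intercept: without fertilizer (X = 0 g), the predicted plant height is 10.3 cm.
Slope: each additional gram of fertilizer is associated with 1.7 cm of additional height.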
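
As a quick check of this reading, the equation can be written as a Python function. The name predicted_height_cm is an assumption for illustration, and the prediction is only meaningful within the range of fertilizer amounts that was actually tested.

```python
def predicted_height_cm(fertilizer_g):
    """Predicted plant height from Y = 10.3 + 1.7 X."""
    return 10.3 + 1.7 * fertilizer_g


# Intercept: height without fertilizer.
print(predicted_height_cm(0))

# Slope: difference in predicted height for one additional gram.
print(predicted_height_cm(3) - predicted_height_cm(2))
```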

9
Q

What does R² usually stand for?

A

Coefficient of determination

10
Q

What does MANCOVA mean?

A

Multivariate Analysis of Covariance

11
Q

What is a partial correlation coefficient?

A

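The partial correlation coefficient of x and y given z is the correlation between the residuals of x and the residuals of y after each has been regressed on z. It measures the association between x and y with the linear effect of z removed.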
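
A minimal Python sketch of the residual-based definition, checked against the standard closed-form expression for a single control variable z. All function names and the data are assumptions for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def residuals(predictor, response):
    """Residuals of response after a simple regression on predictor."""
    mp, mr = mean(predictor), mean(response)
    b = (sum((p - mp) * (r - mr) for p, r in zip(predictor, response))
         / sum((p - mp) ** 2 for p in predictor))
    a = mr - b * mp
    return [r - (a + b * p) for p, r in zip(predictor, response)]


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    return pearson_r(residuals(z, x), residuals(z, y))


def partial_correlation_formula(x, y, z):
    """Closed form: (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2))."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


# Made-up example data (assumption for illustration only).
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.5, 3.1, 2.9, 5.2, 4.8, 7.1]

print(partial_correlation(x, y, z), partial_correlation_formula(x, y, z))
```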

12
Q

What’s the difference between 1-way and 2-way ANOVA?

A

The 2-way ANOVA is an extension of the 1-way ANOVA that tests the effects of two categorical independent variables (factors), and their interaction, on one dependent variable. (1-way ANOVA uses just one independent variable.)

13
Q

You recorded 5 physiological measurements in one group of subjects before and after the treatment with a certain drug. How will you proceed statistically to investigate the effect of the drug?

A

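Use a paired (within-subject) comparison of the before and after values -> smaller differences can be detected that would be masked by between-subject variance in cross-sectional studies. Because 5 measurements are tested, either analyse them jointly (e.g. a paired Hotelling’s T² on the differences) or correct the 5 paired tests for multiple testing.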

14
Q

You are interested in whether students, who do have plants in their classroom, get better grades. You record grades in the subjects Maths, German and Biology and conduct your study at 5 different schools.
How many factors are there in your study and which are fixed and which are random?

A

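There are 2 factors.

Fixed factor:
plants or no plants

Random factor:
school (random if the 5 schools are treated as a sample from a larger population of schools; fixed if only these 5 schools are of interest)

Not factors:
grades in Maths, German and Biology (these are the dependent variables)
exact age of students, ethnicity (possible covariates)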

15
Q

You are interested in whether students, who do have plants in their classroom, get better grades. You record grades in the subjects Maths, German and Biology and conduct your study at 5 different schools.
Which statistical analysis would you use?

A

MANOVA

16
Q

You are interested in whether students, who do have plants in their classroom, get better grades. You record grades in the subjects Maths, German and Biology and conduct your study at 5 different schools.
How can you correct for the different ages of the subjects?

A

Using MANCOVA and age as covariate.

17
Q

You are interested in whether students, who do have plants in their classroom, get better grades. You record grades in the subjects Maths, German and Biology and conduct your study at 5 different schools.
The p-value for your whole model is 0.19. What does it mean?

A

p = 0.19 is above the usual 5% level: no significant difference between plants and no-plants.

18
Q

You are interested in whether students, who do have plants in their classroom, get better grades. You record grades in the subjects Maths, German and Biology and conduct your study at 5 different schools.
The p-value for your whole model is 0.01. How can you proceed?

A

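Follow up with univariate analyses (e.g. a separate test per subject, corrected for multiple testing) to determine which grades are influenced and in which direction.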

19
Q

A study investigates 10 variables in 2 groups of 50 cases each. For each of these variables a t-test was used to search for differences of the means. The authors of the study claim that for two variables they found significant differences between the means (p = 0.05). Would you trust their results? Why?

A

No. They apparently haven’t applied a Bonferroni correction, after which the critical p-value would be 0.05 / 10 = 0.005.
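
A short Python sketch of why uncorrected testing is a problem here. It assumes the 10 tests are independent, which is a simplification (variables measured on the same cases are usually correlated); the variable names are illustrative.

```python
alpha = 0.05
n_tests = 10

# Bonferroni-corrected critical p-value for each single test.
corrected_threshold = alpha / n_tests

# Probability of at least one false positive among 10 uncorrected
# tests, ASSUMING the tests are independent and all null hypotheses hold.
family_wise_error = 1 - (1 - alpha) ** n_tests

print(corrected_threshold, family_wise_error)
```

Under that assumption, the chance of at least one spurious "significant" result is roughly 40%, far above the nominal 5%.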

20
Q

A study investigates 10 variables in 2 groups of 50 cases each. For each of these variables a t-test was used to search for differences of the means. The authors of the study found no significant univariate t-test. Could there still be a (multivariate) difference between the two groups?

A

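Yes. A multivariate test such as MANOVA or Hotelling’s T² considers all variables and their correlations jointly and can detect a group difference that no single univariate test shows.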

21
Q

As with other parametric tests, we make the following assumptions when using two-way ANOVA:

A
  • The populations from which the samples are obtained must be normally distributed.
  • Sampling is done correctly. Observations within and between groups must be independent.
  • The variances of the populations must be equal (homoscedasticity).
  • The dependent variable is measured on an interval (or ratio) scale; the factors are nominal (categorical).
22
Q

What is the trace of a matrix?

A

The sum of all elements on the main diagonal of a square matrix.
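
A minimal Python sketch, with a made-up 3 × 3 matrix stored as a list of rows. The function name trace is an illustrative helper.

```python
def trace(matrix):
    """Sum of the main-diagonal elements of a square matrix."""
    if any(len(row) != len(matrix) for row in matrix):
        raise ValueError("trace is only defined for square matrices")
    return sum(matrix[i][i] for i in range(len(matrix)))


m = [
    [2, 0, 1],
    [0, 3, 5],
    [7, 1, 4],
]

print(trace(m))  # 2 + 3 + 4 = 9
```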

23
Q

What does cross-validation mean?

A

Testing the classification with a sample that was not used for the computation of classification rules.
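
A minimal Python sketch of k-fold cross-validation. The classifier (a one-dimensional nearest-centroid rule), the function names, and the toy data are all assumptions chosen to keep the example short; any classification method could take their place.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Split the indices 0..n-1 into k disjoint folds after shuffling."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(xs, labels):
    """Classification rule: the mean value of each class."""
    sums, counts = {}, {}
    for x, label in zip(xs, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def cross_validated_accuracy(xs, labels, k=5, seed=0):
    """Each fold is classified by rules computed WITHOUT that fold."""
    correct = 0
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        centroids = fit_centroids([xs[i] for i in train_idx],
                                  [labels[i] for i in train_idx])
        correct += sum(predict(centroids, xs[i]) == labels[i] for i in test_idx)
    return correct / len(xs)


# Toy data (assumption): two well-separated classes.
xs = [1.0, 1.2, 0.8, 1.1, 0.9, 3.0, 3.2, 2.8, 3.1, 2.9]
labels = ["a"] * 5 + ["b"] * 5

accuracy = cross_validated_accuracy(xs, labels, k=5)
print(accuracy)
```

The key point is inside the loop: the held-out cases never contribute to the centroids that are used to classify them.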