Data Analysis: Investigating relationships Flashcards

1
Q

What do correlation coefficient (r) tests do?

A

They measure the strength and direction of the linear relationship between two continuous variables. The coefficient r ranges from -1 to 1:

  • r = -1: perfect negative linear relationship
  • r = 0: no linear relationship
  • r = 1: perfect positive linear relationship
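
As a minimal sketch of how r is calculated, the function below computes the Pearson coefficient from the sums of squares about the means. The function name `pearson_r` and the height/weight numbers are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data: heights (cm) and weights (kg).
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 77, 88]

r = pearson_r(heights, weights)
print(round(r, 3))
```

For this made-up data r is about 0.996, i.e. close to a perfect positive linear relationship.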

2
Q

How do we interpret the correlation coefficient?

A

  • -0.3 to 0.3: weak
  • -0.5 to -0.3 or 0.3 to 0.5: moderate
  • -0.9 to -0.5 or 0.5 to 0.9: strong
  • -1.0 to -0.9 or 0.9 to 1.0: very strong
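
The bands above can be sketched as a small lookup. The bands share their end points (0.3, 0.5 and 0.9 each appear in two bands), so this sketch assumes a value exactly on a boundary goes into the stronger band; that tie-breaking rule and the name `describe_strength` are assumptions, not part of the card.

```python
def describe_strength(r):
    """Label the strength of a correlation coefficient using the bands above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # strength ignores the sign; the sign gives the direction
    if size < 0.3:
        return "weak"
    if size < 0.5:
        return "moderate"
    if size < 0.9:
        return "strong"
    return "very strong"

for value in (0.1, -0.4, 0.7, -0.95):
    direction = "positive" if value > 0 else "negative"
    print(value, describe_strength(value), direction)
```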

3
Q

When is regression useful?

A

Regression is useful when we want to:

  1. look for significant relationships between two variables
  2. predict a value of one variable for a given value of the other

It involves estimating the line of best fit through the data, which is the line that minimises the sum of the squared residuals.
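
A minimal sketch of estimating the line of best fit and using it for prediction, assuming simple linear regression with one independent variable. The function name `fit_line` and the data are made up; testing whether the relationship is significant is not shown here.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx                      # minimises the sum of squared residuals
    intercept = mean_y - slope * mean_x    # the line passes through the two means
    return intercept, slope

# Made-up illustrative data: heights (cm) and weights (kg).
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 77, 88]

intercept, slope = fit_line(heights, weights)
predicted = intercept + slope * 175   # predicted weight for a height of 175 cm
print(round(intercept, 2), round(slope, 2), round(predicted, 2))
```

For this data the fitted line is weight = -85.9 + 0.91 × height, giving a predicted weight of 73.35 kg at 175 cm.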
4
Q

What are residuals?

A

The differences between the observed and predicted values of the dependent variable (e.g. the observed and predicted weights)
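
A short sketch of computing residuals as observed minus predicted. The data are made up, and the intercept and slope are the least-squares values for that data; the second line is an arbitrary alternative included only to show that it gives a larger sum of squared residuals.

```python
# Made-up illustrative data: heights (cm) and observed weights (kg).
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 77, 88]

def residuals_for(intercept, slope):
    """Residual = observed value minus the value predicted by the line."""
    return [w - (intercept + slope * h) for h, w in zip(heights, weights)]

def sum_of_squares(values):
    return sum(v ** 2 for v in values)

# Least-squares line for this data (intercept -85.9, slope 0.91).
residuals = residuals_for(-85.9, 0.91)
sse_best = sum_of_squares(residuals)

# An arbitrary alternative line, for comparison only.
sse_other = sum_of_squares(residuals_for(-85.9, 0.95))

print([round(e, 2) for e in residuals])   # residuals about the fitted line
print(round(sse_best, 2), round(sse_other, 2))
```

The residuals about the least-squares line sum to zero (up to rounding), and the alternative line has a larger sum of squared residuals.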

5
Q

What do we assume about regression and what do we plot to check?

A
  1. The relationship between the independent and dependent variables is linear. Check using the original scatter plot of the dependent variable against the independent variable.
  2. The variance of the residuals about the predicted responses is the same for all predicted responses. Check using a plot of the residuals against the standardised predicted values.
  3. The residuals are independent and normally distributed. Check normality by plotting the residuals in a histogram.
6
Q

How do we check normality?

A

Plot a histogram of the residuals and check that it looks approximately normally distributed
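
A rough sketch of a text histogram of residuals, assuming no plotting library is available. The residual values and the helper name `histogram_counts` are made up; in practice a statistics package would draw the histogram.

```python
def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_bins
    if width == 0:
        raise ValueError("all values are identical, so no histogram can be drawn")
    counts = [0] * n_bins
    for v in values:
        # The maximum value would land in bin n_bins, so clamp it to the last bin.
        index = min(int((v - lowest) / width), n_bins - 1)
        counts[index] += 1
    return counts

# Made-up residuals from a hypothetical regression.
residuals = [-2.1, -1.2, -0.9, -0.4, -0.3, -0.1, 0.0,
             0.2, 0.3, 0.5, 0.8, 1.1, 2.0]

counts = histogram_counts(residuals, 5)
for count in counts:
    print("#" * count)
```

Here most residuals fall in the middle bin, with roughly symmetric tails on each side, which is the kind of shape that looks approximately normal.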

7
Q

What shape suggests problems for residuals?

A

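A funnelling shape in the plot of residuals against predicted values, which suggests that the variance of the residuals is not constant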
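
As a rough numeric stand-in for looking at the plot, the sketch below compares the spread of the residuals in the lower and upper halves of the predicted values. This is not a formal test and is not the method on the card, which is to inspect the plot; the helper name `spread_ratio` and all numbers are made up.

```python
import statistics

def spread_ratio(predicted, residuals):
    """Residual spread for high predicted values divided by spread for low ones."""
    pairs = sorted(zip(predicted, residuals))   # order by predicted value
    half = len(pairs) // 2
    low = [e for _, e in pairs[:half]]
    high = [e for _, e in pairs[half:]]
    return statistics.pstdev(high) / statistics.pstdev(low)

predicted = [1, 2, 3, 4, 5, 6, 7, 8]

# Residuals that fan out as the predicted value increases (funnel shape).
funnel = [0.1, -0.2, 0.3, -0.4, 1.5, -2.0, 2.5, -3.0]
# Residuals with the same spread everywhere.
even = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]

funnel_ratio = spread_ratio(predicted, funnel)
even_ratio = spread_ratio(predicted, even)
print(round(funnel_ratio, 2), round(even_ratio, 2))
```

A ratio well above 1 (or well below 1) points to a funnel; a ratio near 1 is consistent with constant variance.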

8
Q

What if assumptions are not met for regression?

A

If the residuals are heavily skewed, or show different variances as the predicted values increase, the data needs to be transformed. Try taking the natural log (ln) of the dependent variable, then repeat the analysis and check the assumptions again.
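
A minimal sketch of the transform-and-refit step, using idealised made-up data in which the dependent variable grows exponentially. The names `fit_line`, `dose` and `response` are hypothetical, and real data would include noise, so the assumption checks still need repeating on the transformed model.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Idealised made-up data: response = exp(0.5 + 0.3 * dose), with no noise.
dose = [1, 2, 3, 4, 5]
response = [math.exp(0.5 + 0.3 * d) for d in dose]

# Step 1: take the natural log of the dependent variable.
log_response = [math.log(v) for v in response]

# Step 2: repeat the regression on the transformed values.
intercept, slope = fit_line(dose, log_response)

# Step 3: recompute the residuals so the assumptions can be checked again.
residuals = [ly - (intercept + slope * d) for d, ly in zip(dose, log_response)]

# Predictions are on the log scale, so back-transform with exp.
predicted_response = math.exp(intercept + slope * 6)
print(round(intercept, 3), round(slope, 3), round(predicted_response, 2))
```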

9
Q

What are the steps to choosing the right test?

A
  1. Research question: it must be clear, with measurable quantities.
  2. Dependent variable: which variable is the dependent one? (Think about its type of data.)
  3. Data types: what type of data is each variable?
  4. Are you comparing means?
  5. Do you have repeated measures?
10
Q

What are the two types that statistical tests fall into?

A

Parametric

Assume the data follows a particular distribution, e.g. the normal distribution

Non-parametric

Usually based on ranks or signs rather than the actual data values

  • Numerical data is ordered and ranked; the analysis is then carried out on the ranks rather than the actual data
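
A short sketch of the ranking step that rank-based tests rely on. Giving tied values the average of the ranks they span is a common convention and is assumed here; the function name `average_ranks` and the scores are made up.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Sorted positions i..j (0-based) hold ranks i+1..j+1; share their mean.
        shared = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

scores = [12, 7, 7, 30, 15]   # made-up data; 30 is far from the rest
ranks = average_ranks(scores)
print(ranks)
```

The two tied scores of 7 share rank 1.5, and the extreme value 30 simply becomes rank 5, so its size no longer dominates the analysis.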
11
Q

When are non-parametric tests used?

A
  • When the data is ordinal
  • When the data doesn’t seem to follow any particular shape or distribution
  • When the assumptions underlying the parametric test are not met
  • When a plot of the data appears to be very skewed
  • When there are potentially influential outliers in the dataset
  • When the sample size is small
12
Q

What can be done about non-normality?

A

If the data are not normally distributed, there are two options:

  1. Use a non-parametric test
  2. Transform the dependent variable

For positively skewed data, taking the log of the dependent variable often produces approximately normally distributed values.
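
A small sketch of how a log transform can reduce positive skew, using a moment-based skewness measure (third central moment divided by the 1.5 power of the variance). The data and the helper name `skewness` are made up, and a lower skewness does not by itself prove normality.

```python
import math

def skewness(values):
    """Moment-based skewness: 0 for symmetric data, positive for a long right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # variance
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5

# Made-up positively skewed data (all values must be > 0 to take logs).
raw = [1, 2, 2, 3, 4, 6, 9, 15, 40]
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(round(skew_raw, 2), round(skew_log, 2))
```

For this data the skewness drops from roughly 1.95 to roughly 0.48 after taking logs.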

13
Q

Pair the non-parametric tests with the parametric tests if normality isn’t present.

A
14
Q

Summary

A