bivariate data analysis Flashcards

1
Q

how is a scatterplot constructed?

A
  1. draw a number plane
  2. determine a scale and title for the horizontal or x-axis
  3. determine a scale and title for the vertical or y-axis
  4. plot each ordered pair of numbers with a dot
2
Q

what are the two forms of association in a bivariate scatterplot?

A

linear form - when the points tend to follow a straight line
non-linear form - when the points tend to follow a curved line

3
Q

what are the two directions of association in a bivariate scatterplot?

A

positive - as x increases, y tends to increase (gradient of the line is positive)
negative - as x increases, y tends to decrease (gradient of the line is negative)

4
Q

what are the three strengths of association in a bivariate scatterplot?

A

strong - small amount of scatter in the plot
moderate - modest amount of scatter in the plot
weak - large amount of scatter in the plot

5
Q

what is Pearson’s correlation coefficient?

A

Pearson’s correlation coefficient (r) measures the strength and direction of a linear association (−1 ≤ r ≤ +1)
* positive correlation (0 < r ≤ +1) – as one quantity increases, the other tends to increase
* zero or no correlation (r = 0) – no linear association between the quantities
* negative correlation (−1 ≤ r < 0) – as one quantity increases, the other tends to decrease
note: a high correlation between two variables does not imply that one causes the other
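
as an illustration of the definition above, here is a minimal sketch of how r could be computed by hand in Python. the function name `pearson_r` and the data (hours studied vs. test score) are made up for this example and are not part of the original card.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient r for paired lists xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations from the means (numerator of r).
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Sums of squared deviations (used in the denominator of r).
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # Note: this divides by zero if all x-values or all y-values are equal.
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data, for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 72]
r = pearson_r(hours, scores)
print(round(r, 3))
```

a value of r close to +1 here would indicate a strong positive linear association; it says nothing about whether studying causes the higher scores.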

6
Q

how is the line of best fit determined?

A

a line of best fit is a straight line that approximates a linear association between points. the equation of the line of best fit is written in gradient–intercept form, where m is the gradient and c is the y-intercept:
y = mx + c

7
Q

what is a least-squares line of best fit?

A

a least-squares line of best fit is the line that minimises the sum of the squares of the vertical distances (or residuals) between the data points and the line
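
to make the idea of "sum of the squares of the residuals" concrete, the sketch below compares two candidate lines on a small made-up dataset. the function name `sum_squared_residuals` and the data are assumptions for illustration only.

```python
def sum_squared_residuals(xs, ys, m, c):
    """Sum of squared vertical distances between each point and y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Line A is the least-squares line for this particular data (y = 0.6x + 2.2).
ssr_a = sum_squared_residuals(xs, ys, m=0.6, c=2.2)
# Line B is another line that also passes near the points (y = x + 1).
ssr_b = sum_squared_residuals(xs, ys, m=1.0, c=1.0)

print(ssr_a < ssr_b)  # the least-squares line has the smaller sum
```

any other straight line drawn through this data would give a sum at least as large as the one for line A.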

8
Q

what is the equation of the least-squares line of best fit?

A

the equation is given by y = mx + c, where the gradient (or slope) is
m = r × (sy / sx)
and the y-intercept is
c = ȳ − m x̄
where:
r - Pearson’s correlation coefficient
sx and sy - standard deviations of x and y
x̄ and ȳ - means of x and y
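
the formulas on this card can be sketched in Python as follows. the function name `least_squares_line` and the data are assumptions for illustration; the sketch uses the sample standard deviation (`statistics.stdev`) for both sx and sy, which is one reasonable choice as long as the same kind is used for both.

```python
from statistics import mean, stdev

def least_squares_line(xs, ys):
    """Return (m, c, r) for the least-squares line y = m*x + c."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    # Pearson's r: sample covariance divided by the product of the
    # sample standard deviations.
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    r = cov_xy / (s_x * s_y)
    m = r * s_y / s_x       # gradient: m = r * (sy / sx)
    c = y_bar - m * x_bar   # y-intercept: c = y-bar - m * x-bar
    return m, c, r

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, c, r = least_squares_line(xs, ys)
print(round(m, 2), round(c, 2), round(r, 3))
```

because c is defined as ȳ − m x̄, the resulting line always passes through the point (x̄, ȳ).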

9
Q

what is interpolation?

A

interpolation is the use of the linear regression line to predict values within the range of the dataset. if the data has a strong linear association, we can be more confident that our predictions are accurate. if the data has a weak linear association, we are less confident in our predictions

interpolation predicts values within the dataset range

10
Q

what is extrapolation?

A

extrapolation is the use of the linear regression line to predict values outside the range of the dataset. the x-values used are either smaller or larger than those in the dataset. as with interpolation, the accuracy of predictions depends on the strength of the linear association. it may not be reasonable to extrapolate too far, because the linear trend may not continue beyond the observed data

extrapolation predicts values outside the dataset range
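
the difference between interpolation and extrapolation can be sketched as a simple range check on the x-value used for the prediction. the function name `predict`, the line y = 0.6x + 2.2 and the data are assumptions for illustration only.

```python
def predict(x, m, c, xs):
    """Predict y from y = m*x + c and label the kind of prediction.

    xs is the list of x-values in the original dataset; a prediction is
    interpolation when x lies within their range, otherwise extrapolation.
    """
    y_hat = m * x + c
    kind = "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"
    return y_hat, kind

# Hypothetical dataset x-values and regression line, for illustration only.
xs = [1, 2, 3, 4, 5]
m, c = 0.6, 2.2

y_inside, kind_inside = predict(3.5, m, c, xs)    # x is inside 1..5
y_outside, kind_outside = predict(10, m, c, xs)   # x is beyond the data

print(kind_inside, kind_outside)
```

the label only describes where x sits relative to the data; how far the prediction can be trusted still depends on the strength of the linear association.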

11
Q

what are the steps of a statistical investigation?

A

a statistical investigation involves four steps: collecting data, organising data, summarising and displaying data, and analysing data

12
Q

what are issues in a statistical investigation?

A

a statistical investigation raises a number of ethical issues such as bias, accuracy, copyright and privacy

13
Q

what is causation?

A

causation indicates that one event is the result of the occurrence of another event (or variable)
