Statistical Analysis Flashcards

1
Q

What do you press to put your calculator in stats mode so you can enter the data needed for Pearson's correlation coefficient? (CASIO)

A
  1. MODE
  2. STAT - (2)
  3. A + Bx - (2)
2
Q

Once you have entered your data into the calculator, what do you press to work out Pearson's correlation coefficient? (CASIO)

A

(Assuming the calc is already in stats mode)

  1. SHIFT STAT - (1)
  2. Reg - (5)
  3. r - (3)
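
As a cross-check on the calculator result, the sketch below computes Pearson's correlation coefficient from the standard deviation-products formula. The function name `pearson_r` and the data are made up for illustration and are not part of the original cards.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied and test marks.
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 70]
r = pearson_r(hours, marks)
print(round(r, 3))
```

The value printed should agree with the calculator's r for the same data, up to rounding.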
3
Q

When working out the line of best fit, what do you press to reach the five values you need? (CASIO)

A

(Assuming the calc is already in stats mode)

  1. SHIFT STAT - (1)
  2. Var - (4)
4
Q

Which formulas do you need to work out the least squares line of best fit? Name them in order.

A
  1. Gradient
  2. Y-intercept
5
Q

What is the gradient formula used to work out the least squares line of best fit equation?

A

gradient (m) = r × (σy ÷ σx)

6
Q

What is the Y-intercept formula used to work out the least squares line of best fit equation?

A

y-intercept (b) = ȳ − (gradient × x̅)

7
Q

What is the formula used to work out the least squares line of best fit equation?

A

y = (gradient × x) + y-intercept (AKA y = mx + b)
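
The sketch below builds the line of best fit from the five values on the following cards (x̅, ȳ, r, σx, σy). It assumes the usual forms gradient = r × σy ÷ σx and y-intercept = ȳ − gradient × x̅; the function name `best_fit_line` and the data are hypothetical.

```python
from statistics import mean, pstdev

def best_fit_line(xs, ys):
    """Return (m, b) for the least squares line y = mx + b."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    # Population standard deviations (sigma x, sigma y). The ratio
    # sd_y / sd_x is the same if sample deviations are used for both.
    sd_x, sd_y = pstdev(xs), pstdev(ys)
    # Correlation coefficient r from the mean product of deviations.
    r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n * sd_x * sd_y)
    m = r * sd_y / sd_x          # assumed gradient formula
    b = mean_y - m * mean_x      # assumed y-intercept formula
    return m, b

# Made-up data lying exactly on y = 2x + 1.
m, b = best_fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(round(m, 6), round(b, 6))
```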

8
Q

What is the meaning of this value - x̅ - used in the calculation of least squares line of best fit?

A

x̅ is the mean of x scores

9
Q

What is the meaning of this value - ȳ - used in the calculation of least squares line of best fit?

A

ȳ is the mean of y scores

10
Q

What is the meaning of this value - r - used in the calculation of least squares line of best fit?

A

r is the correlation coefficient

11
Q

What is the meaning of this value - σx - used in the calculation of least squares line of best fit?

A

σx is the standard deviation of x scores

12
Q

What is the meaning of this value - σy - used in the calculation of least squares line of best fit?

A

σy is the standard deviation of y scores

13
Q

What is systematic sampling?

A

A systematic sample is obtained by selecting one person at random and then choosing additional people at evenly spaced intervals.
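
A minimal sketch of systematic sampling, assuming the population is held in a list and the sample size is no larger than the population. The function name `systematic_sample` and its `seed` parameter are illustrative only.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random starting point, then every k-th member after it."""
    rng = random.Random(seed)
    interval = len(population) // sample_size   # spacing between picks
    start = rng.randrange(interval)             # random first person
    return [population[start + i * interval] for i in range(sample_size)]

# Hypothetical population: student numbers 0 to 99.
students = list(range(100))
sample = systematic_sample(students, 10, seed=1)
print(sample)
```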

14
Q

What is self-selected sampling?

A

A sample is self-selected when the inclusion or exclusion of people is determined by whether they chose to participate themselves.

15
Q

What is simple random sampling?

A

A simple random sample is obtained when participants are selected from the population at random, so that every member has an equal chance of being chosen.

16
Q

What is stratified sampling?

A

The population is broken into groups based on one particular characteristic or feature. A stratified sample is then obtained by selecting a simple random sample from each group.
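
The two steps above (group, then take a simple random sample from each group) can be sketched as follows. Taking the same number from every group is an assumption made here for simplicity; the function name `stratified_sample` and the data are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(people, key, per_group, seed=None):
    """Group people by key(person), then randomly sample each group."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for person in people:
        groups[key(person)].append(person)
    sample = []
    for members in groups.values():
        # Simple random sample within the group (whole group if it is small).
        sample.extend(rng.sample(members, min(per_group, len(members))))
    return sample

# Made-up population: (year group, student id) pairs.
people = [("Y7", i) for i in range(20)] + [("Y8", i) for i in range(30)]
sample = stratified_sample(people, key=lambda p: p[0], per_group=5, seed=0)
print(sample)
```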

17
Q

What does the categorical variable ‘ordinal’ mean?

A

Categorical data whose categories do indicate an order. E.g. movie ratings.

18
Q

What does the categorical variable ‘nominal’ mean?

A

Categorical data whose categories do not indicate an order. E.g. gender: male and female.

19
Q

What is discrete numerical data?

A

The data obtained when a quantity is counted. It can only take exact numerical values. E.g. the number of rooms in a house.

20
Q

What is continuous numerical data?

A

The data obtained when a quantity is measured. It has an infinite number of possible values within a selected range. E.g. temperature.

21
Q

What is the mode?

A

The mode of a set of data values is the value that appears most often.

22
Q

What is the median?

A

The median is the number in the middle of the data set. To find the median, the data should be arranged in order from least to greatest. If there is an even number of items in the data set, then the median is found by taking the average of the two middle numbers.

23
Q

What is the range?

A

The range is the difference between the highest and lowest values in the data set.

24
Q

What is the mean?

A

The mean is the average of the numbers. It is calculated by adding up all the numbers, then dividing by however many numbers there are.
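
The mode, median, range and mean described on cards 21 to 24 can be checked with Python's standard `statistics` module. The data below are made up for illustration.

```python
from statistics import mean, median, mode

data = [3, 7, 7, 2, 9, 4]            # hypothetical values
ordered = sorted(data)               # [2, 3, 4, 7, 7, 9]

data_mode = mode(data)               # value that appears most often
data_median = median(data)           # even count: average of the two middle values
data_range = max(data) - min(data)   # highest value minus lowest value
data_mean = mean(data)               # sum of the values divided by how many there are

print(data_mode, data_median, data_range, round(data_mean, 2))
```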

25
Q

Finish the sentence…

“The symbol ‘μ’ represents the population ?”

A

The symbol ‘μ’ represents the population mean.

26
Q

What are the steps taken to calculate the interquartile range?

A
  1. Put the numbers in order.
  2. Mark the centre of the data (the median).
  3. Find Q1 and Q3 - Q1 is the median (the middle) of the lower half of the data, and Q3 is the median (the middle) of the upper half of the data.
  4. Subtract Q1 from Q3 - IQR = Q3 - Q1
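
The four steps above can be sketched as below. This version assumes at least two data values and leaves the middle value out of both halves when the count is odd, which is one common convention. The function name `interquartile_range` is hypothetical.

```python
from statistics import median

def interquartile_range(values):
    ordered = sorted(values)        # step 1: put the numbers in order
    half = len(ordered) // 2        # step 2: locate the centre of the data
    lower = ordered[:half]          # lower half (middle value excluded if odd count)
    upper = ordered[-half:]         # upper half
    q1 = median(lower)              # step 3: Q1 and Q3 are the medians of the halves
    q3 = median(upper)
    return q3 - q1                  # step 4: IQR = Q3 - Q1

print(interquartile_range([13, 1, 9, 5, 3, 11, 7]))
print(interquartile_range([2, 4, 6, 8]))
```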
27
Q

What is the effect of outliers on summary statistics?

A

Outliers may skew results. They often have a significant effect on the mean but generally have little effect on the median.
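
A small illustration with made-up scores: adding one extreme value moves the mean a long way while the median barely changes.

```python
from statistics import mean, median

scores = [12, 13, 11, 14, 12]        # hypothetical data
with_outlier = scores + [60]         # same data plus one outlier

mean_shift = mean(with_outlier) - mean(scores)
median_shift = median(with_outlier) - median(scores)

print(round(mean_shift, 2), median_shift)
```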

28
Q

What is an outlier?

A

An outlier is a value that is very different from the other data in your data set. E.g. the 10 in 1, 2, 2, 3, 10, or the 2 in 12, 13, 11, 2.

29
Q

What is a unimodal distribution in a data set?

A

A distribution with one clear peak or most frequent value.

30
Q

What is a bimodal distribution in a data set?

A

A bimodal distribution is a set of data that has two peaks (modes).

31
Q

What is a multimodal distribution in a data set?

A

Distribution of data with more than one peak, or “mode.”

32
Q

What values are included in a five-number summary?

A

The five-number summary of a data set consists of five numbers: the smallest value, Q1, the median, Q3 and the largest value of the data set.

33
Q

What is theoretical probability?

A

Theoretical probability expresses the likelihood that an event will occur, worked out by reasoning about the possible outcomes rather than by running an experiment.

34
Q

What is relative frequency?

A

In statistics, the frequency of an event is the number of times the event occurred in an experiment or study. The relative frequency is that frequency divided by the total number of trials.
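
As a sketch, relative frequency is taken here to be the frequency of an outcome divided by the total number of trials. The die rolls below are made up for illustration.

```python
from collections import Counter

rolls = [1, 3, 6, 6, 2, 6, 4, 5, 6, 1]   # hypothetical results of 10 die rolls

frequency = Counter(rolls)               # how many times each face occurred
relative_frequency = {
    face: count / len(rolls)             # frequency divided by number of trials
    for face, count in frequency.items()
}

print(frequency[6], relative_frequency[6])
```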

35
Q

If event (A) is an impossibility, P(A) = ?

A

P(A) = 0

36
Q

If event (A) is a certainty, P(A) = ?

A

P(A) = 1

37
Q

What is the formula used to calculate the probability of an event where outcomes are equally likely?

A

P(event) = number of favourable outcomes ÷ total number of outcomes

38
Q

What is the complement rule?

A

The Complement Rule states that the sum of the probabilities of an event and its complement must equal 1. For the event A, P(A) + P(A’) = 1.
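
A worked example with a fair six-sided die, assuming equally likely outcomes so that a probability is the number of favourable outcomes divided by the total number of outcomes. The helper name `probability` is hypothetical.

```python
from fractions import Fraction

def probability(favourable, total):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(favourable, total)

p_six = probability(1, 6)     # P(A): rolling a six
p_not_six = 1 - p_six         # P(A'): complement rule

print(p_six, p_not_six, p_six + p_not_six)
```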

39
Q

What is meant by “the form of an association”?

A

If an association exists between the variables then the points in a scatterplot tend to follow a linear pattern or a curved pattern. This is called the form of an association.

40
Q

Which graph indicates a strong association?

A

GRAPH A

41
Q

Which graph indicates a moderate association?

A

GRAPH B

42
Q

Which graph indicates a weak association?

A

GRAPH C

43
Q

Which graph indicates a positive association?

A

GRAPH A

44
Q

Which graph indicates a negative association?

A

GRAPH B

45
Q

A correlation coefficient between +0.75 and +0.99 indicates what association?

A

Strong positive association

46
Q

A correlation coefficient between +0.50 and +0.74 indicates what association?

A

Moderate positive association

47
Q

A correlation coefficient between +0.25 and +0.49 indicates what association?

A

Weak positive association

48
Q

A correlation coefficient between −0.24 and +0.24 indicates what association?

A

No linear association

49
Q

A correlation coefficient between −0.25 and −0.49 indicates what association?

A

Weak negative association

50
Q

A correlation coefficient between −0.50 and −0.74 indicates what association?

A

Moderate negative association

51
Q

A correlation coefficient between −0.75 and −0.99 indicates what association?

A

Strong negative association
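
The bands on cards 45 to 51 can be collected into one lookup. This sketch assumes cut-offs at 0.25, 0.50 and 0.75 on the size of r, and it labels r = ±1 as "strong", which the cards do not state. The function name `describe_association` is hypothetical.

```python
def describe_association(r):
    """Describe a correlation coefficient using the bands on the cards."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size < 0.25:
        return "no linear association"
    if size < 0.50:
        strength = "weak"
    elif size < 0.75:
        strength = "moderate"
    else:
        strength = "strong"       # assumption: includes r = +1 and r = -1
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} association"

print(describe_association(0.8))
print(describe_association(-0.6))
```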

52
Q

Finish the sentence…

“Pearson’s correlation coefficient ‘r’ has a value between ? and ?”

A

Pearson’s correlation coefficient ‘r’ has a value between −1 and +1.

53
Q

The graph indicates what association?

A

No linear association - r = 0

54
Q

The graph indicates what type of association?

A

Positive linear association - r = +1

55
Q

The graph indicates what type of association?

A

Negative linear association - r = −1

56
Q

The graph indicates what type of association?

A

Weak positive linear association

57
Q

The graph indicates what type of association?

A

Moderate positive association

58
Q

The graph indicates what type of association?

A

Strong positive linear association

59
Q

What is interpolation?

A

Interpolation is the use of the linear regression line to predict values within the range of the dataset.

60
Q

What is extrapolation?

A

Extrapolation is the use of the linear regression line to predict values outside the range of the dataset.
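
A minimal sketch of the difference: the same regression line is used for both, and only the position of x relative to the data range changes the label. The function name `predict` and the numbers are made up.

```python
def predict(x, m, b, x_min, x_max):
    """Predict y from y = mx + b and say which kind of prediction it is."""
    if x_min <= x <= x_max:
        kind = "interpolation"    # x lies within the range of the data
    else:
        kind = "extrapolation"    # x lies outside the range of the data
    return m * x + b, kind

# Hypothetical line y = 2x + 1 fitted to data with x between 0 and 10.
print(predict(5, 2, 1, 0, 10))
print(predict(15, 2, 1, 0, 10))
```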

61
Q

What are the four steps in a statistical investigation?

A
  1. Collect the data
  2. Organise the data
  3. Summarise and display the data
  4. Analyse the data
62
Q

What are some issues in a statistical investigation?

A

A statistical investigation raises a number of ethical issues such as bias, accuracy, copyright and privacy.

63
Q

What is causation?

A

Causation indicates that one event is the result of the occurrence of another event (or variable).

64
Q

Finish the sentence…

A normal distribution has the same ?, ? and ?

A

A normal distribution has the same mean, mode and median.

65
Q

Finish the sentence…

A normal distribution is symmetrical about ?

A

A normal distribution is symmetrical about the centre line.

66
Q

Finish the sentence…

A normal distribution has ? at each end of the x-axis.

A

A normal distribution has an asymptote at each end of the x-axis.

67
Q

Finish the sentence…

Standard deviation measures ?

A

Standard deviation measures how far typical values are from the mean.

68
Q

Finish the sentence…

The ? the graph, the ? the standard deviation, as the graph is ? to the centre.

A

The skinnier the graph, the lower the standard deviation, as the graph is closer to the centre.
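
To illustrate, the two made-up data sets below share the same mean, but the tightly clustered one has the smaller standard deviation. `pstdev` is the population standard deviation from Python's `statistics` module.

```python
from statistics import mean, pstdev

narrow = [48, 49, 50, 51, 52]    # hypothetical values close to the mean
wide = [30, 40, 50, 60, 70]      # hypothetical values spread far from the mean

print(mean(narrow), round(pstdev(narrow), 2))
print(mean(wide), round(pstdev(wide), 2))
```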