Statistical Analysis Flashcards
What do you press to put your calc in stats mode in order to set out the data needed for Pearson’s correlation coefficient? (CASIO)
- MODE
- STAT - (2)
- A + Bx - (2)
Once you have plugged your data into a calc, what do you press to work out Pearson’s correlation coefficient? (CASIO)
(Assuming the calc is already in stats mode)
- SHIFT STAT - (1)
- Reg - (5)
- r - (3)
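To check the value the calculator gives, r can also be worked out by hand or in code. The following is a minimal Python sketch, assuming paired lists of x and y scores; the function name `pearson_r` and the sample data are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]  # made-up x scores
marks = [2, 4, 5, 4, 5]  # made-up y scores
r = pearson_r(hours, marks)
print(round(r, 3))  # 0.775
```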
When working out the line of best fit what do you press to reach the 5 values you need? (CASIO)
(Assuming the calc is already in stats mode)
- SHIFT STAT - (1)
- Var - (4)
Which formulas do you need to work out the least squares line of best fit? Name them in order.
- Gradient
- Y-intercept
What is the gradient formula used to work out the least squares line of best fit equation?
m = r × σy / σx
What is the Y-intercept formula used to work out the least squares line of best fit equation?
b = ȳ − m × x̅
What is the formula used to work out the least squares line of best fit equation?
y = mx + b, where m is the gradient and b is the y-intercept

What is the meaning of this value - x̅ - used in the calculation of least squares line of best fit?
x̅ is the mean of x scores
What is the meaning of this value - ȳ - used in the calculation of least squares line of best fit?
ȳ is the mean of y scores
What is the meaning of this value - r - used in the calculation of least squares line of best fit?
r is the correlation coefficient
What is the meaning of this value - σx - used in the calculation of least squares line of best fit?
σx is the standard deviation of x scores
What is the meaning of this value - σy - used in the calculation of least squares line of best fit?
σy is the standard deviation of y scores
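The five values above (x̅, ȳ, r, σx, σy) are enough to build the line of best fit, using gradient m = r × σy / σx and y-intercept b = ȳ − m × x̅. Here is a small Python sketch under those assumptions; the function name `least_squares_line` and the data are hypothetical, and population standard deviations are used.

```python
from statistics import mean, pstdev

def least_squares_line(xs, ys):
    """Return (m, b) for the line y = mx + b from the five summary values."""
    x_bar, y_bar = mean(xs), mean(ys)
    sigma_x, sigma_y = pstdev(xs), pstdev(ys)  # population standard deviations
    n = len(xs)
    # Correlation coefficient from the deviations about the means
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n * sigma_x * sigma_y)
    m = r * sigma_y / sigma_x  # gradient
    b = y_bar - m * x_bar      # y-intercept
    return m, b

m, b = least_squares_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"y = {m:.1f}x + {b:.1f}")  # y = 0.6x + 2.2
```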
What is systematic sampling?
A systematic sample is obtained by selecting one person at random and then choosing additional people at evenly spaced intervals.
What is self-selected sampling?
A sample is self-selected when the inclusion or exclusion of people is determined by whether they chose to participate themselves.
What is simple random sampling?
Participants are selected from the population at random, so every member has an equal chance of being chosen and the selection is free from bias.
What is stratified sampling?
The population is broken into groups based on one particular characteristic or feature. A stratified sample is then obtained by selecting a simple random sample from each group.
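The sampling methods above can be sketched in Python with the standard `random` module. This is only an illustration: the function names, the `students` list and the choice to pick the systematic starting point from the first interval are all assumptions. Self-selected sampling is not shown because it depends on who volunteers.

```python
import random

def simple_random_sample(population, k, rng):
    """Pick k members at random, each equally likely to be chosen."""
    return rng.sample(population, k)

def systematic_sample(population, interval, rng):
    """Pick a random starting person, then every interval-th person after."""
    start = rng.randrange(interval)
    return population[start::interval]

def stratified_sample(population, group_of, k_per_group, rng):
    """Split the population into groups, then take a simple random sample from each."""
    groups = {}
    for person in population:
        groups.setdefault(group_of(person), []).append(person)
    sample = []
    for members in groups.values():
        sample.extend(rng.sample(members, k_per_group))
    return sample

rng = random.Random(1)  # seeded so the example is repeatable
# Hypothetical population: 20 students split across two year groups
students = [(f"student{i}", "Year 11" if i % 2 else "Year 12") for i in range(20)]

print(simple_random_sample(students, 5, rng))
print(systematic_sample(students, 5, rng))
print(stratified_sample(students, lambda s: s[1], 3, rng))
```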
What does the categorical variable ‘ordinal’ mean?
Categorical data whose categories have a natural order. E.g. movie ratings.
What does the categorical variable ‘nominal’ mean?
Categorical data whose categories have no natural order. E.g. gender: male and female.
What is discrete numerical data?
The data obtained when a quantity is counted. It can only take exact numerical values. E.g. the number of rooms in a house.
What is continuous numerical data?
Quantitative data that is measured rather than counted. It can take any of an infinite number of possible values within a given range. E.g. temperature.
What is the mode?
The mode of a set of data values is the value that appears most often.
What is the median?
The median is the number in the middle of the data set. To find the median, the data should be arranged in order from least to greatest. If there is an even number of items in the data set, then the median is found by taking the average of the two middle numbers.
What is the range?
The range is the difference between the highest and lowest values in the data set.
What is the mean?
The mean is the average of the numbers. It is calculated by adding up all the numbers, then dividing by however many numbers there are.
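The four measures above can be checked with Python's standard `statistics` module. The `scores` list below is made up for illustration.

```python
from statistics import mean, median, mode

scores = [3, 5, 5, 6, 8, 9]  # made-up data set

data_mean = mean(scores)                # (3 + 5 + 5 + 6 + 8 + 9) / 6
data_median = median(scores)            # average of the two middle values, 5 and 6
data_mode = mode(scores)                # the value that appears most often
data_range = max(scores) - min(scores)  # highest value minus lowest value

print(data_mean, data_median, data_mode, data_range)  # 6 5.5 5 6
```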
Finish the sentence…
“The symbol ‘μ’ represents the population ?”
The symbol ‘μ’ represents the population mean.
What are the steps taken to calculate the interquartile range?
- Put the numbers in order.
- Mark the centre of the data (the median) to split it into a lower half and an upper half.
- Find Q1 and Q3 - Q1 is the median (the middle) of the lower half of the data, and Q3 is the median (the middle) of the upper half of the data.
- Subtract Q1 from Q3 - IQR = Q3 - Q1
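The steps above can be sketched in Python as follows. This assumes the convention that, when there is an odd number of values, the middle value is left out of both halves; the function names and data are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)       # step 1: put the numbers in order
    half = len(ordered) // 2       # step 2: find the centre of the data
    lower = ordered[:half]         # lower half (middle value excluded if odd)
    upper = ordered[-half:]        # upper half
    return median(lower), median(upper)  # step 3: Q1 and Q3

def interquartile_range(values):
    q1, q3 = quartiles(values)
    return q3 - q1                 # step 4: IQR = Q3 - Q1

data = [7, 1, 3, 9, 5, 11, 13]     # made-up data set
print(quartiles(data))             # (3, 11)
print(interquartile_range(data))   # 8
```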
What is the effect of outliers on summary statistics?
Outliers may skew results. They often have a significant effect on the mean and the range, but generally have little or no effect on the median.
What is an outlier?
An outlier is a value that is very different from the other data in your data set. E.g. the 10 in 1, 2, 2, 3, 10 or the 2 in 12, 13, 11, 2.
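A quick Python check, using the first example set with and without its outlier, shows how the mean moves while the median stays put. The variable names are made up for illustration.

```python
from statistics import mean, median

without_outlier = [1, 2, 2, 3]
with_outlier = [1, 2, 2, 3, 10]  # 10 is the outlier

# The mean is pulled towards the outlier...
print(mean(without_outlier), mean(with_outlier))      # 2 3.6
# ...but the median is unchanged
print(median(without_outlier), median(with_outlier))  # 2.0 2
```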
What is a unimodal distribution in a data set?
A distribution with one clear peak or most frequent value.

What is a bimodal distribution in a data set?
A bimodal distribution is a set of data that has two peaks (modes).

What is a multimodal distribution in a data set?
Distribution of data with more than one peak, or “mode.”

What values are included in a five-number summary?
The five-number summary of a data set consists of five numbers: the smallest value, Q1, the median, Q3 and the largest value.
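As a rough Python sketch, the five-number summary can be produced like this. It assumes Q1 and Q3 are the medians of the lower and upper halves of the ordered data; the function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for a list of numbers."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])   # median of the lower half
    q3 = median(ordered[-half:])  # median of the upper half
    return ordered[0], q1, median(ordered), q3, ordered[-1]

summary = five_number_summary([4, 8, 15, 16, 23, 42])
print(summary)  # (4, 8, 15.5, 23, 42)
```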
What is theoretical probability?
Theoretical probability is the likelihood that something will occur, worked out by reasoning about the possible outcomes rather than by running an experiment.
What is relative frequency?
In statistics, the frequency of an event is the number of times the event occurred in an experiment or study. The relative frequency is that number divided by the total number of trials.
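Relative frequency takes the count of an event and divides it by the total number of trials. A minimal Python sketch, with a made-up list of coin tosses and a hypothetical function name:

```python
def relative_frequency(outcomes, event):
    """Number of times the event occurred divided by the number of trials."""
    return outcomes.count(event) / len(outcomes)

# Made-up results of 10 coin tosses
tosses = ["H", "T", "H", "H", "T", "H", "T", "T", "H", "H"]
rf_heads = relative_frequency(tosses, "H")
print(rf_heads)  # 0.6
```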
If event (A) is an impossibility, P(A) = ?
P(A) = 0
If event (A) is a certainty, P(A) = ?
P(A) = 1
What is the formula used to calculate the probability of an event where outcomes are equally likely?
P(event) = number of favourable outcomes ÷ total number of possible outcomes
What is the complement rule?
The Complement Rule states that the sum of the probabilities of an event and its complement must equal 1. For the event A, P(A) + P(A’) = 1.
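The probability cards above can be illustrated with a fair six-sided die. This Python sketch assumes all outcomes are equally likely, so the probability is the number of favourable outcomes divided by the total number of outcomes; the function name `probability` is made up.

```python
from fractions import Fraction

def probability(favourable, sample_space):
    """P(event) when every outcome in the sample space is equally likely."""
    return Fraction(len(favourable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}
p_six = probability({6}, die)  # 1/6
p_not_six = 1 - p_six          # complement rule: P(A') = 1 - P(A)

print(p_six, p_not_six)          # 1/6 5/6
print(probability(set(), die))   # impossible event: 0
print(probability(die, die))     # certain event: 1
```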
What is meant by “the form of an association”?
If an association exists between the variables then the points in a scatterplot tend to follow a linear pattern or a curved pattern. This is called the form of an association.
Which graph indicates a strong association?

GRAPH A

Which graph indicates a moderate association?

GRAPH B

Which graph indicates a weak association?

GRAPH C

Which graph indicates a positive association?

GRAPH A

Which graph indicates a negative association?

GRAPH B

A correlation coefficient between +0.75 and +0.99 indicates what association?
Strong positive association
A correlation coefficient between +0.50 and +0.74 indicates what association?
Moderate positive association
A correlation coefficient between +0.25 and +0.49 indicates what association?
Weak positive association
A correlation coefficient between −0.24 and +0.24 indicates what association?
No linear association
A correlation coefficient between −0.25 and −0.49 indicates what association?
Weak negative association
A correlation coefficient between −0.50 and −0.74 indicates what association?
Moderate negative association
A correlation coefficient between −0.75 and −0.99 indicates what association?
Strong negative association
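The bands above can be turned into a small lookup. This Python sketch is one possible reading: it treats 0.25, 0.50 and 0.75 as cut-offs (so values that fall between the listed bands, such as 0.745, go to the lower band) and counts r = ±1 as strong. Both choices are assumptions, as is the function name.

```python
def describe_association(r):
    """Describe the strength and direction of a correlation coefficient."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and +1")
    size = abs(r)
    if size < 0.25:
        return "No linear association"
    if size >= 0.75:
        strength = "Strong"
    elif size >= 0.50:
        strength = "Moderate"
    else:
        strength = "Weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} association"

print(describe_association(0.8))   # Strong positive association
print(describe_association(-0.6))  # Moderate negative association
print(describe_association(0.1))   # No linear association
```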
Finish the sentence…
“Pearson’s correlation coefficient ‘r’ has a value between ? and ?”
Pearson’s correlation coefficient ‘r’ has a value between −1 and +1.
The graph indicates what association?

No linear association - r = 0
The graph indicates what type of association?

Positive linear association - r = +1
The graph indicates what type of association?

Negative linear association - r = −1
The graph indicates what type of association?

Weak positive linear association
The graph indicates what type of association?

Moderate positive association
The graph indicates what type of association?

Strong positive linear association
What is interpolation?
Interpolation is the use of the linear regression line to predict values within the range of the dataset.
What is extrapolation?
Extrapolation is the use of the linear regression line to predict values outside the range of the dataset.
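The difference between the two comes down to whether the x value being predicted lies inside the range of the original data. A short Python sketch, assuming a hypothetical line y = 2x + 1 fitted to x values from 1 to 5:

```python
def predict(x, m, b, x_values):
    """Predict y from the line y = mx + b and say which kind of prediction it is."""
    inside = min(x_values) <= x <= max(x_values)
    kind = "interpolation" if inside else "extrapolation"
    return m * x + b, kind

x_values = [1, 2, 3, 4, 5]  # x scores the line was fitted to (made up)
m, b = 2, 1                 # hypothetical gradient and y-intercept

print(predict(3, m, b, x_values))   # (7, 'interpolation')
print(predict(10, m, b, x_values))  # (21, 'extrapolation')
```

Predictions made by extrapolation are generally less reliable, because there is no data to show that the pattern continues outside the observed range.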
What are the four steps in a statistical investigation?
- Collect the data
- Organise the data
- Summarise and display the data
- Analyse the data
What are some issues in a statistical investigation?
A statistical investigation raises a number of ethical issues such as bias, accuracy, copyright and privacy.
What is causation?
Causation indicates that one event is the result of the occurrence of another event (or variable).
Finish the sentence…
A normal distribution has the same ?, ? and ?
A normal distribution has the same mean, mode and median.
Finish the sentence…
A normal distribution is symmetrical about ?
A normal distribution is symmetrical about the centre line (the mean).
Finish the sentence…
A normal distribution has ? at each end of the x-axis.
A normal distribution has an asymptote at each end of the x-axis: the curve gets closer and closer to the axis but never touches it.
Finish the sentence…
Standard deviation measures ?
Standard deviation measures how far typical values are from the mean.
Finish the sentence…
The ? the graph, the ? the standard deviation, as the graph is ? to the centre.
The skinnier the graph, the lower the standard deviation, as the graph is closer to the centre.
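To see this with numbers, the Python sketch below compares two made-up data sets with the same mean: one bunched close to the centre and one spread out. It uses the population standard deviation from the standard `statistics` module.

```python
from statistics import mean, pstdev

narrow = [9, 10, 10, 10, 11]  # values close to the mean (skinny graph)
wide = [2, 6, 10, 14, 18]     # values far from the mean (wide graph)

print(mean(narrow), mean(wide))   # 10 10
print(round(pstdev(narrow), 2))   # 0.63
print(round(pstdev(wide), 2))     # 5.66
```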