Statistics 1 Flashcards
Class Midpoint
(UpperBound + LowerBound) / 2
Steps of Modelling a Mathematical Problem
- Observe a real world problem
- Devise a mathematical model
- Use the model to make predictions for the real world
- Experimental data is collected from the real world
- Compare predicted and observed outcomes
- Statistical tests are used to assess the model
- Mathematical model is improved and refined
Class Width
UpperBound - LowerBound
Mean
Sum of values / number of values
Combining Means
(N1X1 + N2X2) / (N1 + N2)
N1 = number of values in first sample, N2 = number of values in second sample, X1 = mean of first sample, X2 = mean of second sample
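A minimal sketch of the combined mean, weighting each sample mean by its sample size. The function name and the sample figures are illustrative assumptions, not part of the original cards.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Mean of two samples pooled together, each mean weighted by its sample size."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Hypothetical samples: 10 values with mean 4.0, 30 values with mean 8.0
print(combined_mean(10, 4.0, 30, 8.0))  # 7.0
```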
Coding
Definition
A kind of transformation of data
Coding
General Form
y = (x-a) / b
x = original data or mean, y = new data or mean
Q1
Lower Quartile
Q2
Median
Q3
Upper Quartile
Interquartile Range
Q3 - Q1
Q1
List of Data
4 pots
Q1
Data in a Table
4 pots
Q1
Data in a Grouped Table
n/4 th value
Q2
List of Data
4 pots
Q2
Data in a Table
4 pots
OR
(n+1)/2 th value
Q2
Data in a Grouped Table
n/2 th value
Q3
List of Data
4 pots
Q3
Data in a Table
4 pots
Q3
Data in a Grouped Table
3n/4 th value
Variance
Symbol
σ²
Standard Deviation
Symbol
σ
Variance
Equation (list of data)
Σx² / n - (Σx / n)²
The sum of x² divided by n
Minus
(The sum of x divided by n) all squared
Standard Deviation
Equation
√variance
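As a rough illustration, the list-of-data formulas for variance and standard deviation can be sketched as below. The function names and the data set are assumptions chosen for the example.

```python
import math


def variance(data):
    """Population variance: (sum of x squared) / n minus (sum of x / n) squared."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return sum_x2 / n - (sum_x / n) ** 2


def standard_deviation(data):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(variance(data))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data: sum = 40, sum of squares = 232
print(variance(data))            # 4.0
print(standard_deviation(data))  # 2.0
```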
Variance
Equation (data in a table)
Σfx² / Σf - (Σfx / Σf)²
The sum of frequency times x² divided by the sum of the frequencies
Minus
(The sum of frequency times x divided by the sum of the frequencies) all squared
Variance
Equation (data in a grouped table)
Σfx² / Σf - (Σfx / Σf)²
x is the midpoint of each class
The sum of frequency times x² divided by the sum of the frequencies
Minus
(The sum of frequency times x divided by the sum of the frequencies) all squared
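For grouped data the same calculation uses each class midpoint as x. A minimal sketch follows, assuming classes are supplied as (lower bound, upper bound, frequency) tuples; that layout and the figures are illustrative only.

```python
def grouped_variance(classes):
    """Variance estimate for grouped data, taking x as each class midpoint.

    classes: list of (lower_bound, upper_bound, frequency) tuples.
    """
    sum_f = sum(f for _, _, f in classes)
    sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)
    return sum_fx2 / sum_f - (sum_fx / sum_f) ** 2


# Hypothetical grouped table: midpoints 5, 15, 25 with frequencies 2, 6, 2
classes = [(0, 10, 2), (10, 20, 6), (20, 30, 2)]
print(grouped_variance(classes))  # 40.0
```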
Coding - Mean
±a
±a
Coding - Variance
±a
Not affected
Coding - Standard Deviation
±a
Not affected
Coding - Mean
xa
xa
Coding - Variance
xa
xa²
Coding - Standard Deviation
xa
xa
Coding - Mean
/a
/a
Coding - Variance
/a
/a²
Coding - Standard Deviation
/a
/a
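The coding rules can be checked numerically. The sketch below codes a hypothetical data set with y = (x - a) / b and compares the mean, variance and standard deviation before and after. The values of a, b and the data are assumptions for illustration.

```python
import math
import statistics

x = [110, 120, 130, 140, 150]  # hypothetical original data
a, b = 100, 10                 # hypothetical coding constants
y = [(v - a) / b for v in x]   # coded data: 1.0, 2.0, 3.0, 4.0, 5.0

mean_x, mean_y = statistics.mean(x), statistics.mean(y)
var_x, var_y = statistics.pvariance(x), statistics.pvariance(y)
sd_x, sd_y = statistics.pstdev(x), statistics.pstdev(y)

# The mean is affected by both the subtraction and the division
print(math.isclose(mean_y, (mean_x - a) / b))  # True
# The variance ignores the subtraction and is divided by b squared
print(math.isclose(var_y, var_x / b ** 2))     # True
# The standard deviation ignores the subtraction and is divided by b
print(math.isclose(sd_y, sd_x / b))            # True
```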
Frequency Density
Formula
Frequency Density = Frequency / ClassWidth
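A short sketch of the frequency density calculation, as used for histogram bar heights. The function name and the class used are illustrative assumptions.

```python
def frequency_density(frequency, lower_bound, upper_bound):
    """Frequency divided by class width (upper bound minus lower bound)."""
    return frequency / (upper_bound - lower_bound)


# Hypothetical class 10 to 30 containing 12 values: width 20
print(frequency_density(12, 10, 30))  # 0.6
```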
Negative Skew
Q2 - Q1 > Q3 - Q2
mode > median > mean
Positive Skew
Q2 - Q1 < Q3 - Q2
mode < median < mean
Symmetrical Distribution (No Skew)
Q2 - Q1 = Q3 - Q2
mode = median = mean
Outlier
An extreme value
Greater than Q3 + (1.5 x IQR)
OR
Less than Q1 - (1.5 x IQR)
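The outlier rule can be sketched as follows, assuming the quartiles have already been found and are passed in. The function name, data and quartile values are illustrative.

```python
def find_outliers(data, q1, q3):
    """Return values more than 1.5 x IQR below Q1 or above Q3."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]


# Hypothetical data with Q1 = 12.5 and Q3 = 18.5, so the fences are 3.5 and 27.5
data = [1, 12, 13, 15, 16, 18, 19, 40]
print(find_outliers(data, q1=12.5, q3=18.5))  # [1, 40]
```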
Experiment
Definition
A repeatable process that gives rise to a number of outcomes
Event
Definition
A collection or set of one or more outcomes
Sample Space
Definition
The set of all possible outcomes of an experiment
Probability
' (written after the event, e.g. A')
Not
Probability
U
Union
OR (A or B, or both)
Probability
n
Intersection
AND
Addition Rule
P(AUB) = P(A) + P(B) - P(AnB)
Multiplication Rule
P(AnB) = P(A|B) x P(B), equivalently P(A|B) = P(AnB) / P(B)
Complementary Probability
P(A') = 1 - P(A)
Mutually Exclusive Events
Definition
Events have no outcomes in common
Mutually Exclusive Events
Formulae
P(AUB) = P(A) + P(B)
P(A|B) = 0
P(AnB) = 0
Independent Events
Definition
Events that have no effect on each other are independent
Independent Events
Formulae
P(AnB) = P(A) x P(B)
P(A|B) = P(A)
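The probability rules can be checked on a small sample space. The sketch below assumes a fair six-sided die with hypothetical events A (even score) and B (score of 4 or less), using Python sets for union and intersection.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # fair die, equally likely outcomes (assumed)
A = {2, 4, 6}                      # hypothetical event: even score
B = {1, 2, 3, 4}                   # hypothetical event: score of 4 or less


def prob(event):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event), len(sample_space))


p_union = prob(A | B)             # P(AUB)
p_intersection = prob(A & B)      # P(AnB)
p_not_a = prob(sample_space - A)  # P(A')
p_a_given_b = p_intersection / prob(B)

print(p_union == prob(A) + prob(B) - p_intersection)  # addition rule: True
print(p_not_a == 1 - prob(A))                         # complement: True
print(p_a_given_b)                                    # 1/2
print(p_intersection == prob(A) * prob(B))            # independent: True
```

Here P(A|B) equals P(A), so knowing B occurred does not change the probability of A.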
Bivariate Data
Definition
Data that comes in pairs
Variability of Bivariate Data
Sxx
Sxx = Σx² - (Σx)²/n
Variability of Bivariate Data
Syy
Syy = Σy² - (Σy)²/n
Variability of Bivariate Data
Sxy
Sxy = Σxy - (Σx Σy)/n
Product Moment Correlation Coefficient
Symbol
r
Product Moment Correlation Coefficient
Formula
r = Sxy / √(SxxSyy)
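A minimal sketch of Sxx, Syy, Sxy and the product moment correlation coefficient computed from paired data. The function names and data points are illustrative assumptions.

```python
import math


def summary_statistics(xs, ys):
    """Return (Sxx, Syy, Sxy) for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs) - sum_x ** 2 / n
    syy = sum(y * y for y in ys) - sum_y ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return sxx, syy, sxy


def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    sxx, syy, sxy = summary_statistics(xs, ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]  # hypothetical bivariate data
ys = [2, 4, 5, 4, 5]
print(summary_statistics(xs, ys))  # (10.0, 6.0, 6.0)
print(round(pmcc(xs, ys), 4))      # 0.7746
```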
Product Moment Correlation Coefficient
r = 1
perfect positive linear correlation
Product Moment Correlation Coefficient
r = -1
perfect negative linear correlation
Product Moment Correlation Coefficient
r = 0
no linear correlation
Regression Line
Definition
The line that minimises the sum of the squared vertical distances from the data points to the line and passes through (x̄, ȳ)
x̄ = mean of x, ȳ = mean of y
Regression Line
Formula
y = a + bx
a = ȳ - b x̄, b = Sxy / Sxx (x̄, ȳ = means of x and y)
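The regression coefficients can be sketched in a few lines. The function name and the data are illustrative assumptions. The gradient b is found first, then the intercept a from the two means.

```python
def regression_line(xs, ys):
    """Return (a, b) for the regression line y = a + bx of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    b = sxy / sxx            # gradient
    a = mean_y - b * mean_x  # intercept, so the line passes through the means
    return a, b


xs = [1, 2, 3, 4, 5]  # hypothetical bivariate data
ys = [2, 4, 5, 4, 5]
a, b = regression_line(xs, ys)
print(round(a, 2), round(b, 2))  # 2.2 0.6
```

Predicting y for an x between 1 and 5 would be interpolation for this data; using an x outside that range would be extrapolation.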
Interpolation
Definition
Estimating the value of the dependent variable within the range of data
Extrapolation
Definition
Estimating the value of the dependent variable outside of the range of data
This can be unreliable
Probability Distribution
Definition
A table showing the probability associated with each outcome of an experiment
Discrete Random Variables
Expectation
E(X) = Σ x P(X=x)
Discrete Random Variables
Variance
Var(X) = E(X²) - (E(X))² = Σ x² P(X=x) - (E(X))²
Discrete Random Variables
Coding - Expectation
E(aX+b) = aE(X) + b
Discrete Random Variables
Coding - Variance
Var(aX+b) = a² Var(X)
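A short sketch of expectation, variance and the coding results for a discrete random variable. It assumes a hypothetical probability distribution stored as a dictionary of value to probability, with exact fractions to avoid rounding.

```python
from fractions import Fraction

# Hypothetical probability distribution: P(X=1) = 1/2, P(X=2) = 1/3, P(X=3) = 1/6
dist = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}


def expectation(dist):
    """E(X): sum of x times P(X=x)."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X): E(X squared) minus E(X) squared."""
    e_x2 = sum(x * x * p for x, p in dist.items())
    return e_x2 - expectation(dist) ** 2


# Distribution of the coded variable 2X + 3 (same probabilities, new values)
coded = {2 * x + 3: p for x, p in dist.items()}

print(expectation(dist), variance(dist))                # 5/3 5/9
print(expectation(coded) == 2 * expectation(dist) + 3)  # True
print(variance(coded) == 2 ** 2 * variance(dist))       # True
```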
Discrete Uniform Distribution
Each outcome has the same probability
P(X=x) = 1/n
Discrete Uniform Distribution
Expectation
E(X) = (n+1) / 2 (when X takes the values 1, 2, ..., n)
Discrete Uniform Distribution
Variance
Var(X) = (n+1)(n-1) / 12 (when X takes the values 1, 2, ..., n)
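The discrete uniform results can be checked directly from the definitions. The sketch assumes X takes the values 1, 2, ..., n, each with probability 1/n. The function name is illustrative.

```python
from fractions import Fraction


def uniform_moments(n):
    """E(X) and Var(X) from first principles for X uniform on 1, 2, ..., n."""
    p = Fraction(1, n)
    values = range(1, n + 1)
    mean = sum(x * p for x in values)
    var = sum(x * x * p for x in values) - mean ** 2
    return mean, var


mean, var = uniform_moments(6)  # e.g. the score on a fair die
print(mean, var)                # 7/2 35/12
print(mean == Fraction(6 + 1, 2), var == Fraction((6 + 1) * (6 - 1), 12))  # True True
```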
Conditions for a Discrete Uniform Distribution
A discrete random variable X is defined over a set of n distinct values
Each value is equally likely
Standard Normal Distribution
Z ~ N(0, 1²)