Stats Flashcards
Continuous variable
Can take on any value within a given range.
An infinite number of possible values, limited only by our ability to measure them.
Discrete variable
Can only take on certain distinct values within a certain range.
The scale is still meaningful.
Ranked variable
A categorical variable in which the categories imply some order or relative position.
Numerical values are usually assigned.
Categorical variable
One in which the “value” taken by the variable is a non-numerical category or class.
Dot plot
Like a bar graph but with dots.
One dot per data point
Frequency table
Divide the number line into intervals.
Count the number of data points within each interval - frequency.
Relative frequency is the proportion of data points in each interval (frequency / total number of data points).
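A minimal Python sketch of the counting steps above. The data values, the interval width of 10 and the helper name frequency_table are made up for illustration.

```python
from collections import Counter

def frequency_table(data, width):
    """Map each interval start to (frequency, relative frequency)."""
    # A value x falls in the interval [start, start + width)
    counts = Counter((x // width) * width for x in data)
    n = len(data)
    return {start: (count, count / n) for start, count in sorted(counts.items())}

# Hypothetical data set, for illustration only
data = [52, 57, 61, 63, 68, 70, 74, 75, 79, 83]
table = frequency_table(data, 10)
for start, (freq, rel) in table.items():
    print(f"{start}-{start + 10}: frequency={freq}, relative frequency={rel:.2f}")
```

The relative frequencies always add up to 1, which is a quick check on the table.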
Guidelines for forming class intervals
(3)
- Use intervals of equal length with midpoints at convenient round numbers.
- For a small data set, use a small number of intervals.
- For a large data set, use more intervals.
Stem and leaf
e.g.:
2 1234557
3 033456
4 1234555667
5 1233
stem = tens digit
leaf = list of units digits that share that tens digit - should be in order
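As a rough sketch, a stem and leaf display for two-digit values can be built in Python like this. The helper name stem_and_leaf is hypothetical, and the data are the values read back from the first two rows of the example above.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Group two-digit values by tens digit (stem), with sorted units digits (leaves)."""
    stems = defaultdict(list)
    for x in sorted(data):  # sorting first keeps the leaves in order
        stems[x // 10].append(x % 10)
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in sorted(stems.items())}

data = [21, 22, 23, 24, 25, 25, 27, 30, 33, 33, 34, 35, 36]
plot = stem_and_leaf(data)
for stem, leaves in plot.items():
    print(stem, leaves)
```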
Summary statistics
Any set of measurements has two properties: the central or typical value and the spread about that value.
Mean
Average
Sum of data / number of data
Median
The value in the middle of all the data when it is ordered from smallest to largest.
If there is an even number of values, take the mean of the two middle values.
Mode
Most common value in the data set
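A small Python illustration of mean, median and mode using the standard library statistics module. The data values are hypothetical.

```python
import statistics

data = [2, 3, 3, 5, 7]  # hypothetical values, already sorted

mean = statistics.mean(data)      # sum of data / number of data = 20 / 5
median = statistics.median(data)  # middle value of the ordered data
mode = statistics.mode(data)      # most common value
print(mean, median, mode)
```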
Interquartile range
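Ordered data are split into 4 equal groups by the quartiles Q1, Q2 (the median) and Q3.
IQR = Q3 - Q1, the range covered by the middle 50% of the data.
The quartiles are like medians, but for quarters.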
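A hedged Python sketch, taking the interquartile range as the distance between the first and third quartiles. Textbooks and software use slightly different conventions for locating quartiles; this example assumes the default method of statistics.quantiles, and the data are made up.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]  # hypothetical values

# n=4 returns the three cut points that split the data into quarters
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```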
Box and Whisker plot
Median and interquartile range shown as the box.
Whiskers are extended to the furthest point that isn't an outlier.
Outliers are points more than 1.5 x the IQR beyond the box (below Q1 or above Q3) and are shown as dots.
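One way to sketch the outlier rule in Python, assuming the common convention that the 1.5 x IQR distance is measured from the edges of the box (Q1 and Q3). The data and variable names are illustrative only, and the quartile method is the statistics.quantiles default.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 30]  # hypothetical values; 30 looks suspicious

q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]
inside = [x for x in data if lower_fence <= x <= upper_fence]
whiskers = (min(inside), max(inside))  # furthest points that are not outliers
print(outliers, whiskers)
```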
Standard deviation
Measure of spread around the mean.
1. Calculate mean
2. Calculate difference between mean and each value
3. Square differences
4. Sum the squares
5. Divide by n-1
6. Square root
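The six steps above can be followed literally in Python. This is a sketch with made-up data; the function name sample_std is an assumption, not a library call.

```python
import math

def sample_std(data):
    n = len(data)
    mean = sum(data) / n               # 1. calculate mean
    diffs = [x - mean for x in data]   # 2. difference between each value and the mean
    squares = [d ** 2 for d in diffs]  # 3. square differences
    total = sum(squares)               # 4. sum the squares
    variance = total / (n - 1)         # 5. divide by n-1
    return math.sqrt(variance)         # 6. square root

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, mean = 5
std = sample_std(data)
print(round(std, 3))
```

For real work, statistics.stdev from the standard library computes the same n-1 version.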
Sample variance
Measure of spread around the mean; it is the square of the standard deviation, so it is in squared units.
1. Calculate mean
2. Calculate difference between mean and each value
3. Square differences
4. Sum the squares
5. Divide by n-1
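A short Python check of these steps against the standard library, with hypothetical data. statistics.variance uses the same n-1 divisor.

```python
import statistics

data = [1, 2, 3, 4, 5]  # hypothetical values, mean = 3

mean = sum(data) / len(data)
by_hand = sum((x - mean) ** 2 for x in data) / (len(data) - 1)  # steps 1 to 5
from_library = statistics.variance(data)
print(by_hand, from_library)
```

Taking the square root of either result gives the standard deviation.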
Z scores
Shows how many standard deviations above the mean (positive z) or below the mean (negative z) a value is.
z = (data - mean)/std
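The formula translates directly into Python. The numbers below (mean 70, standard deviation 10) are invented for illustration.

```python
def z_score(value, mean, std):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std

above = z_score(85, 70, 10)  # 1.5 standard deviations above the mean
below = z_score(60, 70, 10)  # 1 standard deviation below the mean
print(above, below)
```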
Bernoulli trial
(3)
- Result of each trial is a success or failure
- Probability p of success is the same in every trial
- Trials are independent.
Binomial random variable
x = number of successes
n = no. of repeated Bernoulli trials
p = probability of success
P(X = x) = nCx * p^x * (1-p)^(n-x), where nCx is the binomial coefficient
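The probability of exactly x successes can be sketched in Python, using math.comb (Python 3.8+) for the binomial coefficient. The function name binomial_pmf and the coin-toss numbers are illustrative assumptions.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent Bernoulli trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Example: exactly 2 heads in 3 tosses of a fair coin
prob = binomial_pmf(2, 3, 0.5)
total = sum(binomial_pmf(x, 3, 0.5) for x in range(4))  # all outcomes together
print(prob, total)
```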
Finding binomial coefficient
nCk = n! / (k! (n - k)!)
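A minimal Python version of the factorial formula, checked against the built-in math.comb. The helper name binomial_coefficient is hypothetical.

```python
import math

def binomial_coefficient(n, k):
    """n! / (k! (n - k)!): the number of ways to choose k items from n."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

ways = binomial_coefficient(5, 2)
print(ways, math.comb(5, 2))
```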
Normal/Gaussian distribution
(4)
- Symmetrical about the mean
- Bell shaped
- mean, median and mode are the same
- The two tails never touch the horizontal axis
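These properties can be checked numerically with statistics.NormalDist from the Python standard library (3.8+). The standard normal (mean 0, standard deviation 1) is used here purely as an example.

```python
import math
import statistics

nd = statistics.NormalDist(mu=0, sigma=1)

# Symmetrical about the mean: equal heights at equal distances either side
symmetric = math.isclose(nd.pdf(-1), nd.pdf(1))

# Mean, median and mode are the same
same_centre = nd.mean == nd.median == nd.mode

# Far out in the tail the curve is tiny but still above the axis
tail_height = nd.pdf(10)
print(symmetric, same_centre, tail_height > 0)
```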
Mean in binomial distribution
mean = np
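As a quick check of mean = np, the mean can also be computed from the definition (sum of x times its probability). The values n = 10 and p = 0.3 are arbitrary examples.

```python
import math

n, p = 10, 0.3  # hypothetical number of trials and success probability

# Mean from the definition: sum over x of x * P(X = x)
mean_from_pmf = sum(
    x * math.comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)
)
mean_shortcut = n * p
print(mean_from_pmf, mean_shortcut)
```

Both values agree up to floating-point rounding.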