Math IB Stats Flashcards
Discrete data
Fixed to certain values (usually counted); there are gaps between possible data values
Continuous data
Not fixed to certain values; can occupy a continuous range
Reliable data
If you can repeat the data collection and obtain similar results
Sufficient data
When there is enough data to support your conclusions
Population
The entire group that you want to draw conclusions about
Sample
Subset of population; group of individuals from the population that will give info about the population as a whole
Sampling technique: convenience
Most easily accessible members of a population
Sampling technique: simple random
Randomly choose members - equal chance for everybody
Sampling technique: systematic
Pick at a fixed interval — eg every 6th person
Sampling technique: stratified
Divide the population into groups (strata) based on shared characteristics, then sample randomly from each group
Sampling technique: quota
Like stratified sampling, with the number taken from each stratum proportional to the size of that stratum, but members are chosen non-randomly until each quota is filled
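Example: a minimal Python sketch of three of the sampling techniques above (simple random, systematic, stratified). The function names, the seed, and the population of 60 numbered people are assumptions made for illustration only.

```python
import random

def simple_random_sample(population, n, seed=0):
    # Every member has an equal chance of being chosen.
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, step, start=0):
    # Pick at a fixed interval, e.g. every 6th person.
    return population[start::step]

def stratified_sample(strata, fraction, seed=0):
    # Take the same fraction at random from every stratum,
    # so bigger strata contribute more members.
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample

people = list(range(1, 61))  # hypothetical population: persons 1 to 60
every_sixth = systematic_sample(people, step=6, start=5)  # persons 6, 12, ..., 60
```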
Bar chart
For discrete data; has gaps in between the bars
Histogram
No gaps in between bars; for continuous data
Skew
The shape of the distribution: which side most of the data is on and which side the longer tail is on
Histogram skew
Left (-): most data is on right side - left tail
Normal: symmetric, data equally distributed on both sides
Right (+): most data is on left side - right tail
Mode
Value that occurs the most
Modal
For grouped data the exact mode can't be found, so we give the modal class (the range with the highest frequency)
Bimodal; no mode
Bimodal - 2 modes in set of data
No mode - all numbers appear only once
Mean
The average: sum of all data values divided by the number of values
Median
The middle data value when data set is arranged in order of size (if even data set — median is avg of two middle numbers)
Range
Max - min
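Example: a short sketch of mode, mean, median and range using Python's standard `statistics` module. The data set is made up for illustration.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical data, already in order

mean = statistics.mean(data)        # sum 30 divided by 6 values
median = statistics.median(data)    # even count: average of 3 and 5
modes = statistics.multimode(data)  # list of the most frequent value(s)
data_range = max(data) - min(data)  # max - min

# A bimodal set returns two modes.
bimodal = statistics.multimode([1, 1, 2, 2, 3])
```

Note that `multimode` returns every value when all values appear once, so the "no mode" case has to be recognised by checking that the result is as long as the data.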
Quartiles
Divides data into quarters
- 1st: 25% of data below it
- 2nd: the median; 50% of data below it and 50% above
- 3rd: 75% of data below it
Interquartile range (IQR)
Difference between Q3 and Q1
Lower quartile
Q1
Middle quartile
Q2
Upper quartile
Q3
Outlier for boxplot
Outliers are data values more than 1.5 x IQR above Q3 or more than 1.5 x IQR below Q1
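Example: a sketch of the quartile, IQR and outlier cards. Quartile conventions differ between textbooks, calculators and software; this assumes the "median of each half" method and at least two data values. The helper name and data are hypothetical.

```python
import statistics

def quartiles(values):
    # Assumed convention: Q1 and Q3 are the medians of the lower and
    # upper halves; the middle value is left out when the count is odd.
    s = sorted(values)
    half = len(s) // 2
    q1 = statistics.median(s[:half])
    q2 = statistics.median(s)
    q3 = statistics.median(s[-half:])
    return q1, q2, q3

data = [1, 3, 4, 6, 7, 8, 10, 30]  # hypothetical data
q1, q2, q3 = quartiles(data)       # 3.5, 6.5, 9.0
iqr = q3 - q1                      # 5.5

# Boundaries beyond which a value counts as an outlier.
lower_fence = q1 - 1.5 * iqr       # -4.75
upper_fence = q3 + 1.5 * iqr       # 17.25
outliers = [x for x in data if x < lower_fence or x > upper_fence]
```

With this data only 30 lies beyond a boundary.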
Cumulative frequency
The running total of frequencies: the sum of all frequencies up to and including the current point
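Example: cumulative frequency as a running total, sketched with `itertools.accumulate`. The frequencies are made up for illustration.

```python
from itertools import accumulate

frequencies = [3, 5, 8, 4]  # hypothetical frequencies for four classes

# Each entry is the sum of the frequencies up to and including that class.
cumulative = list(accumulate(frequencies))  # [3, 8, 16, 20]
```

The last cumulative frequency always equals the total number of data values.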
Percentile
A value below which a certain percentage of observations lie
Percentile rank
Calculate by dividing # of values below ___ by total # of values, then multiplying by 100
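Example: a sketch of the percentile rank calculation, expressed as a percentage. The function name and scores are hypothetical.

```python
def percentile_rank(values, x):
    # Percentage of values that lie strictly below x.
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

scores = [50, 60, 70, 80, 90]      # hypothetical test scores
rank = percentile_rank(scores, 80)  # 3 of 5 values are below 80 -> 60.0
```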
Variance
Measure of how spread out the data is from the mean: the mean of the squared differences from the mean (sigma squared)
Standard deviation
Square root of variance
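Example: variance and standard deviation worked by hand and checked against the `statistics` module. This assumes the population formulas (divide by n); the data is made up.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean is 5

mean = sum(data) / len(data)
# Variance: mean of the squared differences from the mean.
variance = sum((x - mean) ** 2 for x in data) / len(data)  # 32 / 8 = 4.0
# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)                               # 2.0

# The standard library gives the same population values.
same_variance = statistics.pvariance(data)
same_std_dev = statistics.pstdev(data)
```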
Bivariate data
Study of relationships between two sets of data
Correlation
When change in x corresponds to change in y
Causation
When one event is the result of the occurrence of another event
Pearson product moment correlation coefficient (r)
Measure of the correlation strength between two variables.
Between -1 and 1 inclusive (can equal -1 or 1)
R value
0 is weakest. 1 and -1 are strongest.
(-) values mean there is a negative correlation.
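Example: a sketch of computing r directly from its definition. The function name and the two small data sets are assumptions for illustration; in an exam this value normally comes from the calculator.

```python
import math

def pearson_r(xs, ys):
    # r = Sxy / sqrt(Sxx * Syy), using deviations from the means.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4]
r_positive = pearson_r(xs, [2, 4, 6, 8])  # perfect positive correlation
r_negative = pearson_r(xs, [8, 6, 4, 2])  # perfect negative correlation
```

The first pair gives r = 1 and the second gives r = -1, the two strongest possible values.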
Line of best fit
Straight line drawn through the center of a group of points plotted on a scatter diagram
Interpolation
Predictions inside the domain your data points are in
Extrapolation
Predictions outside the domain of your data
Draw line of best fit
Find the mean point (mean of x, mean of y), which the line must go through. Aim for an equal number of points above and below the line.
Residual
The vertical distance between a data point and the regression line
Least square regression line
Has the smallest possible value for the sum of the squares of the residuals
Regression line y on x
y = ax + b
a = gradient: change in y for each unit increase in x
b = y-intercept
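Example: a sketch of the least squares regression line of y on x, with residuals and an interpolated prediction. The function name and data are hypothetical; the formulas a = Sxy / Sxx and b = mean of y - a x mean of x are the standard least squares results.

```python
def regression_y_on_x(xs, ys):
    # Returns (a, b) for the line y = ax + b.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx            # gradient
    b = mean_y - a * mean_x  # y-intercept; forces the line through the mean point
    return a, b

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = regression_y_on_x(xs, ys)  # approximately a = 0.6, b = 2.2

# Residual: vertical distance from each data point to the line.
residuals = [y - (a * x + b) for x, y in zip(xs, ys)]

# Interpolation: x = 2.5 is inside the domain of the data (1 to 5).
predicted = a * 2.5 + b  # approximately 3.7
```

For a least squares line the residuals sum to zero (up to rounding), which is one quick check on the result.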
Binomial distribution elements:
- fixed number of trials
- only two outcomes, success or failure
- constant probability each trial
- trials are independent
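Example: when all four conditions above hold, the probability of exactly k successes in n trials is nCk x p^k x (1 - p)^(n - k). A minimal sketch, with a hypothetical function name and a fair coin as the example:

```python
import math

def binomial_pmf(k, n, p):
    # P(X = k): k successes in n independent trials, each with probability p.
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Probability of exactly 2 heads in 3 tosses of a fair coin.
two_heads = binomial_pmf(2, 3, 0.5)  # 3 * 0.25 * 0.5 = 0.375

# The probabilities over all possible k add up to 1.
total = sum(binomial_pmf(k, 3, 0.5) for k in range(4))
```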
Probability - with replacement
Elements in the sample space remain unchanged (e.g. if you pull a card out of a deck, you put it back)
Probability - without replacement
Items are not returned to the sample space (e.g. if you pull a card out of a deck, you leave it out, changing the probabilities for the next draw)
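Example: the probability of drawing two aces from a standard 52-card deck, with and without replacement. Exact fractions are used so the change in the second draw is visible; the variable names are just for illustration.

```python
from fractions import Fraction

aces, cards = 4, 52

# With replacement: the deck is unchanged for the second draw.
with_replacement = Fraction(aces, cards) * Fraction(aces, cards)  # 1/169

# Without replacement: one ace and one card are gone for the second draw.
without_replacement = Fraction(aces, cards) * Fraction(aces - 1, cards - 1)  # 1/221
```

Leaving the first ace out makes a second ace less likely, so the probability without replacement is smaller.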