Key definitions flashcards
what are the measures of central tendency
Mean, median, mode
what are the measures of dispersion
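Range, variance, standard deviation (SD), the quartiles Q1 and Q3 (and the interquartile range, IQR = Q3 - Q1), upper fence, lower fence. Note: Q2 is the median, which is a measure of central tendency, and the correlation coefficient and coefficient of determination describe the relationship between two variables rather than dispersion.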
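what do the upper and lower fences mean
the upper and lower fences are the cut-off values for outliers in a dataset: lower fence = Q1 - 1.5 × IQR and upper fence = Q3 + 1.5 × IQR. Any value below the lower fence or above the upper fence is treated as an outlier.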
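A minimal sketch of the fence calculation, assuming the usual 1.5 × IQR rule. The function names `fences` and `outliers` and the data are made up for illustration, and Q1 = 4 and Q3 = 8 are taken as already known for this dataset.

```python
def fences(q1, q3, k=1.5):
    """Return (lower_fence, upper_fence) using the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def outliers(data, q1, q3):
    """Return the values that fall outside the fences."""
    lower, upper = fences(q1, q3)
    return [x for x in data if x < lower or x > upper]


data = [2, 4, 5, 6, 7, 8, 30]   # illustrative data; Q1 = 4, Q3 = 8
print(fences(4, 8))             # (-2.0, 14.0)
print(outliers(data, 4, 8))     # [30]
```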
SD and variance?
Variance measures how spread out a set of data is. It is the average of the squared differences between each data point and the mean, so it is expressed in squared units. It is denoted by the symbol σ^2.
Standard deviation is the square root of the variance. It measures the typical spread of the data about the mean, in the same units as the data. It is denoted by the symbol σ.
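A quick worked example of both definitions, as a sketch. It uses the population formulas (dividing by n); the function names and the data are made up for illustration.

```python
import math


def population_variance(data):
    """Average of the squared differences from the mean."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data) / len(data)


def population_sd(data):
    """Square root of the population variance."""
    return math.sqrt(population_variance(data))


data = [2, 4, 4, 4, 5, 5, 7, 9]     # mean is 5
print(population_variance(data))    # 4.0
print(population_sd(data))          # 2.0
```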
do variance and SD measure the same thing?
Yes, both measure spread about the mean, but on different scales.
Variance is the average squared deviation of the data from the mean, so its units are the square of the data's units. In IB Maths, it is denoted by the symbol σ^2.
Standard deviation is the square root of the variance, so it is in the same units as the data and is easier to interpret. It is denoted by the symbol σ.
if r = 0
there is no linear correlation, and the regression line of y on x is horizontal (y = mean of y)
in histograms the height of each bar is ...
proportional to the frequency of the values in that interval (when the intervals have equal width)
modal class refers to the whole interval with the highest frequency, not a single value
100 <= x <= 200 for example
measures of dispersion generally tell us
how spread out the data is and a measure of its variability
don't forget that
Q1 is the value that 25% of the data lies below, and the same idea applies to the other quartiles (Q2 is 50%), for example
Q3 is the value that 75% of the data lies below
what is the 5 number summary
min, Q1, Q2 (median), Q3, max
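A sketch of the five-number summary in code. It assumes the "median of each half" convention for Q1 and Q3 (the middle value is left out of both halves when n is odd). Other quartile conventions exist and can give slightly different answers. The function name and data are made up for illustration, and the data needs at least two values.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, Q2, Q3, max) using the median-of-halves convention."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]             # values before the middle position
    upper = values[half + (n % 2):]   # values after the middle position
    return (values[0], median(lower), median(values), median(upper), values[-1])


data = [7, 2, 30, 5, 8, 4, 6]
print(five_number_summary(data))  # (2, 4, 6, 8, 30)
```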
Mean deviation means?
the average absolute distance of ALL OF THE DATA POINTS from the mean/avg
difference between mean deviation and standard deviation
Mean deviation measures the average distance of each data point from the mean. (important) It uses the absolute values of the differences from the mean, whereas standard deviation squares the differences, averages them, and then takes the square root.
Because of the squaring, SD gives more weight to data points that are far from the mean.
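To see the difference on actual numbers, here is a small sketch that computes both measures for the same data. The function names and data are made up for illustration, and the SD uses the population formula.

```python
import math


def mean_deviation(data):
    """Average of the absolute differences from the mean."""
    mean = sum(data) / len(data)
    return sum(abs(x - mean) for x in data) / len(data)


def standard_deviation(data):
    """Square root of the average squared difference from the mean."""
    mean = sum(data) / len(data)
    return math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))


data = [2, 4, 4, 4, 5, 5, 7, 9]     # mean is 5
print(mean_deviation(data))         # 1.5
print(standard_deviation(data))     # 2.0
```

The SD comes out larger here because squaring makes the far-away points (2 and 9) count for more.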
define population variance
the sum of the squared differences from the population mean, divided by the population size N: σ^2 = Σ(x - μ)^2 / N
what are the names for the r values
0 - 0.25: very weak
0.25 - 0.5: weak
0.5 - 0.75: moderate
0.75 - 1: strong
(use the size of r and ignore the sign; the sign only gives the direction of the correlation)
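The bands above can be sketched as a small lookup. The function name is made up, and which band a boundary value such as 0.25 falls into is an assumption here (it goes into the higher band), since the card does not say.

```python
def describe_correlation(r):
    """Name the strength of a correlation coefficient using the bands above."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    size = abs(r)   # strength depends on the size of r, not its sign
    if size < 0.25:
        return "very weak"
    if size < 0.5:
        return "weak"
    if size < 0.75:
        return "moderate"
    return "strong"


print(describe_correlation(0.1))    # very weak
print(describe_correlation(-0.6))   # moderate
print(describe_correlation(0.9))    # strong
```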
what are the two ways to find the mean point
1) using a calculator, 2) finding the point of intersection of the regression lines y on x and x on y using substitution (both lines pass through the mean point)
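A sketch of the substitution method. It assumes the two regression lines are written as y = a + b x and x = c + d y, fits them with the standard least-squares formulas, and then substitutes one into the other. The function names and data are made up for illustration, and the method breaks down when the two lines are the same line (b × d = 1).

```python
def regression_lines(xs, ys):
    """Return ((a, b), (c, d)) for the lines y = a + b*x and x = c + d*y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx
    d = sxy / syy
    return (mean_y - b * mean_x, b), (mean_x - d * mean_y, d)


def intersection(y_on_x, x_on_y):
    """Substitute y = a + b*x into x = c + d*y and solve for x, then y."""
    a, b = y_on_x
    c, d = x_on_y
    x = (c + d * a) / (1 - b * d)
    return x, a + b * x


xs = [1, 2, 3, 4, 5]   # mean of x is 3
ys = [2, 4, 5, 4, 5]   # mean of y is 4
x, y = intersection(*regression_lines(xs, ys))
print(round(x, 6), round(y, 6))   # 3.0 4.0, the mean point
```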
how are Spearman’s rank and Pearson’s product moment correlation different
Spearman’s rank correlation coefficient and Pearson’s product moment correlation coefficient both measure the strength and direction of the relationship between two variables.
Pearson’s correlation works on the actual data values and is used when both variables are continuous and the relationship is linear,
while Spearman’s correlation works on the ranks of the data and is used when the variables are ordinal, the relationship is monotonic but not linear, or the data is not normally distributed.
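A sketch comparing the two coefficients on data that always increases but not in a straight line (y = x^3). Spearman is computed here as Pearson applied to the ranks, and the simple `ranks` helper assumes there are no tied values. All names and data are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson's product moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest). Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]          # increasing, but curved
print(round(pearson(xs, ys), 3))   # 0.943
print(spearman(xs, ys))            # 1.0
```

Pearson is below 1 because the points do not lie on a straight line. Spearman is exactly 1 because the ranks agree perfectly.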
What is the r value when the regression lines of y on x and x on y are equivalent equations, and why
the r value is either r = 1 or r = -1. The two lines only coincide when every data point lies exactly on the line of best fit, so there is no deviation from the line, e.g. (1,1), (2,2) and so on.
define interpolation and extrapolation
interpolation is making predictions inside the range of your data (generally reliable)
extrapolation is making predictions outside the range of your data (less reliable, because the trend may not continue)
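As a small sketch, the check is just whether the new x value sits inside the range of the x data. The function name and numbers are made up for illustration.

```python
def prediction_type(x_new, xs):
    """Say whether predicting at x_new is interpolation or extrapolation."""
    if min(xs) <= x_new <= max(xs):
        return "interpolation"
    return "extrapolation"


xs = [10, 15, 20, 25, 30]        # x values the regression line was built from
print(prediction_type(18, xs))   # interpolation
print(prediction_type(45, xs))   # extrapolation
```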
define an outlier
Outliers are extreme data values that do not fit with the rest of the data.
why do we rank values in spearman’s rank coefficient
Reduces the impact of outliers: Ranking the values can reduce the impact of outliers or extreme values in the data.
Makes the data more comparable: Ranking the values makes the data more comparable, particularly when the data has different scales or units of measurement
Handles non-linear relationships: By ranking the values, we can detect relationships that are not linear, as long as they are monotonic (always increasing or always decreasing).
Makes the coefficient more robust: Because only the order of the values matters, small measurement errors or inconsistencies that do not change the order do not change the coefficient.
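A sketch of the ranking step itself, to show why an extreme value loses its pull. Tied values are given the average of the ranks they share, which is a common convention and an assumption here. The function name and data are made up for illustration.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to cover every value tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1   # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


incomes = [30, 45, 45, 52, 1000]   # 1000 is an extreme value
print(average_ranks(incomes))      # [1.0, 2.5, 2.5, 4.0, 5.0]
```

After ranking, 1000 is simply rank 5, only one step above 52, so it no longer dominates the calculation.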
when is Spearman’s rank coefficient better than Pearson’s product moment correlation
when the relationship between the two variables is non-linear but monotonic, when the data is ordinal, or when there are outliers, Spearman’s rank correlation coefficient is better than Pearson’s product moment correlation coefficient.