Lecture 4 Flashcards
What are the three types of t-tests?
Single-sample t-test
Independent samples t-test
Paired samples t-test
What is the question that all three t-tests answer?
Is there a statistically significant difference between two means?
What question does the single-sample t-test answer?
Is there a statistically significant difference between your sample mean and a known comparison value, such as a population mean or a large-sample value?
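A minimal sketch of the statistic behind this question, using only Python's standard library. The scores and the comparison mean of 100 are made up for illustration, and the function name one_sample_t is hypothetical. It returns the t statistic and degrees of freedom only; a p-value would still have to be looked up in a t table or computed with a statistics package.

```python
import math
import statistics

def one_sample_t(sample, comparison_mean):
    """Return (t, df) for a single-sample t-test against a known comparison mean."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)        # sample SD (n - 1 in the denominator)
    standard_error = sample_sd / math.sqrt(n)   # SD of the sampling distribution of the mean
    t = (sample_mean - comparison_mean) / standard_error
    return t, n - 1

# Hypothetical scores compared against an assumed population mean of 100.
scores = [102, 98, 110, 105, 95, 108, 101, 99]
t_stat, df = one_sample_t(scores, 100)
print(f"t({df}) = {t_stat:.3f}")
```

A larger absolute t means the sample mean sits further from the comparison value relative to its standard error.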
What question does the independent-samples t-test answer?
Is there a statistically significant difference between the means of two groups defined by a categorical (grouping/nominal) variable in your sample?
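A rough sketch of how the grouping variable splits the scores and how the t statistic is then formed, assuming the classic pooled-variance (Student's) version of the test. The records, the group labels and the function name independent_samples_t are invented for illustration.

```python
import math
import statistics

def independent_samples_t(group_a, group_b):
    """Return (t, df) for Student's independent-samples t-test (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, n_a + n_b - 2

# Hypothetical records: (grouping variable, score).
records = [("control", 12), ("control", 15), ("control", 11), ("control", 14),
           ("treatment", 18), ("treatment", 17), ("treatment", 21), ("treatment", 16)]
control = [score for group, score in records if group == "control"]
treatment = [score for group, score in records if group == "treatment"]

t_stat, df = independent_samples_t(control, treatment)
print(f"t({df}) = {t_stat:.3f}")
```

The pooled version assumes the two groups have roughly equal variances.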
Define the mean, median and mode.
Mean: arithmetic average of the scores in the data set.
Median: the middle score when the data is ordered by size.
Mode: the most frequently occurring score in the data set.
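As a quick check on the three definitions, here is a small example using Python's built-in statistics module. The scores are made up.

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]   # hypothetical data set, already ordered by size

mean = statistics.mean(scores)       # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 5
median = statistics.median(scores)   # even count, so the average of the middle two: (3 + 5) / 2 = 4.0
mode = statistics.mode(scores)       # 3 occurs twice, more than any other score

print(mean, median, mode)
```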
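Why is the 5% trimmed mean a potentially useful statistic?
It discards the highest 5% and lowest 5% of scores before averaging, so it is less affected by extreme scores and less susceptible to sampling fluctuation than the mean for extremely skewed distributions.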
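A small sketch of a 5% trimmed mean, assuming the simplest convention: drop the lowest 5% and highest 5% of scores (rounded down to a whole number of scores) and average the rest. Statistical packages may handle fractional counts differently. The data and the function name trimmed_mean are hypothetical.

```python
import statistics

def trimmed_mean(scores, proportion=0.05):
    """Mean after dropping `proportion` of the scores from each end."""
    ordered = sorted(scores)
    k = int(len(ordered) * proportion)          # scores dropped from each end (rounded down)
    trimmed = ordered[k:len(ordered) - k]
    return statistics.mean(trimmed)

# Hypothetical skewed data: 1 to 19 plus one extreme score.
scores = list(range(1, 20)) + [500]

plain = statistics.mean(scores)      # 34.5, pulled upward by the extreme score
trimmed = trimmed_mean(scores)       # 10.5, after dropping the scores 1 and 500
print(plain, trimmed)
```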
Define a 95% confidence interval.
If the same sampling method were used to select many different samples and an interval estimate were computed for each sample, we would expect about 95% of those intervals to contain the true population parameter.
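The repeated-sampling idea can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the population is normal with a known standard deviation, so the interval uses the normal critical value (about 1.96) rather than a t value. The population mean, SD, sample size and number of repetitions are all invented.

```python
import math
import random
import statistics

random.seed(4)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 50.0, 10.0     # hypothetical population values
N, REPS = 25, 2000                # sample size and number of repeated samples
z_crit = statistics.NormalDist().inv_cdf(0.975)   # about 1.96 for a 95% interval
half_width = z_crit * POP_SD / math.sqrt(N)

hits = 0
for _ in range(REPS):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_mean = sum(sample) / N
    # Does this sample's interval contain the true population mean?
    if sample_mean - half_width <= POP_MEAN <= sample_mean + half_width:
        hits += 1

coverage = hits / REPS
print(f"Proportion of intervals containing the true mean: {coverage:.3f}")
```

The printed proportion should land close to 0.95, though any single run will differ slightly.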
What is the interquartile range?
The IQR is a measure of variability based on dividing a data set into quartiles. Quartiles divide a rank-ordered data set into four equal parts. The three values that separate the parts are called the first, second, and third quartiles.
How is the interquartile range defined in terms of percentiles?
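The first quartile (Q1) is the 25th percentile and the third quartile (Q3) is the 75th percentile. The IQR is the difference between the upper (Q3) and lower (Q1) quartiles, that is, the 75th percentile minus the 25th percentile, and it describes the spread of the middle 50% of values when ordered from lowest to highest. It is largely unaffected by outliers.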
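A short sketch comparing the IQR with the range on made-up data, with and without an extreme score. It uses statistics.quantiles with the "inclusive" method. Other quartile conventions exist and can give slightly different values. The helper names iqr and score_range are hypothetical.

```python
import statistics

def iqr(scores):
    """Interquartile range: Q3 (75th percentile) minus Q1 (25th percentile)."""
    q1, _q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return q3 - q1

def score_range(scores):
    """Range: largest score minus smallest score."""
    return max(scores) - min(scores)

clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]   # same data, one extreme score

print(iqr(clean), score_range(clean))                 # 4.0 8
print(iqr(with_outlier), score_range(with_outlier))   # 4.0 899
```

The outlier changes the range drastically but leaves the IQR untouched, because the IQR only depends on the middle half of the ordered scores.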
If a score is at the 90th percentile, what does that mean?
This means you scored better than 90% of people who took the test.
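A tiny sketch of the idea, assuming the "percentage of scores strictly below yours" definition of a percentile rank (some textbooks count ties differently). The test scores and the function name percentile_rank are made up.

```python
def percentile_rank(score, all_scores):
    """Percentage of scores in all_scores that fall strictly below `score`."""
    below = sum(1 for s in all_scores if s < score)
    return 100 * below / len(all_scores)

# Hypothetical test: 100 people with scores 1, 2, ..., 100.
test_scores = list(range(1, 101))

rank = percentile_rank(91, test_scores)   # 90 of the 100 scores are below 91
print(rank)   # 90.0
```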
What is the relationship between the variance and the standard deviation?
The standard deviation is the square root of the variance. The standard deviation is expressed in the same units as the mean, whereas the variance is expressed in squared units. Either can be used to describe the spread of a distribution, but be clear about which one you are reporting.
Roughly, the standard deviation is the average difference between each score and the mean of the data set.
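The square-root relationship can be checked directly with Python's statistics module. The heights are invented. The mean absolute deviation is shown alongside as an assumption-labelled comparison: it is the literal "average distance from the mean", and the standard deviation is close to it but not identical.

```python
import math
import statistics

heights_cm = [150, 160, 170, 180, 190]   # hypothetical heights in centimetres

variance = statistics.variance(heights_cm)   # 250, in squared centimetres
sd = statistics.stdev(heights_cm)            # about 15.81, in centimetres

# The SD is the square root of the variance.
assert math.isclose(sd, math.sqrt(variance))

# Literal average distance from the mean, for comparison with the SD.
mean_height = statistics.mean(heights_cm)
mean_abs_dev = sum(abs(h - mean_height) for h in heights_cm) / len(heights_cm)   # 12.0

print(variance, round(sd, 2), mean_abs_dev)
```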
What is the range?
Difference between the largest and smallest score.
When should you delete outliers?
If you know that the value is wrong, if you have so much data that dropping a questionable outlier won’t hurt the analysis, or if you can go back and re-collect or verify the questionable data point.
What is the assumption of normality and why is it important?
The assumption of normality is that your data come from a population that is normally distributed. Tests such as the t-test rely on this assumption, so if the data are markedly non-normal the test results can be inaccurate. It is therefore important to check whether your data are normal or non-normal.
What is a common adjustment used to deal with a violation of the homogeneity of variance assumption?
Alternative F statistics, such as Welch's F or the Brown-Forsythe F, which do not assume equal variances.
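For two groups, the Welch approach reduces to Welch's t-test, and Welch's F is then the square of Welch's t. The sketch below shows that two-group case only, with made-up data and a hypothetical function name welch_t. It is an illustration of the idea, not necessarily the procedure used in the lecture.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Return (t, df) for Welch's t-test, which does not pool the variances."""
    n_a, n_b = len(group_a), len(group_b)
    # Each group's variance divided by its own sample size.
    v_a = statistics.variance(group_a) / n_a
    v_b = statistics.variance(group_b) / n_b
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical groups with clearly unequal spread.
tight = [10, 11, 12, 11, 10, 12]
spread = [2, 25, 9, 30, 14, 20]

t_stat, df = welch_t(tight, spread)
print(f"Welch t = {t_stat:.3f}, df = {df:.2f}, F = t squared = {t_stat ** 2:.3f}")
```

When the two groups have equal sizes and equal variances, this gives the same t and df as the pooled version. The more the variances differ, the further the degrees of freedom drop below n1 + n2 - 2.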