Lecture 11 9/18/24 Flashcards
How are data sets typically arranged?
-variables are in columns
-experimental units are in rows
What are the characteristics of nominal data?
-categories should be exhaustive
-categories should be mutually exclusive
-no comparative relationship is implied
What are the characteristics of ordinal data?
-categories should be all inclusive and mutually exclusive
-categories should have a rank order that reflects relative (more or less) differences between them
-distances between categories are not assumed to be equal
What are the characteristics of continuous data?
-values should be all inclusive and mutually exclusive
-differences between values are uniform across the entire scale
What are the two types of continuous scale data?
-interval: equal distances between values but no true zero (e.g., temperature in °C)
-ratio: equal distances between values with a true zero (e.g., weight)
Why is it important to distinguish between categorical and continuous variables?
-determines the method of presentation in graphs/tables
-determines the choice of statistical tests for significance
-different statistics are often used for nominal vs. ordinal variables
What are the characteristics of expressing continuous data as categorical?
-can always be done, but with a loss of information
-loss of information often leads to less statistical power
-continuous variables should only be categorized if there is good reason
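A minimal sketch of the information loss described above, assuming made-up blood pressure readings and a hypothetical cutoff of 140 (neither comes from the lecture):

```python
# Hypothetical systolic blood pressure readings (mmHg); values are invented.
readings = [118, 121, 139, 141, 165]

def categorize(value, cutoff=140):
    """Collapse a continuous value into two categories at an assumed cutoff."""
    return "high" if value >= cutoff else "normal"

categories = [categorize(v) for v in readings]
print(categories)

# 139 and 141 differ by only 2 units but land in different categories,
# while 141 and 165 differ by 24 units but land in the same category.
# The size of those differences is the information that gets lost.
```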
When is it best to use mean vs. median to represent central tendency?
-mean works well in large populations with a normal distribution
-median works well when the distribution is skewed
Which measures of central tendency are resistant to extreme values, and which are not?
resistant: median, mode
non-resistant: mean
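A small illustration of resistance to extreme values, using Python's standard statistics module. The numbers are invented for the example:

```python
import statistics

values = [2, 3, 3, 4, 5]
with_outlier = values + [100]  # add one extreme value

for label, data in (("original", values), ("with outlier", with_outlier)):
    print(
        label,
        "mean:", statistics.mean(data),
        "median:", statistics.median(data),
        "mode:", statistics.mode(data),
    )

# The mean moves from 3.4 to 19.5, while the median only moves
# from 3 to 3.5 and the mode stays at 3.
```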
Why is it important to NOT calculate a mean with number-labelled categories?
the labels are categories, so the distances between the “numbers” are not necessarily equal; the resulting mean therefore does not represent a meaningful average for the population
What is a measure of dispersion?
the extent to which a set of scores deviates from some measure of central tendency for that set
What is range?
the difference between the largest and smallest values in the distribution
What do percentiles and quartiles measure?
the proportion of all observations that fall between specified values
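A short sketch of range and quartiles with the standard statistics module. The scores are invented, and the "inclusive" interpolation method is an assumption; other quartile conventions give slightly different cut points:

```python
import statistics

scores = [4, 7, 8, 10, 12, 15, 21]  # hypothetical data

# Range: largest value minus smallest value.
data_range = max(scores) - min(scores)

# Quartiles: three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range, the spread of the middle half

print("range:", data_range)
print("Q1, median, Q3:", q1, q2, q3)
print("IQR:", iqr)
```

The second quartile is the median, so about half of the observations fall at or below it.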
What is kurtosis?
the degree to which the values in a data set cluster around the average (a peaked distribution) or in the extremes (heavy tails)
What is skewness?
when the values in a data set are concentrated on one side of the distribution, leaving a longer tail toward the minimum or maximum
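Skewness and kurtosis can be computed from the moments of the data. The sketch below uses the simple population (moment-based) formulas, which is an assumption; statistical packages often apply small-sample corrections and will report slightly different numbers:

```python
import statistics

def central_moment(data, k):
    """k-th central moment: average of (x - mean)**k."""
    m = statistics.fmean(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    """Zero for symmetric data, positive for a long right tail."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Kurtosis relative to a normal distribution (which scores 0)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]       # hypothetical data
right_skewed = [1, 2, 2, 3, 10]   # one large value stretches the right tail

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```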
How does a normal distribution differ from a non-normal distribution?
-normal distribution is accurately described by the mean and standard deviation
-non-normal distribution is not described as well by the mean and standard deviation
Why is it useful to divide a distribution into percentile segments?
it can be used to compare two distributions for equality
What is an independent variable?
-causal, predictor, exposure, or explanatory variable
-change in the variable influences an outcome
What is a dependent variable?
-outcome variable
-change in the variable results from independent variable change
What type of statistical test is used on continuous data with a normal distribution and 2 groups?
T test
What is the null hypothesis for a T test?
H0: means are equal
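A minimal sketch of the test statistic for a two-group t test, assuming independent groups with equal variances (the pooled form). The data are invented, and the function name is hypothetical. Turning the statistic into a p-value needs the t distribution, which a library routine such as scipy.stats.ttest_ind handles:

```python
import math
import statistics

def pooled_t_statistic(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    std_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / std_error

group_a = [5.1, 4.9, 5.6, 5.8, 6.0]  # hypothetical measurements
group_b = [6.2, 6.8, 7.1, 6.5, 7.4]

t = pooled_t_statistic(group_a, group_b)
degrees_of_freedom = len(group_a) + len(group_b) - 2
print("t =", t, "df =", degrees_of_freedom)
```

Under H0 the two means are equal, so t should be close to 0; a value far from 0 counts as evidence against H0.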
What type of statistical test is used on continuous data with a normal distribution and more than 2 groups?
ANOVA
What is the null hypothesis for an ANOVA?
H0: means are equal
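A sketch of the one-way ANOVA F statistic, which compares the variation between group means with the variation within groups. The data and the function name are made up for illustration, and the p-value step (which needs the F distribution) is left out:

```python
import statistics

def one_way_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)

    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = sum(
        (x - statistics.fmean(g)) ** 2 for g in groups for x in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))

groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]  # hypothetical data
f_stat = one_way_f(groups)
print("F =", f_stat)
```

If all group means are equal, the between-group sum of squares is 0 and so is F; larger F values point away from H0.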
What types of statistical test are used on discrete data with 2 non-paired groups?
-chi-square test if 75% or more of cells contain more than 5 data points
-Fisher’s exact test if fewer than 75% of cells contain more than 5 data points
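A rough sketch of the cell-count check and the chi-square statistic for a contingency table. The table is invented and the function names are hypothetical. The sketch applies the "more than 5" threshold to expected counts, which is an assumption; the notes do not say whether observed or expected counts are meant:

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_square_statistic(table):
    """Sum of (observed - expected)**2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def share_of_cells_above(counts, threshold=5):
    """Fraction of cells whose count exceeds the threshold."""
    cells = [c for row in counts for c in row]
    return sum(c > threshold for c in cells) / len(cells)

observed = [[10, 20], [20, 10]]  # hypothetical 2x2 table
expected = expected_counts(observed)
share = share_of_cells_above(expected)
chi2 = chi_square_statistic(observed)

test_choice = "chi-square" if share >= 0.75 else "Fisher's exact"
print(expected, share, chi2, test_choice)
```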
What type of statistical test is used on discrete data with more than 2 groups?
chi-square test
What is the null hypothesis for tests on discrete data?
H0: proportions are equal