Statistical Tests Flashcards
What are the two types of variables/data
-Qualitative (Categorical)
-Quantitative
Qualitative (Categorical) variables include (3)
-dichotomous (binary)
-nominal
-ordinal
Dichotomous (binary) data is where
-Every observation is in one of two categories (yes/no)
-represented as a percentage
Nominal data is
-three or more categories/classes with no inherent ordering (categories may be labelled with numbers)
-represented as a bar chart or % in each category
Ordinal data is
-Three or more categories with the categories having some inherent order
-represented as a bar chart or % of subjects in each category
Two types of Quantitative variables/data
-discrete (discontinuous or count)
-continuous
Discrete data (counts)
-take only whole-number (integer) values
-represented as a histogram
Continuous data
-can take any value within a defined range
-represented in a histogram or box & whisker plot
What type of data is pulse rate
Quantitative: Discrete/Count
What type of data is eye color
Qualitative: Nominal
What type of data is daily milk yield per cow
Quantitative: Continuous
What type of data is number of lesions
Quantitative: Discrete/Count
What type of data is pregnancy status of each cow
Qualitative: Dichotomous/Binary
What type of data is number of puppies per litter
Quantitative: Discrete/Count
Measures of central tendency include
-mean
-median
-mode
Measures of spread include (4)
-range
-percentile
-variance
-standard deviation
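The measures of central tendency and spread listed above can be computed with Python's standard `statistics` module. This is only a minimal sketch: the litter sizes are made-up illustrative values, and `statistics.quantiles` uses one of several possible quartile conventions.

```python
import statistics

# Hypothetical sample: number of puppies in nine litters (made-up values).
litter_sizes = [4, 5, 5, 6, 7, 5, 8, 6, 5]

# Central tendency
mean = statistics.mean(litter_sizes)
median = statistics.median(litter_sizes)
mode = statistics.mode(litter_sizes)

# Spread
value_range = max(litter_sizes) - min(litter_sizes)
quartiles = statistics.quantiles(litter_sizes, n=4)  # 25th, 50th, 75th percentiles
variance = statistics.variance(litter_sizes)         # sample variance (n - 1 denominator)
std_dev = statistics.stdev(litter_sizes)             # square root of the sample variance

print(mean, median, mode, value_range, quartiles, variance, std_dev)
```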
Common types of graphs are
-bar charts
-histograms
-box plots
-scatter plots
Bar charts are used for
Nominal or Ordinal data
Histograms are used for
Continuous or Count data
Box & Whisker plots are used for
Continuous data
Scatter Plots are used for
Two continuous variables (to show the relationship between them)
Normal distribution is
-bell shaped
-values are distributed symmetrically around the mean
Right skewed distribution means the tail is on the ___ while left skewed means the tail is on the ___
Right; left
In normal distribution,
The mean, median, and mode are very similar
Skewed distribution,
The mode and median may be similar, but the mean will be a poor indicator of central tendency
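A small sketch of why the mean is a poor summary for skewed data. The lesion counts are made-up values with one large observation forming a right tail.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, one is very large.
lesion_counts = [1, 1, 2, 2, 2, 3, 3, 4, 40]

mean = statistics.mean(lesion_counts)      # pulled toward the long right tail
median = statistics.median(lesion_counts)  # stays near the bulk of the data
mode = statistics.mode(lesion_counts)      # most frequent value

print(mean, median, mode)
```

Here the median and mode agree, while the single extreme value drags the mean well above both.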
The range is
The difference between the largest and smallest values
The box in box and whisker plot represents ___ while the whiskers represent ___
Upper and lower quartiles (and median); range
Binary, nominal, and ordinal data are all
Qualitative
The standard error of the sample mean (SEM) is used to
Calculate confidence intervals
What is the interpretation of a 95% confidence interval
“We are 95% confident that the interval from __ to __ contains the true population mean” (if the study were repeated many times, about 95% of such intervals would contain it)
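As a rough sketch, a 95% confidence interval can be built from the SEM as mean ± critical value × SEM. The milk yields below are made-up values, and the example assumes the normal critical value (about 1.96); for a sample this small a t-distribution critical value would normally be used and would give a slightly wider interval.

```python
import math
import statistics

# Hypothetical sample: daily milk yield (litres) for ten cows.
milk_yield = [24.1, 26.3, 22.8, 25.0, 27.4, 23.9, 25.6, 24.7, 26.0, 25.2]

n = len(milk_yield)
sample_mean = statistics.mean(milk_yield)
sem = statistics.stdev(milk_yield) / math.sqrt(n)  # SEM = s / sqrt(n)

# Assumption: normal approximation, so the critical value is about 1.96.
z = statistics.NormalDist().inv_cdf(0.975)

ci_lower = sample_mean - z * sem
ci_upper = sample_mean + z * sem
print(f"mean = {sample_mean:.2f}, 95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```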
The null hypothesis is
The hypothesis that there is no difference between groups (easier to disprove)
The alternative hypothesis is
The hypothesis that there is a difference between the groups
The goal is to disprove the ___ hypothesis
Null
The p-value is
The probability of observing data at least as extreme as the data collected, if the null hypothesis is true
If p is small
-the data are unlikely under the null hypothesis, so it is rejected
“Difference is statistically significant”
If p is large
-data are consistent with the null hypothesis
i.e. no strong evidence against it
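One way to see what a p-value measures is an exact permutation test, sketched below. This is a swapped-in illustration, not one of the tests named in these cards: the two groups are made-up values, and the p-value is the fraction of all possible regroupings whose difference in means is at least as large as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical measurements for two groups of four animals each.
group_a = [12.1, 13.4, 11.8, 12.9]
group_b = [14.2, 15.1, 13.9, 14.8]

observed = abs(mean(group_a) - mean(group_b))
pooled = group_a + group_b
n_a = len(group_a)

# Under the null hypothesis the group labels are interchangeable, so every
# way of splitting the pooled values into two groups is equally likely.
extreme = 0
total = 0
for idx in combinations(range(len(pooled)), n_a):
    chosen = [pooled[i] for i in idx]
    rest = [pooled[i] for i in range(len(pooled)) if i not in idx]
    diff = abs(mean(chosen) - mean(rest))
    if diff >= observed - 1e-12:  # small tolerance for floating-point rounding
        extreme += 1
    total += 1

p_value = extreme / total
print(f"{extreme} of {total} splits are at least as extreme: p = {p_value:.4f}")
```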
Chi square test is
Used to compare proportions
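A minimal sketch of a chi-square test comparing two proportions in a 2x2 table. The counts are made up, no continuity (Yates) correction is applied, and the p-value shortcut is valid only for 1 degree of freedom.

```python
import math

# Hypothetical 2x2 table: pregnancy status (yes/no) in two herds.
#            pregnant  not pregnant
observed = [[30, 20],   # herd A
            [18, 32]]   # herd B

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if the proportion were the same in both herds.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

# For 1 degree of freedom, the upper-tail probability equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))
print(f"chi-square = {chi_square:.3f}, p = {p_value:.4f}")
```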
One sample t-test is used
-To test whether the mean of a sample/population is different from a particular value
-normal distribution
-one group
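A sketch of the one-sample t statistic, t = (sample mean - hypothesised mean) / SEM, with n - 1 degrees of freedom. The function name and data are illustrative assumptions; turning t into a p-value needs the t distribution, which the standard library does not provide (SciPy's `scipy.stats.ttest_1samp` is one option).

```python
import math
import statistics

def one_sample_t(sample, hypothesised_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - hypothesised_mean) / sem
    return t_stat, n - 1

# Hypothetical milk yields (litres/day) tested against a reference value of 24.
milk_yield = [24.1, 26.3, 22.8, 25.0, 27.4, 23.9, 25.6, 24.7, 26.0, 25.2]
t_stat, df = one_sample_t(milk_yield, 24.0)
print(f"t = {t_stat:.3f}, df = {df}")
```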
Two sample t-test
-used to test equality of the means of two populations
-normal distribution
-two comparison groups
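A sketch of the two-sample t statistic for independent groups. It assumes equal variances in the two populations (the pooled-variance form); the function name and data are made up for illustration.

```python
import math
import statistics

def two_sample_t(a, b):
    """Return (t statistic, degrees of freedom) using a pooled variance."""
    n_a, n_b = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Assumption: both populations share the same variance.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t_stat, n_a + n_b - 2

# Hypothetical measurements from two unrelated groups.
group_a = [12.1, 13.4, 11.8, 12.9]
group_b = [14.2, 15.1, 13.9, 14.8]
t_stat, df = two_sample_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df}")
```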
Paired t-test
-used to test equality of the means of two samples/populations, when the observations are paired samples
-normal distribution
-two comparison groups that are paired/related
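A paired t-test can be sketched as a one-sample t-test on the within-pair differences, tested against zero. The before/after values are made up.

```python
import math
import statistics

# Hypothetical before/after measurements on the same six animals.
before = [72, 80, 76, 90, 84, 78]
after = [70, 75, 77, 85, 80, 74]

# Pairing: each difference comes from one animal measured twice.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)

mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / math.sqrt(n)
t_stat = mean_diff / se_diff
df = n - 1
print(f"t = {t_stat:.3f}, df = {df}")
```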
Analysis of Variance (ANOVA)
-used to test equality of the means of 2+ populations
-normal distribution
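A sketch of the one-way ANOVA F statistic, the ratio of between-group to within-group variability. The function name and the three groups are illustrative assumptions; a p-value would come from the F distribution with the two degrees of freedom returned.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n_total = len(groups), len(all_values)
    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data for three groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f_stat:.2f}, df = ({df_between}, {df_within})")
```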
Wilcoxon’s Signed rank test
-test whether the median of a sample or population is different from a particular value
-similar to one-sample t-test
-not normally distributed
-one group
Wilcoxon’s Rank Sum Test (3)
-used to test equality of the mean ranks of two samples/populations
-not normally distributed
-similar to two sample t-test
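A sketch of the ranking step behind the rank sum test: pool both groups, rank every value (ties share the average rank), then sum the ranks per group. The helper name and data are illustrative, and only the rank sums are computed here, not a p-value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # average of 1-based positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg
        start = end + 1
    return ranks

# Hypothetical measurements from two independent groups.
group_a = [12.1, 13.4, 11.8, 12.9]
group_b = [14.2, 15.1, 13.9, 14.8]

ranks = average_ranks(group_a + group_b)
rank_sum_a = sum(ranks[:len(group_a)])
rank_sum_b = sum(ranks[len(group_a):])
print(rank_sum_a, rank_sum_b)
```

If the two groups came from the same distribution, the rank sums would be roughly proportional to the group sizes; a large imbalance is evidence against that.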
Wilcoxon’s Signed rank test - matched pairs
-used to test the difference between two samples/populations using matched pairs
-not normally distributed
-similar to paired t-test
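A sketch of the signed rank statistic for matched pairs: take the within-pair differences, drop zeros, rank the absolute differences, and sum the ranks separately for positive and negative differences. The helper name and data are made up; the one-sample version works the same way using differences from the hypothesised value. No p-value is computed here.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # average of 1-based positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg
        start = end + 1
    return ranks

# Hypothetical before/after measurements on the same six animals.
before = [72, 80, 76, 90, 84, 78]
after = [70, 75, 77, 85, 80, 74]

# Zero differences carry no information about direction, so they are dropped.
differences = [a - b for a, b in zip(after, before) if a != b]
ranks = average_ranks([abs(d) for d in differences])

w_plus = sum(r for r, d in zip(ranks, differences) if d > 0)
w_minus = sum(r for r, d in zip(ranks, differences) if d < 0)
print(w_plus, w_minus)
```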
Kruskal-Wallis test
-used to test equality of the mean ranks of 2+ samples/populations
-not normally distributed
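A sketch of the Kruskal-Wallis H statistic, H = 12 / (N(N + 1)) × Σ(R_j² / n_j) - 3(N + 1), where R_j is the rank sum of group j. The function names and data are illustrative; the example data contain no ties, and no tie correction is applied to H.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # average of 1-based positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg
        start = end + 1
    return ranks

def kruskal_wallis_h(groups):
    """Return the H statistic (without tie correction) for a list of groups."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total = 0.0
    position = 0
    for g in groups:
        rank_sum = sum(ranks[position:position + len(g)])
        total += rank_sum ** 2 / len(g)
        position += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical data for three groups.
groups = [[4, 5, 6], [6.5, 7, 8], [8.5, 9, 10]]
h_stat = kruskal_wallis_h(groups)
print(f"H = {h_stat:.2f}")
```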
Kaplan-Meier curve with Log-Rank test
-outcome must be time to an event, e.g. survival time
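A sketch of how a Kaplan-Meier survival curve is built: at each event time, the running survival probability is multiplied by the fraction of at-risk subjects who did not have the event. The function name and data are made up, censored subjects are coded 0, and the Log-Rank comparison between curves is not implemented here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up times (months) and event indicators.
times = [2, 3, 3, 5, 8, 8, 9]
events = [1, 1, 0, 1, 1, 0, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t}: S(t) = {s:.3f}")
```

Censored subjects count as at risk up to their censoring time but never trigger a drop in the curve.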
Cox Proportional Hazards Regression
Uses both time and the outcome, e.g. comparing survival time between groups
Logistic Regression
-yes/no outcome