Final Flashcards
What is the “who”
Study units or simply who the subjects or participants are
Sample
A subset of the target population, ideally representative of it
parameter
measure of a population
statistic
measure of a sample
explanatory or predictor variables
independent variables
response variable
dependent variable
extraneous variables
explanatory variables that are not of interest that could affect the response variable
systematic sampling
The first person is selected randomly
Then every nth person is selected (or one person every fixed period of time)
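A minimal sketch of the systematic sampling card above, assuming the population is an ordered list. The function name `systematic_sample` and the roster of 100 people are hypothetical, for illustration only.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick a random start among the first k units, then every k-th unit after it."""
    k = len(population) // n          # sampling interval (assumes len(population) >= n)
    rng = random.Random(seed)
    start = rng.randrange(k)          # random starting index in [0, k)
    return population[start::k][:n]

people = list(range(1, 101))          # hypothetical roster of 100 people
sample = systematic_sample(people, n=10, seed=1)
```

Only the starting point is random; every later pick is fixed by the interval k.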
Stratified vs cluster
Stratified: Population is divided into groups based on prior information. Then within each group random sampling is done. Alberta example
Cluster: Randomly select groups of people and then all people within these groups are interviewed.
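The contrast between the two designs can be sketched as below. The four groups A to D and their members are made-up data, and the sample sizes (5 per group, 2 clusters) are arbitrary assumptions.

```python
import random

rng = random.Random(0)
# Hypothetical population: 4 groups (e.g. regions) of 25 people each.
groups = {g: [f"{g}{i}" for i in range(25)] for g in "ABCD"}

# Stratified: a random sample is taken WITHIN every group.
stratified = [p for members in groups.values() for p in rng.sample(members, 5)]

# Cluster: whole groups are chosen at random, then EVERYONE in them is included.
chosen = rng.sample(sorted(groups), 2)
cluster = [p for g in chosen for p in groups[g]]
```

Every group appears in the stratified sample, while the cluster sample contains only the chosen groups, in full.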
Voluntary response bias
Asking for volunteers but people who like something are more likely to volunteer
Response bias
Loaded questions: Questions suggest or prompt a particular response favored by the researcher
Nonresponse bias
A large number of people fail to respond to the questions
Difference between observational and experimental studies
In observational studies, there is no manipulation or control of variables/conditions. In experiments there is deliberate manipulation of explanatory variables
Observational studies population and causal inferences
Population inferences: Can be made with random selection
Causal inferences: Can NOT be made as there are too many extraneous variables, so cause and effect cannot be established
Experimental studies population and causal inferences
Both can be made if there is random selection and random assignment
Relative frequency
Frequency divided by total number of observations
Pie Chart
Frequency of categorical data
Bar Graph
show the frequencies for one variable
Marginal and joint distribution
Marginal: Total frequency for each variable
Joint: Frequency of joint event
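A small sketch of marginal and joint frequencies from a two-way table. The table and its category names are hypothetical; relative frequencies are obtained by dividing by the total number of observations.

```python
# Hypothetical two-way table of counts: rows = group, columns = answer.
table = {"first-year": {"yes": 30, "no": 20},
         "second-year": {"yes": 10, "no": 40}}

total = sum(sum(row.values()) for row in table.values())

# Marginal: totals for each variable on its own.
row_marginals = {r: sum(row.values()) for r, row in table.items()}
col_marginals = {c: sum(table[r][c] for r in table) for c in ["yes", "no"]}

# Joint: relative frequency of each (row, column) combination.
joint_rel = {(r, c): n / total for r, row in table.items() for c, n in row.items()}
```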
Conditional distribution
Negative skew vs positive skew
Median and mean resistance
Median is resistant to extreme values or skewness
Mean: is NOT resistant because it is influenced by extreme values and skewness
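A quick illustration of resistance, using a made-up data set and one made-up extreme value:

```python
from statistics import mean, median

data = [2, 3, 4, 5, 6]
with_outlier = data + [100]          # add one extreme value

mean_before, median_before = mean(data), median(data)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# How far each measure moved after adding the extreme value.
mean_shift = mean_after - mean_before
median_shift = median_after - median_before
```

The mean moves much further than the median, which is why the median is preferred for skewed data.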
Skewed distribution best measure of centre and spread
Centre: Median
Spread: Quartiles
Symmetric distributions best measure of centre and spread
Centre: Mean
Spread: Standard deviation
Mean mode and median in right/positive skewed
Mode < Median < Mean
RODE
Mean mode and median in left/negative skewed
Mean < Median < Mode
If there is an odd number of observations, is the median included?
no
Boxplots parts
Population standard deviation
Boxplots skew
Quartiles skew
Probability for at least one
Sample size for normal distribution
> 30
Between a small sample and a large sample, which has more sampling variability?
The smaller sample has more variability
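One way to see this, assuming the card refers to the variability of the sample mean: the standard error is sigma / sqrt(n), so it shrinks as n grows. The helper name `standard_error` and the numbers are illustrative only.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / sqrt(n)

se_small = standard_error(10, 25)    # small sample
se_large = standard_error(10, 100)   # large sample
```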
Calculating quartiles
height calculation for a uniform distribution
The variance between two values
finding z or probability for sample mean
probability or z score for a sample proportion
assumption of normality for proportion
central limit theorem
mean for two points
Probability between two points sometimes on a rectangle
This finds the actual probability not the z score
Type I error
Accidentally rejecting the null hypothesis when it is actually true
You thought there was a difference when there wasn’t one
Alpha
Type II error
Not rejecting the null hypothesis when you should have
You thought there was no difference when there was one
beta
To reject the null hypothesis, is p greater than or less than alpha?
p < alpha
What does alpha mean
The maximum probability of a Type I error that you will allow
what does the p value mean
The probability, assuming the null hypothesis is true, of getting a result at least as extreme as the one observed
Margin of error one proportion
What does the p value mean in proportions
When to reject the null hypothesis in a one-population proportion test
the hypothesized proportion is not within the confidence interval
Determining sample size
two population proportions rejecting the null hypothesis
Chi squared expected frequency calculation
when do you double the p-value
In a z test when it is two-tailed
Don't forget to subtract the table area from 1 first if z is positive
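A sketch of the two-tailed calculation, assuming a standard normal test statistic. The function name `two_tailed_p`, the z value 1.96, and alpha = 0.05 are illustrative choices, not values from the cards.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for a z statistic."""
    upper_tail = 1 - NormalDist().cdf(abs(z))   # area beyond |z| in one tail
    return 2 * upper_tail                       # double it for two tails

p = two_tailed_p(1.96)
alpha = 0.05
reject = p < alpha      # reject the null hypothesis when p < alpha
```

Taking abs(z) handles both signs at once, so the "subtract from 1" step is always applied to the upper tail.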
How to calculate margin of error
Critical value (z or t for alpha/2) multiplied by SE
So the confidence interval without the point estimate
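As a hedged example, here is the margin of error for one proportion, assuming the usual z-based formula with SE = sqrt(p_hat(1 - p_hat)/n). The function name and the inputs (p_hat = 0.5, n = 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence=0.95):
    """Critical value times standard error for one proportion."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for alpha/2
    se = sqrt(p_hat * (1 - p_hat) / n)             # standard error
    return z_crit * se

moe = margin_of_error(0.5, 100)
moe_bigger_sample = margin_of_error(0.5, 400)      # 4x the sample size
```

The confidence interval is then the point estimate plus or minus this value. Quadrupling n halves the margin of error.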
When to reject the null hypothesis for one population mean
what does sp mean
pooled standard deviation
when to use paired t test
Ecological monitoring
Medical measurements from the same patient
Taking measurements from the same area
What do k and n mean in ANOVA
k means number of populations
n means total number of observations (x)
how to find pooled standard deviation in anova
How to find F statistic in ANOVA
on formula sheet
F = (mean square between groups) / (mean square within groups)
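A minimal one-way ANOVA sketch on made-up data, assuming the standard definitions: each mean square is a sum of squares divided by its degrees of freedom (k - 1 between groups, n - k within groups). All variable names are illustrative.

```python
from statistics import mean

groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]    # hypothetical data, 3 groups
k = len(groups)                                 # number of populations
n = sum(len(g) for g in groups)                 # total number of observations
grand = mean(x for g in groups for x in g)      # grand mean

ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

ms_between = ss_between / (k - 1)               # mean square = SS / df
ms_within = ss_within / (n - k)
f_stat = ms_between / ms_within
pooled_sd = ms_within ** 0.5                    # square root of the within-groups mean square
```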
relationship between sum of squares, df, and mean square
What does a linear regression measure
relationship between two quantitative variables
How to determine the strength of a linear relationship
r close to -1 or 1 means a strong linear relationship
r close to 0 indicates none or weak relationship
how to calculate r
units of r
no units as they cancel out during calculations
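One common way to compute r, shown here as a sketch: average the products of the standardized x and y values, dividing by n - 1. The course formula sheet may write it differently; the function name `pearson_r` and the data are assumptions.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation coefficient from standardized values (z-scores)."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)

xs = [1, 2, 3, 4]
r_pos = pearson_r(xs, [2, 4, 6, 8])                        # perfect positive line
r_neg = pearson_r(xs, [8, 6, 4, 2])                        # perfect negative line
r_rescaled = pearson_r([x * 100 for x in xs], [2, 4, 6, 8])  # change of units in x
```

Because each value is divided by its own standard deviation, the units cancel and rescaling x leaves r unchanged.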
interpolation vs extrapolation
coefficient of determination
r squared
How to tell if two events are independent
If P(A∩B) is the same as P(A) times P(B) then the events are independent
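The check can be sketched with one roll of a fair die. The events A, B and C are made up for illustration, and `prob` is a hypothetical helper that counts equally likely outcomes.

```python
from fractions import Fraction

outcomes = range(1, 7)                       # one roll of a fair die

def prob(event):
    """Probability of an event as favourable outcomes / total outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

A = lambda o: o % 2 == 0                     # roll is even
B = lambda o: o <= 2                         # roll is 1 or 2
C = lambda o: o % 2 == 1                     # roll is odd

# Independent when P(A and B) equals P(A) * P(B).
independent_AB = prob(lambda o: A(o) and B(o)) == prob(A) * prob(B)
independent_AC = prob(lambda o: A(o) and C(o)) == prob(A) * prob(C)
```

A and B pass the check. A and C can never happen together, so their joint probability is 0 and they fail it.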
Chi square tests
Chi square independence test hypothesis
Assumptions for a one/two proportion z test
- All samples are taken independently
- The number of failures and successes are both at least 10
Properties of chi square curves
one tailed
always positive
Chi square assumptions
- simple random sampling
- independent samples
- sample size should be no more than 10% of population
- all expected frequencies are at least 5
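A sketch of checking the expected-frequency assumption for an independence test, assuming the standard formula expected = (row total × column total) / grand total. The observed counts are made up.

```python
observed = [[20, 30],
            [30, 20]]                        # hypothetical 2x2 table of counts

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell under independence.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Assumption check: every expected frequency is at least 5.
assumption_ok = all(e >= 5 for row in expected for e in row)

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_sq = sum((o - e) ** 2 / e
             for obs_row, exp_row in zip(observed, expected)
             for o, e in zip(obs_row, exp_row))
```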
properties of a t-curve
margin of error and sample size
Inversely related: a larger sample size gives a smaller margin of error (margin of error is proportional to 1/sqrt(n))
Properties of a F curve
Disjoint
Events are considered disjoint if they never occur at the same time; these are also known as mutually exclusive events.
Events are considered independent if they are unrelated.
The level of significance in a hypothesis test is?
The probability of rejecting a null hypothesis when it is in fact true
probability for mean
probability for proportion
what is the assumption of equal variance?
When the largest SD divided by the smallest SD is under 2
how to find the best estimate of standard deviation or the best estimate of common variance?
Square root of MSE (MSE itself is the best estimate of the common variance)
In lab this is the square root of the residual mean square
assumption of equal variances for an ANOVA test
the F-test is valid as long as the largest standard deviation is no more than twice the smallest standard deviation
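The rule of thumb can be checked with a short helper. The name `equal_variance_ok` and the three samples are hypothetical.

```python
from statistics import stdev

def equal_variance_ok(*samples):
    """True when the largest sample SD is no more than twice the smallest."""
    sds = [stdev(s) for s in samples]
    return max(sds) / min(sds) <= 2

a = [1, 2, 3]          # SD = 1
b = [1, 2.5, 4]        # SD = 1.5
c = [0, 5, 10]         # SD = 5

ok_ab = equal_variance_ok(a, b)        # ratio 1.5
ok_abc = equal_variance_ok(a, b, c)    # ratio 5
```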
how to find the sum of squares or ss
how to find the sum of squares/ss error or residual