Modules 1 and 2 Flashcards
What is a population?
The set of all “subjects” relevant to the scientific hypothesis under examination.
What are variables?
Characteristics that differ among individuals.
What are parameters?
Quantities describing a population.
What are some examples of parameters?
mean, standard deviation, median, mode.
What is the mean of a population denoted by?
mu
What is the standard deviation of a population denoted by?
sigma
The parameters of a population are denoted using what alphabet?
Greek alphabet
What is a census?
A collection of data where the entire population is examined.
True or False: a census is a common way to find the true mean of a population.
False. Censuses are often hard to perform and are uncommon.
What is a sample?
The subset of “subjects” selected from a statistical population that are actually examined during a particular study.
What are sample statistics?
The statistics calculated from a sample and used to estimate the population parameters.
What is the sample mean denoted by?
x-bar (x̄)
What is the sample standard deviation denoted by?
s
The sample statistics are denoted by what alphabet?
Roman alphabet.
What two requirements make a random sample?
Each subject has an equal chance of being selected and each subject is selected independently from the other subjects.
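A minimal sketch of drawing a random sample, assuming a made-up population of 100 numbered subjects and a fixed seed so the draw can be repeated. Sampling without replacement, as here, gives every subject the same chance of selection; the draws are only approximately independent when the sample is small relative to the population.

```python
import random

# Hypothetical population: 100 subjects identified by number.
population = list(range(1, 101))

# Fixed seed (an assumption for repeatability, not a requirement).
rng = random.Random(42)

# random.sample draws without replacement, so every subject has
# the same chance of ending up in the sample.
sample = rng.sample(population, k=10)
print(sample)
```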
What four requirements make a good sample?
- Carefully defined statistical populations
- Random sampling
- Precise
- Unbiased
What is precision?
When results are all close together and not widely spread out from each other.
What is bias?
A systematic difference between the results and the truth, often because certain factors were not considered.
What is a sample of convenience?
A collection of subjects that are easily available.
What is a volunteer sample?
Participants volunteer information or participation in the study.
A volunteer sample and a convenience sample are both what type of samples?
Non-random samples.
What are the two types of studies?
Experimental and Observational
What is an experimental study?
The treatments are assigned randomly to individuals.
What is an observational study?
The treatments are not assigned by the researcher, but are instead already in place.
If Max randomly assigns twenty plants to a selection of five lakes to watch the effects on phosphorus concentration in those plants, is that experimental or observational?
Observational. Although Max assigns the plants to the lakes, he does not assign the phosphorus levels of the lakes; those occur naturally and are what is being observed.
What is a confounding variable?
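An additional variable, not accounted for in the study, that affects the outcome and can distort the apparent relationship between the explanatory and response variables.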
What are the two types of variables?
Categorical and Numerical.
What are the two ways a numerical variable is measured?
Continuous and Discrete.
What is a categorical variable?
A variable that is non-numerical and is instead based on a named value, such as a colour or a satisfaction level.
What is a numerical variable?
A variable that is a number value, such as the number of cats in a house, or the length of someone’s arm.
What is a continuous numerical variable?
A variable that can take any value within a range, with infinitely many possible in-between values, such as weight or height.
What is a discrete numerical variable?
A variable that has a value that is not divisible and must be a whole number, such as the number of cats in a house or the number of students in a classroom.
What are the two ways a categorical variable can be measured?
Nominal and Ordinal.
What is a nominal categorical variable?
A variable that has categorical options that have no order, such as one’s favourite colour.
What is an ordinal categorical variable?
A variable that has categorical options that are in a specific order, such as letter grades.
What are the four measurement scales for variables?
Nominal, Ordinal, Interval, and Ratio.
What is an interval numerical variable?
A variable that is on a numerical scale with an arbitrary value representing zero, such as how we number years.
What is a ratio numerical variable?
A variable that is on a numerical scale with a true zero value, often one denoting the absence of something, such as the length of one’s thumb, which could be zero should someone have no thumb.
What is an explanatory variable?
The variable responsible for the change in the response variable. It is also called the independent variable.
What is a response variable?
The focus of the study. It is also called the dependent variable.
What are the four quantities that are used to describe a sample?
- Frequency Distributions
- Measures of Location
- Measures of Spread
- Measures of Shape
What is a frequency distribution?
It describes the number of times each value of a variable occurs in a sample. It can use absolute or relative values.
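As a rough illustration of absolute versus relative frequencies, the sketch below tabulates a made-up sample (the `cats` list is hypothetical data, not from the course).

```python
from collections import Counter

# Hypothetical sample: number of cats in each of 10 houses.
cats = [0, 1, 1, 2, 0, 1, 3, 1, 0, 2]

# Absolute frequencies: how many times each value occurs.
absolute = Counter(cats)

# Relative frequencies: each count divided by the sample size.
n = len(cats)
relative = {value: count / n for value, count in absolute.items()}

for value in sorted(absolute):
    print(value, absolute[value], relative[value])
```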
What is a measure of location?
The mean, median, and mode. The central tendency of a curve.
What is the mean?
The arithmetic average. When all values are considered, what is the average value.
What is the median?
The middle of the data. When the values are sorted from smallest to largest, it is the value in the middle, with half of the observations below it and half above it.
What is the mode?
The most commonly occurring value. The highest point on the graph.
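The three measures of location can be compared on one small data set. This is a sketch using Python's standard `statistics` module on made-up numbers; the median here is the middle value of the sorted data, and the single large value pulls the mean above it.

```python
import statistics

# Hypothetical sample with one unusually large value.
data = [2, 3, 3, 4, 5, 6, 19]

mean = statistics.mean(data)      # arithmetic average: 42 / 7
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most commonly occurring value

print(mean, median, mode)
```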
What is positive skew in central tendency?
When the central tendency peaks on the lower end of the graph.
What is a negative skew in central tendency?
When the central tendency peaks on the higher end of the graph.
What are the three measures of spread?
Range, Variance, and Standard Deviation
What is range?
The difference between the maximum and minimum values.
What is variance?
The expected squared difference between an observation and the mean.
What is the formula to calculate variance?
s^2 = sum((xi - x)^2) / (n - 1)
Where
s^2 = variance (s = standard deviation)
xi = one observation
x = sample mean (x-bar)
n = number of observations
What is standard deviation?
The positive square root of the variance. The statistical measurement that looks at how far a group of numbers is from the mean.
What is the formula to calculate standard deviation?
s = sqrt(s^2) = sqrt((1 / (n - 1)) sum((xi - x)^2))
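The variance and standard deviation formulas can be checked by hand in a few lines. The sketch below uses made-up data and hypothetical helper names (`sample_variance`, `sample_sd`), then compares the results with the standard library's own functions.

```python
import math
import statistics

def sample_variance(values):
    """s^2 = sum((xi - x)^2) / (n - 1)"""
    n = len(values)
    mean = sum(values) / n
    return sum((xi - mean) ** 2 for xi in values) / (n - 1)

def sample_sd(values):
    """s = positive square root of the variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical sample: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(sample_variance(data))  # 32 / 7
print(sample_sd(data))

# The standard library uses the same n - 1 denominator.
print(statistics.variance(data), statistics.stdev(data))
```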
What is right skew?
Positive skew. When the highest point is to the left and the tail of the right side splays out.
What is left skew?
Negative skew. When the highest point is to the right and the tail of the left side splays out.
What is estimation?
The process of inferring a population parameter from sample data.
What is uncertainty?
A situation in which something is not known. In statistics, it is the error of an estimate.
What is a sampling distribution?
The probability distribution of all the values for an estimate that we might have obtained when we sampled the population.
What are the differences between frequency distributions, probability distributions, and sampling distributions?
A frequency distribution counts how many times each value was actually observed in a sample. A probability distribution gives the true probability of each possible outcome. A sampling distribution is the probability distribution of an estimate (such as the sample mean) across all the samples we might have drawn.
For example, the probability distribution for the total shown when rolling two dice is a symmetric triangle peaking at 7. The frequency distribution from actually rolling two dice a hundred times might not be so clean. The sampling distribution of the mean total describes how that mean would vary if the hundred rolls were repeated many times.
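The two-dice example can be simulated to keep the three kinds of distribution apart. This is only a sketch: the seed, the 100 rolls per sample, and the 1000 repeated samples are arbitrary choices, and the sampling distribution is approximated by simulation rather than derived exactly.

```python
import random
import statistics
from collections import Counter
from fractions import Fraction

# Probability distribution: exact chance of each total for two fair dice.
probability = Counter()
for a in range(1, 7):
    for b in range(1, 7):
        probability[a + b] += Fraction(1, 36)

rng = random.Random(1)  # fixed seed, chosen arbitrarily

def roll_totals(n_rolls):
    """Roll two dice n_rolls times and return the totals."""
    return [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls)]

# Frequency distribution: what one sample of 100 rolls actually produced.
frequency = Counter(roll_totals(100))

# Sampling distribution (approximated): the mean total from each of
# 1000 repeated samples of 100 rolls.
sample_means = [statistics.mean(roll_totals(100)) for _ in range(1000)]

print(probability[7])                 # exact probability of a total of 7
print(frequency[7])                   # how often 7 came up in this sample
print(statistics.stdev(sample_means)) # spread of the estimate
```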
What is the confidence interval?
A range of values that is likely to contain the true population parameter.
How is the confidence interval calculated?
For an approximate 95% confidence interval: mean ± 1.96 * standard error. Subtracting 1.96 * standard error from the mean gives the lower limit, and adding it gives the upper limit.
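A minimal sketch of that calculation on made-up data. The function name `confidence_interval_95` is hypothetical, and the 1.96 multiplier assumes the sampling distribution of the mean is roughly normal; with small samples a somewhat larger multiplier is usually more appropriate.

```python
import math
import statistics

def confidence_interval_95(values):
    """Approximate 95% confidence interval for the mean."""
    mean = statistics.mean(values)
    # Standard error: sample standard deviation over sqrt(n).
    se = statistics.stdev(values) / math.sqrt(len(values))
    margin = 1.96 * se
    return mean - margin, mean + margin

# Hypothetical sample with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
low, high = confidence_interval_95(data)
print(low, high)
```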
How is standard error calculated?
s / sqrt(n)
Where
s = standard deviation
n = number of subjects
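The formula is short enough to try directly. In this sketch (the function name and the numbers are illustrative), keeping s fixed and quadrupling n halves the standard error.

```python
import math

def standard_error(s, n):
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)

print(standard_error(10, 25))   # 2.0
print(standard_error(10, 100))  # 1.0
```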
True or False: standard deviation is the same as standard error.
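False. Standard deviation describes the spread of individual observations. Standard error describes the uncertainty of an estimate and is used to calculate the confidence interval.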
True or False: parameter estimates are influenced by chance.
True.
True or False: the standard deviation is the standard error of the sampling distribution.
False. The standard error is the standard deviation of the sampling distribution.
True or False: confidence intervals bound plausible parameter values.
True.
What is the relationship between uncertainty and precision when related to sample size?
Inverse relationship. As sample size increases, uncertainty decreases and precision increases.
What is a biological hypothesis?
A statistical hypothesis about population parameters but tested using sample statistics.
What is a null hypothesis?
A hypothesis that nothing will happen, that there is no change. For example, lizards of all sizes are equally vulnerable to birds. It is denoted by H0.
What is an alternative hypothesis?
A hypothesis that something will happen, that there is a real change or difference. For example, lizard vulnerability to birds is dependent on lizard size. It is denoted by H1 or HA.
What does hypothesis testing look like?
- Null hypothesis
- If H0 is true, what should the data look like?
- If data is unexpected, infer that H0 is false (improbable)
- Then infer that the alternative hypothesis is true.
- If data is expected, fail to reject H0.
True or False: In some cases we reject the null hypothesis, in other cases, we accept the null hypothesis.
False. We never accept the null hypothesis; we only fail to reject it.
What is the p-value?
The probability of obtaining data as extreme as, or more extreme than, the observed data if the null hypothesis is true.
How is the p value calculated?
Locate the observed result on the null distribution. Add up the probabilities of all values from that point out to the extreme, plus the probabilities of all values that are equally or more extreme on the other side.
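That adding-up can be sketched for a simple case. The example below assumes a null hypothesis that each of n trials is a "success" with probability 0.5, so the null distribution is binomial; the function names and the 14-out-of-18 result are illustrative only.

```python
from math import comb

def binomial_null(n):
    """Null distribution of the number of successes in n trials
    when H0 says each trial succeeds with probability 0.5."""
    return {k: comb(n, k) / 2 ** n for k in range(n + 1)}

def two_tailed_p(observed, n):
    """Sum the probabilities of every outcome at least as far from
    the expected count (n / 2) as the observed count, on both sides."""
    null = binomial_null(n)
    expected = n / 2
    distance = abs(observed - expected)
    return sum(p for k, p in null.items() if abs(k - expected) >= distance)

# 14 successes in 18 trials: outcomes 14..18 and 0..4 are counted.
p = two_tailed_p(14, 18)
print(round(p, 3))  # about 0.031
```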
What is a null distribution?
The probability distribution of test statistic values when a random sample is taken from a hypothetical population for which the null hypothesis is true.
What is the significance level (alpha)?
The probability used as a criterion for rejecting the null hypothesis.
What alpha value is typically used?
0.05
What do the p-value and alpha value tell us about the null hypothesis?
p-value > alpha, fail to reject H0
p-value <= alpha, reject H0
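The decision rule is a single comparison. A tiny sketch, with a hypothetical function name and the usual alpha of 0.05 as the default:

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 only when p-value <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(decide(0.031))  # reject H0
print(decide(0.20))   # fail to reject H0
```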
True or False: in class, we’re using one-tailed tests.
False, we’re using two-tailed tests.
What is a type I error?
A false positive. Rejecting a true null hypothesis.
What is a type II error?
A false negative. Failing to reject a false null hypothesis. We aren’t usually aware of this one.
What is power dependent on?
How different the truth is from the null hypothesis, the type I error rate, and the sample size.
What are the advantages of an experimental study?
It minimizes the impact of confounding variables on the outcome of the study.
What are the disadvantages of an experimental study?
Experimental artifacts can introduce a bias through unintended consequences of experimental procedures.
What are experimental artifacts?
Blind spots in our research. Things we don’t consider that impact the results of the study.
True or False: every study should have a control group.
True.
What is a control group?
A group of subjects treated the same as the other groups except that they do not receive the treatment. It often works better if they are under the impression that they did receive the treatment.
What is randomization?
The random assignment of treatments to the different subjects of an experiment.
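One simple way to randomize, sketched below, is to shuffle the subjects and deal them out to the treatments in turn. The function name, subject labels, and seed are all hypothetical; dealing in turn also keeps the group sizes equal.

```python
import random
from collections import Counter

def randomize(subjects, treatments, seed=None):
    """Randomly assign each subject to one treatment."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Deal the shuffled subjects to treatments in turn.
    return {
        subject: treatments[i % len(treatments)]
        for i, subject in enumerate(shuffled)
    }

plants = [f"plant{i}" for i in range(1, 11)]
assignment = randomize(plants, ["control", "treatment"], seed=7)
print(Counter(assignment.values()))
```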
What is blinding?
The process of concealing information about the control/treatment group’s assignment from the participants.
What is single blind vs double blind?
Single blind is when the participants don’t know whether they are in the control group or the treatment group. Double blind is when neither the participants nor the researchers know which group is which.
Why are studies made blind?
Because it prevents modification of behaviour based on treatment that would otherwise skew results.
What are the four ways sampling error is decreased?
- Replication
- Balance
- Blocking
- Extreme treatments
What is replication?
The application of treatment to multiple, independent experimental subjects or units.
What is pseudoreplication?
Treating interdependent observations as independent, such as when subjects sharing the same environment are counted as independent individuals. For example, four red plants are put in one chamber for treatment and four blue plants are put in a different chamber. Any difference between the chambers themselves cannot be separated from the difference between the plants.
What is balance?
Having an equal number of units in each treatment. For example, including five blue plants and five red plants in one treatment instead of seven red plants and three blue plants.
What is blocking?
Assigning an equal number of each type of experimental unit into each group. So if you’re testing blue plants, red plants, and yellow plants, each group will have an equal number of red, blue, and yellow plants.
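A sketch of the plant example, assuming two groups and four plants of each colour (the function name, labels, and seed are made up). Units are shuffled within each colour and then dealt out, so every group receives the same number of each colour while the choice of which plant goes where stays random.

```python
import random

def blocked_assignment(units_by_block, groups, seed=None):
    """Within each block, randomly split the units evenly across groups."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in units_by_block.items():
        shuffled = list(units)
        rng.shuffle(shuffled)
        # Deal this block's units to the groups in turn.
        for i, unit in enumerate(shuffled):
            assignment[unit] = groups[i % len(groups)]
    return assignment

plants = {
    "red": ["red1", "red2", "red3", "red4"],
    "blue": ["blue1", "blue2", "blue3", "blue4"],
    "yellow": ["yellow1", "yellow2", "yellow3", "yellow4"],
}
assignment = blocked_assignment(plants, ["control", "treatment"], seed=3)
for unit, group in sorted(assignment.items()):
    print(unit, group)
```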
True or False: Blocking and Random Assignment are the same.
False. Blocking is used when confounding variables are known. Random Assignment is used when confounding variables are not known.
What is a problem with using extreme treatments to reduce sampling error?
Responses are not always linear, so this method may not work.
What are the three ways to reduce bias?
- Control Groups
- Randomization
- Blinding
What is the difference between bias and sampling error?
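Bias is a systematic error that pushes results away from the truth in a consistent direction, often introduced unknowingly by how the study is run. Sampling error is the chance difference between a sample estimate and the population parameter. The difference is that one comes from how we do the study and the other happens by chance.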