2 - Me, Myself and I Flashcards
Why is biology considered a quantitative subject?
Research relies heavily on accurate and precise measurements. Variables are often manipulated, and the use of controls allows us to observe cause-and-effect relationships. We can quantify diversity via experiments, and use the resulting data to observe trends and create graphs. We can also form hypotheses, make predictions and establish causes.
Why are statistical analyses important?
Mathematical models and statistical analyses are important as they help us to understand genetic data and complex mechanisms.
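What are the mean, median and range?
Mean - the arithmetic average of a dataset (sum of the values divided by the number of values).
Median - the middle value of a dataset once the values are sorted.
Range - the difference between the maximum and minimum values of a dataset (spread of variable values).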
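As a quick worked example, the three summaries can be computed with Python's standard library. The leaf-length values below are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical example data: leaf lengths in mm (not from the flashcards).
leaf_lengths = [12, 15, 11, 18, 15, 14, 13]

mean_value = statistics.mean(leaf_lengths)           # sum of values / number of values
median_value = statistics.median(leaf_lengths)       # middle value once sorted
value_range = max(leaf_lengths) - min(leaf_lengths)  # maximum minus minimum

print(mean_value, median_value, value_range)
```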
What is the difference between a sample and population?
A sample is only a small subset of the total population, so we must take this into consideration when looking at its results - the data collected may differ from the wider population.
The population is all members of a defined group.
What is sampling error and what causes it?
Sampling error is the random variation introduced into a dataset by sampling only a subset of the total population - the value(s) from the sample differ from the true population value(s).
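A minimal simulation sketch of this idea, assuming a made-up population of 10,000 normally distributed measurements: the sample mean differs from the population mean by chance alone, and the typical difference shrinks as the sample gets larger. The population, the sample sizes and the helper name `sampling_error` are all illustrative assumptions.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated body lengths in mm.
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

def sampling_error(sample_size):
    """Absolute difference between one random sample's mean and the population mean."""
    sample = random.sample(population, sample_size)
    return abs(statistics.mean(sample) - population_mean)

# Average the error over many repeated samples of each size.
small_error = statistics.mean([sampling_error(10) for _ in range(500)])
large_error = statistics.mean([sampling_error(1000) for _ in range(500)])

print(f"typical error, n=10:   {small_error:.2f}")
print(f"typical error, n=1000: {large_error:.2f}")
```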
Ways to represent categorical data (1).
The chi-square test assumes that the variables are categorical (can be divided into groups), that observations are independent, and that the expected count in each class is at least 5. It can be used where observations are assigned to mutually exclusive classes - the observed counts are compared to those expected under the null.
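A small worked sketch of a chi-square calculation with two classes, using hypothetical counts. The statistic is the sum of (observed - expected)^2 / expected over the classes. The closed-form p-value used here is valid only for 1 degree of freedom; for other cases a statistics package or a printed table would normally be used.

```python
import math

# Hypothetical counts: 100 offspring scored into two mutually exclusive classes.
observed = [60, 40]
# Under the null both classes are equally likely, so we expect 50 of each.
expected = [50, 50]

# Chi-square statistic: sum of (observed - expected)^2 / expected over classes.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom for this kind of test: number of classes minus 1.
degrees_of_freedom = len(observed) - 1

# For 1 degree of freedom only, the p-value has a simple closed form.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(chi_square, degrees_of_freedom, round(p_value, 4))
```

Here the statistic is 4.0 with 1 degree of freedom, giving a p-value just under 0.05.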
Ways to represent continuous data (1).
Boxplots are effective for presenting continuous data (numeric measurements that can take any value within a range) - they show the median, range, IQR (interquartile range) and dots for outliers.
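The numbers behind a boxplot can be sketched without drawing anything. The measurements below are hypothetical, and the 1.5 x IQR rule for flagging outliers is an assumed (though common) plotting convention, not something stated in these flashcards.

```python
import statistics

# Hypothetical measurements, with one unusually large value.
values = [4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3, 6.4, 7.0, 12.5]

# Quartiles: the box spans q1 to q3, with a line at the median.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the box are drawn as dots.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(q1, median, q3, iqr, outliers)
```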
What is the null hypothesis?
The default expectation that there is no relationship or association, e.g. that categorical outcomes are all equally likely.
What is the alternative hypothesis?
The expectation that categorical outcomes are not all equally likely, and so there is a relationship between two measured phenomena, or association.
What are the degrees of freedom?
This refers to the number of values in a calculation that are free to vary - minimum is 1.
What does p<0.05 mean?
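The result is statistically significant: if the null were true, there would be less than a 5% chance of seeing a deviation at least this large. We therefore reject the null in favour of the alternative.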
How does statistical significance relate to the p-value?
The p-value is the probability of seeing a trend/deviation at least as large as the one observed if the null were true. The smaller it is, the more inclined we are to reject the null and favour the alternative. p < 0.05 is the usual threshold for calling a result statistically significant, as there is then less than a 5% chance of seeing such a deviation under the null.
How can p-values be used as evidence?
If the p-value is well below 0.05, we reject the null, as a deviation this large would be unlikely if the null were true. If it is too close to the threshold, you may be inclined to repeat the experiment and increase the sample size. Statistical analysis then lets us see whether the data support the null or the alternative, and whether the deviation from the null is likely due to sampling error.
What is a type I and II error?
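Type I: false positives (rejecting a null that is actually true) - a risk with borderline results, e.g. p = 0.049.
Type II: false negatives (failing to reject a null that is actually false) - a risk with borderline results, e.g. p = 0.051.
In borderline cases like these we should not reject/accept any hypotheses, and should instead collect more data.
These errors may arise due to sample size (data collection) and experimental design.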
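A rough simulation sketch of type I errors, assuming a fair coin as the null and a chi-square test with 1 degree of freedom (critical value 3.841 at p = 0.05). Every simulated experiment has a true null, so any "significant" result is a false positive. The function name `null_experiment` and the numbers of tosses and experiments are illustrative choices.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

CRITICAL_VALUE = 3.841  # chi-square cutoff for p = 0.05 with 1 degree of freedom

def null_experiment(n_tosses=100):
    """Simulate one experiment where the null is true (fair coin); return chi-square."""
    heads = sum(random.random() < 0.5 for _ in range(n_tosses))
    tails = n_tosses - heads
    expected = n_tosses / 2
    return (heads - expected) ** 2 / expected + (tails - expected) ** 2 / expected

n_experiments = 2000
# Count experiments that look "significant" even though the null is true.
false_positives = sum(null_experiment() > CRITICAL_VALUE for _ in range(n_experiments))
false_positive_rate = false_positives / n_experiments

print(f"false positive rate: {false_positive_rate:.3f}")
```

The rate should land in the neighbourhood of 5%, though not exactly, because coin-toss counts are whole numbers and the chi-square cutoff is an approximation.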
What is effect size?
The effect size is the magnitude of the effect - the degree to which the phenomenon affects the whole population, and not just the sample. A small effect size indicates minimal/negligible effects; a large effect size indicates substantial effects.
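The flashcards do not name a specific measure of effect size. One common choice for comparing two group means is Cohen's d (difference in means divided by the pooled standard deviation), sketched below as an assumption, with hypothetical data.

```python
import math
import statistics

# Hypothetical measurements for two groups.
control = [10.0, 11.0, 12.0, 13.0, 14.0]
treated = [12.0, 13.0, 14.0, 15.0, 16.0]

def cohens_d(a, b):
    """Cohen's d: difference in group means in units of the pooled standard deviation."""
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    pooled_sd = math.sqrt(
        ((len(a) - 1) * var_a + (len(b) - 1) * var_b) / (len(a) + len(b) - 2)
    )
    return (statistics.mean(b) - statistics.mean(a)) / pooled_sd

effect = cohens_d(control, treated)
print(f"Cohen's d: {effect:.2f}")
```

Unlike a p-value, this number does not depend directly on sample size: it describes how big the difference is, not how confident we are that it exists.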