me, myself and I Flashcards
discuss quantifying biological data
-biological research relies on accurate and precise measurements of biological parameters, e.g. length, mass, concentration, time, and genetic sequence.
-researchers often manipulate variables and control experimental conditions to understand cause-and-effect relationships. rigorous quantification is required to ensure reliable, reproducible results.
-mathematical models and statistical analyses play a vital role in understanding genetic data and deciphering complex genetic mechanisms
what are SI units?
International System of Units (SI, from the French Système international d'unités)
length = meter (m)
mass = kilogram (kg)
what is quantitative biology?
Quantitative biology is an umbrella term encompassing the use of mathematical, statistical or computational techniques to study life and living organisms. The central theme and goal of quantitative biology is the creation of predictive models based on fundamental principles governing living systems.
what is the mean?
The mean is equal to the sum of all the values in the data set divided by the number of values in the data set.
what is the median?
The middle value in a set of numbers arranged in increasing order. If there is an even number of values, the median is the average of the middle two values.
what is the range?
The range in statistics for a given data set is the difference between the highest and lowest values
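As a minimal sketch of the mean, median and range definitions on these cards, the snippet below uses Python's standard `statistics` module. The leaf lengths are hypothetical values made up for illustration, not real measurements.

```python
import statistics

# Hypothetical leaf lengths in millimetres (illustrative values only).
leaf_lengths_mm = [12.0, 15.0, 11.0, 18.0, 14.0, 16.0]

# Mean: sum of all values divided by the number of values.
mean_length = statistics.mean(leaf_lengths_mm)

# Median: middle value of the sorted data; with an even number of
# values it is the average of the two middle values (14.0 and 15.0 here).
median_length = statistics.median(leaf_lengths_mm)

# Range: highest value minus lowest value.
range_length = max(leaf_lengths_mm) - min(leaf_lengths_mm)

print(mean_length, median_length, range_length)
```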
what is the difference between samples and populations?
A population is the entire group that you want to draw conclusions about. A sample is the specific group that you will collect data from.
what is a sampling error?
refers to the possibility of mistaken inference when generalizing about a population based on a sample, due to chance variations between the sample and the population.
why may a sampling error arise?
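Sampling error arises because a sample contains only part of the population, so its statistics differ from the population's parameters by chance. Even randomized samples will have some degree of sampling error because a sample is only an approximation of the population from which it is drawn. A sample that is unrepresentative or biased adds systematic error on top of this chance variation.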
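A small simulation can illustrate sampling error. This is only a sketch: the population is simulated (not real data), and the function name `sampling_error`, the sample sizes and the seed are all assumptions chosen for the example. It draws repeated random samples and compares each sample mean with the population mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated body masses in grams.
population = [random.gauss(25.0, 3.0) for _ in range(10_000)]
population_mean = statistics.mean(population)

def sampling_error(sample_size):
    """Difference between one random sample's mean and the population mean."""
    sample = random.sample(population, sample_size)
    return statistics.mean(sample) - population_mean

# Repeat the sampling many times for a small and a large sample size.
small_errors = [abs(sampling_error(10)) for _ in range(200)]
large_errors = [abs(sampling_error(1000)) for _ in range(200)]

print("typical error, n=10:  ", statistics.mean(small_errors))
print("typical error, n=1000:", statistics.mean(large_errors))
```

Every random sample misses the population mean by some amount, but larger samples tend to miss by less.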
continuous vs categorical data
Continuous data can take on any value within a defined range and is often measured on a continuous scale, such as weight, height, or temperature. Categorical data, on the other hand, consists of discrete values that fall into distinct categories or groups, such as gender, ethnicity, or product types.
null vs alternative hypothesis
The null hypothesis is the default claim, usually that there is no effect or no difference; it is the statement we test and try to disprove. The alternative hypothesis is the claim we are trying to find support for, and it is accepted only if we have sufficient evidence to reject the null hypothesis.
what is the chi-squared test?
The chi-squared test is a statistical test used to compare observed results with expected results. Its purpose is to determine whether a difference between observed data and expected data is likely due to chance, or due to a relationship between the variables you are studying. It can also be used to test whether two categorical variables are associated with each other.
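The comparison of observed and expected counts can be sketched as the usual chi-squared statistic, the sum of (observed - expected)² / expected over all categories. The offspring counts below are hypothetical, and `chi_squared_statistic` is an illustrative helper name, not a library function. The critical value 3.841 is the standard table value for 1 degree of freedom at the 0.05 significance level.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical genetic cross with 100 offspring and an expected 3:1 ratio.
observed = [70, 30]      # counts actually seen (made up for illustration)
expected = [75.0, 25.0]  # counts predicted by the 3:1 ratio

chi2 = chi_squared_statistic(observed, expected)

# Two categories give 1 degree of freedom; 3.841 is the critical value
# for 1 degree of freedom at a significance level of 0.05.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
reject_null = chi2 > CRITICAL_VALUE_DF1_ALPHA_05

print(chi2, reject_null)
```

Here the statistic is below the critical value, so the difference from the 3:1 expectation is consistent with chance and the null hypothesis is not rejected.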
what is statistical significance?
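In research, a result is statistically significant when it would be unlikely to occur by chance if the null hypothesis were true. It is assessed by comparing the p-value (the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true) with a preselected acceptable level of uncertainty, the significance level.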
how does statistical significance relate to the p-value?
The lower the p-value, the greater the statistical significance of the observed difference.
A p-value of 0.05 or lower is generally considered statistically significant.
The p-value can serve as an alternative to, or in addition to, preselected confidence levels for hypothesis testing.
how can the p-value be used as evidence to support/reject the null hypothesis?
A p-value less than 0.05 is typically considered to be statistically significant, in which case the null hypothesis should be rejected. A p-value greater than 0.05 means that deviation from the null hypothesis is not statistically significant, and the null hypothesis is not rejected.
what are type I and type II errors?
A type I error (false positive) occurs if an investigator rejects a null hypothesis that is actually true in the population.
A type II error (false negative) occurs if the investigator fails to reject a null hypothesis that is actually false in the population.
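The two error types can be illustrated with a simulation. This is a sketch under stated assumptions: it uses a two-sided z-test with a known standard deviation (chosen only because it needs nothing beyond the standard library), and the sample size, true means, trial count and helper names are all hypothetical.

```python
import random
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the sketch is repeatable

ALPHA = 0.05       # significance level
SAMPLE_SIZE = 30
TRIALS = 2000
SIGMA = 1.0        # assumed known population standard deviation
STD_NORMAL = NormalDist()

def z_test_p_value(sample, null_mean):
    """Two-sided p-value for the null hypothesis 'population mean = null_mean'."""
    z = (mean(sample) - null_mean) / (SIGMA / len(sample) ** 0.5)
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))

def rejection_rate(true_mean):
    """Fraction of simulated experiments that reject the null (mean = 0)."""
    rejections = 0
    for _ in range(TRIALS):
        sample = [random.gauss(true_mean, SIGMA) for _ in range(SAMPLE_SIZE)]
        if z_test_p_value(sample, 0.0) < ALPHA:
            rejections += 1
    return rejections / TRIALS

# Null is true (mean really is 0): every rejection is a type I error.
type_i_rate = rejection_rate(0.0)

# Null is false (mean is really 0.5): every non-rejection is a type II error.
type_ii_rate = 1 - rejection_rate(0.5)

print(type_i_rate, type_ii_rate)
```

When the null hypothesis is true, the type I error rate should land close to the chosen significance level of 0.05.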
what is the effect size? how does it relate to the practical significance of findings?
An effect size is a measure that describes the magnitude of the difference or relationship between the variables we are measuring. The absolute effect size is the difference between the average, or mean, outcomes in two different intervention groups. Because it describes how large an effect is, rather than only whether one exists, it indicates the practical (meaningful) significance of a finding.
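As a sketch, the absolute effect size described above is a difference of two group means. The example also computes Cohen's d, a common standardized effect size that is not mentioned on the card and is added here as an assumption. The group values and function names are hypothetical.

```python
import statistics

def absolute_effect_size(group_a, group_b):
    """Difference between the mean outcomes of two groups."""
    return statistics.mean(group_a) - statistics.mean(group_b)

def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return absolute_effect_size(group_a, group_b) / pooled_variance ** 0.5

# Hypothetical outcomes for two intervention groups (illustrative values).
treated = [5.1, 5.5, 5.8, 6.0, 6.1]
control = [4.8, 5.0, 5.2, 5.3, 5.7]

difference = absolute_effect_size(treated, control)
d = cohens_d(treated, control)

print(difference, d)
```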
what is the equation of a straight line?
y = mx + c, where m is the gradient (slope) and c is the y-intercept
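A straight line can be fitted to data by ordinary least squares. The sketch below assumes that method (the card itself only gives the equation); `fit_line` is an illustrative helper name, and the data points are made up so that they lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept c for y = m*x + c."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x, and sum of x-y cross deviations.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = s_xy / s_xx
    c = y_bar - m * x_bar
    return m, c

# Hypothetical points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

m, c = fit_line(xs, ys)
print(m, c)
```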
what is r-squared?
The coefficient of determination (R²) is a number between 0 and 1 that measures how well a statistical model predicts an outcome. You can interpret the R² as the proportion of variation in the dependent variable that is predicted by the statistical model.
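One common way to compute R² is 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that formula. The observed values are hypothetical, and the predictions come from an assumed model y = 2x evaluated at x = 1 to 5.

```python
def r_squared(observed, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    y_bar = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in observed)
    return 1 - ss_res / ss_tot

# Hypothetical observations and the predictions of an assumed model y = 2x.
observed = [2.0, 4.1, 5.9, 8.2, 9.8]
predicted = [2.0, 4.0, 6.0, 8.0, 10.0]

r2 = r_squared(observed, predicted)
print(r2)
```

A value close to 1 means the model accounts for nearly all of the variation in the dependent variable.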
what is the 95% confidence interval?
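A 95% confidence interval (CI) of the mean is a range with an upper and lower limit calculated from a sample. Because the true population mean is unknown, this range describes plausible values for it: if the sampling were repeated many times, about 95% of the intervals calculated this way would contain the true population mean.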
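A 95% confidence interval of the mean can be sketched as mean ± z × standard error. This example assumes the normal approximation (z ≈ 1.96); for small samples a t-distribution would give a slightly wider interval. The heart-rate values and the function name are hypothetical.

```python
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    standard_error = statistics.stdev(sample) / n ** 0.5
    # z is about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return sample_mean - z * standard_error, sample_mean + z * standard_error

# Hypothetical resting heart rates in beats per minute (illustrative values).
heart_rates = [72, 75, 68, 80, 77, 74, 70, 79, 73, 76]

lower, upper = mean_confidence_interval(heart_rates)
print(lower, upper)
```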
what is the best estimate of population average?
the best estimate (point estimate) of the population average is the SAMPLE AVERAGE
population vs sample distribution
The moments of a sample distribution (such as its mean and variance) are referred to as statistics of the sample. The moments of a population distribution are referred to as parameters of the population. If samples are drawn from the population with replacement, then any number of samples of a given size, N, can be drawn.
population vs sample average
In statistics, there are two different averages: the sample mean and the population mean. The sample mean considers only a selected number of observations drawn from the population data. The population mean, on the other hand, considers all the observations in the population to compute the average value.
what is a multivariate linear model?
A linear model with one explanatory variable is called simple linear regression; with more than one explanatory variable, it is called multiple linear regression. Both are distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.