Definitions Flashcards
What happens in inductive reasoning?
From a specific premise to general conclusions
What happens in deductive reasoning?
From a general premise to specific conclusions
What is the definition of a hypothesis?
Statement that is tested by investigation (preferably experimental), in contrast to a model or theory
What is a research hypothesis?
A hypothesis derived from questions, models and theories
What is a statistical hypothesis?
Hypotheses that come from statistics and represent tests of the predictions of the research hypothesis
What is a sample study?
Estimate the value of a parameter for a population
What is an observational study?
Explain how two population parameters relate to each other without interfering or affecting the individuals
What is an experiment?
Intervention to explore causality
What is a sample (or empirical) distribution?
The pattern that the data makes
What is a population distribution?
The pattern that the whole group of interest makes
What is uncertainty?
The region in which the parameter could fall
What is the sampling distribution?
The distribution of a statistic (such as the mean) across repeated samples from the same population
What is kurtosis?
Sharpness of the peak (and heaviness of the tails) of a frequency-distribution curve
What is a type 2 error?
Not rejecting a false H0
What is a type 1 error?
Incorrectly rejecting a true H0
What is a p-value?
The probability of getting a sample as extreme as, or more extreme than, ours, given that the null hypothesis is true
What is accuracy?
How close a measurement is to the true value intended to be measured
What is precision?
How repeatable a measure is, irrespective of how close it is to the actual value
What is bias?
Systematic lack of accuracy
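A minimal sketch of how accuracy, precision and bias differ, using made-up readings from two hypothetical instruments and an assumed known true value. Bias is estimated as the mean reading minus the true value; precision is summarised by the standard deviation of the readings.

```python
import statistics

def summarise(measurements, true_value):
    """Return (bias, spread): mean error and sample standard deviation."""
    bias = statistics.mean(measurements) - true_value
    spread = statistics.stdev(measurements)
    return bias, spread

true_value = 10.0                                # assumed known for illustration
instrument_a = [10.1, 9.9, 10.0, 10.2, 9.8]      # hypothetical: accurate, less precise
instrument_b = [11.0, 11.1, 10.9, 11.0, 11.0]    # hypothetical: precise but biased

bias_a, spread_a = summarise(instrument_a, true_value)
bias_b, spread_b = summarise(instrument_b, true_value)

print(f"A: bias={bias_a:.2f}, spread={spread_a:.3f}")
print(f"B: bias={bias_b:.2f}, spread={spread_b:.3f}")
```

Instrument B is the more repeatable of the two, yet every reading sits about one unit above the true value, which is what a systematic lack of accuracy looks like.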
What is an experimental unit?
The physical entity which can be assigned to a treatment
What is a treatment group?
A group of experimental units that all receive the same treatment
What is an experimental factor?
A set of treatments and controls
What is replication?
The process of assigning several experimental units to the same treatment/intervention
What is independence and pseudoreplication?
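Independence: the value of a measurement from one unit is not affected by the values of other units
Pseudoreplication: treating non-independent measurements as if they were independent replicates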
What is the standard error?
The precision of the sample mean as an estimate of the population mean
What does standard deviation show?
Spread of data
What is a priori?
Knowledge considered to be true without investigation (like π≈3.14)
What is the F-distribution?
The distribution of a ratio of two variances, used when partitioning variability (as in ANOVA)
What is homoscedasticity?
Assumption of equal variances in different groups being compared
What is a residual term?
The difference between the observed Y value and the fitted Y value for the same X
What are influential cases?
Extreme values that might influence the regression results when included or excluded from the analysis
What is a binary response?
A response or trait that takes on one of two possibilities
What are examples of sampling units?
Cohorts of patients
Clusters of related genes
Regions in tissues or genes
What are examples of experimental units?
Individual organisms
Tissue culture plates
What is a control group?
Group used for comparison - not always needed
What is a feature of an experimental group?
Vary by only one variable
Randomly sampled
At least three replicates
What is an independent variable?
The one we change between groups
What is a dependent variable?
The one that changes in the experimental group as a result
(i.e. depends on another thing)
What is a controlled variable?
The one we keep constant
What is an experimental variable?
Actual property measured by the individual observation
What is a random variable?
Measured property whose results are not known before a sample is taken
What is falsificationism?
Hypotheses are there to be disproved because proof is logically impossible
What are the steps of a falsificationist test?
Observation of a pattern or a deviation from a pattern
Explanation of an observed pattern is a model or a theory
Predictions deduced from the model or theory
Experimental tests
Practicalities to think about when asking questions in biology
Ethics
Has the material been prepared in a way that doesn’t affect the outcome?
Can we study the material of interest under lab conditions?
Suitable experimental system
Justifiable assumptions
What does hypothesis formulation depend on?
Systematic observation
What are the advantages of experimentation?
Reliable evidence to infer causality
Distinguish between hypotheses
Independently assess the effect of external factors on variables
What are the disadvantages of experimentation?
Experiments involve artificial manipulations that can be amplified by the laboratory environment
The larger the scale of space or time a system occupies, the harder it is to experiment on
Experiments are directly challenged by their natural variability
Drawing general conclusions from experiments is not always possible
What should a set of units represent?
A sample of a clearly defined population - all the possible observations we are interested in
What are the types of numerical data?
Discrete (counts) and continuous (measurements)
What are the different types of categorical data?
Nominal (categories with no ranked order)
Ordinal (ranked/ordered categories)
What is a sample space?
All potential values for a variable
What is external validity?
Can we generalise (e.g. from mice to humans)?
What is internal validity?
Does the sample accurately reflect what is going on in the group we are studying?
What is the difference between sample and population distribution?
Sample = the pattern the data makes
Population = the pattern the whole group of interest makes
What does the area underneath a normal distribution curve represent?
The proportion of the population whose values fall within that range
What is the central limit theorem?
If you take sufficiently large random samples from a population with a given mean and standard deviation, the distribution of the sample means will be approximately normal
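The central limit theorem can be checked by simulation. The sketch below is illustrative only: the skewed population, the sample size of 50 and the number of repeats are all arbitrary assumptions, and only the Python standard library is used.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated random samples (with replacement) and return each mean."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical, strongly skewed population (exponential values, mean near 1).
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]

means = sample_means(population, sample_size=50, n_samples=2_000)

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

# The sample means centre on the population mean, and their spread is
# close to pop_sd / sqrt(sample_size), even though the population is skewed.
print("population mean:", round(pop_mean, 3))
print("mean of sample means:", round(statistics.mean(means), 3))
print("pop_sd / sqrt(50):", round(pop_sd / 50 ** 0.5, 3))
print("sd of sample means:", round(statistics.stdev(means), 3))
```

Plotting a histogram of `means` would show a roughly bell-shaped curve, in contrast to the skewed shape of `population`.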
What is standard error of the mean?
How much uncertainty there is in the estimate of the population mean
What are confidence intervals calculated using?
The standard error of the mean
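As a rough illustration of how the standard error of the mean feeds into a confidence interval, here is a small sketch with made-up data. The critical value 2.262 is the two-sided 95% t value for 9 degrees of freedom, taken from standard tables; the function names are hypothetical.

```python
import math
import statistics

def mean_and_sem(values):
    """Return the sample mean and the standard error of the mean."""
    n = len(values)
    sd = statistics.stdev(values)          # sample standard deviation (n - 1)
    return statistics.mean(values), sd / math.sqrt(n)

def confidence_interval(values, critical_value):
    """Return (mean - critical_value * SEM, mean + critical_value * SEM)."""
    mean, sem = mean_and_sem(values)
    return mean - critical_value * sem, mean + critical_value * sem

# Hypothetical measurements (n = 10, so 9 degrees of freedom).
data = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]

mean, sem = mean_and_sem(data)
low, high = confidence_interval(data, critical_value=2.262)
print(f"mean={mean:.3f}, SEM={sem:.4f}, 95% CI=({low:.3f}, {high:.3f})")
```

A larger sample shrinks the SEM (it scales with one over the square root of n), so the interval narrows even when the spread of the data stays the same.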
What are the main functions of graphs?
Exploration
Analysis
Presentation and communication of results
Aid statistical analysis
What do dot plots show?
Skewness and large or small values
Why are box plots useful?
Resistant to extreme values and show distribution
What can scatter plots identify?
Normality and linearity
What are the graph integrity principles?
Numbers depicted on the graph should be proportional to the numbers reported by the data
Clear and complete labelling
Show data variations not design variations
Number of variable dimensions must not exceed the number of dimensions in the data
Graphs must not quote data out of context
When do you use a t statistic?
When the population standard deviation is unknown and the data can be assumed to be normally distributed
What are the steps of the hypothesis test?
Formulate hypothesis
Calculate test statistic
Consider decision rule
State decision rule
Conclude and report
What is a decision rule?
t statistic or p-value
Something that tells us when to reject the null hypothesis
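The steps of a hypothesis test and the role of the decision rule can be sketched with a one-sample t test. Everything here is an assumption for illustration: the data are made up, the null value is 5.0, and the critical value 2.262 (two-sided, alpha = 0.05, 9 degrees of freedom) comes from standard t tables.

```python
import math
import statistics

def one_sample_t(values, mu0):
    """Return (t, df) where t = (mean - mu0) / (s / sqrt(n)) and df = n - 1."""
    n = len(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / sem, n - 1

# 1. Formulate hypotheses: H0: mu = 5.0 versus H1: mu != 5.0
data = [5.4, 5.1, 5.6, 5.3, 5.2, 5.5, 5.0, 5.4, 5.3, 5.2]   # hypothetical
mu0 = 5.0

# 2. Calculate the test statistic
t_stat, df = one_sample_t(data, mu0)

# 3. Decision rule: reject H0 if |t| exceeds the critical value for df = 9
critical = 2.262
reject_h0 = abs(t_stat) > critical

# 4. Conclude and report
print(f"t({df}) = {t_stat:.3f}, critical = {critical}, reject H0: {reject_h0}")
```

The same rule can be phrased with a p-value instead: reject H0 when the p-value is below the chosen significance level.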
What is the difference between a database and a spreadsheet?
Database: data that are meaningfully structured and stored, always electronic
Spreadsheet: an interactive computer application; a collection of variables of interest aimed at answering a scientific question, laid out as a table in which all observations of a variable are of the same type
What are the sources of bias?
Non-random sampling
Conditioning of biological material
Interference by the process of investigation
Investigator bias
When do you have to use nonparametric tests?
When the sample size is small
When you don’t know the distribution
When you can’t assume the data are normally distributed
Why do we use parametric tests where possible?
More informative and powerful than non-parametric tests
What does a sign test do?
Decides whether the data are equally likely to be on either side of a reference value
What are the assumptions of a sign test?
Ordinal or continuous data
A set of independent measurements
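A minimal sketch of an exact two-sided sign test, assuming that values equal to the reference are dropped (a common convention, not stated in the cards). Under the null hypothesis each observation is equally likely to fall on either side, so the counts follow a binomial distribution with p = 0.5. The data are made up.

```python
from math import comb

def sign_test(values, reference):
    """Return (n_above, n_below, two-sided exact p-value)."""
    above = sum(v > reference for v in values)
    below = sum(v < reference for v in values)
    n = above + below                    # ties with the reference are ignored
    k = min(above, below)
    # Probability of a split at least this uneven when each side has p = 0.5
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return above, below, min(1.0, 2 * tail)

data = [12, 15, 11, 18, 14, 16, 13, 17, 19, 9]   # hypothetical measurements
above, below, p_value = sign_test(data, reference=10)
print(f"above={above}, below={below}, p={p_value:.4f}")
```

Only the direction of each difference is used, not its size, which is why the test needs so few assumptions and is also less powerful than a t test.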
What is the Wilcoxon signed rank test?
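A non-parametric test for paired or one-sample data that uses the signed ranks of the differences, where a rank is the position of an observation amongst a set of observations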
What does the Mann-Whitney-Wilcoxon test do?
Decide whether population distributions are identical without assuming normality
What does the random test measure?
Whether the observed data are different from a random distribution generated by reordering the observed data
What are the assumptions of the random test?
Independence of observations
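The idea of comparing the observed data with reorderings of itself (often called a randomisation or permutation test) can be sketched as follows. The choice of test statistic (absolute difference in group means), the number of shuffles and the data are all assumptions for illustration.

```python
import random
import statistics

def randomisation_test(group_a, group_b, n_shuffles=5_000, seed=0):
    """Approximate two-sided p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)              # reorder the observed data
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed - 1e-12:     # small tolerance for float rounding
            as_extreme += 1
    # Add 1 to both counts so the observed ordering itself is included
    return (as_extreme + 1) / (n_shuffles + 1)

control = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8]   # hypothetical
treated = [5.0, 5.2, 4.9, 5.3, 5.1, 4.8]   # hypothetical
p_value = randomisation_test(control, treated)
print(f"approximate p-value: {p_value:.4f}")
```

Shuffling group labels is only meaningful if the observations are independent, which is the assumption listed above.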
What are the criticisms of the Bonferroni correction?
General null hypotheses are rarely of interest
Counterintuitive as interpretation depends on the number of tests
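For reference, the Bonferroni correction multiplies each p-value by the number of tests (equivalently, divides the significance level by it). The sketch below uses made-up p-values to show why the interpretation depends on how many tests are run: the same raw p-value of 0.012 is significant in a family of four tests but not in a family of five.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-values, reject decisions) for a family of tests."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return adjusted, [p_adj <= alpha for p_adj in adjusted]

four_tests = [0.001, 0.012, 0.030, 0.200]        # hypothetical p-values
five_tests = four_tests + [0.500]                # same results plus one more test

adjusted4, rejected4 = bonferroni(four_tests)
adjusted5, rejected5 = bonferroni(five_tests)

print("4 tests:", rejected4)
print("5 tests:", rejected5)
```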
What is the purpose of ANOVA?
Comparing means across groups or factors
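A minimal sketch of the calculation behind a one-way ANOVA, assuming a single factor and made-up data. Total variability is partitioned into a between-group part and a within-group part, and the F statistic is the ratio of their mean squares. The p-value (from the F-distribution) is left out because it needs a statistics library.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(groups), len(all_values)
    # Variability of group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of observations around their own group mean
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]      # hypothetical treatment groups
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the within-group scatter would suggest.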
What does correlation measure?
Degree of a relationship between two variables
What is the parametric correlation test?
Pearson correlation coefficient
What are the assumptions of the Pearson correlation coefficient?
Variables must be normally distributed
Relationship between them is assumed to be linear
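The Pearson coefficient can be computed directly from its definition, the covariance of the two variables divided by the product of their spreads. A short sketch with made-up data and a hypothetical function name:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]          # hypothetical data
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.4f}")
```

The coefficient only captures linear association, so a scatter plot is worth checking before interpreting it.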
What are the non-parametric correlation measures?
Spearman’s rank correlation coefficient
Kendall’s rank correlation coefficient (tau)
Why would you use Kendall’s rank correlation over Spearman’s?
Smaller values
Less sensitive to potential errors
P-values are more accurate with smaller sample sizes
Better statistical properties
Interpretation is very direct
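Both rank-based measures can be sketched in a few lines, under the loud assumption that there are no tied values (ties need the corrected versions found in statistics libraries). Kendall's tau has a direct reading: the proportion of concordant pairs minus the proportion of discordant pairs. The data are made up.

```python
def ranks(values):
    """Rank from 1 (smallest). Assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman's rho via the no-ties formula 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant) / number of pairs, no ties."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += 1 if s > 0 else -1 if s < 0 else 0
    return score / (n * (n - 1) / 2)

x = [1, 2, 3, 4, 5]          # hypothetical data
y = [3, 1, 2, 5, 4]
rho, tau = spearman_rho(x, y), kendall_tau(x, y)
print(f"Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
```

On this example tau comes out smaller than rho, in line with the first point above.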
What does simple linear regression investigate?
A linear association between an outcome and an independent variable
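A minimal least-squares sketch of simple linear regression with made-up data. It fits an intercept and slope, then computes the fitted values and residuals (observed Y minus fitted Y for the same X). Function and variable names are hypothetical.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

x = [1, 2, 3, 4, 5]                      # independent variable (hypothetical)
y = [2.1, 3.9, 6.2, 7.8, 10.0]           # outcome (hypothetical)

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * a for a in x]
residuals = [obs - fit for obs, fit in zip(y, fitted)]

print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
print("residuals:", [round(e, 2) for e in residuals])
```

With an intercept in the model the residuals always sum to zero (up to rounding), so it is their pattern against the fitted values that carries diagnostic information.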
What are the types of regression diagnosis plots?
Residuals vs fitted
Normal Q-Q
Scale-location
Residuals vs leverage
What is Cook’s distance used to find?
Influential outliers
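A sketch of Cook's distance for simple linear regression only, using made-up data in which the last point has an extreme X value and sits off the trend. It uses the standard closed form D_i = e_i^2 / (p * MSE) * h_i / (1 - h_i)^2, with leverage h_i = 1/n + (x_i - mean_x)^2 / Sxx and p = 2 fitted parameters. Any cut-off for calling a point influential is a convention and is not set here.

```python
def cooks_distances(x, y):
    """Cook's distance for each point of a simple linear regression."""
    n = len(x)
    p = 2                                    # parameters: intercept and slope
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    mse = sum(e ** 2 for e in resid) / (n - p)
    leverage = [1 / n + (a - mean_x) ** 2 / sxx for a in x]
    return [
        (e ** 2 / (p * mse)) * h / (1 - h) ** 2
        for e, h in zip(resid, leverage)
    ]

x = [1, 2, 3, 4, 5, 6, 7, 15]                    # hypothetical; last X is extreme
y = [1.1, 2.0, 2.9, 4.2, 5.1, 5.9, 7.0, 6.0]     # last Y is off the trend
distances = cooks_distances(x, y)
most_influential = distances.index(max(distances))
print("most influential point index:", most_influential)
```

The distance combines the size of the residual with the leverage of the point, which is why it flags cases that the residuals vs leverage plot also highlights.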