QBIO2001 Flashcards
Small scale data
What is synthetic biology?
Synthetic biology: the use of molecular biology tools and techniques to forward-engineer cellular behavior
What is the design process in synthetic biology?
- Design objectives and specifications:
a. Inputs and outputs
b. System performance
- Design according to spec:
a. Conceptual design
b. Detailed design
- System models composed from parts
a. Data may come from standardized database of biological parts
- In silico verification
a. Analyse models
b. Simulate/predict behavior
- Implementation
a. DNA assembly
b. DNA synthesis
- Evolution
- Testing and characterization of the system
What are 2 different types of design?
- Parts design
- Model-based design
What can synthetic biology be used for?
- Autoregulatory circuits
- Toggle switch
- Edge detection circuits
- Recombinase-based logic
What are liquid chromatography and mass spectrometry used for?
Used for protein and metabolite analysis
Describe the scientific method.
- Initial observation
a. Basic observations of the dataset
b. Observation allows people to generate a theory or consult a theory
- Consult/Generate theory
- Generate hypothesis
a. Needs to be a statement that is testable by experiment
b. Needs to be falsifiable by experiment
c. Needs to be very clear
d. This is the stage where variables are identified
i. Includes outcome variable
ii. Includes independent variable
- Collect data to test hypothesis
a. Measure the variables
b. From a t-test, can see whether the data support the hypothesis or not
c. Tests normally examine the null hypothesis
d. A low p-value on the t-test (conventionally p<0.05) means the null hypothesis is rejected in favour of the alternative hypothesis
e. Have to watch out for confounding (potentially unexpected) variables
- Analyse data
a. Graph data
b. Fit a model
c. If failed, another hypothesis can be generated
What is an experimental unit?
Object of replication that can be assigned to a treatment
What are examples of an experimental unit?
- An individual human could be 1 experimental unit, or hundreds of mice could together be 1 experimental unit
- For animal experiments, cage can be experimental unit (mice in the cage are put in the same conditions)
- Experimental unit could be regions of skin on one animal
- Individual cell in dish could be an experimental unit, as each cell has its own variability
What is the between-subjects treatment group method?
- An experimental unit is chosen
- This experimental unit is put in one of three groups:
a. Experimental (treatment) group
i. Unknown change in dependent (outcome) variable
b. Negative control (untreated) group
i. No change in dependent variable expected
c. Positive control group
i. Determines test validity
ii. A known change in dependent (outcome) variable expected
iii. Exposed to a treatment that we know affects the dependent variables
What are flaws in the between-subjects treatment groups method?
o Two major sources of variance in a between-subjects design:
Variability of subjects/ experimental units (unsystematic variance)
Systematic variance -> treatment
o Need thousands of subjects to balance out variability
What is the within subjects (paired/dependent) treatment groups method?
- Experimental unit is chosen
- That experimental unit goes through an initial test to determine baseline
- The same individual goes through the experimental treatment, and then a test
- The same individual also goes through the negative control (no treatment) after a period of time (to ensure the experimental treatment has worn off), and then a test
What are flaws of the within subjects (paired/dependent) treatment groups method?
o Patient is not exactly the same as they were in initial test, as going through an experimental or negative control test may change their attitude towards the testing experience as a whole and, if the testing relies heavily on patient’s mindset, this might confound the results
Hence, the crossover design is used, where some experimental units go through experimental treatment first, then negative and vice-versa
o Time brings change, and since these experiments are done over time, all sorts of things can change over time
o People can drop out of the experiment, leaving you with only part of their data.
What are advantages of the within subjects (paired/dependent) treatment groups method?
Far fewer experimental units are needed because the unsystematic variance is lower
What are two sources of variance?
- Unsystematic
- Systematic
What is unsystematic variance?
Due to differences between experimental conditions (e.g. time of day, temperature, etc…) OR experimental units (e.g. genetics, sickness, etc…)
What is systematic variance?
Due to the experimenter performing a treatment on all experimental units in one group but not those of another group (untreated negative control)
What are experimental units representative of?
A small sample (representative sample) of the entire population
What does randomization do and how can this be done?
• Randomisation -
o Experimental units assigned randomly to treatment groups to minimize unsystematic variation
Variance is distributed randomly (and so roughly equally) across groups, so it has minimal influence on results
This can be done by computer algorithms
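A minimal sketch of computer-based randomisation, assuming three treatment groups and twelve hypothetical mice as experimental units. The function name `randomize`, the unit labels and the seed are illustrative assumptions, not part of the course material.

```python
import random

def randomize(units, groups, seed=None):
    """Randomly assign experimental units to treatment groups in near-equal numbers."""
    rng = random.Random(seed)          # a fixed seed makes the allocation reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)              # random order removes any systematic ordering
    assignment = {g: [] for g in groups}
    for i, unit in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(unit)  # deal units out in turn
    return assignment

# Hypothetical experimental units and group names
mice = [f"mouse_{i}" for i in range(1, 13)]
allocation = randomize(mice, ["treatment", "negative_control", "positive_control"], seed=1)
for group, members in allocation.items():
    print(group, members)
```

Shuffling first and then dealing units out in turn keeps group sizes equal while leaving group membership to chance.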
What does blinding do?
o Eliminates bias that may increase variation
Eliminates psychological influences on outcome variable
What is blinding?
When the patient doesn’t know which group they’ve been allocated to
What is double blinding?
When neither the patient nor the researcher knows which group the patient has been allocated to: makes sure the researcher doesn’t give anything away or influence data collection
What is blocking and what does it do?
o Similar number of experimental units assigned to each treatment group in a block to minimize effects of unavoidable variance
o Eliminates effect of known variance
o Make sure there are equal proportions of treatment/untreated groups across confounding variables
o Blocking can also be used to spread measurement of the outcome variable over time
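A rough sketch of blocked randomisation, assuming the known source of variance is the cage an animal lives in. The cage names, animal labels and the function `block_randomize` are hypothetical; the point is only that units are randomised within each block so every block contributes equally to each treatment group.

```python
import random

def block_randomize(units_by_block, groups, seed=None):
    """Randomise units to groups separately inside each block."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in units_by_block.items():
        shuffled = list(units)
        rng.shuffle(shuffled)                      # randomise within the block only
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block, groups[i % len(groups)])
    return assignment

# Hypothetical blocks: two cages of four animals each
cages = {
    "cage_A": ["a1", "a2", "a3", "a4"],
    "cage_B": ["b1", "b2", "b3", "b4"],
}
plan = block_randomize(cages, ["treated", "untreated"], seed=7)
for unit, (block, group) in sorted(plan.items()):
    print(unit, block, group)
```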
What are power calculations?
Estimate the sample size required to detect an effect of a given size with a given degree of confidence
What is the preliminary data needed for power calculations?
o Effect size- change in outcome variable you want to see
o Standard deviation of the outcome variable
o Significance level required (p<0.05)
o Type of statistical test
o Desired power (probability of detecting true effect)
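The inputs listed above can be combined in a short calculation. This is a sketch only, assuming a two-sided comparison of two independent group means and using the normal approximation (so it slightly underestimates what a t-test-based calculation would give). The function name and the example numbers are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, sd, alpha=0.05, power=0.8):
    """Approximate units needed per group to detect `effect_size` (same units as `sd`)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for a two-sided test
    z_power = NormalDist().inv_cdf(power)           # quantile matching the desired power
    n = 2 * ((z_alpha + z_power) * sd / effect_size) ** 2
    return ceil(n)                                  # round up to a whole experimental unit

# Hypothetical example: detect a change of 5 when the outcome has SD 10
print(sample_size_per_group(effect_size=5, sd=10))
```

Halving the effect size you want to detect roughly quadruples the required sample size, which is why the effect size and standard deviation have to be estimated before the experiment.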
Why are sample sizes constrained?
• Sample size constraints:
o Need to put request through ethics committees
o Funding
Why is graphing data useful?
• Graphing data is extremely useful in informing the researcher of the shape and distribution of their data
What is a big assumption in most tests?
• A big assumption in most tests is that values are normally distributed
What does kurtosis mean?
The pointiness of the distribution (how sharply peaked it is and how heavy its tails are)
What is the assumption of independent (between-subjects) tests?
• Independent (between subjects) tests:
o The assumption that distributions within treatment groups are normally distributed
What is the assumption of dependent (within subjects, paired) tests?
o The assumption that distributions of differences between treatment groups are normally distributed
What is the Shapiro-Wilk test in R?
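o shapiro.test: tests whether a sample is consistent with a normal distribution (p<0.05 suggests the data are not normally distributed)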
How can data be normalised?
o Every value in comparison needs to be treated equally and transformed
o log(): compresses high values, bringing them closer to the rest of the data
o sqrt(): square-root transformation. If some values are negative, a constant might have to be added first so that all values become positive
o Reciprocal (1/x) -> very large values become very small
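A small sketch of the three transformations, written in Python for illustration (the cards themselves refer to R). The helper names and the choice to shift the smallest value to 1 are assumptions; the key point from the cards is that the same transformation is applied to every value in the comparison.

```python
import math

def shift_to_positive(values, margin=1.0):
    """Add a constant so the smallest value equals `margin` when any value is <= 0."""
    lowest = min(values)
    shift = margin - lowest if lowest <= 0 else 0.0
    return [v + shift for v in values]

def transform(values, kind):
    """Apply one transformation to every value (hypothetical helper)."""
    positive = shift_to_positive(values)
    if kind == "log":
        return [math.log(v) for v in positive]
    if kind == "sqrt":
        return [math.sqrt(v) for v in positive]
    if kind == "reciprocal":
        return [1.0 / v for v in positive]
    raise ValueError(f"unknown transformation: {kind}")

print(transform([-3, 0, 5], "sqrt"))          # negative values are shifted first
print(transform([1, 10, 100], "reciprocal"))  # large values become small
```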
What is an outlier?
Any value more than 1.5x the interquartile range below the first quartile or above the third quartile
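A minimal sketch of the 1.5x interquartile range rule using Python's standard library. The data are made up, and quartile conventions differ between tools, so the exact cut-offs may differ slightly from those R reports.

```python
from statistics import quantiles

def find_outliers(values):
    """Return values more than 1.5 x IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4)     # quartiles (default 'exclusive' method)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical measurements with one suspicious value
data = [10, 12, 11, 13, 12, 11, 10, 14, 13, 45]
print(find_outliers(data))
```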
What can you do to outliers?
o Remove outliers
• mvoutlier (R package)
o Transform data
What is the null hypothesis?
Difference between means of treatment groups is zero
o Where H0 is the null hypothesis, which assumes that the difference between the observed value (data) and expected value (EV) is due to chance alone. p=p0 for example p=0.5
What is the alternative hypothesis?
• Alternative hypothesis- Difference between means of treatment groups is either not zero (two-tailed), or less/greater than zero (one-tailed). Can be given directionality
o Where H1 is the alternative hypothesis, which assumes that the difference between the observed value (data) and expected value (EV) is NOT due to chance alone: p < p0 (lower-sided test), p > p0 (upper-sided test) or p is not equal to p0 (two-sided)
What is the assumption of homogeneity of variance?
All comparison groups have the same variance
How can the assumption of homogeneity of variance be tested for?
Levene’s test (leveneTest): p<0.05 indicates that the variance differs between treatment groups
What test does not assume homogeneity of variance?
• Welch’s t-test (t.test(paired=FALSE)) does not assume homogeneity of variance
o Need data to be normally distributed
When comparing 2 means between subjects, what test can you use if the data are still not normally distributed after transformation?
o Wilcox’s robust tests, including yuen: compares trimmed means, calculated after 20% of scores have been removed from each extreme of the distribution
Can specify the level of trimming:
• Trimming makes the distribution closer to normal
• If you trim too much, the data would lose their effect and a lot of experimental units would be needed
• The data should be transformed first
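The trimming step can be sketched as follows. This shows only the trimmed mean that the robust test is built on, not the full test, and the function name and data are illustrative assumptions.

```python
def trimmed_mean(values, trim=0.2):
    """Mean after removing a fraction `trim` of scores from each extreme."""
    if not 0 <= trim < 0.5:
        raise ValueError("trim must be between 0 and 0.5")
    ordered = sorted(values)
    k = int(len(ordered) * trim)               # number of scores dropped at each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical scores with one extreme value
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(sum(scores) / len(scores))   # ordinary mean, pulled up by the extreme score
print(trimmed_mean(scores))        # 20% trimmed mean
```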
Which comparison of 2 means has the smaller variance: between subjects or within subjects?
Within subjects
Why do we adjust value for plotting purposes?
Easier to see data trend
What test should we do if 2 samples have unequal variance?
Welch 2-sample T-test
R command: t.test(x,y,mu=0,var.equal=F)
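To show what the Welch test computes, here is a sketch of the test statistic and its degrees of freedom (Welch-Satterthwaite approximation) using only Python's standard library. It does not compute a p-value, and the sample data are invented.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic and approximate degrees of freedom for two independent samples."""
    vx = variance(x) / len(x)      # squared standard error of each mean
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Hypothetical samples with clearly unequal variance
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
t, df = welch_t(x, y)
print(round(t, 3), round(df, 2))
```

Because the two variances are not pooled, the degrees of freedom come out lower than the pooled value of 8 for these samples.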
What test should we do if 2 samples suggest non-normality?
Transformations or non parametric tests
Mann-Whitney-Wilcoxon test: wilcox.test
What test should we do if 2 samples are not independent?
Paired T-Test
-Sometimes it is desirable to analyse dependent data. We often design an experiment to take advantage of this dependency in order to control variation between experimental groups
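A sketch of the paired t statistic, which works on the difference within each experimental unit so that variation between units drops out. The before/after values are hypothetical, and no p-value is computed.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic computed from within-unit differences."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    se = stdev(diffs) / sqrt(len(diffs))       # standard error of the mean difference
    return mean(diffs) / se, len(diffs) - 1    # statistic and degrees of freedom

# Hypothetical measurements on the same four units
before = [10, 12, 9, 11]
after = [12, 14, 10, 14]
t_paired, df_paired = paired_t(before, after)
print(round(t_paired, 3), df_paired)
```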
What is ANOVA?
o Analysis of variance (ANOVA) is used to compare several means between subjects
How do you see the ANOVA output?
o One-way independent ANOVA: run aov(outcome~predictor), then call summary on the result to see the output
What is assumed in ANOVA?
o Both homogeneity of variance and normality are assumed
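To illustrate what a one-way independent ANOVA compares, the sketch below computes the F ratio (variance between group means divided by variance within groups) by hand. The three groups are invented, and no p-value is computed.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F ratio and degrees of freedom for a one-way independent ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Systematic variation: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Unsystematic variation: spread of values around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical outcome values for three treatment groups
f_ratio, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_ratio, 2), df_b, df_w)
```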
What is standard deviation?
A measure of the spread of values within one particular sample (the square root of the variance)
What is standard error?
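• Standard error- standard deviation of the sample means across many samples (estimated from a single sample as the standard deviation divided by the square root of the sample size)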
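A short sketch of the difference between the two quantities, using made-up values. The standard error is estimated here from a single sample as SD divided by the square root of n.

```python
from math import sqrt
from statistics import stdev

def standard_error(values):
    """Estimated standard error of the mean from one sample."""
    return stdev(values) / sqrt(len(values))

# Hypothetical sample
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(stdev(sample), 3))           # spread of individual values
print(round(standard_error(sample), 3))  # uncertainty in the sample mean
```

The standard deviation stays roughly the same as more units are measured, while the standard error shrinks.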
What is relative quantitation?
compare abundance across multiple samples (typically expressed as a ratio) without determining number of molecules (e.g. copies, pmol/L, etc)
o Most commonly used because it’s easier
What is absolute quantitation?
Determine the number of molecules in each sample (e.g. copies, pmol/L, etc)
o Gives more information
o Don’t need multiple samples, because the number of molecules is determined for each sample on its own (although in practice results are generally still compared across multiple samples)
o Can use a standard of known concentration, then use the measured abundance of the standard to calculate the exact number of molecules of an unknown analyte
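The idea of using a known standard can be sketched as a simple ratio. This assumes the standard and the unknown analyte give the same signal per molecule, which is an assumption made here for illustration; the function name, signal values and units are hypothetical.

```python
def absolute_amount(signal_unknown, signal_standard, amount_standard):
    """Estimate the amount of an analyte from its signal relative to a known standard."""
    if signal_standard <= 0:
        raise ValueError("standard signal must be positive")
    # Relative quantitation gives the ratio; the known amount turns it into an absolute value
    ratio = signal_unknown / signal_standard
    return amount_standard * ratio

# Hypothetical signals: the standard is known to be present at 10 pmol
print(absolute_amount(signal_unknown=3.0e6, signal_standard=1.5e6, amount_standard=10.0))
```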
How does mass spectrometry work?
- Needs to be in a vacuum
- Inlet - You place the substance you want to study in a vacuum chamber inside the machine
- The substance is bombarded with a beam of electrons so the atoms or molecules it contains are turned into ions. This process is called ionization and produces molecular ions.
- Ion source - The ions shoot out from the vacuum chamber into a powerful electric field (the region that develops between two metal plates charged to high voltages), which makes them accelerate. Ions of different atoms have different amounts of electric charge, and the more highly charged ones are accelerated most, so the ions separate out according to the amount of charge they have. (This stage is a bit like the way electrons are accelerated inside an old-style, cathode-ray television.)
- Mass Analysers - The ion beam shoots into a magnetic field (the invisible, magnetically active region between the poles of a magnet). When moving particles with an electric charge enter a magnetic field, they bend into an arc, with lighter particles (and more positively charged ones) bending more than heavier ones (and more negatively charged ones). The ions split into a spectrum, with each different type of ion bent a different amount according to its mass and its electrical charge.
- Detector - A computerized, electrical detector records a spectrum pattern showing how many ions arrive for each mass/charge.
- Computer - Instrument control and data acquisition
What is the purpose of mass spectrometry?
• Can determine the structure and quantity of molecules by measuring their mass-to-charge ratio and comparing the measured masses with those of the elements to work out which elements are in the molecule.
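A sketch of how a mass-to-charge ratio relates to a molecule's mass, assuming positive-mode ionisation in which each charge comes from one added proton (an assumption for illustration, not something the cards state). The neutral mass used is hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons

def mz(neutral_mass, charge):
    """m/z of a molecule carrying `charge` added protons (assumed ionisation model)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge

# Hypothetical peptide with a neutral mass of 1000 Da
for z in (1, 2, 3):
    print(z, round(mz(1000.0, z), 4))
```

The same molecule appears at a lower m/z as its charge increases, which is why the instrument reports mass/charge and not mass alone.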
Why is liquid chromatography useful?
o Use liquid chromatography to decrease complexity of sample before it’s fed into the mass spectrometer
How is liquid chromatography used?
o Separates proteins/peptides in time by introducing them more slowly to the instrument -> spreading the analytes out in time gives the mass spectrometer more time to identify and quantify the analytes in the sample
Feeds the sample to the mass spectrometer more slowly
The instrument has a finite speed at which it can operate
Where does liquid chromatography occur?
In the inlet of the mass spectrometer
Using liquid chromatography, in a 2 hour acquisition time, how many peaks would there be in a separation and what does that mean?
o In a 2 hour acquisition (normal acquisition time), would have about 40,000 peaks in the separation- 40,000 analytes separated
What is electrospray ionization?
Electrospray ionization (ESI) is a technique used in mass spectrometry to produce ions using an electrospray in which a high voltage is applied to a liquid to create an aerosol. It is especially useful in producing ions from macromolecules because it overcomes the propensity of these molecules to fragment when ionized.
Why is electrospray ionisation done?
Applies very high voltage to liquid coming out of liquid chromatograph
Generates a spray of droplets which leads to generation of gas phase ions in front of the source region of the mass spectrometer -> ions get attracted and introduced into the machine itself.
What is tandem mass spectrometry?
o Technique to break down selected ions (precursor ions) into fragments (product ions)
o Fragments reveal aspects of the chemical structure of the precursor ions
What shape should the peaks from mass spectrometry and liquid chromatography be?
Gaussian (normally distributed)
Analyse the peaks of mass spectrometers
• One large peak is made up of many smaller peaks because all molecules analysed are composed of different elements, each of which exists as several isotopes
o About 1% of all carbon is C13, so that means that it is a common contaminant
o Very important for biological molecules (such as peptides and proteins)
o The rate of contamination by these isotopes is quite high
• Peaks are mixtures of different isotopes of the elements in the molecule being analysed
o Contamination becomes bigger and bigger moving to the right of the monoisotopic peak: each successive peak contains more heavy isotopes
• These distributions are referred to as the isotopic envelope, and they represent the inclusion of isotopes within each particular analyte
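The shape of an isotopic envelope can be sketched with a binomial model. This is a simplification that considers carbon only and uses the roughly 1% C13 figure from the cards; real envelopes also include isotopes of hydrogen, nitrogen, oxygen and sulfur. The function name and carbon counts are illustrative.

```python
from math import comb

def carbon_isotope_envelope(n_carbons, p13=0.01, max_heavy=4):
    """Probability that a molecule contains 0, 1, 2, ... C13 atoms (carbon-only model)."""
    return [
        comb(n_carbons, k) * p13 ** k * (1 - p13) ** (n_carbons - k)
        for k in range(max_heavy + 1)
    ]

# Hypothetical molecules with 50 and 200 carbon atoms
for n in (50, 200):
    envelope = carbon_isotope_envelope(n)
    print(n, [round(p, 3) for p in envelope])
```

The first entry is the monoisotopic fraction (no C13 at all). It shrinks as the number of carbons grows, which is why the envelope matters so much for large molecules such as peptides and proteins.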
What is a monoisotopic peak?
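A pure peak: the peak in which every atom of the molecule is the most abundant isotope of its element (e.g. all carbon is C12)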