analysis of gene expression Flashcards
1
Q
focus on mRNA
A
- easier to purify and measure than proteins
- easier to make high throughput
- caveat: protein activity is also controlled by post-translational modifications (PTMs)
- so protein activity is not necessarily proportional to mRNA expression
2
Q
experimental considerations
A
- uniform growth conditions
- comparable setup without confounding variables
- homogeneous cell population
- heterogeneity produces misleading data
- control choice
- must be relevant
3
Q
microarrays
process of image to numbers
A
- scan
- correct background noise
- data transformation
- normalisation
- data analysis
4
Q
microarray types
A
- single channel:
- 2 samples analysed separately
- 2 sets of expression values
- 2 channel:
- hybridise labelled cDNA from 2 samples to same microarray
- extract values
- different colour tags
5
Q
process of microarrays
A
- isolate and reverse transcribe mRNA
- label cDNA with labelled nucleotides
- hybridise
- scan and analyse
- calculate intensity relative to normal background
- some chips have inherently higher intensity than others
6
Q
microarray data transformation
A
- difference between or ratio of 2 samples
- ratios for larger differences
- depends what you are comparing
- be wary of small numbers
- differences more likely due to chance
- to identify differential expression:
- plot of log ratios → focus on tails of gaussian distribution
- plot of log intensity → above or below diagonal line
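The log-ratio transformation on this card can be sketched in a few lines of Python. This is a minimal illustration, not a full analysis pipeline: the function name `log2_ratios` and the intensity values are made up for the example, and base 2 is an assumed (though common) choice of logarithm.

```python
import math

def log2_ratios(sample, control):
    """Return log2(sample / control) for paired, background-corrected intensities."""
    return [math.log2(s / c) for s, c in zip(sample, control)]

# Hypothetical intensities for four genes (illustrative values only)
sample = [200.0, 50.0, 100.0, 800.0]
control = [100.0, 100.0, 100.0, 100.0]

ratios = log2_ratios(sample, control)
# A 2-fold increase maps to +1 and a 2-fold decrease to -1,
# so up- and down-regulation are symmetric around 0.
print(ratios)
```

On the log scale, unchanged genes cluster around 0, and candidates for differential expression sit in the tails.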
7
Q
microarrays
data normalisation methods
A
- global intensity
- housekeeping genes
- exogenous/spiked control
- intensity dependent linear
- loess
8
Q
experimental variables affecting microarray intensity levels
A
- number of mRNA copies (want to be the only variable)
- hybridisation efficiency
- cross-hybridisation
- efficiency of reverse-transcription
- marker incorporation into cDNA
- scanning efficiency
- activity of fluorescent dyes
- equipment differences
9
Q
global intensity normalisation
A
- assumption:
- same total expression for sample and control
- method:
- calculate sum of intensities for all genes for sample and control
- use ratio of intensity sum to calculate normalisation factor, k
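A minimal sketch of the method above. The card does not say which way round the ratio goes, so this assumes k = sum(sample) / sum(control) and that sample intensities are divided by k. The function name and data are hypothetical.

```python
def global_normalise(sample, control):
    """Scale sample intensities so their total matches the control total.

    Assumption: k is defined as sum(sample) / sum(control), and each
    sample intensity is divided by k.
    """
    k = sum(sample) / sum(control)
    return k, [s / k for s in sample]

# Hypothetical chip where the sample is uniformly twice as bright as the control
sample = [20.0, 40.0, 60.0, 80.0]
control = [10.0, 20.0, 30.0, 40.0]

k, normalised = global_normalise(sample, control)
print(k, normalised)
```

After scaling, both data sets have the same total intensity, which is exactly the assumption the method relies on.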
10
Q
housekeeping gene normalisation
A
- assumption:
- housekeeping gene mRNA expression is constant
- tubulin, actin
- their expression does not depend on cell condition/status
- method:
- use ratio of means of hk gene intensities to calculate k
- k = m(sample)/m(control)
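The housekeeping calculation can be sketched as follows, assuming k is the mean housekeeping intensity in the sample divided by the mean in the control, and that sample values are then divided by k. The gene labels (`ACTB`, `TUBB`, `geneX`) and all intensities are placeholders for illustration.

```python
from statistics import mean

def housekeeping_factor(sample, control, hk_genes):
    """Return k = mean(sample hk intensities) / mean(control hk intensities)."""
    hk_sample = [sample[g] for g in hk_genes]
    hk_control = [control[g] for g in hk_genes]
    return mean(hk_sample) / mean(hk_control)

# Hypothetical intensities keyed by gene label
sample = {"ACTB": 300.0, "TUBB": 500.0, "geneX": 900.0}
control = {"ACTB": 200.0, "TUBB": 200.0, "geneX": 300.0}

k = housekeeping_factor(sample, control, ["ACTB", "TUBB"])
# Assumption: every sample intensity is divided by k
normalised = {gene: value / k for gene, value in sample.items()}
print(k, normalised["geneX"])
```

Only the housekeeping genes set the scale, so a genuine change in `geneX` survives normalisation.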
11
Q
spiked control normalisation
A
- add mRNA from foreign genes
- 5-10 B. subtilis genes with artificial polyA tails
- equal amounts of exogenous controls
- use average intensities to compute k
- normalise whole data sets
- can detect large changes in genome-wide mRNA levels
12
Q
intensity dependent linear normalisation
A
- predicts experimental expression from control expression with a fitted line; normalise by subtracting the intercept and dividing by the slope
- assumptions:
- constant mRNA levels
- all genes on the microarray have the same variance
13
Q
intensity dependent linear normalisation
method
A
- fit data to linear regression model
- yi = α + βxi + εi where:
- yi = background-subtracted intensity of gene in set 1 (experimental)
- xi = background-subtracted intensity of gene in set 2 (control)
- α = y-intercept of the line fitted to the microarray data
- β = normalisation factor
- normalise yi:
- y’i = (yi - α)/β
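The regression and the correction y'i = (yi - α)/β can be sketched with ordinary least squares. This is an illustration under stated assumptions: the helper names `fit_line` and `normalise` are invented here, the data are synthetic, and the fit is a plain unweighted one.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = alpha + beta * x; returns (alpha, beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta

def normalise(y, alpha, beta):
    """Apply y' = (y - alpha) / beta to every experimental intensity."""
    return [(yi - alpha) / beta for yi in y]

# Synthetic data: experimental = 50 + 2 * control (offset plus scale bias)
control = [100.0, 200.0, 300.0, 400.0]
experimental = [250.0, 450.0, 650.0, 850.0]

alpha, beta = fit_line(control, experimental)
corrected = normalise(experimental, alpha, beta)
print(alpha, beta, corrected)
```

With noise-free synthetic data the corrected values fall back onto the control values. Real data would leave the residuals εi, and those are where differential expression shows up.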
14
Q
intensity dependent linear normalisation
advantages vs disadvantages
A
- advantages:
- each gene contributes equally to normalisation factor
- prevents high expression bias
- can correct skew in data with y-intercept correction
- disadvantages:
- linear data only
- can you be sure of linearity?
- assumes constant mRNA levels
15
Q
M vs A plot
A
- determines whether non-linear normalisation is needed or not
- minus (M) vs add (A)
- M: difference in log intensity between the two channels
- A: sum of the log intensities of the two channels, halved (the average)
- plot against each other
- quick overview
- if no change in expression:
- points lie along the horizontal line M = 0 (y = 0)
- otherwise loess normalisation needed
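The M and A values can be computed as in the sketch below, assuming a two-channel array whose channels are called red and green here (a naming assumption) and using log base 2. The intensities are made up for illustration.

```python
import math

def ma_values(red, green):
    """Return (M, A) lists for paired channel intensities.

    M = log2(red) - log2(green)        (minus: the log ratio)
    A = (log2(red) + log2(green)) / 2  (add: the average log intensity)
    """
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [(math.log2(r) + math.log2(g)) / 2 for r, g in zip(red, green)]
    return m, a

# Hypothetical intensities for three spots
red = [8.0, 16.0, 4.0]
green = [8.0, 4.0, 16.0]

m, a = ma_values(red, green)
# The first spot has equal intensity in both channels, so its M value is 0
print(m, a)
```

Plotting M (y-axis) against A (x-axis) shows whether the cloud bends away from M = 0 as intensity changes. That bend is the intensity-dependent bias loess normalisation is meant to remove.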