L8 - Microarrays Flashcards
What are microarrays?
A substrate of some kind (e.g. a glass slide or quartz wafer) on which there are chemically bonded DNA probes
Works by hybridisation - Watson-Crick base pairing between probe & sample
• DNA/RNA in sample we want to bind is the TARGET
The target DNA/RNA is labelled with a fluorescent marker
Intensity of fluorescence of a feature relates to the amount of target DNA in the sample - this allows us to observe changes for specific genes
Microarray applications
Initial use was for gene expression: extract mRNA and convert it to cDNA
Running arrays under different conditions (e.g. drug vs control) allows differential gene expression to be reported as fold-changes
Allows us to see which genes are “switched on” by the cell in response to different conditions; these genes can then be investigated or characterised
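A minimal sketch of how a fold-change could be worked out for one gene. The intensity values and the function name `log2_fold_change` are made up for illustration and are not from the lecture.

```python
import math

def log2_fold_change(treated, control):
    """Return log2(mean treated intensity / mean control intensity)."""
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2(mean_treated / mean_control)

# Hypothetical raw intensities for one gene (illustrative numbers only)
drug = [800.0, 820.0, 780.0]
control = [200.0, 210.0, 190.0]

lfc = log2_fold_change(drug, control)
fold = 2 ** lfc
print(f"log2 fold-change = {lfc:.2f}, fold-change = {fold:.1f}x")
# prints: log2 fold-change = 2.00, fold-change = 4.0x
```

A positive log2 fold-change means the gene is higher in the drug arrays; a negative one means it is lower.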
Why do we care about gene expression?
The genes in a genome are not all actively transcribed at once
Detecting which mRNA molecules are around at any given moment in time shows which genes are being actively transcribed
Lowest level at which genotype gives rise to phenotype
Can compare gene expression in normal vs cancer cells / tissue
Can compare expression between different environmental conditions or stress to see how an organism responds
Older methodology used before microarrays
Northern blotting
What is Northern blotting?
Northern blot refers to the capillary transfer of RNA from the electrophoresis gel to the blotting membrane
Principle: Northern blotting is a commonly used method to study gene expression by detection of RNA (or isolated mRNA) in samples
Cons of northern blotting
Time consuming and laborious
Uses nasty chemicals and radioactive labelling
• Radioactive phosphate probes
• Formaldehyde
Can only work on a handful of genes
Provides mainly qualitative data
You need to decide in advance which genes you want to investigate
Why are microarrays an improvement on northern blotting?
Less laborious: just apply the sample to a chip, and much of the process is automated
Not radioactive
Can work on 10,000s of genes; genome sequences can be used to design the probe set
Has a much higher dynamic range, which enables relative abundances to be determined
What is the difference between a northern blot and a western blot?
The northern blot is used chiefly to study gene expression by detection of RNA or RNA species (like mRNA)
The western blot is used to detect specific proteins in a sample by “immunoblotting”
How do you measure microarrays?
Using an Affymetrix array system (e.g. GeneChip)
The little “windows” are the hybridisation chamber where the experimental samples are introduced
Microarray method
1) Extract RNA from target
2) mRNA converted to cDNA via reverse transcriptase - labelled with fluorescent cyanine dye
3) Hybridisation to array (45-65 °C)
4) Array has ssDNA probes bound in spots called features
5) Washing to remove non-specific binders
6) Laser scans of the array
7) Data processing
How many bases make up a probe strand on the array?
25 bases (probes are single-stranded 25-mers)
How does hybridisation work?
Hybridisation is the property of complementary nucleic acid sequences to pair specifically with each other by forming hydrogen bonds between complementary bases
Shining a laser on the array causes the fluorescently labelled fragments that have hybridised to glow
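A small sketch of the complementarity rule behind hybridisation, assuming perfect Watson-Crick matching only (real hybridisation tolerates some mismatches depending on temperature and washing). The function names and the short 8-base probe are hypothetical.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def hybridises(probe, target):
    """True if the target is perfectly complementary to the probe."""
    return target == reverse_complement(probe)

probe = "ATGCGTAC"                    # hypothetical probe, shortened for readability
target = reverse_complement(probe)    # the only sequence that matches perfectly

print(target)                         # GTACGCAT
print(hybridises(probe, target))      # True
print(hybridises(probe, probe))       # False
```

The strands pair antiparallel, which is why the target is the reverse complement rather than the plain complement.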
Why is it not easy to compare between 2 different microarrays directly using raw intensity?
Due to variation in multiple things:
- Differences in sample extraction yield
- Differences in hybridisation conditions, including background signal
- Differences from one array to the next and artefacts on the array
Absolute abundance is difficult to quantify
How do you reduce the variation seen using microarrays?
Replicates - preferably on the same day
What are the 2 types of replicates?
Biological and technical
Biological replicates are independent biological samples, e.g. extra patients in a clinical study
Technical replicates would be using another array for the same patient's sample, or even replicated samples
How do you transform the intensity values?
Convert the data to log base 2 (log2)
The values will cover a smaller numerical range
The log2 intensities will be roughly bell shaped - approximately normally distributed
How do you convert transformed values back to the raw intensity values?
Anti-logging the values
Raise 2 to the power of X, where X is your log2 value of intensity
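A quick worked example of the log2 transform and the anti-log that reverses it. The raw intensities are hypothetical and chosen as exact powers of 2 so the arithmetic is easy to follow.

```python
import math

raw = [64.0, 256.0, 1024.0, 16384.0]       # hypothetical raw intensities

logged = [math.log2(x) for x in raw]       # log2 transform
restored = [2 ** x for x in logged]        # anti-log: raise 2 to the power of X

print(logged)     # [6.0, 8.0, 10.0, 14.0]
print(restored)   # [64.0, 256.0, 1024.0, 16384.0]
```

Note how a raw range of 64 to 16384 shrinks to 6 to 14 on the log2 scale, and each step of 1 on that scale is a doubling of intensity.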
What is the batch effect?
Batch effects show up where sample variability is due to sample handling, rather than genuine biological variation
Were samples extracted the same day?
Were samples hybridised on the same day?
Were samples processed by the same person?
Batch effects can mask genuine results
How can you spot batch effects?
Check how your samples cluster:
• Unsupervised hierarchical clustering successively links samples with similar expression profiles to give a tree like structure (dendrogram) and heatmap.
• And/or principal component analysis (PCA)
There are computational strategies for removing batch effects but it’s best not to introduce them in the first place
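A rough sketch of the idea behind checking how samples cluster. It only computes pairwise Pearson correlation and reports each sample's most similar neighbour, which is the similarity that hierarchical clustering builds on; it is not a full clustering or a heatmap. The sample names and log2 values are invented to show a batch effect (samples pairing by day rather than by treatment).

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical log2 intensities for 5 genes on 4 arrays
samples = {
    "ctrl_day1": [8.0, 6.0, 10.0, 7.0, 9.0],
    "drug_day1": [8.2, 6.1, 10.1, 7.3, 9.0],
    "ctrl_day2": [6.0, 9.0, 7.0, 10.0, 8.0],
    "drug_day2": [6.1, 9.2, 7.0, 10.1, 8.3],
}

def most_similar(name):
    """Return the other sample whose profile correlates best with `name`."""
    others = [other for other in samples if other != name]
    return max(others, key=lambda other: pearson(samples[name], samples[other]))

for name in samples:
    print(name, "->", most_similar(name))
```

Here every sample pairs with the other array from the same day, not with the array from the same treatment, which is the pattern to be suspicious of.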
What is PCA?
Principal Component Analysis
• PCA is a way of explaining variability in data.
• The first principal component accounts for the largest share of the variability
• The second principal component accounts for as much of the remaining variability as it can and so on
• Plotting 3 principal components gives a 3-dimensional graph where similar samples should cluster together
A dimensionality reduction technique
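A minimal sketch of PCA for the first principal component only, using power iteration on the gene-by-gene covariance matrix. In practice a statistics package would be used; this version is just to show the mechanics. The function name, the starting vector and the 4 arrays x 3 genes of log2 data are assumptions made for illustration.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (PC1 scores, fraction of variance explained) for samples x genes."""
    n, p = len(rows), len(rows[0])
    # Centre each gene (column) on its mean
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix between genes (p x p)
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: repeatedly multiply by cov and renormalise, which
    # converges to the direction of greatest variance
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along PC1 (eigenvalue) relative to total variance (trace)
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total = sum(cov[i][i] for i in range(p))
    # Project each centred sample onto PC1
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
    return scores, eigenvalue / total

labels = ["control_1", "control_2", "drug_1", "drug_2"]
data = [                      # hypothetical log2 intensities, 3 genes per array
    [5.0, 7.0, 6.0],
    [5.2, 7.1, 6.1],
    [9.0, 7.0, 3.0],
    [9.1, 6.9, 3.2],
]

scores, explained = first_principal_component(data)
for label, score in zip(labels, scores):
    print(label, round(score, 2))
print("PC1 explains", round(100 * explained, 1), "% of the variance")
```

With this made-up data the two controls land on one side of PC1 and the two drug arrays on the other, so three gene measurements per array are summarised by one number that separates the conditions.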
Analysis of PCA?
Allows you to spot and remove outlier arrays
We would expect replicates to cluster together and different conditions to separate out as distinct clusters
What is the main purpose of a gene expression microarray experiment?
To spot genes that are regulated by one or more of the experimental parameters:
• Time
• Drug / ligand response
• Stress, like oxidative or heat shock etc.
• Developmental process
What happens if you have no replicates?
No statistics - can’t estimate population variance from a sample size of one
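A short illustration of why a single array gives no variance estimate, using Python's standard `statistics` module. The log2 intensity values are hypothetical.

```python
import statistics

one_array = [7.4]                # hypothetical log2 intensity from a single array
replicates = [7.4, 7.9, 7.1]     # hypothetical log2 intensities from three arrays

try:
    statistics.variance(one_array)
except statistics.StatisticsError as err:
    # Sample variance divides by (n - 1), so n = 1 cannot be estimated
    print("No variance estimate:", err)

print(round(statistics.variance(replicates), 3))   # 0.163
```

Without a variance estimate there is no way to tell whether a difference between conditions is larger than the normal array-to-array scatter.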