Cancer genomics Flashcards
Which 2 sequencing technologies are likely to come into mainstream use next?
Oxford Nanopore
PacBio
Both are currently too expensive (but should become competitive soon). They can read up to 10kb but make mistakes. This is in contrast to Illumina, which reads about 150 bases but is more accurate. Long reads, even with mistakes, would help string short sequences together.
How does exome sequencing work?
Break up genomic DNA, hybridise it with probes for the bits you’re interested in (e.g. the exome) and pull these down. Sequence the captured bits. Lose some information but it is cheaper.
How does Illumina sequencing work?
Do paired-end sequencing (sequence from both ends) of 250-500bp fragments of DNA, reading approx. 150bp in from each end. Then align the reads to a reference. Error rate is 1/100 or 1/1000 depending on location in the genome.
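The numbers on this card can be turned into a quick back-of-envelope calculation. A minimal sketch, assuming errors are independent per base; the function names are illustrative only and not from any sequencing toolkit.

```python
def expected_errors(read_length: int, error_rate: float) -> float:
    """Expected number of miscalled bases in a single read."""
    return read_length * error_rate


def unsequenced_middle(fragment_length: int, read_length: int) -> int:
    """Bases in the middle of a fragment that neither read of the pair covers."""
    return max(0, fragment_length - 2 * read_length)


# A 150bp read at the two error rates quoted on the card.
for rate in (1 / 100, 1 / 1000):
    print(f"error rate {rate}: ~{expected_errors(150, rate):.2f} errors per read")

# A 500bp fragment read 150bp in from each end leaves a gap in the middle;
# a 250bp fragment is covered completely (the two reads overlap).
print(unsequenced_middle(500, 150))
print(unsequenced_middle(250, 150))
```

So at the worse error rate each read is expected to carry more than one wrong base, which is why alignment has to tolerate mismatches.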
What are the problems with sequence alignment in Illumina?
If there are too many repeats/errors in the sequence it can’t be lined up
If there are many mutations the read won’t align perfectly - the aligner goes for the closest match, which may be in the wrong place, or the read won’t align at all.
Translocations are found when one end of a fragment aligns to chromosome a and the other end aligns to chromosome b. However, mismatches have to be allowed for because of the high error rate, so misalignment (where there is homology in the reference genome on another chromosome) is as common as a real translocation. Can also get artefactual translocations when fragment ends are repaired, if a fragment joins with a piece from another chromosome.
LOTS of noise in translocations, need to sequence many fragments
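The discordant-pair idea above, and the need for many supporting fragments, can be sketched as follows. This is a toy illustration, not a real structural variant caller: the input is assumed to be a list of (chromosome of read 1, chromosome of read 2) tuples already extracted from an alignment, and the `min_support` threshold of 3 is an arbitrary assumption.

```python
from collections import Counter


def candidate_translocations(pairs, min_support=3):
    """Count read pairs whose two ends align to different chromosomes.

    pairs: iterable of (chrom_of_read1, chrom_of_read2) tuples.
    Returns {(chrom_a, chrom_b): n_pairs} for chromosome pairs supported by
    at least min_support discordant pairs; lone pairs are treated as noise.
    """
    counts = Counter()
    for chrom1, chrom2 in pairs:
        if chrom1 != chrom2:
            # Sort so that (chr9, chr22) and (chr22, chr9) count together.
            counts[tuple(sorted((chrom1, chrom2)))] += 1
    return {key: n for key, n in counts.items() if n >= min_support}


# Hypothetical data: 5 pairs link chr9 and chr22, 1 stray pair links chr3 and chr8.
pairs = (
    [("chr9", "chr22")] * 4
    + [("chr22", "chr9")]
    + [("chr1", "chr1")] * 10
    + [("chr3", "chr8")]
)
print(candidate_translocations(pairs))
```

The single chr3/chr8 pair is dropped, which is the point of the card: one discordant pair is as likely to be a misalignment or a repair artefact as a real translocation.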
How can misalignments in Illumina sequencing be verified?
Can’t verify every call by Sanger sequencing as it is too expensive. Instead, look at agreement between labs using the same data: 80-90% agreement for single base mutations; 30-40% for indels. The disagreement is due to the use of different software.
Artefacts of the sequencing can lead to findings of ‘translocations’ in cancer which are false.
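One way to put a number on agreement between labs is to compare their call sets directly. A minimal sketch, assuming agreement is measured as shared calls divided by all calls made by either lab (the card does not say which definition lies behind its percentages); the variant tuples are made-up placeholders.

```python
def concordance(calls_a: set, calls_b: set) -> float:
    """Fraction of all calls (made by either lab) that both labs agree on."""
    union = calls_a | calls_b
    if not union:
        return 1.0  # nothing called by either lab: trivially in agreement
    return len(calls_a & calls_b) / len(union)


# Hypothetical calls as (chromosome, position, ref, alt) from the same data.
lab_a = {("chr1", 1000, "A", "T"), ("chr2", 2000, "C", "G"),
         ("chr3", 3000, "G", "A"), ("chr4", 4000, "T", "C")}
lab_b = {("chr1", 1000, "A", "T"), ("chr2", 2000, "C", "G"),
         ("chr3", 3000, "G", "A"), ("chr5", 5000, "A", "G")}

print(f"agreement: {concordance(lab_a, lab_b):.0%}")
```

Calls only count as shared if chromosome, position and both alleles match exactly, which is one reason indels, where different software can report the same event at slightly different positions, tend to agree less well.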
How can cancer mutations be found by non-genome methods?
Historically looked in retroviruses that cause tumours in animals. When retroviruses were selected for efficiency at causing tumours, many turned out to have acquired an oncogene, which could be confirmed by sequencing the virus
Can transfect oncogenes into cells and look for phenotype
How can cancer mutations be found by genome-wide methods?
Hereditary predisposition (map, find location, find tumour w/ deletion and sequence) - helped find tumour suppressors
Cytogenetics (oncogenes in translocations)
Losses by cytogenetics and loss of heterozygosity (looking for common deletions in cancer)
CGH arrays
Sanger sequencing screens (e.g. MAPK screen found BRAF)
What are the advantages and disadvantages of using cytogenetics to look for chromosome translocations?
Good in leukaemia as there are few translocations
Not good in epithelial cancers where there are too many translocations. Improved a bit with FISH but still don’t know which parts of the chromosome have swapped so can’t tell significance.
Also can’t pick up small rearrangements e.g. TMPRSS2-ERG, a small gene fusion in which TMPRSS2 provides the promoter and ERG is a transcription factor.
What have CGH arrays taught us about cancer genomics?
Comparative genomic hybridisation arrays. Able to count the number of copies of a genome region in cancer (deletions and amplifications) - informs about copy number variants. Normal DNA is labelled green, tumour DNA is labelled red, both are hybridised to the array and the relative signal is measured. Doesn’t resolve to gene level - looking at 100kb areas at best. Has been superseded by whole genome sequencing, where reads are counted at each location.
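Counting reads at each location to estimate copy number can be sketched as a per-region log2 ratio of tumour to normal. This is a simplified illustration, not a production copy number caller: the read counts are made up, and the pseudocount and the +/-0.3 thresholds are arbitrary assumptions.

```python
import math


def log2_ratios(tumour_counts, normal_counts, pseudocount=0.5):
    """Per-region log2(tumour / normal), scaled so total depth is comparable."""
    tumour_total = sum(tumour_counts)
    normal_total = sum(normal_counts)
    ratios = []
    for t, n in zip(tumour_counts, normal_counts):
        t_frac = (t + pseudocount) / tumour_total  # pseudocount avoids log(0)
        n_frac = (n + pseudocount) / normal_total
        ratios.append(math.log2(t_frac / n_frac))
    return ratios


def call_region(ratio, gain=0.3, loss=-0.3):
    """Label a region as gained, lost or unchanged (thresholds are assumptions)."""
    if ratio >= gain:
        return "gain"
    if ratio <= loss:
        return "loss"
    return "neutral"


# Hypothetical read counts in four genome regions.
normal = [100, 100, 100, 100]
tumour = [100, 200, 50, 100]
calls = [call_region(r) for r in log2_ratios(tumour, normal)]
print(calls)
```

The same ratio is what the array reads out as red versus green signal; sequencing just replaces fluorescence with read counts and allows much smaller regions.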
What have Sanger sequencing screens taught us about cancer genomics?
PCR exon by exon, sequence, and look for statistically significant results. Can do candidate screens, e.g. hypothesise that beta-catenin could be an oncogene and go and look for mutations. Screens of whole pathways have also been done, e.g. the MAPK pathway.
What is the significance of TTC28?
Has a mobile element in an intron and is therefore copied all over the genome, with the same sequence within 4kb. This is found a lot in cancers, as non-LTR LINEs can be activated. The LINE is copied and takes part of TTC28 with it.
How does LINE1 move around the genome?
mRNA is transcribed and translated to make reverse transcriptase, which copies DNA from the mRNA. In the case of TTC28, the mRNA extends into TTC28, so that sequence is carried along when the mRNA is reverse transcribed. L1 itself is polymorphic and not unique, so reads from it can’t be aligned and its insertions are not detected.
What is the current knowledge of cancer genetics?
Point mutations in exons - high reliability, 50% sensitivity
Indels in exons - less than 50% reliability
Small mutations in non-coding sequences (promoters, enhancers) - little known
Structural rearrangements - few from sequencing; well-known ones from cytogenetics/CGH
Mobile element insertions - only just discovered
Epigenetics - poor
What are the different levels of transcription regulation?
DNA state (histone code, DNA methylation, chromatin structure)
Transcription factors
Cofactors
Protein post-translational modifications
RNA Pol II mediated gene expression (near promoters)
Describe the histone modifications found at enhancers and promoters
Are distinct from each other
Enhancers: mono/dimethylation of H3K4 (not trimethylation); required for enhancer function
Promoters: trimethylation of H3K4 for activation. Also acetylation. Methylation at K9 and K27 is repressive.
How can enhancers be identified?
Tend to be in euchromatin. Can be identified using DNase I hypersensitivity. Embryonic stem cells have ~3% of the genome open (any of which could be an enhancer). These regions close as cells differentiate.
What is special about nuclear receptors?
Are the only transcription factors that can be switched on directly by a ligand.
Describe the structure of nuclear receptors
Have a variable section at the N terminus, then a DNA binding domain consisting of 2 zinc finger domains, then a ligand binding domain.
How can nuclear receptors be drugged?
The ligand binding domain is unique and therefore druggable
How do nuclear receptors act in cancer?
Oestrogen-ER drives breast cancer (75% of cases)
Androgen-AR drives prostate cancer (100% of cases)
Are key targets in therapy and provide models to learn about transcription and cell growth.
How do cofactors work with nuclear receptors?
A very complex interaction involving many cofactors. Varies between cell types - not all cofactors function in all cells. Can edit histone modifications e.g. CBP/p300 is tethered to the nuclear receptor by SRC-1 and others, and can then acetylate histones. Cofactors are commonly mutated in cancer.
How can the binding points for ER be found in an unbiased way?
Use Illumina sequencing to find the DNA bound to the transcription factor (ChIP-seq). Pull down ER with its bound DNA, sequence, remove noise and look at where the reads align. Is an unbiased method that is reproducible. Found that most binding sites were in the middle of nowhere, i.e. not near promoters.
What did the discovery of where ER binds tell us?
Found that ER mostly bound in the middle of nowhere. Thought not to be random, as the bound DNA was DNase I hypersensitive and had enhancer histone marks (H3K4me1/2). The sites also tended to contain a cis-regulatory element: either the oestrogen response element or a forkhead transcription factor binding motif.
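Checking a binding site for these cis-regulatory elements can be sketched as a simple motif scan. The two patterns below are simplified textbook consensus sequences used as assumptions for illustration (GGTCAnnnTGACC for the oestrogen response element, TGTTTAC for a forkhead core motif); real analyses use position weight matrices rather than exact matches, and the test sequences are made up.

```python
import re

# Simplified consensus patterns (assumptions, see above).
ERE = re.compile(r"GGTCA[ACGT]{3}TGACC")
FORKHEAD = re.compile(r"TGTTTAC")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite DNA strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_hits(seq: str) -> dict:
    """Report whether each motif occurs on either strand of seq."""
    forward = seq.upper()
    reverse = reverse_complement(forward)
    return {
        "ERE": bool(ERE.search(forward) or ERE.search(reverse)),
        "forkhead": bool(FORKHEAD.search(forward) or FORKHEAD.search(reverse)),
    }


# Made-up sequences: the first contains an ERE, the second a forkhead motif
# on the reverse strand (GTAAACA is the reverse complement of TGTTTAC).
print(motif_hits("AAGGTCACAGTGACCTT"))
print(motif_hits("CCGTAAACAGG"))
```

Both strands are scanned because a transcription factor can bind in either orientation; the ERE pattern is palindromic, so for it the reverse-strand check is redundant but harmless.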
What is FoxA1?
Is a forkhead box protein. Is a pioneer factor that binds forkhead motifs in tightly wound DNA (not heterochromatin) and opens it up to allow other transcription factors to bind.