Lecture 3 - Nuclear organisation and next generation sequencing: flashcards
where is constitutive heterochromatin found?
at centromeres and telomeres
what are centromeres?
Region where spindle fibres attach and pull apart chromatids during mitosis.
Repetitive sequences
Chromatin contains specialised histones
Chromatin is always very highly condensed into heterochromatin
what are telomeres?
Chromosome ends
DNA repeats, maintained during replication by telomerase
Telomere length decreases as an organism ages, except in stem cells
Chromatin structure is always very highly condensed into heterochromatin
what is Constitutive Heterochromatin?
Highly Condensed chromatin
Centromeric and telomeric repeats
Repressive histone modifications
Methylated DNA
No meiotic recombination
Replicated late in S phase
Non-coding RNA
what is Facultative Heterochromatin?
Condensed chromatin
Inactive genes
Repressive histone modifications
Methylated DNA
Replicated later in S phase
Non-coding RNA?
What is euchromatin?
Less condensed
Active genes
Gene promoters have active histone modifications
Gene promoters not DNA methylated
Replicated throughout S phase
what is DNA methylation?
DNA methylation works with repressive histone modifications to condense and silence chromatin.
Approximately 80% of CpG dinucleotides in human somatic cells are subject to cytosine methylation.
CG dinucleotides are methylated on both strands.
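A small sketch of why CG dinucleotides can be methylated on both strands: CG reads the same on the opposite strand, so every CG on one strand sits opposite a CG on the other. The sequence and the function names below are made up for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


def cpg_positions(seq: str) -> list[int]:
    """Return the 0-based position of the C in every CG dinucleotide."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


top = "ATCGGACGTT"                 # made-up example sequence (top strand, 5'->3')
bottom = reverse_complement(top)   # bottom strand, also written 5'->3'

top_sites = cpg_positions(top)
bottom_sites = cpg_positions(bottom)

# Convert bottom-strand CG positions back to top-strand coordinates.
# A CG starting at bottom index j covers top indices len-2-j and len-1-j.
bottom_sites_on_top = sorted(len(top) - 2 - j for j in bottom_sites)

print(top_sites, bottom_sites_on_top)
```

Both lists are identical, which is the sense in which each CpG site offers one methylatable cytosine on each strand.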
where are Lamin-associated domains (LAD) located and what are they?
located at the nuclear periphery
LADs are associated with lamins of the nuclear membrane
LADs are heterochromatin
Contain few genes, and these are silenced
LADs are replicated late in S phase
Some LADs vary depending on cell type
where do chromosomes localise in interphase?
to distinct chromosome territories
Territories can be seen by labelling each chromosome with a different fluorescent dye
what tends to be true for euchromatin?
-Decondensed
-Acetylated histones
-Active genes
what are types of sequencing technology?
Individual clones of DNA molecules: Sanger sequencing
Whole genome: Next generation sequencing
what is Sanger sequencing?
PCR containing fluorescent, chain-terminating dideoxynucleotide triphosphates
Current Sanger sequencing technology uses fluorescently labelled ddNTPs (dideoxynucleotide triphosphates), which do not have a free 3’ OH, mixed in with dNTPs. Whenever the DNA polymerase incorporates a ddNTP, it cannot add any further nucleotides.
Each sequencing reaction uses one PCR sample
Sequence length is 600-1300 bp
Sequence is typically obtained within 2 days, analysis is easy
One reaction costs around £6
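The chain-termination idea can be sketched in a few lines. This is a toy model, not a simulation of real chemistry: it assumes one terminated fragment exists for every position, that fragments separate perfectly by length, and that the terminal ddNTP's dye identifies the last base. The function names and the example sequence are hypothetical.

```python
def chain_terminated_fragments(new_strand: str) -> list[str]:
    """Return every possible product: synthesis stops wherever a ddNTP goes in."""
    return [new_strand[:i] for i in range(1, len(new_strand) + 1)]


def read_sequence(fragments: list[str]) -> str:
    """Sort fragments by length, then read the labelled terminal base of each."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))


strand = "GATTACAGGC"  # made-up newly synthesised strand
fragments = chain_terminated_fragments(strand)

# The fragments come out of the reaction as a mixture, so order is arbitrary;
# reversing the list here stands in for that mixing.
mixed = list(reversed(fragments))
recovered = read_sequence(mixed)
print(recovered)
```

Sorting by length plays the role of electrophoresis: the shortest fragment reports the first base, the next shortest the second, and so on.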
what is the human genome project?
Thousands of short regions of the genome cloned into plasmids
Sanger sequencing of each cloned sequence
Many labs around the world combined their data
First draft of the human genome released in 2001
Took 10 years
Cost $3,000,000,000
what is Next generation sequencing / High throughput sequencing / Massively parallel sequencing?
Latest technology (Illumina HiSeqX)
Sequence of a whole genome or transcriptome obtained in one reaction
Sample prep, sequencing and bioinformatic analysis can take less than one week, but typically takes 2-3 months
Costs can be as low as £1000 for full sequencing of a mammalian genome
Costs for RNAseq ~£400 per sample
what is High throughput sequencing?
several technologies, but some principles in common
DNA molecules within the library amplified by PCR, not by cloning individually into bacteria
Amplified DNA templates are spatially segregated: eg on beads, in an emulsion, or on a slide.
DNA templates are sequenced simultaneously in a massively parallel fashion
Incorporation of specific bases can be detected in various ways, eg:
by fluorescent tags using a camera
by release of H+ ions using a semiconductor chip
by how they block the flow of ions through a nanopore
what is De novo genome assembly?
35 – 100 bp reads are aligned with each other to produce a consensus genome sequence.
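Need at least 10x coverage (each base covered by at least 10 reads on average). Repetitive regions are harder to sequence and assemble because short reads from them cannot be placed uniquely.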
Centromeres and telomeres extremely hard to sequence.
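As a rough worked example of the coverage figure, average coverage can be estimated as number of reads x read length / genome size. The sketch below assumes reads are spread evenly across the genome, and the human genome size used (about 3.1 billion bp) is an approximation supplied here, not a value from the lecture.

```python
from math import ceil


def coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of reads covering each base."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Minimum number of reads to reach the target average coverage."""
    return ceil(target_coverage * genome_size / read_length)


HUMAN_GENOME_BP = 3_100_000_000  # approximate size, assumed for illustration

# Reads required for 10x coverage at the two ends of the 35-100 bp range.
for length in (35, 100):
    print(length, reads_needed(10, length, HUMAN_GENOME_BP))
```

Shorter reads need proportionally more reads to reach the same coverage, and this estimate says nothing about whether repeats can be resolved.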
how do we identify sequence variants?
By comparing reads with the reference sequence to find mutations or SNPs (single nucleotide polymorphisms)
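A minimal sketch of spotting single-base variants, assuming the sample sequence has already been aligned to the reference with no insertions or deletions. Real variant calling works from many overlapping reads and accounts for sequencing errors; the sequences and the function name here are made up.

```python
def find_single_base_variants(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (position, ref_base, alt_base)
        for position, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


reference = "ACGTACGT"  # made-up reference
sample = "ACGAACGT"     # made-up sample with one substitution

variants = find_single_base_variants(reference, sample)
print(variants)
```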
Which of these questions would you use NGS (next generation sequencing) for?
What is the genome sequence of the organism I’m working with?
What mutations are present in this patient’s cancer?
What can we do with RNAseq data?
What was your hypothesis?
Which genes are upregulated or downregulated?
Gene ontology - are the altered genes associated with particular functions?
eg. neuronal function, glucose metabolism, gametogenesis
Pathway analysis – are the altered genes associated with particular pathways?
eg. MAPK signalling, insulin signalling
what is gene ontology used for?
Here, GO was used to identify gene classes containing an overrepresentation of genes that were differentially expressed in ES cells induced to undergo neurogenesis.
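Overrepresentation of a gene class is commonly assessed with a hypergeometric (one-sided Fisher) test. The sketch below is one way to compute it and is not necessarily the method used in the lecture example; all the counts are invented for illustration.

```python
from math import comb


def enrichment_p_value(total_genes: int, genes_in_term: int,
                       de_genes: int, de_in_term: int) -> float:
    """Probability of seeing at least de_in_term term genes among de_genes
    differentially expressed genes drawn at random from total_genes."""
    upper = min(genes_in_term, de_genes)
    tail = sum(
        comb(genes_in_term, k) * comb(total_genes - genes_in_term, de_genes - k)
        for k in range(de_in_term, upper + 1)
    )
    return tail / comb(total_genes, de_genes)


# Invented numbers: 20,000 genes, 300 annotated to a term,
# 500 differentially expressed, 25 of those in the term.
p = enrichment_p_value(20_000, 300, 500, 25)
print(p)
```

By chance about 7.5 of the 500 genes would fall in the term (500 x 300 / 20,000), so 25 gives a small p-value. When many terms are tested, the p-values also need correcting for multiple testing.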
what is the Ingenuity Pathway Analysis?
In this example, IPA was used on RNAseq data to elucidate a network of transcription factors that mediate neurogenesis
what questions would you use RNAseq for? What other techniques might be better?
What genes are expressed when ES cells differentiate into cardiomyocytes?
Neural stem cells die when treated with tazemetostat (inhibits lysine methylation). Which genes are affected by this treatment?
what are the applications of NGS?
Mutations, translocations, duplications –> cancer
Somatic variants- single nucleotide polymorphisms (SNPs) –>
Genetic diseases - Genome-wide association studies (GWAS) for chronic disease
RNAseq – transcriptome analysis - model organisms –> epigenetic analysis
what is RNAseq analysis?
expressed exons are revealed by alignment of NGS reads to the reference genome
how do we Identify genes that are significantly up- or down-regulated?
Validate results from RNAseq using conventional methods such as qRT-PCR and western blotting
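A toy sketch of flagging up- and down-regulated genes by log2 fold change. It assumes counts are already normalised for sequencing depth and uses an arbitrary two-fold threshold and a pseudocount of 1; it does not test significance, which needs replicates and a proper statistical model. Gene names and counts are hypothetical.

```python
from math import log2


def log2_fold_change(control: float, treated: float, pseudocount: float = 1.0) -> float:
    """log2(treated / control), with a pseudocount to avoid dividing by zero."""
    return log2((treated + pseudocount) / (control + pseudocount))


def classify(control: float, treated: float, threshold: float = 1.0) -> str:
    """Label a gene 'up', 'down' or 'unchanged' using a log2 fold-change cut-off."""
    change = log2_fold_change(control, treated)
    if change >= threshold:
        return "up"
    if change <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical normalised counts: gene -> (control, treated)
counts = {"geneA": (99, 399), "geneB": (399, 99), "geneC": (100, 110)}
calls = {gene: classify(control, treated) for gene, (control, treated) in counts.items()}
print(calls)
```

Genes flagged this way are the candidates to confirm by qRT-PCR or western blotting.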
what is the Solexa technical approach?
1 - bind single DNA molecules to surface
2 - amplify
3 - DNA clusters form, with about 500 copies per cluster, each cluster about 1 micron in diameter
what is RNA-SEQ?
1 - RNA isolation (experimental or patient samples)
2 - cDNA amplification
3 - library prep
4 - sequencing (need around 40 million reads for human RNAseq)
5 - data analysis (align to the reference genome; compare results between samples, or with previously generated data)