Review of Basic Genetic and Bioinformatic Technologies in the Context Flashcards
Central Dogma of What Genomics Measures
Linear flow of information
Genome –> Transcriptome –> Proteome –> Metabolome
DNA –> coding mRNA –> Protein (enzymatic roles, altering metabolites and cells) –> Metabolites
Aluminium template representing Thymine from Crick and Watson’s model of DNA, based on X-ray diffraction photographs by Rosalind Franklin. The model explained how genetic information could be copied and passed to future generations –> Nobel Prize in 1962
-idea of how DNA can be copied and replicated
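The DNA –> mRNA –> Protein part of the flow above can be sketched in a few lines of Python. This is only a toy illustration: the codon table is deliberately partial (just the codons used in the example), and the input sequence is made up.

```python
# Partial codon table: only the codons used in this toy example
# (an assumption for brevity, not the full 64-codon genetic code).
CODON_TABLE = {
    "AUG": "Met",
    "GCU": "Ala",
    "UUU": "Phe",
    "GGA": "Gly",
    "UAA": "STOP",
}


def transcribe(coding_strand: str) -> str:
    """DNA coding strand -> mRNA: same sequence with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


dna = "ATGGCTTTTGGATAA"  # made-up example sequence
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)
print(protein)
```

The sketch covers only the linear route; it says nothing about the alternative routes (microRNAs, methylation, 3D genome structure) described in the next cards.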
Alterations to Central Dogma
- Central Dogma isn’t the only way information flows
as DNA –>
a) coding mRNA
b) non-coding RNAs, e.g. microRNAs (miRNAs)
-influence the stability and degradation of mRNA (and so the translation of coding mRNAs by ribosomes)
-influence how metabolism occurs in cells
-Nobel Prize awarded for discovering microRNAs
-act on coding mRNAs, Proteins and Metabolites
Now realising there are alternative ways that information flows through cells
- Methylation of residues on DNA
- How different parts of genomes are pulled together in a 3D matrix, to influence how genes are used and what turns them on and off
Why is the breaking of the central dogma very important?
The way that information flows through cells is very important
-how the central dogma is broken by methylation, imprinting and microRNAs is important for disease
DNA genomics
Genomics is focused on DNA part of information flow
Genomics is “how genes are used”
-broad
- how genes are expressed, their relation to disease, how they're selected
Traditional gene sequencing
Older but still useful technology:
-DNA sequence analysis by Sanger Sequencing (traditionally). Still widely used today.
-take a single piece of DNA and sequence it from end to end
–reads genome from template base by base
=Read out
-sequence a gene by randomly stopping the polymerase; the base it stops at determines the colour of the fluorescent signal
-Further developed by Leroy Hood and coworkers to enable the human genome project
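The "randomly stopping polymerase" idea can be simulated as a rough sketch. Assumptions made here for illustration: the template is written in the direction the polymerase reads it, every position has the same stop probability, and the base letter itself stands in for the fluorescent colour. The function name and parameters are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_read(template: str, copies: int = 2000,
                stop_prob: float = 0.2, seed: int = 0) -> str:
    """Simulate chain termination and read the sequence back by fragment length."""
    rng = random.Random(seed)
    terminal_base_by_length = {}
    for _ in range(copies):
        # The polymerase copies the template base by base...
        for position, base in enumerate(template):
            new_base = COMPLEMENT[base]
            # ...and sometimes stops at random; record which base ended the fragment.
            if rng.random() < stop_prob:
                terminal_base_by_length[position + 1] = new_base
                break
    # Sorting fragments by length (as a gel or capillary would) and reading
    # the terminal base of each gives the sequence of the copied strand.
    return "".join(terminal_base_by_length[length]
                   for length in sorted(terminal_base_by_length))


read = sanger_read("TACGGATCCA")
print(read)
```

With enough copies, every fragment length is represented, so the read-out is the complement of the whole template.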
Newer gene sequencing Technology
Next Generation Sequencing/ Massively parallel sequencing
-sequence lots of molecules at once
-break up a molecule randomly into lots of different fragments and sequence them all at once/all in parallel
1. Pick Genomic DNA –>
2. cut DNA (with enzymes or random fragmentation (sonication, shattering DNA with sound waves)) –>
3. Add Linkers (little pieces of defined DNA sequence added to the ends using an enzyme)
-defined sequences = adapters; they allow DNA to bind to the microscope slide –>
4. Input library –>
5. Flow cell (the adapters bind the DNA to the slide) –> In Situ PCR
Then traditional polymerase chain reaction (PCR) to amplify, building up a large number of copies of each individual fragment –>
6. Sequencing –> An image of hundreds of extended molecules
-as each base is added, a flash of light identifies the base at that position in the fragment
-the sequence builds up gradually, one flash per base (recorded as a series of photographs)
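Steps 2 to 4 above (random fragmentation, adding adapters, input library) can be sketched as follows. The adapter sequences, the genome string and the function names are all made up for illustration; real adapters and fragment sizes depend on the platform.

```python
import random

# Hypothetical adapter sequences (placeholders, not real platform adapters).
ADAPTER_LEFT = "AAAACCCC"
ADAPTER_RIGHT = "GGGGTTTT"


def fragment(dna: str, n_cuts: int, rng: random.Random) -> list[str]:
    """Cut the DNA at n_cuts random positions (stand-in for sonication)."""
    cut_points = sorted(rng.sample(range(1, len(dna)), n_cuts))
    edges = [0] + cut_points + [len(dna)]
    return [dna[a:b] for a, b in zip(edges, edges[1:])]


def build_library(dna: str, n_cuts: int = 4, seed: int = 1) -> list[str]:
    """Fragment the DNA and add a defined adapter to both ends of each piece."""
    rng = random.Random(seed)
    return [ADAPTER_LEFT + piece + ADAPTER_RIGHT
            for piece in fragment(dna, n_cuts, rng)]


genome = "ACGTTGCAAGCTTAGGCTAACGTTAGC"  # made-up sequence
library = build_library(genome)
for molecule in library:
    print(molecule)
```

Every molecule in the library now starts and ends with a known sequence, which is what lets all fragments bind to the slide and be amplified in the same way.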
What is another name for the Newer Genomic Sequencing Technologies?
Next Generation Sequencing
- Massively Parallel Sequencing
- massively parallel nature: hundreds of millions of DNA fragments sequenced simultaneously
- each flashing light comes from a different position on the microscope slide, i.e. a different fragment
Use of Next Generation Sequencing
specific piece of DNA
RNA - measure how much each gene is used by counting the number of times each RNA appears in the cell
Exome sequencing
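The RNA use listed above (counting how many times each RNA appears) comes down to a tally. A minimal sketch, assuming each read has already been assigned to a gene; the read list is made up, with gene names used only as labels.

```python
from collections import Counter

# Made-up example: the gene each sequenced RNA read was assigned to.
read_to_gene = ["GAPDH", "TP53", "GAPDH", "ACTB", "GAPDH", "ACTB"]

counts = Counter(read_to_gene)
total = sum(counts.values())

# Report each gene's share of all reads as a simple measure of usage.
for gene, n in counts.most_common():
    print(gene, n, round(n / total, 2))
```

Real RNA sequencing analysis also corrects for gene length and sequencing depth, which this tally ignores.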
Genomes vs Exomes vs Targeted gene “panels”
-Whole Genome (3 billion base pairs)
-Exome = about 1-2% of the genome (only 1-2% of the 3 billion base pairs code for genes whose function we clearly understand, e.g. encoding protein or mRNA)
-exome sequencing = sequencing of one part of the whole genome
After fragmenting the DNA, capture the DNA fragments corresponding to the exome using baits
Whole Genome sequencing
Useful
3 Billion base pairs to a whole genome
can sequence human genome 1000-200genome
-more affordable
-but provides a lot of information that we don't know how to use with the current level of scientific knowledge
vs. Exome sequencing is sequencing of the part of genome that we know how to use/know the best
-get much more data on each part of DNA for the same cost and same effort as sequencing whole genome
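The "much more data on each part of DNA for the same effort" point is simple arithmetic: mean depth = number of reads x read length / size of the target. A rough worked example, where the read count and read length are assumed values chosen for illustration, and capture is assumed to be perfect:

```python
def mean_depth(n_reads: int, read_length: int, target_size: int) -> float:
    """Average number of reads covering each base of the target."""
    return n_reads * read_length / target_size


GENOME_SIZE = 3_000_000_000        # 3 billion base pairs (from the notes)
EXOME_SIZE = GENOME_SIZE // 100    # taking the exome as 1% of the genome

n_reads = 600_000_000              # assumed size of one sequencing run
read_length = 150                  # assumed read length in bases

genome_depth = mean_depth(n_reads, read_length, GENOME_SIZE)
exome_depth = mean_depth(n_reads, read_length, EXOME_SIZE)
print(genome_depth, exome_depth)
```

Under these assumptions the same run covers each exome base 100 times more deeply than each genome base, because the target is 100 times smaller.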
Exome Sequencing mechanism
After fragmentation:
capture the DNA fragments on little beads,
each bead has a piece of complementary DNA that can bind to the DNA you want and pull it away from the 99% that you don't want
=resulting in just the exome sequences that you want
-the beads act as baits for fragments corresponding to the exome (parts of the genome that encode proteins or mRNAs)
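Bait capture can be pictured as a filter over the fragments. In this sketch a plain substring match stands in for hybridisation between the bait and its complementary strand; the bait and fragment sequences are made up.

```python
# Hypothetical bait sequences representing exome regions.
BAITS = ["GATTACA", "CCGGAAT"]


def capture(fragments: list[str], baits: list[str]) -> list[str]:
    """Keep only the fragments that at least one bait can bind to."""
    return [frag for frag in fragments
            if any(bait in frag for bait in baits)]


fragments = ["TTGATTACATT", "AAAAAAAAAA", "GGCCGGAATGG", "CTCTCTCTCT"]
captured = capture(fragments, BAITS)
print(captured)
```

Only the captured fragments go on to be sequenced; the rest are washed away.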
The recent advance of sequencing technology
Illumina
-machines are smaller, now found in big centres and individual research labs; can do several in a week
MinION - sequencing machine that can plug into USB port of laptop
–> increased field use
-e.g. farmers (understand what pathogen is infecting their crops)
-public health nurses (identify what pathogen is infecting a patient)
Ease of Next Generation sequencing
Technically not easy to do
- once you have all the photographs/lights for the four bases
- you know the sequence of all the fragments
- A boring but essential job: processing the millions of sequence reads produced by each next generation sequencing run (mapping them to the genome - finding out where they belong)
- Sequence Read Alignment
Sequence Read Alignment
boring but essential job
Is like a jigsaw puzzle where they give you the cover on the box
-individual DNA Reads are analogous to pieces of jigsaw
-now because of human genome project know sequence of most parts of human genome - use as a reference
-often results in large numbers of reads that cover any one part (some pieces are easier to place than others)
-pieces with unique features are easy to place, but pieces that look like each other (e.g. repeated sequence) are hard to place/align
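The jigsaw analogy can be shown with a deliberately naive aligner that looks for exact matches of a read in the reference. This is a sketch only: real aligners use indexed references and tolerate mismatches, and the sequences here are made up.

```python
def align(read: str, reference: str) -> list[int]:
    """Return every position where the read matches the reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


reference = "ACGTACGTTTGACCA"       # made-up reference sequence
unique_hits = align("TTGAC", reference)   # a piece with unique features
repeat_hits = align("ACGT", reference)    # a piece that looks like another
print(unique_hits, repeat_hits)
```

The unique read has exactly one possible home, while the repeated read fits in two places, so its true origin cannot be decided from the sequence alone.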
Variant Calling and Annotation
Identifying and visualising sequence variants
-genome browser used daily to understand
-each little grey line is a sequence read
-2x strands of human genome =sense and antisense strand of human genome
-mutation = green reads that differ from the reference; compare against the reference and the patient's germline
=random fragments cover the point of mutation, with large numbers of different reads
1. Each read is a random fragment from Next Generation Sequencing
2. End product: overlay/pile up all the different reads, which are randomly produced by fragmenting the genome but together cover any one base in the genome –> this lets you identify mutations not present in normal tissue
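The pile-up idea can be sketched as counting the bases seen at each reference position and flagging positions where enough reads disagree with the reference. Assumptions: reads are already aligned (given as start position plus sequence), the 0.3 threshold is arbitrary, and the comparison is against the reference only, not against a matched germline sample. All sequences and names are made up.

```python
from collections import Counter


def pileup(reference: str, aligned_reads: list[tuple[int, str]]) -> list[Counter]:
    """Count the bases observed at every reference position."""
    columns = [Counter() for _ in reference]
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            columns[start + offset][base] += 1
    return columns


def call_variants(reference, aligned_reads, min_fraction=0.3):
    """List (position, ref base, alt base, alt count, depth) for likely variants."""
    variants = []
    for pos, column in enumerate(pileup(reference, aligned_reads)):
        depth = sum(column.values())
        for base, count in column.items():
            if base != reference[pos] and count / depth >= min_fraction:
                variants.append((pos, reference[pos], base, count, depth))
    return variants


reference = "ACGTACGTAC"
reads = [(0, "ACGTAC"), (2, "GTTCGT"), (3, "TTCGTAC"), (4, "ACGTAC")]
variants = call_variants(reference, reads)
print(variants)
```

Here two of the four reads covering position 4 carry a T where the reference has an A, which is the kind of pattern a genome browser shows as coloured reads against grey ones.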