Genomes and Sequencing Flashcards
What do you do before you can start to sequence?
Extract DNA and fragment it
What is shotgun sequencing?
Splitting up DNA into fragments randomly to be sequenced
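As a rough illustration of random fragmentation, the sketch below samples fixed-length pieces from random start positions in a toy sequence. The function name, the toy genome and the fixed read length are assumptions made for the example; real fragmentation gives a spread of fragment sizes.

```python
import random

def shotgun_fragments(genome, n_reads, read_len, seed=0):
    """Sample n_reads pieces of length read_len at random start positions.

    Hypothetical helper for illustration; the seed only makes the run repeatable.
    """
    rng = random.Random(seed)
    last_start = len(genome) - read_len
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    # Keep the start position so each fragment can be checked against the genome.
    return [(start, genome[start:start + read_len]) for start in starts]

genome = "ATGCGTACGTTAGCCGATAGCTAGGCTTACGATCG"  # toy sequence, not real data
reads = shotgun_fragments(genome, n_reads=10, read_len=8)
for start, fragment in reads:
    print(start, fragment)
```

Because the start positions are random, some stretches of the sequence are sampled several times and others may be missed, which is why coverage matters later on.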
How is shotgun sequencing achieved?
Sonication
What is the best sequencing method to use in reality?
Illumina
What type of PCR goes along with Illumina?
Bridge PCR
Why is Illumina the best to use?
454 (pyrosequencing) has a homopolymer problem, i.e. it can’t reliably tell how long a run of the same nucleotide is, and it can’t deliver the same volume of data
Ion torrent is mostly for specialised sequencing
Sanger is old
Nanopore is not consistent and has a high error rate
Why shouldn’t you discuss all of them?
Because that would make it too fragmented and it wouldn’t work in real life
What is an amplicon?
A PCR product
What should your target coverage be and why?
30x because above that you probably can’t improve the error rate anymore
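Average coverage is commonly estimated as (number of reads × read length) / genome size. The sketch below turns that arithmetic around to ask how many reads a 30x target needs. The genome size and read length are made-up example values, not figures from these notes.

```python
import math

def mean_coverage(n_reads, read_len, genome_size):
    """Average coverage = total sequenced bases / genome size."""
    return n_reads * read_len / genome_size

def reads_needed(target_coverage, read_len, genome_size):
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size / read_len)

# Assumed example values: a 5 Mb genome and 150 bp reads.
genome_size = 5_000_000
read_len = 150

n = reads_needed(30, read_len, genome_size)
print(n, mean_coverage(n, read_len, genome_size))
```

This is only an average: because fragments are sampled randomly, individual positions will sit above or below the target.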
What should your length be?
At least 15 - 20 base pairs
How does Illumina work?
After PCR, all four bases are fluorescently labelled, each with a different colour (reversible terminator bases), and a strand complementary to the template strand is synthesised. The bases compete to pair with the template: the matching base is incorporated, the other three are washed away, and the camera detects the fluorescent colour, which lets us work out the original base.
Outline your methodology for sequencing a genome
DNA extracted
Fragmented by sonication (shotgun sequencing)
Adaptors added to both ends of a fragment
Adaptors attach to inside of flow cell
Bridge PCR performed
After amplification, nucleotides labelled with own fluorescent colour
Bases compete to form a regular complementary strand: the complementary nucleotide binds and the other three are washed away
Observe colour under light to work out original base
Modified dNTP inhibits extension at 3’ end so only 1 base added at a time
Resulting contigs (overlapping bits of DNA) are scaffolded by filling in the gaps and linking
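The base-by-base readout in the steps above can be sketched as a loop in which each cycle incorporates one labelled base and records its colour. The colour assignments and function names are assumptions for illustration, and the sketch ignores strand direction, clusters and read errors.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical colour assignment, chosen only for this example.
COLOUR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_COLOUR = {colour: base for base, colour in COLOUR.items()}

def sequence_by_synthesis(template):
    """Return the colour observed in each cycle, one cycle per template base."""
    colours = []
    for template_base in template:
        # Only the complementary base pairs; the other three are washed away.
        incorporated = COMPLEMENT[template_base]
        # The terminator allows one base per cycle, so one colour is imaged.
        colours.append(COLOUR[incorporated])
    return colours

def call_bases(colours):
    """Translate colours into the synthesised strand and the original template."""
    synthesised = "".join(BASE_FOR_COLOUR[c] for c in colours)
    template = "".join(COMPLEMENT[b] for b in synthesised)
    return synthesised, template

colours = sequence_by_synthesis("ACGT")
print(colours)
print(call_bases(colours))
```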
Describe the assembly process
Ordering sequenced DNA fragments into genome using paired end reads
A contig is the consensus sequence of overlapping fragments, so overlapping fragments are grouped to create a contig
Contigs lined up and joined using paired end data resulting in scaffolds
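The contig-building step can be sketched as a greedy merge of the reads with the longest suffix-to-prefix overlap. This is a simplified stand-in rather than what a real assembler does: it assumes error-free reads from one strand, and it does not cover the paired-end scaffolding step. The function names and the minimum overlap of 3 are assumptions.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]

reads = ["ATGCGT", "CGTACG", "ACGGAT"]  # toy reads, not real data
print(greedy_assemble(reads))
```

Reads that share no sufficient overlap stay as separate contigs, which is the situation where paired-end information is needed to order and link them into scaffolds.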
What is meant by coverage?
The number of times a section of the genome is represented in the sequenced fragments
What is meant by identity?
The percentage of the reads covering a position that agree on the same nucleotide
What are identity and coverage used for?
To discount errors and to differentiate between errors and heterozygosity. An identity of 50% or close to it indicates heterozygosity
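A minimal sketch of how coverage and identity could be computed for one position, given the bases that the overlapping reads show there. The 40-60% window used to flag heterozygosity is an assumed threshold for the example, not a value from these notes, and the function expects at least one base.

```python
from collections import Counter

def coverage_and_identity(bases):
    """Return (coverage, most common base, identity %) for one position."""
    counts = Counter(bases)
    coverage = len(bases)  # number of reads covering the position
    top_base, top_count = counts.most_common(1)[0]
    identity = 100.0 * top_count / coverage
    return coverage, top_base, identity

def classify(identity, het_window=(40.0, 60.0)):
    """Label a position using an assumed window around 50% identity."""
    low, high = het_window
    if low <= identity <= high:
        return "possible heterozygous site"
    if identity > high:
        return "consistent call; minority bases are probably errors"
    return "ambiguous"

mostly_a = "A" * 27 + "C" * 3   # 30 reads, 3 disagree
half_half = "A" * 15 + "G" * 15  # 30 reads, even split

for pileup in (mostly_a, half_half):
    cov, base, ident = coverage_and_identity(pileup)
    print(cov, base, ident, classify(ident))
```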
What is bin size?
The boundaries that are set to distinguish between statistical variation and error
What is genomics?
The study of all the genes of a cell or tissue at the DNA, mRNA or protein level
What is comparative genomics?
The study of the relationship of genome structure and function across different biological species or strains
What is functional genomics?
The study of gene and protein functions and interactions utilising the data produced by genome projects