Midterm 1 - Notes 5 (Part 3) Flashcards
What type of sequencing has a higher error rate than SMRT?
Nanopore sequencing
Why is nanopore sequencing very fast?
It is very fast because the DNA strand is read directly as it passes through the pore; no DNA synthesis (polymerase copying step) is involved
What is the sample prep for nanopore sequencing? (3)
- Fragment DNA
  - continuous reads
- Add leader to one side
  - directs to motor protein at pore
- Add hairpin adaptor to other side of structure
  - the first strand goes through the pore and, since the hairpin attaches it to the complementary strand, the complementary strand goes through the pore next
What does nanopore sequencing allow?
Sequencing the same fragment twice
Does nanopore sequencing have a high error rate?
Yes
- 100x higher than Illumina
What are 4 benefits to nanopore sequencing?
- Very long reads
  - longer than SMRT
- Very high throughput
- Very fast
- Small instrument footprint
Contig
A set of overlapping sequence reads; in top-down sequencing projects, it refers to the overlapping clones that form a physical map of the genome, which is used to guide sequencing and assembly
Contig assembly: how do classical alignment programs work? (4)
- Start with 2 sequences
- Start off with a sliding window where the strands are identical
- Then you move outward and look for differences between the 2 sequences
- Need to allow a few mismatches, because the two sequences could still be the same apart from a single mutation
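The four steps above can be sketched in a few lines. This is only a rough illustration, not how any particular alignment program is implemented: the function name `seed_and_extend`, the `window` size, and the `max_mismatches` limit are all assumptions made for the example, and only the first place the window occurs in the second sequence is checked.

```python
def seed_and_extend(a, b, window=4, max_mismatches=1):
    """Find an identical window (seed) shared by a and b, then extend it
    outward in both directions and count the differences.

    Returns (start_a, start_b, length, mismatches) for the first seed whose
    extended region has at most max_mismatches differences, else None.
    """
    for i in range(len(a) - window + 1):
        seed = a[i:i + window]
        j = b.find(seed)  # simplification: first occurrence in b only
        if j == -1:
            continue
        # Extend left until either sequence starts, right until either ends.
        left = min(i, j)
        right = min(len(a) - i, len(b) - j)  # includes the seed itself
        start_a, start_b = i - left, j - left
        length = left + right
        region_a = a[start_a:start_a + length]
        region_b = b[start_b:start_b + length]
        mismatches = sum(1 for x, y in zip(region_a, region_b) if x != y)
        if mismatches <= max_mismatches:
            return start_a, start_b, length, mismatches
    return None


# Made-up reads: the end of read_a overlaps the start of read_b,
# with one differing base (C vs G) inside the overlap.
read_a = "AAACCGGATTCA"
read_b = "GGATTGATTTTT"
tolerant = seed_and_extend(read_a, read_b, max_mismatches=1)
strict = seed_and_extend(read_a, read_b, max_mismatches=0)
```

With one mismatch allowed, the 7-base overlap is found; with zero allowed, the same pair is rejected, which is why a few mismatches need to be tolerated.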
What is done in contig assembly instead of comparing whole sequences with each other? (2)
- We break down a single read into even smaller pieces (k-mers)
- Then you start looking for identical overlaps
How does k-mer-based contig assembly differ from classical alignment?
You only look for identical (exact-match) sequences
What does it mean if k-mers from different reads follow the same path?
The reads overlap
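A small sketch of the idea, assuming a simple graph where each k-mer is an edge from its first k-1 bases to its last k-1 bases. The reads, the value of k, and the helper names `kmers` and `build_graph` are made up for illustration; real assemblers are far more involved.

```python
from collections import defaultdict


def kmers(read, k):
    """Return all k-mers of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_graph(reads, k):
    """Map each k-mer edge (first k-1 bases -> last k-1 bases) to the set
    of read indices that contain that k-mer."""
    edges = defaultdict(set)
    for idx, read in enumerate(reads):
        for km in kmers(read, k):
            edges[(km[:-1], km[1:])].add(idx)
    return edges


reads = ["ACGTTGCA", "TTGCATGG"]  # made-up reads
graph = build_graph(reads, k=4)

# Edges used by more than one read: the two reads follow the same path here.
shared = sorted(edge for edge, owners in graph.items() if len(owners) > 1)
```

Here `shared` holds the edges for the k-mers TTGC and TGCA, which both reads contain, so the reads overlap on TTGCA. Only exact k-mer matches are used; no mismatch counting is needed.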
What are 2 common ways that paths may diverge (polymorphism)?
- Single nucleotide differences cause “bubbles” of length k in the graph
- Introns or deletions introduce a shorter path in the graph
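A quick check of the first point, as a sketch only: two made-up alleles that differ at one base are split into k-mers, and the k-mers unique to each allele form the two sides of the bubble. The sequences, k = 3, and the helper name `bubble_branches` are assumptions for the example; the intron/deletion case is not covered.

```python
def kmers(seq, k):
    """Return all k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def bubble_branches(allele_a, allele_b, k):
    """Return the k-mers found only in allele_a and only in allele_b.
    These are the two diverging paths (the two sides of the bubble)."""
    set_a, set_b = set(kmers(allele_a, k)), set(kmers(allele_b, k))
    only_a = [km for km in kmers(allele_a, k) if km not in set_b]
    only_b = [km for km in kmers(allele_b, k) if km not in set_a]
    return only_a, only_b


ref = "GATTCAGCTA"  # made-up sequence
alt = "GATTGAGCTA"  # same sequence with one base changed (C -> G)
branch_ref, branch_alt = bubble_branches(ref, alt, k=3)
```

Each branch contains exactly k = 3 k-mers (TTC, TCA, CAG versus TTG, TGA, GAG), because a single base is covered by k consecutive k-mers. The paths split after ATT and rejoin at AGC.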
What are 4 advantages of contig assembly?
- No reference genome needed
  - rely on the data you already have
- Identifies novel sequences not present in the reference
- Can identify genomic DNA/transcripts from exogenous sources (not from our organism)
  - bacteria, viruses, etc.
  - have to be careful with contamination
- For RNA sequences, long introns are not a problem
What are 2 disadvantages of contig assembly?
- Computationally intensive
  - needs up to a terabyte of memory
- Creates smaller contigs: many gaps in assembly
  - needs much higher sequencing depth
  - for genome sequencing: needs complementation with longer reads
  - for RNA sequencing: many split transcripts
Promoter
A site on DNA to which the enzyme RNA polymerase can bind to initiate the transcription of DNA into RNA