Week 22: (B) Putting the Genome Together Flashcards
What are the repetitive sequences?
short, noncoding sequences that are repeated hundreds of times in tandem
Particularly in the centromere
What are transposons?
jumping genes
remnants of ancestral viral sequences
Mobile genetic elements – sequences of a few kb that can move about the genome. Thousands of copies in eukaryotes
What part of the sequence creates a problem when we try and put our genomes together?
repetitive sequences
short reads make it hard to overcome repeat regions
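A toy illustration of why short reads struggle with repeats, assuming exact string matching on a made-up genome (real aligners and assemblers are far more involved). The names `find_all`, `genome`, `short_read` and `long_read` are hypothetical. A read that sits entirely inside the repeat matches in two places, while a read long enough to reach unique flanking sequence matches in only one.

```python
def find_all(genome, read):
    """Return every start position where read matches genome exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)
    return positions


# Made-up genome containing the same 10 bp repeat twice.
repeat = "ACGTACGTAC"
genome = "TTGA" + repeat + "GGCCT" + repeat + "CATG"

short_read = "GTACGT"               # lies entirely inside the repeat
long_read = "GA" + repeat + "GG"    # spans the repeat plus unique flanks

print(find_all(genome, short_read))  # two candidate positions: ambiguous
print(find_all(genome, long_read))   # one position: placed uniquely
```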
What is a contig?
A ‘contiguous’ (continuous) consensus sequence from an assembly
What is a scaffold?
A series of contigs where we have additional information to place them together in the right order and orientation, but the sequence between the contigs is not complete
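A minimal sketch of the idea, assuming each contig comes with an orientation and an estimated gap to the next contig. The function names and the tuple layout are hypothetical; writing the unknown sequence as runs of N is a common convention. Contigs on the reverse strand are reverse-complemented before joining.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA sequence (N stays N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def build_scaffold(placements):
    """Join contigs in order, padding the unknown gaps with N.

    placements is a list of (sequence, orientation, gap_after) tuples,
    where orientation is "+" or "-" and gap_after is the estimated
    number of unsequenced bases before the next contig.
    """
    parts = []
    for seq, orientation, gap_after in placements:
        parts.append(seq if orientation == "+" else reverse_complement(seq))
        parts.append("N" * gap_after)
    return "".join(parts)


# Two made-up contigs: the second is on the reverse strand, 3 bp gap between.
print(build_scaffold([("ACGT", "+", 3), ("GGA", "-", 0)]))
```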
What is an assembly? (genome assembly)
The set of scaffolds for one genome.
What is an N50?
The length of the contig/scaffold such that 50% of the assembled bases are in contigs/scaffolds of that length or longer.
A length-weighted median: the median is measured in terms of the total assembled bases, not the number of contigs.
Can be used to describe how contiguous an assembly is
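A short sketch of the calculation, assuming the input is simply a list of contig (or scaffold) lengths. The function name `n50` and the example lengths are made up. Sort from longest to shortest, add lengths up, and report the length at which the running total first reaches half of the assembled bases.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig where at least half the bases are covered.
        if 2 * running >= total:
            return length


contigs = [80, 70, 50, 40, 30, 20, 10]  # made-up lengths, total 300
print(n50(contigs))  # 80 + 70 = 150 is half of 300, so N50 is 70
```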
What is coverage?
number of reads covering any one position on average
What is read length?
length of read
What is overlap?
number of bases overlapping
Number of bases used to join one read to another
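A toy sketch of an overlap between two reads, assuming exact matches only: the longest suffix of one read that equals a prefix of the next. The names `overlap`, `merge` and `min_length` are hypothetical, and real assemblers tolerate sequencing errors, which this does not.

```python
def overlap(a, b, min_length=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no exact overlap of at least min_length bases exists.
    min_length is assumed to be 1 or more.
    """
    longest_possible = min(len(a), len(b))
    for n in range(longest_possible, min_length - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def merge(a, b, min_length=3):
    """Join two reads through their overlap, or return None if none is found."""
    n = overlap(a, b, min_length)
    return a + b[n:] if n else None


print(overlap("ATGCGTAC", "GTACGGA"))  # the shared bases are GTAC
print(merge("ATGCGTAC", "GTACGGA"))
```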
How do you calculate coverage?
the total number of bp in all the reads you sequence, divided by the total genome length
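As a worked example of that division, assuming all reads have the same length (the function name `coverage` and the numbers are made up):

```python
def coverage(num_reads, read_length, genome_length):
    """Average coverage: total sequenced bases divided by genome length."""
    return num_reads * read_length / genome_length


# 1,000,000 reads of 150 bp over a 5,000,000 bp genome.
print(coverage(1_000_000, 150, 5_000_000))  # 150,000,000 / 5,000,000 = 30.0
```

This is an average: individual positions can be covered by more or fewer reads.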
What is a read?
an inferred sequence of base pairs (or base pair probabilities) corresponding to all or part of a single DNA fragment.
one sequence
How do we overcome repetitive regions? solution 1
need longer reads to span repeat regions. Illumina reads reach up to ~300 bp; Sanger reads are longer, up to ~1000 bp
How long can repeats span?
10 bases to tens of thousands
How can we reduce the number of repeats we have to deal with?
sequencing smaller chunks
The repeat may occur only once in the BAC but many times in the genome
How do we overcome repetitive regions? solution 2
getting the sequence from the end of long fragments
even though we don’t know what’s in the middle
> If we know how long that fragment is, we know how far apart those 2 sequences are
> paired-end reads
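A small sketch of the arithmetic behind this, assuming both reads have the same length and the fragment length is known exactly (in practice it is an estimate with some spread). The name `inner_distance` is hypothetical.

```python
def inner_distance(fragment_length, read_length):
    """Unsequenced bases between the two end reads of one fragment.

    A negative value means the two reads overlap in the middle.
    """
    return fragment_length - 2 * read_length


# A 500 bp fragment read 150 bp from each end leaves 200 bp unread.
print(inner_distance(500, 150))
# A 3000 bp fragment can bridge a repeat of up to about 2700 bp.
print(inner_distance(3000, 150))
```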
How do we sequence the end of a fragment?
sequence each end with a different primer
What is paired-end sequencing or mate-pair sequencing?
When we sequence each end of a long fragment