Shane - Lecture 3 Flashcards
How many genes do humans have?
22,000 genes
How long did it take to sequence the full human genome?
About 10 years
How much of your genetic material is the exact same as a random stranger?
99% of it is identical
Why did it take so long to sequence the human genome?
Because we have 3 billion base pairs but only 22,000 genes
What is computational gene prediction?
Identifying which genes lie on a sequence of DNA, i.e. which regions of the uncharacterised sequence code for proteins
What information can be found via computational gene prediction?
(6)
Which regions code for protein
Which DNA strand encodes the gene
Which reading frame is used
Where does the gene start and end
Where are the exon-intron boundaries in eukaryotes
Where are the regulatory sequences for that gene
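Worked example (not from the lecture): a minimal Python sketch of how three of these answers (strand, reading frame, start and end) could be read off a naive open reading frame scan. It assumes a gene is simply an ATG followed in frame by a stop codon, ignores introns and regulatory sequences, and uses hypothetical helper names (find_orfs, reverse_complement) chosen for illustration only.

```python
# Naive six-frame ORF scan: an illustrative sketch, not a real gene finder.
# Assumption: an ORF is ATG ... in-frame stop codon (TAA, TAG or TGA).
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """Return a list of (strand, frame, start, end) tuples.

    start and end are 0-based offsets on the strand being read;
    end is exclusive and includes the stop codon.
    min_codons counts the codons before the stop, ATG included.
    """
    seq = seq.upper()
    orfs = []
    # Scan the given strand ("+") and its reverse complement ("-").
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # Each strand can be read in three frames.
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ATG in this frame
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGGG"):
        print(orf)
```

On the toy sequence CCATGAAATTTTAGGG the scan reports one candidate on the + strand in frame 2, running from offset 2 to offset 14. Real gene finders add statistical models on top of a scan like this, which is one reason very short genes and overlapping ORFs are hard.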
What often acts as the start codon?
ATG
What are the benefits of gene finding on prokaryotes?
(3)
Small genomes
High coding density
No introns
What is the gene-level accuracy of gene finding in prokaryotes?
99%
What are the characteristics of eukaryotic genes?
Large genomes
Low coding density
Intron/exon structure
What is the gene-level accuracy of gene finding in eukaryotes?
About 50% accuracy
What are the problems associated with gene finding on prokaryotes?
(3)
Overlapping open reading frames
Very short genes: the protein might be only a few dozen amino acids long
Finding transcription start sites (TSS) and promoters
What is a TSS?
The point at which RNA polymerase starts transcribing
What are the four ways we can predict the location of genes in genomic sequences?
Searching by signal
Searching by content
Similarity-based methods
Comparative genomics
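Worked example (not from the lecture): a rough Python sketch contrasting the first two approaches. Searching by signal looks for specific sequence motifs, shown here as exact matches to ATG. Searching by content looks at statistical properties of a region, shown here as GC fraction in sliding windows. The choice of motif, the use of GC fraction as the content measure, and the function names are all assumptions made for illustration.

```python
def find_signal(seq, motif="ATG"):
    """Search by signal: return every 0-based position where motif occurs."""
    seq = seq.upper()
    n = len(motif)
    return [i for i in range(len(seq) - n + 1) if seq[i:i + n] == motif]


def gc_content_windows(seq, window=6, step=3):
    """Search by content: return (start, GC fraction) for each window.

    GC fraction stands in for richer content statistics such as codon usage.
    """
    seq = seq.upper()
    results = []
    for i in range(0, len(seq) - window + 1, step):
        w = seq[i:i + window]
        gc = (w.count("G") + w.count("C")) / window
        results.append((i, gc))
    return results


if __name__ == "__main__":
    print(find_signal("ATGCCATGA"))
    print(gc_content_windows("ATATGCGC", window=4, step=4))
```

The signal search answers "where is this motif?", while the content search answers "does this stretch look like coding sequence overall?". Practical gene finders usually combine both.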