Lecture 02 Flashcards
Useful features when identifying genes
a. What does the 5′ exon start with? (1) What is it preceded by? (1) What is it free of? (1)
b. Describe internal exons (3)
c. Describe 3’ exon(3)
d. What do all coding regions have?(3)
a. 5′ exon starts with a TSS (transcription start site); preceded by a core promoter site (e.g., TATA box at roughly -30 bp); free of in-frame stop codons and ends immediately before a GT splice signal – rarely, an exon occurs before the 5′ exon containing the ATG
(also Kozak sequence: consensus ACCAUGG)
b. internal exons are also free of in-frame stop codons; begin immediately
after an AG and end immediately before GT splice signals
c. 3′ exon starts immediately after an AG and ends with a stop codon; followed by a polyadenylation signal sequence – rarely, an exon occurs after the 3′ exon
containing the stop codon
d. all coding regions have non-random sequence characteristics (codon usage); hexanucleotides can best distinguish coding from non-coding regions; using a
set of known genes from a (model) organism as a training set, pattern recognition
programs can be tuned to particular genomes
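As a rough illustration of item d, hexanucleotide frequencies from a training set can be turned into a simple log-odds score. This is a minimal sketch, not the method of any particular gene finder: the function names, the pseudocount, and the idea of scoring by summed log ratios are assumptions made for this example.

```python
import math
from collections import Counter

N_HEXAMERS = 4 ** 6  # 4096 possible hexanucleotides

def hexamer_counts(seq):
    """Count overlapping hexanucleotides (6-mers) in a DNA string."""
    seq = seq.upper()
    return Counter(seq[i:i + 6] for i in range(len(seq) - 5))

def coding_log_odds(seq, coding_counts, noncoding_counts, pseudocount=1.0):
    """Sum log(P_coding / P_noncoding) over the hexamers of seq.

    A positive score means the hexamers of seq look more like the coding
    training set than the non-coding one. The pseudocount (an assumption of
    this sketch) avoids division by zero for unseen hexamers.
    """
    n_coding = sum(coding_counts.values())
    n_noncoding = sum(noncoding_counts.values())
    score = 0.0
    for hexamer, k in hexamer_counts(seq).items():
        p_c = (coding_counts[hexamer] + pseudocount) / (n_coding + pseudocount * N_HEXAMERS)
        p_n = (noncoding_counts[hexamer] + pseudocount) / (n_noncoding + pseudocount * N_HEXAMERS)
        score += k * math.log(p_c / p_n)
    return score

# Toy training data (hypothetical, far too small for real use)
coding_training = hexamer_counts("ATGGCC" * 10)
noncoding_training = hexamer_counts("AT" * 30)
print(coding_log_odds("ATGGCCATGGCC", coding_training, noncoding_training) > 0)
```

In practice the two count tables would come from known genes and known non-coding regions of the (model) organism, which is what tunes the score to a particular genome.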
Diagram on slides 9 and 10
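The signals in items a to c can be checked mechanically. The sketch below tests whether a candidate internal exon begins right after an AG, ends right before a GT, and has no in-frame stop codon. The function names and the `frame` parameter are hypothetical; in a real gene the reading frame of an internal exon depends on the upstream exons, so it is passed in here as an assumption.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def has_in_frame_stop(exon, frame=0):
    """True if a stop codon occurs in the given reading frame (0, 1 or 2)."""
    exon = exon.upper()
    return any(exon[i:i + 3] in STOP_CODONS
               for i in range(frame, len(exon) - 2, 3))

def looks_like_internal_exon(genomic, start, end, frame=0):
    """Check the exon genomic[start:end] (0-based, end exclusive).

    Requires AG immediately upstream, GT immediately downstream,
    and no stop codon in the assumed reading frame.
    """
    genomic = genomic.upper()
    acceptor_ok = start >= 2 and genomic[start - 2:start] == "AG"
    donor_ok = genomic[end:end + 2] == "GT"
    return acceptor_ok and donor_ok and not has_in_frame_stop(genomic[start:end], frame)

# Toy sequence (hypothetical): intron end "ccag", exon "GCTGCAGAA", intron start "gtaa"
toy = "ccag" + "GCTGCAGAA" + "gtaa"
print(looks_like_internal_exon(toy, 4, 13))
```

A 5′ or 3′ exon would swap one of the splice-signal checks for a start-codon or stop-codon check, respectively.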
Scope and applications of genome
sequencing projects
a. How is the human genome similar to other genomes?(1)
b. List 2 reasons for sequencing non-human genomes(2)
c. List 3 other reasons for sequencing non-human genomes (3)
d. What are the clinical applications in humans?(1)
a. Contents and dynamic aspects of the human genome are similar in general
features to those of other genomes
b. Many reasons for sequencing non-human genomes – see Table 1.2
b.1. reveal and illuminate the processes of evolution
b.2. help understand the functions of different regions in the human genome
b.2.1. if evolution conserves something, then it is essential (the opposite is also true)
c. Other reasons for sequencing non-human genomes
c.1. application to human welfare – e.g., genomes of pathogens exhibiting (or threatening to exhibit) antibiotic resistance
c.2. improving plants and (domesticated) animals; biotech applications (e.g. bioethanol); conservation of endangered species
c.3. metagenomics – ocean water, soil samples and even the human gut/body (i.e. microbiome)
d. Clinical applications in humans – genetic testing for diseases; genealogy; law
enforcement; mutation discovery (e.g. cancer)
Scope and applications of genome
sequencing projects
a. Discuss variations within and between populations
1. What does a genome belong to?
2. Discuss comparisons
3. Discuss in humans
4. Discuss in other species
a. Variations within and between populations
a.1. a genome belongs to an individual organism; holistically, it’s essential to relate
genomes to each other
a.2. comparisons: within and among species; within a species, within and among
populations
a.3. in humans: sequence variation has applications in medicine, anthropology, migration
studies, genealogy, personal ID (determining parentage or law enforcement)
a.4. other species: species history, incl. domestication and tracing superior alleles in
source populations for breeding purposes
Genetic diversity estimated
a. How much does the genomic sequence of 2 people differ?
b. If 0.1% of positions differ, how many variable sites is that in a 3.2 Gbp genome?
c. What does comparative genome analysis permit?
Genetic diversity estimated
a. any two people – except for identical siblings – have genomic sequences that
differ at approximately 0.1% of the positions
b. Question: if frequency of mutations = 0.1%, then this equates to how many
variable sites in the genome? Suppose human genome = 3.2 Gbp
c. comparative (human) genome analysis permits distinction between random components of this variation and those that systematically characterize
different populations
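The question in item b is a one-line calculation. The sketch below uses only the two numbers given on the card (0.1% and 3.2 Gbp); the variable names are made up for this example. Integer arithmetic is used so the result is exact.

```python
GENOME_SIZE_BP = 3_200_000_000  # 3.2 Gbp, as stated in the question
POSITIONS_PER_DIFFERENCE = 1000  # 0.1% means 1 differing position per 1,000

variable_sites = GENOME_SIZE_BP // POSITIONS_PER_DIFFERENCE
print(f"{variable_sites:,} variable sites")  # prints "3,200,000 variable sites"
```

So two unrelated people are expected to differ at roughly 3.2 million positions.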
Scope and applications of genome
sequencing projects
a. Are mutations consistent with a healthy life?(1)
b. Is loss of some proteins harmful?(1) Discuss
c. What has evolution optimized?(2)
Mutations and disease
a. many mutations (even when not synonymous) are consistent with a healthy life
b. surprisingly, loss of some proteins is innocuous, e.g. mice lacking myoglobin; elsewhere, species-wide loss of biosynthetic enzymes is not considered a
disease (e.g., animals that cannot synthesize vitamin C need it in the diet as an essential nutrient; otherwise the disease scurvy
occurs)
c. nevertheless, evolution has largely optimized proteins for their roles in healthy
organisms; as such, amino-acid-changing (i.e., non-synonymous) mutations are
often deleterious; see OMIM (Online Mendelian Inheritance in Man) or OMIA (Online Mendelian Inheritance in Animals)
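The synonymous versus non-synonymous distinction used in items a and c can be made concrete with the standard genetic code. This is a minimal sketch: the function name and the three labels are choices made for this example, and it only compares two single codons.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_codon_change(ref_codon, alt_codon):
    """Label a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"  # non-synonymous change that creates a stop codon
    return "non-synonymous"

print(classify_codon_change("GAA", "GAG"))  # Glu -> Glu
print(classify_codon_change("GAG", "GTG"))  # Glu -> Val
```

Whether a given non-synonymous change is actually harmful cannot be read from the codon table alone; that is what catalogues such as OMIM and OMIA record.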
Scope and applications of genome
sequencing projects
a. Do all people have the same DNA sequences?(1)
b. How much do unrelated individuals differ?(1)
c. What do mutations in genomes include?(2)
d. What are the most common mutations?(2)
Single-nucleotide polymorphisms (SNPs, pronounced ‘SNiPs’)
a. except for identical siblings, all people have unique DNA sequences
b. unrelated individuals: ∼0.1% at genomic level
c. mutations in genomes include substitutions, insertions and deletions (‘indels’), and translocations
d. most common mutations: base substitutions or SNPs; short indels, but also
variation in repeat number for repetitive sequences and gene copy number variation (CNV)
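To see how substitutions and indels show up in practice, the sketch below walks over two already-aligned sequences and labels each differing column. The function name and the tuple layout are hypothetical; the input is assumed to be a pairwise alignment with `-` marking gaps, and each gap column is reported separately rather than merged into multi-base indels.

```python
def classify_differences(ref_aln, alt_aln):
    """List (column, kind, ref_base, alt_base) for each differing alignment column."""
    if len(ref_aln) != len(alt_aln):
        raise ValueError("aligned sequences must have the same length")
    variants = []
    for col, (r, a) in enumerate(zip(ref_aln.upper(), alt_aln.upper())):
        if r == a:
            continue
        if r == "-":
            kind = "insertion"      # base present only in the second sequence
        elif a == "-":
            kind = "deletion"       # base present only in the reference
        else:
            kind = "substitution"   # single-base difference (SNP candidate)
        variants.append((col, kind, r, a))
    return variants

# Toy alignment (hypothetical)
for variant in classify_differences("ACGT-ACGT", "ACCTTAC-T"):
    print(variant)
```

Translocations and copy number variation involve large rearrangements and are not visible in a simple column-by-column comparison like this one.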