Genomes Flashcards
What is a DNA promoter sequence?
A sequence of DNA to which the transcription apparatus binds to initiate transcription. It indicates the direction of transcription and which of the 2 DNA strands is read as the template.
What are introns and exons?
Eukaryotic genes have stretches of coding and non-coding nucleotides. The coding stretches are called exons and the non-coding stretches are called introns.
How are introns and exons different in more complex organisms?
In eukaryotic genomes, the size and number of introns appear to increase with organism complexity.
What happens to percentage of genome that is non-coding with increasing complexity of the organism?
It increases
What makes up the repetitive DNA sequences?
Repetitive DNA that includes transposable elements and related sequences: 44%
Repetitive DNA unrelated to transposable elements: 14%
What are pseudogenes?
Non-functional copies of genes. They arise by duplication of a normally functioning gene and have since accumulated too many deleterious mutations to work.
How many genes are there in the human genome?
About 19,000 - 20,000
What is the largest gene in the human genome?
Dystrophin, which is 2.4 Mb long
How are proteins arranged?
Into families
How are relationships between proteins determined?
By comparing amino acid sequences; more similar sequences cluster on the same branches of the evolutionary tree. Protein function can be reflected in the clustering.
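The comparison step can be illustrated with a minimal sketch. It assumes the sequences are already aligned to the same length and uses simple percent identity as the similarity measure; real analyses use alignment tools and substitution matrices. The sequences and names (protA, protB, protC) are hypothetical toy data, not real proteins.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned, non-gap positions that carry the same residue."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Only count columns where neither sequence has a gap ("-").
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


def closest_pair(seqs: dict) -> tuple:
    """Return the pair of names whose sequences are most similar."""
    return max(
        combinations(sorted(seqs), 2),
        key=lambda pair: percent_identity(seqs[pair[0]], seqs[pair[1]]),
    )


# Hypothetical toy sequences, pre-aligned, for illustration only.
toy = {
    "protA": "MKTAYIAKQR",
    "protB": "MKTAYLAKQR",
    "protC": "MSTPHIVKER",
}

if __name__ == "__main__":
    print(closest_pair(toy))
```

The pair with the highest identity would sit on neighbouring branches in a tree built from these scores.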
What are the 2 classes of homologous molecules?
Paralogs: Homologous sequences that are present within the same genome but often differ in biochemical function.
Orthologs: Homologous sequences that are present in different species but often retain similar function through evolution.
What does homology tell us?
It can give us more information about evolutionary history and function.
Homology between a protein of unknown function and one of known function gives information about the unknown protein’s function.
What are repeats?
Repeated units of DNA that are 1 - 200 bp long
What are transposable elements?
Pieces of DNA that have the ability to jump from place to place in the genome.
Why do pseudogenes exist?
They can exist because a single functioning copy of a gene is sufficient
What parts of normal genes do pseudogenes lack? What does this tell us?
They lack a promoter and introns.
This suggests that they derive from an mRNA copy that was reverse transcribed into cDNA and reintroduced into the genome.
Which enzyme is believed to have made pseudogenes?
Reverse transcriptase from transposons
How are processed pseudogenes believed to have been created?
Transcription and processing of RNA.
Reverse transcription to cDNA
Integration of cDNA into chromosomal DNA
Second-strand synthesis and DNA repair
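The four steps above can be sketched as string operations. This is a toy model under stated assumptions: the gene is a hypothetical list of labelled segments, the promoter is treated as untranscribed, and integration is a plain string insertion at a chosen position. All sequences and function names are made up for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def transcribe_and_splice(segments):
    """Step 1: keep exons only (promoter untranscribed, introns spliced out)."""
    exons = "".join(seq for kind, seq in segments if kind == "exon")
    return exons.replace("T", "U")  # mRNA uses U instead of T


def reverse_transcribe(mrna: str) -> str:
    """Step 2: first-strand cDNA is the DNA reverse complement of the mRNA."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


def second_strand(cdna: str) -> str:
    """Step 4: the second strand is the reverse complement of the first strand."""
    return "".join(COMPLEMENT[base] for base in reversed(cdna))


def integrate(genome: str, insert: str, position: int) -> str:
    """Step 3 (simplified): insert the copy into chromosomal DNA at a position."""
    return genome[:position] + insert + genome[position:]


# Hypothetical toy gene: promoter, exon, intron, exon.
toy_gene = [
    ("promoter", "TATAAA"),
    ("exon", "ATGGCC"),
    ("intron", "GTAAGT"),
    ("exon", "GATTGA"),
]

if __name__ == "__main__":
    mrna = transcribe_and_splice(toy_gene)
    copy = second_strand(reverse_transcribe(mrna))
    print(integrate("CCCCGGGG", copy, 4))
```

The integrated copy contains only the joined exons, which is why a pseudogene made this way has no promoter and no introns.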
How are inactive pseudogenes named?
A Ψ symbol is put before the name.
Where are alpha globin genes located for haemoglobin?
Chromosome 16
Where are beta globin genes located for haemoglobin?
Chromosome 11
Why are alpha and beta chains of haemoglobin on separate chromosomes?
They started off together as duplicate genes and then must have been transposed to different locations in the genome.
What is the function of pseudogenes thought to be?
Regulation of gene expression
What are minisatellites?
Repeat sequences 7 - 100 bp long that are highly variable in length.
They can be in coding or non-coding regions and can affect gene regulation.
How many minisatellites are in the human genome?
> 1000
What is another name for minisatellites?
VNTRs
What are microsatellites?
Tandem repeats of 1 - 6 bp
What is another name for microsatellites?
Simple Sequence Repeats
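A minimal sketch of how tandem repeats with a 1 - 6 bp unit could be located in a sequence, using a regular expression with a backreference. The function name, the min_repeats threshold and the example sequence are assumptions for illustration; dedicated repeat-finding tools handle imperfect repeats, which this does not.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 3):
    """Return (start, unit, repeat_count) for perfect tandem repeats of a 1 - 6 bp unit."""
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    # ([ACGT]{1,6}?) lazily captures the shortest unit; \1{n,} requires
    # that same unit to follow at least (min_repeats - 1) more times.
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


if __name__ == "__main__":
    # Hypothetical example sequence containing a (CA)n repeat.
    print(find_microsatellites("TTCACACACACAGG"))
```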
What is replication slippage?
During replication, the template strand and the newly synthesized strand can slide apart and then come together again in a misaligned way within a repeat region, so that some repeat units are replicated again.
What is the result of replication slippage?
Expansion and contraction of repeat number can occur during DNA replication
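The effect on repeat number can be shown with a deliberately simplified sketch. It assumes a perfect repeat tract and models slippage only as a change in the number of units copied: a positive value stands for the new strand looping out so units are copied again, a negative value for the template looping out so units are skipped. The function name and parameters are hypothetical.

```python
def replicate_with_slippage(template: str, unit: str, slipped_units: int) -> str:
    """Return the repeat tract of the newly made strand after slippage.

    slipped_units > 0: units are copied again (expansion).
    slipped_units < 0: units are skipped (contraction).
    The result is written in the template's orientation for readability.
    """
    count = len(template) // len(unit)
    if template != unit * count:
        raise ValueError("template must be a perfect repeat of the unit")
    new_count = count + slipped_units
    if new_count < 0:
        raise ValueError("cannot skip more units than the tract contains")
    return unit * new_count


if __name__ == "__main__":
    tract = "CAG" * 5
    print(len(replicate_with_slippage(tract, "CAG", 2)) // 3)   # expanded tract
    print(len(replicate_with_slippage(tract, "CAG", -1)) // 3)  # contracted tract
```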
What are some diseases that are caused by trinucleotide repeat expansion?
Huntington’s disease
Fragile X syndrome
Myotonic dystrophy
How many diseases are caused by mutations involving CGG, GCC, GAA, CTG, and CAG?
14 diseases (these mutations are found in coding and non-coding regions)
What causes Huntington’s disease?
Expansion of CAG (glutamine codon) repeats in the coding region of the huntingtin gene.
CAG repeats are translated into a polyglutamine tract (6 - 35 repeats cause no disease; 36 - 121 repeats cause Huntington’s disease; >70 repeats cause juvenile-onset Huntington’s disease).
What is Huntington’s disease characterized by?
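The repeat ranges on this card can be turned into a small lookup. This is a sketch for revision purposes only, not a diagnostic tool: the thresholds are the ones stated above, the function names are hypothetical, and the run counter does not check reading frame.

```python
import re


def longest_cag_run(cds: str) -> int:
    """Length, in repeat units, of the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", cds.upper())
    return max((len(run) // 3 for run in runs), default=0)


def polyglutamine_tract(repeats: int) -> str:
    """Each CAG codon is translated as glutamine (Q)."""
    return "Q" * repeats


def classify(repeats: int) -> str:
    """Apply the ranges from the card: 6 - 35, 36 - 121, and >70 for juvenile onset."""
    if 6 <= repeats <= 35:
        return "no disease"
    if 36 <= repeats <= 121:
        if repeats > 70:
            return "juvenile-onset Huntington's disease"
        return "Huntington's disease"
    return "outside the ranges given on the card"


if __name__ == "__main__":
    # Hypothetical coding sequence with 40 CAG repeats.
    toy_cds = "ATG" + "CAG" * 40 + "TAA"
    n = longest_cag_run(toy_cds)
    print(n, classify(n))
```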
Midlife onset of dementia followed by death
What is a polyglutamine tract?
A sequence of glutamine residues produced due to the repeats.
What are transposable elements?
DNA sequence that can change its position in the genome.