MT Flashcards
What year was the Thomas et al. (?) paper published?
2003.
What was the main objective of the Thomas et al. (2003) paper?
To compare gene contigs from multiple species in order to find multispecies conserved sequences (MCSs).
What does “BAC” in “BAC library” stand for?
Bacterial artificial chromosome.
What were the main findings of the Thomas et al. (2003) study into MCSs? Focus on human, chicken, and pufferfish.
The gene content and order were conserved in tetrapods, but the chicken seq. was 1/4 the length of the human seq. and the pufferfish seq. was even shorter, though they both contained all the same genes.
Are orthologous regions in genomes all the same length?
No, they can vary while still remaining orthologous (conserved seq.s).
What are paralogous genes?
Homologous genes in the same species that resulted from a duplication event.
What are orthologous genes?
Homologous genes in different species which resulted from speciation.
What are homologous genes?
Genes which derive from the same ancestral sequence.
How did paralogous genes in humans and zebrafish complicate the Thomas et al. (2003) study?
An orthologous seq. in zebrafish overlapped with an unexpected gene in humans. That gene turned out to be paralogous to the CFTR region (that they were actually looking at).
What type of alignment allows you to find more distantly related seq.s?
Discontiguous MegaBLAST.
Rodents are more closely related to humans than the ungulates are. Why then does the rodent seq. not align better with the human seq.?
Rodent seq.s have undergone neutral mutation at a more rapid rate than human or ungulate seq.s have.
What is a transposable element (TE)?
A seq. in the genome that copies itself and then reinserts that copy somewhere else in the genome.
How can analysis of transposable elements (TEs) be used to determine relationships between species?
If 2 species have the same TE inserted at the same place, it’s unlikely that they got it independently. So they must both have inherited it from a common ancestor. More TEs in common = more closely related.
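The "more TEs in common = more closely related" reasoning can be sketched as a simple count of shared insertions per species pair. This is only a toy illustration: the species names and locus labels are made up, and real analyses score presence/absence at specific insertion sites.

```python
from itertools import combinations

# Hypothetical presence/absence data: the TE insertion loci each species carries.
# Species names and locus labels are invented for illustration only.
te_insertions = {
    "species_A": {"locus1", "locus2", "locus3"},
    "species_B": {"locus1", "locus2"},
    "species_C": {"locus1"},
}

def shared_te_counts(insertions):
    """Count the TE insertions shared by each pair of species."""
    return {
        (a, b): len(insertions[a] & insertions[b])
        for a, b in combinations(sorted(insertions), 2)
    }

counts = shared_te_counts(te_insertions)
# The pair sharing the most insertions is inferred to be the most closely related.
closest_pair = max(counts, key=counts.get)
print(closest_pair, counts[closest_pair])
```

Here species_A and species_B share the most insertions, so under this reasoning they would be grouped together before species_C.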
How did Nikaido et al. (2001) counter the common belief and show that sperm whales actually group with the toothed whales rather than the baleen whales?
By looking at the TEs that the whales had in common with these other groups!
What can phylogenetic tree branch lengths tell us about seq. divergence?
The longer the branch, the more change that seq. has undergone since the ancestral speciation event.
What accounts for the length variations between the compared seq.s from different species?
The presence or absence of interspersed repeats.
What was the first method Thomas et al. (2003) used to confirm which sequences are more similar than would be expected by chance (MCSs)?
Compared 25 bp regions across the seq. against the rate of neutral evolution measured at degenerate sites. MCSs were any regions with above-average conservation.
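A rough sketch of the windowed idea, assuming a gap-padded alignment where every sequence has the same length. It scores each non-overlapping 25 bp window by the fraction of fully identical columns and flags windows above the mean. This is a simplification for study purposes, not the scoring used in the paper (it does not model the neutral rate at degenerate sites), and the toy sequences are invented.

```python
def column_conserved(alignment):
    """One flag per column: True if every sequence has the same base and no gap."""
    return [len(set(col)) == 1 and "-" not in col for col in zip(*alignment)]

def window_scores(alignment, window=25):
    """Fraction of conserved columns in each non-overlapping window."""
    flags = column_conserved(alignment)
    return [
        sum(flags[i:i + window]) / window
        for i in range(0, len(flags) - window + 1, window)
    ]

def above_average_windows(alignment, window=25):
    """Indices of windows whose conservation exceeds the mean over all windows."""
    scores = window_scores(alignment, window)
    if not scores:
        return []
    mean = sum(scores) / len(scores)
    return [i for i, score in enumerate(scores) if score > mean]

# Toy alignment (invented): first 25 columns identical, last 25 all different.
toy_alignment = [
    "A" * 50,
    "A" * 25 + "C" * 25,
    "A" * 25 + "G" * 25,
]
toy_scores = window_scores(toy_alignment)
toy_hits = above_average_windows(toy_alignment)
print(toy_scores, toy_hits)
```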
What was the second method Thomas et al. (2003) used to confirm which sequences are more similar than would be expected by chance (MCSs)?
Mapped the entire multi-species alignment onto a tree and picked the regions that required the fewest substitutions (maximum parsimony).
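Counting the minimum number of substitutions on a fixed tree can be sketched with Fitch's algorithm, a standard small-parsimony method. It is used here as an illustration and is not necessarily the exact procedure in the paper. The tree shape, species labels, and sequences below are assumptions made up for the example.

```python
def fitch(tree, states):
    """Return (candidate ancestral states, minimum substitutions) for one column.

    tree: a leaf name (str) or a pair (left, right) of subtrees.
    states: dict mapping each leaf name to its base in this alignment column.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra substitution needed.
        return common, left_cost + right_cost
    # The children disagree: one substitution must have occurred on this split.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_score(tree, alignment):
    """Sum the minimum substitution counts over all alignment columns."""
    names = list(alignment)
    length = len(alignment[names[0]])
    return sum(
        fitch(tree, {name: alignment[name][i] for name in names})[1]
        for i in range(length)
    )

# Hypothetical rooted tree and toy sequences, for illustration only.
toy_tree = (("human", "mouse"), ("chicken", "pufferfish"))
toy_seqs = {
    "human": "ACGT",
    "mouse": "ACGA",
    "chicken": "ACTA",
    "pufferfish": "ACTA",
}
toy_score = parsimony_score(toy_tree, toy_seqs)
print(toy_score)
```

Regions whose columns need few substitutions on the tree are the candidates for conservation; columns 1 and 2 of the toy data need none.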
What did the researchers conclude about changes to the MCSs?
That they were selected against (negative/purifying selection).
What parts of the genome are most likely to be conserved?
Exons and adjacent UTRs responsible for recruiting TFs and acting as regulatory elements.
If a study identifies a bunch of MCSs in introns with no known regulatory function, what should we conclude?
That these are somehow functional, probably involved in splicing and spacing.
What are 2 alternative hypotheses to the MCS theory?
- Some parts of the genome are more/less susceptible to mutation
- DNA repair efficiency varies by genome region
What is an ancestral repeat (AR)? How have these been used in evolutionary biology in the past?
A transposon/TE “fossil” that predates the mammalian radiation. Largely nonfunctional, so it can be used to represent the background rate of neutral evolution.
What did Bejerano et al. (2004) find by comparing the human, mouse, and rat genomes?
That there were regions of the genome (exonic and non-exonic) that are 100% identical in all three species (low odds that happens by chance).
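Finding stretches that are 100% identical in all three species can be sketched as a scan for runs of identical columns in an alignment. The minimum run length is left as a parameter; the toy sequences and the small threshold used below are invented for illustration and are far shorter than anything a real study would report.

```python
def identical_runs(seqs, min_len):
    """Return half-open (start, end) intervals where all aligned seqs are identical.

    seqs: aligned sequences of equal length, with "-" marking gaps.
    min_len: shortest run of identical columns to report.
    """
    runs = []
    start = None
    for i, col in enumerate(zip(*seqs)):
        same = len(set(col)) == 1 and "-" not in col
        if same and start is None:
            start = i  # a new identical run begins here
        elif not same and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    length = min(len(s) for s in seqs)
    if start is not None and length - start >= min_len:
        runs.append((start, length))
    return runs

# Toy aligned sequences (invented), labelled after the three species compared.
human = "ACGTACGTAA"
mouse = "ACGTACGTCA"
rat = "ACGAACGTAA"
toy_runs = identical_runs([human, mouse, rat], min_len=4)
print(toy_runs)
```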
How many of the UCEs identified by Bejerano et al. (2004) corresponded to known coding seq.s?
111 of 481 (<25%!).
What was the function of the partly exonic UCEs Bejerano et al. (2004) identified?
Most encoded RNA binding proteins and proteins which were involved in splicing.
Of the UCEs that Bejerano et al. (2004) identified in introns, where were the majority located in relation to known genes? What about the others?
Most were located within genes that regulate transcription (homeodomain-containing genes), e.g. Hox and Pax.
Others were near annotated genes, and may have a regulatory role.
Where were the 3 longest UCEs that Bejerano et al. (2004) identified located?
In POLA introns. POLA encodes a subunit of DNA polymerase (very important).
Are UCEs present in all species? Elaborate.
Nope, they’re only present in amniotes (so not fish) and rare outside the vertebrates. Something must have happened to make them favourable.
At what rate do UCEs change?
1% per million years.
How are the UCEs in Pax2 and Pax5 related?
Both originated from a duplication in a common ancestor, but each is more closely related to its orthologous UCEs in other species than the two are to each other. They have stopped changing since tetrapods and fish diverged.
If you compare any 2 species with humans the UCEs you find might not all be the same. What does this tell you?
UCEs do occasionally go missing in one group or another. Also, any comparison with rodents is subject to their high rate of change.
What do Stephen et al. (2008) conclude about the origin of UCEs?
That they might have originated during the transition from marine to terrestrial living environments.
What DNA seq. might we expect to find in a rock from Mars?
Only the most conserved, since any organism would have been isolated millions of years ago. So 16S and 23S rRNA.
What are 4 possible explanations for high gene variation between species?
- High mutation rate
- Gene conversion
- Frequency-dependent selection (favours rare alleles)
- Overdominant selection