Genomics and Evolution Flashcards

Question 1

Q

What are the main 2 types of genome?

Answer

A

The nuclear genome and the mitochondrial genome.

Question 2

Q

What is high level and low level in genome organisation?

Answer

A

High level is the chromosomes and low level is all the junk DNA.

Question 3

Q

What is chromosome fusion?

Answer

A

It is the fusing of two chromosomes into one.

Question 4

Q

What are segmental duplications?

Answer

A

They occur when a section of a chromosome is duplicated.

Question 5

Q

What are inversions?

Answer

A

They are where a segment of a chromosome is flipped, reversing the order of the genes within it.

Question 6

Q

What are translocations?

Answer

A

They are where genes move from one chromosome to another.

Question 7

Q

What do pseudoautosomal regions do?

Answer

A

They make sure the sex chromosomes pair and are separated correctly.

Question 8

Q

Sex chromosomes are ____ and what happens after they pair?

Answer

A

Sex chromosomes are homologous and pair, then recombine in male meiosis.

Question 9

Q

What species has an interesting case of chromosomal fusion?

Answer

A

Muntjac deer

Question 10

Q

What happens in male meiosis?

Answer

A

Sex chromosomes form chromosomal chains and then split after meiosis. This chain forms through translocation of autosomal regions, which leads to pairing.

Question 11

Q

What did sex chromosomes originate as?

Answer

A

They originated as a pair of autosomes.

Question 12

Q

How did sex chromosomes arise?

Answer

A

Recombination stopped in a pair of autosomes, in the region where the sex-determining gene arose. From there, the non-recombining regions expanded, creating evolutionary strata on the sex chromosomes.

Question 13

Q

What happens once a region becomes non-recombining?

Answer

A

There is an accumulation of deleterious mutations which leads to it becoming degenerate.

Question 14

Q

Sex chromosomes evolved _________?

Answer

A

They evolved multiple times independently in different groups.

Question 15

Q

The process of degradation isn’t what, and what happens after degradation?

Answer

A

The process isn’t linear and, after degradation, it stays at the “base level”.

Question 16

Q

There is a general tendency to lose what genes?

Answer

A

Genes that become unnecessary.

Question 17

Q

What are 3 examples of gene loss?

Answer

A

The gene loss in the Y chromosome.
The loss of the vitamin C-producing gene multiple times in different lineages.
The loss of teeth in birds and turtles.

Question 18

Q

How often are genes gained and lost?

Answer

A

All of the time.

Question 19

Q

There is a what between genes being lost and gained?

Answer

A

There is a dynamic equilibrium between genes being lost and gained.

Question 20

Q

What are some mechanisms by which new genes arise?

Answer

A

Exon shuffling, gene duplication, retroposition, gene fusion, gene fission, and de novo origination.

Question 21

Q

Many proteins contain what, and how are new proteins with new functions made?

Answer

A

Many proteins contain “borrowed” domains, and old and new domains combine to create new proteins with new functions.

Question 22

Q

What is an example of where exon shuffling was used?

Answer

A

It was used in the origin of the jingwei gene in Drosophila.

Question 23

Q

What happens if a minor splice form does something useful?

Answer

A

It will be selected to increase its abundance in the cell, resulting in the evolution of major alternative gene isoforms.

Question 24

Q

What is the introns early theory?

Answer

A

The theory that introns are ancient and are gradually lost.

Question 25

Q

What is the introns late theory?

Answer

A

The theory that introns evolved in early eukaryotes and keep spreading.

Question 26

Q

What is a common process behind the evolution of new genes?

Answer

A

Evolution by duplication, ranging from single genes to the whole genome.

Question 27

Q

What is the 2R hypothesis?

Answer

A

It is the hypothesis that there were two rounds of genome duplication in vertebrates.

Question 28

Q

What are whole genome duplications more common in?

Question 29

Q

What happens when a gene is duplicated?

Answer

A

Some functional redundancy is created, which reduces purifying selection and allows the copies to accumulate mutations and diverge in function.

Question 30

Q

What is an example of gene duplication and sub-functionalisation?

Answer

A

The evolution of colour vision in primates. The ancestral state is dichromatic, with an S-gene and an L-gene; the L-gene then duplicated, and the copies sub-functionalised by diverging to have different light sensitivities.

Question 31

Q

The size of the genome has little to do with what?

Answer

A

The size of a genome has little to do with the organism’s complexity.

Question 32

Q

What is the C-value paradox?

Answer

A

The observation that, in eukaryotes, a larger genome does not mean a more complex organism.

Question 33

Q

In what does the number of genes and genome size show a pretty good correlation?

Answer

A

Viruses and prokaryotes.

Question 34

Q

How can genome sizes for extinct animals be measured?

Answer

A

The genome sizes are estimated from the size of the cells inside the bones. It is well known that a larger genome correlates with a bigger nucleus, and so a bigger cell. To estimate cell size, researchers measure the size of the pores in the bones.

Question 35

Q

What do transposable elements play a major role in?

Answer

A

Increasing genome size.

Question 36

Q

What is the only way to downsize a genome and what determines its efficiency?

Answer

A

Deletions are the only way to decrease the genome size, and the efficiency of downsizing depends on the frequency and size of the deletions.

Question 37

Q

What is an example of extreme genome reduction?

Answer

A

Buchnera is a mutualistic intracellular symbiont of aphids. Since the evolutionary process started, there has been a massive reduction in genome size, so that only essential genes remain. The genome is now essentially in stasis.

Question 38

Q

What was major in answering many questions regarding human evolution and why was it used?

Answer

A

mtDNA was used as it mutates more frequently than nuclear DNA and does not recombine.

Question 39

Q

Where is human genetic diversity highest?

Answer

A

In Africa.

Question 40

Q

Why might looking at a single locus to explore the migration of humans be misleading?

Answer

A

It may tell a story of spread of an advantageous mutation and not the story of the migrations of humans, meaning it’s important to look at other parts of the genome.

Question 41

Q

Why is the human Y chromosomal tree used for phylogenetic reconstructions?

Answer

A

Because it is paternally inherited and its phylogeny is consistent with that of mtDNA.

Question 42

Q

Why is using gene trees problematic for autosomal markers?

Answer

A

It is problematic due to the recombination.

Question 43

Q

On what basis are principal component analysis plots created?

Answer

A

They are created on the basis of individual genotypes for autosomal markers.

Question 44

Q

Why could human DNA show a lower global mobility in men?

Answer

A

This could reflect the fact that males inherit the land from their father and stay whereas women are married off to other families.

Question 45

Q

What genes are good for going really far back in human history?

Answer

A

Nuclear genes.

Question 46

Q

What is one way to learn about the ancestors of humans?

Answer

A

To use remnants of DNA in ancient skeletal tissues.

Question 47

Q

Where was there hybridisation between Neanderthals and humans?

Answer

A

Only in Europe.

Question 48

Q

What has ancient DNA been used to study?

Answer

A

Ancient DNA has been used to study whether the ancestors of modern humans interbred with Neanderthals and other archaic hominids.

Question 49

Q

How much of our genome is believed to be Neanderthal?

Question 50

Q

What is the oldest genome that’s been sequenced?

Answer

A

A Denisovan genome, from a finger bone and a tooth, which showed that Denisovans are separate from Neanderthals and humans.

Question 51

Q

Closely related species were doing what when meeting?

Answer

A

Hybridising.

Question 52

Q

Different species on different islands in early Asia have suggested what?

Answer

A

That Homo erectus had the ability and skill to travel across open ocean.

Question 53

Q

What does the adaptation of human skin pigmentation do?

Answer

A

It has a strong positive correlation with UV intensity, so dark skin has more protection against UV, which reduces the photolysis of folic acid. Light skin leads to more production of Vitamin D.

Question 54

Q

Selection leaves what in DNA polymorphisms?

Answer

A

Distinct footprints.

Question 55

Q

What is a result of the spread and fixation of an adaptive allele?

Answer

A

The loss of genetic variation around the target of selection.

Question 56

Q

What is an example of a footprint of recent adaptive evolution?

Answer

A

The reduced genetic diversity around the gene involved in adaptation to digesting milk in adult humans (lactase persistence). The distribution of lactase persistence correlates with historical centres of dairy farming.

Question 57

Q

What happens when there is adaptation to contrasting conditions?

Answer

A

There is spread and fixation of different locally adaptive alleles in the population, which creates the signal of population differentiation at the genes under selection.

Question 58

Q

What is a good example of adaptation to local conditions?

Answer

A

An example is adaptation to life at high altitude in Tibetans, where interspecies hybridisation was advantageous: they can breathe more easily at high altitudes due to introgression of Denisovan-like DNA.

Question 59

Q

What is population genetics?

Answer

A

The study of genetic diversity in biological populations and of the processes that cause genetic diversity to change.

Question 60

Q

Genetic diversity is synonymous with what?

Answer

A

Intra-specific diversity.

Question 61

Q

What is the major process that differentiates intra- and inter-specific diversity?

Answer

A

Gene flow.

Question 62

Q

When did population genetics arise and from where?

Answer

A

Population genetics arose in the 1930s/1940s from the Modern synthesis of Mendelian Genetics and Darwinian Natural Selection.

Question 63

Q

Population genetics ultimately underpins what?

Answer

A

It underpins all phenomena in evolutionary biology.

Question 64

Q

What is a phenotype?

Answer

A

It is any observable or quantifiable characteristic of organisms that varies within or among populations.

Answer 63

A

Genome regions that are useful for measuring and investigating genetic variation in populations.

Answer 64

A

If more than one allele is commonly found.

Answer 65

A

The development of genetics, from proteins to DNA.

Answer 66

A

The allelic make-up of an individual.

Answer 67

A

DNA sequence variation.

Answer 68

A

You can count the number of distinct sequences and the proportion of variable sites. You can also measure the average pairwise difference.

Answer 69

A

The number of differences between each pair of sequences.
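As a rough illustration of the pairwise-difference measure described above, the sketch below counts differences between every pair of aligned sequences and averages them. The function names and the toy alignment are hypothetical and not taken from the course material.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count the positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def average_pairwise_difference(sequences):
    """Average the number of differences over every pair of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are needed")
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / len(pairs)


# Hypothetical toy alignment of three sequences.
toy_alignment = ["ACGTAC", "ACGTTC", "ACCTTC"]
print(average_pairwise_difference(toy_alignment))
```

Dividing the result by the alignment length would give a per-site value, if that is the form needed.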

Answer 70

A

The fraction of individuals in a population that are expected to be heterozygous.

Answer 71

A

It is equivalent to the probability that any two alleles randomly sampled from the population are different.
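The probability that two randomly sampled alleles differ can be written as one minus the probability that they are the same. A minimal sketch, assuming allele frequencies are known and sampling is with replacement (the function name is hypothetical):

```python
def expected_heterozygosity(allele_frequencies):
    """Return 1 - sum(p_i^2): the chance two random alleles are different."""
    if abs(sum(allele_frequencies) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_frequencies)


# Two alleles at equal frequency give the maximum value for a biallelic locus.
print(expected_heterozygosity([0.5, 0.5]))
```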

Answer 72

A

The proportion of loci observed to be heterozygous in an average individual, and it is obtained by averaging h across many loci.

Answer 73

A

It predicts genotype frequencies from allele frequencies, which remain stable across generations in a stable population.
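For a single locus with two alleles, the prediction from allele frequencies to genotype frequencies is the familiar p², 2pq, q². A small sketch, with hypothetical names for the function and the genotype labels:

```python
def hardy_weinberg(p):
    """Expected genotype frequencies for alleles A (frequency p) and a (1 - p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


# Example with an assumed allele frequency of 0.6.
print(hardy_weinberg(0.6))
```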

Answer 74

A

it’s a diploid organism with sexual reproduction
there are non-overlapping generations
there’s an infinite population size
there’s random mating
males and females have equal allele frequencies
it’s a closed population
there’s no mutation
there’s no selection

Answer 75

A

It is an example of a null model as it describes the state of population when nothing interesting is happening.

Answer 76

A

It extends to more than 2 alleles and to multiple loci that segregate independently.

Answer 77

A

Mutations.

Answer 78

A

The fact that the inheritance of genes on the same chromosome is not independent.

Answer 79

A

It is decreased by recombination and random assortment.

Answer 80

A

It means that individuals mate at random with respect to a particular genotype; it doesn’t mean that there’s absolutely no choice.

Answer 81

A

Exceptions are inbreeding (mating with relatives more often than expected by chance), and positive assortative mating (mating occurs between individuals with similar phenotypes).

Answer 82

A

It is where offspring are more likely to inherit the same allele from both parents.

Answer 83

A

To measure the level of recent inbreeding

Answer 84

A

It results in reduced fitness, and it often arises from homozygosity in recessive deleterious alleles.

Answer 85

A

The idea that chance alone can result in changes in genetic variation over time.

Answer 86

A

When an allele’s frequency reaches 100% in the population.

Answer 87

A

It usually occurs when populations go through bottlenecks and it causes substantive changes in allele frequencies.

Answer 88

A

It is observed due to genetic drift and inbreeding in a subpopulation.

Answer 89

A

Human diseases, some wild felid populations, and in captive breeding.

Answer 90

A

A reduction in overall heterozygosity.

Answer 91

A

The fraction of total genetic diversity that is due to differences among populations.
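One common way to put a number on this fraction is (H_T - H_S) / H_T, where H_S is the average expected heterozygosity within subpopulations and H_T is the expected heterozygosity of the pooled population. The sketch below assumes a biallelic locus and equally sized subpopulations; the function name is hypothetical.

```python
def fraction_among_populations(subpop_allele_freqs):
    """(H_T - H_S) / H_T for one biallelic locus, equal-sized subpopulations."""
    n = len(subpop_allele_freqs)
    # Mean within-subpopulation expected heterozygosity, 2p(1 - p).
    h_s = sum(2 * p * (1 - p) for p in subpop_allele_freqs) / n
    # Expected heterozygosity of the pooled population.
    p_bar = sum(subpop_allele_freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        raise ValueError("the locus is not variable in the pooled population")
    return (h_t - h_s) / h_t


# Two subpopulations with very different allele frequencies.
print(fraction_among_populations([0.2, 0.8]))
```

Identical allele frequencies in every subpopulation give a value of 0, because none of the diversity is then due to differences among populations.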

Answer 92

A

Effective population size is used instead of census population size as it takes into account that not all individuals in all generations have an equal propensity to reproduce.

Answer 93

A

The size of an idealised population that would experience the same rate of genetic drift as the real population, due partly to the limited proportion of breeding individuals.

Answer 94

A

They can identify subgroups within a species using genetic marker data from multiple loci.

Answer 95

A

It leads to the positive selection of beneficial alleles, which eventually results in their fixation.

Answer 96

A

The whole organism.

Answer 97

A

Positive, negative, and balancing selection.

Answer 98

A

Directional, disruptive and stabilising selection.

Answer 99

A

The average number of offspring produced by individuals with a particular genotype compared to the number produced by individuals with another genotype.

Answer 100

A

It is expressed as a selection coefficient.

Answer 101

A

The increase or decrease in fitness conferred by that allele compared to another.

Answer 102

A

They occur more rapidly in haploids than diploids, which is due to the fact that the relationship between genotype and phenotype is simpler.

Answer 103

A

Sigmoidal.

Answer 104

A

Allele interactions.

Answer 105

A

The rate of allele dominance.

Answer 106

A

A new, rare allele is initially found mostly in heterozygotes, so selection can only favour it if the allele is dominant.

Answer 107

A

The dominance allows the less-fit allele to hide in heterozygotes, which makes it difficult to remove.

Answer 108

A

Balancing selection is often typified by the case of heterozygote advantage, where both alleles will stably coexist with a frequency that is proportional to the relative fitness of the two homozygotes.
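To see how the stable frequency depends on the fitness of the two homozygotes, here is a minimal sketch under one standard parameterisation, which is an assumption and not taken from the cards: fitnesses are 1 - s for AA, 1 for Aa, and 1 - t for aa, giving an equilibrium frequency of A equal to t / (s + t).

```python
def next_generation(p, s, t):
    """Frequency of allele A after one generation of selection."""
    q = 1.0 - p
    w_aa_big, w_het, w_aa_small = 1.0 - s, 1.0, 1.0 - t
    mean_fitness = p * p * w_aa_big + 2 * p * q * w_het + q * q * w_aa_small
    return (p * p * w_aa_big + p * q * w_het) / mean_fitness


def equilibrium_frequency(s, t):
    """Stable frequency of A under heterozygote advantage."""
    return t / (s + t)


# Starting from a rare allele, the frequency approaches the equilibrium.
p = 0.1
for _ in range(500):
    p = next_generation(p, s=0.2, t=0.6)
print(p, equilibrium_frequency(0.2, 0.6))
```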

Answer 109

A

Frequency-dependent selection, and fluctuating selection.

Answer 110

A

It is where allele fitness is high when the allele is rare, and so when the allele is common, the allele fitness is low.

Answer 111

A

It is where allele fitness depends on an aspect of the environment that is rapidly and constantly changing.

Answer 112

A

You get a phenotypic curve.

Answer 113

A

It is to do with describing, naming, identifying, and classifying species.

Answer 114

A

It is to do with reconstructing patterns of shared ancestry among organisms.

Answer 115

A

In On the Origin of Species.

Answer 116

A

It can be seen in hierarchical tables, the ladder of nature, and representing a process.

Answer 117

A

If they are similar and have descended from a common ancestor.

Answer 118

A

When they are similar but have descended from different ancestors.

Answer 119

A

Molecular sequences contain information about the evolutionary processes that produce them, but they are often scrambled, fragmentary, hidden, or lost.

Answer 120

A

They use mathematical, statistical and computational methods.

Answer 121

A

They are sequences from different species.

Answer 122

A

They are sequences from the same species.

Answer 123

A

They are sequences from different genes in the same genome.

Answer 124

A

They arrived during the molecular biology revolution of the mid-20th century.

Answer 125

A

they are very common.
they are objective.
they are easy to quantify.
they are available when morphology is uninformative.
it is cheap and fast.
it can be obtained without specialist training.

Answer 126

A

It is unavailable for extinct species.

Answer 127

A

It is a mutation from a purine to a purine, or from a pyrimidine to a pyrimidine.

Answer 128

A

It is a mutation from a purine to a pyrimidine, or vice versa.
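The two mutation classes in the cards above can be told apart with a simple lookup. A short sketch for DNA bases only (the function name is hypothetical):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(old_base, new_base):
    """Label a single-base change as a transition or a transversion."""
    old_base, new_base = old_base.upper(), new_base.upper()
    for base in (old_base, new_base):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base}")
    if old_base == new_base:
        raise ValueError("the two bases are identical, so nothing changed")
    same_class = (old_base in PURINES) == (new_base in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))
print(classify_substitution("A", "C"))
```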

Answer 129

A

As a synonymous mutation.

Answer 130

A

As a non-synonymous mutation.

Answer 131

A

The principle of parsimony.

Answer 132

A

The concept of positional homology.

Answer 133

A

If they exist at equivalent positions in their respective sequences.

Answer 134

A

It is essential for good phylogenies.

Answer 135

A

Most alignment methods start by assigning a different “cost” to each type of sequence difference. Each possible alignment, therefore, has a total cost. Algorithms then identify the alignment with the lowest cost.
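The lowest-cost idea can be sketched with a small dynamic-programming table in the style of the Needleman-Wunsch algorithm. The cost values here (mismatch 1, gap 2) are arbitrary assumptions chosen for illustration, and the function only returns the total cost, not the alignment itself.

```python
def alignment_cost(seq_a, seq_b, mismatch_cost=1, gap_cost=2):
    """Lowest total cost of globally aligning seq_a with seq_b."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    cost = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per base.
    for i in range(1, rows):
        cost[i][0] = i * gap_cost
    for j in range(1, cols):
        cost[0][j] = j * gap_cost
    for i in range(1, rows):
        for j in range(1, cols):
            same = seq_a[i - 1] == seq_b[j - 1]
            diagonal = cost[i - 1][j - 1] + (0 if same else mismatch_cost)
            gap_in_b = cost[i - 1][j] + gap_cost
            gap_in_a = cost[i][j - 1] + gap_cost
            cost[i][j] = min(diagonal, gap_in_b, gap_in_a)
    return cost[-1][-1]


print(alignment_cost("ACGT", "AGGT"))  # one mismatch
print(alignment_cost("ACGT", "ACT"))   # one gap
```

Changing the relative size of the mismatch and gap costs changes which alignment is judged best, which is why the choice of costs matters.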

Answer 136

A

When the sequences are diverse or contain long insertions or deletions.

Answer 137

A

It is a problem seen when looking at how different two sequences are. When divergence is low, the observed number of changes is similar to the true number, but when divergence is high, the observed number underestimates the true genetic distance.

Answer 138

A

They are used to estimate the true genetic distance from the observed changes.
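As one concrete case, the Jukes-Cantor model (chosen here as an assumption; the card does not name a specific model) corrects the observed proportion of differing sites p to an estimated distance d = -(3/4) ln(1 - 4p/3). A minimal sketch:

```python
import math


def jukes_cantor_distance(p_observed):
    """Estimated substitutions per site from the observed proportion of differences."""
    if not 0.0 <= p_observed < 0.75:
        raise ValueError("the correction is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p_observed / 3.0)


# The correction is small at low divergence and large at high divergence.
for p in (0.01, 0.1, 0.3, 0.6):
    print(p, jukes_cantor_distance(p))
```

The growing gap between the observed and corrected values at high divergence is the underestimation problem that substitution models are meant to address.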

Answer 139

A

They represent the stochastic process of sequence evolution through time.

Answer 140

A

A 20x20 matrix.

Answer 141

A

evolution at each site occurs at the same rate.
nucleotide base frequencies are the same for all sequences.
evolution at each site is independent.

Answer 142

A

They are used to capture the variation in evolutionary rates among sites.

Answer 143

A

The gamma-distribution.

Answer 144

A

All lines represent genetic distance.

Answer 145

A

A rooted tree has evolutionary direction, and only horizontal lines represent genetic distance.

Answer 146

A

It transforms genetic distances into a tree.

Answer 147

A

They define some kind of score for each possible tree.

Answer 148

A

They are methods that calculate a probability for each possible tree and frame phylogeny estimation as a formal statistical problem.

Answer 149

A

The tree which requires the fewest evolutionary changes to explain the observed sequences is the best tree.

Answer 150

A

It is most useful when applied to morphological character data.

Answer 151

A

When there are fast-evolving sequences.

Answer 152

A

The tree which is probabilistically most likely to have given rise to the observed sequences is the best tree. It is slower and the probabilities are given by nucleotide substitution models.

Answer 153

A

Where each tree has a probability given the data, and the whole probability distribution is considered, not just the one most likely.

Answer 154

A

It is the minimum number of evolutionary changes required to explain the observed characteristics.

Answer 155

A

To find the topology with the highest likelihood.

Answer 156

A

To use an exhaustive search which tries every possible tree and is only feasible with small numbers of taxa.
To do hill climbing which searches through trees by iterative trial and error, and it doesn’t check all possible trees and isn’t guaranteed to find the optimal one.

Answer 157

A

The most common technique is bootstrapping, which involves resampling the original data (with replacement) to create a large number of pseudoreplicates.
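In phylogenetics, a bootstrap pseudoreplicate is usually built by resampling alignment columns with replacement until the replicate is as long as the original. The sketch below shows only that resampling step, under that assumption; building a tree from each replicate is left out, and the names are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate made by resampling columns with replacement."""
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    chosen_columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[c] for c in chosen_columns) for seq in alignment]


# Hypothetical toy alignment; a fixed seed keeps the example repeatable.
toy_alignment = ["ACGT", "ACGA", "TCGA"]
rng = random.Random(1)
for _ in range(3):
    print(bootstrap_alignment(toy_alignment, rng))
```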

Answer 158

A

They provide a single estimate of a ‘true’ tree.

Answer 159

A

The trees generated from each replicate contain clusters, and the frequency with which a cluster appears is a measure of its reliability.

Answer 160

A

Zuckerkandl and Pauling in 1962 compared the number of amino acid differences in haemoglobin between species with the time since their last common ancestor. The two were correlated, which led to the idea of the molecular clock.

Answer 161

A

They can estimate the date of a common ancestor for which no fossils are known, and the divergence dates when there is no obvious morphological change.

Answer 162

A

It became obvious that there was much sequence variation at the molecular level, and that the amount of molecular diversity varied within genes, among genes, and among species.

Answer 163

A

The neutralist approach and the selectionist approach.

Answer 164

A

To understand why some genes/species/genomic regions evolve at different rates, and to estimate a timescale for phylogenies and evolutionary history.

Answer 165

A

It is the rate at which sequences in different populations diverge through time.

Answer 166

A

It is the rate at which individuals incorporate errors during replication.

Answer 167

A

The difference between the substitution/fixation rate, and the mutation rate.

Answer 168

A

When Ns is between 1 and -1.

Answer 169

A

When N is small, slightly deleterious mutations are controlled by drift and can occasionally become fixed.

Answer 170

A

When N is large, the slightly deleterious mutations are controlled by negative selection and never get fixed.

Answer 171

A

Substitution rates can increase in smaller populations, but organisms in small populations tend to have longer generation times, which may cancel out this effect.

Answer 172

A

It is the time between germ line replications.

Answer 173

A

It is a particularly important factor for selectively neutral polymorphisms.

Answer 174

A

Higher concentration of oxygen radicals.

Answer 175

A

The X and Y chromosomes are a good example, as in some species there may be more cell division events in the male germ line than in the female germ line, which leads to faster Y chromosome evolution.

Answer 176

A

It is thought to be due to a higher basal metabolic rate, which increases the oxygen free radicals produced by aerobic respiration, and these can generate mutations. However, no clear association has been found because there are too many confounding variables.

Answer 177

A

Genetic distance = evolutionary rate x (2 x divergence time)
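Rearranging the formula gives the divergence time as genetic distance divided by twice the evolutionary rate. A minimal sketch, assuming a strict clock and using made-up numbers:

```python
def genetic_distance(rate, divergence_time):
    """Distance between two lineages that split divergence_time ago."""
    return rate * (2 * divergence_time)


def divergence_time(distance, rate):
    """Time since the split, given the distance and a constant rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2 * rate)


# Hypothetical values: 0.02 substitutions per site, 1e-9 per site per year.
print(divergence_time(0.02, 1e-9))
```

The factor of 2 is there because changes accumulate along both lineages after they split.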

Answer 178

A

RNA viruses and retroviruses have mutation rates many times higher than those of eukaryotes as they replicate using different polymerases.

Answer 179

A

They were calibrated by assuming that all lineages/species evolve at the same rate, and this is known as a strict clock.

Answer 180

A

When the sequences are sampled at evolutionarily different points in time.

Answer 181

A

An example is where the phylogeny of cats was used to date the evolution of feline papillomaviruses.

Answer 182

A

It is a study of how population processes shape phylogenies, and it includes changes in population size, migration, speciation and extinction.

Answer 183

A

It was first coined by Grenfell et al. in Science in 2004 where it was used to describe how epidemiological, immunological and evolutionary processes can shape viral phylogenies.

Answer 184

A

It works backwards in time and traces ancestry given a set of sampled sequences. It typically considers intra-specific processes.

Answer 185

A

It works forwards in time: given a population process, it determines what the resultant phylogeny would look like. It considers inter- and intra-specific processes.

Answer 186

A

Coalescent theory has gained importance in population-level sequencing and has become widespread in anthropology, association mapping, conservation biology, epidemiology, global warming, and cancer biology.

Answer 187

A

It is an ideal population, and it assumes that individuals have equal propensity to reproduce, that generations are non-overlapping, and that there is a constant population size.

Answer 188

A

Coalescence theory is genetic drift in reverse, and vice-versa.

Answer 189

A

It is the probability that two lineages coalesce in the previous generation; moving back in time, it is the rate of coalescence.

Answer 190

A

r = (probability that a pair of sampled lineages share the same parent) x (the number of possible pairs of sampled lineages)
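The product above can be computed directly. This sketch assumes an idealised diploid population of N individuals, so a given pair of lineages shares a parent with probability 1/(2N), and that the number of sampled lineages k is much smaller than 2N. The function names are hypothetical.

```python
from math import comb


def coalescence_rate(k, population_size):
    """Per-generation rate at which any pair among k lineages coalesces."""
    if k < 2:
        raise ValueError("at least two lineages are needed")
    per_pair_probability = 1.0 / (2 * population_size)
    number_of_pairs = comb(k, 2)
    return per_pair_probability * number_of_pairs


def expected_waiting_time(k, population_size):
    """Expected number of generations until the next coalescence."""
    return 1.0 / coalescence_rate(k, population_size)


print(coalescence_rate(4, 1000))
print(expected_waiting_time(2, 1000))
```

Because the number of pairs grows with k, coalescences are expected to happen quickly when many lineages remain and slowly when only a few are left.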

Answer 191

A

It includes sequences from the past.

Answer 192

A

Moving back in time, the population size decreases and the rate of lineage joining increases.

Answer 193

A

Theta denotes sequence diversity. It is related to the number of mutations in the history of the sample.

Answer 194

A

Theta equals the average pairwise genetic distance between sampled sequences.

Answer 195

A

There are often long internal/near root branches, which means there are many mid-frequency polymorphisms.

Answer 196

A

There are long terminal branches, which means there are many low frequency polymorphisms.

Answer 197

A

Sequences contain information about demographic history.

Answer 198

A

A statistic that measures whether mutations are mostly high/medium/low frequency.

Answer 199

A

Methods used include: Tajima’s D, skyline plots, and the sequentially Markovian coalescent model.

Answer 200

A

They are plots that use the mathematical relationship between r(t) and 2N(t) to estimate past population size.

Answer 201

A

It is a complex approach used for human genomes.

Answer 202

A

Only lineages in the same deme/subpopulation can coalesce.

Answer 203

A

Incomplete lineage sorting occurs when coalescences predate multiple speciation times, and this is more likely to occur when ancestral effective population sizes are large.

Answer 204

A

When looking at the population sizes of Beringian bison over time, and the origins of HIV.

Answer 205

A

They try to find genotypes associated with human diseases like diabetes.

Answer 206

A

Coalescent theory is used to interpret large-scale human genomics data.

Answer 207

A

A complete population tree displays the full population dynamics and displays the dynamics giving rise to individuals at time T.

Answer 208

A

It has been used to study the diversification of mammals after the extinction of dinosaurs, and to study the spread of Ebola in Sierra Leone in 2014.

Answer 209

A

One of the most important tasks is to understand the selective forces acting on individual genes, gene regions, and codons.

Answer 210

A

Demographic and selective processes can generate similar trees.

Answer 211

A

You can look for differences in genetic diversity, tree shape, or mutation frequencies among genes or along chromosomes, compare silent and replacement changes within a gene, and look for parallel/convergent evolutionary changes.

Answer 212

A

It is the ratio of the number of replacement fixations to the number of silent fixations, and it is not a differential.

Answer 213

A

It means that all replacement mutations are neutral.

Answer 214

A

It means that all replacement mutations are deleterious.

Answer 215

A

It means that at least some of the replacement fixations are beneficial.

Answer 216

A

Because only a few codons are positively selected, while most codons are selectively constrained and therefore have dN/dS close to 0.

Answer 217

A

When the ratio is applied to parts of genes or individual codons.

Answer 218

A

When they are in: overlapping genes, alternate reading frames, regulatory sequence elements (they affect the stability of RNA/mRNA/DNA structure), and where codons for the same amino acids differ in fitness.

Answer 219

A

In the codons that form the active site of the gene, so the antigen recognition site.

Answer 220

A

It can be adapted to study adaptation in measurably evolving populations and rather than using an outgroup to

Answer 221

A

It is a simple method to contrast the patterns of within-species polymorphism and between-species divergence at synonymous and nonsynonymous sites in the coding region of a gene.

Answer 222

A

You would see that the ratio of replacement to synonymous differences between species should be the same as the ratio of replacement to synonymous polymorphisms within species.
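The comparison of the two ratios can be sketched as below. The counts are invented purely for illustration, and the neutrality index (the polymorphism ratio divided by the divergence ratio) is a commonly used summary that is added here as an assumption; the cards do not mention it.

```python
def compare_ratios(dn, ds, pn, ps):
    """Compare replacement/synonymous ratios between and within species.

    dn, ds: fixed replacement and synonymous differences between species.
    pn, ps: replacement and synonymous polymorphisms within species.
    """
    if min(ds, ps) <= 0 or dn <= 0:
        raise ValueError("counts used as denominators must be positive")
    divergence_ratio = dn / ds
    polymorphism_ratio = pn / ps
    return {
        "divergence_ratio": divergence_ratio,
        "polymorphism_ratio": polymorphism_ratio,
        "neutrality_index": polymorphism_ratio / divergence_ratio,
    }


# Invented counts, for illustration only.
print(compare_ratios(dn=10, ds=20, pn=5, ps=40))
```

Equal ratios (an index of 1) match the expectation stated above; an excess of replacement differences between species (an index below 1) is usually read as a sign of adaptive fixations. A real analysis would also test whether the difference is statistically significant.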

Answer 223

A

They are very small infectious agents that replicate inside living cells.

Answer 224

A

The high mutation rates.

Answer 225

A

The within-host scale and the between-host scale.

Answer 226

A

Human pathogenic viruses.

Answer 227

A

It is a single genome where new diversity is generated by mutation and recombination, and there is gradual evolution.

Answer 228

A

It comprises 8 genome segments, each encoding 1 or more genes. New diversity is generated by mutation, and reassortment between segments can also occur.

Answer 229

A

David Baltimore created the classification of viruses in 1971.

Answer 230

A

It is based on the route of information transmission from the genome to mRNA, from which virus proteins are translated.

Answer 231

A

Higher mutation rates.

Answer 232

A

Higher substitution rates.

Answer 233

A

The high substitution rates.

Answer 234

A

Acute infections are usually caused by RNA viruses which have a high mutation rate.

Answer 235

A

There is limited opportunity for within-host evolution and it is expected that selection for transmission plays a relatively large role.

Answer 236

A

Latent persistent infections are usually caused by DNA viruses, where there is a short burst of replication followed by long periods of latency.

Answer 237

A

One expects to see little within-host evolution and to see selection for transmission play a relatively large role.

Answer 238

A

Chronic persistent infections are usually caused by RNA or DNA-RT viruses.

Answer 239

A

There is ongoing rapid evolution and one expects to see within-host selection playing a relatively large role in determining adaptive evolution at the host-population scale.

Answer 240

A

There is selection pressure to maximise within-host fitness.

Answer 241

A

There is selection pressure to maximise between-host fitness, normally seen as transmission.

Answer 242

A

You take multiple sequences from the same individual at different times.

Answer 243

A

You take consensus sequences from different individuals at different times.

Answer 244

A

Data showed that selected mutations typically involve evasion from host immunity and mutations that are selected for in some individuals are selected against in others.

Answer 245

A

Adapt and revert is commonly seen in viruses and it could explain why we see high mutation rates within the individual.

Answer 246

A

In immunocompromised individuals.

Answer 247

A

Because there is little opportunity for adaptation, so it is unlikely to see the arms race, and selection will be driven by intrinsic transmissibility and immune escape.

Answer 248

A

They can be understood by using phylogenetic data.

Answer 249

A

The long branches lead to ‘variants of concern’. The leading hypothesis is that these long branches are a consequence of evolution during chronic infection, and these are characterised by many nonsynonymous mutations in Spike.

Answer 250

A

They are constructed from immunological assay data and are used to choose vaccine strains.

Answer 251

A

In Influenza, genetic divergence is continuous, but antigenic change is punctuated, with switches among discrete antigenic types being observed.

Answer 252

A

It is commonly seen between humans, birds and pigs.

Answer 253

A

Reconstructing the phylogeny found that HIV-1 is most diverse in Central Africa, and the phylogeographic and molecular clock methods place the common ancestor in the capital of the DRC in the 1920s. The virus is thought to have spread to humans from chimps in Cameroon, but the origins before that were unknown.

Answer 254

A

Phylogenies show that there was direct transmission from chimps to humans. However, there wasn’t just one transmission event: the virus jumped between lots of different species and then jumped from chimps to humans.

Answer 255

A

Scientists took 8 segments from the genome and each genome segment was telling a different evolutionary story due to the reassortment. The best evidence shows it emerged in Mexico from pigs.

Answer 256

A

It is where networks are generated of similar consensus sequences from different individuals.

Answer 257

A

Bacteria are one of the earliest forms of life.

Answer 258

A

The bacterium Haemophilus influenzae in 1995.

Answer 259

A

Sample prep, cluster generation, sequencing, and data analysis.

Answer 260

A

The cost of genome sequencing plummeting.

Answer 261

A

Complete, assembled genomes with annotation.

Answer 262

A

Archival short-sequence data.

Answer 263

A

First method is mapping where reads are aligned to a reference genome.
Second method is assembly, where genomes are reconstructed from raw read data using de novo assembly.

Answer 264

A

It is a reference-free assembly and comparison that is independent of biological information.

Answer 265

A

The overlap-layout-consensus method, and the De Bruijn method.

Answer 266

A

Start with sequences.
Divide the sequences into all possible k-mers, then look for all possible overlaps between them (for example, overlapping 4-mers).
SPAdes is the most common assembler.
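
The k-mer step above can be sketched in Python. This is a toy illustration only, not how any real assembler is implemented; the function name `kmers` and the example read are made up for this card.

```python
def kmers(sequence, k):
    """Return every overlapping k-mer in `sequence`, from left to right."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Hypothetical read, chosen only for illustration.
read = "ATGGCGTG"
print(kmers(read, 4))
```

A read of length L gives L - k + 1 k-mers, so this 8-base read gives five 4-mers, each shifted by one base from the previous one.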

Answer 267

A

Two sequences that have a defined, known gap between them.

Answer 268

A

100-300 bps.

Answer 269

A

It compares short reads to a high-quality reference, particularly used in comparing very closely related isolates.

Answer 270

A

rapid
accurate, even with ‘low coverage’ samples
comparable
reproducible
results are easy to visualise, which helps with identifying problems and errors.

Answer 271

A

requires high-quality reference genome
can only identify variants relative to the reference genome
repeat-rich regions are problematic and can lead to induced error or under-reporting of variants
can’t be reliably used to report large genomic events.

Answer 272

A

All of the overlaps between reads are determined, then the reads and overlaps are laid out on a graph and consensus sequences are identified. A ‘String Graph Assembler’ (SGA) does this.
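
The overlap step can be sketched as a suffix-prefix comparison between two reads. This is a minimal illustration, assuming error-free reads; the function names, the `min_length` cut-off and the example reads are hypothetical, and real assemblers use far more efficient indexing.

```python
def overlap(a, b, min_length=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_length` bases exists.
    """
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def merge(a, b, min_length=3):
    """Join two reads through their overlap, or return None if there is none."""
    length = overlap(a, b, min_length)
    if length == 0:
        return None
    return a + b[length:]


# Hypothetical reads, for illustration only.
print(overlap("ATGGCGT", "GCGTACC"))
print(merge("ATGGCGT", "GCGTACC"))
```

Running `overlap` on every pair of reads gives the edges of the overlap graph; the layout and consensus steps then work on that graph.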

Answer 273

A

A graph that is constructed from a set of k-mers, where the vertices represent the k-mers and the edges represent the overlaps between them.
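
A small sketch of such a graph, following the wording of this card (k-mers as vertices, with an edge wherever the last k-1 bases of one k-mer match the first k-1 bases of another). Many assemblers use the equivalent formulation with (k-1)-mers as vertices and k-mers as edges. The function name and example reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each k-mer to the k-mers that can follow it (overlap of k-1 bases)."""
    vertices = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            vertices.add(read[i:i + k])

    # Index k-mers by their first k-1 bases so that successors are easy to find.
    by_prefix = defaultdict(list)
    for kmer in sorted(vertices):
        by_prefix[kmer[:-1]].append(kmer)

    # An edge goes from a k-mer to every k-mer whose prefix equals its suffix.
    return {kmer: list(by_prefix.get(kmer[1:], [])) for kmer in sorted(vertices)}


# Hypothetical reads, for illustration only.
graph = de_bruijn_graph(["ATGGC", "GGCGT"], 3)
for kmer, successors in graph.items():
    print(kmer, "->", successors)
```

Walking the edges from ATG to CGT spells out the sequence shared by the two reads, which is the idea behind assembling from the graph.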

Answer 274

A

reference-free, so novel sequences can be constructed and identified.
can be used to identify large genomic sequence variants.

Answer 275

A

struggles to solve repetitive or very similar regions.
computationally expensive
time-consuming
no clear ‘ground truth’ as the output can be variable based upon input parameters.

Answer 276

A

short reads do not contain enough information to resolve low complexity regions that are larger than the length of the short read, leading to gaps in the assembly.
the assembled genome is fragmented into multiple contiguous sequences.
some regions will not be assembled.

Answer 277

A

By spanning the entire length of low complexity regions, or resolving intermittent identical repeats.

Answer 278

A

Pacific Biosciences (PacBio; not widely used)
Oxford Nanopore (portable and used in Ebola outbreak and COVID).

Answer 279

A

They are combined with second generation sequencing reads for an accurate hybrid assembly.

Answer 280

A

They are assemblies that combine the base-calling accuracy of short-read sequencing with the scaffolding power of long reads to solve genomic features that are unresolvable by short reads alone.

Answer 281

A

Location, e.g. which sequence, where on the sequence, and which strand it’s on
Feature type, e.g. protein coding, or tRNA, or repeat region
Attributes, e.g. products, enzyme code, cellular location.
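
The three parts of an annotation can be pictured as one record per feature. This is a minimal sketch; the class name, field names and example values are hypothetical, and 1-based inclusive coordinates are an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    # Location: which sequence, where on it, and which strand.
    seq_id: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive
    strand: str
    # Feature type, e.g. "CDS", "tRNA" or "repeat_region".
    feature_type: str
    # Attributes, e.g. product, enzyme code, cellular location.
    attributes: dict = field(default_factory=dict)

    def length(self):
        """Feature length in bases under the inclusive-coordinate assumption."""
        return self.end - self.start + 1


# Hypothetical feature, for illustration only.
gene = Feature("contig_1", 100, 399, "+", "CDS",
               {"product": "hypothetical protein"})
print(gene.feature_type, gene.length(), gene.attributes["product"])
```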

Answer 282

A

A gene-by-gene annotation.

Answer 283

A

A database of orthology relationships, functional annotation, and gene evolutionary histories.

Answer 284

A

Their biology, so where they are and what they do, e.g. if they are free-living or obligate or facultative.

Answer 285

A

It is the genes that are the same in all bacterial individuals of a species.

Answer 286

A

They tend to have about 1/4 of their genomes different to each other.

Answer 287

A

All the different genes, so the variable genome content.

Answer 288

A

The core and accessory genome added together.
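
The core, accessory and pangenome definitions map directly onto set operations. This is a minimal sketch, assuming each isolate is already reduced to a set of gene names; the isolate names and gene lists are placeholders for illustration.

```python
def pangenome_partition(genomes):
    """Split gene content into core, accessory and pangenome.

    `genomes` maps an isolate name to the set of genes it carries.
    """
    if not genomes:
        return set(), set(), set()
    gene_sets = [set(genes) for genes in genomes.values()]
    core = set.intersection(*gene_sets)  # present in every isolate
    pan = set.union(*gene_sets)          # present in at least one isolate
    accessory = pan - core               # variable gene content
    return core, accessory, pan


# Placeholder isolates and gene lists, for illustration only.
isolates = {
    "isolate_A": {"dnaA", "gyrB", "blaTEM"},
    "isolate_B": {"dnaA", "gyrB", "vanA"},
    "isolate_C": {"dnaA", "gyrB"},
}
core, accessory, pan = pangenome_partition(isolates)
print(sorted(core), sorted(accessory), len(pan))
```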

Answer 289

A

They tend to be GC-rich. It is unknown why this is, but it is potentially related to temperature, as high GC content increases stability under high temperatures.
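
GC richness is measured as the fraction of bases that are G or C. A minimal sketch follows; the function name and example sequences are made up for illustration.

```python
def gc_content(sequence):
    """Fraction of bases in `sequence` that are G or C (0.0 to 1.0)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    gc = sequence.count("G") + sequence.count("C")
    return gc / len(sequence)


# Toy sequences, for illustration only.
print(gc_content("ATGC"))
print(gc_content("GGCCGA"))
```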

Answer 290

A

It is where the genomes and order of genes are all shuffled, and this is seen in prokaryotes.

Answer 291

A

There is large variation due to bacteria having been around for a very long time.

Answer 292

A

It is to construct a phylogenetic tree from DNA sequences of bacterial strains with different phenotypes.

Answer 293

A

A model that emphasises that most of the genetic variation can be explained by genetic drift.

Answer 294

A

They highlight selection for adapted lineages in a given environment.

Answer 295

A

Where there is a genetic mutation which affects the fitness/survival of an individual.

Answer 296

A

DNA replication errors
horizontal gene transfer

Answer 297

A

Where there is generation of point mutations, rearrangements, or deletions of various sizes.

Answer 298

A

Genetic material that is acquired from an external source and incorporated into the chromosome by recombination.

Answer 299

A

Mutation.

Answer 300

A

Under strong selective pressure.

Answer 301

A

It is thought that it is a consequence of differing relative rates of recombination to mutation, although other forces may play a role.

Answer 302

A

It is not the same as a selective sweep.

Answer 303

A

One method is to compare the frequency of substitutions at non-synonymous sites with the frequency at synonymous sites (the dN/dS ratio).

Answer 304

A

It is associated with negative or purifying selection, which suppresses protein changes.

Answer 305

A

It is associated with positive selection, promoting protein sequence changes.
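
The usual reading of the dN/dS ratio can be sketched as follows. The cut-off at 1 is the conventional interpretation rather than something stated on these cards, and the function names and input values are hypothetical.

```python
def dn_ds(dn, ds):
    """Ratio of non-synonymous (dN) to synonymous (dS) substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    return dn / ds


def interpret(ratio):
    """Conventional reading of a dN/dS ratio."""
    if ratio < 1:
        return "negative (purifying) selection"
    if ratio > 1:
        return "positive selection"
    return "neutral"


# Made-up rates, for illustration only.
print(interpret(dn_ds(0.01, 0.05)))
print(interpret(dn_ds(0.5, 0.25)))
```

The caveats on the later dN/dS card still apply: the ratio says nothing about selection acting outside protein-coding sequence.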

Answer 306

A

Host immune evasion or antimicrobial resistance.

Answer 307

A

Within host populations, where isolation from the ancestral population results in greater genetic drift and less time to purge deleterious mutations.

Answer 308

A

Selection operates on features other than protein-coding sequences, which don’t necessarily affect dN/dS.
dN/dS ratios do not detect complex traits such as interactions between genes.
Frameshifts and incorrect interpretation of start codons can lead to non-synonymous single nucleotide polymorphisms being interpreted as synonymous.
the estimates aren’t accurate if polymorphisms are not fixed between independent lineages, and segregating variation in the population is likely weakly deleterious and destined to be purged in the future.

Answer 309

A

Most pathogens are principally commensal organisms.

Answer 310

A

One can compare the genomes of pathogens with other genomes of the ancestors and related non-pathogens.

Answer 311

A

Convergent evolution, also known as homoplasy.

Answer 312

A

It is where the effect of one allele depends on another, which is also known as epistasis.

Answer 313

A

There is a higher linkage disequilibrium.

Answer 314

A

Recombination promotes adaptation by introducing novel functionality; on the other hand, it risks creating disharmonious gene combinations that are likely to be selected against.

Answer 315

A

potential variation, which can set the stage for subsequent genetic changes that can result in beneficial adaptations.
compensatory change, which adjusts the recipient genome to minimise potential disruptions, facilitating transition between fitness peaks.

Answer 316

A

The differences in genome size.

Answer 317

A

The vast majority of genomic DNA codes for protein in prokaryotes.

Answer 318

A

non-coding DNA performs essential functions.
Non-coding DNA is useless “junk”, carried passively by the chromosome simply because it is linked to functional genes.
Non-coding DNA has a structural or nucleoskeletal function.
Non-coding DNA is a functionless “parasite” that is in a selective battle with the host.

Answer 319

A

size of cell nucleus
duration of mitosis and meiosis
metabolic rate in birds and mammals
minimum generation time
seed size
response of annual plants to CO2
embryonic development time in Salamanders
morphological complexity in the brains of frogs and salamanders.

Answer 320

A

The hypothesis claims that cell size is adaptively important so that more genomic DNA is required to make bigger cells. So, DNA mass directly determines nuclear volume and there must be a constant ratio of nucleus to cell volume to maintain a balance between RNA synthesis and protein in the cytoplasm.

Answer 321

A

The evidence for the theory is in cryptomonad algae, where DNA in the nucleus performs a skeletal function.

Answer 322

A

While it is seen in unicellular eukaryotes, scaling it up to multicellular eukaryotes is challenging.

Answer 323

A

It suggested that effective population sizes are too small to allow for natural selection to effectively remove non-coding DNA from eukaryotic genomes.

Answer 324

A

Probably because they have a single origin of replication and need to replicate quickly.

Answer 325

A

It is non-coding repetitive DNA consisting of short sequence motifs repeated 100s to 1000s of times in tandem.
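
Tandem repeats of a short motif can be found with a back-referencing regular expression. This is a toy sketch: the function name, the example sequence and the very low copy threshold are chosen for illustration, whereas real satellite arrays contain far more copies.

```python
import re


def find_tandem_repeats(sequence, unit_length, min_copies):
    """Find runs of a motif of `unit_length` bases repeated in tandem.

    Returns (start, motif, copies) tuples, with `start` counted from 0.
    `min_copies` should be at least 2.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_length, min_copies - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // unit_length)
        for m in pattern.finditer(sequence)
    ]


# Toy sequence containing the motif AC three times in a row.
print(find_tandem_repeats("GGACACACTT", unit_length=2, min_copies=3))
```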

Answer 326

A

Satellite DNA (2-40Kb)
Minisatellites (11-60bp)
Microsatellites (2-5bp)

Answer 327

A

They have very high mutation rates, meaning that their loci are extremely variable.

Answer 328

A

“Selfish” DNA sequences which are able to increase their copy number by jumping around the genome and making additional copies of themselves as they do so.

Answer 329

A

They are called insertion sequences.

Answer 330

A

Class I elements (retroelements)
Class II elements (DNA elements)
Miniature Inverted-Repeat Transposable Elements (MITEs).

Answer 331

A

They transpose via an RNA intermediate using the enzyme reverse transcriptase.

Answer 332

A

LTR retrotransposons
non-LTR retrotransposons

Answer 333

A

Long Interspersed Nuclear Elements (LINEs), which are very common in eukaryotes.

Answer 334

A

It can cause deleterious mutations.

Answer 335

A

Short Interspersed Nuclear Elements (SINEs), which, unlike LINEs, do not encode their own reverse transcriptase. They are very common in eukaryotic genomes.

Answer 336

A

Copy number could increase by 20-100 copies in a single generation.

Answer 337

A

Some Drosophila species have P elements, which are Class II elements. Wild flies carry them while lab flies don’t. The insertion of P elements can lead to hybrid dysgenesis.

Answer 338

A

It is an increased infertility due to chromosome breakage.

Answer 339

A

Retroviral infection of the germline
fixation
amplification
inactivation through mutations
loss through recombinational deletion
decay into junk
co-option.

Answer 340

A

Class I, II, and III.

Answer 341

A

Endogenous retroviruses cause diseases in a range of mammals, but there is no definitive link with disease that has been seen in humans.

Answer 342

A

Evidence has shown that a captive protein from an ancient endogenous retroviral insertion is involved in placental morphogenesis.

Answer 343

A

It predicts that transposable elements will be preferentially found in regions with low recombination.

Answer 344

A

It can happen through homologous recombination between distant loci.

Answer 345

A

Selection against transposable elements that cause ectopic exchange.

Answer 346

A

A complex interplay of factors specific to transposable element biology and the biology of the host.

Answer 347

A

1.5% codes for genes.

Answer 348

A

To identify gene coding regions and determine their biological function.

Answer 349

A

Regions of “dark matter” that show accelerated evolution in one species but not others.

Answer 350

A

It is near identical to the chimpanzee.

Answer 351

A

It is a project which aims to delineate all functional elements encoded in the human genome.

Answer 352

A

They are discrete genome segments that encode a defined product or display a reproducible biochemical signature.

Answer 353

A

pterophytes
gymnosperms
angiosperms (mainly the monocots).

Answer 354

A

It arises from polyploidization events followed by chromosome reshaping.

Answer 355

A

It is best known in angiosperms (flowering plants). Events have been seen up to 400 million years ago in seed plants, then in ancestral angiosperms, and then in specific clades.

Answer 356

A

Innovations and adaptation in angiosperms.

Answer 357

A

Very large genome sizes.

Answer 358

A

It doubles the genome size and gene number.

Answer 359

A

A return to a diploid state is most stable and has profound effects on the evolution of genome architecture.

Answer 360

A

Retrotransposons.

Answer 361

A

Ty1/copia and Ty3/gypsy.

Answer 362

A

While most LTR-retrotransposons are degenerate and inactive, stress tends to activate the movement of intact copies.

Answer 363

A

Because they tend to be highly nested in the genome.

Answer 364

A

The repeated sequences tend to drive up genome size, but the relationship is dynamic and changes in larger plant genomes.

Answer 365

A

Sequence divergence between the terminal repeats of a single retrotransposon can be used as a molecular clock. LTRs are initially identical and then their sequences decay due to random mutation.
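
The clock idea can be sketched numerically. The formula T = K / (2r) is the commonly used form (both LTRs accumulate mutations, hence the factor of 2) and is an assumption here, as it is not stated on the card; the function names, the toy sequences and the substitution rate are hypothetical.

```python
def p_distance(ltr_5prime, ltr_3prime):
    """Proportion of differing sites between two aligned, equal-length LTRs."""
    if not ltr_5prime or len(ltr_5prime) != len(ltr_3prime):
        raise ValueError("LTRs must be aligned, non-empty and of equal length")
    differences = sum(a != b for a, b in zip(ltr_5prime, ltr_3prime))
    return differences / len(ltr_5prime)


def insertion_age(divergence, rate_per_site_per_year):
    """Estimated years since insertion, assuming T = K / (2r)."""
    return divergence / (2 * rate_per_site_per_year)


# Toy aligned LTRs differing at 1 site in 10.
k = p_distance("ACGTACGTAC", "ACGTACGTAT")
# Illustrative rate only; real rates are lineage-specific.
print(k, insertion_age(k, 1e-8))
```

A recently inserted element has near-identical LTRs and so a divergence, and an estimated age, close to zero.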

Answer 366

A

They are efficient in key repair mechanisms to remove LTR-retrotransposon copies.

Answer 367

A

It is where multiple chromosome sets are derived from a single taxon. It arises from chromosome non-disjunction during meiosis or from spontaneous, somatic genome doubling.

Answer 368

A

It is where multiple chromosome sets are derived from two or more diverged taxa.

Answer 369

A

They are abundant in crop plants.

Answer 370

A

Triploids include bananas, citrus, and some apples.
Tetraploids include wheat, cotton, potato, canola and rapeseed.
Hexaploids include chrysanthemum, bread wheat, oat, and kiwi.
Octaploids include strawberry and sugar cane.

Answer 371

A

It results in sub- and neo-functionalisation, which facilitate evolutionary change, including adaptation.

Answer 372

A

Duplicated genes are initially redundant and most often, one copy is lost.

Answer 373

A

Natural selection.

Answer 374

A

The sequencing method and assembly method used.

Answer 375

A

They both evolved through endosymbiosis from prokaryotic organisms, which were alpha-proteobacteria, and cyanobacteria.

Answer 376

A

It is where a character/gene is inherited from one parent only. This often takes the form of maternal inheritance as the egg contributes the bulk of the cytoplasm to the zygote.

Answer 377

A

They independently studied inheritance of leaf colour in variegated plants and found that inheritance of the trait could not be explained according to Mendel’s laws of heredity.

Answer 378

A

Genetic analysis
Biochemical analysis
Imaging.

Answer 379

A

The group of Boris Ephrussi studied yeast petite mutants, which were unable to grow on sugar-poor medium due to defective oxidative phosphorylation and so formed small colonies. Sometimes this character was not inherited in Mendelian fashion, and it was later correlated with defective mitochondria.

Answer 380

A

You stain DNA with ethidium bromide, and mitochondria with DiOC6. Where yellow is seen, it is mitochondrial DNA.

Answer 381

A

They contain double-stranded DNA molecules called mtDNA and cpDNA/ptDNA, meaning they are semi-autonomous.

Answer 382

A

They are small DNAs that do not encode for proteins and can be represented by circular DNA maps.

Answer 383

A

They are much smaller than the ancestral genome, which is due to the genes needed for free-living being lost and many others being transferred to the nuclear genome.

Answer 384

A

There is no correlation.

Answer 385

A

organelle genomes lack features typical of nuclear chromosomes and exist as nucleoids.
DNA replication is not tightly coupled to cell division.
Organelle genome transcription and translation machineries are prokaryotic in character.
some genes are transcribed together to form polycistronic RNAs.
Introns exist but are of a different type.
the genetic code can deviate from the standard.
organelle transcripts can be subject to RNA editing.

Answer 386

A

mtDNA is transcribed using machinery that is related to T7 bacteriophage RNA polymerase.

Answer 387

A

A single-subunit RNA polymerase, which requires the assistance of 2 transcription factors: mitochondrial transcription factor A and mitochondrial transcription factor B.

Answer 388

A

Transcription is initiated in the non-coding region.
Transcription proceeds in both directions, from 2 promoters: light-strand promoter and heavy-strand promoter.
Two transcripts spanning almost the entire genome are formed.
These polycistronic primary transcripts are processed to yield mRNAs, tRNAs, and rRNAs.

Answer 389

A

They are 55S instead of 70S.
They have evolved unique features reflecting the special requirements of highly hydrophobic OxPhos proteins in the organelle.

Answer 390

A

TFAM molecules bind to mtDNA in short patches.
TFAM bends the mtDNA, and bridges neighbouring mtDNA stretches by cross-strand binding. This compacts the mtDNA to form the nucleoid.
The mtDNA in the nucleoid is inaccessible to the transcription and replication machineries.

Answer 391

A

The reasons are unclear but likely reflect the unique evolutionary and operational circumstances of the organelles.

Answer 392

A

They tend to be AT-rich.

Answer 393

A

It is conserved in animals and fungi, but the protein is absent in plants.

Answer 394

A

They often undergo C-to-U editing which often alters the coding sequences of a transcript to produce translatable mRNA.

Answer 395

A

Plants have the largest mtDNAs, and the number varies little between species.

Answer 396

A

some derived from chloroplast, nuclear or viral DNA.
some has been acquired by horizontal transfer from other plants.

Answer 397

A

Most is of unknown origin.

Answer 398

A

They are 1-2 orders of magnitude higher than in plant mtDNAs, and higher than animal nuclear genes.

Answer 399

A

It is responsible for the unusually low mutation rates in plant organelle genomes. It is dual-targeted to chloroplasts and mitochondria, and it mediates efficient recognition and correction of DNA sequence errors.

Answer 400

A

Electrophoresis and microscopy studies suggest that genome-size mtDNA circles are rare or absent. Many repeated sequences are present, which enables homologous recombination and leads to highly variable structural organisation.

Answer 401

A

A long single-copy region (LSC), a short single-copy region (SSC), and two inverted repeats (IR).

Answer 402

A

Anaerobic microbes, resulting in hydrogenosomes and mitosomes.

Answer 403

A

In a very small number of plants and algae.

Answer 404

A

It is inherited maternally.

Answer 405

A

maternal spindle transfer
pronuclear transfer.

The difference is whether it is carried out before or after fertilisation.

Answer 406

A

They are larger than mammalian mtDNAs, but smaller than plant mtDNAs.

Answer 407

A

It was developed by Fred Sanger in 1977 and it used radioactively labelled ddNTPs with four independent reactions with each of the radioactive base analogues.

Answer 408

A

It uses fluorescent tags instead of radioactive labelling.

Answer 409

A

up to 1000bp read.

Answer 410

A

In the mid 90s, Shankar Balasubramanian and David Klenerman realised their work imaging the action of single polymerase molecules could be the basis for a new sequencing reaction by imaging the energy of the fluorescence emitted by the chemistry of the extension reaction.

Answer 411

A

In vitro library preparation
In vitro clonal amplification
highly parallel, limited only by the size of sequencing features and imaging limitations
low reagent volume ratios per sequencing feature.

Answer 412

A

Medical/personal/human population genomics
Metagenomics
Environmental genomics
Evolutionary/population genomics
Understanding gene regulation mechanisms and the genome at new depths.

Answer 413

A

To detect variation and inform on mechanisms underpinning phenotype.

Answer 414

A

Circularised fragments of >1 kb pieces or “conformation capture” bring more distant parts of the genome together, with the ends appearing in the same sequenced fragment.

Answer 415

A

When distance rules are broken among all reads.

Answer 416

A

highly complicated.
technical problems: biases in library preparation, biases in sequencing profile after amplification, sequencing error rates.
assembly/information problems: polymorphisms and repetitive regions.

Answer 417

A

It results in over-representation of sequence at origins of replication.

Answer 418

A

One sees peaks of reads at replication origins.

Answer 419

A

It is used to insert sequencing compatible sequences into the genome where it then dissociates and leaves the insertion sequences. It often inserts into open chromatin.

Answer 420

A

It is methylated at its 5th carbon and it is essential to development, as loss of any of the mammalian cytosine methyltransferases is lethal.

Answer 421

A

Imprinting
Retrotransposon silencing
X chromosome inactivation.

Answer 422

A

crosslink proteins to DNA
fragment DNA
Use antibodies to rescue DNA with nucleosomes with histone mark of interest
de-crosslink DNA from proteins
make sequencing library and sequence.

Answer 423

A

They are being sequenced using technologies that produce long reads.

Answer 424

A

overcomes problems from repetitive regions
allows for structural variation to be detected more directly
much better assemblies
easier to get high quality assemblies of new complex genomes
epigenetic base modification can be read directly
simpler library preps
ability to sequence impure environmental samples more directly.

Answer 425

A

It allows for Single Molecule Real Time sequencing. They have a longer read, 5kb on average but up to 15kb. They have a higher error rate but allow for detection of structural variation and detection of base modifications.

Answer 426

A

it can read lots of different lengths
it is portable
it can convey structure information
they are addressable and programmable on the array
longest read is over 2 million bps.

Answer 427

A

it is limited by the size of the molecules going into the machine
can’t sequence genomes at very high accuracy yet
some technical issues.

Answer 428

A

It is a project trying to sequence 60,000 eukaryotes in Britain and Ireland.

Answer 429

A

The unicellular ancestor of animals had a complex repertoire of genes linked to multicellular processes, suggesting that changes in the regulatory genome were key to the origin of animals.