W1- Genome Variation Flashcards

1
Q

What does the human genome consist of?

A

Gross structure – 23 pairs of chromosomes
Molecular structure – DNA sequence
* 3 billion bases (3000Mb)
* ~20,000 genes
* ~1.5% of the genome codes for protein = exome

2
Q

Are people identical?

A
  • Major macro-level differences are generally associated with disease (aneuploidy, translocations, etc.)
  • Micro- or molecular-level pathogenic differences are also sometimes associated with disease (point mutation in SCA, 3bp deletion in CFTR)
  • ~99.7% of DNA is the same between any 2 people (i.e. they differ at ~9 million bases)
  • Any position in the genome that varies between individuals is considered polymorphic = a variant

Major allele = the allele most commonly present in the population.
Minor allele = the less frequent allele.
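
The ~9 million figure follows from the two rounded numbers on these cards (3 billion bases, ~99.7% identity). A minimal Python sketch of that arithmetic; the function and constant names are made up for illustration:

```python
from fractions import Fraction

GENOME_SIZE_BP = 3_000_000_000        # ~3 billion bases (3000 Mb), rounded
SHARED_FRACTION = Fraction("0.997")   # ~99.7% identical between any 2 people

def differing_bases(genome_size_bp: int, shared_fraction: Fraction) -> int:
    """Number of positions expected to differ between two genomes."""
    return int(genome_size_bp * (1 - shared_fraction))

print(differing_bases(GENOME_SIZE_BP, SHARED_FRACTION))  # 9000000
```

Fraction is used only to keep the percentage exact; the inputs themselves are approximations.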

3
Q

What is a single nucleotide variant (SNV)/polymorphism (SNP)?

A
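  • A single-base difference at a given position in the genome
  • High frequency: 1 every 300 nucleotides in the reference genome
  • One individual: 1 every 1000 bases
  • Millions of SNVs identified in human genomes
  • Majority not in the exome
  • Generated when a replication error is wrongly corrected by mismatch repair during DNA replication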
4
Q

How do DNA replication and mismatch repair work?

A

Double-stranded DNA is unwound by the helicase enzyme. The polymerase runs along each parental strand, using it as a template and adding bases to build the new daughter strand. A mismatched base pair is detected by one of the repair mechanisms, but these cannot always identify which of the two bases is the wrong one. They can therefore end up removing the original base, creating a SNP. Variation has now been introduced into this group of cells and can be passed down to the next generation.

The two alleles an individual carries at a position are called the genotype.
Once a SNP has arisen, that position is biallelic.

5
Q

Where could SNVs be found?

A

Gene
* No amino acid change (synonymous)
* Amino acid change (non-synonymous/missense)
* Stop codon (nonsense)
* Splice site
* UTR (gene expression)

Promoter
* Protein expression

Non-coding region

Without a deleterious effect or population annihilation, SNVs do not disappear
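
The three coding outcomes listed under "Gene" can be told apart by translating the codon before and after the change. A minimal Python sketch, assuming the standard genetic code and DNA codons on the coding strand; the function name is hypothetical, and splice-site, UTR and promoter effects are not modelled:

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_coding_snv(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base change inside a codon by its effect on the protein."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"   # no amino acid change
    if alt_aa == "*":
        return "nonsense"     # new stop codon
    if ref_aa == "*":
        return "stop-loss"    # not listed on this card; included for completeness
    return "missense"         # amino acid change (non-synonymous)

print(classify_coding_snv("GAA", "GAG"))  # synonymous (Glu -> Glu)
print(classify_coding_snv("GAG", "GTG"))  # missense (Glu -> Val, the sickle cell change)
print(classify_coding_snv("TAC", "TAA"))  # nonsense (Tyr -> stop)
```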

6
Q

Are the effects of SNVs deleterious or beneficial?

A

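A single variant can be both detrimental and beneficial. The effect depends on where in the genome the change has occurred, and also on the environment. An example is sickle cell anaemia: two copies of the variant cause disease, but carrying one copy protects against malaria.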

7
Q

What is a variant?

A
  • If minor allele frequency (MAF) >1% (i.e. at least 1 in every 100 chromosomes has the non-reference allele) = polymorphism
  • Rare polymorphism: MAF 1-5%
  • Common polymorphism: MAF >5%
  • Safer to use the term variant, since it is not always associated with disease.
  • All variants start off rare
  • Evolutionary forces affect whether or not a variant remains rare
  • A rare variant may be damaging and/or recent
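
The MAF thresholds above can be written as a small lookup. A minimal Python sketch; the function name and the label for variants under 1% are made up for illustration, and treating exactly 1% as a polymorphism is an assumption, since the card gives both ">1%" and "at least 1 in every 100":

```python
def classify_by_maf(maf: float) -> str:
    """Label a variant from its minor allele frequency, using the card's thresholds."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency must lie between 0 and 0.5")
    if maf > 0.05:
        return "common polymorphism"
    if maf >= 0.01:  # assumption: the 1% boundary itself counts as a polymorphism
        return "rare polymorphism"
    return "rare variant (below the 1% polymorphism threshold)"

for maf in (0.20, 0.03, 0.001):
    print(maf, "->", classify_by_maf(maf))
```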
8
Q

What is a mutation?

A
  • A new allele arises; we now have a variant
9
Q

What is gene flow?

A
  • Migration leading to introduction of that variant into another population
10
Q

What is genetic drift?

A

Random change in variant allele frequency between generations
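
One way to see this random change is a toy simulation. The sketch below assumes a Wright-Fisher style model (fixed population size, each chromosome in the next generation drawn at random from the current allele frequency); the model and the function name are not from the card and are used only for illustration:

```python
import random

def simulate_drift(start_freq, n_individuals, generations, seed=None):
    """Return the variant allele frequency in each generation under pure drift."""
    rng = random.Random(seed)
    n_chromosomes = 2 * n_individuals  # diploid: 2 chromosomes per individual
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each chromosome in the next generation carries the variant with
        # probability equal to its current frequency (no selection, no mutation).
        copies = sum(rng.random() < freq for _ in range(n_chromosomes))
        freq = copies / n_chromosomes
        trajectory.append(freq)
    return trajectory

# Frequencies wander more in small populations than in large ones.
print(simulate_drift(0.5, n_individuals=10, generations=5, seed=1))
print(simulate_drift(0.5, n_individuals=1000, generations=5, seed=1))
```

Once the frequency reaches 0 or 1 it stays there, which is how drift alone can lose or fix a variant.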

11
Q

What is selection?

A

Non-random change in variant allele frequency between generations because the presence of one allele/genotype is pathogenic (negative selection) or beneficial (positive selection)

12
Q

Is every base identical between individuals? Is every genome exactly 3000Mb?

A
  • No. Example: a microsatellite
  • Also known as a short tandem repeat
  • e.g. AC = the repeat unit; it is repeated in tandem (i.e. one after another)
  • The number of repeats can vary between people

Microsatellites can be di-, tri-, tetra-, penta-, hexanucleotide repeats, etc., depending on the length of the repeat unit.
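
Counting the tandem copies of a repeat unit shows how two alleles of the same microsatellite differ in length. A minimal Python sketch; the two allele sequences and the function name are invented for illustration:

```python
import re

def longest_tandem_run(sequence: str, unit: str) -> int:
    """Largest number of back-to-back copies of `unit` found in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence)]
    return max(runs, default=0)

# Two hypothetical alleles of the same (AC)n microsatellite.
allele_1 = "GGT" + "AC" * 4 + "TTG"
allele_2 = "GGT" + "AC" * 7 + "TTG"

print(longest_tandem_run(allele_1, "AC"))  # 4
print(longest_tandem_run(allele_2, "AC"))  # 7
print(len(allele_2) - len(allele_1))       # 6 (bp difference in length)
```

Because the alleles differ in length, genomes carrying them are not exactly the same size.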

13
Q

a) What are microsatellites?
b) Where might microsatellites be found?

A

a)
* 1000s in genome
* Repeat units
* Varying numbers of repeats
* Alters actual size of that region of the genome
* Multiallelic
* Can be anywhere in genome
* May do nothing

b)
Microsatellites may be in:
* Parts of the genome not coding for protein: intronic or UTR (may affect gene expression), or intergenic
* Exonic: extra amino acids in protein

Pathogenic example: expansion disorders, e.g. Huntington’s = trinucleotide repeat expansion disorder, basically a “bad” microsatellite

14
Q

What is a copy number variation?

A

This is when the number of copies of a chunk of DNA varies between individuals.

  • The simplest type of copy number variation is the presence or absence of a gene.
  • An individual’s genome could therefore contain two, one, or zero copies.
  • Duplication of a genomic segment could result in diploid copy numbers of two, three, or four.
  • Pair of homologous chromosomes, i.e. 2 copies of chromosome 12
  • Every locus (gene, base, genomic region) is in theory present as diploid (2 copies)
15
Q

What is non-allelic homologous recombination in meiosis?

A
  • A-D = loci on chromosome
  • Grey and blue = homologous chromosomes aligning in meiosis I
  • Red bands = regions of high sequence similarity, often viral/bacterial genomes that have been incorporated through evolution
  • Allelic recombination (between correctly aligned regions) is good! – shuffling of alleles
  • But non-allelic recombination (between misaligned regions of high sequence similarity) results in duplication/deletion and copy number change
16
Q

What are CNVs?

A
  • Intergenic
  • But – quite large (>1kb) so often affect one or more genes (or parts of genes)
  • ~12% of genome = CNV
  • >2000 identified
  • 1kb-5000kb in size

CNVs can cause diseases such as DiGeorge syndrome.

17
Q

What are the types of common genetic variant?

A
  • Single Nucleotide Polymorphisms (SNPs) ~17 million identified; ~3 million/genome
  • Microsatellites ~3% of the genome
  • Copy Number Variants (CNVs) >2000 identified; ~100 per genome
  • Remember – everyone “has” every variant, what may differ between individuals is the genotype

If biallelic, the frequency of the minor allele is relatively high
* Population frequency
* i.e. proportion of chromosomes that carry each allele in the population
* Or multiallelic
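
Population allele frequency, as the proportion of chromosomes carrying each allele, can be counted directly from genotypes. A minimal Python sketch; the five genotypes and the function name are invented for illustration:

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Proportion of chromosomes carrying each allele, from diploid genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())  # 2 chromosomes per individual
    return {allele: n / total for allele, n in counts.items()}

# Five hypothetical individuals genotyped at one biallelic SNP (10 chromosomes).
genotypes = [("A", "A"), ("A", "G"), ("A", "A"), ("G", "G"), ("A", "A")]

freqs = allele_frequencies(genotypes)
minor_allele = min(freqs, key=freqs.get)
print(freqs)          # {'A': 0.7, 'G': 0.3}
print(minor_allele)   # G
```

The same function works for a multiallelic site such as a microsatellite if each allele is labelled, for example by its repeat count.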

18
Q

What are common variants and disease/traits associated with them?

A
  • Most common variants do not cause Mendelian, monogenic disorders.
  • The majority are probably neutral (particularly intergenic variants).
    BUT!
  • They may well impact upon complex, non-Mendelian disorders and undoubtedly contribute to general individual variation (personality, sporting ability, looks, etc.)
19
Q

What are the effects of variants?

A
  • Can be beneficial
  • Can be pathogenic
  • Most are neutral
  • Can also be used as markers to help find disease-causing genes and mutations
  • Autozygosity mapping & linkage studies (Microsatellites, SNPs)
  • Association analysis (SNPs, CNVs)
20
Q

What is the book analogy?

A
  • Whole book = genome
  • Chapter = chromosome
    Delete or duplicate chapter and you can really mess the story up
  • Paragraph = CNV
    Delete or duplicate a paragraph and, as long as it’s not key, you can make do
  • Sentence = microsatellite
    Accidentally repeat several words within that sentence, you elongate it, it’s annoying but not fatal to the plot
  • Letter = SNP
    Typos often barely change the meaning at all
21
Q

What is the polymerase slippage model?

A

It is an error in DNA replication that explains why microsatellites can be multiallelic.
Within a run of repeats, the polymerase can lose track of where it should be adding bases and of how much of the daughter strand it has already synthesised. It detaches briefly and tries to realign. Because there are so many identical repeats, it can reattach in the wrong place, leaving a bubble of unpaired bases.

When the repair mechanism comes along, it adds bases to the other strand to pair with the bubble, ultimately increasing the number of repeats. This produces alleles of different lengths, which are then passed on to the next generation too. Microsatellites are therefore highly variable and unstable.
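
The effect of repeated slippage on allele length can be shown with a toy model. The sketch below assumes each replication has a fixed chance of adding exactly one repeat unit; the probability, the function names and the one-unit step are illustrative assumptions. Only expansion is modelled, as described on this card, although slippage can also shorten a repeat:

```python
import random

def replicate_repeat(repeat_count, slip_prob, rng):
    """One replication: with probability slip_prob, slippage adds one repeat unit."""
    if rng.random() < slip_prob:
        return repeat_count + 1
    return repeat_count

def simulate_lineage(start_repeats, generations, slip_prob, seed=None):
    """Repeat count after each replication along one lineage."""
    rng = random.Random(seed)
    counts = [start_repeats]
    for _ in range(generations):
        counts.append(replicate_repeat(counts[-1], slip_prob, rng))
    return counts

# Several lineages that start from the same allele end up with different lengths.
for seed in range(3):
    print(simulate_lineage(start_repeats=10, generations=20, slip_prob=0.2, seed=seed))
```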