Genome variation Flashcards

1
Q

There are ~3000 Mb in the human genome, so how come we all look different?

A

There are coding variants that affect traits (height, hair colour, intelligence). However, ~99.7% of DNA is the same between any 2 people (i.e. ~9 million bases differ). Any position in the genome that varies between individuals is considered polymorphic, i.e. a variant. Positions that are the same in all individuals are monomorphic.
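
The ~9 million figure follows from the genome size and the shared fraction. A minimal sketch of that arithmetic, assuming the approximate values quoted on this card (the function name is made up for illustration):

```python
def differing_bases(genome_size_bp, shared_fraction):
    """Approximate number of bases that differ between two genomes."""
    return round(genome_size_bp * (1 - shared_fraction))

GENOME_SIZE_BP = 3_000_000_000  # ~3000 Mb, as quoted on the card
SHARED_FRACTION = 0.997         # ~99.7% identical between any 2 people

print(differing_bases(GENOME_SIZE_BP, SHARED_FRACTION))  # 9000000
```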

2
Q

What are the different types of common genetic variant?

A

SNPs – Single nucleotide polymorphisms. ~17 million identified, ~3 million per genome
Microsatellites - ~3% of genome. 1000s in genome
CNVs – Copy number variants. >2000 identified. ~100 per genome. ~12% of the genome is CNV
Everyone ‘has every variant’: everyone has the position (e.g. a SNP) in their genome; what may differ between individuals is the alleles that they have there (i.e. the genotype).

3
Q

What is ‘common’?

A

We see lots of these types of variants throughout the genome. If a variant is biallelic, ‘common’ means the frequency of the minor allele is relatively high. These are population frequencies, i.e. the proportion of chromosomes in the population that carry each allele.
Or the variant is multiallelic.
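
A population frequency can be worked out by counting alleles over every chromosome copy. The sketch below assumes diploid genotypes written as pairs of allele letters; the function names and the five example genotypes are invented for illustration.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Proportion of chromosomes carrying each allele."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def minor_allele_frequency(genotypes):
    """MAF of a biallelic site (0.0 if the site is monomorphic)."""
    freqs = allele_frequencies(genotypes)
    if len(freqs) > 2:
        raise ValueError("this sketch only handles biallelic sites")
    return min(freqs.values()) if len(freqs) == 2 else 0.0

# Five hypothetical people = ten chromosomes: 7 carry A, 3 carry G.
people = [("A", "A"), ("A", "G"), ("A", "A"), ("G", "G"), ("A", "A")]
print(allele_frequencies(people))      # {'A': 0.7, 'G': 0.3}
print(minor_allele_frequency(people))  # 0.3
```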

4
Q

What is a SNP?

A

A single nucleotide polymorphism (SNP), also called a single nucleotide variant (SNV), is a change in a single base (hence a single base substitution). They occur at high frequency within the genome: roughly 1 in every 300 nucleotides. The majority of SNPs are not in the exome, because the non-exomic region is much larger and is under less selective pressure than coding areas. They are generally biallelic. They arise when mismatch repair goes wrong: they are generated by faulty replication of DNA ahead of cell division. Although there are mismatch repair mechanisms which should correct these mistakes, some don’t get corrected and you end up with a SNP.

5
Q

What is the normal process of DNA replication, without any faults?

A

Start with a double-stranded sequence of DNA.
Separation: the DNA is unwound by helicase into single parental strands.
Replication: complementary daughter strands are produced by DNA polymerase using the parental strands as templates. This results in two identical copies of the DNA.
The bases added should always be complementary; however, this doesn’t always happen, resulting in mismatches.
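
The complementary-base rule and what counts as a mismatch can be sketched in a few lines. This is a toy model only: strands are compared position by position (strand direction is ignored), and the function names and example sequences are made up.

```python
# Watson-Crick pairing: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesise_daughter(template):
    """Error-free replication: pair every template base with its complement."""
    return "".join(COMPLEMENT[base] for base in template)

def find_mismatches(template, daughter):
    """Positions where the daughter base is not the complement of the template base."""
    return [
        i for i, (t, d) in enumerate(zip(template, daughter))
        if COMPLEMENT[t] != d
    ]

template = "ATGCCGTA"
print(synthesise_daughter(template))          # TACGGCAT
print(find_mismatches(template, "TACGGTAT"))  # [5]
```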

6
Q

What happens when there is a mismatch?

A

Sometimes DNA replication goes wrong and the wrong base is incorporated: a mismatch. When this occurs, there is a mismatch repair mechanism which should identify the mistake and correct it so the bases form a standard Watson-Crick base pair.

However, some mismatches do not get corrected, and what can actually happen is that the original parental base is changed to pair with the misincorporated base (instead of the mismatch being corrected). This results in variation: one daughter cell will have a different base at the same position to the other daughter cell. Thus, you end up with a single nucleotide variant/polymorphism. If this change occurs in the gametes and isn’t deleterious then it will get passed on to the next generation, and as time goes on it can spread through the population.

7
Q

Where may SNVs/SNPs be located?

A

Like any variant in a genome, they may be in a:

  • Gene: no amino acid change (synonymous), amino acid change (missense/non-synonymous), premature stop codon (nonsense), or UTR (and so affect gene expression).
  • Promoter: and hence affect protein expression.
  • Non-coding region (most will be in this).
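
The three coding outcomes above can be told apart by translating the codon before and after the change. A minimal sketch using the standard genetic code; the function name is hypothetical and it assumes the reference codon is not itself a stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_coding_snv(ref_codon, alt_codon):
    """Classify a single-base change within one codon."""
    differences = sum(r != a for r, a in zip(ref_codon, alt_codon))
    if len(ref_codon) != 3 or len(alt_codon) != 3 or differences != 1:
        raise ValueError("codons must be 3 bases long and differ at exactly one base")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if ref_aa == "*":
        raise ValueError("this sketch does not handle changes to a stop codon")
    if alt_aa == "*":
        return "nonsense"
    return "missense"

print(classify_coding_snv("GAA", "GAG"))  # synonymous (Glu -> Glu)
print(classify_coding_snv("GAG", "GTG"))  # missense   (Glu -> Val)
print(classify_coding_snv("TGG", "TGA"))  # nonsense   (Trp -> stop)
```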
8
Q

Do SNVs disappear?

A

Without a deleterious effect or population annihilation, SNVs are not actively removed. They can potentially spread and increase in allele frequency, although a rare variant can still be lost by chance through genetic drift.

9
Q

What is a point mutation?

A

A change in a single base, e.g. a missense mutation in which the single base change results in a codon that codes for a different amino acid. Hence, it is a SNP/SNV. When SNVs are pathogenic (they lead to disease), they can be referred to as point mutations.

10
Q

How is the presence of an allele expressed?

A

For a population, the presence of an allele is expressed as a frequency or a percentage.

11
Q

How do you differentiate between a mutation and a polymorphism?

A

If the minor allele frequency (MAF) is > 1% (i.e. at least 1 in every 100 chromosomes has the non-reference allele), it is a polymorphism. Rare polymorphism: MAF 1-5%. Common polymorphism: MAF >5%.

If it is less than 1% then we would call it a mutation. However, frequency can vary in different populations, so ‘polymorphism’ is actually used to describe something that doesn’t have a detrimental effect. This is not STRICTLY true though, and hence it is safer to use the term ‘variant’ when describing any position in the genome that can vary. Technically, all variants start off rare, but evolutionary forces affect whether or not a variant remains rare. Rare variants may be damaging (deleterious) and/or recent.
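
The frequency thresholds can be written as a small lookup. This is only a sketch of the cut-offs quoted on this card: the function name is made up, and treating exactly 1% and exactly 5% as ‘rare polymorphism’ is an assumption, since the card does not say which side the boundaries fall on.

```python
def classify_by_maf(maf):
    """Label a variant from its minor allele frequency (0.0 to 0.5)."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency lies between 0 and 0.5")
    if maf < 0.01:
        return "mutation"             # < 1%
    if maf <= 0.05:
        return "rare polymorphism"    # 1-5% (boundary handling is an assumption)
    return "common polymorphism"      # > 5%

for maf in (0.002, 0.03, 0.25):
    print(maf, classify_by_maf(maf))
```

As the card notes, the same variant can land in different classes in different populations, so the label depends on which population the frequency was measured in.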

12
Q

How do you create genetic variation?

A

You must first have a MUTATION. This leads to a new allele arising, so you now have a variant. Next is GENE FLOW, which is migration leading to the introduction of that variant into another population. What you can then see is GENETIC DRIFT, which is random change in the variant’s allele frequency between generations. SELECTION will then only happen if the variant/allele is pathogenic (negative selection) or beneficial (positive selection); otherwise allele frequency will just be subject to genetic drift.
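
Genetic drift can be illustrated with a simple random-sampling simulation. The sketch below assumes a Wright-Fisher style model, which this card does not name: a fixed number of chromosomes per generation, each drawn at random from the previous generation's allele frequency, with no selection, mutation or migration. The function name and parameters are illustrative.

```python
import random

def simulate_drift(initial_freq, n_chromosomes, generations, seed=0):
    """Return the allele frequency in each generation under pure drift."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    freq = initial_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each chromosome in the next generation carries the variant
        # with probability equal to its current frequency.
        copies = sum(1 for _ in range(n_chromosomes) if rng.random() < freq)
        freq = copies / n_chromosomes
        trajectory.append(freq)
    return trajectory

# A new variant at 5% in a small population of 100 chromosomes.
trajectory = simulate_drift(initial_freq=0.05, n_chromosomes=100, generations=50)
print(trajectory[-1])  # final frequency after 50 generations
```

Running this with different seeds shows the frequency wandering up or down by chance alone; once it reaches 0 or 1 it stays there.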

13
Q

What is a microsatellite?

A

A microsatellite is where a specific unit is repeated n times. The number of repeat units at that position can vary between individuals. They are also known as STRs (short tandem repeats). The unit itself is not what varies, but rather the number of times it appears. This means one person may have the unit GATA one time, and another 12 times. Therefore, the human genome isn’t exactly 3000 Mb, as the physical length of our DNA varies slightly. Microsatellites are variants that arise due to polymerase slippage. They are highly multiallelic, so most people will be heterozygous (unlike for SNPs).
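
Since the allele at a microsatellite is defined by the repeat count, it can be read off a sequence by counting back-to-back copies of the unit. A minimal sketch; the function name and the flanking sequences in the example are made up.

```python
def max_repeat_count(sequence, unit):
    """Largest number of consecutive copies of `unit` found in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    for start in range(len(sequence)):
        count = 0
        position = start
        # Walk forward one unit at a time while the unit keeps repeating.
        while sequence.startswith(unit, position):
            count += 1
            position += len(unit)
        best = max(best, count)
    return best

# Two hypothetical alleles at the same locus, differing only in repeat number.
allele_1 = "TTC" + "GATA" * 3 + "CCG"
allele_2 = "TTC" + "GATA" * 12 + "CCG"
print(max_repeat_count(allele_1, "GATA"))  # 3
print(max_repeat_count(allele_2, "GATA"))  # 12
```

A person carrying both of these alleles would be heterozygous at the locus (genotype 3/12).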

14
Q

What are the different types of microsatellites (length polymorphisms)?

A

Dinucleotide (two bases repeated a certain number of times): (CA)(CA)(CA). Trinucleotide: (GCC)(GCC)(GCC). Tetranucleotide: (AATG)(AATG)(AATG). Pentanucleotide, hexanucleotide, etc.

15
Q

What is the reference allele?

A

The allele that is more common in the population than the others. It is often the smallest allele, but a range is often quoted. The other alleles occur at a lower frequency.

16
Q

How do microsatellites arise?

A

An error in DNA replication. Usually, polymerase generates the new strands; however, sometimes the polymerase ‘slips’ or stutters (this tends to occur in repeat sequences). This is the polymerase slippage model.

The polymerase disengages from the DNA and the end of the daughter strand is left hanging free/dangling off the template strand. The daughter strand has to re-anneal to complementary bases on the template for replication to carry on. However, because the sequence is repetitive, it can hybridise at the wrong place, forcing the unpaired bases into a bubble. The bubble formed in the new strand is recognised as an error and the DNA repair mechanism tries to correct this by realigning the template strand with the new strand, so the bubble is straightened out. It does this by creating a gap in the parental template strand and inserting n new bases, which creates an expansion when replication continues. The resulting double helix is thus expanded, so the daughter ends up with an increased number of copies of that unit = variation generated.
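
The net effect of slippage on repeat number can be sketched as a toy model. It assumes each round of replication has some fixed chance of a slip, and that a slip adds exactly one repeat unit (the expansion case described above); the functions, the probability value and the one-unit step are all illustrative assumptions, not measured quantities.

```python
import random

def replicate_repeat(repeat_count, slip_probability, rng):
    """One round of replication; a slip expands the repeat by one unit."""
    if rng.random() < slip_probability:
        return repeat_count + 1
    return repeat_count

def repeat_count_after(repeat_count, replications, slip_probability, seed=0):
    """Repeat count after a number of successive replications."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    for _ in range(replications):
        repeat_count = replicate_repeat(repeat_count, slip_probability, rng)
    return repeat_count

# Start with 12 repeats; the slip probability here is purely illustrative.
print(repeat_count_after(12, replications=1000, slip_probability=0.001))
```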

17
Q

Where may microsatellites be located?

A

Part of the genome not coding for protein (98% of genome), i.e. intronic or UTR (may affect gene expression), or intergenic.
Exonic – extra amino acids in the protein.

18
Q

What are expansion disorders?

A

Disorders in which microsatellite expansions lead to disease, e.g. Huntington’s disease, a trinucleotide repeat expansion (CAG units) disorder.

19
Q

What are CNVs?

A

Copy number variants. These occur when chunks (~1 kb-5 Mb) of the genome are repeated or deleted, e.g. the first 2000 bases are read and then the same sequence starts again (duplicated). The number of copies can vary between people. CNVs are relatively common and do not cause disease most of the time. The simplest type of copy number variation is the presence or absence of a gene. An individual’s genome could thus contain 0, 1, or 2 copies. Duplication of a genomic segment could result in diploid copy numbers of 2, 3, or 4. CNVs are variants that arise due to non-allelic homologous recombination.
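
The diploid copy numbers quoted above come from adding up the copies on the two homologous chromosomes. A short sketch that enumerates the combinations (the function name is made up):

```python
from itertools import product

def possible_diploid_copy_numbers(copies_per_chromosome):
    """All totals obtainable from one maternal plus one paternal chromosome."""
    return sorted({a + b for a, b in product(copies_per_chromosome, repeat=2)})

# Presence/absence of a gene: each chromosome carries 0 or 1 copy.
print(possible_diploid_copy_numbers([0, 1]))  # [0, 1, 2]
# Duplication: each chromosome carries 1 or 2 copies.
print(possible_diploid_copy_numbers([1, 2]))  # [2, 3, 4]
```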

20
Q

What do we expect in terms of copies at a position on a chromosome?

A

At any given position on a chromosome, we expect there to be two copies because we are diploid. However, sometimes this isn’t the case, due to a deletion (resulting in 1 copy of a gene locus) or a duplication (resulting in, e.g., 3 copies of a gene locus).

21
Q

How do CNVs arise?

A

Non-allelic homologous recombination (NAHR) in meiosis. Normal allelic recombination (crossing over, chiasma formation) is a good thing: the shuffling of alleles generates recombinants, which leads to genetic variation.

However, sometimes homologous pairs of chromosomes can misalign slightly, and this is where you can get NAHR. Every locus should be aligned with the corresponding locus on the homologous chromosome; however, they can shift out of alignment. This is due to sequence similarity: regions in different parts of the chromosome that have the same sequence as each other pair up with one another. This becomes a problem when a recombination event occurs around that region, as NAHR can then result in duplication/deletion and copy number change. A single NAHR event produces a duplication and a deletion simultaneously, one on each recombinant chromosome, and a zygote formed from one of the resulting gametes inherits the copy number change.
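
The duplication and deletion products of NAHR can be shown with a toy chromosome written as a list of segments. Everything here is illustrative: "R" stands for a repeated region with the same sequence at two places, "A", "B" and "C" are unique segments, and the function name is made up.

```python
def crossover(chrom_1, chrom_2, i, j):
    """Recombine two homologues at segment i of chrom_1 and segment j of chrom_2."""
    if chrom_1[i] != chrom_2[j]:
        raise ValueError("crossover needs matching sequence at both positions")
    recombinant_1 = chrom_1[: i + 1] + chrom_2[j + 1 :]
    recombinant_2 = chrom_2[: j + 1] + chrom_1[i + 1 :]
    return recombinant_1, recombinant_2

chromosome = ["A", "R", "B", "R", "C"]

# Allelic recombination: the same copy of R pairs up, nothing is gained or lost.
print(crossover(chromosome, chromosome, 1, 1))
# (['A', 'R', 'B', 'R', 'C'], ['A', 'R', 'B', 'R', 'C'])

# NAHR: the first R on one homologue misaligns with the second R on the other.
print(crossover(chromosome, chromosome, 1, 3))
# (['A', 'R', 'C'], ['A', 'R', 'B', 'R', 'B', 'R', 'C'])
```

In the misaligned case one recombinant has lost segment B (deletion) and the other carries it twice (duplication).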

22
Q

Where may CNVs be located?

A

They can be in any given part of the genome. They are often intergenic. However, because they are quite large (>1 kb), they can affect one or more genes (or parts of genes).

23
Q

What is a microdeletion disorder?

A

A disorder that arises due to a pathogenic CNV. E.g. DiGeorge syndrome.

24
Q

What are the parts of the Book analogy?

A

Letter= SNP – typos often barely change meaning
Sentence= Microsatellite – repeating words/sentences. Annoying, not fatal to plot
Paragraph= CNV – Delete or duplicate a paragraph and, as long as it’s not key, it’s fine.
Chapter = Chromosome – Delete/duplicate chapter – can mess up story!
Whole book= genome

25
Q

What is an allele?

A

A locus is a unique position in the genome. An allele is one version of a particular position or locus in the genome. It may be a single base, e.g. an A allele, or a version of a gene, e.g. the ABO blood group with its A, B and O alleles.