Genome Variation Flashcards

1
Q

What is the human genome, and what are macro- and micro-level variation?

A
  • 23 pairs of chromosomes.
  • ~3 billion base pairs (~20,000 genes).
  • ~2% of the genome codes for protein (the exome).
  • Major macro-level differences/variation are generally associated with disease (aneuploidy, translocations); these are rare.
  • Micro- or molecular-level pathogenic differences are sometimes associated with disease (a point mutation in sickle cell anaemia (SCA), a 3 bp deletion in CFTR); these are also rare.

Coding variants affect traits (height, hair colour, intelligence, etc.).
99.7% of DNA is the same between any 2 people, yet that still leaves ~9 million bases that differ.
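
As a quick check of the arithmetic above, here is a minimal sketch that assumes a round genome size of 3 billion base pairs. The constant and function names are illustrative and not part of the course material.

```python
# Assumed round figures taken from the card above.
GENOME_SIZE_BP = 3_000_000_000   # ~3 billion base pairs
SHARED_PER_MILLE = 997           # 99.7% identical, written as parts per thousand


def differing_bases(genome_size_bp: int, shared_per_mille: int) -> int:
    """Return how many bases differ between two people.

    Integer arithmetic is used so the result is exact.
    """
    return genome_size_bp * (1000 - shared_per_mille) // 1000


print(differing_bases(GENOME_SIZE_BP, SHARED_PER_MILLE))  # 9000000
```

So a 0.3% difference across 3 billion bases gives the ~9 million differing positions quoted on the card.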

2
Q

What is a variant?

A

Any position in the genome that varies between individuals is considered a (polymorphic) variant.

Polymorphism = a discontinuous genetic variation resulting in the occurrence of several different forms or types of individuals among the members of a single species.

Discontinuous genetic variation divides the individuals of a population into two or more sharply distinct forms.

Monomorphic = not variant.

3
Q

What are the 3 common genetic variants?

A

These are not generally harmful.

  • Single Nucleotide Polymorphisms (SNPs) ~ 17 million identified; ~ 3 million/genome.
  • Microsatellites ~ 3% of genome
  • Copy Number Variants (CNVs) > 2000 identified; ~ 100 per genome.
4
Q

Is every base identical between individuals?

A

No, 2 people differ in DNA sequence at about 9 million base pairs.

5
Q

What is a Single Nucleotide Variant (SNV)?

A

It is a change in a single base (base substitution).

The genome is littered with them. Comparing human genomes reveals:

  • There is a high frequency: about 1 every 300 nucleotides when genomes are compared against the reference.
  • In one individual: about 1 occurs every 1000 bases.
  • Millions of SNVs have been identified in human genomes (12 million in total).
  • The majority are not in the exome.
  • Typically generated by faulty DNA replication in mitosis. Mismatch repair mechanisms should correct these mistakes, but some are not corrected (or are corrected on the wrong strand), and we end up with an SNV.
6
Q

What does Bi-allelic mean?

A

A site is bi-allelic when two alleles are possible at that position.

7
Q

Describe how SNVs/SNPs come about.

A

During DNA replication, the two strands will separate and will be used as templates to synthesise complementary strands.
If that goes well, then we should end up with two identical copies.

Suppose, however, that while the new strand is being synthesised a G is incorporated opposite a template T instead of an A (THE MISTAKE). The mismatch repair system should identify this mistake and correct the G, so that the bases form a standard Watson-Crick base pair again.

In this instance, though, it has not corrected the G; it has instead replaced the T on the template strand with a C. And thus, at this position in the population, there is now either a T or a C.

If such a change occurs in the gametes and isn’t deleterious, it will be passed on to the next generation, and as time goes on, it can spread throughout the population.
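
The two possible outcomes of mismatch repair described above can be sketched as a toy model. The function name and the `fix_new_strand` flag are hypothetical; real mismatch repair is far more involved than a single boolean.

```python
# Watson-Crick pairing partners.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def repair_mismatch(template_base: str, new_base: str, fix_new_strand: bool):
    """Resolve a mismatched pair and return (template_base, new_base).

    fix_new_strand=True  -> the wrongly incorporated base is corrected.
    fix_new_strand=False -> the template is changed to match the mistake,
                            which creates a new variant at this position.
    """
    if fix_new_strand:
        return template_base, COMPLEMENT[template_base]
    return COMPLEMENT[new_base], new_base


# Template T should pair with A, but a G was incorporated by mistake.
print(repair_mismatch("T", "G", fix_new_strand=True))   # ('T', 'A')
print(repair_mismatch("T", "G", fix_new_strand=False))  # ('C', 'G')
```

In the second call the template T has become a C, which is the T-or-C variant the card describes.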

8
Q

Where can SNVs end up in the genome, and what effect can they have?

A

SNVs can end up anywhere, such as:

THE GENE:

  • no amino acid change (synonymous variant)
  • amino acid change (non-synonymous/missense)
  • stop codon (nonsense)
  • splice site (splice variant)
  • UTR (gene expression)

THE PROMOTER:
- protein expression

THE NON-CODING REGION:
- n/a (unknown)

Unless they have a deleterious effect on the population, or the population is annihilated, SNVs need not disappear. They can potentially spread by random chance throughout the population.
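
For SNVs inside a coding sequence, the first three outcomes in the gene list (synonymous, missense, nonsense) can be told apart by translating the codon before and after the change. The sketch below assumes the standard genetic code and DNA codons on the coding strand; the function name is hypothetical, and splice-site and UTR effects are not modelled.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snv(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-codon change by its effect on the protein."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":          # '*' marks a stop codon
        return "nonsense"
    return "missense"


print(classify_coding_snv("GAA", "GAG"))  # synonymous (Glu -> Glu)
print(classify_coding_snv("GAG", "GTG"))  # missense (Glu -> Val)
print(classify_coding_snv("TAC", "TAA"))  # nonsense (Tyr -> stop)
```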

9
Q

What is the difference between mutations and polymorphisms?

A

If the minor allele frequency is less than 1%, it’s a mutation.

If the minor allele frequency is more than 1%, it’s a polymorphism.

  • rare polymorphisms: MAF 1-5%
  • common polymorphism: MAF >5%

Thus, it is safer to use the term variant [all variants start off rare].

A variant tends to be rare if it is damaging or recent; evolutionary forces affect whether or not it remains rare.
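
The MAF thresholds above can be written as a small classifier. This is a minimal sketch for a bi-allelic site; the function names are illustrative, and how the exact boundary values (1% and 5%) are assigned is an assumption, since the card does not say.

```python
def minor_allele_frequency(ref_count: int, alt_count: int) -> float:
    """MAF at a bi-allelic site: frequency of the less common allele."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no alleles observed")
    return min(ref_count, alt_count) / total


def classify_by_maf(maf: float) -> str:
    """Apply the card's thresholds (boundary handling is an assumption)."""
    if maf < 0.01:
        return "mutation"
    if maf <= 0.05:
        return "rare polymorphism"
    return "common polymorphism"


# 5 copies of the minor allele among 1000 sampled chromosomes.
maf = minor_allele_frequency(995, 5)
print(maf, classify_by_maf(maf))  # 0.005 mutation
```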

10
Q

How do evolutionary forces affect SNVs?

A

MUTATION: a new allele arises, we now have a variant

GENE FLOW: migration leading to the introduction of that variant into another population

GENETIC DRIFT: random change in variant allele frequency between generations

SELECTION: non-random change in variant allele frequency between generations because the presence of one allele/genotype is pathogenic (negative selection) or beneficial (positive selection)
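
Genetic drift is the easiest of these forces to simulate. The sketch below uses a simple Wright-Fisher-style resampling model, which is an assumption of this example and not something the card specifies: each generation, every allele copy is drawn at random according to the previous generation's frequency. The function name, parameters and fixed seed are illustrative.

```python
import random


def simulate_drift(initial_freq: float, population_size: int,
                   generations: int, seed: int = 0) -> list:
    """Return the variant allele frequency at each generation.

    Diploid population, so 2 * population_size allele copies are
    resampled per generation. No selection, mutation or gene flow.
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    n_alleles = 2 * population_size
    freq = initial_freq
    history = [freq]
    for _ in range(generations):
        # Each copy in the next generation carries the variant
        # with probability equal to the current frequency.
        count = sum(1 for _ in range(n_alleles) if rng.random() < freq)
        freq = count / n_alleles
        history.append(freq)
    return history


history = simulate_drift(initial_freq=0.1, population_size=50, generations=100)
print(history[-1])  # final frequency after 100 generations of drift
```

In small populations the frequency wanders noticeably and the variant is often lost or fixed by chance alone; increasing `population_size` damps the fluctuations.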

11
Q

What are biological impacts of variants?

A

Consider:

  • Where are they?
    In a gene?
    Not in a gene?
  • What sort of gene?
    Key developmental gene, e.g. HOXD1
    Pigmentation, e.g. MC1R
  • Not straightforward
    Depends on the type of variant (lots of variants in every gene: some pathogenic, some not; it also depends on the environment)
12
Q

Summary of SNPs

A
  • Millions in genome
  • A position in genome at which the base can vary
  • Can be anywhere in the genome (genic or non-genic)
  • May do nothing, may affect a trait, may be associated with disorder
  • Generally bi-allelic
  • Due to mutation and mismatch repair
  • These are base substitutions
  • When pathogenic, may be called point mutations.
13
Q

What are microsatellites (a.k.a short tandem repeats)?

A

These are sets of short DNA sequences repeated in tandem (i.e. one after another) at a particular locus on a chromosome. The number of repeats varies between individuals, and so they can be used for genetic fingerprinting.

Microsatellites may lie in the 98% of the genome that does not code for protein (intronic or UTR, where they may affect gene expression, or intergenic), or they may lie in exons (where extra amino acids can be added to the protein).

The sequence of the repeat unit (e.g. GATA) does not vary. The number of times the unit appears can vary.

There are dinucleotides, trinucleotides, tetranucleotides, etc., which can repeat a varied number of times.

Microsatellites show increased heterozygosity; they are highly polymorphic and highly multiallelic. For SNVs, by contrast, most people are homozygous.

Microsatellites are generally not harmful. The exception is the expansion disorders, e.g. Huntington’s, a trinucleotide repeat expansion disorder: basically a “bad” microsatellite.
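
To make "same unit, different number of repeats" concrete, here is a minimal sketch that counts the longest uninterrupted run of a repeat unit in a sequence. The function name and the two example alleles are made up for illustration.

```python
def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit`."""
    best = 0
    for start in range(len(sequence) - len(unit) + 1):
        run = 0
        pos = start
        # Count consecutive copies of the unit beginning at `start`.
        while sequence.startswith(unit, pos):
            run += 1
            pos += len(unit)
        best = max(best, run)
    return best


# Two hypothetical alleles at the same locus: same flanks, same unit,
# different repeat counts.
allele_1 = "TTGATAGATAGATACC"   # (GATA) x 3
allele_2 = "TTGATAGATACC"       # (GATA) x 2
print(longest_tandem_run(allele_1, "GATA"))  # 3
print(longest_tandem_run(allele_2, "GATA"))  # 2
```

An individual carrying both alleles would be heterozygous at this locus, and the two alleles differ in length by one repeat unit (4 bp).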

14
Q

Describe the Polymerase Slippage Model and what an error can lead to?

A

During replication, polymerase slippage and subsequent reattachment may cause a bubble to form in the new strand. Slippage is thought to occur in sections of DNA with repeated patterns of bases (such as CAG), i.e. microsatellites.

Then, DNA repair mechanisms realign the template strand with the new strand and the bubble is straightened out. The resulting double helix is thus expanded (microsatellite expanded).

Polymerase slippage (as theorised) cannot occur in DNA without repeating patterns of bases (microsatellites).
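
The net effect of the model can be sketched as a toy string operation: if the sequence contains a repeat tract, one extra copy of the unit ends up in the new strand; if there is no tract, nothing happens. The function name and the `min_repeats` threshold are assumptions made for illustration, and the sketch ignores the actual bubble and repair steps.

```python
def slip_and_expand(sequence: str, unit: str, min_repeats: int = 2) -> str:
    """Return the new strand after one slippage event at a repeat tract.

    A tract is taken to be at least `min_repeats` back-to-back copies of
    `unit`. Without a tract, the sequence is returned unchanged.
    """
    tract = unit * min_repeats
    start = sequence.find(tract)
    if start == -1:
        return sequence        # no repeats, so no slippage in this model
    # The polymerase re-copies one unit, lengthening the tract by one.
    return sequence[:start] + unit + sequence[start:]


print(slip_and_expand("TTCAGCAGCAGAA", "CAG"))  # TTCAGCAGCAGCAGAA
print(slip_and_expand("TTCAGAA", "CAG"))        # TTCAGAA (unchanged)
```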

15
Q

Summary of Microsatellites

A
  • 1000s in genome
  • Repeat units
  • Varying numbers of repeats
  • Alters actual size of that region of the genome
  • Multiallelic
  • Can be anywhere in genome
  • May do nothing….
16
Q

What are Copy Number Variants (CNVs)?

A

CNV is a phenomenon in which sections of the genome are repeated and the number of repeats in the genome varies between individuals.

17
Q

How do CNVs occur?

A

They come about through non-allelic homologous recombination in meiosis.

Allelic recombination is good; it provides the shuffling of alleles.
However, non-allelic recombination results in a duplication or deletion, and thus a copy number change.

(Misalignment in meiosis; check Panopto.)

18
Q

Where could CNVs be found?

A

They can be intergenic, but they are quite large (>1 kb), so they often affect one or more genes (parts of genes).

  • 12% of genome = CNV
  • > 2000 identified
19
Q

Are variants deleterious or beneficial or neutral?

A

Most are neutral.

They can be deleterious or beneficial. Or they can be both beneficial and deleterious (e.g. individuals who are heterozygous for the sickle cell allele, with 1 sickle allele and 1 normal haemoglobin allele, a.k.a. sickle cell trait, have some protective advantage against malaria).

Most common variants do not cause Mendelian, monogenic disorders.
The majority are probably neutral (particularly intergenic variants).

But they can impact upon complex, non-Mendelian disorders and undoubtedly contribute to general individual variation (personality, sporting ability, looks, etc.).

20
Q

What are some common variations and disease/traits associations?

A
Common variants:
Height
Allergies
Haemochromatosis
Type 1 and Type 2 diabetes
Alzheimer’s
Anxiety
Dyslexia
Memory
Sexual desire
Aging
Common diseases/traits:
Nicotine dependence
Faithfulness
Age-related hearing loss
Gout
Sciatica
Sense of smell
HIV susceptibility
Anti-social behaviour
21
Q

Are variants of any use?

A

Yes, they can be used as markers to help find disease-causing genes and mutations.
Examples include:
- autozygosity mapping & linkage studies (Microsatellites, SNPs)
- association analysis (SNPs, CNVs)

22
Q

The book analogy.

Example

A

Whole book = genome

Chapter = chromosome
Delete or duplicate a chapter and you can really mess the story up

Paragraph = CNV
Delete or duplicate a paragraph and, as long as it’s not key, you can make do

Sentence = microsatellite
Accidentally repeat several words within that sentence and you elongate it; it’s annoying but not fatal to the plot

Letter = SNP
Typos often barely change the meaning at all

23
Q

Summary

A
  • The genome is dynamic!
  • The sequence varies
  • The size varies
  • Most genetic variation is harmless; otherwise we’d be in trouble, because variation is common!
  • Types of variation that are common = SNPs, microsatellites, CNV
  • Some of this variation of course accounts for the huge phenotypic variation we see in the human species
  • Some may be pathogenic
  • We know where in the genome these variants are, so we have maps of the genome based on this information, and these can be used to identify the location of disease-causing variants