S4: Genome Variation Flashcards

1
Q

How big is the human genome and does it vary between individuals?

A
  • The haploid human genome is about 3 billion base pairs and contains about 20,000 genes.
  • Only about 2% of the genome codes for protein; those 20,000 genes make up this 2%, which is called the exome. The other 98% is non-coding.
  • Not every base is identical between individuals, which is why we differ phenotypically, e.g. height, colour, diseases. Pathogenic mutations are rather rare (otherwise we’d all have serious diseases), but there is also a lot of common variation in the genome that isn’t associated with disease and instead underlies the normal phenotypic differences we see, e.g. height, hair, colour, intelligence. Some of this variation is in the coding regions of the genome (the 2%) and some is in the non-coding regions (the 98%).
  • About 99.9% of DNA is identical between any two people, which leaves about 3 million bases that differ between individuals.
  • Any position in the genome that varies between individuals is considered to be polymorphic.
  • Major macro-level differences are generally associated with disease (aneuploidy, translocations, etc.).
  • Micro- or molecular-level pathogenic differences are also sometimes associated with disease (e.g. the point mutation in sickle cell anaemia (SCA), the 3 bp deletion in CFTR).
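The numbers on this card are simple arithmetic, sketched below. The function names are made up for illustration; the point is that roughly 3 million differing bases corresponds to about 99.9% identity across a 3 billion bp genome.

```python
GENOME_BP = 3_000_000_000  # haploid genome size quoted on the card


def exome_bp(genome_bp, coding_fraction=0.02):
    """Bases that code for protein (about 2% of the genome)."""
    return round(genome_bp * coding_fraction)


def differing_bases(genome_bp, identical_fraction):
    """Bases expected to differ between two people."""
    return round(genome_bp * (1 - identical_fraction))


print(exome_bp(GENOME_BP))                # size of the exome in bp
print(differing_bases(GENOME_BP, 0.999))  # differences at 99.9% identity
```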
2
Q

Using CF as an example of common variation in genes

A
  • There are many different genetic mutations at different positions in the CFTR gene, and only a few of these mutations will cause cystic fibrosis.
  • The vast majority of these mutations in the gene are harmless and common variations that can occur in anyone. They may even change an amino acid and thus the primary sequence of the protein but this is still harmless.
  • So even in a gene associated with disease we have common variations.
3
Q

What is an allele?

A

An allele is one of the alternative versions of the sequence found at a particular position (locus) in the genome; the locus could be a single base or an entire gene. In a diploid genome we have two alleles at any autosomal locus, and these may be identical (homozygous) or different (heterozygous). The combination of alleles gives us our genotype.

4
Q

What do biallelic, triallelic and multiallelic mean?

A
  • If we only ever see two possible alleles at a particular locus in the population = biallelic.
  • If three = triallelic (e.g. ABO blood groups: the gene has three main alleles, A, B and O, so there are three possible variants).
  • If more than three = multiallelic.
  • If biallelic, the frequency of the minor allele is relatively high.
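As a rough sketch of the definitions above, the number of distinct alleles seen at a locus decides the label. The function name, the toy genotypes and the "monomorphic" label for a locus with a single allele are assumptions added for illustration.

```python
def allelic_class(genotypes):
    """Label a locus by how many distinct alleles appear in the sample."""
    alleles = {allele for genotype in genotypes for allele in genotype}
    n = len(alleles)
    if n < 2:
        return "monomorphic"  # assumed label: no variation seen at all
    if n == 2:
        return "biallelic"
    if n == 3:
        return "triallelic"
    return "multiallelic"


# Toy ABO genotypes (two alleles per person): A, B and O are all present.
abo_sample = [("A", "O"), ("B", "B"), ("A", "B"), ("O", "O")]
print(allelic_class(abo_sample))
```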
5
Q

What is a genetic variant?

A
  • Variants can be common or uncommon.
  • A variant is common if we see lots of that type of variation in the genome (e.g. CNV, STR).
  • A trisomy is not a common variation as we would only see it once or twice in the genome.
6
Q

How does a variant give us a population frequency (Pop n)?

A
  • The population frequency of an allele/variant is the proportion of chromosomes in the population that carry it, e.g. what proportion of the chromosomes in a lecture carry the variant and what proportion do not. For a common variant, even the less common allele still has a relatively high population frequency, so it occurs quite often in the population.
  • The population frequency of an allele is expressed as a % or decimal, e.g. 50% of chromosomes carry this allelic variant.
  • If we looked at the allele frequency of two different populations of the same species the allele frequency may be different.
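Counting chromosomes rather than people can be sketched as follows. The genotypes are invented toy data and the function name is an assumption; each person contributes two chromosomes, so two alleles, to the count.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Proportion of chromosomes in the sample that carry each allele."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Four people = eight chromosomes at one biallelic position.
sample = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "A")]
print(allele_frequencies(sample))
```

Running the same function on samples from two different populations is how the difference in allele frequency between them would show up.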
7
Q

What is a polymorphism and mutation?

A
  • A polymorphism is a variant whose minor allele frequency (the frequency of the less common allele) is greater than 1% in the population.
  • A rare polymorphism is when the minor allele frequency is between 1% and 5%.
  • A common polymorphism is when the minor allele frequency is greater than 5%.
  • Any allelic variant with a frequency below 1% in the population is considered a mutation: such a low frequency suggests it is likely to be damaging, with selective pressure keeping its frequency down.
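The thresholds above can be written as a small lookup. This is only a sketch: the function name is hypothetical, and how to treat a frequency of exactly 1% or exactly 5% is not stated on the card, so the boundaries chosen here are an assumption.

```python
def classify_by_maf(maf):
    """Label a variant from its minor allele frequency (as a decimal)."""
    if maf < 0.01:
        return "mutation"
    if maf <= 0.05:  # assumption: exactly 1% and exactly 5% count as rare
        return "rare polymorphism"
    return "common polymorphism"


for maf in (0.005, 0.03, 0.2):
    print(maf, classify_by_maf(maf))
```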
8
Q

Why do all variants start off as rare?

A
  • All variants start off rare: at some point one person’s allele changes, and they then carry a new variant.
  • Evolutionary forces (selection) will determine whether the variant remains rare or becomes more common.
  • Thus a rare variant may be damaging and/or recent.
9
Q

List types of rare genetic variation

A
  • Translocations, aneuploidy, deletions and duplications.
  • Most people do not have them, and they generally have severe clinical consequences.

10
Q

List types of common genetic variation

A
  • Single nucleotide polymorphism (SNP)/Single nucleotide variant (SNV).
  • Microsatellite/Short tandem repeat (STR).
  • Minisatellite/Variable number of tandem repeats (VNTR).
  • Copy number variation (CNV).
  • We all have lots of these. They may cause disease, affect traits or alter susceptibility to disease.
11
Q

How do we know what is normal and what is a variant when there is a different allele?

A
  • This came from the Human Genome Project, whose sequence was based entirely on the genomes of 4 anonymous individuals.
  • The consensus (reference sequence) is based on the majority allele at each position.
  • Since then, thousands of people have had their genomes sequenced, and the reference DNA is constantly updated.
  • The reference allele will therefore usually be the most common allele in the population, and the minor allele the less common one. The minor allele frequency can be calculated to see whether the position is polymorphic.
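A minimal sketch of "majority allele at each position", assuming the sequences are already aligned and the same length. The four short sequences are invented, and both function names are illustrative.

```python
from collections import Counter


def consensus(sequences):
    """Build a reference by taking the most common base in each column."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


def minor_allele_frequency(column):
    """Frequency of the second most common base at one position."""
    counts = Counter(column).most_common()
    if len(counts) < 2:
        return 0.0  # everyone carries the same base here
    return counts[1][1] / len(column)


aligned = ["ACGT", "ACGA", "ACGT", "TCGT"]
print(consensus(aligned))
print(minor_allele_frequency("TATT"))  # last column of the alignment
```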
12
Q

What is a Single Nucleotide Polymorphism (SNP)/Single Nucleotide Variant (SNV)?

A
  • These are very frequent in the genome: about one position in every 300 nucleotides differs by a single base substitution.
  • Approximately 17 million SNPs have been identified in the human genome. They are natural, common variations generated by errors in DNA replication that mismatch repair fails to correct properly.
  • The majority are not in the exome.
13
Q

Describe how SNPs arise

A
  • During DNA replication DNA helicase separates the two complementary strands and the DNA polymerase moves along each strand synthesising a new complementary strand on the template. DNA polymerase also has a proof-reading ability, so if the wrong nucleotide is inserted then it is immediately removed and replaced with the correct nucleotide.
  • Sometimes proofreading fails, so we have the mismatch repair system, which recognises the mismatch between non-complementary bases, takes one out and puts in a complementary one. Sometimes this doesn’t happen correctly, and this generates SNPs.
  • In that case the mismatch repair system cuts out the correct base in the sequence and puts in one complementary to the wrong base.
  • So the pair is now complementary, but in the daughter cells there is a difference in the DNA at that locus: this has created a SNP. If this happened in a gamete it would be passed on to the next generation.
  • They are usually biallelic, as there are usually only two possible alleles at the position in any population.
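The two possible outcomes of mismatch repair can be sketched as a toy model of a single base pair. The function and parameter names are hypothetical, and real repair is far more involved than choosing which base to excise.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_repair(template_base, new_base, excise="new"):
    """Return the (template, new strand) base pair after repair.

    excise="new": the wrongly inserted base is removed (correct repair).
    excise="template": the correct base is removed instead (creates a SNP).
    """
    if excise == "new":
        return template_base, COMPLEMENT[template_base]
    return COMPLEMENT[new_base], new_base


# Template G should pair with C, but polymerase inserted T by mistake.
print(mismatch_repair("G", "T", excise="new"))       # original pair restored
print(mismatch_repair("G", "T", excise="template"))  # pair changed: a SNP
```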
14
Q

What are the consequences if a SNP occurs in a gene or elsewhere?

A
  • Point mutations in a gene can change an amino acid (missense/non-synonymous), introduce a stop codon (nonsense) or affect a splice site.
  • There may be no amino acid change, as the genetic code is degenerate (i.e. more than one codon for a single amino acid); this is synonymous.
  • It may affect a promoter and therefore protein expression.
  • It may be in a non-coding region.
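The coding consequences above can be checked against the standard genetic code. This is a sketch only: the function name is made up, codons are given as DNA (coding strand), and splice-site, promoter and stop-loss effects are deliberately not modelled.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def snp_effect(ref_codon, alt_codon):
    """Classify a single-base change within a codon."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":  # "*" marks a stop codon
        return "nonsense"
    return "missense"


print(snp_effect("GAG", "GTG"))  # Glu to Val, the sickle cell change
print(snp_effect("GAA", "GAG"))  # both codons encode Glu
print(snp_effect("TGG", "TGA"))  # Trp to stop
```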
15
Q

Do SNPs disappear?

A

Without a deleterious effect or population annihilation, SNPs do not disappear.

16
Q

How do evolutionary forces and SNP/SNV account for changes in allele frequency?

A
  1. Mutation, where a new allele arises; we now have a SNP or SNV.
  2. Gene flow, where migration leads to the introduction of a SNP into another population.
  3. Genetic drift, where there are random changes in SNP allele frequency between generations.
  4. Selection, where there is a non-random change in SNP allele frequency between generations because the presence of one allele/genotype is pathogenic (negative selection) or beneficial (positive selection).
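Genetic drift (point 3) can be illustrated with a small simulation. This sketch uses Wright-Fisher-style resampling, a standard textbook simplification that the card itself does not name; the function name, population size and seed are arbitrary assumptions.

```python
import random


def drift(freq, n_chromosomes, generations, seed=0):
    """Track one allele's frequency under purely random sampling."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    history = [freq]
    for _ in range(generations):
        # Each chromosome in the next generation copies the allele
        # with probability equal to its current frequency.
        carriers = sum(rng.random() < freq for _ in range(n_chromosomes))
        freq = carriers / n_chromosomes
        history.append(freq)
    return history


print(drift(freq=0.1, n_chromosomes=100, generations=10))
```

With no selection at all the frequency still wanders from generation to generation, and in a small population an allele can be lost or fixed by chance alone.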
17
Q

Factors to consider if genetic variants are likely to be neutral or not

A
  • Where are they? In a gene or not?
  • What sort of gene? For example, a variant in a key developmental gene, e.g. HOXD1, is likely to lead to severe consequences. Pigmentation change, e.g. MC1R, shows complex but common variation leading to subtle changes.
  • Is it straightforward? This depends on the type of variant, as some are pathogenic, some are not and some depend on environment.
18
Q

What is a microsatellite?

A
  • Microsatellite is also known as a short tandem repeat.
  • For example, ACACAC, where the repeat unit AC (a dinucleotide repeat) is repeated in tandem (i.e. one after another).
  • The number of repeats can vary between people, which changes the length of that region of the chromosome.
  • The repeat unit can vary as well, e.g. di/tri/tetra/penta/hexanucleotide.
  • Children can inherit them from parents.
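Counting the repeat number at a microsatellite can be sketched like this. The sequences are invented, the function name is an assumption, and the simple left-to-right scan is only meant for clean toy examples.

```python
def count_tandem_repeats(sequence, unit):
    """Longest run of back-to-back copies of `unit` in `sequence`."""
    best = run = 0
    i = 0
    while i <= len(sequence) - len(unit):
        if sequence.startswith(unit, i):
            run += 1
            best = max(best, run)
            i += len(unit)  # jump to where the next copy would start
        else:
            run = 0
            i += 1
    return best


# Two alleles of the same AC microsatellite, differing in repeat number.
print(count_tandem_repeats("GGACACACTT", "AC"))
print(count_tandem_repeats("GGACACACACTT", "AC"))
```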
19
Q

Are microsatellites multiallelic?

A

Unlike SNPs, which are biallelic, microsatellites are multiallelic: there can be many more than 2 variants, so they are highly polymorphic.

20
Q

How do microsatellites arise?

A
  • Polymerase slippage model: an error in DNA replication.
  • Polymerase stutters because of the repetitiveness of the sequence. Single-stranded DNA can flap and become ‘unstuck’ during the replication process, and the DNA pops out (a bubble of unpaired bases).
  • Polymerase can therefore re-attach at the wrong part of the DNA, because the similar repeat sequences align incorrectly (the bubble).
  • The bubble needs to be repaired, so it is opened up and new complementary bases are integrated into the sequence opposite it, e.g. two new bases for a dinucleotide repeat (an extra repeat unit). This is how variation is generated at a microsatellite.
21
Q

Where are microsatellites found?

A
  • Mostly in the 98% of the genome not coding for protein. They can be intronic or in a UTR, where they may affect gene expression, or in intergenic regions.
  • They can be exonic, which can add extra amino acids to a protein, for example in Huntington’s disease.
22
Q

What is a copy number variant (CNV)?

A
  • CNVs cover ~12% of the genome and >2000 have been identified.
  • CNVs are large changes in which big chunks of DNA are duplicated or deleted.
  • The simplest type of copy number variation is the presence or absence of a gene. An individual’s genome could therefore contain two, one, or zero copies.
  • Duplication of a genomic segment could result in diploid copy numbers of two, three, or four. Normally, we should have 2 copies of every gene.
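The copy numbers listed above follow from adding up what each of the two chromosomes carries. A minimal sketch, with a hypothetical function name:

```python
from itertools import product


def possible_copy_numbers(per_chromosome_options):
    """All diploid copy numbers, given the copies one chromosome can carry."""
    return sorted(
        {a + b for a, b in product(per_chromosome_options, repeat=2)}
    )


print(possible_copy_numbers([0, 1]))  # gene absent or present
print(possible_copy_numbers([1, 2]))  # single copy or duplicated
```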
23
Q

How do CNVs arise?

A
  • Non-allelic homologous recombination (NAHR) in meiosis.
  • Allelic recombination is good as it shuffles alleles.
  • But non-allelic recombination results in duplication/deletion and a change in copy number. It happens when homologous chromosomes misalign as they pair up, because of repetitive, similar sequences along the chromosome. A recombination event then mixes the DNA up inappropriately, so when the chromosomes pull apart in meiosis I and then II, one carries a deletion of a gene and another carries a duplication.
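The misalignment can be sketched with chromosomes written as lists of segments. Everything here is a toy: the segment labels, the function name and the choice of crossover point are assumptions, with "rep" standing for the similar repeats that flank gene G.

```python
def unequal_crossover(chrom_a, chrom_b, index_a, index_b):
    """Swap arms after a crossover between two misaligned repeats.

    index_a and index_b are the positions of the repeats that paired up.
    """
    recombinant_1 = chrom_a[: index_a + 1] + chrom_b[index_b + 1 :]
    recombinant_2 = chrom_b[: index_b + 1] + chrom_a[index_a + 1 :]
    return recombinant_1, recombinant_2


homolog = ["A", "rep", "G", "rep", "B"]
# The first repeat on one homolog pairs with the second repeat on the other.
deleted, duplicated = unequal_crossover(homolog, homolog, 1, 3)
print(deleted)     # gene G is missing
print(duplicated)  # gene G appears twice
```

If both homologs pair at the same repeat (allelic recombination, e.g. indices 1 and 1), each product keeps exactly one copy of G.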
24
Q

Where are CNVs found?

A
  • In genes.
  • Intergenic.
  • As they are quite large (>1 kb), they often affect one or more genes (or parts of genes).
25
Q

Describe associations between common variants and disease

A
  • Most common variants do not cause Mendelian, monogenic disorders. The majority are probably neutral (particularly intergenic variants).
  • However, they may well impact upon complex, non-Mendelian disorders and undoubtedly contribute to general individual variation (personality, sporting ability, looks etc).
26
Q

Describe variant effects

A
  • Can be beneficial.
  • Can be pathogenic.
  • Most are neutral.
  • Can be useful, especially in mapping, as markers to help find disease-causing genes and mutations, e.g. autozygosity mapping and linkage studies (microsatellites, SNPs) and association analysis (SNPs, CNVs).