Human Genome variation Flashcards
How many pairs of chromosomes do we have (gross structure)?
23
How big is the human genome?
- 3 billion bases (3000Mb)
- ~20,000 genes
- ~2% genome codes for protein = exome
What are the major macro-level differences?
• Major macro-level differences are generally associated with disease (aneuploidy, translocations, etc.) and are not common
What are micro or molecular-level pathogenic differences?
• Micro or molecular-level differences are also sometimes associated with disease (e.g. the point mutation in sickle cell anaemia (SCA), the 3 bp deletion in CFTR)
Give some examples of traits affected by coding variants
- height
- hair colour
- intelligence
What is a variant?
- ~99.7% DNA same between any 2 people (i.e. ~9 million bases different)
- Any position in the genome that varies between individuals is considered polymorphic = a variant
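Quick check of the ~9 million figure: a minimal sketch in Python that only reuses the numbers from the cards above (3 billion bases, ~99.7% identical). The variable names are made up for illustration.

```python
# Rough arithmetic check, using the figures quoted on the cards above.
GENOME_SIZE_BASES = 3_000_000_000   # ~3 billion bases (3000 Mb)
PERCENT_IDENTICAL = 99.7            # ~99.7% the same between any 2 people

percent_different = 100 - PERCENT_IDENTICAL
differing_bases = GENOME_SIZE_BASES * percent_different / 100

print(f"{differing_bases / 1e6:.0f} million bases differ")
```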
Define “common” in terms of genomics
- The frequency of the minor allele is relatively high in the population (allele frequency = the proportion of chromosomes in the population that carry each allele)
- Or multiallelic
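Allele frequency as described above can be sketched in Python. This is an illustration only: the function names and the 10-chromosome sample are hypothetical, and the minor allele is taken here to be the second most common allele at the position.

```python
from collections import Counter

def allele_frequencies(alleles):
    """Return {allele: frequency} for a list holding one allele per chromosome."""
    counts = Counter(alleles)
    total = len(alleles)
    return {allele: n / total for allele, n in counts.items()}

def minor_allele_frequency(alleles):
    """Frequency of the second most common allele (0.0 if only one allele is seen)."""
    freqs = sorted(allele_frequencies(alleles).values(), reverse=True)
    return freqs[1] if len(freqs) > 1 else 0.0

# Hypothetical sample: 10 chromosomes (5 people) at one position.
sample = ["A"] * 8 + ["G"] * 2
print(allele_frequencies(sample))      # {'A': 0.8, 'G': 0.2}
print(minor_allele_frequency(sample))  # 0.2
```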
What are 2 of the same and different alleles called?
- 2 same alleles = homozygous
- 2 different alleles = heterozygous
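The same rule as a tiny Python sketch (the function name is made up for illustration; each allele is written as a single letter):

```python
def zygosity(allele_1, allele_2):
    """Label a genotype from the two alleles carried at one position."""
    return "homozygous" if allele_1 == allele_2 else "heterozygous"

print(zygosity("A", "A"))  # homozygous
print(zygosity("A", "G"))  # heterozygous
```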
What is a gene reference?
A gene reference is the most common allele in the population
What is Single Nucleotide Variant (SNV)/Polymorphism (SNP)?
A single base change
Describe the frequency of an SNV
Where are they mostly found?
How are they generated?
- High frequency: 1 every 300 nucleotides in reference genome
- One individual: 1 every 1000 bases
- Millions SNVs identified in human genomes
- Majority not in exome
- Generated when a mismatch made during DNA replication is not repaired correctly
Describe the process of DNA replication and incorporate a SNP
- DNA is unwound by helicase
- DNA polymerase is used to generate new daughter strands based on parental template strands
- The bases are complementary AT and CG
- Repair mechanisms in place
- This diagram shows how DNA replicates and how it corrects itself if there is a mismatch. This can produce variation: a single nucleotide variant (shown in red). A variant has now been introduced into the population
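The base-pairing rule (A with T, C with G) and the idea of a mismatch can be sketched in Python. This is only an illustration: the template sequence is invented, the function names are hypothetical, and strand direction (5' to 3') is ignored for simplicity.

```python
# Pairing rule from the card: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complementary_strand(template):
    """Base-by-base complement of a template strand (same left-to-right order)."""
    return "".join(COMPLEMENT[base] for base in template)

def mismatch_positions(template, daughter):
    """0-based positions where the daughter base does not pair with the template."""
    return [i for i, (t, d) in enumerate(zip(template, daughter))
            if COMPLEMENT[t] != d]

template = "ATGCCGTA"                      # hypothetical parental template
daughter = complementary_strand(template)  # correctly copied daughter strand
# Simulate one misincorporated base at position 3 (G replaced by A).
faulty = daughter[:3] + "A" + daughter[4:]

print(daughter)                              # TACGGCAT
print(mismatch_positions(template, daughter))  # []
print(mismatch_positions(template, faulty))    # [3]
```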
What is biallelic?
2 alleles present
Where may a single nucleotide variant be found?
Gene:
• No amino acid change (synonymous)
• Amino acid change (non-synonymous/missense)
• Stop codon (nonsense)
• Splice site
• UTR (gene expression)
Promoter:
• Protein expression
- Non-coding region:
- Without a deleterious effect or population annihilation, SNVs do not disappear
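The three coding outcomes (synonymous, missense, nonsense) can be told apart by translating the codon before and after the change. A minimal Python sketch, assuming the standard genetic code; the function name is hypothetical, and splice site, UTR and promoter variants are not covered. The GAG to GTG example is the codon change commonly cited for sickle cell anaemia.

```python
# Standard genetic code, built from the usual TCAG ordering ("*" = stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def classify_coding_snv(ref_codon, alt_codon):
    """Classify a change within one codon by its effect on the amino acid.

    Changes that remove a stop codon are not handled separately in this sketch.
    """
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"   # no amino acid change
    if alt_aa == "*":
        return "nonsense"     # new stop codon
    return "missense"         # different amino acid

print(classify_coding_snv("GAA", "GAG"))  # synonymous (Glu -> Glu)
print(classify_coding_snv("GAG", "GTG"))  # missense (Glu -> Val)
print(classify_coding_snv("TGG", "TGA"))  # nonsense (Trp -> stop)
```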
Give an example of a single nucleotide variant
On image
What is a polymorphism?
What percentage is a rare polymorphism, common and mutation?
What is a better description of one?
If minor allele frequency (MAF) >1% (i.e. at least 1 in every 100 chromosomes has the non-reference allele) = polymorphism
Rare polymorphism: MAF 1-5%
Common polymorphism: MAF >5%
Less than 1% is a mutation
“Polymorphism” is used to describe a single nucleotide variant that doesn’t have a harmful effect, so “variant” is the better, neutral term to use INSTEAD.
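The MAF cut-offs above, written as a small Python sketch. The function name is made up, MAF is given as a fraction (0.05 = 5%), and treating exactly 1% as a rare polymorphism is an assumption, since the card does not say which side the boundary falls on.

```python
def classify_by_maf(maf):
    """Label a variant using the MAF cut-offs from the card (maf as a fraction)."""
    if maf > 0.05:
        return "common polymorphism"   # MAF > 5%
    if maf >= 0.01:
        return "rare polymorphism"     # MAF 1-5% (boundary handling is an assumption)
    return "mutation"                  # MAF < 1%

print(classify_by_maf(0.20))   # common polymorphism
print(classify_by_maf(0.03))   # rare polymorphism
print(classify_by_maf(0.005))  # mutation
```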
What causes single nucleotide variants?
Mutation
New allele arises, we now have a Variant
Gene flow
Migration leading to introduction of that variant into another population
Genetic drift
Random change in variant allele frequency between generations
Selection
Non-random change in variant allele frequency between generations because presence of one allele/genotype is pathogenic (negative selection) or beneficial (positive selection)
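Genetic drift (random change in allele frequency between generations) can be illustrated with a simple simulation. This sketch assumes a Wright-Fisher-style model, which is not named on the cards: each generation, every chromosome draws its allele at random according to the previous generation's frequency. It models drift only (no new mutation, gene flow or selection), and the function name and parameters are hypothetical.

```python
import random

def simulate_drift(start_freq, n_chromosomes, generations, seed=0):
    """Return the variant allele frequency in each generation under drift alone."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        # Each chromosome in the next generation carries the variant
        # with probability equal to the current frequency.
        carriers = sum(rng.random() < freq for _ in range(n_chromosomes))
        freq = carriers / n_chromosomes
        history.append(freq)
    return history

# Small populations drift more from one generation to the next than large ones.
small = simulate_drift(start_freq=0.2, n_chromosomes=20, generations=50)
large = simulate_drift(start_freq=0.2, n_chromosomes=2000, generations=50)
print(small[-1], large[-1])
```

Once the frequency reaches 0 or 1 in this model it stays there, because no new alleles are introduced.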
What are microsatellites?
Another name for them?
Repetitive bases
Also known as a short tandem repeat
Name the 6 types of microsatellites
On image pg 5
How does the formation of microsatellites happen?
- Errors in DNA replication
- The polymerase stutters on a repeat sequence: the strands slip, so bases shift back and pair with an earlier repeat, leaving some bases unpaired.
- To fix this, the strand has to re-anneal to the parental strand by finding complementary bases. This leaves a gap opposite the parental strand, which is filled by adding some bases
On image
Where are Microsatellites found?
Part of the 98% of genome not coding for protein
Intronic or UTR: may affect gene expression
Intergenic
Exonic: extra amino acids in protein
Can you think of a pathogenic example?
Read summary of microsatellites
- 1000s in genome
- Repeat units
- Varying numbers of repeats
- Alters actual size of that region of the genome
- Multiallelic
- Can be anywhere in genome
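Because alleles of a microsatellite differ in the number of repeats, they can be told apart by counting repeat units. A minimal Python sketch: the function name and the two example alleles (a CA repeat with invented flanking bases) are hypothetical.

```python
import re

def longest_tandem_repeat(sequence, unit):
    """Largest number of back-to-back copies of `unit` found in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    # Find every uninterrupted run of the repeat unit.
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)

# Two hypothetical alleles of the same microsatellite: 5 vs 8 CA repeats.
allele_1 = "GGT" + "CA" * 5 + "TTG"
allele_2 = "GGT" + "CA" * 8 + "TTG"

print(longest_tandem_repeat(allele_1, "CA"))  # 5
print(longest_tandem_repeat(allele_2, "CA"))  # 8
print(len(allele_2) - len(allele_1))          # 6 extra bases in allele_2
```

The length difference in the last line is the "alters actual size of that region" point from the summary.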
What are copy number variants (CNVs)?
Copy number variants: >2000 identified, 100 per genome
• An entire chunk of bases may be repeated, as shown below.
How does copy number variation occur?
Non-allelic homologous recombination in meiosis
Describe the extent of copy number variations
CNVs may be:
• Intergenic
• But quite large (>1 kb), so often affect one or more genes (or parts of genes)
• ~12% of genome = CNV
• >2000 identified, 1 kb-5000 kb
What are the effects of copy number variants?
Most common variants do not cause Mendelian, monogenic disorders.
Majority are probably neutral (particularly intergenic variants).
BUT!
May well impact upon complex, non-Mendelian disorders and undoubtedly contribute to general individual variation (personality, sporting ability, looks etc)
What are the variant effects?
• Can be beneficial
• Can be pathogenic
• Most are neutral
• Are these of any use?
• Yes, can be used as markers to help find disease-causing genes and mutations
Autozygosity mapping & linkage studies (Microsatellites, SNPs)
Association analysis (SNPs, CNVs)