W1- Genome Variation Flashcards
What does the human genome consist of?
Gross structure – 23 pairs of chromosomes
Molecular structure – DNA sequence
* 3 billion bases (3000Mb)
* ~20,000 genes
* ~1.5% genome codes for protein = exome
Are people identical?
- Major macro-level differences are generally associated with disease (aneuploidy, translocations, etc.)
- Micro or molecular-level pathogenic differences are also sometimes associated with disease (point mutation in SCA, 3bp deletion in CFTR)
- ~99.7% of DNA is the same between any 2 people (i.e. ~9 million bases differ)
Any position in the genome that varies between individuals is considered polymorphic = a variant
Major allele = the allele most commonly present at that position in the population.
Minor allele = less frequent variant.
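The figures above can be checked with a short sketch. The genome size and the ~99.7% identity come from the notes; the allele counts and the helper name `major_and_minor` are hypothetical, chosen only for illustration.

```python
GENOME_BASES = 3_000_000_000  # ~3 billion bases, as quoted in the notes

# ~99.7% identity means roughly 3 bases in every 1000 differ between two people.
differing_bases = GENOME_BASES * 3 // 1000


def major_and_minor(allele_counts):
    """Return (major, minor) allele for a biallelic site, ranked by count."""
    ranked = sorted(allele_counts, key=allele_counts.get, reverse=True)
    return ranked[0], ranked[-1]


# Hypothetical counts of each allele across 200 sampled chromosomes.
example_counts = {"A": 170, "G": 30}
major, minor = major_and_minor(example_counts)

print(differing_bases)  # 9000000
print(major, minor)     # A G
```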
What is a single nucleotide variant (SNV)/polymorphism (SNP)?
- High frequency: 1 every 300 nucleotides in reference genome
- One individual: 1 every 1000 bases
- Millions of SNVs identified in human genomes
- Majority not in exome
- Generated when mismatch repair resolves a replication error in favour of the wrong base
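As a rough sanity check on the densities quoted above, the arithmetic below converts "1 every N bases" into approximate counts. It assumes the ~3 billion base genome size from the first card and treats the densities as uniform, which is a simplification.

```python
GENOME_BASES = 3_000_000_000  # ~3 billion bases

# 1 SNV position every 300 nucleotides across the reference genome.
snv_positions_catalogued = GENOME_BASES // 300

# 1 SNV every 1000 bases in any one individual.
snvs_in_one_individual = GENOME_BASES // 1000

print(snv_positions_catalogued)  # 10000000
print(snvs_in_one_individual)    # 3000000
```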
How does DNA replication and mismatch repair work?
Double-stranded DNA is unwound by the helicase enzyme. The polymerase runs along each parental strand, adding bases to produce the new daughter strands. If a mismatched base pair is introduced, it is detected by one of the repair mechanisms, but these cannot always tell which of the two bases is the wrong one. They can therefore end up removing the original base and keeping the new one, creating a SNP. Variation has now been introduced into this group of cells and can be passed down to the next generation.
The two alleles an individual carries at a position are called the genotype.
Once a SNP has arisen, that position is biallelic (two alleles exist in the population).
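To make "genotype" concrete, this sketch lists the genotypes possible at a biallelic SNP. The alleles A and G and the function name `possible_genotypes` are hypothetical examples, not taken from the notes.

```python
from itertools import combinations_with_replacement


def possible_genotypes(alleles):
    """List every unordered pair of alleles a diploid individual could carry."""
    return ["".join(pair) for pair in combinations_with_replacement(sorted(alleles), 2)]


# A biallelic SNP with hypothetical alleles A and G.
genotypes = possible_genotypes({"A", "G"})
print(genotypes)  # ['AA', 'AG', 'GG']
```

Two alleles give three genotypes: two homozygous and one heterozygous.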
Where could SNVs be found?
Gene
* No amino acid change (synonymous)
* Amino acid change (non-synonymous/missense)
* Stop codon (nonsense)
* Splice site
* UTR (gene expression)
Promoter
* Protein expression
Non-coding region
Without a deleterious effect or population annihilation, SNVs do not disappear
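The three coding outcomes listed above (synonymous, missense, nonsense) can be told apart by translating the codon before and after the change. This is a minimal sketch using the standard genetic code; the function name `classify_coding_snv` is an assumption for illustration, and splice-site, UTR and promoter effects are not modelled.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snv(ref_codon, alt_codon):
    """Classify a single-codon change by its effect on the encoded amino acid."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "other"  # loss of a stop codon; not covered in the notes
    return "missense"


print(classify_coding_snv("GAA", "GAG"))  # synonymous (Glu -> Glu)
print(classify_coding_snv("GAG", "GTG"))  # missense (Glu -> Val, the sickle cell change)
print(classify_coding_snv("CAG", "TAG"))  # nonsense (Gln -> stop)
```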
Are the effects of SNV deleterious or beneficial?
Single variant can be both detrimental and beneficial. Depends on where in the genome this has taken place, but also the environment. An example of this is Sickle Cell Anaemia.
What is a variant?
- If minor allele frequency (MAF) >1% (i.e. at least 1 in every 100 chromosomes has the non-reference allele) = polymorphism
- Rare polymorphism: MAF 1-5%
- Common polymorphism: MAF >5%
- Safer to use term variant since it is not always associated with disease.
- All variants start off rare
- Evolutionary forces affect whether or not a variant remains rare
- Rare variant may be damaging and/or recent
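The MAF thresholds above translate directly into a small classifier. This is a sketch only: the function name `classify_by_maf` and the label for variants at or below 1% are assumptions, and the handling of the exact boundary values is a choice made here rather than something the notes specify.

```python
def classify_by_maf(maf):
    """Label a variant from its minor allele frequency (a fraction, not a percent)."""
    if not 0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie between 0 and 0.5")
    if maf > 0.05:
        return "common polymorphism"
    if maf > 0.01:
        return "rare polymorphism"
    return "rare variant"  # assumed label for MAF at or below 1%


print(classify_by_maf(0.20))   # common polymorphism
print(classify_by_maf(0.03))   # rare polymorphism
print(classify_by_maf(0.001))  # rare variant
```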
What is a mutation?
- A new allele arises, so we now have a variant
What is gene flow?
- Migration leading to introduction of that variant into another population
What is genetic drift?
Random change in variant allele frequency between generations
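Drift can be illustrated with a simple random-sampling simulation. This is a sketch assuming a Wright-Fisher style model (each generation's chromosomes are drawn at random from the previous generation's allele frequency), which the notes do not name; the function name, population size and seed are hypothetical.

```python
import random


def simulate_drift(start_freq, n_chromosomes, generations, seed=0):
    """Track one allele's frequency when each generation is a random sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    freqs = [start_freq]
    for _ in range(generations):
        p = freqs[-1]
        # Each chromosome in the next generation inherits the allele with probability p.
        copies = sum(rng.random() < p for _ in range(n_chromosomes))
        freqs.append(copies / n_chromosomes)
    return freqs


trajectory = simulate_drift(start_freq=0.5, n_chromosomes=100, generations=20)
print(trajectory)
```

No allele is favoured here, yet the frequency still wanders between generations; smaller populations wander more.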
What is selection?
Non-random change in variant allele frequency between generations because presence of one allele/genotype is pathogenic (negative selection) or beneficial (positive selection)
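In contrast to drift, selection shifts allele frequency in a predictable direction. The sketch below assumes a simplified textbook model in which each allele has a single relative fitness value (a haploid approximation); the fitness numbers and function names are hypothetical.

```python
def select_one_generation(p, fitness_a, fitness_b):
    """Return the next-generation frequency of allele a, given relative fitnesses."""
    mean_fitness = p * fitness_a + (1 - p) * fitness_b
    return p * fitness_a / mean_fitness


def simulate_selection(p, fitness_a, fitness_b, generations):
    """Apply selection repeatedly and return the frequency trajectory."""
    freqs = [p]
    for _ in range(generations):
        freqs.append(select_one_generation(freqs[-1], fitness_a, fitness_b))
    return freqs


positive = simulate_selection(0.1, fitness_a=1.1, fitness_b=1.0, generations=50)
negative = simulate_selection(0.1, fitness_a=0.9, fitness_b=1.0, generations=50)
print(round(positive[-1], 3), round(negative[-1], 3))
```

A beneficial allele rises in frequency every generation and a pathogenic one falls, with no randomness involved.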
Is every base identical between individuals? Is every genome exactly 3000Mb?
- No: microsatellites are one example
- Also known as short tandem repeats
- e.g. AC = the repeat unit, repeated in tandem (i.e. one after another)
- The number of repeats can vary between people
Microsatellite repeat units can be di-, tri-, tetra-, penta-, hexanucleotide etc., depending on the length of the repeated unit.
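Counting tandem repeats shows how two people can differ in genome length at the same locus. This is a minimal sketch using a regular expression; the sequences, flanking bases and the function name `longest_tandem_run` are made up for illustration.

```python
import re


def longest_tandem_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [match.group(0) for match in pattern.finditer(sequence)]
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical alleles of the same AC microsatellite in two people.
person_1 = "GGT" + "AC" * 5 + "TTG"
person_2 = "GGT" + "AC" * 8 + "TTG"

print(longest_tandem_run(person_1, "AC"))  # 5
print(longest_tandem_run(person_2, "AC"))  # 8
print(len(person_2) - len(person_1))       # 6 extra bases in person 2
```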
a) What are microsatellites?
b) Where might microsatellites be found?
a)
* 1000s in genome
* Repeat units
* Varying numbers of repeats
* Alters actual size of that region of the genome
* Multiallelic
* Can be anywhere in genome
* May do nothing….
b)
* Part of the genome not coding for protein
* Intronic or UTR: may affect gene expression
* Intergenic
* Exonic
* Extra amino acids in protein
* Can you think of a pathogenic example?
Microsatellites may be involved in expansion disorders, e.g. Huntington’s = trinucleotide repeat expansion disorder, basically a “bad” microsatellite
What is a copy number variation?
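This is when a chunk of DNA is present in a different number of copies between individuals, e.g. because it has been duplicated or deleted.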
- The simplest type of copy number variation is the presence or absence of a gene.
- An individual’s genome could therefore contain two, one, or zero copies.
- Duplication of a genomic segment could result in diploid copy numbers of two, three, or four.
- Pair of homologous chromosomes, i.e. 2 copies of chromosome 12
- Every locus (gene, base, genomic region) in theory is present as diploid
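The copy numbers quoted above follow from adding up the copies carried on each of the two homologous chromosomes. A small sketch, with a hypothetical helper name `diploid_copy_numbers`:

```python
from itertools import product


def diploid_copy_numbers(copies_per_chromosome):
    """All total copy numbers possible when each homologue carries one of the given counts."""
    return sorted({a + b for a, b in product(copies_per_chromosome, repeat=2)})


# Presence/absence: each chromosome carries 0 or 1 copy of the gene.
print(diploid_copy_numbers([0, 1]))  # [0, 1, 2]

# Duplication: each chromosome carries 1 copy (normal) or 2 copies (duplicated).
print(diploid_copy_numbers([1, 2]))  # [2, 3, 4]
```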
What is non-allelic homologous recombination in meiosis?
- A-D = loci on chromosome
- Grey and blue = homologous chromosomes aligning in meiosis I
- Red bands = regions of high sequence similarity, often viral/bacterial genomes that have been incorporated through evolution
- Allelic recombination is good! – shuffling of alleles
- But non-allelic recombination results in duplication/deletion and copy number change
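The difference between allelic and non-allelic recombination can be sketched with lists. The layout below is an assumption for illustration: loci A to D with two highly similar repeat regions, written "R", flanking B and C. A crossover joins the left part of one chromosome to the right part of the other at the chosen positions.

```python
def crossover(chrom_1, chrom_2, index_1, index_2):
    """Exchange the segments to the right of the aligned crossover positions."""
    product_1 = chrom_1[: index_1 + 1] + chrom_2[index_2 + 1 :]
    product_2 = chrom_2[: index_2 + 1] + chrom_1[index_1 + 1 :]
    return product_1, product_2


# Hypothetical chromosome: loci A-D with two similar repeat regions (R).
chromosome = ["A", "R", "B", "C", "R", "D"]

# Allelic: the first repeat pairs with the first repeat on the homologue.
allelic = crossover(chromosome, chromosome, 1, 1)

# Non-allelic: the first repeat mispairs with the second repeat on the homologue.
non_allelic = crossover(chromosome, chromosome, 1, 4)

print(allelic)      # both products keep one copy of every locus
print(non_allelic)  # one product loses B and C, the other gains a second copy
```

With the misaligned repeats, one product is A R D (deletion of B and C) and the other is A R B C R B C R D (duplication of B and C), which is the copy number change described above.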