Genome variation Flashcards
The human genome is ~3000 Mb long, so how come we all look different?
There are coding variants that affect traits (height, hair colour, intelligence). However, ~99.7% of DNA is the same between any 2 people (i.e. ~9 million bases differ). Any position in the genome that varies between individuals is considered polymorphic, i.e. a variant. Positions that are the same in all individuals are monomorphic.
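The ~9 million figure follows from the two numbers on this card. A quick arithmetic check, treating 3000 Mb and 99.7% as the approximate values quoted above (the constant and function names are illustrative only):

```python
# Approximate values quoted on the card.
GENOME_SIZE_BP = 3_000_000_000   # 3000 Mb
FRACTION_IDENTICAL = 0.997       # ~99.7% shared between any two people


def expected_differences(genome_size_bp: int, fraction_identical: float) -> int:
    """Number of bases expected to differ between two genomes."""
    return round(genome_size_bp * (1 - fraction_identical))


print(expected_differences(GENOME_SIZE_BP, FRACTION_IDENTICAL))
```

A 0.3% difference sounds small, but across 3000 Mb it still amounts to millions of positions.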
What are the different types of common genetic variant?
SNPs – Single nucleotide polymorphisms. ~17 million identified, ~3 million per genome
Microsatellites – ~3% of the genome. 1000s in the genome
CNVs – Copy number variants. >2000 identified. ~100 per genome. CNVs cover ~12% of the genome
Everyone ‘has every variant’: everyone has this position (e.g. a SNP) in the genome. What may differ between individuals is the alleles that they carry (i.e. the genotype).
What is ‘common’?
We see many types of variant throughout the genome. A variant is ‘common’ if the frequency of its minor allele is relatively high (when biallelic), or if it is multiallelic. Frequencies here are population frequencies, i.e. the proportion of chromosomes in the population that carry each allele.
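Because population frequency is counted over chromosomes rather than people, each person contributes two alleles. A minimal sketch of that calculation, assuming diploid genotypes written as pairs of alleles (the sample data and function names are hypothetical):

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency}, counted over chromosomes.

    Each genotype is a pair of alleles, one per chromosome copy,
    e.g. ("A", "G") for a heterozygote.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def minor_allele_frequency(genotypes):
    """Frequency of the second most common allele (0.0 if monomorphic)."""
    freqs = sorted(allele_frequencies(genotypes).values(), reverse=True)
    return freqs[1] if len(freqs) > 1 else 0.0


# Hypothetical sample of five people at one biallelic SNP.
sample = [("A", "A"), ("A", "G"), ("A", "A"), ("G", "G"), ("A", "G")]
print(allele_frequencies(sample))
print(minor_allele_frequency(sample))
```

Five people give ten chromosomes here: six carry A and four carry G, so G is the minor allele.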
What is a SNP?
A single nucleotide polymorphism (SNP), or single nucleotide variant (SNV), is a change in a single base (hence a single base substitution). They occur at high frequency within the genome: roughly 1 in every 300 nucleotides. The majority of SNPs are not in the exome, because the non-exonic region is much larger and is under less selective pressure than coding areas. They are generally biallelic. They arise when mismatch repair goes wrong: they are generated by faulty replication of DNA during mitosis. Although mismatch repair mechanisms should correct these mistakes, some are not corrected and you end up with a SNP.
What is the normal process of DNA replication, without any faults?
Sequence of DNA
Separation- unwound by helicase into single parental strands.
Replication- complementary daughter strands produced via DNA polymerase using parental template strands. Results in two identical copies of DNA.
The bases added should always be complementary, however this doesn’t always happen – resulting in mismatches.
What happens when there is a mismatch?
Sometimes DNA replication goes wrong and an incorrect base is incorporated: a mismatch. When this occurs, a mismatch repair mechanism should identify the mistake and correct it so that the bases form a standard Watson-Crick base pair.
However, some mismatches do not get corrected properly. What can actually happen is that the original parental base is ‘corrected’ to pair with the misincorporated base (instead of the mismatched base being replaced). This results in variation: one daughter cell will have a different base at the same position to the other daughter cell. Thus, you end up with a single nucleotide variant/polymorphism. If this change occurs in the gametes and isn’t deleterious then it will get passed on to the next generation, and as time goes on it can spread through the population.
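The two possible outcomes of repair can be sketched as a toy model. This is only an illustration of the logic on the card, not a model of the real repair machinery; the function name and the "new"/"template" labels are made up for the example:

```python
# Watson-Crick pairing partners.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def repair_mismatch(template_base, new_base, corrected_strand):
    """Return (template_base, new_base) after repairing one mismatched pair.

    corrected_strand="new": the misincorporated base is replaced, which
    restores the original base pair.
    corrected_strand="template": the parental base is changed to pair with
    the misincorporated base, which fixes a new variant in place.
    """
    if corrected_strand == "new":
        return template_base, COMPLEMENT[template_base]
    if corrected_strand == "template":
        return COMPLEMENT[new_base], new_base
    raise ValueError("corrected_strand must be 'new' or 'template'")


# Template G should pair with C, but T was incorporated by mistake.
print(repair_mismatch("G", "T", "new"))       # original G:C pair restored
print(repair_mismatch("G", "T", "template"))  # pair becomes A:T, a new SNV
```

Either way the result is a standard base pair; the difference is whether the original sequence or the error is the one that survives.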
Where may SNVs/SNPs be located?
Like any variant in a genome, they may be in a:
- gene: no amino acid change (synonymous), amino acid change (missense/non-synonymous), stop codon (nonsense), or UTR (and affect gene expression).
- promoter: and hence affect protein expression.
- non-coding region (most will be in this).
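For a SNV inside a coding sequence, the synonymous/missense/nonsense distinction can be worked out by translating the codon before and after the change. The sketch below assumes the standard genetic code on the coding strand and only distinguishes the three categories listed above (a change that removes an existing stop codon is not handled separately). The function name is illustrative:

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snv(codon, position, new_base):
    """Classify a single base change within one codon.

    codon: three-letter DNA codon on the coding strand.
    position: 0, 1 or 2, the index of the changed base.
    new_base: the base that replaces the original one.
    """
    ref_aa = CODON_TABLE[codon]
    alt_codon = codon[:position] + new_base + codon[position + 1:]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":  # "*" marks a stop codon
        return "nonsense"
    return "missense"


print(classify_coding_snv("GAA", 2, "G"))  # GAA -> GAG, both glutamate
print(classify_coding_snv("GAA", 0, "T"))  # GAA -> TAA, a stop codon
print(classify_coding_snv("GAG", 1, "T"))  # GAG -> GTG, glutamate to valine
```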
Do SNVs disappear?
Without a deleterious effect or population annihilation, SNVs generally do not disappear, although genetic drift can still remove a variant by chance. They can potentially spread and increase in allele frequency.
What is a point mutation?
A single base change. For example, in a missense mutation the single base change results in a codon that codes for a different amino acid. Hence, a point mutation is a SNP/SNV. When SNVs are pathogenic (they lead to disease), they can be referred to as point mutations.
How is presence of an allele expressed?
For a population, the presence of an allele is expressed as a frequency or a percentage.
How do you differentiate between a mutation and a polymorphism?
If the minor allele frequency (MAF) is at least 1% (i.e. at least 1 in every 100 chromosomes carries the non-reference allele), it is a polymorphism. Rare polymorphism: MAF 1-5%. Common polymorphism: MAF >5%.
If it is less than 1% then we would call it a mutation. However, frequency can vary between populations, so ‘polymorphism’ is often used to describe something that doesn’t have a detrimental effect. This is not STRICTLY true though, and hence it is safer to use the term ‘variant’ when describing any position in the genome that can vary. Technically, all variants start off rare, but evolutionary forces affect whether or not a variant remains rare. Rare variants may be damaging (deleterious) and/or recent.
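The frequency cut-offs on this card can be written as a small lookup. This sketch assumes that a MAF of exactly 1% counts as a polymorphism (following "at least 1 in every 100 chromosomes") and that exactly 5% falls in the rare band; the card does not settle either boundary. The function name and returned labels are illustrative:

```python
def classify_by_maf(maf):
    """Label a variant from its minor allele frequency (a proportion, 0 to 0.5)."""
    if not 0.0 <= maf <= 0.5:
        # A minor allele cannot be carried by more than half of chromosomes.
        raise ValueError("minor allele frequency must be between 0 and 0.5")
    if maf < 0.01:
        return "mutation"
    if maf <= 0.05:  # boundary choice is an assumption, see lead-in
        return "rare polymorphism"
    return "common polymorphism"


for maf in (0.005, 0.03, 0.20):
    print(maf, classify_by_maf(maf))
```

Since the same variant can have different frequencies in different populations, the label depends on which population the MAF was measured in.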
How do you create genetic variation?
You must first have a MUTATION, which leads to a new allele arising, so you now have a variant. GENE FLOW can then occur: migration introduces that variant into another population. You can then see GENETIC DRIFT, which is random change in the variant allele frequency between generations. SELECTION will only act if the variant/allele is pathogenic (negative selection) or beneficial (positive selection); otherwise allele frequency is subject only to genetic drift.
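Genetic drift is easiest to see in a simulation. The sketch below uses a simple Wright-Fisher style model, which is a standard textbook simplification and not something stated on the card: each generation, every chromosome in a fixed-size population draws its allele at random from the previous generation, with no selection, mutation or migration. The function name and parameters are illustrative:

```python
import random


def simulate_drift(initial_freq, n_chromosomes, generations, seed=None):
    """Return the variant allele frequency in each generation under drift alone."""
    rng = random.Random(seed)
    freq = initial_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each chromosome independently inherits the variant with probability freq.
        copies = sum(rng.random() < freq for _ in range(n_chromosomes))
        freq = copies / n_chromosomes
        trajectory.append(freq)
    return trajectory


# Same starting frequency, different random outcomes in a small population.
for seed in (1, 2, 3):
    print([round(f, 2) for f in simulate_drift(0.1, 50, 10, seed=seed)])
```

In small populations the frequency wanders more from one generation to the next, and once it reaches 0 or 1 in this model it stays there, which is how a neutral variant can be lost or fixed by chance.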
What is a microsatellite?
A microsatellite is a locus where a specific unit is repeated n times. The number of repeat units at that position can vary between individuals. They are also known as STRs (short tandem repeats). The unit itself is not what varies, but rather the number of times it appears: one person may have the unit GATA once, and another 12 times. Therefore, the human genome isn’t exactly 3000 Mb, as the physical length of our DNA varies slightly. Microsatellites are variants that arise due to polymerase slippage. They are highly multiallelic, so most people will be heterozygous (unlike at SNPs).
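Counting repeat units in a sequence makes the "same unit, different number" idea concrete. A minimal sketch using the GATA example from this card; the flanking sequence and function name are made up for illustration:

```python
import re


def longest_repeat_run(sequence, unit):
    """Return the largest number of back-to-back copies of unit in sequence."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Two hypothetical alleles at the same locus, with identical flanking sequence.
allele_a = "TTC" + "GATA" * 1 + "CCG"
allele_b = "TTC" + "GATA" * 12 + "CCG"

print(longest_repeat_run(allele_a, "GATA"))
print(longest_repeat_run(allele_b, "GATA"))
print(len(allele_b) - len(allele_a))  # difference in physical length, in bases
```

The two alleles differ only in repeat count, yet one is 44 bases longer, which is why total genome length differs slightly between people.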
What are the different types of microsatellites (length polymorphisms)?
Dinucleotide (a two-base unit repeated a certain number of times), e.g. (CA)(CA)(CA); trinucleotide, e.g. (GCC)(GCC)(GCC); tetranucleotide, e.g. (AATG)(AATG)(AATG); pentanucleotide, hexanucleotide, etc.
What is the reference allele?
The allele that is more common in the population than the others. It is often the smallest allele, but a range is often quoted. The other alleles occur at a lower frequency.