Genome Variation Flashcards
What proportion of the genome is the exome (the part that codes for protein)?
2%
How much of our genome do we share with someone else?
99.7%: only 9 million bases out of 3 billion are different
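The 9 million figure follows directly from the percentage. A quick check, assuming the card's round numbers (3 billion bases, 99.7% shared); the function name is made up for illustration:

```python
# Round figures taken from the card above.
GENOME_SIZE = 3_000_000_000   # bases in the genome
SHARED_PERCENT = 99.7         # percent of bases shared between two people


def differing_bases(genome_size: int, shared_percent: float) -> int:
    """Number of bases expected to differ between two people."""
    return round(genome_size * (100 - shared_percent) / 100)


print(differing_bases(GENOME_SIZE, SHARED_PERCENT))
```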
What kind of differences in the genome are associated with disease?
Macro-level differences (e.g. trisomy 21) and micro-level, molecular differences (e.g. a single point mutation, as in sickle cell anaemia (SCA), or a 3 bp deletion in the CFTR gene leading to cystic fibrosis (CF))
What is special about monozygotic twins’ DNA?
They are identical at every base
What in DNA terms is considered polymorphic?
Any base in the genome that varies between individuals is polymorphic
What are a reference sequence and a reference allele?
Reference sequence: a sequence held in a database that records, for each position, the base present in the majority of people
Reference allele: the most common allele
How are variants/polymorphic positions found?
By comparing someone’s sequence to a reference sequence and seeing where the two differ
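The comparison can be sketched in a few lines. This is a toy illustration only: it assumes the two sequences are already aligned, the same length, and differ only by substitutions, and the sequences themselves are made up.

```python
def find_variants(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) wherever they differ.

    Assumes same-length, pre-aligned sequences (substitutions only).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


reference = "ACGTTAGC"  # hypothetical reference sequence
sample = "ACGTCAGC"     # hypothetical individual's sequence
print(find_variants(reference, sample))  # [(5, 'T', 'C')]
```

Real variant calling also has to deal with alignment, insertions and deletions, and sequencing errors, which this sketch ignores.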
How was the reference sequence generated?
The genomes of 4 anonymous individuals were sequenced and averaged out in the Human Genome Project
How often does an SNV occur in the reference sequence and in one individual?
Once every 300 nucleotides in the reference sequence; once every 1000 nucleotides in an individual
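These rates translate into rough genome-wide counts. A back-of-the-envelope sketch, assuming a 3 billion base genome and perfectly even spacing (real SNVs are not evenly spaced); the function name is illustrative:

```python
GENOME_SIZE = 3_000_000_000  # bases, round figure


def expected_snvs(genome_size: int, bases_per_snv: int) -> int:
    """Approximate SNV count if one SNV occurs every `bases_per_snv` bases."""
    return genome_size // bases_per_snv


print(expected_snvs(GENOME_SIZE, 300))   # polymorphic positions across the reference
print(expected_snvs(GENOME_SIZE, 1000))  # SNVs carried by one individual
```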
Where are the majority of SNVs found?
Not in the exome
How are SNVs generated?
By faulty mismatch repair during DNA replication
What is a biallelic site?
A site in DNA where there could be 2 possible alleles (2 variants, one of which is the reference sequence base)
“A biallelic site is a specific locus in a genome that contains two observed alleles, counting the reference as one, and therefore allowing for one variant allele. In practical terms, this is what you would call a site where, across multiple samples in a cohort, you have evidence for a single non-reference allele.”
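The definition can be checked mechanically: collect the alleles seen at one site across a cohort, add the reference allele, and count them. A minimal sketch with made-up data; the function names are illustrative, not from any particular tool.

```python
def alleles_at_site(reference_base: str, sample_bases: list[str]) -> set[str]:
    """All distinct alleles observed at a site, counting the reference as one."""
    return {reference_base, *sample_bases}


def is_biallelic(reference_base: str, sample_bases: list[str]) -> bool:
    """True when exactly two alleles are observed: the reference plus one variant."""
    return len(alleles_at_site(reference_base, sample_bases)) == 2


print(is_biallelic("A", ["A", "G", "A", "G"]))  # True: A (reference) and G
print(is_biallelic("A", ["A", "G", "T"]))       # False: three alleles (multiallelic)
print(is_biallelic("A", ["A", "A"]))            # False: no variant observed
```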
How is an SNV formed?
In DNA replication the two strands separate and act as templates for synthesising complementary strands, forming identical copies.
However, while one new strand is being synthesised, a G is incorporated opposite a T instead of an A.
The mismatch repair mechanism should identify this mistake and correct it so that the bases form a standard Watson-Crick base pair.
However, in this instance it hasn’t corrected the G; it has replaced the T with a C.
If this change occurs in the gametes and isn’t deleterious, it will get passed on to the next generation, and over time it can spread through the population.
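The two possible repair outcomes can be modelled as a toy function. This is only an illustration of the logic on this card, not of the real enzymatic mechanism; the function and parameter names are made up.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def repair_mismatch(template_base: str, new_base: str, fix_new_strand: bool) -> tuple[str, str]:
    """Resolve a mismatch into a Watson-Crick pair.

    Returns (template strand base, new strand base) after repair.
    fix_new_strand=True  -> the misincorporated base is corrected (original pair restored)
    fix_new_strand=False -> the template base is changed instead (an SNV is created)
    """
    if fix_new_strand:
        return template_base, COMPLEMENT[template_base]
    return COMPLEMENT[new_base], new_base


# A G has been incorporated opposite a T (it should have been an A).
print(repair_mismatch("T", "G", fix_new_strand=True))   # ('T', 'A'): correct repair
print(repair_mismatch("T", "G", fix_new_strand=False))  # ('C', 'G'): T replaced by C, an SNV
```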
Where can SNVs be found?
In genes, promoters and non-coding regions
In genes, they can change an amino acid (non-synonymous/missense), leave it unchanged (synonymous), or turn its codon into a stop codon (nonsense). They can also alter the splice sites (where splicing occurs in a sequence) or fall in a UTR, affecting gene expression
In promoters they can affect protein expression
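The coding categories (synonymous, missense, nonsense) can be told apart by translating the codon before and after the change. A minimal sketch using the standard genetic code on DNA codons; the function name is illustrative, and as a simplification a change that turns a stop codon into an amino acid is simply reported as missense.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snv(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base change within a codon by its effect on the protein."""
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one base")
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


print(classify_coding_snv("GAA", "GAG"))  # synonymous (Glu -> Glu)
print(classify_coding_snv("GAG", "GTG"))  # missense (Glu -> Val, the sickle cell change)
print(classify_coding_snv("TAC", "TAA"))  # nonsense (Tyr -> stop)
```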
When do SNVs disappear from the genome?
When they have a deleterious effect (causing harm/ damage) or cause population annihilation
What kind of mutation causes SCA?
A missense point mutation
How common is the SCA point mutation?
White European people
0.02%
2 in every 10,000 chromosomes
African people
4.5%
About 1 in every 22 chromosomes (roughly 1 in 20)
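The percentages and the "per chromosome" figures are the same numbers in different units. A small conversion sketch (function names are illustrative): 0.02% is 2 per 10,000, and 4.5% works out to about 1 in 22, which rounds to roughly 1 in 20.

```python
def per_10_000(percent: float) -> float:
    """Convert an allele frequency in percent to chromosomes per 10,000."""
    return percent * 100


def one_in_n(percent: float) -> float:
    """Convert an allele frequency in percent to '1 in every N chromosomes'."""
    return 100 / percent


print(per_10_000(0.02))        # chromosomes per 10,000 at 0.02%
print(round(one_in_n(4.5)))    # N for "1 in every N" at 4.5%
print(round(one_in_n(0.02)))   # N for "1 in every N" at 0.02%
```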