Why Sequence the Human Genome? Flashcards
Why was the human genome sequenced?
The Human Genome Project, begun in 1990, aimed to:
- identify all human genes, and their roles
- analyse genetic variation between humans
- sequence the genomes of several model organisms used in genetics
- develop new sequencing techniques and computational analyses
- share genome information with scientists and the general public as fast as possible
What is a genome?
The complete set of DNA of an organism, including all its genes
What is genomics?
The study of whole genomes: their sequence, structure, function and variation (not just individual genes)
What were the key findings of the human genome?
- There are fewer genes than expected
- Less than 2% of our genome codes for proteins
- The genome is dynamic
- We still don’t know what many of our protein-coding genes do
- Most human genes are related to those of other animals
- All humans are 99.9% similar at sequence level
What does the human reference genome include?
Nuclear DNA:
- 22 autosomes, X and Y
- ~3 billion base pairs per haploid set (~6 billion in a diploid cell)
- Half from each parent
- <20,000 protein-coding genes
Mitochondrial DNA:
- Single, circular
- 16,569 base pairs
- all from mother
- 37 genes
What did they find when they tried to identify all human protein coding genes, and their roles?
- Define a gene and look for things that look like genes in sequences
- <2% coding (exons)
- ~20% introns (non-coding stretches within genes that are spliced out of the RNA transcript before it is translated)
- ~20,000 genes
- Many (about 20%) still have unknown function
While any two human genomes are 99.9% similar…
… across the population there is a lot of variation, and it is important
Why are our genomes 0.1% different?
Single nucleotide polymorphisms (SNPs): sites in DNA that commonly vary within populations
- Between the human reference genome and James Watson’s genome there are 1.9 million SNP differences
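To put the 0.1% figure into numbers, here is a minimal arithmetic sketch. It assumes a haploid genome of roughly 3 billion base pairs, a figure that is not stated on this card; the function name is hypothetical.

```python
# Assumption: ~3 billion base pairs in one haploid copy of the human genome.
HAPLOID_GENOME_BP = 3_000_000_000

def expected_differences(genome_bp, fraction_different):
    """Number of positions expected to differ between two genomes."""
    return round(genome_bp * fraction_different)

# 99.9% similar means 0.1% (0.001) of positions differ.
print(expected_differences(HAPLOID_GENOME_BP, 0.001))  # 3000000
```

Under that assumption, two genomes would differ at about 3 million positions, the same order of magnitude as the 1.9 million SNP differences quoted above.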
Describe SNPs
They are common single base pair changes or variants
- SNPs are common, around 1 in every 300 nucleotides
- Most of your SNPs are inherited from your parents; only a small number are new changes that arose during DNA replication
- Each genome sequence adds to the variation on record
- Diversity in genome sequencing adds to knowledge of variation
- Many SNPs don’t “do” anything, they are just inherited variations, but that doesn’t mean they aren’t useful
What can analysing common variants (genotyping) tell you?
- Who you are related to
- Where (some of) your ancestors came from
- Disease risk/association
- How you might respond to drugs
This data can also be used to solve crime
Describe what happens when an SNP is found in certain places on the DNA
- Linked SNP: lies close to a gene, so it is more likely to be inherited together with that gene (it does not affect the gene itself, but can act as a marker for it)
- Non-coding SNP: if the SNP changes how an enzyme binds to the DNA, it can affect how much of the protein is made
- Coding SNP: if the SNP falls at the start of a codon, it is more likely to change the amino acid. If that amino acid lies in an important region of the protein, the change will affect the protein’s function; if it does not, the change is unlikely to affect the protein’s function.
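The point about codon position can be checked with a short sketch. It assumes the standard genetic code (written in the usual T, C, A, G ordering) and uses hypothetical helper names; it illustrates the idea and is not a variant-annotation tool.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def snp_effect(codon, position, new_base):
    """Classify a single-base change at position 0, 1 or 2 of a codon."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"  # same amino acid, protein unchanged
    if after == "*":
        return "nonsense"    # new stop codon
    return "missense"        # different amino acid

def fraction_changing(position):
    """Share of possible SNPs at a codon position that alter the product."""
    changed = total = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # only consider codons that encode an amino acid
        for base in BASES:
            if base != codon[position]:
                total += 1
                changed += snp_effect(codon, position, base) != "synonymous"
    return changed / total

print(snp_effect("GGA", 2, "C"))  # synonymous (GGA and GGC both encode Gly)
print(snp_effect("GGA", 0, "A"))  # missense (AGA encodes Arg)
for pos in range(3):
    print(pos + 1, round(fraction_changing(pos), 2))
```

In this sketch, changes at the first and second codon positions almost always alter the amino acid, while many third-position changes are silent.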
Describe short tandem repeats (STRs) and DNA profiling
- STRs are repeats of 2-5 nucleotides, found in specific regions of the genome
- Each person inherits two alleles, one from each biological parent - which can be different lengths
- They can be used to create genetic profiles, or ‘DNA fingerprints’.
E.g. 8 repeats of CAG from the mother and 3 repeats of CAG from the father: this person is 3,8 at STR 1.
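The 3,8 example can be reproduced with a small sketch that counts tandem repeats in each inherited allele. The flanking sequences and function names are made up for illustration; real DNA profiling measures fragment lengths in the lab and does not search raw strings like this.

```python
import re

def count_repeats(sequence, motif):
    """Length, in repeat units, of the longest uninterrupted run of motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)

def str_genotype(maternal_allele, paternal_allele, motif):
    """Genotype at one STR locus, written smallest repeat count first."""
    counts = (count_repeats(maternal_allele, motif),
              count_repeats(paternal_allele, motif))
    return tuple(sorted(counts))

# Hypothetical alleles: identical flanks, different numbers of CAG repeats.
maternal = "TTGA" + "CAG" * 8 + "GTCA"
paternal = "TTGA" + "CAG" * 3 + "GTCA"

print(str_genotype(maternal, paternal, "CAG"))  # (3, 8)
```

A DNA profile is a list of such genotypes across several STR loci.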
Describe the importance of InDels
InDels: small deletions or insertions
- Second most common variant type in the human genome
- Can cause a “frameshift” (a change in the way the DNA is read) if they fall in protein-coding regions and their length is not a multiple of three
- Between the human reference genome and James Watson’s genome there is a difference of 0.2 million InDels
One of the most common human genetic diseases, cystic fibrosis, is most often caused by CFTR ΔF508 (deltaF508), which is a three-nucleotide deletion.
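The difference between a three-nucleotide deletion and a frameshifting deletion can be seen in a short sketch. The sequence below is a toy example, not the real CFTR sequence, and the helper names are hypothetical.

```python
def codons(sequence):
    """Split a sequence into complete codons, ignoring leftover bases."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]

def delete(sequence, start, length):
    """Return the sequence with `length` bases removed from `start`."""
    return sequence[:start] + sequence[start + length:]

def causes_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0

toy_gene = "ATGGCCTTTGGAAAG"  # toy coding sequence: ATG GCC TTT GGA AAG

print(codons(toy_gene))                # ['ATG', 'GCC', 'TTT', 'GGA', 'AAG']
print(codons(delete(toy_gene, 6, 3)))  # ['ATG', 'GCC', 'GGA', 'AAG']
print(codons(delete(toy_gene, 6, 1)))  # ['ATG', 'GCC', 'TTG', 'GAA']
```

Deleting three bases removes one codon but leaves the downstream codons intact, whereas deleting a single base changes every codon after the deletion.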
What are the structural variants that cause variation in the human genome?
CNVs (copy number variants): chunks of DNA >500 bp that are present in different amounts, or ‘copy numbers’, relative to a reference genome
- can be deleted or duplicated
- can span multiple genes
- humans have 10,000 CNVs, found within and between genes
- many genes found in CNV are associated with sensory perception and immunity
Between the human reference genome and James Watson’s genome there is a difference of 23 large CNVs (spanning 34 genes)