11 - The human genome project Flashcards
What is the size range of human chromosomes?
55x10^6 bp to 250x10^6 bp
What were the 4 goals of the human genome project?
1. Determine the sequence of the 3 billion chemical base pairs in human DNA.
2. Identify all genes in human DNA and map them to their positions on chromosomes.
3. Attempt to predict the function of all genes.
4. Utilise this information for understanding disease, developing medicines, understanding human variability and how humans compare to other species.
What were the 3 phases of the human genome project?
Phase 1 - Produce high-resolution chromosomal maps
- Position genetic markers and genes.
- Create libraries of BAC clones for sequencing.
Phase 2 - Sequence the DNA of each BAC clone
Phase 3 - Assemble all sequences to produce the final draft and annotate it to identify genes.
What are the differences between the old Sanger sequencing and the new Sanger sequencing?
Old sequencing:
- 4 separate dideoxy reactions (one for each base).
- Very slow.
- Manual reading of results off X-ray film.
New sequencing:
- Reaction is run like PCR (thermal cycling).
- Uses fluorescent dideoxy terminators, so one reaction covers all four bases.
- Products are run on a gel and separated by size; a laser scans the bands to read them.
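As a rough illustration of why separating products by size yields the sequence, the sketch below simulates chain termination on a made-up strand. Each terminated product is a prefix of the newly synthesised strand ending in a labelled base, so sorting the products by length and reading the last base of each recovers the sequence. The function names and the example sequence are hypothetical, and real data would also involve noise and base-calling software.

```python
import random


def terminated_fragments(new_strand):
    """Return every chain-terminated product: each prefix of the new strand."""
    return [new_strand[:i] for i in range(1, len(new_strand) + 1)]


def read_sequence(fragments):
    """Sort products by size (as on a gel) and read the terminal base of each."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))


new_strand = "GATTACAGGC"  # made-up example sequence
fragments = terminated_fragments(new_strand)
random.shuffle(fragments)  # products are mixed together in the reaction
print(read_sequence(fragments))  # prints GATTACAGGC
```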
What approach did IHGSC use for sequencing the human genome and what were the advantages and disadvantages of their approach?
Clone-by-clone approach:
Advantages:
- Very effective at resolving regions of highly repetitive DNA sequence.
Disadvantages:
- Slow approach
- Expensive
What approach did Celera use for sequencing the human genome?
Shotgun sequencing:
Break the genome into small fragments, sequence each one, and then use computers to reassemble the fragments into the full sequence.
Celera had to rely on public databases of sequence and mapping information in order to assemble the sequence generated by this method.
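The reassembly step can be sketched as a toy greedy overlap merge: repeatedly join the two reads that share the longest suffix-to-prefix overlap. This is only an illustration under simplifying assumptions (error-free reads, a single strand, no repeats, no read contained inside another); it is not Celera's actual assembler, and the reads and function names are made up.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps: leave the remaining contigs separate
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three made-up reads from a 16-base "genome", given out of order.
reads = ["CTGAAGTC", "ATGCGTAC", "GTACCTGA"]
print(greedy_assemble(reads))  # prints ['ATGCGTACCTGAAGTC']
```

Repetitive DNA breaks this kind of approach because identical overlaps appear at several places in the genome, which is one reason extra mapping information is needed.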
When was the human genome sequence 80% and 99.9% finished?
80% - June 2000
99.9% - July 2003
What are the 2 major types of global human genetic variation?
- Single nucleotide polymorphisms (SNPs)
- Copy number variants (CNVs)
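To make the idea of a single-nucleotide difference concrete, here is a minimal sketch that compares two already-aligned sequences of equal length and reports each position where the bases differ. The sequences and the function name are invented for illustration; calling true SNPs requires alignment against a reference and evidence from many individuals.

```python
def find_single_base_differences(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt
    ]


reference = "ATGCCGTAAGCT"  # made-up reference sequence
sample = "ATGCTGTAAGCA"     # made-up individual's sequence
print(find_single_base_differences(reference, sample))
# prints [(5, 'C', 'T'), (12, 'T', 'A')]
```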
What was the goal following the human genome project?
SNP identification - technologies were developed for rapid, large-scale identification of SNPs and other DNA sequence variants.
A public database of DNA sequence differences, dbSNP, was created.
What was the international HapMap project?
Aimed to find the common SNP variants in the world’s population.
Studied 270 people in total from 4 populations around the world.
Found >1 million human SNPs.
Once the gaps in the human genome sequence were filled, what was found?
After 20 years the final 8% of the sequence was finished:
- 3,054,815,472 bp of nuclear DNA
- 16,569 bp mitochondrial genome
- 63,494 genes
- 233,615 transcripts
- 19,969 genes and 86,245 transcripts are predicted to be protein coding
Many non-coding genes:
- tRNAs and rRNAs (involved in translation).
- microRNAs and long non-coding RNAs (involved in regulating gene expression).
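A quick worked calculation from the figures listed above shows how small the protein-coding share is: roughly 31% of annotated genes are predicted to be protein coding, with about 4.3 predicted protein-coding transcripts per protein-coding gene on average. The variable names are only for illustration.

```python
# Figures taken from the card above.
genes_total = 63_494
genes_coding = 19_969
transcripts_total = 233_615
transcripts_coding = 86_245

noncoding_genes = genes_total - genes_coding
coding_gene_fraction = genes_coding / genes_total
transcripts_per_coding_gene = transcripts_coding / genes_coding

print(f"Genes not predicted to be protein coding: {noncoding_genes:,}")
print(f"Protein-coding share of genes: {coding_gene_fraction:.1%}")
print(f"Mean coding transcripts per coding gene: {transcripts_per_coding_gene:.1f}")
```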
What causes an increase in complexity of protein structure as the evolutionary tree is climbed?
The complexity of alternative mRNA splicing increases, giving more protein isoforms per gene.