Lecture 5 DA Flashcards
What are some reasons why we sequenced the human genome (3)?
- Because it's there - a bioinformatics challenge.
- Helps combat inherited diseases, including ones we don't yet know about.
- Helps us understand the consequences of mutations.
What is responsible for the phenotypic diversity among different individual humans?
Single nucleotide polymorphisms - SNPs.
What is more important, the nucleotide sequence or the protein sequence?
Protein sequence.
Which chromosome was sequenced first, and why? Which came after?
22, because it's one of the shortest chromosomes. 21 came after.
Describe the hierarchical approach to sequencing the human genome (5).
- Different groups are each given a chromosome to sequence.
- The groups generate bacterial artificial chromosomes (BACs).
- The BACs are broken into fragments, and shotgun sequencing is done on them.
- High-fidelity maps with identifiable motifs allow the groups to detect overlapping regions and assemble the sequence.
Describe the shotgun approach to sequencing the human genome (6).
- DNA is isolated and chopped into fragments.
- Fragments are cloned into vectors and sequenced.
- Overlapping fragment sequences are combined to assemble the genome into contigs.
- Scaffolds are generated from the contigs.
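The contig-building step above can be sketched as a toy greedy overlap merge. This is only an illustration under stated assumptions: the function names, the example reads, and the minimum overlap of 3 bases are all hypothetical, and real assemblers also handle sequencing errors, repeats, and reverse complements, none of which appear here. Ordering the contigs into scaffolds needs extra linking information and is not shown.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap.

    Stops when no pair overlaps by at least min_len bases and returns
    the remaining sequences, which play the role of contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        # Join the two sequences, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads: three overlap end to end, one stands alone.
reads = ["ATGCGT", "CGTACC", "ACCTTG", "GGGAAA"]
contigs = greedy_assemble(reads)
print(sorted(contigs))  # ['ATGCGTACCTTG', 'GGGAAA']
```

The three overlapping reads collapse into one contig, while the read with no overlap stays as its own contig.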
What is Celera sequencing, and how does it compare to hierarchical sequencing?
Celera sequencing applies shotgun sequencing to the whole genome at once. It finished faster than the hierarchical approach.
At how many locations do SNPs occur?
~3 million.
How many genes total were found?
~51k.
How many coding genes were found?
~20k.
How many non-coding genes were found?
~20k.
What are pseudogenes, and how many were found?
Genes that look protein-coding but have been rendered non-coding by mutation. ~18k found.
How many genes with variants were found?
~20k.
How many mRNA genes were found? What does this mean?
98k. Each gene can give rise to about 5 different mRNAs, so counted by transcript we technically have ~100k genes.
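As a quick check of the "about 5 mRNAs per gene" figure, the arithmetic can be written out as below. It assumes "gene" here means the ~20k coding genes from the earlier card; that pairing is an assumption, not something the card states.

```python
mrna_count = 98_000    # mRNAs reported on this card
coding_genes = 20_000  # coding genes from the earlier card (assumed denominator)

mrnas_per_gene = mrna_count / coding_genes
print(round(mrnas_per_gene, 1))  # 4.9
```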
What % of the genome is coding? What % is repeating junk DNA?
Coding -