Human genome Flashcards
Human Genome Project
Oct 1st 1990 - Apr 2003
13 years (finished ahead of the planned 15)
3 billion dollars
Almost 3 billion base pairs sequenced (92% of the genome)
20-25K genes
Whole-genome shotgun sequencing
- Breaking down the genome into fragments
- Amplify (clone) the DNA (recombinant DNA)
- Read the DNA
- Software reconstructs the sequence
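The fragment, read and reconstruct steps above can be sketched with a toy example. This is only an illustration under strong assumptions: the "genome" is a made-up 30-base string with no repeated 5-mers, reads are error-free, and the cloning step is not modeled. The names `shotgun_reads`, `overlap` and `greedy_assemble` are hypothetical helpers, and the greedy overlap merge is a simplification, not the algorithm used by real assemblers.

```python
import random


def shotgun_reads(genome, read_len=10, step=5):
    """Cut overlapping fragments (reads) of fixed length from the genome."""
    return [genome[i:i + read_len]
            for i in range(0, len(genome) - read_len + 1, step)]


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of contigs with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:  # nothing overlaps any more: leave separate contigs
            break
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]
    return contigs


# Made-up toy genome (assumption: no repeated 5-mers, so overlaps are unambiguous).
genome = "ATGCGTACCTGAAGTCCGATAGGCTTAACG"
reads = shotgun_reads(genome)
random.Random(0).shuffle(reads)  # the sequencer does not know the read order
contigs = greedy_assemble(reads)
print(contigs == [genome])  # True
```

Real genomes contain long repeats, which is exactly where this naive overlap approach breaks down.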
Telomere-to-Telomere consortium
2022
3.05 billion base pairs in the complete assembly (added the remaining 8%)
Open Reading Frame (ORF)
Stretch of nucleotides, read codon by codon from a start codon, with no in-frame stop codon; a long ORF is probably a gene
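A minimal sketch of an ORF scan, assuming an ORF runs from an ATG start codon to the first in-frame stop codon (TAA, TAG or TGA) and that only the three forward-strand frames are searched. The function name `find_orfs`, the `min_codons` threshold and the example sequence are all hypothetical and chosen only for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for each forward-strand ORF.

    start is the index of the ATG, end is one past the stop codon,
    and min_codons counts codons from ATG up to (not including) the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close it and keep scanning this frame
    return orfs


example = "CCATGAAACCCGGGTAGTT"  # made-up sequence
print(find_orfs(example))  # [(2, 17, 2)]
print(example[2:17])       # ATGAAACCCGGGTAG
```

In practice the length threshold is much larger (hundreds of codons), since short stop-free stretches occur by chance.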
Comparative genomics
Search for homologous genes and sequences to infer evolutionary relationships and gene conservation
Human genome: conserved proportions
5% conserved across mammals (so likely functionally important):
- 1.2% coding DNA (20-25K genes)
- 3.5% regulation, condensation, …
ENCODE consortium
75% transcribed in at least one type of studied cells
45% derived from transposons, poorly conserved repeats
Chr1 has the most genes, ChrY the fewest
Individual variability
About 0.1% between any two individuals, due to SNPs and copy number variations (CNVs)
Constitutive heterochromatin
6.7% made of highly repeated sequences (interspersed or tandem):
- Centromeric: satellite DNA
- Telomeric: minisatellite DNA (TTAGGG)
- Microsatellites (STRPs)
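Tandem repeats such as the telomeric TTAGGG unit or a microsatellite can be located with a simple pattern search. The sketch below is an illustration only: `tandem_runs` is a hypothetical helper, the sequence is made up, and only exact, uninterrupted copies of the unit are counted.

```python
import re


def tandem_runs(seq, unit, min_copies=2):
    """Return (start, copies) for each run of at least min_copies exact tandem copies of unit."""
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    return [(m.start(), len(m.group()) // len(unit))
            for m in pattern.finditer(seq.upper())]


# Made-up sequence: 4 telomeric repeats followed by a (CA)5 microsatellite.
seq = "ACGA" + "TTAGGG" * 4 + "CC" + "CA" * 5
print(tandem_runs(seq, "TTAGGG"))  # [(4, 4)]
print(tandem_runs(seq, "CA"))      # [(30, 5)]
```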