Genome Organization Flashcards
What are the components of chromatin?
DNA + histones + non-histone proteins (acidic)
How many base pairs are in the haploid human genome sequence?
3e9 base pairs
What were some of the findings from the first human genome sequence?
- Human genome is not static
- There is no “one” human genome; there are many.
- Genome is not organized in a random manner
Give three examples of how the human genome is not static.
- ~30 new mutations occur in each individual
- Shuffling of regions occurs at each meiosis due to recombination
- Somatic DNA changes can be produced as well as germ-line changes
What is a single nucleotide polymorphism (SNP)?
A difference at a single base pair position between the genomes being compared
If there is an average of 1 SNP for every 1000 base pairs between any two randomly chosen human genomes, approximately how many differences are there?
~3 million differences
(even though the two genomes are 99.9% identical)
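The arithmetic behind this answer can be checked with a short sketch. It uses only the figures from the cards above (3e9 bp haploid genome, 1 SNP per 1000 bp); the variable names are illustrative.

```python
# Figures taken from the cards above; names are illustrative only.
GENOME_BP = 3e9          # haploid human genome size in base pairs
SNP_SPACING_BP = 1000    # on average, one SNP every 1000 bp

# Expected number of single-base differences between two genomes
differences = GENOME_BP / SNP_SPACING_BP

# Fraction of positions that are the same in both genomes
identity = 1 - 1 / SNP_SPACING_BP

print(f"{differences:,.0f} differences")
print(f"{identity:.1%} identical")
```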
Which chromosome is classified as “gene-rich”?
Chromosome 19
Which chromosomes are classified as “gene-poor”?
Chromosomes 13, 18, 21
What are the differences between euchromatic and heterochromatic regions of the genome?
Euchromatic regions are more relaxed and are the focus of genome sequencing efforts; they still contain many unsequenced gaps; they make up most of the genome
Heterochromatic regions are more condensed and contain more repeats; they are essentially unsequenced; they tend to be near centromeres and make up less of the genome
What is the general genome composition?
1.5% is translated
20-25% is represented by genes
50% “single copy” sequences
40-50% classes of “repetitive DNA” repeated hundreds to millions of times
Acknowledging that GC-rich and AT-rich regions are not random, what percent of the genome is GC- and AT-rich?
38% GC-rich
54% AT-rich
What are the two classes of repetitive DNAs?
- Tandem repeats (“satellite DNAs”)
- Dispersed repetitive elements
What are two of the locations of certain tandem repetitive DNA?
A particular pentanucleotide sequence is found as part of heterochromatic regions on the long arms of Chromosomes 1, 9, 16, and Y, which are hotspots
“Alpha-satellite” repeats are a 171 bp repeat found near the centromeric region of all chromosomes; they may be important for chromosome segregation during mitosis/meiosis
Give an example of a short interspersed repetitive element.
Alu family
~ 300 base pair related members
500,000 copies in the genome
Give an example of a long interspersed repetitive element.
L1 family
~6 kilobase pair related members
100,000 copies in the genome
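A rough estimate of how much of the genome these two families could occupy follows from the figures on the Alu and L1 cards above. This is a sketch that assumes every copy is full length, so the results are upper bounds for these copy numbers (many L1 copies in particular are truncated); the function name is illustrative.

```python
GENOME_BP = 3_000_000_000  # haploid genome size from the earlier card


def genome_fraction(copy_length_bp, copy_number, genome_bp=GENOME_BP):
    """Fraction of the genome occupied if every copy were full length."""
    return copy_length_bp * copy_number / genome_bp


# Figures from the cards above
alu_fraction = genome_fraction(300, 500_000)    # ~300 bp, 500,000 copies
l1_fraction = genome_fraction(6_000, 100_000)   # ~6 kb, 100,000 copies

print(f"Alu: up to {alu_fraction:.0%} of the genome")
print(f"L1:  up to {l1_fraction:.0%} of the genome")
```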
What are dispersed repeats in the genome and how can they be medically relevant?
They are retrotransposition elements that can effectively copy their own sequences into other locations in the DNA
=> retrotransposition of a copy into the middle of another gene may inactivate that gene
=> NAHR leading to disease
What is NAHR (non-allelic homologous recombination)?
Aberrant recombination between different (non-allelic) copies of a dispersed repeat, which can lead to disease
What are the types of DNA variation that occur between genomes?
Insertion-deletion polymorphisms
SNPs (single nucleotide polymorphisms)
CNVs (copy number variations)
Other:
Chromosomal variation, larger scale variation, rearrangements, translocations
Silent variants
What are the two types of insertion-deletion polymorphisms?
Minisatellites:
tandemly repeated 10-100 bp blocks of DNA; the number of repeats is highly variable
VNTR (variable number of tandem repeats) = can be used for genetic fingerprinting
Microsatellites:
di-, tri-, and tetra- nucleotide repeats
more than 5e4 per genome
aka STRPs (short tandem repeat polymorphisms)
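To make the microsatellite definition concrete, here is a minimal sketch that scans a DNA string for tandem runs of a 2, 3, or 4 bp unit. The function name, the `min_repeats` threshold, and the example sequence are all assumptions for illustration; real STR genotyping tools are considerably more involved.

```python
import re


def find_microsatellites(seq, min_repeats=4):
    """Return (start, unit, repeat_count) for di-, tri-, and tetranucleotide runs.

    min_repeats is an arbitrary illustrative threshold, not a standard value.
    """
    # Lazily capture a 2-4 bp unit, then require it to repeat in tandem.
    pattern = re.compile(r"([ACGT]{2,4}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            # Skip homopolymer runs such as AAAAAAAA
            continue
        repeat_count = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, repeat_count))
    return hits


# Made-up sequence: six CA repeats and four GATA repeats with flanking bases
example = "GGT" + "CA" * 6 + "TTG" + "GATA" * 4 + "CC"
print(find_microsatellites(example))
```

Two individuals who differ in `repeat_count` at the same locus would carry different alleles of that STRP.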
Why can SNPs be used for genetic fingerprinting?
SNPs can be detected with PCR-based assays
They are easy to score
They are widely distributed
What is copy number variation?
Variation in the number of copies of a particular gene in an individual
In segments from 200 bp to 2 million bp
What is a gene family and how do gene families arise?
Gene families are genes that have high sequence similarity, over 85%, that may carry out similar but distinct functions
They arise through gene duplication: after a gene duplicates, one copy is free to vary while the other copy continues to carry out a critical function
What is structural variation of the human genome?
Not all changes in the genome are due to single base pair substitutions
=> CNV (copy number variation) is the primary type of structural variation
What are the characteristics and implications of CNV (copy number variations)?
- CNV loci are the primary type of structural variation and may cover 12% of the genome
- CNV is implicated in an increasingly large number of diseases
- CNV regions are involved in rapid/recent evolutionary change