Medical Genetics Wk 7 Flashcards
Repetitive DNA
In addition to single copies of unique DNA sequences that make up genes, many DNA sequences within eukaryotic chromosomes are repetitive in nature. Various levels of repetition occur within the genomes of organisms. Many studies have now provided insights into repetitive DNA, demonstrating various classes of sequences and organization. Some functional genes are present in more than one copy (they are referred to as multiple-copy genes) and so are repetitive in nature. However, the majority of repetitive sequences do not encode proteins. Nevertheless, many are transcribed, and the resultant RNAs play multiple roles in eukaryotes, including chromatin remodeling.
What is sequence complexity?
Refers to the number of times a particular base sequence appears in the genome
3 main types
Unique or non repetitive
Moderately repetitive
Highly repetitive
LOOK AT GOODNOTES FOR DETAILS Middle Repetitive sequences: VNTRs and STRs;
Repetitive transposed sequences: SINEs and LINEs
Look at GOODNOTES overview
LOOK AT GOODNOTES FOR: Distinguish between unique or single-copy genes and highly repetitive sequences in nuclear DNA
Satellite DNA
The nucleotide composition (e.g., the percentage of G-C versus A-T pairs) of the DNA of a particular species is reflected in the DNA’s density, which can be measured by sedimentation equilibrium centrifugation.
When eukaryotic DNA is analyzed in this way, most of it appears as a single main band of fairly uniform density. However, one or more additional peaks indicate the presence of DNA that differs slightly in density. This component, called satellite DNA, makes up a variable proportion of the total DNA, depending on the species. For example, a profile of main-band and satellite DNA from the mouse is shown in Figure. By contrast, bacteria do not contain satellite DNA. Several conclusions were drawn:
1. Satellite DNA differs from main-band DNA in its molecular composition, as established by buoyant density studies.
2. It is composed of short repetitive sequences.
3. Finally, satellite DNA is found in the heterochromatic centromeric regions of chromosomes.
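The link between base composition and density can be sketched numerically. The example below is only an illustration: it uses a commonly cited empirical linear relation for buoyant density in CsCl (about 1.660 + 0.098 × GC fraction, in g/cm^3), which is not given in these notes and is an assumption here. The function names and the two short sequences are made up for the example.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def estimated_buoyant_density(gc: float) -> float:
    """Approximate CsCl buoyant density (g/cm^3) from the GC fraction.

    Assumption: empirical linear fit, density = 1.660 + 0.098 * GC.
    """
    return 1.660 + 0.098 * gc


# Hypothetical sequences, invented for illustration only.
main_band_like = "ATGCGCTAGCATCGGATCCGTA"
satellite_like = "AATATAAATATTAATATAAATT"

for name, seq in [("main band", main_band_like), ("satellite", satellite_like)]:
    gc = gc_fraction(seq)
    print(name, round(gc, 3), round(estimated_buoyant_density(gc), 4))
```

An AT-rich repeat gets a lower estimated density than the bulk DNA, which is why it would settle as a separate satellite peak in the gradient.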
Centromeric DNA Sequences
Careful analysis has confirmed that the repetitive DNA sequences contained within the centromere are critical for the separation of chromosomes during mitosis and meiosis.
The minimal region of the centromere that supports the function of chromosomal segregation is designated the CEN region. Within this heterochromatic region of the chromosome, the DNA binds a platform of proteins, which in multicellular organisms includes the kinetochore that binds to the microtubules making up the spindle fiber during division (see Figure). In humans, one of the most recognized satellite DNA sequences is the alphoid family, found mainly in the centromere regions.
Alphoid sequences, each about 170 bp in length, are present in tandem arrays of up to 1 million base pairs. It is now believed that such repetitive DNA in eukaryotes is transcribed and that the RNA that is produced is ultimately involved in kinetochore function.
One final observation of interest is that the H3 histone, a normal part of almost all eukaryotic nucleosomes, is substituted by a variant histone designated CENP-A in centromeric heterochromatin. It is believed that the N-terminal protein tails unique to CENP-A are involved in the binding of kinetochore proteins that are essential to the microtubules of spindle fibers. This finding supports the supposition that the DNA sequence found only in centromeres is related to the function of this unique chromosomal structure.
Middle Repetitive Sequences: VNTRs and STRs
In addition to highly repetitive DNA, which constitutes about 5 percent of the human genome (and 10 percent of the mouse genome), a second category, middle (or moderately) repetitive DNA, is fairly well characterized.
Variable number tandem repeats (VNTRs)
The number of tandem copies of each specific sequence at each location varies from one individual to the next, creating localized regions of 1000–20,000 bp (1–20 kb) in length. The variation in size (length) of these regions between individual humans was originally the basis for the forensic technique referred to as DNA fingerprinting.
Short tandem repeats (STRs)
Like VNTRs, short tandem repeats (also called microsatellites) are dispersed throughout the genome and vary among individuals in the number of repeats present at any site.
For example, in humans, the most common microsatellite is the dinucleotide (CA)n, where n equals the number of repeats. Most commonly, n is between 5 and 50. These clusters have served as useful molecular markers for genome analysis.
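To make the (CA)n idea concrete, here is a minimal sketch that reads n off a sequence by finding the longest uninterrupted run of a repeat unit. The function name and the two alleles are hypothetical, chosen only for illustration. Real STR genotyping measures fragment length or uses sequencing reads rather than a simple pattern match.

```python
import re


def longest_repeat_run(seq: str, unit: str = "CA") -> int:
    """Return the largest n such that (unit)n occurs as an uninterrupted run."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)


# Hypothetical alleles at one STR locus: same flanking sequence, different n.
allele_a = "GGTT" + "CA" * 12 + "TTGA"
allele_b = "GGTT" + "CA" * 17 + "TTGA"

print(longest_repeat_run(allele_a), longest_repeat_run(allele_b))  # 12 17
```

Two people carrying these alleles would differ by 10 bp at this site, and that length difference is what makes the locus usable as a marker.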
The Federal Bureau of Investigation in the United States built its DNA fingerprinting panel on 13 STRP (short tandem repeat polymorphism) markers; the core set was expanded to 20 loci in 2017. Two individuals (other than monozygotic twins) are so unlikely to have identical genotypes at all 13 loci that the panel allows definitive determination of whether two samples came from the same individual.
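The reason a 13-locus match is so improbable is the product rule: if the loci are inherited independently, the chance that an unrelated person shares the genotype at every locus is the product of the per-locus genotype frequencies. The sketch below assumes independence and uses a made-up frequency of 0.1 at each locus. Real frequencies differ by locus and population.

```python
from math import prod


def random_match_probability(genotype_freqs):
    """Multiply per-locus genotype frequencies (assumes independent loci)."""
    return prod(genotype_freqs)


# Hypothetical: at each of 13 loci, 10% of the population shares the genotype.
freqs = [0.1] * 13
p = random_match_probability(freqs)
print(f"{p:.1e}")  # 1.0e-13
```

Even with a fairly common genotype at every single locus, the combined probability in this toy case is far smaller than one over the world population.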
DNA fingerprinting, also known as genetic fingerprinting, DNA typing, and DNA profiling, is a molecular genetic method that enables identification of individuals using hair, blood, semen, or other biological samples, based on unique patterns (polymorphisms) in their DNA.
Repetitive Transposed Sequences: SINEs and LINEs
Another category of repetitive DNA consists of sequences that are interspersed individually throughout the genome, rather than being tandemly repeated. They can be either short or long, and many have the added distinction of being transposable sequences, which are mobile and can potentially move to different locations within the genome.
For example, SINEs and LINEs represent a significant portion of human DNA. SINEs constitute about 13 percent of the human genome, whereas LINEs constitute up to 21 percent. Within both types of elements, repeating sequences of DNA are present in combination with unique sequences.
Short interspersed elements (SINEs)
The best characterized human SINE is a set of closely related sequences called the Alu family (the name is based on the presence of DNA sequences recognized by the restriction endonuclease Alu I).
Members of this DNA family, also found in other mammals, are 200–300 base pairs long and are dispersed rather uniformly throughout the genome, both between and within genes. In humans, the Alu family encompasses more than 5 percent of the entire genome.
Alu sequences are of particular interest because some members of the Alu family are transcribed into RNA, although the specific role of this RNA is not certain. Even so, the significance of Alu sequences lies in their potential for transposition within the genome, which is related to chromosome rearrangements during evolution.
Alu sequences are thought to have arisen from an RNA element whose DNA complement was dispersed throughout the genome as a result of the activity of reverse transcriptase (an enzyme that synthesizes DNA on an RNA template).
Long interspersed elements (LINEs)
The group of long interspersed elements (LINEs) represents yet another category of repetitive transposable DNA sequences. The most prominent example in humans is the L1 family.
Members of this sequence family are about 6400 base pairs long and are present more than 500,000 times. Their 5’ end is highly variable, and their role within the genome has yet to be defined. The general mechanism for transposition of L1 elements is now clear. The L1 DNA sequence is first transcribed into an RNA molecule.
The RNA then serves as the template for synthesis of the DNA complement using the enzyme reverse transcriptase. This enzyme is encoded by a portion of the L1 sequence. The new L1 copy then integrates into the DNA of the chromosome at a new site. Because of the similarity of this transposition mechanism to that used by retroviruses, LINEs are referred to as retrotransposons.
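The three steps (transcription, reverse transcription, integration) can be pictured as a "copy and paste" operation on a string. This is a toy model only: it ignores strand complementarity, truncation of the new copy, and target-site selection, and all function names and sequences are hypothetical.

```python
def transcribe(dna: str) -> str:
    """DNA -> RNA, written in coding-strand convention (T becomes U)."""
    return dna.replace("T", "U")


def reverse_transcribe(rna: str) -> str:
    """RNA -> DNA copy, shown in the same orientation (U becomes T)."""
    return rna.replace("U", "T")


def retrotranspose(genome: str, start: int, end: int, target: int) -> str:
    """Copy genome[start:end] via an RNA intermediate and insert it at target."""
    element = genome[start:end]          # the L1-like element stays in place
    rna = transcribe(element)            # step 1: transcription
    new_copy = reverse_transcribe(rna)   # step 2: reverse transcription
    return genome[:target] + new_copy + genome[target:]  # step 3: integration


# Hypothetical mini-genome with one element ("GATTACA") at positions 4-10.
genome = "AAAA" + "GATTACA" + "CCCC"
after = retrotranspose(genome, start=4, end=11, target=13)
print(after)  # AAAAGATTACACCGATTACACC
```

The original element is still present after the event, so each round of retrotransposition adds a copy. This is how such elements can accumulate to high copy numbers.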
Middle Repetitive Multiple-Copy Genes
In some cases, middle repetitive DNA includes functional genes present tandemly in multiple copies. For example, many copies exist of the genes encoding ribosomal RNA. Drosophila has 120 copies per haploid genome. Single genetic units encode a large precursor molecule that is processed into the 5.8S, 18S, and 28S rRNA components. In humans, multiple copies of this gene are clustered on the p arm of the acrocentric chromosomes 13, 14, 15, 21, and 22. Multiple copies of the genes encoding 5S rRNA are transcribed separately from multiple clusters found together on the terminal portion of the p arm of chromosome 1.
The genes that code for rRNA are located in an area of the chromosome known as the nucleolar organizer region (NOR). The NOR is intimately associated with the nucleolus, which is a processing center for ribosome production.
The Vast Majority of a Eukaryotic Genome Does Not Encode Functional Genes
Given the preceding information concerning various forms of repetitive DNA in eukaryotes, it is of interest to pose an important question: What proportion of the eukaryotic genome actually encodes functional genes? We have seen that, taken together, the various forms of highly repetitive and moderately repetitive DNA make up a substantial portion of the human genome, approximately 50 percent of all DNA sequences by most estimates. In addition to repetitive DNA, a large amount of the DNA consists of single-copy sequences that appear to be noncoding. Included are many instances of what we call pseudogenes. These are DNA sequences representing evolutionary vestiges of duplicated copies of genes that have undergone significant mutational alteration. As a result, although they show some homology to their parent gene, they are usually not transcribed because of insertions and deletions throughout their structure.