Ch. 4 - Genomes and DNA Flashcards

1
Q

Genome

A

The total genetic information that an organism (whether a living cell or a virus) possesses.

Bacterial genomes are mostly circular with densely packed genes, some of which are grouped into operons (where multiple genes are controlled by one promoter). Genome sizes vary from 0.5 to 12.0 megabase pairs (Mbp), and a genome generally encodes 600-6000 proteins.

Viral genomes are smaller (about 0.2 to over 1 Mb) than most bacterial genomes and often are missing key genes for survival as they rely on their host cell to provide these gene products.

Organelle genomes are circular and only have some of the genes necessary for their function within the cell. Genome sizes vary from about 15 kb to 0.2 Mb.

Eukaryotic genomes are large (ranging from ∼10 Mbp in some fungi to >13 000 Mb in lungfishes). Arabidopsis has about 115 Mbp. Humans have about 3 300 Mbp in their genome, which contains over 20 000 genes. Eukaryotes contain much more non-coding DNA, including intergenic DNA and introns.

2
Q

Operon

A

A group/cluster of multiple genes in a genome that have their expression controlled by one set of promoter and regulatory regions. The genes in an operon often have related functions. Mostly found in bacteria.

3
Q

Genome size and C-value paradox

A

Genome size, the number of genes in the genome, and the number of chromosomes can vary, and they are independent of each other. Also, none of these variables directly determine/correlate with the complexity of the organism. However, parasitic organisms that rely on others to provide the essentials for life often have relatively smaller genomes than corresponding free-living organisms.

4
Q

The Symbiotic theory

A

Proposes that the complex eukaryotic cell arose by a series of symbiotic events in which organisms of different lineages merged. Over time, the symbionts lost the ability to survive on their own and became specialized to provide a specific function for the host. The theory suggests that the organelles of higher organisms (eukaryotic cells) are remnants of ancient symbiotic bacteria. According to the symbiotic theory, mitochondria are derived from ancestral bacteria that specialized in respiration, whereas chloroplasts are descended from ancestral photosynthetic bacteria.

5
Q

Non-coding DNA

A

DNA that does not code for proteins or functional RNA molecules. Accounts for the majority of the DNA in eukaryotes, especially in higher animals and plants. Explains the C-value paradox: the amount of DNA does not correlate with the number of genes, and the complexity of an organism does not relate to the amount of DNA in its genome. Regions of non-coding DNA between genes are called intergenic DNA. Non-coding regions that interrupt the coding regions of genes are called intervening sequences, or introns.

6
Q

Exons and Introns

A

Exons: Region of the DNA that contains coding information; a segment of a gene that codes for protein. Exons are still present in the mRNA after processing is complete. Most eukaryotic genes consist of exons alternating with introns.
Introns: Region of non-coding DNA; a segment of a gene that does not code for protein. Introns are transcribed and form part of the primary transcript. In lower single-celled eukaryotes, introns are relatively rare and often quite short. In higher eukaryotes, introns are often longer than the exons.

7
Q

Repeated sequences

A

DNA sequences that are repeated multiple times throughout the genome. Also called repetitive sequences. When the repeated sequences follow each other directly, they are called tandem repeats. When the repeated sequences are spread separately around the genome, they are called interspersed sequences. About 50% of the human genome consists of repeated sequences.

8
Q

Consensus sequence

A

Idealized base sequence consisting of the bases most often found at each position. Derived by examining and comparing multiple related individual sequences and the frequency of base appearances at each position. The sequence that is the most representative for the series of related sequences compared, is the consensus sequence. Consensus sequences are used to describe many different DNA motifs, including transcription factor binding sites, RNA polymerase binding sites, enhancer elements, DNA binding sites etc. They can also be used to describe conserved protein domains, but instead of using nucleotides, a protein consensus sequence is described by the most common amino acid at each position.
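
The derivation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the sequences are already aligned and of equal length; the function name `consensus` and the example sites are made up for this card, not taken from a real data set.

```python
from collections import Counter

def consensus(sequences):
    """Return the most common symbol at each position of aligned sequences."""
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*sequences):
        # Pick the symbol with the highest count in this column;
        # a tie goes to the symbol that appears first in the column.
        result.append(Counter(column).most_common(1)[0][0])
    return "".join(result)

# Hypothetical aligned binding sites (illustrative only).
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT", "GATAAT"]
print(consensus(sites))  # TATAAT
```

The same function works for protein sequences, since it only counts symbols per column.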

9
Q

Pseudogenes

A

A small category of repeats found in eukaryotic cells. Present in only one or two copies, and can be located next to or far away from the original, functional version of the gene. Some pseudogenes are defective duplicates of genuine genes whose defects prevent them from being expressed. Other pseudogenes are expressed, but their mRNA regulates expression of other genes rather than coding for proteins. Account for only a tiny fraction of the DNA.

10
Q

Moderately repetitive sequences and LINEs

A

DNA sequences that exist in hundreds or thousands of copies. In the human genome, 25% of the total DNA falls into this category. This includes multiple copies of highly used genes, like those for ribosomal RNA, as well as non-functional stretches of DNA that are repeated many times. In every life form studied to date, rRNA genes are arranged in linear clusters in the genome. These are expressed as polycistronic RNA and then processed into separate rRNAs.

Long INterspersed Elements (LINEs): Long sequence found in multiple copies that makes up much of the moderately repetitive non-coding DNA of mammals. Thought to be derived from retrovirus-like ancestors. A complete LINE-1 (L1) element contains about 7000 bp, although most individual L1 elements are shorter. LINEs are scattered throughout the genome.

11
Q

Highly repetitive DNA and SINEs

A

DNA sequences that exist in hundreds of thousands to millions of copies. About 10% of the human DNA.

Short INterspersed Elements (SINEs): Short sequence found in multiple copies that makes up much of the highly or moderately repetitive DNA of mammals. As far as is known, almost all of these sequences are non-functional. The best known SINE is the 300 bp Alu element. About 6-8% of the human DNA consists of repeats of the Alu element. Recent studies suggest that Alu elements bind to RNA polymerase II to repress gene transcription. SINEs are scattered throughout the genome.

12
Q

Satellite DNA or Tandem repeats

A

Highly repetitive non-coding DNA of eukaryotic cells that is found as long clusters of tandem repeats. Satellite DNA is inert and permanently coiled tightly into heterochromatin. In humans, a large proportion of satellite DNA, and therefore of heterochromatin, is located around the centromeres of the chromosomes, suggesting that it serves some structural role. These centromeric repeats are called alpha DNA. The amount of satellite DNA is highly variable.

13
Q

Unequal crossing over

A

Long series of tandem repeats tend to misalign when pairs of chromosomes line up for recombination during meiosis. Unequal crossing over will then produce one shorter and one longer segment of repetitive DNA. Thus, the exact number of tandem repeats varies from individual to individual within the same population, and even between chromosomes in a chromosome pair.
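
A small worked example may make the bookkeeping clearer. The sketch below assumes a deliberately simplified model in which each chromosome carries one array of identical repeat units and the crossover falls exactly between two units; the function name and parameters are hypothetical.

```python
def unequal_crossover(repeats_a, repeats_b, break_a, break_b):
    """Return the repeat counts of the two recombinant products.

    repeats_a, repeats_b: number of tandem repeat units on each chromosome.
    break_a, break_b: number of whole units to the left of the crossover
    point on each chromosome (equal values mean perfect alignment).
    """
    if not (0 <= break_a <= repeats_a and 0 <= break_b <= repeats_b):
        raise ValueError("break points must lie within the repeat arrays")
    product_1 = break_a + (repeats_b - break_b)  # left part of A + right part of B
    product_2 = break_b + (repeats_a - break_a)  # left part of B + right part of A
    return product_1, product_2

print(unequal_crossover(10, 10, 5, 5))  # (10, 10): aligned, no change
print(unequal_crossover(10, 10, 6, 4))  # (12, 8): misaligned by two units
```

The total number of repeats is conserved; misalignment only redistributes them, giving one longer and one shorter array.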

14
Q

Palindromes and Inverted repeats

A

Palindrome: A sequence that reads the same backwards as forwards. In DNA, which is double stranded, two types of palindromes are theoretically possible.

Mirror-like palindromes: Similar to those of ordinary text. The sequence is the same when read backwards and forwards on the same strand. Both strands are involved because they are complementary: if one of the strands is palindromic, the complementary strand must be palindromic too.
ATGCCGTA
TACGGCAT

Inverted repeat: Sequence reads the same forwards on one strand as it reads backwards on the complementary strand. Much more common and of major biological significance. Inverted repeats are extremely important as recognition sites on the DNA for the binding of a variety of proteins. Many regulatory proteins as well as restriction and modification enzymes recognize inverted repeats.
GGATATCC
CCTATAGG
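
The two definitions can be checked mechanically. The following sketch uses the two example sequences from this card; the helper names are illustrative, and the code assumes an uppercase DNA alphabet of A, C, G, and T only.

```python
# Translation table that maps each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand read in the opposite direction."""
    return seq.translate(COMPLEMENT)[::-1]

def is_mirror_palindrome(seq):
    """Same strand reads identically forwards and backwards."""
    return seq == seq[::-1]

def is_inverted_repeat(seq):
    """One strand read forwards equals the complementary strand read backwards."""
    return seq == reverse_complement(seq)

print(is_mirror_palindrome("ATGCCGTA"), is_inverted_repeat("ATGCCGTA"))  # True False
print(is_mirror_palindrome("GGATATCC"), is_inverted_repeat("GGATATCC"))  # False True
```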

15
Q

VNTR alleles and DNA fingerprinting

A

Due to unequal crossing over, the number of repeats in a given VNTR varies among individuals.
Although VNTRs most often are non-coding DNA and not true genes, the different versions of them are referred to as alleles. An allele is a particular version of a gene, or more broadly, a particular version of any locus on a molecule of DNA.
Some hyper-variable VNTRs may have as many as 1000 different alleles and give unique patterns for almost every individual. This quantitative variation may be used to identify individuals by DNA fingerprinting.
DNA fingerprints are individually unique patterns due to the multiple bands of DNA produced using restriction enzymes, separated by gel electrophoresis, and usually visualized by Southern blotting (or simple dye).

16
Q

Variable Number Tandem Repeats (VNTRs)

A

Segments of DNA consisting of short tandem repeats, but in far fewer copies than satellites. They are therefore often categorized as either minisatellites (repeat unit around 25 bp) or microsatellites (repeat unit less than 13 nt), although the lengths and definitions vary. In mammals, VNTRs are common and are scattered over the genome. One of the most common minisatellites is the repeat found in eukaryotic telomeres.

VNTRs are clusters of tandemly-repeated sequences in the DNA whose number of repeats differs between individuals. Person-to-person variation in the overall length of short tandem repeats allows individual identification and is used in forensic analysis.
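
As a rough illustration of how alleles differ, the sketch below counts the longest run of back-to-back copies of a repeat unit in a sequence. The function name, the repeat unit, and the two "alleles" are invented for this example; real forensic typing measures fragment lengths rather than scanning sequence text.

```python
def longest_tandem_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be non-empty")
    best = 0
    for start in range(len(sequence)):
        count = 0
        position = start
        # Keep stepping one unit at a time while the unit repeats.
        while sequence.startswith(unit, position):
            count += 1
            position += len(unit)
        best = max(best, count)
    return best

# Two hypothetical alleles of the same locus with different repeat numbers.
allele_1 = "GG" + "CAG" * 5 + "TT"
allele_2 = "GG" + "CAG" * 7 + "TT"
print(longest_tandem_run(allele_1, "CAG"))  # 5
print(longest_tandem_run(allele_2, "CAG"))  # 7
```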

17
Q

Gel electrophoresis

A

Electrophoresis is a method to separate molecules on the basis of charge. The movement of these charged molecules is due to an electric field/current. Gel electrophoresis is the electrophoresis of charged molecules through a gel meshwork. This method is often used to separate DNA or RNA fragments based on their size/molecular weight, as they are negatively charged. The method can also be used to separate proteins.

Most DNA is separated using agarose gel electrophoresis, where the gel consists of a cross-linked agar polymer. Smaller/shorter fragments of DNA are separated by polyacrylamide gel electrophoresis (PAGE), which has a higher resolution gel with smaller pores.

DNA samples are loaded into wells in the gel near the negative electrode. An electric current drives the migration of the DNA fragments through the meshwork of the gel towards the positive electrode. Smaller fragments travel faster through the gel, so after a given time they will have traveled further than the larger fragments. This way, DNA samples can be separated based on their size.

In addition to the samples, a loading dye is usually added to the wells; it stabilizes the DNA and increases the density of the sample. The loading dye often contains a dye that allows visualization of the DNA fragments (often in UV light) after the gel is run. A standard kilobase ladder with fragments of known lengths is also run parallel to the samples, to determine the approximate size of the fragments.
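
Reading a fragment size off the ladder can be sketched as an interpolation. The example assumes the common approximation that migration distance is roughly linear in the logarithm of fragment length within the range of the ladder; that relationship, the function name, and the ladder values are assumptions for illustration, not figures from this card.

```python
import math

def estimate_size(distance, ladder):
    """Estimate fragment size (bp) from migration distance.

    ladder: list of (distance, size_bp) pairs for the marker bands, each at
    a distinct distance. log10(size) is interpolated linearly between the
    two bands that bracket the unknown band.
    """
    bands = sorted(ladder)  # increasing distance, i.e. decreasing size
    for (d1, s1), (d2, s2) in zip(bands, bands[1:]):
        if d1 <= distance <= d2:
            fraction = (distance - d1) / (d2 - d1)
            log_size = math.log10(s1) + fraction * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("distance lies outside the ladder range")

# Hypothetical ladder: (migration distance in mm, fragment size in bp).
ladder = [(10.0, 10000), (20.0, 1000), (30.0, 100)]
print(round(estimate_size(15.0, ladder)))  # 3162
```

A band outside the ladder range raises an error rather than extrapolating, since the approximation is least reliable there.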

18
Q

Supercoiling of Bacterial DNA and Linking number

A

Higher-level coiling of DNA that is already a double helix (a double helix that is twisted again). Necessary for packaging of bacterial DNA. As the bacterial DNA is 1000 times longer than the cell that contains it, the DNA must be supercoiled in order to fit inside the cell. There is roughly one supercoil for every 200 nt in a typical bacterial DNA.

The original double helix has a right-hand twist, but the supercoils twist in the opposite sense. They are left-handed or “negative” supercoils. Negative (rather than positive) supercoiling helps promote the unwinding and strand separation necessary during replication and transcription.

The total amount of twisting present in a DNA molecule is referred to as the linking number (L). This is the sum of the contribution due to the double helix plus the supercoiling. The number of double helical turns is called twists (T), and the number of superhelical turns is called the writhe or writhing number (W). L = T + W.
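
A back-of-the-envelope calculation shows how the three quantities relate. The sketch below uses the figures given in this deck (10 bp per helical turn for B-DNA and roughly one negative supercoil per 200 nt); the 4000 bp plasmid and the function name are hypothetical, and real linking numbers are whole numbers, which this estimate does not enforce.

```python
def linking_number(length_bp, bp_per_turn=10, bp_per_supercoil=200):
    """Return (T, W, L) for a closed circular DNA of the given length."""
    twist = length_bp / bp_per_turn            # T: double-helical turns
    writhe = -length_bp / bp_per_supercoil     # W: negative supercoils count as negative
    return twist, writhe, twist + writhe       # L = T + W

twist, writhe, linking = linking_number(4000)
print(twist, writhe, linking)  # 400.0 -20.0 380.0
```

Because the supercoils are negative, the linking number is lower than it would be for the same molecule in a relaxed state.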

19
Q

Hairpin structures and Stem and loop motifs

A

Hairpin: If a single strand of DNA containing inverted repeats is folded back upon itself, base pairing occurs and holds the two halves of the sequence together, forming a hairpin structure. The U-turn at the top of the hairpin is possible, but energetically unfavorable.
Stem and loop: In practice, a few unpaired bases (N) are found forming a loop at the top of the base paired stem, forming a stem and loop motif. Such stem and loop motifs can form from one strand of any inverted repeat that has a few extra bases in the middle.
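
The stem and loop condition can be written as a simple test: the second arm of the stem must be the reverse complement of the first, with the unpaired loop bases in between. This is a minimal sketch assuming perfect base pairing and an A, C, G, T alphabet; the function name and the example strand are made up.

```python
# Translation table that maps each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def forms_stem_loop(strand, stem_length, loop_length):
    """True if strand = stem arm + loop + reverse complement of the stem arm."""
    if len(strand) != 2 * stem_length + loop_length:
        return False
    left_arm = strand[:stem_length]
    right_arm = strand[stem_length + loop_length:]
    # The right arm must pair with the left arm when the strand folds back.
    return right_arm == left_arm.translate(COMPLEMENT)[::-1]

# Hypothetical strand: 5-base stem arms around a 4-base loop.
print(forms_stem_loop("GGATC" + "TTTT" + "GATCC", 5, 4))  # True
print(forms_stem_loop("GGATC" + "TTTT" + "GATCA", 5, 4))  # False
```

With `loop_length=0` the same test describes the plain hairpin formed by an uninterrupted inverted repeat.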

20
Q

Bent DNA

A

A DNA sequence that consists of several runs of adenine (A) residues (3-5 nt long) separated from each other by 10 bp (one turn of the double helix), that forms bends in the helix. Bending occurs at the 3´ end of the adenines (A-tract). Thought to help binding of DNA replication initiating proteins. In addition to “naturally” bent DNA, certain regulatory proteins bend DNA into U-turns when activating transcription. Bent DNA moves slower during gel electrophoresis, and the mobility of the DNA varies depending on the locations of the bent regions. Bends in the middle of the DNA molecule have greater effect than those close to the ends.

21
Q

Catenanes and Knots

A

Circular molecules of DNA may become interlocked during replication or recombination. Such structures are called catenanes (structures in which two or more circles of DNA are interlocked). The circles may be liberated by certain type II topoisomerases, such as topoisomerase IV in E. coli and related enzymes. Circular DNA may also form knots. Type II topoisomerases can both create and untie knots.
Topoisomerases may uncoil, unknot, or unlink DNA as well as carry out the coiling, knotting, and interlinking of DNA.

22
Q

Covalently closed circular DNA and open DNA

A

Bacterial chromosomes and plasmids are double stranded circular DNA molecules and are often referred to as covalently closed circular DNA or cccDNA.

If one strand of a double stranded circle is nicked, the supercoiling can unravel. Such a molecule is known as an open circle.

23
Q

Bacterial Chromosomes: Scaffold and nucleoid

A

The second level of compaction of the bacterial DNA occurs when approximately 50 loops of supercoiled DNA are arranged around a protein scaffold (extending from a central scaffold). In other words, the bacterial chromosome consists of about 50 supercoiled loops of DNA.

The two levels of compaction are essential and actually keep the bacterial chromosomal DNA compacted into a small area called the nucleoid. This structure contains the chromosome and its associated proteins. The nucleoid is a dense area offset from the rest of the cytoplasm, but is not compartmentalized with a membrane like in eukaryotes. The location of a gene on the circle correlates with a specific location within a nucleoid region.

24
Q

Topoisomerases and DNA gyrase

A

The same circular DNA molecule can have different numbers of supercoils. These forms are known as topological isomers, or topoisomers. The enzymes that insert or remove supercoils, and so change the level of supercoiling, are therefore named topoisomerases.

Type I Topoisomerase: breaks only one strand of DNA, which changes the linking number in steps of one.
Type II Topoisomerase: breaks both strands of DNA, passing another part of the double helix through the gap. This changes the linking number in steps of two.

DNA gyrase: a type II topoisomerase that introduces negative supercoils into closed circular molecules of DNA, such as plasmids or the bacterial chromosome. Gyrase works by cutting both strands of the DNA, twisting the DNA strands, and then rejoining the cut ends. It is ATP-driven and can generate 1000 supercoils per minute. The enzyme is a tetramer of two types of subunits. The GyrA subunit cuts and rejoins the DNA, and the GyrB subunit provides energy by ATP hydrolysis.

25
Q

Local supercoiling

A

Whether the DNA is found in a prokaryote or a eukaryote, the double helix must first be unwound when DNA is replicated or when genes are transcribed and expressed. This is aided by the negative supercoiling of the chromosome. However, as the replication apparatus or the transcription apparatus proceeds along the double helix of DNA, it creates positive supercoiling ahead of itself. For replication and transcription to proceed more than a short distance, DNA gyrase must insert negative supercoils to cancel out the positive ones. Behind the moving replication/transcription apparatus, a corresponding wave of negative supercoiling is generated. Excess negative supercoils are removed by topoisomerase I. As a result, at any given instant, the extent of supercoiling varies greatly in any particular region of the chromosome.

26
Q

Helical structures of DNA

A

B-form/B-DNA: The most common and stable form, described by Watson and Crick (1953). It is right-handed with 10 bp per turn of the helix. The grooves running down the helix are of different depths, and are referred to as the major and minor grooves.

A-form/A-DNA: An alternative form of the double helix with 11 bp per turn. It is shorter and broader than the B-form. Often found for double-stranded RNA, but rarely for DNA. Double-stranded DNA tends to form an A-helix only at a high salt concentration or when it is dehydrated. Hybrids with one RNA strand and one DNA strand usually form A-helices. In the A-form, the bases tilt away from the axis, the minor groove becomes broader and shallower, and the major groove becomes narrower and deeper.

Z-form/Z-DNA: An alternative form of the double helix with left-handed turns and 12 bp per turn. It is longer and thinner than the B-form, and its backbone forms a zigzag line rather than a smooth helical curve. Negative supercoiling induces the formation of Z-DNA, and the appearance of Z-DNA in part of a DNA molecule helps to remove supercoiling stress. High salt favors the Z-form, and Z-DNA is formed in regions of DNA that contain large numbers of alternating GC and GT pairs (G alternating with C or T), such as
GCGCGCGC or GTGTGTGT
CGCGCGCG    CACACACA

H-form/H-DNA: An even more peculiar alternative form. H-DNA contains a triple helix, consisting of one purine-rich (GA) strand and two pyrimidine-rich (CT) strands. The other purine-rich strand is displaced and left unpaired. The H-form depends on long tracts of purines in one strand and, consequently, only pyrimidines on the other strand, such as
GGGGGGGG or GAGAGAGA
CCCCCCCC    CTCTCTCT
Two such segments are required, and they may interact to form H-DNA when the DNA is highly supercoiled. In addition, the overall region must be a mirror-like palindrome.

27
Q

Packaging DNA into Eukaryotic Nuclei

A

Eukaryotic chromosomes may be up to a centimeter long and must be folded up to fit into the cell nucleus, which is five microns across. The folding requires that the DNA be compacted 2000-fold. However, eukaryotic chromosomes are not circular, and instead of supercoiling using DNA gyrase, the mechanism of packaging involves several levels of twisting and protein interactions.

28
Q

Histones, Nucleosomes, and Chromatin

A

In the first level of compaction, DNA is wound around special proteins called histones. Histones are positively charged proteins that attract and bind the negatively charged DNA and help to maintain the structure of the chromosomes. Eight histones make up the core unit: two of H2A, two of H2B, two of H3, and two of H4. The DNA wraps around the ball of histones two times, and this structure is called a nucleosome.

Nucleosome: The basic unit in the folding of eukaryotic DNA, composed of around 200 bp of DNA coiled around the core of eight histone proteins, and one separate histone (H1) at the site where the wrapped DNA diverges.

The entire chain of nucleosomes is called chromatin, and is also referred to as “beads on a string” due to its appearance. Between adjacent nucleosomes is a short span of free DNA, which is partially protected by a ninth histone, H1.

29
Q

Histone proteins: H2A, H2B, H3, H4, and H1

A

The core histones, H2A, H2B, H3, and H4 are small, and roughly spherical proteins with 102 to 135 amino acids. However, the linker histone, H1, is longer, having about 220 amino acids.

H1 has two arms extending from its central spherical domain. The central part of H1 binds to its own nucleosome, and the two arms bind to the nucleosomes on either side.

The core histones have a body of about 80 amino acids and a tail of 20 amino acids at the N-terminal end. The tail contains several lysine residues that may have acetyl groups added or removed. This is thought to partly control the state of DNA packaging and hence of gene expression. Thus, in active chromatin, the core histones are highly acetylated. In addition to acetylation, the histone tails can also be modified by methylation or phosphorylation. When histones are methylated, nearby histones are deacetylated, resulting in inhibition of transcription. Phosphorylation confers a negative charge to the histones, opening the chromatin structure and inducing transcription.

In addition to histone modification, proteins like Heterochromatin protein 1 (HP1), Polycomb group (PcG) proteins, and chromatin remodeling complexes modulate the level of nucleosome packaging.

30
Q

Heterochromatin and Euchromatin

A

Heterochromatin: Highly condensed DNA that is permanently and tightly coiled. This form of chromatin is genetically inert because RNA polymerase cannot access it. Heterochromatin forms when 30 nm loops are highly condensed. The rest of the chromatin is euchromatin.

Euchromatin: Opposite of heterochromatin. Looser and more accessible form of DNA in eukaryotes. The more extended form, like “beads on a string” or 30 nm fiber. About 10% of this euchromatin is even less condensed, and is either being transcribed or is accessible for transcription in the near future. This is the “active chromatin”. Expressed genes are generally found within euchromatin.

31
Q

30 nm fiber and the third level of compaction

A

Chain of nucleosomes that is arranged helically with 6 nucleosomes per turn, approximately 30 nm in diameter. Second level of compaction. The nucleosomes may form a tubular solenoid shape, or they may zigzag back and forth.

In the third level of compaction, the 30 nm fibers loop back and forth. The loops vary in size, averaging about 50 of the helical turns (i.e., about 300 nucleosomes) per loop, much like the supercoiled loops of the bacterial chromosome. The ends of the loops are periodically attached/anchored to a protein scaffold or chromosome axis.
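
The numbers in this card and in the nucleosome card can be combined into a quick estimate of how much DNA one average loop holds. This is a rough sketch using only the deck's own round figures (6 nucleosomes per helical turn, about 50 turns per loop, around 200 bp per nucleosome); the function and constant names are illustrative.

```python
NUCLEOSOMES_PER_TURN = 6    # 30 nm fiber, from this card
TURNS_PER_LOOP = 50         # average loop size, from this card
BP_PER_NUCLEOSOME = 200     # approximate DNA per nucleosome, from the nucleosome card

def loop_content(turns=TURNS_PER_LOOP):
    """Return (nucleosomes, base pairs) for a loop with the given number of turns."""
    nucleosomes = turns * NUCLEOSOMES_PER_TURN
    base_pairs = nucleosomes * BP_PER_NUCLEOSOME
    return nucleosomes, base_pairs

print(loop_content())  # (300, 60000)
```

So an average loop carries on the order of 60 kb of DNA, given these approximations.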

32
Q

Chromatid

A

Further folding of chromosomes occurs in preparation for cell division. Just before cell division, the DNA condenses and folds up. The typical metaphase chromosome has replicated its DNA previously, and is about to divide into two daughter chromosomes. It therefore consists of two identical double-helical DNA molecules that are still held together at the centromere. These are known as chromatids. Between cell divisions, and in non-dividing cells, each chromosome consists of only a single chromatid.

Chromatid: Single double-helical DNA molecule making up whole or half of a chromosome; a chromatid also contains histones and other DNA associated proteins.