Oct 4 Genes and genomes, transposable elements Flashcards
what is the genome?
the entirety of an organism’s hereditary information
usually DNA, but some viruses have RNA genomes
how is double stranded DNA length measured?
measured in base pairs (bp)
1000 bp = 1 kbp (kb)
1,000,000 bp = 1 Mbp (Mb)
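A minimal sketch of the unit conversion, assuming the standard prefixes (1 kb = 1,000 bp, 1 Mb = 1,000,000 bp, 1 Gb = 1,000,000,000 bp). The function name `format_bp` is made up for illustration.

```python
def format_bp(n_bp: int) -> str:
    """Express a length in base pairs using the largest unit that fits."""
    for factor, unit in ((10**9, "Gb"), (10**6, "Mb"), (10**3, "kb")):
        if n_bp >= factor:
            # ":g" drops trailing zeros, so 1000 bp prints as "1 kb"
            return f"{n_bp / factor:g} {unit}"
    return f"{n_bp} bp"


print(format_bp(600))             # 600 bp
print(format_bp(1_000))           # 1 kb
print(format_bp(1_000_000))       # 1 Mb
print(format_bp(43_000_000_000))  # 43 Gb
```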
how is biological complexity related to the size of DNA content in the genome?
it is NOT related
what is the largest sequenced genome?
the Australian lungfish (Neoceratodus forsteri), 43 Gb (about 14x the human genome)
doesn’t have more genes but has lots of transposable elements
what is a gene?
the entire nucleic acid sequence that is necessary for the synthesis of a functional product (polypeptide or RNA)
can be considered as transcription units
what do exons contain?
the coding region or open reading frame (ORF)
what are control regions?
the promoter and other cis-regulatory elements
what are introns?
separate the exons
spliced out during mRNA processing
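A toy sketch of splicing, assuming the exon coordinates are already known. The sequence, the coordinates, and the function name `splice` are all made up for illustration; real splicing is carried out by the spliceosome, not by coordinate lookup.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon intervals (0-based, end-exclusive), dropping the introns between them."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# hypothetical transcript: exon 1 + intron (starts with GU, ends with AG) + exon 2
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
exons = [(0, 6), (18, 24)]

print(splice(pre_mrna, exons))  # AUGGCCUUUUAA
```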
what do proteins with similar functions have?
proteins with similar functions often contain similar amino acid sequences that correspond to functional domains
how can nucleic acid and protein sequence similarity be found?
BLAST
how does protein number vary among species relative to DNA content?
varies much less
what is the difference in genome size mostly due to?
mostly due to different amounts of non-coding DNA and transposable elements
what are orthologs?
the same protein in different species (alpha tubulin in humans and flies)
what are paralogs?
closely related proteins in the same species (alpha tubulin and beta tubulin in humans)
what are solitary/single copy genes?
protein coding genes that are represented once in the genome
there is only one gene that looks like that; if you do a BLAST search with the protein it makes, there will be no paralogs
what is a gene family?
a set of related genes formed by duplication of an original single copy gene
what proportion of protein coding genes do solitary genes represent?
25-50%
the remainder occurs as duplicates or in multiple copies
what gene density do multicellular animal and plant genomes have?
lower gene density
because of noncoding introns and other noncoding sequences
including long tandem arrays of repeated short sequences (larger genomes, but fewer genes per kb)
what does DNA fingerprinting compare?
compares individual differences in simple sequence tandem arrays
what are some characteristics of microsatellite DNA?
- repeat units are typically 1 to 4bp in length
- arrays of up to 600bp in length and are composed of tandem repeat units
- are sometimes found in transcription units
- expansions underlie several neuromuscular diseases like myotonic dystrophy and spinocerebellar ataxia
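A rough sketch of how microsatellite-like arrays could be spotted in a sequence, assuming repeat units of 1 to 4 bp as listed above. The function name, the `min_repeats` threshold, and the test sequence are made up for illustration; dedicated tandem-repeat finders are more careful than this regular expression.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 4):
    """Return (start, unit, copies) for each run of a 1-4 bp unit repeated in tandem."""
    # group 1 is the shortest unit that works; \1{n,} demands further back-to-back copies
    pattern = re.compile(r"([ACGT]{1,4}?)\1{%d,}" % (min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq.upper())
    ]


# hypothetical sequence with five tandem copies of CTG
print(find_microsatellites("GGACTGCTGCTGCTGCTGAAT"))  # [(3, 'CTG', 5)]
```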
how can short repeated sequences be generated?
by backward slippage of the new strand during replication
one repeat unit is looped out (the first replication produces a strand with n+1 repeats, and the second replication makes a duplex with the extra repeat unit)
what are some characteristics of minisatellite DNA?
- repeat units are 14 to 100bp in length
- 20-50 tandem repeat units
- arrays of 1 to 5 kbp in length
- often in centromeres and telomeres
how are repeated sequences used in testing and identification?
they vary extensively in length among individuals
used for paternity determination and to identify criminals
what are transposable (mobile) DNA elements?
non coding
transposable DNA elements move within genomes by different mechanisms
mobile DNA elements influenced evolution and can cause mutations leading to disease
what does recombination between repeated elements do?
can shuffle exons and produce new genes with new combinations of existing exons
what is stable transfection?
the plasmid has the gene that provides antibiotic resistance
can select for the cells that have the gene with drug resistance
selectable marker
how can retroviral vectors be used to integrate cloned genes into mammalian genomes?
retroviral vectors can target living cells
can infect particular tissue types
can introduce a gene into a living organism
tissue culture cells are transiently transfected with 3 different plasmids
1. vector plasmid with the gene of interest
2. packaging plasmid which encodes a lot of viral proteins
3. viral coat plasmid which encodes a gene for VSV G protein
cells that have all 3 of the plasmids can make functional virus particles that have the gene of interest integrated in the viral genome
what do some genes encode (that are not proteins)?
can encode RNA that has a function
ribosomal RNA, tRNA, small RNA involved in splicing
those RNA do not encode proteins but they are functional
what is included in a gene?
the region that codes for mRNA
but that region is not continuous; it is interrupted by introns
what region lies upstream of the 5’ end on a gene?
regions that control the expression of the gene (in what cell type, at what developmental time, and how it responds to stimulus)
what region lies on the 3’ end?
the poly(A) site (polyadenylation)
A residues are enzymatically added to the RNA at the 3’ end
this poly(A) site determines where the transcript terminates and where the A residues will be added
what happens to those regions during transcription?
not transcribed, do not become mRNA, but they are part of the gene
how does BLAST work?
BLAST finds the best local alignments between a query sequence and the sequences in a database -> can see the similarities
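A small sketch of local alignment scoring. This is the Smith-Waterman dynamic programming method, not BLAST itself: BLAST uses a faster heuristic to approximate this kind of search against a whole database. The scoring values (match +2, mismatch -1, gap -2) and the function name are assumptions chosen for illustration.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)  # scores for the previous row of the DP table
    best = 0
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            # a local alignment may restart anywhere, so scores never go below 0
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


print(local_alignment_score("TTACGTTT", "GGACGTGG"))  # 8 (the shared ACGT)
print(local_alignment_score("AAAA", "TTTT"))          # 0 (nothing in common)
```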
how can short repeated sequences be generated during replication?
can be generated by backward slippage
when DNA polymerase reaches a run of repeats, the new strand can sometimes slip back by one repeat unit and synthesis keeps going
if that happens, after the next replication one of the duplexes has an extra repeat, and the array can keep expanding through further polymerase errors
the more the repeat expands, the worse the disease is
the protein is no longer functional or has altered functions
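A toy model of repeat expansion over successive replications. The assumptions are loud ones: each replication adds at most one repeat unit, slippage happens with a fixed probability `p_slip`, and contraction is ignored. None of these numbers come from the notes; the function name is made up.

```python
import random


def simulate_expansion(n_repeats: int, generations: int,
                       p_slip: float, seed: int = 0) -> list[int]:
    """Track the repeat count when each replication may add one unit by slippage."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    history = [n_repeats]
    for _ in range(generations):
        if rng.random() < p_slip:  # slippage event: daughter strand gains one unit
            n_repeats += 1
        history.append(n_repeats)
    return history


print(simulate_expansion(5, 3, p_slip=1.0))  # [5, 6, 7, 8]
print(simulate_expansion(5, 3, p_slip=0.0))  # [5, 5, 5, 5]
```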
what is the mechanism for increasing the copy number of DNA transposons?
if a transposon moves from a region that has been replicated to a region that hasn’t been replicated yet (jumps ahead of the replication fork) then its copy number will increase by 1 in one of the daughter chromosomes (will be replicated twice)
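A bookkeeping sketch of this mechanism, assuming a cut-and-paste move and a chromosome described only by positions. The function name and the position numbers are made up for illustration.

```python
def daughters_after_jump(donor_site: int, target_site: int, fork: int):
    """Transposon sites in the two daughter chromosomes after a jump past the fork."""
    if not (donor_site < fork <= target_site):
        raise ValueError("donor must be behind the fork and target ahead of it")
    # the donor region is already replicated, so two sister copies of the element exist
    sister_a = {donor_site}  # this sister keeps its copy
    sister_b = set()         # the element was cut out of this sister
    # the target region is not yet replicated, so the inserted element
    # is copied into both daughters when the fork passes through
    sister_a.add(target_site)
    sister_b.add(target_site)
    return sister_a, sister_b


a, b = daughters_after_jump(donor_site=10, target_site=50, fork=30)
print(len(a), len(b))  # 2 1  (one daughter gained a copy)
```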
what are the two major classes of transposons and the difference?
DNA transposon:
move directly as pieces of DNA (cut and paste); copy number can increase when they move during DNA replication (3% of human genome)
retrotransposon:
have an RNA intermediate involved in the movement
40% of the human genome
what is the general structure of eukaryotic LTR (long terminal repeat) retrotransposons?
whatever was at one end gets repeated at the other (the long terminal repeats)
the protein-coding region encodes enzymes that are needed for transposition: reverse transcriptase (copies RNA back into DNA), integrase (allows the DNA copy to integrate into a new chromosomal site) and others
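A minimal sketch of what reverse transcriptase produces at the sequence level: a DNA strand complementary to the RNA, written 5' to 3'. The function name and the example sequence are made up; the enzyme's priming and strand-switching steps are not modeled.

```python
# base-pairing rules for copying RNA into DNA
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the complementary DNA (cDNA) of an RNA, both read 5' to 3'."""
    # the strands are antiparallel, so read the RNA backwards while complementing
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


print(reverse_transcribe("AUGC"))  # GCAT
```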
what is a retrovirus?
a virus whose genome resembles a retrotransposon
what are LINEs?
nonviral retrotransposons (cannot be packaged into a virus)
have one open reading frame (ORF) that encodes an RNA-binding protein and another ORF that encodes a protein that is both a reverse transcriptase and a DNA endonuclease
linked to some diseases if inserted in the wrong place
what can recombination between repeated elements do?
can shuffle exons and produce new genes with new combinations of existing exons
what can DNA transposons and LINEs do when they move?
they can carry unrelated flanking sequences (neighbouring DNA) with them when they move