Genes, genomes and genomics Flashcards
What is a genome?
The entire complement of hereditary genetic information, in other words the DNA
The region of the cell containing this genetic material is called the nucleus (eukaryotes) or the nucleoid (prokaryotes)
This includes genes, regulatory sequences, structural DNA and “junk” (non-coding) DNA
Genome functions
Genome size and complexity
DNA is divided into genes, which are usually discrete units of heredity
One gene per protein (most of the time)
Genomes vary in size
DNA is measured in base pairs (bp)
1 bp = 1 letter in the genetic code (A, T, C or G)
Therefore: 1,000 bp = 1,000 As, Ts, Cs and Gs (1,000 bp = 1 kb)
Genome size is therefore measured in base pairs
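The unit conversion above can be sketched in a few lines of Python. The helper name `format_bp` is made up for illustration, and the Mb and Gb units are the standard extensions of the kb unit given in the notes.

```python
def format_bp(n_bp: int) -> str:
    """Express a length in base pairs using bp, kb, Mb or Gb."""
    units = ((1_000_000_000, "Gb"), (1_000_000, "Mb"), (1_000, "kb"))
    for factor, unit in units:
        if n_bp >= factor:
            return f"{n_bp / factor:.2f} {unit}"
    return f"{n_bp} bp"


print(format_bp(1_000))      # 1,000 bp expressed in kb
print(format_bp(2_400_000))  # a gene-sized length expressed in Mb
```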
- C-value: the total amount of DNA in the genome
- Does not always equate to the number of genes contained within the genome
- We would expect that the more complex the organism, the more DNA is needed to sustain it
- Therefore, we would expect a linear relationship between genome size and organism complexity
The C-value paradox: massive disparity between genome size and complexity
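A rough illustration of the paradox, comparing genome sizes to the human genome. Only the human figure comes from these notes; the other sizes are rounded, commonly quoted values included here as assumptions, and the function name is hypothetical.

```python
# Approximate haploid genome sizes in bp.
# Only the human value is from the notes; the rest are rounded assumptions.
GENOME_SIZES_BP = {
    "Escherichia coli (bacterium)": 4_600_000,
    "Saccharomyces cerevisiae (yeast)": 12_000_000,
    "Homo sapiens (human)": 3_164_700_000,
    "Paris japonica (flowering plant)": 149_000_000_000,
}


def relative_to_human(sizes):
    """Return each genome size as a multiple of the human genome size."""
    human = sizes["Homo sapiens (human)"]
    return {name: size / human for name, size in sizes.items()}


ratios = relative_to_human(GENOME_SIZES_BP)
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name}: {ratio:.4f} x human genome")
```

With these figures a flowering plant carries dozens of times more DNA than a human, which is the kind of disparity the C-value paradox describes.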
Sequencing
Background
The human genome contains 3,164,700,000 bp
The average gene consists of 3,000bp
Sizes vary greatly
The largest known human gene is dystrophin, at 2.4 million bp
The total number of genes is estimated at 20,000-25,000; more recent estimates suggest about 21,000
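The figures above can be combined into a quick back-of-envelope calculation of how much of the genome is taken up by genes. This is a sketch that assumes every gene has the average length of 3,000 bp, which is a simplification; the function name is illustrative.

```python
GENOME_BP = 3_164_700_000   # human genome size from the notes
AVERAGE_GENE_BP = 3_000     # average gene size from the notes
DYSTROPHIN_BP = 2_400_000   # largest known human gene


def genic_fraction(n_genes, gene_bp=AVERAGE_GENE_BP, genome_bp=GENOME_BP):
    """Fraction of the genome covered if every gene had the average length."""
    return n_genes * gene_bp / genome_bp


for n in (20_000, 21_000, 25_000):
    print(f"{n:,} genes: {genic_fraction(n):.1%} of the genome")
print(f"dystrophin alone: {DYSTROPHIN_BP / GENOME_BP:.3%} of the genome")
```

Under this assumption genes account for only about 2% of the genome, whichever gene-count estimate is used.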
TWO approaches
EXTRINSIC
Compare the sequence to known sequences
Other genes already identified in other species
AB INITIO
Compare the sequence to key motifs in a gene
COMBINED APPROACH:
Human genome resources publicly available
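A minimal sketch of the ab initio idea: scan a sequence for key motifs, here a start codon (ATG) followed in the same reading frame by a stop codon. Real gene finders use many more signals; the function name, the forward-strand-only scan and the `min_codons` cut-off are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs of ATG...stop ORFs on the forward strand.

    Indices are 0-based and half-open; the stop codon is included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ORF
    return sorted(orfs)


dna = "GGATGAAACCCTAGTT"
for start, end in find_orfs(dna):
    print(start, end, dna[start:end])
```

An extrinsic approach would instead compare the candidate region against genes already identified in other species.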
Why study genomes?
L2 Viral genomes
DNA or RNA
Single or double stranded or both
Linear or circular or both
Viruses are not prokaryotes
Type of genome depends on life cycle
Prokaryotic vs eukaryotic genes
eukaryotic:
Untranslated regions (UTRs) at the 5’ and 3’ ends
A TATA box indicates where a genetic sequence can be read and decoded
Promoter sequence, specifies to other molecules where transcription begins
Transcription is a process that produces an RNA molecule from a DNA sequence
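Transcription can be sketched as a simple string operation. The function names are hypothetical, and the sketch ignores promoters, processing and everything else the cell actually does; it only shows the base-level relationship between DNA and RNA.

```python
def transcribe(coding_strand):
    """mRNA for a DNA coding (sense) strand written 5' to 3': T becomes U."""
    return coding_strand.upper().replace("T", "U")


def transcribe_from_template(template_strand):
    """mRNA made from a template strand written 5' to 3'.

    The RNA is complementary and antiparallel to the template,
    so the template is read in reverse and each base is paired.
    """
    pairing = {"A": "U", "T": "A", "C": "G", "G": "C"}
    return "".join(pairing[base] for base in reversed(template_strand.upper()))


print(transcribe("ATGGCC"))
print(transcribe_from_template("GGCCAT"))
```

Both calls describe the same double-stranded DNA, so they give the same mRNA.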
prokaryotic:
Prokaryotic vs eukaryotic genomes
differences:
Eukaryotic gene has a similar structure to the prokaryotic gene (but different)
Coding sequence is interrupted by introns
Introns are spliced out of the pre-mRNA to give the mature mRNA
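Splicing can be pictured as cutting the introns out of the transcript and joining the remaining exons. The sketch below assumes the intron coordinates are already known; the function name and the toy sequence are made up for illustration.

```python
def splice(pre_mrna, introns):
    """Remove introns given as 0-based, half-open (start, end) pairs."""
    exons = []
    pos = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[pos:start])  # keep the exon before this intron
        pos = end                          # skip over the intron
    exons.append(pre_mrna[pos:])           # keep the final exon
    return "".join(exons)


# Toy transcript: exon 1 + intron + exon 2 (sequence is hypothetical)
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
print(splice(pre_mrna, [(6, 18)]))
```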
Prokaryotic vs eukaryotic genome structure
eukaryotic:
DNA
Double stranded
Linear segmented
Many chromosomes
Location: nucleus
Mitochondria: multiple copies per cell.
Except mammalian red blood cells
Own genome: circular, multiple copies per cell
DNA is wrapped around histones
Each histone complex forms a nucleosome
Several nucleosomes wrapped together form a “solenoid” structure
Chromatin fibre
Nucleosomes wind into helix
Six nucleosomes per complete turn
prokaryotic:
DNA
double stranded
Circular
Non-segmented
One chromosome
Replication begins at “Ori” and ends at “Ter”
Two “replichores”
Left and right
DNA forms supercoils
Supercoils form DNA loops
Supercoils relaxed by topoisomerase
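The two replichores are the two arcs of the circular chromosome that run from Ori to Ter. A small sketch of how their lengths follow from the positions of Ori and Ter; the function name and the coordinates in the example are hypothetical.

```python
def replichore_lengths(genome_bp, ori, ter):
    """Lengths of the two Ori-to-Ter arcs on a circular chromosome.

    Positions are bp coordinates on the circle; the modulo handles
    the case where the arc crosses the coordinate origin.
    """
    clockwise = (ter - ori) % genome_bp
    counterclockwise = genome_bp - clockwise
    return clockwise, counterclockwise


# Hypothetical 4.6 Mb chromosome with Ori and Ter roughly opposite each other
right, left = replichore_lengths(4_600_000, ori=3_900_000, ter=1_600_000)
print(right, left)
```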
HETEROCHROMATIN and TRANSPOSABLE ELEMENTS
Heterochromatin: tightly packed (condensed) form of DNA; comes in multiple varieties
Transposable elements (transposons or jumping genes): can change position in the genome, can create or reverse mutations, and can duplicate genetic material
Is the human genome unique?
Homology is the existence of shared ancestry between a pair of structures or genes, in different taxa
Derived from the same ancestral tetrapod structure
Most human genes are homologous to genes in other species
The DNA sequence that can be directly compared between two genomes (for example human and chimpanzee) is almost 99% identical
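A figure like "99% identical" is a percent identity over the positions that can be lined up between the two sequences. A minimal sketch, assuming the two sequences have already been aligned to equal length with "-" marking gaps; the function name is illustrative.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two aligned sequences.

    Columns where either sequence has a gap ("-") are not comparable
    and are left out of the calculation.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100 * matches / compared


print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```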
DNA categories across genome
SINEs are short interspersed nuclear elements: non-coding transposable elements
LINEs are long interspersed nuclear elements: transcribed into RNA and then converted back into DNA by reverse transcriptase (RT) to insert into the genome
Functional elements of the human genome
Alu sequences are the most common SINE
About 300 bp long
1,000,000 + copies in the human genome
Comprises approximately 10% of genome
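The 10% figure can be checked from the other numbers on this card. The calculation below uses the human genome size quoted earlier in these notes and treats every Alu copy as a full 300 bp, which is a simplification.

```python
ALU_LENGTH_BP = 300
ALU_COPIES = 1_000_000
GENOME_BP = 3_164_700_000  # human genome size quoted earlier in the notes

alu_total_bp = ALU_LENGTH_BP * ALU_COPIES
alu_fraction = alu_total_bp / GENOME_BP

print(f"{alu_total_bp:,} bp of Alu sequence")
print(f"{alu_fraction:.1%} of the genome")
```

The result is a little under 10%, consistent with "approximately 10%" given that the copy number is "1,000,000 +".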
The eukaryotic gene-regulatory sequences
The TATA box is a sequence of DNA found in the core promoter region of genes in archaea and eukaryotes
Non-coding DNA sequence
TATA box is the binding site of the TATA-binding protein (TBP) and other transcription factors
Transcription factors (TFs) recruit the enzyme RNA polymerase
Consensus sequence: TATAWAW (W = A or T)
The TATA box defines the direction of transcription and the strand of DNA to be read
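The TATAWAW consensus maps directly onto a regular expression, with W written as the character class [AT]. A minimal sketch that searches one strand only; the function name is hypothetical, and a match in real sequence is only a candidate, not proof of a functional promoter.

```python
import re

# TATAWAW with W = A or T; the lookahead lets overlapping matches be reported
TATA_BOX = re.compile(r"(?=TATA[AT]A[AT])")


def find_tata_boxes(seq):
    """Return 0-based start positions of TATAWAW matches on the given strand."""
    return [match.start() for match in TATA_BOX.finditer(seq.upper())]


print(find_tata_boxes("GGCTATAAAAGGC"))
```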
The eukaryotic and prokaryotic gene
An enhancer is a short (50–1,500 bp) region of DNA
Bound by proteins (activators) to increase the likelihood that transcription of a particular gene will occur
These proteins are usually referred to as transcription factors
They bind to the enhancer and increase the activity of the promoter
Found in both prokaryotes and eukaryotes
DNA is folded and coiled in the nucleus
Enhancer may actually be located near the transcription start site in the folded state
The eukaryotic gene-regulatory sequences
The Kozak sequence is the eukaryotic equivalent of the Shine-Dalgarno sequence in prokaryotes
Consensus ACCATGG, which contains the start codon (ATG)
Functions as the protein translation initiation site
Site where ribosomes bind
Additional ribosomal binding site
5’ methylated cap of the messenger RNA
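A small sketch of locating the ACCATGG translation initiation site in a sequence and reporting where its ATG start codon sits. It looks for exact matches only, which is an assumption made for simplicity: real initiation sites vary around this consensus. The function name is illustrative.

```python
def find_start_sites(seq, consensus="ACCATGG"):
    """Return 0-based positions of the ATG inside each exact consensus match.

    Accepts DNA or RNA; U is treated as T before searching.
    """
    seq = seq.upper().replace("U", "T")
    atg_offset = consensus.index("ATG")
    sites = []
    pos = seq.find(consensus)
    while pos != -1:
        sites.append(pos + atg_offset)
        pos = seq.find(consensus, pos + 1)
    return sites


print(find_start_sites("GGGACCATGGCT"))
```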