Lecture 7: Genomes and their evolution Flashcards
Genome:
is the complete set of genes in an organism or the total genetic content in one set of chromosomes
Genomics:
is the study of whole sets of genes and their interactions
Comparative genomics:
is the analysis & comparison of genomes from different species
Genome sequencing: most ambitious sequencing project to date:
the Human Genome Project which began in 1990 and was completed by 2003
Genome sequencing:
Chimpanzee genome was completed by
2005 & today many more genomes have been completely sequenced
Human genome sequencing project originally took
13 years at a cost of $1 billion
In 2008 Illumina sequencing of a human genome cost
$350,000
In 2014 Illumina announced they could sequence a human genome for
$1000
It is now possible to generate hundreds of gigabases of data very quickly, however:
assembling and annotating genomes still takes a long time.
- Computational and bioinformatic analyses lag behind our ability to generate data.
How many genomes have been sequenced to date? By 2016..
the genomes of more than 14,000 different organisms had been sequenced and many more are in the process of being sequenced.
Many of the sequenced genomes are
bacterial & archaeal genomes. Approx 10% are eukaryotes, and include vertebrates, invertebrates, protists, fungi & plants
Where are genomes sequenced?
- Genomes are sequenced by a variety of public and private organisations e.g.
– National Human Genome Research Institute (NHGRI)
– The Institute for Genomic Research (TIGR)
– The Wellcome Trust Sanger Institute
– JGI – DOE Joint Genome Institute
– Private companies e.g. Syngenta/Monsanto
– Research labs in universities and research institutes
How is genome sequencing prioritised?
• The priority-setting process is based on the medical, agricultural and biological opportunities expected to be created by sequencing a given organism
Genomes vary in:
SIZE
genomes of most bacteria range from
1 to 6 million base pairs (Mb)
Eukaryotic genomes are __ than bacteria
larger
most multicellular animals & plants have genomes with at least
100 Mb
fruit fly genome has how many base pairs?
165 Mb
Human genome number of base pairs
3,000 Mb
The genome of the single-celled yeast S. cerevisiae has about __ Mb
12 Mb
Within each kingdom/domain there is __ systematic relationship between _____ & ____
no
genome size
phenotype
Genomes vary in the number of
genes they contain
Free-living bacteria and archaea have ___ genes
1,500 to 7,500
The number of genes in eukaryotes ranges from
about 5000 (e.g. the unicellular yeast) to at least 40,000 genes in multicellular eukaryotes
The number of genes is / is not correlated with genome size
IS NOT
e. g.
- the nematode worm C. elegans has a 100 Mb genome and 20,000 genes
- Drosophila has a 165 Mb genome and 13,700 genes
Eukaryotic genomes can produce more than one polypeptide per
gene because of alternative splicing of RNA transcripts
Gene density:
the number of genes present in a given length of DNA
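As a rough worked example, the C. elegans and Drosophila figures quoted above can be turned into genes per Mb. This is only a sketch: the function name `genes_per_mb` is made up for illustration, and no human entry is included because these cards do not give a human gene count.

```python
# Genome size (Mb) and gene count, as quoted on the cards above.
genomes = {
    "C. elegans": (100, 20_000),
    "Drosophila": (165, 13_700),
}


def genes_per_mb(size_mb, gene_count):
    """Return the average number of genes per megabase of genome."""
    return gene_count / size_mb


for name, (size_mb, gene_count) in genomes.items():
    print(f"{name}: {genes_per_mb(size_mb, gene_count):.0f} genes per Mb")
```

The larger Drosophila genome carries fewer genes per Mb than the smaller C. elegans genome, which is the point of the cards above: genome size does not predict gene number.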
In bacterial genomes most of the DNA consists of
genes coding for proteins, tRNA or rRNA
Most eukaryotic DNA does
not code for protein and is not transcribed into functional RNA molecules.
Humans have ___ times as much noncoding DNA as bacteria
10,000
Multicellular eukaryotes have introns and a vast amount of
non-protein coding DNA between genes.
“Junk DNA” is thought to play important roles in the cell
- For example, the genomes of humans, rats, and mice show high sequence conservation for about 500 noncoding regions.
- These sequences are more highly conserved than protein-coding genes in these species
Only __ of the human genome codes for proteins or produces rRNAs and tRNAs
1.5%
Gene-related regulatory sequences and introns account for, respectively __% and __% of the human genome.
5%
20%
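To get a feel for these percentages, they can be converted into approximate amounts of DNA using the 3,000 Mb human genome size given earlier. This is a back-of-the-envelope sketch; the category labels and the helper name `fraction_to_mb` are our own, and the figures are the rounded values from these cards.

```python
HUMAN_GENOME_MB = 3_000  # human genome size quoted on an earlier card

# Percentages of the human genome, as quoted on the cards above.
percent_of_genome = {
    "codes for protein, rRNA or tRNA": 1.5,
    "gene-related regulatory sequences": 5,
    "introns": 20,
}


def fraction_to_mb(percent, genome_mb=HUMAN_GENOME_MB):
    """Convert a percentage of the genome into megabases."""
    return percent / 100 * genome_mb


for label, percent in percent_of_genome.items():
    print(f"{label}: about {fraction_to_mb(percent):.0f} Mb")
```

Under these rounded figures, only a few tens of Mb out of 3,000 Mb directly encode protein, rRNA or tRNA.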
Intergenic DNA is
Noncoding DNA found between genes
- Pseudogenes
- Repetitive DNA
Pseudogenes are
former genes that have accumulated mutations and are nonfunctional
Repetitive DNA is
present in multiple copies in the genome
About __% of the human genome is repetitive DNA made up of transposable elements and related sequences
44%
Comparative genomics
is the analysis and comparison of genomes from different species
Comparative genomics allows us to
- Gain a better understanding of how species have evolved
- Help explain how the evolution of development leads to morphological diversity
- Determine the function of genes and noncoding regions of the genome
Genome researchers look at many different features when comparing genomes:
– sequence similarity,
– gene location,
– the length and number of coding regions (exons) within genes,
– the amount of noncoding DNA in each genome,
– highly conserved regions maintained in organisms as simple as bacteria and as complex as humans.
Comparative genomics involves the use of
computer programs that can line up multiple genomes and look for regions of similarity among them.
Some of these sequence-alignment tools are accessible to the public over the Internet.
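To illustrate what "lining up" sequences means, here is a minimal sketch of global pairwise alignment scoring (the Needleman-Wunsch dynamic-programming approach). It is not the method used by any particular genome tool; real aligners handle whole genomes with far more elaborate scoring and heuristics. The scoring values (match +1, mismatch -1, gap -1) and the toy sequences are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score of the best global alignment of a and b."""
    # Row 0: aligning an empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with an empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            same = a[i - 1] == b[j - 1]
            diag = prev[j - 1] + (match if same else mismatch)
            up = prev[j] + gap        # a[i-1] aligned against a gap in b
            left = curr[j - 1] + gap  # b[j-1] aligned against a gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("GATTACA", "GATTACA"))
print(global_alignment_score("GATTACA", "GATACA"))
```

A higher score means the two sequences can be lined up with more matching positions and fewer mismatches or gaps, which is the notion of "regions of similarity" on the card above.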
Comparative genomics can begin to address a range of questions e.g.
- Which sets of genes are common to many different organisms, or groups of organisms?
- Which genes are unique? What do these genes do?
- Which genes are necessary for multicellular life forms; which set of genes are only found in multicellular organisms but not in unicellular ones?
- Where and how have new genes emerged in evolutionary history?
Genome comparisons of distantly related species help us
understand ancient evolutionary events
Genome comparisons of closely related species help us
understand recent evolutionary events
Highly conserved genes that have changed very little over time help clarify relationships among species that
diverged from each other long ago.
Archaea and bacteria diverged from each other between
2 & 4 billion years ago - we know this from DNA sequencing
Highly conserved genes can be studied in one model organism, and the results
applied to other organisms
Orthologs are
genes in different species that evolved from a common ancestral gene by speciation.
• Normally, orthologs retain the same function in the course of evolution.
60 percent of genes are conserved between
the fruit fly and humans - The two organisms appear to share a core set of genes
When scientists inserted a human gene associated with early-onset Parkinson’s disease into fruit flies
the flies displayed symptoms similar to those seen in humans with the disorder.
- Thus, control of expression of this gene may be similar between the two organisms, raising the possibility that Drosophila could act as a new model for testing therapies aimed at Parkinson’s.
Researchers have compared the human genome with the genomes of the
chimpanzee, mouse, rat, and other mammals.
- Identifying the genes shared by these species but not by non-mammals provides clues about what it takes to make a mammal.
- Identifying the genes shared by chimpanzees and humans but not by rodents gives information about primates.
In single-base substitutions, chimp and human genomes differ by only
1.2%
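A figure like this comes from counting differing bases across aligned sequence. The sketch below shows the arithmetic on two short, made-up aligned strings (not real chimp or human sequence); the function name `percent_substitutions` and the use of "-" to mark alignment gaps are assumptions for illustration.

```python
def percent_substitutions(seq_a, seq_b):
    """Percentage of aligned, ungapped positions where the bases differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # skip insertion/deletion columns; count substitutions only
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100 * differing / compared


# Toy aligned sequences that differ at one of ten positions.
print(percent_substitutions("ACGTACGTAC", "ACGTACGTAT"))
```

Gap columns are skipped on purpose, so insertions and deletions are not counted as substitutions.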
Chimps & humans:
Longer stretches of DNA show a ___% difference due to
2.7%
insertions or deletions of larger regions in the genome of one or the other species.
– Many of the insertions are duplications or other repetitive DNA.
A __ of the human duplications are not present in the chimpanzee genome and some contain regions associated with human diseases.
third
- Transcription factors regulate gene expression and thus play a key role in orchestrating the overall genetic program.