MBI-HGP Flashcards
what is the human genome project
The Human Genome Project (HGP) was called a mega project.
If two individuals differ, then their DNA sequences should also
be different, at least at some places. These assumptions led to the quest of
finding out the complete DNA sequence of the human genome. With the
establishment of genetic engineering techniques that made it possible to
isolate and clone any piece of DNA, and the availability of simple and fast
techniques for determining DNA sequences, a very ambitious project of
sequencing the human genome was launched in the year 1990.
cost and scale of hgp
The human genome is said to have approximately 3 x 10^9 bp, and if the cost of sequencing is US $3 per bp (the estimated cost in the
beginning), the total estimated cost of the project would be approximately
9 billion US dollars.
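The cost estimate above is a single multiplication, and a minimal Python sketch makes the arithmetic explicit. The variable names are illustrative only; the two input figures are the ones quoted in this card.

```python
# Back-of-the-envelope check of the initial HGP cost estimate.
genome_size_bp = 3 * 10**9   # approximate size of the human genome (from the card)
cost_per_bp_usd = 3          # estimated sequencing cost per base pair at the start

total_cost_usd = genome_size_bp * cost_per_bp_usd
print(f"Estimated cost: US ${total_cost_usd / 10**9:.0f} billion")
```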
The enormous amount of data expected to be generated also
necessitated the use of high speed computational devices for data storage
and retrieval, and analysis. HGP was closely associated with the rapid
development of a new area in biology called Bioinformatics.
what are the goals of HGP
(i) Identify all the approximately 20,000-25,000 genes in human DNA;
(ii) Determine the sequences of the 3 billion chemical base pairs that
make up human DNA;
(iii) Store this information in databases;
(iv) Improve tools for data analysis;
(v) Transfer related technologies to other sectors, such as industries;
(vi) Address the ethical, legal, and social issues (ELSI) that may arise
from the project.
who coordinated the human genome project
The Human Genome Project was a 13-year project coordinated by
the U.S. Department of Energy and the National Institutes of Health. During
the early years of the HGP, the Wellcome Trust (U.K.) became a major
partner; additional contributions came from Japan, France, Germany,
China and others. The project was completed in 2003.
aims of HGP
Knowledge about
the effects of DNA variations among individuals can lead to revolutionary
new ways to diagnose, treat and someday prevent the thousands of disorders that affect human beings. Besides providing clues to
understanding human biology, learning about non-human organisms'
DNA sequences can lead to an understanding of their natural capabilities
that can be applied toward solving challenges in health care, agriculture,
energy production and environmental remediation.
what other organisms' dna has been sequenced
Many non-human model
organisms, such as bacteria, yeast, Caenorhabditis elegans (a free living
non-pathogenic nematode), Drosophila (the fruit fly), plants (rice and
Arabidopsis), etc., have also been sequenced.
approach to gene sequencing
The methods involved two major approaches.
One approach focused on identifying all the genes that are expressed as
RNA (referred to as Expressed Sequence Tags, or ESTs).
The other took the blind approach of simply sequencing the entire genome,
including all the coding and non-coding sequences, and later assigning
different regions in the sequence with functions (a term referred to as
Sequence Annotation).
how is the sequencing done
For sequencing, the total DNA from a cell is
isolated and converted into random fragments of relatively smaller sizes
(recall DNA is a very long polymer, and there are technical limitations in
sequencing very long pieces of DNA) and cloned in a suitable host using
specialised vectors. The cloning resulted in amplification of each DNA
fragment so that it could subsequently be sequenced with ease.
The commonly used hosts were bacteria and yeast, and the vectors were
called BAC (bacterial artificial chromosomes) and YAC (yeast artificial
chromosomes).
after the digested pieces of dna were sequenced, how were they put together?
The fragments were sequenced using automated DNA sequencers that
worked on the principle of a method developed by Frederick Sanger.
(Remember, Sanger is also credited with developing a method for
determining amino acid sequences in proteins.)
These sequences were then arranged based on overlapping regions
present in them. This required generation of overlapping fragments
for sequencing. Aligning these sequences was not humanly
possible, so specialised computer-based programs were developed.
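The idea of arranging fragments by their overlapping regions can be illustrated with a toy greedy merge. This is only a sketch under simplifying assumptions (error-free reads, a single strand, no repeats); it is not the software the project actually used, and the function names and example reads are invented for illustration.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining pair overlaps by at least min_overlap
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical overlapping reads, made up for this example.
reads = ["ATGCGTAC", "GTACCTGA", "CTGAAGT"]
print(greedy_assemble(reads))
```

With these three reads, each consecutive pair shares four bases, so the sketch joins them into one contiguous sequence. Real genomes contain long repeats and sequencing errors, which is why far more elaborate programs were needed.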
how is chromosomal dna put together
These sequences were subsequently annotated and were assigned to each
chromosome. The sequence of chromosome 1 was completed only in
May 2006 (this was the last of the 24 human chromosomes – 22
autosomes and X and Y – to be sequenced).
2nd challenge to hgp
Another challenging task was assigning the genetic and
physical maps on the genome. These maps were generated using information on
polymorphism of restriction endonuclease recognition sites, and some
repetitive DNA sequences known as microsatellites.
Salient Features of Human Genome
(i) The human genome contains 3164.7 million bp.
(ii) The average gene consists of 3000 bases, but sizes vary greatly, with
the largest known human gene being dystrophin at 2.4 million bases.
(iii) The total number of genes is estimated at 30,000, much lower than previous estimates of 80,000 to 140,000 genes. Almost all (99.9 per cent) nucleotide bases are exactly the same in all people.
(iv) The functions are unknown for over 50 per cent of the discovered
genes.
(v) Less than 2 per cent of the genome codes for proteins.
(vi) Repeated sequences make up a very large portion of the human genome.
(vii) Chromosome 1 has the most genes (2968), and the Y has the fewest (231).
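A couple of the percentages above are easier to remember as absolute numbers. The sketch below only restates figures from this card (3164.7 million bp, 99.9 per cent identity, less than 2 per cent coding); the variable names are illustrative, and the results are rough orders of magnitude rather than measured values.

```python
genome_bp = 3_164_700_000   # 3164.7 million bp, from feature (i)

# 99.9 per cent of bases are the same in all people, so about 0.1 per cent
# (1 in 1000) may differ between individuals.
differing_bp = genome_bp // 1000

# Less than 2 per cent of the genome codes for proteins: this is an upper bound.
coding_bp_upper_bound = genome_bp * 2 // 100

print(f"Bases that may differ between people: about {differing_bp:,}")
print(f"Protein-coding bases: fewer than {coding_bp_upper_bound:,}")
```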
what are repetitive sequences
Repetitive sequences are stretches of DNA sequences that are
repeated many times, sometimes a hundred to a thousand times. They
are thought to have no direct coding functions, but they shed light
on chromosome structure, dynamics and evolution.
what are SNPs
Scientists have identified about 1.4 million locations where single-base DNA differences (SNPs – single nucleotide polymorphisms,
pronounced as ‘snips’) occur in humans. This information promises
to revolutionise the processes of finding chromosomal locations for
disease-associated sequences and tracing human history.
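To make "single-base difference" concrete, the sketch below compares two already aligned sequences of equal length and reports every position where they differ. The sequences and the function name are invented for illustration, and a difference between just two sequences is only a candidate: calling something a SNP normally also requires that the variant recurs in a population.

```python
def find_single_base_differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# Hypothetical aligned stretches of DNA from two individuals.
person_1 = "ATGCCGTAAGCT"
person_2 = "ATGCTGTAAGCT"

for pos, base_1, base_2 in find_single_base_differences(person_1, person_2):
    print(f"Position {pos}: {base_1} in person 1, {base_2} in person 2")
```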
challenges with hgp
Deriving meaningful knowledge from the DNA sequences will define
research through the coming decades leading to our understanding of
biological systems. This enormous task will require the expertise and
creativity of tens of thousands of scientists from varied disciplines in both
the public and private sectors worldwide.