C3&4 human genome project Flashcards
genome
complete set of chromosomes
what does every genetic and physical marker have
a specific locus in genome
what was the goal of the human genome project
-determine the seq of the 3 billion chemical base pairs in human DNA
-identify all genes in human DNA to their position on chromosomes
-attempt to predict the function of all genes
-utilise this info for understanding of disease, developing better medicine
name given to those who participated in mapping and sequencing process
formal international consortium
phase 1 of human genome project
produce high resolution chromosomal maps
-position genetic markers (and genes)
-create libraries of BAC clones for sequencing (physical map)
phase 2 of human genome project
sequence each BAC DNA
phase 3 of human genome project
assemble all sequences to produce final draft and annotate to identify genes
overview of human genome project (7 steps)
- genomic DNA (2 male, 2 female)
- BAC library (250,000bp)
- organised mapped large clone contigs
- BAC to be sequenced
- shotgun sequence
- assembly
what maps are combined to order all BACs
genetic and physical
what is old sanger sequencing
radioactive labelling
4 separate dideoxy reactions
one for each base, very slow, manual reading of results off x-ray film
new sanger sequencing
like PCR but with fluorescent terminators
-run products on gel
-separated by size
-laser scans bands (ACGT)
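The read-out step can be sketched in a few lines. This is a toy illustration only, assuming each chain-terminated product is represented as a hypothetical `(length, terminal_base)` pair; real base calling works from fluorescence traces, not clean tuples.

```python
def read_sequence(fragments):
    """Reconstruct a sequence from chain-terminated fragments.

    fragments: iterable of (length, terminal_base) pairs, one per product.
    (This representation is an assumption made for illustration.)
    """
    # Separating by size puts the shortest product first, so the
    # terminal bases are read in order of increasing fragment length.
    ordered = sorted(fragments)
    return "".join(base for _, base in ordered)


# Fragments arrive unordered from the reaction; sorting by size orders them.
print(read_sequence([(3, "G"), (1, "A"), (4, "T"), (2, "C")]))
```

Each fragment ends at a fluorescent terminator, so the colour at each size position gives the base at that position.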
second phase genome project
-used these advances in sanger DNA sequencing technology and reduction in cost
-computational assembly of all sequences into ‘contig’
how to provide one-fold coverage
requires ~3 million separate sequencing reactions producing 1000 bases each
draft sequence coverage
4 fold
finished sequence coverage
9 fold
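The coverage figures above follow from simple arithmetic. A minimal sketch, assuming the 3 billion bp genome size and 1000-base reads quoted on the cards (the function name `reads_for_coverage` is hypothetical):

```python
import math

GENOME_SIZE_BP = 3_000_000_000  # approximate human genome size from the cards
READ_LENGTH_BP = 1_000          # bases per sequencing reaction from the cards


def reads_for_coverage(fold, genome_size=GENOME_SIZE_BP, read_length=READ_LENGTH_BP):
    """Return the number of reads needed to reach the given fold coverage."""
    return math.ceil(fold * genome_size / read_length)


print(reads_for_coverage(1))  # one-fold coverage
print(reads_for_coverage(4))  # draft sequence coverage
print(reads_for_coverage(9))  # finished sequence coverage
```

This counts reads as if they tiled the genome perfectly; real reads overlap at random, which is one reason several-fold coverage is needed.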
what is IHGSC
International Human Genome Sequencing Consortium; used a clone by clone approach
advantages and disadvantages of IHGSC
-very effective at getting over regions of highly repetitive DNA sequences
-able to retrieve clones later
-slow process
-expensive
what is celera
shotgun sequencing
blast genome into small fragments, sequence each one and then use the power of computers to reassemble sequence
what did celera have to rely on
public databases of sequence and mapping information in order to assemble the sequence that was generated by this method
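The reassembly idea behind shotgun sequencing can be illustrated with a toy greedy overlap merge. This is a sketch only, not Celera's actual assembler; the function names and the `min_len` threshold are assumptions, and real assemblers must also handle sequencing errors, reverse strands, and repeats.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlaps left; remaining reads stay as separate contigs
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Repetitive sequence produces ambiguous overlaps for an approach like this, which is why mapping information helps when assembling the fragments.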
completion of human genome
-june 2000 white house announced 80% sequenced
-working draft publication made available on web july 2000
-publication of 90%
-completion announced april 2003 (finished sequence, 99.99% accuracy)
how many genes (original 2001 answer)
20,000-25,000 genes
~1.5-5% of the genome
genetic variation between humans is visible at the genomic scale
population genetics
how many chromosomes were sequenced
10 (5 people)
what were the two major types of variation revealed
single nucleotide polymorphisms (SNPs)
copy number variants (CNVs)
what is the international hapmap project
finding the more common SNP variants in the world's population
how many human SNPs did HapMap report in Nature in 2005
> 1 million
what 4 populations of people were studied in HapMap project
270 people
nigeria (african)
japan and china (asian)
utah (northern & western european ancestry)
how much of the genome was left as gaps in the original human genome project
8%
who published the final results of the HGP
telomere to telomere (T2T) consortium
the draft ‘T2T-CHM13’ annotation totals
63,494 genes
233,615 transcripts
what increases in complexity as you climb the evolutionary tree
transcript/protein structure
6 things learned in HGP
1. how do humans compare to other species
2. humans are evolving
3. fine structure of inheritance
4. human (pre)history
5. way in which our genes control our response to medication
6. disease genes
what is comparative genomics and evolution
comparing genomes with other species to determine evolutionary relatedness
and to understand evolutionary changes in genes/proteins
what methods are used in comparative genomics
-sequence alignments across species
-genome based phylogenies
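One simple measure taken from a sequence alignment is percent identity. A minimal sketch, assuming the two sequences are already aligned to equal length with `-` marking gaps (the function name and the choice to skip gapped columns are assumptions; other conventions exist):

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned (non-gap) positions where the two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments from two species
print(percent_identity("ACGT-A", "ACCTTA"))
```

Pairwise values like this, computed across many species, are the kind of input a distance-based phylogeny can be built from.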
organism genome comparison (fewest to most genes)
E. coli - 4,200
S. cerevisiae (yeast) - 6,000
C. elegans (nematode) - 14,000
D. melanogaster (fruit fly) - 14,000
A. thaliana (mustard plant) - 24,000
Mammalian - 20,000-25,000
what allows novel genes/proteins during evolution
modular domains
many pathways are highly what across species
conserved
what is TBC1D3
only found in humans
regulates growth factors and role in RAS-mediated cancers
examples of selective expansion in protein families and domains
-immune function
-intercellular signalling
-metabolic function
-olfaction
-haemostasis
-apoptosis
-neural function
-translation
how many genes with human specific features vs how many entirely human specific genes
850
50+
what kinds of genes show evidence for fast/recent evolution in humans
-pathogen response
-cell cycle/DNA metabolism
-protein metabolism
-reproduction
-neuronal activity
-skin pigmentation
2 gene selections showing evidence for fast/recent human evolution
HBB gene selection
haemoglobin B (anaemias)
LCT gene selection
lactase gene variants selected for in early human groups which used milk in diet
how do most HIV strains enter cells
using CCR5 as the main co-receptor (with CD4)
how do some people have natural resistance to HIV infection
homozygous for ∆32 mutation on CCR5 gene
what is the paradox of CCR5 ∆32 variant prevalence
distribution requires millennia but HIV only around since 1970/80s
possible explanation for paradox of CCR5 ∆32 variant prevalence
∆32 may have protected ancestral populations (and so been selected for) against earlier HIV-like epidemics or even smallpox
NOT bubonic plague/yersinia pestis as was once thought
what is FOXP2
gene involved in human speech and language disorders
potentially critically involved in human specific development of language
2 aa changes between chimps and humans/neanderthals/denisovans
example of human specific loss of sequences
olfactory genes
how were LD blocks discovered
through the study of common variants
term given to haploid genotype
haplotype
how are haplotypes formed
formed by a collection of linked marker alleles on one chr that are inherited together
over short distances how do haplotypes remain intact
not disrupted by meiotic recombination
what are the very short distances of haplotypes that remain intact called
linkage disequilibrium blocks
[LD blocks]
unit of inheritance
where does chiasma formation in meiosis happen
hotspots
an LD block stays intact between hotspots
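Linkage disequilibrium between two markers is commonly summarised by D and r². A minimal sketch using the standard definitions, assuming haplotypes are given as two-character strings with `A`/`a` alleles at the first marker and `B`/`b` at the second (this encoding and the function name are assumptions for illustration):

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic markers.

    haplotypes: list of 2-character strings such as 'AB', 'Ab', 'aB', 'ab'.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == "A" for h in haplotypes) / n   # frequency of allele A
    p_b = sum(h[1] == "B" for h in haplotypes) / n   # frequency of allele B
    p_ab = sum(h == "AB" for h in haplotypes) / n    # frequency of haplotype AB
    d = p_ab - p_a * p_b                             # D = p_AB - p_A * p_B
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom else 0.0
    return d, r2


# Alleles always inherited together: complete LD
print(linkage_disequilibrium(["AB"] * 5 + ["ab"] * 5))
# Alleles combined at random: no LD
print(linkage_disequilibrium(["AB", "Ab", "aB", "ab"]))
```

Markers inside one LD block would be expected to give r² near 1, while markers separated by a recombination hotspot drift towards 0.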
what populations show highest levels of genetic polymorphisms
african
every other continent has less genetic variation than africa, but some of its variation comes from what source
archaic humans
what is mediated by genetic differences in terms of drug treatment
individual variation
what factors are under genetic control in pharmacogenetics
-inactivation/activation by oxidative pathways (cytochrome P450s)
-conjugation for excretion through the kidney (GST)
-target sensitivity
-toxicity (side effects)
-disease mutation type
what is an aim of personalised/stratified medicine
identifying patients who will respond well to drug treatment
pharmacogenomics knowledge applied with example
screening patients for CYP2C9 and VKORC1 gene variants identifies those who should start on a low dose of warfarin to avoid risk of internal bleeding
what variants affect antidepressant escitalopram serum conc and treatment success
CYP2C19
PegIFN-a-2b or PegIFN-a-2a combined with RBV is used to treat what
hepatitis C
in terms of CF what is the most common mutation in european pops and what is the result
∆F508
no functional protein produced (misfolded protein is degraded)
in terms of CF what percentage of mutations just damage the function of the protein
5%
in terms of CF what drug is used to restore function
ivacaftor
8.5 in 100,000 users of flucloxacillin have what reaction
serious liver reaction called drug induced liver injury (DILI)
what has been associated with DILI risk from flucloxacillin
an SNP marker
rs2395029[G]
what can be said about the SNP marker associated with flucloxacillin (linkage)
marker is in complete linkage disequilibrium [LD] with HLA-B*5701 (immune cell surface protein)
basic research practical uses of human genome project
all genes known forever
vast and integrated info set based on genome publicly available
medical advances since human genome project
-pharmacogenomics; right drug for right people
-genome wide genetic studies offer best hope of cracking complex genetic disorders
-diagnostic tests