Sorefan Flashcards
What is whole genome sequencing?
- complete genome seq of organism at single time
- inc seq of chromosomal DNA and mito/chloro etc DNA
What are the challenges for genome sequencing?
- NA extraction from cells –> needs high quality and conc
- fragmentation
- sub-fractionation size selection –> to isolate fragments of correct size
- separating indiv molecules
- amplification of signal
- reading signal
- data analysis
What were the 3 phases of human genome project?
- genetic and physical maps of human and mouse, seq yeast and worm
- -> technology dev
- draft seq –> inc many gaps and errors
- finished seq –> fill in gaps and correcting errors
How are genetic maps made?
- analyse genetic distance between genes by measuring recombination freq
- markers rely on variation of seq between parents and individuals
- distance measured in centimorgans
- mostly PCR based, eg. polymorphisms in genes and DNA markers
- linkage map by looking at relative distances of 2 or more polymorphic genes and measuring RFs
- DNA markers superseded phenotypic markers
- DNA based mol markers could be RFLPs
- -> methods to analyse are slow so moved onto using SSLPs as easy to analyse w/ PCR
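A minimal sketch of how a recombination freq becomes a map distance. The function names and offspring counts are made up for illustration, and the Haldane mapping function is an added assumption (one common correction for double crossovers), not something these notes specify.

```python
import math

def recombination_frequency(recombinants: int, total: int) -> float:
    """Fraction of offspring that are recombinant for a pair of markers."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return recombinants / total

def map_distance_cm(rf: float) -> float:
    """Simple rule: 1% recombination = 1 cM (reasonable only for close markers)."""
    return rf * 100

def haldane_cm(rf: float) -> float:
    """Haldane mapping function (assumption): corrects for double crossovers."""
    if not 0 <= rf < 0.5:
        raise ValueError("rf must be in [0, 0.5)")
    return -50 * math.log(1 - 2 * rf)

# Hypothetical cross: 18 recombinant offspring out of 200 scored
rf = recombination_frequency(recombinants=18, total=200)
print(f"RF = {rf:.3f}")
print(f"simple distance  = {map_distance_cm(rf):.2f} cM")
print(f"Haldane distance = {haldane_cm(rf):.2f} cM")
```

The two estimates agree closely for tightly linked markers and diverge as RF approaches 0.5.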
What are SSLPs?
- simple seq length polymorphisms
- repeat regions in genome that vary in length between pops
- usually mini and microsatellite seqs
What are minisatellites?
- repeat units up to 25bp
- not spread evenly around genome, mostly at telomeric regions
- several kb long
- difficult to PCR
What are microsatellites?
- usually di or trinucleotide repeats
- few 100 bases long
- easy to PCR
- 650,000 in genome
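A rough sketch of spotting microsatellite-like repeats in a seq with a regular expression. The function name, the 5-repeat minimum and the toy seq are assumptions for illustration; real repeat finders are more tolerant of imperfect repeats.

```python
import re

def find_microsatellites(seq: str, min_repeats: int = 5):
    """Return (start, unit, repeat_count) for di/trinucleotide tandem repeats.

    start is 0-based. min_repeats is an arbitrary cut-off chosen for this sketch.
    """
    # A 2-3 base unit followed by at least (min_repeats - 1) further copies
    pattern = re.compile(r"([ACGT]{2,3}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Toy seq: a (CA)8 repeat and a (CAG)6 repeat separated by unique seq
toy = "GGTT" + "CA" * 8 + "TTGATT" + "CAG" * 6 + "TT"
for start, unit, count in find_microsatellites(toy):
    print(f"pos {start}: ({unit}){count}")
```

Alleles differing in repeat count give PCR products of diff length, which is what makes these markers easy to score.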
Why are genetic maps in humans limited?
- large pops of siblings don’t exist, so limited no. recombination events to study
- recombination events not at random genome positions –> recombination hotspots
How are physical maps created?
- restriction mapping locates relative positions on DNA molecule of recognition seqs for REs
- FISH = map marker locations by hybridising probe containing marker to intact chromosomes
- STS = map positions of short seqs by PCR
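A small sketch of the idea behind restriction mapping, done in silico: find where a recognition seq occurs and what fragment sizes a complete digest would give. It assumes a linear molecule and uses EcoRI (G^AATTC) as the example enzyme; the function names and toy seq are illustrative only.

```python
def restriction_cuts(seq: str, site: str = "GAATTC", cut_offset: int = 1):
    """0-based cut positions for one enzyme; EcoRI cuts after the G of GAATTC."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)  # keep scanning for further sites
    return cuts

def fragment_lengths(seq_len: int, cuts):
    """Fragment sizes from a complete digest of a linear molecule."""
    bounds = [0] + sorted(cuts) + [seq_len]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Toy molecule with two EcoRI sites
dna = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
cuts = restriction_cuts(dna)
print("cut positions:", cuts)
print("fragment sizes:", fragment_lengths(len(dna), cuts))
```

In the lab the logic runs the other way round: fragment sizes are measured on a gel and the site positions are inferred from them.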
What are the advantages of creating BAC libraries from indiv chromosomes?
- BAC clone library can be used to seq genome
- BACs w/ inserts from each chromosome could be shared across consortium
How is genome sequencing carried out clone by clone?
- extract DNA
- fragment DNA
- -> ideally completely random so no parts missed out
- -> by physical methods = sonication, hydrodynamic shearing
- -> by enzymatic methods = restriction enzymes and transposase
- -> by chemical methods (mostly used to fragment RNA) = heat and divalent cation (Zn and Mg)
- size selection –> gel electrophoresis
- clone 100-200kbp fragments into BAC plasmids to create library
- transformation of bacteria for BACs
- pick indiv colonies and extract vector (each tube has many copies of indiv DNA insert)
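The fragmentation and size selection steps above can be mimicked with a toy simulation. This is only a sketch: the genome length, number of breaks and seed are arbitrary assumptions, and breaks are placed uniformly at random, which real fragmentation only approximates.

```python
import random

def random_fragments(genome_len: int, n_breaks: int, seed: int = 0):
    """Break a linear genome at random positions; returns (start, end) pairs."""
    rng = random.Random(seed)  # seeded so the toy run is repeatable
    breaks = sorted(rng.sample(range(1, genome_len), n_breaks))
    bounds = [0] + breaks + [genome_len]
    return list(zip(bounds, bounds[1:]))

def size_select(fragments, min_len: int = 100_000, max_len: int = 200_000):
    """Keep only fragments in the 100-200 kbp range wanted for BAC cloning."""
    return [(a, b) for a, b in fragments if min_len <= b - a <= max_len]

# Hypothetical 5 Mbp genome broken at 33 random positions
fragments = random_fragments(5_000_000, n_breaks=33, seed=1)
selected = size_select(fragments)
print(f"{len(fragments)} fragments, {len(selected)} in the 100-200 kbp window")
```

A single digest only yields some fragments of the right size, which is one reason libraries are built from many cells and many overlapping fragments.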
How are clones positioned on genetic and physical maps?
- test clones for PCR markers w/ known locations
- BAC end sequencing using Sanger
- -> known seq so can design primer
- -> denature vector and Sanger seq
- -> design primer to reverse strand to seq other direction
- -> end seqs from same insert, so form a paired end read
Why are paired end reads useful?
- can physically link 1 end of seq w/ another, so can be used to resolve seq gaps
How is it decided which BAC has insert next to insert of interest?
- gen contiguous set of clones
- if any other BAC inc the end seq, then insert it contains must overlap (lie next to) insert of interest
- test BAC library for end seq from desired vector by PCR
- repeated over and over again until all BACs placed in order on each chromosome
- created contig
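A toy sketch of the walking logic above. Each letter of the toy chromosome stands for a unique stretch of seq, a substring search stands in for the PCR test, and the clone names, coordinates and `end_len` are all made up. It only walks in one direction.

```python
import string

def walk_contig(clones: dict, start: str, end_len: int = 4):
    """Order clones by repeatedly finding a clone that contains the current end seq."""
    order = [start]
    current = start
    while True:
        probe = clones[current][-end_len:]  # end seq of current insert
        best, best_extension = None, 0
        for name, insert in clones.items():
            if name in order:
                continue
            pos = insert.find(probe)  # stands in for a PCR test of this clone
            if pos == -1:
                continue
            extension = len(insert) - (pos + end_len)  # seq beyond the probe
            if extension > best_extension:
                best, best_extension = name, extension
        if best is None:
            return order  # no clone extends further: end of contig
        order.append(best)
        current = best

# Toy chromosome of 52 unique symbols, with overlapping inserts in scrambled order
chrom = string.ascii_uppercase + string.ascii_lowercase
clones = {
    "C": chrom[25:45],
    "A": chrom[0:20],
    "D": chrom[38:52],
    "B": chrom[12:32],
}
print(walk_contig(clones, start="A"))
```

Choosing the clone that extends furthest past the probe keeps the number of clones in the tiling path small.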
Why was shotgun seq of BAC clones needed?
- as BAC end seq only reads ends of insert, leaving most of middle of genomic insert still to seq
How was shotgun seq of BAC clones carried out?
- each BAC clone broken up into 5-10kb fragments
- cloned into diff vector that accepts smaller inserts
- seq lots of paired end reads from subclones –> overlapping reads assemble into large fragment (= consensus seq)
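A minimal sketch of assembling overlapping reads into a consensus, using a simple greedy merge of the largest overlap. This is an illustrative stand-in, not the assembler used in the project: it ignores paired end information, seq errors and repeats, and the reads and `min_len` are invented.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return reads  # nothing left to merge: each string is a contig
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)

# Three toy reads covering a 21 bp region, given out of order
toy_reads = ["TAGCCTAGGA", "ATGGCGTACG", "GTACGTTAGC"]
contigs = greedy_assemble(toy_reads)
print(contigs)
```

Reads with no overlap of at least `min_len` stay as separate contigs, which is the toy equivalent of a seq gap.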
How did Celera seq human genome?
- fragmented genome into 2-50kbp fragments
- cloned 2, 10 and 50kbp fragments into plasmids to create library
- assemble reads to create consensus seq and seq contigs
- draft genome had 98% bases
Why did the IHGP use clone by clone instead of whole genome shotgun seq?
- whole genome shotgun not yet proven feasible for complex repeat rich genome
- assembly easier and could be performed confidently
- could target gaps for finishing
- better suited to diverse international consortium
What needed to be done to finish the human genome?
- fill in sequencing gaps and physical gaps
Why were gaps present in human genome, and how could these problems be solved?
- cloning bias
- no restriction sites –> use diff RE, use physical or chem fragmentation method
- insert unstable –> use diff vector
How were seq gaps closed?
- paired end seqs align to either side of gap
- if gap < 1kbp = PCR across gap
- if gap > 1kbp = sequential seq along insert
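A small sketch of the decision above: estimate gap size from a paired end read that spans it, then pick a closure method using the 1 kbp rule. The function names and numbers are hypothetical, and treating a gap of exactly 1 kbp as the larger case is an assumption, since the notes do not say.

```python
def estimate_gap(insert_size: int, left_span: int, right_span: int) -> int:
    """Rough gap size (bp) from a clone whose end reads sit either side of the gap.

    insert_size: approximate length of the cloned insert
    left_span:   bp from start of left end read to end of left contig
    right_span:  bp from start of right contig to end of right end read
    """
    return max(0, insert_size - left_span - right_span)

def closure_strategy(gap_bp: int, threshold_bp: int = 1000) -> str:
    """Apply the 1 kbp rule; exactly 1 kbp is treated as the larger case (assumption)."""
    if gap_bp < threshold_bp:
        return "PCR across gap"
    return "sequential seq along insert"

# Hypothetical clones spanning two different gaps
small_gap = estimate_gap(insert_size=5000, left_span=2300, right_span=2100)
large_gap = estimate_gap(insert_size=10000, left_span=3000, right_span=2500)
print(small_gap, closure_strategy(small_gap))
print(large_gap, closure_strategy(large_gap))
```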