Genome Sequencing Flashcards

1
Q

Sequencing a genome can be viewed as…

A

Obtaining the parts list of the cell/organism.

2
Q

What 4 things are required for DNA synthesis?

A

DNA polymerase, primer, template DNA and dNTPs.

3
Q

Describe DNA sequencing by the chain termination method (aka Sanger Sequencing)

A
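  • Template DNA, primer, DNA polymerase, dNTPs and ddNTPs are combined in the sequencing reaction; the products are then separated by size using gel electrophoresis
  • Different size fragments are generated during DNA synthesis depending on the location of ddNTP incorporation/termination.
  • Synthesis of a strand stops when a ddNTP is incorporated (a ddNTP lacks the 3’-OH needed to add the next nucleotide), so the fragment lengths reveal the order of nucleotides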
4
Q

Describe fluorescent-labelled ddNTPs and capillary electrophoresis

A

Each of the four ddNTPs carries a different fluorescent dye, so the colour of a fragment shows which ddNTP terminated it in the sequencing reaction
- As the DNA fragments exit the capillary, they pass through a laser detection system. The laser excites the fluorescent dye attached to the ddNTP at the end of each fragment, causing it to emit light at a specific wavelength corresponding to one of the four bases (A, T, C, or G). The emitted light is detected and recorded as a peak in a chromatogram.

5
Q

Why is capillary electrophoresis better for fluorescent-labelled ddNTPs?

A

Sequencing reactions are run on capillary gel electrophoresis, which offers better heat dissipation and resolution, requires less sample, and allows more reactions to be run in parallel

6
Q

What is automated base calling?

A
  • Used in automated Sanger sequencing: a scanner records coloured images of the different-sized termination fragments, one colour for each fluorescent-labelled ddNTP
  • Computer processes fluorescent signals to generate an electropherogram, assigning a base to each peak.
7
Q

In automated base calling, what is Phred?

A
  • Phil’s revised editing program
  • Electropherograms are usually messy, so Phred estimates a probability of error for each base call in the electropherogram
  • Error % is based on parameters such as shape of a peak, spacing between peaks, height of a peak.
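
The error probability that Phred assigns is usually reported as a quality score. The card above does not give the formula; the sketch below assumes the standard Phred definition, Q = -10 · log10(P), and the function names are made up for illustration.

```python
import math


def phred_quality(error_probability: float) -> float:
    """Convert a base-call error probability P into a Phred score Q = -10 * log10(P)."""
    if not 0.0 < error_probability <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(error_probability)


def error_probability_from_quality(quality: float) -> float:
    """Convert a Phred score Q back into an error probability P = 10^(-Q/10)."""
    return 10.0 ** (-quality / 10.0)


# Each step of 10 in Q corresponds to a tenfold drop in the error probability.
for p in (0.1, 0.01, 0.001):
    print(f"P(error) = {p}: Q = {phred_quality(p):.0f}")
```

A base call with Q = 20 therefore has a 1 in 100 chance of being wrong, and Q = 30 a 1 in 1000 chance.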
8
Q

What are automated sequencers?

A
  • All steps from sample loading to base calling are automated
  • Sequencing reactions are usually performed manually in 96-well microplates in a thermal cycler (denaturing, annealing, extension)
  • Using machines like the “Applied Biosystems 3730xl DNA analyzer”, we can obtain up to 800 bp of sequence/reaction.
9
Q

When doing Sanger sequencing, why is each reaction limited to about 800 bp of sequence?

A

The polymerase eventually falls off the template (the polymerase has limited processivity)

10
Q

Describe the Human Genome Sequencing Project, specifically the public and private sectors involved in the project

A
  • Advances in automated sequencing allowed for genomic projects such as the human genome project.
  • Project formally proposed in 1985; the public effort, led by the NIH and the US Department of Energy, was a 15-year, $3 billion plan carried out by international genome sequencing centers
  • Private company (Celera Genomics) started a second project in 1998, aiming to complete the genome sequence in three years (strongly profit-driven; note that products of nature cannot be patented)
11
Q

In the Human Genome project, the DNA came from anonymous donors of diverse ethnic backgrounds. Why?

A

Using several donors was better than a lottery that picked one random person’s DNA, because it helped determine how similar human genomes actually are (no human genome had been sequenced at that point). It turned out that human genomes are very similar, with only some differing alleles accounting for our different phenotypes.

12
Q

Describe the shotgun approach to sequencing

A

The shotgun approach breaks the genome into smaller fragments (clones) and sequences these fragments

13
Q

Random shearing/sonication in sequencing

A

Breaking the chromosome into random fragments, which are then sequenced independently

14
Q

You need many copies of the same fragment to perform Sanger sequencing and to accurately see fluorescence. What can be used (in general) to accomplish this?

A

Cloning vectors

15
Q

What are the 5 common features of a vector?

A
  1. Promoter: constitutive (always on)/inducible
  2. Multicloning site: unique restriction sites for inserting a gene
  3. Epitope tag: protein purification/localization
  4. Origin of replication: determines copy number (also ensures that both daughter cells have the vector)
  5. Selectable marker: antibiotic resistance (used to identify which E. coli cells actually have the vector)
16
Q

Phagemid

A

1 kb insert

17
Q

Plasmid

A

up to 10 kb insert

18
Q

P1 clone

A

100 kb insert

19
Q

Bacterial Artificial Chromosome

A

up to 300 kb insert

20
Q

What are the steps of hierarchical shotgun sequencing (3 steps)

A
  1. Chromosome is fragmented by partial restriction digest or shearing (sonication)
  2. Clone the unique fragments into BACs (300 kb), PACs (100 kb) and cosmids (50 kb), and transform into E. coli. The resulting colonies together form the DNA library, and each colony contains one vector
  3. Map the correct order of the cloned fragments to select BACs for sequencing (so that the whole genome is represented).
21
Q

What is the goal when mapping the correct order of BAC clones?

A

Sequence the minimum number of nucleotides to cover the entire genome to cut costs (i.e. don’t want to sequence multiple BACs containing the same region of the genome)

22
Q

What are two ways of detecting BACs with overlapping genome sequences?

A
  1. BAC library screening by hybridization
  2. Restriction fingerprinting BAC clones
23
Q

Describe the steps for BAC library screening by hybridization

A
  • Rapid identification of overlapping clones using a random sequence/probe (single stranded DNA)
    1. BAC colonies are robotically transferred to nitrocellulose/nylon membrane and screened with a radioactive probe
    2. Probe will only hybridize to BAC colonies with overlapping fragments. Black spots show where the probe is bound (black due to the radioactivity of the probe)
    3. The sequence at the end of a clone can be used as a probe in a subsequent screen to look for overlapping fragments: “chromosome walking”
24
Q

Describe restriction fingerprinting of BAC clones

A
  • Complete restriction digest of BAC clones followed by gel electrophoresis to determine restriction fragment profile for each BAC clone
  • Identify BAC clones with common restriction fragments
    - Overlapping patterns indicate that two BAC clones share common DNA sequences, allowing researchers to identify overlaps between different clones
  • By comparing the overlap of many clones, scientists can begin to determine the relative positions of BAC clones along the genome.
25
Q

Describe hierarchical shotgun sequencing after original BAC cloning (3 steps)

A

A BAC contains up to 300 kb of insert DNA, which is still very big. The goal is to break these inserts into smaller pieces that are easier to sequence
1. Shear BACs by sonication (unique fragments)
2. Clone the fragments into phagemids (1 kb) or plasmids (2-10 kb) and transform into E. coli (“shotgun library”).
3. Sequence library clones, and assemble genome.

26
Q

Describe whole genome sequencing (which was done by Celera, the private company in the human genome project)

A
  1. DNA extraction
  2. DNA fragmentation (sonication)
  3. Clone into vectors, transform bacteria for replication, purify vectors
  4. Sequence library clones and assemble genome
27
Q

What are the advantages and disadvantages of whole-genome shotgun sequencing (WGSS) compared to hierarchical shotgun sequencing (HSS)?

A

HSS: Easier to assemble genome sequence but have to build physical map (labor intensive)
WGSS: Bypasses physical map (mapping where the BACs are and any overlap), but assembly of the genome is more difficult especially for more complex genomes (like the human genome)

28
Q

How did Craig Venter (founder of Celera) cheat when doing the human genome project?

A

Each time a new sequence was found, it was put into the NCBI database. Venter used this public information to help him assemble the entire genome. This shows how profit-driven Celera was.

29
Q

What is genome coverage?

A

The average number of times each nucleotide in the genome is sequenced (the same nucleotides are often sequenced more than once)

30
Q

Coverage formula

A

C (coverage) = LN/G
L: sequence read length in bp (# of bases you get from one sequencing reaction)
N: Number of reads sequenced (aka number of clones)
G: Haploid genome length in bp
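
The formula can be checked with a few lines of Python. This is only a sketch: the function name and the example numbers (500 bp reads, a 5 Mb genome) are chosen for illustration.

```python
def genome_coverage(read_length_bp: float, num_reads: float, genome_length_bp: float) -> float:
    """Coverage C = L * N / G, using the symbols defined on the card."""
    return read_length_bp * num_reads / genome_length_bp


# 10,000 reads of 500 bp over a 5 Mb genome add up to 5 Mb of sequence (1X).
print(genome_coverage(500, 10_000, 5_000_000))
# Doubling the number of reads doubles the coverage (2X).
print(genome_coverage(500, 20_000, 5_000_000))
```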

31
Q

What is the assumption concerning the genome coverage formula?

A

Sequencing reads will be randomly distributed in the genome (i.e. the ability to sequence a particular region of the genome does not differ)

32
Q

Given a genome size of 5Mb, what would 1X and 2X coverage be?

A

1X = 5 Mb of sequence in total
2X = 10 Mb of sequence in total

33
Q

An insert is usually sequenced from (one/both) end (s)

A

Both

34
Q

Since the insert is sequenced from both ends, what are these sequences called?

A

Paired reads/mate pairs

35
Q

What are universal primers?

A

Primers that anneal to the vector sequence flanking the insert. They can be used to sequence any insert in that vector because we already know the sequence of the vector

36
Q

Greater length of sequencing reads is better for…

A

Aligning sequences and better coverage of the genome
- More overlap

37
Q

How many clones would have to be screened for 1X coverage of a 4 Mb genome with paired reads of 500 bp each?

A

N = CG/L = (1)(4 × 10^6)/(2 × 500) = 4000 clones (each clone gives two 500 bp reads, so L = 1000 bp per clone)
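
The same rearrangement, N = CG/L, can be sketched in Python. The function name and the `reads_per_clone` parameter are illustrative assumptions; rounding up to a whole clone is also a choice made here, not something stated on the card.

```python
import math


def clones_needed(coverage: float, genome_length_bp: int,
                  read_length_bp: int, reads_per_clone: int = 2) -> int:
    """Number of clones N = C * G / L, where L is the sequence obtained per clone."""
    bases_per_clone = read_length_bp * reads_per_clone  # paired reads: 2 x read length
    return math.ceil(coverage * genome_length_bp / bases_per_clone)


# 1X coverage of a 4 Mb genome with paired 500 bp reads.
print(clones_needed(1, 4_000_000, 500))
```

Setting `reads_per_clone=1` gives the number of clones needed when each insert is sequenced from one end only.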

38
Q

Assuming a random library, the sequence coverage of a genome roughly follows a _____ distribution

A

Poisson distribution

39
Q

Poisson distribution formula

A

P(y)= (λ^y ⋅ e^-λ)/y!
- y= number of events in a given interval (number of times a nucleotide is sequenced)
- λ= mean number of events in a given interval (genome coverage)
- P= probability that a certain nucleotide will be sequenced a certain number of times

40
Q

For genome sequencing, we only want to determine the probability that…

A

Any base is NOT sequenced

41
Q

P(0) formula

A

P(0)=e^-λ
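
The two Poisson formulas above can be evaluated directly. The sketch below uses only the Python standard library; the function names are illustrative, and λ is taken to be the genome coverage as defined on the earlier card.

```python
import math


def poisson_probability(y: int, coverage: float) -> float:
    """P(y) = (lambda^y * e^-lambda) / y!, with lambda equal to the genome coverage."""
    return coverage ** y * math.exp(-coverage) / math.factorial(y)


def fraction_unsequenced(coverage: float) -> float:
    """P(0) = e^-lambda: probability that a given base is never sequenced."""
    return math.exp(-coverage)


# Expected fraction of the genome left unsequenced at increasing coverage.
for c in (1, 2, 5, 10):
    print(f"{c}X coverage: {fraction_unsequenced(c):.5f} of bases not sequenced")
```

Multiplying `fraction_unsequenced(c)` by the genome length gives the expected number of bases missing at that coverage, under the random-library assumption.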

42
Q

If you are sequencing a 10 kilobase circular insert, and each sequencing read yields 800 base pairs from both the forward and reverse directions, how many base pairs of the insert remain unsequenced after one sequencing read?

A

10,000 - (2 × 800) = 10,000 - 1,600 = 8,400 base pairs

43
Q

After all the fragments are sequenced, how are they assembled?

A

DNA fragments with overlapping sequences must be adjacent to one another. Overlapping reads are merged until they form contigs (contiguous sequences). The contigs are then assembled into a scaffold/supercontig (an ordered set of contigs, usually positioned using mate pairs, that ideally gives the complete sequence of a chromosome)

44
Q

What’s the minimum number of nucleotides that must be shared in an overlapping sequence to be considered to truly overlap in the genome?

A

> 24 mer (nucleotides)

45
Q

To be considered true overlapping fragments, what must the % identity be between the two fragments? Why this number?

A

94%
- Accounts for any spontaneous mutations, or any mistakes made by the polymerase during sequencing.
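
The two overlap criteria from the cards above (more than 24 shared nucleotides, at least 94% identity) can be combined in a toy check. This is a simplified sketch: it compares the end of one read with the start of another without allowing gaps, which real assemblers do not assume, and the function names are made up for illustration.

```python
def overlap_identity(left: str, right: str, length: int) -> float:
    """Fraction of matching bases between the last `length` bases of `left`
    and the first `length` bases of `right` (ungapped comparison)."""
    suffix = left[-length:]
    prefix = right[:length]
    matches = sum(a == b for a, b in zip(suffix, prefix))
    return matches / length


def best_overlap(left: str, right: str, min_length: int = 25, min_identity: float = 0.94):
    """Return (length, identity) for the longest suffix/prefix overlap that
    meets both thresholds, or None if the reads do not truly overlap."""
    max_length = min(len(left), len(right))
    for length in range(max_length, min_length - 1, -1):
        identity = overlap_identity(left, right, length)
        if identity >= min_identity:
            return length, identity
    return None


shared = "ACGTTGCAAGGCTTAACCGGTATCGATCGA"  # 30 nt shared by both reads
read_1 = "AAAAAAAAAA" + shared
read_2 = shared + "CCCCCCCCCC"
print(best_overlap(read_1, read_2))
```

Lowering `min_identity` tolerates more sequencing errors but also makes it easier for near-identical repeats to be mistaken for true overlaps.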

46
Q

What effect do repetitive sequences have on genome assembly?

A

Repetitive sequences make it impossible to distinguish reads from two or more distinct places in the genome
- More of an issue in eukaryotes
- Assembly of reads will only detect one region instead of both regions, resulting in a repeat collapse

47
Q

When sequencing, repeats shorter than read length are… (okay/not okay)

A

Okay

48
Q

When sequencing, repeats with fewer base pair differences than the sequencing error rate are… (okay/not okay)

A

Not okay

49
Q

True or false: Genes can be coded from either strand
How are coding genes/the coding strand shown?

A

True
Coding strand shown using arrows

50
Q

True or false: Genome sequence usually shows both sequences obtained

A

False, usually shows base pairs of only one strand

51
Q

If the sequence of genes on the opposite strand is the reverse complement, how do you find the sequence of the genes using the reverse strand?

A

Take the reverse strand (3’ to 5’) reverse it (5’ to 3’) then take the complement.
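
The reverse-then-complement step can be written in a couple of lines of Python. A minimal sketch, assuming the sequence contains only A, C, G and T (upper or lower case); the function name is illustrative.

```python
# Translation table mapping each base to its complement.
COMPLEMENT_TABLE = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Complement every base, then reverse the string so it reads 5' to 3'."""
    return sequence.translate(COMPLEMENT_TABLE)[::-1]


print(reverse_complement("ATGC"))  # prints GCAT
```

Reversing first and complementing second gives the same result, and applying the function twice returns the original sequence.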

52
Q

What are the two gene-finding approaches that use bioinformatics?

A
  1. Ab initio (intrinsic) approach
  2. Extrinsic (evidence-based) approach
53
Q

Describe the Ab initio (intrinsic) gene-finding approach

A

The genomic DNA sequence alone is systematically searched for protein-coding genes (looking at the raw sequence and looking for signatures of genes in the raw data like promoter sequences for example)

54
Q

Describe the extrinsic (evidence-based) gene-finding approach

A

The target genome is compared to other genomes to look for similarity to known mRNA and protein sequences in databases (NCBI, EMBL)

55
Q

True or false: Ab initio (intrinsic) gene-finding approach and extrinsic (evidence-based) gene-finding approach are usually both used and are highly complementary

A

True

56
Q

What are four ways that you can find genes that encode for proteins?

A
  1. Presence of an open reading frame longer than 300 bp (from a start codon [ATG] to a stop codon [TAA, TAG, TGA])
  2. Presence of CpG islands (60-70% GC content) associated with the 5’ end of transcribed genes (indication of a promoter site)
  3. Splicing sites
  4. Sequence contains known protein domains (e.g. if you’re looking for the gene coding for a transmembrane protein, look for sequence encoding a transmembrane domain).
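
The first criterion, an open reading frame longer than 300 bp, is the easiest one to sketch in code. The example below is a simplified illustration, not a real gene finder: it scans only the three reading frames of the strand it is given, ignores introns, and uses made-up names. Running it on the reverse complement as well would cover the other strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_length_bp: int = 300):
    """Return (start, end) index pairs of ORFs longer than `min_length_bp`.

    `start` is the index of the A in ATG and `end` is one past the stop codon,
    so sequence[start:end] is the whole ORF including the stop.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > min_length_bp:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


# Artificial example: ATG, 100 alanine codons, then a stop codon (306 bp).
toy_gene = "ATG" + "GCT" * 100 + "TAA"
print(find_orfs(toy_gene))
```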
57
Q

True or false: the open reading frame varies depending on the species

A

False; The open reading frame is conserved over multiple species (genome alignment)

58
Q

The function of many known genes is…

A

Preliminary at best (interactions, localization, regulation are unknown)

59
Q

What are the 4 main differences between second-generation sequencers and capillary-based sequencers (first generation)?

A
  1. Library construction: fragment genomic DNA and amplify by PCR, bypassing vector cloning (PCR, like cloning, makes the many copies of each fragment that are needed for the sequencing reaction).
  2. Number of parallel reads (the ability to sequence many DNA fragments simultaneously, rather than one at a time): up to 4 billion compared to 96
  3. Read lengths: generally shorter, 100-300 bp compared to about 800 bp for Sanger (this can complicate genome assembly, but the much higher coverage compensates)
  4. Amount of genomic template: need only a few micrograms for second-generation
60
Q

PCR steps (3)

A
  1. Denaturation of dsDNA (1 minute at 94 degrees C)
  2. Annealing of forward and reverse primers (forward on bottom strand and reverse on top strand), 45 seconds at 54 degrees C
  3. Extension (2 minutes at 72 degrees C, only dNTPs added)
61
Q

True or false: Taq polymerase synthesizes DNA one strand at a time during PCR

A

False. Taq polymerase synthesizes both DNA strands simultaneously

62
Q

How do you determine the number of DNA molecules present after a given number of PCR cycles?

A

2^x per starting template molecule, where x = # of cycles
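
As a quick numerical check of the 2^x rule, here is a minimal sketch. It assumes ideal doubling in every cycle, which real reactions only approximate, and the function name is illustrative.

```python
def copies_after_cycles(cycles: int, starting_molecules: int = 1) -> int:
    """Ideal PCR yield: every template is doubled once per cycle."""
    return starting_molecules * 2 ** cycles


for x in (1, 10, 30):
    print(f"{x} cycles: {copies_after_cycles(x)} copies per starting molecule")
```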

63
Q

What are the steps for the DNA library preparation part of the Roche 454 Sequencer (SGS-2004) method of sequencing? 4 steps

A
  1. Fragment DNA and ligate adaptors to ends (adaptors have known sequences which will allow for making primers)
  2. Select fragments with two different adaptors (because if the same adaptor is on both ends, you’ll get primer dimers when adding primers, since the primers for the two ends would be complementary)
  3. Certain adaptors have biotin on them. Add beads containing streptavidin, since biotin binds streptavidin. At this point, strands without biotin on either adaptor (same adaptor) will be selected against.
  4. Nick nonbiotinylated strand to get sstDNA library (nick= breaking one strand of DNA). Strands with biotin on both adaptors will stay looped to the bead and also get selected against.
64
Q

What are the steps of the emPCR part of the Roche 454 Sequencer (SGS-2004) method of sequencing? 3 steps

A
  1. Add more DNA capture beads than DNA templates
  2. Emulsify beads and PCR reagents in oil (a water-in-oil emulsion), so that each droplet acts as a microreactor.
  3. Clonal amplification occurs inside microreactors
65
Q

What are the steps of the sequencing part of the Roche 454 Sequencer (SGS-2004) method of sequencing? 5 steps

A
  1. Put beads in wells of picotiter plate (plate with lots of wells, one bead per well)
  2. Add sequencing reaction components including adenosine 5’phosphosulfate (APS), luciferin, luciferase and primers. Basically everything but dNTPs at this point.
  3. Flood dNTPs one at a time over the picotiter plate.
  4. If nucleotide is added to new DNA strand, pyrophosphate is given off that results in light emission. This is because pyrophosphate reacts with APS to form ATP. ATP then reacts with luciferin, which results in light.
  5. Take an image of picotiter plate and repeat with next dNTP.
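
The flood-and-image cycle in steps 3 to 5 can be mimicked with a toy simulation for a single well. This is a sketch under stated assumptions: the flow order "TACG" is arbitrary, the light signal is taken to be exactly proportional to the number of bases incorporated in a flow (so runs of the same base give a stronger signal), and all names are made up for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pyrosequence(template: str, flow_order: str = "TACG", max_flows: int = 400):
    """Flood one dNTP at a time over a single-stranded template.

    `template` is written in the order the polymerase reads it. Returns the
    synthesized strand and a flowgram of (nucleotide, signal) pairs, where
    signal 0 means no incorporation and therefore no light.
    """
    position = 0
    flows = 0
    synthesized = []
    flowgram = []
    while position < len(template) and flows < max_flows:
        nucleotide = flow_order[flows % len(flow_order)]
        signal = 0
        # Incorporate this dNTP for as long as it pairs with the next template base.
        while position < len(template) and COMPLEMENT[template[position]] == nucleotide:
            signal += 1
            position += 1
        flowgram.append((nucleotide, signal))
        synthesized.append(nucleotide * signal)
        flows += 1
    return "".join(synthesized), flowgram


strand, flowgram = pyrosequence("AAC")
print(strand)    # prints TTG
print(flowgram)  # prints [('T', 2), ('A', 0), ('C', 0), ('G', 1)]
```

Reading the non-zero signals in flow order reconstructs the new strand, which is why only one type of dNTP can be present in each flow.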
66
Q

Describe the sample preparation and amplification steps for Illumina Solexa Sequencing (SGS)

A
  1. Fragment DNA and add linkers (adaptors) at the ends
  2. Denature and bind one end of the ssDNA fragments to surface of flow cell (glass, each glass slide is coated with a lawn of adaptors)
  3. Free end of fragments hybridize to other adaptors on the flow cell surface (bridging reaction, when the DNA fragment randomly bends in the vicinity of a surface adaptor)
  4. Add PCR components (e.g. dNTPs, Taq polymerase) and carry out PCR in flow cell - flow cell adaptors now act as primers
  5. DNA fragments are amplified generating clusters of multiple copies (millions) of the same molecule
67
Q

Describe the sequencing-by-synthesis steps for Illumina Solexa Sequencing (SGS)

A
  1. Initiate sequencing of clusters by adding primers, DNA polymerase and reversible ddNTPs (reversible = the ddNTPs will stop the rxn temporarily, but then the rxn will resume once the block is removed). Strictly speaking these are reversible terminators, i.e. nucleotides with a removable 3’ block, rather than true dideoxynucleotides.
    - Each type of ddNTP is labeled with a different fluorophore
  2. Add all four ddNTPs at once, allow incorporation in the sequencing reaction and image the flow cell
  3. Remove the fluorophore and the terminating block from each incorporated ddNTP, then add new ddNTPs with fluorophores and continue sequencing
  4. Repeat n times to create a read length of n nucleotides
68
Q

Why can we use dNTPs for Roche 454 sequencing but have to use ddNTPs for Illumina Solexa sequencing?

A
  • dNTPs can be used in Roche 454 sequencing because the nucleotides are added one at a time, and the system relies on a light signal produced by the reaction to indicate the addition of a nucleotide to the daughter strand. There’s no risk of adding multiple different nucleotides back to back in one reaction, because only one type of nucleotide is flooded in each reaction. Overall, we can control the rate of nucleotide incorporation if dNTPs are used since they’re flooded one at a time.
  • Reversible ddNTPs are needed for Illumina Solexa sequencing because all 4 nucleotides are present in each reaction cycle. The ddNTPs ensure that only one nucleotide is incorporated per cycle, so the flow cell can be imaged accurately. Overall, the rate of addition of nucleotides is controlled by the use of reversible ddNTPs in Illumina sequencing.
69
Q

Describe Applied Biosystems SOLiD sequencing (SGS)

A
  1. Library preparation similar to Roche 454 (beads and emPCR)
  2. Universal primer hybridizes to the P1 adapter sequence at the end of fragments.
  3. A set of 16 fluorescently labelled 8-mers (single-stranded oligonucleotides that are 8 nucleotides long) is flooded over the fragments.
    - First 2 bases of each 8-mer are fixed (dibase probes, base pair with the template), and the remaining 6 bases are degenerate (no specificity/complementary binding)
  4. Allow probe to bind template and ligate to primer (sequencing by ligation).
  5. The fluorophore is cleaved off the probe. The cleavage removes the last few bases of the ligated probe along with the fluorescent label, leaving the interrogated dinucleotide ligated to the growing strand and an exposed 5’ end where the next probe can attach.
  6. Following several ligation cycles, the extended (daughter) strand is removed from the template and the process is repeated with a new primer (offset by one nucleotide)
70
Q

Why does Applied Biosystems SOLiD sequencing have high accuracy?

A

Because you’re sequencing the exact same template over and over again.

71
Q

Main difference between SOLiD sequencing and the other second generation sequencers?

A

SOLiD uses ligase, not polymerase

72
Q

Main differences between second-generation and third generation sequencers (3 things)

A
  • No PCR amplification required for third-generation sequencers (sequencing of single DNA molecules)
  • Read lengths: much longer (10,000 to 100,000 bp) and therefore, less coverage required
  • Error rate and costs are still much higher than 2nd generation platforms
73
Q

Describe steps for Pacific Biosciences Single Molecule Real Time (SMRT) Sequencing (TGS)
(3 steps)

A
  1. Sequencing reaction carried out in extremely small wells (50 nm) called zero-mode waveguides (ZMWs) allowing for high sensitivity to measure fluorescence
  2. DNA and polymerase are immobilized at the bottom of the ZMWs
  3. Fluorescent dNTPs are added all at the same time and incorporation is measured by intensity and colour of fluorescence.
74
Q

Describe steps for Oxford Nanopore Technologies (TGS)
(5 steps)

A
  1. Nanopore is the bacterial α-hemolysin protein embedded in a synthetic membrane on an array chip
  2. Membrane has high electrical resistance, and the application of a potential across the membrane causes a current to flow through the aperture of the nanopore
  3. DNA is inserted in a nanopore by a DNA helicase and travels through the nanopore one nucleotide at a time
  4. As each type of nucleotide travels through the nanopore, it causes a unique current disruption.
  5. The current changes are measured to identify the nucleotide sequence
75
Q

Describe High Fidelity Long Read Sequencing (High-fidelity TGS)

A
  • Major breakthrough in 2021
  • Strategy is to sequence a circularized DNA molecule instead of a linear DNA molecule by PacBio or Nanopore
  • Allows for multiple rounds of sequencing of a single DNA molecule generating a long sequencing read with multiple copies (subreads)
  • Comparison of subread sequences identifies errors and increases fidelity
  • 99.9% accuracy
76
Q

~8% of the human genome remained unsequenced for 20 years after the human draft genome sequence in 2001. Why was this?

A

Due to limitations of Sanger and NGS in obtaining sequence of highly repetitive regions and structural polymorphisms (section of a gene occurs in several different forms, such as copy number variation, rearrangements, inversions of regions greater than 1 kb)

77
Q

Where did large sequencing gaps remain?

A

Large sequencing gaps remain on short arms of acrocentric chromosomes, as well as the Y chromosome

78
Q

How are companies overcoming the sequencing gaps in the human genome?

A

The Telomere-to-Telomere (T2T) Consortium used high-fidelity PacBio and Nanopore platforms to complete sequencing of the human genome.

79
Q

How many more genes were sequenced using high-fidelity TGS for the human genome project?

A

200 Mbp of new sequence, 2000 candidate genes including 99 new coding genes

80
Q

High-fidelity TGS is an invaluable resource for…

A

New discoveries in gene regulation, genetic variability and new disease loci

81
Q

What is an application of next-generation sequencing regarding SNPs?

A

Large number of sequence reads of genomes make it easier to identify SNPs linked to polygenic diseases and interesting traits

82
Q

What is an application of next-generation sequencing regarding metagenomics?

A

Sequence genetic material from environmental samples to determine identity and diversity of microbes (gut, volcanic vents, oil sands)

83
Q

What is an application of next-generation sequencing regarding sequencing capacity?

A

Sequencing capacity allows for greater coverage of a genome that is present in a low proportion of the total genetic material in a sample (e.g. Neanderthal DNA is <5% of sample; woolly mammoth)

84
Q

What is an application of next-generation sequencing regarding transcriptomics (RNA-Seq)?

A

Global identification of low-abundance transcripts (including microRNAs) with higher sensitivity than microarrays

85
Q

What is an application of next-generation sequencing regarding chromatin immunoprecipitation (ChIP)- Seq?

A

Global identification of binding sites of nucleic-acid binding proteins and chemical modifications (e.g. histone occupancy and acetylation, transcription factors)

86
Q

What is a limitation of next-generation sequencing regarding shorter reads?

A

Shorter reads (100-300 bp) in 2nd generation sequencing make de novo assembly of eukaryotic genomes difficult (no template to help assemble the reads)
- okay for prokaryotes with few repetitive sequences or for resequencing projects

87
Q

What is a limitation of next-generation sequencing regarding coverage?

A

Have to increase coverage (20X-30X) for 2nd generation sequencing due to shorter reads than Sanger sequencing

88
Q

What is a limitation of next-generation sequencing regarding hi-fidelity 3rd generation sequencing?

A

Hi-fidelity 3rd generation sequencing reduces error rate, but cost is still high (although it is going down)

89
Q

What is a limitation of next-generation sequencing regarding infrastructure and support?

A

Few centers with strong infrastructure and support for assembly and analysis

90
Q

What is a limitation of next-generation sequencing regarding bioinformaticians?

A

Shortage of highly-trained bioinformaticians for assembly and analysis of genome sequences.

91
Q

What are 5 applications of human genome sequencing to medicine? (Routine human genome sequencing/personalized medicine)

A
  1. Personal genomic information can lay out the health roadmap of the individual (genetic makeup helps identify how we respond to certain therapies)
  2. Provide advanced screening for disease (nanopore sequencing can identify bacterial pathogens within 7 hours compared to 2-4 days for culturing)
  3. Select safer and more effective medications and dosages
  4. Create better vaccines (DNA/RNA vaccines)
  5. Lower health care costs
92
Q

What are 5 examples of human genome projects and describe them

A
  1. Personal genome project (Harvard Medical School and other countries including Canada): can volunteer to “share your genome information for the greater good”
  2. 1000 Human Genomes Project (completed in 2012): genomes of 1092 anonymous people from 14 populations around the world were sequenced.
  3. The Cancer Genome Atlas (TCGA): started as a three-year pilot in 2006 funded by NCI and NHGRI to focus on the molecular understanding of brain, lung and ovarian cancer
  4. UK10K: Sequence 4000 healthy humans and exomes of 6000 currently living with a genetic disease (obesity, schizophrenia and congenital heart disease)
  5. 10K Autism Genome project: sequencing both the child and the parents to determine the polymorphisms between parent and child.
93
Q

As more disease genes are discovered…

A

We will gain a better molecular understanding of the disorder and develop better diagnostics and therapeutics