LEC48: Applications of Next Gen Sequencing Flashcards

1
Q

how does sanger sequencing work

A

take DNA, compartmentalize/target regions of interest in genome

generate sequence-specific primers

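add DNA polymerase, normal dNTPs, and chain-terminating modified bases (dideoxynucleotides, ddNTPs)

these modified DNA bases lack 3’-OH on ribose moiety, so any replicating DNA chain into which they’re added by DNA pol will be unable to be further extended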

these terminator bases are added into individual elongating DNA molecules

produces ladder of DNA chains of specific lengths

each chain is tagged by its terminator molecule w/ a unique fluorescent tag that’s detected by a reading device

2
Q

how does next gen DNA sequencing process work

A

physically shear DNA, produce fragments

size select fragments for enzymatic attachment of adaptor primers

denature the DNA into single strands, adaptor primers used to capture individual fragments onto a sequencing matrix

amplify these molecules to make library, by PCR

add DNA polymerase + modified bases that aren’t extendable but are tagged w/ a fluorescent color, 1 for each base

generates a series of individual rxns, each located at a specific position on the matrix and emitting a specific color, allowing ID of the base added

3
Q

what is massively parallel sequencing

A

once have done 1 round of next gen sequencing,

wash, removing all reagents

chemically modify, remove fluorescent tags, so DNA can be elongated again

add second cycle of reagents

generates a 2nd piece of sequence info for each position

repeat this cycle multiple times, generate sequence data in parallel at massive scale

4
Q

how are clusters visualized w/ next gen

A

measure dNTP-specific fluorescence at individual rxn centers b/c each point of light represents an individual sequencing rxn

this is advanced microscopy

this requires lots of computational power
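A minimal sketch of the idea of reading a base from color at one rxn center: for each cycle, take the brightest of four fluorescence channels as the base call. The channel order, the intensity values, and the function name `call_bases` are made up for illustration; real instruments use their own dye mapping and much more involved image processing.

```python
# Hypothetical channel order; real instruments define their own dye-to-base mapping.
BASES = "ACGT"

def call_bases(cycles):
    """Call one base per cycle for a single cluster (reaction center).

    `cycles` is a list of 4-tuples of fluorescence intensities,
    one tuple per sequencing cycle, ordered as in BASES.
    """
    read = []
    for intensities in cycles:
        # The brightest channel in this cycle is taken as the incorporated base.
        brightest = max(range(4), key=lambda i: intensities[i])
        read.append(BASES[brightest])
    return "".join(read)

# Made-up intensities for one cluster over four cycles.
cluster_signal = [
    (0.9, 0.1, 0.0, 0.1),
    (0.1, 0.1, 0.8, 0.2),
    (0.0, 0.2, 0.1, 0.7),
    (0.2, 0.9, 0.1, 0.0),
]
read = call_bases(cluster_signal)
print(read)
```

Repeating this for every cluster in every cycle is where the computational load comes from.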

5
Q

once use next gen sequencing to sequence, how do you apply that info

A

massively parallel sequence data analysis: probabilistic approach, align fragments against reference genome and find variants w/in each fragment of DNA
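As a rough illustration of "align, then look for differences", the sketch below places one read at the ungapped position with the fewest mismatches and reports the mismatching bases. This is a deterministic toy, not the probabilistic method real aligners use; the sequences and function names are invented for the example, and positions are 0-based.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def best_alignment(read, reference):
    """Return (position, mismatches) of the best ungapped placement of read."""
    best_pos, best_mm = None, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        mm = hamming(read, reference[pos:pos + len(read)])
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos, best_mm

def call_variants(read, reference):
    """List (reference_position, ref_base, read_base) for each mismatch."""
    pos, _ = best_alignment(read, reference)
    window = reference[pos:pos + len(read)]
    return [(pos + i, ref, alt)
            for i, (ref, alt) in enumerate(zip(window, read))
            if ref != alt]

reference = "ACGTTAGCCATGGATCCGTA"  # toy reference
read = "AGCCTTGG"                   # toy read with one base changed
variants = call_variants(read, reference)
print(variants)
```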

6
Q

how can a computer analyze a next gen genome faster

A

1) split genome by chromosome, create many jobs
2) run jobs concurrently on diff cluster nodes
3) combine results into single output for further analysis
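The three steps above can be sketched as follows. This is a toy under stated assumptions: worker threads stand in for cluster nodes, the per-chromosome "analysis" is just counting positions that differ from the reference, and all sequences are made up.

```python
from concurrent.futures import ThreadPoolExecutor

def count_variants(job):
    """One job: compare a sample chromosome to its reference, base by base."""
    chrom, ref_seq, sample_seq = job
    n_diff = sum(r != s for r, s in zip(ref_seq, sample_seq))
    return chrom, n_diff

# Toy data standing in for a genome.
reference = {"chr1": "ACGTACGT", "chr2": "GGGCCC", "chr3": "TTAACC"}
sample = {"chr1": "ACGTTCGT", "chr2": "GGGCCC", "chr3": "TAAACG"}

# 1) split by chromosome into independent jobs
jobs = [(chrom, reference[chrom], sample[chrom]) for chrom in reference]

# 2) run the jobs concurrently (threads here, cluster nodes in practice)
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(count_variants, jobs))

# 3) combine the per-chromosome results into a single output
variants_per_chrom = dict(results)
total = sum(variants_per_chrom.values())
print(variants_per_chrom, total)
```

Splitting by chromosome works because each chromosome can be analyzed without needing results from the others.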

7
Q

what is the computational challenge re: input of next gen sequencing

A

input n bp long sequences from sample, as short reads

map those back to reference genome to align them, map out genome

can map each read back to a specific range of the reference

but **repetitive regions are a problem for this** b/c hard to unambiguously assign to reference genome
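A small sketch of why repeats are a problem: a read from unique sequence matches the reference in one place, while a read from inside a repeat matches in several, so it cannot be assigned to a single position. The toy reference (with a CA repeat in the middle) and the helper `find_all` are invented for illustration and use exact matching only.

```python
def find_all(read, reference):
    """Return every start position where read matches reference exactly."""
    hits = []
    start = reference.find(read)
    while start != -1:
        hits.append(start)
        start = reference.find(read, start + 1)  # allow overlapping hits
    return hits

# Toy reference: unique flank + CA repeat + unique flank.
reference = "GATTACA" + "CACACACACA" + "TTGGCATC"

unique_read = "TTACAC"  # spans the edge of the repeat, anchored in unique sequence
repeat_read = "ACAC"    # lies entirely within the repeat

unique_hits = find_all(unique_read, reference)
repeat_hits = find_all(repeat_read, reference)
print(len(unique_hits), len(repeat_hits))
```

The longer the repeat relative to the read length, the more reads fall entirely inside it and map ambiguously.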

8
Q

what are limitations to computational abilities of next gen

A

1) space needed is massive for storing image files and subsequent data
2) processing power needed for aligning huge number of relatively short sequence fragments (reads) that’re generated in order to ID positions w/ sequence variants (polymorphisms, mutations)

need **high performance computer arrays, sophisticated computational algorithms** to minimize the processing time needed to accomplish these tasks

9
Q

sequence alignment vs database mapping?

A

sequence alignment: comparing 2 sequences of DNA

database mapping: comparing many small sequences to one really big sequence

10
Q

what happens w/ DNA fragment sequence once generated in next gen seq

A

must align sequence unambiguously to a specific chromosomal position

use computer to generate best fits of each fragment to a genome region

highly repetitive regions are difficult to align well, aren’t sequenced this way

also cannot align regions of genome which aren’t efficiently amplified by PCR b/c of sequence composition (e.g. high GC content)
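To make "high GC content" concrete, the sketch below computes the GC fraction in fixed windows and flags GC-rich ones. The window size and the 0.7 threshold are arbitrary assumptions for the example, not values from the lecture, and the sequence is made up.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def flag_gc_rich(sequence, window=10, threshold=0.7):
    """Return (start, gc_fraction) for each non-overlapping GC-rich window."""
    flagged = []
    for start in range(0, len(sequence) - window + 1, window):
        frac = gc_fraction(sequence[start:start + window])
        if frac >= threshold:  # threshold is an illustrative cutoff
            flagged.append((start, frac))
    return flagged

# Toy sequence: AT-rich, then GC-rich, then AT-rich.
sequence = "ATATTAGCAT" + "GCGCGGCCGC" + "TTAGACATTA"
gc_rich_windows = flag_gc_rich(sequence)
print(gc_rich_windows)
```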

11
Q

what is the significance of variants in next gen sequencing?

A

if sample fragments have variation from reference genome, could be incorrect seq assignment, poor alignment, experimental noise if in region w/ few seq reads

HOWEVER if true variation, hard to interpret b/c of:

1) incomplete knowledge of fxn of all genes
2) incomplete knowledge of range of tolerated variation in human populations
3) incomplete knowledge of effect of individual amino acid changes on protein fxn
4) incorrect assignments of pathogenicity in current mutation databases

12
Q

what is the exome

A

protein coding portion of the genome

better understood than regulatory regions of genome that’re noncoding

13
Q

why might you do exome sequencing?

A

to reduce amount of variation to be interpreted in clinical seq sample for next gen seq

can capture **specific DNA fragments representing the coding part of the genome, using specially designed primers that incorporate tag molecules**

14
Q

how does exome sequencing work

what can it be used on?

A

primers confer specificity to your target

primers incorporate tag molecule and use its physical characteristics to yield **enrichment in DNA fragments of interest**

can capture: whole exome, subset of known disease-associated genes (medical exome), or panel of genes for a specific condition (e.g. epilepsy)

15
Q

efficacy of exome sequencing?

A

v helpful for clinical test especially in undiagnosed, mendelian disease - good for rare disease detection

has been used at Baylor CoM, 25% hit rate on first 250 exomes done

16
Q

what are the possible applications of next gen seq?

A

1) transcription profiling
2) undiagnosed diseases
3) cancer
4) infectious diseases

17
Q

what info does NGS transcriptional profiling provide?

A

1) tissue-specific mRNA abundance, expression across tissues or pathologic states
2) alternative splicing events in normal and diseased tissue

18
Q

describe process of NGS for transcriptional profiling

A

take total RNA

fragment it

create random hexamer primed cDNA

map to reference gene

do gene function analysis

understanding tissue-specific gene expression in healthy & disease states;

observe rare splicing events, better quantitation of expression
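A hedged sketch of how mapped reads turn into an expression number: count the reads that fall in each gene, then normalize for gene length and sequencing depth. RPKM (reads per kilobase per million mapped reads) is used here as one common normalization; it is not named in these notes, and the gene coordinates and read positions are made up.

```python
def count_reads(genes, read_starts):
    """Count reads whose start position falls inside each gene interval."""
    counts = {name: 0 for name in genes}
    for pos in read_starts:
        for name, (start, end) in genes.items():
            if start <= pos < end:  # half-open interval [start, end)
                counts[name] += 1
    return counts

def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

# Hypothetical gene intervals (start, end) on one reference sequence.
genes = {"geneA": (0, 1000), "geneB": (2000, 4000)}
# Hypothetical start positions of mapped reads; the last one hits no gene.
read_starts = [10, 500, 900, 2500, 3999, 5000]

counts = count_reads(genes, read_starts)
expression = {
    name: rpkm(counts[name], end - start, len(read_starts))
    for name, (start, end) in genes.items()
}
print(counts, expression)
```

Comparing these normalized values across tissues or disease states is what gives the tissue-specific expression picture.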

19
Q

how can NGS help w/ undiagnosed disease?

A

patients who’ve exhausted medical testing and remain undiagnosed

if ID underlying genetic basis of disease, may be beneficial b/c prognostic, diagnostic (genetic counseling in the family), or therapeutic (rarely)

20
Q

how many variants does WGS produce?

how do we analyze them, how does penetrance complicate analysis?

A

~4.8 million

must define if they have small effect size, low penetrance, and thus are polymorphic in “normal” population - requires DB w/ large control pop

or if large effect size, high penetrance, and variant is de novo in proband, or inherited from a phenotypically normal parent - presumes variant’s fully penetrant

21
Q

how are variants studied?

A

filtration of variants by polymorphic frequency, false positives, inheritance models
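A minimal sketch of stepwise variant filtration, assuming a de novo inheritance model in a trio. The `Variant` fields, the 1% frequency cutoff, and the minimum read depth are illustrative assumptions, not values from the lecture; low read depth is used here as a crude stand-in for likely false positives.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    name: str
    population_freq: float  # frequency in a control population database
    read_depth: int         # number of reads covering the position
    in_mother: bool
    in_father: bool

def filter_candidates(variants, max_freq=0.01, min_depth=20):
    """Keep rare, well-covered variants that are absent from both parents."""
    kept = []
    for v in variants:
        if v.population_freq > max_freq:
            continue  # common polymorphism in the "normal" population
        if v.read_depth < min_depth:
            continue  # too few reads, possible false positive
        if v.in_mother or v.in_father:
            continue  # inherited, so not de novo under this model
        kept.append(v)
    return kept

# Made-up variants for illustration.
variants = [
    Variant("v1", 0.30, 50, True, False),     # common polymorphism
    Variant("v2", 0.0001, 5, False, False),   # rare but poorly covered
    Variant("v3", 0.0001, 40, True, False),   # rare but inherited
    Variant("v4", 0.0, 60, False, False),     # rare, well covered, de novo
]
candidates = [v.name for v in filter_candidates(variants)]
print(candidates)
```

Other inheritance models (e.g. recessive) would swap out the third test for a different rule.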

22
Q

what are limitations of NGS technology?

A

1) false-positive & false-negative variant calls increase w/ size of sequenced target
2) much variability among datasets in SNV and indel calls
3) sequence-specific limitations- highly GC or AT rich regions don’t amplify well & extended repetitive sequence runs won’t assemble, sequence well

23
Q

how is NGS used to study cancer

A

1) study underlying disease biology - demonstrate clonal evolution of relapsed cancer
2) make ttmnt decisions based on ID of specific driver mutations that might be amenable to targeted therapies
3) do RNA-seq and look at the epigenome, which is helpful for therapy

24
Q

how is NGS used to study ID

A

rapid ID of microbial species from epidemic outbreaks, e.g. Haitian cholera outbreak after the earthquake

reconstructed phylogenetic relationships among strains of pathogen

can sequence a CSF sample, identify organism

25
Q

how can genomic data be integrated into clinical practice?

A

personalized medicine

apply genetic data in clinical practice - use whole exome data to ID individuals who carry genomic variants that confer specific disease susceptibilities

more possible as prices for genomic sequencing decline

26
Q

Does NGS provide a comprehensive look at the entire genome? Why or why not?

A

Yes, you fragment the entire genome for NGS whereas w/ Sanger technique, target a part of the genome for study

Here use special DNA adaptors that amplify the entire genome for study

27
Q

Would NGS provide data on mitochondrial DNA mutations?

A

No, not by default: mitochondria have their own genome, separate from the nuclear genome, so it has to be targeted and analyzed separately

28
Q

How is quantitation of gene expression accomplished by NGS?

A

Massive computing

Aligns fragments found with reference genomes of the population

Calculate frequency variance between patient and model reference genome

29
Q

Which types of mutations are most likely to be unambiguously associated w/ clinical disease states?

A

Changes such as splicing changes or mutations in the exome or on RNA analysis, in coding regions

30
Q

List 5 challenges associated with classifying variants as pathogenic or benign

A
  1. the reference genomes we have are incomplete
  2. apparent variants may be true variants or may be misreads (computing errors)
  3. variants are not necessarily pathogenic; they could just be benign variation, and we have more info on some populations and subpopulations than others, making this especially a challenge in understudied populations
  4. cannot know if variant is de novo in the proband or inherited from a phenotypically normal parent
  5. penetrance hard to classify w/ NGS