Lecture 3 Flashcards

1
Q

Define “massive parallel sequencing” in NGS:

A

Massive: several regions at once; parallel: several samples at a time

2
Q

What are the 2 main NGS platforms currently used?

A

Ion Torrent and Illumina

3
Q

What is Ion Torrent also referred to as?

A

“Label-free sequencing” - fluorescence or spikes of light are not used

4
Q

What does Ion Torrent measure?

A

Changes in pH (hydrogen ions are released as nucleotides are incorporated)

5
Q

What is Ion Torrent characterized by?

A

High accuracy and good coverage

6
Q

How long are the reads sequenced by Ion Torrent?

A

About 200 bp

7
Q

How long are the average Sanger sequence reads?

A

600-800 bp, up to 1000 bp

8
Q

What is Illumina characterized by?

A

Bridge amplification

9
Q

How is Illumina sequencing visualized?

A

By the use of fluorescence → each nucleotide is linked to a different fluorophore which emits a unique signal

10
Q

Describe the library structure of an Illumina sample:

A

A DNA insert with “read 1” and “read 2” (primer sites similar to the forward and reverse primers in Sanger) on either side, two indexes on either side that serve as an 8 bp “barcode” exclusive to each sample, and 2 adaptors, called P5 and P7, complementary to the oligos linked to the flow cell

11
Q

What is the process called in which the reads from all samples pooled in an Illumina sequencing run are sorted, so that each obtained sequence is associated with its sample of origin?

A

Demultiplexing → each read is assigned to a unique sample by its index (barcode)
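
A minimal sketch of the idea, assuming each read already comes paired with its 8 bp index and that a sample sheet maps indexes to samples. The names `demultiplex` and `sample_sheet`, and all barcodes and reads below, are hypothetical; real demultiplexing software also tolerates a limited number of index mismatches, which is not shown here.

```python
from collections import defaultdict

def demultiplex(reads, sample_sheet):
    """Group (index, sequence) reads by the sample that owns each index."""
    by_sample = defaultdict(list)
    for index, sequence in reads:
        # Reads whose index is not in the sample sheet go to "undetermined".
        sample = sample_sheet.get(index, "undetermined")
        by_sample[sample].append(sequence)
    return dict(by_sample)

# Hypothetical 8 bp barcodes and reads, for illustration only.
sample_sheet = {"ACGTACGT": "sample_A", "TTGCAAGC": "sample_B"}
reads = [
    ("ACGTACGT", "GATTACA"),
    ("TTGCAAGC", "CCGGTTA"),
    ("ACGTACGT", "TTAGGCA"),
    ("GGGGGGGG", "AAAAAAA"),
]
result = demultiplex(reads, sample_sheet)
```

Each sequence ends up in the list of the sample whose barcode it carries; the last read has an unknown barcode and is set aside.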

12
Q

What is cluster generation?

A

Amplification of the library fragments bound to the flow cell (by bridge amplification) into clusters of identical copies

13
Q

What is a quality score?

A

A prediction of the probability of an error in base calling

14
Q

When measuring Phred quality score, what probability corresponds to high accuracy?

A

A low error probability (which corresponds to a high Phred score)
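
The relationship can be made concrete with the standard Phred definition, Q = -10 · log10(P), where P is the estimated probability that the base call is wrong. The card does not state this formula, and the function names below are illustrative only; this is a small sketch, not part of the lecture.

```python
import math

def phred_to_error_probability(q):
    """Convert a Phred quality score Q into an error probability P."""
    return 10 ** (-q / 10)

def error_probability_to_phred(p):
    """Convert an error probability P into a Phred quality score Q."""
    return -10 * math.log10(p)

# Higher Q means a lower error probability, i.e. higher accuracy.
for q in (10, 20, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: error probability {p:g}, accuracy {100 * (1 - p):g}%")
```

For example, Q30 corresponds to an error probability of 1 in 1000.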

15
Q

What is read depth?

A

The total number of bases sequenced and aligned at a given reference base position

16
Q

What is coverage defined as?

A

The average number of sequenced bases that align to each base of the reference DNA → ex: a whole genome sequenced at 30x coverage means that each base in that genome analysis was sequenced 30 times on average
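
Read depth and coverage can be illustrated together. The sketch below assumes reads are given as hypothetical (start, end) alignment intervals on the reference, 0-based with the end excluded; the toy data and the function names are made up for illustration.

```python
def depth_per_position(alignments, reference_length):
    """Count how many aligned reads cover each reference position.

    alignments: list of (start, end) pairs, 0-based, end exclusive.
    """
    depth = [0] * reference_length
    for start, end in alignments:
        # Clip each alignment to the reference boundaries.
        for pos in range(max(start, 0), min(end, reference_length)):
            depth[pos] += 1
    return depth

def mean_coverage(depth):
    """Coverage: the average read depth over all reference positions."""
    return sum(depth) / len(depth)

# Hypothetical toy data: a 10 bp reference and three aligned 6 bp reads.
alignments = [(0, 6), (2, 8), (4, 10)]
depth = depth_per_position(alignments, 10)
coverage = mean_coverage(depth)
```

Here `depth` holds the read depth at each position, and `coverage` is their average, so 18 aligned bases over a 10 bp reference give 1.8x coverage.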

17
Q

What does NGS accuracy depend on?

A

Coverage

18
Q

How can using paired-end sequencing reduce the chance of introducing an error in the base calling?

A

We can check whether a detected genetic variant is called in a balanced way

19
Q

In Illumina paired-end sequencing, between which reads should a variant call be balanced?

A

Between read 1 and read 2 → otherwise the variation might be an error
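
One possible way to express this check in code, as a rough sketch only: count how many variant-supporting reads come from read 1 and from read 2, and flag the call if one side contributes almost nothing. The function name and the 0.2 cutoff are arbitrary assumptions for illustration, not a published standard or a method from the lecture.

```python
def is_balanced(read1_support, read2_support, min_fraction=0.2):
    """Return True if read 1 and read 2 both contribute a reasonable
    share of the reads supporting a variant.

    min_fraction is an arbitrary illustrative cutoff (an assumption).
    """
    total = read1_support + read2_support
    if total == 0:
        return False  # no supporting reads at all
    minority = min(read1_support, read2_support)
    return minority / total >= min_fraction

# Hypothetical counts of variant-supporting reads.
balanced_call = is_balanced(14, 16)    # support from both reads
suspicious_call = is_balanced(30, 0)   # support from read 1 only
```

A variant seen only in read 1 (or only in read 2) fails the check and would be treated as a possible error.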

20
Q

What are some pros of paired-end sequencing?

A

Millions of sequencing reactions are performed in parallel, and changes in copy number can be identified

21
Q

What are some cons of paired-end sequencing?

A

A very large amount of data is collected and both false positives and false negatives are possible

22
Q

Currently, what is required at the end of data analysis?

A

The use of Sanger sequencing because the results obtained by NGS require validation