Module 4.2 Sequencing Run QC Flashcards
1
Q
tiles
A
regions on a flow cell
2
Q
Summary Table categories
(14)
A
- Number of lanes
- Number of tiles per lane
- Density (K/mm2)
- Cluster PF (%)
- Phase/Prephase (%)
- Reads (M)
- Reads PF (M)
- % >= Q30
- Yield (G)
- Cycles Err Rated
- Aligned (%)
- Error Rate (%)
- Error Rate 35 cycle (%)
- Error Rate 75 cycle (%)
3
Q
Summary Table
A
- statistics showing means and standard deviations over all the tiles used in one lane
4
Q
Summary Table
Density
(K/mm2)
A
- concentration of clusters detected by image analysis
- expressed as thousands of clusters per square millimeter
5
Q
Summary Table
Cluster PF
(%)
PF = Passing Filter
A
- percentage of clusters (reads) that pass the Chastity filter
- (Reads PF / Reads) x 100 = ANSWER
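The ratio on this card can be sketched in Python. This is only an illustration: the function name `cluster_pf_percent` and the example counts are made up for this note, not taken from any Illumina tool.

```python
def cluster_pf_percent(reads_pf: float, reads: float) -> float:
    """Return the percentage of clusters that passed filter.

    reads_pf and reads must use the same unit (for example, millions).
    """
    if reads <= 0:
        raise ValueError("reads must be positive")
    return 100.0 * reads_pf / reads


# Hypothetical lane: 100 M clusters detected, 90 M passed filter.
print(cluster_pf_percent(90, 100))  # 90.0
```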
6
Q
Chastity
used in Chastity Filter
A
- brightest base intensity divided by the sum of the brightest and second-brightest base intensities
- Illumina internal quality filtering procedure
- Clusters pass filter if no more than one base call has an ANSWER value below 0.6 in the first 25 cycles.
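A minimal Python sketch of the rule on this card, assuming each cycle is given as four non-negative base intensities. The function names, parameter names, and intensity values are hypothetical; this is not Illumina's implementation.

```python
def chastity(intensities):
    """Brightest intensity / (brightest + second brightest) for one cycle."""
    top, second = sorted(intensities, reverse=True)[:2]
    total = top + second
    # Assumption: a cycle with no signal at all counts as chastity 0.
    return top / total if total > 0 else 0.0


def passes_filter(cycles, threshold=0.6, window=25, max_failures=1):
    """True if at most max_failures of the first `window` cycles fall below threshold."""
    failures = sum(1 for c in cycles[:window] if chastity(c) < threshold)
    return failures <= max_failures


clean = [[100, 20, 5, 1]] * 25    # chastity = 100 / 120, about 0.83
mixed = [[50, 50, 1, 1]] * 2 + clean[:23]  # two cycles at chastity 0.5
print(passes_filter(clean))  # True
print(passes_filter(mixed))  # False
```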
7
Q
Summary Table
Reads
(M)
A
- total number of clusters
- measured in millions
8
Q
Summary Table
Reads PF
(M)
A
- number of clusters that pass Chastity filter
- measured in millions
9
Q
Yield
(G)
A
- number of bases sequenced that pass filter, in gigabases
- 1 cluster = 1 read sequence
- ANSWER = (number of clusters passing filter) x (number of bases sequenced in each read)
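The yield formula on this card, sketched in Python. The function name and the example numbers are made up; the division by 1e9 assumes (G) means gigabases.

```python
def yield_gigabases(clusters_pf: int, bases_per_read: int) -> float:
    """Yield = clusters passing filter x bases per read, reported in gigabases."""
    return clusters_pf * bases_per_read / 1e9


# Hypothetical run: 10 million PF clusters, 100 bases sequenced per read.
print(yield_gigabases(10_000_000, 100))  # 1.0
```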
10
Q
Summary Table
% >= Q30
A
percentage of bases with Quality Score of 30 or higher
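As a quick illustration, the percentage can be computed from a list of per-base quality scores. The function name and the four example scores are hypothetical.

```python
def percent_q30(scores):
    """Percentage of bases whose quality score is 30 or higher."""
    if not scores:
        raise ValueError("scores must not be empty")
    return 100.0 * sum(1 for q in scores if q >= 30) / len(scores)


print(percent_q30([30, 35, 20, 40]))  # 75.0
```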
11
Q
Phred Quality Score
A
- prediction of probability of an error in base calling
- Q = -10 log10(P)
- P = base calling error probability
- generally range 0-40
- base calls are more error prone with extreme GC bias, specific sequence patterns, or homopolymers
- data below Q20 (error probability above 1%) is generally not considered valuable
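The Phred formula on this card can be checked in both directions with a short Python sketch. The function names are made up for this note.

```python
import math


def phred_from_error_prob(p: float) -> float:
    """Q = -10 * log10(P), where P is the base calling error probability."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


def error_prob_from_phred(q: float) -> float:
    """Inverse of the formula: P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


# Q20 is a 1 in 100 error chance, Q30 is 1 in 1000.
for q in (10, 20, 30, 40):
    print(q, error_prob_from_phred(q))
```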
12
Q
Phred
Analyzing process
A
- calculates several parameters related to peak shape and peak resolution at each base position
- uses these parameters to predict error probabilities, calibrated on sequence traces where the correct sequence was known
- each score (gray bar) corresponds to one colored peak (base)
- No score = N base assignment (peak not good enough for a base call)
13
Q
Illumina quality scores calculation
steps (3)
A
- intensity profiles and signal to noise ratios measure base calling reliability
- compute quality predictor values for a new base call and compare to values in pre-calibrated Quality (Q) table/model. Quality scores recorded for each cycle in base call (BCL) files
- quality scores converted to FASTQ format
14
Q
Quality Score
FASTQ format
A
4 lines per sequence read, one field per line
- @Sequence_Identifier
- Raw sequence (A, T, G, C)
- + (optionally followed by the sequence identifier or a description)
- Quality values for each base (ASCII)
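A minimal Python sketch of reading one 4-line record as described on this card. The function name, the returned dictionary keys, and the example read are all made up, and real FASTQ files should be read with an established parser.

```python
def parse_fastq_record(lines):
    """Split one 4-line FASTQ record into identifier, sequence, and quality string."""
    if len(lines) != 4:
        raise ValueError("a FASTQ record has exactly 4 lines")
    header, sequence, separator, quality = (line.rstrip("\n") for line in lines)
    if not header.startswith("@"):
        raise ValueError("line 1 must start with '@'")
    if not separator.startswith("+"):
        raise ValueError("line 3 must start with '+'")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    return {"id": header[1:], "sequence": sequence, "quality": quality}


# Invented example record, not real sequencing data.
record = parse_fastq_record(["@read1\n", "GATTACA\n", "+\n", "IIIIIII\n"])
print(record["id"], record["sequence"], record["quality"])  # read1 GATTACA IIIIIII
```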
15
Q
FASTQ file
A
- text-based format for storing both the nucleotide sequence and its corresponding quality data
- quality scores are encoded as ASCII characters, so each quality value takes only one byte
- Quality score = ASCII character code - 33 (Phred+33 encoding)
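The decoding rule on this card can be tried out in Python. The function name and the example quality string are made up; the offset of 33 is the one stated on the card.

```python
def decode_quality(quality_string: str, offset: int = 33) -> list:
    """Convert a FASTQ quality string into a list of integer quality scores."""
    return [ord(ch) - offset for ch in quality_string]


# '!' is ASCII 33, '5' is 53, '?' is 63, 'I' is 73.
print(decode_quality("!5?I"))  # [0, 20, 30, 40]
```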