P4: ASV Inference Flashcards

1
Q

generally explain the 16S rRNA workflow

A

sample collection -> DNA extraction -> library preparation -> sequencing

2
Q

16S rRNA workflow - DNA extraction

A
  • extract nucleic acid
  • can be done with RNA as well
3
Q

16S rRNA workflow: DNA extraction - why 16S

A
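  • 16S rRNA is the RNA component of the small (30S) ribosomal subunit, where it combines with ribosomal proteins; the 16S gene is present in all bacteria and archaea (mitochondria and chloroplasts carry related genes)
  • it has conserved regions (for universal primers) and hypervariable regions (to tell taxa apart), which makes it a widely used marker for identifying prokaryotes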
4
Q

16S rRNA workflow - library prep

A
  • PCR 1 and 2 (with cleanup 1 and 2 stages between them)
  • amplifying targets via primers
  • result is the final library
5
Q

16S rRNA workflow: library prep - PCR1

A
  • region specific
  • amplifies specific hypervariable regions
  • has required primer overhangs
  • goes on to clean up 1
6
Q

16S rRNA workflow: library prep - PCR2

A
  • indexing
  • 2nd amplification
  • adds barcodes/indexes (to identify which sample each read came from)
  • adds sequence adaptors
  • needs a different primer than PCR1 and will go on to clean up 2
7
Q

16S rRNA workflow: library prep - final library

A
  • has the adaptor sequences (short stretches of DNA) necessary for sequencing
  • from left to right: priming site for the sequencing reaction, library index, and flowcell handle
  • will go on to sequencing
8
Q

16S rRNA workflow - how are clean up 1 and 2 done

A

with magnetic beads: the DNA binds to the beads so that leftover primers and other small fragments can be washed away

9
Q

how are sequencing results shown

A
  • FASTQ files
  • a text-based format that contains each nucleotide sequence and its corresponding quality scores
  • every 4 lines represent 1 specific sequence (read)
10
Q

Fastq files - how to read them

A
  • each record contains 4 lines of information
    1. header
    2. sequence results
    3. base and Q separator (“+”)
    4. Q scores
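The 4-line layout can be sketched in Python. This is only an illustration: `FastqRecord`, `parse_fastq`, and the two example reads are made up for this card and do not come from any library or real run.

```python
from typing import Iterator, NamedTuple


class FastqRecord(NamedTuple):
    header: str      # line 1, stored without the leading "@"
    sequence: str    # line 2, the base calls
    separator: str   # line 3, starts with "+"
    quality: str     # line 4, one ASCII character per base


def parse_fastq(text: str) -> Iterator[FastqRecord]:
    """Yield one record for every group of four non-empty lines."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ text must contain a multiple of 4 lines")
    for i in range(0, len(lines), 4):
        header, sequence, separator, quality = lines[i:i + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record at non-empty line {i + 1}")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality must be the same length")
        yield FastqRecord(header[1:], sequence, separator, quality)


# Made-up example data (two reads), not from a real sequencing run.
EXAMPLE = """@read1
ACGTACGT
+
IIIIHHG#
@read2
TTGACCAA
+
IIIIIIII
"""

records = list(parse_fastq(EXAMPLE))
for record in records:
    print(record.header, record.sequence, record.quality)
```

For real files an established parser is the safer choice; the sketch skips cases such as line-wrapped sequences.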
11
Q

Fastq files - header

A
  • starts with “@” symbol
  • contains the read identifier assigned by the sequencer (instrument, run, flowcell position); the sample barcode/index may also be listed here
12
Q

Fastq files - Base and Q

A

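the line starting with “+” that separates the bases (line 2) from their Q scores (line 4); it may repeat the header text and does not indicate which strand was sequenced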

13
Q

Fastq files - Q scores

A
  • shown as ASCII characters (one character per base)
  • show how reliable every sequenced nucleotide is
  • typically range from 0 to 40
  • 40: reliable (about 1 error in 10,000 base calls)
  • ≤ 20: unreliable (1 error in 100 base calls, or worse)
  • quality usually drops toward the end of a read, so expect more errors there
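A small sketch of how the ASCII characters map to Q scores, assuming the common Phred+33 encoding (Q = character code minus 33) and the standard relation Q = -10 · log10(p), where p is the probability that the base call is wrong. The function names are made up for this card.

```python
from typing import List


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert a quality string to Q scores (assumes Phred+33 by default)."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q: int) -> float:
    """Probability that a base call with score q is wrong: p = 10^(-q/10)."""
    return 10 ** (-q / 10)


# "I" encodes Q40, "5" encodes Q20, "#" encodes Q2 under Phred+33.
for ch, q in zip("I5#", phred_scores("I5#")):
    print(ch, q, error_probability(q))
```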
14
Q

what are the other 16S rRNA analysis pipelines

A
  • amplicon sequence variants (ASVs)
  • operational taxonomic units
  • PhyloChips
15
Q

other 16S rRNA pipelines - ASV

A
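  • separates true biological sequences from reads containing sequencing errors by reducing noise (denoising), keeping only the reliable sequences
  • more resolution than OTUs (down to a single-nucleotide difference)
  • can resolve variation within a species (intraspecific)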
16
Q

other 16S rRNA pipelines - OTU

A
  • clusters sequences based on similarity (commonly ≥ 97% identity)
  • shows the general taxonomic variation in a sample
  • the OTU table is built from a set of representative sequences, one per cluster
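A toy sketch of similarity-based clustering, to show how one representative ends up standing in for a whole cluster. It is not what any real OTU picker does: it assumes equal-length sequences, counts matching positions instead of aligning, and `identity` and `greedy_otus` are hypothetical names.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions; 0.0 if lengths differ (no alignment)."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs_by_abundance: List[str],
                threshold: float = 0.97) -> Dict[str, List[str]]:
    """Assign each sequence to the first representative it resembles.

    Input is assumed to be sorted from most to least abundant, so the
    most abundant sequence of each cluster becomes its representative.
    """
    otus: Dict[str, List[str]] = {}
    for seq in seqs_by_abundance:
        for rep in otus:
            if identity(seq, rep) >= threshold:
                otus[rep].append(seq)
                break
        else:
            otus[seq] = [seq]  # no match: seq starts a new OTU
    return otus


# Made-up 10-base sequences; a 0.9 threshold is used so that one
# mismatch in 10 bases still clusters.
toy = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC"]
print(greedy_otus(toy, threshold=0.9))
```

Note how the single-base variant disappears into its representative here, which is the resolution that ASVs keep.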
17
Q

other 16S rRNA pipelines - PhyloChips

A
  • microarray-based rather than sequencing-based
  • the extracted DNA is hybridized to probes on a chip
  • the chip then reports which known groups are present in the sample
  • each chip is designed for 1 type of microbiome, so it works well for environments that are already well known
  • novel (new) organisms cannot be detected because the chip has no probes for them
18
Q

ASV inference using DADA2

A
  1. filter and trim
  2. dereplicate
  3. learn error rates
  4. infer sample composition: denoising
  5. merge F/R reads
  6. construct sequence table
  7. remove chimeras
  8. assign taxonomy
  9. export table
19
Q

ASV inference using DADA2 - filter and trim

A
  • gets the environment ready and reads in the FASTQ files
  • trims and filters the reads so that only good-quality bases (Q > 20) are kept
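The idea of trimming on quality can be sketched as follows. This is not DADA2's own code (DADA2 is an R package); `trim_and_filter`, `min_q`, and `min_len` are hypothetical names, and the rule used here (cut the read at the first base below the threshold, then drop reads that end up too short) is a simplifying assumption.

```python
from typing import Optional, Tuple


def trim_and_filter(sequence: str, quality: str, min_q: int = 20,
                    min_len: int = 4,
                    offset: int = 33) -> Optional[Tuple[str, str]]:
    """Truncate a read at its first low-quality base; drop it if too short.

    Assumes Phred+33 quality encoding. Returns None for a discarded read.
    """
    scores = [ord(ch) - offset for ch in quality]
    cut = len(scores)
    for i, q in enumerate(scores):
        if q < min_q:
            cut = i  # everything from the first bad base onward is removed
            break
    if cut < min_len:
        return None
    return sequence[:cut], quality[:cut]


# Made-up reads: the first loses its two low-quality end bases,
# the second starts with a bad base and is discarded.
print(trim_and_filter("ACGTACGT", "IIIIII##"))
print(trim_and_filter("ACGT", "#III"))
```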
20
Q

ASV inference using DADA2 - dereplicate

A
  • collapses all identical reads into a single unique sequence
  • creates a table of the unique sequences and how many times each one occurred
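Dereplication is essentially counting identical strings. A minimal sketch with made-up reads (the `dereplicate` helper is hypothetical, not a DADA2 function):

```python
from collections import Counter
from typing import List, Tuple


def dereplicate(reads: List[str]) -> List[Tuple[str, int]]:
    """Return (unique sequence, count) pairs, most abundant first."""
    return Counter(reads).most_common()


# Made-up reads for illustration only.
reads = ["ACGT", "ACGT", "ACGA", "ACGT", "TTTT", "ACGA"]
table = dereplicate(reads)
for sequence, count in table:
    print(sequence, count)
```

Working on the table instead of the raw reads means each unique sequence only has to be processed once in the later steps.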
21
Q

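ASV inference using DADA2 - learn error rates

A
  • the most abundant sequences are read the most often, so they also give rise to the most error-containing copies
  • rarer sequences that differ only slightly from an abundant one are therefore likely to be its errors; the error rates are estimated from the data itself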
22
Q

ASV inference using DADA2 - infer sample composition (denoising)

A
  • computational method for removing sequence errors from amplicon reads
  • or identifying the correct biological sequences in the reads
23
Q

ASV inference using DADA2 - merge F/R reads

A
  • some reads might be lost in this step
  • possible reasons: one read of the pair did not pass the quality/error filter, or the forward and reverse reads do not overlap well enough to be joined
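A simplified sketch of merging, assuming the reverse read is the reverse complement of the far end of the amplicon and that the overlap must match exactly. `merge_pair` and `min_overlap` are hypothetical names for this card, and the tiny `min_overlap` default only suits the toy data.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(forward: str, reverse: str,
               min_overlap: int = 4) -> Optional[str]:
    """Join a read pair on the longest exact overlap, or return None.

    The reverse read is flipped to the forward orientation first; a pair
    with no overlap of at least min_overlap bases is lost (None).
    """
    rc = reverse_complement(reverse)
    longest = min(len(forward), len(rc))
    for overlap in range(longest, min_overlap - 1, -1):
        if forward[-overlap:] == rc[:overlap]:
            return forward + rc[overlap:]
    return None


# Made-up 14-base amplicon read from both ends with a 4-base overlap.
amplicon = "ACGTTGCAAGGCTT"
forward_read = amplicon[:9]
reverse_read = reverse_complement(amplicon[5:])
print(merge_pair(forward_read, reverse_read))
print(merge_pair("AAAAAA", "CCCCCC"))  # no overlap, so the pair is lost
```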
24
Q

ASV inference using DADA2 - remove chimeras

A
  • chimeras are sequences that come from 2 different organisms/species
  • sometimes the polymerase does not finish extending and leaves an incomplete sequence
  • in the next PCR cycle the incomplete sequence can act as a primer on a different template and be extended, so the merged (chimeric) sequence gets amplified
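A toy check for two-parent chimeras (bimeras), to make the idea concrete. It assumes all sequences have the same length and need no alignment, which real data would not guarantee; `is_bimera` is a hypothetical helper, not a DADA2 function.

```python
from itertools import permutations
from typing import Sequence


def is_bimera(candidate: str, parents: Sequence[str]) -> bool:
    """True if candidate is the start of one parent joined to the end of another.

    Assumption: equal-length sequences compared position by position.
    """
    others = [p for p in parents
              if p != candidate and len(p) == len(candidate)]
    for left, right in permutations(others, 2):
        for k in range(1, len(candidate)):  # k is the breakpoint
            if candidate[:k] == left[:k] and candidate[k:] == right[k:]:
                return True
    return False


# Made-up parent sequences and a chimera built from their halves.
parent_a = "AAAAAAAACCCCCCCC"
parent_b = "GGGGGGGGTTTTTTTT"
chimera = parent_a[:8] + parent_b[8:]
print(is_bimera(chimera, [parent_a, parent_b]))
print(is_bimera("ACGTACGTACGTACGT", [parent_a, parent_b]))
```

In practice the parents are the more abundant sequences in the sample, since a chimera can only form from templates that are already there.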