P4: ASV Inference Flashcards

Question 1

Q

generally explain the 16S rRNA workflow

Answer

A

sample collection –> DNA extraction –> library preparation –> sequencing

Question 2

Q

16S rRNA workflow - DNA extraction

Answer

A

extract nucleic acid
can be done with RNA as well

Question 3

Q

16S rRNA workflow: DNA extraction - why 16S

Answer

A

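16S rRNA is the RNA component of the small ribosomal subunit, where it combines with proteins; it is present in all bacteria and archaea, and related genes are found in mitochondria and chloroplasts
it has conserved regions (for primers) and hypervariable regions (for telling taxa apart), which makes it a widely used marker for identifying bacteria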

Question 4

Q

16S rRNA workflow - library prep

Answer

A

PCR 1 and PCR 2, each followed by its own clean-up stage (clean up 1 and 2)
amplifying targets via primers
result is the final library

Question 5

Q

16S rRNA workflow: library prep - PCR1

Answer

A

region specific
amplifies specific hypervariable regions
has required primer overhangs
goes on to clean up 1

Question 6

Q

16S rRNA workflow: library prep - PCR2

Answer

A

indexing
2nd amplification
adds barcodes/indexes (to identify which sample each sequence came from)
adds sequencing adaptors
needs a different primer than PCR1 and will go on to clean up 2

Question 7

Q

16S rRNA workflow: library prep - final library

Answer

A

has the adaptor sequences necessary for sequencing
from left to right: priming site for the sequencing reaction, library index, and flowcell handle
will go on to sequencing

Question 8

Q

16S rRNA workflow - how are clean up 1 and 2 done

Answer

A

based on magnetic beads

Question 9

Q

how are sequencing results shown

Answer

A

Fastq files
they are a text-based format that contains the nucleotide sequence and its corresponding quality scores
every 4 lines represent 1 specific sequence (read)

Question 10

Q

Fastq files - how to read them

Answer

A

each sequence is stored as 4 lines of information
1. header
2. sequence results
3. base and Q (separator line starting with “+”)
4. Q scores
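
The four-line layout above can be sketched in Python. This is a minimal illustration, not a full FASTQ parser: the names `FastqRecord`, `parse_fastq`, and the two example reads are hypothetical, and the sketch assumes each record occupies exactly four lines with a separator line that starts with "+".

```python
from typing import Iterator, NamedTuple


class FastqRecord(NamedTuple):
    header: str     # line 1, without the leading "@"
    sequence: str   # line 2, the called bases
    separator: str  # line 3, starts with "+"
    quality: str    # line 4, one ASCII character per base


def parse_fastq(text: str) -> Iterator[FastqRecord]:
    """Yield one record for every group of four non-empty lines."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of 4 lines")
    for i in range(0, len(lines), 4):
        header, seq, sep, qual = lines[i:i + 4]
        if not header.startswith("@") or not sep.startswith("+"):
            raise ValueError(f"malformed record number {i // 4 + 1}")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield FastqRecord(header[1:], seq, sep, qual)


# Made-up example data: two short reads (assumption, not real output).
EXAMPLE = """@read1
ACGT
+
IIII
@read2
GGCA
+
II5#
"""

records = list(parse_fastq(EXAMPLE))
for rec in records:
    print(rec.header, rec.sequence, rec.quality)
```

Checking that the sequence and quality lines have the same length is a cheap way to catch a file that has slipped out of its four-line rhythm.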

Question 11

Q

Fastq files - header

Answer

A

starts with “@” symbol
has the sequence identifier assigned by the sequencer, which may include the sample barcode/index

Question 12

Q

Fastq files - Base and Q

Answer

A

line 3 of each sequence entry; begins with a “+” and separates the bases (line 2) from their Q scores (line 4)
may optionally repeat the header text after the “+”

Question 13

Q

Fastq files - Q scores

Answer

A

shown as ASCII characters
shows how reliable every sequenced nucleotide is
numbered from 0 to 40
40: reliable
<= 20: unreliable
expect more errors and a drop in quality toward the end of a read
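
As a rough illustration of how the ASCII characters map to Q scores, the sketch below assumes the common Phred+33 encoding (score = ASCII code minus 33) and the standard relation P(error) = 10^(-Q/10). The function names and the example quality string are hypothetical.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Convert a quality string to Q scores (assumes Phred+33 by default)."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q: int) -> float:
    """Probability that a base call with score q is wrong."""
    return 10 ** (-q / 10)


scores = phred_scores("II5#")
print(scores)  # [40, 40, 20, 2]
for q in scores:
    print(q, error_probability(q))
```

Under this encoding, Q40 corresponds to about 1 error in 10,000 base calls and Q20 to about 1 in 100, which is why 20 is often used as a cut-off.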

Question 14

Q

what are other (non-sequencing) 16S rRNA pipelines

Answer

A

amplicon sequence variants (ASVs)
operational taxonomic units (OTUs)
PhyloChips

Question 15

Q

other 16S rRNA pipelines - ASV

Answer

A

distinguishes true biological sequences from rogue amplicons by removing the noise (denoising) made by sequencing errors and keeping the reliable sequences
more resolution than OTUs
can resolve intraspecific (within-species) variation

Question 16

Q

other 16S rRNA pipelines - OTU

Answer

A

clusters sequences based on similarity
shows general variation of taxonomy
the OTU table is built from a set of representative sequences
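
Similarity-based clustering can be sketched as a greedy procedure: each sequence joins the first cluster whose representative it resembles closely enough, otherwise it starts a new cluster. This is a simplified illustration, not the algorithm of any specific OTU tool. It assumes equal-length sequences compared position by position, and the default 97% threshold is a commonly used convention rather than something stated on the card. All names and sequences are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions (assumes equal-length sequences)."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_otus(sequences: list[str], threshold: float = 0.97) -> dict[str, list[str]]:
    """Greedy clustering: map each representative to its cluster members."""
    clusters: dict[str, list[str]] = {}
    for seq in sequences:
        for representative in clusters:
            if identity(seq, representative) >= threshold:
                clusters[representative].append(seq)
                break
        else:
            # No representative was similar enough: start a new OTU.
            clusters[seq] = [seq]
    return clusters


# Short made-up sequences, so a looser threshold is used for the demo.
seqs = ["ACGTACGTAC", "ACGTACGTAT", "TTTTACGTAC"]
otus = cluster_otus(seqs, threshold=0.9)
print(otus)
```

The keys of the returned dictionary play the role of the representative set from which an OTU table would be built.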

Question 17

Q

other 16S rRNA pipelines - PhyloChips

Answer

A

non-PCR
extracted DNA is hybridized to probes on a chip
the chip then reports every grouping it detects within a sample
designed for 1 type of microbiome, so it is good for environments that are already well known
novel (new) organisms cannot be detected

Question 18

Q

ASV inference using DADA2

Answer

A

filter and trim
dereplicate
learn error rates
infer sample composition: denoising
merge F/R reads
construct sequence table
remove chimeras
assign taxonomy
export table

Question 19

Q

ASV inference using DADA2 - filter and trim

Answer

A

gets the environment ready
reads the Fastq files and keeps only sequence with quality > 20
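
One way to picture a quality > 20 filter is to cut each read at the first base whose score is 20 or lower and then discard reads that end up too short. The sketch below is an assumption-laden illustration, not DADA2's implementation: DADA2's own filtering step has its own parameters, which are not reproduced here. Phred+33 encoding, the function names, the `min_len` cut-off, and the example reads are all hypothetical.

```python
def truncate_at_low_quality(seq: str, qual: str, min_q: int = 20, offset: int = 33):
    """Cut the read at the first base whose Q score is <= min_q."""
    for i, ch in enumerate(qual):
        if ord(ch) - offset <= min_q:
            return seq[:i], qual[:i]
    return seq, qual


def filter_and_trim(reads, min_q: int = 20, min_len: int = 3):
    """Trim every (sequence, quality) pair, then drop reads shorter than min_len."""
    kept = []
    for seq, qual in reads:
        trimmed_seq, trimmed_qual = truncate_at_low_quality(seq, qual, min_q)
        if len(trimmed_seq) >= min_len:
            kept.append((trimmed_seq, trimmed_qual))
    return kept


# Made-up reads: "I" encodes Q40, "5" encodes Q20, "#" encodes Q2.
reads = [
    ("ACGTAC", "IIIIII"),  # all high quality, kept whole
    ("ACGTAC", "III5II"),  # trimmed to the first 3 bases
    ("ACGTAC", "I#IIII"),  # trimmed to 1 base, then dropped
]
kept = filter_and_trim(reads)
print(kept)
```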

Question 20

Q

ASV inference using DADA2 - dereplicate

Answer

A

collapses all identical sequences into a single unique sequence
creates a table of each unique sequence and how many times it occurs
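
Dereplication amounts to counting identical sequences. A minimal sketch using Python's standard `collections.Counter` follows; the function name `dereplicate` and the example reads are hypothetical, and this is not DADA2's own code.

```python
from collections import Counter


def dereplicate(sequences):
    """Return (unique sequence, count) pairs, most abundant first."""
    return Counter(sequences).most_common()


# Made-up reads for illustration.
reads = ["ACGT", "ACGT", "ACGA", "ACGT", "TTGA", "ACGA"]
table = dereplicate(reads)
print(table)  # [('ACGT', 3), ('ACGA', 2), ('TTGA', 1)]
```

Six reads collapse to three unique sequences, so later steps only need to process each distinct sequence once while still knowing its abundance.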

Question 21

Q

ASV inference using DADA2 - learn error rates

Answer

A

the most abundant sequences are the most likely to give rise to erroneous copies
less abundant sequences that closely resemble an abundant one may be copies of it that carry errors

Question 22

Q

ASV inference using DADA2 - infer sample composition (denoising)

Answer

A

computational method for removing sequence errors from amplicon reads
or identifying the correct biological sequences in the reads

Question 23

Q

ASV inference using DADA2 - merge F/R reads

Answer

A

might lose some reads in this step
possible reason: one of the reads may not have passed the quality/error score

Question 24

Q

ASV inference using DADA2 - remove chimeras

Answer

A

chimeras are sequences that come from 2 different organisms/species
sometimes the polymerase stops before it finishes extending, leaving an incomplete sequence; in a later cycle this fragment binds a different template and is extended, merging the two sequences
the following PCR cycles then amplify the merged sequence
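
A two-parent chimera can be pictured as the left part of one parent joined to the right part of another. The sketch below checks for exactly that pattern. It is a toy illustration, not DADA2's chimera removal: it assumes equal-length sequences, exact matches, and a known list of candidate parents, and the name `is_bimera` and all sequences are hypothetical.

```python
def is_bimera(candidate: str, parents: list[str]) -> bool:
    """True if candidate equals a prefix of one parent plus a suffix of another.

    Assumes all sequences have the same length and match exactly.
    """
    n = len(candidate)
    others = [p for p in parents if p != candidate and len(p) == n]
    for left in others:
        for right in others:
            if left == right:
                continue
            for cut in range(1, n):
                if left[:cut] == candidate[:cut] and right[cut:] == candidate[cut:]:
                    return True
    return False


# Made-up parents and candidates.
parent_a = "AAAAAAAA"
parent_b = "CCCCCCCC"
chimera = "AAAACCCC"   # first half of parent_a, second half of parent_b
unrelated = "AAAAGGGG"  # second half matches neither parent

print(is_bimera(chimera, [parent_a, parent_b]))
print(is_bimera(unrelated, [parent_a, parent_b]))
```

In practice the candidate parents would be the more abundant sequences in the sample, since a chimera can only form from templates that were already present.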

(24 cards)