Exam IDB Flashcards

Question 1

Q

What is GWAS and how is it used?

Answer

A

Genome-wide association study (GWAS)
A hypothesis-free method that tests variants across the genome to identify alleles that are associated with a phenotype, such as resistance to a specific antibiotic
In pangenome analysis it can be run with Scoary
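
The association idea can be sketched as a one-sided Fisher's exact test on a single gene. This is only an illustration, not how Scoary is implemented; the isolate data and the function name `fisher_exact_greater` are made up for the example.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for a 2x2 table.

    a: gene present, resistant     b: gene present, susceptible
    c: gene absent, resistant      d: gene absent, susceptible
    Returns the probability of seeing a or more isolates in the first cell
    if gene and phenotype were independent (hypergeometric tail).
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, col1)
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        p += comb(row1, k) * comb(n - row1, col1 - k) / total
    return p

# Hypothetical isolates: (gene present?, resistant?)
isolates = [(1, 1), (1, 1), (1, 1), (1, 1), (0, 0), (0, 0), (0, 0), (0, 1)]
a = sum(1 for g, r in isolates if g and r)
b = sum(1 for g, r in isolates if g and not r)
c = sum(1 for g, r in isolates if not g and r)
d = sum(1 for g, r in isolates if not g and not r)
p_value = fisher_exact_greater(a, b, c, d)
print(a, b, c, d, p_value)
```

A real GWAS repeats such a test for every variant or gene and then corrects for multiple testing and population structure, which this sketch leaves out.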

Question 2

Q

What is Prokka in pangenome analysis?

Answer

A

Prokka is software that rapidly annotates genes and identifies coding sequences; its annotation can be used by Roary in the next step

Question 3

Q

What is Roary in pangenome analysis?

Answer

A

Roary is software used to construct a pangenome based on the annotation from Prokka. It identifies orthologous groups of genes using a fast clustering algorithm and builds a pangenome matrix representing the presence or absence of each group in each genome, which is used for downstream analysis of gene function and evolution
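
A presence/absence matrix of this kind can be sketched in a few lines. The gene clusters and genome names below are hypothetical, the layout is not Roary's output format, and "core" is simplified here to mean present in every genome.

```python
# Hypothetical gene clusters: cluster name -> set of genomes carrying a member.
clusters = {
    "gyrA": {"g1", "g2", "g3"},
    "blaTEM": {"g1"},
    "tetA": {"g2", "g3"},
}
genomes = sorted({g for members in clusters.values() for g in members})

# Presence/absence matrix: one row per cluster, one column per genome.
matrix = {
    name: [1 if g in members else 0 for g in genomes]
    for name, members in clusters.items()
}

# Simplified split: core = present in all genomes, accessory = everything else.
core = [name for name, row in matrix.items() if all(row)]
accessory = [name for name, row in matrix.items() if not all(row)]
print(genomes, matrix, core, accessory)
```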

Question 4

Q

What is OLC and how does it work?

Answer

A

Overlap-layout-consensus (OLC): identifies overlaps between reads and joins them together to form a consensus sequence
Reads are aligned to each other to identify regions of similarity (overlaps), which are used to build an overlap graph. Each read is represented as a node, and overlaps are represented as edges that connect the nodes. The graph is used to identify clusters of reads that might represent the same DNA fragment and to assemble the clusters into contigs
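
A toy version of the overlap and layout steps, assuming error-free reads and exact suffix/prefix overlaps. The function names and reads are invented for the example, and the consensus step (correcting errors from stacked reads) is left out.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0

def overlap_graph(reads, min_len=3):
    """Edges (i, j) -> overlap length for every ordered pair of distinct reads."""
    edges = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                n = overlap(a, b, min_len)
                if n:
                    edges[(i, j)] = n
    return edges

def greedy_layout(reads, edges, start=0):
    """Walk from `start`, always following the largest unused outgoing overlap."""
    contig, current, seen = reads[start], start, {start}
    while True:
        options = [(n, j) for (i, j), n in edges.items() if i == current and j not in seen]
        if not options:
            return contig
        n, j = max(options)
        contig += reads[j][n:]  # append only the part that does not overlap
        seen.add(j)
        current = j

reads = ["ATGGCGT", "GCGTACG", "TACGTTA"]
edges = overlap_graph(reads)
contig = greedy_layout(reads, edges)
print(edges, contig)
```

Real assemblers use approximate alignment for the overlaps and much more careful graph simplification; the greedy walk is only the simplest possible layout.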

Question 5

Q

What is the ConClave algorithm used for?

Answer

A

Resolving multimapping reads that arise from redundant sequence patterns
The ConClave scheme can be used when you have reads that map to multiple locations. Each such read is then assigned to the location/template with the highest score.
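
A simplified two-pass illustration of resolving multimapping reads. The tie-breaking rule used here (ties go to the template that collected the highest score from uniquely mapping reads) is an assumption made for the sketch and is not taken from KMA's source code; the reads, templates and scores are hypothetical.

```python
from collections import Counter

# Hypothetical alignment scores: read -> {template: score}
read_hits = {
    "read1": {"tmplA": 50},
    "read2": {"tmplA": 48},
    "read3": {"tmplA": 45, "tmplB": 45},  # tie between two redundant templates
    "read4": {"tmplB": 40},
}

def best_templates(hits):
    """All templates sharing the top alignment score for one read."""
    top = max(hits.values())
    return sorted(t for t, s in hits.items() if s == top)

# Pass 1: score templates using only reads with a single best template.
template_score = Counter()
for hits in read_hits.values():
    best = best_templates(hits)
    if len(best) == 1:
        template_score[best[0]] += hits[best[0]]

# Pass 2: assign every read; ties go to the template with the highest pass-1 score.
assignment = {}
for read, hits in read_hits.items():
    best = best_templates(hits)
    assignment[read] = max(best, key=lambda t: template_score[t])
print(assignment)
```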

Question 6

Q

What do modern sequence alignment methods include?

Answer

A

Mapping of sequences to reduce the search space.
Chaining of maximal exact matches.
Adjust the expectations of outcome between two groups.
Mapping is the first step of alignment, so that the MEMs (maximal exact matches) can be chained together

Question 7

Q

Which identification method has the highest discriminatory power?

Answer

A

Whole genome sequencing
It analyzes the whole genome and therefore has the highest discriminatory power, because isolates can be distinguished by differences in any gene.

Question 8

Q

You cannot find the expected resistance genes. What could explain the lack of resistance genes in an isolate with a resistant phenotype?

Answer

A

You may have forgotten to include point mutations in the analysis
The strain could be intrinsically resistant to the antibiotic.
When a strain of bacteria is said to be intrinsically resistant to an antibiotic, it means that the bacteria naturally possess mechanisms that prevent the antibiotic from working against them, without the need to acquire additional resistance genes or mutations.
It could be a new gene or a new mechanism

Question 9

Q

What is a replicon?

Answer

A

A molecule of DNA or RNA that can be maintained and replicated as a unit, for example a chromosome or a plasmid

Question 10

Q

How can a resistance gene be linked to a plasmid?

Answer

A

It is only possible if ResFinder identifies the resistance gene on the same contig on which PlasmidFinder identifies the plasmid replicon (THIS REQUIRES THAT THE ANALYSES ARE RUN ON ASSEMBLED SEQUENCES)
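
The contig comparison can be sketched as a simple join on the contig name. The hit lists below are hypothetical and hand-written; the actual output formats of the two tools differ and would need to be parsed first.

```python
# Hypothetical hits from the two tools: (hit name, contig id).
resfinder_hits = [("blaCTX-M-15", "contig_7"), ("tet(A)", "contig_2")]
plasmidfinder_hits = [("IncFII", "contig_7"), ("IncI1", "contig_12")]

# Group replicons by the contig they were found on.
replicons_by_contig = {}
for replicon, contig in plasmidfinder_hits:
    replicons_by_contig.setdefault(contig, []).append(replicon)

# A gene is linked to a plasmid only if a replicon sits on the same contig.
linked = {
    gene: replicons_by_contig[contig]
    for gene, contig in resfinder_hits
    if contig in replicons_by_contig
}
print(linked)
```

Genes on contigs without a replicon stay unresolved in this sketch: they may be chromosomal, or on a plasmid whose replicon ended up on another contig.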

Question 11

Q

What can you use as input to the BEAST program?

Answer

A

Aligned amino acid sequences
Aligned nucleotide sequences
Concatenated SNPs

Question 12

Q

After running BEAST, the log file shows a very low ESS (effective sample size) for most of the specified parameters. What can you do?

Answer

A

Increase number of BEAST runs
Change prior model
Change clock model
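
The ESS is low when successive MCMC samples are strongly autocorrelated. A minimal sketch of one common way to estimate it from a parameter trace is shown below; the truncation rule is an assumption, and this is not necessarily the estimator used by BEAST's companion tools.

```python
def effective_sample_size(samples):
    """ESS = N / (1 + 2 * sum of autocorrelations).

    The sum is truncated at the first non-positive autocorrelation,
    a simple rule assumed here for illustration.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    if var == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum((samples[t] - mean) * (samples[t + lag] - mean)
                  for t in range(n - lag)) / n
        rho = cov / var
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)

well_mixed = [0, 1] * 50         # no positive autocorrelation
slow_drift = list(range(100))    # each sample is very close to the previous one
print(effective_sample_size(well_mixed), effective_sample_size(slow_drift))
```

The drifting trace has the same number of samples as the well-mixed one but a much smaller ESS, which is the situation the card describes.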

Question 13

Q

What can BEAST not do?

Answer

A

Identify recombination, because BEAST models evolution as a tree-like process, where sequences evolve along a single branching path without any exchange of genetic material between lineages.
Recombination, on the other hand, involves the exchange of genetic material between different lineages, which can result in sequences that do not conform to a simple tree-like pattern of evolution. Therefore, detecting recombination requires methods that can identify and model patterns of genetic exchange between lineages, such as methods based on genetic linkage or network analyses.

Question 14

Q

What program do you use for summarising the information from a sample of trees produced by BEAST onto a single “target” tree?

Answer

A

TreeAnnotator

Question 15

Q

What is the difference between maximum likelihood and maximum parsimony in phylogeny?

Answer

A

Maximum likelihood is the most accurate method. It does not provide SNP distances, so it is used for finding differences within the same species. Maximum parsimony looks for the tree that requires the fewest changes (minimum evolution). It provides SNP distances but underestimates the actual amount of evolution, and it can be used for different species
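
The "fewest changes" criterion of maximum parsimony can be illustrated with Fitch's algorithm on a single alignment column. The trees and bases below are made up; a real analysis sums the score over all sites and searches over many tree topologies.

```python
def fitch(tree, states):
    """Minimum number of changes at one site on a rooted binary tree.

    tree: a leaf name (str) or a pair (left, right) of subtrees.
    states: leaf name -> observed base.
    Returns (set of possible ancestral states, number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, left_cost = fitch(tree[0], states)
    right, right_cost = fitch(tree[1], states)
    common = left & right
    if common:
        return common, left_cost + right_cost
    return left | right, left_cost + right_cost + 1

tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
site = {"A": "G", "B": "G", "C": "T", "D": "T"}
changes_1 = fitch(tree_1, site)[1]
changes_2 = fitch(tree_2, site)[1]
print(changes_1, changes_2)
```

Parsimony prefers the first tree because it explains the column with fewer changes. Maximum likelihood would instead score each tree by the probability of the data under an explicit substitution model.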

Question 16

Q

How do you infer phylogeny from SNPs?

Answer

A

Call SNPs for each isolate using the same reference
Concatenate the SNPs into ‘SNP sequences’, one per isolate
Create a tree using a chosen algorithm
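
The first two steps can be sketched as follows, ending with the pairwise SNP distances that a tree-building algorithm could take as input. The positions, bases and isolate names are hypothetical.

```python
# Hypothetical SNP calls against one shared reference: isolate -> {position: base}.
reference = {101: "A", 250: "C", 777: "G", 900: "T"}
calls = {
    "iso1": {101: "G"},
    "iso2": {101: "G", 777: "A"},
    "iso3": {250: "T", 900: "C"},
}

# Keep every position that is variable in at least one isolate.
positions = sorted({pos for snps in calls.values() for pos in snps})

# One 'SNP sequence' per isolate; positions without a call keep the reference base.
snp_seqs = {
    iso: "".join(snps.get(pos, reference[pos]) for pos in positions)
    for iso, snps in calls.items()
}

def snp_distance(a, b):
    """Number of positions at which two SNP sequences differ."""
    return sum(x != y for x, y in zip(a, b))

distances = {
    (i, j): snp_distance(snp_seqs[i], snp_seqs[j])
    for i in snp_seqs for j in snp_seqs if i < j
}
print(snp_seqs, distances)
```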

Question 17

Q

What does the CSI Phylogeny pipeline for raw reads look like?

Answer

A

Map reads to the reference (BWA is used)
Call all possible SNPs (using SAMtools)
Filter positions and SNPs using coverage, quality and z-score.
Prune SNPs. This removes SNPs that lie in close proximity to each other, in order to exclude mobile elements and repeat sequences.
The output is a VCF file (Variant Call Format)
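
The filtering and pruning steps can be sketched on a list of candidate SNPs. The thresholds, the tuple layout and the choice to drop both members of a close pair are assumptions for illustration, not CSI Phylogeny's actual settings, and the z-score filter is left out.

```python
def filter_and_prune(snps, min_depth=10, min_qual=30, min_dist=10):
    """snps: list of (position, depth, quality) in any order.

    Drops low-depth and low-quality calls, then drops every remaining SNP
    that has a neighbour closer than min_dist bases.
    """
    kept = sorted(s for s in snps if s[1] >= min_depth and s[2] >= min_qual)
    pruned = []
    for i, (pos, depth, qual) in enumerate(kept):
        too_close_prev = i > 0 and pos - kept[i - 1][0] < min_dist
        too_close_next = i + 1 < len(kept) and kept[i + 1][0] - pos < min_dist
        if not (too_close_prev or too_close_next):
            pruned.append(pos)
    return pruned

snps = [(100, 40, 60), (104, 35, 55), (300, 5, 60), (450, 50, 20), (800, 42, 58)]
kept_positions = filter_and_prune(snps)
print(kept_positions)
```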

Question 18

Q

What does the CSI Phylogeny pipeline for assemblies look like?

Answer

A

NUCmer (part of MUMmer) aligns all contigs to the reference to find SNPs
Pruning
It is preferred to use raw reads because then we can validate the SNPs that we are calling.

Question 19

Q

What is pMLST (plasmid multilocus sequence typing)?

Answer

A

A method that types plasmids in order to characterize bacterial isolates. The database contains the plasmid alleles and can give you the sequence type (ST) of the plasmid. For this analysis we need to know the plasmid type, so PlasmidFinder should be run first
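
The lookup from allele numbers to a sequence type can be sketched as a table lookup. The locus names, allele numbers and ST labels below are invented placeholders and do not come from any real pMLST scheme.

```python
# Hypothetical scheme: locus order and known allele profiles -> sequence type.
loci = ("locus1", "locus2", "locus3")
profiles = {
    (1, 2, 1): "ST3",
    (1, 2, 4): "ST7",
}

def plasmid_st(alleles):
    """alleles: locus -> allele number found in the isolate."""
    key = tuple(alleles.get(locus) for locus in loci)
    return profiles.get(key, "unknown ST")

print(plasmid_st({"locus1": 1, "locus2": 2, "locus3": 4}))
```

Because each plasmid type has its own scheme and loci, the scheme has to be chosen before the lookup, which is why the replicon typing comes first.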

Question 20

Q

What is the purpose of PlasmidFinder?

Answer

A

PlasmidFinder does not use the entire plasmid but detects plasmid replicons. If we know the replicons of a plasmid, we know which type of plasmid we are working with and whether isolates have the same plasmid. (Input is a FASTA file, which is compared against a database of plasmid replicon sequences; output is a TSV file)

Question 21

Q

What is the purpose of ResFinder?

Answer

A

Detection of resistance genes, showing which bacteria are predicted to be antibiotic resistant and which resistance phenotype is expected. The program can give you the gene, class of gene or genome. It can detect whole resistance genes as well as chromosomal point mutations that cause resistance in the whole genome sequence

Question 22

Q

What is KMA?

Answer

A

KMA (k-mer alignment) is a mapping method designed to map raw reads directly against redundant databases in an ultra-fast manner, using seed and extend. KMA is particularly good at aligning high-quality reads against highly redundant databases, where unique matches often do not exist.
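
A toy seed-and-extend mapper shows the general idea: look up a k-mer from the read in an index of the templates, then extend from each seed and score the match. The function names and sequences are made up, the extension here is ungapped, and KMA's real seeding and alignment are considerably more elaborate.

```python
from collections import defaultdict

def build_index(templates, k):
    """Map every k-mer to the (template, position) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in templates.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index

def seed_and_extend(read, templates, index, k):
    """Seed with the read's first k-mer, then extend without gaps and count matches."""
    best = None
    for name, pos in index.get(read[:k], []):
        window = templates[name][pos:pos + len(read)]
        score = sum(a == b for a, b in zip(read, window))
        if best is None or score > best[2]:
            best = (name, pos, score)
    return best

# Two nearly identical (redundant) templates that differ at one base.
templates = {"geneA": "ACGTTGCAAGGCTA", "geneB": "ACGTTGCATGGCTA"}
index = build_index(templates, k=4)
hit = seed_and_extend("TTGCATGG", templates, index, k=4)
print(hit)
```

The seed alone matches both templates, so only the extension step separates them. A read that does not cover the differing base would tie between the two, which is the redundancy problem described on the card.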

Question 23

Q

What is chaining?

Answer

A

Chaining links MEMs (maximal exact matches) that are likely to belong together: because they are expected to produce a high-quality alignment, they can be chained together.
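
Chaining can be sketched as a small dynamic program that picks the best set of MEMs appearing in the same order in both the query and the reference. The scoring (total matched length, no gap penalty) and the example coordinates are assumptions made for this sketch.

```python
def chain_mems(mems):
    """Best collinear chain of MEMs.

    Each MEM is (query_start, ref_start, length). A MEM may follow another
    only if it starts after the previous one ends in both query and reference.
    The chain score is the total matched length.
    """
    if not mems:
        return []
    mems = sorted(mems)
    score = [length for _, _, length in mems]
    prev = [None] * len(mems)
    for j, (qj, rj, lj) in enumerate(mems):
        for i in range(j):
            qi, ri, li = mems[i]
            fits = qi + li <= qj and ri + li <= rj
            if fits and score[i] + lj > score[j]:
                score[j] = score[i] + lj
                prev[j] = i
    # Trace back from the highest-scoring MEM.
    j = max(range(len(mems)), key=lambda idx: score[idx])
    chain = []
    while j is not None:
        chain.append(mems[j])
        j = prev[j]
    return chain[::-1]

# Three MEMs near the same diagonal plus one spurious match far away in the reference.
mems = [(0, 100, 10), (12, 112, 8), (15, 500, 5), (25, 125, 6)]
chain = chain_mems(mems)
print(chain)
```

The spurious match at reference position 500 is left out of the chain because it cannot be combined with the other MEMs in a consistent order.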

(23 cards)