Exam IDB Flashcards

1
Q

What is GWAS and how is it used?

A

Genome-wide association studies
A hypothesis-free method that tests variants across the genome to identify alleles that are associated with a phenotype, such as a specific resistance.
Can be run with Scoary.
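
One common way to test whether a gene's presence is associated with a binary phenotype is Fisher's exact test on a 2x2 table. The sketch below is only an illustration of that idea, not Scoary's actual code; the function name `fisher_exact_two_sided` and the toy counts are assumptions made for this example.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table
         [[a, b],   a = gene present & resistant, b = gene present & susceptible
          [c, d]]   c = gene absent  & resistant, d = gene absent  & susceptible
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the observed one
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= observed * (1 + 1e-9))

# Toy example (made-up counts): the gene is present in 8 of 9 resistant isolates
# and in 1 of 9 susceptible isolates.
p_value = fisher_exact_two_sided(8, 1, 1, 8)
print(p_value)
```

A small p-value suggests that the gene and the phenotype are associated; in a real GWAS the p-values would also need correction for multiple testing and for population structure.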

2
Q

What is Prokka in pangenome analysis?

A

Prokka is software that rapidly annotates genomes and identifies coding sequences. Its annotation can be used by Roary in the next step.

3
Q

What is Roary in pangenome analysis?

A

Roary is software used to construct a pangenome from the Prokka annotation. It identifies orthologous groups of genes using a fast clustering algorithm and builds a pangenome matrix that represents the presence or absence of each group in each genome, for downstream analysis of gene function and evolution.
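
A presence/absence matrix like the one described above can be split into core and accessory genes. This is a minimal sketch under stated assumptions: the input layout (a dict from gene group to the set of genomes carrying it), the function name `split_core_accessory`, the `core_fraction` parameter and the toy data are all made up for illustration and are not Roary's file format or API.

```python
def split_core_accessory(presence, genomes, core_fraction=1.0):
    """Split gene groups into core and accessory.

    presence: {gene_group: set of genomes in which the group is present}
    genomes: list of all genome names
    core_fraction: minimum fraction of genomes a core group must be found in
    """
    core, accessory = [], []
    for group, carriers in sorted(presence.items()):
        if len(carriers) / len(genomes) >= core_fraction:
            core.append(group)
        else:
            accessory.append(group)
    return core, accessory

# Toy data (hypothetical isolates and gene groups)
genomes = ["iso1", "iso2", "iso3"]
presence = {
    "gyrA": {"iso1", "iso2", "iso3"},
    "recA": {"iso1", "iso2", "iso3"},
    "blaTEM": {"iso1"},
    "tetA": {"iso1", "iso3"},
}
core, accessory = split_core_accessory(presence, genomes)
print(core, accessory)
```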

4
Q

What is OLC and how does it work?

A

Overlap-layout-consensus: identifies overlaps between reads and joins them together to form a consensus sequence.
Reads are aligned to identify regions of similarity (overlaps), which are used to build an overlap graph. Each read is represented as a node, and overlaps are represented as edges that connect the nodes. The graph is used to identify clusters of reads that might represent the same DNA fragment and to assemble the clusters into contigs.
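
The overlap graph described above can be sketched in a few lines. This is a toy illustration, assuming error-free reads and exact suffix/prefix overlaps; the function names and the greedy merging (which stands in for the real layout and consensus steps) are assumptions made for this example, not how a production assembler works.

```python
def overlap_length(left, right, min_len):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0

def build_overlap_graph(reads, min_len=3):
    """Nodes are reads; an edge (a, b) with weight k means a overlaps b by k bases."""
    graph = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                k = overlap_length(a, b, min_len)
                if k:
                    graph[(a, b)] = k
    return graph

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair with the longest overlap until none is left."""
    contigs = list(reads)
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap_length(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]

reads = ["ATGGCG", "GCGTAC", "TACCTA"]
print(build_overlap_graph(reads))
print(greedy_assemble(reads))
```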

5
Q

What is the ConClave algorithm used for?

A

Resolving multimapping sequences that arise from redundant sequence patterns.
The ConClave scheme can be used when you have reads that map to multiple locations. The read is then assigned to the location/template with the highest-scoring alignment.
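
The idea of assigning each multimapping read to its best template can be sketched as below. This is a simplified illustration and not KMA's implementation: the input layout, the function name `assign_reads` and, in particular, the tie-breaking rule (ties go to the template that collected the most score from unambiguous reads) are assumptions made for this example.

```python
from collections import defaultdict

def assign_reads(read_scores):
    """read_scores: {read_id: {template: alignment_score}} -> {read_id: template}"""
    totals = defaultdict(int)

    def best_templates(scores):
        top = max(scores.values())
        return top, sorted(t for t, s in scores.items() if s == top)

    # Pass 1: reads with a single best template add their score to that template
    for scores in read_scores.values():
        top, winners = best_templates(scores)
        if len(winners) == 1:
            totals[winners[0]] += top

    # Pass 2: every read goes to its highest-scoring template; ties are broken
    # by the template totals from pass 1 (assumed rule, see lead-in)
    assignment = {}
    for read, scores in read_scores.items():
        _, winners = best_templates(scores)
        assignment[read] = max(winners, key=lambda t: totals[t])
    return assignment

# Toy scores (made up): r2 aligns equally well to two near-identical templates
read_scores = {
    "r1": {"blaTEM-1": 50, "blaTEM-2": 40},
    "r2": {"blaTEM-1": 45, "blaTEM-2": 45},
    "r3": {"blaTEM-1": 60},
}
assignment = assign_reads(read_scores)
print(assignment)
```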

6
Q

Modern sequence alignment methods include?

A

Mapping of sequences to reduce the search space.
Chaining of maximal exact matches.
Adjust the expectations of outcome between two groups.
Mapping is the first step of alignment, so that the MEMs (maximal exact matches) can be chained together.

7
Q

Which identification method has the highest discriminatory power?

A

Whole genome sequencing
This analyzes the whole genome, which gives the highest discriminatory power because differences anywhere in the genome, in any gene, can be used to discriminate between isolates.

8
Q

If you cannot find the expected resistance genes, what could explain a resistance phenotype without resistance genes?

A

You may have forgotten to include point mutations.
The strain could be intrinsically resistant to the antibiotic.
When a strain of bacteria is said to be intrinsically resistant to an antibiotic, it means that the bacteria naturally possess mechanisms that prevent the antibiotic from working against them, without the need to acquire additional resistance genes or mutations.
It's a new gene or a new mechanism.

9
Q

What is a replicon?

A

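A molecule of DNA or RNA, or a region of one, that is capable of survival and replication as a unit from a single origin of replication (for example a chromosome or a plasmid).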

10
Q

How can a resistance gene be linked to a plasmid?

A

It's only possible if ResFinder identifies the resistance gene on the same contig as PlasmidFinder identifies the plasmid replicon (THIS REQUIRES THE ANALYSES TO BE RUN ON ASSEMBLED SEQUENCES).
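
The same-contig check can be sketched as a simple lookup. This is a minimal sketch, assuming the hits from both tools have already been reduced to (name, contig) pairs; that input layout, the function name `genes_on_plasmid_contigs` and the toy hits are hypothetical and are not the real output format of ResFinder or PlasmidFinder.

```python
from collections import defaultdict

def genes_on_plasmid_contigs(resistance_hits, replicon_hits):
    """Return {resistance gene: [replicons found on the same contig]}.

    Both inputs are lists of (name, contig) tuples (assumed layout).
    """
    replicons_by_contig = defaultdict(list)
    for replicon, contig in replicon_hits:
        replicons_by_contig[contig].append(replicon)

    linked = {}
    for gene, contig in resistance_hits:
        if contig in replicons_by_contig:
            linked[gene] = sorted(replicons_by_contig[contig])
    return linked

# Toy hits (made up for illustration)
resistance_hits = [("blaCTX-M-15", "contig_7"), ("tet(A)", "contig_2")]
replicon_hits = [("IncFII", "contig_7"), ("IncI1", "contig_12")]
linked = genes_on_plasmid_contigs(resistance_hits, replicon_hits)
print(linked)
```

A gene that is missing from the result is not necessarily chromosomal: the assembly may simply have split the plasmid into several contigs.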

11
Q

What can you use as input to the BEAST program?

A

Aligned amino acid sequences
Aligned nucleotide sequences
Concatenated SNPs

12
Q

After running BEAST, the log file shows a very low ESS (effective sample size) for most of the specified parameters. What can you do?

A

Increase number of BEAST runs
Change prior model
Change clock model

13
Q

What can BEAST not do?

A

Identify recombinations, because BEAST models evolution as a tree-like process, where sequences evolve along a single branching path without any exchange of genetic material between lineages.
Recombination, on the other hand, involves the exchange of genetic material between different lineages, which can result in sequences that do not conform to a simple tree-like pattern of evolution. Therefore, detecting recombination requires methods that can identify and model patterns of genetic exchange between lineages, such as methods based on genetic linkage or network analyses

14
Q

What program do you use for summarising the information from a sample of trees produced by BEAST onto a single “target” tree?

A

TreeAnnotator

15
Q

What is the difference between maximum likelihood and maximum parsimony in phylogeny?

A

Maximum likelihood is the most accurate method but does not provide SNP distances; it is used to find differences within the same species. Maximum parsimony assumes minimum evolution and provides SNP distances, but it underestimates the actual amount of evolution; it can be used for different species.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

How do you infer phylogeny from SNPs?

A

We call SNPs for each isolate using the same reference
Concatenate SNPs into ‘SNP sequences’ one per isolate
Create a tree using a chosen algorithm
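
The concatenation step above can be sketched as follows. This is a minimal illustration, assuming SNP calls are given as 0-based positions on a shared reference and that positions without a call carry the reference base; the function names and the toy data are assumptions made for this example.

```python
def snp_sequences(reference, snp_calls):
    """Build one 'SNP sequence' per isolate.

    reference: reference sequence as a string
    snp_calls: {isolate: {0-based position: alternative base}}
    """
    # Every position that is variable in at least one isolate
    positions = sorted({p for calls in snp_calls.values() for p in calls})
    return {
        isolate: "".join(calls.get(p, reference[p]) for p in positions)
        for isolate, calls in snp_calls.items()
    }

def snp_distance(seq_a, seq_b):
    """Number of positions at which two SNP sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))

# Toy data (made up)
reference = "ACGTACGTAC"
snp_calls = {
    "iso1": {2: "T", 7: "A"},
    "iso2": {2: "T"},
    "iso3": {5: "G"},
}
seqs = snp_sequences(reference, snp_calls)
print(seqs)
print(snp_distance(seqs["iso1"], seqs["iso3"]))
```

The resulting sequences all have the same length, so they can be passed as an alignment to a tree-building method of your choice.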

17
Q

What does the CSI Phylogeny pipeline for raw reads look like?

A

Map reads to reference (BWA is used)
Call all possible SNPs (using samtools)
Filter positions and SNPs using: coverage, quality and z-score.
Prune SNPs. This removes SNPs that lie in close proximity to each other, in order to exclude mobile elements and repeat sequences.
Output is a VCF file (Variant Call Format).
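
The filtering and pruning steps above can be sketched as below. This is only an illustration of the logic, not CSI Phylogeny's code: the record layout, the function name `filter_and_prune`, the threshold values, and the choice to drop both SNPs of a close pair are assumptions made for this example.

```python
def filter_and_prune(snps, min_depth=10, min_qual=30, min_z=1.96, min_spacing=10):
    """Return positions of SNPs that pass the filters and are not close to another SNP.

    snps: list of dicts with keys 'pos', 'depth', 'qual' and 'z' (assumed layout)
    """
    # Filter on coverage, quality and z-score
    passed = sorted(
        (s for s in snps
         if s["depth"] >= min_depth and s["qual"] >= min_qual and s["z"] >= min_z),
        key=lambda s: s["pos"],
    )
    # Prune: drop SNPs that have a neighbour closer than min_spacing bases
    kept = []
    for i, snp in enumerate(passed):
        near_prev = i > 0 and snp["pos"] - passed[i - 1]["pos"] < min_spacing
        near_next = i + 1 < len(passed) and passed[i + 1]["pos"] - snp["pos"] < min_spacing
        if not (near_prev or near_next):
            kept.append(snp["pos"])
    return kept

# Toy SNP calls (made up)
snps = [
    {"pos": 100, "depth": 30, "qual": 50, "z": 3.0},   # too close to 105
    {"pos": 105, "depth": 25, "qual": 45, "z": 2.5},   # too close to 100
    {"pos": 400, "depth": 5, "qual": 50, "z": 3.0},    # low coverage
    {"pos": 800, "depth": 40, "qual": 20, "z": 3.0},   # low quality
    {"pos": 1200, "depth": 35, "qual": 60, "z": 1.0},  # low z-score
    {"pos": 2000, "depth": 50, "qual": 60, "z": 4.0},  # kept
]
kept = filter_and_prune(snps)
print(kept)
```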

18
Q

What does the CSI Phylogeny pipeline for assemblies look like?

A

NUCmer (part of MUMmer) aligns all contigs to the reference to find SNPs
Pruning
It is preferred to use raw reads because then we can validate the SNPs that we are calling.

19
Q

What is pMLST (plasmid multilocus sequence typing)?

A

A method to type plasmids in order to characterize bacterial isolates. The database holds the plasmid alleles and can give you the sequence type (ST) of the plasmid. For this analysis we need to know the plasmid type, so PlasmidFinder should be run first.

20
Q

What is the purpose of PlasmidFinder?

A

PlasmidFinder does not look at the entire plasmid, only at plasmid replicons. If we know the replicons of a plasmid, we know which type of plasmid we are working with and whether isolates have the same plasmid. (Input is a FASTA file, which is compared against a database of plasmid replicon sequences; output is a TSV file.)

21
Q

What is the purpose of ResFinder?

A

Resistance gene detection: it shows which bacteria are predicted to be antibiotic resistant and which exact resistance phenotype they have. The program can give you the gene, class of gene or genome. It can detect whole resistance genes and chromosomal point mutations causing resistance in the whole genome sequence.

22
Q

What is KMA?

A

KMA is a mapping method designed to map raw reads directly against redundant databases in an ultra-fast manner, using seed and extend. KMA is particularly good at aligning high-quality reads against highly redundant databases, where unique matches often do not exist.

23
Q

What is chaining?

A

MEMs (maximal exact matches) are likely to belong together and to produce high-quality alignments, so they can be chained together.
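
Chaining can be sketched as a small dynamic program over match anchors. This is a minimal sketch, assuming each MEM is given as (query start, reference start, length) and that a chain is scored simply by its total matched length with no gap penalty; the function name `chain_anchors`, the scoring and the toy anchors are assumptions made for this example, not the scheme of any particular aligner.

```python
def chain_anchors(anchors):
    """Return the collinear chain of anchors with the largest total matched length.

    anchors: list of (query_start, ref_start, length) tuples
    """
    if not anchors:
        return []
    anchors = sorted(anchors)
    best = [length for _, _, length in anchors]  # best chain score ending at anchor i
    prev = [None] * len(anchors)                 # back-pointers for the traceback
    for i, (qi, ri, li) in enumerate(anchors):
        for j in range(i):
            qj, rj, lj = anchors[j]
            # Anchor j may precede anchor i only if it ends before i starts
            # on both the query and the reference
            if qj + lj <= qi and rj + lj <= ri and best[j] + li > best[i]:
                best[i] = best[j] + li
                prev[i] = j
    # Trace back from the best-scoring chain end
    i = max(range(len(anchors)), key=best.__getitem__)
    chain = []
    while i is not None:
        chain.append(anchors[i])
        i = prev[i]
    return chain[::-1]

# Toy anchors (made up): the third one points to a distant reference location
anchors = [(0, 100, 20), (25, 125, 15), (30, 500, 30), (45, 145, 25)]
chain = chain_anchors(anchors)
print(chain)
```

In the toy data the anchor at reference position 500 is left out, because the three collinear anchors around positions 100 to 170 together cover more matched bases.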