Bioinformatics Tools Flashcards

1
Q

Which bioinformatics tool for predicting pathogenic variants uses the Human Phenotype Ontology (HPO)?

A

Exomiser

2
Q

What tool can be used to identify phenotype ontology terms using the semantic relationships between clinical features?

A

Phenomizer

3
Q

What is FastQC?

A

FastQC is a bioinformatics quality control (QC) tool. It produces a QC report that helps spot problems originating either in the sequencer or in the starting library material.

4
Q

What does bcl2fastq do?

A

BCL files hold the per-cycle base calls from Illumina sequencing. They contain the base call and quality for each tile in each cycle. Illumina's bcl2fastq software converts BCL files to FASTQ files and demultiplexes the reads in the process. Other platforms have different types of raw data.

5
Q

What does demultiplexing do and what is needed as input?

A

Multiplexing allows multiple samples to be run simultaneously on the same lane of a flowcell.
Each sample is given a unique index tag (barcode).
The sample sheet (.csv file) contains details of the run, including the samples and their tags.
Demultiplexing uses these tags to sort the reads into separate FASTQ files for each sample.
The bcl2fastq software performs demultiplexing; it needs the run's BCL data and the sample sheet as input.
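
The sorting step can be sketched in a few lines of Python. This is only an illustration of the idea, not how bcl2fastq is implemented: the function name, the (tag, sequence) read format and the dictionary standing in for the sample sheet are all assumptions, and tags must match exactly here.

```python
from collections import defaultdict

def demultiplex(reads, sample_sheet):
    """Group reads by sample using each read's index tag.

    reads: iterable of (index_tag, sequence) pairs (simplified, hypothetical input).
    sample_sheet: dict mapping index tag -> sample name (stand-in for the .csv).
    Reads whose tag is not in the sample sheet are collected under "Undetermined".
    """
    by_sample = defaultdict(list)
    for tag, seq in reads:
        sample = sample_sheet.get(tag, "Undetermined")
        by_sample[sample].append(seq)
    return dict(by_sample)

# Made-up example data
sample_sheet = {"ACGT": "sample_A", "TTAG": "sample_B"}
reads = [("ACGT", "GGCATC"), ("TTAG", "CCGTAA"), ("ACGT", "TTTGCA"), ("NNNN", "AAAAAA")]
result = demultiplex(reads, sample_sheet)
```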

6
Q

What are the features an effective alignment algorithm should have?

A
  • Is highly accurate
  • Can deal with problems in the data such as mismatches, errors and gaps
  • Runs fast enough to be useful
  • Has reasonable memory requirements
7
Q

What do alignment algorithms do?

A

Alignment algorithms construct indices for the read sequences, the reference sequence, or both.
Based on the type of index, alignment algorithms are divided into three categories:
Based on hash tables
Based on suffix trees
Based on merge sorting (Slider/SliderII)
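
As a rough illustration of the hash-table category, the sketch below indexes every k-mer of a reference and looks up a read's first k-mer to find candidate positions. The function names and the toy sequences are assumptions; real aligners add seed extension, mismatch handling and far more compact indices.

```python
from collections import defaultdict

def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def seed_positions(read, index, k):
    """Candidate reference start positions for a read, based on its first k-mer."""
    return index.get(read[:k], [])

# Toy example
reference = "ACGTACGTGA"
index = build_kmer_index(reference, 4)
hits = seed_positions("ACGTGA", index, 4)  # "ACGT" starts at positions 0 and 4
```

Only position 4 extends to a full match of the read, which is why a lookup like this is normally followed by an alignment step.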

8
Q

What's the difference between global and local alignment?

A

Global (Needleman-Wunsch) uses the entire lengths of the sequences involved.

Local (Smith-Waterman) only uses parts of the sequences.
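
The difference shows up in the scoring recurrence. The sketch below is a minimal dynamic-programming scorer under assumed scores (match +1, mismatch -1, linear gap -1); the function name and parameters are illustrative. In global mode the first row and column carry gap penalties and the score is read from the last cell; in local mode cells are floored at zero and the best cell anywhere is taken.

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-1):
    """Return the optimal alignment score of a and b.

    local=False: Needleman-Wunsch style (end-to-end alignment).
    local=True:  Smith-Waterman style (best-scoring pair of substrings).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for leading gaps.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]

global_score = alignment_score("TTTACGTTT", "ACG")             # 3 matches, 6 gaps
local_score = alignment_score("TTTACGTTT", "ACG", local=True)  # just the ACG match
```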

9
Q

What are the two methods of variant calling?

A

Probabilistic – e.g. freebayes

Heuristic – e.g. VarScan

10
Q

What tool can be used to assess the mapping quality?

A

Picard

11
Q

What is a probabilistic variant calling method based on?

A

Bayes' theorem
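
A toy example of how Bayes' theorem turns read counts into genotype probabilities is sketched below. It assumes a diploid site, a simple binomial read model, a fixed sequencing error rate and made-up priors; it is not the model used by freebayes or any other specific caller.

```python
from math import comb

def genotype_posteriors(ref_count, alt_count, error_rate=0.01, priors=None):
    """Posterior P(genotype | reads) for RR, RA and AA at one site.

    posterior is proportional to likelihood * prior (Bayes' theorem).
    The priors and error rate are illustrative assumptions.
    """
    if priors is None:
        priors = {"RR": 0.998, "RA": 0.001, "AA": 0.001}
    # Probability that a single read shows the alt allele under each genotype
    p_alt = {"RR": error_rate, "RA": 0.5, "AA": 1 - error_rate}
    n = ref_count + alt_count
    joint = {}
    for genotype, p in p_alt.items():
        likelihood = comb(n, alt_count) * p ** alt_count * (1 - p) ** ref_count
        joint[genotype] = likelihood * priors[genotype]
    total = sum(joint.values())
    return {genotype: value / total for genotype, value in joint.items()}

post = genotype_posteriors(ref_count=10, alt_count=10)
best = max(post, key=post.get)  # most probable genotype given the reads
```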

12
Q

What is a heuristic variant calling method based on?

A

Instead of modelling the distribution of the observed data and using Bayesian statistics to calculate genotype probabilities, heuristic methods make variant calls based on a variety of heuristic factors, such as minimum allele counts, read quality cut-offs and bounds on read depth. Although they have been relatively unpopular in comparison to probabilistic methods, their use of bounds and cut-offs can make them robust to outlying data that violate the assumptions of probabilistic models.
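
A cut-off based call can be sketched as a handful of threshold checks. The function name and the threshold values below are illustrative assumptions, not the defaults of VarScan or any other tool.

```python
def heuristic_call(ref_count, alt_count, min_depth=8, min_alt_reads=2, min_alt_fraction=0.2):
    """Return True if the site passes every cut-off and is called as a variant.

    All thresholds are hypothetical values chosen for illustration.
    """
    depth = ref_count + alt_count
    if depth < min_depth:
        return False  # too few reads to make a call
    if alt_count < min_alt_reads:
        return False  # too few reads supporting the alt allele
    return alt_count / depth >= min_alt_fraction

calls = [
    heuristic_call(10, 10),  # good depth, half the reads support alt
    heuristic_call(3, 3),    # depth below the minimum
    heuristic_call(19, 1),   # only one supporting read
    heuristic_call(45, 5),   # alt fraction 0.1 is below the cut-off
]
```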
