Genome assembly Flashcards

1
Q

FASTQ

A

Four lines per read: line 1 starts with @ and holds the read identifier, line 2 is the sequence, line 3 starts with +, and line 4 encodes the per-base quality values.
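
The four-line layout can be illustrated with a minimal parsing sketch. The function name `parse_fastq` and the toy record are hypothetical, and the quality decoding assumes Phred+33 encoding, which the card itself does not specify.

```python
from typing import Iterator, List, Tuple


def parse_fastq(lines: List[str]) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_id, sequence, qualities), reading four lines per record."""
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = (line.rstrip("\n") for line in lines[i:i + 4])
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        # Assumption: Phred+33 encoding, so quality = ASCII code - 33.
        yield header[1:], seq, [ord(c) - 33 for c in qual]


# Toy record (made up for illustration).
example = ["@read1", "ACGT", "+", "II5!"]
records = list(parse_fastq(example))
print(records)
```

For real data an established parser is usually the safer choice; this sketch only mirrors the structure described in the answer.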

2
Q

FASTA file

A

Plain-text sequence format readable by many programs: each record has a header line starting with > followed by one or more lines of nucleotide or amino acid sequence.
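
As a rough illustration, a FASTA record is a header line beginning with > followed by sequence lines, which can be read with a few lines of code. The helper `parse_fasta` and the toy records are hypothetical; taking the first word of the header as the record name is an assumption of this sketch.

```python
from typing import Dict, List, Optional


def parse_fasta(lines: List[str]) -> Dict[str, str]:
    """Map each record name to its sequence, joining wrapped sequence lines."""
    sequences: Dict[str, str] = {}
    name: Optional[str] = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Assumption: the record name is the first word after '>'.
            name = (line[1:].split() or [""])[0]
            sequences[name] = ""
        elif name is not None:
            sequences[name] += line
    return sequences


# Toy records (made up for illustration).
fasta_example = [">seq1 toy record", "ACGT", "TTGA", ">seq2", "GGCC"]
parsed = parse_fasta(fasta_example)
print(parsed)
```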

3
Q

Genome assembly

A

Programs (assemblers) combine fragmented DNA reads to reconstruct the genome, ideally using long, high-quality reads to resolve repeats and other complexity.

4
Q

Contig structure

A

Contiguous sequence formed by several overlapping reads without any gaps.
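
A minimal sketch of how overlapping reads could be joined into one gap-free contig. The helpers `overlap` and `merge`, the toy reads, and the minimum overlap of 3 bases are all assumptions for illustration; real assemblers also handle sequencing errors, reverse complements, and repeats.

```python
def overlap(a: str, b: str, min_len: int = 1) -> int:
    """Length of the longest suffix of a that equals a prefix of b, or 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge(a: str, b: str, min_len: int = 3) -> str:
    """Join b onto a across their overlap; fail if the overlap is too short."""
    n = overlap(a, b, min_len)
    if n == 0:
        raise ValueError("reads do not overlap by at least min_len bases")
    return a + b[n:]


# Toy reads (made up), already in left-to-right order.
reads = ["ACGTAC", "TACGGA", "GGATTC"]
contig = reads[0]
for read in reads[1:]:
    contig = merge(contig, read)
print(contig)
```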

5
Q

A scaffold in genome assembly

A

Ordered and oriented set of contigs, with gaps of estimated size between them.

6
Q

N50 statistic

A

Weighted median statistic: the largest length N such that 50% of the total assembly length is contained in contigs or scaffolds of length N or larger.
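
The definition can be checked with a short calculation: sort the lengths from longest to shortest and accumulate until half of the total length is reached. The function name `n50` and the contig lengths below are made up for illustration.

```python
from typing import List


def n50(lengths: List[int]) -> int:
    """Return the length N at which the running total reaches half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


# Toy contig lengths (made up); total length is 300.
contig_lengths = [80, 70, 50, 40, 30, 20, 10]
result = n50(contig_lengths)
print(result)  # 80 + 70 = 150 is half of 300, so N50 is 70
```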

7
Q

De novo vs comparative genome assembly

A

De novo assembly builds a new sequence from the reads without a reference genome, whereas comparative assembly aligns reads against an existing reference sequence.

8
Q

De Bruijn graph

A

Graph representing overlaps between sequences of symbols, used in genome assembly by splitting reads into uniform-sized units (k-mers).
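
One common construction, sketched here under assumptions, splits each read into k-mers and adds an edge from each k-mer's first k-1 characters to its last k-1 characters. The function name `de_bruijn_edges`, the toy read, and k = 3 are hypothetical choices.

```python
from collections import defaultdict
from typing import Dict, List


def de_bruijn_edges(reads: List[str], k: int) -> Dict[str, List[str]]:
    """Build {prefix (k-1)-mer: [suffix (k-1)-mers]} with one edge per k-mer."""
    graph: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's first k-1 bases to its last k-1 bases.
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Toy read (made up); its 3-mers are ACG, CGT, GTC.
db_graph = de_bruijn_edges(["ACGTC"], 3)
print(db_graph)
```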

9
Q

Eulerian walk in a graph

A

Trail in a graph that traverses every edge exactly once; when it starts and ends at the same vertex (a closed trail) it is an Eulerian circuit.
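
A minimal sketch of finding such a walk in a directed graph with Hierholzer's algorithm, which is one standard approach and not necessarily what any particular assembler uses. The function name `eulerian_walk` and the toy graph are hypothetical, and the code assumes the graph is non-empty and that an Eulerian walk exists.

```python
from typing import Dict, List


def eulerian_walk(graph: Dict[str, List[str]]) -> List[str]:
    """Return the nodes of a walk that uses every edge of the graph exactly once."""
    remaining = {node: list(succ) for node, succ in graph.items()}
    in_deg: Dict[str, int] = {}
    for succ in graph.values():
        for node in succ:
            in_deg[node] = in_deg.get(node, 0) + 1
    # Start where out-degree exceeds in-degree by one, else at any node (circuit).
    start = next(
        (n for n in remaining if len(remaining[n]) - in_deg.get(n, 0) == 1),
        next(iter(remaining)),
    )
    stack, walk = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())  # follow an unused edge
        else:
            walk.append(stack.pop())  # dead end: node is final in the walk
    return walk[::-1]


# Toy graph (made up): (k-1)-mer nodes from the 3-mers ACG, CGT, GTC.
toy_graph = {"AC": ["CG"], "CG": ["GT"], "GT": ["TC"]}
walk = eulerian_walk(toy_graph)
# Spell the sequence: first node plus the last base of each later node.
genome = walk[0] + "".join(node[-1] for node in walk[1:])
print(walk, genome)
```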
