Exam 1 Homework and Quizzes Flashcards

1
Q

The Human Genome Project was launched in ____, produced its first draft assembly in ____, and was finished in ____

A

1990
2001
2003

2
Q

Sequence-length unit abbreviations, in order from smallest to largest

A

bp Kb Mb Gb Tb Pb
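
As a quick check on the ordering above, each unit is 1,000 times the previous one. The sketch below converts between units; the function names `to_bases` and `humanize` are made up for this illustration.

```python
# Units from the card above, smallest to largest; each step is a factor of 1,000.
UNITS = ["bp", "Kb", "Mb", "Gb", "Tb", "Pb"]


def to_bases(value, unit):
    """Convert a value expressed in `unit` to a plain number of bases."""
    return value * 1000 ** UNITS.index(unit)


def humanize(n_bases):
    """Format a base count using the largest unit that keeps the value below 1,000."""
    value = float(n_bases)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(to_bases(3, "Gb"))
print(humanize(16_000))
```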

3
Q

Approximately how many bases are in a typical diploid mammalian genome?

A

6 billion

4
Q

Approximately how many bases are in a typical mammalian mitochondrial genome?

A

16,000 bp

5
Q

Approximately what proportion of a mammalian genome codes for proteins

A

2%

6
Q

Approximately 50% of a mammalian genome is composed of what type of DNA element?

A

repetitive DNA

7
Q

The polymerase chain reaction has four key ‘ingredients’ necessary to replicate DNA in a tube

A

DNA template
DNA polymerase
Nucleotides
Primers

8
Q

What are the three stages of the polymerase chain reaction

A

denaturing
annealing
extending
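
Each pass through these three stages can at most double the number of template copies, which is why PCR amplifies exponentially. The sketch below shows that arithmetic; the `efficiency` parameter is an assumption added for illustration (1.0 means perfect doubling, which real reactions only approximate).

```python
def pcr_copies(start_copies, cycles, efficiency=1.0):
    """Estimate copy number after repeated denature/anneal/extend cycles.

    efficiency is the assumed fraction of templates copied per cycle
    (1.0 = ideal doubling).
    """
    return start_copies * (1 + efficiency) ** cycles


# One template molecule after 30 ideal cycles: about a billion copies.
print(pcr_copies(1, 30))
```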

9
Q

Sanger sequencing differs from PCR in one key element. What is that key element?

A

Sanger sequencing uses chain-terminating dideoxynucleotides (ddNTPs) along with the deoxynucleotides (dNTPs)

10
Q

Illumina sequencing requires that the library contain fragments in a certain size range. What size range is typical of whole genome sequencing libraries?

A

300-350 bp

11
Q

PacBio and Oxford Nanopore sequencing are very different but share one characteristic in common, and this characteristic differentiates them from Illumina technology. What is that characteristic?

A

they need a long DNA template strand to start sequencing

12
Q

Five domains of genome research

A

-understanding the structure of genomes
-understanding the biology of genomes
-understanding the biology of disease
-advancing the science of medicine
-improving the effectiveness of health care

13
Q

What is the largest current bottleneck in genomics

A

analyzing the stream of data from technological advances

14
Q

In the Illumina process, the nucleotides are highly specialized. They have two key attributes. What are they?

A

a fluorophore (fluorescent label) specific to the identity of each nucleotide
the 3’ hydroxyl group is blocked with a reversible chemical blocker, so only one base is added per cycle and each incorporation can be accurately detected

15
Q

____ of reads to the reference sequence is the first step to identify variation of all types

A

alignment

16
Q

Long-read sequencers such as the PacBio instrument are a departure from short-read sequencers such as Illumina. What is the first major requirement of these long-read technologies that differs from short-read technologies?

A

high molecular weight genomic DNA
it must be of sufficient quality to allow shearing to fragments >30 Kb, which are used to produce PacBio continuous reads

17
Q

A typical workflow of whole exome sequencing analysis consists of the following steps

A

-raw data QC
-Pre-processing
-mapping
-post-alignment processing
-variant calling
-annotation
-prioritization

18
Q

The standard pre-processing procedure includes

A

-3’ end adapter removal
-trimming of low quality bases at the ends of the reads
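
The two pre-processing steps above can be sketched in plain Python. This is a minimal illustration, not how any particular trimming tool works: the function names, the quality threshold of 20, the minimum adapter overlap of 3, and the Phred+33 quality encoding are all assumptions made for the example.

```python
def trim_adapter(seq, qual, adapter, min_overlap=3):
    """Remove a 3' adapter from a read and its quality string.

    Cuts at the first full adapter match; otherwise looks for an adapter
    prefix of at least min_overlap bases hanging off the 3' end.
    """
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx], qual[:idx]
    for k in range(min(len(adapter), len(seq)) - 1, min_overlap - 1, -1):
        if seq.endswith(adapter[:k]):
            return seq[:-k], qual[:-k]
    return seq, qual


def trim_low_quality_ends(seq, qual, threshold=20, offset=33):
    """Trim bases below the quality threshold from both ends of a read.

    Assumes Phred+offset ASCII-encoded qualities (offset 33 here).
    """
    scores = [ord(c) - offset for c in qual]
    start, end = 0, len(scores)
    while start < end and scores[start] < threshold:
        start += 1
    while end > start and scores[end - 1] < threshold:
        end -= 1
    return seq[start:end], qual[start:end]


# Illustrative read: 8 genomic bases followed by the start of an adapter.
adapter = "AGATCGGAAGAGC"  # example adapter sequence, used here for illustration
seq = "ACGTACGT" + adapter[:10]
qual = "##IIII##" + "I" * 10
seq, qual = trim_adapter(seq, qual, adapter)
seq, qual = trim_low_quality_ends(seq, qual)
print(seq, qual)
```

Adapter removal runs first here so that the quality trimmer only sees genomic bases; real tools may order or combine these steps differently.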

19
Q

Many different tools have been developed for short-read mapping. In general, they use one of two algorithms for aligning sequences

A

-Burrows-Wheeler transform: a compression technique
-Smith-Waterman: a dynamic programming algorithm
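
Both algorithms named above can be illustrated on toy strings. The sketch below is a teaching-sized version only: the naive rotation sort is far too slow for a real genome (aligners build an index from the transform instead), and the scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions.

```python
def bwt(text, terminator="$"):
    """Burrows-Wheeler transform via sorted rotations (toy-sized inputs only).

    The terminator must sort before every character in text.
    """
    s = text + terminator
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


print(bwt("banana"))
print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

The transform groups identical characters into runs, which is what makes it useful for compression; Smith-Waterman fills a score table so that the best-matching local region is found even when the flanking sequence differs.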

20
Q

Of the sequence aligners they evaluated, which two were the fastest?

A

Bowtie 2
BWA

21
Q

After mapping reads to the reference genome, a three-step post-alignment processing procedure is recommended to minimize the artifacts that may affect the quality of downstream variant calling. It consists of

A

-read duplicate removal
-indel realignment
-base quality score recalibration (BQSR)
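
The first of these steps, duplicate removal, can be sketched as follows. This is a simplified illustration: it assumes reads that map to the same chromosome, start position, and strand are PCR duplicates and keeps the one with the highest mean base quality. The `Read` record and its fields are hypothetical, and production tools apply more elaborate criteria (for example, mate positions for paired-end data).

```python
from collections import namedtuple

# Hypothetical minimal read record for this example.
Read = namedtuple("Read", "name chrom pos strand mean_qual")


def remove_duplicates(reads):
    """Keep one read per (chrom, pos, strand): the one with the best mean quality."""
    best = {}
    for read in reads:
        key = (read.chrom, read.pos, read.strand)
        if key not in best or read.mean_qual > best[key].mean_qual:
            best[key] = read
    return sorted(best.values(), key=lambda r: (r.chrom, r.pos, r.strand))


reads = [
    Read("r1", "chr1", 100, "+", 30.0),
    Read("r2", "chr1", 100, "+", 35.5),  # duplicate of r1 with higher quality
    Read("r3", "chr1", 100, "-", 28.0),  # same position, opposite strand
    Read("r4", "chr2", 50, "+", 33.0),
]
for read in remove_duplicates(reads):
    print(read.name)
```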

22
Q

Variant analysis consists of

A

genotyping
variant calling
annotation
prioritization