Gene Sequencing Programmes Flashcards

1
Q

What are the four main types of chromosome maps used in genome sequencing projects?

A

Karyotypic map – based on visual observation of chromosomal spreads.

Linkage map – derived from monitoring recombination between genetic markers.

Physical map – measured in base pairs; often made using a tiling path of overlapping BAC clones.

Sequence map – the final goal, showing the actual base sequence along chromosomes.

2
Q

Why was the Human Genome Project started?

A

To comprehensively sequence the human genome, gaining new insight into genes, health, evolution, and disease. It promised to transform “real science” by opening new fields like genomics, personalized medicine, and bioinformatics.

3
Q

When did discussions and formal work on the HGP begin?

A

1985: Initial discussions at UC Santa Cruz.

1986: Department of Energy (DOE) started its own program.

1988: Support from key scientists like J. Watson and S. Brenner.

1990: Project officially began with both NIH and DOE.

1993: Sanger Centre opened in the UK, led by John Sulston.

4
Q

What was the official goal of the Human Genome Project?

A

To sequence 95% of the human genome within 15 years.

5
Q

Whose genome was sequenced in the HGP?

A

The identities of the DNA donors were kept anonymous.

70% of the sequence came from one individual.

Volunteers were recruited from a lab in Buffalo, NY.

6
Q

What is hierarchical genome sequencing and how does it work?

A

Genome is broken into large segments using BACs (~100 kb).

A tiling path of overlapping BACs is created.

Each BAC is sequenced using shotgun sequencing (randomly breaking them into smaller fragments, sequencing them, and compiling the sequence).

BAC sequences are then assembled into the full genome.
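
The tiling-path step can be sketched as a greedy interval cover. This is only an illustration: it assumes each clone's start and end coordinates are already known, whereas real projects inferred overlaps from fingerprints and markers. The function name `minimal_tiling_path` and the clone coordinates are hypothetical.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Greedy interval cover: pick few overlapping clones spanning a region.

    clones: list of (name, start, end) tuples, coordinates in base pairs
            (hypothetical inputs; real overlaps come from mapping data).
    Returns the chosen clone names in order; raises ValueError on a gap.
    """
    remaining = sorted(clones, key=lambda c: c[1])  # sort by start position
    path = []
    covered_to = region_start
    i = 0
    while covered_to < region_end:
        best = None
        # Among clones starting at or before the covered point,
        # keep the one that reaches furthest to the right.
        while i < len(remaining) and remaining[i][1] <= covered_to:
            if best is None or remaining[i][2] > best[2]:
                best = remaining[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        path.append(best[0])
        covered_to = best[2]
    return path


bacs = [
    ("A", 0, 100_000),
    ("B", 50_000, 160_000),
    ("C", 90_000, 200_000),
    ("D", 150_000, 300_000),
    ("E", 190_000, 310_000),
]
print(minimal_tiling_path(bacs, 0, 300_000))
```

Each selected BAC would then be shotgun sequenced on its own, and the overlaps between neighbouring BACs used to join the finished sequences.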

7
Q

What is shotgun genome sequencing and who used it?

A

Used by Celera Genomics (founded in 1998 by the Perkin-Elmer Corporation and Craig Venter).

Breaks the whole genome into small fragments, sequences them directly, and reconstructs the genome by computational assembly of the overlapping reads.

Faster and cheaper than the public project, but required powerful computing for accurate assembly.
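
The idea of computational assembly can be illustrated with a toy greedy overlap merger. This is a minimal sketch, not the assembler Celera used: it assumes error-free reads from one strand with no repeats, and the names `overlap` and `greedy_assemble` are made up for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: remaining reads stay separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


fragments = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
print(greedy_assemble(fragments))
```

The all-against-all overlap search is quadratic in the number of reads, which hints at why whole-genome shotgun assembly needed so much computing power.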

8
Q

What is Next Generation Sequencing (NGS) and what enabled its rise?

A

Massively parallel sequencing of millions of fragments.

Emerged in the early 2000s.

Enabled by advances in computing (Moore’s Law – transistor counts, and with them computing power, doubling roughly every 2 years).

First NGS platform was launched by 454 Life Sciences in 2005.

9
Q

How does 454 pyrosequencing work (emulsion PCR)?

A

DNA fragments are annealed to beads with attached oligonucleotides.

Beads are dispersed in oil to form an emulsion (each droplet = microreactor).

Each droplet undergoes PCR amplification.

Resulting amplified beads are added to a sequencing plate.

10
Q

How is light used in 454 pyrosequencing?

A

When a nucleotide is incorporated, pyrophosphate (PPi) is released.

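PPi is converted to ATP by ATP sulfurylase; the ATP drives luciferase, generating light.

Nucleotides are flowed over the plate one type at a time, so the flow that produces light identifies the base, and the light’s intensity indicates how many copies of that base were added in a row.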

Allows real-time sequencing of DNA on each bead.
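
Reading a sequence from the light signals can be sketched as decoding a flowgram. The sketch assumes the four nucleotides are flowed one at a time in a fixed repeating order (taken here as T, A, C, G) and that the signals are already normalised so one incorporation gives about 1.0. The function name `decode_flowgram` and the example values are hypothetical.

```python
def decode_flowgram(intensities, flow_order="TACG"):
    """Turn per-flow light intensities into a base sequence.

    Each flow adds one nucleotide type. The rounded signal is taken as the
    number of bases of that type incorporated (0 means no incorporation).
    Assumes non-negative, normalised signals.
    """
    bases = []
    for i, signal in enumerate(intensities):
        base = flow_order[i % len(flow_order)]
        count = int(signal + 0.5)  # round to the nearest whole incorporation
        bases.append(base * count)
    return "".join(bases)


# Flows:  T     A     C    G    T    A     C     G
signals = [1.02, 0.05, 2.1, 0.0, 0.1, 0.97, 0.03, 3.05]
print(decode_flowgram(signals))
```

Because a run of identical bases is read from a single intensity, long homopolymers are harder to count accurately than single bases.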

11
Q

What is Illumina sequencing and how is it different from 454?

A

Uses bridge amplification to create clusters (‘polonies’) on a flow cell.

Sequencing is done by adding fluorescently labelled, 3’-blocked nucleotides.

One nucleotide is incorporated per cycle.

The incorporated base is identified by imaging, then the block is removed, allowing the next cycle.
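
The cycle-by-cycle readout can be sketched as picking the brightest colour channel in each imaging cycle. This is a simplified illustration that assumes one intensity per base per cycle for a single cluster. Real base callers also correct for channel crosstalk and phasing. The function name `call_bases` and the intensity values are hypothetical.

```python
def call_bases(cycles, channels="ACGT"):
    """Call one base per cycle from per-channel fluorescence intensities.

    cycles: list of 4-value sequences, one intensity per channel, in the
            same order as `channels` (an assumption of this sketch).
    """
    read = []
    for intensities in cycles:
        # The 3'-block means exactly one labelled base was added this cycle,
        # so the strongest channel is taken as the incorporated base.
        strongest = max(range(len(channels)), key=lambda i: intensities[i])
        read.append(channels[strongest])
    return "".join(read)


cluster_signal = [
    (0.9, 0.1, 0.0, 0.1),  # cycle 1: A channel brightest
    (0.1, 0.1, 0.8, 0.2),  # cycle 2: G channel brightest
    (0.0, 0.2, 0.1, 0.7),  # cycle 3: T channel brightest
]
print(call_bases(cluster_signal))
```

Unlike the 454 readout, the read length here equals the number of cycles, because the reversible block stops a homopolymer from being added in one step.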

12
Q

How does Illumina sequencing allow for high throughput?

A

Many clusters are sequenced in parallel on the flow cell.

Each cluster emits a fluorescent signal when a base is added.

This results in massive amounts of sequence data from one run.

13
Q

What is third-generation sequencing and what are its advantages?

A

Sequences single DNA molecules in real time, without amplification.

No dephasing; allows longer read lengths.

Can detect DNA modifications directly.

14
Q

What are the main third-generation sequencing platforms?

A

SMRT (Single Molecule Real-Time) by PacBio.

Oxford Nanopore Technologies (ONT) – DNA is passed through a protein nanopore, and current changes indicate base sequence.

15
Q

Why is combining short- and long-read sequencing effective?

A

Short reads (e.g. Illumina) offer high accuracy and depth.

Long reads (e.g. ONT) help resolve structural variants and complex regions.

Combining both gives a more complete genome assembly.

16
Q

Why is bioinformatics essential in genomics?

A

To store, access, analyze, and interpret massive amounts of sequencing data. Bioinformatics helps make biological sense of raw genomic sequences.

17
Q

What types of information are included in genome annotation?

A

Source info: species, strain, tissue, cell line, etc.

Background info: researcher, literature, etc.

Sequence features: promoters, exons, introns, coding regions, motifs, etc.

Linked data: protein sequences, functions, pathways, etc.

18
Q

How are open reading frames (ORFs) used in annotation?

A

ORFs >100 codons are identified.

These are searched in databases to identify:

Related genes

Conserved domains or motifs

Functional elements (e.g. signal peptides, transmembrane domains).
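
The first step, finding ORFs above a length cutoff, can be sketched as a scan of the three forward reading frames. This is a minimal illustration that assumes the standard stop codons and an ATG start, and it ignores the reverse strand and introns. The function name `find_orfs` and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def find_orfs(seq, min_codons=100):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    start/end are 0-based, end is exclusive and includes the stop codon.
    min_codons counts codons from the ATG up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next ATG in this frame
    return orfs


# Toy sequence: one 5-codon ORF, so the cutoff is lowered for the demo.
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"
print(find_orfs(toy, min_codons=5))
```

Each ORF that passes the cutoff would then be translated and searched against sequence databases.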

19
Q

What are BACs and why are they used in genome projects?

A

BACs (Bacterial Artificial Chromosomes) can hold large DNA inserts (~100 kb), ideal for breaking the genome into manageable chunks in hierarchical sequencing. They are easy to clone, manipulate, and store.

20
Q

Why was the public genome sequencing effort slower and more expensive than Celera’s?

A

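Used the hierarchical, clone-by-clone approach with Sanger sequencing, which required mapping a BAC tiling path first and was costlier and slower than whole-genome shotgun (Celera also relied on Sanger chemistry, but skipped the mapping step).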

Labs involved had other research duties.

Despite the cost, it provided higher coverage and accuracy, which is critical for reference-quality genome data.