Shane 3 (AI) Molecular Diagnostics - Whole Genome/Whole Exome Sequencing Flashcards
What percentage of the human genome is similar across individuals?
99.9%
The differences in the human genome are only found in 0.1% of the genome.
What is the significance of genetic mutations in cancer?
Mutations are usually found in many different genes.
An example is the BRCA1 mutation, which often occurs alongside other mutations.
What is the relationship between mutations and cancer development?
Multiple mutations occur before someone develops cancer.
This indicates that cancer is often the result of cumulative genetic changes.
Define Mendelian disorders.
Disorders caused by mutations in a single gene.
Examples include Cystic Fibrosis (CF) and Prader-Willi Syndrome (SNRPN gene).
What are some characteristics of Mendelian disorders?
Some remain mysterious with unknown causative genes.
This highlights gaps in current genetic understanding.
What is a method for prenatal diagnosis?
Cell-free fetal DNA from the mother.
This method is less invasive compared to traditional techniques.
What was previously used for prenatal diagnosis but is no longer commonly done?
Amniocentesis.
It was considered too risky for routine use.
Fill in the blank: Genetic mutations associated with cancer usually occur in _______.
[many different genes].
True or False: Differences between individual human genomes span more than 0.1% of the genome.
False.
Differences are only in 0.1% of the genome.
What year did Watson and Crick infer DNA’s structure?
1953
What significant event in 1965 involved nucleotide sequencing?
The first nucleotide sequence of the gene encoding yeast alanine tRNA was reported
Who introduced initial DNA sequencing methods in 1977?
Sanger, Maxam, and Gilbert
What was the first human linkage map based on in 1983?
Restriction fragment length polymorphism
What significant discovery was made in 1985?
Mullis discovered the PCR technique
What idea was proposed in 1986?
The idea of human genome sequencing
What was cloned in 1990?
The first human disease gene
What was accomplished in 1993 regarding homozygosity?
The first homozygosity mapping was done
What was the focus of the first positional cloning done in 1995?
A recessive disease gene on the basis of linkage
What was reported in 1995 regarding genome sequencing?
The first genome sequence of an organism (Haemophilus influenzae)
What major milestone occurred in 1999?
The first human chromosome was sequenced
Which organism's genome sequence was published in 2000?
Fruit fly
In what year was the first draft of the human genome sequence published?
2001
What key event occurred in 2003 regarding the human genome?
The human genome sequence was completed
What technology emerged in 2005 that transformed sequencing?
Massively parallel sequencing platforms (next-generation sequencing)
What year saw the introduction of the first NGS instrument?
2005
What was the first application of WES reported in 2010?
Proof of principle: disease gene identification by WES
Fill in the blank: The PCR technique was discovered by _______.
Mullis
True or False: The first individual genome based on NGS was published in 2008.
True
What percentage of Watson’s genome was published in 2008?
33%
What major advancement in genetics was made in 1985?
PCR technique
Who received a Nobel Prize for the development of the PCR technique?
Mullis
What did Watson’s genome reveal about his health?
Predisposed to lots of conditions
What is the primary purpose of Whole Genome Sequencing?
To determine the complete DNA sequence of an organism’s genome
This includes identifying all of the organism’s genes and their functions.
What is a BAC in the context of genome sequencing?
Bacterial Artificial Chromosome
BACs are used to clone large fragments of DNA for genomic studies.
What is the first step in the Hierarchical Shotgun Approach?
Fragmenting genomic DNA into large pieces
This initial step allows for the organization and mapping of the genome.
What does the term ‘minimal tiling path’ refer to in genome sequencing?
The smallest set of overlapping clones (such as BACs) that together cover the region being sequenced
Choosing this set keeps the number of clones to sequence and assemble manageable.
Fill in the blank: In the shotgun approach, DNA is fragmented into different sizes like _______.
shotgun pellets
This metaphor illustrates the random nature of fragmenting DNA.
What is the significance of overlapping reads in genome assembly?
They allow for the reconstruction of the original sequence
Overlapping reads are crucial for accurate genome assembly.
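As a minimal sketch of how overlapping reads can be stitched back into one sequence, the Python below repeatedly merges the two reads with the longest suffix-to-prefix overlap. The function names (`overlap`, `greedy_assemble`), the `min_len` threshold, and the toy reads are hypothetical illustrations; real assemblers are far more elaborate and must handle sequencing errors and repeats.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # remaining reads do not overlap: several contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Toy reads, each sharing 4 bases with its neighbor
contigs = greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"])
print(contigs)
```

Reads that share no overlap are left as separate contigs, which is one reason gaps remain in draft assemblies.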
True or False: The process of preparing chromosome spreads is simple and quick.
False
Preparing chromosome spreads is described as a tedious process.
What is the role of restriction enzymes in the shotgun sequencing process?
To cut DNA into fragments of varying sizes
This facilitates the cloning and sequencing of DNA.
What are shotgun clones made from?
BAC clones that have been digested into smaller fragments
Shotgun clones are the actual sequences that are read during sequencing.
What was a major reason why the Human Genome Project took roughly 13 years (1990 to 2003) to complete?
The tedious process of assembling sequences from overlapping reads
The complexity of genome sequencing contributed to the lengthy timeline.
What does the term ‘reads’ refer to in the context of genome sequencing?
Short sequences of nucleotides obtained from shotgun clones
Reads are the data points used to assemble the genome.
What is the primary technique used in whole genome sequencing known as the shotgun approach?
The shotgun approach involves randomly breaking up DNA into smaller fragments and then sequencing those fragments to reconstruct the entire genome.
True or False: The shotgun sequencing method requires the genome to be assembled in a specific order before sequencing.
False
Fill in the blank: In shotgun sequencing, the DNA is first ______ into smaller fragments.
fragmented
Which of the following is a key advantage of the shotgun sequencing method? A) It is labor-intensive B) It requires prior knowledge of the genome C) It allows for rapid sequencing of large genomes
C) It allows for rapid sequencing of large genomes
What are the main steps involved in the shotgun sequencing process?
The main steps are fragmentation of DNA, sequencing of the fragments, and computational assembly of the sequences.
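The fragmentation step can be imitated on a toy sequence. This is a rough sketch that assumes fixed-length reads drawn from uniformly random start positions; the function name `shotgun_reads`, the seed, and the 30-base "genome" are all made up for illustration, and physical shearing produces a range of fragment sizes rather than one fixed length.

```python
import random

def shotgun_reads(genome, num_reads, read_len, seed=0):
    """Sample fixed-length reads from random start positions in the genome string."""
    rng = random.Random(seed)  # seeded so the toy example is reproducible
    last_start = len(genome) - read_len
    reads = []
    for _ in range(num_reads):
        start = rng.randint(0, last_start)  # randint includes both endpoints
        reads.append(genome[start:start + read_len])
    return reads

genome = "ATGGCGTACGTTAGCCATGACCGTTAGGCA"  # hypothetical 30-base sequence
reads = shotgun_reads(genome, num_reads=12, read_len=10)
for r in reads:
    print(r)
```

Because the start positions are random, the order and location of each read are unknown afterwards; the computational assembly step has to recover them from overlaps.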
What does BAC stand for in BAC libraries?
Bacterial Artificial Chromosome
True or False: BAC libraries are used to clone large fragments of DNA for genomic studies.
True
Fill in the blank: BAC libraries can contain inserts of DNA fragments up to _____ kilobases in size.
300
How are BAC libraries utilized in whole genome sequencing?
BAC libraries are used to organize and sequence large sections of the genome, allowing for more efficient mapping and assembly.
Which of the following is a primary advantage of using BAC libraries in genomic research? A) Small insert size B) Stability of cloned DNA C) High mutation rate D) Low cloning efficiency
B) Stability of cloned DNA
What is a minimal tiling path (MTP) in whole genome sequencing?
A minimal tiling path is the smallest set of overlapping DNA clones that covers the entire genome of an organism.
True or False: A minimal tiling path includes gaps between DNA fragments.
False
Fill in the blank: The primary purpose of a minimal tiling path is to ______ the genome efficiently for sequencing.
cover
Which of the following best describes the utility of a minimal tiling path in genome sequencing? A) Reducing redundancy B) Increasing fragment size C) Enhancing computational complexity D) None of the above
A) Reducing redundancy
How does a minimal tiling path contribute to genome assembly?
It provides a framework for assembling overlapping sequences into a complete genome, ensuring all regions are represented.
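One way to picture choosing a minimal tiling path is as an interval-cover problem: given mapped clones, pick as few as possible so that no position is left uncovered. The sketch below uses a standard greedy strategy; the clone names, coordinates, and the function name `minimal_tiling_path` are hypothetical, and it simplifies real practice by allowing clones that merely touch end to end, whereas real tiling paths need some true overlap between neighbors.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Greedy interval cover. Clones are (name, start, end) tuples, end exclusive."""
    path = []
    covered_to = region_start
    while covered_to < region_end:
        # Clones that begin at or before the covered point and extend past it
        candidates = [c for c in clones if c[1] <= covered_to < c[2]]
        if not candidates:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        best = max(candidates, key=lambda c: c[2])  # the one reaching furthest
        path.append(best[0])
        covered_to = best[2]
    return path

# Hypothetical mapped BAC clones (coordinates in kilobases)
clones = [
    ("BAC_A", 0, 150),
    ("BAC_B", 50, 200),
    ("BAC_C", 120, 300),
    ("BAC_D", 180, 320),
    ("BAC_E", 290, 450),
]
path = minimal_tiling_path(clones, 0, 450)
print(path)
```

In this toy case BAC_B and BAC_D are redundant, so only three of the five clones would need to be shotgun sequenced.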
What is the Celera Genomics Approach?
A private method for faster DNA sequencing that skips mapping of clones and goes straight to shotgun clones.
This approach allowed for quicker sequencing compared to public methods.
What sequencing method did Celera Genomics utilize?
Shotgun sequencing.
This method involves directly sequencing random DNA fragments.
What is the main advantage of the Celera Genomics Approach over public methods?
It took a lot less time to sequence DNA.
This efficiency was due to the direct sequencing of shotgun clones.
What percentage of the genome was Celera Genomics attempting to patent?
28%.
This indicates their focus on a specific portion of the genome for commercial purposes.
What is the significance of skipping the mapping of clones in the Celera Genomics Approach?
It allows for a more rapid sequencing process.
Mapping typically requires significant time and resources.
Fill in the blank: The Celera Genomics Approach went straight to _______ clones.
shotgun.
Shotgun clones are fragments of genomic DNA that are sequenced directly.
What is the starting material for Whole Genome Sequencing?
Double-stranded genomic DNA.
How are DNA fragments prepared for sequencing?
DNA is fragmented using a sonicator, such as Covaris.
What is the size range of DNA fragments for sequencing?
Small fragments, typically 200-500 nucleotides long.
How do you verify the size of DNA fragments?
Run the fragments on a gel alongside a size marker.
What is added to the DNA fragments to facilitate ligation?
A single deoxyadenosine (dA) nucleotide, added to the 3' ends of the fragments to create an A overhang.
Why is an A overhang important?
It allows for easier ligation of an adaptor starting with a T.
What can you do if the DNA fragments are not the correct size?
Run the fragments on a gel again to check their size.
What technique can be used to recover a specific DNA sequence from a gel?
Cut the sequence out of the gel using a blade and melt down the agarose.
What can be designed using the adaptor sequences?
Primers for sequencing the DNA.
What are the three types of genetic variation among people?
- Single nucleotide variants (SNVs)
- Short insertions/deletions (indels)
- Structural variation (SV)
Each type represents different forms of genetic differences that can occur in the human genome.
Define single nucleotide variants (SNVs).
Variations at a single nucleotide position in the genome, with approximately 3.7-4.1 million occurring per person.
SNVs can affect gene function, but many do not result in changes to the amino acid sequence.
What is the range of short insertions/deletions (indels) in base pairs?
1-100 bp
Individuals can have approximately 300,000-600,000 indels in their genomes.
What is structural variation (SV) in genetics?
Large blocks (1000 bp or more) of DNA that are deleted, inserted, or inverted, with over 2,500 differences relative to the reference human genome sequence.
SVs can be associated with various genetic conditions.
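The size cut-offs on these cards can be turned into a small classifier. This is only a sketch: it takes reference and alternate alleles as strings (an assumed input format), uses the 1-100 bp and 1000 bp thresholds from the cards, and labels anything in between as unclassified because the cards do not say where such variants belong. It cannot recognize inversions, which do not change length.

```python
def classify_variant(ref, alt):
    """Classify a variant by allele lengths, using the size thresholds from the cards."""
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNV"
    size = abs(len(ref) - len(alt))  # net bases inserted or deleted
    if 1 <= size <= 100:
        return "indel"
    if size >= 1000:
        return "SV"
    return "unclassified"  # e.g. 101-999 bp, not covered by the cards

print(classify_variant("A", "G"))          # single base change
print(classify_variant("A", "ATT"))        # 2-base insertion
print(classify_variant("A" * 1501, "A"))   # 1500-base deletion
```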
What technology is primarily used for sequencing complete genomes?
Massively parallel DNA sequencing, with Illumina being the most efficient.
Other sequencing technologies can also be employed.
What percentage of people have structural variations associated with genetic conditions?
26%
Structural variations can lead to serious genetic conditions, especially if DNA is deleted.
What is the significance of short insertions/deletions in the human genome?
They are very common, and a person can have thousands of these variations.
Many occur in non-coding regions, leading to no observable effects.
True or False: Most single nucleotide variants change the amino acid that is expressed.
False
Many single nucleotide variants do not result in changes to the protein produced.
Fill in the blank: A person can have approximately _______ differences in structural variation compared to the reference human genome.
2,500
This includes large changes such as deletions, insertions, and inversions.
What notable aspect is mentioned about Watson’s genome?
It contains lots of mutations and was sequenced when he was about 80 years old.
This highlights the accumulation of genetic variations over a lifetime.
What was the first complete genome sequenced using massively parallel DNA sequencing?
The genome of James D. Watson
Sequenced by Wheeler DA et al. in 2008
How long did it take to sequence James D. Watson’s genome?
Two months
Using massively parallel sequencing technology
What was the redundancy level achieved in sequencing James D. Watson’s genome?
7.4-fold redundancy
Indicates the depth of coverage during sequencing
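Fold redundancy (depth of coverage) is conventionally the total number of sequenced bases divided by the genome size. The sketch below shows that arithmetic; the 3-billion-base genome size and the 250-base read length are round-number assumptions chosen for illustration, not figures from the cards.

```python
import math

def fold_coverage(num_reads, read_length, genome_size):
    """Average depth: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size

def reads_needed(target_coverage, read_length, genome_size):
    """Number of reads required to reach a target average depth."""
    return math.ceil(target_coverage * genome_size / read_length)

GENOME_SIZE = 3_000_000_000  # assumed round figure for the human genome
READ_LENGTH = 250            # hypothetical read length

n = reads_needed(7.4, READ_LENGTH, GENOME_SIZE)
print(n, fold_coverage(n, READ_LENGTH, GENOME_SIZE))
```

An average of 7.4-fold means each base was read about seven times on average; individual positions can be covered more or less often than that.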
How many single nucleotide polymorphisms were identified in Watson’s genome?
3.3 million
Includes 10,654 that cause amino acid substitutions
What is the range of base pairs for copy number variations identified in Watson’s genome?
26,000 to 1.5 million base pairs
Refers to large-scale gain and loss of chromosomal segments
What percentage of James D. Watson’s genome variants caused amino acid substitutions?
Approximately 0.3%
10,654 out of 3.3 million variants
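The 0.3% figure follows directly from the two counts on the card, as this short calculation shows:

```python
amino_acid_changing = 10_654
total_snps = 3_300_000

percent = amino_acid_changing / total_snps * 100
print(f"{percent:.2f}%")  # rounds to about 0.3%
```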
What types of polymorphisms were identified in Watson’s genome?
Small-scale insertion and deletion polymorphisms
Ranging from 2 to 40,000 base pairs
True or False: James D. Watson was about 80 years old when his genome was sequenced.
True
Indicates the age of Watson at the time of sequencing
Fill in the blank: The sequencing of Watson’s genome revealed _______ variants.
[3.3 million]
Includes a significant number of mutations
What is the significance of the amino acid substitutions found in Watson’s genome?
The new amino acid can be chemically similar to or very different from the original, so a substitution is potentially pathogenic
The impact of substitutions varies based on the properties of the amino acids
What was the primary goal of the 1000 Genomes Project?
To sequence the genomes of individuals from populations around the world
The project aimed to provide a comprehensive map of human genetic variation.
How many individuals’ genomes were sequenced in the 1000 Genomes Project?
1,092 individuals
These individuals were selected from diverse populations globally.
What types of genetic variations were mapped in the 1000 Genomes Project?
SNPs, short insertions and deletions, larger deletions
Specifically, the project identified 38 million SNPs, 1.4 million short insertions and deletions, and more than 14,000 larger deletions.
What is the significance of the data obtained from the 1000 Genomes Project?
Used to screen variants in genetic disorders and cancer genome projects
This data helps researchers understand the genetic basis of diseases.
What is a SNP?
Single Nucleotide Polymorphism
SNPs are the most common type of genetic variation among people.
What is considered normal variation in humans according to the 1000 Genomes Project?
Genetic variations found in healthy individuals
These variations include common SNPs, insertions, and deletions.
What is the current status of the 1000 Genomes Project?
Morphed into 10,000 genomes and aims for 100,000 genomes
The project continues to expand its research on genetic variation.
Fill in the blank: The 1000 Genomes Project provides a map of _______ variations in the human genome.
genetic
This includes SNPs, insertions, deletions, and other variations.
True or False: The 1000 Genomes Project only sequenced individuals with genetic disorders.
False
The project focused on healthy individuals to establish normal genetic variation.
What is the purpose of screening out common SNPs in genetic research?
To focus on rarer SNPs
Rarer SNPs may have more significant implications for understanding diseases.
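Screening out common variants is, in essence, a filter on population allele frequency. The sketch below assumes each variant is a dictionary with a hypothetical `af` field (allele frequency in a reference panel such as 1000 Genomes, or `None` if the variant was never observed there) and uses a 1% cut-off, which is an illustrative choice rather than a value from the cards.

```python
def filter_rare(variants, max_af=0.01):
    """Keep variants that are absent from the reference panel or rarer than max_af."""
    return [v for v in variants if v["af"] is None or v["af"] < max_af]

# Hypothetical patient variants with made-up identifiers and frequencies
variants = [
    {"id": "common_1", "af": 0.35},
    {"id": "rare_1", "af": 0.002},
    {"id": "common_2", "af": 0.08},
    {"id": "novel_1", "af": None},  # not seen in the reference panel
]
rare = filter_rare(variants)
print([v["id"] for v in rare])
```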
Where is the information from the 1000 Genomes Project stored?
In public databases and genome browsers such as Ensembl
Ensembl is a genome browser that provides access to genomic data.
What is a shotgun library in molecular biology?
A collection of DNA fragments used for sequencing and mapping
Shotgun libraries are created to facilitate the sequencing of large genomes by breaking them into smaller pieces.
What percentage of the human genome is made up of exons?
Approximately 1%
The exome comprises about 180,000 exons totaling roughly 30 million base pairs.
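The "approximately 1%" figure can be checked from the numbers on this card, about 180,000 exons and 30 million base pairs. The total genome size of 3.1 billion base pairs used below is an assumption, not a figure from the cards.

```python
EXOME_BP = 30_000_000
NUM_EXONS = 180_000
GENOME_BP = 3_100_000_000  # assumed approximate size of the human genome

exome_percent = EXOME_BP / GENOME_BP * 100
mean_exon_length = EXOME_BP / NUM_EXONS

print(f"exome share of genome: {exome_percent:.2f}%")
print(f"mean exon length: {mean_exon_length:.0f} bp")
```

The same numbers imply a mean exon length of well under 200 bp, comparable to the short fragment sizes used for sequencing libraries.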
What percentage of disease-causing mutations are found in protein-coding genes (exons)?
85%
This highlights the importance of exome sequencing in identifying genetic diseases.
What system is used for hybrid capture of target sequences in exome sequencing?
Agilent SureSelect Capture Array
This system utilizes a solution-based approach for capturing specific DNA sequences.
What is the purpose of the shearing step in preparing genomic DNA?
To produce small DNA fragments
This step is crucial for creating a library suitable for sequencing.
What are biotinylated RNA library baits used for?
To hybridize with sample DNA for target sequence capture
These baits are designed to be complementary to specific exon sequences.
What is the role of streptavidin in the capture process?
It binds to biotin to isolate target DNA sequences
Streptavidin has a high affinity for biotin, allowing for effective separation of bound and unbound sequences.
How are non-exons removed during the capture process?
By washing away unbound DNA after magnetic pull-down
This ensures that only the desired exonic sequences remain for further analysis.
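The logic of capture followed by washing can be mimicked in software as an interval test: fragments that overlap a targeted exon are kept, and the rest are discarded. This is a conceptual sketch only; the coordinates and the function names `overlaps` and `capture` are hypothetical, and real capture depends on hybridization chemistry rather than a clean yes-or-no overlap.

```python
def overlaps(fragment, target):
    """Half-open intervals (start, end) overlap if each starts before the other ends."""
    return fragment[0] < target[1] and target[0] < fragment[1]

def capture(fragments, targets):
    """Split fragments into those bound by a bait (captured) and the rest (washed away)."""
    captured, washed = [], []
    for frag in fragments:
        if any(overlaps(frag, t) for t in targets):
            captured.append(frag)
        else:
            washed.append(frag)
    return captured, washed

exon_targets = [(100, 250), (900, 1100)]                    # hypothetical baited exons
fragments = [(0, 90), (80, 300), (400, 700), (1050, 1350)]  # hypothetical sheared DNA
captured, washed = capture(fragments, exon_targets)
print("captured:", captured)
print("washed:", washed)
```

Captured fragments can extend beyond the exon itself, which is why some flanking intronic sequence is usually read as well.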
Fill in the blank: The process of _______ involves digesting RNA probes to isolate pure DNA.
RNase treatment
RNase is an enzyme that specifically digests RNA, leaving behind only DNA.
What type of assay is used to pull down target sequences in the capture process?
Magnetic pull-down assay
This assay utilizes magnetic beads coated with streptavidin to isolate biotin-labeled DNA.
True or False: Most of the human genome is coding DNA.
False
Most of the human genome is non-coding; only about 1% consists of exons.
What is the primary goal of whole exome sequencing?
To identify mutations in protein-coding regions of the genome
This approach focuses on the parts of the genome that are most likely to affect disease.
What are cRNA baits?
Complementary RNA baits used to capture specific DNA sequences
These baits are critical for targeting exons during exome sequencing.
What is the significance of high-temperature conditions during hybridization?
To maintain stringency and prevent unwanted binding
High temperature favors hybridization of closely matched sequences over partial matches.