Genomics Flashcards
What is the adjusted Rand index?
Adjusted Rand index: A measure of the similarity between two data clusterings, adjusted for chance grouping of the elements.
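A minimal sketch of how the index can be computed, assuming the standard pair-counting formula (observed agreement minus expected agreement, divided by maximum agreement minus expected agreement). The function name and the handling of degenerate inputs are illustrative choices, not taken from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-adjusted similarity between two clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare; treated as perfect agreement (assumption)

    # Count pairs of items that share a cluster in both, in A only, in B only.
    same_in_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    same_in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = same_in_a * same_in_b / comb(n, 2)  # agreement expected by chance
    maximum = (same_in_a + same_in_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every item in one cluster in both
    return (same_in_both - expected) / (maximum - expected)
```

Identical clusterings score 1 regardless of how the cluster labels are named, and clusterings that agree no better than chance score about 0.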
What is CoC analysis?
Cluster of clusters (CoC) analysis: A method of obtaining clusters (e.g., of patient samples) that represent a consensus among the individual data types (in this study, we incorporated DNA methylation, DNA copy number, mRNA expression, and microRNA expression into the analysis).
What is a DM-BER?
Double-minute chromosome–breakpoint-enriched region (DM-BER): As detected by whole-exome and whole-genome sequencing, highly amplified gene regions that are connected by DNA rearrangement breakpoints and allow cancer cells to maintain high levels of oncogene amplification.
Define exon
The portion of a gene that encodes amino acids to form a protein.
Define fusion transcript
A transcript composed of parts of two separate genes joined together by a chromosomal rearrangement, in some cases with functional consequences for oncogenesis, therapy, or both.
What is methylation?
The attachment of methyl groups to DNA at cytosine bases. Methylation is correlated with reduced transcription of the gene immediately downstream of the methylated site.
Define microRNA
A short regulatory form of RNA that binds to a target RNA and generally suppresses its translation by ribosomes.
What is meant by ‘molecular subtype’?
Molecular subtype: Subgroup of a tumor type based on molecular characteristics (rather than, e.g., histologic or clinical features); in this study, a molecular subtype is one of three classes based on IDH mutation and 1p/19q codeletion status.
Define mutation frequency
Mutation frequency: The number of mutations detected per megabase of DNA.
What is meant by ‘significantly mutated gene’?
Significantly mutated gene: A gene with a greater number of mutations than expected on the basis of the background mutation rate, which suggests a role in oncogenesis.
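The two definitions above can be illustrated with a small calculation. This is a simplified sketch that assumes mutations arise independently at a uniform background rate, so the count in a gene follows a Poisson distribution; the function names and the example numbers are hypothetical, and published significance methods are considerably more involved.

```python
from math import exp


def mutation_frequency(n_mutations, bases_sequenced):
    """Mutations detected per megabase of sequenced DNA."""
    return n_mutations / (bases_sequenced / 1_000_000)


def expected_mutations(background_per_mb, gene_length_bp, n_samples):
    """Mutations expected in one gene across a cohort at the background rate."""
    return background_per_mb * (gene_length_bp / 1_000_000) * n_samples


def poisson_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected).

    Computed as 1 - CDF, which is adequate for a sketch but loses
    precision for extremely small p-values.
    """
    term = exp(-expected)  # P(X = 0)
    cdf = 0.0
    for i in range(observed):
        cdf += term
        term *= expected / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)


# Hypothetical example: 1 mutation/Mb background, 2,000-bp gene, 500 samples.
lam = expected_mutations(1.0, 2_000, 500)  # 1.0 mutation expected
p_value = poisson_tail(8, lam)             # chance of seeing 8 or more
```

A small `p_value` means the gene carries more mutations than the background rate would explain, which is what the definition of a significantly mutated gene describes.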
Sanger sequencing: how does the classical chain-termination method work?
A method of DNA sequencing based on the selective incorporation of chain-terminating dideoxynucleotides by DNA polymerase during in vitro DNA replication. The classical chain-termination method requires a single-stranded DNA template, a DNA primer, a DNA polymerase, normal deoxynucleotide triphosphates (dNTPs), and modified dideoxynucleotide triphosphates (ddNTPs), the latter of which terminate DNA strand elongation. These chain-terminating nucleotides lack the 3’-OH group required for the formation of a phosphodiester bond between two nucleotides, so DNA polymerase ceases extension when a ddNTP is incorporated. The ddNTPs may be radioactively or fluorescently labeled for detection in automated sequencing machines. The DNA sample is divided into four separate sequencing reactions, each containing all four of the standard deoxynucleotides (dATP, dGTP, dCTP, and dTTP) and the DNA polymerase. To each reaction is added only one of the four dideoxynucleotides (ddATP, ddGTP, ddCTP, or ddTTP). The deoxynucleotide is kept in approximately 100-fold excess over the corresponding dideoxynucleotide (e.g. 0.5 mM dATP : 0.005 mM ddATP), so that termination is only occasional: enough fragments of every length are produced while the complete sequence is still copied. Following rounds of template DNA extension from the bound primer, the resulting DNA fragments are heat denatured and separated by size using gel electrophoresis. This is frequently performed using a denaturing polyacrylamide-urea gel with each of the four reactions run in one of four individual lanes (lanes A, T, G, C). The DNA bands may then be visualized by autoradiography or UV light, and the DNA sequence can be read directly off the X-ray film or gel image. In the original publication of 1977,[2] the formation of base-paired loops of ssDNA was a cause of serious difficulty in resolving bands at some locations.
When X-ray film is exposed to the gel, the dark bands correspond to DNA fragments of different lengths. A dark band in a lane indicates a DNA fragment that is the result of chain termination after incorporation of a dideoxynucleotide (ddATP, ddGTP, ddCTP, or ddTTP). The relative positions of the different bands among the four lanes, from bottom to top, are then used to read the DNA sequence.
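The read-out logic can be sketched in a few lines. This is an idealized model, not a simulation of the chemistry: it assumes every possible termination product is present, that the gel resolves fragments differing by one nucleotide, and that the input is the newly synthesized strand written in uppercase A, C, G, T. The function names are illustrative.

```python
def chain_termination_fragments(synthesized_strand):
    """Return, for each ddNTP reaction, the lengths of its terminated fragments.

    In the ddATP reaction, for example, extension can stop at every position
    where an A is incorporated, giving one fragment length per A.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence from the shortest band (bottom) to the longest (top)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = chain_termination_fragments("GATTACA")
# lanes["A"] holds the fragment lengths seen in the ddATP lane.
sequence = read_gel(lanes)
```

Because shorter fragments migrate farther, sorting the bands by length reproduces the bottom-to-top reading described above.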
In which instances is Sanger sequencing still often useful?
The Sanger method remains in wide use for smaller-scale projects, for validating next-generation sequencing results, and for obtaining especially long contiguous DNA sequence reads (>500 nucleotides).
What are the limitations of the chain-termination (Sanger) method?
Limitations include non-specific binding of the primer to the DNA, affecting accurate read-out of the DNA sequence, and DNA secondary structures affecting the fidelity of the sequence.
Describe two technical variations of chain-termination sequencing.
Technical variations of chain-termination sequencing include tagging with nucleotides containing radioactive phosphorus for radiolabelling, or using a primer labeled at the 5’ end with a fluorescent dye. Dye-primer sequencing facilitates reading in an optical system for faster and more economical analysis and automation. The later development by Leroy Hood and coworkers of fluorescently labeled ddNTPs and primers set the stage for automated, high-throughput DNA sequencing.
What does CRISPR stand for?
Clustered regularly interspaced short palindromic repeats. Fifty years ago, microbiologists sparked the recombinant-DNA revolution with the discovery that bacteria have innate immune systems based on restriction enzymes. These enzymes bind and cut invading viral genomes at specific short sequences, and scientists rapidly repurposed them to cut and paste DNA in vitro, transforming biologic science and giving rise to the biotechnology industry. Ten years ago, microbiologists discovered that bacteria also harbor adaptive immune systems, and subsequent progress has been breathtakingly rapid.[1] Between 2005 and 2009, microbial genetic studies conducted by the laboratories of Mojica, Jansen, Koonin, Horvath, van der Oost, Sontheimer, Marraffini, and others revealed that bacteria have a programmable mechanism that directs nucleases, such as Cas9, to bind and cut invading DNA that matches “guide RNAs” encoded in specific bacterial genome regions containing clustered regularly interspaced short palindromic repeats (CRISPR).
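The idea of a nuclease directed to DNA that matches a guide can be sketched as a string search. This is a toy model under stated assumptions: it requires an exact match to the guide (written as DNA) and, as an addition not found in the text above, an NGG motif immediately after the match, which is the adjacent motif required by the commonly used Streptococcus pyogenes Cas9. The function names are illustrative.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_cas9_sites(dna, guide):
    """Return (strand, start) for each exact guide match followed by NGG.

    Starts on the "-" strand are coordinates in the reverse-complemented
    sequence, not in the original one.
    """
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        start = seq.find(guide)
        while start != -1:
            pam = seq[start + len(guide): start + len(guide) + 3]
            if len(pam) == 3 and pam[1:] == "GG":
                sites.append((strand, start))
            start = seq.find(guide, start + 1)
    return sites


guide = "GATTACAGATTACAGATTAC"            # hypothetical 20-nt guide
target = "TTT" + guide + "TGG" + "AAA"    # guide match followed by TGG
sites = find_cas9_sites(target, guide)
```

Real target recognition tolerates some mismatches, which this exact-match sketch ignores.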
How might CRISPR technology be applied to HIV?
To treat HIV infection, physicians might edit a patient’s immune cells to delete the CCR5 gene, conferring the resistance to HIV carried by the 1% of the U.S. population lacking functional copies of this gene. To treat progressive blindness caused by dominant forms of retinitis pigmentosa, they might inactivate the mutant allele in retinal cells. To prevent the myocardial infarctions that kill patients with homozygous familial hypercholesterolemia, they might edit liver cells to restore a functional copy of the gene encoding low-density lipoprotein receptors. Editing of blood stem cells might cure sickle cell anemia and hemophilia. These goals will require overcoming serious technical challenges (such as avoiding “off-target” edits elsewhere in the genome, which might give rise to cancer), but they pose no unique ethical issues because they affect only a patient’s own somatic cells.
Describe four central issues with human germline editing using CRISPR-Cas9 technology.
1. Technical issues: whether genome editing can be performed with sufficient precision to permit scientists to responsibly contemplate creating genetically modified babies. Currently, the technology is far from ready: Liang and colleagues recently applied genome editing to human tripronuclear zygotes (abnormal products of in vitro fertilization [IVF] that are incapable of developing in vivo) and documented problems including incomplete editing, inaccurate editing, and off-target mutations. Even with improved accuracy, the process is unlikely to be risk-free.
2. Do compelling medical needs outweigh the risks both from inaccurate editing and from unanticipated effects of the intended edits? Various potential applications must be considered.
3. Who has the right to decide? Can parents consent for future generations? Some people will argue that parents should have unfettered autonomy: that modifying one’s progeny is akin to using PGD to avoid genetic diseases or choosing sperm donors on the basis of intellectual or athletic prowess. Yet parental autonomy must be weighed against the interests of future generations who cannot consent to the genetic modifications their flesh will be heir to.
4. Morality: what’s right and wrong and how we ought to live as a society. Authorizing scientists to make permanent changes to the DNA of our species is a decision that should require broad societal understanding and consent. It has been only about a decade since we first read the human genome. We should exercise great caution before we begin to rewrite it.
Describe potential applications of germline editing using CRISPR technology and arguments for/against.
i) Preventing devastating monogenic diseases, such as Huntington’s disease. Though avoiding the roughly 3600 rare monogenic disorders caused by known disease genes is a compelling goal, the rationale for embryo editing largely evaporates under careful scrutiny. Genome editing would require making IVF embryos, using preimplantation genetic diagnosis (PGD) to identify those that would have the disease, repairing the gene, and implanting the embryo. Yet it would be easier and safer simply to use PGD to identify and implant the embryos that aren’t at risk: the proportion is high in the typical cases of a parent heterozygous for a dominant disease (50%) or two parents who are carriers for a recessive disease (75%). To reduce the incidence of monogenic disease, what’s needed most is not embryo editing, but routine genetic testing so that the many couples who don’t know they are at risk can avail themselves of PGD.
ii) Reducing the risk of common diseases, such as heart disease, cancer, diabetes, and multiple sclerosis. The heritable influence on disease risk is polygenic, shaped by variants in dozens to hundreds of genes. Common variants tend to make only modest contributions (for example, reducing risk from 10% to 9.5%); rare variants sometimes have larger effects, including a few for which heterozygosity provides significant protection against disease.
iii) Reshaping the human gene pool by endowing all children with many naturally occurring “protective” variants. However, genetic variants that decrease risk for some diseases can increase risk for others. (For example, the CCR5 mutations that protect against HIV also elevate the risk for West Nile virus, and multiple genes have variants with opposing effects on risk for type 1 diabetes and Crohn’s disease.) The full medical effect of most variants is poorly characterized, let alone the combined effects of many variants. Safety studies would be needed to assess effects across various genetic backgrounds and environmental exposures. The situation is particularly dicey for rare protective heterozygous variants: most have never been seen in the homozygous state in humans and might have deleterious effects. Yet heterozygous parents would routinely produce homozygous children (one quarter of the total), unless humans forswore natural reproduction in favor of IVF.
iv) Currently, the best arguments might be for eliminating the ε4 variant at the APOE gene (which increases risk for Alzheimer’s disease and cardiovascular disease) and bestowing null alleles at the PCSK9 gene (which reduce the risk of myocardial infarction). Still, our knowledge is incomplete. For example, APOE ε4 has also been reported to be associated with better episodic and working memory in young adults.
v) Why limit ourselves to naturally occurring genetic variants? Why not use synthetic biology to write new cellular circuits that, for example, cause cells to commit suicide if they start down the road toward cancer? But such efforts would be reckless, at least for now. We remain terrible at predicting the consequences of even simple genetic modifications in mice. One cautionary tale among many is a genetic modification of the tp53 gene that protected mice against cancer while unexpectedly causing premature aging.[5] We would also need to anticipate the potential interactions among the diverse genetic circuits that creative scientists will cast into the gene pool. Mistakes would be inevitable, and there would be no way to recall novel genes from the human population.
vi) Reshaping non-medical traits. Height may prove challenging (the hundreds of natural variants have tiny effects), but hair and eye color may be pliable. Disruption of the MC1R gene is associated with bright red hair, although it also heightens the risk of melanoma. Sports-minded parents might want to introduce the overactive erythropoietin gene that conferred high oxygen-carrying ability on a seven-time Olympic medalist in cross-country skiing. Nonnatural genetic modifications hold even bolder prospects, and risks.
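The proportions quoted in the answer above (50% and 75% of embryos not at risk, and one quarter of children homozygous when both parents are heterozygous) follow from simple Mendelian inheritance. A minimal sketch, assuming a single gene with two alleles and equal transmission of each parental allele; the allele letters and function name are illustrative.

```python
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Punnett square: map each offspring genotype to its probability.

    Each parent is a two-letter string such as "Dd"; every combination of one
    allele from each parent is equally likely (probability 1/4).
    """
    outcomes = {}
    for allele1, allele2 in product(parent1, parent2):
        genotype = "".join(sorted((allele1, allele2)))
        outcomes[genotype] = outcomes.get(genotype, Fraction(0)) + Fraction(1, 4)
    return outcomes


# Dominant disease allele D: heterozygous parent x unaffected parent.
dominant = offspring_genotypes("Dd", "dd")
unaffected_dominant = dominant["dd"]        # embryos without the disease allele

# Recessive disease allele r: two carrier parents.
recessive = offspring_genotypes("Rr", "Rr")
unaffected_recessive = 1 - recessive["rr"]  # embryos that will not be affected
```

The same table shows that two heterozygous parents produce a homozygous child for a given allele one quarter of the time.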